Journal of Computer Science

Exploiting Surrounding Text for Retrieving Web Images

S. A. Noah, A. Azilawati, T. M.T. Sembok and T. W.T.S. Meriam

DOI : 10.3844/jcssp.2008.842.846

Volume 4, Issue 10

Pages 842-846


Web documents contain useful textual information that can be exploited for describing images. Research has focused largely on representing images by their low-level content descriptions, such as color, shape and texture, while little research has been directed at exploiting such textual information. The aim of this research was to systematically exploit the textual content of HTML documents for automatically indexing and ranking images embedded in web documents. A heuristic approach for locating and weighting the text surrounding web images, together with a modified tf.idf weighting scheme, was proposed. Precision-recall evaluation was conducted for ten queries and promising results were achieved. The proposed approach showed slightly better precision than a popular search engine, with average relative precision of 0.63 and 0.55 respectively.
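
The abstract does not give the heuristic or the modified tf.idf formula, so the following is only a minimal sketch of the general idea: text found in different regions around an image contributes to term frequency with a region-dependent weight, and the result is scaled by a standard inverse document frequency. The region names, the weight values in `REGION_WEIGHTS`, and the functions `weighted_tf`, `build_index` and `rank` are all assumptions made for illustration, not the authors' method.

```python
import math
import re
from collections import defaultdict

# Hypothetical region weights (assumed for illustration only; the
# abstract does not state which regions or values were used).
REGION_WEIGHTS = {"alt": 3.0, "title": 2.0, "caption": 2.0, "nearby": 1.0}


def tokenize(text):
    """Lowercase the text and split it into alphanumeric terms."""
    return re.findall(r"[a-z0-9]+", text.lower())


def weighted_tf(regions):
    """Map each term to its region-weighted frequency.

    `regions` maps a region name (e.g. "alt") to the text found there.
    Unknown regions fall back to a weight of 1.0.
    """
    tf = defaultdict(float)
    for region, text in regions.items():
        weight = REGION_WEIGHTS.get(region, 1.0)
        for term in tokenize(text):
            tf[term] += weight
    return dict(tf)


def build_index(images):
    """Return image id -> {term: weighted tf * idf}.

    `images` maps an image id to its `regions` dictionary. The idf used
    here is the plain log(N / df) form, an assumption of this sketch.
    """
    tfs = {img: weighted_tf(regions) for img, regions in images.items()}
    df = defaultdict(int)
    for tf in tfs.values():
        for term in tf:
            df[term] += 1
    n = len(images)
    return {
        img: {term: freq * math.log(n / df[term]) for term, freq in tf.items()}
        for img, tf in tfs.items()
    }


def rank(index, query):
    """Rank images by the summed weights of the query terms they match."""
    terms = tokenize(query)
    scores = {
        img: sum(vec.get(term, 0.0) for term in terms)
        for img, vec in index.items()
    }
    matched = [img for img, score in scores.items() if score > 0]
    return sorted(matched, key=lambda img: -scores[img])


if __name__ == "__main__":
    images = {
        "a.jpg": {"alt": "red tulip", "nearby": "garden flowers"},
        "b.jpg": {"alt": "garden bench", "nearby": "a red tulip grows nearby"},
        "c.jpg": {"alt": "city skyline", "nearby": "tall buildings"},
    }
    print(rank(build_index(images), "tulip"))
```

In this toy collection, an image whose `alt` text mentions the query term ranks above one that only mentions it in nearby text, which is the kind of behavior a region-weighting heuristic is meant to produce. A real system would also need an HTML parser to locate the regions, which this sketch leaves out.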


© 2008 S. A. Noah, A. Azilawati, T. M.T. Sembok and T. W.T.S. Meriam. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.