Spatial Color Indexing: An Efficient and Robust Technique for Content-Based Image Retrieval

Problem statement: The color histogram is widely accepted as a useful feature representation because it is a statistical summary that is simple, robust and efficient. However, the main problem with color histogram indexing is that it does not take spatial information into account. Previous research has shown that the effectiveness of image retrieval increases when the spatial features of colors are included. Approach: This study examined the use of a computational geometry-based spatial color indexing methodology. There are two major contributions: (1) Color Spatial Entropy (CSE), which introduces entropy to describe the spatial information of colors, and (2) Color Hybrid Entropy (CHE), which introduces a spatial description on multiresolution images. Results: The experimental results showed that CSE and CHE perform better, more efficiently and with more relevant results than traditional CBIR methods based on local histograms. Conclusion: The new system strengthens retrieval efficacy and remains more stable under geometric transformations. In addition, CHE quantitatively characterizes the compactness of the multiresolution images.


INTRODUCTION
Content-Based Image Retrieval (CBIR) has been an active research area for decades. One of the fundamental problems in image retrieval is how to represent the images. In general, image features (color, texture, shape) are extracted to represent the images. Image indexing grew in the last decade and rapidly became color-oriented, since most images of interest are in color. A color histogram is frequently used to represent an image's features, and in the literature the major color indexing methods are based on color histograms [1][2][3]. The histogram expresses the frequency distribution of color bins in an image. For simplicity, let $I$ denote a digital image and $|I|$ its size (the number of pixels). We discretize the color space into $m$ distinct colors $c_1, \dots, c_m$. A normalized histogram is then computed by dividing the frequency of each color bin by the size of the image. Therefore, the normalized histogram $H_I$ of image $I$ can be defined as:

$$H_I(c_i) = \frac{|\{p \in I : I(p) = c_i\}|}{|I|}, \qquad i = 1, \dots, m$$

A main advantage of using a histogram is its robustness with respect to the projection of the image. Color histograms are invariant to translation and to rotation around the viewing axis, and they change slowly with distance to the object and with partial occlusion. However, the histogram captures only the color distribution in an image and does not include any spatial correlation between individual pixels. Such indexing can give false results on image queries: two images with dramatically different semantics can give rise to similar histograms. To reduce this problem, several schemes that include spatial information have been developed. The color correlogram [4] and the color coherence vector [5] combine the spatial correlation of color regions with the global distribution of the local spatial correlation of colors. These techniques perform better than traditional color histograms, but they are computationally very expensive.
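As a minimal sketch of the normalized histogram described above, the following Python code quantizes RGB pixels uniformly and divides each bin count by the image size. The function name `normalized_histogram`, the pixel-list input format and the bin-index layout are assumptions made for illustration; the default of 8 bins per component follows the quantization reported in the experiments.

```python
def normalized_histogram(pixels, bins_per_channel=8):
    """Return the normalized color histogram of an image.

    pixels: iterable of (r, g, b) tuples with values in 0..255
            (an assumed input format, chosen for illustration).
    """
    step = 256 // bins_per_channel          # width of one quantization bin
    m = bins_per_channel ** 3               # number of distinct colors c_1..c_m
    counts = [0] * m
    n = 0                                   # image size |I|
    for r, g, b in pixels:
        idx = ((r // step) * bins_per_channel + (g // step)) * bins_per_channel + (b // step)
        counts[idx] += 1
        n += 1
    if n == 0:
        raise ValueError("empty image")
    # Divide the frequency of each color bin by the size of the image.
    return [c / n for c in counts]


# Tiny four-pixel "image": two reddish pixels, one blue, one black.
image = [(255, 0, 0), (250, 5, 3), (0, 0, 255), (0, 0, 0)]
hist = normalized_histogram(image)
print(len(hist), sum(hist))
```

The two reddish pixels fall into the same bin, which illustrates why the histogram is tolerant to small color variations but carries no information about where the pixels are located.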
Another common approach is to incorporate spatial information into the color histogram. The local color histogram feature was introduced to overcome the drawbacks of the global color histogram. In this method, the image is partitioned into several windows and the average color of each window is calculated [6]. Similarity measurement plays a vital role in CBIR, since without it the retrieval of images from a database would not be possible. Boujemaa [6] used the $L_1$ distance to measure the similarity between two images. However, in large image databases the $L_1$ measure is too weak to capture the deviations in visual content caused by transformations such as rotation and translation, so some retrieval results are not very effective. Alaoui et al. [7] introduced a new measure (PMM) that is more effective than $L_1$. This distance works well in low dimensionalities, but it imposes extra overhead on the system when the number of windows is high: partitioning the image into more than 4 windows causes a serious problem in execution time and more complex processing during image matching.
In this study we propose a new indexing system based on a new descriptor called Color Spatial Entropy (CSE), which takes into account the correlation of the spatial distribution of colors in an image and remains more stable under geometric transformations. The main difference between CSE and systems using local histogram approaches is that CSE describes how pixel patches of identical color are distributed in an image. Since the color spatial entropy of a single image is not enough for efficient image retrieval, we suggest using a spatial description on multiresolution images. Multiresolution images obtained with wavelets [8], Gabor filters [9], Gaussian filters [10] and spatial filters [11,12] have been used before for histograms. We decompose the images into different resolutions using simple median filtering of different sizes. On this basis we developed a new descriptor called Color Hybrid Entropy (CHE), which introduces a spatial description on multiresolution images. Finally, we analyze and compare the performance of the proposed method with some systems using local histogram approaches.
Problems with using local histograms with the PMM and $L_1$ measures: An image $I$ is evenly divided into $M$ non-overlapping windows, and each individual window is abstracted as a unique feature descriptor with its spatial location. Let $A_i$ be the set of pixels with color bin $i$ in image $I$ and $h(i)$ the number of elements in $A_i$. Let $h_j(i)$ be the count of pixels of color bin $i$ inside window $j$. Then the local color histograms can be written as $(h_1(i), h_2(i), \dots, h_M(i))$.
Based on the local color histograms, the NLDH (the normalized local distribution histogram) can be defined as $\tilde{h}(i) = (\tilde{h}_1(i), \tilde{h}_2(i), \dots, \tilde{h}_M(i))$, where:

$$\tilde{h}_j(i) = \frac{h_j(i)}{h(i)}$$

The $L_1$ distance is too weak to capture the deviations in the visual content of an image caused by rotation or translation. Fig. 1 shows two black-and-white images (each divided into 4 windows) that have the same general content but differ under the classic $L_1$ measure. Indeed, the $L_1$ distance between them is very large because window 1 has a different color in images 1 and 2.
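The weakness of $L_1$ on local histograms can be illustrated with a small example in the spirit of Fig. 1. The sketch below is hypothetical: the 4×4 binary images, the helper names `local_histograms` and `l1_distance`, and the use of raw (unnormalized) window counts are assumptions made for illustration, not the exact setup of [6].

```python
def local_histograms(img, rows, cols, m):
    """Return h, where h[j][i] is the count of pixels of color bin i in window j.

    img: 2D list of color-bin indices; the image is split into rows x cols windows.
    """
    height, width = len(img), len(img[0])
    wh, ww = height // rows, width // cols   # window height and width
    h = [[0] * m for _ in range(rows * cols)]
    for y in range(rows * wh):
        for x in range(cols * ww):
            j = (y // wh) * cols + (x // ww)  # index of the window containing (x, y)
            h[j][img[y][x]] += 1
    return h


def l1_distance(h1, h2):
    """Window-by-window L1 distance between two sets of local histograms."""
    return sum(abs(a - b) for w1, w2 in zip(h1, h2) for a, b in zip(w1, w2))


# Bin 0 = black, bin 1 = white. Image 2 is image 1 rotated by 180 degrees.
img1 = [[0, 0, 1, 1],
        [0, 0, 1, 1],
        [1, 1, 1, 1],
        [1, 1, 1, 1]]
img2 = [[1, 1, 1, 1],
        [1, 1, 1, 1],
        [1, 1, 0, 0],
        [1, 1, 0, 0]]

h1 = local_histograms(img1, 2, 2, 2)
h2 = local_histograms(img2, 2, 2, 2)
global1 = [sum(w[i] for w in h1) for i in range(2)]
global2 = [sum(w[i] for w in h2) for i in range(2)]
print(global1 == global2, l1_distance(h1, h2))
```

Both images have the same global histogram (4 black and 12 white pixels), yet the window-wise $L_1$ distance is large because the black patch sits in a different window.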
To obtain a similarity measure that is stable under rotation or translation, we can consider the PMM measure Z [7]. It is easy to verify that the two images shown in Fig. 1 are similar under Z. However, Z is built from a sum of distances $d(x, s)$ whose evaluation is of order $n!$ in the number of windows $n$. For example, a partition of the image into 9 windows requires 9! = 362880 instructions in the formula of Z. The computational cost of Z is therefore very high when the number of windows is large.
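The factorial growth mentioned above can be made concrete with a small sketch. The function `permutation_min_distance` below is not the PMM definition from [7]. It is a hypothetical stand-in that assumes the measure compares windows under every possible reordering and keeps the best match, which is one way a cost of order $n!$ can arise.

```python
import math
from itertools import permutations


def permutation_min_distance(h1, h2):
    """Hypothetical permutation-based distance (an assumption, not the PMM of [7]).

    Tries every assignment of windows of h2 to windows of h1 and returns the
    smallest summed L1 distance; this loops over n! permutations.
    """
    best = None
    for perm in permutations(range(len(h2))):
        d = sum(abs(a - b)
                for j, k in enumerate(perm)
                for a, b in zip(h1[j], h2[k]))
        if best is None or d < best:
            best = d
    return best


# Local histograms of two 4-window images whose black patch is in different windows.
h1 = [[4, 0], [0, 4], [0, 4], [0, 4]]
h2 = [[0, 4], [0, 4], [0, 4], [4, 0]]

print(permutation_min_distance(h1, h2))            # the windows can be matched exactly
print(math.factorial(4), math.factorial(9))        # permutations for 4 and 9 windows
```

With 4 windows only 24 permutations are examined, but with 9 windows the count already reaches 362880 per image comparison, which is why such a measure becomes impractical for finer partitions.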

MATERIALS AND METHODS
The central idea of this study is to develop a robust descriptor that remains stable when the visual content undergoes transformations such as rotation and translation, and that makes a highly local description possible while substantially reducing computational time.

Feature extraction:
Color Spatial Entropy (CSE): John [13] proposed using entropy, as developed by Shannon [14], to represent the color information of an image and to retrieve images in CBIR:

$$E(H_I) = -\sum_{i=1}^{m} H_I(c_i) \log_2 H_I(c_i) \qquad (3)$$

where $H_I(c_i)$ is the normalized histogram value of color $c_i$. The entropy of Eq. 3 is clearly insufficient as a measure of a pixel's uncertainty in natural images, as it completely neglects any spatial properties.
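To illustrate why the histogram entropy neglects spatial properties, the sketch below assumes that Eq. 3 is the Shannon entropy of the normalized histogram, computed with a base-2 logarithm. The function names and the two toy pixel sequences are hypothetical.

```python
import math


def histogram_entropy(hist):
    """Shannon entropy (in bits) of a normalized histogram; empty bins contribute 0."""
    return -sum(p * math.log2(p) for p in hist if p > 0)


def normalized_counts(bins, m):
    """Normalized histogram of a flat sequence of color-bin indices."""
    counts = [0] * m
    for b in bins:
        counts[b] += 1
    return [c / len(bins) for c in counts]


# Two pixel sequences with the same colors arranged differently.
layout_a = [0, 0, 1, 1, 1, 1, 1, 1]
layout_b = [1, 1, 1, 0, 1, 1, 0, 1]

e_a = histogram_entropy(normalized_counts(layout_a, 2))
e_b = histogram_entropy(normalized_counts(layout_b, 2))
print(e_a, e_b)
```

Both layouts yield the same entropy because only the proportions of the colors enter the computation, not their positions.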
Based on the NLDH and the definition of entropy, we propose a new descriptor, CSE (the color spatial entropy), which describes the spatial information of an image. The CSE of color bin $i$ can be defined as:

$$E_i = -\sum_{j=1}^{M} \frac{h_j(i)}{h(i)} \log_2 \frac{h_j(i)}{h(i)}$$

This gives the degree of dispersion of the pixel patches of a color bin in an image. Because the sum runs over all windows regardless of their order, it is easy to verify that the two images shown in Fig. 1 have an identical CSE descriptor.
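A minimal sketch of the CSE computation follows, assuming that the CSE of a color bin is the Shannon entropy (base 2) of that bin's normalized distribution over the windows, $h_j(i)/h(i)$. The function name `color_spatial_entropy` and the example window counts are hypothetical.

```python
import math


def color_spatial_entropy(local_hists):
    """Return one entropy value per color bin.

    local_hists[j][i] is the count of pixels of color bin i inside window j.
    """
    m = len(local_hists[0])
    cse = []
    for i in range(m):
        total = sum(w[i] for w in local_hists)   # h(i): pixels of bin i in the image
        e = 0.0
        for w in local_hists:
            if total and w[i]:
                p = w[i] / total                  # share of bin i that lies in this window
                e -= p * math.log2(p)
        cse.append(e)
    return cse


# Local histograms (bin 0 = black, bin 1 = white) of two 4-window images
# that contain the same black patch in different windows.
h1 = [[4, 0], [0, 4], [0, 4], [0, 4]]
h2 = [[0, 4], [0, 4], [0, 4], [4, 0]]

cse1 = color_spatial_entropy(h1)
cse2 = color_spatial_entropy(h2)
print(cse1, cse2)
```

The black bin is concentrated in a single window (entropy 0) and the white bin is spread evenly over three windows (entropy $\log_2 3$) in both images, so the descriptors coincide even though the window positions differ. The descriptor also has one value per color bin, so its size does not grow with the number of windows.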

Color Hybrid Entropy (CHE):
It is quite difficult to construct a good retrieval system, and most systems are based on features that are semantically too primitive. In particular, a single local histogram descriptor cannot encode spatial image variation. An obvious way to extend this feature is to compute the local histograms of multiple resolutions of an image to form multiresolution local histograms. Multiresolution approaches are obtained by applying a filtering system at multiple levels. As shown in Fig. 2, the local histograms of a (9×9) part of an input image quantized to four levels differ from those of its lower-resolution version. We therefore propose to use multiresolution local histograms instead of single local histograms for efficient image retrieval. The multiresolution images are obtained using median filtering. Based on the multiresolution local color histograms, we define the NMLH (the normalized multiresolution local histograms), denoted Mh, in which $L_j$ represents the size of the median filter and hence the level of resolution.
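The multiresolution decomposition by median filtering can be sketched as follows. This is an illustration under stated assumptions: the filter sizes (1, 3, 5), the border handling by coordinate clamping, and the function names are hypothetical, since the text specifies only that median filters of different sizes $L_j$ are used.

```python
from statistics import median_low


def median_filter(img, size):
    """Apply a size x size median filter to a 2D list of quantized values.

    size must be odd; pixels outside the image are replaced by the nearest
    border pixel (an assumed border policy).
    """
    height, width = len(img), len(img[0])
    r = size // 2
    out = [[0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            vals = [img[min(max(y + dy, 0), height - 1)][min(max(x + dx, 0), width - 1)]
                    for dy in range(-r, r + 1)
                    for dx in range(-r, r + 1)]
            out[y][x] = median_low(vals)   # odd number of values, so this is the median
    return out


def multiresolution(img, sizes=(1, 3, 5)):
    """Return one filtered image per filter size; size 1 keeps the original."""
    return [median_filter(img, s) for s in sizes]


# 5x5 image quantized to four levels (0..3) with a single isolated bright pixel.
img = [[0] * 5 for _ in range(5)]
img[2][2] = 3
levels = multiresolution(img)
print(levels[0][2][2], levels[1][2][2])
```

The isolated pixel survives at the finest level and disappears at the coarser ones, so local histograms computed on each level differ, which is the variation the multiresolution local histograms are meant to capture.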
Because it is important to evaluate empirically the effect of the different resolution levels, we adopt the following procedure. We give more weight to the local histograms of the higher-resolution images, as they contain more information than the lower-resolution images. Based on the NMLH and the definition of entropy, we propose a new descriptor, CHE (the color hybrid entropy), which describes the spatial information of the multiresolution image and is defined for each color bin $i$. The novelty of the CHE descriptor lies in combining spatial and multiresolution methods to support a content-based retrieval mechanism that cannot be implemented efficiently by either method individually.
Similarity measurement: In content-based image retrieval, a distance metric is usually used to measure the similarity or dissimilarity between two images. The distance metric tries to capture the strength of the relationships between features when images in a database are compared. Many similarity measures have been suggested in the literature for comparing images. With reference to the definition of the similarity metric in [15], we denote by $d$ the distance between images $q_1$ and $q_2$. The similarity metric $d$ is made up of two parts; the first part,

RESULTS AND DISCUSSION
The performance of the above technique was evaluated experimentally on an image database of 1700 images comprising 10 classes. We indexed the image database using CSE and CHE in the RGB color space, which was uniformly quantized into 8 bins per component. Figure 3 shows the top ten retrieval results for a query (the top left image is the query image). Our system using CSE and CHE clearly gives better retrieval performance than indexing systems using local histograms with the PMM measure and the $L_1$ distance. The precision and recall measurements of Mehtre et al. [16] are often used to describe the performance of an image retrieval system. Precision and recall are defined as follows:

$$\text{Precision} = \frac{\text{Number of relevant images selected}}{\text{Total number of retrieved images}}$$

$$\text{Recall} = \frac{\text{Number of relevant images selected}}{\text{Total number of similar images in the database}}$$

In order to evaluate the performance of the proposed methods, CSE and CHE were compared with indexing systems using local histograms with the PMM measure and the $L_1$ distance:

• In Fig. 4, our system considered a subdivision of the image into 4 windows. The precision-recall curves of the proposed approaches (CSE and CHE) are compared with the curves of the alternative approaches using local histograms. The proposed solution shows improved performance over all alternative approaches.
• In Fig. 5, our system considered a subdivision of the image into 9 windows. The figure shows the precision-recall curves of CSE, CHE and the local histogram approach using $L_1$ as the similarity measure.

The study discussed and summarized the proposed CBIR methodology for the correlation of the spatial distribution of colors. Our CBIR approach is stable under rotation and translation, and it makes a highly local description possible while substantially reducing computational time. To evaluate the retrieval performance, we relied on standard precision/recall curves. As can be observed from the graphs, in both tests CSE and CHE perform better than the other descriptors, with increased recall and precision rates. We also showed that the computational complexity of our approach is significantly lower than that of the PMM method. The CHE curve is well above the CSE curve in both tests. This indicates that the proposed CHE descriptor effectively overcomes the drawback of the CSE descriptor by encoding the spatial variation in image color through multiresolution images obtained by filtering at multiple levels.
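The precision and recall measures used in this section can be computed as in the following sketch. The function name `precision_recall` and the image identifiers are hypothetical and do not come from the experiments reported here.

```python
def precision_recall(retrieved, relevant):
    """Compute precision and recall for one query.

    retrieved: list of image ids returned by the system.
    relevant:  collection of ids of all images in the database similar to the query.
    """
    retrieved = list(retrieved)
    relevant = set(relevant)
    hits = sum(1 for image_id in retrieved if image_id in relevant)
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


# Hypothetical query: 10 images retrieved, 6 of them relevant,
# out of 12 relevant images in the whole database.
retrieved_ids = [1, 2, 3, 4, 5, 6, 101, 102, 103, 104]
relevant_ids = range(1, 13)
precision, recall = precision_recall(retrieved_ids, relevant_ids)
print(precision, recall)
```

A precision-recall curve such as those in Fig. 4 and Fig. 5 is obtained by repeating this computation while varying the number of retrieved images.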

CONCLUSION
A good image/video indexing system is a critical step and a hard problem in content-based image retrieval. The primary goal of feature indexing is to improve matching efficiency and quality. This study examined the use of a computational geometry-based spatial color indexing methodology for efficient and effective image retrieval, and we introduced two descriptors for CBIR based on entropy. The first descriptor, CSE, proposes a mechanism to transform local histogram features into an entropy feature. The second descriptor, CHE, integrates the spatial relationship of colors in multiresolution images. The multiresolution images are generated using median and mean filters of different sizes, and they can also be obtained with other filters such as Laplacian or Gaussian filters. Owing to the properties of entropy, our new system strengthens retrieval efficacy and remains more stable under geometric transformations. In addition, CHE quantitatively characterizes the compactness of the multiresolution images. It should be noted that the proposed system may be further improved by combining more complex similarity metrics. Experimental results indicated that the proposed system is quite robust and provides high precision in image retrieval, and that it takes more querying time than the local histograms system.