BI-LEVEL CLASSIFICATION OF COLOR INDEXED IMAGE HISTOGRAMS FOR CONTENT BASED IMAGE RETRIEVAL

This dissertation proposes content based image classification and retrieval with a Classification and Regression Tree (CART). A simple CBIR system (WH) is designed and shown to be efficient even in the presence of distorted and noisy images. WH achieves good precision without using any intensive image processing or feature extraction techniques. A unique indexed color histogram and wavelet decomposition based horizontal, vertical and diagonal image attributes are chosen as the primary attributes in the design of the retrieval system. The output feature vectors of the WH method serve as input to the proposed decision tree based image classification and retrieval system. The proposed system is evaluated on the standard SIMPLIcity dataset, which has been used in several previous works, with precision as the performance metric. Holdout validation and k-fold cross validation are used to validate the results. The proposed system achieves higher precision than SIMPLIcity and all the other compared methods.


INTRODUCTION
The explosive growth in the number of images can be attributed to the myriad novel ways of capturing and storing digital images. This has also spawned the need for Content Based Image Retrieval (CBIR), that is, retrieving images similar to a query image from large image databases. Image classification finds applications in a number of areas, such as remote sensing, image analysis, pattern recognition, similarity search, categorical search and automated image annotation.
Classification and Clustering techniques have been widely used to preprocess the image and improve the overall retrieval accuracy of the system. Classification is primarily used for categorical search and to prune the search space to only a set of most relevant images, instead of performing a distance based search with every image in the database.
Image classification can be textual or content based. Textual classification relies either on annotations or on attributes, whereas content based classification relies on the low level features of the image. This study focuses on content based classification of images to improve retrieval efficiency and accuracy.
The two popular approaches to classification are supervised and unsupervised techniques.
Supervised image classification makes use of training sets to characterize each class. A carefully selected training set of images depicts the generalized characteristics of its class. A classifier is employed to generate descriptors for a particular class by analyzing the training set. The descriptors are then used to predict the class of other images. There are different methods for classification, including neural networks, decision trees, support vector machines and Bayesian statistical methods.
Unsupervised methods do not depend on the use of training sets. Instead, they make use of clustering techniques to group the most similar images into the same cluster and dissimilar images into different clusters. The clusters are then assigned class labels.

Problem Definition
A large number of methods have been used for the retrieval of images based on visual features such as color, texture and shape. Most of the successful methods use sophisticated, time consuming image processing techniques to learn the semantic content of the image. For example, if separate regions of the image have to be studied (Chen and Wang, 2002; Li et al., 2000), suitable color or texture segmentation algorithms must be used to separate the homogeneous regions, and further analysis is required to classify those regions based on their features. Some algorithms are based on salient points (Hiremath and Pujari, 2008), where each salient point is described by a feature vector. Even after such sophisticated semantic analysis, the improvements in the results have not been very significant. Further, simple image matching policies often lead to poor accuracy in image retrieval. A CBIR system with a model based classification technique may therefore lead to better results.
This study models a simple CBIR system that makes use of features that can be acquired from the image very quickly. The accuracy of a very simple CBIR system can be made competitive with that of a sophisticated CBIR system if the simple and more significant features of the image are carefully chosen for coding the feature set. To further improve the image retrieval accuracy, a classification and regression based decision tree model is used.
The development process for this project involves three phases: image feature extraction, classification tree construction, and matching the query image with database images using the previously constructed decision tree. For the first phase, we use WH, the method proposed in our earlier work (Karpagam and Rangarajan, 2012). For the second phase, we use a Classification and Regression Tree (CART). For the retrieval phase, we use the previously constructed decision tree together with a simple distance measure.
Several related works have applied classification techniques to images. In Chowdhury et al. (2012), neural networks are used for image pre-classification and the CBIR system is evaluated using 2×5-fold cross validation followed by a statistical analysis. Banerjee et al. (2009) used C4.5 classification for feature selection and leave-one-out validation to measure the effectiveness of the reduced feature set. Britos et al. (2005) used the C4.5 and C5.0 classification algorithms to discriminate human faces based on distances between MPEG4 FDP (Face Definition Parameters); self-organizing maps (SOM) are applied before C4.5 and C5.0 to cluster the records into groups. In Akgun et al. (2004), the maximum likelihood, minimum distance and parallelepiped methods are compared for land use classification of satellite images, and the maximum likelihood method is found to be the most reliable for that application. DT-ST, an improved decision tree based learning algorithm, is proposed in (Liu et al., 2007). It makes use of semantic templates and introduces a hybrid tree simplification method.

Color Indexed Histogram and DWT Image Features
In image recognition, and in pattern recognition in general, the two major issues are feature extraction and the definition of a distance measure. Failure in either of the two will lead to poor performance of the recognition system, and content based image retrieval is no exception. In this work, we use WH, in which a unique color histogram and wavelet decomposition based horizontal, vertical and diagonal image attributes form the main features of the retrieval system.
Generally, the information in the three color layers of a typical RGB image is handled separately in feature extraction engines. This may lead to an inaccurate representation of the color features. For example, the same level in one particular layer may produce different colors in different parts of the image, since the other two layers also determine the color of the pixel. Several previous works handle the three layers separately and use separate histograms to measure the color features.

The Indexed Color Image Histogram
In WH, instead of using three separate layers for measuring the color features of the image, the RGB image is converted to an indexed image with a low level of color detail. That is, a 24-bit color image is converted to a 256-color indexed image. The color map of only one image of the whole dataset is stored separately and is reused to index the remaining images of the dataset. The color approximation method performs this color mapping, and the indexed images remain close to their original colors.
After this conversion, each indexed 8-bit pixel represents a particular color, which is stored separately in the map. For example, if the color red (255,0,0) is indexed with the number 78, then all the indexed pixels with value 78 represent the same red color (255,0,0). Because the map is shared, the value 78 denotes red, and only red, in every indexed image of the dataset. It is therefore possible to represent the color distribution of an image with a single histogram, in which bin 78 simply holds the count of the red indexed pixels.

The Feature Extraction using Wavelet Decomposition
The wavelet transform has become very popular in different fields and is often used for analysis, de-noising and compression of signals and images. The images resulting from a single-level two-dimensional wavelet decomposition have a number of interesting characteristics. Generally, the 2D wavelet decomposition produces four output images: L1, H1, V1 and D1.
The MATLAB implementation of the two-dimensional DWT function (dwt2) computes the approximation coefficients matrix L1 and the detail coefficients matrices H1, V1 and D1 (horizontal, vertical and diagonal), obtained by a wavelet decomposition of the input image matrix.
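As an illustration of this decomposition, the following is a minimal sketch of a single-level 2D transform in plain Python. It assumes the Haar wavelet and an image with even dimensions; the text does not state which wavelet is passed to dwt2, so the wavelet choice and the helper name haar_dwt2 are assumptions made here for illustration only. The outputs follow the L1, H1, V1, D1 naming used above, although coefficient signs may differ from a particular library's convention.

```python
import math


def haar_dwt2(image):
    """Single-level 2D Haar decomposition of a 2D list with even dimensions.

    Returns (L1, H1, V1, D1): approximation, horizontal, vertical and
    diagonal detail coefficients, each half the input size per dimension.
    """
    rows, cols = len(image), len(image[0])
    if rows % 2 or cols % 2:
        raise ValueError("image dimensions must be even")

    def split(seq):
        # Orthonormal Haar filter pair applied to consecutive sample pairs.
        s = math.sqrt(2.0)
        low = [(seq[i] + seq[i + 1]) / s for i in range(0, len(seq), 2)]
        high = [(seq[i] - seq[i + 1]) / s for i in range(0, len(seq), 2)]
        return low, high

    # Step 1: filter and downsample along every row.
    row_low, row_high = [], []
    for r in image:
        lo, hi = split(list(r))
        row_low.append(lo)
        row_high.append(hi)

    def split_columns(mat):
        # Step 2: filter and downsample along every column.
        lows, highs = [], []
        for col in zip(*mat):
            lo, hi = split(list(col))
            lows.append(lo)
            highs.append(hi)
        # Transpose back so results are indexed [row][col].
        return [list(r) for r in zip(*lows)], [list(r) for r in zip(*highs)]

    L1, H1 = split_columns(row_low)   # low-pass rows: approximation, horizontal detail
    V1, D1 = split_columns(row_high)  # high-pass rows: vertical detail, diagonal detail
    return L1, H1, V1, D1
```

For a 2×2 image whose top row is bright and whose bottom row is dark, only L1 and H1 are non-zero in this sketch, which reflects the horizontal edge in the input.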

The Feature Data Set Creation using Wavelet Decomposition and Indexed Color Image Histogram Method (WH)
The Feature Data Set is a feature based index that represents the whole image data in a most simplified form. These features reflect the content of the image. Using this Feature Data Set, we can search for a particular type of image using a query feature set which is derived from an input query image.
The function Indexed_Image_Histogram(I,M) takes an image I and a color map M and returns a color histogram with 256 bins. The 24-bit RGB color image is converted to a 256-color indexed image and its corresponding histogram is returned.
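A minimal sketch of such a function is given below in plain Python. It assumes that each pixel is mapped to the nearest color map entry in RGB space, which is one possible reading of the color approximation step and not necessarily the mapping used in WH. The name indexed_image_histogram mirrors the function described above, while nearest_index is a hypothetical helper introduced for illustration.

```python
def nearest_index(pixel, colormap):
    """Index of the colormap entry closest to an (R, G, B) pixel.

    Assumption: closeness is squared Euclidean distance in RGB space.
    """
    return min(
        range(len(colormap)),
        key=lambda i: sum((p - c) ** 2 for p, c in zip(pixel, colormap[i])),
    )


def indexed_image_histogram(image, colormap):
    """Histogram of an image over a shared colormap.

    image is a 2D list of (R, G, B) tuples; the result has one bin per
    colormap entry (256 bins for a 256-color map).
    """
    hist = [0] * len(colormap)
    for row in image:
        for pixel in row:
            hist[nearest_index(pixel, colormap)] += 1
    return hist


# Toy example with a 3-entry colormap instead of 256 entries.
colormap = [(255, 0, 0), (0, 255, 0), (0, 0, 255)]
image = [
    [(250, 5, 5), (0, 250, 0)],
    [(255, 0, 0), (10, 10, 240)],
]
histogram = indexed_image_histogram(image, colormap)
```

Because the same colormap is passed for every image, a given bin refers to the same color across the whole dataset, which is what makes the histograms directly comparable.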

Proposed Image Classification and Retrieval System
Classification prunes the search space in CBIR to only one class of images instead of performing a similarity check with every image in the database. Classification also reduces the semantic gap by trying to associate the image with a semantic label.
Classification algorithms can be categorized into symbolic learning, neural network, statistical and genetic methods. While genetic algorithms are still in their infancy, it has been shown that different approaches perform better for different datasets. Neural networks require careful training, which is time consuming and computationally intensive. In Edvardsen (2006), a thorough analysis is performed on different datasets using different classification algorithms, and it is concluded that rule-based approaches perform better than the other approaches when datasets exhibit extreme distributions. The accuracy as well as the time taken for training can be taken as objective measures of the performance of a classifier.

Classification and Regression Tree (CART)
CART (Breiman et al., 1984) is a supervised decision tree induction technique. It recursively partitions the input into two disjoint subsets based on the value of some attribute. Decision tree learning is simple in practice and robust to incomplete and noisy input features. Most of the decision tree approaches in the literature aim at improving the retrieval accuracy of the system.
CART uses impurity as the measure to determine the best split. Splitting is terminated when further growth of the tree does not contribute a significant improvement in the results. Every image is assigned to some leaf node that represents a class.
CART makes use of a post-pruning process to arrive at a compromise between the size of the tree and the accuracy of the estimates.
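The impurity-based split selection can be sketched as follows. The text does not say which impurity measure is used; this sketch assumes the Gini index, which is commonly associated with CART, and searches for the best threshold on a single numeric attribute. The function names gini and best_split are hypothetical and introduced only for illustration; pruning is not shown.

```python
from collections import Counter


def gini(labels):
    """Gini impurity: 1 minus the sum of squared class proportions."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def best_split(values, labels):
    """Best binary threshold on one attribute by weighted Gini impurity.

    Samples with value <= threshold go to the left child. Returns
    (threshold, weighted_impurity), or (None, gini(labels)) when no
    split reduces the impurity.
    """
    pairs = sorted(zip(values, labels), key=lambda p: p[0])
    n = len(pairs)
    best_thr, best_imp = None, gini(labels)
    for i in range(1, n):
        if pairs[i - 1][0] == pairs[i][0]:
            continue  # cannot place a threshold between identical values
        left = [lab for _, lab in pairs[:i]]
        right = [lab for _, lab in pairs[i:]]
        imp = (len(left) * gini(left) + len(right) * gini(right)) / n
        if imp < best_imp:
            # Use the midpoint between the two neighbouring values.
            best_thr = (pairs[i - 1][0] + pairs[i][0]) / 2.0
            best_imp = imp
    return best_thr, best_imp
```

A full tree would repeat this search over every attribute at every node and recurse on the two resulting subsets until the stopping rule is met.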

Advantages of Tree based Methods
Tree based methods require little parameter tuning compared with neural networks and genetic algorithms. Their classification speed is faster than that of nearest neighbor methods, and the resulting decision tree is easy to interpret.
Tree methods are appropriate for data mining tasks, where domain related a priori knowledge is missing.
The process of computing classification and regression trees involves specifying the criteria for predictive accuracy, selecting splits, determining when to stop splitting and selecting the "right-sized" tree. The proposed CBIR system is evaluated using the classification and regression tree classifier.

The Steps of Proposed CBIR System using CART
• Constructing the tree using the color indexed image histogram and discrete wavelet decomposition features of the training images
• Classifying the input image using the decision tree
• Retrieving all the best matching images from the matching class of the input image using a simple distance metric

Image Feature Matching
For matching the input image features with the stored features of the image dataset, the simple Euclidean distance is used as the distance metric. The matching images are ranked by their Euclidean distance to the query image. In our evaluations, we considered only the top 50 ranked matching images and calculated the precision as the average precision over several runs with query images from the same category.
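The matching and scoring step can be sketched as below, under the assumption that the class predicted by the decision tree is already available and that retrieval is restricted to database images of that class. The function names retrieve and precision_at_k, and the toy feature vectors and labels, are hypothetical and only illustrate the ranking by Euclidean distance and the top-k precision.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def retrieve(query, predicted_class, db_features, db_labels, top_k=50):
    """Rank database images of the predicted class by distance to the query.

    Returns database indices, nearest first, at most top_k long.
    """
    candidates = [i for i, lab in enumerate(db_labels) if lab == predicted_class]
    candidates.sort(key=lambda i: euclidean(query, db_features[i]))
    return candidates[:top_k]


def precision_at_k(retrieved, db_labels, true_class):
    """Fraction of retrieved images whose label matches the query's true class."""
    if not retrieved:
        return 0.0
    hits = sum(1 for i in retrieved if db_labels[i] == true_class)
    return hits / len(retrieved)


# Toy database: five 2-D feature vectors in two illustrative categories.
db_features = [[0, 0], [1, 1], [5, 5], [0.5, 0.5], [6, 6]]
db_labels = ["bus", "bus", "horse", "bus", "horse"]
ranked = retrieve([0.4, 0.4], "bus", db_features, db_labels, top_k=2)
score = precision_at_k(ranked, db_labels, "bus")
```

Averaging precision_at_k over several query images of one category gives the category-wise precision in this sketch, and averaging over the categories gives an overall figure.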

Model Validation
The results of classification can be measured through the error rate, or misclassification rate. Validation techniques are well established for measuring the prediction accuracy of a classification algorithm. A comprehensive survey of cross validation techniques for model selection can be found in (Arlot and Celisse, 2010). The techniques include hold-out, k-fold and leave-one-out validation.
In this study we have selected k-fold cross validation as the main method for evaluating the performance of the image classification system. Values of k between 5 and 10 are recommended (Hastie et al., 2009), and the improvement in performance is not very significant for values of k larger than 10. In this study, computationally feasible 10-fold cross validation is therefore applied for evaluating the performance of the classifiers.
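The fold construction behind k-fold cross validation can be sketched as follows. This is an illustrative sketch only: the function name k_fold_indices, the shuffling seed and the sample count of 100 are assumptions and are not taken from the experiments.

```python
import random


def k_fold_indices(n_samples, k=10, seed=0):
    """Yield (train_indices, test_indices) pairs for k-fold cross validation.

    Every sample appears in exactly one test fold; the other k - 1 folds
    form the training set for that round.
    """
    if not 2 <= k <= n_samples:
        raise ValueError("k must be between 2 and n_samples")
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)  # fixed seed for repeatable folds
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        test = folds[i]
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, test


# With k = 10, each round trains on 90% of the samples and tests on 10%.
splits = list(k_fold_indices(100, k=10))
```

The classifier would be rebuilt on each training set and scored on the matching test set, and the k scores averaged.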

RESULTS AND DISCUSSION
The results are benchmarked against standard systems, namely SIMPLIcity (Wang et al., 2001) and FIRM, and against some other previous works that used the same database. We compare the results of our previous work (Karpagam and Rangarajan, 2012) with the results of this work. In this work, the performance of the system is measured more rigorously using two validation methods: (1) holdout validation and (2) k-fold cross validation. The category-wise precision is measured and tabulated, and the average precision over all categories is taken as the overall precision of the CBIR system. Table 1 shows the results of some of the earlier works that are compared with the proposed CBIR system.
The decision tree is constructed using CART. The leaf nodes denote the category IDs. The attributes X1 to Xn are the n attributes extracted from the images and used for classification and retrieval. During k-fold validation, k is taken as 10, so in each fold 90% of the data is used to construct the tree and the remaining 10% to validate it.
Table 2 shows the results of the proposed CBIR system. The first entry is the result of our previous work (Karpagam and Rangarajan, 2012). The second and third are the results of the model proposed in this study, for which the precision is presented through holdout validation and k-fold validation. Figure 1 shows the comparative precision of the different CBIR systems. The precision of the proposed CBIR models is significantly higher than that of all the compared earlier works.

Comparison of Category-Wise Performance
Figure 2 shows the comparison of category-wise performance of the proposed CBIR systems.
The proposed model finds matching images in the database with higher accuracy in almost all image categories. The category-wise precision was good under both holdout validation and k-fold validation. This indicates that classification provides a significant improvement in the performance of the proposed classification tree based CBIR system.

CONCLUSION
A CBIR system has been implemented successfully using the decision tree based classification and retrieval model. For constructing the decision tree, we used the color indexed histogram with wavelet features proposed in our earlier work (WH). The performance of the proposed system, measured in terms of precision, was found to be good, and the proposed model is competitive with all the compared models.
Most of the earlier models produced equal or poorer results even with the aid of sophisticated region, shape and texture matching techniques, whereas the proposed model performs well with simple features and a simple classification model. We have thus demonstrated that a better CBIR system is possible with simpler and more significant feature sets.