CONTENT BASED MEDICAL IMAGE RETRIEVAL USING BINARY ASSOCIATION RULES

In this study, we propose a content-based medical image retrieval framework based on binary association rules to augment the results of medical image diagnosis and support clinical decision making. Specifically, the work is applied to Magnetic Resonance Imaging (MRI) brain scans, and the proposed Content Based Image Retrieval (CBIR) process aims to enhance the relevancy rate of retrieved images. The pertinent features of a query brain image are extracted by applying third order moment invariant functions and are then compared with the selected feature indexes of a large medical image database for appropriate image retrieval. Binary association rules are incorporated for organizing and marking the significant features of database images with respect to a specific criterion. A trigonometric function distance similarity measure is applied to improve the accuracy of the results. Moreover, the performance of the classification and retrieval methods is determined in terms of precision and recall rates. Experimental results reveal the efficacy of the proposed methodology as compared to related works.


INTRODUCTION
In the current decade, the enormous growth of digital images related to clinical diagnosis has led to the development of efficient image retrieval systems. Numerous studies have proposed a variety of techniques for medical image search. Content based image retrieval is one of the most active research areas, closely related to the fields of computer vision, information retrieval and image processing, and it assists better decision making in medical practice. Though there is pervasive enthusiasm for CBIR in the engineering research community, the application of this expertise to solve practical medical problems is a goal yet to be accomplished. CBIR is widely used in many areas such as the internet, commerce, biomedicine and education. It can also be termed content-based visual image retrieval since it exploits the visual content of image data. Inherently, the image retrieval process extracts several features that define the image content, such as intensity, shape, color, texture, size and location. Such information is described as independent vectors (Chatzichristofis et al., 2010). Furthermore, a CBIR system provides better indexing and returns more accurate results, which makes it applicable in the medical domain. With that concern, we propose an effective CBIR methodology using binary association rules for brain tumor diagnosis. The task of CBIR is two-fold:
• Index the image features based on the visual content characteristics
• Retrieve images from a massive database that are similar to the query image
Content-based image retrieval is an advanced method to search, navigate and browse large medical image databases. It can serve as a diagnostic aid, support research studies and help manage teaching file systems and similar image collections in the medical domain.
Science Publications JCS
The query images considered in our work are magnetic resonance brain images, which provide good contrast between the soft tissues of the brain. We employ moment invariant feature extraction methods since they enable shape discrimination based on unique features of brain images. Depending on those features, the feature vector is evaluated and given as the index for further classification of brain images into normal, benign and malignant classes. A significant part of this diagnosis is training the neural network to classify brain images according to their characteristics.
Common similarity measures used in CBIR are the Euclidean distance, the Minkowski 1-distance and the Kolmogorov-Smirnov distance (Horsthemke et al., 2007), which determine the similarity between different shapes of brain images (Torres and Falcao, 2006). However, these metrics have some limitations that may greatly reduce the validity of the similarity measures. Hence, we use the trigonometric function distance to measure image similarities. The similarity metrics considered for analysis are the feature similarity measure and the correlation similarity measure.
Figure 1 shows the general architecture for content based image retrieval. Feature extraction is the key step in brain image recognition, since it reduces the dimensionality of the brain image. In the proposed methodology, binary association rules are incorporated for organizing and marking the significant features, which improves the relevancy rate of the obtained results.
The Association Rule (AR) based method selects typical features of MRI images by combining low-level features extracted from the images with high-level knowledge from specialists (Ribeiro et al., 2008). AR supports better decision making in medical image diagnosis. In this method of tumor detection, each training image is combined with a set of keywords, which are the representative terms preferred by the specialists for accurate results. Association rule mining enables efficient classification of magnetic resonance brain images into three categories: normal, benign and malignant (Rajendran and Madheswaran, 2010). Mining can be done on the integrated collection of brain images, termed associated data. The binary association rule method proposed in this study selects unique features of distinctive images and considerably reduces the number of features through feature indexing methods.
As stated above, the task of classification plays a substantial role for domain practitioners using medical images. Hence, we propose a novel approach, CBIR based image retrieval using binary association rules. The resultant images are obtained with higher relevancy and accuracy rates within their specific category.
The remainder of this study is structured as follows. Section 2 discusses the related works, Section 3 describes our proposed CBIR method for brain image classification and image retrieval in tumor diagnosis and Section 4 discusses the experiments and the results achieved. Finally, in Section 5, we present the conclusion and future enhancements of the proposed work.

Related Works
Datta et al. (2005) summarized the distinctive approaches and trends of content based image retrieval. The key contributions and key challenges in the current research trends related to image retrieval and annotation were discussed in that study with reference to existing techniques. Flusser (2005) described image classification based on moment invariants. The author reviewed efficient numerical algorithms used for moment evaluation and demonstrated some practical examples of real-time applications based on moment invariance. The study also explained the construction of moment invariant functions, which can be used in medical image diagnosis. The invariant-based approach is a clear step towards robustness and reliability in pattern recognition methods.
An image retrieval method based on query topic dependent image features, comprising a database, query topics and ground truth data, was given in (Xiong et al., 2005). Both inter-category and intra-category statistical variations of images were captured to provide better precision and recall. An optimal relevance feedback strategy was also incorporated into the retrieval system. The limitation of this approach was that the database contained more topics than the data sufficient for image evaluation. It is well known that efficient image retrieval is an application and content dependent task. In Torres and Falcao (2006), various theories and applications of CBIR systems were illustrated. Specifically, the study provided information about the feature extraction process that encodes image features into feature vectors and about a similarity measure to calculate the relevancy rate. The study also covered query specification methods such as the K-Nearest Neighbor Query (KNNQ) and the Range Query (RQ), as well as distinct CBIR systems including indexing structures and effectiveness measures. A CBIR system was used for retrieving pathologies specific to an anatomical structure in (Horsthemke et al., 2007). The performance evaluation was made with precision and recall measurements, and the similarity measures were the Euclidean distance, Chi square statistics and the Minkowski 1-distance. A feature space oriented image representation based on the probabilistic output of multi class Support Vector Machines (SVM) and several classifier combination rules was explored in (Rahmana et al., 2008). The accuracy rate could be improved by the fusion of similarity matching functions. Nonetheless, the limitation of this approach lay in selecting appropriate parameters and a matching function for a category specific search process. Caicedo et al. (2008) obtained a different image retrieval result by applying two phases, namely a preprocessing phase and a retrieval phase. A scheme called Cross Category Feature Importance (CCFI) was incorporated for combining similarity measures in order to provide better results. The enhancement of this study was based on blending the textual and visual information of the medical image. Further, Uwimana and Ruiz (2008) presented an automatic image classification method for CBIR. The recall, precision and error rates of retrieved images were analyzed with the occurrences of True Positive (TP) and True Negative (TN) results. There were some limitations, including uneven dataset distribution and image content overlapping. Wang and Miao (2008) developed a scale invariant face recognition framework and suggested some probabilistic similarity measure calculations for precise result computation. A different approach for image retrieval based on Shannon entropy and the self-Adaptive Genetic Algorithm based on Random operator (AGAR) was explained in (Lei et al., 2009). However, the retrieval efficiency remained relatively low while searching images in massive databases. Wang et al. (2009) classified brain tumors using information from MRI and Magnetic Resonance Spectroscopy (MRS). Segmentation, feature extraction, feature selection and classification model construction were the steps included in that study for brain tumor classification. Moreover, they used the Region of Interest (ROI) for the feature extraction process and the Concentric Circle (CC) method for selecting distinctive features. The classification accuracy of that study could be improved by incorporating more specific information such as spatial details about the tumor.
Following the advancements of CBIR, review papers (Wanjale et al., 2010; Surya and Sasikala, 2011) presented an overview of various techniques, clustering algorithms and storage methods. They explained methods such as shape based retrieval, texture based methods, concepts of feature selection, low-level visual information and indexing techniques. A meticulous classification of MR brain images using both texture and shape features was proposed in (Li et al., 2010). The authors applied a statistical association rule miner algorithm to evaluate the weight coefficient of each characteristic. The brain images were defined under 14 categories with respect to their distinctive anatomical structure and contents, and a rigorous classifier was developed for the brain image retrieval system.
A model derived in (Ramamurthy and Chandran, 2011) included the basic concepts of CBIR, namely feature extraction, image classification, image indexing and retrieval, in which the features are extracted using the Canny edge detection algorithm. The similarity measure was calculated by the Euclidean distance. Semantic association rules (Anca and Udristoiu, 2011) were used to produce high-level concepts, which were extracted from visual content. The approach provided a means of learning the medical image diagnosis using low-level features. Association rule mining reveals the relationships hidden in a potentially large image database. Shekhawat and Dhande (2011) formed a framework that combines association rule mining and classification rule mining in medical image diagnosis, called the neural network association classification system. This system is used for the construction of accurate and efficient classifiers, and the classification methods could be further enhanced with the predictive apriori algorithm. Due to the variability and complexity of tumors, the classification of brain tumor images is considered a difficult task (Kharat et al., 2012). Basically, the neural network technique comprises two stages, namely feature extraction and classification. In our proposed work, we incorporate rule pruning methods based on binary association rules for feature selection from the extracted features of brain images before classification. Similarity based image retrieval using trigonometric function distance computation was described in (Akila and Maheswari, 2012). The approach overcame the limitations of similarity measurement functions such as the Euclidean distance, the root mean square error and the Kullback-Leibler distance. Building on those conclusions, this paper integrates binary association rule mining for feature selection and the trigonometric distance function for similarity calculation into a content based medical image retrieval system that supports diverse medical applications.

Proposed Work
Progress in data storage techniques and image acquisition in the medical domain has enabled the creation of massive image databases. Searching and retrieving images from such massive databases with high precision therefore remains a great challenge. Hence, we propose a content based medical image retrieval system using binary association rules, notably for brain tumor diagnosis. The methodology comprises the general phases of CBIR: feature extraction, indexing, feature selection, classification and similarity metric analysis. The significance of this study lies in the addition of binary association rule based pruning techniques for feature selection and the trigonometric distance function for similarity metric analysis. The examination consists of a training phase and a test phase. The training phase trains the neural network with a variety of brain images, whereas the test phase inspects an unseen image for tumor cells. Figure 2 depicts the overall architecture of our proposed model.
Magnetic resonance brain images are taken as the input since they provide good contrast among the distinctive soft tissues of the brain, which improves result accuracy. The acquired MRI image is preprocessed with denoising functions. The noise free image is then passed to the feature extraction process, which supports accurate classification results. Feature extraction is done with third order moment invariant functions, in parallel for both the query image and the images present in the massive image database. A feature matrix is obtained from the database images, whereas a feature vector is obtained for the single query image. Next, feature selection is done by binary association rules, and a rule pruning technique is employed along with the feature selection index in order to select pertinent features from the image data store. Subsequently, classification is performed by a trained neural network classifier into three classes: normal, benign and malignant. The similarity metric is determined in terms of image features and correlation factors using trigonometric distance evaluation. Finally, images similar to the query image are retrieved from the large medical image database and analyzed for precision and recall rates.

Moment Invariant Feature Extraction
The feature extraction of MR images is accomplished with third order moment invariant functions. Generally, moments are the projections of the image function onto a polynomial basis. In practice, an image obtained by an MRI system is a degraded version of the original scene.
Those degradations occur during image acquisition due to factors such as lens aberration, imaging geometry, motion of the scene, wrong focus and random sensor error. The robustness of invariants with respect to these factors is crucial. Hence, we use a moment invariant mechanism in feature extraction. Moments are sensitive to local changes in the image, but they are very robust to noise. Accordingly, invariants are applied with respect to intensity changes, contrast changes, convolution and rotation.
During MRI, the brain is scanned in various positions to give distinctive brain images for accurate prediction and classification of brain tumors. By differentiating the intensity values of images in increasing orders, we evaluate the moment invariants of Equations 1-7, where ϕ represents the invariant value of an extracted feature of a particular brain slice, which is obtained from the µ values, the differential values of the image intensities.

From the above equations, the distinctive features of images are extracted based on seven rotation invariants using third-order differentiations. Then, by applying binary association rules along with the rule pruning method on the medical image database, selected feature indexes are obtained, which are given as a feature subset to the neural network classifier.
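Since Equations 1-7 are not reproduced here, the following is only a minimal sketch that assumes the seven invariants are the classical Hu moment invariants built from normalized central moments up to third order. The function name `hu_moments` and the use of NumPy are illustrative choices, not taken from the original implementation.

```python
import numpy as np

def hu_moments(image):
    """Return seven rotation invariants of a 2-D grayscale image.

    Assumption: the invariants are Hu's classical set, computed from
    scale-normalized central moments of order two and three.
    """
    img = np.asarray(image, dtype=float)
    rows, cols = np.mgrid[:img.shape[0], :img.shape[1]]
    m00 = img.sum()
    # Intensity centroid, which makes the moments translation invariant.
    r0 = (rows * img).sum() / m00
    c0 = (cols * img).sum() / m00
    dr, dc = rows - r0, cols - c0

    def eta(p, q):
        # Central moment mu_pq, normalized for scale.
        mu = (dr ** p * dc ** q * img).sum()
        return mu / m00 ** (1 + (p + q) / 2.0)

    n20, n02, n11 = eta(2, 0), eta(0, 2), eta(1, 1)
    n30, n03, n21, n12 = eta(3, 0), eta(0, 3), eta(2, 1), eta(1, 2)
    a, b = n30 + n12, n21 + n03
    return np.array([
        n20 + n02,
        (n20 - n02) ** 2 + 4 * n11 ** 2,
        (n30 - 3 * n12) ** 2 + (3 * n21 - n03) ** 2,
        a ** 2 + b ** 2,
        (n30 - 3 * n12) * a * (a ** 2 - 3 * b ** 2)
        + (3 * n21 - n03) * b * (3 * a ** 2 - b ** 2),
        (n20 - n02) * (a ** 2 - b ** 2) + 4 * n11 * a * b,
        (3 * n21 - n03) * a * (a ** 2 - 3 * b ** 2)
        - (n30 - 3 * n12) * b * (3 * a ** 2 - b ** 2),
    ])
```

Under this assumption, calling `hu_moments` on an image and on the same image rotated by 90 degrees (for instance with `np.rot90`) should give the same seven values up to floating point error, which is the property the feature vector relies on.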

Binary Association Rule Generation
The concept of feature selection involves reducing the inputs to a manageable size for effective processing and analysis. A quality pattern with substantial features is discovered from the large training dataset using binary association rules. The rules capture the associations among features extracted from the MRI image gallery. Moreover, strong rules in the database are identified for analysis using different measures of interestingness. The problem of binary association rule generation is stated as follows. Let $D = \{t_1, t_2, \ldots, t_m\}$ be a set of transactions and $I = \{i_1, i_2, \ldots, i_n\}$ be a set of items. Each transaction contains a subset of the items in $I$. A rule is defined as an implication of the form $X \Rightarrow Y$, where $X, Y \subseteq I$ ($X$ is the antecedent of the rule and $Y$ is the consequent of the rule). The association rules are restricted such that the antecedent of a rule is a conjunction of features from the magnetic resonance brain image, whereas the consequent of the rule is always the class label to which the brain image belongs. The method finds rules that satisfy the minimum confidence and minimum support values specified by the user.
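The rule generation step can be sketched as follows. This is a brute-force illustration under stated assumptions, not the mining algorithm used in the study: each image is assumed to be a set of binary feature items with one class label, and the function name `mine_class_rules`, the tuple layout and the `max_len` cap on antecedent size are hypothetical.

```python
from itertools import combinations

def mine_class_rules(transactions, labels, min_support, min_confidence, max_len=2):
    """Return rules (antecedent, label, support, confidence) meeting both thresholds.

    transactions: list of sets of binary feature items, one per image.
    labels: class label of each image, used as the only allowed consequent.
    """
    n = len(transactions)
    items = sorted(set().union(*transactions))
    rules = []
    for size in range(1, max_len + 1):
        for antecedent in combinations(items, size):
            x = frozenset(antecedent)
            # Labels of all images whose feature set contains the antecedent.
            covered = [lab for t, lab in zip(transactions, labels) if x <= t]
            if not covered:
                continue
            for label in sorted(set(covered)):
                both = covered.count(label)
                support = both / n                # fraction of images with X and label
                confidence = both / len(covered)  # fraction of X-images with label
                if support >= min_support and confidence >= min_confidence:
                    rules.append((x, label, support, confidence))
    return rules
```

Exhaustive enumeration is only practical for a handful of features; with many items an Apriori-style search that extends only frequent antecedents would be the usual choice.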

Rule Pruning Technique
Employing rule pruning techniques becomes necessary since the number of rules produced in the preceding phase is very large. The rule pruning technique eliminates the rules that are conflicting. Pruning of the specific association rules is performed with the following cases:
Case 1: Consider two rules X1 ⇒ C and X2 ⇒ C. The first rule is a general rule if X1 ⊆ X2. To accomplish this, the association rules must be ordered according to Case 2.
Case 2: Given two rules X1 and X2, X1 is ranked higher than X2 if:
• X1 has a higher confidence value than X2
• The confidences are equal, but the support of X1 exceeds the support of X2
• Both the confidence and support values are equal, but X1 has fewer attributes on its left-hand side than X2
The next case eliminates the conflicting rules.
Case 3: The rules X1 ⇒ C1 and X2 ⇒ C2 are conflicting in nature.
Based on the above cases, duplicates are eliminated. The set of rules chosen after pruning represents the actual classifier. These cases are used to predict the class to which a new test image belongs.
After applying the rule pruning technique, the number of features for brain tumor diagnosis is considerably reduced. Thus, the process substantially reduces the computation time and increases the accuracy of the results.
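One possible reading of the three pruning cases is sketched below. The rule layout `(antecedent, label, support, confidence)` and the function names are hypothetical, and two interpretations are assumptions rather than statements from the text: a specific rule is dropped when a higher-ranked general rule for the same class exists, and two rules are treated as conflicting when they share the same antecedent but predict different classes.

```python
def rank_key(rule):
    """Case 2 ordering: higher confidence, then higher support, then fewer attributes."""
    antecedent, _label, support, confidence = rule
    return (-confidence, -support, len(antecedent))

def prune_rules(rules):
    """Keep only rules that are neither redundant nor conflicting."""
    kept = []
    for rule in sorted(rules, key=rank_key):
        antecedent, label = rule[0], rule[1]
        # Case 1 (assumed reading): a kept, higher-ranked rule for the same
        # class with a subset antecedent makes this rule redundant. This
        # also removes exact duplicates.
        redundant = any(k[1] == label and k[0] <= antecedent for k in kept)
        # Case 3 (assumed reading): same antecedent, different class.
        conflicting = any(k[1] != label and k[0] == antecedent for k in kept)
        if not (redundant or conflicting):
            kept.append(rule)
    return kept
```

Because the rules are visited in ranked order, the stronger of two conflicting or overlapping rules is always the one that survives.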

Neural Network Classification
Following the training phase, a neural network classifier with the pruned set of association rules is employed for classifying the brain images. Each training image is associated with a set of keywords (ground truth data), which are the representative words given by a specialist for use in medical image diagnosis. The feature vectors and subsets obtained from the rule pruning method are submitted to the neural network classifier, which uses the set of keywords and association rules to categorize the given image. Both the magnetic resonance brain images in the database and the query image are classified into three classes, namely normal, benign and malignant.

Similarity Metric Evaluation
After classification, the similarity metric evaluations are made in terms of features and visual correlation. The feature based similarity measure of two images is based on the distance between their descriptors. The distance metric used here is the trigonometric function distance. This method normalizes the distance between two points and can be used in similarity measurement to improve the accuracy of image matching.

Trigonometric Function Distance-Description
Let $x, y \in \mathbb{R}$. The trigonometric function distance is defined as Equation 8:

$$d(x, y) = \sin(\arctan(|x - y|)) \tag{8}$$

The distance between two vectors can be obtained by Equation 9, where the minimum distance is 0 and the maximum distance is 1. Hence, the trigonometric function can normalize the distance between two points. If the parameter is taken as x − y, the trigonometric function distance is transformed into the form of Equation 10. The properties of the trigonometric function distance are as follows.
• The distance is always positive or zero: $\mathrm{Dist}(x, y) \ge 0$
• The distance from a feature to itself is zero: $\mathrm{Dist}(x, x) = 0$
• It obeys the triangle inequality for any three points x, y, z: $\mathrm{Dist}(x, z) + \mathrm{Dist}(y, z) \ge \mathrm{Dist}(x, y)$
According to the properties and equations stated above, we empirically evaluate the similarity between the query image and each database image. Subsequently, the similarity measure in terms of visual correlation is also determined. The correlation coefficient is computed as r = corr(X, Y), where X and Y denote image matrices of the same size. The estimation of the correlation coefficient for X and Y is given by Equation 11. The value of the correlation coefficient lies in the range 0 ≤ r ≤ 1. The relevancy rate of retrieved images depends on the trigonometric distance and the correlation value between two images. A high relevancy rate is obtained with minimum distance and maximum visual correlation value.
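A minimal sketch of the trigonometric function distance follows, reading Equation 8 as sin(arctan(|x − y|)). The closed form uses the identity sin(arctan(t)) = t / sqrt(1 + t²). The vector form is not recoverable from the text, so `trig_vector_distance` is an assumed stand-in that averages the component-wise scalar distances; all function names are illustrative.

```python
import math

def trig_distance(x, y):
    """Scalar distance sin(arctan(|x - y|)), which lies in [0, 1)."""
    return math.sin(math.atan(abs(x - y)))

def trig_distance_closed_form(x, y):
    """Same value via the identity sin(arctan(t)) = t / sqrt(1 + t^2)."""
    t = abs(x - y)
    return t / math.sqrt(1.0 + t * t)

def trig_vector_distance(u, v):
    """Assumed vector form: mean of the component-wise scalar distances."""
    if len(u) != len(v):
        raise ValueError("feature vectors must have the same length")
    return sum(trig_distance(a, b) for a, b in zip(u, v)) / len(u)
```

The scalar distance is zero only for identical values and approaches 1 as the difference grows, which is the normalization property described above. Averaging keeps the vector distance in the same range regardless of the number of features.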
Finally, the relevant images are retrieved from the large medical image database. The performance of the classification and retrieval methods is then determined in terms of precision and recall rates.
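The visual correlation step and the final ordering of candidates might look like the sketch below. It assumes that the correlation coefficient is the standard two-dimensional Pearson correlation of two equally sized image matrices, and that candidates are ordered by ascending distance with ties broken by descending correlation. The names `correlation_coefficient` and `rank_candidates` and the tuple layout `(image_id, distance, correlation)` are hypothetical.

```python
import numpy as np

def correlation_coefficient(x, y):
    """2-D Pearson correlation between two equally sized image matrices."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    if x.shape != y.shape:
        raise ValueError("images must have the same size")
    xc, yc = x - x.mean(), y - y.mean()
    denom = np.sqrt((xc ** 2).sum() * (yc ** 2).sum())
    # A constant image has no variance; report zero correlation in that case.
    return float((xc * yc).sum() / denom) if denom > 0 else 0.0

def rank_candidates(candidates, top_k=10):
    """Order (image_id, distance, correlation) by small distance, then large correlation."""
    return sorted(candidates, key=lambda c: (c[1], -c[2]))[:top_k]
```

Note that the Pearson coefficient can be negative for inversely related images, so a retrieval stage that expects values between 0 and 1 would need to discard or clip such candidates.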

Performance Evaluation Criteria
For an effective CBIR system, the paper estimates the overall retrieval performance using precision and recall, defined by Equations (12) and (13). The evaluation process considers each image as a query, measures every retrieval outcome and reports the overall average. To report on the overall evaluation of our medical image retrieval system's performance, we include a performance analysis section. Due to the variance and complexity of tumors, the classification of brain tumor images is considered a challenging task. Hence, our proposal works towards efficient retrieval of relevant medical images from massive databases, which supports better decision making in clinical diagnosis.
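As Equations (12) and (13) are not reproduced here, the sketch below assumes the standard definitions: precision is the fraction of retrieved images that are relevant, and recall is the fraction of relevant images that are retrieved. The function names are illustrative, and the averaging mirrors the per-query evaluation described above.

```python
def precision_recall(retrieved_ids, relevant_ids):
    """Return (precision, recall) for one query."""
    retrieved, relevant = set(retrieved_ids), set(relevant_ids)
    hits = len(retrieved & relevant)  # relevant images that were retrieved
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall

def average_precision_recall(results):
    """Average over queries; results is a list of (retrieved_ids, relevant_ids)."""
    pairs = [precision_recall(r, g) for r, g in results]
    n = len(pairs)
    return sum(p for p, _ in pairs) / n, sum(r for _, r in pairs) / n
```

For a query that returns four images of which two are relevant, out of three relevant images in the database, this gives a precision of 0.5 and a recall of 2/3.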

Experimental Results
We have tested our CBIR approach with the IBSR dataset (http://www.cma.mgh.harvard.edu/ibsr/), which contains multiple scan images of patients with and without brain tumors. The dataset was partitioned into three sets: 80% for training, 10% for validation and 10% for testing. All the computations are implemented using MATLAB V7.9 with a learning rate of 0.001. Initially, the MRI query image is fed to the moment invariant feature extraction process to extract the decisive features that support effective brain tumor diagnosis. To present the experimental results, we take the brain image shown in Fig. 3 as the input query.
The process derives 7 distinctive characteristics of brain images from the MRI gallery. Using binary association rules and pruning methods, significant features are selected for efficient classification. This procedure is applied to both the query image and the images in the large medical database. The selected feature indexes are given as a feature vector to the feature subset. This subset is then used for category prediction of the query image. Then, the MRI test image and the database brain images are classified by a neural network classifier. In this experiment, the classifier is trained with the Levenberg-Marquardt training function for optimal classification of MRI images into normal, benign and malignant classes. The given query image is classified under the normal category.

Fig. 3. MRI brain image
Figure 4 shows the number of epochs needed to attain the best performance and the Mean Square Error (MSE) during neural network classification. We obtained the best evaluation rate of 0.16187 at 5 epochs with minimal MSE. Compared with the related methods, the binary association rules based CBIR affords higher performance with fewer epochs and a lower MSE.
Figure 5 presents the performance of the training, validation and testing phases and the regression rate of the proposed approach. The graphs plotted in that figure show the relationship between the target of our classification procedure and the actual output produced.
Similarity measures are calculated in two stages of analysis, using the equations stated in section 3.5, between the images in the database and the query image. In the first stage of experimental analysis with the large database of brain images, according to the trigonometric distance function, we obtain 10 relevant images for the given query image.
The relevant image ID, category and distance are given in Table 1. The table shows that most of the images are relevant to the category and pertinent features of the query image. Nonetheless, the images with ID 5 and 128 deviate slightly from the specified constraints (an image may have either a large neighbor distance value or a mismatched category).
In the next stage of analysis, the most relevant images are scrutinized for efficacious retrieval.
Subsequently, correlation analysis is performed to examine the visual similarity of the images given in Table 1 with respect to the given query. The retrieved images should have minimum trigonometric distance and maximum correlation coefficient values. On this basis, Table 2 shows the image details of the most relevant brain images.
The relationship between the values of the neighbor index (relevant image ID) and the neighbor distance (relevant image distance) is represented in Fig. 6. The graph is plotted for the relevant image indexes obtained after the retrieval analysis with trigonometric function distance and correlation similarity. The graph shows that the retrieved images with less similarity have a larger neighbor distance. Figure 7 shows the correlation similarity graph, plotted against the values of the neighbor index (index of the relevant image) and the correlation measure, which is calculated by Equation 11. The approach mainly concentrates on similarity measurements of both features and visual correlation of images. Feature similarity is obtained using the trigonometric function distance, whereas visual similarity is measured with the correlation coefficient. The combination of these methods in CBIR leads to an effective image retrieval process with high precision and relevancy rates.
As mentioned earlier, the correlation measure ranges over 0 ≤ r ≤ 1, and we consider images whose correlation coefficient value is nearer to 1 to be more relevant to the given query image. Those images are retrieved from the large medical image database, which assists effective diagnosis of brain tumors in clinical practice.
Following the evaluation of similarity measures, the relevant images for the query image, shown in Fig. 8, are obtained. The retrieved images have minimum trigonometric distance and maximum correlation coefficient values. The retrieved images of the specified IDs are as follows.
Finally, six images are retrieved using CBIR with binary association rules, with indexes 6, 7, 8, 9, 10 and 12, all of which are in the normal category in the medical image database. The process is concluded with the performance analysis by the determination of precision and recall rates (section 3.6). Table 3 shows the results of the performance analysis with respect to precision and recall rates. Thus, our experimental analysis provides evidence for the efficiency of the proposed approach.

CONCLUSION
The main intention of this study is to suggest a suitable method for effective CBIR that assists medical image diagnosis in the clinical domain. By applying binary association rules and pruning techniques, pertinent features are selected from the images concerned. We measure the similarity between the MRI brain image and the classified images of a massive database by computing trigonometric distance and visual correlation metrics. The experimental results have shown that the proposed method achieves high accuracy, relevancy, precision and recall rates. Hence, we suggest that the association rule based CBIR system affords better decision making in discriminating brain tumors and reduces complexity. In future, investigating the applicability of our suggested procedure to other medical images is of great interest. Improving the speed of image retrieval is another direction for future research efforts.