SHAPE RETRIEVAL THROUGH MAHALANOBIS DISTANCE WITH SHORTEST AUGMENTING PATH ALGORITHM

Shape matching and object recognition play a vital role in computer vision. Shape matching is difficult for real-world images such as those in the MPEG database, since real-world images have both internal and external contours. A Mahalanobis distance based shape context approach is proposed to measure the similarity between shapes and to exploit it for shape retrieval. Shape retrieval identifies the shapes in a database that are relevant to a query image. The query image is matched against the reference images, which yields a dissimilarity between the shapes. This dissimilarity measure, which is a distance between two images, is used to identify the relevant images in the database. Shape matching has three major steps: finding correspondences, measuring distance and applying an aligning transformation. Finding correspondences means finding the best matching points between the query image and the reference image; the correspondence problem is solved with shape contexts and the shortest augmenting path algorithm. The distance is then measured between the corresponding points. In this study, the Mahalanobis distance is used to find the distance between the images. The aligning transformation aligns the shapes in order to achieve the best match. Object recognition is achieved with the k-nearest neighbor algorithm. The proposed method is simple, robust to noise and gives a better error rate than the existing methods.


INTRODUCTION
Owing to the rapid development of digital and information technologies, more and more information is generated and made available in digital form from a variety of sources around the world. Shape retrieval is one of the significant ways to retrieve information from a database, and retrieval from databases of isolated visual shapes has become an important information retrieval problem. The goal of the current work is to achieve high retrieval speed with reasonable retrieval effectiveness and support for partial and occluded shape queries.
A shape can be defined as an equivalence class under a group of transformations. Shape matching identifies two shapes as the same when the similarity between them is high. The statistician's definition of shape addresses the problem of shape distance but assumes that the correspondences are known. Shapes can also be matched without finding correspondences by using intensity-based techniques. According to an extensive survey (2002), shape matching approaches in computer vision can broadly be classified into two groups: brightness-based and feature-based methods.
Science Publications JCS
Recognition is difficult for 3D objects. The following factors affect object recognition in 3D databases such as the MPEG-7 database:
• Finding edges in 3D images is more difficult than in 2D images
• The contours of 3D objects are complex because they include both internal and external contours
• 3D objects need a larger number of reference images, since the object looks different from different viewing angles
An algorithm used to recognize 3D objects should therefore have a well-defined edge detection method, accommodate complex contours and apply a well-defined transformation. The number of reference images should grow with the number of views in order to increase the efficiency.
Feature-based methods use features such as the boundaries, edges and lengths of the shapes in an image. The reference and test images are compared using these features, and objects are identified. Several techniques are available for feature-based shape matching. Zhang et al. (2011) proposed a Boosted Exemplar Learning (BEL) approach to model various actions in a weakly supervised manner. The BEL method can be summarized in three steps. First, for each action category, a number of class-specific candidate exemplars are learned through an optimization formulation that considers their discrimination and co-occurrence. Second, each action bag is described as a set of similarities between its instances and the candidate exemplars. Third, the selection of the most discriminative exemplars is formulated as a boosted feature selection framework, which simultaneously yields an action bag-based detector. They give test results for the KTH and Weizmann datasets. Geronimo et al. (2010) presented a more convenient strategy to survey the different approaches to pedestrian detection: the problem of detecting pedestrians from images is divided into different processing steps, each with attached responsibilities. The proposed methods are then analyzed and classified with respect to each processing stage, favoring a comparative viewpoint. Bai et al. (2010) provided a new perspective on this problem by considering the existing shapes as a group and studying their similarity to the query shape in a graph structure. The method is general and can be built on top of any existing shape similarity measure. For a given similarity measure, a new similarity is learned through graph transduction. The new similarity is learned iteratively, so that the neighbors of a given shape influence its final similarity to the query. This approach yields significant improvements over state-of-the-art shape matching algorithms and obtained a retrieval rate of 91.61 percent on the MPEG-7 data set. Moreover, the learned similarity also achieves promising improvements in both shape classification and shape clustering.
Eitz et al. (2011) introduced a benchmark for evaluating the performance of large-scale sketch-based image retrieval systems. The necessary data are acquired in a controlled user study in which subjects rate how well given sketch/image pairs match. They suggested how to use the data for evaluating the performance of sketch-based image retrieval systems and made the benchmark data as well as the large image database publicly available. They also developed new descriptors based on the bag-of-features approach and used the benchmark to demonstrate that these significantly outperform other descriptors. Yu et al. (2012) considered multiple features from different views, i.e., color histogram, Hausdorff edge feature and skeleton feature, to represent cartoon characters with different colors, shapes and gestures. Each visual feature reflects a unique characteristic of a cartoon character, which motivates combining multiple features into the similarity measure. The method introduced a semi-supervised Multiview Subspace Learning (semi-MSL) algorithm to encode the different features in a unified space. Its effectiveness was demonstrated by experimental evaluations on both cartoon character retrieval and clip synthesis for cartoon applications. Lin and Chang (2011) proposed an efficient 2D shape matching algorithm. The mean distances and standard deviations of shape contexts are used as an index of shapes to reduce the search space when matching with the shape context descriptor. Best-fit ellipse modeling is adopted as preprocessing to normalize the scale. They experimented with shapes and 3D objects from the MPEG-7 silhouettes and the COIL data set, respectively.
Sahbi et al. (2013) presented a framework to match and recognize multiple instances of multiple reference logos in image archives. The reference and test images are represented by local features and matched by minimizing an energy function that mixes (1) a fidelity term that measures the quality of feature matching, (2) a neighborhood criterion that captures feature co-occurrence/geometry and (3) a regularization term that controls the smoothness of the matching solution. It improves the efficiency by 20% on the MICC-Logos dataset compared to previous methods. Temlyakov et al. (2013) proposed a novel method that refines pairwise similarity measures using population cues by examining the most similar instances shared by the compared shapes or images. The refined measure is used to organize instances into disjoint components that consist of similar instances. Connectivity is then established between components to avoid hard constraints on which instances can be retrieved, improving retrieval performance. They experimented with the MPEG-7 and Swedish Leaf shape datasets. The method is versatile, performing very well on its own or in concert with existing methods. Khalid and Mukhtar (2013) proposed an approach to significantly speed up complex but accurate shape matching approaches in order to meet the demands of online shape retrieval and classification. They presented an extremely efficient shape matching approach based on compressed Fourier coefficients. The result is an efficient algorithm for online shape retrieval and classification. Restrepo et al. (2012) presented a new volumetric representation for categorizing objects in large-scale 3-D scenes reconstructed from image sequences. This is presented as the first work to characterize and use the local 3-D information in such scenes. The resulting description is used in a bag-of-features approach to classify buildings, houses, cars, planes and parking lots learned from collected aerial imagery.
This algorithm achieved higher classification accuracy than Harris-based features.
Salve and Jondhale (2010) proposed a shape detection method using a feature called shape context. The shape context describes all boundary points of a shape with respect to any single boundary point and is thus descriptive of the shape of the object. Object recognition can be achieved by matching this feature against a priori knowledge of the shape context of the boundary points of the object. Honge et al. (2009) proposed an algorithm that performs image segmentation using color information in the HSV color space to obtain the pixels of the object and then applies edge detection to these pixels to recognize the object. Experiments show that this algorithm can recognize the object accurately under different illumination conditions and satisfies the requirements of the competition.
The Thin Plate Spline (TPS), introduced by Bookstein (1989), is an effective tool for modeling coordinate transformations in several computer vision applications. Milios and Euripides (2000) proposed a shape matching algorithm for deformed shapes based on dynamic programming. This algorithm is superior to traditional approaches to shape matching and retrieval such as Fourier descriptors and geometric and sequential moments. Attalla and Siy (2005) developed a robust shape similarity retrieval method that is used to match and recognize 2D objects. Contour flexibility is a technique developed by Xu et al. (2009) for shape matching. Felzenszwalb and Schwartz (2007) described a new representation for 2D objects that captures shape information at multiple levels of resolution. A segment-based shape matching algorithm (2006) avoids problems associated with global or local methods and performs well on shape retrieval tests. The drawbacks of the existing methods are that they do not return correspondences, need human-designed templates, and are applied mainly to 2D images with limited applicability to 3D images. The proposed algorithm gives a better retrieval rate than the existing methods.

Dissimilarity Measures between Shapes
The images contain different objects with different shapes against different backgrounds. These images are subjected to edge detection to identify the shapes of the objects, with or without internal contours. Each shape is represented as a point set containing the locations of its edge pixels. The point sets of the reference and test images are matched to measure the dissimilarity between the shapes.
The dissimilarity measure has the following steps:
• Preprocessing
• Feature extraction
• Finding correspondence
• Bipartite graph matching
• Aligning transformation
• Classification

Preprocessing
Preprocessing removes noise from the images, which is useful for better matching with the reference image. There are various techniques for noise reduction; among them, the Gaussian filter is the most popular since it has many advantages compared to the others, Equation (1): where the Gaussian spread parameter sigma determines the width of the Gaussian. For image processing, the zero-mean two-dimensional discrete Gaussian function is Equation (2)
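The smoothing step can be illustrated with a minimal sketch. It assumes that Equation (2) has the usual form g[i, j] = exp(-(i^2 + j^2) / (2 sigma^2)); the normalization to unit sum, the border clamping and the function names `gaussian_kernel` and `smooth` are choices made here for illustration and are not taken from the paper.

```python
import math


def gaussian_kernel(size, sigma):
    """Return a size x size zero-mean discrete Gaussian kernel (size odd).

    Assumed form: g[i, j] = exp(-(i^2 + j^2) / (2 * sigma^2)),
    normalized here so that the weights sum to 1.
    """
    if size % 2 == 0:
        raise ValueError("size must be odd")
    half = size // 2
    kernel = [[math.exp(-(i * i + j * j) / (2.0 * sigma * sigma))
               for j in range(-half, half + 1)]
              for i in range(-half, half + 1)]
    total = sum(sum(row) for row in kernel)
    return [[value / total for value in row] for row in kernel]


def smooth(image, kernel):
    """Filter a 2D list of gray values with the kernel.

    Pixels outside the image are replaced by the nearest border pixel.
    """
    height, width = len(image), len(image[0])
    half = len(kernel) // 2
    out = [[0.0] * width for _ in range(height)]
    for r in range(height):
        for c in range(width):
            acc = 0.0
            for i in range(-half, half + 1):
                for j in range(-half, half + 1):
                    rr = min(max(r + i, 0), height - 1)
                    cc = min(max(c + j, 0), width - 1)
                    acc += image[rr][cc] * kernel[i + half][j + half]
            out[r][c] = acc
    return out
```

A larger sigma spreads the weights over more neighbors and therefore removes more noise at the price of blurring edges.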

Feature Extraction
The images should be compared using distinctive features such as the number of pixels, width, length, edges and brightness. Vision processing identifies the features in an image that are relevant for estimating the structure and properties of the objects in it. Edges are one such feature. Edges can be used as the distinctive features of the input and reference images because of their simplicity and effectiveness in matching.
The Canny edge detector is used to detect the edges of the image (Gonzalez, 2006). The Canny edge detector is based on the first derivative of a Gaussian, which closely approximates the operator that optimizes the product of signal-to-noise ratio and localization.

Finding Correspondence
Finding correspondence is the first step in measuring the dissimilarity between the reference image and the test image (Belongie et al., 2002). A statistical method is used to find the correspondence between the reference image and the test image. For each point p_i on the first shape, the best matching point q_j on the second shape is found. This is a correspondence problem similar to that in stereopsis. Consider the set of vectors originating from a point to all other sample points on a shape. These vectors express the configuration of the entire shape relative to the reference point. For a shape sampled at n points, each point is described by n-1 vectors; as n gets larger, the representation of the shape becomes exact. This context captures the distribution of the remaining points relative to the reference point.
For a point p_i on the shape, compute a coarse histogram h_i of the relative coordinates of the remaining n-1 points. Figure 1 shows the shape of an object (for example, an apple). The bins are uniform in log-polar space, making the descriptor more sensitive to the positions of nearby sample points than to those of points farther away. The bin structure with 5 radial and 12 angular bins is shown in Fig. 2.
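The log-polar histogram described above can be sketched as follows. The 5 x 12 bin layout is from the text; the scale normalization by the mean pairwise distance and the radial limits `r_inner = 0.125` and `r_outer = 2.0` are assumptions borrowed from common shape context implementations, and `shape_context` is a hypothetical name.

```python
import math


def shape_context(points, index, n_r=5, n_theta=12, r_inner=0.125, r_outer=2.0):
    """Log-polar histogram of all other points relative to points[index].

    Returns a flat list of n_r * n_theta counts, ordered radial bin first.
    """
    n = len(points)
    # Mean pairwise distance, used to make the descriptor scale invariant
    # (an assumption of this sketch).
    pair_dists = [math.hypot(points[a][0] - points[b][0],
                             points[a][1] - points[b][1])
                  for a in range(n) for b in range(a + 1, n)]
    mean_dist = sum(pair_dists) / len(pair_dists)

    # Outer edges of the radial bins, uniformly spaced in log space.
    log_in, log_out = math.log10(r_inner), math.log10(r_outer)
    edges = [10 ** (log_in + k * (log_out - log_in) / (n_r - 1))
             for k in range(n_r)]

    hist = [0] * (n_r * n_theta)
    px, py = points[index]
    for k, (qx, qy) in enumerate(points):
        if k == index:
            continue
        r = math.hypot(qx - px, qy - py) / mean_dist
        # Radial bin: the first edge that r does not exceed.
        r_bin = next((b for b, edge in enumerate(edges) if r <= edge), None)
        if r_bin is None:
            continue  # farther away than the outer radius, not counted
        theta = math.atan2(qy - py, qx - px) % (2.0 * math.pi)
        t_bin = min(int(theta / (2.0 * math.pi / n_theta)), n_theta - 1)
        hist[r_bin * n_theta + t_bin] += 1
    return hist
```

Because only coordinate differences enter the computation, translating the whole point set leaves every histogram unchanged.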
The cost matrix C_ij is formed for each point p_i on the first shape and q_j on the second shape by using the chi-square test statistic. The chi-square test statistic for the points p_i and q_j is given by Equation (3): where h_i(k) and h_j(k) denote the K-bin normalized histograms at p_i and q_j, respectively. The total cost over the cost matrix is then minimized by bipartite graph matching using the shortest augmenting path method.
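A minimal sketch of the cost computation follows. It assumes that Equation (3) is the chi-square statistic used by Belongie et al. (2002), C_ij = 1/2 * sum_k [h_i(k) - h_j(k)]^2 / [h_i(k) + h_j(k)]; the function names are hypothetical, and bins that are empty in both histograms are skipped to avoid division by zero.

```python
def chi_square_cost(h_i, h_j):
    """Chi-square distance between two histograms after normalization."""
    def normalize(hist):
        total = float(sum(hist))
        return [v / total for v in hist] if total > 0 else [0.0] * len(hist)

    a, b = normalize(h_i), normalize(h_j)
    cost = 0.0
    for x, y in zip(a, b):
        if x + y > 0:  # skip bins that are empty in both histograms
            cost += (x - y) ** 2 / (x + y)
    return 0.5 * cost


def cost_matrix(hists_p, hists_q):
    """C[i][j] = chi-square cost between point i of shape P and point j of Q."""
    return [[chi_square_cost(hp, hq) for hq in hists_q] for hp in hists_p]
```

With normalized histograms the cost lies between 0 (identical histograms) and 1 (no overlapping bins).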

Bipartite Graph Matching
Given the set of costs C_ij between all pairs of points p_i on the first shape and q_j on the second shape, the total cost of the matching should be minimized, Equation (4): To minimize the total cost, the shortest augmenting path method is chosen. With this method, the minimum cost can also be found for shapes that do not have an equal number of points.

Shortest Augmenting Path Algorithm
The linear assignment problem is solved using the shortest augmenting path algorithm. It contains new initialization routines and a special implementation of Dijkstra's shortest path method (Jonker and Volgenant, 1987). The algorithm proceeds as follows:
Step 1: Initialization. The initialization is primarily aimed at reaching a high initial reduction of the cost matrix.
Step 2: Termination. Stop if all rows are assigned.
Step 3: Augmentation. Construct the auxiliary network and determine, from an unassigned row i to an unassigned column j, an alternating path of minimal total reduced cost, and use it to augment the solution.
Step 4: Adjust the dual solution to restore complementary slackness. Go to Step 2.
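The augmentation and dual-update steps can be sketched as below. This is a plain shortest augmenting path solver with dual variables, not the Jonker and Volgenant (1987) implementation: their specialized initialization routines are omitted, and the name `solve_assignment` is hypothetical. It accepts rectangular cost matrices with at most as many rows as columns.

```python
def solve_assignment(cost):
    """Minimum-cost assignment of every row to a distinct column.

    Returns (match, total) where match[i] is the column assigned to row i.
    Requires len(cost) <= len(cost[0]).
    """
    n, m = len(cost), len(cost[0])
    if n > m:
        raise ValueError("need at least as many columns as rows")
    inf = float("inf")
    u = [0.0] * (n + 1)   # dual variables for rows (1-indexed)
    v = [0.0] * (m + 1)   # dual variables for columns (1-indexed)
    p = [0] * (m + 1)     # p[j] = row currently assigned to column j, 0 if none
    way = [0] * (m + 1)   # predecessor column on the alternating path

    for i in range(1, n + 1):
        # Grow a shortest alternating path starting from unassigned row i.
        p[0] = i
        j0 = 0
        minv = [inf] * (m + 1)
        used = [False] * (m + 1)
        while True:
            used[j0] = True
            i0 = p[j0]
            delta = inf
            j1 = 0
            for j in range(1, m + 1):
                if not used[j]:
                    reduced = cost[i0 - 1][j - 1] - u[i0] - v[j]
                    if reduced < minv[j]:
                        minv[j] = reduced
                        way[j] = j0
                    if minv[j] < delta:
                        delta = minv[j]
                        j1 = j
            # Adjust the duals so that complementary slackness is kept.
            for j in range(m + 1):
                if used[j]:
                    u[p[j]] += delta
                    v[j] -= delta
                else:
                    minv[j] -= delta
            j0 = j1
            if p[j0] == 0:   # reached an unassigned column
                break
        # Augment: flip the assignments along the path back to row i.
        while True:
            j1 = way[j0]
            p[j0] = p[j1]
            j0 = j1
            if j0 == 0:
                break

    match = [-1] * n
    for j in range(1, m + 1):
        if p[j] != 0:
            match[p[j] - 1] = j - 1
    total = sum(cost[i][match[i]] for i in range(n))
    return match, total
```

For example, `solve_assignment([[4, 1, 3], [2, 0, 5], [3, 2, 2]])` pairs row 0 with column 1, row 1 with column 0 and row 2 with column 2, for a total cost of 5.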

Aligning Transformation
Given a finite set of correspondences between points on two shapes, one can proceed to estimate a plane transformation T: R^2 → R^2 that may be used to map arbitrary points from one shape to the other. Here the specified correspondences consist of a small number of landmark points, and T extends the correspondences to arbitrary points. The affine transformation from a point x ∈ R^2 to a point y ∈ R^2 is given by Equation (5), where P and Q contain the homogeneous coordinates of p and q, respectively, i.e., Equation (8). Here Q^+ denotes the pseudoinverse of Q.
The thin plate spline (Bookstein, 1989) is a natural interpolating function for two dimensions and plays a similar role in m = 2 dimensions to that of the natural cubic spline for interpolation in one dimension. The natural cubic spline in one dimension is the unique interpolant g(x) that minimizes the roughness penalty, Equation (9), subject to interpolation at the knots. For shape analysis, consider the (2×1) landmarks t_j, j = 1, …, k, on the first shape mapped exactly onto the landmarks y_i, i = 1, …, k, on the second shape, i.e., there are 2k interpolation constraints, Equation (10), also written as Equation (11). It can be proved that the transformation in Equation (14) minimizes the total bending energy of all possible interpolating functions mapping the landmarks of the first shape onto those of the second, where the total bending energy is given by Equation (12), and has the form Equation (13), where the kernel function U(r) is defined by U(r) = r^2 log r^2 and U(0) = 0 as usual. Here r is the Mahalanobis distance between the point sets, Equation (14): where (x_1, y_1) and (x_2, y_2) are the coordinate points of the warped image. In Equation (13), in order for Ø(x, y) to have square integrable second derivatives, we require that:

$$\sum_{i=1}^{n} w_i = 0 \quad \text{and} \quad \sum_{i=1}^{n} w_i x_i = \sum_{i=1}^{n} w_i y_i = 0$$
Together with the interpolation conditions Ø(x_i, y_i) = v_i, this yields a linear system for the TPS coefficients, Equation (15): where K_ij = U(||(x_i, y_i) - (x_j, y_j)||), the ith row of P is (1, x_i, y_i), w and v are column vectors formed from w_i and v_i, respectively, and a is the column vector with elements a_1, a_x, a_y.
The proposed algorithm is iterative: the image is treated as a point set, and this point set is matched using the synthetic matching. Two sample images and the corresponding point sets are shown in Fig. 3. The correspondence between the two point sets is found using the shape context descriptor and the shortest augmenting path algorithm. Figure 4 shows the corresponding points from the first shape to the second shape at the first and fifth iterations. With this correspondence, the aligning transformation is then applied to the second shape with reference to the first shape; the transformed second shape and its correspondence points with the first shape are shown in Fig. 5. The iteration continues until the best match occurs or the maximum number of iterations is reached. In this algorithm, a maximum of five iterations is used for the best matching.
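The TPS kernel and the coefficient matrix of the linear system in this section can be sketched as follows. The sketch assumes the standard block layout [[K, P], [P^T, 0]] for the left-hand side of Equation (15), which is not reproduced in the text; `tps_kernel` and `tps_system` are hypothetical names, and the `distance` argument defaults to the Euclidean norm but can be replaced by a Mahalanobis distance function.

```python
import math


def tps_kernel(r):
    """Kernel U(r) = r^2 log r^2, with U(0) = 0."""
    return 0.0 if r == 0 else r * r * math.log(r * r)


def tps_system(points, distance=None):
    """Build the (n + 3) x (n + 3) matrix [[K, P], [P^T, 0]].

    K[i][j] = U(distance(points[i], points[j])) and row i of P is
    (1, x_i, y_i). The block layout is an assumption of this sketch.
    """
    if distance is None:
        def distance(a, b):
            return math.hypot(a[0] - b[0], a[1] - b[1])
    n = len(points)
    size = n + 3
    system = [[0.0] * size for _ in range(size)]
    for i in range(n):
        for j in range(n):
            system[i][j] = tps_kernel(distance(points[i], points[j]))
        row_p = (1.0, float(points[i][0]), float(points[i][1]))
        for c in range(3):
            system[i][n + c] = row_p[c]   # block P
            system[n + c][i] = row_p[c]   # block P^T
    return system
```

Solving this system once with the target x coordinates and once with the target y coordinates as the right-hand side would give the coefficients w and a for each coordinate; any dense linear solver can be used for that step.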

Mahalanobis Distance
The Mahalanobis distance is based on the correlations between variables, by which different patterns can be identified and analyzed. It measures the similarity of an unknown sample to a known sample set. Unlike the Euclidean distance, it takes the correlations of the data set into account and is scale invariant. The Mahalanobis distance is defined as Equation (16): Here: X = multivariate vector, defined as {x_1, x_2, x_3, …, x_n}; µ = mean vector, defined as {µ_1, µ_2, µ_3, …, µ_n}. The dissimilarity of two random vectors X and Y with a given covariance matrix can be defined by the Mahalanobis distance as Equation (17): If the covariance matrix is the identity matrix, the Mahalanobis distance reduces to the Euclidean distance. If the covariance matrix is diagonal, the resulting distance measure is called the normalized Euclidean distance, $d(X, Y) = \sqrt{\sum_{i=1}^{n} (x_i - y_i)^2 / s_i^2}$, where s_i is the standard deviation of x_i and y_i over the sample set.
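For two-dimensional points the distance can be computed without a linear algebra library. The sketch below assumes the standard definition d(X, Y) = sqrt((X - Y)^T S^-1 (X - Y)) for Equation (17) and restricts itself to the 2 x 2 case; the sample covariance estimator and the function names are illustrative choices.

```python
import math


def covariance_2d(points):
    """Sample covariance matrix ((sxx, sxy), (sxy, syy)) of 2D points."""
    n = len(points)
    mean_x = sum(p[0] for p in points) / float(n)
    mean_y = sum(p[1] for p in points) / float(n)
    sxx = sum((p[0] - mean_x) ** 2 for p in points) / (n - 1)
    syy = sum((p[1] - mean_y) ** 2 for p in points) / (n - 1)
    sxy = sum((p[0] - mean_x) * (p[1] - mean_y) for p in points) / (n - 1)
    return ((sxx, sxy), (sxy, syy))


def mahalanobis_2d(x, y, cov):
    """Mahalanobis distance between 2D points x and y for covariance cov."""
    (a, b), (c, d) = cov
    det = a * d - b * c
    if det == 0:
        raise ValueError("covariance matrix is singular")
    dx, dy = x[0] - y[0], x[1] - y[1]
    # The inverse of [[a, b], [c, d]] is (1 / det) * [[d, -b], [-c, a]].
    quad = (dx * (d * dx - b * dy) + dy * (-c * dx + a * dy)) / det
    return math.sqrt(quad)
```

With the identity covariance the function returns the Euclidean distance, and with a diagonal covariance it returns the normalized Euclidean distance, matching the two special cases mentioned in the text.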

The proposed algorithm belongs to the category of prototype-based recognition. The query image is compared with all the images in the database, generating a matching error for each. Using this matching error, the top 40 matches are identified.

MPEG-7 Database
The MPEG-7 shape silhouette database, core experiment CE-Shape-1 part B, measures the performance of similarity-based retrieval. The dataset has 70 groups of objects with 20 images per group, for a total of 1400 images. The bullseye test was performed on the MPEG-7 data set to compare retrieval rates. In a bullseye test, each image in the database is used as a query and the number of correct images among the top 40 matches is counted. The task is repeated for every shape, and the number of correct images (at most 20 per shape) is accumulated to give the retrieval rate. Each image is represented by 100 sample points taken from the Canny edge detection. When computing the costs C_ij for the bipartite matching, we included a term representing the dissimilarity of local tangent angles. Specifically, we defined the matching cost as Equation (18): where C_ij^sc is the shape context cost, the term given by Equation (19) measures the tangent angle dissimilarity and β = 0.1. For recognition, we used a K-NN classifier with a distance function, Equation (20). The weights in (19) have been optimized on a 3,000 x 3,000 subset of the training data. In order to obtain the best match, each image is presented in normal, horizontally flipped and vertically flipped form, and the distance is calculated using Equation (24); examples of images in these three categories are shown in Fig. 6. Equation (21): where R_a, R_b and R_c denote the normal, horizontally flipped and vertically flipped images, respectively. The proposed algorithm gives the highest retrieval rate among the compared algorithms. The comparison of retrieval rates for various algorithms on the MPEG-7 database is shown in Table 1. Since the proposed method yields better results than the other methods, it may be extended to other similarity measures.
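The bullseye score described in this section can be computed from a matrix of pairwise dissimilarities as in the sketch below. The counting rule (top 40 matches, at most 20 correct per query) is from the text; the function name `bullseye_score`, the inclusion of the query itself among the candidates and the input layout are assumptions of this sketch.

```python
def bullseye_score(dist, labels, top=40, per_class=20):
    """Bullseye retrieval rate.

    dist[i][j] is the dissimilarity between shapes i and j and labels[i]
    is the class of shape i. Every shape is used once as the query, the
    `top` most similar shapes are retrieved (the query itself included),
    and the correct hits are divided by the best possible count.
    """
    n = len(labels)
    correct = 0
    for query in range(n):
        ranked = sorted(range(n), key=lambda j: dist[query][j])[:top]
        correct += sum(1 for j in ranked if labels[j] == labels[query])
    return correct / float(n * per_class)
```

For the MPEG-7 set this would be called with a 1400 x 1400 dissimilarity matrix and the default arguments; the small example in the test uses `top=2` and `per_class=2` on four shapes.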

CONCLUSION
In this study, shape matching using the shortest augmenting path algorithm with the Mahalanobis distance is proposed; the algorithm provides an efficient and simple way of matching shapes. It retrieves shapes from databases more efficiently than the existing methods. The method produces good results for 2-D as well as 3-D objects. Shape matching can also be achieved for occluded images using this algorithm. Further, the algorithm can be extended to motion pictures and video analysis and applied to various applications such as pedestrian identification, security systems and humanoid robotics.