Shape Retrieval through Angular Distance with Shortest Augmenting Path Algorithm

Problem statement: The shape of an object is very important in object recognition. Shape matching is a challenging problem, especially when parts of the object articulate or deform. These variations may be insignificant for human recognition but often cause a matching algorithm to give results that are inconsistent with human perception. Approach: We propose a customized approach to measure similarity between shapes and exploit it for shape retrieval. Similarity is measured by finding the correspondence between points on the two shapes and then applying an aligning transformation. The correspondence is solved using shape context with the shortest augmenting path algorithm. Based on the correspondence, the transformation that best aligns the two shapes is applied. Thin Plate Spline (TPS) with angular distance is used to provide a better class of transformation maps. The matching error is calculated from the errors between corresponding points on the two shapes and the energy required by the aligning transformation. Object recognition is achieved with the k-nearest neighbor algorithm. Result: The algorithm is an efficient method for shape matching that performs well on the bull's eye test and produces a retrieval rate of 91.23% on the MPEG-7 database. Conclusion: The proposed method is simple, invariant to noise and gives a better error rate than existing methods. It can also be extended to handwritten characters, industrial objects, face recognition and the COIL database.


INTRODUCTION
Object recognition is a difficult task in the real world. Humans still outperform machines in most vision tasks, in both speed and quality. Our goal is to design machines that can recognize objects at levels approaching or exceeding human performance. Using the shape of an object for object recognition and image understanding is an emerging topic in computer vision and multimedia processing. Analyzing the silhouette of an object is the most important step in shape matching.
In On Growth and Form, Thompson observed that related, but not identical, shapes can often be deformed into alignment using simple coordinate transformations. Belongie et al. (2002) presented a novel approach to measuring similarity between shapes and applied it to object recognition for handwritten digits, the COIL database, and silhouette and trademark retrieval.
A shape can be defined as an equivalence class under a group of transformations. Shape matching identifies two shapes as the same when the similarity between them is high. The statistician's definition of shape addresses the problem of shape distance but assumes that the correspondences are known. Intensity-based techniques can match shapes without explicitly finding correspondences. According to the extensive survey in Belongie et al. (2002), shape matching approaches can broadly be classified into two categories: brightness based and feature based methods.
Brightness based method:
Brightness or appearance based methods offer a complementary view to feature based methods (Abusham et al., 2008). This approach makes direct use of the gray values within the visible portion of the object instead of its shape. The brightness information is used to find correspondences and to align the gray scale values so that the brightness of two different shapes can be compared. Yuille (1991) presented the Fitting Hand Craft Model (FHC), a flexible model that builds invariance to certain kinds of transformations, but it suffers from the need for human designed templates and from sensitivity to initialization when searching via gradient descent.
Elastic graph matching, developed by Lades et al. (1993) within the Dynamic Link Architecture, involves both geometric and photometric features in the form of local descriptors based on Gaussian derivatives. The performance is evaluated using statistical analysis. The alternative is the feature based method.

Feature based method:
Feature based methods extract information from the image rather than using its appearance. The shape information takes the form of boundaries, length and width of the image, etc. Shape similarity can easily be identified from the boundaries of silhouette images, since silhouettes do not have holes or internal markings. Boundaries are a convenient representation and can be clearly described by closed curves. There are several approaches to shape recognition using feature based methods. Fischler and Elschlager (1973) proposed a mass spring model that minimizes the energy using dynamic programming. The Thin Plate Spline (TPS), introduced by Bookstein (1989), is an effective tool for modeling coordinate transformations in several computer vision applications. Cootes et al. (1995) described a method for building models by learning patterns of variability from a training set of correctly annotated images; the models can be used for image search in an iterative refinement algorithm, as employed by Active Shape Models. Shape features and tree classifiers were introduced by Amit et al. (1997), who described a very large family of binary features for two-dimensional shapes. The algorithm adapted to a shape family is fully automatic once the training samples are provided. The standard method for constructing trees is not practical because the feature set is virtually infinite. Simard et al. (1994) proposed memory based character recognition using a transformation invariant metric. The method was tested on large handwritten character databases and MNIST. LeCun et al. (1998) proposed object recognition with gradient-based learning. Their study shows that simple objects such as handwritten characters can be recognized from a set of features, and that performance improves when multiple objects are recognized without requiring explicit segmentation of the objects.
Gdalyahu and Weinshall (1999) proposed flexible syntactic matching of curves to classify silhouettes using dissimilarity measures.
Latecki and Lakamper (2000) proposed a shape similarity measure based on the correspondence of visual parts, evaluated on a silhouette image database. Digital curve evolution is used to simplify the shapes and to reduce segmentation errors and digitization noise. The shape matching procedure gives an intuitive shape correspondence and is stable with respect to noise distortions. A novel approach to finding a correspondence between two curves was presented by Sebastian et al. (2003). The correspondence is based on the notion of an alignment curve and is found by an efficient dynamic programming method. Milios and Petrakis (2000) proposed a shape matching algorithm for deformed shapes based on dynamic programming. This algorithm is superior to traditional approaches to shape matching and retrieval such as Fourier descriptors and geometric and sequential moments. Robust shape similarity retrieval was developed by Attalla and Siy (2005) and is used to match and recognize 2D objects. Rizon et al. (2006) developed a computational model to identify the face of an unknown person by applying eigenfaces and neural networks.
Contour flexibility is a technique developed by Xu et al. (2009) for shape matching. Schwartz and Felzen (2007) and Mokhtarian (2003) describe representations for 2D objects that capture shape information at multiple levels of resolution. A segment-based shape matching algorithm avoids problems associated with purely global or local methods and performs well on shape retrieval tests. The drawbacks of the existing methods are that they do not return correspondences, that they suffer from the need for human designed templates, and that they are applied to 2D images and only a limited range of 3D images. Chai et al. (2009) presented a method to reduce the effect of beard and moustache on facial feature detection and introduced facial feature based template matching as the classification method. The proposed algorithm gives a better retrieval rate than the existing methods.

Shape matching with angular distance:
An object can be treated as a point set. The shape of an object is essentially captured by a finite subset of its points. A shape is represented by a discrete set of points sampled from the internal or external contours of the object. Contours can be obtained as the locations of pixels found by an edge detector. A test shape is matched against similar shapes among the reference shapes, and matching finds, for each point on the reference image, the best matching point on the test image. The proposed matching algorithm consists of preprocessing, feature extraction, finding correspondence, applying the transformation, computing similarity measures and classifying images.

Preprocessing:
Preprocessing modifies the image so that it can be matched as well as possible to the reference image. Noise cancellation is one of the preprocessing techniques. A number of filters can be used for noise cancellation, such as the median filter, the mean filter and the Gaussian filter. The Gaussian filter has been used here. Gaussian filters are a class of linear smoothing filters with weights chosen according to the shape of the Gaussian function. The Gaussian smoothing filter is very good at removing noise drawn from a normal distribution. The zero mean one-dimensional Gaussian function is given in Eq. 1:
$$g(x) = e^{-\frac{x^2}{2\sigma^2}} \tag{1}$$
where the Gaussian spread parameter σ determines the width of the Gaussian. For image processing, the zero mean two-dimensional discrete Gaussian function is given in Eq. 2:
$$g[i,j] = e^{-\frac{i^2 + j^2}{2\sigma^2}} \tag{2}$$

Feature extraction:
Images are compared on distinctive features such as the number of pixels, width, length, edges and brightness. Vision processing identifies the features in an image that are relevant for estimating the structure and properties of the objects in it. Edges are one such feature. Edges are used here as the distinctive features of the input and reference images because of their simplicity and effectiveness for matching.
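The Gaussian smoothing used in preprocessing can be illustrated with a short sketch. It samples the zero mean Gaussian exp(-x^2 / (2 sigma^2)), normalizes the weights, and applies them separably along rows and then columns. The truncation radius of 3 sigma, the border replication and all function names are assumptions made for this illustration; the paper does not state them.

```python
import math

def gaussian_kernel_1d(sigma, radius=None):
    """Sample exp(-x^2 / (2 sigma^2)) on integer x and normalise the weights to sum to 1."""
    if radius is None:
        radius = max(1, int(math.ceil(3 * sigma)))  # assumption: truncate at 3 sigma
    weights = [math.exp(-(x * x) / (2.0 * sigma * sigma))
               for x in range(-radius, radius + 1)]
    total = sum(weights)
    return [w / total for w in weights]

def smooth_rows(image, kernel):
    """Filter every row with the kernel, replicating the border pixels (assumption)."""
    radius = len(kernel) // 2
    out = []
    for row in image:
        n = len(row)
        new_row = []
        for j in range(n):
            acc = 0.0
            for k, w in enumerate(kernel):
                jj = min(max(j + k - radius, 0), n - 1)  # clamp index at the borders
                acc += w * row[jj]
            new_row.append(acc)
        out.append(new_row)
    return out

def transpose(image):
    return [list(col) for col in zip(*image)]

def gaussian_smooth(image, sigma):
    """Separable 2-D Gaussian smoothing: filter the rows, then the columns."""
    kernel = gaussian_kernel_1d(sigma)
    rows_done = smooth_rows(image, kernel)
    return transpose(smooth_rows(transpose(rows_done), kernel))
```

Because the two-dimensional Gaussian factors into a product of two one-dimensional Gaussians, two 1-D passes give the same result as a full 2-D convolution at lower cost.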

Canny edge detection:
The Canny edge detector is used to detect the edges of the image. The Canny edge detector is the first derivative of a Gaussian and closely approximates the operator that optimizes the product of signal to noise ratio and localization. The image is denoted by I[i,j]. Convolving the image with the Gaussian smoothing filter using separable filtering gives an array of smoothed data (Eq. 3):
$$S[i,j] = G[i,j;\sigma] * I[i,j] \tag{3}$$
where σ is the spread of the Gaussian and controls the degree of smoothing. The gradient of the smoothed array S[i,j] can be computed using 2×2 first-difference approximations to produce two arrays, P[i,j] and Q[i,j], for the x and y partial derivatives (Eq. 4 and 5):
$$P[i,j] \approx \left( S[i,j+1] - S[i,j] + S[i+1,j+1] - S[i+1,j] \right) / 2 \tag{4}$$
$$Q[i,j] \approx \left( S[i,j] - S[i+1,j] + S[i,j+1] - S[i+1,j+1] \right) / 2 \tag{5}$$
The steps are: acquire the image, smooth the image with a Gaussian filter, compute the gradient magnitude and orientation using finite difference approximations for the partial derivatives, apply non-maxima suppression to the gradient magnitude, and use the double threshold algorithm to detect and link edges.
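The gradient step of the Canny detector can be sketched as follows. The sketch assumes the common 2×2 first-difference form, in which P averages the two horizontal differences and Q averages the two vertical differences within each 2×2 block of the smoothed array S. The function name and the list-of-lists image layout are illustrative choices, and non-maxima suppression and double thresholding are not shown.

```python
import math

def gradient_2x2(S):
    """Return (P, Q, M, theta) computed from 2x2 first differences of S.

    P and Q approximate the x and y partial derivatives, M is the gradient
    magnitude and theta the orientation. Each output is one row and one
    column smaller than S.
    """
    rows, cols = len(S), len(S[0])
    P, Q, M, theta = [], [], [], []
    for i in range(rows - 1):
        p_row, q_row, m_row, t_row = [], [], [], []
        for j in range(cols - 1):
            # Average of the two horizontal differences in the 2x2 block.
            p = (S[i][j + 1] - S[i][j] + S[i + 1][j + 1] - S[i + 1][j]) / 2.0
            # Average of the two vertical differences in the 2x2 block.
            q = (S[i][j] - S[i + 1][j] + S[i][j + 1] - S[i + 1][j + 1]) / 2.0
            p_row.append(p)
            q_row.append(q)
            m_row.append(math.hypot(p, q))
            t_row.append(math.atan2(q, p))
        P.append(p_row)
        Q.append(q_row)
        M.append(m_row)
        theta.append(t_row)
    return P, Q, M, theta
```

Averaging over the 2×2 block means both derivatives are estimated at the same point, the centre of the block, which keeps magnitude and orientation consistent.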
Finding correspondence:
Finding correspondence provides the measure of dissimilarity between the reference image and the test image (Belongie et al., 2002). A statistical method is used to find the correspondence between the two images. For each point p_i on the first shape, find the best matching point q_j on the second shape. This is a correspondence problem similar to that in stereopsis. Consider the set of vectors originating from a point to all other sample points on a shape. These vectors express the configuration of the entire shape relative to the reference point. Each point is thus described by n-1 vectors; as n grows, the representation of the shape becomes exact. From this context, the distribution of the remaining points relative to the reference point is known.
For a point p_i on the shape, compute a coarse histogram h_i of the relative coordinates of the remaining n-1 points. Fig. 1 shows the shape of an example object (an apple). The bins are uniform in log-polar space, making the descriptor more sensitive to the positions of nearby sample points than to those of points farther away. The structure of the bins, with 5 radial and 12 angular divisions, is shown in Fig. 2.
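The log-polar histogram described above can be sketched in a few lines. The 5 radial and 12 angular bins come from the text; the radial range (1/8 to 2), the normalization of distances by the mean pairwise distance, and the function names are assumptions borrowed from common shape context practice, not values stated in this paper.

```python
import math

def mean_pairwise_distance(points):
    """Average Euclidean distance over all unordered point pairs."""
    n = len(points)
    total = 0.0
    for i in range(n):
        for j in range(i + 1, n):
            total += math.hypot(points[i][0] - points[j][0],
                                points[i][1] - points[j][1])
    return total / (n * (n - 1) / 2)

def shape_context(points, index, n_radial=5, n_angular=12,
                  r_inner=0.125, r_outer=2.0):
    """Normalised log-polar histogram of the other points relative to points[index]."""
    scale = mean_pairwise_distance(points)  # assumption: gives scale invariance
    ratio = r_outer / r_inner
    # Upper edges of the radial bins, uniformly spaced in log space.
    edges = [r_inner * ratio ** (k / (n_radial - 1)) for k in range(n_radial)]
    hist = [0.0] * (n_radial * n_angular)
    px, py = points[index]
    count = 0
    for k, (x, y) in enumerate(points):
        if k == index:
            continue
        dx, dy = x - px, y - py
        r = math.hypot(dx, dy) / scale
        if r > edges[-1]:
            continue  # points beyond the outer radius are ignored
        r_bin = next(b for b, edge in enumerate(edges) if r <= edge)
        angle = math.atan2(dy, dx) % (2.0 * math.pi)
        a_bin = int(angle / (2.0 * math.pi) * n_angular) % n_angular
        hist[r_bin * n_angular + a_bin] += 1.0
        count += 1
    return [h / count for h in hist] if count else hist
```

Because only relative coordinates are used, the descriptor is unchanged when the whole point set is translated, and the distance normalization makes it insensitive to uniform scaling.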
The cost matrix C_ij is formed for a point p_i on the first shape and a point q_j on the second shape using the chi-square test statistic (Eq. 6):
$$C_{ij} = \frac{1}{2} \sum_{k=1}^{K} \frac{\left[ h_i(k) - h_j(k) \right]^2}{h_i(k) + h_j(k)} \tag{6}$$
where h_i(k) and h_j(k) denote the K-bin normalized histograms at p_i and q_j, respectively. The cost matrix is then reduced by bipartite graph matching using the modified shortest augmenting path method.
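As a small worked example of the chi-square cost between two normalized histograms, the sketch below evaluates half the sum of squared bin differences divided by bin sums. Skipping bins where both histograms are zero is an implementation choice made here to avoid division by zero; the function names are illustrative.

```python
def chi_square_cost(h_i, h_j):
    """Chi-square distance between two normalised histograms of equal length."""
    total = 0.0
    for a, b in zip(h_i, h_j):
        if a + b > 0:  # empty bins contribute nothing
            total += (a - b) ** 2 / (a + b)
    return 0.5 * total

def cost_matrix(hists_p, hists_q):
    """C[i][j] is the chi-square cost between histogram i of shape P and j of shape Q."""
    return [[chi_square_cost(hp, hq) for hq in hists_q] for hp in hists_p]
```

For normalized histograms the cost lies between 0 (identical histograms) and 1 (histograms with no overlapping bins).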
Bipartite graph matching with shortest augmenting path algorithm:
Given the set of costs C_ij between all pairs of points p_i on the first shape and q_j on the second shape, the total cost of the matching is minimized (Eq. 7):
$$H(\pi) = \sum_{i} C\left(p_i, q_{\pi(i)}\right) \tag{7}$$
where π is a permutation, so that the matching is one-to-one.
This cost is minimized using the shortest augmenting path algorithm, which can also be applied when the two shapes do not have an equal number of points.
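To make the minimization concrete, the following is a compact assignment solver that grows the matching one row at a time along a shortest augmenting path while maintaining dual potentials. It is a textbook O(n^3) variant written for illustration only: it does not include the initialization routines of Jonker and Volgenant (1987), and the function name and the requirement that rows not exceed columns are assumptions of this sketch.

```python
def solve_assignment(cost):
    """Assign every row to a distinct column with minimal total cost.

    Requires len(cost) <= len(cost[0]). Returns (match, total), where
    match[i] is the column assigned to row i.
    """
    n, m = len(cost), len(cost[0])
    assert n <= m, "transpose the matrix if there are more rows than columns"
    INF = float("inf")
    u = [0.0] * (n + 1)   # row potentials (1-based, index 0 unused)
    v = [0.0] * (m + 1)   # column potentials
    p = [0] * (m + 1)     # p[j] = row matched to column j, 0 if free
    way = [0] * (m + 1)   # predecessor column on the shortest path
    for i in range(1, n + 1):
        p[0] = i
        j0 = 0
        minv = [INF] * (m + 1)
        used = [False] * (m + 1)
        while True:
            used[j0] = True
            i0 = p[j0]
            delta = INF
            j1 = 0
            for j in range(1, m + 1):
                if not used[j]:
                    cur = cost[i0 - 1][j - 1] - u[i0] - v[j]  # reduced cost
                    if cur < minv[j]:
                        minv[j] = cur
                        way[j] = j0
                    if minv[j] < delta:
                        delta = minv[j]
                        j1 = j
            # Adjust the duals so that reduced costs stay non-negative.
            for j in range(m + 1):
                if used[j]:
                    u[p[j]] += delta
                    v[j] -= delta
                else:
                    minv[j] -= delta
            j0 = j1
            if p[j0] == 0:   # reached a free column: path is complete
                break
        # Augment along the path back to the new row.
        while True:
            j1 = way[j0]
            p[j0] = p[j1]
            j0 = j1
            if j0 == 0:
                break
    match = [-1] * n
    for j in range(1, m + 1):
        if p[j] != 0:
            match[p[j] - 1] = j - 1
    total = sum(cost[i][match[i]] for i in range(n))
    return match, total
```

With a rectangular cost matrix the solver leaves the surplus columns unmatched, which is one simple way to handle shapes sampled with different numbers of points.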
Solving the correspondence problem is an instance of the square assignment problem, also known as the Linear Assignment Problem (LAP). The first well known method for the LAP is the Hungarian method; here the LAP is solved using the shortest augmenting path algorithm (Jonker and Volgenant, 1987), which is comparatively faster and gives better results for the correspondence problem. This algorithm contains new initialization routines and a special implementation of Dijkstra's shortest path method. The steps of the augmenting algorithm are:
Step 1: Initialization. The initialization is primarily aimed at reaching a high initial reduction of the cost matrix.
Step 2: Termination. Stop if all rows are assigned.
Step 3: Augmentation. Construct the auxiliary network and determine, from an unassigned row i to an unassigned column j, an alternating path of minimal total reduced cost, and use it to augment the solution.
Step 4: Adjust the dual solution to restore complementary slackness. Go to step 2.

Modeling transformation:
Given a finite set of correspondences between points on two shapes, one can proceed to estimate a plane transformation T: R^2 → R^2 that may be used to map arbitrary points from one shape to the other. The specified correspondences consist of a small number of landmark points, and T extends them to arbitrary points. The affine transformation from a point x ∈ R^2 to a point y ∈ R^2 is given in Eq. 8:
$$y = Ax + c \tag{8}$$
where A is a non-singular m×m matrix and c is an m×1 vector (here m = 2). A linear transformation is an affine transformation with c = 0. The matrix A and the translational offset vector c parameterize the set of all allowed transformations. The least squares solution T = (A, c) is obtained using Eq. 9 and 10:
$$c = \frac{1}{n} \sum_{i=1}^{n} \left( p_i - q_{\pi(i)} \right) \tag{9}$$
$$A = \left( Q^{+} P \right)^{T} \tag{10}$$
where P and Q contain the homogeneous coordinates of the points p and q, respectively, i.e. (Eq. 11):
$$P = \begin{pmatrix} 1 & p_{11} & p_{12} \\ \vdots & \vdots & \vdots \\ 1 & p_{n1} & p_{n2} \end{pmatrix} \tag{11}$$
Here Q^+ denotes the pseudo inverse of Q.
Thin plate spline (Bookstein, 1989) is a natural interpolating function for two dimensions and plays a role in m=2 dimensions similar to that of the natural cubic spline for interpolation in one dimension. The natural cubic spline in one dimension is the unique interpolant g(x) that minimizes the roughness penalty of Eq. 12:
$$\int \left( \frac{\partial^2 g}{\partial x^2} \right)^2 dx \tag{12}$$
subject to interpolation at the knots. For shape analysis, consider the (2×1) landmarks t_j, j=1,…,k, on the first shape mapped exactly into y_j, j=1,…,k, on the second shape, i.e., there are 2k interpolation constraints (Eq. 13):
$$(y_j)_r = \phi_r(t_j), \quad r = 1, 2; \; j = 1, \ldots, k \tag{13}$$
where φ(t_j) = (φ_1(t_j), φ_2(t_j))^T. This is also written as:
$$y_j = \phi(t_j), \quad j = 1, \ldots, k \tag{14}$$
It can be proved that the thin plate spline satisfying Eq. 14 minimizes the total bending energy among all possible interpolating functions mapping from T to Y, where the total bending energy of each component f is given in Eq. 15:
$$I_f = \iint_{R^2} \left[ \left( \frac{\partial^2 f}{\partial x^2} \right)^2 + 2 \left( \frac{\partial^2 f}{\partial x \partial y} \right)^2 + \left( \frac{\partial^2 f}{\partial y^2} \right)^2 \right] dx \, dy \tag{15}$$
and f has the form in Eq. 16:
$$f(x, y) = a_1 + a_x x + a_y y + \sum_{i=1}^{n} w_i \, U\left( \left\| (x_i, y_i) - (x, y) \right\| \right) \tag{16}$$
where the kernel function U(r) is defined by U(r) = r^2 log r^2, with U(0) = 0 as usual. Here r is the angular distance between the points, given in Eq. 17:
$$r = \frac{x_1 x_2 + y_1 y_2}{\sqrt{x_1^2 + y_1^2} \, \sqrt{x_2^2 + y_2^2}} \tag{17}$$
where (x_1, y_1) and (x_2, y_2) are the coordinate points of the warped image. In order for f(x, y) to have square integrable second derivatives, we require that:
$$\sum_{i=1}^{n} w_i = 0 \quad \text{and} \quad \sum_{i=1}^{n} w_i x_i = \sum_{i=1}^{n} w_i y_i = 0$$
Together with the interpolation conditions, f(x_i, y_i) = v_i, this yields a linear system for the TPS coefficients (Eq. 18):
$$\begin{pmatrix} K & P \\ P^{T} & 0 \end{pmatrix} \begin{pmatrix} w \\ a \end{pmatrix} = \begin{pmatrix} v \\ 0 \end{pmatrix} \tag{18}$$
where K_ij = U(||(x_i, y_i) - (x_j, y_j)||), the ith row of P is (1, x_i, y_i), w and v are column vectors formed from w_i and v_i, respectively, and a is the column vector with elements a_1, a_x, a_y.
The proposed algorithm is iterative: the image is treated as a point set, and this point set is matched using synthetic matching. Two sample images and the corresponding point sets are shown in Fig. 3. The correspondence between the two point sets is found using the shape descriptor (shape context) and the shortest augmenting path algorithm. Figure 4 shows the corresponding points from the first shape to the second shape at the first and fifth iterations. With this correspondence, the aligning transformation is applied to the second shape with reference to the first shape; the transformed second shape and its correspondences with the first shape are shown in Fig. 5. The iteration continues until the best match is found or the maximum number of iterations is reached. In this algorithm, a maximum of five iterations is used.
The proposed algorithm belongs to the category of prototype-based recognition. The query image is compared with all the images in the database, producing a matching error for each. Using this matching error, the top 40 matches are identified.
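The TPS coefficients w and a described above come from one linear solve. The sketch below builds the block system with K on the top left, the rows (1, x_i, y_i) as P, and zeros on the bottom right, then solves it with plain Gaussian elimination. It uses the Euclidean norm inside U, as in the definition of K_ij; substituting the angular distance would be a change to `tps_kernel`'s argument and is not shown. All function names are illustrative, and a production implementation would normally call a linear algebra library instead of the hand-written solver.

```python
import math

def tps_kernel(r):
    """U(r) = r^2 log r^2, with U(0) = 0."""
    return 0.0 if r == 0.0 else r * r * math.log(r * r)

def solve_linear(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(A)
    M = [list(A[i]) + [b[i]] for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda row: abs(M[row][col]))
        if abs(M[pivot][col]) < 1e-12:
            raise ValueError("singular system (are the points collinear?)")
        M[col], M[pivot] = M[pivot], M[col]
        for row in range(col + 1, n):
            factor = M[row][col] / M[col][col]
            for c in range(col, n + 1):
                M[row][c] -= factor * M[col][c]
    x = [0.0] * n
    for row in range(n - 1, -1, -1):
        s = M[row][n] - sum(M[row][c] * x[c] for c in range(row + 1, n))
        x[row] = s / M[row][row]
    return x

def fit_tps(points, values):
    """Return (w, a) with a = [a_1, a_x, a_y] interpolating values at points."""
    n = len(points)
    size = n + 3
    L = [[0.0] * size for _ in range(size)]
    for i, (xi, yi) in enumerate(points):
        for j, (xj, yj) in enumerate(points):
            L[i][j] = tps_kernel(math.hypot(xi - xj, yi - yj))
        L[i][n], L[i][n + 1], L[i][n + 2] = 1.0, xi, yi      # block P
        L[n][i], L[n + 1][i], L[n + 2][i] = 1.0, xi, yi      # block P^T
    rhs = list(values) + [0.0, 0.0, 0.0]
    sol = solve_linear(L, rhs)
    return sol[:n], sol[n:]

def eval_tps(points, w, a, x, y):
    """Evaluate f(x, y) = a_1 + a_x x + a_y y + sum_i w_i U(|(x_i, y_i) - (x, y)|)."""
    result = a[0] + a[1] * x + a[2] * y
    for (xi, yi), wi in zip(points, w):
        result += wi * tps_kernel(math.hypot(xi - x, yi - y))
    return result
```

To warp a 2-D shape, the fit is run twice, once with the target x coordinates as values and once with the target y coordinates. When the targets are an exact affine function of the control points, the non-affine weights w come out as zero, which is a convenient sanity check.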

MPEG-7 database:
The MPEG-7 shape silhouette database, core experiment CE-Shape-1 part B, measures the performance of similarity-based retrieval. The dataset contains 70 groups of objects with 20 images per group, for a total of 1400 images. The bull's eye test was carried out on the MPEG-7 dataset to compare retrieval rates. In the bull's eye test, each image in the database is used as a query and the number of correct images among the top 40 matches is counted. The task is repeated for every shape, and the number of correct images (at most 20 per shape) is accumulated to obtain the retrieval rate. Each image is represented by 100 sample points taken from the Canny edge detection. When computing the C_ij for the bipartite matching, we included a term representing the dissimilarity of local tangent angles. Specifically, we defined the matching cost as in Eq. 19:
$$C_{ij} = (1 - \beta) \, C^{sc}_{ij} + \beta \, C^{tan}_{ij} \tag{19}$$
where C^sc_ij is the shape context cost and:
$$C^{tan}_{ij} = 0.5 \left( 1 - \cos(\theta_i - \theta_j) \right) \tag{20}$$
Equation 20 measures tangent angle dissimilarity, with θ_i and θ_j the tangent angles at p_i and q_j, and β = 0.1. For recognition, we used a K-NN classifier with the distance function:
$$D = 1.6 \, D_{ac} + D_{sc} + 0.3 \, D_{be} \tag{21}$$
where D_ac, D_sc and D_be are the appearance cost, shape context cost and bending energy terms (Belongie et al., 2002). The weights in Eq. 21 have been optimized on a 3,000 × 3,000 subset of the training data. To obtain the best match, each image is presented in normal, horizontally flipped and vertically flipped form, and the distance is calculated using Eq. 22. Examples of images in these three categories are shown in Fig. 6.

Fig. 6: Sample shapes of normal, horizontal and vertical flipped images in the MPEG-7 database

Table 1: Comparison of retrieval rates for MPEG-7 database
| Algorithm | Score (%) |
|---|---|
| Shape context (Belongie et al., 2002) | 76.51 |
| Generative model (Xu et al., 2009) | 80.03 |
| Curvature scale space (Mokhtarian, 2003) | 81.12 |
| Chance probability function (Xu et al., 2009) | 82.69 |
| Polygonal multi resolution (Milios and Petrakis, 2000) | 84.33 |
| Multi scale representation (Adamek and O'Connor, 2004) | 84.93 |
| HPM-Fn (Xu et al., 2009) | 86.35 |
| Shape-tree (Xu et al., 2009) | 87.70 |
| Contour flexibility (Xu et al., 2009) | 89.31 |
| Proposed Method | 91.23 |
$$\mathrm{dist}(Q, R) = \min \{ \mathrm{dist}(Q, R_a), \; \mathrm{dist}(Q, R_b), \; \mathrm{dist}(Q, R_c) \} \tag{22}$$
where R_a, R_b and R_c denote the normal, horizontally flipped and vertically flipped versions of the reference shape R.
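The minimum over the three orientations in Eq. 22 can be sketched as below. Shapes are assumed to be lists of (x, y) points, flips are taken about the coordinate axes, and `toy_distance` is a hypothetical stand-in for the real matching error, included only so the example runs on its own.

```python
import math

def flip_horizontal(points):
    """Mirror a point set left to right (about the y axis)."""
    return [(-x, y) for x, y in points]

def flip_vertical(points):
    """Mirror a point set top to bottom (about the x axis)."""
    return [(x, -y) for x, y in points]

def flip_invariant_distance(query, reference, dist):
    """Smallest distance between the query and the normal, horizontally
    flipped and vertically flipped versions of the reference."""
    candidates = [reference, flip_horizontal(reference), flip_vertical(reference)]
    return min(dist(query, r) for r in candidates)

def toy_distance(a, b):
    """Hypothetical placeholder: sum of distances between same-index points."""
    return sum(math.hypot(ax - bx, ay - by) for (ax, ay), (bx, by) in zip(a, b))
```

Any distance function with the signature `dist(query, reference)` can be passed in, so the same wrapper applies to the full matching error.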
The proposed algorithm gives the highest retrieval rate among the compared algorithms, 91.23%. The comparison of retrieval rates for various algorithms on the MPEG-7 database is shown in Table 1. Since the proposed method yields better results than the other methods, it may be extended to other similarity measures.
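The bull's eye retrieval rate can be computed from a matrix of pairwise matching errors as sketched below. The defaults of 40 retrieved shapes and 20 shapes per class follow the MPEG-7 protocol described in the text; the function name, the list-of-lists distance matrix and the counting of the query itself as a hit are assumptions of this illustration.

```python
def bulls_eye_score(dist_matrix, labels, top_k=40, class_size=20):
    """Retrieval rate in percent.

    For every query, count how many of its top_k nearest shapes share its
    class label, then divide the total by the best possible count.
    """
    n = len(labels)
    hits = 0
    for q in range(n):
        # Rank all shapes (including the query) by distance to the query.
        ranked = sorted(range(n), key=lambda r: dist_matrix[q][r])
        hits += sum(1 for r in ranked[:top_k] if labels[r] == labels[q])
    return 100.0 * hits / (n * class_size)
```

On the full dataset the denominator is 1400 × 20 = 28000, so a score of 100% means every query finds all 20 members of its class among its 40 nearest shapes.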

CONCLUSION
This study presents an efficient method for shape matching that performs well on the bull's eye test and produces better results on the MPEG-7 database. The proposed algorithm is simple, invariant to noise and gives a better error rate than existing methods. The CPU time improved considerably when using the angular distance in the aligning transformation together with the shortest augmenting path algorithm. The method can also be extended to handwritten characters, industrial objects, face recognition and the COIL database. Further, the algorithm will be extended to recognize multiple objects in an image simultaneously, which can improve the reliability of real time systems. The algorithm can also be applied to video and aerial images to identify objects. The proposed method can be developed for applications in areas such as the military, investigation departments and industrial automation.