Harmonic Mean Projection Shape Transform for Leaf Classification

Corresponding Author: Sophia Jamila Zahra, Faculty of Information Science and Technology, Universiti Kebangsaan Malaysia, Bangi, Malaysia. Email: sophia_jz@siswa.ukm.edu.my

Abstract: Shape feature extraction has emerged as an important part of computer vision and image processing applications. Identifying objects in large datasets based on shape similarity remains a challenging problem because many shapes resemble one another and some classes share similar contour information. This paper investigates shape analysis for plant leaf classification and proposes a suitable feature extraction technique. A new approach, the Harmonic Mean Projection Transform, adapted from the Radon transform, is proposed to extract shape information in order to classify leaf images into different classes, and a framework for its application to leaf classification is presented. The process considers the information of all pixels using the harmonic mean formula and is enhanced in the feature extraction stage with a similarity measure called DIMI; the two techniques are combined to achieve rotation and scale invariance. Encouraging experimental results on the Swedish leaf dataset demonstrate that the proposed method achieves better accuracy than state-of-the-art techniques under the standard evaluation metrics of precision, recall and accuracy.


Introduction
Shape is an important feature for representing an object because it characterizes the contents of a digital image. Compared with other features such as texture and color, the shape of an object is independent of the capturing sensor. Nonetheless, representing the shape information of an object accurately remains a challenge because of noise, occlusion and clutter in the images (Kahaki et al., 2016a). Shape classification is important for many kinds of digital object analysis, especially in agricultural and botanical research. When an incomplete, diseased leaf is captured with different devices, such as a scanner or a smartphone, the result of plant identification or classification can be affected. Plant identification is very demanding in agricultural research, since it is the process of assigning each individual plant to a descending series of related plant groups according to their common characteristics (Wang et al., 2003).
Classification of plant species has become an active area of research in image processing. Unlike other plant organs such as flowers, fruits and seeds, leaves are easy to obtain because they are abundant throughout the year. Leaf recognition plays an important role in plant identification and classification; the key issue lies in selecting features that can distinguish well between different kinds of leaves (Tsolakidis et al., 2014). Leaf identification becomes challenging when the images are subject to geometric deformations (scale, rotation, translation) and illumination changes. The main issues arising from leaf properties are the huge number and diversity of leaf species and the low intra-class similarity and high inter-class similarity among them. A good feature descriptor is therefore essential for good leaf identification.
The performance of a shape identification method depends on the type of shape descriptor and on the matching algorithm (Kahaki et al., 2016b). The former focuses on extracting effective and important shape features and arranging them in a data structure. The latter uses the resulting shape descriptors to determine the similarity of two shapes based on a shape distance measure. Shape description and matching are thus the two important parts of shape classification for identification and retrieval (Zahra et al., 2017).
In this study we focus on the classification of plant leaves based on their shapes. We introduce a new image transform for extracting leaf shape features, the Harmonic Mean Transform, and combine it with a deformation-invariant similarity metric, Distance-based Mutual Information, which generates supporting features to improve the plant leaf classification result. We demonstrate that the proposed framework outperforms state-of-the-art shape-based techniques in plant leaf classification.

Related Work
Over the last decades, various methods have been proposed for shape recognition. In traditional shape analysis, the similarity between two shapes is measured using global and local features that capture properties of the shape such as contours and regions (Kahaki et al., 2014). The shape signature is a prominent hybrid between local and global descriptors that copes with certain limitations of each. Compared with other shape descriptors, such as shape context and multiscale shape descriptors, a shape signature is able to capture the essential information of the shape. Proposed signatures include tangent angle, local diameters, centroid distance, complex coordinates and many others (Zhang and Lu, 2004). They are widely used in shape analysis tasks because they are compact and efficient enough to serve as the shape descriptor of an image. Nonetheless, they sometimes fail to discriminate shapes with large differences. Most shape signatures are not invariant to articulation: classic shape signatures fail to match planar and non-planar images, that is, isometric transformations of the shape, because articulation between two shapes introduces adverse information into the signatures. Articulation invariance has become a challenging problem in shape analysis because it must account for changes in the 2D shape due to 3D articulations; such changes can also be caused by viewpoint variations or by varying effects of the imaging process on different regions due to non-planarity. This issue was addressed early by Ling and Jacobs (2007), who proposed the shape context with Inner Distance method (IDSC); instead of the Euclidean distance, they used the length of the shortest path inside the shape. However, shape context is still less compact than shape signatures.
In earlier research on multiscale representation, Adamek and O'Connor (2004) proposed the Multiscale Convexity Concavity (MCC) representation, in which the displacement of each contour point relative to its position at the preceding scale level is used to measure convexity and concavity at different scales. Alajlan et al. (2007) proposed another multiscale shape descriptor, the Triangle-Area Representation (TAR), which uses the area of the triangles formed by boundary points to measure the convexity and concavity of each point at different scales. The uncertainty over which triangles to use inspired Mouine et al. (2013) to combine the triangle representation with side lengths and oriented angles, termed the Triangle Side Lengths and Angle representation (TSLA); TSLA achieved better results than TAR.
Wang et al. (2015) subsequently proposed a novel shape description and matching method named Multiscale Arch Height (MARCH). It extracts hierarchical arch height features at different chord spans from each contour point, and the descriptors are then compared using an L1-norm-based dissimilarity measure to provide fast matching. MARCH outperformed state-of-the-art methods such as MCC, TAR, IDSC and TSLA, with an accuracy of 97.33% on the Swedish leaf dataset. However, some drawbacks remain because it relies on arches along the shape contour, whereas acquired leaf images are sometimes incomplete, with missing edges caused by noise during image acquisition and extra edges inside the leaf. These defects affect the accuracy and the final detection rate. All of the above are conventional methods based on boundaries, curves and contours.
However, some problems cannot be solved with these elements, since they cannot capture the interior content of a shape and cannot deal with disjoint shapes for which the boundary may not be available. For these reasons, this study proposes the Harmonic Mean Projection Transform as a new method of shape extraction and description, together with distance-based mutual information for measuring similarity. Both operate in the spatial domain and extract descriptors from the entire region, which makes the technique more robust to transformation changes. The Harmonic Mean Projection Transform is adapted from the Radon transform in that it calculates a functional of the image function along lines. The Trace transform is computed in a similar way to the Radon transform, except that its functional is not necessarily the integral (Nasrudin and Petrou, 2011). This paper aims to develop a stable shape description method for arbitrary object and leaf shape identification. As mentioned above, state-of-the-art methods such as TAR and TSLA use triangle regions and multiscale shape description, IDSC uses the inner distance (the shortest path inside the shape), and MARCH uses multiscale shape description with an arch height function that measures the curvature around a contour point, together with a simple L1-norm-based similarity measure for shape matching. In contrast, our method extracts shape features from the entire object using integral measures over all pixels in the image, and measures the shape similarity of two images with distance-based mutual information.
Distance-based mutual information is an extension of Mutual Information (MI), a popular measure of the similarity between two images. MI has been successfully applied in areas such as medical imaging, segmentation, image registration, feature-based image retrieval and shape recognition (Alajlan et al., 2007; Govender et al., 2014). A limitation of MI is that it is calculated on a pixel-by-pixel basis: it considers only the relationship between corresponding individual pixels and ignores each pixel's neighborhood (Viola and Wells III, 1997). A further important drawback is that MI fails to take geometry into account, since it considers only the pixel values and not the pixel positions (Russakoff et al., 2004). Instead of relying only on a one-to-one pixel correspondence, our method considers the mutual information of the pixels as well as the distances between them. When the intensity varies between corresponding pixels but the distance of the shifts remains the same, this indicates that the positions of all pixels have changed by the same shift magnitude.
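The position-blindness of plain MI can be illustrated with a minimal sketch. It computes standard MI from a joint intensity histogram (not the proposed DIMI) and then applies the same random permutation of pixel positions to both images. The function name, the bin count and the synthetic test images are assumptions made for illustration only.

```python
import numpy as np


def mutual_information(a, b, bins=16):
    """Plain mutual information (in nats) between two equally sized images."""
    # Joint histogram of corresponding pixel intensities (8-bit range assumed).
    joint, _, _ = np.histogram2d(
        np.ravel(a), np.ravel(b), bins=bins, range=[[0, 256], [0, 256]]
    )
    p_ab = joint / joint.sum()
    p_a = p_ab.sum(axis=1, keepdims=True)  # marginal distribution of a
    p_b = p_ab.sum(axis=0, keepdims=True)  # marginal distribution of b
    nz = p_ab > 0                          # skip empty cells to avoid log(0)
    return float(np.sum(p_ab[nz] * np.log(p_ab[nz] / (p_a @ p_b)[nz])))


# Synthetic images, for illustration only.
rng = np.random.default_rng(0)
img_a = rng.integers(0, 256, size=(32, 32))
img_b = np.clip(img_a + rng.integers(-10, 11, size=(32, 32)), 0, 255)

mi_original = mutual_information(img_a, img_b)

# Shuffle pixel positions identically in both images: the spatial layout is
# destroyed, but the joint histogram (and therefore MI) is unchanged.
perm = rng.permutation(img_a.size)
mi_shuffled = mutual_information(img_a.ravel()[perm], img_b.ravel()[perm])
```

Because the two values coincide, plain MI cannot tell a structured image pair from a scrambled one, which is the gap a distance-aware measure is meant to close.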

Methodology
This section describes the shape feature descriptor and the similarity measure used to achieve a better classification rate in leaf identification. The shape descriptor is computed on the Swedish leaf dataset and is followed by the similarity measurement, which improves the overall shape identification results.
The well-known Swedish leaf dataset is an important benchmark associated with the leaf classification project at Linköping University (Söderkvist, 2001) and has been used in many shape classification and retrieval applications. Using this dataset for leaf shape identification, Alajlan et al. (2007) achieved a classification rate above 95%. This article uses the Swedish leaf dataset to assess the performance of the proposed method; the dataset is challenging because of the wide variety of plants and the many different features that need to be considered (Tsolakidis et al., 2014). Samples from the Swedish leaf dataset are shown in Fig. 1.

The Proposed Harmonic Mean Projection Transform for Shape Description
A new image transform, the Harmonic Mean Transform (HMT), is proposed in this study as a global descriptor for feature extraction. The harmonic mean function is used so that the extracted signal carries more of the important information in the image. Features are extracted from the whole region of the selected image, where all pixels are considered and accumulated to obtain a set of features. The harmonic mean of the pixels in the shape is computed along the vertical and horizontal directions and accumulated into a set of vectors; each vector contains the HMT sums for one rotation of the image, with the rotation angle running from 0 to 180 degrees as shown in Fig. 2. As an initial step, the input image undergoes background subtraction to separate the target object from the background. Feature sets are then generated by the HMT algorithm, and Principal Component Analysis (PCA) and FFT are used to reduce the dimensionality and extract the important information from the feature matrix. Figure 2 shows how the HMT matrix is extracted from the image: the features are taken from two sides, with the horizontal side denoted by 'yi' and the vertical side by 'xi'. Each horizontal and vertical line contains the integral sum of the harmonic mean. Considering the image as a function 'f(x,y)', the HMT formulation is defined in Equation 1, while Equation 2 describes the function 'f(x,y)', where 'xi' is the pixel value at locations 'i' to 'n' along a horizontal line (row direction) and 'yi' is the pixel value at locations 'i' to 'n' along a vertical line (column direction).
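The projection idea described above can be sketched in a few lines. This is a minimal illustration, not the authors' exact formulation: it assumes the background has already been set to zero, that only positive pixels enter each harmonic mean (the harmonic mean of n positive values is n divided by the sum of their reciprocals), and that nearest-neighbour rotation is acceptable. The function names `rotate_nearest`, `harmonic_mean_lines` and `hmt_features` are hypothetical.

```python
import numpy as np


def rotate_nearest(img, angle_deg):
    """Rotate a 2D array about its centre using nearest-neighbour sampling."""
    h, w = img.shape
    cy, cx = (h - 1) / 2.0, (w - 1) / 2.0
    theta = np.deg2rad(angle_deg)
    ys, xs = np.mgrid[0:h, 0:w]
    # Inverse mapping: for each output pixel, find the source coordinate.
    src_x = np.cos(theta) * (xs - cx) + np.sin(theta) * (ys - cy) + cx
    src_y = -np.sin(theta) * (xs - cx) + np.cos(theta) * (ys - cy) + cy
    src_x = np.rint(src_x).astype(int)
    src_y = np.rint(src_y).astype(int)
    inside = (src_x >= 0) & (src_x < w) & (src_y >= 0) & (src_y < h)
    out = np.zeros_like(img)
    out[inside] = img[src_y[inside], src_x[inside]]
    return out


def harmonic_mean_lines(img, axis):
    """Harmonic mean of the positive pixels along each line of the given axis."""
    positive = img > 0
    count = positive.sum(axis=axis)
    # Reciprocals of positive pixels only; background pixels contribute nothing.
    recip = np.where(positive, 1.0 / np.where(positive, img, 1.0), 0.0)
    recip_sum = recip.sum(axis=axis)
    # Lines without any foreground pixel are assigned 0 (an assumption).
    return np.where(count > 0, count / np.where(count > 0, recip_sum, 1.0), 0.0)


def hmt_features(img, angles=range(0, 180)):
    """Stack row and column harmonic-mean projections for each rotation angle."""
    img = np.asarray(img, dtype=float)
    feats = []
    for angle in angles:
        rot = rotate_nearest(img, angle)
        rows = harmonic_mean_lines(rot, axis=1)  # one value per row
        cols = harmonic_mean_lines(rot, axis=0)  # one value per column
        feats.append(np.concatenate([rows, cols]))
    return np.stack(feats, axis=1)  # shape: (h + w, number of angles)
```

With the default one-degree step, `hmt_features` returns one column per angle, 180 in total; the step size is itself an assumption.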

Distance based Mutual Information (DIMI)
A new similarity metric called Distance-based Mutual Information (DIMI) is proposed to support the Harmonic Mean Transform. DIMI is adapted from existing similarity metrics such as SAD, LSAD, SSD, NCC and others. Because similarity metrics are a core component of many computer vision and signal processing techniques, improving them can enhance the performance of the final application.
A robust and invariant similarity metric can improve the final results significantly, so it is very important to have an appropriate similarity measure for describing image features. Mutual information has been shown to give good results for 2D active shape recognition when combined with a Fourier descriptor (Govender et al., 2014). Many image processing applications, such as classification, image matching and retrieval, always involve two input images, a source and a target, whose similarity must be quantified to obtain the final result. Distance-based Mutual Information (DIMI) is proposed in this study as an extension of mutual information that extracts features from both the source and the model images to improve the accuracy of the result. The goal is to measure the similarity between the features of the HMT image and the features of the model image, which are defined from the available images of all classes in the dataset. The proposed approach is intended to make the mutual information of the pixels and the corresponding distances invariant to image rotations and deformations. The DIMI similarity measure was evaluated on 100 images from the USC-SIPI dataset to verify its performance before it was applied to shape identification. As shown in Fig. 3, 'R' and 'T' are two input vectors extracted from the input images, whose lengths are denoted by 'N' and 'M'.
Based on these vectors, the initial joint histogram matrix 'I(x,y)' is defined by Equation 3. To reduce the sensitivity of DIMI to illumination variation, the standard deviation of each input vector is calculated and incorporated into the joint mutual information (JMI) matrix, as defined in Equation 6, where R̄ and T̄ are the means of vectors R and T respectively, M is the number of pixels in R and N is the number of pixels in T. In the next step, the final joint mutual information matrix S(x,y) is obtained as given in Equation 7. The distance between pixels is then weighted by the intensity variation in both R and T. Since a pixel's intensity is its brightness, the elements of the intensity matrix represent different intensities, or gray-level variations, so the intensity variation is treated as an important factor. It is expressed in Equation 8, where μ is the mean of both the R and T vectors. Finally, the mutual information formulation is applied to the proposed joint mutual information matrix to calculate the DIMI value, as presented in Equation 9.

Experiments and Results Discussion
The shape identification performance is evaluated by combining the features extracted with the proposed HMT method and the features obtained from DIMI, and comparing the result with state-of-the-art methods, including TAR (Alajlan et al., 2007), MCC (Adamek and O'Connor, 2004), TSLA (Mouine et al., 2013), IDSC (Ling and Jacobs, 2007) and MARCH (Wang et al., 2015). The Swedish leaf dataset, which contains 15 classes of 75 images each, is used to measure the performance of the proposed shape descriptor. Figure 4 presents the proposed algorithm, including the feature extraction and similarity measurement stages that make up the shape identification framework.
All input images first pass through background subtraction, and the extracted features are stored in a feature matrix. The HMT method is then applied to extract the features, so that each feature matrix contains one descriptor per angle from 0° to 180°. This technique is used to achieve rotation invariance, since most images have undergone transformations such as changes in scale, rotation and point of view.
Each image thus yields a feature matrix of 180 columns, multiplied by 278 over all the images in one dataset. This large number of features is then reduced in dimensionality by Principal Component Analysis (PCA). The feature matrix produced by the DIMI process is combined with the HMT feature matrix. To verify the proposed method, a neural network is used to obtain the classification rate. Table 1 presents the neural network parameter settings used for training on all the input data.
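The dimensionality reduction step can be sketched with a singular value decomposition. This is a generic PCA under stated assumptions, not the authors' configuration: the text does not say how many components are kept, so `n_components` and the random stand-in data are purely illustrative, and `pca_reduce` is a hypothetical helper name.

```python
import numpy as np


def pca_reduce(features, n_components):
    """Project row-wise samples onto their top principal components."""
    X = np.asarray(features, dtype=float)
    X_centered = X - X.mean(axis=0)
    # Rows of vt are the principal directions, ordered by singular value.
    _, s, vt = np.linalg.svd(X_centered, full_matrices=False)
    explained = (s ** 2) / np.sum(s ** 2)  # fraction of variance per component
    return X_centered @ vt[:n_components].T, explained[:n_components]


# Random stand-in for 20 images with 180 features each (illustration only).
rng = np.random.default_rng(0)
X = rng.normal(size=(20, 180))
reduced, explained = pca_reduce(X, n_components=5)
```

The second return value gives the share of variance carried by each retained component, which is one common way to decide how many components to keep.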
In this experiment, the number of hidden layers is set to 10, because adding more layers of hidden neurons gives the system greater flexibility and processing power, whereas too few hidden neurons could reduce its robustness. The back-propagation algorithm is selected for training because it is widely used for neural network classification; it fine-tunes the weights and biases throughout the network. Proper tuning of the weights yields lower error rates and makes the model more reliable by improving its generalization, so that the desired output can be obtained.
The input data are divided into two subsets: a training set and a testing set. The training set comprises 70% of the data and is used to compute the gradient and update the weights. When the network begins to overfit the data, the error on the validation set rises; once the validation error has risen for a specified number of iterations, training is stopped and the weights with the minimum validation error are returned as the final neural network. With 70% of the data used for training, the remaining 30% is used separately for testing. The benefit of the back-propagation algorithm is that it calculates the gradient of a loss function with respect to all the weights in the network, which makes it suitable for training a multi-layer neural network to learn an arbitrary mapping from input to output.
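A 70/30 partition of the kind described above can be sketched as follows. The text does not state whether the split is stratified by class; the per-class (stratified) split, the fixed seed and the helper name `stratified_split` are assumptions made for illustration.

```python
import numpy as np


def stratified_split(labels, train_fraction=0.7, seed=0):
    """Return train and test index arrays, taking train_fraction of each class for training."""
    labels = np.asarray(labels)
    rng = np.random.default_rng(seed)
    train_idx, test_idx = [], []
    for cls in np.unique(labels):
        # Shuffle the indices of this class, then cut at the requested fraction.
        idx = rng.permutation(np.flatnonzero(labels == cls))
        n_train = int(train_fraction * len(idx))
        train_idx.extend(idx[:n_train])
        test_idx.extend(idx[n_train:])
    return np.array(train_idx), np.array(test_idx)


# 15 classes with 75 images each, as in the Swedish leaf dataset.
labels = np.repeat(np.arange(15), 75)
train_idx, test_idx = stratified_split(labels)
```

Splitting per class keeps every species represented in both subsets; because 70% of 75 is not a whole number, each class contributes 52 training and 23 testing images in this sketch.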
To evaluate the classification results, the standard evaluation measures of precision, recall, F-measure and average accuracy are applied. Table 2 shows the measures for multi-class classification from Sokolova and Lapalme (2009), where for an individual class Ci the assessment is defined by tpi, fpi, fni and tni. Accuracy, precision, recall and F-measure are calculated from these counts for Ci.

Fig. 4: The framework of shape extraction and similarity measurement for shape recognition

Table 3 presents the multi-class evaluation results in terms of precision, recall, F-measure and accuracy for the 15 classes of the Swedish dataset; these values are generated from the confusion matrix counts (tp, tn, fp and fn). Table 4 presents the classification rates of the proposed method and the state-of-the-art approaches. The proposed method achieved a slightly higher rate than MARCH and the other methods. The classification rate is obtained from the precision-recall evaluation and summarized as the average accuracy (ACC), calculated as described above.
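The per-class measures derived from the confusion matrix counts can be sketched as below, using the usual one-versus-rest definitions: precision = tp/(tp+fp), recall = tp/(tp+fn), F-measure = 2PR/(P+R) and accuracy = (tp+tn)/(tp+fp+fn+tn). The 3-class confusion matrix is hypothetical and does not reproduce any result from this study; `per_class_metrics` is an illustrative name.

```python
import numpy as np


def per_class_metrics(confusion):
    """Per-class precision, recall, F-measure and accuracy from a confusion matrix.

    confusion[i, j] counts samples of true class i predicted as class j.
    """
    cm = np.asarray(confusion, dtype=float)
    total = cm.sum()
    tp = np.diag(cm)
    fp = cm.sum(axis=0) - tp   # predicted as the class but belonging elsewhere
    fn = cm.sum(axis=1) - tp   # belonging to the class but predicted elsewhere
    tn = total - tp - fp - fn
    # Classes with an empty denominator are assigned 0 (an assumption).
    precision = np.divide(tp, tp + fp, out=np.zeros_like(tp), where=(tp + fp) > 0)
    recall = np.divide(tp, tp + fn, out=np.zeros_like(tp), where=(tp + fn) > 0)
    denom = precision + recall
    f_measure = np.divide(2 * precision * recall, denom,
                          out=np.zeros_like(tp), where=denom > 0)
    accuracy = (tp + tn) / total
    return precision, recall, f_measure, accuracy


# Hypothetical 3-class confusion matrix, for illustration only.
cm = np.array([[8, 1, 1],
               [2, 7, 1],
               [0, 2, 8]])
precision, recall, f_measure, accuracy = per_class_metrics(cm)
average_accuracy = accuracy.mean()  # mean of the per-class accuracies
```

Averaging the per-class accuracies treats every class equally, regardless of how many samples it contains.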

Conclusion
A new shape transform, the Harmonic Mean Projection Transform, is proposed as a feature extraction technique. It aims to overcome the shortcomings of available methods in terms of robustness and accuracy. The study contributes new techniques for shape description, similarity measurement and leaf classification. The Harmonic Mean Transform and DIMI are combined to provide better leaf classification results in a fast and reliable manner. Evaluation with standard measures on the Swedish leaf dataset indicates that the proposed methods outperform other state-of-the-art techniques in terms of precision, recall, accuracy and classification rate.
The main objective of the proposed shape descriptor is to overcome the weaknesses of existing methods, improve on them, and provide a new shape description technique for identification purposes. Accordingly, the proposed technique was compared with existing methods on the same dataset. The method proved effective with respect to scale invariance and local deformation.
Although the proposed method is designed to overcome the limitations of previous techniques with respect to image deformation and rotation, it is still not robust under strong illumination and color changes of the objects.
In future work, the approach can be extended to improve available methods, for example by calculating the chord distance and combining it with a projection-based technique; moreover, most similarity metrics remain sensitive to minor image deformation. Little work has been done on these topics, which could be pursued in the context of plant classification or retrieval for either global or local shape descriptors.