Off-Line Signature Authentication Based on Moment Invariants Using Support Vector Machine

Problem statement: This research addressed the reduction of computational load in off-line signature verification by using a minimal feature set with Bayes classifier, fast Fourier transform, linear discriminant analysis, principal component analysis and support vector machine approaches. Approach: The variation of the signature in genuine cases was studied extensively in order to predict, for each person, the set of quad tree components of a genuine sample that satisfy a minimum variance criterion. Using training samples, the Minimum Variance Quad tree Components (MVQC) of a person's signature were listed with a high degree of certainty and then applied to imposter samples. Hu moments were first applied to the selected subsections, and the summation of the moment values over the subsections was provided as the feature to the classifiers. Results: The SVM classifier yielded the most promising results, with an 8% False Rejection Rate (FRR) and a 10% False Acceptance Rate (FAR). The signature is a biometric in which variation between genuine samples is naturally expected: certain parts of a genuine signature vary from one instance to another. Conclusion: The proposed system aimed to provide a simple, fast and robust system using fewer features than state-of-the-art works.


INTRODUCTION
The handwritten signature is widely accepted around the world, both socially and legally. Signature verification has several challenging aspects. The variation between samples must be taken into account, because signing behaviour is acquired over time. The discriminative power of the features that can be extracted is affected by the physical and emotional state of the user, which makes it difficult to detect the "true" signature. The system should combine statistical matching methods with decision adaptation methods, as different features will have different thresholds for different users. Designing a complete automatic handwritten signature verification system that can cope with all classes of forgeries is a very difficult task. The paper is organized as follows. The Prologue explains the preliminaries: the quad tree, Hu moments, variance and the classifiers. The state of the art is then discussed. The approach is presented next, followed by experimental results for the various approaches. Finally, conclusions are drawn with a discussion of future research.

Problem statement:
In an off-line signature verification system, signatures are treated as grey level images. The features must be invariant to rotation, translation and scaling of the object sample. Some of the static features are vertical midpoints, the number of vertical midpoint crossings in the signature, total pen travel writing distance, signature area and maximum pixel change. In the proposed system, sub-patterns of the signature area are considered under a minimum variance criterion to provide off-line handwritten signature authentication.
Prologue: A tree data structure in which each internal node has up to four children is termed a quad tree. The tree directory follows the spatial decomposition of the quad tree. Quad trees are classified according to the type of data they represent, including areas, points, lines and curves. In this study, region quad trees are used. The region quad tree represents a partition of space in two dimensions by decomposing the region into four equal quadrants. A region quad tree with a depth of n may be used to represent an image consisting of $2^n \times 2^n$ pixels, where each pixel value is 0 or 1. The root node represents the entire binary image. Let R represent the entire normalized binary signature image. The quad tree partitions R into 4 subregions, $R_1, R_2, R_3, R_4$, such that (a) $\bigcup_i R_i = R$; (b) $R_i$ is a connected region, i = 1, 2, 3, 4; (c) $R_i \cap R_j = \emptyset$ for all i, j with i ≠ j. The same clauses apply to further subdivisions. The proposed system implements the first, second and third trie levels of decomposition to represent variable sizes of the normalized binary signature. This data structure was selected because, in a real-time scenario, storage in external files is simpler: every node is either a leaf or has exactly four children, whereas binary trees involve a number of traversals with level numbers of nodes for different encoding levels.
In the image processing field, centroid and size normalization provide significant inferences. Moments of order p, q of a binary image I are calculated as $m_{pq} = \sum_x \sum_y x^p y^q I(x, y)$ (1). The moment expression derived by Hu (1962) can be extended to the centralized moment $\mu_{pq} = \sum_x \sum_y (x - a)^p (y - b)^q I(x, y)$ (2). The parameters a and b are the centers of mass in the 2D co-ordinate system. The lower order moments describe the shape characteristics. The first, second and eighth Hu moments are used in this study. Moments 3-7, which are usually assigned to moment invariants of order 3, are not considered (Foster et al., 2002).
For the rest of the paper, these three moments are referred to as Moments A, B and C, respectively. Because Hu invariant moments are used, the technique is invariant to rotation, translation and scaling operations. The signing process depends on the amount of area available to sign, which makes scale invariance very important. Authentication via moment-based descriptors is achieved through a variance criterion.
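The moment computation can be illustrated with a minimal pure-Python sketch. It assumes the image is a list of rows of 0/1 values with ink pixels equal to 1, and it uses the standard normalized central moments $\eta_{pq} = \mu_{pq} / \mu_{00}^{1 + (p+q)/2}$. Only the first two Hu invariants (Moments A and B) are sketched; Moment C is not implemented here. The function names are hypothetical.

```python
def raw_moment(img, p, q):
    """Geometric moment m_pq of a binary image given as a list of rows."""
    return sum((x ** p) * (y ** q) * v
               for y, row in enumerate(img)
               for x, v in enumerate(row))


def central_moment(img, p, q, a, b):
    """Central moment mu_pq about the centre of mass (a, b)."""
    return sum(((x - a) ** p) * ((y - b) ** q) * v
               for y, row in enumerate(img)
               for x, v in enumerate(row))


def hu_first_two(img):
    """Return the first and second Hu invariants (Moments A and B)."""
    m00 = raw_moment(img, 0, 0)
    if m00 == 0:
        return 0.0, 0.0  # empty subregion: no ink pixels
    a = raw_moment(img, 1, 0) / m00  # x co-ordinate of the centre of mass
    b = raw_moment(img, 0, 1) / m00  # y co-ordinate of the centre of mass

    def eta(p, q):
        # Normalized central moment, which gives scale invariance.
        return central_moment(img, p, q, a, b) / m00 ** (1 + (p + q) / 2)

    n20, n02, n11 = eta(2, 0), eta(0, 2), eta(1, 1)
    phi1 = n20 + n02
    phi2 = (n20 - n02) ** 2 + 4 * n11 ** 2
    return phi1, phi2
```

Translating the shape inside the image or rotating it by 90 degrees should leave both values unchanged up to floating-point error, which is the property the authentication scheme relies on.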
Let M be the order of the normalized binary image I; in this study M = 512. Applying the quad tree procedure to I forms $L = (M/d_1) \times (M/d_1)$ subregions, where $d_1 \in \{64, 128, 256\}$ denotes the minimum subregion size and $M/d_1$ is the trie level of the quad tree. Moments are applied on each of the L quad tree components using (1) and (2). The variance of each corresponding subregion across the m genuine samples of the training set is found, as shown in Fig. 1, and the average variance is calculated. Each $\mathrm{var}_i$, $i \in \{1, \ldots, L\}$, that is less than the average is selected for the subregion list. Let b be the threshold parameter of the system. If the number of elements in the subregion list is greater than b, the process is repeated with the new average of the variances of the subregions in the list. The resulting b template MVQCs denote the subregions of a person's signature that vary least with respect to the applied moment.
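The selection procedure above can be sketched as follows. This is an illustration under stated assumptions rather than the authors' implementation: the population variance is used, indices are 0-based (the paper numbers components from 1), and the loop stops early if every remaining variance is equal. The names `select_mvqc` and `mvqc_sum` are hypothetical. Note that the repeated averaging can leave fewer than b components.

```python
def variance(values):
    """Population variance of a list of numbers."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / len(values)


def select_mvqc(moment_table, b):
    """Select minimum variance quad tree components.

    moment_table[s][i] is the moment value of subregion i in genuine
    training sample s. Returns the 0-based indices of the kept subregions.
    """
    num_regions = len(moment_table[0])
    var = [variance([sample[i] for sample in moment_table])
           for i in range(num_regions)]
    selected = list(range(num_regions))
    while True:
        avg = sum(var[i] for i in selected) / len(selected)
        kept = [i for i in selected if var[i] < avg]
        if not kept:          # all remaining variances are equal
            break
        selected = kept       # strictly smaller than before
        if len(selected) <= b:
            break
    return selected


def mvqc_sum(sample_moments, selected):
    """Feature for one sample: sum of its moment values over the MVQCs."""
    return sum(sample_moments[i] for i in selected)
```

For example, with three genuine samples and four subregions, `select_mvqc([[1, 1, 0, 5], [1, 2, 10, 5], [1, 3, 20, 6]], 2)` keeps the two subregions whose moment values change least across the samples.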
The decision rule of the Bayes minimum distance classifier is a quadratic function of the sample vector x, where $w_1$, $w_2$ represent the two classes, $\mu_1$, $\mu_2$ represent the mean vectors of the classes, $\Sigma^{-1}$ is the inverse of the covariance matrix, which is equal for both classes, and P denotes probability.
In the fast Fourier transform, shifting the zero frequency component to the middle of the spectrum provides a basis for visual classification. Let x(nT) represent the discrete time signal on which the Fast Fourier Transform (FFT) is computed.
Principal Component Analysis (PCA) has been shown to be a powerful tool for dimensionality reduction and feature extraction in the pattern recognition community. It has been successfully used in many applications (Chen et al., 2005). PCA is a linear transformation that removes the correlation among the elements of a random vector. It is used in many applications because of its optimality properties and information representation (Zeng et al., 2002). The eigenvectors and eigenvalues of the covariance matrix play a vital role: the eigenvector with the highest eigenvalue is the principal component of the data set.
Fisher Linear Discriminant Analysis (LDA) can be thought of as a nonparametric method (i.e., distributional assumptions are not explicitly made) because this procedure maximizes the between-class variability relative to the within-class variability, assuming equal sample covariance matrices across classes (Xie et al., 2006). Let $\{x_1, \ldots, x_N\}$ be an n-dimensional sample set with N elements, where n >> N, c is the total number of classes and $N_i$ is the number of samples in the ith class. The between-class scatter matrix $s_B$ and the within-class scatter matrix $s_w$ are defined with respect to the class means and to $\mu = \frac{1}{N} \sum_{i=1}^{c} \sum_{j=1}^{N_i} x_j^{(i)}$, the mean of all samples, where $x_j^{(i)}$ is the jth sample of the ith class (Zheng et al., 2004). LDA is used to seek a projection w, from the original sample space to a lower dimensional space, which maximizes the between-class scatter while minimizing the within-class scatter.
A typical way to achieve this is to maximize the ratio $|w^T s_B w| / |w^T s_w w|$, where $s_B$ is the between-class scatter matrix and $s_w$ is the within-class scatter matrix (Huang et al., 2002).
A Support Vector Machine (SVM) maps the input vectors into a high-dimensional feature space through a nonlinear mapping, in which the optimal separating hyperplane is sought. The SVM is a very effective method for general purpose pattern recognition. Given a set of points that belong to either of two classes, an SVM finds the hyperplane leaving the largest possible fraction of points of the same class on the same side, while maximizing the distance of either class from the hyperplane. This minimizes the risk of misclassifying not only the examples in the training set but also the yet-to-be-seen examples of the test set (Pontil and Verri, 1998). SVMs perform pattern recognition between two classes by finding a decision surface that has maximum distance to the closest points in the training set, which are termed support vectors. The principle of the SVM is that, where there are many possible linear classifiers that can separate the data, there is only one that maximizes the margin between the classes (Faruqe and Al Mehedi Hasan, 2009). SVMs are particular classifiers that are based on the margin-maximization principle. They perform structural risk minimization as stated by Vapnik. SVMs use suitable kernels to produce nonlinear boundaries (Adankon and Cheriet, 2008). SVM training consists of a quadratic programming problem that can be solved efficiently and for which a global extremum is guaranteed to be found (Scholkopf et al., 1997). Training an SVM is equivalent to solving a linearly constrained Quadratic Programming (QP) problem in a number of variables equal to the number of data points (Osuna et al., 1997). SVM for 2-class classification can be explained as:

X→Y mapping
where x ∈ X is some object and y ∈ Y is a class label. Let $x \in \mathbb{R}^n$, $y \in \{\pm 1\}$. The optimal hyperplane is chosen from the set of hyperplanes in $\mathbb{R}^n$, $f(x, \{w, b\}) = \mathrm{sign}(w \cdot x + b)$, as the one that minimizes the overall risk $R(\alpha) = \int l(f(x, \alpha), y)\, dP(x, y)$, where l is the zero-one loss function and P(x, y) is the unknown joint distribution function of x and y (Weston, 1998).
State-of-the-art: Many verification systems use writer-dependent and writer-independent thresholds. The recognition system using warping proposed by Agam and Suresh (2007) used a dataset built from scanned documents. Signatures of 76 subjects, each with 5 samples in the test collection, were extracted. The approach obtained 100% precision at 30% recall. In the verification system using enhanced modified direction features proposed by Nguyen et al. (2006), the classifiers were trained using 3840 genuine and 4800 targeted forged samples. FARR is the false acceptance rate for random forgery and FARG is the false acceptance rate for targeted forgery. DER is the distinguishing error rate, which is the average of FARR and FARG. The system obtained a DER of 17.78% with SVM, and the FARR for random forgeries was below 0.16%. The system for fuzzy vault construction proposed by Freire et al. (2007) used the MCYT database (Spain) for training. The system achieved a separability distance of 12 for random forgeries, where this distance is defined as the average distance between genuine and impostor vault input vectors. Justino et al. (2004) report on a comparison of the two classifiers in off-line signature verification. In the study by Araujo et al. (2007) for a real application (4-6 samples), the false rejection error rate reported for HMM was 13%. The fuzzy modeling approach of Hanmandlu et al. (2004) used samples from the Graphics Visualization and Games Development (GVGD) lab database at the Multimedia University, Cyberjaya, Malaysia. According to the results using the TS model with consequent coefficients fixed with their second formulation (which depends on the number of rules), 125 out of 200 genuine signatures were accepted.
The Approach: The image of size 850×360 is read and converted from RGB to grey scale. The intensity image is then converted to binary. Median filtering, a nonlinear operation, is applied to reduce noise while preserving edges. Some of the noise pixels at the boundary are eliminated by imposing a white boundary rectangle. To achieve good connectivity, the signature is thickened. To make the image suitable for moment application, the intensity levels are exchanged by XORing with a white template of size 850×360. Thinning and rotation to the base line are then performed, which leads to a smooth connected signature. The minimum bounding box is found and the signature is extracted. To maintain uniformity, the sample is resized to 512×512. The signature is thus normalized to its minimum bounding box. The quad tree is constructed as shown in Fig. 2a.
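The last three steps of this pipeline (bounding box extraction, resizing and quad tree construction) can be sketched in pure Python. The sketch assumes a binary image stored as a list of rows with ink pixels equal to 1, uses nearest-neighbour resizing as a stand-in for whatever interpolation was actually used, and lists the quad tree leaves in row-major order, which may differ from the numbering in Fig. 2b. All function names are hypothetical.

```python
def bounding_box(img):
    """Return (top, bottom, left, right), inclusive, of the ink pixels."""
    rows = [r for r, row in enumerate(img) if any(row)]
    if not rows:
        raise ValueError("image contains no ink pixels")
    cols = [c for c in range(len(img[0])) if any(row[c] for row in img)]
    return rows[0], rows[-1], cols[0], cols[-1]


def crop(img, box):
    """Extract the signature inside the bounding box."""
    top, bottom, left, right = box
    return [row[left:right + 1] for row in img[top:bottom + 1]]


def resize_nearest(img, size):
    """Resize to size x size with nearest-neighbour sampling (assumption)."""
    h, w = len(img), len(img[0])
    return [[img[r * h // size][c * w // size] for c in range(size)]
            for r in range(size)]


def quadtree_leaves(size, min_size):
    """Split a size x size region into four equal quadrants recursively
    until blocks are min_size x min_size; return (top, left, side) leaves."""
    ratio, rem = divmod(size, min_size)
    if rem or ratio < 1 or ratio & (ratio - 1):
        raise ValueError("size must be min_size times a power of two")
    leaves = []

    def split(top, left, side):
        if side == min_size:
            leaves.append((top, left, side))
            return
        half = side // 2
        for dt in (0, half):
            for dl in (0, half):
                split(top + dt, left + dl, half)

    split(0, 0, size)
    return sorted(leaves)  # row-major order of the top-left corners
```

With M = 512 and a minimum subregion size of 128, `quadtree_leaves(512, 128)` yields the 16 disjoint components that together cover the normalized image.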

Feature extraction:
The geometric moments are applied to the quad tree components. The procedure is explained for the case of trie level = 4, for which the minimum size of a component is 128×128. The L = 16 components formed are numbered as shown in Fig. 2b. The m training samples of the person are considered; in this study m = 10. The sixteen components are subjected to any one of the moments. The variance of the corresponding quad tree components across the m training samples is calculated. The threshold b, the number of MVQCs, is selected. In this study, the minimum variation quad tree components are found and are then used to detect imposter signatures. In these MVQCs the variation is not minimum for an imposter sample. For one of the subjects in the database, for b = 4, the MVQCs listed were 2, 3, 13 and 15 with Moment C applied. This is learned using only the m genuine training samples of the person. The summation of the moment values over all MVQCs is calculated. In many cases, the summation in an imposter sample was larger than the summation in a genuine sample, as shown in (3) and (4).
In the PCA approach, the principal component values that represent $s_m$ in the principal component space are processed using (8-11). Let $H_K$ be a subset of H; in this study K = 5. The mean of $H_K$ is represented as $H_{Kavg}$. The system is stabilized by taking the mean of the differences as the threshold $C_{th}$, as shown in (11). $H_{Kavg}$ is maintained as a person-dependent constant value. The differences between the H values of a testing sample and $H_{Kavg}$ are compared against $C_{th}$ for classification; genuine testing samples have values less than $C_{th}$. In the LDA classifier, the distance metric used on $s_m$ is the Mahalanobis distance, and the class labels of the testing samples are noted. The SVM training process considers training samples consisting of both genuine and imposter samples. The hyperplanes are fixed with a specific kernel function. In this study, linear, quadratic, radial basis function and multilayer perceptron kernel functions were applied on $s_m$. The trained structure is maintained for further testing, and the class labels of the testing samples are noted.
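The four kernel families named above can be illustrated with a short sketch. The exact parameterisations used in the study are not stated, so the forms below are assumptions: the quadratic kernel is taken as $(1 + x \cdot y)^2$, the radial basis function uses a hypothetical width `sigma`, and the multilayer perceptron kernel is taken as $\tanh(\text{scale} \cdot x \cdot y + \text{offset})$. The decision function is the standard kernel form $\mathrm{sign}(\sum_i \alpha_i y_i K(x_i, x) + b)$, with support vectors and coefficients supplied by a trained model.

```python
import math


def dot(x, y):
    return sum(a * b for a, b in zip(x, y))


def linear_kernel(x, y):
    return dot(x, y)


def quadratic_kernel(x, y):
    # Assumed form: polynomial kernel of degree 2 with unit offset.
    return (1.0 + dot(x, y)) ** 2


def rbf_kernel(x, y, sigma=1.0):
    # sigma is a hypothetical width parameter, not taken from the paper.
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq_dist / (2.0 * sigma ** 2))


def mlp_kernel(x, y, scale=1.0, offset=-1.0):
    # Assumed sigmoid form; scale and offset are illustrative defaults.
    return math.tanh(scale * dot(x, y) + offset)


def svm_decision(x, support_vectors, labels, alphas, bias, kernel):
    """Return +1 or -1 from the kernel expansion over the support vectors."""
    score = sum(a * y * kernel(sv, x)
                for sv, y, a in zip(support_vectors, labels, alphas)) + bias
    return 1 if score >= 0 else -1
```

Since the feature here is a single summation value per sample, each input vector `x` would have one element; the functions accept vectors of any length so that additional features could be appended.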

MATERIALS AND METHODS
The experiment was conducted on the MCYT signature database of 75 subjects. The database consists of 15 genuine samples and 15 forged samples for each subject. The classifiers applied were SVM, FFT, Bayes, PCA and LDA.

RESULTS AND DISCUSSION
The experimental results are listed in Table 1. The best results were obtained using Moment C for $d_1$ = 128 and b = 4 with the radial basis kernel of the support vector machine classifier: a genuine acceptance rate of 92% (FRR of 8%) and a forgery rejection rate of 90% (FAR of 10%). As the value of b increases, FRR increases and FAR decreases for all values of $d_1$, as shown in Table 1. FAR and FRR trade off against one another.
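The relation between the acceptance figures and the two error rates can be made explicit with a small sketch. The counts in the usage line are hypothetical and chosen only to reproduce the reported percentages; they are not taken from Table 1.

```python
def far_frr(genuine_accepted, genuine_total, forged_accepted, forged_total):
    """Return (FAR, FRR) in percent.

    FRR: share of genuine samples that were rejected.
    FAR: share of forged samples that were accepted.
    """
    frr = 100.0 * (genuine_total - genuine_accepted) / genuine_total
    far = 100.0 * forged_accepted / forged_total
    return far, frr


# Hypothetical counts: 92 of 100 genuine accepted, 10 of 100 forgeries accepted.
far, frr = far_frr(92, 100, 10, 100)
```

An acceptance rate of 92% on genuine samples therefore corresponds to an FRR of 8%, and a rejection rate of 90% on forgeries corresponds to a FAR of 10%.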

CONCLUSION
In general, it is hard to say which classification algorithm is best; one can only say that one classification algorithm is better than others for a specific problem. A new technique is presented that is simple and robust for the authentication of off-line handwritten signatures using moment-based descriptors. The technique uses Hu moments and is hence invariant to rotation, translation and scaling. The results show that performance is improved by using the support vector machine. A false rejection rate of 8% and a false acceptance rate of 10% were achieved on the MCYT database of 75 subjects, with 30 samples each. Deeper quad tree decomposition is expected to improve performance, as it captures finer details of variation. The continuous dynamic programming method of classification will provide piecewise comparison (Radhika et al., 2009a). Zernike moments are rotation invariant (Radhika et al., 2009b). Another off-line feature can be selected to allow the application of 3D Zernike moments, with the quad tree extended to an octree.