Iris Recognition Using Discrete Cosine Transform and Artificial Neural Networks

Problem statement: The study presented an efficient iris recognition system. Approach: The design used the discrete cosine transform for feature extraction and artificial neural networks for classification. The iris images used in this system were obtained from the CASIA database. Results: A robust system for iris recognition was developed. Conclusion: An iris recognition system that produces very low error rates was successfully designed.


INTRODUCTION
In many applications, it is important to determine the identity of a person. Conventional methods of recognizing the identity of a person by using cards or passwords are not always reliable, because these possessions can be lost, stolen, or forgotten.
Biometric technology, on the other hand, uses techniques such as Artificial Intelligence (AI) to identify features particular to an individual's body. A biometric system establishes personal identity from specific physiological or behavioral characteristics possessed by the user. The physical characteristics that biometric identification relies on must be unique to each person. Such characteristics include hand geometry, fingerprint, face, speech, DNA, retina, iris and palm vein.
Iris recognition, which has for the most part replaced retina recognition, has received increasing attention in recent years. In particular, it has formed a substantial part of the research in the field of biometric pattern recognition and machine learning. Figure 1 shows a view of the iris and other parts of the eye, obtained from [1]. Another view of the iris is shown in Fig. 2.
In this study, we develop a biometric system for iris recognition. The system uses the two-dimensional (2-D) Discrete Cosine Transform (DCT) to obtain distinctive features from an iris image. The iris image is then classified by applying an Artificial Neural Network (ANN) to the coefficients (features) extracted from the DCT (frequency) matrix.
Most of the researchers in the field of iris recognition use iris images from the following databases, which are available freely online:
• The Chinese Academy of Sciences database (CASIA) [3]
• The Bath database, produced by the University of Bath [4]

Fig. 1: The human eye [1]
Fig. 2: The iris [2]

The state of the art in iris recognition includes the following contributions: Vatsa et al. [5] proposed algorithms for iris segmentation to improve the speed of iris recognition. Ren et al. [6] worked on feature extraction techniques and applied them to the CASIA database.
Monro et al. [7] used the DCT for feature extraction in their iris coding method, working with iris images from the CASIA database (version 1) and the Bath database. Thornton et al. [8] took a Bayesian approach to the problem. Miyazawa et al. [9] worked on the problem using phase-based image matching.
Park et al. [10] worked on the problem using Support Vector Machine (SVM) and wavelet transform.
Lim et al. [11] tackled the iris recognition task using the wavelet transform and Learning Vector Quantization (LVQ). Masood et al. [12] worked on the problem using wavelets for feature extraction. Schuckers et al. [13] worked on processing and encoding rotated iris images, using Daugman's integrodifferential operator as an objective function to estimate the gaze direction of a rotated iris image. Zhenan et al. [14] worked on the classification task by combining Local Feature Based Classifiers (LFC) with an iris blob matcher.

MATERIALS AND METHODS
The algorithm proposed in this study is based on using the DCT to extract distinctive features from the iris image. These features are then applied to an ANN for classification. A block diagram of the proposed system is shown in Fig. 3.
The iris images used in this study were obtained from the CASIA database (version 2.0) [3]. The CASIA database contains 1200 iris images of 30 persons. For each person, 20 iris images were captured for the left eye and another 20 for the right eye, giving a total of 40 images per person. The original size of each image is 480×640 pixels, with 256 grey levels per pixel.
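This study is based on the iris images of the right eye only. Thus, our dataset contained 600 images. Figure 4 shows 6 sample iris images of one person's right eye, obtained from CASIA v.2. Note that the left and right irises of a given person are different from each other.
The discrete cosine transform of an N×N image f(x, y), in its standard orthonormal form, is defined by:

$$C(u,v) = \alpha(u)\,\alpha(v) \sum_{x=0}^{N-1} \sum_{y=0}^{N-1} f(x,y) \cos\left[\frac{(2x+1)u\pi}{2N}\right] \cos\left[\frac{(2y+1)v\pi}{2N}\right], \qquad u, v = 0, 1, \ldots, N-1$$

where $\alpha(0) = \sqrt{1/N}$ and $\alpha(k) = \sqrt{2/N}$ for $k = 1, \ldots, N-1$. The inverse transform is defined by:

$$f(x,y) = \sum_{u=0}^{N-1} \sum_{v=0}^{N-1} \alpha(u)\,\alpha(v)\, C(u,v) \cos\left[\frac{(2x+1)u\pi}{2N}\right] \cos\left[\frac{(2y+1)v\pi}{2N}\right], \qquad x, y = 0, 1, \ldots, N-1$$

The DCT has been used in many practical applications, especially in signal compression. For example, the compression achieved in the well-known JPEG image format is based on the DCT.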
The strong capability of the DCT to compress energy makes the DCT a good candidate for pattern recognition applications. Coupled with classification techniques such as Vector Quantization (VQ) [15] and ANN, the DCT can constitute an integral part of a successful pattern recognition system. For example, the DCT was successfully used in face recognition applications.
The DCT decomposes a signal into its elementary frequency components. When applied to an M×N image/matrix, the 2-D DCT concentrates most of the energy/information of the image in a few coefficients located in the upper-left corner of the resulting real-valued M×N DCT/frequency matrix. This is illustrated in Fig. 5, which shows an iris image (top) and its DCT (bottom). Note that the transform image has zero or low-level pixel values except at the top-left corner, where the intensities are very high. These low-frequency, high-intensity coefficients are therefore the most important coefficients in the frequency matrix and carry most of the information about the original image.
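The energy-compaction property described above can be checked numerically. The following is a minimal sketch, not the authors' implementation: it evaluates the orthonormal 2-D DCT-II directly from its definition in pure Python and reports how much of the squared magnitude falls in the upper-left corner. The 8×8 intensity ramp is a made-up stand-in for an iris image, and the names `dct_2d` and `energy_fraction` are hypothetical. For real 480×640 images a library routine would normally be used instead of this quadruple loop.

```python
import math

def dct_2d(image):
    """Orthonormal 2-D DCT-II of a rectangular list-of-lists image."""
    rows, cols = len(image), len(image[0])

    def alpha(k, n):
        # Normalisation factor that makes the transform orthonormal.
        return math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)

    out = [[0.0] * cols for _ in range(rows)]
    for u in range(rows):
        for v in range(cols):
            total = 0.0
            for x in range(rows):
                for y in range(cols):
                    total += (image[x][y]
                              * math.cos((2 * x + 1) * u * math.pi / (2 * rows))
                              * math.cos((2 * y + 1) * v * math.pi / (2 * cols)))
            out[u][v] = alpha(u, rows) * alpha(v, cols) * total
    return out

def energy_fraction(coeffs, size):
    """Share of the total squared magnitude held by the top-left size x size block."""
    total = sum(c * c for row in coeffs for c in row)
    corner = sum(coeffs[u][v] ** 2 for u in range(size) for v in range(size))
    return corner / total

# Synthetic smooth 8x8 "image" (assumption: a gentle intensity ramp, not real iris data).
image = [[100.0 + 5.0 * x + 3.0 * y for y in range(8)] for x in range(8)]
coeffs = dct_2d(image)
print("energy in dc coefficient:", energy_fraction(coeffs, 1))
print("energy in 2x2 corner:   ", energy_fraction(coeffs, 2))
```

Because the transform is orthonormal, the sum of squared coefficients equals the sum of squared pixel values, so the fractions printed here are genuine shares of the image energy.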
Two methods were used to extract features from these low-frequency DCT coefficients. The first is a square-windowing method that extracts the L × L = L² lowest-frequency coefficients in the upper-left corner of the DCT matrix, as shown in Fig. 6 (top). This windowing method exploits the fact that the DCT concentrates most of the energy/information of the signal in the dc component and the lower-frequency components. The dc coefficient (first harmonic) has the highest value and contains most of the energy. The second harmonic has the second highest value, and so on.
To illustrate the scanning scheme of the square window, let $a_{mn}$ designate the coefficient in the DCT matrix located in the $m$th row and $n$th column. Then a 1×1 window generates the vector $W_{1\times 1} = [a_{11}]$. Similarly, a 2×2 window generates the vector $W_{2\times 2} = [a_{11}\; a_{12}\; a_{21}\; a_{22}]$ and a 3×3 window produces the vector $W_{3\times 3} = [a_{11}\; a_{12}\; a_{13}\; a_{21}\; a_{22}\; a_{23}\; a_{31}\; a_{32}\; a_{33}]$.
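The row-by-row scan of the square window can be sketched in a few lines. This is an illustrative sketch only: the function name `square_window` and the toy matrix are hypothetical, and each toy entry simply encodes its own one-based row and column position so that the scan order is visible in the output.

```python
def square_window(dct_matrix, size):
    """Return the size*size upper-left coefficients, scanned row by row."""
    return [dct_matrix[m][n] for m in range(size) for n in range(size)]

# Toy 4x4 "DCT matrix": the entry in row m, column n (one-based) is 10*m + n,
# so 21 stands for the coefficient in row 2, column 1.
toy = [[10 * m + n for n in range(1, 5)] for m in range(1, 5)]

print(square_window(toy, 1))  # [11]
print(square_window(toy, 2))  # [11, 12, 21, 22]
print(square_window(toy, 3))  # [11, 12, 13, 21, 22, 23, 31, 32, 33]
```

With a 7×7 window the same function yields a 49-element feature vector.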
The second, alternative method is a zig-zag method, as depicted in Fig. 6 (bottom). Here the coefficients are scanned more selectively, depending on their magnitudes.
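The exact traversal of Fig. 6 is not reproduced in the text, so the following sketch assumes the conventional JPEG-style zig-zag, which walks the anti-diagonals of the matrix starting from the dc coefficient and alternates direction on each diagonal. The names `zigzag_indices` and `zigzag_features` and the toy matrix are hypothetical.

```python
def zigzag_indices(rows, cols):
    """(row, column) pairs in JPEG-style zig-zag order, starting at the top-left."""
    order = []
    for s in range(rows + cols - 1):
        # All positions on the anti-diagonal where row + column == s, top to bottom.
        diagonal = [(i, s - i) for i in range(rows) if 0 <= s - i < cols]
        if s % 2 == 0:
            diagonal.reverse()  # even diagonals are walked from bottom-left to top-right
        order.extend(diagonal)
    return order

def zigzag_features(dct_matrix, count):
    """First `count` coefficients of the matrix in zig-zag order."""
    rows, cols = len(dct_matrix), len(dct_matrix[0])
    return [dct_matrix[i][j] for i, j in zigzag_indices(rows, cols)[:count]]

# Toy 4x4 matrix whose entries encode their one-based (row, column) position.
toy = [[10 * m + n for n in range(1, 5)] for m in range(1, 5)]
print(zigzag_features(toy, 6))  # [11, 12, 21, 31, 22, 13]
```

Compared with a square window of the same length, this ordering reaches coefficients along the first row and first column earlier and skips the far corner of the window.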
To classify the feature vectors obtained from the DCT coefficients, we employ ANNs in this study. ANNs were introduced by McCulloch and Pitts in 1943 [16]. ANNs are trainable algorithms that can "learn" to solve complex problems from training data consisting of pairs of inputs and desired outputs (targets). They can be trained to perform specific tasks such as prediction and classification. ANNs have been applied successfully in many fields, including pattern recognition, image processing and adaptive control.
ANNs offer enormous flexibility in their design. The design parameters include the number of layers, the number of neurons in each layer, the type of transfer function (log-sigmoid, hard-limit, linear) and the learning rule (back-propagation, Adaline). The ANNs examined in this study had the following specifications:
• Multilayer perceptron architecture
• Back-propagation was used as the learning algorithm
• Log-sigmoid functions were used as the transfer functions for the output layer
• The output layer contained a constant number of 30 neurons, which corresponds to the number of individuals/irises to be classified
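To make these specifications concrete, the sketch below runs one forward pass through a multilayer perceptron with 49 inputs and layers of 20, 5 and 30 neurons, the baseline structure examined in the Results. It is a sketch under stated assumptions, not the trained network of this study: the weights are random and untrained, the hidden layers are assumed to use the log-sigmoid function as well (the text specifies it only for the output layer), and the names `logsig`, `make_layer`, `forward` and `predict` are hypothetical. Back-propagation training is omitted.

```python
import math
import random

def logsig(x):
    """Log-sigmoid transfer function, mapping any real input into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))

def make_layer(n_in, n_out, rng):
    """Random (untrained) weights and biases for one fully connected layer."""
    weights = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_out)]
    biases = [rng.uniform(-0.5, 0.5) for _ in range(n_out)]
    return weights, biases

def forward(layers, features):
    """Propagate a feature vector through every layer and return the outputs."""
    activation = features
    for weights, biases in layers:
        activation = [logsig(sum(w * a for w, a in zip(row, activation)) + b)
                      for row, b in zip(weights, biases)]
    return activation

def predict(layers, features):
    """Index (0..29) of the output neuron with the largest activation."""
    outputs = forward(layers, features)
    return max(range(len(outputs)), key=outputs.__getitem__)

rng = random.Random(0)
sizes = [49, 20, 5, 30]  # 49 DCT features, two hidden layers, 30 output classes
layers = [make_layer(n_in, n_out, rng) for n_in, n_out in zip(sizes, sizes[1:])]
features = [rng.uniform(-1.0, 1.0) for _ in range(49)]  # placeholder feature vector
outputs = forward(layers, features)
print(len(outputs), predict(layers, features))
```

Each of the 30 output neurons corresponds to one individual, and the predicted identity is the neuron with the largest activation.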

RESULTS
Here we investigate the optimum number of DCT coefficients and optimum ANN structure (number of layers and number of neurons in each layer). Figure 7 shows the error rate as a function of the number of DCT coefficients used. The ANN structure was a three-layer network with the number of neurons in the first, second and output layers equal to 20, 5 and 30, respectively. It is clear from Fig. 7 that as the number of DCT coefficients/features used increases, the error rate decreases or levels off.
The minimum error rate of 4% (96% success rate) occurs, however, when the number of coefficients used is 49, which corresponds to a window of 7×7 coefficients.
Next, using the same 3-layer ANN structure with the same number of DCT coefficients, we search for the optimum number of neurons in the first and second layers. Figure 8 shows the error rate when the number of coefficients used is 49, with varying number of neurons in the first and second layers.
It is clear from Fig. 8 that for the three-layer case using 49 DCT coefficients, the minimum error rate occurs when the numbers of neurons in the first and second layers are 15 and 20, respectively.
Next, we investigate how the error rate changes as a function of the number of epochs, as shown in Fig. 9. Again, the same structure was used with 49 DCT coefficients.
It is clear from Fig. 9 that as the number of epochs increases, the error rate decreases and then levels off at 74 epochs, at which point the network is sufficiently trained. If the network is forced to train for more epochs, over-learning occurs and the error rate starts to increase with the number of epochs. With too many epochs, the network tends to memorize the training data instead of discovering the underlying features, and consequently fails to classify new input data.
Last, we investigate the performance of the system when only 2 layers are used. Figure 10 shows the success rate for the 2-layer ANN architecture using 49 DCT coefficients, with 30 neurons in the output layer and a varying number of neurons in the first layer.
It is clear from Fig. 10 that for the 2-layer structure, the maximum success rate that can be achieved by varying the number of neurons in the first layer is 90%. Consequently, the system performance degrades when only 2 layers are used, producing a success rate below the 96% achieved by the 3-layer structure.

DISCUSSION
An iris recognition method based on the DCT and ANN is presented. First, the dimensionality of the original iris image is reduced by using the DCT. Next, the upper-left corner of the DCT matrix is scanned using two different scanning schemes. The resulting truncated DCT coefficients are used as representative features of the iris image. Then, the feature vector is applied to an ANN for classification.
In the simulations, 600 iris images were used in the training and testing phases. The system was trained with 400 images. The system was then tested with 300 images; 100 images from the training data and 200 new images that were not present in the training data.
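The success and error rates quoted in the Results are simple ratios over the test images. The sketch below shows that bookkeeping on made-up labels; the function name `success_rate` and the label lists are hypothetical. It assumes, purely for illustration, that the 96% figure refers to the full 300-image test set, in which case it would correspond to 12 misclassified images.

```python
def success_rate(predicted, actual):
    """Percentage of test images whose predicted identity matches the true one."""
    if len(predicted) != len(actual):
        raise ValueError("label lists must have the same length")
    correct = sum(p == a for p, a in zip(predicted, actual))
    return 100.0 * correct / len(actual)

# Made-up labels for 300 test images of 30 individuals (not the study's data).
actual = [i % 30 for i in range(300)]
predicted = list(actual)
for i in range(12):                      # deliberately misclassify 12 images
    predicted[i] = (actual[i] + 1) % 30

rate = success_rate(predicted, actual)
print("success rate:", rate, "error rate:", 100.0 - rate)
```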

CONCLUSION
In this study, a robust iris recognition system is presented. The system is based on the 2-D DCT and ANNs.
ANNs have many parameters and the DCT feature vector can have a variable size of coefficients. In this study, we investigated the optimum ANN structure and the optimum size of the DCT vector for the problem of iris recognition.
Simulation results showed that the best performance occurs when the ANN is a 3-layer structure using 49 DCT coefficients and trained for 74 epochs.
Experimental tests on the CASIA database achieved a recognition accuracy of 96% using only 49 DCT coefficients, at low computational cost.