A Vehicle License Plate Detection and Recognition System

Problem statement: Automatic vehicle license plate detection and recognition is a key technique in most traffic-related applications and an active research topic in the image processing domain. Different methods, techniques and algorithms have been developed for license plate detection and recognition. Approach: Because license plate characteristics such as the numbering system, colors, language of characters, style (font) and plate size vary from country to country, further research is still needed in this area. Results: Most Middle East countries use a combination of Arabic and English letters plus the country's logo. This makes localizing the plate number, differentiating between Arabic and English letters and the logo object, and finally recognizing those characters a more challenging research task. Artificial neural networks have proved beneficial for plate recognition but, to the best of the authors' knowledge, have not been applied to plate detection. In this study, a Radial Basis Function (RBF) neural network is used for both the detection and the recognition of Saudi Arabian license plates. Conclusion/Recommendations: The proposed approach was tested on 200 front images of national license plates of Saudi Arabia. A high accuracy was obtained, which shows the significance of this approach. The study could be extended to other Middle East countries.


INTRODUCTION
Over the last few decades, Vehicle License Plate Recognition (VLPR) has been a popular and active research topic in the image processing domain. With constantly increasing road traffic, there is a need for intelligent traffic management systems that not only detect and track a vehicle but also identify it. Real-time license plate recognition is important in automatic traffic monitoring and traffic law enforcement; however, the area is very challenging (Sarfraz et al., 2003). License Plate (LP) recognition helps identify vehicles entering secure premises. Thus, license plate recognition is urgently needed in countries where security issues are critical.
As LP detection and recognition are two separate processes, research on them has generally been performed separately. Different methods, techniques and algorithms have been developed and applied for these two processes. Concepts previously developed in image processing, or borrowed from other domains, have also been applied to improve accuracy; however, there is still room for improvement. With the ever-declining cost of hardware, the increasing speed of computing and the ubiquity of embedded devices, there is always a need for new solutions. Furthermore, each country has its own LP numbering system, colors, language of characters, style (font) and sizes. Even within the same country, license plates differ from state to state and by plate type.
Although some research has been performed on LP detection and recognition, this work differs from previous work for a number of reasons. Most Middle East countries use a combination of Arabic and English characters in their license plate numbers, as shown in Fig. 1.
This study concerns the automatic detection and recognition of license plates for Saudi Arabian vehicles. For the detection of the license plate, edge detection and basic morphological tools were used. To the best of the authors' knowledge, the Radial Basis Function (RBF) Neural Network (NN) has previously been used only in the recognition process; the novelty of this work is that RBF is used for both detection and recognition. Connected Component Analysis was used for character segmentation, while the recognition process was based on selected extracted features.
Radial basis function: The Radial Basis Function (RBF) network is a Neural Network (NN) approach that allows the design to be viewed as a curve-fitting problem. The basic form of an RBF NN comprises three layers: an input layer of source nodes connected to the environment, a hidden layer and an output layer with linear nodes (Wang, 2009). The nodes of the hidden layer represent clusters in the input space. Hidden units are known as radial centers and are represented by vectors of the same dimension as the input. The closer the input is to a radial center, the larger that unit's output, and vice versa. The output layer supplies the response of the NN (Garcia et al., 2010).
The main benefit of RBFs over binary features is that RBFs produce approximating functions that vary smoothly and are differentiable. Moreover, some learning techniques for RBF NNs also modify the centres and widths of the basis functions. These nonlinear methods may fit the target function more easily (Wang, 2009). The transformation from the input space to the hidden layer is nonlinear, while the transformation from the hidden layer to the output layer is linear (Garcia et al., 2010). Thus, an RBF NN is a mapping function that maps a non-linearly separable space to a linearly separable space. Because of these benefits, RBF was used in this research not only for recognition but also for detection.
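The mapping from a non-linearly separable space to a linearly separable one can be illustrated with the classic XOR problem. The sketch below is a minimal illustration, not the network used in this study: the Gaussian basis function, the two hand-picked centers, the width and the simple least-mean-squares update for the linear output layer are all assumptions made for the example.

```python
import math

def rbf_hidden(x, centers, width):
    """Gaussian hidden-layer activations; each is largest when x sits on its center."""
    return [math.exp(-sum((xi - ci) ** 2 for xi, ci in zip(x, c)) / (2.0 * width ** 2))
            for c in centers]

def rbf_output(x, centers, width, weights, bias):
    """Linear output node applied to the nonlinear hidden activations."""
    hidden = rbf_hidden(x, centers, width)
    return sum(w * h for w, h in zip(weights, hidden)) + bias

def train_output_layer(samples, targets, centers, width, epochs=5000, lr=0.5):
    """Fit only the linear output weights with a per-sample LMS (delta rule) update.
    Centers and width stay fixed here (an assumption of this sketch)."""
    weights, bias = [0.0] * len(centers), 0.0
    for _ in range(epochs):
        for x, t in zip(samples, targets):
            hidden = rbf_hidden(x, centers, width)
            err = t - (sum(w * h for w, h in zip(weights, hidden)) + bias)
            weights = [w + lr * err * h for w, h in zip(weights, hidden)]
            bias += lr * err
    return weights, bias

# XOR is not linearly separable in the input space.
samples = [(0, 0), (0, 1), (1, 0), (1, 1)]
targets = [0, 1, 1, 0]
centers = [(0, 0), (1, 1)]      # hypothetical radial centers
width = 1 / math.sqrt(2)        # hypothetical width, so that 2 * width**2 == 1

weights, bias = train_output_layer(samples, targets, centers, width)
predictions = [round(rbf_output(x, centers, width, weights, bias)) for x in samples]
print(predictions)
```

After the Gaussian hidden layer, the two inputs with target 1 map to the same point in the hidden space, so a single linear output node is enough to separate the classes; the printed predictions should reproduce the XOR targets.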

MATERIALS AND METHODS
The research design follows the main phases of a VLPR process: Image Acquisition, Image Pre-processing, LP Detection, Character Segmentation and Character Recognition. The complete block diagram of the proposed method is shown in Fig. 2.
Preprocessing: In the pre-processing stage, the main goal is to prepare the image for LP detection using the proposed RBF network. Figure 3 shows the sequence of operations in the proposed pre-processing pipeline: (a) converting the gray-scale image into binary form; (b) performing edge detection using the Sobel mask operator; (c) performing a morphological operation using dilation; (d) filling the interior gaps with a 'flood fill' algorithm in order to obtain a closed shape; (e) applying the filtering task, whose main purpose is to delete all the connected objects and remove any connected borders associated with the highlighted license plate area, so that the resulting image is a highlighted closed shape of the license plate area; (f) removing noise with an image smoothing technique to obtain a clearer license plate; and finally (g) mapping the outline of the selected area from the previous step onto the original gray-scale image in order to draw an outline around this area.
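Steps (b) to (d) of the pipeline above can be sketched on a small synthetic image. This is a minimal standard-library illustration under stated assumptions: the gradient threshold, the 3×3 structuring element, the 4-connected flood fill and the function names are choices made for the example and are not taken from the study.

```python
from collections import deque

def sobel_edges(gray, thresh):
    """Binary edge map: 1 where the Sobel gradient magnitude |gx| + |gy| exceeds thresh."""
    h, w = len(gray), len(gray[0])
    edges = [[0] * w for _ in range(h)]
    for r in range(1, h - 1):
        for c in range(1, w - 1):
            gx = (gray[r - 1][c + 1] + 2 * gray[r][c + 1] + gray[r + 1][c + 1]
                  - gray[r - 1][c - 1] - 2 * gray[r][c - 1] - gray[r + 1][c - 1])
            gy = (gray[r + 1][c - 1] + 2 * gray[r + 1][c] + gray[r + 1][c + 1]
                  - gray[r - 1][c - 1] - 2 * gray[r - 1][c] - gray[r - 1][c + 1])
            if abs(gx) + abs(gy) > thresh:
                edges[r][c] = 1
    return edges

def dilate(mask):
    """Binary dilation with a 3x3 square structuring element."""
    h, w = len(mask), len(mask[0])
    out = [[0] * w for _ in range(h)]
    for r in range(h):
        for c in range(w):
            if mask[r][c]:
                for rr in range(max(r - 1, 0), min(r + 2, h)):
                    for cc in range(max(c - 1, 0), min(c + 2, w)):
                        out[rr][cc] = 1
    return out

def fill_holes(mask):
    """Flood-fill the background from the image border; unreached background is a hole."""
    h, w = len(mask), len(mask[0])
    reached = [[False] * w for _ in range(h)]
    queue = deque((r, c) for r in range(h) for c in range(w)
                  if (r in (0, h - 1) or c in (0, w - 1)) and not mask[r][c])
    for r, c in queue:
        reached[r][c] = True
    while queue:
        r, c = queue.popleft()
        for rr, cc in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            if 0 <= rr < h and 0 <= cc < w and not mask[rr][cc] and not reached[rr][cc]:
                reached[rr][cc] = True
                queue.append((rr, cc))
    return [[1 if mask[r][c] or not reached[r][c] else 0 for c in range(w)]
            for r in range(h)]

# Synthetic 12x16 image with one bright plate-like rectangle (hypothetical data).
gray = [[200 if 3 <= r <= 8 and 4 <= c <= 11 else 0 for c in range(16)]
        for r in range(12)]
edges = sobel_edges(gray, thresh=100)   # (b) edge ring around the rectangle
dilated = dilate(edges)                 # (c) thicken and close the ring
plate_mask = fill_holes(dilated)        # (d) closed, filled candidate shape
```

The edge detector leaves an empty interior because the gradient is zero inside a uniform plate; the flood fill from the border is what turns the ring into the closed white shape that later stages treat as a candidate region.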

Fig. 5: LP extraction
License plate detection: The license plate detection phase starts with the pre-processed image described in the preprocessing section above. A threshold is then defined as a pair of minimum and maximum values in order to retain the LP only and remove very small or very large objects that fall outside the threshold range. The objects that pass this threshold criterion are forwarded to the training process. During training, these objects were manually classified as "Plate" (denoted as "1") or "No Plate" (denoted as "0"), depending on the shape of the identified white area. The dataset was prepared by resizing all the identified white areas to a fixed size of 25×100 so that the input size of the RBF is uniform. Based on this dimension (25×100), the total number of neurons used as input to the RBF was 2500, while the RBF provided only 2 neurons (plate and not plate) as output. The training data included both "plate" and "not plate" patterns.
The training process is shown in Fig. 4. In the generalized RBF NN, training means learning the weights and the number of radial basis functions together with their parameters (Gonzalez and Woods, 2008). In this research work, two RBF NNs are used for learning along both the width and the length of the image, as shown in Fig. 4b and 4c.
Before a new image is tested with the RBF NN, it has to undergo the same pre-processing steps described in the preprocessing section and the same threshold-based object detection process. The identified objects of the test image are then sent to the RBF NN. Based on the trained data set, the RBF NN outputs the location of the LP in the new image.
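The size filtering and the fixed 25×100 input described above can be sketched as follows. This is an illustrative sketch only: the threshold is assumed to be an area in pixels, the minimum and maximum values are hypothetical, and nearest-neighbour resizing is an assumption, since the study does not state which measure or interpolation was used.

```python
def filter_by_area(boxes, min_area, max_area):
    """Keep candidate boxes (row_start, col_start, row_end, col_end) whose pixel
    area lies within [min_area, max_area]; both limits are hypothetical."""
    kept = []
    for r0, c0, r1, c1 in boxes:
        area = (r1 - r0 + 1) * (c1 - c0 + 1)
        if min_area <= area <= max_area:
            kept.append((r0, c0, r1, c1))
    return kept

def resize_nearest(mask, out_h=25, out_w=100):
    """Nearest-neighbour resize of a binary region to the fixed 25x100 input size."""
    in_h, in_w = len(mask), len(mask[0])
    return [[mask[r * in_h // out_h][c * in_w // out_w] for c in range(out_w)]
            for r in range(out_h)]

def to_input_vector(mask):
    """Flatten the resized region row by row into the 2500 RBF input values."""
    return [float(p) for row in resize_nearest(mask) for p in row]

# Hypothetical candidates: a speck, a plate-sized region and a very large object.
candidates = [(0, 0, 2, 2), (10, 10, 39, 129), (0, 0, 399, 599)]
kept = filter_by_area(candidates, min_area=500, max_area=20000)

# A solid 30x120 white region standing in for a detected plate shape.
region = [[1] * 120 for _ in range(30)]
vector = to_input_vector(region)
```

Each vector produced this way would be paired with its manual label (1 for "Plate", 0 for "No Plate") to form one training pattern.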

License plate extraction: In this study, the license plate is extracted from the image in which it was detected in the previous step by specifying its top-left and bottom-right corners. The border around the LP is then removed so that the characters written on it can be recognized in the next step. The (x, y) coordinates of the top-left and bottom-right corners of the LP are used to crop the LP from the whole input image (Fig. 5).
Character segmentation: Character segmentation is the procedure of extracting the characters and numbers from the license plate image. Several factors complicate the character segmentation task, such as image noise, the plate frame, space marks, plate rotation and light variance. A number of procedures have been proposed for character segmentation to overcome these problems. After the plate borders are removed in the previous step, this step starts by removing noise from the plate. The approach used in this work for character segmentation is based on thresholding and Connected Component Analysis (CCA). In binary image processing, CCA is an important technique that scans a binarized image and labels its pixels into components based on pixel connectivity. Each pixel is labeled with a value that depends on the component to which it is assigned. The connected components are then analyzed to filter out overly long and wide components, leaving only the components that satisfy the defined values. Finally, the height-to-width ratio is used to separate the English numbers from the Arabic word. This word is treated as a single unit and is not separated into individual letters, because it is written on every Saudi LP.
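The labelling step of CCA described above can be sketched with a breadth-first search over 8-connected neighbours. This is a minimal standard-library sketch; the dictionary keys reuse the component measurements named in this paper, while the function names and the size limits passed to the filter are hypothetical choices for the example.

```python
from collections import deque

def label_components(binary):
    """Label 8-connected foreground components and record each bounding box."""
    h, w = len(binary), len(binary[0])
    labels = [[0] * w for _ in range(h)]
    boxes = {}
    next_label = 0
    for r in range(h):
        for c in range(w):
            if not binary[r][c] or labels[r][c]:
                continue
            next_label += 1
            labels[r][c] = next_label
            queue = deque([(r, c)])
            r0 = r1 = r
            c0 = c1 = c
            while queue:
                y, x = queue.popleft()
                r0, r1 = min(r0, y), max(r1, y)
                c0, c1 = min(c0, x), max(c1, x)
                for yy in range(max(y - 1, 0), min(y + 2, h)):
                    for xx in range(max(x - 1, 0), min(x + 2, w)):
                        if binary[yy][xx] and not labels[yy][xx]:
                            labels[yy][xx] = next_label
                            queue.append((yy, xx))
            boxes[next_label] = {
                "ROW_START": r0, "ROW_END": r1, "COL_START": c0, "COL_END": c1,
                "CH_SIZE": r1 - r0 + 1, "CW_SIZE": c1 - c0 + 1,
            }
    return labels, boxes

def keep_large(boxes, min_w, min_h):
    """Return labels of components wider than min_w and taller than min_h
    (the limits are illustrative, not the rule of Fig. 8)."""
    return [k for k, b in boxes.items()
            if b["CW_SIZE"] > min_w and b["CH_SIZE"] > min_h]

# Two digit-sized blobs plus one isolated noise pixel (hypothetical data).
binary = [[0] * 13 for _ in range(8)]
for r in range(1, 7):
    for c in list(range(1, 5)) + list(range(8, 12)):
        binary[r][c] = 1
binary[0][6] = 1

labels, boxes = label_components(binary)
digits = keep_large(boxes, min_w=3, min_h=5)
```

Because the image is scanned row by row, the noise pixel in the top row receives label 1 and the two blobs receive labels 2 and 3; only the blobs survive the size filter.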
The following steps are performed on the LP extracted in the previous phase:
• Otsu's thresholding method (Otsu, 1979) is applied to produce a binary image (Fig. 6) for further processing.
• Some noise is present in the binary image. A median filter (Lim, 1990) of size 3×3 is used to remove it.
• The numbers in the LP should not touch the border. Thus, any pixels connected to the border are removed using morphological reconstruction. The marker image is zero everywhere except along the border, connectivity is based on an 8-connected neighborhood, and the non-zero pixels of the reconstructed output are removed.
• Connected component analysis is performed to label the components. In this step, each number and the Arabic word on the LP should receive a unique label. The boundary of each component is then traced to determine its COL_START, COL_END, ROW_START, ROW_END, height (CH_SIZE) and width (CW_SIZE) (Fig. 7).
• A simple algorithm decides whether a component is a number, a strip or the Arabic word. This algorithm is presented in Fig. 8. The rule is derived from an experimental study in which a number always has a size greater than 3×5 and the position of the Arabic word is consistent across Saudi Arabian LPs. The final output of segmentation is each component indicated by a bounding box (Fig. 9).
Character recognition: After the segmentation of elements (characters and numbers), the final module in the license plate recognition process is character recognition. Although many techniques exist for character recognition, such as statistical, syntactic and neural network methods, in this research character recognition is performed using feature extraction. Our approach to character recognition using feature extraction is based on (Wang, 2009), in which an image is divided into a sequence of horizontal "scan lines" by raster scanning.
Each scan line may consist of connected pixels, and a number is represented in a matrix of size 4×2, as shown in Fig. 10.
The presence of a pixel is represented as 1 and its absence as 0, thus forming a feature. This feature is used to train the RBF NN, which later, in testing, helps to recognize the characters of a new LP. The features of the above characters are shown in Fig. 11.
In addition to the above feature, this study uses two further features: the ratio of size and the ratio of foreground to background pixels. These two features help to identify non-number images such as the strip and the Arabic word, because the strip and the Arabic word differ significantly from the numbers on these two features. Thus, this study uses a total of 10 features: 8 pixel-existence features and 2 ratio features.
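A 10-element feature vector of this kind can be sketched as below. The details are assumptions made for illustration: the component is split into an even grid of 4 rows by 2 columns, "ratio of size" is taken to be height divided by width, and the function name is hypothetical. The study's exact definitions (Fig. 10 and Fig. 11) may differ.

```python
def extract_features(glyph):
    """Return 8 pixel-existence features from a 4x2 grid plus 2 ratio features.
    glyph is a binary image (list of rows) at least 4 pixels high and 2 wide."""
    h, w = len(glyph), len(glyph[0])
    features = []
    for gr in range(4):
        for gc in range(2):
            rows = range(gr * h // 4, (gr + 1) * h // 4)
            cols = range(gc * w // 2, (gc + 1) * w // 2)
            present = any(glyph[r][c] for r in rows for c in cols)
            features.append(1.0 if present else 0.0)
    foreground = sum(map(sum, glyph))
    background = h * w - foreground
    features.append(h / w)                          # assumed "ratio of size"
    features.append(foreground / max(background, 1))  # guard against an all-white glyph
    return features

# An 8x4 component whose pixels fill only the right half, like a thin digit.
glyph = [[0, 0, 1, 1] for _ in range(8)]
feature_vector = extract_features(glyph)
```

For this example every left-hand cell is empty and every right-hand cell is occupied, and the two ratios are 2.0 (height over width) and 1.0 (equal foreground and background). A wide, short strip would give a very different size ratio, which is the property the two ratio features are meant to exploit.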

RESULTS AND DISCUSSION
This section presents the results of the experiments carried out using the methodology defined above.
The dataset for training and testing: All the images were taken outdoors at different times of the day, so they have different illumination, but all pictures were taken in daylight. Only pictures taken from the front of a vehicle were included in the data sets. The resolution of the digital camera used was 2.0 megapixels. All the pictures were stored in JPEG format. The images were of almost the same size, although image size is not a concern for the method.
Figure 12a shows a sample of the output of LP detection, plate extraction, character/digit segmentation and finally character recognition. After segmentation, the bounding boxes are extracted one by one. The proposed feature extraction is then performed to extract the patterns. In this example, 17 components are segmented, as presented in Fig. 12b. Each component has 10 features. These feature vectors are fed to the RBF NN in order to recognize the pattern. The RBF NN, already trained with the same features, recognizes the patterns for numbers and Arabic letters. The "-" denotes a non-digit or non-character object. The recognition accuracy is presented in Table 1 and the recognition result is depicted in Fig. 12c. The performance of the proposed method cannot be benchmarked against the closest study, presented by Soille (2003), or against other researchers, since no benchmark dataset of Saudi Arabian car images is available for this task. Most researchers work on their own datasets for license plate systems that vary from country to country. Although Soille (2003) reported 95% recognition accuracy for 610 vehicles, a direct comparison with our results remains arguable, because many more parameters have yet to be considered, such as the nature of the car images, including illumination, orientation and background colour.
Performance can be benchmarked only when the same dataset is used. As mentioned earlier, RBF neural networks have been applied by other researchers (Li et al., 2008; Shan, 2010) in different problem domains. To the best of our knowledge, no RBF NN has been tried on plate detection as in our proposed method. In our proposed method, RBF was applied successfully through a novel, systematically developed set of preprocessing techniques that produce "plate" and "no plate" objects as input to the RBF architecture. In the subsequent part, i.e. character recognition, the same RBF model is reused to recognize the segmented characters.

CONCLUSION
A great deal of research has been performed on the detection and recognition of license plates. Different researchers have provided different methods and techniques for this process; however, every technique has its own advantages and disadvantages. Furthermore, each country has its own license plate numbering system, colors, language of characters, style (font) and sizes. Even within the same country, license plates differ from state to state and by plate type.
However, Saudi Arabian license plate detection and recognition has not received the required attention in the literature, mainly because of the distinct style of Saudi license plates. Although some research has been performed on Arabic LP detection and recognition, this research differs from previous work because Saudi Arabian license plates use English numbers together with Arabic characters and because of the complex font of the text on the plate. Saudi Arabian vehicle plate identification is a critical issue due to its unique nature.
The proposed approach was tested on 200 front images of national license plates of Saudi Arabia. A high accuracy was obtained, which shows the significance of this approach. The study could be extended to other Middle East countries.