AUTOMATIC TEXT EXTRACTION FROM COMPLEX COLORED IMAGES USING GAMMA CORRECTION METHOD

The aim of this study is to propose a new methodology for extracting text regions and removing non text regions from colored images with complex backgrounds. The approach is based on Gamma correction: a gamma value is determined that enhances the foreground details of the image. The approach also uses gray level co-occurrence matrices, texture measures and thresholding. The proposed method is a useful preprocessing technique that removes the non text region and reveals the text region in the image. Experiments were conducted on various images from the datasets collected and tagged by the ICDAR robust reading dataset collection team. Experimental results show that the proposed method performs well in extracting text regions from an image.


INTRODUCTION
Text extraction from images is concerned with extracting the relevant text data from a collection of images. The rapid development of digital technology has resulted in the digitization of all categories of materials, and many resources are now available in electronic form. Many existing paper-based collections, historical manuscripts, books, journals, scanned documents, video images, maps, posters, broadsides, newspapers, micro facsimiles, microfilms, university archives, book plates, graphic materials, coins, currency, stamps, business cards, advertisements and web pages are converted to images, and these images present many challenging research issues in text extraction and recognition. Text extraction from images has many useful applications: document analysis, detection of vehicle license plates, analysis of articles with tables, maps, charts and diagrams, keyword based image search, identification of parts in industrial automation, content based retrieval, object identification, street sign reading, text based video indexing, page segmentation, document retrieval and address block location.
Due to the growing requirement for information, much research has been done on text extraction from images, and several techniques have been developed. The existing methods are based on morphological operators, wavelet transforms, artificial neural networks, skeletonization, edge detection and histogram techniques. All these techniques have their benefits and restrictions. The method of Strouthopoulos et al. (2002) was based on an Adaptive Color Reduction (ACR) technique that combines a Principal Component Analyzer (PCA) and a Self-Organized Feature Map (SOFM) to achieve color reduction. Zhan et al. (2006) proposed an algorithm that uses multiscale wavelet features and structural information to locate candidate text lines; an SVM classifier was then used to identify true text among the candidate text lines. Kumar et al. (2007) proposed a scheme for the extraction of textual areas from an image using Globally Matched Wavelet (GMW) filters with Fisher classifiers. To improve the result, Markov Random Field (MRF) based post processing was applied.

JCS
Liu and Sarkar (2008) proposed an algorithm that employs two novel filters and a basic component-based text detection framework. The framework uses the Niblack algorithm to threshold images and groups components into regions with commonly used geometry features. Pan et al. (2009) proposed a novel hybrid method that uses a Conditional Random Field (CRF), Minimum Classification Error (MCE) learning and a graph cuts inference algorithm. Babu et al. (2010) proposed a new text extraction algorithm for heterogeneous text/graphics document images that is insensitive to noise, skew, text orientation, color or intensity and layout. Zhang and Kasturi (2010) proposed a new unsupervised text detection approach based on the Histogram of Oriented Gradients and the Graph Spectrum. Zhang and Kasturi (2011) proposed a new text extraction approach based on character and link energies, obtained by analyzing the properties of single characters and text objects.
Sumathi et al. (2012) discuss in detail the various existing schemes for extracting text from an image.

GAMMA CORRECTION
In a CRT monitor, the voltage V driving the electron flow is related to the intensity I of the light emitted by the phosphor hit by the electrons according to the formula $I \approx V^{\gamma}$, where γ is a constant that depends on the phosphor (Jayaraman et al., 2011). This non linearity causes serious distortions in the colors of the displayed image, with some areas being too dark while others are saturated. To avoid this problem, an amplitude filter is required that corrects the intensity of each pixel before the information is passed to the CRT; this filter is called gamma correction. The correction is given by $V \approx I^{1/\gamma}$. Gamma correction controls the overall brightness of an image. When γ is less than one, the transformed image becomes lighter than the original image; when γ is greater than one, the transformed image becomes darker than the original image. The gamma curve is shown in Fig. 1.
To enhance the image, knowledge of gamma is required. A proper estimation of gamma value enhances the contrast of the image. This correction must be applied to the pixel intensities before the signal is converted into voltage in the CRT.
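The brightening and darkening behavior described above can be illustrated with a minimal sketch. It assumes 8-bit intensities and the power-law mapping s = 255 (r/255)^γ, which matches the stated behavior (γ < 1 lightens, γ > 1 darkens); the function name `gamma_correct` and the nested-list image representation are illustrative choices, not taken from the paper.

```python
def gamma_correct(pixels, gamma, max_value=255):
    """Apply s = max_value * (r / max_value) ** gamma to every pixel.

    Assumption: `pixels` is a list of rows of integer intensities
    in the range 0..max_value.
    """
    if gamma <= 0:
        raise ValueError("gamma must be positive")
    return [
        [round(max_value * (p / max_value) ** gamma) for p in row]
        for row in pixels
    ]


image = [[0, 64, 128, 255]]
lighter = gamma_correct(image, 0.5)  # gamma < 1: mid-tones move up
darker = gamma_correct(image, 2.0)   # gamma > 1: mid-tones move down
print(lighter, darker)
```

Pure black and pure white are left unchanged by the mapping; only the intermediate intensities are redistributed.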

TEXTURAL ANALYSIS
Texture analysis refers to the characterization of regions in an image by their texture content. Texture analysis attempts to quantify intuitive qualities described by terms such as rough, smooth, silky, or bumpy as a function of the spatial variation in pixel intensities.
Texture is one of the significant characteristics used in identifying objects or regions of interest in an image. Texture contains important information about the structural arrangement of surfaces and can be defined as a regular repetition of an element or pattern on a surface. First order texture measures, such as Mean, Variance, Skewness and Kurtosis, are statistics calculated from the original image values and do not consider pixel neighborhood relationships. Spatial gray level co-occurrence estimates image properties related to second-order statistics, which consider the relationship among pixels or groups of pixels (Srinivasan and Shobha, 2008). Haralick et al. (1973) suggested the use of Gray Level Co-occurrence Matrices (GLCM), often called the gray tone spatial dependence matrix or Gray Level Dependency Matrix, for extracting second order texture information from an image. The GLCM functions characterize the texture of an image by calculating how often pairs of pixels with specific values and in a specified spatial relationship occur in the image, creating a GLCM and then extracting statistical measures from this matrix.

Gray-Level Co-Occurrence Matrix
GLCM is defined as "A two dimensional histogram of gray levels for a pair of pixels, which are separated by a fixed spatial relationship". GLCM of an image is computed using a displacement vector, defined by its radius δ and orientation θ.

Radius δ
A large displacement δ would yield a GLCM that does not capture detailed textural information. A pixel is more likely to be correlated with closely located pixels than with pixels located far away, so δ = 1 and δ = 2 give the best classification accuracy. A displacement value equal to the size of the texture element improves classification.

Angle θ
Every pixel has eight neighboring pixels, allowing eight choices for θ: 0, 45, 90, 135, 180, 225, 270 or 315°. However, taking into consideration the definition of the GLCM, the co-occurring pairs obtained by choosing θ equal to 0° would be similar to those obtained by choosing θ equal to 180°. The same holds for 45 and 225°, 90 and 270°, and 135 and 315°. Hence, one has four choices for the value of θ: 0, 45, 90 and 135°.

Quantized Gray Levels (G)
The dimension of a GLCM is determined by the maximum gray value of the pixels. The number of gray levels is an important factor in GLCM computation: more levels mean more accurate textural information, at an increased computational cost.
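The roles of δ, θ and G can be made concrete with a small sketch. It counts pixel pairs for one displacement vector and then computes Contrast and Energy using the commonly used definitions (Contrast as the sum of (i - j)^2 p(i, j), Energy as the sum of p(i, j)^2); these definitions, the function names and the 4 x 4 test image are assumptions made for illustration. The sketch counts pairs in one direction only; a symmetric GLCM would add the transposed counts as well.

```python
# Displacement (dx, dy) for delta = 1 at each angle; row index grows downward.
OFFSETS = {0: (1, 0), 45: (1, -1), 90: (0, -1), 135: (-1, -1)}


def glcm(image, dx, dy, levels):
    """Count pairs (image[r][c], image[r + dy][c + dx]) inside the image."""
    counts = [[0] * levels for _ in range(levels)]
    rows, cols = len(image), len(image[0])
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dy, c + dx
            if 0 <= r2 < rows and 0 <= c2 < cols:
                counts[image[r][c]][image[r2][c2]] += 1
    return counts


def normalize(counts):
    """Turn pair counts into joint probabilities p(i, j)."""
    total = sum(map(sum, counts))
    return [[v / total for v in row] for row in counts]


def contrast(p):
    """Sum of (i - j)^2 * p(i, j): large when neighbors differ strongly."""
    return sum((i - j) ** 2 * v
               for i, row in enumerate(p) for j, v in enumerate(row))


def energy(p):
    """Sum of p(i, j)^2: large when few gray-level pairs dominate."""
    return sum(v * v for row in p for v in row)


# Toy image with G = 4 gray levels (values are illustrative only).
image = [[0, 0, 1, 1],
         [0, 0, 1, 1],
         [0, 2, 2, 2],
         [2, 2, 3, 3]]
dx, dy = OFFSETS[0]
counts = glcm(image, dx, dy, levels=4)
p = normalize(counts)
print(counts, contrast(p), energy(p))
```

With G gray levels the matrix has G x G entries, which is why a larger G raises the cost noted above.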

Textural Features
Some of the texture features extracted from gray level co-occurrence matrix are discussed below (

THRESHOLDS
Thresholding is one of the most widely used methods for image segmentation. It is useful in discriminating the foreground from the background. By selecting an adequate threshold value T, a gray level image can be converted to a binary image. The binary image should contain all of the essential information about the position and shape of the objects of interest (Al-Amri et al., 2010). The advantage of first obtaining a binary image is that it reduces the complexity of the data and simplifies the process of recognition and classification. The most common way to convert a gray-level image to a binary image is to select a single threshold value T: all gray level values below T are classified as black (0) and those above T as white (1). The segmentation problem then becomes one of selecting the proper value for the threshold T.
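As a minimal sketch of single-threshold binarization, the function below maps every pixel above T to 1 and every other pixel to 0. The text does not say how a pixel exactly equal to T is treated; assigning it to black here is an assumption, as is the function name `binarize`.

```python
def binarize(gray, t):
    """Return a binary image: 1 where the pixel is above t, else 0.

    Assumption: pixels equal to t are classified as black (0).
    """
    return [[1 if p > t else 0 for p in row] for row in gray]


gray = [[10, 200], [128, 129]]
binary = binarize(gray, 128)
print(binary)
```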
The thresholding methods are classified into six groups according to the information they exploit (Sezgin and Sankur, 2004). These categories are:
• Histogram shape-based methods, where, for example, the peaks, valleys and curvatures of the smoothed histogram are analyzed
• Clustering-based methods, where the gray-level samples are clustered in two parts as background and foreground object, or alternately are modeled as a mixture of two Gaussians
• Entropy-based methods, which result in algorithms that use the entropy of the foreground and background regions or the cross-entropy between the original and binarized image
• Object attribute-based methods, which search for a measure of similarity between the gray-level and the binarized images, such as fuzzy shape similarity or edge coincidence
• Spatial methods, which use higher-order probability distributions and/or correlation between pixels
• Local methods, which adapt the threshold value at each pixel to the local image characteristics

Otsu's Method
Otsu's method is used to automatically perform clustering-based image thresholding, that is, the reduction of a gray level image to a binary image. The algorithm assumes that the image to be thresholded contains two classes of pixels, i.e., a bi-modal histogram (e.g., foreground and background), and then calculates the optimum threshold separating those two classes so that their combined spread (intra-class variance) is minimal. The extension of the original method to multi-level thresholding is referred to as the Multi Otsu method.
Otsu's method (Otsu, 1979) exhaustively searches for the threshold that minimizes the intra-class variance (the variance within the class), defined as a weighted sum of the variances of the two classes:

$$\sigma_w^2(t) = \omega_1(t)\,\sigma_1^2(t) + \omega_2(t)\,\sigma_2^2(t)$$

The weights $\omega_i$ are the probabilities of the two classes separated by a threshold $t$ and $\sigma_i^2$ are the variances of these classes.
Otsu shows that minimizing the intra-class variance is the same as maximizing the inter-class variance:

$$\sigma_b^2(t) = \sigma^2 - \sigma_w^2(t) = \omega_1(t)\,\omega_2(t)\,[\mu_1(t) - \mu_2(t)]^2$$

which is expressed in terms of the class probabilities $\omega_i$ and class means $\mu_i$.
The class probability $\omega_1(t)$ and the class mean $\mu_1(t)$ are computed from the histogram as:

$$\omega_1(t) = \sum_{i=0}^{t} p(i), \qquad \mu_1(t) = \frac{1}{\omega_1(t)} \sum_{i=0}^{t} p(i)\,x(i)$$

where $p(i)$ is the normalized count of the $i$th histogram bin and $x(i)$ is the value at the center of the $i$th histogram bin. Similarly, $\omega_2(t)$ and $\mu_2(t)$ can be computed on the right-hand side of the histogram for bins greater than $t$. The class probabilities and class means can be computed iteratively. This idea yields an effective algorithm.
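The iterative computation of class probabilities and means can be sketched as follows. The function scans every candidate threshold once, updating running sums, and keeps the threshold with the largest inter-class variance. The convention that bins 0..t form the first class, the use of the bin index as the bin value x(i), and the name `otsu_threshold` are assumptions of this sketch.

```python
def otsu_threshold(hist):
    """Return the bin index t that maximizes the inter-class variance.

    Assumption: bins 0..t form class 1 and bins t+1.. form class 2,
    and the bin index is used as the bin value x(i).
    """
    total = sum(hist)
    sum_all = sum(i * h for i, h in enumerate(hist))
    w1 = 0    # number of pixels in class 1 so far
    sum1 = 0  # sum of i * hist[i] over class 1 so far
    best_t, best_var = 0, -1.0
    for t, h in enumerate(hist):
        w1 += h
        sum1 += t * h
        w2 = total - w1
        if w1 == 0 or w2 == 0:
            continue  # one class is empty, variance undefined
        mu1 = sum1 / w1
        mu2 = (sum_all - sum1) / w2
        var_between = (w1 / total) * (w2 / total) * (mu1 - mu2) ** 2
        if var_between > best_var:
            best_t, best_var = t, var_between
    return best_t


# Toy bi-modal histogram with 8 bins (counts are illustrative only).
hist = [5, 8, 2, 0, 0, 3, 9, 4]
t = otsu_threshold(hist)
print(t)
```

Dividing the returned index by the number of gray levels minus one gives a threshold normalized to the range 0 to 1, which is the scale on which a value such as 0.5 is meaningful.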

PROPOSED METHOD
The aim of this research is to suppress the non text background details of an image by applying an appropriate gamma value, and thereby to remove the non text region. The research therefore has to estimate the gamma value of an input image. To find a proper gamma value, a range of gamma values from 0.1 to 10, with an interval of 0.1, is applied to the image, resulting in 100 images. After each image is converted to a gray image, its gray level co-occurrence matrix is computed to extract the textural features Contrast and Energy. Each image is processed for four combinations of radius and angle: 1 and 0°, 1 and 45°, 1 and 90° and finally 1 and 135°. The Contrast and Energy measures of the matrices of the four orientations are averaged, and a threshold value is calculated for each image using Otsu's threshold algorithm and recorded in a table. The table has the columns Gamma, Contrast, Energy and Threshold (Table 2).
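The table-building loop can be sketched end to end as below. This is a simplified illustration under several assumptions: the input is already an 8-bit grayscale image stored as a list of rows (the color-to-gray conversion is omitted), the gamma mapping is 255 (p/255)^γ, Contrast and Energy use the common GLCM definitions with 256 gray levels, and the Otsu threshold is normalized to the range 0 to 1. All function names are hypothetical.

```python
# Offsets (dx, dy) for radius 1 at 0, 45, 90 and 135 degrees; rows grow downward.
OFFSETS = [(1, 0), (1, -1), (0, -1), (-1, -1)]


def gamma_correct(gray, gamma):
    """Map each 8-bit pixel p to 255 * (p / 255) ** gamma."""
    return [[round(255 * (p / 255) ** gamma) for p in row] for row in gray]


def glcm_features(gray, dx, dy):
    """Return (contrast, energy) of the normalized GLCM for one offset."""
    counts = {}
    rows, cols = len(gray), len(gray[0])
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dy, c + dx
            if 0 <= r2 < rows and 0 <= c2 < cols:
                pair = (gray[r][c], gray[r2][c2])
                counts[pair] = counts.get(pair, 0) + 1
    total = sum(counts.values())
    contrast = sum((i - j) ** 2 * n for (i, j), n in counts.items()) / total
    energy = sum(n * n for n in counts.values()) / total ** 2
    return contrast, energy


def otsu_normalized(gray, levels=256):
    """Otsu threshold (maximal inter-class variance), scaled to [0, 1]."""
    hist = [0] * levels
    for row in gray:
        for p in row:
            hist[p] += 1
    total = sum(hist)
    sum_all = sum(i * h for i, h in enumerate(hist))
    w1 = sum1 = 0
    best_t, best_var = 0, -1.0
    for t, h in enumerate(hist):
        w1 += h
        sum1 += t * h
        w2 = total - w1
        if w1 == 0 or w2 == 0:
            continue
        diff = sum1 / w1 - (sum_all - sum1) / w2
        var_between = (w1 / total) * (w2 / total) * diff * diff
        if var_between > best_var:
            best_t, best_var = t, var_between
    return best_t / (levels - 1)


def build_table(gray):
    """One row per gamma in 0.1, 0.2, ..., 10.0 (100 rows)."""
    table = []
    for step in range(1, 101):
        gamma = step / 10
        corrected = gamma_correct(gray, gamma)
        feats = [glcm_features(corrected, dx, dy) for dx, dy in OFFSETS]
        table.append({
            "gamma": gamma,
            "contrast": sum(f[0] for f in feats) / len(feats),
            "energy": sum(f[1] for f in feats) / len(feats),
            "threshold": otsu_normalized(corrected),
        })
    return table


# Toy 4 x 4 grayscale image (values are illustrative only).
toy = [[12, 200, 40, 220], [30, 180, 25, 240],
       [10, 210, 35, 190], [20, 230, 15, 250]]
table = build_table(toy)
print(table[9])  # the row for gamma = 1.0, i.e. the unmodified image
```

Generating the gamma values as `step / 10` rather than by repeated addition of 0.1 avoids accumulated floating-point drift, so the row for gamma = 1.0 can be looked up exactly. The image must be at least 2 x 2 so that every offset yields at least one pixel pair.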

Estimation of Gamma Value
To determine the value of gamma, the Energy and Contrast values of the image with gamma value = 1 (the original image) are first looked up in the generated table. These two values decide which of the following rules is applied.
Rule 1: If the value of Energy >= 0.05, find an instance in the table whose threshold value is 0.5. If more than one instance is found, select the instance which has the maximum value of Contrast and Energy >= 0.05. If no instance is found, find the instance whose threshold value is next nearest to 0.5. The gamma value of the selected instance is the estimated gamma value.
Rule 2: If the value of Energy < 0.05 and the value of Contrast >= 1000, find, among the Gamma values 1 to 10, an instance in the table which has Energy >= 0.1, Contrast >= 1000 and a threshold value of 0.5. If more than one instance is found, select the instance which has the maximum value of Energy and Contrast > 1000. If no such instance is found, find an instance with a gamma value between 0.1 and 0.9 whose threshold value is nearest, or next nearest, to 0.5. The gamma value of the selected instance is the estimated gamma value.
Rule 3: If the value of Energy < 0.05 and the value of Contrast < 1000, find, among the Gamma values 1 to 10, the instance which has Energy >= 0.1 and the maximum value of Contrast, where this maximum Contrast must be greater than 100. If no such instance is found, find an instance with a gamma value between 0.1 and 1 whose threshold value is nearest, or next nearest, to 0.5. The gamma value of the selected instance is the estimated gamma value.
By applying the estimated gamma value to the input image, the background suppressed image is obtained. This Gamma corrected image is converted to a gray scale image. Finally, Otsu's thresholding algorithm is used to calculate the threshold value, which is applied to create the output image.
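The three rules can be read as a small decision procedure over the table. The sketch below is one possible interpretation, not the authors' implementation: it assumes the table is a list of rows with the keys `gamma`, `contrast`, `energy` and `threshold`, that a threshold "is 0.5" when it lies within a hypothetical tolerance `tol` of 0.5, and that ties are broken in favor of the smaller gamma. The numbers in the toy table are invented for illustration.

```python
def nearest_to_half(rows):
    """Row whose threshold is closest to 0.5 (first one wins on ties)."""
    return min(rows, key=lambda r: abs(r["threshold"] - 0.5))


def estimate_gamma(table, tol=0.005):
    """Apply Rules 1-3 to a gamma table and return the estimated gamma.

    Assumption: `tol` decides when a threshold counts as "0.5".
    """
    original = next(r for r in table if abs(r["gamma"] - 1.0) < 1e-9)
    e0, c0 = original["energy"], original["contrast"]
    high = [r for r in table if r["gamma"] >= 1.0]  # gamma values 1 to 10

    def at_half(r):
        return abs(r["threshold"] - 0.5) <= tol

    if e0 >= 0.05:  # Rule 1
        hits = [r for r in table if at_half(r) and r["energy"] >= 0.05]
        chosen = (max(hits, key=lambda r: r["contrast"]) if hits
                  else nearest_to_half(table))
    elif c0 >= 1000:  # Rule 2
        hits = [r for r in high
                if r["energy"] >= 0.1 and r["contrast"] >= 1000 and at_half(r)]
        chosen = (max(hits, key=lambda r: r["energy"]) if hits
                  else nearest_to_half([r for r in table if r["gamma"] < 1.0]))
    else:  # Rule 3
        hits = [r for r in high if r["energy"] >= 0.1]
        best = max(hits, key=lambda r: r["contrast"], default=None)
        if best is not None and best["contrast"] > 100:
            chosen = best
        else:
            chosen = nearest_to_half([r for r in table if r["gamma"] <= 1.0])
    return chosen["gamma"]


# Invented toy table that falls under Rule 3 (low Energy, Contrast < 1000).
toy_table = [
    {"gamma": 0.5, "contrast": 300.0, "energy": 0.001, "threshold": 0.48},
    {"gamma": 1.0, "contrast": 450.0, "energy": 0.002, "threshold": 0.40},
    {"gamma": 3.0, "contrast": 150.0, "energy": 0.200, "threshold": 0.30},
    {"gamma": 6.0, "contrast": 220.0, "energy": 0.300, "threshold": 0.20},
    {"gamma": 8.0, "contrast": 90.0, "energy": 0.600, "threshold": 0.10},
]
print(estimate_gamma(toy_table))
```

In the toy table the rows with gamma 3.0, 6.0 and 8.0 satisfy Energy >= 0.1, and the one with the largest Contrast (gamma 6.0, Contrast above 100) is selected.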

EXPERIMENTATION
Images from the ICDAR datasets are used for the experiments. To find a proper gamma value, a range of gamma values from 0.1 to 10, with an interval of 0.1, is applied to an input image (Fig. 2) and the values of Contrast, Energy and Threshold are recorded. Table 2 is obtained for the input image. For the original image, the Contrast is 483.1 and the Energy is 0.000582. The gamma value is estimated by applying Rule 1, Rule 2 or Rule 3 as stated above. Here, Rule 3 is applied because the Energy is < 0.05 and the Contrast is < 1000, and the estimated gamma value is 5.5. By applying this gamma value to the input image, the background suppressed image is obtained (Fig. 2b). The Gamma corrected image is converted to a gray scale image, and a threshold value is computed with Otsu's algorithm and applied to create the output image (Fig. 2c).
When a range of gamma values from 0.1 to 10, with an interval of 0.1, is applied to an input image (Fig. 6a), Table 3 is obtained. For the original image, the Contrast is 1870 and the Energy is 0.00676. Rule 2 is applied because the Energy is < 0.05 and the Contrast is > 1000. The table contains no instance with Energy >= 0.1, Contrast >= 1000 and a threshold value of 0.5 for the Gamma values 1 to 10, so the gamma value must lie between 0.1 and 0.9. By applying the last part of Rule 2, the gamma value is determined as 0.7. The output image is shown in Fig. 6c.
When an input image (Fig. 7a) is processed with gamma values in the range 0.1 to 10, Table 4 is obtained. For the original image, the Contrast is 617.7 and the Energy is 0.4858. Rule 1 is applied because the Energy is >= 0.05. More than one instance is found, so the instance with the maximum Energy and a Contrast > 1000 is selected. The corresponding gamma value of this instance is 3.1. The Gamma corrected image and the output image are shown in Fig. 7b and 7c, respectively.
To evaluate the performance of the proposed method, the ICDAR Robust Reading Dataset is used. Some of the experimental results are shown in Fig. 2-8.

The results of this research show that the proposed method removes the non text background region and retains the text details of the image. The proposed method can detect most text regions successfully, including text with different styles, sizes, fonts, orientations and colors. Table 5 shows the comparison of the proposed algorithm with other algorithms. The proposed text extraction approach has an average precision rate of 78% and a recall rate of 96%.

CONCLUSION
This paper presents a new algorithm for the extraction of text region information from an image. The proposed method suppresses the non text background details based on Gamma correction. The algorithm estimates the gamma value using texture measures, without any prior knowledge of the imaging device. The proposed technique can serve as a preprocessing stage for most object separation methods. The algorithm was applied to several images from the ICDAR datasets containing text of different styles, sizes, fonts and alignments on complex backgrounds, and it shows promising results. Future work will concentrate on developing an exact and fast algorithm that removes further unwanted details and separates alphanumeric characters from the output image obtained with the proposed technique.