Edge Detection in Gray Level Images based on the Shannon Entropy

Most of the classical mathematical methods for edge detection are based on derivatives of the pixel values of the original image: Gradient operators, the Laplacian and the Laplacian of Gaussian operators. Gradient-based edge detection methods, such as Roberts, Sobel and Prewitt, use two 2-D linear filters to process vertical and horizontal edges separately, approximating the first-order derivative of the pixel values of the image. The Laplacian edge detection method uses a 2-D linear filter to approximate the second-order derivative of the pixel values of the image. A major drawback of the second-order derivative approach is that the response at and around an isolated pixel is much stronger. In this research study, a novel approach that uses Shannon entropy, rather than the evaluation of derivatives of the image, is proposed for detecting edges in gray level images. The proposed approach solves this problem to some extent. In the proposed method, a suitable threshold value is used to segment the image and obtain a binary image. The proposed edge detector is then applied to detect and locate the edges in the image. A standard test image is used to compare the results of the proposed edge detector with those of the Laplacian of Gaussian edge detector. In order to validate the results, seven different kinds of test images are considered to examine the versatility of the proposed edge detector. It has been observed that the proposed edge detector works effectively for different gray scale digital images. The results of this study were quite promising.


INTRODUCTION
Edge detection has received much attention during the past two decades because of its importance in many research areas [1]. Since the edge is a prominent feature of an image, edge detection is the front-end processing stage in object recognition and image understanding systems. The accuracy with which this task can be performed is a crucial factor in determining overall system performance [2]. The detection results benefit applications such as image enhancement, recognition, morphing, compression, retrieval, watermarking, hiding, restoration and registration [3]. Edge detection concerns the localization of abrupt changes in the gray level of an image [4]. An edge can be defined as the boundary between two regions with relatively distinct gray level properties [5]. The region dissimilarity may be caused by factors such as the geometry of the scene, the radiometric characteristics of the surface, the illumination and so on [6].
Most of the traditional methods for edge detection are based on the first and second order derivatives of the gray levels of the pixels of the original image, such as the Gradient operator and the Laplacian operator [7]. Roberts, Prewitt and Sobel are Gradient operators that use 2D spatial convolution masks to approximate the first-order derivative of an image in the horizontal and vertical directions separately. The edges detected by Gradient operators are thick, which may not be suitable for applications where the detection of the outermost contour of an object is required. The Laplacian edge detection method uses a 2D spatial linear filter to approximate the second-order derivative of the pixel values of the image in order to produce sharp edges [8]. The Laplacian is generally not used in its original form for edge detection for several reasons. As a second-order derivative, the Laplacian typically is unacceptably sensitive to noise. The magnitude of the Laplacian produces double edges, an undesirable effect because it complicates segmentation [9]. For these reasons, the Laplacian is combined with smoothing as a precursor to finding edges via zero-crossings. Marr and Hildreth achieved this by using the Laplacian of a Gaussian (LOG) function as a filter [10]. LOG filtered images also suffer from the problem of missing edges: edges in the original image may not have corresponding edges in the filtered image.
In addition, it turns out to be very difficult to combine LOG zero-crossings from different scales, primarily because of the following [11]:
• A physically significant edge does not match a zero-crossing for more than a very limited number of scales
• Zero-crossings in larger scales move very far away from the true edge position due to the poor localization of the LOG operator
• There are too many zero-crossings in the small scales of a LOG filtered image, most of which are due to noise
To solve these problems, this study proposes a novel approach based on information theory. Shannon entropy is the most important among several measures of information. Edges can be extracted by detecting all pixels on the borders between different homogeneous areas. Entropy measures the randomness of the intensity distribution [12]. Accordingly, the value of entropy is low for homogeneous areas and high where the diversity of the gray levels of the pixels is large [13].

Concept of entropy:
Entropy is a concept in information theory. Entropy is used to measure the amount of information [14]. Entropy is defined in terms of the probabilistic behavior of a source of information. In accordance with this definition, a random event A that occurs with probability P(A) is said to contain $I(A) = \log \frac{1}{P(A)} = -\log P(A)$ units of information. The amount I(A) is called the self-information of event A. The amount of self-information of the event is inversely related to its probability. If P(A) = 1, then I(A) = 0 and no information is attributed to it. In this case, the uncertainty associated with the event is zero. Thus, if the event always occurs, no information is transferred by communicating that the event has occurred. If P(A) = 0.8, then some information is transferred by communicating that the event has occurred.
The base of the logarithm determines the unit which is used to measure the information. If the base of the logarithm is 2, then the unit of information is the bit. If P(A) = ½, then $I(A) = -\log_2(1/2) = 1$ bit. That is, 1 bit is the amount of information conveyed when one of two possible equally likely events occurs. A simple example of such a situation is flipping a coin and communicating the result (Head or Tail).
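The definition of self-information can be checked numerically. The following is a minimal sketch only; the function name `self_information` and its `base` parameter are illustrative choices, not part of the original study.

```python
import math

def self_information(p, base=2.0):
    """Return I(A) = log_base(1 / P(A)) for an event of probability p.

    The name and signature are hypothetical; base=2 measures information in bits.
    """
    if not 0.0 < p <= 1.0:
        raise ValueError("p must lie in the interval (0, 1]")
    # Change of base: log_base(x) = ln(x) / ln(base)
    return math.log(1.0 / p) / math.log(base)

# A certain event carries no information; a fair coin flip carries one bit.
certain = self_information(1.0)
coin_flip = self_information(0.5)
likely = self_information(0.8)
```

Under these assumptions, an event with P(A) = 0.8 carries roughly 0.32 bits, which is less than the 1 bit of a fair coin flip because a likely event is less surprising.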
The basic concept of entropy in information theory has to do with how much randomness is in a signal or in a random event. An alternative way to look at this is to talk about how much information is carried by the signal. Entropy is a measure of randomness.
Consider a probabilistic experiment in which the output of a discrete source is observed during every unit of time (signaling interval). The source output is modeled as a discrete random variable S that takes its values from a set of source symbols [15]. The symbols generated by the source during successive signaling intervals are statistically independent. A source that satisfies this property is called a discrete memoryless source: the symbol emitted at any time is independent of previous choices.
The amount of self-information of the event $S = s_j$, which occurs with probability $p_j$, is

$$I(s_j) = \log \frac{1}{p_j} = -\log p_j .$$

Thus the self-information generated by the production of a single source symbol is a discrete random variable that takes on the values $I(s_1), I(s_2), \ldots, I(s_k)$ with probabilities $p_1, p_2, \ldots, p_k$ respectively [16]. If n source symbols are generated, the law of large numbers stipulates that, for a sufficiently large value of n, symbol $s_j$ will (on average) be output $n p_j$ times. Thus the average self-information obtained from n outputs is given by

$$-n p_1 \log p_1 - n p_2 \log p_2 - \cdots - n p_k \log p_k = -n \sum_{j=1}^{k} p_j \log p_j .$$

The average information per source output, denoted H(Z) [17], is:

$$H(Z) = -\sum_{j=1}^{k} p_j \log p_j .$$

The important quantity H(Z) is called the entropy of a discrete memoryless source with source alphabet Z. It is a measure of the average information content per source symbol. The entropy H(Z) depends only on the probabilities of the symbols in the alphabet; the Z in H(Z) is not an argument of a function but rather a label for a source.
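The entropy of a discrete memoryless source can be sketched in a few lines. This is an illustrative sketch only; the function name `entropy`, the `base` parameter and the convention that zero-probability symbols contribute nothing are assumptions, not details given in the original study.

```python
import math

def entropy(probs, base=2.0):
    """Return H = -sum_j p_j * log_base(p_j) for symbol probabilities p_j.

    Hypothetical helper: symbols with p_j = 0 are skipped, following the
    usual convention that 0 * log(0) is taken as 0.
    """
    if any(p < 0.0 for p in probs) or abs(sum(probs) - 1.0) > 1e-9:
        raise ValueError("probabilities must be non-negative and sum to 1")
    total = sum(p * math.log(1.0 / p) for p in probs if p > 0.0)
    return total / math.log(base)

# Two equally likely symbols (a fair coin) give one bit per symbol.
fair_coin = entropy([0.5, 0.5])
# A source that always emits the same symbol carries no information.
constant = entropy([1.0])
# A skewed source lies between the two extremes.
skewed = entropy([0.9, 0.1])
```

The entropy is largest when all symbols are equally likely and falls to zero when one symbol is certain, which is the property the proposed edge detector relies on.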

Selection of threshold value:
A threshold value is used to transform a dataset containing values that vary over some range into a new dataset containing just two values. When a threshold value is applied to the input data, input values that fall below the threshold are replaced by one of the output values and input values that are at or above the threshold are replaced by the other output value. Image thresholding [18] is a segmentation technique because it classifies pixels into two categories. Category 1: pixels whose gray level values fall below the threshold. Category 2: pixels whose gray level values equal or exceed the threshold. In a gray level image, the range of the input dataset is [0, 255]. After thresholding, the output dataset contains only two values, 0 and 255. Thus, thresholding creates a binary image. If T is a threshold value, then any pixel (x, y) for which f(x, y) > T is called an object point; otherwise the pixel is called a background point. In general, the threshold can be expressed as T = T[x, y, p(x, y), f(x, y)], where f(x, y) is the gray level of the pixel (x, y) and p(x, y) denotes some local property of this pixel, for example, the average gray level of a neighborhood centered on (x, y). A thresholded image h(x, y) is defined as h(x, y) = 1 if f(x, y) > T; otherwise h(x, y) = 0. Thus, pixels labeled 1 correspond to objects, whereas pixels labeled 0 correspond to the background. When T depends only on f(x, y) (only on gray level values), the threshold is called global. If T depends on f(x, y) and p(x, y), the threshold is called local. If T depends on the pixel position (x, y) as well as on f(x, y) at that pixel position, then it is called a dynamic or adaptive threshold. In the proposed scheme for edge detection, a global threshold value is used.
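Global thresholding as defined above, h(x, y) = 1 if f(x, y) > T and 0 otherwise, can be sketched as follows. The function name `binarize` and the representation of an image as a list of rows are assumptions made for illustration only.

```python
def binarize(image, T):
    """Return the thresholded image h, with h = 1 where f > T and 0 elsewhere.

    `image` is assumed to be a list of rows of gray level values; this
    representation is an illustrative choice, not taken from the study.
    """
    return [[1 if value > T else 0 for value in row] for row in image]

# A tiny 2 x 3 gray level image with values in [0, 255].
gray = [
    [10, 200, 128],
    [127, 255, 0],
]
binary = binarize(gray, 127)
```

Because the comparison is strict, a pixel whose gray level equals T is labeled as background, in line with the definition of an object point as one for which f(x, y) > T.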

Procedure to select suitable threshold value
Step 1: Select an initial estimate for T.
Step 2: Segment the image using T. This will produce two groups of pixels: $R_1$, consisting of all pixels with gray level values > T, and $R_2$, consisting of all pixels with gray level values ≤ T.
Step 3: Compute the average gray level values $\mu_1$ and $\mu_2$ for the pixels in regions $R_1$ and $R_2$.
Step 4: Compute a new threshold value:
Set $T_{New} = (\mu_1 + \mu_2)/2$ and set $T_{Old} = 0$
Step 5: While ($T_{New} \neq T_{Old}$) do
$\mu_1$ = mean gray level of pixels for which f(x, y) > $T_{New}$
$\mu_2$ = mean gray level of pixels for which f(x, y) ≤ $T_{New}$
Set $T_{Old} = T_{New}$
Set $T_{New} = (\mu_1 + \mu_2)/2$
End while
Step 6: Suitable threshold value: set T = $T_{New}$
Step 7: Stop

Proposed scheme for edge detection:
In digital image processing, an image defined in the real world is considered to be a function of two real variables, for example, f(x, y), with f as the amplitude (brightness) of the image at the real coordinate position (x, y). A spatial filter mask may be defined as a (template) matrix w of size m × n. Assume that m = 2a+1 and n = 2b+1, where a and b are nonzero positive integers. The smallest meaningful size of the mask is 3×3. The coefficients of such a mask, showing the coordinate arrangement, are:

$$\begin{bmatrix} w(-1,-1) & w(-1,0) & w(-1,1) \\ w(0,-1) & w(0,0) & w(0,1) \\ w(1,-1) & w(1,0) & w(1,1) \end{bmatrix}$$

The image region under the above mask is:

$$\begin{bmatrix} f(x-1,y-1) & f(x-1,y) & f(x-1,y+1) \\ f(x,y-1) & f(x,y) & f(x,y+1) \\ f(x+1,y-1) & f(x+1,y) & f(x+1,y+1) \end{bmatrix}$$

The basic idea behind edge detection is:
• Classification of all pixels that satisfy the criterion of homogeneousness
• Detection of all pixels on the borders between different homogeneous areas
In the proposed scheme, a binary image is first created by choosing a suitable threshold value. A window is then applied to the binary image. All window coefficients are set equal to 1 except the centre, which is marked ×:

$$\begin{bmatrix} 1 & 1 & 1 \\ 1 & \times & 1 \\ 1 & 1 & 1 \end{bmatrix}$$

The window is moved over the whole binary image and the probability of each central pixel of the image under the window is found. Then, the entropy of each central pixel of the image under the window is calculated as

$$H = -p \ln p ,$$

where p is the probability of the central pixel of the binary image under the window. For example, if at some window position the probability of the central pixel is p = 4/9, its entropy is $H = -\frac{4}{9} \ln \frac{4}{9} \approx 0.3604$. If, at another position, p = 2/9, the entropy of the central pixel is $H = -\frac{2}{9} \ln \frac{2}{9} \approx 0.3342$. When the probability of the central pixel is p = 1, the entropy of this pixel is zero.
Thus, if the gray levels of all pixels under the window are homogeneous, p = 1 and H = 0. In this case, the central pixel is not an edge pixel. The other possible values of the entropy of the central pixel under the window are shown in Table 1.
In cases 1 and 2, the diversity of the gray levels of the pixels under the window is low. So, in these cases, the central pixel is not an edge pixel. In the remaining cases, the diversity of the gray levels of the pixels under the window is high. So, in these cases, the central pixel is an edge pixel. Thus, a central pixel with entropy greater than or equal to 0.2441 is an edge pixel; otherwise it is not.
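The threshold-selection procedure and the entropy test can be sketched together as follows. This is a sketch under stated assumptions, not the authors' MATLAB implementation: the entropy is taken as H = -p ln p (an assumption that reproduces the value 0.2441 at p = 1/9), p is assumed to be the fraction of the nine window pixels that share the central pixel's value, the initial estimate of T is assumed to be the global mean, the loop stops on a small tolerance rather than exact equality, and border pixels are left unmarked. All function names are hypothetical.

```python
import math

EDGE_ENTROPY = 0.2441  # decision value quoted in the text

def select_threshold(image, tol=1e-6, max_iter=100):
    """Iterate T = (mu1 + mu2) / 2 until T stops changing (within tol)."""
    pixels = [v for row in image for v in row]
    t_new = sum(pixels) / len(pixels)  # assumed initial estimate: global mean
    for _ in range(max_iter):
        high = [v for v in pixels if v > t_new]   # region R1
        low = [v for v in pixels if v <= t_new]   # region R2
        if not high or not low:
            break  # one region is empty, so the means cannot be updated
        t_old = t_new
        t_new = (sum(high) / len(high) + sum(low) / len(low)) / 2.0
        if abs(t_new - t_old) <= tol:
            break
    return t_new

def binarize(image, T):
    """h = 1 where f > T, otherwise 0."""
    return [[1 if v > T else 0 for v in row] for row in image]

def center_entropy(window, center):
    """Assumed H = -p ln p, with p the share of window pixels equal to center."""
    p = sum(1 for v in window if v == center) / len(window)
    return -p * math.log(p)

def entropy_edges(binary):
    """Mark interior pixels whose 3x3 window entropy reaches EDGE_ENTROPY."""
    rows, cols = len(binary), len(binary[0])
    edges = [[0] * cols for _ in range(rows)]
    for x in range(1, rows - 1):
        for y in range(1, cols - 1):
            window = [binary[x + s][y + t]
                      for s in (-1, 0, 1) for t in (-1, 0, 1)]
            if center_entropy(window, binary[x][y]) >= EDGE_ENTROPY:
                edges[x][y] = 1
    return edges

# A 5 x 6 test image: dark left half, bright right half.
image = [[0, 0, 0, 255, 255, 255] for _ in range(5)]
T = select_threshold(image)
edge_map = entropy_edges(binarize(image, T))
```

With these assumptions, the pixels on either side of the vertical boundary have p = 6/9 and H ≈ 0.2703, so they are marked as edge pixels, while pixels inside either homogeneous half have p = 1 and H = 0. Windows with p = 8/9 or p = 7/9 give H ≈ 0.1047 and H ≈ 0.1955, both below 0.2441, which agrees with the two low-diversity cases described above.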

RESULTS AND DISCUSSION
The performance of the proposed scheme is evaluated through simulation results using MATLAB 7 for a set of eight test images, and the results of the proposed scheme are compared with the results of a well-established edge detection operator on the same set of test images. This edge detection operator is the Laplacian of Gaussian (LOG). LOG is chosen for comparison because both approaches are rotation invariant. For this purpose, a standard test image, eight.tif, was first taken from the MATLAB 7 environment. Its edges were detected using the LOG edge detector built into MATLAB 7. After this, the performance of the proposed approach for edge detection was checked on the same image. In the proposed scheme, a suitable threshold value was calculated using the threshold evaluation procedure given in this study. The threshold value for the test image is 0.3472 when the image is in normalized form (all gray level values lie between 0 and 1). The result of edge detection is shown in Fig. 1. It has been observed that the proposed method for edge detection works well as compared to LOG.
In order to validate the results on the performance of the proposed scheme for edge detection, seven different test images, which are available in the MATLAB 7 environment, are considered. Suitable threshold values calculated by the threshold evaluation procedure for the different test images are given in Table 2. The results of edge detection for these test images using LOG and the proposed scheme are shown in Fig. 2. It has again been observed that the performance of the proposed edge detection scheme is satisfactory for all the test images as compared to the performance of LOG.
Fig. 2: Performance of Proposed Edge Detector for different images (panels: using LOG, using proposed)

CONCLUSION
In this study, an attempt is made to develop a new technique for edge detection. Experimental results have demonstrated that the proposed scheme for edge detection works satisfactorily for different gray level digital images. The theoretical principles and systematic development of the algorithm for the proposed versatile edge detector are described in detail. The technique has potential for future use in the field of digital image processing. Further work is in progress to examine the performance of the proposed edge detector on gray level images affected by different kinds of noise.