Identification of Pecan Weevils through Image Processing

Problem Statement: The pecan weevil is one of the most destructive pests of Oklahoma pecans. The scope of this study is to develop a recognition system that can serve in a wireless imaging network for monitoring pecan weevils. Approach: The recognition methods used in this study are based on template matching. Five recognition methods were implemented: Normalized cross-correlation, Fourier descriptors, Zernike moments, String matching and Regional properties. The training set consisted of 205 pecan weevils and the testing set included 30 randomly selected pecan weevils and 74 other insects which typically exist in the pecan habitat. Results: Region-based methods were found to be better at representing and recognizing biological objects such as insects. Different recognition rates were obtained at different orders of Zernike moments. The best result among the tested orders was obtained at order 3. The results also showed that using a different number of Fourier descriptors may not significantly increase the recognition rate of this method. Conclusion: The most robust and reliable recognition rate was achieved when the Zernike moments and Region properties recognition methods were used in combination. A positive match from either of these two independent tests would yield reliable results. Therefore, 100% recognition could be achieved by adopting the proposed algorithm. The processing time for such recognition is 0.44 sec.


INTRODUCTION
More than twenty types of insects can attack the pecan tree. However, the pecan weevil is one of the most destructive pests of Oklahoma pecans. It is considered the most serious late-season pest because it attacks the nut (Harris, 1979). The life cycle of the pecan weevil (Fig. 1) ranges from 2-3 years, most of which is spent underground. As soon as the adult pecan weevils emerge, they feed on pecan nuts, mate and oviposit in the nuts. The eggs hatch and the larvae develop in 30 days. After their complete formation, the larvae chew a hole in the nut, fall to the ground and burrow into the soil, where they pupate in 3 weeks and remain as adults for one or two years before emerging on the pecan tree.
Nut damage is caused by adult and larval feeding and by egg laying. From July through September, the adults emerge from the soil and begin feeding on the nuts. Pecan weevils mate shortly after emerging and females choose nuts that have passed the gel stage but have not yet hardened. Within 24 days post-emergence, a female can attack 25 nuts and lay about 3 eggs in each nut (Harris, 1979). This constitutes the major damage, while the damage caused by adults feeding on nuts (they feed on about 1 nut every four days) is considered minor (Mulder, 2004).
The present management methods for controlling pecan weevils involve detecting their emergence and then applying insecticides. Pecan weevil control requires about one to four well-timed insecticide applications (Mulder, 2004). Some Integrated Pest Management (IPM) stations delay the first treatment until nuts have reached the gel stage of development. This is because successful pecan weevil oviposition can only occur from that point until shuck split. Generally, insecticide coverage of at least 20-30 days is needed for pecan weevil management. These treatments are economically justified in high priced, large fruited pecans if the infestation level is higher than the threshold of 500 post-emergence pecan weevil adults per hectare. The threshold for small fruited, low priced pecans is approximately 3500 pecan weevil adults per hectare. A second or even a third treatment may be needed to prevent economic damage if pecan weevils continue to emerge from the soil after an initial treatment (Harris, 1979).
There are several monitoring techniques to detect the appearance and activities of adult pecan weevils. They include inspecting dropped nuts for feeding and/or oviposition injury and using knock down sprays, sticky bands, limb jarring, ground cover traps and assorted traps (Ree et al., 2000). However, traps (Fig. 2) are the most commonly used method. Different types of traps are used for monitoring weevils, including the wire cone trap, the pyramid trap and the circle trap. The wire cone trap has been used for years and is normally placed on the ground beneath pecan trees with a known history of pecan weevil infestations. The number of pecan trees in an orchard block varies from 60 trees per hectare (thin density) to 237 trees per hectare (ultra density) (Herrera, 2000). It is recommended to use 1-2 traps per tree and 3-5 trees per orchard block (Mizell, 2003).
Traps should be placed in the orchard 1-2 weeks before the earliest maturing varieties reach the gel stage and should be monitored every 2-3 days (Mizell, 2003). Since pecan weevil emergence varies greatly from year to year and is significantly affected by soil moisture, initial emergence and peak population emergence can vary from orchard to orchard and from tree to tree. As a result, traps must be checked carefully during the emergence season and the adult weevils collected in the traps should be counted and removed at each inspection (Ree et al., 2000). This technique of monitoring pecan weevils is labor intensive and requires very careful observation. Assuming that it takes a farmer one minute to check each trap in a 40 hectare orchard (600 traps), it would take 10 h to inspect all of the traps. With an inspection every 2-3 days, this amounts to about 30 h of work per week during the emergence season, which could last for three months. Therefore, the use of an automated monitoring system would significantly reduce the labor requirement.
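The labor estimate above can be reproduced with a few lines of arithmetic. The short Python sketch below uses the figures quoted in the text; the values of `inspections_per_week` (three rounds, consistent with one inspection every 2-3 days) and `weeks_in_season` (about 13 weeks for a three-month season) are assumptions made here for illustration.

```python
# Labor estimate for manual trap inspection, using the figures quoted above.
traps = 600                # traps in a 40 hectare orchard (from the text)
minutes_per_trap = 1       # inspection time per trap (from the text)
inspections_per_week = 3   # assumption: one round every 2-3 days
weeks_in_season = 13       # assumption: a three-month emergence season

hours_per_round = traps * minutes_per_trap / 60
hours_per_week = hours_per_round * inspections_per_week
hours_per_season = hours_per_week * weeks_in_season

print(f"one inspection round: {hours_per_round:.0f} h")
print(f"per week:             {hours_per_week:.0f} h")
print(f"per season:           {hours_per_season:.0f} h")
```

Under these assumptions a single orchard would require several hundred hours of inspection per season, which is the burden the automated system is meant to remove.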
Several insect identification systems have been developed, including the Digital Automated Identification System (DAISY) (Watson et al., 2003), the Automated Bee Identification System (ABIS) (Arbuckle et al., 2001), Species Identification Automated and Web Accessible (SPIDA-web) (Do and Harp, 1999) and the Automated Insect Identification through Concatenated Histograms of Local Appearance (AIICHLA) (Larios et al., 2007). However, these systems have some limitations and may not be applicable for identifying pecan weevils.
The target group that DAISY was designed to identify is Ophioninae (Hymenoptera: Ichneumonidae). For accurate classification, the system requires that insects be aligned before their image is captured and is, therefore, not suited to field applications where no human interaction is preferred. Furthermore, for insects that are closely related and similar in shape, a large number of training images would be required, especially with the random n-tuple classifier used in this system. The ABIS system was designed specifically to identify bees based on the differences in their forewings. It requires user interaction for aligning the specimen's wing before capturing its image. Also, the system is limited to species with membranous wings, as the algorithm depends on a specific set of characters of the wing venation for identification. In the SPIDA-web system, manual manipulation of the spider specimen is required for proper image acquisition. User interaction is also required for region selection and preprocessing of images. The AIICHLA system is specifically designed to identify stonefly larvae, which live in water. An operator has to make sure that the larvae are in the standard orientation for properly capturing their images. No fully automated system for identifying insects in the field has been developed thus far. Furthermore, to our knowledge, no recognition system has been designed specifically for identifying pecan weevils. Therefore, the development of an automated monitoring system based on a wireless network imaging system is paramount.
The main objectives of this study were (a) to develop a recognition algorithm that can identify pecan weevils among other insects, so that a robust recognition system could replace the manual insect monitoring techniques currently in use and serve as a useful tool for pest control management and (b) to develop the software part of a wireless network imaging system that can automatically identify pecan weevils in the field.

Recognition methods:
The shape of an object is an important feature for certain image recognition applications. There are two criteria for representing the shape of an object: (a) the shape descriptors should be sufficiently accurate so that they uniquely represent that shape and (b) the shape descriptors should be broad enough to be insensitive to minor variations among objects of the same type. This applies, in particular, to biological objects since they are irregular. The shape of objects can be represented by different methods, which are generally classified under two major categories of shape representation: (a) boundary-based and (b) region-based methods. Boundary-based representations utilize only the information of the shape boundary, whereas region-based techniques consider the internal and external details of the shape. In this study, methods from both types of shape descriptors were used. Fourier descriptors and String matching were implemented as boundary-based methods. Geometric moments, Zernike moments and Region properties were selected from the region-based methods. In addition to these methods, the Normalized cross-correlation method was also employed in this study.

Matching by correlation:
The template will be denoted as $t(i, j)$ of size $m \times n$ that is to be matched with an image $f(x, y)$ of size $M \times N$, where the size of the template should be less than or equal to the size of the image. The Sum of Squared Differences (SSD) is a similarity measure widely used in computer vision. In a gray level image, the sum of the squared differences between each template pixel and the corresponding input image pixel is taken as an indication of the similarity between the template and the searched area of the image (Storring and Moeslund, 1997). The SSD is determined as follows:

$$SSD(x, y) = \sum_{i=0}^{m-1} \sum_{j=0}^{n-1} \left[ f(x+i, y+j) - t(i, j) \right]^2 \tag{1}$$

The cross-correlation can be derived by expanding the square as follows:

$$SSD(x, y) = \sum_{i=0}^{m-1} \sum_{j=0}^{n-1} f(x+i, y+j)^2 + \sum_{i=0}^{m-1} \sum_{j=0}^{n-1} t(i, j)^2 - 2 \sum_{i=0}^{m-1} \sum_{j=0}^{n-1} f(x+i, y+j)\, t(i, j) \tag{2}$$

In Eq. 2, the energy of the searched area and the energy of the template are represented by the first and second terms, respectively. The last term contains the Cross Correlation (CC), $CC(x, y) = \sum_{i} \sum_{j} f(x+i, y+j)\, t(i, j)$, which forms the correlation between the image and the template. For 8-bit gray levels, the contribution of a single pixel pair to the CC ranges from zero (no match) to $255^2$ (maximum value). The need for a Normalized Cross Correlation (NCC) arises because the energy of the different searched areas in an image is usually not constant (Storring and Moeslund, 1997). The CC can be normalized as follows:

$$NCC(x, y) = \frac{\sum_{i=0}^{m-1} \sum_{j=0}^{n-1} f(x+i, y+j)\, t(i, j)}{\sqrt{\sum_{i=0}^{m-1} \sum_{j=0}^{n-1} f(x+i, y+j)^2 \; \sum_{i=0}^{m-1} \sum_{j=0}^{n-1} t(i, j)^2}} \tag{3}$$

The normalization is done by dividing the CC by the square root of the product of the energy of the searched area and the energy of the template. The range of the NCC is between 0 (no match) and 1 (match). In this study, NCC was used with a simple algorithm to identify pecan weevils among other insects. First, the program reads the gray level input image and the images of pecan weevils stored in the database. Then, the input image is treated as a template and the normalized cross-correlation is computed between this template and the database images one by one. If the value of the correlation is greater than the experimentally determined threshold (0.75), the input image is recognized as a pecan weevil.
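A minimal Python sketch of the normalized cross-correlation test described above is given here for illustration. It assumes that the input image and each database image have the same size, so that the correlation is evaluated at a single offset rather than slid over a larger image. The function names `ncc` and `is_weevil_ncc` are hypothetical, and images are represented as plain lists of rows of gray levels.

```python
import math


def ncc(f, t):
    """Normalized cross-correlation of two equally sized gray-level images.

    f and t are lists of rows; the result lies in [0, 1] for non-negative pixels.
    """
    cc = sum(fv * tv for f_row, t_row in zip(f, t) for fv, tv in zip(f_row, t_row))
    energy_f = sum(v * v for row in f for v in row)
    energy_t = sum(v * v for row in t for v in row)
    if energy_f == 0 or energy_t == 0:
        return 0.0  # an all-black image carries no shape information
    return cc / math.sqrt(energy_f * energy_t)


def is_weevil_ncc(image, database, threshold=0.75):
    """Positive match if any database image correlates at or above the threshold.

    any() stops at the first match, so not every template is necessarily visited.
    """
    return any(ncc(image, ref) >= threshold for ref in database)


# Tiny synthetic example (not real insect images).
weevil = [[10, 20], [30, 40]]
brighter_weevil = [[20, 40], [60, 80]]   # same pattern, doubled intensity
other = [[40, 30], [20, 10]]             # reversed pattern

print(ncc(weevil, brighter_weevil))
print(is_weevil_ncc(other, [weevil]))
```

Because of the normalization, a uniformly brighter copy of a template still correlates perfectly, which is the motivation given above for preferring the NCC over the raw CC.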

Matching by strings:
In this method, the boundary of an insect is represented by a string which is generated by coding the interior angles of the polygon that approximates the boundary. Strings were generated from a given angle array by quantizing the angles into 45° increments, which produced strings whose elements were numbers between 1 and 8 (Gonzalez and Woods, 2004). For an input image of an unknown insect and an image of a pecan weevil, the two boundaries can be coded into strings $a_1 a_2 \ldots a_n$ and $b_1 b_2 \ldots b_n$, respectively. If α represents the number of matches between the two strings, where a match takes place in the k-th location if $a_k = b_k$, then the number of unmatched symbols can be described as follows:

$$\beta = \max(|a|, |b|) - \alpha \tag{4}$$

Where:
|a| = The length of the string representing the unknown insect
|b| = The length of the string representing the pecan weevil image

In this case, the value of β is equal to zero if the two strings are identical. Even though there are many definitions of string similarity, a simple measure between strings was implemented in this study, which is represented by the following ratio:

$$D = \frac{\alpha}{\beta} = \frac{\alpha}{\max(|a|, |b|) - \alpha} \tag{5}$$

The value of D is equal to zero when none of the symbols in a (unknown insect's image) and b (pecan weevil's image) match. D is infinite when the two strings match identically. In String matching, a tested image is recognized as a pecan weevil if the D value is greater than or equal to the threshold value (1.0).
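The string similarity measure can be sketched in a few lines of Python. The sketch below is illustrative only: the function names are hypothetical, the 45° binning in `angles_to_string` is an assumption about how the eight symbols are assigned, and the extraction of the polygon angles from the boundary is not shown.

```python
import math


def angles_to_string(angles_deg):
    """Quantize interior angles (degrees) into symbols 1..8.

    Assumption: 45-degree bins, symbol 1 for [0, 45), ..., symbol 8 for [315, 360).
    """
    return [int((angle % 360) // 45) + 1 for angle in angles_deg]


def string_similarity(a, b):
    """Ratio D = alpha / beta, where alpha counts position-wise matches
    and beta = max(|a|, |b|) - alpha counts the unmatched symbols."""
    alpha = sum(1 for x, y in zip(a, b) if x == y)
    beta = max(len(a), len(b)) - alpha
    return math.inf if beta == 0 else alpha / beta


def is_weevil_string(a, database, threshold=1.0):
    """Positive match if D reaches the threshold for any database string."""
    return any(string_similarity(a, b) >= threshold for b in database)


unknown = angles_to_string([10, 100, 200, 300])
print(unknown)
print(string_similarity(unknown, [1, 3, 5, 8]))
```

With a threshold of 1.0, a positive match requires at least as many matching symbols as unmatched ones.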

Object recognition by Zernike moments:
Zernike moment descriptors have the properties of rotation invariance, robustness to noise, expression efficiency, fast computation and multi-level representation for describing the various shapes of patterns (Kim and Kim, 2000). Zernike moments are based on a set of complex polynomials which form a complete orthogonal set over the interior of the unit circle. The computation of Zernike moments from an input image consists of three steps: (a) computation of the radial polynomials, (b) computation of the Zernike basis functions and (c) computation of the Zernike moments by projecting the image onto the basis functions (Hwang and Kim, 2006). The form of these polynomials is as follows:

$$V_{nm}(x, y) = V_{nm}(\rho, \theta) = R_{nm}(\rho)\, e^{jm\theta} \tag{6}$$

Where:
n = A non-negative integer called the "order"
m = A positive or negative integer (known as the "repetition") with the constraints that $n - |m|$ is even and $|m| \le n$
ρ = The length of the vector from the origin to the pixel (x, y)
θ = The angle between the vector ρ and the x axis in the counterclockwise direction
$R_{nm}(\rho)$ = The radial polynomial defined as:

$$R_{nm}(\rho) = \sum_{s=0}^{(n-|m|)/2} \frac{(-1)^s\, (n-s)!}{s!\, \left(\frac{n+|m|}{2} - s\right)!\, \left(\frac{n-|m|}{2} - s\right)!}\; \rho^{\,n-2s} \tag{7}$$

These polynomials are orthogonal and satisfy the following orthogonality property for the same repetition:

$$\int_0^1 R_{nm}(\rho)\, R_{n'm}(\rho)\, \rho \, d\rho = \frac{1}{2(n+1)}\, \delta_{nn'} \tag{8}$$

Where: $\delta_{nn'} = 1$ if $n = n'$ and 0 otherwise.

The Zernike moment of order n with repetition m for a continuous image function $f(x, y)$ that vanishes outside the unit circle is given as follows:

$$Z_{nm} = \frac{n+1}{\pi} \iint_{x^2+y^2 \le 1} f(x, y)\, V_{nm}^{*}(x, y)\, dx\, dy \tag{9}$$

In Eq. 9, the integral can be replaced by summations (since all the images are digital) as follows:

$$Z_{nm} = \frac{n+1}{\pi} \sum_{x} \sum_{y} f(x, y)\, V_{nm}^{*}(x, y), \quad x^2 + y^2 \le 1 \tag{10}$$

The Zernike moments are computed for an image by taking the center of the image as the origin and mapping the pixel coordinates to the range of the unit circle. The computation does not include pixels outside the unit circle. The orthogonality implies no redundancy or overlap of information between the moments with different orders and repetitions (Hwang and Kim, 2006). In this case, each moment is a unique and independent representation of a given image.
In many comparison studies of moment-based methods (Teh and Chin, 1988; Lin and Chou, 2003; Belkasim and Shridhar, 1991; Zhang and Lu, 2004; Park and Kim, 2004; Ezer et al., 1994; Padilla-Vivanco and Urcid-Serrano, 2007; Liao and Pawlak, 1996), Zernike moments outperformed the other methods.
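The three computation steps listed above (radial polynomial, basis function, projection) can be sketched in plain Python as follows. This is an illustrative sketch, not the implementation used in this study: the function names are hypothetical, the mapping of pixel centers onto the unit circle is one possible choice, and using the magnitudes $|Z_{nm}|$ with $m \ge 0$ as rotation-invariant features is an assumption. For orders up to 3 this yields six features.

```python
import cmath
import math


def radial_poly(n, m, rho):
    """Radial polynomial R_nm(rho) for n - |m| even and |m| <= n."""
    m = abs(m)
    total = 0.0
    for s in range((n - m) // 2 + 1):
        num = (-1) ** s * math.factorial(n - s)
        den = (math.factorial(s)
               * math.factorial((n + m) // 2 - s)
               * math.factorial((n - m) // 2 - s))
        total += num / den * rho ** (n - 2 * s)
    return total


def zernike_moment(image, n, m):
    """Discrete Zernike moment Z_nm of a gray-level image (list of rows)."""
    rows, cols = len(image), len(image[0])
    acc = 0j
    for r in range(rows):
        for c in range(cols):
            # Map pixel centers to [-1, 1] with the image center as origin.
            y = (2 * r + 1 - rows) / rows
            x = (2 * c + 1 - cols) / cols
            rho = math.hypot(x, y)
            if rho > 1.0:
                continue  # pixels outside the unit circle are ignored
            theta = math.atan2(y, x)
            # Project onto the complex conjugate of the basis function V_nm.
            acc += image[r][c] * radial_poly(n, m, rho) * cmath.exp(-1j * m * theta)
    return (n + 1) / math.pi * acc


def zernike_features(image, max_order=3):
    """Magnitudes |Z_nm| for all valid (n, m) with 0 <= m <= n <= max_order."""
    return [abs(zernike_moment(image, n, m))
            for n in range(max_order + 1)
            for m in range(n + 1)
            if (n - m) % 2 == 0]


img = [[0, 1, 2, 0],
       [3, 4, 5, 1],
       [0, 6, 7, 2],
       [1, 0, 8, 0]]
rotated = [list(row) for row in zip(*img[::-1])]  # 90-degree rotation

print(len(zernike_features(img)))
print(zernike_features(img))
print(zernike_features(rotated))
```

The two printed feature vectors agree up to rounding error, which illustrates the rotation invariance of the moment magnitudes on a small synthetic image.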
Object recognition by Fourier descriptors:
Fourier descriptors are produced by the Fourier transform, which represents the shape in the frequency domain. The lower frequency descriptors store the general information of the shape and the higher frequency descriptors store its finer details (Sarfraz, 2006). Therefore, the lower frequency components of the Fourier descriptors are sufficient for general shape description. The boundary of a shape consists of K points in the xy plane. Tracing once around the boundary from an arbitrary starting point $(x_0, y_0)$ yields the coordinate pairs $(x_0, y_0), (x_1, y_1), \ldots, (x_{K-1}, y_{K-1})$. Each coordinate pair of the shape boundary can be described as a complex number as follows:

$$s(k) = x(k) + j\, y(k), \quad k = 0, 1, 2, \ldots, K-1$$

Where: $x(k) = x_k$, $y(k) = y_k$ and $j = \sqrt{-1}$.

This representation changes the problem from a two-dimensional to a one-dimensional case. The discrete Fourier transform of $s(k)$ is as follows:

$$a(u) = \frac{1}{K} \sum_{k=0}^{K-1} s(k)\, e^{-j 2\pi u k / K}$$

for $u = 0, 1, 2, \ldots, K-1$. The complex coefficients $a(u)$ are known as the Fourier descriptors of the boundary. The inverse Fourier transform of $a(u)$ restores $s(k)$ as follows:

$$s(k) = \sum_{u=0}^{K-1} a(u)\, e^{j 2\pi u k / K}$$

Where:
K = The number of points in the boundary
a(u) = The feature values from the Fourier descriptors used for object recognition and representation

High frequency components account for fine detail and low frequency components determine global shape. Therefore, not all Fourier descriptors are required for general object recognition. Instead, only the first P coefficients need to be used. In this case, the inverse transform can be rewritten as the following approximation:

$$\hat{s}(k) = \sum_{u=0}^{P-1} a(u)\, e^{j 2\pi u k / K}$$

Regional properties descriptors:
While the aim of this study is to identify pecan weevils among other insects, it is desired to keep such a system as simple as possible.
Regional properties are among the regional descriptors, as they deal with the region(s) of the image instead of its boundary. They provide a simple way of describing important properties of image regions such as the area, centroid and orientation. Although there are many insects that are very close to pecan weevils in terms of shape description, one important feature can be utilized to distinguish pecan weevils from other insects. This feature is the pecan weevil's rostrum. The pecan weevil can be recognized by its long rostrum, which is ¾ the length of the male's body and as long as the female's body.
Since the pecan weevil is not the only insect that has a rostrum, utilizing this feature alone (major-axis length) may not be very effective. Thus, this feature was combined with other features in order to form a unique representation of pecan weevils. The area, major-axis length and minor-axis length were used to describe pecan weevils in this study. The area of the selected region is defined as the number of pixels in that region. The major-axis length is defined as the length (in pixels) of the major axis of the ellipse that has the same second moments as the region. Finally, the minor-axis length is the length (in pixels) of the minor axis of the ellipse that has the same second moments as the region (Gonzalez and Woods, 2004).
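The three regional properties can be computed from a binary mask of the insect as sketched below in Python. This is a minimal illustration under stated assumptions: the function name `region_properties` is hypothetical, the mask is assumed to contain a single region, and the axis lengths follow a commonly used formulation based on the normalized second central moments, in which a 1/12 term accounts for the unit extent of each pixel.

```python
import math


def region_properties(mask):
    """Return (area, major_axis_length, minor_axis_length) of a binary mask.

    mask is a list of rows; any truthy entry belongs to the region.
    """
    pixels = [(c, r) for r, row in enumerate(mask) for c, v in enumerate(row) if v]
    area = len(pixels)
    if area == 0:
        raise ValueError("mask contains no region pixels")

    x_mean = sum(x for x, _ in pixels) / area
    y_mean = sum(y for _, y in pixels) / area

    # Normalized second central moments; 1/12 is the second moment of a unit pixel.
    uxx = sum((x - x_mean) ** 2 for x, _ in pixels) / area + 1 / 12
    uyy = sum((y - y_mean) ** 2 for _, y in pixels) / area + 1 / 12
    uxy = sum((x - x_mean) * (y - y_mean) for x, y in pixels) / area

    common = math.sqrt((uxx - uyy) ** 2 + 4 * uxy ** 2)
    major = 2 * math.sqrt(2) * math.sqrt(uxx + uyy + common)
    minor = 2 * math.sqrt(2) * math.sqrt(max(uxx + uyy - common, 0.0))
    return area, major, minor


# A 5-pixel horizontal bar: elongated, so the major axis is much longer
# than the minor axis.
bar = [[0, 0, 0, 0, 0, 0, 0],
       [0, 1, 1, 1, 1, 1, 0],
       [0, 0, 0, 0, 0, 0, 0]]
print(region_properties(bar))
```

An elongated structure such as a long rostrum increases the major-axis length relative to the area and the minor-axis length, which is why the three values are used together.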
The Euclidean Distance (ED) was implemented as a classifier to measure the degree of similarity between the corresponding descriptors of an input insect image and those of the pecan weevil images in the database. The ED's equation can be written as follows:

$$ED = \sqrt{\sum_{i=1}^{n} (x_i - y_i)^2}$$

where $x_i$ and $y_i$ are the i-th descriptors of the input image and of a database image, respectively, and n is the number of descriptors. Using the descriptors of the Fourier, Zernike and Regional properties methods, an acquired image is recognized as a pecan weevil when the value of the ED is less than or equal to the experimentally determined threshold for each method.
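As an illustration of how a descriptor vector and the Euclidean Distance classifier fit together, the Python sketch below computes truncated Fourier descriptors of a boundary and compares them with a small database. The function names are hypothetical. Dropping the coefficient a(0) and keeping only the magnitudes of the remaining coefficients is an assumption made here to obtain features that do not depend on the position of the shape; the study does not state which normalization was used.

```python
import cmath
import math


def fourier_descriptors(boundary, p):
    """First p DFT coefficients a(u) of a boundary given as (x, y) points."""
    k_total = len(boundary)
    s = [complex(x, y) for x, y in boundary]
    return [sum(s[k] * cmath.exp(-2j * math.pi * u * k / k_total)
                for k in range(k_total)) / k_total
            for u in range(p)]


def descriptor_features(boundary, p):
    """Assumption: use |a(u)| for u >= 1 so that translation has no effect."""
    return [abs(a) for a in fourier_descriptors(boundary, p)[1:]]


def euclidean_distance(x, y):
    """ED between two descriptor vectors of equal length."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))


def is_match(features, database, threshold):
    """Positive match if the ED to any database entry is within the threshold."""
    return any(euclidean_distance(features, ref) <= threshold for ref in database)


square = [(0, 0), (2, 0), (2, 2), (0, 2)]
shifted = [(x + 5, y + 3) for x, y in square]   # same shape, different position
stretched = [(0, 0), (6, 0), (6, 2), (0, 2)]    # different shape

ref = descriptor_features(square, 4)
print(euclidean_distance(ref, descriptor_features(shifted, 4)))
print(euclidean_distance(ref, descriptor_features(stretched, 4)))
```

The same `euclidean_distance` and `is_match` helpers apply unchanged to Zernike or regional property vectors; only the threshold differs per method.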

Collection of insects:
Traps were set up for pecan weevils at different locations in Stillwater, Oklahoma. The other source of insects was the Entomology Museum at Oklahoma State University. Over 205 pecan weevils, both males and females, were collected from these two sources. The collected weevils varied in size, color and age. About 27 other types of insects were also collected to be part of the experiment. These insects are normally present in the pecan habitat. The names of the insects used in the experiment and their number of replicates are presented in Table 1.

Image acquisition:
In template-based applications, the training set of images should be truly representative of the targeted object or shape. Even though traps were checked regularly, few pecan weevils were found alive. Experiments showed that live weevils die within a short time when kept in cages. Moreover, it was very hard to position live weevils appropriately for imaging without causing some damage to their bodies or losing them, since they can fly. As a result, live collected insects were put to "sleep" by placing them in a refrigerator at 4°C for 60 min.
The muscles of a pecan weevil shrink and pull the legs in shortly after it dies. This generally results in all six legs remaining close to the body or sometimes touching the abdomen. Since this system is designed to identify live pecan weevils in the field, images of insects in such positions would not simulate the natural appearance of the insects in the field. Therefore, the legs and antennae of preserved insects were stretched out so that they would appear similar to their position in live insects. In order to achieve good results without losing these fragile parts, some careful pre-processing steps were undertaken to prepare the insects for imaging. The first pre-processing step was to put the insects in a humidifying chamber for 10 days. The humidified environment helps in making the legs and antennae of the insects more flexible, so that they can be stretched out closer to their normal position. The second step was to align each insect in the camera view for imaging. All insects were placed approximately at a reference position and orientation. Images of the insects were then acquired with the image acquisition system.
The imaging system: The imaging system (Fig. 3) consisted of an AVT F-145B CCD black and white camera (IEEE 1394 SXGA+ camera) equipped with a 1.45 megapixel 2/3" progressive CCD sensor. This camera was manufactured by Allied Vision Technologies GmbH 2003, Stadtroda, Germany. Images of insects were 335 × 285 pixels in size. The lighting system was an Aristo model MS1417 light box equipped with a cold cathode grid lamp. A diffused light chamber was designed and fabricated in the departmental workshop and used to enhance edge detection and body reflection. This chamber helped in reducing the specular reflection from external light sources. It is 45.72 cm in length, 23.8 cm in width and 12.7 cm in height. The chamber had an opening of 7.5 cm radius to allow the lens to go through the chamber. An opaque white glass cover (0.3175 cm thick) was used on top of the lighting box. A Dell Optiplex GX745, Pentium® D, 3.4 GHz CPU was used and MATLAB® (R2006a) image processing software was utilized to conduct these experiments. Some pictures of the pecan weevils taken by the imaging system are shown in Fig. 4.

Algorithm:
Figure 5 presents the algorithm of the recognition system. The sequence starts by loading a new image of an insect, which will be processed directly by the Zernike moments method at order three, for which six moments will be calculated. The degree of similarity between these moments and the moments of the pecan weevils will be measured. If this degree is greater than or equal to a threshold value of 0.8, the input image will be classified as a pecan weevil. A value of 1 will be assigned to the counter (S = 1) and the algorithm will proceed to the next step. If an insect does not match any pecan weevil of the training set, the algorithm will keep S = 0 and move to the next step.
In the second step, the input image will be analyzed by the Region properties method. After measuring the three properties of that insect (area, major-axis length and minor-axis length), their similarity to those of each pecan weevil of the training set will be evaluated. If the degree of similarity is greater than or equal to the threshold of recognition (1.0), this insect will be recognized as a pecan weevil and the counter will add 1 to its value. If that insect does not match any pecan weevil of the training set, the algorithm will keep the value of S unchanged and move to the next step. Thus, the value of the counter S will be 0, 1 or 2 after the second step.
In the third step, the Normalized cross-correlation method will be used. If the correlation value between the input image and any pecan weevil image is greater than or equal to 0.75, this image will be recognized as a pecan weevil and the correlation process will stop. The counter value will increase by one (S = S+1). At this stage, the counter S can have a value of 0, 1, 2 or 3. When S equals 1 or 2, the algorithm will go to the fourth step. If S equals 3, which means that the input image was recognized by all three previous methods, the algorithm will recognize this image as a pecan weevil and end the recognition process for the input image. The program will then be ready for the next image. If S equals 0, which indicates that the input image was not positively classified by any of the three methods, the algorithm will classify this insect as a non-pecan weevil, end the recognition process and be ready for a new image.
In the fourth step, the String matching method will process the input image only if the counter value is either S = 1 or S = 2. If the similarity measure between the string of this image and any string of the training set is greater than or equal to 0.96, this image will be regarded as a pecan weevil. In this case, the counter value will be either S = 2 or S = 3. In the first case, the algorithm will go to the fifth step, whereas in the second case the insect will be confirmed as a pecan weevil. If this insect does not match any pecan weevil of the training set, the counter value will remain at either S = 1 or S = 2. At S = 2, the input image will then go through the fifth method even though it was not recognized at this step. However, at S = 1, the algorithm will classify the input image as a non-pecan weevil and end the recognition process.
In the last step, the Fourier descriptors method will process the image if the counter value is S = 2. This method will calculate the Fourier descriptors (450 descriptors) of the input image and measure the similarity between this set of descriptors and those of the training data set. If the similarity measure is greater than or equal to the threshold of 1.059, this image will be classified as a pecan weevil. In this case, the counter will add one to its value (S = 3) and hence the image will be confirmed as a pecan weevil. Otherwise, the input image will be regarded as a non-pecan weevil. In both cases, the recognition process for that image will be complete and the system will be ready for a new input image.
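The five-step decision logic described above amounts to a voting cascade in which an image is accepted once three methods agree and rejected as soon as three positive votes can no longer be reached. The Python sketch below illustrates only this control flow; the function name `classify_insect` is hypothetical and each of the five methods is represented by a placeholder callable that returns True on a positive match.

```python
def classify_insect(zernike, region, ncc, string, fourier):
    """Five-step voting cascade.

    Each argument is a zero-argument callable returning True on a positive
    match. Returns (is_pecan_weevil, number_of_methods_run).
    """
    s = 0  # the counter S from the text

    # Steps 1-3 are always run: Zernike moments, Region properties, NCC.
    for test in (zernike, region, ncc):
        if test():
            s += 1
    if s == 3:
        return True, 3
    if s == 0:
        return False, 3

    # Step 4 (String matching) is reached with S = 1 or S = 2.
    if string():
        s += 1
    if s == 3:
        return True, 4
    if s == 1:
        return False, 4

    # Step 5 (Fourier descriptors) is reached only with S = 2.
    if fourier():
        s += 1
    return s == 3, 5


yes = lambda: True
no = lambda: False

print(classify_insect(yes, yes, yes, no, no))   # accepted after three methods
print(classify_insect(yes, no, no, no, yes))    # rejected after four methods
print(classify_insect(yes, yes, no, no, yes))   # accepted after five methods
```

Passing callables rather than precomputed results keeps the later, slower methods from running when the outcome is already decided.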

RESULTS
The threshold at which insects are recognized as pecan weevils was experimentally determined for each method using a template set of 205 pecan weevils. This threshold was set approximately at the value satisfied by 80% of the training data. For all methods, the recognition rate was evaluated using two types of data sets. The first group consisted of 30 pecan weevils that were randomly selected from a group of 200 pecan weevils. The second group was a set of 74 insects of 19 different types that are naturally present in the pecan habitat. The performance time given for each method is the average time required for loading and processing one image.
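One possible reading of the threshold rule above is that the threshold is chosen so that about 80% of the training weevils pass it. The Python sketch below illustrates that reading only; the function name `threshold_for_coverage` and the example scores are hypothetical, and the exact procedure used in the experiments is not specified in the text.

```python
import math


def threshold_for_coverage(scores, coverage=0.8, higher_is_better=True):
    """Pick the threshold that the best `coverage` fraction of training scores pass.

    For similarity scores (higher is better) a sample passes if score >= threshold;
    for distances (lower is better) it passes if score <= threshold.
    """
    ordered = sorted(scores, reverse=higher_is_better)
    keep = max(1, math.ceil(coverage * len(ordered)))
    return ordered[keep - 1]


# Hypothetical training scores for ten weevil images.
training_scores = [0.90, 0.80, 0.70, 0.60, 0.95, 0.85, 0.75, 0.65, 0.99, 0.50]

similarity_threshold = threshold_for_coverage(training_scores)
distance_threshold = threshold_for_coverage(training_scores, higher_is_better=False)
print(similarity_threshold, distance_threshold)
```

The `higher_is_better` flag covers both kinds of measures used in this study: correlation and string similarity increase with similarity, whereas the Euclidean Distance decreases.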
Correlation method: Figure 6 illustrates the results of using the Normalized cross-correlation method to identify pecan weevils among other insects. The pecan weevils are represented by solid circles while the other insects are represented by hollow circles. It can be seen that this method can distinguish pecan weevils from other insects. About 90% of the pecan weevils were above the experimentally determined threshold of 0.75. The three pecan weevils which fell below the threshold line were very close (0.74) to the passing criterion and thus only narrowly missed being correctly distinguished. About 95% of the non-pecan weevils were correctly classified. The average processing time for this method was 25 sec.
Region properties: Figure 7 illustrates the results of the experiments conducted using this method. The results showed that 90% of the pecan weevils and 93% of the other insects were correctly classified. The average processing time for this method was 0.35 sec. These encouraging results, in addition to the rotation and translation properties of this method, suggest its adoption in identifying pecan weevils.
String matching: String matching is a simple, yet very effective method for recognizing pecan weevils. The recognition threshold for this method was set at 1.0 as shown in Fig. 8. Using this method, 80% of the pecan weevils and 88% of the other insects were correctly classified. The average processing time for this method was 2.5 sec.
Zernike moments: Zernike moments at different orders were studied for accuracy and time performance. Figure 9 presents the performance analysis of Zernike moments. The optimum point was found at order 3, where the highest recognition rate for both pecan weevils and other insects was obtained in the shortest time (0.09 sec.). It can be seen from Fig. 10 that the two testing groups are clearly separated into two different sections. This powerful classification ability of Zernike moments at this order strongly suggests its adoption in the proposed recognition system. The recognition rate was 97% for pecan weevils and 99% for other insects. These results are considered to be the best in terms of correct classification rate and speed.
Fourier descriptors: Figure 11 illustrates the results of the Fourier descriptors method. With this method, 80% of the pecan weevils were correctly classified, whereas only 51% of the other insects were correctly classified. One factor contributing to the relatively poor performance of the Fourier descriptors method is the non-linear variation among the pecan weevils in terms of body size and part orientation. This method processes a newly acquired image in about 0.5 sec.

DISCUSSION
The algorithm was successfully implemented in the recognition system and it yielded promising results for the data sets investigated. The performances of the five methods are compared in Fig. 12, which shows the percentage of pecan weevils and non-pecan weevils successfully identified by each method. The Type I and Type II errors for each of the methods were also evaluated, as shown in Table 2. On average, the maximum processing time for one image through the five methods is 25.44 sec. However, the system may require a shorter time because an input image need not be matched with all pecan weevil images in the template set if it positively matches any one of them.
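The timing figures can be cross-checked by summing the average per-method times reported in the Results section. The short Python sketch below does this for three groupings; the grouping into stages is an assumption made here for illustration. In this arithmetic, 0.44 sec corresponds to Zernike moments plus Region properties, 25.44 sec corresponds to the first three steps of the five-step algorithm (the earliest point at which it can reach a decision) and the sum over all five methods is 28.44 sec.

```python
# Average processing time per image for each method, in seconds (from the Results).
times = {
    "zernike": 0.09,
    "region": 0.35,
    "ncc": 25.0,
    "string": 2.5,
    "fourier": 0.5,
}

# Zernike moments followed by Region properties.
two_methods = round(times["zernike"] + times["region"], 2)

# Steps 1-3 of the five-step algorithm.
first_three = round(two_methods + times["ncc"], 2)

# All five methods, the longest path through the five-step algorithm.
all_five = round(sum(times.values()), 2)

print(two_methods, first_three, all_five)
```

The normalized cross-correlation step dominates the total, which is consistent with the preference for the two fast region-based methods.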
These results supported the idea of implementing more than one recognition method, as a single method may not provide the desired result. Based on the above findings and a careful analysis of the system requirements, it is concluded that the application of two methods, Zernike moments and Region properties, would yield the desired success rates for identifying pecan weevils in field applications. Fig. 13 illustrates the revised algorithm implemented in this study. It shows the two methods mentioned above applied in sequential order. A positive match from either of these two methods was used as the selection criterion. This algorithm yielded the best results when compared to other combinations of methods. Therefore, this algorithm is expected to be implemented in a wireless monitoring system for field applications.
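The selection criterion of the revised algorithm reduces to a logical OR of the two region-based tests. The Python sketch below illustrates this; the function name `revised_classifier` is hypothetical, and skipping the Region properties test after a positive Zernike match is an assumption about how the sequential order is used.

```python
def revised_classifier(zernike_match, region_match):
    """Two-method rule: a positive match from either method is sufficient.

    Both arguments are zero-argument callables returning True on a match.
    Returns (is_pecan_weevil, number_of_methods_run).
    """
    if zernike_match():
        return True, 1  # assumption: the second test is skipped after a match
    return bool(region_match()), 2


print(revised_classifier(lambda: True, lambda: False))
print(revised_classifier(lambda: False, lambda: True))
print(revised_classifier(lambda: False, lambda: False))
```

An insect is rejected only when both methods fail to match it, so a weevil missed by one method can still be recovered by the other.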

CONCLUSION
It can be concluded that a combination of more than one method is essential for a robust recognition system since no single method yielded the desired detection rates. Zernike moments at order 3 were found to have the highest recognition rates for pecan weevils and other insects. This method yielded the lowest Type I and II errors and required the least processing time. The Region properties method showed similar advantages to the Zernike moments. Thus, a 100% successful recognition rate for pecan weevils was achieved using a combination of the Zernike moments and Region properties methods. The Fourier descriptors method using 450 descriptors was found to be the least successful of these methods and yielded the highest Type I and II errors. The region-based methods were found to represent the shape of insects better than the boundary-based methods. As a result, the recognition rates of the region-based methods were higher than those of the boundary-based methods.