Histogram of Intensity Feature Extraction for Automatic Plastic Bottle Recycling System Using Machine Vision

Abstract: Currently, many recycling facilities rely on manual sorting, in which plant personnel visually identify plastic bottles as they travel along a conveyor belt, pick them and sort them into their respective containers. Manual sorting may not be suitable for high-throughput recycling facilities. It has also been noted that high turnover among sorting-line workers makes it difficult to achieve consistency in the plastic separation process. An intelligent automated sorting system is therefore greatly needed to replace manual sorting. The core components of machine vision for such a system are image recognition and classification. This paper describes the overall plastic bottle sorting system and discusses the feature extraction algorithms in detail, since feature extraction is the component that determines the success rate of the overall system. The proposed feature extraction algorithms were evaluated in terms of classification accuracy, and the results showed an accuracy of more than 80%.


INTRODUCTION
A recent study by the Ministry of Housing and Local Government of Malaysia on current recycling practices showed that plastics account for about 9.7% of household waste. In addition, the National Technical Committee on Management of Solid Waste has been studying various proposals on the use of modern technology for solid waste disposal to replace existing manual techniques. Most waste ends up in landfill sites; only 19% of household waste is currently recycled or composted. Recycling is widely assumed to be environmentally beneficial, although the collection, sorting and processing of materials give rise to some environmental impacts and energy use. Previously, plastic recycling was based on the material used [1]. Plastic packaging can be made from different types of resin; the most common are PETE, HDPE, LDPE, PVC, PP and PS, as shown in Table 1 [1,2]. These are among the types of plastic that have numerous recycled applications. With plastics recycling, however, there is usually only a single re-use. Most bottles and jugs do not become food and beverage containers again. For example, pop bottles might become carpet or stuffing for sleeping bags, while milk jugs are often made into plastic lumber, recycling bins and toys. When working with plastics, there is often a need to identify which particular plastic material has been used for a given product. Most consumers recognize the types of plastic by the numerical coding system created by the Society of the Plastics Industry in the late 1980s. Six types of plastic resin are commonly used to package household products. Their identification codes, shown in Table 1 [1,2], can be found on the bottom of most plastic packaging.
In this study, we propose a new approach to classifying plastic bottles that examines the viability of imaging technology for automated sorting. In the original images of plastic bottles, there are obvious features that discriminate between the two classes of bottles, PET and Non-PET. From the point of view of a vision system, the obvious distinguishing feature of the two classes is their appearance, which is either clear or opaque. In optics, transparency (seen as glare or shine) is the property of allowing light to pass through a material; the opposite property is opacity. We therefore applied the histogram of intensity technique to differentiate between the two classes of bottle, PET and Non-PET, according to the properties of transparency and opacity. The remainder of this paper is structured as follows. The next section briefly reports related work by others and is followed by the methodology section, in which two feature extraction algorithms are discussed and compared to obtain the best result. The results and discussion are then presented, followed by the conclusion.

PREVIOUS WORK
A variety of techniques and algorithms have been developed recently that use automatically detected regions of interest (ROIs) together with image intensity, spatial arrangements of patterns and textural features to distinguish among features or to separate them from their background. Some of the most popular techniques use the 2-D wavelet transform, as reported in [4,5], which is especially useful for suppressing noise and detecting fine structures. In [6], neural networks were trained on ideal shapes that take into account the possible intensity changes at the edges of the structures, together with edge detection algorithms. This approach was also applied to detect oceanographic structures, using traditional gradient operators, grey level co-occurrence matrices and derived measurements [7]. Danijela et al. [8] proposed a facial feature point detection method that uses individual feature patch templates to detect points in the relevant region of interest. Additionally, in [9] Tiffany presented an algorithm that selects ROIs containing tumor based on combined texture and histogram analysis. The first analysis compares texture features extracted from different regions in an image to the same features extracted from known tumorous regions. The second analysis detects the ROIs with two thresholds computed from the histograms of known tumorous masks. In [10], Seung et al. proposed a new ROI extraction algorithm that uses scale salient information and multiple features such as intensity, edge, and R+G and B+Y color to reflect salient regions more exactly.

MATERIALS AND METHODS
In view of that, a study has been proposed to determine the viability of using computer vision for automated plastic bottle sorting. There are several types of plastic, as listed in Table 1, but this study focuses on the classification between PET and Non-PET bottles, leading to a two-category pattern recognition task. This work therefore categorizes the bottles as generally as possible by assigning them to one of two classes, namely the PET and Non-PET bottle classes. Fig. 1 shows an overview of the overall system and outlines its basic structure. It consists of the following steps: pre-processing, feature extraction using the histogram of intensity, and classification. In the pre-processing stage, image analysis is performed to normalize and standardize the images used. The feature extraction section discusses the two algorithms used to derive feature vectors for the classification process. Once the feature vectors are extracted from the images, they are fed as input to the linear discriminant analyzer for plastic bottle classification.

Pre-processing:
To obtain the sets of feature vectors, each image has to go through the pre-processing stage. The image pre-processor module performs the following operations: image resizing, filtering, silhouette extraction and measurement of region properties for the bottle's image [11]. Image filtering removes noise caused by lighting and also performs background subtraction. The resulting gray-level image is referred to as the silhouette. Then, the MATLAB regionprops command [12] is used to measure the properties of objects or regions in the image and return them in a structure array. When applied to a plastic bottle image containing several basic components, it creates one structure element for each component. Here, only the bounding box and centroid properties are used.
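As an illustration of the two region properties used here, the following is a minimal sketch, not the MATLAB regionprops implementation. It assumes the silhouette has already been thresholded into a binary mask stored as a list of rows, and it uses zero-based (row, column) indices; the function names `bounding_box_and_centroid` and `crop` are hypothetical.

```python
def bounding_box_and_centroid(mask):
    """Return ((top, left, height, width), (row_centroid, col_centroid)).

    `mask` is a list of rows holding 0 (background) or 1 (foreground).
    This only mimics the idea of the bounding box and centroid
    properties; coordinate conventions differ from MATLAB's.
    """
    coords = [(r, c)
              for r, row in enumerate(mask)
              for c, value in enumerate(row) if value]
    if not coords:
        raise ValueError("mask contains no foreground pixels")
    rows = [r for r, _ in coords]
    cols = [c for _, c in coords]
    top, left = min(rows), min(cols)
    height = max(rows) - top + 1
    width = max(cols) - left + 1
    centroid = (sum(rows) / len(rows), sum(cols) / len(cols))
    return (top, left, height, width), centroid


def crop(image, box):
    """Crop a gray-level image (list of rows) to the given bounding box."""
    top, left, height, width = box
    return [row[left:left + width] for row in image[top:top + height]]
```

Cropping the gray-level image with the box returned for its silhouette yields an image in which the background is reduced and the bottle fills most of the frame.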

Feature extraction:
The objective of the feature extraction process is to represent the raw image in a reduced and compact form in order to facilitate and speed up decision-making processes such as classification. Each pixel of an image stored inside a computer has a pixel value that describes how bright that pixel is and/or what color it should be. In the simplest case of binary images, the pixel value is a 1-bit number indicating either foreground or background. For a grayscale image, the pixel value is a single number that represents the brightness of the pixel. The most common pixel format is the byte image, where this number is stored as an 8-bit integer giving a range of possible values from 0 to 255. Typically, zero is taken to be black and 255 is taken to be white. Two algorithms are used in this study to extract features from the original plastic bottle images. The first algorithm studies the histogram of intensity of the whole plastic bottle image. In the second algorithm, the whole image of the plastic bottle is segmented into five regions and the region of interest (ROI) is taken from the fifth region. The bounding box image algorithm and the segmented region of interest algorithm are explained in detail in the following sub-sections.
Bounding box image algorithm: The bounding box is the smallest rectangle that can contain a region or, in this case, the plastic bottle. The objective of this algorithm is to find the average white pixel value of the gray-scale plastic bottle image. Figure 2 shows an example of a bounding box image used in this work.
The histogram of intensity feature values were extracted from the silhouette of the gray-level image. In this work, the silhouette of the gray-level image is a bounding box image in which we try to minimize the background and maximize the object. The histogram is plotted from this image, and from the histogram a few features that can discriminate between the two classes of bottles are extracted. The histogram of a digital image with gray levels in the range [0, L-1] is the discrete function
$h(r_k) = n_k$ (1)
where $r_k$ is the kth gray level and $n_k$ is the number of pixels in the image having gray level $r_k$. It is common practice to normalize a histogram by dividing each of its values by the total number of pixels in the image, denoted by n. A normalized histogram is given by
$p(r_k) = n_k / n$, for $k = 0, 1, \ldots, L-1$ (2)
Next, the average white pixel value is measured using the function
$W(r_k) = \dfrac{\sum_{k=150}^{L-1} n_k}{\sum_{k=0}^{L-1} n_k}$ (3)
that is, the number of pixels with gray level 150 or above divided by the total number of pixels, where L is the number of possible gray levels (256 for an 8-bit image). The average white pixel value from function (3) is the feature that is used as input to the Linear Discriminant Analysis (LDA) classifier.
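The histogram and the white-pixel feature described above can be sketched as follows. This is a minimal illustration under stated assumptions: the image is an 8-bit gray-level image stored as a list of rows of integers, the white-pixel feature is read as the share of pixels at gray level 150 or higher, and the function names are hypothetical.

```python
def histogram(image, levels=256):
    """Count how many pixels fall on each gray level (n_k for each r_k)."""
    counts = [0] * levels
    for row in image:
        for value in row:
            counts[value] += 1
    return counts


def normalized_histogram(counts):
    """Divide each count by the total number of pixels: p(r_k) = n_k / n."""
    n = sum(counts)
    if n == 0:
        raise ValueError("image contains no pixels")
    return [c / n for c in counts]


def white_pixel_fraction(image, threshold=150, levels=256):
    """Share of pixels whose gray level is `threshold` or higher.

    The threshold of 150 follows the text; treating the feature as a
    ratio of pixel counts is an interpretation of the paper's wording.
    """
    counts = histogram(image, levels)
    n = sum(counts)
    if n == 0:
        raise ValueError("image contains no pixels")
    return sum(counts[threshold:]) / n
```

For a bounding box image, `white_pixel_fraction(image)` returns a single number between 0 and 1 that can serve as the algorithm A feature.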

Segmented region of interest algorithm: A Region of Interest, often abbreviated ROI, is a selected subset of samples within a dataset identified for a particular purpose; on an image, for example, the boundary of an object can define the ROI. An ROI is an area of an image defined for further analysis or processing [11]. It is sometimes of interest to process a single sub-region of an image, leaving other regions unchanged. In this paper, the ROI is selected according to the appearance of PET and Non-PET plastic bottles. PET bottles are transparent with high gloss, clear or colored, have no seams and have an injection molding nub on the bottom, or are opaque with a dull finish. Non-PET bottles are translucent with a matte finish, or not shiny [3]. The whole image of the plastic bottle is segmented into five regions, and the ROI is cropped automatically from the centre of the fifth region. After obtaining the ROI of the two classes, from the intensity theory, the highest intensity is equal to […]. Figure 3 shows all five split regions and the generated ROI for this research. Only this region is taken, in order to avoid the region that is covered by the label and to capture the region that has discriminating value between PET and Non-PET bottles. The intensity image is equivalent to a gray-scale image, and this is the image used in this study. It represents an image as a matrix in which every element has a value corresponding to how bright or dark the pixel at the corresponding position should be colored. Once suitable ROIs have been obtained, the histogram of the pixel intensity values is plotted. In an image processing context, the histogram of an image normally refers to a histogram of the pixel intensity values [12]. This histogram is a graph showing the number of pixels in an image at each different intensity value found in that image.
For an 8-bit grayscale image there are 256 different possible intensities, so the histogram graphically displays 256 numbers showing the distribution of pixels among those grayscale values. The probability of occurrence of gray level $r_k$ in an image is approximated by
$p_r(r_k) = n_k / n$, $k = 0, 1, 2, \ldots, L-1$ (4)
where n is the total number of pixels in the image, $n_k$ is the number of pixels that have gray level $r_k$ and L is the total number of possible gray levels in the image, here equal to 256. From the histogram, we extract the mean and standard deviation over gray levels 1 to 100 of the 256 possible intensities. The mean and standard deviation are the two feature values that are used as input to the Linear Discriminant Analysis (LDA) classifier.
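The ROI selection and the two histogram features can be sketched as below. Several details are assumptions, since the text does not fix them: the five regions are taken to be equal horizontal bands counted from top to bottom, the ROI is taken as the central half of the fifth band in each dimension, and the histogram is renormalized over gray levels 1 to 100 before the mean and standard deviation are computed. All function names are hypothetical.

```python
import math


def split_into_bands(image, bands=5):
    """Split an image (list of rows) into horizontal bands (assumed layout)."""
    height = len(image)
    edges = [i * height // bands for i in range(bands + 1)]
    return [image[edges[i]:edges[i + 1]] for i in range(bands)]


def center_crop(region, frac=0.5):
    """Crop the central part of a region; `frac` is an assumed ROI size."""
    height, width = len(region), len(region[0])
    crop_h = max(1, int(height * frac))
    crop_w = max(1, int(width * frac))
    top = (height - crop_h) // 2
    left = (width - crop_w) // 2
    return [row[left:left + crop_w] for row in region[top:top + crop_h]]


def roi_features(roi, low=1, high=100, levels=256):
    """Mean and standard deviation of gray levels low..high in the ROI.

    Both are computed from the histogram counts, which is equivalent to
    weighting each gray level by its normalized histogram value within
    the restricted range.
    """
    counts = [0] * levels
    for row in roi:
        for value in row:
            counts[value] += 1
    n = sum(counts[low:high + 1])
    if n == 0:
        raise ValueError("no pixels in the requested gray-level range")
    mean = sum(r * counts[r] for r in range(low, high + 1)) / n
    var = sum((r - mean) ** 2 * counts[r] for r in range(low, high + 1)) / n
    return mean, math.sqrt(var)
```

With these helpers, `roi_features(center_crop(split_into_bands(image)[4]))` gives the two-element feature vector for one bottle image under the stated assumptions.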

Linear discriminant analysis: Linear Discriminant Analysis (LDA) is a powerful tool for dimensionality reduction and classification [13,14]. It is also a method for discriminating between two or more groups of samples. The groups to be discriminated can be defined either naturally by the problem under investigation or by some preceding analysis, such as a cluster analysis. In this work, the groups to be discriminated are based on the ROI of the PET and Non-PET plastic bottle appearance. In principle, any mathematical function may be used as a discriminating function. In the case of LDA, a linear function of the following form is used:
$y(x) = w^t x + \omega_0$ (5)
where w is the weight vector and $\omega_0$ is the bias or threshold weight. For a discriminant function of the form of Eq. (5), a two-category classifier implements the following decision rule: decide $\omega_1$ if $y(x) > 0$ and $\omega_2$ if $y(x) < 0$. Thus, x is assigned to $\omega_1$ if the inner product $w^t x$ exceeds the threshold $-\omega_0$, and to $\omega_2$ otherwise. The parameters w and $\omega_0$ have to be determined in such a way that the discrimination between the groups is the best possible. Once a discriminating function that provides satisfactory separation has been found, it can be used to classify unknown objects. All the HIPV extracted from the processed images are used as input to the LDA for classification.
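The text does not state how the weight vector and threshold were estimated. As one common possibility, the sketch below fits a two-class Fisher linear discriminant on two-dimensional feature vectors, assuming equal priors and a pooled within-class scatter matrix; it is an illustration of the technique, not the procedure used in this study, and the function names are hypothetical.

```python
def fit_lda_2d(class1, class2):
    """Fit a two-class Fisher discriminant on 2-D samples.

    Returns (w, w0) such that y(x) = w[0]*x[0] + w[1]*x[1] + w0 is
    positive on the class1 side of the boundary.
    """
    def mean(samples):
        n = len(samples)
        return [sum(s[0] for s in samples) / n, sum(s[1] for s in samples) / n]

    def scatter(samples, m):
        sxx = sum((s[0] - m[0]) ** 2 for s in samples)
        sxy = sum((s[0] - m[0]) * (s[1] - m[1]) for s in samples)
        syy = sum((s[1] - m[1]) ** 2 for s in samples)
        return sxx, sxy, syy

    m1, m2 = mean(class1), mean(class2)
    a1, b1, c1 = scatter(class1, m1)
    a2, b2, c2 = scatter(class2, m2)
    # Pooled within-class scatter matrix [[a, b], [b, c]].
    a, b, c = a1 + a2, b1 + b2, c1 + c2
    det = a * c - b * b
    if det == 0:
        raise ValueError("within-class scatter matrix is singular")
    d0, d1 = m1[0] - m2[0], m1[1] - m2[1]
    # w = S_w^{-1} (m1 - m2), using the closed-form 2x2 inverse.
    w = [(c * d0 - b * d1) / det, (-b * d0 + a * d1) / det]
    # Place the boundary midway between the projected class means.
    w0 = -0.5 * (w[0] * (m1[0] + m2[0]) + w[1] * (m1[1] + m2[1]))
    return w, w0


def classify(x, w, w0):
    """Decision rule from the text: class 1 if y(x) > 0, class 2 otherwise."""
    y = w[0] * x[0] + w[1] * x[1] + w0
    return 1 if y > 0 else 2
```

Scaling w and the threshold by the same positive constant leaves the decisions unchanged, so fitted coefficients may differ from a reported discriminant by a scale factor while describing the same boundary.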

RESULTS AND DISCUSSION
A collection of 300 plastic bottle images constitutes the database from which the input images are drawn. The images are divided into two groups, PET and Non-PET. In this study, two types of feature vector are used as inputs to the linear discriminant classifier: the average white pixel value from algorithm A, and the mean and standard deviation from algorithm B. For algorithm A, the feature is the average white pixel value
$W(r_k) = \dfrac{\sum_{k=150}^{L-1} n_k}{\sum_{k=0}^{L-1} n_k}$ (6)
which is computed as the total count of white pixels, with gray levels from 150 up to the maximum level, divided by the total count of pixels over all gray levels of the gray-level image. Only the average white pixel value is considered in this algorithm, since it is an obvious feature that can discriminate between the two classes of plastic bottles, PET and Non-PET. Figure 4 shows the step-by-step results of obtaining the feature value for plastic bottle classification.
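For algorithm B, the feature vector consists of the mean and standard deviation of the gray levels in the ROI. The mean gray level of the pixels in a region $S_{xy}$ is given by
$m_{S_{xy}} = \sum_{(s,t) \in S_{xy}} r_{s,t}\, p(r_{s,t})$
where $r_{s,t}$ is the gray level at coordinates (s,t) in the neighbourhood and $p(r_{s,t})$ is the neighbourhood normalized histogram component corresponding to that gray level. The gray-level variance of the pixels in the region $S_{xy}$, whose square root is the standard deviation, is given by
$\sigma^2_{S_{xy}} = \sum_{(s,t) \in S_{xy}} [r_{s,t} - m_{S_{xy}}]^2\, p(r_{s,t})$
The local mean is a measure of the average gray level, here restricted to gray levels between 1 and 100, in the neighbourhood $S_{xy}$, and the standard deviation is a measure of contrast in that neighbourhood. Figures 5 and 6 display the step-by-step results obtained from the pre-processing and automatically detected ROI implementations for the two categories of plastic bottles.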
Based on the explanation above, the LDA discriminant for the two-class problem, PET or Non-PET ($\omega_1$ or $\omega_2$), is $y(x) = w^t x + \omega_0$. About 100 images each from the two classes of plastic […] Fig. 4 above. Testing gave the discriminating functions $-6.742900x_1 - 24.211400x_2 + 15.000000 = 0$ for algorithm A and $2.374290x_1 + 3.073220x_2 - 1.000000 = 0$ for algorithm B, and the results are reported in Table 2. From the classification results (Table 2), it can be concluded that:
• The bounding box image algorithm (algorithm A) extracts features from the whole plastic bottle image. The label and the bottle cap introduce noise into the image, which affects the resulting feature vector
• The segmented region of interest algorithm finds exactly the region that has discriminating value between the two classes of plastic bottle
• The segmented region of interest algorithm gives better results than the bounding box image algorithm
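To show how the reported discriminating functions would be applied, the following minimal sketch evaluates them on feature vectors and computes the fraction of correct decisions. The coefficients are those quoted above; the sample feature values in any usage are hypothetical, the names `decide` and `accuracy` are illustrative, and the text does not state which of the two classes corresponds to PET, so the classes are simply numbered 1 and 2.

```python
# Coefficients (w1, w2) and threshold weight w0 as quoted in the text.
ALGORITHM_A = ((-6.742900, -24.211400), 15.000000)
ALGORITHM_B = ((2.374290, 3.073220), -1.000000)


def decide(x, params):
    """Return class 1 if y(x) = w1*x1 + w2*x2 + w0 > 0, else class 2."""
    (w1, w2), w0 = params
    y = w1 * x[0] + w2 * x[1] + w0
    return 1 if y > 0 else 2


def accuracy(samples, labels, params):
    """Fraction of samples whose decision matches the given label."""
    if not samples:
        raise ValueError("no samples given")
    correct = sum(1 for x, label in zip(samples, labels)
                  if decide(x, params) == label)
    return correct / len(samples)
```

Given feature vectors and ground-truth class numbers for a test set, `accuracy(samples, labels, ALGORITHM_B)` would give the kind of percentage of correct classification that the two algorithms are compared on.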

CONCLUSION
This research has considered two algorithms for extracting features from plastic bottle images and has used the histogram of intensity to classify plastic bottles as PET or Non-PET. The results revealed that the segmented region of interest algorithm is the better feature extraction algorithm, with a percentage of correct classification slightly higher than that of the bounding box image algorithm. It can be concluded that the features extracted by the segmented ROI algorithm separate the bottles distinctly as either PET or Non-PET, making it easier for the linear discriminant function in statistical pattern recognition to perform its task.