Fish Recognition Based on Robust Features Extraction from Size and Shape Measurements Using Neural Network

Problem statement: Image recognition is a challenging problem that researchers have studied for a long time, especially in recent years, because of distortion, noise, segmentation errors, overlap and occlusion of objects in digital images. Pattern recognition appears in many fields, for example fingerprint verification, face recognition, iris discrimination, chromosome shape discrimination, optical character recognition, texture discrimination and speech recognition. A system for recognizing an isolated pattern of interest may serve as an approach for such applications. Scientists and engineers with interests in image processing and pattern recognition have developed various approaches to digital image recognition problems, such as neural networks, contour matching and statistics. Approach: In this study, our aim was to recognize an isolated pattern of interest in an image based on a combination of robust extracted features. These features depend on size and shape measurements, which were obtained from distance and geometrical measurements. Results: We presented a system prototype for dealing with this problem. The system starts by acquiring an image containing a fish pattern; feature extraction is then performed relying on size and shape measurements. The system was applied to 20 different fish families, each with a different number of fish types, and our sample consists of 350 distinct fish images. These images were divided into two datasets: 257 training images and 93 testing images. The overall accuracy obtained using the neural network with the back-propagation algorithm was 86% on the test dataset. Conclusion: We developed a classifier for fish image recognition and chose a feature extraction method that fits our demands. The classifier was designed and implemented successfully and performed efficiently. It categorizes a given fish into its cluster, categorizes the clustered fish as poison or non-poison, and categorizes the poison and non-poison fish into their families.


INTRODUCTION
Recently, a lot of work has come to depend on computers in order to reduce processing time and to provide more accurate results. Such work relies on different types of data, such as digital images, characters and digits, and aims to automate systems such as fingerprint verification, face recognition, iris discrimination, chromosome shape discrimination, optical character recognition, texture discrimination and speech recognition. An automatic fish image recognition system is proposed in this study. Digital image recognition has been studied extensively because of its importance in several fields, and scientists and engineers have developed various approaches in image processing and pattern recognition to solve this problem (Al-Omari et al., 2009;Chen et al., 2001). In this study, a system for recognizing fish images is built, which may benefit various fields. The system focuses on an isolated pattern of interest: the input is an image of a specific size and format, the image is processed, and the system then assigns the given fish to its cluster, categorizes the clustered fish as poison or non-poison and categorizes the non-poison fish into its family.
The proposed system recognizes an isolated fish pattern: it acquires an image containing a fish pattern, and the image then passes through several phases, such as preprocessing and feature extraction, before the fish pattern is recognized. A neural network is used for the recognition phase.

Problem statement:
The problem statement of this study is drawn from previous studies. Several efforts have been devoted to the recognition of digital images, but it remains an unresolved problem because of distortion, noise, segmentation errors, overlap and occlusion of objects in color images (Bai et al., 2008;Kim and Hong, 2009). Recognition and classification techniques have gained a lot of attention in recent years, and many scientists use them to advance their fields. Fish recognition and classification is still an active area in the agriculture domain and is considered promising research in using existing technology to push agricultural research ahead. Although advances have been made in real-time data collection and in improving range resolution (Patrick et al., 1991;Nery et al., 2005), existing systems are still limited in their ability to detect or classify fish, despite the widespread development of computers and software. Many people die every day because they cannot distinguish between poison and non-poison fish. The object classification problem lies at the core of the task of estimating the prevalence of each fish species. A solution to the automatic classification of fish should address the following issues as appropriate:
• Arbitrary fish size and orientation; fish size and orientation are unknown a priori and can be totally arbitrary
• Feature variability; some features may present large differences among different fish species
• Environmental changes; variations in illumination parameters, such as power and color, and in water characteristics, such as turbidity and temperature, are not uncommon. The environment can be either outdoor or indoor
• Poor image quality; the image acquisition process can be affected by noise from various sources as well as by distortions and aberrations in the optical system
• Segmentation failures; due to its inherent difficulty, segmentation may become unreliable or fail completely
The vast majority of research on fish classification shares a basic problem: it typically uses small groups of features without a thorough prior analysis of the individual impact of each feature on classification accuracy (Alsmadi et al., 2009;Lee et al., 2008;Tsai and Lee, 2002).

Related study:
Selecting suitable variables is a critical step for the successful implementation of image classification. Many potential variables, such as shape and texture, may be used in image classification, and they are obtained through the feature extraction process. The purpose of feature extraction is to determine the most relevant and most compact representation of the image characteristics in order to minimize the within-class pattern variability while enhancing the between-class pattern variability. There are two categories of features: statistical features and structural features. Feature extraction is a major process in image analysis. An image feature is an attribute of an image, and image features can be classified into two types: natural and artificial. Natural features are defined by the visual appearance of an image, such as the luminance of a region (Wang et al., 2005), whilst artificial features are obtained from manipulations of an image, such as the image amplitude histogram and filters (Petrou and Kadyrov, 2001). Image analysis requires image features that capture the characteristics of the depicted objects so that they are invariant to the way the objects are presented in the image. Historically, the process of extracting image features has been anthropocentric: the features calculated are defined in a way that captures the attributes the human vision system would recognize in the image. Thus, features like compactness and brightness have some physical and perceptual meaning. It is not, however, necessary for features to be meaningful to human perception in order to characterize an object well. Indeed, features that go beyond human perception may prove more appropriate for characterizing complex structures, such as the objects one often wishes to identify in an image (Sze et al., 1999).
Zion et al. (1999) proposed a classifier based on color and shape features of fish to deal with the shape-based retrieval problem. They noted the necessity of using the shape and color of fish to search the fish database of Taiwan. The developed technique can perform scale- and rotation-invariant matching between two fish. A target object selected by a bounding rectangle is first processed by a foreground/background separation step. The target object (foreground part) is then converted into a Curvature Scale Space (CSS) map. To perform rotation-invariant matching, the authors further convert the CSS map into a Circular Vector (CV) map and then find its representative vector based on the concept of force equilibrium. After the representative vector is rotated into the canonical orientation, every unknown object can be compared with the model objects efficiently. An image-processing algorithm developed by Zion et al. (1999) and Shutler and Nixon (2001) has been used to discriminate between images of three fish species for use on freshwater fish farms. Zernike velocity moments were developed by Dudani et al. (1977) to describe an object using not only its shape but also its motion throughout an image, as claimed by Mercimekm et al. (2005). Classification is the final stage of any image-processing system, where each unknown pattern is assigned to a category. The degree of difficulty of the classification problem depends on the variability in feature values for objects in the same category, relative to the difference between feature values for objects in different categories. Mercimekm et al. (2005) and Lee et al. (2008) proposed shape analysis of fish images to deal with the fish classification problem. A new shape analysis algorithm was developed for removing edge noise and redundant data points such as short straight lines. A curvature function analysis was used to locate critical landmark points.
The fish contour segments of interest were then extracted based on the landmark points for species classification, which was done by comparing individual contour segments with the curves in the database. Regarding the feature extraction process, the authors addressed the following: fish contour extraction; fish detection and tracking; shape measurement and description (i.e., shape characters (features), anal and caudal fins and size); data reduction; landmark points; and landmark point statistics (i.e., curve segments of interest). In their study, they chose nine fish species with similar shape characters, and the total number of features was nine. They also recommended the decision tree as a suitable method for obtaining highly accurate results on fish images based on commonly used characters such as the caudal, anal and adipose fins. Furthermore, the authors claimed that the number of shape characters to use, and how to use them, depends on the number and kind of species the system is required to classify. Their experiments covered 22 fish images belonging to 9 species, and the detection rate of the classification process was 90%.

MATERIALS AND METHODS
This study focused on five hundred fish images collected from the Global Information System (GIS) on Fishes (fish-base) and from the Department of Fisheries, Malaysia Ministry of Agriculture and Agro-based Industry, in Putrajaya, Malaysia. The database currently contains 500 fish images. Data acquired on 22 August 2008 are used.
The feature selection approach: Feature extraction refers to a process by which fish attributes are computed and collected from size and shape measurements using distance and geometrical tools. The goal of feature extraction is to determine the largest set of features.

Anchor/landmark points location detection:
The size and shape measurements require a number of anchor/landmark points to be determined, as labeled in Fig. 1. Anchor/landmark point detection has been the goal of several works during the last few years. The aim of point detection is to find a relevant set of points that serve as anchor points for the patterns of interest. The goal of anchor point detection in our study is to determine seventeen labeled points that give the location of each feature used for fish recognition. These points are then used to calculate the feature geometry (distance and angle tools) for recognition, as described later in this study.
After detecting the anchor/landmark points in the image, we can extract the features from the size and shape measurements. Shape measurements: Shape measurements use the external contour and edge detection of each fish pattern to determine significantly similar parts, such as the tail shape. Furthermore, with the distance and angle tools, the following features can be determined: the size of the mouth, the angle of the head, the caudal fin length, the dorsal fin length, the caudal angle and the angle between the mouth and the eye. In addition, dividing the fish into two parts can be a significant step toward high fish classification accuracy. As shown in Fig. 1a and b, two vectors are drawn based on the maximum and minimum points on the x-axis and the y-axis, and the triangles are completed by connecting the maximum and minimum points on the x-axis with the maximum and minimum points on the y-axis. This supports the classification process through the calculation of the angles of the vectors between three points.
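The selection of the maximum and minimum points on the x-axis and y-axis can be illustrated with a short Python sketch. This is only an illustration under stated assumptions: the study itself was implemented in MATLAB, and the function name `extreme_points`, the contour representation (a list of (x, y) pixel coordinates) and the sample coordinates are all hypothetical.

```python
def extreme_points(contour):
    """Return the contour points with the smallest and largest x and y.

    `contour` is assumed to be a non-empty list of (x, y) tuples.
    When several points tie, the first one in the list is returned.
    """
    return {
        "x_min": min(contour, key=lambda p: p[0]),
        "x_max": max(contour, key=lambda p: p[0]),
        "y_min": min(contour, key=lambda p: p[1]),
        "y_max": max(contour, key=lambda p: p[1]),
    }


# Hypothetical contour points, for illustration only.
contour = [(0, 5), (4, 9), (10, 6), (5, 1), (3, 4)]
ext = extreme_points(contour)
print(ext)
```

The four returned points could then be connected to form the triangles whose angles are used as features.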
Distance measurements: Distance is a numerical description of how far apart objects are at any given moment. In physics or everyday discussion, distance may refer to a physical length, a period of time, or an estimation based on other criteria (e.g., "two counties over"). In mathematics, distance must meet more rigorous criteria.
In neutral geometry, the minimum distance between two points is the length of the line segment between them.
In analytic geometry, the distance 'd' between the points A = (x_1, y_1) and B = (x_2, y_2) is given by the formula:

$$d = \sqrt{(x_2 - x_1)^2 + (y_2 - y_1)^2} \quad (1)$$

Table 1: Distance measurement features (partial)
Feature | Description | Measurement
- | Distance between the right-end of mouth and the eye center | Dist (P5, P15)
D7 | Distance between the right-end of mouth and the start of dorsal fin | Dist (P5, P3)
D8 | Distance between pelvic fin and the right-end of mouth | Dist (P4, P5)
D9 | Anal fin length | Dist (P10, P11)
D10 | Pelvic fin length | Dist (P12, P4)

Fig. 2: The angle between two vectors
Similarly, given the points (x_1, y_1, z_1) and (x_2, y_2, z_2), the distance between them is given by the formula:

$$d = \sqrt{(x_2 - x_1)^2 + (y_2 - y_1)^2 + (z_2 - z_1)^2} \quad (2)$$

The distance calculations are listed in Table 1 and refer to the ten landmark points in Fig. 1a, which shows the distances between the mass points listed in Table 1. This distance measurement category produces ten features.
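The distance features can be computed with a few lines of Python. The sketch below is illustrative only: the landmark coordinates are invented, the helper name `dist` is hypothetical, and only the landmark pairs that are legible in Table 1 (D7 to D10) are used.

```python
import math


def dist(p, q):
    """Euclidean distance between two points of equal dimension (2-D or 3-D)."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


# Hypothetical landmark coordinates in pixels; the labels follow Table 1.
landmarks = {
    "P3": (120.0, 40.0),
    "P4": (90.0, 110.0),
    "P5": (30.0, 80.0),
    "P10": (150.0, 115.0),
    "P11": (180.0, 105.0),
    "P12": (100.0, 125.0),
}

# Landmark pairs as listed in Table 1 for features D7 to D10.
pairs = {
    "D7": ("P5", "P3"),
    "D8": ("P4", "P5"),
    "D9": ("P10", "P11"),
    "D10": ("P12", "P4"),
}

features = {name: dist(landmarks[a], landmarks[b]) for name, (a, b) in pairs.items()}
for name, value in features.items():
    print(name, round(value, 2))
```

Because `dist` iterates over coordinate pairs, the same function covers both the two-dimensional and the three-dimensional formula.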
Calculate the angles: An angle can be defined as two rays or two line segments that share a common endpoint, known as the vertex. An angle is formed when two rays meet at the same endpoint. The angle between two vectors, shown in Fig. 2, can be written as ∠ABC or ∠CBA. It can also be written as ∠B, which names the vertex (the common endpoint of the two rays).
The distance formula given previously can be used to find the distances between the three points A, B and C. Once the side measurements are known, the internal angle θ can be found as well. Because the angle θ is unknown, the cosine rule is used to find it. This is expressed by an angular separation formula that gives the cosine of the angle between two vectors. From vector algebra, the cosine of the angle between two vectors u = (u_x, u_y) and v = (v_x, v_y) is their dot product divided by the product of their lengths, as shown in Fig. 2:

$$\cos\theta = \frac{\mathbf{u} \cdot \mathbf{v}}{\lVert\mathbf{u}\rVert\,\lVert\mathbf{v}\rVert} \quad (3)$$

The length of a vector (also known as its modulus) is the square root of the sum of the squares of its coordinates:

$$\lVert\mathbf{u}\rVert = \sqrt{u_x^2 + u_y^2} \quad (4)$$

Putting the two together, we get:

$$\theta = \arccos\left(\frac{u_x v_x + u_y v_y}{\sqrt{u_x^2 + u_y^2}\,\sqrt{v_x^2 + v_y^2}}\right) \quad (5)$$

Finally, the obtained angle is converted into degrees as follows:

$$\text{Angle degrees} = \theta \cdot \frac{180}{\pi} \quad (6)$$

Table 2 shows the five angle features calculated from the anchor/landmark points in Fig. 1.
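The angle calculation can be sketched in Python as follows. The function name `angle_degrees` and the sample points are hypothetical; the sketch assumes the angle ∠ABC is measured at the vertex B between the vectors from B to A and from B to C, and it clamps the cosine to [-1, 1] to guard against floating-point rounding.

```python
import math


def angle_degrees(a, b, c):
    """Angle ABC (at vertex b) in degrees, from 2-D points a, b and c."""
    # Vectors from the vertex b to the two other points.
    ux, uy = a[0] - b[0], a[1] - b[1]
    vx, vy = c[0] - b[0], c[1] - b[1]
    dot = ux * vx + uy * vy
    norm = math.hypot(ux, uy) * math.hypot(vx, vy)
    if norm == 0:
        raise ValueError("the vertex must differ from both other points")
    # Clamp to the valid domain of acos to absorb rounding errors.
    cos_theta = max(-1.0, min(1.0, dot / norm))
    theta = math.acos(cos_theta)
    # Convert radians to degrees.
    return theta * (180.0 / math.pi)


# Hypothetical points forming a right angle at the origin.
print(angle_degrees((1.0, 0.0), (0.0, 0.0), (0.0, 1.0)))
```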

Neural network model:
A multilayer feed-forward neural network trained with the back-propagation algorithm is employed for the classification task, as shown in Fig. 3. The implemented network contains three layers: the input layer, the hidden layer and the output layer. The number of neurons is varied from one layer to another (except in the output layer, which has only one neuron) in order to determine a suitable number of neurons for the input and hidden layers and thereby obtain highly accurate results.

Fig. 3: Multilayer feed forward neural network model
The developed neural network was trained to a Termination Error (TE) of 0.01 in 411 epochs, with a Learning Rate (LR) of 0.1. In our experiment, we built the neural network with the number of input features, three hidden layers and different numbers of neurons in order to achieve our goal. Table 3 shows the number of input features and the number of neurons for each layer, which were determined experimentally.
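A minimal NumPy sketch of back-propagation training with a termination error and a learning rate is given below for illustration. It is not the implementation used in this study (which was written in MATLAB): it assumes a single hidden layer, sigmoid activations, a mean squared error criterion, full-batch updates and synthetic data with 15 inputs and one output neuron. The function name `train_mlp` and all parameter names are hypothetical.

```python
import numpy as np


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def train_mlp(X, y, n_hidden=8, lr=0.1, target_error=0.01, max_epochs=5000, seed=0):
    """Train a one-hidden-layer network with full-batch back-propagation.

    Training stops when the mean squared error reaches `target_error`
    or after `max_epochs` epochs, whichever comes first.
    """
    rng = np.random.default_rng(seed)
    W1 = rng.normal(0.0, 0.5, size=(X.shape[1], n_hidden))
    b1 = np.zeros(n_hidden)
    W2 = rng.normal(0.0, 0.5, size=(n_hidden, 1))
    b2 = np.zeros(1)
    history = []
    for _ in range(max_epochs):
        # Forward pass.
        h = sigmoid(X @ W1 + b1)
        out = sigmoid(h @ W2 + b2)
        err = out - y
        mse = float(np.mean(err ** 2))
        history.append(mse)
        if mse <= target_error:
            break
        # Backward pass: gradients of 0.5 * MSE through the sigmoids.
        d_out = err * out * (1.0 - out)
        d_h = (d_out @ W2.T) * h * (1.0 - h)
        # Gradient-descent updates, averaged over the batch.
        W2 -= lr * (h.T @ d_out) / len(X)
        b2 -= lr * d_out.mean(axis=0)
        W1 -= lr * (X.T @ d_h) / len(X)
        b1 -= lr * d_h.mean(axis=0)
    return (W1, b1, W2, b2), history


# Synthetic data for illustration only: 40 samples with 15 features each.
data_rng = np.random.default_rng(1)
X = data_rng.random((40, 15))
y = (X[:, :1] > 0.5).astype(float)

params, history = train_mlp(X, y, max_epochs=500)
print(len(history), history[0], history[-1])
```

The returned `history` holds the error per epoch, so it can be used to check how many epochs were needed to reach the termination error.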
Experimental result: As shown in Fig. 4, the recognition accuracy on the test set for each of the 20 fish families, based on the size and shape measurements, varies from one family to another. The results indicate a high recognition accuracy for each fish family, ranging from a minimum of 75% to a maximum of 97%. Some results close to the minimum (e.g., Sillaginidae) arise because the family shares common features with another family (e.g., Stromateidae), which confuses the neural network. On the other hand, some families share features with each other but each has its own species-specific traits. This makes it easier for the neural network to recognize the respective family. For example, some poison fish have the same tail angle as some non-poison fish but differ in features such as the length of the dorsal fin and the distance between the pelvic fin and the right end of the mouth. The same holds for the non-poison fish: for example, the size of the mouth, the anal fin length and the distance between the right end of the mouth and the dorsal fin usually differ from one family to another. As shown in Fig. 4, the poison fish families are recognized with high accuracy because of their species-specific traits, unlike the non-poison fish families. The results for the poison fish families lie between 91 and 94%.
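A per-family accuracy of the kind plotted in Fig. 4 can be computed as in the following Python sketch. The labels and predictions are invented for illustration and do not reproduce the reported results; the function name `per_class_accuracy` is hypothetical.

```python
from collections import defaultdict


def per_class_accuracy(y_true, y_pred):
    """Fraction of correctly recognized samples for each true class."""
    total = defaultdict(int)
    correct = defaultdict(int)
    for t, p in zip(y_true, y_pred):
        total[t] += 1
        if t == p:
            correct[t] += 1
    return {c: correct[c] / total[c] for c in total}


# Invented labels for two of the families named in the text.
y_true = ["Sillaginidae"] * 4 + ["Stromateidae"] * 2
y_pred = ["Sillaginidae"] * 3 + ["Stromateidae"] * 3

acc = per_class_accuracy(y_true, y_pred)
print(acc)
```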

RESULTS
The methods were implemented in MATLAB on a Core 2 Duo 2.33 GHz CPU. We considered images of different fish families, obtained from the Global Information System (GIS) on Fishes (fish-base) and the Department of Fisheries. For experimentation, 500 fish images were considered: 350 for training and the remaining 150 for testing. Table 4 describes the overall training and testing accuracy obtained with the neural network based on robust features extracted from size and shape measurements.
In addition, the problem in fish recognition is to find meaningful features based on image segmentation and feature extraction. An efficient classifier that produces a better recognition accuracy rate for fish images is also required. As shown in Table 4, the overall training accuracy is 89% and the overall testing accuracy is 86%.
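A partition of 500 images into 350 training and 150 testing images can be sketched in Python as follows. The text does not state how the images were assigned to the two sets, so the seeded random shuffle used here is an assumption; the image identifiers and the function name `split_dataset` are hypothetical.

```python
import random


def split_dataset(items, n_train, seed=0):
    """Shuffle a copy of `items` and split it into training and testing parts."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    return shuffled[:n_train], shuffled[n_train:]


# Hypothetical image identifiers standing in for the 500 fish images.
image_ids = [f"fish_{i:03d}" for i in range(500)]
train_ids, test_ids = split_dataset(image_ids, n_train=350)
print(len(train_ids), len(test_ids))
```

Fixing the seed keeps the split reproducible across runs, which makes training and testing accuracies comparable between experiments.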

DISCUSSION
Feature extraction is based on size and shape measurements and uses a local geometric approach built on distance and angle measurements. This yields 10 features that rely on distance measurements and 5 features that rely on angle measurements. We determined 18 anchor/landmark points on the shape of the pattern of interest (fish): 4 landmark/anchor points were determined automatically by our program (feature extractor), while 14 landmark/anchor points were extracted manually. Only one fish-based study reported in the literature extracted features using distance measurements; in our work, we increased the number of features extracted using distance measurements. In addition, we added (for the first time in fish classification) the angle measurements and the division of the pattern of interest (fish) into two triangles. The main advantage of the local geometric approach is that it is less affected by global changes in the appearance of fish images, including fish expression. Nevertheless, this approach has received little attention because it requires an additional step of reliably locating fish landmarks/anchor points, which may affect overall performance (Gupta et al., 2007;Lee et al., 2008).

CONCLUSION
Eighteen feature representations were extracted from eighteen detected landmark points, as shown in the second section of the study. All features were obtained from size and shape measurements of fish images through angle and distance measurements. Our experimental results suggest that our feature selection methodology can be used to significantly improve the performance of fish classification systems. Unlike previous approaches, which propose descriptors without analyzing their impact on the classification task as a whole, we propose a general set of 18 features and their corresponding weights, which may be used as a priori information by the classifier. Moreover, our study presents a novel set of features extracted from size and shape measurements. The overall accuracy for NN classification was 86%.