The Performance of Maximum Likelihood, Spectral Angle Mapper, Neural Network and Decision Tree Classifiers in Hyperspectral Image Analysis

Abstract: Several pattern recognition classification algorithms were tested for mapping tropical forest cover using airborne hyperspectral data. Results from the Maximum Likelihood (ML), Spectral Angle Mapper (SAM), Artificial Neural Network (ANN) and Decision Tree (DT) classifiers were compared and evaluated. ML performed best, followed by ANN, DT and SAM, with overall accuracies of 86%, 84%, 51% and 49% respectively.


INTRODUCTION
Remote sensing is increasingly applied to forest monitoring and inventory, where it is seen as a cost-effective source of information for the practice of sustainable forest management. Over the past few decades, the emergence of hyperspectral sensors, which acquire data in a larger number of spectral bands and at higher spectral resolution, has had a significant impact on our ability to map forests. With hyperspectral scanners onboard satellites (EO-1, Orbview-4) and currently on airborne platforms (e.g. AISA, AVIRIS, CASI, HYMAP), hyperspectral applications will certainly raise various research issues pertaining to their use. Although some applications have proven successful in the inventory of temperate forests [1], doubts have been raised about the ability of these sensors to discriminate effectively among the rich diversity of flora in tropical forests. One way to examine this issue is to assess the effectiveness of several classification algorithms in classifying hyperspectral data of tropical forest.
Several studies on the classification of hyperspectral remote sensing data have been reported in the literature, for example [2,3]. Common classifiers include the statistically based Maximum Likelihood (ML) technique and the Artificial Neural Network (ANN), which is a non-parametric classification algorithm. The Decision Tree (DT) classifier is also non-parametric, and its performance depends on several factors such as the choice of pruning method and the type of tree-growing algorithm. The Spectral Angle Mapper (SAM) classifier is based on spectral matching, in which the spectral similarity between reference and target spectra is used for classification. In this study, the performance of these classifiers is assessed for the mapping of Malaysian tropical forest using hyperspectral data.
Classification Algorithms: A brief description of the ML, ANN, DT and SAM classifiers is given below.

Maximum Likelihood (ML) classifier:
The ML classifier assumes that the statistics for each class in each band are normally distributed and calculates the probability that a given pixel belongs to a specific class. Each pixel is assigned to the class for which it has the highest probability. Unless a probability threshold is selected, all pixels are classified; if a threshold is set and the highest probability falls below it, the pixel remains unclassified. ML classification evaluates the following discriminant function for each pixel in the image [4]:
$$g_i(x) = \ln p(\omega_i) - \tfrac{1}{2}\ln|\Sigma_i| - \tfrac{1}{2}(x - m_i)^{T}\,\Sigma_i^{-1}\,(x - m_i)$$
where $i$ is the class index; $x$ is the $n$-dimensional pixel vector ($n$ being the number of bands); $p(\omega_i)$ is the probability that class $\omega_i$ occurs in the image, assumed to be the same for all classes; $|\Sigma_i|$ is the determinant of the covariance matrix of the data in class $\omega_i$; $\Sigma_i^{-1}$ is its inverse; and $m_i$ is the mean vector of class $\omega_i$.
Implementing ML classification involves estimating the class mean vectors and covariance matrices from training patterns chosen from known examples of each class.
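As an illustration only, the discriminant function above can be sketched in Python with NumPy. This is not the implementation used in the study: the function names (`train_ml`, `discriminant`, `classify_ml`) are hypothetical, equal priors are assumed as stated in the text, and the optional threshold is applied to the discriminant value rather than to a probability.

```python
import numpy as np

def train_ml(samples_by_class):
    """Estimate the mean vector and covariance matrix of each class.

    samples_by_class maps a class label to an array of shape
    (n_pixels, n_bands) holding that class's training spectra.
    """
    stats = {}
    for label, samples in samples_by_class.items():
        samples = np.asarray(samples, dtype=float)
        mean = samples.mean(axis=0)              # m_i
        cov = np.cov(samples, rowvar=False)      # Sigma_i, (n_bands, n_bands)
        stats[label] = (mean, cov)
    return stats

def discriminant(x, mean, cov, prior):
    """g_i(x) = ln p(w_i) - 1/2 ln|S_i| - 1/2 (x - m_i)^T S_i^-1 (x - m_i)."""
    diff = np.asarray(x, dtype=float) - mean
    _, logdet = np.linalg.slogdet(cov)           # ln|Sigma_i|
    maha = diff @ np.linalg.solve(cov, diff)     # (x - m)^T Sigma^-1 (x - m)
    return np.log(prior) - 0.5 * logdet - 0.5 * maha

def classify_ml(x, stats, threshold=None):
    """Assign x to the class with the largest discriminant value.

    Returns None (unclassified) when a threshold is given and the best
    discriminant value falls below it. Equal priors are assumed.
    """
    prior = 1.0 / len(stats)
    scores = {label: discriminant(x, mean, cov, prior)
              for label, (mean, cov) in stats.items()}
    best = max(scores, key=scores.get)
    if threshold is not None and scores[best] < threshold:
        return None
    return best
```

Each class needs more training pixels than bands, otherwise the estimated covariance matrix is singular and the determinant and inverse in the discriminant are undefined.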

Artificial Neural Network (ANN) classifier:
In this study, a multi-layered feed-forward ANN is used to perform a non-linear classification. This is the most widely used model, and its design consists of one input layer, at least one hidden layer and one output layer. The algorithm is a promising technique for situations such as non-normality, complex feature spaces and multivariate data types, where traditional methods fail to give accurate results [5]. The most notable features of a neural network [6,7,8,9], which motivate its adoption in this study, are its robustness when presented with partially incomplete or incorrect input patterns and its ability to generalize from the input. The technique uses standard back-propagation for supervised learning. The number of hidden layers and the choice between a logistic and a hyperbolic activation function can be set by the user. Learning occurs by adjusting the weights in each node to minimize the difference between the output node activation and the desired output. The error is back-propagated through the network, and the weights are adjusted using a recursive method. The multi-layer perceptron model with error-minimizing back-propagation learning was applied in this study, using a set of structures and training parameters selected as optimal.
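The back-propagation procedure described above can be sketched as follows. This is a minimal illustration, not the network used in the study: a single hidden layer, the logistic activation, a squared-error criterion, and the hyperparameter values (`n_hidden`, `rate`, `epochs`, `seed`) are all assumptions chosen for the example, and the function names are hypothetical.

```python
import numpy as np

def logistic(z):
    """Logistic activation function."""
    return 1.0 / (1.0 + np.exp(-z))

def train_mlp(X, T, n_hidden=4, rate=0.5, epochs=500, seed=0):
    """Train a one-hidden-layer perceptron with batch back-propagation.

    X has shape (n_samples, n_bands); T holds the desired outputs with
    shape (n_samples, n_outputs). Returns the weights and the mean
    squared error recorded at each epoch.
    """
    rng = np.random.default_rng(seed)
    W1 = rng.normal(0.0, 0.5, (X.shape[1], n_hidden))
    b1 = np.zeros(n_hidden)
    W2 = rng.normal(0.0, 0.5, (n_hidden, T.shape[1]))
    b2 = np.zeros(T.shape[1])
    errors = []
    for _ in range(epochs):
        # Forward pass: input -> hidden -> output.
        H = logistic(X @ W1 + b1)
        Y = logistic(H @ W2 + b2)
        errors.append(float(np.mean((Y - T) ** 2)))
        # Backward pass: propagate the output error to the hidden layer.
        delta_out = (Y - T) * Y * (1.0 - Y)
        delta_hid = (delta_out @ W2.T) * H * (1.0 - H)
        # Gradient-descent weight adjustment.
        W2 -= rate * H.T @ delta_out
        b2 -= rate * delta_out.sum(axis=0)
        W1 -= rate * X.T @ delta_hid
        b1 -= rate * delta_hid.sum(axis=0)
    return (W1, b1, W2, b2), errors

def predict(params, X):
    """Return output node activations for the pixels in X."""
    W1, b1, W2, b2 = params
    return logistic(logistic(X @ W1 + b1) @ W2 + b2)
```

For multi-class mapping, `T` would typically hold one output node per class, and a pixel would be assigned to the class whose output node has the highest activation.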

Decision Tree (DT) classifier:
The decision tree approach to classification in remote sensing has several advantages, notably the ease with which key explanatory variables can be identified [10]. As a data mining method, a decision tree learns from a given data set and formulates explicit rules to classify, segment or make predictions about a target variable [11,12,13,14]. Decision trees share the advantages of neural networks over traditional probabilistic algorithms: they are strictly non-parametric, free from distribution assumptions, able to deal with non-linear relations, insensitive to missing values and capable of handling both numerical and categorical inputs [15]. The classification and regression tree (CART), a univariate tree with binary splits, was used in this study.
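To make the idea of a univariate binary split concrete, the sketch below searches for the single band and threshold that best separate a set of labelled pixels. It assumes the Gini impurity, which is the usual CART splitting criterion for classification; the text does not state which criterion or pruning method was used in the study, and the function names are hypothetical.

```python
def gini(labels):
    """Gini impurity of a list of class labels (0.0 means pure)."""
    n = len(labels)
    if n == 0:
        return 0.0
    counts = {}
    for y in labels:
        counts[y] = counts.get(y, 0) + 1
    return 1.0 - sum((c / n) ** 2 for c in counts.values())

def best_split(pixels, labels):
    """Return (band, threshold, weighted_gini) of the best binary split.

    Each candidate split tests one band against one threshold
    (univariate), sending pixels with value <= threshold to the left
    child and the rest to the right child (binary).
    """
    n = len(labels)
    best = (None, None, gini(labels))
    for band in range(len(pixels[0])):
        values = sorted(set(p[band] for p in pixels))
        # Candidate thresholds lie midway between adjacent distinct values.
        for lo, hi in zip(values, values[1:]):
            threshold = (lo + hi) / 2.0
            left = [y for p, y in zip(pixels, labels) if p[band] <= threshold]
            right = [y for p, y in zip(pixels, labels) if p[band] > threshold]
            score = (len(left) * gini(left) + len(right) * gini(right)) / n
            if score < best[2]:
                best = (band, threshold, score)
    return best
```

A full tree is grown by applying `best_split` recursively to the left and right subsets until a stopping rule is met, after which the tree is usually pruned.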

Spectral Angle Mapper (SAM):
The Spectral Angle Mapper (SAM) is a physically based spectral classification that uses an n-dimensional angle to match pixels to reference spectra [16]. The algorithm determines the spectral similarity between two spectra by calculating the angle between them, treating them as vectors in a space with dimensionality equal to the number of bands. When used on calibrated reflectance data, this technique is relatively insensitive to illumination and albedo effects. SAM computes the angle between each endmember (reference) spectrum vector and each pixel vector in n-dimensional space. Smaller angles represent closer matches to the reference spectrum. Pixels whose angle exceeds the specified maximum angle threshold (in radians) are not classified.
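The spectral angle between a pixel vector $t$ and a reference vector $r$ is $\alpha = \cos^{-1}\big(t \cdot r / (\lVert t \rVert\, \lVert r \rVert)\big)$. A minimal sketch of this calculation and of the thresholded assignment is given below; the function names and the default `max_angle` of 0.1 radians are illustrative assumptions, not values taken from the study.

```python
import math

def spectral_angle(pixel, reference):
    """Angle in radians between two spectra treated as vectors.

    Both spectra must have the same number of bands and must not be
    all zeros.
    """
    dot = sum(p * r for p, r in zip(pixel, reference))
    norm_p = math.sqrt(sum(p * p for p in pixel))
    norm_r = math.sqrt(sum(r * r for r in reference))
    cos_angle = dot / (norm_p * norm_r)
    # Guard against rounding pushing the cosine just outside [-1, 1].
    cos_angle = max(-1.0, min(1.0, cos_angle))
    return math.acos(cos_angle)

def classify_sam(pixel, references, max_angle=0.1):
    """Return the name of the closest reference spectrum, or None.

    references maps a class name to its reference (endmember) spectrum.
    A pixel whose smallest angle exceeds max_angle stays unclassified.
    """
    angles = {name: spectral_angle(pixel, ref)
              for name, ref in references.items()}
    best = min(angles, key=angles.get)
    return best if angles[best] <= max_angle else None
```

Because only the direction of the vectors enters the calculation, scaling a spectrum by a constant factor (for example, a uniformly brighter pixel) leaves the angle unchanged, which is the source of the insensitivity to illumination noted above.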

MATERIALS AND METHODS
The study area is a 4 hectare natural forest plot located in the Forest Research Institute Malaysia (FRIM) in Kepong, Selangor. With an average altitude of 1200 feet above sea level, the site comprises hill mixed dipterocarp forest, which is common in tropical regions such as Malaysia. Hyperspectral data were acquired on 27 May 2005 by the AISA airborne imaging spectrometer onboard the NOMAD GAF-27 aircraft. The over-flight took place over a 90-minute time span between 1105 and 1235 hrs local time. The study site was taken from flight line 6 of the 7 flight lines, which were flown in a North-East to South-West direction and covered the whole 1000 hectares of FRIM. The sensor altitude was 1000 m above the target, giving a 1 m ground resolution with a swath width of 360 m. With an aircraft speed of 120 knots during data acquisition, the sensor was operated in spatial mode-B, comprising 20 user-configured spectral bands [17], for mapping over the tropical forest landscape.
Pre-processing was carried out with the CALIGEO software, which automatically corrects for both geometric and radiometric distortions of the raw image data. The radiance data set was then converted to at-sensor reflectance using data from the FODIS sensor (attached to the sensor unit during flight), which measures downwelling irradiance.
The AISA data were analysed and enhanced in the pre-processing stage in order to reduce the effects of noise and improve the spectral and spatial quality of the data. Figure 1 shows the AISA image of the study area that was extracted for further analysis. This image is 250 x 370 pixels in size, at a spatial resolution of 1 m.
Fig. 1: AISA image of the study area
A field study was carried out on the 4 ha plot, where 7 classes of tree species were identified. A reference image (ground truth image) was generated after the field campaign (Figure 2). Pixels for training and testing the classifiers were selected by random sampling and then divided into two parts, one for training and one for testing, in order to avoid the bias that results from using the same set of pixels for both.
The main aim of the study is to evaluate the performance of the four classification algorithms using the airborne hyperspectral data acquired by the AISA sensor over a tropical forest area. Performance is judged by the capability to identify different tree species accurately with respect to the ground truth information. Overall classification accuracy was calculated for each of the classifiers used in this study.
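The division of randomly sampled pixels into separate training and testing sets can be sketched as below. The text does not give the proportion used or say whether the division was made class by class, so the per-class split, the default `train_fraction` of 0.5 and the function name are assumptions made for illustration.

```python
import random

def split_by_class(pixels_by_class, train_fraction=0.5, seed=0):
    """Randomly divide each class's sampled pixels into two disjoint sets.

    pixels_by_class maps a class label to a list of sampled pixels.
    Returns (train, test) dictionaries with the same keys; no pixel
    appears in both, so testing is not biased by the training data.
    """
    rng = random.Random(seed)
    train, test = {}, {}
    for label, pixels in pixels_by_class.items():
        shuffled = list(pixels)          # copy so the input is untouched
        rng.shuffle(shuffled)
        cut = int(len(shuffled) * train_fraction)
        train[label] = shuffled[:cut]
        test[label] = shuffled[cut:]
    return train, test
```

Fixing the seed makes the division repeatable, which helps when the same training and testing pixels must be presented to all four classifiers.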

RESULTS
The classification accuracy of each of the four classifiers is presented in Table 1. The ML classifier shows the highest overall accuracy (85.56%), followed by the ANN classifier (83.61%), the DT classifier (50.67%) and the SAM classifier (48.83%).
The higher accuracy of the ML classifier in this study suggests that the hyperspectral data, which were derived from the optimal band configuration of the airborne sensor [17], have a sufficiently Gaussian distribution to give a full and representative description of the respective classes (spectrally separable tree species classes), and so fulfil the requirement of such a parametric algorithm, which is biased towards normally distributed data [18]. For the non-parametric classifiers, such a condition would not be equally informative, and its usefulness would vary [19]. Reference [20] reports better accuracy for ANN algorithms when decision boundaries lie on the edge of the class distributions between two or more classes, that is, when the decision boundary is less well defined. Such a case arises when the species classes are spectrally less separable. Figure 3 shows the classified image of the study area produced by the ML classifier, which achieved the best accuracy of the four classifiers.
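Overall accuracy is the proportion of test pixels whose predicted class agrees with the reference class, that is, the sum of the diagonal of the confusion matrix divided by the total number of test pixels. The sketch below illustrates the calculation; the function names are hypothetical, and every reference and predicted label is assumed to appear in `classes` (unclassified pixels would need a class of their own).

```python
def confusion_matrix(reference, predicted, classes):
    """Count test pixels by (reference class, predicted class).

    Rows follow the reference labels and columns the predicted labels,
    both in the order given by classes.
    """
    index = {c: i for i, c in enumerate(classes)}
    matrix = [[0] * len(classes) for _ in classes]
    for ref, pred in zip(reference, predicted):
        matrix[index[ref]][index[pred]] += 1
    return matrix

def overall_accuracy(matrix):
    """Fraction of pixels on the diagonal of the confusion matrix."""
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    total = sum(sum(row) for row in matrix)
    return correct / total
```

For example, if 4 of 5 test pixels are labelled correctly, `overall_accuracy` returns 0.8, which would be reported as 80%.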

CONCLUSIONS
It is rather unexpected that ML outperforms the more advanced classifiers such as ANN, DT and SAM. This accurate yet simple approach to hyperspectral data mapping shows the importance of considering the relationship between data set and classifier for successful image classification.
It can also be concluded from the study that traditional classification algorithms such as ML can still outperform more advanced classification algorithms such as SAM and ANN. This could be due to the high level of heterogeneity of the Malaysian tropical forest. The accuracy of ML could be due to the adequate number of training samples, about 200 pixels per class, relative to the 20 bands used for the classification. The ANN could have problems with its structure, which is still being improved. As for SAM, the result shows that more than the direction of the spectral vector is needed to separate tropical forest species, which are spectrally similar in nature. Furthermore, SAM and ANN require many parameters to be defined, and the settings used could still be sub-optimal for the high-biodiversity conditions of Malaysia. Further studies will be conducted to improve the use of the classifiers and enhance the applicability of such methods. There is also a need to test the capability of other, newly developed classification algorithms in mapping Malaysian tropical forests, such as Support Vector Machines (SVM), linear unmixing and wavelet-based classifiers, in order to find algorithms that work well in all conditions. The development of new algorithms designed specifically for tropical forest mapping using hyperspectral data is also required. This is important because accurate information on forest biodiversity status is vital to conservation and monitoring efforts towards achieving sustainable development for the country.