Moisture Content Prediction of Dried Longan Aril from Dielectric Constant Using Multilayer Perceptrons and Support Vector Regression

Problem statement: Estimation of the moisture content of dried food products from their dielectric constants is an important step in moisture measurement systems. Regression models that provide good prediction performance are therefore desirable. Approach: Multilayer Perceptrons (MLP) and Support Vector Regression (SVR) were applied in this research to predict the moisture content of dried longan arils from their dielectric constants. The data set was collected from 1500 samples of dried longan aril with five different moisture contents of 10, 14, 18, 22 and 25% wet basis (Wb). The dielectric constant of dried longan aril was measured using our previously proposed electrical capacitance-based system. The results from the MLP and SVR models were compared with those from linear regression and polynomial regression models. To assess the generalization of the models, four-fold cross validation was applied. Results: For the training sets, the average mean absolute errors over three bulk densities of 1.30, 1.45 and 1.60 g cm⁻³ were 1.7578, 0.6157, 0.3812, 0.3113, 0.0103 and 0.0044% Wb for the linear regression, second-, third- and fourth-order polynomial regression, MLP and SVR models, respectively. For the validation sets, the average mean absolute errors over the three bulk densities were 1.7616, 0.6192, 0.3844, 0.3146, 0.0126 and 0.0093% Wb for the linear regression, second-, third- and fourth-order polynomial regression, MLP and SVR models, respectively. Conclusion: The regression models based on MLP and SVR performed better than the models based on linear regression and polynomial regression on both training and validation sets. The models based on MLP and SVR were also robust to variation in bulk density. The proposed models are not limited to dried longan aril; they can also be adapted and applied to other materials or dried food products.


INTRODUCTION
Longan fruit (Dimocarpus longan Lour.) is a nonclimacteric subtropical fruit grown commercially in many countries including China, Thailand, India, Vietnam, Australia and the United States (Jiang et al., 2002). One of the factors that affect the deterioration and ultimately the cost of all dried agricultural products is moisture content (Karathanos, 1999). According to the National Bureau of Agricultural Commodity and Food Standard of Thailand, the moisture content of dried longan must not exceed 13.5% Wb (wet basis) for the whole fruit or 18% Wb for the aril. Because no moisture content tester specific to dried longan has been available, farmers have had to estimate the moisture subjectively from the surface of the skin, aril and seed, relying on their experience. We therefore previously proposed a prototype of an economical moisture measurement system for dried longan. It estimated the dielectric constant of dried longan aril by measuring the electrical capacitance of a capacitor filled with dried longan. The system was designed to predict the moisture content of dried longan aril from its dielectric constant by using a second-order polynomial.
Much research has addressed the system identification of nonlinear black-box models. One of the most popular classes of artificial neural networks is the Multilayer Perceptron (MLP) with the backpropagation algorithm as the training method. MLPs have been applied in several areas, for example, agriculture (Effendi et al., 2010), medicine (Benamrane et al., 2005; Isa et al., 2007; Eiamkanitchat et al., 2010), face recognition (Rizon et al., 2006) and electric power systems (Benslimane et al., 2006).
The Support Vector Machine (SVM) is one of the most successful algorithms based on statistical learning theory (Vapnik, 1999; Christiani and Shawe-Taylor, 2000). It was originally developed to solve classification problems and was later extended to regression problems, where it is known as Support Vector Regression (SVR) (Gunn and Brown, 1999). One advantage of the SVM is that it has only a few free parameters to adjust, and its optimal model parameters can be found with any standard quadratic programming algorithm. This can be done in a short time and, because the optimization problem is convex, there are no local minima to be trapped in. It is a powerful technique for solving nonlinear function approximation problems. Moreover, the Structural Risk Minimization (SRM) principle used in SVM learning aims to bound the generalization error, whereas the Empirical Risk Minimization (ERM) used in MLP training minimizes only the error on the training data. It has been shown in several applications that both SVR and MLP provide better regression performance than linear regression and polynomial regression, e.g., in flood prediction, electric load forecasting (Pahasa and Theera-Umpon, 2008; Abd, 2009), drug concentration estimation (Sumonphan et al., 2008), power systems (Boonprasert et al., 2003), computer networks (Hasegawa et al., 2001), telecommunications (Suyaroj et al., 2009), finance (Song et al., 2010) and environment (Mileva-Boshkoska and Stankovski, 2007).
In this study, we investigate the application of the MLP and the SVR to another regression problem: predicting the moisture content of dried longan aril from its dielectric constant. The accuracy of each model was evaluated by using the Mean Absolute Error (MAE). The comparison covered several regression models: linear regression, polynomial regression (second-, third- and fourth-order polynomials), MLP and SVR.

Dried longan preparation:
The dried longan arils were prepared using the conventional drying process. Fresh longan fruits cv. Dew were dried at 70°C for 13 h and then at 75°C for 20 h. After that, the temperature was adjusted to 65°C for 15 h or until the moisture content was reduced to 10% Wb. However, in our experiments, starting 25 h after the beginning of the drying process, random samples were taken out every 2 h. They were further dried at 70°C under vacuum for about 8 h or until their weights were constant. This followed the official methods and recommended practices of the Association of Official Analytical Chemists (Horwitz, 2005). Their actual moisture contents were calculated by:
$$\text{Moisture content (\% Wb)} = \frac{\text{Weight before drying} - \text{Weight after drying}}{\text{Weight before drying}} \times 100\% \qquad (1)$$
The dielectric constant of each dried longan aril was measured by our previously proposed moisture measurement system at room temperature. Dried longan aril was placed between two stainless steel discs inside a cylindrical plastic container. The weight of arils placed into the cylinder was set to 9, 10 or 11 g (equivalent to bulk densities of 1.30, 1.45 and 1.60 g cm⁻³, respectively). Five different moisture contents of 10, 14, 18, 22 and 25% Wb were considered. For each of the five moisture contents and each of the three bulk densities, 100 samples of dried longan aril were tested. Therefore, a total of 1500 samples was used in the experiments.
Artificial neural networks and support vector regression are well described in the literature (Vapnik, 1999; Christiani and Shawe-Taylor, 2000; Haykin, 2008). We provide only a brief introduction to them here.
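As an illustration of Eq. 1 and of the sample design described above, the following minimal Python sketch computes a wet-basis moisture content and counts the samples. The function name `moisture_content_wb` and the example weights are hypothetical, and the fill volume of the container is not stated in the text; it is only implied here by dividing each aril weight by its bulk density.

```python
def moisture_content_wb(weight_before_g, weight_after_g):
    """Wet-basis moisture content in percent, following Eq. 1."""
    if weight_before_g <= 0:
        raise ValueError("weight before drying must be positive")
    if not 0 <= weight_after_g <= weight_before_g:
        raise ValueError("weight after drying must lie in [0, weight before]")
    return (weight_before_g - weight_after_g) / weight_before_g * 100.0


# Hypothetical sample: 10.00 g before vacuum drying, 8.20 g after.
example_mc = moisture_content_wb(10.00, 8.20)

# Experimental design from the text:
# 5 moisture contents x 3 bulk densities x 100 samples per combination.
moisture_levels = [10, 14, 18, 22, 25]   # % Wb
aril_weights_g = [9, 10, 11]             # g placed in the container
bulk_densities = [1.30, 1.45, 1.60]      # g cm^-3
samples_per_cell = 100
total_samples = len(moisture_levels) * len(bulk_densities) * samples_per_cell

# Fill volume implied by weight / bulk density (derived, not given in the text).
implied_volumes_cm3 = [w / d for w, d in zip(aril_weights_g, bulk_densities)]
```

The three implied volumes agree to within about 0.05 cm³, which is consistent with the same container being filled with different weights of aril.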

Artificial neural network: An Artificial Neural Network (ANN) is a mathematical model that mimics the biological neural network. An ANN can be considered a universal function approximator and has been applied in several areas such as the military, medicine and business. The typical structure of a feedforward neural network is displayed in Fig. 1. The goal of training is to find the set of weights $w$ for which the outputs $o_{j,n}$ are as close as possible to the desired outputs $d_{j,n}$ for a given input pattern $x_{i,n}$, $i = 1, \ldots, P$ and $j = 1, \ldots, Q$. $P$ and $Q$ are the number of input features and the number of outputs (classes, in a classification problem), respectively.
Support vector regression: In support vector regression, the goal is to find a function $f(x)$ that deviates by at most $\varepsilon$ from the actually obtained targets $y_i$ for all training data $\{x_i, y_i\}$, $x_i \in \Re^n$, $y_i \in \Re$, $i = 1, \ldots, l$, where $l$ is the number of observations. At the same time, $f(x)$ should be as flat as possible. Suppose $f(x)$ takes the following form:
$$f(x) = \langle w, x \rangle + b \qquad (2)$$
Flatness then corresponds to a small $w$. Therefore, the objective is to choose the hyperplane that minimizes the Euclidean norm $\|w\|$ while simultaneously minimizing the sum of the distances from the data points to the hyperplane. By introducing $2l$ Lagrange multipliers $\alpha_i$, $\alpha_i^*$ and using the Karush-Kuhn-Tucker (KKT) theory (Christiani and Shawe-Taylor, 2000), we obtain:
$$w = \sum_{i=1}^{l} (\alpha_i - \alpha_i^*)\, x_i \qquad (3)$$
Substituting Eq. 3 into Eq. 2 yields the regression function:
$$f(x) = \sum_{i=1}^{l} (\alpha_i - \alpha_i^*)\, \langle x_i, x \rangle + b \qquad (4)$$
where the $x_i$ with nonzero coefficients are the support vectors, which are determined by the training patterns. From the KKT conditions, the support vectors are only those points $x_i$ for which exactly one of the two Lagrange multipliers $\alpha_i$, $\alpha_i^*$ is greater than zero. For the nonlinear case, the input data need to be mapped into a high-dimensional feature space. Let the nonlinear transformation function be $\Phi(\cdot)$ and let the kernel function be defined as:
$$K(x_i, x) = \langle \Phi(x_i), \Phi(x) \rangle$$
This implies that the dot product in the high-dimensional space is equivalent to a kernel function evaluated in the input space. Many types of kernel functions can be used. The bias term $b$ may be dropped if it is contained within the kernel function, in which case the regression function in Eq. 4 is given by:
$$f(x) = \sum_{i=1}^{l} (\alpha_i - \alpha_i^*)\, K(x_i, x)$$
Moisture content prediction models: In this research, the input to the regression models is the dielectric constant of dried longan aril whereas the output is the moisture content. Apart from the polynomial order, there are no parameters to set for the linear and polynomial regression models. However, there are some parameters to set for the MLP and SVR. For the MLP, the backpropagation algorithm was applied in the training phase. We therefore needed to find the best structure for this particular problem, i.e., the number of hidden layers and the number of neurons in each hidden layer. For the SVR, the ε-insensitive loss function was applied. The support vectors lie on two hyperplanes at a distance ε on either side of the regression hyperplane, so ε is the error tolerated between the regression hyperplane and the support vector hyperplanes. Data lying between the two support vector hyperplanes are considered to produce no error. In the training stage, we try to find support vector hyperplanes that cover all training data, that is, all training data must lie between the two support vector hyperplanes. Figure 2 shows the architecture of the prediction model. In this study, the Radial Basis Function (RBF) was used as the kernel function. The important parameters to set for the SVR model are the RBF kernel parameter σ, which controls the spread of the RBF and therefore the generalization of the SVR; the width ε of the tube; and the regularization parameter C, which controls the trade-off between the flatness of f(x) and the tolerance of deviations larger than ε.
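The roles of σ and ε described above can be made concrete with a short Python sketch. It is illustrative only: the RBF expression is not reproduced in this copy of the text, so the common Gaussian form exp(-(x - x_i)² / (2σ²)) is assumed, and the function names, support vectors and coefficients below are hypothetical placeholders rather than fitted values. The input is treated as a scalar because the only feature is the dielectric constant.

```python
import math


def rbf_kernel(x, x_i, sigma):
    """Gaussian RBF kernel; the 2*sigma**2 scaling is an assumed convention."""
    return math.exp(-((x - x_i) ** 2) / (2.0 * sigma ** 2))


def epsilon_insensitive_loss(y_true, y_pred, epsilon):
    """Zero inside the tube of half-width epsilon, linear outside it."""
    return max(0.0, abs(y_true - y_pred) - epsilon)


def svr_predict(x, support_vectors, coefficients, sigma, bias=0.0):
    """Kernel expansion f(x) = sum_i coef_i * K(x_i, x) + bias.

    Each coefficient plays the role of (alpha_i - alpha_i*) in the
    regression function; bias defaults to 0 (bias term dropped).
    """
    return bias + sum(
        c * rbf_kernel(x, sv, sigma)
        for sv, c in zip(support_vectors, coefficients)
    )


# Hypothetical support vectors (dielectric constants) and coefficients,
# chosen only to exercise the functions; they are not results of training.
support_vectors = [2.0, 3.0, 4.0]
coefficients = [5.0, 8.0, 6.0]
prediction = svr_predict(3.0, support_vectors, coefficients, sigma=1.35)

# A prediction error smaller than epsilon contributes no loss.
inside_tube = epsilon_insensitive_loss(10.0, 10.00005, epsilon=0.0001)
outside_tube = epsilon_insensitive_loss(10.0, 10.5, epsilon=0.1)
```

A smaller σ makes each kernel narrower, so the fitted curve follows the training points more closely; a larger σ gives a smoother function.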
Evaluation procedures: To evaluate the system performance quantitatively, we use the Mean Absolute Error (MAE), which is defined as:
$$\mathrm{MAE} = \frac{1}{n} \sum_{i=1}^{n} \left| y_i - \hat{y}_i \right|$$
where $y_i$ and $\hat{y}_i$ are the actual and predicted moisture contents of sample $i$ and $n$ is the number of samples considered. The experiments were conducted by using the four-fold cross validation technique, which is a standard testing technique for a data set without a predefined training/test split. To be more specific, the entire data set (1500 samples) was randomly divided into four groups. Each group contains 375 samples, i.e., 75 samples for each of the 10, 14, 18, 22 and 25% Wb moisture contents. In each validation run, one of the four groups (called the validation set) was used as the test set whereas the three remaining groups were used as the training set. A regression model was generated by using the data in the training set only. The derived model was then tested on the validation set to evaluate its generalization. After four validation runs, every sample in the data set has been used once as test data in a validation set. We evaluated the results using the average of the MAEs over all four validation runs. The cross validation was performed on the data from each of the three bulk densities.
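The evaluation procedure can be sketched in Python as follows. This is a minimal illustration, not the original implementation: the text does not say how the random division was carried out, so a seeded shuffle within each moisture level is assumed, and the names `mean_absolute_error` and `stratified_folds` are hypothetical.

```python
import random
from collections import Counter, defaultdict


def mean_absolute_error(actual, predicted):
    """MAE = (1/n) * sum of |actual_i - predicted_i|."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def stratified_folds(labels, n_folds=4, seed=0):
    """Split sample indices into n_folds groups with equal counts per label."""
    by_label = defaultdict(list)
    for index, label in enumerate(labels):
        by_label[label].append(index)
    rng = random.Random(seed)  # assumed: any random assignment would do
    folds = [[] for _ in range(n_folds)]
    for indices in by_label.values():
        rng.shuffle(indices)
        for position, index in enumerate(indices):
            folds[position % n_folds].append(index)
    return folds


# 1500 samples: 300 at each of the five moisture contents (% Wb).
labels = [mc for mc in (10, 14, 18, 22, 25) for _ in range(300)]
folds = stratified_folds(labels)

split_sizes = []
for k, validation_idx in enumerate(folds):
    training_idx = [i for j, fold in enumerate(folds) if j != k for i in fold]
    # A model would be fitted on training_idx and scored with
    # mean_absolute_error on validation_idx at this point.
    split_sizes.append((len(training_idx), len(validation_idx)))
```

Each of the four runs trains on 1125 samples and validates on the remaining 375, with 75 validation samples per moisture content, matching the description above.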

RESULTS
For the parameter settings of the MLP and SVR, we performed extensive experiments to find the best set of parameters under each condition. For the MLP, we found that two hidden layers with {3,5}, {5,9} and {4,6} hidden neurons yielded the best results for the bulk densities of 1.30, 1.45 and 1.60 g cm⁻³, respectively, where the first and second elements of each pair denote the numbers of hidden neurons in the first and second hidden layers, respectively. For the SVR, we found from many experiments that the ε-insensitive loss function with ε = 0.0001 was the best choice for all three bulk densities of 1.30, 1.45 and 1.60 g cm⁻³. The regularization parameter C, needed for solving the weights β_i, was chosen to be 100 for all three bulk densities. Finally, the parameter σ was set to 1.35, 0.65 and 0.85 for the bulk densities of 1.30, 1.45 and 1.60 g cm⁻³, respectively.
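For reference, the settings reported above can be collected in one place. The following Python fragment is only a convenience sketch; the dictionary layout, the key names and the helper `settings_for` are hypothetical, while the numerical values are those stated in the text.

```python
# Settings reported in the text, keyed by bulk density in g cm^-3 (as strings
# to avoid floating-point key lookups).
SETTINGS_BY_BULK_DENSITY = {
    "1.30": {"mlp_hidden_layers": (3, 5), "svr_sigma": 1.35},
    "1.45": {"mlp_hidden_layers": (5, 9), "svr_sigma": 0.65},
    "1.60": {"mlp_hidden_layers": (4, 6), "svr_sigma": 0.85},
}

# SVR settings shared by all three bulk densities.
SVR_EPSILON = 0.0001
SVR_C = 100


def settings_for(bulk_density):
    """Return the full set of reported settings for one bulk density."""
    specific = SETTINGS_BY_BULK_DENSITY[bulk_density]
    return {**specific, "svr_epsilon": SVR_EPSILON, "svr_C": SVR_C}


settings_145 = settings_for("1.45")
```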
The performances of the proposed models on the training sets of the four-fold cross validation are shown in Table 1. The average MAEs over the three bulk densities are 1.7578, 0.6157, 0.3812, 0.3113, 0.0103 and 0.0044% Wb for the linear regression, second-, third- and fourth-order polynomial regression, MLP and SVR models, respectively. Table 2 shows the performances on the validation sets. The average MAEs over the three bulk densities are 1.7616, 0.6192, 0.3844, 0.3146, 0.0126 and 0.0093% Wb for the linear regression, second-, third- and fourth-order polynomial regression, MLP and SVR models, respectively.
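The relative sizes of these averages can be checked with a few lines of Python. The sketch below uses only the average MAEs quoted above; the dictionary keys and the helper `ratio` are hypothetical names introduced for illustration.

```python
# Average MAEs over the three bulk densities (% Wb), as reported in the text.
train_mae = {"linear": 1.7578, "poly2": 0.6157, "poly3": 0.3812,
             "poly4": 0.3113, "mlp": 0.0103, "svr": 0.0044}
val_mae = {"linear": 1.7616, "poly2": 0.6192, "poly3": 0.3844,
           "poly4": 0.3146, "mlp": 0.0126, "svr": 0.0093}


def ratio(maes, worse, better):
    """How many times larger the error of `worse` is than that of `better`."""
    return maes[worse] / maes[better]


# Best polynomial model versus MLP, and MLP versus SVR, on each kind of set.
poly4_vs_mlp_train = ratio(train_mae, "poly4", "mlp")
poly4_vs_mlp_val = ratio(val_mae, "poly4", "mlp")
mlp_vs_svr_train = ratio(train_mae, "mlp", "svr")
mlp_vs_svr_val = ratio(val_mae, "mlp", "svr")

# Gap between validation and training error for each model.
generalization_gap = {name: val_mae[name] - train_mae[name] for name in train_mae}
```

On these averages, the fourth-order polynomial error is roughly 30 times the MLP error on the training sets and roughly 25 times on the validation sets, while the MLP error is roughly 2.3 times the SVR error on the training sets and roughly 1.4 times on the validation sets. Every model has a slightly larger error on the validation sets than on the training sets.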

DISCUSSION
The results in Table 1 show that, on the training sets of the four-fold cross validation, the linear and polynomial regression models yield errors that are about one to two orders of magnitude higher than those of the MLP and SVR models. In terms of the average MAE, the MLP models also yield higher errors than the SVR models, by a factor of about two (0.0103 versus 0.0044% Wb).
The results in Table 2 are very similar to those on the training sets shown in Table 1. On the validation sets, the polynomial regression models yield errors that are more than one order of magnitude higher than those of the MLP models. Meanwhile, the MLP models yield a higher average error than the SVR models, by a factor of about 1.4 (0.0126 versus 0.0093% Wb).
It can be clearly seen that both the SVR and MLP models yield better prediction performance than the models based on linear regression and polynomial regression on both training and validation sets. We can also see that the SVR models perform slightly better than the MLP models on both training and validation sets. It is not surprising that the average MAEs of the training sets are lower than those of the validation sets because the data in the validation sets are not involved in the model generation.
Even though the average MAEs at the 1.30 g cm⁻³ bulk density are lower than those at the other two bulk densities, the differences are small for the MLP and SVR models. Therefore, the system would be more robust to bulk density variation when the MLP or SVR is applied as the regression model.

CONCLUSION
Regression models based on multilayer perceptrons and support vector regression were proposed to predict the moisture content of dried longan aril from its dielectric constant. The performances of the proposed models were compared with those of linear regression and second-, third- and fourth-order polynomial regression models. The results using four-fold cross validation suggested that the SVR models achieved the best prediction performance, followed, in order of decreasing performance, by the MLP models, the polynomial regression models and the linear regression models. The results also suggested that the bulk density of dried longan aril in the plastic container affected the prediction performance of the linear and polynomial regression models. However, this effect was very small when the MLP or SVR was applied. Therefore, both MLP and SVR models are good choices for the system in that they give very small prediction errors and are robust to bulk density variation. Moreover, the proposed models are not limited to dried longan aril; they can also be adapted and applied to other materials or food products.