CONSUMER CHOICE PREDICTION: ARTIFICIAL NEURAL NETWORKS VERSUS LOGISTIC MODELS

Conventional econometric models, such as discriminant analysis and logistic regression, have been used to predict consumer choice. However, in recent years, there has been growing interest in applying artificial neural networks (ANN) to analyse consumer behaviour and to model the consumer decision-making process. The purpose of this paper is to empirically compare the predictive power of the probabilistic neural network (PNN), a special class of neural networks, and the multi-layer feed-forward neural network (MLFN) with that of a logistic model on consumers' choices between electronic banking and non-electronic banking. Data for this analysis were obtained through a mail survey sent to 1,960 New Zealand households. The questionnaire gathered information on the factors consumers use to decide between electronic banking and non-electronic banking. The factors include service quality dimensions, perceived risk factors, user input factors, price factors, service product characteristics and individual factors. In addition, demographic variables including age, gender, marital status, ethnic background, educational qualification, employment, income and area of residence are considered in the analysis. Empirical results showed that both ANN models (MLFN and PNN) achieve a higher overall percentage correct on consumer choice predictions than the logistic model. Furthermore, the PNN proves to be the best predictive model, since it has the highest overall percentage correct and very low percentages of both Type I and Type II errors.


Introduction
Quantitative analysis for forecasting in business and marketing, especially of consumer behavior and the consumer decision-making process (consumer choice models), has become more popular in business practice. The ability to understand and accurately predict a consumer's decision can lead to more effective product targeting, more cost-effective marketing strategies, increased sales and a substantial improvement in the overall profitability of the firm. Conventional econometric models, such as discriminant analysis and logistic regression, can predict consumers' choices, but recently there has been growing interest in using ANN to analyze and model the consumer decision-making process.
ANN have been applied in many disciplines, including biology, psychology, statistics, mathematics, medical science, and computer science. Recently ANN have been applied to a variety of business areas such as accounting and auditing, finance (with special emphasis on bankruptcy prediction and credit evaluation), management and decision making, marketing and production (Vellido et al., 1999a). However, the technique has been sparsely used in modeling consumer choices. For example, Dasgupta et al. (1994) compared the performance of discriminant analysis and logistic regression models against an ANN model with respect to their ability to identify a consumer segment based upon their willingness to take financial risks and to purchase a non-traditional investment product. Fish et al. (1995) examined the likelihood of clustering managers-customers purchasing from a firm via discriminant analysis, logistic regression and ANN models. Vellido et al. (1999b), using the Self-Organizing Map (SOM), an unsupervised neural network model, carried out an exploratory segmentation of the on-line shopping market while Hu et al. (1999) showed how neural networks can be used to estimate the posterior probabilities of consumer situational choices on communication channels (verbal versus non-verbal communications).
Previous studies have utilised the multi-layer feed-forward neural network (MLFN), which is one family of ANN. However, very few studies have applied a special class of artificial neural networks called the probabilistic neural network (PNN) in modelling consumers' choices. The purpose of this study is to empirically compare the predictive power of the PNN and the MLFN with that of the logistic model on consumers' choices between electronic banking and non-electronic banking.

Banking Channels and Consumer Choice Theory
The evolution of electronic banking, such as internet banking, has altered the nature of personal-customer banking relationships and has many advantages over traditional banking delivery channels. These include an increased customer base, cost savings, mass customization and product innovation, marketing and communications, development of noncore businesses and the offering of services regardless of geographic area and time (Taylor, 2002). It is predicted that the usage of internet banking in New Zealand will continue to grow in the near future, as customer support for internet banking is mounting.
Despite its growing popularity, the majority of consumer behavior studies in banking have focused on a specific type of electronic banking instead of investigating the concept of electronic banking as a whole in relation to consumers' decision-making behavior (see Al-Ashban and Burney 2001). Furthermore, the limited electronic banking studies that have been published are descriptive in nature, providing information on basic concepts of electronic banking instead of focusing on complex and in-depth consumer decision-making processes (Orr, 1998).

The Consumer Decision-Making Process
The consumer decision-making process, pioneered by Dewey (1910) in examining consumer purchasing behavior toward goods and services, involves five stages: problem recognition, search, evaluation of alternatives, choice, and outcome.

Figure 1 Consumer Decision-Making Process Model
Analogous to Dewey's (1910) paradigm for goods, Zeithaml and Bitner (2003) suggested the decision-making process could be applied to services. The five stages of the consumer decision-making process operationalized by Zeithaml and Bitner (2003) were: need recognition, information search, evaluation of alternatives, purchase and consumption, and post-purchase evaluation (see Figure 1). Furthermore, the authors imply that in purchasing services, these five stages do not occur in a linear sequence as they usually do in the purchase of goods.

Logistic Model in Electronic Banking
For many durable commodities, the individual's choice is discrete, and traditional demand theory has to be modified to analyse such a choice (Ben-Akiva and Lerman, 1985). Let $U(y_i, w_i; z_i)$ be the utility function of consumer $i$, where $y_i$ is a dichotomous variable indicating whether the individual is an electronic banking user, $w_i$ is the wealth of the consumer and $z_i$ is a vector of the consumer's characteristics. Also, let $c$ be the average cost of using electronic banking. Economic theory then posits that the consumer will choose to use electronic banking if
$$U(y_i = 1, w_i - c; z_i) \geq U(y_i = 0, w_i; z_i) \qquad (1)$$
Even though the consumer's decision is straightforward, the analyst does not have sufficient information to determine the individual's choice. Instead, the analyst observes the consumer's characteristics and choice, and uses them to estimate the relationship between them. Let $x_i$ be a vector of the consumer's characteristics and wealth, $x_i = (w_i, z_i)$; then equation (1) can be formulated as an ex-post model given by:
$$y_i = f(x_i) + \varepsilon_i \qquad (2)$$
where $\varepsilon_i$ is the random term. If the random term is assumed to have a logistic distribution, then the above represents the standard binary logit model. However, if we assume that the random term is normally distributed, then the model becomes the binary probit model (Maddala, 1993; Ben-Akiva and Lerman, 1985; Greene, 1990). The logit model will be used in this analysis for convenience, as the differences between the two models are slight (Maddala, 1993). The model will be estimated by the maximum likelihood method as implemented in the LIMDEP software.
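As an illustration of what maximum likelihood estimation of a binary logit involves, the following minimal sketch fits a one-variable model by plain gradient ascent on the log-likelihood. It is not the LIMDEP procedure used in this study; the toy data, the centred "perceived risk" score and the function names are hypothetical and serve only to show the mechanics.

```python
import math

def sigmoid(t):
    """Logistic CDF, written to avoid overflow for large |t|."""
    if t >= 0:
        return 1.0 / (1.0 + math.exp(-t))
    e = math.exp(t)
    return e / (1.0 + e)

def log_likelihood(beta, X, y):
    """Binary logit log-likelihood; each row of X starts with 1 for the intercept."""
    total = 0.0
    for row, yi in zip(X, y):
        p = sigmoid(sum(b * v for b, v in zip(beta, row)))
        p = min(max(p, 1e-12), 1.0 - 1e-12)  # guard against log(0)
        total += yi * math.log(p) + (1 - yi) * math.log(1.0 - p)
    return total

def fit_logit(X, y, steps=5000, rate=0.1):
    """Maximise the log-likelihood by gradient ascent (illustrative, not efficient)."""
    beta = [0.0] * len(X[0])
    n = len(X)
    for _ in range(steps):
        grad = [0.0] * len(beta)
        for row, yi in zip(X, y):
            p = sigmoid(sum(b * v for b, v in zip(beta, row)))
            for k, v in enumerate(row):
                grad[k] += (yi - p) * v  # score contribution of this observation
        beta = [b + rate * g / n for b, g in zip(beta, grad)]
    return beta

# Hypothetical toy data: [1, perceived-risk score centred at the scale midpoint];
# y = 1 marks an electronic banking user.
X = [[1, -2], [1, -1], [1, -1], [1, 0], [1, 0], [1, 1], [1, 1], [1, 2]]
y = [1, 1, 1, 1, 0, 1, 0, 0]
beta_hat = fit_logit(X, y)
print("intercept, slope:", beta_hat)
```

In this toy sample the estimated slope is negative, which mirrors the hypothesised direction of the perceived risk effect; production software would normally use Newton-type iterations and report standard errors as well.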
The decision to use electronic banking is hypothesised to be a function of the six variables (measured on a 5-point Likert-type scale) and demographic characteristics. The variables include service quality dimensions, perceived risk factors, user input factors, price factors, service product characteristics, and individual factors (see Figure 1). The demographic variables include age, gender, marital status, ethnic background, educational qualification, employment, income, and area of residence.
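Implicitly, the empirical model can be written under the general form:
$$\text{Use of electronic banking} = f\big(\text{service quality}\,(+),\ \text{perceived risk}\,(-),\ \text{user input}\,(+),\ \text{price}\,(-),\ \text{service product characteristics}\,(+),\ \text{individual factors}\,(+),\ \text{demographic characteristics},\ \varepsilon\big) \qquad (3)$$
A priori hypotheses are indicated by (+) or (-) in the above specification (see Figure 1). For example, service quality dimensions such as reliability, assurance and responsiveness are positively related to the use of electronic banking (Gerrard and Cunningham, 2003).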
Furthermore, consumers' decision to use electronic banking is negatively related to financial, performance, physical, social, and psychological risks (Sarin, Sego and Chanvarasuth, 2003).
User input factors such as control, enjoyment, and intention to use have a positive impact on consumers' decision to use electronic banking (Ng and Palmer, 1999). Polatoglu and Ekin (2001) found that users of electronic banking were negatively influenced by price factors, which suggests that consumers are price sensitive. The service product characteristics of electronic banking, such as consumers' perception of a standard and consistent service, the time-saving feature of electronic banking, and the absence of personal interactions, have been empirically found to positively influence consumers' use of electronic banking (Polatoglu and Ekin, 2001; Karjaluoto, Mattila and Pento, 2002). Likewise, individual factors such as consumers' knowledge and resources positively influence consumers' use of electronic banking.
Demographic characteristics such as age, gender, marital status, education, ethnic group, area of residence, and income were hypothesised to influence the respondent's decision to use electronic banking. This research seeks to determine which age group has the greatest tendency to use electronic banking and whether gender plays a part in differentiating electronic banking users from non-electronic banking users. Income was divided into low (below $19,000), medium (between $20,000-$39,000) and high (above $40,000); age was divided into young (18 to 35 years old), medium (36 to 55 years old) and old (above 56 years old); ethnic group was divided into New Zealand European, Maori, and others (Pacific Islander or Asian); and employment was divided into blue-collar workers, white-collar workers, casual workers (including unemployed, students and house persons) and retirees. These are dummy variables, and one dummy variable is dropped from each group to avoid the dummy variable trap in the model.
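The dummy coding described above can be sketched as follows. The helper name `dummy_code` and the choice of "young" as the dropped (reference) category are assumptions made for illustration; the source does not state which category was omitted from each group.

```python
def dummy_code(values, categories, reference):
    """Return the kept category names and one 0/1 column per kept category.

    The reference category is dropped, so a respondent in that category is
    represented by a row of zeros. This avoids perfect collinearity with the
    intercept (the dummy variable trap).
    """
    kept = [c for c in categories if c != reference]
    rows = [[1 if v == c else 0 for c in kept] for v in values]
    return kept, rows

age_groups = ["young", "medium", "old"]
respondents = ["young", "old", "medium", "old"]  # hypothetical sample
columns, rows = dummy_code(respondents, age_groups, reference="young")
print(columns)  # ['medium', 'old']
print(rows)     # [[0, 0], [0, 1], [1, 0], [0, 1]]
```

Each estimated coefficient on a kept dummy is then interpreted relative to the omitted category.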

Multi-Layer Feed-Forward Neural Network (MLFN)
The artificial neural network model, inspired by the structure of the nerve cells in the brain, can be represented as a massive parallel interconnection of many simple computational units interacting across weighted connections (Venugopal and Baets, 1994). Each computational unit (or neuron or node) consists of a set of input connections that receive signals from other computational units, a set of weights for the input connections, and a transfer function (see Figure 2). The output for the computational unit (node $j$) is the result of applying a transfer function $F_j$ to the summation of all signals from each connection ($X_i$) times the value of the connection weight between node $j$ and connection $i$ ($W_{ij}$) (Equation 4).
$$U_j = F_j\left(\sum_i W_{ij} X_i\right) \qquad (4)$$
where $U_j$ is the output for node $j$ and $F_j$ is a transfer function, which can take many different functional forms: linear functions, linear threshold functions, step functions, sigmoid functions or Gaussian functions (James and Carol, 2000).
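The computation performed by a single node in Equation 4 can be illustrated with a short sketch. The function names and the numerical inputs are hypothetical; only the weighted-sum-then-transfer structure comes from the text.

```python
import math

def node_output(inputs, weights, transfer):
    """Apply the transfer function to the weighted sum of the incoming signals."""
    return transfer(sum(w * x for w, x in zip(weights, inputs)))

def linear(t):
    return t

def step(t):
    return 1.0 if t >= 0 else 0.0

def sigmoid(t):
    return 1.0 / (1.0 + math.exp(-t))

signals = [1.0, 2.0]     # X_i, hypothetical incoming signals
weights = [0.5, -0.25]   # W_ij, hypothetical connection weights
# The weighted sum is 0.5*1.0 + (-0.25)*2.0 = 0.0
print(node_output(signals, weights, linear))   # 0.0
print(node_output(signals, weights, step))     # 1.0
print(node_output(signals, weights, sigmoid))  # 0.5
```

The same weighted sum yields different node outputs depending on the transfer function chosen, which is why the choice of functional form matters for network behaviour.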
The most widely used artificial neural network is the multi-layer feed-forward neural network (MLFN). It is called feed-forward because information flows in one direction only, from the origin to the destination, and cannot return to the origin. The computational units are grouped into three main layers: the first layer is the input layer, the last layer is the output layer, and the layer(s) in between are called the hidden layer(s) (Hu et al., 1999). Figure 3 shows the structure of the multi-layer feed-forward neural network with one hidden layer. Since the output of one layer is an input to the following layer, the output of the network can be expressed algebraically as shown in equation 5.
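$$Z = F\left(\sum_j W_j U_j\right) = F\left(\sum_j W_j F_j\left(\sum_i W_{ij} X_i\right)\right) \qquad (5)$$
where $Z$ is the output of the network, $F$ is the transfer function in the output node, $W_j$ is the connection weight between hidden node $j$ and the output node, and $U_j = F_j\left(\sum_i W_{ij} X_i\right)$ is the output of hidden node $j$.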

Figure 3 Multi-Layer Feed-Forward Neural Network Structure with One Hidden Layer
The calculation of the neural network weights is known as the training process. The process starts by randomly initializing the connection weights and introducing a set of data inputs and actual outputs to the network. The network then calculates its output, compares it to the actual output and calculates the error. In an attempt to improve the overall predictive accuracy and to minimise the network's total mean squared error, the network adjusts the connection weights by propagating the error backward through the network to determine how best to update the interconnection weights between individual neurons. For this reason, the learning algorithm is called back-propagation (Rao and Ali, 2002).
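The training process described above can be sketched for a network with one hidden layer and a single output node. This is a minimal illustration, not the configuration used in this study: the sigmoid transfer functions, the constant bias input appended to each layer, the learning rate, the number of hidden nodes and the toy data are all assumptions.

```python
import math
import random

def sigmoid(t):
    return 1.0 / (1.0 + math.exp(-t))

def init_weights(n_in, n_hidden, seed=0):
    """Randomly initialise the connection weights (one extra weight per node for the bias)."""
    rng = random.Random(seed)
    w_hidden = [[rng.uniform(-0.5, 0.5) for _ in range(n_in + 1)] for _ in range(n_hidden)]
    w_out = [rng.uniform(-0.5, 0.5) for _ in range(n_hidden + 1)]
    return w_hidden, w_out

def forward(x, w_hidden, w_out):
    """Feed the inputs forward: input layer -> hidden layer -> output node."""
    xb = list(x) + [1.0]  # inputs plus constant bias signal
    hidden = [sigmoid(sum(w * v for w, v in zip(ws, xb))) for ws in w_hidden]
    hb = hidden + [1.0]   # hidden outputs plus constant bias signal
    z = sigmoid(sum(w * h for w, h in zip(w_out, hb)))
    return xb, hb, z

def mean_squared_error(data, w_hidden, w_out):
    return sum((t - forward(x, w_hidden, w_out)[2]) ** 2 for x, t in data) / len(data)

def train(data, w_hidden, w_out, epochs=2000, rate=0.5):
    """Back-propagation: adjust weights after each pattern to reduce the squared error."""
    w_hidden = [list(ws) for ws in w_hidden]  # work on copies
    w_out = list(w_out)
    n_hidden = len(w_hidden)
    for _ in range(epochs):
        for x, target in data:
            xb, hb, z = forward(x, w_hidden, w_out)
            delta_out = (target - z) * z * (1.0 - z)
            # Propagate the output error back to each hidden node
            # (uses the output weights before they are updated).
            delta_hidden = [delta_out * w_out[j] * hb[j] * (1.0 - hb[j])
                            for j in range(n_hidden)]
            w_out = [w + rate * delta_out * h for w, h in zip(w_out, hb)]
            for j in range(n_hidden):
                w_hidden[j] = [w + rate * delta_hidden[j] * v
                               for w, v in zip(w_hidden[j], xb)]
    return w_hidden, w_out

# Hypothetical toy problem: two binary inputs, output 1 if either input is 1.
data = [([0.0, 0.0], 0.0), ([0.0, 1.0], 1.0), ([1.0, 0.0], 1.0), ([1.0, 1.0], 1.0)]
w_h0, w_o0 = init_weights(n_in=2, n_hidden=3, seed=0)
mse_before = mean_squared_error(data, w_h0, w_o0)
w_h, w_o = train(data, w_h0, w_o0)
mse_after = mean_squared_error(data, w_h, w_o)
print(mse_before, mse_after)
```

The mean squared error after training is lower than at the random starting weights, which is the behaviour back-propagation is designed to produce.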
While the performance of the MLFN can be influenced by the number of hidden nodes and layers in the network, there is no theoretical framework for determining the appropriate number of hidden nodes and layers, or the optimal internal error threshold, in a network. Too few hidden nodes and layers will inhibit the learning ability of the network. On the other hand, too many hidden nodes and layers could reduce the network's generalizing ability and efficiency. In practice, the design of the neural network model is a tedious process of trial and error to find the optimal model.

Probabilistic Neural Network (PNN)
The PNN, originally proposed by Specht (1990), is basically a classification network. Its general structure consists of four layers: an input layer, a pattern layer (the first hidden layer), a summation layer (the second hidden layer) and an output layer (see Figure 4).

Figure 4 The Probabilistic Neural Network (PNN) Architecture
Source: Modified from Specht (1990)
The PNN is conceptually based on the Bayesian classifier. According to the Bayesian classification rule, $X$ will be classified into class A if the inequality in equation 6 holds:
$$h_A c_A f_A(X) > h_B c_B f_B(X) \qquad (6)$$
where $X$ is the input vector to be classified, $h_A$ and $h_B$ are the prior probabilities for classes A and B, $c_A$ and $c_B$ are the costs of misclassification for classes A and B, and $f_A(X)$ and $f_B(X)$ are the probabilities of $X$ given the density functions of classes A and B, respectively (Albanis and Batchelor, 1999).
To determine the class, the probability density function is estimated by a non-parametric estimation method developed by Parzen (1962) and extended afterwards by Cacoulos (1966).
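The joint probability density function for a set of $p$ variables can be expressed as:
$$f_A(X) = \frac{1}{(2\pi)^{p/2}\sigma^{p}} \, \frac{1}{n_A} \sum_{j=1}^{n_A} \exp\left(-\frac{(X - Y_{Aj})^{\top}(X - Y_{Aj})}{2\sigma^{2}}\right) \qquad (7)$$
where $p$ is the number of variables in the input vector $X$, $n_A$ is the number of training samples which belong to class A, $Y_{Aj}$ is the $j$th training sample in class A and $\sigma$ is a smoothing parameter (Chen et al., 2003).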
The working principle of the PNN begins with the input layer, where inputs are distributed to the pattern units. A pattern unit, one of which is required for every training pattern, memorizes a training sample and estimates the contribution of that pattern to the probability density function. The summation layer comprises a group of computational units equal in number to the total number of classes. Each summation unit is dedicated to a single class and sums the outputs of the pattern layer units corresponding to that class.
Finally, the output neuron(s), which is a threshold discriminator, chooses the class with the largest response to the inputs (Albanis and Batchelor, 1999;Yang et al., 1999).
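The four layers described above can be sketched compactly. The sketch assumes a Gaussian kernel for the Parzen density estimate, a single smoothing parameter shared by all classes, and equal priors and misclassification costs unless others are supplied; the function names, class labels and training vectors are hypothetical.

```python
import math

def parzen_density(x, samples, sigma):
    """Gaussian Parzen estimate of a class density at x."""
    p = len(x)
    norm = (2.0 * math.pi) ** (p / 2.0) * sigma ** p
    total = 0.0
    for s in samples:  # pattern layer: one unit per training sample
        sq_dist = sum((a - b) ** 2 for a, b in zip(x, s))
        total += math.exp(-sq_dist / (2.0 * sigma ** 2))
    return total / (norm * len(samples))  # summation unit for this class

def pnn_classify(x, training, sigma=0.5, priors=None, costs=None):
    """Output layer: choose the class with the largest prior * cost * density.

    `training` maps each class label to its list of training vectors.
    """
    scores = {}
    for label, samples in training.items():
        h = priors[label] if priors else 1.0
        c = costs[label] if costs else 1.0
        scores[label] = h * c * parzen_density(x, samples, sigma)
    return max(scores, key=scores.get)

# Hypothetical two-variable training set.
training = {
    "ebank": [[4.0, 1.0], [5.0, 2.0], [4.0, 2.0]],
    "non_ebank": [[1.0, 4.0], [2.0, 5.0], [2.0, 4.0]],
}
print(pnn_classify([4.5, 1.5], training))  # ebank
print(pnn_classify([1.5, 4.5], training))  # non_ebank
```

Because every training sample is stored as a pattern unit, there is no iterative weight training; the only quantity to tune is the smoothing parameter, at the price of memory and evaluation time that grow with the training set.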

Data and Methodology
Data for this analysis were obtained through a random mail survey sent to 1,960 households in

Empirical Results
The estimated logistic regression equation (3) is shown in Table 1. In general, the logistic model fitted the data quite well. The chi-square test strongly rejected the hypothesis of no explanatory power, and the model correctly predicted 92% of the observations. Furthermore, SQ, PR, UIF, OLD, WHITE, CASUAL, HIGHSCH, HIGH, and RURAL are statistically significant, and the signs of the parameter estimates support the a priori hypotheses outlined earlier. The estimated coefficients indicate that service quality dimensions and user input factors have a positive impact on consumers' likelihood of using electronic banking. This implies that the level of service quality in electronic banking, the independence and freedom associated with electronic banking and the enjoyment that can be derived from it will favourably influence consumers' decision to use electronic banking. Perceived risk factors were found, as hypothesised, to negatively affect the probability of using electronic banking. A risk-averse consumer perceives electronic banking as a financial risk when it is not possible to reverse a mistakenly entered transaction or to stop a payment. Furthermore, the threat of personal information being accessed by a third party negatively influences a consumer's likelihood of using electronic banking. This supports the findings of Ho and Ng (1994) and Lockett and Littler (1997).
The demographic variables (age, employment, education, income and residence) were also significant in explaining the respondents' probability of using electronic banking. For example, the negative coefficient of the age group above 56 years showed that senior consumers were less likely to use electronic banking. Senior consumers are more risk averse and prefer a personal banking relationship to non-personal electronic banking. High school respondents may be less likely to use electronic banking due to their low income status. Furthermore, electronic banking transactions could be costly for this age group, who primarily work part-time.
Additional information can be obtained through analysis of the marginal effects, calculated as the partial derivatives of the non-linear probability function evaluated at each variable's sample mean (Greene, 1990). For example, consumers' choice of electronic banking is relatively sensitive to perceived risk (PR) (Rank = 1) and the user input factor (UIF) (Rank = 2): a unit increase in the PR score would decrease the probability of being an electronic banking user by 24.31%, while a unit increase in the UIF score would increase it by 15.47%.
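For a logit model, the partial derivative of the predicted probability with respect to a continuous regressor $x_k$ is $\beta_k P(1 - P)$, with $P$ evaluated here at the sample means. The sketch below illustrates the calculation; the coefficient values and means are hypothetical and are not the estimates reported in Table 1.

```python
import math

def logit_marginal_effects(beta, x_mean):
    """Marginal effects of a logit model evaluated at the regressor means.

    beta[0] is the intercept; beta[1:] align with x_mean.
    """
    index = beta[0] + sum(b * m for b, m in zip(beta[1:], x_mean))
    p = 1.0 / (1.0 + math.exp(-index))        # predicted probability at the means
    return [b * p * (1.0 - p) for b in beta[1:]]

# Hypothetical coefficients: intercept, perceived risk, user input factor.
beta = [0.0, -1.0, 0.6]
x_mean = [0.0, 0.0]  # hypothetical (centred) sample means
effects = logit_marginal_effects(beta, x_mean)
print(effects)  # [-0.25, 0.15]
```

The marginal effect always has the same sign as its coefficient, and its size is largest when the predicted probability is close to 0.5.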
The overall percentage correct of 92.03 shows that the logistic model is quite accurate in predicting consumers' choices. However, the percentages incorrect indicate that the logistic model is more likely to produce a Type I error (wrongly rejecting $H_0$, or accepting a non-electronic banking user as an electronic banking user) than a Type II error (wrongly accepting $H_0$, or accepting an electronic banking user as a non-electronic banking user), as it has 19.78% and 4.69% incorrect on the non-electronic banking and electronic banking classifications, respectively (see Table 1).
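The classification measures used here can be computed from actual and predicted group memberships as sketched below. The function name and the ten-observation sample are hypothetical; the error definitions follow the text, with the Type I rate taken over actual non-users and the Type II rate over actual users.

```python
def classification_rates(actual, predicted):
    """Overall percentage correct and Type I / Type II error percentages.

    Coding: 1 = electronic banking user, 0 = non-electronic banking user.
    """
    pairs = list(zip(actual, predicted))
    n_non = sum(1 for a, _ in pairs if a == 0)
    n_user = sum(1 for a, _ in pairs if a == 1)
    type1 = sum(1 for a, p in pairs if a == 0 and p == 1)  # non-user accepted as user
    type2 = sum(1 for a, p in pairs if a == 1 and p == 0)  # user accepted as non-user
    correct = len(pairs) - type1 - type2
    return {
        "overall_correct_pct": 100.0 * correct / len(pairs),
        "type_I_error_pct": 100.0 * type1 / n_non,
        "type_II_error_pct": 100.0 * type2 / n_user,
    }

actual = [0, 0, 0, 0, 1, 1, 1, 1, 1, 1]     # hypothetical sample
predicted = [1, 0, 0, 0, 1, 1, 1, 1, 1, 0]
rates = classification_rates(actual, predicted)
print(rates)
```

In this toy sample one of four non-users and one of six users are misclassified, giving a Type I error of 25%, a Type II error of about 16.7% and 80% correct overall.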
Given that the neural network uses nonlinear functions, it is very difficult to spell out the algebraic relationship between a dependent variable and an independent variable. Furthermore, the learned output or connection weights cannot be interpreted or tested. Therefore, only the relative contribution factors and the classification rates are presented in Table 2. The classification results in Table 2 show that both the MLFN and the PNN exhibit a superior ability to learn and memorize the patterns corresponding to consumers' choice of electronic banking. Both methods have a higher overall percentage correct on consumers' choice predictions than the logistic model. Generally, the MLFN model predicts quite well for the electronic banking group, but its performance is relatively poor when predicting the non-electronic banking group. In contrast, the PNN predicts well for both groups. Therefore, the PNN is considered the best prediction model in this study, since it has the highest overall percentage correct (99.81%), a very low Type I error (0.70%) and no Type II errors (0.00%).
The relative contribution factors and the ranks in Tables 1 and 2 show consistent results across all the models. That is, both perceived risk (PR) and the user input factor (UIF) have a strong influence on consumers' decision between electronic banking and non-electronic banking in all three models (Rank = 1 and 2, respectively), whereas the other variables have a strong influence in some models but less influence in others. Therefore, these two factors should be treated as high-priority factors, as they strongly affect consumers' decision in choosing between electronic banking and non-electronic banking.
The within-sample forecast tends to be biased upward, so the out-of-sample forecast is a more appropriate measure of future predictive power.
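An out-of-sample evaluation requires holding back part of the data before estimation. The following sketch shows one simple way of doing so; the 20% hold-out fraction, the random seed and the function name are assumptions, not the split used in this study.

```python
import random

def holdout_split(records, test_fraction=0.2, seed=0):
    """Randomly split records into an estimation sample and a hold-out sample."""
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)
    n_test = int(round(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]

respondent_ids = list(range(10))  # hypothetical respondent identifiers
estimation, holdout = holdout_split(respondent_ids, test_fraction=0.2)
print(len(estimation), len(holdout))  # 8 2
```

Each model is then fitted on the estimation sample only, and its classification rates on the hold-out sample give the out-of-sample measure.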

Conclusion
The estimated results from the logistic regression indicate that age, occupation, qualification, income, area of residence, service quality, perceived risk and the user input factor are the major factors that influence consumers' decision between electronic banking and non-electronic banking. The logistic model can be considered an accurate prediction model because its overall correct classification rates are high, above 90.00% for both in-sample and out-of-sample predictions. However, it does not outperform either neural network model, MLFN or PNN, in either in-sample or out-of-sample forecasts.
The neural networks yield better prediction results, but there are some drawbacks to using them. Firstly, neural networks lack a theoretical basis for explanation: the connection weights in the networks cannot be interpreted or used to identify the relationships between dependent and independent variables. Secondly, there are no formal techniques for non-linear methods to test the relative relevance of the independent variables or to carry out the variable selection process. Lastly, the neural network learning process can be very time consuming.
In summary, in terms of prediction accuracy, the results presented in this paper indicate that the PNN can be successfully implemented to predict consumers' choices because it outperforms both the MLFN and the logistic model. This indicates the superiority of the PNN for prediction of consumers' choices. Furthermore, the study demonstrates the potential of the neural methodology, especially the PNN, as an analysis tool for marketing research. Since consumers' choices are not always binary and neural networks are not limited to binary classification problems, the predictive power of neural networks on multi-level classification, particularly for consumer choice prediction, would be an area for further research.