Forecasting River Flow in the USA: A Comparison between Auto-Regression and Neural Network Non-Parametric Models

Forecasting a time series is one of the most challenging tasks for a variety of data sets. The large number of parameters to be estimated, together with the effect of uncertainty and outliers in the measurements, makes time series modeling complicated. Recently, Artificial Neural Networks (ANNs) have become a successful tool for handling time series modeling problems. This paper provides a solution to the river flow forecasting problem for two well-known rivers in the USA: the Black Water River and the Gila River. The selected ANN models were trained to forecast the daily flows at station no: 02047500 on the Black Water River near Dendron in Virginia and at station no: 0944200 on the Gila River near Clifton in Arizona. The feed-forward network is trained using the conventional back-propagation learning algorithm with many variations in the NN inputs. We explored models built using various historical data. The selection process for the architectures and training data sets of the proposed NN models is presented. A comparative study of the ANN and the conventional Auto-Regression (AR) models indicates that the artificial neural networks performed better than the AR model. Hence, we recommend ANN as a useful tool for river flow forecasting.


INTRODUCTION
Water is the source of life for all creatures. River flow forecasting can help protect against water shortage and flood damage, and can support agricultural management. Different models have been proposed for forecasting the daily flow of rivers [1,2,3]. Linear prediction (LP) models [4,5], such as Auto-Regressive models, and Neural Network models have been used in a variety of forecasting problems [6]. Selecting a suitable model for forecasting is a complicated and difficult process. The difficulties include data availability, the size of the basins of interest and the different sensing and measuring instruments being used.
Recently, artificial neural networks have been introduced as a useful tool for modeling hydrologic processes. ANNs have shown a strong capability in handling a diversity of problems, including rainfall-runoff, water quality, sedimentation and rainfall forecasting. They have also proved to be an efficient and well-tested model in a number of other applications [7,8], such as sales prediction [9], shift failures [10], price estimation [11] and stock returns [12].
In this paper, we present another neural network forecasting application, namely river flow forecasting for the Black Water River and the Gila River in the USA. The proposed NN models were developed and their forecasting performance was evaluated on both rivers. Many authors have developed a variety of NN models to solve the river flow forecasting problem [13,14,15]. We investigate the use of AR and NN-based regression models in solving this problem. The proposed NN was also trained using the Levenberg-Marquardt technique to provide better forecasting capabilities [16].
Artificial neural network: An Artificial Neural Network (ANN) is an information processing paradigm inspired by the way biological nervous systems, such as the brain, process information. The key element of this paradigm is the novel structure of the information processing system. It is composed of a large number of highly interconnected processing elements (neurons) working in unison to solve a specific problem. Neural networks can derive meaning from complicated or imprecise data. There are many types of neural network, but back-propagation neural networks are the most widely known type [17,18].
Most ANNs contain three main types of layers. The first layer is called the input layer. Its main task is to receive input from the outside world. The number of neurons in this layer is equal to the number of model inputs. The layer next to the input layer is called the hidden layer. This layer receives input from the immediately preceding layer. The final layer of the network is called the output layer. The neurons in this layer present the output of the network. Neurons in any layer are fully connected to all neurons in the next layer. Neurons in the same layer are not connected to each other. A weighted sum of the neuron inputs forms the argument of the activation function (e.g. a sigmoid). This activation function is assumed to be nonlinear.
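The layer structure described above can be illustrated with a minimal forward-pass sketch. It is not the authors' implementation: the bias terms, the linear output neuron, the function names and the layer sizes (four inputs, seven hidden neurons, one output) are assumptions made only for illustration.

```python
import numpy as np

def sigmoid(z):
    """Logistic sigmoid activation function."""
    return 1.0 / (1.0 + np.exp(-z))

def forward(x, W_hidden, b_hidden, W_out, b_out):
    """Forward pass of a three-layer feed-forward network.

    x        : (n_inputs,) input vector
    W_hidden : (n_hidden, n_inputs) input-to-hidden weights
    b_hidden : (n_hidden,) hidden biases (assumption, not in the text)
    W_out    : (n_hidden,) hidden-to-output weights
    b_out    : scalar output bias (assumption, not in the text)
    """
    # Each hidden neuron applies the sigmoid to a weighted sum of all inputs.
    hidden = sigmoid(W_hidden @ x + b_hidden)
    # Single output neuron; a linear output is assumed here.
    return float(W_out @ hidden + b_out)

# Illustrative sizes only: 4 inputs, 7 hidden neurons, 1 output.
rng = np.random.default_rng(0)
x = rng.standard_normal(4)
W_hidden = rng.standard_normal((7, 4))
b_hidden = np.zeros(7)
W_out = rng.standard_normal(7)
print(forward(x, W_hidden, b_hidden, W_out, 0.0))
```

Because every hidden neuron sees every input, the input-to-hidden weights form a full matrix; the absence of connections within a layer is what allows the whole layer to be computed in one matrix-vector product.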
An example of a three-layer feed-forward ANN is shown in Fig. 1. The network structure shown can simulate the behavior of a model that has four inputs and one output. The number of neurons in the hidden layer depends on the complexity of the problem.
The study area: The flow data were recorded and collected from two stations operated by the U.S. Geological Survey (USGS). The 1st station, No: 02047500, for the Black Water River is located near Dendron, Virginia [1], and the 2nd station, No: 0944200, for the Gila River is located near Clifton, Arizona [1]. The location of these stations is shown in Fig. 2. This map was presented in [1]. For the 1st station, the training data period was from 01 Oct 1990 to 30 Sept 1996 (six water years) and the testing data were from the period 01 Oct 1996 to 30 Sept 1997 (one water year). For the 2nd station, the training data period was from 01 Oct 1995 to 30 Sept 1998 (three water years) and the testing data were from the period 01 Oct 1998 to 30 Sept 1999 (one water year).
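The water-year split for the 1st station can be sketched as follows. The helper names are hypothetical, and only the dates are generated here; no flow values are included. The sketch assumes the usual convention that a water year runs from 1 October to 30 September and is labeled by the calendar year in which it ends.

```python
from datetime import date, timedelta

def water_year(d):
    """Water year of a date: 1 Oct of year Y-1 through 30 Sep of year Y is water year Y."""
    return d.year + 1 if d.month >= 10 else d.year

def daily_dates(start, end):
    """All calendar days from start to end, inclusive."""
    n_days = (end - start).days + 1
    return [start + timedelta(days=i) for i in range(n_days)]

def split_by_period(dates, train_period, test_period):
    """Select the days that fall inside the training and testing periods."""
    train = [d for d in dates if train_period[0] <= d <= train_period[1]]
    test = [d for d in dates if test_period[0] <= d <= test_period[1]]
    return train, test

# Periods stated in the text for the 1st station.
dates = daily_dates(date(1990, 10, 1), date(1997, 9, 30))
train, test = split_by_period(
    dates,
    (date(1990, 10, 1), date(1996, 9, 30)),
    (date(1996, 10, 1), date(1997, 9, 30)),
)
print(len(train), len(test))
print(sorted({water_year(d) for d in train}))
```

Splitting on whole water years keeps each hydrologic cycle intact in either the training or the testing set.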
Problem formulation: In our case, two models, the linear Auto-Regression model and the back-propagation model, were used to predict the future flow of both the Black Water and Gila Rivers. Both models were trained and tested on different sets of data.
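As an illustration of the linear AR model, the following sketch fits an AR(p) model of the form y(t) = a1 y(t-1) + ... + ap y(t-p) by ordinary least squares. The notation, the function names, the absence of an intercept term and the use of least squares are assumptions; the text does not state how the AR coefficients were estimated. The series used is synthetic and noise-free, not river data.

```python
import numpy as np

def lagged_matrix(y, p):
    """Regressors [y(t-1), ..., y(t-p)] and targets y(t) for t = p, ..., N-1."""
    y = np.asarray(y, dtype=float)
    X = np.column_stack([y[p - k: len(y) - k] for k in range(1, p + 1)])
    return X, y[p:]

def fit_ar(y, p):
    """Least-squares estimate of the AR(p) coefficients a1, ..., ap."""
    X, target = lagged_matrix(y, p)
    coeffs, *_ = np.linalg.lstsq(X, target, rcond=None)
    return coeffs

def predict_ar(y, coeffs):
    """One-step-ahead predictions for t = p, ..., N-1."""
    X, _ = lagged_matrix(y, len(coeffs))
    return X @ coeffs

# Synthetic series generated by a known AR(2) recursion (placeholder data).
series = [1.0, 2.0]
for _ in range(28):
    series.append(0.6 * series[-1] + 0.3 * series[-2])

coeffs = fit_ar(series, p=2)
print(coeffs)
```

The same lagged regressor matrix can serve as the input to a neural network, so that both models are compared on identical inputs.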
We used the Sum of Squared Errors (SSE) as the evaluation criterion for the developed models. The SSE was computed for both the training and testing cases. The proposed NN consists of three layers. The number of neurons in the input layer was varied from 3 to 7, based on the order of the developed model. The hidden layer has seven hidden nodes. This number was chosen arbitrarily. The input-to-hidden and hidden-to-output weights were adjusted using the back-propagation learning algorithm. The output layer consists of one output neuron that produces the flow prediction.
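The evaluation criterion can be written as the sum, over all time steps, of the squared difference between the observed and the predicted flow. A minimal sketch follows; the function name and the sample values are illustrative assumptions, not data from the study.

```python
def sse(actual, predicted):
    """Sum of squared errors between observed and predicted values."""
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    return sum((a - p) ** 2 for a, p in zip(actual, predicted))

# Illustrative values only (not measured flows).
observed = [1.0, 2.0, 3.0]
estimated = [1.0, 1.0, 5.0]
print(sse(observed, estimated))
```

Because the SSE is not normalized by the number of samples, values computed on training and testing periods of different lengths are not directly comparable to each other; they are comparable across models evaluated on the same period.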
The network was trained using the BP algorithm. To develop our results we used the NNSYSID Matlab toolbox [17]. The adopted NN was the Neural Network Auto-Regression Matrix model (NNARX). We used the Levenberg-Marquardt technique to train the neural network [16]. In Fig. 5a, we show the actual flow as a solid line and the estimated flow as a dotted line, based on the NN7 model for the training period of the Black Water River. The developed model was validated as shown in Fig. 5b. In Fig. 6a and Fig. 6b, we show the results for both the training and testing cases of the Gila River using the NN5 model.
(7) 7.75E+07 4.12E+06 8.16E+07
Table 2: SSE for ANN and AR models, training and testing data of the Gila River
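The Levenberg-Marquardt training mentioned above can be sketched in a few lines. This is not the NNSYSID toolbox and not the authors' code: it is a simplified illustration that assumes a network fed only with lagged values of the series, a sigmoid hidden layer, a linear output, bias terms, a finite-difference Jacobian and a basic damping schedule. All names, sizes and the synthetic series are hypothetical.

```python
import numpy as np

def unpack(theta, n_in, n_hid):
    """Split the flat parameter vector into layer weights and biases."""
    i = 0
    W1 = theta[i:i + n_hid * n_in].reshape(n_hid, n_in); i += n_hid * n_in
    b1 = theta[i:i + n_hid]; i += n_hid
    W2 = theta[i:i + n_hid]; i += n_hid
    return W1, b1, W2, theta[i]

def predict(theta, X, n_hid):
    """Network output for every row of X (sigmoid hidden layer, linear output)."""
    W1, b1, W2, b2 = unpack(theta, X.shape[1], n_hid)
    H = 1.0 / (1.0 + np.exp(-(X @ W1.T + b1)))
    return H @ W2 + b2

def jacobian(theta, X, n_hid, eps=1e-6):
    """Forward-difference Jacobian of the predictions with respect to theta."""
    base = predict(theta, X, n_hid)
    J = np.empty((X.shape[0], theta.size))
    for j in range(theta.size):
        shifted = theta.copy()
        shifted[j] += eps
        J[:, j] = (predict(shifted, X, n_hid) - base) / eps
    return J

def train_lm(X, y, n_hid, n_iter=30, lam=1e-2, seed=0):
    """Minimize the SSE with damped Gauss-Newton (Levenberg-Marquardt) steps."""
    rng = np.random.default_rng(seed)
    theta = 0.1 * rng.standard_normal(n_hid * X.shape[1] + 2 * n_hid + 1)
    sse = np.sum((y - predict(theta, X, n_hid)) ** 2)
    for _ in range(n_iter):
        r = y - predict(theta, X, n_hid)
        J = jacobian(theta, X, n_hid)
        # Solve (J^T J + lam * I) step = J^T r for the parameter update.
        step = np.linalg.solve(J.T @ J + lam * np.eye(theta.size), J.T @ r)
        candidate = theta + step
        new_sse = np.sum((y - predict(candidate, X, n_hid)) ** 2)
        if new_sse < sse:
            # Accept the step and reduce the damping (closer to Gauss-Newton).
            theta, sse, lam = candidate, new_sse, max(lam * 0.1, 1e-10)
        else:
            # Reject the step and increase the damping (closer to gradient descent).
            lam *= 10.0
    return theta, sse

# Synthetic placeholder series (not river data), three lagged inputs.
series = np.sin(0.3 * np.arange(80))
n_lags = 3
X = np.column_stack([series[n_lags - k: len(series) - k] for k in range(1, n_lags + 1)])
y = series[n_lags:]
theta, final_sse = train_lm(X, y, n_hid=3)
print(final_sse)
```

A step is accepted only when it lowers the SSE, so the training error never increases from one iteration to the next; this is the main practical difference from plain gradient-descent back-propagation with a fixed learning rate.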

CONCLUSION
In this study, we presented a detailed comparison between Artificial Neural Network and Auto-Regression models in solving the river flow forecasting problem. We concluded that neural networks can offer several advantages over conventional modeling approaches. The most important among them is their ability to develop a generalized solution to the forecasting problem from a given set of examples. We showed that ANN models can be trained to forecast the daily flows of the Black Water River near Dendron in Virginia and the Gila River near Clifton in Arizona. The ANN models were found to perform better at forecasting daily river flow than the conventional AR model.