Parameters Estimate of Autoregressive Moving Average and Autoregressive Integrated Moving Average Models and Compare Their Ability for Inflow Forecasting

This study evaluates the ability of Autoregressive Moving Average (ARMA) and Autoregressive Integrated Moving Average (ARIMA) models to forecast the monthly inflow of the Dez dam reservoir, using data from the Taleh Zang station upstream of the dam. The ARIMA model has found widespread application in many practical sciences, and dam reservoir inflow has been forecast by methods such as ordinary linear regression, ARMA and artificial neural networks. However, previous research has not applied ARMA and ARIMA models together in order to compare their ability in autoregressive forecasting of monthly reservoir inflow. This paper therefore forecasts the inflow of the Dez dam reservoir with both ARMA and ARIMA models, increasing the number of parameters to four in order to improve forecast accuracy, and compares the two. In the ARMA and ARIMA models, the polynomial was derived with four and six parameters, respectively, to forecast the inflow. Comparison of the root mean square error of the models showed that the ARIMA model can forecast inflow to the Dez reservoir 12 months in advance with lower error than the ARMA model.


INTRODUCTION
Accurate estimation of the monthly inflow to a reservoir is important in water resources management because of its role in reservoir management and operation, hydroelectric energy generation and the design of control structures. In this study, the monthly inflow to the reservoir is forecast with two models, ARMA and ARIMA. After the publication of Box and Jenkins (1976), ARMA and ARIMA models, also known as Box-Jenkins models, became a general class of time series models for hydrological forecasting. An ARIMA model is a generalization of an ARMA model. Recovering the original information requires integrating the series (for a continuous series) or summing its differences (for a discrete series). Because the constant of integration is lost in differentiation or differencing, the original level or mean of the series cannot be recovered in this process. Therefore, ARIMA models are nonstationary and cannot be used to reconstruct missing data. However, these models are very useful for forecasting changes in a process (Karamouz and Araghinejad, 2012). Time series models (ARMA and ARIMA) are widely applied in various fields of hydrology; some of these applications are described below. Baareh et al. (2006) applied artificial neural network (ANN) and autoregressive (AR) models to the river flow forecasting problem. A comparison of the ANN and the conventional AR model indicated that the ANN performed better. They showed that ANN models can be trained to forecast the daily flows of the Black Water River near Dendron in Virginia and the Gila River near Clifton in Arizona. Xiong and O'Connor (2002) applied four error-forecast updating models, Autoregressive (AR), Autoregressive-Threshold (AR-TS), Fuzzy Autoregressive-Threshold (FU-AR-TS) and Artificial Neural Network (ANN), to real-time river flow forecasting.
They found that all four updating models were very successful in improving flow forecast accuracy. Chenoweth et al. (2000) estimated ARMA model parameters using neural networks. Their results showed that the ability of neural networks to identify the order of an ARMA model accurately was much lower than reported by previous researchers and was especially low for time series with fewer than 100 observations. Forecasting hydrologic time series with ridge regression in feature space, Yu and Liong (2007) showed that training with this data mining method was much faster than with the ARIMA model. See and Abrahart (2001) used data fusion for hydrological forecasting. Their results showed that applying data fusion methodologies to ANN, fuzzy logic and ARMA models increased forecasting accuracy. Using hybrid approaches, Srinivas and Srinivasan (2000) improved the accuracy of AR model parameters for annual streamflows. Using Fourier coefficients, Ludlow and Enders (2000) estimated ARMA model parameters with relatively good accuracy. Chenoweth et al. (2004) estimated ARMA model parameters using Hilbert coefficients. Their results showed that Hilbert coefficients are a useful tool for estimating ARMA model parameters. Balaguer et al. (2008) used a Time Delay Neural Network (TDNN) and an ARMA model to forecast requests for help at crisis management support centers. The correlations obtained for the TDNN and ARMA models were 0.88 and 0.97, respectively, which confirmed the superiority of the ARMA model over the TDNN. Toth et al. (2000) used artificial neural network and ARMA models to forecast rainfall. The results showed that both short-term rainfall forecasting models were successful for real-time flood forecasting. Eslami et al. (2005) forecast the Karaj reservoir inflow from snowmelt data using artificial neural network, ARMA and regression analysis methods.
About 60% of the inflow to the dam occurs between April and June, so forecasting the inflow in this season is very important for the dam's operation. The highest inflows occurred in spring because of the melting of winter snow. The results showed that the artificial neural network had significantly lower errors than the other methods. In another study, Mohammadi et al. (2006) estimated the parameters of an ARMA model for river flow forecasting using goal programming. Their results showed that goal programming is a precise and effective method for estimating ARMA model parameters for inflow forecasting.
The studies reviewed above demonstrate the efficacy of ARMA and ARIMA models in forecasting and hydrologic sampling compared with other statistical models such as ordinary linear and nonlinear regression. However, in forecasting reservoir inflow with ARMA and ARIMA methods, the maximum number of parameters used was two. Furthermore, previous research has not used ARMA and ARIMA models concurrently in order to compare them. This study aims to forecast the inflow to the Dez reservoir using ARMA and ARIMA models, increasing the number of parameters to four in order to evaluate the effect on forecast accuracy, based on the discharge at the Taleh Zang station upstream of the Dez dam.

MATERIALS AND METHODS
The Dez basin encompasses part of the middle ranges of the Zagros mountains. The basin lies between 32°35′ and 34°07′ north latitude and 48°20′ and 50°20′ east longitude in southwestern Iran. It is bounded on the west by the Karkheh basin, on the north by the Ghareh Chay basin and on the east and south by the Karun basin. Because ARMA and ARIMA models rely on the autocorrelation of a series, data from the Taleh Zang station, at the entrance to the Dez reservoir, were used to forecast the discharge entering the reservoir. To forecast the discharge of this target station at the monthly scale, the station's monthly discharge record from water year 1960-1961 to water year 2006-2007 was selected. The data set thus comprises 564 monthly values, beginning in October 1960 and ending in September 2007.
In this study, ARMA and ARIMA models were each used to forecast the monthly flow at the Taleh Zang station. ARMA and ARIMA models are obtained by combining autoregressive and moving average models. For modeling seasonal as well as non-seasonal time series, the ARIMA (p, d, q)(P, D, Q)ω model, known as the multiplicative ARIMA model, is defined as follows (Eq. 1):

$$\phi(B)\,\Phi(B^{\omega})\,(1-B)^{d}\,(1-B^{\omega})^{D} Z_t = \theta(B)\,\Theta(B^{\omega})\,\varepsilon_t \qquad (1)$$

where $\varepsilon_t$ is the random variable, ω is the periodic term, B is the backward shift operator, $B(Z_t) = Z_{t-1}$, $(1-B^{\omega})^{D}$ is the D-th seasonal difference of period ω, $(1-B)^{d}$ is the d-th non-seasonal difference, p is the order of the non-seasonal autoregressive model, q is the order of the non-seasonal moving average model, P is the order of the seasonal autoregressive model, Q is the order of the seasonal moving average model, φ is the parameter of the non-seasonal autoregressive model, θ is the parameter of the non-seasonal moving average model, Φ is the parameter of the seasonal autoregressive model and Θ is the parameter of the seasonal moving average model (Karamouz and Araghinejad, 2012). Note that when d = D = 0 in Eq. 1, the ARIMA model reduces to the ARMA model.

The next stage is to determine the number of parameters of the ARMA and ARIMA models, which is done using the PACF and ACF curves (Cryer and Kung-Sik, 2008; Mohammadi et al., 2006). These curves are shown in Fig. 1 and 2, in which the horizontal axis shows the lag and the vertical axis shows the ACF and PACF values, respectively. The curves show that the ACF and PACF values at lags 1 and 2 are high, so up to two autoregressive parameters and two moving average parameters are sufficient (Karamouz and Araghinejad, 2012). However, in order to investigate the effect of increasing the number of parameters on forecast accuracy, up to four autoregressive parameters and up to four moving average parameters were used in this study. The next parameters to be determined are d and D, which are defined for ARIMA models.
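The identification step described above relies on the sample ACF and PACF. As a minimal sketch (not the MINITAB procedure used in the study), the two functions can be computed in plain Python as follows. The function names are illustrative, and the PACF is obtained here with the Durbin-Levinson recursion, which is an assumption about method rather than something stated in the text.

```python
def sample_acf(series, max_lag):
    """Sample autocorrelations r_0..r_max_lag of a sequence of numbers."""
    n = len(series)
    mean = sum(series) / n
    dev = [x - mean for x in series]
    c0 = sum(d * d for d in dev)  # lag-0 sum of squares
    return [sum(dev[t] * dev[t - k] for t in range(k, n)) / c0
            for k in range(max_lag + 1)]


def pacf_from_acf(acf):
    """Partial autocorrelations phi_kk from an ACF list (Durbin-Levinson)."""
    pacf = [1.0]
    prev = []  # coefficients phi_{k-1,1} .. phi_{k-1,k-1}
    for k in range(1, len(acf)):
        num = acf[k] - sum(prev[j] * acf[k - 1 - j] for j in range(k - 1))
        den = 1.0 - sum(prev[j] * acf[j + 1] for j in range(k - 1))
        phi_kk = num / den
        # update the lower-order coefficients, then append phi_kk
        prev = [prev[j] - phi_kk * prev[k - 2 - j] for j in range(k - 1)] + [phi_kk]
        pacf.append(phi_kk)
    return pacf
```

Lags at which these values stand out, such as lags 1 and 2 in the study, indicate the autoregressive and moving average orders worth trying.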
In practice, these parameters take values of at most one or two (Karamouz and Araghinejad, 2012). With each of p, P, q and Q taking values in {0, 1, 2, 3, 4} and with two cases for the presence or absence of a constant term in the models, the number of ARMA structures used to forecast the inflow to the Dez dam is 1250. With the three modes d = 1, D = 0; d = 0, D = 1; and d = D = 1, the number of ARIMA structures is 3×1250 = 3750. To account for seasonal trends in the data in the ARMA and ARIMA models, the periodic term, denoted by ω, must be specified. Given that the discharge data are monthly and that 504 data are input to the model, it is determined as follows:

$$\omega \in A \cap B$$

where A is the set of divisors of 504 and B is the set of multiples of 12. Therefore, ω ∈ {12, 24, 36, 72, 84, 168, 252, 504}. To determine the best value of this parameter, the annual discharge data were classified separately for each month and given as input to the ARMA and ARIMA models. The parameters p, q, P, Q, d and D were computed using MINITAB software, and the results are given in Tables 1-5. Figure 3 shows a flowchart of the calculation steps for the ARIMA model.
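The structure counts and the candidate periods described above can be checked by direct enumeration. The following is a minimal sketch using only the Python standard library; the function names and the tuple layout are illustrative choices, not part of the study.

```python
from itertools import product

ORDERS = range(5)  # candidate values 0..4 for each of p, q, P, Q


def arma_structures():
    """All (p, q, P, Q, constant) combinations, i.e. d = D = 0."""
    return [(p, q, P, Q, const)
            for p, q, P, Q in product(ORDERS, repeat=4)
            for const in (False, True)]


def arima_structures():
    """The same orders combined with the three differencing modes (d, D)."""
    modes = [(1, 0), (0, 1), (1, 1)]
    return [(p, d, q, P, D, Q, const)
            for (p, q, P, Q, const) in arma_structures()
            for d, D in modes]


def candidate_periods(n_obs, base=12):
    """Values that divide n_obs exactly and are multiples of `base`."""
    return [w for w in range(base, n_obs + 1, base) if n_obs % w == 0]
```

Here `len(arma_structures())` and `len(arima_structures())` reproduce the 1250 and 3750 structures counted in the text, and `candidate_periods(504)` lists the divisors of 504 that are multiples of 12.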
As shown in Fig. 3, the data for the calibration and forecasting stages are first entered into the ARIMA model. It is then determined whether the model is seasonal or non-seasonal. Since both seasonal and non-seasonal models are used in this study, the non-seasonal models were run first and then the seasonal models. If the chosen model is non-seasonal, the period need not be determined; otherwise it must be. The next step is to determine the parameters p, q, P, Q, d and D. The presence or absence of a constant term in the model is then examined. In this study, all ARIMA structures were run both with and without a constant term. In the next step, the monthly discharge is forecast. Finally, the best structure is selected based on the root mean square error. The calculation steps for the ARMA model are similar to those for the ARIMA model, except that the parameters d and D need not be determined in the parameter determination step.
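The final steps of this procedure, forecasting with every structure and keeping the one with the lowest root mean square error, can be sketched as a simple loop. This is only an outline under stated assumptions: `forecast_fn` is a hypothetical callable standing in for whatever software fits a structure and returns its forecast (MINITAB in the study), and it is assumed to return `None` when a structure cannot be fitted.

```python
import math


def rmse(computed, observed):
    """Root mean square error between two equal-length sequences."""
    n = len(observed)
    return math.sqrt(sum((c - o) ** 2 for c, o in zip(computed, observed)) / n)


def select_best_structure(structures, forecast_fn, observed):
    """Return (structure, rmse) with the lowest forecast RMSE, or None.

    forecast_fn is a hypothetical callable: it takes one structure and
    returns the forecast discharges, or None if the fit failed.
    """
    best = None
    for structure in structures:
        forecast = forecast_fn(structure)
        if forecast is None:  # skip structures that could not be fitted
            continue
        score = rmse(forecast, observed)
        if best is None or score < best[1]:
            best = (structure, score)
    return best
```

Keeping the fitting step behind a callable separates the selection logic from any particular statistics package.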
Criterion to select the best structure of ARMA and ARIMA models: To select the best structure among the ARMA and ARIMA models, the root mean square error and the mean bias error were used as follows:

$$RMSE = \sqrt{\frac{1}{n}\sum_{i=1}^{n}\left(Q_{ci}-Q_{oi}\right)^{2}}, \qquad MBE = \frac{1}{n}\sum_{i=1}^{n}\left(Q_{ci}-Q_{oi}\right)$$

where RMSE is the root mean square error, MBE is the mean bias error, i is the month number, $Q_{ci}$ is the computed discharge in month i, $Q_{oi}$ is the observed discharge in month i and n is the number of data. Finally, so that the results can be compared with those of similar studies, the error index $RMSE/\bar{Q}_{oi}$ is used, where $\bar{Q}_{oi}$ is the average observed discharge.
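These selection criteria are straightforward to compute. The sketch below uses the standard definitions of RMSE and MBE and assumes the bias is taken as computed minus observed discharge, so that a positive MBE means overestimation; the function names are illustrative.

```python
import math


def rmse(computed, observed):
    """Root mean square error of computed against observed discharge."""
    n = len(observed)
    return math.sqrt(sum((c - o) ** 2 for c, o in zip(computed, observed)) / n)


def mbe(computed, observed):
    """Mean bias error (assumed sign: computed minus observed)."""
    return sum(c - o for c, o in zip(computed, observed)) / len(observed)


def normalized_rmse(computed, observed):
    """RMSE divided by the mean observed discharge (dimensionless index)."""
    mean_observed = sum(observed) / len(observed)
    return rmse(computed, observed) / mean_observed
```

Dividing by the mean observed discharge removes the units, which is what makes the index comparable across studies and across periods with different mean flows.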
Note that $\bar{Q}_{oi}$ for the calibration period, which comprises data 445 to 504, is 335.7 m³/s, and for the forecasting period, which comprises data 505 to 564, is 199.9 m³/s. In addition, to determine the timing of the error and the best forecasting time, three criteria were used (Eq. 4-6), in which $E_i$ is the relative error in month i, $F_i$ is the average cumulative relative error in month i, $\bar{E}$ is the average relative error and $C_v$ is the coefficient of variation of the relative error.
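One plausible reading of these timing criteria is sketched below. The exact forms are assumptions made for illustration: the relative error is taken as an absolute value divided by the observed discharge, the cumulative index as the running mean of the relative errors up to month i, and the coefficient of variation as the population standard deviation divided by the mean. The study's own equations may differ in these details.

```python
import math


def relative_errors(computed, observed):
    """E_i, assumed here to be |Q_ci - Q_oi| / Q_oi for each month i."""
    return [abs(c - o) / o for c, o in zip(computed, observed)]


def cumulative_mean(values):
    """F_i, assumed here to be the mean of E_1..E_i for each month i."""
    out, total = [], 0.0
    for i, value in enumerate(values, start=1):
        total += value
        out.append(total / i)
    return out


def coefficient_of_variation(values):
    """C_v, assumed here to be population standard deviation over mean."""
    mean = sum(values) / len(values)
    variance = sum((v - mean) ** 2 for v in values) / len(values)
    return math.sqrt(variance) / mean
```

Under these assumptions, the month at which the running mean is smallest gives the forecast horizon with the lowest accumulated error.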

RESULTS
The results of forecasting the annual discharge are given in Table 1 for each month of the water year. In Table 1, for example, ARIMA (2, 1, 2)42 denotes an ARIMA model structure with two seasonal autoregressive parameters (P = 2), two seasonal moving average parameters (Q = 2), D = 1 and ω = 42.

DISCUSSION
Since the 504 monthly data correspond to 42 years, obtaining ω = 42 for most months indicates that a logical trend between the data is established only once in 42 years, i.e., there is no significant relationship between the data within the 42 years. For example, if ω = 2 in annual discharge forecasting, then ω = 2×12 = 24 in monthly discharge forecasting. However, Table 1 shows that ω = 1 in annual discharge forecasting, i.e., each year is related only to itself; therefore, the periodic term for forecasting monthly discharge in the ARMA and ARIMA models is 12. The best parameter values for the ARMA and ARIMA models are shown in Table 2.
According to Table 2, the error decreases as the number of autoregressive and moving average parameters increases. As is clear from Table 1, the best ARMA model has three seasonal autoregressive parameters and four seasonal moving average parameters. The best ARIMA model has four autoregressive parameters, one moving average parameter, one seasonal autoregressive parameter, one seasonal moving average parameter and d = 1. It is not possible to report the results of all 5000 ARMA and ARIMA structures used in this study; therefore, only the results for the best structures are presented in Table 2. The benchmark error index $RMSE/\bar{Q}_{oi}$ for the forecasting data of the ARIMA(1,1,0)(1,1,2)12 model is 0.7148, and it was chosen from among all the ARMA and ARIMA models as the best model for forecasting the inflow to the Dez reservoir at the Taleh Zang station. Table 3 and 4 show the coefficients obtained for the ARMA and ARIMA models, respectively. According to Table 3 and 4, the values obtained for the ARMA and ARIMA models, with 41 degrees of freedom, are lower than the critical value (56.9420) (Wei, 1990). Therefore, it can be concluded that these structures forecast correctly. Note that the 41 degrees of freedom were obtained by subtracting the number of parameters used in each model (excluding d and D in the ARIMA model) from the maximum lag visible in Fig. 2 and 3 (48). Figure 4 and 5 compare the ability of the best ARMA and ARIMA structures in the calibration and forecasting periods. Comparing Fig. 4 and 5 makes clear that, although the ARMA model forecasts the initial months and simulates the peak points better than the ARIMA model in the calibration period, the situation changes in the forecasting period, where the ARIMA model gives better forecasts than the ARMA model not only of the initial months but also of the peak points and the other months.
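The degrees of freedom quoted above can be verified from the stated rule. As a worked check, writing K for the maximum lag (a symbol introduced here only for illustration) and using the parameter counts reported for the best structures:

```latex
\begin{align*}
\text{ARMA:}\quad  & K - (P + Q) = 48 - (3 + 4) = 41,\\
\text{ARIMA:}\quad & K - (p + q + P + Q) = 48 - (4 + 1 + 1 + 1) = 41.
\end{align*}
```

Both best structures use seven parameters, which is why the same critical value applies to the two models.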
It should also be noted that in the forecasting period the ARMA curve often lies above the observed data, whereas the ARIMA curve often lies below them, owing to the effect of the differencing operator, which makes the time series stationary (although with lower error relative to the observed data than the ARMA model). The mismatch at the peak points can be considered a reason for the reduced accuracy of the ARIMA model.
A disadvantage of the models used is their error in forecasting peak flows. Figure 6 and 7 compare the observed data with the ARMA and ARIMA forecasts in the forecasting period, regardless of time of occurrence.
Comparing Fig. 5-7 indicates that the ARMA model overestimates almost all discharges below 260 m³/s and underestimates the other discharges. The ARIMA model, because it makes the discharge series stationary, not only balances the number of overestimated and underestimated values but also forecasts discharges below 100 m³/s with good accuracy. To study how the forecast error changes over time, the best forecasting time for the models was obtained using equations (4), (5) and (6). Table 5 shows the minimum E and F indexes, the months in which these values occur and the $C_v$ index for the forecasting period. As is clear from Table 5, in the ARIMA model the $E_{min}$ index decreased by more than 90%, and the $\bar{E}$ and $F_{min}$ indexes by nearly 50%, compared with the ARMA model. This is a significant reduction in forecast error for the ARIMA model relative to the ARMA model. The E and F indexes in the forecasting period can be compared more closely in Fig. 8 and 9, respectively. Figure 8 and 9 show that the relative error and cumulative relative error indexes are lower for the ARIMA model in most months of the forecasting period and fluctuate less over time. Therefore, the ARIMA model is more reliable. Comparison of Fig. 8, 9 and Table 5 shows lower values of the E and F indexes for the ARIMA model than for the ARMA model and no drastic changes in the cumulative relative error of the ARIMA model. In other words, the error of the ARIMA model has stabilized. The lower mean and coefficient of variation of the relative error for the ARIMA model also indicate that its error varies less than that of the ARMA model, which implies that the ARIMA model is superior to the ARMA model.
According to Table 5, the lowest value of the F index ($F_{min}$) occurs in the first forecast month, which is also clear in Fig. 9; the increasing trend of the F index in the ARMA model is likewise visible in Fig. 9. In the ARIMA model, by contrast, $F_{min}$ occurs in the tenth forecast month, and moreover, according to Fig. 9, $(F_{max})_{ARIMA} < (F_{min})_{ARMA}$. This is an important result for comparing the errors of the ARMA and ARIMA models. It means that the maximum average cumulative relative error of the ARIMA model is less than the minimum average cumulative relative error of the ARMA model. In other words, the F index of the ARIMA model never reaches the F index of the ARMA model, which again shows the superiority of the ARIMA model over the ARMA model. The diagrams obtained for the ARIMA model show that it performs better than the ARMA model for short-term forecasts. Figure 5 shows that the ARIMA model forecasts best for the first 4 months. The better performance of the ARIMA model in short-term forecasting is related to the nature of the hydrological data used in the model. Because the 504 data are fed to the model as a single series, the relationship between the data is established only from month to month; that is, the model considers the month-to-month relationships between discharges in order to forecast the discharge of a new month. Thus, the best forecasting horizon does not extend beyond one year. According to Fig. 8, 9 and Table 5, the monthly inflow can be forecast by the ARIMA model about a year in advance with good accuracy.

CONCLUSION
In this study, the ability of ARMA and ARIMA models to forecast the Dez reservoir inflow at the Taleh Zang station was compared. Monthly discharge data for a period of 42 years were collected from the Taleh Zang hydrometric station and used to calibrate the models. The accuracy of the forecasting models was then investigated with 5 years of data. To summarize, it can be concluded that:
The accuracy of both the ARMA and ARIMA models increased compared with previous studies, owing to the increase in the number of autoregressive and moving average parameters in these models.
The ARIMA model performs better than the ARMA model in both the calibration and forecasting phases because it makes the time series stationary.
Changes in the relative error, the cumulative mean relative error and the coefficient of variation of the relative error were smaller in the ARIMA model than in the ARMA model, which indicates the superiority of the ARIMA model. These changes make clear that the ARIMA model can be used to forecast the monthly inflow adequately for the next 12 months.