COMBINING LONG MEMORY AND NONLINEAR MODEL OUTPUTS FOR INFLATION FORECAST

Long memory and nonlinearity have been proven to be two models that are easily mistaken for each other. In other words, nonlinearity is a strong candidate for spurious long memory, since it can introduce a degree of fractional integration that lies in the region of long memory. A nonlinear process, however, belongs to short memory with zero integration order. The idea of forecasting is to obtain the future condition with minimum error. Some researchers have argued that no matter what the model is, the important thing is that we can generate a reliable forecast. Several tests have been proposed to solve the problem of distinguishing whether long memory or nonlinearity appears in a series. The power of these tests is somewhat questionable in the sense that there is still a probability of obtaining a spurious result. To overcome this, model combination is one solution for dealing with uncertainty in model selection. In this case, it is assumed that both processes are candidates for the best model, each with a certain power to generate a good forecast. This research investigates the performance of three model combination approaches to forecast Indonesian inflation, i.e., simple combination using balance weight, inverse Mean Square Prediction Error (MSPE) weight and Bayesian Model Averaging (BMA). These methods are capable of generating a reliable forecast for very short lead times. Combination using BMA outperforms simple averaging for the 1-step-ahead forecast, while MSPE performs best for long lead forecasts.


INTRODUCTION
Time series forecasting is intended to generate a model which is able to produce a reliable forecast. The modeling step normally begins with series identification. A proper identification step leads to the best model; otherwise, incorrect identification leads to a spurious model which produces bad forecasts or high prediction error. The latter condition depends strongly on the test statistic applied for the model identification. Long memory is one of the phenomena in time series, where the dependence between observations is still observed at long lags. In fact, long memory can easily be confused with other time series models such as nonlinear models (Kuswanto and Sibbertsen, 2007), which is known as the spurious long memory problem. Lobato and Savin (1998) and the references therein discuss the real and spurious long memory properties of stock market data. They investigated major causes of spurious long memory, such as aggregation, nonstationarity and regime switching. It is well known that several processes are able to create spurious long memory by generating a certain degree of fractional integration. Several works have been devoted to this topic, such as Ohanissian et al. (2008) and Kuswanto (2011). These tests are developed by utilizing the properties of flow aggregation for long memory. Hurvich et al. (1998) argued that both aggregation procedures have similar properties concerning the invariance of the memory parameter. In contrast, Kuswanto et al. (2012) proved by a simulation study that the invariance of the memory parameter does not hold for stock aggregation. In the study, Kuswanto et al.

JMSS
(2012) proposed a simple guidance that can be used to distinguish between true and spurious long memory, designed specifically for skip-sampled time series data.
The main issue with a statistical test is always the power of the test. In fact, the existing tests cannot detect spurious long memory perfectly. This means that there is uncertainty in the model choice, leading to a probability of obtaining a wrong identification result. To overcome this problem, we turn to the idea of combining the forecast outputs of both competing models instead of selecting the best model. This idea is quite reasonable and straightforward, as forecasters in fact never know the true model, especially in the case of long memory versus a nonlinear process. Incorporating information from both processes may increase the reliability of the forecast. Model combination in time series has been introduced in several studies, such as Hibon and Evgeniou (2005), Drought and McDonald (2011) and Kuswanto (2012), among others. However, none of them discusses model combination specifically between long memory and nonlinearity, i.e., two models that are easily misspecified for each other. Moreover, in those studies the combination is carried out by simply combining the model outputs without taking into account the performance of each model. This condition may lead to an unreliable forecast; hence, this research also examines another combination procedure, namely Bayesian Model Averaging (BMA). The idea of BMA is to assign a proportional weight to each model output. The BMA applied in this research adopts the methodology of Raftery et al. (2005), which corrects the bias prior to the estimation of the variance and weights.
This study investigates the performance of the aforementioned forecast combination approaches for forecasting inflation in Indonesia. Forecasts from two spurious long memory models belonging to the class of nonlinear models, i.e., the Markov Switching and the Logistic Smooth Transition Autoregressive (LSTAR) model, are combined with the forecast from the long memory model. It is expected that the combination is capable of producing a more reliable forecast. Three forecast lead times are examined, i.e., one, six and twelve months, and the forecast performance is evaluated.
The study is organized as follows. Section 2 briefly presents an overview of long memory as well as the examined nonlinear models. A brief description of the combination approaches is also given in this section. The stylized facts and the results of forecasting the Indonesian inflation using forecast combination are presented in Section 3, and Section 4 concludes.

Literature Review
This section discusses some theoretical background of the long memory and spurious long memory models.

Long Memory Process
Long memory means that observations remain strongly correlated up to very long lags. A time series Y_t, t = 1,…,N is said to be long memory if the correlation function ρ(k) for k→∞ behaves as ρ(k) ~ C_ρ k^(2d-1), where C_ρ is a constant and d ∈ (0, 0.5) is the memory parameter. A long memory process has a correlation function that decays hyperbolically. If d ∈ (-0.5, 0) the process has short memory and is antipersistent, while for d ∈ (0.5, 1) the process is said to be nonstationary but mean reverting. Beran (1994) provides details about the process.
The GPH method was first introduced by Geweke and Porter-Hudak (1983). It characterizes the memory behavior by introducing a fractional degree of differencing. The estimator is calculated from m periodogram ordinates I(λ_j) at the Fourier frequencies λ_j = 2πj/N, where m is a positive integer smaller than N. It is derived from the spectral density by taking the logarithm on both sides of the equation, which yields the linear regression model log I(λ_j) = c - d log(4 sin²(λ_j/2)) + error, so the memory parameter can be estimated by the standard least squares procedure; the GPH estimator of d is the negative of the estimated slope. GPH is very simple to calculate and does not require knowledge of the short-run dependencies of the process. Several studies showed that m = N^0.8 yields the optimal MSE (Hurvich et al., 1998).

The Autoregressive Fractionally Integrated Moving Average (ARFIMA) model is a popular model to forecast a long memory process. ARIMA and ARFIMA differ in the value of the estimated integration parameter d, where ARFIMA has a fractional d representing the degree of long memory. Reisen et al. (2001) provide thorough steps for ARFIMA modeling.
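The fractional difference operator (1-B)^d at the heart of ARFIMA expands into a series with coefficients π_0 = 1 and π_k = π_{k-1}(k-1-d)/k. A minimal sketch of this standard recursion (the function name is our own):

```python
def fracdiff_weights(d, k_max):
    """Coefficients pi_k of the fractional difference operator (1-B)^d,
    computed by the recursion pi_k = pi_{k-1} * (k - 1 - d) / k."""
    w = [1.0]
    for k in range(1, k_max + 1):
        w.append(w[-1] * (k - 1 - d) / k)
    return w
```

For d = 1 the weights collapse to ordinary first differencing (1, -1, 0, …), while a fractional d such as the 0.261 estimated later in the paper produces slowly decaying weights.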

Markov Switching Model
The Markov switching model was introduced by Hamilton (1989) and has been proven to be a good model for describing the nonlinear dynamics of financial time series. The Markov Switching autoregressive model defined in Timmermann (2006) can be written as y_t = µ_{S_t} + Σ_{i=1}^{p} φ_i y_{t-i} + ε_t with ε_t ~ N(0, σ²_{S_t}), where S_t = 1,2,…,k is the latent state indicator following a k-state ergodic Markov process defined by the transition probabilities p_ij = P(S_t = j | S_{t-1} = i), where i,j = 1,2,…,k index the k different possible states or regimes satisfying 0 ≤ p_ij ≤ 1 and Σ_{j=1}^{k} p_ij = 1. Maximum likelihood can be applied to estimate the model parameters (AR coefficients and residual variances) if the states S = (S_{p+1},…,S_n) are known.
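For a two-regime chain, the ergodicity assumed above implies a stationary regime distribution with a simple closed form. A small sketch with hypothetical staying probabilities p11 and p22 (function names are our own):

```python
def stationary_two_state(p11, p22):
    """Stationary (ergodic) regime probabilities of a 2-state Markov chain
    with staying probabilities p11 = P(S_t=1 | S_{t-1}=1) and p22."""
    p12, p21 = 1.0 - p11, 1.0 - p22
    pi1 = p21 / (p12 + p21)
    return pi1, 1.0 - pi1

def expected_duration(p_stay):
    """Expected number of consecutive periods spent in a regime."""
    return 1.0 / (1.0 - p_stay)
```

With p11 = 0.9 and p22 = 0.8, for example, the chain spends two thirds of the time in regime 1 and regime-1 spells last 10 periods on average.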
Smooth Transition Autoregressive Model
In the Smooth Transition Autoregressive (STAR) model, the observation y_t switches smoothly between regimes; in this case there are two regimes, so the dynamics of y_t are computed over both regimes, y_t = φ₁'x_t (1 - G(z_t)) + φ₂'x_t G(z_t) + ε_t with x_t = (1, y_{t-1},…,y_{t-p})', where each regime has a different magnitude and degree of influence. The interpretation of the STAR model depends on the smooth transition function G(z_t).
There are two popular transition functions, i.e., the logistic function and the exponential function, which differ only in the form of the smoothing function. However, some previous studies have shown that the two transition functions yield results that are not significantly different. This paper uses the logistic smooth transition function defined as G(z_t) = (1 + exp(-γ(z_t - c)))^(-1), where z_t = y_{t-d} and the delay parameter d is a positive integer (d ≥ 1). Using the logistic function yields the so-called LSTAR model. The parameter c is the threshold parameter, as in the Threshold Autoregressive (TAR) model, and γ represents the degree of smoothness of the transition.

Model Combinations
Forecast combination and ensemble forecasting are procedures to increase the accuracy and reduce the variability of forecast results. Combination is done by combining the forecasts generated from different time series models, with the expectation that the combined forecast will be more reliable than a single-model forecast. There are several techniques to combine forecasts, i.e., simple combination and Bayesian Model Averaging (BMA).

Simple Combination
Simple combination is done by summing up the forecasts of the individual models, each weighted with a certain weight. The forecast combination for y_{T+h} according to Ravazzolo (2007) with the simple combination scheme is ŷ_{T+h} = Σ_{k=1}^{K} w_k ŷ^k_{T+h}, where ŷ^k_{T+h} is the h-step-ahead forecast of the k-th model. Ravazzolo (2007) introduces two mechanisms for simple model averaging as follows:

Balance Weight
Balance weight assigns the same weight to every individual model forecast, i.e., w_k = 1/K for K models. The balance weight is optimal in the situation where the residual variances are homogeneous and identical (Timmermann, 2006).
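The simple combination scheme with balance weight reduces to an equal-weight average of the K model forecasts. A minimal sketch (function name is our own):

```python
def combine_forecasts(forecasts, weights=None):
    """Weighted forecast combination y-hat = sum_k w_k * y-hat_k.
    With no weights given, the balance weight w_k = 1/K is used."""
    k = len(forecasts)
    if weights is None:
        weights = [1.0 / k] * k  # balance weight: simple averaging
    assert abs(sum(weights) - 1.0) < 1e-9  # weights must sum to one
    return sum(w * f for w, f in zip(weights, forecasts))
```

The same function serves the MSPE and BMA schemes later in the paper; only the weight vector changes.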

Inverse Mean Square Prediction Error (MSPE) Weight
In the second scheme, the weight is obtained from the inverse Mean Square Prediction Error (MSPE) of each model relative to the others, calculated using a window of m past observations (Timmermann, 2006). The estimation error of the combination weights tends to be higher due to the difficulty of accurately estimating the variance-covariance matrix of the forecast residuals. One way to overcome this problem is to ignore the correlation between residuals, so that the combination weight reflects the relative performance of each individual model compared with the average model. The MSPE of each model is obtained by averaging the squared forecast errors over the m-window, as in Equation 1:

MSPE_k = (1/m) Σ_{i=1}^{m} (y_{T-i+1} - ŷ^k_{T-i+1})²   (1)

The weight of each model is then calculated as Equation 2:

w_k = MSPE_k^{-1} / Σ_{j=1}^{K} MSPE_j^{-1},  k = 1,…,K   (2)
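Equations 1 and 2 can be sketched directly: compute each model's MSPE over its m-window of past forecast errors, then normalize the inverses so the weights sum to one.

```python
def inverse_mspe_weights(errors):
    """errors[k] is the list of the k-th model's forecast errors over the
    m-window. Returns weights proportional to 1 / MSPE_k (Eq. 1-2)."""
    mspe = [sum(e * e for e in errs) / len(errs) for errs in errors]  # Eq. 1
    inv = [1.0 / v for v in mspe]
    total = sum(inv)
    return [v / total for v in inv]  # Eq. 2: normalized inverse MSPE
```

A model whose past errors are half the size of another's receives four times the weight, since MSPE is quadratic in the errors.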

Bayesian Model Averaging
Bayesian Model Averaging (BMA) assigns a certain weight to each model in the forecast combination (Wang and Ma, 2008). Suppose that Y_t = (y_t, y_{t-1},…,y_1)' is the vector of observations up to time t. The BMA predictive density is the weighted mixture p(y_{t+h} | Y_t) = Σ_{k=1}^{K} p(M_k | Y_t) p(y_{t+h} | M_k, Y_t), where the weight of model M_k is its posterior probability given the data. Following Raftery et al. (2005), the component densities are taken to be Gaussian, centred at the bias-corrected model forecasts, and the weights and variance are estimated from the training window.
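The BMA predictive mixture can be sketched as below, assuming Gaussian component densities with a common variance in the spirit of Raftery et al. (2005); the EM estimation of the weights and variance is omitted for brevity, so the weights here are taken as given.

```python
import math

def _normal_pdf(y, mu, sigma):
    """Density of N(mu, sigma^2) at y."""
    z = (y - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))

def bma_predictive_pdf(y, forecasts, weights, sigma):
    """BMA predictive density: a weighted mixture of Gaussian components
    centred at the individual model forecasts."""
    return sum(w * _normal_pdf(y, f, sigma) for w, f in zip(weights, forecasts))

def bma_predictive_mean(forecasts, weights):
    """Point forecast of the BMA mixture: the weighted mean."""
    return sum(w * f for w, f in zip(weights, forecasts))
```

The interval forecasts reported later can be read off the quantiles of this mixture density.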

Data
This study analyzes the monthly inflation of the Indonesian economy spanning January 2000 to August 2012. The forecast combination is carried out on the inflation forecasts generated from the considered models.

Steps of the Analysis
The steps of the analysis carried out in this study are as follows:
• Investigate the stylized facts of the inflation data
• Generate the inflation forecasts by the long memory model and the two spurious processes
• Apply the model combination approaches
Having applied a nonlinearity test to the series, it is concluded that the inflation moves nonlinearly. Moreover, testing for long memory has also been applied to the series, and it is found that the series has the characteristics of a long memory process, exhibiting a certain degree of fractional integration. Hence, the Indonesian inflation is a candidate spurious long memory process. Therefore, forecast combination is conducted to generate the forecast instead of selecting the best model. In fact, the best model is selected based on the minimum average error, and no single model consistently generates the best forecast in all periods. Prior to applying the forecast combination, the forecasts for 1 month ahead, 6 months ahead and 12 months ahead are generated from the long memory model and the nonlinear models (Markov Switching and LSTAR).

Forecasting with ARFIMA
The first stage of ARFIMA model building is the identification of some possible ARFIMA models with different order combinations. The best model is then selected to generate the forecast, using the criteria of a small AIC, all parameters being significant and the model residuals satisfying the assumptions of being white noise and normally distributed. Among the combinations, several candidate ARFIMA models have small AIC, as shown in Table 1.
Based on the table, the smallest AIC is produced by ARFIMA(3,d,1). However, among those models, ARFIMA(1,d,0) is the only model which satisfies the assumptions required for the residuals. Therefore, the forecasts for the three defined lead times are generated by ARFIMA(1,d,0). The model is a stationary long memory process with a fractional differencing order of 0.261.

Forecasting with Markov Switching and LSTAR
Similar to the forecasting with ARFIMA, the Markov Switching model is selected by considering the minimum AIC as well as the residual assumptions. The smallest AIC is obtained by the Markov Switching AR(1) model, whose residuals satisfy the required assumptions. The modeling process is done by estimating the transition matrix of the series; this research uses two regimes. Another nonlinear model used to forecast the inflation is the Logistic Smooth Transition Autoregressive (LSTAR) model. It is assumed that the delay equals two and that the series transits between two regimes. Similar modeling steps as for ARFIMA and Markov Switching have been carried out; however, a best model which satisfies the assumption of normally distributed residuals cannot be obtained. As the idea of the forecast is to minimize the forecast error, the best model is selected under the conditions of minimum AIC and white noise residuals. In this case, LSTAR(1) is the candidate best model.

Forecast Combination
The forecast combination is done by assigning a certain weight to each model output (forecast). Among the three methods, the difference lies only in the weight assigned to the forecast. The balance weight gives the same weight to each forecast, while MSPE and BMA estimate weights proportional to the performance of the models. These two latter methods use a training window (m) to estimate the weights. This study uses m = 6, m = 9 and m = 12, so that for each m the number of combined forecasts is fewer than with a single model, by a lag of m-1 for MSPE and m for BMA. Therefore, although the balance weight does not need a training window, the forecasts are compared over the same period as MSPE.

Forecast Combination Using Balance Weight
As described in the previous section, the balance weight assigns the same weight to each forecast. Since we have three models to be combined, the weight is 1/3, i.e., simple averaging. The variance of the combined forecast is Var(ŷ_c) = Σ_k w_k² σ_k² + 2 Σ_{i<j} w_i w_j σ_ij, which with equal weights reduces to 1/9 times the sum of the variances and covariances of the individual forecasts. The result of the combination of ARFIMA, Markov Switching and LSTAR for the Indonesian inflation forecast is presented in the table below. In this case, the inflation series is assumed to be normally distributed, so that the estimated parameters are those of the normal distribution used to calculate the interval forecast, i.e., µ and σ².
The interval is used to assess the forecast performance, i.e., whether the forecast is capable of capturing the observation or not. A good forecast captures the observation with a small interval width. In order to clearly assess the performance of the forecast interval, Fig. 2 depicts plots of the forecasts and their corresponding observations. Only the last six periods are presented as an illustration.
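The two quantities used to judge the interval forecasts, the share of observations captured and the interval width, can be sketched as a simple coverage computation (function name is our own):

```python
def interval_coverage(obs, lower, upper):
    """Empirical coverage of forecast intervals: the share of observations
    falling inside [lower, upper], plus the mean interval width."""
    n = len(obs)
    hits = sum(1 for y, lo, hi in zip(obs, lower, upper) if lo <= y <= hi)
    mean_width = sum(hi - lo for lo, hi in zip(lower, upper)) / n
    return hits / n, mean_width
```

A reliable interval forecast shows high coverage together with a small mean width; wide intervals can buy coverage at the cost of resolution, which is why the paper also reports the CRPS later.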
From the figure, it can be seen that for the forecast combination with balance weight for the lead-12 forecast over the six displayed periods, only two observations are captured by the interval. Overall, only 50% of the observations lie within the intervals. The complete forecast period is shown in Fig. 3. Based on the figure, the 1-month forecast captures most of the observations. The figures also show that the forecast interval gets wider with longer lead time; nevertheless, it fails to capture most of the observations.

Forecast Combination using Inverse Mean Square Prediction Error (MSPE) Weight
The concept of estimating the weight using this method was discussed in subsection 2.3.1. If the MSPE takes a small value over the m-period forecast window, the model is sufficiently accurate to forecast the observations and hence its weight is larger.
From Fig. 4, we can see that the MSPE forecast performs well, being able to capture the observations. The illustrations of the forecasts 6 and 12 months ahead are skipped for the sake of simplicity. The comparison of the forecasting results for the whole period using m = 6 can be seen in Fig. 5. The results for m = 9 and m = 12 are omitted for reasons of space.

Forecast Combination Using Bayesian Model Averaging (BMA)
In general, forecast combination using simple model averaging does not yield a reliable forecast for inflation, whether for lead 1, lead 6 or lead 12. It is expected that combination using Bayesian Model Averaging will improve the forecast reliability. Similar to MSPE, calibration using BMA requires a training window m. Figure 6 below presents the forecast performance for several selected months only.
Based on Fig. 7, using m = 6 we find that 5 observations can be captured by the interval, i.e., 88.46% of the observations lie within the interval of the BMA forecast. Moreover, the interval is reliable enough, with a proper width. This shows that the BMA performs well for the lead-1 forecast, especially using m = 6. The interval forecasts for leads 6 and 12 are not as good as for lead 1; in particular, m = 12 yields very poor performance.

Comparison of the Forecast Accuracy of the Combined Forecasts
This section presents a comparison of the forecast accuracy using the MSE and MAPE criteria. These two criteria assess the forecast performance deterministically. Table 2 summarizes the values.
Based on the values in the table, we can see that for the lead-1 forecast the minimum MSE is reached by forecast combination using BMA with m = 9, while for leads 6 and 12 the MSPE weighting outperforms the other two. Among all settings, m = 6 gives the lowest MSE and MAPE; therefore, the Indonesian inflation is better forecasted with a 6-month training window.

MSE and MAPE assess only the bias of the forecast, without taking into account the width of the forecast interval. In order to assess both the accuracy and the resolution of the forecast, the Continuous Ranked Probability Score (CRPS) is used. The idea of the CRPS is to calculate the difference between the CDF of the combination result and the CDF of the observed inflation data. A smaller CRPS indicates better forecast reliability. Table 3 presents the mean CRPS over the whole forecast period.
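Under the paper's normality assumption, the CRPS of a Gaussian predictive distribution N(µ, σ²) evaluated at observation y has a known closed form, CRPS = σ[z(2Φ(z) − 1) + 2φ(z) − 1/√π] with z = (y − µ)/σ, which can be sketched as:

```python
import math

def crps_gaussian(y, mu, sigma):
    """Closed-form CRPS of a Gaussian predictive distribution N(mu, sigma^2)
    against observation y; smaller values indicate a better forecast."""
    z = (y - mu) / sigma
    pdf = math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)   # phi(z)
    cdf = 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))          # Phi(z)
    return sigma * (z * (2.0 * cdf - 1.0) + 2.0 * pdf - 1.0 / math.sqrt(math.pi))
```

Unlike MSE, the score grows both when the forecast misses the observation and when the predictive interval is unnecessarily wide, so it rewards accuracy and resolution simultaneously.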
The CRPS shows that BMA with m = 9 yields the best forecast for lead 1, while MSPE outperforms BMA and the balance weight for the forecasts at leads 6 and 12. A general conclusion on whether forecast combination always outperforms the single model can be drawn from a simulation study, which is the subject of future research.

CONCLUSION
This research applies three different forecast combination approaches to forecasting Indonesian inflation. It has been shown that the Indonesian inflation can be modeled by a long memory model as well as by nonlinear models. However, it is unclear whether the long memory is true or spurious. The results of the analysis show that forecast combination can be a good approach for resolving the confusion between these two competing processes. In terms of forecast accuracy, model combination outperforms the single models, although the errors are not significantly different. In reality, however, we never know which single model will generate the best forecast of future inflation. Forecast combination solves this problem by utilizing the information in the forecasts generated by all models. Among the three combination approaches, BMA performs best for the 1-step-ahead forecast, while MSPE performs well for long lead time forecasts.