New Procedure to Improve the Order Selection of Autoregressive Time Series Model

Problem statement: We propose a new approach that can be used to guide the selection of the “true” order of an autoregressive model for different sample sizes. Approach: We used a simulation study to compare four model selection criteria with and without the help of the new approach. The criteria were compared in terms of the percentage of times that they identified the “true” order of the autoregressive model, with and without the help of the new approach. Results: The simulation results indicate that, overall, the new approach performed very well with all four model selection criteria compared with their performance without it, with the SBC, AICC and HQIC criteria providing the best performance in all cases. Conclusion: We recommend using the new approach with the SBC, AICC and HQIC criteria as a standard procedure to identify the “true” order of an autoregressive model.


INTRODUCTION
An Autoregressive Moving Average model, ARMA(p,q), is a model for an originally stationary time series, with orders p and q, that has the form given in Eq. 1. In this model the time series depends on p past values of itself and on q past random error terms ε_t that have E(ε_t) = 0, Var(ε_t) = σ^2 and Cov(ε_t, ε_{t-k}) = 0 for all t and all k ≠ 0. The parameters φ_1, φ_2, …, φ_p are the autoregressive parameters associated with the time series values, the parameters θ_1, θ_2, …, θ_q are the moving average parameters associated with the error terms, p is the order of the autoregressive component of the time series process and q is the order of the moving average component (Box and Jenkins, 1976; Pankratz, 1983).
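For reference, one common way of writing the ARMA(p,q) form referred to as Eq. 1 is sketched below. It assumes a zero-mean series, written here as y_t (a symbol introduced only for illustration), and the Box and Jenkins sign convention for the moving average terms; the equation as originally typeset may differ in either respect. Setting q = 0 gives the AR(p) special case studied in this article.

```latex
\begin{align}
y_t &= \phi_1 y_{t-1} + \phi_2 y_{t-2} + \dots + \phi_p y_{t-p}
       + \varepsilon_t - \theta_1 \varepsilon_{t-1} - \theta_2 \varepsilon_{t-2}
       - \dots - \theta_q \varepsilon_{t-q} \tag{1} \\
% AR(p) special case, obtained by setting q = 0 (no moving average terms):
y_t &= \phi_1 y_{t-1} + \phi_2 y_{t-2} + \dots + \phi_p y_{t-p} + \varepsilon_t \notag
\end{align}
```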
In this study we are concerned with the originally stationary Autoregressive model, AR(p), which is a special case of the Autoregressive Moving Average model, ARMA(p,q). The selection of a suitable order for the Autoregressive process is a critical step in the analysis of time series, since an inappropriate order selection may result in inconsistent parameter estimates and in an increase in the variance of the model when p is greater than the true value (Shibata, 1976). In practice many researchers recommend using an information criterion to guide the selection of the true model order among the class of candidate model orders (Hurvich and Tsai, 1991; Kadilar and Erdemir, 2002; Sen and Shitan, 2002; Nakamura et al., 2006; Aladag et al., 2010). Statisticians often use information criteria such as Akaike's Information Criterion (AIC) by Akaike (1974), Schwarz's Bayes Information Criterion (SBC) by Schwarz (1978), Hannan and Quinn's Information Criterion (HQIC) by Hannan and Quinn (1979) and the Bias-Corrected Akaike's Information Criterion (AICC) by Hurvich and Tsai (1989) to guide the selection of the true model order. More recently, many studies have proposed and evaluated new or modified criteria for selecting the true Autoregressive model order (Padmanabhan and Rao, 1982; Hurvich and Tsai, 1989; Wong and Li, 1998; Tiee-Jian and Sepulveda, 1998; Kadilar and Erdemir, 2002; Sen and Shitan, 2002; Bengtsson and Cavanaugh, 2006; Nakamura et al., 2006; Aladag et al., 2010). Unfortunately, these criteria sometimes select the true model order in only a low percentage of cases.
Our research objective is to evaluate a new approach that can be used to guide the selection of the true Autoregressive model order. We also compare four model selection criteria in terms of their ability to identify the true model order with and without the help of the new approach.

MATERIALS AND METHODS
The ARMA procedure of the SAS system is a standard tool for fitting time series data. One of the main reasons for its popularity is that it is a general-purpose procedure for time series. The ARMA procedure provides the following two model selection criteria, which users can apply to select an appropriate model order (SAS Institute Inc, 2008):
• Akaike's Information Criterion (AIC) by Akaike (1974)
• Schwarz's Bayes Information Criterion (SBC) by Schwarz (1978)
Two more model selection criteria are considered in this study: the bias-corrected Akaike's Information Criterion (AICC) by Hurvich and Tsai (1989) and the Hannan and Quinn Information Criterion (HQIC) by Hannan and Quinn (1979). Our study is concerned with comparing the four information criteria in terms of their ability to identify the true Autoregressive model order with and without the help of the new approach.
The new approach uses a new sequence sampling technique and the Multiple Comparisons with the Best (MCB) procedure by Hsu (1984) as tools to help the four information criteria identify the right Autoregressive model order. The idea of the new approach can be justified and applied in a very general context, one which includes the selection of the true Autoregressive model order.
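As an illustration of how the four criteria relate to each other, the following minimal sketch computes them from a maximized log-likelihood. It assumes the usual textbook forms (AIC = -2 ln L + 2k, SBC = -2 ln L + k ln n, HQIC = -2 ln L + 2k ln ln n, AICC = AIC + 2k(k+1)/(n-k-1)); the exact definitions used by a given software package may differ, and the function name and arguments are hypothetical.

```python
import math


def information_criteria(loglik, n_obs, n_params):
    """Return AIC, SBC, HQIC and AICC for one fitted model.

    Assumed textbook forms, not taken from any particular package:
      loglik   -- maximized log-likelihood of the fitted model
      n_obs    -- number of observations used in the fit (n)
      n_params -- number of estimated parameters (k)
    """
    k, n = n_params, n_obs
    if n - k - 1 <= 0:
        raise ValueError("AICC requires n_obs > n_params + 1")
    aic = -2.0 * loglik + 2.0 * k
    sbc = -2.0 * loglik + k * math.log(n)
    hqic = -2.0 * loglik + 2.0 * k * math.log(math.log(n))
    # Small-sample correction added to AIC
    aicc = aic + 2.0 * k * (k + 1) / (n - k - 1)
    return {"AIC": aic, "SBC": sbc, "HQIC": hqic, "AICC": aicc}


# Example with made-up numbers: log-likelihood -50, n = 25, k = 2
crit = information_criteria(-50.0, 25, 2)
```

All four criteria share the same goodness-of-fit term and differ only in the penalty on k, which is why they can disagree on the selected order, particularly for small n.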
In the context of Autoregressive models, the algorithm for using the new sequence sampling technique in our new approach can be outlined as follows:
(1) Let O_1 denote the observed ordered vector of data, from which the data sequences are formed by the new sequence sampling technique.
(2) Fit every model in the class of candidate Autoregressive model orders, among which we would like to select the true model order, to the observed data (O_1), thereby obtaining AIC*, HQIC*, AICC* and SBC* for each candidate model order.
(3) Repeat step (2) for each data sequence.
Statisticians often use the collection of information criteria above to guide the selection of the true model order, for example by selecting the model with the smallest value of the information criterion (Pankratz, 1983). We follow the same rule in our approach, but with the advantage that each information criterion has (n) replicate values, which result from fitting the different sequences of the observed data (steps 1, 2 and 3).
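The fitting step and the smallest-value rule can be sketched for a single data sequence as follows. This is only an illustration under stated assumptions: the AR(p) fits use Yule-Walker estimates obtained by the Levinson-Durbin recursion rather than the estimation method of the SAS procedure, -2 ln L is approximated by n ln(σ̂²) up to an additive constant, k counts only the p autoregressive coefficients, and all function names are hypothetical.

```python
import math
import random


def innovation_variances(series, max_order):
    """Residual variance of AR(1..max_order) via Levinson-Durbin (Yule-Walker)."""
    n = len(series)
    mean = sum(series) / n
    x = [v - mean for v in series]
    # Biased sample autocovariances at lags 0..max_order
    acov = [sum(x[t] * x[t - lag] for t in range(lag, n)) / n
            for lag in range(max_order + 1)]
    variances, phi, sigma2 = {}, [], acov[0]
    for p in range(1, max_order + 1):
        # Reflection coefficient (last AR coefficient of the order-p fit)
        kappa = (acov[p] - sum(phi[j] * acov[p - 1 - j]
                               for j in range(p - 1))) / sigma2
        phi = [phi[j] - kappa * phi[p - 2 - j] for j in range(p - 1)] + [kappa]
        sigma2 *= 1.0 - kappa * kappa
        variances[p] = sigma2
    return variances


def criteria_table(series, max_order=6):
    """Criterion values for each candidate order (assumed textbook forms)."""
    n = len(series)
    table = {}
    for p, s2 in innovation_variances(series, max_order).items():
        base, k = n * math.log(s2), p  # -2 ln L up to an additive constant
        aic = base + 2.0 * k
        table[p] = {"AIC": aic,
                    "SBC": base + k * math.log(n),
                    "HQIC": base + 2.0 * k * math.log(math.log(n)),
                    "AICC": aic + 2.0 * k * (k + 1) / (n - k - 1)}
    return table


def pick_smallest(table):
    """For each criterion, return the candidate order with the smallest value."""
    names = ("AIC", "SBC", "HQIC", "AICC")
    return {c: min(table, key=lambda p: table[p][c]) for c in names}


# Usage on one simulated AR(1) sequence (coefficient 0.6 is illustrative only)
rng = random.Random(1)
series = [0.0]
for _ in range(149):
    series.append(0.6 * series[-1] + rng.gauss(0.0, 1.0))
series = series[50:]  # drop the start-up values, keeping 100 observations
table = criteria_table(series)
best = pick_smallest(table)
```

In the proposed approach this table would be recomputed for every data sequence, so that each candidate order accumulates a set of replicate values for each criterion.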
To make use of this advantage, we propose using the MCB procedure by Hsu (1984) to pick the winners (i.e., to select the best set of models, or a single model if possible), treating the replicates of an information criterion produced by each candidate model as a group.
The simulation study: A simulation study of time series model analysis with PROC ARMA was conducted to compare the four model selection criteria with and without the new approach in terms of the percentage of times that they identify the true model order.
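To make the role of the MCB step concrete, the sketch below computes constrained MCB intervals for a balanced one-way layout in which a smaller mean is better, and returns the candidate models that cannot be ruled out as the best. It is an assumption-laden illustration, not the SAS implementation used in the study: the critical value must be supplied by the user from tables or software for the one-sided Dunnett-type comparison, the function and argument names are hypothetical, and the usual MCB assumption of independent replicates is taken for granted here.

```python
import math


def mcb_best_subset(replicates, crit_value):
    """Constrained MCB intervals when the smallest mean is the best.

    replicates -- dict mapping model label to a list of criterion values
                  (all lists must have the same length)
    crit_value -- user-supplied critical value d (assumption: taken from
                  tables or software; it is not computed here)
    Returns (subset, intervals), where intervals[m] bounds
    mean_m - min(mean of the other models).
    """
    models = list(replicates)
    k = len(models)
    n = len(replicates[models[0]])
    if any(len(replicates[m]) != n for m in models):
        raise ValueError("this sketch handles balanced designs only")
    means = {m: sum(replicates[m]) / n for m in models}
    # Pooled within-group standard deviation with k * (n - 1) degrees of freedom
    sse = sum((v - means[m]) ** 2 for m in models for v in replicates[m])
    s = math.sqrt(sse / (k * (n - 1)))
    margin = crit_value * s * math.sqrt(2.0 / n)
    intervals = {}
    for m in models:
        best_other = min(means[o] for o in models if o != m)
        diff = means[m] - best_other
        intervals[m] = (min(0.0, diff - margin), max(0.0, diff + margin))
    # A model stays in the best subset if its lower bound is below zero
    subset = [m for m in models if intervals[m][0] < 0.0]
    return subset, intervals


# Made-up replicate values for three candidate orders; d = 2.0 is a placeholder,
# not a real critical value
demo = {1: [10.0, 10.2, 9.8], 2: [10.1, 10.3, 9.9], 3: [20.0, 20.2, 19.8]}
subset, intervals = mcb_best_subset(demo, crit_value=2.0)
```

With these illustrative numbers, orders 1 and 2 remain in the best subset because their means are close, while order 3 is excluded.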
Normal data were generated according to stationary Autoregressive models of first and second order. There were 24 data-generating scenarios, combining four parameter settings of the first order Autoregressive model and four parameter settings of the second order Autoregressive model with three sample sizes (n = 25, 50 and 100 observations). The parameter settings for both models are given in Table 1. We simulated 200 datasets for each scenario with sample size 25, 100 datasets for each scenario with sample size 50 and 50 datasets for each scenario with sample size 100. SAS code was written to generate the datasets according to the described setup using the SAS® 9.1.3 package (SAS Institute Inc, 2008). The algorithm of our approach was applied to each generated dataset with each of the six candidate models (AR(1), AR(2), AR(3), AR(4), AR(5) and AR(6)) and each of the four information criteria in order to compare their performance with and without the new approach. The objective of implementing the MCB procedure by Hsu (1984) in our new approach is the same as in the author's previous studies (AL-Marshadi, 2007; 2009; 2010A; 2010B).
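The data-generating setup can be sketched as follows. This is a Python illustration rather than the SAS code used in the study; the coefficient values are placeholders (the settings of Table 1 are not reproduced here), and the burn-in length, unit error variance and function names are assumptions.

```python
import random


def is_stationary(phi):
    """Stationarity conditions for AR(1) or AR(2) coefficients."""
    if len(phi) == 1:
        return abs(phi[0]) < 1.0
    if len(phi) == 2:
        p1, p2 = phi
        return (p1 + p2 < 1.0) and (p2 - p1 < 1.0) and (abs(p2) < 1.0)
    raise ValueError("this sketch handles AR(1) and AR(2) only")


def simulate_ar(phi, n_obs, rng, burn_in=200):
    """Generate n_obs values from a stationary AR process with N(0, 1) errors."""
    if not is_stationary(phi):
        raise ValueError("coefficients do not give a stationary process")
    p = len(phi)
    x = [0.0] * p  # start-up values, discarded with the burn-in
    for _ in range(burn_in + n_obs):
        x.append(sum(phi[j] * x[-1 - j] for j in range(p))
                 + rng.gauss(0.0, 1.0))
    return x[-n_obs:]


# Number of simulated datasets per sample size, as described in the text
N_DATASETS = {25: 200, 50: 100, 100: 50}

# Placeholder settings only: one AR(1) and one AR(2) coefficient vector
SETTINGS = [[0.5], [0.5, 0.3]]

rng = random.Random(2024)
datasets = {
    (tuple(phi), n): [simulate_ar(phi, n, rng) for _ in range(reps)]
    for phi in SETTINGS
    for n, reps in N_DATASETS.items()
}
```

Using fewer datasets for larger n keeps the total number of simulated observations per scenario constant at 5000.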

RESULTS
The simulation results indicated that, for all the information criteria, the new procedure included the right model order in the best subset selected from the class of candidate model orders one hundred percent of the time. Tables 2 to 5 summarize the percentage of times that the procedure selects the true model order alone from the six candidate model orders (AR(1), AR(2), AR(3), AR(4), AR(5) and AR(6)) for the four criteria with the new approach, together with the corresponding percentage without the new approach, when n = 25, 50 and 100. Table 2 uses the first parameter setting, Table 3 the second, Table 4 the third and Table 5 the fourth.
Table 6 summarizes the average percentage of times that the procedure selects the true model order alone from the six candidate model orders (AR(1), AR(2), AR(3), AR(4), AR(5) and AR(6)) for the four criteria with the new approach, together with the average percentage without the new approach, averaged over the four parameter settings when n = 25, 50 and 100. Tables 2-6 show that all four criteria perform better with the new approach than without it. Although the new approach performs very well overall with all the criteria in all cases, it was outstanding with the SBC, AICC and HQIC criteria.

CONCLUSION
In our simulation, we considered the Autoregressive process and examined the performance of the new approach for selecting a suitable Autoregressive model order in different cases. Overall, the new approach provided the best guide to selecting a suitable model order, and it showed outstanding performance with the SBC, AICC and HQIC criteria. Thus, the new approach can be recommended for use with any one of these three criteria. A note for users of the proposed approach: if the MCB procedure suggests a best subset containing more than one model, we recommend selecting the model with the smallest order as the true model, since examination of the simulation results showed that in this case the other models are overfitted, i.e., they contain the right order of the true model plus higher order terms. The main result of our article is that the three criteria SBC, AICC and HQIC are competitive in terms of their ability to identify the right model order with the help of the new proposed approach.