Bayesian Models for Time Series with Covariates, Trend, Seasonality, Autoregression and Outliers

Bayesian methods furnish an attractive approach to time series data analysis. This article proposes forecasting models that can detect trend, seasonality, autoregression and outliers in time series data related to some covariates. Cumulative Weibull distribution functions for trend, dummy variables for seasonality, binary selections for outliers and a latent autoregression for autocorrelated data are used for the analysis. Gibbs sampling, a Markov Chain Monte Carlo (MCMC) algorithm, is used for parameter estimation. The proposed models are applied to vegetable price time series data in Thailand. According to the RMSE, MAPE and MAE criteria for model comparison, the proposed models provide the best results compared to the exponential smoothing models, the SARIMA models and the Bayesian models with trend, autoregression and outliers.


INTRODUCTION
Time series data are observations obtained through repeated measurements over time. For example, measuring product prices each month of the year would produce time series data. Data collected irregularly or only once are not time series data. Time series data can be decomposed into three main components: trend, the long-term direction; seasonality, systematic and calendar-related movements; and irregularity, unsystematic short-term fluctuations. Other components, such as outliers and autoregression, can also be implicit in time series data. The presence of these components can easily mislead the time series analysis procedure and result in wrong conclusions.
A good time series forecast is essential in all fields, such as sciences, industry, agriculture, commerce and economics. The prediction of future events is a critical input into many types of planning and decision making process (Montgomery et al., 2008).
Several classical methods have been designed to handle those components. The Holt-Winters exponential smoothing method was first introduced more than half a century ago for trend and seasonal time series forecasting and it is still one of the most popular forecasting systems, widely used in many application areas (Szmit and Szmit, 2012). The autoregressive integrated moving average (ARIMA) model is usually used for time series data with trend and autoregression. The seasonal ARIMA, denoted SARIMA, is a generalization and extension of the regular ARIMA; it is used for time series where a pattern repeats seasonally over time (Machiwal and Jha, 2012). Besides those methods, Yelland (2010) proposed a Bayesian framework using binary selections to detect outliers, cumulative Weibull distributions to detect trends and a latent autoregression to model correlated time series data. Tongkhow and Kantanantha (2012) proposed Bayesian forecasting models by applying and adjusting the model of Yelland (2010) so that the distribution of outliers, the autoregression and some prior distributions were different. Those models were applied to vegetable prices in Thailand, which were also used in the earlier study of Tongkhow and Kantanantha (2011), where multiple regression, ARIMA, exponential smoothing, SARIMA and a Bayesian model were employed. This article proposes time series forecasting models using Bayesian methods to detect trend, seasonality, autoregression and outliers. We extend our previous models (Tongkhow and Kantanantha, 2012) by including covariates related to the time series data and seasonality. Gibbs sampling, one of the most popular Markov Chain Monte Carlo (MCMC) algorithms, is used for parameter estimation. The proposed models are then applied to the vegetable price time series data in Thailand used in our two previous studies (Tongkhow and Kantanantha, 2011; 2012).
The results are compared with the best models from the two previous studies, which were the exponential smoothing, SARIMA and Bayesian models with trend, autoregression and outliers. For model comparison, the assessment criteria Root Mean Squared Error (RMSE), Mean Absolute Percent Error (MAPE) and Mean Absolute Error (MAE) are employed.

Exponential Smoothing Models
A simple exponential smoothing model is used to reduce irregularities in time series data. It assigns exponentially decreasing weights as the observations get older; in other words, recent observations are given relatively more weight in forecasting than older observations (Wang, 2010). A double exponential smoothing model applies the simple exponential smoothing process twice to account for a linear trend in the data and a triple exponential smoothing model, or Holt-Winters model, can adjust for both trend and seasonality (Szmit and Szmit, 2012).
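As an illustration, a one-step simple exponential smoothing pass can be sketched in a few lines of Python; the smoothing constant and the price series below are hypothetical.

```python
def simple_exp_smoothing(series, alpha):
    """Return the smoothed values of a series.

    alpha in (0, 1]: larger values give recent observations more weight,
    so older observations receive exponentially decreasing weights.
    """
    level = series[0]                 # initialize with the first observation
    smoothed = [level]
    for y in series[1:]:
        level = alpha * y + (1 - alpha) * level
        smoothed.append(level)
    return smoothed

prices = [10.0, 12.0, 11.0, 13.0, 12.5]   # hypothetical monthly prices
print(simple_exp_smoothing(prices, alpha=0.5))
```

With alpha = 0.5 each new observation and the previous smoothed level contribute equally; as alpha approaches 1 the forecast tracks the most recent observation almost exactly.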

SARIMA or Seasonal ARIMA Model
ARIMA (p, d, q) models (Box et al., 1994) take historical data into account and decompose them into an autoregressive (AR) process, an integrated (I) process and a moving average (MA) process of the forecast errors. The ARIMA models therefore have three model parameters: one for the AR(p) process, one for the I(d) process and one for the MA(q) process. The seasonal ARIMA model (Machiwal and Jha, 2012), denoted SARIMA, is a generalization and extension of the ordinary ARIMA model that allows seasonality in the data. The seasonal component is denoted by capital letters, SARIMA (p, d, q)(P, D, Q)s, where the second bracket indicates the seasonal parameters for the orders of the autoregressive, integration and moving average parts of the model and the first bracket indicates the non-seasonal parameters.
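The differencing implied by the d and D orders can be illustrated directly; the quarterly series below is a hypothetical example with an additive seasonal pattern and a linear trend.

```python
def difference(series, lag=1):
    """Apply one differencing pass at the given lag: lag=1 for the
    non-seasonal I(d) part, lag=s for the seasonal D part."""
    return [series[i] - series[i - lag] for i in range(lag, len(series))]

# Hypothetical quarterly series (s = 4) with seasonality plus trend.
# SARIMA(p,1,q)(P,1,Q)4 would difference seasonally, then regularly.
y = [10, 20, 30, 40, 14, 24, 34, 44, 18, 28, 38, 48]
deseason = difference(y, lag=4)    # seasonal differencing, D = 1
detrend = difference(deseason, 1)  # regular differencing, d = 1
print(detrend)                     # the remaining series is stationary
```

After both passes the toy series is constant at zero, which is exactly why differencing at the seasonal lag and then at lag one removes this kind of pattern before the AR and MA parts are fitted.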

Bayesian Methods
A hierarchical Bayesian model (Congdon, 2010) is formulated as Equation 1:

p(θ|D) = p(D|θ)p(θ)/p(D) (1)

where p(D|θ) is the likelihood; p(θ|D) is the posterior distribution, the marginal probability density of the parameter vector θ given the data D; p(θ) is the prior distribution of θ, which summarizes any prior or alternative knowledge about the distribution of θ; and p(D) is the marginal distribution of the data D. MCMC algorithms are used for parameter estimation. MCMC methods provide a way to sample from p(θ|D) without necessarily knowing its analytic form; the final result is a set of vectors θ with density p(θ|D), from which the model parameters can be estimated. Gibbs sampling is a common MCMC algorithm that can be used for this estimation. The most common hierarchical Bayesian model has three stages: a distribution for the data given the parameters is specified at the first stage, prior distributions for the parameters given hyper-parameters at the second stage and distributions for the hyper-parameters at the third stage. Complicated models can be built through the specification of several simple stages under hierarchical Bayesian models.
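A minimal Gibbs sampler can be sketched for the conjugate normal model with unknown mean and variance; the data, hyper-parameters and iteration counts below are all hypothetical, chosen only to mirror the burn-in scheme described later in the article.

```python
import random
import statistics

random.seed(1)

# Hypothetical data: 50 observations from N(5, 1).
data = [random.gauss(5.0, 1.0) for _ in range(50)]
n = len(data)

# Hypothetical priors: mu ~ N(0, 100), sigma2 ~ Inverse-Gamma(2, 2).
m0, v0 = 0.0, 100.0
a0, b0 = 2.0, 2.0

mu, sigma2 = 0.0, 1.0
draws = []
for it in range(2000):
    # Full conditional for mu given sigma2 is normal (conjugate update).
    prec = 1.0 / v0 + n / sigma2
    mean = (m0 / v0 + sum(data) / sigma2) / prec
    mu = random.gauss(mean, (1.0 / prec) ** 0.5)
    # Full conditional for sigma2 given mu is inverse gamma.
    a = a0 + n / 2.0
    b = b0 + 0.5 * sum((y - mu) ** 2 for y in data)
    sigma2 = 1.0 / random.gammavariate(a, 1.0 / b)
    if it >= 500:                 # discard burn-in draws
        draws.append(mu)

print(round(statistics.mean(draws), 2))
```

Because each parameter is drawn from its full conditional given the current value of the other, the retained draws approximate the joint posterior; with a vague prior the posterior mean of mu settles near the sample mean.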

Trend: A Cumulative Weibull Distribution
A cumulative Weibull distribution for the trend of new product demand was used by Yelland (2010). It can also be applied to product prices, since the price varies directly with the demand. The cumulative Weibull distribution is defined as Equation 2:

W(t|a,δ) = 1 − exp[−(t/δ)^a], t ≥ 0 (2)

where a is the shape parameter and δ is the scale parameter.
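The standard cumulative Weibull form above can be computed directly; the shape and scale values below are hypothetical, chosen to show the S-shaped rise toward 1 that makes the function suitable as a saturating trend.

```python
import math

def weibull_cdf(t, a, delta):
    """Cumulative Weibull distribution W(t | a, delta):
    a is the shape parameter, delta is the scale parameter."""
    return 1.0 - math.exp(-((t / delta) ** a))

# The curve rises from 0 toward 1; for a > 1 it is S-shaped.
print([round(weibull_cdf(t, a=2.0, delta=12.0), 3) for t in (0, 6, 12, 24)])
```

Scaling this curve by a level parameter, as the proposed model does with γ, turns the unit-range CDF into a trend component in the units of the series.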

Outliers: A Binary Selection
Yelland (2010) adopted a latent binary selection to detect outliers, in which the observation Y_ij is associated with a latent binary variable ζ_ij ∈ {0, 1} that identifies Y_ij as an outlier if ζ_ij = 1. The prior distribution for the indicator ζ_ij is a Bernoulli distribution such that the probability that ζ_ij = 1 is about 5%, since the occurrence of outliers is a rare event.
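Drawing from this Bernoulli prior is a one-liner; the series length and seed below are arbitrary and only illustrate that roughly 5% of observations are flagged a priori.

```python
import random

random.seed(7)

P_OUTLIER = 0.05   # prior probability that the indicator equals 1

# Latent binary indicators for a hypothetical series of 1000 points.
zeta = [1 if random.random() < P_OUTLIER else 0 for _ in range(1000)]
print(sum(zeta))   # the count of points flagged as outliers a priori
```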

Latent Autoregression: A(n)
The A(n) (Yelland, 2010) is defined as Equation 3:

A_t = a_1 A_{t−1} + a_2 A_{t−2} + … + a_n A_{t−n} + ε_t (3)

where a_i are the latent autoregression coefficients, A_t is the residual variation and n is the order (length) of the latent autoregression. The error term ε_t is assumed to follow a normal distribution.
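The recursion can be sketched by simulating the latent process forward; the AR(2) coefficients and noise level below are hypothetical.

```python
import random

random.seed(3)

def simulate_ar(coeffs, n_steps, noise_sd=1.0):
    """Simulate A_t = sum_i a_i * A_{t-i} + eps_t, eps_t ~ N(0, noise_sd^2),
    starting from zero initial conditions."""
    order = len(coeffs)
    a = [0.0] * order
    for _ in range(n_steps):
        eps = random.gauss(0.0, noise_sd)
        # a[-1] is A_{t-1}, a[-2] is A_{t-2}, and so on.
        a.append(sum(c * a[-i - 1] for i, c in enumerate(coeffs)) + eps)
    return a[order:]

series = simulate_ar([0.6, 0.2], n_steps=200)   # hypothetical AR(2)
print(len(series))
```

Coefficients summing to less than 1 in magnitude, as here, keep the latent process stationary, which is what lets it absorb residual autocorrelation rather than the trend.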

Seasonality: Dummy Variables
Dummy variables (Wooldridge, 2009) are used to represent seasonality. The regression model with seasonality has the form of Equation 4:

Y_t = ω_0 + ω_1 S_1t + ω_2 S_2t + … + ω_{s−1} S_{s−1,t} + ε_t (4)

where Y_t is the time series observation and S_it is the dummy seasonal variable, with s = 4 for quarterly and s = 12 for monthly data. S_it is 1 in its corresponding quarter or month and 0 otherwise, ω_i is the regression coefficient and ε_t is the random error.
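Constructing the dummies for a given time index is straightforward; the sketch below assumes the common convention of dropping the last season as the baseline, which matches the s−1 coefficients ω_1,…,ω_{s−1} used in the proposed model.

```python
def seasonal_dummies(t, s):
    """Return the s-1 dummy variables S_1t, ..., S_{s-1,t} for time t.

    The last season is the baseline: all dummies are 0 there, which
    avoids perfect collinearity with the intercept.
    """
    season = (t - 1) % s + 1          # map t = 1, 2, ... onto seasons 1..s
    return [1 if season == i else 0 for i in range(1, s)]

# Monthly data (s = 12): t = 1 is January, t = 13 is January again.
print(seasonal_dummies(1, 12))
print(seasonal_dummies(13, 12))
print(seasonal_dummies(12, 12))       # December: the baseline season
```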

The Proposed Bayesian Model
Let Y_t be the time series observation at time t, t = 1,…,n. Y_t is assumed to be normally distributed, with a mean that can detect trend, autoregression and seasonality and account for some covariates and a variance that can detect outliers. The proposed model is defined as Equation 5:

Y_t ~ N(μ_t, σ_t²) (5)

The mean of Y_t is Equation 6:

μ_t = γW(t|a,δ) + A_t + β_0 + β_1 X_1t + … + β_p X_pt + ω_1 S_1t + … + ω_{s−1} S_{s−1,t} (6)

The variance of Y_t (Equation 7) inflates the common variance σ_y² when the outlier indicator ζ_t equals 1, so that outlying observations receive a larger variance. Here γ is the expectation of Z, the sum of the time series data within the study period; W(t|a,δ) is a cumulative Weibull distribution; A_t is the latent autoregression at time t; ζ_t is the outlier indicator at time t; β_0 is an intercept and β_1,…,β_p are regression coefficients corresponding to the covariates X_1t,…,X_pt at time t; ω_1, ω_2,…,ω_{s−1} are regression coefficients corresponding to the seasonal dummy variables S_1t, S_2t,…,S_{s−1,t} at time t; and σ_y² is the common variance of Y_t. The prior distributions for the Bayesian methods are assigned to each parameter as follows:
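Putting the pieces together, the mean in Equation 6 is simply a sum of the component terms; every numeric value in the sketch below is hypothetical.

```python
import math

def weibull_cdf(t, a, delta):
    """Cumulative Weibull trend term W(t | a, delta)."""
    return 1.0 - math.exp(-((t / delta) ** a))

def mean_y(t, gamma, a, delta, A_t, beta0, betas, x_t, omegas, s_t):
    """mu_t = gamma*W(t|a,delta) + A_t + beta0
            + sum_j beta_j*X_jt + sum_i omega_i*S_it   (Equation 6)."""
    trend = gamma * weibull_cdf(t, a, delta)
    covariates = sum(b * x for b, x in zip(betas, x_t))
    seasonal = sum(w * s for w, s in zip(omegas, s_t))
    return trend + A_t + beta0 + covariates + seasonal

# Hypothetical values: one covariate, quarterly data (3 dummy variables,
# second quarter active).
mu = mean_y(t=6, gamma=100.0, a=2.0, delta=12.0, A_t=0.5,
            beta0=2.0, betas=[1.5], x_t=[4.0],
            omegas=[0.3, -0.2, 0.1], s_t=[0, 1, 0])
print(round(mu, 3))
```

Each summand maps one-to-one onto a component named in the text: the scaled Weibull curve carries the trend, A_t the autocorrelation, the β terms the covariates and the ω terms the seasonality.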

Application
In Thailand, vegetables are high-value economic crops that help improve farmers' income. Vegetable prices play a major role in coordinating the supply and demand of these products. Hence, forecasts of vegetable prices will be advantageous to producers, consumers, processors, rural development planners and others involved in the vegetable market.

Data
The data were extracted from the database of the Office of Agricultural Economics, Ministry of Agriculture and Cooperatives, Thailand (OAE, 2012). The monthly average consumer prices for coriander, green shallot and celery from 2000 to 2011 (144 months) are used in this study, since those vegetables are common and their prices usually fluctuate. Missing observations are filled in with simple three-month moving averages of the preceding observations.
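The three-month moving-average fill can be sketched as follows; the price series is hypothetical and the sketch assumes the series begins with observed values.

```python
def fill_missing(series):
    """Replace each None with the mean of the three preceding
    (already-filled) values, as described for the price data."""
    filled = []
    for y in series:
        if y is None:
            window = filled[-3:]          # up to three preceding values
            y = sum(window) / len(window)
        filled.append(y)
    return filled

prices = [10.0, 12.0, 14.0, None, 16.0]   # hypothetical series with a gap
print(fill_missing(prices))
```

Because filled values feed into later windows, a run of consecutive gaps is imputed left to right using the values already filled in.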

Data Analysis
The proposed Bayesian models were applied to the prices of coriander, green shallot and celery. The Gibbs sampling algorithm was used for parameter estimation via the OpenBUGS program. The Gibbs sampler was run for 10,000 iterations; the first 1,000 iterations (the burn-in) were discarded and the rest were used to compute the posterior means and standard errors. For the model evaluation, simulations were done in the R program. SPSS for Windows was used to estimate the parameters of the exponential smoothing and SARIMA models. The RMSE, MAPE and MAE are the criteria for model comparison.
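The three comparison criteria have simple closed forms; the actual and predicted values below are hypothetical.

```python
def rmse(actual, pred):
    """Root Mean Squared Error."""
    return (sum((a - p) ** 2 for a, p in zip(actual, pred)) / len(actual)) ** 0.5

def mae(actual, pred):
    """Mean Absolute Error."""
    return sum(abs(a - p) for a, p in zip(actual, pred)) / len(actual)

def mape(actual, pred):
    """Mean Absolute Percent Error (actual values must be nonzero)."""
    return 100.0 * sum(abs((a - p) / a) for a, p in zip(actual, pred)) / len(actual)

actual = [10.0, 20.0, 30.0]     # hypothetical actual and predicted prices
pred = [12.0, 18.0, 33.0]
print(rmse(actual, pred), mae(actual, pred), mape(actual, pred))
```

RMSE penalizes large errors more heavily than MAE, while MAPE expresses error relative to the level of the series, which is useful when comparing vegetables whose price scales differ.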

Model Evaluation
Using the parameters obtained from the analysis of each vegetable's price data, 500 samples of time series data (Y_t) were generated to evaluate the proposed model.

RESULTS
The proposed models perform better than the exponential smoothing models, the SARIMA models and the Bayesian models with trend, autoregression and outliers for all vegetables, since all error measurements of the proposed models are the smallest. The graphs of the actual values and the values predicted by the proposed model for each vegetable are shown in Fig. 1-3. The error measurements, RMSE, MAPE and MAE, are shown in Table 4.
It is evident that the predicted values from the proposed model are very close to the actual ones. The proposed Bayesian models with covariates, which can detect trend, seasonality, autoregression and outliers, are appropriate for all vegetables because they account for all of the main components that usually occur in time series data. In contrast, the exponential smoothing models, the SARIMA models and the Bayesian models from our previous study (Tongkhow and Kantanantha, 2012) that can detect only trend, autoregression and outliers account for only some of those components, so they are inferior compared to the proposed models. Conjugate inverse gamma priors are assigned to the variance components of the normal distributions; therefore, the full conditional distributions of those variances are inverse gamma distributions. The conjugate inverse gamma prior is one of the prior distributions suggested for Bayesian analysis (Jackman, 2009).

CONCLUSION
As this study has shown, the proposed Bayesian models with covariates, which detect trend using cumulative Weibull distribution functions, seasonality using dummy variables, autoregression using latent autoregressions and outliers using binary selections, are appropriate for all vegetables. The proposed models can be applied to other vegetables or products whose data contain such components. This would be valuable to anyone who would like to forecast from those data.