Forecasting the Stock Exchange of Thailand using Day of the Week Effect and Markov Regime Switching GARCH

Problem statement: We forecast the return and volatility of the Stock Exchange of Thailand (SET) Index. Approach: We model the SET Index returns using a mean equation that combines the day of the week effect with an autoregressive moving-average (ARMA) process. We then forecast the volatility of the SET Index using GARCH-type models and the Markov Regime Switching GARCH (MRS-GARCH) model. Results: When the SET Index returns are modeled by an ARMA (3, 3) process, Friday is the only day with a significant day of the week effect. The empirical analysis demonstrates that the MRS-GARCH models outperform all GARCH-type models in forecasting volatility at long horizons (two weeks and a month). Conclusion: The SET Index return follows an ARMA (3, 3) process with a Friday effect, and the MRS-GARCH models outperform at long horizons.


INTRODUCTION
In time series analysis, a stock price series is usually transformed into a return series, which is stationary and resembles white noise, so that it can be forecast using a mean equation. The forecasting of daily returns has led to additional research in the financial literature, specifically extending the analysis of seasonal behavior to include the day of the week effect. This seasonality has been the subject of several studies that found empirical evidence of abnormal yield distributions depending on the day of the week. The approach used in pioneering work on this seasonality, seen specifically in Miralles and Quiros (2000), includes five dummy variables, one for each day of the week.
Nevertheless, two serious problems arise with this approach. The first is that the residuals obtained from the regression model can be autocorrelated, which creates errors in inference. The second is that the variance of the residuals is not constant and is possibly time-dependent.
The first problem can be addressed by introducing the returns with a one week delay into the regression model, as in the works by Easton and Faff (1994) and Kyimaz and Berument (2001).
Apolinario et al. (2006) and Ulussever et al. (2011) address the second problem by modeling the residuals with an ARCH model in order to account for the time-varying variance of the residuals.
In this study, we reconsider both problems. For the first, we model the SET Index returns by a mean equation with the day of the week effect and an autoregressive moving-average process of orders p and q (ARMA (p, q)). For the second, we model the residuals by the GARCH, EGARCH, GJR-GARCH and MRS-GARCH models. Finally, we compare their forecasting performance at horizons of one day, one week, two weeks and one month.
Next, we present the mean equation used to forecast returns and the models used to forecast the volatility of returns. We then report parameter estimates and in-sample evaluation results. Finally, statistical loss functions are described and the out-of-sample forecasting performance of the various models is discussed.

MATERIALS AND METHODS
Forecasting financial returns: Let $\{P_t\}$ denote the series of the financial price at time $t$ and let the returns $\{r_t\}_{t>0}$ be a sequence of random variables on a probability space $(\Omega, \mathcal{F}, P)$. The sample consists of an estimation (or in-sample) period with $R$ daily closing observations ($t = -R+1, \ldots, 0$) and an evaluation (or out-of-sample) period with $n$ observations ($t = 1, \ldots, n$). Let $r_t$ be the logarithmic return (in percent) on the financial price at time $t$, i.e., Eq. 1:

$$r_t = 100 \cdot \ln\left(\frac{P_t}{P_{t-1}}\right) \tag{1}$$

To put the volatility models in proper perspective, it is informative to consider the conditional mean and variance of $r_t$ given $F_{t-1}$, that is, Eq. 2:

$$\mu_t = E(r_t \mid F_{t-1}), \qquad h_t = \mathrm{Var}(r_t \mid F_{t-1}) = E\left[(r_t - \mu_t)^2 \mid F_{t-1}\right] \tag{2}$$

where $F_{t-1}$ refers to the information available up to time $t-1$. Typically, $F_{t-1}$ consists of all linear functions of the past returns. Therefore, the equation for $\mu_t$ in Eq. 2 should be simple, and we assume that $r_t$ follows a simple time series model such as a stationary ARMA (p, q) model which includes five dummy variables, one for each day of the week, such that Eq. 3:

$$r_t = \mu_t + \varepsilon_t, \qquad \mu_t = \sum_{j=1}^{5} \beta_j D_{jt} + \sum_{i=1}^{p} \phi_i r_{t-i} + \sum_{i=1}^{q} \theta_i \varepsilon_{t-i} \tag{3}$$

where $D_{jt}$, $j = 1, \ldots, 5$, are dummy variables which take the value 1 if the return on day $t$ falls on a Monday, Tuesday, Wednesday, Thursday or Friday, respectively, and 0 otherwise. The coefficients $\beta_j$, $j = 1, \ldots, 5$, represent the average return for each day of the week, and $\phi_i$, $i = 1, \ldots, p$, and $\theta_i$, $i = 1, \ldots, q$, are the coefficients of the ARMA (p, q) process.
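As an illustration of the return definition in Eq. 1 and of a mean equation with weekday dummies and ARMA terms, the following Python sketch may help. It is a minimal sketch under stated assumptions, not the estimation code used in this study: the function names, prices, dates and coefficient values are hypothetical, the MA terms are assumed to enter with a positive sign, and pre-sample returns and residuals are set to zero.

```python
import math
from datetime import date, timedelta


def log_returns(prices):
    """Percentage log returns r_t = 100 * ln(P_t / P_{t-1})."""
    return [100.0 * math.log(p1 / p0) for p0, p1 in zip(prices, prices[1:])]


def weekday_dummies(dates):
    """One row per date: [D_1t, ..., D_5t] for Monday through Friday."""
    rows = []
    for d in dates:
        wd = d.weekday()  # Monday == 0, ..., Friday == 4
        if wd > 4:
            raise ValueError("weekend dates are not trading days")
        rows.append([1 if j == wd else 0 for j in range(5)])
    return rows


def conditional_mean(returns, dummies, beta, phi, theta):
    """Evaluate mu_t and eps_t = r_t - mu_t recursively.

    Assumption: returns and residuals before the sample start are zero.
    """
    mu, eps = [], []
    for t, r in enumerate(returns):
        m = sum(b * d for b, d in zip(beta, dummies[t]))
        m += sum(phi[i] * returns[t - 1 - i]
                 for i in range(len(phi)) if t - 1 - i >= 0)
        m += sum(theta[i] * eps[t - 1 - i]
                 for i in range(len(theta)) if t - 1 - i >= 0)
        mu.append(m)
        eps.append(r - m)
    return mu, eps


# Hypothetical closing prices: one prior close plus one trading week
# (Monday 8 January 2007 to Friday 12 January 2007).
prices = [100.0, 101.0, 100.5, 102.0, 101.0, 103.0]
dates = [date(2007, 1, 8) + timedelta(days=k) for k in range(5)]

r = log_returns(prices)
D = weekday_dummies(dates)
# Hypothetical coefficients: a Friday effect only, ARMA(1, 1) for brevity.
mu, eps = conditional_mean(r, D, beta=[0.0, 0.0, 0.0, 0.0, 0.1],
                           phi=[0.2], theta=[0.1])
```

In practice the coefficients would be estimated rather than fixed by hand; the sketch only shows how the dummies and the lagged terms combine into the conditional mean.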

Forecasting financial volatility:
We allow the variance of the errors to be time dependent, introducing conditional heteroskedasticity that captures the time variation of the variance of stock returns in Eq. 3. The GARCH-type models under consideration are GARCH (1, 1), EGARCH (1, 1), GJR-GARCH (1, 1) and MRS-GARCH. For notational convenience, we present some basic definitions of these models.
The GARCH (1, 1) model for the return series $r_t$ in Eq. 3 can be written as Eq. 4:

$$\varepsilon_t = \eta_t \sqrt{h_t}, \qquad h_t = a_0 + a_1 \varepsilon_{t-1}^2 + \beta_1 h_{t-1} \tag{4}$$

where $a_0 > 0$, $a_1 \geq 0$ and $\beta_1 \geq 0$ are real constants, which ensures that $h_t \geq 0$. We assume that $\eta_t$ is an i.i.d. process with zero mean and unit variance. The parameters of the GARCH model are generally treated as constants. However, the movement of financial returns between recession and expansion may cause the volatility process itself to change. Gray (1996) extended the GARCH model to the MRS-GARCH model in order to capture regime changes in volatility with unobservable state variables. These unobservable state variables are assumed to follow a first-order Markov chain.
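The GARCH (1, 1) variance recursion can be sketched in a few lines of Python. This is an illustrative sketch only: the function name, the parameter values and the choice of starting the recursion at the unconditional variance are assumptions, not details taken from this study.

```python
def garch11_variance(eps, a0, a1, b1, h0=None):
    """Conditional variances from h_t = a0 + a1 * eps_{t-1}^2 + b1 * h_{t-1}.

    Returns a list of length len(eps) + 1: h[0] is the starting value and
    h[len(eps)] is the one-step-ahead forecast after the last residual.
    """
    if a0 <= 0 or a1 < 0 or b1 < 0:
        raise ValueError("need a0 > 0, a1 >= 0 and b1 >= 0")
    if h0 is None:
        # Assumption: start at the unconditional variance a0 / (1 - a1 - b1),
        # which only exists when a1 + b1 < 1.
        if a1 + b1 >= 1:
            raise ValueError("unconditional variance undefined for a1 + b1 >= 1")
        h0 = a0 / (1.0 - a1 - b1)
    h = [h0]
    for e in eps:
        h.append(a0 + a1 * e * e + b1 * h[-1])
    return h


# Hypothetical residuals and parameters.
h = garch11_variance([1.0, -2.0], a0=0.1, a1=0.1, b1=0.8, h0=1.0)
```

With these hypothetical inputs the large second residual raises the next conditional variance, which is the clustering behavior the model is designed to capture.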
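In the MRS-GARCH model with two regimes, the variance of the residual term is not constant through time. The residual is distributed as $\varepsilon_t \sim (0, h_{t,S_t})$ and defined by Eq. 5:

$$\varepsilon_t = \eta_t \sqrt{h_{t,S_t}}, \qquad h_{t,S_t} = a_{0,S_t} + a_{1,S_t} \varepsilon_{t-1}^2 + \beta_{1,S_t} h_{t-1} \tag{5}$$

where $\eta_t$ is an i.i.d. process with zero mean and unit variance, $S_t = 1$ or $2$, and $h_{t,S_t}$ is the volatility under regime $S_t$ given $F_{t-1}$. The regime variable $S_t$ follows a first-order Markov chain with transition probability matrix $P$. Also, $\mu_t$ and $h_{t,S_t}$ are measurable functions of $F_{t-\tau}$ for $\tau \leq t$. In order to ensure the positivity of the conditional variance, we impose the restrictions $a_{0,S_t} > 0$, $a_{1,S_t} \geq 0$ and $\beta_{1,S_t} \geq 0$.
In the MRS-GARCH model with two regimes, Klaassen (2002) forecasts volatility k steps ahead, using the same recursive method as in the standard GARCH model, for k = 1, 2, …, n. In order to compute the k-step-ahead volatility forecasts, we first compute a weighted average of the k-step-ahead volatility forecasts in each regime, where the weights are the prediction probabilities $\Pr(S_{t+\tau} = i \mid F_{t-1})$.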
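The probability-weighted averaging described above can be sketched as follows. This is a hedged illustration rather than the procedure used in this study: the per-regime variance forecasts are taken as given inputs, the function names and numbers are hypothetical, and the transition matrix is assumed to be row-stochastic, with `P[j][i]` denoting the probability of moving from regime j to regime i.

```python
def step_probs(probs, P):
    """Propagate regime probabilities one period ahead.

    Assumption: P[j][i] = Pr(S_t = i | S_{t-1} = j), so each row sums to one.
    """
    n = len(probs)
    return [sum(P[j][i] * probs[j] for j in range(n)) for i in range(n)]


def aggregate_forecast(filtered, P, regime_forecasts):
    """Sum over tau = 1..k of the probability-weighted regime forecasts.

    filtered         -- Pr(S_{t-1} = i | F_{t-1}) for each regime i
    regime_forecasts -- regime_forecasts[tau - 1][i] is the tau-step-ahead
                        variance forecast in regime i (given, not computed here)
    """
    probs = step_probs(filtered, P)       # Pr(S_t = i | F_{t-1})
    total = 0.0
    for h_tau in regime_forecasts:
        probs = step_probs(probs, P)      # Pr(S_{t+tau} = i | F_{t-1})
        total += sum(p * h for p, h in zip(probs, h_tau))
    return total


# Hypothetical two-regime example: low-variance regime 1, high-variance regime 2.
P = [[0.9, 0.1],
     [0.2, 0.8]]
one_step = aggregate_forecast([1.0, 0.0], P, [[1.0, 4.0]])
three_step = aggregate_forecast([1.0, 0.0], P, [[1.0, 1.0]] * 3)
```

The second call is a sanity check: when both regimes forecast a variance of one at every step, the weights sum to one and the aggregated k-step forecast equals k.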
Since there is no serial correlation in the returns, the k-step-ahead volatility forecast made at time t depends only on the information available at time t-1. Let $\hat{h}_{t,t+k}$ denote the time t aggregated volatility forecast for the next k steps. It can be calculated as follows, Eq. 6:

$$\hat{h}_{t,t+k} = \sum_{\tau=1}^{k} \hat{h}_{t,t+\tau} = \sum_{\tau=1}^{k} \sum_{i=1}^{2} \Pr(S_{t+\tau} = i \mid F_{t-1})\, \hat{h}^{(i)}_{t,t+\tau} \tag{6}$$

where $\hat{h}^{(i)}_{t,t+\tau}$ indicates the τ-step-ahead volatility forecast in regime i made at time t, which can be calculated recursively as follows, Eq. 7:

$$\hat{h}^{(i)}_{t,t+\tau} = a_{0,i} + (a_{1,i} + \beta_{1,i})\, E_{t-1}\left[h_{t,t+\tau-1} \mid S_{t+\tau} = i\right] \tag{7}$$

In general, the prediction probability in Eq. 6 is obtained by propagating the probabilities $\Pr(S_{t-1} = i \mid F_{t-1})$, calculated in Eq. 12, forward through the transition probability matrix P of the Markov chain. Lastly, the expectation $E_{t-1}[h_{t,t+\tau-1} \mid S_{t+\tau} = i]$ appearing in Eq. 7 is computed in Eq. 8, whose right-hand side consists of two terms. The first term is calculated in Eq. 9 and 10 and the second term in Eq. 11. Substituting Eq. 9 and 11 into Eq. 8 gives the required expectation.
We are now ready to compute the regime probabilities $p_{it} = \Pr(S_t = i \mid F_{t-1})$ for i = 1, 2 that appear in Eq. 10. To do so, we denote $f_{1t} = f(r_t \mid S_t = 1, F_{t-1})$ and $f_{2t} = f(r_t \mid S_t = 2, F_{t-1})$. The conditional distribution of the return series $r_t$ then becomes a mixture-of-distributions model in which the mixing variable is the regime probability $p_{it}$. That is:

$$r_t \mid F_{t-1} \sim \begin{cases} f_{1t} & \text{with probability } p_{1t} \\ f_{2t} & \text{with probability } p_{2t} = 1 - p_{1t} \end{cases}$$

We compute the regime probabilities recursively in the following two steps (Kim and Nelson, 1999).
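Step 1: given $\Pr(S_{t-1} = j \mid F_{t-1})$ at the end of time t-1, the regime probability $p_{it} = \Pr(S_t = i \mid F_{t-1})$ is computed as:

$$p_{it} = \sum_{j=1}^{2} \Pr(S_t = i \mid S_{t-1} = j, F_{t-1}) \Pr(S_{t-1} = j \mid F_{t-1})$$

Since the current regime ($S_t$) depends only on the regime one period ago ($S_{t-1}$):

$$p_{it} = \sum_{j=1}^{2} \Pr(S_t = i \mid S_{t-1} = j) \Pr(S_{t-1} = j \mid F_{t-1})$$

Step 2: once $r_t$ is observed at the end of time t, we update the probability as follows. Let $f(r_t, S_t = i \mid F_{t-1})$ be the joint density of the return and the unobserved state for i = 1, 2. It can be written as:

$$f(r_t, S_t = i \mid F_{t-1}) = f(r_t \mid S_t = i, F_{t-1}) \Pr(S_t = i \mid F_{t-1})$$

The marginal density of the return, $f(r_t \mid F_{t-1})$, is then constructed as:

$$f(r_t \mid F_{t-1}) = \sum_{i=1}^{2} f(r_t, S_t = i \mid F_{t-1})$$

Using Bayes' rule, Eq. 12:

$$\Pr(S_t = i \mid F_t) = \frac{f(r_t, S_t = i \mid F_{t-1})}{f(r_t \mid F_{t-1})} \tag{12}$$

All regime probabilities ($p_{it}$) can then be computed by iterating these two steps. To start the iteration, however, the initial values $\Pr(S_0 = i \mid F_0)$ for i = 1, 2 are needed. We follow the technique of Hamilton (1989; 1990) and set them to the unconditional probabilities of the Markov chain:

$$\Pr(S_0 = 1 \mid F_0) = \frac{1 - p_{22}}{2 - p_{11} - p_{22}}, \qquad \Pr(S_0 = 2 \mid F_0) = \frac{1 - p_{11}}{2 - p_{11} - p_{22}}$$

where $p_{ii} = \Pr(S_t = i \mid S_{t-1} = i)$. Given initial values for the regime probabilities, the conditional mean and the conditional variance in each regime, the parameters of the MRS-GARCH model can be obtained by numerically maximizing the log-likelihood function (Marcucci, 2005). The log-likelihood function is constructed recursively, in a similar way to that of the GARCH model.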
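The two-step recursion for the regime probabilities can be sketched in Python as follows. This is a simplified illustration, not the estimation routine of this study: each regime is assumed to be normal with a constant mean and variance (no GARCH dynamics), the function names and parameter values are hypothetical, and the recursion is started from the stationary probabilities of the chain.

```python
import math


def normal_pdf(x, mean, var):
    """Density of a normal distribution with the given mean and variance."""
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)


def ergodic_probs(p11, p22):
    """Stationary probabilities of a two-state chain with staying
    probabilities p11 and p22."""
    denom = 2.0 - p11 - p22
    return [(1.0 - p22) / denom, (1.0 - p11) / denom]


def hamilton_filter(returns, p11, p22, means, variances):
    """Return predicted probabilities Pr(S_t = i | F_{t-1}) for every t,
    the last filtered probabilities and the log-likelihood.

    Simplifying assumption: regime variances are constant over time.
    """
    # P[j][i] = Pr(S_t = i | S_{t-1} = j)
    P = [[p11, 1.0 - p11], [1.0 - p22, p22]]
    filtered = ergodic_probs(p11, p22)  # starting values Pr(S_0 = i | F_0)
    predicted, loglik = [], 0.0
    for r in returns:
        # Step 1: predict the regime probabilities from the previous period.
        pred = [sum(P[j][i] * filtered[j] for j in range(2)) for i in range(2)]
        # Step 2: update with the new return via Bayes' rule.
        joint = [normal_pdf(r, means[i], variances[i]) * pred[i] for i in range(2)]
        marginal = sum(joint)
        filtered = [x / marginal for x in joint]
        predicted.append(pred)
        loglik += math.log(marginal)
    return predicted, filtered, loglik


# Hypothetical example: a 5% return is far more plausible in the
# high-variance regime, so its filtered probability should rise.
pi0 = ergodic_probs(0.9, 0.8)
pred, filt, ll = hamilton_filter([5.0], 0.9, 0.8, [0.0, 0.0], [1.0, 9.0])
```

The accumulated `loglik` is the quantity a numerical optimizer would maximize over the model parameters.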

RESULTS
The data set consists of the daily closing prices of the SET Index, $P_t$, over the period 3/01/2007 through 30/03/2011 (1,038 observations), obtained from the Stock Exchange of Thailand. The data set is divided into an in-sample period (R = 977 observations) and an out-of-sample period (n = 61 observations). Plots of $P_t$ and its log return series $r_t$ (Eq. 1) are given in Fig. 1. Both display the usual properties of financial data series. As expected, volatility is not constant over the period and exhibits clustering: large changes in the index are often followed by large changes and small changes by small changes.
Descriptive statistics of $r_t$ are presented in Table 1. Overall, $r_t$ has a small positive average return (about 0.0436%) and a standard deviation of 1.5525%. The lowest average return is observed on Monday and the highest on Friday.
Moreover, we test for the normality of $r_t$ using the Jarque-Bera test. (The Jarque-Bera test is a goodness-of-fit measure of departure from normality. Under the null hypothesis that the data come from a normal distribution, the test statistic has a $\chi^2$ distribution with 2 degrees of freedom, so the 5% critical value is 5.99.) The test statistic is 1,758.1080, which leads us to reject the null hypothesis, so $r_t$ is not normally distributed. Also, the skewness and kurtosis of $r_t$ are -0.7189 (not equal to zero) and 6.2605 (greater than 3), respectively. These values confirm that the returns are not normally distributed and, in particular, that their distribution has fatter tails.
We also test for the stationarity of $r_t$ using the Augmented Dickey-Fuller (ADF) test. (The ADF test is a test for a unit root in a time series sample. Its null hypothesis is that the series is nonstationary. The 1, 5 and 10% critical values are -3.44, -2.86 and -2.57, respectively.) The test statistic is -30.0801, which indicates that $r_t$ is stationary.
Table 2 reports the day of the week effects and the ARMA (p, q) estimates for returns. Panel A of Table 2 displays the estimated coefficients of the day of the week effect ($\beta_j$, $j = 1, \ldots, 5$). The estimated coefficients are all close to zero. We then test the null hypothesis that each coefficient is zero. The coefficient of the Friday dummy variable is significantly different from zero at the 95% confidence level, while the coefficients of the other days are insignificant. These observations suggest that Friday is the only day with a day of the week effect in the SET Index.
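For readers who want to see how a Jarque-Bera statistic is formed from sample skewness and kurtosis, a minimal Python sketch follows. The function name and the toy data are hypothetical, and the sketch uses population (divide-by-n) moments and raw kurtosis, which is one common convention; statistical packages may differ in small-sample details.

```python
def jarque_bera(x):
    """Return (JB statistic, skewness, kurtosis) of a sample.

    JB = n / 6 * (S^2 + (K - 3)^2 / 4), with S and K computed from
    central moments divided by n. Under normality JB is asymptotically
    chi-squared with 2 degrees of freedom.
    """
    n = len(x)
    mean = sum(x) / n
    m2 = sum((v - mean) ** 2 for v in x) / n
    m3 = sum((v - mean) ** 3 for v in x) / n
    m4 = sum((v - mean) ** 4 for v in x) / n
    skew = m3 / m2 ** 1.5
    kurt = m4 / m2 ** 2
    jb = n / 6.0 * (skew ** 2 + (kurt - 3.0) ** 2 / 4.0)
    return jb, skew, kurt


# Toy data, far too short for a real test; it only exercises the formula.
jb, skew, kurt = jarque_bera([-1.0, 0.0, 1.0])
reject_at_5_percent = jb > 5.99  # 5% critical value of chi-squared(2)
```

A statistic above 5.99 would lead to rejection of normality at the 5% level, which is the comparison made in the text.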
Panel B displays the estimated coefficients of the ARMA process and their P-values. Using a t-test of the null hypothesis that each AR and MA coefficient is zero, we find that the reported P-values are all zero, so each coefficient is significantly different from zero at the 99% confidence level. Hence the SET Index return can be modeled by the ARMA (3, 3) process.
The autocorrelation functions (ACF) are presented in Table 3, where we apply the Ljung-Box test for serial correlation in $P_t$ and $r_t$ at lags one through ten and at lag twenty-two. The serial correlation in $P_t$ (column 2) confirms that it is nonstationary, whereas $r_t$ is stationary because its ACF values (column 5) decay very quickly as the lag increases, as confirmed by the Augmented Dickey-Fuller test in Table 1. We analyze the significance of autocorrelation in the squared mean-adjusted return series $(r_t - \mu_t)^2$ using the Ljung-Box Q-test. (The Ljung-Box Q-test is a statistical test of whether any of a group of autocorrelations of a time series is different from zero. The test statistic is distributed as $\chi^2(q)$, where q is the number of lags.) Since the P-value in column 10 is equal to zero, we reject the null hypothesis of no autocorrelation: the squared mean-adjusted returns are serially correlated. Next, we apply Engle's (1982) ARCH test to the squared mean-adjusted returns. (The null hypothesis of the ARCH test is the absence of ARCH components, that is, $\alpha_i = 0$ for all $i = 1, 2, \ldots, q$. The test statistic is also distributed as $\chi^2(q)$, where q is the number of lags.) The P-value in column 12 indicates conditional heteroskedasticity.
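The Ljung-Box Q statistic used above can be computed directly from sample autocorrelations. The sketch below is illustrative only: the function name and data are hypothetical, and it implements the textbook form Q = n(n + 2) Σ ρ_k² / (n − k), summed over lags k = 1, …, m.

```python
def ljung_box_q(x, max_lag):
    """Ljung-Box Q statistic for lags 1..max_lag.

    Under the null of no autocorrelation, Q is asymptotically
    chi-squared with max_lag degrees of freedom.
    """
    n = len(x)
    mean = sum(x) / n
    dev = [v - mean for v in x]
    c0 = sum(d * d for d in dev)
    q = 0.0
    for k in range(1, max_lag + 1):
        ck = sum(dev[t] * dev[t - k] for t in range(k, n))
        rho_k = ck / c0                 # sample autocorrelation at lag k
        q += rho_k * rho_k / (n - k)
    return n * (n + 2) * q


# Toy series to exercise the formula.
q1 = ljung_box_q([1.0, 2.0, 3.0, 4.0], max_lag=1)

# To test for volatility clustering, the same function would be applied
# to squared mean-adjusted returns (hypothetical residuals shown here).
residuals = [0.5, -1.2, 0.3, 2.1, -1.9, 0.2, 0.1, -0.4]
q_squared = ljung_box_q([e * e for e in residuals], max_lag=2)
```

Comparing the statistic with the relevant chi-squared critical value gives the test decision.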
Empirical methodology: This empirical part adopts the GARCH-type and MRS-GARCH (1,1) models to estimate the volatility of the SET Index. The GARCH-type models considered are GARCH (1,1), EGARCH (1,1), in which ξ is the asymmetry parameter that captures the leverage effect, and GJR-GARCH (1,1), in which the indicator $I_{\{\varepsilon_{t-1} > 0\}}$ equals one when $\varepsilon_{t-1}$ is greater than zero and zero otherwise (Klaassen, 2002). In order to account for the fat tails of financial returns, we consider three different distributions for the innovations: Normal (N), Student-t (t) and Generalized Error Distribution (GED).

GARCH-type models:
Table 4 presents the estimation results for the GARCH-type models. Almost all parameter estimates are highly significant at the 1% level. In particular, the asymmetry term ξ in the EGARCH models is significantly different from zero, which indicates that unexpected negative returns imply a higher conditional variance than positive returns of the same size. All models display strong persistence in volatility, ranging from 0.8950 to 0.9521; that is, once volatility increases, it is likely to remain high over several periods.
Markov regime switching GARCH models: Estimation results and summary statistics of the MRS-GARCH models are presented in Table 5. Most parameter estimates in the MRS-GARCH models are significantly different from zero at least at the 95% confidence level.

In-sample evaluation:
We use various goodness-of-fit statistics to compare the volatility models: the Akaike Information Criterion (AIC), the Schwarz Bayesian Information Criterion (SBIC) and the log-likelihood (LOGL). Table 6 presents the goodness-of-fit statistics and loss functions for all volatility models. According to SBIC, the EGARCH model with the GED distribution performs best in modeling the SET Index volatility. However, MSE1 and MSE2 suggest that the EGARCH model with the t distribution performs best, while AIC and LOGL favor MRS-GARCH-2t. MAD1, MAD2 and HMSE favor MRS-GARCH-t, and QLIKE favors the MRS-GARCH model with the GED distribution.
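The loss functions named above are not written out in this section. The Python sketch below uses the definitions that are common in the volatility forecasting literature, and these are an assumption rather than a transcription of this study's formulas: MSE1 and MAD1 compare standard deviations, MSE2 and MAD2 compare variances, QLIKE is the Gaussian quasi-likelihood loss and HMSE is the heteroskedasticity-adjusted squared error. The function name, the variance proxy and the numbers are hypothetical.

```python
import math


def loss_functions(realized_var, forecast_var):
    """Average losses between a realized-variance proxy and variance forecasts.

    Assumed definitions (s = realized variance, h = forecast variance):
      MSE1  mean of (sqrt(s) - sqrt(h))^2
      MSE2  mean of (s - h)^2
      QLIKE mean of log(h) + s / h
      MAD1  mean of |sqrt(s) - sqrt(h)|
      MAD2  mean of |s - h|
      HMSE  mean of (s / h - 1)^2
    """
    pairs = list(zip(realized_var, forecast_var))
    n = len(pairs)
    return {
        "MSE1": sum((math.sqrt(s) - math.sqrt(h)) ** 2 for s, h in pairs) / n,
        "MSE2": sum((s - h) ** 2 for s, h in pairs) / n,
        "QLIKE": sum(math.log(h) + s / h for s, h in pairs) / n,
        "MAD1": sum(abs(math.sqrt(s) - math.sqrt(h)) for s, h in pairs) / n,
        "MAD2": sum(abs(s - h) for s, h in pairs) / n,
        "HMSE": sum((s / h - 1.0) ** 2 for s, h in pairs) / n,
    }


# Hypothetical realized variances and forecasts.
losses = loss_functions([4.0, 1.0], [1.0, 1.0])
```

For every one of these measures a smaller value indicates a better forecast, so the best model under a given criterion is the one with the lowest loss.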

Forecasting volatility in out-of-sample:
We investigate the ability of the MRS-GARCH and GARCH-type models to forecast the volatility of the SET Index out of sample.
Table 7 presents the out-of-sample loss functions for volatility forecasts one day ahead, five days ahead (a week), ten days ahead (two weeks) and twenty-two days ahead (a month). We find that the GARCH-type models perform best in the short term (one day and a week) for forecasting the volatility of the SET Index. Additionally, we report a sign test, the Success Ratio (SR), which is simply the fraction of volatility forecasts that have the same sign as the volatility realizations. The table shows that the GARCH-type models predict the sign of future volatility well in the short term.
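A Success Ratio of this kind can be sketched as follows. Because variances are always positive, the sketch assumes that both series are first demeaned so that a sign comparison is meaningful. This demeaning step, the function name and the data are assumptions for illustration and are not taken from this study.

```python
def success_ratio(realized, forecast):
    """Fraction of periods in which the demeaned forecast and the demeaned
    realization have the same sign (their product is strictly positive).

    Assumption: each series is demeaned with its own sample mean.
    """
    n = len(realized)
    mean_r = sum(realized) / n
    mean_f = sum(forecast) / n
    hits = sum(
        1 for s, h in zip(realized, forecast)
        if (s - mean_r) * (h - mean_f) > 0
    )
    return hits / n


# Hypothetical realized variances and forecasts: the forecast is on the
# correct side of its mean in the first and last periods only.
sr = success_ratio([1.0, 2.0, 3.0, 4.0], [1.0, 3.0, 2.0, 4.0])
```

A value close to one means the forecasts move above and below their average level together with realized volatility.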
On the other hand, we find that the MRS-GARCH models perform best in the long term (two weeks and a month) for forecasting the volatility of the SET Index. According to the SR test, the MRS-GARCH models also predict future volatility well in the long term.

DISCUSSION
For forecasting the volatility of the SET Index in the long term, the MRS-GARCH models perform best.

CONCLUSION
In this study, we modeled the returns of the SET Index by a mean equation with the day of the week effect and an autoregressive moving-average process of orders p and q (ARMA (p, q)), and forecast the volatility of the SET Index by the GARCH, EGARCH, GJR-GARCH and MRS-GARCH models. Moreover, we compared their volatility forecasting performance at horizons of one day, one week, two weeks and one month.
Friday is the only day with a day of the week effect in the SET Index, and the return equation is estimated with an ARMA (3, 3) process. The GARCH-type models perform best in the short term (one day and a week). On the other hand, the MRS-GARCH models perform best in the long term (two weeks and a month) for forecasting the volatility of the SET Index.
For further study, three or four volatility regimes can be considered rather than two, or Markov regime switching can be combined with other volatility models, e.g., EGARCH and GJR-GARCH. In addition, the performance of the MRS-GARCH models can be compared in terms of their ability to forecast Value at Risk (VaR) for long and short positions.