A Hybrid Approach based on Winter’s Model and Weighted Fuzzy Time Series for Forecasting Trend and Seasonal Data

Problem statement: In the literature, the most studied fuzzy time series model for forecasting is the first order model, which uses only the first lagged variable. Such approaches therefore fail to analyze accurately trend and seasonal time series, which form an important class of time series. Approach: In this paper, a hybrid approach is proposed to analyze trend and seasonal fuzzy time series. The proposed hybrid approach is based on Winter’s model and Weighted Fuzzy Time Series (WFTS). The Winter’s model and the WFTS model are used jointly, aiming to capture different forms of pattern in the time series data. The order of this model is determined by utilizing the graphical order fuzzy relationship. A real time series of tourist arrivals is analyzed with this method to show the efficiency of the proposed hybrid method. Results: The results obtained from the proposed method are compared with those of other methods, i.e., Decomposition, Winter’s and ARIMA models, and the proposed hybrid method is observed to give more accurate results. Conclusion: The empirical results with tourist arrivals data clearly suggest that the hybrid model is able to outperform each component model used in isolation. Moreover, this empirical evidence suggests that by using dissimilar models or models that disagree with each other strongly, the hybrid model will have lower generalization variance or error. Additionally, because of the possible unstable or changing patterns in the data, using the hybrid method can reduce the model uncertainty which typically occurs in statistical inference and time series forecasting.


INTRODUCTION
The definitions of fuzzy time series were first introduced by Song and Chissom (1993a;1993b), who developed the model by using fuzzy relation equations and approximate reasoning. Furthermore, Song and Chissom (1994) divided fuzzy time series into two types, namely time-variant and time-invariant, whose difference relies on whether the same relation exists between time t and its prior time t-k (where k = 1, 2,…,m). If the relations are all the same, it is a time-invariant fuzzy time series; if the relations are not the same, it is time-variant.
Recently, Liu (2009) proposed an integrated fuzzy time series forecasting system in which the forecast value is a trapezoidal fuzzy number instead of a single-point value and which deals effectively with stationary, trend and seasonal time series. Later, Egrioglu et al. (2009) proposed a new hybrid approach based on SARIMA and partial high order bivariate fuzzy time series for forecasting seasonal data. Elaal et al. (2010) introduced fuzzy clustering to select membership functions in the fuzzy time series model. Additionally, Lee and Suhartono (2010) proposed a new weighted fuzzy time series for forecasting time series with a seasonal pattern.
In this paper, a new hybrid model based on the Winter's model and weighted fuzzy time series is proposed to improve the forecast accuracy for trend and seasonal data. This approach follows the idea of Zhang (2003), who proposed a hybrid model based on ARIMA and a Neural Network model. In this new hybrid model, the linear chronological weight from Yu (2005) is expanded to a uniform and/or exponential chronological weight, as in Lee and Suhartono (2010), for forecasting the error series from Winter's model. This study shows that the graphical order fuzzy relationship can be used effectively to select an appropriate order of fuzzy time series. Additionally, using a series of monthly tourist arrivals to Bali, Indonesia, this study shows that the hybrid approach with an exponential chronological weight (Lee and Suhartono, 2010) outperforms the hybrid approaches based on the fuzzy time series proposed by Chen (1996); Yu (2005) and Cheng et al. (2008) as well as some classical methods, i.e., Decomposition, Winter's and ARIMA models.

MATERIALS AND METHODS
Data sources:
A real monthly dataset of the number of tourist arrivals to Bali, Indonesia, from 1989 to 1997, is used as a case study. This series was obtained from the Indonesia Central Bureau of Statistics (see www.bps.go.id). Bali is the main destination of the international tourists who visit Indonesia and these data have both trend and seasonal patterns. Ismail et al. (2009) analyzed these tourism data using intervention analysis and recently Suhartono (2011) also used these data for evaluating the effect of additive or multiplicative order in the SARIMA model. For this dataset, the last 12 observations are reserved as the test set for forecasting evaluation and comparison (out-sample dataset or testing data).

Chen's method: Chen (1996) improved the approach proposed by Song and Chissom (1993a;1993b). Chen's method uses a simple operation, instead of complex matrix operations, in the establishment step of fuzzy relationships. The algorithm of Chen's method can be given as follows:
Step 1: Define the universe of discourse and intervals for rules abstraction. Based on the issue domain, the universe of discourse can be defined as: U = [starting, ending]. Once the length of the interval is determined, U can be partitioned into several equal-length intervals.
Step 2: Define fuzzy sets based on the universe of discourse and fuzzify the historical data.
Step 4: Establish Fuzzy Logical Relationships (FLRs) and group them into Fuzzy Logical Relationship Groups (FLRGs) based on the current states of the data of the fuzzy logical relationships.
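Step 5: Forecast. Let $F(t-1) = A_i$.
Case 1: The fuzzy logical relationship of $A_i$ is empty. If $A_i \to \emptyset$, then the forecast value F(t) is equal to $A_i$.
Case 2: There is only one fuzzy logical relationship in the fuzzy logical relationship sequence. If $A_i \to A_j$, then the forecast value F(t) is equal to $A_j$.
Case 3: There are several fuzzy logical relationships in the fuzzy logical relationship sequence. If $A_i \to A_{j1}, A_{j2}, \ldots, A_{jk}$, then the forecast value F(t) is equal to $A_{j1}, A_{j2}, \ldots, A_{jk}$.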
Step 6: Defuzzify. If the forecast of F(t) is $A_{j1}, A_{j2}, \ldots, A_{jk}$, the defuzzified result is equal to the average of the midpoints of $A_{j1}, A_{j2}, \ldots, A_{jk}$.

Yu's method: Yu (2005) proposed weighted models to tackle two issues in fuzzy time series forecasting, namely, recurrence and weighting. The method proposed by Yu applies linear chronological weights and produces more accurate forecasts than Chen's first order fuzzy time series method. The steps of the algorithm of the weighted method proposed by Yu (2005) are given below.
Step 1: Define the universe of discourse and subintervals. Based on the minimum and maximum values in the data set, the variables $D_{\min}$ and $D_{\max}$ are defined. Then choose two arbitrary positive numbers, $D_1$ and $D_2$, so that the interval can be divided evenly: $U = [D_{\min} - D_1, D_{\max} + D_2]$.
Step 2: Define fuzzy sets based on the universe of discourse and fuzzify the historical data.
Step 4: Establish fuzzy logical relationships (revised Chen's method). The recurrent FLRs are taken into account by revising Step 4 in Chen's method. For example, when there are 5 FLRs with the same LHS, all five of them, including the recurrent ones, are used to establish the fuzzy logical relationship group.
Step 5: Forecast. Use the same rule as Chen's.
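Step 6: Defuzzify. Suppose the forecast of F(t) is $A_{j1}, A_{j2}, \ldots, A_{jk}$. The defuzzified matrix is equal to the matrix of the midpoints of $A_{j1}, A_{j2}, \ldots, A_{jk}$:
$$M(t) = [m_{j1}, m_{j2}, \ldots, m_{jk}]$$
where M(t) represents the defuzzified forecast of F(t).
Step 7: Assign weights. Suppose the forecast of F(t) is $A_{j1}, A_{j2}, \ldots, A_{jk}$, with linear chronological weights $w_h = h$ for h = 1, 2,…,k. We then obtain the weight matrix as:
$$W(t) = \left[\frac{w_1}{\sum_{h=1}^{k} w_h}, \frac{w_2}{\sum_{h=1}^{k} w_h}, \ldots, \frac{w_k}{\sum_{h=1}^{k} w_h}\right] = \left[\frac{1}{\sum_{h=1}^{k} h}, \frac{2}{\sum_{h=1}^{k} h}, \ldots, \frac{k}{\sum_{h=1}^{k} h}\right]$$
where $w_h$ is the corresponding weight for $A_{jh}$.
Step 8: Calculate the final forecast values. In the weighted model, the final forecast is equal to the product of the defuzzified matrix and the transpose of the weight matrix:
$$\hat{F}(t) = M(t) \times W(t)^{T}$$

Cheng's method: Cheng et al. (2008) proposed a fuzzy time series method based on the adaptive expectation model to obtain forecasts. The method proposed by Cheng et al. produces more accurate forecasts than Chen's and Yu's methods on two real datasets, namely TAIEX and the enrollments of the University of Alabama. The steps of the algorithm of the method proposed by Cheng et al. (2008) are given below.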
Step 1: Define the universe of discourse and subintervals as in Yu's method.
Step 2: Define fuzzy sets based on the universe of discourse and fuzzify the historical data.
Step 4: Establish fuzzy logical relationships (revised Chen's method). The FLRs with the same LHS are grouped to form an FLR Group; for example, 5 FLRs with the same LHS together establish one fuzzy logical relationship group. All FLRs together construct a fluctuation-type matrix.
Step 5: Assign weights. The matrix from Step 4 is further standardized to $W_n$ and multiplied by the defuzzified matrix, $L_{df}$, to produce the forecast value. The weights are standardized to obtain the weight matrix as follows:
$$W_n(t) = [W'_1, W'_2, \ldots, W'_k] = \left[\frac{W_1}{\sum_{h=1}^{k} W_h}, \frac{W_2}{\sum_{h=1}^{k} W_h}, \ldots, \frac{W_k}{\sum_{h=1}^{k} W_h}\right]$$
Step 6: Calculate the forecast value. With the standardized weight matrix from Step 5, the forecast value is obtained by using:
$$F(t) = L_{df}(t-1) \cdot W_n(t-1)$$
where $L_{df}(t-1)$ is the defuzzified matrix and $W_n(t-1)$ is the weight matrix.
Step 7: Employ the adaptive forecasting equation to produce a conclusive forecast.
Lee's method: Lee and Suhartono (2010) proposed uniform and exponential chronological weights to tackle two issues in fuzzy time series forecasting, namely, recurrence and weighting, as an extension of Yu's method. This method produces more accurate forecasts than Chen's, Yu's and Cheng's methods. The steps of the algorithm of the weighted method proposed by Lee and Suhartono (2010) are given as follows.
Step 1: Define the universe of discourse and partition it into intervals as Yu's method.
Step 2: Establish a related fuzzy set (linguistic value) for each observation in the training dataset.
Step 4: Establish fuzzy logical relationship groups for all FLRs.
Step 5: Select the best order of FLRs. The graphical orders for FLRs and fluctuation-type matrices are used to identify the best order of FLRs.
Step 8: Here $w_h$ is the corresponding weight for $A_{jh}$. The proposed weights become exponential weights when $c > 1$; they then treat the recent FLRs as more important than the older ones and generally take higher values than Yu's weights. Additionally, when $c = 1$ the weights follow a uniform chronological pattern, which implies that all chronological relationships are equally important.
Step 9: Calculate the final forecast values. The final forecast is equal to the product of the defuzzified matrix and the transpose of the weight matrix:
$$\hat{F}(t) = M(t) \times W(t)^{T}$$
where M(t) is the defuzzified matrix and W(t) is the weight matrix.

Zhang (2003) stated that since it is difficult to completely know the characteristics of the data in a real problem, a hybrid methodology that has both linear and nonlinear modeling capabilities can be a good strategy for practical use. By combining different models, different aspects of the underlying patterns may be captured. As proposed by Zhang (2003), it may be reasonable to consider a time series to be composed of a linear structure and a nonlinear component. That is:
$$Y_t = L_t + N_t \quad (1)$$
Where:
$L_t$ = The linear component
$N_t$ = The nonlinear component
These two components have to be estimated from the data.
In this paper, we first let Winter's model capture the linear component, particularly the trend and seasonal components; the residuals from Winter's model then become a stationary series that may contain only the nonlinear relationship. Thus, we propose to consider the forecast of the time series to be composed of two components, $\hat{Y}_{1,t}$ and $\hat{Y}_{2,t}$, as follows:
$$\hat{Y}_t = \hat{Y}_{1,t} + \hat{Y}_{2,t} \quad (2)$$
The first component comes from Winter's model, which consists of four equations (Eq. 3-6):
• The exponentially smoothed series
• The trend estimate
• The seasonality estimate
• The forecast p periods into the future
Let $e_t$ denote the residual at time t from the Winter's model, then:
$$e_t = Y_t - \hat{Y}_{1,t}$$
where $\hat{Y}_{1,t}$ is the forecast value for time t from the estimated Winter's model.
Residuals are important in diagnosing the sufficiency of linear models. A linear model is not sufficient if there are still linear correlation structures left in the residuals. However, residual analysis is not able to detect any nonlinear patterns in the data. In fact, there is currently no general diagnostic statistic for nonlinear autocorrelation relationships. Therefore, even if a model has passed diagnostic checking, the model may still not be adequate in that nonlinear relationships have not been appropriately modeled. By modeling the residuals using WFTS, nonlinear relationships can be discovered.
In summary, the proposed methodology of the hybrid system consists of two steps. In the first step, a Winter's model is used to analyze the trend and seasonal part of the problem. In the second step, a WFTS model is developed to model the residuals from the Winter's model. In this second step, we apply four WFTS models proposed by Chen (1996); Yu (2005); Cheng et al. (2008) and Lee and Suhartono (2010). The results from the WFTS can be used as predictions of the error terms for the Winter's model. The hybrid model exploits the unique features and strengths of the Winter's model as well as the WFTS model in determining different patterns.
Thus, it could be advantageous to model trend, seasonal and nonlinear patterns separately by using different models and then combine the forecasts to improve the overall modeling and forecasting performance.
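As an illustration of the first component, the following Python sketch implements one common textbook form of Winter's recursions and returns the one-step-ahead fitted values, the residuals $e_t$ and the p-step forecasts. Several things here are assumptions rather than details taken from this study: the additive (not multiplicative) seasonal form, the initialization from the first two seasons, the function name `winters_additive` and the smoothing constants in the usage lines.

```python
import numpy as np

def winters_additive(y, season, alpha, beta, gamma, horizon):
    """Additive Winter's (Holt-Winters) smoothing, assumed form:

        L_t = alpha*(Y_t - S_{t-s}) + (1-alpha)*(L_{t-1} + T_{t-1})
        T_t = beta*(L_t - L_{t-1}) + (1-beta)*T_{t-1}
        S_t = gamma*(Y_t - L_t) + (1-gamma)*S_{t-s}
        forecast(t+p) = L_t + p*T_t + S_{t-s+p}

    Requires at least two full seasons of data for initialization.
    """
    y = np.asarray(y, dtype=float)
    n = len(y)
    # Initialization (one simple choice among many): trend from the
    # difference of the first two seasonal means, level extrapolated
    # back to the period just before the first observation.
    trend = (y[season:2 * season].mean() - y[:season].mean()) / season
    level = y[:season].mean() - trend * (season + 1) / 2.0
    seasonal = [y[i] - (level + trend * (i + 1)) for i in range(season)]

    fitted = np.empty(n)
    for t in range(n):
        s_prev = seasonal[t]                      # S_{t-s}
        fitted[t] = level + trend + s_prev        # one-step-ahead forecast
        new_level = alpha * (y[t] - s_prev) + (1 - alpha) * (level + trend)
        trend = beta * (new_level - level) + (1 - beta) * trend
        seasonal.append(gamma * (y[t] - new_level) + (1 - gamma) * s_prev)
        level = new_level

    # Forecast p periods ahead using the latest seasonal estimates.
    forecasts = np.array([
        level + p * trend + seasonal[n + (p - 1) % season]
        for p in range(1, horizon + 1)
    ])
    residuals = y - fitted                        # e_t = Y_t - Yhat_{1,t}
    return fitted, residuals, forecasts

# Synthetic example: linear trend plus a repeating 4-period pattern.
pattern = np.array([5.0, -3.0, -4.0, 2.0])
t = np.arange(12)
y = 100 + 2 * t + pattern[t % 4]
fitted, residuals, forecasts = winters_additive(y, 4, 0.3, 0.1, 0.2, horizon=4)
print(forecasts)
```

On this noise-free synthetic series the recursions reproduce the data exactly, so the residuals are zero; on real data the residual series is what the second step of the hybrid approach would model.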
To validate the methodology of hybrid model for forecasting trend and seasonal time series data, a new algorithm is proposed as follows.
Step 1: Apply Winter's model in Eq. 3-6 to get the first forecast component, $\hat{Y}_{1,t}$, and the residuals, $e_t$.
Step 2: Apply the WFTS method to model the residuals from the Winter's model and get the second forecast component, $\hat{Y}_{2,t}$. In this step, four WFTS methods proposed by Chen (1996); Yu (2005); Cheng et al. (2008) and Lee and Suhartono (2010) are applied to find the best forecast values.
Step 3: Calculate the final forecast values by adding the forecast values from the first and second steps as in Eq. 2.
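The three steps can be sketched in Python as follows, taking the Winter's fitted values and next-period forecast as given inputs. This is a hedged illustration only: the first order relationship, the weight form $w_h = c^{h-1}$ (chosen so that c = 1 gives uniform and c > 1 gives exponential weights), the 5% margin added to the universe of discourse, and the names `wfts_next` and `hybrid_forecast` are assumptions, not specifications from this study.

```python
import numpy as np

def wfts_next(residuals, n_intervals=7, c=1.0):
    """One-step forecast of a residual series with a first order
    weighted fuzzy time series (illustrative sketch).

    Assumed weights: w_h = c**(h-1), normalized to sum to 1.
    """
    e = np.asarray(residuals, dtype=float)
    span = e.max() - e.min()
    pad = 0.05 * span if span > 0 else 1.0   # hypothetical margins D1 = D2
    edges = np.linspace(e.min() - pad, e.max() + pad, n_intervals + 1)
    mid = (edges[:-1] + edges[1:]) / 2.0
    labels = np.digitize(e, edges[1:-1])     # fuzzified residuals

    # Right-hand sides of all FLRs whose left-hand side equals the
    # current state, kept in chronological order with recurrences.
    current = labels[-1]
    rhs = [int(b) for a, b in zip(labels[:-1], labels[1:]) if a == current]
    if not rhs:                              # empty group
        return float(mid[current])
    w = c ** np.arange(len(rhs), dtype=float)
    return float(mid[rhs] @ (w / w.sum()))   # M(t) times W(t)^T

def hybrid_forecast(y, winters_fitted, winters_next, n_intervals=7, c=1.0):
    """Step 1 is assumed done elsewhere (winters_fitted, winters_next).
    Step 2 models the residuals; Step 3 adds both components (Eq. 2)."""
    residuals = np.asarray(y, dtype=float) - np.asarray(winters_fitted, dtype=float)
    e_next = wfts_next(residuals, n_intervals, c)
    return winters_next + e_next

# Hypothetical numbers, not taken from the tourist arrivals data.
y = [9.0, 11.0, 9.0, 11.0, 9.0]
winters_fitted = [10.0] * 5
print(hybrid_forecast(y, winters_fitted, winters_next=10.0, n_intervals=2))
```

Keeping the Winter's forecasts as inputs makes the second step independent of how the first component was produced, which mirrors the use of separate software packages for the two steps.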

RESULTS
To demonstrate the effectiveness of this hybrid method, we use data on the number of tourist arrivals to Bali, Indonesia, via Ngurah Rai airport from January 1989 until December 1997 as a case study. The time series plot in Fig. 1 illustrates that the data have both trend and seasonal patterns. To assess the forecasting performance of different models, the data set is divided into two samples, training and testing. The training data set, which contains 96 records (January 1989 until December 1996), is used exclusively for model development and the last 12 records (January 1997 until December 1997) are used as the test sample to evaluate the established model. In this study, all hybrid modeling is implemented via two package programs, i.e., MINITAB for Winter's model at the first step and MATLAB for the WFTS model at the second step. The results are compared with three classical time series models, namely the Decomposition method, ARIMA and Winter's models. Only k-step-ahead forecasting is considered. The Root Mean Squared Error (RMSE) is selected as the forecasting accuracy measure. The RMSEs obtained using the hybrid models and the three classical time series models, both in training and testing data, are listed in Table 1. The ratio column gives the ratio of the RMSE of each method to that of the ARIMA model; a value less than 1 shows that the method performs better than ARIMA.
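The accuracy measure and the ratio column can be computed as in the short Python sketch below. The helper names and the numbers in the usage lines are placeholders chosen for illustration; they are not values from Table 1.

```python
import numpy as np

def rmse(actual, forecast):
    """Root Mean Squared Error between two equal-length sequences."""
    a = np.asarray(actual, dtype=float)
    f = np.asarray(forecast, dtype=float)
    return float(np.sqrt(np.mean((a - f) ** 2)))

def rmse_ratios(actual, forecasts, baseline="ARIMA"):
    """RMSE of each method divided by the RMSE of the baseline method.

    `forecasts` maps a method name to its forecast sequence; a ratio
    below 1 means the method beats the baseline on this sample.
    """
    scores = {name: rmse(actual, f) for name, f in forecasts.items()}
    return {name: score / scores[baseline] for name, score in scores.items()}

# Placeholder values only.
actual = [1.0, 2.0, 3.0, 4.0]
forecasts = {
    "ARIMA": [3.0, 4.0, 5.0, 6.0],
    "Hybrid": [2.0, 3.0, 4.0, 5.0],
}
print(rmse_ratios(actual, forecasts))
```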

DISCUSSION
The results in Table 1 show in general that the overall forecasting errors can be significantly reduced by using the hybrid models (by combining two models together), both in training and testing datasets. In terms of RMSE, the performance evaluation on the training data shows that the hybrid model combining Winter's and Cheng's WFTS methods at the 12th order FLR yields the most accurate forecasts of all models. Additionally, these results show that most of the hybrid models yield more accurate forecasts than ARIMA and the two other classical time series models.
Moreover, the hybrid model combining Winter's and Lee's WFTS methods at the first order FLR yields the most accurate forecasts of all models on the testing dataset. Additionally, the results on the testing data show that all the proposed hybrid methods at the first order FLR yield more accurate forecasts than the other hybrid methods and two classical time series models, i.e., Decomposition and ARIMA models. The results also show that Winter's model can reconstruct well the trend and seasonal components of the series and that the WFTS can fit well the residuals from Winter's model to improve the forecast accuracy.

CONCLUSION
Time series analysis and forecasting has been an active research area over the last few decades. The accuracy of time series forecasting is fundamental to many decision processes and hence the research for improving the effectiveness of forecasting models has never stopped. With the efforts of Box and Jenkins (1976), the ARIMA model has become one of the most popular methods in forecasting research and practice. More recently, WFTS models have shown their promise in time series forecasting applications with their nonlinear modeling capability.
In this study, we propose a hybrid approach based on Winter's and WFTS models and apply it to forecasting trend and seasonal data, i.e., tourist arrivals data. The Winter's model and the WFTS model are used jointly, aiming to capture different forms of pattern in the time series data. The empirical results with tourist arrivals data clearly suggest that the hybrid model is able to outperform each component model used in isolation.
Various combining methods have been proposed in the literature. However, most of them are designed to combine similar methods. Zhang (2003) stated that theoretical as well as empirical evidence in the literature suggests that by using dissimilar models or models that disagree with each other strongly, the hybrid model will have lower generalization variance or error. Additionally, because of the possible unstable or changing patterns in the data, using the hybrid method can reduce the model uncertainty which typically occurs in statistical inference and time series forecasting. Furthermore, by fitting the Winter's model first to the trend and seasonal data, the problem of fitting a higher order fuzzy time series model can be eased.