A TIME-DELAY CASCADING NEURAL NETWORK ARCHITECTURE FOR MODELING TIME-DEPENDENT PREDICTOR IN ONSET PREDICTION

The occurrence of rain before the real start of a rainy season often misleads farmers into thinking that the rainy season has started, prompting them to start planting immediately. In reality, the rainy season has not yet started, causing the already-planted rice seed to experience dehydration. Therefore, a model that can predict the onset of the rainy season is required, so that drought disaster can be avoided. This study presents the Time-Delay Cascading Neural Network (TD-CNN), which deals with situations where the response variable is determined by a number of time-dependent, inter-related predictors. The proposed model is used to predict the onset in Pacitan District, Indonesia based on the Southern Oscillation Index (SOI). Leave One Out (LOO) cross-validation with series data from 1982-2012 is used to compare the accuracy of the proposed model with the Back-Propagation Neural Network (BPNN) and the Cascading Neural Network (CNN). The experiment shows that the accuracy of the proposed model is 0.74, slightly above that of the two other models, BPNN and CNN, which are 0.71 and 0.72, respectively.


INTRODUCTION
Empirical data has suggested that agricultural productivity in Indonesia is affected by climate variability. The occurrence of extreme climate conditions has been increasing since 1950, both in frequency and intensity, as reported in the International Disaster Database (Boer and Perdinan, 2008), giving rise to large losses, in particular to national rice productivity. A total of around 80% of disasters in agriculture are caused by flood and drought conditions, respectively at around 10 and 70% (Boer and Subbiah, 2005). The remaining disasters are caused by other factors, including pests. The research result by Pasaribu et al. (2010) shows that national rice production lost about 1.2 million tonnes due to drought in 2007.
The data for Pacitan over the time frame from 1982 to 2009 indicate that around 90% of drought-related disasters occurred in the dry season (May, June, July and August) while the rest occurred in the early rainy season (November and December). This is an interesting fact because, theoretically, water should be in abundance during this period. Further investigation of this phenomenon showed that drought occurred when there was a delay in the onset of rainy season which was preceded by a false rain. This phenomenon occurs when there is heavy rain around the expected onset of rainy season and farmers intuitively conclude that the rainy season has started, when in fact it has not, and rain does not occur over the next 2 dasarians (ten-day periods). This causes the already-planted rice fields to experience water deficiency, leading to crop failure. This occurred in 1997, 2006 and 2007, leading to a drought over more than 3000 hectares of rice fields in Pacitan District, Indonesia. Figure 1 shows how this occurred, taking as an example the 1997 case, in which the total drought area was 3595 hectares.
Based on the above facts, it has been deemed necessary for the government and relevant agencies to provide accurate information regarding the onset of rainy season to mark the start of the planting of rice fields. With such information, it is hoped that farmers will start planting only when it is certain that the onset of rainy season has been reached, therefore avoiding the disaster of crop failure due to drought.
To help the government in determining the onset of rainy season, a system is required that can predict the onset accurately. Robertson et al. (2007) showed that the onset of rainy season in Indonesia is closely related to global climate phenomena, one of which can be measured using the Southern Oscillation Index (SOI). Research conducted by Buono et al. (2012) showed that artificial neural network models using SOI as predictors are able to predict the onset of rainy season in the district of Indramayu with correlation results of up to 0.8. Other research conducted by Larasati (2012) showed that models using Support Vector Regression (SVR) are able to predict the onset of rainy season in the same district with correlation results of 0.7. Meanwhile, Buono and Mushthofa (2012) used fuzzy inference systems with SOI as predictors and obtained a correlation result of 0.68.
The shortcoming of the previous models was that they do not accurately reflect the natural characteristics of the underlying system. Global climate indexes, as well as the onset of rainy season, are phenomena which contain the aspects of time series and lag time. The developed models do not accommodate such characteristics. In this study, an artificial neural network model is developed from the cascaded multi-layer perceptron to model the time series aspect of SOI as a predictor in predicting the onset. This model is intended to capture the time-lag characteristic of the existing data, in the hope that it will provide better accuracy than previous models. To assess the performance of the proposed system, this study also presents the results of experiments using models trained with the Back-Propagation Neural Network (BPNN) and the Cascading Neural Network (CNN).
The rest of the paper is structured as follows: Part 2 describes the research method used, the data and the processing performed. Part 3 focuses on the discussion of the research results. Finally, Part 4 provides concluding remarks for this research.

DATA AND RESEARCH METHOD

Data
In this study, the predictor variable used to predict the onset is the Southern Oscillation Index (SOI). SOI data are obtained from http://www.bom.gov.au/climate/current/soihtm1.shtml. SOI is an index which reflects the condition of the Pacific Ocean in comparison with the ocean around Indonesia. The choice of this index is based on the fact that seasons in Indonesia are affected by the condition of the Pacific Ocean.
The monthly SOI data are determined based on the pressure differences between Tahiti and Darwin.
The response variable to be predicted is the onset of rainy season. According to the Indonesian Bureau of Meteorology, Climatology and Geophysics (BMKG), the onset of rainy season is defined by a threshold on the total precipitation of at least 50 mm over the course of one dasarian (ten days), with the following dasarian also meeting this threshold. Therefore, the variable representing the onset of rainy season can be measured in terms of dasarians. Figure 3 illustrates the process of determining the onset of rainy season. It can be seen that on the second dasarian of December (35th dasarian), the rain volume is above 50 mm and continues to be so for the next few dasarians. Therefore, in that year, the onset of rainy season occurred on the 35th dasarian. On the second dasarian of November, the rain volume is above 50 mm, but not for the next few dasarians, and hence we can conclude that the onset did not occur on the second dasarian of November. From Fig. 4, it can be seen that the anomaly of the rainy season onset from 1982 to 2009 varies between -5 (5 dasarians early) and +5 (5 dasarians late). Based on this data, the average onset falls on the 32nd dasarian (second dasarian of November). This means that the onset ranges from the 27th dasarian (third dasarian of September) at the earliest to the 37th dasarian (first dasarian of January) at the latest.
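The onset rule described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names (`find_onset`, `dasarian_label`), the `follow` parameter (how many subsequent dasarians must also reach the threshold) and the rainfall values are hypothetical and are not taken from the study's data or from BMKG's operational procedure.

```python
from typing import Optional, Sequence

MONTHS = ["January", "February", "March", "April", "May", "June", "July",
          "August", "September", "October", "November", "December"]


def dasarian_label(k: int) -> str:
    """Convert a 1-based dasarian number to a readable label.

    Numbers above 36 wrap into the next year, so 37 is the first
    dasarian of January, as in the text.
    """
    month = MONTHS[((k - 1) // 3) % 12]
    part = ("first", "second", "third")[(k - 1) % 3]
    return f"{part} dasarian of {month}"


def find_onset(rainfall_mm: Sequence[float],
               threshold_mm: float = 50.0,
               follow: int = 1,
               first_index: int = 1) -> Optional[int]:
    """Return the number of the first dasarian that reaches the threshold
    and is followed by `follow` dasarians that also reach it.

    `first_index` is the dasarian number of rainfall_mm[0].
    Returns None when no dasarian satisfies the rule.
    """
    n = len(rainfall_mm)
    for start in range(n - follow):
        window = rainfall_mm[start:start + follow + 1]
        if all(r >= threshold_mm for r in window):
            return first_index + start
    return None


# Invented rainfall totals (mm) for dasarians 31..37, for illustration only.
# Dasarian 32 is a "false rain": it exceeds 50 mm but the next one does not.
rain = [20.0, 65.0, 10.0, 30.0, 80.0, 70.0, 90.0]
onset = find_onset(rain, first_index=31)
print(onset, dasarian_label(onset))
```

With these invented values the rule skips the isolated wet dasarian and selects the first one that is sustained, which mirrors the false-rain situation discussed in the text.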
According to empirical data, the onset can be delayed or advanced from the normal (average) value. The advancement and delay of the onset depend on the variation of the condition of the Pacific Ocean, which in turn is reflected by the values of the SOI.
The BPNN model used in this layer is the multi-layer perceptron with one hidden layer, trained using the standard back-propagation training algorithm as described in (Fausett, 1994).

RESULTS AND ANALYSIS
In this section, we present the relationship among the considered variables and the model architecture, and compare the accuracies of the models.

Relationship of the Considered Variables
Based on the available data, it can be seen that SOI has a significant influence on the onset of rainy season, as shown in Fig. 6. The first finding is the apparent significant correlation between the onset and the SOI in the months of May to October. This means that the onset can be modeled using the SOI variables of May to October. However, based on this data, the onset usually falls in October, which makes the appropriate time to predict the onset around August or the beginning of September. Therefore, the SOI data for September and October cannot be used for prediction, since the SOI of September only becomes available around the start of October and, similarly, the SOI of October only becomes available in November. Waiting for these data would mean that it is too late to make the prediction.
The second result is that there is an apparent time series relation between the values of SOI. This fact allows us to build the model for predicting the onset using the SOI data of the months of May to August, while still taking into account the SOI of September and October.
The strategy employed here is to cascade SOI prediction models, each based on the SOI of the previous months. Therefore, the first model, which represents the time series relationships between the SOI data, will have an aspect of time-delay. The neural network architecture which allows such relationships is the cascading neural network. Therefore, a model called the Time-Delay Cascading Neural Network (TD-CNN) will be built and its performance will be compared to the standard models: BPNN and CNN.
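One way to read the cascading strategy above is as three supervised learning problems built from the same yearly records. The sketch below prepares those training pairs. It is an assumption-laden illustration: the record layout (`"soi"`, `"onset"`), the function name `build_cascade_datasets` and the sample values are hypothetical, and the third stage (May to October SOI mapped to the onset) is inferred from the description rather than stated explicitly in this section.

```python
from typing import Dict, List, Tuple

Pair = Tuple[List[float], float]


def build_cascade_datasets(records: List[Dict]) -> Tuple[List[Pair], List[Pair], List[Pair]]:
    """Split yearly records into (input, target) pairs for each cascade stage.

    Each record is assumed to hold six monthly SOI values (May..October)
    under "soi" and the observed onset (in dasarians) under "onset".
    """
    stage1: List[Pair] = []  # SOI May..Aug -> SOI Sep
    stage2: List[Pair] = []  # SOI May..Sep -> SOI Oct
    stage3: List[Pair] = []  # SOI May..Oct -> onset (assumed final stage)
    for rec in records:
        soi = list(rec["soi"])
        if len(soi) != 6:
            raise ValueError("expected six SOI values, May through October")
        stage1.append((soi[:4], soi[4]))
        stage2.append((soi[:5], soi[5]))
        stage3.append((soi[:6], float(rec["onset"])))
    return stage1, stage2, stage3


# Invented example record, not taken from the study's data.
records = [{"soi": [-3.1, -1.4, 0.8, 2.5, 4.0, 3.2], "onset": 31}]
s1, s2, s3 = build_cascade_datasets(records)
print(s1[0], s2[0], s3[0])
```

At prediction time only the May to August values are observed, so the September and October inputs of the later stages would be replaced by the outputs of the earlier stages.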

TD-CNN Architecture and Algorithm
Based on the reasoning explained in Section 3.1, the structure of the TD-CNN to be built is as presented in Fig. 7. To ease the formulation of the training algorithm, we define the following terminology for the network:
$u1_{ij}$ = The weights from the SOI of May ($i = 1$) to the SOI of August ($i = 4$) and bias 1 ($i = 0$) to neuron $H1_j$, with $j = 1, 2, 3, \ldots, p$
$u2_{ij}$ = The weights from the SOI of May ($i = 1$) to the SOI of September ($i = 5$) and bias 3 ($i = 0$) to neuron $H2_j$, with $j = 1, 2, 3, \ldots$
Training for this network used back-propagation (Fausett, 1994) with some additional links in order to make it suitable for the model being constructed.
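To make the data flow through the cascade concrete, the following is a minimal forward-pass sketch, not the authors' implementation. Several details are assumptions made only for illustration: the tanh hidden activation, the linear output unit, the hidden-layer sizes, the random initial weights, and the third stage that maps the six SOI values to the onset. All function names are hypothetical. Index 0 of each weight row holds the bias, following the $i = 0$ convention of the definitions above.

```python
import math
import random
from typing import List, Sequence, Tuple

Stage = Tuple[List[List[float]], List[float]]  # (hidden weights, output weights)


def init_layer(n_in: int, n_out: int, rng: random.Random) -> List[List[float]]:
    """One weight row per neuron; row[0] is the bias, row[1:] the input weights."""
    return [[rng.uniform(-0.5, 0.5) for _ in range(n_in + 1)] for _ in range(n_out)]


def make_stage(n_in: int, n_hidden: int, rng: random.Random) -> Stage:
    """A one-hidden-layer perceptron with a single output unit."""
    return init_layer(n_in, n_hidden, rng), init_layer(n_hidden, 1, rng)[0]


def mlp_forward(x: Sequence[float], hidden: List[List[float]], out: List[float]) -> float:
    """tanh hidden layer followed by a linear output (both are assumptions)."""
    h = [math.tanh(row[0] + sum(w * xi for w, xi in zip(row[1:], x))) for row in hidden]
    return out[0] + sum(w * hj for w, hj in zip(out[1:], h))


def td_cnn_forward(soi_may_to_aug: Sequence[float],
                   stages: Tuple[Stage, Stage, Stage]) -> Tuple[float, float, float]:
    """Cascade: predict SOI Sep, then SOI Oct, then the onset."""
    s1, s2, s3 = stages
    soi_sep = mlp_forward(soi_may_to_aug, *s1)   # inputs i = 1..4
    x5 = list(soi_may_to_aug) + [soi_sep]
    soi_oct = mlp_forward(x5, *s2)               # inputs i = 1..5
    x6 = x5 + [soi_oct]
    onset = mlp_forward(x6, *s3)                 # assumed final stage
    return soi_sep, soi_oct, onset


rng = random.Random(0)
stages = (make_stage(4, 3, rng), make_stage(5, 3, rng), make_stage(6, 3, rng))
# Invented SOI values for May..August; the weights are untrained.
print(td_cnn_forward([-3.1, -1.4, 0.8, 2.5], stages))
```

The point of the sketch is the wiring: each later stage receives the observed May to August values plus the outputs of the earlier stages, so information about September and October enters the onset prediction without waiting for those months to be observed.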

Accuracy
To test the prediction method used, we need to divide the available data into training and testing data. One way of doing this is by using LOO (Leave One Out) Cross Validation. The advantage of this method is that we can simulate the condition as if all data are used for testing. This step is important, since there are not many data available for this research and, to obtain new data, we have to wait for at least a year. In LOO, every instance has the chance of being both training data and testing data.
Figure 8 provides a visual comparison of the accuracies of the three models. The interesting thing to see from this figure is that in BPNN, the predicted values are not linearly correlated with the observed values. This is in contrast with the CNN and TD-CNN results. In CNN, the predicted values have a linear correlation with the observed values. However, the error values are still too large. In TD-CNN, we can see a linear correlation between the predicted and the observed values. However, we can also see that there is a systematic deviation. This shows that if we had performed a calibration of the predicted values of TD-CNN, then its prediction results would be better than those of the other two methods. This study shows that the TD-CNN method gives a correlation value of 0.74, comparable to earlier results (Buono et al., 2012).
Figure 9 illustrates the cause of the 1997 drought disaster in Pacitan. In that year, the rainfall on the 30th dasarian (end of November) reached more than 50 mm, which caused farmers to assume that the rainy season had started and to begin planting their rice fields. Observational data showed that over the next 6 dasarians, only one dasarian had rainfall above 50 mm. Therefore, the rice fields experienced water deficiency over an area of up to 3595 ha. From this figure, we can see that on the first dasarian (or 37th, beginning of January) rainfall reached over 50 mm and continued to do so consistently for almost all the following dasarians of the rainy season. According to the definition by BMKG, we can conclude that the (real) onset of rainy season occurred at the beginning of January. From the result of this experiment, the predicted onset of rainy season for this particular year is on the 36th dasarian. Therefore, if this information had been available to the farmers, the drought disaster of 1997 could have been avoided.
Fig. 9. A logical explanation of the drought disaster in 1997
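The leave-one-out procedure used in the accuracy comparison above can be sketched as follows. This is a hedged illustration, not the study's code: a one-variable least-squares line stands in for the neural network models, the accuracy is taken to be the Pearson correlation between predicted and observed values (consistent with the correlation value reported in the text), and the data points and function names are invented for the example.

```python
import math
from typing import Callable, List, Sequence, Tuple


def fit_line(xs: Sequence[float], ys: Sequence[float]) -> Tuple[float, float]:
    """Least-squares slope and intercept (stand-in model, not BPNN/CNN/TD-CNN)."""
    mx = sum(xs) / len(xs)
    my = sum(ys) / len(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, my - slope * mx


def predict_line(model: Tuple[float, float], x: float) -> float:
    slope, intercept = model
    return slope * x + intercept


def loo_predictions(xs: List[float], ys: List[float],
                    fit: Callable, predict: Callable) -> List[float]:
    """Hold out each instance once, train on the rest, predict the held-out one."""
    preds = []
    for i in range(len(xs)):
        train_x = xs[:i] + xs[i + 1:]
        train_y = ys[:i] + ys[i + 1:]
        model = fit(train_x, train_y)
        preds.append(predict(model, xs[i]))
    return preds


def pearson(a: Sequence[float], b: Sequence[float]) -> float:
    """Pearson correlation between two equally long sequences."""
    ma = sum(a) / len(a)
    mb = sum(b) / len(b)
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    var_a = sum((x - ma) ** 2 for x in a)
    var_b = sum((y - mb) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)


# Invented toy data: one SOI-like predictor and an onset in dasarians.
soi = [-12.0, -8.0, -3.0, 0.0, 4.0, 7.0, 11.0]
onset = [36.0, 34.0, 34.0, 32.0, 31.0, 31.0, 29.0]
preds = loo_predictions(soi, onset, fit_line, predict_line)
print(round(pearson(preds, onset), 3))
```

Because every year is predicted by a model that never saw it, the resulting correlation uses all available years for testing, which matters when each new observation takes a year to collect.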

CONCLUSION
Based on the experiment, it can be concluded that the TD-CNN model has a higher accuracy, 0.74, than the other two existing standard models, BPNN and CNN, with accuracies of 0.71 and 0.72, respectively. This is due to the fact that in the TD-CNN model the response variable, which is the onset of rainy season, is also formulated using the SOI of September and October, which is closer to the onset, while in the other two models the response variable can only be formulated using SOI up to August. This research has two limitations. Firstly, the global parameters used as predictors do not represent all global conditions influencing climate in Indonesia, particularly on Java Island. Secondly, the parameters for modeling have not yet given optimum results. Therefore, future work includes increasing the number of predictor variables and performing a more effective selection of model parameters. A predictor variable that can be used is an index that represents the condition of the Indian Ocean, which also affects the climate in Indonesia (especially Java Island). To obtain more effective model parameters, we will use optimization techniques based on empirical data in estimating the model's parameters.

ACKNOWLEDGEMENT
The researchers thank the Directorate of Higher Education, Indonesia for financial support and the Centre for Climate Risk and Opportunity Management in Southeast Asia and Pacific (CCROM-SEAP), Indonesia for providing the rainfall data.