Spatial Durbin Model to Identify Influential Factors of Diarrhea

Problem statement: Regression modeling that accounts for the characteristics of a region is important, and spatial autoregressive models serve this purpose. One type of spatial autoregressive model is the Spatial Durbin Model (SDM), which includes spatial lags of both the dependent and the independent variables. This model was developed because dependencies in spatial relationships occur not only in the dependent variable but also in the independent variables. Modeling diarrhea and the factors that influence it is a case suited to this method. Approach: Spatial autocorrelation was identified and models were fitted to find the factors that influence diarrhea. The models were Ordinary Least Squares (OLS) and SDM, and the two were then compared. The study was located in Tuban Regency, East Java, Indonesia. Results: There was spatial autocorrelation in diarrhea and in the factors that influence it. Furthermore, the SDM performed better than the OLS model. The SDM results showed that the lags of the dependent and independent variables were significant. The significant independent variables were source of drinking water, health center and medical personnel, each at α = 5%. Conclusion: The SDM performs well in identifying influential factors of diarrhea that have a spatial component.


INTRODUCTION
A spatial method is a method for obtaining information from observations that are influenced by space or location effects. Spatial models often express dependency as a covariance structure through an autoregressive model (Wall, 2004). LeSage and Pace (2009) stated that the autoregressive process is indicated by the dependency relationship among a set of observations or locations. Anselin (1988) showed that one spatial autoregressive model is the Mixed Regressive-Autoregressive model, y = ρW1y + Xβ1 + ε, which captures the spatial lag effect on the dependent variable. The spatial relationship among observations is expressed by the weight matrix W1. Parameter ρ is the spatial lag parameter on the dependent variable and β1 is the vector of regression coefficients on the independent variables. The model is called Mixed Regressive-Autoregressive because it combines linear regression with a spatial lag regression on the dependent variable. It is also called the Spatial Autoregressive Model (SAR).
A variant of the SAR model adds the spatial lag of the independent variables, so that the model becomes y = ρW1y + β0 + Xβ1 + W1Xβ2 + ε, where β2 is the parameter vector of the lagged term W1X. This model is called the Spatial Durbin Model (SDM). It was developed because dependencies in spatial relationships occur not only in the dependent variable but also in the independent variables, which makes the spatial lag W1X necessary.
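The structure of the SDM can be sketched in a few lines of NumPy. The example below is a minimal illustration and not the authors' code: the four-region chain, the parameter values and the function names (row_standardize, sdm_design, simulate_sdm) are all hypothetical. It builds a row-standardized weight matrix, assembles the regressor block [1, X, WX] and draws y from the reduced form y = (I - ρW)^(-1)(β0 + Xβ1 + WXβ2 + ε).

```python
import numpy as np

def row_standardize(adjacency):
    """Scale each row of a binary contiguity matrix so that it sums to 1."""
    adjacency = np.asarray(adjacency, dtype=float)
    row_sums = adjacency.sum(axis=1, keepdims=True)
    # Regions without neighbours keep an all-zero row.
    return np.divide(adjacency, row_sums,
                     out=np.zeros_like(adjacency), where=row_sums > 0)

def sdm_design(X, W):
    """Return Z = [1, X, WX], the regressor block of the SDM."""
    X = np.asarray(X, dtype=float)
    return np.column_stack([np.ones(X.shape[0]), X, W @ X])

def simulate_sdm(X, W, rho, beta0, beta1, beta2, rng):
    """Draw y from y = (I - rho W)^(-1) (beta0 + X beta1 + W X beta2 + eps)."""
    n = X.shape[0]
    eps = rng.standard_normal(n)
    rhs = beta0 + X @ beta1 + (W @ X) @ beta2 + eps
    return np.linalg.solve(np.eye(n) - rho * W, rhs)

# Hypothetical chain of four regions: 1-2, 2-3 and 3-4 are neighbours.
A = np.array([[0, 1, 0, 0],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [0, 0, 1, 0]])
W = row_standardize(A)

rng = np.random.default_rng(0)
X = rng.uniform(0, 100, size=(4, 2))   # two made-up covariates (percentages)
Z = sdm_design(X, W)                   # columns: constant, X1, X2, WX1, WX2
y = simulate_sdm(X, W, rho=0.3, beta0=1.0,
                 beta1=np.array([0.5, -0.2]),
                 beta2=np.array([0.1, 0.0]), rng=rng)
```

Each row of W @ X is the average of the neighbours' covariate values, which is why a coefficient in β2 describes the association with conditions in adjacent regions.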
Researchers who have discussed the SDM include Kissling and Carl (2007), whose research concerned biological data in which spatial autocorrelation affects both dependent and independent variables. Brasington and Hite (2005) modeled the characteristics and location of houses and house prices; their results showed that neighborhood dependencies in the independent variables are significant. Spatial modeling has also been developed for health and environmental applications, such as Myaux et al. (1997) and Kazembe et al. (2009). Myaux et al. (1997) showed that analyzing health data in relation to space is very important in epidemiological research and in health planning for infectious diseases. That study aimed to examine the geographic distribution of acute watery diarrhea cases in a community and to assess whether the disease was more common in certain areas. Murad (2011) used GIS in health care planning in Jeddah City, where the application served as a spatial decision support system for health planners. Spatial methods are also applied in other fields such as agriculture, meteorology, forestry, poverty and econometrics. Elobaid et al. (2009) investigated the spatial correlation of the mean diameter of trees. In poverty research, Bekti and Sutikno (2011) used Geographically Weighted Regression (GWR) to model the relationship between community assets and poverty in East Java, Indonesia.
Diarrhea cases in a community are influenced by physical and environmental conditions, by socioeconomic and cultural factors and by where people live. The indicators used are the availability of sanitation and wastewater infrastructure and resident status. These indicators can be used to determine the factors that influence diarrhea.
In Tuban Regency, Indonesia, diarrhea remains one of the main health problems. According to 2007 Health Department data, diarrhea ranked second after acute respiratory infections, at 18.05%. The 2007 Susenas data show that the percentage of patients with diarrhea was 0.73%. Arumsari and Sutikno (2010) analyzed the incidence of diarrhea in Tuban with spatial models. The variables with a significant effect were the availability of drinking water facilities and the distance from the home to feces landfills (less than 10 m). The spatial model used was Geographically Weighted Poisson Regression (GWPR), which is a point-based approach. Spatial modeling therefore needs to be developed with an area-based approach that uses spatial effects on both the dependent and the independent variables. This research applies the SDM to identify the factors that affect the incidence of diarrhea in Tuban Regency.

MATERIALS AND METHODS
The data used in this study are from Susenas. The dependent variable is the percentage of the population with diarrheal disease registered in health centers in every district. Moran's I: Moran's I coefficient is used to test the spatial dependence or autocorrelation between observations or locations (Lee and Wong, 2001).
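Moran's I can be computed directly from a variable and a weight matrix. The sketch below uses the standard definition I = (n / S0) (z'Wz) / (z'z), where z holds deviations from the mean and S0 is the sum of all weights, together with the expected value under no autocorrelation, -1/(n-1). The function names and the four-region example are hypothetical and are not the study's data.

```python
import numpy as np

def morans_i(x, W):
    """Moran's I of variable x under spatial weight matrix W."""
    x = np.asarray(x, dtype=float)
    W = np.asarray(W, dtype=float)
    n = x.size
    z = x - x.mean()          # deviations from the mean
    s0 = W.sum()              # sum of all spatial weights
    return (n / s0) * (z @ W @ z) / (z @ z)

def expected_morans_i(n):
    """Expected value of Moran's I when there is no spatial autocorrelation."""
    return -1.0 / (n - 1)

# Hypothetical chain of four regions with values rising along the chain.
W_chain = np.array([[0, 1, 0, 0],
                    [1, 0, 1, 0],
                    [0, 1, 0, 1],
                    [0, 0, 1, 0]], dtype=float)
x_demo = np.array([1.0, 2.0, 3.0, 4.0])

i_demo = morans_i(x_demo, W_chain)
i_null = expected_morans_i(x_demo.size)
print(f"I = {i_demo:.4f}, expected under no autocorrelation = {i_null:.4f}")
```

A value of I above the expected value points toward clustering of similar values, and a value below it points toward a dispersed pattern.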

Independent variables (X):
Ownership of sanitation facilities, clean water and health facilities:
X1, source of drinking water: percentage of households that use drinking water from rainwater, rivers, unprotected springs and unprotected wells.
X2, distance of pumps/wells/springs to shelters of dirt/feces: percentage of households whose pumps, wells or springs are less than 10 m from shelters of dirt or feces.
X3, water facilities.
The general spatial model is:
y = ρW1y + Xβ + u (2)
u = λW2u + ε, ε ~ N(0, σ²I)
where y is the vector of the dependent variable (n×1), X is the matrix of independent variables (n×(k+1)), β is the vector of regression coefficients ((k+1)×1), ρ is the spatial lag coefficient on the dependent variable, λ is the spatial lag coefficient on the error u, ε is the error vector (n×1), W1 and W2 are weight matrices (n×n), I is the identity matrix (n×n), n is the number of observations or locations (i = 1,2,3,...,n) and k is the number of independent variables (k = 1,2,3,...,l).
If X = 0 and W2 = 0, Equation 2 becomes the first-order spatial autoregressive model y = ρW1y + ε. This model represents the variation in y as a linear combination of the values at neighboring locations, without independent variables. If W2 = 0 or λ = 0, Equation 2 becomes the Mixed Regressive-Autoregressive model or Spatial Autoregressive Model (SAR), y = ρW1y + Xβ + ε. This model assumes that the autoregressive process occurs only in the dependent variable.
The Spatial Durbin Model (SDM) is a variant of SAR that adds the spatial lag of the independent variables (Anselin, 1988). This model was developed because dependencies in spatial relationships occur not only in the dependent variable but also in the independent variables. The SDM is shown in Eq. 4:
y = ρW1y + β0 + Xβ1 + W1Xβ2 + ε (4)
where β2 is the vector of coefficients of the spatial lag of the independent variables.
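One common way to estimate an SDM is concentrated maximum likelihood, as described in the spatial econometrics literature (Anselin, 1988; LeSage and Pace, 2009). The source does not state which estimator or software was used, so the following is only a sketch under that assumption. It regresses y and W1y on Z = [1, X, W1X], searches a grid of ρ values for the maximum of the concentrated log-likelihood and recovers the remaining coefficients. The lattice layout, simulated data and function names are hypothetical.

```python
import numpy as np

def rook_lattice_weights(side):
    """Row-standardized rook-contiguity weights for a side x side toy grid."""
    n = side * side
    A = np.zeros((n, n))
    for r in range(side):
        for c in range(side):
            i = r * side + c
            if c + 1 < side:                      # neighbour to the right
                A[i, i + 1] = A[i + 1, i] = 1.0
            if r + 1 < side:                      # neighbour below
                A[i, i + side] = A[i + side, i] = 1.0
    return A / A.sum(axis=1, keepdims=True)

def fit_sdm(y, X, W, grid=None):
    """Grid-search concentrated ML for y = rho W y + b0 + X b1 + W X b2 + e."""
    y = np.asarray(y, dtype=float)
    X = np.asarray(X, dtype=float)
    n, k = X.shape
    if grid is None:
        grid = np.linspace(-0.99, 0.99, 199)      # candidate rho values
    Z = np.column_stack([np.ones(n), X, W @ X])
    Wy = W @ y
    b0 = np.linalg.lstsq(Z, y, rcond=None)[0]     # OLS of y on Z
    bL = np.linalg.lstsq(Z, Wy, rcond=None)[0]    # OLS of Wy on Z
    e0, eL = y - Z @ b0, Wy - Z @ bL
    eye = np.eye(n)

    def conc_loglik(rho):
        resid = e0 - rho * eL
        logdet = np.linalg.slogdet(eye - rho * W)[1]   # ln|I - rho W|
        return logdet - 0.5 * n * np.log(resid @ resid / n)

    rho_hat = float(max(grid, key=conc_loglik))
    coef = b0 - rho_hat * bL
    resid = y - rho_hat * Wy - Z @ coef
    return {"rho": rho_hat, "beta0": float(coef[0]),
            "beta1": coef[1:1 + k], "beta2": coef[1 + k:],
            "sse": float(resid @ resid)}

# Simulated data with known parameters (all values are made up).
rng = np.random.default_rng(42)
W = rook_lattice_weights(15)
n = W.shape[0]
X = rng.standard_normal((n, 2))
Z_true = np.column_stack([np.ones(n), X, W @ X])
delta_true = np.array([1.0, 0.5, -0.3, 0.4, 0.0])
y = np.linalg.solve(np.eye(n) - 0.6 * W,
                    Z_true @ delta_true + rng.standard_normal(n))

fit = fit_sdm(y, X, W)
print(fit["rho"], fit["beta1"], fit["beta2"], fit["sse"])
```

The grid search keeps the example short; a numerical optimizer over ρ would serve the same purpose. Standard errors and significance tests, which the study reports, are not covered by this sketch.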

RESULTS
In 2007, the population of Tuban Regency was 1,127,416 persons, with a population density of 613 persons per km2. The Health Department recorded that 2.82%, or 31,770 persons, suffered from diarrhea. Among the regencies of East Java, Tuban Regency ranked ninth in the incidence of diarrhea. This number declined from the previous year, when 2.84%, or 31,917 persons, suffered from diarrhea. Figure 1 shows the percentage of diarrhea by sub-district in Tuban. Sub-districts in the suburban area have a higher percentage of diarrhea than the others: Parengan (4.12%), Soko (4.07%), Rengel (3.79%), Plumpang (3.39%), Cross (3.70%) and Bancar (3.54%). Sub-districts with a low percentage of diarrhea are located in the central area: Montong (0.93%), Grabagan (1.15%) and Merakurak (1.60%). The distribution pattern shows clusters of sub-districts with the same diarrhea characteristics. For example, the high incidence of diarrhea is located in the suburban area.
The distributions of the other variables are presented in Fig. 2. The figure also shows clusters of sub-districts with the same characteristics. Sub-districts with a high percentage of households that have a cubluk or cemplung type of toilet, or no toilet, are in the northern area (Fig. 2d): Jatirogo, Bancar and Jenu, with 68.492-90.63%. Sub-districts in the middle area have a lower percentage than the others.

Moran's I:
Table 2 shows the results of the spatial autocorrelation test. The feces landfill variable (X6) is spatially autocorrelated among sub-districts at the 5% significance level. At the 10% level, source of drinking water (X1), distance of pumps/wells/springs to shelters of dirt/feces (X2), defecation facilities (X4) and type of toilet (X5) are spatially autocorrelated among sub-districts. This is shown by Z scores that exceed Z0.025 = 1.96 and Z0.05 = 1.64, respectively.
Most of the independent variables have a Moran's I greater than I0 = -0.053, which indicates positive autocorrelation, or a clustered data pattern: sub-districts in the same cluster have similar characteristics. The diarrhea incidence, as the dependent variable, has a Moran's I of 0.015, which is not significant at either α = 5% or α = 10%. Based on comparison with I0, the data pattern is dispersed: sub-districts have different characteristics of diarrhea. Other variables with a dispersed pattern were water facilities (X3), health center (X7) and medical personnel (X8).
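The Z score used in this test can be sketched as follows. The source does not say which variance formula was applied, so the code assumes the usual variance of Moran's I under the normality assumption, with S0, S1 and S2 computed from the weight matrix. Under E(I) = -1/(n-1), the reported I0 = -0.053 would correspond to n = 20 locations. The function name and the example data are hypothetical.

```python
import numpy as np
from math import sqrt

def morans_i_ztest(x, W):
    """Return (I, E[I], Z) for Moran's I, with the variance under normality."""
    x = np.asarray(x, dtype=float)
    W = np.asarray(W, dtype=float)
    n = x.size
    z = x - x.mean()
    s0 = W.sum()
    s1 = 0.5 * ((W + W.T) ** 2).sum()
    s2 = ((W.sum(axis=1) + W.sum(axis=0)) ** 2).sum()
    i_obs = (n / s0) * (z @ W @ z) / (z @ z)
    i_exp = -1.0 / (n - 1)
    var = ((n ** 2 * s1 - n * s2 + 3 * s0 ** 2)
           / ((n ** 2 - 1) * s0 ** 2)) - i_exp ** 2
    return i_obs, i_exp, (i_obs - i_exp) / sqrt(var)

# Hypothetical chain of four regions with values rising along the chain.
W_chain = np.array([[0, 1, 0, 0],
                    [1, 0, 1, 0],
                    [0, 1, 0, 1],
                    [0, 0, 1, 0]], dtype=float)
i_obs, i_exp, z_score = morans_i_ztest([1.0, 2.0, 3.0, 4.0], W_chain)

# Critical values quoted in the text: 1.96 (alpha = 5%) and 1.64 (alpha = 10%).
significant_5 = abs(z_score) > 1.96
significant_10 = abs(z_score) > 1.64
print(i_obs, i_exp, z_score, significant_5, significant_10)
```

In this toy example the Z score falls between the two critical values, so the variable would be judged autocorrelated at the 10% level but not at the 5% level, which mirrors how the variables are classified in the text.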

DISCUSSION
The clustered pattern of diarrhea distribution and the similar characteristics among nearby locations showed that spatial analysis needs to be done. Furthermore, Moran's I showed that there was spatial autocorrelation in some variables.
The OLS method performed poorly because the assumption of identically distributed residuals was not met: the residual variance was not homogeneous. This indicates that the residuals were clustered, so spatial modeling was necessary.
The SDM results show dependency in the lags of the dependent and independent variables, as indicated by the significant parameters ρ and β2. The significance of the lagged independent variables is shown by the spatially weighted independent variables that have a significant effect in the model: source of drinking water, health center and medical personnel, each significant at α = 5%. Other variables, significant at α = 20%, were the distance of pumps/wells/springs to shelters of dirt/feces and feces landfills. The coefficient of determination is 66.06% and the sum of squared errors is 5.9743.
The coefficient of the spatially weighted source of drinking water variable was 2.3123, a positive value. It indicates that a sub-district whose neighboring sub-districts have a high percentage of households using drinking water from rainwater, rivers, unprotected springs and unprotected wells will tend to have a high percentage of diarrheal disease. Conversely, a sub-district whose neighboring sub-districts have a low percentage of such households will tend to have a low percentage of diarrheal disease.
The comparison of OLS and SDM showed that the SDM performed better than OLS: it has a smaller sum of squared errors and more parameters with a significant effect in the model. Based on the analysis, it can be concluded that the lagged dependent and independent variables play a very important role in modeling diarrhea and the factors that influence it. Furthermore, given the relationship between the incidence of diarrhea and the ownership of sanitation, water and health facilities, similarities or differences in the characteristics of sub-districts may increase or decrease the incidence of diarrhea. For example, a sub-district with a high percentage of households using drinking water from unprotected springs and wells may be influenced by nearby sub-districts that have a low incidence of diarrhea. Such influence can be encouraged through the relevant programs that the government has implemented.
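The comparison by sum of squared errors can be illustrated with a short sketch. The helper names and the toy data are hypothetical, and the coefficient of determination is taken here as 1 - SSE/SST, which is an assumption because the source does not state how its 66.06% was computed.

```python
import numpy as np

def sse_ols(y, X):
    """Sum of squared errors of an OLS fit of y on [1, X]."""
    D = np.column_stack([np.ones(len(y)), X])
    b = np.linalg.lstsq(D, y, rcond=None)[0]
    r = y - D @ b
    return float(r @ r)

def sse_sdm(y, X, W, rho, delta):
    """SSE of an SDM with given rho and coefficients delta on [1, X, WX]."""
    Z = np.column_stack([np.ones(len(y)), X, W @ X])
    r = y - rho * (W @ y) - Z @ delta
    return float(r @ r)

def r_squared(y, sse):
    """Assumed definition: 1 - SSE / SST."""
    sst = float(((y - y.mean()) ** 2).sum())
    return 1.0 - sse / sst

# Hypothetical chain of six regions with row-standardized weights.
A = np.eye(6, k=1) + np.eye(6, k=-1)
W = A / A.sum(axis=1, keepdims=True)

rng = np.random.default_rng(1)
X = rng.uniform(0, 100, size=(6, 1))
rho_true = 0.3
delta_true = np.array([2.0, 0.05, 0.02])      # beta0, beta1, beta2 (made up)
Z = np.column_stack([np.ones(6), X, W @ X])
# Noise-free SDM data, so the true parameters reproduce y exactly.
y = np.linalg.solve(np.eye(6) - rho_true * W, Z @ delta_true)

sse_ols_demo = sse_ols(y, X)
sse_sdm_demo = sse_sdm(y, X, W, rho_true, delta_true)
print(sse_ols_demo, r_squared(y, sse_ols_demo))
print(sse_sdm_demo, r_squared(y, sse_sdm_demo))
```

Because the SDM contains the OLS model as the special case ρ = 0 and β2 = 0, a smaller in-sample SSE for the SDM is expected, so the number of significant parameters is a useful companion criterion, as used in the text.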

CONCLUSION
Diarrhea cases in Tuban Regency show a spatial effect, as seen from Moran's I and from the SDM of diarrhea incidence and the factors that influence it. The SDM results show that the lags of the dependent and independent variables are significant. The significant independent variables were source of drinking water, health center and medical personnel, each at α = 5%. Furthermore, the SDM performed better than OLS: it has a smaller sum of squared errors and more parameters with a significant effect in the model. In the SDM, the spatial lags of the dependent variable and of some independent variables are significant.

ACKNOWLEDGEMENT
Many thanks to PDPM-LPPM ITS, which supported the data.