MAPPING OF ILLITERACY AND INFORMATION AND COMMUNICATION TECHNOLOGY INDICATORS USING GEOGRAPHICALLY WEIGHTED REGRESSION

Geographically Weighted Regression (GWR) is a technique that extends the framework of a simple regression model into a weighted regression model. Each parameter in the model is estimated at each geographical location, and the significant parameters can be used for mapping. In this research, a GWR model is used to map the Information and Communication Technology (ICT) indicators that influence illiteracy. The procedure consisted of determining the optimum bandwidth, weighting with the bisquare kernel and estimating the parameters. The ICT indicators were then mapped by P-value. The data cover 29 regencies and 9 cities in East Java Province, Indonesia. The GWR model identified the variables that significantly affect illiteracy (α = 5%) in some locations: the percentage of household members with a mobile phone ($x_2$), the percentage of household members who have a computer ($x_3$) and the percentage of households who accessed the internet at school in the last month ($x_4$). Mobile phone ownership was significant (α = 5%) at 20 locations. Computer ownership and internet access were each significant at 3 locations. The coefficient of determination ($R^2$) across locations ranged from 73.05 to 92.75%. The factors affecting illiteracy varied widely between locations. Mapping by P-value (critical area) shows that mobile phone ownership had a significant effect in the southern part of East Java, whereas computer ownership and internet access had a significant effect on illiteracy in the northern area. All regression coefficients in these locations were negative, indicating that higher mobile phone ownership, computer ownership and internet access are associated with lower illiteracy.


INTRODUCTION
Spatial modeling has developed along several lines, including area-based and point-based approaches. Area models include the Spatial Autoregressive Model (SAR), the Spatial Error Model (SEM), the Spatial Durbin Model (SDM) and the Spatial Autoregressive Moving Average (SARMA) model. These models represent spatial dependence as a covariance structure through an autoregressive model (Anselin, 1988; LeSage and Pace, 2009; Zheng and Zhu, 2012). For space-time data there are also the Space Time Autoregressive (STAR) model (Giacinto et al., 2005) and the Conditional Autoregressive (CAR) model (Lekdee and Ingsrisawang, 2013). Bekti and Sutikno (2012) used the SDM to model diarrhea and the factors that influence it. Bekti and Sutikno (2010) modeled the relationship between community assets and HCI using the SLM and SEM approaches for areas in East Java, Indonesia. Anselin and Rey (2010) noted that spatial heterogeneity raises even greater methodological issues, because it suggests that any attempt to find universal principles that apply everywhere on the earth's surface is fundamentally problematic. Analysis should instead focus on estimating and interpreting the inevitable variation in parameters and on adopting a methodological position. Local or place-based analysis is more consistent with this position, for example Anselin's LISA (Anselin, 1995) and Geographically Weighted Regression (Fotheringham et al., 2002).

JMSS
Geographically Weighted Regression (GWR) is a technique that extends the framework of a simple regression model into a weighted regression model. Each parameter in the model is estimated at each geographical location, so that each location has its own regression parameters. GWR is therefore a point-based spatial modeling method. A related method is Geographically Weighted Poisson Regression (GWPR). Sarma et al. (2011) used GWR, Ordinary Least Squares (OLS) and a Geographic Information System (GIS) to evaluate long-term trends in agricultural productivity. An important catalyst for better integration of GIS and spatial data analysis for improved interpolation has been the development of local spatial statistical techniques. Spatial analysis and GIS are closely related. The spatial distribution of a variable can be mapped with GIS, as in Ibrahim et al. (2012), who mapped honeybee plants. GIS performs four basic functions on spatial data: input, storage, analysis and output. Sarma et al. (2011) also noted that GIS can be used for prediction and mapping.
Mapping can demonstrate and visualize the results of spatial analysis. Matthews and Yang (2012) used GWR to map the results of local statistics. They mapped both the parameter estimates and the t-values, because the spatial distribution of the parameter estimates must be presented together with the distribution of significance in order to yield a meaningful interpretation of the results. Cho et al. (2009) and Sarma et al. (2011) also made major improvements in visualizing GWR results, mapping extreme coefficients in housing research and residuals in the spatial distribution of precipitation and crop yields, respectively. Another method for illustrating spatial relationships is Spatial 'K'luster Analysis by Tree Edge Removal (SKATER). This method is based on cluster analysis, the results of which are then shown on a map, as in the research by Rachmawati and Bekti (2013).
This research applies GWR to model an education problem in East Java, Indonesia. Education is one of the primary needs for improving a community's quality of life and welfare. The illiteracy rate is one indicator of the level of education. The Central Bureau of Statistics Indonesia noted that in 2010, 7.09% of people aged over 15 years were illiterate, a decrease from 2009. In 2010, East Java Province had a higher percentage of illiterate people than the other provinces on Java and Sumatera Island: 11.66% of people aged over 15 years, 2.39% of people aged 15-44 years and 26.22% of people aged over 44 years.
The development of Information and Communication Technology (ICT) is one factor that influences illiteracy. It has an impact on education, especially on learning. The Central Bureau of Statistics Indonesia records several ICT indicators, such as ownership of a fixed-line telephone, ownership of a mobile phone, ownership of a computer and internet access. D'Silva et al. (2011) showed that ICT is an important mechanism for further boosting rural development in Malaysia. Astiwi (2011) showed that there are activities aimed at decreasing illiteracy in East Java, Indonesia, such as the community literacy movement. One such activity is improving library management to make libraries more attractive, for example by providing computers with literacy lesson software.
Neighboring and adjacent regencies and cities in East Java often have similar characteristics. For example, Madura Island has a high percentage of illiterate people. Bondowoso Regency, Situbondo Regency and Probolinggo Regency, which lie close to Madura Island, also have a high percentage of illiterate people. This suggests that spatial location is an influencing factor.
To capture the spatial relationship between illiteracy and ICT indicators in East Java Province, Indonesia, this research models these variables with GWR. The method estimates the parameters at each location. The research also maps the ICT indicators that are influential in each regency and city.

Geographically Weighted Regression (GWR)
GWR is a technique that extends the framework of a simple regression model into a weighted regression model (Fotheringham et al., 2002). It was introduced to the geography literature by Brunsdon et al. (1996). The model is a locally linear regression, based on the non-parametric technique of locally weighted regression developed in statistics for curve fitting and smoothing applications (Fischer and Getis, 2010). It yields parameter estimates that are local to each point, that is, to each location where data are collected. The dependent variable is predicted from the independent variables, whose regression coefficients depend on the location where the data are observed.
Each parameter is estimated at each geographical location, so each location has its own parameter estimates. This produces variation in the regression parameter values across the geographical area. If the parameter estimates are constant across locations, the GWR model reduces to a global model, meaning that every location has the same model.
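The general form of the GWR model is given in (1):

$$y_i = \beta_0(u_i, v_i) + \sum_{k=1}^{p} \beta_k(u_i, v_i)\, x_{ik} + \varepsilon_i \tag{1}$$

Where:
$y_i$ = Dependent variable at the i-th location (i = 1, 2, ..., n)
$x_{ik}$ = k-th independent variable at the i-th location (k = 1, 2, ..., p, with p the number of independent variables)
$(u_i, v_i)$ = Longitude and latitude coordinates of the i-th location
$\beta_k(u_i, v_i)$ = k-th parameter at the i-th location
$\varepsilon_i$ = Error, IIDN$(0, \sigma^2)$

For the local fit at the i-th location, the matrix form of (1) is Equation (2):

$$\mathbf{y} = \mathbf{X}\,\boldsymbol{\beta}(u_i, v_i) + \boldsymbol{\varepsilon} \tag{2}$$

The parameters at the i-th location are estimated from (2) by Weighted Least Squares, as shown in (3):

$$\hat{\boldsymbol{\beta}}(u_i, v_i) = \left(\mathbf{X}^{T}\mathbf{W}(i)\,\mathbf{X}\right)^{-1}\mathbf{X}^{T}\mathbf{W}(i)\,\mathbf{y} \tag{3}$$

where $\mathbf{X}$ is the matrix of independent variables, $\mathbf{y}$ is the vector of the dependent variable and $\mathbf{W}(i)$ is the weight matrix:

$$\mathbf{W}(i) = \operatorname{diag}\left(w_{i1}, w_{i2}, \ldots, w_{in}\right)$$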
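The weighted least squares estimator in (3) can be illustrated with a minimal sketch. To keep it free of external libraries, it assumes a single independent variable, for which $(X^T W X)^{-1} X^T W y$ has a simple closed form. The function name `local_wls` and all data values are hypothetical and are not taken from the study.

```python
def local_wls(x, y, weights):
    """Weighted least squares fit of y = b0 + b1 * x at one location.

    For a design matrix X = [1, x] and W = diag(weights), this closed form
    is equal to (X^T W X)^{-1} X^T W y.
    """
    total = sum(weights)
    if total <= 0:
        raise ValueError("at least one observation needs a positive weight")
    # Weighted means of the predictor and the response.
    x_bar = sum(w * xi for w, xi in zip(weights, x)) / total
    y_bar = sum(w * yi for w, yi in zip(weights, y)) / total
    # Weighted sum of squares and cross-products around the weighted means.
    sxx = sum(w * (xi - x_bar) ** 2 for w, xi in zip(weights, x))
    if sxx == 0:
        raise ValueError("weighted predictor has no variation")
    sxy = sum(w * (xi - x_bar) * (yi - y_bar)
              for w, xi, yi in zip(weights, x, y))
    b1 = sxy / sxx
    b0 = y_bar - b1 * x_bar
    return b0, b1


# Hypothetical data: predictor values, responses and kernel weights
# for the observations around a single location.
x_demo = [10.0, 20.0, 30.0, 40.0]
y_demo = [15.0, 10.0, 5.0, 30.0]
w_demo = [1.0, 0.8, 0.4, 0.0]  # the last observation has zero weight
b0_demo, b1_demo = local_wls(x_demo, y_demo, w_demo)
```

Because the last observation has zero weight, it does not influence the local fit. The other three points lie exactly on the line y = 20 - 0.5x, so the local estimates recover that intercept and slope up to rounding.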

Weighting
The weights express the neighboring relationship among locations in the model. They matter because they determine how much each observation contributes to the estimates at another location, so the weighting method must be chosen carefully. There are several weight functions (Fotheringham et al., 2002), such as the inverse distance function, the Gaussian kernel function and the bisquare kernel function. This research uses the bisquare kernel function in (4). It gives a weight of zero when location j is at or beyond the radius b from location i. If location j is within the radius b, its weight follows the bisquare function:

$$w_{ij} = \begin{cases} \left[1 - \left(d_{ij}/b\right)^{2}\right]^{2}, & d_{ij} < b \\ 0, & d_{ij} \ge b \end{cases} \tag{4}$$

where $d_{ij}$ is the distance between location i and location j.
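The bisquare weighting in (4) can be written as a short function. This is a minimal sketch: the function name `bisquare_weight` and the example distances are hypothetical, while the bandwidth of 1.2903 is the value reported for Sampang in the Results.

```python
def bisquare_weight(distance, bandwidth):
    """Bisquare kernel weight: zero at or beyond the bandwidth."""
    if bandwidth <= 0:
        raise ValueError("bandwidth must be positive")
    if distance >= bandwidth:
        return 0.0
    ratio = distance / bandwidth
    return (1.0 - ratio ** 2) ** 2


# Hypothetical distances (in degrees) from a centre location to four others.
example_distances = [0.0, 0.5, 1.0, 1.5]
example_bandwidth = 1.2903  # bandwidth reported for Sampang
example_weights = [bisquare_weight(d, example_bandwidth)
                   for d in example_distances]
```

The weight equals 1 at the centre location, falls smoothly as the distance grows and is exactly zero for the location at distance 1.5, which lies beyond the bandwidth.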

Bandwidth
The bandwidth is the distance parameter of the weighting function and measures how far the influence of one location extends to others. It is denoted b in Equation (4). Conceptually, the bandwidth defines a circle of radius b around the center location. It is the basis for determining the weight of each observation in the regression model at that location. Observations located close to location i are more influential in determining the model parameters at location i.

Data and Variable
This study uses data from 29 regencies and 9 cities in East Java Province, Indonesia (Fig. 1). The data come from the 2009 National Social and Economic Survey (Susenas) of the Central Bureau of Statistics Indonesia (BPS).
The independent variables are ICT indicators obtained from Susenas: the percentage of households with a fixed-line telephone ($x_1$), the percentage of household members with a mobile phone ($x_2$), the percentage of household members who have a computer ($x_3$) and the percentage of households who accessed the internet at school in the last month ($x_4$). The dependent variable is the illiteracy rate in each district (ABH).
The spatial pattern of ABH is presented in Fig. 2. ABH can be grouped into five major groups: under 7, 7-13, 13-18, 18-23 and over 23%. Eight of the 9 cities fall in group 1, namely Batu, Surabaya, Mojokerto, Kediri, Pasuruan, Madiun, Malang and Blitar. Probolinggo was the only city in the second group, with an ABH of 7.08%. Almost all regencies have a higher ABH than the cities. Only two regencies fall in group 1: Sidoarjo Regency (2.94%) and Gresik Regency (5.17%).

Analysis
The analysis and mapping proceeded in four steps: determining the optimum bandwidth by cross validation, computing the weights, estimating the GWR model and mapping the P-values of the GWR parameter estimates. The analysis used the spgwr and spdep packages in R (Bivand et al., 2008).
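The bandwidth step can be sketched as a leave-one-out cross validation. This is an illustration under stated assumptions rather than the spgwr procedure used in the study: it searches one fixed bandwidth over a candidate list, uses a single independent variable and Euclidean distances between coordinates, and all function names and data values are hypothetical.

```python
import math


def bisquare(distance, bandwidth):
    """Bisquare kernel weight, zero at or beyond the bandwidth."""
    if distance >= bandwidth:
        return 0.0
    return (1.0 - (distance / bandwidth) ** 2) ** 2


def weighted_line(x, y, w):
    """Weighted least squares fit of y = a + c * x; None if not estimable."""
    total = sum(w)
    if total <= 0:
        return None
    x_bar = sum(wi * xi for wi, xi in zip(w, x)) / total
    y_bar = sum(wi * yi for wi, yi in zip(w, y)) / total
    sxx = sum(wi * (xi - x_bar) ** 2 for wi, xi in zip(w, x))
    if sxx <= 1e-12:
        return None
    sxy = sum(wi * (xi - x_bar) * (yi - y_bar)
              for wi, xi, yi in zip(w, x, y))
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope


def cv_score(coords, x, y, bandwidth):
    """Sum of squared leave-one-out prediction errors for one bandwidth."""
    score = 0.0
    for i in range(len(y)):
        # Observation i gets zero weight, so it is predicted from the others.
        w = [0.0 if j == i else bisquare(math.dist(coords[i], coords[j]), bandwidth)
             for j in range(len(y))]
        fit = weighted_line(x, y, w)
        if fit is None:
            return math.inf  # too few neighbours to fit a local model
        intercept, slope = fit
        score += (y[i] - (intercept + slope * x[i])) ** 2
    return score


def select_bandwidth(coords, x, y, candidates):
    """Return the candidate bandwidth with the smallest CV score."""
    return min(candidates, key=lambda b: cv_score(coords, x, y, b))


# Hypothetical toy data: six locations on a line, exact relation y = 20 - 0.5x.
toy_coords = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (3.0, 0.0), (4.0, 0.0), (5.0, 0.0)]
toy_x = [10.0, 30.0, 20.0, 50.0, 40.0, 60.0]
toy_y = [20.0 - 0.5 * xi for xi in toy_x]
best_bandwidth = select_bandwidth(toy_coords, toy_x, toy_y, [0.5, 1.5, 3.0])
```

In this toy example the two smaller candidates leave at least one location with too few neighbours to fit a local line, so the largest candidate is selected. With real data the CV score typically trades off bias from large bandwidths against variance from small ones.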

RESULTS
The first step was to determine the optimum bandwidth. Table 1 shows the optimum bandwidth for the 29 regencies and 9 cities. The bandwidth determines the weight that each location receives relative to the location used as the center. For example, Sampang Regency, which has the highest ABH, has a bandwidth of 1.2903. This means that the area within a radius of 1.2903° of Sampang is considered to influence the model at Sampang. Locations close to the center receive a large weight. The largest bandwidth was 2.1924 at Banyuwangi and the smallest was 0.7976 at Kediri. The next step was to compute the weights of the surrounding area for every location. Continuing the example with Sampang as the center, locations within the bandwidth of 1.2903° are assigned weights that follow the bisquare kernel function, and locations outside that radius are given a weight of zero.
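Based on formula (4), the weights for Sampang are:

$$w_{j}(\text{Sampang}) = \begin{cases} \left[1 - \left(d_{j}/1.2903\right)^{2}\right]^{2}, & d_{j} < 1.2903 \\ 0, & d_{j} \ge 1.2903 \end{cases}$$

where $d_j$ is the distance from Sampang to location j. The resulting weights with Sampang as the center are presented in Table 2.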
The GWR parameter estimates are shown in Table 3. The factors affecting ABH varied widely between locations.

Table 3. Parameter estimation in geographically weighted regression: $y_i = b_0(u_i, v_i) + b_1(u_i, v_i)\,x_{i1} + b_2(u_i, v_i)\,x_{i2} + b_3(u_i, v_i)\,x_{i3} + b_4(u_i, v_i)\,x_{i4}$
(*) significant at α = 5%, with t(α/2; 24.505) = 2.0638; (**) significant at α = 10%, with t(α/2; 24.505) = 1.7108. In the location column, (*) marks a city; unmarked locations are regencies.

In general, the variables that significantly affect ABH (α = 5%) in some locations are the percentage of household members with a mobile phone ($x_2$), the percentage of household members who have a computer ($x_3$) and the percentage of households who accessed the internet at school in the last month ($x_4$). The critical value used was t(α/2; 24.505) = 2.0638 with α = 0.05.
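The significance marking used with Table 3 can be sketched as a comparison of each local t-value with the two critical values quoted above. The function name `significance_label` and the example estimates and standard errors are hypothetical; only the critical values 2.0638 and 1.7108 come from the text.

```python
CRITICAL_5_PERCENT = 2.0638   # t(alpha/2; 24.505) at alpha = 5%, from the text
CRITICAL_10_PERCENT = 1.7108  # t(alpha/2; 24.505) at alpha = 10%, from the text


def significance_label(estimate, std_error):
    """Return '*' (5%), '**' (10%) or '' for a local parameter estimate."""
    if std_error <= 0:
        raise ValueError("standard error must be positive")
    t_value = estimate / std_error
    if abs(t_value) > CRITICAL_5_PERCENT:
        return "*"
    if abs(t_value) > CRITICAL_10_PERCENT:
        return "**"
    return ""


# Hypothetical local estimates (coefficient, standard error) at three locations.
example_estimates = [(-0.50, 0.20), (-0.36, 0.20), (0.10, 0.20)]
example_labels = [significance_label(b, se) for b, se in example_estimates]
```

The three hypothetical estimates have t-values of -2.5, -1.8 and 0.5, so they are marked as significant at 5%, significant at 10% and not significant, respectively.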

The percentage of household members with a mobile phone ($x_2$) was significant at α = 5% at 20 locations. The percentage of household members who have a computer ($x_3$) and the percentage of households who accessed the internet at school in the last month ($x_4$) were each significant at 3 locations. The coefficient of determination ($R^2$) indicates how much of the variance in ABH is explained by the ICT indicators. In the GWR estimation, $R^2$ ranged from 73.05 to 92.75% across locations.

DISCUSSION
The GWR model showed that ICT indicators such as ownership of a mobile phone, ownership of a computer and internet access significantly influence illiteracy. This section discusses the mapping of the ICT indicators that are influential in each regency and city, illustrated in Fig. 3. Significance is shown by the P-value, represented by different point colors at each location, following Matthews and Yang (2012). Figure 3a shows the P-value for the percentage of households with a fixed-line telephone. As noted in the Results, this variable does not significantly affect ABH at α of 5 or 10%. This is evident from the map, where no area has a P-value below 0.05. However, the variable is still significant at α less than 31% in 3 regencies on Madura Island (Sampang, Pamekasan and Sumenep) and in Lumajang and Lamongan Regencies.
For mobile phone ownership (Fig. 3b), the southern part of East Java Province has P-values below 0.05, so this factor significantly affects ABH at these locations. All GWR regression coefficients in these locations are negative, indicating that higher mobile phone ownership is associated with lower ABH. In contrast, locations in the north have P-values above 0.05, which means that ABH in these areas is not significantly affected at α = 0.05.
Computer ownership (Fig. 3c) generally has a significant effect on ABH in the northern part of East Java, the reverse of the pattern for mobile phones in the south. The northern area has P-values below 0.1. All GWR regression coefficients in these locations are negative, indicating that higher computer ownership is associated with lower ABH. The same holds for internet access (Fig. 3d).

The GWR model showed that there is a spatial effect in illiteracy. This agrees with Firmansyah and Sutikno (2011), who used the Spatial Autoregressive (SAR) model and the Spatial Error Model (SEM) and also found a spatial effect in an illiteracy model with population and education as independent variables. Lailiyah and Purhadi (2012) used Geographically Weighted Ordinal Logistic Regression (GWOLR). Computer facilities also support efforts to decrease illiteracy (Astiwi, 2011). In the GWR model, the percentage of household members who have a computer is significant at some locations.

CONCLUSION
The Geographically Weighted Regression (GWR) model showed that the ICT indicators affecting ABH varied widely between locations. In general, at α = 5% and α = 10%, these were the percentage of household members with a mobile phone ($x_2$), the percentage of household members who have a computer ($x_3$) and the percentage of households who accessed the internet at school in the past month ($x_4$). The significant parameters in GWR can be used to map the relationship between ICT indicators and illiteracy in East Java Province. Significance is shown by the P-value or critical area.
Ownership of a fixed-line (home) telephone does not significantly affect ABH at α of 5% or 10%. However, this variable is still significant at α less than 31% in some regencies in the north of East Java. Mobile phone ownership has a significant effect in the southern part of East Java, whereas computer ownership and internet access have a significant effect on ABH in the northern area. All GWR regression coefficients in these locations are negative, indicating that higher mobile phone ownership, computer ownership and internet access are associated with lower ABH.