Investigating Commuting Time in a Metropolitan Statistical Area Using Spatial Autocorrelation Analysis

Corresponding Author: S. Hessam Miri, Master of Urban Design, University of Guilan, Rasht, Iran. Email: hessam.miri@gmail.com
Abstract: Commuting is unavoidable because, for most people, living and working are spatially separated activities. Commuting most strongly affects land use and transportation systems and, through them, society at large. Research on urban commuting is therefore a useful way to lessen the impact and intensity of land use and transportation problems. Because urban spatial structure affects commuting patterns, this study examines the spatial distribution of mean commuting time at the block group level in the Charlotte-Concord-Gastonia Metropolitan Statistical Area (MSA) using spatial autocorrelation analysis. The results show that areas of recent housing boom have longer commuting times and that commuting time decreases as the age of an area increases. The Moran's I values for the Rook and Queen methods are nearly identical, at 0.45939 and 0.45265, respectively. The positive value of Moran's I (p-value < 0.05) shows that block groups with longer average commutes are adjacent to other block groups with longer average commutes, and block groups with shorter commutes are adjacent to those with shorter commutes. Furthermore, low commuting times cluster in the central parts of the cities, where houses are older, and high commuting times cluster in the suburbs, where houses are newer.


Introduction
People commute because their residences and workplaces are in different locations. Because the number of people who work at home is fairly small, commuting is an essential part of life in modern urbanized areas (Shirzadi Babakan and Alimohammadi, 2016; Yang, 2005). Commuting is defined as the journey from home to work and back; it is a basic activity, and people may plan other activities and meet other needs along the way (Horner, 2004; Yang, 2005). Urban areas are expanding (Karimi et al., 2019a) and transportation infrastructure is developing continuously (Giuliano and Hanson, 2017). Urban growth and transportation development (Karimi et al., 2019b) increase travel demand across metropolitan regions, and people commute over longer distances to meet their needs; as a consequence, commuting patterns are changing continuously (Horner, 2004). Commuting most strongly affects land use and transportation systems and, through them, society (Shirzadi et al., 2013). Understanding these impacts makes land use and transportation systems easier to manage (Acker and Witlox, 2011; Azari and Shirzadi Babakan, 2016; Horner, 2004). Research on urban commuting is therefore a useful approach to mitigating these land use and transportation problems (Shirzadi Babakan et al., 2015).
Urban spatial structure, that is, where jobs and housing are located, affects commuting patterns; hence commuting patterns can be identified by examining urban spatial structure (Horner, 2004; Sohn, 2005; Yang, 2005). Urban growth is one of the impacts of the predominance of work trips undertaken by car (Karimi et al., 2019a). In the United States, increased automobile ownership has encouraged suburbanization, and the periphery of urban areas undergoes new housing development on previously undeveloped land (Karimi et al., 2019b), which leads to longer-distance commutes to work (Horner, 2004). Decentralized urban growth and its impacts on the transportation system have therefore prompted researchers to study the relationship between commuting and urban spatial structure widely (García-Palomares, 2009). Some studies considered commuting time, some considered commuting distance and some considered both. It is worth noting that a shorter commuting distance does not guarantee a shorter commuting time, because of congestion (Acker and Witlox, 2011). Yang (2005) examined the increase in commuting time and distance in relation to concurrent decentralized development in two sizable but contrasting cities, Atlanta and Boston. He found that spatial decentralization in Boston resulted in shorter commuting times and distances than in the much more sprawling Atlanta. Sultana and Weber (2007) compared the commuting length of workers in sprawl areas with that of workers in higher-density areas; the results showed longer commuting times and distances from sprawl areas to urban areas. That study also found that sprawl is not the only determinant of commuting length and that workers' socioeconomic characteristics are important in such investigations.
Sandow and Westin (2010) analyzed commuting in a relatively sparsely populated, peripheral area in northern Sweden. The results showed that geographical structure, available infrastructure and socio-economic factors restrict people's (especially women's) commuting behavior in sparsely populated areas. Furthermore, the study showed that when commuting times exceed 45 min, the tendency to commute declines rapidly regardless of gender, transport mode and socio-economic factors. Sultana and Weber (2014) analyzed the mean commuting time of housing areas within the 50 largest US metropolitan areas for 1980, 1990 and 2000. They found that new neighborhoods in most metropolitan areas show higher commuting times than old neighborhoods. In a recent study, Hu and Wang (2016) analyzed the temporal trends of commuting patterns in both time and distance based on the 1990-2010 Census Transportation Planning Package data of Baton Rouge, Louisiana. This research presented urban land use as a good predictor of commuting patterns over time.
Spatial autocorrelation statistics have been broadly used to measure the correlation among neighboring observations and to assess the level of spatial clustering among neighboring regions. Moran's Index (Moran's I), in particular, has been used to study crash frequencies in Mashhad, Iran (Matkan et al., 2013), the spatial pattern of heavy metals in Beijing agricultural soils (Huo et al., 2011), the heterogeneity of the cardiovascular drug prescribing pattern in Taiwan (Cheng et al., 2011), racial differences in the built environment-body mass index relationship in Boston (Duncan et al., 2012), spatial clustering and hotspot detection of HIV/AIDS prevalence (Jeefoo, 2016), spatiotemporal clustering of road accidents (Prasannakumara et al., 2011) and spatiotemporal clustering of malaria in Hubei Province, China from 2004 to 2011 (Xia et al., 2015).
The purpose of this study is to conduct a spatial cluster analysis and measure spatial autocorrelation in order to discover clustering patterns in mean commuting time among block groups in the Charlotte-Concord-Gastonia Metropolitan Statistical Area (MSA) in 2015. Global Moran's I and Local Indicators of Spatial Association (LISA) are used to identify spatial clusters of mean commuting time. Furthermore, the relationship between mean commuting time and the age of the houses in each block group is analyzed, based on the idea that housing characteristics are the source of commuting patterns (Sultana and Weber, 2014).

Study Area and Data
The Charlotte-Concord-Gastonia MSA comprises seven counties in North Carolina and three counties in South Carolina within and surrounding the city of Charlotte. Figure 1 shows the geographic location of the Charlotte-Concord-Gastonia MSA. The population of the MSA was 2,379,177 according to 2015 Census estimates (DataUSA; United States Census Bureau). The major city in this MSA is Charlotte, which is the 2nd largest city in the Southeast and the 17th largest city in the United States; its metro area is the 22nd largest in the country. Between 2004 and 2014, Charlotte was ranked as the country's fastest-growing metro area, with 888,000 new residents (DataUSA; United States Census Bureau). This rapid growth is the motivation for examining commuting time in this MSA. In this paper, 2015 US census data for the MSA are examined to explore the spatial autocorrelation of mean commuting time. The socioeconomic data, including the median age of houses and aggregate commuting time, come from IPUMS, and the level of analysis is the block group, as block groups are the smallest zones for which both housing and commuting data are available. The shapefile of block groups was downloaded from the Census Bureau (2018), and the data were then joined to it through the GeoId field using ArcGIS software.
Global spatial autocorrelation is measured with Moran's I. In the presence of uneven spatial clustering, however, the LISA is utilized (Anselin, 1995). It measures the contribution of individual spatial units to the global Moran's I statistic (Anselin, 1995). The first step in the analysis of spatial autocorrelation is to construct a spatial weights matrix. In this study, the Rook's and Queen's weight methods are used. The study also generates Moran scatter plots to demonstrate the spatial distribution of the mean commuting time of block groups across the study area.

Spatial Weight Matrix
The spatial weight matrix contains information on the neighborhood structure of each location. It is based on either distance or contiguity between spatial units (O'Sullivan and Unwin, 2010). Each element (i, j) of the matrix W quantifies the spatial dependency between areal units i and j. In adjacency approaches, values are assigned according to whether two units share an edge (the Rook's case) or share either an edge or a corner vertex (the Queen's case). An element equals 1 if the two areal units are neighbors under the chosen rule and 0 otherwise (O'Sullivan and Unwin, 2010). Figure 2 shows the Rook's case and the Queen's case adjacency. In distance approaches, each matrix element may be the inverse distance between units i and j. The diagonal of the spatial weight matrix is zero, meaning that a unit cannot be a neighbor of itself. In the adjacency approach, the matrix is symmetric and binary, and its dimension equals the number of areal units in the study area.
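The difference between the two adjacency rules is easiest to see on a regular grid. The following is a minimal sketch, not the procedure used in this study (which builds weights from block group polygons in GeoDa); the function name `grid_contiguity` and the 3 x 3 grid are assumptions made for illustration.

```python
def grid_contiguity(n_rows, n_cols, queen=False):
    """Binary contiguity matrix W for a regular n_rows x n_cols grid of cells.

    Hypothetical helper for illustration: cells are numbered row by row.
    """
    n = n_rows * n_cols
    w = [[0] * n for _ in range(n)]
    # Rook: the four edge-sharing moves. Queen: edges plus the four corners.
    moves = [(-1, 0), (1, 0), (0, -1), (0, 1)]
    if queen:
        moves += [(-1, -1), (-1, 1), (1, -1), (1, 1)]
    for r in range(n_rows):
        for c in range(n_cols):
            i = r * n_cols + c
            for dr, dc in moves:
                rr, cc = r + dr, c + dc
                if 0 <= rr < n_rows and 0 <= cc < n_cols:
                    w[i][rr * n_cols + cc] = 1  # the diagonal is never set
    return w


rook = grid_contiguity(3, 3)
queen = grid_contiguity(3, 3, queen=True)
center = 4  # middle cell of the 3 x 3 grid
print("rook neighbors of center:", sum(rook[center]))    # 4
print("queen neighbors of center:", sum(queen[center]))  # 8
```

On irregular polygons such as block groups, vertex-only contacts are much rarer than on a grid, so the two rules usually produce similar neighbor sets.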

Global Moran's I Spatial Autocorrelation
Spatial autocorrelation occurs when the spatial distribution of the variable of interest exhibits a clustering or dispersion pattern (Huo et al., 2011). Spatial autocorrelation measures spatial clustering based on feature locations and attribute values. The most widely used autocorrelation measure is Moran's I, which is a spatial translation of a non-spatial correlation and is applied to numerical variables of areal units (O'Sullivan and Unwin, 2010):

$$I = \frac{n}{\sum_{i=1}^{n}\sum_{j=1}^{n} w_{ij}} \cdot \frac{\sum_{i=1}^{n}\sum_{j=1}^{n} w_{ij}\,(y_i - \bar{y})(y_j - \bar{y})}{\sum_{i=1}^{n} (y_i - \bar{y})^2}$$

Where:
$w_{ij}$ = The element of the spatial weights matrix, which indicates the spatial relationship between location i and location j
$y_i$ and $y_j$ = The observations of the variable at locations i and j
$\bar{y}$ = The mean of the variable over the entire study area
$n$ = The number of areal units

The result of Moran's I analysis is a number that ranges from -1 to +1 (Huo et al., 2011). A positive Moran's I indicates the presence and degree of spatial autocorrelation and occurs when a unit is surrounded by neighbors with similar values of the variable of interest; +1 means perfect spatial correlation. Negative spatial autocorrelation occurs when a unit is surrounded by neighbors with dissimilar values of the variable of interest, and -1 means perfect spatial dispersion. A value of 0 indicates a random spatial pattern and no autocorrelation across the study area (Huo et al., 2011; O'Sullivan and Unwin, 2010). The major limitation of Moran's I as a global statistic is that it is based on simultaneous measurements from many locations: it provides only a broad measure of spatial association, ignores the specific details of each location and cannot identify which local spatial clusters contribute the most to the global statistic (Holt, 2007). Finally, the result of Moran's I depends on whether the matrix is based on adjacency or distance. However, a pattern of decreasing spatial autocorrelation with increasing orders of contiguity (distance decay) is commonly observed in most spatial processes regardless of the matrix definition (Oort and Frank, 2004).
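To make the behavior of the statistic concrete, the sketch below computes the standard form of global Moran's I for a toy example. The function name `morans_i`, the four-unit chain and the values are assumptions made for illustration, not data from this study.

```python
def morans_i(values, w):
    """Global Moran's I in its standard form:
    (n / S0) * sum_ij w_ij * d_i * d_j / sum_i d_i^2,
    where d_i is the deviation from the mean and S0 is the sum of all weights.
    Assumes the values are not all identical and W has at least one neighbor pair.
    """
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]
    s0 = sum(sum(row) for row in w)
    cross = sum(w[i][j] * dev[i] * dev[j] for i in range(n) for j in range(n))
    return (n / s0) * cross / sum(d * d for d in dev)


# Four units in a row; each unit neighbors the units directly beside it.
chain = [[1 if abs(i - j) == 1 else 0 for j in range(4)] for i in range(4)]

clustered = [1, 1, 5, 5]    # similar values sit side by side
alternating = [1, 5, 1, 5]  # every neighbor is dissimilar

print(morans_i(clustered, chain))    # positive, about 0.33
print(morans_i(alternating, chain))  # negative, about -1.0
```

The clustered arrangement yields a positive value and the alternating arrangement a negative one, matching the interpretation of the sign given above.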

Local Spatial Autocorrelation
To overcome this limitation of the global Moran's I, local statistics, commonly referred to as Local Indicators of Spatial Association (LISA), have been developed and are used together with graphical visualization of spatial clustering such as the Moran scatterplot. LISA (Anselin, 1995) decomposes the study area into small units, thus enabling the assessment of significant local spatial clustering around an individual unit. LISA can identify the degree of spatial clustering, the detailed variation of clustering in locally defined geo-space and the locations of the spatial clusters. The local version of Moran's I at unit i is given by Cheng et al. (2011). In the LISA analysis, if the test statistic is not significant at any sensible level, no spatial pattern is present in the area and the observations are treated as spatially random. When it is significant, one of four patterns is exhibited (Anselin, 1995):
• When $y_i$ is higher than the average of the entire study area ($\bar{y}$) and so are its neighbors, a High-High (HH) association, also known as a hot spot, is indicated
• When both $y_i$ and its neighbors are lower than the average, the spatial tendency is Low-Low (LL), or a cold spot
• When $y_i$ is higher than the average of the entire study area ($\bar{y}$) but its neighbors are not, a High-Low (HL) association is exhibited
• When $y_i$ is lower than the average of the entire study area ($\bar{y}$) and its neighbors are higher than the average, a Low-High (LH) association is indicated
To produce a visually meaningful map of spatial autocorrelation, the local Moran's I is represented by clusters, in which locations with significant spatial correlation are highlighted to identify the patterns of association; a p-value < 0.05 was considered statistically significant.
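The four association types can be illustrated with a short sketch. It assumes one common form of the local statistic, $I_i = \frac{(y_i - \bar{y})}{m_2} \sum_j w_{ij} (y_j - \bar{y})$ with row-standardized weights and $m_2 = \sum_i (y_i - \bar{y})^2 / n$; the exact normalization in Cheng et al. (2011) may differ. The function name `local_moran` and the toy data are hypothetical, and significance testing is omitted, so the labels below are unfiltered quadrant labels rather than significant clusters.

```python
def local_moran(values, w):
    """Return (local I, quadrant label) for every unit.

    Assumes values are not all identical. A deviation of exactly zero is
    treated as 'high', which is an arbitrary tie-breaking choice.
    """
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]
    m2 = sum(d * d for d in dev) / n
    results = []
    for i in range(n):
        row_sum = sum(w[i])
        # Row-standardized spatial lag: mean deviation of the neighbors.
        lag = sum(w[i][j] * dev[j] for j in range(n)) / row_sum if row_sum else 0.0
        if dev[i] >= 0:
            label = "HH" if lag >= 0 else "HL"
        else:
            label = "LL" if lag < 0 else "LH"
        results.append((dev[i] * lag / m2, label))
    return results


# Five units in a row; toy commuting times in minutes (made-up numbers).
chain = [[1 if abs(i - j) == 1 else 0 for j in range(5)] for i in range(5)]
minutes = [1, 2, 1, 9, 10]

for unit, (i_local, label) in enumerate(local_moran(minutes, chain)):
    print(unit, label, round(i_local, 3))
```

In this toy example the third unit is a low value beside a high neighbor, so it receives the LH label and a negative local statistic, while the units at either end fall into LL and HH.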

Scatter Plot
The Moran scatterplot provides a more disaggregated view of the nature of the spatial autocorrelation. It provides information not only on the presence of clusters in the data but also on the outliers it contains. The Moran scatterplot shows the z-value of the variable of interest on the horizontal axis and the spatial lag, a weighted average of the z-values of that variable in the neighboring locations, on the vertical axis. The slope of the regression line in the scatterplot is equal to the Moran's I value (Anselin et al., 2006). The scatterplot is divided into four quadrants, each of which represents a different type of spatial association (Anselin, 2005; O'Sullivan and Unwin, 2010):
• The upper right quadrant represents spatial clustering of a district with a high value of the variable of interest around neighbors that also have a high value of that variable. This quadrant is also called the High-High zone (HH) since the z-value and the spatial lag both have high values. In general, these are locations in which the local Moran's I value is positive
• The upper left quadrant represents a district with a low value of the variable of interest around neighbors that have a high value of that variable. This quadrant is also called the Low-High zone (LH) since the z-value is low while the spatial lag is high, indicating a low outlier among neighbors with high values. In general, these are locations in which the local Moran's I value is negative
• The lower left quadrant represents spatial clustering of a district with a low value of the variable of interest around neighbors that also have a low value of that variable. This quadrant is also called the Low-Low zone (LL) since the z-value and the spatial lag both have low values. In general, these are locations in which the local Moran's I value is positive
• The lower right quadrant represents a district with a high value of the variable of interest around neighbors that have a low value of that variable. This quadrant is also called the High-Low zone (HL) since the z-value is high while the spatial lag is low, indicating a high outlier among neighbors with low values. In general, these are locations in which the local Moran's I value is negative
In short, the High-High and Low-Low locations suggest clustering of similar values of one variable, whereas the High-Low and Low-High locations indicate spatial outliers of the same variable.
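The statement that the slope of the scatterplot's regression line equals Moran's I can be checked numerically. The sketch below assumes row-standardized weights and no units without neighbors, which is the setting in which the equality holds exactly; the function names and the toy data are hypothetical.

```python
def row_standardize(w):
    """Scale every row of W so that it sums to 1 (rows of zeros stay zero)."""
    out = []
    for row in w:
        total = sum(row)
        out.append([x / total if total else 0.0 for x in row])
    return out


def morans_i(values, w):
    """Global Moran's I in its standard form."""
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]
    s0 = sum(sum(row) for row in w)
    cross = sum(w[i][j] * dev[i] * dev[j] for i in range(n) for j in range(n))
    return (n / s0) * cross / sum(d * d for d in dev)


def scatterplot_slope(values, w):
    """Least-squares slope of the spatial lag (vertical) on the z-value (horizontal)."""
    n = len(values)
    mean = sum(values) / n
    sd = (sum((v - mean) ** 2 for v in values) / n) ** 0.5
    z = [(v - mean) / sd for v in values]
    ws = row_standardize(w)
    lag = [sum(ws[i][j] * z[j] for j in range(n)) for i in range(n)]
    lag_mean = sum(lag) / n
    # z has mean zero, so the usual slope formula needs no centering of z.
    return sum(z[i] * (lag[i] - lag_mean) for i in range(n)) / sum(x * x for x in z)


chain = [[1 if abs(i - j) == 1 else 0 for j in range(5)] for i in range(5)]
minutes = [1, 2, 1, 9, 10]  # made-up values

slope = scatterplot_slope(minutes, chain)
moran = morans_i(minutes, row_standardize(chain))
print(slope, moran)  # the two numbers agree up to rounding error
```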

Explanation of the Software
GeoDa, a free software program, is designed as an introduction to spatial data analysis and supports the visualization, exploration and explanation of interesting patterns in geographic data (Anselin et al., 2006). In terms of the range of spatial statistical techniques included, GeoDa is similar to the open-source R environment. The capabilities of GeoDa can be classified into six categories (Anselin et al., 2006):
• Spatial data input, output and conversion
• Data transformation and creation of new data
• Choropleth maps, cartogram and map animation
• Statistical graphics
• Spatial autocorrelation including global and local spatial autocorrelation statistics
• Linear spatial regression models
These practical functions of GeoDa helped to analyze the data and interpret the results. The software is user-friendly and easy to learn (Anselin, 2005). GeoDa is used to analyze the autocorrelation of mean commuting time in the Charlotte-Concord-Gastonia MSA. Running the univariate local Moran's I function produces the Moran's I value, the scatter plot, the LISA cluster map and the significance map simultaneously. In addition, natural breaks mapping is used to show the classification of block groups based on the median year of housing construction.

Results
To identify regularities in the spatial distribution and the extent to which neighboring mean commuting time values are correlated, spatial autocorrelation analysis is applied using the calculations described above. In the first step, urban spatial structure, which affects commuting patterns, is analyzed using the median age of housing in block groups.
The median year of housing construction is used to determine the date at which the majority of housing in a census block group was developed. Each block group is then assigned to a decade representing a particular housing boom. If the date is before 1950, it is classified as the 1950s. Figure 3 shows the classification of block groups of the Charlotte-Concord-Gastonia MSA based on the median year of housing construction. Figure 4 presents the mean commuting time by the age of the houses in block groups for the year 2015. To obtain the mean commuting time, aggregate commuting time was divided by the number of workers who work outside the home. As Fig. 4 illustrates, the areas of recent housing boom have longer commuting times, and commuting time decreases as the age of an area increases. Figure 3 shows that older houses are in the urban core of the city of Charlotte and of the other cities in this MSA, while newly built houses are in the outer suburbs. According to these two figures, the mean commuting time of an area decreases as that area ages.
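The two derived variables described above can be sketched as follows. The function names and the sample numbers are hypothetical and are not taken from the IPUMS tables; the decade rule follows the text, with any median year before 1950 assigned to the 1950s.

```python
def mean_commute_minutes(aggregate_minutes, workers_outside_home):
    """Mean commuting time: aggregate time divided by workers not working at home.

    Returns None when a block group has no such workers, so that it can be
    excluded instead of causing a division by zero (an assumption made here).
    """
    if workers_outside_home <= 0:
        return None
    return aggregate_minutes / workers_outside_home


def housing_decade(median_year_built):
    """Assign a block group to a decade; years before 1950 count as the 1950s."""
    decade = (median_year_built // 10) * 10
    return max(decade, 1950)


# Made-up block group: 450 commuting workers, 12,600 aggregate minutes.
print(mean_commute_minutes(12600, 450))  # 28.0
print(housing_decade(1938))              # 1950
print(housing_decade(1987))              # 1980
print(housing_decade(2004))              # 2000
```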
In the second step, the extent to which neighboring values are correlated was measured for the study area using the global Moran's I and LISA in GeoDa (Anselin et al., 2006). Global Moran's I is calculated using the Queen and Rook contiguity weight methods, and a Local Indicators of Spatial Association (LISA) cluster map is generated. The Moran's I values for the Rook and Queen methods are nearly identical, at 0.45939 and 0.45265, respectively. A possible reason is that the study area contains enough areal units that the choice of contiguity rule has little effect. The positive value of Moran's I (p-value < 0.05) shows that block groups with longer average commutes are adjacent to other block groups with longer average commutes, and block groups with shorter commutes are adjacent to those with shorter commutes. The results obtained from Moran's I are consistent with Tobler's first law of geography, which states that geographic features that are near each other are likely to be more similar than distant features (Tobler, 1970).
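One common way to attach a p-value to an observed Moran's I is a permutation test, in which the observed values are repeatedly shuffled across locations and the statistic is recomputed. The text does not state how the reported p-values were obtained, so the sketch below is only an illustration under that assumption; the function names, the seed, the number of permutations and the toy data are all hypothetical.

```python
import random


def morans_i(values, w):
    """Global Moran's I in its standard form."""
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]
    s0 = sum(sum(row) for row in w)
    cross = sum(w[i][j] * dev[i] * dev[j] for i in range(n) for j in range(n))
    return (n / s0) * cross / sum(d * d for d in dev)


def permutation_p_value(values, w, n_perm=999, seed=0):
    """One-sided pseudo p-value for positive spatial autocorrelation."""
    rng = random.Random(seed)
    observed = morans_i(values, w)
    shuffled = list(values)
    at_least_as_large = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)  # break any link between value and location
        if morans_i(shuffled, w) >= observed:
            at_least_as_large += 1
    # The observed arrangement counts as one of the n_perm + 1 outcomes.
    return observed, (at_least_as_large + 1) / (n_perm + 1)


# Twelve units in a row with steadily increasing made-up values.
chain = [[1 if abs(i - j) == 1 else 0 for j in range(12)] for i in range(12)]
values = list(range(1, 13))

observed, p_value = permutation_p_value(values, chain)
print(round(observed, 3), p_value)
```

A small pseudo p-value means that few random rearrangements of the same values produce a Moran's I as large as the observed one, which supports the presence of positive spatial autocorrelation.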
The LISA cluster map obtained from the spatial autocorrelation analysis is shown in Fig. 5. It illustrates the distribution of mean commuting time in the study area. The results of the LISA identify the local spatial clustering of the variable at the block group level. Red areas represent block groups with high aggregation (High-High), blue areas represent low aggregation (Low-Low), and light red (High-Low) and light blue (Low-High) areas indicate spatial outliers. A red block group is one whose mean commuting time, like that of its neighbors, is higher than the mean commuting time of the whole study area. A blue block group is one whose mean commuting time, like that of its neighbors, is lower than the mean commuting time of the whole study area. A light blue block group has a mean commuting time below the mean of the whole study area, while its neighbors' mean commuting time is above it. Likewise, a light red block group has a mean commuting time above the mean of the region, while its neighbors are below it. Considering Fig. 3, low commuting times cluster in urban cores and high commuting times cluster in suburbs. People who work locally have a shorter mean commuting time than people who travel longer distances from the suburbs to work in urban cores. Figure 6 shows the Moran's I scatter plot. The Moran scatterplot shows the z-value of the mean commuting time on the horizontal axis and the spatial lag on the vertical axis. The slope of the regression line in the scatterplot is equal to the Moran's I value, which is 0.45939.
The upper right and lower left quadrants of the scatter plot represent block groups with positive spatial autocorrelation, which means clustering of like mean commuting time values while the lower right and upper left quadrants represent negative spatial autocorrelation or spatial outliers of mean commuting time.

Conclusion
Considering the importance of commuting patterns in land use and transportation system problems, this study examined the spatial layout and distribution of mean commuting time at the block group level in the Charlotte-Concord-Gastonia MSA using spatial autocorrelation analysis. First, the relationship between mean commuting time and the age of housing was analyzed. The analysis revealed that the mean commuting time of older neighborhoods is lower than that of newer neighborhoods. Then, the mean commuting time of each block group was used to calculate spatial autocorrelation. The global Moran's I and Local Indicators of Spatial Association methods were applied using the Rook contiguity weight method. No significant difference between the Rook's case and the Queen's case was observed. While Moran's I provides information on the overall spatial distribution of the data, LISA provides information on the types of spatial association at the local level. The calculated value of Moran's I was 0.45939 using the Rook's case weight matrix, which indicates that nearby block groups tend to have similar mean commuting times. In other words, block groups with longer average commutes are next to block groups with longer average commutes, and block groups with shorter commutes are near those with shorter commutes. In addition, new neighborhoods in the Charlotte metropolitan area showed higher commuting times than old neighborhoods. This means that commuting times decline as neighborhoods age over the following decades and new growth areas move outward.

Author's Contributions
All the authors contributed equally to preparing, developing and carrying out this manuscript.