1. Introduction
There has been long-term, steady urban growth worldwide, and developing countries have been the major contributors to urbanization. In 2018, nine of the 10 largest cities in the world were located in Asia. It is projected that the urban population will double and the urban area will triple in size by 2030 [1]. Urban expansion has outpaced urban population growth, and this rapid expansion has left cities facing enormous challenges [1], such as uncontrolled informal settlement, insufficient urban services [2], climate change and global warming [3,4], negative effects on social-environmental responses [1,5], and consumption of agricultural and natural land [6].
Research from multiple disciplines has addressed the importance of studying urban growth to better understand the occurrence and consequences of urban expansion and to explore urban sprawl across space and over time [2,6,7]. Simulations of the future urbanized area can assist the government and urban planners in policy making, land use, and land management in response to fast economic development and rapid population growth. Such simulations are typically generated using urban growth models and mapped with GIS and remote sensing techniques.
Cellular automata (CA) models for urban growth simulation have proliferated over the past 20 years due to their simplicity, flexibility, and intuitiveness [8]. CA was first developed by S. Ulam and J. von Neumann in the 1940s and applied as a theoretical approach for the simulation of urban expansion in the 1980s [9,10,11]. CA models assume that future patterns of land use are affected by past urban development through local interactions [8]. CA models are effective and reliable in spatial and temporal simulation research [12]. They have four elements: a discrete cell space, cell states, neighborhood rules, and transition rules. The state of a cell is determined by its previous state and the states of its neighboring cells according to the transition rules (Batty 2007). Though simple, CA models can incorporate the spatial and temporal dimensions of processes and model complex dynamic systems. Advances in computing power led to the emergence of operational CA models and have recently increased the number of CA models used in real-world urban simulations [8,12]. Moreover, urban CA models can be easily integrated with a GIS environment, which can produce high spatial resolution maps [2].
Compared to a general CA model, the Land Transformation Model, although it demonstrates a high capacity for high-resolution prediction, has complex operational steps, which make it one of the least popular applications [13]. The Weights of Evidence (WE) model requires rich data and detailed maps, and collecting all the needed data is very hard [14]. An integrated model combining the Frequency Ratio (FR), Analytical Hierarchical Process (AHP), Logistic Regression (LR), and Artificial Neural Network (ANN) was designed to predict and compare urban growth [15], but it had difficulty identifying the best method because the methods differ in their requirements, needs, and means. Given these limitations, CA remains one of the most popular approaches in urban growth modeling. The agent-based model (ABM), which is built around individual agents, is also among the most popular models in land use and land cover change studies [16,17]. By following rules, those agents steer land use development by planning, permitting, or restricting land use changes [18]. ABMs are applicable to land use change studies with individual decision-making processes, while CA models are mainly used in studies based on historical land use change patterns.
Early applications of CA to urban dynamic modeling were theoretical models, which allowed modelers to test hypotheses of urban theory and simulate simple urban structures [19]. In their simple forms, theoretical CA models cannot create realistic simulations because they do not consider social, economic, and demographic factors when simulating urban dynamics. Hence, these conventional CA models were integrated with quantitative systems to simulate real-world urban development processes [20,21]. Integrating CA models with the Markov Chain [22,23], Analytical Hierarchical Process [24], Logistic Regression [25,26], Multicriteria Evaluation [27,28], Support Vector Machine [29], and ANNs [30] overcomes the limitations of conventional CA [31]. The SLEUTH model was built with multiple data layers, including slope, land use, exclusion, urban extent, transportation, and hill shade [20], so the probability that the transition rules change the state of a cell is tuned and weighted through those layers, which has made it a widely used model [6,20,32]. Integrated CA-AHP and CA-LR models are the strongest in terms of the number of factors they can handle, the validity of their simulations, their effectiveness in explaining the results, and their ability to generate different scenarios using both environmental and socio-economic factors [12]. Moreover, previous studies have shown the effectiveness of AHP in weighting factors in quantitative models and in using spatial-temporal factors to create realistic simulations. AHP can also combine mathematical methods with the experience of experts in the field of urban studies.
Based on the review by Santé et al. (2010) [8], there are relaxations in the original structure of CA that allow more complexity to be introduced into the models:
Transition rules: One of the most commonly used rules is the transition potential. It is usually calculated as the weighted sum of a number of factors, including road accessibility, distance to urban centers, slope, accessibility to railways and water, planning and environmental factors, suitability for development, population density, and Gross Domestic Product (GDP). GDP is the factor most commonly used in the literature [25,33]. Different techniques can be used to calculate the transition potential, such as logistic regression or multicriteria evaluation [28].
Cell space: The cell space can be irregular and non-uniform, for example, graphs [34,35].
Neighborhood: The neighborhood space can be extended with a distance-decay effect or defined differently according to its state and location [36].
Cell states: Most urban simulation models make transitions between two states: urban and non-urban. Some models extend the transition to multiple land uses using a Markov chain [37].
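The weighted-sum form of the transition potential mentioned in the list above can be sketched as follows. This is only an illustration under stated assumptions: the layer names, the 2 × 2 grids, and the weights are hypothetical, and each factor layer is assumed to be rescaled to [0, 1] with higher values favoring urban conversion.

```python
from typing import Dict, List

Grid = List[List[float]]


def transition_potential(factors: Dict[str, Grid], weights: Dict[str, float]) -> Grid:
    """Weighted sum of factor layers, cell by cell.

    Assumption of this sketch: every layer is pre-scaled to [0, 1] and a
    higher value means the cell is more favorable for urban conversion.
    """
    if set(factors) != set(weights):
        raise ValueError("factors and weights must use the same names")
    total = sum(weights.values())
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    names = list(factors)
    rows = len(factors[names[0]])
    cols = len(factors[names[0]][0])
    potential = [[0.0] * cols for _ in range(rows)]
    for name in names:
        w = weights[name] / total  # normalize so the weights sum to 1
        layer = factors[name]
        for r in range(rows):
            for c in range(cols):
                potential[r][c] += w * layer[r][c]
    return potential


# Hypothetical 2 x 2 layers and weights, not data from the study.
factors = {
    "road_access": [[1.0, 0.5], [0.0, 0.5]],
    "slope": [[1.0, 1.0], [0.0, 0.5]],
}
weights = {"road_access": 0.6, "slope": 0.4}
potential = transition_potential(factors, weights)
```

The weights are normalized inside the function so that the resulting potential stays in [0, 1] whenever the input layers do.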
The integration of CA models with GIS and AHP is widely used to simulate land use changes. However, most of the models are applied at regional scales, and only a few are applied at larger scales. The aims of this study are (1) to develop a prototype simulation model for national-scale urban simulation studies by integrating GIS and CA and (2) to analyze the sensitivity of the model with respect to:
The neighborhood size: How does the size of the neighborhood affect the simulation results? What is the best-fitting neighborhood size in this study?
The urban ratio: How does the urban ratio in a transition rule affect the speed of urban growth? What is the most appropriate urban ratio in this study?
We employ the transition probability as a transition rule in the model. AHP is used to determine the weights of multiple land use change drivers. China is used as a case study to test the model.
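To illustrate how AHP can turn pairwise judgments into factor weights, the sketch below derives weights as the principal eigenvector of a reciprocal comparison matrix (found by power iteration) and reports Saaty's consistency ratio. The three drivers and the judgments in `pairwise` are hypothetical and are not the matrix used in this study; the function name `ahp_weights` is likewise introduced only for this example.

```python
from typing import List, Tuple

# Saaty's random consistency index for matrix sizes 1 to 9.
RANDOM_INDEX = {1: 0.0, 2: 0.0, 3: 0.58, 4: 0.90, 5: 1.12,
                6: 1.24, 7: 1.32, 8: 1.41, 9: 1.45}


def ahp_weights(pairwise: List[List[float]], iterations: int = 100) -> Tuple[List[float], float]:
    """Return (weights, consistency_ratio) for a reciprocal pairwise matrix.

    Weights are the principal eigenvector, approximated by power iteration
    and normalized to sum to 1. Supports matrices of size 1 to 9.
    """
    n = len(pairwise)
    w = [1.0 / n] * n
    for _ in range(iterations):
        nxt = [sum(pairwise[i][j] * w[j] for j in range(n)) for i in range(n)]
        s = sum(nxt)
        w = [v / s for v in nxt]
    # Estimate the principal eigenvalue from A w = lambda_max * w.
    aw = [sum(pairwise[i][j] * w[j] for j in range(n)) for i in range(n)]
    lambda_max = sum(aw[i] / w[i] for i in range(n)) / n
    ci = (lambda_max - n) / (n - 1) if n > 1 else 0.0
    ri = RANDOM_INDEX[n]
    cr = ci / ri if ri > 0 else 0.0
    return w, cr


# Hypothetical judgments for three drivers (not the study's matrix):
# driver A is judged 2x as important as B and 4x as important as C.
pairwise = [
    [1.0, 2.0, 4.0],
    [0.5, 1.0, 2.0],
    [0.25, 0.5, 1.0],
]
weights, cr = ahp_weights(pairwise)
```

A consistency ratio below 0.1 is the conventional threshold for accepting a set of judgments; the example matrix is perfectly consistent, so its ratio is zero up to rounding.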
2. Study Area
China has the fourth largest land area in the world. Its urbanized area has expanded rapidly over the past 40 years. The urban share of the population in China is anticipated to exceed 70% by 2035 [38]. The number of cities in China increased from 193 to 658 between 1978 and 2013 [39].
China also has very diverse urban areas across more than 650 cities. The rate of urban expansion has varied across cities from 0 to over 90%, showing broad spatial and temporal variability [33]. In the econometric model of Deng et al. (2008), GDP (or income) has a positive and highly significant coefficient, and it was shown to explain nearly 40% of the variability of the urban core expansion in the model [33]. By comparing a model that includes both economic factors (such as GDP, population, and agricultural investment) and geographical factors (such as existing urban area, highway density, distance to port cities and provincial capitals, rainfall, slope, temperature, and elevation) with a model that includes only geographical factors, the authors concluded that geographical variables play the major role in explaining the diversity of urban core expansion across space, while the impacts of economic factors can be measured more precisely [33].
Combining urban growth drivers from past urban simulation studies and Deng et al.’s research on urban expansion in China, we chose seven factors as the urban growth drivers for generating the land conversion probability; they are described in the Data section.
5. Parameter Tests
To improve the results of the model, besides the modifications for edge effects, two parameters in the model could still be adjusted: the neighborhood size and the urban ratio threshold. We modified the original rule as follows: if the urban ratio of a non-urban cell’s neighborhood was over a certain threshold and, at the same time, the generated random number was greater than the cell’s land suitability, the cell became an urban cell and was exported to the new layer. We varied these two parameters to evaluate their influence on the final results of the model. We first performed multiple tests on different neighborhood sizes with the same urban ratio and then changed the urban ratio threshold, in order to understand the role that these two parameters play in the model.
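The modified rule described above can be sketched as a single synchronous update of a binary grid. This is a minimal illustration rather than the model's actual implementation: the function name `ca_step`, the list-of-lists grids, and the edge handling (cells outside the grid count as non-urban while the denominator stays fixed) are assumptions. A ratio equal to the threshold is assumed to qualify, which matches the "at least" wording used in the urban ratio tests, and the comparison between the random number and the suitability follows the rule as worded in the text.

```python
import random
from typing import List


def ca_step(urban: List[List[int]], suitability: List[List[float]],
            size: int, threshold: float, rng: random.Random) -> List[List[int]]:
    """One synchronous update of a binary urban (1) / non-urban (0) grid.

    size is the odd neighborhood width (3 for 3 x 3, 5 for 5 x 5, ...).
    """
    rows, cols = len(urban), len(urban[0])
    half = size // 2
    n_neighbors = size * size - 1  # the center cell is excluded
    new = [row[:] for row in urban]  # write to a copy, read from the old grid
    for r in range(rows):
        for c in range(cols):
            if urban[r][c] == 1:
                continue  # urban cells stay urban
            count = 0
            for dr in range(-half, half + 1):
                for dc in range(-half, half + 1):
                    rr, cc = r + dr, c + dc
                    # Assumption: cells outside the grid count as non-urban.
                    if (dr or dc) and 0 <= rr < rows and 0 <= cc < cols:
                        count += urban[rr][cc]
            ratio = count / n_neighbors
            # Assumption: a ratio equal to the threshold qualifies.
            # The random-number comparison follows the wording of the text.
            if ratio >= threshold and rng.random() > suitability[r][c]:
                new[r][c] = 1
    return new


# Hypothetical 5 x 5 example: a row of three urban cells.
grid = [[0] * 5 for _ in range(5)]
for col in (1, 2, 3):
    grid[1][col] = 1
zero_suit = [[0.0] * 5 for _ in range(5)]  # random draw practically always passes
after_0375 = ca_step(grid, zero_suit, 3, 0.375, random.Random(42))
after_025 = ca_step(grid, zero_suit, 3, 0.25, random.Random(42))
```

In this toy grid, lowering the threshold from 0.375 to 0.25 lets cells with only two urban neighbors convert, so more cells become urban in one step.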
5.1. Neighborhood Size Tests
We tested the Moore neighborhood (3 × 3) first, beginning with an urban ratio of 0.375. Then, we increased the neighborhood size while keeping the urban ratio fixed at 0.375 and performed further tests (Figure 4).
With a neighborhood of 3 × 3, almost 95% of the existing urban cells were surrounded by new urban cells. The smallest urban area with new urban growth had only two original urban cells. The new urban growth was 5201 km².
With a neighborhood of 5 × 5, the smallest urban area with new urban growth had seven original urban cells. The larger the historical urban area was, the more new urban cells grew adjacent to it. New urban cells not only appeared next to existing cells but also accumulated near newly grown cells. Therefore, the original large urban areas, such as the city of Beijing and the city of Tianjin, were enlarged by a circle of new urban cells. This simulation with a 5 × 5 neighborhood had the largest new urban growth: 6467 km².
With a neighborhood of 7 × 7, a large number of small existing urban clusters remained the same after this simulation. Around large urban clusters, some new urban cells were not tied closely to the edges. They extended close to but not directly adjacent to urban clusters. Starting from this simulation, the computation time of one simulation increased exponentially if we increased the neighborhood size. The new urban growth decreased to 5887 km².
With larger neighborhoods, only a few metropolitan clusters expanded with new urban cells, which is unrealistic because most urban growth results from suburban sprawl.
Due to the stochasticity in the model, the results differed each time, so we ran each test 10 times and calculated the average number of simulated urban pixels (Figure 4). The results showed a regular pattern. Simulated urban development grew as the neighborhood size increased and peaked at around 5 × 5. After that, the number of changes decreased as the neighborhood size continued to increase. The variation among those results was not pronounced: in Figure 4, there is a difference of around 1300 pixels between the peak number of changes at 5 × 5 and the fewest changes at 3 × 3. In simulations using neighborhoods of 3 × 3 and 5 × 5, new urban cells occurred adjacent to small areas of original urban land use, which is realistic. However, as the neighborhood size increased further, new urban cells only occurred adjacent to large original urban areas (see Supplementary Material).
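Averaging repeated stochastic runs, as done above with 10 runs per test, can be sketched as follows. The helper `summarize_runs` and the stand-in `toy_simulation` are hypothetical; in practice the callable would wrap one full run of the CA model and return its number of new urban pixels.

```python
import random
import statistics
from typing import Callable, Tuple


def summarize_runs(simulate: Callable[[random.Random], int],
                   n_runs: int = 10, seed: int = 0) -> Tuple[float, float]:
    """Run a stochastic simulation n_runs times; return (mean, sample std dev)."""
    rng = random.Random(seed)  # one seeded generator keeps the batch reproducible
    counts = [simulate(rng) for _ in range(n_runs)]
    spread = statistics.stdev(counts) if n_runs > 1 else 0.0
    return statistics.mean(counts), spread


def toy_simulation(rng: random.Random) -> int:
    """Stand-in for one model run: each of 1000 candidate cells converts
    independently with probability 0.3 (an arbitrary illustrative value)."""
    return sum(1 for _ in range(1000) if rng.random() < 0.3)


mean_new_cells, spread = summarize_runs(toy_simulation, n_runs=10, seed=1)
```

Reporting the spread alongside the mean makes it easier to judge whether differences between neighborhood sizes exceed the run-to-run variation.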
5.2. Urban Ratio Tests
After testing different sizes, the 3 × 3, 5 × 5, and 7 × 7 neighborhoods stood out and generated the most realistic results. In the following tests, we kept the neighborhood size fixed while changing the urban ratio (Table 3).
With a 3 × 3 neighborhood, the urban growth after one simulation differed greatly between an urban ratio of 0.25 (two urban cells out of eight neighbors) and an urban ratio of 0.375 (three urban cells out of eight neighbors) (Figure S1a). The new urban area generated by the former was 9939 km², almost twice that of the latter. However, there was little difference between the simulated urban growth using urban ratios of 0.375 and 0.5 (four urban cells out of eight neighbors). Their urban growth results were almost the same: around 5200 km².
The simulation generated with a 5 × 5 neighborhood showed a similar trend (Figure S1b). It produced 6506 new urban cells with an urban ratio of 0.375 (nine urban cells out of 24 neighbors) and 32% more with an urban ratio of 0.33, which means that new urban cells had at least eight urban neighbors. With an urban ratio of 0.5, the simulation generated 32% fewer urban cells than with a 0.375 urban ratio.
The pattern was similar for a 7 × 7 neighborhood (Figure S1c). The results showed a new urban area of 7795 square kilometers using a ratio of 0.3125 (15 urban cells out of 48 neighbors), which is 34% more than the urban growth using a ratio of 0.375. The results generated with a ratio of 0.5 had 25% fewer urban cells than those with 0.375. Moreover, with an urban ratio of 0.5, urban growth decreased as the neighborhood size increased.
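The correspondence between an urban ratio and the minimum number of urban neighbors quoted in the three tests above is simple arithmetic, sketched here for reference. The function name is hypothetical, and the sketch assumes that a ratio equal to the threshold qualifies, consistent with the counts given in the text.

```python
import math


def min_urban_neighbors(size: int, ratio: float) -> int:
    """Smallest number of urban neighbors whose share of a size x size
    neighborhood (center cell excluded) reaches the given urban ratio."""
    n_neighbors = size * size - 1
    # A small tolerance guards against floating-point noise before rounding up.
    return math.ceil(ratio * n_neighbors - 1e-9)


cases = [(3, 0.25), (3, 0.375), (3, 0.5), (5, 0.33), (5, 0.375), (7, 0.3125)]
for size, ratio in cases:
    needed = min_urban_neighbors(size, ratio)
    print(f"{size} x {size}, ratio {ratio}: {needed} of {size * size - 1} neighbors")
```

For the 5 × 5 neighborhood, for example, a ratio of 0.33 rounds up to eight of 24 neighbors, while 0.375 corresponds to exactly nine.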
After testing multiple neighborhoods with different urban ratios as the threshold in the model (Figure S1d–f), we picked three tests whose results after the first simulation were very similar to the urban land cover in 2005 and evaluated them. The first chosen test used a 3 × 3 neighborhood with an urban ratio of 0.25. The evaluation produced a kappa coefficient of 0.79, which indicates good agreement between the simulated result and the real urban land cover in 2005. The second chosen test used a 5 × 5 neighborhood and an urban ratio of 0.33. Its kappa coefficient was 0.78, slightly lower than the first but still indicating good agreement. The third test used a 7 × 7 neighborhood and an urban ratio of 0.3125, and its kappa coefficient was 0.77. All three tests had an overall accuracy of around 0.89 (Figure S2).
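For readers unfamiliar with the two agreement measures reported above, the following sketch computes overall accuracy and Cohen's kappa from a coincidence (confusion) matrix. The function name and the counts are hypothetical and are not taken from the study's evaluation.

```python
from typing import List, Tuple


def accuracy_and_kappa(matrix: List[List[int]]) -> Tuple[float, float]:
    """Overall accuracy and Cohen's kappa from a square coincidence matrix.

    matrix[i][j] counts cells observed as class i and simulated as class j.
    """
    total = sum(sum(row) for row in matrix)
    k = len(matrix)
    # Observed agreement: share of cells on the diagonal.
    observed = sum(matrix[i][i] for i in range(k)) / total
    # Chance agreement: product of row and column marginals, summed over classes.
    expected = sum(
        sum(matrix[i]) * sum(row[i] for row in matrix) for i in range(k)
    ) / (total * total)
    kappa = (observed - expected) / (1.0 - expected)
    return observed, kappa


# Hypothetical counts (rows: observed non-urban/urban; columns: simulated).
matrix = [[40, 10], [10, 40]]
accuracy, kappa = accuracy_and_kappa(matrix)
```

Kappa discounts the agreement expected by chance, which is why it is lower than the overall accuracy whenever chance agreement is greater than zero.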
Then, we used the simulated land use cover for 2005 as the input layer and generated the land use cover for 2010 and 2015 with the three sets of parameters. The results are shown in Table 4 and Table 5. The results of the simulation with a 5 × 5 neighborhood and an urban ratio threshold of 0.33 were the closest to the observed urban area. The simulated urban growth rate in 2000–2005 was 0.39, 14.7% higher than the observed growth rate of 0.34, and the simulated growth rate between 2005 and 2010 was 0.81, 3.6% less than the observed rate. Therefore, the final results were produced using this set of parameters.
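The comparison of simulated and observed growth rates above can be reproduced with two small helpers. The function names are introduced only for this sketch, and the growth rate is assumed to be the relative change in urban area over a period; the only inputs taken from the text are the 2000–2005 rates of 0.39 and 0.34.

```python
def growth_rate(area_start: float, area_end: float) -> float:
    """Relative growth over a period, e.g. 0.34 means 34% more area."""
    return (area_end - area_start) / area_start


def relative_difference(simulated: float, observed: float) -> float:
    """Signed difference of simulated from observed, as a share of observed."""
    return (simulated - observed) / observed


# Rates quoted in the text for 2000-2005.
diff = relative_difference(0.39, 0.34)
print(f"{diff:.1%}")  # prints 14.7%
```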
After validation and calibration, new simulations were completed based on the historical land use data in 2010. The model was run four times. Each iteration represented a 5-year urban growth change. The model simulated the urban growth changes until 2030. The results are shown in Figure S3.
The simulated future urban pattern is similar to the historical urban area in 2010. Three major existing large urban clusters, namely Beijing-Tianjin, the Yangtze River Delta, and the Pearl River Delta, were estimated to continue to expand. Capital cities and large cities in central provinces are expected to grow quickly, and Hohhot, Baotou, Xi’an, and Wuhan show large urban growth relative to the 2010 map.
According to the simulation, the total urbanized area will continue to increase, and the simulated urban area in 2030 approaches 105,000 square kilometers, which is more than twice the observed urban area in 2010. The simulated results indicate a decreasing trend in the urban growth rate, from around 38% in 2015 to 17% in 2030. New urban growth is predicted to keep shrinking in the future (Table 6).
7. Discussion
CA models have inherent challenges associated with the use of parameters and their impacts on model results. Uncertainties in the model generated by those parameters are often ignored or not adequately addressed. This study examined the impacts of the neighborhood size and urban ratio on the model outputs using visual comparison and a coincidence matrix with the kappa index. The results indicate that changing the neighborhood size and urban ratio has significant impacts on the simulated outputs.
Under the transition rule, a small neighborhood size results in a more dispersed pattern of urban growth. With a small neighborhood, only a few urban cells are required to enable urban growth. Therefore, most urban growth occurs adjacent to existing urban land use, regardless of the patch size of the existing urban area. As the neighborhood size grows, more urban cells are required to enable urban growth. Many small urban patches that are far from large urban centers lose the ability to generate urban growth, and the majority of urban growth accumulates around the large urban centers. For a given neighborhood size, urban growth decreases as the urban ratio increases. Therefore, the urban ratio also has a significant effect on the model’s output. Both neighborhood size and urban ratio require fine-tuning when using the model.
It is worth noting that the significance of urban models does not only lie in the results, but also in their individual behaviors and the settings of behavior rules. Through the process of developing models, we learn how each model functions. For urban simulation models, although the simulated results will never be perfect, planners can still get to know the specific kinds of factors which may influence urban growth and how those possible factors lead to actual land use changes.
This research demonstrates that the cellular automata model is appropriate for simulating urban growth in China. The calibrated model can predict the urban growth rate and the spatial location of urban areas, but it still has flaws. The population data used to generate the suitability map are from 2010 and are held constant throughout the whole simulation process. However, the urban population in China shows an obvious increasing trend during these years, which will contribute to urban growth in terms of both magnitude and spatial location. Therefore, the constant population embedded in the model will result in a conservative prediction of urban growth.
Although the AHP weights were determined from expert knowledge, they still bring subjective human influence to the model outputs. In future research, logistic regression could be used to complement the AHP weights. Furthermore, the urban growth in this model was endogenously generated by the CA model; further growth constraints could be added to control the total area of urban growth. Multiple land use types could also be introduced into the model.
8. Conclusions
This paper focused on the simulation of urban development in China until 2030. GIS, multicriteria analysis, and AHP techniques were adopted, and an integrated simulation model was developed in combination with the traditional cellular automata model. The land suitability score was calculated based on seven factors: distance to city centers, railroads, major roads, and rivers, together with slope, population density, and GDP. Two parameter tests were carried out for the urban ratio and neighborhood size. As a result, a 5 × 5 neighborhood and an urban ratio of 0.33 were selected to generate the results. The spatial pattern of future urban growth in China is not expected to change substantially, but more satellite cities will be built to connect existing urban centers. The total urban area will continue to increase and is expected to approach 105,000 square kilometers. However, the simulation indicates a decreasing trend in the urban growth rate, from 38% in 2015 to 17% in 2030.