In the traditional Cellular Automata (CA) model, the comprehensive transformation probability of a specific land-use class for each cell depends on the land development suitability probability, the neighborhood effect probability, constraint factors, and random effects.
To explore how cell development in different spatial directions affects land use change, we introduce the Spatial Anisotropy Index (SAI) into the traditional comprehensive transformation suitability module. The formula is as follows:
$$P_{ij}^{t} = P^{s}_{ij} \times \Omega_{ij}^{t} \times SAI_{ij} \times R \quad (1)$$
where
$P_{ij}^{t}$ represents the comprehensive conversion probability of the cell at position $(i,j)$ at time $t$;
$P^{s}_{ij}$ denotes the urban development suitability probability of the cell at position $(i,j)$, which in this paper is computed by the Random Forest (RF) module. The 15 driver factor layers are integrated with the land-use change data from 2010 to 2020 into a unified spatial database, and a random forest model trained on these driver factors yields the suitability probability;
$\Omega_{ij}^{t}$ signifies the neighborhood effect probability of the cell at position $(i,j)$ at time $t$, calculated in this paper by the expanded neighborhood and CNN modules;
$SAI_{ij}$ stands for the Spatial Anisotropy Index (SAI) of the cell at position $(i,j)$, representing the influence of forces from different directions on land-use change;
$R$ represents the random effect during the land expansion and change process.
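As a minimal sketch of how these factors might be combined for a single cell and a single land-use class, the snippet below assumes that they are multiplied together, as is usual in CA transition rules. The function name, the argument names, and the default random effect of 1.0 are illustrative assumptions, not part of the original model code.

```python
def composite_probability(suitability, neighborhood, sai, random_effect=1.0):
    """Combine the factors named in the text for one cell and one land-use class.

    Assumption: the factors are multiplied. random_effect defaults to 1.0
    (no perturbation) because the text does not specify its distribution.
    """
    for name, value in (("suitability", suitability),
                        ("neighborhood", neighborhood),
                        ("sai", sai)):
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must lie in [0, 1], got {value}")
    return suitability * neighborhood * sai * random_effect


# Toy values, not taken from the study data.
p = composite_probability(suitability=0.8, neighborhood=0.5, sai=0.25)
```

Under a multiplicative form, a zero in any factor forces the overall probability to zero, which is why a neighborhood effect of 0 blocks conversion entirely.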
2.3.1. Spatial Anisotropy Module
Spatial anisotropy (SA) is a significant feature of urban development: it describes the extent to which the direction of urban expansion influences simulation results. For instance, a non-urban cell may evolve differently in different spatial directions. Urban expansion models commonly address this directionality with partitioning schemes, but partitioning can produce imbalanced simulation results in the transition zones between two sector regions.
To reveal the spatially anisotropic patterns of urban land expansion, this paper adds the Spatial Anisotropy Index (SAI) to the traditional urban expansion model. The SAI quantifies the anisotropic probability of each cell and is integrated into the overall conversion probability of the cell. This allows urban expansion and development trends in different directions to be simulated more accurately, improving both the accuracy and the interpretability of the simulation results.
How anisotropy is measured is crucial in anisotropy modeling. Viewed from the urban center, the proportion of pixels of a given land use type along a direction, relative to the total number of pixels along that direction, differs significantly between directions. We refer to this proportion as the anisotropy index of that land type, as specified in Formula (2):
$$SAI_{k} = \frac{C_{k}}{C_{total}} \quad (2)$$
where
$SAI_{k}$: the anisotropy index of land type $k$;
$C_{k}$: the number of cells of type $k$ along the line connecting the central city cell and the observation point;
$C_{total}$: the total number of cells along the line connecting the central city cell and the observation point.
Specific steps:
(1) Load land use layer data: load the land use layers for 2010–2020, ensuring correct data formatting and the availability of land use type information for each cell.
(2) Select the city center point: choose the center point of the city within the land use layer, which can be either the geographic center of the city or a representative location.
(3) Establish four quadrants: create four quadrants around the city center point, divided according to the positive and negative directions of the spatial coordinate axes.
(4) Determine line equations: for randomly selected cell points within each quadrant, calculate the equation of the line connecting each point to the city center point, for example in two-point form or slope-intercept form.
(5) Calculate the Spatial Anisotropy Index (SAI): use these line equations to compute the SAI for each research point according to Formula (2). The index reflects the trends and differences of cells in different directions.
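The steps above can be sketched on a small raster as follows. This sketch is not the authors' implementation: it traces the cells between the center and the observation point with Bresenham's line algorithm instead of an explicit line equation, and it counts both end cells. The grid, the land-type codes, and the function names are hypothetical.

```python
def line_cells(start, end):
    """Return the grid cells on the segment from start to end (Bresenham's algorithm)."""
    (r0, c0), (r1, c1) = start, end
    dr, dc = abs(r1 - r0), abs(c1 - c0)
    step_r = 1 if r1 >= r0 else -1
    step_c = 1 if c1 >= c0 else -1
    err = dc - dr
    cells = []
    r, c = r0, c0
    while True:
        cells.append((r, c))
        if (r, c) == (r1, c1):
            return cells
        e2 = 2 * err
        if e2 > -dr:          # step along the column axis
            err -= dr
            c += step_c
        if e2 < dc:           # step along the row axis
            err += dc
            r += step_r


def anisotropy_index(grid, center, point, land_type):
    """Share of cells of land_type on the line from the city center to point."""
    cells = line_cells(center, point)
    matching = sum(1 for r, c in cells if grid[r][c] == land_type)
    return matching / len(cells)


# Toy raster with made-up land-type codes (not the Chongqing data).
grid = [
    [1, 1, 2, 2],
    [1, 2, 2, 3],
    [1, 1, 2, 3],
]
east = anisotropy_index(grid, center=(0, 0), point=(0, 3), land_type=2)
southeast = anisotropy_index(grid, center=(0, 0), point=(2, 2), land_type=2)
```

Comparing the index for observation points in different quadrants gives the directional contrast that the SAI is meant to capture.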
In this study, the spatial anisotropy index (SAI) for six types of land use in Chongqing in 2010 was calculated based on Formula (2) and the land use data of Chongqing in 2010. The results are presented in Figure 4.
The spatial anisotropy index calculated above is a static, statistical observation at a specific point in time, which to some extent expresses the directionality and preference of cell expansion in space. We believe that the anisotropic spatial distribution of land classes stems from two forces: expansion forces from within the city (internal forces) and interactions with other cities (external forces). At present, urban land change models place significant emphasis on internal forces but tend to overlook external forces, which are difficult to model. In the CA iteration module of this study, we aim to explore the impact of external forces on urban expansion.
2.3.2. Neighborhood Effect Extraction Module
The traditional cellular neighborhood effect is a crucial mechanism used to describe the interaction between cells and how they respond to changes in the surrounding environment. Typically, the definition of neighborhood effect involves a fixed neighborhood structure that determines the surrounding neighboring cells considered by each cell. Taking the classic Moore neighborhood as an example, it includes eight adjacent cells around a cell. The fundamental idea of neighborhood effect is that the state update of a cell is influenced by the states of its neighboring cells, simulating local interactions between cells. However, traditional neighborhood effect methods have some limitations in certain situations. Firstly, they often employ a fixed neighborhood structure, which may limit the adaptability of the model, as some problems may require a more flexible neighborhood definition to better capture changes in the surrounding environment. Secondly, traditional neighborhood effect methods usually focus on the local neighborhood of cells, neglecting the influence of cells at greater distances on the simulation. This may lead to the model being unable to accurately predict global phenomena in some cases. Another limitation is that neighborhood effects may result in zero probabilities for some cell neighborhoods. This implies that, in certain situations, the state of a cell cannot change according to the current rules, leading to the abandonment of some cells with development potential and potentially inaccurate simulation results. In summary, although traditional neighborhood effects play a crucial role in cellular automaton models, they have some shortcomings in terms of flexibility, globality, and dynamics, which may limit the accuracy and applicability of the model.
Taking the neighborhood calculation of urban land as an example, the traditional neighborhood effect is the density of urban cells within the $n \times n$ neighborhood of the central cell. The mathematical expression of the neighborhood effect of cell $(i,j)$ at time $t$ is:
$$\Omega_{ij}^{t} = \frac{\sum_{n \times n} \mathrm{con}\left(S_{ij}^{t} = \text{urban}\right)}{n \times n} \quad (3)$$
In the formula, $\Omega_{ij}^{t}$ is the neighborhood effect of cell $(i,j)$ at time $t$; $n$ is the Moore neighborhood side length; $\mathrm{con}(\cdot)$ is the conditional function, which takes the value 1 when the cell state is urban and 0 otherwise; and $S_{ij}^{t}$ is the state of cell $(i,j)$ at time $t$.
This article addresses the limitations of traditional neighborhoods in flexibility, globality, and dynamics, focusing in particular on the failure of cell expansion in traditional neighborhood effect computations, by introducing an expanded neighborhood construction. The basic approach gradually increases the side length of the neighborhood from 3 up to a maximum value N (chosen as 11 in this paper through experimentation), balancing computational effectiveness and efficiency (see Figure 5). If expansion up to N still fails to yield a non-zero neighborhood probability, a Convolutional Neural Network (CNN) is used to extract spatial features from the driving factors and thereby resolve the neighborhood effect.
In Figure 5, yellow represents woodland, green grassland, gray arable land, black water bodies, red built-up areas, and dark blue unused land. In the initial 3 × 3 Moore neighborhood, the calculated neighborhood effect for built-up areas is 0, and it remains 0 when the neighborhood expands to 5 × 5. When the neighborhood expands to 7 × 7, however, the neighborhood effect probability for built-up areas becomes 12/49. The expanded neighborhood proposed in this paper thus addresses situations where there is a developmental trend but a fixed neighborhood yields a neighborhood effect probability of 0 for the cell. The expansion starts from the 3 × 3 Moore neighborhood because this traditional neighborhood emphasizes spatial correlation and the interaction between adjacent areas, capturing the characteristics of and changes in local land use.
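The expanded neighborhood search can be sketched as below. The sketch assumes odd side lengths (3, 5, 7, ...), a density computed over all cells of the window (matching the 12/49 example), and that cells outside the raster count as non-matching; these choices and all names are illustrative. The toy grid is constructed to reproduce the 0, 0, 12/49 sequence described for Figure 5, not copied from the figure.

```python
def neighborhood_effect(grid, row, col, side, target):
    """Density of target cells in the side x side window centred on (row, col)."""
    half = side // 2
    hits = 0
    for r in range(row - half, row + half + 1):
        for c in range(col - half, col + half + 1):
            inside = 0 <= r < len(grid) and 0 <= c < len(grid[0])
            if inside and grid[r][c] == target:
                hits += 1
    return hits / (side * side)


def expanded_neighborhood_effect(grid, row, col, target, max_side=11):
    """Grow the window from 3 x 3 until the effect is non-zero or max_side is hit.

    Returns (side, effect); (None, 0.0) signals that another method,
    such as the CNN fallback described in the text, is needed.
    """
    for side in range(3, max_side + 1, 2):
        effect = neighborhood_effect(grid, row, col, side, target)
        if effect > 0:
            return side, effect
    return None, 0.0


URBAN = 1
grid = [[0] * 7 for _ in range(7)]
for c in range(7):
    grid[0][c] = URBAN   # 7 urban cells on the top edge
for c in range(5):
    grid[6][c] = URBAN   # 5 more on the bottom edge, 12 in total
side, effect = expanded_neighborhood_effect(grid, 3, 3, URBAN)
```

Returning the side length together with the effect makes it easy to see how far the window had to grow before any built-up cell was found.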
The traditional neighborhood calculation, which relies on statistical counting, suits the spread and expansion of old areas but cannot express the expansion of new areas, because it can return a probability of 0. If the traditional calculation is retained, new areas objectively lack seed points for expansion. Although expanded neighborhoods reduce the occurrence of zero probabilities to some extent, they cannot eliminate it completely. To address this issue, this paper uses convolutional neural networks (CNN) to extract the neighborhood effect at the driver factor level (see Figure 6), because the driver factors around the neighboring points of a cell also reflect, to some extent, the transformation trend of that cell.
The first step preprocesses the land-use data for 2020 to obtain the dataset. Then, with each labeled sample as the center, a neighborhood of range N is established, which segments the 15 driver factor layers into images of size N × N. According to the research of He Jialv, and considering the experiments conducted in this paper [8], model accuracy was best at N = 51, so N was set to 51 in this study. The 51 × 51 × 15 image data were then fed into the CNN model, where convolution and pooling operations ultimately output the neighborhood effect of the cell.
The CNN model consists mainly of 5 convolutional layers, 2 pooling layers, and 2 fully connected layers. The activation function is ReLU, the loss function is cross-entropy, and the model is trained with stochastic gradient descent.
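The patch construction that feeds the CNN can be sketched as follows. The sketch only cuts an N × N window from each driver layer around a labeled cell; it does not reproduce the network itself. Filling cells beyond the raster edge with a constant, the layers-first output layout, and the function name are assumptions made for illustration.

```python
def extract_patch(layers, row, col, size, fill=0.0):
    """Cut a size x size window centred on (row, col) from every driver layer.

    layers is a list of 2-D grids with identical shape. The result is
    indexed as patch[layer][window_row][window_col]. Cells outside the
    raster are filled with `fill` (an assumption; the text does not
    describe edge handling).
    """
    if size % 2 == 0:
        raise ValueError("size must be odd so that the window has a centre cell")
    half = size // 2
    n_rows, n_cols = len(layers[0]), len(layers[0][0])
    patch = []
    for layer in layers:
        window = []
        for r in range(row - half, row + half + 1):
            window.append([
                layer[r][c] if 0 <= r < n_rows and 0 <= c < n_cols else fill
                for c in range(col - half, col + half + 1)
            ])
        patch.append(window)
    return patch


# Two toy 4 x 4 driver layers and a 3 x 3 window at the top-left corner.
layers = [
    [[float(r * 4 + c) for c in range(4)] for r in range(4)],
    [[1.0] * 4 for _ in range(4)],
]
patch = extract_patch(layers, row=0, col=0, size=3)
```

With the settings reported in the text, the call would use 15 layers and `size=51`, giving one 51 × 51 window per driver factor for each labeled sample.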
2.3.3. CA Dynamic Iteration Module Based on Urban Inter-City Gravity Model
- (1) Comprehensive Suitability Extraction
The operational process of the RF-CNN-SAI-CA model (Figure 7) is as follows:
(1) Urban land use change data are obtained by detecting changes between the land use classification data of the two periods. After the spatial reference system and resolution are unified with those of the 15 driving factor layers, a spatial database of dependent and independent variables in the study area is established.
(2) Stratified random sampling is conducted on the preprocessed unified dataset. First, the label data for the six land types in the study area are counted, giving 1,844,778 instances of label 1, 552,571 of label 2, 216,804 of label 3, 65,238 of label 4, 193,713 of label 5, and 40 of label 6. A random sample, nominally 70% of each label category, is then drawn to form the training and testing dataset: 1,200,000 from label 1, 380,000 from label 2, 150,000 from label 3, 40,000 from label 4, 130,000 from label 5, and 30 from label 6. These samples are merged and shuffled into a dataset of 1,900,030 instances across the six land use types, which is split into a training set (70%) and a testing set (30%) and fed into a random forest model for training and testing. The comprehensive land use data from 2010 are then applied to the trained RF model to obtain the initial conversion probability (urban development suitability) for each cell.
(3) The neighborhood effect is determined with the expanded neighborhood algorithm. If the neighborhood expands to 11 × 11 and the probability remains zero, a Convolutional Neural Network (CNN) model is trained on the preprocessed driving factor data. The trained CNN model, combined with the comprehensive driving factor dataset, gives the neighborhood effect probability of the cells based on the driving factors.
(4) The spatial anisotropy probability of each cell is computed according to the formula for the Spatial Anisotropy Index (SAI).
(5) The overall transformation probability is calculated from the preliminary urban conversion probability, the neighborhood effect, the spatial anisotropy, and the random factor.
(6) ArcGIS is used to calculate the total urban expansion from the land use data of 2010 and 2020, which serves as the global constraint condition of the Cellular Automata (CA) model.
(7) The overall transformation probability is incorporated into the CA model, the next cell in the neighborhood is selected based on the gravity model, and the iteration continues until the simulation results are obtained.
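The sampling in step (2) can be sketched with the standard library as follows. The per-class sample sizes are the ones reported in the text; the function names, the fixed seed, and the toy labels are illustrative assumptions, and the random forest training itself is not shown.

```python
import random


def stratified_sample(labels, per_class, seed=0):
    """Draw per_class[label] indices at random from each label class, then shuffle."""
    rng = random.Random(seed)
    by_label = {}
    for index, label in enumerate(labels):
        by_label.setdefault(label, []).append(index)
    chosen = []
    for label, count in per_class.items():
        # random.sample raises ValueError if a class has fewer than count cells.
        chosen.extend(rng.sample(by_label[label], count))
    rng.shuffle(chosen)
    return chosen


def split_train_test(indices, train_fraction=0.7):
    """Split an already shuffled index list into training and testing parts."""
    cut = round(len(indices) * train_fraction)
    return indices[:cut], indices[cut:]


# Per-class sample sizes reported in the text; they sum to the stated total.
PAPER_SAMPLE_SIZES = {1: 1_200_000, 2: 380_000, 3: 150_000,
                      4: 40_000, 5: 130_000, 6: 30}
assert sum(PAPER_SAMPLE_SIZES.values()) == 1_900_030

# Toy usage with made-up labels (not the study data).
labels = [1] * 10 + [2] * 6 + [6] * 4
sample = stratified_sample(labels, {1: 7, 2: 4, 6: 3})
train, test = split_train_test(sample)
```

Sampling each class separately keeps the very small class (label 6, with only 40 cells) represented, which a plain random draw over all cells would be unlikely to do.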
- (2) Gravity-Guided Iteration Module
Another method introduced in this article is the gravity model, which determines the magnitude of attraction between cities and helps analyze the impact of intercity attraction on urban expansion. The formula is as follows:
$$F_{ij} = k \, \frac{M_{i} M_{j}}{D_{ij}^{b}} \quad (4)$$
where $F_{ij}$ is the attractiveness of city $i$ to city $j$; $D_{ij}$ is the distance from city $i$ to city $j$; $M_{i}$ and $M_{j}$ represent the quality of city $i$ and city $j$, respectively; and $k$ and $b$ are coefficients determined by the Delphi method (a technique for expert surveys that typically reaches consensus through iterative rounds of opinion collection and feedback). With the coefficients fixed in this way, the gravity between two cities can be computed directly from the two city qualities and the distance between them.
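As an illustration, the snippet below assumes the common form of the gravity model, in which the attraction is proportional to the product of the two city qualities and inversely proportional to a power of the distance. The coefficient names `k` and `b` and their default values are placeholders, not the values obtained by the authors with the Delphi method.

```python
def gravity(quality_i, quality_j, distance, k=1.0, b=2.0):
    """Attraction between two cities: k * M_i * M_j / D_ij ** b.

    k and b are placeholder coefficients (assumed defaults); the text
    determines its coefficients by expert survey and their values are
    not reproduced here.
    """
    if distance <= 0:
        raise ValueError("distance must be positive")
    return k * quality_i * quality_j / distance ** b


# Toy numbers, not taken from Tables A1-A3.
pull = gravity(quality_i=4.0, quality_j=9.0, distance=3.0)
```

Because the two qualities enter symmetrically, the attraction of city i to city j equals that of city j to city i in this form.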
Urban quality indicators: Urban quality indicators are indicators that reflect the comprehensive strength of a city. In urban socio-economic development, four indicators (population, gross regional product, total retail sales of consumer goods, and total import and export) are significant criteria for judging whether a city is developed. The urban quality index, Formula (5), is therefore expressed in terms of these four quantities: the regional GDP, the population, the total retail sales of consumer goods, and the total import and export volume.
Distance indicators involve geographical distance as well as subjective factors such as social, psychological, political, and cultural aspects. However, due to the difficulty in measuring subjective factors like social, psychological, political, and cultural distances, this paper employs geographical distance indicators to measure distances between cities. This choice is made because geographical distance is a relatively easy-to-measure objective indicator that can provide actual spatial gap information between cities, without being influenced by subjective factors.
Based on land transportation, this paper uses the geometric mean of three indicators (road distance, railway distance, and spatial latitude and longitude distance) to describe the distance between Chongqing and the surrounding cities, namely:
$$D = \sqrt[3]{D_{1} \times D_{2} \times D_{3}} \quad (6)$$
where $D$ is the distance, $D_{1}$ is the highway mileage, $D_{2}$ is the railway mileage, and $D_{3}$ is the spatial longitude and latitude distance.
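The geometric mean of the three distance indicators can be computed as in the short sketch below. The function and argument names are illustrative, and the numbers are toy values rather than entries from Table A2.

```python
import math


def composite_distance(road, rail, straight_line):
    """Geometric mean of road, railway, and longitude-latitude distances."""
    values = (road, rail, straight_line)
    if any(v <= 0 for v in values):
        raise ValueError("all distances must be positive")
    return math.prod(values) ** (1.0 / len(values))


# Toy values: the cube root of 8 * 27 * 1 is 6.
d = composite_distance(road=8.0, rail=27.0, straight_line=1.0)
```

A geometric mean keeps the result in the same unit as the inputs and is less dominated by a single large value than an arithmetic mean would be.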
This article examines the relationship between urban development and intercity gravity for the 21 districts in the main city of Chongqing (Yuzhong District, Dadukou District, Jiangbei District, Shapingba District, Jiulongpo District, Nan’an District, Beibei District, Yubei District, Banan District, Fuling District, Changshou District, Jiangjin District, Hechuan District, Yongchuan District, Nanchuan District, Qijiang District, Dazu District, Bishan District, Tongliang District, Tongnan District, Rongchang District). With the main city of Chongqing as the center point, Qianjiang, Zunyi, Neijiang, Guang’an, Tongren, Luzhou, Chengdu, and Dazhou were selected as the research cities in the eight directions east, south, west, north, southeast, southwest, northwest, and northeast, respectively, and the gravitational relationship between the 21 districts and these 8 surrounding cities was explored. Gravity is then integrated into the CA iteration module to create a gravity-based CA dynamic iteration model (Figure 8).
According to Formula (5), the comprehensive quality of each city is calculated (Table A1); based on Formula (6), the comprehensive distance from each district in the main city of Chongqing to the surrounding cities is calculated (Table A2); and, using the data from Table A1 and Table A2 in conjunction with Formula (4), the gravitational pull between the 21 main districts of Chongqing and the 8 surrounding cities is computed (Table A3). To better analyze the impact of these gravitational values, we normalized them, transforming the data into a distribution with a mean of 0 and a standard deviation of 1. In this paper, the gravitational value of the cell at position $(i,j)$ with respect to the surrounding cities is denoted $G_{ij}$, and it is integrated into the overall conversion probability as $\tilde{P}_{ij}^{t} = P_{ij}^{t} \times \tilde{G}_{ij}$, where $P_{ij}^{t}$ is the overall conversion probability of the cell and $\tilde{G}_{ij}$ is the normalized value of $G_{ij}$. This term reflects the external directional probability produced by the magnitude of the gravitational forces, contributing to the external directionality of cells. Because the parameters used to compute gravity are related to the population and economic factors of construction land, this paper applies the module only when simulating construction land (impervious surface), in order to explore the impact of gravitational forces from surrounding cities on the simulated urban development.
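The normalization described here, to a mean of 0 and a standard deviation of 1, corresponds to a z-score transform. A minimal sketch follows; the function name and the toy values are assumptions, and the population standard deviation is used so that the result has a standard deviation of exactly 1.

```python
import statistics


def standardize(values):
    """Rescale values to mean 0 and (population) standard deviation 1."""
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)
    if std == 0:
        raise ValueError("all values are equal; the z-score is undefined")
    return [(v - mean) / std for v in values]


# Toy gravity values, not entries from Table A3.
z = standardize([2.0, 4.0, 6.0])
```

Standardized values below the mean are negative. The text does not say how such values are treated before they multiply a probability, so this sketch stops at the normalization step.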
The specific calculation process begins by using the ArcMap tool to process the administrative boundary vector map of the 21 main urban districts in Chongqing (Figure 9a). The vector map is converted into raster data at a uniform resolution of 100. Rasterization is based on the NAME attribute of the districts, so that each district has a unique label, from 1 to 21, in the resulting raster layer (Figure 9b). Next, a starting cell is randomly selected from the land-use layer of the base year as the starting point of the Cellular Automaton (CA) dynamic simulation, and a 3 × 3 Moore neighborhood is constructed around it as the basic structure for iteration. In each iteration, the region label of the current cell is looked up by its coordinates in Figure 9b, which determines the district to which the cell belongs. The precomputed district-specific gravitational values from Table A3 are then compared across the eight positions of the 3 × 3 Moore neighborhood (the 8 grid positions represent the 8 directions), and the direction with the highest gravitational value is chosen as the target of the next iteration. At the selected iteration point, the probability of construction land is multiplied by the normalized gravitational value, and the land type conversion at that point is determined jointly by the conversion probabilities of the other five land types and the enhanced probability of conversion to construction land: among all possible conversions, the type with the highest probability is selected. The count of this land type is then incremented by 1, and the iteration continues until the planned 2020 counts of the six land use types are reached. Once the count of a land type reaches its planned value, the comprehensive conversion probability of that land type is no longer considered in subsequent iterations.
In Figure 8, red represents the seed cell, and the brown area represents the 3 × 3 Moore neighborhood constructed around it. The numbers 1–8 represent the magnitude of the gravitational force between the central seed cell and Chengdu, Guang’an, Dazhou, Neijiang, Qianjiang, Luzhou, Zunyi, and Tongren, respectively. The point with the maximum gravitational force is chosen as the seed cell of the next iteration, and the overall transformation probability of the construction land type at this point is multiplied by the normalized gravitational force value. The conversion status of the cell is then determined from the overall conversion probabilities of the six land use types at that cell. The process continues, constructing a 3 × 3 Moore neighborhood at each iteration, until the land quantity reaches the land demand of the current year, at which point the iteration stops.
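One gravity-guided iteration step can be sketched as follows. The city-to-direction mapping follows the text (Qianjiang east, Zunyi south, Neijiang west, Guang’an north, Tongren southeast, Luzhou southwest, Chengdu northwest, Dazhou northeast). The row index growing southward, the land-type names, the function names, and all numeric values are assumptions for illustration.

```python
# (row offset, column offset) of the Moore neighbor facing each city;
# rows are assumed to increase southward and columns eastward.
DIRECTIONS = {
    "Chengdu": (-1, -1), "Guang'an": (-1, 0), "Dazhou": (-1, 1),
    "Neijiang": (0, -1),                      "Qianjiang": (0, 1),
    "Luzhou": (1, -1),   "Zunyi": (1, 0),     "Tongren": (1, 1),
}


def next_seed(cell, gravity_by_city):
    """Move to the neighbor that faces the city with the largest gravity."""
    city = max(gravity_by_city, key=gravity_by_city.get)
    d_row, d_col = DIRECTIONS[city]
    return (cell[0] + d_row, cell[1] + d_col), city


def choose_land_type(probabilities, gravity_weight, counts, quotas,
                     built_up="built_up"):
    """Pick the most probable land type after weighting the built-up class.

    Types whose count has reached its quota are skipped; None is returned
    when every quota is already met.
    """
    adjusted = dict(probabilities)
    adjusted[built_up] = adjusted[built_up] * gravity_weight
    candidates = {name: p for name, p in adjusted.items()
                  if counts.get(name, 0) < quotas[name]}
    if not candidates:
        return None
    return max(candidates, key=candidates.get)


# Toy values for one district and one cell (not from Table A3).
gravity_by_city = {"Chengdu": 0.9, "Guang'an": 0.2, "Dazhou": 0.1,
                   "Neijiang": 0.5, "Qianjiang": 0.05, "Luzhou": 0.3,
                   "Zunyi": 0.4, "Tongren": 0.02}
seed, city = next_seed((10, 10), gravity_by_city)
probabilities = {"built_up": 0.3, "arable": 0.4}
quotas = {"built_up": 5, "arable": 5}
first = choose_land_type(probabilities, 1.5, {"built_up": 0}, quotas)
capped = choose_land_type(probabilities, 1.5, {"built_up": 5}, quotas)
```

In this toy run the seed moves northwest toward Chengdu, the weighted built-up probability wins while its quota is open, and arable land is chosen once the built-up quota is full.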