LUCC Simulation Based on RF-CNN-LSTM-CA Model with High-Quality Seed Selection Iterative Algorithm

Liu, Minghao; Chen, Haiyan; Qi, Liai; Chen, Chun

doi:10.3390/app13063407

by Minghao Liu ¹,²,³,*, Haiyan Chen ¹,²,³, Liai Qi ¹,²,³ and Chun Chen ⁴

¹ College of Computer Science and Technology, Chongqing University of Posts and Telecommunications, Chongqing 400065, China

² Tourism Multi-Source Data Perception and Decision Technology Laboratory, Chongqing University of Posts and Telecommunications, Chongqing 400065, China

³ Chongqing Spatial Big Data Intelligent Engineering Research Center, Chongqing 400065, China

⁴ Urban Planning/Eco-Habitat and Green Transportation Research Center, College of Architecture, Chongqing Jiaotong University, Chongqing 400074, China

* Author to whom correspondence should be addressed.

Appl. Sci. 2023, 13(6), 3407; https://doi.org/10.3390/app13063407

Submission received: 13 January 2023 / Revised: 14 February 2023 / Accepted: 5 March 2023 / Published: 7 March 2023

Abstract: Land use/cover change (LUCC) models are essential for studying the profound impact of land use/cover dynamics on the natural and social environment. Cellular Automata (CA) are widely used in the dynamic modeling of complex LUCC systems. Traditional machine learning CA models obtain neighborhood features with statistical methods, which usually learn the spatio-temporal features of neighborhood factors insufficiently. In addition, CA dynamic iteration modules that rely on a random seed selection mechanism often select seeds very inefficiently. In this paper, taking the Chongqing Metropolitan Area as an example, a convolutional neural network (CNN)-Long Short-Term Memory network (LSTM) is introduced to improve how the traditional random forest (RF)-CA model learns the spatial and temporal characteristics of neighborhood factors. The CNN is used to extract the spatial features of LUCC in the neighborhood, and the LSTM model is used to extract the temporal features and long-term dependencies. At the same time, a high-quality seed selection iterative algorithm (HQSSIA) is used to improve the accuracy of the multi-land-use dynamic change model and the efficiency of the iterative algorithm. The results show that the proposed model performs better than the other models in simulating LUCC from 2015 to 2020 (Kappa = 0.9684, FOM = 0.1744, Accuracy = 0.9829, F1 = 0.9641, Hamming = 0.0171) and from 2010 to 2020 (Kappa = 0.9599, FOM = 0.4662, Accuracy = 0.9785, F1 = 0.8113, Hamming = 0.0214). After introducing the CNN-LSTM model, the Figure of Merit (FOM) increased by 1.56% and 18.88% for 2015–2020 and 2010–2020, respectively. Compared with the CA model based on a random seed selection algorithm, the FOM of the model using HQSSIA in the dynamic iteration module improved by 11.60% and 24.79% for 2015–2020 and 2010–2020, respectively, and the operation efficiency of the dynamic iteration module improved by about 19 times. Compared with the current mainstream LUCC models, the FOM of the proposed model is 14.38% and 37.55% higher than that of PLUS, and 14.93% and 37.74% higher than that of FLUS, for 2015–2020 and 2010–2020, respectively. The research shows that: (1) the RF-CNN-LSTM-CA model not only retains the interpretability advantage of the traditional RF-CA model, but also improves the accuracy of the whole model by learning the spatio-temporal characteristics of neighborhood factors more fully through deep learning; (2) the HQSSIA can quickly and accurately find the cells with higher conversion probability in the observed data, which not only significantly reduces the time complexity of the model, but also improves the accuracy of LUCC simulation.

Keywords: neighborhood spatio-temporal features; high-quality seed selection iterative algorithm (HQSSIA); RF-CNN-LSTM-CA model; Chongqing metropolitan area

1. Introduction

Land use/cover change (LUCC) models not only strengthen the understanding of the driving factors and key processes behind dynamic change through quantitative means, but also support decision-makers in formulating land use/cover policies under different development needs through scenario analysis [1]. At present, LUCC models have become a powerful tool for studying the process of LUCC and supporting the preparation of territorial spatial planning. Among the many LUCC models, the Cellular Automata (CA) model has become the mainstream model of LUCC simulation because of its powerful spatial and temporal modeling ability [2]. The CA model changes cell states through simple local rules, which together produce macro-scale LUCC results.

The CA model is mainly composed of four elements: space, cell, neighborhood effect, and transformation rules [3]. In the traditional CA model, the space is composed of two-dimensional grid space; cell is a unit in a two-dimensional grid; the state of the cell represents the land use type; and the probability of cell transformation into other land use types is determined by neighborhood effect and transformation rules [4]. Neighborhood effect is a function that reflects the interaction between the current cell and the neighborhood cell in the CA model [5]. The transformation rules measure the possibility of cell state transition under the influence of each spatial variable.

The core problem in the CA model is how to effectively extract the land conversion rules. In previous studies, many scholars have proposed machine learning methods to obtain the conversion rules of the urban CA model, such as logistic regression [6,7,8,9], the ant colony algorithm [10], the genetic algorithm [11], the simulated annealing algorithm [12], the random forest (RF) algorithm [5,13,14], artificial neural networks [15,16], and convolutional neural networks [17,18,19]. Among them, the RF algorithm has the advantages of high accuracy, moderate time complexity, strong resistance to overfitting, and elimination of multicollinearity between driving factors. It can also make the LUCC model interpretable through importance measures of the driving factors. The coupling of the RF algorithm and CA has been verified not only to obtain better simulation accuracy but also to help reveal the mechanism of complex LUCC.

There are complex neighborhood interactions in LUCC [20]. In previous studies, statistical data were used as neighborhood driving factors to describe the neighborhood relationship of a place, such as neighborhood density [21], extended neighborhood enrichment [22], spatial autocorrelation factor [23], etc. However, these simple spatial statistical methods do not have a strong ability to capture the spatial characteristics of land use/cover data and cannot mine the time dimension characteristics [24].

LUCC simulation is essentially a time series modeling problem, and the process of LUCC will be affected by historical land use/cover patterns [18]. The convolutional neural network (CNN) model in the deep learning model uses a convolution kernel to extract image information, which can fully consider the neighborhood information of each pixel in the image [25]. Time series data prediction models, such as the Recurrent Neural Network (RNN) model, Long Short-Term Memory (LSTM) model, and Gated Recurrent Unit (GRU) model can mine historical trends from long-term series data [26]. Many scholars combine these two models to extract spatiotemporal features in land use/cover data. Liu et al. (2022) used the neighborhood spatial features extracted by the CNN model as the input of the LSTM model to fully explore the intrinsic relationship between historical LCCs [27]. Xiao et al. (2022) combined spatial and temporal neighborhood feature learning in the CNN-GRU model to supplement the extraction of spatiotemporal neighborhood features and long-term dependence in LUCC simulation research [28].

The cellular evolution mechanism is also a key issue in CA model construction, including the seed selection, seed planting, and seed growth mechanisms [29]. In the dynamic iteration process of the CA model, many studies select seeds with random algorithms to simulate the randomness and decisiveness of LUCC in real scenes [30]; random algorithms help determine the precise location of LUCC [29]. Liu Tianlin et al. [31] used a random selection algorithm and a threshold to determine the starting seed point of patch growth in the iterative process of the CA model. Xun Liang et al. [14] used the Monte Carlo algorithm to generate change seeds on the surface of land use/cover growth probability and used the decline threshold to control the organic growth and natural growth of patches. However, the existing random algorithms cannot fit the complex geographical phenomena in nature well, and they cause a loss of simulation accuracy and a waste of time. To solve this problem, we propose the high-quality seed selection iterative algorithm (HQSSIA), which uses the existing land conversion probability to select the change seed points and locates the cells with higher conversion probability faster and more accurately.

In this paper, a RF-CNN-LSTM-CA land use change simulation model based on HQSSIA is proposed. The model mainly includes three important parts: (1) using CNN-LSTM model to extract spatio-temporal features from land use/cover data and inputting them into RF model as one of the driving factors; (2) using RF model to establish the relationship between spatial variables and LUCC observation data, so as to extract the change suitability of LUCC; and (3) completing the simulation of LUCC in the dynamic iteration module of the CA model based on HQSSIA. This model not only retains the interpretability of the traditional RF-CA model, but also gives full play to the strong spatio-temporal feature mining advantages of deep learning. In order to make the land use model better mine the neighborhood effect in the process of long-term land use change in the study area, the CNN-LSTM model is introduced to improve the learning of the spatio-temporal characteristics of neighborhood drivers in the traditional RF-CA model. In order to solve the problem of low accuracy and efficiency of seed selection in the dynamic iterative module of CA model, a CA model based on HQSSIA is designed. Using the proposed model, this paper analyzes the influencing factors of land use change in Chongqing main city metropolitan area from 2015 to 2020, and predicts the land use pattern of Chongqing main city metropolitan area in 2025.

2. Materials and Methods

2.1. Study Area

Chongqing is a municipality directly under the Central Government of the People’s Republic of China, one of the important central cities of the country approved by the State Council, and a financial center in western China. Chongqing is located in the southwest of Inland China. The northwest and central parts are dominated by hills and low mountains. In May 2020, Chongqing clarified the concept of “21 districts in the main urban area” for the first time. The original 9 districts in the main urban area (Yuzhong, Dadukou, Jiangbei, Shapingba, Jiulongpo, Nanan, Beibei, Yubei, and Banan) were the central city areas, and the 12 districts in the western Chongqing area (Fuling, Changshou, Jiangjin, Hechuan, Yongchuan, Nanchuan, Qijiang, Dazu, Bishan, Tongliang, Tongnan, and Rongchang) were the main urban new areas (Figure 1). The 21 districts in the main urban area have an area of 28,700 square kilometers, a permanent population of 20.27 million people, and a total economic output of CNY 1.8 trillion.

2.2. Data

A series of natural geographical and socio-economic factors were used to quantify the suitability of different land types, including elevation, slope, soil type, nighttime light value, population spatial distribution data, Gross Domestic Product (GDP) spatial distribution data, point density of point of interest (POI), and several variables based on proximity (Table 1).

The land cover data are the global 30-meter fine land cover dynamic monitoring product produced by the team of Liu Liangyun and Zhang Xiao from 1985 to 2020 (https://data.casearth.cn/, accessed on 27 December 2021) [32]. The update period of the land cover data is 5 years and includes 29 land cover types. According to the characteristics of land use/cover in Chongqing, the land use/cover map is divided into six categories: forest, cropland, grass, shrub, impervious surfaces, and water area.

The GDP dataset is provided by the Resource and Environmental Science and Data Center (https://www.resdc.cn/, accessed on 8 November 2021) every 5 years. The variables based on proximity are Euclidean distances to roads, water area systems, and railways at all levels. The source data come from the national basic geographic dataset provided by the national geographic information resource directory service system.

Elevation data come from the ASTER GDEM V3 dataset (https://search.earthdata.nasa.gov/, accessed on 3 September 2021), developed by METI and NASA and distributed to the public free of charge. Slope data are calculated from the elevation data.

POI data are provided by the Gaode open platform (https://lbs.amap.com/, accessed on 30 July 2020), including the locations of hospitals, railway stations, bus stations, parks and infrastructure, government agencies, highway toll stations, shopping malls, etc. The POI point density raster data for each category were calculated in ArcMap 10.7.

Population data were extracted from the WorldPop website's annual global population dataset for individual countries, which covers 2000 to 2020 at a resolution of 3 arc-seconds, in units of the number of people per pixel.

Nighttime light data are derived from the National Earth System Science Data Center (http://www.geodata.cn, accessed on 5 November 2021). This product has parameter properties consistent with NPP-VIIRS nighttime light data and data quality similar to NPP-VIIRS and solves the problem that DMSP-OLS and NPP-VIIRS nighttime light data cannot be used simultaneously.

The soil texture type data are derived from the Harmonized World Soil Database (HWSD). The data are grid data with a resolution of kilometers, providing information, such as soil type, soil phase, and soil physical and chemical properties, of each grid point.

To build a unified driving factor dataset, all the above datasets were processed to the same study extent, spatial coordinate system, and resolution (100 m) (Figure 2).

2.3. Methods

The proposed model mainly includes four modules (Figure 3): (1) the neighborhood spatio-temporal feature extraction module based on the CNN-LSTM model; (2) the LUCC suitability extraction module based on the RF model; (3) the dynamic iteration module of the CA model based on HQSSIA; and (4) the model validation and scenario simulation module.

2.3.1. Neighborhood Spatio-Temporal Feature Extraction Module

CNN is a kind of deep feedforward neural network with convolution operation as the core, which usually includes input layer, convolution layer, pooling layer, fully connected layer, output layer, and other modules. The feature extraction of data by convolutional neural network mainly depends on the convolution kernel in the convolution layer. The convolution kernel generally performs convolution operations in a step-by-step manner to extract data features (Figure 4).

LSTM is a special RNN that alleviates the gradient explosion and vanishing problems that arise when training on long time series, and is therefore suitable for long time series analysis and prediction. Compared with the standard RNN, LSTM adds a memory state unit to the hidden layer node to store past information, and uses three gate structures (the input gate, the forget gate, and the output gate) to control the forgetting and updating of historical information, so that the information is effectively filtered. The basic structure of LSTM is shown in Figure 5.

In Figure 5, $h_{t-1}$ is the output data at time $t-1$ and $x_{t}$ is the input data at time $t$; $\sigma$ is the sigmoid function and tanh is the hyperbolic tangent activation function; $f_{t}$, $i_{t}$, and $o_{t}$ represent the forget gate layer, the input gate layer, and the output gate layer, respectively; $\tilde{C}_{t}$ is a vector of new candidate values created by a tanh activation function; $h_{t}$ is the output data at time $t$; and $C_{t-1}$ and $C_{t}$ are the memory state vectors at times $t-1$ and $t$.

The formulae for $f_{t}$, $i_{t}$, $o_{t}$, $\tilde{C}_{t}$, $C_{t}$, and $h_{t}$ are as follows:

$$f_{t} = \sigma(W_{f} \cdot [h_{t-1}, x_{t}] + b_{f}) \tag{1}$$

$$i_{t} = \sigma(W_{i} \cdot [h_{t-1}, x_{t}] + b_{i}) \tag{2}$$

$$\tilde{C}_{t} = \tanh(W_{C} \cdot [h_{t-1}, x_{t}] + b_{C}) \tag{3}$$

$$C_{t} = f_{t} \odot C_{t-1} + i_{t} \odot \tilde{C}_{t} \tag{4}$$

$$o_{t} = \sigma(W_{o} \cdot [h_{t-1}, x_{t}] + b_{o}) \tag{5}$$

$$h_{t} = o_{t} \odot \tanh(C_{t}) \tag{6}$$

where $W_{i}$, $W_{f}$, $W_{o}$, and $W_{C}$ represent the weight matrices of the respective layers, and $b_{i}$, $b_{f}$, $b_{o}$, and $b_{C}$ represent the corresponding bias terms.

According to the input $x_{t}$ at the current moment, the output $h_{t-1}$ at the previous moment, and the memory state $C_{t-1}$ at the previous moment, the output $h_{t}$ and the memory state $C_{t}$ at the current moment can be obtained and passed on as the input of the next time step. The LSTM model, which continuously collects and filters data in chronological order for iterative calculation, has a strong ability to mine temporal features.
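To make the update in Equations (1)-(6) concrete, the following is a minimal sketch of a single LSTM time step in plain Python. The function names (`sigmoid`, `affine`, `lstm_step`) and the `params` dictionary layout are assumptions made for illustration; they are not taken from the authors' implementation, which would normally rely on a deep learning framework.

```python
import math


def sigmoid(z):
    """Logistic sigmoid used by the three gates."""
    return 1.0 / (1.0 + math.exp(-z))


def affine(W, v, b):
    """Return W . v + b, with W given as a list of rows."""
    return [sum(w * x for w, x in zip(row, v)) + bias
            for row, bias in zip(W, b)]


def lstm_step(x_t, h_prev, c_prev, params):
    """One LSTM step following Equations (1)-(6).

    params (hypothetical layout) maps 'f', 'i', 'C', 'o' to (W, b) pairs.
    Each W has one row per hidden unit and len(h_prev) + len(x_t) columns.
    """
    z = list(h_prev) + list(x_t)  # concatenation [h_{t-1}, x_t]

    def layer(name, activation):
        W, b = params[name]
        return [activation(a) for a in affine(W, z, b)]

    f_t = layer("f", sigmoid)        # Eq. (1): forget gate
    i_t = layer("i", sigmoid)        # Eq. (2): input gate
    c_tilde = layer("C", math.tanh)  # Eq. (3): candidate values
    # Eq. (4): element-wise update of the memory state
    c_t = [f * c + i * g for f, c, i, g in zip(f_t, c_prev, i_t, c_tilde)]
    o_t = layer("o", sigmoid)        # Eq. (5): output gate
    # Eq. (6): element-wise output
    h_t = [o * math.tanh(c) for o, c in zip(o_t, c_t)]
    return h_t, c_t
```

Calling `lstm_step` repeatedly, feeding each returned `(h_t, c_t)` back in as `(h_prev, c_prev)`, reproduces the chronological iteration described above.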

This paper combines the advantages and characteristics of the above two networks to construct a CNN-LSTM coupling model. In the data processing stage, the collected land use/cover data are divided into two time series data by sliding window. The land use/cover data of 2000, 2005, and 2010 were used as a set to establish a link with land use/cover change data from 2010 to 2015. The land use/cover data of 2005, 2010 and 2015 are used as a set to predict land use/cover changes from 2015 to 2020. Then the 3 × 3 Moore neighborhood of each grid in these land use data is extracted. The Moore neighborhood is a square neighborhood, a set of units around a given unit. All the Moore neighborhood data provide spatial information for the CNN-LSTM model.
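The sliding-window split and the 3 × 3 Moore neighborhood extraction can be sketched as follows. This is an illustrative sketch only: the function names, the nested-list raster representation, and the choice of padding cells outside the map with 0 are assumptions, not details given in the paper.

```python
def moore_neighborhood(grid, row, col, pad=0):
    """Return the 3 x 3 Moore neighborhood centred on (row, col).

    Cells that fall outside the map are filled with `pad` (an assumption;
    the paper does not state how map edges are handled).
    """
    n_rows, n_cols = len(grid), len(grid[0])
    return [[grid[r][c] if 0 <= r < n_rows and 0 <= c < n_cols else pad
             for c in range(col - 1, col + 2)]
            for r in range(row - 1, row + 2)]


def sliding_windows(years, length=3):
    """Split sorted years into (input years, target year) pairs."""
    return [(years[s:s + length], years[s + length])
            for s in range(len(years) - length)]


def neighborhood_cube(maps, years, row, col):
    """Stack the neighborhoods of one cell in time order (T x 3 x 3)."""
    return [moore_neighborhood(maps[y], row, col) for y in years]
```

With the years 2000, 2005, 2010, 2015, and 2020, `sliding_windows` yields the two series used here: (2000, 2005, 2010) with target 2015, and (2005, 2010, 2015) with target 2020.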

During model training, the normalized data are input into the CNN for spatial feature extraction, and the extracted spatial features are then input into the LSTM model for temporal feature learning. The specific process by which the CNN-LSTM model extracts neighborhood spatio-temporal features is shown in Figure 6: (1) the land use/cover data of 2000, 2005, and 2010 are clipped with a 3 × 3 window and combined into a data cube in time order to form a neighborhood dataset; (2) the land use/cover data of 2010 and 2015 are overlaid for LUCC detection, and the change data from 2010 to 2015 are used as the label dataset, with the label categories expressed by the numbers 1–6, that is, conversion to each of the six land types; (3) the above two datasets are integrated as a training set and input into the CNN-LSTM model for full-sample learning; (4) the land use/cover neighborhood datasets of 2005, 2010, and 2015 are put into the trained CNN-LSTM model for prediction; and (5) finally, the neighborhood probabilities of conversion to the six land use/cover types (cropland, grass, forest, shrub, impervious surfaces, and water area) are output through the fully connected layer (Figure 7).

2.3.2. Land Suitability Extraction Module

In this paper, the multi-type LUCC suitability extraction problem is transformed into several binary classification problems. The expansion of each land type is treated separately so that the suitability for conversion to each of the 6 land types can be extracted. The specific steps are as follows: (1) taking the other spatial driving factors and the neighborhood effect as independent variables, the expansion data of the six land types between 2015 and 2020 are extracted as dependent variables; (2) 10% of the data are drawn by stratified sampling and input into the RF model for training; and (3) all the driving factor data are put into the trained RF model for prediction to obtain the expansion suitability of the 6 land types. Based on the idea of ensemble learning, the random forest model combines multiple decision trees into a forest and uses the majority voting rule to calculate the probability $P_{i,k}$ of converting the land use type of cell $i$ to $k$. The formula for calculating $P_{i,k}$ is as follows:

$$P_{i,k} = \frac{\sum_{m=1}^{n} I(h_{m}(x) = Y_{k})}{n} \tag{7}$$

where the value of $Y_{k}$ is 0 or 1: the value 1 indicates that other land use types are converted to land use type $k$, and the value 0 indicates other conversions; $x$ is a vector composed of multiple driving factors; $h_{m}(x)$ represents the prediction result of the $m$th decision tree according to $x$; $I(\cdot)$ is the indicator function of the classification result; and $n$ is the total number of decision trees.
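Equation (7) is simply the share of trees that vote for conversion. A minimal sketch, assuming the per-tree predictions for one cell are already available as a list of 0/1 votes (the function name and input format are illustrative assumptions):

```python
def conversion_probability(tree_votes, target=1):
    """P_{i,k} of Equation (7): fraction of trees predicting `target`.

    tree_votes holds h_m(x) for m = 1..n, one 0/1 prediction per tree.
    """
    n = len(tree_votes)
    if n == 0:
        raise ValueError("at least one decision tree is required")
    return sum(1 for vote in tree_votes if vote == target) / n
```

For example, if 3 of 4 trees predict conversion to type k for a cell, the suitability of that cell for type k is 0.75.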

2.3.3. Dynamic Iteration Module

The execution order of the dynamic iteration module is determined by the unallocated quantity of various types of land, and the land types with large demand for land expansion are given priority. Before the iteration starts, the overall conversion probability of each type of land is sorted according to the size of the value, and the corresponding cell position (i.e., high-quality seed points) is recorded. Each land use type selects the location of the expansion from large to small according to the overall conversion probability during the iteration, and then performs the following seed planting. Seed planting involves multi-purpose land competition, and the basis for winning is the overall conversion probability. The specific process of the dynamic iteration process is shown in Figure 8.

In Figure 8, $OP_{i,k}$ denotes the overall transition probability of land type $k$ on cell $i$, where cell $i$ is a selected high-quality seed point. The formula is:

$$OP_{i,k}^{t} = P_{i,k} \times C_{i,k}^{t} \times R \tag{8}$$

where $k$ represents the state of cell $i$; $P_{i,k}$ represents the LUCC suitability at cell $i$; and $C_{i,k}^{t}$ is the constraint condition, which limits the expansion of some land types according to local policies or the mutual conversion of historical land types. The expression of $C_{i,k}^{t}$ is:

$$C_{i,k}^{t} = \mathrm{con}(S_{i}^{t-1} = \mathrm{Landuse}_{k}) \tag{9}$$

where $S_{i}^{t-1}$ represents the state of cell $i$ at iteration $t-1$. When $S_{i}^{t-1}$ is a restricted land type, the value of $\mathrm{con}()$ is 1; otherwise, the value of $\mathrm{con}()$ is 0. In this paper, impervious surfaces cannot be converted to other land use/cover types because of the high cost of such a conversion.

According to the study of Liu Chunlin et al. [27], the random disturbance factor $R$ is introduced to make the simulation results more in line with the period. The expression of $R$ is:

$$R = 1 + (-\ln \gamma)^{\alpha} \tag{10}$$

where $\gamma$ is a uniform random variable in the interval from 0 to 1, and $\alpha$ is the parameter that controls the size of the random disturbance, which is 5 in this paper.
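The disturbance factor of Equation (10) can be sketched in a few lines. The function names and the use of Python's `random` module are assumptions for illustration; the sampler draws from (0, 1] rather than [0, 1) so that the logarithm is always defined.

```python
import math
import random


def random_disturbance(gamma, alpha=5):
    """R = 1 + (-ln gamma)^alpha from Equation (10), for gamma in (0, 1]."""
    if not 0.0 < gamma <= 1.0:
        raise ValueError("gamma must lie in (0, 1]")
    return 1.0 + (-math.log(gamma)) ** alpha


def sample_disturbance(rng, alpha=5):
    """Draw gamma uniformly and return R (1 - random() avoids gamma = 0)."""
    return random_disturbance(1.0 - rng.random(), alpha)
```

Because `-ln(gamma)` is non-negative on this interval, `R` is never smaller than 1, so the factor can only raise the overall probability of a cell in Equation (8).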

$TM_{i,k}$ is a transformation matrix that defines whether the original type of cell $i$ is allowed to be converted to type $k$. Considering the changes between different land types every 5 years from 2000 to 2020, the value corresponding to two land types between which no conversion has occurred is 0, indicating that conversion is not allowed. The rest are 1, indicating that conversion is allowed.

$MOP_{i}^{t}$ represents the maximum comprehensive conversion probability of cell $i$ at the current time, and its value is dynamically updated as the number of iterations of the CA model increases. By comparing the comprehensive conversion probabilities of cell $i$ for the different land types, the competition among multiple land types is realized.
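The following is a simplified sketch of the seed selection and competition logic described in this section, not the authors' implementation. It assumes that the overall probabilities $OP_{i,k}$ have already been computed for every cell and land type, that the land type with the largest unmet demand acts first, and that a seed is planted only if that type also has the highest overall probability on the cell. All names (`hqssia_allocate`, `overall_prob`, `demand`) are hypothetical.

```python
def hqssia_allocate(current, overall_prob, demand):
    """Allocate demand to cells in descending order of overall probability.

    current      : list with the land type of every cell
    overall_prob : dict {land type k: list of OP_{i,k} for every cell i}
    demand       : dict {land type k: number of cells k still has to gain}
    Returns the updated list of land types.
    """
    state = list(current)
    remaining = dict(demand)
    # Rank candidate cells once per land type (the high-quality seed points).
    ranked = {
        k: sorted((i for i in range(len(state)) if state[i] != k),
                  key=lambda i, k=k: -overall_prob[k][i])
        for k in remaining
    }
    cursor = {k: 0 for k in remaining}
    changed = set()
    while True:
        active = [k for k in remaining
                  if remaining[k] > 0 and cursor[k] < len(ranked[k])]
        if not active:
            break
        # The land type with the largest unmet demand goes first.
        k = max(active, key=lambda t: remaining[t])
        i = ranked[k][cursor[k]]
        cursor[k] += 1
        if i in changed:
            continue  # the cell was already converted in this run
        # Competition: the type with the highest overall probability wins.
        winner = max(overall_prob, key=lambda t: overall_prob[t][i])
        if winner == k:
            state[i] = k
            changed.add(i)
            remaining[k] -= 1
    return state
```

Because candidates are visited in descending probability order instead of being drawn at random, each land type reaches its most suitable cells first, which is the intuition behind the efficiency gain reported for HQSSIA.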

2.3.4. Accuracy Verification

The Kappa coefficient and Figure Of Merit (FOM) were used to test and evaluate the accuracy of LUCC results simulated by the model. The Kappa coefficient based on the confusion matrix can accurately perform the consistency test. The Kappa value is between 0 and 1, and greater than 0.7 indicates that the simulation results are acceptable [33,34]. The calculation formula of Kappa is as follows:

$$Kappa = \frac{p_{o} - p_{e}}{1 - p_{e}} \tag{11}$$

$$p_{o} = \frac{\sum_{i=1}^{6} w_{i}}{n} \tag{12}$$

$$p_{e} = \frac{\sum_{i=1}^{6} a_{i} \times b_{i}}{n \times n} \tag{13}$$

where $i$ refers to the LUCC type; $n$ is the total number of grids; $w_{i}$ denotes the number of grids whose actual type is class $i$ and whose predicted type is class $i$; $a_{i}$ represents the number of grids of land use type $i$ in the actual observation data; and $b_{i}$ represents the number of grids occupied by type $i$ in the prediction results.
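Equations (11)-(13) translate directly into code. The sketch below assumes the observed and simulated maps have been flattened into two equal-length lists of class labels; the function name is illustrative.

```python
def kappa(actual, predicted):
    """Kappa coefficient following Equations (11)-(13)."""
    n = len(actual)
    classes = set(actual) | set(predicted)
    # p_o: share of cells whose predicted class matches the actual class
    p_o = sum(1 for a, p in zip(actual, predicted) if a == p) / n
    # p_e: agreement expected by chance from the class totals a_i and b_i
    p_e = sum(actual.count(c) * predicted.count(c) for c in classes) / (n * n)
    return (p_o - p_e) / (1 - p_e)
```

As a worked example, for actual labels `[1, 1, 2, 2]` and predicted labels `[1, 1, 2, 1]`, $p_{o} = 0.75$, $p_{e} = (2 \times 3 + 2 \times 1)/16 = 0.5$, and Kappa is 0.5.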

At the same time, Accuracy and F1 were used for evaluation. Accuracy is the proportion of correctly classified samples in the total number of samples, and it is the simplest and most intuitive evaluation index in classification problems. Precision reflects the ability of the model to distinguish negative samples: the higher the Precision, the stronger this ability. Recall reflects the ability of the model to identify positive samples: the higher the Recall, the stronger this ability. F1 is the harmonic mean of Precision and Recall. The higher F1 is, the more robust the model is. Accuracy and F1 involve four measures: TP (true positive), TN (true negative), FP (false positive), and FN (false negative), which are defined in Table 2.

The formulas for Accuracy, Precision, Recall, and F1 are:

$$Accuracy = \frac{TP + TN}{TP + TN + FP + FN} \tag{14}$$

$$Precision = \frac{TP}{TP + FP} \tag{15}$$

$$Recall = \frac{TP}{TP + FN} \tag{16}$$

$$F1 = \frac{2 \times Precision \times Recall}{Precision + Recall} = \frac{2 \times TP}{2 \times TP + FP + FN} \tag{17}$$
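A short sketch of Equations (14)-(17) from the four confusion counts is given below. The function name is illustrative, and the counts used in the worked example are made up purely to show the arithmetic.

```python
def binary_scores(tp, tn, fp, fn):
    """Return (accuracy, precision, recall, f1) per Equations (14)-(17)."""
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    return accuracy, precision, recall, f1
```

With TP = 8, TN = 80, FP = 2, and FN = 10, Accuracy is 0.88 while F1 is only 16/28 ≈ 0.571, which illustrates why Accuracy alone can look favorable when the positive (changed) class is rare.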

Another effective multi-label evaluation method is Hamming loss, which measures the proportion of samples with false prediction in all samples. Therefore, for Hamming loss, the smaller the value, the better the performance of the model.

$$Hamming\ loss = \frac{1}{mq} \sum_{i=1}^{m} \sum_{j=1}^{q} I(y_{j}^{(i)} \neq \hat{y}_{j}^{(i)}) \tag{18}$$

where $m$ is the total number of samples; $q$ represents the number of types of samples; $y_{j}^{(i)}$ represents the $j$th label of the $i$th sample in the observed data; and $\hat{y}_{j}^{(i)}$ represents the $j$th label of the $i$th sample in the predicted results. We use the 'metrics' module of the Python 3 'sklearn' package to calculate Accuracy, F1, and Hamming loss.
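For reference, Equation (18) can also be written out in plain Python. This sketch is independent of the `sklearn` routine used in the paper and assumes each sample is given as a sequence of q binary labels; the function name is illustrative.

```python
def hamming_loss(y_true, y_pred):
    """Hamming loss of Equation (18) for m samples with q labels each."""
    m = len(y_true)
    q = len(y_true[0])
    mismatches = sum(t != p
                     for row_t, row_p in zip(y_true, y_pred)
                     for t, p in zip(row_t, row_p))
    return mismatches / (m * q)
```

For two samples with three labels each, two mismatched labels give a loss of 2/6 ≈ 0.333.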

In LUCC simulation, the accuracy of the model is often verified by comparing the simulation results with the real results. The above metrics only compare the simulation results with the land use/cover data of the simulated year (the last year) to evaluate the model. FOM, in contrast, evaluates the simulation results according to the difference between the actual change (from the base year to the end year) and the simulated change [35]. FOM is therefore better able to evaluate how well a model captures the conversion between land use/cover types. Compared with the above metrics, FOM more accurately reflects the consistency and accuracy of the simulation of complex geographic systems. The FOM value is between 0 and 1, and the higher the value, the higher the accuracy of the model prediction [35]. The FOM calculation method is as follows:

$$FOM = \frac{B}{A + B + C + D} \tag{19}$$

where $A$ is the number of error cells that actually changed but were simulated as unchanged; $B$ is the number of correct cells that changed in both the actual data and the simulation, to the same type; $C$ is the number of error cells that changed in both the actual data and the simulation, but to a different type in the simulation than in the actual data; and $D$ is the number of error cells that did not actually change but were simulated as changed.
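The four FOM components can be counted cell by cell from the base-year map, the observed end-year map, and the simulated end-year map. The sketch below assumes the three maps are flattened into equal-length lists; the function name and the choice to return 0.0 when no cell changes in either map are assumptions for illustration.

```python
def figure_of_merit(base, actual, simulated):
    """FOM of Equation (19) from base, observed, and simulated maps."""
    a = b = c = d = 0
    for s0, obs, sim in zip(base, actual, simulated):
        observed_change = obs != s0
        simulated_change = sim != s0
        if observed_change and not simulated_change:
            a += 1          # A: observed change simulated as persistence
        elif observed_change and simulated_change:
            if sim == obs:
                b += 1      # B: change simulated to the correct type
            else:
                c += 1      # C: change simulated to the wrong type
        elif simulated_change:
            d += 1          # D: persistence simulated as change
    total = a + b + c + d
    # Assumption: report 0.0 when nothing changes in either map.
    return b / total if total else 0.0
```

Cells that persist in both the observed and the simulated map do not enter the formula at all, which is why FOM stays informative even when only a small share of cells changes.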

3. Results

3.1. Analysis of LUCC from 2015 to 2020

In this paper, the land use/cover data in the study area are divided into 3,839,397 cells, each of which is spatially displayed as a grid of 100 m × 100 m. The land use data we obtained from 2000 to 2020 are processed into the same format and geographic coordinate system, so the number and location of these cells remain unchanged over time.

The actual conversions between the land use/cover types during the period from 2015 to 2020 are shown in Table 3. There are 3,792,559 cells with no change in land use/cover type and 46,838 cells with a type conversion, accounting for about 1.24% of the cells. Cells classified as grass, shrub, or impervious surfaces in 2015 were not converted to other land use/cover types, indicating that these three land use/cover types were relatively stable or not easily converted to other types in the study area. The conversion of cropland to impervious surfaces accounted for 63.86% of the total change. This shows that the metropolitan area of Chongqing was in a period of rapid development from 2015 to 2020, and that the expansion of urban land mainly occurred on cropland.

3.2. Ablation Experiment and Comparison with Traditional Model

The Kappa coefficient, FOM, Accuracy, F1, and Hamming loss are used to verify the simulation results of the proposed model, which are compared with the results of five other models: Degraded I (the RF-CA model without neighborhood factors); Degraded II (the CA model based on a random seed selection algorithm); model a (using the neighborhood temporal feature data extracted by the LSTM model); model b (using neighborhood density data as neighborhood driving factors); and model c (using extended neighborhood enrichment data as neighborhood driving factors) (Table 4). Models a, b, and c serve to further verify the superiority of using the CNN-LSTM model to extract neighborhood spatio-temporal features. The neighborhood density data are the area proportions of the various land types in the 3 × 3 Moore neighborhood of the central cell. The extended neighborhood enrichment data are the ratio of the land density in the 9 × 9 Moore neighborhood of the central cell to that in the 3 × 3 Moore neighborhood. Both kinds of neighborhood information are obtained from the single-date land use/cover data of 2015.

As shown in Table 4, the proposed model performs better than the other models in simulating LUCC from 2015 to 2020 (Kappa = 0.9684, FOM = 0.1744, Accuracy = 0.9829, F1 = 0.9641, Hamming = 0.0171) and from 2010 to 2020 (Kappa = 0.9599, FOM = 0.4662, Accuracy = 0.9785, F1 = 0.8113, Hamming = 0.0214). More specifically, the proposed model surpasses the RF-CA model (Degraded I) by 1.56% and 18.88% in FOM for 2015–2020 and 2010–2020, respectively, indicating that using CNN-LSTM to extract neighborhood spatio-temporal features from historical land use/cover patterns as driving factors helps to improve the simulation accuracy of traditional RF-CA models. The FOM of the proposed model is 11.60% and 24.79% higher than that of Degraded II for 2015–2020 and 2010–2020, respectively. This indicates that the CA model based on HQSSIA achieves better spatial simulation accuracy than the CA model based on the random seed selection mechanism. The accuracy of the proposed model is also better than that of models a, b, and c. Therefore, compared with the traditional neighborhood feature extraction methods, the CNN-LSTM model can better learn the neighborhood interaction in the process of complex multi-type land change.

As shown in Table 3, few cells in Chongqing changed state from 2015 to 2020, accounting for about 1.24% of the total. Furthermore, the seed selection process in Degraded II is completely random, so the probability of correctly selecting changed seed points decreases as the number of land changes between the two periods decreases. The high-quality seed mechanism uses the existing LUCC suitability probability data and constraints to pre-screen candidate change locations, which improves the hit rate and accuracy of the seed selection process and thereby reduces the running time of the CA model. The results show that the running time of the CA model based on the random seed selection algorithm is about 19 times that of the CA model based on the HQSSIA, which supports the above analysis (Table 5).
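The contrast between random seeding and pre-screened seeding can be illustrated with a small sketch. It is deliberately simplified and is not the HQSSIA itself: the function names, the top-n selection rule and the toy data are all assumptions made for illustration, with only the idea of screening by suitability probability and constraints taken from the text.

```python
import numpy as np


def random_seeds(n_cells, n_seeds, rng):
    """Baseline: draw seed cells uniformly from the whole map."""
    return rng.choice(n_cells, size=n_seeds, replace=False)


def screened_seeds(suitability, allowed, n_seeds):
    """Keep only cells that satisfy the constraints, then take the
    n_seeds cells with the highest suitability (hypothetical rule)."""
    candidates = np.flatnonzero(allowed)
    order = np.argsort(suitability[candidates])[::-1]
    return candidates[order[:n_seeds]]


def hit_rate(seeds, truly_changed):
    """Share of selected seeds that fall on cells that really changed."""
    return float(np.mean(truly_changed[seeds]))


# Toy data: six cells, cell 3 is excluded by a constraint.
suitability = np.array([0.10, 0.90, 0.80, 0.95, 0.20, 0.70])
allowed = np.array([True, True, True, False, True, True])
truly_changed = np.array([False, True, True, True, False, False])

seeds = screened_seeds(suitability, allowed, n_seeds=2)
screened_hit_rate = hit_rate(seeds, truly_changed)
baseline = random_seeds(suitability.size, 2, np.random.default_rng(0))
```

With uniform sampling the expected hit rate equals the share of changed cells, so when very few cells change most random draws are wasted iterations.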

3.3. Comparison with LUCC Mainstream Model

In the main urban area of Chongqing, the multi-class changes are simulated with the model proposed in this paper and with the current mainstream PLUS model [14] and FLUS model [36], and the three simulation results are compared (Table 6). The FLUS model, proposed in 2017, is an ANN-CA multi-land use/cover simulation model based on the pattern analysis strategy (PAS) and a unit conversion mechanism. It deeply couples the system dynamics model with the CA model to simulate future land use/cover patterns step by step. The PLUS model, proposed in 2020, is an RF-CA multi-LUCC simulation model based on the land expansion analysis strategy (LEAS) and a patch growth strategy. Combining random seed generation with a threshold decline mechanism, it automatically generates simulated patches in space and time and explores the landscape layout of land use/cover patterns under different scenarios. Both models have been widely used since they were proposed and have been well received.

To ensure the comparability and credibility of the experimental results, the proposed model uses the same data as the other two models, and the parameters of the PLUS model and the FLUS model were tuned multiple times to achieve their best simulation results. The experimental results are shown in Table 6. Across all reported metrics, the proposed model performs better than the PLUS model and the FLUS model. Compared with the PLUS model, the FOM of the proposed model is 14.38 and 37.55 percentage points higher for 2015–2020 and 2010–2020, respectively. Compared with the FLUS model, the FOM is 14.93 and 37.74 percentage points higher, respectively.

The dynamic iteration modules of the PLUS model and the FLUS model both build a roulette wheel over all land types to realize multi-land competition, and use the roulette algorithm to determine the land state in the next iteration. Multi-land competition depends on the comprehensive conversion probability, of which the change suitability is the most important component. When an intelligent algorithm is used to extract change suitability, the overall probability of land types with more samples is higher than that of land types with fewer samples, which reduces the overall competitiveness of land types with fewer samples. The PLUS model and the FLUS model use roulette to increase the conversion opportunities of land types with a low comprehensive conversion probability, which does not solve the problem fundamentally. The RF-CNN-LSTM-CA model based on the HQSSIA proposed in this paper converts the cells of each land type successively, which avoids competition between most land types and reduces the probability of decision-making errors in land type competition.
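For readers unfamiliar with roulette wheel selection, the following generic sketch shows the mechanism referred to above. It is the textbook algorithm, not code from PLUS or FLUS; the function name and the example probabilities are assumptions chosen for illustration.

```python
import numpy as np


def roulette_select(probabilities, rng):
    """Pick one land type index with chance proportional to its probability."""
    p = np.asarray(probabilities, dtype=float)
    total = p.sum()
    if total <= 0:
        raise ValueError("at least one probability must be positive")
    cumulative = np.cumsum(p / total)
    r = rng.random()  # uniform draw in [0, 1)
    index = int(np.searchsorted(cumulative, r, side="right"))
    return min(index, len(p) - 1)  # guard against floating-point round-off


# Hypothetical combined conversion probabilities of three land types
# for one cell: a dominant type, a middle type and a rare type.
rng = np.random.default_rng(42)
combined = [0.70, 0.25, 0.05]
draws = np.array([roulette_select(combined, rng) for _ in range(10_000)])
share_dominant = float(np.mean(draws == 0))
share_rare = float(np.mean(draws == 2))
```

The rare type does win occasionally, but only in proportion to its small probability, which is the limitation the paragraph above describes.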

3.4. Importance Analysis of Driving Factors

Land use change is the result of the combination of natural, social, economic and other factors, and it reflects the intensity and rationality of land use, development intensity, economic input, policy orientation, and other characteristics of the study area. A comprehensive analysis of the driving factors of land use change helps to better understand its mechanism and trend, and provides a scientific reference for formulating more reasonable land use policies and promoting the sustainable development of the regional economy and ecology. The RF model can use Out-Of-Bag (OOB) data to calculate the importance of feature variables from a large dataset, thereby revealing the complex relationships between feature variables. In this paper, the RF model is used to analyze and quantify the influence of driving factors, such as neighborhood interaction, on multi-land change. Figure 9 shows the top 10 most important drivers of LUCC.

According to the importance of driving factors in the RF model, it can be concluded that: (1) Neighborhood probability is the most important driving factor in the process of cropland expansion. This indicates that the development of cropland is largely affected by historical land use/cover patterns and follows a certain temporal development law and neighborhood effect. (2) The point density of light rail stations plays an important role in the expansion of impervious surfaces. Light rail stations are highly concentrated in the downtown area, indicating that the urban development of Chongqing is still concentrated in the central city. (3) For shrub and forest, slope is the most important factor. (4) In the process of water area development, the importance of the initial probability of transformation into water area is as high as 0.1097, far greater than that of the other driving factors. (5) In the process of grass expansion, the neighborhood probability of cropland plays the most important role, and the neighborhood probability of grass is also highly informative. Both urban grass and cropland belong to artificial-natural mixed ecosystems, and their development has similar characteristics.

3.5. Future Land Use/Cover Pattern Simulation

According to the LUCC from 2015 to 2020, the Markov model is used to calculate the quantity demand of each category in 2025. Driven by future land use/cover demand, the RF-CNN-LSTM-CA model was used to simulate LUCC in 2025. The land use/cover pattern in 2025 and the land expansion from 2020 to 2025 are shown in Figure 10.
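The Markov demand calculation can be illustrated with the transition counts in Table 3. This is a sketch under stated assumptions rather than the authors' code: it assumes a first-order Markov chain whose transition probabilities are the row-normalized 2015 to 2020 counts, applied once to the 2020 totals, and the variable and function names are hypothetical. Under these assumptions the projected totals land within one cell of the 2025 row in Table 7.

```python
import numpy as np

classes = ["Cropland", "Grass", "Forest", "Shrub",
           "Impervious surfaces", "Water area"]

# Rows: class in 2015; columns: class in 2020 (cell counts from Table 3).
transitions = np.array([
    [2312150,   382,    9258,   610,  29910,   852],
    [      0, 13271,       0,     0,      0,     0],
    [   4249,   190, 1191481,   182,    858,    88],
    [      0,     0,       0, 30426,      0,     0],
    [      0,     0,       0,     0, 184467,     0],
    [    128,    24,      43,     0,     64, 60764],
], dtype=float)


def markov_demand(transition_counts, steps=1):
    """Project class totals forward by `steps` five-year periods."""
    # Row-normalize counts into transition probabilities.
    prob = transition_counts / transition_counts.sum(axis=1, keepdims=True)
    # Column sums are the class totals at the end of the observed period (2020).
    state = transition_counts.sum(axis=0)
    for _ in range(steps):
        state = state @ prob
    return state


demand_2025 = markov_demand(transitions)
for name, cells in zip(classes, demand_2025):
    print(f"{name}: {cells:.0f}")
```

Because each row of the probability matrix sums to one, the total number of cells is conserved, so the demands can be passed directly to the CA module as per-class quotas.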

The expansion of land use/cover from 2020 to 2025 is shown in Table 7. Impervious surfaces remain the land type with the largest expansion, and the main expansion areas are concentrated in Tongnan District and the central urban area; the expansion of the remaining land types is small and scattered.

4. Discussion

This paper optimizes the model in two respects: spatio-temporal feature mining of neighborhood factors and the dynamic iteration module. First, to address the insufficient spatio-temporal feature learning of neighborhood factors in the traditional machine learning CA model, this paper introduces CNN-LSTM to learn the spatio-temporal features of the neighborhood factors of all land use types, and feeds them into the RF-CA model as one kind of driving factor alongside the other drivers, so as to improve the learning effect of the traditional CA model. Second, to address the low efficiency of random seed selection in traditional CA dynamic iteration, a high-quality seed selection algorithm is used to improve the accuracy of the multi-category land use dynamic change model and the efficiency of the iterative algorithm. Taking the main urban area of Chongqing as an example, the proposed model, RF-CNN-LSTM-CA, is used to simulate the land use distribution pattern in 2020 from the base year of 2015. In this process, several different neighborhood modeling schemes are designed and the proposed model is compared with current mainstream models such as PLUS and FLUS.

The results show that: (1) Using the CNN-LSTM model to extract the spatial and temporal characteristics of each neighborhood from historical multi-period, multi-category land use data as driving factors improves the accuracy of the overall simulation results; compared with the model without neighborhood drivers, the FOM improves by 1.56 and 18.88 percentage points for 2015–2020 and 2010–2020, respectively. This shows that, in the complex simulation of land use change with multiple land use types, mining the spatial and temporal neighborhood characteristics of each category with a deep learning model improves simulation accuracy to a certain extent. (2) The CNN-LSTM model and the LSTM model are each incorporated into the RF-CA model. The ablation experiment shows that the LSTM model has advantages in capturing the temporal characteristics of land use types in historical periods, while the CNN model has advantages in capturing their spatial characteristics. By introducing the CNN-LSTM model, the proposed model overcomes the insufficient spatio-temporal feature learning of neighborhood factors in traditional neighborhood methods and makes full use of the spatio-temporal feature mining ability of deep learning. (3) Compared with the traditional random seed selection algorithm, the high-quality seed selection mechanism helps the CA model find the positions of seed pixels faster and more reliably and convert them to the target land use type, which improves the accuracy of spatial simulation and reduces the running time of the model. (4) This study uses the RF-CA model to decompose the multi-classification problem into several binary classification problems when building the urban dynamic simulation model. At the same time, it uses deep learning to obtain the spatio-temporal neighborhood features of each land use type and uses them as driving factors of the RF model, which improves the accuracy of the model while retaining the interpretability of the RF model. The importance measures of the driving factors output by the RF model show that the driving factors influence different land use types differently. For the distribution patterns of cultivated land, grassland, and water area, the spatio-temporal features of their neighborhood factors play the largest role. The results thus show the interpretability advantage of RF through the quantitative expression of driving factor importance, and also indicate that the role of neighborhood factors in land use dynamic simulation cannot be ignored. To some extent, this also shows that land use change essentially follows the first law of geography.

This research optimizes the model in terms of the spatio-temporal feature mining of neighborhood factors and the dynamic iteration module, but it needs further improvement in the future. This study uses the CNN-LSTM model to extract the spatial and temporal characteristics of the neighborhood factors of different land use types. More powerful spatio-temporal network models have since been developed in the field of deep learning, and in the future a better deep learning model could be used to mine the spatio-temporal features of neighborhood factors. This study also suggests that even data-driven models cannot be built without the guidance of basic principles. Enhancing the interpretability of future land use dynamic models involves at least two aspects. One is to further exploit traditional machine learning models for the quantitative description of factor influence; the other is to consider how to combine organically with mechanism models (such as the basic laws of geography) while exploiting the advantages of machine learning and deep learning models in mining spatio-temporal characteristics.

Author Contributions

Conceptualization, M.L. and H.C.; methodology, H.C.; software, H.C.; validation, M.L. and L.Q.; formal analysis, M.L. and H.C.; investigation, H.C. and L.Q.; resources, M.L.; data curation, H.C.; writing—original draft preparation, H.C.; writing—review and editing, M.L. and L.Q.; visualization, H.C.; supervision, M.L.; project administration, C.C.; funding acquisition, C.C. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the National Natural Science Foundation of China, grant number 42071218, and the Natural Science Foundation of Chongqing, China, grant number cstc2019jcyj-msxmX0139.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

The code and some data of the article can be obtained at https://github.com/CQUPT1711laboratory/RF-CNN-LSTM-CA.git.

Acknowledgments

We gratefully acknowledge the support of various foundations.

Conflicts of Interest

The authors declare no conflict of interest.

Abbreviations

The following abbreviations are used in this manuscript:

LUCC	Land use/cover change
CA	Cellular Automata
RF	Random forest
CNN	Convolutional Neural Networks
RNN	Recurrent Neural Network
LSTM	Long Short-Term Memory
GRU	Gated Recurrent Unit
Kappa	Kappa Coefficients
FOM	Figure Of Merit

References

1. Erfu DAI, L.M. Review on land change modeling approaches. Prog. Geogr. 2018, 37, 152.
2. Xia, L.; Yeh, A.G.O.; Tao, L.; Liu, X. Analysis of error propagation and uncertainties in urban cellular automata. Geogr. Res. 2007, 26, 443.
3. White, R.; Engelen, G. Cellular Automata and Fractal Urban Form: A Cellular Modelling Approach to the Evolution of Urban Land-Use Patterns. Environ. Plan. A Econ. Space 1993, 25, 1175–1199.
4. Yi, L.S.; Pi, L.X.; Xiu, L.; Min, C.Y. Simulation model of land use dynamics and application: Progress and prospects. J. Remote Sens. 2017, 21, 12.
5. Da, Z.; Xiao, L.; Yao, Y.; Jin, Z. Simulating Spatiotemporal Change of Multiple Land Use Types in Dongguan by Using Random Forest Based on Cellular Automata. Geogr. Geo-Inf. Sci. 2016, 32, 29–36.
6. Fang, S.; Gertner, G.Z.; Sun, Z.; Anderson, A.A. The impact of interactions in spatial simulation of the dynamics of urban sprawl. Landsc. Urban Plan. 2005, 73, 294–306.
7. Arsanjani, J.J.; Helbich, M.; Kainz, W.; Boloorani, A.D. Integration of logistic regression, Markov chain and cellular automata models to simulate urban expansion. Int. J. Appl. Earth Obs. Geoinf. 2013, 21, 265–275.
8. Chen, Y.; Li, X.; Liu, X.; Ai, B. Modeling urban land-use dynamics in a fast developing city using the modified logistic cellular automaton with a patch-based simulation strategy. Int. J. Geogr. Inf. Sci. 2014, 28, 234–255.
9. Wang, H.; Guo, J.; Zhang, B.; Zeng, H. Simulating urban land growth by incorporating historical information into a cellular automata model. Landsc. Urban Plan. 2021, 214, 104168.
10. Xiao, L.; Xia, L.; Yeh, A.G.O.; Jin, H.; Jia, T. Using ant colony intelligent mining transformation rules of geographic cellular automata. Sci. China Earth Sci. 2007, 37, 824–834.
11. Ming, L.; Wei, S.; Qi, Z.; Han, F. Optimization of Logistic Regression Coefficients Based on Genetic Algorithm and Simulation of Dynamic Change of Urban Land Use: Taking the Chengdu-Chongqing Economic Zone as an Example. Geogr. Geo-Inf. Sci. 2018, 34, 12–17.
12. Feng, Y.; Liu, Y. A heuristic cellular automata approach for modelling urban land-use change based on simulated annealing. Int. J. Geogr. Inf. Sci. 2013, 27, 449–466.
13. Kai, C.; Kai, L.; Lin, L.; Yuanhui, Z. Urban expansion simulation by random-forest-based cellular automata: A case study of Foshan City. Prog. Geogr. 2015, 34, 937–946.
14. Liang, X.; Guan, Q.; Clarke, K.C.; Liu, S.; Wang, B.; Yao, Y. Understanding the drivers of sustainable land expansion using a patch-generating land use simulation (PLUS) model: A case study in Wuhan, China. Comput. Environ. Urban Syst. 2021, 85, 101569.
15. Li, X.; Chen, G.; Liu, X.; Liang, X.; Wang, S.; Chen, Y.; Pei, F.; Xu, X. A new global land-use and land-cover change product at a 1-km resolution for 2010 to 2100 based on human–environment interactions. Ann. Am. Assoc. Geogr. 2017, 107, 1040–1059.
16. Li, R.; Chen, H. Simulation of Urban Spatial Expansion and Growth Boundary in Hangzhou Based on ANN-CA Model. Resour. Environ. Yangtze Basin 2021, 30, 10.
17. Zhai, Y.; Yao, Y.; Guan, Q.; Liang, X.; Li, X.; Pan, Y.; Yue, H.; Yuan, Z.; Zhou, J. Simulating urban land use change by integrating a convolutional neural network with vector-based cellular automata. Int. J. Geogr. Inf. Sci. 2020, 34, 1475–1499.
18. Qian, Y.; Xing, W.; Guan, X.; Yang, T.; Wu, H. Coupling cellular automata with area partitioning and spatiotemporal convolution for dynamic land use change simulation. Sci. Total Environ. 2020, 722, 137738.
19. He, J.; Li, X.; Yao, Y.; Hong, Y.; Jinbao, Z. Mining transition rules of cellular automata for simulating urban expansion by using the deep learning techniques. Int. J. Geogr. Inf. Sci. 2018, 32, 2076–2097.
20. Verburg, P.H.; Nijs, T.; Eck, J.; Visser, H.; Jong, K.D. A method to analyse neighbourhood characteristics of land use patterns. Comput. Environ. Urban Syst. 2004, 28, 667–690.
21. Minghao, L.; Yuan, T.; Baobao, X.; Xiaobo, L. Analysis of the influence of neighborhood factors on the simulation effect of urban land development intensity: Comparison of simulation results based on BP artificial neural network. J. Southwest China Norm. Univ. (Nat. Sci. Ed.) 2014, 39, 40–47.
22. Liao, J.; Tang, L.; Shao, G.; Su, X.; Chen, D.; Xu, T. Incorporation of extended neighborhood mechanisms and its impact on urban land-use cellular automata simulations. Environ. Model. Softw. 2016, 75, 163–175.
23. Zhang, J.D.; Mei, Z.X.; Lv, J.; Chen, J. Simulating Multiple Land Use Scenarios based on the FLUS Model Considering Spatial Autocorrelation. J. Geo-Inf. Sci. 2020, 22, 531–542.
24. Liu, J.; Xiao, B.; Li, Y.; Wang, X.; Bie, Q.; Jiao, J. Simulation of dynamic urban expansion under ecological constraints using a long short term memory network model and cellular automata. Remote Sens. 2021, 13, 1499.
25. LeCun, Y.; Bengio, Y.; Hinton, G. Deep learning. Nature 2015, 521, 436–444.
26. Huang, G.; Li, X.; Zhang, B.; Ren, J. PM2.5 concentration forecasting at surface monitoring sites using GRU neural network based on empirical mode decomposition. Sci. Total Environ. 2021, 768, 144516.
27. Li, L.C.; Xu, J. Study on dynamic simulation method of land use based on LSTM-CA model. Remote Sens. Nat. Resour. 2022, 34, 122–128.
28. Xiao, B.; Liu, J.; Jiao, J.; Li, Y.; Liu, X.; Zhu, W. Modeling dynamic land use changes in the eastern portion of the hexi corridor, China by CNN-GRU hybrid model. GISci. Remote Sens. 2022, 59, 501–519.
29. Sohl, T.L.; Sayler, K.L.; Drummond, M.A.; Loveland, T.R. The FORE-SCE model: A practical approach for projecting land cover change using scenario-based modeling. J. Land Use Sci. 2007, 2, 103–126.
30. Verburg, P.H.; Van Eck, J.R.R.; de Nijs, T.C.; Dijst, M.J.; Schot, P. Determinants of land-use change patterns in The Netherlands. Environ. Plan. B Plan. Des. 2004, 31, 125–150.
31. Li, T.; Liu, M.; Lei, J.; Ting, L. An Urban Expansion Simulation Method of Dual Constrained RF-Patch-CA Considering the Importance of Driving Factors. Geogr. Geo-Inf. Sci. 2021, 37, 63–70.
32. Zhang, X.; Liu, L.; Chen, X.; Gao, Y.; Xie, S.; Mi, J. GLC_FCS30: Global land-cover product with fine classification system at 30 m using time-series Landsat imagery. Earth Syst. Sci. Data 2021, 13, 2753–2776.
33. Landis, J.R.; Koch, G.G. The Measurement of Observer Agreement for Categorical Data. Biometrics 1977, 33, 159.
34. Monserud, R.A.; Leemans, R. Comparing global vegetation maps with the Kappa statistic. Ecol. Model. 1992, 62, 275–293.
35. Pontius, R.G.; Boersma, W.; Castella, J.C.; Clarke, K.; de Nijs, T.; Dietzel, C.; Duan, Z.; Fotsing, E.; Goldstein, N.; Kok, K.; et al. Comparing the input, output, and validation maps for several models of land change. Ann. Reg. Sci. 2008, 42, 11–37.
36. Liu, X.; Liang, X.; Li, X.; Xu, X.; Ou, J.; Chen, Y.; Li, S.; Wang, S.; Pei, F. A future land use simulation model (FLUS) for simulating multiple land use scenarios by coupling human and natural effects. Landsc. Urban Plan. 2017, 168, 94–116.

Figure 1. Location of the Chongqing metropolitan area.

Figure 2. Driving factors of LUCC.

Figure 3. Flowchart of RF-CNN-LSTM-CA model.

Figure 4. The process of convolution operation.

Figure 5. The basic structure of LSTM.

Figure 6. Flowchart of extracting neighborhood probability from CNN-LSTM model.

Figure 7. Neighborhood probability of conversion to six types of land from 2015 to 2020.

Figure 8. Flowchart of CA model based on high-quality seed selection iterative algorithm (HQSSIA).

Figure 9. Importance of driving factors.

Figure 10. Simulation results of land use/cover pattern in 2025 (a) and multi-land use expansion from 2020 to 2025 (b).

Table 1. Research data.

Name	Type	Resolution	Year	Data Source
Land use/cover	Raster	30 m	from 2000 to 2020	https://data.casearth.cn, (accessed on 27 December 2021)
Population	Raster	100 m	2015 and 2020	https://www.worldpop.org/, (accessed on 8 November 2021)
Topographic	Vector	-	2017	https://www.webmap.cn/main.do?method=index, (accessed on 28 November 2021)
Elevation	Raster	30 m	2019	https://search.earthdata.nasa.gov/, (accessed on 3 September 2021)
POI	Point	-	2020	https://lbs.amap.com/, (accessed on 30 July 2020)
GDP	Raster	1 km	2015 and 2020	https://www.resdc.cn/, (accessed on 8 November 2021)
Soil type	Raster	1 km	2015	https://www.fao.org/soils-portal/soil-survey/soil-maps-and-databases/harmonized-world-soil-database-v12/en/, (accessed on 17 April 2022)
Nighttime light	Raster	500 m	2015 and 2020	http://www.geodata.cn, (accessed on 5 November 2021)

Table 2. The definitions of TP, TN, FP, and FN.

Observed Data \ Simulated Data	Positive	Negative
Positive	TP	FN
Negative	FP	TN

In the multi-classification problem, Positive refers to the type that needs to be measured, and Negative represents other types.

Table 3. Land conversion matrix from 2015 (rows) to 2020 (columns), in number of cells.

2015 \ 2020	Cropland	Grass	Forest	Shrub	Impervious Surfaces	Water Area	Total
Cropland	2,312,150	382	9258	610	29,910	852	2,353,162
Grass	0	13,271	0	0	0	0	13,271
Forest	4249	190	1,191,481	182	858	88	1,197,048
Shrub	0	0	0	30,426	0	0	30,426
Impervious surfaces	0	0	0	0	184,467	0	184,467
Water area	128	24	43	0	64	60,764	61,023
Total	2,316,527	13,867	1,200,782	31,218	215,299	61,704	3,839,397

Table 4. Comparison of the simulated result using the proposed model, Degraded I model, Degraded II model, a model, b model, and c model.

Period	Type	Proposed	Degraded I	Degraded II	a	b	c
2015–2020	FOM	0.1744	0.1588	0.0584	0.1661	0.1595	0.1608
	Kappa	0.9684	0.9668	0.9596	0.9678	0.9670	0.9671
	Accuracy	0.9829	0.9821	0.9771	0.9827	0.9820	0.9821
	F1	0.9641	0.9585	0.9578	0.9626	0.9590	0.9595
	Hamming loss	0.0171	0.0179	0.0229	0.0172	0.0179	0.0178
2010–2020	FOM	0.4662	0.2774	0.2183	0.4535	0.2793	0.2741
	Kappa	0.9599	0.9388	0.9300	0.9591	0.9392	0.9386
	Accuracy	0.9785	0.8017	0.9625	0.9781	0.9675	0.9671
	F1	0.8113	0.8017	0.7862	0.8100	0.8033	0.8022
	Hamming loss	0.0214	0.0326	0.0374	0.0218	0.0325	0.0328

2010–2020 indicates that the land use/cover pattern in 2020 is simulated based on the 2010 land use/cover data and multiple spatial variables. The process is the same as that used in the article to simulate the 2020 land use pattern from the 2015 data.

Table 5. Comparison of the running time of the dynamic iteration modules of the proposed model and the Degraded II model.

Type	Proposed	Degraded II
Time	81 s	1526 s

Table 6. The proposed model compared to PLUS model and FLUS model.

Period	Type	Proposed	FLUS	PLUS
2015–2020	FOM	0.1744	0.0251	0.0306
	Kappa	0.9684	0.9629	0.9303
	Accuracy	0.9829	0.9641	0.9696
	F1	0.9641	0.9493	0.8253
	Hamming loss	0.0171	0.0358	0.0303
2010–2020	FOM	0.4662	0.0888	0.0907
	Kappa	0.9599	0.8923	0.8981
	Accuracy	0.9785	0.9424	0.9449
	F1	0.8113	0.7769	0.7910
	Hamming loss	0.0214	0.0576	0.0551

Table 7. Prediction results and analysis of land use/cover in 2025.

Type	Cropland	Grass	Forest	Shrub	Impervious Surfaces	Water Area	Total
Land use/cover in 2025	2,280,545	14,458	1,204,355	32,001	245,669	62,369	3,839,397
Expansion from 2020 to 2025	4312	591	8462	783	30,370	1056	45,574
Reduction from 2020 to 2025	40,294	0	4889	0	0	391	45,574

Taking cropland as an example: expansion counts the cells that are not cropland in 2020 but become cropland in 2025; reduction is the opposite.

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

© 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

Share and Cite

MDPI and ACS Style

Liu, M.; Chen, H.; Qi, L.; Chen, C. LUCC Simulation Based on RF-CNN-LSTM-CA Model with High-Quality Seed Selection Iterative Algorithm. Appl. Sci. 2023, 13, 3407. https://doi.org/10.3390/app13063407

