3.1. Displacement Decomposition
Understanding how influencing factors affect the development and prediction of landslides is crucial in this analysis [11]. Based on previous research, we selected influencing factors related to rainfall and reservoir water level.
In this study, the original landslide displacement was decomposed into two components: a trend term and a periodic term. The trend term is influenced by factors such as geological tectonism, weathering, and the evolutionary stage of the deformation.
The periodic term of landslides corresponds to the short-term displacement observed in wading landslides within the Three Gorges Reservoir Area [25,26]. This displacement is predominantly influenced by two factors: rainfall changes and reservoir water level changes. It is important to note that this study did not consider displacement caused by random factors, as it is challenging to monitor and generally of relatively small magnitude [27].
To analyze the accumulated displacement time series, it can be decomposed as follows:

$$S(t_m) = T(t_m) + P(t_m)$$

where $m$ indicates the time step, the samples being equally spaced by an interval $\Delta t$, so that the index $m$ corresponds to the time instant $t_m = m \cdot \Delta t$; $S(t_m)$ is the accumulated displacement, $T(t_m)$ is the trend term, and $P(t_m)$ is the periodic term.
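As an illustration of this additive decomposition, the following minimal sketch extracts a trend term and obtains the periodic term as the residual. The use of a centered moving average and the window length of 12 are assumptions made for illustration only; the actual decomposition method is described elsewhere in the paper.

```python
import numpy as np

def decompose_displacement(s, window=12):
    """Additive decomposition S(t_m) = T(t_m) + P(t_m).

    s      : 1-D array of accumulated displacement sampled at a fixed interval.
    window : assumed moving-average length (e.g., 12 monthly samples).
    """
    s = np.asarray(s, dtype=float)
    # Trend term: centered moving average, with edges padded by boundary values.
    padded = np.pad(s, (window // 2, window - 1 - window // 2), mode="edge")
    trend = np.convolve(padded, np.ones(window) / window, mode="valid")
    # Periodic term: residual after removing the trend.
    periodic = s - trend
    return trend, periodic
```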
3.3. Recurrent Neural Networks and Long Short-Term Memory Neural Networks
The standard recurrent neural network (RNN) module contains a single layer (Figure 4a) [29]. RNNs have input and output units that hold the corresponding data sets: we denote the input data set as $\{x_t\}$ and the output data set as $\{y_t\}$. RNNs also contain hidden units, whose output set is denoted $\{h_t\}$. The hidden state is computed through a non-linear activation function as

$$h_t = f\left(W_{xh} x_t + W_{hh} h_{t-1} + b_h\right)$$

where $h_t$ is the state of the hidden layer at time step $t$, calculated from the output of the current input layer and the state of the previous hidden layer, $W_{xh}$ and $W_{hh}$ are the input-to-hidden and hidden-to-hidden weight matrices, $b_h$ is a bias vector, and $f$ is a non-linear activation function.
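A minimal sketch of this recurrence is given below, using $\tanh$ as the activation function and randomly initialized weights; the dimensions and parameter names are illustrative assumptions, not the configuration used in this study.

```python
import numpy as np

def rnn_step(x_t, h_prev, W_xh, W_hh, b_h):
    """One RNN time step: h_t = f(W_xh @ x_t + W_hh @ h_prev + b_h), with f = tanh."""
    return np.tanh(W_xh @ x_t + W_hh @ h_prev + b_h)

# Example: unrolling the recurrence over a short input sequence.
rng = np.random.default_rng(0)
input_dim, hidden_dim, T = 3, 5, 4
W_xh = rng.normal(scale=0.1, size=(hidden_dim, input_dim))
W_hh = rng.normal(scale=0.1, size=(hidden_dim, hidden_dim))
b_h = np.zeros(hidden_dim)

h_t = np.zeros(hidden_dim)                    # initial hidden state h_0
for x_t in rng.normal(size=(T, input_dim)):   # input sequence x_1, ..., x_T
    h_t = rnn_step(x_t, h_t, W_xh, W_hh, b_h)
```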
Traditional RNNs can suffer from vanishing or exploding gradient problems, making them less effective at handling long-term dependencies. LSTM (long short-term memory) neural networks (Figure 4b), a specific type of RNN, are designed to overcome these limitations [30].
In an LSTM network, each RNN unit is replaced with a memory block containing three gate functions: the input gate, the output gate, and the forget gate. These gate functions control the flow of information within the memory block.
By incorporating these gate functions, LSTM networks can effectively capture and learn long-term dependencies in a data sequence. This makes them particularly well suited for handling and modeling complex temporal relationships over extended periods.
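The sketch below shows one step of a memory block in the standard LSTM formulation, with the input, forget, and output gates controlling how the cell state is updated and exposed. The parameter layout and dimensions are illustrative assumptions; in practice these weights are learned during training.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x_t, h_prev, c_prev, p):
    """One LSTM memory-block step; p is a dict of gate parameters."""
    i_t = sigmoid(p["W_i"] @ x_t + p["U_i"] @ h_prev + p["b_i"])   # input gate
    f_t = sigmoid(p["W_f"] @ x_t + p["U_f"] @ h_prev + p["b_f"])   # forget gate
    o_t = sigmoid(p["W_o"] @ x_t + p["U_o"] @ h_prev + p["b_o"])   # output gate
    g_t = np.tanh(p["W_g"] @ x_t + p["U_g"] @ h_prev + p["b_g"])   # candidate cell state
    c_t = f_t * c_prev + i_t * g_t      # forget gate discards old memory, input gate admits new
    h_t = o_t * np.tanh(c_t)            # output gate exposes the cell state
    return h_t, c_t

def init_params(n_in, n_h, rng):
    """Randomly initialized (illustrative) parameters for the four gate blocks."""
    p = {}
    for g in ("i", "f", "o", "g"):
        p["W_" + g] = rng.normal(scale=0.1, size=(n_h, n_in))
        p["U_" + g] = rng.normal(scale=0.1, size=(n_h, n_h))
        p["b_" + g] = np.zeros(n_h)
    return p

# Example: running the memory block over a short sequence.
rng = np.random.default_rng(0)
p = init_params(n_in=3, n_h=4, rng=rng)
h, c = np.zeros(4), np.zeros(4)
for x_t in rng.normal(size=(6, 3)):
    h, c = lstm_step(x_t, h, c, p)
```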
3.5. Stacking and Its Optimization
When dealing with complex data, achieving an accurate fit with a single model can be difficult. Moreover, single models often exhibit limited robustness to disturbances. To address these challenges in displacement prediction, using an ensemble model, consisting of multiple strategically combined models, can be beneficial. This approach leverages individual models’ diverse strengths and weaknesses to enhance the ensemble model’s overall generalization ability [35].
In this study, the stacking approach combined Boosting and Bagging, which are two commonly used ensemble learning techniques.
Stacking, also known as stacked generalization, involves modeling the stacked predictions generated by multiple base learners fitted to the original data [36]. In this process, the base learners are first trained on the original data to produce individual predictions. These predictions are then stacked horizontally, resulting in a two-dimensional array in which the rows represent the samples and the columns represent the base learners. This newly created data set is then used as input for a higher-level model to further improve prediction performance. Figure 6 depicts the schematic representation of a traditional stacking model.
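A minimal sketch of this traditional scheme is given below. The out-of-fold predictions of each base learner are stacked column-wise and used to train a higher-level meta-model; the base learners and the linear meta-model in this example are illustrative choices, not the learners used in this paper.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import cross_val_predict
from sklearn.neighbors import KNeighborsRegressor
from sklearn.tree import DecisionTreeRegressor

def fit_traditional_stacking(X, y, base_learners, meta_learner=None):
    """Traditional stacking: k-fold out-of-fold predictions of each base learner
    are stacked column-wise (rows = samples, columns = base learners) and used
    to train a higher-level meta-model."""
    meta_learner = meta_learner or LinearRegression()
    # Out-of-fold predictions keep each sample's prediction untainted by itself.
    Z = np.column_stack([cross_val_predict(m, X, y, cv=5) for m in base_learners])
    for m in base_learners:               # refit the base learners on all fitting data
        m.fit(X, y)
    meta_learner.fit(Z, y)
    return base_learners, meta_learner

def predict_stacking(X_new, base_learners, meta_learner):
    Z_new = np.column_stack([m.predict(X_new) for m in base_learners])
    return meta_learner.predict(Z_new)

# Usage with synthetic data and illustrative base learners.
learners, meta = fit_traditional_stacking(
    X=np.random.default_rng(0).normal(size=(60, 3)),
    y=np.random.default_rng(1).normal(size=60),
    base_learners=[KNeighborsRegressor(), DecisionTreeRegressor(random_state=0)],
)
```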
The sliding window method is a commonly used data processing technique for extracting several subsequences from a long data sequence and then performing calculations and analyses on them. The basic idea is to divide the input sequence into several fixed-length subsequences (also called windows) and to process the data within each window interval to obtain a result sequence. For example, in time-series analysis the original sequence can be divided into several fixed-length windows, and statistics such as the mean, variance, maximum, and minimum can then be computed from the data within each window to serve as that window’s feature values. By extracting features from the data within each window, we can effectively reduce the data dimensionality and enhance the efficiency of data representation and utilization.
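The following sketch illustrates this idea: a 1-D series is cut into fixed-length windows and summarized by the statistics mentioned above. The window length and step size are illustrative defaults, not values prescribed by this study.

```python
import numpy as np

def sliding_window_features(series, window=12, step=1):
    """Split a 1-D series into fixed-length windows and compute summary
    statistics (mean, variance, max, min) for each window."""
    series = np.asarray(series, dtype=float)
    feats = []
    for start in range(0, len(series) - window + 1, step):
        w = series[start:start + window]
        feats.append([w.mean(), w.var(), w.max(), w.min()])
    return np.array(feats)   # shape: (number of windows, 4 features)
```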
This paper introduces a model based on stacked ensemble learning. In the first layer, four distinct deep learning algorithms are employed as base learners, each used to construct its own base model with a single deep learning algorithm. Once these base models are trained, their performance is evaluated on the validation set. Notably, the stacking ensemble technique itself does not intervene in the base models; predictive data are only produced once the base models have been trained. In the second layer, a simple regression algorithm is trained on the data output by the first layer; it requires no hyperparameter optimization and, once the training and prediction sets are divided, directly produces the prediction results.
The conventional stacking model utilizes k-fold cross-validation to handle the dataset, which may have limitations when applied to time series problems. For example, because each training dataset contains information from other samples, data leakage may occur, resulting in overly optimistic evaluation results. In this study, using future data to predict past data would violate the temporal order of the series. To overcome this challenge, this study introduced an optimized stacking model incorporating the sliding window method to process the raw data set, thus preserving the inherent temporal sequence. The flowchart of this method is shown in Figure 7.
Like traditional stacking frameworks, the improved stacking framework utilizing the sliding window method can be divided into two primary components: first-layer algorithms and second-layer algorithms.
The first-layer algorithms partitioned the dataset into a fitting set and a test set in a 6:1 ratio. The fitting set included training and validation sets with varying sample sizes; notably, not all of the fitting set data was used for training in each base model. The sliding window method was employed to avoid validating past landslide data with future landslide data. Based on the previously mentioned dataset partitioning period of 12, the validation datasets consist of the sample data corresponding to the last twelve steps of each training set. Consequently, five different datasets were established with training-to-validation ratios of 1:1, 2:1, 3:1, 4:1, and 5:1, respectively, creating five base models. The GS optimization algorithm was then used to obtain the best hyperparameters for these models, and the retained prediction set was used to evaluate them.
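One plausible reading of this partitioning, sketched below under explicit assumptions, uses expanding training windows of 12, 24, ..., 60 samples, each validated on the 12 samples that immediately follow it so that future data never validates past data. The example sizes (84 samples, a 72:12 fitting/test split) are chosen only to match the 6:1 ratio and the five ratios listed above.

```python
import numpy as np

def first_layer_splits(fitting_set, val_len=12, n_models=5):
    """Assumed reading of the first-layer splits: expanding training windows of
    12, 24, ..., 60 samples, each paired with the next 12 samples as validation,
    giving training-to-validation ratios of 1:1 through 5:1."""
    fitting_set = np.asarray(fitting_set)
    splits = []
    for k in range(1, n_models + 1):
        train = fitting_set[: k * val_len]                     # ratios 1:1 ... 5:1
        val = fitting_set[k * val_len : (k + 1) * val_len]     # following 12 steps
        splits.append((train, val))
    return splits

# Example: 84 samples split 6:1 into a fitting set (72) and a test set (12).
data = np.arange(84.0)
fitting_set, test_set = data[:72], data[72:]
splits = first_layer_splits(fitting_set)
```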
In the second-level algorithm, the input dataset was constructed from the outputs of the first-level algorithm. Specifically, for each base learner, the datasets with five distinct sample sizes each produced a set of validation data, VR n, obtained by validating the model trained on the training set against its validation set. These five sets of validation data were then stacked to form the input data for the second-level algorithm’s training samples, with the output value being the actual landslide displacement. Similarly, the different base models within each base learner were tested on a reserved test set, and the processed prediction results were used as the input data for the second-level algorithm’s prediction samples, with the output value representing the unknown displacement.
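The sketch below assembles the meta-model training data from these first-layer outputs, under the assumption that rows correspond to validation samples and columns to base learners; the variable names (VR sets, targets) are illustrative.

```python
import numpy as np

def second_level_training_data(val_preds_per_learner, val_targets):
    """Build the meta-model training set from first-layer validation outputs.

    val_preds_per_learner : for each base learner, a list of validation
                            prediction arrays VR_1 ... VR_5 (one per base model).
    val_targets           : measured displacements for the same validation steps.
    Rows are validation samples; columns are base learners.
    """
    X_meta = np.column_stack(
        [np.concatenate(vr_sets) for vr_sets in val_preds_per_learner]
    )
    y_meta = np.concatenate(val_targets)   # actual landslide displacement
    return X_meta, y_meta
```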
The traditional stacking model often employs k-fold cross-validation, and the prediction set is typically handled by computing an average. In the sliding window approach, the number of samples per base model varies, and these differences influence the diversity of information that each base model provides to the meta-model through its training data and algorithm. Consequently, determining the weighting of each base model’s predictions on the test set is paramount. In this study, the weighting of base model outputs for the meta-model’s prediction set was determined from the mean absolute error (MAE) calculated on the test set. The MAE, the average of the absolute errors, provides an accurate reflection of the actual prediction error; a smaller MAE signifies better model performance, so a greater weight is assigned. After weighting the outputs of each base learner, the resulting data serve as the prediction set for the second-level algorithm.
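A minimal sketch of this weighting step is shown below. Normalized inverse-MAE weights are used as one concrete form of the rule "smaller MAE, greater weight"; the exact weighting formula is an assumption, since the text does not specify it.

```python
import numpy as np

def mae_weighted_test_prediction(test_preds, y_test):
    """Weight each base model's test-set predictions by its inverse MAE
    (smaller MAE -> larger weight) and combine them into the meta-model's
    prediction-set input. The inverse-MAE normalization is an assumed
    concrete form of the weighting rule described in the text."""
    test_preds = [np.asarray(p, dtype=float) for p in test_preds]
    y_test = np.asarray(y_test, dtype=float)
    maes = np.array([np.mean(np.abs(p - y_test)) for p in test_preds])
    weights = (1.0 / maes) / np.sum(1.0 / maes)    # normalized inverse-MAE weights
    return np.sum([w * p for w, p in zip(weights, test_preds)], axis=0)
```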