*Article* **Using Complementary Ensemble Empirical Mode Decomposition and Gated Recurrent Unit to Predict Landslide Displacements in Dam Reservoir**

**Beibei Yang 1, Ting Xiao 2,\*, Luqi Wang <sup>3</sup> and Wei Huang <sup>4</sup>**


**Abstract:** It is crucial to predict landslide displacement accurately for establishing a reliable early warning system. Such a requirement is more urgent for landslides in the reservoir area. The main reason is that an inaccurate prediction can lead to riverine disasters and secondary surge disasters. Machine learning (ML) methods have been developed and commonly applied in landslide displacement prediction because of their powerful nonlinear processing ability. Recently, deep ML methods have become popular, as they can deal with more complicated problems than conventional ML methods. However, it is usually not easy to obtain a well-trained deep ML model, as many hyperparameters need to be trained. In this paper, a deep ML method—the gated recurrent unit (GRU)—with the advantages of a powerful prediction ability and fewer hyperparameters, was applied to forecast landslide displacement in the dam reservoir. The accumulated displacement was firstly decomposed into a trend term, a periodic term, and a stochastic term by complementary ensemble empirical mode decomposition (CEEMD). A univariate GRU model and a multivariable GRU model were employed to forecast trend and stochastic displacements, respectively. A multivariable GRU model was applied to predict periodic displacement, and another two popular ML methods—long short-term memory neural networks (LSTM) and random forest (RF)—were used for comparison. Precipitation, reservoir level, and previous displacement were considered to be candidate-triggering factors for inputs of the models. The Baijiabao landslide, located in the Three Gorges Reservoir Area (TGRA), was taken as a case study to test the prediction ability of the model. The results demonstrated that the GRU algorithm provided the most encouraging results. Such a satisfactory prediction accuracy of the GRU algorithm depends on its ability to fully use the historical information while having fewer hyperparameters to train. It is concluded that the proposed model can be a valuable tool for predicting the displacements of landslides in the TGRA and other dam reservoirs.

**Keywords:** reservoir landslide; displacement prediction; time series analysis; complementary ensemble empirical mode decomposition; gated recurrent unit

#### **1. Introduction**

Landslides are one of the most catastrophic disasters and are widely distributed in numerous parts of the world [1–4]. In China, annual reports from China Institute of Geo-Environment Monitoring (IGEM) show that landslides account for more than 50% of all geological hazards in recent years [5]. In 2020, for instance, 7840 geology-related hazards occurred in China, resulting in 139 deaths or people missing, 58 people injured, and a direct economic loss of CNY 5.02 billion. Among these geological disasters, 4810 were landslides, accounting for 61.3% of the total. Other types of hazards in 2020 included 1797 avalanches, 899 debris flows, 183 ground collapses, 143 ground fissures, and 8 cases of ground subsidence.

**Citation:** Yang, B.; Xiao, T.; Wang, L.; Huang, W. Using Complementary Ensemble Empirical Mode Decomposition and Gated Recurrent Unit to Predict Landslide Displacements in Dam Reservoir. *Sensors* **2022**, *22*, 1320. https:// doi.org/10.3390/s22041320

Academic Editor: Francesca Cigna

Received: 16 December 2021 Accepted: 3 February 2022 Published: 9 February 2022

**Publisher's Note:** MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

**Copyright:** © 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https:// creativecommons.org/licenses/by/ 4.0/).

As one of the most landslide-prone areas in China, the Three Gorges Reservoir Area (TGRA) has been given much attention concerning severe landslides [6]. One main reason is that the construction of the Three Gorges Dam (TGD) has significantly changed the regional hydrogeological conditions [7,8]. Some landslides in the TGRA (e.g., Bazimen landslide) have deformed continuously for several decades, whereas some landslides (e.g., Woshaxi landslide) have achieved a displacement of 28,065.9 mm, and the deformation is still increasing [9,10]. Once landslides in dam reservoirs occur, they can cause severe damage along both sides of the reservoir area. In addition, these reservoir landslides can induce secondary surge disasters, endangering the shipping and bridges along the river and its tributaries [11]. The Honyanzi landslide, which occurred on 24 June 2015, was such an example, initiating a reservoir tsunami that resulted in two deaths and severe damage to shipping facilities (Figure 1) [12]. These risks can be mitigated if one can establish reliable early warning systems. As landslide displacement can represent its evolution intuitively, accurate landslide displacement prediction is an effective means of establishing such reliable early warning systems [10,13,14].

**Figure 1.** Location map of landslides in TGRA mentioned in the paper.

In situ displacement monitoring techniques have been available since the 1940s, especially the global positioning system (GPS) technique [15–17]. These techniques make it possible to acquire real-time monitoring information. These monitoring data have been applied extensively in landslide displacement prediction (LDP). The research of LDP dates back to the 1960s with the presentation of the Saito model. Subsequently, numerous LDP theories and models have been successively proposed [18]. The development of LDP research can be summarized into three stages [14,19,20]. The first stage (from the 1960s to 1970s) is the phenomenological and empirical prediction, mainly based on the macroscopic deformation phenomenon before landslide failure. The prediction accuracy is usually unsatisfied because of a high dependence on the gained experience. The second stage (during the 1980s) is the displacement-time statistical analysis prediction, leading qualitative prediction to quantitative prediction. Benefiting from the development of mathematical sciences, various statistical mathematical models have been proposed and applied to the LDP (e.g., grey system theory) [21]. Without considering influencing factors, these models are built from statistics and mathematics. Hence, these approaches are primarily valid for landslides with similar deformation characteristics [22]. The third stage (from the 1990s to the present) is the nonlinear prediction and intelligent integrated prediction. Numerous nonlinear and intelligent LDP models have been proposed and applied in cases. These models can build relationships between landslide displacement and multiple triggering factors. Their prediction performance has shown encouraging improvement.

As intelligent algorithms, machine learning (ML) models have been extensively utilized to predict landslide displacements because of their nonlinear processing ability. These models, such as the back-propagation (BP) neural network [23,24], extreme learning machine (ELM) [25–29], random forest (RF) [30,31], and support vector machine (SVM) [32–34], have become popular and have been adopted in some landslide cases in the TGRA. Influencing factors and displacement are set as the input and output of the models, respectively. The trained models have achieved encouraging performances. Zhou et al. [27] selected an artificial bees colony (ABC) to optimize the parameters of a kernel-based extreme learning machine (KELM) for LDP. Li et al. [28] proposed an ensemble-based ELM and copula model to predict the displacement of the Baishuihe landslide in the TGRA. Hu et al. [30] developed an integrated LDP model by combining the Verhulst inverse function (VIF) and RF algorithm, which provided a practical approach for predicting the long-term deformation of landslides. Bui et al. [34] adopted ABC optimization to model the least squares support vector regression (LSSVR). These forecasting models belong to static models, whereas the evolution of landslides is a complex nonlinear dynamic process [35]. The deformation conditions of landslides at one time can be affected by that of the former time [36]. A dynamic model—long short-term memory (LSTM) neural networks—was applied to LDP [9]. Jiang et al. [37] combined the support vector regression (SVR) algorithm and LSTM model to forecast the displacement of the Shengjibao landslide in the TGRA. As a deep ML method, LSTM can deal with more complicated time series predictions. With the increment of the number of available monitoring data and the improvements in computer hardware and software, the LSTM model has become a priority choice to deal with more complicated time series prediction [38,39]. One drawback of LSTM is that it has more parameters to be trained than classical ML methods, which makes it challenging to obtain the optimum of all parameters simultaneously [10]. An improved version of the LSTM—the gated recurrent unit (GRU)—is proposed and adopted in LDP. GRU replaced the three gates (input gate, forget gate, and output gate) of LSTM with two new gates (reset gate and update gate). This structure of GRU makes it possible to reduce the number of hyperparameters required for training. Thus, it can be easier for GRU to obtain a well-trained model than the LSTM [31].

In general, the LDP in the dam reservoir involves decomposing the total displacement into several components (trend term, periodic term, and stochastic term) according to time series analysis and then through predicting each component by different methods. Each displacement component has clear mathematical and physical significance. This treatment of LDP has been proven to be effective in previous studies [10,23,31,33,36,40–42]. Several decomposition methods have been adopted, such as the average moving method [10,33], double exponential smoothing [10], variational mode decomposition (VMD) [40], empirical mode decomposition (EMD) [37], ensemble empirical mode decomposition (EEMD) [40], and wavelet transform (WT) [41]. It is critical to forecast periodic displacement accurately to ensure the good prediction performance of accumulated displacement for landslides [23]. The prediction of periodic displacement is a heated topic, and the predictive models are summarized as mentioned above. The trend displacement is usually modeled and predicted by fitting the curve of displacement–time with polynomial functions [23,31,33]. A piecewise curve may need several polynomial functions [10]. Another displacement component—the stochastic term—is usually ignored [10,32,37,43]. The main reason is that stochastic displacement is influenced by varied, ever-present, and unquantifiable stochastic factors.

This paper decomposed accumulated displacement into a trend term, periodic term, and stochastic term by CEEMD. A univariate and a multivariable GRU model were used to predict the trend and stochastic displacements, respectively. A multivariable GRU model was adopted to predict periodic term displacement, and another two popular ML methods—LSTM and RF—were used for comparison. The proposed model was applied in the displacement prediction of the Baijiabao landslide in the TGRA. The deep dynamic model has the advantages of a powerful prediction ability with a simpler structure and fewer trained hyperparameters. In addition, the stochastic displacement, neglected in most exiting prediction models, was considered in the proposed model.

#### **2. Approach to Model Displacements in Three Gorges Dam Reservoir**

#### *2.1. Time Series Decomposition*

The change in landslide accumulated displacement is determined by geological conditions, triggering factors, and stochastic factors [10,33]. Geological conditions involve internal factors, such as the geological structure, topography, lithology, etc. Triggering factors for landslides in the TGRA are mainly the seasonal rainfall and reservoir level fluctuation. Stochastic factors appear with uncertainties, including earthquakes, traffic load, wind load, etc. The displacement components induced by the above three factors can be represented as trend displacement, periodic displacement, and stochastic displacement, respectively. Consequently, the accumulated displacement can be expressed as Equation (1):

$$\mathbf{A} = \mathbf{T} + \mathbf{P} + \mathbf{S} \tag{1}$$

where A is accumulated displacement, T is trend displacement, P is periodic displacement, and S is stochastic displacement.

#### *2.2. Complementary Ensemble Empirical Mode Decomposition*

Empirical mode decomposition (EMD) was firstly proposed by Huang et al. [44]. They implemented EMD by converting a nonlinear sequence into a set of stationary sequences that consisted of several intrinsic mode functions (IMFs) and a residual. EMD, however, has the disadvantage of mode mixing, and thus ensemble empirical mode decomposition (EEMD) was presented by Wu et al. [45]. In EEMD, uncorrelated finite white noise is added into the original signal, and the final IMF is obtained by averaging all the IMFs. Due to the dependence of the added noise in EEMD, Yeh et al. [46] presented a modified algorithm of EEMD named complete ensemble empirical mode decomposition (CEEMD) to decompose the signal into different scale IMFs. By adding opposite random white noise into the decomposition results of EEMD, CEEMD realized the advantages of an improved decomposition, better denoising, and higher computational efficiency. The following steps settle the process of CEEMD decomposing the original time series.

The first step is to add positive and negative white noise pairs to the original time series.

$$
\begin{bmatrix} B\_i(t) \\ C\_i(t) \end{bmatrix} = \begin{bmatrix} 1 & 1 \\ 1 & -1 \end{bmatrix} \begin{bmatrix} S\_i(t) \\ a\_i(t) \end{bmatrix} \tag{2}
$$

where *Bi*(*t*) and *Ci*(*t*) are the time series after adding positive and negative white noise, respectively, *Si*(*t*) is the original time series, and *ai*(*t*) is the added white noise.

Subsequently, the EMD algorithm is used to decompose *Bi*(*t*) and *Ci*(*t*).

$$\begin{cases} \begin{array}{c} B\_i(t) = \sum\_j^l IMF\_{ij}^+ \\ \end{array} \\ \begin{array}{c} \text{C}\_i(t) = \sum\_j^l IMF\_{ij}^- \end{array} \end{cases} \tag{3}$$

where *J* is the number of *IMF* after decomposing, and *IMF*<sup>+</sup> *ij* and *IMF*<sup>−</sup> *ij* are the *j*th components of IMF after adding positive and negative white noise, respectively.

*N* sets of *IMFs* can be obtained after repeating the above two steps.

$$\left\{ \left\{ \left\{ \begin{aligned} {IMF^+\_{1j'}} {IMF^+\_{2j'}} \cdots {IMF^+\_{Nj}} \\ {IMF^-\_{1j'}} {IMF^-\_{2j'}} \cdots {IMF^-\_{Nj}} \end{aligned} \right\} \right\} \right\} \tag{4}$$

We can obtain the final *j*th *IMF* by averaging its positive and negative components.

$$IMF\_{\vec{j}} = \frac{1}{2N} \sum\_{i=1}^{N} \left( IMF\_{\vec{ij}}^{+} + IMF\_{\vec{ij}}^{-} \right) \tag{5}$$

Finally, the time series *Si*(*t*) is decomposed as Equation (6):

$$S(t) = \sum\_{j=1}^{N} IMF\_j \tag{6}$$

#### *2.3. Machine Learning Methods*

#### 2.3.1. Long Short-Term Memory Neural Network

Long short-term memory (LSTM) neural networks are in the category of dynamic recurrent neural networks (RNN). Due to the issues of gradient vanishing and gradient exploding in conventional RNN, they cannot handle the dependency of a long time series. To avoid such disadvantages of conventional RNN, Hochreite and Schmidhuber [47] proposed LSTM in 1997. In LSTM, a memory block is used as the basic unit of its hidden layer, consisting of a memory cell and three gates, named the input gate, forget gate, and output gate (Figure 2) [48].

**Figure 2.** Architecture of LSTM neural network.

The input gate controls the flow of input activations into the memory cell. The information from the hidden state at step t − 1 (*ht-1*) and the current input value (*xt*) is firstly passed along to the sigmoid function (σ). Then, the information of input data from the current step and previous data from the last step is used to update and generate a new vector. The forget gate is responsible for filtering information by means of passing along useful information to the next step and abandoning useless information. The output gate controls the transfer of useful information into other memory blocks.

We recorded the input sequence as *x* = (*x*1, *x*2, ... , *xT*), and can obtain the output sequence *y* = (*y*1, *y*2,... , *yT*) by treating Equation (7) to Equation (12).

$$\mathbf{i}\_t = \sigma(\mathcal{W}\_{\text{xi}}\mathbf{x}\_t + \mathcal{W}\_{\text{li}}\mathbf{h}\_{t-1} + \mathcal{W}\_{\text{cir}}\mathbf{c}\_{t-1} + \mathbf{b}\_i) \tag{7}$$

$$f\_t = \sigma \left( W\_{xf} \mathbf{x}\_t + W\_{hf} h\_{t-1} + W\_{cf} \mathbf{c}\_{t-1} + b\_f \right) \tag{8}$$

$$x\_t = f\_t c\_{t-1} + i\_t \tan h \left(\mathcal{W}\_{\text{xc}} x\_t + \mathcal{W}\_{\text{hc}} h\_{t-1} + b\_c\right) \tag{9}$$

$$O\_t = \sigma(\mathcal{W}\_{xo}\mathbf{x}\_t + \mathcal{W}\_{ho}h\_{t-1} + \mathcal{W}\_{co}\mathbf{c}\_{t-1} + b\_o) \tag{10}$$

$$h\_l = o\_l \tan h(c\_l) \tag{11}$$

$$y\_t = \mathcal{W}\_{hy} h\_t + b\_y \tag{12}$$

where *it*, *ft*, *ot*, and *ct* are the values of the input gate, forget gate, output gate, and a memory cell at time t; *bi*, *bf* , *bo*, and *bc* are their corresponding bias values; *Wx* are the weights between input nodes and hidden nodes; *Wh* are the weights between hidden nodes and cell memory; *Wc* are the weights connecting the memory cell to output nodes; *σ* is the sigmoid activation function; tan *h* is the hyperbolic tangent function mapping data to [−1, 1]; and *ht* is the hidden state, containing information about the history of earlier elements in the series.

#### 2.3.2. Gated Recurrent Unit

The gated recurrent unit (GRU) is an improved version of LSTM. Compared with LTSM, GRU has the advantages of fewer hyperparameters and faster training by using two new gates (update gate and reset gate) (Figure 3). These two gates are utilized to store as much information as possible for a long time series [49,50]. The reset gate is responsible for determining how much information at the previous moment is passed along, and resets the information at the current moment. The update gate controls the extent of information from both the previous time step and the current time step that will be passed along to the memory cell. The equations in GRU are given as follows:

$$\mu\_t = \sigma \left( \mathsf{W}\_{\text{xu}} \mathsf{x}\_t + \mathsf{W}\_{\text{hu}} \mathsf{h}\_{t-1} + \mathsf{b}\_{\text{u}} \right) \tag{13}$$

$$r\_t = \sigma(\mathsf{W}\_{\mathsf{X}r}\mathsf{x}\_t + \mathsf{W}\_{\mathsf{hr}}\mathsf{h}\_{t-1} + \mathsf{b}\_r) \tag{14}$$

$$h' = \tanh\left(\mathcal{W}\_{\text{xlr}}\mathbf{x}\_t + (r\_t \odot h\_{t-1})\mathcal{W}\_{hh} + b\_{\text{li}}\right) \tag{15}$$

$$h\_t = (1 - \mu\_t) \odot h' + \mu\_t \odot h\_{t-1} \tag{16}$$

where *u*t and *r*t are the values of the upset gate and reset gate, respectively; *h* is the value after resetting; *W* and *b* are the weights and deviations, respectively; represents pointwise multiplication between tensors. Other parameters indicate the same meaning as those in LSTM.

**Figure 3.** Structure chart of GRU.

#### 2.3.3. Random Forest

Random Forest (RF) is an ensemble ML method that has been well-developed for classification, regression, and other tasks [51]. This method has some advantages, including great robustness, data adaptability, and low overfitting [52]. The RF algorithm is realized based on multiple decision trees by sampling from the original dataset (both samples and their features) [53].

To build a decision tree, we divide the predictor space into the number of *J* regions that are distinct and non-overlapping and represented as *R*1, ... , *Rj*. The division is implemented by minimizing the root of the sum of squares.

$$\sum\_{j=1}^{J} \sum\_{i \in \mathcal{R}\_j} \left( y\_i - \hat{y}\_{\mathcal{R}\_j} \right)^2 \tag{17}$$

where *yi* is the observation belonging to *Rj*, and *y*ˆ*Rj* is the mean response for the training observations within the *j*th region.

Bagging is used to select training sets from the original dataset, and each training set is utilized for building a decision tree. The final prediction result *y*ˆ*bag* can be achieved by averaging the results of all decision trees (Equation (18)), which can improve the prediction accuracy by doing so.

$$\mathcal{g}\_{\text{bag}} = \frac{1}{\mathcal{M}} \sum\_{i=1}^{M} \mathcal{g}\_i \tag{18}$$

where *y*ˆ*<sup>i</sup>* is the prediction result of the *i*th decision tree and *M* is the number of decision trees.

#### *2.4. Prediction Process with the Proposed Model*

In the establishment of the proposed model (Figure 4), we adopted CEEMD to decompose the monitored accumulated displacement into a trend component and a periodic component. Subsequently, we used a univariate GRU model and a multivariate GRU model to predict the trend term and periodic term, respectively. The univariate GRU model described the trend displacement versus time, whereas the multivariate GRU model described the relationships between periodic displacement and influencing factors. A multivariate LSTM model and a multivariate RF model were also utilized for forecasting periodic displacement to verify the prediction performance of the GRU model. We adopted a multivariate GRU model to predict stochastic displacement.

The error analysis introduces the root mean square error (*RMSE*), mean absolute percentage error (*MAPE*), and the goodness of fit (*R*2) for validations. Smaller values of *RMSE* and *MAPE* and a larger value of *R*<sup>2</sup> reflect a better prediction performance.

$$RMSE = \sqrt{\frac{1}{N} \sum\_{i=1}^{N} (\mathbf{x}\_i - \mathbf{x}\_i)^2} \tag{19}$$

$$MAPE = 100\% \times \frac{1}{N} \sum\_{i=1}^{N} \left| \frac{\mathbf{x}\_i - \mathbf{x}\_i}{\mathbf{x}\_i} \right| \tag{20}$$

$$R^2 = 1 - \frac{N\sum(\mathbf{x\_i} - \hat{\mathbf{x\_i}})^2}{N\sum \mathbf{x\_i^2} - \sum \hat{\mathbf{x\_i}}^2} \tag{21}$$

where *xi* and *x*ˆ*<sup>i</sup>* represent the *i*th observed displacement and predicted displacement, respectively, and *N* is the record number of displacement.

**Figure 4.** Flowchart of the proposed predictive model.

#### **3. Baijiabao Landslide Case Study**


The Baijiabao landslide is located on the west bank of the Xiangxi River and belongs to Zigui County, Hubei Province, China (Figure 5). The Xiangxi River is a major tributary of the Yangtze River, approximately 2.5 km upstream from the estuary. The main sliding direction of the landslide is perpendicular to the Xiangxi River and orientated at N 82◦ E. The front part of the landslide is submerged in the Xiangxi River, whereas the interface between bedrock and soil bounds the upper edge. The left and right boundaries are defined by seasonal homologous gullies (Figure 6). The landslide has a leading-edge elevation of 160–175 m, a trailing-edge elevation of 265 m, a width of approximately 550 m, a length of approximately 400 m, an average thickness of 45 m, and an estimated volume of 9.9 × 106 <sup>m</sup><sup>3</sup> [25].

The sliding mass is mainly composed of silty clay and fragmented rubble. These sliding materials form a loose and disordered structure of the slope. The slip bed is silty mudstones and muddy siltstones of the Jurassic Xiangxi group, which dig into the hill by a direction of 260◦ with an angle of 30◦ [9]. The sliding surface is defined by the interface between colluvial materials and subjacent bedrock. The sliding zone is mainly composed of silty clay (Figure 7).

The Baijiabao landslide experienced large deformations since the impoundment of the Three Gorges Dam (TGD) in 2003 and kept deforming in the following years. In June 2007, tensile cracks with a length of 160 m and depth of 10 cm occurred at both side boundaries of the landslide close to the trailing edge. In May 2009, tensile cracks were observed on the road in the front and right parts of the landslide. A similar road deformation appeared in

the middle of the landslide. In June 2012, cracks of the trailing edge showed a connecting tendency. Besides, cracks of the boundaries extended to the front part of the landslide. In June 2015, several tensile cracks, both on the right boundary and Zi-Xing road, became larger. Before the impoundment of the TGD, 165 residents used to live in the landslide area, whereas now, only 20 residents live there.

**Figure 5.** (**a**) Location of the Baijiabao landslide; (**b**) overall view of the Baijiabao landslide.

**Figure 6.** Monitoring arrangement in the Baijiabao landslide.

**Figure 7.** Schematic geological cross-section A–1 of the Baijiabao landslide.

3.1.2. Monitoring Data and Deformation Characteristics of the Landslide

Four GPS stations numbered ZG323, ZG324, ZG325, and ZG326 were installed in the landslide area to monitor the surface displacements at one time per month since late 2006. Another two stations numbered ZG320 and ZG321 were established as the datum stations. Monitoring data from January 2007 to July 2018 were acquired (Figure 8). The displacements of the four monitoring stations showed a similar trend of step-wise, which meant that the landslide deformed distinctly in steps during April and September (especially from May to July) and became unremarkable in other times of the year.

**Figure 8.** Accumulated displacement in the Baijiabao landslide.

Cao et al. [25] analyzed the deformation characteristics and evolution of the Baijiabao landslide. The analysis showed that the Baijiabao landslide deformed as an entity. Station ZG324, located in the central position of the landslide, was chosen as a representative for establishing the displacement forecasting model. Figure 9 displayed the accumulated displacements at station ZG324, monthly rainfall, and reservoir water level, and all the data were obtained by measurement. The annual displacement, displacement during step-wise deformation period (from May to September), and the maximum monthly displacement were summarized in Figure 10.

**Figure 9.** Rainfall, reservoir water level, and accumulated displacement at ZG324, Baijiabao landslide.

**Figure 10.** Annual displacement increment, displacement during step-wise deformation period, and the maximum monthly displacement at ZG324, Baijiabao landslide.

It can be seen that a sharp displacement increment occurred every few years (2009, 2012, and 2015) that was more than 200 mm (204.81 mm, 206.18 mm, and 216.92 mm, respectively). The displacement in other years increased by less than 100 mm. Another phenomenon was that the displacement during the step-wise deformation period (from May to September) contributed to the majority of the displacement in the whole year, especially from May to July, which contributed to more than 70% of the annual displacement. The maximum monthly displacement occurred in June or July each year, except 2015 (occurred in August). For example, the yearly displacement in 2012 was 206.18 mm; the displacement increment between May and July was 187.55 mm and occupied 91% of the whole year displacement. The maximum monthly rainfall occurred in June and was up to 164 mm. The reservoir level dropped between May and July 2012, and the cumulative rainfall rose to 349.73 mm. Thus, the time from May to July can be the critical early warning period for step-wise landslides. The deformation during this period was mainly controlled by reservoir water level decline and heavy rainfall.

#### *3.2. Accumulated Displacement Decomposition*

The monitored data of station ZG324 from January 2007 to July 2017 and from August 2017 to July 2018 were selected as training and testing data sets, respectively. An appropriate decomposition method is crucial in establishing a landslide displacement prediction model. Several methods have been used in accumulated displacement decomposition, as mentioned in the introduction, and each has advantages and disadvantages. Zhu et al. [54] and Fu et al. [55] have demonstrated that CEEMD is an effective method for reconstructing landslide displacement, with the advantages of a high stability and complete decomposition. Therefore, the CEEMD method was adopted here to decompose accumulated displacement into trend term and periodic term displacements.

In the training of the forecast model, we tested 200 trials and set the standard deviation of the added white noise in each ensemble to 0.25. We used the CEEMD to decompose the accumulated displacement into several IMFs and a residual, while the residual represented a trend component. Subsequently, we can obtain the periodic displacement by summing up all of the IMFs or subtracting the trend term from the accumulated displacement. Figure 11 displayed the trend and periodic components of ZG324 in the Baijiabao landslide.

**Figure 11.** Displacement decomposition at ZG324.

#### *3.3. Trend Displacement Prediction*

Controlled by "internal" conditions, the trend displacement increases monotonically with time [23]. Some researchers forecasted trend displacement by fitting the displacement– time curve, and a polynomial was commonly used [33,37]. However, a single function can be insufficient to fit the curve properly [10]. A univariate GRU model was adopted to forecast the trend displacement in this study, and the established model achieved an excellent prediction performance (Figure 12). The prediction results of RMSE, MAPE, and R2-values were 2.09 mm, 0.14%, and 0.9984, respectively.

**Figure 12.** Predicted and measured trend displacement.

#### *3.4. Periodic Displacement Prediction*

#### 3.4.1. Triggering Factors Selection

Triggering factors selection is essential to guarantee the accuracy of a displacement predictor. According to the monitoring data of the Baijiabao landslide (Figures 9 and 10), rainfall and reservoir water level fluctuation are two major factors triggering its step-wise deformation. Selby [56] proposed that the evolutionary state of landslides was also an influential factor in the dependence of the movement on external factors. By referring to the research [9,25,31,36] and our previous work [42], seven candidate triggering factors were considered here.

Gray relational analysis (GRA) was used to check the degree of correlation between the periodic displacement and candidate triggering factors [57]. In GRA, we chose periodic displacement and candidate triggering factors as primary sequence and sub-sequences, respectively. All the sequences were normalized in the following way:

$$X\_k(i)' = X\_k(i) / \frac{1}{n} \sum\_{i=0}^n X\_k(i) \tag{22}$$

where *i* = 0, 1, ··· , *n*; *k* = 0, 1, ··· , *m*; *n* is the number of data points; m is the number of candidate triggering factors. The correlation coefficients were thus obtained by Equation (23):

$$\delta\left( (\mathbf{x}\_0(i)', \mathbf{x}\_k(i)') \right) = \frac{p + \rho q}{|X\_k(i)' - X\_0(i)'| + \rho q} \tag{23}$$

$$p = \underset{k}{\text{minim}} \left( X\_k(i)' - X\_0(i)' \right) \tag{24}$$

$$q = \max\_{k} \max\_{i} \left( X\_k(i)' - X\_0(i)' \right) \tag{25}$$

where *ρ* is the resolution coefficient and is usually set to 0.5.

The grey relational grade (GRG) was adopted to evaluate the correlation between variables, and was calculated by Equation (26):

$$r(\mathbf{x}\_0, \mathbf{x}\_i) = \frac{1}{n} \sum\_{k=1}^n \delta\left( (\mathbf{x}\_0(i)', \mathbf{x}\_k(i)') \right) \tag{26}$$

The GRG values vary from 0 to 1, with GRG values above 0.6 indicating a strong correlation between variables. The results were summarized in Table 1. GRG values between all the variables were above 0.6, suggesting that the candidate triggering factors can be used as the input of the prediction model.

**Table 1.** Candidate factors for the periodic displacement of Baijiabao landslide.


#### 3.4.2. Establishment of the Prediction Model

The training dataset was divided into training and validation sections, and they accounted for 70% and 30% of the total [9,35]. The triggering factors and periodic displacement were normalized to [−1, 1], and they were used as the input sequence and output sequence of the models, respectively. In this experiment, all the models used in the paper were implemented on MATLAB R2021a software, where the ML toolbox and deep ML toolbox were used. The GRU model had three layers: two were GRU layers, and the other one was a hidden layer. In the established GRU model, the number of hidden units was 200. The values of maximum epochs, minimum batch size, and initial learning rate were 250, 10, and 0.05, respectively. Those parameters of LSTM were 250, 1, and 0.01, respectively. In the RF model, the number of predictors and trees were 5 and 10, respectively.

The predicted values of GRU, LSTM, and RF models in the training process were shown in Figure 13. The prediction accuracy of the trained models was shown in Table 2. It indicated that the predicted displacements fitted well with the measured displacement in the trained LSTM and GRU models and were more satisfied than the RF model.

**Figure 13.** Measured and predicted displacements of GRU, LSTM, and RF models in training.


**Table 2.** Prediction accuracy of the trained models.

#### 3.4.3. Predicted Periodic Displacement

Figures 14 and 15 compared the measured and predicted periodic displacement at locations ZG324 using the GRU, LSTM, and RF models. The prediction accuracy of each model was summarized in Table 3. The GRU model gave the best agreement with the measured values in the three models, with RMSE, MAPE, and R<sup>2</sup> values of 1.21 mm, 11.87%, and 0.9952. Another deep ML method—LSTM—showed a lower prediction accuracy than the GRU model. Its RMSE, MAPE, and R2 were 3.67 mm, 26.67%, and 0.9672, respectively. Compared with the two deep ML methods—LSTM and GRU—the ensemble model RF did not demonstrate a satisfied prediction performance, and the accuracy factors were 7.35 mm, 69.84%, and 0.8517.

**Figure 14.** Training and prediction process of each model.

**Figure 15.** Predicted and measured periodic displacement.

**Table 3.** Prediction accuracy of periodic displacement.


The predicted displacements of GRU and LSTM aligned well with the measured displacement, including in the critical early warning period of the step-wise landslides (from May to July). During May to July 2018, the reservoir water level decreased from 160.39 m to 145.33 m, and the cumulative precipitation rose to 397.83 mm. The above two influencing factors caused the displacement to increase sharply. Several local peaks existed in the curve of the predicted results for the RF model. The error of each prediction time point (each month) was distributed disorderly.

It should be noted that the GRU model showed a better prediction performance than the LSTM and RF models on the whole rather than at every time point. For example, for the displacement prediction of March, 2018, the absolute error (AE) and relative error (RE) of the GRU model were 0.38 mm and 1.72%, whereas the indicators of the RF model were 0.27 mm and 1.25%.

#### *3.5. Stochastic Displacement Prediction*

According to displacement component composition, stochastic displacement can be obtained by removing the trend term and the periodic term from the accumulated displacement series. The results were shown in Figure 16, which indicated that stochastic displacement varied with time disorderly.

In this paper, the stochastic displacement of the Baijiabao landslide was trained and predicted by a multivariate GRU model. All of the impact factors and stochastic displacements were converted to a [−1, 1] format in sample data preprocessing. The prediction results were shown in Figure 17. The RMSE, MAPE, and R2 values were 1.48 mm, 94.36%, and 0.0793, respectively. The prediction accuracy was not satisfied, whereas the whole variant trend between the predicted value and measured stochastic displacement was identical.

#### *3.6. Accumulated Displacement Prediction*

According to the accumulated displacement composition, the total displacement can be obtained by making the sum of the predicted trend and periodic and stochastic displacements. Figure 18 showed that the predicted accumulated displacements compared well with the measured displacement. The RMSE, MAPE, and R<sup>2</sup> values were 1.48 mm, 0.09%, and 0.9936.

**Figure 16.** Stochastic displacement at ZG324.

**Figure 17.** Predicted and measured stochastic displacement.

**Figure 18.** Predicted and measured accumulated displacement.

#### **4. Discussion**

It is critical to forecast periodic displacement accurately in the prediction of accumulated displacement for landslides with step-wise deformation [23]. Multiple ML methods have been proposed and adopted in the periodic displacement prediction, such as BPNN, ElM, SVM, RF, etc. The evolution process of landslides is a dynamic, complex, and nonlinear system. With the advantages of handling complex nonlinear problems and considering the dynamic evolution, a deep dynamic model—GRU—is thus selected to predict landslide periodic displacement.

The performance of the model was validated with the observations of the Baijiabao landslide. Another two popular models, LSTM and RF, were adopted for comparison. The results showed that GRU achieved the best prediction accuracy in the three models. Compared with RF, GRU has the ability to establish connections between adjacent time steps, and this structure contributes to improving the prediction performance of the models. Compared with LSTM, GRU has a simpler structure and fewer hyperparameters. Thus, it can be easier to establish a well-trained GRU model and achieve a better prediction accuracy. It should be noted that though GRU indicated a higher prediction accuracy for one monitoring point in the Baijiabao landslide, this does not mean that the model applies to all landslides. The limitation of generalization inherent in the GRU model makes it difficult to predict all cases accurately. Such a limitation exists in all models [37]. To deal with this problem, ensemble models can be established by combining several models with different weights of the individual model [58]. In addition, switched prediction methods can be adopted to select the appropriate individual prediction model from several candidate models for a landslide [59].

Although the GRU model achieved an encouraging prediction accuracy, it has some drawbacks. One drawback is that the GRU uses the stochastic gradient descent optimization algorithm to update weights, which risks falling into local optimization [60]. Another drawback is that the deep GRU model demands a larger dataset size than conventional ML models [10]. The monitoring frequency is one time per month for the GPS data used in the Baijiabao case. It may take years to obtain enough data for the prediction model. If not enough training samples are available, the neural network cannot be fully trained, and therefore the prediction accuracy of the model will be affected. This drawback of GRU places a higher requirement on the monitored data of landslide deformation.

The stochastic displacement is induced by some stochastic factors, including earthquakes, wind load, and vehicle load, which make it a disordered series (Figure 14). This feature contributes to the difficulty in stochastic displacement accurate prediction. Little research on stochastic displacement prediction has been reported [33]. If a slope is marginally stable or even unstable, a slight stochastic "load" can lead to disequilibrium and intense deformation. The ignorance or underestimation of stochastic displacement may make landslide planners carry out nothing, thus increasing the possibility of landslide accidents. In this paper, stochastic component displacement was considered in accumulated displacement prediction. The stochastic displacement was determined by deducting the trend and periodic displacements from accumulated displacement, and was predicted by a multivariable GRU model. The prediction performance was unsatisfactory due to the varied, ever-present, and unquantifiable stochastic factors. The work is still a helpful experiment for understanding landslide displacement components and serves as an early warning for landslides. One should consider methods to develop optimal models for predicting stochastic displacement in the future [37].

The temporal prediction of landslides is one of the main components of early warning systems [61]. Empirical methods based on the trend of landslide rate and semi-empirical practices based on the displacement rate and acceleration can provide an estimation of landslide failure time [62]. In addition, multiple parameters relating to displacement, such as the displacement rate, displacement acceleration, and tangential angle, have been proposed as thresholds to suggest a probable failure, although these approaches cannot provide a time frame for such an occurrence [63]. Realizing the temporal prediction

of landslides at slope-scale based on relating the displacement would require a deeper dissertation in future work.

#### **5. Conclusions**

Displacement prediction is a vital and economic measure for landslide risk reduction and always emphasizes landslide research. This paper decomposed accumulated displacement into different displacement components by CEEMD. A univariate GRU model and a multivariable GRU model were used to predict the trend and stochastic displacements. A multivariable GRU model was used to establish a predictor for periodic displacement prediction, and two other popular ML models—LSTM and RF—were adopted for comparison. The predicted accumulated displacement was gained by the superposition of the three predicted displacement components. The results showed that predictors of deep ML methods—GRU and LSTM—had a higher prediction accuracy than the RF model in the studied case, which revealed the superiority of deep ML methods in long time series prediction. Both as deep ML methods, the GRU model achieved a better prediction performance than the LSTM model. One main reason is that the GRU algorithm has fewer hyperparameters to be trained in the model establishment than the LSTM algorithm. A prediction model with the structure of CEEMD—univariate GRU (trend displacement), multivariable GRU (periodic displacement), and multivariable GRU (stochastic displacement)—was proposed and achieved an encouraging prediction performance. The proposed model can be a potential tool for landslide risk reduction in the dam reservoir.

**Author Contributions:** Conceptualization, T.X. and B.Y.; analysis, B.Y. and T.X.; investigation, L.W. and W.H.; writing—original draft preparation, B.Y.; writing—review and editing, B.Y. and T.X.; supervision, B.Y. All authors have read and agreed to the published version of the manuscript.

**Funding:** This research was supported by the Natural Science Foundation of Shandong Provincial, China (No. ZR2021QD032).

**Institutional Review Board Statement:** Not applicable.

**Informed Consent Statement:** Not applicable.

**Data Availability Statement:** Some or all data, models, or code that support the findings of this study are available from the corresponding author upon reasonable request.

**Acknowledgments:** We wish to thank the Seventh Geological Brigade of Hubei Geological Bureau for their landslide investigation.

**Conflicts of Interest:** The authors declare no conflict of interest.

#### **References**


## *Article* **Data Mining and Deep Learning for Predicting the Displacement of "Step-like" Landslides**

**Fasheng Miao 1,2, Xiaoxu Xie 1, Yiping Wu 1,\* and Fancheng Zhao <sup>1</sup>**


**Abstract:** Landslide displacement prediction is one of the unsolved challenges in the field of geological hazards, especially in reservoir areas. Affected by rainfall and cyclic fluctuations in reservoir water levels, a large number of landslide disasters have developed in the Three Gorges Reservoir Area. In this article, the Baishuihe landslide was taken as the research object. Firstly, based on time series theory, the landslide displacement was decomposed into three parts (trend term, periodic term, and random term) by Variational Mode Decomposition (VMD). Next, the landslide was divided into three deformation states according to the deformation rate. A data mining algorithm was introduced for selecting the triggering factors of periodic displacement, and the Fruit Fly Optimization Algorithm– Back Propagation Neural Network (FOA-BPNN) was applied to the training and prediction of periodic and random displacements. The results show that the displacement monitoring curve of the Baishuihe landslide has a "step-like" trend. Using VMD to decompose the displacement of a landslide can indicate the triggering factors, which has clear physical significance. In the proposed model, the R2 values between the measured and predicted displacements of ZG118 and XD01 were 0.977 and 0.978 respectively. Compared with previous studies, the prediction model proposed in this article not only ensures the calculation efficiency but also further improves the accuracy of the prediction results, which could provide guidance for the prediction and prevention of geological disasters.

**Keywords:** Three Gorges Reservoir; Baishuihe landslide; data mining; displacement prediction; VMD-FOA-BPNN

#### **1. Introduction**

Landslides occur frequently around the world and are one of the most destructive geological disasters in the world [1,2]. Landslide displacement prediction is one of the geological engineering problems that at present has not been solved, especially for mountain and reservoir areas. Reservoir impoundment usually affects the surrounding geological environment, resulting in landslide disasters. As the largest power station in terms of installed capacity in the world since 2012, the water level of the Three Gorges Reservoir fluctuates between 145 and 175 m all year round. Hence, a large number of landslide disasters have developed in the Three Gorges reservoir [3,4]. Because the Three Gorges reservoir plays an important role in flood control and power generation, it is of great significance to study geological landslides in the Three Gorges Reservoir area [5,6].

Landslide displacement prediction is a hot topic at the forefront of natural hazard research [7]. Displacement prediction is the basis of early warning systems for landslide disasters. Accurate landslide displacement prediction can reduce the losses caused by such disasters as much as possible, so as to ensure the safety of people's lives and property. Due to the complex geological environment, the accuracy of current methods for directly predicting total displacement is not sufficient [8]. Hence, landslide displacement should be divided into several parts by the decomposition technique. At present, landslide

**Citation:** Miao, F.; Xie, X.; Wu, Y.; Zhao, F. Data Mining and Deep Learning for Predicting the Displacement of "Step-like" Landslides. *Sensors* **2022**, *22*, 481. https://doi.org/10.3390/s22020481

Academic Editor: Domenico Calcaterra

Received: 5 December 2021 Accepted: 8 January 2022 Published: 9 January 2022

**Publisher's Note:** MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

**Copyright:** © 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https:// creativecommons.org/licenses/by/ 4.0/).

displacement decomposition mainly adopts two methods. The first is the time series and simple moving average method [9,10]. This method is simple and practical, and the displacement component obtained has a clear physical meaning. However, due to defects of the decomposition method itself, the random displacement cannot be obtained. The second is empirical mode decomposition (EMD), wavelet analysis, and ensemble empirical mode decomposition (EEMD), which can divide the total displacement into a specific number of components, so it has clear physical significance [11–13].

The landslide displacement prediction model has experienced rapid development in the past 50 years, which was from the initial empirical model to the mathematical statistical model, and then to the non-linear theoretical model and the comprehensive model [14]. Nowadays, with the development of high-speed computers, various machine learning models including deep learning have been widely used for predicting landslide displacement, such as ELM (Extreme Learning Machine) [15], EML (Evaluating Machine Learning) [16], BPNN (Back Propagation Neural Network) [17,18], SVR (Support Vector Regression) [19], KELM (Kernel Extreme Learning Machine) [20,21], LSTM (Long Short-Term Memory) [9,22], and so on. Many algorithms have been used to optimize the parameters for the prediction models, including GS (Grid Search algorithm) [10], PSO (Particle Swarm Optimization) [23], GA (Genetic Algorithm) [24], FOA (Fruit Fly Optimization Algorithm) [25], GWO (Grey Wolf Optimizer) [26], and so on. Therefore, selection of the influencing factors plays a crucial role in the development of landslide prediction. Besides, for landslides in a reservoir area, the fluctuation of the reservoir level and rainfall are usually used as the hydrologic triggering factors of landslide deformation and failure [27]. However, the increase of input factors does not necessarily lead to higher prediction accuracy in the model of landslide displacement prediction. Based on the above facts, for different types of displacement, it is necessary to select the appropriate inducing factor as the input layer to establish the model. At present, data mining technology has been widely used in the field of geological hazards. Nevertheless, the research on data mining technology in landslides mostly focuses on association criterions and thresholds of triggering factors, while there are few publications on the joint use of data mining technology and deep learning. In order to optimize the triggering factors to find the most suitable factors for displacement prediction, data mining technology could be used.

In this paper, the Baishuihe landslide was taken as an example, which was in the east of Three Gorges Reservoir area. Data mining and deep learning were used for predicting the displacement. Based on the time series analysis of landslides, the displacement and triggering factors are decomposed by VMD. Periodic and random terms were predicted by FOA-BPNN. A flow chart of this work is shown in Figure 1.

**Figure 1.** Flow chart of the displacement prediction.

#### **2. Methodology**

#### *2.1. Two-Step Clustering*

The two-step clustering algorithm is usually applied to deal with large-scale types of data, which divides and integrates data through a two-step process of pre-clustering and clustering to complete the data classification [28]. For sample data including both numerical and subtype variables, the two-step clustering algorithm usually uses a loglikelihood function. If clustered into *j* classes, it is defined as:

$$I = \sum\_{j=1}^{J} \sum\_{i \in I\_j} \log p(X\_i \mid \theta\_i) = \sum\_{j=1}^{J} l\_j \tag{1}$$

where, *p* is the likelihood function; *Ij* is the set of samples of *j*th class; *θ<sup>j</sup>* is the parameter vector of *j*th class; *J* is the number of clusters. For all samples, the log-likelihood clusters are obtained as the aggregation of the log-likelihood clusters for each category.

For the certain *i*th class and *j*th class, the combination is noted as <*i*, *j*>, and then their distance can be defined as:

$$d(i,j) = \mathfrak{F}\_i + \mathfrak{F}\_j - \mathfrak{F}\_{\langle i,j \rangle} \tag{2}$$

where *ξ<sup>i</sup>* and *ξ<sup>j</sup>* are the log-likelihood distance of *i*th class and *j*th class, respectively. *ξ*<*i*,*j*<sup>&</sup>gt; is the log-likelihood distance of the combination of <*i*, *j*>. *ξ* is the specific form of the log-likelihood function:

$$\mathcal{L}\_{\mathcal{V}} = -N\_V \left( \sum\_{k=1}^{K^A} \frac{1}{2} \log \left( \mathcal{O}\_k^2 + \mathcal{O}\_{\text{vk}}^2 \right) + \sum\_{k=1}^{K^B} E\_{\text{vk}} \right) \tag{3}$$

where,

$$\stackrel{\triangle}{E\_{vk}} = -\sum\_{l=1}^{L\_k} \frac{N\_{vkl}}{N\_v} \log \frac{N\_{vkl}}{N\_v} \tag{4}$$

where *K<sup>A</sup>* is the number of numerical variables; *KB* is the number of categorical variable; *σ*ˆ <sup>2</sup> *k* and *σ*ˆ <sup>2</sup> *vk* denote the total variance of the *kth* numerical variable and the variance in *v*th class respectively; *Nv* and *Nvkl* are the sample size of category *v* and the first category in the *k*th subtype variable; *Lk* is the category of the *k*th subtype variable.

After *i*th class and *j*th class are combined, −*ξ*<*i*,*j*<sup>&</sup>gt; is greater than *ξ<sup>i</sup>* + *ξj*, and hence *d*(*i,j*) is less than 0. Moreover, the smaller *d*(*i,j*) is, the more it means that the merging of *i*th class and *j*th class will not cause a significant increase in intra-class differences. Specially, when *d*(*i,j*) is less than the threshold *C*, *i*th class and *j*th class can be merged. Conversely, when *d*(*i,j*) is greater than the threshold *C*, indicating that merging will cause a significant increase in variability within the clustered clusters, and the *i*th class and *j*th class cannot be merged.

The threshold value *C* is given by:

$$
\mathbb{C} = \log(V) \tag{5}
$$

$$V = \prod\_{k} R\_k \prod\_{m} L\_m \tag{6}$$

where *Rk* is the range of values of the *k*th numeric variable; *Lm* is the sample size of the *m*th subtype variable.

#### *2.2. Apriori Algorithm*

The a priori algorithm was proposed by Agrawal [29]. This algorithm can deal only with categorical variables rather than numeric variables. The algorithm mainly includes two steps: (1) generating frequent item sets that meet the minimum support values, and (2) generating association rules that satisfy the minimum credibility in the frequent item set generated in the first step.

The frequent item set *T* contains item *a* (frequent item set). If its support is equal to or greater than the support threshold specified by the user, as shown in Equation (1), the a priori algorithm uses the iterative method of layer-by-layer searching to generate frequent item sets.

$$\frac{|T(a)|}{|T|} \ge \min \sup p \tag{7}$$

Frequent *k*-item sets are used to explore and generate (*k* + 1)-item sets. The algorithm implementation process is shown in Figure 2.

**Figure 2.** The implementation process of the a priori algorithm.

Simple association rules are generated from the frequent item sets, and association rules with confidence levels greater than the threshold value are selected to form an effective rule set. If *CL*→(*L*−*L*) is greater than the confidence threshold specified by the user (see Equation (2)), then the association rule can be generated.

$$\mathcal{C}\_{L' \to (L - L')} = \frac{|T(L)|}{|T(L')|} \ge \min \, conf \,\tag{8}$$

#### *2.3. VMD*

Based on the EMD model, the VMD model was proposed in 2013, which is an adaptive method for signal processing and modal variation [30]. The constraint variation can be expressed as:

$$\left\{ \begin{array}{c} \min\_{\{u\_{k}\}, \{\omega\_{k}\}} \left\{ \sum\_{k=1}^{K} \parallel \partial\_{t} \left[ \left( \sigma(t) + \frac{j}{\pi t} \right) \* \mu\_{k}(t) \right] \mathbf{e}^{-j\omega\_{k}t} \parallel\_{2}^{2} \right\} \\\ \text{s.t. } \sum\_{k=1}^{K} u\_{k} = f(t) \end{array} \right\} \tag{9}$$

where *f*(*t*) is the original signal, *K* is the number of components, *∂<sup>t</sup>* denotes the Dirac function, {*ωk*} denotes the actual central frequency, {*uk*} denotes the component obtained after decomposition, *σ*(*t*) + *<sup>j</sup> πt* ∗ *uk*(*t*) denotes the analytical signal of each component, *e*−*jω<sup>k</sup> <sup>t</sup>* denotes the estimated central frequency of each analytical signal, and \* denotes the convolution operator. We then obtain the following:

$$L(\{u\_k\}, \{\omega\_k\}, \lambda) = a \sum\_k \parallel \partial\_t \left[ \left( \sigma(t) + \frac{j}{\pi t} \right) \* u\_k(t) \right] \mathbf{e}^{-j\omega\_k t} \parallel\_2^2 + \parallel f(t) - \sum\_k u\_k(t) \parallel\_2^2 + \left\langle \lambda(t), f(t) - \sum\_k u\_k(t) \right\rangle \tag{10}$$

where *λ* denotes the Lagrange multiplier.

By using the alternative direction method of multipliers (ADMM), the saddle point of the model without an upper constraint can be obtained, which is the optimal solution of the constrained variational model, so that the original signal can be decomposed into IMF components.

#### *2.4. FOA-BPNN*

The BPNN is a multilayer feedforward neural network based on error back propagation algorithm training, which was first proposed by Rumelhart and McClelland [31]. BPNNs have arbitrary complex pattern classification and good multidimensional function mapping ability. In addition, it can solve XOR and other problems that simple perceptrons cannot solve. Structurally, BPNNs are composed of an input layer, a hidden layer, and an output layer. The BP algorithm takes the network's square error as the objective function and uses the gradient descent method to calculate the minimum objective function.

In addition, as proposed by Wen-Tsao Pan [32], the FOA is a new method of global optimization, which is based on the foraging behavior of *Drosophila melanogaster*. Because the fruit fly is superior to other species in terms of smell and vision, the olfactory organ of *Drosophila* can collect all kinds of smells floating in the air, even the smells of food sources 40 km away. Then, after flying to the vicinity of the food location, they can use their sharp vision to find the food or observe the gathering position of their companions, and fly in that direction. The optimization procedure of the FOA-BPNN is shown in Figure 3.

**Figure 3.** Optimization procedure of an FOA-BPNN.

#### **3. Case Study**

#### *3.1. Geological Settings of Three Gorges Reservoir Area*

The Three Gorges reservoir is an artificial lake formed after the completion of the Three Gorges hydropower station, situated in the middle part of China. The total lengths of the Yangtze River and surrounding area are 660 km and 1084 km<sup>2</sup> respectively. The altitude drops from the highest part to the west and east, forming a hilly landform and medium altitude mountains, respectively. The trend of the mountains is controlled by the main geological structures. The strata in the Three Gorges Reservoir area are from pre Sinian to Quaternary. Jurassic red strata are dominant in the Three Gorges Reservoir area, mainly exposed in the west of Zigui county east of Fengjie county (the red strata refer to sandstone, mudstone, and sandstone interbedded with mudstone layers). In addition, other sedimentary rocks (limestone, marl, and dolomite) also exist in the area between Fengjie and Zigui. These hard rocks form a steep canyon in Fengjie–Zigui area. Metamorphic complexes and magmatic rocks appear in the area near the dam site on a relatively small scale. Controlled by the complex geological conditions, coupled with seasonal rainfall and periodic fluctuation of reservoir water level, a large number of geological disasters have developed in the Three Gorges Reservoir area. A total of 4429 geological disasters have been found up to the present time, most of which are landslides, rock falls, and debris flows [4]. A geological map of the Three Gorges Reservoir area is shown in Figure 4.

**Figure 4.** Geological map of the Three Gorges Reservoir area.

#### *3.2. Local Environmental Conditions*

The Baishuihe landslide is in the Zigui County area of the Three Gorges Reservoir area, located in the middle latitude, belonging to a subtropical continental monsoon climate zone, with a warm and humid climate, sufficient light, abundant rainfall, and distinct seasons. The average annual rainfall of Zigui County is 1493.2 mm. Rainfall is generally concentrated in the flood season in this area, and the maximum daily rainfall has historically reached up to 358 mm. The monsoon is mainly southerly. Limited by the terrain, the wind speed is generally low. The Yangtze River is the lowest erosion base level in this area and flows through the front edge of the landslide from west to east. The cross section of the river valley is a "V" shape, steep to the north and gentle to the south. There are several gullies short in length and depth in the landslide area, all of which are trunk gullies. Only temporary flood flows are formed after rainstorms, which constitute the primary discharge channel of surface water in the area.

The Baishuihe landslide is located on the south bank (convex bank) of the Yangtze River. The elevation of the landslide gradually decreases from south to north. The elevation of the toe and rear edges is about 70 m and 400 m, and the bedrock ridge is the boundary between the eastern and western sides. The deformation of the middle and front part of the landslide is relatively strong. Pinnate fissures are continuously distributed on both sides of the boundary, and the boundary between the east side and the rear edge is basically connected. The slope of the landslide is 30◦ to 35◦, and the landslide has an average thickness of 30 m with a volume of 1.26 × <sup>10</sup><sup>7</sup> <sup>m</sup>3. The topographic map and a schematic geological profile of the Baishuihe landslide are shown in Figures 5 and 6.

**Figure 5.** (**a**) Location of the Baishuihe landslide; (**b**) Topographic map of the Baishuihe landslide; (**c**) Overall view of the Baishuihe landslide © 2022 Springer Nature [33].

**Figure 6.** Schematic geological profile of the Baishuihe landslide (II–II') © 2022 Springer Nature [10].

#### *3.3. Deformation of the Landslide*

The Baishuihe landslide has been monitored since June 2003, and the layout of the monitoring surface layout is shown in Figure 5b. According to the characteristics of the surface monitoring displacement and surface macro-deformations, the Baishuihe landslide can be divided into two areas; however, the active area (area *A*) is the middle and front part of the landslide, which has strong deformation. After the completion of the Three Gorges Dam, the landslide has produced obvious displacement due to the impoundment of the reservoir. Several transverse tension cracks have appeared in the east of the landslide. Specifically, the eastern and posterior boundaries are basically connected, and the western boundary's cracks are in the shape of pinnately distributed cracks. From August 2005 to August 2006, there were many landslides on the inner side slope of the riverside highway with an elevation of about 220 m, and many subsidence and tension cracks appeared on the surface of the landslide. Approximately 100,000 m3 of landslide debris piled on the road towards the rear of the active area in June 2007 (Figure 7).

**Figure 7.** Macroscopic deformation of the Baishuihe landslide © 2022 Springer Nature [33].

#### *3.4. Analysis of the Monitoring Data*

There are three monitoring sections and six GPS monitoring points in the active Baishuihe landslide area. Among them, the monitoring points ZG93 and ZG118 have been in place since June 2003; XD01 and XD02 were added in May 2005, and XD03, XD04 were added in October 2005. Note that the displacement of all the monitoring points is synchronous. The monitoring period of Points ZG93, ZG118, and XD01 is long, meaning that they are representative and can reflect the entire movement process of the landslide. Therefore, in this study, these three points were taken for detailed analyses (Figure 8). According to the filling scheduling of the reservoir, the monitoring data can be divided into three stages for analysis, as described below.

**Figure 8.** Long term monitoring data of the Baishuihe landslide (displacement, reservoir level, precipitation).


In summary, the fluctuation in the reservoir's water level resulted in the significant extension of the fluctuating range and immersion range of the reservoir's water level, then, in turn, the stress field, seepage field, and rock–soil structure characteristics of the sliding mass changed significantly, which had a significant impact on the evolution process of the Baishuihe landslide. In addition, owing to the different stages of the reservoir's water level operations, there were some differences in the degree of impact on the deformation evolution process of the landslide. Moreover, although the landslide has undergone some adjustment, its shear deformation energy has been released to a certain extent. However, when external effects such as rainfall and the reservoir water level change dramatically again, the landslide will tend to be unstable.

#### **4. Results**

#### *4.1. Triggering Factors*

Rainfall and periodic fluctuation of reservoir water level are the main inducing factors of landslide deformation [34,35]. The periodic fluctuation of the water level in the Three Gorges causes dynamic osmotic pressure in the slope, resulting in landslide deformation [36]. On the one hand, rainfall can increase the weight of the landslide mass, thus increasing the sliding force of landslides; on the other hand, it can weaken the mechanical strength of the landslide rock and soil mass, resulting in landslide deformation [37,38]. Therefore, rainfall and reservoir water level can be used as trigger factors for landslide deformation [39]. In this research, a total of 10 triggering factors were selected to carry out displacement prediction research, including 5 reservoir level related factors (monthly average water level *h*; monthly maximum daily drop of water level Δ*h dailydrop* max ; monthly maximum daily rise of water level Δ*h dailyrise* max ; monthly fluctuation of water level Δ*hmonth*; bimonthly fluctuation of water level Δ*h*2*month*), 4 rainfall related factors (monthly maximum

effective continuous rainfall *q ef f ective continuous*; monthly cumulative rainfall *<sup>q</sup>month*; bimonthly cumulative rainfall *q*2*month*; monthly maximum daily rainfall *q day* max), and 1 deformation factor (last monthly velocity of deformation *v*), as shown in Table 1. In this study, the monitoring data of ZG93 were selected for landslide prediction. In addition, because the monitoring points ZG118 and XD01 have similar deformation characteristics to ZG93, the monitoring data of ZG118 and XD01 were added to increase the sample size and overcome model overfitting errors, as well as to provide a more representative prediction of the overall landslide displacement. The triggering factors are shown in Table 1.


**Table 1.** Triggering factors used to carry out displacement predictions.

#### *4.2. Clustering Results*

In the two-step clustering algorithm, the minimum and maximum categories of the triggering factors were set as 2 and 10 respectively. In clustering algorithms, there are two commonly used clustering criteria: the Akaike Information Criterion (AIC) and the Bayesian Information Criterion (BIC). When the number of samples is large, the BIC criterion can effectively avoid the model complexity caused by high model accuracy. Therefore, in this study, the BIC was chosen as the cluster criterion, and the distance measurement method was Euclidean distance. The clustering results of the external triggering factors are shown in Tables 2 and 3. Monthly velocity (*v*) was clustered into three categories (Low, V1; Medium, V2; High, V3), as shown in Table 4.

**Table 2.** Clustering results of the reservoir water level factors (ZG93, ZG118, XD01).



**Table 3.** Clustering results of the rainfall factors (ZG93, ZG118, XD01).

**Table 4.** Clustering results of the monthly velocity (ZG93, ZG118, XD01).


#### *4.3. Association Rules*

In the a priori algorithm, the minimum conditional support was set to 0.01 and the minimum rule confidence was set to 100% to ensure that the mining association criteria were absolutely correct. In total, 5447 association rules were generated, most of which were V1 and V2 stages (4247 and 1008, respectively). The main factors controlling V1 deformation of the landslide were smooth fluctuations of the reservoir's water level and light rainfall. The main factors controlling V2 deformation of the landslide were sharp fluctuations of the water level and medium to heavy rainfall. The main factor controlling V3 deformation of the landslide was heavy rainfall. Nevertheless, there may be some time correlation between these nine factors. In general, a drop in reservoir water and heavy rainfall were the main factors causing landslide deformation in the Three Gorges Reservoir area. It can be seen from Figure 7 that the water level of the Three Gorges reservoir has had a period of slow decline (175 m–165 m) from January to April and a rapid decline (165 m–145 m) from April to June since 2008. The heavy rainfall is concentrated from June to September every year. Moreover, this is also a critical period when the landslide produces severe deformation.

The statistical results of the data mining and association rules are shown in Table 5. The total support, average support, and the contribution without support of each triggering factor were counted, and the comprehensive contribution was the mean value of these three contributions. The comprehensive contribution of each factor according to the association rules is shown in Figure 9. Factors with a degree of contribution less than 0.3 were eliminated and were not used as input layers in the prediction model. Therefore, eight triggering factors were taken as the input layer in the V1 and V3 prediction models (F1, F3, F5, F6, F7, F8, F9, and F10), and eight triggering factors were taken as the input layer in the V2 prediction model (F1, F2, F5, F6, F7, F8, F9, and F10).


**Table 5.** Statistical results of the data mining and association rules.

**Figure 9.** Comprehensive contribution of each factor according to the association rules (V1: Low monthly velocity; V2: Medium monthly velocity; V3: High monthly velocity; F1: Monthly average water level; F2: Maximum monthly daily drop in water level; F3: Maximum monthly daily rise in water level; F4: Monthly fluctuation of the water level; F5: Bimonthly fluctuation of the water level; F6: Maximum monthly effective continuous rainfall; F7: Cumulative monthly rainfall; F8: Cumulative bimonthly rainfall; F9: Maximum monthly daily rainfall).

#### *4.4. Decomposition of Displacement*

The non-stationary time series theory indicated that the time series consisted of three parts: the trend term, the periodic term, and the random term. For the landslide displacement, the time series can be divided into three parts: (1) trend displacement, which is controlled by internal factors, such as geological conditions, geomorphology, geological structure, rock and soil properties, etc.; (2) periodic displacement, which is controlled by external factors, such as rainfall, the reservoir's water level, wind load, air temperature, etc.; and (3) random displacement, which is controlled by random factors, such as human activities, engineering construction, vehicle loads, vibration loads, etc., as shown in Figure 10.

$$X(t) = \mathfrak{a}(t) + \beta(t) + \gamma(t) \tag{11}$$

**Figure 10.** Relationship among trend displacement, periodic displacement, and random displacement.

Here, *X*(*t*) denotes the observed value of landslide displacement, and *α*(*t*), *β*(*t*), and *γ*(*t*) denote the trend, periodic, and random displacements, respectively.

Therefore, *K* was set to 3 and 2 in the VMD decomposition of the landslide displacements and triggering factors, respectively. The penalty parameter *a* and the rising step *τ* (*a* = 1.5 and *τ* = 0.1) were finally determined through multiple trials as follows. (1) In the displacement decomposition, *a* = 1.5 and *τ* = 0.1. (2) In the triggering factors decomposing, *a* = 700 and *τ* = 0.5. The decomposition results are shown in Table 6 and Figure 11.


**Table 6.** Composition of training and prediction samples.

#### *4.5. Displacement Prediction*

4.5.1. Trend Term Prediction

The displacement of the trend term showed a distinct piecewise function. Therefore, the trend term of ZG93 was divided into three phases: Phase 1 (June 2003~June 2007), Phase 2 (June 2007~June 2014), and Phase 3 (June 2014~December 2016). Multiple fitting results showed that good fitting results can be obtained by using a cubic function and the robust least squares method. The fitting function can be defined as:

$$S = at^3 + bt^2 + ct + d \tag{12}$$

The fitting results and parameters are shown in Figure 12 and Table 7, which indicate that the prediction accuracy's R2 and the RMSE of the trend term were 99.4% and 4.063.

**Figure 11.** VMD decomposition of displacements at the monitoring points (ZG93, ZG118, XD01).

**Figure 12.** Fitting and prediction curves of the trend term.


**Table 7.** Parameters of the trend term of displacement based on polynomial fitting.

#### 4.5.2. Periodic and Random Term Prediction

The periodic and random displacements were trained and predicted by FOA-BPNN. In general, a model's performance is usually affected by its own structure. Through extensive sensitivity analysis, the most reliable structure can be obtained [40,41]. Therefore, in the process of FOA optimization, 12 different structures with population sizes between 10 and 120 (10 intervals) were tested [42,43]. Each network was executed with 100 repetitions, and the MSE (between the actual and predicted periodic displacements of the landslide) was defined as the objective function used to evaluate the performance error of the model. It is worth noting that each structure was tested five times to evaluate its repeatability. The sensitivity curves are shown in Figure 13, which indicate that the MSE of the model decreases with an increase in the population size. However, because FOA-BPNN integration reduces the error in the training process, the model is less sensitive to population size. The computing time of models with different population sizes is shown in Figure 14. After consideration of the calculation costs and error, the population size of FOA-BPNN model was determined as 10. In total, six FOA-BPNN models were built, including individual periodic prediction models for V1, V2, and V3, and individual random prediction models for V1, V2, and V3, as shown in Figure 15.

**Figure 13.** A population-based sensitivity analysis for the FOA-BPNN model.

**Figure 14.** Computing time of different population sizes in MATLAB2019 software.

**Figure 15.** Training and prediction curves of the periodic and random terms.

Figure 15 indicates that the proposed models achieved good prediction results. According to the results of the residual error analysis, in the training process of the model, the residual error of displacement was relatively stable, which also verified the robustness and reliability of the model. For the prediction samples, there were some fluctuations in the residual error. The prediction accuracy of the model will be analyzed in the Discussion section.

#### 4.5.3. Total Displacement

The total displacement prediction results of the landslide can be obtained by superimposing the prediction results of all three types of displacement, as shown in Figure 16, which shows that the prediction model achieved good accuracy for monitoring point ZG93. In June 2007, there was a significant difference between the total displacement training value and the actual value, resulting in the obvious mutation of the residual error. This was because in the three parts of landslide displacement (trend, periodic, random), the trend displacement accounts for more than 85%. In June 2007, it was the boundary between Phase 1 and Phase 2, where there were some differences in the training results of the two polynomial fitting functions, resulting in a large residual error in the total displacement. However, the residual error was relatively stable for the prediction samples.

**Figure 16.** Training and prediction curves of the total displacement.

#### **5. Discussion**

As mentioned above, the landslide displacement contains three parts: (1) trend displacement, which is controlled by internal factors; (2) periodic displacement, which is controlled by external factors; and (3) random displacement, which is controlled by random factors. Generally, in predictions of landslide displacement, the selection of triggering factors is based on monitoring data such as rainfall and the reservoir's water level, which are the main factors causing periodic displacement. Therefore, taking these factors as the input of periodic displacement will not only have clear physical significance but will also significantly improve the accuracy of landslide displacement predictions. When the time series analysis method was used to predict the landslide displacement, the displacement trend was relatively easy to predict. Therefore, choosing the appropriate periodic displacement prediction model is the key to improving the effect of landslide displacement predictions. Moreover, landslide prediction models have experienced rapid development in the past 50 years, and various machine learning models have been widely used for predicting landslide displacements. However, each algorithm has its limitations. For instance, SVM has low computational complexity but it is sensitive to the choice of parameters and kernel function. The decision tree model does not need any prior assumptions on the data, but the required sample size is relatively large, and its ability to deal with missing values is quite limited. ELM uses the principle of least squares and a pseudo-inverse matrix to solve the problem, which is only suitable for single-hidden-layer neural networks. BPNN has strong

self-learning, self-adaptive ability, and good generalization ability but it is prone to slow convergence. In this study, based on the VMD and data mining results, the FOA-BPNN was used to predict the periodic and random terms of monitoring point ZG93 s displacement. The BPNN, SVM, and ELM algorithms were chosen as the comparison models (Models 2–4). The performance of various displacement prediction models of the Baishuihe landslide are shown in Table 8 and Figure 17. The prediction accuracy of the FOA-BPNN model was the highest. The R<sup>2</sup> reached 0.977 and its RMSE was only 10.041. In contrast, the proposed model could improve the accuracy of landslide displacement predictions.

**Table 8.** Performance of various displacement prediction models of the Baishuihe landslide.

**Figure 17.** Prediction curves of the total displacement.

In this study, ZG93's monitoring data were selected for predicting displacement, and the monitoring data of points ZG118 and XD01 were added to increase the sample size and overcome the model's overfitting error, and to provide a better representative prediction of the overall landslide displacement. The accuracy of various models in terms of predicting ZG93's displacement has been discussed. The monitoring points ZG118 and XD01 in 2016 were used for the model validation. The measured and predicted displacements of ZG93, ZG118, XD01 are shown in Figure 18. The R<sup>2</sup> values between the measured and predicted displacements of ZG118 and XD01 were 0.977 and 0.978, respectively. The RMSE of these two monitoring points was 12.40 and 16.04, respectively. In the previous study [10], cumulative displacement was divided into trend term and periodic term by time series model and moving average method. A cubic polynomial model was proposed to predict the trend term of displacement. Then, multiple algorithms were used to determine the optimal support vector regression (SVR) model and train and predict the periodic term. In this paper, data mining technology is used to screen the trigger factors of periodic items, and the more advanced FOA optimization algorithm is used to optimize the parameters of the machine learning model. Furthermore, this paper uses a VMD model to divide the landslide displacement data, which makes great progress compared with the moving average model, and will be more conducive to the integration and automation of the landslide prediction model. Therefore, the prediction accuracy obtained in this paper

(R<sup>2</sup> = 0.977 and 0.978) is significantly higher than that of previous studies (R<sup>2</sup> = 0.963 and 0.951). In general, the model proposed in this study has achieved good results in terms of predicting the displacement of different monitoring points of the landslide, which has high practicability and application value in the study of landslide displacement predictions. However, it is worth noting that due to the small amount of displacement data in the V3 state of the monitoring point (Table 6), the prediction results of XD01 have obvious errors for July 2016. Therefore, in order to obtain satisfactory prediction results, the monitoring data of various states should be supplemented as much as possible.

**Figure 18.** Measured and predicted displacements of ZG93, ZG118, and XD01.

#### **6. Conclusions**

In this paper, the Baishuihe landslide in the Three Gorges Reservoir area was taken as an example. Data mining and deep learning were used for displacement prediction. The following conclusions can be reached:


**Author Contributions:** Writing—original draft, F.M.; data curation, X.X.; writing—review and editing, Y.W.; investigation, F.Z. All authors have read and agreed to the published version of the manuscript.

**Funding:** This research was supported by the National Natural Science Foundation of China (42007267, 41977244), Science and Technology Project of Hubei Provincial Department of Natural Resources (ZRZY2020KJ12), and the National Key R&D Program of China (2017YFC-1501301).

**Institutional Review Board Statement:** Not applicable.

**Informed Consent Statement:** Not applicable.

**Data Availability Statement:** Not applicable.

**Conflicts of Interest:** The authors declare no conflict of interest.

#### **References**

