Slope Stability Prediction Using Principal Component Analysis and Hybrid Machine Learning Approaches

Lei, Daxing; Zhang, Yaoping; Lu, Zhigang; Lin, Hang; Fang, Bowen; Jiang, Zheyuan

doi:10.3390/app14156526

by Daxing Lei ^1,2,*, Yaoping Zhang ^1,2, Zhigang Lu ^1,2, Hang Lin ^3, Bowen Fang ^4 and Zheyuan Jiang ^4,*

^1 School of Resources and Architectural Engineering, GanNan University of Science and Technology, Ganzhou 341000, China

^2 Key Laboratory of Mine Geological Disaster Prevention and Control and Ecological Restoration, Ganzhou 341000, China

^3 School of Resources and Safety Engineering, Central South University, Changsha 410083, China

^4 Jiangsu Key Laboratory of Urban Underground Engineering and Environmental Safety, Institute of Geotechnical Engineering, Southeast University, Nanjing 210096, China

^* Authors to whom correspondence should be addressed.

Appl. Sci. 2024, 14(15), 6526; https://doi.org/10.3390/app14156526

Submission received: 27 June 2024 / Revised: 16 July 2024 / Accepted: 24 July 2024 / Published: 26 July 2024

(This article belongs to the Special Issue Artificial Intelligence in Civil Engineering: Latest Advances and Prospects)

Abstract:

Traditional slope stability analysis methods are time-consuming and complex, and they cannot provide fast stability estimates when a large number of slope cases must be assessed. In such situations, artificial neural networks (ANN) provide a better alternative. Based on the ANN, the particle swarm optimization (PSO) algorithm, and the principal component analysis (PCA) method, a novel PCA-PANN model is proposed. Then, a dataset of 307 slope cases covering a wide range of slope geometries and mechanical properties of geomaterials is developed. The hybrid machine learning model trained with the dataset is applied to the factor of safety (FoS) prediction of an actual slope, and three evaluation indicators are introduced to measure the prediction performance of the model. Finally, a sensitivity analysis of the input parameters is carried out, and slope protection strategies for the different sensitive factors are proposed. The results show that this new model can quickly obtain the FoS and stable state of the slope without complex calculation, only by providing the relevant characteristic parameters. The correlation coefficient of the PCA-PANN model for slope stability analysis reaches more than 0.97. The sensitivity of the influencing factors, from largest to smallest, is slope angle, cohesion, pore pressure ratio, slope height, unit weight, and friction angle.

Keywords:

slope stability; factor of safety; principal component analysis; machine learning; neural network

1. Introduction

A slope is a commonly used geological environment and load carrier in engineering construction [1,2,3]. As shown in Figure 1, a large number of slopes are formed during the construction of mines, highways, railways, water conservancy, and construction projects. Stability analysis is the most important part of any slope design study [4,5]. Slope instability causes great losses to economic construction and people’s lives [6]. Nearly USD 19.8 billion is spent each year on the damage caused by landslides worldwide, accounting for 17% of the annual loss attributed to global natural disasters [7]. Therefore, it is of great theoretical and practical significance to construct a reliable and effective slope stability prediction model [8,9,10,11,12].

To better estimate and evaluate the slope stability, the factor of safety based on an appropriate geotechnical model is required. The term “factor of safety” (FoS) was considered an important criterion for assessing slope stability [13]. Generally, a slope with a FoS greater than 1.2 is considered safe [14]. Since Terzaghi [15] published the classic work entitled “Mechanism of Landslides”, numerous researchers contributed to a better understanding of slope stability. Traditional methods of slope stability analysis mainly include limit equilibrium methods (LEM) and qualitative evaluation methods. The LEM was first proposed by Fellenius [16] and is the earliest and most widely used method in slope stability research. The principle of LEM is to evaluate the slope stability by calculating the FoS based on the relationship between the shear strength and sliding force on the slope sliding body [17]. Commonly used LEM include the Swedish arc method, Morgenstern-Price method, simplified Bishop method, Janbu’s generalized method, Spencer method, etc. [18,19,20,21]. In addition, Zhou and Cheng [22] used a quasi-dynamic method to establish a strict LEM considering interstrip forces to automatically search the sliding surface of a 3D landslide and determine its FoS. Recently, based on the LEM, Bansal and Sarkar [23] used the GeoStudio software to study the slope safety factor in Kalimpong, Darjeeling Himalayas under dry and saturated conditions. Apart from LEM, a series of excellent numerical analysis methods, such as finite element method (FEM), finite difference method (FDM), and Fast Lagrangian Analysis of Continua (FLAC) emerged [24], which can effectively solve complex slope stability problems. Numerous studies combined with numerical analysis methods calculated the FoS, and quantitatively revealed the mechanism and process of gradual failure of geomaterials before and after instability.

The above studies greatly deepened the understanding of slope instability. However, the slope is an open, uncertain, and nonlinear complex system. Because slopes are affected by a variety of random and uncertain factors (e.g., the spatial variability of geotechnical parameters and the infiltration of water), the application of the above methods is subject to several restrictions, and it is difficult to achieve the ideal prediction effect [25,26,27]. For example, due to the large number of potential sliding surfaces, it is cumbersome and difficult to find the critical sliding surface by using the LEM [28,29]. Therefore, the use of LEM is not feasible when a large number of slope cases need to be analyzed in a limited time, particularly for the rapid detection of landslide activity and the issuing of landslide warnings [30,31]. The accuracy of the numerical simulation method depends on the setting of the constitutive model, boundary conditions, and mechanical parameters, which often require rich engineering experience and in situ back analysis to obtain reasonable results [32]. Moreover, compared with the LEM, the numerical simulation method requires a longer computation time [33]. Despite these flaws, the numerical simulation method is still a promising tool. Nevertheless, an easily developed method is needed to replace LEM and qualitative evaluation methods for fast FoS prediction.

Benefiting from the vigorous development of computational intelligence technology, the advent of machine learning (ML) models shed light on tackling the challenge of FoS prediction [34,35,36]. Applying ML approaches to predict FoS based on collected slope cases became an important solution [37]. As a robust soft computing model, neural networks are widely used to assess slope stability and represent the mechanical behavior of materials and other nonlinear problems [38,39,40,41,42,43,44,45,46,47,48]. Neural networks include artificial neural networks (ANN), back propagation neural networks, differential evolution neural networks, and other methods. Feng [49] pioneered the concept of intelligent rock mass mechanics and established a neural network slope stability prediction model. Bui et al. [50] introduced a deep learning neural network (DLNN) model in the assessment of landslides in Kon Tum Province, Vietnam, and compared its prediction performance with other classical ML methods. Foong and Moayedi [51] used two novel optimization techniques (i.e., the vortex search algorithm and equilibrium optimization) to fine-tune the neural network model used to determine the FoS of a single-layer slope. Table A1 in Appendix A summarizes typical published neural network-based methods. Until now, the vast majority of existing studies on slope stability prediction adopted a single ML technology, and more complex algorithms are still needed to improve slope stability prediction [29]. Despite many efforts, there is still a debate about which model can achieve the most stable landslide displacement prediction. Moreover, most ML models verify their superiority with only a few cases, and this strategy may lead to unreliable conclusions [52]. With the increasing requirement for FoS prediction accuracy, there is a trend to develop more reliable ML models.

In this paper, the principal component analysis (PCA) method is utilized to map high-dimensional raw data to a low-dimensional space through matrix transformation. Then, combining the advantages of the particle swarm optimization (PSO) algorithm and the ANN model, a hybrid PANN model is developed to predict FoS. Evaluation indicators are introduced to assess the generalization performance of the model, and the application of the PCA-PANN model is further discussed. The outline of this study is as follows: Section 2 briefly describes the importance of the study. Section 3 describes the selection of the influencing variables and the statistical description of the collected dataset. Section 4 presents the details of the developed model. Section 5 demonstrates the model effects with practical applications. Section 6 provides slope prevention methods as well as limitations of the current study. Section 7 summarizes the conclusions of this study.

2. Research Significance

The occurrence of slope instability is a long-term, nonlinear, and complex process [53]. Due to the complexity of slope structure and the discontinuity of mechanical properties of geotechnical materials, as well as the variability of control factors acting on the slope, slope stability is comprehensively affected by geological and engineering factors [54]. As an indispensable key mechanical parameter in slope engineering design and stability analysis, FoS was incorporated into various standards and engineering codes, and its research significance cannot be ignored. FoS evaluation involves many variables and requires the calculation of slope geometry data, geological material parameters, and pore water pressure. These uncertain factors have different influence weights on slope stability, and there is a complex nonlinear relationship between them [55].

The potential economic benefits to be gained from solving FoS by the ML model are huge (e.g., allowing steeper slopes, better support system design, etc.). In practice, it is usually necessary to estimate FoS at the preliminary stage of an engineering project. When sufficient historical cases are available, a machine learning approach provides an alternative model for slope stability assessment by establishing the input–output relationship between the FoS and related parameters [55]. To this end, this paper proposes an intelligent FoS prediction method combined with the PCA method, which uses ML methods to train and develop a new hybrid ML model for FoS prediction. The input–output relationship between the variables and FoS is identified, and potential explicit or implicit relationship functions are determined to predict FoS for a given set of input variables. The new method can process the data without depending on the measurement scale, and the arrangement of the data is not affected [56]. In this model, the PSO algorithm is used to improve the predictive performance and generalization capability of the ANN model. To the best of the authors’ knowledge, this is a preliminary exploration in the field of slope stability research using the PCA-PANN model. Compared with the traditional LEM and qualitative evaluation methods, the proposed model can find and deal with the implicit nonlinear relationship between variables in depth. Therefore, it is a promising method for slope stability prediction.

3. Dataset Description and Processing

3.1. Dataset Preparation

Geological materials such as soil, rock, and sand are exposed to a complex environment of strain rates, water, and temperature changes over a long period of time, and their complex pore structures result in extremely discrete mechanical properties [57,58,59,60,61]. Due to the limitations of openly available geological data, it is not possible to comprehensively consider all potential variables. This section compiles a dataset encompassing 307 sets of data based on available measurements reported in the literature [62,63,64,65,66,67] for training and developing ML models to predict FoS. Referring to the practice of ref. [68], the six most representative variables were selected to evaluate FoS. These six key variables are friction angle (φ), cohesion (c), unit weight (γ), slope height (H), slope angle (Φ), and pore pressure ratio (r_u). Here, φ, c, and γ reflect the mechanical properties of the rock and soil mass, r_u reflects the magnitude of the pore water pressure, and H and Φ reflect the geometric characteristics of the slope.

The specific definitions of these six key variables are as follows:

H indicates the vertical distance between the slope crest and the slope base.
Φ represents the angle between the inclined slope plane and slope base.
r_u denotes the ratio of pore water pressure to overburden pressure and represents the external triggering factor of landslide.
γ indicates the weight per unit volume of geomaterials.
φ and c represent the geomaterial’s ability to withstand shear stress.

Descriptive statistics are an important part of data analysis and provide the basis for subsequent ML modeling. Table 1 provides univariate descriptive statistics of the data to summarize the data in an organized manner. The input variables collected from different slopes differ to some extent. Visual analysis can reflect the characteristic parameter information in the dataset and can qualitatively analyze the slope state to a certain extent. In order to evaluate whether there are outliers in the dataset, we plotted a violin plot of each variable in the dataset, as shown in Figure 2. In general, violin plots combine box plots and kernel density plots to effectively depict the data distribution. The shape of the violin indicates how the data are distributed along the vertical axis, and the interior of the violin contains a box plot. Figure 2 shows that most of the cohesion values are distributed in the range of 10–36 kPa; the slope angle is mainly 25° to 45°. The slope height is mainly 40.3 m. A cursory inspection of Figure 2 indicates a relatively uniform distribution of the input variables in this dataset. There are no outliers or anomalous data in the violin plot, indicating that the collected dataset is constructed reasonably.

To study the correlation between different input variables, Figure 3 analyzes the correlation between the six key input variables of the dataset and the output results FoS. It can be seen that the correlation coefficient of two identical variables is represented diagonally from bottom left to top right as the symbol of the variable (representing a correlation coefficient of 1). The correlation coefficient of the part above the diagonal is the same as that of the other part of the symmetry. The correlation coefficients between the different input variables and FoS are relatively low (most values are less than 0.4). That is to say, it is not easy to find an obvious relationship between the six input variables and FoS, which indicates that there is a strong nonlinear relationship between FoS and input variables.

In the subsequent ML modeling, the whole dataset is randomly divided into two independent subsets: a training set and test set. The training set is utilized to build and train the ML model, and then the test set is utilized to validate the predictive performance and generalization capability of the proposed model [69]. Through optimization analysis, the optimal percentages of the training set and the test set in the whole dataset are determined to obtain two sufficiently representative subsets. In this paper, 80% of the whole dataset is included in the training set (i.e., 245 cases) and the remaining 20% in the test set (i.e., 62 cases).
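The 80/20 split described above can be sketched as follows. This is a minimal illustration, not the authors' code: the function name `split_dataset`, the fixed seed, and the use of index lists are assumptions made for the example. Rounding the training size down reproduces the 245/62 partition of the 307 cases.

```python
import random

def split_dataset(n_cases, train_fraction=0.8, seed=0):
    """Randomly split case indices into training and test index lists."""
    indices = list(range(n_cases))
    # A seeded generator keeps the (hypothetical) split reproducible.
    random.Random(seed).shuffle(indices)
    n_train = int(n_cases * train_fraction)  # rounds down: 307 * 0.8 -> 245
    return indices[:n_train], indices[n_train:]

train_idx, test_idx = split_dataset(307)
print(len(train_idx), len(test_idx))
```

The two index lists are disjoint and together cover every case exactly once, so no slope case appears in both subsets.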

3.2. Cross-Validation and Performance Measures

In order to minimize the bias introduced by randomly splitting the training and test sets in Section 3.1, the K-fold cross-validation (K-CV) method is utilized. K-CV is a statistical method that splits a dataset into smaller subsets and thereby reduces the bias caused by sampling randomness. The original training set is divided into K subsets of approximately equal size; each subset is utilized in turn as a new test set, and the remaining K − 1 subsets are utilized as the new training set. The accuracy of the ML model on each held-out subset is used to assess the effectiveness of the proposed method. As suggested by Kuhn and Johnson [70], 10-fold or 5-fold cross-validation is recommended. In this study, 10-fold cross-validation is used according to the scale of the dataset and the computational time. The process of 10-fold CV is shown in Figure 4, where P₁, P₂, …, and P₁₀ represent the prediction results of the corresponding folds, respectively.
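The fold construction can be sketched as below. This is an illustrative sketch under stated assumptions: the helper name `k_fold_indices` is hypothetical, the indices are assumed to have been shuffled beforehand, and, because 245 training cases cannot be divided into 10 exactly equal parts, the fold sizes are allowed to differ by one.

```python
def k_fold_indices(n_samples, k=10):
    """Yield (train_indices, validation_indices) for each of k folds."""
    indices = list(range(n_samples))
    # Spread the remainder over the first folds so sizes differ by at most one.
    base, extra = divmod(n_samples, k)
    start = 0
    for fold in range(k):
        size = base + (1 if fold < extra else 0)
        validation = indices[start:start + size]
        train = indices[:start] + indices[start + size:]
        yield train, validation
        start += size

folds = list(k_fold_indices(245, k=10))
for train, validation in folds:
    # Each fold holds out one subset and trains on the remaining K - 1 subsets.
    print(len(train), len(validation))
```

Every case is held out exactly once across the ten folds, which is what allows the fold-wise predictions P₁ to P₁₀ to be combined into a single performance estimate.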

In this section, the coefficient of determination (R²), mean absolute error (MAE), and root mean squared error (RMSE) are utilized to evaluate the predictive performance and generalization capability of ML models. The basic equations of these evaluation indicators are shown in Equations (1)–(3) [71,72,73]. Comparing predicted and measured values through these indicators is a well-established way to check the predictive performance and generalization capability of predictive models [74]. Theoretically, the ML model works best when its evaluation indicators are R² = 1, MAE = 0, and RMSE = 0 [75,76].

R^{2} = 1 - \frac{\sum_{i=1}^{N} (\mathrm{FoS}_{i}^{\mathrm{mea}} - \mathrm{FoS}_{i}^{\mathrm{pre}})^{2}}{\sum_{i=1}^{N} (\mathrm{FoS}_{i}^{\mathrm{mea}} - E[\mathrm{FoS}^{\mathrm{mea}}])^{2}}

(1)

\mathrm{MAE} = \frac{1}{N} \sum_{i=1}^{N} |\mathrm{FoS}_{i}^{\mathrm{mea}} - \mathrm{FoS}_{i}^{\mathrm{pre}}|

(2)

\mathrm{RMSE} = \sqrt{\frac{1}{N} \sum_{i=1}^{N} (\mathrm{FoS}_{i}^{\mathrm{mea}} - \mathrm{FoS}_{i}^{\mathrm{pre}})^{2}}

(3)

where FoS_i^pre and FoS_i^mea are the predicted and measured results for sample i, respectively; E[FoS^mea] represents the average of the measured values; i = 1, 2, …, N; and N is the number of samples.
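Equations (1)–(3) translate directly into code. The sketch below is illustrative only: the function names and the four FoS values are hypothetical and are not taken from the dataset.

```python
import math

def r_squared(measured, predicted):
    """Equation (1): 1 - (residual sum of squares / total sum of squares)."""
    mean_measured = sum(measured) / len(measured)
    ss_res = sum((m - p) ** 2 for m, p in zip(measured, predicted))
    ss_tot = sum((m - mean_measured) ** 2 for m in measured)
    return 1.0 - ss_res / ss_tot

def mae(measured, predicted):
    """Equation (2): mean absolute error."""
    return sum(abs(m - p) for m, p in zip(measured, predicted)) / len(measured)

def rmse(measured, predicted):
    """Equation (3): root mean squared error."""
    return math.sqrt(sum((m - p) ** 2 for m, p in zip(measured, predicted)) / len(measured))

# Hypothetical measured and predicted FoS values, for illustration only.
fos_measured = [1.0, 1.2, 1.5, 0.9]
fos_predicted = [1.1, 1.1, 1.4, 1.0]

print(r_squared(fos_measured, fos_predicted))
print(mae(fos_measured, fos_predicted))
print(rmse(fos_measured, fos_predicted))
```

A perfect prediction gives R² = 1, MAE = 0, and RMSE = 0, which matches the ideal values stated above.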

4. Methodology

4.1. Principal Components Analysis

Slope instability shows typical fluctuation and multi-scale characteristics. Due to the influence of random factors such as noise and interference, it is difficult to fully extract useful information by directly modeling the original data, and the resulting prediction accuracy is low. More importantly, there is a significant correlation between various types of data, which tends to feed redundant sample information into the model, increases the complexity of model training, and reduces the generalization performance. The principal component analysis (PCA) method is a classical data dimensionality reduction and denoising method, which maps high-dimensional data into a few principal components in low-dimensional space by matrix transformation. The original data are compressed to facilitate calculation and mitigate overfitting, and these principal components contain most of the useful information in the original data [77].

The application of the PCA method in the slope field was proved to be feasible and effective [78,79]. PCA is used to eliminate the correlation between the six key variables in the dataset established in Section 3.1, and the new dataset is established. The steps are as follows:

The original dataset X_{m×n} can be represented as a matrix with m rows and n columns, where m is the number of samples and n is the number of variables:

X_{m \times n} = \begin{pmatrix} x_{11} & \dots & x_{1n} \\ \vdots & \ddots & \vdots \\ x_{m1} & \dots & x_{mn} \end{pmatrix} .

(4)

The differences in scale between the input variables may be large enough to affect the prediction results. Therefore, to better use the PCA method, X_{m×n} was standardized and transformed into a standardized matrix X* according to Equation (5).

x_{i,j}^{*} = \frac{x_{i,j} - μ_{j}}{\sqrt{σ_{j}}}, \quad i = 1, 2, \dots, m; \; j = 1, 2, \dots, n

(5)

where x_{i,j}* denotes the standardized value; μ_j and σ_j denote the mean and variance of the jth variable (column) of the original data, respectively.

The correlation matrix R is calculated from X* as shown in Equation (6). Since X* has m rows and n columns, R is an n × n matrix.

R_{n \times n} = {X^{*}}^{T} X^{*}

(6)

The eigenvalues λ_i of R are obtained by solving the characteristic equation |λI − R| = 0, where I is the identity matrix; each eigenvalue corresponds to one principal component. Finally, the contribution rate η_i of each principal component is obtained as follows:

η_{i} = \frac{λ_{i}}{\sum_{k=1}^{n} λ_{k}} .

(7)
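The steps of Equations (4)–(7) can be sketched with NumPy as follows. This is a minimal sketch rather than the authors' implementation: the function name `pca_contribution` and the random demonstration matrix are hypothetical, and the product in Equation (6) is additionally scaled by 1/(m − 1) (an assumption) so that R is exactly the correlation matrix. That scaling does not change the contribution rates.

```python
import numpy as np

def pca_contribution(X):
    """Return eigenvalues (descending), eigenvectors, and contribution rates."""
    X = np.asarray(X, dtype=float)
    m = X.shape[0]
    # Equation (5): standardize each column (variable) to zero mean and unit variance.
    X_std = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
    # Equation (6), scaled by 1 / (m - 1) so that R is the n x n correlation matrix.
    R = X_std.T @ X_std / (m - 1)
    # eigh is appropriate because R is symmetric; it returns ascending eigenvalues.
    eigvals, eigvecs = np.linalg.eigh(R)
    order = np.argsort(eigvals)[::-1]
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    # Equation (7): contribution rate of each principal component.
    eta = eigvals / eigvals.sum()
    return eigvals, eigvecs, eta

# Hypothetical data: 50 samples of 6 variables, for illustration only.
rng = np.random.default_rng(0)
X_demo = rng.normal(size=(50, 6))
eigvals, eigvecs, eta = pca_contribution(X_demo)
print(np.round(eta, 3), np.round(np.cumsum(eta), 3))
```

Because the diagonal of a correlation matrix is all ones, the eigenvalues sum to the number of variables n, which is a convenient sanity check on the standardization step.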

4.2. Artificial Neural Network

Based on the structure and function of biological neurons, ANN is a mathematical approach that simulates the reasoning operation of the human brain and can model complex mechanisms well [80]. For multi-dimensional data with low correlation, ANN is a strong alternative to traditional methods because of its excellent function approximation and feature selection ability [81]. The basic unit of ANN is the neuron, which mimics a biological neuron that receives input features from sensory organs and passes them to the brain. Neural networks have three types of layers. The first is the Input layer, through which all the input data and parameters are fed into the neural network. The second is the Hidden layer, which processes the input data using activation functions. The processed data are finally output at the Output layer.

The ANN model framework constructed in this section is shown in Figure 5. The model consists of neurons connected to construct a multi-layer neuronal network. Input variables such as friction angle (φ), cohesion (c), and unit weight (γ) are transmitted and trained through neuronal connections and activation functions. The optimal number of neurons in the Hidden layer is constantly adjusted during training to minimize the error between the predicted FoS value and the true FoS value. Training means that the ANN model needs to learn the weights associated with all neurons. As shown in Equation (8), the weighted sum of all inputs is passed through the nonlinear activation function f.

Y = f (b + \sum_{i = 1}^{n} x_{i} ω_{i})

(8)

where Y is the output of the node; x_i are its inputs; b and ω_i denote the bias and the respective weights; and n is the number of inputs for the node.
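Equation (8) for a single node can be sketched as follows. The function name `neuron_output` and the choice of tanh as the activation function f are assumptions made for illustration; the source does not state which activation function is used.

```python
import math

def neuron_output(inputs, weights, bias, activation=math.tanh):
    """Equation (8): Y = f(b + sum_i x_i * w_i) for one node."""
    weighted_sum = bias + sum(x * w for x, w in zip(inputs, weights))
    return activation(weighted_sum)

# Two hypothetical inputs whose weighted contributions cancel out.
print(neuron_output([1.0, 2.0], [0.5, -0.25], bias=0.0))
```

Passing a different callable as `activation` (for example a sigmoid) changes only the nonlinearity, not the weighted-sum structure of the node.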

4.3. PANN

Slow learning speed and a tendency to fall into local minima are inherent disadvantages of ANN. This section attempts to improve the performance of ANN by the particle swarm optimization (PSO) algorithm. As a heuristic algorithm, the PSO algorithm, which has a powerful global search ability, is a kind of evolutionary computing technology. Particles are the basic unit of the PSO algorithm, and each particle in the population represents a candidate solution to the optimization problem.

The basic principle of PSO is to simulate the flight of birds by using a swarm of particles with only velocity and position attributes. The position of the food represents the optimal solution to the problem, and the distance between the particle and the optimal solution represents the objective function value of the current particle. Each particle finds the optimal position separately, and the optimal position sought by each particle is the individual extremum. Then, each particle shares the individual extreme value with other particles, and finds the best individual extreme value by comparison, and the best individual extreme value is the best value of the group. Finally, all particles are updated according to the current individual optimal position and group optimal position, and the above steps are repeated until the optimal solution is found.

The specific implementation process of PSO is as follows:

Suppose that there are N particles in the D-dimensional space, and X and V are used to represent the position and velocity of the particles. The position of the ith particle is:

X_{i D} = (x_{i 1}, x_{i 2}, \dots, x_{i D}) .

(9)

The velocity of the ith particle is given in Equation (10):

V_{i D} = (v_{i 1}, v_{i 2}, \dots, v_{i D}) .

(10)

In every iteration, each particle moves in search of the best positions (i.e., P_best and G_best). The velocity and position of all particles are updated as follows:

V_{iD}^{new} = V_{iD} + C_{1} R_{1} (P_{best} - X_{iD}) + C_{2} R_{2} (G_{best} - X_{iD})

(11)

X_{iD}^{new} = X_{iD} + V_{iD}^{new}

(12)

where V_{iD}^{new} and X_{iD}^{new} are the updated velocity and position of the particle; C₁ and C₂ are two user-defined positive acceleration constants; and R₁ and R₂ are random numbers in the range (0,1).

In order to deepen the understanding of the PSO algorithm, a geometrical illustration of particles moving in two-dimensional space is shown in Figure 6. P_best denotes the local best position, and G_best denotes the global best position. Further details about the development and operation of PSO can be found in Ref. [82].
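One update of a single particle, in the standard PSO form V + C₁R₁(P_best − X) + C₂R₂(G_best − X) followed by X + V_new, can be sketched as below. The function name `pso_step`, the default values C₁ = C₂ = 2.0, and the drawing of fresh random numbers for every dimension are assumptions for illustration.

```python
import random

def pso_step(position, velocity, p_best, g_best, c1=2.0, c2=2.0, rng=random):
    """Apply the velocity update (11) and the position update (12) to one particle."""
    new_position, new_velocity = [], []
    for x, v, pb, gb in zip(position, velocity, p_best, g_best):
        # random() draws from [0, 1); fresh values per dimension are an assumption.
        r1, r2 = rng.random(), rng.random()
        v_new = v + c1 * r1 * (pb - x) + c2 * r2 * (gb - x)  # Equation (11)
        new_velocity.append(v_new)
        new_position.append(x + v_new)                       # Equation (12)
    return new_position, new_velocity

# A particle already sitting on both best positions keeps its current velocity.
print(pso_step([1.0, 2.0], [0.5, -0.5], [1.0, 2.0], [1.0, 2.0]))
```

The two attraction terms vanish when the particle coincides with P_best and G_best, so the particle's motion is driven entirely by how far it is from the best positions found so far.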

The PSO is utilized to search the hyperparameters of the ANN, such as the optimal number of neurons, learning rate, and number of iterations. The connection weights and thresholds of the ANN are mapped to a swarm of particles. The dimension of each particle equals the number of connection weights in the network plus the number of network thresholds. Using the PSO algorithm to train a neural network can give full play to its global optimization ability and local rapid convergence advantages. The particles move and search in the weight and threshold space to minimize the error of the Output layer of the neural network. Finally, a hybrid PANN model is developed by using the advantages of PSO and ANN. The good global search ability of PSO improves the generalization ability and learning performance of ANN. The hybrid PANN model incorporates the advantages of PSO and ANN; specifically, PSO finds the global minimum in the search space, while ANN finds the best result using the determined global minimum. In addition, PCA is performed on the original data to extract new variables that meet the requirements of principal components. With the new variables as the new inputs of the PANN model, the flowchart of the final PCA-PANN model is shown in Figure 7.
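The mapping of weights and thresholds to particles can be sketched as follows. This is a toy illustration under several assumptions, not the authors' PANN: the one-hidden-layer network with tanh activation, the swarm size, the constants C₁ = C₂ = 1.5, the inertia weight `w` (a common stabilizing addition that is not part of the update rule given above), and the synthetic data are all hypothetical. Each particle is a flat vector whose length is the number of weights plus the number of thresholds.

```python
import numpy as np

def unpack(particle, n_in, n_hidden):
    """Split a flat particle into the weights and thresholds of a 1-hidden-layer net."""
    i = n_in * n_hidden
    W1 = particle[:i].reshape(n_in, n_hidden)
    b1 = particle[i:i + n_hidden]
    W2 = particle[i + n_hidden:i + 2 * n_hidden]
    b2 = particle[i + 2 * n_hidden]
    return W1, b1, W2, b2

def mse(particle, X, y, n_hidden):
    """Output-layer error of the network encoded by one particle."""
    W1, b1, W2, b2 = unpack(particle, X.shape[1], n_hidden)
    pred = np.tanh(X @ W1 + b1) @ W2 + b2
    return float(np.mean((pred - y) ** 2))

def train_pann(X, y, n_hidden=4, n_particles=20, n_iter=100,
               c1=1.5, c2=1.5, w=0.7, seed=0):
    rng = np.random.default_rng(seed)
    dim = X.shape[1] * n_hidden + 2 * n_hidden + 1  # weights + thresholds
    pos = rng.uniform(-1.0, 1.0, (n_particles, dim))
    vel = np.zeros_like(pos)
    p_best = pos.copy()
    p_cost = np.array([mse(p, X, y, n_hidden) for p in pos])
    g_best = p_best[p_cost.argmin()].copy()
    history = [float(p_cost.min())]
    for _ in range(n_iter):
        r1, r2 = rng.random(pos.shape), rng.random(pos.shape)
        # w is an assumed inertia weight; the other terms follow the PSO update.
        vel = w * vel + c1 * r1 * (p_best - pos) + c2 * r2 * (g_best - pos)
        pos = pos + vel
        cost = np.array([mse(p, X, y, n_hidden) for p in pos])
        improved = cost < p_cost
        p_best[improved], p_cost[improved] = pos[improved], cost[improved]
        g_best = p_best[p_cost.argmin()].copy()
        history.append(float(p_cost.min()))
    return g_best, history

# Synthetic regression data, for illustration only.
rng = np.random.default_rng(1)
X_demo = rng.uniform(-1.0, 1.0, (60, 4))
y_demo = np.sin(X_demo.sum(axis=1))
g_best, history = train_pann(X_demo, y_demo)
print(history[0], history[-1])
```

Because the best positions are only replaced when a strictly lower error is found, the recorded best error never increases from one iteration to the next.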

5. Results and Applications

5.1. Feature Extraction by PCA

Principal component analysis (PCA) was performed on the data in the dataset as described in Section 4.1. The data standardized according to Equation (5) were analyzed, and the eigenvalues and contribution rate of the six principal components were finally obtained in Table 2. It can be seen that the cumulative contribution rate of the first four principal components is 85.036%, which meets the condition that the cumulative contribution rate of the principal component variance accounts for more than 80% of the total variance and can fully reflect the main characteristics of the sample. Therefore, the first four principal components (numbered F₁, F₂, F₃, and F₄, respectively) are selected to replace the original variables for analysis.
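The selection rule used here, keeping the smallest number of components whose cumulative contribution rate reaches 80%, can be sketched as follows. The function name and the eigenvalues are hypothetical and chosen only so that the arithmetic is easy to follow; they are not the values of Table 2.

```python
def n_components_for_threshold(eigenvalues, threshold=0.80):
    """Return (k, cumulative rate) for the first k components reaching the threshold."""
    total = sum(eigenvalues)
    cumulative = 0.0
    for k, lam in enumerate(sorted(eigenvalues, reverse=True), start=1):
        cumulative += lam / total
        if cumulative >= threshold:
            return k, cumulative
    return len(eigenvalues), cumulative

# Hypothetical eigenvalues of a 6-variable correlation matrix (they sum to 6).
demo_eigenvalues = [2.4, 1.2, 0.9, 0.6, 0.5, 0.4]
k, rate = n_components_for_threshold(demo_eigenvalues)
print(k, round(rate, 3))
```

With these illustrative values the cumulative rates are 0.40, 0.60, 0.75, and 0.85, so four components are retained, mirroring the structure of the decision made for the real dataset.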

According to the factor score coefficient matrix (see Table 3), the expressions of each principal component can be obtained as shown in Equation (13):

\left\{ \begin{array}{l} F_{1} = 0.796 γ^{*} + 0.63 c^{*} + 0.67 φ^{*} + 0.653 Φ^{*} + 0.827 H^{*} - 0.034 r_{u}^{*} \\ F_{2} = -0.014 γ^{*} - 0.413 c^{*} + 0.287 φ^{*} + 0.187 Φ^{*} - 0.015 H^{*} + 0.895 r_{u}^{*} \\ F_{3} = 0.333 γ^{*} + 0.38 c^{*} - 0.335 φ^{*} - 0.535 Φ^{*} + 0.101 H^{*} + 0.401 r_{u}^{*} \\ F_{4} = -0.215 γ^{*} + 0.328 c^{*} + 0.539 φ^{*} - 0.268 Φ^{*} - 0.266 H^{*} + 0.027 r_{u}^{*} \end{array} \right.

(13)

where the superscript * represents the physical quantity normalized by Equation (5).
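Equation (13) is a linear map from the six standardized variables to the four principal components, which can be sketched as a matrix-vector product. The coefficients below are copied from Equation (13); the function name `principal_components` and the standardized input vector are hypothetical.

```python
# Rows: F1..F4; columns: gamma*, c*, phi*, Phi*, H*, r_u* (from Equation (13)).
COEFFS = [
    [0.796, 0.630, 0.670, 0.653, 0.827, -0.034],
    [-0.014, -0.413, 0.287, 0.187, -0.015, 0.895],
    [0.333, 0.380, -0.335, -0.535, 0.101, 0.401],
    [-0.215, 0.328, 0.539, -0.268, -0.266, 0.027],
]

def principal_components(standardized):
    """Map six standardized inputs to the scores F1..F4."""
    if len(standardized) != 6:
        raise ValueError("expected gamma*, c*, phi*, Phi*, H*, r_u*")
    return [sum(a * x for a, x in zip(row, standardized)) for row in COEFFS]

# Hypothetical standardized slope case: one standard deviation above the mean
# in unit weight, and exactly at the mean for every other variable.
scores = principal_components([1.0, 0.0, 0.0, 0.0, 0.0, 0.0])
print(scores)
```

The resulting four scores, rather than the six original variables, are what the PANN model receives as inputs.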

5.2. Model Performance

For comparison purposes, the performance of the PANN model without PCA processing is also shown. It is worth noting that data preprocessing is still required before applying the PANN model in order to improve the prediction accuracy. The prediction results of the two models (PCA-PANN model and PANN model) are shown in Figure 8, in which the horizontal coordinate represents the data number. Figure 8 shows that the FoS error generated by the PCA-PANN model is close to zero for most data points, with an R² value of 0.988. This error is negligible when predicting FoS, which indicates that the PCA-PANN model has high prediction accuracy. To better support these results, Table 4 lists the prediction results of the two models. RMSE is the square root of the mean squared difference between predicted and measured values and therefore penalizes large errors more heavily. MAE is the average absolute error between predicted and measured values. R² is a statistical metric that assesses how closely the N pairs of measured and predicted values agree. A higher R² value and lower error values (RMSE and MAE) indicate that the prediction model reproduces the measured values better. The prediction results of the PCA-PANN model proposed in this paper are the closest to the measured values, and its prediction errors are the smallest (RMSE = 0.13, R² = 0.971, and MAE = 0.125). Combining the PANN model with the PCA method enables the proposed PCA-PANN model to efficiently explore the most appropriate computational parameters using the principal component information, thus improving the accuracy of FoS prediction.

Overall, the proposed model is effective in addressing nonlinear FoS caused by multiple variables, and this trained ML model can provide a reference for experienced team members. Admittedly, in terms of current ML research results, ML models cannot completely replace traditional methods to estimate FoS. However, if possible, using ML methods to accurately estimate the FoS of slopes with known design parameters based on datasets from previous studies will greatly reduce the test time and cost.

5.3. Case Study

In order to verify the prediction performance of the proposed PCA-PANN model in practical application, this model is applied to a slope project in Heihe City, Heilongjiang Province, China. The stability of this slope directly affects the stability of oil and gas transportation. The AA001–AA004 piles of the China–Russia East Line Natural Gas Pipeline are the control works of the transit section. The positions of AA001–AA004 piles and the monitored slope are shown in Figure 9a. The slope angle is about 40°, the horizontal distance of the slope is 477 m from the total length, and the slope height is 120 m. According to the field hydrogeological data (see Table 5 for the specific characteristics), the stability of the slope was evaluated by the model established by the numerical simulation method in Ref. [83] as shown in Figure 9b, and the FoS of the slope was calculated.

Table 6 gives the FoS of the slope calculated by the ANN model established in Ref. [83] and by the proposed PCA-PANN model. It can be seen that, compared with the model of Ref. [83], the proposed model is closer to the numerical results. Interestingly, with the increase in moisture content, the FoS predicted by this model first increases and then decreases, which is consistent with the trend of slope stability obtained by the numerical simulation method.

These results support the feasibility and effectiveness of the proposed model. In other words, machine learning offers a promising alternative for addressing the challenge of slope stability modeling. It is worth noting that this method cannot completely replace numerical analysis methods at present. Last but most important, all methods should corroborate each other and jointly provide a reference for solving slope stability problems.

6. Discussion

6.1. Slope Safety Protection

In slope stability evaluation, feature importance analysis reflects the relative influence of each factor on slope stability and thus provides an effective reference for formulating slope safety protection measures. Figure 10 shows the results of the sensitivity analysis. The importance scores are 0.28 for slope angle (Φ), 0.16 for cohesion (c), 0.13 for pore pressure ratio (r_u), 0.12 for slope height (H), 0.03 for unit weight (γ), and 0.011 for friction angle (φ). It is important to note that the importance scores and ranking of the input variables will differ between datasets and between feature importance measures. The ranking obtained in this paper therefore applies only to the dataset in Section 3 and cannot be considered a general rule. With more valid slope cases and more comprehensive feature consideration in the future, more representative results can be obtained.
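As a small illustration of how such scores are read, the sketch below ranks the importance scores quoted above and expresses each as a share of their sum. The normalization step is an assumption added here for readability (the reported scores sum to 0.731, not 1), and the variable names are illustrative; the paper's own scoring method is not reproduced.

```python
# Importance scores as reported for Figure 10.
importance = {
    "slope angle": 0.28,
    "cohesion": 0.16,
    "pore pressure ratio": 0.13,
    "slope height": 0.12,
    "unit weight": 0.03,
    "friction angle": 0.011,
}

# Rank the variables from most to least influential.
ranking = sorted(importance.items(), key=lambda item: item[1], reverse=True)

# Assumed post-processing: rescale the scores so that they sum to 1.
total = sum(importance.values())
shares = {name: score / total for name, score in importance.items()}

for name, score in ranking:
    print(f"{name:20s} score={score:.3f} share={shares[name]:.1%}")
```

The resulting order matches the ranking stated in the text, with slope angle first and friction angle last.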

Several relatively important characteristics (slope angle (Φ), cohesion (c), pore pressure ratio (r_u), and slope height (H)) are selected and analyzed in combination with slope engineering practice. Slope protection is suggested from the following three aspects:

(1): Reduce the slope height and slope angle: for high and steep slopes, the slope height and slope angle can be reduced by excavating the slope crest and cutting back the slope to reduce the load.
(2): Reduce unit weight and pore water pressure: drainage ditches can be set up for surface water, and blind ditches, drainage holes, and collection wells can be installed for groundwater.
(3): Increase shear strength: vegetation cover and vertical root systems can be used to prevent slope erosion, reduce pore water pressure, and improve shear strength. Specific methods include planting trees, planting grass, and laying sod.

6.2. Limitations

The integrated ML approach established in this study, which combines the PANN model with the PCA method, is promising for classification and regression problems and has great potential for wider use in slope stability prediction. However, several shortcomings remain. First, as with any machine learning approach, the predictive performance and reliability of the PCA-PANN model depend strongly on the quantity and quality of the supporting data, and the scale of datasets created from field or experimental studies is limited [84,85,86]. At present, the FoS dataset established in Section 3.1 is still limited and cannot cover all slope types, so it needs to be enriched further to make the FoS predictions more reliable. Second, the generalization of the model deserves further exploration and improvement; for example, testing different swarm sizes and numbers of iterations would make the PCA-PANN results more convincing. Third, other external factors, such as earthquakes, rainfall, reservoir impoundment, and human activities, can also have a significant impact on the FoS. Because such data are difficult to collect, these factors are not considered in this paper; with the increasing demand for FoS prediction accuracy in slope engineering practice, incorporating more potential influencing factors is a clear trend. Finally, this study uses the PCA method to reduce the dimensionality of the input data and eliminate correlations between variables. In the future, we will try to combine other dimensionality reduction techniques, such as manifold learning, with ML methods to predict slope stability. All of these will be the subject of future work.
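The remark on swarm size and iteration count can be made concrete with a toy experiment. The following is a minimal sketch, not the authors' implementation: a basic global-best PSO written in plain Python minimizes a stand-in sphere function instead of the ANN training loss. The names `pso_minimize` and `sphere`, the inertia weight 0.7, the acceleration coefficients 1.5, the search bounds, and the tested grids are all assumptions chosen for illustration.

```python
import random


def pso_minimize(objective, bounds, swarm_size=20, iterations=50,
                 inertia=0.7, c1=1.5, c2=1.5, seed=0):
    """Basic global-best PSO; returns (best_position, best_value)."""
    rng = random.Random(seed)
    dim = len(bounds)
    # Random initial positions inside the bounds, zero initial velocities.
    positions = [[rng.uniform(lo, hi) for lo, hi in bounds]
                 for _ in range(swarm_size)]
    velocities = [[0.0] * dim for _ in range(swarm_size)]
    pbest_pos = [p[:] for p in positions]
    pbest_val = [objective(p) for p in positions]
    g = min(range(swarm_size), key=pbest_val.__getitem__)
    gbest_pos, gbest_val = pbest_pos[g][:], pbest_val[g]

    for _ in range(iterations):
        for i in range(swarm_size):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                # Velocity update: inertia + cognitive pull + social pull.
                velocities[i][d] = (
                    inertia * velocities[i][d]
                    + c1 * r1 * (pbest_pos[i][d] - positions[i][d])
                    + c2 * r2 * (gbest_pos[d] - positions[i][d])
                )
                lo, hi = bounds[d]
                # Position update, clamped to the search bounds.
                positions[i][d] = min(max(positions[i][d] + velocities[i][d], lo), hi)
            value = objective(positions[i])
            if value < pbest_val[i]:
                pbest_pos[i], pbest_val[i] = positions[i][:], value
                if value < gbest_val:
                    gbest_pos, gbest_val = positions[i][:], value
    return gbest_pos, gbest_val


def sphere(x):
    """Toy objective standing in for the network training error."""
    return sum(v * v for v in x)


# Hypothetical grid of swarm sizes and iteration counts.
results = {}
for swarm_size in (5, 20, 40):
    for iterations in (10, 50, 100):
        _, best = pso_minimize(sphere, [(-5.0, 5.0)] * 3, swarm_size, iterations)
        results[(swarm_size, iterations)] = best

for key in sorted(results):
    print(key, f"{results[key]:.3e}")
```

Because the seed is fixed, runs with the same swarm size follow the same trajectory, so the best value can only stay the same or improve as the iteration count grows. In a real study, the objective would be the validation error of the network and several seeds would be compared.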

7. Conclusions

(1): In this paper, the PSO algorithm is introduced to optimize the ANN model. Given the slope geometry and mechanical parameters as inputs, the hybrid machine learning model provides fast and satisfactory FoS predictions.
(2): Four principal components of the slope-influencing variables are extracted to eliminate the correlation between these variables. Together, these principal components carry 85.036% of the information in the original variables. The number of principal components is smaller than the number of original variables, and the performance of the proposed PCA-PANN model is significantly improved by using them. The statistical metrics (R² = 0.971 on the test set) also demonstrate the effectiveness of the proposed hybrid machine learning model.
(3): The predictions of the proposed model are compared with the FoS of slope engineering cases in Heihe City. The predicted FoS for each case is very close to the true value, which supports the effectiveness and feasibility of the proposed hybrid machine learning model.

Author Contributions

Conceptualization, D.L.; methodology, Z.J.; software, Y.Z.; validation, B.F. and Z.L.; investigation, H.L.; resources, Z.L.; data curation, D.L. and H.L.; writing—original draft preparation, D.L.; visualization, H.L.; funding acquisition, D.L. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the Jiangxi Province Higher Education Teaching Reform Research Project (JXJG-22-36-4) and the Jiangxi Provincial Department of Education Science and Technology Research Program (GJJ218511). The authors gratefully acknowledge this support.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

Some or all data or models that support the findings of this study are available from the corresponding author upon reasonable request.

Conflicts of Interest

The authors declare no conflicts of interest.

Appendix A

Table A1. Previous work on neural network-based methods.

Reference	Model	Input Variables
Verma et al. [87]	ANN	c, φ, α, pore pressure
Foong and Moayedi [51]	Multi-layer perceptron neural network (MLPNN)	Undrained cohesive strength (Cu), slope angle (β), the surcharge on the footing (w), and the ratio of setback distance (d/D)
Wang et al. [88]	BPNN (Back propagation neural network)	φ, c, γ, H, Φ
Mahmoodzadeh et al. [14]	DNN	φ, c, γ, H, Φ, r_u
Liao et al. [89]	BPNN	φ, c, γ, H, Φ, r_u
Huang et al. [90]	CNN	φ, c, γ, H, Φ, r_u
Sakellariou and Ferentinou [91]	BPNN	φ, c, γ, H, Φ, r_u
Zhang et al. [92]	ANN	φ, c, γ, H, Φ, pore pressure
Marrapu et al. [31]	ANN	φ, c, γ, H, Φ, r_u
Choobbasti et al. [93]	ANN	φ, c, γ, H, Φ, r_u
Das et al. [94]	ANN	φ, c, γ, H, Φ, r_u
Suman et al. [33]	Functional networks (FNs)	φ, c, γ, H, Φ, r_u
Lu and Rosenbaum [95]	BPNN	φ, c, γ, H, Φ, r_u
Rukhaiyar et al. [96]	PSO–ANN	φ, c, γ, H, Φ, r_u

References

Kardani, N.; Zhou, A.; Nazem, M.; Shen, S.-L. Improved prediction of slope stability using a hybrid stacking ensemble method based on finite element analysis and field data. J. Rock Mech. Geotech. Eng. 2021, 13, 188–201. [Google Scholar] [CrossRef]
Mroginski, J.L.; Castro, H.G.; Podesta, J.M.; Beneyto, P.A.; Anonis, A.R. A fully coupled particle method for dynamic analysis of saturated soil. Comput. Part. Mech. 2021, 8, 845–857. [Google Scholar] [CrossRef]
Xie, S.; Lin, H.; Chen, Y. New constitutive model based on disturbed state concept for shear deformation of rock joints. Arch. Civ. Mech. Eng. 2022, 23, 26. [Google Scholar] [CrossRef]
Bostanci, H.T.; Alemdag, S.; Gurocak, Z.; Gokceoglu, C. Combination of discontinuity characteristics and GIS for regional assessment of natural rock slopes in a mountainous area (NE Turkey). Catena 2018, 165, 487–502. [Google Scholar] [CrossRef]
Keskin, M.S.; Kezer, S. Stability of MSW landfill slopes reinforced with geogrids. Appl. Sci. 2022, 12, 11866. [Google Scholar] [CrossRef]
Contreras, L.-F.; Brown, E.T. Slope reliability and back analysis of failure with geotechnical parameters estimated using Bayesian inference. J. Rock Mech. Geotech. Eng. 2019, 11, 628–643. [Google Scholar] [CrossRef]
Zhou, C.; Cao, Y.; Gan, L.; Wang, Y.; Motagh, M.; Roessner, S.; Hu, X.; Yin, K. A novel framework for landslide displacement prediction using MT-InSAR and machine learning techniques. Eng. Geol. 2024, 334, 107497. [Google Scholar] [CrossRef]
Baker, R. Sufficient conditions for existence of physically significant solutions in limiting equilibrium slope stability analysis. Int. J. Solids Struct. 2003, 40, 3717–3735. [Google Scholar] [CrossRef]
Huang, F.; Yan, J.; Fan, X.; Yao, C.; Huang, J.; Chen, W.; Hong, H. Uncertainty pattern in landslide susceptibility prediction modelling: Effects of different landslide boundaries and spatial shape expressions. Geosci. Front. 2022, 13, 101317. [Google Scholar] [CrossRef]
Asteris, P.G.; Rizal, F.I.M.; Koopialipoor, M.; Roussis, P.C.; Ferentinou, M.; Armaghani, D.J.; Gordan, B. Slope stability classification under seismic conditions using several tree-based intelligent techniques. Appl. Sci. 2022, 12, 1753. [Google Scholar] [CrossRef]
Chen, Y.; Chen, Y.; Lin, H.; Hu, H. Nonlinear Strength Reduction Method of Rock Mass in Slope Stability Evaluation. Materials 2023, 16, 2793. [Google Scholar] [CrossRef] [PubMed]
Yang, H.-T.; Bai, B.; Lin, H. Seismic magnitude calculation based on rate- and state-dependent friction law. J. Cent. South Univ. 2023, 30, 2671–2685. [Google Scholar] [CrossRef]
Luo, Z.; Bui, X.-N.; Nguyen, H.; Moayedi, H. A novel artificial intelligence technique for analyzing slope stability using PSO-CA model. Eng. Comput. 2021, 37, 533–544. [Google Scholar] [CrossRef]
Mahmoodzadeh, A.; Mohammadi, M.; Farid Hama Ali, H.; Hashim Ibrahim, H.; Nariman Abdulhamid, S.; Nejati, H.R. Prediction of safety factors for slope stability: Comparison of machine learning techniques. Nat. Hazards 2022, 111, 1771–1799. [Google Scholar] [CrossRef]
Terzaghi, K. Mechanism of landslides. In Application of Geology to Engineering Practice; Paige, S., Ed.; Geological Society of America: New York, NY, USA, 1950; pp. 83–123. [Google Scholar] [CrossRef]
Fellenius, W. Erdstatische Berechnungen mit Reibung und Kohäsion (Adhäsion) und unter Annahme kreiszylindrischer Gleitflächen; Ernst & Sohn: Berlin, Germany, 1927. [Google Scholar]
Mafi, R.; Javankhoshdel, S.; Cami, B.; Chenari, R.J.; Gandomi, A.H. Surface altering optimisation in slope stability analysis with non-circular failure for random limit equilibrium method. Georisk Assess. Manag. Risk Eng. Syst. Geohazards 2021, 15, 260–286. [Google Scholar] [CrossRef]
Ji, J.; Zhang, W.J.; Zhang, F.; Gao, Y.F.; Lü, Q. Reliability analysis on permanent displacement of earth slopes using the simplified Bishop method. Comput. Geotech. 2019, 117, 103286. [Google Scholar] [CrossRef]
Bishop, A.W. The use of the slip circle in the stability analysis of slopes. Geotechnique 1955, 5, 7–17. [Google Scholar] [CrossRef]
Li, R.J.; Xu, Q.; Zheng, W.; Lin, H.C. The stability analyses of unsaturated slope based on the Sarma method. Adv. Mater. Res. 2012, 393, 1569–1573. [Google Scholar] [CrossRef]
Janbu, N. Slope Stability Computations. In Embankment-Dam Engineering. Textbook; Hirschfeld, R.C., Poulos, S.J., Eds.; John Wiley & Sons, Incorporated: Hoboken, NJ, USA, 1973; 40p. [Google Scholar]
Zhou, X.P.; Cheng, H. Stability analysis of three-dimensional seismic landslides using the rigorous limit equilibrium method. Eng. Geol. 2014, 174, 87–102. [Google Scholar] [CrossRef]
Bansal, V.; Sarkar, R. Prophetical modeling using limit equilibrium method and novel machine learning ensemble for slope stability gauging in Kalimpong. Iran. J. Sci. Technol. Trans. Civ. Eng. 2024, 48, 411–430. [Google Scholar] [CrossRef]
Aringoli, D.; Calista, M.; Gentili, B.; Pambianchi, G.; Sciarra, N. Geomorphological features and 3D modelling of Montelparo mass movement (Central Italy). Eng. Geol. 2008, 99, 70–84. [Google Scholar] [CrossRef]
Achu, A.; Aju, C.; Di Napoli, M.; Prakash, P.; Gopinath, G.; Shaji, E.; Chandra, V. Machine-learning based landslide susceptibility modelling with emphasis on uncertainty analysis. Geosci. Front. 2023, 14, 101657. [Google Scholar] [CrossRef]
Yang, H.-Q.; Zhang, L. Bayesian back analysis of unsaturated hydraulic parameters for rainfall-induced slope failure: A review. Earth-Sci. Rev. 2024, 251, 104714. [Google Scholar] [CrossRef]
Zhang, W.; Gu, X.; Han, L.; Wu, J.; Xiao, Z.; Liu, M.; Wang, L. A short review of probabilistic slope stability analysis considering spatial variability of geomaterial parameters. Innov. Infrastruct. Solut. 2022, 7, 249. [Google Scholar] [CrossRef]
Yang, Y.; Sun, Y.; Sun, G.; Zheng, H. Sequential excavation analysis of soil-rock-mixture slopes using an improved numerical manifold method with multiple layers of mathematical cover systems. Eng. Geol. 2019, 261, 105278. [Google Scholar] [CrossRef]
Qi, C.; Tang, X. A hybrid ensemble method for improved prediction of slope stability. Int. J. Numer. Anal. Met. 2018, 42, 1823–1839. [Google Scholar] [CrossRef]
Fu, Y.; Lin, M.; Zhang, Y.; Chen, G.; Liu, Y. Slope stability analysis based on big data and convolutional neural network. Front. Struct. Civ. Eng. 2022, 16, 882–895. [Google Scholar] [CrossRef]
Marrapu, B.M.; Kukunuri, A.; Jakka, R.S. Improvement in prediction of slope stability & relative importance factors using ANN. Geotech. Geol. Eng. 2021, 39, 5879–5894. [Google Scholar] [CrossRef]
Huang, F.; Xiong, H.; Chen, S.; Lv, Z.; Huang, J.; Chang, Z.; Catani, F. Slope stability prediction based on a long short-term memory neural network: Comparisons with convolutional neural networks, support vector machines and random forest models. Int. J. Coal Sci. Technol. 2023, 10, 18. [Google Scholar] [CrossRef]
Suman, S.; Khan, S.Z.; Das, S.K.; Chand, S.K. Slope stability analysis using artificial intelligence techniques. Nat. Hazards 2016, 84, 727–748. [Google Scholar] [CrossRef]
Fang, K.; Tang, H.; Li, C.; Su, X.; An, P.; Sun, S. Centrifuge modelling of landslides and landslide hazard mitigation: A review. Geosci. Front. 2023, 14, 101493. [Google Scholar] [CrossRef]
Youssef, A.M.; Pradhan, B.; Dikshit, A.; Al-Katheri, M.M.; Matar, S.S.; Mahdi, A.M. Landslide susceptibility mapping using CNN-1D and 2D deep learning algorithms: Comparison of their performance at Asir Region, KSA. Bull. Eng. Geol. Environ. 2022, 81, 1–22. [Google Scholar] [CrossRef]
Al-Najjar, H.A.H.; Pradhan, B. Spatial landslide susceptibility assessment using machine learning techniques assisted by additional data created with generative adversarial networks. Geosci. Front. 2021, 12, 625–637. [Google Scholar] [CrossRef]
Youssef, A.M.; Pourghasemi, H.R. Landslide susceptibility mapping using machine learning algorithms and comparison of their performance at Abha Basin, Asir Region, Saudi Arabia. Geosci. Front. 2021, 12, 639–655. [Google Scholar] [CrossRef]
Sarwar, S.; Aziz, G.; Kumar Tiwari, A. Implication of machine learning techniques to forecast the electricity price and carbon emission: Evidence from a hot region. Geosci. Front. 2023, 15, 101647. [Google Scholar] [CrossRef]
Xie, S.; Jiang, Z.; Lin, H.; Ma, T.; Peng, K.; Liu, H.; Liu, B. A new integrated intelligent computing paradigm for predicting joints shear strength. Geosci. Front. 2024, 15, 101884. [Google Scholar] [CrossRef]
Tinoco, J.; Correia, A.G.; Cortez, P.; Toll, D. An evolutionary neural network approach for slopes stability assessment. Appl. Sci. 2023, 13, 8084. [Google Scholar] [CrossRef]
Cho, Y.S.; Hong, S.U.; Lee, M.S. The assessment of the compressive strength and thickness of concrete structures using nondestructive testing and an artificial neural network. Nondestruct. Test. Eval. 2009, 24, 277–288. [Google Scholar] [CrossRef]
Artymiak, P.; Bukowski, L.; Feliks, J.; Narberhaus, S.; Zenner, H. Determination of S-N curves with the application of artificial neural networks. Fatigue Fract. Eng. 1999, 22, 723–732. [Google Scholar] [CrossRef]
Pal, A.; Kundu, T.; Datta, A.K. Damage Localization in Rail Section Using Single AE Sensor Data: An Experimental Investigation with Deep Learning Approach. Nondestruct. Test. Eval. 2023, 39, 1088–1106. [Google Scholar] [CrossRef]
de Albuquerque, V.H.C.; Cortez, P.C.; de Alexandria, A.R.; Tavares, J. A new solution for automatic microstructures analysis from images based on a backpropagation artificial neural network. Nondestruct. Test. Eval. 2008, 23, 273–283. [Google Scholar] [CrossRef]
Pestana, M.S.; Kalombo, R.B.; Freire, R.C.S.; Ferreira, J.L.A.; da Silva, C.R.M.; Araújo, J.A. Use of artificial neural network to assess the effect of mean stress on fatigue of overhead conductors. Fatigue Fract. Eng. Mater. Struct. 2018, 41, 2577–2586. [Google Scholar] [CrossRef]
Ince, R. Artificial neural network-based analysis of effective crack model in concrete fracture. Fatigue Fract. Eng. Mater. Struct. 2010, 33, 595–606. [Google Scholar] [CrossRef]
Bilgehan, M. A comparative study for the concrete compressive strength estimation using neural network and neuro-fuzzy modelling approaches. Nondestruct. Test. Eval. 2011, 26, 35–55. [Google Scholar] [CrossRef]
Ceylan, H.; Gopalakrishnan, K.; Bayrak, M.B.; Guclu, A. Noise-tolerant inverse analysis models for nondestructive evaluation of transportation infrastructure systems using neural networks. Nondestruct. Test. Eval. 2013, 28, 233–251. [Google Scholar] [CrossRef]
Feng, X.-T. Introduction of Intelligent Rock Mechanics; Science Press: Beijing, China, 2000. [Google Scholar]
Bui, D.T.; Tsangaratos, P.; Nguyen, V.-T.; Liem, N.V.; Trinh, P.T. Comparing the prediction performance of a Deep Learning Neural Network model with conventional machine learning models in landslide susceptibility assessment. Catena 2020, 188, 104426. [Google Scholar] [CrossRef]
Foong, L.K.; Moayedi, H. Slope stability evaluation using neural network optimized by equilibrium optimization and vortex search algorithm. Eng. Comput. 2022, 38, 1269–1283. [Google Scholar] [CrossRef]
Wang, Y.; Tang, H.; Huang, J.; Wen, T.; Ma, J.; Zhang, J. A comparative study of different machine learning methods for reservoir landslide displacement prediction. Eng. Geol. 2022, 298, 106544. [Google Scholar] [CrossRef]
Criss, R.E.; Yao, W.M.; Li, C.D.; Tang, H.M. A predictive, two-parameter model for the movement of reservoir landslides. J. Earth Sci. 2020, 31, 1051–1057. [Google Scholar] [CrossRef]
Tien Bui, D.; Moayedi, H.; Gör, M.; Jaafari, A.; Foong, L.K. Predicting Slope Stability Failure through Machine Learning Paradigms. ISPRS Int. J. Geo-Inf. 2019, 8, 395. [Google Scholar] [CrossRef]
Kang, F.; Xu, B.; Li, J.; Zhao, S. Slope stability evaluation using Gaussian processes with various covariance functions. Appl. Soft Comput. 2017, 60, 387–396. [Google Scholar] [CrossRef]
Bui, D.T.; Tuan, T.A.; Klempe, H.; Pradhan, B.; Revhaug, I. Spatial prediction models for shallow landslide hazards: A comparative assessment of the efficacy of support vector machines, artificial neural networks, kernel logistic regression, and logistic model tree. Landslides 2016, 13, 361–378. [Google Scholar] [CrossRef]
Li, J.C.; Yuan, W.; Li, H.B.; Zou, C.J. Study on dynamic shear deformation behaviors and test methodology of sawtooth-shaped rock joints under impact load. Int. J. Rock Mech. Min. Sci. 2022, 158, 105210. [Google Scholar] [CrossRef]
Lu, B.T. Crack growth model for pipeline steels exposed to near-neutral pH groundwater. Fatigue Fract. Eng. Mater. Struct. 2013, 36, 660–669. [Google Scholar] [CrossRef]
Selçuk, L.; Yabalak, E. Evaluation of the ratio between uniaxial compressive strength and Schmidt hammer rebound number and its effectiveness in predicting rock strength. Nondestruct. Test. Eval. 2015, 30, 1–12. [Google Scholar] [CrossRef]
Yuan, W.; Cheng, Y.; Min, M.; Wang, X. Study on acoustic emission characteristics during shear deformation of rock structural planes based on particle flow code. Comput. Part. Mech. 2023, 11, 105–118. [Google Scholar] [CrossRef]
Yuan, W.; Min, M. Investigation on the scale dependence of shear mechanical behavior of rock joints using DEM simulation. Comput. Part. Mech. 2023, 10, 1613–1627. [Google Scholar] [CrossRef]
Wang, J.; Xu, Y.; Li, J. Prediction of slope stability coefficient based on grid search support vector machine. Railw. Eng. 2019, 59, 94–97. [Google Scholar]
Hong, Y.; Shao, Z.; Ma, L. Application of a support vector machine for analysis and prediction of slope stability. J. Shenyang Jianzhu Univ. 2017, 33, 1004–1010. [Google Scholar]
Su, G. Fast estimation of safety factor for circular failure rock slope using Gaussian process model. J. Basic Sci. Eng. 2010, 18, 959–966. [Google Scholar]
Kostić, S.; Vasović, N.; Sunarić, D. A new approach to grid search method in slope stability analysis using Box–Behnken statistical design. Appl. Math. Comput. 2015, 256, 425–437. [Google Scholar] [CrossRef]
Xu, X. Highway Slope Stability Assessment Based on the Fuzzy-Neural Network. Master’s Thesis, Chongqing University, Chongqing, China, 2012. [Google Scholar]
Wang, C. Study on Prediction Methods for High Engineering Slope. Master’s Thesis, Beijing Jiaotong University, Beijing, China, 2009. [Google Scholar]
Khajehzadeh, M.; Keawsawasvong, S. Predicting slope safety using an optimized machine learning model. Heliyon 2023, 9, e23012. [Google Scholar] [CrossRef] [PubMed]
Xie, S.; Lin, H.; Chen, Y.; Duan, H.; Liu, H.; Liu, B. Prediction of shear strength of rock fractures using support vector regression and grid search optimization. Mater. Today Commun. 2023, 36, 106780. [Google Scholar] [CrossRef]
Kuhn, M.; Johnson, K. Applied Predictive Modeling; Springer: Berlin/Heidelberg, Germany, 2013. [Google Scholar]
Xie, S.; Lin, H.; Duan, H.; Chen, Y. Modeling description of interface shear deformation: A theoretical study on damage statistical distributions. Constr. Build. Mater. 2023, 394, 132052. [Google Scholar] [CrossRef]
Xie, S.J.; Lin, H.; Chen, Y.F.; Ma, T.X. Modified Mohr-Coulomb criterion for nonlinear strength characteristics of rocks. Fatigue Fract. Eng. Mater. Struct. 2024, 47, 2228–2242. [Google Scholar] [CrossRef]
Xie, S.; Lin, H.; Cheng, C.; Chen, Y.; Wang, Y.; Zhao, Y.; Yong, W. Shear strength model of joints based on Gaussian smoothing method and macro-micro roughness. Comput. Geotech. 2022, 143, 104605. [Google Scholar] [CrossRef]
Xie, S.J.; Lin, H.; Duan, H.Y.; Liu, H.W.; Liu, B.H. Numerical study on cracking behavior and fracture failure mechanism of fractured rocks under shear loading. Comput. Part. Mech. 2023, 11, 903–920. [Google Scholar] [CrossRef]
Xie, S.; Lin, H.; Duan, H. A novel criterion for yield shear displacement of rock discontinuities based on renormalization group theory. Eng. Geol. 2023, 314, 107008. [Google Scholar] [CrossRef]
Xie, S.; Lin, H.; Wang, Y.; Chen, Y.; Xiong, W.; Zhao, Y.; Du, S. A statistical damage constitutive model considering whole joint shear deformation. Int. J. Damage Mech. 2020, 29, 988–1008. [Google Scholar] [CrossRef]
Bishop, C.M. Pattern Recognition and Machine Learning (Information Science and Statistics); Springer: New York, NY, USA, 2006. [Google Scholar]
Oliveira, B.C.F.; Seibert, A.A.; Borges, V.K.; Albertazzi, A.; Schmitt, R.H. Employing a U-net convolutional neural network for segmenting impact damages in optical lock-in thermography images of CFRP plates. Nondestruct. Test. Eval. 2021, 36, 440–458. [Google Scholar] [CrossRef]
Rashid, A.; Ayub, M.; Javed, A.; Khan, S.; Gao, X.; Li, C.; Ullah, Z.; Sardar, T.; Muhammad, J.; Nazneen, S. Potentially harmful metals, and health risk evaluation in groundwater of Mardan, Pakistan: Application of geostatistical approach and geographic information system. Geosci. Front. 2021, 12, 101128. [Google Scholar] [CrossRef]
Muñoz-Abella, B.; Rubio, L.; Rubio, P. Stress intensity factor estimation for unbalanced rotating cracked shafts by artificial neural networks. Fatigue Fract. Eng. Mater. Struct. 2015, 38, 352–367. [Google Scholar] [CrossRef]
Xie, S.; Lin, H.; Chen, Y.; Yao, R.; Sun, Z.; Zhou, X. Hybrid machine learning models to predict the shear strength of discontinuities with different joint wall compressive strength. Nondestruct. Test. Eval. 2024, 1–21. [Google Scholar] [CrossRef]
Ben Seghier, M.E.; Carvalho, H.; Keshtegar, B.; Corria, J.; Berto, F. Novel hybridized adaptive neuro-fuzzy inference system models based particle swarm optimization and genetic algorithms for accurate prediction of stress intensity factor. Fatigue Fract. Eng. Mater. Struct. 2020, 43, 2653–2667. [Google Scholar] [CrossRef]
Bai, G.; Hou, Y.; Wan, B.; An, N.; Yan, Y.; Tang, Z.; Yan, M.; Zhang, Y.; Sun, D. Performance evaluation and engineering verification of machine learning based prediction models for slope stability. Appl. Sci. 2022, 12, 7890. [Google Scholar] [CrossRef]
Xie, S.J.; Lin, H.; Chen, Y.F.; Yong, R.; Xiong, W.; Du, S.G. A damage constitutive model for shear behavior of joints based on determination of the yield point. Int. J. Rock Mech. Min. Sci. 2020, 128, 104269. [Google Scholar] [CrossRef]
Xie, S.J.; Lin, H.; Chen, Y.F.; Wang, Y.X.; Cao, R.H.; Li, J.T. Statistical damage shear constitutive model of rock joints under seepage pressure. Front. Earth Sci. 2020, 8, 16. [Google Scholar] [CrossRef]
Xie, S.; Lin, H.; Wang, Y.; Cao, R.; Yong, R.; Du, S.; Li, J. Nonlinear shear constitutive model for peak shear-type joints based on improved Harris damage function. Arch. Civ. Mech. Eng. 2020, 20, 95. [Google Scholar] [CrossRef]
Verma, A.K.; Singh, T.N.; Chauhan, N.K.; Sarkar, K. A hybrid FEM–ANN approach for slope instability prediction. J. Inst. Eng. (India) Ser. A 2016, 97, 171–180. [Google Scholar] [CrossRef]
Wang, H.B.; Xu, W.Y.; Xu, R.C. Slope stability evaluation using back propagation neural networks. Eng. Geol. 2005, 80, 302–315. [Google Scholar] [CrossRef]
Liao, Z.; Liao, Z. Slope stability evaluation using backpropagation neural networks and multivariate adaptive regression splines. Open Geosci. 2020, 12, 1263–1273. [Google Scholar] [CrossRef]
Huang, Z.; Cui, J.; Liu, H. Chaotic neural network method for slope stability prediction. Chin. J. Rock Mech. Eng. 2004, 22, 3808–3812. [Google Scholar]
Sakellariou, M.G.; Ferentinou, M.D. A study of slope stability prediction using neural networks. Geotech. Geol. Eng. 2005, 23, 419–445. [Google Scholar] [CrossRef]
Zhang, M.-h.; Wei, J.; Bian, H.-d. Slope stability analysis method based on machine learning-taking 618 slopes in China as examples. J. Earth Sci. Environ. 2022, 44, 1083–1095. [Google Scholar] [CrossRef]
Choobbasti, A.J.; Farrokhzad, F.; Barari, A. Prediction of slope stability using artificial neural network (case study: Noabad, Mazandaran, Iran). Arab. J. Geosci. 2009, 2, 311–319. [Google Scholar] [CrossRef]
Das, S.K.; Biswal, R.K.; Sivakugan, N.; Das, B. Classification of slopes and prediction of factor of safety using differential evolution neural networks. Environ. Earth Sci. 2011, 64, 201–210. [Google Scholar] [CrossRef]
Lu, P.; Rosenbaum, M.S. Artificial neural networks and grey systems for the prediction of slope stability. Nat. Hazards 2003, 30, 383–398. [Google Scholar] [CrossRef]
Rukhaiyar, S.; Alam, M.N.; Samadhiya, N.K. A PSO-ANN hybrid model for predicting factor of safety of slope. Int. J. Geotech. Eng. 2018, 12, 556–566. [Google Scholar] [CrossRef]

Figure 1. Photograph of a slope case in Jiangxi Province, China (photograph by Daxing Lei).

Figure 2. Violin plots of input variables.

Figure 3. Correlation matrix of the dataset.

Figure 4. A schematic diagram of K-CV (K = 10).

Figure 5. Architecture of the proposed ANN model.

Figure 6. Geometric illustration of particle velocity and position updates.

Figure 7. The framework of the hybrid PCA-PANN model.

Figure 8. Comparisons between the measured and predicted values. (a) PANN model; (b) PCA-PANN model.

Figure 9. (a) Location of the monitored slope case. (b) Stratigraphic model of the monitored slope [83].

Figure 10. Importance scores of the input variables.

Table 1. Statistical description of the dataset employed in this study.

Parameters	c (kPa)	φ (°)	Φ (°)	H (m)	γ (kN/m³)	r_u	FoS
Number	307	307	307	307	307	307	307
Max	150.05	45	66	511	31.3	0.503	5.799
Min	0	0	8	3.66	12	0	0.46
Mean	25.3633	29.0593	35.5752	110.7239	22.1733	0.2694	1.4612
Standard deviation	25.3140	10.0418	10.7847	147.2382	4.5259	0.1245	0.7118
Kurtosis	7.3598	1.1794	−0.4032	0.5246	−0.2327	0.7679	10.4443
Skewness	2.2021	−1.1805	−0.5592	1.3831	−0.1840	−0.5038	2.7874

Table 2. Principal component eigenvalue and contribution rate.

Factor	Eigenvalue	Percent of Variance (%)	Cumulative Percent of Variance (%)
1	2.593	43.209	43.209
2	1.090	18.164	61.373
3	0.825	13.748	75.121
4	0.595	9.915	85.036
5	0.546	9.098	94.134
6	0.352	5.866	100.000
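The contribution rates in Table 2 follow directly from the eigenvalues, and the choice of four components can be reproduced with a short calculation. The sketch below assumes that components are retained until the cumulative contribution first reaches 85%, a commonly used cutoff that is taken here as an assumption rather than as the authors' stated rule; the variable names are illustrative.

```python
# Eigenvalues of the six principal components, copied from Table 2.
eigenvalues = [2.593, 1.09, 0.825, 0.595, 0.546, 0.352]

# Contribution rate of each component = eigenvalue / sum of eigenvalues.
total = sum(eigenvalues)
contribution = [100.0 * ev / total for ev in eigenvalues]

# Running (cumulative) contribution rate.
cumulative = []
running = 0.0
for rate in contribution:
    running += rate
    cumulative.append(running)

# Assumed selection rule: smallest number of components reaching the cutoff.
threshold = 85.0
n_components = next(
    k for k, cum in enumerate(cumulative, start=1) if cum >= threshold
)

for k, (rate, cum) in enumerate(zip(contribution, cumulative), start=1):
    print(f"F{k}: {rate:.3f} %  cumulative {cum:.3f} %")
print("components retained:", n_components)
```

With the tabulated eigenvalues, the fourth component brings the cumulative contribution to about 85.036%, which agrees with the four principal components used in the PCA-PANN model.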

Table 3. Component matrix for the normalized statistical parameters.

	F₁	F₂	F₃	F₄
γ*	0.796	−0.014	0.333	−0.215
c*	0.63	−0.413	0.38	0.328
φ*	0.67	0.287	−0.335	0.539
Φ*	0.653	0.187	−0.535	−0.268
H*	0.827	−0.015	0.101	−0.266
r_u*	−0.034	0.895	0.401	0.027

Table 4. Comparison of prediction effects of each model.

Model	Dataset	R²	RMSE	MAE
PANN	Training set	0.951	0.16	0.092
PANN	Test set	0.90	0.24	0.139
PCA-PANN	Training set	0.988	0.08	0.044
PCA-PANN	Test set	0.971	0.13	0.125
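For readers who want to recompute indicators of the kind listed in Table 4, the sketch below implements R², RMSE, and MAE in plain Python. It assumes the standard textbook definitions, with R² taken as the coefficient of determination; the paper's own formulas are given in its Section 3.2 and may differ in detail. The function names and the three-point sample data are hypothetical and are not taken from the slope dataset.

```python
import math


def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def rmse(y_true, y_pred):
    """Root mean square error."""
    n = len(y_true)
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / n)


def mae(y_true, y_pred):
    """Mean absolute error."""
    n = len(y_true)
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / n


# Hypothetical FoS values, used only to exercise the functions.
y_true = [1.0, 1.5, 2.0]
y_pred = [1.1, 1.4, 2.1]

metrics = {
    "R2": r2_score(y_true, y_pred),
    "RMSE": rmse(y_true, y_pred),
    "MAE": mae(y_true, y_pred),
}
print(metrics)
```

A higher R² together with lower RMSE and MAE indicates a better fit, which is the pattern Table 4 shows for PCA-PANN relative to PANN on both the training and test sets.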

Table 5. Mechanical parameters of the slope case [83].

Case	Slope Layer	c (kPa)	φ (°)	H (m)	Φ (°)	γ (kN/m³)	FoS
Case 1	Sand–gravel layer	8	30	120	40	13.2	1.05
Case 1	Strongly weathered andesite	215	15.6	120	40	15.75	1.4
Case 2	Sand–gravel layer	85	25	120	40	13.75	1.38
Case 2	Strongly weathered andesite	195	14.6	120	40	16.35	1.4
Case 3	Sand–gravel layer	9	17.9	120	40	15.63	1.08
Case 3	Strongly weathered andesite	130	11.9	120	40	18.56	1.35

Table 6. Comparison of model prediction effects.

Case	True FoS	FoS by ANN Model [83]	Error (%)	FoS by PCA-PANN Model	Error (%)
Case 1	1.458	1.40	−3.978	1.447	−0.754
Case 2	1.904	1.38	−27.521	1.894	−0.525
Case 3	1.150	1.08	−6.087	1.136	−1.217

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

© 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
