*Article* **Slope Stability Classification under Seismic Conditions Using Several Tree-Based Intelligent Techniques**

**Panagiotis G. Asteris 1,\*, Fariz Iskandar Mohd Rizal 2, Mohammadreza Koopialipoor 3, Panayiotis C. Roussis 4, Maria Ferentinou 5, Danial Jahed Armaghani 6,\* and Behrouz Gordan <sup>7</sup>**


**Abstract:** Slope stability analysis allows engineers to pinpoint risky areas, study trigger mechanisms for slope failures, and design slopes with optimal safety and reliability. Before the widespread usage of computers, slope stability analysis was conducted through semi-analytical methods or stability charts. Presently, engineers have developed many computational tools to perform slope stability analysis more efficiently. The challenge associated with furthering slope stability methods is to create a reliable design solution to perform reliable estimations involving a number of geometric and mechanical variables. The objective of this study was to investigate the application of tree-based models, including decision tree (DT), random forest (RF), and AdaBoost, in slope stability classification under seismic loading conditions. The input variables used in the modelling were slope height, slope inclination, cohesion, friction angle, and peak ground acceleration to classify safe slopes and unsafe slopes. The training data for the developed computational intelligence models resulted from a series of slope stability analyses performed using standard geotechnical engineering software commonly used in practice. Upon construction of the tree-based models, the model assessment was performed using accuracy, F1-score, recall, and precision indices. All tree-based models could efficiently classify the slope stability status, with the AdaBoost model providing the highest performance for the classification of slope stability for both model development and model assessment parts. The proposed AdaBoost model can be used as a screening tool during the stage of feasibility studies of related infrastructure projects, to classify slopes according to their expected status of stability under seismic loading conditions.

**Keywords:** classification; slope stability; tree-based models; random forest; AdaBoost; decision tree

#### **1. Introduction**

Geotechnical engineers often employ analytical and empirical methods in order to estimate the safety factor, based on design parameters and engineering properties, of soil or rock material. It is a challenging task to develop an adequate model to efficiently simulate site specific engineering geological conditions and follow the appropriate design approach in order to eliminate the possibility of failure and propose the most cost-effective design.

**Citation:** Asteris, P.G.; Rizal, F.I.M.; Koopialipoor, M.; Roussis, P.C.; Ferentinou, M.; Armaghani, D.J.; Gordan, B. Slope Stability Classification under Seismic Conditions Using Several Tree-Based Intelligent Techniques. *Appl. Sci.* **2022**, *12*, 1753. https://doi.org/10.3390/app12031753

Academic Editor: Chiara Bedon

Received: 16 December 2021 Accepted: 7 February 2022 Published: 8 February 2022

**Copyright:** © 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

Slope stability analysis is a standard practice in geotechnical engineering employed for the estimation of the stability of natural or man-made slopes such as embankments of highways, railways, earth dams, tailings, etc. The analysis of slope stability mainly involves the calculation of the factor of safety (FOS), which is defined as the ratio between shear strength and the acting shear stress. The key parameters that define the geometry of the slope (i.e., height and slope inclination) and the material properties (i.e., angle of internal friction, cohesion, and pore water pressure) influence the evaluation of stability of slopes [1–3]. Many sources of uncertainties, such as soil properties and loading, contribute to the stability of a slope [4–6]. The slopes can be classified as stable slopes (SS) or unstable slopes (US), depending on whether their FOS is greater or less than one [7]. The assessment of slope stability is usually performed using analytical techniques, such as the limit equilibrium method (LEM) and finite element methods.

The challenge associated with further development of slope stability analysis methods is to create a reliable generic design tool in order to perform precise evaluations of slope performance. Before the advent of computers, slope stability analysis was conducted using semi-graphical solutions, manual calculations, or stability charts [8]. Presently, engineers have developed many computational tools to perform slope stability analysis more efficiently. Geotechnical software based on analytical methods such as the limit equilibrium method (LEM) is widely used by engineers, although this method is known to be inadequate for complex slope conditions; in such cases, more sophisticated tools such as finite element methods are used [9].

Statistical methods for slope stability classification are based on mathematical formulas that are used in the statistical analysis of research. Multiple regression is a statistical analysis method that can predict the nature of relationship among independent variables and dependent variables. Multiple regression is able to predict the relationship of multiple independent variables against an output variable. This technique is widely used in analysing slope stability problems [10]. For instance, Erzin and Cetin [11] used multiple regression to predict the FOS of homogeneous slopes. The cohesion of soil (c), angle of internal friction (φ), unit weight of soil (γ), and seismic coefficient (k) were used as input parameters, and the output parameter was FOS. It was concluded that the predictions made by the multiple regression model were acceptable. In a similar study, Chakraborty and Goswami [12] used the height of cut or slope height H, material properties, cohesion (c), friction (φ), slope inclination (β), unit weight (γ), and dimensionless parameter (m) as input parameters to predict the status of stability. They also reported a very similar conclusion to the study by Erzin and Cetin [11]. However, the analyses performed by statistical models are only statistical-based, and they are not able to provide a clear view to researchers and designers [13].

Artificial intelligence (AI) and machine learning (ML) techniques have been successfully implemented in the area of engineering and sciences [14–32] for the last 25 years. Such models have also been used to solve slope stability problems [3,11,33–37]. The Adaptive Neuro-Fuzzy Inference System (ANFIS) was applied by Mohamed and Kasa [38] to predict the FOS of slopes, and the results were compared with those of the LEM. The predictions made by the ANFIS model were acceptable for applications in slope stability prediction. In another study, Kalatehjari et al. [39] utilized particle swarm optimization (PSO) to estimate the FOS of 3D slopes in comparison with a 3D finite element method (FEM) model, using material properties (cohesion (c), friction (φ), and unit weight (γ)) as input variables. They confirmed a successful application of PSO for 3D slope stability conditions but lower performance for 2D slope stability analysis. Artificial neural network (ANN), as a basic and benchmark AI model, was used by Sakellariou and Ferentinou [36], Ferentinou and Sakellariou [37], and Lu and Rosenbaum [40], and its performance in estimating slope stability was compared with LEM slope stability analysis. The results produced by the ANN model were found to concur with the results obtained by the LEM and allowed for the classification of sample observations according to the anticipated failure mechanism. In another study, Samui [41] proposed a support vector machine (SVM) technique for the prediction of FOS and compared it with the ANN results. He found that the SVM achieved a slightly higher accuracy than the ANN technique. In addition, the same SVM model with different kernels, including polynomial, radial basis, and spline, was proposed by Samui [35] to classify the FOS of slopes. The accuracy of the model was reported to be very high, as it showed 100% agreement with the expected slope stability classification results.
It was concluded that the classifications made by the SVM model were acceptable for applications in slope stability predictions; however, when the size of the dataset and/or the dimension of the input vector were high, the performance of the developed models was poor. In a study carried out by Tien Bui et al. [42], decision tree (DT) was used to predict the FOS of slopes and was compared with the results obtained by some other ML/AI techniques such as SVM. The accuracy of the DT model was found to be acceptable, but it was lower than that of the SVM model. It is clear that the AI/ML models have enough potential in classifying/predicting slope failure or FOS. Table 1 presents some of the classification/prediction studies in the areas of slope stability using AI/ML models. In these studies, FOS was set as the model output, and the model performance was assessed using the coefficient of determination (R<sup>2</sup>) and accuracy.

**Table 1.** Some of the classifications/prediction studies in the areas of slope stability using AI/ML models.


H: Height of cut, c: Cohesion of soil, φ: Angle of internal friction, β: Slope inclination, ru: Pore water pressure ratio, kmax: seismic coefficient.

In the light of the above discussion, it is clear that ANN and ANN-based models are the main body for the previous investigations. On the other hand, some other techniques, namely, tree-based, performed well in the areas of geotechnics and civil engineering [51–54]. In this study, different classification systems are proposed for slope stability using decision trees (DT), random forest (RF), and AdaBoost tree-based techniques. As presented in Table 1, many researchers used key parameters (i.e., height (H), cohesion (c), friction (φ), and unit weight (γ)) for the classification of slope FOS under static conditions. According to our review, there is a limited number of studies aimed at FOS estimation or status of stability classification under dynamic conditions. In the current study, the horizontal component of peak ground acceleration (PGA) is included in the input parameters. Therefore, the contribution of this study concerns, firstly, the use of tree-based models in slope stability classification, and secondly, the inclusion of a component related to dynamic conditions in slope stability. This allows for a more reliable slope stability classification under dynamic loading conditions. The rest of this paper is outlined as follows:

The effect of earthquakes on soil slopes is discussed in Section 2. Section 3 describes the concepts of the models used and the preparation of the data used for modelling. The development of the tree-based models for slope stability classification is presented in Section 4. The results of the study are evaluated and discussed in Section 5, which also identifies the best tree-based model for classifying slope stability. Section 6 presents the conclusions and directions for future work.

#### **2. Effect of Earthquake on Soil Slopes**

If a slope is situated in a region subject to earthquakes, the design must satisfy these adverse conditions. The effect of the shaking depends on whether the shear strength of the soil material remains adequate during cyclic loading or shaking results in a significant loss of strength. Since deformation is the result of shearing or sliding movement, slope stability analysis is necessary to ensure that the factor of safety is adequate to satisfy dynamic loading and minimize the resulting deformation. In the case of loose, saturated, cohesionless material, the total lack of strength due to cyclic loading might induce liquefaction, which is when a cohesionless saturated or partially saturated soil loses structural strength as a result of an applied stress (such as trembling during an earthquake or another abrupt change in stress condition), and a material that is normally a solid acts as a liquid. Liquefaction assessment requires a more complex analysis and additional data, such as pore water pressure measurements, and is beyond the scope of this paper.

The susceptibility of a slope to failing due to a seismic event is also determined through the critical acceleration coefficient ky. The coefficient of critical acceleration ky is an appropriate measure of a soil or rock mass' resistance to earthquake induced sliding. The value of the coefficient depends on the slope inclination β. Essentially, ky is as important for the sliding block model method [55], as the static safety factor is for the limit equilibrium method; these two variables are linearly related [56]. According to Sarma and Bhave [57], ky is a measure of safety factor, and is the yield acceleration of the slope. Sarma and Bhave [57] proposed a method to relate these two coefficients which is independent of the assumed failure mechanism and the material properties. The coefficient of critical acceleration ky is unique for each slope and is calculated when the safety factor is equal to one.

#### **3. Material and Methods**

#### *3.1. Data Preparation*

During the training process of developing a mathematical model to predict a parameter value as a function of a number of other variables, most researchers tend to focus on computational aspects, while at the same time paying less attention to the database being used for the training and development of the mathematical model.

However, we firmly believe that the main emphasis should be on the database to be used, as it is the database itself that describes the behaviour of the problem being modelled. The database, whether based on experimental or analytical data, is the available knowledge which must be properly utilized during the training process of the development of the mathematical model. In this regard, the database must be reliable with a sufficient amount of data to adequately describe the problem under study.

It should be noted that the phrase "sufficient amount of data" does not necessarily imply a high amount of data, but rather datasets that cover a wide range of combinations of input parameter values, thus assisting in the model's capability to simulate the problem. The demand for a reliable database is particularly crucial in the case of experimental databases, which are databases compiled using experimental results. In this case, significant deviations between experimental values are frequently noticed, not only between experiments conducted by different research teams and laboratories, but even between datasets derived from experiments conducted on specimens of the same synthesis, produced by the same technicians, cured under the same conditions, and tested implementing the same standards and testing instruments.

In light of the above discussion, in this study, in order to develop a comprehensive database for FOS classification under dynamic conditions, a series of models were constructed to calculate FOS using standard geotechnical software. Figure 1 illustrates a generic limit equilibrium model for the simulated slope. Many slope stability analysis tools use various versions of the method of slices, such as Bishop simplified, which discretizes the soil mass into slices in order to determine the FOS. The methods considered in this research include the ordinary method of slices (Swedish circle method/Petterson/Fellenius), Spencer, Sarma, etc. Sarma and Spencer are called "rigorous methods" because they satisfy all three conditions of equilibrium: force equilibrium in both horizontal and vertical directions and moment equilibrium. Rigorous methods can provide more accurate results than non-rigorous methods. Bishop simplified and Fellenius are non-rigorous methods, satisfying only some of the equilibrium conditions and making some simplifying assumptions [58,59]. Some of these approaches are discussed below. Slope stability analysis of this kind can be static or dynamic, analytical or empirical, and is used to evaluate the stability of earth and rock-fill dams, embankments, excavated slopes, and natural slopes in soil and rock. Slope stability refers to the ability of inclined soil or rock slopes to withstand or undergo movement.

**Figure 1.** Limit equilibrium model for the stability analysis (W: weight, τ: shear strength, *kh*: seismic coefficient, g: acceleration due to gravity, β: slope inclination, H: slope height).

The contribution of seismic loading is considered in the current slope stability analysis through the application of a horizontal force component of peak ground acceleration (PGA), that characterizes the amplitude of shaking within the sliding mass. Namely, the slope is assumed to be subjected to a force defined by

$$F_h = k_h W \tag{1}$$

where *W* is the weight of the sliding mass and *kh* is a dimensionless coefficient defined by

$$k_h = \text{PGA}/g \tag{2}$$
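As a minimal sketch of Equations (1) and (2), the pseudo-static horizontal force can be computed as follows. The function names, the value of g, and the 1000 kN sliding mass are assumptions made for illustration only; they are not taken from the GeoStudio models of this study.

```python
G = 9.81  # acceleration due to gravity in m/s^2 (assumed value)


def seismic_coefficient(pga_ms2: float, g: float = G) -> float:
    """Eq. (2): dimensionless coefficient k_h = PGA / g."""
    return pga_ms2 / g


def horizontal_force(k_h: float, weight: float) -> float:
    """Eq. (1): F_h = k_h * W, in the same units as the weight W."""
    return k_h * weight


# Hypothetical sliding mass of 1000 kN subjected to PGA = 0.2 g
k_h = seismic_coefficient(0.2 * G)
f_h = horizontal_force(k_h, 1000.0)
print(round(k_h, 3), round(f_h, 1))
```

For the PGA amplitudes considered in this study (0.1 to 0.4 g), the horizontal force would range from 10% to 40% of the weight of the sliding mass.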

The process was carried out in several phases to achieve a representative database. Boundary conditions, model dimensions, material properties, and seismic motion were the parameters considered in modelling. To do this, multiple homogeneous slopes with different conditions were modelled. Slopes with heights of 15, 20, 25, and 30 metres and inclinations of 20°, 25°, 30°, and 35° were produced. In terms of rigid behaviour, all of the models were placed on top of bedrock.

The failure criterion used in this method was the Mohr–Coulomb failure criterion

$$
\tau = c + \sigma \tan \varphi \tag{3}
$$

where *c* is the cohesion, *φ* is the friction angle, and *σ* is the normal stress. The criterion was applied to slopes in soils with both cohesion and internal friction, subjected to circular failure. The parametric values used were cohesion of 20, 30, 40, and 50 kPa and internal friction angle of 20°, 25°, 30°, 35°, and 40°. For the purposes of this analysis, the soil unit weight was assumed to be 18 kN/m<sup>3</sup>. The effect of earthquake motion on slope behaviour was considered through PGA amplitudes of 0.1, 0.2, 0.3, and 0.4 g. Thirty slices were used in all of the slope models. To obtain the FOS values, the grid and radius method was used to define the trial slip surfaces; with this method, the centre of the critical slip surface should lie close to the centre of the grid. The FOS values in the dataset were then separated manually into two groups, safe slopes (SS) and unsafe slopes (US), in order to meet the objective of analysing and classifying all the slope stability cases in the dataset. Table 2 shows the input and output parameters used in the database development.
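The Mohr–Coulomb criterion of Equation (3) and the separation of FOS values into SS and US can be sketched as below. The function names and the normal stress of 100 kPa are hypothetical, and assigning a FOS of exactly one to the unsafe group is an assumption, since the text only distinguishes values greater or less than one.

```python
import math


def mohr_coulomb_strength(c_kpa: float, sigma_kpa: float, phi_deg: float) -> float:
    """Eq. (3): shear strength tau = c + sigma * tan(phi), in kPa."""
    return c_kpa + sigma_kpa * math.tan(math.radians(phi_deg))


def stability_label(fos: float) -> str:
    """Label a slope 'SS' if FOS > 1, otherwise 'US' (boundary case assumed)."""
    return "SS" if fos > 1.0 else "US"


# Cohesion and friction angle from the parametric ranges; normal stress is hypothetical
tau = mohr_coulomb_strength(c_kpa=20.0, sigma_kpa=100.0, phi_deg=30.0)
print(round(tau, 2))

# Hypothetical FOS values, used only to show the labelling step
labels = [stability_label(f) for f in (0.85, 1.0, 1.32)]
print(labels)
```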

**Table 2.** Input and output variables for slope stability classification.


In this study, 700 homogeneous slopes were simulated using GeoStudio, which utilizes the LEM shown in Figure 1, along with the most critical FOS parameters. For these 700 slopes, different values of the parameters listed in Table 2 were used and the resulting FOS values were recorded. Based on the literature review conducted, the parameters presented in Figure 1 are considered to be the most important. The relationship between each input parameter and the output (i.e., FOS) was then examined using simple regression analysis (one-to-one relationship). The highest R<sup>2</sup> value was achieved by the PGA parameter through a polynomial trend-line (which performed best among the trend-lines applied, including linear, exponential, logarithmic, and power) as follows:

$$\text{FOS} = 0.0612 \text{(PGA)}^2 - 0.3512 \text{(PGA)} + 1.4545 \tag{4}$$

A value of R<sup>2</sup> equal to 0.305 was obtained for the above equation. Besides PGA, the parameter φ showed the best relationship with FOS values, with R<sup>2</sup> = 0.122 through an exponential trend-line.

To determine the relative effect of each input parameter on the output parameter, a sensitivity analysis was performed. The following equation was used to perform the same analysis:

$$r_{ij} = \frac{\sum_{k=1}^{m} x_{ik} x_{jk}}{\sqrt{\sum_{k=1}^{m} x_{ik}^{2} \sum_{k=1}^{m} x_{jk}^{2}}} \tag{5}$$

where *rij* is the strength of the relation between input *i* and output *j*, *xik* and *xjk* are the *k*th samples of input *i* and output *j*, respectively, and *m* is the total number of data samples. Table 3 shows the strengths of the relations (*rij* values) between the inputs and output (FOS). The sensitivity analysis results showed that the input parameters have a great influence on the FOS. Parameter φ had the highest impact on FOS values followed by H, β, c, and PGA. The results obtained were in line with previous studies [60,61].
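The strength of relation in Equation (5) can be sketched in a few lines, assuming the normalised form in which the denominator is the square root of the product of the two sums of squares (the cosine amplitude form). The function name and the sample values below are hypothetical and are not taken from the 700-slope database.

```python
import math


def relation_strength(x_i, x_j):
    """Strength of relation between two equally long samples (assumed normalised form)."""
    if len(x_i) != len(x_j):
        raise ValueError("both variables need the same number of samples")
    numerator = sum(a * b for a, b in zip(x_i, x_j))
    denominator = math.sqrt(sum(a * a for a in x_i) * sum(b * b for b in x_j))
    return numerator / denominator


# Hypothetical values for illustration only
friction_angle = [20.0, 25.0, 30.0, 35.0, 40.0]
fos = [0.92, 1.01, 1.10, 1.24, 1.37]
r = relation_strength(friction_angle, fos)
print(round(r, 3))
```

For non-negative data such as these, the value lies between 0 and 1, and values closer to 1 indicate a stronger relation between the input and the output.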

**Table 3.** Sensitivity analysis of input and output variables.


#### *3.2. Overview of Research Methodology*

A review of past related studies that utilize AI in slope stability methods was first conducted in order to choose the parameters to be used in the dataset required for training and testing the DT, RF, and AdaBoost models. The review revealed an absence of studies considering the PGA as a parameter in the performance of slope stability analysis. Subsequently, the FOS values were estimated using intelligent techniques. For this purpose, DT, RF, and AdaBoost were utilized based on the most influential parameters for slope stability performance as mentioned before for the input parameters. The results of the DT, RF, and AdaBoost model were compared to the results from the GeoStudio software to observe the performance of the DT, RF, and AdaBoost methods. Results of both methods were evaluated using performance indicators and the best model was selected and introduced for the problem of this study. Figure 2 presents a flowchart of the research methodology followed in this study.

**Figure 2.** Procedure flowchart for FOS classification.

#### *3.3. Decision Tree (DT)*

DT is an AI technique that uses conditional judgement rules to divide predictor variables into homogeneous categories. The aim of DT specification is to find a set of decision rules for predicting an outcome from a set of input variables [62]. The DT is referred to as a predictive data mining tree depending on whether the target variables are objective or subjective [63]. Classifying the FOS of slopes from multiple input parameters is possible with a DT model because it can model complex relationships between multiple input variables and an output variable, handling both categorical and continuous variables without making any assumptions about the distribution of the data [64]. Furthermore, DT models are simple to implement, and the prediction results are simple to understand. A DT model also reveals the relative significance of the input parameters to the output parameter [65].

A root node, internal nodes, and leaf nodes make up a DT structure. All of the input variables are stored in the root node. A decision function is connected with an internal node, which may have two or three branches. The output of a given input vector is represented by a leaf node [42]. Figure 3 shows the flowchart of procedures conducted for the modelling of a DT model. The procedure of modelling a DT model is governed by two steps: tree building and pruning.

**Figure 3.** Methodology flowchart for DT modelling.

In the first step, the root node of the DT is defined by determining the input variable with the maximum gain ratio. The dataset is then divided into sub-nodes depending on the root values. For discrete input variables, each potential value is represented by a sub-node of the tree [66]. The gain ratio is then calculated for each of the sub-nodes separately, and the process is repeated until all of the instances in a node are classified the same way. Such nodes are leaf nodes, and they are labelled with the class values. Since the tree produced during this process will have a large number of branches, it will be vulnerable to over-fitting [67]; it must therefore be pruned in order to improve the prediction performance for new data. Tree pruning can be divided into two categories: pre-pruning and post-pruning. In pre-pruning, the tree's growth is halted as soon as a stopping criterion is met; in post-pruning, the whole tree is grown first, and then sub-trees are replaced by leaves based on a comparison of the tree's error before and after eliminating them. More explanations regarding DT models can be found in [54].
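The split selection described above can be illustrated with a short sketch of the gain ratio for a binary split on a continuous input. The text does not give the formula, so the usual definition (information gain divided by split information) is assumed here; the function names and the eight sample slopes are hypothetical.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def gain_ratio(values, labels, threshold):
    """Gain ratio of the binary split `value <= threshold` (assumed definition)."""
    left = [y for x, y in zip(values, labels) if x <= threshold]
    right = [y for x, y in zip(values, labels) if x > threshold]
    if not left or not right:
        return 0.0  # a split that separates nothing carries no information
    w_left, w_right = len(left) / len(labels), len(right) / len(labels)
    gain = entropy(labels) - (w_left * entropy(left) + w_right * entropy(right))
    split_info = -(w_left * math.log2(w_left) + w_right * math.log2(w_right))
    return gain / split_info


# Hypothetical PGA values (in g) and stability labels, for illustration only
pga = [0.1, 0.1, 0.2, 0.2, 0.3, 0.3, 0.4, 0.4]
status = ["SS", "SS", "SS", "SS", "US", "US", "US", "US"]
for t in (0.1, 0.2, 0.3):
    print(t, round(gain_ratio(pga, status, t), 3))
```

In this toy sample, the threshold of 0.2 g separates the two classes perfectly and therefore receives the highest gain ratio, so it would be chosen for the node.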

#### *3.4. Random Forest (RF)*

RF, also known as random decision forest, is an ensemble modelling technique for classification, regression, and other tasks that works by training a large group of DTs and then outputting the class chosen by most of the individual trees (classification) or the average of their predictions (regression) [68]. Each individual DT is developed using the values of an independently sampled random vector. In classification models, the final output is determined by voting over the values yielded by the individual trees [69]. The basic RF algorithm utilizes the random subspace method. RFs are often used in industry as "black box" models because they provide accurate estimates over a broad variety of data with little configuration [70].

The DTs in the RF model learn rules and patterns from the input data. Using these rules and patterns, the output parameter (FOS) can be easily predicted for any new set of data. The gain ratio formula can be used to rank the most important parameters of slope failures. To address the issue of over-fitting, methods such as conservative pruning are subsequently applied [71]. Figure 4 shows the flowchart of procedures for RF modelling.
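Two ingredients of the RF technique mentioned above, the random subspace method and voting, can be sketched as follows. The function names, the subset size of three, and the seven tree votes are hypothetical and do not come from the fitted model; only the five input names follow the inputs of this study.

```python
import random
from collections import Counter

FEATURES = ["H", "beta", "c", "phi", "PGA"]  # the five inputs used in this study


def random_subspace(features, k, rng):
    """Pick k distinct input variables at random for one tree (random subspace method)."""
    return rng.sample(features, k)


def majority_vote(predictions):
    """Return the class label predicted by most of the individual trees."""
    return Counter(predictions).most_common(1)[0][0]


rng = random.Random(0)  # fixed seed so the sketch is repeatable
subset = random_subspace(FEATURES, 3, rng)
print(subset)

# Hypothetical predictions of seven trees for a single slope
tree_votes = ["SS", "US", "SS", "SS", "US", "SS", "SS"]
forest_label = majority_vote(tree_votes)
print(forest_label)
```

With an odd number of trees and two classes, the vote cannot end in a tie.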

**Figure 4.** Methodology flowchart for RF modelling.

#### *3.5. AdaBoost Algorithm*

Adaptive Boosting, also known as AdaBoost, is a boosting algorithm that uses re-weighted versions of the same training dataset rather than random sub-samples [72]. The benefit of this approach is that the algorithm does not need a large amount of data because it reuses the same training dataset in every round [73]. The algorithm is well-known for producing good results when constructing ensemble classifiers [74]. To obtain the ensemble classification function H: X → {−1, +1} shown in Equation (6), AdaBoost learns a series of weak learners or classifiers.

$$H(\mathbf{x}) = \text{sign}\left(\sum_{m=1}^{M} a_{m} H_{m}(\mathbf{x})\right) \tag{6}$$

where *H*(*x*) is the output of the developed ensemble classifier, *a*1, ..., *aM* are a set of weights, and *Hm*(*x*) is the output of weak learner *m* ∈ {1, ..., *M*}; the weak learners are combined to obtain *H*(*x*). In each round of the algorithm, the weights allocated to the training samples are determined by how the previous classifiers performed, so that the algorithm concentrates on the samples that were previously misclassified. Figure 5 shows the flowchart of procedures for AdaBoost modelling. More information on the AdaBoost concept can be found in other studies [75,76].
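A minimal sketch of Equation (6) and of one re-weighting round is given below. The text does not state the weight formulas, so the standard discrete AdaBoost expressions are assumed; the stumps, their weights, the four samples, and the convention that a zero score maps to +1 are all hypothetical choices made for illustration.

```python
import math


def adaboost_predict(x, learners, alphas):
    """Eq. (6): sign of the weighted sum of weak-learner outputs in {-1, +1}."""
    score = sum(a * h(x) for a, h in zip(alphas, learners))
    return 1 if score >= 0 else -1  # a zero score is mapped to +1 by assumption


def learner_weight(weighted_error):
    """Assumed standard weight of a weak learner: 0.5 * ln((1 - err) / err)."""
    return 0.5 * math.log((1.0 - weighted_error) / weighted_error)


def update_sample_weights(weights, y_true, y_pred, alpha):
    """Increase the weights of misclassified samples, decrease the rest, renormalise."""
    raw = [w * math.exp(-alpha * y * p) for w, y, p in zip(weights, y_true, y_pred)]
    total = sum(raw)
    return [w / total for w in raw]


# One boosting round on four hypothetical samples (+1 = safe, -1 = unsafe)
y_true = [1, 1, -1, -1]
y_pred = [1, 1, -1, 1]  # the last sample is misclassified
weights = [0.25] * 4
error = sum(w for w, y, p in zip(weights, y_true, y_pred) if y != p)
alpha = learner_weight(error)
new_weights = update_sample_weights(weights, y_true, y_pred, alpha)
print([round(w, 3) for w in new_weights])

# Two hypothetical stumps on PGA (in g), combined as in Eq. (6)
stumps = [lambda pga: 1 if pga < 0.25 else -1, lambda pga: 1 if pga < 0.35 else -1]
label = adaboost_predict(0.30, stumps, [0.8, 0.3])
print(label)
```

After the update, the single misclassified sample carries half of the total weight, which is how the next weak learner is steered towards the samples that were previously classified wrongly.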

**Figure 5.** Methodology flowchart for AdaBoost modelling.

#### *3.6. Performance Indicators*

To compare the results obtained from the DT, RF, and AdaBoost models with each other and with the expected results obtained from the GeoStudio software, a few performance indicators were used: accuracy, precision, recall, F1-score, and the ROC curve. All the models were subjected to these performance indicators to observe their effectiveness. Accuracy is the number of correctly classified predictions divided by the total number of predictions. It ranges from 0 to 1. Equation (7) shows the calculation of accuracy, where True Positive and True Negative are correct predictions made by the model.

$$\text{Accuracy} = \frac{\text{TruePositive} + \text{TrueNegative}}{\text{Total number of samples}} \tag{7}$$

Precision is the proportion of positive class predictions that actually belong to the positive class, and thus reflects the accuracy of the positive-class predictions. This calculation is expressed in Equation (8), where False Positive represents the false positive predictions made by the model.

$$\text{Precision} = \frac{\text{TruePositive}}{\text{TruePositive} + \text{FalsePositive}} \tag{8}$$

Recall is an index that measures how many correct positive predictions were made out of all actual positive cases. Unlike precision, which only considers the correct positive predictions out of all positive predictions, recall accounts for the positive cases that were missed. This calculation is expressed in Equation (9), where False Negative represents the false negative predictions made by the model.

$$\text{Recall} = \frac{\text{TruePositive}}{\text{TruePositive} + \text{FalseNegative}} \tag{9}$$

F1-score is a method for combining precision and recall into a single measure that encompasses both. Neither precision nor recall can provide the full picture on its own: a model may have excellent precision but poor recall, or vice versa. With the F1-score, both concerns can be expressed with a single score (Equation (10)).

$$\text{F1-score} = 2 \times \frac{\text{Precision} \times \text{Recall}}{\text{Precision} + \text{Recall}} \tag{10}$$

The receiver operating characteristic (ROC) curve is a graph of the false positive rate (*x*-axis) vs. the true positive rate, i.e., recall (*y*-axis) for a variety of candidate thresholds ranging from 0.0 to 1.0. The false positive rate is determined by dividing the number of false positives by the sum of the numbers of false positives and true negatives. In addition to the performance indicators mentioned above, the area under the ROC curve was obtained for each model as a single value representing its effectiveness.
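Equations (7) to (10) and the false positive rate can be collected in one small helper. The function name and the confusion-matrix counts below are hypothetical and are not results of this study; the sketch also assumes that none of the denominators is zero.

```python
def classification_metrics(tp: int, tn: int, fp: int, fn: int) -> dict:
    """Accuracy, precision, recall, F1-score, and false positive rate from counts."""
    precision = tp / (tp + fp)          # Eq. (8)
    recall = tp / (tp + fn)             # Eq. (9)
    return {
        "accuracy": (tp + tn) / (tp + tn + fp + fn),              # Eq. (7)
        "precision": precision,
        "recall": recall,
        "f1": 2 * precision * recall / (precision + recall),      # Eq. (10)
        "false_positive_rate": fp / (fp + tn),
    }


# Hypothetical counts for illustration only
metrics = classification_metrics(tp=90, tn=40, fp=10, fn=10)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```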

#### **4. Development of Tree-Based Techniques**

In order to develop the models implemented in this study, the hyperparameters of each model were optimized. A parametric analysis was performed on the parameters of each model because the models needed to be adjusted for each problem and dataset. Three types of models (DT, RF, and AdaBoost) were implemented, each of which had specific parameters related to its structure. In each sub-section, these parameters are defined, and various values are analysed in order to find the optimal structure. The details of each model are presented in the following.

#### *4.1. DT Model*

To obtain the most effective DT model, several models were developed using different parameter values. Table 4 reports the parameters used for modelling in this study. Upon experimenting with the values of the number of instances in leaves, minimum limit of the split subset, and maximal tree depth, the most effective DT model with the optimal values of these parameters was obtained. In addition, Figure 6 shows the tree flowchart of the proposed DT model for classifying slope stability.


**Table 4.** The optimal parameters obtained by the DT model.

**Figure 6.** The optimal DT model for FOS classification.

In the training phase, 75% of the dataset was used (525 slope cases), which is similar to a study conducted by Piryonesi and El-Diraby [70]. The data were selected randomly, and the input parameters were inserted into the model. In the testing phase, 25% of the dataset was used, which corresponds to 175 slope cases. Figure 7 shows the results of the DT model in the classification of the FOS for training and testing sets. For the training set, the DT model classified 300 safe slopes and 162 unsafe slopes accurately, while classifying 12 safe slopes and 21 unsafe slopes wrongly. For the testing set, the DT model classified 109 safe slopes and 47 unsafe slopes accurately, while classifying 3 safe slopes and 16 unsafe slopes wrongly. Later, the results of the DT from both phases were assessed using the performance indicators accuracy, precision, recall, F1-score, and ROC curve.
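The random 75%/25% partition described above can be sketched as follows. The function name and the seed are hypothetical; the actual random selection used in this study is not reproduced.

```python
import random


def train_test_split_indices(n_samples: int, train_fraction: float = 0.75, seed: int = 0):
    """Shuffle the sample indices and cut them into training and testing parts."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)  # fixed seed so the split is repeatable
    n_train = round(n_samples * train_fraction)
    return indices[:n_train], indices[n_train:]


train_idx, test_idx = train_test_split_indices(700)
print(len(train_idx), len(test_idx))  # 525 175
```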

**Figure 7.** The DT model results for FOS classification: (**A**) Training and (**B**) Testing.
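The random 75%/25% partition of the 700 slope cases can be sketched as follows. This is an illustrative sketch only: the function name `split_indices` and the fixed seed are assumptions, and the study does not state how its random selection was implemented.

```python
import random


def split_indices(n_cases, train_fraction=0.75, seed=0):
    """Shuffle case indices and split them into training and testing subsets."""
    indices = list(range(n_cases))
    rng = random.Random(seed)  # fixed seed (an assumption) for repeatability
    rng.shuffle(indices)
    n_train = round(n_cases * train_fraction)
    return indices[:n_train], indices[n_train:]


train_idx, test_idx = split_indices(700)
print(len(train_idx), len(test_idx))  # 525 175
```

Every case index ends up in exactly one of the two subsets, so no slope is used for both training and testing.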

#### *4.2. RF Model*

A similar modelling process was completed for the RF technique, aiming at classification of slope stability by considering FOS values of more than one as safe (SS) and less than one as unsafe (US). After experimenting with different numbers of trees and the minimum limit of split subsets, the most effective RF model with optimal values was obtained (Table 5). The same training and testing portions as for the DT model were used. Figure 8 displays the results obtained by the RF technique for the classification of slope stability for the training and testing phases. In the training phase, the RF technique classified 344 safe slopes and 169 unsafe slopes correctly, while classifying 5 safe slopes and 7 unsafe slopes wrongly. In the testing phase, the RF model classified 116 safe slopes and 44 unsafe slopes correctly, while 9 safe slopes and 6 unsafe slopes were classified wrongly. As with the DT model, the results obtained by the RF model are assessed and discussed later.

**Table 5.** The optimal parameters obtained by the RF model.

| RF Parameter | Value |
|---|---|
| Number of trees | 7 |
| Minimum limit of the split subset | 5 |

**Figure 8.** The RF model results for FOS classification: (**A**) Training and (**B**) Testing.

#### *4.3. AdaBoost Model*

The same data, with five input parameters under seismic conditions, were used to classify slopes as safe or unsafe. As in the previous subsections, it was important to obtain the optimal parameters of the model, in this case AdaBoost. Several parametric studies were conducted to obtain the most accurate AdaBoost model, and the resulting optimal parameters are presented in Table 6. Different base models can be selected for AdaBoost; among those considered, DT was the best for solving the defined problem. The proposed AdaBoost model classified 351 safe slopes and 174 unsafe slopes correctly, with no wrong classifications in the training (model development) phase (Figure 9). In the testing (model evaluation) phase, however, there were several wrong cases: 120 safe slopes and 43 unsafe slopes were classified correctly, while 7 safe cases and 5 unsafe cases were classified wrongly (Figure 9). The classification results obtained by the AdaBoost model therefore appear slightly better than those obtained by the RF and DT techniques. The evaluation of the proposed models is not the aim of this section and is reported in the following section.

**Table 6.** The optimal parameters obtained by the AdaBoost model.


**Figure 9.** The AdaBoost model results for FOS classification: (**A**) Training and (**B**) Testing.
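To illustrate how boosting tree-type base learners can drive the training error to zero, the following minimal sketch implements standard discrete AdaBoost with one-split trees (decision stumps). It is not the model of this study: the one-dimensional toy data are made up for illustration, the function names are hypothetical, and the base tree and settings of Table 6 are not reproduced.

```python
import math


def fit_stump(X, y, w):
    """Return the one-split tree (err, feature, threshold, polarity) with the
    lowest weighted error. Labels y are +1 or -1."""
    best = None
    for j in range(len(X[0])):
        for threshold in sorted({row[j] for row in X}):
            for polarity in (1, -1):
                err = sum(
                    wi for wi, row, yi in zip(w, X, y)
                    if (polarity if row[j] <= threshold else -polarity) != yi
                )
                if best is None or err < best[0]:
                    best = (err, j, threshold, polarity)
    return best


def stump_predict(stump, row):
    _, j, threshold, polarity = stump
    return polarity if row[j] <= threshold else -polarity


def adaboost_fit(X, y, n_rounds):
    """Discrete AdaBoost: reweight cases so later stumps focus on past errors."""
    n = len(X)
    w = [1.0 / n] * n
    ensemble = []
    for _ in range(n_rounds):
        stump = fit_stump(X, y, w)
        err = max(stump[0], 1e-10)  # avoid division by zero for a perfect stump
        if err >= 0.5:  # base learner no better than chance: stop
            break
        alpha = 0.5 * math.log((1 - err) / err)
        ensemble.append((alpha, stump))
        # Increase the weight of misclassified cases, decrease the others.
        w = [wi * math.exp(-alpha * yi * stump_predict(stump, row))
             for wi, yi, row in zip(w, y, X)]
        total = sum(w)
        w = [wi / total for wi in w]
    return ensemble


def adaboost_predict(ensemble, row):
    score = sum(alpha * stump_predict(stump, row) for alpha, stump in ensemble)
    return 1 if score >= 0 else -1


# Made-up toy data (NOT from the study): +1 at both ends, -1 in the middle.
X = [[v] for v in range(1, 10)]
y = [1, 1, 1, -1, -1, -1, 1, 1, 1]

ensemble = adaboost_fit(X, y, n_rounds=3)
train_accuracy = sum(adaboost_predict(ensemble, r) == t for r, t in zip(X, y)) / len(y)
print(train_accuracy)  # 1.0
```

No single stump can fit this toy set, since the best one still misclassifies three of the nine points, but the weighted vote of three stumps fits all of them.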

#### **5. Results and Discussion**

This section compares the results obtained from the DT, RF, and AdaBoost models. The results were assessed with several performance indicators, namely accuracy, precision, recall, F1-score, and the area under the ROC curve (AUC), to determine which method was the most accurate and effective for slope stability classification. Here, the testing phase was considered for the validation of each tree-based model; evaluating a model on data held out from training is a common way to assess the model developed during training. The training-stage results showed that the proposed AdaBoost model made no misclassifications, so this stage is not discussed or compared further. Table 7 compares the testing-stage results in terms of accuracy, precision, recall, F1-score, and AUC. In addition, the ranking procedure proposed by Zorlu et al. [77] was applied in this table. In this system, for each performance index the best-performing model receives the highest rank, and the ranks are summed over all indices. According to Table 7, the best-performing model was AdaBoost, as it obtained the highest total rank value of 13. The second was the RF model, with a total rank value of 10, and the least accurate was the DT model, with a total rank value of 7. AdaBoost outperformed the RF and DT models on every index except the AUC. The RF model also achieved a high degree of accuracy and can be used for slope stability classification by other researchers and engineers. For a better comparison, Figure 10 shows the classification results of the DT, RF, and AdaBoost models from the testing phase compared to the FOS results obtained with the GeoStudio software.

As stated earlier, 175 data samples, which constituted 25% of the whole dataset, were used for each model in the testing phase. It is clear from Figure 10 that the AdaBoost technique recorded the best performance, with the lowest number of unmatched answers (12). The numbers of matched and unmatched cases were 160 and 15 for RF and 156 and 19 for DT, confirming the RF model's superiority over the DT model in slope stability classification. Overall, the error rate during the testing phase was low (at most 19 of 175 cases), which reflects the quality of the model development during the training phase. It was concluded that the best-performing model for slope stability classification was AdaBoost, and that it could be used in this field for the same purpose to minimize the associated risk.
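The performance indices can be computed directly from the classification counts. The sketch below does this for the AdaBoost testing counts given in Section 4.3 (120 and 43 correct, 7 and 5 wrong). Two points are assumptions rather than statements from the study: that precision, recall, and F1-score are averaged over the two classes weighted by class size, and that "7 safe cases" classified wrongly means slopes predicted safe that were actually unsafe. The function name `weighted_metrics` is hypothetical.

```python
def weighted_metrics(tp_safe, fp_safe, tp_unsafe, fp_unsafe):
    """Accuracy and class-size-weighted precision, recall, and F1 (two classes).

    tp_safe:   predicted safe, actually safe
    fp_safe:   predicted safe, actually unsafe
    tp_unsafe: predicted unsafe, actually unsafe
    fp_unsafe: predicted unsafe, actually safe
    """
    total = tp_safe + fp_safe + tp_unsafe + fp_unsafe
    # Actual class sizes, used as averaging weights (an assumption).
    n_safe = tp_safe + fp_unsafe
    n_unsafe = tp_unsafe + fp_safe

    def per_class(tp, fp, fn):
        precision = tp / (tp + fp)
        recall = tp / (tp + fn)
        f1 = 2 * precision * recall / (precision + recall)
        return precision, recall, f1

    safe = per_class(tp_safe, fp_safe, fp_unsafe)
    unsafe = per_class(tp_unsafe, fp_unsafe, fp_safe)
    precision, recall, f1 = (
        (n_safe * s + n_unsafe * u) / total for s, u in zip(safe, unsafe)
    )
    return {
        "accuracy": (tp_safe + tp_unsafe) / total,
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


# AdaBoost testing counts as reported in Section 4.3.
adaboost_test = weighted_metrics(tp_safe=120, fp_safe=7, tp_unsafe=43, fp_unsafe=5)
for name, value in adaboost_test.items():
    print(f"{name}: {value:.3f}")
```

Under these assumptions, all four indices round to 0.931, and the accuracy is 163 correct cases out of 175.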


**Table 7.** Modelling results for the testing datasets of DT, RF, and AdaBoost for slope stability classification.
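The total rank values can be reproduced from the testing-stage index values reported in this article. The sketch below is one straightforward reading of the ranking procedure: for each index, the three models are ranked from 1 (lowest value) to 3 (highest value), and the ranks are summed. The function name `total_ranks` is hypothetical, and ties, which do not occur in these values, are not handled.

```python
# Testing-stage index values as reported in this article.
scores = {
    "AdaBoost": {"AUC": 0.910, "Accuracy": 0.931, "F1": 0.931,
                 "Precision": 0.931, "Recall": 0.931},
    "RF": {"AUC": 0.961, "Accuracy": 0.914, "F1": 0.915,
           "Precision": 0.916, "Recall": 0.914},
    "DT": {"AUC": 0.968, "Accuracy": 0.891, "F1": 0.895,
           "Precision": 0.908, "Recall": 0.891},
}


def total_ranks(scores):
    """Sum per-index ranks: the highest value of each index gets the top rank."""
    models = list(scores)
    metrics = list(next(iter(scores.values())))
    totals = {model: 0 for model in models}
    for metric in metrics:
        ordered = sorted(models, key=lambda m: scores[m][metric])
        for rank, model in enumerate(ordered, start=1):
            totals[model] += rank
    return totals


totals = total_ranks(scores)
print(totals)  # {'AdaBoost': 13, 'RF': 10, 'DT': 7}
```

AdaBoost receives the top rank of 3 on four indices and the lowest rank of 1 on AUC, giving 13; the DT model shows the reverse pattern, giving 7.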

**Figure 10.** Chart of results obtained from the models compared to expected results.

#### **6. Conclusions and Future Works**

To achieve the aim of this study, tree-based models including DT, RF, and AdaBoost were developed to classify the stability of 700 slopes (464 safe slopes and 236 unsafe slopes) under seismic conditions, which were modelled and analysed in the GeoStudio software. The variables H, β, C, φ, and PGA were set as model inputs for the classification of slopes, where FOS ≥ 1 and FOS < 1 were considered safe and unsafe, respectively. To measure the performance of the DT, RF, and AdaBoost models, accuracy, precision, recall, F1-score, and AUC were calculated as performance indices for both the training and testing stages, and the best technique was selected based on these indices. In the training stage, AdaBoost classified every case correctly and therefore achieved the highest possible performance among the employed models. In the testing stage, AdaBoost also achieved the highest values of all calculated indices except AUC. The testing-stage values of AUC, accuracy, F1-score, precision, and recall were 0.910, 0.931, 0.931, 0.931, and 0.931 for AdaBoost; 0.961, 0.914, 0.915, 0.916, and 0.914 for RF; and 0.968, 0.891, 0.895, 0.908, and 0.891 for DT. These values confirm the successful use of tree-based models in classifying slope stability, with the proposed AdaBoost technique showing the best overall performance. Therefore, it can be introduced as a new technique for slope stability classification with the largest number of matched cases.

It is well established that proposing a new method for classifying slope stability cases using AI techniques requires extensive investigation. To develop such a model, a comprehensive database comprising real cases must be gathered and utilized, yet collecting such a database is very difficult and time consuming. With such data, slope stability classification can be conducted using new (hybrid) AI techniques, such as RF or AdaBoost combined with metaheuristic algorithms.

Moreover, the use of real slope stability data based on different types of soils considering other properties, such as unit weight, permeability, and ground water table, would be of interest and importance to geotechnical engineers. In this regard, model generalization as an important issue in classification and prediction problems can be considered, with the developed models covering a wider range of input parameters, as well as a larger number of effective problem variables.

**Author Contributions:** Conceptualization, D.J.A., B.G. and P.G.A.; methodology, D.J.A., F.I.M.R., M.K. and P.G.A.; software, D.J.A., F.I.M.R., M.K. and P.G.A.; formal analysis, D.J.A., F.I.M.R., M.K. and P.G.A.; writing—original draft preparation, D.J.A., F.I.M.R., M.K., P.G.A., B.G., P.C.R. and M.F.; writing—review and editing, D.J.A., F.I.M.R., M.K., P.G.A., B.G., P.C.R. and M.F.; supervision, D.J.A., B.G. and P.G.A.; Data curation, B.G. All authors have read and agreed to the published version of the manuscript.

**Funding:** This research received no external funding.

**Institutional Review Board Statement:** Not applicable.

**Informed Consent Statement:** Not applicable.

**Data Availability Statement:** The data are available upon request.

**Acknowledgments:** Authors of this study wish to express their appreciation to the University of Malaya for supporting this study and making it possible.

**Conflicts of Interest:** The authors declare no conflict of interest.

#### **References**

