*Article* **Estimation of Blast-Induced Peak Particle Velocity through the Improved Weighted Random Forest Technique**

**Biao He 1, Sai Hin Lai 1,\*, Ahmed Salih Mohammed 2, Mohanad Muayad Sabri Sabri 3,\* and Dmitrii Vladimirovich Ulrikh 4**


**Abstract:** Blasting is one of the primary activities in mining operations, and its environmental effects interfere with the safety of lives and property. Therefore, it is essential to accurately estimate the environmental impact of blasting, i.e., the peak particle velocity (PPV). In this study, a regular random forest (RF) model was developed using 102 blasting samples collected from an open granite mine. The model inputs included six parameters, while the output was the PPV. Then, five techniques, i.e., weights refined according to the accuracy of the decision trees and weights optimized by three metaheuristic algorithms, were proposed to enhance the predictive capability of the regular RF model. The results showed that all refined weighted RF models performed better than the regular RF model. In particular, the refined weighted RF model using the whale optimization algorithm (WOA) showed the best performance. Moreover, the sensitivity analysis revealed that the powder factor (PF) has the most significant impact on the prediction of the PPV in this project case, which means that the magnitude of the PPV can be managed by controlling the size of the PF.

**Keywords:** blasting; ground vibration; PPV prediction; random forest; whale optimization algorithm

#### **1. Introduction**

Blasting is an economical method of rock excavation in mining and civil engineering, but it produces a series of adverse environmental effects such as blasting vibration [1–3], flyrock [4–6], and back break [7–9]. Among these adverse effects, the harm caused by blasting vibration is quite serious. For example, the surrounding structures can be damaged or fail because of excessive structural vibration produced by ground vibration during blasting [10,11]. Therefore, it is indispensable to predict the magnitude of blast vibration accurately.

The standard base parameter for assessing the magnitude of blast-induced ground vibration is the peak particle velocity (PPV) [12]. Many studies on PPV prediction have been implemented using empirical formulas, multiple linear and nonlinear regression methods, and machine learning (ML) methods. Among these, the empirical formulas for PPV prediction are easy to construct, but they are not sufficiently accurate since only a few factors are considered in the prediction task [13,14]. Moreover, empirical formulas are usually designed for a given location with distinct geological parameters and field morphology, so they are limited and unable to accurately predict the PPV at other blasting locations [15]. Concerning the multiple linear and nonlinear regression methods, several scholars have demonstrated

**Citation:** He, B.; Lai, S.H.; Mohammed, A.S.; Sabri, M.M.S.; Ulrikh, D.V. Estimation of Blast-Induced Peak Particle Velocity through the Improved Weighted Random Forest Technique. *Appl. Sci.* **2022**, *12*, 5019. https://doi.org/ 10.3390/app12105019

Academic Editor: Ricardo Castedo

Received: 12 April 2022 Accepted: 11 May 2022 Published: 16 May 2022



that these methods are capable of handling high-dimensional problems, which means that the effects of multiple factors can be considered simultaneously when predicting the PPV [15,16]. However, many studies show that the accuracy of statistical models is inferior to that of ML methods [17–20].

In recent years, a large number of ML techniques have been applied in different areas of geotechnics, such as tunnel construction and risk assessment [21–28], soil classification [29], pile technology [30,31], materials properties [32–35], slope stability [36,37], blasting environmental issues [38,39], pillar stability prediction [40], and rock material properties [41–44], which reveal the favorable application prospects of the ML techniques. Similarly, as for the research on PPV prediction, to compensate for the shortcomings of empirical formulas and statistical models, there has been a strong inclination toward using the ML techniques in predicting the PPV because of their ability to handle multidimensional nonlinear problems. For instance, Zhang et al. [1] used five machine learning techniques (i.e., classification and regression trees (CART), chi-squared automatic interaction detection (CHAID), random forest (RF), artificial neural network (ANN), and support vector machine (SVM)) to predict the PPV caused by mine blasting. Their research utilized five parameters, including maximum charge per delay (MC), stemming (ST), distance from the measuring station to the blast face (DI), powder factor (PF), and hole depth (HD), to develop the ML models. The results showed that the RF had a superior capability in predicting PPV compared with the other four techniques. Lawal [45] developed an ANN-based formula to predict the PPV using two inputs, i.e., the distance from the monitoring point to the blast face and the explosive charge per delay. That study attempted to construct an interpretable formula through the ANN to identify the effect of the inputs on the PPV. Rana et al. [46] compared the performance of two AI techniques (i.e., ANN and decision tree (DT)) for forecasting the blast-induced PPV.
In that study, eight input parameters, i.e., total charge, number of holes, hole diameter, distance from the blasting face, hole depth, tunnel cross section, maximum charge per delay, and charge per hole, were used to design the ML models. The results indicated that the DT model achieved better precision than the ANN model. Some relevant studies on predicting the PPV are shown in Table 1.


**Table 1.** Research on blast-induced PPV prediction.

Note: MC—maximum charge per delay; HD—hole depth; ST—stemming; PF—powder factor; DI—distance from the measuring station to the blast face; *ρ*—rock density; SRH—Schmidt rebound hardness; TC—total charge; A—tunnel cross section area; NH—number of holes; H—hole diameter; CPH—charge per hole; B—burden; S—spacing; N—number of rows; BS—burden to spacing ratio; E—Young's modulus; Vp—P-wave velocity; CART—classification and regression trees; CHAID—chi-squared automatic interaction detection; RF—random forest; ANN—artificial neural network; SVM—support vector machine; MPSO—modified particle swarm optimization; GEP—gene expression programming; ANFIS—adaptive neuro-fuzzy inference system; SCA—sine cosine algorithm; DT—decision tree; ICA—imperialist competitive algorithm; FIS—fuzzy inference system.

In light of the above, the good performance and adaptability of the ML models in predicting the PPV have been progressively demonstrated, and hence a large number of ML studies on PPV prediction have emerged. Nevertheless, ML models with better adaptability and higher prediction accuracy for blast-induced PPV still need to be developed and improved.

In the present study, we utilized a classic ML model, the RF, to conduct the PPV prediction task. After designing the regular RF model using the dataset collected from a quarry mine, we proposed five schemes for modifying the weights of the regular RF model and, as a result, making better predictions. The designed weighting frameworks include weights refined according to the prediction accuracy of the decision trees, as well as weights of the decision trees optimized by three metaheuristic algorithms, i.e., the whale optimization algorithm (WOA), gray wolf optimization (GWO), and the tunicate swarm algorithm (TSA). Subsequently, four evaluation metrics were used to validate the performance of the developed models. Finally, a sensitivity analysis was conducted to identify the predominant factors for PPV prediction in this engineering case.

#### **2. Project Description and Data Collection**

In the present study, a granite mine in Penang state, Malaysia, was selected as the study site for PPV prediction research, and thus its blasting operations were investigated. Granite is the most common rock type found in the study area. The top layer is generally less than three feet thick and consists mostly of sandy clay with humus and tree roots. Blasting operations are rather common in this mine and are repeated at various intervals. Blasting at this location aims to produce aggregates for various building projects, with an annual capacity of 500–700 thousand tons, and large quantities of explosives are used in these operations. For example, in holes with diameters ranging from 76 mm to 89 mm, explosives weighing 856 kg to 9420 kg are commonly utilized. Concerning the present study, 102 blasting operations were recorded, including the design details of the blast parameters and the PPV values obtained by the Vibra ZEB seismograph. The measured and recorded blast parameters involve the number of holes, hole diameter, hole depth, burden, spacing, stemming length, subdrilling, total charge, powder factor, maximum charge per delay, and distance from the blast face to the measuring points. Among them, six parameters were selected as the inputs used to implement the ML modeling, which accords with previously published works [1,48,49]. The simple statistics of the six parameters are shown in Table 2, including their max/min values, mean values, and standard deviations. It can be seen from Table 2 that the distance from the seismograph to the explosion site was about 285 m–531 m, and the range of the PPV is between 0.13 and 11.05 mm/s.

**Table 2.** Statistics of the data collected from the study area.


#### **3. Methods**

*3.1. RF*

The RF approach is a machine learning model proposed by Leo Breiman in 2001 [55]. Because of its special algorithmic mechanism and efficient performance, the RF model has attracted much attention from researchers in various research fields. The RF model is an approach that integrates many base learners (also called decision trees) through integrated learning. It uses bagging and bootstrap techniques to train the decision trees, which solves the problem of insufficient performance of individual decision trees when dealing with complex data. At the same time, the RF is also a nonparametric classification or regression method, so it does not require prior knowledge when processing data.

The schematic diagram of the RF model is depicted in Figure 1. In the process of building decision trees, the RF model generates more randomness as the number of decision trees in the forest increases. Instead of selecting the optimal value step by step like a single decision tree, the RF uses random selection and the voting mechanism of the decision trees to find the optimal value quickly. This property gives the RF model better classification or regression performance and a strong generalization ability. For the regression task, when predicting an unknown output value, every single decision tree yields a predicted value, and the final value is the average over all decision trees, as shown in Equation (1). In this process, each decision tree occupies the same weight:

$$y\_{pre} = \sum\_{i=1}^{t} weight\_i \times Tree\_i(\mathbf{x}) \tag{1}$$

where **x** is the input variable, *ypre* denotes the predicted value corresponding to the input **x**, *t* is the number of constructed decision trees, {*Tree*1, *Tree*2, ... , *Treet*} represents the set of decision trees in a forest, and *weighti* is the weight of each decision tree. In the current case (i.e., for the regular RF model), every decision tree receives the same weight:

$$weight\_i = \frac{1}{t}, \quad i = 1, 2, \ldots, t \tag{2}$$

**Figure 1.** Schematic diagram of the RF model.

In light of the above, it can be inferred that the number of decision trees is a crucial hyperparameter that governs the predictive capability of the RF model. Therefore, a key task in this paper is to determine the optimal number of decision trees. Moreover, to avoid possible overfitting of the RF model, another hyperparameter, the maximum depth of decision trees, must also be tuned. These two critical steps will be discussed in more detail in a later section. In addition, the dataset used for constructing the RF model is randomly split into two parts; that is, 80% of the data are used for training the RF model, and 20% of the data are used for validating the performance of the built models. Simultaneously, a fourfold cross-validation procedure is applied in this work when implementing model training procedures. Note that fourfold cross-validation means that the datasets are divided into four equal parts, one of which is used to test the model, while the rest are used to train the model in each modeling session. A total of four modeling sessions are performed, and the average of the four modeling sessions is used as the final result.
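As a concrete illustration, the 80/20 split and fourfold cross-validation described above can be sketched with scikit-learn. The data below are random stand-ins for the 102 field records (which are not reproduced here), and the hyperparameter values are placeholders rather than the tuned ones:

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import cross_val_score, train_test_split

# Random stand-ins for the 102 blasting records (6 inputs, PPV output)
rng = np.random.default_rng(42)
X = rng.random((102, 6))
y = rng.random(102) * 11.0  # PPV roughly in the reported 0.13-11.05 mm/s range

# 80/20 train/test split, as in the paper
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42)

# Fourfold cross-validation on the training data during hyperparameter tuning
rf = RandomForestRegressor(n_estimators=100, max_depth=5, random_state=42)
scores = cross_val_score(rf, X_train, y_train, cv=4, scoring="r2")
print("mean 4-fold R2:", scores.mean())

# Final fit and hold-out evaluation
rf.fit(X_train, y_train)
print("test R2:", rf.score(X_test, y_test))
```

With random targets the scores are of course meaningless; the point is the data-handling pattern, not the numbers.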

#### *3.2. GWO*

The gray wolf optimization (GWO) algorithm is a swarm intelligence optimization algorithm proposed by Mirjalili et al. in 2014, which is based on the simulation of the hierarchical mechanism and predatory behavior of the gray wolf population in nature. Moreover, the optimization of the GWO algorithm is achieved through the process of wolf stalking, encircling, chasing, and attacking prey [56]. Gray wolves are top carnivores, and their lifestyles are mostly gregarious, thus constituting a hierarchical pyramid in the gray wolf population, with a strict hierarchical management system, as shown in Figure 2.

The first level of the pyramid is the head of the population, called *α*, which is mainly responsible for all the decision-making matters of the population. The second pyramid level is called *β*, which assists *α* in making management decisions. The third level of the pyramid is *δ*, which is mainly responsible for scouting, sentry, hunting, and guarding. Furthermore, the bottom level of the pyramid is called *ω*, which is mainly responsible for coordinating the relationship within the population. The hierarchy of gray wolves plays a crucial role in achieving prey hunting. The predation process is led by *α*. At first, the wolves search, track, and approach the prey in a team pattern; then, the wolves encircle the prey from all directions, and when the encirclement is small enough and perfect, the wolves will attack from *β* and *δ*, which are closest to the prey, under the command of *α*. The mathematical models of this process are explained below.

First, the encirclement of the prey by the gray wolves during predation can be characterized by the following equation:

$$D = \left| C \times X\_p(t) - X(t) \right| \tag{3}$$

where *Xp*(*t*) and *X*(*t*) denote the position of the prey and the position of the wolves during the *t*-th iteration, respectively, and *C* is the coefficient, which is computed by the following equation:

$$C = 2r\_1 \tag{4}$$

where *r*<sub>1</sub> is a random value in the interval [0, 1].

Then, the equation for updating the position of the gray wolf in search space is as follows:

$$X(t+1) = X\_p(t) - A \times D \tag{5}$$

where *A* is the convergence factor. *A* can be computed by the following equation:

$$A = 2a \times r\_2 - a \tag{6}$$

$$a = \left(2 - 2 \times \left(\frac{t}{t\_{\text{max}}}\right)\right) \tag{7}$$

where *r*<sub>2</sub> is a random value in the interval [0, 1], and *t*<sub>max</sub> denotes the maximum number of iterations.

After that, when the gray wolf determines the position of the prey, the wolf *α* will lead *β* and *δ* to initiate the pursuit behavior. In the wolf pack, *α*, *β*, and *δ* are the closest to the prey. The positions of these three wolves can be used to determine the location of the prey. The mathematical description is as follows:

$$D\_{\alpha} = \left| C\_1 \times X\_{\alpha}(t) - X(t) \right| \tag{8}$$

$$D\_{\beta} = \left| C\_2 \times X\_{\beta}(t) - X(t) \right| \tag{9}$$

$$D\_{\delta} = \left| C\_3 \times X\_{\delta}(t) - X(t) \right| \tag{10}$$

$$X\_1 = X\_{\alpha} - A\_1 \times D\_{\alpha} \tag{11}$$

$$X\_2 = X\_{\beta} - A\_2 \times D\_{\beta} \tag{12}$$

$$X\_3 = X\_{\delta} - A\_3 \times D\_{\delta} \tag{13}$$

$$X(t+1) = \frac{X\_1 + X\_2 + X\_3}{3} \tag{14}$$

Equations (8)–(13) give the distances between the prey and wolves *α*, *β*, and *δ*, and Equation (14) then determines the direction of movement of each individual gray wolf toward the prey.
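One GWO iteration per Equations (3)–(14) can be sketched in NumPy as follows. The function, variable names, and the toy population are ours, not from the paper, and the coefficients *A* and *C* are drawn per element here, which is one common vectorized reading of the update:

```python
import numpy as np

def gwo_step(wolves, alpha, beta, delta, t, t_max, rng):
    """One GWO position update for an (N, d) wolf population, per Eqs. (3)-(14).
    alpha, beta, delta are the three best positions found so far."""
    a = 2 - 2 * t / t_max                            # Eq. (7): a decays linearly from 2 to 0

    def move_toward(leader):
        A = 2 * a * rng.random(wolves.shape) - a     # Eq. (6)
        C = 2 * rng.random(wolves.shape)             # Eq. (4)
        D = np.abs(C * leader - wolves)              # Eqs. (8)-(10)
        return leader - A * D                        # Eqs. (11)-(13)

    # Eq. (14): average of the moves toward the three leading wolves
    return (move_toward(alpha) + move_toward(beta) + move_toward(delta)) / 3

# Toy usage on a 5-wolf, 2-dimensional search space
rng = np.random.default_rng(0)
wolves = rng.random((5, 2))
alpha, beta, delta = wolves[0], wolves[1], wolves[2]
wolves = gwo_step(wolves, alpha, beta, delta, t=1, t_max=1000, rng=rng)
```

In a full optimizer this step would be wrapped in a loop that re-evaluates the fitness of every wolf and re-selects *α*, *β*, and *δ* after each iteration.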

#### *3.3. WOA*

The whale optimization algorithm (WOA) is a novel swarm intelligence optimization algorithm proposed by Mirjalili and Lewis in 2016 [57]. The WOA is based on the simulation of the hunting behavior of humpback whales in nature, and it optimizes the searching process by mimicking the behavior of searching, encircling, pursuing, and attacking the prey by the humpback whales. Whales are considered the largest mammals globally, with adults growing up to 30 m long and weighing 180 tons. Whales have a unique feeding behavior, which is the bubble-net method, as shown in Figure 3. The method is divided into two stages: upward spiral and double circulation. Based on this special foraging behavior, the WOA algorithm is obtained.

**Figure 3.** Bubble-net attacking method of whales.

In the WOA, it is assumed that the population size is *N* and the dimension of the search space is *d*, so the position of the *i*-th whale can be expressed as *X<sub>i</sub>* = (*X<sub>i</sub>*<sup>1</sup>, *X<sub>i</sub>*<sup>2</sup>, ..., *X<sub>i</sub>*<sup>d</sup>). The position of the prey in the search space corresponds to the optimal global solution. Since there is no a priori knowledge of the position of the optimal global solution before solving the optimization problem, the WOA assumes that the whale at the optimal position in the current population is the prey, and all other whale individuals in the population encircle it. The mathematical model of the process is as follows:

$$X(t+1) = X\_p(t) - A \times \left| C \times X\_p(t) - X(t) \right| \tag{15}$$

where *t* is the current iteration, *X*(*t*) denotes the position of whales, *Xp*(*t*) denotes the position of prey, and *A* and *C* are the coefficients that can be defined as follows:

$$A = 2a \times rand\_1 - a \tag{16}$$

$$C = 2 \times rand\_2\tag{17}$$

Here *rand*<sub>1</sub> and *rand*<sub>2</sub> represent random values in the interval [0, 1], and *a* is the convergence factor that decreases linearly from 2 to 0 as the number of iterations increases.

To describe the bubble-net attacking behavior of whales with a mathematical model, two different methods were designed in the WOA, namely, the shrinking encircling mechanism and the spiral updating position. Among them, the shrinking encircling mechanism is implemented by the linear reduction in the convergence factor *a* in Equation (16). In the method of spiral updating position, the spiral motion of the whale is simulated to capture the prey, and its mathematical model is shown as follows:

$$X(t+1) = D' \times e^{bl} \times \cos(2\pi l) + X\_p(t) \tag{18}$$

where *D*′ = |*X<sub>p</sub>*(*t*) − *X*(*t*)| denotes the distance between the prey and the *i*-th whale, *b* signifies a constant controlling the shape of the logarithmic spiral, and *l* is a random value in the interval [−1, 1]. Note that the whales move within a shrinking circle and simultaneously follow a spiral path towards their prey.

In addition to the bubble-net attacking method, whales will also randomly search for prey. Individual whales search randomly according to each other's position, and the mathematical model can be expressed as follows:

$$X(t+1) = X\_{rand}(t) - A \times \left| C \times X\_{rand}(t) - X(t) \right| \tag{19}$$

where *Xrand* represents a position vector of an individual search agent randomly chosen from the current population.
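The three WOA behaviors (encircling, spiral attack, random search) can be condensed into one update step. This is a minimal sketch with names of our choosing; per the original WOA, the spiral branch is taken with probability 0.5, and within the other branch the whale encircles the best solution when |*A*| < 1 or searches around a random whale otherwise:

```python
import numpy as np

def woa_step(whales, best, t, t_max, b=1.0, rng=None):
    """One WOA position update for an (N, d) population, per Eqs. (15)-(19)."""
    rng = rng if rng is not None else np.random.default_rng()
    a = 2 - 2 * t / t_max                      # convergence factor, decays from 2 to 0
    new = np.empty_like(whales)
    for i, X in enumerate(whales):
        A = 2 * a * rng.random() - a           # Eq. (16)
        C = 2 * rng.random()                   # Eq. (17)
        if rng.random() < 0.5:
            # Encircle the best whale when |A| < 1 (Eq. (15)); otherwise
            # search around a randomly chosen whale (Eq. (19)).
            target = best if abs(A) < 1 else whales[rng.integers(len(whales))]
            new[i] = target - A * np.abs(C * target - X)
        else:                                  # spiral updating position, Eq. (18)
            l = rng.uniform(-1, 1)
            D = np.abs(best - X)               # D' = |X_p(t) - X(t)|
            new[i] = D * np.exp(b * l) * np.cos(2 * np.pi * l) + best
    return new

# Toy usage on a 6-whale, 3-dimensional search space
rng = np.random.default_rng(1)
whales = rng.random((6, 3))
whales = woa_step(whales, best=whales[0].copy(), t=1, t_max=1000, rng=rng)
```

As with the GWO sketch, a full run would loop this step, re-evaluating fitness and updating `best` each iteration.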

#### *3.4. TSA*

Tunicates are small, net-like organisms found throughout the sea. They live solitary or parasitic lives and can locate food in the ocean. Based on the biological inspiration of the predatory behavior of tunicates, Kaur et al. proposed a swarm intelligence optimization algorithm termed the tunicate swarm algorithm (TSA), which is inspired by the jet-propelled migration mechanism and the intelligent foraging behavior of tunicates in the ocean [58]. The mathematical models of this algorithm are explained below.

#### (1) Initialization

Similar to most metaheuristic algorithms, the TSA starts the optimization process by initializing the tunicate population, i.e., the position of every tunicate ($\vec{A}\_0$) in the search space, as shown in the following equation:

$$\vec{A}\_0 = \vec{A}\_{min} + rand \times \left( \vec{A}\_{max} - \vec{A}\_{min} \right) \tag{20}$$

where $\vec{A}\_{min}$ and $\vec{A}\_{max}$ denote the lower and upper limits of the search space, respectively, and *rand* signifies a random value in the interval [0, 1].

#### (2) Avoid conflicts between search agents

To avoid conflicts between individuals when implementing the search task, the TSA utilizes a vector $\vec{A}$ to calculate the position of new search agents, as illustrated by the following equations:

$$\vec{A} = \frac{\vec{G}}{\vec{M}} \tag{21}$$

$$\vec{G} = c\_1 + c\_2 - \vec{F} \tag{22}$$

$$\vec{M} = \left\lfloor P\_{min} + c\_3 \times \left( P\_{max} - P\_{min} \right) \right\rfloor \tag{23}$$

where $\vec{A}$ denotes the new positions of the search agents, $\vec{G}$ denotes the gravity, $\vec{M}$ denotes the interaction forces between the tunicates, $\vec{F}$ signifies the current advection in the deep sea with $\vec{F} = 2c\_3$, and $c\_1$, $c\_2$, and $c\_3$ are random values in the interval [0, 1].

#### (3) Move to the best neighbor

After avoiding conflicts between individuals, the search agents move towards the position of the best neighbor, which can be described by the following equation:

$$\vec{P}\_d = \left| \vec{F}\_s - r \times \vec{P}\_p(t) \right| \tag{24}$$

where $\vec{P}\_d$ denotes the distance between the food and the search agents, $\vec{F}\_s$ represents the position of the food, $\vec{P}\_p(t)$ signifies the position of the search agents during the *t*-th iteration, and *r* is a random value in the interval [0, 1].

#### (4) Move towards the best individual

The mathematical description of the movement of the tunicate population towards the position of the optimal search agent is as follows:

$$\vec{P}\_p(t^\*) = \begin{cases} \vec{F}\_s + \vec{A} \times \vec{P}\_d, & r \ge 0.5 \\ \vec{F}\_s - \vec{A} \times \vec{P}\_d, & r < 0.5 \end{cases} \tag{25}$$

where $\vec{P}\_p(t^\*)$ is the position of the search agent closest to the target food.

#### (5) Swarm behavior

The TSA mimics the tunicates' swarm behavior by saving the first two optimal solutions and updating the positions of the other tunicates according to them. The swarm behavior of the tunicates is mathematically described as follows:

$$\vec{P}\_p(t+1) = \frac{\vec{P}\_p(t) + \vec{P}\_p(t^\*)}{2 + c\_1} \tag{26}$$

where $\vec{P}\_p(t+1)$ denotes the position of the search agent at the (*t* + 1)-th iteration.
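The TSA steps (2)–(5) above can be sketched as a single position update. The function name and the default bounds *P*min = 1 and *P*max = 4 are assumptions on our part (the latter being the values commonly used in TSA implementations):

```python
import numpy as np

def tsa_step(pop, food, prev, p_min=1.0, p_max=4.0, rng=None):
    """One TSA position update per Eqs. (21)-(26). `food` is the best position
    found so far; `prev` holds the agents' positions from the previous iteration."""
    rng = rng if rng is not None else np.random.default_rng()
    new = np.empty_like(pop)
    for i, P in enumerate(pop):
        c1, c2, c3 = rng.random(3)
        F = 2 * c3                                        # deep-sea advection
        G = c1 + c2 - F                                   # Eq. (22)
        M = np.floor(p_min + c3 * (p_max - p_min))        # Eq. (23), M >= 1 here
        A = G / M                                         # Eq. (21)
        r = rng.random()
        Pd = np.abs(food - r * P)                         # Eq. (24)
        cand = food + A * Pd if r >= 0.5 else food - A * Pd  # Eq. (25)
        new[i] = (prev[i] + cand) / (2 + c1)              # Eq. (26), swarm behavior
    return new

# Toy usage on a 5-agent, 2-dimensional search space
rng = np.random.default_rng(2)
pop = rng.random((5, 2))
new_pop = tsa_step(pop, food=pop[0].copy(), prev=pop, rng=rng)
```

Again, a complete optimizer wraps this step in an iteration loop that re-evaluates fitness and updates `food` after each update.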

#### *3.5. Improved RF Models*

Although the RF model performs well and has been applied in many areas, there is still room for improvement. In the random forest approach, in addition to sampling with replacement to construct the training datasets, a random subset of features is picked each time to lower the degree of correlation between the individual decision trees. The final output is the simple average of the values computed by the decision trees of the forest [55]. Although the RF shows remarkable performance in some regression or classification tasks, the way the base learners (i.e., decision trees) are combined can be improved to achieve better predictions. Decision trees exhibit different predictive capabilities because of the bootstrap replicates and random feature picking. However, in the regular RF, each decision tree is assigned the same weight, which seems unreasonable and can be improved. Therefore, techniques such as weighting the decision trees based on their predictive capabilities are proposed to optimize the regular RF. The accuracy of the decision trees is considered an indicator that can amend their weights instead of using the same weight for each tree [59]. In this paper, two indicators, i.e., the coefficient of determination (R2) and the root mean squared error (RMSE) of the decision trees, are used as benchmarks for refining the weights. In the first category, decision trees with higher R2 are assigned relatively higher weights, and vice versa. In the second category, decision trees with lower RMSE are assigned relatively higher weights, and vice versa. Furthermore, we also put forward using three metaheuristic algorithms (i.e., GWO, WOA, and TSA) to search for the best weights of the decision trees and then compare the effect of their improvements on the regular RF model.

To sum up, there are a total of five methods that are leveraged to refine the weights of decision trees in a forest. Before applying these methods, the first task that needs to be carried out is the parameter tuning of the RF model, which is to establish the optimal RF model based on the training dataset. Subsequently, the task is to use the five categories mentioned above to amend the weights of decision trees. This part will be elaborated hereinafter. Moreover, the schematic diagram of the design of the RF and improved RF models is depicted in Figure 4.

#### 3.5.1. Improved Weights Based on the Accuracy

First, let {*Tree*1(*x*), *Tree*2(*x*), *Tree*3(*x*), ··· , *Treet*(*x*)} represent the set of decision trees. For the regular RF model, the final predicted values can be obtained through Equation (27), which indicates that the weight of every single decision tree is equal to 1/*t*.

$$y\_{pre} = \sum\_{i=1}^{t} \frac{1}{t} \times Tree\_i(\mathbf{x}) \tag{27}$$

where *t* is the number of decision trees in a forest.

For the improved weights based on R2, the weight of every single decision tree is obtained through Equation (28), and the final predicted values can be obtained using Equation (29). The key step of this approach is to compute the weights of the decision trees. After constructing the optimal RF model based on the training sets, the accuracy (i.e., R2) of each decision tree can be obtained. Then, according to Equation (28), all decision trees are assigned a normalized weight, and the larger the R2 term, the larger the weight assigned, which means that decision trees with large weights play a greater role in the final prediction. Equations (28) and (29) are:

$$\text{weight\\_R}^2\_i = \frac{R\_i^2}{\sum\_{j=1}^{t} R\_j^2} \tag{28}$$

$$y\_{pre} = \sum\_{i=1}^{t} \text{weight\\_R}^2\_i \times Tree\_i(\mathbf{x}) \tag{29}$$

Similarly, for the improved weights based on the RMSE, the weight of every single decision tree is obtained through Equation (30), and the final predicted values can be obtained using Equation (31). The weights of the decision trees are obtained by first taking the inverse of the RMSE and then performing a normalized calculation, which means that the larger the RMSE of a decision tree, the smaller the weight it is assigned, and vice versa. Equations (30) and (31) are:

$$\text{weight\\_RMSE}\_i = \frac{\frac{1}{\text{RMSE}\_i}}{\sum\_{j=1}^{t} \frac{1}{\text{RMSE}\_j}} \tag{30}$$

$$y\_{pre} = \sum\_{i=1}^{t} \text{weight\\_RMSE}\_i \times Tree\_i(\mathbf{x}) \tag{31}$$
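Both accuracy-based weighting schemes reduce to scoring each tree and normalizing. A minimal scikit-learn sketch follows; the function names and synthetic data are ours, and clipping non-positive R2 values to a tiny constant is our assumption (the paper does not state how negative R2 is handled):

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import mean_squared_error, r2_score

def refined_weights(rf, X_val, y_val, mode="r2"):
    """Per-tree weights from each tree's accuracy (Eqs. (28) and (30)).
    `rf` must be a fitted RandomForestRegressor."""
    preds = np.array([tree.predict(X_val) for tree in rf.estimators_])
    if mode == "r2":
        # Clip non-positive R2 so all weights stay positive (our assumption)
        scores = np.array([max(r2_score(y_val, p), 1e-6) for p in preds])
    else:
        # Inverse-RMSE weighting: larger RMSE -> smaller weight
        scores = np.array([1.0 / np.sqrt(mean_squared_error(y_val, p)) for p in preds])
    return scores / scores.sum()          # normalize so the weights sum to 1

def weighted_predict(rf, weights, X):
    """Weighted average of the trees' outputs (Eqs. (29) and (31))."""
    preds = np.array([tree.predict(X) for tree in rf.estimators_])
    return weights @ preds

# Toy usage with synthetic data (the field dataset is not reproduced here)
rng = np.random.default_rng(0)
X, y = rng.random((80, 6)), rng.random(80)
rf = RandomForestRegressor(n_estimators=20, random_state=0).fit(X, y)
w = refined_weights(rf, X, y, mode="r2")
y_hat = weighted_predict(rf, w, X)
```

For simplicity the sketch scores the trees on the same data they were fitted on; scoring on a held-out split, as the training/validation design in Section 3.1 implies, is the more faithful reading.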

**Figure 4.** Flowchart of the design of the RF and improved RF models.

3.5.2. Improved Weights Based on Three Metaheuristic Algorithms

In this section, we consider the weights of decision trees in a forest as an unknown space of high dimensionality to be solved. After determining the number of trees in a forest, the dimension of the space to be solved is equal to the number of trees. Then, the GWO, WOA, and TSA algorithms are used to capture the optimal weights in the unknown space. To achieve this goal, it is assumed that the independent variables to be solved are set to {*x*1, *x*2, *x*3, ··· , *xt*}, where *t* is the number of trees and the range of *x* is in the interval [0, 1], and then the standardized weights are computed as follows:

$$\text{weight\\_optimization}\_{i} = \frac{x\_{i}}{\sum\_{i=1}^{t} x\_{i}} \tag{32}$$

where weight\_optimization*i* represents the weight of decision tree *i* optimized by the metaheuristic algorithms; clearly, $\sum\_{i=1}^{t} \text{weight\\_optimization}\_i = 1$.

After that, the RMSE is considered the criterion used to evaluate each candidate solution's performance. In other words, the fitness function to be minimized by the metaheuristic algorithms is based on the RMSE. Moreover, L2 regularization (also called a penalty term) is incorporated into the RMSE function to avoid overfitting. Therefore, the fitness function consists of two parts, the RMSE and the L2 regularization term, as presented in Equation (34):

$$y\_{pre} = \sum\_{i=1}^{t} \text{weight\\_optimization}\_i \times Tree\_i(\mathbf{x}) \tag{33}$$

$$\text{Fitness function} = \sqrt{\frac{1}{m} \sum\_{i=1}^{m} \left( y\_i - y\_{pre,i} \right)^2} + \gamma \times \sqrt{\sum\_{i=1}^{t} \text{weight\\_optimization}\_i^2} \tag{34}$$

where *ypre* denotes the predicted PPV computed by the improved RF model based on optimization algorithms, *m* is the number of training samples, and *γ* is the coefficient of the L2 regularization, whose value is set to 0.08 in this paper according to the trial-and-error method.

For the GWO, WOA, and TSA, the parameters that need to be set are the swarm size and the number of iterations. An appropriate selection of these parameters can lead to optimal results effectively and quickly. Therefore, after constructing the model several times, swarm sizes of 50, 100, 150, and 200 were tested for each optimization algorithm, and the number of iterations was set to 1000.

The GWO, WOA, and TSA optimization techniques can each be applied in this manner to refine the weights of the RF model.
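The optimization loop described above can be sketched as follows. The fitness function implements Equations (32)–(34) with γ = 0.08 as in the paper; for brevity, a plain random search stands in for the GWO/WOA/TSA swarm (any of the three algorithms would minimize the same function), and the per-tree predictions are placeholders:

```python
import numpy as np

def fitness(x, tree_preds, y_true, gamma=0.08):
    """Eq. (34): RMSE of the weighted ensemble plus an L2 penalty on the weights.
    `tree_preds` is a (t, m) array of per-tree predictions on the training set."""
    w = x / x.sum()                                  # Eq. (32): normalized weights
    y_pre = w @ tree_preds                           # Eq. (33)
    rmse = np.sqrt(np.mean((y_true - y_pre) ** 2))
    return rmse + gamma * np.sqrt(np.sum(w ** 2))    # gamma = 0.08 as in the paper

# Random-search stand-in for the GWO/WOA/TSA loop: sample candidate weight
# vectors in [0, 1]^t and keep the best. The study itself runs a swarm of
# 50-200 agents for 1000 iterations instead.
rng = np.random.default_rng(0)
t, m = 10, 50
tree_preds = rng.random((t, m))                      # placeholder tree outputs
y_true = rng.random(m)
best_x, best_f = None, np.inf
for _ in range(2000):
    x = rng.random(t) + 1e-9                         # avoid an all-zero vector
    f = fitness(x, tree_preds, y_true)
    if f < best_f:
        best_x, best_f = x, f
print("best fitness:", best_f)
```

Swapping the random search for one of the swarm algorithms only changes how candidate vectors `x` are proposed; the fitness function and the weight normalization stay the same.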


#### *3.6. Criteria for Evaluation*

An important task after modeling is to evaluate each model's accuracy and generalization ability. As mentioned previously, 80% of the data samples were assigned as training data used to train the RF and improved RF models, while the remaining 20% were assigned as testing data to verify their performance. To evaluate the performance of the built models, four metrics were used in this work, i.e., the coefficient of determination (R2), the root mean squared error (RMSE), the mean absolute error (MAE), and the variance accounted for (VAF). These evaluation indicators are defined below.

The square of the correlation between the predicted and measured values is represented by R2. The RMSE characterizes the standard deviation of the fitting error between the predicted values and the measured values. The MAE indicates the mean absolute error between the predicted values and measured values. The VAF describes the prediction performance by comparing the standard deviation of the fitting error with the standard deviation of the actual value. The following equations were utilized to calculate the R2, RMSE, MAE, and VAF values:

$$R^2 = 1 - \frac{\sum\_{i=1}^n (y\_i - \hat{y}\_i)^2}{\sum\_{i=1}^n (y\_i - \overline{y})^2} \tag{35}$$

$$\text{RMSE} = \sqrt{\frac{1}{n} \sum\_{i=1}^{n} (y\_i - \hat{y}\_i)^2} \tag{36}$$

$$\text{MAE} = \frac{1}{n} \sum\_{i=1}^{n} |y\_i - \hat{y}\_i| \tag{37}$$

$$\text{VAF} = \left(1 - \frac{var(y\_i - \hat{y}\_i)}{var(y\_i)}\right) \times 100\tag{38}$$

where $y\_i$, $\hat{y}\_i$, and $\overline{y}$ denote the measured, predicted, and mean measured values of the PPV, respectively. When the predicted and measured PPV values are exactly the same, R2 is 1, the RMSE is 0, the MAE is 0, and the VAF is 100 (%).
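A compact implementation of Equations (35)–(38), assuming NumPy arrays of measured and predicted PPV values, might look as follows:

```python
import numpy as np

def evaluate(y_true, y_pred):
    """Compute R2, RMSE, MAE, and VAF per Equations (35)-(38)."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    residual = y_true - y_pred
    r2 = 1 - np.sum(residual**2) / np.sum((y_true - y_true.mean())**2)
    rmse = np.sqrt(np.mean(residual**2))
    mae = np.mean(np.abs(residual))
    vaf = (1 - np.var(residual) / np.var(y_true)) * 100
    return r2, rmse, mae, vaf

# A perfect prediction scores R2 = 1, RMSE = 0, MAE = 0, VAF = 100.
print(evaluate([1.0, 2.0, 3.0], [1.0, 2.0, 3.0]))
```

Note that R2 and the VAF agree at the extremes but differ in general: the VAF uses the variance of the residuals (so a constant bias does not penalize it), while R2 uses the raw sum of squared errors.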

#### **4. Results and Discussion**

#### *4.1. Parameter Tuning of the RF Model*

The main purpose of this section is to determine the optimal hyperparameters of the RF model, including the number of decision trees and the maximum depth of decision trees. In addition, before conducting the model training, the original dataset was standardized to eliminate the negative effect of magnitude between the data and speed up the model's training. The formula for standardization of data is given as follows:

$$X\_i^\* = \frac{X\_i - \mu\_i}{\sigma\_i}, \quad i = 1, 2, \ldots, 6 \tag{39}$$

where $X\_i$ denotes the data samples belonging to feature *i*, $\mu\_i$ and $\sigma\_i$ denote the mean and standard deviation of $X\_i$, respectively, and $X\_i^\*$ signifies the standardized dataset prepared for constructing the RF model.
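Equation (39) can be applied column-wise in a few lines. The feature matrix below is purely illustrative (the real study uses the 102 measured blast records for BS, HD, ST, PF, MC, and DI):

```python
import numpy as np

# Hypothetical blast-design matrix: rows = samples, columns = the six
# input features (BS, HD, ST, PF, MC, DI); the values are made up.
X = np.array([[1.2, 10.0, 2.5, 0.45, 120.0, 300.0],
              [1.5, 12.0, 2.8, 0.60, 150.0, 280.0],
              [1.1,  9.5, 2.4, 0.38, 110.0, 350.0]])

mu = X.mean(axis=0)       # per-feature mean, Equation (39)
sigma = X.std(axis=0)     # per-feature standard deviation
X_std = (X - mu) / sigma  # standardized features

# Each standardized column now has zero mean and unit standard deviation.
print(X_std.mean(axis=0).round(12))
print(X_std.std(axis=0).round(12))
```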

The parameter tuning of the RF consists of two steps. First, the scale of the RF is optimized to determine the number of decision trees in the forest. In this regard, the number of trees is increased from 1 to 200 in increments of 1, while the other parameters are kept at their default values. Figure 5 shows the performance of the RF model with respect to the number of trees. Intuitively, the mean squared error decreases as the number of trees increases from 1 to 50, after which it only shows small fluctuations. The minimum mean squared error is reached when the number of trees is 51. Consequently, the scale of the RF model is set to 51 decision trees. Furthermore, validating this model on the testing set yields R2, RMSE, MAE, and VAF values of 0.915, 0.275, 0.224, and 93.504, respectively.

**Figure 5.** MSE with respect to the number of trees in a forest.
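The tree-count sweep behind Figure 5 can be sketched with scikit-learn as below. The data here are synthetic stand-ins for the 102 blast records, the sweep is truncated to 60 trees to keep the example fast (the study sweeps 1 to 200), and scoring the MSE on a held-out split is our assumption, since the paper does not state which split its MSE curve uses:

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
X = rng.random((102, 6))                            # stand-in for the blast records
y = 3.0 + 4.0 * X[:, 3] + rng.normal(0, 0.2, 102)   # synthetic PPV-like target

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2, random_state=0)

# Sweep the forest size and record the MSE for each candidate.
mse_per_size = {}
for n_trees in range(1, 61):
    rf = RandomForestRegressor(n_estimators=n_trees, random_state=0)
    rf.fit(X_tr, y_tr)
    mse_per_size[n_trees] = mean_squared_error(y_te, rf.predict(X_te))

best = min(mse_per_size, key=mse_per_size.get)
print(f"lowest MSE at {best} trees: {mse_per_size[best]:.4f}")
```

The second tuning step (the maximum-depth sweep of Figure 6) follows the same pattern with `max_depth` varied from 1 to 10 while `n_estimators` stays fixed at the chosen value.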

After that, the optimal maximum depth of the decision trees was determined. The maximum depth is increased from 1 to 10 in increments of 1, with the number of decision trees fixed at 51. Figure 6 depicts the performance of the RF model with respect to the maximum depth. The minimum mean squared error is reached when the maximum depth is set to 7; for values larger than 7, there is no apparent decrease in the mean squared error. At the same time, the performance of the current RF model on the testing set is an R<sup>2</sup> of 0.923, an RMSE of 0.262, an MAE of 0.209, and a VAF of 93.994. Thus, according to these results, the maximum depth is set to 7. The RF model has now been constructed, and the subsequent work applies five techniques to optimize the weights of the decision trees, aiming to improve its performance.

**Figure 6.** MSE with respect to the maximum depth of trees in a forest.

#### *4.2. Improved RF-Based Models*

In this section, five methods were used to improve the regular RF model: refined weights of the decision trees based on their predictive capabilities as characterized by R2 and the RMSE, and refined weights based on the optimization solutions of three metaheuristic algorithms (i.e., GWO, WOA, and TSA). The main goal of the five techniques is to obtain the optimal weight set of the decision trees that minimizes the generalization error of the improved RF models. Moreover, to validate the models' performance, each model's accuracy and error on both the training set and the testing set are utilized. In this way, the optimal improved RF model can be effectively obtained.

Figure 7 presents the calculation results of the three metaheuristic algorithms on the training set. Some differences in the results arise from the swarm sizes. For the RF-WOA model, the final fitness values rank as 200 swarm size < 150 < 100 < 50; for the RF-GWO model, as 100 < 200 < 150 < 50; and for the RF-TSA model, as 150 < 200 < 50 < 100. It should be noted that these results are obtained on the training set alone; to identify the performance of each model more clearly, they should be combined with its performance on the testing set. For this purpose, a scoring evaluation method was employed to select the best model [60]. The principle of this method is that the higher the score, the better the performance, and vice versa. Consequently, the final scoring of the RF-WOA, RF-GWO, and RF-TSA models with different swarm sizes on both the training and testing sets is presented in Tables 3–5. According to these results, the best RF-WOA model, with a total score of 26, is obtained with a swarm size of 100; the best RF-GWO model, with a total score of 28, with a swarm size of 200; and the best RF-TSA model, with a total score of 27, with a swarm size of 50.

**Figure 7.** Various RF-based metaheuristic models based on different swarm sizes.
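The scoring evaluation can be sketched with a common rank-based scheme consistent with the description above: for each metric, the models are ranked from worst to best, the best model receives a score equal to the number of models, and the per-metric scores are summed (the exact scheme in [60] may differ in detail). The two demo entries reuse the RF and RF-WOA testing-set values reported in this paper; the helper name `score_models` is our own.

```python
def score_models(metrics):
    """Rank-based scoring: per metric, worst model scores 1 and best
    scores len(models); totals are the sums over all metrics."""
    higher_better = {"R2": True, "VAF": True, "RMSE": False, "MAE": False}
    names = list(metrics)
    totals = {name: 0 for name in names}
    for m, up in higher_better.items():
        # Sort worst -> best, so list position + 1 is the score.
        ranked = sorted(names, key=lambda n: metrics[n][m], reverse=not up)
        for pos, name in enumerate(ranked):
            totals[name] += pos + 1
    return totals

demo = {
    "RF":     {"R2": 0.923, "RMSE": 0.262, "MAE": 0.209, "VAF": 93.994},
    "RF-WOA": {"R2": 0.932, "RMSE": 0.246, "MAE": 0.188, "VAF": 95.032},
}
print(score_models(demo))  # → {'RF': 4, 'RF-WOA': 8}
```

In the paper, the scores are accumulated over both the training-set and testing-set metrics, which is how totals such as 26, 27, and 28 arise.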


**Tables 3–5.** Scoring of the RF-WOA, RF-GWO, and RF-TSA models with different swarm sizes on predicting the PPV.

**Table 6.** Performance of the RF and improved RF models on predicting the PPV.

After obtaining the best model for each metaheuristic algorithm in its respective comparison, the next step is a further comparative analysis of these models and the two previously mentioned accuracy-based weighting methods (i.e., R2 and the RMSE) to identify the optimal improved RF model. For the weights refined based on the R<sup>2</sup> and RMSE of every single decision tree, the weights are computed through Equations (28) and (30), respectively. For the metaheuristic algorithms, the RF-WOA with a swarm size of 100, the RF-GWO with a swarm size of 200, and the RF-TSA with a swarm size of 50 are chosen because of their best performance. At the same time, the regular RF model is also included in the comparison. The performance results of these six models are tabulated in Table 6. As can be seen from Table 6, the model with the highest score is the RF-WOA, with a score of 46, followed by the RF-GWO and RF-TSA, both with scores of 41. As for the RF-RMSE and RF-R2, their scores were 30 and 22, respectively, which indicates the performance of the RF-RMSE is slightly better than that of the RF-R2. In general, these five models all outperform the regular RF model, which scores 14. The overall ranking of the six models is thus: RF-WOA > RF-GWO = RF-TSA > RF-RMSE > RF-R2 > RF.

To intuitively observe the differences among these models, we divide the four evaluation metrics into two groups: one for the VAF and R2 × 100, where larger values indicate better model performance, and the other for the RMSE and MAE, where smaller values indicate better models. As depicted in Figure 8, for the group of the VAF and R2 × 100 (the left panel in Figure 8), the accuracies of the RF-GWO, RF-WOA, and RF-TSA are quite close and all better than those of the other three models (i.e., RF-RMSE, RF-R2, and RF), whereas for the group of the RMSE and MAE (the right panel in Figure 8), the error of the RF-GWO is slightly lower than those of the RF-WOA and RF-TSA. Likewise, the errors of the three metaheuristic-based RFs are all lower than those of the RF-RMSE, RF-R2, and RF. Hence, all of the refined-weight RF models perform well on the training set compared with the regular RF model.

As for the testing set, the results are shown in Figure 9. For the group of the VAF and R2 × 100 (the left panel in Figure 9), the accuracy of the six models ranks as RF-WOA > RF-TSA > RF-GWO > RF-RMSE > RF-R2 > RF. The same pattern holds for the group of the RMSE and MAE (the right panel in Figure 9), where the error ranks as RF-WOA < RF-TSA < RF-GWO < RF-RMSE < RF-R2 < RF. Therefore, it can be inferred that the RF-WOA has a better generalization ability than the other models, as verified by its better performance on the testing set. Additionally, all of the refined-weight RF models perform well on the testing set compared with the regular RF model, which means the refinement techniques proposed in this paper can effectively enhance the performance of the regular RF model.

**Figure 8.** Results of the evaluation metrics of the developed models on the training set.

**Figure 9.** Results of the evaluation metrics of the developed models on the testing set.

As a complementary view, a Taylor diagram is presented in this work, which graphically summarizes how closely a set of patterns matches observations [61]. The similarity between the patterns and the observations is quantified by their correlation coefficients, centered root-mean-square errors, and standard deviations, as shown in Equation (40) [62]:

$$E^{'2} = \sigma\_p^2 + \sigma\_a^2 - 2 \times \sigma\_p \times \sigma\_a \times R \tag{40}$$

where $E'$ is the centered root-mean-square error between the predicted and measured values, $\sigma\_p^2$ and $\sigma\_a^2$ are the variances of the predicted and measured values, respectively, and *R* is the correlation coefficient between the predicted and measured values.

Figures 10 and 11 depict how closely the six developed models match the training and testing sets, respectively. In the Taylor diagram, the standard deviation is shown by the distance between the circles representing the models and the origin point on the *x*-axis, and the ticks on the clockwise arc represent the correlation coefficient. The actual PPV values are shown by the star-shaped point 'REF' on the *x*-axis, and the distance between the other circles and the point 'REF' reflects the centered RMSE (i.e., the grey arcs).
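The three statistics behind each point in a Taylor diagram follow directly from Equation (40). The sketch below is a minimal version; the guard against floating-point round-off is our own addition:

```python
import numpy as np

def taylor_stats(measured, predicted):
    """Standard deviation of the predictions, correlation with the
    measurements, and centered RMSE per Equation (40) -- together,
    one point in a Taylor diagram."""
    a = np.asarray(measured, dtype=float)
    p = np.asarray(predicted, dtype=float)
    sigma_a, sigma_p = a.std(), p.std()
    r = np.corrcoef(a, p)[0, 1]
    e2 = sigma_p**2 + sigma_a**2 - 2 * sigma_p * sigma_a * r
    return sigma_p, r, np.sqrt(max(e2, 0.0))  # guard tiny round-off

a = np.array([3.1, 4.2, 5.0, 6.3, 2.8])
# A constant offset disappears after centering: E' ~ 0 and R ~ 1, so the
# point coincides with 'REF' even though the prediction bias is nonzero.
print(taylor_stats(a, a + 0.5))
```

This also illustrates a known limitation of the diagram: since the centered RMSE removes the mean, a systematically biased model can still plot on top of 'REF'.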

**Figure 10.** Taylor diagram of the training results for the six developed models.

**Figure 11.** Taylor diagram of the testing results for the six developed models.

In Figure 10, it can be seen that the standard deviations of all six models are less than the standard deviation of the measured values, and the standard deviations of the RF-WOA, RF-GWO, and RF-TSA models are closer to the standard deviation of the actual values than those of the RF, RF-R2, and RF-RMSE models. The same applies to the correlation coefficient: the correlation coefficients of the RF-WOA, RF-GWO, and RF-TSA models are closer to 1, though they differ little from each other. For the centered RMSE, the results indicate that the RF-WOA, RF-GWO, and RF-TSA models have smaller errors than the RF, RF-R2, and RF-RMSE models. From a holistic perspective, the results of the three metaheuristic algorithms on the training set are closest to the actual PPV values, while the RF-R<sup>2</sup> and RF-RMSE models perform slightly better on the training set than the RF model.

In Figure 11, it can be inferred that the standard deviations of the three metaheuristic-based models on the testing set are close to, and slightly greater than, the standard deviation of the actual PPV. With regard to the centered RMSE and correlation coefficient, the results of the RF-WOA model are better than those of the RF-TSA and RF-GWO models. Overall, based on the distance between these circles and the point 'REF', the ranking of model superiority on the testing set is RF-WOA > RF-TSA > RF-GWO > RF-RMSE > RF-R<sup>2</sup> > RF. Compared with the results on the training set, the results on the testing set reflect the performance differences among these models more clearly, especially for the metaheuristic algorithms.

To sum up, the RF-WOA is the optimal model recommended for refining the weights of the decision trees of the RF model in this study. Compared with the regular RF model, on the training set, the RF-WOA model increases the R2 value from 0.973 to 0.986 and the VAF from 97.298 to 98.603, while reducing the RMSE from 0.164 to 0.118 and the MAE from 0.123 to 0.093. On the testing set, the RF-WOA model increases the R<sup>2</sup> value from 0.923 to 0.932 and the VAF from 93.994 to 95.032, while reducing the RMSE from 0.262 to 0.246 and the MAE from 0.209 to 0.188. Finally, the measured and predicted PPV of the RF-WOA model on the training and testing sets are depicted in Figures 12 and 13, respectively.

**Figure 12.** Measured and predicted PPV of the RF-WOA model on the training set.

**Figure 13.** Measured and predicted PPV of the RF-WOA model on the testing set.

#### *4.3. Sensitivity Analysis of Predictor Variables*

Owing to the possible hazards of the PPV caused by blasting operations, it is indispensable to clarify the major factors affecting the PPV. Therefore, in this section, further analysis is conducted to identify the importance of the predictor variables used to predict the PPV. As previously stated, six variables, i.e., BS, HD, ST, PF, MC, and DI, were used to develop the RF models in this study. Based on the attributes of the RF model, the importance of each input variable can be obtained: the (normalized) total decrease in the splitting criterion brought by a feature, also known as the Gini importance [56,57], measures the relevance of the input variable to the target variable; the higher it is, the more significant the input variable. In this way, the significance of the input variables with respect to the PPV can be determined, as shown in Figure 14. The results reveal that the most significant predictor variable is the PF, followed by the BS, DI, and HD; the least significant parameters are ST and MC. Accordingly, we focus on the relationship between the PF and the PPV.

**Figure 14.** Significant correlation of the input variables with PPV.
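In scikit-learn, the Gini importance described above is exposed through the fitted forest's `feature_importances_` attribute. The sketch below uses a synthetic target deliberately dominated by the PF column to mirror the study's finding; the real analysis uses the measured dataset:

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(0)
features = ["BS", "HD", "ST", "PF", "MC", "DI"]
X = rng.random((102, 6))
# Synthetic PPV-like target driven mostly by column 3 ("PF").
y = 5.0 * X[:, 3] + 0.3 * X[:, 0] + rng.normal(0, 0.1, 102)

rf = RandomForestRegressor(n_estimators=51, max_depth=7,
                           random_state=0).fit(X, y)

# feature_importances_ holds the normalized mean decrease in impurity
# (Gini importance); the six values sum to 1.
for name, imp in sorted(zip(features, rf.feature_importances_),
                        key=lambda t: -t[1]):
    print(f"{name}: {imp:.3f}")
```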

To obtain a clearer picture of how the PF affects the PPV, partial dependence plots (PDPs) are utilized. A PDP illustrates the relationship between the target response and an input feature of interest while marginalizing over the values of the other input features; it visualizes the average influence of the input feature on the target response, revealing the connection between the feature and the target [63]. Figure 15 shows the partial dependence between the PPV and the PF. From an overall perspective, as the PF increases from 0.23 to 0.94, the PPV increases from 3.13 to 6.94. The influence of the PF on the PPV can be divided into three stages. In the first stage, as the PF increases from 0.23 to 0.46, there is no significant change in the PPV, which reaches its minimum when the PF equals 0.46. In the second stage, as the PF increases from 0.46 to 0.63, the PPV rises sharply from 3.01 to 6.27 in an exponential-like trend. In the third stage, as the PF increases from 0.63 to 0.94, the PPV increases from 6.27 to 6.94, but the amplification is significantly smaller than in the second stage. Accordingly, controlling the PPV by limiting the PF is a potentially effective measure.

**Figure 15.** The partial dependency between the PPV and PF.
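A partial dependence curve such as Figure 15 can be computed by sweeping the feature of interest over a grid while averaging the model's predictions with all other features held at their observed values. The sketch below implements this marginalization by hand on synthetic data shaped to loosely echo the three-stage pattern described above; the helper name `partial_dependence_1d` is our own (scikit-learn's `sklearn.inspection` module offers an equivalent built-in):

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(0)
X = rng.uniform(0.0, 1.0, (102, 6))
# Synthetic PPV-like target: roughly flat below PF = 0.46, rising above it.
pf = X[:, 3]
y = np.where(pf < 0.46, 3.1, 3.1 + 4.0 * (pf - 0.46)) + rng.normal(0, 0.1, 102)

rf = RandomForestRegressor(n_estimators=51, random_state=0).fit(X, y)

def partial_dependence_1d(model, X, feature, grid):
    """Average model prediction as `feature` sweeps over `grid`, with
    every other column kept at its observed values (marginalization)."""
    pdp = []
    for v in grid:
        Xv = X.copy()
        Xv[:, feature] = v
        pdp.append(model.predict(Xv).mean())
    return np.array(pdp)

grid = np.linspace(0.05, 0.95, 10)
pdp = partial_dependence_1d(rf, X, feature=3, grid=grid)
print(pdp.round(2))  # flat at low PF, then rising with PF
```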

#### *4.4. Comparison with the Published Works*

The dataset used in the present paper has also been used in several published articles, e.g., references [1,64,65]. This section compares the performance of the model proposed in this paper with the existing models in those published papers.

Reference [64] proposed two models for PPV prediction, i.e., nonlinear multiple regression (NLMR) and gene expression programming (GEP). Likewise, the 102 blasting records were randomly divided into 80% for training and 20% for testing, and the input parameters (BS, HD, ST, PF, MC, and DI) are the same as those used in the present study. The results of the GEP and NLMR models are presented in Table 7. Reference [1] also used the same dataset to study PPV prediction. In that research, a feature selection technique was applied to filter out unimportant input parameters before modeling; as a result, five variables, excluding BS (i.e., HD, ST, PF, MC, and DI), were chosen as the input parameters, which is the main difference from the current paper. In their study, five ML models, i.e., CART, CHAID, RF, ANN, and SVM, were developed using the reduced dataset, which was randomly split into 70% for training the models and 30% for validating their performance. The performance results of these five models are presented in Table 7. Reference [65] also utilized the same dataset to develop two ML models, i.e., RF and Bayesian network (BN). Similar to Reference [1], feature selection was used to remove features of low importance before building the ML models; the identified input parameters consisted of DI, PF, HD, ST, and MC, in addition to BS.

Moreover, the dataset was randomly split into two parts, i.e., 70% for training the models and 30% for validating their performance. The performance results of these two models are presented in Table 7. Overall, according to the results in Table 7, the improved RF model (i.e., the RF-WOA) proposed in this paper significantly outperforms the models proposed in the published papers, which proves the improved RF model is sufficiently accurate and robust to predict the PPV caused by mining blasting. Likewise, this shows that refining the weights of the decision trees of the RF model as proposed in this paper is reasonable and feasible.


**Table 7.** Results of the evaluation metrics of the models that used the same datasets.

#### **5. Conclusions**

This paper utilized the RF and improved weighted RF models to predict the blast-induced PPV. A dataset of 102 samples collected from an open granite mine was used to develop the regular RF model. The input parameters used for modeling included BS, HD, ST, PF, MC, and DI, while the output was the PPV. Then, five techniques, i.e., refined weights based on the accuracy (R2 and the RMSE) of the decision trees as well as the optimization results of three metaheuristic algorithms (WOA, GWO, and TSA), were employed to enhance the performance of the regular RF model by reassigning new weights to the decision trees. The results showed that the optimal hyperparameters of the regular RF model are 51 decision trees and a maximum depth of 7. Subsequently, the performance evaluation of the five weighted RF models showed that all improved RF models outperformed the regular RF model. Moreover, the RF-WOA model shows the best performance among these five models, as evidenced by its R2 values of 0.986 and 0.932, RMSE values of 0.118 and 0.246, MAE values of 0.093 and 0.188, and VAF values of 98.603 and 95.032 for the training and testing sets, respectively. Additionally, compared with the models developed in published articles using the same dataset, the RF-WOA model still shows the best accuracy, proving its better performance, adaptability, and robustness. Furthermore, the sensitivity analysis revealed that the PF shows the most significant correlation with the PPV prediction, and the PDP results indicated that when the PF is less than 0.46, the PPV shows only small fluctuations. When the PF exceeds 0.46, the PPV increases sharply, and when the PF exceeds 0.63, the increase gradually flattens, which suggests that governing the magnitude of the PPV by managing the PF is an effective and practical measure for this project case.

**Author Contributions:** Conceptualization, B.H. and A.S.M.; methodology, B.H., S.H.L.; software, B.H. and M.M.S.S.; validation, B.H., A.S.M.; formal analysis, B.H. and S.H.L.; investigation, B.H.; writing—original draft preparation, B.H., M.M.S.S., S.H.L., A.S.M. and D.V.U.; writing—review and editing, B.H., M.M.S.S., S.H.L., A.S.M. and D.V.U.; supervision, S.H.L., A.S.M. and M.M.S.S.; funding acquisition, M.M.S.S. All authors have read and agreed to the published version of the manuscript.

**Funding:** The research is partially funded by the Ministry of Science and Higher Education of the Russian Federation under the strategic academic leadership program 'Priority 2030' (Agreement 075-15-2021-1333 dated 30 September 2021).

**Institutional Review Board Statement:** Not applicable.

**Informed Consent Statement:** Not applicable.

**Data Availability Statement:** The data are available upon request.

**Acknowledgments:** The authors of this study wish to express their appreciation to Universiti Malaya for supporting this study and making it possible.

**Conflicts of Interest:** The authors declare no conflict of interest.

#### **References**


65. Zhou, J.; Asteris, P.G.; Armaghani, D.J.; Pham, B.T. Prediction of ground vibration induced by blasting operations through the use of the Bayesian Network and random forest models. *Soil Dyn. Earthq. Eng.* **2020**, *139*, 106390. [CrossRef]
