Article

Rockburst Intensity Classification Prediction Based on Multi-Model Ensemble Learning Algorithms

1 School of Resources and Safety Engineering, Central South University, Changsha 410083, China
2 Sifang Gold Mine Co., Baoji 721000, China
* Author to whom correspondence should be addressed.
Mathematics 2023, 11(4), 838; https://doi.org/10.3390/math11040838
Submission received: 26 December 2022 / Revised: 9 January 2023 / Accepted: 13 January 2023 / Published: 7 February 2023

Abstract:
Rockburst is a common and major hazard in underground engineering, and scientific prediction of rockburst disasters can reduce the risks they pose. At present, developing an accurate and reliable rockburst risk prediction model remains a great challenge owing to the difficulty of integrating fusion algorithms so that they complement one another's strengths. To predict rockburst risk effectively, this paper first collected 410 sets of valid rockburst data as the original data set and processed these cases with the SMOTE oversampling method. Then, four integrated algorithms and eight basic algorithms were selected and optimized through hyperparameter tuning with five-fold cross-validation combined with the random grid search method, thereby improving the classification performance of these algorithms. Third, the stacking integration algorithm, combining the principles of the various machine learning algorithms with the characteristics of the rockburst cases, fused the optimized rockburst algorithms according to four combination strategies. Furthermore, the voting integration algorithm was adopted with multiple combination schemes, using the weighted fusion of accuracy, F1 score, recall, precision, and cv-mean as the weight values, and the optimal model for rockburst risk prediction was obtained. Finally, from the 35 generated stacking integration algorithms and 18 voting integration algorithms, the optimal model in the fusion strategy was selected, and the traditional integrated algorithm models were analyzed on the basis of different sample combinations. The results showed that the prediction performance of the stacking and voting integration algorithms was mostly better than that of ordinary machine learning algorithms, and that selecting an appropriate fusion strategy can effectively improve the performance of ensemble learning algorithms for rockburst prediction.

1. Introduction

Rockburst usually refers to a series of hazardous geological phenomena, such as rock ejection and spalling in underground engineering, in which the elastic strain energy accumulated in the rock is released under the influence of external disturbance. Owing to their sudden and destructive character, rockburst disasters are highly likely to cause casualties and damage to construction facilities and property [1,2]. With the rapid development of the world economy, the earth's shallow mineral resources can no longer meet humanity's needs: resource development continues to move deeper into the earth [3], underground engineering construction is increasing [4], and many engineering disasters occur frequently, of which rockburst is representative. The occurrence of rockburst is often uncertain and dangerous, and compared with shallow engineering, the frequency, intensity, and consequences of rockburst are particularly serious at depth. Therefore, rockburst risk prediction is urgent and has become a topic of constant concern in the industry.
The mechanisms of rockburst initiation and occurrence are complex; in particular, the causes of the rock's internal rupture differ from one event to another [5,6]. Scholars have therefore conducted fruitful research on rockburst risk prediction in many directions and proposed various theoretical criteria, such as the Russenes criterion [7], Barton criterion [8], Hoek criterion [9], energy theory criterion [10,11], brittleness index criterion [12], fractal theory [13,14], and mutation theory [15]. Since there is no unified understanding of the rockburst mechanism in the academic community, and some of the criteria and theories are empirical models proposed under specific engineering conditions, traditional rockburst risk prediction methods remain limited and one-sided. In addition, the selection of indicators and the prediction process carry a certain degree of ambiguity, which mainly arises from the incompleteness of single-factor discriminant indicators, the variability of the inherent properties of rock mechanics, and the randomness and chance of the rockburst process; the development of uncertainty theory therefore provides a new means for rockburst risk prediction. At present, a series of uncertainty theories have been applied to rockburst prediction, such as the fuzzy evaluation method [16,17,18], set pair analysis [19], multi-criteria decision theory [20], the cloud model [21,22], and rough set theory [23,24]. Uncertainty theory has enriched rockburst risk prediction methods and improved the reliability of prediction models when characteristic indicators are treated as qualitative or quantitative values, but such methods usually need thresholds to be determined for each characteristic indicator and indicator level, so some uncertainty may remain for predictions near those thresholds.
With the gradual accumulation of big-data rockburst cases, rockburst risk prediction based on machine learning has developed rapidly. On the one hand, machine learning algorithms have proved their ability to deal with classification problems in several fields. On the other hand, they avoid data overload, weaken human subjectivity, and can obtain optimal parameters within a single prediction model; in particular, they have significant advantages in computational efficiency and in handling nonlinear complex problems with multidimensional, large-sample data. Common machine learning algorithms include neural network algorithms [25,26], support vector machines [27,28], random forests [29], Gaussian processes [30], Bayesian networks [31,32,33,34], regression models [35], ant colony clustering algorithms [36], and so on. However, owing to the uncertainty of the rockburst risk prediction problem and the complexity of the occurrence mechanism, these methods usually have certain drawbacks. For example, neural network algorithms have a "black box" nature; that is, the results obtained cannot explain the inference process or inference basis [37]. Support vector machines, owing to their high sensitivity to data noise, cannot easily handle large-scale samples [38]. Random forest algorithms require considerable space and time to train a large number of decision trees, and the model's computational efficiency is low [39]. The Gaussian process needs complete sample and feature information to make classification predictions, easily loses effectiveness in high-dimensional space, and places high demands on the data characteristics of the sample [40]. To solve these problems, some scholars have optimized and improved the original machine learning methods based on such basic algorithms. For example, Xue et al. [41] used particle swarm optimization (PSO) to optimize the input matrix and hidden-layer bias of the extreme learning machine to establish the PSO-ELM rockburst prediction model. Pu et al. [42] used the t-SNE dimensionality reduction method to reduce the correlation of the original data attributes and used a clustering method to re-label the original data and determine the relative strength of rockburst cases. Wu et al. [43] used a particle swarm optimization algorithm to improve the least squares support vector machine and established a Copula-LSSVM rockburst prediction probability model. Xu et al. [44] used principal component analysis to eliminate the correlation between feature parameters, selected rockburst feature parameters, determined the weight coefficients of the new evaluation index using information entropy theory, and then carried out the study of rockburst risk prediction. Xie et al. [45] optimized the parameter search process of extreme gradient boosting (XGB) based on a genetic algorithm (GA).
With the increasing volume of rockburst data, the problems of outliers, missing values, and data imbalance in the original datasets have emerged, and scholars have started to use integration and fusion methods to combine multiple machine learning algorithms into rockburst risk prediction models with better performance. For example, Zhang et al. [46] proposed aggregating seven independent classifiers to obtain a stronger classifier. Wang et al. [47] used bagging and boosting techniques in an integrated approach to compare rockburst prediction using classification trees as the base model. Tan et al. [48] proposed a stacking and voting rockburst intensity grading prediction method that fuses diversity and accuracy weights. Liu et al. [49] fused eight machine learning algorithms and proposed a stacking integration algorithm that considers multiple rockburst predictors while ensuring the diversity of new feature information.
Among current rockburst risk prediction methods, each has its own limitations; it is still difficult to use integration and fusion to achieve complementary advantages, and developing an accurate and reliable rockburst intensity classification prediction model remains a huge challenge. Therefore, this paper first applied the SMOTE (Synthetic Minority Oversampling Technique) method to resample the original dataset. Secondly, algorithm models based on four integrated algorithms and eight basic algorithms were selected, and the hyperparameters of the underlying algorithms were optimized through five-fold cross-validation combined with the random grid search method. Then, four combinations, namely basic algorithms-basic algorithms, basic algorithms-integrated algorithms, integrated algorithms-basic algorithms, and integrated algorithms-integrated algorithms, were used as the stacking integration strategy. In addition, we chose weighted fusion of the basic algorithm models, weighted fusion of the integrated algorithm models, and weighted fusion of the basic + integrated algorithm models, with accuracy, F1 score, recall, precision, and cv-mean as the weight values, as the voting integration strategy. Through analysis and discussion, the optimal integrated classification model for rockburst risk prediction was then obtained.

2. Establishment of the Rockburst Prediction Database

2.1. Rockburst Event Data Sources

In this paper, 410 sets of valid rockburst data were selected to carry out the study of rockburst risk prediction, and all data were obtained from published materials and papers on relevant domestic and international projects [27,41,43,50,51,52,53,54,55,56,57,58,59,60,61,62,63,64,65,66,67,68,69,70,71,72,73,74,75,76,77,78,79,80,81,82,83,84,85,86,87,88,89,90,91]. The database classifies rockburst risk into four classes, i.e., none, light, moderate, and strong. According to the distribution of the data samples, the distribution of rockburst classes can be plotted as shown in Figure 1. As can be seen in Figure 1, none rockburst events totaled 62, accounting for about 15.12%; light rockburst events totaled 108, accounting for about 26.34%; moderate rockburst events totaled 158, accounting for about 38.54%; and strong rockburst events totaled 82, accounting for about 20.00%. For the convenience of statistical analysis and calculation, the above rockburst level samples were labeled 0, 1, 2, and 3. From the sample database of case statistics, it can be seen that moderate rockburst had the largest amount of data and none rockburst events the smallest, indicating that the sample was inhomogeneous overall. For the data set, the larger the sample size, the weaker the overfitting phenomenon and the better the model performs when processing test set data.

2.2. Analysis of Event Characteristic Indicators for Rockburst

In this paper, six characteristic indicators, namely the maximum tangential stress (σθ), uniaxial compressive strength of rock (σc), uniaxial tensile strength of rock (σt), elastic energy index of rock (Wet), stress coefficient (σθ/σc), and rock brittleness coefficient (σc/σt), were selected to carry out the rockburst risk prediction research. The sample distribution and box plots of each characteristic indicator for the different rockburst risk levels are shown in Figure 2. Table 1 describes the statistical parameters of the rockburst indicators. Average denotes the sample mean. S.D. (standard deviation) is the arithmetic square root of the mean of the squared deviations from the sample mean. Min. and Max. represent the minimum and maximum values in the sample, respectively. Finally, the 25% quantile, 50% quantile, and 75% quantile represent the values below which 25%, 50%, and 75% of all sample values fall when arranged from small to large. From Figure 2 and Table 1, the overall sample distribution was approximately normal, but some indicators, especially σθ/σc and Wet, contained outliers. In addition, the outliers of some indicators may exceed the high-level risk distribution values of the same indicators. Moreover, the box plots show that, for some characteristics (e.g., σc and σt), the upper and lower edges, upper and lower quartiles, and sample medians did not differ much across rockburst risk levels, and the value ranges of the high-grade rockburst samples overlapped.
Figure 3 shows the magnitude of the correlations among the characteristic indicators, from which it can be seen that some of the selected indicators were significantly correlated; for example, the X1 and X4 correlation was 0.530, the X1 and X5 correlation was −0.755, and the X3 and X6 correlation was −0.589. These phenomena indicate the complexity of the conditions under which actual rockburst problems occur, and confirm that relying on a single indicator to analyze rockburst risk is not feasible. As the abovementioned outliers are genuine under the actual conditions, this paper did not remove them.

3. Principles of Stacking and Voting Integration Algorithms

3.1. Principle of Stacking Integration Algorithm

When a single model is used for rockburst prediction, its prediction ability is usually poor owing to weak generalization. Stacking models use a model fusion combination strategy to combine multiple base learners to accomplish a specific classification task [92]. The stacking integration algorithm is a heterogeneous integration algorithm. As a fusion method for strong learners, the model is usually effective, interpretable, and suitable for complex multidimensional data. Stacking integration algorithms mainly consist of two parts, the base model and the meta-model [49], and their core is to use multiple trained models as base classifiers (base models, also called individual classifiers). The prediction results of these base classifiers are used as the training set of a new model, a new classifier (the meta-model) is then learned, and the output of the new classifier is used as the prediction result of the final model. The working principle of stacking is shown in Figure 4.
First, the rockburst data were split into a training set and a test set, and the training set samples were further partitioned using five-fold cross-validation, in which four folds were used to train the base classifier and the remaining fold, together with the test set, was fed into the trained model. The above steps were repeated five times, producing five sets of predictions. The five out-of-fold predictions were then reassembled in the original order and input to the new classifier as new data features, while the five predictions on the test set were averaged and used as the test set of the new classifier, thus ensuring that the number of prediction samples and data features of the meta-model were consistent with the dimensionality of the base classifier.
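As an illustration, the five-fold out-of-fold procedure described above can be sketched with scikit-learn on synthetic toy data (the classifier choices, sample sizes, and random seeds here are hypothetical stand-ins, not the paper's actual models or data):

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import KFold, train_test_split

# Toy data standing in for the rockburst set: 6 features, 4 risk classes
X, y = make_classification(n_samples=400, n_features=6, n_informative=4,
                           n_classes=4, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2, random_state=0)

n_classes = 4
base = RandomForestClassifier(random_state=0)
kf = KFold(n_splits=5, shuffle=True, random_state=0)

oof = np.zeros((len(X_tr), n_classes))            # out-of-fold predictions (meta features)
test_parts = np.zeros((5, len(X_te), n_classes))  # per-fold test predictions, averaged later

for i, (train_idx, val_idx) in enumerate(kf.split(X_tr)):
    base.fit(X_tr[train_idx], y_tr[train_idx])        # train on four folds
    oof[val_idx] = base.predict_proba(X_tr[val_idx])  # predict the held-out fold
    test_parts[i] = base.predict_proba(X_te)          # predict the test set each round

meta = LogisticRegression(max_iter=1000)
meta.fit(oof, y_tr)                  # meta-model learns from OOF features
meta_test = test_parts.mean(axis=0)  # average of the five test-set predictions
accuracy = meta.score(meta_test, y_te)
```

This keeps the meta-model's training features the same shape as the base classifier's outputs, exactly as the paragraph describes.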

3.2. Principle of Voting Integration Algorithm

The voting method is another effective approach to classification prediction tasks and has been widely used. In brief, the voting method takes the prediction results of the base classifiers and re-fuses them using different combination strategies [92]. Usually, the voting method includes three strategies: the majority voting strategy, the average strategy, and the weighting strategy. The majority voting strategy counts the results of all base classifiers by category, and the category with the most occurrences is the output of the fusion model; it includes the relative majority voting method and the absolute majority voting method. The average strategy gives the same weight to the prediction results of the different base classifiers to determine the final prediction. The weighting strategy assigns a weight value to each base classifier based on its prediction results and performance in the voting process, combined with its predicted probabilities for the different rockburst classes, and then obtains the final result. In this strategy, more weight is given to the base classifier with better performance, but not excessively more, so as to prevent overfitting. It should be noted that if the prediction results of the original base classifiers do not differ significantly, the weighted strategy generally does not improve the quality of the fusion model, even if a higher weight is given to the better-performing classifier. In addition, the generalization ability of a fusion model with a weighted strategy is not necessarily better than that of ordinary algorithms.
The essence of the weighted voting strategy is that the probabilities under different categories are summed [93], and the category with the highest probability is the voting result. In this paper, the probability values under different categories were calculated by measuring the confidence level of the vote in the form of soft voting, and the higher the probability under a category, the more confident the model was in the prediction result of that category. The formula for determining the final rockburst risk class using the base learner is as follows.
$$ P_i = \max\left( \sum_{k=1}^{L} p_{ki} w_k \right) $$
where $p_{ki}$ denotes the predicted probability of base classifier $k$ for class $i$, and $w_k$ denotes the weight of base classifier $k$.
From the above equation, it is clear that the focus of the weighting strategy is determining the weight values. Usually, the classification performance of a base classifier can be measured by its accuracy, F1 score, precision, and recall [94]; therefore, when determining the weight values of the weighted fusion algorithm, this paper used these classifier performance evaluation indexes. In addition, this paper also used the cross-validation (cv) [95] results of each base classifier as weight values. The above weighting indexes were calculated as follows.
The accuracy of a classifier usually indicates the proportion of samples that are correctly classified, and its meaning can be understood as the ratio of the number for samples correctly classified by the classifier to the total number of samples for a given test data set, calculated as follows.
$$ \mathrm{Accuracy}(y, \tilde{y}) = \frac{1}{n} \sum_{i=1}^{n} \mathrm{sign}(\tilde{y}_i, y_i) $$

$$ \mathrm{sign}(\tilde{y}_i, y_i) = \begin{cases} 0, & \tilde{y}_i \neq y_i \\ 1, & \tilde{y}_i = y_i \end{cases} $$
where $y_i$ is the actual rockburst level of the $i$th sample in the sample space, and $\tilde{y}_i$ is the corresponding predicted rockburst level. When the label levels are the same, the classifier classifies correctly and sign is 1; when the label levels differ, the classifier classifies incorrectly and sign is 0.
Precision is defined with respect to the classification results and usually indicates how many of the samples classified as positive are truly positive; it can be expressed as
$$ P = \frac{TP}{TP + FP} $$
Recall represents the proportion of positive samples correctly predicted by the classifier among all positive samples and can be expressed as
$$ \mathrm{Recall} = \frac{TP}{TP + FN} $$
The P and Recall metrics can be contradictory under extreme conditions. The F1-score combines the results of P and Recall and is calculated as follows:
$$ F1 = \frac{2 \cdot P \cdot \mathrm{Recall}}{P + \mathrm{Recall}} $$
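As a sketch, the four evaluation indexes above can be computed with scikit-learn; the label vectors below are hypothetical, and macro averaging is assumed for the multi-class (four-level) case:

```python
from sklearn.metrics import accuracy_score, f1_score, precision_score, recall_score

# Hypothetical labels: 0 = none, 1 = light, 2 = moderate, 3 = strong
y_true = [0, 1, 2, 2, 3, 1]
y_pred = [0, 1, 2, 1, 3, 1]

acc = accuracy_score(y_true, y_pred)   # fraction of correct labels (5 of 6 here)
prec = precision_score(y_true, y_pred, average="macro", zero_division=0)
rec = recall_score(y_true, y_pred, average="macro")
f1 = f1_score(y_true, y_pred, average="macro")
```

Each metric can then serve either as a weight value in the weighted voting fusion or as an evaluation criterion for comparing classifiers.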

4. Rockburst Risk Prediction Model Based on Integrated Algorithms

4.1. Rockburst Risk Prediction Process Based on Integrated Algorithms

The specific process of applying the integrated algorithms to rockburst risk prediction is shown in Figure 5. The detailed steps are as follows.
(1)
In order to improve the prediction performance of the learner models and reduce the influence of sample imbalance, the SMOTE data processing method was selected to oversample the original data, resampling the rockburst grades to a 1:1:1:1 ratio, which in turn improved the accuracy of the prediction results.
(2)
Among the processed rockburst samples, 80% was randomly selected as the training set and 20% was selected as the test set.
(3)
A base model library was established containing two modules: integrated algorithms and basic algorithms. XGBoost, LightGBM, CatBoost, and Random Forest were selected as the integrated algorithms, while the basic algorithms included the decision tree (DT), naive Bayes model (NB), K-nearest neighbor algorithm (KNN), support vector machine (SVM), logistic regression (LR), linear discriminant analysis (LDA), multilayer perceptron neural network (MLPC), and Gaussian process (GP).
(4)
We determined the hyperparameters, search ranges, and search steps of the base models, and obtained the optimal hyperparameter values by 5-fold cross-validation combined with the random grid search method.
(5)
The optimized base models were combined under an appropriate integration strategy for algorithmic fusion. In this paper, the stacking integration strategy and voting integration strategy were chosen as the integration rules. Stacking integration referred to four compositions of base models and meta-models: basic algorithms-basic algorithms, basic algorithms-integrated algorithms, integrated algorithms-basic algorithms, and integrated algorithms-integrated algorithms. The voting method used three integration strategies: weighted integration of the eight basic algorithm models, weighted integration of the integrated algorithm models, and weighted integration of the basic + integrated algorithm models, with accuracy, F1 score, recall, precision, and cv-mean as the weight values.
(6)
According to the different integration strategies, different integrated classification models could be obtained. The test set was applied to the integrated algorithm models, the classification performance of the various fusion algorithms was compared and analyzed using accuracy, F1 score, recall, and precision, and the optimal integration scheme was finally determined. The flow chart of rockburst risk prediction is shown in Figure 5.

4.2. Raw Data Processing

According to the data sources of rockburst events, the ratio of none rockburst events to light, moderate, and strong rockburst events is 62:108:158:82, so there is a certain imbalance in the sample. Typically, an unbalanced sample distribution may bias the output of classification algorithms used with the default classification threshold. This is because, in a classification problem, the weights are negatively correlated with the number of samples of each rockburst class; that is, the smaller the number of samples, the higher their weight values, and the classification model becomes more sensitive to the minority-class events, so the obtained prediction results are less accurate. In this paper, the data imbalance was adjusted by the SMOTE sampling method, which improved the generalization ability of the model and reduced the impact of overfitting. It should be clarified that SMOTE does not duplicate the original data to achieve sample balancing [96], but generates new positive-example data (i.e., minority-class events) by K-nearest-neighbor interpolation in the local region. After the original sample data were processed by the SMOTE sampling method, the proportion of each type of event was balanced and the database was expanded to 632 data sets.
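The K-neighborhood interpolation idea behind SMOTE can be sketched as follows (a simplified illustration of the principle, not the full SMOTE implementation used in the paper; the data and sizes below are hypothetical):

```python
import numpy as np
from sklearn.neighbors import NearestNeighbors

def smote_sketch(X_min, n_new, k=5, seed=0):
    """Simplified SMOTE illustration: synthesize n_new minority samples by
    interpolating between a minority sample and one of its k nearest
    minority-class neighbors."""
    rng = np.random.default_rng(seed)
    nn = NearestNeighbors(n_neighbors=k + 1).fit(X_min)
    _, idx = nn.kneighbors(X_min)              # column 0 is the point itself
    synthetic = []
    for _ in range(n_new):
        i = rng.integers(len(X_min))           # pick a minority sample
        j = idx[i, rng.integers(1, k + 1)]     # pick one of its k neighbors
        gap = rng.random()                     # interpolation factor in [0, 1)
        synthetic.append(X_min[i] + gap * (X_min[j] - X_min[i]))
    return np.asarray(synthetic)

# e.g., lift a 62-sample minority class toward the 158-sample majority class
minority = np.random.default_rng(1).normal(size=(62, 6))
new_samples = smote_sketch(minority, n_new=158 - 62)
```

Because each synthetic point lies on the segment between two real minority samples, the method enlarges the minority class without simply duplicating existing rows.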

4.3. Hyperparameter Optimization of the Base Algorithms

Hyperparameter optimization is a key step in improving the generalization ability of a model, reducing overfitting, and improving the classification performance of the algorithm. Most machine learning algorithms contain different hyperparameters. The hyperparameter optimization problem can be defined as follows:
$$ \operatorname*{arg\,min}_{x \in X} f(x) $$
where x denotes a hyperparameter configuration within the model's search range, X is the hybrid design space, and f(x) is the objective function of the hyperparameter optimization process; the objective function selected in this paper was the accuracy (scoring = 'accuracy'), and the goal of hyperparameter optimization was to find the global optimal solution of the above equation in as short a time as possible so that the objective function reaches its optimal value. Hyperparameter optimization algorithms generally include the grid search method, heuristic search algorithms, and Bayesian optimization, among others. The global grid search method is a simple brute-force approach with long traversal times, and the number of parameter combinations grows exponentially as the parameter space expands. Heuristic search algorithms consume too much space and time. Bayesian optimization is a global optimization method that can obtain an approximate solution of a complex objective function with fewer evaluations, but it requires more data, its analysis and calculation process is more complicated, and it reduces objective accuracy for the parts of the data that must use subjective probabilities. In order to obtain better predictive performance from the original models, this paper used five-fold cross-validation combined with the random grid search method to optimize the hyperparameters of some of the base models. The hyperparameter ranges, step sizes (i.e., retrieval intervals), and optimal values are shown in Table 2.
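A minimal sketch of five-fold cross-validation combined with random grid search, using scikit-learn's RandomizedSearchCV; the parameter ranges and step sizes below are hypothetical placeholders, not the values in Table 2:

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import RandomizedSearchCV

# Toy stand-in data: 6 features, 4 rockburst classes
X, y = make_classification(n_samples=200, n_features=6, n_informative=4,
                           n_classes=4, random_state=0)

# Hypothetical search ranges and step sizes (retrieval intervals)
param_dist = {
    "n_estimators": list(range(50, 201, 50)),  # step size 50
    "max_depth": list(range(3, 11)),           # step size 1
}
search = RandomizedSearchCV(
    RandomForestClassifier(random_state=0),
    param_distributions=param_dist,
    n_iter=10,              # sample 10 random combinations from the grid
    cv=5,                   # five-fold cross-validation
    scoring="accuracy",     # the paper's chosen objective function
    random_state=0)
search.fit(X, y)
best_params, best_score = search.best_params_, search.best_score_
```

Random sampling over a discretized grid avoids the exponential cost of exhaustive traversal while still using the same search ranges and steps.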

4.4. Strategy and Process of the Stacking Integration Algorithm

From the above formulation, it is clear that the base models and meta-models are the core of the stacking integration algorithm, the essence of which is that the prediction results output by the base models are the input features of the meta-model. Usually, the base models may contain one or more strong classifiers, which require high complexity and high learning ability, while the meta-model contains only one classifier, which usually has high interpretability and a simple structure. Different algorithms may behave differently with respect to feature information when processing sample data, and thus, if the models are not fused properly, the prediction accuracy of a single base classifier may exceed the output of the meta-model.
To remedy the shortcomings of the stacking integration algorithm, four groups of integration algorithms were proposed in this paper, with four meta-models set for each group; the specific integration scheme is shown in Figure 6. The base models are the underlying models, where Model-B1~Model-B4 are basic algorithm models. In this paper, the decision tree (DT), naive Bayes model (NB), K-nearest neighbor algorithm (KNN), support vector machine (SVM), logistic regression (LR), linear discriminant analysis (LDA), multilayer perceptron neural network (MLPC), and Gaussian process (GP) were used as the base models; Model-I1~Model-I4 are the integrated algorithm models, for which XGBoost, LightGBM, CatBoost, and Random Forest were selected. The meta-model in Figure 6 comprises the meta-models, and the stacking integration in this paper used four compositions of base models and meta-models: basic algorithms-basic algorithms, basic algorithms-integrated algorithms, integrated algorithms-basic algorithms, and integrated algorithms-integrated algorithms. In total, 24 stacking integration algorithms (Stacking1~Stacking24) can be obtained from these model compositions, and the calculation process of the stacking integration algorithms is as follows.
(1)
Determine the hyperparameter indexes of each base classifier (basic algorithms + integrated algorithms), determine the hyperparameter values of each algorithm using the random grid search method, and debug the classification models.
(2)
Use the debugged classification models to calculate the probability of rockburst risk separately, and obtain the accuracy and evaluation indexes of each model.
(3)
Build the stacking overlay strategy in the Sklearn module, and determine the base models and meta models according to the abovementioned fusion strategy.
(4)
Calculate the accuracy and evaluation indexes of the 24 stacking integration algorithms (Stacking1~Stacking24) to determine the best stacking integration strategy and model. The strategy and process of the stacking integration algorithm are shown in Figure 6.
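Steps (1)-(4) can be sketched with scikit-learn's StackingClassifier; the particular base/meta combination below is one hypothetical example of the fusion strategies described above, on toy data:

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier, StackingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC

# Toy stand-in data: 6 features, 4 rockburst classes
X, y = make_classification(n_samples=300, n_features=6, n_informative=4,
                           n_classes=4, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2, random_state=0)

# One hypothetical "integrated + basic algorithms -> basic meta-model" combination
stack = StackingClassifier(
    estimators=[("rf", RandomForestClassifier(random_state=0)),
                ("svm", SVC(probability=True, random_state=0)),
                ("knn", KNeighborsClassifier())],
    final_estimator=LogisticRegression(max_iter=1000),
    cv=5)  # 5-fold out-of-fold predictions feed the meta-model
stack.fit(X_tr, y_tr)
test_accuracy = stack.score(X_te, y_te)
```

Swapping the `estimators` and `final_estimator` choices reproduces the four composition forms (basic/integrated base models with basic/integrated meta-models).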

4.5. Strategy and Process of Voting Integration Algorithm

In this paper, three voting strategies were proposed: weighted fusion of the eight basic algorithms, weighted fusion of the four integrated algorithms, and weighted fusion of the basic + integrated algorithms, finally yielding eighteen voting integrated models (Voting1~Voting18).
It should be clarified that, in addition to serving as the weight values of the weighting strategy, the abovementioned performance evaluation indexes of each classifier can also be used as evaluation criteria for the various classifiers, from which the optimal fusion model can be determined. The calculation steps are as follows.
(1)
Determine the hyperparameters of each basic classifier (basic algorithms + integrated algorithms) and debug the classification model.
(2)
Use the debugged classification models to calculate the probability of rockburst risk and obtain the accuracy and evaluation indexes of each model.
(3)
Determine the voting models using a weighted combination strategy based on the mean, accuracy, F1 score, precision, recall, and the cross-validation results of each base classifier.
(4)
Calculate the accuracy and evaluation indexes of 18 voting algorithms (Voting1~Voting18) to determine the best voting integration strategy and model.
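As a sketch of the weighted soft-voting procedure, scikit-learn's VotingClassifier can fuse several models with each model's cv-mean as its weight (one of the weighting options above); the models and data below are hypothetical toy choices:

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier, VotingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score, train_test_split
from sklearn.neighbors import KNeighborsClassifier

# Toy stand-in data: 6 features, 4 rockburst classes
X, y = make_classification(n_samples=300, n_features=6, n_informative=4,
                           n_classes=4, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2, random_state=0)

models = [("rf", RandomForestClassifier(random_state=0)),
          ("knn", KNeighborsClassifier()),
          ("lr", LogisticRegression(max_iter=1000))]

# cv-mean of each base classifier serves as its voting weight
weights = [cross_val_score(m, X_tr, y_tr, cv=5).mean() for _, m in models]

# soft voting sums the weighted class probabilities, as in the equation above
vote = VotingClassifier(estimators=models, voting="soft", weights=weights)
vote.fit(X_tr, y_tr)
test_accuracy = vote.score(X_te, y_te)
```

Substituting accuracy, F1 score, precision, or recall for the cv-mean in the `weights` list produces the other weighting variants.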

5. Analysis and Discussion of Prediction Results

5.1. Comparison and Evaluation of Base Algorithms

In this part, 12 base models were evaluated using the test set. Table 3 lists the accuracy, precision, recall, F1 score, and cross-validation results for each algorithm, where clf1 to clf4 denote the integrated algorithms, namely Random Forest, LightGBM, XGBoost, and CatBoost, respectively, and clf5 to clf12 denote the basic algorithms, namely decision tree (DT), K-nearest neighbors (KNN), support vector machine (SVM), logistic regression (LR), linear discriminant analysis (LDA), multilayer neural network (MLPCC), Gaussian process (GP), and Bayesian model (NB). It can be seen from Table 3 that CatBoost achieved the best results in terms of accuracy and the other evaluation metrics, with a training accuracy of 85.04%, precision of 84.93%, recall of 85.04%, F1 score of 84.65%, and cv_mean of 78.02% (tied with XGBoost). The training accuracy of the logistic regression (LR) algorithm was the lowest at 55.91%, and its other evaluation indexes (except cv_mean) were also the worst at 56.71%, 55.91%, and 56.08%, respectively. The cv_mean of the multilayer neural network (MLPCC) was the lowest at 43.35%. Considering the accuracy and the other evaluation metrics of each algorithm, the base algorithms were ranked as clf4 > clf3 > clf2 > clf1 > clf7 > clf6 > clf11 > clf5 > clf10 > clf9 > clf12 > clf8, which also indicates that the integrated algorithms performed better than the basic algorithms. The "accuracy" in Table 3 refers to the accuracy on the training set, corresponding to the left-hand plot in Figure 7; the "testing accuracy" in the right-hand plot of Figure 7 refers to the accuracy on the test set. Comparing the two values shows whether a classification method has overfitted the sample.
Figure 7 also shows the prediction performance of the various machine-learning algorithms. It can be seen that the integrated algorithms all performed well and showed good generalization ability compared with the basic algorithms, while the basic models clf6 and clf7 had more serious overfitting problems.
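The Table 3 metrics and the train/test comparison behind Figure 7 can be sketched as follows for one classifier; the weighted multi-class averaging for precision/recall/F1 is an assumption of this sketch (the paper does not state its averaging mode), and the data is synthetic.

```python
# Sketch of the per-model evaluation: training vs. testing accuracy (the two
# Figure 7 panels), weighted precision/recall/F1, and the 5-fold cv_mean.
from sklearn.datasets import make_classification
from sklearn.metrics import precision_recall_fscore_support
from sklearn.model_selection import cross_val_score, train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=400, n_features=6, n_informative=4,
                           n_classes=4, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.25, random_state=0)

clf = DecisionTreeClassifier(random_state=0).fit(X_tr, y_tr)
train_acc = clf.score(X_tr, y_tr)   # "accuracy" column (training set)
test_acc = clf.score(X_te, y_te)    # "testing accuracy" (test set)
prec, rec, f1, _ = precision_recall_fscore_support(
    y_te, clf.predict(X_te), average="weighted", zero_division=0)
cv_mean = cross_val_score(clf, X_tr, y_tr, cv=5).mean()
overfit_gap = train_acc - test_acc  # a large gap flags overfitting (clf6, clf7)
```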

5.2. Classification Performance Evaluation of the Stacking Integration Algorithm

Twenty-four integrated model algorithms (Stacking1~Stacking24) were obtained by the stacking integration strategy, and their confusion matrices are shown in Figure 8. In this section, the fused stacking models were compared against the corresponding unfused base algorithm models. Here, stacking fusion models whose base classifiers are integrated algorithms are called Stacking-I, and stacking fusion models whose base classifiers are basic algorithms are called Stacking-II. The evaluation indexes of each algorithm for each rockburst level are shown in Table 4, Table 5, Table 6 and Table 7. The average improvement rate of each evaluation index at each level for each meta model in each group of stacking integration algorithms is shown in Figure 9.
From Table 4, Table 5, Table 6 and Table 7 and Figures 8 and 9, we can see the following. (1) Except for the random forest algorithm (clf1), the evaluation metrics such as accuracy and recall of the stacking models with an integrated algorithm as the meta model were inferior to those of the corresponding single integrated algorithm. (2) The stacking integrated algorithms showed the same overall growth trend across the evaluation performance indexes. (3) For the stacking models with basic algorithms as the meta models, the overall performance of the evaluation indicators was better than that of the corresponding single basic algorithm (except clf5 and clf6). (4) The prediction results at different rockburst risk levels were affected by the database; for example, the stacking models performed well on none rockburst and strong rockburst events. Because the original samples of these two classes were fewer, the oversampling process strengthened the linear regularity of the data, improved the robustness of performance, and enhanced the generalization ability of the mathematical models. (5) When the meta models were integrated algorithms, the stacking models whose base classifiers were also integrated algorithms (i.e., Stacking-I) achieved a smaller overall improvement in prediction performance than those whose base classifiers were basic algorithms (i.e., Stacking-II). (6) When the meta models were basic algorithms, the stacking models whose base classifiers were also basic algorithms (i.e., Stacking-II) achieved a smaller overall improvement than those whose base classifiers were integrated algorithms (i.e., Stacking-I). (7) The prediction performance of the stacking models with clf7 as the meta model was not significantly improved compared with the base algorithm itself.
In conclusion, when choosing a stacking fusion strategy for rockburst-grade prediction problems, the base models and the meta model should be chosen from different classes of algorithms in order to obtain better prediction results.
It should be noted that, since a single integrated algorithm already performs well, its headroom for performance improvement is small; for an integrated algorithm acting as the meta model, the new feature information provided by the base models is therefore generally less accurate than the original feature information, so choosing an integrated algorithm as the meta model achieves worse results than its own prediction. As can be seen in Figure 9, the improvement rates of clf1~clf4 in each evaluation index were all negative. When the meta model is a basic algorithm, the situation is different: basic algorithms process the feature information contained in the data differently from integrated algorithms and have a different capacity to accept it. In addition, the prediction performance of a basic algorithm itself is low and its headroom for improvement is larger, so the new feature information provided by the selected base models is more accurate than the original feature information for a basic algorithm acting as the meta model; hence, the improvement rate of each evaluation index was positive for some of the basic algorithms used as meta models.
Notably, the improvement rates of the evaluation indexes for basic algorithms as the meta model exceeded those of the integrated algorithms, and the largest gains in every index were achieved with clf8 as the meta model: accuracy improved by nearly 40.8%, precision by nearly 39.2%, recall by nearly 40.8%, and the F1 score by nearly 40.5%. This indicates that the logistic regression (LR) algorithm was the most responsive to performance improvement when integrated by the stacking method.
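The improvement rates behind Figure 9 are simple relative changes between a stacked model's metric and the same algorithm used alone. A sketch, using LR's 55.91% training accuracy from Table 3 as the baseline and an illustrative stacked accuracy chosen only to match the reported ~40.8% gain (not a value from the paper's tables):

```python
def improvement_rate(stacked: float, single: float) -> float:
    """Relative gain of a stacked model's metric over the standalone model."""
    return (stacked - single) / single

# LR (clf8) alone at 0.5591 accuracy, lifted by stacking to an illustrative
# 0.7874, gives roughly the ~40.8% improvement reported for clf8 as meta model.
rate = improvement_rate(0.7874, 0.5591)
```

A negative rate (stacked below standalone) is what Figure 9 shows for clf1~clf4 as meta models.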

5.3. Classification Performance Evaluation of the Voting Integration Algorithm

According to the three voting fusion strategies proposed in the previous section, namely, the weighted fusion of the four integrated algorithm models, the weighted fusion of the eight basic algorithm models, and the weighted fusion of the four integrated algorithm models + eight basic algorithm models, eighteen voting integrated models (Voting1~Voting18) were finally obtained. The weight values were the average, accuracy, precision, recall, F1 score, and cross-validation results of each base classifier; the accuracy and other evaluation indexes are shown in Table 8, Table 9, Table 10, Table 11, Table 12 and Table 13.
With reference to the results of each base algorithm model in Section 5.1, the following can be seen:
(1) When the base models were integrated algorithms, the accuracy and the other performance indexes decreased for the new fusion models obtained by each voting strategy (the accuracy and other performance indexes of the integrated algorithms were over 0.8, whereas the highest values of the voting fusion strategy were 0.7874, 0.7895, 0.7874, and 0.7877), and the values remained the same regardless of the weighting form used. (2) When the base models were basic algorithms, the accuracy and the other performance metrics increased for the new fusion models obtained by each voting strategy. (3) For the weighted fusion of the four integrated algorithms + eight basic algorithms, the values of all metrics increased and exceeded 0.8 whichever metric was used for the weighting operation. (4) Among the four integrated algorithm models + eight basic algorithm models fusion strategies, the weighted fusions based on the F1 value and the cross-validation results of each base classifier performed best (Voting15 and Voting18), with accuracy, recall, and F1 values of 0.8275, 0.8268, and 0.8266, respectively; compared with most of the base algorithm models, their performance was second only to clf3 and clf4. (5) Among the eight basic algorithms, the fusion strategies weighted by the average and by accuracy performed worst, but their performance was still better than that of most of the basic algorithms (second only to clf6, clf7, and clf11). (6) Among all the weighted fusion strategies, the accuracy-based weighting performed worst. (7) Taking Voting15, the best performing of the voting strategies, as an example, the values of each evaluation index of its prediction performance are shown in Figure 10.
From the figure, it can be seen that the fusion model of the voting strategy outperformed some of the individual integration algorithms (namely clf1 and clf2) in prediction performance.

5.4. Comparative Analysis of the Fusion Results of Other Combination Strategies

From the essence of the stacking strategy, its effectiveness is usually derived from the feature extraction performed by the base models. Since the features seen by the meta model are derived from the learning of the base models, the primitive features should not be fed into the algorithm used as the meta model, so as to reduce the risk of overfitting. Therefore, relatively simple models are generally selected as meta models for fusion. In the process of feature extraction, the meta model does not need to be a complex classifier, because the base models have already applied the more complex nonlinear transformations. According to the analysis in Section 5.2, logistic regression achieved the highest evaluation performance as a meta model, so this section chose logistic regression (LR) as the meta model for comparison with the aforementioned stacking integration strategies.
First, the regularization term included in logistic regression can reduce the overfitting that often occurs in classification problems. Meanwhile, effective features can be selected with L1 regularization, and unnecessary classifiers can be removed from the base model set to improve computational efficiency, which makes the classification problem more objective and realistic. The other combinatorial fusion strategies discussed in this part mainly select algorithms other than logistic regression from the abovementioned base algorithms as base models and logistic regression as the meta model; the specific fusion strategies and calculation results are shown in Table 14.
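The L1-regularized logistic regression meta model described above can be sketched as follows; the solver, regularization strength, base-model choices, and synthetic data are assumptions of this sketch, not the paper's configuration. The point is that meta-feature columns contributed by an uninformative base model can receive zero coefficients and thus be effectively pruned.

```python
# Sketch: logistic regression with an L1 penalty as the stacking meta model,
# so that near-useless base-model outputs get zeroed coefficients.
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier, StackingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.naive_bayes import GaussianNB

X, y = make_classification(n_samples=400, n_features=6, n_informative=4,
                           n_classes=4, random_state=0)

meta = LogisticRegression(penalty="l1", solver="saga", C=1.0, max_iter=5000)
stack = StackingClassifier(
    estimators=[("rf", RandomForestClassifier(random_state=0)),
                ("nb", GaussianNB())],
    final_estimator=meta, cv=5)
stack.fit(X, y)
# Zeroed meta-model coefficients correspond to pruned base-model outputs.
n_zero = int((stack.final_estimator_.coef_ == 0).sum())
```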
The comparative analysis showed that the accuracy and the other evaluation performance indexes of the combined strategies with clf8 as the meta model improved significantly, especially for the basic models: except for Stacking25 and Stacking27, whose accuracy was lower (though only below clf6 and clf11, the best performers among the basic models), the accuracy of the remaining models improved significantly, as shown in Figure 11. However, it should be noted that the performance of the abovementioned fusion models was still inferior to that of most of the integration algorithms. Stacking26 and Stacking34 improved the accuracy by nearly 4%, and Stacking26 performed best when comparing the evaluation metrics.
The best-performing fusion model (Stacking26) and the integrated algorithms were selected to analyze the prediction performance of each model at the different rockburst risk levels, as shown in Figure 12. It can be seen that (1) across the evaluation indicators, each model showed good prediction performance for none rockburst and strong rockburst events. (2) For different rockburst levels, the prediction performance of the same algorithm varied greatly. (3) In terms of accuracy and stability across rockburst levels, clf4 had the most stable model performance. (4) The models performed poorly on moderate rockburst events compared with the other rockburst levels, which may be related to the fact that the raw data contained the highest number of moderate rockburst events, so no samples of this level were added when the raw data were processed. (5) In terms of precision, Stacking26 performed best for none rockburst and strong rockburst events, and worst for light and moderate rockburst events. (6) In terms of recall, Stacking26 performed best in handling light rockburst events and fairly in handling moderate rockburst events, second only to clf2. (7) In terms of F1 score, Stacking26 performed better in handling none rockburst and light rockburst events, similar to the models with better prediction performance among the integrated algorithms. In summary, the stacking fusion strategy integration algorithms obtained better prediction results in dealing with low-intensity rockburst events, and their overall prediction performance was higher than that of some individual integration algorithms.

5.5. Comparative Analysis of Other Models with Different Sample Combinations

Combined with the correlation analysis of the feature indicators for rockburst events in Section 2.2, it can be seen that the correlations between X1 and X5 and between X3 and X6 were relatively large. This section used the four integrated algorithms to analyze the importance of the six feature indicators, as shown in Figure 13. The left graph shows the feature importance given by each of the integrated algorithms; the right graph shows that the averages of the four algorithms' importance scores for the rockburst features were 0.2084, 0.1251, 0.1188, 0.2576, 0.1541, and 0.1365, respectively, yielding the importance ranking X4 > X1 > X5 > X6 > X2 > X3. Ranking the abovementioned feature indicators by importance shows that X4 (rock elastic energy index Wet) and X1 (maximum tangential stress σθ) both play major roles in the occurrence of rockburst.
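The Figure 13 averaging can be sketched as the mean of each model's `feature_importances_` vector; only two tree ensembles are used here to keep the sketch self-contained, and the importances come from synthetic data, not the paper's 0.2084, 0.1251, ... values.

```python
# Sketch: average impurity-based feature importances across tree ensembles,
# then rank the six features, mirroring the Figure 13 procedure.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier, RandomForestClassifier

X, y = make_classification(n_samples=300, n_features=6, n_informative=4,
                           n_classes=4, random_state=0)
models = [RandomForestClassifier(random_state=0).fit(X, y),
          GradientBoostingClassifier(random_state=0).fit(X, y)]
mean_imp = np.mean([m.feature_importances_ for m in models], axis=0)
ranking = np.argsort(mean_imp)[::-1]  # feature indices (X1..X6) by importance
```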
Combined with the results of the feature importance ranking and the correlation analysis, this section selected different feature inputs as new samples for comparing the performance of the stacking, voting, and integrated algorithm models: the model obtained with the feature parameters X1, X2, X3, X4, X5, and X6 was named M-1; with X1, X2, X3, X4, and X6, M-2; with X2, X3, X4, X5, and X6, M-3; with X1, X2, X3, X4, and X5, M-4; with X1, X2, X4, X5, and X6, M-5; and with X1, X2, X3, X5, and X6, M-6. The input models were clf1, clf2, clf3, clf4, Stacking26, and Voting15; the testing accuracy, precision, recall, and F1 score of each model are shown in Figure 14.
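The M-1~M-6 sample combinations amount to column subsets of the feature matrix, each scored with the same model. A sketch with a single random forest standing in for the six input models, on synthetic data (the zero-based column indices follow the paper's X1..X6 naming):

```python
# Sketch: build the M-1..M-6 feature subsets and score one model on each,
# mirroring the Section 5.5 comparison.
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

X, y = make_classification(n_samples=400, n_features=6, n_informative=4,
                           n_classes=4, random_state=0)
feature_sets = {
    "M-1": [0, 1, 2, 3, 4, 5],   # X1..X6 (full set)
    "M-2": [0, 1, 2, 3, 5],      # drop X5
    "M-3": [1, 2, 3, 4, 5],      # drop X1
    "M-4": [0, 1, 2, 3, 4],      # drop X6
    "M-5": [0, 1, 3, 4, 5],      # drop X3
    "M-6": [0, 1, 2, 4, 5],      # drop X4
}
scores = {name: cross_val_score(RandomForestClassifier(random_state=0),
                                X[:, cols], y, cv=5).mean()
          for name, cols in feature_sets.items()}
```

Comparing `scores` across subsets shows the sensitivity of a model to each dropped feature, which is how the X1 and X4 effects are read off Figure 14.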
From Figure 14, it can be seen that the evaluation indexes of each model followed similar overall trends, while the variability among algorithms within each index was large. The prediction results of the different algorithms were influenced by the samples: for dataset 1, clf4 had the best prediction performance, clf1 the worst, and the two fusion algorithms also outperformed clf1 and clf2. For dataset 2, clf4 had the best prediction performance, clf1 the worst, and the performance of Stacking26 was only better than clf1. For dataset 3, clf4 and Voting15 had similar prediction performance, with the F1 score showing that Voting15 performed slightly better than clf4. For dataset 4, clf1 performed better than on the other datasets, while clf2 performed the worst. For dataset 5, clf2 again performed the worst, while clf3 performed the best. For dataset 6, the prediction performance of all algorithms decreased, but overall clf4 still performed best. These observations first illustrate the good performance of clf4 in dealing with the rockburst prediction problem, and also show that the feature parameter X4 has a great influence on the output results, which coincides with the aforementioned feature importance ranking. Secondly, the algorithms showed reduced prediction performance on dataset 3 (lacking feature parameter X1) compared with the other datasets. In conclusion, the important feature parameters had a great influence on algorithm performance, with every algorithm being most sensitive to X1 and X4, and the integrated model performance of the voting and stacking fusion strategies was no worse than that of the traditional integrated algorithms.

6. Conclusions

When a single model is used for rockburst prediction, the prediction capability is usually poor due to the weak generalization of the model. In this paper, we introduced machine learning algorithms and the K-fold cross-validation method into rockburst intensity level prediction, established stacking and voting classification prediction models, and analyzed and verified the prediction models on 632 sets of data; the conclusions are as follows.
(1)
In this paper, machine learning algorithms and five-fold cross-validation methods were introduced into rockburst intensity prediction, 12 basic algorithm models were introduced, and 632 sets of data obtained by the SMOTE oversampling method were comparatively evaluated. The results showed that, compared with the basic algorithms, the traditional integrated algorithms achieved good prediction performance and showed good generalization ability.
(2)
From the 35 stacking models obtained, it can be seen that the performance of the different fusion strategies in dealing with the rockburst level prediction problem varied greatly and was not always better than that of the traditional integration algorithms. With integration algorithms as the meta model, the overall improvement in prediction performance was smaller when the base models were also integration algorithms than when they were basic algorithms. Likewise, with basic algorithms as the meta model, the stacking fusion models whose base classifiers were basic algorithms improved less overall than those whose base classifiers were integrated algorithms. The stacking algorithms achieved better prediction results for low-level rockburst events, and their overall prediction performance was higher than that of some of the individual integrated algorithms.
(3)
Stacking integration algorithms can fuse independent machine learning algorithms and give full play to the strengths of each. When the meta model was a basic algorithm, the new feature information provided by the base models was more accurate than the original feature information for the basic algorithm in the meta model, because basic algorithms process the feature information contained in the data differently from integrated algorithms and accept it with a different capacity; the stacking integrated algorithm could therefore achieve better results. In addition, the stacking fusion strategy obtained better prediction results for low-intensity rockburst events, and its overall prediction performance was higher than that of some individual integration algorithms.
(4)
Different voting strategies behaved differently. With integrated algorithms as the base models, the accuracy and the other evaluation metrics decreased for the new fusion models obtained by each voting strategy. With basic algorithms as the base models, the accuracy and the other evaluation metrics increased. For the weighted fusion of the integrated algorithm models + basic algorithm models, the values of all metrics increased whichever metric was used for the weighting operation.
(5)
The influence of the different rockburst feature indicators on the model prediction results was analyzed, and the four integrated algorithms and the best-performing fusion strategy models were selected for comparison, confirming the good performance of the CatBoost algorithm in dealing with the rockburst prediction problem. When different feature parameters were selected as input datasets for analysis, the important feature parameters were found to have a great influence on the predictive performance of the algorithms; the maximum tangential stress σθ and the elastic energy index Wet have a particularly important impact on rockburst propensity.
(6)
As the amount of rockburst data continues to grow, problems such as outliers, missing values, and data imbalance appear in the original data set. Therefore, this paper adopted an integrated fusion method to combine multiple machine-learning algorithms into a better-performing rockburst risk prediction model. Because different machine learning methods have their own advantages and disadvantages in computation, combining appropriate machine learning algorithms can mitigate the problems that occur in processing the data. Therefore, considering multiple rockburst prediction indexes, the common machine learning algorithms were combined at this stage to give full play to the advantages of each. In addition, the combination method chosen in this paper has not been reported in previous studies and is thus a new attempt.

Author Contributions

Conceptualization, J.W. and X.Y.; methodology, J.W.; software, J.W.; validation, J.W., H.M. and X.Y.; investigation, X.Y.; resources, J.W.; writing—original draft preparation, J.W.; writing—review and editing, J.W.; visualization, J.W.; supervision, J.W. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Data Availability Statement

Publicly available datasets were analyzed in this study. These data can be found in references [23,31,33,40,41,42,43,44,45,46,47,48,49,50,51,52,53,54,55,56,57,58,59,60,61,62,63,64,65,66,67,68,69,70,71,72,73,74,75,76,77,78,79,80,81].

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. He, J.; Dou, L.; Gong, S.; Li, J.; Ma, Z. Rock burst assessment and prediction by dynamic and static stress analysis based on micro-seismic monitoring. Int. J. Rock Mech. Min. Sci. 2017, 93, 46–53. [Google Scholar] [CrossRef]
  2. Si, X.; Gong, F. Strength-weakening effect and shear-tension failure mode transformation mechanism of rockburst for fine-grained granite under triaxial unloading compression. Int. J. Rock Mech. Min. Sci. 2020, 131, 104347. [Google Scholar] [CrossRef]
  3. Dong, L.; Tong, X.; Li, X.; Zhou, J.; Wang, S.; Liu, B. Some developments and new insights of environmental problems and deep mining strategy for cleaner production in mines. J. Clean. Prod. 2018, 210, 1562–1578. [Google Scholar] [CrossRef]
  4. Zhou, Z.; Cai, X.; Li, X.; Cao, W.; Du, X. Dynamic Response and Energy Evolution of Sandstone Under Coupled Static–Dynamic Compression: Insights from Experimental Study into Deep Rock Engineering Applications. Rock Mech. Rock Eng. 2019, 53, 1305–1331. [Google Scholar] [CrossRef]
  5. Dong, L.; Chen, Y.; Sun, D.; Zhang, Y. Implications for rock instability precursors and principal stress direction from rock acoustic experiments. Int. J. Min. Sci. Technol. 2021, 31, 789–798. [Google Scholar] [CrossRef]
  6. Dong, L.; Yang, L.; Chen, Y. Acoustic Emission Location Accuracy and Spatial Evolution Characteristics of Granite Fracture in Complex Stress Conditions. Rock Mech. Rock Eng. 2022, 1–18. [Google Scholar] [CrossRef]
  7. Russenes, B.F. Analysis of rock spalling for tunnels in steep valley sides. Nor. Inst. Technol. 1974. [Google Scholar]
  8. Barton, N.; Lien, R.; Lunde, J. Engineering classification of rock masses for the design of tunnel support. Rock Mech. Rock Eng. 1974, 6, 189–236. [Google Scholar] [CrossRef]
  9. Hoek, E.; Brown, E.T. Underground Excavation in Rock; Institute of Mining and Metallurgy: London, UK, 1980. [Google Scholar]
  10. Kidybiński, A. Bursting liability indices of coal. Int. J. Rock Mech. Min. Sci. Geomech. Abstracts. Pergamon 1981, 18, 295–304. [Google Scholar] [CrossRef]
  11. Wang, J.A.; Park, H.D. Comprehensive prediction of rockburst based on analysis of strain energy in rocks. Tunn. Undergr. Space Technol. 2001, 16, 49–57. [Google Scholar] [CrossRef]
  12. Tarasov, B.G.; Randolph, M.F. Superbrittleness of rocks and earthquake activity. Int. J. Rock Mech. Min. Sci. 2011, 48, 888–898. [Google Scholar] [CrossRef]
  13. Du, C.; Pan, Y.; Liu, Q.; Huang, X.; Yin, X. Rockburst inoculation process at different structural planes and microseismic warning technology: A case study. Bull. Eng. Geol. Environ. 2022, 81, 1–18. [Google Scholar] [CrossRef]
  14. Xie, H. Fractal characteristics and mechanism of rockburst. Chin. J. Rock Mech. Eng. 1993, 12, 28–37. (in Chinese). [Google Scholar]
  15. Qiao, C. Research on the Prediction of Rockburst Hazard in Deep-Buried Tunnels Based on Catastrophe Theory; Hebei University of Engineering: Handan, China, 2018. [Google Scholar]
  16. Cai, W.; Dou, L.; Zhang, M.; Cao, W.; Shi, J.-Q.; Feng, L. A fuzzy comprehensive evaluation methodology for rock burst forecasting using microseismic monitoring. Tunn. Undergr. Space Technol. 2018, 80, 232–245. [Google Scholar] [CrossRef]
  17. Cai, W.; Dou, L.; Si, G.; Cao, A.; He, J.; Liu, S. A principal component analysis/fuzzy comprehensive evaluation model for coal burst liability assessment. Int. J. Rock Mech. Min. Sci. 2016, 81, 62–69. [Google Scholar] [CrossRef]
  18. Dong, L.-J.; Zhou, Y.; Deng, S.-J.; Wang, M.; Sun, D.-Y. Evaluation methods of man-machine-environment system for clean and safe production in phosphorus mines: A case study. J. Central South Univ. 2021, 28, 3856–3870. [Google Scholar] [CrossRef]
  19. Wang, M.; Jin, J.; Li, L. SPA-VFS Model for the Prediction of Rockburst. In Proceedings of the 2008 Fifth International Conference on Fuzzy Systems and Knowledge Discovery, Jinan, China, 18–20 October 2008; Volume 5, pp. 34–38. [Google Scholar]
  20. Liang, W.; Zhao, G.; Wu, H.; Dai, B. Risk assessment of rockburst via an extended MABAC method under fuzzy environment. Tunn. Undergr. Space Technol. 2018, 83, 533–544. [Google Scholar] [CrossRef]
  21. Wang, J.; Huang, M.; Guo, J. Rock Burst Evaluation Using the CRITIC Algorithm-Based Cloud Model. Front. Phys. 2021, 8, 593701. [Google Scholar] [CrossRef]
  22. Liu, R.; Ye, Y.; Hu, N.; Chen, H.; Wang, X. Classified prediction model of rockburst using rough sets-normal cloud. Neural Comput. Appl. 2018, 31, 8185–8193. [Google Scholar] [CrossRef]
  23. Xue, Y.; Li, Z.; Li, S.; Qiu, D.; Tao, Y.; Wang, L.; Yang, W.; Zhang, K. Prediction of rock burst in underground caverns based on rough set and extensible comprehensive evaluation. Bull. Eng. Geol. Environ. 2017, 78, 417–429. [Google Scholar] [CrossRef]
  24. Liu, L.; Chen, Z.; Wang, L. Rock burst laws in deep mines based on combined model of membership function and dominance-based rough set. J. Cent. South Univ. 2015, 22, 3591–3597. [Google Scholar] [CrossRef]
  25. Zhang, M. Prediction of rockburst hazard based on particle swarm algorithm and neural network. Neural Comput. Appl. 2021, 34, 2649–2659. [Google Scholar] [CrossRef]
  26. Feng, G.L.; Xia, G.Q.; Chen, B.R.; Xiao, Y.X.; Zhou, R.C. A Method for Rockburst Prediction in the Deep Tunnels of Hydropower Stations Based on the Monitored Microseismicity and an Optimized Probabilistic Neural Network Model. Sustainability 2019, 11, 3212. [Google Scholar] [CrossRef]
  27. Zhou, J.; Li, X.; Shi, X. Long-term prediction model of rockburst in underground openings using heuristic algorithms and support vector machines. Saf. Sci. 2012, 50, 629–644. [Google Scholar] [CrossRef]
  28. Ji, B.; Xie, F.; Wang, X.; He, S.; Song, D. Investigate Contribution of Multi-Microseismic Data to Rockburst Risk Prediction Using Support Vector Machine with Genetic Algorithm. IEEE Access 2020, 8, 58817–58828. [Google Scholar] [CrossRef]
  29. Dong, L.-J.; Li, X.-B.; Peng, K. Prediction of rockburst classification using Random Forest. Trans. Nonferrous Met. Soc. China 2013, 23, 472–477. [Google Scholar] [CrossRef]
  30. Lan, T.; Zhang, Z.; Sun, J.; Zhao, W.; Zhang, M.; Jia, W.; Liu, M.; Guo, X. Regional prediction and prevention analysis of rockburst hazard based on the Gaussian process for binary classification. Front. Earth Sci. 2022, 10, 1779. [Google Scholar] [CrossRef]
  31. Li, N.; Feng, X.; Jimenez, R. Predicting rock burst hazard with incomplete data using Bayesian networks. Tunn. Undergr. Space Technol. 2017, 61, 61–70. [Google Scholar] [CrossRef]
  32. Kong, G.; Xia, Y.; Qiu, C. Cost-sensitive Bayesian network classifiers and their applications in rock burst prediction. In International Conference on Intelligent Computing; Springer: Cham, Switzerland, 2014; pp. 101–112. [Google Scholar]
  33. Dong, L.; Wesseloo, J.; Potvin, Y.; Li, X. Discrimination of Mine Seismic Events and Blasts Using the Fisher Classifier, Naive Bayesian Classifier and Logistic Regression. Rock Mech. Rock Eng. 2015, 49, 183–211. [Google Scholar] [CrossRef]
  34. Dong, L.J.; Wesseloo, J.; Potvin, Y.; Li, X.B. Discriminant models of blasts and seismic events in mine seismology. Int. J. Rock Mech. Min. Sci. 2016, 86, 282–291. [Google Scholar] [CrossRef]
  35. Li, N.; Jimenez, R. A logistic regression classifier for long-term probabilistic prediction of rock burst hazard. Nat. Hazards 2017, 90, 197–215. [Google Scholar] [CrossRef]
  36. Saghatforoush, A.; Monjezi, M.; Faradonbeh, R.S.; Armaghani, D.J. Combination of neural network and ant colony optimization algorithms for prediction and optimization of flyrock and back-break induced by blasting. Eng. Comput. 2015, 32, 255–266. [Google Scholar] [CrossRef]
  37. Dumitru, C.; Maria, V. Advantages and Disadvantages of Using Neural Networks for Predictions. Ovidius Univ. Ann. Ser. Econ. Sci. 2013, 13, 444–449. [Google Scholar]
  38. Karamizadeh, S.; Abdullah, S.M.; Halimi, M.; Shayan, J.; Javad Rajabi, M. Advantage and drawback of support vector machine functionality. In Proceedings of the 2014 International Conference on Computer, Communications, and Control Technology (I4CT), Langkawi, Malaysia, 2–4 September 2014; IEEE: Manhattan, NY, USA, 2014; pp. 63–65. [Google Scholar]
  39. Ahmad, M.W.; Mourshed, M.; Rezgui, Y. Trees vs Neurons: Comparison between random forest and ANN for high-resolution prediction of building energy consumption. Energy Build. 2017, 147, 77–89. [Google Scholar] [CrossRef]
  40. Costabal, F.S.; Perdikaris, P.; Kuhl, E.; Hurtado, D.E. Multi-fidelity classification using Gaussian processes: Accelerating the prediction of large-scale computational models. Comput. Methods Appl. Mech. Eng. 2019, 357, 112602. [Google Scholar] [CrossRef]
  41. Xue, Y.; Bai, C.; Qiu, D.; Kong, F.; Li, Z. Predicting rockburst with database using particle swarm optimization and extreme learning machine. Tunn. Undergr. Space Technol. 2020, 98, 103287. [Google Scholar] [CrossRef]
  42. Pu, Y.; Apel, D.B.; Xu, H. Rockburst prediction in kimberlite with unsupervised learning method and support vector classifier. Tunn. Undergr. Space Technol. 2019, 90, 12–18. [Google Scholar] [CrossRef]
  43. Wu, S.; Wu, Z.; Zhang, C. Rock burst prediction probability model based on case analysis. Tunn. Undergr. Space Technol. 2019, 93, 103069. [Google Scholar] [CrossRef]
  44. Xu, C.; Liu, X.; Wang, E.; Zheng, Y.; Wang, S. Rockburst prediction and classification based on the ideal-point method of information theory. Tunn. Undergr. Space Technol. 2018, 81, 382–390. [Google Scholar] [CrossRef]
  45. Xie, X.; Jiang, W.; Guo, J. Research on Rockburst Prediction Classification Based on GA-XGB Model. IEEE Access 2021, 9, 83993–84020. [Google Scholar] [CrossRef]
  46. Zhang, J.; Wang, Y.; Sun, Y.; Li, G. Strength of ensemble learning in multiclass classification of rockburst intensity. Int. J. Numer. Anal. Methods Géoméch. 2020, 44, 1833–1853. [Google Scholar] [CrossRef]
  47. Wang, S.-M.; Zhou, J.; Li, C.-Q.; Armaghani, D.J.; Li, X.-B.; Mitri, H.S. Rockburst prediction in hard rock mines developing bagging and boosting tree-based ensemble techniques. J. Central South Univ. 2021, 28, 527–542. [Google Scholar] [CrossRef]
  48. Tan, W.; Hu, N.; Ye, Y.; Wu, M.; Huang, Z.; Wang, X. Rockburst intensity classification prediction based on four ensemble learning. Chin. J. Rock Mech. Eng. 2022, 41 (Suppl. S2), 3250–3259. [Google Scholar]
  49. Liu, D.; Dai, Q.; Zuo, J.; Shang, Q.; Chen, G.; Guo, Y. Research on rockburst grade prediction based on stacking integrated algorithm. Chin. J. Rock Mech. Eng. 2022, 41 (Suppl. S1), 2915–2926. [Google Scholar]
  50. Wang, Y.H.; Li, W.D.; Li, Q.G. Fuzzy mathematics comprehensive evaluation method for rockburst prediction. Chin. J. Rock Mech. Eng. 1998, 15–23. [Google Scholar]
  51. Bai, M.Z.; Wang, L.J.; Xu, Z.Y. Study on a Neural Network Model and its Application in Predicting the Risk of Rock Blast. China Saf. Sci. J. 2002, 12, 68–72. [Google Scholar]
  52. Gong, F.; Li, X. A distance discriminant analysis method for prediction of possibility and classification of rockburst and its application. Chin. J. Rock Mech. Eng. 2007, 26, 1012–1018. [Google Scholar]
  53. Wang, J.L.; Chen, J.P.; Yang, J.; Que, J.S. Method of distance discriminant analysis for determination of classification of rockburst. Rock Soil Mech. 2009, 30, 2203–2208. [Google Scholar]
  54. Gong, F.Q.; Li, X.B.; Zhang, W. Rockburst prediction of underground engineering based on Bayes discriminant analysis method. Rock Soil Mech. 2010, 31 (Suppl. S1), 370–377. [Google Scholar]
  55. Kang, Y. Research on the Failure Mechanism of Surrounding Rock in Deep Tunnels. Ph.D. Thesis, Chongqing University, Chongqing, China, 2006. [Google Scholar]
  56. He, Z.; Li, X.; Lu, Y.; Wang, X. Application of BP neural network model in rockburst prediction in deep tunnels. Chin. J. Undergr. Space Eng. 2008, 3, 494–498. [Google Scholar]
  57. Zhang, L.W.; Zhang, D.Y.; Qiu, D.H. Application of extension evaluation method in rockburst prediction based on rough set theory. J. China Coal Soc. 2010, 35, 1461–1465. [Google Scholar]
  58. Yi, Y.; Cao, P.; Pu, C. Multi-factorial comprehensive estimation for Jinchuan’s deep typical rockburst tendency. Sci. Technol. Rev. 2010, 28, 76–80. [Google Scholar]
  59. Ding, X.D.; Wu, J.M.; Li, J.; Liu, C.J. Artificial neural network for forecasting and classification of rockbursts. J. Hohai Univ. Nat. Sci. 2003, 31, 424–427. [Google Scholar]
  60. Jin-Lin, Y.; Xi-Bing, L.; Zi-Long, Z.; Ye, L. A Fuzzy assessment method of rock-burst prediction based on rough set theory. Met. Mine 2010, 39, 26–29. [Google Scholar]
  61. Feng, X.T.; Wang, L.N. Rockburst prediction based on neural networks. Trans. Nonferrous Met. Soc. China 1994, 4, 7–14. [Google Scholar]
  62. Zhang, J.F. Research on Staged Prediction and Control Technology of Rock Burst Disaster in Daxiangling Tunnel. Master’s Thesis, Southwest Jiaotong University, Chengdu, China, 2010. [Google Scholar]
  63. Liang, Z.Y. Research on Rockburst Prediction and Prevention Countermeasures in the Diversion Tunnel of Jinping II Hydropower Station. Master’s Thesis, Chengdu University of Technology, Chengdu, China, 2004. [Google Scholar]
  64. Xu, M.G.; Du, Z.J.; Yao, G.H.; Liu, Z.P. Rockburst prediction of chengchao iron mine during deep mining. Chin. J. Rock Mech. Eng. 2008, 27 (Suppl. S1), 2918–2921. [Google Scholar]
  65. Ran, L.; Yi-Cheng, Y.; Guang-Quan, Z.; Nan, Y.; Hu, C.; Qi-Hu, W. Grading prediction model of rockburst based on rough set-multidimensional normal cloud. Met. Mine 2019, 48, 48–55. [Google Scholar]
  66. Wang, Y.; Xu, Q.; Chai, H.; Liu, L.; Xia, Y.; Wang, X. Rock burst prediction in deep shaft based on RBF-AR model. J. Jilin Univ. Earth Sci. Ed. 2013, 43, 1943–1949. [Google Scholar]
  67. Wang, Y.C.; Shang, Y.Q.; Sun, H.Y.; Yan, X.S. Study of prediction of rockburst intensity based on efficacy coefficient method. Rock Soil Mech. 2010, 31, 529–534. [Google Scholar]
  68. Song, C.S.; Li, D.H. Artificial neural networks for predicting rockburst in deep mining. J. Henan Polytech. Univ. Nat. Sci. 2007, 26, 365–369. [Google Scholar]
  69. Liu, Z.J.; Yuan, Q.P.; Li, J.L. Application of fuzzy probability model to prediction of classification of rockburst intensity. Chin. J. Rock Mech. Eng. 2008, 27 (Suppl. S1), 3095–3103. [Google Scholar]
  70. Li, S.L.; Feng, X.T.; Wang, Y.J.; Yang, N. Evaluation of rockburst tendency of hard rock in deep well. J. Northeast. Univ. 2001, 22, 60–63. [Google Scholar]
  71. Tang, S.H.; Wu, Z.J.; Chen, X.H. Approach to occurrence and mechanism of rockburst in deep underground mines. Chin. J. Rock Mech. Eng. 2003, 8. [Google Scholar]
  72. Cai, S.J.; Zhang, L.; Zhou, W. Research on prediction of rock burst in deep hard-rock mines. J. Saf. Sci. Technol. 2005, 1, 17–20. [Google Scholar]
  73. Qin, S.W.; Chen, J.P.; Wang, Q.; Qiu, D.H. Research on rockburst prediction with extenics evaluation based on rough set. In Proceedings of the 13th International Symposium on Rockburst and Seismicity in Mines; Rinton Press: Dalian, China, 2009; pp. 937–944. [Google Scholar]
  74. Zhang, L.X.; Li, C.H. Study on Tendency Analysis of Rockburst and Comprehensive Prediction of Different Types of Surrounding Rock; Rinton Press: Princeton, NJ, USA, 2009. [Google Scholar]
  75. Zhang, Z.L. Research on Rockburst and Large Deformation Prediction of Xuefengshan Tunnel of Shaohuai Expressway. Master’s Thesis, Chengdu University of Technology, Chengdu, China, 2002. [Google Scholar]
  76. Liu, J.P. Research on the Relationship between Ground Pressure Activity and Microseismic Temporal and Spatial Evolution in Deep Mines. Master’s Thesis, Northeastern University, Shenyang, China, 2011. [Google Scholar]
  77. Zhang, B. Research on Safety and Stability of Deep Buried Highway Tunnel based on Rock Mass Anisotropy. Ph.D. Thesis, Graduate School of Chinese Academy of Sciences (Wuhan Institute of Rock and Soil Mechanics), Wuhan, China, 2007. [Google Scholar]
  78. Xiao, X.P. Research on Prediction and Prevention of Rock Burst in Traffic Tunnel of Jinping II Hydropower Station. Master’s Thesis, Southwest Jiaotong University, Chengdu, China, 2005. [Google Scholar]
  79. Jiang, L.F. Research on Rockburst Prediction and Prevention in Anlu Tunnel of Guangkun Railway. Master’s Thesis, Southwest Jiaotong University, Chengdu, China, 2008. [Google Scholar]
  80. Wang, G.Y.; Zhang, S.X.; Ren, G.F. Analysis and prediction of rock burst in deep mining of Tonglushan copper-iron ore. Min. Saf. Environ. Prot. 2005, 32, 20–22. [Google Scholar]
  81. Guo, C.; Zhang, Y.S.; Deng, H.; Su, Z.; Sun, D. Study on rock burst prediction in the deep-buried tunnel at Gaoligong Mountain based on the rock proneness. Geotech. Investig. Surv. 2011, 39, 8–13. [Google Scholar]
  82. Zhang, C.Q.; Zhou, H.; Feng, X.T. An index for estimating the stability of brittle surrounding rock mass: FAI and its engineering application. Rock Mech. Rock Eng. 2011, 44, 401–414. [Google Scholar] [CrossRef]
  83. Zhang, C.; Feng, X.T.; Zhou, H.; Qiu, S.; Wu, W. Case histories of four extremely intense rockbursts in deep tunnels. Rock Mech. Rock Eng. 2012, 45, 275–288. [Google Scholar] [CrossRef]
  84. Sun, H.F.; Li, S.C.; Qiu, D.H.; Zhang, L.W.; Zhang, N. Application of extensible comprehensive evaluation to rockburst prediction in a relative shallow chamber. In Proc., RaSiM7 (2009): Controlling Seismic Hazard and Sustainable Development of Deep Mines; Tang, C.A., Ed.; Rinton Press: Princeton, NJ, USA, 2009; pp. 777–784. [Google Scholar]
  85. Yu, X.Z. Development of Database Management System for Highway Tunnel Geological Disaster Prediction and Treatment Measures. Master’s Thesis, Chongqing University, Chongqing, China, 2009. [Google Scholar]
  86. Li, L. Study on Scheme Optimization and Rockburst Prediction in Deep Mining in Xincheng Gold Mine. Ph.D. Thesis, University of Science and Technology Beijing, Beijing, China, 2009. [Google Scholar]
  87. Zhang, Y.L.; Liu, X.; Hu, Z.Q. Rock burst forecast based on artificial neural network in underground engineering. Hunan Nonferrous Met. 2007, 23, 1–4. [Google Scholar]
  88. Zhao, X.F. Study on the High Geo-Stress and Rockburst of the Deep-Lying Long Tunnel. Master’s Thesis, North China University of Water Resources and Electric Power, Zhengzhou, China, 2007. [Google Scholar]
  89. Hao, J. Quality Evaluation and Stability Study of Surrounding Rock during Construction of Tunnel in High Ground Stress Area. Ph.D. Thesis, Xinjiang Agricultural University, Urumqi, China, 2015. [Google Scholar]
  90. Sun, C.S. Tunnel rockburst prediction model based on improved MATLAB-BP neural network algorithm. Chongqing Jiaotong Univ. (Nat. Sci. Ed.) 2019, 38, 41–49. [Google Scholar]
  91. Zhou, J.; Li, X.; Mitri, H.S. Classification of rockburst in underground projects: Comparison of ten supervised learning methods. J. Comput. Civ. Eng. 2016, 30, 04016003. [Google Scholar] [CrossRef]
  92. Sun, L.; Hu, N.; Ye, Y.; Tan, W.; Wu, M.; Wang, X.; Huang, Z. Ensemble stacking rockburst prediction model based on Yeo–Johnson, K-means SMOTE, and optimal rockburst feature dimension determination. Sci. Rep. 2022, 12, 15352. [Google Scholar] [CrossRef]
  93. Liang, W.; Sari, Y.A.; Zhao, G.; McKinnon, S.D.; Wu, H. Probability estimates of short-term rockburst risk with ensemble classifiers. Rock Mech. Rock Eng. 2021, 54, 1799–1814. [Google Scholar] [CrossRef]
  94. Kumar, R. Machine Learning Quick Reference: Quick and Essential Machine Learning Hacks for Training Smart Data Models; Packt Publishing Ltd.: Birmingham, UK, 2019. [Google Scholar]
  95. Jung, Y. Multiple predicting K-fold cross-validation for model selection. J. Nonparametric Stat. 2018, 30, 197–215. [Google Scholar] [CrossRef]
  96. Chawla, N.V.; Bowyer, K.W.; Hall, L.O.; Kegelmeyer, W.P. SMOTE: Synthetic minority over-sampling technique. J. Artif. Intell. Res. 2002, 16, 321–357. [Google Scholar] [CrossRef]
Figure 1. Distribution of rockburst levels in the original sample database.
Figure 2. Sample distribution and box plots for different rockburst risk levels.
Figure 3. Correlations between the characteristic indicators.
Figure 4. Stacking working principle.
Figure 5. Rockburst risk prediction process based on integrated algorithms.
Figure 6. Stacking integration algorithm fusion strategy and process.
Figure 7. The accuracy of the test set and training set of the base algorithms.
Figure 8. Heat map of the confusion matrix for the stacking fusion integration algorithm model.
Figure 9. The improvement rate of each evaluation index of the stacking algorithm, with the basic algorithm models as the meta model, compared to the corresponding basic models.
Figure 10. Comparison of voting-15 and the single integrated algorithms.
Figure 11. Accuracy improvement with clf8 as the meta model compared with the basic algorithms.
Figure 12. Comparison of the predictive performance of stacking-26 and the integrated algorithms at different rockburst levels.
Figure 13. Importance and average importance of each feature index based on the traditional integration algorithms.
Figure 14. Rockburst prediction performance of each classifier using different input samples.
Table 1. Characteristics of statistical parameters of the indicators for rockburst.

| Statistical Parameters | X1 (σθ, MPa) | X2 (σc, MPa) | X3 (σt, MPa) | X4 (Wet) | X5 (σθ/σc) | X6 (σc/σt) |
|---|---|---|---|---|---|---|
| Average | 59.97 | 115.32 | 7.04 | 4.99 | 0.64 | 20.54 |
| S.D. | 49.23 | 47.76 | 4.28 | 2.95 | 0.78 | 13.03 |
| Min. | 2.60 | 15.50 | 0.38 | 0.70 | 0.04 | 0.15 |
| 25% quantile | 32.08 | 83.60 | 3.72 | 2.92 | 0.29 | 11.54 |
| 50% quantile | 50.44 | 111.50 | 6.30 | 4.80 | 0.46 | 17.50 |
| 75% quantile | 68.00 | 144.40 | 9.25 | 6.31 | 0.64 | 25.00 |
| Max. | 297.80 | 304.20 | 22.60 | 21.00 | 5.26 | 80.00 |
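The SMOTE method cited in the abstract [96] balances the minority rockburst classes by interpolating between a minority sample and one of its k nearest minority neighbours. Below is a minimal pure-NumPy sketch of that interpolation step, not the paper's implementation; the sample rows are illustrative values loosely modelled on the six indicators of Table 1, not the actual minority-class data.

```python
import numpy as np

def smote_like_oversample(X_min, n_new, k=3, seed=None):
    """Create synthetic minority samples by interpolating between a random
    minority sample and one of its k nearest minority neighbours."""
    rng = np.random.default_rng(seed)
    synthetic = []
    for _ in range(n_new):
        i = rng.integers(len(X_min))
        d = np.linalg.norm(X_min - X_min[i], axis=1)   # distances to sample i
        nn = np.argsort(d)[1:k + 1]                    # k nearest (skip self)
        j = rng.choice(nn)
        gap = rng.random()                             # interpolation factor
        synthetic.append(X_min[i] + gap * (X_min[j] - X_min[i]))
    return np.array(synthetic)

# Illustrative minority-class rows over the six indicators (not the real data)
X_min = np.array([[59.97, 115.32, 7.04, 4.99, 0.64, 20.54],
                  [50.44, 111.50, 6.30, 4.80, 0.46, 17.50],
                  [68.00, 144.40, 9.25, 6.31, 0.64, 25.00],
                  [32.08, 83.60, 3.72, 2.92, 0.29, 11.54]])
X_new = smote_like_oversample(X_min, n_new=6, seed=0)
```

Because each synthetic row is a convex combination of two existing minority rows, every generated value stays within the observed range of its indicator.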
Table 2. Hyperparameter optimization results of the base algorithm models.

| Classification | Method | Hyperparameter | Search Scope (Start, Stop, Step) | Optimal Value |
|---|---|---|---|---|
| Integration algorithm | Random Forest | max_depth | (10, 100, 5) | 25 |
| | | min_samples_leaf | (2, 11, 1) | 3 |
| | | min_samples_split | (2, 11, 1) | 9 |
| | LightGBM | num_leaves | (2, 100, 1) | 81 |
| | | n_estimators | (200, 2000, 50) | 1600 |
| | | subsample | (0.1, 1, 0.1) | 0.8 |
| | | max_depth | (10, 100, 5) | 75 |
| | | learning_rate | (0.01, 0.2, 0.01) | 0.09 |
| | XGBoost | n_estimators | (200, 2000, 50) | 1700 |
| | | learning_rate | (0.01, 0.2, 0.01) | 0.05 |
| | | max_depth | (10, 100, 5) | 25 |
| | | colsample_bytree | (0.1, 1, 0.1) | 0.7 |
| | | subsample | (0.1, 1, 0.1) | 0.8 |
| | CatBoost | random_strength | (2, 10, 1) | 3 |
| | | learning_rate | (0.1, 1, 0.1) | 0.2 |
| | | iterations | (200, 2000, 100) | 1900 |
| | | random_seed | (10, 100, 10) | 30 |
| | | depth | (1, 10, 1) | 6 |
| Basic algorithm | Decision Tree | min_samples_split | (2, 11, 1) | 10 |
| | | max_depth | (10, 100, 5) | 85 |
| | | min_samples_leaf | (2, 11, 1) | 2 |
| | KNN | n_neighbors | (1, 20, 1) | 1 |
| | SVM | C | (0.1, 10, 0.1) | 6.8 |
| | | gamma | [0.0001, 0.001, 0.01, 0.1, 1, 3, 5, 7] | 0.01 |
| | Logistic Regression | penalty | [‘l1’, ‘l2’] | ‘l2’ |
| | | C | (0.1, 10, 0.1) | 0.1 |
| | Multi-layer neural network | hidden_layer_sizes | (10, 100, 1) | 65 |
| | | learning_rate_init | (0.0001, 0.02, 0.0001) | 0.0145 |
| | | max_iter | (200, 2000, 100) | 1900 |
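The search scopes in Table 2 can be explored with the random search grid method combined with five-fold cross-validation, as the paper describes. Below is a minimal scikit-learn sketch for the Random Forest rows only; the synthetic data stands in for the 410-case rockburst set, so the recovered optimum will not match Table 2.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import RandomizedSearchCV

# Synthetic stand-in for the rockburst data (6 indicators, 4 intensity levels)
X, y = make_classification(n_samples=200, n_features=6, n_informative=4,
                           n_classes=4, random_state=0)

# Search scopes from the Random Forest rows of Table 2: (start, stop, step)
param_dist = {"max_depth": list(range(10, 101, 5)),
              "min_samples_leaf": list(range(2, 12)),
              "min_samples_split": list(range(2, 12))}

search = RandomizedSearchCV(RandomForestClassifier(random_state=0),
                            param_dist, n_iter=10, cv=5,   # five-fold CV
                            scoring="accuracy", random_state=0)
search.fit(X, y)
best = search.best_params_
```

Random search samples `n_iter` parameter combinations from the grid rather than enumerating all of them, which keeps the cost manageable for the larger LightGBM/XGBoost/CatBoost grids.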
Table 3. Rockburst prediction results of the 12 base algorithms.

| Algorithm | Accuracy | Precision | Recall | F1 score | cv_mean |
|---|---|---|---|---|---|
| clf1 | 0.8031 | 0.7977 | 0.8031 | 0.7952 | 0.7525 |
| clf2 | 0.8110 | 0.8169 | 0.8110 | 0.8092 | 0.7743 |
| clf3 | 0.8346 | 0.8349 | 0.8346 | 0.8296 | 0.7802 |
| clf4 | 0.8504 | 0.8493 | 0.8504 | 0.8465 | 0.7802 |
| clf5 | 0.7087 | 0.7135 | 0.7087 | 0.7036 | 0.6752 |
| clf6 | 0.7717 | 0.7624 | 0.7717 | 0.7650 | 0.7426 |
| clf7 | 0.7874 | 0.7823 | 0.7874 | 0.7838 | 0.7485 |
| clf8 | 0.5591 | 0.5671 | 0.5591 | 0.5608 | 0.5228 |
| clf9 | 0.6063 | 0.6158 | 0.6063 | 0.6082 | 0.5208 |
| clf10 | 0.6535 | 0.6848 | 0.6535 | 0.6497 | 0.4535 |
| clf11 | 0.7717 | 0.7659 | 0.7717 | 0.7599 | 0.7010 |
| clf12 | 0.5748 | 0.6410 | 0.5748 | 0.5829 | 0.5748 |
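The five evaluation indices of Table 3 (accuracy, precision, recall, F1 score, and the five-fold cross-validation mean) can be computed as in the following scikit-learn sketch; synthetic data and a single stand-in classifier replace the paper's twelve models, so the numbers will differ from the table.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import (accuracy_score, f1_score, precision_score,
                             recall_score)
from sklearn.model_selection import cross_val_score, train_test_split

# Synthetic stand-in data (6 indicators, 4 rockburst intensity levels)
X, y = make_classification(n_samples=300, n_features=6, n_informative=4,
                           n_classes=4, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

clf = RandomForestClassifier(random_state=0).fit(X_tr, y_tr)
pred = clf.predict(X_te)
scores = {
    "accuracy": accuracy_score(y_te, pred),
    # weighted averaging handles the multiclass, imbalanced setting
    "precision": precision_score(y_te, pred, average="weighted"),
    "recall": recall_score(y_te, pred, average="weighted"),
    "f1": f1_score(y_te, pred, average="weighted"),
    "cv_mean": cross_val_score(clf, X_tr, y_tr, cv=5).mean(),
}
```

The cv_mean column measures stability across the five folds, which is why it can rank the algorithms differently from the single held-out test accuracy.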
Table 4. Average accuracy of the stacking integration algorithm for rockburst events.

| Algorithm Type | Algorithm | Testing Accuracy | Stacking-I | Stacking-II |
|---|---|---|---|---|
| Integration Algorithms | clf1 | 0.8031 | 0.7714 | 0.8189 |
| | clf2 | 0.8110 | 0.7953 | 0.8031 |
| | clf3 | 0.8346 | 0.8189 | 0.8110 |
| | clf4 | 0.8504 | 0.8031 | 0.8268 |
| Basic Algorithms | clf5 | 0.7087 | 0.6772 | 0.7008 |
| | clf6 | 0.7717 | 0.7165 | 0.7087 |
| | clf7 | 0.7874 | 0.7874 | 0.7953 |
| | clf8 | 0.5591 | 0.7874 | 0.7874 |
| | clf9 | 0.6063 | 0.8031 | 0.8031 |
| | clf10 | 0.6535 | 0.7717 | 0.6850 |
| | clf11 | 0.7717 | 0.8110 | 0.7795 |
| | clf12 | 0.5748 | 0.7874 | 0.7953 |
Table 5. Average precision of the stacking integration algorithm for rockburst events.

| Algorithm Type | Algorithm | Precision | Stacking-I | Stacking-II |
|---|---|---|---|---|
| Integration Algorithms | clf1 | 0.7977 | 0.7853 | 0.8190 |
| | clf2 | 0.8169 | 0.8069 | 0.7998 |
| | clf3 | 0.8349 | 0.8293 | 0.8120 |
| | clf4 | 0.8493 | 0.8073 | 0.8262 |
| Basic Algorithms | clf5 | 0.7135 | 0.7093 | 0.7129 |
| | clf6 | 0.7624 | 0.7160 | 0.7180 |
| | clf7 | 0.7823 | 0.7907 | 0.7876 |
| | clf8 | 0.5671 | 0.7895 | 0.7822 |
| | clf9 | 0.6158 | 0.8128 | 0.8008 |
| | clf10 | 0.6848 | 0.7810 | 0.7117 |
| | clf11 | 0.7659 | 0.8179 | 0.7731 |
| | clf12 | 0.6410 | 0.7907 | 0.7945 |
Table 6. Average recall of the stacking integration algorithm for rockburst events.

| Algorithm Type | Algorithm | Recall | Stacking-I | Stacking-II |
|---|---|---|---|---|
| Integration Algorithms | clf1 | 0.8031 | 0.7717 | 0.8189 |
| | clf2 | 0.8110 | 0.7953 | 0.8031 |
| | clf3 | 0.8346 | 0.8189 | 0.8110 |
| | clf4 | 0.8504 | 0.8031 | 0.8268 |
| Basic Algorithms | clf5 | 0.7087 | 0.6772 | 0.7008 |
| | clf6 | 0.7717 | 0.7165 | 0.7087 |
| | clf7 | 0.7874 | 0.7874 | 0.7953 |
| | clf8 | 0.5591 | 0.7874 | 0.7874 |
| | clf9 | 0.6063 | 0.8031 | 0.8031 |
| | clf10 | 0.6535 | 0.7717 | 0.6850 |
| | clf11 | 0.7717 | 0.8110 | 0.7795 |
| | clf12 | 0.5748 | 0.7874 | 0.7953 |
Table 7. Average F1 values of the stacking integration algorithm for rockburst events.

| Algorithm Type | Algorithm | F1 Score | Stacking-I | Stacking-II |
|---|---|---|---|---|
| Integration Algorithms | clf1 | 0.7952 | 0.7748 | 0.8177 |
| | clf2 | 0.8092 | 0.7957 | 0.8012 |
| | clf3 | 0.8296 | 0.8198 | 0.8114 |
| | clf4 | 0.8465 | 0.8020 | 0.8263 |
| Basic Algorithms | clf5 | 0.7036 | 0.6808 | 0.6999 |
| | clf6 | 0.7650 | 0.7148 | 0.7102 |
| | clf7 | 0.7838 | 0.7883 | 0.7897 |
| | clf8 | 0.5608 | 0.7877 | 0.7841 |
| | clf9 | 0.6082 | 0.8062 | 0.8010 |
| | clf10 | 0.6497 | 0.7721 | 0.6931 |
| | clf11 | 0.7599 | 0.8133 | 0.7742 |
| | clf12 | 0.5829 | 0.7883 | 0.7927 |
Table 8. Rockburst prediction results of the voting-weighted algorithm based on average value.

| Model | Combination Strategy | Base Algorithms | Testing Accuracy | Precision | Recall | F1 score |
|---|---|---|---|---|---|---|
| voting 1 | Integrated algorithms | clf1~clf4 | 0.7874 | 0.7895 | 0.7874 | 0.7877 |
| voting 2 | Basic algorithms | clf5~clf12 | 0.7559 | 0.7617 | 0.7559 | 0.7571 |
| voting 3 | Integrated + basic algorithms | clf1~clf12 | 0.8110 | 0.8104 | 0.8110 | 0.8106 |
Table 9. Rockburst prediction results of the voting-weighted algorithm based on accuracy.

| Model | Combination Strategy | Base Algorithms | Testing Accuracy | Precision | Recall | F1 score |
|---|---|---|---|---|---|---|
| voting 4 | Integrated algorithms | clf1~clf4 | 0.7874 | 0.7895 | 0.7874 | 0.7877 |
| voting 5 | Basic algorithms | clf5~clf12 | 0.7559 | 0.7639 | 0.7559 | 0.7582 |
| voting 6 | Integrated + basic algorithms | clf1~clf12 | 0.8031 | 0.8028 | 0.8031 | 0.8028 |
Table 10. Rockburst prediction results of the voting-weighted algorithm based on precision.

| Model | Combination Strategy | Base Algorithms | Testing Accuracy | Precision | Recall | F1 score |
|---|---|---|---|---|---|---|
| voting 7 | Integrated algorithms | clf1~clf4 | 0.7874 | 0.7895 | 0.7874 | 0.7877 |
| voting 8 | Basic algorithms | clf5~clf12 | 0.7795 | 0.7717 | 0.7795 | 0.7736 |
| voting 9 | Integrated + basic algorithms | clf1~clf12 | 0.8110 | 0.8112 | 0.8110 | 0.8110 |
Table 11. Rockburst prediction results of the voting-weighted algorithm based on recall.

| Model | Combination Strategy | Base Algorithms | Testing Accuracy | Precision | Recall | F1 score |
|---|---|---|---|---|---|---|
| voting 10 | Integrated algorithms | clf1~clf4 | 0.7874 | 0.7895 | 0.7874 | 0.7877 |
| voting 11 | Basic algorithms | clf5~clf12 | 0.7717 | 0.7635 | 0.7717 | 0.7658 |
| voting 12 | Integrated + basic algorithms | clf1~clf12 | 0.8110 | 0.8112 | 0.8110 | 0.8110 |
Table 12. Rockburst prediction results of the voting-weighted algorithm based on F1 score.

| Model | Combination Strategy | Base Algorithms | Testing Accuracy | Precision | Recall | F1 score |
|---|---|---|---|---|---|---|
| voting 13 | Integrated algorithms | clf1~clf4 | 0.7874 | 0.7895 | 0.7874 | 0.7877 |
| voting 14 | Basic algorithms | clf5~clf12 | 0.7795 | 0.7715 | 0.7795 | 0.7733 |
| voting 15 | Integrated + basic algorithms | clf1~clf12 | 0.8268 | 0.8275 | 0.8268 | 0.8266 |
Table 13. Rockburst prediction results of the voting-weighted algorithm based on cv_mean.

| Model | Combination Strategy | Base Algorithms | Testing Accuracy | Precision | Recall | F1 score |
|---|---|---|---|---|---|---|
| voting 16 | Integrated algorithms | clf1~clf4 | 0.7874 | 0.7895 | 0.7874 | 0.7877 |
| voting 17 | Basic algorithms | clf5~clf12 | 0.7717 | 0.7645 | 0.7717 | 0.7667 |
| voting 18 | Integrated + basic algorithms | clf1~clf12 | 0.8268 | 0.8275 | 0.8268 | 0.8266 |
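The voting models in Tables 8–13 weight each base learner's vote by one of its own evaluation metrics. The scikit-learn sketch below shows soft voting weighted by each learner's macro F1 score (the Table 12 scheme); synthetic data and a reduced three-learner set stand in for clf1~clf12, and for brevity the weights are estimated on the same held-out split they are later scored on, a simplification of the paper's procedure.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier, VotingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import f1_score
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

X, y = make_classification(n_samples=300, n_features=6, n_informative=4,
                           n_classes=4, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

base = [("rf", RandomForestClassifier(random_state=0)),
        ("lr", LogisticRegression(max_iter=1000)),
        ("knn", KNeighborsClassifier())]

# Each learner's vote is weighted by its own macro F1 score
weights = [f1_score(y_te, clf.fit(X_tr, y_tr).predict(X_te), average="macro")
           for _, clf in base]
vote = VotingClassifier(base, voting="soft", weights=weights).fit(X_tr, y_tr)
acc = vote.score(X_te, y_te)
```

Swapping `f1_score` for accuracy, precision, recall, or the cross-validation mean reproduces the other weighting schemes of Tables 8–13.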
Table 14. Fusion strategy and calculation results based on logistic regression (LR) as the meta model.

| Model | First-Level Classifiers | Second-Level Classifier | Testing Accuracy | Precision | Recall | F1 score |
|---|---|---|---|---|---|---|
| Stacking 25 | clf1 | clf8 | 0.7795 | 0.7774 | 0.7795 | 0.7778 |
| Stacking 26 | clf1, clf2 | clf8 | 0.8189 | 0.8236 | 0.8189 | 0.8200 |
| Stacking 27 | clf1~clf3 | clf8 | 0.7795 | 0.7816 | 0.7795 | 0.7796 |
| Stacking 28 | clf1~clf4 | clf8 | 0.7874 | 0.7895 | 0.7874 | 0.7877 |
| Stacking 29 | clf1~clf5 | clf8 | 0.8031 | 0.8037 | 0.8031 | 0.8031 |
| Stacking 30 | clf1~clf6 | clf8 | 0.7953 | 0.7964 | 0.7953 | 0.7953 |
| Stacking 31 | clf1~clf7 | clf8 | 0.8110 | 0.8133 | 0.8110 | 0.8117 |
| Stacking 32 | clf1~clf7, clf9 | clf8 | 0.8031 | 0.8045 | 0.8031 | 0.8035 |
| Stacking 33 | clf1~clf7, clf9, clf10 | clf8 | 0.8031 | 0.8045 | 0.8031 | 0.8035 |
| Stacking 34 | clf1~clf7, clf9~clf11 | clf8 | 0.8189 | 0.8199 | 0.8189 | 0.8192 |
| Stacking 35 | clf1~clf3, clf9~clf12 | clf8 | 0.8031 | 0.8045 | 0.8031 | 0.8035 |
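The stacking models of Table 14 feed the first-level classifiers' out-of-fold predictions into logistic regression (clf8) as the meta model. A minimal scikit-learn sketch with two stand-in first-level learners and synthetic data, not the paper's exact configuration:

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier, StackingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=300, n_features=6, n_informative=4,
                           n_classes=4, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

# Two stand-in first-level learners; logistic regression as the meta model
stack = StackingClassifier(
    estimators=[("rf", RandomForestClassifier(random_state=0)),
                ("dt", DecisionTreeClassifier(random_state=0))],
    final_estimator=LogisticRegression(max_iter=1000),
    cv=5)  # out-of-fold first-level predictions train the meta model
stack.fit(X_tr, y_tr)
acc = stack.score(X_te, y_te)
```

The internal `cv=5` matters: training the meta model on out-of-fold predictions rather than in-sample ones is what prevents the first-level learners from leaking their training fit into the second level.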
Wang, J.; Ma, H.; Yan, X. Rockburst Intensity Classification Prediction Based on Multi-Model Ensemble Learning Algorithms. Mathematics 2023, 11, 838. https://doi.org/10.3390/math11040838