Application of a Multi-Algorithm-Optimized CatBoost Model in Predicting the Strength of Multi-Source Solid Waste Backfilling Materials

Qiu, Jianhui; Li, Jielin; Xiong, Xin; Zhou, Keping

doi:10.3390/bdcc9080203


School of Resources and Safety Engineering, Central South University, Changsha 410083, China

Author to whom correspondence should be addressed.

Big Data Cogn. Comput. 2025, 9(8), 203; https://doi.org/10.3390/bdcc9080203

Submission received: 29 June 2025 / Revised: 26 July 2025 / Accepted: 30 July 2025 / Published: 7 August 2025

(This article belongs to the Special Issue Applications of Artificial Intelligence and Data Management in Data Analysis)

Abstract

Backfill materials are widely used in mines to fill mined-out voids with mining waste, and the strength of the consolidated backfill formed with the binder directly influences the stability of the surrounding rock and production safety. The traditional approach to measuring backfill strength demands considerable manpower and time, so the rapid and precise acquisition and optimization of backfill strength parameters is of great significance for mining safety. In this research, the authors carried out a backfill strength experiment with five experimental parameters, namely concentration, cement–sand ratio, waste rock–tailing ratio, curing time, and curing temperature, using an orthogonal design. They collected 174 sets of backfill strength data and combined six population-based optimization algorithms, namely Artificial Ecosystem-based Optimization (AEO), Aquila Optimization (AO), Germinal Center Optimization (GCO), Sand Cat Swarm Optimization (SCSO), the Sparrow Search Algorithm (SSA), and the Walrus Optimization Algorithm (WaOA), with the CatBoost algorithm to predict backfill strength. The study also used the SHapley Additive exPlanations (SHAP) method to analyze the influence of the different parameters on the predicted backfill strength. The results demonstrate that, with a population size of 60, the AEO-CatBoost model fitted the data well (R² = 0.947, VAF = 93.614) with small prediction errors (RMSE = 0.606, MAE = 0.465), enabling the accurate and rapid prediction of backfill strength under different mix ratios and curing conditions.
Additionally, an increase in curing temperature and curing time enhanced the strength of the backfill, and the influence of the waste rock–tailing ratio on the strength of the backfill was negative at a curing temperature of 50 °C, which is attributed to the change in the pore structure at the microscopic level leading to macroscopic mechanical alterations. When the curing conditions are adequate and the parameter ratios are reasonable, the smaller the porosity rate in the backfill, the greater the backfill strength will be. This study offers a reliable and accurate method for the rapid acquisition of backfill strength and provides new technical support for the development of filling mining technology.

Keywords:

backfill materials; machine learning; mine safety; material property evaluation

1. Introduction

Backfill material is a widely utilized component in mining operations, primarily employed to fill voids created by the extraction of mineral resources [1]. It typically consists of a mixture of soil, sand, gravel, boulders, industrial waste residues, and cementitious materials in specific proportions. Upon mixing with water and agitation, this combination yields a backfill slurry with defined concentration levels that can be transported to the filling face or void areas via gravity flow or pressurization [2]. Following dewatering and subsequent hardening processes, it results in a solidified backfill structure capable of providing necessary support for ground stability while addressing void management concerns. Furthermore, the backfill aggregate effectively utilizes waste rock generated during mining activities, as well as tailings from ore processing and industrial by-products from metallurgical operations [3]. This approach not only ensures safety, but also promotes environmental sustainability and enhances mineral recovery rates. Consequently, the application of backfill mining techniques has gained increasing traction within non-ferrous metal and gold mining sectors [4,5,6].

The strength of the backfill is paramount in the context of backfill mining methods, as it directly impacts the safety and productivity of underground operations. Among various aspects of backfill strength relevant to mining production, compressive strength stands out as the most critical factor, providing essential support to overlying rock strata while safeguarding personnel and equipment situated below [7,8]. The integrity of the backfill fundamentally relies on its own compressive strength. Although this compressive strength fluctuates over time, it typically follows an upward trajectory that increases with age. It is widely accepted that after 28 days, the compressive strength of mine backfill stabilizes; thus, the 28-day compressive strength is commonly regarded as a key indicator for assessing overall backfill performance [9,10,11]. Given that backfill strength serves as a crucial determinant for long-term support of mine roof structures above voids, accurately determining and optimizing these parameters holds significant implications for ensuring safe mining practices [12].

The primary methods for determining the strength of fill materials include laboratory tests and in situ sampling assessments [7,13,14]. However, laboratory tests often fail to account for additional conditions present at construction sites, such as the maintenance pressure exerted by the weight of the fill material and the impact of temperature fluctuations within the mine during maintenance periods [15]. Conversely, in situ sampling assessments not only entail lengthy procedures, but also exhibit significant variability in strength distribution, complicating the design and optimization processes for fill material strength and severely hindering mining operations [16]. Both methodologies demand substantial manpower and resources, thereby escalating labor and time costs associated with achieving optimal fill material strength. Consequently, there is a pressing need for research into high-precision, efficient, and easily implementable predictive methods for assessing fill material strength [17,18,19].

In recent years, researchers have undertaken comprehensive investigations into the mechanical properties, strength prediction, and optimization methodologies of mine backfill [9,20,21]. Numerous approaches for predicting backfill strength have been proposed, including nonlinear multivariate regression, elastic mechanics analysis, and artificial intelligence techniques. Some researchers have developed mathematical models to elucidate the relationship between the uniaxial compressive strength of backfill and longitudinal wave velocity, utilizing these models to forecast compressive strength [22]. Additionally, several scholars have conducted statistical analyses on the compressive strength of backfill at various curing ages alongside parameters such as slurry volume fraction, cement content, and sand-to-gravel ratio; they established multiple regression equations to facilitate the predictive modeling of backfill strength [23]. Furthermore, other researchers investigated the arching effect and stress migration mechanisms during mining by monitoring in situ stress changes within the backfill and subsequently formulated an internal stress prediction model for it [24]. However, these studies are primarily tailored to specific types of backfill materials and exhibit limited generalizability. The compressive strength is influenced by factors such as slurry concentration, raw material ratios, and curing duration. Consequently, determining the optimal compressive strength under multifactorial influences remains a prevalent challenge faced by mines employing backfilling technology [25].

In addressing these challenges, artificial intelligence methodologies have proven effective in managing complex multi-factorial and non-linear problems. Numerous researchers have employed artificial intelligence algorithms to develop predictive models for the strength of backfills. Orejarena et al. [26] constructed an artificial neural network (ANN) model aimed at predicting the compressive strength of cemented backfills resistant to sulfate corrosion. Qi et al. [27] developed a genetic programming (GP) model for predicting the strength of cemented backfills, and conducted a comparative analysis with Decision Tree (DT), Gradient Boosting Machine (GBM), and Random Forest (RF) models, demonstrating its superior predictive capability. Traditional machine learning algorithms are often hindered by limitations such as inadequate learning capacity, slow convergence rates, and poor stability. Consequently, there has been increased research and development focused on ensemble learning models and their optimized hybrid counterparts within this domain. De-Prado-Gil et al. [28] performed a comparative analysis of the practicality of RF, Gradient Boosting (GB), Extreme Gradient Boosting model (XGBoost), and Light Gradient Boosting machine (LightGBM) ensemble methods in forecasting the compressive strength of self-compacting recycled aggregate concrete. Qi et al. [29] utilized particle swarm optimization to fine-tune the hyperparameters of the boosting regression tree (BRT) model for backfill strength prediction, resulting in a hybrid PSO-BRT model with enhanced predictive performance. Xiong et al. [30], employing the whale optimization algorithm (WOA), optimized XGBoost to establish a prediction framework for phosphogypsum filling strengths. Lu et al. [31], through the application of Gradient Boosting Regression Tree (GBRT) methodology, significantly improved prediction accuracy regarding the compressive strength of cemented backfills while reducing error margins.

Significant advancements have been made in the application of machine learning, particularly in predicting the uniaxial compressive strength of backfill bodies through ensemble learning models [32,33]. However, there are few reports on the use of Categorical Boosting (CatBoost), an ensemble learning model based on symmetric decision trees, within this domain. Research indicates that parameters such as filling ratio, curing time, and concentration exert considerable influence on the strength of backfill bodies. These parameters are predominantly discrete and exhibit pronounced nonlinear relationships with the strength of backfills [34]. CatBoost is well suited for modeling these nonlinear characteristics among discrete data.

In light of this, the present study identified the solid mass fraction, cement content, tailings content, and curing age as input parameters while designating the uniaxial compressive strength of the backfill as the output parameter. This research aimed to evaluate the predictive capability of the CatBoost hybrid model for estimating the compressive strength of backfill. Additionally, six meta-heuristic algorithms were employed to optimize the hyperparameters of the CatBoost model in order to mitigate overfitting and achieve global optimization. By comparing results from various hybrid models, we identified the optimal model based on its predictive performance. Furthermore, SHAP analysis was conducted to assess input parameter importance and ascertain key factors influencing backfill strength along with their respective degrees of impact; this ultimately served as a reference for designing and optimizing mining backfill materials.

2. Methodology and Indicators

2.1. Workflow

In order to accurately reflect the actual strength parameters of the backfill, tailings and waste rocks from a specific tin mine were selected as raw materials, with concentrations set at 82%, 84%, and 86%. The cement-to-sand ratio was configured at 1:4, 1:5, and 1:6, while the waste rock-to-tailing ratio was established at 7:3, 8:2, and 9:1. Curing durations were designated as 3 days, 7 days, and 28 days, and curing temperatures were controlled at 20 °C, 30 °C, 40 °C, and 50 °C. These temperature variations were implemented to investigate the correlation between curing temperature and the mechanical strength of the backfill, aiming to determine the optimal curing temperature for maximizing strength performance. An orthogonal experimental design was applied to generate a total of 174 parameter combinations, enabling the acquisition of corresponding strength data for each backfill sample.
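As a rough check on the size of this design, the sketch below enumerates the full factorial grid implied by the listed levels and compares it with the 174 tested combinations. The dictionary keys are illustrative names rather than identifiers from the study, and the sketch does not attempt to reproduce which 174 combinations were actually tested.

```python
from itertools import product

# Factor levels as listed in Section 2.1 (key names are illustrative).
levels = {
    "concentration_pct": [82, 84, 86],
    "cement_sand_ratio": ["1:4", "1:5", "1:6"],
    "waste_rock_tailing_ratio": ["7:3", "8:2", "9:1"],
    "curing_time_days": [3, 7, 28],
    "curing_temperature_C": [20, 30, 40, 50],
}

# Every possible combination of one level per factor.
full_factorial = [dict(zip(levels, combo)) for combo in product(*levels.values())]

n_full = len(full_factorial)   # 3 * 3 * 3 * 3 * 4 combinations
n_tested = 174                 # number of parameter sets reported in the text
coverage = n_tested / n_full   # fraction of the full grid that was tested
```

Under these levels the full grid contains 324 combinations, so the 174 tested mixes cover a little over half of it.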

Furthermore, this study employed the CatBoost algorithm to develop a nonlinear correlation prediction model that relates backfill strength to its test indicators (Figure 1), utilizing data from 174 sets of backfill strength and the corresponding test indicators. This model elucidates the relationships between various indicators and backfill strength, thereby facilitating precise predictions of the latter. The primary research steps are outlined as follows:

1. A dataset was constructed based on experimental data pertaining to backfill strength and its associated test indicators, which was subsequently divided randomly into a training set (80%) and a testing set (20%).

2. A fitness function was formulated to optimize three CatBoost hyperparameters (depth, learning_rate, l2_leaf_reg). The depth parameter controls the complexity of individual decision trees, balancing underfitting and overfitting risks; learning_rate adjusts the contribution weight of each tree, influencing convergence speed and prediction accuracy; and l2_leaf_reg applies L2 regularization to leaf values, mitigating overfitting. These parameters are widely recognized as the most impactful for CatBoost in regression tasks, as validated in studies on material property prediction [35]. In this study, the cross-validation scores obtained at different optimization iterations served as the fitness evaluation metric.

3. Six population optimization algorithms—namely Artificial Ecosystem-Based Optimization (AEO), Aquila Optimization (AO), Germinal Center Optimization (GCO), Sand Cat Swarm Optimization (SCSO), Sparrow Search Algorithm (SSA), and Walrus Optimization Algorithm (WaOA)—were employed for the iterative optimization of CatBoost parameters. The parameter optimization fitness across different algorithms was assessed, with optimal parameters being reported upon meeting specified iteration conditions.

4. By comparing the results yielded by various population optimization algorithms, we identified the optimal parameters for CatBoost and established a composite optimization model aimed at predicting backfill strength.

5. Additionally, SHAP analysis was conducted to further investigate the relationships among concentration, cement–sand ratio, waste rock–tailing ratio, curing time, curing temperature, and their impact on backfill strength.
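Steps 1 and 2 can be sketched in plain Python as follows. The 80/20 split follows the text; the search bounds in `BOUNDS`, the [0, 1] position encoding, and the function names are assumptions made for illustration, since the ranges used in the study are not given in this section.

```python
import random

def split_indices(n_samples, test_fraction=0.2, seed=0):
    """Randomly split sample indices into training and testing sets."""
    idx = list(range(n_samples))
    random.Random(seed).shuffle(idx)
    n_test = round(n_samples * test_fraction)
    return idx[n_test:], idx[:n_test]

# Hypothetical search bounds for the three tuned hyperparameters.
BOUNDS = {
    "depth": (2, 10),
    "learning_rate": (0.01, 0.3),
    "l2_leaf_reg": (1.0, 10.0),
}

def decode(position):
    """Map an optimizer position in [0, 1]^3 to a hyperparameter dictionary."""
    params = {}
    for u, (name, (lo, hi)) in zip(position, BOUNDS.items()):
        u = min(max(u, 0.0), 1.0)          # clip to the unit interval
        value = lo + u * (hi - lo)
        # Tree depth must be an integer; the other two stay continuous.
        params[name] = int(round(value)) if name == "depth" else value
    return params

train_idx, test_idx = split_indices(174)
example = decode([0.5, 0.0, 1.0])
```

A fitness function would then train a model with `decode(position)` on the training indices and return its cross-validation score; that part is omitted here because it depends on the modeling library.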

2.2. Categorical Boosting (CatBoost)

CatBoost is an advanced machine learning algorithm developed by Yandex, a Russian company, in 2017. It represents an enhancement of the Gradient Boosting Decision Tree (GBDT) framework [36]. Traditional GBDT algorithms construct each decision tree based on the residuals of the current model ensemble, resulting in strong dependencies among trees and a propensity for overfitting. The CatBoost algorithm addresses this issue by introducing the Ordered Boosting method along with a novel weighting strategy. This approach reorders the training data and clusters similar feature values into identical leaf nodes, thereby enhancing both the accuracy and stability of gradients while mitigating model variance, ultimately resolving overfitting concerns.

In the GBDT algorithm, the mean of the data labels serves as the criterion for node splitting. The specific formula is presented as follows:

{\hat{x}}_{k}^{i} = \frac{\sum_{j = 1}^{N} I_{\{x_{j}^{i} = x_{k}^{i}\}} y_{j}}{\sum_{j = 1}^{N} I_{\{x_{j}^{i} = x_{k}^{i}\}}}

(1)

where x_{k}^{i} denotes the i-th feature of the k-th training sample; {\hat{x}}_{k}^{i} represents the mean label value of the samples sharing that feature value; I is an indicator function that signifies whether a specific condition holds true; and y_j refers to the label value of the j-th sample.
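A small worked example may make Equation (1) concrete. The sketch below computes the plain mean-label statistic for a categorical feature; the feature values and strengths are made-up illustrative numbers, not experimental data, and the function name is hypothetical. It implements only the greedy statistic of Equation (1), not the ordered variant that CatBoost applies internally.

```python
from collections import defaultdict

def target_mean_encoding(categories, labels):
    """Replace each category by the mean label of all samples sharing it (Eq. 1)."""
    sums = defaultdict(float)
    counts = defaultdict(int)
    for category, y in zip(categories, labels):
        sums[category] += y      # numerator: sum of labels where the indicator is 1
        counts[category] += 1    # denominator: number of matching samples
    return [sums[c] / counts[c] for c in categories]

# Illustrative values only.
ratios = ["1:4", "1:6", "1:4", "1:5", "1:6"]
strengths = [4.0, 2.0, 5.0, 3.0, 1.0]
encoded = target_mean_encoding(ratios, strengths)
```

Here both "1:4" samples receive 4.5, the single "1:5" sample keeps 3.0, and both "1:6" samples receive 1.5.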

Unlike the GBDT approach, the CatBoost algorithm integrates an adaptive learning rate mechanism. This mechanism allows for a more precise regulation of the impact exerted by weak learners in each iteration, ultimately leading to an improvement in the model’s accuracy. The computational procedure for the adaptive learning rate within the CatBoost algorithm is outlined below:

\begin{cases} η_{t} = \frac{1}{\sqrt{t + 1}} \\ α_{t} = \frac{1}{t} \sum_{i = 1}^{t} η_{i} \end{cases}

(2)

where t denotes the iteration number, η_t represents the learning rate for the t-th round of iteration, and α_t signifies the average learning rate over the first t iterations.
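The schedule in Equation (2) can be tabulated directly. The sketch below reads the equation with t as the iteration index in both expressions, which is an assumption about the intended notation; the function names are illustrative.

```python
import math

def eta(t):
    """Learning rate for iteration t, read as 1 / sqrt(t + 1)."""
    return 1.0 / math.sqrt(t + 1)

def alpha(t):
    """Average learning rate over the first t iterations."""
    return sum(eta(i) for i in range(1, t + 1)) / t

# (iteration, eta_t, alpha_t) for the first three iterations.
schedule = [(t, eta(t), alpha(t)) for t in (1, 2, 3)]
```

Because η_t decreases with t, the running average α_t always stays at or above the current η_t.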

Traditional neural network models necessitate a substantial amount of data for effective training; models trained on limited datasets tend to exhibit low accuracy. In contrast, the CatBoost algorithm is a decision tree-based model that does not require an extensive sample size as its training set to achieve high-precision predictions. Furthermore, CatBoost is characterized by its high training efficiency, enabling rapid model training and enhanced prediction speed [37].

2.3. Artificial Ecosystem-Based Optimization (AEO)

Artificial Ecosystem-Based Optimization (AEO) simulates the behaviors of production, consumption, and decomposition among producers, consumers, and decomposers in an ecosystem [38]. The production operator is specifically constructed to balance exploration and exploitation capabilities; the consumption operator is intended to strengthen exploratory capacity; and the decomposition operator concentrates on boosting exploitation performance.

(1) Producers

Producers are defined as individuals exhibiting the lowest fitness values, whereas decomposers represent those with the highest fitness values. Producers update their positions based on both the upper and lower bounds of the search space, as well as information from decomposers:

\begin{aligned} x_{1} (k + 1) &= (1 - a) x_{n} (k) + a x_{rand} (k) \\ a &= \left(1 - \frac{k}{K}\right) r_{1} \\ x_{rand} &= R + r (H - R) \end{aligned}

(3)

where n denotes the population size, k is the current iteration, K indicates the maximum number of iterations, and H and R represent the upper and lower bounds of the search space, respectively. Additionally, r₁ is a random value within the interval [0, 1], whereas r is a random vector whose elements also lie in [0, 1].
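The producer update of Equation (3) can be sketched as below. The function and argument names are illustrative, and the bounds in the usage lines are placeholder values rather than the ranges used in the study.

```python
import random

def producer_update(x_n, lower, upper, k, K, rng):
    """Move the producer between the decomposer x_n and a random point (Eq. 3)."""
    r1 = rng.random()
    a = (1.0 - k / K) * r1                      # weight shrinks linearly to 0 as k -> K
    x_rand = [lo + rng.random() * (hi - lo)     # random point inside the search bounds
              for lo, hi in zip(lower, upper)]
    return [(1.0 - a) * xn + a * xr for xn, xr in zip(x_n, x_rand)]

rng = random.Random(1)
lower, upper = [2.0, 0.01, 1.0], [10.0, 0.3, 10.0]   # placeholder bounds
x_n = [6.0, 0.1, 3.0]                                # current decomposer position
early = producer_update(x_n, lower, upper, k=0, K=100, rng=rng)
late = producer_update(x_n, lower, upper, k=100, K=100, rng=rng)
```

Early in the run the weight a can be large, so the producer explores random regions; at the final iteration a is zero and the producer coincides with the decomposer.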

(2) Consumers

In the ecosystem, consumers are categorized into herbivores, carnivores, and omnivores with equal probability. In the context of AEO, a consumption factor C is introduced:

C = \frac{1}{2} \frac{v_{1}}{|v_{2}|}, \quad v_{1} \sim Nor (0,1), \quad v_{2} \sim Nor (0,1)

(4)

where Nor (0, 1) denotes the standard normal distribution, with a mean of 0 and a standard deviation of 1.

Herbivores are animals that only eat producers, and their position update method is

x_{j} (k + 1) = x_{j} (k) + C \cdot (x_{j} (k) - x_{1} (k)), \quad j \in \{2, \dots, n\}

(5)

where x₁ represents the location of the producer.
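A minimal sketch of the consumption factor and the herbivore update follows, assuming the factor drawn in Equation (4) is the multiplier of the difference term in Equation (5). The function names are illustrative.

```python
import random

def consumption_factor(rng):
    """Draw C = 0.5 * v1 / |v2| with v1, v2 standard normal (Eq. 4)."""
    v1 = rng.gauss(0.0, 1.0)
    v2 = rng.gauss(0.0, 1.0)
    return 0.5 * v1 / abs(v2)

def herbivore_update(x_j, x_1, C):
    """Shift consumer x_j relative to the producer x_1 (Eq. 5)."""
    return [xj + C * (xj - x1) for xj, x1 in zip(x_j, x_1)]

rng = random.Random(0)
C = consumption_factor(rng)
moved = herbivore_update([1.0, 2.0], [0.0, 0.0], 0.5)
```

Since C is a ratio of two normal draws, it is heavy-tailed: most steps are small, but occasional large jumps help the search leave local optima.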

Carnivores are animals that only eat consumers, and their position update method is

x_{j} (k + 1) = x_{j} (k) + C \cdot (x_{j} (k) - x_{e} (k)), \quad j \in \{3, \dots, n\}, \quad e = randi ([2, j - 1])

(6)

where x_e represents the location of the consumer with a higher energy level.

Omnivores eat both producers and consumers, and their position update method is

x_{j} (k + 1) = x_{j} (k) + C \cdot (r_{2} \cdot (x_{j} (k) - x_{1} (k)) + (1 - r_{2}) (x_{j} (k) - x_{e} (k))), \quad j = 3, \dots, n, \quad e = randi ([2, j - 1])

(7)

where r₂ is a random number within the range of [0, 1].

(3) Decomposers

In an ecosystem, decomposers play a crucial role in providing nutrients to producers. Within the framework of AEO, each individual in the population is permitted to determine its subsequent location based on the position of the decomposer. To enhance the performance of the algorithm, an adjustment factor G and weight coefficients c and v are incorporated. The position update driven by the decomposer is as follows:

\begin{aligned} x_{j} (k + 1) &= x_{n} (k) + G \times (c \times (x_{n} (k) - v \times x_{j} (k))), \quad j = 1, \dots, n \\ G &= 3 b, \quad b \sim N (0,1) \\ c &= r_{3} \times randi ([1,2]) - 1 \\ v &= 2 \times r_{3} - 1 \end{aligned}

(8)

where x_n represents the position of the decomposer, and r₃ is a random number within the interval [0, 1].

2.4. Aquila Optimization (AO)

In 2023, Sasmal et al. [39] drew inspiration from the hunting behaviors of eagles to introduce the Aquila Optimizer (AO). This algorithm simulates four distinct hunting strategies employed by eagles to formulate a mathematical model and adaptively applies different search strategies according to the characteristics of solutions within the search space.

(1) Expanded Exploration Stage:

In the initial stage, the eagle ascends to a high altitude and performs a vertical dive to locate prey and delineate a search area:

X_{1} (k + 1) = X_{best} (k) \times \left(1 - \frac{k}{K}\right) + (X_{M} (k) - X_{best} (k) \times rand)

(9)

X_{M} (k) = \frac{1}{N} \sum_{i = 1}^{N} X_{i} (k)

(10)

where k denotes the current iteration number, K signifies the total number of iterations, X₁(k + 1) represents the solution generated for iteration k + 1 by the first search strategy, X_best(k) indicates the best solution identified up to the k-th iteration, X_M(k) refers to the mean of all solutions in the k-th iteration, N is the population size, and rand is a random value within the interval [0, 1].
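Equations (9) and (10) can be sketched as follows. Treating rand as a single scalar shared by all dimensions is an assumption, as the text does not say whether it is drawn per dimension; the function name is illustrative.

```python
import random

def expanded_exploration(population, x_best, k, K, rng):
    """High-soar search step around the best solution and the population mean."""
    dim = len(x_best)
    n = len(population)
    # Eq. (10): component-wise mean of all current solutions.
    x_mean = [sum(ind[d] for ind in population) / n for d in range(dim)]
    r = rng.random()   # assumed to be one scalar for all dimensions
    # Eq. (9): the weight on the best solution decays linearly with k.
    return [x_best[d] * (1.0 - k / K) + (x_mean[d] - x_best[d] * r)
            for d in range(dim)]

rng = random.Random(0)
population = [[1.0, 3.0], [3.0, 5.0]]
candidate = expanded_exploration(population, [0.0, 0.0], k=10, K=100, rng=rng)
```

With the best solution at the origin, the candidate reduces to the population mean, which gives a quick sanity check on the arithmetic.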

(2) Narrowed Exploration Stage

The eagle’s short gliding attack behavior is simulated:

X_{2} (k + 1) = X_{best} (k) \times Levy (F) + X_{R} (k) + (y - x) \times rand

(11)

where Levy (F) represents the Levy flight distribution strategy, X_R(k) is a randomly selected solution from the population, and x and y are used to abstractly represent the shape of the eagle’s spiral flight.

(3) Expanded Exploitation Stage

In the third phase, when the eagle enters the hunting zone and readies itself to land and initiate an attack, it adopts a vertical descending strategy for the initial assault. From a mathematical perspective, this process can be expressed by the following formula:

X_{3} (k + 1) = (X_{best} (k) - X_{M} (k)) \times β - rand + ((HB - RB) \times rand + RB) \times γ

(12)

where β and γ represent the exploitation adjustment parameters, each fixed at a small value of 0.1, and HB and RB denote the upper and lower bounds of the search space.

2.5. Germinal Center Optimization (GCO)

Germinal Center Optimization (GCO) is an innovative optimization algorithm inspired by the germinal center response observed in the vertebrate immune system [40]. The fundamental concept of the GCO algorithm involves simulating two distinct regions within the germinal center: the dark zone (DZ) and the light zone (LZ). In the dark zone, B cells enhance antibody diversity through clonal expansion and somatic hypermutation. Conversely, in the light zone, B cells engage in competition for binding with antigens and helper T cells, undergoing a process of affinity selection. By emulating these biological processes, the GCO algorithm optimizes its search for problem solutions. The primary steps of the GCO algorithm are as follows [41]:

1. Non-Uniform Particle Selection: The algorithm selects particles for mutation based on B cell affinity (i.e., solution quality), which can be interpreted as establishing a temporary leadership role within this population-based meta-heuristic framework.

2. Clonal Expansion: Within the dark zone, B cells proliferate through clonal expansion, paralleling the particle duplication process inherent to this algorithm.

3. Affinity Selection: In the light zone, B cells are selected according to their affinity for antigens; this mirrors the optimization process applied to solutions within the algorithm.

4. Dynamic Leadership: Leadership dynamics within this algorithm are fluid and can adaptively modify search strategies based on solution quality to achieve a balance between exploration and exploitation.

2.6. Sand Cat Swarm Optimization (SCSO)

The Sand Cat Swarm Optimization (SCSO) algorithm is an evolutionary metaheuristic approach derived from the behavioral characteristics of sand cats in their natural environment [42]. Sand cats exhibit two primary behaviors: foraging for food and hunting prey. A key source of inspiration for this algorithm lies in the sand cat’s exceptional ability to detect low-frequency sounds, an exclusive adaptive trait that enables it to pinpoint prey both above ground and underground.

The key parameter governing the transition between exploration and exploitation phases is denoted as R. When |R| > 1, the sand cat actively engages in prey-searching.

r_{G} = s_{M} - \frac{s_{M} \times i t e r_{c}}{i t e r_{m a x}}

(13)

R = 2 \times {\vec{r}}_{G} \times r a n d - {\vec{r}}_{G}

(14)

r = r_{G} \times r a n d

(15)

The sand cat’s search for prey relies on low-frequency noise, and the sensitivity range r_G is assumed to decrease from 2 kHz to 0 over the course of the iterations. The parameter s_M draws inspiration from the auditory characteristics of the sand cat, with an assumed value of 2. Here, iter_c denotes the current iteration number, iter_max represents the maximum iteration count, and rand signifies a random number generated within the interval [0, 1].
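The phase control in Equations (13) to (15) can be sketched as below, with s_M = 2 as stated in the text. The function names and the returned labels are illustrative.

```python
import random

def sensitivity(iter_c, iter_max, s_M=2.0):
    """General sensitivity range r_G, decreasing linearly from s_M to 0 (Eq. 13)."""
    return s_M - s_M * iter_c / iter_max

def phase(iter_c, iter_max, rng):
    """Return (R, r, label) for one sand cat at the given iteration."""
    r_G = sensitivity(iter_c, iter_max)
    R = 2.0 * r_G * rng.random() - r_G      # Eq. (14): R lies in [-r_G, r_G)
    r = r_G * rng.random()                  # Eq. (15): individual sensitivity
    label = "search" if abs(R) > 1.0 else "attack"
    return R, r, label

rng = random.Random(0)
late_labels = {phase(c, 100, rng)[2] for c in range(50, 101)}
```

Because |R| can never exceed r_G, and r_G drops to 1 at the halfway point when s_M = 2, every individual is in the attack phase during the second half of the run.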

Each sand cat adjusts its location according to three factors: the best candidate position P_ac, its current position P_c, and its sensitivity range r. Consequently, this enables each sand cat to identify other potential optimal locations for prey as follows:

P (k + 1) = r \times (P_{ac} (k) - rand (0,1) \times P_{c} (k))

(16)

When |R| ≤ 1, the sand cat initiates an attack on its prey. It first generates a random position P_rnd from the best position P_b and its current position P_c. Assuming that the sensitivity range of each sand cat is circular, a roulette selection approach is used to randomly determine an angle θ for each individual sand cat. The attack is then carried out according to the following formulas:

P_{r n d} = | r a n d (0,1) \cdot P_{b} (k) - P_{c} (k) |

(17)

P (k + 1) = P_{b} (k) - \vec{r} \cdot P_{r n d} \cdot c o s (θ)

(18)

Specifically, the random position P_rnd ensures proximity to potential prey while introducing randomness in angle selection helps prevent convergence into local optima.

2.7. Sparrow Search Algorithm (SSA)

The Sparrow Search Algorithm (SSA) is an emerging swarm intelligence approach that simulates the foraging behaviors and anti-predator strategies of sparrow populations to perform iterative optimization [43]. Endowed with advantages like high search accuracy, fast convergence rate, strong stability, and robust global search capability, SSA has garnered significant attention and been effectively applied to solving complex global optimization problems. In recent years, it has exhibited excellent performance across various domains including fault diagnosis, system control, path planning, and image processing. In this study, SSA is employed to optimize the hyperparameters of the CatBoost model, with its fundamental equations presented as follows:

X_{i, j}^{k + 1} = \begin{cases} X_{i, j}^{k} \cdot \exp \left(\frac{- i}{α \times iter_{max}}\right), & \text{if } R_{2} < ST \\ X_{i, j}^{k} + S \times B, & \text{if } R_{2} \geq ST \end{cases}

(19)

where k denotes the current iteration, and j = 1, 2, …, d. X_{i, j}^{k} represents the value of the j-th dimension of the i-th sparrow at iteration k, with iter_max being a preset constant signifying the maximum number of iterations. In this study, the value of iter_max is set as 1000. α ∈ (0, 1] is a randomly generated number. R₂ ∈ [0, 1] and ST ∈ [0.5, 1.0] stand for the alarm value and the safety threshold, respectively. S is a random number that follows a normal distribution. B is a 1 × d matrix with all elements set to 1, where d denotes the dimension of the variables. If R₂ < ST, no predators are present in the surrounding area, allowing producers to switch to an extensive search mode. In contrast, when R₂ ≥ ST, certain sparrows have detected predators, prompting all sparrows to quickly move to other secure locations.
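The producer update of Equation (19) can be sketched as below, taking the second branch to apply when R₂ ≥ ST, as the accompanying text describes. The function name and the example values are illustrative.

```python
import math
import random

def ssa_producer_update(x_i, i, iter_max, R2, ST, rng):
    """Update producer i (1-based index) according to the alarm value R2."""
    if R2 < ST:
        # No predator detected: shrink the position by an exponential factor.
        alpha = 1.0 - rng.random()              # uniform in (0, 1]
        factor = math.exp(-i / (alpha * iter_max))
        return [x * factor for x in x_i]
    # Predator detected: shift every dimension by the same normal step S,
    # because B is a row of ones.
    S = rng.gauss(0.0, 1.0)
    return [x + S for x in x_i]

rng = random.Random(0)
x = [1.0, -2.0, 4.0]
safe = ssa_producer_update(x, i=3, iter_max=1000, R2=0.2, ST=0.8, rng=rng)
alarm = ssa_producer_update(x, i=3, iter_max=1000, R2=0.9, ST=0.8, rng=rng)
```

In the safe branch the exponential factor lies strictly between 0 and 1, so each coordinate keeps its sign and moves toward the origin.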

The position update formula for the scrounger is described as follows:

X_{i, j}^{k + 1} = \begin{cases} S \times \exp \left(\frac{X_{w}^{k} - X_{i, j}^{k}}{i^{2}}\right), & \text{if } i > n / 2 \\ X_{M}^{k + 1} + |X_{i, j}^{k} - X_{M}^{k + 1}| \times A^{+} \times B, & \text{otherwise} \end{cases}

(20)

where X_{M} denotes the best position occupied by the producer, and X_{w} denotes the current global worst position. A is a 1 × d matrix in which each element is randomly assigned a value of either 1 or −1, and it satisfies the relation A⁺ = A^T (A A^T)⁻¹.

Investigators are the sparrows that are acutely aware of potential danger. They typically account for 10% to 20% of the entire sparrow population; in this study, the proportion of investigators is set at 10%. Their positions are updated as follows:

X_{i, j}^{k + 1} = \begin{cases} X_{best}^{k} + AR \times |X_{i, j}^{k} - X_{best}^{k}|, & \text{if } f_{i} > f_{g} \\ X_{i, j}^{k} + E \times \frac{|X_{i, j}^{k} - X_{w}^{k}|}{(f_{i} - f_{w}) + ε}, & \text{if } f_{i} = f_{g} \end{cases}

(21)

where X_{best} represents the current global optimal position. AR, serving as the step size control parameter, is a random number following a normal distribution with a mean of 0 and a variance of 1. E ∈ [−1, 1] denotes a randomly generated number. f_{i} is the fitness value of the current sparrow. f_{g} and f_{w} are the current global best and worst fitness values, respectively. ε is a small constant introduced to avoid division by zero.

2.8. Walrus Optimization Algorithm (WaOA)

The Walrus Optimization Algorithm (WaOA) emulates the natural behaviors exhibited by walruses [44]. The primary inspiration for the design of WaOA is derived from their feeding, migration, evasion, and predator confrontation processes. The implementation of WaOA is mathematically organized into three distinct phases: exploration, migration, and exploitation [45].

(1) Feeding Strategy

By replicating the foraging behavior of walruses, this algorithm conducts extensive exploration within the search space to identify regions that harbor optimal solutions.

x_{i, j}^{P_{1}} = x_{i, j} + r a n d_{i, j} \times (W_{j} - I_{i, j} \times x_{i, j})

(22)

where x_{i, j}^{P_{1}} is the new position generated for the i-th walrus in the j-th dimension during the first phase, and i and j index the candidate solution and the dimension, respectively. W represents the best candidate solution, which is regarded as the strongest walrus. I_{i,j} are integers chosen randomly to be either 1 or 2.
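The feeding update of Equation (22) can be sketched as follows. The function name is illustrative, and the sketch covers only the position proposal, not any acceptance rule applied afterwards.

```python
import random

def feeding_update(x_i, strongest, rng):
    """Propose a new position for one walrus, pulled toward the strongest walrus W."""
    new_position = []
    for x, w in zip(x_i, strongest):
        I = rng.choice((1, 2))                  # random integer, either 1 or 2
        new_position.append(x + rng.random() * (w - I * x))
    return new_position

rng = random.Random(0)
proposal = feeding_update([0.0, 0.0], [4.0, -2.0], rng)
```

Starting from the origin, each coordinate of the proposal lands somewhere between 0 and the corresponding coordinate of W, which shows the attraction toward the strongest walrus.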

(2) Migration Strategy

The migration behavior of walruses is used to guide individual search within the space to discover new suitable areas.

x_{i, j}^{P_{2}} = \begin{cases} x_{i, j} + rand_{i, j} \cdot (x_{k, j} - I_{i, j} \cdot x_{i, j}), & F_{k} < F_{i} \\ x_{i, j} + rand_{i, j} \cdot (x_{i, j} - x_{k, j}), & \text{else} \end{cases}

(23)

where x_{i, j}^{P_{2}} is the new position generated in the second phase, x_{k, j} is the position of another walrus chosen as the migration destination, and F represents the value of the objective function.

(3) Escaping and fighting against predators

The algorithm mimics the behavior of walruses in escaping and fighting predators: it searches the local area surrounding each candidate solution to improve the quality of the solution.

x_{i,j}^{P_{3}} = x_{i,j} + (lb_{local,j}^{t} + (ub_{local,j}^{t} - rand \cdot lb_{local,j}^{t}))

(24)

where

x_{i, j}^{P_{3}}

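is the new position generated in the third stage, and lb_{local,j}^{t} and ub_{local,j}^{t} are the local lower and upper bounds of the j-th variable at iteration t, which delimit the region used for local search.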
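The three phases in Equations (22)–(24) can be sketched as a single minimization loop. This is a minimal illustration rather than the implementation used in the study. Several details are assumptions that the text above does not state: a candidate replaces the current position only if it improves the objective, candidates are clipped to the global bounds, the local bounds shrink as lb/t and ub/t, and the names `waoa_minimize` and `try_move` are hypothetical.

```python
import random


def waoa_minimize(objective, lb, ub, pop_size=20, iterations=50, seed=0):
    """Sketch of a WaOA-style search that minimizes `objective` within [lb, ub]."""
    rng = random.Random(seed)
    dim = len(lb)
    pop = [[rng.uniform(lb[j], ub[j]) for j in range(dim)] for _ in range(pop_size)]
    fit = [objective(x) for x in pop]

    def try_move(i, cand):
        # Assumption: clip to the global bounds and accept only improvements.
        cand = [min(max(v, lb[j]), ub[j]) for j, v in enumerate(cand)]
        f = objective(cand)
        if f < fit[i]:
            pop[i], fit[i] = cand, f

    for t in range(1, iterations + 1):
        for i in range(pop_size):
            # Phase 1, feeding (Eq. 22): move relative to the strongest walrus W.
            w = pop[min(range(pop_size), key=fit.__getitem__)]
            try_move(i, [pop[i][j] + rng.random() * (w[j] - rng.randint(1, 2) * pop[i][j])
                         for j in range(dim)])

            # Phase 2, migration (Eq. 23): compare with another walrus k.
            k = rng.choice([n for n in range(pop_size) if n != i])
            if fit[k] < fit[i]:
                cand = [pop[i][j] + rng.random() * (pop[k][j] - rng.randint(1, 2) * pop[i][j])
                        for j in range(dim)]
            else:
                cand = [pop[i][j] + rng.random() * (pop[i][j] - pop[k][j])
                        for j in range(dim)]
            try_move(i, cand)

            # Phase 3, escaping and fighting (Eq. 24, as printed): local search.
            cand = []
            for j in range(dim):
                lb_loc, ub_loc = lb[j] / t, ub[j] / t  # assumed shrinking local bounds
                cand.append(pop[i][j] + (lb_loc + (ub_loc - rng.random() * lb_loc)))
            try_move(i, cand)

    best = min(range(pop_size), key=fit.__getitem__)
    return pop[best], fit[best]


# Usage on a toy objective (not the CatBoost fitness used in the paper).
sphere = lambda x: sum(v * v for v in x)
best_x, best_f = waoa_minimize(sphere, lb=[-5, -5], ub=[5, 5], iterations=30)
```

Because of the accept-only-improvements rule assumed here, the best fitness can never get worse from one iteration to the next, which makes the sketch easy to check.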

2.9. Model Verification and Evaluation

In this study, a hybrid forecasting model was developed by integrating multiple algorithmic approaches. A portion of the dataset was allocated to model training, whereas the test dataset served to evaluate the model’s generalization capability on unseen data. A set of widely recognized evaluation metrics, namely R² (coefficient of determination), VAF (variance accounted for), MAE (mean absolute error), and RMSE (root mean square error), was used to examine the model’s stability and predictive accuracy [46,47,48]. These indicators quantify the level of agreement between the model’s predicted outcomes and the actual measured values. The metrics are defined as follows [49]:

R^{2} = 1 - \sum_{i = 1}^{N} (y_{i} - \hat{y}_{i})^{2} / \sum_{i = 1}^{N} (y_{i} - \bar{y})^{2}

(25)

RMSE = \sqrt{\sum_{i = 1}^{N} (\hat{y}_{i} - y_{i})^{2} / N}

(26)

VAF = [1 - \mathrm{var}(y_{i} - \hat{y}_{i}) / \mathrm{var}(y_{i})] \times 100

(27)

MAE = \sum_{i = 1}^{N} |y_{i} - \hat{y}_{i}| / N

(28)

where

y_{i}

represents the measured (actual) value,

\hat{y}_{i}

is the predicted value of the model,

\bar{y}

represents the average of the measured values, and N denotes the number of samples in the training or testing stages.
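The four metrics in Equations (25)–(28) translate directly into code. The sketch below uses only the Python standard library. The function name `regression_metrics` is hypothetical, and the population variance (dividing by N) is an assumption, since the text does not say which variance estimator was used for VAF.

```python
import math


def regression_metrics(y_true, y_pred):
    """Return R2, RMSE, VAF (in percent), and MAE for paired observations."""
    n = len(y_true)
    mean_y = sum(y_true) / n
    residuals = [y - p for y, p in zip(y_true, y_pred)]

    def var(values):
        # Population variance (assumption: divide by N, not N - 1).
        m = sum(values) / len(values)
        return sum((v - m) ** 2 for v in values) / len(values)

    ss_res = sum(r * r for r in residuals)            # sum of squared residuals
    ss_tot = sum((y - mean_y) ** 2 for y in y_true)   # total sum of squares

    return {
        "R2": 1 - ss_res / ss_tot,
        "RMSE": math.sqrt(ss_res / n),
        "VAF": (1 - var(residuals) / var(y_true)) * 100,
        "MAE": sum(abs(r) for r in residuals) / n,
    }


# Toy values for illustration only (not data from the study).
metrics = regression_metrics([1, 2, 3, 4], [1.1, 1.9, 3.2, 3.8])
```

R² and VAF coincide when the residuals have zero mean. They diverge when the model is systematically biased, which is one reason to report both.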

2.10. SHapley Additive exPlanations (SHAP)

To better understand how each input feature contributes to backfill strength, we applied the SHapley Additive exPlanations (SHAP) framework [50,51]. SHAP is a post hoc interpretation method that treats each feature as a “contributor” to the model output.

SHAP quantifies the marginal contribution of each individual feature to the model output. First, it calculates the change in the output when a particular feature is added to the model. Second, it averages this marginal contribution over all possible feature orderings, which supports both a global view of overall trends and a local view of individual predictions [52].

For every predicted sample, SHAP generates an attribution value for each feature, which indicates the relative importance of that feature for the prediction. The underlying formula is presented in the following equation.

\Phi_{i} = \sum_{S \subseteq N \setminus \{i\}} \frac{|S|! (|N| - |S| - 1)!}{|N|!} [q_{S \cup \{i\}} (x_{S \cup \{i\}}) - q_{S} (x_{S})]

(29)

where Φ_i denotes the importance of the i-th feature, N is the entire set of features in the dataset, S is a subset of N that excludes feature i, x_S denotes the input features contained in S, and q represents the function used to quantify the contribution of features.
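Equation (29) can be evaluated by brute force when the number of features is small, which makes the weighting of each subset explicit. The sketch below is illustrative only: `shapley_values` and `toy_value` are hypothetical names, and the toy value function stands in for the model output q. For tree models such as CatBoost, dedicated SHAP tooling is normally used instead of enumerating every subset.

```python
from itertools import combinations
from math import factorial


def shapley_values(value, features):
    """Exact Shapley values for a set function `value(frozenset) -> float`."""
    n = len(features)
    phi = {}
    for i in features:
        others = [f for f in features if f != i]
        total = 0.0
        for size in range(n):
            # Weight |S|! (|N| - |S| - 1)! / |N|! from Eq. (29).
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                s = frozenset(subset)
                # Marginal contribution of feature i when added to subset S.
                total += weight * (value(s | {i}) - value(s))
        phi[i] = total
    return phi


def toy_value(subset):
    """Hypothetical payoff: two main effects plus one interaction term."""
    v = 0.0
    if "curing_temperature" in subset:
        v += 2.0
    if "curing_time" in subset:
        v += 1.0
    if {"curing_temperature", "curing_time"} <= subset:
        v += 1.0  # interaction, split equally between the two features
    return v


phi = shapley_values(toy_value, ["curing_temperature", "curing_time", "concentration"])
# phi is approximately {"curing_temperature": 2.5, "curing_time": 1.5, "concentration": 0.0}
```

The attributions sum to the difference between the value of the full feature set and that of the empty set (4.0 here), which is the additivity property that gives SHAP its name.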

3. Results

3.1. Datasets and Descriptive Analysis

The correlation matrix derived from 174 sets of laboratory-acquired data on varying backfill parameter ratios and corresponding strength values is shown below (Figure 2). Given that all test parameters are discrete data points, no clear linear relationship exists between these parameters and backfill strength. A strong positive correlation (0.739) was observed between curing temperature and strength. Backfill strength rises notably as curing time increases. Additionally, both the cement–sand ratio and curing time exhibited positive correlations with strength, with correlation coefficients of 0.494 and 0.408, respectively.

A more in-depth analysis of the correlations between the other parameters and backfill strength at different curing temperatures (20 °C, 30 °C, 40 °C, 50 °C) indicates that when the curing temperature reached 50 °C, a significant negative correlation emerged between the waste rock–tailing ratio and backfill strength. Additionally, as the curing temperature increased, the correlation between curing time and backfill strength first increased and then decreased, peaking at a curing temperature of 30 °C.
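The per-temperature analysis described above amounts to computing a Pearson coefficient within each curing-temperature group. The sketch below shows one way to do that. The function names, the dictionary keys, and the sample rows are all hypothetical and are not taken from the 174 laboratory records.

```python
import math
from collections import defaultdict


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences.

    Raises ZeroDivisionError if either sequence is constant.
    """
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (spread_x * spread_y)


def pearson_by_level(rows, level_key, x_key, y_key):
    """Correlation between x_key and y_key within each level of level_key."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[level_key]].append(row)
    return {
        level: pearson([r[x_key] for r in g], [r[y_key] for r in g])
        for level, g in groups.items()
    }


# Made-up rows for illustration only.
rows = [
    {"temp": 20, "time": 3, "strength": 1.0},
    {"temp": 20, "time": 7, "strength": 1.5},
    {"temp": 20, "time": 28, "strength": 2.9},
    {"temp": 50, "time": 3, "strength": 3.0},
    {"temp": 50, "time": 7, "strength": 3.4},
    {"temp": 50, "time": 28, "strength": 3.6},
]
by_temp = pearson_by_level(rows, "temp", "time", "strength")
```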

3.2. Comparison of Parameter Optimization

We combined six distinct population optimization algorithms with CatBoost to obtain the optimal parameters for CatBoost. In general, the population size in a population optimization algorithm dictates the optimization accuracy, and varying population sizes lead to different optimization results. It is usually advisable to select a population size in the range of 20 to 120. For this study, six different population sizes were selected, specifically 20, 40, 60, 80, 100, and 120. Each algorithm was independently run 6 times for each population size to reduce randomness in parameter optimization, and the prediction training was conducted in combination with the dataset of 174 backfill tests. We obtained the variation in the fitness curve under different algorithm optimizations and different population sizes (Figure 3).
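The protocol above (six population sizes, six independent runs each) can be sketched as a small sweep. This is only a sketch of the bookkeeping: `sweep_population_sizes` is a hypothetical helper, and `toy_random_search` is a stand-in random search on a one-dimensional function, not any of the six optimizers or the CatBoost fitness used in the study.

```python
import random


def sweep_population_sizes(run_once, pop_sizes=(20, 40, 60, 80, 100, 120), repeats=6):
    """Run `run_once(pop_size, seed)` several times per size and summarize fitness."""
    results = {}
    for size in pop_sizes:
        runs = [run_once(size, seed) for seed in range(repeats)]
        results[size] = {
            "best": min(runs),              # lower fitness is better
            "mean": sum(runs) / len(runs),
            "runs": runs,
        }
    return results


def toy_random_search(pop_size, seed, iterations=30):
    """Stand-in optimizer: random search minimizing (x - 1)^2 on [-5, 5]."""
    rng = random.Random(seed * 1000 + pop_size)
    best = float("inf")
    for _ in range(iterations):
        for _ in range(pop_size):
            x = rng.uniform(-5, 5)
            best = min(best, (x - 1.0) ** 2)
    return best


results = sweep_population_sizes(toy_random_search)
best_size = min(results, key=lambda size: results[size]["best"])
```

Keeping every run, rather than only the best one, makes it possible to report the spread across repeats as well as the best fitness per population size.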

Lower fitness scores indicate better optimization results. As shown in Figure 3, the AEO-CatBoost algorithm achieved the best overall fitness score of 0.857 with a population size of 60 after 30 iterations, at which point it had converged. The next best fitness scores, at around 0.880, were obtained by the GCO-CatBoost, SCSO-CatBoost, and WaOA-CatBoost algorithms with population sizes of 20, 100, and 40, respectively. Finally, the AO-CatBoost and SSA-CatBoost algorithms reached their respective best outcomes with a population size of 80, where SSA-CatBoost had the highest (that is, the worst) overall fitness score of 1.509.

The results of CatBoost parameter optimization for each algorithm across the six population sizes are shown in Figure 4. Among these, AEO-CatBoost obtained the optimal parameters at a population size of 60, with the following values for the three CatBoost parameters: depth = 2, learning_rate = 0.15, and l2_leaf_reg = 0.001. With this parameter combination, CatBoost achieved the highest precision in predicting the strength of the backfill.

To further assess the precision of the models in predicting the strength of the backfill, the evaluation metrics described above (R², VAF, MAE, and RMSE) were used to rank the results for the different population sizes with the TOPSIS evaluation method, and a radar chart (Figure 5) was constructed. When the population size of AEO-CatBoost was 60, the coverage area of the radar chart was the greatest, suggesting that the overall prediction performance of the model was best at this point. When the population size of SCSO-CatBoost was 100, the coverage area of the radar chart was similar to that of AEO-CatBoost at a population size of 60, and its overall prediction performance ranked second.
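A TOPSIS ranking over the four metrics can be sketched as follows. The text does not specify the normalization or the criterion weights, so this sketch assumes vector normalization and equal weights. The function name `topsis` and the example rows are hypothetical and are not results from the study.

```python
import math


def topsis(matrix, benefit, weights=None):
    """Return a TOPSIS closeness score in [0, 1] for each row of `matrix`.

    matrix  : one row per alternative, one column per criterion
    benefit : True where larger is better (R2, VAF), False where smaller is
              better (MAE, RMSE)
    Assumes at least two distinct alternatives and no all-zero column.
    """
    n_crit = len(matrix[0])
    weights = weights or [1.0 / n_crit] * n_crit  # assumption: equal weights

    # Vector-normalize each column, then apply the weights.
    norms = [math.sqrt(sum(row[j] ** 2 for row in matrix)) for j in range(n_crit)]
    scaled = [[weights[j] * row[j] / norms[j] for j in range(n_crit)] for row in matrix]

    # Ideal best and ideal worst value per criterion.
    cols = list(zip(*scaled))
    best = [max(c) if benefit[j] else min(c) for j, c in enumerate(cols)]
    worst = [min(c) if benefit[j] else max(c) for j, c in enumerate(cols)]

    scores = []
    for row in scaled:
        d_best = math.dist(row, best)
        d_worst = math.dist(row, worst)
        scores.append(d_worst / (d_best + d_worst))
    return scores


# Columns: R2, VAF, MAE, RMSE. Illustrative values only.
example = [
    [0.95, 94.0, 0.45, 0.60],
    [0.93, 92.5, 0.50, 0.66],
    [0.90, 89.0, 0.58, 0.75],
]
scores = topsis(example, benefit=[True, True, False, False])
ranking = sorted(range(len(example)), key=lambda k: scores[k], reverse=True)
```

A score close to 1 means the alternative is near the ideal on every criterion, so sorting by score in descending order gives the ranking.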

3.3. Comparison of Prediction Results

After training the models on the training set, we present their prediction results on the test set and select the best result from each of the six hybrid optimization models to compare the accuracy of backfill strength prediction (Figure 6). The proximity of the predicted values to the observed values is reflected by how closely the data points align with the reference line y = x.

Clearly, the AEO-CatBoost model exhibited superior performance in prediction accuracy when compared to other models. Additionally, a reduced number of points falling outside the 95% confidence interval indicates a higher level of model precision. From the data evaluation indicators, the prediction results of the AEO-CatBoost model for the backfill strength were the optimum (Figure 7a). Among all the models, the AEO-CatBoost model had the highest R² value of 0.947 for predicting the backfill strength, and the VAF could attain 93.614, followed by the SCSO-CatBoost model with an R² value of 0.944 and a VAF of 93.423. Additionally, the AEO-CatBoost prediction model had the minimum error of MAE (0.465) and RMSE (0.606); the SCSO-CatBoost model achieved the second-lowest error, with MAE (0.438) and RMSE (0.62).

3.4. Sensitivity Analysis

To quantify the contribution of each parameter in the dataset, we analyzed in more detail the overall impact of the input features on variations in backfill strength, based on the relationship between the magnitude of each input feature and its influence on backfill strength (Figure 7).

In the chart, points closest to blue indicate that the parameter input exerts a negative effect on backfill strength, whereas those nearest to red signify a positive effect. The parameters are arranged in descending order of their overall contribution from top to bottom. Clearly, curing temperature carries the greatest overall significance, followed by curing time. As curing temperature and curing time increased, they exerted the most prominent positive impact on backfill strength; specifically, higher curing temperature and longer curing time resulted in greater backfill strength. A further evaluation of the importance of different parameters shows that curing temperature had the highest importance score (1.80), followed by curing time (0.97). The remaining parameters, ranked in descending order of importance, were as follows: concentration (0.43) and cement–sand ratio (0.30).

4. Discussions

4.1. Evaluation of Model Applicability

In this study, we compared the results of CatBoost parameter optimization using six distinct population optimization algorithms, namely AEO, AO, GCO, SCSO, SSA, and WaOA. We analyzed the fitness curves of the models, the radar chart of the four evaluation indicators, and the prediction results, and concluded that the AEO-CatBoost model has the best evaluation indicators for predicting the strength of backfill.

Among all the models evaluated, the AEO-CatBoost model yielded the highest R² value of 0.947, signifying a strong correlation between its predicted outcomes and the actual measured data. Additionally, the AEO-CatBoost model achieved a VAF of 93.614, which ranked highest among all regression models considered. This finding demonstrates that the AEO-CatBoost model captures the relationships among the discrete data in the backfill strength dataset well, resulting in predicted backfill strength values that are highly consistent with the actual results.

Moreover, in terms of the differences between predicted and actual values, the AEO-CatBoost model achieved the lowest MAE and RMSE values, with the majority of data points falling within the 95% error band. This suggests that the AEO-CatBoost model possesses a robust fitting ability and can effectively minimize the variability of predictions by capturing the nonlinear patterns inherent in the backfill prediction dataset.

The prediction performance of the GCO-CatBoost and WaOA-CatBoost models for backfill strength is similar to that of the AEO-CatBoost model. However, the fitness curves of GCO-CatBoost and WaOA-CatBoost changed fewer than 5 times before converging, which suggests that these models may have become trapped in a local optimum. Overall, therefore, the AEO-CatBoost model was the most suitable for predicting backfill strength. The AEO-CatBoost model can promptly provide the backfill strength under different concentrations and curing conditions, which can significantly reduce the labor and time costs of determining the optimal backfill strength and facilitates rapid mine filling design.

4.2. Analysis of the Correlation of Fill Body Strength

Based on the correlation matrix and the SHAP explainability analysis, curing temperature had the greatest overall importance, followed by curing time. The underlying mechanism is that, as the curing temperature of the backfill rises and the curing time lengthens, the hydration reaction within the backfill intensifies. The hydration products then progressively fill the internal voids of the backfill material [53].

Once the material has solidified, the tailings and waste rock bind together into a structurally robust, integrated body (Figure 8). The macroscopic mechanical properties of the backfill material are the external expression of its microstructure, and changes in the microstructure are the fundamental cause of changes in its mechanical properties [54]. Consequently, as more of the internal pore space of the backfill material is filled and the material solidifies, the overall strength of the backfill body increases accordingly.

The correlation matrix revealed that the impact of curing duration and waste rock–tailing proportion on the strength of the backfill varies with increasing curing temperature. Curing duration had a positive effect on backfill strength; nonetheless, the coefficient of this positive effect first rose and then fell as the curing temperature increased. The strongest positive impact occurred when the curing temperature was 30 °C, which suggests that in the actual preparation process of backfill materials, researchers must rationally analyze the curing properties of the materials and design the optimal time and temperature to ensure the strength of the backfill.

Furthermore, when the curing temperature was kept at 50 °C, the waste rock–tailing ratio had a negative effect on the strength of the backfill. This phenomenon is explained in the literature [55]. Under high-temperature conditions, in backfill bodies with a low waste rock–tailing ratio, the relatively larger pores became compressed and turned into smaller ones. In contrast, backfill bodies with a high waste rock–tailing ratio had more pores than those with lower ratios. As a result, the small pores in these high-ratio backfill bodies kept expanding and eventually grew into larger pores.

It is generally acknowledged that the strength of backfill displays an inverse correlation with its porosity: as the porosity of backfill rises, its strength diminishes correspondingly. Building on this relationship, when the curing temperature was kept constant at 50 °C, a notable decrease in backfill strength occurred alongside an increase in the waste rock–tailing ratio. This finding not only highlights the intricate interplay between the compositional proportion and the physical characteristics of the backfill material, but also stresses the significance of meticulously adjusting the waste rock–tailing ratio during the backfilling process—especially under specific temperature conditions—to guarantee the intended mechanical properties and stability of backfilled structures.

5. Conclusions

(1) Analysis of the fitness curves, the radar chart of the four evaluation indicators, and the prediction results showed that the AEO-CatBoost model fitted the data well when the population size was 60 (R² = 0.947, VAF = 93.614) and had the smallest prediction error (RMSE = 0.606, MAE = 0.465). The AEO-CatBoost model can therefore capture the relationship among the strength of the backfill, the mix ratio of the backfilling materials, and the curing conditions, and can predict the strength of the backfill accurately.

(2) According to the correlation matrix of backfill strength and the results of SHAP explainability analysis, curing temperature and curing time have a marked positive effect on backfill strength. In terms of importance, they are followed by concentration, the waste rock–tailing ratio, and the cement–sand ratio.

(3) Elevated curing temperature and extended curing duration contribute to increased strength of the backfill, as the hydration reactions within the material require adequate time and appropriate thermal conditions. This hydration process reduced the porosity of the backfill, resulting in a more compact microstructure and, consequently, improved macroscopic strength. Furthermore, at a curing temperature of 50 °C, the waste rock–tailing ratio exerted a negative impact on backfill strength. This is attributed to the fact that a higher waste rock–tailing ratio increases the inherent porosity of the backfill. Under high-temperature conditions, the small pores within the backfill continue to expand and evolve into larger pores, thereby diminishing its macroscopic strength.

(4) The AEO-CatBoost model offers a novel intelligent method for predicting the strength of mine backfill, providing technical support for the advancement of backfill mining technology in the era of artificial intelligence. It enables mining technicians to estimate the strength of various backfill mixtures under different curing conditions quickly and at low cost, which supports safe mining operations.

Author Contributions

Conceptualization, K.Z.; Methodology, J.Q. and J.L.; Software, J.Q. and X.X.; Investigation, J.Q. and X.X.; Data curation, J.Q. and J.L.; Writing—original draft, J.Q.; Writing—review and editing, J.L.; Supervision, K.Z.; Funding acquisition, K.Z. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by Guangxi Key Research and Development Program of China grant number [2024AD47009].

Data Availability Statement

The datasets generated during and/or analysed during the current study are available from the corresponding author on reasonable request.

Conflicts of Interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

References

Feng, J.W.; Zhang, Z.Y.; Guan, W.M.; Wang, W.; Xu, X.Y.; Song, Y.Z.; Liu, H.; Su, H.; Zhao, B.; Hou, D.Z. Review of the Backfill Materials in Chinese Underground Coal Mining. Minerals 2023, 13, 473.
Li, M.; Peng, Y.F.; Ding, L.W.; Zhang, J.X.; Ma, D.; Huang, P. Analysis of Surface Deformation Induced by Backfill Mining Considering the Compression Behavior of Gangue Backfill Materials. Appl. Sci.-Basel 2023, 13, 160.
Li, M.; Zhang, J.X.; Huang, P.; Gao, R. Mass ratio design based on compaction properties of backfill materials. J. Cent. South Univ. 2016, 23, 2669–2675.
Zhang, J.X.; Li, M.; Taheri, A.; Zhang, W.Q.; Wu, Z.Y.; Song, W.J. Properties and Application of Backfill Materials in Coal Mines in China. Minerals 2019, 9, 53.
Zhao, Y.J.; Lu, X.Y.; Liu, L.; Wen, D.; Wang, M.Y.; Zhang, X.Y.; Zhang, B. Comparative study of novel backfill coupled heat exchangers using different waste materials in underground stopes. Geothermics 2023, 110, 102676.
Zhu, X.J.; Guo, G.L.; Liu, H.; Chen, T.; Yang, X.Y. Experimental research on strata movement characteristics of backfill-strip mining using similar material modeling. Bull. Eng. Geol. Environ. 2019, 78, 2151–2167.
Behera, S.K.; Singh, P.; Mishra, D.P.; Mishra, K.; Kumar, A.; Mandal, S.K.; Mandal, P.K.; Mishra, A.K. Required strength design of cemented backfill for underground metalliferous mine. Int. J. Min. Reclam. Environ. 2023, 37, 927–952.
Tang, J.; Li, P.; Chen, X.; Bai, Y. Experimental study of strength, pore structure and phase evolution characteristics of iron tailings cemented paste backfill under high-temperature. Cem. Wapno Beton 2020, 25, 78–94.
Gao, Z.; Huang, M.Q.; Zhan, S.L.; Tan, W. Strength distribution of cemented waste rock backfill: A similarity simulation experiment. Front. Earth Sci. 2024, 11, 1328421.
Wang, Z.-Z.; Wu, A.-X.; Wang, H.-J. A Strength Design Method of Cemented Backfill with a High Aspect Ratio. Adv. Civ. Eng. 2020, 2020, 7159208.
Zhang, Y.; Zhang, Z.; Guo, L.; Du, X. Strength Model of Backfill-Rock Irregular Interface Based on Fractal Theory. Front. Mater. 2021, 8, 792014.
Porathur, J.L.; Sekhar, S.; Godugu, A.K.; Bhargava, S. Stability analysis of a free-standing backfill wall and a predictive equation for estimating the required strength of a backfill material-a numerical modelling approach. J. South. Afr. Inst. Min. Metall. 2022, 122, 227–233.
Hassani, F.P.; Mortazavi, A.; Shabani, M. An investigation of mechanisms involved in backfill-rock mass behaviour in narrow vein mining. J. South. Afr. Inst. Min. Metall. 2008, 108, 463–472.
Hou, C.; Zhu, W.-C.; Yan, B.-X.; Yang, L.-J.; Du, J.-F.; Niu, L.-L. Mechanical behavior of backfilled pillar under biaxial loading. J. Cent. South Univ. 2023, 30, 1191–1204.
Huang, M.; Chen, L.; Zhang, M.; Zhan, S. Multi-Objective Function Optimization of Cemented Neutralization Slag Backfill Strength Based on RSM-BBD. Materials 2022, 15, 1585.
Li, M.; Peng, Y.; Zhang, J.; Zhao, Y.; Wang, Z.; Guo, Q.; Guo, S. Properties of a backfill material prepared by cementing coal gangue and fly ash through microbial-induced calcite precipitation. Constr. Build. Mater. 2023, 384, 131329.
Qiu, H.; Zhang, F.; Liu, L.; Hou, D.; Tu, B. Influencing Factors on Strength of Waste Rock Tailing Cemented Backfill. Geofluids 2020, 2020, 8847623.
Qiu, H.; Zhang, F.; Sun, W.; Liu, L.; Zhao, Y.; Huan, C. Experimental Study on Strength and Permeability Characteristics of Cemented Rock-Tailings Backfill. Front. Earth Sci. 2022, 10, 802818.
Qiu, J.-P.; Yang, L.; Xing, J.; Sun, X.-G. Analytical Solution for Determining the Required Strength of Mine Backfill Based on its Damage Constitutive Model. Soil Mech. Found. Eng. 2018, 54, 371–376.
Han, B.; Zhang, S.Y.; Sun, W. Impact of Temperature on the Strength Development of the Tailing-Waste Rock Backfill of a Gold Mine. Adv. Civ. Eng. 2019, 2019, 4379606.
Liu, E.Y.; Zhang, Q.L.; Feng, Y.; Zhao, J.W. Experimental study of static and dynamic mechanical properties of double-deck backfill body. Environ. Earth Sci. 2017, 76, 689.
Liu, W.Z.; Hu, Z.J.; Liu, C.; Huang, X.P.; Hou, J.F. Mechanical properties under triaxial compression of coal gangue-fly ash cemented backfill after cured at different temperatures. Constr. Build. Mater. 2024, 411, 134268.
Qiu, H.F.; Zhang, F.S.; Liu, L.; Huan, C.; Hou, D.Z.; Kang, W. Experimental study on acoustic emission characteristics of cemented rock-tailings backfill. Constr. Build. Mater. 2022, 315, 125278.
Wang, J.; Fu, J.X.; Song, W.D.; Zhang, Y.F. Viscosity and Strength Properties of Cemented Tailings Backfill with Fly Ash and Its Strength Predicted. Minerals 2021, 11, 78.
Yang, Y.B.; Lai, X.P.; Zhang, Y.; Shan, P.F.; Tong, L.; Liu, Y.Z.; Zhang, L.M.; Wu, L.Q. Strength deterioration and energy dissipation characteristics of cemented backfill with different gangue particle size distributions. J. Mater. Res. Technol.-JmrT 2023, 25, 5122–5135.
Orejarena, L.; Fall, M. The use of artificial neural networks to predict the effect of sulphate attack on the strength of cemented paste backfill. Bull. Eng. Geol. Environ. 2010, 69, 659–670.
Qi, C.C.; Fourie, A.; Chen, Q.S.; Zhang, Q.L. A strength prediction model using artificial intelligence for recycling waste tailings as cemented paste backfill. J. Clean. Prod. 2018, 183, 566–578.
de-Prado-Gil, J.; Palencia, C.; Silva-Monteiro, N.; Martinez-Garcia, R. To predict the compressive strength of self compacting concrete with recycled aggregates utilizing ensemble machine models. Case Stud. Constr. Mater. 2022, 16, e01046.
Qi, C.C.; Tang, X.L.; Dong, X.J.; Chen, Q.S.; Fourie, A.; Liu, E.Y. Towards Intelligent Mining for Backfill: A genetic programming-based method for strength forecasting of cemented paste backfill. Miner. Eng. 2019, 133, 69–79.
Xiong, S.; Liu, Z.; Min, C.; Shi, Y.; Zhang, S.; Liu, W. Compressive Strength Prediction of Cemented Backfill Containing Phosphate Tailings Using Extreme Gradient Boosting Optimized by Whale Optimization Algorithm. Materials 2023, 16, 308.
Lu, X.; Zhou, W.; Ding, X.H.; Shi, X.Y.; Luan, B.Y.; Li, M. Ensemble Learning Regression for Estimating Unconfined Compressive Strength of Cemented Paste Backfill. IEEE Access 2019, 7, 72125–72133.
Suo, Y.; Zhang, C.; Liu, L.; Qu, H.; Yang, P.; Xie, G. Proportion optimization and strength prediction of CGS backfill materials based on GA-ELM mode. Energy Sources Part A-Recovery Util. Environ. Eff. 2023, 45, 5173–5189.
Wu, M.; Wang, C.; Zuo, Y.; Yang, S.; Zhang, J.; Luo, Y. Study on strength prediction and strength change of Phosphogypsum-based composite cementitious backfill based on BP neural network. Mater. Today Commun. 2024, 41, 110331.
Hu, Y.; Ye, Y.; Zhang, B.; Li, K.; Han, B. Distribution characterization and strength prediction of backfill in underhand drift stopes based on sparrow search algorithm-extreme learning machine and field experiments. Case Stud. Constr. Mater. 2024, 21, e03784.
Qi, Y.S.; Zhang, X.D.; Zhang, J.X. Gear faults identification based on big data analysis and CatBoost model. Int. J. Model. Identif. Control. 2022, 41, 334–342.
Hancock, J.T.; Khoshgoftaar, T.M. CatBoost for big data: An interdisciplinary review. J. Big Data 2020, 7, 94.
Dasi, H.; Ying, Z.; Yang, B.Y. Predicting the consumed heating energy at residential buildings using a combination of categorical boosting (CatBoost) and Meta heuristics algorithms. J. Build. Eng. 2023, 71, 106584.
Zhao, W.G.; Wang, L.Y.; Zhang, Z.X. Artificial ecosystem-based optimization: A novel nature-inspired meta-heuristic algorithm. Neural Comput. Appl. 2020, 32, 9383–9425.
Sasmal, B.; Hussien, A.G.; Das, A.; Dhal, K.G. A Comprehensive Survey on Aquila Optimizer. Arch. Comput. Methods Eng. 2023, 30, 4449–4476.
Villaseñor, C.; Arana-Daniel, N.; Alanis, A.Y.; López-Franco, C.; Hernandez-Vargas, E.A. Germinal Center Optimization Algorithm. Int. J. Comput. Intell. Syst. 2019, 12, 13–27.
Villaseñor, C.; Rios, J.D.; Arana-Daniel, N.; Lopez-Franco, C.; Gomez-Avila, J. Optimized control and neural observers with germinal center optimization: A review. Annu. Rev. Control. 2019, 48, 273–280.
Li, Y.C.; Yu, Q.; Du, Z.F. Sand cat swarm optimization algorithm and its application integrating elite decentralization and crossbar strategy. Sci. Rep. 2024, 14, 8927.
Yan, S.Q.; Liu, W.D.; Li, X.Q.; Yang, P.; Wu, F.X.; Yan, Z. Comparative Study and Improvement Analysis of Sparrow Search Algorithm. Wirel. Commun. Mob. Comput. 2022, 2022, 4882521.
Han, M.X.; Du, Z.F.; Yuen, K.F.; Zhu, H.T.; Li, Y.C.; Yuan, Q.Y. Walrus optimizer: A novel nature-inspired metaheuristic algorithm. Expert Syst. Appl. 2024, 239, 122413.
Hasanien, H.M.; Alsaleh, I.; Ullah, Z.; Alassaf, A. Probabilistic optimal power flow in power systems with Renewable energy integration using Enhanced walrus optimization algorithm. Ain Shams Eng. J. 2024, 15, 102663.
Gu, Z.; Xiong, X.; Yang, C.; Cao, M. Investigation of Micro-Scale Damage and Weakening Mechanisms in Rocks Induced by Microwave Radiation and Their Associated Strength Reduction Patterns: Employing Meta-Heuristic Optimization Algorithms and Extreme Gradient Boosting Models. Mathematics 2024, 12, 2954.
Gu, Z.; Xiong, X.; Yang, C.; Cao, M.; Xu, C. Research on prediction of PPV in open pit mine used on intelligent hybrid model of extreme gradient boosting. J. Environ. Manag. 2024, 371, 123248.
Qiu, Y.; Zhou, J.; He, B.; Armaghani, D.J.; Huang, S.; He, X. Evaluation and Interpretation of Blasting-Induced Tunnel Overbreak: Using Heuristic-Based Ensemble Learning and Gene Expression Programming Techniques. Rock Mech. Rock Eng. 2024, 57, 7535–7563.
Sun, M.; Yang, J.; Yang, C.; Wang, W.; Wang, X.; Li, H. Research on prediction of PPV in open-pit mine used RUN-XGBoost model. Heliyon 2024, 10, e28246.
Qiu, Y.; Zhou, J. Novel rockburst prediction criterion with enhanced explainability employing CatBoost and nature-inspired metaheuristic technique. Undergr. Space 2024, 19, 101–118.
Mame, M.; Qiu, Y.; Huang, S.; Du, K.; Zhou, J. Mean Block Size Prediction in Rock Blast Fragmentation Using TPE-Tree-Based Model Approach with SHapley Additive exPlanations. Min. Metall. Explor. 2024, 41, 2325–2340.
Zhou, J.; Wang, Z.; Qiu, Y.; Li, P.; Tao, M. Sparrow search algorithm enhanced multi-output regression for predicting rock fracture shear displacements: A metaheuristic-hybridized model. Mech. Adv. Mater. Struct. 2024, 32, 1286–1302.
Qin, Z.F.; Jin, J.X.; Liu, L.; Zhang, Y.; Du, Y.L.; Yang, Y.; Zuo, S.H. Reuse of soil-like material solidified by a biomass fly ash-based binder as engineering backfill material and its performance evaluation. J. Clean. Prod. 2023, 402, 136824.
Xu, H.Q.; Huang, Y.H.; Shu, S.; Zhou, A.Z.; Jiang, P.M.; Liu, S.Q.; Wang, L.Y.; Mei, L. Dynamic of solidified dredged sediment based on different cements and confining pressures. Environ. Geotech. 2022, 9, 390–398.
Gao, R.G.; Wang, W.J.; Xiong, X.; Li, J.J.; Xu, C. Effect of curing temperature on the mechanical properties and pore structure of cemented backfill materials with waste rock-tailings. Constr. Build. Mater. 2023, 409, 133850.

Figure 1. Workflow.

Figure 2. Data correlation matrix (the coefficients are Pearson correlation coefficients; the number of “*” symbols indicates the degree of correlation: the more symbols, the stronger the correlation between the two variables).

Figure 3. Parameter optimization results.

Figure 4. Parallel coordinate diagram of parameter optimization.

Figure 5. Radar chart of the parameter optimization validation index.

Figure 6. Comparative analysis of the prediction results (the blue dotted line indicates the 95% error boundary).

Figure 7. SHAP interpretability optimization. (a) Overall impact effects. (b) Overall rating ranking.

Figure 8. Consolidation mechanism of backfill.

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
