Article

Explainable Ensemble Learning Model for Residual Strength Forecasting of Defective Pipelines

1 School of Human Settlements and Civil Engineering, Xi’an Jiaotong University, Xi’an 710049, China
2 Shaanxi Provincial Natural Gas Co., Ltd., Xi’an 710016, China
* Author to whom correspondence should be addressed.
Appl. Sci. 2025, 15(7), 4031; https://doi.org/10.3390/app15074031
Submission received: 6 February 2025 / Revised: 29 March 2025 / Accepted: 2 April 2025 / Published: 6 April 2025
(This article belongs to the Topic Oil and Gas Pipeline Network for Industrial Applications)

Abstract

The accurate prediction of the residual strength of defective pipelines is a critical prerequisite for ensuring the safe operation of oil and gas pipelines, and it holds significant implications for the pipeline’s remaining service life and preventive maintenance. Traditional machine learning algorithms often fail to comprehensively account for the correlative factors influencing the residual strength of defective pipelines, exhibit limited capability in extracting nonlinear features from data, and suffer from insufficient predictive accuracy. Furthermore, the predictive models typically lack interpretability. To address these issues, this study proposes a hybrid prediction model for the residual strength of defective pipelines based on Bayesian optimization (BO) and eXtreme Gradient Boosting (XGBoost). This approach resolves the issues of excessive iterations and high computational costs associated with conventional hyperparameter optimization methods, significantly enhancing the model’s predictive performance. The model’s prediction performance is evaluated using mainstream metrics such as the Mean Absolute Percentage Error (MAPE), Coefficient of Determination (R2), Root Mean Square Error (RMSE), robustness analysis, overfitting analysis, and grey relational analysis. To enhance the interpretability of the model’s predictions, reveal the significance of features, and confirm prior domain knowledge, Shapley additive explanations (SHAP) are employed to conduct the relevant research. The results indicate that, compared with Random Forest, LightGBM, Support Vector Machine, gradient boosting regression tree, and Multi-Layer Perceptron, the BO-XGBoost model exhibits the best prediction performance, with MAPE, R2, and RMSE values of 5.5%, 0.971, and 1.263, respectively. Meanwhile, the proposed model demonstrates the highest robustness, the least tendency for overfitting, and the most significant grey relation degree value. 
SHAP analysis reveals that the factors influencing the residual strength of defective pipelines, ranked in descending order of importance, are defect depth (d), wall thickness (t), yield strength (σy), external diameter (D), defect length (L), tensile strength (σu), and defect width (w). The development of this model contributes to improving the integrity management of oil and gas pipelines and provides decision support for the intelligent management of defective pipelines in oil and gas fields.

1. Introduction

As a means of transporting oil and natural gas, pipelines have become the primary mode of oil and gas transportation due to their large-scale transportation capacity, reliable and safe transportation process, low cost, and environmental “friendliness” [1]. During their entire life cycle, pipelines are affected by many factors, such as improper construction and operation [2], design flaws [3], mechanical damage [4], material ageing [5], and corrosion [6,7]; as a result, the remaining strength of pipelines decreases, thus affecting the service life of pipelines.
Nearly half of the world’s operational oil and gas pipelines have entered the ageing phase. Taking the United States as an illustration, approximately half of the pipelines currently in service were commissioned before the 1960s. Statistical data indicate that, due to ageing pipeline infrastructure, corrosion defects, and other problems, the annual failure rate of oil and gas pipelines has escalated from 0.04 incidents per 10³ km to 0.14 incidents per 10³ km. Since 2010, statistics show that 670 oil and gas pipeline accidents have occurred in the United States. Factors such as defects are the primary causes of pipeline failures for these ageing pipelines. Precisely predicting residual strength in defective pipelines is critical for assessing their remaining service life, guaranteeing operational safety, and facilitating proactive maintenance measures. Consequently, it is essential to employ appropriate methods to forecast the remaining strength of pipelines with defects, thereby enabling the pre-emptive prevention of failure accidents [8].
According to different calculation methods, the commonly used prediction methods for pipeline remaining strength include empirical formulas, finite element analysis, and machine learning approaches. The widely used methods are shown in Figure 1. Due to their convenience in calculation and simplicity in operation, empirical formulas have been commonly applied in predicting the remaining strength of defective pipelines. A series of typical formulas, such as ASME B31G-2009, BS 7910-2005, PCORRC, and API 579-1/ASME FFS-1-2007, are applied to calculate remaining strength [9,10,11,12,13]. Among them, when calculating the remaining strength, BS 7910-2005 and PCORRC yield similar curves. ASME B31G-2009 shows a certain conservatism in evaluating the pipeline’s remaining strength. API 579-1/ASME FFS-1-2007 exhibits significant conservatism in evaluating medium- and high-grade pipeline steels, resulting in inaccurate evaluation results. The above-mentioned empirical formulas are relatively conservative, adapt poorly to varied conditions, have a limited application scope, and produce prediction errors that cannot be ignored. With the advancement of computer technology, researchers have gradually applied the finite element method to the numerical simulation of pipeline remaining strength, and relatively good prediction results have been obtained. For instance, Wei et al. proposed a variable-strength material damage model with damage accumulation, achieving the coupled analysis of structural damage and material fatigue residual strength, and enabling the calculation of the strength of nonlinear damage cross-sections, which has essential practical engineering application value [14]. Zhou et al. considered the influence of different defect sizes on pipeline residual strength, established a finite-element model of an X80 pipeline with grouped corrosion defects using ABAQUS 6.14.2, and verified the accuracy of the finite-element model through burst tests.
The research results provide theoretical support for the safety management of X80 pipelines [9]. Li et al. analyzed the residual strength of buried pipelines with corrosion defects under different defect parameters using the finite-element elastic-plastic method, and the research can provide valuable references for simulation, real-time monitoring, and safety analysis during the pipeline operation stage [15]. Muda et al. studied whether the strength of repaired offshore pipelines was sufficient to withstand the burst pressure load, constructed a strength calculation model, and verified the correctness of the calculation model by combining it with the finite-element method [16]. Although finite-element analysis often yields satisfactory results, the actual operating conditions of the pipeline need to be considered. To achieve high accuracy, a very fine mesh is required, resulting in high computational costs and lengthy time consumption. In addition, if the defect parameters change, the model needs to be rebuilt, lacking replicability and portability, making this method inconvenient and inefficient.
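For reference, the best-known of the empirical criteria above, Modified B31G, can be sketched in a few lines. The coefficients below are the published 0.85-dL values; the example geometry is purely illustrative and is not taken from this paper's dataset.

```python
import math

def modified_b31g_failure_pressure(D, t, L, d, smys):
    """Estimate the failure pressure (MPa) of a corroded pipe via Modified B31G.

    D: outer diameter (mm), t: wall thickness (mm), L: defect length (mm),
    d: defect depth (mm), smys: specified minimum yield strength (MPa).
    """
    flow_stress = smys + 68.95                  # SMYS + 10 ksi, per Modified B31G
    z = L ** 2 / (D * t)
    if z <= 50:
        M = math.sqrt(1 + 0.6275 * z - 0.003375 * z ** 2)   # Folias bulging factor
    else:
        M = 0.032 * z + 3.3
    ratio = d / t
    return (2 * t * flow_stress / D) * (1 - 0.85 * ratio) / (1 - 0.85 * ratio / M)

# Illustrative geometry (not from the paper's burst-test data):
pf = modified_b31g_failure_pressure(D=324.0, t=9.5, L=100.0, d=3.0, smys=358.0)
```

As expected of such criteria, a deeper defect (larger d) lowers the predicted failure pressure, which is consistent with the feature importance discussed later in the paper.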
With the development of artificial intelligence (AI) technology, many researchers have introduced the machine learning (ML) approach to predict pipeline residual strength. For example, Abyani et al. utilized different machine learning algorithms to predict the failure pressure of defective offshore pipelines. The results showed that the Gaussian Process Regression (GPR) and Multi-Layer Perceptron (MLP) models performed the best among all models [17]. Chen et al. applied the Artificial Neural Network (ANN) to the prediction of the remaining strength of defective pipelines, addressing the problem of model over-fitting caused by insufficient experimental data in the past. The results indicated that compared with the feed-forward neural network, the MLP method had higher prediction accuracy when the sample data were insufficient [18]. Li et al. proposed a data-driven method for predicting the remaining strength of corrosion defects in offshore pipelines. Compared with the prediction results of existing algorithms, the established model had advantages in terms of error and regression performance [19]. Kumar et al. combined the ANN with the finite element method to predict the failure pressure of defective pipelines. The study found that the combination of the two methods overcame the shortcomings of the finite element method, and by using the ANN as part of the evaluation framework, the failure pressure of corroded pipelines could be predicted in a short time without affecting the accuracy of the results [20]. Sun et al. aimed to address the problem of significant errors in predicting the remaining strength of pipelines with defects and established a prediction model for the remaining strength of pipelines with defects based on Support Vector Regression (SVR).
The results showed that the minimum relative error of the SVR model’s prediction results was 0.55%, the maximum relative error was 10.35%, and the average relative error was 2.63%, verifying the accuracy and robustness of the SVR model. The research results can provide decision-making support for pipeline operation scheduling, inspection, and maintenance [21]. Peng et al., facing the difficulty of predicting the failure strength of internal corrosion defects in pipelines under external pressure, established a parametric analysis model for pipelines with corrosion defects, and analyzed the influence of corrosion defects and hole parameters on the collapse resistance strength of pipelines. The results showed that the established data-driven prediction model had good prediction performance. This research can provide a reference for evaluating the strength of pipeline corrosion defects and establishing a safety early-warning mechanism [22]. Su et al., aiming to solve the problem of predicting the failure pressure of defective pipelines, proposed a method based on deep learning. The prediction results showed that the adopted deep-learning model had high prediction accuracy. In addition, under the same computational conditions, the calculation speed of the deep learning model was at least two orders of magnitude faster than that of finite element simulation. This research is expected to support the rapid prediction of defective oil and gas pipeline failure pressure [23].
However, a single machine learning algorithm has its limitations. During training, a single model often suffers from problems such as getting trapped in local minima and having poor generalization performance. Additionally, deep learning network architectures are complex. A large amount of historical data is often required as empirical knowledge during the model training process. The improper selection of hidden layers and nodes in the model architecture will significantly affect the network performance, making over-fitting likely to occur. In recent years, eXtreme Gradient Boosting (XGBoost) has achieved good application results in industry and academia. Many researchers have applied this method to the oil and gas pipeline field. As a cutting-edge algorithm in artificial intelligence, XGBoost has advantages such as fast convergence, strong fitting ability, and excellent performance in capturing nonlinear and complex relationships. Considering that XGBoost has numerous hyperparameters, optimizing the selection of the XGBoost hyperparameter set can further improve its prediction performance. Bayesian optimization (BO), as a highly effective global optimization algorithm, requires fewer iterations and has a faster convergence speed compared to other hyperparameter optimization methods. This makes it more applicable when computational resources are limited [24], and it has been widely used in various prediction algorithms. Therefore, BO is adopted to optimize the hyperparameters of XGBoost, and a prediction model for the remaining strength of defective pipelines is constructed through BO-XGBoost. However, most current machine learning methods in engineering applications exhibit a “black box” form during calculation. The mapping process between the input and output in the learning model is unexplainable.
Even if machine learning makes highly accurate predictions, it is difficult to apply machine learning to some high-risk application scenarios due to the poor interpretability of the prediction results. Therefore, it is necessary to strengthen the research on the interpretability of artificial intelligence technology and the judgement and prevention of potential risks. In high-risk fields, relevant studies on the interpretability of machine learning have been carried out, such as medical diagnosis [25], aerospace [26], construction [27], turbulent premixed combustion [28], and other fields [29,30,31]. For instance, in the domain of medical diagnosis, the implementation of SHAP analysis has rendered the model’s decision-making process both transparent and comprehensible. This ensures that medical experts can understand and trust the model’s predictive outcomes, enhancing its practical utility in clinical settings and providing invaluable assistance to physicians in making accurate lung cancer diagnoses. SHAP analysis has been employed in the aerospace sector to elucidate the factors contributing to flight delays, enabling airlines to discern the underlying causes. This approach offers valuable insights into the significance of individual features and their interactions in predicting flight delays, thereby improving the transparency of the model’s outputs. The relevant research statistics are shown in Table 1. Compared to LIME or feature permutation, SHAP demonstrates superior performance in the theoretical foundation, global and local feature interpretation, feature interaction, computational efficiency, and visualization. It offers more precise and comprehensive model explanations, making it suitable for interpretability research in complex models. 
Therefore, to improve the interpretability of the model, the Shapley additive explanations (SHAP) method is adopted to study the interpretability of the model, analyze the importance of input features to the prediction results, meet the industry’s requirements for interpretable prediction models, and enhance the model’s guiding effect on production practice.
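SHAP attributions follow the Shapley value definition from cooperative game theory: a feature's contribution is its marginal effect averaged over all feature coalitions. As a didactic sketch (not the efficient TreeSHAP algorithm used for tree ensembles), exact Shapley values can be computed by brute force for a toy model, treating an "absent" feature as one replaced by a baseline value; the linear surrogate model below is hypothetical.

```python
from itertools import combinations
from math import factorial

def shapley_values(model, x, baseline):
    """Exact Shapley values for one prediction by enumerating all coalitions.

    Exponential in the number of features, so only suitable as an illustration.
    """
    n = len(x)

    def value(subset):
        # model output with features outside `subset` set to their baseline
        z = [x[i] if i in subset else baseline[i] for i in range(n)]
        return model(z)

    phi = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for k in range(n):
            for S in combinations(others, k):
                weight = factorial(k) * factorial(n - k - 1) / factorial(n)
                phi[i] += weight * (value(set(S) | {i}) - value(set(S)))
    return phi

# Hypothetical linear surrogate with known coefficients: attributions
# should recover each coefficient times the feature's deviation from baseline.
model = lambda z: 2.0 * z[0] + 3.0 * z[1]
phi = shapley_values(model, x=[1.0, 1.0], baseline=[0.0, 0.0])
```

By the efficiency property, the attributions sum exactly to the difference between the prediction and the baseline output, which is what makes SHAP rankings such as the one reported in this paper internally consistent.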
Compared with previous studies, the differences and novelty of the study are as follows: (1) The stratified sampling method is employed to divide the training set and the test set, effectively mitigating the significant bias introduced by random sampling and ensuring the reliability and objectivity of the prediction results. A Bayesian optimization algorithm is proposed to address the issues of numerous iterations and high computational costs associated with traditional hyperparameter optimization methods. (2) A novel hybrid prediction model, BO-XGBoost, is proposed. The model is evaluated based on metrics such as RE, MAPE, R2, and RMSE, and its superior predictive performance is validated through robustness analysis, overfitting analysis, and grey relational analysis. This model holds significant implications for predicting the remaining useful life of pipelines and facilitating preventive maintenance. (3) The SHAP method is utilized to investigate the influence of input variables on the model’s output, enhancing the interpretability of the model’s predictions and meeting the industry’s demand for interpretable predictive models. This improvement strengthens the model’s practical guidance for production. The establishment of this model contributes to the advancement of the integrity management of oil and gas pipelines and provides critical decision-making support for the intelligent management of pipeline systems.
This work attempts to construct a prediction model for the remaining strength of defective pipelines using BO-XGBoost. To enhance the novelty and interpretability of the research, we analyze the importance of input features to the prediction results by calculating SHAP values, thus improving the interpretability of the model. The remaining part of this paper is structured as follows. Section 2 provides an overview of the methods employed, elaborating on essential details such as data collection, and the specific model constructed. Section 3 evaluates the prediction performance of the constructed model. Section 4 presents a conclusive summary, summarizing the key findings.

2. Individual Methods

This section briefly overviews the various machine learning methods involved in the hybrid model construction process and elaborates on the essential details of data collection, data processing, and the particular model constructed.

2.1. Machine Learning Algorithms

2.1.1. RF

Random Forest (RF) is an ensemble learning method proposed by Leo Breiman based on classification tree construction [32]. The method is based on traditional decision trees. It incorporates the ensemble learning concept, obtaining the training samples and features of each decision tree by random sampling and producing the final prediction by aggregating the outputs of multiple decision trees (voting for classification, averaging for regression). The principle of the random forest regression model is shown in Figure 2.

2.1.2. LGB

LightGBM (LGB) was proposed by Microsoft in 2017 as an improved optimization algorithm based on the GBDT algorithm, aiming at solving machine learning problems on large-scale datasets [33]. The principle of the LGB model is shown in Figure 3. It addresses the issues of very large sample and feature counts through Gradient-based One-Side Sampling (GOSS) and Exclusive Feature Bundling (EFB), further improving the model’s operational efficiency. As an efficient and flexible gradient boosting decision tree framework, it has many advantages, particularly in large-scale data processing, where it offers high efficiency, low memory consumption, and strong prediction performance. LightGBM shows strong performance and low computational overhead when dealing with complex tasks by optimizing the tree construction, histogram algorithm, and leaf-splitting strategy.

2.1.3. SVM

The Support Vector Machine (SVM), proposed by Vapnik et al., is a robust tool for high-dimensional data and small-sample learning that is widely used in classification and regression problems [34]. The principle of the model is shown in Figure 4. The core idea of SVM is to separate different categories of sample data by finding an optimal hyperplane for classification or regression. Its core advantages lie in its ability to achieve good generalization by maximizing the classification margin and to solve nonlinear problems through kernel tricks.

2.1.4. GBRT

The gradient boosting regression tree (GBRT) is a supervised ML method and a widely used ensemble learning approach for regression and classification tasks [35]. The model principle is shown in Figure 5. It uses boosting, a statistical technique, to improve traditional decision tree methods. The basic idea behind this technique is to combine multiple “weak” models into a “strong” consensus model rather than constructing a single optimized model.
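The "weak learners combined into a strong consensus" idea can be sketched with the simplest possible weak learner, a depth-1 regression tree (stump), fitted round by round to the current residuals. The toy 1-D step-function data below are illustrative, not the pipeline dataset.

```python
def fit_stump(x, residuals):
    """Depth-1 regression tree: pick the threshold split minimizing squared error."""
    best = None
    for threshold in sorted(set(x)):
        left = [r for xi, r in zip(x, residuals) if xi <= threshold]
        right = [r for xi, r in zip(x, residuals) if xi > threshold]
        if not left or not right:
            continue
        lmean, rmean = sum(left) / len(left), sum(right) / len(right)
        err = (sum((r - lmean) ** 2 for r in left)
               + sum((r - rmean) ** 2 for r in right))
        if best is None or err < best[0]:
            best = (err, threshold, lmean, rmean)
    return best[1], best[2], best[3]

def gradient_boost(x, y, n_rounds=50, lr=0.1):
    """Each round fits a stump to the residuals and nudges predictions toward y."""
    base = sum(y) / len(y)
    pred = [base] * len(y)
    stumps = []
    for _ in range(n_rounds):
        residuals = [yi - pi for yi, pi in zip(y, pred)]
        t, lm, rm = fit_stump(x, residuals)
        stumps.append((t, lm, rm))
        pred = [pi + lr * (lm if xi <= t else rm) for xi, pi in zip(x, pred)]

    def predict(xq):
        return base + sum(lr * (lm if xq <= t else rm) for t, lm, rm in stumps)

    return predict

# Toy data: a step function that a single stump can only fit coarsely,
# but 50 shrunken stumps fit closely.
xs = [i / 10 for i in range(1, 11)]
ys = [0.0] * 5 + [1.0] * 5
predict = gradient_boost(xs, ys)
```

The learning rate (shrinkage) is what distinguishes boosting from simply averaging trees as RF does: each stump corrects only a fraction of the remaining error, which is also the mechanism XGBoost builds on.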

2.1.5. MLP

The Multi-Layer Perceptron (MLP) is a feed-forward artificial neural network and one of the most basic network structures in the field of deep learning [36]. The principle of the model is shown in Figure 6. An MLP usually consists of an input layer, one or more hidden layers, and an output layer. Each layer consists of several neurons (nodes), and the nodes are connected by weights. As a powerful neural network model with strong nonlinear fitting ability, it performs well in many supervised learning tasks. Its strengths include the ability to handle complex data relationships, automatic learning of features, and a flexible network structure that allows for good predictions in various tasks.

2.1.6. XGBoost

XGBoost is an ensemble learning algorithm developed by Chen et al. [37]. The algorithm is based on the gradient boosting framework, and the model principle is shown in Figure 7. As an efficient and robust machine learning algorithm, it is widely used in regression and classification problems. It can deal with complex data through gradient boosting and regularization, with high efficiency and powerful prediction ability. With proper hyperparameter tuning and reasonable feature engineering, XGBoost can provide high prediction accuracy and model robustness. XGBoost can achieve better prediction performance in a shorter time compared to other gradient boosting algorithms.

2.1.7. Hyper-Parameter Optimization of the Model

The XGBoost model has many hyper-parameters, and different combinations can significantly affect the model’s computational efficiency and generalization performance. Therefore, optimizing the hyper-parameters to obtain a model with good computational efficiency and generalization performance is essential. Hyper-parameter optimization is a black-box problem, meaning that the functional expression is unknown during the optimization process, and only the corresponding objective function values can be obtained based on discrete independent variables. This characteristic poses numerous difficulties for hyperparameter optimization. Currently, an increasing number of hyper-parameter optimizations are accomplished through automated methods. These methods aim to find the optimal parameter combination with less resource consumption following specific strategies. Commonly used hyper-parameter optimization methods include grid search, random search, and Bayesian optimization [38]. Among them, grid search determines a set of optimal values as the final hyper-parameter combination of the model by traversing and looping through all the points on the grid in the hyper-parameter space within the given hyper-parameter range. However, this method is highly time-consuming and computationally resource-intensive in the case of multiple parameters and large datasets. The idea of random search is quite similar to that of grid search. It searches for the global optimal point by randomly sampling points within the search range, saving optimization time but without guaranteeing the best combination. Bayesian optimization (BO) is a global optimization algorithm based on probability distribution. Unlike grid search and random search, BO constructs a surrogate function from a limited number of sample points to approximate the black-box objective function.
This reduces the time cost of calculating the objective function and can achieve the optimal solution of complex objective functions with fewer evaluation times, thus significantly improving the optimization efficiency and convergence speed of hyper-parameters. Therefore, in this paper, BO is proposed to optimize the hyper-parameters of the established model. Substituting the optimized hyper-parameters into the model can yield the trained remaining strength prediction model.
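The surrogate-plus-acquisition loop behind BO can be sketched in a few lines. In this paper, BO tunes XGBoost hyper-parameters; here, as a self-contained proxy, a 1-D toy "validation error" curve is minimized using a Gaussian-process surrogate and a lower-confidence-bound acquisition (the objective, kernel length scale, and candidate grid are all illustrative assumptions).

```python
import numpy as np

def objective(x):
    """Toy stand-in for XGBoost validation error versus one hyperparameter."""
    return (x - 0.3) ** 2

def rbf(a, b, length=0.15):
    return np.exp(-0.5 * ((a[:, None] - b[None, :]) / length) ** 2)

def gp_posterior(x_obs, y_obs, x_new, jitter=1e-8):
    """Gaussian-process surrogate: posterior mean and std at query points."""
    K = rbf(x_obs, x_obs) + jitter * np.eye(len(x_obs))
    Ks = rbf(x_obs, x_new)
    L = np.linalg.cholesky(K)
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y_obs))
    mu = Ks.T @ alpha
    v = np.linalg.solve(L, Ks)
    var = 1.0 - np.sum(v ** 2, axis=0)      # rbf(x, x) == 1 on the diagonal
    return mu, np.sqrt(np.clip(var, 1e-12, None))

# BO loop: fit surrogate, pick the point minimizing the lower confidence bound
candidates = np.linspace(0.0, 1.0, 201)
x_obs = np.array([0.0, 0.5, 1.0])           # cheap initial design
y_obs = objective(x_obs)
for _ in range(15):
    mu, sigma = gp_posterior(x_obs, y_obs, candidates)
    x_next = candidates[np.argmin(mu - 2.0 * sigma)]   # exploit + explore
    x_obs = np.append(x_obs, x_next)
    y_obs = np.append(y_obs, objective(x_next))
best_x = x_obs[np.argmin(y_obs)]
```

Only 18 objective evaluations are needed to locate the optimum, which is the practical advantage BO offers over grid search when each evaluation is an expensive model-training run.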

2.2. Data Collection

Factors influencing the remaining strength of the pipeline include the outer diameter of the pipeline (D), the wall thickness of the pipeline (t), defect length (L), defect depth (d), defect width (w), yield strength of the pipeline (σy), and ultimate tensile strength (σu). The remaining strength is characterized by the failure pressure (Pf). The datasets of these influencing factors and the failure pressure are sourced from the published literature [39]. In total, 60 sets of burst test data from defective pipelines were collected. The seven influencing factors are taken as the input variables of the prediction model. The output variable is the failure pressure of the defective pipeline. In Figure 8, histograms are plotted to reflect the distribution of the input variable datasets. The height of each bar represents the relative frequency or probability density of each value, and the total area of the histogram is equal to 1. It is found that the features contained in the dataset have a wide range, the dataset is well-balanced, and most of the input variables follow a normal distribution. This meets the data requirements of machine learning and ensures the effectiveness of the machine learning algorithm constructed based on these data.

2.3. Data Processing

The traditional approach mainly uses random sampling to divide the training set and test set. However, random sampling often leads to sampling bias when the sample size is not massive. This causes the distribution pattern of the test set obtained by random sampling to deviate significantly from the original dataset, rendering the prediction results unreliable and subjective [40].
Taking the remaining strength data as an example, the impacts of random and stratified sampling on the dataset division results were studied. According to the distribution law of the remaining strength data, the collected data were divided into three intervals. The deviations between the test sets obtained by the two sampling methods and the initial data are shown in Table 2. The Mean Absolute Percentage Error (MAPE) of random and stratified sampling is presented in Figure 9. The MAPE of random sampling is 14.89%, while that of stratified sampling is 7.17%. The results indicate that the test set obtained by stratified sampling shows better consistency with the initial samples. Therefore, this paper proposes the stratified sampling method to divide the training and test sets.
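The stratified split described above can be sketched as follows: bin the continuous target into intervals, then draw the test fraction from each interval separately so the test set mirrors the target's distribution. The bin edges and data below are illustrative, not the paper's actual intervals.

```python
import random
from collections import defaultdict

def stratified_split(y, bin_edges, test_frac=0.2, seed=42):
    """Split sample indices so the test set preserves the target's
    interval distribution (sketch of stratified sampling; bin edges
    are assumptions, not the paper's intervals)."""
    # group sample indices by the target-value interval they fall into
    strata = defaultdict(list)
    for i, target in enumerate(y):
        strata[sum(target >= edge for edge in bin_edges)].append(i)

    rng = random.Random(seed)
    test_idx = []
    for indices in strata.values():
        n_test = max(1, round(test_frac * len(indices)))
        test_idx.extend(rng.sample(indices, n_test))   # proportional draw

    train_idx = [i for i in range(len(y)) if i not in set(test_idx)]
    return train_idx, test_idx

# Illustrative target values falling into three intervals (<10, 10–20, >=20)
y = [5.0] * 10 + [15.0] * 10 + [25.0] * 10
train_idx, test_idx = stratified_split(y, bin_edges=[10.0, 20.0])
```

With a plain random draw on a small dataset, one interval can easily be over- or under-represented in the test set; the per-stratum draw above rules that out by construction.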

2.4. Evaluation Indexes of Models

The effectiveness of a model is typically showcased by its prediction accuracy, a concept well-documented in the existing literature [41,42,43]. Based on a thorough review of relevant studies, four key metrics, Relative Error (RE), Coefficient of Determination (R2), Mean Absolute Percentage Error (MAPE), and Root Mean Square Error (RMSE), are chosen to assess the model’s prediction performance, with their respective formulas presented in Table 3.
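The three aggregate metrics follow their standard definitions and can be computed directly (MAPE is returned as a fraction here; multiply by 100 for a percentage):

```python
import math

def mape(y_true, y_pred):
    """Mean Absolute Percentage Error (fraction; x100 for percent)."""
    return sum(abs((t - p) / t) for t, p in zip(y_true, y_pred)) / len(y_true)

def rmse(y_true, y_pred):
    """Root Mean Square Error."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))

def r2(y_true, y_pred):
    """Coefficient of determination: 1 - residual SS / total SS."""
    mean = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot
```

For example, with true failure pressures [2, 4, 6] and predictions [2.2, 3.8, 6.0] (made-up numbers), MAPE is 0.05, RMSE is about 0.163, and R2 is 0.99.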

2.5. Construction of Residual Strength Prediction Model

This study proposes the BO-XGBoost method for predicting the residual strength of defective pipelines, with the prediction framework illustrated in Figure 10. The framework consists of data collection and processing, model construction, performance evaluation, and interpretability analysis. Regarding data collection, the seven correlated parameters form a relatively compact feature set, which neither compromises the model’s generalization capability nor significantly increases computational costs. Therefore, the seven factors chosen as input features for the model include the pipe outer diameter D, pipe wall thickness t, defect length L, defect depth d, defect width w, pipe yield strength σy, and ultimate tensile strength σu. At the same time, the failure pressure Pf is considered the output parameter of the model. During data processing, stratified sampling is applied to divide the dataset into a training set and a test set, ensuring the objectivity and reliability of the prediction results. To eliminate the dimensional differences among features, ensure comparability across different features, and enhance the performance and accuracy of the model, it is necessary to normalize the input data. In this study, the input data were normalized to the range of [0, 1]. Additionally, to ensure the objectivity of the prediction results, avoid overfitting, and enhance the generalization capability of the model, 6-fold cross-validation was employed. For model construction, the predictive performance of six mainstream models is evaluated using metrics such as RE, MAPE, R2, and RMSE to identify the optimal model. Finally, SHAP analysis is conducted to investigate the model’s interpretability, quantifying the influence of input variables on the output and enhancing model transparency.
The development of this model contributes to improving the integrity management of oil and gas pipelines and offers decision-making support for the intelligent management of oil and gas pipelines.
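The [0, 1] normalization step is plain min-max scaling. One detail worth making explicit in code: the minimum and maximum must be learned from the training set only and then reused on the test set, otherwise information leaks across the split (the numbers below are illustrative).

```python
def min_max_fit(values):
    """Learn scaling parameters (lo, hi) from the *training* column only,
    so the test set is transformed with the same parameters (no leakage)."""
    return min(values), max(values)

def min_max_transform(values, lo, hi):
    """Map values into [0, 1] relative to the training range."""
    return [(v - lo) / (hi - lo) for v in values]

# Illustrative wall-thickness column (mm), not the paper's data
train_t = [6.0, 8.0, 10.0, 12.0]
lo, hi = min_max_fit(train_t)
scaled_train = min_max_transform(train_t, lo, hi)
scaled_test = min_max_transform([9.0], lo, hi)   # test value scaled with train lo/hi
```

Note that a test value outside the training range would fall outside [0, 1]; that is expected behavior, not a bug, and is one reason the wide feature coverage of the dataset matters.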

3. Case Study

3.1. Evaluation of Predictive Performance

To assess the predictive performance of the proposed hybrid model, the performance of BO-XGBoost was compared with that of five other models: RF, LightGBM, SVM, GBRT, and MLP. Figure 11 displays the distribution histograms of prediction errors for the six models. The horizontal coordinate is the distribution interval of prediction error values, and the vertical coordinate is the frequency density, reflecting the concentration of prediction error values in each interval. Meanwhile, the µ and σ of the forecast error are given in Figure 11. As illustrated in Figure 11, the MLP model exhibits the largest mean and standard deviation of prediction errors, followed by SVM, LightGBM, and GBRT. The mean and standard deviation of the prediction errors from the RF model are only surpassed by those of BO-XGBoost. The BO-XGBoost model demonstrates the smallest mean and standard deviation of prediction errors among the six models, indicating its superior predictive performance.
To further assess the predictive performance of the hybrid model, a comprehensive assessment was conducted using metrics such as R2, MAPE, and RMSE. The prediction results are shown in Figure 12. Combining R2, MAPE, RMSE, and the fitting curves reflects the predictive performance of different models. The study found that the MLP model had the worst predictive performance, with R2, MAPE, and RMSE values of 0.938, 0.103, and 1.377, respectively. The SVM model performed better than MLP, with values of 0.947 for R2, 0.091 for MAPE, and 1.362 for RMSE. The R2, MAPE, and RMSE values for LightGBM were 0.946, 0.089, and 1.359, respectively; for GBRT, the values were 0.954, 0.077, and 1.344; for RF, the values were 0.963, 0.062, and 1.287; and for BO-XGBoost, the values were 0.971, 0.055, and 1.263. Prediction performance improves in the order listed, with BO-XGBoost achieving the best fit. Due to the wide coverage of the dataset, the above prediction results also indicate that the BO-XGBoost model has a good range of engineering applications. However, the model requires substantial computational resources during the training process. In the future, further efforts should focus on optimizing hyperparameter tuning, data preprocessing, and parallelization or distributed computing techniques to reduce the computational demands of this method.
In order to further evaluate the predictive performance of BO-XGBoost, combined with the R2 index, it was compared with ASME B31G, BS 7910, and API 579. The R2 values of BO-XGBoost, ASME B31G, BS 7910, and API 579 are 0.97, 0.52, 0.61, and 0.46, respectively. The results show that BO-XGBoost has a better prediction effect.
To further evaluate the computational cost of the BO-XGBoost model, the training and testing times for MLP, SVM, LightGBM, GBRT, RF, and BO-XGBoost were measured. The computational times required for these six models were 12 s, 6 s, 3 s, 8 s, 2 s, and 32 s, respectively. BO-XGBoost exhibited the longest computational time due to its parameter tuning process. However, given its superior predictive accuracy and the fact that the computational time remains within an acceptable range, the model is deemed suitable for practical applications. Additionally, the computational time of BO-XGBoost could be further reduced through GPU acceleration.

3.2. Robustness Analysis of the BO-XGBoost Model

After developing and evaluating the predictive model for the remaining strength of defective pipelines, it is necessary to conduct a robustness analysis, which provides additional insight into the model's adaptability and effectiveness under varying data conditions. The robustness of the BO-XGBoost model can be assessed by introducing varying levels of white noise into the input data and observing the changes in R2. The point at which the R2 trend begins to decline after adding extra noise identifies the model's noise tolerance threshold; at this threshold, the model with the highest R2 is considered the most resilient [44], and the model with the lowest slope of accuracy degradation is regarded as the most robust in terms of prediction performance.
Gaussian white noise at different levels was generated using the probability density function Px(x) defined in Equation (1), where x is the variable, µ the mean, and s the standard deviation; different noise levels were obtained by adjusting s while keeping the mean µ at zero [45,46].
$$P_x(x) = \frac{1}{s\sqrt{2\pi}}\, e^{-\frac{(x-\mu)^{2}}{2s^{2}}} \qquad (1)$$
Figure 13 shows the R2 values predicted by the different models as the noise level changes; the horizontal axis represents the noise level, and the vertical axis represents the R2 achieved by each model after noise is added. For the dataset with added noise, the R2 value of the BO-XGBoost model decreased the least. In contrast, the R2 values of GBRT, RF, LightGBM, and MLP decreased significantly, and the R2 value of the SVM model exhibited the steepest decline. When s exceeds 0.25, the R2 decline rate of the BO-XGBoost model becomes more pronounced, indicating that its noise tolerance threshold is approximately s = 0.25.
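The noise-injection procedure described above can be sketched as follows; the noise levels, the seeding, and the `model_predict`/`r2_fn` callables are illustrative assumptions, not the authors' implementation:

```python
import random

def add_gaussian_noise(X, s, mu=0.0, seed=42):
    """Return a copy of the feature matrix X with N(mu, s^2) noise added to every entry."""
    rng = random.Random(seed)
    return [[x + rng.gauss(mu, s) for x in row] for row in X]

def noise_tolerance_curve(model_predict, X, y, r2_fn,
                          levels=(0.05, 0.1, 0.15, 0.2, 0.25, 0.3)):
    """R^2 on progressively noisier copies of the inputs; the level where R^2
    starts to drop sharply marks the model's noise tolerance threshold."""
    return {s: r2_fn(y, model_predict(add_gaussian_noise(X, s))) for s in levels}
```

Plotting the returned dictionary reproduces a curve of the kind shown in Figure 13 for each model.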

3.3. Overfitting Analysis of the BO-XGBoost Model

As with the robustness analysis, after developing and initially evaluating the remaining strength prediction model of defective pipelines, it is necessary to assess overfitting. Overfitting is a common issue in developing and applying machine learning prediction models: when dealing with complex datasets containing multiple nonlinear relationships, noise, or unpredictable fluctuations, many models fit the training data too closely, yielding low prediction errors on the training set but poor generalization on test data. An unbiased evaluation of the constructed model is therefore needed to measure the degree of overfitting during training [47]. A practical method is to calculate an overfitting index (OFI), shown in Equation (2), from the model's Performance Index (PI) on the training (PItr) and testing (PIts) datasets, together with the number of data instances in the training (Ktr) and testing (Kts) datasets. The Performance Index evaluates the model's predictive power and generalization performance; a lower OFI value signifies a decreased likelihood of overfitting.
$$OFI = \frac{K_{tr} - K_{ts}}{K}\, PI_{tr} + \frac{2K_{ts}}{K}\, PI_{ts} \qquad (2)$$

where K = Ktr + Kts is the total number of data instances.
Table 4 presents the performance of different models on the dataset, specifically their overfitting index (OFI) values when applied to the remaining strength dataset of defective pipelines. The results indicate that the BO-XGBoost model has the lowest OFI, while the SVM model has the highest. Based on the OFI values, the models can be ranked as follows: BO-XGBoost (least overfitting) < GBRT < RF < LightGBM < MLP < SVM (most overfitting). This ranking indicates that BO-XGBoost is the least prone to overfitting compared to the other five models.
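Under the reconstruction of Equation (2) used here (with K = Ktr + Kts taken as the total number of instances, an assumption), the OFI can be computed as in this illustrative sketch:

```python
def overfitting_index(pi_tr, pi_ts, k_tr, k_ts):
    """Overfitting index: weights the training performance index by
    (K_tr - K_ts) / K and the testing one by 2 * K_ts / K, with
    K = K_tr + K_ts, so equal train/test performance yields OFI
    equal to that shared value. Lower OFI means less overfitting."""
    k = k_tr + k_ts
    return (k_tr - k_ts) / k * pi_tr + 2 * k_ts / k * pi_ts
```

A useful sanity check of this weighting: with an 80/20 split and identical train and test performance, the OFI collapses to that common performance value, so only train/test discrepancies move the index.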

3.4. Grey Relational Analysis

The Grey Relation Degree (GRD) is a metric used to assess the correlation between the predicted outcomes and the observed data. A higher GRD value corresponds to superior predictive performance of the model [48]. The specific operation process is as follows.
The comparison time series:
$$X_i = \left( X_i(1), X_i(2), \ldots, X_i(n) \right)$$
The reference time series:
$$X_0 = \left( X_0(1), X_0(2), \ldots, X_0(n) \right)$$
The two time series are standardized by
$$x_i(t) = \left( X_i(t) - \frac{1}{n}\sum_{t=1}^{n} X_i(t) \right) \Bigg/ \sqrt{\frac{1}{n-1}\sum_{t=1}^{n}\left( X_i(t) - \frac{1}{n}\sum_{t=1}^{n} X_i(t) \right)^{2}}$$
The correlation coefficient between x0 and xi is
$$\xi_i(k) = \frac{\min_i \min_k \left| x_0(k) - x_i(k) \right| + \rho \max_i \max_k \left| x_0(k) - x_i(k) \right|}{\left| x_0(k) - x_i(k) \right| + \rho \max_i \max_k \left| x_0(k) - x_i(k) \right|}, \qquad \rho \in (0, 1)$$
The GRD is
$$r_i = \frac{1}{n}\sum_{k=1}^{n} \xi_i(k)$$
Table 5 displays the GRD values for the constructed and the comparison models. The analysis indicates that the constructed prediction model achieves a GRD value of 0.919, surpassing all comparison models. This demonstrates a strong correlation between the predicted results of the constructed model and the observed data.
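The grey relational procedure above can be sketched as follows; the z-score standardization and the customary distinguishing coefficient ρ = 0.5 are assumptions for illustration, not the authors' code:

```python
import math

def zscore(series):
    # Standardize a series to zero mean and unit (sample) standard deviation
    n = len(series)
    mean = sum(series) / n
    std = math.sqrt(sum((v - mean) ** 2 for v in series) / (n - 1))
    return [(v - mean) / std for v in series]

def grey_relation_degree(reference, comparisons, rho=0.5):
    """GRD of each comparison series (e.g., a model's predictions) against the
    reference series (observed data), after standardization."""
    x0 = zscore(reference)
    xs = [zscore(c) for c in comparisons]
    # Absolute deviations |x0(k) - xi(k)| for every series and every point
    deltas = [[abs(a - b) for a, b in zip(x0, xi)] for xi in xs]
    d_min = min(min(row) for row in deltas)
    d_max = max(max(row) for row in deltas)
    grds = []
    for row in deltas:
        # Correlation coefficient per point, then averaged into the GRD
        xi_k = [(d_min + rho * d_max) / (d + rho * d_max) for d in row]
        grds.append(sum(xi_k) / len(xi_k))
    return grds
```

A series identical to the reference attains a GRD of 1, and larger deviations pull the GRD toward 0, consistent with higher GRD values indicating superior predictive performance in Table 5.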

3.5. SHAP Analysis

Although many machine learning models are designed for accurate prediction, it is equally important to understand why a model makes specific predictions and quantify the relationship between the model’s predictions and input features. While the BO-XGBoost model demonstrates strong predictive performance on the dataset, its complex and highly nonlinear architecture results in a “black-box” effect. The lack of transparency limits its practical application and ability to guide production. Consequently, there is a growing demand for model interpretability. SHAP is utilized to analyze the influence of input variables on model outputs and explains the model’s prediction as the sum of the Shapley values of each input feature [49]:
$$g(x') = \varphi_0 + \sum_{j=1}^{M} \varphi_j$$
where g(x′) is the model output, φ0 is the base value of the explanation (i.e., the mean prediction over all training samples), and φj is the Shapley value attributed to feature j.
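The additive decomposition above can be illustrated with an exact Shapley computation on a toy three-feature model (a hypothetical function for demonstration, not the paper's BO-XGBoost pipeline); features outside a coalition are fixed at background values:

```python
import itertools
import math

def shapley_values(predict, x, background):
    """Exact Shapley values for a small number of features.
    Features not in coalition S are set to their background values.
    Returns (phi_0, [phi_1..phi_M]); phi_0 + sum(phi_j) recovers predict(x)."""
    m = len(x)
    phis = [0.0] * m

    def value(subset):
        z = [x[i] if i in subset else background[i] for i in range(m)]
        return predict(z)

    for j in range(m):
        others = [i for i in range(m) if i != j]
        for r in range(m):
            for s in itertools.combinations(others, r):
                # Shapley weight |S|! (M - |S| - 1)! / M!
                w = math.factorial(r) * math.factorial(m - r - 1) / math.factorial(m)
                phis[j] += w * (value(set(s) | {j}) - value(set(s)))
    return predict(background), phis
```

For a linear term the Shapley value equals its coefficient times the feature shift, while an interaction term is split equally among the interacting features; in all cases the attributions sum exactly to the prediction, which is the additivity property SHAP relies on.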
The influence of different parameters on residual strength is quantified using mean absolute SHAP values, as illustrated in Figure 14. A higher mean absolute SHAP value indicates a more significant impact on the residual strength. The analysis reveals that the defect depth (d), the wall thickness of the pipeline (t), the yield strength of the pipeline (σy), the outer diameter of the pipeline (D), the defect length (L), the ultimate tensile strength (σu), and the defect width (w) affect the residual strength in descending order of significance. Figure 15 further depicts the relationship between the SHAP values of residual strength and the various contributing factors. As illustrated in Figure 15, a significant negative correlation is observed between d and the residual strength of the pipeline. Defect depth is prone to inducing stress concentration effects, which directly compromise the pipeline's residual strength: as the defect depth increases, the residual strength decreases markedly. This becomes particularly critical when the defect depth approaches or exceeds a certain proportion of the pipeline wall thickness, posing a substantial threat to pipeline integrity and safety. Secondly, both t and σy exhibit positive correlations with residual strength. Wall thickness bears a direct relationship to residual strength; generally, the greater the wall thickness, the higher the residual strength tends to be. For pipelines, σy represents the maximum stress the material can withstand without permanent deformation; exceeding this threshold results in plastic deformation of the pipeline. Both D and L exhibit a negative correlation with residual strength. The relationship between D and residual strength is a complex engineering issue that requires consideration of factors such as the pipeline's actual operating environment, material properties, and damage conditions.
Generally, when defects such as cracks or corrosion are present in the pipeline, a larger outer diameter may correspond to a larger defect area, significantly reducing residual strength; the impact of defects on large-diameter pipelines can be more severe, particularly under high-pressure or high-stress conditions. The effect of L on residual strength is also significant: as the length of the crack or defect increases, the residual strength decreases. In engineering design, it is therefore essential to account for defect size and conduct inspections to ensure that the material does not fail due to defects during service. There is a positive correlation between σu and residual strength. σu represents the maximum load-bearing capacity under ideal conditions, whereas residual strength reflects the material's load-bearing capacity after damage; as damage accumulates, the residual strength falls below σu. Finally, the relationship between w and residual strength is negative: as the defect width increases, the residual strength typically decreases, although the exact extent of this effect must be analyzed in light of material properties, loading conditions, and defect morphology.

4. Conclusions

(1) Stratified sampling is used to divide the training and test sets, effectively avoiding the significant bias that random sampling can introduce and ensuring the reliability and objectivity of the prediction results. Bayesian optimization is adopted to address the excessive iterations and high computational cost of traditional hyperparameter optimization methods, and a novel hybrid forecasting model, BO-XGBoost, is proposed. The model is evaluated using metrics such as RE, MAPE, R2, and RMSE, along with robustness, overfitting, and grey relational analyses, verifying its high predictive performance.
(2) SHAP was used to study the influence of input variables on the model output, which enhances the explainability of the model's predictions, meets industry requirements for interpretable prediction models, and strengthens the model's ability to guide production practice. The model contributes to advancing the integrity management of oil and gas pipelines and provides critical decision-making support for the intelligent management of pipeline systems; the methodology is equally applicable to gas and water pipelines.
(3) The residual strength prediction model for defective pipelines is deployed on data-acquisition terminals through algorithmic encapsulation. The model dynamically updates prediction results in real time by integrating data from sensors and monitoring systems. The software can assess the pipeline’s safety status in case of defects, and when the predicted residual strength is below the safety threshold, the system will give an early warning. Such alerts prompt maintenance personnel to implement timely repair or replacement measures, thereby preventing pipeline failures and guiding predictive maintenance strategies.
(4) The research on residual strength prediction of defective pipelines based on XGBoost has important theoretical significance and practical application value. Future research in data enhancement, multi-modal data fusion, model optimization, real-time data monitoring, and interpretability improvement is expected to further improve the prediction accuracy and practicability and provide strong support for the safe operation of pipelines.

Author Contributions

Conceptualization, H.L.; Methodology, H.L.; Resources, X.M.; Writing—original draft, H.L.; Writing—review & editing, H.L.; Supervision, X.M.; Project administration, X.M. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no funding.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Informed consent was obtained from all subjects involved in the study.

Data Availability Statement

The original contributions presented in the study are included in the article; further inquiries can be directed to the corresponding author.

Conflicts of Interest

Author Hongbo Liu was employed by the company Shaanxi Provincial Natural Gas Co., Ltd. The remaining authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

References

  1. Xu, L.; Wang, Y.F.; Mo, L.; Tang, Y.F.; Wang, F.; Li, C.J. The research progress and prospect of data mining methods on corrosion prediction of oil and gas pipelines. Eng. Fail. Anal. 2023, 144, 106951. [Google Scholar]
  2. Li, X.H.; Liu, Y.Z.; Abbassi, R.; Khan, F.; Zhang, R.R. A Copula-Bayesian approach for risk assessment of decommissioning operation of aging subsea pipelines. Process Saf. Environ. Prot. 2022, 167, 412–422. [Google Scholar]
  3. Savanna, C.M.D.; Renato, D.S.M.; Silvana, M.B.A. An investigation on the collapse response of subsea pipelines with interacting corrosion defects. Eng. Struct. 2024, 321, 118911. [Google Scholar]
  4. Wen, J.; Zhang, C.; Xia, Y.Y.; Wang, C.X.; Sang, X.X.; Fang, H.Y.; Wang, N.N. UV/thermal dual-cured MWCNTs composites for pipeline rehabilitation: Mechanical properties and damage analysis. Constr. Build. Mater. 2024, 450, 138602. [Google Scholar]
  5. Velázquez, J.C.; González-Arévalo, N.E.; Dĺaz-Cruz, M.; Cervantes-Tonbón, A.; Herrera-Hernández, H.; Hernández-Sánchez, E. Failure pressure estimation for an aged and corroded oil and gas pipeline: A finite element study. J. Nat. Gas Sci. Eng. 2022, 101, 104532. [Google Scholar]
  6. Soomro, A.A.; Mokhtar, A.A.; Hussin, H.B.; Lashari, N.; Oladosu, T.L.; Jameel, S.M.; Inayat, M. Analysis of machine learning models and data sources to forecast burst pressure of petroleum corroded pipelines: A comprehensive review. Eng. Fail. Anal. 2024, 155, 107747. [Google Scholar]
  7. Xu, L.; Yu, P.F.; Wen, S.M.; Tang, Y.F.; Wang, Y.F.; Tian, Y.; Mao, T.; Li, C.J. Visualization and Analysis of Oil and Gas Pipeline Corrosion Research: A Bibliometric Data-Mining Approach. J. Pipeline Syst. Eng. Pract. 2024, 15, 04024017. [Google Scholar] [CrossRef]
  8. Kin, C.; Chen, L.; Wang, H.; Castandea, H. Global and local parameters for characterizing and modeling external corrosion in underground coated steel pipelines: A review of critical factors. J. Pipeline Sci. Eng. 2021, 1, 17–35. [Google Scholar]
  9. Zhou, R.; Gu, X.T.; Luo, X. Residual strength prediction of X80 steel pipelines containing group corrosion defects. Ocean Eng. 2023, 274, 114077. [Google Scholar]
  10. Sharples, J.; Hadley, I. Treatment of residual stress in fracture assessment: Background to the advice given in BS 7910, 2013. Int. J. Press. Vessel. Pip. 2018, 168, 323–334. [Google Scholar] [CrossRef]
  11. Miao, X.Y.; Zhao, H. Novel method for residual strength prediction of defective pipelines based on HTLBO-DELM model. Reliab. Eng. Syst. Saf. 2023, 237, 109369. [Google Scholar]
  12. Timashev, S.A.; Bushinskaya, A.V. Markov approach to early diagnostics, reliability assessment, residual life and optimal maintenance of pipeline systems. Struct. Saf. 2015, 56, 68–79. [Google Scholar]
  13. Garcĺa, D.; Burgos, J.B.; Chaparro, J.; Eicker, U.; Cárdenas, D.R.; Saldana-Robles, A. Analyzing joint efficiency in storage tanks: A comparative study of API 650 standard and API 579 using finite element analysis for enhanced reliability. Int. J. Press. Vessel. Pip. 2024, 207, 105113. [Google Scholar]
  14. Wei, W.L.; Zhang, R.; Zhang, Y.W.; Cheng, J.R.; Cao, Y.P.; Fang, F.Y. Study on the Residual Strength of Nonlinear Fatigue-Damaged Pipeline Structures. Appl. Sci. 2024, 14, 754. [Google Scholar] [CrossRef]
  15. Li, X.L.; Chen, G.T.; Liu, X.Y.; Ji, J.; Han, L.F. Analysis and Evaluation on Residual Strength of Pipelines with Internal Corrosion Defects in Seasonal Frozen Soil Region. Appl. Sci. 2021, 11, 12141. [Google Scholar] [CrossRef]
  16. Muda, M.F.; Hashim, M.H.M.; Kamarudin, M.K.; Mohd, M.H.; Tafsirojjaman, T.; Rahman, M.A.; Paik, J.K. Burst pressure strength of corroded subsea pipelines repaired with composite fiber-reinforced polymer patches. Eng. Fail. Anal. 2022, 136, 106204. [Google Scholar]
  17. Abyani, M.; Bahaari, M.R.; Zarrin, M.; Nasseri, M. Predicting failure pressure of the corroded offshore pipelines using an efficient finite element based algorithm and machine learning techniques. Ocean. Eng. 2022, 254, 111382. [Google Scholar]
  18. Chen, Z.F.; Li, X.Y.; Wang, W.; Li, Y.; Shi, L.; Li, Y.X. Residual strength prediction of corroded pipelines using multilayer perceptron and modified feedforward neural network. Reliab. Eng. Syst. Saf. 2023, 231, 108980. [Google Scholar]
  19. Li, X.H.; Jia, R.C.; Zhang, R.R. A data-driven methodology for predicting residual strength of subsea pipeline with double corrosion defects. Ocean Eng. 2023, 279, 114530. [Google Scholar] [CrossRef]
  20. Kumar, S.D.V.; Kai, M.L.Y.; Arumugam, T.; Karuppanan, S. A Review of Finite Element Analysis and Artificial Neural Networks as Failure Pressure Prediction Tools for Corroded Pipelines. Materials 2021, 14, 6135. [Google Scholar] [CrossRef]
  21. Sun, B.C.; Zhu, C.W.; Ling, X. Research on residual strength of pipeline with defects based on SVR. J. Saf. Sci. Technol. 2022, 18, 172–176. [Google Scholar]
  22. Peng, Y.D.; Fu, G.M.; Sun, B.J.; Chen, J.Y.; Zhang, W.G.; Ren, M.P.; Zhang, H. Data-driven collapse strength modelling for the screen pipes with internal corrosion defect based on finite element analysis and tree-based machine learning. Ocean Eng. 2023, 279, 114400. [Google Scholar] [CrossRef]
  23. Su, Y.; Li, J.F.; Yu, B.; Zhao, Y.L.; Yao, J. Fast and accurate prediction of failure pressure of oil and gas defective pipelines using the deep learning model. Reliab. Eng. Syst. Saf. 2021, 216, 108016. [Google Scholar]
  24. Bergstra, J.; Yamins, D.; Cox, D. Hyperopt: A Python Library for Optimizing the Hyperparameters of Machine Learning Algorithms. In Proceedings of the Python in Science Conference, Austin, TX, USA, 24–29 June 2013; pp. 13–19. [Google Scholar] [CrossRef]
  25. Nahiduzzaman, M.; Abdulrazak, L.F.; Ayari, M.A.; Khandakar, A.; Islam, S.M.R. A novel framework for lung cancer classification using lightweight convolutional neural networks and ridge extreme learning machine model with SHapley Additive exPlanations (SHAP). Expert Syst. Appl. 2024, 248, 123392. [Google Scholar]
  26. Jaramillo, J.P.; Munoz, C.; Arango, R.M.; Calderon, C.G.; Lange, A. Integrating multiple data sources for improved flight delay prediction using explainable machine learning. Res. Transp. Bus. Manag. 2024, 56, 101161. [Google Scholar]
  27. Sun, B.C.; Cui, W.J.; Liu, G.Y.; Zhou, B.; Zhao, W.J. A hybrid strategy of AutoML and SHAP for automated and explainable concrete strength prediction. Case Stud. Constr. Mater. 2023, 19, e02405. [Google Scholar] [CrossRef]
  28. Sheyyab, M.; Lynch, P.; Mayhew, P.; Brezinsky, K. Optimized synthetic data and semi-supervised learning for Derived Cetane Number prediction. Combust. Flame 2024, 259, 113184. [Google Scholar]
  29. Cheng, Y.; Huang, X.F.; Peng, Y.; Yang, S.Y.; Chen, G.M. A KPCA-BRANN based data-driven approach to model corrosion degradation of subsea oil pipelines. Environ. Pollut. 2023, 316, 120685. [Google Scholar] [PubMed]
  30. Wang, X.L.; Chen, A.R.; Liu, Y.Q. Explainable ensemble learning model for predicting steel section-concrete bond strength. Constr. Build. Mater. 2022, 356, 129239. [Google Scholar] [CrossRef]
  31. Bialek, J.; Bujalski, W.; Wojdan, K.; Guzek, M.; Kurek, T. Dataset level explanation of heat demand forecasting ANN with SHAP. Energy 2022, 261, 125075. [Google Scholar]
  32. Breiman, L. Random forests. Mach. Learn. 2001, 45, 5–32. [Google Scholar] [CrossRef]
  33. Ke, G. LightGBM: A highly efficient gradient boosting decision tree. In Proceedings of the 31st Conference on Neural Information Processing Systems (NIPS 2017), Long Beach, CA, USA, 4–9 December 2017; Volume 30. [Google Scholar]
  34. Vapnik, V. The Nature of Statistical Learning Theory; Springer: Berlin/Heidelberg, Germany, 1995. [Google Scholar]
  35. Friedman, J.H. Greedy function approximation: A gradient boosting machine. Ann. Stat. 2001, 29, 1189–1232. [Google Scholar] [CrossRef]
  36. Moshe, L.; Vladimir, Y.; Allan, P.; Schocken, S. Multilayer feedforward networks with a nonpolynomial activation function can approximate any function. Neural Netw. 1993, 6, 861–867. [Google Scholar]
  37. Chen, T.; Guestrin, C. XGBoost: A scalable tree boosting system. In Proceedings of the KDD’16: The 22nd ACM SIGKDD International Conferenceon Knowledge Discovery and Data Mining, San Francisco, CA, USA, 13–17 August 2016; pp. 785–794. [Google Scholar]
  38. Li, Y.R.; Zhang, Y.L.; Wang, J.C. Survey on Bayesian Optimization Methods for Hyper-Parameter Tuning. Comput. Sci. 2022, 49, 86–92. [Google Scholar] [CrossRef]
  39. Gao, J.; Yang, P.; Li, X.; Zhou, J.; Liu, J. K Analytical prediction of failure pressure for pipeline with long corrosion defect. Ocean Eng. 2019, 191, 106497. [Google Scholar] [CrossRef]
  40. Hens, A.B.; Tiwari, M.K. Computational time reduction for credit scoring: An integrated approach based on support vector machine and stratified sampling method. Expert Syst. Appl. 2012, 39, 6774–6781. [Google Scholar] [CrossRef]
  41. Xu, L.; Yu, J.; Zhu, Z.Y.; Man, J.F.; Yu, P.F.; Li, C.J.; Wang, X.T.; Zhao, Y.Q. Research and Application for Corrosion Rate Prediction of Natural Gas Pipelines Based on a Novel Hybrid Machine Learning Approach. Coatings 2023, 13, 856. [Google Scholar] [CrossRef]
  42. Lu, H.F.; Peng, H.Y.; Xu, Z.; Qin, G.J.; Azimi, M.; Matthews, J.; Cao, L. Theory and Machine Learning Modeling for Burst Pressure Estimation of Pipeline with Multipoint Corrosion. J. Pipeline Syst. Eng. Pract. 2023, 14, 1481. [Google Scholar] [CrossRef]
  43. Du, J.; Zheng, J.Q.; Liang, Y.T.; Xu, N.; Liao, Q.; Wang, B.H.; Zhang, H.R. Deeppipe: Theory-guided prediction method based automatic machine learning for maximum pitting corrosion depth of oil and gas pipeline. Chem. Eng. Sci. 2023, 278, 118927. [Google Scholar] [CrossRef]
  44. Davoodi, S.; Thanh, H.V.; Wood, D.; Mehrad, M.; Al-Shargabid, M.; Rukavishnikov, V. Committee machine learning: A breakthrough in the precise prediction of CO2 storage mass and oil production volumes in unconventional reservoirs. Geoenergy Sci. Eng. 2025, 245, 213533. [Google Scholar] [CrossRef]
  45. Xue, N.; Xu, S.H.; Shi, B. Study on statistical rule of surface corrosion depth of Q235 steel. Sichuan Build. Mater. 2013, 39, 117–118. [Google Scholar]
  46. Liu, C.L.; Yang, J.; Zhu, M.; Zhang, C.; Guan, J.; Xiao, N.J. Residual life prediction method of corroded pipelines based on normal distribution. Corros. Prot. 2023, 44, 100–106. [Google Scholar]
  47. Davoodi, S.; Thanh, H.V.; Wood, D.A.; Mehrad, M.; Muravyov, S.; Rukavishnikov, V. Carbon dioxide storage and cumulative oil production predictions in unconventional reservoirs applying optimized machine-learning models. Pet. Sci. 2025, 22, 296–323. [Google Scholar] [CrossRef]
  48. Wu, H.Y.; Lei, H.G.; Chen, Y. Grey relational analysis of static tensile properties of structural steel subjected to urban industrial atmospheric corrosion and accelerated corrosion. Constr. Build. Mater. 2022, 315, 125706. [Google Scholar]
  49. Wen, Z.P.; Liu, H.T.; Zhou, M.Q.; Liu, C.; Zhou, C.C. Explainable machine learning rapid approach to evaluate coal ash content based on X-ray fluorescence. Fuel 2023, 332, 125991. [Google Scholar]
Figure 1. Different prediction methods for pipeline residual strength.
Figure 2. Schematic diagram of the RF algorithm.
Figure 3. Schematic diagram of the LGB algorithm.
Figure 4. Schematic diagram of the SVM algorithm.
Figure 5. Schematic diagram of the GBRT algorithm.
Figure 6. Schematic diagram of the MLP algorithm.
Figure 7. Schematic diagram of the XGBoost algorithm.
Figure 8. Distributions of features in the database.
Figure 9. Comparison chart of MAPE obtained by random sampling and stratified sampling.
Figure 10. Predictive framework process.
Figure 11. The distribution histograms of prediction errors.
Figure 12. The predicting results of six models.
Figure 13. The R2 values predicted by different models as the level of noise changes.
Figure 14. Ranking of contribution residual strength influencing factors.
Figure 15. Effect of different input characteristics on the prediction model.
Table 1. Statistical applications of interpretable machine learning methods in relevant fields.
Author | Research Themes | Methods | Main Finding
Nahiduzzaman et al. (2024) [25] | Medical diagnosis | CNN-ELM | The proposed method exhibits high recognition accuracy. SHAP can enhance the model's interpretability, thus boosting confidence in lung cancer diagnosis.
Jaramillo et al. (2024) [26] | Flight delay prediction | SMOTE-ENN | The interpretable artificial intelligence method based on SHAP can accurately predict flight delays and offer actionable insights into the determinants of flight delays.
Sun et al. (2023) [27] | Concrete strength prediction | AutoML | SHAP can provide a global interpretation of the impact of hybrid parameters on compressive strength, rendering the prediction process transparent and reliable.
Sheyyab et al. (2024) [28] | Combustion performance prediction | ANN | SHAP can clarify the influence of different parameters on the prediction results of the combustion model.
Cheng et al. (2023) [29] | Environmental pollution | XGBoost | This method demonstrates excellent prediction performance and can identify the impact of different emission sources on ozone formation.
Wang et al. (2022) [30] | Steel strength prediction | LightGBM | The constructed hybrid model is reliable. Combined with SHAP, it explains the contribution of basic features to LightGBM's individual predictions.
Bialek et al. (2022) [31] | Energy demand forecasting | ANN | It accurately predicts energy demand based on SHAP and provides practical insights into the model's internal principles.
Table 2. Comparison of sample selection bias between random sampling and stratified sampling.
Classification | Overall | Random | Stratified | Rand Error % | Strat Error %
1 | 0.376 | 0.320 | 0.384 | 14.89 | −2.13
2 | 0.369 | 0.447 | 0.335 | −21.14 | 9.21
3 | 0.255 | 0.233 | 0.281 | 8.63 | −10.19
Table 3. Performance measures.
Metric | Formula | Best Performance Value
RE | $RE(\%) = \frac{\hat{y}_i - y_i}{y_i} \times 100\%$ | Closest value to 0
R2 | $R^2 = 1 - \sum_{i=1}^{N} (y_i - \hat{y}_i)^2 \Big/ \sum_{i=1}^{N} (y_i - \bar{y}_i)^2$ | Closest value to 1
MAPE | $MAPE(\%) = \frac{1}{N}\sum_{i=1}^{N} \left| \frac{y_i - \hat{y}_i}{y_i} \right| \times 100\%$ | Minimum
RMSE | $RMSE = \sqrt{\frac{1}{N}\sum_{i=1}^{N} (y_i - \hat{y}_i)^2}$ | Minimum
where yi, ȳi, and ŷi are the collected, mean, and predicted values, respectively, and N is the prediction horizon.
Table 4. The overfitting index (OFI) values.
Target Parameter | SVM | MLP | LightGBM | RF | GBRT | BO-XGBoost
Residual strength forecasting | 0.042 | 0.037 | 0.025 | 0.021 | 0.016 | 0.011
Table 5. Comparison of the grey relational degree values.
Testing Model | GRD
MLP | 0.824
SVM | 0.835
LightGBM | 0.860
GBRT | 0.874
RF | 0.896
BO-XGBoost | 0.919

Share and Cite

MDPI and ACS Style

Liu, H.; Meng, X. Explainable Ensemble Learning Model for Residual Strength Forecasting of Defective Pipelines. Appl. Sci. 2025, 15, 4031. https://doi.org/10.3390/app15074031



