Article

A Unified Framework for Asphalt Pavement Distress Evaluations Based on an Extreme Gradient Boosting Approach

by Bing Liu 1, Danial Javed 2, Jianghai Hu 1, Wei Li 2 and Leilei Chen 2,*

1 Shandong Hi-Speed Hubei Development Co., Ltd., Wuhan 430010, China
2 Intelligent Transportation System Research Center, Southeast University, Nanjing 211189, China
* Author to whom correspondence should be addressed.
Coatings 2025, 15(3), 349; https://doi.org/10.3390/coatings15030349
Submission received: 25 February 2025 / Revised: 13 March 2025 / Accepted: 14 March 2025 / Published: 18 March 2025

Abstract:
Flexible pavements are susceptible to distress when subjected to long-term vehicle loads and environmental factors, thereby requiring appropriate maintenance. To overcome the labor-intensive field data collection and associated traffic congestion problems, this paper presents an intelligent prediction system framework utilizing Extreme Gradient Boosting (XGboost) to predict two relevant functional indices: rutting deformation and crack damage. The framework considers multiple essential factors, such as traffic load, material characteristics, and climate conditions, to predict rutting behavior and employs image data to classify crack conditions. The XGboost algorithm exhibited good performance, achieving an R2 value of 0.9 for rutting behavior and an accuracy of 0.91, precision of 0.92, recall of 0.9, and F1-score of 0.91 for cracks. Moreover, a comparative assessment of the framework against prominent AI methodologies reveals that the XGboost model outperforms support vector machine (SVM), decision tree (DT), random forest (RF), and K-Nearest Neighbor (KNN) methods in terms of result quality. For rutting behavior, a SHAP (Shapley Additive Explanations) analysis was performed on the XGboost model to interpret the results and assess the importance of individual features. The analysis revealed that parameters related to load and environmental conditions significantly influence the model’s predictions. Finally, the proposed model provides more precise estimates of pavement performance, which can assist road authorities in optimizing budget allocations and provide dependable guidance for pavement maintenance.

1. Introduction

In terms of road structure design, the road’s structural integrity will be impacted by any damage to the surface layer, subbase layer, or subgrade layer [1]. This not only shortens the service life of the pavement but also accelerates its deterioration [2]. In this regard, various transportation agencies are developing pavement maintenance and rehabilitation strategies to assess pavement conditions and address pavement distress phenomena. At present, distress analysis mainly involves challenges associated with rutting depth and cracks [3]. The correlation between pavement cracks and structural condition is notably significant; thus, any deformation in the pavement structure can lead to surface cracking of the asphalt layer. In addition, rutting deformation and cracks represent two key categories of distress, both predominantly influenced by temperature fluctuations. Rutting typically develops in high-temperature regions, usually above 50 °C, as the asphalt softens and deforms under traffic loads, whereas cracks occur at low temperatures, generally below −10 °C, where the asphalt becomes brittle [4]. Various techniques have been employed to predict these two types of distress. Distress detection primarily relies on either manual or automated inspection. Manual inspection is certainly accurate; however, it requires skilled labor, time, and money. Over the years, researchers have also used diverse mechanistic–empirical approaches [5,6,7]. Nevertheless, these methods are not effective for multi-dimensional data. In contrast, automated inspection is both cost-effective and time-saving. However, in the realm of intelligent inspection for identifying pavement deterioration, precision and efficacy remain significant challenges.
Recent advancements in the field of artificial intelligence have facilitated machine learning (ML) algorithms for prediction. ML techniques can be systematically categorized into three primary types: classification, regression, and clustering. For rutting behavior, researchers have used various regression models to perform prediction [8,9,10,11,12]. Yang et al. used an artificial neural network to forecast rutting behavior using a wide range of independent variables, including traffic and climate data [8]. Recurrent Neural Networks (RNNs) are another approach; Okuda trained an RNN model on the past three years’ data and predicted the current rutting behavior [9]. Choi et al. employed an RNN to assess data from the Korean National Highway and Pavement Management System in order to forecast the pavement condition index, including rutting behavior [10]. Gradient boosting and Random Forest are two additional machine learning techniques employed by Guo to predict two performance indices: the International Roughness Index (IRI) and rutting depth. He utilized the LTPP database for analysis and identified several factors influencing rutting, including pavement structure and environmental conditions [11]. Haddad et al. utilized the same LTPP database, employing a Deep Neural Network (DNN) technique for prediction; furthermore, after a sensitivity analysis, they incorporated several influential variables, including traffic and climate conditions [12]. The Shapley Additive Explanations (SHAP) method is important in regression analysis: it provides insightful analysis of the impact of individual input variables on prediction outcomes, highlighting which variables significantly affect the results [13]. In practice, while these algorithms can accurately predict rutting depth under certain conditions, they cannot highlight the most significant parameters influencing the final result; moreover, they rely on a fixed set of parameters for prediction, without integrating additional variables.
For the analysis of crack classification, Convolutional Neural Networks (CNNs) are employed in a range of image recognition tasks, encompassing image classification, segmentation, and detection [14]. Unmanned aerial vehicles (UAVs) are widely used to collect image data to analyze cracks. Zhu et al. employed UAV images to train CNN models for the detection of pavement cracks [15]. Jang et al. used a laser thermography scanning system to evaluate multiple cracks in concrete pavement [16]. Additionally, the image recognition technique, a cutting-edge approach to classification analysis, was employed by Chen et al., who utilized mean filtering for preprocessing and Support Vector Classification as the final prediction model [17]. Dogan et al. employed a deep learning methodology, specifically a CNN approach, which yielded a more effective analysis of road surface cracks while requiring fewer computational resources than alternative algorithms [18]. Although CNNs are widely employed for crack detection within images, their application is primarily focused on a single type of distress analysis. In contrast, this study utilizes a single model framework to investigate two distinct distresses, thereby expanding the scope of its application. The XGboost algorithm is used as the common core of the proposed model for both distress analyses due to its good accuracy and efficient data handling capabilities compared to other models. It would be advantageous for a pavement engineer to utilize a single model for maintenance when examining these two types of distress. In conclusion, the authors emphasize the significance of the proposed framework that employs the XGboost algorithm for analyzing two distinct types of distress, with the goal of improving the prediction outcomes in the following ways:
  • Through suitable optimization and the inclusion of relevant variables and images for both types of distress analysis, XGboost delivers accurate results and faster computational speed when compared to other machine learning algorithms used in the proposed model;
  • Utilizing SHAP analysis methods to identify the key parameters that significantly impact the prediction of rutting behavior.

2. Materials and Methods

A complete framework technique has been developed to facilitate the creation of intelligent prediction systems, as shown in Figure 1. The main aim is to develop a framework that employs the XGboost algorithm to analyze two distinct types of distress. The framework consists of several stages, starting with data collection and ending with model utilization. Every stage is carefully planned and executed to achieve the ultimate goal.
The initial phase of the system involves data collection. As the proposed system consists of two parts, Figure 1a shows the regression model analysis, which focuses on the prediction of rutting behavior and whose primary data source is numerical data. The independent feature parameters are AC Thickness, ESAL, KESAL, Temperature, Resilient Modulus, Dynamic Modulus, Moisture Content, and MRI, with Rutting Depth as the dependent parameter. For cracks, a classification analysis was used, utilizing image data and a Gabor filter extraction technique for further processing, as shown in Figure 1b.
  • The missing values in the numerical dataset for rutting analysis were addressed through data preprocessing. Figure 2 shows that the Moisture Content and MRI fields contain approximately 5506 missing entries. Following this preprocessing step, the data were ready for further analysis; a brief imputation sketch is given after this list.
  • The subsequent phase in the method involves selecting algorithms. The emphasis is on selecting effective predictive algorithms capable of rapidly processing the obtained data and producing accurate results for distress analysis. Four distinct machine learning techniques were employed for crack damage classification, while a separate set of regression algorithms was used for rutting deformation analysis. XGboost was utilized for both distress conditions, and promising outcomes were achieved.
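A minimal sketch of this preprocessing step is given below. It assumes the LTPP export is available as a CSV file and that the affected columns are named Moisture_Content and MRI (hypothetical file and column names); the mean imputation mirrors the handling described in Section 3.

```python
# Sketch of the preprocessing step: visualizing missing values (cf. Figure 2)
# and filling them by mean imputation. File and column names are illustrative
# assumptions, not the exact LTPP field names.
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt
from sklearn.impute import SimpleImputer

df = pd.read_csv("ltpp_rutting_features.csv")   # hypothetical export of the LTPP table

# Heatmap of missing entries, analogous to Figure 2
sns.heatmap(df.isnull(), cbar=False)
plt.title("Missing values per feature")
plt.tight_layout()
plt.show()

# Mean imputation of the numeric columns that contain gaps
imputer = SimpleImputer(strategy="mean")
df[["Moisture_Content", "MRI"]] = imputer.fit_transform(df[["Moisture_Content", "MRI"]])
```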
Figure 2. Missing values heatmap of all feature parameters involved in prediction of rutting.

2.1. Machine Learning Algorithms

This study used various machine learning algorithms for classification and regression analysis, including Support Vector Machine, Decision Tree, Extreme Gradient Boosting, K-Nearest Neighbor, and Random Forest. A brief description of these five algorithms is given below.

2.1.1. Support Vector Machine Algorithm

SVM is widely recognized for its exceptional prediction and classification accuracy. It utilizes a range of kernel functions, such as linear, radial basis function (RBF), and polynomial. The objective function of SVM is shown in Equations (1)–(3).
$\frac{1}{2}\|w\|^{2} + C\sum_{i=1}^{n}\left(\xi_{i} + \xi_{i}^{*}\right)$  (1)

$y_{i} - w^{T}\phi(x_{i}) - b \le \varepsilon + \xi_{i}$  (2)

$w^{T}\phi(x_{i}) + b - y_{i} \le \varepsilon + \xi_{i}^{*}, \qquad \xi_{i}, \xi_{i}^{*} \ge 0$  (3)
where $\phi(x_{i})$ is a kernel-induced mapping that transfers the real data into a higher-dimensional space, $w$ is the vector of coefficients, $y_{i}$ is the target variable, $C$ is the SVM regularization parameter that controls how much error is tolerated, $b$ is the bias term, $\varepsilon$ denotes the maximum error tolerated, and $\xi_{i}$, $\xi_{i}^{*}$ are slack variables that permit values to fall outside of $\varepsilon$.
SVM has been the subject of extensive research and is used to forecast various asphalt mixture characteristics. K. Gopalakrishnan and S. Kim proposed using Support Vector Machines (SVM) to represent the mechanical properties of hot mix asphalt (HMA) due to the significant complexity and uncertainty involved in HMA modeling [19]. SVM produced superior results compared with a multi-linear model. Notably, the SVM model has also been used to forecast the state of the pavement [20].
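For illustration, a minimal scikit-learn sketch of an epsilon-SVR model of the kind formulated in Equations (1)–(3) is given below; the synthetic data stand in for the rutting features, and the kernel, C, and epsilon values are assumed for demonstration rather than taken from this study.

```python
# Illustrative epsilon-SVR setup corresponding to Equations (1)-(3).
from sklearn.datasets import make_regression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVR

# Synthetic stand-in for the nine rutting feature parameters
X, y = make_regression(n_samples=500, n_features=9, noise=5.0, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=0)

svr = make_pipeline(
    StandardScaler(),                        # SVR is sensitive to feature scales
    SVR(kernel="rbf", C=10.0, epsilon=0.1),  # RBF kernel with epsilon-insensitive loss
)
svr.fit(X_train, y_train)
print("Test R^2:", svr.score(X_test, y_test))
```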

2.1.2. Decision Tree Algorithm

Decision trees play a significant role in predicting numerical target variables and are essential to forecasting techniques. The hierarchical framework of these tree structures, which consists of roots, branches, and leaves, improves the accuracy of predictions [11]. As the technique traverses the tree, it performs conditional evaluations at internal nodes to determine the most favorable route along the branching structure. Many evaluation measures guide this method, such as the total sum of squared errors. The algorithm’s prediction result is determined by the value assigned to the leaf node at the end of the calculated path [11]. The process of partitioning data into regression trees is driven by minimizing the divergence of output features (D) from the mean, as shown in Equation (4).
$D_{\mathrm{Total}} = \sum\left(Y_{i} - \bar{Y}\right)^{2}$  (4)
Here, $\bar{Y}$ represents the average of the output feature, whereas $Y_{i}$ represents the target feature. When a segmentation point divides the data into two distinct and mutually exclusive groups (left and right), the reduction in D can be mathematically represented as in Equation (5).
$\Delta_{\mathrm{Total}} = D_{\mathrm{Total}} - \left(D_{\mathrm{Right}} + D_{\mathrm{Left}}\right)$  (5)
Here, $D_{\mathrm{Right}}$ and $D_{\mathrm{Left}}$ denote the divergence within the right and left subsets, respectively. It is worth highlighting the different categories of regression trees, including intricate, intermediate, and fundamental trees; these variants differ primarily in the leaf sizes they use.
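A minimal regression-tree sketch following Equations (4) and (5) is shown below; the data and the tree-size settings are illustrative assumptions.

```python
# Minimal regression-tree sketch; the "squared_error" criterion corresponds to
# minimizing the squared divergence D in Equations (4)-(5).
from sklearn.datasets import make_regression
from sklearn.tree import DecisionTreeRegressor

X, y = make_regression(n_samples=500, n_features=9, noise=5.0, random_state=0)

tree = DecisionTreeRegressor(
    criterion="squared_error",  # split quality measured by sum of squared errors
    max_depth=6,                # controls tree complexity (intermediate-sized tree)
    min_samples_leaf=10,        # minimum leaf size
    random_state=0,
)
tree.fit(X, y)
print("Training R^2:", tree.score(X, y))
```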

2.1.3. Extreme Gradient Boosting Algorithm

XGboost is an advanced method developed by Chen and Guestrin in 2016 [21], which has become widely popular because of its exceptional performance and improved flexibility. An advantage of XGboost is its capacity to mitigate overfitting in both training and test data, resulting in superior outcomes compared to other machine learning algorithms and faster results. The output y ^ i of XGboost gradient boosting can be expressed in the following manner as in Equation (6).
$\hat{y}_{i} = \sum_{k=1}^{K} f_{k}(x_{i}), \qquad f_{k} \in \Gamma$  (6)
where $f_{k}(x_{i})$ is the prediction score of the $k$-th tree, $\Gamma$ is the space of regression trees, $K$ is the number of trees, and $x_{i}$ represents the independent parameters corresponding to $y_{i}$. The final structural scoring function of the XGboost algorithm, for a fixed tree structure, is given in Equation (7).
$\Phi^{*}(k) = -\frac{1}{2}\sum_{j=1}^{T}\frac{\left(\sum_{i \in I_{j}} g_{i}\right)^{2}}{\sum_{i \in I_{j}} h_{i} + \lambda} + \gamma T$  (7)
where $\Phi^{*}(k)$ is the objective function. More detailed information about the XGboost model can be found in the original paper [21].
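A minimal sketch of such an XGboost regressor is given below; the hyperparameters (number of trees, learning rate, and the regularization terms corresponding to lambda and gamma in Equation (7)) are illustrative assumptions rather than the tuned values used in this study.

```python
# Sketch of an XGboost regressor in the spirit of Equations (6)-(7).
from sklearn.datasets import make_regression
from sklearn.model_selection import train_test_split
from xgboost import XGBRegressor

X, y = make_regression(n_samples=1000, n_features=9, noise=5.0, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=0)

model = XGBRegressor(
    n_estimators=300,             # K: number of additive trees f_k
    learning_rate=0.05,
    max_depth=6,
    reg_lambda=1.0,               # L2 regularization (lambda in Eq. (7))
    gamma=0.0,                    # minimum loss reduction per split (gamma in Eq. (7))
    objective="reg:squarederror",
)
model.fit(X_train, y_train)
print("Test R^2:", model.score(X_test, y_test))
```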

2.1.4. K Nearest Neighbor Algorithm

The KNN technique is typically employed for both classification and regression analysis. The primary advantages of KNN over other algorithms are its tolerance of different data distributions, considerable flexibility, and ease of use [22]. This algorithm primarily utilizes eigenvalues; it first computes the distances between various eigenvalues, identifies the values closest to the target variables, and subsequently derives the result from a weighted average. Various methods are employed for distance calculation, including the Chebyshev, Manhattan, and Euclidean distances. As Luo et al. proposed, the Euclidean distance is employed here to compute both interval distances and time series [23].
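A minimal KNN regression sketch using the Euclidean distance is shown below; the number of neighbors and the synthetic data are assumptions for demonstration.

```python
# Illustrative KNN regression with Euclidean distance and distance-weighted averaging.
from sklearn.datasets import make_regression
from sklearn.neighbors import KNeighborsRegressor
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

X, y = make_regression(n_samples=500, n_features=9, noise=5.0, random_state=0)

knn = make_pipeline(
    StandardScaler(),          # distances are scale-dependent
    KNeighborsRegressor(
        n_neighbors=5,         # assumed k
        weights="distance",    # weighted average of the nearest neighbors
        metric="euclidean",    # Euclidean distance, per Luo et al. [23]
    ),
)
knn.fit(X, y)
print("Prediction for first sample:", knn.predict(X[:1]))
```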

2.1.5. Random Forest Algorithm

Random Forest is another machine learning algorithm related to tree-based algorithms. It has different roots and nodes and is involved in different phases. Information entropy is used to measure how pure the dataset is, and the entropy of the dataset X is defined as in Equation (8).
$H(X) = -\sum_{k=1}^{N} P_{k}\log_{2}P_{k}$  (8)
Here, $H(X)$ is the entropy of the dataset and measures its impurity. $P_{k}$ is the probability that a sample in the dataset belongs to class $k$, while $\log_{2}P_{k}$ quantifies the contribution of an individual class to the overall uncertainty. A smaller value of $H(X)$ means that the purity of the data is higher. If a discrete feature is used to divide the set $X$, a classification result is generated, denoted by $X^{V}$ [24]. Table 1 presents a detailed comparison of the pros and cons of the XGboost algorithm against the alternative algorithms, highlighting their fit for different data scenarios.
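The sketch below links Equation (8) to a Random Forest classifier: a small helper computes the entropy of a label vector, and the forest uses the same entropy criterion to score splits; the synthetic data are placeholders for the crack/no-crack labels.

```python
# Entropy helper mirroring Equation (8) and an entropy-based Random Forest.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

def entropy(labels: np.ndarray) -> float:
    """Information entropy H(X) of a label vector (Equation (8))."""
    _, counts = np.unique(labels, return_counts=True)
    p = counts / counts.sum()
    return float(-(p * np.log2(p)).sum())

X, y = make_classification(n_samples=500, n_features=10, random_state=0)
print("Dataset entropy:", entropy(y))

forest = RandomForestClassifier(n_estimators=200, criterion="entropy", random_state=0)
forest.fit(X, y)
```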

2.2. Model Interpretation

The Shapley Additive Explanations (SHAP) method, developed by Lundberg and Lee [25], seeks to elucidate the specific impact of each input feature on the total prediction and its correlation with the target variable. It therefore allows one to comprehend the significance of each characteristic in forecasting the result and how it interacts with other characteristics. Consider a game in which the players symbolize the inputs and the forecast determines the outcome; SHAP quantifies each player’s specific impact on the final result of the game, that is, the extent to which every player affects the ultimate outcome [26]. Each point on the chart corresponds to the Shapley value ascribed to a certain feature and instance. The x-axis displays the Shapley values, which quantify the individual contributions of each feature to the final forecast. The vertical position on the y-axis represents the impact of the features on the target variable, namely the extent to which a single feature affects the target, while different colors indicate their significance level.
The SHAP dependence plot describes the impact of individual feature parameters on the target variable, using different colors to show the degree of impact. This approach is better than conventional partial dependence plots because it shows the impact of feature parameters on the prediction in more detail; further information about this plot can be found in [27]. Lundberg and Lee also presented subclasses of SHAP analysis, such as LinearSHAP, KernelSHAP, and TreeSHAP. Of these, TreeSHAP provides a linear explanatory model and the associated SHAP values, and the present research uses TreeSHAP [25]. Khan et al. [27] used a beeswarm plot of SHAP analysis to predict Marshall mix design parameters and resilient modulus using an XGboost model. The explanation model can be expressed as in Equation (9):
$p(f) = \phi_{0} + \sum_{i=1}^{N}\phi_{i}f_{i}$  (9)
where $p$, $f$, and $N$ describe the explanation model, the basic features, and the maximum size of the feature collection, respectively, and $\phi_{i}$ represents the attribution of feature $i$. The individual feature attribution is given by Equations (10) and (11).
$\phi_{i} = \sum_{S \subseteq M \setminus \{i\}}\frac{|S|!\,(N - |S| - 1)!}{N!}\left[g_{x}(S \cup \{i\}) - g_{x}(S)\right]$  (10)
$g_{x}(S) = E\left[g(x) \mid x_{S}\right]$  (11)
Here, $S$, $M$, and $E[g(x) \mid x_{S}]$ represent a feature subset, the set of all inputs, and the expected outcome of the function conditioned on that subset, respectively.
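A minimal TreeSHAP sketch is given below; the feature names and the small XGboost model fitted on synthetic data are hypothetical stand-ins for the study’s actual model and dataset.

```python
# TreeSHAP sketch for a tree-ensemble regressor, producing the bar, beeswarm,
# and dependence views discussed above. Feature names are hypothetical.
import pandas as pd
import shap
from sklearn.datasets import make_regression
from xgboost import XGBRegressor

X, y = make_regression(n_samples=500, n_features=4, noise=5.0, random_state=0)
X = pd.DataFrame(X, columns=["Truck_Volume", "GESAL", "Temperature", "AC_Thickness"])
model = XGBRegressor(n_estimators=200, max_depth=4).fit(X, y)

explainer = shap.TreeExplainer(model)                 # TreeSHAP for tree ensembles
shap_values = explainer.shap_values(X)

shap.summary_plot(shap_values, X, plot_type="bar")    # mean |SHAP| per feature (importance)
shap.summary_plot(shap_values, X)                     # beeswarm plot of individual contributions
shap.dependence_plot("Truck_Volume", shap_values, X)  # dependence plot for one feature
```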

3. Data Description and Feature Parameters Analysis

The main aim of this study is to propose a system that analyzes pavement performance, specifically rutting behavior and crack damage, using a single XGboost algorithm. To assess rutting behavior, the system considers several factors, such as the thickness of the asphalt layer, climatic conditions, trends in yearly average traffic and truck volume, resilient modulus, dynamic modulus, and other variables. To carry out this study, a significant quantity of data is necessary. The Long-Term Pavement Performance (LTPP) database is a highly accessible source for these data; it consolidates data gathered from 49 states in the United States and Canada. The database contains a wide range of raw data, including the thickness of the asphalt layer, Annual Gesal Trend, daily average traffic per year, yearly truck volume, rutting depth, equivalent single axle load, average temperature, resilient modulus, and dynamic modulus. Missing values in the dataset occurred mainly in the Moisture Content and MRI fields. These values were handled using mean imputation, where each missing entry was filled with the corresponding mean of the available data points. Through data cleansing, inaccuracies and outliers were removed, resulting in a dataset of 24,500 records comprising ten feature parameters, including rutting depth, as shown in Table 2. The analysis of cracks relies on image data sourced from a Kaggle dataset, categorized into positive images that contain cracks and negative images without cracks. The Gabor filter, a feature extraction technique, can extract features from an image, as shown in Figure 3. The Gabor filter has been extensively employed in various vision analysis tasks, including object recognition, segmentation, and edge detection; detailed information regarding the Gabor filter can be found in [28].
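A minimal Gabor-filter extraction sketch along these lines is given below; the image file name, kernel size, wavelength, and orientations are assumed values for demonstration, not those used in this study.

```python
# Illustrative Gabor-filter feature extraction for a crack image using OpenCV.
import cv2
import numpy as np

image = cv2.imread("crack_positive_001.jpg", cv2.IMREAD_GRAYSCALE)  # hypothetical file name

features = []
for theta in np.arange(0, np.pi, np.pi / 4):            # four orientations: 0, 45, 90, 135 deg
    kernel = cv2.getGaborKernel(
        ksize=(21, 21), sigma=4.0, theta=theta,
        lambd=10.0, gamma=0.5, psi=0, ktype=cv2.CV_32F,
    )
    response = cv2.filter2D(image, cv2.CV_32F, kernel)  # filter response at this orientation
    features.extend([response.mean(), response.std()])  # simple summary statistics per filter

feature_vector = np.array(features)                     # one input row for the classifier
```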
Accurate feature measurement facilitates exact model fitting, and feature selection serves as a valuable tool for the machine learning model. Hence, it is imperative to choose the most pertinent characteristics as the learning strategy for both the pavement classification and regression algorithms. The two predominant approaches for feature selection involve the Pearson and Spearman correlation coefficients. The Pearson correlation test, also known as the product-moment (product-difference) correlation, was devised by the British statistician Karl Pearson as a means of quantifying linear association [29]. For a sample the coefficient is denoted by r, while for the population it is denoted by ρ; the coefficient is dimensionless and takes values from −1 to +1 [29].
It is widely recognized as one of the most commonly utilized correlation coefficients. Here, ‘r’ denotes the magnitude of a linear correlation between variables X and Y . Let X and Y be considered as independent variables. To calculate the Pearson’s correlation coefficient between the two variables, use Equation (12).
$\rho_{X,Y} = \frac{E\left[(X - \mu_{X})(Y - \mu_{Y})\right]}{\sigma_{X}\sigma_{Y}}$  (12)
where $E[(X - \mu_{X})(Y - \mu_{Y})]$ is the covariance between X and Y, $\sigma_{X}$ is the standard deviation of X, and $\sigma_{Y}$ is the standard deviation of Y. The non-parametric Spearman coefficient ($r_{s}$), which gauges the relationship between two variables, is found using Equation (13).
$r_{s} = 1 - \frac{6\sum d_{i}^{2}}{n\left(n^{2} - 1\right)}$  (13)
where “n” represents the quantity of data and “$d_{i}$” indicates the rank difference between the two variables. Although both the pavement survey data and the prediction indicator data exhibit complex linear and nonlinear patterns, these coefficients have the drawback of being heavily reliant on linear relationships. For the rutting prediction analysis, Figure 4 displays the Pearson correlation results between the input and target variables. It is evident that the measures exhibit a mix of positive and negative correlations with each other; in particular, there is a significant positive association between GESAL and Truck Volume. Figure 5a–i illustrates the correlation between the target variable, rutting depth, and each individual input variable. For instance, a higher truck volume and temperature generally correspond to increased rutting, whereas parameters such as the resilient modulus and dynamic modulus show complex interactions, reflecting nonlinear behavior.
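A minimal sketch of this correlation screening, corresponding to Equations (12) and (13), is given below; the CSV file and the Rutting_Depth column name are hypothetical.

```python
# Pearson and Spearman correlation screening of the rutting features.
import pandas as pd

df = pd.read_csv("ltpp_rutting_features.csv")   # hypothetical export of the dataset

pearson_matrix = df.corr(method="pearson")      # linear correlations (cf. Figure 4)
spearman_matrix = df.corr(method="spearman")    # rank-based correlations

# Correlation of every input feature with the target, sorted by strength
target_corr = pearson_matrix["Rutting_Depth"].drop("Rutting_Depth").sort_values()
print(target_corr)
```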
Figure 3. Feature extraction technique using Gabor filter for crack images.

4. Results and Discussion

4.1. Model Development and Evaluation

The evaluation framework of the model system presents two types of evaluation analysis for pavement maintenance and rehabilitation: regression analysis for rutting deformation based on numerical data and classification analysis for crack damage behavior based on image data. For regression analysis, performance evaluation relies on statistical parameters such as mean absolute error (MAE), root mean squared error (RMSE), and the coefficient of determination (R2). The coefficient of determination (R2) ranges from 0 to 1 and indicates the degree of agreement between predicted and actual values; a higher value of R2 indicates greater accuracy, and it cannot exceed 1. Both RMSE and MAE measure the proximity of the predicted values to the actual values, with smaller values indicating higher accuracy. The three evaluation metrics were determined using Equations (14)–(16) [30,31].
$RMSE = \sqrt{\frac{1}{n}\sum_{i=1}^{n}\left(Y_{i} - \hat{Y}_{i}\right)^{2}}$  (14)

$MAE = \frac{1}{n}\sum_{i=1}^{n}\left|Y_{i} - \hat{Y}_{i}\right|$  (15)

$R^{2} = 1 - \frac{\sum_{i=1}^{n}\left(Y_{i} - \hat{Y}_{i}\right)^{2}}{\sum_{i=1}^{n}\left(Y_{i} - \bar{Y}\right)^{2}}$  (16)
where n is the total number of samples, $Y_{i}$ is the measured value of rutting depth, $\hat{Y}_{i}$ is the predicted value of rutting depth, and $\bar{Y}$ is the average of the measured rutting depths. The k-fold cross-validation method was utilized to provide a reliable evaluation and reduce the possibility of overfitting. In this approach, the dataset is segmented into k equal folds; the model is trained on k − 1 folds and validated on the remaining fold. This process is repeated k times, with each fold serving as the validation set once. For this study, k = 5 was chosen, balancing computational efficiency and statistical reliability. Using k-fold cross-validation ensures the model is exposed to the entire dataset during the training and validation phases, enhancing its generalizability and reducing bias in the performance metrics.
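A minimal sketch of the 5-fold cross-validation scheme described above is given below; the XGboost hyperparameters and synthetic data are illustrative assumptions.

```python
# 5-fold cross-validation of an XGboost regressor with R^2 and RMSE scoring.
from sklearn.datasets import make_regression
from sklearn.model_selection import KFold, cross_val_score
from xgboost import XGBRegressor

X, y = make_regression(n_samples=1000, n_features=9, noise=5.0, random_state=0)

cv = KFold(n_splits=5, shuffle=True, random_state=0)   # k = 5 folds
model = XGBRegressor(n_estimators=300, learning_rate=0.05, max_depth=6)

r2_scores = cross_val_score(model, X, y, cv=cv, scoring="r2")
rmse_scores = -cross_val_score(model, X, y, cv=cv, scoring="neg_root_mean_squared_error")
print("Mean R^2:", r2_scores.mean(), " Mean RMSE:", rmse_scores.mean())
```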
For classification analysis, model performance is assessed using precision, recall, F1 score, and accuracy. Precision is defined as the proportion of positively identified outcomes that are correct, as indicated in Equation (17). Recall is defined as the proportion of actual positive outcomes that are correctly predicted, as shown in Equation (18). The F1 score is defined as the harmonic mean of precision and recall, as given in Equation (19). The most significant metric is accuracy, defined as the proportion of all predictions that are correct, as described in Equation (20) [32].
$\mathrm{Precision} = \frac{\mathrm{True\ Positive}}{\mathrm{True\ Positive} + \mathrm{False\ Positive}}$  (17)

$\mathrm{Recall} = \frac{\mathrm{True\ Positive}}{\mathrm{True\ Positive} + \mathrm{False\ Negative}}$  (18)

$\mathrm{F1\ Score} = \frac{2 \times \mathrm{Precision} \times \mathrm{Recall}}{\mathrm{Precision} + \mathrm{Recall}}$  (19)

$\mathrm{Accuracy} = \frac{\mathrm{True\ Positive} + \mathrm{True\ Negative}}{\mathrm{Total\ Predictions}}$  (20)
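A minimal scikit-learn computation of Equations (17)–(20) is shown below; the true and predicted labels are placeholders for the crack/no-crack outputs.

```python
# Computation of precision, recall, F1, accuracy, and the confusion matrix
# (Equations (17)-(20)) for placeholder crack labels (1 = crack, 0 = no crack).
from sklearn.metrics import (accuracy_score, confusion_matrix, f1_score,
                             precision_score, recall_score)

y_true = [1, 0, 1, 1, 0, 1, 0, 0, 1, 0]   # placeholder ground-truth labels
y_pred = [1, 0, 1, 0, 0, 1, 0, 1, 1, 0]   # placeholder model predictions

print("Precision:", precision_score(y_true, y_pred))    # Eq. (17)
print("Recall:   ", recall_score(y_true, y_pred))        # Eq. (18)
print("F1 score: ", f1_score(y_true, y_pred))            # Eq. (19)
print("Accuracy: ", accuracy_score(y_true, y_pred))      # Eq. (20)
print("Confusion matrix:\n", confusion_matrix(y_true, y_pred))
```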

4.2. Prediction Results for Rutting Analysis

Figure 6 illustrates scatter plots comparing the predictions of the three regression algorithms with the observed values for rutting behavior. The diagonal lines indicate a perfect match between the predicted and actual values, while data points away from the lines indicate predictions that deviate from the observations. The outlier distributions for SVM and XGboost exhibit similar patterns on the test dataset, as shown in Figure 6a,b. However, upon closer examination, it becomes evident that XGboost has significantly fewer outliers deviating from the regression line than SVM. This indicates that the predictive capability of XGboost is superior to that of the other machine learning methods. The Support Vector Machine (SVM) and Decision Tree (DT) algorithms also perform adequately and achieve satisfactory accuracy, although not as high as that of the XGboost algorithm.
Figure 7 illustrates a statistical comparison of the three algorithms using key metrics, including mean squared error (MSE), root mean squared error (RMSE), and the coefficient of determination. The analysis reveals that XGboost and SVM yield comparable results, with slight differences in their performance values, whereas the performance of DT is comparatively lower. The SVM algorithm was evaluated with three kernel functions: linear, radial basis function (RBF), and polynomial; among these, the RBF kernel provides the best results. Nevertheless, XGboost is notable for its high R2 value of approximately 0.9, attributed to its effective use of regularization techniques that help mitigate overfitting. Additionally, it incorporates parallel processing, enabling the model to run on several CPU cores. Lastly, it can automatically handle missing values, which further enhances prediction accuracy.

XGboost Algorithm Evaluation by SHAP Analysis

The XGboost algorithm is interpreted by examining the contributions of the parameters involved, which is performed using SHAP, as previously explained in the introduction. Figure 8 illustrates the proportional impact of each parameter in the algorithm used to forecast rutting behavior. The model’s predictions are influenced mainly by Truck Volume and the Generic Equivalent Single Axle Load (GESAL), as well as Temperature, Resilient Modulus, and Dynamic Modulus. On the other hand, Moisture Content and the Mean Roughness Index (MRI) have the least impact. Figure 9 displays a beeswarm plot of SHAP values, with high feature values shown in red and low feature values shown in blue. The parameters exhibit a negative influence for larger feature values and a positive influence for lower feature values.
SHAP analysis provides concise information on how different feature parameters influence the model prediction. On the x-axis, the values show the strength and direction of each feature’s effect on the model output; positive values increase the prediction and vice versa. The y-axis lists the features ranked according to their importance to the model prediction. The plot also illustrates that features such as Truck Volume have a strong impact on the prediction. GESAL, AC Thickness, Temperature, and KESAL also have a notable impact; their values are spread across both positive and negative ranges, indicating a more complex relationship with the output. On the other hand, some parameters, such as Moisture Content and MRI, play a less significant role in the prediction. The SHAP visualization clearly describes how features influence the model’s decisions, making it easier for the reader to identify the most relevant feature parameters for estimating rutting depth behavior.

4.3. Classification Results for Cracks Analysis

As previously discussed, the crack image data are categorized into positive and negative images. To extract features from the positive images using Gabor filter techniques, multiple filters with varying orientations are applied; the greyscale intensity responses are illustrated in Figure 10. The comparison results can be observed in Figure 11: XGboost achieves an impressive accuracy of 91% across both the test and training datasets, making it a valuable tool for insights, and 93% on the validation dataset, as shown in Figure 11e. Meanwhile, KNN and Random Forest provide moderate accuracy, whereas SVC performs slightly lower. The results indicate that XGboost is the most reliable model due to its higher accuracy and balanced performance across the precision, recall, and F1-score metrics. Table 3 evaluates the four models according to their Accuracy, Precision, Recall, and F1-Score; these values stem from the confusion matrices illustrated in Figure 12. XGboost demonstrates superior accuracy, correctly classifying a greater proportion of true positives and negatives and producing fewer misclassifications. In contrast, KNN and RF show slightly higher misclassification rates, particularly for false negatives, while SVC exhibits the weakest performance, misclassifying a substantial number of actual positive cases. This further reinforces the robustness of XGboost in accurately classifying crack damage.

5. Conclusions

This paper presents an intelligent prediction system utilizing Extreme Gradient Boosting (XGboost) to predict two relevant functional indices: rutting deformation and crack damage. The goal is to analyze the XGboost algorithm in both situations and to compare its performance outcomes with those of alternative algorithms. The following conclusions were derived from the analysis and findings of the study on rutting behavior and cracks:
  • For both types of distress analysis, the Extreme Gradient Boosting (XGboost) algorithm exhibited good performance, achieving an R2 value of 0.9 for rutting behavior and an accuracy of 0.91, precision of 0.92, recall of 0.9, and F1-score of 0.91 for cracks, outperforming the alternative algorithms in the proposed framework. Hence, this research demonstrates that the proposed system can be utilized for both rutting behavior and crack analysis of flexible pavement.
  • SHAP analysis of XGboost reveals that parameters related to traffic load and environmental conditions have a strong impact on predicting rutting behavior with this model. Additionally, the analysis indicates a moderate impact of the dynamic modulus on the prediction and a lesser impact of MRI and Moisture Content among all parameters. The analysis indicates that, when utilizing the proposed system in the future, the significant variables to consider for predicting rutting behavior are Truck Volume, ESAL, AC Thickness, temperature conditions, and the resilient modulus of the asphalt.

6. Limitation and Future Work

However, this research could not verify the reliability of the proposed approach for specific locations or projection years; thus, this area still needs attention. For rutting behavior, the LTPP statistics cover only different states of the United States and Canada, and for cracks, only the Kaggle dataset was utilized. Thus, project-level estimates for a specific roadway project or area should be evaluated further. Moreover, road performance is also influenced by other distresses, including potholes. Future research should be conducted on different kinds of distress to evaluate predictive performance, particularly utilizing the proposed system and focusing on significant independent variables. To enable a comprehensive evaluation of predictive performance for different types of distress, this framework should also be applied to rigid pavements in the same way as to flexible pavements.

Author Contributions

B.L.: Conceptualization, Methodology; D.J.: Writing—original draft; J.H.: Data curation, Investigation; W.L.: Writing—review and editing; L.C.: Writing—review and editing, Supervision. All authors have read and agreed to the published version of the manuscript.

Funding

This work was supported by the Fundamental Research Funds for the Central Universities (Grant no. 2242024k30053).

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

The data of this study are available from the corresponding author upon request.

Conflicts of Interest

Authors Bing Liu and Jianghai Hu were employed by the company Shandong Hi-Speed Hubei Development Co., Ltd. The remaining authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

References

  1. Liu, Z.; Wang, S.; Gu, X.; Wang, D.; Dong, Q.; Cui, B. Intelligent Assessment of Pavement Structural Conditions: A Novel FeMViT Classification Network for GPR Images. IEEE Trans. Intell. Transp. Syst. 2024, 25, 13511–13523. [Google Scholar] [CrossRef]
  2. Liu, F.; Liu, J.; Wang, L.; Al-Qadi, I.L. Multiple-type distress detection in asphalt concrete pavement using infrared thermography and deep learning. Autom. Constr. 2024, 161, 105355. [Google Scholar] [CrossRef]
  3. E17 Committee. Practice for Roads and Parking Lots Pavement Condition Index Surveys; ASTM: West Conshohocken, PA, USA, 2024. [Google Scholar] [CrossRef]
  4. Ma, R.; Li, Y.; Cheng, P.; Chen, X.; Cheng, A. Low-Temperature Cracking and Improvement Methods for Asphalt Pavement in Cold Regions: A Review. Buildings 2024, 14, 3802. [Google Scholar] [CrossRef]
  5. Tian, Y.; Lee, J.; Nantung, T.; Haddock, J.E. Calibrating the Mechanistic–Empirical Pavement Design Guide Rutting Models using Accelerated Pavement Testing. Transp. Res. Rec. 2018, 2672, 304–314. [Google Scholar] [CrossRef]
  6. Ji, X.; Zheng, N.; Niu, S.; Meng, S.; Xu, Q. Development of a rutting prediction model for asphalt pavements with the use of an accelerated loading facility. Road Mater. Pavement Des. 2016, 17, 15–31. [Google Scholar] [CrossRef]
  7. Development and Calibration of Shear-Based Rutting Model for Asphalt Concrete Layers: International Journal of Pavement Engineering: Vol 18, No 10—Get Access. Available online: https://www.tandfonline.com/doi/full/10.1080/10298436.2016.1138111 (accessed on 22 January 2025).
  8. Yang, J.D.; Lu, J.J.; Gunaratne, M.; Xiang, Q.J. Forecasting overall pavement condition with neural networks—Application on Florida highway network . In Pavement Management and Rigid and Flexible Pavement Design 2003: Pavement Design, Management, And Performance; Transportation Research Record-Series; Transportation Research Board Natl Research Council: Washington, DC, USA, 2003; pp. 3–12. Available online: https://webofscience.clarivate.cn/wos/alldb/full-record/WOS:000189432800001 (accessed on 26 December 2024).
  9. Okuda, T.; Suzuki, K.; Kohtake, N. Proposal and Evaluation of prediction of pavement rutting depth by recurrent neural network. In Proceedings of the 2017 6th Iiai International Congress On Advanced Applied Informatics (IIAI-AAI), Shizuoka, Japan, 9–13 July 2017; Matsuo, T., Fukuta, N., Mori, M., Hashimoto, K., Hirokawa, S., Eds.; IEEE: New York, NY, USA, 2017; pp. 1053–1054. [Google Scholar]
  10. Choi, S.; Do, M. Development of the Road Pavement Deterioration Model Based on the Deep Learning Method. Electronics 2020, 9, 3. [Google Scholar] [CrossRef]
  11. Guo, R.; Fu, D.; Sollazzo, G. An ensemble learning model for asphalt pavement performance prediction based on gradient boosting decision tree. Int. J. Pavement Eng. 2022, 23, 3633–3646. [Google Scholar] [CrossRef]
  12. Haddad, A.J.; Chehab, G.R.; Saad, G.A. The use of deep neural networks for developing generic pavement rutting predictive models. Int. J. Pavement Eng. 2022, 23, 4260–4276. [Google Scholar] [CrossRef]
  13. Gupta, A.; Gowda, S.; Tiwari, A.; Gupta, A.K. XGBoost-SHAP framework for asphalt pavement condition evaluation. Constr. Build. Mater. 2024, 426, 136182. [Google Scholar] [CrossRef]
  14. LeCun, Y.; Bengio, Y.; Hinton, G. Deep learning. Nature 2015, 521, 436–444. [Google Scholar] [CrossRef]
  15. Zhu, J.; Zhong, J.; Ma, T.; Huang, X.; Zhang, W.; Zhou, Y. Pavement distress detection using convolutional neural networks with images captured via UAV. Autom. Constr. 2022, 133, 103991. [Google Scholar] [CrossRef]
  16. Jang, K.; An, Y.-K. Multiple crack evaluation on concrete using a line laser thermography scanning system. Smart Struct. Syst. 2018, 22, 201–207. [Google Scholar] [CrossRef]
  17. Chen, C.; Seo, H.; Jun, C.; Zhao, Y. A potential crack region method to detect crack using image processing of multiple thresholding. SIViP 2022, 16, 1673–1681. [Google Scholar] [CrossRef]
  18. Doğan, G.; Ergen, B. A new mobile convolutional neural network-based approach for pixel-wise road surface crack detection. Measurement 2022, 195, 111119. [Google Scholar] [CrossRef]
  19. Support Vector Machines Approach to HMA Stiffness Prediction | Journal of Engineering Mechanics | Vol 137, No 2. Available online: https://ascelibrary.org/doi/10.1061/%28ASCE%29EM.1943-7889.0000214 (accessed on 27 December 2024).
  20. Full Article: Prediction of Remaining Service Life of Pavement Using an Optimized Support Vector Machine (Case Study of Semnan–Firuzkuh Road). Available online: https://www.tandfonline.com/doi/full/10.1080/19942060.2018.1563829 (accessed on 27 December 2024).
  21. Chen, T.; Guestrin, C. Xgboost: A scalable tree boosting system. In Proceedings of the 22nd ACM Sigkdd International Conference on Knowledge Discovery and Data Mining; ACM Digital Library: New York, NY, USA, 2016; pp. 785–794. [Google Scholar]
  22. Clark, S. Traffic prediction using multivariate nonparametric regression. J. Transp. Eng. 2003, 129, 161–168. [Google Scholar] [CrossRef]
  23. Luo, X.; Li, D.; Yang, Y.; Zhang, S. Spatiotemporal Traffic Flow Prediction with KNN and LSTM. J. Adv. Transp. 2019, 2019, 4145353. [Google Scholar] [CrossRef]
  24. Performance Prediction of Asphalt Pavement Based on Random Forest-All Databases. Available online: https://webofscience.clarivate.cn/wos/alldb/full-record/CSCD:7090014 (accessed on 27 December 2024).
  25. Lundberg, S.M.; Lee, S.-I. A unified approach to interpreting model predictions. Adv. Neural Inf. Process. Syst. 2017, 30. [Google Scholar] [CrossRef]
  26. Ekanayake, I.U.; Meddage, D.P.P.; Rathnayake, U. A novel approach to explain the black-box nature of machine learning in compressive strength predictions of concrete using Shapley additive explanations (SHAP). Case Stud. Constr. Mater. 2022, 16, e01059. [Google Scholar] [CrossRef]
  27. Khan, A.; Huyan, J.; Zhang, R.; Zhu, Y.; Zhang, W.; Ying, G.; Ahmad, K.N.; Shah, S.K. An ensemble tree-based prediction of Marshall mix design parameters and resilient modulus in stabilized base materials. Constr. Build. Mater. 2023, 401, 132833. [Google Scholar] [CrossRef]
  28. Salman, M.; Mathavan, S.; Kamal, K.; Rahman, M. Pavement Crack Detection Using the Gabor Filter. In Proceedings of the 2013 16th International IEEE Conference on Intelligent Transportation Systems—(ITSC), The Hague, The Netherlands, 6–9 October 2013; IEEE: New York, NY, USA, 2013; pp. 2039–2044. Available online: https://webofscience.clarivate.cn/wos/alldb/full-record/WOS:000346481000327 (accessed on 27 December 2024).
  29. Sedgwick, P. Pearson’s correlation coefficient. BMJ 2012, 345. [Google Scholar] [CrossRef]
  30. Gandomi, A.H.; Roke, D.A. Assessment of artificial neural network and genetic programming as predictive tools. Adv. Eng. Softw. 2015, 88, 63–72. [Google Scholar] [CrossRef]
  31. Zhang, W.; Khan, A.; Ju, H.; Zhong, J.; Peng, T.; Cheng, H. Predicting Marshall parameters of flexible pavement using support vector machine and genetic programming. Constr. Build. Mater. 2021, 306, 124924. [Google Scholar] [CrossRef]
  32. Garrido-Labrador, J.; Serrano-Mamolar, A.; Maudes-Raedo, J.; Rodriguez, J.J.; Garcia-Osorio, C. Ensemble methods and semi-supervised learning for information fusion: A review and future research directions. Inf. Fusion 2024, 107, 102310. [Google Scholar] [CrossRef]
Figure 1. Schematic diagram of the methodology section of the whole framework: (a) rutting analysis, (b) crack analysis.
Figure 4. Correlation matrix analysis for rutting behavior.
Figure 5. Graphical relationship of rutting behavior with: (a) resilient modulus, (b) truck volume, (c) temperature, (d) AC thickness, (e) dynamic modulus, (f) GESAL, (g) KESAL, (h) MRI, and (i) moisture content.
Figure 6. Comparison analysis of actual vs. predicted values of proposed algorithms with different R2 values: (a) XGboost; (b) SVR; (c) DT.
Figure 7. Statistical comparison of three models with MSE, RMSE, and R2.
Figure 8. SHAP percentage impact of parameters on model output.
Figure 9. SHAP beeswarm plot between feature parameters.
Figure 10. Feature extraction from crack images using Gabor filter.
Figure 11. Performance metrics results for classification of cracks: (a) XGboost algorithm, (b) KNN algorithm, (c) Random Forest algorithm, (d) SVC algorithm, (e) cross-validation analysis result of XGboost algorithm.
Figure 12. Comparison analysis of confusion matrices of (a) XGboost, (b) KNN, (c) RF, and (d) SVC.
Table 1. Comprehensive summary of strengths and weaknesses of XGboost compared to others.

Algorithm | Pros | Cons
XGboost | High accuracy, regularization, handling of missing data, fast processing. | Computationally intensive for large datasets.
SVM | Strong theoretical background; works well on well-separated datasets. | Expensive for large datasets; less suited to noisy data.
Decision Tree | Simple to explain, suitable for small datasets. | Prone to overfitting.
Random Forest | Reduced risk of overfitting due to the use of multiple decision trees. | Less accurate compared to XGboost.
KNN | Performs well on small datasets without the need for complex training. | Sensitive to noise, expensive for large datasets.
Table 2. Description of feature parameters from LTPP.

No | Field Name | Field Alias
1 | Rutting Depth | Maximum average depth referenced from a 1.8 m straight edge.
2 | Moisture Content | Moisture content of asphalt pavement.
3 | Annual_Truck_Volume_Trend | LTPP lane annual truck trend estimate.
4 | Annual_Gesal_Trend | Trend of LTPP Generic Equivalent Single Axle Load.
5 | Temp_Avg | LTPP average temperature of all the states.
6 | Resilient_MOD_AVG | Average of resilient modulus.
7 | Dynamic_MOD_AVG | Average of dynamic modulus.
8 | AC_THICKNESS | Axle load repetition and asphalt course thickness.
9 | Kesal-Year | Equivalent Single Axle Load (ESAL) in thousands.
10 | MRI | Mean Roughness Index.
Table 3. Evaluating and comparing the performance results of all algorithms.

Algorithm | Accuracy | Precision | Recall | F1-Score
XGboost | 0.91 | 0.92 | 0.90 | 0.91
KNN | 0.89 | 1.00 | 0.75 | 0.86
RF | 0.88 | 0.94 | 0.80 | 0.86
SVC | 0.84 | 0.96 | 0.70 | 0.81