Boosting Hot Mix Asphalt Dynamic Modulus Prediction Using Statistical and Machine Learning Regression Modeling Techniques

Awed, Ahmed M.; Awaad, Ahmed N.; Kaloop, Mosbeh R.; Hu, Jong Wan; El-Badawy, Sherif M.; Abd El-Hakim, Ragaa T.

doi:10.3390/su151914464

Open AccessArticle

Boosting Hot Mix Asphalt Dynamic Modulus Prediction Using Statistical and Machine Learning Regression Modeling Techniques

by

Ahmed M. Awed

¹

,

Ahmed N. Awaad

¹

,

Mosbeh R. Kaloop

^1,2,3,4

,

Jong Wan Hu

^2,3,*

,

Sherif M. El-Badawy

¹

and

Ragaa T. Abd El-Hakim

⁵

¹

Public Works Engineering Department, Faculty of Engineering, Mansoura University, Mansoura 35516, Egypt

²

Department of Civil and Environmental Engineering, Incheon National University, Incheon 22012, Republic of Korea

³

Incheon Disaster Prevention Research Center, Incheon National University, Incheon 22012, Republic of Korea

⁴

Digital InnoCent Ltd., London WC2A 2JR, UK

⁵

Public Works Engineering Department, Faculty of Engineering, Tanta University, Tanta 31527, Egypt

^*

Author to whom correspondence should be addressed.

Sustainability 2023, 15(19), 14464; https://doi.org/10.3390/su151914464

Submission received: 10 August 2023 / Revised: 27 September 2023 / Accepted: 28 September 2023 / Published: 3 October 2023

Download

Browse Figures

Versions Notes

Abstract

:

The prediction of asphalt mixture dynamic modulus (E*) was investigated based on 1128 E* measurements, using three regression and thirteen machine learning models. Asphalt binder properties and mixture volumetrics were characterized using the same feeding features in the NCHRP 1-37A Witczak model. However, three aggregate gradation characterization approaches were involved in both modelling techniques: the NCHRP 1-37A gradation parameters, Weibull distribution factors, and Bailey method parameters. This study evaluated the performance of these models based on various performance indicators, using both statistical and machine learning regression modeling techniques. K-fold cross-validation and learning curve analysis were conducted to assess the models’ generalization capabilities. The conclusions of this study demonstrate the superiority of the ML models, particularly the Catboost ensemble learning regression (CbR). Hyperparameter optimization and residual analysis were performed to fine-tune and confirm the heteroscedasticity of the CbR model. The Bailey-based CbR model showed the highest coefficient of determination (R²) of 0.998 and the lowest root mean square error (RMSE) of 220 MPa. Moreover, SHAP values interpreted the CbR model and showed the relative importance of its feeding features. Based on the findings of this study, the CbR model is suggested to accurately predict E* for a variety of asphalt mixtures. This information can be used to improve pavement design and construction, leading to more durable and long-lasting pavements.

Keywords:

dynamic modulus; Weibull distribution; Bailey method; statistical-based regression models; machine learning-based regression models; NCHRP 1-37 A Witczak model; Catboost regression algorithm

1. Introduction

Complex dynamic modulus (E*) is an intrinsic instrument employed for predicting the linear viscoelastic performance of asphalt mixtures and evaluating pavement responses, using the AASHTOWare Pavement ME Design (PMED) framework [1]. This substantial property represents the temperature- and frequency-dependent (consequently time-dependent) stiffness characteristics of the asphalt mixture [2]. Due to the prominence of E* in predicting pavement performance simultaneously with the tiresome, time-consuming, and expensive nature of the test procedure, considerable research efforts have been devoted to the prediction of E* of asphalt mixtures [3]. These efforts originated from the statistical regression modeling in the premature modelling trials. One of the most renowned models is the 1999 η-based NCHRP 1-37A Witczak model. The revised E* predictive equation was based on a dataset containing 205 mixtures with 2750 E* measurements, and can be represented as follows [4]:

\log_{10} E^{*} = 3.750063 + 0.02923 ρ_{200} - 0.001767 {(ρ_{200})}^{2} - 0.002841 ρ_{4} - 0.058097 V_{a} - 0.82208 \frac{V_{b e f f}}{V_{b e f f} + V_{a}} + \frac{3.871977 - 0.0021 ρ_{4} + 0.003958 ρ_{38} - 0.000017 {(ρ_{38})}^{2} + 0.00547 ρ_{34}}{1 + e^{(- 0.603313 - 0.313351 \log f - 0.393532 \log η)}}

(1)

where E* is the dynamic modulus of asphalt mixtures, in psi; η is the binder viscosity, in 10⁶ poise; f is the loading frequency, in Hz; V_a is the percentage of air void content in the asphalt mixture by volume; V_beff is the percentage of effective binder content by volume; ρ₃₄ is the cumulative percentage of retained weight on the 3/4 in. sieve; ρ₃₈ is the cumulative percentage of retained weight on the 3/8 in. sieve; ρ₄ is the cumulative percentage of retained weight on No. 4 sieve; and ρ₂₀₀ is the percentage of passing No. 200 sieve.

The aforementioned equation is the most famous and well-known model for the prediction of E* in addition to the Hirsch model. The feeding parameters of the Witczak model can be categorized into parameters related to aggregate gradation, binder viscosity, and mixture properties, while the Hirsch model stresses more on the volumetric properties of asphalt mixtures [4]. More research efforts related to E* are demonstrated in the following section in detail.

2. Literature Review

Based on Equation (1), the model feeding parameters can be explicitly divided into four categories: testing conditions, aggregate gradation parameters, binder viscosity, and asphalt mixture volumetrics. For example, some literature-based statistical regression models are summarized in Table 1 comparing their represented feeding parameters based on the aforementioned four categories, the dataset for model development, and goodness-of-fit statistics.

It can be noticed from Table 1 that researchers exerted subsequent efforts representing binder properties either in the form of binder viscosity (η) or complex shear modulus (G*) and phase angle (δ). However, the representation of mix properties varied among literature-based statistical regression models. Although the majority of the models incorporated classic gradation parameters such as ρ₃₄, ρ₃₈, ρ₄, and ρ₂₀₀ in addition to volumetric parameters such as V_a and V_beff, few researchers characterized mix volumetrics in terms of VMA and VFA. However, it should be noted that none of the previous literature studies utilized further surrogates for aggregate gradation parameters other than ρ₃₄, ρ₃₈, ρ₄, and ρ₂₀₀.

After numerous statistical regression modeling attempts [9,10,11,12,13], the adoption of machine learning (ML) approaches, which tend to constantly improve themselves over time using various algorithms, became imperative. The accuracy of machine learning capabilities would primarily depend on the quality of entry data and the selection of an ML approach. Plentiful ML techniques were utilized to predict the E^∗ of asphalt mixtures. These techniques varied to include among others artificial neural networks (ANNs), bagging trees ensemble (BTE), support vector machines (SVMs), gene expression programming (GEP), biogeography-based programming (BBP), random forest (RF), and beetle antennae search (BAS) [6,14,15]. Some research attempts were designated to the investigation of the significance of each model input parameters through developing many ANN solutions, then optimizing these solutions using grey wolf optimizer (GWO) [16]. The investigation included many input parameters representing aggregate gradation, binder characteristics, mix volumetric properties, and test conditions. Amid the investigated variables, the results pointed out that temperature, frequency, and low-temperature performance grade (PGL) had the most conspicuous influence on the values of E^∗. Furthermore, ρ200 and nominal maximum aggregate size (NMAS) had no contribution in any optimal ANN model, while ρ3/4, the percentage of reclaimed asphalt pavement (RAP), and the percentage of asphalt content (AC) appeared in only one, three, and four optimal models out of 30 models [16]. Few ML-based models aimed at predicting E* of asphalt mixtures in low temperature regions [17], while other achievements focused on predicting E* for in-service asphalt layers in hot climates [18]. However, other ML-based models [19] attempted to predict high modulus (E*) in terms of high-temperature performance-related index properties such as asphalt penetration degree, asphalt softening point, asphalt rutting factor, mix stability, and mix dynamic stability.

Some ML-based models used numerous input parameters, such as Ali [20]. Extreme gradient-boosting regression (EGBR) was used with a dataset of 1152 E* measurements to predict E* as a function of 24 variables representing testing conditions, mix volumetric properties, and gradation parameters with R²-value of 0.835. Another ML-based model [21] used M5P depending on a dataset of 4022 records to develop a model consisting of 15 variables representing the aforementioned parameters plus binder performance grade, with R²-value of 0.951. Table 2 summarizes most of the ML-based prediction models in the literature.

As can be noticed from Table 2, a number of models [14] used the Witczak and Bari dataset of 7400 datapoints to develop E*-predictive models replacing Witczak 1999 and 2006 models. Despite the fact that the original 2006 model was based on the G* and δ measurements, while the collected dataset contained some records without actual G* and δ measurements [22]. On another hand, there are ML-based models [8] that used more feeding variables than the benchmark models resulting in predictive models with excessive feeding parameters (13 independent variables) although it was evident through research that the value of determination coefficient (R²) does not significantly increase after including seven feeding variables [8]. All the summarized ML-based models in Table 2 used a limited number of sieves to capture the gradation effect on E*, while most of them used a single ML-modeling technique or combined two ML-based techniques.

Table 2. Main feeding parameters incorporated in literature-based ML models for E* prediction.

Ref.	Model	Binder Properties					Mix Properties									Test Conditions		Training	Validation	Testing	Scale	Sensitivity	Goodness-of-Fit Statistics
		Binder Properties					Volumetric Parameters					Gradation Parameters				Test Conditions
		δ	G*	η	PG_L	PG_H	AC	VMA	VFA	V_a	V_beff	ρ3/8	ρ3/4	ρ4	ρ200	Freq.	Temp.
[14]	ANN	-	-	√	-	-	-	-	-	√	√	√	√	√	√	f_c	-	90%	3%	7%	log		N = 7400, R² = 0.978
[14]	ANN	√	√	-	-	-	-	-	-	√	√	√	√	√	√	-	-	90%	3%	7%	log		N = 7400, R² = 0.959
[6]	DCNN	-	-	√	-	-	-	-	-	√	√	√	√	√	√	f_c	-	80%	-	20%	log		N = 6060, R² = 0.95, Se/Sy = 0.21
[6]	DCNN	√	√	-	-	-	-	-	-	√	√	√	√	√	√	-	-	80%	-	20%	log		N = 6060, R² = 0.96, Se/Sy = 0.19
[6]	DCNN	-	-	√	-	-	-	-	-	√	√	√	√	√	√	f_c		20%	-	80%	log		N = 1071, R² = 0.98, Se/Sy = 0.13
[6]	DCNN	√	√	-	-	-	-	-	-	√	√	√	√	√	√	-		20%	-	80%	log		N = 1071, R² = 0.99, Se/Sy = 0.10
[15]	BTE *	-	√		-	-	-	√	-	-	-	-	-	-	-	-	-	85%	-	15%	log	√	N = 1656, R² = 0.954
[7]	ANN	-	-	√	-	-	-	-	-	√	√	√	√	√	√	f_c		93%	-	7%	Ar		N = 7400, R² = 0.98, Se/Sy = 0.14
[7]	ANN	√	√	-	-	-	-	-	-	√	√	√	√	√	√	-		93%	-	7%	Ar		N = 7400, R² = 0.96, Se/Sy = 0.21
[23]	SVM	-	-	-	-	-	-	-	-	-	-	-	-	-	-	f_r	√	99%	-	1%	Ar		N = 80, R² = 0.956
[24]	PCA-GEP	-	√	-	-	-	√	-	√	√	-	√	√	√	-	f_c	√	-	-	-	log		N = 7400, R² = 0.925
[8]	DRNN	√	√	√	-	-	-	√	√	√	√	√	√	√	√	f_r	√	80%	-	20%	log		N = 4650, R² = 0.98, Se/Sy = 0.11
[25]	BBP	-	-	-	√	-	-	√	-	-	-	-	-	-	-	f_r	√	98%	-	2%	log	√	N = 4122, R² = 0.97, MAPE = 2.3%
[25]	BBP ^a	-	-	-	√	√	√	√	-	-	√	-	-	-	-	f_r	√	98%	-	2%	log	√	N = 4122, R² = 0.98, MAPE = 2.0%
[17]	BAS-RF	√	√	-	-	-	-	-	-	√	√	-	√	√	√	-	-	-	-	-	Ar		N = NA, R² = 0.98
[26]	RF ^a,b	-	-	-	√	√	√	√	√	√	√	√	√	√	√	f_r	√	-	-	-	log	√	N =4022, R² = 0.95
[27]	ANN	√	√	-	-	-	-	-	-	√	√	√	√	√	√	-	-	85%	-	15%	log		N = 1656, R² = 0.99
[27]	ANN	-	√	-	-	-	-	√	√	-	-	-	-	-	-	-	-	85%	-	15%	log		N = 1656, R² = 0.99
[28]	RF	√	√	-	-	-	-	-	-	√	√	√	-	√	√	-	-	70%	-	30%	Ar		N = 144, R² = 0.98
[29]	BAS	√	√	-	-	-	-	-	-	√	√	√	-	√	√	-	-	70%	-	30%	Ar		N = 144, R² = 0.92
[30]	GDTB	-	-	√	-	-	-	-	-	√	√	√	√	√	√	f_c	-	90%	3%	7%	log		N = 7400, R² = 0.98, Se/Sy = 0.14, MAPE = 2.1%
[30]	GDTB	√	√	-	-	-	-	-	-	√	√	√	√	√	√	-	-	90%	3%	7%	log		N = 7400, R² = 0.98, Se/Sy = 0.13, MAPE = 3.9%
[31]	ANNPSO	√	√	-	-	-	-	-	-	√	√	√	-	√	√	-	-	-	-	-	Ar		N = 144, R² = 0.989
[32]	ANN	√	√	-	-	-	-	-	-	√	√	√	-	√	√	-	-	75%	-	25%	Ar		N = 1320, R² = 0.98

Note: * 61% contained RAP; a: mentioned variables plus RAP content; b: mentioned variables plus NMAS; DCNN: deep convolution neural network; BTE: bagging tree ensemble; SVM: support vector machine; PCA-GEP: hybrid principal component analysis–gene expression programming; DRNN: deep residual neural network; BBP: biogeography-based programming; GDTB: gradient decision tree boosting; BAS: beetle antennae search; RF: random forest; PGH: high-temperature performance grade; PGL: low-temperature performance grade; AC: asphalt content. P.S: The summarized goodness-of-fit statistics in the table are based on the testing dataset.

3. Research Motivation and Objectives

Based on the aforementioned literature review, it can be summarized that (1) some literature-based models used excessive and overrated feeding variables, (2) other models implemented records of input variables which were predicted rather than being measured, (3) some models were developed based on modest databases, (4) few research papers analyzed their model sensitivity to each model input variables, and (5) both statistical and ML-based models in the literature depended on direct aggregate gradation parameters (ρ₃₄, ρ₃₈, ρ₄, and ρ₂₀₀) only in characterizing aggregate properties.

Therefore, the current study aims to evaluate multiple regression statistical and ML-based techniques to predict E* of asphalt mixtures based on a reliable measured database and new feeding parameters for the mix aggregate properties. The specific objectives are as follows:

Compare the E* prediction accuracy of the most widely used NCHRP 1-37A Witczak model with new developed statistical regression models based on advanced aggregate gradation feeding features.
Investigate various ML techniques to enhance the E* prediction based on the same alternatives of feeding variables.
Conduct a sensitivity analysis on the optimized and most accurate ML-based model to its feeding input parameters.
The approach of using Weibull distribution factors and Bailey method parameters is novel, as both methods employ full aggregate gradation sieve sizes instead of using only four gradation ones. Furthermore, the use of a comprehensive and reliable dataset of measured E* values and feeding parameters promotes trustworthy and dependable predictions of E* based on optimized and accurate ML-based model.

4. Material and Testing Measurements

A dataset consisting of 1128 E* records was employed based on 47 testing specimens of Superpave asphalt mixtures and tested under 24 combinations of temperatures and frequencies as testing conditions [33,34]. Thus, the main features in the dataset are temperature, frequency, η, ρ₃₄, ρ₃₈, ρ₄, ρ₂₀₀, V_a, V_beff, and measured E* in the lab. Moreover, new features representing the mix aggregate gradation structure were added to the dataset, which are represented by Weibull distribution factors (λ and κ) and Bailey method parameters (CA, FA_C, FA_f ratios).

A Weibull distribution, as represented by Equation (2), that considers two parameters has been used to analyze the overall distribution of aggregate sizes in the tested mixtures [35,36,37]. Weibull distribution function is defined as follows:

f (X) = 1 - e^{(- 1 * {(\frac{x}{λ})}^{k})}

(2)

The dependent variable, f(X), represents the cumulative number of aggregates that passes a certain size, while the independent variable, x, represents the size of the aggregate in millimeters. The distribution is characterized by two parameters, λ and k, which represent the scale and shape of the aggregate size distribution, respectively.

In other respect, the Bailey technique emphasizes the packing of aggregates to ascertain the particles that constitute the coarse aggregate framework and those that occupy the voids created within that framework [38]. The Bailey method is suitable for most asphalt mixtures, as it facilitates an understanding of the reasons behind the problematic compaction of certain Superpave mixtures. Additionally, it provides valuable insights into the substantial changes in volumetric properties and/or field compactability that can occur due to slight gradation variations during production, which are usually within the allowable tolerance limits [38,39]. The Bailey method employs three ratios (CA, FA_C, and FA_f) based on the gradation sieves to control the target gradation as follows:

C A R a t i o = \frac{% P a s s H a l f S i e v e - % P a s s P C S}{100 % - % P a s s H a l f S i e v e}

(3)

{F A}_{C} R a t i o = \frac{% P a s s S C S}{% P a s s P C S}

(4)

{F A}_{f} R a t i o = \frac{% P a s s T C S}{% P a s s S C S}

(5)

where CA Ratio is the coarse aggregate ratio, FA_C Ratio is the fine aggregate coarse ratio, FA_f Ratio is the fine aggregate fine ratio, % Pass Half Sieve is the passing percentage of the sieve which equals one half of the nominal maximum aggregate size (NMAS), % Pass PCS is the passing percentage of the primary control sieve which differentiates between the coarse and fine aggregates, % Pass SCS is the passing percentage of the secondary control sieve which differentiates between the coarse and fine sand, and % Pass TCS is the passing percentage by the tertiary control sieve which is the closest sieve to 0.22 times the SCS.

Based on the aforementioned dataset, a summary of the main feeding features and their statistical descriptions including mean, standard deviation, maximum, and minimum values for each feature are provided in Table 3. The dataset ranges covered wide domains of aggregate, binder, and mixture characteristics, which raises the confidence in the developed models.

In addition, an exploratory data analysis was conducted on the main feeding features. Figure 1 describes the normalized histograms and normal probability distributions of the main feeding features. There were four temperatures (4.4, 21.1, 37.8, and 54.4 °C) and six frequencies (25, 10, 5, 1, 0.5, and 0.1 Hz) used for testing of each E* specimen. The normalization was based on the total number of dataset records. It was obvious that the investigated feeding features were not fully normally distributed. In addition, the measured values of E* were highly skewed to the right, which urged us to use the logarithmic transformation on the statistical regression models.

The estimated Pearson correlations (R) between the main feeding features used in the modeling are presented by a heatmap shown in Figure 2. By examining the last row of the heatmap which presents the linear correlation between the E* and the other feeding features, some features such as temperature, frequency, and η already have high correlations with E*. The correlation matrix exhibited some collinearities between the aggregate features, which may mislead the regression modeling. The collinearity between the traditional aggregate features (ρ₃₄, ρ₃₈, ρ₄, and ρ₂₀₀) and new aggregate features (Weibull factors and Bailey parameters) was denied, as they were different alternatives for characterizing aggregate gradation, and none of them was employed in parallel through the modeling process.

5. Methodology

As mentioned earlier, this research methodology was to apply different types of regression modeling techniques including both statistical and supervised ML-based algorithms including the same feeding features used in the NCHRP 1-37A Witczak model. Furthermore, the regression modelling efforts incorporated other alternatives for characterizing aggregate gradation such as Weibull distribution factors and Bailey method parameters. Eventually, a comparison between the performance of every algorithm was conducted to determine the optimized one which was tuned to find out the best hyperparameter. In addition, a comprehensive analysis on the optimized algorithm was employed to measure its performance using cross-validation, residual analysis, feature importance, and model interpretability.

The pipeline of the workflow and its architecture can be summarized as follows:

Data preparation: by cleaning, removing outliers, and scaling.
Feature engineering: by generating new feeding features from the measured data, using the Bailey method and Weibull distribution to characterize aggregate gradation in a better way.
Modelling phase: by developing both statistical and ML-based models using three different representations of aggregate gradation feeding features; each representation contains different feeding features representing aggregate gradation.
Model training on the selected dataset.
Model evaluation on a holdout test dataset, using different types of evaluation techniques such as R², root mean square error (RMSE), mean absolute error (MAE), and k-fold cross-validation.
Model selection and optimization: by choosing the outperforming model and adjusting its hyperparameters.
Feature importance: by analyzing the model results, using the SHAP values to find out the most important feeding features affecting the model predictions.
Model deployment: by deploying the optimized model. This involves making the model available to users so that they can use it to predict E* for new asphalt mixtures.

5.1. Statistical Models

Two statistical regression models were developed to predict E*, including other alternatives characterizing aggregate gradation which differ from what is used in the NCHRP 1-37A Witczak model:

Statistical regression model incorporating Bailey method parameters ( $C A, {F A}_{c}, a n d {F A}_{f})$ :

{l o g}_{10} E^{*} = 2.21827 + 1.72073 {F A}_{f} - 0.405838 {({F A}_{f})}^{2} - 5.644703 {F A}_{c} + 6.622224 {({F A}_{c})}^{2} - 0.787878 {C A}_{R} + 0.019777 V_{a} + 0.615156 \frac{V_{b_{e f f}}}{V_{b_{e f f}} + V_{a}} + \frac{3.425455 - 0.312899 {C A}_{R} + 0.804113 {({C A}_{R})}^{2}}{1 + e x p (- 1.211348 - 0.473207 \log f - 0.663111 \log η)}

(6)

ii.: Statistical regression model incorporating Weibull distribution factors ( $λ a n d Κ)$ :

{l o g}_{10} E^{*} = - 3.912981 + 0.522028 {(\frac{1}{λ})}^{Κ} + 3.718953 {(\frac{1}{λ})}^{2 Κ} + 13.919199 Κ - 8.72234 Κ^{2} + 0.066412 V_{a} + 0.881993 \frac{V_{b_{e f f}}}{V_{b_{e f f}} + V_{a}} + \frac{11.381659 + 6.876535 {(\frac{1}{λ})}^{Κ} - 26.475238 {(\frac{1}{λ})}^{2 Κ} - 19.241097 Κ + 11.497402 Κ^{2}}{1 + e x p (- 1.201659 - 0.481678 \log f - 0.672156 \log η)}

(7)

5.2. ML-Based Modeling Techniques

In addition to statistical regression models, several ML-based regression algorithms were trained alternately using different aggregate gradation characterizations: literature-based, Weibull-based, and Bailey-based feeding features. A detailed flowchart depicting the followed methodology through the ML-based regression modeling is presented in Figure 3.

As shown in Table 3 above, the target variable E* covers an extensive range from 14.9 to 18,974 MPa and there were no missing gaps in the dataset. However, the E* dataset was imbalanced and skewed to the right. To generalize the developed modelling algorithms, the dataset was randomly divided into 70% training and 30% testing datasets in accordance with the stratification on the temperature at which E* was measured in the laboratory; thus, the training and testing sets were representative in terms of the temperature values used in the dataset, taking into consideration that E* is a temperature-dependent parameter.

After that, a pipeline was established to scale the feeding features, using a standard scaler. The employed modeling strategy to find the pattern in the training dataset was based on supervised ML-based regression algorithms to predict the laboratory measured E* of asphalt mixtures [27]. On the other hand, the testing data were utilized for performance assessment of the proposed models. Moreover, a number of ensemble models were utilized such as random forests or gradient-boosted trees. These methods combine multiple decision trees, which can help to mitigate the impact of imbalanced data on individual trees and improve overall performance.

Each algorithm performance was evaluated using various goodness-of-fit statistics such as adjusted R², RMSE, and MAE. Moreover, a k-fold cross-validation technique was used to judge the variability of the algorithms’ results to facilitate the selection process of the optimized algorithm. Finally, a hyperparameter tuning was applied to the optimized algorithm to achieve the best prediction, then a residual analysis and a feature importance study were performed on the optimized ML-based algorithm.

As mentioned above, there are many types of supervised ML-based regression algorithms; the ones considered in this research study are highlighted in green in Figure 4 [28]. The utilized algorithms are listed as follows: multiple linear regression (MLiR), regularized Lasso regression (RLaR), regularized ridge regression (RRdR), k-nearest neighbors regression (KNNR), support vector machine–linear kernel (SVM-L), support vector machine–radial basis function kernel (SVM-RBF), support vector machine–polynomial kernel (SVM-P), decision tree regression (DTR), random forest regression (RFR), adaptive boosting regression (ABR), gradient-boosting regression (GBR), extreme gradient-boosting regression (EGBR), and Catboost regression (CbR). The main differences among these algorithms are described in more detail below.

5.2.1. Linear Regression (LiR)

Multiple Linear Regression (MLiR)

The regression function E*(Y|X) must be defined as a linear function in the multiple feeding inputs X₁,…, X_p in order for a linear regression model to exist [41], no matter the source of the X_j [41]. The MLiR model form is as follows:

f (X) = β_{0} + \sum_{j = 1}^{p} X_{j} β_{j}

(8)

Regularized Linear Regression (RLiR)

By adding regularization terms that penalize large coefficients, the RLiR models overcome the inability of the linear regression to handle overfitting [42].

Regularized Ridge Regression (RRdR):

The RRdR adds a regularization element to the LiR objective function, which is the product of the squared coefficient sum and the regularization parameter (alpha). It promotes lower and more balanced coefficients through this regularization term. Both overfitting and the negative effects of irrelevant features are lessened.

Regularized Lasso Regression (RLaR):

Similar to the RRdR, the RLaR augments the objective function with a regularization term. However, it does not employ squared values but rather the sum of the absolute coefficient values (L₁ norm). The RLaR has the ability to induce sparsity, which causes some coefficients to inevitably drop to zero.

5.2.2. K-Nearest Neighbors Regression (KNNR)

The KNNR algorithm is a nonparametric supervised learning technique that relies on proximity to predict how a data point will be predicted. The average of the KNN is used to forecast the value of predictions [43]. Given that the KNNR technique just saves the training dataset rather than going through a training phase, it should be noted that it belongs to a family of “lazy learning” models.

5.2.3. Decision Tree Regression (DTR)

The DTR is a predictive model that divides the feature space recursively into subspaces as the foundation for prediction [44]. By using a greedy search to find the ideal split points inside a tree, the DTR uses a divide and conquer technique [45]. Although there are many approaches to choosing the optimal characteristic at each node, the information gain and Gini impurity are the most frequently used methods as splitting criteria in DTR models. The information gain criteria are as follows:

I n f o r m a t i o n G a i n (S, a) = E n t r o p y (S) - \sum \frac{|S v|}{|S|} E n r o p y (S v)

(9)

where a represents a specific attribute or class label; Entropy(S) is the entropy of dataset S; |Sv|/|S| represents the proportion of the values in Sv to the number of values in dataset S; and Entropy(Sv) is the entropy of dataset Sv. Entropy is a concept that stems from information theory, which measures the impurity of the sample values, and it ranges between 0 and 1 [45].

5.2.4. Support Vector Machine (SVM)

The SVM comes from the statistical learning theory proposed by Boser et al. [46]. The fundamental idea behind SVMs is the process of determining the proper hyperplane with the aid of support vectors [47]. The kind of decision boundary by which an SVM learns depends on the kernel selection, which by its turn depends on the type of data being used and the required level of complexity for the decision boundary.

The linear kernel (SVM-L) is appropriate for linear data, while the radial basis function kernel (SVM-RBF) handles nonlinear data by transferring them to a higher-dimensional space, using Gaussian similarity. On the other hand, the polynomial kernel (SVM-P) offers a versatile method by employing polynomial functions to capture nonlinear correlations [47].

5.2.5. Ensemble Learning

In ensemble learning, a number of predictors can be combined to produce predictions that are frequently more accurate than the best individual predictor [40].

Bagging Ensemble Learning

The bagging (short for bootstrap aggregating) strategy is to train each predictor, using the same training algorithm on several random subsets of the training data [40]. The random forest regression (RFR) is well known, as a bagging algorithm, for its robustness, flexibility, and ability to handle complex data and capture nonlinear relationships. The RFR generates a large number of decision trees and averages their forecasts. Each tree is trained using a separate random subset of features on a different random sample of the data. This contributes to less overfitting and increases the model’s capacity to generalize fresh data [40].

Boosting Ensemble Learning

The main concept behind most boosting (originally called hypothesis-boosting) techniques is to train predictors in succession. Each of them is attempting to correct the one before it. Adaptive boosting regression (ABR), gradient-boosting regression (GBR), extreme gradient-boosting regression (EGBR), and Catboost regression (CbR) are all ensemble learning algorithms that belong to the family of boosting methods [48].

The ABR focuses on adjusting sample weights to improve predictions, while the GBR builds trees iteratively to minimize residuals. Moreover, the EGBR, which is an optimized implementation of gradient boosting, optimizes performance and scalability, and the CbR employs a symmetric tree structure and uses a combination of ordered boosting and random permutations to improve generalization.

The explored supervised ML-based algorithms in this study, and their required hyperparameters and related categories are summarized in Table 4.

5.3. Model Performance Indicators

The performance of both statistical and ML-based prediction models was measured via the trained dataset, using adjusted R² (Equation (10)), which considers the effect of the multiple features, rather than ordinary R² (Equation (11)), which indicates how much variance is accounted for by the assumed relationship [42]. The better the model’s prediction, the higher the value of Adjusted R².

{A d j u s t e d R}^{2} = 1 - \frac{(1 - R^{2}) (n - 1)}{n - p - 1}

(10)

R^{2} = 1 - \frac{R S S}{T S S} = 1 - \frac{\sum {(y_{i} - \hat{y_{i}})}^{2}}{\sum {(y_{i} - \bar{y})}^{2}}

(11)

where n is the number of samples in the dataset, and p is the number of prediction features.

Additionally, RMSE and MAE were employed to assess the performance of the prediction models as presented in Equations (12) and (13) [59]. RMSE measures the spread of the residuals, while the MAE is less sensitive to outliers. The RMSE was also used to figure out the overfitting and underfitting of every algorithm.

R M S E = \sqrt{\frac{\sum_{i = 1}^{N} {(y_{i} - \hat{y_{i}})}^{2}}{N}}

(12)

M A E = \frac{1}{N} \sum_{i = 1}^{N} |y_{i} - \hat{y_{i}}|

(13)

5.4. Threats to Validity

In this subsection, the proposed approach is a promising approach to predicting the E* of asphalt mixtures. However, it is important to be aware of potential limitations/restrictions, advantages/benefits, disadvantages, and factors that may affect the potential threats to validity of our study findings. The probable threats could be related to issues such as:

i.: Selection bias: The dataset was randomly divided into training and testing sets, but it is important to note that the training set may not be fully representative of the population of interest. This could lead to selection bias, which could affect the validity of the results. To mitigate this risk, the authors stratified the training and testing sets by temperature, which is an important factor that affects E*.
ii.: Overfitting: The models were trained on the training set, and their performance was evaluated on the testing set. However, there is a risk that the models could overfit the training data, which means that they may not perform well on new data. To mitigate this risk, the authors used a k-fold cross-validation technique.
iii.: Imbalanced data: The E* dataset was imbalanced and skewed to the right. This could lead to the models being biased towards the more common values of E*. To mitigate this risk, the authors used ensemble models, such as random forests and gradient-boosted trees. Ensemble models combine multiple decision trees, which can help to mitigate the impact of imbalanced data on individual trees and improve overall performance.

However, the benefits of the implemented methodology can be summarized in the following points:

i.: The proposed approach can learn complex relationships between the input and output variables as presented by the proposed feeding features for predicting the E* of asphalt mixtures.
ii.: The proposed approach can be used to model a wide range of input variables, including both continuous and categorical variables. This makes it more flexible and versatile than traditional regression models.
iii.: The proposed approach can be used to handle imbalanced data.
iv.: The proposed approach can be used to improve the interpretability of ML-based models.
v.: The findings of this research study are essential for practitioners and engineers who are dealing with asphalt mixtures. The proposed models could be used by asphalt engineers to design asphalt mixtures with a desired E*.

On the other hand, ML-based algorithms have the following common disadvantages:

i.: ML-based regression algorithms can be computationally expensive to train, especially for large datasets.
ii.: ML-based regression models are difficult to interpret, as they can learn complex patterns in the data that are not easily understood by humans.

Finally, there may be some potential restrictions to our proposed approach as follows:

i.: The proposed approach has only been evaluated on a single dataset. It is important to evaluate and validate the approach on other datasets to ensure that it is generalizable.
ii.: The proposed approach has only been used to predict the E* of asphalt mixtures. It is important to evaluate the approach for other applications.

6. Results and Discussion

6.1. Statistical Regression Models

In general, E* predictions using statistical models were compared to laboratory measurements on the arithmetic scale as exhibited in Figure 5. As presented in Table 5, the literature-based 1-37A model predictions did not exhibit the best results compared to the measured E* values. The model yielded highly biased and scattered predictions on both the arithmetic and logarithmic scales. Meanwhile, it was noticed that E* predictions using the Weibull-based and Bailey-based models were significantly better compared to the literature-based 1-37-A model predictions on both scales. It was also obvious that the predictions using the Weibull-based model were almost comparable to the ones using the Bailey-based model in terms of the scatter around the equality line as represented by R².

6.2. ML-Based Regression Models

The E* relationships with mixture-related components and test conditions are not simple linear relationships, and this was obvious in the ML-based algorithms’ results depicted in Figure 6, Figure 7 and Figure 8 in addition to the results of k-fold cross-validation technique as depicted in Figure 9 and Figure 10. The performance of the linear regression models (MLiR, RLaR, RRdR and SVM-L) was way beyond the performance of the other models. This indicates a high bias of these models, and this was confirmed in the learning curve section in Figure 11.

The performance of the investigated ML-based algorithms was assessed as described above, using three indicators: adjustedR², RMSE, and MAE. In Figure 6, the adjusted-R² results of the ML-based algorithms are exhibited with each gradation feeding feature. Ensemble models, especially the boosting techniques, outperformed the ordinary algorithms such as linear regression, k-nearest neighbors, and support vector machine. It was evident that the EGBR and CbR models had the best performance with adjustedR² higher than 0.999, using the training dataset. However, the CbR models were more robust due to minimal difference in accuracy between the training and testing datasets, which was less than 0.003 percent.

Linear regression models were the least-performing due to the nonlinearity between the E* and the feeding features; nonetheless, their performance achieved a moderate adjusted-R² value of 0.774 on the arithmetic scale. The DTR algorithms tended to overfit the dataset as depicted in Figure 6, as their performance indicator reached 1.0 for the training dataset and was around 0.96 for the testing dataset.

Adjusted R² is a suitable measure of model performance based on determining the bias; however, it is not sufficient to address the precision. This is why other indicators are required to rationally judge the model performance. In Figure 7 and Figure 8, RMSE and MAE are presented for the same investigated ML-based regression models. The DTR algorithms overfitted the datasets, as RMSE and MAE were 0.0 MPa, using the training datasets. On the other hand, the EGBR and CbR models provided the least errors for the training and testing datasets. Even though the EGBR performance using the training dataset was better than the performance of the CbR models, the CbR model’s performance using the testing dataset was superior with the lowermost error around 220 MPa and 40 MPa for RMSE and MAE, respectively.

6.3. K-Fold Cross-Validation

K-fold cross-validation is a useful technique to obtain a more accurate assessment of a model performance and to detect any dataset overfitting. By validating how the model might perform on an inexperienced dataset, it offers a more realistic evaluation.

As depicted in Figure 9, five folds from a shuffled dataset were developed. Four folds were used for training the model and one fold for testing it, alternately. R², RMSE, and MAE of each cross-validation iteration were recorded and plotted as shown in Figure 10.

Figure 9. Five-fold cross-validation technique [60].

It is evident from Figure 10 that the CbR models had the least variance whether in adjusted R², RMSE, or MAE. The worst-performing algorithm was SVM-P, which showed high variability in the results, followed by KNNR and ABR. The linear regression models had an acceptable performance among all investigated algorithms.

Figure 10. K-fold cross-validation results from ML-based regression models: (a) R² distribution of shuffled 5-fold cross-validation, (b) RMSE distribution of shuffled 5-fold cross-validation, (c) MAE distribution of shuffled 5-fold cross-validation.

6.4. Learning Curve

Based on the performance results of the ML-based algorithms, it was clarified that there were some algorithms that underfitted the datasets such as ML-based linear regression algorithms. On the contrary, there were some other algorithms that overfitted the datasets such as the DTR. To demonstrate that, the learning curves for the three ML-based algorithms (MLiR, DTR, and CbR) were generated and plotted in Figure 11.

Figure 11a depicts that there was a high bias and underfitting for the MLiR model due to the fact that the relationship between the feeding features and E* was not simply linear. In Figure 11b, the DTR model overfitted the training dataset since the algorithm performed well for the training dataset and showed contradictory performance for the testing dataset, as well as due to the high variance between both the training and testing results. Lastly, Figure 11c shows the optimal balance where there is no overfitting or underfitting of the datasets, and this might be reached via a bias–variance trade-off. The CbR model showed low-bias and low-variance learning curves both for the training and testing datasets.

Figure 11. Learning curve identification of ML-based algorithms: (a) underfitting of the learning curve in the MLiR model, (b) overfitting of the learning curve in the DTR model, (c) low-variance and low-bias of the learning curve in the CbR model.

6.5. Comparing the Statistical and ML-Based Models

Based on k-fold cross-validation and learning curves, the CbR models yielded the most accurate and less biased E* predictions among the other ML-based algorithms. An indisputable comparison between statistical and ML-based CbR models was conducted in accordance with the three representations of gradation feeding parameters as displayed in Figure 12. The performance of the ML-based CbR models was superior compared to traditional statistical prediction models. In addition, E* predictions based on the Bailey method parameters and Weibull distribution factors were more accurate and less biased compared to predicting E* using literature-based gradation parameters especially via statistical prediction models.

6.6. Model Selection and Hyperparameter Optimization

In view of the foregoing, the CbR technique outperformed all the examined ML-based algorithms. Using the CbR algorithm, the k-fold cross-validation based on five folds elaborated the most accurate characterization approach to aggregate gradation, which was the Bailey-based approach as presented in Figure 13.

Based on Figure 13, the Bailey-based CbR model had the highest R², least RMSE, and least variance among its prediction iterations. Consequently, the Bailey-based CbR model showed the most superior performance, and the hyperparameter tuning/optimization for the model was accomplished as described below.

6.6.1. Hyperparameter Tuning

In ML modelling, a parameter whose value is utilized to regulate the learning process is known as a hyperparameter. The task of selecting a set of ideal hyperparameters for a learning algorithm is known as hyperparameter optimization or tuning [61]. To generalize various data patterns, the Bailey-based CbR model may need different constraints, weights, or learning rates. Hyperparameters are these variables that need to be adjusted in order for the model to work best while solving a ML challenge and eliminating the possibility of overfitting [61].

Three major hyperparameters were tuned using the Bailey-based CbR estimations. Since it is a boosting ensemble learning technique, the most interesting hyperparameters were the number of estimators (number of trees), learning rate, and maximum depth of the tree for every hyperparameter. Different random values of hyperparameters were defined to fit the model in order to reach the optimized structure of the model. As shown in Figure 14, the optimization hyperparameters were 2000 for the number of trees, 0.1 for the learning rate, and 7 for the maximum depth.

The performance of Bailey-based CbR model is exhibited in Figure 15 after the process of hyperparameter tuning.

6.6.2. Residual Analysis

The residual analysis was performed on the Bailey-based CbR model results to ascertain that the residuals have no identified patterns, no heteroscedasticity, and were well-scattered, since they were assumed to be random. These conditions were fulfilled in the predictive model as displayed in Figure 16.

6.6.3. Model Interpretation and Feeding Feature Sensitivity

A technique called SHAP values (SHapley Additive exPlanations) [62], which is based on the cooperative game theory, was employed to make the Bailey-based CbR models more transparent and understandable. The SHAP value of a feature is the average impact on the model output from an expected value. This expected value is the average prediction made by the ML-based model across all possible combinations of feeding features. Then, the SHAP value was used to determine the importance of every feeding feature [62].

Figure 17a represents the SHAP values illustrating the impact of each feeding feature range on the model prediction. Figure 17b exhibits the relative importance of the feeding features, using average absolute SHAP values. Therefore, the feeding features could be ranked based on their importance as follows: the testing condition parameters, the binder viscosity, the mixture volumetrics, and the aggregate gradation features. The temperature was the most critical feeding feature that significantly affected the E* prediction followed by the frequency and the binder viscosity. However, it should be noted that the Bailey method parameters representing the aggregate gradation had a noticeable impact on the E* predictions almost close to the impact of the mixture volumetrics.

7. Conclusions and Recommendations

In conclusion, the motivation for this research study was derived from the shortcomings identified in the existing literature-based models for predicting the E* of asphalt mixtures. These models often used excessive and overrated feeding variables, computed records of input variables, modest databases, and lacked the sensitivity analysis of model input variables. Additionally, these models relied on simple aggregate gradation feeding features which do not reflect the full gradation impact.

To address these limitations, the current study aimed to evaluate multiple regression statistical and ML-based techniques for predicting E* based on a reliable measured database and new feeding parameters characterizing aggregate gradation. The following points summarize the main conclusions of this study:

On the first hand, two statistical regression models incorporating the Bailey method parameters and Weibull distribution factors were developed to predict E* with alternative characterizations of aggregate gradations. The predictions of both proposed models were compared to E* predictions of the NCHRP 1-37A Witczak model. As predicted, the performance of statistical models varied. While the literature-based 1-37A model did not provide satisfactory results, the Weibull-based and Bailey-based models showed significant improvement in predicting E*. The Weibull-based model slightly outperformed the Bailey-based model in terms of adjusted R².
On the other hand, 13 ML-based regression algorithms were trained involving the three approaches to aggregate gradation characterizations. The performance of each algorithm was evaluated using goodness-of-fit statistics and k-fold cross-validation techniques. The optimized algorithm underwent hyperparameter tuning and was subjected to the residual analysis and feature importance study.
Notably, the results indicated that the ML-based models outperformed the statistical models in predicting the E* of asphalt mixtures. Among the ML-based algorithms, ensemble models, particularly boosting techniques such as EGBR and CbR, exhibited superior performance. These models achieved the adjusted R² values higher than 0.999, using the training dataset.
The CbR models showed robustness, with minimal differences in accuracy between the training and testing datasets. Furthermore, the k-fold cross-validation analysis revealed that the CbR model had the least variance among the investigated algorithms, indicating their stability and consistency in predicting E*. Additionally, the learning curve analysis demonstrated that the CbR models achieved an optimal balance between bias and variance, indicating their ability to generalize well to unseen data.
Based on the comprehensive evaluation, the CbR models, particularly the Bailey-based CbR model, were recommended as the most accurate and reliable models for predicting the E* of asphalt mixtures. Hyperparameter optimization was performed to fine-tune the model, while the residual analysis confirmed that the residuals of the optimized model exhibited no identified patterns or heteroscedasticity, supporting the assumption of randomness in the predictive model.
Moreover, the SHAP values were applied to interpret the optimized Bailey-based CbR model and determine the relative importance of the feeding features. The results highlighted that temperature was the most critical feeding feature, followed by frequency and binder viscosity. The impact of the Bailey method parameters representing aggregate gradation on E* predictions was comparable to that of mixture volumetrics.

Overall, this research study contributes to the advancement of E* prediction techniques and provides valuable insights into the effects of different feeding variables on the predictions. The findings of this study have implications for the design and performance evaluation of asphalt pavements, ultimately leading to more sustainable and durable infrastructure. Future research can further explore the application of advanced ML techniques and incorporate additional factors for more comprehensive modeling of asphalt mixture properties.

Author Contributions

Conceptualization, A.M.A., S.M.E.-B. and R.T.A.E.-H.; Methodology, A.M.A. and A.N.A.; Software, A.N.A.; Validation, A.M.A., A.N.A. and M.R.K.; Data curation, A.N.A., M.R.K. and S.M.E.-B.; Writing—Original draft, A.M.A., A.N.A., S.M.E.-B. and R.T.A.E.-H.; Project administration, J.W.H.; Funding acquisition, J.W.H. All authors have read and agreed to the published version of the manuscript.

Funding

This work was supported by the Korea Agency for infrastructure technology advancement (KAIA)’s grant funded by the Ministry of Land, Infrastructure, and Transport (Grant RS-2023-00243421).

Data Availability Statement

The proposed CbR algorithm using the three aggregate gradation approaches can be accessed by any user via the following link: https://mu-dynamic-modulus-prediction.streamlit.app/, (accessed on 1 September 2023).

Conflicts of Interest

The authors declare no conflict of interest.

References

Ghos, S.; Ali, S.A.; Zaman, M.; Chen, D.H.; Hobson, K.R.; Behm, M. Evaluation of Transverse Cracking in Flexible Pavements Using Field Investigation and AASHTOWare Pavement ME Design. Int. J. Pavement Res. Technol. 2022, 15, 561–576. [Google Scholar] [CrossRef]
Kim, Y.R.; Seo, Y.; King, M.; Momen, M. Dynamic modulus testing of asphalt concrete in indirect tension mode. In Transportation Research Record; SAGE PublicationsSage CA: Los Angeles, CA, USA, 2004; pp. 163–173. [Google Scholar]
Fadhil, T.H.; Ahmed, T.M.; Al Mashhadany, Y.I. Application of artificial neural networks as design tool for hot mix asphalt. Int. J. Pavement Res. Technol. 2021, 15, 269–283. [Google Scholar] [CrossRef]
El-Badawy, S.; Abd El-Hakim, R.; Awed, A. Comparing Artificial Neural Networks with Regression Models for Hot-Mix Asphalt Dynamic Modulus Prediction. J. Mater. Civ. Eng. 2018, 30, 04018128. [Google Scholar] [CrossRef]
Sakhaeifar, M.S.; Richard Kim, Y.; Kabir, P. New predictive models for the dynamic modulus of hot mix asphalt. Constr. Build. Mater. 2015, 76, 221–231. [Google Scholar] [CrossRef]
Moussa, G.S.; Owais, M. Pre-trained deep learning for hot-mix asphalt dynamic modulus prediction with laboratory effort reduction. Constr. Build. Mater. 2020, 265, 120239. [Google Scholar] [CrossRef]
Ceylan, H.; Schwartz, C.W.; Kim, S.; Gopalakrishnan, K. Accuracy of Predictive Models for Dynamic Modulus of Hot-Mix Asphalt. J. Mater. Civ. Eng. 2009, 21, 286–293. [Google Scholar] [CrossRef]
Moussa, G.S.; Owais, M. Modeling Hot-Mix asphalt dynamic modulus using deep residual neural Networks: Parametric and sensitivity analysis study. Constr. Build. Mater. 2021, 294, 123589. [Google Scholar] [CrossRef]
Hou, H.; Wang, T.; Wu, S.; Xue, Y.; Tan, R.; Chen, J.; Zhou, M. Investigation on the pavement performance of asphalt mixture based on predicted dynamic modulus. Constr. Build. Mater. 2016, 106, 11–17. [Google Scholar] [CrossRef]
Solatifar, N. Performance evaluation of dynamic modulus predictive models for asphalt mixtures. J. Rehab. Civ. Eng. 2020, 8, 87–97. [Google Scholar] [CrossRef]
El-Badawy, S.; Bayomy, F.; Awed, A. Performance of MEPDG dynamic modulus predictive models for asphalt concrete mixtures: Local calibration for Idaho. J. Mater. Civ. Eng. 2012, 24, 1412–1421. [Google Scholar] [CrossRef]
Khattab, A.M.; El-Badawy, S.M.; Al Hazmi, A.A.; Elmwafi, M. Evaluation of Witczak E* predictive models for the implementation of AASHTOWare-Pavement ME Design in the Kingdom of Saudi Arabia. Constr. Build. Mater. 2014, 64, 360–369. [Google Scholar] [CrossRef]
Zhang, D.; Birgisson, B.; Luo, X. A new dynamic modulus predictive model for asphalt mixtures based on the law of mixtures. Constr. Build. Mater. 2020, 255, 119348. [Google Scholar] [CrossRef]
Gong, H.; Sun, Y.; Dong, Y.; Han, B.; Polaczyk, P.; Hu, W.; Huang, B. Improved estimation of dynamic modulus for hot mix asphalt using deep learning. Constr. Build. Mater. 2020, 263, 119912. [Google Scholar] [CrossRef]
Barugahare, J.; Amirkhanian, A.N.; Xiao, F.; Amirkhanian, S.N. Predicting the dynamic modulus of hot mix asphalt mixtures using bagged trees ensemble. Constr. Build. Mater. 2020, 260, 120468. [Google Scholar] [CrossRef]
Mohammadi Golafshani, E.; Behnood, A.; Karimi, M.M. Predicting the dynamic modulus of asphalt mixture using hybridized artificial neural network and grey wolf optimizer. Int. J. Pavement Eng. 2021. [Google Scholar] [CrossRef]
Huang, J.; Shiva Kumar, G.; Ren, J.; Zhang, J.; Sun, Y. Accurately predicting dynamic modulus of asphalt mixtures in low-temperature regions using hybrid artificial intelligence model. Constr. Build. Mater. 2021, 297, 123655. [Google Scholar] [CrossRef]
Solatifar, N.; Kavussi, A.; Abbasghorbani, M. Dynamic Modulus Predictive Models for In-Service Asphalt Layers in Hot Climate Areas. J. Mater. Civ. Eng. 2021, 33, 04020438. [Google Scholar] [CrossRef]
Wang, C.; Tan, S.; Chen, Q.; Han, J.; Song, L.; Fu, Y. Dynamic Modulus Prediction of a High-Modulus Asphalt Mixture. Adv. Civ. Eng. 2021, 2021, 9944415. [Google Scholar] [CrossRef]
Ali, Y.; Hussain, F.; Irfan, M.; Buller, A.S. An eXtreme Gradient Boosting model for predicting dynamic modulus of asphalt concrete mixtures. Constr. Build. Mater. 2021, 295, 123642. [Google Scholar] [CrossRef]
Behnood, A.; Daneshvar, D. A machine learning study of the dynamic modulus of asphalt concretes: An application of M5P model tree algorithm. Constr. Build. Mater. 2020, 262, 120544. [Google Scholar] [CrossRef]
Bari, J.; Witczak, M.W. Development of a New Revised Version of the Witczak E Predictive Model for Hot Mix Asphalt Mixtures; Arizona State University: Tempe, AZ, USA, 2006; Volume 75, pp. 381–424. [Google Scholar]
Aggarwal, P. Predicting dynamic modulus for bituminous concrete using support vector machine. In Proceedings of the 2017 International Conference on Infocom Technologies and Unmanned Systems: Trends and Future Directions, ICTUS, Dubai, United Arab Emirates, 18–20 December 2017; Institute of Electrical and Electronics Engineers Inc.: Piscataway, NJ, USA, 2018; pp. 751–755. [Google Scholar]
Eleyedath, A.; Swamy, A.K. Prediction of dynamic modulus of asphalt concrete using hybrid machine learning technique. Int. J. Pavement Eng. 2022, 23, 2083–2098. [Google Scholar] [CrossRef]
Behnood, A.; Mohammadi Golafshani, E. Predicting the dynamic modulus of asphalt mixture using machine learning techniques: An application of multi biogeography-based programming. Constr. Build. Mater. 2021, 266, 120983. [Google Scholar] [CrossRef]
Daneshvar, D.; Behnood, A. Estimation of the dynamic modulus of asphalt concretes using random forests algorithm. Int. J. Pavement Eng. 2022, 23, 250–260. [Google Scholar] [CrossRef]
Barugahare, J.; Amirkhanian, A.N.; Xiao, F.; Amirkhanian, S.N. ANN-based dynamic modulus models of asphalt mixtures with similar input variables as Hirsch and Witczak models. Int. J. Pavement Eng. 2022, 23, 1328–1338. [Google Scholar] [CrossRef]
Xu, W.; Huang, X.; Yang, Z.; Zhou, M.; Huang, J. Developing Hybrid Machine Learning Models to Determine the Dynamic Modulus (E*) of Asphalt Mixtures Using Parameters in Witczak 1-40D Model: A Comparative Study. Materials 2022, 15, 1791. [Google Scholar] [CrossRef]
Huang, J.; Zhou, M.; Sabri, M.M.S.; Yuan, H. A Novel Neural Computing Model Applied to Estimate the Dynamic Modulus (DM) of Asphalt Mixtures by the Improved Beetle Antennae Search. Sustainability 2022, 14, 5938. [Google Scholar] [CrossRef]
Gong, H.; Sun, Y.; Dong, Y.; Hu, W.; Han, B.; Polaczyk, P.; Huang, B. An efficient and robust method for predicting asphalt concrete dynamic modulus. Int. J. Pavement Eng. 2022, 23, 2565–2576. [Google Scholar] [CrossRef]
Huang, J.; Zhang, J.; Li, X.; Qiao, Y.; Zhang, R.; Kumar, G.S. Investigating the effects of ensemble and weight optimization approaches on neural networks’ performance to estimate the dynamic modulus of asphalt concrete. Road Mater. Pavement Des. 2022, 24, 1939–1959. [Google Scholar] [CrossRef]
Rezazadeh Eidgahee, D.; Jahangir, H.; Solatifar, N.; Fakharian, P.; Rezaeemanesh, M. Data-driven estimation models of asphalt mixtures dynamic modulus using ANN, GP and combinatorial GMDH approaches. Neural Comput. Appl. 2022, 34, 17289–17314. [Google Scholar] [CrossRef]
Awed, A.M. Material Characterization of HMA for MEPDG Implementation in Idaho. Master’s Thesis, University of Idaho, Moscow, ID, USA, 2010. [Google Scholar]
Bayomy, F.; El-Badawy, S.; Awed, A. Implementation of the MEPDG for Flexible Pavements in Idaho; Idaho Transportation Department: Boise, ID, USA, 2012. [Google Scholar]
Masad, E.; Rezaei, A.; Chowdhury, A. Field Evaluation of Asphalt Mixture Skid Resistance and Its Relationship to Aggregate Characteristics [2011]; Texas Transportation Institute: Bryan, TX, USA, 2011. [Google Scholar]
Kassem, E.; Awed, A.; Masad, E.; Little, D. Development of predictive model for skid loss of asphalt pavements. J. Mater. Civ. Eng. 2015, 2372, 83–96. [Google Scholar] [CrossRef]
Awed, A.; Kassem, E.; Masad, E.; Little, D. Method for Predicting the Laboratory Compaction Behavior of Asphalt Mixtures. J. Mater. Civ. Eng. 2015, 27, 04015016. [Google Scholar] [CrossRef]
Vavrik, W.R.; Pine, W.J.; Huber, G.; Carpenter, S.H.; Bailey, R. The Bailey method of gradation evaluation: The influence of aggregate gradation and packing characteristics on voids in the mineral aggregate. In Asphalt Paving Technology: Association of Asphalt Paving Technologists (AAPT)—Proceedings of the Technical Sessions; Association of Asphalt Paving Technologists (AAPT): Gainesville, Florida, USA, 2002; Volume 70, pp. 132–175. [Google Scholar]
Thompson, G. Investigation of the Bailey Method for the Design and Analysis of Dense-Graded HMAC Using Oregon Aggregates; Oregon Department of Transportation Research Unit: Salem, OR, USA, 2006. [Google Scholar]
Géron, A. Hands-On Machine Learning with Scikit-Learn, Keras, and TensorFlow, 2nd ed.; O’Reilly Media, Inc.: Sebastopol, CA, USA, 2011; Volume 44, ISBN 9788578110796. [Google Scholar]
Hamel, L. Elements of Statistical Learning Theory. In Knowledge Discovery with Support Vector Machines; Wiley: London, UK, 2009; pp. 171–181. [Google Scholar] [CrossRef]
Ziegel, E.R. The Elements of Statistical Learning. Technometrics 2003, 45, 267–268. [Google Scholar] [CrossRef]
What is the k-Nearest Neighbors Algorithm?|IBM, (n.d.). Available online: https://www.ibm.com/eg-en/topics/knn (accessed on 31 August 2022).
Rokach, L.; Maimon, O. Decision Trees. In Data Mining and Knowledge Discovery Handbook; Springer: Berlin/Heidelberg, Germany, 2006; pp. 165–192. [Google Scholar]
What is a Decision Tree|IBM, (n.d.). Available online: https://www.ibm.com/eg-en/topics/decision-trees (accessed on 31 August 2022).
Zhang, X.-D. Support Vector Machines (SVM). In Gesture; Springer: Boston, MA, USA, 2001; Volume 23, pp. 349–361. ISBN 9789811527708. [Google Scholar]
Rani, A.; Kumar, N.; Kumar, J.; Sinha, N.K. Machine learning for soil moisture assessment. In Deep Learning for Sustainable Agriculture; Academic Press: Cambridge, MA, USA, 2022; pp. 143–168. ISBN 9780323852142. [Google Scholar]
Dietterich, T.G. Ensemble methods in machine learning. In Lecture Notes in Computer Science (Including Subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics); Springer: Berlin/Heidelberg, Germany, 2000; pp. 1–15. [Google Scholar]
Tibshirani, R. Regression shrinkage and selection via the lasso: A retrospective. J. R. Stat. Soc. Ser. B Stat. Methodol. 2011, 73, 273–282. [Google Scholar] [CrossRef]
Hoerl, A.E.; Kennard, R.W. Ridge Regression: Biased Estimation for Nonorthogonal Problems. Technometrics 2000, 42, 80. [Google Scholar] [CrossRef]
Sharma, D.K.; Aayush; Sharma, A.; Kumar, J. KNNR: K-nearest neighbour classification based routing protocol for opportunistic networks. In Proceedings of the 2017 10th International Conference on Contemporary Computing, IC3, Noida, India, 10–12 August 2017; Institute of Electrical and Electronics Engineers Inc.: Piscataway, NJ, USA, 2018; pp. 1–6. [Google Scholar]
Schölkopf, B. SVMs—A practical consequence of learning theory. IEEE Intell. Syst. Their Appl. 1998, 13, 18–21. [Google Scholar] [CrossRef]
Loh, W.Y. Classification and regression trees, Wiley Interdiscip. Rev. Data Min. Knowl. Discov. 2011, 1, 14–23. [Google Scholar] [CrossRef]
Breiman, L. Random forests. Mach. Learn. 2001, 45, 5–32. [Google Scholar] [CrossRef]
Solomatine, D.P.; Shrestha, D.L. AdaBoost.RT: A boosting algorithm for regression problems. In Proceedings of the 2004 IEEE International Joint Conference on Neural Networks (IEEE Cat. No. 04CH37541), Budapest, Hungary, 25–29 July 2004; Volume 2, pp. 1163–1168. [Google Scholar]
Zemel, R.S.; Pitassi, T. A gradient-based boosting algorithm for regression problems. In Advances in Neural Information Processing Systems 13 (NIPS 2000); MIT Press: Cambridge, MA, USA, 2001. [Google Scholar]
Chen, T.; Guestrin, C. Xgboost: A scalable tree boosting system. In Proceedings of the 22nd Acm Sigkdd International Conference on Knowledge Discovery and Data Mining, San Francisco, CA, USA, 13–17 August 2016; pp. 785–794. [Google Scholar]
Prokhorenkova, L.; Gusev, G.; Vorobev, A.; Dorogush, A.V.; Gulin, A. Catboost: Unbiased boosting with categorical features. In Advances in Neural Information Processing Systems; Neural Information Processing Systems Foundation: La Jolla, CA, USA, 2018; pp. 6638–6648. [Google Scholar]
Hastie, T.; Tibshirani, R.; James, G.; Witten, D. An Introduction to Statistical Learning, 2nd ed.; Springer: Cham, Switzerland, 2021; Volume 102, ISBN 9780387781884. [Google Scholar]
Introduction to k-Fold Cross-Validation in Python—SQLRelease, (n.d.). Available online: https://sqlrelease.com/introduction-to-k-fold-cross-validation-in-python (accessed on 29 May 2023).
Vanderplas, J.T. Chapter 5: Machine Learning. In Python Data Science Handbook; O’Reilly Media, Inc.: Sebastopol, CA, USA, 2016. [Google Scholar]
Lundberg, S.M.; Lee, S.I. A unified approach to interpreting model predictions. In Advances in Neural Information Processing Systems; Neural Information Processing Systems Foundation: La Jolla, CA, USA, 2017; pp. 4766–4775. [Google Scholar]

Figure 1. Normalized histograms and normal probability distributions of the main feeding features.

Figure 2. Heatmap of the estimated correlations between the main feeding features.

Figure 3. Flowchart of ML-based regression methodology.

Figure 4. ML-based algorithms [40].

Figure 5. The predicted vs. measured E* using statistical regression models on arithmetic and logarithmic scales: (a) literature-based 1-37A prediction model, (b) Weibull-based prediction model, (c) Bailey-based prediction model.

Figure 6. Adjusted R² of training and testing datasets using ML-based regression models: (a) adjusted-R² of the training dataset, (b) adjusted-R² of the testing dataset.

Figure 7. RMSE of training and testing datasets using ML-based regression models: (a) RMSE of the training dataset, (b) RMSE of the testing dataset.

Figure 8. MAE of training and testing datasets using ML-based regression models: (a) MAE of the training dataset, (b) MAE of the testing dataset.

Figure 12. Comparison between statistical and ML-based CbR models.

Figure 13. K-fold cross-validation of CbR algorithms.

Figure 14. Hyperparameter tuning.

Figure 15. Prediction performance of Bailey-based CbR model after hyperparameter optimization.

Figure 16. Residual analysis of Bailey-based CbR model.

Figure 17. SHAP values of the Bailey-based CbR model: (a) SHAP values of feeding features using the testing dataset, (b) relative importance of feeding features using average absolute SHAP values.

Table 1. Main feeding parameters incorporated in literature-based statistical regression models for E* prediction.

Ref.	Binder Properties			Mix Properties								Test Conditions		Calibration	Validation	Scale	Statistics of Goodness-of-Fit
	Binder Properties			Volumetric Parameters				Gradation Parameters				Test Conditions
	δ	G*	η	VMA	VFA	V_a	V_beff	ρ3/8	ρ3/4	ρ4	ρ200	Freq	Temp
[5]	-	√	-	√	-	√	√	√	√		√			90%	10%		N = 223, R² = 0.95, Se/Sy = 0.22
[6]	-	-	√	-	-	√	√	√	√	√	√	f_c	-	80%	20%	log	N = 6060, R² = 0.54, Se/Sy = 0.68
[6]	√	√	-	-	-	√	√	√	√	√	√	-	-	80%	20%	log	N = 6060, R² = 0.73, Se/Sy = 0.52
[6]	-	-	√	-	-	√	√	√	√	√	√	f_c		20%	80%	log	N = 1071, R² = 0.85, Se/Sy = 0.38
[6]	√	√	-	-	-	√	√	√	√	√	√	-		20%	80%	log	N = 1071, R² = 0.53, Se/Sy = 0.69
[7]	-	-	√	-	-	√	√	√	√	√	√	f_c		93%	7%	Ar	N = 7400, R² = 0.68, Se/Sy = 0.57
[7]	√	√	-	-	-	√	√	√	√	√	√	-		93%	7%	Ar	N = 7400, R² = 0.77, Se/Sy = 0.48
[8]	-	-	√	-	-	√	√	√	√	√	√	f_r	-	-	-	log	N = 4650, R² = 0.84, Se/Sy = 0.39
[8]	√	√	-	-	-	√	√	√	√	√	√	f_r	-	-	-	log	N = 4650, R² = 0.92, Se/Sy = 0.29
[8]	√	-	-	√	√	-	-	-	-	-	-	-	-	-	-	log	N = 4650, R² = 0.68, Se/Sy = 0.57

Note: VMA is the percentage of voids in a mineral aggregate, VFA is the percentage of voids filled with asphalt, f_c is the loading frequency in Hz, f_r is the reduced frequency in Hz (loading frequency at the reference temperature), log means logarithmic scale, Ar means arithmetic scale, N is the number of data points, R² is the coefficient of determination, Se is the standard error of estimate, and Sy is the standard deviation of measured values.

Table 3. Statistical descriptions of the main feeding data.

Feeding Features		Feature Description	Mean	Standard Deviation	Minimum	Maximum
Testing Conditions	T	Temperature (ºC)	29.43	18.65	4.40	54.40
Testing Conditions	f_C	Frequency (Hz)	6.93	8.79	0.10	25.00
Binder	η_f,T	Viscosity (cP)	3.1 × 10⁹	14.8 × 10⁹	0.0001 × 10⁹	84.8 × 10⁹
Aggregate	ρ3/4	Retained on the 3/4 in. sieve (%)	0.96	2.49	0.00	14.00
	ρ3/8	Retained on the 3/8 in. sieve (%)	24.31	7.38	13.00	36.00
	ρ4	Retained on No. 4 sieve (%)	49.84	5.84	42.00	63.00
	ρ200	Passing No. 200 sieve (%)	5.42	1.62	3.50	8.20
	λ	Weibull distribution factors	6.04	1.15	4.65	8.67
	κ	Weibull distribution factors	0.89	0.07	0.72	1.07
	CA Ratio	Bailey method parameters	0.82	0.15	0.53	1.31
	FA_c Ratio		0.37	0.11	0.23	0.57
	FA_f Ratio		2.05	0.37	1.60	2.90
Mixture Volumetrics	V_a	Volume of the air voids in the mixture (%)	7.47	0.89	5.94	9.58
Mixture Volumetrics	V_beff	Volume of effective bitumen content (%)	10.75	1.26	9.00	16.00
	E*	Measured E* (MPa)	3489.94	4117.70	14.90	18,974.00

Table 4. Supervised ML-based algorithms, required hyperparameters, and related Categories.

Ref.	Algorithm	Nonlinear	Data Transformation Is Required?	Hyperparameters	Category
[42]	MLiR		√	-	Linear
[49]	RLaR		√	Alpha	Linear
[50]	RRdR		√	Alpha	Linear
[51]	KNNR	√	√	K	KNN
[52]	SVM-L		√	C	SVM
[52]	SVM-RBF	√	√	C	SVM
[52]	SVM-P	√	√	C, gamma, and d	SVM
[53]	DTR	√		Criterion, depth, and nodes	DT
[54]	RFR	√		Criterion, N, and depth	Bagging
[55]	ABR	√		Base, Lr, and N	Boosting
[56]	GBR	√		Loss, Lr, criterion, N, and depth	Boosting
[57]	EGBR	√		N, Lr, alpha, lambda, CC, and CW	Boosting
[58]	CbR	√		Depth, Lr, and number of iterations	Boosting

Note: alpha, lambda, or C: regularization parameters; K: number of points in KNNR; gamma: kernel coefficient; d: degree of the polynomial; criterion: function to measure the quality of a split; depth: maximum depth of the tree; nodes: maximum leaf nodes; N: number of estimators; base: base estimator; Lr: learning rate; loss: loss function to be optimized; CC: complexity control; CW: child weight.

Table 5. Goodness-of-fit statistics of statistical regression models’ predictions.

Statistical Regression Model	R² (Arithmetic)	RMSE	MAE
Literature-Based 1-37A Prediction Model	0.52	2831.37	1706.03
Weibull-Based Prediction Model	0.90	1287.49	734.03
Bailey-Based Prediction Model	0.89	1317.58	744.46

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

© 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

Share and Cite

MDPI and ACS Style

Awed, A.M.; Awaad, A.N.; Kaloop, M.R.; Hu, J.W.; El-Badawy, S.M.; Abd El-Hakim, R.T. Boosting Hot Mix Asphalt Dynamic Modulus Prediction Using Statistical and Machine Learning Regression Modeling Techniques. Sustainability 2023, 15, 14464. https://doi.org/10.3390/su151914464

AMA Style

Awed AM, Awaad AN, Kaloop MR, Hu JW, El-Badawy SM, Abd El-Hakim RT. Boosting Hot Mix Asphalt Dynamic Modulus Prediction Using Statistical and Machine Learning Regression Modeling Techniques. Sustainability. 2023; 15(19):14464. https://doi.org/10.3390/su151914464

Chicago/Turabian Style

Awed, Ahmed M., Ahmed N. Awaad, Mosbeh R. Kaloop, Jong Wan Hu, Sherif M. El-Badawy, and Ragaa T. Abd El-Hakim. 2023. "Boosting Hot Mix Asphalt Dynamic Modulus Prediction Using Statistical and Machine Learning Regression Modeling Techniques" Sustainability 15, no. 19: 14464. https://doi.org/10.3390/su151914464

Note that from the first issue of 2016, this journal uses article numbers instead of page numbers. See further details here.

Article Menu

Boosting Hot Mix Asphalt Dynamic Modulus Prediction Using Statistical and Machine Learning Regression Modeling Techniques

Abstract

1. Introduction

2. Literature Review

3. Research Motivation and Objectives

4. Material and Testing Measurements

5. Methodology

5.1. Statistical Models

5.2. ML-Based Modeling Techniques

5.2.1. Linear Regression (LiR)

Multiple Linear Regression (MLiR)

Regularized Linear Regression (RLiR)

5.2.2. K-Nearest Neighbors Regression (KNNR)

5.2.3. Decision Tree Regression (DTR)

5.2.4. Support Vector Machine (SVM)

5.2.5. Ensemble Learning

Bagging Ensemble Learning

Boosting Ensemble Learning

5.3. Model Performance Indicators

5.4. Threats to Validity

6. Results and Discussion

6.1. Statistical Regression Models

6.2. ML-Based Regression Models

6.3. K-Fold Cross-Validation

6.4. Learning Curve

6.5. Comparing the Statistical and ML-Based Models

6.6. Model Selection and Hyperparameter Optimization

6.6.1. Hyperparameter Tuning

6.6.2. Residual Analysis

6.6.3. Model Interpretation and Feeding Feature Sensitivity

7. Conclusions and Recommendations

Author Contributions

Funding

Data Availability Statement

Conflicts of Interest

References

Share and Cite

Article Metrics

Article Access Statistics

Further Information

Guidelines

MDPI Initiatives

Follow MDPI