**1. Introduction**

The stress-strain relationship for asphalt mixtures under sinusoidal loading can be described by the dynamic modulus, |*E*∗|, which is a function of the properties of the material's components, loading rate, and temperature [1,2]. The dynamic modulus is one of the primary design inputs in Pavement Mechanistic-Empirical (M-E) Design, where it describes the fundamental linear viscoelastic material properties [3–5], and it is one of the key parameters used to evaluate rutting and fatigue cracking distress predictions in the Mechanistic-Empirical Pavement Design Guide (MEPDG) [5,6]. Although |*E*∗| has a significant role in pavement design, the associated test procedure is time-consuming and requires expensive equipment, so extensive effort has been expended to predict |*E*∗| from hot mix asphalt (HMA) material properties [7–9].
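For reference, the standard linear viscoelastic definition of the dynamic modulus under sinusoidal loading can be written as follows. The symbols $\sigma_0$ (stress amplitude), $\varepsilon_0$ (strain amplitude), $\omega$ (angular loading frequency), and $\phi$ (phase lag between stress and strain) are notation introduced here for illustration and do not appear in the original text.

$$
\sigma(t) = \sigma_0 \sin(\omega t), \qquad
\varepsilon(t) = \varepsilon_0 \sin(\omega t - \phi), \qquad
|E^*| = \frac{\sigma_0}{\varepsilon_0}
$$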

Predictive modeling is a process of estimating outcomes from several predictor variables using data mining tools and probability theory. An initial model can be formulated using either a simple linear equation or a more sophisticated structure obtained through a complex optimization algorithm [10].

There are several well-known predictive models for dynamic modulus. Some are regression models, while more recent ones use techniques that include Artificial Neural Networks (ANN) and genetic programming [11]. Andrei et al. [12] used 205 mixtures with 2750 data points to revise the original Witczak model, and the resulting model has subsequently been reformulated to use binder shear modulus rather than binder viscosity [13]. Christensen et al. [14] developed a new |*E*∗| predictive model based on the law of mixtures. The database used for training the model contained 206 |*E*∗| measurements from 18 different HMA mixtures. Jamrah et al. [15] attempted to develop improved |*E*∗| predictive models for HMA used in the State of Michigan. They observed a significant difference between measured and fitted |*E*∗| values, especially at high temperatures and low frequencies. Alkhateeb et al. [16] developed a new predictive model from the law of mixtures to be used over broader ranges of temperature and loading frequency, including higher temperatures and lower frequencies. The predictor variables used in that model were Voids in Mineral Aggregate (VMA) and binder shear modulus (*G*∗).

Sakhaeifar et al. [17] developed individual temperature-based models for predicting dynamic modulus over a wide range of temperatures. The predictor variables used in their model were aggregate gradation, VMA, Voids Filled with Asphalt (VFA), air void (*Va*), effective binder content (*Vbeff*), *G*∗, and binder phase angle (*δ*).

The existing dynamic modulus predictive models in the literature typically use two or more predictors from the following list: aggregate gradation, volumetric properties, and binder shear properties. These predictor variables are not necessarily independent of one another, so using them together may not be appropriate for model development. Because cross-correlated inputs can degrade the accuracy of a predictive model by distorting the estimated effect of each input on the response variable, a pre-processing step of data evaluation is useful for studying the quality of the input variables and their pair-wise correlations [18]. Principal Component Analysis (PCA) is a multivariate statistical approach that not only reduces the dimensionality of the problem but also converts a set of correlated inputs into a set of orthogonal (pseudo-)inputs using an orthogonal transformation [19]. During such a transformation, PCA retains as much of the information in the original dataset **X** as possible while using a smaller set of pseudo-variables [20,21]. Another issue common to all of these predictive models is extrapolation, which can be risky because a model might behave differently outside the convex hull that contains all of the data points used for its training. To avoid using points outside this convex hull, a hyper-space containing all data points can be found and added as a constraint on the desired modeling problem.
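The PCA step described above can be illustrated with a minimal sketch. It is not the implementation used in this study: the function name `pca_pseudo_inputs`, the `var_to_keep` threshold, and the synthetic VMA/VFA/*Va* values are all assumptions made for illustration, and NumPy is assumed to be available. The sketch standardizes the inputs, computes principal directions via a singular value decomposition, and checks that the resulting pseudo-inputs are uncorrelated.

```python
import numpy as np

def pca_pseudo_inputs(X, var_to_keep=0.95):
    """Hypothetical helper: project correlated inputs onto orthogonal components."""
    X = np.asarray(X, dtype=float)
    # Standardize each column so inputs on different scales are comparable.
    Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
    # Rows of Vt are the orthonormal principal directions.
    _, s, Vt = np.linalg.svd(Z, full_matrices=False)
    explained = s**2 / np.sum(s**2)
    # Smallest number of components whose cumulative variance reaches the target.
    k = int(np.searchsorted(np.cumsum(explained), var_to_keep) + 1)
    k = min(k, len(s))
    scores = Z @ Vt[:k].T  # orthogonal pseudo-inputs
    return scores, Vt[:k], explained[:k]

# Synthetic data (assumed values, not from the study): VFA is built to be
# strongly correlated with VMA, while Va is independent of both.
rng = np.random.default_rng(0)
vma = rng.normal(15.0, 1.0, size=200)
vfa = 70.0 + 2.0 * (vma - 15.0) + rng.normal(0.0, 0.5, size=200)
va = rng.normal(4.0, 0.5, size=200)
X = np.column_stack([vma, vfa, va])

scores, loadings, explained = pca_pseudo_inputs(X, var_to_keep=0.95)
score_corr = np.corrcoef(scores, rowvar=False)
print("components kept:", scores.shape[1])
print("explained variance ratios:", np.round(explained, 3))
print("correlation between pseudo-inputs:", np.round(score_corr, 6))
```

Because two of the three synthetic inputs carry nearly the same information, the sketch keeps fewer components than there are original variables, and the retained pseudo-inputs have essentially zero pair-wise correlation.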

Ghasemi et al. [22] developed a methodology for eliminating correlated inputs and extrapolation in modeling; they created a laboratory database of accumulated strain values of several asphalt mixtures and used the resulting framework to estimate the amount of permanent deformation (rutting) in asphalt pavement. Following their PCA-based approach, this study focuses on developing a machine-learning-based framework for predicting the dynamic modulus of HMA using orthogonal pseudo-inputs obtained from principal component analysis. Unlike most of the existing |*E*∗| predictive models, the proposed framework uses different data sets for model training and performance testing. To avoid extrapolation, an n-dimensional hyperspace is developed and added as a constraint to the modeling problem. This study also aims to determine the optimal HMA design and design variables for a pre-specified |*E*∗| by applying the framework with an evolutionary-based optimization algorithm. It is worth pointing out that, unlike other predictive models, the proposed framework is neither site-specific nor limited to the materials used in the American Association of State Highway and Transportation Officials (AASHTO) road test; that is, the framework can adjust itself based on the dataset presented to it. The need for a more robust and general framework for performance prediction in asphalt pavement also stems from the availability of the vast amount of experimental data in this field. In this work, the developed framework operates in such a spirit and improves the accuracy of available models via machine learning-based approaches.

The remainder of the document is organized as follows: Section 2 presents material and methodology, followed by Section 3 that covers results and discussion. Two examples of the proposed framework's applications are discussed in Section 4, followed by conclusions presented in Section 5.

#### **2. Material and Methodology**

Twenty-seven specimens from nine different asphalt mixtures (three replicates for each mixture group) were used in this study. The dynamic modulus test was performed in accordance with AASHTO TP 79-13 at three temperatures (0.4, 17.1, and 33.8 °C) and nine loading frequencies (25, 20, 10, 5, 2, 1, 0.5, 0.2, and 0.1 Hz). The maximum theoretical specific gravity (*Gmm*), the bulk specific gravity (*Gmb*), and the effective binder content (*Vbeff*) were determined and used to calculate other volumetric properties of the asphalt mixtures. Asphalt binder shear properties were obtained from a dynamic shear rheometer (DSR) test. This test was performed in accordance with ASTM D7552-09(2014) over a wide range of temperatures (−10 to 54 °C) and frequencies (0.1 Hz to 25 Hz), including the test temperatures and loading frequencies used in the mixture dynamic modulus test. It is important to note that this study uses a consistent definition of frequency: to predict the dynamic modulus of an asphalt mixture at, for example, 4 °C and 25 Hz, one should use as a model input the complex shear modulus of the asphalt binder, |*G*∗|, at 4 °C and 25 Hz. A summary of the nine different mixture properties is given in Table 1. Using the laboratory test results on 27 specimens, a database of 243 data points was created for use in further modeling.
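The size of the database can be checked with a short sketch of the test grid. The text does not state how the three replicates enter the database; the sketch assumes, as one reading consistent with the stated total, that each mixture contributes one data point per temperature-frequency condition. The variable names are hypothetical.

```python
from itertools import product

# Test conditions as reported in the text.
TEMPERATURES_C = (0.4, 17.1, 33.8)
FREQUENCIES_HZ = (25, 20, 10, 5, 2, 1, 0.5, 0.2, 0.1)
N_MIXTURES = 9

# Every temperature-frequency pair tested on the mixtures.
conditions = list(product(TEMPERATURES_C, FREQUENCIES_HZ))

# Assumption (not stated in the source): one data point per mixture and
# condition, for example after combining the three replicate specimens.
rows = [(mix, temp, freq)
        for mix in range(1, N_MIXTURES + 1)
        for temp, freq in conditions]

print(len(conditions), "conditions per mixture")
print(len(rows), "data points in total")
```

Each `(temp, freq)` pair in a row is also the key at which the binder |*G*∗| input would be taken, reflecting the consistent frequency definition described above.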

**Table 1.** General Mixture Properties of Nine Asphalt Mixtures Used in this Study.

