**2. Objectives**


### **3. Data Collection and Preparation**

### *3.1. Full-Size Track Road Test*

A test road was constructed and prepared for the full-size track road test. The track was designed as a 40 km single-lane road with a typical structure (a 4 cm + 6 cm Superpave surface course and an 8 cm asphalt-treated base) on a 50 cm cement-stabilized base (20 cm + 30 cm cement-stabilized macadam) (Figure 1). Twenty-five fully loaded trucks were used to run the accelerated test in an unfavorable season, namely June of that year, which was hot with frequent rain. The average air temperature ranged from 16 to 28 °C, and rain fell on eight of the 20 testing days. The loading process was divided into four stages of 5 days each. After each stage, quality parameters such as the international roughness index (IRI), the deflection value, and the rutting depth were collected by automatic devices. Asphalt field samples were cored to test their void rates and splitting strength before and after the process. By the time 29,820 standardized load repetitions had been applied over the whole test, some bumps and pits had appeared at random locations on the surface. The positions of potential and emerged distress, identified by coring and observation, were marked as the labels for the model.

**Figure 1.** The structure of the test road.

### *3.2. Data Collection and Preparation*

The deterioration-related data were collected and are summarized in Table 1. Thirty-four variables were chosen to build the random forest.

**Table 1.** Description of variables.




The data fall into two groups: initial data recorded before the test and in-process data recorded during the test. Some variables imply that damage has already occurred. However, when the surface was uncovered to verify these indications, they proved inaccurate. Some variables relate linearly to the prediction and can be combined using different weights, whereas others are non-linear, with thresholds that are hard to determine. This is why a random forest (RF) model is needed to improve the accuracy of predictions.
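As a minimal sketch of how the two data groups could be organized for one road position, the following Python fragment stores the initial measurements and the per-stage measurements separately and then flattens them into a single feature vector. All class names, field names, and numeric values here are hypothetical placeholders chosen for illustration. They are not the paper's 34 variables and not measurements from the test.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class StageMeasurement:
    """In-process data collected after one 5-day loading stage (hypothetical fields)."""
    stage: int          # 1..4
    iri: float          # international roughness index
    deflection: float   # deflection value
    rut_depth: float    # rutting depth


@dataclass
class SectionRecord:
    """All data for one road position, split into the two groups."""
    section_id: str
    initial: Dict[str, float]                               # recorded before the test
    in_process: List[StageMeasurement] = field(default_factory=list)
    distressed: bool = False                                # label from coring/observation

    def to_features(self) -> Dict[str, float]:
        """Flatten both groups into one named feature vector."""
        features = {f"init_{name}": value for name, value in self.initial.items()}
        for m in sorted(self.in_process, key=lambda m: m.stage):
            features[f"iri_s{m.stage}"] = m.iri
            features[f"deflection_s{m.stage}"] = m.deflection
            features[f"rut_depth_s{m.stage}"] = m.rut_depth
        return features


# Placeholder values for illustration only, not data from the road test.
record = SectionRecord(
    section_id="demo-001",
    initial={"void_rate": 4.0, "splitting_strength": 1.0},
    in_process=[
        StageMeasurement(stage=s, iri=1.0 + 0.1 * s, deflection=20.0 + s, rut_depth=0.5 * s)
        for s in range(1, 5)
    ],
)
features = record.to_features()
```

Keeping the stage index in each feature name is one possible design choice. It lets a tree-based model split on the value at a particular stage without assuming any linear trend across stages.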

As many data as could be detected were collected, so that no minor factor that might influence the prediction would be disregarded. As a result, some of the variables depend on other inputs. We prefer information richness over data independence because the RF model handles multicollinearity well. Moreover, if variables that harm the model are introduced, they can be pruned during model optimization to reduce computing cost and improve model strength.
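The idea of keeping correlated inputs and pruning unhelpful ones afterwards can be illustrated with a small sketch using scikit-learn's `RandomForestClassifier`. The data are synthetic, the variable names are hypothetical, and the pruning rule (keep variables whose impurity-based importance is at least the mean importance) is an assumption made for this example. The text above does not state which pruning criterion was actually used.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
n = 400

# Synthetic stand-ins for pavement variables (not data from the road test).
rut = rng.normal(size=n)
deflection = rng.normal(size=n)
rut_copy = rut + 0.05 * rng.normal(size=n)   # nearly collinear with `rut`
noise = rng.normal(size=(n, 3))              # uninformative variables

X = np.column_stack([rut, deflection, rut_copy, noise])
names = ["rut", "deflection", "rut_copy", "noise_0", "noise_1", "noise_2"]

# Non-linear label: distress if rutting is high OR deflection is extreme.
y = ((rut > 0.5) | (deflection ** 2 > 1.5)).astype(int)


def fit_and_score(features, labels):
    """Return a forest fitted on all rows and its mean 5-fold CV accuracy."""
    model = RandomForestClassifier(n_estimators=200, random_state=0)
    score = cross_val_score(model, features, labels, cv=5).mean()
    model.fit(features, labels)
    return model, score


# Fit on all variables, including the collinear and uninformative ones.
full_model, full_score = fit_and_score(X, y)
importances = full_model.feature_importances_

# Assumed pruning rule: keep variables with at least mean importance.
keep = importances >= importances.mean()
pruned_names = [name for name, kept in zip(names, keep) if kept]

# Refit on the pruned variable set and compare accuracy.
_, pruned_score = fit_and_score(X[:, keep], y)

print("kept variables:", pruned_names)
print(f"accuracy with all variables: {full_score:.3f}")
print(f"accuracy after pruning:      {pruned_score:.3f}")
```

Note that impurity-based importance is shared among correlated variables, so a nearly duplicated input may rank lower than it would on its own. Permutation importance on held-out data is a common alternative when that matters.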
