*3.3. Models' Validation*

The machine learning models were validated using k-fold cross-validation and statistical error measures. In the k-fold technique, which is widely used to assess a method's validity [48], the data are randomly shuffled and split into 10 groups. Nine groups are used to train the model and the remaining group is used for validation, as shown in Figure 10. A model is considered more accurate when its errors, the mean absolute error (MAE) and root-mean-square error (RMSE), are lower and its R<sup>2</sup> is higher. To reach a reliable conclusion, the procedure is repeated 10 times, with a different group held out for validation each time. This repeated evaluation is a large part of what makes the reported accuracy trustworthy. In addition, both models were assessed statistically using the MAE and RMSE, as shown in Table 3. This assessment also confirmed that the random forest model is more accurate than the gradient boosting model, as indicated by its lower error values. Equations (1) and (2), taken from prior studies [31,49], were used to quantify the prediction performance of the approaches.

$$\text{MAE} = \frac{1}{n} \sum_{i=1}^{n} |P_i - T_i|, \tag{1}$$

$$\text{RMSE} = \sqrt{\frac{1}{n} \sum_{i=1}^{n} \left(P_i - T_i\right)^2}, \tag{2}$$

where *n* is the total number of data points, *T<sub>i</sub>* is the experimental value, and *P<sub>i</sub>* is the predicted value.
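As an illustration of Equations (1) and (2), the following minimal Python sketch computes both error measures directly from their definitions. The function names and the sample strength values are hypothetical and are not taken from the study's dataset or code.

```python
import math

def mean_absolute_error(predicted, experimental):
    """Eq. (1): average of |P_i - T_i| over the n data points."""
    n = len(experimental)
    return sum(abs(p - t) for p, t in zip(predicted, experimental)) / n

def root_mean_square_error(predicted, experimental):
    """Eq. (2): square root of the mean of (P_i - T_i)^2."""
    n = len(experimental)
    return math.sqrt(sum((p - t) ** 2 for p, t in zip(predicted, experimental)) / n)

# Hypothetical strengths in MPa, for illustration only (not from the study).
T = [30.0, 42.0, 25.0, 38.0]  # experimental values T_i
P = [32.0, 40.0, 29.0, 38.0]  # predicted values P_i

mae = mean_absolute_error(P, T)
rmse = root_mean_square_error(P, T)
print(f"MAE = {mae:.2f} MPa, RMSE = {rmse:.2f} MPa")
```

Because RMSE squares each residual before averaging, it weights large individual errors more heavily than MAE does, so it is never smaller than MAE on the same data.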

**Table 3.** Statistical measurements of the models for validation.


**Figure 10.** K-fold analysis procedure [50].
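The 10-fold procedure described above can be sketched in plain Python as follows. This is a minimal sketch under stated assumptions, not the implementation used in the study: the helper names (`k_fold_indices`, `cross_validate`, `fit_mean`, `predict_mean`) are hypothetical, the data are synthetic, and a trivial mean predictor stands in for the gradient boosting and random forest models.

```python
import random

def k_fold_indices(n_samples, k=10, seed=0):
    """Shuffle the sample indices and split them into k nearly equal folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    return [indices[i::k] for i in range(k)]

def cross_validate(X, y, fit, predict, k=10, seed=0):
    """Train on k-1 folds, validate on the held-out fold, and repeat k times."""
    folds = k_fold_indices(len(y), k, seed)
    fold_mae = []
    for held_out in range(k):
        val_idx = folds[held_out]
        train_idx = [i for f, fold in enumerate(folds) if f != held_out for i in fold]
        model = fit([X[i] for i in train_idx], [y[i] for i in train_idx])
        preds = predict(model, [X[i] for i in val_idx])
        errors = [abs(p - y[i]) for p, i in zip(preds, val_idx)]
        fold_mae.append(sum(errors) / len(errors))  # MAE on the held-out fold
    return fold_mae

# Placeholder model (assumption): always predicts the training-set mean.
def fit_mean(X_train, y_train):
    return sum(y_train) / len(y_train)

def predict_mean(model, X_val):
    return [model for _ in X_val]

# Synthetic data for illustration only (not the RAC dataset).
rng = random.Random(42)
X = [[rng.uniform(0.0, 1.0)] for _ in range(50)]
y = [20.0 + 30.0 * x[0] + rng.gauss(0.0, 2.0) for x in X]

fold_mae = cross_validate(X, y, fit_mean, predict_mean, k=10)
print(f"folds: {len(fold_mae)}, mean MAE: {sum(fold_mae) / len(fold_mae):.2f}")
```

Each sample appears in exactly one validation fold, so every data point contributes to the error estimate once while never being used to train the model that predicts it.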

MAE, RMSE, and R<sup>2</sup> were computed to evaluate the models under k-fold analysis, and the results are shown in Table 4. Figures 11–13 compare the k-fold results for all of the machine learning techniques used. The MAE values for the gradient boosting compressive strength model ranged from 4.78 to 14.60 MPa, with an average of 10.27 MPa. In contrast, those for the random forest compressive strength model ranged from 4.19 to 10.92 MPa, with an average of 8.34 MPa. Likewise, the gradient boosting and random forest models for the compressive strength of RAC had average RMSE values of 11.05 and 9.41 MPa, respectively. The average R<sup>2</sup> values for the gradient boosting and random forest models were 0.67 and 0.72, respectively. With smaller error values and higher R<sup>2</sup> values, the random forest model was more accurate than the gradient boosting model in predicting the compressive strength of RAC. A similar distribution of error and R<sup>2</sup> values was found for the flexural strength of RAC for both models, which also confirmed the higher accuracy of the random forest model. Hence, the random forest model could be used to estimate the strength of RAC and thereby reduce the number of experimental trials required.
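The comparison above rests on per-fold ranges and averages of each metric. The sketch below shows one way such a summary could be computed. It assumes that R<sup>2</sup> is the coefficient of determination, 1 − SS<sub>res</sub>/SS<sub>tot</sub>, which the text does not state explicitly. The function names and the per-fold values are hypothetical and are not the values reported in Table 4.

```python
def r_squared(predicted, experimental):
    """Coefficient of determination, 1 - SS_res / SS_tot (assumed definition)."""
    mean_t = sum(experimental) / len(experimental)
    ss_res = sum((t - p) ** 2 for p, t in zip(predicted, experimental))
    ss_tot = sum((t - mean_t) ** 2 for t in experimental)
    return 1.0 - ss_res / ss_tot

def summarize(fold_values):
    """Return (minimum, maximum, average) of one metric across the folds."""
    return min(fold_values), max(fold_values), sum(fold_values) / len(fold_values)

# Hypothetical per-fold MAE values in MPa (not the values in Table 4).
mae_model_a = [5.0, 9.0, 12.0, 10.0]
mae_model_b = [4.0, 8.0, 10.0, 6.0]

summary_a = summarize(mae_model_a)
summary_b = summarize(mae_model_b)

# The model with the lower average error across folds is preferred.
better = "model_b" if summary_b[2] < summary_a[2] else "model_a"
print(summary_a, summary_b, better)
```

Reporting the range alongside the average shows how sensitive each model is to the particular train/validation split, which a single average would hide.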


**Table 4.** Results of k-fold analysis.

**Figure 11.** Mean absolute error distribution from k-fold analysis.

**Figure 12.** Root-mean-square error distribution from k-fold analysis.

**Figure 13.** Correlation coefficient (R<sup>2</sup>) distribution from the k-fold analysis. GB: gradient boosting, RF: random forest, CS: compressive strength, FS: flexural strength.
