*2.1. Data Retrieval and Analysis*

To produce reliable results, supervised machine learning techniques need a diverse set of input variables [34–36]. The compressive and flexural strength of RAC were predicted using data obtained from past studies (see Table S1 in Supplementary Materials). Experimental data were selected at random from previous studies to avoid biased results. Twelve variables were chosen as input factors, as listed below:


In addition, the compressive and flexural strength were chosen as the output variables. The number of input variables and the size of the dataset have a substantial impact on a machine learning method's result [37–39]. In the present study, 638 data points (mixes) were used to run the machine learning methods for compressive strength prediction, and 139 data points (mixes) were used for flexural strength prediction. Tables 1 and 2 summarize the descriptive statistics of each input variable for compressive and flexural strength prediction, respectively. The mode, median, and mean describe central tendency, while the standard deviation, minimum, and maximum describe variability. The relative frequency distributions of the input factors used to predict compressive and flexural strength are depicted in Figures 1 and 2, respectively. Each plot shows how the readings for a given input parameter are distributed across its range of values.
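The summary statistics and relative frequency distributions described above can be sketched in a few lines of Python. This is a minimal illustration using only the standard library, not the procedure used in this study. The function names `describe` and `relative_frequency`, the equal-width binning, and the sample values are all assumptions introduced for the example. The sample values are hypothetical and are not taken from the dataset.

```python
import statistics
from collections import Counter


def describe(values):
    """Return the six summary statistics named in the text for one variable."""
    return {
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        "mode": statistics.mode(values),
        "std": statistics.stdev(values),  # sample standard deviation
        "min": min(values),
        "max": max(values),
    }


def relative_frequency(values, n_bins):
    """Share of observations in each of n_bins equal-width bins (assumed binning)."""
    lo, hi = min(values), max(values)
    # Fall back to a width of 1.0 if every value is identical
    width = (hi - lo) / n_bins or 1.0
    # The maximum value is clamped into the last bin
    counts = Counter(min(int((v - lo) / width), n_bins - 1) for v in values)
    return [counts.get(i, 0) / len(values) for i in range(n_bins)]


# Hypothetical readings for a single input variable (illustration only)
sample = [300, 350, 350, 400, 450, 500, 350, 400]

stats = describe(sample)
freqs = relative_frequency(sample, n_bins=4)

print(stats)
print(freqs)
```

Because each bin count is divided by the total number of observations, the relative frequencies always sum to one. This makes distributions comparable between datasets of different size, such as the 638-mix and 139-mix datasets used here.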




**Figure 1.** Relative frequency distribution of input parameters for the compressive strength dataset. NA: natural aggregate, RCA: recycled concrete aggregate.


Panel titles recovered from Figure 2: Los Angeles abrasion index of natural aggregate; Los Angeles abrasion index of RCA.

**Figure 2.** Relative frequency distribution of input parameters for the flexural strength dataset. NA: natural aggregate, RCA: recycled concrete aggregate.
