**2. Punching Shear Resistance Database of RC Slab-Column Joints**

High-fidelity data are the basis for constructing ML models, so an experimental database must first be compiled. The punching shear resistance database, which contains 610 experimental results, is given in Appendix A, and the statistical information of the input variables is listed in Table 1. Relevant studies [8,14,38] report seven main factors influencing the punching shear resistance of slab-column joints: cross-section shape of the column (*s*), cross-section area of the column (*A*), effective depth of the slab (*d*), compressive strength of concrete (*f'c*), yield strength of reinforcement (*fy*), reinforcement ratio (*ρ*), and span-depth ratio (*λ*). Their distributions are described by four measures: minimum, maximum, standard deviation, and average. The column cross-section takes one of three shapes: square (*s* = 1), circle (*s* = 2), and rectangle (*s* = 3). The prediction target of the ML models is the punching shear resistance (*V*) of slab-column joints.
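
As an illustration of how the four summary measures and the shape encoding could be computed, the following is a minimal sketch using only the Python standard library. The records are hypothetical toy values, not entries from the database in Appendix A; the names `SHAPE_CODES`, `encode_shape`, and `summarize` are assumptions introduced here, and the use of the sample (rather than population) standard deviation is also an assumption, since the text does not state which is used.

```python
import statistics

# Shape encoding as stated in the text: square = 1, circle = 2, rectangle = 3.
SHAPE_CODES = {"square": 1, "circle": 2, "rectangle": 3}

# Hypothetical toy records (NOT values from the paper's database).
# "d" stands for the slab's effective depth, "fc" for concrete strength.
records = [
    {"shape": "square", "d": 100.0, "fc": 30.0},
    {"shape": "circle", "d": 120.0, "fc": 40.0},
    {"shape": "rectangle", "d": 140.0, "fc": 50.0},
]


def encode_shape(name):
    """Map a column cross-section shape name to its integer code s."""
    return SHAPE_CODES[name]


def summarize(values):
    """Return the four measures reported for each variable."""
    return {
        "min": min(values),
        "max": max(values),
        # Assumption: sample standard deviation (n - 1 in the denominator).
        "std": statistics.stdev(values),
        "mean": statistics.mean(values),
    }


s_codes = [encode_shape(r["shape"]) for r in records]
d_stats = summarize([r["d"] for r in records])
print(s_codes)
print(d_stats)
```

Applying `summarize` to each input column in turn would give one row of summary measures per variable.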


**Table 1.** Statistical information of input variables.

The histograms in Figure 2 show the relative frequency distributions of the input variables and the output; the red lines represent the cumulative distribution functions (CDFs) of the parameters. To further understand the correlations between the input variables, they are quantified as a Pearson correlation coefficient matrix, shown in Figure 3, where each coefficient represents the degree of linear correlation between a pair of input variables [39]. Coefficients close to −1 or 1 indicate a strong negative or positive linear correlation, respectively, and the linear correlation between *A* and *d* is the highest.
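
For reference, a Pearson correlation matrix of this kind could be computed as in the minimal sketch below. The toy columns are hypothetical and chosen only so that the result is easy to check by hand; they are not taken from the database, and the helper names `pearson` and `correlation_matrix` are assumptions introduced for illustration.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences.

    Raises ZeroDivisionError if either sequence has zero variance.
    """
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (spread_x * spread_y)


def correlation_matrix(columns):
    """Nested dict of pairwise Pearson coefficients, keyed by column name."""
    names = list(columns)
    return {a: {b: pearson(columns[a], columns[b]) for b in names} for a in names}


# Hypothetical toy columns (NOT values from the paper's database).
toy = {
    "A": [1.0, 2.0, 3.0, 4.0],
    "d": [2.0, 4.0, 6.0, 8.0],    # increases linearly with A
    "lam": [4.0, 3.0, 2.0, 1.0],  # decreases linearly with A
}

matrix = correlation_matrix(toy)
for name, row in matrix.items():
    print(name, {k: round(v, 3) for k, v in row.items()})
```

In this toy case the *A*–*d* coefficient is close to 1 and the *A*–*λ* coefficient is close to −1, which matches the interpretation of the extreme values given above.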

**Figure 2.** Distributions of the parameters in the database: (**a**) *s*; (**b**) *A*; (**c**) *d*; (**d**) *f'c*; (**e**) *fy*; (**f**) *ρ*; (**g**) *λ*; (**h**) *V*.
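
The relative frequencies and cumulative curves plotted in Figure 2 could be obtained along the lines of the following sketch. The bin edges and sample values are hypothetical, the binning used for the actual figure is not stated in the text, and `relative_frequencies` and `cumulative` are names assumed here for illustration.

```python
def relative_frequencies(values, edges):
    """Fraction of values in each bin [edges[i], edges[i+1]).

    The last bin also includes its right edge. Values outside the
    edges are not counted in any bin.
    """
    counts = [0] * (len(edges) - 1)
    last = len(counts) - 1
    for v in values:
        for i in range(len(counts)):
            in_bin = edges[i] <= v < edges[i + 1]
            on_right_edge = i == last and v == edges[-1]
            if in_bin or on_right_edge:
                counts[i] += 1
                break
    return [c / len(values) for c in counts]


def cumulative(freqs):
    """Running sum of the relative frequencies (a binned empirical CDF)."""
    total = 0.0
    out = []
    for f in freqs:
        total += f
        out.append(total)
    return out


# Hypothetical effective-depth samples (NOT values from the database).
toy_d = [100.0, 110.0, 120.0, 150.0, 200.0]
edges = [100.0, 150.0, 200.0]

freqs = relative_frequencies(toy_d, edges)
cdf = cumulative(freqs)
print(freqs, cdf)
```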

**Figure 3.** Correlation coefficient matrix of input variables.
