Stratified Metamodeling to Predict Concrete Compressive Strength Using an Optimized Dual-Layered Architectural Framework

Neto, Geraldo F.; Macêdo, Bruno da S.; Boratto, Tales H. A.; Gontijo, Tiago Silveira; Bodini, Matteo; Saporetti, Camila; Goliatt, Leonardo

doi:10.3390/mca30010016


by Geraldo F. Neto ¹, Bruno da S. Macêdo ², Tales H. A. Boratto ¹, Tiago Silveira Gontijo ³, Matteo Bodini ⁴, Camila Saporetti ⁵ and Leonardo Goliatt ¹,*

¹ Graduate Program on Computational Modeling, Federal University of Juiz de Fora, Juiz de Fora 36036-900, MG, Brazil

² Department of Systems Engineering and Automation, Federal University of Lavras, Lavras 37200-000, MG, Brazil

³ Campus Centro-Oeste, Federal University of São João del-Rei, Divinópolis 355901-296, MG, Brazil

⁴ Dipartimento di Economia, Management e Metodi Quantitativi, Università degli Studi di Milano, Via Conservatorio 7, 20122 Milano, Italy

⁵ Polytechnic Institute, State University of Rio de Janeiro, Nova Friburgo 28625-570, RJ, Brazil

* Author to whom correspondence should be addressed.

Math. Comput. Appl. 2025, 30(1), 16; https://doi.org/10.3390/mca30010016

Submission received: 19 December 2024 / Revised: 28 January 2025 / Accepted: 6 February 2025 / Published: 9 February 2025

(This article belongs to the Section Engineering)


Abstract:

Concrete is one of the most commonly used construction materials worldwide, and its compressive strength is the most important mechanical property to be defined at the time of structural design. Establishing a relationship between the amount of each component in the mixture and the properties of the concrete is not a trivial task, since a high degree of nonlinearity is involved. However, the use of machine learning methods as modeling tools has assisted in overcoming this difficulty. The objective of this work is to investigate the efficiency of using stacking as a technique for predicting the compressive strength of concrete mixtures. Four datasets obtained from the literature were used to verify the generalization capacity of the stacking technique; these datasets differ in the number of samples and in the number and types of attributes. Statistical tests were used to assess whether the differences between stacking and the individual machine learning models are significant. The results obtained from the statistical tests and evaluation metrics show that stacking yields results statistically similar to those of the standalone machine learning models, while performing better on the evaluation metrics.

Keywords:

stacking; computational intelligence; concrete; optimization

1. Introduction

Concrete is the most widely utilized material in global civil construction projects, serving as a fundamental component in countless structures and infrastructure systems. A total of 4.1 billion tons of the primary raw material for concrete, cement, was produced in 2022 [1], with the projected demand for concrete reaching 18 billion tons by the year 2050 [2]. This fact is linked to the characteristics that make concrete an attractive structural material. The ability to adapt to different types of shapes, its long functional life, its low maintenance demands, and its resistance to adverse weather conditions are some of the characteristics that make concrete a widely used material [3,4]. The high rates of urbanization, economic development, and population growth can explain the increase in the use of concrete materials over the years. These factors mean that the demand for the main component of concrete, cement, has increased in recent years [5,6].

The mechanical and physical properties of concrete are essential information during the structural design phase, since each type of project has certain requirements, such as compressive strength, durability, and specific weight [7]. There are numerous types of concrete, each with a particular application and with characteristics that adapt to project requirements. Mechanical properties are linked to the components of the concrete mix and the quantities of each element therein. Concrete mixtures contain the following essential components: aggregates, fine or coarse particles, water, and cement. Additions and additives can also be included in different mixtures. Mineral materials with cementing or pozzolanic properties are typically added, and chemical products can be used as additives to reduce the water–cement ratio [8].

The physical properties of concrete are usually defined through laboratory tests. These tests are standardized and enable the determination of properties such as compressive strength, modulus of elasticity, and tensile strength, among others [9]. One of the most important mechanical properties to be defined at the time of structural design is the compressive strength that a given concrete must possess in order to support the load of the structure and other loads to which it will be subjected [10,11]. A possible way to reduce the need for laboratory tests is to establish a relationship between the amount of each component in the mixture and the mechanical properties of the concrete. This approach allows the mixing parameters to be defined from a value established in the project for a given mechanical property, such as compressive strength. Defining this relationship is not a trivial task, since there is a high degree of nonlinearity [12]; however, this difficulty has been overcome with the use of machine learning methods [13,14,15,16,17].

Machine learning methods have been used to solve problems in different research areas, creating models capable of modeling highly nonlinear problems in science and engineering [18,19,20]. In the case of predicting the compressive strength of concrete, the input parameters include information relating to the components of the mixture and other extra information, such as the curing time of the specimen; the output parameter is the compressive strength [21,22].

In the literature, one can find works that use machine learning methods to estimate the mechanical properties of concrete [23]. Abd and Abd [24] carried out a study using a multivariate nonlinear regression model and a support vector machine (SVM) to predict the compressive strength of concrete. The dataset used in this work came from 150 mixtures of lightweight foamed or cellular concrete. Using a least-squares loss function, the multivariate nonlinear regression model obtained a correlation coefficient of 0.958, representing a good correlation between the actual and predicted values. The SVM models using the radial basis, linear, polynomial, and sigmoid kernel functions yielded correlation coefficients of 0.986, 0.951, 0.976, and 0.851, respectively.

Ahmadi-Nedushan [25] analyzed the forecasting ability of the k-nearest neighbors (KNN) algorithm with regard to the compressive strength of concrete via four variations of the KNN algorithm, including generalized regularization neural networks, stepwise regression, and a modular neural network. The dataset used was composed of 104 samples of high-performance concrete mixes; each data sample was made up of seven input data points related to the elements of the mixture and the compressive strength of the concrete after 28 days. The KNN-derived algorithms implemented by the author used a set of weights to weigh the significance of each attribute. This weighted algorithm presented the best results among those evaluated, reaching a value of 0.984 for the correlation coefficient and 1.174 for the root mean squared error (RMSE).

The prediction of compressive strength for high-performance concretes was also addressed by Al-Shamiri et al. [26]. In this paper, a comparison between a neural network trained with the backpropagation algorithm and an extreme learning machine (ELM) was performed. The dataset used was composed of 324 samples and consisted of five parameters of the mixture and the compressive strength at 28 days. The tests carried out with both approaches obtained satisfactory results, with correlation coefficients in the order of 0.990. An algorithm based on decision trees was used by Behnood et al. [27] to predict the compressive strength of normal and high-performance concrete. The correlation coefficient was 0.900, indicating that strategies using trees are appropriate for this type of problem.

Gilan et al. [28] used the particle swarm optimization (PSO) algorithm in conjunction with an SVM to predict the compressive strength of concrete mixtures containing metakaolin. The results obtained by the authors indicate that the use of the PSO-SVM hybrid model guarantees a greater ability to predict the compressive strength of concrete mixtures. In the work of Qi et al. [29], PSO was used to optimize the architecture parameters of an ANN. The results indicate that the use of PSO in conjunction with the ANN guaranteed good forecast quality, since the values predicted by the model were close to the experimental values.

Ly et al. [30] optimized artificial neural networks (ANNs) for faster and more accurate prediction of key properties in fly ash composites (FC). Using particle swarm optimization (PSO), the study fine-tuned the structure and parameters of the ANN. The results showed excellent prediction accuracy, with a strong correlation between the predicted and actual values. Additionally, the study identified the most influential factors affecting the properties under study, providing valuable insights for FC design and optimization.

Huang et al. [31] optimized the hyperparameters of a random forest (RF) model using the firefly algorithm (FA) to achieve significant performance gains. The resulting hybrid FA-RF model demonstrated high accuracy in predicting the concrete compressive strength, as evidenced by the strong R-squared and low root mean squared error (RMSE) values and the close alignment between the predicted and actual values.

Zhu et al. [32] presented two hybrid support vector regression models, AOSVR and ALSVR, optimized by advanced algorithms. These models accurately predicted concrete compressive strength from ingredient data, with AOSVR achieving the higher precision of the two. Implementing these models reduces testing costs and improves concrete characterization analysis.

Understanding and forecasting scenarios of interest to experts, academics, and decision-makers has become extremely difficult due to the growing complexity of contemporary issues and the exponential growth of data from diverse processes. Models that can depict nonlinear relationships between inputs and outputs and are resilient to data noise and uncertainties are necessary for complex machine learning and data science challenges.

The use of stacked models, which have the ability to improve the accuracy displayed by individual models, can help address these issues [33]. Additionally, these models function in accordance with their topology, enabling the aggregation of models with various capacities. Because of this, it is possible to create multiple stacking models by using different algorithms in the first layer of the stacked architecture. This process enables various algorithms to identify patterns in the training data, combining the models in the first layer to produce accurate results [34].

Because it uses ensemble learning to combine the strengths of several predictive models, stacking is a useful technique for predicting concrete compressive strength that may provide better accuracy and generalization than individual models. Although current techniques like neural networks, decision trees, and regression models have demonstrated promising outcomes, stacking potentially surpasses them by lowering model bias and variance by combining various algorithms.

This paper aims to combine consolidated strategies with a still little-explored approach, known as stacking, to predict the compressive strength of concrete. The stacking approach is a layered learning strategy in which, in the first layer, a set of machine learning algorithms is individually trained on the available data. In the second layer, a metamodel is responsible for making the final prediction, taking as input data the predictions made by the first-layer models [35]. It is worth noting that stacking can have a greater number of layers, but, in this paper, only two layers are used, as shown in the schematic model in Figure 1.

The objectives of this paper are as follows:

To investigate the efficiency of the use of stacking as a technique for predicting the compressive strength of concrete mixtures;
To examine whether the results obtained with stacking are at least similar to those obtained with the individual use of computational intelligence techniques;
To investigate the best strategy for utilizing PSO as a tool for optimizing the parameters of machine learning models.

The PSO algorithm was chosen due to its simple usage, requiring fewer parameters to adjust compared to other algorithms. In addition, PSO strikes a good balance between exploring the search space for possible solutions and refining the most promising ones, which helps avoid getting stuck in suboptimal results. It is also effective at searching the entire solution space, similar to genetic algorithms, but with simpler and faster update mechanisms.

Furthermore, while other algorithms such as genetic algorithms (GA) [36] or pattern search (PS) [37] were considered, they have limitations. GA often requires more fine-tuning of parameters, such as mutation and crossover rates, and can be slower due to its reliance on random processes. PS, on the other hand, is more suited for local optimization and does not explore the solution space globally as effectively as PSO or GA. PSO has been used in some other works in the literature for problems related to concrete, such as optimizing concrete composition [38] and predicting mechanical properties [39].

The remainder of this paper is organized as follows. In Section 2, the datasets used to test the efficiency of the proposed approaches are presented. This section also details the machine learning methods employed, the joint learning and optimization strategies, and the metrics and statistical tests used to evaluate the results. In Section 3, the results are introduced, and their analysis is presented. Finally, in Section 4, the conclusions are presented.

2. Materials and Methods

2.1. Experimental Data

In this paper, four datasets are used to analyze the prediction capacity of the proposed model. Each dataset consists of experimental data on different types of concrete mixtures, as described below.

The first dataset (D1) was obtained from [40] and contains information extracted from 104 high-performance concrete samples. The cylindrical specimens ($100 \times 200$ mm) were removed from the molds after 24 h, cured for 28 days, and subsequently tested to verify the compressive strength. Table 1 shows the parameters of the mixtures, their respective maximum and minimum values, and the values for compressive strength.

Figure 2 shows the correlation matrix; the closer the value is to one, the greater the degree of correlation is, and the closer the value is to negative one, the greater the inverse correlation degree is. It can be observed that AE and SP have a high correlation with CS and AE is also highly correlated with SP. Additionally, AE, SP, and CS are inversely correlated with w/c.
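As an illustration of how such a correlation matrix can be computed, the following minimal sketch assumes that the matrices in Figures 2 to 5 use the Pearson coefficient (the text does not state which coefficient was used). The function names and the toy values are hypothetical and are not taken from dataset D1.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

def correlation_matrix(columns):
    """columns maps an attribute name to its list of values."""
    names = list(columns)
    return {a: {b: pearson(columns[a], columns[b]) for b in names} for a in names}

# Hypothetical toy values (not from D1): strength falls as w/c rises.
toy = {"w/c": [0.30, 0.35, 0.40, 0.45], "CS": [80.0, 72.0, 65.0, 55.0]}
matrix = correlation_matrix(toy)
```

In this toy case, `matrix["w/c"]["CS"]` is close to negative one, which is the kind of inverse correlation described above.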

The second dataset (D2) was collected from [41], which used data to investigate the influence of fly ash and silica fumes on the compression resistance of high-performance concrete. A total of 24 different mixtures with varying quantities of the components are presented in Table 2. Compression tests were performed for the following six different curing periods: 3, 7, 28, 56, 90, and 180 days. In total, 144 tests were carried out, thus composing the dataset used in this study. The correlation matrix for the dataset D2 parameters can be found in Figure 3. It can be seen that the correlation values are lower than for D1.

The D3 dataset was provided by [42], who investigated artificial neural networks to predict the compressive strength of self-compacting concrete containing fly ash. A compatible dataset was used for the materials present in the mixtures. Dataset D3 includes 80 samples, and the components of the mixtures are presented in Table 3. Figure 4 displays the correlation matrix for dataset D3; we can observe that the correlation values are low for all components of the mixtures and also for their relationships with compressive strength.

Finally, dataset D4 was extracted from [26]. The dataset has 324 samples, each comprising six parameters, namely, five parameters of the mixture and the compressive strength at 28 days. The dataset’s maximum and minimum values are presented in Table 4. Figure 5 presents the correlation matrix for dataset D4. C is the component of the mixtures that correlates most highly with CS, followed by SP.

2.2. Regression Methods

The following five regression models were used in this paper: an artificial neural network (ANN), decision trees (DTs), an extreme learning machine (ELM), K-nearest neighbors (KNNs), and support vector machines (SVMs). These machine learning methods were chosen due to their widespread use in modeling engineering problems and their availability in machine learning libraries in different programming languages, allowing research reproducibility.

2.2.1. Artificial Neural Networks (ANNs)

Artificial neural networks are machine learning algorithms inspired by the nervous system of the animal brain and are capable of learning, generalizing, and organizing data [43]. The smallest unit of an ANN is an artificial neuron, as shown in Figure 6; the interconnections between neurons represent the synaptic communication process of biological neurons.

In addition, the processing unit of a neuron can be represented by an activation function, which takes input information and generates output information sent to other neurons. Three activation functions were employed, as presented in Table 5.

A multilayer perceptron (MLP) is a neural network that has at least one internal layer; since it can have multiple internal layers, an MLP is widely used and recommended for solving nonlinear problems. In this work, the neural network has three internal layers, each composed of up to one hundred neurons, and the topology used is feedforward and simple unidirectional.

The learning process of an ANN consists of adjusting the weights ($w_{ij}$) to minimize the network prediction error. Several algorithms in the literature are capable of carrying out network training. In this work, L-BFGS is used, since it can solve problems with a large number of variables while limiting the amount of memory used.

2.2.2. Decision Trees (DTs)

Decision trees (DTs) are machine learning algorithms capable of generating expert systems to solve classification and regression problems. A DT is built from a set of tests performed on the input data. The internal nodes of the tree represent these tests; in the case of regression problems, the tests are quantitative and compare the value of an attribute with a division value. Tests are typically performed by checking whether the value of a feature is greater or less than the division value. The leaf nodes of the tree store the value returned for a given entry. The return value of a leaf node is defined by the mean of the output values of all training samples that reached that node.

The classification and regression trees (CART) algorithm was used to induce binary trees, using the input variables and a threshold to achieve the greatest information gain at each node, thus reducing node impurity [44]. In regression problems, the impurity of a node is defined by a function whose value should be minimized. In the case of CART, this function is the mean squared error, defined as follows:

\mathrm{MSE}(y, \hat{y}) = \frac{1}{N} \sum_{i = 1}^{N} (y_{i} - \hat{y}_{i})^{2}

(1)

The CART algorithm determines which divisions will be created and the topology of the tree [45]. Given that the input data $(x_{i}, y_{i})$, with $i = 1, 2, \dots, N$ and $x_{i} = (v_{i1}, v_{i2}, \dots, v_{ij})$, are partitioned into K regions $R_{1}, R_{2}, \dots, R_{K}$, and that the output is given by a constant $c_{m}$ in each region, one can write the output as follows:

\hat{y}(x) = \sum_{m = 1}^{K} c_{m} I(x \in R_{m})

(2)

where $I(\cdot)$ is the indicator function, equal to one when its argument is true and zero otherwise.

As the minimization criterion adopted is the MSE, one can verify that the optimal $c_{m}$ is the average of the $y_{i}$ belonging to the region $R_{m}$, as follows:

c_{m} = \mathrm{mean}(y_{i} \mid x_{i} \in R_{m})

(3)

Then, to obtain the division variable j and the division point s, the minimization problem is given by Equation (4), as follows:

\min_{j, s} \left[ \min_{c_{1}} \frac{1}{N} \sum_{x_{i} \in R_{1}(j, s)} (y_{i} - c_{1})^{2} + \min_{c_{2}} \frac{1}{N} \sum_{x_{i} \in R_{2}(j, s)} (y_{i} - c_{2})^{2} \right]

(4)
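The search in Equation (4) can be sketched as an exhaustive loop over features and candidate thresholds. This is a minimal illustration rather than the implementation used in this work: the function names, the choice of the observed feature values as candidate thresholds, and the toy data are assumptions.

```python
def mse(values):
    """Mean squared error of the values around their own mean (node impurity)."""
    if not values:
        return 0.0
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / len(values)

def best_split(X, y):
    """Return (j, s, cost) minimising the objective of Equation (4).

    Each child predicts its own mean (Equation (3)), so the cost is the sum
    of the squared errors of both children divided by N.
    """
    n = len(y)
    best = (None, None, float("inf"))
    for j in range(len(X[0])):
        thresholds = sorted({row[j] for row in X})
        for s in thresholds[:-1]:  # the largest value would leave R2 empty
            left = [y[i] for i in range(n) if X[i][j] <= s]
            right = [y[i] for i in range(n) if X[i][j] > s]
            cost = (len(left) * mse(left) + len(right) * mse(right)) / n
            if cost < best[2]:
                best = (j, s, cost)
    return best

# Hypothetical toy data: feature 0 separates the two output levels perfectly.
X_toy = [[1.0, 5.0], [2.0, 3.0], [3.0, 8.0], [4.0, 1.0]]
y_toy = [10.0, 10.0, 30.0, 30.0]
split = best_split(X_toy, y_toy)
```

Applying `best_split` recursively to the two resulting regions, until a stopping rule such as a maximum depth is met, yields the full tree.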

2.2.3. Extreme Learning Machine (ELM)

The extreme learning machine (ELM) is a neural network with only one internal layer that differs from the others due to its weight adjustment strategy [46]. In this strategy, the weights of the inner layer are randomly assigned, while the output layer weights are obtained analytically. The output of an ELM model is defined as follows:

\hat{y}(x) = \sum_{i = 1}^{N} w_{i} h_{i}(x) = H(x) w

(5)

where N is the number of neurons in the hidden layer, $w_{i}$ is the output-layer weight associated with the i-th hidden neuron, and $h_{i}(x)$ is the output of the activation function of that neuron; $H(x)$ collects the outputs of all hidden neurons and w collects the output-layer weights. The activation functions applied are described in Table 6.

In the ELM training process, the hidden layer parameters (weights (a) and bias (b)) are defined randomly, ensuring a significant reduction in the training time for this type of network. The weights of the output layer (w) are defined by minimizing the error, as shown in Equation (6) below:

\min_{w} \| H(x) w - y(x) \|

(6)

which is found by solving the problem

H w = y,

(7)

which can be written in matrix form, as shown in Equation (8) below:

\begin{bmatrix} h_{1}(x_{1}) & \dots & h_{N}(x_{1}) \\ \vdots & \ddots & \vdots \\ h_{1}(x_{n}) & \dots & h_{N}(x_{n}) \end{bmatrix} \begin{bmatrix} w_{1} \\ \vdots \\ w_{N} \end{bmatrix} = \begin{bmatrix} y_{1} \\ \vdots \\ y_{n} \end{bmatrix}

(8)

The weights that minimize the error are found by solving $w = H^{\dagger} y$, where $H^{\dagger}$ is the Moore–Penrose generalized inverse of the matrix $H$.
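The ELM training procedure can be sketched with the standard library only. This sketch rests on several assumptions: a sigmoid activation (Table 6 is not reproduced here), hidden parameters drawn uniformly from [-1, 1], and the normal equations $H^{T} H w = H^{T} y$ solved by Gaussian elimination, which gives the same result as $w = H^{\dagger} y$ only when $H$ has full column rank. All function names are hypothetical.

```python
import math
import random

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (M[r][n] - sum(M[r][c] * x[c] for c in range(r + 1, n))) / M[r][r]
    return x

def hidden(x, a, b):
    """Outputs h_1(x), ..., h_N(x) of the hidden layer (sigmoid assumed)."""
    return [1.0 / (1.0 + math.exp(-(sum(w * v for w, v in zip(a_j, x)) + b_j)))
            for a_j, b_j in zip(a, b)]

def fit_elm(X, y, n_hidden, seed=0):
    rng = random.Random(seed)
    # Hidden weights (a) and biases (b) are random and never trained.
    a = [[rng.uniform(-1, 1) for _ in X[0]] for _ in range(n_hidden)]
    b = [rng.uniform(-1, 1) for _ in range(n_hidden)]
    H = [hidden(x, a, b) for x in X]
    # Normal equations; equivalent to w = pinv(H) y for full-column-rank H.
    HtH = [[sum(row[i] * row[j] for row in H) for j in range(n_hidden)]
           for i in range(n_hidden)]
    Hty = [sum(row[i] * t for row, t in zip(H, y)) for i in range(n_hidden)]
    return a, b, solve(HtH, Hty)

def predict_elm(x, a, b, w):
    """Equation (5): weighted sum of the hidden-layer outputs."""
    return sum(wi * hi for wi, hi in zip(w, hidden(x, a, b)))
```

Only the output weights are fitted, and in closed form, which is why training is fast compared with iterative backpropagation. A numerical library routine for the pseudoinverse would be the more robust choice when the columns of $H$ are nearly collinear.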

2.2.4. K-Nearest Neighbors (KNN)

The K-nearest neighbors (KNN) algorithm is an instance-based method, which performs predictions by comparing sets of attributes that have similar outputs [47]. KNN uses the principle that samples that are close in the attribute space will also be close in the response space. Starting from this point, the algorithm seeks the K nearest neighbors of a set of attributes and uses their target values to perform the prediction.

To find the K-nearest neighbors, different distance measures can be used, among which the simplest is the Euclidean distance. In this work, tree-based data structures, such as KD-Trees and BallTrees, are used to speed up the search for neighbors. The prediction for a feature set can be given by the average of the response values of its K nearest neighbors, which is defined as follows:

\hat{y} = \frac{\sum_{i = 1}^{K} f (X_{i})}{K}

(9)

where $(X_{1}, X_{2}, \dots, X_{K})$ are the attribute sets of the K-nearest neighbors and $f(X_{i})$ is the response value of each attribute set.

To weight the response values of neighbors, one can add to Equation (9) a weight ($w_{i}$) that gives more importance to the closest neighbors, as follows:

\hat{y} = \frac{\sum_{i = 1}^{K} w_{i} f(X_{i})}{\sum_{i = 1}^{K} w_{i}}

(10)

The weight values can be calculated as the inverse of the distance between the compared instances. Here, the weights $w_{i}$ are uniform for all K neighbors, in which case Equation (10) reduces to Equation (9).
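The two prediction rules can be sketched as follows. This is a minimal illustration with assumptions: it uses a brute-force Euclidean search instead of the KD-Tree or BallTree structures mentioned above, and, in the weighted case, it divides by the sum of the weights so that the result remains an average. The function name and the toy data are hypothetical.

```python
import math

def knn_predict(X_train, y_train, x, k, weighted=False):
    """Predict the response of x from its k nearest training samples."""
    nearest = sorted(
        (math.dist(xi, x), yi) for xi, yi in zip(X_train, y_train)
    )[:k]
    if not weighted:
        # Uniform weights: plain average of the neighbours' responses.
        return sum(yi for _, yi in nearest) / k
    for d, yi in nearest:
        if d == 0.0:  # exact match: inverse distance is undefined
            return yi
    weights = [1.0 / d for d, _ in nearest]
    # Assumption: normalise by the sum of the inverse-distance weights.
    return sum(w * yi for w, (_, yi) in zip(weights, nearest)) / sum(weights)

# Hypothetical one-dimensional toy data.
X_toy = [[0.0], [1.0], [2.0], [10.0]]
y_toy = [0.0, 1.0, 2.0, 10.0]
uniform = knn_predict(X_toy, y_toy, [1.2], k=2)
by_distance = knn_predict(X_toy, y_toy, [1.2], k=2, weighted=True)
```

With uniform weights the two neighbours at 1.0 and 2.0 contribute equally, whereas inverse-distance weighting pulls the prediction toward the closer neighbour.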

2.2.5. Support Vector Machine (SVM)

Support vector machines (SVMs) were developed by Vapnik in 1995 [48]. The technique is based on statistical learning theory, which aims to reduce the generalization error [49]. SVMs have been used in many different fields and activities, such as image recognition, text categorization, and bioinformatics, and have yielded results comparable to or sometimes superior to those of techniques such as ANNs. This fact can be justified by the SVM’s ability to deal with large datasets. The support vector regression (SVR) algorithm was used in this study. This version of the SVM is capable of working with continuous output values.

The SVR algorithm is based on the principle of a linear machine that maps the input values while minimizing the generalization error. This machine is defined as follows:

\hat{y} = (w \cdot x) + b

(11)

where $\hat{y}$ is the prediction based on the input x, w is the weight vector, b represents the bias, and (·) is the inner product.

In the SVR optimization problem, two slack variables ($ξ$ and $ξ^{*}$) are included, and the problem formulation is as follows:

\begin{matrix} \min \frac{1}{2} {∥w∥}^{2} + C \sum_{i = 1}^{n} (ξ_{i} + ξ_{i}^{*}) \\ \text{subject to} \left\{ \begin{matrix} y_{i} - {\hat{y}}_{i} \leq ε + ξ_{i}^{*} \\ {\hat{y}}_{i} - y_{i} \leq ε + ξ_{i} \\ ξ_{i}, ξ_{i}^{*} \geq 0, \; i = 1, \dots, n \end{matrix} \right. \end{matrix}

(12)

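where ε is the error tolerated without penalty and the parameter C regulates how strongly deviations beyond it, measured by the slack variables $ξ$ and $ξ^{*}$, are penalized. The choice of C also influences the complexity of the resulting model.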

The regression problem can be solved in its dual form, where w can be replaced by $w = \sum_{i = 1}^{n} (α_{i} - α_{i}^{*}) x_{i}$, with $α_{i}$ and $α_{i}^{*}$ the Lagrange multipliers. The SVR output is given by Equation (13), as follows:

\hat{y}(x) = \sum_{i = 1}^{n} (α_{i} - α_{i}^{*}) (x_{i} \cdot x) + b

(13)

To perform nonlinear regression, a function $K(x, x^{'}) = ϕ(x) \cdot ϕ(x^{'})$ is defined, where $ϕ(x)$ is a nonlinear transformation. This function is referred to as the kernel. In this way, the output of the nonlinear SVR is given by Equation (14), as follows:

\hat{y}(x) = \sum_{i = 1}^{n} (α_{i} - α_{i}^{*}) K(x_{i}, x) + b

(14)

The kernel function used is the radial basis function (RBF), given by $K(x, x^{'}) = e^{- γ ∥x - x^{'}∥^{2}}$. The parameter $γ$ is a kernel coefficient whose value is defined through an optimization process.
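The RBF kernel and the kernel expansion of Equation (14) can be sketched as below. The function names are hypothetical, and the support vectors, coefficients, and bias are illustrative values rather than the result of training; each entry of `dual_coefs` stands for the combined multiplier term that multiplies $K(x_{i}, x)$ in Equation (14).

```python
import math

def rbf_kernel(x, z, gamma):
    """K(x, z) = exp(-gamma * squared Euclidean distance between x and z)."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, z))
    return math.exp(-gamma * squared_distance)

def svr_predict(x, support_vectors, dual_coefs, bias, gamma):
    """Evaluate sum_i coef_i * K(x_i, x) + b for an already trained model."""
    return sum(
        c * rbf_kernel(sv, x, gamma)
        for sv, c in zip(support_vectors, dual_coefs)
    ) + bias

# Hypothetical, untrained values chosen only to show the computation.
prediction = svr_predict(
    [0.0], support_vectors=[[0.0], [1.0]], dual_coefs=[2.0, -1.0],
    bias=0.5, gamma=1.0,
)
```

Larger values of $γ$ make the kernel decay faster with distance, so each support vector influences a smaller neighbourhood of the input space.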

2.3. Stacking

Techniques that combine the prediction capacity of models generated by learning algorithms to achieve better results than traditional models individually are known as ensembles [50,51]. Different ensemble techniques can be divided into categories according to their objectives in improving prediction results, strategies for combining individual results, and types of individual algorithms used [52,53]. The last category can be subdivided into heterogeneous and homogeneous combinations. Techniques such as bagging [44] and boosting [54] are examples of combination strategies that use homogeneous algorithms. In this work, the ensemble strategy adopted is stacking, which uses heterogeneous algorithms.

Stacking was proposed by Wolpert [55] as a layered scheme (Figure 7): the algorithms belonging to the first level (level-0) are trained with a set of samples and generate predictions that are used as a training set for the algorithm in the second level (level-1), also called the metamodel, which generates the final predictions.

The objective of stacking is to reduce generalization errors through the use of model cascading. It is based on the premise that a set of models can make a better prediction than any of its individual models.

An advantage, or at least a difference, between stacking and other ensemble methods, is the way in which the predictions of level-0 models are combined [56]. While techniques such as bagging and boosting use simpler ways to carry out combinations, such as using the average of individual forecasts as the final forecast, stacking uses more robust models to make the final prediction, for example, linear regression or even learning algorithms.

Here, linear regression is used as the metamodel. This choice is justified because linear regression makes it possible to evaluate clearly the contribution of each level-0 model’s predictions to the final prediction.
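The two-level flow can be sketched with two deliberately simple level-0 learners and a least-squares metamodel. Everything in this sketch is an assumption made for illustration: the level-0 learners (a fitted line and a 1-nearest-neighbor rule) stand in for the five methods used in this work, the level-1 training set is built from out-of-fold predictions, and the metamodel has no intercept for brevity.

```python
def fit_line(xs, ys):
    """Level-0 learner: ordinary least squares with one input."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    slope = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
             / sum((x - mx) ** 2 for x in xs))
    return lambda x: slope * x + (my - slope * mx)

def fit_1nn(xs, ys):
    """Level-0 learner: response of the single nearest training sample."""
    data = list(zip(xs, ys))
    return lambda x: min(data, key=lambda p: abs(p[0] - x))[1]

def fit_meta(P, ys):
    """Level-1 linear regression (no intercept) on two prediction columns."""
    s11 = sum(p[0] * p[0] for p in P)
    s12 = sum(p[0] * p[1] for p in P)
    s22 = sum(p[1] * p[1] for p in P)
    t1 = sum(p[0] * y for p, y in zip(P, ys))
    t2 = sum(p[1] * y for p, y in zip(P, ys))
    det = s11 * s22 - s12 * s12
    return ((t1 * s22 - t2 * s12) / det, (t2 * s11 - t1 * s12) / det)

def fit_stacking(xs, ys, k=3):
    learners = [fit_line, fit_1nn]
    P = [None] * len(xs)  # level-1 training set: out-of-fold predictions
    for fold in range(k):
        train = [i for i in range(len(xs)) if i % k != fold]
        models = [fit([xs[i] for i in train], [ys[i] for i in train])
                  for fit in learners]
        for i in range(fold, len(xs), k):
            P[i] = [m(xs[i]) for m in models]
    weights = fit_meta(P, ys)
    final_models = [fit(xs, ys) for fit in learners]  # refit on all data
    predict = lambda x: sum(w * m(x) for w, m in zip(weights, final_models))
    return predict, weights, P
```

The fitted `weights` show how much each level-0 model contributes to the final prediction, which is the interpretability argument made above for a linear metamodel.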

2.4. Particle Swarm Optimization (PSO)

Particle swarm optimization (PSO) is inspired by the behavior of sets of animals, e.g., birds [57]. The algorithm is stochastic and is based on a very simple concept: each individual in the set moves through the search space at a speed that is adjusted through their experiences and the experiences of the group. In this scenario, one can define a set of individuals as the population and define that each individual represents a solution to the minimization problem. The value of an individual is assessed by the objective function to determine whether the position of the individual is good or closer to optimal.

The objective function is defined according to the problem to be solved. In problems where the parameters of a regression model have to be optimized, an error function can be used that reflects the results of the model. To delimit the search space, the maximum and minimum values are defined according to the problem to be solved.

Individuals in the population are endowed with a memory that records the best position of the individual, denominated pBest, and a collective memory that records the best position already reached by the set of individuals, denominated gBest.

These characteristics allow individuals to evaluate their next position within the search space, aiming to reach points that represent better solutions to the problem. The learning process of the algorithm is defined by the memory capacity of individuals, which determines how the velocity (v) and position (x) of each individual are updated.

In Equation (15), the constants $ϕ_{p}$ and $ϕ_{g}$ scale the velocity components in the direction of the best individual position and the best global position, respectively. The variables $r_{1}$ and $r_{2}$ in Equation (15) introduce randomness into the algorithm’s learning process: they are generated randomly and range from 0 to 1. This randomness promotes a more complete exploration of the search space.

v_{i}(t) = v_{i}(t - 1) + ϕ_{p} \cdot r_{1} (x_{\mathrm{pBest}} - x_{i}(t - 1)) + ϕ_{g} \cdot r_{2} (x_{\mathrm{gBest}} - x_{i}(t - 1))

(15)

x_{i} (t) = x_{i} (t - 1) + v_{i} (t)

(16)
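Equations (15) and (16) can be sketched as the following minimization loop. The default parameter values, the velocity clamp, and the clipping of positions to the search bounds are assumptions added to keep the sketch stable; they are not the settings of Table 7. The stopping criterion is the maximum number of iterations.

```python
import random

def pso(objective, bounds, n_particles=20, n_iter=100,
        phi_p=1.5, phi_g=1.5, seed=0):
    """Minimise objective over the box given by bounds = [(lo, hi), ...]."""
    rng = random.Random(seed)
    dim = len(bounds)
    x = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(n_particles)]
    v = [[0.0] * dim for _ in range(n_particles)]
    p_best = [xi[:] for xi in x]             # pBest of each individual
    p_val = [objective(xi) for xi in x]
    g = min(range(n_particles), key=lambda i: p_val[i])
    g_best, g_val = p_best[g][:], p_val[g]   # gBest of the population
    for _ in range(n_iter):
        for i in range(n_particles):
            for d, (lo, hi) in enumerate(bounds):
                r1, r2 = rng.random(), rng.random()
                # Equation (15): velocity update.
                v[i][d] += (phi_p * r1 * (p_best[i][d] - x[i][d])
                            + phi_g * r2 * (g_best[d] - x[i][d]))
                v_max = hi - lo  # assumption: clamp the velocity
                v[i][d] = max(-v_max, min(v_max, v[i][d]))
                # Equation (16): position update, clipped to the bounds.
                x[i][d] = max(lo, min(hi, x[i][d] + v[i][d]))
            value = objective(x[i])
            if value < p_val[i]:
                p_best[i], p_val[i] = x[i][:], value
                if value < g_val:
                    g_best, g_val = x[i][:], value
    return g_best, g_val
```

For hyperparameter tuning, `objective` would train a model with the parameters encoded in a position and return its validation error, for example the mean squared error.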

PSO can be used for continuous or discrete search spaces, as in the case of optimizing the parameters of an MLP, where the number of hidden layers and the activation function can be optimized. In the case of discrete numeric parameters, the PSO response is converted to an integer value; in the case of discrete categorical parameters, however, an encoding strategy must be used. One such strategy is to assign an integer value to each of the parameter options; the same conversion used for discrete numeric parameters then selects the option.
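The decoding strategy described above can be sketched as follows. The `spec` format, the rounding rule, and the option names are hypothetical and serve only to illustrate the mapping from a continuous PSO position to model parameters.

```python
def decode(position, spec):
    """Map a continuous PSO position to model parameters.

    spec is a list with one entry per dimension (hypothetical format):
      ("int", lo, hi)      -> rounded and clipped integer
      ("choice", options)  -> option selected by the rounded index
    """
    params = []
    for value, entry in zip(position, spec):
        if entry[0] == "int":
            _, lo, hi = entry
            params.append(max(lo, min(hi, int(round(value)))))
        else:
            options = entry[1]
            index = max(0, min(len(options) - 1, int(round(value))))
            params.append(options[index])
    return params

# Hypothetical search space: number of neurons and activation function.
spec = [("int", 1, 100), ("choice", ["identity", "tanh", "relu"])]
decoded = decode([57.6, 1.2], spec)
```

Because the integer codes of categorical options carry no natural order, neighbouring positions do not necessarily correspond to similar models, which is a known limitation of this encoding.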

The algorithm may have as a stopping criterion the maximum number of iterations or a tolerance value for variation in the value of the objective function. The maximum number of iterations was used as a stopping criterion in this study.

2.5. Cross-Validation

The cross-validation method is a technique used to evaluate the performance of estimators using all the data in the dataset. This method is employed to minimize prediction errors caused by overfitting and is also indicated for the validation of datasets that have a reduced number of sample data points.

The cross-validation technique used is k-fold, which consists of dividing the dataset into k sets of equal size, fitting the model on $k - 1$ sets, and validating it on the remaining set. This process is carried out k times, so that the model is validated on each part of the dataset [58].

The parameter k must be adjusted appropriately to avoid negatively impacting the final result. The value of k is normally chosen to be between 5 and 10. The choice of this value depends directly on the size of the dataset used, since selecting a large value for k produces small validation sets that may not represent all the characteristics of the dataset.
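The k-fold procedure can be sketched with NumPy alone, as below. The helper names `k_fold_indices` and `cross_val_mse` are illustrative, and an ordinary least-squares line stands in for the first-layer models only to keep the example short; it is not one of the estimators used in the study.

```python
import numpy as np


def k_fold_indices(n_samples, k, seed=0):
    """Yield (train, validation) index arrays for each of the k folds."""
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(n_samples), k)
    for i in range(k):
        train = np.concatenate([folds[j] for j in range(k) if j != i])
        yield train, folds[i]


def cross_val_mse(X, y, k=5):
    """Average validation MSE of a least-squares linear model over k folds."""
    scores = []
    for train, val in k_fold_indices(len(y), k):
        # Fit on k - 1 folds (a column of ones provides the intercept).
        A_train = np.column_stack([X[train], np.ones(len(train))])
        coef, *_ = np.linalg.lstsq(A_train, y[train], rcond=None)
        # Validate on the remaining fold.
        A_val = np.column_stack([X[val], np.ones(len(val))])
        scores.append(np.mean((y[val] - A_val @ coef) ** 2))
    return float(np.mean(scores))
```

Every sample appears in exactly one validation fold, so all the data contribute to the performance estimate.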

2.6. Proposed Approaches

Two approaches have been proposed to predict the compressive strength of concrete. The topology used by stacking in both approaches has two levels. The first stacking level comprises SVM, MLP, ELM, KNN, and DT. The second level is the metamodel, a linear regression algorithm. Figure 8 presents a graphical model of the topology used.

The main difference between these approaches lies in how PSO optimizes the first-level model parameters and how the parameters of the linear regression metamodel are adjusted. As shown in Figure 8, in the first approach (Stacking 1 (ST-1)), the first-layer models are optimized individually via PSO, and the predictions from the optimized models serve as input data for the metamodel, whose parameters are adjusted using the least squares method. In the second approach (ST-2), the first-layer models are optimized together with the parameters of the metamodel, with the objective function of the optimization being the mean squared error of the final prediction.

2.6.1. ST-1

In ST-1, the optimization algorithm, PSO, is executed for each method in the first layer. The optimization of method parameters in this approach aims to minimize the mean squared error of each model individually. The parameters used for PSO are presented in Table 7. In some cases, the population size varied according to the method used, but the other parameters remained the same.

Each method present in the first layer of stacking has a set of parameters whose adjustment directly influences the prediction quality. Table 8 lists the parameters that are optimized via PSO for each method. The values presented in Table 8 were obtained from the literature and previous experience.

During the process of obtaining the optimized parameters, the models in the first layer are trained using the cross-validation strategy. This strategy is justified mainly due to the reduced number of samples in the datasets studied. The metamodel is a linear regression method adjusted by the least squares approach. The model is adjusted on the basis of the predictions performed by the first-level models obtained through the optimization process.
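The second-layer fit in ST-1 can be sketched as an ordinary least-squares problem on the matrix of first-level predictions. The sketch assumes those predictions are already available (one column per first-level model); the function names `fit_metamodel` and `stacking_predict` are illustrative and not taken from the study.

```python
import numpy as np


def fit_metamodel(first_level_preds, y):
    """Least-squares fit of the linear metamodel; returns (coefficients, intercept)."""
    P = np.asarray(first_level_preds, dtype=float)
    # Append a column of ones so the intercept is estimated with the coefficients.
    A = np.column_stack([P, np.ones(len(P))])
    solution, *_ = np.linalg.lstsq(A, np.asarray(y, dtype=float), rcond=None)
    return solution[:-1], float(solution[-1])


def stacking_predict(first_level_preds, coefficients, intercept):
    """Weighted sum of the first-level predictions plus the intercept."""
    return np.asarray(first_level_preds, dtype=float) @ coefficients + intercept
```

In this arrangement the first-layer models are tuned first, and the metamodel only sees their predictions.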

2.6.2. ST-2

ST-2 differs from ST-1 in that the parameters of the first-layer models and the parameters of the linear regression metamodel are optimized together. In this approach, the optimization aims to minimize the mean squared error of the final prediction made by the metamodel. The PSO algorithm is applied to stacking to optimize the parameters of the first-level models and adjust the metamodel. The parameters of the PSO algorithm for this approach are presented in Table 9.

The output of the metamodel, and consequently of stacking, is given in Equation (17), as follows:

\hat{y}_{Stacking} = (\sum_{i = 1}^{N} α_{i} \hat{y}_{i}) + b

(17)

where N is the number of first-level models, \hat{y}_{i} are the predictions of the first-level models, α_{i} are the metamodel parameters of the linear regression, and b is the bias.

In the optimization process, the parameters α_{i} can assume values between 0 and 1, and b can assume values between −10 and 10. The parameters of the first-layer methods can assume values according to Table 8. The PSO algorithm in this strategy has a search space of twenty-two dimensions, which justifies the increase in the population size and the maximum number of iterations.

To enable the analysis of the influence of the prediction of each first-level model on the final result of stacking, a restriction was used in the PSO (Equation (18)) so that the sum of the parameters α_{i} was 1 with a tolerance (t) of 0.05.

1 - t \leq \sum_{i = 1}^{N} α_{i} \leq 1 + t

(18)
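One way to combine Equations (17) and (18) in a single objective function is sketched below. The text does not state how the restriction was enforced inside PSO, so the additive penalty used here is an assumption, as are the function name `st2_objective` and the `penalty` value. The first-level predictions are taken as given.

```python
import numpy as np


def st2_objective(alpha, b, first_level_preds, y, t=0.05, penalty=1e6):
    """MSE of the stacked prediction, penalized when Eq. (18) is violated."""
    alpha = np.asarray(alpha, dtype=float)
    # Eq. (17): weighted sum of the first-level predictions plus the bias.
    y_hat = np.asarray(first_level_preds, dtype=float) @ alpha + b
    mse = float(np.mean((np.asarray(y, dtype=float) - y_hat) ** 2))
    # Eq. (18): the weights must sum to 1 within the tolerance t.
    if not (1 - t <= alpha.sum() <= 1 + t):
        return mse + penalty  # assumption: constraint handled by a penalty term
    return mse
```

With a penalty of this kind, candidate solutions whose weights do not sum to approximately 1 are never preferred over feasible ones.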

2.7. Assessment Metrics

The following three metrics, which were used in this study, are commonly applied in the literature: the coefficient of determination (R^{2}), the root mean squared error (RMSE), and the mean absolute percentage error (MAPE).

Setting \hat{y} as the estimated output, y as the sample label, \bar{y} as the average of the sample labels, and N as the number of samples, the metrics are defined as stated below.

The coefficient of determination (R^{2}) is defined in Equation (19), where R^{2} varies from 0 to 1, with a value closer to 1 indicating better generalization quality. The RMSE is obtained using Equation (20), and the MAPE is defined by Equation (21), as follows:

R^{2}(y, \hat{y}) = \frac{\sum_{i = 1}^{N} (\hat{y}_{i} - \bar{y})^{2}}{\sum_{i = 1}^{N} (y_{i} - \bar{y})^{2}}

(19)

RMSE(y, \hat{y}) = \sqrt{\frac{1}{N} \sum_{i = 1}^{N} (y_{i} - \hat{y}_{i})^{2}}

(20)

MAPE(y, \hat{y}) = \frac{1}{N} (\sum_{i = 1}^{N} \frac{|y_{i} - \hat{y}_{i}|}{|y_{i}|}) \times 100

(21)

In the case of the error metrics (RMSE and MAPE), the closer the obtained value is to zero, the closer the predicted values are to the true values. The MAPE is given as a percentage, so a value closer to 0% indicates better predictions.
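Equations (19) to (21) translate directly into NumPy, as in the sketch below. The function names are illustrative. Note that `r2_eq19` follows Equation (19) as written (explained sum of squares over total sum of squares); the residual-based form, one minus the ratio of residual to total sum of squares, can give a different value when the predictions do not come from a least-squares fit evaluated on its own training data.

```python
import numpy as np


def r2_eq19(y, y_hat):
    """Coefficient of determination as written in Eq. (19)."""
    y, y_hat = np.asarray(y, dtype=float), np.asarray(y_hat, dtype=float)
    y_bar = y.mean()
    return float(np.sum((y_hat - y_bar) ** 2) / np.sum((y - y_bar) ** 2))


def rmse(y, y_hat):
    """Root mean squared error, Eq. (20)."""
    y, y_hat = np.asarray(y, dtype=float), np.asarray(y_hat, dtype=float)
    return float(np.sqrt(np.mean((y - y_hat) ** 2)))


def mape(y, y_hat):
    """Mean absolute percentage error in percent, Eq. (21)."""
    y, y_hat = np.asarray(y, dtype=float), np.asarray(y_hat, dtype=float)
    return float(np.mean(np.abs(y - y_hat) / np.abs(y)) * 100)
```

As a worked case, for labels (10, 20, 30, 40) and predictions (12, 18, 33, 37), the squared errors are 4, 4, 9, and 9, so the RMSE is the square root of 6.5, and the relative errors 0.2, 0.1, 0.1, and 0.075 give a MAPE of 11.875%.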

2.8. Statistical Tests

Statistical tests were used with the main objective of identifying the significant similarity between the first-layer models and the stacking results.

To determine the existence of similarity between three or more groups, parametric or nonparametric methods were used, depending on whether the samples were normally distributed. The Shapiro–Wilk test was used to determine whether the normality hypothesis is true for a given sample [59]. The p-values indicate whether the sample can be considered to have a normal distribution. A p-value less than 0.05 indicated that the normality hypothesis was rejected.

The Lilliefors test was also used to assess the normality of a given sample. This test is adapted from the Kolmogorov–Smirnov test and uses the same statistic [60], namely the maximum difference between the empirical distribution function and the theoretical cumulative distribution function. The null hypothesis for this test is that the sample follows a normal distribution, which is not rejected if the p-value of the test is greater than 0.05.

The choice between parametric and nonparametric tests was based on the results of the Shapiro–Wilk and Lilliefors tests. Parametric tests are used when the samples originate from normal distributions, allowing inference about the parameters that characterize the distribution from which the sample originates. Nonparametric tests are used when the distributions from which the samples originate are not determined; in this case, inference is made about the center of the distribution.

The nonparametric test used throughout this work was the Kruskal–Wallis test. This test is used to determine whether there is a significant difference between the medians of the distributions of two or more groups of an independent variable, continuous or ordinal. It is an alternative to the parametric ANOVA test [61], which assumes normality and homoscedasticity, that is, equal variances across the samples. In the Kruskal–Wallis test, as in other tests, a statistic is calculated and compared to a cutoff point defined by the significance level, which is normally 0.05. The hypotheses tested were as follows: H0, where the population medians are equal; H1, where the population medians are different.

The use of tests such as the Kruskal–Wallis test and ANOVA allowed us to indicate whether there was a statistically significant difference between the groups of samples tested. The groups of samples that presented significant differences between each other were defined using post hoc tests.

The Dunn test is a nonparametric post hoc test used to compare pairs of sample groups and identify whether there is a significant difference between the pairs [62]. This test was applied whenever the Kruskal–Wallis test indicated a significant difference between the compared groups. The test yields a p-value for each pair of compared samples; a p-value less than 0.05 indicates that the two samples differ significantly.
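The decision flow described in this section (normality check, then a parametric or nonparametric group comparison) can be sketched with SciPy as follows. The function name `compare_groups` is illustrative. The Lilliefors check and the Dunn post hoc step are left out because this sketch relies on SciPy only, so it covers just part of the procedure.

```python
import numpy as np
from scipy import stats


def compare_groups(groups, alpha=0.05):
    """Choose ANOVA or Kruskal-Wallis from Shapiro-Wilk normality checks.

    Returns (test name, p-value, whether the groups differ significantly).
    """
    # Shapiro-Wilk: a p-value below alpha rejects normality for that group.
    all_normal = all(stats.shapiro(g)[1] >= alpha for g in groups)
    if all_normal:
        _, p_value = stats.f_oneway(*groups)   # parametric one-way ANOVA
        test_name = "ANOVA"
    else:
        _, p_value = stats.kruskal(*groups)    # nonparametric alternative
        test_name = "Kruskal-Wallis"
    return test_name, float(p_value), bool(p_value < alpha)
```

Each group would hold the values of one evaluation metric over the independent runs of one model, for example the 35 RMSE values of the SVM.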

3. Results and Discussion

The results are presented using the average and standard deviation of the metrics over the thirty-five independent runs. The results of the first approach (ST-1) applied to the four datasets are presented first, followed by those of ST-2.

The computational experiments were conducted on a computer with the following specifications: Intel(R) Core(TM) i7-9700F (eight cores of 3 GHz and cache memory of 6 MB), 32 GB RAM, and the operating system Linux Ubuntu 22. Additionally, the codes were implemented in Python, based on the pandas [63], NumPy [64], matplotlib [65], seaborn [66], scikit-learn [67], and scipy [68] libraries.

3.1. Results for ST-1 Stacking

The results for the simulations using ST-1 are presented in Table 10, Table 11, Table 12 and Table 13. The average values are presented, with the standard deviations shown in parentheses. The best results, on average, among the first-layer methods and stacking are highlighted in the tables.

The average values of the assessment metrics show that stacking does not yield the best results for all the datasets; however, it always presents itself as at least the second-best result.

To determine whether there was a significant difference between the results of the metric evaluation of stacking and those of other machine learning methods, statistical tests were used. To determine whether parametric or nonparametric tests should be used, the Shapiro–Wilk and Lilliefors tests were performed to indicate whether the results of the evaluation metrics presented a normal distribution. In Table 14 and Table 15, p-values for normality tests are presented.

The results presented in Table 16 indicate that the metrics of the first-layer models and of stacking differ significantly, since the p-values for all tests were less than 0.05. Once it was verified that there was a significant difference between the methods, Dunn’s post hoc test was used to verify where the difference was located. The main interest of this work was to verify whether the results obtained by stacking presented a significant difference from those of the first-layer models. For simplicity, comparisons between the first-layer models were suppressed.

Comparing the results presented in Table 17 with those in Table 10, Table 11, Table 12 and Table 13, one can observe that stacking yields results that are statistically similar to those of the first-layer models. To assist in the analysis of variance, Figure 9, Figure 10, Figure 11 and Figure 12 present boxplots of the metric results for each dataset.

The boxplots confirm what was observed in the results: stacking has low variance, which supports the prediction reliability of the method. For datasets D1, D3, and D4, the SVM presents the variance closest to that of stacking; for dataset D2, the MLP presents the closest variance.

Table 18 presents the models’ best parameters for datasets D1, D2, D3, and D4, for reproducibility. It can be observed that, for dataset D1, the results were better and, in most cases, the models were simpler. Dataset D3 was the one on which the models had the most difficulty learning. Table 19 presents the results obtained using the parameters shown in Table 18.

To evaluate the participation of each first-layer model in the final stacking prediction, linear regression was used as a metamodel, as explained previously. Table 20 presents the averages of the regression coefficients (α) associated with each first-layer method and the intercept term (β) for the executions performed for each dataset.

According to the results presented in Table 20, there is a direct relationship between the quality of the individual prediction of a first-layer method and its participation in the prediction made by the metamodel. Taking the results for dataset D1 as an example, it can be observed that SVM has the greatest participation (58.2%) in the final prediction, which is justified since it was the model with the best individual result for this dataset. The participations of the MLP (23.5%) and ELM (10.7%) are likewise proportional to the individual performances of these models, as are those of the KNN (3.2%) and DT (4.1%).

The KNN and DT methods present low participation in the final stacking prediction across the four datasets, contributing less than 10% to the final prediction on average. Despite this general pattern, in dataset D2, the DT has more representation in the final prediction than the SVM, and in dataset D3, the KNN has a representation close to that obtained by the MLP.
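One possible way to express metamodel coefficients as participation percentages is sketched below. The text does not state how the percentages were derived from the coefficients, so normalizing by the sum of absolute values is an assumption, and both the function name `participation_percent` and the coefficient values are hypothetical rather than taken from Table 20.

```python
def participation_percent(coefficients):
    """Share of each first-layer model, as a percentage of the total weight."""
    # Assumption: absolute values are used so negative weights still count.
    total = sum(abs(c) for c in coefficients.values())
    return {name: 100.0 * abs(c) / total for name, c in coefficients.items()}


# Hypothetical coefficients, for illustration only.
example = {"SVM": 0.6, "MLP": 0.2, "ELM": 0.1, "KNN": 0.05, "DT": 0.05}
shares = participation_percent(example)
```

Here `shares["SVM"]` is approximately 60, meaning that the SVM prediction carries about 60% of the total weight in the stacked output.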

3.2. Results for ST-2 Stacking

The results of the simulations using ST-2 are presented in Table 21, Table 22, Table 23 and Table 24. The data in the tables follow the same model used for ST-1: average and standard deviation for the 35 executions.

Evaluating only the average values of the metrics presented in Table 21, Table 22, Table 23 and Table 24, it can be seen that stacking does not yield better results in all the cases. However, stacking generates at least the second best result compared to first-layer models.

As with the ST-1 results, statistical tests were used to determine whether there was a significant difference between the stacking evaluation and the other machine learning methods. The Shapiro–Wilk and Lilliefors tests were performed to determine whether the tests would be parametric or nonparametric. The results of these tests indicate whether the results of the evaluation metrics exhibit a normal distribution. In Table 25 and Table 26, p-values for normality tests are presented.

The results indicate that none of the metrics showed a normal distribution in any of the samples. These results indicate the need to apply nonparametric tests. As in ST-1, the Kruskal–Wallis test was applied to determine whether there was a significant difference between the first-layer models and the stacking model. In Table 27, the p-values for the tests are presented.

The results in Table 27 indicate that there is a significant difference between the first-layer models and stacking. Given that there is a difference between the methods, it is necessary to identify which methods differ. As in the analysis of the ST-1 results, Dunn’s post hoc test was used. Table 28 presents comparisons between the first-layer models and stacking.

By observing the results presented in Table 28 and comparing them with the results presented in Table 21, Table 22, Table 23 and Table 24, one can find that stacking is significantly similar to the first-layer models that present the best results. Datasets D2 and D4 showed similarities between Stacking and more of the first-layer models, which may be due to the high variance values presented by the metrics in this approach. To evaluate the variance of the evaluation metric results for the models in the first layer and for stacking, boxplots were generated; the results are presented in Figure 13, Figure 14, Figure 15 and Figure 16.

In the executions carried out for the second approach with the four datasets, the variance of the evaluation metric results shows that the first-layer models MLP and ELM have higher variance values. Observing Figure 13, Figure 14, Figure 15 and Figure 16, it is clear that the boxplots of the MLP and ELM contain outlier points, which shows that these models presented low precision. The variances of the metrics for the stacking results are low for the executions with datasets D1, D3, and D4 and higher for dataset D2, taking as a reference the lowest variance among the models. The SVM, KNN, and DT models presented lower variance values.

To analyze the influence of each method on the final stacking prediction, Table 29 presents the average regression coefficients (α) associated with each first-layer method and the intercept term (β) for the executions performed for each dataset.

The metamodel coefficients in this approach were obtained through optimization using the PSO algorithm. By comparing the results presented in Table 29 and in Table 21, Table 22, Table 23 and Table 24, one can verify that there is no direct relationship between the participation of the first-layer models in the stacking prediction and the results of the evaluation metrics of each model. Taking the results for dataset D1 as an example, when examining Table 21, one can see that the SVM model yields the best individual results. However, its participation in the stacking prediction is smaller than that of ELM, which, in turn, is only the third-best individual model when comparing evaluation metrics.

3.3. Comparison Between ST-1 and ST-2

By analyzing the results of the evaluation of first-layer models and stacking using the data presented in Table 10, Table 11, Table 12, Table 13 and Table 21, Table 22, Table 23, Table 24, one can observe that, in general, the results presented by ST-1 are better on average. There are only two exceptions to the previous statement. The KNN model in ST-2, for simulations with dataset D1, yields better MAPE values than the ST-1 model, similar to what occurs for simulations with dataset D3, where the DT model yields better MAPE and RMSE values. According to Table 10, Table 11, Table 12, Table 13 and Table 21, Table 22, Table 23, Table 24, both approaches yield satisfactory results in the use of stacking, since the results of the evaluation metrics for the stacking predictions are the best, compared to those of the individual models, or are statistically equal to those of the first-layer model, which yields the best results for the evaluation metrics.

The approaches can also be compared by analyzing the contribution of the first-layer models to the final stacking prediction. ST-1 showed that the individual results of the first-layer models were directly related to their participation in the final stacking prediction: the better a model’s result was, the greater its participation was, which allows us to infer that the adjustment of the metamodel in ST-1 was satisfactory. In ST-2, there is no direct relationship, or any other identifiable relationship, between the individual results of the first-layer models and the final result of stacking. This observation does not make it possible to define the fit of the metamodel as unsatisfactory, but it can be seen as an indicator of the greater complexity of adjusting the metamodel in ST-2.

3.4. Limitations of the Proposed Approach and Possibilities for Future Works

The results and the analysis presented in this paper consistently show that the combination of stacking and PSO contributes to developing an automated and precise computational method. However, some limitations of the study can be cited. Factors such as the datasets used for training and the types of models used in this study restrict the generalization of the findings. More specifically, the stacking technique depends on the quality and diversity of the concrete mixtures used; however, in the current study, only three types of concrete mixtures were evaluated. Stacking integrates various models, which adds complexity to the prediction process. This can make it harder to understand an essential component of interpretation, namely how each model contributes to the stacking strategy, and makes the method less transparent than other machine learning approaches. Although stacking can incorporate a variety of machine learning models, the diversity of models that could be employed in the stacking ensemble was not fully covered in this study. Furthermore, the selection of the base models is one of the most important factors for achieving the best results; future work could use a wider range of model classes.

This work can be further extended with more advanced machine learning models, such as deep learning. These models are better suited to capture complex and potentially non-linear relationships and can also accommodate high-dimensional and time-varying data more effectively, which may enhance the predictive accuracy of compressive strength for concrete mixtures. Data augmentation techniques, such as synthetic data generation, may also be helpful in overcoming the challenges of smaller datasets and improving model robustness. Transfer learning, where a model is trained to predict strengths of one type of concrete and then applied to predict strengths of others, may allow stacking techniques to generalize to concrete mixtures outside of those provided in training. This would allow for much more scalable and generalizable models.

4. Conclusions

This work investigated the ability of the stacking learning method to predict the compressive strength of concrete specimens with different characteristics. Stacking was used in conjunction with the PSO optimization algorithm, to optimize the parameters of the machine learning methods present in the first layer of the stacking model, and the performance of the linear regression model used as a metamodel was improved in the second layer. Five different machine learning methods were used in the first layer: MLP, SVM, ELM, KNN, and DT. The models were trained and validated using the K-fold cross-validation method. The training, validation, and calculation of the evaluation metrics were carried out over the course of 35 independent runs to enable a statistical analysis of the results. Two approaches were proposed in this work to optimize the parameters of the first-layer methods and adjust the metamodel parameters. The results achieved in ST-1 indicated a good performance for stacking in predicting compressive strength.

Stacking delivered good results on average, with a MAPE of approximately 11% in the worst case; this result was obtained for dataset D3. However, for dataset D1, the result was 2%.
For the stacking models, the R² values were approximately 0.85 in the worst case, which was, again, dataset D3; for the other datasets, the values were above 0.95.
The statistical tests carried out, using the results of the evaluation metrics, indicated that stacking gives results as good as the best first-layer method.

For ST-2, the results differ; however, from a qualitative point of view, the two approaches present similar results.

Among the results for ST-2, stacking presented a worst-case average MAPE of approximately 15%, in this case, for dataset D3.
The best result was 2.4%, which was obtained for dataset D1.
The result for R² was 0.79 in the worst case, namely for dataset D3; for the other bases, the results exceeded 0.9.
For this approach, the stacking results were also statistically similar to the results obtained by the best first-layer model.
By analyzing the variance of the metric results, one can verify that, similarly to ST-1, stacking yields better accuracy than first-layer models.
The first-layer models that presented the best results were the SVM and the MLP.
These results were statistically similar to those for stacking in both ST-1 and ST-2.

Therefore, it can be concluded that the computational framework created by stacking and PSO was efficient and promising.

Further investigations should employ feature selection techniques, including boruta feature selection (BFS) and recursive feature elimination (RFE). Additionally, metaheuristic algorithms, such as gray wolf optimization (GWO), artificial bee colony (ABC), and natural exponential differential evolution (DE), should be implemented to facilitate comparative analysis and the identification of optimal methodologies, in order to guide the search for increasingly improved predictions.

Author Contributions

Conceptualization: T.H.A.B., L.G. and G.F.N.; methodology: T.H.A.B., T.S.G., M.B. and G.F.N.; software: T.B., B.d.S.M. and G.F.N.; methodology: T.H.A.B., T.S.G., M.B. and L.G.; validation: T.H.A.B., M.B. and C.S.; formal analysis: B.d.S.M., T.H.A.B. and L.G.; investigation: B.d.S.M., T.H.A.B., C.S., T.S.G. and M.B.; resources: C.S. and L.G.; funding: L.G., C.S., T.S.G. and M.B.; data curation: T.H.A.B., G.F.N. and B.d.S.M.; writing—original draft: G.F.N.; writing—review and editing: T.H.A.B., C.S., M.B. and L.G.; supervision: L.G. All authors have read and agreed to the published version of the manuscript.

Funding

The authors acknowledge the financial support provided by the funding agencies CNPq (grants 401796/2021-3, 307688/2022-4, and 409433/2022-5), Fapemig (grants APQ-02513-22, APQ-04458-23 and BPD-00083-22), FINEP (grant SOS Equipamentos 2021 AV02 0062/22), and Capes (Finance Code 001).

Data Availability Statement

Data and materials can be obtained upon request from the authors.

Conflicts of Interest

The corresponding author, on behalf of all the authors, declares that there are no conflicts of interest.

References

US Geological Survey. Mineral Commodity Summaries, 2023; Government Printing Office: Reston, VA, USA, 2023.
Mohtasham Moein, M.; Saradar, A.; Rahmati, K.; Ghasemzadeh Mousavinejad, S.H.; Bristow, J.; Aramali, V.; Karakouzian, M. Predictive models for concrete properties using machine learning and deep learning approaches: A review. J. Build. Eng. 2023, 63, 105444.
Zhang, J.; Li, D.; Wang, Y. Predicting uniaxial compressive strength of oil palm shell concrete using a hybrid artificial intelligence model. J. Build. Eng. 2020, 30, 101282.
Kumar, P.; Arora, H.C.; Bahrami, A.; Kumar, A.; Kumar, K. Development of a Reliable Machine Learning Model to Predict Compressive Strength of FRP-Confined Concrete Cylinders. Buildings 2023, 13, 931.
Gao, T.; Shen, L.; Shen, M.; Liu, L.; Chen, F.; Gao, L. Evolution and projection of CO2 emissions for China’s cement industry from 1980 to 2020. Renew. Sustain. Energy Rev. 2017, 74, 522–537.
Shen, W.; Liu, Y.; Yan, B.; Wang, J.; He, P.; Zhou, C.; Huo, X.; Zhang, W.; Xu, G.; Ding, Q. Cement industry of China: Driving force, environment impact and sustainable development. Renew. Sustain. Energy Rev. 2017, 75, 618–628.
Wu, X.; Yan, G.; Zhang, W.; Bao, Y. Prediction of Compressive Strength of High-performance Concrete Using Multi-layer Perceptron. J. Appl. Sci. Eng. 2024, 27, 2719–2733.
Neville, A.M. Propriedades do Concreto-5ª Edição; Bookman: Rio de Janeiro, Brazil, 2015.
Karim, R.; Islam, M.H.; Datta, S.D.; Kashem, A. Synergistic effects of supplementary cementitious materials and compressive strength prediction of concrete using machine learning algorithms with SHAP and PDP analyses. Case Stud. Constr. Mater. 2024, 20, e02828.
Farooq, F.; Czarnecki, S.; Niewiadomski, P.; Aslam, F.; Alabduljabbar, H.; Ostrowski, K.A.; Śliwa Wieczorek, K.; Nowobilski, T.; Malazdrewicz, S. A comparative study for the prediction of the compressive strength of self-compacting concrete modified with fly ash. Materials 2021, 14, 4934.
Yang, D.; Xu, P.; Zaman, A.; Alomayri, T.; Houda, M.; Alaskar, A.; Javed, M.F. Compressive strength prediction of concrete blended with carbon nanotubes using gene expression programming and random forest: Hyper-tuning and optimization. J. Mater. Res. Technol. 2023, 24, 7198–7218.
Endzhievskaya, I.; Endzhievskiy, A.; Galkin, M.; Molokeev, M. Machine learning methods in assessing the effect of mixture composition on the physical and mechanical characteristics of road concrete. J. Build. Eng. 2023, 76, 107248.
Davawala, M.; Joshi, T.; Shah, M. Compressive strength prediction of high-strength concrete using machine learning. Emergent Mater. 2023, 6, 321–335.
Chen, H.; Li, X.; Wu, Y.; Zuo, L.; Lu, M.; Zhou, Y. Compressive Strength Prediction of High-Strength Concrete Using Long Short-Term Memory and Machine Learning Algorithms. Buildings 2022, 12, 302.
Jiao, H.; Wang, Y.; Li, L.; Arif, K.; Farooq, F.; Alaskar, A. A novel approach in forecasting compressive strength of concrete with carbon nanotubes as nanomaterials. Mater. Today Commun. 2023, 35, 106335.
Migallón, V.; Penadés, H.; Penadés, J.; Tenza-Abril, A.J. A Machine Learning Approach to Prediction of the Compressive Strength of Segregated Lightweight Aggregate Concretes Using Ultrasonic Pulse Velocity. Appl. Sci. 2023, 13, 1953.
Günaydın, O.; Akbaş, E.; Özbeyaz, A.; Güçlüer, K. Machine learning based evaluation of concrete strength from saturated to dry by non-destructive methods. J. Build. Eng. 2023, 76, 107174.
Albaijan, I.; Mahmoodzadeh, A.; Hussein Mohammed, A.; Fakhri, D.; Hashim Ibrahim, H.; Mohamed Elhadi, K. Optimal machine learning-based method for gauging compressive strength of nanosilica-reinforced concrete. Eng. Fract. Mech. 2023, 291, 109560.
Goliatt, L.; Yaseen, Z.M. Development of a hybrid computational intelligent model for daily global solar radiation prediction. Expert Syst. Appl. 2023, 212, 118295.
Ikram, R.M.A.; Goliatt, L.; Kisi, O.; Trajkovic, S.; Shahid, S. Covariance Matrix Adaptation Evolution Strategy for Improving Machine Learning Approaches in Streamflow Prediction. Mathematics 2022, 10, 2971.
Song, H.; Ahmad, A.; Farooq, F.; Ostrowski, K.A.; Maślak, M.; Czarnecki, S.; Aslam, F. Predicting the compressive strength of concrete with fly ash admixture using machine learning algorithms. Constr. Build. Mater. 2021, 308, 125021.
Kumar, A.; Arora, H.C.; Kapoor, N.R.; Mohammed, M.A.; Kumar, K.; Majumdar, A.; Thinnukool, O. Compressive Strength Prediction of Lightweight Concrete: Machine Learning Models. Sustainability 2022, 14, 2404.
Wang, S.; Xia, P.; Chen, K.; Gong, F.; Wang, H.; Wang, Q.; Zhao, Y.; Jin, W. Prediction and optimization model of sustainable concrete properties using machine learning, deep learning and swarm intelligence: A review. J. Build. Eng. 2023, 80, 108065.
Abd, A.M.; Abd, S.M. Modelling the strength of lightweight foamed concrete using support vector machine (SVM). Case Stud. Constr. Mater. 2017, 6, 8–15.
Ahmadi-Nedushan, B. An optimized instance-based learning algorithm for estimation of compressive strength of concrete. Eng. Appl. Artif. Intell. 2012, 25, 1073–1081.
Al-Shamiri, A.K.; Kim, J.H.; Yuan, T.F.; Yoon, Y.S. Modeling the compressive strength of high-strength concrete: An extreme learning approach. Constr. Build. Mater. 2019, 208, 204–219.
Behnood, A.; Behnood, V.; Gharehveran, M.M.; Alyamac, K.E. Prediction of the compressive strength of normal and high-performance concretes using M5P model tree algorithm. Constr. Build. Mater. 2017, 142, 199–207.
Gilan, S.S.; Jovein, H.B.; Ramezanianpour, A.A. Hybrid support vector regression–Particle swarm optimization for prediction of compressive strength and RCPT of concretes containing metakaolin. Constr. Build. Mater. 2012, 34, 321–329.
Qi, C.; Fourie, A.; Chen, Q. Neural network and particle swarm optimization for predicting the unconfined compressive strength of cemented paste backfill. Constr. Build. Mater. 2018, 159, 473–478.
Ly, H.B.; Nguyen, M.H.; Pham, B.T. Metaheuristic optimization of Levenberg–Marquardt-based artificial neural network using particle swarm optimization for prediction of foamed concrete compressive strength. Neural Comput. Appl. 2021, 33, 17331–17351.
Huang, J.; Sabri, M.M.S.; Ulrikh, D.V.; Ahmad, M.; Alsaffar, K.A.M. Predicting the Compressive Strength of the Cement-Fly Ash–Slag Ternary Concrete Using the Firefly Algorithm (FA) and Random Forest (RF) Hybrid Machine-Learning Method. Materials 2022, 15, 4193.
Zhu, W.; Huang, L.; Zhang, Z. Novel hybrid AOA and ALO optimized supervised machine learning approaches to predict the compressive strength of admixed concrete containing fly ash and micro-silica. Multiscale Multidiscip. Model. Exp. Des. 2022, 5, 391–402.
L’heureux, A.; Grolinger, K.; Elyamany, H.F.; Capretz, M.A. Machine learning with big data: Challenges and approaches. Ieee Access 2017, 5, 7776–7797.
Ting, K.M.; Witten, I.H. Issues in stacked generalization. J. Artif. Intell. Res. 1999, 10, 271–289.
Goliatt, L.; Saporetti, C.; Pereira, E. Super learner approach to predict total organic carbon using stacking machine learning models based on well logs. Fuel 2023, 353, 128682.
Reeves, C.R. Genetic algorithms. In Handbook of Metaheuristics; Springer: New York, NY, USA, 2010; pp. 109–139.
Koessler, E.; Almomani, A. Hybrid particle swarm optimization and pattern search algorithm. Optim. Eng. 2021, 22, 1539–1555.
Innocente, M.S.; Torres, L.; Cahis, X.; Barbeta, G.; Catalan, A. Optimal flexural design of frp-reinforced concrete beams using a particle swarm optimizer. arXiv 2021, arXiv:2101.09974.
Wahab, S.; Mahmoudabadi, N.S.; Waqas, S.; Herl, N.; Iqbal, M.; Alam, K.; Ahmad, A. Comparative Analysis of Shear Strength Prediction Models for Reinforced Concrete Slab–Column Connections. Adv. Civ. Eng. 2024, 2024, 1784088. [Google Scholar] [CrossRef]
Lim, C.H.; Yoon, Y.S.; Kim, J.H. Genetic algorithm in mix proportioning of high-performance concrete. Cem. Concr. Res. 2004, 34, 409–420. [Google Scholar] [CrossRef]
Pala, M.; Özbay, E.; Öztaş, A.; Yuce, M.I. Appraisal of long-term effects of fly ash and silica fume on compressive strength of concrete by neural networks. Constr. Build. Mater. 2007, 21, 384–394. [Google Scholar] [CrossRef]
Siddique, R.; Aggarwal, P.; Aggarwal, Y. Prediction of compressive strength of self-compacting concrete containing bottom ash using artificial neural networks. Adv. Eng. Softw. 2011, 42, 780–786. [Google Scholar] [CrossRef]
Kröse, B.; van der Smagt, P. An Introduction to Neural Networks; The University of Amsterdam: Amsterdam, The Netherlands, 1993. [Google Scholar]
Breiman, L.; Friedman, J.; Stone, C.J.; Olshen, R.A. Classification and Regression Trees; CRC Press: Boca Raton, FL, USA, 1984. [Google Scholar]
Hastie, T.; Tibshirani, R.; Friedman, J. The Elements of Statistical Learning: Data Mining, Inference, and Prediction; Springer Science & Business Medi: Berlin/Heidelberg, Germany, 2009. [Google Scholar]
Zhu, Q.Y.; Qin, A.K.; Suganthan, P.N.; Huang, G.B. Evolutionary extreme learning machine. Pattern Recognit. 2005, 38, 1759–1763. [Google Scholar] [CrossRef]
Aha, D.W.; Kibler, D.; Albert, M.K. Instance-based learning algorithms. Mach. Learn. 1991, 6, 37–66. [Google Scholar] [CrossRef]
Vapnik, V. The Nature of Statistical Learning Theory; Springer: Berlin/Heidelberg, Germany, 1995. [Google Scholar]
Vapnik, V. Statistical Learning Theory; Wiley-Interscience: Hoboken, NJ, USA, 1998. [Google Scholar]
Ahmad, A.; Farooq, F.; Niewiadomski, P.; Ostrowski, K.; Akbar, A.; Aslam, F.; Alyousef, R. Prediction of compressive strength of fly ash based concrete using individual and ensemble algorithm. Materials 2021, 14, 794. [Google Scholar] [CrossRef]
Song, Y.; Zhao, J.; Ostrowski, K.A.; Javed, M.F.; Ahmad, A.; Khan, M.I.; Aslam, F.; Kinasz, R. Prediction of compressive strength of fly-ash-based concrete using ensemble and non-ensemble supervised machine-learning approaches. Appl. Sci. 2022, 12, 361. [Google Scholar] [CrossRef]
Tipu, R.K.; Suman; Batra, V. Development of a hybrid stacked machine learning model for predicting compressive strength of high-performance concrete. Asian J. Civ. Eng. 2023, 24, 2985–3000. [Google Scholar] [CrossRef]
Xu, Y.; Ahmad, W.; Ahmad, A.; Ostrowski, K.A.; Dudek, M.; Aslam, F.; Joyklad, P. Computation of high-performance concrete compressive strength using standalone and ensembled machine learning techniques. Materials 2021, 14, 7034. [Google Scholar] [CrossRef] [PubMed]
Freund, Y. Boosting a weak learning algorithm by majority. Inf. Comput. 1995, 121, 256–285. [Google Scholar] [CrossRef]
Wolpert, D.H. Stacked generalization. Neural Netw. 1992, 5, 241–259. [Google Scholar] [CrossRef]
Shafighfard, T.; Bagherzadeh, F.; Rizi, R.A.; Yoo, D.Y. Data-driven compressive strength prediction of steel fiber reinforced concrete (SFRC) subjected to elevated temperatures using stacked machine learning algorithms. J. Mater. Res. Technol. 2022, 21, 3777–3794. [Google Scholar] [CrossRef]
Eberhart, R.; Kennedy, J. A new optimizer using particle swarm theory. In Proceedings of the MHS’95, Sixth International Symposium on Micro Machine and Human Science, Nagoya, Japan, 4–6 October 1995; IEEE: Piscataway, NJ, USA, 1995; pp. 39–43. [Google Scholar]
Kohavi, R. A study of cross-validation and bootstrap for accuracy estimation and model selection. In Proceedings of the IJCAI, Montreal, QC, Canada, 20–25 August 1995; Volume 14, pp. 1137–1145. [Google Scholar]
Ghasemi, A.; Zahediasl, S. Normality tests for statistical analysis: A guide for non-statisticians. Int. J. Endocrinol. Metab. 2012, 10, 486. [Google Scholar] [CrossRef]
Whitnall, C.; Oswald, E.; Mather, L. An exploration of the kolmogorov-smirnov test as a competitor to mutual information analysis. In Proceedings of the International Conference on Smart Card Research and Advanced Applications, Leuven, Belgium, 14–16 September 2011; Springer: Berlin/Heidelberg, Germany, 2011; pp. 234–251. [Google Scholar]
Vargha, A.; Delaney, H.D. The Kruskal–Wallis test and stochastic homogeneity. J. Educ. Behav. Stat. 1998, 23, 170–192. [Google Scholar] [CrossRef]
Dinno, A. Nonparametric pairwise multiple comparisons in independent groups using Dunn’s test. Stata J. 2015, 15, 292–300. [Google Scholar] [CrossRef]
Wes McKinney. Data Structures for Statistical Computing in Python. In Proceedings of the 9th Python in Science Conference. Austin, TX, USA, 28 June–3 July 2010; pp. 56–61. [CrossRef]
Harris, C.R.; Millman, K.J.; van der Walt, S.J.; Gommers, R.; Virtanen, P.; Cournapeau, D.; Wieser, E.; Taylor, J.; Berg, S.; Smith, N.J.; et al. Array programming with NumPy. Nature 2020, 585, 357–362. [Google Scholar] [CrossRef] [PubMed]
Hunter, J.D. Matplotlib: A 2D graphics environment. Comput. Sci. Eng. 2007, 9, 90–95. [Google Scholar] [CrossRef]
Waskom, M.L. seaborn: Statistical data visualization. J. Open Source Softw. 2021, 6, 3021. [Google Scholar] [CrossRef]
Pedregosa, F.; Varoquaux, G.; Gramfort, A.; Michel, V.; Thirion, B.; Grisel, O.; Blondel, M.; Prettenhofer, P.; Weiss, R.; Dubourg, V.; et al. Scikit-learn: Machine Learning in Python. J. Mach. Learn. Res. 2011, 12, 2825–2830. [Google Scholar]
Virtanen, P.; Gommers, R.; Oliphant, T.E.; Haberland, M.; Reddy, T.; Cournapeau, D.; Burovski, E.; Peterson, P.; Weckesser, W.; Bright, J.; et al. SciPy 1.0: Fundamental Algorithms for Scientific Computing in Python. Nat. Methods 2020, 17, 261–272. [Google Scholar] [CrossRef] [PubMed]

Figure 1. Schematic model of the stacking architecture.

Figure 2. Correlation matrix for dataset D1.

Figure 3. Correlation matrix for dataset D2.

Figure 4. Correlation matrix for dataset D3.

Figure 5. Correlation matrix for dataset D4.

Figure 6. Model of an artificial neuron.

Figure 7. Layered architecture—stacking.

Figure 8. Stacking topology approaches.

Figure 9. Metric boxplots for dataset D1.

Figure 10. Metric boxplots for dataset D2.

Figure 11. Metric boxplots for dataset D3.

Figure 12. Metric boxplots for dataset D4.

Figure 13. Metric boxplots for dataset D1.

Figure 14. Metric boxplots for dataset D2.

Figure 15. Metric boxplots for dataset D3.

Figure 16. Metric boxplots for dataset D4.

Table 1. Mixture components dataset D1.

Parameters	Minimum	Maximum
Water–cement (w/c) (%)	30	45
Water (W) (kg/m³)	160	180
Sand–aggregate ratio (s/a) (%)	37	53
Fly ash (FA) (%)	0	20
Superplasticizer (SP) (kg/m³)	1.89	8.5
Air entrainer (AE) (kg/m³)	36	78
Compressive strength (MPa)	38	74

Table 2. Mixture components dataset D2.

Parameters	Minimum	Maximum
Water (W) (kg/m³)	150	205
Sand–aggregate ratio (sa) (kg/m³)	536	724
Fly ash (FA) (%)	0	55
Active silica (SF) (%)	0	5
Total cementitious materials (TCMs) (kg/m³)	400	500
Coarse aggregate (ca) (kg/m³)	1086	1157
Naphthalene-based water reducing mixture (HRWRA) (L/m³)	0	13
Curing days (D)	3	180
Compressive strength (MPa)	24	107.8

Table 3. Mixture components dataset D3.

Parameters	Minimum	Maximum
Cement (kg/m³)	160	427
Fly ash (FA) (%)	0	261
Water–cement (w/c)	0.33	0.87
Superplasticizer (SP) (%)	0	1
Sand (kg/m³)	478	1079
Coarse aggregate (CA) (kg/m³)	621	923
Compressive strength (MPa)	10.2	73.5

Table 4. Mixture components dataset D4.

Parameters	Minimum	Maximum
Water (W) (kg/m³)	160	180
Cement (C) (kg/m³)	284	600
Fine aggregate (FA) (kg/m³)	552	951
Coarse aggregate (CA) (kg/m³)	845	989
Superplasticizer (SP) (kg/m³)	0	2
Compressive strength (MPa)	37.5	73.60

Table 5. Activation functions of MLP.

Name	Function
Linear	$ϕ(v) = v$
Rectified linear	$ϕ(v) = \max(0, v)$
Logistic	$ϕ(v) = \frac{1}{1 + e^{-v}}$

Table 6. Activation functions for ELM.

Name	Function
Linear	$H(x) = ax + b$
Rectified linear	$H(x) = \begin{cases} 0 & \text{if } x < 0 \\ ax + b & \text{if } x \geq 0 \end{cases}$
Logistic	$H(x) = \frac{1}{1 + e^{-(ax + b)}}$
Gaussian	$H(x) = e^{-(ax + b)^{2}}$
Multiquadric	$H(x) = (\|a - x\| + b^{2})^{\frac{1}{2}}$
Inverse multiquadric	$H(x) = 1 / (\|a - x\| + b^{2})^{\frac{1}{2}}$
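
The activation functions listed in Tables 5 and 6 can be written compactly in code. The following is a minimal Python sketch, assuming scalar inputs and scalar parameters a and b; the function names and default parameter values are illustrative assumptions and do not reproduce the authors' implementation. With a = 1 and b = 0, the first three functions reduce to the MLP activations of Table 5.

```python
import math


def linear(x, a=1.0, b=0.0):
    """Linear activation: a*x + b."""
    return a * x + b


def rectified_linear(x, a=1.0, b=0.0):
    """Rectified linear as written in Table 6: 0 for x < 0, a*x + b otherwise."""
    return 0.0 if x < 0 else a * x + b


def logistic(x, a=1.0, b=0.0):
    """Logistic (sigmoid) activation applied to a*x + b."""
    return 1.0 / (1.0 + math.exp(-(a * x + b)))


def gaussian(x, a=1.0, b=0.0):
    """Gaussian activation: exp(-(a*x + b)^2)."""
    return math.exp(-((a * x + b) ** 2))


def multiquadric(x, a=0.0, b=1.0):
    """Multiquadric in the form given in Table 6: (|a - x| + b^2)^(1/2)."""
    return math.sqrt(abs(a - x) + b ** 2)


def inverse_multiquadric(x, a=0.0, b=1.0):
    """Reciprocal of the multiquadric activation."""
    return 1.0 / multiquadric(x, a, b)
```

In an ELM the parameters a and b of each hidden neuron are typically drawn at random and kept fixed, so only the choice of function and the number of neurons remain to be tuned.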

Table 7. PSO parameters—first approach.

Parameters	Value
Population (SVM)	40
Population (ELM)	20
Population (MLP)	15
Population (KNN)	15
Population (DT)	15
Number of iterations	35
$ω$	0.6
$ϕ_{g}$	0.5
$ϕ_{p}$	0.5
Objective function	MSE
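
For reference, the parameters in Table 7 map onto the usual PSO velocity update, v ← ω v + ϕ_p r_p (p − x) + ϕ_g r_g (g − x), where p is a particle's best position, g is the swarm's best position, and r_p, r_g are uniform random numbers. The sketch below is a generic PSO in plain Python under that assumption. The function name `pso_minimize`, the clipping of positions to the bounds, the zero initial velocities, and the toy sphere objective are all illustrative choices, not the authors' code; in the paper the objective is the MSE of a model.

```python
import random


def pso_minimize(objective, bounds, n_particles=15, n_iterations=35,
                 omega=0.6, phi_p=0.5, phi_g=0.5, seed=0):
    """Minimize `objective` over a box; returns (best_position, best_value)."""
    rng = random.Random(seed)
    dim = len(bounds)
    # Random initial positions inside the bounds, zero initial velocities.
    pos = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    best_pos = [p[:] for p in pos]              # personal best positions
    best_val = [objective(p) for p in pos]      # personal best values
    g = min(range(n_particles), key=best_val.__getitem__)
    g_pos, g_val = best_pos[g][:], best_val[g]  # global best so far

    for _ in range(n_iterations):
        for i in range(n_particles):
            for d, (lo, hi) in enumerate(bounds):
                r_p, r_g = rng.random(), rng.random()
                vel[i][d] = (omega * vel[i][d]
                             + phi_p * r_p * (best_pos[i][d] - pos[i][d])
                             + phi_g * r_g * (g_pos[d] - pos[i][d]))
                # Keep the particle inside the search box.
                pos[i][d] = min(max(pos[i][d] + vel[i][d], lo), hi)
            val = objective(pos[i])
            if val < best_val[i]:
                best_pos[i], best_val[i] = pos[i][:], val
                if val < g_val:
                    g_pos, g_val = pos[i][:], val
    return g_pos, g_val


# Toy usage: minimize the sphere function on [-5, 5]^2.
best, value = pso_minimize(lambda p: sum(x * x for x in p), [(-5.0, 5.0)] * 2)
```

The defaults mirror the Table 7 values for a population of 15; a hyperparameter search would replace the sphere function with a cross-validated error and the box with the parameter ranges being tuned.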

Table 8. Parameters to be adjusted using PSO.

Model	Parameter	Variation
SVM	Regression accuracy ($ϵ$)	$[10^{-6}, 10^{-4}]$
	Regularization parameter ($C$)	$[1, 10^{2}]$
	Kernel coefficient ($γ$)	$[10^{-2}, 10]$
MLP	Learning rate ($η$)	$[10^{-5}, 1]$
	Regularization term ($α$)	$[10^{-5}, 10^{-2}]$
	Activation function	Linear, rectified linear, and logistic
	N° of hidden layers	1, 2, or 3
	N° of neurons in each layer	[1, 100]
ELM	N° of neurons in the hidden layer	[1, 50]
	Activation function	Linear, rectified linear, logistic, Gaussian, multiquadric, and inverse multiquadric
DT	Maximum depth	[5, 50]
	Minimum n° of samples for node splitting	[2, 3]
	Minimum n° of samples per leaf	[1, 4]
KNN	N° of neighbors	[3, 10]

Table 9. PSO parameters for ST-2.

Parameters	Value
Population	100
Maximum n° of iterations	70
$ω$	0.6
$ϕ_{g}$	0.7
$ϕ_{p}$	0.7
Objective function	MSE

Table 10. Assessment metrics results for dataset D1.

Output	MAPE	RMSE	$R^{2}$
SVM	2.021 (0.141)	1.302 (0.105)	0.981 (0.003)
MLP	2.135 (0.179)	1.503 (0.130)	0.974 (0.005)
ELM	2.506 (0.199)	1.714 (0.162)	0.966 (0.007)
KNN	3.608 (0.212)	2.440 (0.163)	0.932 (0.009)
DT	3.120 (0.284)	2.286 (0.266)	0.940 (0.014)
Stacking	1.957 (0.110)	1.306 (0.091)	0.981 (0.003)
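
The three columns in Tables 10–13 are standard regression metrics. A minimal sketch of how they are commonly computed is given below, assuming that MAPE is expressed in percent and that each table entry is a mean over repeated runs with the standard deviation in parentheses (the latter is an assumption about the table layout). The function names are illustrative.

```python
import math
from statistics import mean, stdev


def mape(y_true, y_pred):
    """Mean absolute percentage error, in percent."""
    return 100.0 * mean(abs((t - p) / t) for t, p in zip(y_true, y_pred))


def rmse(y_true, y_pred):
    """Root mean squared error, in the units of the target (MPa here)."""
    return math.sqrt(mean((t - p) ** 2 for t, p in zip(y_true, y_pred)))


def r2(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    y_bar = mean(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - y_bar) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def summarize(values):
    """Format repeated-run scores as 'mean (std)'."""
    return f"{mean(values):.3f} ({stdev(values):.3f})"
```

Lower MAPE and RMSE and higher R² indicate better predictions, which is how the rows of these tables are compared.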

Table 11. Assessment metrics results for dataset D2.

Output	MAPE	RMSE	$R^{2}$
SVM	11.288 (0.713)	6.379 (0.462)	0.927 (0.011)
MLP	8.703 (0.553)	4.567 (0.309)	0.963 (0.005)
ELM	13.376 (0.776)	7.695 (0.763)	0.893 (0.022)
KNN	25.543 (1.440)	11.990 (0.716)	0.743 (0.031)
DT	13.730 (1.268)	7.551 (0.569)	0.898 (0.016)
Stacking	8.137 (0.559)	4.344 (0.279)	0.966 (0.004)

Table 12. Assessment metrics results for dataset D3.

Output	MAPE	RMSE	$R^{2}$
SVM	11.360 (0.919)	5.209 (0.369)	0.862 (0.020)
MLP	13.427 (1.002)	6.159 (0.664)	0.805 (0.044)
ELM	13.354 (1.358)	6.101 (0.760)	0.808 (0.051)
KNN	22.745 (1.156)	8.337 (0.475)	0.646 (0.041)
DT	26.019 (3.875)	11.008 (1.636)	0.372 (0.200)
Stacking	11.766 (0.813)	5.343 (0.362)	0.855 (0.020)

Table 13. Assessment metrics results for dataset D4.

Output	MAPE	RMSE	$R^{2}$
SVM	5.304 (0.437)	3.320 (0.316)	0.961 (0.008)
MLP	5.846 (0.590)	3.595 (0.346)	0.954 (0.009)
ELM	5.865 (0.494)	3.561 (0.495)	0.955 (0.014)
KNN	9.467 (0.758)	6.314 (0.642)	0.859 (0.029)
DT	9.479 (1.977)	6.520 (1.649)	0.842 (0.085)
Stacking	5.504 (0.258)	3.350 (0.310)	0.960 (0.009)

Table 14. Shapiro–Wilk test p-values for ST-1.

Method	p-Value
	Dataset D1			Dataset D2
	MAPE	RMSE	$R^{2}$	MAPE	RMSE	$R^{2}$
SVM	0.243	0.458	0.272	0.009	0.004	0.001
MLP	0.028	0.061	0.009	0.659	0.197	0.097
ELM	0.023	0.002	0.000	0.930	0.014	0.002
KNN	0.881	0.296	0.363	0.605	0.287	0.141
DT	0.544	0.618	0.229	0.000	0.001	0.000
Stacking	0.647	0.560	0.505	0.127	0.421	0.187
	Dataset D3			Dataset D4
	MAPE	RMSE	$R^{2}$	MAPE	RMSE	$R^{2}$
SVM	0.279	0.061	0.017	0.771	0.028	0.004
MLP	0.708	0.003	0.000	0.123	0.240	0.042
ELM	0.003	0.000	0.000	0.002	0.000	0.000
KNN	0.979	0.462	0.246	0.219	0.165	0.039
DT	0.084	0.013	0.000	0.031	0.022	0.000
Stacking	0.892	0.373	0.162	0.142	0.000	0.000

Table 15. Lilliefors test p-values for ST-1.

Method	p-Value
	Dataset D1			Dataset D2
	MAPE	RMSE	$R^{2}$	MAPE	RMSE	$R^{2}$
SVM	0.110	0.200	0.200	0.200	0.010	0.003
MLP	0.200	0.200	0.200	0.200	0.069	0.137
ELM	0.047	0.049	0.015	0.200	0.109	0.028
KNN	0.200	0.200	0.200	0.200	0.200	0.200
DT	0.200	0.133	0.036	0.014	0.076	0.026
Stacking	0.200	0.200	0.200	0.200	0.200	0.200
	Dataset D3			Dataset D4
	MAPE	RMSE	$R^{2}$	MAPE	RMSE	$R^{2}$
SVM	0.049	0.200	0.166	0.200	0.200	0.118
MLP	0.200	0.038	0.022	0.127	0.062	0.037
ELM	0.024	0.003	0.000	0.002	0.000	0.000
KNN	0.200	0.200	0.200	0.200	0.039	0.012
DT	0.200	0.200	0.120	0.033	0.124	0.005
Stacking	0.200	0.200	0.200	0.200	0.000	0.000

Table 16. Kruskal–Wallis test p-values for ST-1.

Dataset	MAPE	RMSE	$R^{2}$
D1	≤0.0001	≤0.0001	≤0.0001
D2	≤0.0001	≤0.0001	≤0.0001
D3	≤0.0001	≤0.0001	≤0.0001
D4	≤0.0001	≤0.0001	≤0.0001
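
Table 16 reports Kruskal–Wallis p-values per dataset and metric. As a reminder of what the test computes, the following is a minimal pure-Python sketch of the H statistic, using average ranks for ties and the usual tie correction. The p-value is then read from a chi-squared distribution with k − 1 degrees of freedom for k groups (for example, `scipy.stats.kruskal` returns both the statistic and the p-value). The helper name is an illustrative assumption, and the sketch does not handle the degenerate case in which all observations are equal.

```python
from collections import defaultdict


def kruskal_wallis_h(*groups):
    """Kruskal-Wallis H statistic with average ranks and tie correction."""
    pooled = sorted(v for g in groups for v in g)
    n = len(pooled)
    # Collect the 1-based sorted positions of each distinct value.
    positions = defaultdict(list)
    for idx, v in enumerate(pooled, start=1):
        positions[v].append(idx)
    # Tied values share the average of their positions as rank.
    rank = {v: sum(p) / len(p) for v, p in positions.items()}
    # Sum over groups of (rank sum)^2 / group size.
    term = sum(sum(rank[v] for v in g) ** 2 / len(g) for g in groups)
    h = 12.0 / (n * (n + 1)) * term - 3.0 * (n + 1)
    # Tie correction: divide by 1 - sum(t^3 - t) / (n^3 - n).
    ties = sum(len(p) ** 3 - len(p) for p in positions.values())
    return h / (1.0 - ties / (n ** 3 - n))
```

A large H (small p-value) indicates that at least one group differs from the others, which is why a pairwise post hoc test such as Dunn's is applied afterwards.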

Table 17. Dunn test p-values comparing stacking with the first-layer models (ST-1).

Method	p-Value
	Dataset D1			Dataset D2
	MAPE	RMSE	$R^{2}$	MAPE	RMSE	$R^{2}$
SVM	0.363	0.948	0.948	0.000	0.000	0.000
MLP	0.015	0.001	0.000	0.196	0.302	0.302
ELM	0.000	0.000	0.000	0.000	0.000	0.000
KNN	0.000	0.000	0.000	0.000	0.000	0.000
DT	0.000	0.000	0.000	0.000	0.000	0.000
	Dataset D3			Dataset D4
	MAPE	RMSE	$R^{2}$	MAPE	RMSE	$R^{2}$
SVM	0.346	0.428	0.428	0.221	0.847	0.847
MLP	0.000	0.000	0.000	0.109	0.023	0.023
ELM	0.002	0.001	0.001	0.047	0.116	0.116
KNN	0.000	0.000	0.000	0.000	0.000	0.000
DT	0.000	0.000	0.000	0.000	0.000	0.000

Table 18. Best parameters for datasets D1, D2, D3, and D4 (ST-1).

Model	Parameters	Dataset D1	Dataset D2	Dataset D3	Dataset D4
DT	maximum depth	5	41	39	27
	minimum samples split	2	2	3	3
	minimum samples leaf	1	1	2	2
KNN	n° neighbors	6	3	3	3
MLP	initial learning rate	0.432421	0.679184	0.303970	0.032268
	alpha	0.008296	0.004813	0.007516	0.010000
	n° of hidden layers	2	2	1	2
	n° of neurons in each hidden layer	[9, 89]	[61, 42]	[62]	[99, 33]
	activation function	relu	relu	relu	relu
ELM	n° of neurons hidden layer	36	39	18	21
	activation function	sigmoid	multiquadric	multiquadric	sigmoid
SVM	C	4.095809	49.779083	13.479575	88.406859
	epsilon	0.000085	0.000049	0.000043	0.000091
	gamma	1.267587	0.833383	0.662144	0.753403

Table 19. Best model results (ST-1) for datasets D1, D2, D3, and D4, using the parameters shown in Table 18.

Performance Metrics	Dataset D1	Dataset D2	Dataset D3	Dataset D4
MAPE	1.7556	7.0966	11.5478	5.09369
RMSE	1.1335	3.6410	4.6654	2.99910
R²	0.9854	0.9763	0.8896	0.96859

Table 20. Metamodel coefficients for ST-1.

	Dataset D1	Dataset D2	Dataset D3	Dataset D4
$α_{S V M}$	0.582	0.141	0.575	0.327
$α_{M L P}$	0.235	0.646	0.103	0.290
$α_{E L M}$	0.107	0.045	0.223	0.318
$α_{K N N}$	0.032	0.006	0.083	0.012
$α_{D T}$	0.041	0.173	0.048	0.046
$β$	0.165	−0.788	−1.537	0.386
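
Table 20 suggests a linear metamodel in which the stacked prediction is a weighted sum of the first-layer predictions plus an intercept, ŷ = Σ_m α_m ŷ_m + β. The snippet below is a small sketch under that assumption, using the Dataset D1 coefficients from Table 20. The function name and the first-layer predictions in the example are made up for illustration and are not taken from the paper.

```python
# Dataset D1 coefficients from Table 20 (ST-1).
ALPHA_D1 = {"SVM": 0.582, "MLP": 0.235, "ELM": 0.107, "KNN": 0.032, "DT": 0.041}
BETA_D1 = 0.165


def stack_predict(base_predictions, alpha=ALPHA_D1, beta=BETA_D1):
    """Combine first-layer predictions (MPa) linearly: sum(alpha_m * y_m) + beta."""
    return sum(alpha[name] * base_predictions[name] for name in alpha) + beta


# Hypothetical first-layer outputs for one mixture (MPa), not from the paper.
example = {"SVM": 50.0, "MLP": 50.0, "ELM": 50.0, "KNN": 50.0, "DT": 50.0}
stacked = stack_predict(example)
```

Read this way, the coefficients show how much each first-layer model contributes: for D1 the SVM carries the largest weight (0.582), while KNN and DT contribute little.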

Table 21. Assessment metrics results for dataset D1.

Output	MAPE	RMSE	$R^{2}$
SVM	2.788 (0.378)	1.880 (0.267)	0.959 (0.012)
MLP	2.643 (0.999)	1.838 (0.740)	0.955 (0.059)
ELM	3.213 (1.557)	2.254 (1.126)	0.928 (0.111)
KNN	3.594 (0.194)	2.512 (0.149)	0.928 (0.008)
DT	3.433 (0.333)	2.556 (0.301)	0.925 (0.018)
Stacking	2.349 (0.269)	1.660 (0.178)	0.968 (0.007)

Table 22. Assessment metrics results for dataset D2.

Output	MAPE	RMSE	$R^{2}$
SVM	14.484 (2.316)	11.713 (3.907)	0.728 (0.178)
MLP	12.242 (7.050)	6.161 (3.066)	0.916 (0.110)
ELM	21.028 (12.165)	10.805 (6.991)	0.705 (0.717)
KNN	32.995 (4.378)	14.099 (1.409)	0.642 (0.069)
DT	15.388 (1.815)	8.288 (0.772)	0.877 (0.023)
Stacking	13.371 (2.749)	7.084 (1.605)	0.906 (0.047)

Table 23. Assessment metrics results for dataset D3.

Output	MAPE	RMSE	$R^{2}$
SVM	18.943 (3.674)	8.540 (1.967)	0.610 (0.180)
MLP	15.062 (4.650)	6.666 (1.644)	0.761 (0.157)
ELM	20.378 (12.290)	9.590 (5.285)	0.392 (0.933)
KNN	23.523 (1.643)	8.557 (0.582)	0.627 (0.051)
DT	25.563 (2.413)	10.682 (1.119)	0.415 (0.126)
Stacking	15.255 (2.535)	6.363 (0.775)	0.792 (0.057)

Table 24. Assessment metrics results for dataset D4.

Output	MAPE	RMSE	$R^{2}$
SVM	8.645 (2.143)	5.660 (1.601)	0.879 (0.066)
MLP	7.947 (3.040)	4.839 (1.757)	0.907 (0.068)
ELM	10.432 (7.790)	6.829 (4.848)	0.755 (0.444)
KNN	10.090 (1.006)	6.754 (0.632)	0.839 (0.030)
DT	12.000 (2.397)	8.046 (1.501)	0.766 (0.081)
Stacking	7.627 (1.478)	4.732 (1.037)	0.918 (0.037)

Table 25. Shapiro–Wilk test p-values for ST-2.

Method	p-Value
	Dataset D1			Dataset D2
	MAPE	RMSE	$R^{2}$	MAPE	RMSE	$R^{2}$
SVM	0.089	0.150	0.016	0.000	0.030	0.001
MLP	0.000	0.000	0.000	0.000	0.000	0.000
ELM	0.000	0.000	0.000	0.000	0.000	0.000
KNN	0.070	0.802	0.864	0.067	0.704	0.971
DT	0.785	0.396	0.211	0.272	0.153	0.044
Stacking	0.016	0.026	0.005	0.051	0.001	0.000
	Dataset D3			Dataset D4
	MAPE	RMSE	$R^{2}$	MAPE	RMSE	$R^{2}$
SVM	0.001	0.013	0.001	0.102	0.024	0.004
MLP	0.000	0.000	0.000	0.000	0.000	0.000
ELM	0.000	0.000	0.000	0.000	0.000	0.000
KNN	0.616	0.646	0.259	0.257	0.474	0.313
DT	0.978	0.071	0.008	0.737	0.194	0.466
Stacking	0.015	0.000	0.000	0.015	0.053	0.006

Table 26. Lilliefors test p-values for ST-2.

Method	p-Value
	Dataset D1			Dataset D2
	MAPE	RMSE	$R^{2}$	MAPE	RMSE	$R^{2}$
SVM	0.035	0.007	0.001	0.000	0.056	0.033
MLP	0.000	0.000	0.000	0.000	0.000	0.000
ELM	0.000	0.000	0.000	0.000	0.000	0.000
KNN	0.166	0.200	0.200	0.200	0.200	0.200
DT	0.200	0.020	0.004	0.200	0.014	0.004
Stacking	0.200	0.012	0.005	0.164	0.033	0.001
	Dataset D3			Dataset D4
	MAPE	RMSE	$R^{2}$	MAPE	RMSE	$R^{2}$
SVM	0.001	0.028	0.015	0.122	0.038	0.008
MLP	0.000	0.000	0.000	0.000	0.000	0.000
ELM	0.000	0.000	0.000	0.000	0.000	0.000
KNN	0.200	0.200	0.200	0.200	0.200	0.168
DT	0.200	0.135	0.037	0.200	0.190	0.200
Stacking	0.200	0.107	0.107	0.200	0.181	0.182

Table 27. Kruskal–Wallis test p-values for ST-2.

Dataset	MAPE	RMSE	$R^{2}$
D1	≤0.0001	≤0.0001	≤0.0001
D2	≤0.0001	≤0.0001	≤0.0001
D3	≤0.0001	≤0.0001	≤0.0001
D4	≤0.0001	≤0.0001	≤0.0001

Table 28. Dunn test p-values comparing stacking with the first-layer models (ST-2).

ML Model	p-Value
	Dataset D1			Dataset D2
	MAPE	RMSE	$R^{2}$	MAPE	RMSE	$R^{2}$
SVM	0.002	0.015	0.015	0.331	0.000	0.000
MLP	0.184	0.375	0.375	0.041	0.115	0.115
ELM	0.000	0.000	0.000	0.000	0.000	0.000
KNN	0.000	0.000	0.000	0.000	0.000	0.000
DT	0.000	0.000	0.000	0.025	0.045	0.045
	Dataset D3			Dataset D4
	MAPE	RMSE	$R^{2}$	MAPE	RMSE	$R^{2}$
SVM	0.008	0.000	0.000	0.221	0.847	0.847
MLP	0.511	0.704	0.704	0.101	0.903	0.903
ELM	0.036	0.000	0.000	0.132	0.043	0.043
KNN	0.000	0.000	0.000	0.000	0.000	0.000
DT	0.000	0.000	0.000	0.000	0.000	0.000

Table 29. Metamodel coefficients for ST-2.

	Dataset D1	Dataset D2	Dataset D3	Dataset D4
$α_{S V M}$	0.237	0.245	0.249	0.147
$α_{M L P}$	0.156	0.162	0.258	0.225
$α_{E L M}$	0.243	0.211	0.133	0.189
$α_{K N N}$	0.225	0.172	0.205	0.195
$α_{D T}$	0.152	0.236	0.120	0.246
$β$	−0.004	−0.011	0.013	0.004

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

Share and Cite

MDPI and ACS Style

Neto, G.F.; Macêdo, B.d.S.; Boratto, T.H.A.; Gontijo, T.S.; Bodini, M.; Saporetti, C.; Goliatt, L. Stratified Metamodeling to Predict Concrete Compressive Strength Using an Optimized Dual-Layered Architectural Framework. Math. Comput. Appl. 2025, 30, 16. https://doi.org/10.3390/mca30010016