Article

Prediction of the Unconfined Compressive Strength of a One-Part Geopolymer-Stabilized Soil Using Deep Learning Methods with Combined Real and Synthetic Data

1 School of Urban Rail Transportation, Shanghai University of Engineering Science, Shanghai 201620, China
2 School of Civil Engineering, Shanghai Normal University, Shanghai 201418, China
* Author to whom correspondence should be addressed.
Buildings 2024, 14(9), 2894; https://doi.org/10.3390/buildings14092894
Submission received: 9 August 2024 / Revised: 8 September 2024 / Accepted: 11 September 2024 / Published: 13 September 2024

Abstract:
This study explored the utilization of a one-part geopolymer (OPG) as a sustainable alternative binder to ordinary Portland cement (OPC) in soil stabilization, offering significant environmental advantages. The unconfined compressive strength (UCS) is the key index for evaluating the efficacy of OPG in soil stabilization, but determining it traditionally demands substantial cost and time. In this research, four distinct deep learning (DL) models (Artificial Neural Network [ANN], Backpropagation Neural Network [BPNN], Convolutional Neural Network [CNN], and Long Short-Term Memory [LSTM]) were employed to predict the UCS of OPG-stabilized soft clay, providing a more efficient and precise methodology. Among these models, the CNN exhibited the highest performance (MAE = 0.022, R2 = 0.9938), followed by the LSTM (MAE = 0.0274, R2 = 0.9924) and the BPNN (MAE = 0.0272, R2 = 0.9921). The Wasserstein Generative Adversarial Network (WGAN) was further utilized to generate additional synthetic samples for expanding the training dataset. Incorporating the synthetic samples generated by the WGAN into the training set for the DL models led to improved performance. When the number of synthetic samples reached 200, the WGAN-CNN model provided the most accurate results, with an R2 value of 0.9978. Furthermore, to assess the reliability of the DL models and gain insights into the influence of input variables on the predicted outcomes, interpretable Machine Learning techniques, including a sensitivity analysis, Shapley Additive Explanation (SHAP), and 1D Partial Dependence Plots (PDP), were employed to analyze and interpret the CNN and WGAN-CNN models. This research illuminates new aspects of applying DL models trained on combined real and synthetic data to evaluate the strength properties of OPG-stabilized soil, contributing to savings in time and cost.

1. Introduction

Ordinary Portland cement (OPC) is one of the most common materials in construction engineering. Nonetheless, the production of OPC is accompanied by the generation and emission of significant quantities of carbon dioxide, which imposes substantial damage on the environment [1,2,3]. Therefore, in the context of the global emphasis on sustainable development, many researchers have begun to search for alternative materials and options to replace OPC. Geopolymer is a novel type of cementitious material synthesized from industrial by-products. It represents a promising alternative to conventional cement. These by-products include fly ash (FA), ground granulated blast furnace slag (GGBFS), rice husk ash, steel slag, and waste glass powder, among others [4,5,6,7]. The application of geopolymer in structural and geotechnical engineering has attracted significant attention in recent years [8,9]. Geopolymer is found to improve the strength of stabilized soil by developing a three-dimensional microstructure with calcium silicate hydrate (C-S-H) and sodium aluminosilicate hydrate (N-A-S-H) gels.
Normally, the preparation of the geopolymer involves two methods: one-part and two-part methods [7,10]. The distinguishing feature of the one-part geopolymer (OPG) lies in its unique composition, which entails a mixture of solid aluminosilicate materials (known as precursors), solid alkaline activators, and water, setting it apart from the two-part geopolymer formulations. It can be seen that OPG stands out for in situ construction projects due to its potential to reduce environmental damage and low storage and transportation costs compared to the two-part geopolymer.
As for ground improvement, the unconfined compressive strength (UCS) is commonly used to evaluate the mechanical performance of cement-stabilized soil. Time-consuming and costly laboratory experiments are often required to determine the UCS of OPG-stabilized soil. Moreover, the existing findings regarding the application of OPG in soil stabilization remain relatively limited [10,11,12,13,14,15]. It has been observed that the adoption of OPG can substantially improve the mechanical properties of soil. Notably, it has been identified that the OPG prepared by combining solid binary precursors (FA and GGBFS) and a solid activator (sodium hydroxide or sodium silicate) with water can markedly improve the UCS and shear strength of soft soil [12,13,16,17,18]. Nevertheless, the mechanical behavior of geopolymer-stabilized soil is affected by various factors, including the formulation of precursors, blending processes, the surrounding curing environment, etc. In particular, gaining a comprehensive understanding of how precursors and activators impact the strength performance of OPG-stabilized soil requires extensive experimental examination, presenting difficulties in terms of time, costs, and labor [13,19,20,21]. Hence, accurately predicting the UCS of geopolymer-stabilized soil represents a challenging task.
In recent years, with the rapid development of Machine Learning (ML) technology, its application in the field of materials science has attracted increasing attention. The contribution of ML techniques to concrete-like materials has greatly changed the computation of physical and mechanical properties [22,23,24,25,26,27,28], such as the modulus, strength, durability [29,30], and impermeability [31,32,33]. The basic principle of ML is data-driven: using the logic created by computational coding, the computer independently explores implicit relationships in the data and completes specified tasks, such as predicting target values. ML is widely employed in predicting the compressive strength of concrete, which demonstrates its significant potential for evaluating the mechanical performance of geopolymer. Among these, ensemble learning and Boosting algorithms are the most widely adopted because of their intuitive and easy-to-understand principles, ease of construction, and fast fitting and prediction speeds. Some cutting-edge ML models, such as Random Forest (RF) [34,35,36,37,38,39,40,41], Boosted Tree (BT) [42], Extra Trees [43], Gradient Boosting Regression Tree (GBRT) [44], Adaptive Boosting (AdaBoost) [6,40,45,46], and Extreme Gradient Boosting (XGB) [34,36,47,48,49,50], have already been used to predict material performance in civil and mechanical engineering. However, ensemble learning models tend to overfit when the training dataset is not ideal, particularly when the training set is high-dimensional and sparsely distributed. Under such circumstances, the base learners of ensemble learning models may struggle to capture the overall features, because each subsequent base learner is trained on the results of the previous one, which can make the ensemble model less effective than models such as neural networks and support vector machines.
As a subset of ML, deep learning (DL) has the ability to handle more complex data and predict outputs more accurately. DL has now achieved convincing results in predicting the compressive strength of concrete and geopolymer-based concrete (GPC). Contemporary mainstream deep learning networks can be categorized into several types: fully connected networks, Convolutional Neural Networks, recurrent neural networks, attention mechanism networks, and graph convolutional networks. Notably, attention mechanism networks and graph convolutional networks are more suited to text and image data, which do not align with the data types utilized in this paper. The Artificial Neural Network (ANN), a simple and effective network model that mimics the thinking behavior of the human brain [51], has been applied by many scholars to predict the compressive strength of materials [35,42,46,48,52,53,54,55,56,57,58,59,60,61,62,63]. Subsequently, advancements in computer technology and computational power have led to the development of more complex networks based on the ANN. This progression gives rise to the Deep Neural Network (DNN) [37,48,56,57,64] and deep Residual Networks (ResNet) [56,57,65], which further increase the number of hidden layers and neurons. For example, the Backpropagation Neural Network (BPNN) incorporates a backpropagation algorithm to adjust neuron parameters [66], the Convolutional Neural Network (CNN) utilizes convolutional calculations for feature extraction [67], and the Long Short-Term Memory (LSTM) network incorporates gating mechanisms for extracting time series features [68,69]. Table 1 summarizes the applications of the ML and DL models in predicting the compressive strength of GPC.
Despite their prevalence, numerous DL models pose ongoing challenges in terms of deciphering their internal processes [72]. In the last few years, interpretable approaches have been developed to elevate the explanation of ML models [73,74,75]. Explainable ML techniques, such as Shapley Additive Explanation (SHAP) [47,48,76,77,78,79,80,81] and Partial Dependence Plot (PDP) [80,81,82,83], make it clear how the models predict through inputs and provide a thorough comprehension of the link between inputs and outputs.
The aforementioned findings show that ML and DL models can be adeptly applied to assess the physical and mechanical properties of geopolymer materials. For the purposes of the current study, relying only on the limited samples obtained from laboratory and field tests may fall short of the data volume requirements of typical data-driven DL models. Therefore, there is a need to introduce data augmentation techniques to expand the training dataset and consequently enhance the performance of DL models. Since data augmentation methods based on Generative Adversarial Networks (GANs) were proposed in 2014 [84], they have seen significant application and research in the field of image restoration [85,86,87], paving the way for one of the trending directions in DL [88]. GANs achieve dataset augmentation by learning the distribution characteristics of the original training dataset to imitate and generate convincingly realistic synthetic data. By utilizing GANs to generate convincing synthetic samples, data augmentation addresses the shortfall of small sample sizes for DL models, thereby improving the models' performance. In light of the data types considered in this study, the ANN, BPNN, CNN, and LSTM models were selected for predictions. This choice is justified by their distinct computational and predictive mechanisms, making them all exemplary and representative options from the perspective of predictive value. Therefore, in this study, four different DL models were first employed to predict the UCS of OPG-stabilized soil by training on the experimental data. Furthermore, to enhance the performance of the DL models, the Wasserstein Generative Adversarial Network (WGAN) was utilized to generate additional synthetic samples by learning the features of the experimental data. The developed DL models with and without data augmentation were then compared to achieve better performance in predicting the UCS of the OPG-stabilized soil.
Finally, a sensitivity analysis and SHAP and PDP methods were applied to the DL model with data augmentation for elucidating the inherent mechanism between the input features and output. The findings of this study may stimulate the employment of DL models in assessing the strength performance of the OPG-stabilized soil by incorporating both real and synthetic data into the training process, thereby attaining a balance between resource allocation and exactitude.

2. Deep Learning Models

2.1. ANN

Neural networks are a class of ML models that simulate the functioning of neurons in the human brain. With the continuous advancement of neural networks, particularly in conjunction with the increasing scale of data, these networks have progressively transitioned from shallow architectures to deeper structures, ultimately evolving into DL models.
The most fundamental neural network is the Artificial Neural Network (ANN), which can approximate any nonlinear function by increasing the number of neurons and hidden layers. The input layer, hidden layers, and output layer make up the three primary components of the ANN's structure, as shown in Figure 1. Neurons are the fundamental building blocks of neural networks; they receive inputs from other neurons, apply an activation function to the weighted sum of these inputs, and transmit the results to the next layer of neurons or to the output layer. This process imitates the response mechanism of biological neurons. The hidden layers, located between the input and output layers, consist of one or multiple neurons that facilitate information transmission and processing. The output of a neuron can be expressed in the computational form illustrated in Equation (1). Many neural networks derived from the ANN are suitable for processing complex data.
$z = f\left( \sum_{i=1}^{p} \omega_i x_i + b \right), \quad i = 1, 2, \ldots, p,$ (1)

where $z$ is the output of a neuron and $x_i$ represents the inputs received by the neuron. $\omega_i$ denotes the weight of each input node and $b$ denotes the bias of the neuron. $f$ is defined as the activation function.
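As an illustration, Equation (1) can be sketched in a few lines of numpy, using ReLU as the activation (the choice adopted later in this study). The weights, bias, and inputs below are arbitrary illustrative values, not taken from the paper:

```python
import numpy as np

def relu(a):
    # ReLU activation: negative pre-activations are zeroed
    return np.maximum(0.0, a)

def neuron_output(x, w, b, activation=relu):
    # Equation (1): z = f(sum_i w_i * x_i + b)
    return activation(np.dot(w, x) + b)

x = np.array([0.5, -1.0, 2.0])   # p = 3 inputs received from the previous layer
w = np.array([0.1, 0.4, -0.2])   # connection weights (illustrative values)
b = 0.05                         # neuron bias
z = neuron_output(x, w, b)       # weighted sum is -0.7, so ReLU outputs 0.0
```

This also illustrates the sparsity mechanism mentioned below: a negative weighted sum yields an output of exactly zero.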
Before training, the types of functions associated with the network in the study must be defined, primarily encompassing the activation functions, loss functions, and optimization functions. The activation function determines the information output by the neurons and is a crucial component enabling the neural network to fit nonlinear relationships. Commonly used activation functions include Rectified Linear Unit (ReLU), Sigmoid, and tanh. In this study, apart from the output layer, all other layers utilize the ReLU activation function, as expressed in Equation (2). A notable feature of the ReLU is that its derivative is 0 on the negative half-axis. Therefore, the connected neuron will output 0 via the ReLU activation function when the activation value is negative. The structure of networks can be relatively sparse under this mechanism, thereby enhancing the training efficiency and improving the model’s generalization capability [89]. The selection of the loss function primarily depends on the type of data. Given that the problem addressed in this study involves regression, the Mean Squared Error (MSE) function, which is commonly used for regression problems, is employed for the loss function, as shown in Equation (3). The optimization function, also referred to as the optimizer, serves the purpose of updating the model parameters, specifically the weights and biases of the neurons, according to the gradient of the loss function. In this study, the Adam optimizer is utilized, characterized by its high computational efficiency and its suitability for non-convex optimization problems. It is particularly effective for handling sparse gradients and noise, and does not require additional adjustments for hyperparameters such as the learning rate [90].
Furthermore, to address the issue of overfitting that may arise during training, a Dropout layer is introduced into the neural network. During training, this layer randomly forces a certain proportion of neurons to output 0. This configuration reduces the network's complexity while enhancing the model's generalization capability [91]. Since the current study aimed to predict the UCS value, the output layer was set to a single neuron.
$f_{\mathrm{ReLU}}(x) = \max(0, x),$ (2)

$\mathrm{MSE} = \frac{1}{N} \sum_{i=1}^{N} \left( UCS_{\mathrm{predict}} - UCS_{\mathrm{true}} \right)^2.$ (3)
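A minimal numpy rendering of the ReLU activation of Equation (2) and the mean-squared-error loss of Equation (3), with made-up UCS numbers purely for illustration:

```python
import numpy as np

def relu(x):
    # Equation (2): negative activations are zeroed, sparsifying the network
    return np.maximum(0.0, x)

def mse_loss(ucs_predict, ucs_true):
    # Equation (3): mean squared error between predicted and measured UCS
    ucs_predict, ucs_true = np.asarray(ucs_predict), np.asarray(ucs_true)
    return float(np.mean((ucs_predict - ucs_true) ** 2))

activations = relu(np.array([-1.5, 0.0, 2.0]))   # -> [0.0, 0.0, 2.0]
loss = mse_loss([1.0, 2.0], [0.5, 2.5])          # -> 0.25
```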

2.2. BPNN

The Backpropagation Neural Network (BPNN), first proposed by Rumelhart et al. [92], updates the parameters of the neural network through the backpropagation algorithm. Compared to the ANN, the BPNN exhibits a better generalization ability, enabling it to handle more complex data [93]. In the ANN, by contrast, forward propagation is used solely to compute the network's output, without adjusting the network's parameters.
Error backpropagation refers to the process of adjusting the weights and biases of the preceding layers based on the error generated by comparing the actual output values with the expected values, with the aim of minimizing this error as much as possible. This process can be encapsulated in two stages. The first stage is forward propagation, which begins at the input layer and progresses through the calculations of the hidden layers, ultimately reaching the output layer. This stage involves computing the output results at each layer and evaluating the associated errors. The second stage is backward propagation, which starts at the output layer; when the output results do not align with the actual results, the error is calculated using a loss function and subsequently, each neuron’s weights are updated through an optimization function. These two steps are alternated, optimizing the parameters of each neuron and thereby reducing the discrepancy between the predicted values and the actual values [94,95].
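The two alternating stages can be sketched for a tiny one-hidden-layer network on synthetic data. This is a generic illustration of backpropagation with an MSE loss and plain gradient descent, not the network configuration or optimizer used in the study:

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(32, 5))                     # 32 samples, 5 input features
y = X @ rng.normal(size=(5, 1))                  # synthetic regression targets

W1, b1 = 0.1 * rng.normal(size=(5, 8)), np.zeros(8)
W2, b2 = 0.1 * rng.normal(size=(8, 1)), np.zeros(1)
lr = 0.01

def forward(X):
    # stage 1: forward propagation through one ReLU hidden layer
    h = np.maximum(0.0, X @ W1 + b1)
    return h, h @ W2 + b2

initial_loss = np.mean((forward(X)[1] - y) ** 2)
for _ in range(500):
    h, y_hat = forward(X)
    # stage 2: backpropagate the output error and update weights/biases
    g_out = 2.0 * (y_hat - y) / len(X)           # dLoss/dOutput for the MSE loss
    gW2, gb2 = h.T @ g_out, g_out.sum(axis=0)
    g_h = (g_out @ W2.T) * (h > 0)               # ReLU gradient gates the error
    gW1, gb1 = X.T @ g_h, g_h.sum(axis=0)
    W1 -= lr * gW1; b1 -= lr * gb1
    W2 -= lr * gW2; b2 -= lr * gb2

final_loss = np.mean((forward(X)[1] - y) ** 2)   # alternating the two stages reduces the error
```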

2.3. CNN

The Convolutional Neural Network (CNN) is a type of feed-forward neural network similar to the ANN. The CNN model adopts Convolutional layers, Pooling layers, and fully connected layers (referred to as Dense layers) for feature extraction and classification [96,97]. Using Convolutional layers, the CNN model can identify features in input sequences; these layers leverage local correlations and weight sharing to reduce the model's parameter count, thereby enhancing the training speed and generalization capabilities.
A conventional CNN model is composed of five types of layers: Convolutional layers, Pooling layers, Activation layers, Normalization layers, and Dense layers. The key to the CNN structure is the Convolutional layers. Adjusting the filter and kernel_size parameters in the 1D Convolutional layer yields different feature extraction effects. The filter parameter determines the number of Convolutional kernels applied to the input data, while the kernel_size parameter plays a significant role in CNN performance: a larger kernel_size can capture broader features, while a smaller one can capture finer details. The purpose of the Pooling layer is to reduce dimensionality and accelerate the computation of the results from the Convolutional layer. Average pooling retains the overall feature information, while max pooling preserves local detailed features. As illustrated in Figure 2, for a tensor with input shape $(b, 3, 1)$ and $f$ filters, a three-dimensional tensor of shape $(b, 3, f)$ is obtained, which is then dimensionally reduced by the Pooling layer to output a two-dimensional tensor of shape $(b, f)$.
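The shape bookkeeping described above can be verified with a small numpy sketch of a 1D "same"-padded convolution followed by average pooling. The batch size, kernel_size, and number of filters below are illustrative, not the study's hyperparameters:

```python
import numpy as np

def conv1d_same(x, kernels):
    # x: (b, L, 1); kernels: (kernel_size, f). Zero "same" padding keeps length L.
    b, L, _ = x.shape
    k, f = kernels.shape
    pad = k // 2
    xp = np.pad(x[..., 0], ((0, 0), (pad, pad)))
    out = np.zeros((b, L, f))
    for i in range(L):
        # the same kernels slide over every position (weight sharing)
        out[:, i, :] = xp[:, i:i + k] @ kernels
    return out

rng = np.random.default_rng(1)
x = rng.normal(size=(4, 3, 1))          # input tensor of shape (b, 3, 1), with b = 4
kernels = rng.normal(size=(3, 8))       # kernel_size = 3, filters f = 8
features = conv1d_same(x, kernels)      # Convolutional layer output: (b, 3, f)
pooled = features.mean(axis=1)          # average pooling reduces it to (b, f)
```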

2.4. LSTM

The recurrent neural network (RNN) has a greater advantage in handling time series data compared to neural networks built from Dense layers and Convolutional layers [98,99]. The RNN is derived from the BPNN by adding a recurrent core to enhance short-term memory. In its recurrent structure, the state of each neuron is determined not only by the current input but also by the output of the hidden layer from the previous time step. However, traditional RNNs may encounter the vanishing or exploding gradient problem when dealing with long sequences, causing unstable training, an inability to increase network depth, and difficulty in effectively capturing long-range dependencies.
As one of the improved RNN models, the Long Short-Term Memory (LSTM) network is characterized by its three gate designs, that is, the forget gate, input gate, and output gate [100,101]. These three gates manage memory units, which handle the forgetting and updating of information. The LSTM structure, as depicted in Figure 3, has $x_t$ as the input feature at the current time step, $h_{t-1}$ capturing the preserved state details from the preceding time step, and $h_t$ storing the output for the present state. The forget gate regulates the influence of the memory from the previous time step on the current time step. The input gate controls the degree to which the LSTM network accepts the input, obtained through a non-linear transformation of $x_t$ and $h_{t-1}$. The output gate, together with the tanh activation function, decides the output $h_t$ at the current time step.
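A single LSTM step with the three gates can be sketched in numpy as follows. The parameters are randomly initialized toy values, and stacking the four gate blocks into one matrix is a common but arbitrary convention, not a detail stated in the paper:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x_t, h_prev, c_prev, W, U, b):
    # W, U, b each stack the parameter blocks of the four internal transforms
    z = x_t @ W + h_prev @ U + b
    f, i, o, g = np.split(z, 4)
    f, i, o = sigmoid(f), sigmoid(i), sigmoid(o)   # forget, input, output gates
    c_t = f * c_prev + i * np.tanh(g)   # forget old memory, admit new candidate
    h_t = o * np.tanh(c_t)              # output gate shapes the state output
    return h_t, c_t

rng = np.random.default_rng(2)
hidden = 6
W = 0.1 * rng.normal(size=(5, 4 * hidden))
U = 0.1 * rng.normal(size=(hidden, 4 * hidden))
b = np.zeros(4 * hidden)
h, c = np.zeros(hidden), np.zeros(hidden)
for x_t in rng.normal(size=(10, 5)):    # a length-10 sequence of 5 features
    h, c = lstm_step(x_t, h, c, W, U, b)
```

Because $h_t$ is an output-gated tanh of the cell state, every component of the hidden state stays strictly inside (-1, 1).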

2.5. WGAN

Generative Adversarial Network (GAN), formally introduced in 2014 [84], consists of two neural network structures, the Generator and the Discriminator, as depicted in Figure 4. Through the adversarial interplay between these two networks, GAN can generate synthetic data. The Generator learns the distribution of real samples and then produces a series of synthetic samples, while the Discriminator distinguishes between synthetic samples generated by the Generator and real samples. The basic GAN model is prone to the issue of mode collapse, in which the Generator tends to generate a small number of high-quality synthetic samples in a narrow range. As these synthetic samples approach the real distribution, they can receive high scores from the Discriminator, inducing poor diversity and the concentration of generated synthetic samples within a small range [102,103]. Thus, various improved networks have been derived from GAN to address the mode collapse, such as the Wasserstein Generative Adversarial Network (WGAN) [103], Deep Convolutional Generative Adversarial Network (DCGAN) [104], and Cycle-Consistent Adversarial Network (CycleGAN) [105].
The WGAN identifies Jensen–Shannon Divergence (JSD) as the cause of unstable GAN training. JSD is a method for measuring the similarity between two probability distributions. When the distribution of the generated samples does not overlap with that of the real samples, the JSD takes a constant value of log 2 and its gradient is zero everywhere, making it impossible to update the Generator's parameters effectively and thereby degrading training performance. In GAN, real samples are labelled as 1 and synthetic samples as 0, and the Discriminator's objective function minimizes the error between real samples (label 1) and synthetic samples (label 0).
The WGAN introduces the Wasserstein distance, also known as the Earth-Mover (EM) Distance, which calculates the minimum cost of transforming one distribution into another. The calculation of the Wasserstein distance can be expressed in Equation (4), which can effectively address the lack of gradient information due to the JSD [103,106]. Generating low-quality synthetic samples induces a greater Wasserstein distance, and thus forces the Generator’s parameters to be updated. In the WGAN, the Discriminator also serves as the unit for calculating the Wasserstein distance. The more accurate the Discriminator’s distance calculation, the more beneficial it is for the Generator. Therefore, in each iteration, the Generator is trained once while the Discriminator is trained five times to obtain a more accurate calculation of the Wasserstein distance. In terms of Generator design, based on the structure of the BPNN, a smaller Convolutional kernel size is used for 1D convolutions to capture the distribution and precise features of the original samples.
$W(p, q) = \inf_{\gamma \in \Pi(p, q)} \mathbb{E}_{(x, y) \sim \gamma} \left[ \lVert x - y \rVert \right],$ (4)

where $\Pi(p, q)$ represents the collection of all possible joint distributions formed by the combination of distributions $p$ and $q$. For each potential joint distribution $\gamma \in \Pi(p, q)$, the expected distance $\mathbb{E}_{(x, y) \sim \gamma} \left[ \lVert x - y \rVert \right]$ is computed, with $(x, y)$ being sampled from the joint distribution $\gamma$. The term $\inf$ signifies the infimum over this set.
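For one-dimensional samples, the Wasserstein (Earth-Mover) distance of Equation (4) has a simple empirical form: the optimal transport plan pairs the sorted values of the two samples (a standard property of 1-D optimal transport, not stated in the text above). The sketch below, on made-up Gaussian samples, shows that low-quality synthetic samples yield a larger distance, which is exactly the signal that forces the Generator's parameters to be updated:

```python
import numpy as np

def wasserstein_1d(p_samples, q_samples):
    # empirical EM distance for equal-size 1-D samples: match sorted values
    p = np.sort(np.asarray(p_samples))
    q = np.sort(np.asarray(q_samples))
    return float(np.mean(np.abs(p - q)))

rng = np.random.default_rng(4)
real = rng.normal(loc=2.0, scale=0.5, size=5000)        # stand-in "real" data
good_fake = rng.normal(loc=2.1, scale=0.5, size=5000)   # close to the real distribution
bad_fake = rng.normal(loc=5.0, scale=0.5, size=5000)    # low-quality synthetic samples

d_good = wasserstein_1d(real, good_fake)   # small distance -> weak push on the Generator
d_bad = wasserstein_1d(real, bad_fake)     # large distance -> strong parameter updates
```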

2.6. K-Fold Cross-Validation

K-Fold cross-validation is a common method to validate and evaluate the performance of models [107]. The basic idea is to divide a dataset into $k$ groups, in which $k-1$ groups are used as the training set for model training and the remaining group is taken as the test set for model testing and validation, as illustrated in Figure 5. This process results in $k$ models, whose average results are taken as the final performance of the model. K-Fold cross-validation can extract as much valuable information as possible from limited data, thereby avoiding poor model performance due to an uneven distribution of the training set, and can further reduce the likelihood of overfitting and increase the stability of the model. Most studies recommend using 5-fold or 10-fold cross-validation to obtain the final model results [108,109]. In this study, 5-fold cross-validation was chosen, with the average value of each metric taken as the final result.
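A minimal sketch of the 5-fold procedure, using 390 samples (the dataset size reported later in this paper); the model-fitting step is left as a comment since any of the DL models above could be plugged in:

```python
import numpy as np

def kfold_indices(n_samples, k=5, seed=0):
    # shuffle once, then let each of the k folds serve as the test set exactly once
    idx = np.random.default_rng(seed).permutation(n_samples)
    folds = np.array_split(idx, k)
    for i in range(k):
        test = folds[i]
        train = np.concatenate([folds[j] for j in range(k) if j != i])
        yield train, test

fold_test_sizes, all_test_idx = [], []
for train, test in kfold_indices(390, k=5):
    # a model would be trained on `train` and scored on `test` here;
    # the k scores are then averaged to give the final performance
    fold_test_sizes.append(len(test))
    all_test_idx.extend(test.tolist())
```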

2.7. Interpretable Methods

2.7.1. Sensitivity Analysis

Most ML models exhibit considerable structural complexity, particularly DL models, making it challenging to comprehend the predictive principles underlying these models. To ascertain the feasibility of such models—specifically, whether the contributions of various input features to the output values align with existing research or whether they can provide clear decision-making justifications—it is essential to implement post hoc interpretability methods [23,110].
Sensitivity analysis reveals how independent variables affect the output [111]. The specific calculation process involves randomly sampling the independent variables within a specified range and using a pre-trained model to predict the output. By employing the one-at-a-time (OAT) method to control variables, the sensitivity analysis perturbs only one independent variable at a time while keeping other variables at their nominal values, and thus the relationship between the output and that variable can be determined [112,113]. Sensitivity analysis can be divided into a global sensitivity analysis and local sensitivity analysis. In contrast to the local sensitivity analysis, the global sensitivity analysis focuses on the impacts of all input variables on the output and can consider interactions between variables. Notably, this study adopted a global sensitivity analysis method based on Sobol coefficients, which include first- and second-order sensitivity analyses. The first-order sensitivity analysis primarily focuses on the impacts of individual input parameters on the model output. The second-order sensitivity analysis reflects the degree to which interactions between two input parameters affect the model output, providing a more comprehensive understanding of how interactions between parameters influence the model output. The calculation of the first-order sensitivity Sobol index is shown in Equation (5), while the calculation of the second-order sensitivity Sobol index is shown in Equation (6).
$S_i = \frac{V_i}{V} = \frac{\mathrm{Var}\left( E[Y \mid X_i] \right)}{\mathrm{Var}(Y)},$ (5)

$S_{ij} = \frac{V_{ij}}{V} = \frac{\mathrm{Var}\left( E[Y \mid X_i, X_j] \right) - V_i - V_j}{\mathrm{Var}(Y)},$ (6)

where $E$ denotes the mathematical expectation and $V_i$ is determined by the ANOVA variance decomposition $V_i = \mathrm{Var}\left( E[Y \mid X_i] \right)$.
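The first-order Sobol index of Equation (5) can be estimated by a standard pick-and-freeze Monte Carlo scheme. The sketch below uses a toy analytic model with independent uniform inputs, whose indices are known in closed form (S1 = 0.2, S2 = 0.8); the study itself applies the analysis to trained DL models:

```python
import numpy as np

def first_order_sobol(model, d, n=50_000, seed=5):
    # pick-and-freeze Monte Carlo estimate of S_i = Var(E[Y|X_i]) / Var(Y)
    rng = np.random.default_rng(seed)
    A = rng.uniform(size=(n, d))
    B = rng.uniform(size=(n, d))
    yA = model(A)
    S = np.empty(d)
    for i in range(d):
        C = B.copy()
        C[:, i] = A[:, i]                 # freeze variable i, resample all others
        yC = model(C)
        # Cov(yA, yC) equals Var(E[Y|X_i]) because only X_i is shared
        S[i] = (np.mean(yA * yC) - yA.mean() * yC.mean()) / yA.var()
    return S

toy_model = lambda X: X[:, 0] + 2.0 * X[:, 1]   # additive toy model with known indices
S = first_order_sobol(toy_model, d=2)
```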

2.7.2. SHAP

SHAP, which is short for Shapley Additive Explanation, is an interpretive method that is deeply influenced by game theory. As indicated by its nomenclature, the SHAP method primarily facilitates post hoc explanations through the computation of Shapley values. In the context of cooperative game theory, the Shapley value (hereinafter referred to as the SHAP value) serves to quantify the contribution of each participant to the value generated by the collaboration. When applied to machine learning models, it evaluates the contribution of each input feature to the resulting output label. In SHAP methods, the model's prediction process is interpreted as a linear function of binary variables [114], as shown in Equation (7). Additionally, the computation of the SHAP value is shown in Equations (8) and (9).
$g(z) = \phi_0 + \sum_{k=1}^{N} \phi_k z_k,$ (7)

where $g$ is the explainable DL model, $z$ denotes the input features, and $N$ is the number of inputs. $\phi_k$ is the SHAP value of feature $k$ and $\phi_0$ is a constant.

$\phi_k = \sum_{K \subseteq M \setminus \{x_k\}} \frac{|K|! \, (N - |K| - 1)!}{N!} \left[ g\left( X_{K \cup \{x_k\}} \right) - g\left( X_K \right) \right],$ (8)

$g(X_K) = E\left[ g(x) \mid x_K \right],$ (9)

where $M = \{x_1, x_2, \ldots, x_N\}$ is the set of input features and $E[g(x) \mid x_K]$ represents the expected model output conditioned on the feature subset $K$.
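Equation (8) can be evaluated exactly for a small model by enumerating all feature subsets, with Equation (9) estimated by fixing the features in $K$ and averaging the model over a background dataset. The cost grows exponentially with $N$, so this brute-force sketch is for demonstration only; practical SHAP implementations use approximations. The linear model below is a toy example, for which the Shapley value of feature $k$ reduces to $w_k (x_k - \bar{x}_k)$:

```python
import numpy as np
from itertools import combinations
from math import factorial

def exact_shap(model, x, background):
    # Equation (8), evaluated exactly over all subsets K of the other features
    N = len(x)

    def value(K):
        # Equation (9): fix features in K to x, marginalize the rest over background
        Xb = background.copy()
        cols = list(K)
        Xb[:, cols] = x[cols]
        return model(Xb).mean()

    phi = np.zeros(N)
    for k in range(N):
        others = [j for j in range(N) if j != k]
        for size in range(N):
            for K in combinations(others, size):
                weight = factorial(size) * factorial(N - size - 1) / factorial(N)
                phi[k] += weight * (value(K + (k,)) - value(K))
    return phi

rng = np.random.default_rng(6)
background = rng.normal(size=(500, 3))
w = np.array([1.0, -2.0, 0.5])
linear_model = lambda X: X @ w
x = np.array([1.0, 1.0, 1.0])
phi = exact_shap(linear_model, x, background)
```

The values also satisfy the efficiency property: they sum to the difference between the prediction for $x$ and the mean background prediction.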

2.7.3. PDP

The Partial Dependence Plot (PDP) is a visualization technique that illustrates the marginal effects of features on the predictions of a trained ML model [115]. Its advantage lies in generating plots that display the relationship between the target and the input variables, making it easier to comprehend how each input specifically affects the model's predictions and thereby improving the interpretability of DL models. The PDP method elucidates the relationship between inputs and outputs by utilizing all available data points within the dataset [116,117]. The partial dependence function $\hat{f}_{x_s}(x_s)$ can be estimated by calculating the mean values within the training data, as illustrated in Equation (10):
$\hat{f}_{x_s}(x_s) = \frac{1}{n} \sum_{k=1}^{n} DL\left( x_s, x_c^{(k)} \right),$ (10)

where $x_s$ is the feature of partial dependence in the set $S$; $x_c^{(k)}$ denotes the actual values within the dataset for the features outside the specified set $S$; $n$ is the total number of instances in the dataset; and $DL$ is the trained DL model.
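Equation (10) reduces to a short loop: sweep one feature over a grid, hold the other features at their observed values, and average the model output. The analytic toy model below stands in for the trained DL model, so the expected slope of the curve is known:

```python
import numpy as np

def partial_dependence(model, X, feature, grid):
    # Equation (10): for each grid value, fix feature s and average over the dataset
    pd_values = []
    for v in grid:
        Xv = X.copy()
        Xv[:, feature] = v
        pd_values.append(model(Xv).mean())
    return np.array(pd_values)

rng = np.random.default_rng(7)
X = rng.normal(size=(200, 3))
toy_model = lambda X: 3.0 * X[:, 0] + X[:, 1] * X[:, 2]   # stand-in for the trained model
grid = np.linspace(-2.0, 2.0, 9)
pd_curve = partial_dependence(toy_model, X, feature=0, grid=grid)
slope = (pd_curve[-1] - pd_curve[0]) / (grid[-1] - grid[0])   # ~3.0 for this model
```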

2.8. Performance Index

In the present study, three metrics, namely the MAE, RMSE, and R2, were employed to evaluate the performance of the DL models. These metrics play crucial roles in evaluating the accuracy and explanatory power of predictive models. Specifically, the MAE and RMSE quantify the disparities between the forecasted and real values, shedding light on the model's precision and consistency. The R2, on the other hand, indicates how well the independent variables explain the variation in the dependent variable. Equations (11)–(13) delineate the methodologies for computing the MAE, RMSE, and R2, respectively.
$\mathrm{MAE} = \frac{1}{N} \sum_{i=1}^{N} \left| UCS_{\mathrm{predict}} - UCS_{\mathrm{true}} \right|,$ (11)

$\mathrm{RMSE} = \sqrt{ \frac{1}{N} \sum_{i=1}^{N} \left( UCS_{\mathrm{predict}} - UCS_{\mathrm{true}} \right)^2 },$ (12)

$R^2 = 1 - \frac{ \sum_{i=1}^{N} \left( UCS_{\mathrm{true}} - UCS_{\mathrm{predict}} \right)^2 }{ \sum_{i=1}^{N} \left( UCS_{\mathrm{true}} - \overline{UCS_{\mathrm{true}}} \right)^2 }.$ (13)
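The three metrics of Equations (11)–(13) can be written directly in numpy; the small UCS vectors below are made-up numbers used only to check the formulas:

```python
import numpy as np

def mae(ucs_true, ucs_predict):
    # Equation (11)
    return float(np.mean(np.abs(np.asarray(ucs_predict) - np.asarray(ucs_true))))

def rmse(ucs_true, ucs_predict):
    # Equation (12)
    return float(np.sqrt(np.mean((np.asarray(ucs_predict) - np.asarray(ucs_true)) ** 2)))

def r2(ucs_true, ucs_predict):
    # Equation (13)
    t, p = np.asarray(ucs_true), np.asarray(ucs_predict)
    return float(1.0 - np.sum((t - p) ** 2) / np.sum((t - t.mean()) ** 2))

ucs_true = [1.0, 2.0, 3.0, 4.0]
ucs_pred = [1.1, 1.9, 3.2, 3.8]
scores = (mae(ucs_true, ucs_pred), rmse(ucs_true, ucs_pred), r2(ucs_true, ucs_pred))
```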

3. Methodology

3.1. Workflow of the Current Research

The framework of the current study was divided into four parts: data processing, model development, model assessment, and interpretable analysis, as illustrated in Figure 6.
In the data processing section, to mitigate the effect of different scales, the collected dataset was first standardized using Z-score normalization, as shown in Equation (14). The original real data were then divided into a training set and validation set at the ratio of 9:1. In addition, the mean UCS for the parallel samples with the same mixing proportion in the original real data was defined as the refined set. In the model development part, five different DL models were constructed: ANN, BPNN, CNN, LSTM, and WGAN-CNN. Notably, the WGAN-CNN model involved the synthetic samples generated by WGAN, which were added into the training set for the CNN model. For model assessment, three evaluation metrics (MAE, RMSE, and R2) and the K-Fold cross-validation method were employed. In the interpretable analysis section, the best-performing DL models were further analyzed by the three types of interpretable methods (sensitivity analysis, SHAP, and PDP) to further validate the feasibility of the DL models in the prediction of the mechanical properties of the OPG-stabilized soil.
$$X_{i}=\frac{x_{i}-\mu}{\sigma},\quad(14)$$

where $X_{i}$ is the $i$th input after normalization, $x_{i}$ is the $i$th input, $\mu$ represents the sample mean, and $\sigma$ represents the sample standard deviation. Standardizing the data maps them to a range centered around zero, making them easier to process alongside other data.
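The Z-score standardization of Equation (14) and the 9:1 split can be sketched as follows (illustrative NumPy code, not the authors' pipeline; the function names and seed are assumptions):

```python
import numpy as np

def zscore(X):
    """Equation (14): column-wise Z-score standardization."""
    mu = X.mean(axis=0)
    sigma = X.std(axis=0)
    return (X - mu) / sigma, mu, sigma

def split_9_1(X, y, seed=0):
    """Shuffle the samples and split 90% / 10% into training and validation."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(len(X))
    cut = int(0.9 * len(X))
    return X[idx[:cut]], y[idx[:cut]], X[idx[cut:]], y[idx[cut:]]
```

With the 390 collected data points, this split yields 351 training and 39 validation samples.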

3.2. Data Collection

The data utilized in this study originated from laboratory experiments conducted by our study team. Initially, a solid alkali activator (NaOH), water, and binary precursors (GGBFS and FA) were blended to create the one-part geopolymer (OPG) paste. Subsequently, the OPG paste was cast into the remolded soil. Since the one-part method was adopted in the present study, the strength development of the OPG-stabilized soil was primarily influenced by the quantities of FA, GGBFS, NaOH, and water. Thus, the input variables for the DL models were defined as the mass ratios FA/GGBFS, NaOH/precursor, and water/binder. To investigate the impact of the alkaline concentration on the stabilized soil's strength, the molarity was also selected as an input variable. Equation (15) illustrates the calculation of the molarity for the one-part geopolymer. After casting, the OPG-stabilized soil samples were subjected to standard curing conditions for 3, 7, 14, and 28 days prior to the UCS tests. Consequently, the curing time was included as an input parameter, and the UCS of the OPG-stabilized soil served as the output. In the current study, at least three samples were prepared for each design mixture, yielding a total of 390 data points from the experiments.
$$\mathrm{Molarity}=\frac{n_{\mathrm{NaOH}}}{V_{\mathrm{NaOH}}}=\frac{m_{\mathrm{NaOH}}/M_{\mathrm{NaOH}}}{m_{\mathrm{NaOH}}/\rho_{\mathrm{NaOH}}},\quad(15)$$

where $m_{\mathrm{NaOH}}$ represents the mass of solid NaOH, $M_{\mathrm{NaOH}}$ the molar mass of NaOH, $n_{\mathrm{NaOH}}$ the number of moles of NaOH, $V_{\mathrm{NaOH}}$ the volume of the NaOH solution, and $\rho_{\mathrm{NaOH}}$ the density of the NaOH solution.
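For illustration, once the solution volume is known the molar concentration follows directly; the helper below is a hypothetical sketch (in Equation (15) the volume is instead expressed through mass and density; the constant 40 g/mol is the molar mass of NaOH):

```python
M_NAOH = 40.0  # g/mol, molar mass of NaOH

def molarity(m_naoh_g, v_solution_l):
    """Molar concentration (mol/L): moles of NaOH over solution volume."""
    return (m_naoh_g / M_NAOH) / v_solution_l
```

For example, 40 g of solid NaOH dissolved into 1 L of solution gives a 1 mol/L activator.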
The statistical description of the dataset is presented in Table 2, with corresponding visual representations in Figure 7 (histogram and box diagram). Figure 7a displays the distribution of UCS in a histogram. Figure 7b illustrates the distributions of the five input variables after standardization. The box diagram in Figure 7b shows that the distributions of NaOH/precursor and molarity are relatively discrete. The linear correlation coefficient was computed to clarify the relationships among the variables, as presented in Equation (16) and illustrated in Figure 8.
$$\mathrm{Corr}\left(X,Y\right)=\frac{\mathrm{cov}\left(X,Y\right)}{SD_{X}\,SD_{Y}}=\frac{\sum_{i=1}^{n}\left(X_{i}-\overline{X}\right)\left(Y_{i}-\overline{Y}\right)/\left(n-1\right)}{\sqrt{\mathrm{Var}\left(X\right)\mathrm{Var}\left(Y\right)}},\quad(16)$$

where $\mathrm{cov}\left(X,Y\right)$ is the covariance of $X$ and $Y$, $\mathrm{Var}$ is the variance, and $SD$ denotes the standard deviation.
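Equation (16) is the ordinary Pearson correlation coefficient; a minimal NumPy sketch (equivalent to `np.corrcoef` for two variables):

```python
import numpy as np

def pearson_corr(x, y):
    """Pearson linear correlation coefficient, Equation (16)."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    xc, yc = x - x.mean(), y - y.mean()            # center both variables
    return (xc @ yc) / np.sqrt((xc @ xc) * (yc @ yc))
```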

4. Results and Discussion

4.1. Performance of the DL Models

The study was executed on a Windows operating system, employing Python programming within the Visual Studio Code platform. In terms of training time, the WGAN model required the most extensive training duration, while the CNN and LSTM took slightly longer than the BPNN and ANN.
The performance results and error analyses of the four DL models are shown in Figure 9, Figure 10, Figure 11, Figure 12, Figure 13 and Figure 14. From the loss figures in Figure 9, Figure 10, Figure 11 and Figure 12, it can be observed that the loss functions of all four DL models converged. In comparison to the other models, the loss figure of the ANN in Figure 9 shows that the loss for the validation set did not completely approach 0, but exhibited some fluctuations. This is likely due to the absence of a backpropagation algorithm in the ANN to adjust the parameters of the preceding neural network units, which caused the model’s predictions for the refined set to not be completely concentrated on the best fit line.
Among all four DL models, the CNN exhibited the best performance (as shown in Figure 11, MAE = 0.022, RMSE = 0.0409, R2 = 0.9938), followed by the LSTM (as shown in Figure 12, MAE = 0.0274, RMSE = 0.0451, R2 = 0.9924) and the BPNN (as shown in Figure 10, MAE = 0.0272, RMSE = 0.0462, R2 = 0.9921). The linear (prediction) lines in Figure 9, Figure 10, Figure 11 and Figure 12 represent the linear regression of the predicted against the actual UCS of the original dataset. For the BPNN, CNN, and LSTM models, these lines deviated somewhat from the best fit line, primarily due to the limited training data for UCS values beyond 2.5 MPa and the presence of experimental errors in the samples. However, when the mean UCS of each mixing group was adopted as the refined set for the DL models, the predictions aligned with the best fit line (red dotted line in the figures), indicating that the models did not overfit and performed well in predicting the mean UCS of each group. Furthermore, the CNN, BPNN, and LSTM developed in the current study were capable of mitigating sample errors. The LSTM outperformed the BPNN in terms of R2 and RMSE, but the BPNN had a lower MAE than the LSTM, implying that the recurrent computation in the LSTM led to fewer outlier values during prediction and was less susceptible to the influence of input outliers.
The Kernel Density Estimation (KDE) plots of the four models, which are presented in Figure 14, can be used to analyze the distribution of errors for each model. It is evident that the errors for the CNN model were most concentrated around 0, followed by the LSTM, which had smaller errors compared to the BPNN, and the ANN gave the largest errors among the four models. Based on the above analysis, the CNN model had better performance than the other DL models in the present study.
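A KDE error plot of this kind can be reproduced with a plain Gaussian kernel estimate; the sketch below is illustrative (not the authors' code), with Silverman's bandwidth rule assumed as the default, and returns the density values that would be plotted against the grid:

```python
import numpy as np

def gaussian_kde_1d(samples, grid, bandwidth=None):
    """Gaussian kernel density estimate of 1-D prediction errors on `grid`."""
    samples = np.asarray(samples, dtype=float)
    n = len(samples)
    if bandwidth is None:
        # Silverman's rule of thumb for a Gaussian kernel
        bandwidth = 1.06 * samples.std() * n ** (-0.2)
    # Pairwise standardized distances between grid points and samples
    z = (grid[:, None] - samples[None, :]) / bandwidth
    return np.exp(-0.5 * z ** 2).sum(axis=1) / (n * bandwidth * np.sqrt(2 * np.pi))
```

A sharply peaked density centered at zero, as observed for the CNN, corresponds to errors concentrated around zero.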

4.2. Performance of WGAN-CNN

Firstly, standardized preprocessing was applied to the source data, and violin plots of the synthetic samples generated by the GAN and the WGAN (Dense layers [118] and Convolutional layers [85]) were produced, as shown in Figure 15. The distribution of the input variables in the original dataset is shown in Figure 15a. From Figure 15b, it can be observed that using the GAN alone induced mode collapse, in which the generated synthetic samples were concentrated within a small range and failed to capture the distribution of the original data. In contrast, the WGAN generator with Dense layers (Figure 15c) extracted more diverse features, and the WGAN generator with Convolutional layers (Figure 15d) captured a complementary distribution. Evidently, combining the generators with Dense and Convolutional layers yields more comprehensive synthetic data, compensating for the lack of coverage in some ranges. Taking the curing time as an example, the original data only included the UCS at 3, 7, 14, and 28 curing days; by combining the synthetic data from the WGAN generators with Dense and Convolutional layers, UCS data spanning a whole month could be obtained, further expanding the distribution of the training set.
Hence, in the present study, the synthetic samples used to augment the training set were composed of samples generated by the two WGAN generators (Dense layers and Convolutional layers). By assigning a fixed random seed, the synthetic data were randomly sampled for each group of CNN training. Different numbers of synthetic samples were added to the original training set while the refined set was kept unchanged, yielding the outcomes shown in Figure 16 and Figure 17. It can be observed that the performance of the WGAN-CNN models improved markedly when 100 to 200 synthetic samples were added to the training set. In particular, the WGAN-CNN models with 150 and 200 groups of synthetic samples provided excellent results, with R2 values approaching 0.9979 and 0.9978, respectively; the corresponding MAE values were 0.0154 and 0.0119, and the RMSE values were 0.0238 and 0.0243. According to the KDE plot in Figure 17, the WGAN-CNN model trained with 200 synthetic samples exhibited the optimum performance, with the errors concentrated at zero. The scatter plots with linear fit lines in Figure 18 compare the predicted UCS values from these two WGAN-CNN models with the actual UCS values; the scatter points of the WGAN-CNN model with 200 synthetic samples lay closer to the perfect fit line. Thus, the performance of the WGAN-CNN model with 200 synthetic samples was slightly better than that of the model with 150 synthetic samples. In the subsequent sections, the term “WGAN-CNN” refers to the WGAN-CNN model incorporating 200 WGAN-generated synthetic samples.
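The augmentation step can be sketched as follows (hypothetical helper, not the authors' code; it assumes the pooled WGAN outputs are stored row-wise with the UCS label in the last column, and uses a fixed seed as described):

```python
import numpy as np

def augment_training_set(X_train, y_train, synth, n_synth, seed=42):
    """Draw n_synth rows from the pooled WGAN outputs (features + label in the
    last column) and append them to the real training set."""
    rng = np.random.default_rng(seed)                      # fixed seed
    idx = rng.choice(len(synth), size=n_synth, replace=False)
    X_aug = np.vstack([X_train, synth[idx, :-1]])
    y_aug = np.concatenate([y_train, synth[idx, -1]])
    return X_aug, y_aug
```

The refined set is left untouched, so the evaluation target stays identical across the 0 to 200 synthetic-sample configurations.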

4.3. Interpretable Analysis

4.3.1. Results of Sensitivity Analysis

First-order and second-order sensitivity analyses were conducted on the five DL models, and the corresponding results are shown in Figure 19 and Figure 20, respectively. Based on the first-order sensitivity analysis in Figure 19, all models indicated that the most important feature for the UCS prediction was the curing time. The top-performing CNN and WGAN-CNN models showed a consistent ranking of influencing factors, with only slight differences in the rankings of water/binder and molarity, which were placed third and fifth, respectively.
The second-order sensitivity analysis revealed the interactions between inputs and their impacts on the UCS. The sensitivity analyses in Figure 20a,b,d did not exhibit significant interaction relationships. From Figure 20c,e, it can be observed that both the CNN and WGAN-CNN models demonstrated that the interaction between water/binder and molarity had a significant impact on the UCS. However, the sensitivity analysis only identified the degree of influence by varying the input parameters and could not reveal the nonlinear relationship between the input variable and the output. Therefore, more robust interpretability methods such as SHAP and PDP were adopted in the following analyses.
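The paper does not specify its sensitivity implementation; a common variance-based choice is the first-order Sobol index, sketched below with the Jansen estimator for any trained surrogate `f` (inputs assumed scaled to the unit hypercube for illustration):

```python
import numpy as np

def first_order_sobol(f, d, n=4096, seed=0):
    """Jansen estimator of first-order Sobol indices for f on [0, 1]^d.

    f maps an (n, d) array of inputs to an (n,) array of outputs.
    """
    rng = np.random.default_rng(seed)
    A, B = rng.random((n, d)), rng.random((n, d))
    fA, fB = f(A), f(B)
    var = np.concatenate([fA, fB]).var()
    S = np.empty(d)
    for i in range(d):
        AB = A.copy()
        AB[:, i] = B[:, i]          # resample only feature i
        S[i] = 1.0 - np.mean((fB - f(AB)) ** 2) / (2.0 * var)
    return S
```

Second-order (interaction) indices follow from the analogous estimator with pairs of columns resampled.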

4.3.2. SHAP Results

In the interpretability analysis, the focus was on examining the CNN and WGAN-CNN models with superior performance. The SHAP values were calculated and visualized in Figure 21 and Figure 22.
As depicted in Figure 21, the SHAP global analysis revealed that the two DL models yielded essentially identical SHAP values and global interpretations, differing only in the order of the second-to-third and third-to-fourth rankings. This finding resonated with previous research indicating that the SHAP value increased as the curing time increased, signifying an improvement in the UCS. Furthermore, Figure 8 indicated a strong linear correlation between the UCS and the curing time, further implying that the curing time could be a crucial factor in the development of the UCS of the samples. As shown in Figure 21, the trend between the SHAP value and FA/GGBFS did not demonstrate a complete negative correlation, suggesting that FA/GGBFS within a lower range could improve the UCS. Similarly, lower values of molarity, as depicted by the SHAP values concentrating around 0.1, were found to have a positive impact on the UCS as well. Conversely, the water/binder variable generally displayed a negative correlation with the UCS. The NaOH/precursor ratio exhibited relatively low SHAP values, making it difficult to establish a direct relationship with the UCS.
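SHAP values are usually obtained from the `shap` library; for intuition, the exact Shapley value can also be computed by brute-force coalition enumeration, which is feasible for the five input features used here. The sketch below is illustrative (not the authors' code) and replaces "absent" features with a background point such as the feature means:

```python
import numpy as np
from itertools import combinations
from math import factorial

def shapley_values(f, x, background):
    """Exact Shapley values of f(x) for one sample, relative to `background`.

    Enumerates all 2^d coalitions, so it is only practical for small d.
    """
    d = len(x)
    x = np.asarray(x, dtype=float)
    phi = np.zeros(d)
    for i in range(d):
        others = [j for j in range(d) if j != i]
        for k in range(d):
            for S in combinations(others, k):
                # Shapley weight for a coalition of size k
                w = factorial(k) * factorial(d - k - 1) / factorial(d)
                z_off = np.array(background, dtype=float)
                z_off[list(S)] = x[list(S)]   # reveal features in S
                z_on = z_off.copy()
                z_on[i] = x[i]                # additionally reveal feature i
                phi[i] += w * (f(z_on) - f(z_off))
    return phi
```

By the efficiency property, the values sum to f(x) minus f(background), which is a useful sanity check on any SHAP-style output.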
The decision plots for the two models, presented in Figure 22, illustrate the influence of the inputs on the models’ outputs for all samples. The x-axis depicts the direction and trend of the effect of each feature on the output, and the plot can also detect outlier samples.
Through the SHAP interaction analysis, the changes in the SHAP values of one input feature as it interacts with another can be obtained. As shown in Figure 23, Figure 24 and Figure 25, the trends of the CNN and WGAN-CNN models in the SHAP analysis were generally consistent. Examining the interaction plots of the curing time with the other factors in Figure 23, the overall increase in the scatter distribution with increasing curing time confirms that the curing time was the dominant factor. From Figure 23a,c, it can be inferred that at curing times below 10 days, higher FA/GGBFS and water/binder ratios induced a decrease in the SHAP value for the curing time. Figure 23b indicates that when the curing time exceeded 15 days, a NaOH/precursor ratio greater than 0.15 increased the SHAP value of the curing time. As illustrated in Figure 23d, beyond 20 days of curing, higher molarity caused a reduction in the SHAP value for the curing time, and vice versa. This observation suggested that a certain NaOH/precursor ratio accelerated the hydration in the sample at an early curing age, implying that the UCS of the soil increased rapidly at early curing times.
In Figure 24a, with the increase in NaOH/precursor, the SHAP value of FA/GGBFS below 0.2 increased, indicating an improvement of the UCS. Conversely, the SHAP value of FA/GGBFS above 0.2 decreased as the NaOH/precursor increased. This phenomenon was primarily attributed to the higher NaOH content in the OPG that activated more Ca and Si components in the precursor, leading to the formation of numerous hydrated areas in the soil.
Figure 24b revealed that when FA/GGBFS was less than 0.3, a higher water/binder ratio resulted in an increase in the SHAP value of FA/GGBFS. This is due to the fact that a lower water content made it more difficult for the OPG binder and soil to mix evenly, which decreased the UCS.
Figure 24c illustrates the impact of curing time variations on FA/GGBFS. It is observed that FA/GGBFS in the range of 0.12 to 0.25 might reduce the model’s dependence on the curing time to some extent, in which an increase in the curing time led to a reduction in the SHAP value of FA/GGBFS. However, when the FA/GGBFS was beyond 0.25, an increase in the curing time induced an increase in the SHAP value of FA/GGBFS. This might be possible because the optimal addition of FA enhanced the workability of the OPG binder, ensuring the more uniform spread of the binder in the soil. Additionally, the existence of FA participated in secondary pozzolanic reaction during later curing times, thereby enhancing the UCS of the stabilized soil.
Figure 24d demonstrates the effect of molarity variations on the SHAP value for FA/GGBFS. When FA/GGBFS ranged between 0.12 and 0.25, higher molarity could increase the SHAP value for FA/GGBFS. The findings aligned with the understanding that a lower molarity may not effectively catalyze the aluminosilicate components in FA and GGBFS to produce hydration gels, inducing a limited contribution to the improvement of the UCS of the stabilized soil. Conversely, a higher molarity could lead to a rapid release of Ca and Si ions from FA and GGBFS, thereby decreasing the OPG binder’s setting time. In this case, the soil particles may not be efficiently connected by the hydration gels from the binder [12,80].
In Figure 25a, it is shown that when the NaOH/precursor exceeded 0.14, a decrease in the water/binder led to an increase in the SHAP value for NaOH/precursor. This is because, for a specific amount of NaOH, a lower water content led to a higher concentration of the NaOH solution, which enhanced geopolymerization in the OPG binder. Figure 25b,c reveals that with an increase in molarity, both the SHAP values for NaOH/precursor and water/binder exhibited a concave curve. This trend suggested that an excessively high or low molarity of the NaOH solution did not significantly enhance the properties of the soil. The OPG binder has an ideal molarity for fixed precursor or binder contents.

4.3.3. PDP Results

The analysis of the 1D Partial Dependence Plots (PDPs) for the CNN and WGAN-CNN models is depicted in Figure 26. It is evident that the 1D PDP tendencies of both DL models were consistent. It should be noted that the 1D PDP analysis only displays the average responses (indicated by the red lines in the figures). To visualize how the output depends on each input for individual samples, Individual Conditional Expectation (ICE) curves (indicated by the blue lines in Figure 26) were also introduced in this study. ICE observes the influence of a single feature by holding the other variables constant and analyzing how the prediction changes as that feature varies. Since 390 samples were used in the current study, 390 ICE lines were created. Like the 1D PDP, ICE assumes independence among the variables, limiting its ability to showcase the effects of interconnected variables on the model's ultimate prediction.
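ICE and 1D PDP curves follow directly from this definition; a minimal sketch (illustrative names, not the authors' code):

```python
import numpy as np

def ice_and_pdp(predict, X, feature, grid):
    """ICE: for every sample, vary one feature over `grid` while holding the
    remaining features fixed. The PDP is the mean of the ICE curves."""
    ice = np.empty((len(X), len(grid)))
    for gi, v in enumerate(grid):
        Xv = X.copy()
        Xv[:, feature] = v            # overwrite the varied feature
        ice[:, gi] = predict(Xv)
    return ice, ice.mean(axis=0)
```

With the 390 samples of this study, `ice` would contain the 390 blue lines and the returned mean the red PDP line of Figure 26.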
The effect of FA/GGBFS on the partial dependence is reported in Figure 26a. Overall, a slight decrease in partial dependence was observed as FA/GGBFS increased. Particularly, there was a significant decrease in partial dependence as FA/GGBFS increased from 0.25 to 0.43. This phenomenon aligned with previous findings indicating that excess FA increased the soil’s porosity, thereby exerting a detrimental effect on the UCS [12,13,80,81].
In Figure 26b, it is noted that the partial dependence increased when the NaOH/precursor increased within range of 0.1 to 0.15. However, for NaOH/precursor values greater than 0.15, the partial dependence did not change. Notably, the ICE results from both DL models exhibited larger variations for NaOH/precursor at 0.2. This implied that the UCS was responsive to a NaOH/precursor ratio of 0.2.
Figure 26c shows a decreasing trend in partial dependence when the water/binder increased. It was true that for a fixed amount of the precursors, with the increase in water, the concentration of the alkaline solution became lower, which could not effectively activate the Al and Si ions from FA and GGBFS. Under such circumstances, the amount of hydrated gels was less, thereby affecting the strength development of the UCS of the sample.
In Figure 26d, a significant increase in partial dependence was observed as the curing time progressed. The ICE lines displayed more significant variations during curing durations of 14 to 28 days relative to 0 to 14 days, indicating that the strength enhancement of soil mainly took place in the later curing period [19].
Figure 26e illustrates a two-stage linear correlation between molarity and the partial dependence. A rise in molarity from 3.27 to 4.64 positively influenced the partial dependence. However, once the molarity surpassed 4.64, further increases led to a reduction in the partial dependence. This could be attributed to the dissolution of Si and Al ions from FA and GGBFS in a highly concentrated NaOH solution, resulting in the formation of stable N-A-S-H gels that could not effectively connect the soil particles, thereby weakening the strength development of the soil.

5. Conclusions

This study first utilized four DL models with different principles to predict the OPG-stabilized soil’s UCS. Then, the WGAN model was employed to learn the distribution of the original dataset and generate additional synthetic data for the training dataset, thereby innovatively enhancing the model’s performance and robustness. Additionally, sensitivity analyses and SHAP and PDP methods were applied to elucidate the inherent mechanisms of the two DL models with higher prediction accuracy. The main conclusions were as follows:
(1)
Four DL models with different principles, constructed based on original data, performed well in predicting the UCS of OPG-stabilized soil. After K-Fold cross-validation, all models achieved an R2 above 0.98, with the MAE controlled below 0.0489. Among them, the CNN showed the best performance (MAE = 0.022, RMSE = 0.0409, R2 = 0.9938), followed by the LSTM (MAE = 0.0274, RMSE = 0.0451, R2 = 0.9924) and BPNN (MAE = 0.0272, RMSE = 0.0462, R2 = 0.9921).
(2)
By adding synthetic samples generated by the WGAN models with Dense and Convolutional layers to the training set, the performance of the CNN model improved. The best performance was achieved by the WGAN-CNN model trained with 200 added synthetic samples, for which the R2 value reached 0.9978. Therefore, in the present study, the addition of synthetic data to the training set significantly improved the accuracy of the DL model. The synthetic data generated by the WGAN models combining Dense and Convolutional layers exhibited more comprehensive distributions with high quality.
(3)
Three interpretability analyses (sensitivity analysis, SHAP, and PDP) were conducted on the best-performing CNN and WGAN-CNN models. The results of both models were essentially the same, identifying the curing time as the most significant factor influencing the UCS. In particular, it was revealed that the strength development of the OPG-stabilized soil primarily occurred at later curing times. A moderate FA/GGBFS ratio is beneficial for increasing the UCS and holds practical value. The similar observations from the CNN and WGAN-CNN models implied that a DL model trained on combined real and synthetic data can predict the mechanical properties of OPG-stabilized soil well, and that the developed WGAN-CNN model achieved a balance between resource consumption and accuracy. Moreover, the observation that the WGAN-CNN exhibited the same predictive trends as the CNN under the interpretable methods further validated the effectiveness and rationality of the synthetic data generated by the WGAN model developed in this study.
A limitation of this study is that we only compared DL models based on differing principles, without employing optimization algorithms aimed at enhancing the performance of specific models. Furthermore, there remains room for broader consideration of the factors influencing the UCS, and as the experiments are expanded, the model's capabilities can be further enhanced.

Author Contributions

Conceptualization, J.W. and G.H.; methodology, J.W. and G.H.; software, Q.C.; formal analysis, Q.C.; investigation, Q.C.; resources, J.W.; data curation, Q.C.; writing—original draft preparation, Q.C.; writing—review and editing, J.W.; supervision, J.W. and G.H.; project administration, J.W.; funding acquisition, J.W. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the National Natural Science Foundation of China, grant number 42377201.

Informed Consent Statement

Not applicable.

Data Availability Statement

Data will be made available on request.

Acknowledgments

Grateful acknowledgments are made to the National Natural Science Foundation of China (No. 42377201) for the support of this project.

Conflicts of Interest

The authors declare no conflicts of interest.

References

  1. Yang, H.; Liu, L.; Yang, W.; Liu, H.; Ahmad, W.; Ahmad, A.; Aslam, F.; Joyklad, P. A comprehensive overview of geopolymer composites: A bibliometric analysis and literature review. Case. Stud. Constr. Mat. 2022, 16, e00830. [Google Scholar] [CrossRef]
  2. Cook, R.; Han, T.; Childers, A.; Ryckman, C.; Khayat, K.; Ma, H.; Huang, J.; Kumar, A. Machine learning for high-fidelity prediction of cement hydration kinetics in blended systems. Mater. Des. 2021, 208, 109920. [Google Scholar] [CrossRef]
  3. Longarini, N.; Crespi, P.; Zucca, M.; Giordano, N.; Silvestro, G.D. The Advantages of Fly Ash Use in Concrete Structures. Inżynieria Miner. 2014, 15, 141–145. [Google Scholar]
  4. Borçato, A.G.; Thiesen, M.; Medeiros-Junior, R.A. Incorporation of clay brick wastes and calcium hydroxide into geopolymers: Compressive strength, microstructure, and efflorescence. J. Build. Eng. 2024, 88, 109259. [Google Scholar] [CrossRef]
  5. Rathnayaka, M.; Karunasinghe, D.; Gunasekara, C.; Wijesundara, K.; Lokuge, W.; David, W.L. Machine learning approaches to predict compressive strength of fly ash-based geopolymer concrete: A comprehensive review. Constr. Build. Mater. 2024, 419, 135519. [Google Scholar] [CrossRef]
  6. Shamim Ansari, S.; Muhammad Ibrahim, S.; Danish Hasan, S. Conventional and Ensemble Machine Learning Models to Predict the Compressive Strength of Fly Ash Based Geopolymer Concrete. Mater. Today Proc. 2023, in press. [Google Scholar] [CrossRef]
  7. Cong, P.; Cheng, Y. Advances in geopolymer materials: A comprehensive review. J. Traffic Transp. Eng. (Engl. Ed.) 2021, 8, 283–314. [Google Scholar] [CrossRef]
  8. Zhang, M.; Guo, H.; El-Korchi, T.; Zhang, G.; Tao, M. Experimental feasibility study of geopolymer as the next-generation soil stabilizer. Constr. Build. Mater. 2013, 47, 1468–1478. [Google Scholar] [CrossRef]
  9. Cristelo, N.; Glendinning, S.; Teixeira Pinto, A. Deep soft soil improvement by alkaline activation. Proc. Inst. Civ. Eng.-Ground Improv. 2011, 164, 73–82. [Google Scholar] [CrossRef]
  10. Lei, Z.; Pavia, S.; Wang, X. Biomass ash waste from agricultural residues: Characterisation, reactivity and potential to develop one-part geopolymer cement. Constr. Build. Mater. 2024, 431, 136544. [Google Scholar] [CrossRef]
  11. Hang, Y.-J.; Heah, C.-Y.; Liew, Y.-M.; Mohd, M.A.B.A.; Lee, Y.-S.; Lee, W.-H.; Phakkhananan, P.; Ong, S.-W.; Tee, H.-W.; Hsu, C.-H. Microwave absorption function on a novel one-part binary geopolymer: Influence of frequency, ageing and mix design. Constr. Build. Mater. 2024, 427, 136264. [Google Scholar]
  12. Zheng, X.; Wu, J. Early Strength Development of Soft Clay Stabilized by One-Part Ground Granulated Blast Furnace Slag and Fly Ash-Based Geopolymer. Front. Mater. 2021, 8, 616430. [Google Scholar] [CrossRef]
  13. Min, Y.; Wu, J.; Li, B.; Zhang, J. Effects of Fly Ash Content on the Strength Development of Soft Clay Stabilized by One-Part Geopolymer under Curing Stress. J. Mater. Civ. Eng. 2021, 33, 04021274. [Google Scholar] [CrossRef]
  14. Jaditager, M.; Sivakugan, N. Consolidation Behavior of Fly Ash-Based Geopolymer-Stabilized Dredged Mud. J. Waterw. Port Coast. Ocean Eng. 2018, 144, 4. [Google Scholar] [CrossRef]
  15. Phetchuay, C.; Horpibulsuk, S.; Arulrajah, A.; Suksiripattanapong, C.; Udomchai, A. Strength development in soft marine clay stabilized by fly ash and calcium carbide residue based geopolymer. Appl. Clay. Sci. 2016, 127–128, 134–142. [Google Scholar] [CrossRef]
  16. Wang, B.; Cui, C.; Xu, C.; Meng, K.; Li, J.; Xu, L. A novel analytical solution for horizontal vibration of partially embedded offshore piles considering the distribution effect of wave loads. Ocean. Eng. 2024, 307, 118179. [Google Scholar] [CrossRef]
  17. Cui, C.; Liang, Z.; Xu, C.; Xin, Y.; Wang, B. Analytical solution for horizontal vibration of end-bearing single pile in radially heterogeneous saturated soil. Appl. Math. Model. 2023, 116, 65–83. [Google Scholar] [CrossRef]
  18. Cui, C.; Meng, K.; Xu, C.; Liang, Z.; Li, H.; Pei, H. Analytical solution for longitudinal vibration of a floating pile in saturated porous media based on a fictitious saturated soil pile model. Comput. Geotech. 2021, 131, 103942. [Google Scholar] [CrossRef]
  19. Min, Y.; Gao, M.; Yao, C.; Wu, J.; Wei, X. On the use of one-part geopolymer activated by solid sodium silicate in soft clay stabilization. Constr. Build. Mater. 2023, 402, 132957. [Google Scholar] [CrossRef]
  20. Cui, C.; Xu, M.; Xu, C.; Zhang, P.; Zhao, J. An ontology-based probabilistic framework for comprehensive seismic risk evaluation of subway stations by combining Monte Carlo simulation. Tunn. Undergr. Space Technol. 2023, 135, 105055. [Google Scholar] [CrossRef]
  21. Cui, C.; Meng, K.; Xu, C.; Wang, B.; Xin, Y. Vertical vibration of a floating pile considering the incomplete bonding effect of the pile-soil interface. Comput. Geotech. 2022, 150, 104894. [Google Scholar] [CrossRef]
  22. Dinesh, A.; Anitha Selvasofia, S.D.; Datcheen, K.S.; Rakhesh Varshan, D. Machine learning for strength evaluation of concrete structures—Critical review. Mater. Today Proc. 2023, in press. [CrossRef]
  23. Li, Z.; Yoon, J.; Zhang, R.; Rajabipour, F.; Srubar, W.V., III; Dabo, I.; Radlińska, A. Machine learning in concrete science: Applications, challenges, and best practices. NPJ Comput. Mater. 2022, 8, 1. [Google Scholar]
  24. Zhang, J.; Huang, Y.; Wang, Y.; Ma, G. Multi-objective optimization of concrete mixture proportions using machine learning and metaheuristic algorithms. Constr. Build. Mater. 2020, 253, 119208. [Google Scholar] [CrossRef]
  25. Ben Chaabene, W.; Flah, M.; Nehdi, M.L. Machine learning prediction of mechanical properties of concrete: Critical review. Constr. Build. Mater. 2020, 260, 119889. [Google Scholar] [CrossRef]
  26. Oey, T.; Jones, S.; Bullard, J.W.; Sant, G. Machine learning can predict setting behavior and strength evolution of hydrating cement systems. J. Am. Ceram. Soc. 2019, 103, 480–490. [Google Scholar] [CrossRef]
  27. Derousseau, M.A.; Laftchiev, E.; Kasprzyk, J.R.; Rajagopalan, B.; Srubar, W.V. A comparison of machine learning methods for predicting the compressive strength of field-placed concrete. Constr. Build. Mater. 2019, 228, 116661. [Google Scholar] [CrossRef]
  28. Rafiei, M.H.; Khushefati, W.H.; Demirboga, R.; Adeli, H. Neural Network, Machine Learning, and Evolutionary Approaches for Concrete Material Characterization. ACI. Mater. J. 2016, 113, 781–789. [Google Scholar] [CrossRef]
  29. Felix, E.F.; Possan, E.; Carrazedo, R. Artificial Intelligence Applied in the Concrete Durability Study. In Hygrothermal Behaviour and Building Pathologies; Springer: Berlin/Heidelberg, Germany, 2021; pp. 99–121. [Google Scholar]
  30. Taffese, W.Z.; Sistonen, E. Machine learning for durability and service-life assessment of reinforced concrete structures: Recent advances and future directions. Automat. Constr. 2017, 77, 1–14. [Google Scholar] [CrossRef]
  31. Wang, L.; Wu, X.G.; Chen, H.Y.; Zeng, T.M. Iop, Prediction of impermeability of the concrete structure based on random forest and support vector machine. In Proceedings of the International Conference on Sustainable Development and Environmental Science (ICSDES), Zhengzhou, China, 19–21 June 2020. [Google Scholar]
  32. Huang, J.; Duan, T.; Zhang, Y.; Liu, J.; Zhang, J.; Lei, Y.; Zhang, J. Predicting the Permeability of Pervious Concrete Based on the Beetle Antennae Search Algorithm and Random Forest Model. Adv. Civ. Eng. 2020, 2020, 8863181. [Google Scholar] [CrossRef]
  33. Najigivi, A.; Khaloo, A.; Iraji Zad, A.; Abdul Rashid, S. An Artificial Neural Networks Model for Predicting Permeability Properties of Nano Silica–Rice Husk Ash Ternary Blended Concrete. Int. J. Concr. Struct. Mater. 2013, 7, 225–238. [Google Scholar] [CrossRef]
  34. Hu, T.; Zhang, H.; Cheng, C.; Li, H.; Zhou, J. Explainable machine learning: Compressive strength prediction of FRP-confined concrete column. Mater. Today Commun. 2024, 39, 108883. [Google Scholar] [CrossRef]
  35. Yang, S.; Sun, J.; Zhifeng, X. Prediction on compressive strength of recycled aggregate self-compacting concrete by machine learning method. J. Build. Eng. 2024, 88, 109055. [Google Scholar] [CrossRef]
  36. Miao, X.; Chen, B.; Zhao, Y. Prediction of compressive strength of glass powder concrete based on artificial intelligence. J. Build. Eng. 2024, 91, 109377. [Google Scholar] [CrossRef]
  37. Kurt, Z.; Yilmaz, Y.; Cakmak, T.; Ustabaş, I. A novel framework for strength prediction of geopolymer mortar: Renovative precursor effect. J. Build. Eng. 2023, 76, 107041. [Google Scholar] [CrossRef]
  38. Da Silveira Maranhão, F.; De Souza Junior, F.G.; Soares, P.; Alcan, H.G.; Çelebi, O.; Bayrak, B.; Kaplan, G.; Aydın, A.C. Physico-mechanical and microstructural properties of waste geopolymer powder and lime-added semi-lightweight geopolymer concrete: Efficient machine learning models. J. Build. Eng. 2023, 72, 106629. [Google Scholar] [CrossRef]
  39. Parhi, S.K.; Patro, S.K. Prediction of compressive strength of geopolymer concrete using a hybrid ensemble of grey wolf optimized machine learning estimators. J. Build. Eng. 2023, 71, 106521. [Google Scholar] [CrossRef]
  40. Wang, Q.; Ahmad, W.; Ahmad, A.; Aslam, F.; Mohamed, A.; Vatin, N.I. Application of Soft Computing Techniques to Predict the Strength of Geopolymer Composites. Polymers 2022, 14, 6. [Google Scholar] [CrossRef]
  41. Han, Q.; Gui, C.; Xu, J.; Lacidogna, G. A generalized method to predict the compressive strength of high-performance concrete by improved random forest algorithm. Constr. Build. Mater. 2019, 226, 734–742. [Google Scholar] [CrossRef]
  42. Ouyang, B.; Song, Y.; Li, Y.; Wu, F.; Yu, H.; Wang, Y.; Sant, G.; Bauchy, M. Predicting Concrete’s Strength by Machine Learning: Balance between Accuracy and Complexity of Algorithms. ACI. Mater. J. 2020, 117, 125–133. [Google Scholar]
  43. Zhang, L.V.; Marani, A.; Nehdi, M.L. Chemistry-informed machine learning prediction of compressive strength for alkali-activated materials. Constr. Build. Mater. 2022, 316, 106521. [Google Scholar] [CrossRef]
  44. Zhang, J.F.; Li, D.; Wang, Y.H. Toward intelligent construction: Prediction of mechanical properties of manufactured-sand concrete using tree-based models. J. Clean Prod. 2020, 258, 120665. [Google Scholar] [CrossRef]
  45. Abdullah, G.M.S.; Ahmad, M.; Babur, M.; Badshah, M.U.; Al-Mansob, R.A.; Gamil, Y.; Fawad, M. Boosting-based ensemble machine learning models for predicting unconfined compressive strength of geopolymer stabilized clayey soil. Sci. Rep. 2024, 14, 2323. [Google Scholar] [CrossRef] [PubMed]
  46. Ahmad, A.; Ahmad, W.; Chaiyasarn, K.; Ostrowski, K.A.; Aslam, F.; Zajdel, P.; Joyklad, P. Prediction of Geopolymer Concrete Compressive Strength Using Novel Machine Learning Algorithms. Polymers 2021, 13, 3389. [Google Scholar] [CrossRef] [PubMed]
  47. Das, P.; Kashem, A. Hybrid machine learning approach to prediction of the compressive and flexural strengths of UHPC and parametric analysis with shapley additive explanations. Case. Stud. Constr. Mat. 2024, 20, e02723. [Google Scholar] [CrossRef]
  48. Huo, W.; Zhu, Z.; Sun, H.; Ma, B.; Yang, L. Development of machine learning models for the prediction of the compressive strength of calcium-based geopolymers. J. Clean Prod. 2022, 380, 135159. [Google Scholar] [CrossRef]
  49. Ma, G.; Cui, A.; Huang, Y.; Dong, W. A Data-Driven Influential Factor Analysis Method for Fly Ash–Based Geopolymer Using Optimized Machine-Learning Algorithms. J. Mater. Civ. Eng. 2022, 34, 7. [Google Scholar] [CrossRef]
  50. Nguyen, H.; Vu, T.; Vo, T.P.; Thai, H.-T. Efficient machine learning models for prediction of concrete strengths. Constr. Build. Mater. 2021, 266, 120950. [Google Scholar] [CrossRef]
  51. Mcculloch, W.S.; Pitts, W. A logical calculus of the ideas immanent in nervous activity. Bull. Math. Biophys. 1943, 5, 115–133. [Google Scholar] [CrossRef]
  52. Wang, Y.; Iqtidar, A.; Amin, M.N.; Nazar, S.; Hassan, A.M.; Ali, M. Predictive modelling of compressive strength of fly ash and ground granulated blast furnace slag based geopolymer concrete using machine learning techniques. Case Stud. Constr. Mat. 2024, 20, e03130. [Google Scholar] [CrossRef]
  53. Kioumarsi, M.; Dabiri, H.; Kandiri, A.; Farhangi, V. Compressive strength of concrete containing furnace blast slag; optimized machine learning-based models. Clean. Eng. Technol. 2023, 13, 100604. [Google Scholar] [CrossRef]
  54. Maheepala, M.M.A.L.N.; Nasvi, M.C.M.; Robert, D.J.; Gunasekara, C.; Kurukulasuriya, L.C. Mix design development for geopolymer treated expansive subgrades using artificial neural network. Comput. Geotech. 2023, 161, 105534. [Google Scholar] [CrossRef]
  55. Nazar, S.; Yang, J.; Amin, M.N.; Khan, K.; Ashraf, M.; Aslam, F.; Javed, M.F.; Eldin, S.M. Machine learning interpretable-prediction models to evaluate the slump and strength of fly ash-based geopolymer. J. Mater. Res. Technol. 2023, 24, 100–124. [Google Scholar] [CrossRef]
  56. Emarah, D.A. Compressive strength analysis of fly ash-based geopolymer concrete using machine learning approaches. Results Mater. 2022, 16, 100347. [Google Scholar] [CrossRef]
  57. Paruthi, S.; Husain, A.; Alam, P.; Husain Khan, A.; Abul Hasan, M.; Magbool, H.M. A review on material mix proportion and strength influence parameters of geopolymer concrete: Application of ANN model for GPC strength prediction. Constr. Build. Mater. 2022, 356, 129253. [Google Scholar] [CrossRef]
  58. Shahmansouri, A.A.; Yazdani, M.; Ghanbari, S.; Akbarzadeh Bengar, H.; Jafari, A.; Farrokh Ghatte, H. Artificial neural network model to predict the compressive strength of eco-friendly geopolymer concrete incorporating silica fume and natural zeolite. J. Clean. Prod. 2021, 279, 123697. [Google Scholar] [CrossRef]
  59. Aalimahmoody, N.; Bedon, C.; Hasanzadeh-Inanlou, N.; Hasanzade-Inallu, A.; Nikoo, M. BAT Algorithm-Based ANN to Predict the Compressive Strength of Concrete—A Comparative Study. Infrastructures 2021, 6, 6. [Google Scholar] [CrossRef]
  60. Ahmad, W.; Farooq, S.H.; Usman, M.; Khan, M.; Ahmad, A.; Aslam, F.; Yousef, R.A.; Abduljabbar, H.A.; Sufian, M. Effect of Coconut Fiber Length and Content on Properties of High Strength Concrete. Materials 2020, 13, 1075. [Google Scholar] [CrossRef]
  61. Nguyen, T.T.; Pham Duy, H.; Pham Thanh, T.; Vu, H.H. Compressive Strength Evaluation of Fiber-Reinforced High-Strength Self-Compacting Concrete with Artificial Intelligence. Adv. Civ. Eng. 2020, 2020, 3012139. [Google Scholar] [CrossRef]
  62. Dao, D.; Ly, H.-B.; Trinh, S.; Le, T.-T.; Pham, B. Artificial Intelligence Approaches for Prediction of Compressive Strength of Geopolymer Concrete. Materials 2019, 12, 6. [Google Scholar] [CrossRef]
  63. Ngo, A.Q.; Nguyen, L.Q.; Tran, V.Q. Developing interpretable machine learning-Shapley additive explanations model for unconfined compressive strength of cohesive soils stabilized with geopolymer. PLoS ONE 2023, 18, e0286950. [Google Scholar] [CrossRef] [PubMed]
  64. Oyebisi, S.; Alomayri, T. Artificial intelligence-based prediction of strengths of slag-ash-based geopolymer concrete using deep neural networks. Constr. Build. Mater. 2023, 400, 132606. [Google Scholar] [CrossRef]
  65. Huynh, A.T.; Nguyen, Q.D.; Xuan, Q.L.; Magee, B.; Chung, T.; Tran, K.T.; Nguyen, K.T. A Machine Learning-Assisted Numerical Predictor for Compressive Strength of Geopolymer Concrete Based on Experimental Data and Sensitivity Analysis. Appl. Sci. 2020, 10, 7726. [Google Scholar] [CrossRef]
  66. Peng, Y.; Unluer, C. Analyzing the mechanical performance of fly ash-based geopolymer concrete with different machine learning techniques. Constr. Build. Mater. 2022, 316, 125785. [Google Scholar] [CrossRef]
  67. Deng, F.; He, Y.; Zhou, S.; Yu, Y.; Cheng, H.; Wu, X. Compressive strength prediction of recycled concrete based on deep learning. Constr. Build. Mater. 2018, 175, 562–569. [Google Scholar] [CrossRef]
  68. Chen, H.; Li, X.; Wu, Y.; Zuo, L.; Lu, M.; Zhou, Y. Compressive Strength Prediction of High-Strength Concrete Using Long Short-Term Memory and Machine Learning Algorithms. Buildings 2022, 12, 3. [Google Scholar] [CrossRef]
  69. Latif, S.D. Concrete compressive strength prediction modeling utilizing deep learning long short-term memory algorithm for a sustainable environment. Environ. Sci. Pollut. Res. 2021, 28, 30294–30302. [Google Scholar] [CrossRef]
  70. Zhou, J.; Tian, Q.; Nazar, S.; Huang, J. Hyper-tuning gene expression programming to develop interpretable prediction models for the strength of corncob ash-modified geopolymer concrete. Mater. Today Commun. 2024, 38, 107885. [Google Scholar] [CrossRef]
  71. Kumar Dash, P.; Kumar Parhi, S.; Kumar Patro, S.; Panigrahi, R. Efficient machine learning algorithm with enhanced cat swarm optimization for prediction of compressive strength of GGBS-based geopolymer concrete at elevated temperature. Constr. Build. Mater. 2023, 400, 132814. [Google Scholar] [CrossRef]
  72. Rudin, C. Stop explaining black box machine learning models for high stakes decisions and use interpretable models instead. Nat. Mach. Intell. 2019, 1, 206–215. [Google Scholar] [CrossRef]
  73. Naser, M.Z. An engineer’s guide to eXplainable Artificial Intelligence and Interpretable Machine Learning: Navigating causality, forced goodness, and the false perception of inference. Automat. Constr. 2021, 129, 103821. [Google Scholar] [CrossRef]
  74. Nithurshan, M.; Elakneswaran, Y. A systematic review and assessment of concrete strength prediction models. Case Stud. Constr. Mat. 2023, 18, e01830. [Google Scholar] [CrossRef]
  75. Ke, X.; Duan, Y. Coupling machine learning with thermodynamic modelling to develop a composition-property model for alkali-activated materials. Compos. Part B Eng. 2021, 216, 108801. [Google Scholar] [CrossRef]
  76. Feng, J.; Zhang, H.; Gao, K.; Liao, Y.; Yang, J.; Wu, G. A machine learning and game theory-based approach for predicting creep behavior of recycled aggregate concrete. Case Stud. Constr. Mat. 2022, 17, e01653. [Google Scholar] [CrossRef]
  77. Han, B.; Wu, Y.; Liu, L. Prediction and uncertainty quantification of compressive strength of high-strength concrete using optimized machine learning algorithms. Struct. Concr. 2022, 23, 3772–3785. [Google Scholar] [CrossRef]
  78. Haque, M.A.; Chen, B.; Kashem, A.; Qureshi, T.; Ahmed, A.A.M. Hybrid intelligence models for compressive strength prediction of MPC composites and parametric analysis with SHAP algorithm. Mater. Today Commun. 2023, 35, 105547. [Google Scholar] [CrossRef]
  79. Peng, Y.; Unluer, C. Modeling the mechanical properties of recycled aggregate concrete using hybrid machine learning algorithms. Resour. Conserv. Recycl. 2023, 190, 106812. [Google Scholar] [CrossRef]
  80. Chen, Q.; Hu, G.; Wu, J. Comparative study on the prediction of the unconfined compressive strength of the one-part geopolymer stabilized soil by using different hybrid machine learning models. Case Stud. Constr. Mat. 2024, 21, e03439. [Google Scholar] [CrossRef]
  81. Yao, C.; Hu, G.; Chen, Q.; Wu, J. Prediction on the freeze-thaw resistance of a one-part geopolymer stabilized soil by using deep learning method. Case Stud. Constr. Mat. 2024, 21, e03530. [Google Scholar] [CrossRef]
  82. Shen, J.; Li, Y.; Lin, H.; Li, H.; Lv, J.; Feng, S.; Ci, J. Prediction of compressive strength of alkali-activated construction demolition waste geopolymers using ensemble machine learning. Constr. Build. Mater. 2022, 360, 129600. [Google Scholar] [CrossRef]
  83. Li, Y.; Shen, J.; Lin, H.; Li, Y. Optimization design for alkali-activated slag-fly ash geopolymer concrete based on artificial intelligence considering compressive strength, cost, and carbon emission. J. Build. Eng. 2023, 75, 106929. [Google Scholar] [CrossRef]
  84. Goodfellow, I.J.; Pouget-Abadie, J.; Mirza, M.; Xu, B.; Warde-Farley, D.; Ozair, S.; Courville, A.; Bengio, Y. Generative adversarial nets. In Proceedings of the 27th International Conference on Neural Information Processing Systems—Volume 2, Cambridge, MA, USA, 8–13 December 2014; MIT Press: Montreal, QC, Canada, 2014; pp. 2672–2680. [Google Scholar]
  85. Shao, S.; Wang, P.; Yan, R. Generative adversarial networks for data augmentation in machine fault diagnosis. Comput. Ind. 2019, 106, 85–93. [Google Scholar] [CrossRef]
  86. He, J.; Xu, Y.; Pan, Y.; Wang, Y. Adaptive weighted generative adversarial network with attention mechanism: A transfer data augmentation method for tool wear prediction. Mech. Syst. Signal Proc. 2024, 212, 111288. [Google Scholar] [CrossRef]
  87. Du, W.Z.; Tian, S.H. Transformer and GAN-Based Super-Resolution Reconstruction Network for Medical Images. Tsinghua Sci. Technol. 2024, 29, 197–206. [Google Scholar] [CrossRef]
  88. Gui, J.; Sun, Z.A.; Wen, Y.G.; Tao, D.C.; Ye, J.P. A Review on Generative Adversarial Networks: Algorithms, Theory, and Applications. IEEE Trans. Knowl. Data Eng. 2023, 35, 3313–3332. [Google Scholar] [CrossRef]
  89. Hahnloser, R.H.R.; Sarpeshkar, R.; Mahowald, M.A.; Douglas, R.J.; Seung, H.S. Digital selection and analogue amplification coexist in a cortex-inspired silicon circuit. Nature 2000, 405, 947–951. [Google Scholar] [CrossRef]
  90. Kingma, D.; Ba, J. Adam: A Method for Stochastic Optimization. arXiv 2014, arXiv:1412.6980. [Google Scholar]
  91. Hinton, G.E.; Srivastava, N.; Krizhevsky, A.; Sutskever, I.; Salakhutdinov, R. Improving neural networks by preventing co-adaptation of feature detectors. arXiv 2012, arXiv:1207.0580. [Google Scholar]
  92. Rumelhart, D.E.; Hinton, G.E.; Williams, R.J. Learning representations by back-propagating errors. Nature 1986, 323, 533–536. [Google Scholar] [CrossRef]
  93. Li, D.; Huang, F.; Yan, L.; Cao, Z.; Chen, J.; Ye, Z. Landslide Susceptibility Prediction Using Particle-Swarm-Optimized Multilayer Perceptron: Comparisons with Multilayer-Perceptron-Only, BP Neural Network, and Information Value Models. Appl. Sci. 2019, 9, 18. [Google Scholar] [CrossRef]
  94. Wan, T.; Bai, Y.; Wang, T.; Wei, Z. BPNN-based optimal strategy for dynamic energy optimization with providing proper thermal comfort under the different outdoor air temperatures. Appl. Energy 2022, 313, 118899. [Google Scholar] [CrossRef]
  95. Song, H. Using Multifactor Inputs BP Neural Network to Make Power Consumption Prediction. Master’s Thesis, State University of New York at Binghamton, Binghamton, NY, USA, 2018; p. 77. [Google Scholar]
  96. Lecun, Y.; Bottou, L.; Bengio, Y.; Haffner, P. Gradient-based learning applied to document recognition. Proc. IEEE 1998, 86, 2278–2324. [Google Scholar] [CrossRef]
  97. Waibel, A.; Hanazawa, T.; Hinton, G.; Shikano, K.; Lang, K.J. Phoneme recognition using time-delay neural networks. IEEE Trans. Acoust. Speech Signal Process. 1989, 37, 328–339. [Google Scholar] [CrossRef]
  98. Lecun, Y.; Bengio, Y.; Hinton, G. Deep learning. Nature 2015, 521, 436–444. [Google Scholar] [CrossRef] [PubMed]
  99. Goodfellow, I.; Bengio, Y.; Courville, A. Deep Learning; MIT Press: Cambridge, MA, USA, 2016. [Google Scholar]
  100. Greff, K.; Srivastava, R.K.; Koutnik, J.; Steunebrink, B.R.; Schmidhuber, J. LSTM: A Search Space Odyssey. IEEE Trans. Neural Netw. Learn. Syst. 2017, 28, 2222–2232. [Google Scholar] [CrossRef]
101. Hochreiter, S.; Schmidhuber, J. Long short-term memory. Neural Comput. 1997, 9, 1735–1780. [Google Scholar] [CrossRef]
  102. Allahyani, M.; Alsulami, R.; Alwafi, T.; Alafif, T.; Ammar, H.; Sabban, S.; Chen, X. DivGAN: A diversity enforcing generative adversarial network for mode collapse reduction. Artif. Intell. 2023, 317, 103863. [Google Scholar] [CrossRef]
  103. Arjovsky, M.; Chintala, S.; Bottou, L. Wasserstein Generative Adversarial Networks. In Proceedings of the 34th International Conference on Machine Learning, PMLR, Sydney, NSW, Australia, 6–11 August 2017. [Google Scholar]
  104. Radford, A.; Metz, L.; Chintala, S. Unsupervised Representation Learning with Deep Convolutional Generative Adversarial Networks. arXiv 2015, arXiv:1511.06434. [Google Scholar]
  105. Zhu, J.Y.; Park, T.; Isola, P.; Efros, A.A. Unpaired Image-to-Image Translation Using Cycle-Consistent Adversarial Networks. In Proceedings of the 2017 IEEE International Conference on Computer Vision (ICCV), Venice, Italy, 22–29 October 2017; pp. 2242–2251. [Google Scholar]
106. Chai, P.; Hou, L.; Zhang, G.; Tushar, Q.; Zou, Y. Generative adversarial networks in construction applications. Automat. Constr. 2024, 159, 105265. [Google Scholar] [CrossRef]
  107. Wong, T.T.; Yeh, P.Y. Reliable Accuracy Estimates from k-Fold Cross Validation. IEEE Trans. Knowl. Data Eng. 2020, 32, 1586–1594. [Google Scholar] [CrossRef]
  108. Fushiki, T. Estimation of prediction error by using K-fold cross-validation. Stat. Comput. 2011, 21, 137–146. [Google Scholar] [CrossRef]
  109. Rodríguez, J.D.; Pérez, A.; Lozano, J.A. A general framework for the statistical analysis of the sources of variance for classification error estimators. Pattern Recognit. 2013, 46, 855–864. [Google Scholar] [CrossRef]
110. Molnar, C.; Casalicchio, G.; Bischl, B. Interpretable Machine Learning—A Brief History, State-of-the-Art and Challenges. In ECML PKDD 2020 Workshops, Communications in Computer and Information Science; Springer: Cham, Switzerland, 2020; pp. 417–431. [Google Scholar]
  111. Saltelli, A.; Andres, T.H.; Homma, T. Sensitivity analysis of model output: An investigation of new techniques. Comput. Stat. Data Anal. 1993, 15, 211–238. [Google Scholar] [CrossRef]
  112. Owais, M.; Alshehri, A.; Gyani, J.; Aljarbou, M.H.; Alsulamy, S. Prioritizing rear-end crash explanatory factors for injury severity level using deep learning and global sensitivity analysis. Expert Syst. Appl. 2024, 245, 123114. [Google Scholar] [CrossRef]
  113. Morio, J. Global and local sensitivity analysis methods for a physical system. Eur. J. Phys. 2011, 32, 1577. [Google Scholar] [CrossRef]
  114. Ekanayake, I.U.; Meddage, D.P.P.; Rathnayake, U. A novel approach to explain the black-box nature of machine learning in compressive strength predictions of concrete using Shapley additive explanations (SHAP). Case Stud. Constr. Mat. 2022, 16, e01059. [Google Scholar] [CrossRef]
  115. Molnar, C. Interpretable Machine Learning: A Guide for Making Black Box Models Explainable, 2nd ed.; Independently Published: Munich, Germany, 2022. [Google Scholar]
  116. Kriegler, B. Cost-Sensitive Stochastic Gradient Boosting within a Quantitative Regression Framework; University of California: Los Angeles, CA, USA, 2007; p. 144. [Google Scholar]
  117. Braun, W.J.; Murdoch, D.J.; Hlynka, M.; Iacus, S.; Atkinson, A.C.; Donev, A.; Tobias, R.D.; Arnold, B.C.; Balakrishnan, N.; Schilling, E.G.; et al. The Elements of Statistical Learning: Data Mining, Inference, and Prediction. J. R. Stat. Soc. Ser. A Stat. Soc. 2010, 173, 693–694. [Google Scholar]
  118. Yang, J.; Liu, J.; Xie, J.; Wang, C.; Ding, T. Conditional GAN and 2-D CNN for Bearing Fault Diagnosis with Small Samples. IEEE Trans. Instrum. Meas. 2021, 70, 1–12. [Google Scholar] [CrossRef]
Figure 1. Framework of the ANN model for the prediction of the UCS in this study.
Figure 2. Calculations of the Convolutional and Pooling layers of the CNN.
Figure 3. Structure of the LSTM network.
Figure 4. Structure of the WGAN.
Figure 5. The diagram of K-Fold cross-validation.
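The K-fold procedure diagrammed in Figure 5 partitions the dataset into K disjoint folds, training on K − 1 folds and validating on the held-out fold in turn. A minimal sketch of the index splitting (illustrative only; the fold count and random seed below are assumptions, not the study's settings):

```python
import numpy as np

def kfold_indices(n_samples, k, seed=42):
    """Yield (train, validation) index arrays for K-fold cross-validation."""
    rng = np.random.default_rng(seed)
    # Shuffle once, then cut the permuted indices into k near-equal folds.
    folds = np.array_split(rng.permutation(n_samples), k)
    for i in range(k):
        train = np.concatenate([f for j, f in enumerate(folds) if j != i])
        yield train, folds[i]

# Each sample appears in exactly one validation fold across the k iterations.
splits = list(kfold_indices(10, k=5))
```

Averaging the validation metric over the k iterations gives the cross-validated estimate of model performance.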
Figure 6. Workflow of the study.
Figure 7. Description of the data distribution. (a) Histogram of the UCS. (b) Box diagram of the inputs.
Figure 8. Correlation matrix.
Figure 9. The performance index of the ANN model.
Figure 10. The performance index of the BPNN model.
Figure 11. The performance index of the CNN model.
Figure 12. The performance index of the LSTM model.
Figure 13. Performance comparison of the DL models on the refined set.
Figure 14. KDE plot of the prediction errors of DL models.
Figure 15. Violin plots of different samples. (a) The original dataset. (b) The synthetic samples generated by GAN. (c) The synthetic samples generated by WGAN (adopting Dense layers in the generator). (d) The synthetic samples generated by WGAN (adopting Convolutional layers in the generator).
Figure 16. The comparison of the performance of the WGAN-CNN models.
Figure 17. The KDE plot of the prediction errors of the WGAN-CNN models.
Figure 18. The performance of WGAN-CNN (with 150 and 200 synthetic samples).
Figure 19. First-order sensitivity analysis of DL models.
Figure 20. Second-order sensitivity analyses of DL models. (a) ANN. (b) BPNN. (c) CNN. (d) LSTM. (e) WGAN-CNN.
Figure 21. SHAP global explanations of the CNN and WGAN-CNN.
Figure 22. SHAP decision plots for the CNN and WGAN-CNN.
Figure 23. SHAP interaction analysis of the curing time. (a) Curing time versus FA/GGBFS. (b) Curing time versus NaOH/precursor. (c) Curing time versus water/binder. (d) Curing time versus molarity.
Figure 24. SHAP interaction analysis of FA/GGBFS. (a) FA/GGBFS versus NaOH/precursor. (b) FA/GGBFS versus water/binder. (c) FA/GGBFS versus the curing time. (d) FA/GGBFS versus molarity.
Figure 25. SHAP interaction analysis of other factors. (a) NaOH/precursor versus water/binder. (b) NaOH/precursor versus molarity. (c) Water/binder versus molarity.
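The SHAP analyses in Figures 21–25 attribute each prediction to the input variables via Shapley values. As an illustration of the underlying computation only (a toy additive model and made-up baseline, not the study's CNN or its SHAP implementation), the exact Shapley value of each feature can be obtained by enumerating feature coalitions:

```python
from itertools import combinations
from math import factorial

def model(x):
    # Toy additive predictor standing in for a trained model.
    return x[0] + 2 * x[1]

def shapley_values(x, baseline):
    """Exact Shapley values by coalition enumeration (feasible for few features)."""
    n = len(x)

    def value(subset):
        # Features outside the coalition are replaced by their baseline values.
        return model([x[j] if j in subset else baseline[j] for j in range(n)])

    phis = []
    for i in range(n):
        others = [j for j in range(n) if j != i]
        phi = 0.0
        for r in range(n):
            for s in combinations(others, r):
                w = factorial(len(s)) * factorial(n - len(s) - 1) / factorial(n)
                phi += w * (value(set(s) | {i}) - value(set(s)))
        phis.append(phi)
    return phis

phis = shapley_values([3.0, 5.0], baseline=[1.0, 1.0])
```

By construction, the attributions sum to the gap between the prediction at `x` and at the baseline, which is the additivity property the SHAP decision plots rely on.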
Figure 26. PDP analysis of CNN and WGAN-CNN models. (a) Partial dependence on FA/GGBFS. (b) Partial dependence on NaOH/precursor. (c) Partial dependence on water/binder. (d) Partial dependence on curing time. (e) Partial dependence on molarity.
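The 1D PDPs in Figure 26 show the model's average response as one input sweeps a grid while the other inputs keep their observed values. A minimal sketch of that averaging (the predictor, data, and grid below are illustrative assumptions, not the study's):

```python
import numpy as np

def partial_dependence_1d(predict, X, feature, grid):
    """Average prediction over the data with `feature` clamped to each grid value."""
    pdp = []
    for v in grid:
        Xv = X.copy()
        Xv[:, feature] = v          # clamp the feature of interest
        pdp.append(float(predict(Xv).mean()))
    return pdp

# Toy predictor: quadratic in feature 0, linear in feature 1.
predict = lambda X: X[:, 0] ** 2 + X[:, 1]
X = np.array([[0.0, 1.0], [1.0, 2.0], [2.0, 3.0]])
pdp = partial_dependence_1d(predict, X, feature=0, grid=[0.0, 1.0, 2.0])
```

Plotting `pdp` against the grid gives the partial-dependence curve for that feature.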
Table 1. Summary of the ML and DL models used to predict the compressive strength of GPC.

| Reference | Type of GPC | Algorithms | Best Model (R2) |
| --- | --- | --- | --- |
| Zhou et al. [70] | Slag- and CCA-modified GPC | GEP | GEP (0.96) |
| Parhi and Patro [39] | FA-based GPC | RF, NN, MARS, HEML | HEML (0.97) |
| Nazar et al. [55] | FA-based GPC | ANFIS, ANN, GEP | GEP (0.94) |
| Ngo et al. [63] | Coal ash-based GPC | RF, ANN, XGB, GB, PSO | ANN (0.9808) |
| Oyebisi et al. [64] | GGBFS-CCA-based GPC | DNN | DNN (0.986) |
| Huo et al. [48] | Calcium-based GPC | KNN, SVM, RF, GBDT, BA, ET, XGB, DNN | XGB (0.91) |
| Kumar Dash et al. [71] | GGBS-based GPC | ELM, ELM-CSO, ELM-ECSO | ELM-ECSO (0.94) |
| Shahmansouri et al. [58] | GGBS-based GPC | ANN | ANN (0.924) |
| Huynh et al. [65] | FA-based GPC | ANN, DNN, ResNet | ResNet (0.937) |
| Dao et al. [62] | FA-based GPC | ANFIS, ANN | ANFIS (0.879) |

Note: CCA is corn cob ash; GGBS is ground granulated blast slag; HEML is hybrid ensemble machine learning; ECSO is enhanced cat swarm optimization; GEP is gene expression programming; ANFIS is adaptive neuro-fuzzy inference system; MARS is multivariate adaptive regression spline.
Table 2. Statistical description of inputs and output.

| Type | Variable | Standard Deviation | Mean | Min | Max |
| --- | --- | --- | --- | --- | --- |
| Independent | FA/GGBFS | 0.120 | 0.134 | 0 | 0.43 |
| Independent | NaOH/Precursor | 0.034 | 0.151 | 0.1 | 0.2 |
| Independent | Water/Binder | 0.084 | 0.620 | 0.5 | 0.7 |
| Independent | Curing time (days) | 8.103 | 11.318 | 3 | 28 |
| Independent | Molarity (mol/L) | 0.663 | 4.754 | 3.27 | 8.33 |
| Dependent | UCS (MPa) | 0.638 | 1.229 | 0.1 | 3.282 |
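The summary statistics in Table 2 are standard descriptive measures; given raw measurements they can be reproduced as below (the array holds made-up UCS-like values for illustration, not the study's dataset):

```python
import numpy as np

# Illustrative UCS-like values (MPa); NOT the study's data.
ucs = np.array([0.5, 1.2, 0.9, 2.1, 3.0, 0.1, 1.8])

summary = {
    "Standard Deviation": float(np.std(ucs, ddof=1)),  # sample standard deviation
    "Mean": float(ucs.mean()),
    "Min": float(ucs.min()),
    "Max": float(ucs.max()),
}
```

Note the `ddof=1` argument, which gives the sample (rather than population) standard deviation typically reported in such tables.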
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

Share and Cite

Chen, Q.; Hu, G.; Wu, J. Prediction of the Unconfined Compressive Strength of a One-Part Geopolymer-Stabilized Soil Using Deep Learning Methods with Combined Real and Synthetic Data. Buildings 2024, 14, 2894. https://doi.org/10.3390/buildings14092894