1. Introduction
Liquefaction is one of the most destructive effects of earthquakes. This phenomenon, which has been observed repeatedly during recent earthquakes, is caused by seismic shear waves that propagate upward to the surface layers and increase pore water pressure in saturated, loose to relatively loose sandy deposits. Liquefaction occurs when rapid earthquake motion prevents drainage, so that the excess pore water pressure rises to as much as the initial overburden stress. Liquefaction causes severe property damage and fatalities. After liquefaction, the strength and stiffness of the liquefied soil are considerably decreased, often resulting in a range of structural failures. Evaluating the liquefaction potential therefore guides the choice of the method to be applied to mitigate the hazard.
A number of approaches and models have been presented to assess the liquefaction potential of soils. Some of these approaches follow a stress-based procedure, on the basis of the equation presented by Seed and Idriss [1]. In this procedure, the shear stress induced by an earthquake is first determined from the peak ground surface acceleration (amax). Then, the cyclic stress ratio (CSR) and the cyclic resistance ratio (CRR) are determined, and the liquefaction potential is assessed by comparing them [2,3,4,5,6,7,8,9,10,11]. Strain-based methods have also been developed, on the assumption that the growth of pore water pressure during dynamic loading is controlled by the cyclic shear strain [12,13].
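As a hedged illustration of the stress-based comparison described above, the sketch below uses the commonly quoted simplified form of the Seed and Idriss relation, CSR = 0.65 (amax/g)(σv/σ′v) rd, and takes the factor of safety as CRR/CSR. The function names, argument names, and numerical inputs are illustrative assumptions rather than values from this study, and the stress reduction factor rd is passed in directly because the text does not state how it is obtained.

```python
def cyclic_stress_ratio(a_max_g, sigma_v, sigma_v_eff, r_d):
    """Simplified CSR: 0.65 * (a_max / g) * (sigma_v / sigma_v') * r_d.

    a_max_g     -- peak ground surface acceleration as a fraction of g
    sigma_v     -- total vertical overburden stress (any consistent unit)
    sigma_v_eff -- effective vertical overburden stress (same unit)
    r_d         -- stress reduction factor (assumed to be supplied by the user)
    """
    if sigma_v_eff <= 0:
        raise ValueError("effective stress must be positive")
    return 0.65 * a_max_g * (sigma_v / sigma_v_eff) * r_d


def factor_of_safety(crr, csr):
    """Ratio of resistance (CRR) to demand (CSR); values below 1 suggest liquefaction."""
    return crr / csr


# Hypothetical inputs, for illustration only.
csr = cyclic_stress_ratio(a_max_g=0.2, sigma_v=100.0, sigma_v_eff=50.0, r_d=1.0)
fs = factor_of_safety(crr=0.3, csr=csr)
print(f"CSR = {csr:.3f}, FS = {fs:.2f}")
```

A factor of safety below one indicates that the seismic demand exceeds the soil's cyclic resistance.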
In contrast, other researchers have assessed the potential of liquefaction through an energy-based method, which considers the energy dissipated into the soil by earthquake motions. Energy-based approaches can be divided into three groups: those based on case histories from past earthquakes [14,15,16], those based on the Arias intensity (Ih) [17,18], and those based on laboratory test results [19,20]. Shafee et al. [21] performed some uniaxial shaking tests and demonstrated that the difference between the strain energy generated in the soil by biaxial and uniaxial shaking tests is negligible. Zheghal et al. [22] studied the effect of non-proportionality and the phase angle of the induced shear stresses on rising pore water pressure. Moreover, by considering laboratory test results, researchers have investigated the influence of five parameters on the capacity energy of soils (W) [13,23,24,25,26,27,28,29,30,31]: the initial effective mean confining pressure, initial relative density (Dr)%, fine content (FC)%, coefficient of uniformity (Cu), and mean grain size (D50). Baziar et al. [26] collected a large dataset covering a wide range of test results and six parameters, and divided it randomly into training and testing sets in order to develop artificial neural networks (ANNs). Furthermore, they eliminated the coefficient of curvature (Cc) because it did not increase the model's accuracy, and they presented new artificial neural network (ANN) and multi-linear regression (MLR) models with five parameters: the initial effective mean confining pressure, Dr%, FC%, Cu, and D50 (in mm). With the same dataset collected by Baziar et al. [26], Alavi et al. [28] developed three new models. By adding new data to the dataset of Baziar et al. [26] and applying an adaptive neuro-fuzzy inference system (ANFIS), Cabalar et al. [32] developed a model with six parameters, including Cc, and demonstrated its influence. However, data division was conducted randomly in all of these studies, without considering the statistical aspects of the parameters. Furthermore, the data were divided into only two groups, training and testing, without a validating phase to prevent overtraining of the ANN model. A validating phase is applied to minimize overfitting of the trained model [33,34]. In addition, Tao [35] investigated the complicated influence of FC and illustrated that liquefaction resistance, in terms of the unit energy, starts to increase with an increase in FC above 28%. Tao also indicated that the liquefaction resistance becomes less dependent on relative density when FC is less than 28%. Maurer et al. [36] investigated 7000 case histories from the 2010–2011 Canterbury Earthquakes and indicated that the evaluation of liquefaction is less accurate when soils have a high FC value. Although these studies have indicated an altered influence of a high FC value on liquefaction assessment, this influence has not yet been taken into account in a proposed model.
In this study, a larger dataset is first collected. Then, to evaluate the influence of FC on W, two ANN models are constructed that include the following six input parameters: the initial effective mean confining pressure, Dr%, FC%, Cu, D50, and Cc. To analyze the complicated influence of FC, the first ANN model is derived without any constraints on the range of the input parameters, similarly to the studies reviewed herein, and the second ANN model is derived from a database with FC values less than 28%, as in Tao [35]. To increase the accuracy and capability of the ANN models, the dataset is divided into three groups with similar mean and coefficient of variation (COV) values of the parameters, instead of by random division. The first group is for the training phase, the second is for testing, and the third is for the validating phase, which prevents overtraining in the training phase. In the second step, six explicit equations are derived by using the response surface method (RSM), which was demonstrated to be a capable method for evaluating liquefaction in sandy soil by Pirhadi et al. [37]. To the best of our knowledge, no other studies have been conducted on liquefaction using RSM. In Section 5, the dataset and the two derived ANN models and their characteristics are described. For each ANN model, three different designs of experiments (DOEs) are performed, so that three equations are obtained to express the correlation between the six defined parameters and W as the target. During this step, the statistical significance of every term of the equations is analyzed through hypothesis testing, and to obtain more capable and reliable equations, individual terms that do not provide a meaningful correlation with the target are eliminated, instead of eliminating a whole parameter such as Cc. The final equations thus contain all six parameters and are presented in Section 6. Furthermore, by applying three different DOEs, their influence and capability are studied to determine the best DOE for similar problems. Finally, to demonstrate the accuracy and capability of the presented equations, their predictions are compared with those of four existing, well-known models. Section 7 presents this comparison, using 20 samples from Dief [38] that are not included in the database used to develop the ANN and RSM models for this study. Figure 1 illustrates the flowchart of the process applied in this study to develop the RSM equations.
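The idea of dividing the data so that the training, testing, and validating groups share similar mean and COV values can be sketched as follows. This is only one possible way to do it, not the procedure used in this study: the sketch assumes that many random shuffles are tried and the one whose groups deviate least from the full dataset is kept. The split fractions, the number of trials, the scoring rule, and the demonstration data are all hypothetical.

```python
import random
import statistics


def mean_and_cov(values):
    """Return (mean, coefficient of variation) of a list of numbers."""
    m = statistics.fmean(values)
    cov = statistics.pstdev(values) / m if m != 0 else 0.0
    return m, cov


def rel_gap(value, reference):
    """Relative difference to the reference (absolute difference if reference is 0)."""
    return abs(value - reference) / abs(reference) if reference else abs(value - reference)


def split_score(rows, groups):
    """Largest relative gap in mean or COV between any group and the full dataset."""
    worst = 0.0
    for col in range(len(rows[0])):
        full_mean, full_cov = mean_and_cov([r[col] for r in rows])
        for group in groups:
            g_mean, g_cov = mean_and_cov([r[col] for r in group])
            worst = max(worst, rel_gap(g_mean, full_mean), rel_gap(g_cov, full_cov))
    return worst


def statistical_split(rows, fractions=(0.7, 0.15, 0.15), trials=500, seed=0):
    """Try many random splits and keep the statistically most balanced one."""
    rng = random.Random(seed)
    n_train = round(fractions[0] * len(rows))
    n_test = round(fractions[1] * len(rows))
    best_groups, best_score = None, float("inf")
    for _ in range(trials):
        shuffled = rows[:]
        rng.shuffle(shuffled)
        groups = (shuffled[:n_train],
                  shuffled[n_train:n_train + n_test],
                  shuffled[n_train + n_test:])
        score = split_score(rows, groups)
        if score < best_score:
            best_groups, best_score = groups, score
    return best_groups, best_score


# Hypothetical three-column dataset, for illustration only.
demo_rng = random.Random(1)
rows = [[demo_rng.uniform(40, 400), demo_rng.uniform(10, 100), demo_rng.uniform(1, 3)]
        for _ in range(40)]
(train, test, valid), score = statistical_split(rows)
print(len(train), len(test), len(valid), round(score, 3))
```

Keeping the group statistics close to those of the full dataset reduces the chance that the network is tested or validated on data that lie outside the range it was trained on.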
2. Approaches Based on Laboratory Test Results
By monitoring a number of site responses to earthquake acceleration time histories in the western United States and by introducing the normalized maximum energy, which is the area under the stress–strain curve of the earthquake motion at depth, Alkahtib [39] derived a relationship between maximum energy, Dr, amax, and initial effective confining stress.
Additionally, Liang [13] derived the equations by performing 74 torsional shear tests on Reid Bedford sand, Lower San Fernando Dam (LSFD) silty sand, and Lapis Luster dried sand (LSI-30):
LSI-30 sand:
where δW is the cumulative unit energy (J/m³), Γ is the shear strain amplitude, and R² is the coefficient of determination.
Furthermore, by conducting 27 strain-controlled torsional triaxial tests on Reid Bedford sand, Kusky [23] derived the following two regression equations:
where f is the cyclic rate (Hz).
For the first time, Figueroa et al. [24] and Rokoff [25] inspected the influence of particle size distribution on the potential of liquefaction according to strain energy-based procedures. Rokoff performed cyclic torsional shear tests on sand samples from Nevada, incorporating Cu and Cc, and presented the equations expressed as:
where Di is the particle diameter read from the grain-size distribution at a given percent finer, denoted by the subscript i.
Through a statistical method, using the test results of Liang [40] and Rokoff [25], Wallin [41] presented mathematical equations for Reid Bedford sand, LSFD silty sand, and Nevada sand.
Baziar et al. [26] collected a large database containing 284 cyclic triaxial, torsional shear, and simple shear test results, and they developed two ANN models. Thereafter, they evaluated their models by comparing the ANN results with data from 18 centrifuge tests. Their first model included six input parameters, namely the initial effective mean confining pressure, Dr%, FC%, Cu, D50 (in mm), and Cc, and the output (the target) was W (J/m³). In the second model, they eliminated Cc and developed the model with the remaining five input parameters.
With the same database as Baziar et al. [26] and using genetic programming (GP), linear genetic programming (LGP), and multi-expression programming (MEP), Alavi et al. [28] developed three equations to evaluate W. Cabalar et al. [32] utilized an ANFIS on the same database as Baziar et al. [26] and illustrated the effect of the input parameters by graphical representation. By adding some new datasets to those of Baziar et al. [26] and by applying GP, Baziar et al. [27] developed an equation to estimate W with the same parameters as those of Baziar et al. [26]. Zhang et al. [29] applied multivariate adaptive regression splines (MARS), which is a nonparametric regression procedure, and by using a similar database to that of Baziar et al. [26], they developed a model to estimate W based on five input parameters similar to those of the previously mentioned studies [26,27,32]. It should be mentioned that all models and equations presented in the previously mentioned studies [26,27,28,29,32] estimate the capacity energy (W) in logarithmic form (log W).
6. The RSM Equations
The second-degree polynomial with cross-terms, Equation (14), is selected to establish the RSM equations because it is the most capable and precise model. Based on the two ANN models developed in this study and three DOEs, a total of six equations are derived. For the BB, CC, and HCC designs, 54, 90, and 53 coded samples, respectively, were constructed for the six input parameters in this study. Because measured values of W are not available at the coded points, the ANN models were used to predict the targets of the coded samples. Thereafter, for each ANN model, the three DOEs were performed to develop three equations to predict W for any dataset. Then, by performing a hypothesis test through P-values, some terms of the original second-degree polynomial with cross-terms were eliminated. In this study, the commonly used alpha value of 0.05 is adopted. If the P-value of a term is larger than alpha, the null hypothesis that the term has no effect cannot be rejected and the term is eliminated, whereas if it is less than alpha, the null hypothesis is rejected and the term is retained. The RSM was then rerun repeatedly to establish the final equations, as presented in Table 5, Table 6, Table 7, Table 8, Table 9 and Table 10. Finally, their results were compared to other well-known models to demonstrate the accuracy and capability of the six equations presented herein.
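To make the structure of a second-degree polynomial with cross-terms concrete, the sketch below lists its terms for six factors and applies a P-value filter. It is a minimal illustration under stated assumptions: the factor labels, the rule of always keeping the intercept, and the example P-values and coefficients are hypothetical, and no regression is fitted here. In practice, the P-values would come from the regression output of the RSM software.

```python
from itertools import combinations

# Illustrative labels for the six inputs; "sigma" stands for the confining pressure.
FACTORS = ["sigma", "Dr", "FC", "Cu", "D50", "Cc"]


def quadratic_terms(factors):
    """Terms of a full second-degree polynomial, each written as a tuple of factors."""
    terms = [()]                                # intercept
    terms += [(f,) for f in factors]            # linear terms
    terms += [(f, f) for f in factors]          # squared terms
    terms += list(combinations(factors, 2))     # two-factor cross-terms
    return terms


def significant_terms(p_values, alpha=0.05):
    """Keep terms with P-value below alpha (the intercept is kept by assumption)."""
    return [t for t, p in p_values.items() if t == () or p < alpha]


def predict(coefficients, coded_x):
    """Evaluate the polynomial at coded inputs, given a term -> coefficient mapping."""
    total = 0.0
    for term, b in coefficients.items():
        product = 1.0
        for f in term:
            product *= coded_x[f]
        total += b * product
    return total


terms = quadratic_terms(FACTORS)
print(len(terms))  # 1 intercept + 6 linear + 6 squared + 15 cross-terms = 28

# Hypothetical P-values for three of the terms.
kept = significant_terms({(): 0.90, ("Dr",): 0.01, ("Cc", "Cc"): 0.20})
print(kept)
```

Filtering individual terms in this way keeps every parameter available to the model while removing only the terms that the data do not support.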
The adjusted R² demonstrates the power of the regression, taking into account the number of predictors. In other words, it is a modified version of R² that never exceeds R²; the closer it is to R², the greater the accuracy and ability to predict. Note that, to use these equations to predict W, the real values of the six input parameters must first be converted to coded values using Equation (16) below; then, the value of W can be estimated by substituting the coded values into the RSM equations presented in Table 5, Table 6, Table 7, Table 8, Table 9 and Table 10.
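For reference, the usual definition of the adjusted R² is 1 − (1 − R²)(n − 1)/(n − p − 1), where n is the number of samples and p is the number of predictor terms. The short sketch below evaluates it; the function name and the example numbers are illustrative assumptions and are not taken from Table 5 to Table 10.

```python
def adjusted_r2(r2, n_samples, n_predictors):
    """Adjusted R^2 = 1 - (1 - R^2) * (n - 1) / (n - p - 1)."""
    dof = n_samples - n_predictors - 1
    if dof <= 0:
        raise ValueError("need more samples than predictors plus one")
    return 1.0 - (1.0 - r2) * (n_samples - 1) / dof


# Same hypothetical R^2 and sample count, with fewer terms after elimination.
full_model = adjusted_r2(0.90, n_samples=54, n_predictors=27)
reduced_model = adjusted_r2(0.90, n_samples=54, n_predictors=12)
print(round(full_model, 3), round(reduced_model, 3))
```

For the same R², a model with fewer terms has an adjusted R² closer to R², which is why eliminating insignificant terms tends to narrow the gap between the two.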
Both groups of RSM equations presented in this study must be applied with caution:
- (1) Both groups of RSM equations require soil properties and laboratory test results to estimate Dr%, FC%, Cu, D50 (in mm), and Cc.
- (2) Both groups of RSM equations are applicable only within the range of the parameters defined in Table 1 and Table 3.
- (3) The second group of RSM equations is only applicable for samples with an FC value of less than 28%.
- (4) It is necessary to convert the real values of the parameters to coded values as in Equation (16), and then input these into the equations to estimate the results.
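The conversion from real to coded values can be sketched as below. Because Equation (16) is not reproduced in this section, the sketch assumes the linear coding commonly used in response surface designs, in which the lower and upper bounds of a parameter map to −1 and +1. The function names and the example bounds are hypothetical, and the actual bounds should be taken from Table 1 and Table 3.

```python
def to_coded(x, low, high):
    """Map a real value in [low, high] to the coded interval [-1, +1]."""
    if high <= low:
        raise ValueError("high must be greater than low")
    center = (high + low) / 2.0
    half_range = (high - low) / 2.0
    return (x - center) / half_range


def to_real(coded, low, high):
    """Inverse mapping from a coded value back to the real scale."""
    return (high + low) / 2.0 + coded * (high - low) / 2.0


# Hypothetical bounds for relative density, Dr (%), for illustration only.
dr_low, dr_high = 20.0, 80.0
dr_coded = to_coded(65.0, dr_low, dr_high)
print(dr_coded, to_real(dr_coded, dr_low, dr_high))
```

A coded value outside [−1, +1] signals that the sample lies outside the assumed parameter range, which corresponds to caution (2) above.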
7. Comparison of the Predicted Capacity Energy of Liquefaction between the RSM Equations and Existing Models
To demonstrate the capability and accuracy of the RSM equations presented in this study, their predictions are compared with those of the GP, LGP, MEP [28], and MARS [29] models, which are presented in Appendix A. This is undertaken using 20 samples from Dief [38], from Nevada sand and Reid Bedford sand, which were not included in the database used to develop the ANN and RSM models for this study. The parameter values of the 20 samples are listed in Table 11, and the predicted values are compared in Table 12.
To illustrate the capability and accuracy of the different equations, the results are summarized in Table 13 in terms of the root mean square error (RMSE), mean absolute error (MAE), and coefficient of correlation (R). Lower RMSE and MAE values and a higher R indicate higher accuracy. The additional models considered herein are genetic programming (GP), linear genetic programming (LGP), and multi-expression programming (MEP), all developed by Alavi et al. [28], and multivariate adaptive regression splines (MARS), presented by Zhang et al. [29].
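The three measures can be computed as in the following sketch, which uses their standard definitions (R is taken here as the Pearson correlation coefficient, an assumption since the text does not define it). The function names and the sample numbers are illustrative and are not values from Table 12.

```python
import math


def rmse(measured, predicted):
    """Root mean square error between measured and predicted values."""
    n = len(measured)
    return math.sqrt(sum((m - p) ** 2 for m, p in zip(measured, predicted)) / n)


def mae(measured, predicted):
    """Mean absolute error between measured and predicted values."""
    n = len(measured)
    return sum(abs(m - p) for m, p in zip(measured, predicted)) / n


def pearson_r(measured, predicted):
    """Pearson coefficient of correlation between two equally long sequences."""
    n = len(measured)
    mean_m = sum(measured) / n
    mean_p = sum(predicted) / n
    cov = sum((m - mean_m) * (p - mean_p) for m, p in zip(measured, predicted))
    var_m = sum((m - mean_m) ** 2 for m in measured)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    return cov / math.sqrt(var_m * var_p)


# Hypothetical log W values, for illustration only.
measured = [3.1, 3.4, 2.9, 3.8]
predicted = [3.0, 3.5, 3.1, 3.6]
print(rmse(measured, predicted), mae(measured, predicted), pearson_r(measured, predicted))
```

RMSE penalizes large individual errors more heavily than MAE, while R measures only how well the predictions follow the trend of the measurements, so a model can rank differently under each measure.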
The capability and accuracy of the predicted values of the capacity energy of soil liquefaction (W) were compared between the six new equations and the other models. All results are summarized in Table 14. Figure 2, Figure 3, Figure 4, Figure 5, Figure 6, Figure 7, Figure 8, Figure 9, Figure 10 and Figure 11 provide a visual comparison of the results. The second group of equations, derived from the database limited to FC values of less than 28%, showed higher accuracy for the DOEs conducted in this study, as can be seen in Figure 5, Figure 6 and Figure 7. The CC and BB designs also indicated higher accuracy in comparison with the HCC design, as shown in Figure 2, Figure 3, Figure 4, Figure 5, Figure 6 and Figure 7. Furthermore, the BB28 and CC equations, with R values of 0.911 and 0.830, respectively, demonstrated the highest precision, closely followed by the HCC and CC28 equations, with 0.792 and 0.722, respectively.
On the other hand, with RMSE and MAE values of 0.173 and 0.139, respectively, CC28 showed the lowest error. Considering the graphs, BB28 and CC28 were the most capable and accurate models among all six equations presented in this study as well as the four additional models, as shown in Figure 2, Figure 3, Figure 4, Figure 5, Figure 6 and Figure 7. Moreover, five of the six RSM models demonstrated a higher R than the four additional models. The RMSEs of CC28, MEP, and BB28 were 0.173, 0.182, and 0.218, respectively, the lowest values and therefore the highest accuracy in predicting W, as demonstrated in Table 14. In addition, the MAEs of CC28, MEP, and BB28 (0.139, 0.157, and 0.203, respectively) were the lowest.
As can be seen from Figure 2, Figure 3, Figure 4, Figure 5, Figure 6, Figure 7, Figure 8, Figure 9, Figure 10 and Figure 11, among all models evaluated in this study, CC and HCC underpredicted W. In contrast, HCC28 and LGP overestimated W. Furthermore, BB28 shows the highest accuracy in predicting W.
8. Summary and Conclusions
In this study, the response surface method (RSM) was used as a new tool to develop six new equations to estimate the capacity energy of soil liquefaction (W). While the RSM has been used in industry, medicine, and science, the literature review indicates that it has not been used by other researchers to investigate soil liquefaction. To examine the complicated influence of fine content (FC), two databases were arranged and an artificial neural network (ANN) model was developed for each. Three RSM equations were then developed based on each ANN model, with each equation corresponding to a specific design of experiment (DOE). The first dataset contained six parameters, namely the initial effective mean confining pressure, initial relative density (Dr)%, FC, coefficient of uniformity (Cu), coefficient of curvature (Cc), and mean grain size (D50), with no limitation on the range of the parameters, whereas the second dataset was built by eliminating all samples with FCs higher than 28%. To establish the RSM equations, three common DOEs, the Box-Behnken (BB), central composite (CC), and half central composite (HCC) designs, were applied to assess the best DOE. Then, after performing a hypothesis test based on P-values, some terms of the original equations were eliminated instead of eliminating a whole parameter such as Cc. The RSM procedure was then rerun repeatedly to obtain the most accurate and capable final equations.
To validate and confirm the capability of the developed RSM models, 20 new laboratory test samples, which were not included in either dataset in this study, were selected, and their measured values of W were compared with the values predicted by the six RSM models as well as by four other available models with similar parameters. To compare the results, the RMSE, MAE, and R were computed for the predicted values of all models. The major conclusions drawn are as follows:
Applying a validation phase provides a significant increase in the accuracy of the model in predicting W. Furthermore, performing data division considering statistical factors instead of random division raises the performance of the model.
The second group of models, containing three equations, demonstrates higher capability and accuracy in estimating W. It should be noted that the second group of models was derived from a smaller dataset of 309 samples, because samples with FCs higher than 28% were eliminated. Therefore, FCs higher than 28% are confirmed to have a different effect on W.
In general, the RSM is a capable tool for predicting the potential of liquefaction, and it can be used by researchers.
Of all the DOEs inspected in the present study, the CC and BB designs demonstrated the highest capability and accuracy in predicting W; they both displayed lower RMSEs and MAEs, and they both had a higher R².