Article

A New Shear Wave Velocity-Based Liquefaction Probability Model Using Logistic Regression: Emphasizing Fines Content Optimization

School of Civil Engineering and Transportation, Northeast Forestry University, Harbin 150040, China
*
Author to whom correspondence should be addressed.
Appl. Sci. 2024, 14(15), 6793; https://doi.org/10.3390/app14156793
Submission received: 10 July 2024 / Revised: 1 August 2024 / Accepted: 1 August 2024 / Published: 4 August 2024
(This article belongs to the Section Civil Engineering)

Abstract

A new liquefaction probability model based on shear wave velocity (Vs) was developed through a detailed comparative analysis of existing evaluation methods. Publicly available shear wave velocity liquefaction data were used to evaluate multiple existing liquefaction probability assessment methods under various probability contours and fines content levels. Significant performance differences were observed among the formulae under varying fines content levels. To construct the new model, the random forest feature importance ranking algorithm was employed to select the key parameters, including the effective stress-normalized shear wave velocity (Vs1), corrected cyclic resistance ratio (CSR7.5), magnitude (MW), depth (Z), and fines content (FC). Using these parameters, a new liquefaction probability assessment formula was developed utilizing the logistic regression model to predict the liquefaction probability. The new formula’s performance was subsequently evaluated through a detailed case analysis and validation. The results demonstrate that the new formula achieves a higher accuracy (3–11%) for the liquefaction assessment compared to the existing formulae, performing consistently well across different probability contours and fines content levels, especially in areas with high fines content. This study provides theoretical support and empirical evidence for optimizing the shear wave velocity-based liquefaction probability assessment methods.

1. Introduction

Seismic liquefaction is a primary cause of earthquake-induced structural and infrastructural damage. Major seismic events, such as the 2011 Tohoku earthquake (MW 9.0) and the 1976 Tangshan earthquake (MW 7.5), have precipitated extensive liquefaction phenomena. These occurrences resulted in differential settlement, foundation failures, and sand boils, rendering numerous structures inoperative and presenting substantial challenges to post-disaster reconstruction initiatives [1,2,3].
The precise prediction of liquefaction is paramount for mitigating related disasters and implementing appropriate countermeasures. As a result, liquefaction assessment continues to be a central concern in geotechnical engineering. The parameters frequently employed in liquefaction assessment encompass the Standard Penetration Test (SPT) [4,5], Cone Penetration Test (CPT) [6,7], Dynamic Penetration Test (DPT) [8], and Shear Wave Velocity Test (Vs) [9], as well as methodologies integrating multiple parameters [10].
Vs is pivotal in evaluating the site liquefaction probability, being directly correlated with soil relative density [11]. In the realm of geotechnical engineering, Vs offers numerous dynamic elastic mechanical parameters. Owing to its cost-effectiveness and simplicity, Vs-based liquefaction assessment methodologies have witnessed rapid advancements in recent years. Compared to the SPT and CPT, Vs testing exhibits a superior performance in gravelly soils and irregular sites where the SPT and CPT are challenging to implement [12].
In recent years, Vs testing has gained widespread adoption in geotechnical engineering, leading to the accumulation of substantial data. The Vs of soil layers reflects the soil’s dynamic properties and is extensively utilized in diverse engineering practices. The principal methodologies of Vs testing encompass borehole techniques and surface wave techniques. Borehole techniques are categorized into single-hole and cross-hole methods, whereas surface wave techniques comprise transient surface wave and steady-state surface wave methods. Table 1 lists the characteristics of each method.
Extensive research has been conducted on liquefaction assessment methods based on Vs. In 1981, Dobry et al. [16] first studied the soil liquefaction evaluation method based on Vs. Through indoor cyclic loading tests, they defined the threshold shear strain that generates the pore water pressure and used Seed et al.’s CSR (cyclic stress ratio) calculation formula to convert it into the Vs threshold value [17]. When the field-measured Vs is less than the Vs threshold value, the soil will liquefy. Kayen et al. [18], Robertson et al. [19], and Lodge et al. [20] proposed the critical relationships of liquefaction triggering based on the effective stress-normalized shear wave velocity (Vs1) using different field liquefaction data.
Due to the limitations of deterministic methods, researchers have recently begun to adopt probabilistic liquefaction assessment methods. Juang et al. [21] used logistic regression and Bayesian methods for probabilistic analysis and proposed a liquefaction probability evaluation formula based on Vs. Shen et al. [22] proposed a similar formula based on a logistic regression model, while Kayen et al. [15] proposed a corresponding assessment method based on the Bayesian updating model.
In recent years, as human activities have expanded, liquefaction has been observed in lower magnitude earthquakes at deeper depths and in more diverse soil types. For example, the 2018 Songyuan earthquake (Ms 5.7) caused widespread liquefaction [23]; the 2011 Christchurch earthquake in New Zealand (MW 6.3) saw significant liquefaction at depths of 10–20 m [24]; and the 2008 Wenchuan earthquake (MW 7.9) produced fewer sand boils than previous major earthquakes, but the ejected materials were more diverse, including silt, fine sand, medium sand, coarse sand, gravel, and pebbles [25]. These new phenomena challenge traditional understanding and necessitate the optimization and upgrading of traditional liquefaction risk assessment methods.
Researchers have adopted emerging machine learning techniques as novel methods for studying liquefaction problems, leading to innovative research. For example, Kumar et al. (2021) [26] proposed a deep learning (DL)-based approach for soil liquefaction assessment; Ahmad et al. (2021) [27] analyzed the application of four machine learning techniques in evaluating the likelihood of seismic liquefaction based on CPT data; Mustafa et al. (2022) [28] discussed the applicability of seven machine learning methods for predicting the liquefaction of fine-grained soils; and Nerusupalli et al. (2024) [29] explored the use of random forest (RF) and support vector machine (SVM) models in SPT-based liquefaction studies.
Building on the research above and weighing the advantages and limitations of existing methods, this paper proposes a new Vs-based liquefaction probability model that integrates random forest feature importance ranking and logistic regression. The study is divided into two parts.
Part 1. Based on publicly available Vs liquefaction survey data, two databases (Database A and Database B) were established to evaluate the performance of the existing formulae and to construct and validate the new models. A detailed comparative analysis of existing Vs-based seismic liquefaction probability assessment methods was conducted, studying the performance differences of various formulae at different probability contours and fines content.
Part 2. A new liquefaction probability model was developed using the random forest feature importance ranking algorithm to select key parameters, including the moment magnitude (MW), Vs1, cyclic stress ratio that was adjusted to the reference state of MW = 7.5 (CSR7.5), depth (Z), and fines content (FC). A new evaluation formula was constructed using the logistic regression model. Through case analysis and validation, the new formula was proven to have a higher accuracy in liquefaction assessment under different conditions, particularly in areas with a high fines content.
This study aims to systematically analyze and optimize the liquefaction probability assessment method based on Vs, verifying its effectiveness in practical applications and providing theoretical support and empirical analysis for optimizing liquefaction probability assessment methods.

2. Data

To enhance the analysis, over 1000 sets of liquefaction investigation data based on Vs were collected and organized from publicly accessible sources [15,22,30,31,32,33,34]. These data were divided into two databases—Database A and Database B—categorized by their historical applications.
Database A was utilized to evaluate the performance of existing formulae. It included 225 sets of data derived from 26 earthquakes and 70 site investigations summarized by Andrus et al. [30], comprising 105 liquefaction and 120 non-liquefaction cases. These data served as the foundation for developing Vs-based liquefaction assessment methods and were crucial for evaluating the optimal performance of the current formulae. Figure 1 depicts the sample distribution in Database A.
Database B was employed for the development and validation of the new formulae. To enhance the accuracy and applicability, this database encompassed more extensive and representative Vs data related to earthquake liquefaction. It included all the data from Database A, supplemented by an additional 933 sets from the databases of Kayen et al. [15], Shen et al. [22], Chu et al. [31], Saygili et al. [32], Cai et al. [33], and Hanna et al. [34], among others. This comprehensive dataset, including 417 liquefaction, 507 non-liquefaction, and 9 critical cases, supported the construction and refinement of the new formulae by incorporating key parameters such as the fines content at test sites. Figure 2 illustrates the sample distribution in Database B.

3. Comparative Analysis of Existing Evaluation Methods

Based on the Vs liquefaction database, various methods for assessing liquefaction probability have been developed, including Bayesian probability models and logistic regression. This section elaborates on these models and compares their performance utilizing a common database (Database A).

3.1. Existing Evaluation Methods

The Bayesian probability method uses Bayes’ theorem to calculate the posterior probability of an event given prior information. It offers several advantages: it integrates prior knowledge with observed data, providing reliable inferences even with limited or incomplete data; it represents parameter and model uncertainty using probability distributions; and it rests on a solid theoretical foundation and delivers intuitive results. This paper uses the methods of Juang et al. [21], Chen et al. [35], and Kayen et al. [15] as examples of Vs-based liquefaction assessment models derived from the Bayesian probability method.
Logistic regression is a generalized linear regression analysis model predicated on the sigmoid function. Due to its efficiency, ease of understanding, and simplicity of calculation, it is widely utilized in binary classification and probability prediction. Compared to other generalized linear regression models, logistic regression exhibits a superior performance and greater applicability in seismic liquefaction assessments. This paper employs the methods from Cao et al. [36], Shen et al. [22], and Rollins et al. [9] to illustrate the Vs-based liquefaction assessment models derived from the logistic regression method.

3.1.1. Juang’s Liquefaction Probability Assessment Method

Juang et al. [21] applied the Bayesian probability method to liquefaction probability assessment, exploring the SPT, CPT, and Vs1. This discussion centers on the shear wave velocity method. Given the complexity of the original CRR (cyclic resistance ratio) empirical correction steps, Juang et al. [37] applied artificial neural networks to establish a new CRR equation, resulting in a new CRR formula based on the clean sand equivalence of stress-corrected shear wave velocity (Vs1,cs):
$$CRR = 0.013\left(\frac{V_{s1,cs}}{100}\right)^{4} + \frac{1}{225 - V_{s1,cs}} \tag{1}$$
Vs1,cs is calculated as follows:
$$V_{s1,cs} = K\,V_{s1} \tag{2}$$
K is the adjustment coefficient obtained by regression analysis of the Vs1-CRR data points, calculated as follows:
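$$K = 1, \quad FC \le 5\% \tag{3}$$
$$K = 1 + (FC - 5)\left[0.0090 - 0.0109\left(\frac{V_{s1}}{100}\right) + 0.0038\left(\frac{V_{s1}}{100}\right)^{2}\right], \quad 5\% < FC < 35\% \tag{4}$$
$$K = 1 + 30\left[0.0090 - 0.0109\left(\frac{V_{s1}}{100}\right) + 0.0038\left(\frac{V_{s1}}{100}\right)^{2}\right], \quad FC \ge 35\% \tag{5}$$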
Vs1 is calculated as follows:
$$V_{s1} = V_{s}\left(\frac{P_a}{\sigma'_v}\right)^{0.26} \tag{6}$$
Pa is the standard atmospheric pressure and σv′ is the vertical effective stress.
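The stress normalization and fines content adjustment described above can be sketched in a few lines of Python. This is a minimal illustration rather than the authors' implementation: the function names are hypothetical, the reference pressure defaults to 100 kPa (an assumed engineering approximation of atmospheric pressure), and the interval boundaries for K follow the usual reading of the three cases.

```python
def normalized_vs(vs, sigma_v_eff, pa=100.0, exponent=0.26):
    """Overburden-normalized shear wave velocity Vs1 (m/s).

    pa and sigma_v_eff must share the same unit (kPa is assumed here).
    The exponent is taken from the text of this section.
    """
    return vs * (pa / sigma_v_eff) ** exponent


def fines_adjustment(vs1, fc):
    """Adjustment coefficient K as a function of Vs1 (m/s) and FC (%)."""
    ratio = vs1 / 100.0
    t = 0.0090 - 0.0109 * ratio + 0.0038 * ratio ** 2
    if fc <= 5.0:
        return 1.0
    if fc < 35.0:
        return 1.0 + (fc - 5.0) * t
    return 1.0 + 30.0 * t


def clean_sand_equivalent_vs(vs1, fc):
    """Clean-sand-equivalent velocity Vs1,cs = K * Vs1."""
    return fines_adjustment(vs1, fc) * vs1


# Example: a layer with Vs = 160 m/s at 80 kPa effective stress and 20% fines.
vs1 = normalized_vs(160.0, 80.0)
vs1_cs = clean_sand_equivalent_vs(vs1, 20.0)
```

Because K is at least 1 over the practical velocity range, the adjustment raises the equivalent velocity of soils with fines, which in turn raises the computed resistance.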
Since many geotechnical engineers prefer to represent liquefaction potential using the factor of safety (FS), Juang et al. established the relationship between FS and liquefaction probability through regression analysis of the Andrus sand and silt database [30], resulting in the following liquefaction probability formula for sand and silt:
$$P_L = \frac{1}{1 + \left(\dfrac{FS}{0.72}\right)^{3.1}} \tag{7}$$
PL is the liquefaction probability. Juang conducted logistic regression analysis on the Vs-based liquefaction database and compared the probabilistic correlation curve derived from the Bayesian mapping function with those obtained from logistic regression. The results demonstrated a good consistency, verifying the validity of the proposed Bayesian probability method.
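As a small illustration of the mapping from factor of safety to liquefaction probability, the sketch below evaluates the relationship with the two constants quoted above (0.72 and 3.1). The function name and the parameter names are hypothetical, and FS is assumed to be the usual ratio CRR/CSR.

```python
def liquefaction_probability_from_fs(fs, fs_at_half=0.72, exponent=3.1):
    """Map a factor of safety FS to a liquefaction probability P_L.

    fs_at_half is the FS value at which P_L equals 0.5; exponent controls
    how quickly the probability falls as FS increases.
    """
    if fs < 0:
        raise ValueError("factor of safety must be non-negative")
    return 1.0 / (1.0 + (fs / fs_at_half) ** exponent)


# Example: probabilities for a few factors of safety.
for fs in (0.5, 0.72, 1.0, 1.5):
    print(fs, round(liquefaction_probability_from_fs(fs), 3))
```

With these constants a factor of safety of 1.0 still corresponds to a probability of roughly one in four, which is why a deterministic FS = 1 boundary is not the same as a 50% probability contour.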

3.1.2. Kayen’s Liquefaction Probability Assessment Method

Kayen et al. [15] conducted field investigations on 301 liquefaction cases worldwide from 2001 to 2011, combined with the liquefaction history database, and constructed a global database containing 415 sets of sand and silt Vs liquefaction data. Based on this database, Kayen et al. adopted the Bayesian updating model, using Vs1, CSR, FC, MW, and σv′ as the parameters to establish the liquefaction probability and cyclic resistance ratio formulae for sand and silt:
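$$P_L = \Phi\left\{-\frac{(0.0073\,V_{s1})^{2.8011} - 1.946\ln CSR - 2.6168\ln M_W - 0.0099\ln \sigma'_v + 0.0028\,FC}{0.4809}\right\} \tag{8}$$
$$CRR = \exp\left\{\frac{(0.0073\,V_{s1})^{2.8011} - 2.6168\ln M_W - 0.0099\ln \sigma'_v + 0.0028\,FC + 0.4809\,\Phi^{-1}(P_L)}{1.946}\right\} \tag{9}$$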
Φ is the cumulative normal distribution function. Kayen et al. used the PL = 15% contour as the deterministic liquefaction assessment curve and compared it with the liquefaction assessment curves of Andrus et al. [12] and Zhou et al. [38], pointing out that the liquefaction limit shear wave velocity of Vs1 = 215 m/s proposed by Andrus et al. is overly conservative.
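The probability and resistance expressions of the Bayesian updating model are two views of the same limit state: solving the probability formula for CSR at a chosen P_L gives the CRR formula. The sketch below, which uses only the Python standard library, checks that round trip numerically. The function names are hypothetical, and the effective stress is assumed to be expressed in kPa.

```python
from math import exp, log
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # standard normal distribution, provides Phi and its inverse


def kayen_probability(vs1, csr, mw, sigma_v_eff, fc):
    """Liquefaction probability from Vs1 (m/s), CSR, Mw, sigma'v (kPa assumed), FC (%)."""
    g = ((0.0073 * vs1) ** 2.8011
         - 1.946 * log(csr)
         - 2.6168 * log(mw)
         - 0.0099 * log(sigma_v_eff)
         + 0.0028 * fc)
    return _STD_NORMAL.cdf(-g / 0.4809)


def kayen_crr(vs1, mw, sigma_v_eff, fc, pl):
    """Cyclic resistance ratio on the contour of probability pl (0 < pl < 1)."""
    numerator = ((0.0073 * vs1) ** 2.8011
                 - 2.6168 * log(mw)
                 - 0.0099 * log(sigma_v_eff)
                 + 0.0028 * fc
                 + 0.4809 * _STD_NORMAL.inv_cdf(pl))
    return exp(numerator / 1.946)


# Round trip: the CRR on the contour through a point equals that point's CSR.
pl = kayen_probability(vs1=180.0, csr=0.20, mw=7.0, sigma_v_eff=100.0, fc=10.0)
crr = kayen_crr(vs1=180.0, mw=7.0, sigma_v_eff=100.0, fc=10.0, pl=pl)
```

Setting `pl=0.15` in `kayen_crr` gives points on the contour that the text describes as the deterministic assessment curve.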

3.1.3. Chen’s Liquefaction Probability Assessment Method

After compiling the sand and silt databases of Andrus et al. (1999) [30], Chu et al. (2004) [31], Saygili et al. (2005) [32], and Kayen et al. (2013) [15], Chen et al. [35] obtained a total of 380 liquefaction cases and 234 non-liquefaction cases, amounting to 614 historical datasets. When establishing the probabilistic correlation curve, they incorporated the seismic safety factor of nuclear power plant sites, avoiding misclassifying liquefaction sites as non-liquefaction sites. They repeatedly adjusted the parameters to fit the boundaries of liquefaction and non-liquefaction areas, deriving a relatively conservative new CRR evaluation formula:
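$$CRR = \exp\left[\frac{V_{s1}}{86.4} + \left(\frac{V_{s1}}{134.0}\right)^{2} - \left(\frac{V_{s1}}{125.2}\right)^{3} + \left(\frac{V_{s1}}{158.5}\right)^{4} - 4.8\right] \tag{10}$$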
Chen et al. [35] considered the sample imbalance in the database, where the number of liquefaction datasets exceeded the non-liquefaction datasets, and introduced the weight factors (wNL = 1.2 and wL = 0.8) proposed by Cetin et al. [39] to balance the data imbalance. Based on the simplified liquefaction probability model proposed by Juang et al. [40], they established a shear wave velocity-based liquefaction probability assessment formula for sand and silt, as follows:
$$P_L = \frac{1}{1 + \left(\dfrac{\widetilde{FS}}{0.509}\right)^{2.511}} \tag{11}$$
where $\widetilde{FS}$ is the nominal safety factor.

3.1.4. Cao’s Liquefaction Probability Assessment Method

Cao et al. [36] analyzed the liquefaction investigation data of the Wenchuan earthquake (Ms = 8.0) and integrated Vs with the logistic model for the gravel soil liquefaction probability assessment. The liquefaction probability formula is articulated as follows:
$$P_L = \frac{1}{1 + \exp\left[-\left(\theta_0 + \theta_1 V_{s1} + \theta_2 \ln(CSR)\right)\right]} \tag{12}$$
where θ0, θ1, and θ2 are the coefficients determined by logistic regression of the limited Vs data of the gravel soil from the Wenchuan earthquake. The liquefaction probability formula for gravel soil proposed by Cao is as follows:
$$P_L = \frac{1}{1 + \exp\left[-\left(11.97 - 0.039\,V_{s1} + 1.77 \ln(CSR)\right)\right]} \tag{13}$$

3.1.5. Shen’s Liquefaction Probability Assessment Method

Shen et al. [22] compiled 36 sets of severe liquefaction data from the 2011 New Zealand Canterbury earthquake (MW = 6.3) and integrated them with 225 sets of Andrus’ liquefaction data, creating a new database comprising 261 sets of sand and silt data. Using Vs1,cs and the natural logarithm of the CSR7.5 as the influencing parameters, they established liquefaction probability models for sand and silt based on the logistic model, probit model, log–log model, and complementary log–log model:
Logistic:
$$P_L = \frac{1}{1 + \exp\left[-\left(b_0 + b_1 V_{s1,cs} + b_2 \ln(CSR)\right)\right]} \tag{14}$$
Probit:
$$P_L = \Phi\left[b_0 + b_1 V_{s1,cs} + b_2 \ln(CSR)\right] \tag{15}$$
Log–log:
$$P_L = \exp\left\{-\exp\left[-\left(b_0 + b_1 V_{s1,cs} + b_2 \ln(CSR)\right)\right]\right\} \tag{16}$$
Complementary log–log:
$$P_L = 1 - \exp\left\{-\exp\left[b_0 + b_1 V_{s1,cs} + b_2 \ln(CSR)\right]\right\} \tag{17}$$
Here, b0, b1, and b2 are the model parameters, and Φ is the cumulative distribution function of the standard normal distribution. Comparing the assessment results of the four models, the log–log model and logistic model were considered optimal. However, the log–log model is relatively less conservative, whereas the logistic model is more balanced. This paper exclusively selects the logistic model for comparative analysis, excluding the log–log model from the comparison scope. The liquefaction probability assessment formula is as follows:
$$P_L = \frac{1}{1 + \exp\left(-14.3931 + 0.0552\,V_{s1,cs} - 2.8628 \ln CSR\right)} \tag{18}$$
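The four candidate models differ only in the link function applied to the same linear predictor. The sketch below writes the four links in their standard textbook form and then evaluates the selected logistic formula. The function names are hypothetical, and the signs in `shen_logistic_probability` follow a reading of the printed coefficients in which the probability falls with increasing velocity and rises with increasing CSR.

```python
from math import exp, log
from statistics import NormalDist


def logistic_link(eta):
    return 1.0 / (1.0 + exp(-eta))


def probit_link(eta):
    return NormalDist().cdf(eta)


def log_log_link(eta):
    return exp(-exp(-eta))


def complementary_log_log_link(eta):
    return 1.0 - exp(-exp(eta))


def shen_logistic_probability(vs1_cs, csr):
    """Logistic model with the coefficients quoted in the text (signs assumed)."""
    return 1.0 / (1.0 + exp(-14.3931 + 0.0552 * vs1_cs - 2.8628 * log(csr)))


# The logistic and probit links are symmetric about eta = 0; the other two are not.
for link in (logistic_link, probit_link, log_log_link, complementary_log_log_link):
    print(link.__name__, round(link(0.0), 4))
```

The asymmetry of the log–log and complementary log–log links is one way to see why the four models can disagree on how conservative a given probability contour is.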

3.1.6. Rollins’ Liquefaction Probability Assessment Method

While using Cao’s liquefaction probability formula for back analysis, Rollins et al. [9] found that many non-liquefied points were incorrectly predicted as liquefied. Considering the limited gravel soil data used by Cao et al. [36] in constructing the formula, Rollins et al. [9] integrated 76 sets of liquefaction data from seven countries with the original Chinese data, constructing a large gravel soil liquefaction database containing 174 sets of Vs data, and updated the calculation formula for the stress reduction factor (rd) as follows:
$$r_d = \exp\left[\alpha(z) + \beta(z)\,M_W\right] \tag{19}$$
$$\alpha(z) = -1.012 - 1.126 \sin\left(\frac{z}{11.73} + 5.113\right) \tag{20}$$
$$\beta(z) = 0.106 + 0.118 \sin\left(\frac{z}{11.28} + 5.142\right) \tag{21}$$
Rollins et al. [9] established the following liquefaction probability assessment formula for gravel soil:
$$P_L = \frac{1}{1 + \exp\left[-\left(1.6\,M_W + 4.95 \ln CSR - 3.88 \times 10^{-7}\,V_{s1}^{3}\right)\right]} \tag{22}$$
Compared to Cao’s liquefaction probabilistic correlation curve, Rollins’ curve better fits the field data, and the differences between the probabilistic correlation curves are smaller, resulting in lower uncertainty.
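The depth- and magnitude-dependent stress reduction factor used in this method can be evaluated directly. The following is a minimal sketch with a hypothetical function name; it assumes the depth z is in metres, the sine arguments are in radians, and the leading terms of α(z) are negative so that the factor is close to 1 at the ground surface.

```python
from math import exp, sin


def stress_reduction_factor(z, mw):
    """Stress reduction factor r_d at depth z (m) for moment magnitude mw."""
    alpha = -1.012 - 1.126 * sin(z / 11.73 + 5.113)
    beta = 0.106 + 0.118 * sin(z / 11.28 + 5.142)
    return exp(alpha + beta * mw)


# Example: profile of r_d over the upper 15 m for an Mw 7.5 event.
profile = [(z, stress_reduction_factor(z, 7.5)) for z in range(0, 16, 5)]
```

The factor decreases with depth and, at a given depth, is larger for larger magnitudes, so deeper layers and smaller events receive a lower computed cyclic stress ratio.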

3.2. Overall Accuracy

The predictive performance of each formula is assessed in terms of accuracy. Accuracy is defined as the ratio of correctly predicted samples to the total number of samples, and it is also reported separately for the liquefaction and non-liquefaction cases.
Formulae (7) and (18) use the clean-sand-equivalent Vs1,cs, whereas Formulae (8) and (11) use Vs1 directly. Figure 3 illustrates the binary classification effectiveness of each formula at liquefaction probabilities of 15%, 50%, and 85%. Due to the complex interleaving of liquefaction and non-liquefaction points, delineating clear curves is challenging.
At 15%, all formulae effectively separate the liquefaction points. However, with increased seismic intensity, significant differences in the curve trends emerge. Formula (11) is the most conservative, whereas Formula (7) exhibits steeper trends, better distinguishing the non-liquefaction cases. Both consider an upper Vs limit of 200 m/s for liquefaction, causing rapid curve rises around this threshold.
At 50%, Formula (8) encompasses more liquefaction and non-liquefaction points, indicating a high evaluation accuracy but a higher risk of false positives, making it more conservative overall.
At 85%, the formulae can effectively separate non-liquefaction points, but the overall performance is less effective than at 15%. Kayen’s study does not provide a recommended safety factor for an 85% probability; thus, an FS value of 0.83 is adopted, resulting in a conservative estimate. Formula (7) shows the ideal performance, encompassing more liquefaction points while separating the non-liquefaction cases.
To assess these formulae’s performance, a 50% liquefaction probability is used as a binary evaluation threshold and the accuracy is statistically evaluated using the database. Here, accuracy refers to the proportion of cases where the evaluation matches the actual situation. Figure 4 presents the accuracy rates for each formula.
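The accuracy statistics used here can be sketched as follows. The function name is hypothetical, and treating a probability exactly equal to the threshold as a liquefaction prediction is an assumption, since the text does not state how ties are handled.

```python
def accuracy_summary(probabilities, liquefied, threshold=0.5):
    """Overall, liquefaction and non-liquefaction accuracy at a probability threshold.

    probabilities: predicted liquefaction probabilities, one per case
    liquefied:     observed outcomes (True for liquefaction), same order
    """
    hits = {"overall": 0, "liquefaction": 0, "non_liquefaction": 0}
    counts = {"overall": 0, "liquefaction": 0, "non_liquefaction": 0}
    for p, observed in zip(probabilities, liquefied):
        group = "liquefaction" if observed else "non_liquefaction"
        correct = (p >= threshold) == bool(observed)
        for key in ("overall", group):
            counts[key] += 1
            hits[key] += int(correct)
    # Groups with no cases are reported as None rather than dividing by zero.
    return {k: (hits[k] / counts[k] if counts[k] else None) for k in counts}


summary = accuracy_summary(
    probabilities=[0.9, 0.6, 0.4, 0.2, 0.7],
    liquefied=[True, True, True, False, False],
)
```

Reporting the two class accuracies alongside the overall figure matters because a conservative formula can score well on liquefaction cases while misclassifying many non-liquefaction cases.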
Overall, the accuracy of each formula is around 80%. Formulae (7) and (18) exhibit the highest accuracy, approximately 3% higher than formulae (8) and (11). Notably, formula (11) has a balanced accuracy in both liquefaction and non-liquefaction areas.
The Bayesian and logistic regression models demonstrate close overall accuracy, suggesting a high consistency and reliability in liquefaction evaluation, suitable for constructing and optimizing liquefaction probability methods.

3.3. Accuracy for Different Levels of Fines Content

The fines content significantly impacts liquefaction, yet its influence pattern remains unclear. The fines content is defined as the percentage of particles smaller than 0.075 mm by mass. The accuracy of each formula is compared across three fines content intervals: FC ≤ 5%, 5% < FC ≤ 35%, and FC > 35% (Figure 5).
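Grouping the cases by the three fines content intervals before computing accuracy can be sketched as below. The function names and the record layout are hypothetical, and the 50% threshold is carried over from the overall comparison.

```python
def fines_content_interval(fc):
    """Label the fines content interval for FC given in percent."""
    if fc <= 5.0:
        return "FC <= 5%"
    if fc <= 35.0:
        return "5% < FC <= 35%"
    return "FC > 35%"


def accuracy_by_fines_content(records, threshold=0.5):
    """Accuracy per interval for records of (fc, probability, liquefied)."""
    hits, totals = {}, {}
    for fc, probability, liquefied in records:
        key = fines_content_interval(fc)
        totals[key] = totals.get(key, 0) + 1
        hits[key] = hits.get(key, 0) + int((probability >= threshold) == bool(liquefied))
    return {key: hits[key] / totals[key] for key in totals}


by_interval = accuracy_by_fines_content(
    [(3.0, 0.8, True), (4.0, 0.3, True), (20.0, 0.2, False), (40.0, 0.7, False)]
)
```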
At FC ≤ 5%, Formulae (7) and (18) exhibit the highest overall accuracy, with consistent results in both the liquefaction and non-liquefaction areas. The other formulae also perform well, indicating an effective liquefaction assessment at low fines contents. Specifically, Formula (11) shows a higher accuracy in non-liquefaction areas and is slightly less conservative than the others. The overall accuracy of Formula (8) is slightly lower than that of Formulae (7) and (18).
At 5% < FC ≤ 35%, Formula (8) maintains the highest overall accuracy, with minimal reduction compared to FC ≤ 5% and balanced performance in both the liquefaction and non-liquefaction areas, showing strong adaptability. Formulae (7) and (18) exhibit slightly lower accuracy. Formula (11) excels in non-liquefaction areas, albeit with some imbalance.
At FC > 35%, the overall accuracy decreases significantly in the liquefaction areas while remaining stable in the non-liquefaction areas, indicating potential risks. Formulae (7) and (18) outperform Formulae (8) and (11).
The accuracy of all formulae decreases with the increasing fines content. While each performs well in low fines content areas, the accuracy drops substantially in high fines content areas, highlighting the need for further research to improve the adaptability and precision.

3.4. Applicability Analysis

Overall, Formulae (7) and (18) perform well across different probability contours, particularly at low fines contents, with a high and balanced accuracy. Formula (8) adapts well to moderate fines content areas, whereas Formula (11) excels in non-liquefaction areas but exhibits a lower accuracy in liquefaction areas.
These findings indicate the limitations in existing liquefaction evaluation formulae when handling high fines contents, necessitating further research and optimization to enhance their adaptability and accuracy.

4. Development of LR Liquefaction Probability Model

The new method was built on a logistic regression model, with Database B serving as the fitting sample. The parameters were chosen based on the importance ranking from the random forest algorithm, and the logistic regression coefficients were obtained by maximum likelihood estimation. The construction approach of the new method is illustrated in Figure 6.

4.1. Feature Selection and Parameter Setting

In constructing the model, Vs serves as the primary parameter for the liquefaction assessment, necessitating the inclusion of suitable auxiliary parameters to ensure the method’s efficacy. This study employs machine learning techniques to identify the key parameters. Specifically, the random forest feature importance ranking algorithm in MATLAB evaluates the significance of various parameters influencing earthquake-induced liquefaction.
Feature importance measures the extent to which each parameter influences the liquefaction outcome and its contribution to the model’s predictive performance. In this study, feature importance is calculated based on the Out-of-Bag (OOB) error. The OOB data refer to the approximately one-third of the data that are not used in the construction of a decision tree during the bootstrap sampling. These data are used to calculate the model’s prediction error rate, known as the OOB error.
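The "approximately one-third" figure follows from a short calculation, given here under the assumption that each tree is trained on a bootstrap sample of n observations drawn with replacement from the n available observations. The probability that a particular observation is never drawn, and is therefore out-of-bag for that tree, is:

```latex
P(\text{observation } i \text{ is out-of-bag})
  = \left(1 - \frac{1}{n}\right)^{n}
  \;\xrightarrow{\,n \to \infty\,}\; e^{-1} \approx 0.368
```

For a sample of several hundred cases the finite-n value is already very close to this limit.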
Specifically, the importance of a feature X is determined by randomly permuting its values in the OOB data and calculating the resulting increase in the prediction error for each tree. The average of these increases over all trees is then divided by their standard deviation over the ensemble to give the importance value of the feature. In summary, feature importance is estimated from the change in the OOB error before and after noise is introduced into the feature values in the OOB data.
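The procedure can be illustrated with a compact, standard-library sketch. It is not the MATLAB routine used in the study: to keep the example short, each "tree" is a single-split decision stump fitted on a bootstrap sample, which is an assumed stand-in for a full decision tree, and all function names are hypothetical. The out-of-bag permutation step and the mean-over-standard-deviation score follow the description above.

```python
import random
from statistics import mean, pstdev


def fit_stump(X, y, rows):
    """Fit a one-split classifier on the given row indices (0/1 labels)."""
    best = None  # (training errors, feature index, threshold, label on the <= side)
    for j in range(len(X[0])):
        for t in sorted({X[i][j] for i in rows}):
            for low_label in (0, 1):
                errors = sum(
                    (low_label if X[i][j] <= t else 1 - low_label) != y[i]
                    for i in rows
                )
                if best is None or errors < best[0]:
                    best = (errors, j, t, low_label)
    _, j, t, low_label = best
    return lambda row: low_label if row[j] <= t else 1 - low_label


def error_rate(predict, rows_x, rows_y):
    wrong = sum(1 for x, label in zip(rows_x, rows_y) if predict(x) != label)
    return wrong / len(rows_x)


def oob_permutation_importance(X, y, n_trees=25, seed=0):
    """Mean increase in OOB error after permuting each feature, over its std."""
    rng = random.Random(seed)
    n, p = len(X), len(X[0])
    increases = [[] for _ in range(p)]
    for _ in range(n_trees):
        boot = [rng.randrange(n) for _ in range(n)]  # bootstrap sample
        in_bag = set(boot)
        oob = [i for i in range(n) if i not in in_bag]
        if not oob:
            continue
        predict = fit_stump(X, y, boot)
        oob_x = [list(X[i]) for i in oob]
        oob_y = [y[i] for i in oob]
        base = error_rate(predict, oob_x, oob_y)
        for j in range(p):
            column = [row[j] for row in oob_x]
            rng.shuffle(column)  # break the link between feature j and the outcome
            permuted = [row[:j] + [v] + row[j + 1:] for row, v in zip(oob_x, column)]
            increases[j].append(error_rate(predict, permuted, oob_y) - base)
    scores = []
    for deltas in increases:
        spread = pstdev(deltas)
        scores.append(mean(deltas) / spread if spread > 0 else 0.0)
    return scores


# Synthetic check: feature 0 determines the label, feature 1 is unrelated.
X = [[float(i), float((7 * i) % 10)] for i in range(40)]
y = [int(i >= 20) for i in range(40)]
scores = oob_permutation_importance(X, y)
```

In this synthetic case the stumps never split on the unrelated feature, so permuting it changes nothing and its score is zero, while the informative feature receives a positive score.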
The analysis encompassed all the collected independent variables, including Vs, Vs1, Vs1,cs, groundwater level (dw), CSR, CSR7.5, peak ground acceleration (amax), MW, FC, depth (Z), vertical total stress (σv), and σv′, with the liquefaction outcomes as the dependent variable. The raw data were cleaned and normalized, with any duplicates being checked and removed. The parameters such as amax and CSR were recalculated and validated. Additionally, the missing values of various feature parameters were compared and addressed. The final parameter importance ranking, illustrated in Figure 7, highlights that FC is the most critical parameter, followed by those representing the site’s seismic intensity.
The ranking of certain similar parameters in Figure 7 requires careful differentiation. For example, although parameters such as Vs, Vs1, and Vs1,cs may have reduced individual importance due to their similar mechanisms, it is more critical to compare the importance among these primary parameters than to compare them with other distinct parameters. Similarly, the primary parameters representing seismic intensity (amax and CSR) and the auxiliary parameters representing depth and groundwater level should be analyzed using the same approach. Therefore, the impact of the parameters’ similarity on their importance needs to be discerned through engineering experience. Based on these findings, Vs1, CSR7.5, MW, Z, and FC were selected as the parameters for the liquefaction probability model.
Among these parameters, the fines content (FC) has garnered significant attention in recent liquefaction studies. Researchers have widely recognized the influence of fines content on liquefaction and have incorporated this parameter into the development of liquefaction probability assessment methods in recent years [15,28]. This paper confirms the critical impact of fines content on liquefaction assessment from a statistical data analysis perspective.

4.2. Model Construction

A logistic regression model was employed to develop the new method using five parameters: Vs1, CSR7.5, MW, Z, and FC. The model was optimized based on parameter importance rankings, as shown in Equation (23):
$$P_L = \frac{1}{1 + e^{-\left(\theta_1 V_{s1} + \theta_2 \ln CSR_{7.5} + \theta_3 \ln M_W + \theta_4 \ln Z + \theta_5 FC + \theta_6\right)}} \tag{23}$$
where θ1, θ2, θ3, θ4, θ5, and θ6 are the model parameters, estimated by maximum likelihood from 924 samples. The likelihood function is as follows:
$$L(\theta) = \left(\prod_{a=1}^{n_L} \frac{1}{1 + e^{-\left(\theta_1 V_{s1} + \theta_2 \ln CSR_{7.5} + \theta_3 \ln M_W + \theta_4 \ln Z + \theta_5 FC + \theta_6\right)}}\right)^{w_L} \left(\prod_{b=1}^{n_{NL}} \left[1 - \frac{1}{1 + e^{-\left(\theta_1 V_{s1} + \theta_2 \ln CSR_{7.5} + \theta_3 \ln M_W + \theta_4 \ln Z + \theta_5 FC + \theta_6\right)}}\right]\right)^{w_{NL}} \tag{24}$$
where nL represents 417 liquefaction datasets; nNL represents 507 non-liquefaction datasets; wL is the correction coefficient for the discrepancy in the liquefaction data quantity; and wNL is the correction coefficient for the discrepancy in the non-liquefaction data quantity, calculated based on Ku’s recommendations [41]:
$$w_L = \frac{n_L + n_{NL}}{2 \times n_L} \tag{25}$$
$$w_{NL} = \frac{n_L + n_{NL}}{2 \times n_{NL}} \tag{26}$$
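The weighting scheme and the weighted likelihood can be sketched as follows. This is an illustrative outline with hypothetical function names, not the study's MATLAB code: the log of the likelihood is used for numerical convenience, each sample's feature vector is assumed to end with a constant 1 so that the last coefficient acts as the intercept, and the optimizer that maximizes the function is left out.

```python
from math import exp, log1p


def class_weights(n_liq, n_non_liq):
    """Correction coefficients (w_L, w_NL) for unequal class sizes."""
    total = n_liq + n_non_liq
    return total / (2 * n_liq), total / (2 * n_non_liq)


def log_sigmoid(eta):
    """Numerically stable log(1 / (1 + exp(-eta)))."""
    return -log1p(exp(-eta)) if eta >= 0 else eta - log1p(exp(eta))


def weighted_log_likelihood(theta, samples, w_liq, w_non_liq):
    """Log of the weighted likelihood for a logistic model.

    samples: iterable of (features, liquefied); features has the same length
    as theta and ends with 1.0 for the intercept term.
    """
    total = 0.0
    for features, liquefied in samples:
        eta = sum(t * x for t, x in zip(theta, features))
        if liquefied:
            total += w_liq * log_sigmoid(eta)        # log P_L
        else:
            total += w_non_liq * log_sigmoid(-eta)   # log (1 - P_L)
    return total


w_liq, w_non_liq = class_weights(417, 507)
```

For the case counts quoted in the text this gives weights of about 1.108 and 0.911, and the weighted class totals are equal, which is the purpose of the correction.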
Using this database, wL = 1.10 and wNL = 0.91 were calculated. The model was implemented in MATLAB, and the logistic regression coefficients were obtained by maximum likelihood estimation. A quasi-Newton method was used for the optimization, with the Hessian matrix used to accelerate convergence. Several candidate formulae were derived. After excluding those that violated the known physical behavior of liquefaction and adjusting the remainder based on engineering experience, the final formula was obtained as follows:
$$P_L = \frac{1}{1 + e^{-\left(-0.0696\,V_{s1} + 2.2606 \ln CSR_{7.5} + 7.9805 \ln M_W - 1.3489 \ln Z - 0.0244\,FC - 3.4025\right)}} \tag{27}$$
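A five-parameter logistic model of this form can be evaluated with the short sketch below. The function and constant names are hypothetical. The coefficient signs are an assumption: they follow one reading of the printed formula (negative on Vs1, ln Z, FC and the constant term, with the whole exponent negated), chosen so that the probability falls with increasing velocity and rises with increasing cyclic stress ratio, and they should be checked against the published equation before any use.

```python
from math import exp, log

# Coefficients for (Vs1, ln CSR7.5, ln Mw, ln Z, FC, constant); signs are assumed.
ASSUMED_THETA = (-0.0696, 2.2606, 7.9805, -1.3489, -0.0244, -3.4025)


def lr_liquefaction_probability(vs1, csr75, mw, z, fc, theta=ASSUMED_THETA):
    """Liquefaction probability from a logistic model in the five selected parameters.

    vs1 in m/s, csr75 dimensionless, mw moment magnitude, z in m, fc in percent.
    """
    t1, t2, t3, t4, t5, t6 = theta
    eta = t1 * vs1 + t2 * log(csr75) + t3 * log(mw) + t4 * log(z) + t5 * fc + t6
    return 1.0 / (1.0 + exp(-eta))


# Example: sensitivity of the probability to velocity, other inputs held fixed.
for vs1 in (100.0, 125.0, 150.0):
    print(vs1, round(lr_liquefaction_probability(vs1, 0.2, 7.5, 5.0, 10.0), 3))
```

Passing a different `theta` tuple allows the same function to be reused for any of the candidate coefficient sets produced during fitting.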

5. Discussion of Results

To evaluate the performance of the new formula in comparison to existing ones, a comparative analysis was conducted using the comprehensive Database B.
Comparing the predicted results with the observed field outcomes directly demonstrates the accuracy of the new method given in Equation (27). The liquefaction evaluation results under various probability contours and fines contents are presented in Figure 8 and Figure 9.
Figure 8 demonstrates that the new formula achieves an overall accuracy exceeding 60% across all the probability contours, reaching the highest accuracy at the 60% probability contour. The new formula maintains a leading overall accuracy across all the probability profiles compared to the other formulae. At a conservative probability level of 20%, the overall accuracy of the new formula is comparable to the best-performing formulae (e.g., Formula (8)), exceeding the accuracy of the other formulae by approximately 3%. At a balanced probability level of 50%, the overall accuracy of the proposed formula is approximately 3% higher than all the other formulae. Specifically, the non-liquefaction accuracy is 3% to 7% higher than that of the other formulae, while the liquefaction accuracy is 3.8% lower than that of Formula (8). However, the total accuracy and non-liquefaction accuracy of the new formula are both higher than those of Formula (8).
Figure 9 compares the accuracy of the new formula with the other formulae under varying fines contents. The new formula exhibits a clear advantage in balance across the fines content intervals. Its liquefaction evaluation accuracy ranges between 75% and 87%, comparable to the best performance of the existing formulae. Moreover, its non-liquefaction evaluation accuracy never falls below 50%, significantly higher than that of all the other formulae, demonstrating the new formula’s balance within these intervals. In the FC > 35% interval, although Formula (8) outperforms the new formula by 7% in liquefaction accuracy, its non-liquefaction accuracy is lower by 14%.
Additionally, when FC > 35%, every formula except the new one shows a slight increase in liquefaction accuracy and a significant decrease in non-liquefaction accuracy. In this range, the overall non-liquefaction accuracy of the existing formulae is around 40%, while the new formula maintains at least 60%, which is, on average, 11% higher. This indicates that the new formula better accounts for the effect of fines content on the liquefaction probability evaluation, reflecting the fines content optimization carried out during its construction. The new formula is therefore well balanced across fines contents and, particularly in the non-liquefaction evaluation, significantly outperforms the other formulae.
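The breakdown by fines content can be sketched in the same spirit. The snippet below groups cases into the three intervals used in the text (FC ≤ 5%, 5% < FC ≤ 35%, FC > 35%) and reports the liquefaction and non-liquefaction accuracy in each. The helper names, the 50% contour default, and the toy cases are assumptions made for illustration; they do not reproduce the paper's data or results.

```python
from typing import Dict, Iterable, List, Optional, Tuple

# (fines content in %, predicted P_L, liquefaction observed)
Case = Tuple[float, float, bool]


def fc_interval(fc_percent: float) -> str:
    """Map a fines content to one of the three intervals used in the text."""
    if fc_percent <= 5.0:
        return "FC <= 5%"
    if fc_percent <= 35.0:
        return "5% < FC <= 35%"
    return "FC > 35%"


def accuracy_by_fc(cases: Iterable[Case],
                   contour: float = 0.5) -> Dict[str, Dict[str, Optional[float]]]:
    """Liquefaction and non-liquefaction accuracy per fines content interval."""
    tallies: Dict[str, Dict[str, List[int]]] = {}
    for fc, p_l, observed in cases:
        tally = tallies.setdefault(
            fc_interval(fc),
            {"liquefaction": [0, 0], "non_liquefaction": [0, 0]},  # [hits, total]
        )
        key = "liquefaction" if observed else "non_liquefaction"
        tally[key][1] += 1
        tally[key][0] += int((p_l >= contour) == observed)
    return {
        interval: {key: (hit / total if total else None)
                   for key, (hit, total) in tally.items()}
        for interval, tally in tallies.items()
    }


# Toy cases for illustration only.
toy_cases = [
    (3.0, 0.80, True), (4.0, 0.30, False), (5.0, 0.60, False),
    (20.0, 0.70, True), (30.0, 0.40, True), (35.0, 0.20, False),
    (40.0, 0.90, True), (60.0, 0.55, False),
]

for interval, scores in accuracy_by_fc(toy_cases).items():
    print(interval, scores)
```

Reporting the two accuracies separately per interval, rather than a single pooled figure, is what makes an imbalance such as high liquefaction accuracy paired with low non-liquefaction accuracy visible.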

6. Conclusions

Existing methods are generally conservative for liquefiable soils with high fines content, and their performance differs significantly across fines content levels. Formulae (7) and (18) exhibit a high overall accuracy at low fines content (FC ≤ 5%); Formula (8) performs well at medium fines content (5% < FC ≤ 35%); and Formula (11) achieves a high non-liquefaction accuracy but a low liquefaction accuracy at high fines content, showing an imbalance. Five parameters (Vs1, CSR7.5, MW, Z, and FC) were selected through random forest screening, and a new liquefaction probability evaluation formula was constructed from them using the logistic regression model. Compared with the other formulae, the new formula is better balanced between the liquefaction and non-liquefaction evaluations under different probability contours and fines contents. Especially notable is the significant increase in accuracy (11%) under high fines content (FC > 35%) conditions.
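For readers unfamiliar with the model form, the sketch below shows how a logistic regression maps the five selected parameters to a probability. It assumes a linear predictor in the raw parameter values, and the coefficients are arbitrary placeholders chosen only so that the example runs: they are not the fitted values of Equation (27), and their signs and magnitudes should not be read as results of this study.

```python
import math
from typing import Mapping

FEATURES = ("Vs1", "CSR7.5", "MW", "Z", "FC")

# Placeholder coefficients for illustration only; NOT the values of Equation (27).
PLACEHOLDER_COEFFS = {
    "intercept": 2.0,
    "Vs1": -0.03,    # m/s
    "CSR7.5": 12.0,  # dimensionless
    "MW": 0.2,
    "Z": -0.05,      # m
    "FC": -0.01,     # %
}


def liquefaction_probability(x: Mapping[str, float],
                             coeffs: Mapping[str, float] = PLACEHOLDER_COEFFS) -> float:
    """P_L = 1 / (1 + exp(-eta)), with eta = b0 + sum(b_i * x_i)."""
    eta = coeffs["intercept"] + sum(coeffs[name] * x[name] for name in FEATURES)
    # Numerically stable evaluation of the logistic function.
    if eta >= 0:
        return 1.0 / (1.0 + math.exp(-eta))
    e = math.exp(eta)
    return e / (1.0 + e)


# Hypothetical site values, for illustration only.
soft_site = {"Vs1": 150.0, "CSR7.5": 0.20, "MW": 7.5, "Z": 5.0, "FC": 10.0}
stiff_site = dict(soft_site, Vs1=220.0)

print(round(liquefaction_probability(soft_site), 3))
print(round(liquefaction_probability(stiff_site), 3))
```

A probability contour P_L = p then corresponds to the set of parameter values for which the linear predictor equals ln(p / (1 − p)), which is how a single fitted model yields the family of contours compared in Figure 8.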
This paper examines the performance and advantages of existing Vs-based liquefaction probability evaluation methods and constructs a new liquefaction probability evaluation formula using a logistic regression model, with the factors that influence liquefaction selected through random forest screening. Although the new formula performs well in many respects, it still has limitations. The geological conditions and regional coverage of the database are limited, which may affect the model's generalizability. Future research should expand the database with more extensive and comprehensive data to improve the model's generalizability and reliability. Additionally, exploring other advanced machine learning algorithms, such as deep learning, may further enhance the model's performance.

Author Contributions

Conceptualization, Y.Y.; methodology, Y.W.; software, Y.W.; validation, Y.Y.; formal analysis, Y.Y.; investigation, Y.Y.; resources, Y.Y.; data curation, Y.W.; writing—original draft preparation, Y.Y.; writing—review and editing, Y.Y. All authors have read and agreed to the published version of the manuscript.

Funding

This work was supported by the National Natural Science Foundation of China (52370128).

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

No new data were created.

Conflicts of Interest

The authors declare no conflicts of interest.

Nomenclature

| Symbol | Definition | Symbol | Definition |
|---|---|---|---|
| amax | peak ground acceleration | Pa | standard atmospheric pressure |
| CSR | cyclic stress ratio | PL | liquefaction probability |
| CSR7.5 | CSR normalized to MW = 7.5 | Vs | shear wave velocity |
| CRR | cyclic resistance ratio | Vs1 | effective stress-normalized shear wave velocity |
| dw | groundwater level | Vs1,cs | clean sand equivalence of stress-corrected shear wave velocity |
| FC | fines content | Z | depth |
| FS | factor of safety | σv | vertical total stress |
| F̃S | nominal safety factor | σ′v | vertical effective stress |
| K | fines content correction | Φ | cumulative normal distribution function |
| MW | moment magnitude | | |
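As a reading aid for the symbols above, the following sketch evaluates Vs1 and CSR7.5 using forms that are widely used in the literature: the overburden normalization of Andrus and Stokoe [12], the simplified cyclic stress ratio of Seed and Idriss [17], and a magnitude scaling factor of (MW/7.5)^−2.56. These are assumptions for illustration; the paper's own defining equations may differ in detail. The stress reduction coefficient `r_d` is taken as an input, Pa is taken as 100 kPa, and all function names are hypothetical.

```python
def stress_normalized_vs(vs: float, sigma_v_eff: float, pa: float = 100.0) -> float:
    """Vs1 = Vs * (Pa / sigma'_v)^0.25, with stresses in kPa and Vs in m/s."""
    return vs * (pa / sigma_v_eff) ** 0.25


def cyclic_stress_ratio(a_max_g: float, sigma_v: float,
                        sigma_v_eff: float, r_d: float) -> float:
    """CSR = 0.65 * (a_max / g) * (sigma_v / sigma'_v) * r_d, with a_max in g."""
    return 0.65 * a_max_g * (sigma_v / sigma_v_eff) * r_d


def magnitude_scaling_factor(mw: float) -> float:
    """Assumed form MSF = (MW / 7.5)^-2.56."""
    return (mw / 7.5) ** -2.56


def csr_7_5(a_max_g: float, sigma_v: float, sigma_v_eff: float,
            r_d: float, mw: float) -> float:
    """CSR normalized to MW = 7.5: CSR / MSF."""
    return cyclic_stress_ratio(a_max_g, sigma_v, sigma_v_eff, r_d) / magnitude_scaling_factor(mw)


# Hypothetical layer: Vs = 200 m/s, sigma_v = 100 kPa, sigma'_v = 50 kPa,
# a_max = 0.2 g, r_d = 1.0 (illustrative values only).
print(round(stress_normalized_vs(200.0, 50.0), 1))
print(round(csr_7_5(0.2, 100.0, 50.0, 1.0, 7.5), 3))
print(round(csr_7_5(0.2, 100.0, 50.0, 1.0, 6.5), 3))
```

With these forms, an earthquake smaller than MW = 7.5 has a scaling factor above 1, so its CSR7.5 is lower than its unscaled CSR.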

References

  1. Zhou, J.; Huang, S.; Zhou, T.; Armaghani, D.J.; Qiu, Y. Employing a genetic algorithm and grey wolf optimizer for optimizing RF models to evaluate soil liquefaction potential. Artif. Intell. Rev. 2022, 55, 5673–5705.
  2. Castro, G. Liquefaction and cyclic mobility of saturated sands. J. Geotech. Eng. Div. 1975, 101, 551–569.
  3. Guo, H.; Zhao, J.X. The surface rupture zone and paleoseismic evidence on the seismogenic fault of the 1976 Ms 7.8 Tangshan earthquake, China. Geomorphology 2019, 327, 297–306.
  4. Hu, J.L.; Tang, X.W.; Qiu, J.N. A Bayesian network approach for predicting seismic liquefaction based on interpretive structural modeling. Georisk Assess. Manag. Risk Eng. Syst. Geohazards 2015, 9, 200–217.
  5. Vipin, K.S.; Sitharam, T.G.; Anbazhagan, P. Probabilistic evaluation of seismic soil liquefaction potential based on SPT data. Nat. Hazards 2010, 53, 547–560.
  6. Mahmood, A.; Tang, X.W.; Qiu, J.N.; Gu, W.J.; Feezan, A. A hybrid approach for evaluating CPT-based seismic soil liquefaction potential using Bayesian belief networks. J. Cent. South Univ. 2020, 27, 500–516.
  7. Hu, J.L.; Liu, H.B. Bayesian network models for probabilistic evaluation of earthquake-induced liquefaction based on CPT and Vs databases. Eng. Geol. 2019, 254, 76–88.
  8. Rollins, K.M.; Roy, J.; Athanasopoulos-Zekkos, A.; Zekkos, D.; Amoroso, S.; Cao, Z. A new dynamic cone penetration test–based procedure for liquefaction triggering assessment of gravelly soils. J. Geotech. Geoenvironmental Eng. 2021, 147, 04021141.
  9. Rollins, K.M.; Roy, J.; Athanasopoulos-Zekkos, A.; Zekkos, D.; Amoroso, S.; Cao, Z.; Milana, G.; Vassallo, M.; Di Giulio, G. A new Vs-based liquefaction-triggering procedure for gravelly soils. J. Geotech. Geoenvironmental Eng. 2022, 148, 04022040.
  10. Yang, H.; Liu, Z.; Xie, Y.; Li, S. A probabilistic liquefaction reliability evaluation system based on CatBoost-Bayesian considering uncertainty using CPT and Vs measurements. Soil Dyn. Earthq. Eng. 2023, 173, 108101.
  11. Wang, T.; Xiao, S.; Zhang, J.; Zuo, B. Depth-consistent models for probabilistic liquefaction potential assessment based on shear wave velocity. Bull. Eng. Geol. Environ. 2022, 81, 255.
  12. Andrus, R.D.; Stokoe, K.H. Liquefaction resistance of soils from shear-wave velocity. J. Geotech. Geoenvironmental Eng. 2000, 126, 1015–1025.
  13. Hou, X.M.; Qu, S.Y.; Shi, X.D. Signal processing method for shear wave velocity measurement. Earthq. Eng. Eng. Vib. 2007, 6, 205–212.
  14. Lu, C.C.; Hwang, J.H. Correlations between Vs and SPT-N by different borehole measurement methods: Effect on seismic site classification. Bull. Earthq. Eng. 2020, 18, 1139–1159.
  15. Kayen, R.; Moss, R.E.S.; Thompson, E.M.; Seed, R.B.; Cetin, K.O.; Kiureghian, A.D.; Tanaka, Y.; Tokimatsu, K. Shear-wave velocity–based probabilistic and deterministic assessment of seismic soil liquefaction potential. J. Geotech. Geoenvironmental Eng. 2013, 139, 407–419.
  16. Dobry, R.; Stokoe, K.H.; Ladd, R.S.; Youd, T.L. Liquefaction susceptibility from S-wave velocity. In Proceedings of the In-Situ Tests to Evaluate Liquefaction Susceptibility, ASCE National Convention, New York, NY, USA, 26–31 October 1981.
  17. Seed, H.B.; Idriss, I.M. Simplified procedure for evaluating soil liquefaction potential. J. Soil Mech. Found. Div. 1971, 97, 1249–1273.
  18. Kayen, R.E.; Mitchell, J.K.; Seed, R.B.; Lodge, A.; Nishio, S.Y.; Coutinho, R. Evaluation of SPT-, CPT-, and shear wave-based methods for liquefaction potential assessment using Loma Prieta data. In Proceedings of the 4th Japan-US Workshop on Earthquake Resistant Design of Lifeline Facilities and Countermeasures for Soil Liquefaction, Buffalo, NY, USA, 19–21 August 1991.
  19. Robertson, P.K.; Woeller, D.J.; Finn, W.D.L. Seismic cone penetration test for evaluating liquefaction potential under cyclic loading. Can. Geotech. J. 1992, 29, 686–695.
  20. Lodge, A.L. Shear Wave Velocity Measurements for Subsurface Characterization; University of California: Berkeley, CA, USA, 1994.
  21. Juang, C.H.; Chen, C.J.; Jiang, T. Probabilistic framework for liquefaction potential by shear wave velocity. J. Geotech. Geoenvironmental Eng. 2001, 127, 670–678.
  22. Shen, M.; Chen, Q.; Zhang, J.; Gong, W.; Juang, C.H. Predicting liquefaction probability based on shear wave velocity: An update. Bull. Eng. Geol. Environ. 2016, 75, 1199–1214.
  23. Li, P.; Tian, Z.; Bo, J.; Zhu, S.; Li, Y. Study on sand liquefaction induced by Songyuan earthquake with a magnitude of M5.7 in China. Sci. Rep. 2022, 12, 9588.
  24. Lees, J.; Ballagh, R.; Orense, R.; van Ballegooy, S. CPT-based analysis of liquefaction and re-liquefaction following the Canterbury earthquake sequence. Soil Dyn. Earthq. Eng. 2015, 79, 304–314.
  25. Zhou, Y.G.; Xia, P.; Ling, D.S.; Chen, Y.M. Liquefaction case studies of gravelly soils during the 2008 Wenchuan earthquake. Eng. Geol. 2020, 274, 105691.
  26. Kumar, D.; Samui, P.; Kim, D.; Singh, A. A novel methodology to classify soil liquefaction using deep learning. Geotech. Geol. Eng. 2021, 39, 1049–1058.
  27. Ahmad, M.; Tang, X.W.; Qiu, J.N.; Ahmad, F.; Gu, W.J. Application of machine learning algorithms for the evaluation of seismic soil liquefaction potential. Front. Struct. Civ. Eng. 2021, 15, 490–505.
  28. Ozsagir, M.; Erden, C.; Bol, E.; Sert, S.; Özocak, A. Machine learning approaches for prediction of fine-grained soils liquefaction. Comput. Geotech. 2022, 152, 105014.
  29. Moghaddam, A.; Barari, A.; Farahani, S.; Tabarsa, A.; Jeng, D.-S. Effective stress analysis of residual wave-induced liquefaction around caisson-foundations: Bearing capacity degradation and an AI-based framework for predicting settlement. Comput. Geotech. 2023, 159, 105364.
  30. Andrus, R.D.; Stokoe, K.H., II; Chung, R.M. Draft Guidelines for Evaluating Liquefaction Resistance Using Shear Wave Velocity Measurements and Simplified Procedures; US Department of Commerce, Technology Administration, National Institute of Standards and Technology: Gaithersburg, MD, USA, 1999.
  31. Chu, B.L.; Hsu, S.C.; Chang, Y.M. Ground behavior and liquefaction analyses in central Taiwan-Wufeng. Eng. Geol. 2004, 71, 119–139.
  32. Saygili, G. Liquefaction Potential Assessment in Soil Deposits Using Artificial Neural Networks; Concordia University: Montreal, QC, Canada, 2005.
  33. Cai, G.J.; Liu, S.Y.; Puppala, A.J. Liquefaction assessments using seismic piezocone penetration (SCPTU) test investigations in Tangshan region in China. Soil Dyn. Earthq. Eng. 2012, 41, 141–150.
  34. Hanna, A.M.; Ural, D.; Saygili, G. Neural network model for liquefaction potential in soil deposits using Turkey and Taiwan earthquake data. Soil Dyn. Earthq. Eng. 2007, 27, 521–540.
  35. Guoxing, C.; Mengyun, K.; Khoshnevisan, S.; Weiyun, C.; Xiaojun, L. Calibration of Vs-based empirical models for assessing soil liquefaction potential using expanded database. Bull. Eng. Geol. Environ. 2019, 78, 945–957.
  36. Cao, Z.Z.; Youd, T.L.; Yuan, X.M. Gravelly soils that liquefied during 2008 Wenchuan, China earthquake, Ms = 8.0. Soil Dyn. Earthq. Eng. 2011, 31, 1132–1143.
  37. Juang, C.H.; Chen, C.J. A rational method for development of limit state for liquefaction evaluation based on shear wave velocity measurements. Int. J. Numer. Anal. Methods Geomech. 2000, 24, 1–27.
  38. Zhou, Y.G.; Chen, Y.M. Laboratory investigation on assessing liquefaction resistance of sandy soils by shear wave velocity. J. Geotech. Geoenvironmental Eng. 2007, 133, 959–972.
  39. Cetin, K.O.; Seed, R.B.; Der Kiureghian, A.; Tokimatsu, K.; Harder, L.F., Jr.; Kayen, R.E.; Moss, R.E. Standard penetration test-based probabilistic and deterministic assessment of seismic soil liquefaction potential. J. Geotech. Geoenvironmental Eng. 2004, 130, 1314–1340.
  40. Juang, C.H.; Jiang, T.; Andrus, R.D. Assessing probability-based methods for liquefaction potential evaluation. J. Geotech. Geoenvironmental Eng. 2002, 128, 580–589.
  41. Ku, C.-S.; Juang, C.H.; Chang, C.-W.; Ching, J. Probabilistic version of the Robertson and Wride method for liquefaction evaluation: Development and application. Can. Geotech. J. 2012, 49, 27–44.
Figure 1. Sample distribution in Database A. (a) Sample distribution of liquefaction conditions; and (b) sample distribution under different fines contents.
Figure 2. Sample distribution in Database B. (a) Sample distribution of liquefaction conditions; and (b) sample distribution under different fines contents.
Figure 3. Curves for separating liquefaction and non-liquefaction cases using various formulae under different probabilities. (a) Equation (7), (b) Equation (8), (c) Equation (11), and (d) Equation (18).
Figure 4. Accuracy of different formulae.
Figure 5. Accuracy of different formulae for different fines contents: (a) FC ≤ 5%, (b) 5% < FC ≤ 35%, and (c) FC > 35%.
Figure 6. Flowchart of the construction process.
Figure 7. Ranking of parameters’ importance.
Figure 8. Accuracy of evaluation under different probability contours for new formula and existing formulae: (a) Equation (27), (b) Equation (7), (c) Equation (8), (d) Equation (11), and (e) Equation (18).
Figure 9. Evaluation accuracy under different fines contents for new formula and existing formulae: (a) Equation (27), (b) Equation (7), (c) Equation (8), (d) Equation (11), and (e) Equation (18).
Table 1. Vs testing methods and characteristics.

| Testing Method | Operation Method | Advantages | Disadvantages |
|---|---|---|---|
| Borehole Method: Single Borehole | Place a geophone in a single borehole and generate waves using a seismic source. Record the waveform to calculate the Vs of the soil. | Simple principle, easy calculation, large data volume at different depths, low cost [13]. | Weak anti-interference capability, limited testing depth. |
| Borehole Method: Cross-Hole | Generate shear waves in one borehole and receive them in two other boreholes. Calculate the Vs based on the propagation distance and time. | Strong anti-interference capability, wide application range, large testing depth. | High cost, high construction requirements, significant limitations due to borehole inclination [14]. |
| Surface Wave Method: Transient Surface Wave | The exciter generates signals and the geophone records Rayleigh waves. The Vs is obtained after conversion [15]. | No need for drilling, fast testing speed, strong adaptability, wide application. | Requires conversion of Rayleigh waves, resulting in larger errors. |
| Surface Wave Method: Steady-State Surface Wave | The exciter emits fixed-frequency Rayleigh waves. Measure the wavelength and convert to Vs. | Compensates for the low excitation energy of the transient surface wave method. | Complex equipment, long measurements. |
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

Yang, Y.; Wei, Y. A New Shear Wave Velocity-Based Liquefaction Probability Model Using Logistic Regression: Emphasizing Fines Content Optimization. Appl. Sci. 2024, 14, 6793. https://doi.org/10.3390/app14156793
