
Machine Learning–Enhanced Modeling of Stress–Strain Behavior of Frozen Sandy Soil

Danial Rezazadeh Eidgahee
Hodjat Shiri
Civil Engineering Department, Faculty of Engineering and Applied Sciences, Memorial University of Newfoundland, St. John’s, NL A1C 5S7, Canada
Author to whom correspondence should be addressed.
Geotechnics 2024, 4(4), 1228–1245.
Submission received: 8 October 2024 / Revised: 27 November 2024 / Accepted: 30 November 2024 / Published: 3 December 2024


Abstract

Many experimental and computational techniques have been employed to explain the mechanical properties of frozen soils. Nevertheless, because their responses are highly complex, modeling the stress–strain characteristics of frozen soils remains challenging. In this study, artificial neural networks (ANNs) were employed to model the mechanical behavior of frozen soil, and several testing strategies were compared. A database of stress–strain data from triaxial tests on frozen sandy soil at varying temperatures and confining pressures was compiled and used to train the models. Separate ANNs were then developed to estimate the deviatoric stress and the volumetric strain, with temperature, axial strain, and confining pressure as the main input variables. The findings indicate that the models predict the stress–strain behavior of frozen soil with a high level of accuracy.

1. Introduction

Frozen ground supports crucial infrastructure in cold regions, including pipelines, railways, and buildings, and ensuring the stability of these structures is a complex challenge [1,2]. Unlike their unfrozen counterparts, frozen soils exhibit complicated mechanical behavior that is sensitive to temperature variations. Accurately predicting how frozen soil responds to stress is therefore essential for safe and sustainable construction in cold regions. Frozen soil is common across the globe: permafrost, a permanently frozen layer, underlies 21.8% of the Northern Hemisphere’s landmass [3]. Frozen ground is not uniform, however. Classifications such as short-term frozen soil and seasonal frozen soil reflect differences in freezing duration [4]. The extent of frozen soil also grows significantly during winter, with an estimated 50% of the land area experiencing frozen conditions in the coldest month [5]. The inherent complexity of frozen soil arises from its heterogeneity, discontinuous nature, and highly non-linear response to stress and strain. These characteristics make it particularly difficult to predict its mechanical behavior accurately under real-world conditions [6,7]. For this reason, exploring the mechanical response of frozen soil under varying temperatures and stress conditions is crucial, especially for engineering projects in cold regions.
Researchers have developed constitutive models of frozen soil from several theoretical starting points, such as extended hypoplastic behavior [8], the rate-independent behavior of saturated frozen soils [9], and a phenomenological elastoplastic damage constitutive model [10], alongside other research on the strength, load type, and deformation characteristics of frozen soils [11]. While valuable, traditional constitutive models for frozen soil behavior face several limitations that hinder their real-world application [12]. They are typically effective only for specific soil types, which restricts their broader use in engineering practice. A model may be accurate for the data used to develop it, yet that accuracy may not carry over to a different type of stress [13]. The increasing mathematical complexity of these models often translates into a large number of parameters that must be calibrated, and this calibration is difficult enough to obstruct practical use in engineering [12,13]. In addition, the mechanical behavior of soil is influenced by factors such as load history, time, pore water pressure, relative density, preload, pressure field, and loading rate [14].
Machine learning-based approaches, which learn directly from datasets and exploit modern computing capabilities, offer an alternative to traditional mechanical behavior modeling. In the past decade, interest in data-driven machine learning approaches within civil and geotechnical engineering has increased notably [15,16,17]. Machine learning techniques have also been applied successfully to predict frozen soil characteristics. Artificial neural networks (ANNs) optimized with genetic algorithms were used to predict unfrozen water content in frozen clay [18]. The long short-term memory (LSTM) approach, combined with Monte Carlo dropout, was employed to predict the stress–strain response of frozen soil while quantifying uncertainty [7]. This growing body of research highlights a key distinction between traditional constitutive models and artificial intelligence-based (AI-based) approaches. While constitutive models are built on mathematical equations and assumptions, AI-based models capture the complex, non-linear relationships between stress and strain in soils through their ability to analyze high-dimensional data [12,19]. AI-based methods are therefore a promising route for predicting the mechanical behavior of frozen soil, offering a potentially more accurate and versatile approach than traditional methods.

2. Research Significance

This research investigates the use of artificial neural networks (ANNs) to reproduce the stress–strain behavior of frozen sandy soils measured in experimental triaxial tests, with a focus on improving precision and applicability. It aims to establish reliable and user-friendly models with an explainable, closed-form formulation for determining the stress–strain response of frozen sandy soil under different freezing temperatures and confining pressures. To this end, four different combinations of training and testing datasets, all drawn from triaxial tests on frozen samples, and two different input feature combinations were considered. To mitigate potential overfitting, a rigorous approach to model selection and evaluation was implemented. Model performance was assessed through comparisons with experimental data and unseen test data, and common error criteria were evaluated and analyzed. Provided the developed models achieve an acceptable level of accuracy, they can be used to give a general estimate of the stress–strain behavior of frozen sandy soil, which is a key factor in general engineering design work. The aim is to overcome the lack of straightforward solutions for characterizing the mechanical behavior of frozen sandy soils. The overall workflow is depicted in Figure 1.

3. Materials and Methods

The data were extracted from frozen soil triaxial tests at freezing temperatures of −4 and −6 °C and confining pressures of 0.3, 0.6, 0.8, and 1 MPa, as compiled by Xu (2014) [20] and depicted in Figure 2. Standard sand was used in the triaxial compression tests. The maximum and minimum particle diameters of the sandy soil were 2.0 mm and 0.075 mm, respectively, and the diameter at 50% passing (D50) was 0.7 mm. In total, 212 data points for stress and volumetric strain were obtained from eight distinct triaxial compression tests.

3.1. Data Preprocessing and Sampling Strategies

The input combinations considered for model development involved the antecedent deviatoric stress and volumetric strain, denoted by q(t−1) and εv(t−1), respectively; axial strain (εa); confining pressure (σc); and freezing temperature (T). Note that q(t−1) and εv(t−1) are state variables common to constitutive models. However, they can also be omitted, so that an independent model is developed from the remaining inputs. At present, a wide variety of input parameters and frameworks are employed in machine learning-based constitutive modeling, and there is no established methodology or guidance for selecting them [12]. The two input combinations are referred to as case 1 and case 2, representing the inclusion and exclusion, respectively, of the antecedent target value in the models.
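The difference between the two input cases can be illustrated with a short sketch. The function name `build_inputs` and the choice to drop the first point of each curve in case 1 (it has no predecessor) are assumptions made here for illustration; the source does not state how the first point was handled.

```python
import numpy as np

def build_inputs(T, sigma_c, eps_a, target, use_lag=True):
    """Assemble inputs and targets for one triaxial test curve.

    T, sigma_c : scalars (freezing temperature, confining pressure)
    eps_a      : axial strains, ordered along the curve
    target     : q or eps_v measured at the same points
    use_lag    : True  -> case 1 (adds the target at the previous point)
                 False -> case 2 (independent inputs only)
    """
    eps_a = np.asarray(eps_a, dtype=float)
    target = np.asarray(target, dtype=float)
    n = eps_a.size
    cols = [np.full(n, T), np.full(n, sigma_c), eps_a]
    if not use_lag:
        return np.column_stack(cols), target
    # Assumption: the first point has no antecedent value, so it is dropped.
    lagged = target[:-1]
    X = np.column_stack([c[1:] for c in cols] + [lagged])
    return X, target[1:]

# Hypothetical three-point curve, not data from the paper.
X_case1, y_case1 = build_inputs(-4.0, 0.3, [0.0, 0.01, 0.02], [0.0, 1.5, 2.5])
X_case2, y_case2 = build_inputs(-4.0, 0.3, [0.0, 0.01, 0.02], [0.0, 1.5, 2.5],
                                use_lag=False)
```

In case 1 each row carries four inputs (T, σc, εa, and the antecedent target), whereas case 2 keeps only the three independent inputs.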
Additionally, three scenarios for model testing were considered:
  • Scenario (I): The results of two of the eight tests were reserved for the testing phase. This scenario was designed to simulate real-world conditions, where some test conditions might be new, and it illustrates the model’s ability to predict outcomes for experiments that were not included in the training phase;
  • Scenario (II): The testing divisions were randomly selected segments of the stress–strain curves, each consisting of five consecutive data points. By setting aside pieces of the stress–strain curves, we aimed to test the model’s effectiveness in predicting partially unknown data, as occurs when laboratory experiments yield incomplete data;
  • Scenario (III): Data points were sampled from the entire dataset for the testing phase. Sampling from the full dataset made it possible to assess the model’s generalization ability over data points with different pressure and temperature combinations. Two sampling approaches were employed in this scenario. The first was stratified sampling, in which an equal number of randomly selected points was chosen from each individual experiment. This ensured that each class of data (each stress–strain curve) was proportionally represented in the testing phase, so that every freezing and pressure condition was tested. The second approach set aside randomly selected points from the whole dataset.
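The three testing scenarios can be sketched as boolean test masks over the data points. This is a minimal illustration, not the authors’ procedure: the function names, the array `test_id` (which test each point belongs to), the number of segments per curve, and the equal 26 points per curve are all assumptions; the paper reports 212 points in total without per-test counts.

```python
import numpy as np

def split_whole_tests(test_id, held_out):
    """Scenario (I): every point of the held-out tests goes to testing."""
    return np.isin(test_id, held_out)

def split_segments(test_id, rng, seg_len=5, n_segments=1):
    """Scenario (II): hold out runs of consecutive points in each curve."""
    mask = np.zeros(test_id.size, dtype=bool)
    for t in np.unique(test_id):
        idx = np.flatnonzero(test_id == t)  # assumed ordered by axial strain
        for _ in range(n_segments):
            start = rng.integers(0, idx.size - seg_len + 1)
            mask[idx[start:start + seg_len]] = True
    return mask

def split_points(test_id, rng, frac=0.2, stratified=True):
    """Scenario (III): hold out single points, stratified or fully random."""
    mask = np.zeros(test_id.size, dtype=bool)
    tests = np.unique(test_id)
    if stratified:
        k = int(round(frac * test_id.size / tests.size))  # same count per test
        for t in tests:
            idx = np.flatnonzero(test_id == t)
            mask[rng.choice(idx, size=min(k, idx.size), replace=False)] = True
    else:
        k = int(round(frac * test_id.size))
        mask[rng.choice(test_id.size, size=k, replace=False)] = True
    return mask

rng = np.random.default_rng(0)
test_id = np.repeat(np.arange(8), 26)  # hypothetical: 8 curves, 26 points each
mask_I = split_whole_tests(test_id, held_out=[2, 5])  # hypothetical test ids
mask_II = split_segments(test_id, rng)
mask_III_strat = split_points(test_id, rng, stratified=True)
mask_III_rand = split_points(test_id, rng, stratified=False)
```

With more than one segment per curve, the segments drawn by `split_segments` may overlap, so the held-out count can fall below `n_segments * seg_len`.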
The testing partition was approximately 20% of the data in all model scenarios. Before the scenarios were applied and data were selected for model development, the data were scaled and normalized: the original values were transformed linearly to the range 0.1 to 0.9 using Equation (1).
$$X_{\mathrm{Scaled}} = 0.8 \times \frac{X - X_{\min}}{X_{\max} - X_{\min}} + 0.1 \tag{1}$$
where X is the variable, and Xmin and Xmax are the minimum and maximum of each variable, respectively. This places all features on a comparable scale. Scaling the data before model development is also common practice because it avoids saturating the sigmoid transfer functions used in the ANNs’ hidden layers [21,22].
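As a small worked example of Equation (1), the sketch below scales the four confining pressures used in the tests and then inverts the transform, which is the step needed to convert model outputs back to physical units. The function names `scale` and `unscale` are chosen here for illustration.

```python
import numpy as np

def scale(x, x_min, x_max):
    """Equation (1): map [x_min, x_max] linearly onto [0.1, 0.9]."""
    return 0.8 * (x - x_min) / (x_max - x_min) + 0.1

def unscale(x_scaled, x_min, x_max):
    """Inverse of Equation (1): recover values in the original units."""
    return (x_scaled - 0.1) * (x_max - x_min) / 0.8 + x_min

sigma_c = np.array([0.3, 0.6, 0.8, 1.0])  # confining pressures in MPa
scaled = scale(sigma_c, sigma_c.min(), sigma_c.max())
restored = unscale(scaled, sigma_c.min(), sigma_c.max())
```

The smallest and largest pressures map to 0.1 and 0.9, and the round trip returns the original values up to floating-point error.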

3.2. Artificial Neural Networks (ANNs)

Artificial neural networks (ANNs) have emerged as powerful tools for modeling complex engineering problems [23]. Their ability to learn from existing patterns in experimental data enables them to predict trends for unseen datasets. A typical ANN architecture (multilayer perceptron) consists of interconnected neurons organized into an input layer, n hidden layers, and an output layer, linked in a feed-forward manner. The strength of each connection is represented by a weight value, and bias nodes are introduced for neurons in the hidden and output layers [24]. Multilayered feed-forward ANNs are frequently employed in civil and geotechnical engineering applications [25,26,27]. While a sufficient network configuration is crucial for accurate predictions, practitioners must avoid the overfitting caused by using too many neurons, as this can limit the model’s generalization capabilities [25]. In this study, ANN models were developed to estimate the deviatoric stress and volumetric strain, which together describe the stress–strain response. The Neural Network Toolbox in MATLAB was used for ANN modeling. Following the data division procedure recommended by Shahin et al. (2004) [25], around 75% of the data repository was dedicated to training the model. Feed-forward multilayer networks with a single hidden layer were used throughout.
Artificial neural networks (ANNs) process data through interconnected layers of neurons. Each input signal is multiplied by a weight and summed with others in the same layer. This weighted sum is then fed to the next layer, where a similar process occurs. ANNs can have one or more hidden layers, forming a complex connection of weights. The most common types are feed forward and feed-forward backpropagation (FFBP). Training an ANN involves iteratively adjusting these weights to minimize the difference between the predicted and actual outputs. The activation function, often a sigmoid function, determines how the weighted sum influences the neuron’s output. Training algorithms like Levenberg–Marquardt (LM) optimize these weights by minimizing the squared error between the network’s prediction and the target value [28,29]. This iterative process allows ANNs to mimic complex relationships from data. For detailed mathematical formulations and governing equations of LM, readers are referred to the existing literature [22,30]. The schematic illustration of ANNs with a single hidden layer is presented in Figure 3.
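The damped least-squares idea behind Levenberg–Marquardt training can be sketched on a one-neuron network with a tanh hidden unit and a linear output. This is a simplified illustration, not MATLAB’s training routine: the finite-difference Jacobian, the damping schedule (divide or multiply by 10), the parameter layout, and the synthetic data are all assumptions.

```python
import numpy as np

def residuals(theta, x, y):
    """Prediction error of a one-neuron network: a * tanh(b * x + c) + d."""
    a, b, c, d = theta
    return a * np.tanh(b * x + c) + d - y

def jacobian(theta, x, y, h=1e-6):
    """Forward-difference Jacobian of the residuals w.r.t. the parameters."""
    r0 = residuals(theta, x, y)
    J = np.empty((r0.size, theta.size))
    for j in range(theta.size):
        step = np.zeros_like(theta)
        step[j] = h
        J[:, j] = (residuals(theta + step, x, y) - r0) / h
    return J

def levenberg_marquardt(theta, x, y, mu=1e-2, n_iter=50):
    """Minimize the sum of squared residuals with damped Gauss-Newton steps."""
    theta = np.asarray(theta, dtype=float)
    sse = np.sum(residuals(theta, x, y) ** 2)
    for _ in range(n_iter):
        r = residuals(theta, x, y)
        J = jacobian(theta, x, y)
        # Solve (J^T J + mu I) delta = -J^T r for the parameter update.
        delta = np.linalg.solve(J.T @ J + mu * np.eye(theta.size), -J.T @ r)
        new_sse = np.sum(residuals(theta + delta, x, y) ** 2)
        if new_sse < sse:
            # Accept the step and move toward Gauss-Newton behavior.
            theta, sse, mu = theta + delta, new_sse, max(mu / 10, 1e-12)
        else:
            # Reject the step and move toward gradient-descent behavior.
            mu *= 10
    return theta, sse

x = np.linspace(0.0, 1.0, 30)
y = 2.0 * np.tanh(3.0 * x - 0.5) + 0.2  # synthetic target, not paper data
theta0 = np.array([1.0, 1.0, 0.0, 0.0])
sse0 = np.sum(residuals(theta0, x, y) ** 2)
theta_fit, sse_fit = levenberg_marquardt(theta0, x, y)
```

Because a step is accepted only when it lowers the squared error, the error never increases from one iteration to the next.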
The hyperbolic tangent sigmoid transfer function (tansig), y = 2/(1 + e^(−2x)) − 1, was used for the hidden layer and a linear function (purelin) for the output layer. The use of a single hidden layer to solve a range of nonlinear problems is supported by the literature [31,32]. To evaluate the performance of the developed ANN models, mean squared error (MSE), linear correlation coefficient (R), and mean absolute percentage error (MAPE) were employed. The optimal number of neurons in the single hidden layer was determined through a trial-and-error process, evaluating models with hidden layer sizes ranging from 1 to 20 neurons. The best network configuration was selected on the basis of performance on the testing data. The deviatoric stress and volumetric strain were treated as two different targets, and a distinct network was developed for each.
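As a short check, the tansig expression is algebraically identical to the hyperbolic tangent, which can be seen by multiplying the numerator and denominator by e^x:

```latex
\[
\frac{2}{1 + e^{-2x}} - 1
  = \frac{1 - e^{-2x}}{1 + e^{-2x}}
  = \frac{e^{x} - e^{-x}}{e^{x} + e^{-x}}
  = \tanh(x)
\]
```

The output is therefore bounded between −1 and 1 and flattens for large |x|, which is the saturation that scaling the inputs is meant to avoid.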

3.3. Error Criteria

The performance of the optimized ANN model was assessed using well-known statistical criteria (Equations (2)–(7)) commonly employed to evaluate model performance and error. These metrics include correlation coefficient (R) (Equation (2)), coefficient of determination (R²) (Equation (3)), mean squared error (MSE) (Equation (4)), root mean squared error (RMSE) (Equation (5)), mean absolute percentage error (MAPE) (Equation (6)), and mean absolute error (MAE) (Equation (7)). To ensure a thorough comparison between models’ error values, these criteria were evaluated using the original, non-normalized target values after the back conversion of normalized data.
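$$R = \frac{\sum_{i=1}^{n} (E_i - \bar{E})(P_i - \bar{P})}{\sqrt{\sum_{i=1}^{n} (E_i - \bar{E})^2 \, \sum_{i=1}^{n} (P_i - \bar{P})^2}} \tag{2}$$

$$R^2 = \left( \frac{\sum_{i=1}^{n} (E_i - \bar{E})(P_i - \bar{P})}{\sqrt{\sum_{i=1}^{n} (E_i - \bar{E})^2 \, \sum_{i=1}^{n} (P_i - \bar{P})^2}} \right)^2 \tag{3}$$

$$MSE = \frac{1}{n} \sum_{i=1}^{n} (E_i - P_i)^2 \tag{4}$$

$$RMSE = \sqrt{\frac{1}{n} \sum_{i=1}^{n} (E_i - P_i)^2} \tag{5}$$

$$MAPE = \frac{1}{n} \sum_{i=1}^{n} \left| \frac{E_i - P_i}{E_i} \right| \times 100 \tag{6}$$

$$MAE = \frac{1}{n} \sum_{i=1}^{n} \left| E_i - P_i \right| \tag{7}$$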
Ei and Pi represent the measured and estimated values for each data point, and n is the total number of data points. In addition, Ē and P̄ denote the mean measured and estimated values, respectively.
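The six criteria in Equations (2)–(7) can be computed together as in the sketch below. The function name `error_metrics` and the four-point example are hypothetical. Note that MAPE is undefined when a measured value is zero, so such points would need to be excluded or handled separately.

```python
import numpy as np

def error_metrics(measured, predicted):
    """Return R, R2, MSE, RMSE, MAPE (%), and MAE per Equations (2)-(7)."""
    E = np.asarray(measured, dtype=float)
    P = np.asarray(predicted, dtype=float)
    dE, dP = E - E.mean(), P - P.mean()
    r = np.sum(dE * dP) / np.sqrt(np.sum(dE ** 2) * np.sum(dP ** 2))
    err = E - P
    mse = np.mean(err ** 2)
    return {
        "R": r,
        "R2": r ** 2,
        "MSE": mse,
        "RMSE": np.sqrt(mse),
        "MAPE": np.mean(np.abs(err / E)) * 100.0,  # requires E_i != 0
        "MAE": np.mean(np.abs(err)),
    }

# Hypothetical measured and predicted values, not data from the paper.
metrics = error_metrics([1.0, 2.0, 3.0, 4.0], [1.1, 1.9, 3.2, 3.8])
```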

4. Results and Discussion

4.1. Optimized Developed Networks

The number of neurons in the hidden layer, between 1 and 20, was chosen based on the performance of the developed models. The best-performing networks are presented in Table 1, which lists the configurations of the chosen models along with their training and testing R, MSE, and MAPE values. It should be noted that these values were calculated using the normalized and scaled model outputs.
Figures 4 and 5 present the optimized model predictions (converted back to real, unscaled values) alongside the experimental data. Training points are depicted with blue markers, while testing points are shown as hollow circles. The red dashed line in Figures 4 and 5 represents a perfect match between the estimated and experimental values (R = 1), so data points plotted closer to this line indicate better agreement between measured and predicted values. The points mostly lie in the vicinity of the ideal fit line, and the correlation coefficients (R-values) of almost all models exceed 0.99 (only two decimal places are shown in Table 2) for both targets, q and εv. Additional criteria for model performance and error evaluation are investigated in the following sections.

4.2. Error Analysis

Figures 6 and 7 illustrate the absolute errors between predicted and measured values (q and εv, respectively) for each data point in the training and testing sets. Additionally, Figure 8 presents the mean absolute error (MAE) for the testing data of each developed model. Figures 6 and 8a reveal larger errors in predicted deviatoric stress for scenario (I) than for the other models in both cases 1 and 2. Both the stratified and the random point-sampling models for q estimation also exhibited greater errors than scenario (II) within the same case class. Moreover, Figures 7 and 8b demonstrate greater accuracy for εv predictions using scenario (II). Notably, excluding the antecedent target value (Yt−1) in case 2 of scenario (I) resulted in the least accurate model, which can be clearly seen in Figure 8b.

4.3. Model Performance

The outputs of the models are depicted in Figures 9–16 as deviatoric stress and volumetric strain plotted against axial strain. Training and testing data in these stress–strain curves are shown with filled and hollow markers, respectively.
These diagrams illustrate the mechanical behavior of frozen soil estimated by the ANNs under different freezing temperatures and confining pressures. The plots of deviatoric stress against axial strain (q–εa) and volumetric strain against axial strain (εv–εa) generally agree with the experimental data from the training set. The ANN predictions for the testing sets are consistent with the experimental measurements, which is particularly evident in the close agreement between the predicted and measured q–εa curves. However, some differences are observed in the εv–εa curves, especially for scenario (I). Table 2 presents the prediction performance using error and correlation indicators. In this table, the coefficient of determination R² is used as a complement to the correlation coefficient R. While R describes the strength and direction of the linear relationship between predicted and experimental values, R² quantifies the proportion of variance in the experimental data that the model can explain [33]. All R² values exceed 0.99 (only two decimal places are shown in Table 2), which indicates that the ANNs not only correlate well with the experimental values but also capture nearly all of the variability in the mechanical behavior of the frozen soil. Used together, R and R² therefore provide a more comprehensive validation of the models’ performance.
Some of the results presented in Figures 11–16 may seem to indicate an overfitting issue. These figures include both training and testing datasets, with predictions for the testing data specifically highlighted. A key indicator of overfitting would be a significant inconsistency between the model’s performance on training and testing data, which is not the case here. The error metrics presented in Tables 1 and 2 also consistently show high accuracy for the unseen testing divisions, indicating that the model generalizes well within the range of the inputs rather than overfitting. The testing strategies (partial stress–strain curves or individual data points) were designed to challenge the model’s ability to generalize under different conditions. The agreement between predictions on the testing divisions and experimental data across the employed scenarios suggests that the model’s performance is not confined to the training data. The model therefore appears to balance accuracy and generalization without serious concern about overfitting.
In Table 2, all the data in the training and testing divisions are considered to evaluate the metrics. Additionally, all the developed models are compared in predicting q and εv for cases 1 and 2. Although all the models show a high level of accuracy, those with the least error criteria are highlighted in bold and shaded.

4.4. Discussion on the Testing Strategy

This study examined three distinct scenarios for model testing. Scenario (I) involved keeping individual experiments unseen, scenario (II) entailed setting aside sections of each stress–strain curve, and scenario (III) comprised sampling data points. Scenario (I) imposed the most rigorous condition, in which the model was expected to predict an individual experiment after being trained on experiments in other physical states. In this scenario, the tests at confining pressures of 0.6 and 0.8 MPa and freezing temperatures of −4 and −6 °C, respectively, were kept unseen as the test set. Since the model saw no data under these conditions during training, its lower accuracy relative to the other modeling strategies is to be expected. Moreover, stratified data sampling in models involving Yt−1 led to improved accuracies compared to random point sampling.

4.5. Stress–Strain Formulation

Unlike studies that report only the optimized models, this work provides the weights and bias values for each layer of the ANN models. With this information, readers can replicate the presented results in any spreadsheet program and calculate the estimated deviatoric stress and volumetric strain. The following equation expresses the relationship between the normalized input parameters (T, σc, εa and q(t−1) or εv(t−1) in case 1 models) and the normalized output (q or εv).
$$Y_n = f_{n1}\left( b_0 + \sum_{k=1}^{h} \left( w_k \, f_{n2}\left( b_{hk} + \sum_{i=1}^{m} w_{ik} X_i \right) \right) \right) \tag{8}$$
Yn represents the normalized target (q or εv); fn1 and fn2 are the linear and tansig transfer functions, respectively; h indicates neuron numbers in the hidden layer; Xi indicates the normalized values of each contributing variable; m is the number of contributing input variables; wik specifies the connecting weights between the ith input and kth neuron in the hidden layer; wk is the associating weight between the kth neuron in the hidden layer and the output neuron; bhk is the bias in the kth neuron of the hidden layer; and b0 is the bias value in the output layer. Subsequently, wik, bhk, wk, and b0, representing the weights and biases of the most accurate trained models with the least error in each case, were inserted into Equation (8), resulting in the formulations presented in Equations (9)–(12).
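The single-hidden-layer relationship described above (a tansig hidden layer followed by a linear output) can be evaluated in a few lines. The weights in this sketch are hypothetical placeholders for a network with three inputs and two hidden neurons; they are not the trained values reported in the paper, and the function name `ann_predict` is chosen here for illustration.

```python
import numpy as np

def ann_predict(x_norm, W_ih, b_h, w_ho, b_o):
    """Evaluate a one-hidden-layer network on normalized inputs.

    x_norm : (m,)   normalized inputs X_i
    W_ih   : (h, m) input-to-hidden weights w_ik (one row per hidden neuron)
    b_h    : (h,)   hidden-layer biases b_hk
    w_ho   : (h,)   hidden-to-output weights w_k
    b_o    : scalar output bias b_0
    """
    J = np.tanh(W_ih @ x_norm + b_h)  # tansig transfer function (f_n2)
    return float(w_ho @ J + b_o)      # linear transfer function (f_n1)

# Hypothetical weights and inputs for illustration only.
W_ih = np.array([[0.5, -0.2, 0.1],
                 [0.3, 0.8, -0.5]])
b_h = np.array([0.1, -0.3])
w_ho = np.array([1.2, -0.7])
b_o = 0.05
x_norm = np.array([0.1, 0.5, 0.9])  # normalized T, sigma_c, eps_a
y_norm = ann_predict(x_norm, W_ih, b_h, w_ho, b_o)
```

The result is a normalized output, so it must be converted back with the inverse of the scaling transform before it can be read as a stress or strain.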
q—Scenario (III)—Stratified Sampling Model (Case 1)
$$q_n = 0.425 J_1 + 0.295 J_2 - 0.919 J_3 - 0.156 J_4 + 0.037 J_5 - 0.875 J_6 - 0.104 J_7 - 0.839 J_8 + 6.017 J_9 + 0.438 J_{10} - 1.144 J_{11} - 5.481 J_{12} - 4.409 J_{13} + 1.937$$
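$$\begin{bmatrix} J_1 \\ J_2 \\ J_3 \\ J_4 \\ J_5 \\ J_6 \\ J_7 \\ J_8 \\ J_9 \\ J_{10} \\ J_{11} \\ J_{12} \\ J_{13} \end{bmatrix} = \mathrm{Tansig}\left( \begin{bmatrix} 1.209 & 2.124 & 0.325 & 2.553 \\ 0.673 & 0.753 & 3.186 & 2.017 \\ 0.879 & 0.050 & 0.474 & 1.332 \\ 1.512 & 1.296 & 0.594 & 1.381 \\ 6.120 & 2.360 & 6.301 & 5.633 \\ 0.256 & 7.897 & 0.341 & 4.543 \\ 1.767 & 1.626 & 1.522 & 3.961 \\ 0.102 & 2.122 & 0.799 & 1.035 \\ 0.868 & 3.751 & 1.882 & 0.404 \\ 0.425 & 1.502 & 0.878 & 1.829 \\ 0.196 & 0.062 & 0.152 & 6.194 \\ 0.996 & 2.715 & 2.132 & 0.383 \\ 0.452 & 0.035 & 0.022 & 0.391 \end{bmatrix} \begin{bmatrix} T \\ \sigma_c \\ \varepsilon_a \\ q_{(t-1)} \end{bmatrix} + \begin{bmatrix} 2.103 \\ 2.892 \\ 0.877 \\ 0.863 \\ 5.350 \\ 3.341 \\ 3.281 \\ 0.761 \\ 4.428 \\ 1.855 \\ 5.198 \\ 3.479 \\ 1.510 \end{bmatrix} \right)$$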
εv—Scenario (II) (Case 1)
$$\varepsilon_{v(n)} = 0.267 J_1 + 0.143 J_2 - 0.564 J_3 - 3.619 J_4 - 0.050 J_5 - 0.068 J_6 - 0.750 J_7 - 0.103 J_8 + 3.819 J_9 - 0.111 J_{10} - 0.008$$
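$$\begin{bmatrix} J_1 \\ J_2 \\ J_3 \\ J_4 \\ J_5 \\ J_6 \\ J_7 \\ J_8 \\ J_9 \\ J_{10} \end{bmatrix} = \mathrm{Tansig}\left( \begin{bmatrix} 0.766 & 2.525 & 0.111 & 0.909 \\ 2.772 & 0.460 & 0.780 & 0.262 \\ 0.247 & 0.071 & 0.930 & 0.216 \\ 0.628 & 0.068 & 0.217 & 2.466 \\ 4.681 & 8.300 & 0.416 & 9.565 \\ 1.124 & 2.941 & 6.741 & 0.825 \\ 0.022 & 0.081 & 0.119 & 0.866 \\ 2.633 & 1.986 & 7.088 & 0.400 \\ 0.298 & 0.056 & 0.247 & 2.250 \\ 8.716 & 0.276 & 1.176 & 0.407 \end{bmatrix} \begin{bmatrix} T \\ \sigma_c \\ \varepsilon_a \\ \varepsilon_{v(t-1)} \end{bmatrix} + \begin{bmatrix} 3.033 \\ 0.892 \\ 1.186 \\ 2.850 \\ 6.170 \\ 0.819 \\ 0.157 \\ 3.347 \\ 2.371 \\ 8.956 \end{bmatrix} \right)$$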
q—Scenario (III)—Random Sampling Model (Case 2)
$$q_{(n)} = 0.088 J_1 - 0.192 J_2 - 1.297 J_3 + 0.582 J_4 - 0.151 J_5 + 5.004 J_6 - 0.071 J_7 - 0.048 J_8 - 0.094 J_9 - 5.072 J_{10} - 0.008 J_{11} + 0.054 J_{12} + 0.074 J_{13} - 4.767 J_{14} + 4.022$$
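$$\begin{bmatrix} J_1 \\ J_2 \\ J_3 \\ J_4 \\ J_5 \\ J_6 \\ J_7 \\ J_8 \\ J_9 \\ J_{10} \\ J_{11} \\ J_{12} \\ J_{13} \\ J_{14} \end{bmatrix} = \mathrm{Tansig}\left( \begin{bmatrix} 4.301 & 3.959 & 12.019 \\ 0.551 & 12.670 & 5.202 \\ 0.102 & 0.024 & 3.970 \\ 1.222 & 0.562 & 0.687 \\ 14.152 & 13.460 & 1.267 \\ 3.872 & 0.495 & 2.034 \\ 2.543 & 2.968 & 3.096 \\ 5.609 & 15.141 & 0.580 \\ 5.809 & 6.656 & 5.558 \\ 0.915 & 0.372 & 2.167 \\ 39.024 & 69.602 & 111.488 \\ 8.648 & 5.826 & 9.016 \\ 1.231 & 4.269 & 9.069 \\ 6.820 & 0.377 & 2.229 \end{bmatrix} \begin{bmatrix} T \\ \sigma_c \\ \varepsilon_a \end{bmatrix} + \begin{bmatrix} 12.153 \\ 15.023 \\ 3.300 \\ 0.973 \\ 0.592 \\ 1.814 \\ 1.772 \\ 1.668 \\ 1.464 \\ 1.234 \\ 39.424 \\ 3.996 \\ 6.575 \\ 6.485 \end{bmatrix} \right)$$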
εv—Scenario (III)—Random Sampling Model (Case 2)
$$\varepsilon_{v(n)} = 0.556 J_1 + 0.671 J_2 - 9.911 J_3 - 0.312 J_4 + 1.153 J_5 - 11.866 J_6 + 1.333 J_7 + 10.283 J_8 + 11.820 J_9 - 9.980 J_{10} + 10.106 J_{11} + 2.181 J_{12} - 0.171 J_{13} - 1.215 J_{14} - 0.368$$
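$$\begin{bmatrix} J_1 \\ J_2 \\ J_3 \\ J_4 \\ J_5 \\ J_6 \\ J_7 \\ J_8 \\ J_9 \\ J_{10} \\ J_{11} \\ J_{12} \\ J_{13} \\ J_{14} \end{bmatrix} = \mathrm{Tansig}\left( \begin{bmatrix} 3.530 & 8.325 & 3.287 \\ 2.066 & 6.342 & 2.251 \\ 0.147 & 0.916 & 3.955 \\ 4.146 & 0.171 & 1.856 \\ 0.667 & 1.726 & 0.565 \\ 6.062 & 7.343 & 0.321 \\ 1.569 & 1.086 & 0.906 \\ 3.564 & 0.904 & 3.906 \\ 3.039 & 3.182 & 0.384 \\ 3.534 & 0.906 & 3.947 \\ 0.207 & 0.907 & 3.927 \\ 1.198 & 1.131 & 0.171 \\ 1.551 & 1.152 & 2.646 \\ 2.936 & 0.401 & 0.903 \end{bmatrix} \begin{bmatrix} T \\ \sigma_c \\ \varepsilon_a \end{bmatrix} + \begin{bmatrix} 7.641 \\ 5.582 \\ 3.265 \\ 3.589 \\ 1.221 \\ 2.959 \\ 0.464 \\ 0.481 \\ 1.729 \\ 0.123 \\ 3.851 \\ 0.021 \\ 2.861 \\ 2.981 \end{bmatrix} \right)$$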

5. Sensitivity Analysis

To assess the impact of different input variables on the predicted values of q and εv, a sensitivity analysis was performed on the ANN model’s results. This analysis, following Milne’s approach [34], used the current weights within the neural network to determine the relative influence of each input variable on the network output. Using the following equation, the percentage contribution (Qik) of each input variable (xi) to the final output (q or εv) can be calculated by considering the weights connecting input neurons to hidden neurons (wij) and the weights connecting hidden neurons to the output neuron (Mjk). The summation of weights connecting all N input neurons to a specific hidden neuron (j) is denoted by $\sum_{r=1}^{N} w_{rj}$. The sum of the Qik values over all input variables must always equal 100%.
$$Q_{ik} = \frac{\sum_{j=1}^{L} \left( \dfrac{w_{ij}}{\sum_{r=1}^{N} w_{rj}} M_{jk} \right)}{\sum_{i=1}^{N} \left( \sum_{j=1}^{L} \left( \dfrac{w_{ij}}{\sum_{r=1}^{N} w_{rj}} M_{jk} \right) \right)}$$
Figure 17 visually depicts the relative influence of each input variable on the target as determined by this analysis.
As presented in Figure 17a,b, Yt−1, the antecedent deviatoric stress or volumetric strain, is the most influential input parameter, with importances of 36.18% and 32.93%, respectively. In the case 2 model for deviatoric stress, however, axial strain is the most influential input variable, at 39.68%, and confining pressure is the least influential, at 29.2%. In the case 2 model for volumetric strain, temperature is the most influential parameter, with a 36.03% effect on the response, and confining pressure is again the least influential, at 31.56%. None of the input variables was markedly over- or under-weighted in the trained models.
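The weight-based contribution measure can be sketched as follows. With `use_abs=False` the function follows the equation as it appears in this section, with no absolute values. The `use_abs=True` option, which takes absolute values in both denominators so that the magnitudes of the contributions sum to 100%, is an assumption based on the commonly cited form of Milne’s measure. The function name and the weights are hypothetical.

```python
import numpy as np

def milne_contributions(W_ih, w_ho, use_abs=False):
    """Percentage contribution of each input to one output neuron.

    W_ih : (L, N) weights w_ij from input i to hidden neuron j (row j)
    w_ho : (L,)   weights M_jk from hidden neuron j to the output
    """
    W = np.asarray(W_ih, dtype=float)
    M = np.asarray(w_ho, dtype=float)
    # Sum of the weights entering each hidden neuron (over all N inputs).
    per_hidden = np.abs(W).sum(axis=1) if use_abs else W.sum(axis=1)
    # Weighted share of each input, accumulated over the L hidden neurons.
    s = (W / per_hidden[:, None] * M[:, None]).sum(axis=0)
    total = np.abs(s).sum() if use_abs else s.sum()
    return 100.0 * s / total

# Hypothetical weights: 2 hidden neurons, 3 inputs (not the trained values).
W_ih = np.array([[0.5, -0.2, 0.1],
                 [0.3, 0.8, -0.5]])
w_ho = np.array([1.2, -0.7])
Q_literal = milne_contributions(W_ih, w_ho)
Q_abs = milne_contributions(W_ih, w_ho, use_abs=True)
```

In the literal form the signed contributions sum to 100% but individual values can be negative or exceed 100%, while in the absolute-value form each magnitude lies between 0% and 100%.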

6. Conclusions

Traditional constitutive models for frozen soil behavior require specialized testing equipment and procedures, leading to time-consuming and costly processes. Moreover, the intricate internal structure of frozen soil, consisting of multiple phases, results in complex mechanical behaviors. To address these challenges, this study proposes a machine learning approach using artificial neural networks (ANNs) to predict the complete stress–strain response of frozen soils under various conditions. The ANN model was trained using a database from a previous experimental study on frozen soils, which included measurements at different temperatures and confining pressures.
Two different approaches were considered in the model development process: one involving the consideration of the antecedent (Yt−1) target and addressing the time series problem (case 1) and the other using only independent input variables (case 2). Additionally, various testing division sampling strategies were employed to assess the impact of different sampling scenarios. Three scenarios were applied for testing data divisions: (I) keeping individual tests unseen, (II) setting aside sections of the stress–strain curves, and (III) selecting different data points from the curves. Furthermore, the effect of stratified sampling versus random sampling in the third scenario was investigated. The results showed that the choice of test division significantly affected the accuracy of the model, with scenario (I) resulting in the least accurate models. However, scenario (I) demonstrated potential in generalizing predictions for targets whose physical states were not previously introduced to the model. Most of the developed models exhibited outstanding accuracies in predicting the stress–strain behavior of frozen sandy soil, with correlation coefficients exceeding 0.99. Selected models, based on their accuracies relative to other developed models, were also presented as closed-form solutions, offering insights into the stress–strain behavior of frozen sandy soil.
This study demonstrated that artificial neural networks (ANNs) can efficiently model the stress–strain behavior of frozen sandy soils, a task that has proven challenging because of the complex and nonlinear nature of frozen soils under different temperature and pressure conditions. Traditional constitutive models are usually limited by extensive calibration requirements and also lack adaptability across soil types and conditions. In contrast, the ANN models developed here offer a practical approach that provides accurate predictions from a limited dataset, reducing the need for empirical models that may not fully capture frozen soil mechanics. The current study is limited to one soil type, with eight experiments comprising 212 data points, which may not be sufficient to support robust conclusions. Future work should therefore include additional experimental data from a wider range of soil compositions and testing conditions to confirm the robustness of the results obtained here, to validate and expand the models’ applicability, and to provide a comprehensive intelligent model for frozen soil behavior. The findings of this study highlight the potential of data-driven modeling to enhance practical engineering applications in cold regions, supporting more efficient and adaptable design solutions.

Author Contributions

Conceptualization, H.S.; methodology, D.R.E. and H.S.; software, H.S.; validation, D.R.E. and H.S.; formal analysis, D.R.E.; investigation, D.R.E.; resources, H.S.; data curation, D.R.E.; writing—original draft preparation, D.R.E.; writing—review and editing, H.S.; visualization, D.R.E.; supervision, H.S.; project administration, H.S.; funding acquisition, H.S. All authors have read and agreed to the published version of the manuscript.


Funding

This project was funded by Mitacs and Samen Data Technologies Inc. through the Accelerate program under grant number 215934.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

All of the data used in this study is described and addressed in this paper.


Acknowledgments

The authors acknowledge the research funding kindly provided by Mitacs through the Accelerate program sponsored by Samen Data Technologies Inc.

Conflicts of Interest

The authors declare no conflict of interest.


References

  1. Li, K.-Q.; Yin, Z.-Y.; Liu, Y. Influences of spatial variability of hydrothermal properties on the freezing process in artificial ground freezing technique. Comput. Geotech. 2023, 159, 105448.
  2. Li, H.; Lai, Y.; Wang, L.; Yang, X.; Jiang, N.; Li, L.; Wang, C.; Yang, B. Review of the state of the art: Interactions between a buried pipeline and frozen soil. Cold Reg. Sci. Technol. 2019, 157, 171–186.
  3. Obu, J.; Westermann, S.; Bartsch, A.; Berdnikov, N.; Christiansen, H.H.; Dashtseren, A.; Delaloye, R.; Elberling, B.; Etzelmüller, B.; Kholodov, A.; et al. Northern Hemisphere permafrost map based on TTOP modelling for 2000–2016 at 1 km² scale. Earth-Sci. Rev. 2019, 193, 299–316.
  4. Cheng, G. A roadbed cooling approach for the construction of Qinghai–Tibet Railway. Cold Reg. Sci. Technol. 2005, 42, 169–176.
  5. Chen, H.; Gao, X.; Wang, Q. Research progress and prospect of frozen soil engineering disasters. Cold Reg. Sci. Technol. 2023, 212, 103901.
  6. Sun, K.; Zhou, A. A multisurface elastoplastic model for frozen soil. Acta Geotech. 2021, 16, 3401–3424.
  7. Li, K.-Q.; Yin, Z.-Y.; Zhang, N.; Liu, Y. A data-driven method to model stress-strain behaviour of frozen soil considering uncertainty. Cold Reg. Sci. Technol. 2023, 213, 103906.
  8. Xu, G.; Wu, W.; Qi, J. An extended hypoplastic constitutive model for frozen sand. Soils Found. 2016, 56, 704–711.
  9. Amiri, S.G.; Grimstad, G.; Kadivar, M.; Nordal, S. Constitutive model for rate-independent behavior of saturated frozen soils. Can. Geotech. J. 2016, 53, 1646–1657.
  10. Lai, Y.; Jin, L.; Chang, X. Yield criterion and elasto-plastic damage constitutive model for frozen sandy soil. Int. J. Plast. 2009, 25, 1177–1205.
  11. Zhao, Y.; Zhang, M.; Gao, J. Research progress of constitutive models of frozen soils: A review. Cold Reg. Sci. Technol. 2023, 206, 103720.
  12. Zhang, P.; Yin, Z.-Y.; Jin, Y.-F. State-of-the-Art Review of Machine Learning Applications in Constitutive Modeling of Soils. Arch. Comput. Methods Eng. 2021, 28, 3661–3686.
  13. Yin, Z.-Y.; Jin, Y.-F. Practice of Optimisation Theory in Geotechnical Engineering; Springer Nature: Dordrecht, The Netherlands, 2019.
  14. Alamanis, N.; Lokkas, P.; Chrysanidis, T.; Christodoulou, D.; Paschalis, E. Assessment Principles for the Mechanical Behavior of Clay Soils. WSEAS Trans. Appl. Theor. Mech. 2021, 16, 47–61.
  15. Jahangir, H.; Eidgahee, D.R. A new and robust hybrid artificial bee colony algorithm—ANN model for FRP-concrete bond strength evaluation. Compos. Struct. 2020, 257, 113160.
  16. Eidgahee, D.R.; Jahangir, H.; Solatifar, N.; Fakharian, P.; Rezaeemanesh, M. Data-driven estimation models of asphalt mixtures dynamic modulus using ANN, GP and combinatorial GMDH approaches. Neural Comput. Appl. 2022, 34, 17289–17314.
  17. Mojtahedi, F.F.; Ahmadihosseini, A.; Eidgahee, D.R.; Rezaee, M.; Spagnoli, G. Bio-inspired Predictive Models Development for Strength Characterization of Cement Deep-Mixed Plastic Soils. Int. J. Geosynth. Ground Eng. 2024, 10, 9.
  18. Ren, Z.; Liu, J.; Jiang, H.; Wang, E. Experimental study and simulation for unfrozen water and compressive strength of frozen soil based on artificial freezing technology. Cold Reg. Sci. Technol. 2023, 205, 103711.
  19. Zhang, P.; Jin, Y.; Yin, Z. Machine learning–based uncertainty modelling of mechanical properties of soft clays relating to time-dependent behavior and its application. Int. J. Numer. Anal. Methods Geomech. 2021, 45, 1588–1602.
  20. Xu, G. Hypoplastic Constitutive Models for Frozen Soil. Ph.D. Dissertation, University of Natural Resources and Life Sciences, Vienna, Austria, 2014.
  21. Naderpour, H.; Kheyroddin, A.; Amiri, G.G. Prediction of FRP-confined compressive strength of concrete using artificial neural networks. Compos. Struct. 2010, 92, 2817–2829.
  22. Demuth, H.; Beale, M. MATLAB Neural Network Toolbox User’s Guide; The MathWorks, Inc.: Natick, MA, USA, 2009.
  23. Olden, J.D.; Jackson, D.A. Illuminating the “black box”: A randomization approach for understanding variable contributions in artificial neural networks. Ecol. Model. 2002, 154, 135–150.
  24. Beale, M.H.; Hagan, M.T.; Demuth, H.B. MATLAB Neural Network Toolbox User’s Guide (Version R2017b); The MathWorks, Inc.: Natick, MA, USA, 2017; Volume 158.
  25. Shahin, M.A.; Maier, H.R.; Jaksa, M.B. Data Division for Developing Neural Networks Applied to Geotechnical Engineering. J. Comput. Civ. Eng. 2004, 18, 105–114.
  26. Alavi, A.H.; Gandomi, A.H.; Mollahassani, A.; Heshmati, A.A.; Rashed, A. Modeling of maximum dry density and optimum moisture content of stabilized soil using artificial neural networks. J. Plant Nutr. Soil Sci. 2010, 173, 368–379.
  27. Eidgahee, D.R.; Haddad, A.; Naderpour, H. Evaluation of shear strength parameters of granulated waste rubber using artificial neural networks and group method of data handling. Sci. Iran. 2019, 26, 3233–3244.
  28. Marquardt, D.W. An Algorithm for Least-Squares Estimation of Nonlinear Parameters. J. Soc. Ind. Appl. Math. 1963, 11, 431–441.
  29. Levenberg, K. A method for the solution of certain non-linear problems in least squares. Q. Appl. Math. 1944, 2, 164–168.
  30. Sapna, S. Backpropagation Learning Algorithm Based on Levenberg Marquardt Algorithm. In Computer Science & Information Technology (CS & IT); Academy & Industry Research Collaboration Center (AIRCC): Chennai, India, 2012; pp. 393–398.
  31. Naderpour, H.; Rafiean, A.H.; Fakharian, P. Compressive strength prediction of environmentally friendly concrete using artificial neural networks. J. Build. Eng. 2018, 16, 213–219.
  32. Ahmadi, M.; Naderpour, H.; Kheyroddin, A. Utilization of artificial neural networks to prediction of the capacity of CCFT short columns subject to short term axial load. Arch. Civ. Mech. Eng. 2014, 14, 510–517.
  33. Hair, J.; Black, W.C.; Babin, B.J.; Anderson, R.E. Multivariate Data Analysis; CENGAGE: Independence, KY, USA, 2019.
  34. Milne, L. Feature selection using neural networks with contribution measures. In AI-CONFERENCE; World Scientific Publishing: Singapore, 1995; p. 571.
Figure 1. Overall research workflow for stress–strain behavior modeling of frozen sandy soil.
Figure 2. Stress–strain responses from triaxial test under −4 and −6 °C freezing temperatures, reproduced from Xu (2014) [20].
Figure 3. Schematic of the ANN architecture.
Figure 4. Predicted versus experimentally measured deviatoric stress for different modeling scenarios.
Figure 5. Predicted versus experimentally measured volumetric strain for different modeling scenarios.
Figure 6. Comparison of the absolute errors of the q models for Cases 1 and 2 in different modeling scenarios.
Figure 7. Comparison of the absolute errors of the εv models for Cases 1 and 2 in different modeling scenarios.
Figure 8. Testing division MAE of the developed models for (a) deviatoric stress and (b) volumetric strain.
Figure 9. Deviatoric stress versus axial strain results of scenario (I).
Figure 10. Volumetric strain versus axial strain results of scenario (I).
Figure 11. Deviatoric stress versus axial strain results of scenario (II).
Figure 12. Volumetric strain versus axial strain results of scenario (II).
Figure 13. Deviatoric stress versus axial strain results of scenario (III)—Stratified Sampling.
Figure 14. Volumetric strain versus axial strain results of scenario (III)—Stratified Sampling.
Figure 15. Deviatoric stress versus axial strain results of scenario (III)—Random Sampling.
Figure 16. Volumetric strain versus axial strain results of scenario (III)—Random Sampling.
Figure 17. Relative importance of each contributing parameter in (a) Scenario (III)—Stratified Sampling Model (Case 1), (b) Scenario (II) (Case 1), (c) Scenario (III)—Random Sampling Model (Case 2) and (d) Scenario (III)—Random Sampling Model (Case 2).
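Table 1. Performance of the optimized models for deviatoric stress and volumetric strain.

| Model Target | Modeling Scenario | Input | Neuron No. | Training R-Value | Testing R-Value | Training MSE | Testing MSE | Training MAPE (%) | Testing MAPE (%) |
|---|---|---|---|---|---|---|---|---|---|
| Deviatoric Stress (q) | (I) | Case 1 | 3 | 0.99 | 0.99 | 0.0002350 | 0.0002455 | 2.98 | 3.14 |
| Deviatoric Stress (q) | (I) | Case 2 | 4 | 0.99 | 0.99 | 0.0000380 | 0.0001053 | 1.24 | 1.54 |
| Deviatoric Stress (q) | (II) | Case 1 | 12 | 0.99 | 0.99 | 0.0000032 | 0.0000246 | 0.27 | 0.55 |
| Deviatoric Stress (q) | (II) | Case 2 | 12 | 0.99 | 0.99 | 0.0000108 | 0.0000617 | 0.58 | 0.92 |
| Deviatoric Stress (q) | (III), Stratified Sampling | Case 1 | 13 | 0.99 | 0.99 | 0.0000009 | 0.0000310 | 0.12 | 0.90 |
| Deviatoric Stress (q) | (III), Stratified Sampling | Case 2 | 11 | 0.99 | 0.99 | 0.0000186 | 0.0000842 | 0.74 | 1.49 |
| Deviatoric Stress (q) | (III), Random Sampling | Case 1 | 12 | 0.99 | 0.99 | 0.0000012 | 0.0000351 | 0.16 | 0.73 |
| Deviatoric Stress (q) | (III), Random Sampling | Case 2 | 14 | 0.99 | 0.99 | 0.0000077 | 0.0000378 | 0.43 | 0.95 |
| Volumetric Strain (εv) | (I) | Case 1 | 2 | 0.99 | 0.99 | 0.0000101 | 0.0000061 | 1.17 | 0.99 |
| Volumetric Strain (εv) | (I) | Case 2 | 3 | 0.99 | 0.99 | 0.0001397 | 0.0016639 | 4.05 | 9.02 |
| Volumetric Strain (εv) | (II) | Case 1 | 10 | 0.99 | 0.99 | 0.0000010 | 0.0000031 | 0.34 | 0.42 |
| Volumetric Strain (εv) | (II) | Case 2 | 9 | 0.99 | 0.99 | 0.0000047 | 0.0000073 | 0.85 | 0.86 |
| Volumetric Strain (εv) | (III), Stratified Sampling | Case 1 | 9 | 0.99 | 0.99 | 0.0000017 | 0.0000060 | 0.46 | 1.10 |
| Volumetric Strain (εv) | (III), Stratified Sampling | Case 2 | 14 | 0.99 | 0.99 | 0.0000007 | 0.0000164 | 0.23 | 1.64 |
| Volumetric Strain (εv) | (III), Random Sampling | Case 1 | 14 | 0.99 | 0.99 | 0.0000006 | 0.0000106 | 0.27 | 1.42 |
| Volumetric Strain (εv) | (III), Random Sampling | Case 2 | 14 | 0.99 | 0.99 | 0.0000014 | 0.0000037 | 0.42 | 0.76 |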
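
The performance measures reported in Table 1 (R-value, MSE, MAPE) and in Figure 8 (MAE) can be illustrated with a short Python sketch. It assumes the standard textbook definitions of these metrics (Pearson correlation coefficient, mean squared error, mean absolute error, and mean absolute percentage error); the exact scaling applied to the targets in the study is not restated here. The function name `regression_metrics` and the sample values are hypothetical and are not taken from the study's dataset.

```python
import math


def regression_metrics(measured, predicted):
    """Return R, MSE, MAE and MAPE (%) for paired measured/predicted values.

    Standard definitions are assumed; this is an illustrative sketch,
    not the authors' implementation.
    """
    if len(measured) != len(predicted) or not measured:
        raise ValueError("inputs must be non-empty and of equal length")
    if any(m == 0 for m in measured):
        raise ValueError("MAPE is undefined when a measured value is zero")

    n = len(measured)
    errors = [p - m for m, p in zip(measured, predicted)]

    mse = sum(e * e for e in errors) / n
    mae = sum(abs(e) for e in errors) / n
    mape = 100.0 * sum(abs(e / m) for e, m in zip(errors, measured)) / n

    # Pearson correlation coefficient between measured and predicted values
    mean_m = sum(measured) / n
    mean_p = sum(predicted) / n
    cov = sum((m - mean_m) * (p - mean_p) for m, p in zip(measured, predicted))
    var_m = sum((m - mean_m) ** 2 for m in measured)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    r = cov / math.sqrt(var_m * var_p)

    return {"R": r, "MSE": mse, "MAE": mae, "MAPE": mape}


# Hypothetical normalized targets and predictions, for illustration only
measured = [0.20, 0.40, 0.60, 0.80]
predicted = [0.21, 0.39, 0.62, 0.78]
metrics = regression_metrics(measured, predicted)
print(metrics)
```

Because MAPE divides each error by the measured value, it becomes unstable near zero; this is one reason to read it alongside MSE and MAE rather than in isolation.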
Table 2. Models performances for deviatoric stress and volumetric strain on all data.
Columns: Target | Parameter | Case 1: Scenario (I), Scenario (II), Scenario (III) Stratified Sampling, Scenario (III) Random Sampling | Case 2: Scenario (I), Scenario (II), Scenario (III) Stratified Sampling, Scenario (III) Random Sampling
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

Share and Cite

MDPI and ACS Style

Rezazadeh Eidgahee, D.; Shiri, H. Machine Learning–Enhanced Modeling of Stress–Strain Behavior of Frozen Sandy Soil. Geotechnics 2024, 4, 1228-1245.
