1. Introduction
Hundreds of thousands of people in Alaska, Canada, Russia, and Greenland live on permafrost, a type of soil that covers nearly 24% of the land surface of the northern hemisphere [1]. Frost heave and thawing are key issues in permafrost regions and can cause various engineering problems, such as the progressive lifting of sewer pipelines, subsidence of buildings, cracking of road surfaces, and damage to ground infrastructure or geological repository systems (Figure 1). Recently, seasonal freeze-thaw cycles have also been found to cause significant subsidence of buildings and underground structures, even in non-permafrost areas. The sequence of subsidence events caused by the frost heave and thawing cycle is depicted in Figure 2, which shows a homogeneous fine-grained soil column subjected to one-sided natural freezing from the top down. To analyze and predict this phenomenon, it is necessary to understand the complex thermal-hydro-mechanical (THM) coupling that occurs during the freezing process. The preconditions for frost heave action in frozen soil are as follows [2,3]: (a) the soil is potentially subject to frost heave action, (b) a sufficient supply of water, the material source for frost heave in soils, is available, and (c) the thermal conditions cause the freezing front to move slowly enough to allow for water migration.
In general, frozen soils comprise three zones: a frozen zone, a frozen fringe, and an unfrozen zone (Figure 3). The boundary between the frozen fringe and the unfrozen zone is called the freezing front, which is associated with the 0 °C (273.15 K) isotherm [2]. When freezing begins in the frozen zone, the freezing front propagates toward the unfrozen zone, resulting in an expansion in volume due to the phase change of the pore water behind the frozen fringe. Water subsequently moves toward the freezing front to compensate for the water lost to the phase change, forming a periodic ice layer referred to as an ice lens [2,3,4,5]. The continuous growth of the ice lens ultimately causes a significant amount of frost heave.
Early research on the subject involved various experimental investigations aimed at understanding the frost heave mechanism at various scales [6,7,8,9]. Such investigations covered small-scale column-freezing tests [6], large-scale tests [7], and long-term field-scale monitoring [8,9]. Afterwards, several numerical simulation studies based on heat and mass transfer in porous media were conducted to evaluate frost heave. Initially, such simulation models used empirical equations [10,11,12,13,14]. Konrad and Morgenstern [11] proposed the segregation potential (SP0) theory, which relates the water flux in the frozen fringe to the temperature gradient (grad T) through the coefficient SP0. Subsequently, Konrad and Duquennoi [14] proposed a thermodynamic model that treated soil as an incompressible material to derive new criteria for ice lens formation. Shin et al. [15] developed an elasto-plastic mechanical constitutive equation for frozen soil using SP0 to efficiently describe the complex THM phenomena of frozen soil. Zheng et al. [16] proposed a practical method that extends the one-dimensional frost heave equation (Takashi's equation) to multi-dimensional situations.
Figure 1. Uneven permafrost thawing underneath a building foundation in Kangerlussuaq. (Reprinted with permission from ref. [17]; photo by Thomas Ingeman-Nielsen.)
Figure 2. The sequence of subsidence events caused by frost heave and thawing (Reproduced from [18]).
Afterwards, a new approach was proposed to account for the fluid flow induced by the temperature gradient. This approach estimated cryogenic suction using the interfacial tension between ice and fluid. Several studies used the Clausius–Clapeyron equation, which determines the ice-water pressure at phase equilibrium, to calculate cryogenic suction and perform THM analysis [19,20,21,22,23,24]. In addition, thermomechanical models have been presented [2,25,26,27]. Although such models could not predict the formation of individual ice lenses, they effectively captured the global response of freezing soils by introducing a porosity rate function without any hydraulic analysis.
Meanwhile, owing to advances in computing power, prediction studies based on artificial neural networks (ANNs) are gaining traction. An ANN can handle incomplete data and capture nonlinear, complex relationships among the variables of a system, and ANNs have therefore been recognized as a powerful prediction tool. As in other engineering fields, ANNs are increasingly applied in geotechnical engineering for purposes such as estimating ground surface settlement, in-situ permeability, undrained shear strength, thermal properties, and landslide susceptibility [28,29,30,31,32,33,34,35]. However, only a small fraction of these studies addressed the prediction of frost heave behavior [35]. Zhang et al. [35] predicted the frost heave ratio of saline soil using back-propagation neural network (BPNN) and generalized regression neural network (GRNN) approaches and compared the prediction performance of the two approaches to obtain a relatively reliable model.
Despite the significant progress brought about by the aforementioned experimental, numerical, and statistical studies, several challenges remain. Most previous studies focused on the mathematical modeling of the freezing process accompanied by experimental validation, yet freezing behavior has rarely been analyzed in depth from a geotechnical point of view. Most notably, the amount of frost heave should be estimated on the basis of relevant geotechnical properties. Therefore, this study numerically evaluates the frost heave behavior of frozen soil at the specimen scale by considering important geotechnical parameters. A parametric study is also conducted to quantitatively analyze the effect of major geotechnical properties on frost heave behavior. In addition, after the sensitivity of frost heave behavior to each physical property is evaluated via multiple statistical analyses, an ANN-based prediction model capable of practically estimating the frost heave ratio is presented.
Figure 3. Schematic representation of a frozen soil (Reproduced from [2]).
3. Evaluation of Frost Heave Ratio
Using the governing equations described above, we performed a THM analysis to predict frost heave for a saturated soil specimen. The material properties used in the numerical model are presented in Table 1. Because the governing equations are highly non-linear, the commercial finite element (FE) software COMSOL Multiphysics [47] was used to solve them. The numerical simulation was implemented for a one-dimensional freezing test. The depth of the soil specimen was set to 100 mm, and the initial temperature of the entire specimen was set to 5 °C. The analysis was run until thermal equilibrium was reached while maintaining constant boundary temperatures: the top boundary was held at 5 °C and the bottom boundary at −5 °C, giving a temperature gradient of 1 °C/cm. The groundwater level (GWL) was fixed at the bottom boundary to allow for a continuous water supply during freezing, and the overburden pressure applied on the top boundary was set to atmospheric pressure (101.3 kPa). The frost heave ratio was obtained using Equation (21).
where ζ is the frost heave ratio (%), ΔHf is the total amount of heave (mm), H0 is the initial specimen height (mm), and Δt is the elapsed time (h).
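As a minimal sketch of how these quantities combine, the snippet below assumes the conventional definition of the frost heave ratio, ζ = ΔHf / H0 × 100. Equation (21) itself is not reproduced in this text, so this form is an assumption. The function names are hypothetical, and `mean_heave_rate` is included only because Δt appears among the defined symbols; it is not claimed to be part of Equation (21).

```python
def frost_heave_ratio(total_heave_mm: float, initial_height_mm: float) -> float:
    """Frost heave ratio in percent (assumed form: total heave / initial height * 100)."""
    if initial_height_mm <= 0:
        raise ValueError("initial specimen height must be positive")
    return 100.0 * total_heave_mm / initial_height_mm


def mean_heave_rate(total_heave_mm: float, elapsed_time_h: float) -> float:
    """Average heave rate in mm/h over the elapsed time (hypothetical helper)."""
    if elapsed_time_h <= 0:
        raise ValueError("elapsed time must be positive")
    return total_heave_mm / elapsed_time_h


# Illustrative numbers only: an 8 mm heave on a 100 mm specimen after 60 h.
zeta = frost_heave_ratio(8.0, 100.0)
rate = mean_heave_rate(8.0, 60.0)
print(f"frost heave ratio = {zeta:.1f} %, mean heave rate = {rate:.3f} mm/h")
```

Under this assumed definition, an 8% ratio on the 100 mm specimen would correspond to 8 mm of total heave.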
Figure 4 shows the variation of the frost heave ratio (ζ) and the position of the freezing front over time obtained from the simulation model. The propagation rate of the freezing front gradually decreased with time: the freezing front propagated rapidly during the initial stages of freezing but came to a halt as the specimen approached thermal equilibrium. The amount of frost heave also increased steadily until thermal equilibrium was reached. After thermal equilibrium was reached (after approximately 60 h), the freezing front no longer moved, and no further significant frost heave occurred. Overall, the amount of frost heave increased nonlinearly with time, a tendency that was also observed in the experimental results of Konrad and Morgenstern [11]. In Figure 4, the calculated frost heave ratio at 60 h was approximately 8%.
To verify the reliability of these simulation results, we compared them with those of a previous numerical study for the same freezing conditions. As shown in Figure 5, the model predictions of pore pressure and temperature along the specimen depth showed good agreement with the results of Zhou and Li [21]. This suggests that the numerical model used in this study is reliable.
A parametric study was subsequently conducted to quantitatively analyze the effects of geotechnical properties on the frost heave ratio. This study considered the thermal conductivity of the soil particles and the initial hydraulic conductivity as the crucial parameters, because frost heave behavior is mainly determined by the propagation rate of the freezing front and the water supply in the frozen zone. Whereas the particle thermal conductivity affects the propagation rate of the freezing front, the initial hydraulic conductivity governs the inflow of pore water in the unfrozen zone. Thus, we used the numerical simulation model to obtain and compare frost heave ratio values for a total of 251 combinations of the influencing parameters. As shown in Figure 6, the amount of heave tends to decrease as the particle thermal conductivity increases. This is because a high particle thermal conductivity accelerates freezing, and the resulting freezing rate is too high to allow a sufficient inflow of water from external sources. On the other hand, the frost heave ratio has a positive correlation with the initial hydraulic conductivity: as the initial hydraulic conductivity increases, the frost heave ratio tends to increase. In other words, if the thermal conditions are kept the same and the hydraulic conditions are altered, a higher soil hydraulic conductivity results in a higher frost heave ratio. However, it should be noted that these trends are valid only for silty soil, which is potentially subject to frost heave action. If the soil specimen is closer to sandy soil, no capillary action occurs, resulting in insignificant amounts of heave regardless of the permeability.
Meanwhile, to investigate the sensitivity of the frost heave ratio to both the thermal and hydraulic conductivities of frozen soils, a correlation analysis and a regression analysis were conducted. Table 2 shows the results of the Pearson correlation analysis for each variable [48]. The analysis indicates that both the thermal conductivity and the initial hydraulic conductivity of a frozen soil can significantly affect the heave ratio, as the p-value of each parameter was less than 0.05 (Table 2). A regression analysis was also conducted, as shown in Table 3. According to the regression analysis, the p-values of the coefficients for thermal conductivity and initial hydraulic conductivity were also lower than 0.05, which indicates that these two variables can significantly affect the heave ratio [49]. Although a significant correlation was confirmed between the independent and dependent variables, auto-correlation was also found in the dependent variable. Therefore, it was judged more beneficial to propose a predictive model for frost heave behavior using an artificial neural network than to derive a regression equation.
4. Prediction of Frost Heave Ratio Using the Artificial Neural Network Model
4.1. Establishment of an Artificial Neural Network
In this study, an ANN for the estimation of the frost heave ratio was designed with three layers: an input layer, a hidden layer, and an output layer. The input layer stores and provides data to the network, whereas the hidden layer, which is constructed with general neurons, connects the input layer to the output layer. As shown in Figure 7, the developed ANN model has a 2-5-1 structure, with two neurons in the input layer, five neurons in the hidden layer, and one neuron in the output layer. Each neuron receives as its input the weighted sum of the outputs of all neurons in the previous layer. This sum is passed through a transfer function to provide an outgoing signal to the next layer. Finally, the output layer stores the value predicted by the network. For each neuron, the total input value can be obtained as follows.
where W is the weight matrix, which stores the weights of every connection between the current and the preceding layer, the vector x contains all output signal values from the previous layer, and the vector b comprises the bias values of the current layer. The input value is transformed within each neuron via a transfer function. Thus, Equation (22) can be rewritten as follows.
where f is the transfer function, which usually adds non-linearity to the network so that its output can fit the target values. Without transfer functions, the network could provide only an output that is linear in its input signal.
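The layer computation described above, a weighted sum plus bias passed through a transfer function, can be sketched for a 2-5-1 network as follows. This is an illustration only: the weights and biases are arbitrary placeholders rather than the trained values, the function names are hypothetical, and the tangent sigmoid is assumed for both the hidden and output layers.

```python
import math


def layer(weights, biases, inputs, transfer):
    """Return f(W x + b) for one layer; `weights` is a list of rows of W."""
    return [
        transfer(sum(w * x for w, x in zip(row, inputs)) + b)
        for row, b in zip(weights, biases)
    ]


def forward_2_5_1(x, w_hidden, b_hidden, w_out, b_out):
    """Feed-forward pass: 2 inputs -> 5 hidden neurons -> 1 output."""
    hidden = layer(w_hidden, b_hidden, x, math.tanh)
    return layer(w_out, b_out, hidden, math.tanh)[0]


# Placeholder parameters chosen arbitrarily for illustration.
W_HIDDEN = [[0.5, -0.3], [0.1, 0.8], [-0.7, 0.2], [0.4, 0.4], [-0.2, -0.6]]
B_HIDDEN = [0.0, 0.1, -0.1, 0.2, 0.0]
W_OUT = [[0.3, -0.5, 0.2, 0.7, -0.1]]
B_OUT = [0.05]

y = forward_2_5_1([0.2, -0.4], W_HIDDEN, B_HIDDEN, W_OUT, B_OUT)
print(f"network output (scaled units) = {y:.4f}")
```

Replacing `math.tanh` with the identity function in both layers would collapse the network into a single linear map of its inputs, which is the limitation that the transfer function removes.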
To improve the performance of the network, the Nguyen-Widrow method [50] was adopted to produce the initial weights and biases. In this study, the back-propagation technique was applied to the training procedure, which consists of two phases. First, the feed-forward phase passes all data from the input layer to the output layer according to the weighted sum of the outputs of every connected neuron in the preceding layer. A transfer function is applied within each neuron to compute its output value, and the predicted output is thus obtained at the output layer. The difference between the predicted value and the expected value is measured by a cost function. In this study, the quadratic cost function was applied as follows.
where yi is the predicted value obtained by the ANN and expi is the expected value from the dataset. In Equation (24), the predicted value depends on the input signal, weights, biases, and transfer function. Therefore, the cost function can be rewritten in terms of these variables and the expected value as follows.
Second, the backward pass evaluates the loss function and updates the weight matrix. This process is repeated over the epochs until the sum squared error is minimized. With every training iteration, a new weight W+ is obtained from the cost function and the current weight W.
where η is the learning rate, which is usually a small constant, and ∇C is the gradient of the cost function with respect to the weight, which can be estimated as follows.
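The weight update described above, W+ = W − η∇C, can be illustrated on the smallest possible case: a single linear neuron y = w·x + b trained on a quadratic cost. The cost is assumed here to be C = ½ Σ (yi − expi)²; the ½ factor and the toy data are assumptions, and the function names are hypothetical.

```python
def quadratic_cost(w, b, xs, targets):
    """Assumed quadratic cost: half the sum of squared prediction errors."""
    return 0.5 * sum((w * x + b - t) ** 2 for x, t in zip(xs, targets))


def gradient(w, b, xs, targets):
    """Gradient of the quadratic cost with respect to w and b."""
    errors = [w * x + b - t for x, t in zip(xs, targets)]
    dw = sum(e * x for e, x in zip(errors, xs))
    db = sum(errors)
    return dw, db


def train(xs, targets, eta=0.1, epochs=100):
    """Repeat the update W+ = W - eta * grad(C) for a fixed number of epochs."""
    w, b = 0.0, 0.0
    for _ in range(epochs):
        dw, db = gradient(w, b, xs, targets)
        w, b = w - eta * dw, b - eta * db
    return w, b


# Toy data generated from the target relation t = 2x.
XS = [-1.0, -0.5, 0.0, 0.5, 1.0]
TARGETS = [2.0 * x for x in XS]

w_fit, b_fit = train(XS, TARGETS)
print(f"w = {w_fit:.6f}, b = {b_fit:.6f}")
```

In a multilayer network the same update is applied to every weight and bias, with the gradient obtained by propagating the output error backward through the layers.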
One of the most frequently encountered problems during training is overfitting, which occurs when the ANN model is fitted too closely to the training data and consequently performs poorly on the testing data. In this study, we investigated the effect of two training algorithms, Bayesian regularization and Levenberg-Marquardt.
The Bayesian regularization technique, which was also applied in this study, can improve the efficiency of the ANN training process. Ordinary training reduces the sum of squared errors, so that the objective function can be denoted F = ED. Bayesian regularization, however, adds a further term to construct the objective function as follows.
where ED is the sum of squared errors, Ew is the sum of the squares of the weights of the ANN model, and α and β are the objective function parameters. Both objective function parameters can be obtained via the Gauss-Newton approximation method.
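A minimal sketch of such a regularized objective is given below. It assumes the commonly used form F = β·ED + α·Ew; which parameter multiplies which term is an assumption here, since the equation is not reproduced in this text. The function name and the numerical values are hypothetical.

```python
def regularized_objective(errors, weights, alpha, beta):
    """Assumed Bayesian-regularization objective: beta * E_D + alpha * E_w."""
    e_d = sum(e ** 2 for e in errors)    # sum of squared prediction errors
    e_w = sum(w ** 2 for w in weights)   # sum of squared network weights
    return beta * e_d + alpha * e_w


# Illustrative values: identical errors, but larger weights are penalized more.
errors = [0.1, -0.2, 0.05]
small_weights = [0.3, -0.4, 0.2]
large_weights = [3.0, -4.0, 2.0]

f_small = regularized_objective(errors, small_weights, alpha=0.01, beta=1.0)
f_large = regularized_objective(errors, large_weights, alpha=0.01, beta=1.0)
print(f"F(small weights) = {f_small:.4f}, F(large weights) = {f_large:.4f}")
```

Because two networks with equal training error are ranked by the size of their weights, the penalty term favors smoother mappings, which is how this kind of objective counteracts overfitting.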
The Levenberg-Marquardt technique solves non-linear least-squares problems by combining the Gauss-Newton method and the steepest descent method. The new weights are calculated using the following equation.
where I is the identity matrix, μ is a learning parameter, J is the Jacobian matrix, and E is the cumulative error vector, determined as in [51]. For μ = 0, the method reduces to the Gauss-Newton method, whereas for larger values of μ it approaches the steepest descent method. The learning parameter μ is automatically adjusted at each iteration. The disadvantage of the Levenberg-Marquardt technique is the high computational cost of computing large Jacobians and inverting matrices.
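The role of μ can be illustrated with a single update step in its standard textbook form, W+ = W − (JᵀJ + μI)⁻¹ JᵀE, applied to a two-parameter model y = a·x + b so that the 2×2 system can be solved in closed form. The model, data, and function name are assumptions made for illustration, and the automatic adjustment of μ is omitted.

```python
def lm_step(params, xs, targets, mu):
    """One Levenberg-Marquardt update for the toy model y = a*x + b."""
    a, b = params
    # Error vector E_i = y_i - t_i and Jacobian rows [dE_i/da, dE_i/db] = [x_i, 1].
    errors = [a * x + b - t for x, t in zip(xs, targets)]
    jac = [[x, 1.0] for x in xs]
    # A = J^T J + mu * I  (symmetric 2x2 matrix)
    a00 = sum(r[0] * r[0] for r in jac) + mu
    a01 = sum(r[0] * r[1] for r in jac)
    a11 = sum(r[1] * r[1] for r in jac) + mu
    # g = J^T E
    g0 = sum(r[0] * e for r, e in zip(jac, errors))
    g1 = sum(r[1] * e for r, e in zip(jac, errors))
    # Solve A * delta = g with the closed-form 2x2 inverse.
    det = a00 * a11 - a01 * a01
    d0 = (a11 * g0 - a01 * g1) / det
    d1 = (a00 * g1 - a01 * g0) / det
    return a - d0, b - d1


XS = [0.0, 1.0, 2.0, 3.0]
TARGETS = [1.0, 3.0, 5.0, 7.0]  # generated from t = 2x + 1

gauss_newton = lm_step((0.0, 0.0), XS, TARGETS, mu=0.0)
damped = lm_step((0.0, 0.0), XS, TARGETS, mu=1000.0)
print("mu = 0    ->", gauss_newton)
print("mu = 1000 ->", damped)
```

With μ = 0 the step is a full Gauss-Newton step, which solves this linear toy problem in one iteration; with a large μ the step shrinks toward a short move along the negative gradient, i.e., steepest descent with a step size of roughly 1/μ.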
4.2. Application of an ANN to Frost Heave Ratio Predictions
In this study, an ANN model was developed to predict the frost heave ratio ζ of the frozen soil. Two parameters, the hydraulic conductivity in the unfrozen zone (k0) and the thermal conductivity of the soil particles (λs), were considered as input parameters. The dataset consisted of input-target pairs: 197 pairs for training and 49 pairs for testing. Bayesian regularization was applied to a back-propagation neural network.
The architecture of an ANN is usually determined via trial and error. Generally, the input-target pairs are scaled to the range [−1, 1] before training, so that the minimum and maximum values of the original input-target pairs are mapped to −1 and 1, respectively. After the training procedure, the weight matrices and bias vectors are applied to any future inputs, which should be scaled using the minimum and maximum values of the original inputs and targets. Once the network is trained, the predicted value falls within the range [−1, 1] and should be converted back into the original units using the vector that contains the minimum and maximum of the original input-target pairs. In this study, the tangent sigmoid transfer function is adopted for all layers except for the input layer, where the linear transfer function is used instead.
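The scaling and its inverse can be sketched as a linear min-max mapping, which is one common way to realize the [−1, 1] normalization described above. The function names and the numerical range are hypothetical.

```python
def scale_to_unit_range(value, vmin, vmax):
    """Map value from [vmin, vmax] linearly onto [-1, 1]."""
    return 2.0 * (value - vmin) / (vmax - vmin) - 1.0


def unscale_from_unit_range(scaled, vmin, vmax):
    """Inverse mapping: convert a value in [-1, 1] back to original units."""
    return (scaled + 1.0) * (vmax - vmin) / 2.0 + vmin


# Hypothetical range of a target variable, e.g. frost heave ratio in percent.
ZETA_MIN, ZETA_MAX = 2.0, 12.0

scaled = scale_to_unit_range(7.0, ZETA_MIN, ZETA_MAX)
restored = unscale_from_unit_range(scaled, ZETA_MIN, ZETA_MAX)
print(f"scaled = {scaled:.2f}, restored = {restored:.2f}")
```

New inputs must be scaled with the minimum and maximum of the original training data, not with their own extremes; otherwise the trained weights are applied to values on a different scale.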
Additionally, the learning rate (in Equation (28)) plays a vital role in the ANN. If the learning rate is too low, the weight matrix is updated too slowly and the network may take a long time to reach a local minimum. In contrast, an overly large learning rate may cause the network to overshoot and miss the local minimum. Traditionally, many studies have adopted a learning rate of 0.1 or 0.01. In our study, we investigated the effect of both learning rates on the ANN model.
Figure 8 shows the relationship between the coefficient of determination R² and the number of neurons in the hidden layer for Bayesian regularization and Levenberg-Marquardt at both learning rates (η = 0.01 and η = 0.1). R² converged to a high value (R² ≥ 0.95) when the ANN model had more than three neurons in the hidden layer for Bayesian regularization and more than six neurons for Levenberg-Marquardt. Although the Levenberg-Marquardt algorithm achieved a higher converged coefficient than Bayesian regularization at eight and ten neurons in the hidden layer, it requires a high computational cost to compute the Jacobian matrix. Thus, Bayesian regularization was adopted in this study. With a learning rate of 0.01, the R² value of the ANN model based on Bayesian regularization peaked at five neurons in the hidden layer. Therefore, the learning rate and the number of neurons in the hidden layer were set to 0.01 and 5, respectively.
Figure 9 shows the relationship between the converged coefficient R² and the number of neurons in the hidden layer of the ANN model based on Bayesian regularization for four performance functions: Mean Absolute Error (MAE), Mean Square Error (MSE), Sum Absolute Error (SAE), and Sum Square Error (SSE). The converged coefficient peaked at five neurons in the hidden layer when the performance function was the MSE. The other performance functions showed similar trends but yielded lower values of the converged coefficient R². In this study, the MSE was therefore adopted to evaluate the performance of the ANN model.
Figure 10 compares the frost heave ratio predicted by the ANN model with that obtained from the simulation model. The model exhibited an R² value of 0.9538 for the training data and 0.8929 for the testing data. Thus, the proposed ANN-based prediction model can be judged reliable and applicable for predicting the frost heave ratio from the hydraulic conductivity in the unfrozen zone and the thermal conductivity of the soil particles. However, it should be noted that because the FEM itself contains inherent modeling error, this error is carried into the ANN and may be compounded by the estimation error of the ANN when it is implemented.
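For reference, the coefficient of determination can be computed as sketched below, assuming the definition R² = 1 − SSres/SStot. The text does not state which definition was used, and some neural network toolboxes instead report the correlation coefficient R between targets and outputs. The function name and data are illustrative only.

```python
def r_squared(observed, predicted):
    """Coefficient of determination, assumed form 1 - SS_res / SS_tot."""
    if len(observed) != len(predicted) or len(observed) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


# Hypothetical frost heave ratios (%) from a simulation and from an ANN.
simulated = [4.0, 6.0, 8.0, 10.0]
ann_predicted = [4.3, 5.8, 8.4, 9.6]

print(f"R^2 = {r_squared(simulated, ann_predicted):.4f}")
```

Computing this value separately on the training and testing pairs, as done in the study, indicates how much accuracy is lost on data not seen during training.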
Table 4 and Figure 11 present the weights and biases of the trained model, which allows others to make practical use of the developed ANN model.
To determine the sensitivity of the ANN model, Garson's analysis [52] was adopted to interpret the connection weights and thereby quantify the relative importance of each input variable. The relative importance along the connections from the input to the output can be calculated as follows.
where NH and NV are the number of neurons in the hidden layer and the number of variables (input parameters), respectively, IV is the sum of the products of the input connection weights in the hidden layer, and O is the connection weight of the output node.
Table 5 lists the connection weights and biases for each layer. Table 6 shows that the hydraulic conductivity in the unfrozen zone (k0) was the most important input factor, whereas the thermal conductivity of the soil particles (λs) was of lesser importance.
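The sketch below implements one commonly cited formulation of Garson's algorithm for a network with a single output: within each hidden neuron, the absolute input weights are normalized to shares, each share is multiplied by the absolute hidden-to-output weight, and the results are summed over the hidden neurons and normalized. Whether this matches the exact variant used in the study is an assumption; the weights are hypothetical and are not those of Table 5.

```python
def garson_importance(w_input_hidden, w_hidden_output):
    """Relative importance of each input (one formulation of Garson's algorithm).

    w_input_hidden[h][v] is the weight from input v to hidden neuron h;
    w_hidden_output[h] is the weight from hidden neuron h to the output.
    """
    n_inputs = len(w_input_hidden[0])
    scores = [0.0] * n_inputs
    for row, w_out in zip(w_input_hidden, w_hidden_output):
        row_total = sum(abs(w) for w in row)
        for v, w in enumerate(row):
            # Share of input v in this hidden neuron, weighted by |output weight|.
            scores[v] += abs(w) / row_total * abs(w_out)
    total = sum(scores)
    return [s / total for s in scores]


# Hypothetical 2-input, 2-hidden-neuron network.
W_INPUT_HIDDEN = [[3.0, 1.0], [1.0, 1.0]]
W_HIDDEN_OUTPUT = [1.0, 2.0]

importance = garson_importance(W_INPUT_HIDDEN, W_HIDDEN_OUTPUT)
print([f"{100 * ri:.1f} %" for ri in importance])
```

Because the algorithm uses absolute weights, it ranks the inputs by the magnitude of their influence and does not indicate whether that influence is positive or negative.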