1. Introduction
Rock mass stability is closely linked to the distribution of structural planes [1]. These structural planes intersect the rock mass, dividing it into blocks of varying shapes and sizes. When subjected to external forces, certain blocks on the free surface may slip along the structural planes, leading to instability of the rock mass; these blocks are referred to as “key blocks” [2]. Accurately determining the distribution of structural planes within the rock mass is crucial for the timely identification of key blocks and the subsequent analysis of rock mass stability [3]. Large-scale deterministic structural planes are relatively straightforward to measure, and contemporary techniques such as terrestrial laser scanning facilitate their geometric identification. Conversely, stochastic structural planes are numerous and small in scale; they can only be generated stochastically by analyzing the statistical characteristics of the planes exposed on rock mass surfaces.
Since the 1980s, the Monte Carlo method has been widely employed for generating stochastic structural planes. This method operates by analyzing the probability distribution of each parameter and then generating random numbers according to the corresponding probability distributions [4,5,6,7,8]. Rafiee and Vinches [9] combined geological statistical analysis and conventional methods to model the three-dimensional structure of rock masses. Wang et al. [10] used the Monte Carlo method to generate stochastic structural planes, based on which movable blocks were analyzed using block theory and limit equilibrium. Sun et al. [11] used the Monte Carlo method to develop a 3D stochastic network model of structural planes with a Baecher disk model based on the structure-from-motion (SfM) photogrammetric method.
However, some studies have shown that there are correlations between parameters of structural planes, such as the correlation between the dip direction and the dip angle [6,12,13,14]. The traditional Monte Carlo method establishes probability density functions of the various geometric parameters of structural planes based on measured factual structural planes and samples from these known probability density functions to obtain stochastic variables that approximate the factual probability distribution. Because this method samples each geometric parameter independently, it ignores the correlations between them. It is therefore difficult to generate structural planes that are accurately consistent with the internal structural planes of the rock mass, and a new generation method that considers the correlation between multiple parameters is needed.
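The effect of independent sampling can be illustrated with a minimal sketch. It uses synthetic, hypothetical (dip direction, dip angle) pairs rather than measured data, and it imitates independent marginal sampling by shuffling one column, which keeps every marginal distribution exactly but breaks the pairing between parameters. The function names and all numbers are assumptions made for illustration.

```python
import math
import random


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def synthetic_planes(n, rng):
    """Hypothetical correlated (dip direction, dip angle) pairs, in degrees."""
    planes = []
    for _ in range(n):
        dip_dir = rng.gauss(257.0, 15.0)
        # The dip angle decreases as the dip direction increases, plus scatter.
        dip = 35.0 - 0.3 * (dip_dir - 257.0) + rng.gauss(0.0, 4.0)
        planes.append((dip_dir, dip))
    return planes


def sample_independently(planes, rng):
    """Treat each parameter on its own, as independent marginal sampling does."""
    dip_dirs = [p[0] for p in planes]
    dips = [p[1] for p in planes]
    rng.shuffle(dips)  # same marginal distribution, pairing destroyed
    return list(zip(dip_dirs, dips))


rng = random.Random(0)
measured = synthetic_planes(2000, rng)
independent = sample_independently(measured, rng)

r_measured = pearson_r(*zip(*measured))
r_independent = pearson_r(*zip(*independent))
print(f"r (paired data)         = {r_measured:.3f}")
print(f"r (independent columns) = {r_independent:.3f}")
```

Both data sets have identical histograms for each parameter, so a comparison of marginal distributions alone cannot reveal the lost correlation.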
In recent years, deep learning has been extensively applied in content generation, exemplified by models such as ChatGPT. Deep generative models (DGMs) are highly generalizable, relying on generic probabilistic modelling and feature learning, which can be applied across various domains and data types. DGMs utilize deep neural networks to learn the patterns and structures of samples. Through multiple nonlinear transformations, they can automatically capture the high-dimensional features and the complex relationships within the data. In addition, they introduce latent variables in the latent space to generate new data that are similar but not identical to the samples. DGMs mainly include generative adversarial networks (GANs) [15], denoising diffusion probabilistic models (DDPMs) [16], and variational autoencoders (VAEs) [17]. These models have been widely used in various fields, such as image generation [18], text generation [19], natural language processing [20], and others [21,22,23].
DDPMs have shown significant advantages in generative tasks and have gradually become one of the most popular classes of generative models [24,25]. For example, Xu et al. [26] utilized DDPMs to predict the line-of-sight (LOS) mass-weighted number density of giant molecular clouds (GMCs) from column density maps in astronomy. Li et al. [27] successfully generated point–label pairs based on a DDPM generative model. Nair and Patel [28] proposed an effective solution for thermal-to-visible (T2V) face translation using DDPMs. Their approach generates samples from the conditional distribution of visible images given thermal images. The noise prediction module of a DDPM is a deep neural network that excels at universal function approximation within numerical paradigms due to its self-learning abilities, adaptivity, fault tolerance, nonlinearity, and advanced input-to-output mapping capabilities [29]. A DDPM is capable of generating data without assuming a prior probability distribution. During training, it can automatically capture associations, dependencies, and structures in the data, including higher-order features and interactions, and it is able to maintain the consistency of these features when generating new samples.
Other deep generative models, such as GANs and VAEs, can also be used to generate stochastic structural planes. However, DDPMs provide a more stable training process compared to the adversarial training between the generator and discriminator in GANs, and they exhibit lower sensitivity to hyperparameters. The theoretical foundation of DDPMs is based on a relatively simple diffusion process, making them easier to understand and implement compared to the variational inference framework of VAEs. In addition, DDPMs offer a wider variety of generation modes and possess an interpretable generation process [30]. Therefore, DDPMs can be considered as a novel method for generating content such as stochastic structural planes.
To address the limitation of the traditional Monte Carlo method, which does not consider parameter correlations in the generation process, we propose a deep learning approach based on DDPM to generate stochastic structural planes. The essential idea is to input the different parameters of structural planes into the DDPM as a whole, so that the neural network takes the correlation between the parameters into account during generation. As a result, the generated stochastic structural planes can be more consistent with the measured factual structural planes than those produced by independent sampling, which allows for more accurate identification of key blocks and analysis of rock mass stability.
The remainder of this paper is organized as follows.
Section 2 provides a detailed description of the DDPM.
Section 3 introduces the overall process and verification method of this study.
Section 4 explains the experimental setting, results, and validation in this paper.
Section 5 discusses the advantages and shortcomings of this study and prospects for future work.
Section 6 concludes the paper.
2. Background: Denoising Diffusion Probabilistic Model (DDPM)
The idea of diffusion probabilistic models was initially introduced by Sohl-Dickstein et al. in 2015 [31] and was further improved by Ho et al. in 2020 with the denoising diffusion probabilistic model [16]. The noise prediction module of DDPM employs a deep neural network, which is highly effective in universal function approximation within numerical paradigms. This effectiveness is attributed to the network’s self-learning capabilities, adaptability, fault tolerance, nonlinearity, and sophisticated input-to-output mapping [29]. DDPM is capable of generating data without presuming any prior probability distribution.
DDPM inputs and generates data as a whole, allowing for the capture of the correlation between parameters [32]. Additionally, DDPM is easier to train than other deep generative models, offers a wider variety of generation modes, and has an interpretable generation process [30]. Li et al. [27] successfully generated annotated point clouds based on the DDPM. By aggregating the intermediate features of the generator, they proposed a feature interpreter that converts intermediate features into semantic labels, and they introduced an uncertainty metric to enhance the quality of the generated point cloud dataset, further showcasing the effectiveness and efficiency of the DDPM when only sparsely labelled examples are available. Nair and Patel [28] utilized the DDPM to translate long-wave infrared thermal face images into the corresponding visible images. During training, the model learns the conditional distribution of visible facial images given their corresponding thermal images through the diffusion process. In the reverse process, the visible domain image is obtained by starting from Gaussian noise and iteratively performing denoising.
Figure 1 shows the principle of DDPM; the model is composed of two main processes: (1) the diffusion process and (2) the reverse process.
2.1. The Diffusion Process
This process samples original data ($x_0$) and adds noise to it. The noise addition process is divided into T steps, with each step adding a small amount of noise to obtain a series of noisy samples ($x_1$, $x_2$, ..., $x_T$). As the number of steps increases, the final result, $x_T$, can be approximated as an isotropic Gaussian distribution. The process obeys the following formula:
$$x_t = \sqrt{1-\beta_t}\,x_{t-1} + \sqrt{\beta_t}\,z;$$
z represents the added noise. The noise added at each step is independent and follows a Gaussian distribution, with its intensity increasing with the step t. If the original data ($x_0$) are known, $x_t$ can be obtained at any time. The formula above can also be expressed as Equation (1), which represents the distribution of latent variables in the forward process.
$$q(x_t \mid x_{t-1}) = \mathcal{N}\left(x_t;\ \sqrt{1-\beta_t}\,x_{t-1},\ \beta_t \mathbf{I}\right) \tag{1}$$
where $\beta_t$ is the variance, ranging from 0 to 1;
t is the number of steps.
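A minimal sketch of the forward process follows. It assumes a linear variance schedule from 1e-4 to 0.02 (a common choice that is not stated in this section) and a hypothetical three-component sample. It applies the step-wise rule $x_t = \sqrt{1-\beta_t}\,x_{t-1} + \sqrt{\beta_t}\,z$ and also the equivalent closed form $x_t = \sqrt{\bar{\alpha}_t}\,x_0 + \sqrt{1-\bar{\alpha}_t}\,z$ with $\bar{\alpha}_t = \prod_{i \le t}(1-\beta_i)$, which is why $x_t$ can be obtained directly from $x_0$.

```python
import math
import random


def linear_betas(num_steps, beta_start=1e-4, beta_end=0.02):
    """Variance schedule beta_1..beta_T (assumed linear)."""
    step = (beta_end - beta_start) / (num_steps - 1)
    return [beta_start + i * step for i in range(num_steps)]


def alpha_bars(betas):
    """Cumulative products of (1 - beta_t)."""
    out, prod = [], 1.0
    for b in betas:
        prod *= 1.0 - b
        out.append(prod)
    return out


def diffuse_step(x_prev, beta, rng):
    """One forward step: x_t = sqrt(1 - beta) * x_{t-1} + sqrt(beta) * z."""
    return [math.sqrt(1.0 - beta) * v + math.sqrt(beta) * rng.gauss(0.0, 1.0)
            for v in x_prev]


def diffuse_direct(x0, abar_t, rng):
    """Closed form: x_t = sqrt(abar_t) * x_0 + sqrt(1 - abar_t) * z."""
    return [math.sqrt(abar_t) * v + math.sqrt(1.0 - abar_t) * rng.gauss(0.0, 1.0)
            for v in x0]


T = 1000
betas = linear_betas(T)
abars = alpha_bars(betas)
rng = random.Random(0)

# Hypothetical normalized (dip direction, dip angle, trace length) sample.
x0 = [0.8, -0.3, 1.5]
x = x0
for beta in betas:
    x = diffuse_step(x, beta, rng)

print("fraction of the signal kept after T steps:", math.sqrt(abars[-1]))
print("x_T from step-wise diffusion:", x)
print("x_T from the closed form:    ", diffuse_direct(x0, abars[-1], rng))
```

With this schedule almost none of the original signal survives at step T, which is consistent with $x_T$ being approximately standard Gaussian noise.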
2.2. The Reverse Process
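In contrast to the forward diffusion process, the reverse process is the process of removing noise. In the reverse process, Gaussian noise is gradually removed at each step, starting from $x_T$ and producing a series of samples ($x_{T-1}$, $x_{T-2}$, ..., $x_1$), culminating in a completely noise-free $x_0$. If the conditional probability distribution $q(x_{t-1} \mid x_t)$ is known, i.e., the data distribution of the overall sample is known, it is possible to iterate from $x_T$ to $x_0$, step by step, with decreasing t in the reverse process. But we do not know the data distribution of the sample; in this case, a neural network model $p_\theta(x_{t-1} \mid x_t)$ is needed to approximate the probability distribution. This model takes $x_t$ and t as inputs, and the outputs are of the same dimension as $x_t$. $\theta$ represents the parameters of the neural network. Since the Gaussian distribution is determined by the mean and variance, $p_\theta(x_{t-1} \mid x_t)$ can be expressed in the form of Equation (2), where $\mu_\theta(x_t, t)$ represents the mean and $\Sigma_\theta(x_t, t)$ represents the variance.
$$p_\theta(x_{t-1} \mid x_t) = \mathcal{N}\left(x_{t-1};\ \mu_\theta(x_t, t),\ \Sigma_\theta(x_t, t)\right) \tag{2}$$
Although $q(x_{t-1} \mid x_t)$ is not known, $q(x_{t-1} \mid x_t, x_0)$ can be derived from $x_t$ and $x_0$, as shown in Equation (3). Then, $q(x_{t-1} \mid x_t, x_0)$ can be used to supervise $p_\theta(x_{t-1} \mid x_t)$ during training.
$$q(x_{t-1} \mid x_t, x_0) = \mathcal{N}\left(x_{t-1};\ \tilde{\mu}_t(x_t, x_0),\ \tilde{\beta}_t \mathbf{I}\right) \tag{3}$$
It can be deduced that
$$x_t = \sqrt{\bar{\alpha}_t}\,x_0 + \sqrt{1-\bar{\alpha}_t}\,z \tag{4}$$
$$\tilde{\mu}_t(x_t, x_0) = \frac{1}{\sqrt{\alpha_t}}\left(x_t - \frac{\beta_t}{\sqrt{1-\bar{\alpha}_t}}\,z\right) \tag{5}$$
where $\alpha_t = 1-\beta_t$ and $\bar{\alpha}_t = \prod_{i=1}^{t}\alpha_i$; $\sqrt{\bar{\alpha}_t}$ and $\sqrt{1-\bar{\alpha}_t}$ are combination coefficients whose sum of squares is 1,
z represents the added noise, and
t is the number of steps.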
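The mean that supervises the reverse process can be written either in terms of $(x_t, x_0)$ or in terms of $(x_t, z)$, and the equivalence can be checked numerically. The sketch below assumes the standard DDPM posterior mean, $\tilde{\mu}_t = \frac{\sqrt{\bar{\alpha}_{t-1}}\,\beta_t}{1-\bar{\alpha}_t}\,x_0 + \frac{\sqrt{\alpha_t}\,(1-\bar{\alpha}_{t-1})}{1-\bar{\alpha}_t}\,x_t$, and a linear variance schedule; both are assumptions about details not spelled out in this section, and the function names are hypothetical.

```python
import math
import random

T = 1000
# Assumed linear variance schedule (not taken from the paper).
betas = [1e-4 + i * (0.02 - 1e-4) / (T - 1) for i in range(T)]
alphas = [1.0 - b for b in betas]
abars = []
prod = 1.0
for a in alphas:
    prod *= a
    abars.append(prod)


def posterior_mean_from_x0(x_t, x0, t):
    """Mean of q(x_{t-1} | x_t, x_0); t is 1-based and must be >= 2."""
    i = t - 1
    c0 = math.sqrt(abars[i - 1]) * betas[i] / (1.0 - abars[i])
    ct = math.sqrt(alphas[i]) * (1.0 - abars[i - 1]) / (1.0 - abars[i])
    return c0 * x0 + ct * x_t


def posterior_mean_from_noise(x_t, z, t):
    """The same mean, written with the noise z that produced x_t from x_0."""
    i = t - 1
    return (x_t - betas[i] / math.sqrt(1.0 - abars[i]) * z) / math.sqrt(alphas[i])


rng = random.Random(1)
t = 500
x0 = 0.7                    # one hypothetical normalized parameter value
z = rng.gauss(0.0, 1.0)
x_t = math.sqrt(abars[t - 1]) * x0 + math.sqrt(1.0 - abars[t - 1]) * z

m1 = posterior_mean_from_x0(x_t, x0, t)
m2 = posterior_mean_from_noise(x_t, z, t)
print(m1, m2)
```

Because the two forms agree, a network that predicts the noise is sufficient to evaluate the mean of each reverse step.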
2.3. Loss Function
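The aim of DDPM is to train $p_\theta$ using the variational bound to optimize its negative log-likelihood. The formula is as follows:
$$\mathbb{E}\left[-\log p_\theta(x_0)\right] \le \mathbb{E}_q\left[-\log \frac{p_\theta(x_{0:T})}{q(x_{1:T} \mid x_0)}\right] = \mathbb{E}_q\left[D_{KL}\left(q(x_T \mid x_0)\,\|\,p(x_T)\right) + \sum_{t>1} D_{KL}\left(q(x_{t-1} \mid x_t, x_0)\,\|\,p_\theta(x_{t-1} \mid x_t)\right) - \log p_\theta(x_0 \mid x_1)\right] \tag{6}$$
From the derived formula (Equation (6)), it can be seen that training $p_\theta$ is intended to minimize the KL divergence between $q(x_{t-1} \mid x_t, x_0)$ and $p_\theta(x_{t-1} \mid x_t)$. Here, q represents the known Gaussian distribution, and $p_\theta$ is the distribution to be fitted. The variance of $p_\theta$ is constant ($\sigma_t^2 \mathbf{I}$), and we only need to match the means of q and $p_\theta$, which is equivalent to minimizing Equation (7):
$$L_{t-1} = \mathbb{E}_q\left[\frac{1}{2\sigma_t^2}\left\|\tilde{\mu}_t(x_t, x_0) - \mu_\theta(x_t, t)\right\|^2\right] + C \tag{7}$$
where C is a constant, and the derived Equation (7) becomes
$$L_{t-1} - C = \mathbb{E}_{x_0, \epsilon}\left[\frac{1}{2\sigma_t^2}\left\|\frac{1}{\sqrt{\alpha_t}}\left(x_t(x_0, \epsilon) - \frac{\beta_t}{\sqrt{1-\bar{\alpha}_t}}\,\epsilon\right) - \mu_\theta\left(x_t(x_0, \epsilon), t\right)\right\|^2\right] \tag{8}$$
The further simplified Equation (8) can become
$$\mathbb{E}_{x_0, \epsilon}\left[\frac{\beta_t^2}{2\sigma_t^2\,\alpha_t\,(1-\bar{\alpha}_t)}\left\|\epsilon - \epsilon_\theta\left(\sqrt{\bar{\alpha}_t}\,x_0 + \sqrt{1-\bar{\alpha}_t}\,\epsilon,\ t\right)\right\|^2\right] \tag{9}$$
where $\alpha_t = 1-\beta_t$, $\bar{\alpha}_t = \prod_{i=1}^{t}\alpha_i$, $\epsilon$ is the added Gaussian noise (denoted z in Sections 2.1 and 2.2), and $\epsilon_\theta$ is a neural network for predicting the noise added between $x_0$ and $x_t$. The authors of [16] found that removing the weighting coefficient in front of the squared error improves sample quality and is simpler to implement. The optimization objective of DDPM can then be expressed as Equation (10):
$$L_{\mathrm{simple}}(\theta) = \mathbb{E}_{t, x_0, \epsilon}\left[\left\|\epsilon - \epsilon_\theta\left(\sqrt{\bar{\alpha}_t}\,x_0 + \sqrt{1-\bar{\alpha}_t}\,\epsilon,\ t\right)\right\|^2\right] \tag{10}$$
The key procedure of DDPM is to train this model for estimating noise, and the loss function is an expression of the difference between the estimation and the actual result.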
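The simplified objective can be sketched as a Monte Carlo estimate over random time steps and noise draws. The code assumes a linear variance schedule and replaces the neural network with plain Python callables (`oracle` and `zero_predictor` are hypothetical stand-ins), so it shows only how the loss is assembled, not how a network is trained.

```python
import math
import random

T = 1000
# Assumed linear variance schedule (not taken from the paper).
betas = [1e-4 + i * (0.02 - 1e-4) / (T - 1) for i in range(T)]
abars, prod = [], 1.0
for b in betas:
    prod *= 1.0 - b
    abars.append(prod)


def simple_loss(data, predictor, num_draws, rng):
    """Estimate E || noise - predictor(x_t, t) ||^2, averaged over components."""
    total = 0.0
    for _ in range(num_draws):
        x0 = rng.choice(data)
        t = rng.randrange(1, T + 1)          # uniform time step in 1..T
        abar = abars[t - 1]
        noise = [rng.gauss(0.0, 1.0) for _ in x0]
        x_t = [math.sqrt(abar) * v + math.sqrt(1.0 - abar) * e
               for v, e in zip(x0, noise)]
        pred = predictor(x_t, t)
        total += sum((e - p) ** 2 for e, p in zip(noise, pred)) / len(x0)
    return total / num_draws


x0 = [0.8, -0.3, 1.5]  # single hypothetical normalized sample


def oracle(x_t, t):
    """Recovers the noise exactly because it knows the only x_0 in the data."""
    abar = abars[t - 1]
    return [(v - math.sqrt(abar) * s) / math.sqrt(1.0 - abar)
            for v, s in zip(x_t, x0)]


def zero_predictor(x_t, t):
    """Always predicts zero noise."""
    return [0.0 for _ in x_t]


rng = random.Random(0)
loss_oracle = simple_loss([x0], oracle, 2000, rng)
loss_zero = simple_loss([x0], zero_predictor, 2000, rng)
print(loss_oracle, loss_zero)
```

A perfect noise predictor drives the loss to zero, while a predictor that ignores its input leaves it near the variance of the noise, which is the gap that training is meant to close.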
4. Results and Analysis
In this section, we apply the proposed DDPM-based generation method to a real-world rock slope and evaluate the results using various metrics.
4.1. Experimental Data and Environment
4.1.1. Experimental Data
The experimental data used in this paper are obtained from the rock slope data provided by Larissa Elisabeth Darvell [47]. The study site, Oernlia slope, is situated in north-central Norway at coordinates 15°43′ E, 67°19′ N. It is positioned to the east of Lake Straumsvatnet in Nordland County, above the mountain road which serves as an access road to the local power plant. The study area, which measures 230 m in length and lies between elevations of 280 m and 425 m above sea level, consists of a gently dipping rock slope and a talus [47]. The rock slope faces westward, rises to a height of approximately 125 m, and is predominantly composed of granitic and gneissic formations [47].
Figure 3 shows the location of the case slope and the 3D point cloud model.
In the Oernlia study area, three sets of joints were identified on west-facing rock slopes. The rocks consist of massive, high-quality granite/granodiorite with relatively few structural planes. Oernlia exhibits characteristics such as a gentle slope dip angle, specific orientations of joints relative to the slope, and the presence of curved surfaces. The dominant set of structural planes has an average orientation (dip/dip direction) of 35/257, parallel to the slope [47], and defines the base plane for rock detachment. Of the two remaining sets, one strikes northwestward, while the other dips southeastward at an angle of 66 degrees [47]. The least prominent of the three sets has only 68 mapped planes. Together, the two non-dominant sets form lateral and rear release planes for rock blocks. Their orientations are inclined with the slope, defining a wedge with an angle ranging from 60 to 120 degrees [47].
In our study, we selected three parameters—dip direction, dip angle, and trace length—to exemplify the generation of structural planes. Before conducting our experiments, we performed an extensive literature review, which revealed that scholars commonly employ the Baecher model to define the shape of structural planes. The Baecher model comprises three main components: fracture occurrence probability distribution, fracture size distribution, and fracture location probability distribution. Therefore, we adopted the Baecher model in our paper, focusing on six parameters, including dip direction, dip angle, trace length, and the center coordinates of the disc, to generate stochastic structural planes. Notably, the center point of the disc is typically stochastically and uniformly distributed. Consequently, we selected the remaining parameters—dip direction, dip angle, and trace length—for experimental generation.
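As an illustration of the six-parameter disc description, the sketch below builds one Baecher-type disc from a dip direction, a dip angle, a trace length, and a uniformly sampled center. It assumes a coordinate system with x pointing east, y north, and z up, treats the trace length as the disc diameter for simplicity, and uses a hypothetical `Disc` class and bounding box; none of these choices comes from the paper.

```python
import math
import random
from dataclasses import dataclass


@dataclass
class Disc:
    center: tuple   # (x, y, z) in metres
    normal: tuple   # unit normal of the disc plane
    radius: float   # metres


def unit_normal(dip_direction_deg, dip_angle_deg):
    """Upward unit normal of a plane (x = east, y = north, z = up)."""
    a = math.radians(dip_direction_deg)
    b = math.radians(dip_angle_deg)
    return (math.sin(b) * math.sin(a), math.sin(b) * math.cos(a), math.cos(b))


def make_disc(dip_direction_deg, dip_angle_deg, trace_length, box, rng):
    """Build one disc; the center is uniform inside the axis-aligned box."""
    center = tuple(rng.uniform(lo, hi) for lo, hi in box)
    normal = unit_normal(dip_direction_deg, dip_angle_deg)
    # Simplifying assumption: the trace length is used as the disc diameter.
    return Disc(center, normal, trace_length / 2.0)


rng = random.Random(42)
box = [(0.0, 230.0), (0.0, 50.0), (280.0, 425.0)]  # hypothetical model volume
disc = make_disc(257.0, 35.0, 4.0, box, rng)
print(disc)
```

Only the orientation and size inputs need a learned or fitted distribution here; the center is drawn uniformly, which is why it can be left out of the generation experiment.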
For our experiment, we selected the first set of structural planes as our experimental data. The structural planes in the research area total 767. We applied two checks for outlier elimination: one function flagged duplicate values in the dataset through logical evaluation, and another checked for missing or empty values. After one structural plane was eliminated, 766 structural planes remained. To determine their respective probability distributions, we drew histograms, fitting curves, and quantile–quantile (Q-Q) plots, as shown in Figure 4. The occurrence of the first group of structural planes is listed in Table 1.
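The two cleaning checks can be sketched with the standard library alone. The record layout (dip direction, dip angle, trace length), the sample values, and the helper name `clean_planes` are hypothetical; the functions actually used in the study are not reproduced here.

```python
import math


def clean_planes(records):
    """Drop records with missing values, then drop exact duplicates (order kept)."""
    seen = set()
    cleaned = []
    for rec in records:
        # A value counts as missing if it is None or NaN.
        if any(v is None or (isinstance(v, float) and math.isnan(v)) for v in rec):
            continue
        if rec in seen:
            continue
        seen.add(rec)
        cleaned.append(rec)
    return cleaned


raw = [
    (257.0, 35.0, 4.2),
    (260.0, 33.0, 3.1),
    (257.0, 35.0, 4.2),          # duplicate of the first record
    (251.0, float("nan"), 2.8),  # missing dip angle
    (249.0, 38.0, None),         # missing trace length
]
print(clean_planes(raw))
```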
4.1.2. Experimental Environment
The experimental tests were carried out on a Windows 10 system with 16 GB of memory, an AMD Ryzen 7 5800H processor with Radeon Graphics (3.20 GHz), and an NVIDIA GeForce RTX 3060 graphics card. The programming language is Python 3.8.8, and the deep learning framework is PyTorch 1.9.0.
In the experiments, we did not differentiate between the training set and the test set due to the limited number of structural planes. We employed various methods to enhance the efficiency of DDPM during the experiment. Given the small number of samples and features, we reduced the model size to simplify it without significantly impacting performance. We also continuously adjusted hyperparameters (e.g., batch_size, learning rate, number of epochs) during training to achieve faster convergence and improved performance. In addition, we implemented early stopping based on the calculated loss to prevent overfitting.
The noise prediction model $\epsilon_\theta$ utilized fully connected layers, comprising five hidden layers. A nonlinear activation function was applied to each layer, with 256 nodes configured. The training process starts with the forward diffusion process, where Gaussian noise is added to the sample data progressively over T time steps. This procedure iteratively produces noisy samples, following Equation (1) in Section 2.1. Subsequently, the noise prediction model is trained to forecast the noise introduced at each time step. This training is accomplished by optimizing the loss function, which measures the discrepancy between the actual noise and the predicted noise. The underlying rationale for this process is based on the formulae presented in Section 2.3. The reverse process then denoises Gaussian noise to generate data, utilizing the predicted noise as described by the formulae in Section 2.2.
Parametric trial-and-error methods were employed to iteratively fine-tune hyperparameters throughout the training process to achieve optimal performance. These hyperparameters include the learning rate, the number of diffusion steps, the batch size, and model architecture parameters such as the number of layers and the number of hidden units. Various combinations of these parameters (e.g., learning rate = 0.00001, step size = 100, batch_size = 128) were systematically tested, and the outcomes were visualized using histograms to elucidate the impact of each hyperparameter. This iterative adjustment continued until satisfactory performance was attained. Ultimately, the total number of steps T was set to 1000, with a batch_size of 16. The Adam optimizer was employed with a learning rate of 0.0001, and the training process comprised 4000 epochs.
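The final settings reported above can be collected in one place. The sketch below uses a hypothetical `DDPMConfig` dataclass; the field names are illustrative, while the values are those stated in the text (the sample count of 766 comes from Section 4.1.1). It assumes that the last, partially filled mini-batch of each epoch is kept.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class DDPMConfig:
    num_features: int = 3         # dip direction, dip angle, trace length
    hidden_layers: int = 5
    hidden_units: int = 256
    diffusion_steps: int = 1000   # T
    batch_size: int = 16
    learning_rate: float = 1e-4   # used with the Adam optimizer
    epochs: int = 4000

    def batches_per_epoch(self, num_samples: int) -> int:
        """Mini-batches needed to see every sample once (partial batch kept)."""
        return math.ceil(num_samples / self.batch_size)


config = DDPMConfig()
steps_per_epoch = config.batches_per_epoch(766)
total_updates = steps_per_epoch * config.epochs
print(steps_per_epoch, total_updates)
```

Keeping the settings in a frozen dataclass makes each trial-and-error combination explicit and easy to log alongside its results.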
4.2. Results Generated by the Conventional Monte Carlo Method
Based on the probability distribution of structural planes analyzed in Section 4.1.1, we generated the dip direction, the dip angle, and the trace length separately. A total of 766 stochastic structural planes were generated to facilitate comparison with the measured factual structural planes. As shown in Figure 5, the histograms of the measured structural planes are compared with the histograms of the stochastic structural planes generated using the conventional Monte Carlo method.
Figure 6 displays box plots comparing the parameters of the structural planes generated by the conventional Monte Carlo method with those of the measured factual structural planes. The three sets of box plots show that the overall distribution of each parameter is relatively similar to the distribution of the corresponding parameter of the measured factual structural planes. The generated stochastic structural planes can therefore be considered to fit the measured factual structural planes well.
Figure 7 presents a comparison of the cumulative distribution curves for each parameter between the generated structural planes using the conventional Monte Carlo method and the measured factual structural planes. The horizontal axis denotes the parameter value, while the vertical axis represents the probability.
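A cumulative distribution comparison of this kind can be sketched as follows. The helper names and the two small samples are hypothetical. The maximum vertical gap between the two empirical curves (the two-sample Kolmogorov-Smirnov statistic) is included only as one convenient summary and is not a metric reported in this study.

```python
from bisect import bisect_right


def ecdf(sample):
    """Return a function F(v) = fraction of the sample that is <= v."""
    ordered = sorted(sample)
    n = len(ordered)
    return lambda v: bisect_right(ordered, v) / n


def max_cdf_gap(sample_a, sample_b):
    """Largest vertical distance between the two empirical CDFs."""
    fa, fb = ecdf(sample_a), ecdf(sample_b)
    points = sorted(set(sample_a) | set(sample_b))
    return max(abs(fa(v) - fb(v)) for v in points)


measured = [31.0, 33.5, 34.0, 35.2, 36.8, 38.1]    # hypothetical dip angles
generated = [30.2, 33.0, 34.4, 35.0, 37.5, 39.0]   # hypothetical dip angles

f_measured = ecdf(measured)
print(f_measured(35.0))
print(max_cdf_gap(measured, generated))
```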
4.3. Results Generated by the Proposed DDPM-Based Method
In this section, we present the results of comparing the stochastic structural planes generated using the proposed DDPM-based method with the measured factual structural planes.
DDPM takes three parameters—dip direction, dip angle, and trace length—as inputs and generates them simultaneously as a whole. Using the proposed DDPM-based method, 766 pieces of stochastic structural planes were generated, and the generated results are also consistent with the measured factual structural planes.
Figure 8 illustrates a comparison of the histograms between the measured factual structural planes and those generated using the proposed DDPM-based method.
Figure 9 illustrates the box plots comparing the parameters of the structural planes generated by the proposed DDPM-based method with those of the measured factual structural planes. The three sets of box plots show that the overall distribution of each parameter is relatively similar, and the generated structural planes can be considered to fit the measured factual structural planes well.
Figure 10 illustrates the comparison of the cumulative distribution curves of each parameter of the generated structural planes using the proposed DDPM-based method and the measured factual structural planes. The horizontal axis represents the parameter value, and the vertical axis represents the probability.
4.4. Comparison of Results and Analysis
In addition to the statistical comparison of the measured factual structural planes and the generated stochastic structural planes, we conducted a quantitative analysis. Initially, similarity measurements were employed to evaluate the resemblance between the two sets of structural planes. Table 2 and Table 3 list the measurement results of comparing the stochastic structural planes generated by the conventional Monte Carlo method and the proposed DDPM-based method with the measured factual structural planes.
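Two similarity measures of this kind, the Wasserstein distance and the JS divergence, can be sketched for one-dimensional samples as follows. The sketch assumes equal sample sizes for the Wasserstein distance and a shared histogram binning for the JS divergence; the bin count, the sample values, and the function names are illustrative choices, not the settings used in the study.

```python
import math


def wasserstein_1d(a, b):
    """1-D Wasserstein-1 distance between equal-size samples."""
    if len(a) != len(b):
        raise ValueError("this sketch assumes equal sample sizes")
    return sum(abs(x - y) for x, y in zip(sorted(a), sorted(b))) / len(a)


def histogram(sample, lo, hi, bins):
    """Relative frequencies of the sample over equal-width bins on [lo, hi]."""
    counts = [0] * bins
    width = (hi - lo) / bins
    for v in sample:
        idx = min(int((v - lo) / width), bins - 1)  # v == hi goes in the last bin
        counts[idx] += 1
    return [c / len(sample) for c in counts]


def kl_divergence(p, q):
    """KL(p || q) in nats; assumes q_i > 0 wherever p_i > 0."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def js_divergence(p, q):
    """Jensen-Shannon divergence: symmetric and bounded by ln 2."""
    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    return 0.5 * kl_divergence(p, m) + 0.5 * kl_divergence(q, m)


measured = [31.0, 33.5, 34.0, 35.2, 36.8, 38.1]    # hypothetical dip angles
generated = [30.2, 33.0, 34.4, 35.0, 37.5, 39.0]   # hypothetical dip angles
lo, hi = min(measured + generated), max(measured + generated)
p = histogram(measured, lo, hi, bins=4)
q = histogram(generated, lo, hi, bins=4)
print(wasserstein_1d(measured, generated), js_divergence(p, q))
```

The JS divergence is used on the mixture of the two histograms, so it stays finite even when one histogram has empty bins, which is not true of the plain KL divergence.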
In addition to employing similarity measures, the difference ratio was also utilized as a metric by which to evaluate the generated results. Table 4 presents a comprehensive comparison between the mean and variance of the stochastic structural planes generated using the Monte Carlo method and the proposed DDPM-based method and the mean and variance of the measured factual structural planes. Relative errors are calculated accordingly, revealing a maximum relative error of 6.35%. Specifically, the maximum absolute error for dip direction and dip angle does not exceed 1°, while the maximum absolute error for trace length does not exceed 0.3 m. These error values fall within an acceptable range of tolerance.
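The difference-ratio check can be sketched as a relative error between summary statistics of the generated and measured samples. The function names and the short samples are hypothetical, and the sample variance (divisor n - 1) is an assumption, since the estimator is not specified in this section.

```python
from statistics import mean, variance


def relative_error(generated_value, measured_value):
    """|generated - measured| / |measured|, as a percentage."""
    return abs(generated_value - measured_value) / abs(measured_value) * 100.0


def summarize(generated, measured):
    """Absolute and relative errors of the mean and the sample variance."""
    report = {}
    for name, stat in (("mean", mean), ("variance", variance)):
        g, m = stat(generated), stat(measured)
        report[name] = {"absolute": abs(g - m), "relative_%": relative_error(g, m)}
    return report


measured = [31.0, 33.5, 34.0, 35.2, 36.8, 38.1]    # hypothetical dip angles (degrees)
generated = [30.2, 33.0, 34.4, 35.0, 37.5, 39.0]   # hypothetical dip angles (degrees)
print(summarize(generated, measured))
```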
The presence of errors can be attributed to factors such as inaccurate measurements, missing data, or other sources encountered during the measurement of structural planes [48,49]. These factors compromise data accuracy, consequently affecting the quality of generation. Moreover, the Monte Carlo method simplifies the process by using a single probability distribution formula to describe structural plane distributions, relying on underlying assumptions [48]. The estimation of parameters for these probability formulas based on measured structural planes, which inherently possess a certain degree of error, can result in the generation of stochastic structural planes accompanied by errors. Similarly, as for the proposed DDPM-based method, the generation process is influenced by sample data and noise [50], and a certain degree of error may exist even if the model consists of a highly complex network.
In addition, the Monte Carlo method is a mature method with extensive applications in the field of structural plane generation. Its principle is a simple single-function fit, which effectively captures the distribution patterns of parameters in data with clear regular distributions. Notably, this method is relatively independent of sample size, with the degree of fit remaining largely unaffected by increases in sample size. In contrast, DDPM is a deep learning model primarily composed of deep neural networks, which consist of hundreds or thousands of nonlinear transformation functions. This model facilitates the acquisition of more abstract representations and has stronger expressive capabilities than conventional methods. However, this powerful representation learning ability requires sufficient samples to avoid overfitting and ensure training accuracy.
In our case study, the limited number of structural planes in the rock slope impacts the quality of the stochastic structural planes generated by the proposed DDPM-based method. Therefore, compared to the structural planes generated by the Monte Carlo method, the structural planes generated by DDPM perform worse in some aspects. Overall, however, the structural planes generated by the proposed DDPM-based method are accurate. The results presented in Table 4 further demonstrate that the generated structural planes are consistent with the measured factual structural planes.
The essential idea behind this paper is to generate stochastic structural planes and automatically capture the correlation between dip direction and dip angle using neural networks. In addition to the use of various verification methods to prove the accuracy of the stochastic structural planes generated by the proposed DDPM-based method, we also draw linear regression curves of the dip direction and the dip angle in this section to demonstrate the advantages of the proposed DDPM-based method. Figure 11 shows the linear regression plots of dip direction against dip angle for each set of structural planes. In the linear regression plots, the symbol “r” represents the correlation coefficient, which serves as a metric to quantify the strength of the linear relationship between two variables [51,52]. The correlation coefficient, ranging from −1 to 1, indicates the direction and magnitude of the correlation. Negative values indicate a negative correlation, positive values indicate a positive correlation, and a larger absolute value of “r” represents a stronger correlation.
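The regression line and the coefficient r can be computed with ordinary least squares, as in the sketch below. The data are synthetic and the function name is hypothetical; the sketch only illustrates how the sign and magnitude of r relate to the fitted slope.

```python
import math


def linear_regression(xs, ys):
    """Least-squares slope, intercept, and Pearson r for paired data."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = my - slope * mx
    r = sxy / math.sqrt(sxx * syy)
    return slope, intercept, r


dip_direction = [240.0, 248.0, 255.0, 261.0, 270.0]   # synthetic, degrees
dip_angle = [41.0, 37.5, 36.0, 32.0, 30.5]            # synthetic, degrees
slope, intercept, r = linear_regression(dip_direction, dip_angle)
print(f"slope = {slope:.3f}, intercept = {intercept:.1f}, r = {r:.3f}")
```

The slope and r always share the same sign, so a negative r corresponds to a downward-sloping regression line in plots of this kind.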
In Figure 11, (a) shows the linear regression curve of the dip direction and the dip angle of the measured structural planes, with a correlation coefficient of −0.685, indicating a significant negative correlation. In contrast, (b) is the linear regression curve of the dip direction and the dip angle of the stochastic structural planes generated by the conventional Monte Carlo method, with a correlation coefficient of −0.046, suggesting a near absence of correlation. Moreover, (c) shows the linear regression curve of the dip direction and the dip angle of the stochastic structural planes generated by the proposed DDPM-based method, with a correlation coefficient of −0.599, indicating a strong negative correlation. This analysis leads to the conclusion that the proposed DDPM-based method successfully captures the correlation among the structural plane parameters. This phenomenon can be attributed to the inherent mechanisms of deep learning models. During the training process, deep generative models are typically trained on a specific dataset with the objective of minimizing an optimization function. By iteratively optimizing this function, the model acquires the ability to encode and decode data while preserving the interconnectedness of various features. Consequently, the statistical patterns and correlations inherent in the training data are effectively captured when generating new samples [32].
6. Conclusions
The distribution of structural planes plays a vital role in the stability of the rock mass. Stochastic structural planes, in particular, are challenging to observe due to their small size, large number, and random distribution. Therefore, accurately and promptly identifying key blocks and analyzing the stability of the rock mass requires obtaining the internal stochastic structural plane distribution. Currently, the Monte Carlo method is the most widely used approach for generating stochastic structural planes. This method establishes probability density functions of various geometric parameters of structural planes based on measured factual structural planes and samples them according to known probability density functions to obtain stochastic variables that are approximate to the factual probability distribution function. This method, which independently samples each geometric parameter, cannot capture the correlations between the parameters.
To address this limitation, in this paper, we propose a deep learning approach based on DDPM for generating stochastic structural planes. The advantage of using a neural network for generation is its ability to learn high-dimensional and complex sample distribution features without requiring a priori probability distributions. Moreover, by inputting each parameter into the neural network simultaneously, the method can automatically capture the correlations between structural plane parameters.
The study applies both the DDPM-based method and the conventional Monte Carlo method to generate stochastic structural planes on the Oernlia slope in Nordland County, Norway. Section 4 presents the results of the structural plane generation, using various validation methods, including histograms, box plots, and cumulative distribution curves, to visualize data distribution, as well as quantitative analysis methods such as mean square error, KL divergence, JS divergence, Wasserstein distance, and Euclidean distance. Additionally, error analyses of mean and variance and linear regression plots illustrating the correlation between dip direction and dip angle were performed. The verification results indicate that (1) the proposed DDPM-based method is able to generate stochastic structural planes that agree well with the measured factual structural planes, which verifies the accuracy of this method; (2) the proposed DDPM-based method can automatically capture the correlation between the dip direction and the dip angle; (3) the stochastic structural planes generated by the proposed DDPM-based method are consistent with the original measured structural planes. The proposed DDPM-based method for generating stochastic structural planes is therefore reliable.