Article

Research on 3D Geological Modeling Method Based on Deep Neural Networks for Drilling Data

1 State Key Laboratory of Geohazard Prevention and Geoenvironment Protection, Chengdu University of Technology, Chengdu 610059, China
2 College of Environment and Civil Engineering, Chengdu University of Technology, Chengdu 610059, China
3 Xi'an Surveying and Mapping Institute, Xi'an 710054, China
* Authors to whom correspondence should be addressed.
Appl. Sci. 2024, 14(1), 423; https://doi.org/10.3390/app14010423
Submission received: 7 December 2023 / Revised: 27 December 2023 / Accepted: 30 December 2023 / Published: 3 January 2024
(This article belongs to the Special Issue Future Trends in Tunnel and Underground Engineering)

Abstract

Three-dimensional (3D) models provide the most intuitive representation of geological conditions. Traditional modeling methods heavily depend on technicians' expertise and lack ease of updating. In this study, we introduce a deep learning-based method for 3D geological implicit modeling, leveraging a substantial dataset of geological drilling data. By applying resampling and normalization techniques, we standardize drilling data and significantly expand the dataset, making it suitable for training deep neural networks. Utilizing the characteristics of the sample data, we design and establish the network structure, loss function, and parameter configurations, resulting in the training of a deep neural network with high accuracy and robust generalization capability. Ultimately, we utilize the dataset generated from the network's predictions to render and construct the 3D geological model. The research in this paper demonstrates the significant promise of deep neural networks in addressing geological challenges. The deep learning-based implicit 3D modeling method surpasses traditional approaches in terms of generalization, convenience, and adaptability.

1. Introduction

The ongoing urbanization in China has brought forth challenges including population concentration, environmental pollution, and a scarcity of surface resources, impacting the quality of life for urban residents. Confronted with limited surface land resources, individuals have increasingly turned their attention below the city’s surface, investigating the feasibility of utilizing underground spaces and, at times, contemplating the establishment of subterranean cities. The development of underground spaces can satisfy the urgent demand for additional urban development areas and efficiently circumvent challenges such as compensation and relocation, which are frequently encountered in above-ground construction projects. Nevertheless, the construction of underground engineering projects presents its own unique set of challenges, encompassing geological conditions, structural load-bearing capacity, corrosion prevention, and waterproofing. Geological factors, in particular, act as the fundamental determinants of the suitability of underground spaces for construction, essentially forming the foundation for urban and underground development.
Following the inception of 3D geological modeling, this interdisciplinary technology, integrating geology, computer science, statistics, and geographic information technology, has gained extensive traction across diverse geoscience domains. Traditional 3D modeling methods predominantly depend on visually representing geological structures through the creation of mesh models and the manual drawing of profiles to construct geological surfaces across multiple cross-sections. This approach is time-consuming, difficult to update, and heavily reliant on the modeler's expertise, leading to variable model quality. An alternative to explicit 3D modeling involves downscaling 3D data without considering the spatial structure of the geological body. However, this method can introduce additional errors during the application of geological profiles and data processing [1,2]. Fernandez and colleagues [3] introduced the 3D inclined domain method for creating fully 3D surfaces, which substantially improves model quality through the automated extraction of azimuthal data from geological contacts, the construction of inclined domains, and the automatic generation of geological levels [4]. In 2005, Dhont and collaborators [5] presented a method for constructing 3D geological models utilizing the Earth's surface data, freely accessible satellite imagery, and DEM data. This approach is especially valuable for modeling straightforward sites. While it enables a rapid and precise quantitative description of geometric contact surfaces between strata and faults using drilling data, it lacks accuracy in modeling stratigraphic heterogeneity in the absence of drilling data. Consequently, this method is primarily suited for applications like cost–benefit analysis, geomechanics of geotechnical bodies, and environmental impact studies [6]. In situations where drilling data or structural information is inadequate, spatial interpolation is often employed to supplement the data and construct the model. Calcagno and colleagues [7] utilized a potential-field cokriging method, which integrates planar geological maps, DTM, structural data, boreholes, and geological interpretations. This approach was applied for three-dimensional geological modeling and the evaluation of model uncertainty. In practical applications, actual geological conditions are frequently unknown, rendering uncertainty an intrinsic attribute of geological models. Indeed, some scholars contend that assessing uncertainty is as crucial as the model itself [8,9,10].
To expedite the construction of a precise and easily updated 3D geological model while minimizing model uncertainty, this paper introduces a novel 3D modeling approach rooted in deep learning. This method draws upon a comprehensive integration of knowledge and techniques from various fields, including geology, geography, and computer science. As illustrated in Figure 1, the input drilling samples are partitioned into three distinct subnetworks based on geological age, geological origin, and geotechnical type. These divisions are achieved through the extraction of relevant features from the drilling data. The processed sample data are subsequently fed into a fully connected deep neural network with N hidden layers and a K-dimensional configuration for each layer. Here, f represents the activation function, while b and w denote the bias and weight parameters connecting individual neurons, respectively. The three subnetworks operate independently and are later amalgamated to establish a geomathematical model. This composite model allows for the prediction of drilling data in uncharted regions through mathematical modeling.
A substantial volume of drilling data forms the foundation for establishing a mathematical framework that connects geological entities. Once this mathematical structure is in place, geological entities can be represented in 3D using various formats, such as point clouds, meshes, and 3D bodies. Models generated using this approach can be swiftly adapted to updated data, eliminating the need for reconstruction when data changes or new drilling data are introduced, as is often necessary in traditional modeling methods. Within the framework of deep learning, the loss function serves as a metric for measuring the disparity between the model’s output and real-world scenarios. In this paper, we introduce the stratigraphical order rule from the field of geology to design a loss function that enforces constraints on the chronological order of geological epochs. This allows the deep neural network to discern the sequence of different strata during the training process. Simultaneously, considering the distribution characteristics of drilling data, we propose a loss function to address category imbalance. This mitigates the impact of imbalanced category representation in the training dataset, thereby enhancing the model’s performance, especially in categories with limited data. Through this implicit modeling method, machine learning, a tool with strong generalization capabilities, is integrated into traditional geological assessment, offering innovative approaches and techniques for high-precision and readily updated 3D geological modeling in urban areas.

2. Related Work

2.1. Machine Learning in Geological Entity Research

In recent years, significant advancements in computer hardware processing power and the development of relevant algorithms have led to the increased application of machine learning in tasks such as identifying urban geological features and distinguishing regional geological conditions. This process entails the extraction of essential geological information from associated textual data to construct and describe the fundamental characteristics and spatial morphology of geological entities [11]. These methods are based on deep neural network (DNN) models and other machine learning models, effectively addressing the limitations of traditional methods, including their inefficiency in handling large datasets and vulnerability to excessive subjectivity. Sobhana and colleagues [12] utilized conditional random fields and subsequence kernels to extract relationships among entities from geotextual material. Wang and collaborators [13] developed an ontology model for geohazard events and used it to extract spatio-temporal and semantic information from web texts. Chu and colleagues [14] treated geospatial relationships as a sequence labeling problem, enabling the efficient extraction of topological, absolute, and relative directional relationships from geological texts. In addition to active analytical extraction, the establishment of a corpus of geological entities and relations facilitates the swift and dependable transformation of textual data into a structured format encompassing geological entities and relations, readily available for research and querying purposes [15].
Another common application of machine learning methods in the investigation of geological entities is geologic mapping, a process aimed at generating two-dimensional representations of geological features. Geological mapping encompasses the utilization of remote sensing data, complemented by other geological and geomorphological data. Furthermore, geologic structure mapping serves as a reference for inferring surface geological structure from geophysical survey data. In comparison to traditional methods, machine learning approaches offer enhanced efficiency, accuracy, and greater objectivity [16,17,18,19,20,21,22,23,24].
Geological entities, a vital category of geographic entities, encompass not only abundant semantic information but also reflect geologists’ observations and interpretations of these entities. Their role is critical in the creation of urban geological models and in the development, planning, and construction of urban areas, both above and below ground. Traditionally, the identification of geological entities has been achieved through the extraction of key information from geological texts. However, recent research has shifted toward using machine learning methods to automate and streamline this process.

2.2. Implicit Geological Modeling

Implicit modeling signifies an innovative approach to constructing geological models, deviating notably from traditional methods reliant on profiles. Instead, it leverages mathematical techniques to derive models from data, establishing mathematical relationships among geological elements. These relationships can subsequently be visualized in three dimensions through various methodologies. Guo and colleagues [25] introduced an integrated explicit–implicit 3D geological modeling method that utilizes a range of modeling techniques tailored for different types of geological structures. Subsequently, they convert the explicit model into an implicit model, facilitating the comprehensive regional modeling of diverse geological structures. Yang and collaborators [26] introduced an implicit potential field method with the capacity to integrate various data, including geological boundary contacts and their measurement directions. This is accomplished by creating intermediate 3D geological models that correspond to a subset of the data and then consolidating them into a unified, comprehensive 3D geological model while complying with data constraints and geological principles. Presently, the primary aim of implicit modeling is the generation and simulation of geological boundaries, accomplished through automatic construction employing techniques like radial basis functions, variational Gaussian processes, and other methodologies [27,28,29]. To address concerns regarding overlap and gaps that arise in the modeling process, Li and colleagues [30] introduced a reconstruction algorithm based on the labeling of voxel points. This algorithm possesses the capability to resolve issues related to overlap and blanking in geological modeling through a labeling process. The multi-label regularized tetrahedron method introduced by Sun and colleagues [31] can be effectively employed in the surface reconstruction of multi-geoidal bodies.
As a mathematical model derived from data, the central aspect of the implicit modeling approach revolves around the effective utilization of diverse geological data. Affonseca and colleagues [32] showcased the automatic construction of boundary surfaces between geological units through the utilization of lithological contact data, along with the location and orientation of potential faults. This method provides a direct assessment of the geological framework and the ability for swift updates with new data. Furthermore, by harnessing limited, sparse, and discrete borehole data, the automatic construction of 3D geological models within a specified range can be accomplished using algorithms such as support vector machines, backpropagation (BP) neural networks, marching cubes, Hermite radial basis functions, and various others [33,34,35].
These studies demonstrate the particular aptness of machine learning methods for handling geological data and predicting geological entities. With technological advancements, these approaches are anticipated to become increasingly crucial in the evolution of urban geological information systems. This will enhance the efficiency and accuracy of urban planning and resource management.

3. Methods

3.1. Deep Neural Network

A deep neural network is a type of neural network with multiple hidden layers, achieved by extending the binary linear perceptron model through the addition of hidden layers, expanding the output layer, and incorporating various activation functions. These hidden layers can comprise a single neuron or multiple neurons, substantially enhancing the neural network’s modeling capabilities while simultaneously increasing its complexity. Likewise, the output layer can accommodate multiple neurons, enabling the model to flexibly represent diverse information, rendering it well suited for applications in classification, regression, dimensionality reduction, clustering, and various other domains. Activation functions provide a wide array of options, each with its own adaptability, advantages, and limitations [36,37,38,39,40,41]. The selection of different activation functions can further enhance the neural network’s expressive capabilities. Taking into account the representation of borehole data and the characteristics of 3D geological modeling, the commonly employed Sigmoid function in neural networks, originally designed for binary classification, may not be the most suitable choice. In contrast, Softmax is well suited for multi-category classification, and ReLU effectively mitigates the gradient vanishing issue. Both of these activation functions are more appropriate for the task of 3D geological modeling.
A deep neural network comprises hidden layers, with each neuron in layer “i” connected to every neuron in the subsequent layer, “i + 1”. Although deep neural networks may appear complex, they fundamentally integrate linear relationships and activation functions at specific locations. The linear coefficients, often referred to as weights, within a deep neural network establish connections between different layers of the network. Each neuron is associated with a corresponding weight as it connects to the subsequent layer. These weights determine the strength of the connections between neurons in the preceding layer and those in the succeeding layer. They are utilized to linearly transform the input data by performing a weighted sum with the input values. Throughout the training process, the neural network autonomously learns and adjusts these weights to minimize the loss function. In deep neural networks, biases are constant offsets assigned to each neuron within the network. They function as additional parameters for each neuron, working independently of the input data. Biases enable neurons to activate even in the absence of input, and they provide a means to adjust a neuron’s activation threshold by modifying the bias. These biases introduce a form of translational invariance within the network, thereby enhancing the network’s flexibility and expressive capabilities.
During the forward propagation of deep neural networks, individual neurons are situated within distinct layers. Neurons within each layer receive signals from neurons in the preceding layer and generate signals for subsequent transmission to the following layer. The network operates without feedback, and the propagation of signals follows a unidirectional path from the input layer to the output layer. Information within neural networks is propagated through the continuous iteration of Equations (1) and (2).
$$Z^{(l)} = W^{(l)} a^{(l-1)} + b^{(l)} \tag{1}$$
$$a^{(l)} = f_l(Z^{(l)}) \tag{2}$$
In these equations, $Z^{(l)}$ denotes the net input (net activity value) of the neurons in layer $l$, $W^{(l)}$ represents the weight matrix connecting layer $l-1$ to layer $l$, $a^{(l)}$ denotes the output (activity value) of the neurons in layer $l$, $b^{(l)}$ signifies the bias from layer $l-1$ to layer $l$, and $f_l(\cdot)$ denotes the activation function of the neurons in layer $l$.
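As a minimal illustration of this forward pass, the NumPy sketch below iterates Equations (1) and (2) through a small fully connected network; the layer sizes are hypothetical, with ReLU on the hidden layer and Softmax on the output, as favored above.

```python
import numpy as np

def relu(z):
    return np.maximum(0.0, z)

def softmax(z):
    e = np.exp(z - z.max())  # shift for numerical stability
    return e / e.sum()

def forward(x, weights, biases):
    """Iterate Equations (1) and (2): Z = W a + b, then a = f(Z).
    ReLU on hidden layers; Softmax on the output layer."""
    a = x
    for i, (W, b) in enumerate(zip(weights, biases)):
        z = W @ a + b                                        # Equation (1)
        a = relu(z) if i < len(weights) - 1 else softmax(z)  # Equation (2)
    return a

# Hypothetical 3-16-4 network: 3 inputs (x, y, z coordinates), 4 classes.
rng = np.random.default_rng(0)
weights = [rng.normal(size=(16, 3)), rng.normal(size=(4, 16))]
biases = [np.zeros(16), np.zeros(4)]
print(forward(np.array([0.1, 0.2, 0.3]), weights, biases))  # probabilities summing to 1
```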

3.2. Gradient Descent

Gradient descent, commonly referred to as steepest descent, is a first-order optimization algorithm frequently used to optimize loss functions in machine learning tasks. In machine learning, models learn from continuous data input, and this learning process involves the use of the gradient descent method for optimization. For deep neural networks, the widely adopted approach is backpropagation, which iteratively updates model parameters until convergence, thereby optimizing the model. In Equation (3), a simple linear model is assumed, and if the loss function is denoted as J(θ), the basic form of gradient descent can be expressed as shown in Equation (4).
$$h_\theta(x) = \theta_0 + \theta_1 x_1 + \theta_2 x_2 + \theta_3 x_3 + \cdots + \theta_i x_i \tag{3}$$
$$\theta_{n+1} = \theta_n - \alpha \nabla J(\theta) \tag{4}$$
where α represents the iteration step size, also known as the learning rate.
When training deep neural networks, the training data are typically large, and calculating the gradient on the entire training dataset at each iteration during gradient descent would demand a substantial amount of computational resources. Moreover, large-scale training datasets often include redundant data. Therefore, small-batch gradient descent is commonly used for training deep neural networks.
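A minimal sketch of small-batch gradient descent follows, assuming a hypothetical `grad_fn` callback that returns the batch gradient computed by backpropagation; the hyperparameter values are illustrative.

```python
import numpy as np

def minibatch_gd(theta, grad_fn, X, y, lr=0.01, batch_size=256, epochs=10):
    """Apply the update of Equation (4) once per shuffled mini-batch."""
    rng = np.random.default_rng(0)
    n = len(X)
    for _ in range(epochs):
        order = rng.permutation(n)                 # reshuffle each epoch
        for start in range(0, n, batch_size):
            idx = order[start:start + batch_size]
            g = grad_fn(theta, X[idx], y[idx])     # gradient on one batch
            theta = theta - lr * g                 # theta_{n+1} = theta_n - alpha * grad J
    return theta
```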

3.3. Loss Function

In deep neural networks, the loss function functions as a metric for measuring the disparity between the model’s output and the actual label. It establishes the network’s optimization objective and guides the network in learning suitable parameter values. Minimizing the loss function enables the network’s predictions to closely match the true labels, thereby improving the accuracy and overall performance of the model. The role of the loss function extends beyond merely assessing the model’s performance; it also plays a vital role in the backpropagation algorithm. Calculating the gradient of the loss function with respect to the network parameters enables us to determine the direction in which the parameters should be updated, thereby allowing the network to be adjusted in a more optimal direction.
Commonly used loss functions include the mean square error and cross-entropy loss. The mean square error is a standard loss function for regression problems. It calculates the squared difference between the predicted value and the true value and then averages these differences to obtain the loss value. The mean square error is expressed as depicted in Equation (5).
$$\mathrm{MSE} = \frac{1}{N} \sum_{i=1}^{N} (y_i - \hat{y}_i)^2 \tag{5}$$
where $N$ represents the number of samples, $y_i$ is the actual label of the $i$th sample, and $\hat{y}_i$ is the label predicted by the model.
Cross-entropy loss is frequently used in classification problems, particularly in multi-category classification tasks. It calculates the loss value by comparing the actual labels with the predicted probability distribution generated by the model. Cross-entropy loss can be expressed as depicted in Equation (6).
$$\mathrm{CE} = -\frac{1}{N} \sum_{i=1}^{N} \sum_{j=1}^{C} y_{ij} \log(\hat{y}_{ij}) \tag{6}$$
where $C$ represents the number of categories, $y_{ij}$ is the actual label of the $j$th category for the $i$th sample, and $\hat{y}_{ij}$ is the probability predicted by the model for that category.
Furthermore, in addition to the two common loss functions mentioned above, a variety of loss functions are designed to suit different classification or regression tasks. Moreover, for specific model tasks, customized loss functions can be developed by combining other metrics or incorporating domain-specific expertise.
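For concreteness, the snippet below evaluates Equations (5) and (6) on a toy batch; the small epsilon guarding log(0) is a standard implementation detail rather than part of the formulas.

```python
import numpy as np

def mse(y_true, y_pred):
    """Mean square error, Equation (5)."""
    return np.mean((y_true - y_pred) ** 2)

def cross_entropy(y_onehot, p_pred, eps=1e-12):
    """Cross-entropy loss, Equation (6): rows of `p_pred` are the model's
    predicted class probabilities; `y_onehot` holds one-hot labels."""
    return -np.mean(np.sum(y_onehot * np.log(p_pred + eps), axis=1))

# Two samples, three classes.
y = np.array([[1, 0, 0], [0, 0, 1]])
p = np.array([[0.7, 0.2, 0.1], [0.1, 0.2, 0.7]])
print(mse(np.array([1.0, 2.0]), np.array([0.9, 2.2])))  # 0.025
print(cross_entropy(y, p))                              # ~0.357
```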

3.4. Optimization Algorithm

Parameter learning in deep neural networks heavily depends on the gradient descent method to determine a set of parameters that can minimize structural risk. Depending on the volume of data and the number of parameters, suitable optimization algorithms can be selected. These optimization algorithms can be broadly classified into two types: one involves adjusting the learning rate for improved stability, while the other concentrates on correcting gradient estimation to accelerate training.

3.4.1. Learning Rate Adjustment

The learning rate is a crucial parameter that determines the step size for updating the neural network’s weights and biases. Selecting the appropriate learning rate is vital to ensuring efficient network convergence during training, thereby avoiding problems like slow convergence or divergence. In general, a larger learning rate speeds up convergence but can make the training process unstable. Conversely, a smaller learning rate improves training stability but may require more training iterations to achieve optimal performance. Commonly used methods for adjusting the learning rate include learning rate decay, learning rate warm-up, periodic adjustments, and various adaptive learning rate techniques. Adaptive learning rate methods can assign different learning rates to each parameter, including algorithms such as AdaGrad [42], AdaDelta [43], and RMSprop [44].
In this paper, the RMSprop algorithm is used to adapt the learning rate. RMSprop is an adaptive learning rate method proposed by Geoff Hinton and is employed to mitigate certain limitations associated with other adaptive algorithms, where the learning rate continuously decreases to the extent of premature decay in specific situations.
The RMSprop algorithm computes an exponentially decaying moving average of the squared gradient (gt) for each iteration using Equation (7) and subsequently calculates the parameter update differences based on Equation (8).
$$G_t = \beta G_{t-1} + (1-\beta)\, g_t \odot g_t = (1-\beta) \sum_{\tau=1}^{t} \beta^{t-\tau}\, g_\tau \odot g_\tau \tag{7}$$
$$\Delta\theta_t = -\frac{\alpha}{\sqrt{G_t + \epsilon}}\, g_t \tag{8}$$
In these equations, $\beta$ represents the decay rate, typically set to 0.9; $\alpha$ is the initial learning rate, usually set to 0.001; and $\epsilon$ is a small constant used to ensure numerical stability, typically in the range of $10^{-7}$ to $10^{-10}$.
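A sketch of a single RMSprop iteration, following Equations (7) and (8) with the typical settings quoted above; `G` starts as zeros with the shape of the parameters.

```python
import numpy as np

def rmsprop_step(theta, g, G, beta=0.9, alpha=0.001, eps=1e-8):
    """One RMSprop update: decaying average of squared gradients, then a
    per-parameter scaled step."""
    G = beta * G + (1 - beta) * g * g              # Equation (7)
    theta = theta - alpha / np.sqrt(G + eps) * g   # Equation (8)
    return theta, G
```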

3.4.2. Gradient Estimation

In stochastic gradient descent, the gradient estimate for each iteration exhibits a degree of randomness and may not align with the optimal gradient for the entire training set. To mitigate this stochasticity and enhance the optimization speed, the average gradient over a recent period of time can be employed as the direction for parameter updates. This technique is referred to as gradient estimation correction and is frequently implemented using methods such as the momentum method.
The concept of momentum in optimization algorithms is borrowed from classical physics. In physics, momentum refers to an object’s tendency to continue moving in its present direction, and it is calculated as the product of the object’s mass and velocity. In the realm of optimization algorithms, the momentum method replaces the actual gradient with the previously accumulated momentum. The gradient at each iteration in optimization algorithms can indeed be likened to an acceleration. It signifies the direction and magnitude of changes needed to minimize the loss function and guide the optimization process. In the t-th iteration, Equation (9) is utilized to compute the weighted moving average of the negative gradient, which then serves as the direction for updating the parameters. This approach helps reduce randomness and speeds up the optimization process.
$$\Delta\theta_t = \rho\, \Delta\theta_{t-1} - \alpha g_t = -\alpha \sum_{\tau=1}^{t} \rho^{t-\tau} g_\tau \tag{9}$$
where ρ is the momentum factor, which is typically set to 0.9, and α represents the learning rate.
In this manner, the actual update for each parameter relies on the weighted average of gradients over a recent period. If a parameter's gradient direction has been consistent in the recent past, its effective update is larger, acting as an accelerator; if the gradient directions have been inconsistent, the weighted average damps the update, serving as a decelerator that enhances stability. This permits the use of a larger learning rate and potentially quicker convergence. Note, however, that the learning rate itself remains constant when employing the momentum method.
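In code, the momentum update of Equation (9) amounts to maintaining one extra state variable per parameter; a minimal sketch:

```python
def momentum_step(theta, g, delta, rho=0.9, alpha=0.001):
    """One momentum update: `delta` accumulates the decaying weighted
    average of past negative gradients (Equation (9))."""
    delta = rho * delta - alpha * g
    return theta + delta, delta
```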

3.4.3. Adaptive Moment Estimation

In order to deal with the stochastic nature of gradient estimation in the stochastic gradient descent method, the momentum method can be utilized. While it offers stability and quicker convergence, it maintains a fixed learning rate throughout training. To overcome this limitation, Kingma et al. introduced the Adaptive Moment Estimation Algorithm (Adam) [45]. Adam combines the advantages of the momentum method and the RMSprop algorithm, enabling adaptive learning rate adjustments while utilizing momentum as the direction for parameter updates.
The Adam algorithm calculates the exponentially weighted average of the squared gradient, $g_t \odot g_t$ (as in RMSprop), and the exponentially weighted average of the gradient, $g_t$ (as in the momentum method), using Equations (10) and (11).
$$M_t = \beta_1 M_{t-1} + (1-\beta_1)\, g_t \tag{10}$$
$$G_t = \beta_2 G_{t-1} + (1-\beta_2)\, g_t \odot g_t \tag{11}$$
Here, $\beta_1$ and $\beta_2$ are the decay rates of the two moving averages, typically set to $\beta_1 = 0.9$ and $\beta_2 = 0.999$. $M_t$ and $G_t$ denote the mean and the uncentered variance of the gradient, respectively. With the initializations $M_0 = 0$ and $G_0 = 0$, the values of $M_t$ and $G_t$ are smaller than the true mean and variance early in the iteration. This bias is notable when both $\beta_1$ and $\beta_2$ are close to 1 and is therefore corrected using Equations (12) and (13). The parameter update differences for the Adam algorithm are then calculated using Equation (14).
$$\hat{M}_t = \frac{M_t}{1 - \beta_1^t} \tag{12}$$
$$\hat{G}_t = \frac{G_t}{1 - \beta_2^t} \tag{13}$$
$$\Delta\theta_t = -\frac{\alpha}{\sqrt{\hat{G}_t + \epsilon}}\, \hat{M}_t \tag{14}$$
where the initial learning rate α is usually set to 0.001 and can also be decayed during the iteration process.
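Combining the pieces, one Adam iteration following Equations (10)–(14) can be sketched as below; `t` is the 1-based iteration counter used for bias correction, and `M` and `G` start as zeros with the parameters' shape.

```python
import numpy as np

def adam_step(theta, g, M, G, t, beta1=0.9, beta2=0.999, alpha=0.001, eps=1e-8):
    """One Adam update combining momentum and RMSprop-style scaling."""
    M = beta1 * M + (1 - beta1) * g          # Equation (10): first moment
    G = beta2 * G + (1 - beta2) * g * g      # Equation (11): second moment
    M_hat = M / (1 - beta1 ** t)             # Equation (12): bias correction
    G_hat = G / (1 - beta2 ** t)             # Equation (13)
    theta = theta - alpha / np.sqrt(G_hat + eps) * M_hat  # Equation (14)
    return theta, M, G
```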

4. Data Scenario

4.1. Resampling

The cornerstone of 3D geological implicit modeling lies in acquiring a substantial amount of standardized geological drilling data. Raw drilling data comprise planar coordinates, elevation, and stratigraphic depth of exploration points. These data can be processed to derive the spatial coordinates of different strata at each point, along with attribute information like stratigraphic age, geotechnical type, and other descriptions. The determination of the stratigraphy to which a point belongs is typically based on its age and geotechnical type, facilitating the identification of the geological body to which it pertains. However, recognizing the continuous area between the upper and lower boundaries of stratigraphy as strata with the same attributes presents a challenge, necessitating the resampling of drilling data. The original dataset also requires procedures such as deduplication, error checking, patching, encoding, and others to transform it into a series of standardized, continuous spatial point-cloud data that include spatial coordinates (x, y, and z) and stratigraphic attributes.
The resampling of drilling data involves both depth-wise resampling within the drills and planar space resampling. Planar space resampling primarily aims to standardize and normalize the input features of both the drilled sample set and the sample space to be predicted. The study area is divided into a series of grids of uniform size, and this grid size plays a vital role in determining the precision of the 3D geological model. Essentially, the grid size represents the smallest cell within the 3D representation of the geological entity. Larger grid sizes significantly reduce the total cell count in the model, providing a substantial advantage in terms of computation and convergence speed for the deep neural network. However, this may result in multiple distinct geological features coexisting within the same cell, potentially reducing model accuracy. On the other hand, a smaller grid size leads to an exponential increase in computation and a corresponding slowdown in convergence for the deep neural network. Therefore, the grid size must be determined in conjunction with the characteristics of the sample data and the accuracy requirements for 3D modeling. Depth-direction resampling rounds the elevations of stratigraphic divisions within the original drilling data to standardized intervals of 0.5 m, which align with the drill sampling intervals commonly employed in engineering. Each stratum is then subdivided at these regular intervals to attain a more uniform set of sample data.
Figure 2a illustrates that the study area is located in the urban region of Xi'an City, Shaanxi Province, China. It is important to note that the 7922 drills used in this study are not uniformly distributed across the approximately 120 km² study area. The sample data used in this research are obtained from engineering construction projects, including house construction, road construction, bridge construction, and more. The minimum spacing between drills is governed by specifications, typically ranging from 10 to 30 m. After analyzing the spacing between samples, it was determined that the minimum Euclidean distance between drills is 5 m. To meet the requirements of 3D modeling accuracy and the computational capabilities of the hardware, the standard grid size for planar sampling was set at 5 m by 5 m. As indicated in Figure 2b, within each standard grid, the coordinates and attribute features of the drills are mapped to the geometric center of the grid cell, simplifying attribute assignment; the figure also shows the assignment logic for the samples. If a cell contains only one drill, the attributes of that drill are assigned directly to the cell. In cases where a cell contains multiple drill samples, the drill with the smallest Euclidean distance to the geometric center of the cell is chosen as the representative point for that cell. Figure 2c depicts the method of resampling drills along the depth direction. This process involves mapping the attributes of raw drill data onto a grid based on elevation. In cases where two or more strata exist within a grid cell, the stratum occupying the largest percentage is chosen as the representative stratum for mapping onto the grid.
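To make the planar resampling concrete, the sketch below snaps hypothetical borehole records to a 5 m grid and keeps, within each cell, the drill closest to the cell center; the pandas column names are assumptions, not the project's actual schema.

```python
import numpy as np
import pandas as pd

CELL = 5.0  # standard planar grid size in metres, per Section 4.1

def snap_to_grid(drills: pd.DataFrame) -> pd.DataFrame:
    """Assign each drill to its 5 m x 5 m cell; where a cell holds several
    drills, keep the one nearest the cell's geometric center."""
    d = drills.copy()
    d["ix"] = np.floor(d["x"] / CELL).astype(int)
    d["iy"] = np.floor(d["y"] / CELL).astype(int)
    cx = (d["ix"] + 0.5) * CELL               # cell-center coordinates
    cy = (d["iy"] + 0.5) * CELL
    d["dist"] = np.hypot(d["x"] - cx, d["y"] - cy)
    # Within each cell, the first row after sorting by distance wins.
    return d.sort_values("dist").groupby(["ix", "iy"], as_index=False).first()
```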
Resampling serves to standardize and normalize the original data in various ways. The sample data volume, comprising spatial coordinates and attributes of geotechnical bodies, expands from 33,597 to 571,970 instances, a roughly seventeenfold increase. This expansion enhances the computational accuracy achievable with deep learning.

4.2. Normalization

Normalization is a process that transforms data into a common scale or range to make them more suitable for analysis. Typically, this involves scaling data to have a mean of 0 and a standard deviation of 1, or rescaling them to a specific range, such as between 0 and 1. In this paper, we utilize the batch normalization method [46] for preprocessing drilling data, a choice tailored to accommodate the distinctive characteristics of geological drilling data.
Within the framework of a deep neural network, where we denote the net input to the lth layer as z(l) and represent the output of the neuron in that layer as a(l), the relationship between input and output is expressed in Equation (15).
$$a^{(l)} = f(z^{(l)}) = f(W a^{(l-1)} + b) \tag{15}$$
In the equation, the function f(*) represents the activation function, while W and b denote the weights and biases.
To enhance optimization efficiency, each dimension of z(l) is standardized to follow a standard normal distribution using a normalization method, as depicted in Equation (16).
$$\hat{z}^{(l)} = \frac{z^{(l)} - E[z^{(l)}]}{\sqrt{\mathrm{var}(z^{(l)}) + \varepsilon}} \tag{16}$$
In the equation, E[z(l)] and var(z(l)) represent the expectation and variance of each dimension of z(l) across the entire training dataset with the current parameters, while ε is a small constant introduced to avoid a zero variance.
In the case of a small-batch sample set consisting of K samples, Equations (17) and (18) can be used to calculate the mean and variance of the net inputs z(1, l), …, z(K, l) for the neurons in layer “l”.
$$\mu_B = \frac{1}{K} \sum_{k=1}^{K} z^{(k,l)} \tag{17}$$
$$\sigma_B^2 = \frac{1}{K} \sum_{k=1}^{K} \left(z^{(k,l)} - \mu_B\right) \odot \left(z^{(k,l)} - \mu_B\right) \tag{18}$$
The values of the net inputs are centered around 0 through standard normalization. This interval falls within the approximately linear range of common activation functions, which reduces the nonlinearity of the neural network. To ensure that normalization does not compromise the network's expressiveness, an additional scaling and translation transformation can be applied to adjust the interval, as shown in Equation (19).
$$\hat{z}^{(l)} = \frac{z^{(l)} - \mu_B}{\sqrt{\sigma_B^2 + \varepsilon}} \odot \gamma + \beta \tag{19}$$
In the equation, the parameter vectors for scaling and translation, represented as γ and β, respectively, are included. The batch normalization operation can be viewed as a specialized neural network layer added before each layer with a nonlinear activation function, as depicted in Equation (20).
$$a^{(l)} = f\left(\mathrm{BN}_{\gamma,\beta}(z^{(l)})\right) = f\left(\mathrm{BN}_{\gamma,\beta}(W a^{(l-1)})\right) \tag{20}$$
Batch normalization not only improves optimization efficiency but also acts as an implicit regularization technique. During training, the prediction for an individual sample by the neural network is impacted not only by that specific sample but also by other samples within the same batch. As batch selection is randomized, it prevents the neural network from overfitting to a particular sample, thus enhancing the network’s generalization capability.
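For reference, batch normalization of a mini-batch of net inputs (Equations (17)–(19)) can be written compactly as follows; `gamma` and `beta` are the learnable scale and shift vectors.

```python
import numpy as np

def batch_norm(Z, gamma, beta, eps=1e-5):
    """Normalize a mini-batch Z of shape (K, d): K samples, d neurons."""
    mu = Z.mean(axis=0)                      # Equation (17): batch mean
    var = Z.var(axis=0)                      # Equation (18): batch variance
    Z_hat = (Z - mu) / np.sqrt(var + eps)    # standardize each dimension
    return gamma * Z_hat + beta              # Equation (19): scale and shift
```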

4.3. Partitioning

For the proper evaluation and selection of neural network models, the sample data can be split into training, validation, and test sets. The training set acts as the primary dataset for training the network model. Here, the neural network adapts its parameters and weights by learning from the samples in the training set to capture the data’s characteristics and patterns. It is important that the training set is representative and covers a wide range of scenarios and samples from the dataset. The validation set is used for selecting and fine-tuning the network model during training. It assesses the model’s performance on unseen data and helps choose optimal model parameters, hyperparameters, or network structures to prevent overfitting to the training set. The test set is reserved for evaluating the deep neural network’s generalization ability in real-world scenarios. Typically, it is used after training and validation to avoid over-tuning the neural network during the training process. Evaluating the trained network model’s performance on the test set provides an objective assessment of its capabilities and validates its predictive power.
By using independent validation and test sets, the performance of models can be more accurately assessed, and models can be checked for overfitting. Validation sets are crucial for hyperparameter tuning and model selection, while test sets are used to evaluate model performance in real-world scenarios. Ensuring the independence and representativeness of these three datasets is vital for the reliable assessment and comparison of different models. Typically, sample data are divided following a randomization principle, where the dataset is shuffled and partitioned based on a specific ratio. The geological sample data used in this paper follow a “long-tailed distribution” pattern. Therefore, the stratified random division method is adopted. In this method, sample datasets are grouped by categories, and they are further divided into groups at a ratio of 6:2:2 (training set: validation set: test set) for each category. This approach ensures that both the training and validation sets are representative.
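One way to realize the stratified 6:2:2 split is a two-stage stratified partition, sketched below with scikit-learn; the fixed random seed is illustrative.

```python
from sklearn.model_selection import train_test_split

def stratified_622(X, y, seed=42):
    """Split into 60% train, 20% validation, 20% test while preserving the
    per-category proportions of the long-tailed label distribution."""
    X_tr, X_rest, y_tr, y_rest = train_test_split(
        X, y, test_size=0.4, stratify=y, random_state=seed)
    X_val, X_te, y_val, y_te = train_test_split(
        X_rest, y_rest, test_size=0.5, stratify=y_rest, random_state=seed)
    return (X_tr, y_tr), (X_val, y_val), (X_te, y_te)
```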

5. Experiments and Results

5.1. Experiment Parameters

5.1.1. Neural Network Structure

Geological entities are primarily shaped by their spatial coordinates, while their characteristics, such as geological age, geological genesis, and geotechnical type, collectively determine their types. Although constructing neural networks for training and prediction as a multi-label classification task can accelerate model training, using the 3D vector representations of geological age, geological genesis, and geotechnical type features as sample labels may limit the accuracy of the neural network's representation due to their linear independence. To overcome this limitation, this paper employs three separate submodules to represent geological age, geological genesis, and geotechnical type features individually. These submodules are designed to capture the unique characteristics of each attribute. Subsequently, the outputs of these submodules are integrated into a unified output to prevent information loss due to data dimensionality reduction and mitigate the issue of the model underfitting high-dimensional data. The network structure, depicted in Figure 1, involves dividing the input data into three separate samples based on distinct attribute labels. These samples are then input into three independent deep neural networks: DNN-1 predicts stratigraphic epochs, DNN-2 focuses on geological genesis prediction, and DNN-3 is responsible for geotechnical category prediction. Each of the three independent networks has its own configuration, including the number of hidden layers and hyperparameters, which were fine-tuned based on its respective training and testing results.

5.1.2. Geologically Constrained Loss

In this study, the deep neural network comprises three submodels dedicated to classifying geological age, geological genesis, and geotechnical type features. To improve model training and final performance, geological rules and knowledge are integrated. A loss function suitable for geological modeling is developed by introducing constraint terms based on geological features, in addition to the cross-entropy loss function commonly used for classification tasks. The loss function with geological constraints is formulated as shown in Equation (21).
$$\mathrm{Loss} = \alpha \cdot \mathrm{CE} + \beta \cdot \mathrm{GeoPenalty} \tag{21}$$
In this equation, CE represents the cross-entropy loss term, GeoPenalty is the penalty term for geological constraints, and α and β are weight factors that control the relative importance of each subcategory. These weights are typically determined through an iterative process to find the optimal parameter combinations. The cross-entropy loss term, as defined in Equation (22), entails consolidating multiple categories into a single category.
$$\mathrm{CE} = -\frac{1}{N} \sum_{i=1}^{N} y_i \log(\hat{y}_i) \tag{22}$$
The data used for training the deep neural network in this study exclusively contain Quaternary stratigraphy. These strata are represented with numerical codes based on the geological era, where each code carries both classification and sequential significance: Holocene: 4; Upper Pleistocene: 3; Middle Pleistocene: 2; Lower Pleistocene: 1. A higher code thus not only identifies the era but also signifies a more recent geological period, with the Holocene (code 4) being the most recent and the Lower Pleistocene (code 1) the oldest. In the context of deep neural network prediction, samples that share the same planar position (i.e., identical x and y coordinates but different z coordinates) can be regarded as part of the same borehole, and a structured relationship exists among the geological epochs they represent. Specifically, points situated at higher elevations (greater z-coordinates) are inherently associated with more recent strata, so their geological epoch codes must be greater than those of points situated at lower elevations. Given that training-batch samples are randomly chosen during network training and may not originate from the same borehole, it is challenging to compute the sequential loss for all the data. To account for the loss associated with the order of geological epochs, sample points within the same training batch that share identical planar coordinates must be identified; the geological epochs within such a series of samples can then be compared to ensure that the model captures the sequential nature of geological epochs. The geological epoch order loss is calculated as described in Equation (23).
$$\mathrm{GeoPenalty} = \sum_{i=1}^{N-1} \max\left(0,\ \mathrm{label}_{i+1} - \mathrm{label}_i\right) \tag{23}$$
In this equation, N represents the number of samples in the same training batch, labeli denotes the geologic era code of the ith sample, and the difference between neighboring samples’ codes is calculated (taking 0 if the difference is less than 0). These differences are then summed to obtain the sequential loss term for the geologic era. When each training batch of samples is fed into the neural network, the samples are sorted based on their planar coordinates. If the training batch contains samples from the same drill, the geological epoch order loss is calculated. Otherwise, the order loss of the batch is disregarded.
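A sketch of the geologically constrained loss under stated assumptions: the penalty groups batch samples by a planar-cell key, orders them by elevation, and sums positive code increases with depth, per Equation (23); the key construction and the alpha/beta weights are illustrative, not the paper's tuned values.

```python
import numpy as np

def geo_penalty(epoch_codes, xy, z):
    """Epoch-order penalty, Equation (23): within samples sharing a planar
    cell, a deeper point must not carry a more recent epoch code
    (codes: 4 = Holocene ... 1 = Lower Pleistocene)."""
    key = xy[:, 0] * 1_000_000 + xy[:, 1]   # assumed-unique planar cell key
    order = np.lexsort((-z, key))           # group by cell, shallowest first
    codes, k = epoch_codes[order], key[order]
    same_hole = k[1:] == k[:-1]             # adjacent pairs in one pseudo-borehole
    diff = codes[1:] - codes[:-1]           # deeper code minus shallower code
    return np.sum(np.maximum(0, diff) * same_hole)

def geo_constrained_loss(ce, epoch_codes, xy, z, alpha=1.0, beta=0.1):
    """Combined loss of Equation (21)."""
    return alpha * ce + beta * geo_penalty(epoch_codes, xy, z)
```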

5.1.3. Hidden Dim and Hidden Layers

The number of hidden layers and the dimension of the hidden layers are crucial parameters when constructing deep neural networks. Increasing the number of hidden layers enhances the network’s ability to represent complex features and patterns. However, the dimension of the hidden layers determines the number of neurons in each layer and affects the network’s complexity and computational requirements. It is important to strike a balance, as an overly complex structure with too many hidden layers and dimensions can significantly increase the computational demands and make the network more susceptible to overfitting the training data, which may reduce its performance on unseen data. Therefore, selecting an appropriate number of hidden layers and dimensions is crucial for optimizing network performance. To assess the performance of deep neural networks with varying numbers of hidden layers and dimensions, this paper considers both the loss and accuracy of the test set. Figure 3 illustrates the changes in loss and accuracy for each network configuration. Specifically, the DNN-1 and DNN-3 networks perform best when they have seven hidden layers and a hidden layer dimension of 256, while DNN-2 performs best with eight hidden layers and a hidden layer dimension of 256.
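Based on the subnetwork design of Section 5.1.1 and the dimensions found here, one of the three subnetworks could be sketched in PyTorch as follows; the class counts for genesis and geotechnical type are assumptions, since the paper does not enumerate them.

```python
import torch.nn as nn

class SubNet(nn.Module):
    """Fully connected subnetwork: 7 or 8 hidden layers of dimension 256."""
    def __init__(self, n_classes, hidden_layers=7, hidden_dim=256):
        super().__init__()
        layers, in_dim = [], 3                # inputs: x, y, z coordinates
        for _ in range(hidden_layers):
            layers += [nn.Linear(in_dim, hidden_dim), nn.ReLU()]
            in_dim = hidden_dim
        layers.append(nn.Linear(in_dim, n_classes))  # Softmax applied in the loss
        self.net = nn.Sequential(*layers)

    def forward(self, xyz):
        return self.net(xyz)

dnn1 = SubNet(n_classes=4)                    # geological epoch (4 codes)
dnn2 = SubNet(n_classes=6, hidden_layers=8)   # geological genesis (count assumed)
dnn3 = SubNet(n_classes=12)                   # geotechnical type (count assumed)
```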

5.1.4. Batch Size

Batch size plays a critical role in determining how many samples are used in each step of updating the network’s weights and biases. Larger batch sizes can accelerate the training process but consume more memory, while smaller batch sizes can provide more stable gradient estimates but may introduce increased noise during training. Selecting the appropriate batch size requires balancing the computational resources available with the stability of the training process. In this paper, a systematic approach is taken to identify the optimal network structure for each subnetwork and to evaluate the training loss, test accuracy, and total training time for different batch size configurations. The objective is to determine the batch size that strikes a balance between maintaining network performance and minimizing the overall training time. This optimization process is achieved through experiments and comparisons, leading to the selection of the most suitable batch size configuration for a specific task and dataset. It allows for the fine-tuning of deep neural network training and ensures efficient model optimization. Figure 4 demonstrates a comparison of the optimal network training loss, test accuracy, and total training time for different batch size cases. It can be seen that despite the differences in the training data and training tasks of each subnetwork, the overall trend shows that as the batch size increases, the training loss gradually decreases, the test accuracy continuously improves, and the total training time gradually shortens. Specifically, at batch sizes of 512 and 1024, the network expressiveness reaches its optimal state, as evidenced by lower loss, higher accuracy, and faster training speed. However, as the batch size continues to increase, the training loss begins to increase, the test accuracy decreases, and the total training time increases. Geological data are characterized by a high degree of discreteness due to the sampling process and sampling difficulty. During the training process, the training data used in each round of iteration are randomly selected, and it is difficult to ensure that each training can contain multiple feature data points. Therefore, using a large training batch can cover more multi-feature data in each iteration calculation, reduce the impact of the discrete nature of the geologic data on the training results, and improve the accuracy of the calculation. Considering the network performance and training efficiency, a batch size of 1024 was selected for this work.

5.2. Validation and Results

To enhance the performance of the deep neural network, as elucidated in Section 5.1, a common approach entails the optimization of both the network's architecture and the hyperparameter configurations for each subnetwork, followed by training the neural network using these parameter settings. A frequently employed strategy is the extension of the training epochs to facilitate the convergence of the network's loss function toward the global minimum. Nevertheless, an excessive increment in the number of training epochs not only imposes a substantial demand on computational resources but can also engender overfitting, thereby impairing the network's capacity for generalization.
To address the overfitting challenge, this paper incorporates the early-stop mechanism, which is designed to thwart overfitting by halting the training process when the network approaches its optimal performance state. By specifying a tolerance threshold for the number of validation rounds, it preserves the network’s opportunity to approach a more optimal state, even in the face of fluctuations or temporary stagnation. This approach aids in precisely identifying the appropriate juncture to terminate training, avoiding premature cessation while safeguarding against overfitting. The amalgamation of the early-stop mechanism and the parameter setting for the number of validation rounds, denoted as ‘N’, empowers deep neural networks to attain enhanced generalization performance, resulting in superior training outcomes.
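A minimal early-stop mechanism with a tolerance of N validation rounds might look like the following sketch; the patience and delta values are illustrative.

```python
class EarlyStopper:
    """Stop training once validation loss has not improved for N rounds."""
    def __init__(self, patience=10, min_delta=0.0):
        self.patience, self.min_delta = patience, min_delta
        self.best, self.bad_rounds = float("inf"), 0

    def should_stop(self, val_loss):
        if val_loss < self.best - self.min_delta:
            self.best, self.bad_rounds = val_loss, 0   # improvement: reset
        else:
            self.bad_rounds += 1                       # tolerate fluctuation
        return self.bad_rounds >= self.patience
```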
Table 1 and Table 2 present the training parameters and outcomes of the deep neural network, respectively. Training the deep neural networks with the parameters from Table 1 leads to a gradual decline in training loss across all three subnetworks, ultimately stabilizing within the predetermined number of training iterations. Concurrently, the validation loss and test loss consistently decrease. These observations indicate that the neural networks converge effectively within their respective training rounds and avoid overfitting. Subnetwork DNN-1, tasked with predicting geological epochs, achieves an accuracy of 97.25% on the test set. Subnetwork DNN-2, responsible for predicting geological genesis, delivers an accuracy of 90.18% on the test set. Subnetwork DNN-3, designed to forecast geotechnical body types, attains an accuracy of 85.17% on the test set. Cumulatively, the networks demonstrate a robust capacity for accurate prediction, effectively capturing the geological characteristics of the study area. In summary, these results underscore the strong performance of the established deep neural network during both the training and testing phases: the network consistently attains a high level of accuracy in predicting geological data, substantiating its efficacy and generalization capabilities.
To further evaluate the generalization capacity of the deep neural network, ten sets of original data were chosen for each of the following drill depths: 30 m, 50 m, 70 m, and 90 m. These datasets were not utilized during any phase of the network's training, validation, or testing. A comparison was conducted between the predicted data and the original data to investigate spatial patterns in the network's predictive error, serving as a guideline for formulating the 3D model rendering rule. The comparison between some of the predicted drills and the original data is shown in Figure 5, while the prediction accuracy of the deep neural network under varying drill-depth conditions is presented in Table 3. The statistical findings reveal that the network's performance depends on drill depth: in general, accuracy increases as drill depth decreases, indicating a correlation between the network's performance and the depth of the measurement point. This paper also investigates the distribution of the drilling data used for training with respect to the depth dimension. The data are categorized into depth ranges as follows: less than 30 m (25,269 drills), 30 to 50 m (8207 drills), 50 to 70 m (7260 drills), and greater than 70 m (4335 drills). While these drills exhibit variations in elevation, the substantial disparity in the number of drills within each depth range reflects a trend of decreasing sample data with increasing depth. Consequently, the reduction in network performance as the measurement point's depth increases can be attributed to the diminished number of sample data points available for training the deep neural network.

5.3. Three-Dimensional Modeling

Upon acquiring the dataset for the modeling area, the creation of the 3D geological model can be swiftly accomplished by opting for different depth sampling accuracies. In this investigation, we have employed PyVista, an open-source Python-based extension package, to formulate the 3D geological model with depth sampling accuracies of 5 m and 1 m. Figure 6 illustrates the 3D geological model under these varying sampling accuracy conditions. When utilizing lower sampling accuracy, the model adeptly captures the overarching geological trends over expansive areas but may compromise precision and local detail. Conversely, higher sampling accuracy is capable of delineating local geological variations, yet it may introduce localized errors. Consequently, for research and presentations focusing on extensive geological changes, opting for lower sampling accuracy is recommended, whereas higher sampling accuracy is more appropriate for investigating or depicting localized geological conditions.
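As an indication of how such a model can be rendered, the sketch below builds a regular cell grid in PyVista (>= 0.39) at the 5 m planar and 1 m depth resolution and colors cells by a stand-in epoch code; the grid extents and random codes are placeholders for the networks' actual predictions.

```python
import numpy as np
import pyvista as pv

# Hypothetical prediction volume: 5 m planar cells, 1 m depth sampling.
nx, ny, nz = 200, 150, 90
grid = pv.ImageData(dimensions=(nx + 1, ny + 1, nz + 1),
                    spacing=(5.0, 5.0, 1.0), origin=(0.0, 0.0, -90.0))
rng = np.random.default_rng(0)
grid.cell_data["epoch"] = rng.integers(1, 5, grid.n_cells)  # codes 1-4
grid.plot(scalars="epoch")  # render the 3D geological model
```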

6. Conclusions

This paper establishes the theoretical framework for deep neural networks and introduces a geological modeling approach founded on supervised deep neural networks. Our research objectives are segmented into three distinct submodels, each dedicated to one of the following attributes: geological age, geological genesis, and geotechnical type. The network architecture is individually crafted, with specific loss functions tailored to the unique characteristics of each task. Through algorithmic refinement and extensive experimentation, a set of hyperparameters that yield exceptional performance is identified. By training on standardized sample data, a highly adaptable deep neural network model is forged, facilitating the creation of a comprehensive three-dimensional geological model. The methodology demonstrates certain advantages and limitations in its ability to forecast unknown geological data:
(1) Accurate shallow data classification: The network excels at precisely classifying the geotechnical types of shallow locations.
(2) Robust prediction for continuous geological layers: In areas with extensive, uninterrupted geological bodies, the deep neural network effectively predicts the geotechnical type.
(3) Challenges in predicting thin interlayers: The network encounters difficulties in predicting the geotechnical types of thin interlayers (e.g., sand and ancient soil) due to their limited thickness, often resulting in their omission.
(4) Elevation-associated prediction errors: The deep neural network’s prediction error rate increases as the elevation decreases. This suggests that data density strongly impacts the network’s predictive performance, and enhancing the quantity of sample data is a direct approach to improving predictions.
(5) Prediction errors in “thin layers”: The network occasionally produces prediction errors for “thin layers”. These errors occur because the network predicts geotechnical types in a coded format, and interpretations are based on the proximity of predicted values to specific codes. When predictions fall between two codes, interpretations lean toward one geotechnical type, leading to bias. To mitigate these errors, increasing the sampling interval in the depth direction can be considered, albeit at the cost of some accuracy.
The results presented in this paper demonstrate that the implicit 3D modeling method, underpinned by rigorous data validation, signifies a noteworthy advancement in geological modeling. This approach shifts the modeling task away from empirical and subjective foundations toward a more mathematically rigorous and objective process. In comparison to conventional modeling methods, this approach showcases superior generalizability, convenience, and scalability. It underscores the feasibility and substantial potential of harnessing machine learning and data mining techniques to address geological challenges.

Author Contributions

Conceptualization, L.L. and T.L.; methodology, L.L.; software, L.L.; validation, L.L., T.L. and C.M.; formal analysis, L.L.; investigation, L.L.; resources, L.L.; data curation, L.L.; writing—original draft preparation, L.L.; writing—review and editing, L.L. and C.M.; funding acquisition, T.L. All authors have read and agreed to the published version of the manuscript.

Funding

This research was supported by the New Basic Surveying and Mapping Pilot of Xi'an, China (XCZX2021-M017).

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

The data, models, and codes presented in this study are available on request from the corresponding author. The data are not publicly available due to privacy restrictions.

Conflicts of Interest

The authors declare no conflict of interest.

Figure 1. Illustration of the overall idea of the method.
Figure 2. Illustration of the resampling process. (a) Location of the study area. (b) Resampling in the planar space. (c) Resampling in the depth direction.
Figure 3. Loss and accuracy curves for different numbers of hidden layers and hidden layer dimensions (batch size of 256).
Figure 4. Training loss, accuracy, and training duration for different batch sizes.
Figure 5. Comparison of raw data with predicted data.
Figure 6. Three-dimensional geological model of the study area. (a) Depth-direction sampling accuracy of 5 m. (b) Depth-direction sampling accuracy of 1 m.
Table 1. Deep neural network training parameters.

| Parameters | DNN-1 | DNN-2 | DNN-3 |
|---|---|---|---|
| Hidden layers | 7 | 8 | 7 |
| Hidden dimension | 256 | 256 | 256 |
| Number of computational parameters | 394,497 | 460,033 | 394,497 |
| Loss function | Geologically Constrained Loss | Geologically Constrained Loss | Geologically Constrained Loss |
| Activation function | ReLU | ReLU | ReLU |
| Batch size | 1024 | 1024 | 1024 |
| Initial learning rate | 0.001 | 0.001 | 0.001 |
| Decay factor of learning rate | 0.9 | 0.9 | 0.9 |
| Optimizer | Adam | Adam | Adam |
| Maximum epochs | 10,000 | 10,000 | 10,000 |
| Tolerance epochs | 100 | 100 | 100 |

Table 2. Deep neural network training results.

| Indicators | DNN-1 | DNN-2 | DNN-3 |
|---|---|---|---|
| Duration (s) | 4813 | 2840 | 3114 |
| Total epochs | 1305 | 752 | 809 |
| Best test loss | 0.0239 | 0.4256 | 0.3131 |
| Testing accuracy (%) | 97.25 | 90.18 | 85.17 |
| Best training loss | 0.0099 | 0.1858 | 0.6158 |
| Best validation loss | 0.0237 | 0.4202 | 0.5304 |
| Final learning rate | 6.46108 × 10⁻⁵ | 2.05891 × 10⁻⁴ | 1.85302 × 10⁻⁴ |
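For concreteness, a hedged PyTorch sketch of a network consistent with the Table 1 configuration is given below. The 3-D coordinate input, the 10-class output, the plain cross-entropy stand-in for the geologically constrained loss, and the plateau-triggered learning-rate decay are assumptions, not the paper's exact implementation.

```python
# A sketch consistent with Table 1: 7 hidden layers of width 256 with ReLU,
# Adam (initial lr = 0.001), and a 0.9 learning-rate decay factor. The input
# dimension, class count, loss stand-in, and decay trigger are assumptions.
import torch
import torch.nn as nn

def build_dnn(in_dim: int = 3, hidden_dim: int = 256,
              hidden_layers: int = 7, n_classes: int = 10) -> nn.Sequential:
    layers = [nn.Linear(in_dim, hidden_dim), nn.ReLU()]
    for _ in range(hidden_layers - 1):
        layers += [nn.Linear(hidden_dim, hidden_dim), nn.ReLU()]
    layers.append(nn.Linear(hidden_dim, n_classes))
    return nn.Sequential(*layers)

model = build_dnn()                # DNN-1/DNN-3 shape; DNN-2 would use 8 hidden layers
criterion = nn.CrossEntropyLoss()  # stand-in for the geologically constrained loss
optimizer = torch.optim.Adam(model.parameters(), lr=0.001)
scheduler = torch.optim.lr_scheduler.ReduceLROnPlateau(optimizer, factor=0.9, patience=10)
# A training loop would call scheduler.step(val_loss) each epoch and stop once the
# validation loss fails to improve for 100 consecutive epochs (tolerance epochs).
```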
Table 3. Prediction accuracy statistics of the deep neural network.

| Depth (m) | Accuracy-1 (%) | Accuracy-2 (%) | Accuracy-3 (%) |
|---|---|---|---|
| 30 | 98.8 | 94.2 | 89.9 |
| 50 | 99.1 | 95.3 | 90.4 |
| 70 | 99.2 | 93.3 | 88.7 |
| 90 | 98.3 | 93.9 | 84.4 |