Article

DRAG: A Novel Method for Automatic Geological Boundary Recognition in Shale Strata Using Multi-Well Log Curves

1
Research Institute of Petroleum Exploration and Development, PetroChina, Beijing 100083, China
2
Shale Gas Institute of PetroChina Southwest Oil & Gasfield Company, Chengdu 610051, China
*
Author to whom correspondence should be addressed.
Processes 2023, 11(10), 2998; https://doi.org/10.3390/pr11102998
Submission received: 31 August 2023 / Revised: 7 October 2023 / Accepted: 11 October 2023 / Published: 17 October 2023

Abstract

Ascertaining the positions of geological boundaries serves as a cornerstone in the characterization of shale reservoirs. Existing methods heavily rely on labor-intensive manual well-to-well correlation, while automated techniques often suffer from limited efficiency and consistency due to their reliance on single well log data. To overcome these limitations, an innovative approach, termed DRAG, is introduced, which uses deep belief forest (DBF), principal component analysis (PCA), and an enhanced generative adversarial network (GAN) for automatic layering recognition in logging curves. The approach employed in this study involves the use of PCA for dimensionality reduction across multiple well log datasets, coupled with a sophisticated GAN to generate representative samples. The DBF algorithm is then applied for stratification, incorporating a confidence screening mechanism to improve computational efficiency. In order to improve both accuracy and stability, a coordinate system is introduced that adjusts for stratification variations among neighboring wells around the target well. Experimental comparisons demonstrate the superior performance of the proposed algorithm in reducing stratification fluctuations and improving precision.

1. Introduction

Well log stratification plays an integral role in geological data interpretation, significantly influencing lithology identification, log facies analysis, and reservoir parameter studies [1,2,3,4,5,6,7,8,9,10]. Conventional methodologies, which include both labor-intensive manual techniques and automated approaches, have inherent limitations. The origin of automatic stratification can be traced back to mathematical statistical methods, encompassing intra-layer difference, optimal segmentation, change point analysis, and extreme variance clustering [11,12,13,14,15]. While these pioneering methods were transformative, they generally necessitated substantial computational resources and had difficulty with the acquisition of accurate prior probabilities [16,17,18,19].
The advent of machine learning techniques, exemplified by fuzzy clustering and neural networks, represents a significant evolution [20,21,22,23,24]. These methods focus on stratification using individual well-logging curves, potentially leading to imprecision, especially in instances of ambiguous curve delineation [25]. Moreover, these techniques often have difficulty correlating stratification resulting from proximate wells that share similar stratigraphic structures [26,27].
More recently, the development of deep learning has provided a promising direction for well logging curve prediction. Techniques such as fully convolutional neural networks (FCNNs) and recurrent neural networks (RNNs) have exhibited considerable promise. Zhang et al. [28] utilized the RNN-based improved long short-term memory (LSTM) network and the C-LSTM network with a cascading system to generate logging curves. Zhou et al. [29] accurately predicted the changing trend of logging curves using LSTM networks and gated recurrent unit (GRU) neural networks [28,29,30,31,32]. In recent years, both the GRU algorithm and the backpropagation neural network (BPNN) algorithm have emerged as foundational tools in the domain of neural network-based stratigraphic division [28,29,30,31,32]. The GRU, a variant of the RNN, enhances the capacity to capture dependencies across different time steps, addressing some of the vanishing gradient challenges inherent to traditional RNNs [16,24,26,27,28,29,30,31,32]. Its architecture, characterized by update and reset gates, allows for efficient memory retention over longer sequences [28,29,30,31,32]. Conversely, the BPNN, a foundational artificial neural network, relies on the propagation of errors backward through the network to refine its predictions iteratively [28,29,30,31,32]. Although the BPNN has exhibited versatility across a wide range of tasks, its susceptibility to issues such as local minima and slow convergence in certain scenarios is noteworthy [16,24,26,27,28,29,30,31,32]. Methods based on recurrent neural networks can improve the prediction of logging curves to a certain extent compared with those based on fully connected neural networks; however, their predictions still need improvement when local mutations (abrupt local changes) arise in logging curves [28,29,30,31,32].
Despite these advancements, the majority of automatic stratification techniques rely on the data derived from single wells, frequently overlooking the pivotal data from drilling location coordinates. This singular focus results in stratification that does not consider the potential impact of neighboring wells, thereby posing a significant challenge [30,31,32].
Subtle changes in mineral composition and organic matter content of black shale surrounding the stratigraphic boundary result in small variations in the logging response tied to the shale stratigraphic boundary. Such nuances make the stratigraphic division notably more challenging than in sandstone and carbonate formations [33,34,35,36]. In light of these complexities, this paper focuses on Ordovician–Silurian black shale and the corresponding logging curves on the southern edge of the Sichuan Basin, aiming at automatic stratigraphic division. Based on the above analysis, an innovative method, DRAG, which combines deep belief forest (DBF), principal component analysis (PCA), and an advanced generative adversarial network (GAN), is introduced to achieve automatic layering recognition in logging curves [37,38,39,40]. This method encompasses several techniques; principal component analysis (PCA) is utilized for dimensionality reduction, followed by the application of a generative adversarial network (GAN) for sample generation. Furthermore, DRAG integrates the deep belief forest (DBF), a cascaded deep forest algorithm [40,41,42,43,44,45,46]. An integral feature of the method is the automatic calibration strategy, which uses neighborhood information to adjust single point results [47,48,49].

2. The Proposed Method

This paper proposes a novel approach, termed DRAG, for well log analysis and shale boundary identification, as illustrated in Figure 1. Specifically, the original well log curve data undergo dimensional reduction using the PCA algorithm. Subsequently, a GAN network is employed to generate synthetic samples of the well log curve data, aiming to achieve a balanced quantity of samples across different categories. The deep belief forest algorithm is then utilized for well log curve stratification. Lastly, the calibration threshold function of the layer division result for a single well is constructed, and the layer division result is corrected by considering the stratigraphic division schemes of neighboring wells.

2.1. Data Preprocessing

Well log curve data typically consist of a high-dimensional feature space, with each feature representing a different attribute. However, in well log analysis, not all features are equally informative or relevant for the target task [32,33,34,35]. Consequently, working with such high-dimensional data, which may contain redundant or noisy information, can lead to challenges such as high computational complexity and low analysis performance [38]. To overcome these limitations, dimensionality reduction is applied to the original well log curve data, so that the most important and distinctive features are identified and retained [42,43,44].
In this study, principal component analysis (PCA), a widely utilized method, is employed for dimensionality reduction in well log curve analysis. PCA helps eliminate redundant dimensions and identifies the most important and distinctive features, thereby preserving crucial information [38,39,40]. By reducing the dimensionality of well log curve data while preserving vital attributes, PCA improves the robustness of analysis results. The trend in total organic carbon (TOC) can effectively predict the boundaries of stratigraphic units. Therefore, conducting PCA on well logging data and TOC data enhances the accuracy in predicting the limits of these stratigraphic units. This enables more efficient processing and enhances the ability to extract meaningful insights from the well log data, leading to improved decision-making in shale boundary identification and well log analysis tasks [44].
First, the principal component score of each sample is obtained after PCA dimension reduction of the original data, and it is represented by $F_{ic}$ ($i = 1, 2, \ldots, n$; $c = 1, 2, \ldots, m$), where $n$ is the number of samples and $m$ is the number of principal components [24].
Then, each principal component is normalized. The process can be formulated as follows [38]:
$$F_{ic}^{*} = \frac{F_{ic} - F_{c(\min)}}{F_{c(\max)} - F_{c(\min)}} \quad (i = 1, 2, \ldots, n;\ c = 1, 2, \ldots, m) \tag{1}$$
where $F_{ic}^{*}$ is the normalized score of the $c$th principal component of the $i$th sample, $F_{ic}$ is the score of the $c$th principal component of sample $i$, $F_{c(\max)}$ is the maximum score of the $c$th principal component, and $F_{c(\min)}$ is the minimum score of the $c$th principal component. Then, the proportion of each sample within each principal component is calculated.
$$P_{ic} = F_{ic}^{*} \Big/ \sum_{i=1}^{n} F_{ic}^{*} \quad (i = 1, 2, \ldots, n;\ c = 1, 2, \ldots, m) \tag{2}$$
where $P_{ic}$ is the proportion of the $c$th principal component in sample $i$.
Then, the information entropy of each principal component is calculated. The formula is as follows [38]:
$$E_{c} = -\frac{1}{\ln n} \sum_{i=1}^{n} P_{ic} \ln P_{ic} \quad (i = 1, 2, \ldots, n;\ c = 1, 2, \ldots, m) \tag{3}$$
where $E_{c}$ is the information entropy of the $c$th principal component. Next, the weight of each principal component is calculated. The formula is as follows [38]:
$$w_{c} = (1 - E_{c}) \Big/ \sum_{f=1}^{m} (1 - E_{f}) \quad (c, f = 1, 2, \ldots, m) \tag{4}$$
where $w_{c}$ is the weight of the $c$th principal component, and $E_{f}$ is the information entropy of the $f$th principal component. Then, the comprehensive score of the principal components based on information entropy is calculated, and the formula is as follows:
$$F = \sum_{c=1}^{m} w_{c} P_{ic} \quad (i = 1, 2, \ldots, n;\ c = 1, 2, \ldots, m) \tag{5}$$
where $F$ is the comprehensive score of principal components based on information entropy. According to the comprehensive score of principal components based on information entropy, the top 30% of principal components are selected as the data after dimensionality reduction.
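The entropy-weighting steps above can be sketched in a few lines of Python. This is a minimal illustration rather than the authors' implementation: the function name `entropy_weighted_scores`, the convention that a zero proportion contributes nothing to the entropy, and the treatment of constant components are assumptions, and at least two samples are required so that ln n is non-zero.

```python
import math

def entropy_weighted_scores(scores):
    """scores: list of n rows, each holding m principal component scores.

    Returns (weights, composite): one weight per component and one
    composite score per sample.
    """
    n, m = len(scores), len(scores[0])
    columns = list(zip(*scores))

    # Min-max normalisation of each component.
    # Assumption: a constant component is mapped to all zeros.
    normalised = []
    for col in columns:
        lo, hi = min(col), max(col)
        span = hi - lo
        normalised.append([(v - lo) / span if span else 0.0 for v in col])

    # Proportion of each sample within a component.
    proportions = []
    for col in normalised:
        total = sum(col)
        proportions.append([v / total if total else 0.0 for v in col])

    # Information entropy of each component (terms with p = 0 are skipped).
    entropy = [
        -sum(p * math.log(p) for p in col if p > 0) / math.log(n)
        for col in proportions
    ]

    # Entropy weights: components with lower entropy receive larger weights.
    denom = sum(1 - e for e in entropy)
    weights = [(1 - e) / denom for e in entropy]

    # Composite score of each sample across all components.
    composite = [
        sum(weights[c] * proportions[c][i] for c in range(m))
        for i in range(n)
    ]
    return weights, composite

# Hypothetical scores for three samples and two components.
weights, composite = entropy_weighted_scores(
    [[1.0, 10.0], [2.0, 10.5], [3.0, 30.0]]
)
```

In this toy input the second component is concentrated in a single sample, so its entropy is lower and its weight is larger than that of the first component. The weights sum to one by construction.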
This paper takes Well A as a case study and uses PCA to reduce the dimensionality of the dataset used for predicting TOC. The specific parameters related to the PCA are presented in Tables S1–S7. The weights of principal components (Wc) and the composite scores (F’) for logging data at each depth allow for the identification of principal components that explain the data with the highest granularity. The magnitude of these weights determines which principal components are pivotal in describing the data’s variability. Using Well A as a reference, Figure 2a reveals that the weights for the principal components GR, AC, RT, and DEN are comparatively high, indicating their pronounced predictive capacity for the TOC parameter. The variability in the composite scores (F’) for each sample reflects the inherent importance or influence of each sample within the entirety of the logging data (Figure 2b). This information facilitates the recognition of samples that stand out across all principal components.

2.2. Sample Balance Treatment

The generative adversarial network (GAN) is a powerful unsupervised learning mechanism that was introduced in 2014 by Goodfellow [42,43,44,45,46]. Goodfellow and his colleagues first applied GAN to image generation [48,49,50,51,52]. Since then, GAN has attracted significant attention in the deep learning community for its capability to model high-dimensional data distributions and produce realistic samples [48,49,50,51,52].
In the context of well log data analysis, GAN provides valuable contributions by addressing the challenges associated with complex and high-dimensional data [32]. Well log curve data contain intricate patterns and subtle variations that are crucial for accurate shale boundary identification [44]. Utilizing GAN allows for harnessing its robust computational power to model the underlying distribution of the well log data and generate synthetic samples that capture the essential characteristics of the actual data.
The key characteristic of a GAN is its composition of two competing networks: a generator (G) and a discriminator (D) [48,49,50,51,52]. The generator captures the latent distribution of actual sample data and creates new data samples, while the discriminator distinguishes between real input data and generator-produced samples [48,49,50,51,52]. Training alternates between competing phases of G and D, and it concludes once a balance is reached. At this point, the generator produces data that are realistic enough to elude detection by the discriminator [48,49,50,51,52]. The GAN process does not require prior knowledge of boundaries identified by experts. GANs are designed to learn and generate data in an unsupervised manner, without relying on pre-defined boundaries or labels. Instead, GANs learn from input data to generate new samples that resemble the training data.
In this paper, an improved GAN algorithm is used to generate samples. The generator network is mainly composed of five deconvolution layers; its input is a 100-dimensional log vector, and its output is a 64 × 64 matrix with 3 channels.
The discriminator network, whose structure is essentially the reverse of the generator, estimates the probability that the input feature matrix is a true sample. It is composed of five convolution layers and one reshaping layer.
The input feature matrix is a 3-channel 64 × 64 matrix. After the five convolution layers, a 1 × 1 feature map with a single channel is output. After reshaping, the result is a scalar in the range of 0–1; the closer the value is to 1, the higher the probability that the input feature matrix is a true sample. As with the generator, batch normalization is added after each convolution layer.
The convolution kernel size of CONV1, CONV2, CONV3, CONV4, and CONV5 is 4 × 4, and the stride (step size) is 2. Figure 3 shows the framework of a GAN.
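The layer sizes quoted above can be checked with the standard output-size formulas for convolutions and transposed convolutions. The sketch below is only shape arithmetic under stated assumptions: the padding values are not given in the text and are chosen here so that five 4 × 4, stride-2 layers map 64 × 64 to 1 × 1, and the generator is assumed to mirror the discriminator, starting from the 100-dimensional vector treated as a 1 × 1 map.

```python
def conv_out(size, kernel=4, stride=2, padding=1):
    """Spatial size after a convolution (no dilation)."""
    return (size + 2 * padding - kernel) // stride + 1

def deconv_out(size, kernel=4, stride=2, padding=1):
    """Spatial size after a transposed convolution (no output padding)."""
    return (size - 1) * stride - 2 * padding + kernel

# Discriminator: 64 x 64 input, five 4 x 4 convolutions with stride 2.
# Assumed paddings: 1 for the first four layers, 0 for the last.
disc_sizes = [64]
for padding in [1, 1, 1, 1, 0]:
    disc_sizes.append(conv_out(disc_sizes[-1], padding=padding))

# Generator: assumed mirror image, from a 1 x 1 latent map up to 64 x 64.
gen_sizes = [1]
for padding in [0, 1, 1, 1, 1]:
    gen_sizes.append(deconv_out(gen_sizes[-1], padding=padding))

print(disc_sizes)  # [64, 32, 16, 8, 4, 1]
print(gen_sizes)   # [1, 4, 8, 16, 32, 64]
```

With these assumed paddings, each of the first four discriminator layers halves the spatial size and the fifth collapses the 4 × 4 map to the single value that is reshaped into the output probability.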
For logging curves, amplitude and frequency can reflect the main differences between different curves. To capture these differences, we used the difference function $C(f_t, f_u)$, defined as:
$$C(f_t, f_u) = \frac{f_t \times f_u}{f_t + f_u} \tag{6}$$
where $f_t$ is the amplitude and $f_u$ is the frequency. The function combines the amplitude ($f_t$) and frequency ($f_u$) as the product of the two values divided by their sum, which equals half their harmonic mean. Both $f_t$ and $f_u$ are computed for individual samples and span the entire data distribution of the dataset; they are not confined to specific intervals. The function $C$ in Equation (6) is used to capture the dissimilarity between the amplitude and frequency of the generated and real samples. Dividing by the sum of the two values normalizes the result so that it stays within a meaningful range. This formulation emphasizes both amplitude and frequency information in determining the dissimilarity between curves.
In the improved GAN algorithm, the similarity between the generated log and the real log is defined as the objective function of the network, and the formula is:
$$R_{loss} = | R_{real} - R_{generate} | \tag{7}$$
where $R_{real}$ represents the actual log curve and $R_{generate}$ represents the generated log curve.
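A small numerical example may make the two formulas above concrete. The text does not specify how a curve is reduced to the quantities compared in the loss, so the sketch below assumes, purely for illustration, that each curve is summarized by one amplitude and one frequency value and that the loss compares the resulting difference-function values. The function names and the numbers are hypothetical.

```python
def difference(f_t, f_u):
    """Difference function: product of amplitude and frequency over their sum."""
    return (f_t * f_u) / (f_t + f_u)

def r_loss(r_real, r_generate):
    """Absolute difference between the real and generated quantities."""
    return abs(r_real - r_generate)

# Hypothetical summaries: (amplitude, frequency) of a real and a generated curve.
real_value = difference(2.0, 6.0)        # 12 / 8 = 1.5
generated_value = difference(3.0, 6.0)   # 18 / 9 = 2.0

loss = r_loss(real_value, generated_value)
print(loss)  # 0.5
```

The value 1.5 for the real curve is half the harmonic mean of 2 and 6 (which is 3), matching the description of the difference function.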

2.3. Layering Recognition

In order to effectively extract geological information and identify different geological layers or reservoirs from well log curve data, a robust algorithm is required [53,54,55,56]. In this study, the deep belief forest (DBF) algorithm, which combines the concepts from deep belief networks (DBNs) and random forests, is employed for well log curve stratification, as illustrated in Figure 4. The DBF algorithm offers several advantages over traditional random forest algorithms [53,54,55,56]. It utilizes a cascade forest structure, consisting of multiple layers of decision trees, to extract higher-level geological features and enhance the model’s expressive capability [53,54,55,56]. Additionally, the DBF algorithm incorporates DBNs to extract shale-related features, further improving its performance [53,54,55,56]. Moreover, the hierarchical structure of random forests provides interpretability to the classified well log curve data, enabling users to better understand the obtained stratification results [53,54,55,56].
However, all the data in the deep forest must pass through each level of the cascade forest, so the time cost increases linearly with the number of cascade forest layers [57,58,59,60,61]. Moreover, each original sample generates hundreds of new samples after multi-grained scanning, greatly increasing the size of the training set and the computing cost [62,63,64]. To tackle the time and memory overhead caused by all samples passing through each layer of the cascaded forest, this paper proposes a confidence screening mechanism in the cascaded forest structure. Each layer of the cascaded forest automatically determines its own confidence threshold, which improves the computational efficiency of the deep forest model while maintaining performance [62,63,64].
Figure 5 shows a deep confidence level forest [62,63,64]. The confidence screening mechanism aims to divide the instances of each level of the cascade into two subsets, those that are easy to predict and those that are more difficult to predict [62,63,64]. Specifically, if an instance is classified as belonging to the easily predictable subset, it will be directly outputted and used as the final result. Conversely, if an instance is determined to be difficult to predict, it needs to be passed to the next level of the cascaded forest [62,63,64].
At layer $t$ of the cascade forest, the prediction confidence threshold $n_t$ is determined according to the cross-validation error rate $\epsilon_t$ of layer $t$. The hyperparameter $\alpha < 1$ sets the target: the high-confidence training samples need to achieve a cross-validation error rate of $\alpha \epsilon_t$. The training samples at the same level are sorted in descending order of prediction confidence, where $c_i$ represents the prediction confidence of sample $x_i$ among the $m$ samples. The confidence threshold is set as follows:
$$n_t = \min \{ c_k \mid L(x_1, \ldots, x_k) < \alpha \epsilon_t,\ k \in [1, m] \} \tag{8}$$
In this equation, $L(x_1, \ldots, x_k) = \frac{1}{k} \sum_{i=1}^{k} \mathbf{1}[g_t(x_i) \neq y_i]$, where $\mathbf{1}[\cdot]$ is an indicator function. Specifically, it takes the value of 1 if the layer-$t$ prediction $g_t(x_i)$ differs from the label $y_i$ and 0 otherwise. $L$ therefore computes the cross-validation error rate for the $k$ samples with the highest prediction confidence. Considering only the samples with the highest confidence ensures that the selected training samples have a low cross-validation error rate, indicating more reliable predictions. Here, $m$ is the sample size, $\alpha \epsilon_t$ is the target cross-validation error rate, and $c_k$ denotes the prediction confidence of the $k$th sample. When the output class vectors of the last layer are obtained, the class vectors of all forests in that layer are averaged, and the category corresponding to the maximum value is taken as the prediction category of the model, namely the classification category $Y_t$ of the stratum in this paper.
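The threshold rule can be illustrated with a short Python sketch. It is one reading of the rule under stated assumptions, not the authors' code: the running error rate is taken to be the fraction of misclassified samples among the k most confident ones, and the function name, argument names, and the toy data are hypothetical.

```python
def confidence_threshold(confidences, correct, layer_error, alpha):
    """Smallest confidence c_k whose top-k error rate is below alpha * layer_error.

    confidences: prediction confidence of each training sample at this layer
    correct:     True where the layer's prediction matches the label
    layer_error: cross-validation error rate of the layer
    alpha:       hyperparameter, assumed to be smaller than 1
    Returns None if no prefix of the sorted samples meets the target.
    """
    # Sort sample indices by confidence, most confident first.
    order = sorted(range(len(confidences)),
                   key=lambda i: confidences[i], reverse=True)
    target = alpha * layer_error
    threshold = None
    errors = 0
    for k, idx in enumerate(order, start=1):
        if not correct[idx]:
            errors += 1
        # Confidences are visited in descending order, so the last
        # qualifying c_k is the minimum of the qualifying set.
        if errors / k < target:
            threshold = confidences[idx]
    return threshold

# Toy layer with six samples, two of which are misclassified.
conf = [0.99, 0.95, 0.90, 0.80, 0.60, 0.55]
ok = [True, True, True, False, True, False]
threshold = confidence_threshold(conf, ok, layer_error=2 / 6, alpha=0.5)
print(threshold)  # 0.9
```

In this toy case, samples with confidence of at least 0.9 would be output directly at this layer, and the remaining three samples would be passed to the next layer of the cascade.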

3. Result Calibration

Automatically calibrating single-point results with neighborhood information can significantly reduce the fluctuation of formation division between logs from different wells. As a result, individual detection results within the same region show more similar formation divisions, which is more consistent with the actual formation division. In addition, since some formation boundaries in single-well logs are fuzzy, using the formation division results of neighboring wells can make the fuzzy boundaries of a single well clearer and its performance more stable [50,51].
In this paper, the calibration threshold function of the single point result is constructed and the point information is corrected by considering the neighborhood stratification result. The correction function is constructed as follows:
$$Y_{fina} = rand\left( \frac{5 Y_t + Y_1 + Y_2 + Y_3 + Y_4 + Y_5 + Y_6 + Y_7 + Y_8}{13} \right) \tag{9}$$
where $Y_{fina}$ is the final classification result and $rand(\,)$ is the integer-valued (rounding) function. $Y_t$ represents the classification result obtained directly from the stratigraphic division predicted by the DRAG method, before considering any neighborhood stratification influences: the dimensions of the well log curve data are reduced using the PCA algorithm, the GAN then generates synthetic samples for a balanced distribution, and the deep belief forest algorithm stratifies the well log curves (for details, see Section 2). Essentially, $Y_t$ is the primary output classification of a specific depth point in the well based on its intrinsic logging properties. This classification acts as the foundation, which is then refined by the correction function using neighborhood stratification results. In terms of its computation, $Y_t$ is determined through the deep belief forest-based analysis, where well log characteristics at each depth point in layer 4 are input into the model, and the model then provides a classification based on the learned patterns and relationships in the data. $Y_1, Y_2, \ldots, Y_8$ are the stratigraphic classification results of the eight surrounding wells. Random numbers, by contrast, are used within the GAN: the generator model (G) initiates its training with a random input, often referred to as a noise vector. This noise vector, after being processed by the generator, is transformed into an output resembling the distribution of the real data. The inherent randomness ensures that the generator produces varied results each time, enhancing the diversity of generated samples. Had the same input been employed consistently, the generator might repeatedly produce identical or highly similar samples. This randomness is pivotal for the generator’s capability to explore and learn diverse data distributions.
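The correction function can be sketched as a weighted vote in which the target well counts five times and each of the eight neighboring wells counts once, giving the divisor 13. The sketch below assumes that the stratigraphic classes are encoded as integers and that the outer function rounds to the nearest integer; the function name `calibrate` and the class codes are hypothetical.

```python
import math

def calibrate(y_target, y_neighbours):
    """Weighted average of the target label (weight 5) and eight
    neighbouring labels (weight 1 each), rounded to the nearest integer."""
    if len(y_neighbours) != 8:
        raise ValueError("the correction function expects eight neighbouring wells")
    weighted = (5 * y_target + sum(y_neighbours)) / 13
    # Round half up; Python's built-in round() rounds halves to even.
    return math.floor(weighted + 0.5)

# Target well predicts class 3, but seven of eight neighbours indicate class 2.
corrected = calibrate(3, [2, 2, 2, 2, 2, 2, 2, 3])   # 32 / 13 is about 2.46
# Target well predicts class 3 and most neighbours agree.
confirmed = calibrate(3, [3, 3, 3, 2, 3, 3, 3, 3])   # 38 / 13 is about 2.92
print(corrected, confirmed)  # 2 3
```

Because the target well carries 5 of the 13 votes, its own prediction is overturned only when the neighboring wells disagree strongly enough to shift the weighted average across a rounding boundary.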

4. Result and Discussion

4.1. Targeted Stratigraphic Formations and Sub-Divisions

The objective of this study was to conduct well log analysis and identify shale boundaries in the southern Sichuan area [40,46]. This region poses particular complexities in terms of stratigraphic division due to the subtle variations across different areas and the subtle changes in well log responses in shale formations. The primary focus of this study is on the Longmaxi Formation, specifically the sub-section of Long1, which shows substantial exploration potential [43].
The Longmaxi Formation can be broadly divided into two sections, Long1 and Long2. The Long1 section is further divided into two sub-sections, Long1-1 and Long1-2. The Long1-1 sub-section, which is the target layer, is sub-divided further into four layers: L11, L12, L13, and L14 [1,14,25]. Moreover, the Wufeng Formation is also a crucial target for shale gas exploration. It is necessary to automatically identify L11, L12, L13, L14, and W1 (Wufeng Formation) by using deep belief forest analysis [30,31,32].

4.2. Experimental Data

For this study, well log data from 168 shale gas wells drilled in the southern Sichuan Basin were utilized. These data were specifically extracted from the Wufeng–Longmaxi Formation (Long1-1 sub-section) [33]. Through PCA, the four logging curves with the highest principal component weights were selected for stratigraphic unit division (see Section 2.1 for details). The well log curves employed for training are gamma ray (GR), acoustic transit time (AC), bulk density (DEN), and deep resistivity (Rt). Each of these sets of well log data comes with the corresponding layer labels, which allow for precise identification of the stratigraphic layers [35]. In implementing the deep belief forest-based analysis, the goal is to overcome the challenges of shale formation division in the southern Sichuan Basin [42,43,44]. Furthermore, the ability to accurately identify the boundaries within the sub-sections of Long1-1 could facilitate more effective and efficient exploration of shale gas resources in this region [35].

4.3. Experimental Results and Analysis

Figure 6a displays the stratigraphic curve annotated manually by experts. Figure 6b presents the stratigraphic curve results yielded by the method proposed in this study. Figure 6c depicts the results of well log curve stratification carried out using the gated recurrent unit algorithm, while Figure 6d shows the outcomes of well log curve stratification performed via the backpropagation neural network (BPNN) algorithm. The comparative analysis of these results emphasizes that the method introduced in this study possesses superior classification accuracy and performance.
Compared with the GRU and BPNN algorithms, the automatically identified results based on the deep belief forest are the most similar to the results of manual stratigraphic division, with an error of ±1 m. Measured against the manual results, the mean absolute error (MAE), root mean square error (RMSE), and coefficient of determination (R²) of the deep belief forest are the smallest. The results of automatic stratigraphic division based on the GRU algorithm generally differ from those of manual stratigraphic division by less than 5 m. The difference between the results of automatic stratigraphic division based on the BPNN algorithm and the results of manual stratigraphic division is the greatest (Table 1 and Table 2). Compared with the results of manual stratigraphic division, the errors of these two baseline algorithms are generally concentrated within 10 m, which is relatively large.
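For reference, the three agreement measures named above have standard definitions, which can be written out as follows. The boundary depths in the example are hypothetical and are not taken from Table 1 or Table 2; they only show how the measures behave when automatic picks lie within about 1 m of manual picks. For MAE and RMSE a lower value indicates better agreement, whereas for R² a value closer to 1 does.

```python
import math

def mae(y_true, y_pred):
    """Mean absolute error."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)

def rmse(y_true, y_pred):
    """Root mean square error."""
    return math.sqrt(
        sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)
    )

def r2(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1 - ss_res / ss_tot

# Hypothetical boundary depths in metres: manual picks vs. automatic picks.
manual = [2500.0, 2510.0, 2520.0, 2530.0]
automatic = [2501.0, 2509.0, 2520.0, 2531.0]

print(mae(manual, automatic))   # 0.75
print(rmse(manual, automatic))  # about 0.866
print(r2(manual, automatic))    # about 0.994
```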
Both GRU and BPNN have been widely regarded as effective tools for sequence modeling and stratigraphic division tasks, respectively. The GRU’s architecture, with its capacity to maintain memory over extended sequences, has demonstrated its efficacy in capturing intricate dependencies within data [28,29,30,31,32]. On the other hand, the BPNN, being a seminal artificial neural network approach, has demonstrated adaptability and precision across myriad applications, even though it can encounter challenges such as local minima or slow convergence in certain contexts [16,24,26,27,28,29,30,31,32]. Given their prominence and applicability in related domains, juxtaposing the performance of GRU and BPNN against DRAG provides a comprehensive and rigorous assessment [28,29,30,31,32]. This comparative analysis aims to discern the inherent strengths and potential limitations of our proposed method, while anchoring it to established benchmarks in the field [28,29,30,31,32]. Compared with the GRU and BPNN algorithms, the stratification results based on the deep belief forest are the most similar to the results of manual stratigraphic division, with deviations from manual interpretations within ±1 m. Relative to the manual stratigraphic division, the deep belief forest method yields the lowest values for MAE, RMSE, and R². On the other hand, the results of automatic stratigraphic division based on the GRU algorithm generally differ by less than 5 m from manual interpretations. The automatic stratigraphic identification using the BPNN algorithm exhibits the largest discrepancy compared to manual interpretations, with the deviations typically concentrated within 10 m, indicating significant errors in its automatic stratigraphic division results. The proposed method introduces a confidence-based filtering mechanism within the cascade forest structure, partitioning instances into subsets of easily predictable and difficult-to-predict instances. As a result, it effectively reduces time and memory overhead, enhances classification accuracy, and exhibits advantages in terms of resource efficiency, hierarchical processing, and high-dimensional data handling (Table 1 and Table 2).
The proposed method, DRAG, demonstrates high accuracy, attributed not only to appropriate dimensionality reduction and sample generation operations, but also to the incorporation of a confidence-based filtering mechanism in the deep belief forest algorithm. Confidence filtering involves dividing instances at each cascade layer into two subsets, one comprising easily predictable instances and the other comprising difficult-to-predict instances. If an instance is deemed easy to predict, it is directly output as the final result; only when an instance is difficult to predict does it get passed to the next layer. This hierarchical approach greatly enhances the accuracy of classification. Additionally, utilizing neighborhood information for single-point result calibration further improves the classification performance.
As illustrated in Figure 6, the well log curve predictions for Well A have been adjusted using the stratigraphic layering outcomes from Wells B, C, and D (Table 3). Figure 7a exhibits the well log curve derived from automatic division of stratigraphic units based on the single well, whereas Figure 7b presents the well log curve for Well A corrected by employing stratigraphic data from Well B. Figure 7c depicts the well log curve for Well A modified with stratigraphic information from both Wells B and C. Figure 7d represents the well log curve for Well A refined using stratigraphic insights derived from Wells B, C, and D. When the manual stratigraphic division from Well B is used to correct the automatic stratigraphic division of Well A, the deviation between the automatic and manual interpretations is controlled within 5 m. When both Well B and Well C are used to correct the stratigraphic division of Well A, the automatic interpretations of Well A align more closely with the manual interpretations, and the boundary errors between the two stratigraphic divisions are reduced to within 4 m. Finally, when the manual stratigraphic divisions from Well B, Well C, and Well D, which are shale gas wells located within 100 km of Well A, are used simultaneously to correct the automatic stratigraphic division of Well A, the automatic interpretations become more accurate still: the deviation between the automatic and manual stratigraphic division results is reduced to within 1 m. Multi-well analysis and shale characterization thus provide auxiliary support for the classification results of the target well log.
Evaluation of these well log curves reveals a progressive approximation of the adjusted outcomes to the authentically labeled well log curves, achieved by implementing stratigraphic corrections on Well A using data from its neighboring wells. This progressive alignment underscores the efficacy of the correction methodology adopted in this study.
The three lower subfigures (Figure 7f–h) show the manual stratification results for Wells B, C, and D, respectively. These results support well log curve correction as a crucial step that significantly improves the accuracy of stratification.
The proposed method offers clear advantages in handling high-dimensional well log data and in stratigraphic analysis. Central to the approach is PCA, which removes redundant dimensions while preserving the critical attributes of the data. By identifying strong correlations among many variables, it retains the essential characteristics of the data even when the overall data size is substantially reduced.
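As a rough illustration of this step, the sketch below reduces a set of log curves with PCA and then combines the component scores using entropy-based weights. It is a sketch under stated assumptions: the formulas follow the standard entropy weight method and are only mapped onto the symbols listed in the Nomenclature (F*ic, Pic, Ec, wc, F), the function names are hypothetical, and the input is synthetic random data rather than real well logs.

```python
import numpy as np

def pca_scores(X, n_components):
    """Principal component scores via SVD of the standardized data."""
    Z = (X - X.mean(axis=0)) / X.std(axis=0)
    _, _, Vt = np.linalg.svd(Z, full_matrices=False)
    return Z @ Vt[:n_components].T

def entropy_weighted_score(F, eps=1e-12):
    """Combine scores F (n samples x m components) with entropy weights.

    Assumed formulas (standard entropy weight method):
      F*_ic = (F_ic - F_c(min)) / (F_c(max) - F_c(min))
      P_ic  = F*_ic / sum_i F*_ic
      E_c   = -(1 / ln n) * sum_i P_ic ln P_ic
      w_c   = (1 - E_c) / sum_c (1 - E_c)
      F_i   = sum_c w_c F*_ic
    """
    n = F.shape[0]
    F_norm = (F - F.min(axis=0)) / (F.max(axis=0) - F.min(axis=0))
    P = F_norm / F_norm.sum(axis=0)
    E = -(P * np.log(P + eps)).sum(axis=0) / np.log(n)
    w = (1.0 - E) / (1.0 - E).sum()
    return F_norm @ w, w

rng = np.random.default_rng(0)
logs = rng.normal(size=(200, 10))                  # synthetic stand-in for 10 log curves
logs[:, 1] = 0.9 * logs[:, 0] + 0.1 * logs[:, 1]   # make two curves strongly correlated
scores = pca_scores(logs, n_components=3)
composite, weights = entropy_weighted_score(scores)
print(scores.shape, weights.round(3))
```

Each depth sample ends up with a single composite score, and components whose scores are more unevenly distributed (lower entropy) receive larger weights.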
The experimental findings support the precision and efficacy of the proposed method. Its robustness and predictive accuracy are reflected in the way successive corrections bring the results progressively closer to the manually labelled well log curves. Compared with established algorithms such as GRU and BPNN, the proposed method achieves better classification performance (Table 2: MAE of 6.221 versus 8.876 and 10.241, and R2 of 0.932 versus 0.911 and 0.834, respectively).
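For reference, the three evaluation indices used in the comparison (MAE, RMSE, and R2) can be computed as in the sketch below. The definitions are the standard ones; the section does not state which quantities the authors evaluated them on, so the toy arrays here are purely illustrative and the function names are hypothetical.

```python
import numpy as np

def mae(y_true, y_pred):
    """Mean absolute error."""
    return float(np.mean(np.abs(y_true - y_pred)))

def rmse(y_true, y_pred):
    """Root mean square error."""
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))

def r2(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - np.mean(y_true)) ** 2)
    return float(1.0 - ss_res / ss_tot)

# Illustrative values only, not data from the study.
y_true = np.array([1.0, 2.0, 3.0, 4.0])
y_pred = np.array([1.5, 2.0, 2.5, 4.0])
print(mae(y_true, y_pred), rmse(y_true, y_pred), r2(y_true, y_pred))
```

Lower MAE and RMSE and a higher R2 indicate closer agreement with the reference, which is the direction of the differences reported for the proposed method.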

5. Conclusions

The research introduces DRAG, a method for well log analysis and automated stratigraphic layer identification within the Wufeng–Longmaxi shale of the Southern Sichuan Basin. The PCA algorithm first reduces the dimensions of the original well log curve data. A GAN then generates synthetic samples to ensure a balanced distribution across categories. The deep belief forest algorithm performs the well log curve stratification, which is further refined through a calibration threshold function and by incorporating stratigraphic schemes from neighboring wells. Notably, the deviation between automated and manual stratigraphic divisions is reduced to within 1 m, indicating higher precision than methods such as GRU, BPNN, and random forest.
Three pivotal facets underscore DRAG’s superiority in stratification precision: (1) the confidence-based filtering mechanism within the deep belief forest algorithm; (2) the integration of PCA, a critical tool for dimensionality reduction; and (3) the importance of well-to-well correlation rectification.
In the deep belief forest algorithm, the confidence-based filtering mechanism classifies instances at each cascade layer into two categories, easily predictable and challenging to discern. While easily predictable instances are immediately finalized, the more complex ones proceed to subsequent layers, enhancing classification accuracy. This accuracy is further refined by integrating neighborhood data to calibrate individual point results.
The DRAG approach is especially adept at managing high-dimensional well log data, chiefly owing to its integration of PCA, which efficiently trims redundant dimensions while preserving essential data attributes. By identifying potent correlations among numerous variables, the method ensures the retention of critical data features, even with significant data downsizing. Emphasizing well log curve rectification, the technique benefits from considering spatial information from surrounding wells, culminating in enhanced stratification accuracy.

Supplementary Materials

The following supporting information can be downloaded at: https://www.mdpi.com/article/10.3390/pr11102998/s1, Table S1. Stratigraphic division data and original well-logging data of a well; Table S2. Process parameter Fic calculated from 10 types of well-logging data of Well A during PCA computation; Table S3. Process parameter F*ic calculated from 10 types of well-logging data of Well A during PCA computation; Table S4. Process parameter Pic calculated from 10 types of well-logging data of Well A during PCA computation; Table S5. Process parameter Ec calculated from 10 types of well-logging data of Well A during PCA computation; Table S6. Process parameter Wc calculated from 10 types of well-logging data of Well A during PCA computation; Table S7. Process parameter F’ calculated from 10 types of well-logging data of Well A during PCA computation.

Author Contributions

Conceptualization, T.Z., Q.Z. (Qingzhong Zhu) and S.Z.; Methodology, T.Z., Q.Z. (Qingzhong Zhu) and S.Z.; Software, Q.Z. (Qun Zhao), S.Z. and C.Z.; Validation, Z.S.; Formal analysis, Q.Z. (Qingzhong Zhu), H.Z., Q.Z. (Qun Zhao), C.Z. and S.W.; Investigation, Q.Z. (Qingzhong Zhu); Resources, Q.Z. (Qun Zhao) and C.Z.; Data curation, H.Z.; Writing—original draft, T.Z.; Writing—review & editing, T.Z., Q.Z. (Qun Zhao), Z.S. and S.W.; Visualization, T.Z., Q.Z. (Qingzhong Zhu), H.Z., Z.S., C.Z. and S.W.; Supervision, H.Z.; Project administration, Z.S. and S.Z. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Data Availability Statement

Data is unavailable due to privacy or ethical restrictions.

Conflicts of Interest

The authors declare no conflict of interest.

Nomenclature

DRAG: A novel deep belief forest-based automatic layering recognition method for logging curves
DBF: Deep belief forest
PCA: Principal component analysis
GAN: Generative adversarial network
FCNN: Fully convolutional neural network
RNN: Recurrent neural network
LSTM: Long short-term memory
C-LSTM: Convolutional long short-term memory
GRU: Gated recurrent unit neural networks
BPNN: Backpropagation neural network
$F_{ic}^{*}$: The normalized score of the $c$th principal component of sample $i$
$F_{ic}$: The score of the $c$th principal component of sample $i$
$F_{c(\max)}$: The maximum score of the $c$th principal component
$F_{c(\min)}$: The minimum score of the $c$th principal component
$P_{ic}$: The proportion of the $c$th principal component in sample $i$
$E_{c}$: The information entropy of the $c$th principal component
$w_{c}$: The weight of the $c$th principal component
$E_{f}$: The weight of the $f$th principal component
$F$: The comprehensive score of principal components based on information entropy
$f_{t}$: The amplitude
$f_{u}$: The frequency
$R_{real}$: The actual log curve
$R_{generate}$: The generated log curve
$n_{t}$: Predicted confidence threshold of layer $t$
$\epsilon_{t}$: The cross-validation error rate of layer $t$
$\alpha$: Hyperparameter for indicating the cross-validation error rate
$c_{i}$: The prediction confidence of sample $x_{i}$
$Y_{t}$: The classification category of the stratum
$Y_{fina}$: The final classification result
$Y_{n}$: The stratigraphic classification result of the surrounding well
GR: Gamma ray
AC: Acoustic transit time
DEN: Bulk density
Rt: Deep resistivity
MAE: Mean absolute error
RMSE: Root mean square error
$R^{2}$: Coefficient of determination

References

  1. Karimi, A.M.; Sadeghnejad, S.; Rezghi, M. Well-to-well correlation and identifying lithological boundaries by principal component analysis of well-logs. Comput. Geosci. 2021, 157, 104942. [Google Scholar] [CrossRef]
  2. Zaitouny, A.; Small, M.; Hill, J.; Emelyanova, I.; Ben Clennell, M. Fast automatic detection of geological boundaries from multivariate log data using recurrence. Comput. Geosci. 2020, 135, 104362. [Google Scholar] [CrossRef]
  3. Partovi, S.M.A.; Sadeghnejad, S. Geological boundary detection from well-logs: An efficient approach based on pattern recognition. J. Pet. Sci. Eng. 2019, 176, 444–455. [Google Scholar] [CrossRef]
  4. Behdad, A. A step toward the practical stratigraphic automatic correlation of well logs using continuous wavelet transform and dynamic time warping technique. J. Appl. Geophys. 2019, 167, 26–32. [Google Scholar] [CrossRef]
  5. Liu, L.-L.; Wang, Y. Quantification of stratigraphic boundary uncertainty from limited boreholes and its effect on slope stability analysis. Eng. Geol. 2022, 306, 106770. [Google Scholar] [CrossRef]
  6. Zhang, Q.; Zhang, F.; Liu, J.; Wang, X.; Chen, Q.; Zhao, L.; Tian, L.; Wang, Y. A method for identifying the thin layer using the wavelet transform of density logging data. J. Pet. Sci. Eng. 2018, 160, 433–441. [Google Scholar] [CrossRef]
  7. Dobróka, M.; Szabó, N.P. Interval inversion of well-logging data for automatic determination of formation boundaries by using a float-encoded genetic algorithm. J. Pet. Sci. Eng. 2012, 86–87, 144–152. [Google Scholar] [CrossRef]
  8. Omeragic, D.; Polyakov, V.; Shetty, S.; Brot, B.; Habashy, T.; Mahesh, A.; Friedel, T.; Denichou, J. Integration of well logs and reservoir geomodels for formation evaluation in high-angle and horizontal wells. In Proceedings of the SPWLA 52nd Annual Logging Sym-Posium, Colorado Springs, CO, USA, 14–18 May 2011; OnePetro: Richardson, TX, USA, 2011. [Google Scholar]
  9. Maiti, S.; Tiwari, R. Automatic detection of lithologic boundaries using the Walsh transform: A case study from the KTB borehole. Comput. Geosci. 2005, 31, 949–955. [Google Scholar] [CrossRef]
  10. Luthi, S.M.; Bryant, I.D. Well-log correlation using a back-propagation neural network. J. Int. Assoc. Math. Geol. 1997, 29, 413–425. [Google Scholar] [CrossRef]
  11. Delfiner, P.; Peyret, O.; Serra, O. Automatic determination of lithology from well logs. SPE Form. Eval. 1987, 2, 303–310. [Google Scholar] [CrossRef]
  12. Shaw, B.R.; Cubitt, J.M. Stratigraphic correlation of well logs: An automated approach. In Geomathematical and Petro-Physical Studies in Sedimentology; Elsevier: Amsterdam, The Netherlands, 1979; pp. 127–148. [Google Scholar]
  13. Silversides, K.L.; Melkumyan, A.; Wyman, D.A.; Hatherly, P.J.; Nettleton, E. Detection of geological structure using gamma logs for autonomous mining. In Proceedings of the 2011 IEEE International Conference on Robotics and Automation, Shanghai, China, 9–13 May 2011; pp. 1577–1582. [Google Scholar] [CrossRef]
  14. Reading, A.M.; Gallagher, K. Transdimensional change-point modeling as a tool to investigate uncertainty in applied ge-ophysical inference: An example using borehole geophysical logs. Geophysics 2013, 78, WB89–WB99. [Google Scholar] [CrossRef]
  15. Sen, D.; Texas A&M University; Ong, C.; Kainkaryam, S.; Sharma, A.; Houston, T. Automatic detection of anomalous density measurements due to wellbore cave-in. Petrophysics 2020, 61, 434–449. [Google Scholar] [CrossRef]
  16. Gill, D.; Shomrony, A.; Fligelman, H. Numerical zonation of log suites and logfacies recognition by multivariate clustering. Aapg. Bull. 1993, 77, 1781–1791. [Google Scholar]
  17. Smith, J.H. A method for calculating pseudo sonics from e-logs in a clastic geologic setting. Gcags Trans. 2007, 57, 675–678. [Google Scholar]
  18. Merembayev, T.; Yunussov, R.; Yedilkhan, A. Machine learning algorithms for classification geology data from well logging. In Proceedings of the 2018 14th International Conference on Electronics Computer and Computation (ICECCO), Kaskelen, Kazakhstan, 29 November–1 December 2018; IEEE: Piscataway, NJ, USA, 2018; pp. 206–212. [Google Scholar]
  19. Partovi, S.M.A.; Sadeghnejad, S. Reservoir rock characterization using wavelet transform and fractal dimension. Iran. J. Chem. Chem. Eng. 2018, 37, 223–233. [Google Scholar]
  20. Belozerov, B.; Bukhanov, N.; Egorov, D.; Zakirov, A.; Osmonalieva, O.; Golitsyna, M.; Reshytko, A.; Semenikhin, A.; Shindin, E.; Lipets, V. Automatic well log analysis across priobskoe field using machine learning methods. In Proceedings of the SPE Russian Petroleum Technology Conference, Moscow, Russia, 15–17 October 2018; OnePetro: Richardson, TX, USA, 2018. [Google Scholar]
  21. Partovi, S.M.A.; Sadeghnejad, S. Fractal parameters and well-logs investigation using automated well-to-well correlation. Comput. Geosci. 2017, 103, 59–69. [Google Scholar] [CrossRef]
  22. Gulbrandsen, M.L.; Cordua, K.S.; Bach, T.; Hansen, T.M. Smart Interpretation–automatic geological interpretations based on supervised statistical models. Comput. Geosci. 2017, 21, 427–440. [Google Scholar] [CrossRef]
  23. Gonzalez, A.; Kanyan, L.; Heidari, Z. Integrated multi-physics workflow for automatic rock classification and formation evaluation using multi-scale image analysis and conventional well logs. In Proceedings of the SPWLA 60th Annual Logging Symposium, The Woodlands, TX, USA, 15–19 June 2019. [Google Scholar] [CrossRef]
  24. Loginov, G.; Petrov, A. Automatic detection of geoelectric boundaries according to lateral logging sounding data by applying a deep convolutional neural network. Russ. Geol. Geophys. 2019, 60, 1319–1325. [Google Scholar] [CrossRef]
  25. Lapkovsky, V.; Istomin, A.; Kontorovich, V.; Berdov, V. Correlation of well logs as a multidimensional optimization problem. Russ. Geol. Geophys. 2015, 56, 487–492. [Google Scholar] [CrossRef]
  26. Gardner, G.H.F.; Gardner, L.W.; Gregory, A.R. Formation velocity and density-the diagnostic basics for stratigraphic traps. Geophysics 1974, 39, 770–780. [Google Scholar] [CrossRef]
  27. Castagna, J.P.; Batzle, M.L.; Eastwood, R.L. Relationships between compressional-wave and shear-wave velocities in clastic silicate rocks. Geophysics 1985, 50, 571–581. [Google Scholar] [CrossRef]
  28. Zhang, D.; Yuntian, C.; Jin, M. Synthetic well logs generation via Recurrent Neural Networks. Pet. Explor. Dev. 2018, 45, 629–639. [Google Scholar] [CrossRef]
  29. Zhou, X.; Cao, J.; Wang, X.; Wang, J.; Liao, W. Acoustic log reconstruction based on bidirectional Gated Recurrent Unit (GRU) neural network. Prog. Geophys. 2022, 37, 357–366. [Google Scholar]
  30. Wood, D.A. Carbonate/siliciclastic lithofacies classification aided by well-log derivative, volatility and sequence boundary attributes combined with machine learning. Earth Sci. Inform. 2022, 15, 1699–1721. [Google Scholar] [CrossRef]
  31. Anvari, K.; Mousavi, A.; Sayadi, A.R.; Sellers, E.; Salmi, E.F. Automatic detection of rock boundaries using a hybrid recurrence quantification analysis and machine learning techniques. Bull. Eng. Geol. Environ. 2022, 81, 398. [Google Scholar] [CrossRef]
  32. Wang, Y.; Shi, C.; Li, X. Machine learning of geological details from borehole logs for development of high-resolution subsurface geological cross-section and geotechnical analysis. Georisk: Assess. Manag. Risk Eng. Syst. Geohazards 2021, 16, 2–20. [Google Scholar] [CrossRef]
  33. Tözün, K.A.; Özyavaş, A. Automatic detection of geological lineaments in central Turkey based on test image analysis using satellite data. Adv. Space Res. 2022, 69, 3283–3300. [Google Scholar] [CrossRef]
  34. Shi, Z.; Zhou, T.; Guo, W.; Liang, P.; Cheng, F. Quantitative Paleogeographic Mapping and Sedimentary Microfacies Divi-sion in a Deep-water Marine Shale Shelf: Case study of Wufeng-Longmaxi shale, southern Sichuan Basin, China. Acta Sedimentol. Sin. 2022, 40, 1728–1744. [Google Scholar]
  35. Hongyan, W.; Zhensheng, S.; Shasha, S.; Leifu, Z.; Aarnes, I. Characterization and genesis of deep shale reservoirs in the first Member of the Silurian Longmaxi Formation in southern Sichuan Basin and its periphery. Oil Gas Geol. 2021, 42, 66–75. [Google Scholar]
  36. Shi, Z.; Dong, D.; Wang, H.; Sun, S.; Wu, J. Reservoir characteristics and genetic mechanisms of gas-bearing shales with different laminae and laminae combinations: A case study of Member 1 of the Lower Silurian Longmaxi shale in Sichuan Basin, SW China. Pet. Explor. Dev. 2020, 47, 888–900. [Google Scholar] [CrossRef]
  37. Wang, H.; Shi, Z.; Zhao, Q.; Liu, D.; Sun, S.; Guo, W.; Liang, F.; Lin, C.; Wang, X. Stratigraphic framework of the Wufeng-Longmaxi shale in and around the Sichuan Basin, China: Implications for targeting shale gas. Energy Geosci. 2020, 1, 124–133. [Google Scholar] [CrossRef]
  38. Girolami, M.; Mischak, H.; Krebs, R. Analysis of complex, multidimensional datasets. Drug Discov. Today Technol. 2006, 3, 13–19. [Google Scholar] [CrossRef] [PubMed]
  39. Karamizadeh, S.; Abdullah, S.M.; Manaf, A.A.; Zamani, M.; Hooman, A. An overview of principal component analysis. J. Signal Inf. Process. 2013, 4, 173. [Google Scholar] [CrossRef]
  40. Gonog, L.; Zhou, Y. A review: Generative adversarial networks. In Proceedings of the 2019 14th IEEE Conference on Industrial Electronics and Applications (ICIEA), Xi’an, China, 19–21 June 2019; IEEE: Piscataway, NJ, USA, 2019; pp. 505–510. [Google Scholar]
  41. Aggarwal, A.; Mittal, M.; Battineni, G. Generative adversarial network: An overview of theory and applications. Int. J. Inf. Manag. Data Insights 2021, 1, 100004. [Google Scholar] [CrossRef]
  42. Goodfellow, I.; Pouget-Abadie, J.; Mirza, M.; Xu, B.; Warde-Farley, D.; Ozair, S.; Courville, A.; Bengio, Y. Generative adversarial nets. arXiv 2014, arXiv:1406.2661. [Google Scholar]
  43. Chen, K.; Chen, H.; Zhou, C.; Huang, Y.; Qi, X.; Shen, R.; Liu, F.; Zuo, M.; Zou, X.; Wang, J.; et al. Comparative analysis of surface water quality prediction performance and identification of key water parameters using different machine learning models based on big data. Water Res. 2019, 171, 115454. [Google Scholar] [CrossRef]
  44. Shewalkar, A.; Nyavanandi, D.; Ludwig, S.A. Performance evaluation of deep neural networks applied to speech recognition: RNN, LSTM and GRU. J. Artif. Intell. Soft Comput. Res. 2019, 9, 235–245. [Google Scholar] [CrossRef]
  45. Rajeswari, S.; Suthendran, K. C5.0: Advanced Decision Tree (ADT) classification model for agricultural data analysis on cloud. Comput. Electron. Agric. 2019, 156, 530–539. [Google Scholar] [CrossRef]
  46. Zhou, Z.-H.; Feng, J. Deep Forest: Towards an Alternative to Deep Neural Networks. In Proceedings of the 26th International Joint Conference on Artificial Intelligence, Melbourne, VIC, Australia, 19–25 August 2017; pp. 3553–3559. [Google Scholar]
  47. Canziani, A.; Paszke, A.; Culurciello, E. An analysis of deep neural network models for practical applications. arXiv 2016, arXiv:1605.07678 2016. [Google Scholar]
  48. Loh, W.-Y.; Eltinge, J.; Cho, M.J.; Li, Y. Classification and regression trees and forests for incomplete data from sample surveys. Stat. Sin. 2018, 29, 431–453. [Google Scholar] [CrossRef]
  49. Zhang, H.; Goodfellow, I.; Metaxas, D.; Odena, A. Self-attention generative adversarial networks. In Proceedings of the International Conference on Machine Learning, Long Beach, CA, USA, 9–15 June 2019; pp. 7354–7363. [Google Scholar]
  50. Fedus, W.; Goodfellow, I.; Dai, A.M. Maskgan: Better text generation via filling in the_. arXiv 2018, arXiv:1801.07736 2018. [Google Scholar]
  51. Salimans, T.; Goodfellow, I.; Zaremba, W.; Cheung, V.; Radford, A.; Chen, X. Improved techniques for training gans. Adv. Neural Inf. Process. Syst. 2016, 29. [Google Scholar]
  52. Goodfellow, I.J. On distinguishability criteria for estimating generative models. arXiv 2014, arXiv:1412.6515 2014. [Google Scholar]
  53. Zhang, C.; Zuo, R. Recognition of multivariate geochemical anomalies associated with mineralization using an improved generative adversarial network. Ore Geol. Rev. 2021, 136, 104264. [Google Scholar] [CrossRef]
  54. Liu, M.; Li, W.; Jervis, M.; Nivlet, P. 3D seismic facies classification using convolutional neural network and semi-supervised generative adversarial network. In SEG Technical Program Expanded Abstracts 2019; Society of Exploration Geophysicists: Houston, TX, USA, 2019. [Google Scholar] [CrossRef]
  55. Fu, R.; Chen, J.; Zeng, S.; Zhuang, Y.; Sudjianto, A. Time Series Simulation by Conditional Generative Adversarial Net. arXiv 2019, arXiv:1904.11419. [Google Scholar] [CrossRef]
  56. Jo, H.; Santos, J.E.; Pyrcz, M.J. Rule-Based Models with Generative Adversarial Networks: A Deepwater Lobe. In Proceedings of the Deep Learning Example, 2019 AAPG Annual Convention and Exhibition, San Antonio, TX, USA, 15–20 September 2019. [Google Scholar]
  57. Nakayama, J.Y.; Ho, J.; Cartwright, E.; Simpson, R.; Hertzberg, V.S. Predictors of progression through the cascade of care to a cure for hepatitis C patients using decision trees and random forests. Comput. Biol. Med. 2021, 134, 104461. [Google Scholar] [CrossRef]
  58. Ko, B.C.; Kim, S.; Jung, M. Energy-efficient pupil tracking method and device based on simplification of cascade regression forest. Sensors 2021, 20, 5141. [Google Scholar]
  59. Johnson, S.L.; Henshaw, D.; Downing, G.; Wondzell, S.; Schulze, M.; Kennedy, A.; Cohn, G.; Schmidt, S.A.; Jones, J.A. Long-term hydrology and aquatic biogeochemistry data from H. J. Andrews Experimental Forest, Cascade Mountains, Oregon. Hydrol Process 2021, 35, e14187. [Google Scholar]
  60. Zheng, L.; Bao, Q.; Weng, S.; Tao, J.; Zhang, D.; Huang, L.; Zhao, J. Determination of adulteration in wheat flour using multi-grained cascade forest-related models coupled with the fusion information of hyperspectral imaging. Spectrochim. Acta Part A Mol. Biomol. Spectrosc. 2021, 270, 120813. [Google Scholar] [CrossRef]
  61. Shao, M.; Zou, Y. Multi-spectral cloud detection based on a multi-dimensional and multi-grained dense cascade forest. J. Appl. Remote Sens. 2021, 15, 028507. [Google Scholar] [CrossRef]
  62. Li, D.; Liu, Z.; Armaghani, D.J.; Xiao, P.; Zhou, J. Novel Ensemble Tree Solution for Rockburst Prediction Using Deep Forest. Mathematics 2022, 10, 787. [Google Scholar] [CrossRef]
  63. Runxuan, W.; Croft, R.A.C.; Patrick, S. Deep forest: Neural network reconstruction of intergalactic medium temperature. Mon. Not. R. Astron. Soc. 2022, 515, 1568–1579. [Google Scholar]
  64. Zhang, J.; Song, H. Multi-Feature Fusion for Weak Target Detection on Sea-Surface Based on FAR Controllable Deep Forest Model. Remote Sens. 2021, 13, 812. [Google Scholar] [CrossRef]
Figure 1. A framework illustration of DRAG. After the application of the PCA algorithm, the well log data undergo a dimensionality reduction, leading to the extracted principal components. These components, representing the main features of the original well log data, are then used in two distinct pathways. The first pathway feeds into the generative adversarial network (GAN), where it provides the basis for generating synthetic samples. The second pathway channels the reduced data directly into the deep belief forest for further stratigraphic analysis. Both routes are crucial for the comprehensive stratigraphic division process, with the GAN ensuring a balanced data distribution and the deep belief forest refining stratigraphic categorization. This dual-path strategy optimizes the use of reduced data, ensuring a more precise and reliable stratigraphic division.
Figure 2. (a) Weights (wc) of principal components during PCA analysis of Well A and (b) Composite scores (F’) for each depth interval of Well A. PC1: Gamma ray (GR), PC2: Acoustic transit time (AC), PC3: Deep resistivity (Rt), PC4: Potassium–thorium–hydrogen (KTH), PC5: Compensated neutron log (CNL), PC6: Bulk density (DEN), PC7: Resistivity of flushed zone (RXO), PC8: Caliper log (CAL), PC9: Thorium–hydrogen (TH), PC10: Uranium (URAN).
Figure 3. The framework of a GAN for the deep belief forest-based automatic layering recognition method.
Figure 4. Schematic diagram of the deep belief forest used in this study.
Figure 5. Deep confidence level forest used for this study.
Figure 6. Comparison of automatic stratigraphic unit division results based on manual division, DRAG, GRU, and BPNN methods.
Figure 7. Comparison of corrections applied to stratigraphic division results from well logs. (a) Well A geological boundary identified by experts, (b) Well A geological boundary identified by Well B correlation after deep belief forest analysis, (c) Well A geological boundary identified by Well B and Well C correlations after deep belief forest analysis, (d) Well A geological boundary identified by Well B, Well C and Well D correlations after deep belief forest analysis, (e) well locations of Well A, Well B, Well C and Well D, (f) geological boundary identification of Well B, (g) geological boundary identification of Well C, (h) geological boundary identification of Well D.
Table 1. Results of different methods for identifying automatic shale boundary.
All depths are in m. "Artificial" denotes the artificial (manual) geological boundary identified results.

| Well | Layer | Artificial Top | Artificial Bottom | Proposed Method Top | Proposed Method Bottom | GRU Top | GRU Bottom | BPNN Top | BPNN Bottom | Random Forest Top | Random Forest Bottom |
|---|---|---|---|---|---|---|---|---|---|---|---|
| Well A | L14 | 1273.471 | 1302.5 | 1274.058 | 1302.922 | 1273.956 | 1303.367 | 1279.607 | 1303.051 | 1274.081 | 1306.388 |
| Well A | L13 | 1302.5 | 1311.652 | 1302.922 | 1311.677 | 1303.367 | 1312.749 | 1303.051 | 1305.885 | 1306.388 | 1313.845 |
| Well A | L12 | 1311.652 | 1318.332 | 1311.677 | 1318.997 | 1312.749 | 1321.884 | 1305.885 | 1320.249 | 1313.845 | 1322.706 |
| Well A | L11 | 1318.332 | 1320.368 | 1318.997 | 1320.597 | 1321.884 | 1322.138 | 1320.249 | 1322.236 | 1322.706 | 1326.11 |
| Well A | W1 | 1320.368 | 1323 | 1320.597 | 1322.555 | 1322.138 | 1328.728 | 1322.236 | 1325.984 | 1326.11 | 1327.87 |
| Well B | L14 | 2290.708 | 2321.879 | 2291.004 | 2322.636 | 2290.476 | 2322.676 | 2293.19 | 2325.156 | 2299.947 | 2316.072 |
| Well B | L13 | 2321.879 | 2339.537 | 2322.636 | 2339.493 | 2322.676 | 2341.61 | 2325.156 | 2336.799 | 2316.072 | 2345.462 |
| Well B | L12 | 2339.537 | 2345.506 | 2339.493 | 2345.201 | 2341.61 | 2343.233 | 2336.799 | 2347.478 | 2345.462 | 2346.248 |
| Well B | L11 | 2345.506 | 2348.477 | 2345.201 | 2348.577 | 2343.233 | 2350.572 | 2347.478 | 2344.549 | 2346.248 | 2343.577 |
| Well B | W1 | 2348.477 | 2357 | 2348.577 | 2356.591 | 2350.572 | 2354.883 | 2344.549 | 2361.742 | 2343.577 | 2357.78 |
| Well C | L14 | 2906.318 | 2940.176 | 2906.81 | 2940.542 | 2908.149 | 2941.566 | 2908.784 | 2944.714 | 2907.024 | 2936.386 |
| Well C | L13 | 2940.176 | 2954.237 | 2940.542 | 2953.414 | 2941.566 | 2952.814 | 2944.714 | 2951.976 | 2936.386 | 2957.748 |
| Well C | L12 | 2954.237 | 2958.511 | 2953.414 | 2958.871 | 2952.814 | 2958.486 | 2951.976 | 2959.955 | 2957.748 | 2960.499 |
| Well C | L11 | 2958.511 | 2959.96 | 2958.871 | 2959.993 | 2958.486 | 2960.405 | 2959.955 | 2964.13 | 2960.499 | 2962.396 |
| Well C | W1 | 2959.96 | 2962.808 | 2959.993 | 2961.869 | 2960.405 | 2964.599 | 2964.13 | 2967.738 | 2962.396 | 2965.122 |
| Well D | L14 | 2038.45 | 2073.933 | 2038.762 | 2072.952 | 2037.977 | 2069.651 | 2029.618 | 2071.743 | 2028.379 | 2074.321 |
| Well D | L13 | 2073.933 | 2092.152 | 2072.952 | 2092.688 | 2069.651 | 2090.59 | 2071.743 | 2094.587 | 2074.321 | 2095.42 |
| Well D | L12 | 2092.152 | 2101.53 | 2092.688 | 2100.649 | 2090.59 | 2100.941 | 2094.587 | 2102.906 | 2095.42 | 2102.095 |
| Well D | L11 | 2101.53 | 2107.016 | 2100.649 | 2106.53 | 2100.941 | 2103.031 | 2102.906 | 2110.992 | 2102.095 | 2111.828 |
| Well D | W1 | 2107.016 | 2111 | 2106.53 | 2111.796 | 2103.031 | 2113.375 | 2110.992 | 2109.787 | 2111.828 | 2108.992 |
Table 2. Comparison between the proposed algorithms and the comparison algorithm.

| Evaluation Index | Proposed Method | GRU | BPNN | Random Forest |
|---|---|---|---|---|
| MAE | 6.221 | 8.876 | 10.241 | 10.221 |
| RMSE | 8.944 | 11.345 | 14.214 | 14.341 |
| R2 | 0.932 | 0.911 | 0.834 | 0.831 |

MAE: mean absolute error, RMSE: root mean square error, R2: coefficient of determination.
Table 3. Automatic stratigraphic division results of well A modified by artificial stratigraphic division results of the adjacent shale gas drilling well.
All depths are in m. "Artificial" denotes the artificial (manual) geological boundary identified results; "1-Well", "2-Well" and "3-Well" denote the geological boundary identified by 1, 2 and 3 well correlations after deep belief forest analysis.

| Well | Layer | Artificial Top | Artificial Bottom | 1-Well Top | 1-Well Bottom | 2-Well Top | 2-Well Bottom | 3-Well Top | 3-Well Bottom |
|---|---|---|---|---|---|---|---|---|---|
| Well A | L14 | 1273.471 | 1302.5 | 1281.022 | 1303.228 | 1274.81 | 1301.915 | 1274.058 | 1302.922 |
| Well A | L13 | 1302.5 | 1311.652 | 1303.228 | 1313.349 | 1301.915 | 1314.614 | 1302.922 | 1311.677 |
| Well A | L12 | 1311.652 | 1318.332 | 1313.349 | 1316.012 | 1314.614 | 1322.201 | 1311.677 | 1318.997 |
| Well A | L11 | 1318.332 | 1320.368 | 1316.012 | 1321.495 | 1322.201 | 1323.094 | 1318.997 | 1320.597 |
| Well A | W1 | 1320.368 | 1323 | 1321.495 | 1323.728 | 1323.094 | 1325.261 | 1320.597 | 1322.555 |
| Well B | L14 | 2290.708 | 2321.879 | 2293.171 | 2320.868 | 2292.322 | 2322.993 | 2291.004 | 2322.636 |
| Well B | L13 | 2321.879 | 2339.537 | 2320.868 | 2342.894 | 2322.993 | 2342.893 | 2322.636 | 2339.493 |
| Well B | L12 | 2339.537 | 2345.506 | 2342.894 | 2348.412 | 2342.893 | 2345.719 | 2339.493 | 2345.201 |
| Well B | L11 | 2345.506 | 2348.477 | 2348.412 | 2353.043 | 2345.719 | 2346.372 | 2345.201 | 2348.577 |
| Well B | W1 | 2348.477 | 2357 | 2353.043 | 2355.871 | 2346.372 | 2357.055 | 2348.577 | 2356.591 |
| Well C | L14 | 2906.318 | 2940.176 | 2898.726 | 2940.402 | 2905.641 | 2940.223 | 2906.81 | 2940.542 |
| Well C | L13 | 2940.176 | 2954.237 | 2940.402 | 2952.837 | 2940.223 | 2956.697 | 2940.542 | 2953.414 |
| Well C | L12 | 2954.237 | 2958.511 | 2952.837 | 2962.61 | 2956.697 | 2957.353 | 2953.414 | 2958.871 |
| Well C | L11 | 2958.511 | 2959.96 | 2962.61 | 2957.152 | 2957.353 | 2963.454 | 2958.871 | 2959.993 |
| Well C | W1 | 2959.96 | 2962.808 | 2957.152 | 2965.154 | 2963.454 | 2965.094 | 2959.993 | 2961.869 |
| Well D | L14 | 2038.45 | 2073.933 | 2038.838 | 2073.495 | 2039.604 | 2071.56 | 2038.762 | 2072.952 |
| Well D | L13 | 2073.933 | 2092.152 | 2073.495 | 2094.361 | 2071.56 | 2094.322 | 2072.952 | 2092.688 |
| Well D | L12 | 2092.152 | 2101.53 | 2094.361 | 2098.649 | 2094.322 | 2102.831 | 2092.688 | 2100.649 |
| Well D | L11 | 2101.53 | 2107.016 | 2098.649 | 2108.308 | 2102.831 | 2103.168 | 2100.649 | 2106.53 |
| Well D | W1 | 2107.016 | 2111 | 2108.308 | 2109.049 | 2103.168 | 2111.777 | 2106.53 | 2111.796 |
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

Share and Cite

MDPI and ACS Style

Zhou, T.; Zhu, Q.; Zhu, H.; Zhao, Q.; Shi, Z.; Zhao, S.; Zhang, C.; Wang, S. DRAG: A Novel Method for Automatic Geological Boundary Recognition in Shale Strata Using Multi-Well Log Curves. Processes 2023, 11, 2998. https://doi.org/10.3390/pr11102998
