Research on Image Identification Method of Rock Thin Slices in Tight Oil Reservoirs Based on Mask R-CNN

Liu, Tao; Li, Chunsheng; Liu, Zongbao; Zhang, Kejia; Liu, Fang; Li, Dongsheng; Zhang, Yan; Liu, Zhigang; Liu, Liyuan; Huang, Jiacheng

doi:10.3390/en15165818


by Tao Liu ¹, Chunsheng Li ¹, Zongbao Liu ²,*, Kejia Zhang ¹,*, Fang Liu ¹, Dongsheng Li ¹, Yan Zhang ¹, Zhigang Liu ¹, Liyuan Liu ² and Jiacheng Huang ²

¹ School of Computer & Information Technology, Northeast Petroleum University, Daqing 163318, China

² School of Earth Sciences, Northeast Petroleum University, Daqing 163318, China

* Authors to whom correspondence should be addressed.

Energies 2022, 15(16), 5818; https://doi.org/10.3390/en15165818

Submission received: 23 June 2022 / Revised: 25 July 2022 / Accepted: 6 August 2022 / Published: 10 August 2022


Abstract

Terrestrial tight oil has extremely strong diagenetic heterogeneity, so a large number of rock thin slices are needed to reveal the real microscopic pore-throat structure characteristics. However, the identification of tight oil rock thin slices is difficult, costly, time-consuming and highly subjective, which makes it hard to meet the needs of fine description and quantitative characterization of the reservoir. In this paper, a deep learning-based method for identifying the characteristics of rock thin slices in tight oil reservoirs is proposed (SMR, a self-labeling augmented Mask R-CNN method based on transfer learning). The work has the following steps: first, the image preprocessing technique was studied. The original image noise was removed by filtering, and the image pixel size was unified by a normalization technique to ensure the quality of samples; second, the self-labeling image data augmentation technique was constructed to solve the problem of sparse samples; third, the Mask R-CNN algorithm was introduced and improved to synchronize the segmentation and recognition of rock thin slice components in tight oil reservoirs; finally, experiments demonstrated that the SMR method has significant advantages in accuracy, execution speed and migration.

Keywords:

tight oil reservoir; rock thin slices; characteristics identification; deep learning; unconventional oil and gas

1. Introduction

Tight oil reservoirs [1,2,3] in unconventional oil and gas resources have a complex pore structure and great resource potential, and the identification of rock thin slices is significant for the analysis of the microscopic pore structure and reservoir-sweet spot distribution [4]. Tight oil reservoirs are generally characterized by large heterogeneity and strong diagenesis, resulting in blurred component boundaries in rock thin slices and difficult image identification. It is urgent to establish an image identification method for rock thin slices in tight oil reservoirs based on the cross fusion of geological big data and artificial intelligence (AI), to achieve a multidimensional quantitative segmentation of rock thin slice images and fast intelligent recognition of its components [5,6,7,8]. The current identification methods include artificial intelligence-based image segmentation methods for rock thin slices in tight oil reservoirs and intelligent identification technology.

The artificial intelligence-based image segmentation methods for rock thin slices in tight oil reservoirs mainly consist of superpixel segmentation [9] and semantic segmentation [10]. Superpixel segmentation can be divided into graph theory and pixel clustering. Graph theory converts the segmentation and recognition of slice components into the division of image pixel units. In [11,12], the composition and structure of rock thin slices were segmented and analyzed by graph theory. However, because the division boundaries were inaccurate, segmentation cavities and over-segmentation occurred. Pixel clustering groups pixels with similar spatial distance and similar features such as color, brightness and texture into the same superpixel to obtain the image segmentation result [10,13]. A K-means clustering algorithm based on probability selection was proposed in [14], which solved the problems of the traditional method being time-consuming and prone to discontinuity; however, the K center points are still hard to choose, and the clustering process is affected by noise and outliers. Semantic segmentation first assigns a semantic category label to each pixel, then divides and predicts slice components through the establishment of region proposals and classification evaluation. In [15,16], semantic segmentation could mark the overall boundary of different components, but could not delineate independent individuals accurately.

Discriminant classifiers and neural networks are the two main intelligent identification techniques for tight oil reservoir composition. The key to the discriminant classifier is learning the geometric features of each component. In [17], a semiautomatic identification method for the pore images of rock thin slices was proposed, but dissolution pores could not be well identified. The neural network algorithm identifies the components of rock thin slice images by establishing a neural network model. In [18], 12 image features such as component hue and saturation were captured with an artificial neural network (ANN) to identify component categories, but ANN is not suitable for high-density slice images with similar optical characteristics.

To sum up, the existing methods have low identification accuracy because of algorithm design flaws, susceptibility to noise, complex slice image structure, etc. In addition, due to the complex process and high cost of making tight oil rock thin slices [9,19], there are rarely sufficient samples for the existing methods. To this end, given the outstanding performance of Mask region-based convolutional neural network (Mask R-CNN) algorithm [20] in semantic segmentation, object detection and instance segmentation of natural images [21], a self-labeling augmented Mask R-CNN method for component identification of tight oil slices based on transfer learning (SMR) was proposed in this paper. First, the image preprocessing technique was studied. The original image noise was removed by filtering, and the image pixel size was unified by a normalization technique to ensure the quality of the samples; second, the self-labeling image data augmentation technique was constructed to solve the problem of sparse samples; third, the Mask R-CNN algorithm was introduced and improved to synchronize the segmentation and recognition of rock thin slice components in tight oil reservoirs; finally, the accuracy, execution speed and migration of the SMR method were demonstrated by experiments. At the same time, the SMR method has high segmentation accuracy and good recognition effect, which greatly saves the working time and labor cost of geologists. This method has a good application value for the microscopic study of tight oil reservoirs.

The rest of this paper is organized as follows: Section 2 elaborates the techniques; Section 3 describes the experimental scheme; Section 4 analyzes and discusses the experimental results; Section 5 summarizes the study and discusses prospects for future work.

2. SMR Methods

The workflow and key techniques of the SMR method are mainly elaborated in this section. The workflow of SMR mainly includes three stages: establishment of slice image data sets, augmentation of slice images, and identification of slice image components (Figure 1). The key techniques at each stage are, respectively, an image preprocessing technique, a self-labeling image augmentation mechanism and an improved Mask R-CNN algorithm. This will be discussed in detail in the stages below.

The augmentation of slice images realizes the augmentation of a small number of labeled images based on the self-labeling mechanism [22]. The identification of slice image components completes the fine segmentation and accurate identification of the slice image components by the Mask R-CNN algorithm.

2.1. Establishment of Slice Image Data Sets

There are two steps at this stage: image preprocessing and image labeling. The original data sets used in this paper came from two tight reservoirs in China: the Fuyu reservoir in Sanzhao Sag of Songliao Basin (SZS data set) and the Upper Paleozoic in Linxing Block of Ordos Basin (OB data set), with a total of 95 bitmaps. The SZS data set was a self-made training data set with 50 original images of 616 × 416 pixels; the OB data set was a migration test data set with 45 original images of 2560 × 1920 pixels.

To avoid the influence of noise and image size on the experimental results, image preprocessing was performed first. Image preprocessing includes image denoising and image normalization, as marked by ① in Figure 1. Since the non-local means algorithm [23] can retain the texture features of images while denoising, image denoising was realized by applying the non-local means method with a Gaussian kernel [24]. Meanwhile, to meet the image quality requirements of the Mask R-CNN algorithm, the image size was unified to 512 × 512 pixels by the resize image processing method, so that the SZS image quality remained as close as possible to the original data.

Image labeling guides the training process. LabelMe [25] was used in this paper for semantic labeling to ensure the accuracy of labels. The labels consist of Quartz, Feldspar, Lithic, Primary Pore (PP), Casting Pore (CP), Cemented Dissolution Pore (CDP) and Microcrack, as marked by ② in Figure 1. The labeled image samples would be formed into JSON files and be converted to the desired format.

2.2. Augmentation of Slice Images

The self-labeling image augmentation mechanism [26] was introduced to augment the number of data sets and improve network robustness. Currently, image augmentation mainly depends on deep learning and image processing [27,28]. Deep learning requires a large number of training samples, and the images generated still rely on large-scale manual verification and labeling, so it is not suitable for the augmentation of small sample slice images. Image processing realizes image augmentation based on small sample images, including flipping, color transformation, cropping, rotation, translation, noise injection [29], and mixed images [30]. Based on image processing, the self-labeling mechanism was introduced to complete the augmentation of labeled image samples and corresponding label files, as shown in (2) in Figure 1. The self-labeling mechanism determines the position change of each component labeling point through spatial geometry principle, and completes the labeling of augmented images.

To ensure the clarity of the augmented images and the accuracy of labeling points, five image augmentation forms were selected and two constraints were set.

(1): Flipping: image horizontal/vertical/diagonal mirror flipping.
(2): Rotation: images were rotated by 30 degrees, 60 degrees, 90 degrees and 120 degrees clockwise.
(3): Gaussian blur: 3 × 3, 5 × 5 and 7 × 7 Gaussian kernels were applied to blur images globally, horizontally, vertically and diagonally.
(4): Change exposure: increase and decrease operations of global, horizontal, vertical and diagonal exposures on images.
(5): Noise injection: global, horizontal, vertical and diagonal noise addition operations were performed by using 0.05, 0.075, 0.1 and 0.125 noise percentages.

The set constraints were: (1) Two augmentation forms can be used, at most, each time, and flipping and rotation can be selected simultaneously; (2) Three augmentation forms of Gaussian blur, change exposure and noise injection cannot be selected at the same time.
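The self-labeling idea described above, in which the labeling points follow the same geometric change as the image, can be illustrated with a minimal sketch. It is not the authors' implementation: the function names `flip_points` and `rotate_points` are hypothetical, points are assumed to be (x, y) pixel coordinates with the origin at the top-left corner, and rotation is assumed to be about the image center without enlarging the canvas.

```python
import math

def flip_points(points, width, height, mode):
    """Mirror label points together with the image.

    mode: "horizontal" (left-right mirror) or "vertical" (top-bottom mirror).
    """
    flipped = []
    for x, y in points:
        if mode == "horizontal":
            flipped.append((width - 1 - x, y))
        elif mode == "vertical":
            flipped.append((x, height - 1 - y))
        else:
            raise ValueError("unsupported flip mode: " + mode)
    return flipped

def rotate_points(points, width, height, degrees_clockwise):
    """Rotate label points clockwise (as seen on screen) about the image center.

    Image coordinates have the y-axis pointing down, so a visually clockwise
    rotation uses the standard rotation matrix with a positive angle.
    """
    cx, cy = (width - 1) / 2.0, (height - 1) / 2.0
    t = math.radians(degrees_clockwise)
    cos_t, sin_t = math.cos(t), math.sin(t)
    rotated = []
    for x, y in points:
        dx, dy = x - cx, y - cy
        rotated.append((cx + dx * cos_t - dy * sin_t,
                        cy + dx * sin_t + dy * cos_t))
    return rotated
```

With this convention, a point to the right of the center moves below the center after a 90 degree clockwise rotation. Points that leave the image frame after rotation would need clipping or discarding, which the sketch does not handle.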

After augmentation, the total number $F_{num}$ of samples was:

F_{num} = \sum_{k = 1}^{n} \left[ \left( \sum_{a_{1} = 1}^{n - (k - 1)} p_{a_{1}} \right) \times \left( \sum_{a_{2} = a_{1} + 1}^{n - (k - 2)} p_{a_{2}} \right) \times \dots \times \left( \sum_{a_{k} = a_{k - 1} + 1}^{n} p_{a_{k}} \right) \right]

(1)

where $n$ denotes the number of types of initial image change, $k \in [1, n]$, $a_{k}$ represents the serial number of the operation, and $p_{a_{k}}$ is the number of images generated by that operation. The number of valid samples after augmentation reached 15,350, which were divided into a training set and a test set at the ratio of 7:3.
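One way to read Equation (1) is as nested sums over increasing operation indices $a_{1} < a_{2} < \dots < a_{k}$, that is, the sum over every non-empty combination of operations of the product of their image counts. The sketch below follows that reading, which is an assumption; the function name `total_augmented` and the counts passed to it are hypothetical, and the two constraints on combining augmentation forms are not applied, so it does not reproduce the 15,350 figure.

```python
from itertools import combinations
from math import prod

def total_augmented(p):
    """Sum, over every non-empty combination of operations, of the product
    of the per-operation image counts p[i] (a reading of Equation (1))."""
    n = len(p)
    total = 0
    for k in range(1, n + 1):              # number of operations combined
        for combo in combinations(p, k):   # indices a_1 < a_2 < ... < a_k
            total += prod(combo)
    return total
```

Under this reading the result equals $\prod_{i} (1 + p_{i}) - 1$, which gives a quick sanity check: for two operations producing 2 and 3 images, the total is 2 + 3 + 2 × 3 = 11.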

2.3. Identification of Slice Image Components

Mask R-CNN is the core algorithm for the identification of slice image components, and mainly includes three parts: the backbone, the region of interest (ROI) area, and segmentation and recognition.

(1): The backbone consists of ResNet101 [31] and a Feature Pyramid Network (FPN), as marked by ③ in Figure 1. ResNet101 extracts image features through a residual network to obtain the feature layers; FPN extracts the features and semantic values of each component in the feature layers by downsampling, and then generates the effective feature layers (P2, P3, P4, P5, P6) [32] by upsampling and fusion of feature layers to complete feature extraction, as shown in Figure 2.
(2): The ROI area is the area obtained as the region proposal network (RPN) slides over the effective feature layers and outputs the possible slice components [33], as marked by ④ in Figure 1. The RPN consists of a binary classification network and a regression network. The former detects whether a candidate area contains slice components by judging the intersection over union (IOU) value of region proposals, and the latter outputs the edge box of each component region proposal. The RPN structure is shown in Figure 3. If the region proposal detected by the binary classification network does not contain components, the edge box is invalid. Each ROI is finally adjusted to the same size by ROI Align.
(3): The segmentation and recognition part is composed of fully connected layers and a mask branch, as marked by ⑤ in Figure 1. The fully connected layers identify each component category with the ROI Align output as input, and regress and refine the edge box of each component; the mask branch applies a small fully convolutional network to generate pixel-level object masks for each component in the ROI Align output and complete the instance segmentation of components. The total loss function $L_{T}$ of Mask R-CNN is defined by Equation (2) [20], where $L_{cls}$ is the recognition loss, $L_{box}$ is the box regression loss, and $L_{mask}$ is the segmentation loss.

L_{T} = L_{c l s} + L_{b o x} + L_{m a s k}

(2)
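The IOU value that the RPN classification network judges in item (2) is the standard ratio of intersection area to union area of two boxes. A minimal sketch follows, assuming axis-aligned boxes given as (x1, y1, x2, y2) with x1 < x2 and y1 < y2; the function name `box_iou` is hypothetical and is not taken from any Mask R-CNN library.

```python
def box_iou(a, b):
    """Intersection over union of two axis-aligned boxes (x1, y1, x2, y2)."""
    # Corners of the overlapping rectangle, if any.
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0
```

For example, two 2 × 2 boxes offset by one pixel in each direction overlap in a 1 × 1 square, so their IOU is 1/7, well below the 60% value reported in Section 2.4.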

2.4. Mask R-CNN Algorithm Training

Transfer learning was used for algorithm pretraining to improve the learning efficiency and training speed of the Mask R-CNN algorithm. Transfer learning is a machine learning technique that applies the knowledge learned on one task to related scenarios to complete a new task [33]. In this paper, the weights trained on the COCO [34] data set were applied to the Mask R-CNN algorithm in three steps: first, the pretraining network structure is modified according to the component types; second, a small learning rate is applied to optimize the pretraining network; finally, the algorithm training is completed to realize the segmentation and recognition of each component. In algorithm training, the stochastic gradient descent method is adopted to update the training parameter $θ_{j}$ through the cost function $J(θ)$ to complete iterative training. The update of the training parameter $θ_{j}$ is defined by Equation (3) [31], where $α$ is a hyper-parameter representing the learning rate. The learning rate was finally set to $α = 10^{-5}$ through fine-tuning tests, and the algorithm training was completed with 100 learning epochs. In addition, to optimize the network performance, the IOU threshold was set to 60% and the batch size to 8. Training time was 34 h. Each loss in the iterative process is shown in Figure 4, from which it can be seen that the model tends to be stable from around the 60th epoch.

θ_{j + 1} = θ_{j} - α \frac{\partial}{\partial θ_{j}} J (θ_{j})

(3)
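The update rule in Equation (3) can be illustrated on a toy cost function. The sketch below is an illustration only: it uses the full gradient of $J(θ) = \sum θ^{2}$ rather than stochastic mini-batch gradients of the Mask R-CNN loss, and the names `sgd_step` and `grad_quadratic` are hypothetical.

```python
def sgd_step(theta, grad, alpha):
    """One update of Equation (3): theta <- theta - alpha * dJ/dtheta."""
    return [t - alpha * g for t, g in zip(theta, grad)]

def grad_quadratic(theta):
    """Gradient of the toy cost J(theta) = sum(t**2), i.e. 2 * t per parameter."""
    return [2.0 * t for t in theta]

# Three updates with a deliberately large learning rate so the effect is visible.
theta = [1.0, -2.0]
for _ in range(3):
    theta = sgd_step(theta, grad_quadratic(theta), alpha=0.1)
```

Each step here shrinks every parameter by a factor of 0.8. With a learning rate as small as the $10^{-5}$ used in the paper, each step changes the pretrained weights only slightly, which is the usual motivation for a small rate in fine-tuning.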

3. Experimental Scheme

An accuracy experiment, an execution speed experiment and a migration experiment were conducted in this paper. The experimental environment was as follows: CPU: Intel Xeon Silver 4210R; Memory: 64 GB; GPU: RTX 6000/8000; Operating system: Ubuntu 20.04.3; Experimental framework: Tensorflow-GPU 2.1.0. The comparison algorithm was YOLACT [35] with the self-labeling image augmentation mechanism added (denoted “SYL”).

3.1. Accuracy Experiment

The accuracy experiment included segmentation accuracy and recognition accuracy.

3.1.1. Segmentation Accuracy Experiment

The evaluation method of the segmentation accuracy experiment was the roughness analysis method for shape particles based on the improved Fourier particle profile proposed by Su et al. [36]. In this method, first, the particle profile is reconstructed with the Fourier series-based method; second, the elongation (EI), angularity (AI) and arithmetic average roughness (Ra) of the segmented profile are calculated based on the reconstruction; then, the results are compared with the manual calculation results to obtain the error; finally, the segmentation accuracy of the algorithm is judged by the error. The experimental data set was the test set of SZS original images, containing 15 images. The segmentation experiment was carried out by randomly selecting images from the experimental data set, and the segmentation accuracy was evaluated with the evaluation method.

Reconstruction of particle profiles with the Fourier series-based method. This process consists of circular parameterization and function representation. Circular parameterization maps the points on the edge of a segmented component onto a circle with the same perimeter: as shown in Figure 5, the points A, B and C on the original edge are mapped to A′, B′ and C′, and the distance between points remains constant during the mapping. Function representation applies Fourier series to express the Cartesian coordinates of each edge point as functions of the polar angle $φ$ ($0 \leq φ < 2π$), namely:

x (φ) = a_{x 0} + \sum_{n = 1}^{N} [a_{x n} \cos (n φ) + b_{x n} \sin (n φ)]

(4)

y (φ) = a_{y 0} + \sum_{n = 1}^{N} [a_{y n} \cos (n φ) + b_{y n} \sin (n φ)]

(5)

where $x$ and $y$ represent the coordinates of the boundary points; $n$ is the serial number of the harmonic; $N$ is the total number of harmonics; and $a_{x 0}$, $a_{y 0}$, $a_{x n}$, $a_{y n}$, $b_{x n}$, $b_{y n}$ are the Fourier coefficients, which are calculated from the coordinates of the sampled points on the aggregate particle boundary using Equations (4) and (5). For a given $N$, the total number of coefficients is $4N + 2$, and for a given number of sampled points $M$, the total number of coordinates is $2M$.
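Equations (4) and (5) can be illustrated with a small sketch for one coordinate. It assumes the $M$ sampled points are equally spaced in $φ$ (which the equal-distance circular parameterization suggests) and uses the standard discrete Fourier formulas for the coefficients, valid for $N < M/2$; the source does not state how the coefficients are actually fitted. The names `fourier_coefficients` and `evaluate_series` are hypothetical.

```python
import math

def fourier_coefficients(values, n_harmonics):
    """Coefficients (a0, [a_1..a_N], [b_1..b_N]) of one coordinate sampled at
    phi_i = 2*pi*i/M, i = 0..M-1 (uniform sampling is an assumption)."""
    m = len(values)
    a0 = sum(values) / m
    a, b = [], []
    for n in range(1, n_harmonics + 1):
        a.append(2.0 / m * sum(v * math.cos(n * 2.0 * math.pi * i / m)
                               for i, v in enumerate(values)))
        b.append(2.0 / m * sum(v * math.sin(n * 2.0 * math.pi * i / m)
                               for i, v in enumerate(values)))
    return a0, a, b

def evaluate_series(a0, a, b, phi):
    """Evaluate the truncated series of Equation (4) or (5) at angle phi."""
    return a0 + sum(a[n - 1] * math.cos(n * phi) + b[n - 1] * math.sin(n * phi)
                    for n in range(1, len(a) + 1))
```

Calling `fourier_coefficients` once on the x samples and once on the y samples yields $2 \times (2N + 1) = 4N + 2$ coefficients, matching the count stated above.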

EI computing [37]. The $EI$ value reflects the component profile of the tight oil slice image, and is computed as:

EI = \frac{D_{\min}}{D_{\max}}

(6)

where $D_{\min}$ and $D_{\max}$ represent the lengths of the short diameter and the long diameter, respectively, which are judged by the second-order tensor matrix of the normal vectors:

Ω_{i j} = \frac{1}{L_{p}} \int_{0}^{2 π} l^{φ} T_{i}^{φ} T_{j}^{φ} \, dφ

(7)

C = \begin{bmatrix} Ω_{11} & Ω_{12} \\ Ω_{21} & Ω_{22} \end{bmatrix} = \begin{bmatrix} \cos η & - \sin η \\ \sin η & \cos η \end{bmatrix} \begin{bmatrix} λ_{a} & 0 \\ 0 & λ_{b} \end{bmatrix} \begin{bmatrix} \cos η & \sin η \\ - \sin η & \cos η \end{bmatrix}

(8)

where $L_{p}$ denotes the profile perimeter of each component; $l^{φ}$ is the arc length corresponding to the polar angle $φ$; $T_{i}^{φ}$ and $T_{j}^{φ}$ denote the components in directions $i$ and $j$ of the unit normal vector of the micro-arc corresponding to the polar angle $φ$; $λ_{a}$ and $λ_{b}$ are the eigenvalues of matrix $C$, corresponding to the long and short diameter directions of $Ω_{i j}$; and $η$ represents the long diameter direction of each component. After determining the long and short diameter directions, $D_{\min}$ and $D_{\max}$ can be calculated and the $EI$ value can be determined.
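The decomposition in Equation (8) has a closed form for a symmetric 2 × 2 tensor. The sketch below recovers $λ_{a}$, $λ_{b}$ and $η$ from the entries of $C$, assuming $Ω_{12} = Ω_{21}$ and $λ_{a} \geq λ_{b}$; the function name `principal_axes` is hypothetical, and how $D_{\min}$ and $D_{\max}$ are then measured along these directions is not specified in the text, so it is left out.

```python
import math

def principal_axes(o11, o12, o22):
    """Eigenvalues (lam_a >= lam_b) and rotation angle eta of the symmetric
    tensor [[o11, o12], [o12, o22]], following the form of Equation (8)."""
    mean = 0.5 * (o11 + o22)
    radius = math.hypot(0.5 * (o11 - o22), o12)
    lam_a = mean + radius
    lam_b = mean - radius
    # From Equation (8): o11 - o22 = (lam_a - lam_b) * cos(2 * eta)
    #                    2 * o12   = (lam_a - lam_b) * sin(2 * eta)
    eta = 0.5 * math.atan2(2.0 * o12, o11 - o22)
    return lam_a, lam_b, eta
```

When the two eigenvalues are equal (an isotropic tensor, as for a circular profile), the direction $η$ is not unique; the function then returns 0.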

AI computing. Since the components in the tight oil sandstone slice image are non-circular particles, the original equation was discretized in this paper and expressed by Equation (9):

AI = \frac{1}{2 π} \sum_{i = 0}^{w - 1} | θ_{(i + 1) Δφ'} - θ_{i Δφ'} | - 1

(9)

where $x'(φ)$ and $y'(φ)$ represent the derivatives of $x(φ)$ and $y(φ)$, respectively; $θ$ is the angle measured from the X-axis; $w$ is the number of increments; and $Δφ = 2π / w$ denotes the increment of the polar coordinate angle.
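The structure of Equation (9), a sum of absolute direction changes divided by $2π$ minus 1, can be illustrated on a closed polygon. The sketch below is a simplification and not the authors' procedure: it takes $θ$ to be the direction of each polygon edge instead of evaluating it on the Fourier-reconstructed profile, and it wraps each angle difference into $(-π, π]$, which is an assumption. The function name `angularity_index` is hypothetical.

```python
import math

def angularity_index(points):
    """Sum of absolute changes in edge direction around a closed polygon,
    divided by 2*pi, minus 1 (a polygon analogue of Equation (9))."""
    m = len(points)
    angles = []
    for i in range(m):
        x0, y0 = points[i]
        x1, y1 = points[(i + 1) % m]
        angles.append(math.atan2(y1 - y0, x1 - x0))
    total = 0.0
    for i in range(m):
        d = angles[(i + 1) % m] - angles[i]
        d = (d + math.pi) % (2.0 * math.pi) - math.pi  # wrap the difference
        total += abs(d)
    return total / (2.0 * math.pi) - 1.0
```

For any convex outline the direction changes add up to exactly $2π$, so the index is 0; concave corners add extra turning and push the value above 0. An L-shaped hexagon, for instance, gives 0.5.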

Ra computing. Ra was obtained by comparing the real profile and the reconstructed profile of each component, as shown in Equations (10) and (11).

R_{a} = \frac{1}{L} \int_{0}^{L} r (l) \, dl

(10)

R_{a} = \frac{1}{L} \sum_{i = 1}^{m} r_{i} l_{i}

(11)

where $r$ represents the vertical deviation between the real profile and the reference profile of each component, $l$ is the line segment length of the reference profile, and $L$ is the total length of all line segments.
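The discrete form in Equation (11) is a length-weighted average of the deviations. A minimal sketch, assuming the deviations $r_{i}$ are supplied as non-negative magnitudes and that $L$ is the sum of the segment lengths $l_{i}$; the function name `average_roughness` is hypothetical.

```python
def average_roughness(deviations, lengths):
    """Equation (11): Ra = (1 / L) * sum(r_i * l_i), with L = sum(l_i)."""
    if len(deviations) != len(lengths):
        raise ValueError("deviations and lengths must have the same size")
    total_length = sum(lengths)
    return sum(r * l for r, l in zip(deviations, lengths)) / total_length
```

For two unit-length segments with deviations 1 and 2, the function returns 1.5.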

3.1.2. Recognition Accuracy Experiment

The evaluation method of the recognition accuracy experiment calculates the precision, recall, $F1$ score and accuracy of the recognition result of each component [38,39]. The experimental data set was the same as in the segmentation accuracy experiment: the SZS experimental data set was used for the recognition experiment, and the accuracy was evaluated by the evaluation method.

Precision. Precision refers to the ratio of the number of components correctly identified to the number of components identified, as shown in Equation (12), where $TP$ is the number of true positives and $FP$ is the number of false positives.

Precision = TP / (TP + FP)

(12)

Recall. Recall refers to the proportion of components of a true category that are correctly identified, as shown in Equation (13), where $FN$ is the number of false negatives.

Recall = TP / (TP + FN)

(13)

The $F1$ score was defined according to precision and recall, as shown in Equation (14).

F1 = 2 \times (Precision \times Recall) / (Precision + Recall)

(14)
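Equations (12) to (14) can be computed per component from a confusion matrix. The sketch below assumes the matrix is stored with rows as true classes and columns as predicted classes; the function name `per_class_metrics` and the example counts are hypothetical and are not the paper's data.

```python
def per_class_metrics(confusion, k):
    """Precision, recall and F1 for class k from a square confusion matrix
    with rows = true class and columns = predicted class."""
    n = len(confusion)
    tp = confusion[k][k]
    fp = sum(confusion[i][k] for i in range(n) if i != k)  # predicted k, truly other
    fn = sum(confusion[k][j] for j in range(n) if j != k)  # truly k, predicted other
    precision = tp / (tp + fp) if tp + fp > 0 else 0.0
    recall = tp / (tp + fn) if tp + fn > 0 else 0.0
    if precision + recall > 0:
        f1 = 2.0 * precision * recall / (precision + recall)
    else:
        f1 = 0.0
    return precision, recall, f1
```

With the made-up matrix `[[5, 1, 0], [2, 3, 1], [0, 0, 4]]`, class 1 has TP = 3, FP = 1 and FN = 3, giving a precision of 0.75, a recall of 0.5 and an F1 score of 0.6.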

3.2. Execution Speed Experiment

The evaluation method of the execution speed experiment measures the mean response time (MRT) of the two algorithms to judge how quickly each algorithm completes the identification task. The experimental data set consisted of 1000 images from the SZS test set. The selection of the data set is shown in Table 1. The number of images in the experimental data set was set as $N$, $N \in \{100, 200, 400, 600, 800, 1000\}$. The MRT of component identification completed by each algorithm was measured 5 times, and the average of the 5 experiments was taken as the final value.
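The measurement protocol described above, five timed runs averaged per data set size, can be sketched as follows. This is an illustration under assumptions: `mean_response_time` is a hypothetical helper, `task` stands in for whatever function runs identification on a batch of images, and the time is taken over the whole batch rather than per image.

```python
import time

def mean_response_time(task, n_images, repeats=5):
    """Average wall-clock time of task(n_images) over several repeated runs."""
    durations = []
    for _ in range(repeats):
        start = time.perf_counter()
        task(n_images)
        durations.append(time.perf_counter() - start)
    return sum(durations) / len(durations)

def dummy_identify(n_images):
    """Hypothetical stand-in for the identification step; does trivial work."""
    total = 0
    for i in range(n_images):
        total += i * i
    return total
```

A call such as `mean_response_time(dummy_identify, 1000)` returns the average time in seconds; repeating it for each value of $N$ would give one curve of the kind compared between the two algorithms.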

3.3. Migration Experiment

The evaluation method of the migration experiment calculates the precision and accuracy of the two methods in different experimental regions. The SZS and OB original test sets were the experimental data sets: the SZS test set contained 15 images, and the OB test data set was divided into three groups of 15 images each, 45 in total. Different experimental data sets were selected, and the number of images was increased from 3 to 15 in steps of 3 to calculate the precision and accuracy.

4. Experimental Results and Discussion

4.1. Accuracy Experiment Results

4.1.1. Segmentation Accuracy Experiment Results

The calculation results of some EI, AI and Ra values in the segmentation accuracy experiment are shown in Figure 6. The values on the left outside the brackets are the SMR calculation results, the values on the right outside the brackets are the calculation results of SYL, and the values in the brackets are the results manually calculated by geologists. The error distribution results are shown in Figure 7.

The following conclusions can be drawn by analyzing Figure 6 and Figure 7:

Conclusion 1: The SMR method had relatively stable error calculation results compared with the SYL method, and the error of SMR and manual calculation results was within 10%, indicating that SMR method has higher segmentation accuracy for each component.

Conclusion 2: The SMR method had the lowest segmentation accuracy error for lithic fragments, probably because lithic fragments were the most numerous component.

4.1.2. Accuracy Experiment Results

To verify the accuracy of the recognition results, the confusion matrix of the recognition results of the two algorithms was established, as shown in Figure 8, where the X-axis represents the predicted value and the Y-axis represents the true value. The precision and recall of two algorithms for each component are shown in Table 2 (The green highlights indicate better experimental results); the F1 score and the accuracy under different types of components are shown in Figure 9.

The following conclusions can be drawn by analyzing the experimental results:

Conclusion 1: Figure 8 shows that some quartz and lithic samples were identified as feldspar, and some PPs and CDPs were identified as CP. There are three main causes: (1) quartz and lithic samples outnumber feldspar samples, and PP and CDP samples outnumber CP samples; (2) under single polarized light, quartz, lithic and feldspar have similar optical characteristics, as do PP, CDP and CP; (3) there may be mislabeling in the data labeling. In addition, a certain number of lithic samples were identified as quartz, and some CDPs were identified as PP. However, this had little impact on the recognition of quartz and PP because of the large number of quartz and PP samples.

Conclusion 2: According to Table 2, the two algorithms had the same precision for microcrack, while the SMR algorithm had higher precision and recall for other components than the SYL algorithm, indicating that the SMR algorithm had better accuracy. However, both algorithms had low precision for feldspar and CP, because the number of such components was relatively small and they had no obvious features.

Conclusion 3: According to Figure 9a, quartz, lithic, PP and CDP can be easily recognized with the two algorithms, but their recognition effect on feldspar, CP and microcrack was relatively poor, due to the influence of the number of components. There were few microcracks, but they have obvious characteristics, so they were not easily affected by other components, and can be better recognized than feldspar and CP. In addition, the F1 score of the SMR algorithm for each component was higher than that of the SYL algorithm.

Conclusion 4: Figure 9b indicates that the recognition accuracy of both algorithms was above 75% and relatively stable, while that of the SMR algorithm was above 88% overall, suggesting that the SMR algorithm has a higher recognition accuracy.

4.2. Execution Speed Experiment Results

The execution speed experiment results of the two algorithms are shown in Table 3 (the green highlights indicate better experimental results). The growth trend of the MRT value with experimental data size is shown in Figure 10, where the X-axis represents the number of slices and the Y-axis represents the time required.

The following conclusions can be drawn by analyzing the experimental results:

Conclusion 1: Table 3 shows that the MRT of the SMR algorithm was smaller than that of the SYL algorithm, indicating that the SMR algorithm has a higher execution speed. In addition, when the experimental data set was 1000, the MRT of the SMR algorithm was 95.34 seconds, showing that the operating efficiency of the SMR algorithm can meet the actual needs of image segmentation and recognition of tight oil sandstone slices.

Conclusion 2: According to Figure 10, the MRT of the SYL algorithm was relatively stable for small data sets, but increased significantly when the experimental data set was large (>600 in the experiment), suggesting that the algorithm is easily affected by the data set size. The MRT of the SMR algorithm grew approximately linearly with the number of experimental data sets, indicating that the execution speed of the algorithm is less affected by the data set size.

4.3. Migration Experiment Results

Figure 11a–h shows the migration experimental results of the two algorithms, and the effects of the control experimental data set are shown in Figure 11a,b. EI, AI and Ra are indexed on the left Y-axis, while precision, recall, F1 and accuracy on the right Y-axis.

The following conclusions can be drawn by analyzing the migration experiment results:

Conclusion 1: Figure 11 shows that, because different sedimentary environments led to different types and contents of slice components in the OB region, the segmentation accuracy and recognition accuracy of the two algorithms were lower than in the SZS region. The segmentation accuracy and recognition accuracy of the SMR algorithm were higher than those of the SYL algorithm, indicating that the SMR algorithm has better migration.

Conclusion 2: According to Figure 11c–h, the accuracy error of the SMR algorithm was within 10, with precision of 0.5–0.7, recall of 0.7–0.85, F1 of 0.6–0.75, accuracy of 0.75–0.9, higher than those of the SYL algorithm. The lower precision and F1 results may be caused by the large number of lithic samples, and misjudgment directly affects the calculation results of quartz and feldspar. Better overall results can be obtained with the SMR algorithm, demonstrating that the SMR algorithm has good migration and applies to the segmentation and recognition of other slice components.

Conclusion 3: From Figure 11c,e,g, the results of various indicators of the SMR algorithm were relatively stable under different experimental data sets, indicating that the SMR method has better migration.

5. Conclusions

Characteristic identification of rock thin slices of tight oil reservoirs is the core of analyzing the characteristics of microscopic pore-throat structures, and is quite important in reservoir-sweet spot prediction and microscopic evaluation. In order to solve the problem of low accuracy caused by traditional methods due to algorithm design flaws, susceptibility to noise interference, complex slice image structure and sparse sample sizes, the SMR method was proposed in this paper. Through theoretical elaboration and experimental demonstration, the research conclusions of the SMR method are summarized below:

(1): Image preprocessing can improve image quality and avoid noise interference;
(2): The self-labeling image data augmentation mechanism can increase the number of samples and ensure the availability of samples;
(3): Image segmentation and recognition can be simultaneously realized with the improved Mask R-CNN algorithm. The error of segmentation accuracy and manual calculation results was within 10%, and the overall recognition accuracy was 93.18%, so it can be applied to characteristic identification of rock thin slices of tight oil reservoirs.

The key work of this paper includes the augmentation and feature identification of dense rock images. Therefore, it can be applied to the amplification and identification of dense images in related fields such as medical cells and physical molecules. In addition, the SMR method can be improved and applied to the fields of fault identification, thin-section pore-throat feature analysis, and reservoir simulation evaluation.

In subsequent work, we will optimize the image augmentation process for rocks in tight oil reservoirs using a generative adversarial network (GAN), so as to further ensure the availability of incremental samples. At the same time, we will consider using the AlexNet proposed in [40] as the backbone network, combined with the adaptive idea of [41], to improve the Mask R-CNN algorithm, optimize the network structure design, improve the identification effect, and apply it to pore-throat feature analysis to evaluate the potential of the method.

Author Contributions

Conceptualization, T.L. and C.L.; methodology, Z.L. (Zongbao Liu), K.Z. and F.L.; software, D.L. and Z.L. (Zhigang Liu); validation, T.L. and L.L.; formal analysis, Z.L. (Zongbao Liu), C.L. and J.H.; investigation, T.L. and D.L.; resources, Z.L. (Zongbao Liu); data curation, T.L.; writing—original draft preparation, T.L. and Z.L. (Zongbao Liu); writing—review and editing, T.L. and K.Z.; visualization, T.L. and D.L.; supervision, T.L. and F.L.; project administration, C.L., K.Z. and Y.Z.; funding acquisition, Z.L. (Zongbao Liu), K.Z. and Z.L. (Zhigang Liu). All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the National Natural Science Foundation of China (Project No. 42172161), the CNPC Innovation Foundation (Project No. 2020D-5007-0102), the Heilongjiang Provincial Natural Science Foundation of China (Project Nos. YQ2020D001 and LH2020F003) and the Heilongjiang Provincial Department of Education Project of China (Project No. UNPYSCT-2020144).

Data Availability Statement

The migration experiment data set is derived from https://www.scidb.cn/en/detail?dataSetId=727601552654598144 (accessed on 24 March 2022).

Conflicts of Interest

The authors declare no conflict of interest.

References

1. Cheng, K.; Wu, W.; Holditch, S.A.; Ayers, W.B.; Mcvay, D.A. Assessment of the distribution of technically-recoverable resources in North American Basins. In Proceedings of the Canadian Unconventional Resources and International Petroleum Conference, Calgary, AB, Canada, 19–21 October 2010; pp. 1–11.
2. Vidas, H.; Hugman, B. Availability, Economics, and Production Potential of North American Unconventional Natural Gas Supplies; The INGAA Foundation, Inc.: Fairfax, VA, USA, 2008; pp. 11–12.
3. Zhao, J. Conception, Classification and Resource Potential of Unconventional Hydrocarbons. Nat. Gas Geosci. 2012, 23, 393–406.
4. Cheng, G.J.; Yang, J.; Huang, Q.Z.; Liu, Y. Rock image classification recognition based on probabilistic neural networks. Sci. Technol. Eng. 2013, 13, 9231–9235.
5. Zhou, Y.; Ji, Y.; Xu, L.; Che, S.; Niu, X.; Wan, L.; Zhou, Y.; Li, Z.; You, Y. Controls on reservoir heterogeneity of tight sand oil reservoirs in Upper Triassic Yanchang Formation in Longdong Area, southwest Ordos Basin, China: Implications for reservoir quality prediction and oil accumulation. Mar. Pet. Geol. 2016, 78, 110–135.
6. Wang, M.; Yang, Z.; Shui, C.; Yu, Z.; Wang, Z.; Cheng, Y. Diagenesis and its influence on reservoir quality and oil-water relative permeability: A case study in the Yanchang Formation Chang 8 tight sandstone oil reservoir, Ordos Basin, China. Open Geosci. 2019, 11, 37–47.
7. Cai, Y.H.; Teng, Q.Z.; Tu, C.Y. Automatic extraction of pores in thin slice images of rock castings based on deep learning. Sci. Technol. Eng. 2021, 28, 296–304.
8. Zhou, H.; Zhang, C.L.; Zhang, X.; Chen, Q.X.; Zhang, Y.; Zhong, C.C. Edge Extraction and Particle Segmentation Based on Coherent Features of Rock Slice Sequence Images. J. Jilin Univ. (Earth Sci. Ed.) 2021, 51, 1897–1907.
9. Jiang, F.; Gu, Q.; Hao, H.Z.; Li, N.; Hu, X.M. Grain segmentation of sandstone thin section images based on semantic feature extraction. Sci. Sin. Inf. 2020, 50, 109–127.
10. Jiang, F.; Gu, Q.; Hao, H.Z.; Li, N.; Chen, D.X. Survey on Content-Based Image Segmentation Methods. J. Softw. 2017, 28, 160–183.
11. Budennyy, S.; Pachezhertsev, A.; Bukharev, A.; Erofeev, A.; Mitrushkin, D.; Belozerov, B. Image processing and machine learning approaches for petrographic thin section analysis. In Proceedings of the SPE Russian Petroleum Technology Conference, Moscow, Russia, 16–18 October 2017; OnePetro: Richardson, TX, USA, 2017.
12. Asmussen, P.; Conrad, O.; Günther, A.; Kirsch, M.; Riller, U. Semi-automatic segmentation of petrographic thin section images using a “seeded-region growing algorithm” with an application to characterize wheathered subarkose sandstone. Comput. Geosci. 2015, 83, 89–99.
13. Huang, P.; Zheng, Q.; Liang, C. Overview of Image Segmentation Methods. J. Wuhan Univ. (Nat. Sci. Ed.) 2020, 66, 519–531.
14. Yang, Y.; Liu, N.; Cheng, G.; Qiang, X.; Wang, X. Clustering Analysis of Rock Images Based on Spark Platform. J. Xi’an Shiyou Univ. (Nat. Sci. Ed.) 2016, 31, 114–118.
15. Jiang, F.; Gu, Q.; Hao, H.; Li, N.; Wang, B.; Hu, X. A method for automatic grain segmentation of multi-angle cross-polarized microscopic images of sandstone. Comput. Geosci. 2018, 115, 143–153.
16. Saxena, N.; Day-Stirrat, R.J.; Hows, A.; Hofmann, R. Application of deep learning for semantic segmentation of sandstone thin sections. Comput. Geosci. 2021, 152, 104778.
17. Ghiasi-Freez, J.; Soleimanpour, I.; Kadkhodaie-Ilkhchi, A.; Ziaii, M.; Sedighi, M.; Hatampour, A. Semi-automated porosity identification from thin section images using image analysis and intelligent discriminant classifiers. Comput. Geosci. 2012, 45, 36–45.
18. Izadi, H.; Sadri, J.; Mehran, N.A. Intelligent mineral identification using clustering and artificial neural networks techniques. In Proceedings of the 2013 First Iranian Conference on Pattern Recognition and Image Analysis (PRIA), Birjand, Iran, 6–8 March 2013; pp. 1–5.
19. Li, N.; Gu, Q.; Jiang, F.; Hao, H.Z.; Yu, H.; Ni, C. Feature Representation Method of Microscopic Sandstone Images Based on Convolutional Neural Network. J. Softw. 2020, 31, 3621–3639.
20. He, K.; Gkioxari, G.; Dollar, P.; Girshick, R. Mask R-CNN. In Proceedings of the 2017 IEEE International Conference on Computer Vision (ICCV), Venice, Italy, 25 December 2017; pp. 2980–2988.
21. Johnson, J.W. Automatic nucleus segmentation with Mask-RCNN. In Proceedings of the Science and Information Conference, Las Vegas, NV, USA, 2–3 May 2019; Springer: Cham, Switzerland, 2019; pp. 399–407.
22. Bloice, M.; Stocker, C.; Holzinger, A. Augmentor: An image augmentation library for machine learning. arXiv 2017, arXiv:1708.04680.
23. Buades, A.; Coll, B.; Morel, J.M. Non-Local Means Denoising. Image Processing Line 2011, 1, 208–212.
24. Chachada, S.; Oh, B.T.; Cho, N.; Phong, S.A.; Manchala, D.; Kuo, C.C.J. Extension of Non-Local Means (NLM) algorithm with Gaussian filtering for highly noisy images. In Proceedings of the 2011 Visual Communications and Image Processing (VCIP), Tainan, China, 6–9 November 2011; pp. 1–4.
25. Russell, B.C.; Torralba, A.; Murphy, K.P.; Freeman, W.T. LabelMe: A database and web-based tool for image annotation. Int. J. Comput. Vis. 2008, 77, 157–173.
26. Krizhevsky, A.; Sutskever, I.; Hinton, G.E. Imagenet classification with deep convolutional neural networks. Adv. Neural Inf. Process. Syst. 2012, 25, 1097–1105.
27. Shorten, C.; Khoshgoftaar, T.M. A survey on Image Data Augmentation for Deep Learning. J. Big Data 2019, 6, 60.
28. Ma, Y.; Tang, P.; Zhao, L.; Zhang, Z. Review of data augmentation for image in deep learning. J. Image Graph. 2021, 26, 487–502.
29. Moreno-Barea, F.J.; Strazzera, F.; Jerez, J.M.; Urda, D.; Franco, L. Forward noise adjustment scheme for data augmentation. In Proceedings of the 2018 IEEE Symposium Series on Computational Intelligence (SSCI), Bangalore, India, 18–21 November 2018; pp. 728–734.
30. Inoue, H. Data augmentation by pairing samples for images classification. arXiv 2018, arXiv:1801.02929.
31. Burke, C.J.; Aleo, P.D.; Chen, Y.C.; Liu, X.; Peterson, J.R.; Sembroski, G.H.; Lin, J.Y.Y. Deblending and classifying astronomical sources with Mask R-CNN deep learning. Mon. Not. R. Astron. Soc. 2019, 490, 3952–3965.
32. Qin, J.; Zhang, Y.; Zhou, H.; Yu, F.; Sun, B.; Wang, Q. Protein Crystal Instance Segmentation Based on Mask R-CNN. Crystals 2021, 11, 157.
33. Tan, C.; Sun, F.; Kong, T.; Zhang, W.; Liu, C. A Survey on Deep Transfer Learning. In Proceedings of the 27th International Conference on Artificial Neural Networks, Rhodes, Greece, 4–7 October 2018; pp. 270–279.
34. Lin, T.Y.; Maire, M.; Belongie, S.; Hays, J.; Zitnick, C.L. Microsoft Coco: Common Objects in Context; Springer International Publishing: Berlin/Heidelberg, Germany, 2014; pp. 740–755.
35. Bolya, D.; Zhou, C.; Xiao, F.; Lee, Y.J. YOLACT: Real-time instance segmentation. arXiv 2019, arXiv:1904.02689.
36. Su, D.; Wang, X.; Yang, H.W.; Hong, C. Roughness analysis of general-shape particles, from 2D closed outlines to 3D closed surfaces. Powder Technol. 2019, 356, 423–438.
37. Su, D.; Yan, W.M. Quantification of angularity of general-shape particles by using Fourier series and a gradient-based approach. Constr. Build. Mater. 2018, 161, 547–554.
38. Dietler, N.; Minder, M.; Gligorovski, V.; Economou, A.M.; Joly, D.A.H.L.; Sadeghi, A.; Rahi, S.J. A convolutional neural network segments yeast microscopy images with high accuracy. Nat. Commun. 2020, 11, 5723.
39. Kruitbosch, H.T.; Mzayek, Y.; Omlor, S.; Guerra, P.; Milias-Argeitis, A. A convolutional neural network for segmentation of yeast cells without manual training annotations. Bioinformatics 2022, 38, 1427–1433.
40. Zhang, T.; Li, Y.; Chen, Y.; Feng, X.; Zhu, X.; Chen, Z.; Yao, J.; Zheng, Y.; Cai, J.; Song, H.; et al. Review on space energy. Appl. Energy 2021, 292, 116896.
41. Zhang, T.; Li, Y.; Li, Y.; Sun, S.; Gao, X. A self-adaptive deep learning algorithm for accelerating multi-component flash calculation. Comput. Methods Appl. Mech. Eng. 2020, 369, 113207.

Figure 1. SMR Method Process.

Figure 2. Feature pyramid network FPN.

Figure 3. Region proposal network.

Figure 4. Iterative process loss.

Figure 5. Circular parameterization process: (a) the original position of each component edge point; (b) the position of each edge point after circular parameterization.

Figure 6. EI, AI and Ra calculation results of each component profile of tight sandstone thin section images.

Figure 7. Error statistics of the EI, AI and Ra values calculated by the two algorithms for each component contour of the slice images, relative to the manual calculation results.

Figure 8. Confusion matrix of the two algorithms: (a) SYL confusion matrix, (b) SMR confusion matrix.

Figure 9. F1 scores for both algorithms and accuracy for different components: (a) F1 scores for both algorithms, (b) accuracy of different components of two algorithms.

Figure 10. Change trend of MRT with data volume.

Figure 11. The comparison chart of the migration experiment results of the two algorithms: (a) represents the experimental results of SMR on the SZS dataset; (b) represents the experimental results of SYL on the SZS dataset; (c,e,g) represent the experimental results of SMR on OB1, OB2 and OB3 datasets; (d,f,h) represent the experimental results of SYL on OB1, OB2 and OB3 datasets.

Table 1. Efficiency experiment data set selection.

Amount of Data	Number of Original Images	Number of Augmented Images
100	2	98
200	4	196
400	6	394
600	8	592
800	10	790
1000	12	988

Table 2. Precision and recall of the two algorithms for each thin-section component.

Metric	Algorithm	Quartz	Feldspar	Lithic	PP	CP	CDP	Microcrack
Precision	SYL	0.61	0.22	0.98	0.76	0.16	0.96	0.50
Precision	SMR	0.78	0.46	0.99	0.84	0.38	0.98	0.50
Recall	SYL	0.83	0.74	0.86	0.80	0.68	0.82	0.50
Recall	SMR	0.92	0.88	0.94	0.89	0.84	0.91	0.75
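The precision and recall values in Table 2 can be combined into per-component F1 scores. The sketch below uses the standard harmonic-mean definition, F1 = 2PR / (P + R); whether the F1 scores in Figure 9 were derived in exactly this way (for example, from rounded table values or from raw counts) is an assumption. The names `f1_score` and `f1_table` are illustrative only.

```python
COMPONENTS = ["Quartz", "Feldspar", "Lithic", "PP", "CP", "CDP", "Microcrack"]

# Precision and recall values as listed in Table 2.
PRECISION = {
    "SYL": [0.61, 0.22, 0.98, 0.76, 0.16, 0.96, 0.50],
    "SMR": [0.78, 0.46, 0.99, 0.84, 0.38, 0.98, 0.50],
}
RECALL = {
    "SYL": [0.83, 0.74, 0.86, 0.80, 0.68, 0.82, 0.50],
    "SMR": [0.92, 0.88, 0.94, 0.89, 0.84, 0.91, 0.75],
}


def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (0.0 when both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def f1_table() -> dict:
    """Return {algorithm: {component: F1}} computed from the Table 2 values."""
    return {
        algorithm: {
            name: f1_score(p, r)
            for name, p, r in zip(COMPONENTS, PRECISION[algorithm], RECALL[algorithm])
        }
        for algorithm in PRECISION
    }


scores = f1_table()
for name in COMPONENTS:
    print(f"{name:<11} SYL={scores['SYL'][name]:.3f}  SMR={scores['SMR'][name]:.3f}")
```

Because F1 increases with both precision and recall, and SMR is at least as high as SYL on both metrics for every component in Table 2, the computed SMR score is never lower than the SYL score.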

Table 3. MRT values for different data sets.

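Experimental Algorithm	Number of Test Sets (N)	MRT (s)
SYL Algorithm	100	9.08
SYL Algorithm	200	19.06
SYL Algorithm	400	40.30
SYL Algorithm	600	63.58
SYL Algorithm	800	91.03
SYL Algorithm	1000	121.32
SMR Algorithm	100	9.08
SMR Algorithm	200	18.86
SMR Algorithm	400	36.68
SMR Algorithm	600	55.04
SMR Algorithm	800	74.12
SMR Algorithm	1000	95.34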
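The trend in Table 3 can be made explicit by computing, for each test-set size, the relative reduction in MRT of SMR with respect to SYL and the MRT per test item. This is a minimal sketch: it assumes that MRT is a running time in seconds for the whole test set of size N, and the function names are illustrative only.

```python
TEST_SET_SIZES = [100, 200, 400, 600, 800, 1000]

# MRT values in seconds, as listed in Table 3.
MRT_SECONDS = {
    "SYL": [9.08, 19.06, 40.30, 63.58, 91.03, 121.32],
    "SMR": [9.08, 18.86, 36.68, 55.04, 74.12, 95.34],
}


def time_reduction(baseline: float, improved: float) -> float:
    """Fraction of the baseline time saved by the improved method."""
    return (baseline - improved) / baseline


def compare() -> list:
    """One row per test-set size: size, SMR time per item, reduction versus SYL."""
    rows = []
    for n, syl, smr in zip(TEST_SET_SIZES, MRT_SECONDS["SYL"], MRT_SECONDS["SMR"]):
        rows.append({
            "n": n,
            "smr_seconds_per_item": smr / n,
            "reduction": time_reduction(syl, smr),
        })
    return rows


for row in compare():
    print(f"N={row['n']:>4}  SMR s/item={row['smr_seconds_per_item']:.4f}  "
          f"reduction vs SYL={row['reduction']:.1%}")
```

On the Table 3 values, the two algorithms take the same time at N = 100, and the reduction grows with the data volume, reaching about 21% at N = 1000.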

Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

© 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

