## **1. Introduction**

Faults play a major role in the lateral sealing of thin reservoirs and in the accumulation of remaining oil in conventional and unconventional reservoirs onshore in China [1]. Almost all developed onshore oil and gas fields in China are located in rift basins, which are rich in oil and gas resources and have highly developed, very complex fault systems [2–4]. Many fault recognition techniques with different principles exist at present, yet fine fault imaging remains very difficult. This is because rift basins experienced a variety of external forces during their growth and developed many fault types, such as normal faults, normal oblique-slip faults, oblique faults, and strike–slip faults. Depending on how these faults combine in plan and cross-section, they also present many patterns in plan view, such as broom-shaped, comb-shaped, en-echelon (goose-row), and parallel-interlaced arrangements. In rift basins, the filling of sediments; the development and distribution of sedimentary sequences; and the formation, distribution, and evolution of oil and gas reservoirs (including the formation and effectiveness of traps and the migration and accumulation of oil and gas) are closely related to the distribution and activity of faults [5]. Therefore, the fine detection and characterization of faults in China's rift basins has become a key basic geological problem for oil and gas exploration and development and a key topic of basin tectonic research [6].

**Citation:** Wu, J.; Shi, Y.; Wang, W. Fault Imaging of Seismic Data Based on a Modified U-Net with Dilated Convolution. *Appl. Sci.* **2022**, *12*, 2451. https://doi.org/10.3390/app12052451

Academic Editor: Chiara Bedon

Received: 5 January 2022; Accepted: 23 February 2022; Published: 26 February 2022

In seismic imaging data, faults appear as continuous, regular breakpoints of reflection events. However, because the accuracy, resolution, and signal-to-noise ratio of seismic imaging data cannot reach the theoretical limit and the geological situation is complicated, describing the spatial distribution of faults from seismic data is a great challenge for petroleum engineers. In the past, fault characterization was regarded as an interpretive task that followed seismic data processing and imaging, because it required extensive geological knowledge and experience. In recent years, researchers have used convolutional neural networks to identify faults, focusing on the construction of the network architecture, the debugging and optimization of network parameters, and model training. These approaches are less and less constrained by geological knowledge and personal experience, while the processing and mining of the seismic data themselves are becoming more and more important. It is therefore reasonable to place deep-learning-based fault identification within the research field of seismic data processing and imaging, and this is also the development trend for the future. Based on this concept, our research employs seismic imaging data and a new neural network model to describe fault characteristics, that is, to realize fault imaging.

Over the past 30 years, with the continuous development of computer hardware and software, fault identification has made great progress in efficiency and accuracy. In terms of method evolution, fault interpretation has progressed from initial manual interpretation to a variety of automatic identification methods, such as the coherence method, the curvature attribute method, and the ant colony algorithm, which describe faults by computing lateral discontinuities in seismic data. In the past five years, with the rapid development of artificial intelligence [7–9], fault identification methods based on deep learning have achieved remarkable results. In 2014, Zheng et al. [10] used deep learning tools to conduct fault identification tests on prestack synthetic seismic records. Araya-Polo et al. [11] applied machine learning and deep neural network algorithms to automate fault recognition, which greatly improved the efficiency and stability of fault interpretation. Waldeland et al. [12] and Guitton et al. [13] successively used convolutional neural network (CNN) models to make progress in fault description. Xiong et al. [14] used the results of a skeletonized seismic-coherence self-correction method as training samples to train a CNN model to identify seismic faults. In 2019, Wu et al. [15] identified small-scale faults using a U-Net. Wu et al. [6] used a fully convolutional network (FCN) to achieve a better characterization of faults. Among these networks, the U-Net architecture, which can be seen as a special kind of CNN [16–18], is currently very popular because its shortcut (skip) connections concatenate low-level (shallow-layer) feature maps with high-level (deep-layer) feature maps. In addition, the U-Net does not place strict requirements on the size of the training set, and smaller training sets can also yield satisfactory results.
However, most networks, including the U-Net, process all feature maps of the same layer uniformly, so the receptive field within a layer is identical and the local information obtained is relatively monotonous. Moreover, as the network repeatedly down-samples and applies strided convolutions, the defect of obtaining only single-scale information within each layer becomes more and more pronounced, leading to inaccurate fault identification by the neural network.

To address these issues, this paper introduces a new neural network model that takes the U-Net as its basic network, uses an inter-group channel dilated convolution module (GCM) at each cross-connection between the encoding path and the decoding path, and uses an inter-group spatial dilated convolution module (GSM) after each deconvolution layer in the decoding path. Both GCM and GSM use dilated convolution. Dilated convolution, also known as atrous (hole) convolution or expanded convolution, injects holes into the standard convolution kernel to enlarge the receptive field of the model. In a CNN, most layers are driven by convolution and pooling: convolution performs feature extraction, while pooling performs feature aggregation. For an image classification task, this structure has good feature extraction capability; the most classical example is VGGNet, a convolutional neural network developed in 2014 by the University of Oxford's Visual Geometry Group together with Google DeepMind. However, this structure poses problems for target detection and image segmentation. For example, the size of the receptive field is very important in these tasks, and the receptive field is usually guaranteed by down-sampling, yet down-sampling makes small targets difficult to detect. If down-sampling is omitted and only the number of convolutional layers is increased, the computation of the network will increase. In addition, if the features are not pooled, the final feature extraction will also suffer and the receptive field will not change. To solve these problems in CNNs, this paper introduces dilated convolution, which enlarges the receptive field without sacrificing the size of the feature map. Compared with the conventional convolution operation, dilated convolution adds a dilation rate, which refers to the spacing between the sampling points of the convolution kernel [19–21]. When the dilation rate is 1, dilated convolution degenerates into conventional convolution. Dilated convolution resembles conventional convolution in that the kernel has the same number of taps, so the number of parameters of the neural network remains unchanged [22,23]. The difference is that dilated convolution has a larger receptive field and can preserve the structure of the internal data [24,25].
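As a concrete illustration (a minimal NumPy sketch of our own, not the authors' implementation): a 3 × 3 kernel with dilation rate *d* can be materialized as an equivalent sparse kernel of size 3 + 2(*d* − 1), with the original 9 parameters unchanged and zeros inserted in between.

```python
import numpy as np

def dilate_kernel(k, rate):
    """Insert (rate - 1) zeros between the taps of a 2-D kernel.

    The number of trainable parameters is unchanged; only the
    spatial footprint of the kernel grows.
    """
    if rate == 1:
        return k.copy()  # degenerates to the conventional kernel
    kh, kw = k.shape
    eff_h = kh + (kh - 1) * (rate - 1)
    eff_w = kw + (kw - 1) * (rate - 1)
    out = np.zeros((eff_h, eff_w), dtype=k.dtype)
    out[::rate, ::rate] = k  # original taps land on a sparse grid
    return out

k = np.arange(1.0, 10.0).reshape(3, 3)
print(dilate_kernel(k, 2).shape)  # (5, 5): still only 9 nonzero taps
```

With rate 1 the footprint stays 3 × 3; with rate 4 it grows to 9 × 9, while the parameter count never changes.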

## **2. Illustration of Dilated Convolution**

We use a set of pictures to illustrate the principle of dilated convolution. Figure 1a shows the conventional 3 × 3 convolution kernel; a dilated convolution is obtained by adding intervals to this basic kernel. Figure 1b shows a 3 × 3 kernel with a dilation rate of 2, i.e., one skipped position between taps, so a single such kernel covers a 5 × 5 image block in which only 9 points carry parameters and the rest are 0: convolution is computed only for those 9 points against the corresponding pixels of the feature map, and the other positions are skipped. Stacked on top of the convolution of Figure 1a, each output element then has a 7 × 7 receptive field. Figure 1c is analogous with a dilation rate of 4: the kernel covers a 9 × 9 block, and the stacked receptive field grows to 15 × 15. As the dilation rate increases, the receptive field naturally becomes larger.

**Figure 1.** The dilated convolution with dilation rate of 1, 2 and 4, respectively. (**a**) dilation rate = 1; (**b**) dilation rate = 2; (**c**) dilation rate = 4.
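The receptive-field numbers quoted for Figure 1 follow the stacked-layer convention: each additional 3 × 3 layer with dilation rate *d* widens the receptive field of the stack by (3 − 1) · *d*. A short bookkeeping sketch of our own, using the rates 1, 2, 4 from the figure:

```python
def stacked_receptive_field(kernel=3, rates=(1, 2, 4)):
    """Receptive field after successively applying dilated convolutions."""
    rf = 1  # a single pixel before any convolution
    fields = []
    for d in rates:
        rf += (kernel - 1) * d  # each layer widens the field by (k-1)*d
        fields.append(rf)
    return fields

print(stacked_receptive_field())  # [3, 7, 15], matching Figure 1a-c
```

Note that the receptive field grows exponentially with depth when the rates double, while the parameter count grows only linearly.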

In practical applications, when the same dilation rate is used for all convolutional layers, a problem called the gridding effect appears: the points on the feature map that participate in the convolution become discontinuous. For example, this problem occurs if we repeatedly stack 3 × 3 kernels with a dilation rate of 2.

The blue squares in Figure 2 are the points participating in the convolution, and the depth of the color represents how many times each point is used. As can be seen, because the dilation rates of the three convolutions are identical, the calculation points expand outward in a grid pattern while some points never participate at all. Such kernel discontinuity, i.e., not all pixels being used in the calculation, causes a loss of information continuity, which is very detrimental to pixel-level dense prediction tasks. One remedy is to avoid using the same dilation rate in consecutive dilated convolutions, but this alone is not sufficient: if the rates share a common factor, such as 2, 4, 8, the problem persists. The best approach is to set the dilation rates of consecutive dilated convolutions in a "jagged" (sawtooth) pattern, such as 1, 2, 3; the distribution of calculation points then becomes like Figure 3, without discontinuities.

**Figure 2.** The dilated convolution with the same dilation rate of 2, respectively. There are grid effects in all three graphs. (**a**) the first convolution with dilation rate of 2; (**b**) the second convolution with dilation rate of 2; (**c**) the third convolution with dilation rate of 2.

**Figure 3.** The dilated convolution with dilation rate of 1, 2 and 3, respectively. (**a**) dilation rate = 1; (**b**) dilation rate = 2; (**c**) dilation rate = 3.
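The gridding argument can be checked numerically. Taking a 1-D cross-section, each 3-tap layer with rate *d* samples offsets {−*d*, 0, +*d*}, and the input positions reachable by a stack are all sums of one tap per layer. A small sketch of our own (not from the paper) shows that rates (2, 2, 2) reach only even offsets, while the jagged rates (1, 2, 3) leave no holes:

```python
from itertools import product

def reachable_offsets(rates):
    """1-D input offsets touched by stacked 3-tap dilated convolutions."""
    offsets = set()
    for taps in product(*[(-d, 0, d) for d in rates]):
        offsets.add(sum(taps))  # one tap chosen per layer
    return offsets

span = set(range(-6, 7))  # maximal reach: total dilation is 6 in both cases
print(sorted(span - reachable_offsets((2, 2, 2))))  # odd offsets are missed
print(sorted(span - reachable_offsets((1, 2, 3))))  # [] : full coverage
```

This is the same prescription as the hybrid dilated convolution (HDC) design rule mentioned in the text.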

## **3. The Architecture of the Modified U-Net**

The proposed neural network adopts a 4-layer U-Net as its basic network. In the encoding path, each layer uses two 3 × 3 convolution layers and a max-pooling layer for feature extraction, and the feature maps of each layer are connected to the corresponding decoding layer through a GCM. In the decoding path, a GSM follows each up-sampling step; each decoding layer uses a 3 × 3 convolution layer, an up-sampling layer, and a 1 × 1 convolution layer for restoration, and the output layer uses a 3 × 3 convolution layer and a 1 × 1 convolution layer to produce the output. In this modified U-Net, batch normalization (BN) and rectified linear units (ReLU) are applied after all convolution layers except the output layer to correct the data distribution. The GCM and GSM modules play the key role in the modified U-Net, and their operating mechanisms are similar. The GCM divides its input feature map evenly into four groups and applies dilated convolutions with dilation rates of 1, 2, 3, and 5, respectively. In addition, the module extracts features of the input map through pooling, convolution, batch normalization, activation, softmax, and other conventional operations, and finally obtains the channel information of all groups. The GSM divides its input feature map evenly into three groups, and each group undergoes a dilated convolution with a dilation rate of 1, 2, or 4, respectively. This module extracts features in sequence through down-sampling, convolution, batch normalization, activation, up-sampling, convolution, batch normalization, activation, and softmax, and finally obtains the spatial information of all groups.
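To make the encoder–decoder symmetry concrete, here is a small shape walk-through (our own sketch; the 128 × 128 input size is an assumption, not stated in the paper). "Same"-padded 3 × 3 convolutions preserve the spatial size, each 2 × 2 max-pooling halves it, and each up-sampling step doubles it back:

```python
def unet_shapes(input_size=128, depth=4):
    """Spatial size at each level of a depth-4 encoder-decoder U-Net."""
    enc = [input_size]
    for _ in range(depth):
        enc.append(enc[-1] // 2)  # 2x2 max-pooling halves the size
    dec = [enc[-1]]
    for _ in range(depth):
        dec.append(dec[-1] * 2)   # up-sampling doubles it back
    return enc, dec

enc, dec = unet_shapes()
print(enc)  # [128, 64, 32, 16, 8]
print(dec)  # [8, 16, 32, 64, 128]
```

Because every encoder level has a decoder level of matching resolution, each GCM can bridge a skip connection without any resizing.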

Figure 4 shows the structure of the GCM in the modified U-Net. This module divides the input feature map evenly into four groups, and each group performs a dilated convolution with a dilation rate of 1, 2, 3, or 5; the size of the target area for fault identification determines the choice of dilation rates. After the dilated convolutions, four groups of feature maps with different scales are obtained. In parallel, softmax returns four groups of channel information, which are taken as weights and multiplied by the four groups of multi-scale feature maps to produce new feature maps; the group with the largest weight, and hence its receptive field, contributes the most to the final network prediction. Finally, the four groups of new feature maps are concatenated, and a residual operation with the input feature map yields the module's output. The GCM thus uses the idea of grouping to realize automatic selection of inter-group multi-scale information under the guidance of channel information.

**Figure 4.** The architecture of GCM.
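The grouping-and-weighting mechanism of the GCM can be sketched in NumPy. This is a simplified stand-in, not the authors' code: the 3 × 3 kernel and the weight logits are placeholders for quantities the network would learn, and the pooling/convolution/BN branch that produces the logits is omitted.

```python
import numpy as np

def dilated_conv2d(x, k, d):
    """'Same'-padded 2-D dilated convolution of one channel x with a 3x3 kernel."""
    xp = np.pad(x, d)  # pad by the dilation rate to keep the size
    out = np.zeros_like(x, dtype=float)
    h, w = x.shape
    for i in range(3):
        for j in range(3):
            out += k[i, j] * xp[i * d:i * d + h, j * d:j * d + w]
    return out

def gcm_sketch(x, logits, rates=(1, 2, 3, 5)):
    """Split channels into 4 groups, dilate each at its own rate,
    weight the groups by softmax(logits), concatenate, add the residual."""
    groups = np.split(x, 4, axis=0)            # 4 equal channel groups
    k = np.full((3, 3), 1.0 / 9.0)             # placeholder "learned" kernel
    w = np.exp(logits) / np.exp(logits).sum()  # softmax channel weights
    new = [wi * np.stack([dilated_conv2d(c, k, d) for c in g])
           for g, d, wi in zip(groups, rates, w)]
    return np.concatenate(new, axis=0) + x     # splice groups, residual add

x = np.random.default_rng(0).normal(size=(8, 16, 16))
y = gcm_sketch(x, logits=np.array([0.1, 0.2, 0.3, 0.4]))
print(y.shape)  # (8, 16, 16): same shape as the input, as in Figure 4
```

The residual connection and the shape-preserving "same" padding are what allow the module to sit at any skip connection of the U-Net without altering tensor sizes.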

Figure 5 shows the structure of the GSM, which realizes the selection of multi-scale information between groups in another way and enhances the consistency between the receptive field and the recognized target region. In this module, the input feature map is divided into three groups, and three groups of feature maps with different scales are obtained by dilated convolutions with dilation rates of 1, 2, and 4. At the same time, three spatial weight maps are derived from the input feature map through a series of conventional operations: down-sampling gathers more global information, up-sampling restores the size of the feature maps, and softmax enables the module to select multi-scale information automatically. The three spatial weight maps are multiplied by the three multi-scale feature maps obtained from the dilated convolutions to produce three new feature maps. Finally, the three groups of new feature maps are concatenated, and a residual operation with the input feature map yields the final result. In summary, under the guidance of spatial information, the GSM selects multi-scale information among a group of feature maps.

**Figure 5.** The architecture of GSM.

The proposed neural network is based on the U-Net and contains the two functional modules, GCM and GSM, which allow it to finely describe faults of different scales. Its architecture is shown in Figure 6. Owing to the powerful multi-scale information selection ability of the GCM and GSM modules, this paper uses only a 4-layer U-Net with an encoding–decoding structure as the basic network. In the encoding path, only two 3 × 3 convolutions and max-pooling are used per layer to quickly obtain feature maps of different resolutions. In the decoding path, several simple decoding blocks quickly and effectively recover high-resolution feature maps. Throughout the network, the data distribution after each convolution is corrected by BN and ReLU. The GCM is placed at the connection layers of the network to automatically select multi-scale information, compensating for the conventional U-Net encoder's transmission of only single-scale information to the decoder. Likewise, the GSM is placed in the decoder path to select multi-scale information, compensating for the loss of global information during up-sampling.

**Figure 6.** The architecture of the modified U-Net.
