#### **1. Introduction**

Gas hydrate is an ice-like crystalline solid formed by water and methane molecules under low temperature and high pressure. It is mainly distributed in seabed sediments on continental margins and in permafrost regions. Gas hydrate can cause seabed geohazards and atmospheric environmental problems [1], but it is also a clean energy resource with huge reserves [2]. Gas hydrate saturation is an important index for evaluating gas hydrate reservoirs. Well logs are widely used to estimate gas hydrate saturation because they are fast and inexpensive to acquire. The common well-log methods for estimating gas hydrate saturation are resistivity methods and velocity methods [3]. Resistivity-based methods estimate gas hydrate saturation from resistivity logs according to Archie's law [4,5], while velocity-based methods estimate it from velocity logs through a theoretical or empirical relationship between gas hydrate saturation and velocity. Frequently used relationships include time-average equations [6], the effective medium theory [7,8], and three-phase Biot-type equations [9,10].

The close relationship between gas hydrate saturation and well logs means that machine learning offers a new way to estimate gas hydrate saturation from well logs. Singh et al. [11,12] used different combinations of well logs to predict gas hydrate saturation through unsupervised and supervised machine learning algorithms. They obtained more accurate gas hydrate saturations than the classic resistivity and velocity methods, showing the advantages of machine learning in gas hydrate saturation prediction. As the most active branch of machine learning, deep learning can achieve more accurate prediction and classification than traditional techniques, because it builds a deep neural network model with multiple hidden layers and trains it on large amounts of data to learn complex and effective information. Therefore, to make better use of well logs and to capture the deeper relationships within the data, we propose a method for estimating gas hydrate saturation from well logs using deep learning.

The concept of deep learning, first proposed by Hinton et al. [13], has been successfully applied in image, audio, and natural language processing, and its unique advantages have attracted increasing attention from geoscientists. Deep learning is gradually being applied to well log interpretation and reservoir prediction, for example in rock facies classification [14–19] and the prediction of shale content [20] and porosity [21]. Well logs are sequence samples, so to estimate the gas hydrate saturation we adopted the long short-term memory (LSTM) recurrent neural network, which is well suited to sequential data, and applied it to the well logs that are sensitive to gas hydrate. This method gave good results in the Shenhu area, South China Sea. It demonstrated the advantages of deep learning in gas hydrate saturation estimation and laid the foundation for its further application in gas hydrate research.

#### **2. Long Short-Term Memory (LSTM) Recurrent Neural Network**

#### *2.1. Recurrent Neural Network (RNN)*

A recurrent neural network (RNN) is a neural network model with a memory function that can discover the interrelationships between samples, and it is especially suited to data with sequential characteristics. Unlike other network structures, an RNN introduces a self-loop, which feeds the output for the previous sample back into the model together with the next sample (Figure 1). The feature information processed by the model therefore contains not only the information of the current sample itself but also that of the sequence data before it. However, an RNN cannot effectively deal with long-term dependencies (neurons that are far apart in the unfolded hidden layer), because when the RNN is trained by stochastic gradient descent, the partial derivative of the loss function with respect to the weight matrix tends toward zero or infinity as the number of input sequence samples increases. This causes the vanishing or exploding gradient problem and limits the wide application of RNNs.

**Figure 1.** Unfolded form of recurrent neural network [22].
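The gradient behaviour described above can be illustrated with a deliberately simplified case. The sketch below assumes a scalar linear recurrence $h_t = w\,h_{t-1} + x_t$ (an assumption made for illustration, not the network used in this study), for which the derivative of the final state with respect to the initial state is $w^T$ after $T$ steps. The function name `gradient_through_time` is hypothetical.

```python
def gradient_through_time(w: float, steps: int) -> float:
    """Return d h_T / d h_0 for the scalar recurrence h_t = w * h_{t-1} + x_t."""
    grad = 1.0
    for _ in range(steps):
        # Each unrolled time step multiplies the gradient by the recurrent weight.
        grad *= w
    return grad


# A weight slightly below 1 makes the gradient vanish; slightly above 1 makes it explode.
for w in (0.9, 1.0, 1.1):
    print(f"w = {w}: gradient after 100 steps = {gradient_through_time(w, 100):.3e}")
```

In a real RNN the scalar weight is replaced by a product of Jacobian matrices, but the same geometric shrinkage or growth applies.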

#### *2.2. LSTM Recurrent Neural Network*

The LSTM network is a special recurrent neural network proposed by Hochreiter and Schmidhuber in 1997 [23]. It improves the loop body that is repeated in a chain in the conventional RNN. By adding a forget gate layer, an input gate layer, and an output gate layer to the network cell, it can perform continuous write, read, and reset operations on the memory cells [24]. This gives LSTM the ability to learn long-term dependencies and effectively addresses the problems of gradient vanishing and gradient exploding, making it one of the most successful RNN variants. Figure 2 shows the basic network structure of LSTM, while Figure 3 shows the structure of an LSTM neuron.

**Figure 2.** The basic network structure of the long short-term memory (LSTM) network [22].

**Figure 3.** The structure of an LSTM neuron [22]: (**a**) the forget gate layer, (**b**) the input gate layer, (**c**) the cell status, and (**d**) the output gate layer.

The forget gate layer of the LSTM network determines which information needs to be discarded (Figure 3a). The expression is:

$$f_t = \sigma(W_f \cdot [h_{t-1}, x_t] + b_f) \tag{1}$$

The input gate layer determines which new information is stored in the cell state (Figure 3b). The expression is:

$$i_t = \sigma(W_i \cdot [h_{t-1}, x_t] + b_i) \tag{2}$$

$$\overline{C}_t = \tanh(W_C \cdot [h_{t-1}, x_t] + b_C) \tag{3}$$

Then, the current cell status (Figure 3c) is updated to:

$$C_t = f_t \cdot C_{t-1} + i_t \cdot \overline{C}_t \tag{4}$$

The cell state of LSTM runs through the whole chain, so that information can be carried along it largely unchanged. The output gate layer determines the information that needs to be output at that moment (Figure 3d). The expression is:

$$h_t = \sigma(W_o \cdot [h_{t-1}, x_t] + b_o) \cdot \tanh(C_t) \tag{5}$$

where $x_t$ is the input vector of the LSTM neuron; $f_t$ is the activation vector of the forget gate layer; $i_t$ is the activation vector of the input gate layer; $h_t$ is the output vector of the LSTM neuron; $C_t$ is the neuron cell state vector and $\overline{C}_t$ is the candidate cell state; $W$ is a weight matrix; $b$ is a bias term; $\sigma$ is the sigmoid function; $\tanh$ is the hyperbolic tangent function; and the subscript $t$ indicates different moments.
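Equations (1)–(5) can be sketched as a single time step in NumPy. This is a minimal illustration rather than the implementation used in this study: the function `lstm_step`, the dictionary layout of the weights, and the sizes (2 inputs, 4 hidden units) are assumptions chosen for the example, and the products in Equations (4) and (5) are taken element-wise.

```python
import numpy as np


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def lstm_step(x_t, h_prev, c_prev, W, b):
    """One LSTM time step following Equations (1)-(5).

    W and b are dicts keyed by 'f', 'i', 'c', 'o'. Each W[k] has shape
    (hidden, hidden + inputs) and acts on the concatenation [h_{t-1}, x_t].
    """
    z = np.concatenate([h_prev, x_t])
    f_t = sigmoid(W["f"] @ z + b["f"])      # Eq. (1): forget gate
    i_t = sigmoid(W["i"] @ z + b["i"])      # Eq. (2): input gate
    c_bar = np.tanh(W["c"] @ z + b["c"])    # Eq. (3): candidate cell state
    c_t = f_t * c_prev + i_t * c_bar        # Eq. (4): updated cell state
    o_t = sigmoid(W["o"] @ z + b["o"])      # output gate activation
    h_t = o_t * np.tanh(c_t)                # Eq. (5): neuron output
    return h_t, c_t


# Hypothetical sizes and random weights, for illustration only.
rng = np.random.default_rng(0)
n_in, n_hidden = 2, 4
W = {k: rng.normal(scale=0.1, size=(n_hidden, n_hidden + n_in)) for k in "fico"}
b = {k: np.zeros(n_hidden) for k in "fico"}

h, c = np.zeros(n_hidden), np.zeros(n_hidden)
for x_t in rng.normal(size=(5, n_in)):      # run over a short synthetic sequence
    h, c = lstm_step(x_t, h, c, W, b)
```

Because the output is the product of a sigmoid and a hyperbolic tangent, every component of `h` stays strictly between -1 and 1.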

#### **3. Gas Hydrate Saturation Estimate**

#### *3.1. Geological Background*

The Shenhu area is in the Pearl River Mouth Basin, in the middle of the northern slope of the South China Sea (Figure 4), and it is a key area for gas hydrate exploration. The water depth is 500–1500 m, the seabed topography is complicated, and the topographic slope varies greatly [25]. Gravity flows have developed since the late Miocene and the deposition rate has been high. Several kilometers of Mesozoic and Cenozoic sediments have accumulated, containing enough organic matter to provide a gas source for gas hydrates [26]. Previous geological surveys of the area discovered many geophysical and geochemical markers indicating the existence of gas hydrates. In 2007, the Guangzhou Marine Geological Survey conducted the first gas hydrate drilling expedition in this area and successfully recovered gas hydrate samples.

**Figure 4.** The Pearl River Mouth Basin in the northern slope of the South China Sea; the Shenhu area is shown by the red rectangle.

#### *3.2. Well Logs*

Eight sites were drilled in the expedition area in 2007 (Figure 4). Gas hydrates were found in the cores of sites SH2, SH3, and SH7, but no hydrates were found at sites SH1 and SH5. The other three sites, namely, SH4, SH6, and SH9, were drilled for logging without cores.

Figure 5 shows the well logs of site SH2. The cores at this site confirmed that the gas hydrate-bearing sediments were in the range of 190–220 m, and the hydrate saturation could reach 47.3% [27].

In the well logs of site SH2, the resistivity and acoustic velocity in the gas hydrate-bearing formations showed apparent high value anomalies, while the density and gamma showed no obvious changes. The well logs of site SH7 (Figure 6) showed that the depth of the gas hydrate-bearing formation was approximately 152–177 m, and the hydrate saturation could reach 43% [27]. The well log characteristics of the gas hydrate-bearing formation at site SH7 were completely consistent with those at site SH2.

Gas hydrate causes the chloride concentration of the formation pore water to decrease, so the saturation of gas hydrate can be calculated by measuring the chloride concentration of pore water from cores [28] using:

$$S_h = \frac{1}{\rho_h} \left( 1 - \frac{Cl_{pw}}{Cl_{sw}} \right) \tag{6}$$

where $S_h$ is the gas hydrate saturation; $\rho_h$ = 0.924 g/cm³ is the density of pure gas hydrate; $Cl_{sw}$ is the in situ baseline pore water chloride concentration; and $Cl_{pw}$ is the measured chloride concentration in core water after gas hydrate dissociation. The baseline chloride concentration can be determined by smoothly fitting the chloride data above and below the gas hydrate zone [3].
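As a small worked example, Equation (6) can be written as a function. The concentrations used below are illustrative values chosen for the arithmetic, not measurements from sites SH2 or SH7, and the function name `hydrate_saturation` is hypothetical.

```python
RHO_HYDRATE = 0.924  # density of pure gas hydrate in g/cm^3, as given above


def hydrate_saturation(cl_pw: float, cl_sw: float, rho_h: float = RHO_HYDRATE) -> float:
    """Equation (6): saturation (as a fraction) from measured and baseline chloride.

    cl_pw and cl_sw must be in the same units, since only their ratio is used.
    """
    if cl_sw <= 0:
        raise ValueError("baseline chloride concentration must be positive")
    return (1.0 - cl_pw / cl_sw) / rho_h


# Illustrative concentrations only (same units for both arguments).
s_h = hydrate_saturation(cl_pw=400.0, cl_sw=550.0)
print(f"Estimated gas hydrate saturation: {s_h:.1%}")
```

When the measured concentration equals the baseline, the function returns zero, which corresponds to a formation without gas hydrate.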

**Figure 6.** Well logs at site SH7.

Because the chloride concentration of the formation pore water was relatively undisturbed and the chloride concentration measured from the cores was accurate, the gas hydrate saturation calculated from the pore water chloride concentration had a higher accuracy [28]. Figure 7 shows the gas hydrate saturations calculated from the chloride concentration measured in cores from the gas hydrate-bearing formation at sites SH2 and SH7. There were 41 gas hydrate-bearing cores at site SH2 and 21 at site SH7 [3,29].

**Figure 7.** Gas hydrate saturations calculated by the chloride concentration of the pore water from the cores at sites SH2 and SH7.

#### *3.3. Data Preparation*

To use the LSTM recurrent neural network to estimate the gas hydrate saturation, site SH2 was used as a training well to train the LSTM recurrent neural network, while site SH7 was used as a verification well to verify the accuracy of the network model. In site SH2, the resistivity and acoustic velocity, which are more sensitive to gas hydrate, were used as the input of the network model. The gas hydrate saturations calculated by the chloride concentration of the pore water in the cores were used as the output to train the LSTM recurrent neural network.

Site SH2 had only 41 gas hydrate saturation values calculated from the chloride concentration, and so little training data would seriously degrade the training of the LSTM recurrent neural network model. Therefore, the gas hydrate saturation was interpolated to the sampling interval of the well logs, giving 1400 samples in the range of 191–219 m (Figure 8), in which the resistivity and the acoustic velocity were the inputs of the network model and the interpolated gas hydrate saturation was the output. Before training, 1000 consecutive samples were selected as the training dataset, and the remaining samples were used as the test dataset. To eliminate the dimensional influence between the parameters and to ensure that each parameter was within a reasonable distribution range, the data were standardized. The expression is:

$$z_i = \frac{x_i - \mu_i}{\delta_i} \tag{7}$$

where $z_i$ refers to the log parameters after standardization, $x_i$ refers to the input log parameters, and $\mu_i$ and $\delta_i$ are the mean and standard deviation of the parameters, respectively.
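The data preparation described in this section can be sketched as follows. The arrays are synthetic stand-ins rather than the SH2 measurements, and several details are assumptions because the text does not specify them: linear interpolation via `numpy.interp`, taking the first 1000 samples as the training block, and computing the mean and standard deviation from the training set only.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-ins for the SH2 data (not the real measurements).
log_depth = np.linspace(191.0, 219.0, 1400)            # log sampling depths, m
resistivity = rng.uniform(1.0, 3.0, log_depth.size)    # ohm-m
velocity = rng.uniform(1.6, 2.2, log_depth.size)       # km/s
core_depth = np.sort(rng.uniform(191.0, 219.0, 41))    # 41 core sample depths, m
core_sat = rng.uniform(0.0, 0.45, core_depth.size)     # core-derived saturation

# Interpolate the core saturations onto the log depths (linear interpolation assumed).
saturation = np.interp(log_depth, core_depth, core_sat)

# 1000 consecutive samples for training, the remainder for testing
# (which 1000 samples were used is an assumption here).
X = np.column_stack([resistivity, velocity])
X_train, X_test = X[:1000], X[1000:]
y_train, y_test = saturation[:1000], saturation[1000:]

# Equation (7): standardize each log with training-set statistics (an assumption).
mu = X_train.mean(axis=0)
delta = X_train.std(axis=0)
Z_train = (X_train - mu) / delta
Z_test = (X_test - mu) / delta
```

Reusing the training-set statistics for the test data keeps the two sets on the same scale without letting test samples influence the preprocessing.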

#### *3.4. The Prediction Framework of the LSTM Recurrent Neural Network*

We constructed an LSTM network prediction model that included an LSTM recurrent layer and two dense layers (Figure 9), where $x_i$ is the standardized input sequence sample of the resistivity and P-wave velocity; $y_i$ is the output saturation sample; $\mathrm{LSTM}_i$ is the LSTM neuron that makes up the LSTM recurrent layer, which has the structure shown in Figure 3; $o_i$ is the output of the LSTM neuron; and $C_i$ and $h_i$ have the same meanings as in Equations (1)–(5). Because the actual data were not particularly complicated, the numbers of nodes in the two fully connected layers were set to 20 and 10, respectively, to improve the calculation efficiency. The Adam algorithm was adopted as the optimization algorithm, and the dropout regularization method was used to prevent over-fitting.

**Figure 8.** Training dataset of the network model.

**Figure 9.** The prediction framework of LSTM recurrent neural network.
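The framework in Figure 9 can be sketched as a forward pass in NumPy. Only the two dense layers of 20 and 10 nodes and the two-log input come from the text. Everything else is an assumption made for illustration: the hidden size of 32, the ReLU activations, the single linear output node, and applying the dense layers to the LSTM output at every depth sample. Dropout is omitted because it is active only during training.

```python
import numpy as np


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def init_params(n_in=2, n_hidden=32, seed=0):
    """Random weights; n_hidden=32 is a hypothetical choice."""
    rng = np.random.default_rng(seed)

    def layer(n_out, n_inp):
        return rng.normal(scale=0.1, size=(n_out, n_inp)), np.zeros(n_out)

    return {
        "lstm": layer(4 * n_hidden, n_hidden + n_in),  # f, i, candidate, o stacked
        "dense1": layer(20, n_hidden),                 # 20 nodes, as in the text
        "dense2": layer(10, 20),                       # 10 nodes, as in the text
        "out": layer(1, 10),                           # single output assumed
    }


def forward(x_seq, params):
    """Map a (T, 2) sequence of standardized logs to T saturation estimates."""
    W, b = params["lstm"]
    n_hidden = W.shape[0] // 4
    h, c = np.zeros(n_hidden), np.zeros(n_hidden)
    outputs = []
    for x_t in x_seq:
        gates = W @ np.concatenate([h, x_t]) + b
        f, i, g, o = np.split(gates, 4)
        c = sigmoid(f) * c + sigmoid(i) * np.tanh(g)   # Eqs. (1)-(4)
        h = sigmoid(o) * np.tanh(c)                    # Eq. (5)
        a = h
        for name in ("dense1", "dense2"):
            Wd, bd = params[name]
            a = np.maximum(0.0, Wd @ a + bd)           # ReLU assumed
        Wo, bo = params["out"]
        outputs.append((Wo @ a + bo)[0])
    return np.array(outputs)


params = init_params()
x_seq = np.random.default_rng(1).normal(size=(50, 2))  # synthetic input window
y_hat = forward(x_seq, params)
```

With untrained random weights the outputs are meaningless as saturations; the sketch only shows how the data flow through the layers.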

The training process of the LSTM recurrent neural network was similar to that of a conventional fully connected neural network, namely: (1) Use feedforward propagation to input the training data into the network, calculate the output of the LSTM unit, and then extract features through the two fully connected layers. The signal propagates layer by layer to the output layer to obtain the predicted value for the sample. (2) Back-calculate the error term of each neuron. The backward propagation of the error term of the LSTM recurrent neural network proceeds in two directions: the first is back propagation through time, that is, starting from the current time $t$ and calculating the error term at each earlier time; the second is propagating the error term to the upper layer. (3) Use the Adam optimization algorithm, which is based on gradient descent, to adjust the model parameters by calculating the gradient of each weight from the corresponding error term, so that the prediction approaches the optimization target. (4) Repeat the above iterations until the required optimization target is met, at which point the LSTM recurrent neural network prediction model that meets the error requirements is established.
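The parameter update in step (3) can be illustrated with the standard Adam rule applied to a one-parameter toy problem. The quadratic loss, the learning rate of 0.1, and the function name `adam_minimize` are assumptions made for this sketch; the hyperparameters used to train the network in this study are not stated in the text.

```python
import math


def adam_minimize(grad_fn, theta, lr=0.1, beta1=0.9, beta2=0.999, eps=1e-8, steps=500):
    """Minimize a scalar function with Adam, given its gradient grad_fn."""
    m = 0.0  # running mean of the gradient (first moment)
    v = 0.0  # running mean of the squared gradient (second moment)
    for t in range(1, steps + 1):
        g = grad_fn(theta)
        m = beta1 * m + (1 - beta1) * g
        v = beta2 * v + (1 - beta2) * g * g
        m_hat = m / (1 - beta1 ** t)   # bias correction for the first moment
        v_hat = v / (1 - beta2 ** t)   # bias correction for the second moment
        theta -= lr * m_hat / (math.sqrt(v_hat) + eps)
    return theta


# Toy loss (theta - 3)^2 with gradient 2 * (theta - 3); the minimum is at theta = 3.
theta_opt = adam_minimize(lambda th: 2.0 * (th - 3.0), theta=0.0)
print(f"theta after optimization: {theta_opt:.4f}")
```

In the network, the same update is applied to every weight, with the gradient supplied by the back propagation described in step (2).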
