*4.1. Isotropic Simultaneous Inversion*

Simultaneous inversion is a prestack seismic inversion that simultaneously inverts the angle gathers into the reflectivities of the *ZP*, *ZS*, and density based on the linear relationships between the *ZP* and each of the *ZS* and density. The methodology used in this study [92] directly inverted for the *ZP*, *ZS*, and density [93,94].

The Aki–Richards equation [95] is an approximation of the relationship between the P-wave reflectivity between two media and the incident angle as follows:

$$R(\theta) = \frac{1}{2}(\Delta V/V + \Delta \rho/\rho) - 2(W/V)^2(2\Delta W/W + \Delta \rho/\rho)\sin^2\theta + \frac{1}{2}(\Delta V/V)\tan^2\theta \tag{12}$$

where *R*(*θ*) is the P-wave reflectivity at an incident angle (*θ*), while *V*, *W*, and *ρ* are the average P-wave velocity, S-wave velocity, and density between the two media, respectively. The idea of Fatti's equation [93] is to re-express the Aki–Richards equation in (12) in terms of the acoustic and shear impedances as follows:

$$R(\theta) = \frac{1}{2} \frac{\Delta I}{I} (1 + \tan^2 \theta) - 4 \left(\frac{W}{V}\right)^2 \left(\frac{\Delta f}{I}\right) \sin^2 \theta - \left[\frac{1}{2} \left(\frac{\Delta \rho}{\rho}\right) \tan^2 \theta - 2 \left(\frac{W}{V}\right)^2 \left(\frac{\Delta \rho}{\rho}\right) \sin^2 \theta\right] \tag{13}$$

where *I* is the acoustic impedance, *J* is the shear impedance, the term ( <sup>1</sup> 2 Δ*I <sup>I</sup>* ) is the zero-offset P-wave reflectivity, and the term ( <sup>1</sup> 2 Δ*J <sup>J</sup>* ) is the zero-offset S-wave reflectivity, such that:

$$\frac{1}{2}\frac{\Delta f}{f} = \frac{1}{2}(\Delta V/V + \Delta \rho/\rho),\\\frac{1}{2}\frac{\Delta f}{f} = \frac{1}{2}(\Delta W/W + \Delta \rho/\rho) \tag{14}$$

Fatti's equation could be re-expressed by replacing the reflectivities by the rock properties as follows:

$$\mathbf{s}(\theta) = \frac{1}{2}\mathbb{C}\_1 \mathcal{W}\_\theta D \ln(Z\_P) + \frac{1}{2}\mathbb{C}\_2 \mathcal{W}\_\theta D \ln(Z\_S) + \mathbb{C}\_3 \mathcal{W}\_\theta D \ln(\rho) \tag{15}$$

where *s*(*θ*) is the angle trace, *W*(*θ*) is the angle-dependent wavelet, *D* is a derivative operator, *ZP* is the acoustic impedance, *ZS* is the shear impedance, *ρ* is the density, and *θ* is the incident angle. The terms *C*1, *C*2, and *C*<sup>3</sup> are given by:

$$\mathbb{C}\_1 = 1 + \tan^2 \theta,\\ \mathbb{C}\_2 = -8 \left( \frac{V\_S}{V\_P} \right) \tan^2 \theta,\\ \mathbb{C}\_3 = -0.5 \tan^2 \theta + 2 \left( \frac{V\_S}{V\_P} \right)^2 \sin^2 \theta \tag{16}$$

where *C*1, *C*2, and *C*<sup>3</sup> are constants, *θ* is the incident angle, *VS* is the S-wave velocity, and *VP* is the P-wave velocity.

Next, the terms *ln*(*ZP*), *ln*(*ZS*), and *ln*(*ρ*) were coupled by the following equations:

$$
\ln(Z\_s) = k \ln(Z\_p) + k\_c + \Delta \ln(Z\_s) \tag{17}
$$

$$\ln(\rho) = \operatorname{mln}(Z\_p) + m\_c + \Delta \ln(\rho) \tag{18}$$

where *k* and *m* are the intercepts of the linear relations and *kc* and *mc* are constants, while Δ*ln*(*ZS*) and Δ*ln*(*ρ*) are controlled by the deviations away from the best-fit lines of the linear relationships. Δ*LD* and Δ*LS* are the displacements of *ln*(*ρ*) and *ln*(*ZS*) due to the deviation from the best-fit lines.

The next step was to apply the derivative operator (*D*) to Equations (3) and (4):

$$D\ln Z\_{\mathcal{S}} = kD\ln Z\_{P} + D\Delta l n Z\_{\mathcal{S}} \tag{19}$$

$$D\ln\rho = mDln Z\_P + D\Delta ln\rho\tag{20}$$

where *D* is the derivative operator, while *k* and *m* are the intecepts of the linear relations. Eventually, the algorithm was applied to an initial guess and then iterated until it reached a solution at which the initial model matched the real data with the least-squared error.

### *4.2. Partial-Log-Constrained Inversion*

The separate inversion of angle stacks is based on a partial inversion approach that obtains the zero-offset P- and S-wave velocities from the inverted P- and S-waves at different angular ranges [90]. The idea is based on Thomsen's equations in (9) and (10), which relate the vertical P-wave (*VP*) and S-wave (*VS*) velocities to the anisotropy parameters, Epsilon and Delta, in VTI media.

Partial-log-constrained inversion requires a wavelet to be extracted from each angle stack so that the wavelet is centered at the average angle of the angle stack [90]. An initial model was created from logging data to guide the inversion. This method required S-wave velocity logs in the initial model to constrain the inversion. Each partial seismic volume was inverted into a velocity volume at the central angle of its angle range. Next, the velocity-domain Thomsen's equations in (9) and (10) were used to solve for the zero-offset P- and S-wave velocities as well as the Epsilon and Delta parameters, as shown in the following equations:

$$
\begin{bmatrix}
1 & \sin^2(\theta\_1)\cos^2(\theta\_1) & \sin^4(\theta\_1) \\
1 & \sin^2(\theta\_2)\cos^2(\theta\_2) & \sin^4(\theta\_2) \\
\vdots & \vdots & \vdots \\
1 & \sin^2(\theta\_n)\cos^2(\theta\_n) & \sin^4(\theta\_n)
\end{bmatrix}
\begin{bmatrix}
V\_{P0} \\
V\_{P0}\delta \\
V\_{P0}\epsilon
\end{bmatrix} = 
\begin{bmatrix}
V\_{qP}(\theta\_1) \\
V\_{qP}(\theta\_2) \\
\vdots \\
V\_{qP}(\theta\_n)
\end{bmatrix} \tag{21}
$$

where *VP*<sup>0</sup> is the zero-offset P-wave velocity, and *δ* are the anisotropy parameters, *VqP* is the quasivertical P-wave velocity, *θ* is the incident angle, and *n* is the number of partial stacks.

$$
\begin{bmatrix}
1 & \sin^2(\theta\_1)\cos^2(\theta\_1) \\
1 & \sin^2(\theta\_2)\cos^2(\theta\_2) \\
\vdots & \vdots \\
1 & \sin^2(\theta\_n)\cos^2(\theta\_n)
\end{bmatrix}
\begin{bmatrix}
V\_{s0} \\
\frac{V\_{p0}^2(\varepsilon-\delta)}{V\_{s0}}
\end{bmatrix} = 
\begin{bmatrix}
V\_{qSV}(\theta\_1) \\
V\_{qSV}(\theta\_2) \\
\vdots \\
V\_{qSV}(\theta\_n)
\end{bmatrix}
\tag{22}
$$

where *VS*<sup>0</sup> is the zero-offset P-wave velocity, and *δ* are the anisotropy parameters, *VqSV* is the quasivertical S-wave velocity, *θ* is the incident angle, and n is the number of the partial stacks.

The log-constrained inversion of partial stacks was adopted from the nonlinear optimization inversion [90], such that the objective function is obtained by:

$$f(V) = \|\ S - D\| \to \min \tag{23}$$

where *V* is the inverted P- or S-wave velocity, *D* is the angle-domain seismic data, and *S* is the corresponding prediction, which can be expressed according to the convolution model [96]. The reflectivity function is based on the Aki and Richard approximation of the angle-dependent reflectivity [97], as shown below:

$$R(\theta) = R(0) + A \sin^2(\theta) + B \tan^2(\theta) \tag{24}$$

where *R*(*θ*) is the reflectivity at an incident angle *θ*, while *R*(0) is the zero-offset reflectivity, which is given by:

$$R(0) = \frac{1}{2} \left( \frac{\Delta V\_P}{V\_P} + \frac{\Delta \rho}{\rho} \right) \tag{25}$$

$$A = -2\left(\frac{V\_S}{V\_P}\right)^2 \left(\frac{2\Delta V\_S}{V\_S} + \frac{\Delta \rho}{\rho}\right) \tag{26}$$

$$B = \frac{1}{2} \left(\frac{\Delta V\_P}{V\_P}\right) \tag{27}$$

where *VP*, *VS*, and *ρ* are, respectively, the average P-wave velocity, S-wave velocity, and density of the underlying and overlying layers, while Δ*VP*, Δ*VS*, and Δ*ρ* are the differences between these properties at the underlying and overlying strata.

The term <sup>Δ</sup>*<sup>ρ</sup> <sup>ρ</sup>* can be obtained by Gardner's equation as follows [98]:

$$\frac{\Delta\rho}{\rho} = \frac{1}{4} \left( \frac{\Delta V\_P}{V\_P} \right) \tag{28}$$

According to Equations (24)–(28), the reflectivity will be as follows:

$$R(\theta) = \left(\frac{5}{8} - \frac{1}{2}\frac{V\_S^2}{V\_P^2}\sin^2\theta + \frac{1}{2}\tan^2\theta\right)\frac{\Delta V\_P}{V\_P} - 4\frac{V\_S^2}{V\_P^2}\sin^2\theta\frac{\Delta V\_S}{V\_S} \tag{29}$$

After inverting the partial stacks separately into *VP* and *VS* volumes at the near, mid, and far angles, the zero-offset velocities Epsilon and Delta were solved by Equations (21) and (22).

The current study applied the same methodology to *ZP* and *ZS* by replacing the velocities of Equations (9) and (10) by impedances, as shown below:

$$Z\_P(\theta) = Z\_{P0} \left[ 1 + \delta \sin^2(\theta) \cos^2(\theta) + \epsilon \sin^4(\theta) \right] \tag{30}$$

$$Z\_S(\theta) = Z\_{S0} \left[ 1 + \left(\frac{Z\_{P0}}{Z\_{S0}}\right)^2 (\epsilon - \delta) \sin^2(\theta) \cos^2(\theta) \right] \tag{31}$$

Partial inversion is sensitive to the quality of seismic data because the magnitude of the noise may be more than that of the anisotropy effect. Hence, the zero-offset impedances were estimated from the impedances at the near, mid, and far incident angles by using statistical modeling and MLFN. Seismic data consisted of three 3D volumes of angle ranges, from 5◦ to 15°, 15◦ to 25°, and 25◦ to 40°. Accordingly, three partial stacks were created at the near, mid, and far central incident angles: 10°, 20°, and 32.5°, respectively. The advantage of impedance modeling is that it normalizes the error and reduces the anisotropy effect simultaneously.

An important step prior to the modeling process is to calculate Epsilon and Delta at the well locations in order to have an idea about the degree of anisotropy within the study area and to specify the model's assumptions accordingly. The anisotropy parameters were calculated by the methodology of Liner and Fei [79], who calculated the layer-induced stiffness parameters from the averaged properties of the isotropic elastic layers and then obtained Epsilon and Delta as functions of those parameters. The stiffness parameters were obtained from the following equations:

$$a = \langle \frac{\lambda}{\lambda + 2\mu} \rangle^2 \langle \frac{1}{\lambda + 2\mu} \rangle^{-1} + 4 \langle \frac{\mu(\lambda + \mu)}{\lambda + 2\mu} \rangle \tag{32}$$

$$c = \left< \frac{1}{\lambda + 2\mu} \right>^{-1} \tag{33}$$

$$f = \langle \frac{1}{\lambda + 2\mu} \rangle \langle \frac{1}{\lambda + 2\mu} \rangle^{-1} \tag{34}$$

$$I = \left\langle \frac{1}{\mu} \right\rangle^{-1} \tag{35}$$

where is the Backus averaging symbol, *μ* is the shear modulus, and *λ* is Lame's constant, while *a*, *c*, *f* , and *l* are the stiffness parameters. Next, Epsilon and Delta were obtained from the following equations:

$$
\epsilon = \frac{a - c}{2c} \tag{36}
$$

$$\delta = \frac{(f+l)^2 - (c-l)^2}{2c(c-l)}\tag{37}$$

Another premodeling step is to explore the relations between the zero-offset impedances, which are the *ZP* and *ZS* logs, and the angle impedances, which are the composite traces obtained from the partial-stack inversion. Figure 3 shows a fair linear relationship based on the data of wells (×1) and (×2), which consist of 1292 observations for *ZP* 1582 observations for *ZS*.

**Figure 3.** Plots of the zero-offset P-impedance (*ZP*0) versus each of the partially inverted Pimpedances (**a**) *Zp*-Near, (**b**) *Zp*-Mid, and (**c**) *Zp*-Far, while the zero-offset S-impedance (*ZS*0) is plotted versus each of the partially inverted S-impedances (**d**) *Zs*-Near, (**e**) *Zs*-Mid, and (**f**) *Zs*-Far.

The linearity between variables is proven visually through the best-fit lines (in blue). In addition, the Pearson correlation coefficient (R-value) was calculated in Python using NumPy. The R-values show strong positive correlations at the near and mid angles, while the correlation is moderate at the far angle. This is matching with Thomsen's equations in (9) and (10), which consider a linear relationship between the zero-offset velocity and the velocity at an angle theta.

Accordingly, a linear regression model was firstly created to forecast the zero-offset impedance from the partially inverted impedances based on the MCMC simulation. As the R-value between each of the P- and S-impedances and their far-angle impedances was moderate due to the anisotropy effect, the MLFN was created to better estimate the zerooffset impedances and turn the randomness in the data points into recognized patterns.

### *4.3. Zero-Offset Impedance Modeling*

### 4.3.1. Statistical Modeling

The idea of the model is to fit a linear relationship between the zero-offset impedance, which is the response variable, and the partially inverted impedances, which are the explanatory variables. The output of the model consists of the posterior means of the coefficients of the three partially inverted impedances. The general form of the model is:

$$Imp\_0 = b\_1 Im p\_{\theta 1} + b\_2 Im p\_{\theta 2} + b\_3 Im p\_{\theta 3} \tag{38}$$

where *Imp*<sup>0</sup> is the zero-offset *ZP* or *ZS*, while *b*1, *b*2, and *b*<sup>3</sup> are the model's coefficients and *Impθ*1, *Impθ*2, and *Impθ*<sup>3</sup> are the impedances at the near, mid, and far angles, respectively. The model's parameters are estimated based on Bayesian theory and MCMC simulation. The Bayesian model consists of three components: the likelihood, priors, and posterior. The priors of the model were the parameters of the proposed distribution of the model coefficients. The posterior component was considered the probability distribution function of all realizations of the simulated coefficients.

The linear expression of the *ZP*<sup>0</sup> model was obtained from Thomsen's equation in (9), which can be rewritten as follows:

$$Z\_{\rm P0} = b Z\_{\rm P}(\theta) \tag{39}$$

where:

$$b = \frac{1}{1 + \delta \sin^2(\theta) \cos^2(\theta) + \epsilon \sin^4(\theta)}\tag{40}$$

By substituting the near, mid, and far P-impedances, *ZP*1, *ZP*2, and *ZP*3, respectively, to Equation (39), three equations were obtained as follows:

$$Z\_{P0} = b\_1 Z\_{P1} \tag{41}$$

$$Z\_{P0} = b\_2 Z\_{P2} \tag{42}$$

$$Z\_{P0} = b\_3 Z\_{P3} \tag{43}$$

By adding the three Equations (41)–(43), the linear expression will be as follows:

$$Z\_{P0} = 0.33b\_1 Z\_{P1} + 0.33b\_2 Z\_{P2} + 0.33b\_3 Z\_{P3} \tag{44}$$

Similarly, the linear expression of the *ZS*<sup>0</sup> model was obtained based on Thomsen's equation in (10), which can be rewritten as shown below:

$$Z\_{\mathbb{S}0} = b.Z\_{\mathbb{S}}(\theta) \tag{45}$$

where

$$b = \frac{1}{1 + \left(\frac{Z\_{\rm D}}{Z\_{\rm S0}}\right)^2 (\epsilon - \delta) \sin^2(\theta) \cos^2(\theta)}\tag{46}$$

After adding the three equations of the near, mid, and far angles, the linear expression will be as follows:

$$Z\_{S0} = 0.33b\_1 Z\_{S1} + 0.33b\_2 Z\_{S2} + 0.33b\_3 Z\_{S3} \tag{47}$$

The likelihood function of the *ZP* or *ZS* models can be written as shown below:

$$Imp\_0|b\_{\rangle\prime}(Imp\_0)\_{i\nu}\sigma^2 \sim \mathcal{N}\left( \left[ b\_1(Imp\_1)\_i + b\_2(Imp\_2)\_i + b\_3(Imp\_3)\_i \right], \sigma^2 \right) \tag{48}$$

Given the coefficients vector *b*, the explanatory variables vector *Impθ*, and the variance *σ*2, the zero-offset impedance *Imp*<sup>0</sup> follows a normal distribution with a mean equal to the linear expression of the covariates and coefficients, where (*i*) is the number of observations.

The next layer of the model is the coefficients vector *bj*, which consists of *b*1, *b*2, and *b*<sup>3</sup> such that the three coefficients follow a common normal distribution. The mean and variance of this distribution were obtained from Equation (40) for the *ZP* model and Equation (46) for the *ZS* model. Based on the weak anisotropy assumption, the mean of the anisotropy parameters was considered zero and the standard deviation was 0.2. Moreover, the squared ratio of *ZP* and *ZS* was set, according to the wells' data, to have a

mean equal to 6 and a standard deviation (SD) equal to 3. The prior function for the *ZP* and *ZS* models can be written as follows:

$$bj \sim N(\mu, \tau), j = \{1, 2, 3\} \tag{49}$$

where *μ* and *τ* are, respectively, the mean and variance of the normal distribution of the coefficients vector *b*, which has an index (*j*) from 1 to 3.

The variance of the model was assumed to follow an inverse gamma distribution, which depends on the shape parameter *α* and the scale parameter *β*, which act as priors of the parameter *σ*2. The distribution function of the *σ*<sup>2</sup> is shown below:

$$
\sigma^2 \sim \text{IG}(\mathfrak{a}, \mathfrak{k}) \tag{50}
$$

The idea of the model is to calculate the the zero-offset impedance by substituting the posterior means of the coefficients to Equation (38). The MCMC was used to draw multiple random realizations from the proposed normal distribution. Based on the law of large numbers, the average of a random sample from a distribution converges to the true mean of that distribution [99]. In other words, the Markov chains would converge to the posterior means of the model's coefficients. The advantage of the MCMC is that it can solve for complicated posterior distributions which are difficult to be calculated mathematically.

Finally, the posterior means of the *ZP* and *ZS* models were substituted into Equation (38) to solve for the zero-offset impedances. The final graphical representation of the impedance model is shown in Figure 4, where *Imp*<sup>0</sup> is the zero-offset impedance, *Imp*1, *Imp*2, and *Imp*<sup>3</sup> are the impedances at the near, mid, and far angles, respectively, i is the observation index that ranges from 1 to n, bj is the common distribution of the coefficients which depends on the mean *μ* and variance *τ*, and *σ*<sup>2</sup> is the variance of the model which depends on shape parameter *α* and scale parameter *β*.

**Figure 4.** The graphical representation of the statistical *ZP* or *ZS* models.

4.3.2. Multilayer Feed-Forward Neural Network (MLFN)

Due to the availability of vast amounts of data in recent years, ANNs have become the most common prediction methods [100]. One of the widely used ANNs is the multilayer feed-forward neural network (MLFN) [101,102]. The MLFN consists of an input layer, one or more hidden layers, and an output layer [103]. An activation function is used to map the summation of the weighted inputs into the output layer [104]. The number of hidden layers and number of neurons in each of them should be carefully determined because this step controls the model's accuracy [105].

The selection of appropriate activation functions has a high impact on a model's performance. There are various types of activation functions such as the sigmoid, or logistic, function, which is commonly used in classification models. The current study considers a linear activation function to solve for the zero-offset impedance in a regression model. The choice of linear function is based on the moderate-to-strong correlation between the input and output variables as discussed earlier.

The connection between any two neurons is controlled by a random weight. A model's random weights are adjusted until reaching the least mean square error and the best match between the predicted and actual output values. The backpropagation method is commonly used to obtain the optimal set of weights and calculate the error in a repetitive way until reaching the best training accuracy [106,107]. The backpropagation process involves an optimization algorithm such as stochastic gradient descent (SGD) [108] and adaptive moment estimation (Adam) [109]. The algorithm used in our model is the Levenberg–Marquardt (LMA) [110] due to its fast performance and ability to train MLFN models [111].

A model's error can be represented by various loss functions such as cross-entropy [112], weighted binary cross-entropy (WCE) [113], and dice loss [114]. The loss function used in our study is the mean squared error (MSE), which is one of the best unbiased estimators used in regression models [115]. The MSE can be defined by the following equation:

$$MSE = \frac{1}{N} \sum\_{i=1}^{N} (y\_i - \hat{y}\_i)^2 \tag{51}$$

where *N* is the number of samples and *yi* is the output variable at an observation (*i*).

The structure of the MLFN, in this study, consists of three layers: an input, a hidden, and an output layer. Figure 5 shows the graphical representation of the MLFN models for both *ZP* and *ZS*, where the input layer consists of three neurons, which are the near-, mid-, and far-angle impedances, while the output layer consists of one neuron, which is the zero-offset impedance. Several trials were performed in MATLAB to determine the best number of neurons for the hidden layer. The best performance was observed at 20 neurons for both the *ZP* and *ZS* models.

**Figure 5.** A graphical representation of the MLFN for both the *ZP* and *ZS* models. The input layer consists of three nodes, which are the near, mid, and far impedances, the hidden layer consists of 20 nodes, and the output layer consists of one node, which is the zero-offset impedance.

### *4.4. Calculation of the Anisotropy Parameters: Epsilon and Delta*

After comparing and validating *ZP* and *ZS*, obtained from the isotropic inversion, statistical modeling, and MLFN, the best P- and S-impedance volumes were applied to Thomsen's anisotropy equations to solve for the parameters Epsilon and Delta. Both the partially inverted impedances, zero-offset impedances, and central angles of the angle stacks (10, 20, and 32.5 degrees) were applied to the impedance-domain Thomsen's equations in (30) and (31), yielding six equations, as shown below:

$$Z\_{P1} = Z\_{P0} \left[ 1 + \delta \sin^2(10) \cos^2(10) + \epsilon \sin^4(10) \right] \tag{52}$$

$$Z\_{P2} = Z\_{P0} \left[ 1 + \delta \sin^2(20) \cos^2(20) + \epsilon \sin^4(20) \right] \tag{53}$$

$$Z\_{P3} = Z\_{P0} \left[ 1 + \delta \sin^2(32.5) \cos^2(32.5) + \epsilon \sin^4(32.5) \right] \tag{54}$$

$$Z\_{S1} = Z\_{S0} \left[ 1 + \left(\frac{Z\_{P0}}{Z\_{S0}}\right)^2 (\epsilon - \delta) \sin^2(10) \cos^2(10) \right] \tag{55}$$

$$Z\_{S2} = Z\_{S0} \left[ 1 + \left(\frac{Z\_{P0}}{Z\_{S0}}\right)^2 (\epsilon - \delta) \sin^2(20) \cos^2(20) \right] \tag{56}$$

$$Z\_{S3} = Z\_{S0} \left[ 1 + \left(\frac{Z\_{P0}}{Z\_{S0}}\right)^2 (\epsilon - \delta) \sin^2(32.5) \cos^2(32.5) \right] \tag{57}$$

where *ZP*1, *ZP*2, and *ZP*<sup>3</sup> are the P-impedance volumes at the near, mid, and far angles, respectively, while *ZS*1, *ZS*2, and *ZS*<sup>3</sup> are the S-impedance volumes at the near, mid, and far angles, respectively. *ZP*<sup>0</sup> and *ZS*<sup>0</sup> are the zero-offset P- and S-impedances, respectively, and and *δ* are the anisotropy parameters.

By adding Equations (52)–(54) to each other and Equations (55)–(57) to each other, two equations were obtained such that one of them related Epsilon and Delta to the *ZP* volumes and the other one related Epsilon and Delta to the *ZS* volumes. Finally, the two resulting equations were solved by MATLAB for the anisotropy parameters Epsilon and Delta, as shown below:

$$\text{Epsillon} = 2.3(Z\_{P0}Z\_{P1} + Z\_{P0}Z\_{P2} + Z\_{P0}Z\_{P3} + Z\_{S0}Z\_{S1} + Z\_{S0}Z\_{S2} + Z\_{S0}Z\_{S3} - 3Z\_{P0}^2 - 3Z\_{S0}^2)/(Z\_{P0}^2) \tag{58}$$

Delta = 2.3(*ZP*0*ZP*<sup>1</sup> + *ZP*0*ZP*<sup>2</sup> + *ZP*0*ZP*<sup>3</sup> − 0.3*ZS*0*ZS*<sup>1</sup> − 0.3*ZS*0*ZS*<sup>2</sup> − 0.3*ZS*0*ZS*<sup>3</sup> − 3*Z*<sup>2</sup> *<sup>P</sup>*<sup>0</sup> + 0.9*Z*<sup>2</sup> *S*0)/(*Z*<sup>2</sup> *<sup>P</sup>*0) (59)

### *4.5. Estimation of Facies Distribution*

After obtaining the zero-offset P- and S-impedances, they were transformed into elastic properties in order to forecast the lithology and fluid probabilities separately by two logistic regression models. The lithology and fluid models were merged together to estimate the distribution of gas sand, wet sand, and shale by using the decision tree algorithm. *ZP* and *ZS* were firstly transformed into the lithology predictors: near and far elastic impedances [116]. Next, the fluid predictors, Mu–Rho, Lambda–Mu–Rho, and Poisson's ratios, were calculated based on the LMR theory [117]. The idea of the facies model is based on the combination of the logistic regression [52] and decision tree algorithms.

Logistic regression is one of the linear regression models that deal with binary response variables. The idea of logistic regression is to provide model estimates which lie between zero and one [118]. These estimates can be considered the probabilities of occurrence of both zero and one. Therefore, the likelihood of a logistic regression model follows a Bernoulli distribution which takes on the value 1 with a probability *φ* and 0 with a probability (1 – *φ*), as shown in the following equation:

$$Y\_i | \phi\_i \sim Bernoulli(\phi\_i) \tag{60}$$

where *Yi* is a binary event at an observation (*i*), while *φ<sup>i</sup>* is the probability of the success of this event at the observation (*i*). Instead of directly relating the expected value of *Yi* to the model's variables, it can be assigned to the value of *φi*, which can be related to the model's variables and coefficients by using a link function, as shown in the following equation:

$$\log it(\phi\_i) = \log \left(\frac{\phi\_i}{1 - \phi\_i}\right) = \beta\_0 + \beta\_1 (X\_1)\_i + \dots \beta\_n (X\_n)\_i \tag{61}$$

where *logit*(*φi*) is the logarithmic function of the odds of *φi*, *β*<sup>0</sup> is the intercept, *β*<sup>1</sup> to *β<sup>n</sup>* are the model's coefficients, and (*X*1)*<sup>i</sup>* to (*Xn*)*<sup>i</sup>* are the model's explanatory variables at an observation (*i*). After simplifying the previous equation in (61), the *φ<sup>i</sup>* can be obtained as shown below:

$$E(Y\_i) = \phi\_i = \left(1/1 + e^{-(\beta\_0 + \beta\_1(X\_1)\_i + ... \beta\_n(X\_n)\_i)}\right) \tag{62}$$

where *E*(*Yi*) is the expected value of *Yi*. Assuming a three-class medium, which consists of gas sand, wet sand, and shale, the lithology model aims at predicting the sand probability from the lithology predictors, while the fluid model aims at estimating the gas probability from the fluid predictors. Given the observations of the elastic attributes and the facies log, the model's coefficients can be obtained based on Bayes's theorem [119].

The lithology model aims at predicting the sand probability from the near-EI and far-EI attributes. On the other hand, the fluid model aims at estimating the gas probability from the Mu–Rho, Lambda–Rho/Mu–Rho, and Poisson's ratios. The posterior means of the coefficients of the two models were obtained by MCMC simulation. The facies and fluid models were merged together by using the sand and gas probabilities, along with depth values, as inputs to a decision tree, which turned the probabilities into facies estimates.

A decision tree is a classification method that generates questions over training samples and answers them using a set of statistical criteria for data classification [42]. There are two types of decision trees, which are regression and classification trees. As the response variable in this study was categorical, a decision tree was considered.

A decision tree begins with a root node where the samples are classified into two branches such that if the answer to the question is "Yes", the sample goes under the "Yes" branch, while if the answer is "NO", the sample goes under the "No" branch. A distinct series of questions and branches are set in the internal nodes until satisfying the stopping rule in the last terminal of the tree, which is called the leaf node. According to the classification rule [120], each leaf in a tree represents a class (A or B).

A popular method used to construct decision trees is called classification and regression trees (CART) [121]. The CART method splits observations into two parts by minimizing the relative sum of squared errors [122] between the two partitions, according to the principle of binary recursive partitioning (BRP) [123]. The advantages of this method are that it is nonparametric, free of significance tests, and has low sensitivity to outliers [124].

The process of tree shortening is called pruning. Avoiding the large size of a tree is crucial to eschew over-fitting. The size of a decision tree is controlled by many factors, such as the minimum number of samples that should be left in a node to achieve splitting [125]. A pruned tree mostly leads to results which are close to those of the original tree. However, the pruned tree may be more accurate than the initial one in some cases [126].

This study used the "rpart" package in R to model the three-class facies variable based on the CART method. The inputs of the tree were the sand and gas probabilities that had been estimated by logistic regression as well as the depth variable as trend factors. The output of the tree was a categorical variable having three values, 1, 2, and 3, which corresponded to the three litho-fluid classes: gas sand, wet sand, and shale, respectively. The tree was assessed by calculating the correct classification rate (CCR), which is the number of correctly classified observations over the total number of samples.
