*Article* **Differentiation and Prediction of Shale Gas Production in Horizontal Wells: A Case Study of the Weiyuan Shale Gas Field, China**

**Lixia Kang 1,2,\*, Wei Guo 1,2,\*, Xiaowei Zhang 1,2,\*, Yuyang Liu 1,2 and Zhaoyuan Shao 1,2**


**Abstract:** The estimated ultimate recovery (EUR) of shale gas is an important index for evaluating the production capacity of horizontal wells. The Weiyuan shale gas field has wells with considerable EUR differentiation, which hinders the prediction of the production capacity of new wells. Accordingly, 121 wells with highly differentiated production are used for analysis. First, the main control factors of well production are identified via single-factor and multi-factor analyses, with the EUR set as the production capacity index. Subsequently, the key factors are selected to perform the multiple linear regression of EUR, accompanied by the developed method for well production prediction. The thickness and drilled length of Long 11 <sup>1</sup> (Substratum 1 of Long 1 submember, Lower Silurian Longmaxi Formation) are demonstrated to have the uttermost effects on the well production, while several other factors also play important roles, including the fractured horizontal wellbore length, gas saturation, brittle mineral content, fracturing stage quantity, and proppant injection intensity. The multiple linear regression method can help accurately predict EUR, with errors of no more than 10%, in wells that have smooth production curves and are free of artificial interference, such as casing deformation, frac hit, and sudden change in production schemes. The results of this study are expected to provide certain guiding significances for shale gas development.

**Keywords:** shale gas; grey correlation method; multiple linear regression; production evaluation; main control factor; estimated ultimate recovery

#### **1. Introduction**

The recovery technology for shallow (<3500 m) marine shale gas in China has been mature in terms of supporting techniques and fit-for-purpose equipment [1]. Starting from scratch, the shale gas industry of China reached an annual production of 1.0 × <sup>10</sup><sup>10</sup> m3 within six years, which was doubled to approach a historical breakthrough in eight years [2]. Shale gas reservoirs are found in various sedimentary environments in China [3], including marine (50% of the whole country's shale gas resources), marine–continental transitional (20%), and continental (30%) environments [4–7].

Located in the southern Sichuan Basin [8,9], the Weiyuan shale gas field has been in exploration and development since 2009, with a cumulative proven shale gas resource in place of 4.277 × 1011 m3 [10]. In 2021, the shale gas production climbs up to 4.12 × <sup>10</sup><sup>9</sup> m3, which accounts for 32% of the total shale gas production of China National Petroleum Corporation (CNPC). In this context, it is of critical importance to calculate the estimated ultimate recovery (EUR) of gas wells for accurately estimating the potential of shale gas recovery and realizing large-scale economic development.

EUR is an important index for evaluating the production capacity of a shale gas well, and it is affected by numerous factors [11,12], which can be analyzed by various methods [13,14]. Most popular methods include the grey correlation approach [15,16]

**Citation:** Kang, L.; Guo, W.; Zhang, X.; Liu, Y.; Shao, Z. Differentiation and Prediction of Shale Gas Production in Horizontal Wells: A Case Study of the Weiyuan Shale Gas Field, China. *Energies* **2022**, *15*, 6161. https://doi.org/10.3390/en15176161

Academic Editors: Yong Li, Fan Cui and Chao Xu

Received: 6 July 2022 Accepted: 1 August 2022 Published: 25 August 2022

**Publisher's Note:** MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

**Copyright:** © 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https:// creativecommons.org/licenses/by/ 4.0/).

and numerical simulations [17]. Efforts have been made to quantitatively investigate the correlation between the engineering parameters and the fracture network effectiveness via comparison of reservoir attributes and analysis of well performances [18,19]. Moreover, numerous methods have been developed for calculating the EUR of shale gas wells, including the empirical method, the modern production decline analysis, the analytical method, the linear flow analysis, and the big data approach [20]. Specifically, the pressure transient and Blasingame production decline analysis were used to calculate the EUR of a well with short-term production [21]. Moreover, innovative models for the multi-stage fractured horizontal well [22] and those based on the deep neural network [23] are useful ways of predicting the production of shale gas wells. Niu et al. [24] forecast the production capacity of shale gas wells of the Weiyuan gas field via multiple regression. Furthermore, Yin et al. [25] developed a production capacity assessment model for shale gas wells by building and solving the dual-porosity mathematical model of shale gas reservoirs. Accurately calculating EUR of wells and analyzing the main control factors of the production capacity of shale gas wells are of great significance for improving well production and guiding the deployment of shale gas recovery.

In the Weiyuan shale gas field, the production of horizontal wells is highly differentiated from well to well, and thus systematic efforts should be made to clarify the main control factors of gas well production capacity. In this research, single-factor analysis and grey correlation analysis are used to identify key factors affecting the production capacity of wells, while multiple linear regression is used to predict the production capacity of a single well. The findings of this research are expected to guide shale gas well operation in the Weiyuan gas field and enable high-efficiency recovery of shale gas.

#### **2. Geological Setting**

The Weiyuan gas field is located in the Weiyuan and Zizhong Counties in the southwestern Sichuan Basin. It lies in the Southwest Sichuan gentle fold zone of the Central Sichuan uplift and presents itself as a large dome anticline (Figure 1). The main layers for shale gas recovery are within the Wufeng Formation–Longmaxi Formation. The Wufeng– Longmaxi shale is buried at 2000–4000 m, with a thickness of 180–600 m. The interval from the Wufeng Formation to the Long 1 Sub-Member of the Longmaxi Formation is the target interval, with a shale thickness of 43.9–54.8 m, averaging 46.4 m. The Long 1 Sub-Member is further divided into the Long 11 1–11 <sup>4</sup> small layers from the bottom to top, among which the Long 11 <sup>1</sup> is the main layer for recovery, with a thickness of 1.7–7.0 m (4.6 m on average). The shale in the interval from the Wufeng Formation to the Long 1 Sub-Member has a TOC of 2.7–3.6%, averaging 3.2%; a porosity of 5.2–6.7%, averaging 5.9%; a gas saturation of 32.7–84.6%, averaging 64.7%; a gas content of 3.3–8.5 m3/t, averaging 5.5 m3/t; a brittle mineral content of 60–82%, averaging 74%; and a pressure coefficient of 1.2–2.0 in the favorable zone for production capacity building.

**Figure 1.** Structural map of the Weiyuan shale gas field.

#### **3. Data and Analysis Methods**

By the end of 2021, a total of 396 wells have been brought into production in the Weiyuan block and these wells present an average testing daily production of 2.16 × <sup>10</sup><sup>5</sup> m3/d and an average single-well EUR of 8.9 × 107 m3. A total of 121 wells that were brought into production within the recent three years and featured considerably differentiated production are selected as the samples for analysis, which present the average testing daily production of 2.488 × 105 <sup>m</sup>3/d, the average first-year daily production of 9.4 × <sup>10</sup><sup>4</sup> m3/d, and the EUR of (0.31–2.22) × 108 m3 (averaging 1.06 × 108 <sup>m</sup>3; Figure 2; Table 1). Among these wells, low-production wells with the EUR lower than 0.6 × <sup>10</sup><sup>8</sup> <sup>m</sup><sup>3</sup> account for 8%; medium-production wells with the EUR of (0.6–1.2) × <sup>10</sup><sup>8</sup> m3, 59%; and high-production wells with the EUR above 1.2 × 108 <sup>m</sup>3, 33%. A total of 62 wells are found with EUR below 1.0 × <sup>10</sup><sup>8</sup> m3, which account for about 50% of the wells brought into production. The overall distribution of EUR is even and the production of wells is greatly differentiated.

**Figure 2.** Probability distribution of the EUR of the 121 sample wells.


#### *3.1. Factors Affecting Gas Well Production*

Various factors affect the production capacity of shale gas wells, which include geological and engineering factors. The geological factors include the thickness of the Long 11 1 interval, TOC, brittle mineral content, gas content, gas saturation, and pressure coefficient, while the engineering factors include the drilled length of the Long 11 <sup>1</sup> interval, the fractured horizontal wellbore length, fracturing stage quantity, liquid injection intensity, and proppant injection intensity.

#### *3.2. Grey Correlation Analysis*

The production capacity of shale gas wells is subjected to the joint effects of multiple factors, which may interact with each other. The basic regression method fails to deliver a satisfactory analysis of the main control factors, as it cannot quantitatively capture the root causes of the differentiated production capacity of wells [26]. Hence, the grey correlation analysis is performed to quantify the intensity of impacts of each factor on the well production capacity and correspondingly identify the main control factors of the production capacity of shale gas horizontal wells in the Weiyuan gas field.

The grey system theory was developed by the Chinese scholar Julong Deng in 1982. This theory deals with the "limited-sample" system with poor data (lean in information) with partially known and partially unknown information [27]. The grey correlation analysis is an important analytical method for the grey system theory and is highly valuable in investigating the correlations among variables [28]. When no strict mathematical relationships exist between each influential factor and the total result, the grey correlation method can deliver an effective analysis of data and describe the strength, magnitude, and ranking of relationships between factors [29]. The grey correlation analysis determines the closeness of the relationship, according to the geometric similarity between the reference sequence and analysis sequence, and can be used to determine how intensive the effects of each factor are on the result [30].

The grey correlation analysis is performed after normalizing the above data of the factors. The temporal sequence of the EUR is set as the reference sequence X0, while the temporal sequences of each factor, namely the Long 11 <sup>1</sup> thickness, TOC, brittle mineral content, gas content, gas saturation, pressure coefficient, drilled Long 11 <sup>1</sup> length, fractured horizontal wellbore length, fracturing stage quantity, liquid injection intensity, and proppant injection intensity, are used as the analysis sequences. The following matrix is built:

$$(\text{X0}, \text{X1}, \dots, \text{Xn}) \quad = \begin{bmatrix} \text{x}\_0(1) & \text{x}\_1(1) & \cdots & \text{x}\_n(1) \\ \text{x}\_0(2) & \text{x}\_1(2) & \cdots & \text{x}\_n(2) \\ \vdots & \vdots & \ddots & \vdots \\ \text{x}\_0(m) & \text{x}\_1(m) & \cdots & \text{x}\_n(m) \end{bmatrix} \tag{1}$$

Then, the absolute difference between the elements of each analysis sequence and the reference sequence is calculated one after another:

$$\Delta\_{0i}(k) = |\mathbf{x}\_0(k) - \mathbf{x}\_i(k)|(k = 1, \dots, m, i = 1, \dots, n)$$

Accordingly, the absolute difference matrix is formed:

$$
\begin{bmatrix}
\Delta\_{01}(1) & \Delta\_{02}(1) & \cdots & \Delta\_{0n}(1) \\
\Delta\_{01}(2) & \Delta\_{02}(2) & \cdots & \Delta\_{0n}(2) \\
\vdots & \vdots & \ddots & \vdots \\
\Delta\_{01}(m) & \Delta\_{02}(m) & \cdots & \Delta\_{0n}(m)
\end{bmatrix}
\tag{2}
$$

The maximum value in the absolute difference matrix is defined as the maximum difference, while the minimum one is identified as the minimum difference:

$$
\Delta(\max) = \max \{ \Delta\_{0i}(k) \} \tag{3}
$$

$$
\Delta(\min) = \min \{ \Delta\_{0i}(k) \} \tag{4}
$$

The correlation coefficient is computed as below:

$$\zeta\_{0i}(k) = \frac{\Delta(\min) + \rho \Delta(\max)}{\Delta\_{0i}(k) + \rho \Delta(\max)} \tag{5}$$

where *ρ* is the resolution ratio determining the effects of Δ(*max*) on the data conversion; *ξ*0*i*(*k*) is the correlation coefficient with a positive value of no more than 1. The function of *ρ* is to control the influence of the Δ(*max*) on the overall correlation degree, so that the resolution of the correlation degree between different factors at the longitudinal level is as large as possible. A lower *ρ* results in promoted differentiation of correlation coefficients. In this research, *ρ* is empirically set as 0.5.

The grey correlation degrees between factors are computed:

$$\gamma\_{0i} = \frac{1}{N} \sum\_{k=1}^{N} \xi\_{0i}(k) \begin{pmatrix} i \ \vdots \ \mathbf{2}, \mathbf{3}, \vdots \ \mathbf{\prime}, n \end{pmatrix} \tag{6}$$

where *γ*0*<sup>i</sup>* is the correlation degree, a measure of the correlation between the reference sequence *X*<sup>0</sup> and the analysis sequence *Xi*(*i* = 1, 2, 3, ··· , *n*).

The weight coefficients of each factor are determined using the following equation, according to the correlation degrees:

$$\mathcal{W}\_i = \frac{\gamma\_{0i}}{\sum\_{i=1}^n \gamma\_{0i}} \tag{7}$$

#### *3.3. Principles of the Multiple Regression Algorithm*

The multiple linear regression (MLR) analysis is a mathematical analysis method based on the correlation between the dependent and independent variables [31,32]. With a dependent variable related to multiple factors, MLR can be performed for analysis [33], in which an MLR model is developed via the weighted summation of these factors [34]. It is assumed that the dependent variable *y* can be expressed as a linear combination of *m* independent variables:

$$\mathbf{y} = \beta\_0 + \beta\_1 \mathbf{x}\_1 + \beta\_2 \mathbf{x}\_2 + \dots + \beta\_i \mathbf{x}\_m + \varepsilon \tag{8}$$

where y is the dependent variable; *x*1, *x*<sup>2</sup> ··· *xi* are the independent variables; *β*0, *β*1, *β*<sup>2</sup> ··· *β<sup>m</sup>* are the regression coefficients; and *ε* is the random error with the normal distribution N 0, *δ*<sup>2</sup> .

Providing that there are N sets of observation data (samples), the theoretical equation shall be:

$$\mathbf{y} = \mathbf{X}\boldsymbol{\beta} + \boldsymbol{\epsilon} \tag{9}$$

where:

$$\mathbf{y} = \begin{bmatrix} y\_1 \\ y\_2 \\ \vdots \\ y\_n \end{bmatrix} \tag{10}$$

$$\mathbf{X} = \begin{bmatrix} 1 & \mathbf{x}\_{11} & \dots & \mathbf{x}\_{1m} \\ 1 & \mathbf{x}\_{21} & \dots & \mathbf{x}\_{2m} \\ & \vdots & \vdots & \vdots \\ 1 & \mathbf{x}\_{n1} & \dots & \mathbf{x}\_{nm} \end{bmatrix} \tag{11}$$

$$\begin{array}{rcl} \mathfrak{e}\_1 &=& \begin{bmatrix} \mathfrak{e}\_1 \\ \mathfrak{e}\_2 \\ \vdots \\ \mathfrak{e}\_n \end{bmatrix} \\\\ &\mathfrak{e}\_n \end{array} \tag{12}$$

$$\begin{array}{rcl} \clubsuit &=& \begin{bmatrix} \beta\_0\\ \beta\_1\\ \vdots\\ \beta\_m \end{bmatrix} \end{array} \tag{13}$$

In accordance with the least-square principle, the difference between the regression estimate and the true value (also known as the residual error) is calculated and the regression coefficients are computed by searching the least sum of squares of residual errors. Hence, the regression equation is obtained.

The test of significance can be performed for the regression coefficients of the MLR equation to validate whether or not a linear relationship exists between variables. On one hand, the regression coefficients that are not all zero stand for significant linear relationships between variables. On the other hand, regression coefficients that are all zero represent no significant linear relationship.

#### **4. Results and Discussion**

#### *4.1. Single-Factor Analysis*

As is shown in Figure 3, EUR is found with relatively large positive correlations with the Long 11 <sup>1</sup> thickness, gas saturation, drilled length of the Long 11 1, fractured horizontal wellbore length, and fracturing stage quantity, whereas it presents smaller correlations with TOC, brittle mineral content, gas content, pressure coefficient, liquid injection intensity, and proppant injection intensity.

The Pearson correlation analysis is implemented on EUR and 11 influential factors after the Z-score normalization to reveal the linear relationships among these factors and those between each factor and the EUR (Table 2 and Figure 4). Significant correlations are seen between EUR and Long 11 <sup>1</sup> thickness, brittle mineral content, gas saturation, drilled Long 11 <sup>1</sup> length, fractured horizontal wellbore length, fracturing stage quantity, and proppant injection intensity, whereas weak relationships are observed between EUR and TOC, gas content, pressure coefficient, and liquid injection intensity.

**Table 2.** Results of the Pearson correlation analysis.


Notes: \* represents a significant correlation; \*\* represents an extremely significant correlatio.

**Figure 3.** Single-factor regression of factors affecting EUR.

**Figure 4.** Heat map of the single-factor analysis results.

#### *4.2. Grey Correlation Analysis*

The correlation degrees between each factor and EUR are shown in Table 3, while the corresponding weight coefficients are presented in Figure 5. Clearly, the thickness and drilled length of Long 11 <sup>1</sup> have the highest effects on the EUR, followed by the second tier composed of the fractured horizontal wellbore length, gas saturation, brittle mineral content, and fracturing stage quantity, and the third tier consisting of the proppant injection intensity, liquid injection intensity, TOC, pressure coefficient, and gas content. According to the results, for the first seven factors, the results are consistent with single-factor analysis and grey correlation analysis, which indicates that the results are reliable.

**Table 3.** Calculated correlation degrees between each factor and the EUR.


**Figure 5.** Weight coefficients characterizing the effects of each factor on the EUR of wells.

#### *4.3. MLR Analysis*

According to the results of the grey correlation analysis of the factors affecting well production, seven key factors that are highly correlated with the EUR are identified and selected for the MLR analysis. These factors are the Long 11 <sup>1</sup> thickness, brittle mineral content, gas saturation, drilled Long 11 <sup>1</sup> length, fractured horizontal wellbore length, fracturing stage quantity, and proppant injection intensity.

As shown in Table 4, the calculated significance is 1.1 × <sup>10</sup>−15, far lower than the 0.05 significance level, and the constant for the MLR is non-zero, which means that the regression performance of the MLR equation is satisfactory and the results of the linear regression analysis between the EUR and these factors are reliable. The MLR fitting formula of the EUR of a shale gas horizontal well in relation to the above factors can be written as(Table 5): the EUR of a shale gas well = −1.008 + 0.143 × Long 11 <sup>1</sup> thickness + 0.006 × brittle mineral content + 0.001 × gas saturation + 1.6 × <sup>10</sup>−<sup>4</sup> × drilled Long 11 1 length − 4.9 × <sup>10</sup>−<sup>6</sup> × fractured horizontal wellbore length + 0.018 × fracturing stage quantity + 0.022 × proppant injection intensity.


**Table 4.** Variance analysis of the MLR model.

**Table 5.** Results of the MLR calculation.


#### *4.4. Application*

The EUR of 20 shale gas wells of the Weiyuan gas field is calculated using the MLR production capacity model presented above and compared with that computed using the analytical method. The analytical method considers the shale reservoir's physical properties, fracture parameters, fracture conductivity, and other model parameters. Moreover, it establishes the shale gas horizontal well model, with the model parameters historically fitted to correct the model parameters. After the historical fitting is completed, the production is predicted. Generally, the analytical method is used to calculate EUR for shale gas and the results are reliable in most cases. The results show (Table 6) that production predictions of 4 wells are overestimated, those of 5 are underestimated, and those of 11 are found with relatively accurate estimates, with errors no more than 10%. The matching rate of the model is around 50%.


**Table 6.** Comparison of the results between MLR and the analytical method.

Furthermore, the production performance of the wells presenting accurate estimates (Figure 6) are typically found with continuous production and pressure curves, with no artificial interference at the time being brought into production and during the production (e.g., engineering-induced casing deformation, frac hit, and sudden change in the production scheme). In contrast, the typical production performances of the overestimated wells (Figure 7) reveal that the wells are subjected to frac hit from the adjacent well and present discontinuous production curves; accordingly, the actual EUR far deviates from the ideal EUR predicted by the MLR model. Moreover, the typical production performances of the underestimated wells (Figure 8) show a change in the production scheme around the 350th day of production, and the production rate and pressure remain stable for a rather long period; as a result, the predicted EUR is lower than the actual EUR. The above analysis suggests that effective screening of sample data is required to select wells with continuous production decline for the MLR analysis in an attempt to avoid artificial interference and improve the accuracy of the MLR model.

**Figure 6.** The production history of Well W5 (with the accurate prediction).

**Figure 8.** The production history of Well W17 (underestimated).

#### **5. Conclusions**


**Author Contributions:** Conceptualization, L.K., W.G. and X.Z.; Data curation, L.K., W.G., Y.L. and Z.S.; Formal analysis, X.Z., Y.L. and Z.S.; Investigation, L.K., W.G., Y.L. and Z.S.; Methodology, L.K., X.Z. and Y.L.; Project administration, W.G. and X.Z. All authors have read and agreed to the published version of the manuscript.

**Funding:** This research was funded by [Study on productivity evaluation and development technology policy optimization of Marine and continental transitional shale gas] grant number [2021DJ2005] and [Study on shale gas optimization and scale efficiency development technology in southern Sichuan] grant number [2022KT1204].

**Institutional Review Board Statement:** Not applicable.

**Informed Consent Statement:** Not applicable.

**Data Availability Statement:** Not applicable.

**Conflicts of Interest:** The authors declare no conflict of interest.

#### **References**

