**1. Introduction**

Heavy oil and bitumen resources are estimated at 158.43 billion tons, which accounts for more than two-thirds of worldwide oil reserves (236.73 billion tons), according to OGL's oil reserves summary [1]. The efficient development of heavy oil reserves is therefore considered a significant means of adding to the world energy supply [2].

To date, thermal recovery is the primary method for improving heavy oil production [3], and cyclic steam stimulation (CSS) has proven to be a cost-efficient thermal technique that is widely applied in the field [4,5]. CSS was first applied to vertical wells [6,7] in thick-layer heavy oil reservoirs; it is, however, usually uneconomic for thin-layer heavy oil reservoirs because of severe heat losses [8,9]. For thin-layer reservoirs, horizontal wells have proven more cost-effective than vertical wells [10], and CSS with horizontal wells has therefore been widely used worldwide [8,9]. A significant issue with CSS is that steam channeling is greatly exacerbated after multiple cycles of steam injection because of reservoir heterogeneity [11,12]. Steam channeling along high-permeability zones reduces the sweep efficiency, so oil recovery factors are only approximately 10–20% [13,14].

**Citation:** Xie, Z.; Feng, Q.; Zhang, J.; Shao, X.; Zhang, X.; Wang, Z. Prediction of Conformance Control Performance for Cyclic-Steam-Stimulated Horizontal Well Using the XGBoost: A Case Study in the Chunfeng Heavy Oil Reservoir. *Energies* **2021**, *14*, 8161. https://doi.org/10.3390/en14238161

Academic Editor: Reza Rezaee

Received: 14 October 2021 Accepted: 12 November 2021 Published: 5 December 2021

**Publisher's Note:** MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

**Copyright:** © 2021 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

Conformance control is an efficient technology for increasing steam sweep efficiency and oil recovery factors in heavy oil reservoirs [15,16]. To date, high-temperature-resistant gel [17,18] and foam [19,20] systems are the primary agents used for conformance control [21,22]. A gel system is typically formulated from a polymer and a cross-linker [23,24]. Its primary plugging mechanism is that the injected gel flows into the high-permeability channels and remains there, effectively plugging the steam channels [25]. When gel is injected through injection wells, it diverts flow toward lower-permeability zones and blocks the offending areas [26]. As a result, the steam sweep efficiency improves during steam injection, and the excess co-production of injected fluids (i.e., steam and water) is reduced [25]. However, the performance of conventional cross-linkers degrades when the reservoir temperature rises above 120 °C [27]. A number of gel systems suitable for high-temperature conditions have since been developed [28,29] and successfully implemented in many heavy oil reservoirs [30,31].

A foam system usually comprises a foaming agent and a gas, such as nitrogen (N<sub>2</sub>), carbon dioxide (CO<sub>2</sub>) or hydrocarbon gas [32,33]. Compared with CO<sub>2</sub> and hydrocarbon gas, N<sub>2</sub> offers better stability and flooding performance in high-temperature reservoirs [34]. Foam injected into the formation increases the apparent viscosity of the steam, which stabilizes the displacement process, and the surfactant it carries reduces capillary forces [35,36]. Foam can also restrain steam override toward the top of the reservoir and prevent steam channeling in high-permeability regions [37]. Meanwhile, because of its low thermal conductivity, foam helps reduce the heat loss during steam injection and steam migration [38]. Compared with gel systems, the N<sub>2</sub>-foam system exhibits better temperature resistance and helps reduce underground heat loss [39,40]. However, the presence of oil significantly affects foam stability [32,33]. Many experiments have confirmed that this problem can be overcome by optimizing the foaming agent system [41,42]. The injected foam blocks the water flow pathways without affecting oil flow, which helps reduce the produced water volume.

It is important to evaluate the incremental oil production of conformance control before field implementation in order to ensure cost-efficiency. To date, preliminary evaluations are commonly undertaken with numerical simulations, which require specialized simulators that are usually expensive and are also quite time-consuming to run. It is therefore of practical significance to construct an accurate and robust model for rapidly predicting the incremental oil production of conformance control after CSS. This paper proposes the use of extreme gradient boosting (XGBoost) [43,44] to estimate the EOR of CSS, with a focus on the heavy oil reservoirs of the Chunfeng Oilfield. The validity of the prediction model was tested using both synthetic and real field production data. The sensitivity of the influencing factors was quantified using the permutation information (PI) method.

#### **2. Methods**

#### *2.1. Database*

Valid and extensive data are mandatory for the construction of a reliable prediction model based on supervised learning methods [45–47]. In this study, the datasets were constructed by numerically simulating the CSS and the subsequent conformance control process, based on the geological feature and fluid property of the Southern P601 block of the Chunfeng Oilfield [48].

#### 2.1.1. Construction of the Geological Models

The reservoir models used for constructing the datasets were extracted from a pre-built base geological model of the Southern P601 block in the Chunfeng Oilfield, an extremely high-viscosity reservoir characterized by low thickness, high permeability and high oil viscosity (Table 1). The base model for the Southern P601 block was initially constructed using the sequential Gaussian–Bayesian simulation [49], based on the logging data of 98 wells (Appendix A). A total of 200 sub-model scenarios were extracted from the base model (Figure 1). Each extracted model consists of 40, 21 and 10 grid blocks in the x-, y- and z-directions, respectively, and each grid block measures 10 m, 10 m and 0.5 m in the x-, y- and z-directions, respectively. The uncertainty in top depth, net-to-gross ratio (NTG), permeability and porosity found in real reservoirs was taken into account (Figure 2).

**Table 1.** The properties of the Southern P601 block in the Chunfeng Oilfield.


**Figure 1.** Workflow for predicting the performance of conformance control by numerical simulation, including extracted model, and calculating the oil increment of conformance control.

**Figure 2.** Distribution of top depth, NTG, permeability and porosity of the reservoir model. (**a**) Top depth distribution (579 m on average), (**b**) net-to-gross ratio distribution (0.47 on average), (**c**) permeability distribution (5405 mD on average), (**d**) porosity distribution (0.31 on average).

Heterogeneity has a great influence on production performance in heavy oil reservoirs, so it must be considered in numerical simulation models of cyclic steam stimulation. To reflect the heterogeneity of heavy oil reservoirs, we calculate the average top depth, NTG, permeability and porosity of each numerical simulation model. The variation coefficient of permeability ($v_k$, see Equation (1)) and the permeability ratio ($\alpha_k$, see Equation (2)) are also important measures of reservoir heterogeneity. We therefore include the average NTG, permeability and porosity, the variation coefficient of permeability and the permeability ratio in the geological features used for XGBoost model training.

$$v_k = \frac{1}{\overline{k}} \sqrt{\frac{1}{n} \sum_{j=1}^{n} \left(\overline{k_j} - \overline{k}\right)^2} \tag{1}$$

$$
\alpha_k = \frac{k_{\max}}{k_{\min}} \tag{2}
$$

where $v_k$ is the variation coefficient of permeability; $\overline{k_j}$ is the average permeability of layer $j$; $\overline{k}$ is the average permeability of the reservoir; $n$ is the number of layers; $\alpha_k$ is the permeability ratio; $k_{\max}$ is the maximum permeability of the reservoir; $k_{\min}$ is the minimum permeability of the reservoir.
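As an illustration, Equations (1) and (2) can be evaluated as in the sketch below. The function names and the layer permeability values are hypothetical. The sketch also assumes that the reservoir average permeability equals the arithmetic mean of the layer averages (i.e., layers of equal thickness) and, for brevity, applies the permeability ratio to the same layer values.

```python
import math


def variation_coefficient(layer_perms):
    """Equation (1): population standard deviation of the layer-average
    permeabilities divided by their mean (assumes equal-thickness layers)."""
    n = len(layer_perms)
    k_mean = sum(layer_perms) / n
    variance = sum((k_j - k_mean) ** 2 for k_j in layer_perms) / n
    return math.sqrt(variance) / k_mean


def permeability_ratio(perms):
    """Equation (2): maximum permeability divided by minimum permeability."""
    return max(perms) / min(perms)


# Hypothetical layer-average permeabilities in mD (illustrative values only).
layer_perms = [4000.0, 5000.0, 6000.0]
v_k = variation_coefficient(layer_perms)
alpha_k = permeability_ratio(layer_perms)
print(f"v_k = {v_k:.4f}, alpha_k = {alpha_k:.2f}")
```

A perfectly homogeneous model gives a variation coefficient of 0 and a permeability ratio of 1; both grow as the layer contrast increases.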

#### 2.1.2. Numerical Simulation Setup

Numerical simulations in this study were conducted using CMG's STARS simulator. STARS is capable of simulating water/gas/oil flow, heat transfer, plugging agent chemical reactions and the associated plugging effects, and has been widely used for predicting the performance of thermal recovery in heavy oil reservoirs [50,51].

The injection of high-temperature steam into heavy oil reservoirs tends to reduce the oil viscosity and widen the two-phase flow region of the relative permeability curves [52]. The dependence of oil viscosity on temperature is shown in Figure 3. For relative permeability, we defined saturation endpoints at different temperatures (Figure 4); the relative permeability curves are then determined by endpoint scaling, given their similarity [53]. Heat loss to the overburden and underburden layers was accounted for using the semi-analytical infinite-overburden heat loss model proposed by Vinsome [54]. Key thermal parameters of the reservoir and formation fluids are given in Table 2.

**Figure 3.** Oil viscosity versus temperature in the simulation model.

**Figure 4.** Relative permeability curves used in the simulation for (**a**) water–oil and (**b**) liquid–gas.


**Table 2.** The key thermal reservoir parameters.

In this study, we considered two plugging agents that have been widely used for conformance control, namely gel and nitrogen foam. When the cross-linker and polymer are injected underground, they form a mixture in the high-permeability region that is called in situ gel [55]. The blocking mechanisms of the gel are based on the adsorption of the injected chemicals in the porous media and on the residual resistance factor (RRF), which reduces the effective permeability [56,57]. We set the RRF to 40 in our model. A chemical reaction was also set up to represent the underground gelation process. Three gel-related components were defined: the xlinker, the xanthan and the gel generated by their reaction. The chemical reaction rate was set at 16, and different gel injections were simulated by controlling the injected amounts of xlinker and xanthan. The concentrations of xlinker and xanthan were 0.002% and 0.1%, respectively.

Two methods, namely the mechanistic method and the empirical approach, are implemented in STARS to simulate nitrogen foam conformance control. In this study, the empirical approach was used because it needs fewer parameters and is convenient to apply at the field scale [53,58]. It represents foam plugging water without plugging oil by interpolating between relative permeability curves, which reduces the mobility of the foam. The relative permeability is interpolated based on a dimensionless interpolation factor (*FM*), given by Equation (3) [53].

$$FM = \left\{ 1 + MRF \left( \frac{\omega_s}{\omega_s^{\max}} \right)^{e_s} \left( \frac{S_o^{\max} - S_o}{S_o^{\max}} \right)^{e_o} \left( \frac{N_c^{ref}}{N_c} \right)^{e_v} \right\}^{-1} \tag{3}$$

where $FM$ varies between 1 (no foam) and $(MRF)^{-1}$ (strongest foam); $MRF$ is the maximum mobility reduction factor, obtained at the maximum surfactant concentration ($\omega_s^{\max}$) or the reference capillary number ($N_c^{ref}$); $\omega_s$, $S_o$ and $N_c$ are the surfactant concentration, oil saturation and capillary number, respectively. In our model, $MRF$, $\omega_s^{\max}$ and $N_c^{ref}$ were set to 100, $5 \times 10^{-5}$ and $2 \times 10^{-4}$, respectively. The exponents $e_s$, $e_o$ and $e_v$ were set to 1, 1 and 0.3, respectively. $S_o^{\max}$ is the maximum oil saturation above which no foam forms, and it was set to 0.6 in our model [59].
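To make the behaviour of Equation (3) concrete, the following sketch evaluates the interpolation factor with the parameter values quoted above as defaults. The function and argument names are hypothetical, and this is not the STARS implementation. Setting the oil term to zero once the oil saturation reaches its maximum follows the statement that no foam forms above that saturation; no other limits are applied here.

```python
def foam_interpolation_factor(w_s, s_o, n_c,
                              mrf=100.0, w_s_max=5e-5, s_o_max=0.6,
                              n_c_ref=2e-4, e_s=1.0, e_o=1.0, e_v=0.3):
    """Evaluate FM from Equation (3).

    w_s: surfactant concentration, s_o: oil saturation,
    n_c: capillary number (must be positive).
    """
    surfactant_term = (w_s / w_s_max) ** e_s
    # No foam forms once the oil saturation reaches s_o_max.
    oil_term = max((s_o_max - s_o) / s_o_max, 0.0) ** e_o
    capillary_term = (n_c_ref / n_c) ** e_v
    return 1.0 / (1.0 + mrf * surfactant_term * oil_term * capillary_term)


fm_no_surfactant = foam_interpolation_factor(w_s=0.0, s_o=0.2, n_c=2e-4)
fm_high_oil = foam_interpolation_factor(w_s=5e-5, s_o=0.6, n_c=2e-4)
fm_strongest = foam_interpolation_factor(w_s=5e-5, s_o=0.0, n_c=2e-4)
print(fm_no_surfactant, fm_high_oil, fm_strongest)
```

With every bracketed ratio equal to 1, the expression evaluates to 1/(1 + *MRF*) = 1/101, which is close to the (*MRF*)<sup>−1</sup> limit quoted for the strongest foam.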

Three components (i.e., water, foaming agent and nitrogen) were used to generate the nitrogen foam system. We controlled the injection of foaming agent and nitrogen to simulate different nitrogen foam injection scenarios. In the numerical simulation model, the parameters of the nitrogen foam system were set as follows: a nitrogen injection rate of 10,000 m<sup>3</sup>/d, a foaming agent concentration of 0.6%, an injected foaming agent volume between 0.2 and 0.6 PV, an injection temperature of 34 °C and a continuous injection mode.

To perform CSS in the numerical simulation model, we placed an injection well at the same location as the production well to simulate steam injection, and used cycling group events to control the switching between cycle stages. Each cycle comprised three stages (steam injection, soaking and production), which were controlled by the injection rate, soaking time and production rate, respectively. When the oil rate dropped to 3 m<sup>3</sup>/d or the production time reached 180 d, a new cycle was started. The cycle parameters were varied between geological models and were used as operation features for training the XGBoost models. Steam quality and steam injection temperature also affect well production performance [60]; hence, both were also selected as operation features. As production continued and the oil rate no longer met requirements, conformance control was applied to the production wells using different plugging agents at different development stages (characterized by water cut, oil rate and oil recovery). After conformance control in a given cycle, the production well underwent ten further cyclic steam stimulations with the same operation parameters. A CSS-only model with the same production cycles was run for comparison, and the difference in cumulative oil between the two models gave the oil increment of the measure (Figure 1). The development-stage parameters were also selected as features for training the XGBoost model.
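The cycle-switching rule described above can be written as a simple predicate. This is a sketch only: the names are hypothetical, and it reads "reached 3 m<sup>3</sup>/d" as the oil rate declining to that limit.

```python
OIL_RATE_LIMIT = 3.0        # m3/d, economic oil-rate limit quoted in the text
MAX_PRODUCTION_DAYS = 180   # d, maximum length of the production stage


def should_start_new_cycle(oil_rate, production_days):
    """Return True when the production stage ends and a new CSS cycle starts.

    Assumes the stage ends once the oil rate has declined to the limit
    or the production time has reached the maximum, whichever comes first.
    """
    return oil_rate <= OIL_RATE_LIMIT or production_days >= MAX_PRODUCTION_DAYS


print(should_start_new_cycle(oil_rate=2.5, production_days=60))   # rate limit hit
print(should_start_new_cycle(oil_rate=5.0, production_days=60))   # keep producing
```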

In summary, three types of feature parameters (geological parameters, operation parameters and conformance control timing) serve as input data. Considering a reasonable range of parameter variations, we randomly generated 200 simulation cases for each plugging agent to make up the input database. The ranges of the geological, operation and conformance control timing parameters are summarized in Table 3. Following this design, we constructed 800 numerical simulation models to obtain the oil increments of the conformance control measures. After the simulations were completed, the results were collated into a total of 400 samples covering both plugging agents, each consisting of an oil increment and the corresponding parameters; these samples constitute the production dynamic database required for XGBoost training.
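The bookkeeping behind these counts can be sketched as follows. The sketch assumes that each sample pairs one run with conformance control and one CSS-only baseline run, which is one reading that reconciles 800 simulation models with 400 samples; the variable and function names, and the cumulative oil values, are hypothetical.

```python
AGENTS = ("gel", "nitrogen_foam")
CASES_PER_AGENT = 200
RUNS_PER_CASE = 2  # assumed: one run with conformance control, one CSS-only baseline

n_samples = len(AGENTS) * CASES_PER_AGENT
n_simulations = n_samples * RUNS_PER_CASE


def oil_increment(cum_oil_with_control, cum_oil_baseline):
    """Oil increment of the measure: cumulative oil with conformance control
    minus cumulative oil of the CSS-only baseline over the same cycles."""
    return cum_oil_with_control - cum_oil_baseline


print(n_samples, n_simulations)
print(oil_increment(1500.0, 1200.0))  # illustrative cumulative oil volumes
```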


**Table 3.** Summary of the database used for constructing the estimation model.

#### *2.2. Principle of XGBoost Trees*

XGBoost is a supervised learning algorithm proposed by Chen [43] under the gradient boosting framework. It integrates multiple classification and regression tree (CART) models to form an ensemble with strong generalization ability. Each CART consists of a root node, a set of internal nodes and a set of leaf nodes (Figure 5a). Given a dataset $\mathcal{D} = \{(\mathbf{X}_i, y_i)\}$ $(\mathbf{X}_i \in \mathbb{R}^m, y_i \in \mathbb{R}, i = 1, 2, \ldots, n)$ that consists of $n$ samples with $m$ feature variables, the XGBoost output is computed as the sum of the predicted values of $K$ CARTs (Figure 5b), with the mathematical model expressed as

$$
\hat{y}_i = \sum_{k=1}^{K} f_k(\mathbf{X}_i), \quad f_k \in \mathcal{F} \tag{4}
$$

where $f_k$ is the $k$th independent tree and $\hat{y}_i$ is the output computed by the XGBoost trees. The space of CARTs ($\mathcal{F}$) is represented as

$$\mathcal{F} = \left\{ f(\mathbf{X}) = \omega_{q(\mathbf{X})} \right\} \left( q: \mathbb{R}^m \to T, \omega \in \mathbb{R}^T \right) \tag{5}$$

where $q$ is the decision rule (tree structure) that maps a sample to its leaf index; $\omega_{q(\mathbf{X})}$ is the weight of the leaf to which $\mathbf{X}$ is assigned; $T$ is the number of leaf nodes; $\omega$ is the vector of leaf weights.
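Equations (4) and (5) can be illustrated with a toy ensemble. The sketch below is not the XGBoost implementation: the `Tree` class, the two single-split trees and their leaf weights are hypothetical, chosen only to show how each tree maps a sample to a leaf weight and how the ensemble sums those weights.

```python
from typing import Callable, List, Sequence


class Tree:
    """A tree in the sense of Equation (5): a rule q that maps a sample to a
    leaf index, plus one weight per leaf."""

    def __init__(self, q: Callable[[Sequence[float]], int], weights: List[float]):
        self.q = q
        self.weights = weights

    def __call__(self, x: Sequence[float]) -> float:
        return self.weights[self.q(x)]


def predict(trees: List[Tree], x: Sequence[float]) -> float:
    """Equation (4): the prediction is the sum of the K tree outputs."""
    return sum(tree(x) for tree in trees)


# Two hypothetical single-split trees on (permeability in mD, porosity).
tree_1 = Tree(lambda x: 0 if x[0] < 5000.0 else 1, [10.0, 20.0])
tree_2 = Tree(lambda x: 0 if x[1] < 0.3 else 1, [-1.0, 2.0])

y_hat = predict([tree_1, tree_2], [5400.0, 0.31])
print(y_hat)
```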

**Figure 5.** Illustrations of the (**a**) CART model and (**b**) boosting ensemble trees [46,47].

To establish the prediction model $f(\mathbf{X})$, the following objective function $\mathcal{L}(\phi)$ is minimized:

$$\mathcal{L}(\phi) = \sum_{i=1}^{n} l(\hat{y}_i, y_i) + \sum_{k=1}^{K} \Omega(f_k) \tag{6}$$

$$
\Omega(f) = \gamma T + \frac{1}{2}\lambda \|\omega\|^2 \tag{7}
$$

where $l$ is a differentiable convex loss function; $\Omega$ is the regularization term that limits the complexity of the model; $\gamma$ is the penalty coefficient on the number of leaf nodes $T$; $\lambda$ is the regularization coefficient on the leaf weights; $\omega$ is the vector of leaf weights.

The first term on the right side of Equation (6) is the loss term, which is a differentiable convex function; for regression problems, the mean square error is commonly used. Minimizing this term reduces the prediction error on the training data. The second term is the regularization term, which is the sum of the complexities of the individual CARTs. While minimizing the objective, XGBoost applies a series of techniques to control the complexity of the model and prevent overfitting, e.g., regularization, hyperparameter optimization and early stopping [61–63]. For more details on the mathematical formulation of the XGBoost model, readers are referred to [43,46,47].
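A minimal sketch of how the two terms of Equation (6) combine is given below. It assumes a squared-error loss summed over samples (one common choice, not necessarily the one used in this study), and the coefficient values and leaf weights are hypothetical.

```python
def regularization(leaf_weights, gamma, lam):
    """Equation (7) for one tree: gamma * T + 0.5 * lambda * ||w||^2."""
    n_leaves = len(leaf_weights)
    return gamma * n_leaves + 0.5 * lam * sum(w * w for w in leaf_weights)


def objective(y_true, y_pred, trees_leaf_weights, gamma, lam):
    """Equation (6), assuming a squared-error loss summed over the samples."""
    loss = sum((yp - yt) ** 2 for yt, yp in zip(y_true, y_pred))
    penalty = sum(regularization(w, gamma, lam) for w in trees_leaf_weights)
    return loss + penalty


# One hypothetical tree with two leaves; gamma and lambda are illustrative.
value = objective(y_true=[1.0, 2.0], y_pred=[1.5, 2.0],
                  trees_leaf_weights=[[1.0, -1.0]], gamma=0.1, lam=2.0)
print(value)
```

In this example the loss contributes 0.25 and the single tree contributes 0.1 × 2 + 0.5 × 2.0 × 2.0 = 2.2, so adding leaves or enlarging leaf weights raises the objective even when the fit is unchanged.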

#### *2.3. Construction of the Prediction Model*

In this paper, the open-source XGBoost package in Python [43] was used to construct the model for predicting the potential of conformance control after multi-cycle steam stimulation from three types of input features, namely geological parameters, operation parameters and conformance control timing. There are thirteen parameters in the database, and the whole database was randomly divided into two parts, namely the training (80%) and testing (20%) sets. The 320 samples in the training set were used to train the XGBoost model and to determine the optimal hyperparameter values for the XGBoost trees; the remaining 80 samples in the testing set were used to examine the stability and robustness of the prediction model.
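A random 80/20 split of a 400-sample database can be sketched with the standard library alone. The function name and the seed are hypothetical; the study does not state how the random split was implemented.

```python
import random


def train_test_split_indices(n_samples, test_fraction=0.2, seed=0):
    """Shuffle the sample indices and split them into training and testing sets."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)  # seeded for reproducibility
    n_test = round(n_samples * test_fraction)
    return indices[n_test:], indices[:n_test]


train_idx, test_idx = train_test_split_indices(400)
print(len(train_idx), len(test_idx))
```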

In machine learning, hyperparameters are parameters that must be set manually before the learning process begins. Empirically, well-chosen hyperparameters can significantly improve the performance of the XGBoost model. The learning rate (LR) improves the generalization ability of the XGBoost model by shrinking the feature weights at each boosting step. The min child weight (MCW) sets the minimum sum of sample weights required in a child node. A large MCW value prevents the boosting model from learning relations that are specific to a few particular samples, while an excessively large MCW value leads to underfitting. The maximum tree depth (MTD) controls the complexity of the ensemble model; increasing the MTD value allows the model to capture more specific and more local patterns in the samples [64–67]. The number of trees (*n*) is another important hyperparameter; reducing potential overfitting typically requires a larger *n* combined with a smaller LR [68]. To find the optimal combination of these four key hyperparameters, we adopt K-fold cross-validation integrated with an exhaustive grid search [69–72].

For our XGBoost model, we first specified the search range of each hyperparameter through manual tuning. Five grid values were then assigned to each hyperparameter, producing 5 × 5 × 5 × 5 = 625 search scenarios (Table 4). The remaining hyperparameters, which may exert only a minor effect on performance, were left at their default values [53,68]. For each search scenario, K-fold (K = 5) cross-validation was applied to calculate the coefficient of determination (R<sup>2</sup>) for each fold. The combination of hyperparameters with the highest R<sup>2</sup> averaged over the five folds was taken as optimal and set in our XGBoost model.
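The search procedure can be sketched as follows using only the standard library. The grid values are hypothetical placeholders (not those of Table 4), the hyperparameter names mirror those of XGBoost's scikit-learn wrapper, and `toy_fit_predict` is a stand-in for training a real model so that the example runs without external packages. Folds are taken contiguously here; shuffling the samples beforehand is a common refinement.

```python
import itertools


def k_fold_indices(n_samples, k=5):
    """Split range(n_samples) into k contiguous folds of near-equal size."""
    base, extra = divmod(n_samples, k)
    folds, start = [], 0
    for i in range(k):
        size = base + (1 if i < extra else 0)
        folds.append(list(range(start, start + size)))
        start += size
    return folds


def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean = sum(y_true) / len(y_true)
    ss_res = sum((yt - yp) ** 2 for yt, yp in zip(y_true, y_pred))
    ss_tot = sum((yt - mean) ** 2 for yt in y_true)
    return 1.0 - ss_res / ss_tot


def grid_search(grid, x, y, fit_predict, k=5):
    """Return the combination with the highest mean R2 over the k folds."""
    folds = k_fold_indices(len(y), k)
    names = list(grid)
    best_params, best_score = None, float("-inf")
    for values in itertools.product(*(grid[name] for name in names)):
        params = dict(zip(names, values))
        scores = []
        for fold in folds:
            held_out = set(fold)
            train = [i for i in range(len(y)) if i not in held_out]
            y_pred = fit_predict(params,
                                 [x[i] for i in train], [y[i] for i in train],
                                 [x[i] for i in fold])
            scores.append(r2_score([y[i] for i in fold], y_pred))
        mean_score = sum(scores) / len(scores)
        if mean_score > best_score:
            best_params, best_score = params, mean_score
    return best_params, best_score


# Hypothetical grid: five values per hyperparameter (placeholders only).
grid = {
    "learning_rate": [0.01, 0.05, 0.1, 0.2, 0.3],
    "min_child_weight": [1, 2, 3, 4, 5],
    "max_depth": [2, 3, 4, 5, 6],
    "n_estimators": [100, 200, 300, 400, 500],
}
n_scenarios = len(list(itertools.product(*grid.values())))


def toy_fit_predict(params, x_train, y_train, x_test):
    """Stand-in for model training: ignores the training data and simply
    scales the test inputs by the learning rate."""
    return [params["learning_rate"] * v for v in x_test]


x = [float(i) for i in range(20)]
y = [0.1 * v for v in x]
best_params, best_score = grid_search(grid, x, y, toy_fit_predict)
print(n_scenarios, best_params, best_score)
```

In practice, `fit_predict` would train a regressor on the training fold with the given hyperparameters and return its predictions for the held-out fold.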

**Table 4.** Ranges of key hyperparameter values for the XGBoost used in cross-validation.

