Inverse Problem of Permeability Field under Multi-Well Conditions Using TgCNN-Based Surrogate Model

Li, Jian; Zhang, Ran; Wang, Haochen; Xu, Zhengxiao

doi:10.3390/pr12091934


¹ Ningbo Institute of Digital Twin, Eastern Institute of Technology, Ningbo 315201, China

² Geosteering & Logging Research Institute, Sinopec Matrix Corporation, Qingdao 266003, China

³ School of Petroleum and Natural Gas Engineering, Changzhou University, Changzhou 213164, China

* Authors to whom correspondence should be addressed.

Processes 2024, 12(9), 1934; https://doi.org/10.3390/pr12091934

Submission received: 25 June 2024 / Revised: 27 July 2024 / Accepted: 31 July 2024 / Published: 9 September 2024

(This article belongs to the Special Issue Multiphase Flow, and Efficient Development Methodology and Technology in Unconventional Reservoirs (2nd Edition))


Abstract

Under multi-well conditions, the inverse problem of two-phase flow typically requires hundreds of forward runs of the simulator to achieve meaningful coverage, leading to a substantial computational workload in reservoir numerical simulation. To tackle this challenge, we propose an approach that leverages a surrogate model named TgCNN (Theory-guided Convolutional Neural Network). This method integrates deep learning with computational fluid dynamics simulations to predict the behavior of two-phase flow. The model is not solely data-driven but also incorporates scientific theory. It comprises a coupled permeability module, a pressure module, and a water saturation module. The accuracy of the surrogate model was tested from multiple perspectives in this study. Subsequently, the permeability-field inverse problem under multi-well conditions was addressed by combining the surrogate model with the Ensemble Random Maximum Likelihood (EnRML) algorithm. The findings indicate that modifying the network structure improves the integration of the outputs, resulting in improved prediction accuracy and computational efficiency. The TgCNN surrogate model demonstrated strong predictive performance and computational efficiency for two-phase flow. When the surrogate model was combined with the EnRML algorithm, the inversion results closely aligned with those obtained using commercial simulation software, while the computation time was significantly reduced.

Keywords: theory-guided convolutional neural network; inverse problem; surrogate model; multiple wells

1. Introduction

In oil and gas field development, solving the inverse problem for the permeability field of underground reservoirs is a challenging task [1,2,3]. Permeability is an important property that describes the ability of reservoir rocks to transmit fluids, and its distribution may vary due to factors such as geological structure, rock type, and porosity. An accurate understanding of the permeability distribution is crucial for optimizing development plans, improving production efficiency, reducing development costs, and predicting fluid migration and accumulation within reservoirs [4,5,6,7,8,9].

Traditional methods for exploring permeability can be divided into direct methods and indirect methods. In direct methods, nuclear magnetic resonance (NMR) logging is a commonly used logging technique that infers permeability and provides information about pore structure and fluid distribution by measuring the characteristics of nuclear magnetic resonance signals in formations. However, NMR technology is limited by depth restrictions and noise interference and cannot provide global permeability information. In indirect methods, production analysis inversion is a common inversion method. Permeability is usually treated as an inversion parameter, and then the production data are compared and matched with the model to infer the optimal permeability distribution. This process faces challenges such as non-uniqueness, high-dimensional parameter space, uncertainty, and computational complexity [10,11]. In dealing with two-phase flow (oil–water problem) under multi-well conditions, traditional numerical simulation methods require a large amount of computation to fully understand the system behavior. These simulations typically use computational fluid dynamics (CFD) methods, involving complex mathematical models and large-scale computations. However, this approach has limitations such as high computational costs, uncertainty in model parameters, and the need for extensive validation and model adjustment. Therefore, the efficient processing and simulation of large-scale geological and subsurface data are needed to effectively utilize these data to solve the inverse problem [12,13]. To address these challenges, surrogate models have emerged as a promising solution [14,15,16]. Surrogate models can efficiently approximate the permeability distribution of reservoirs, thereby accelerating the solution process of the inverse problem. 
Compared to traditional physically based computational methods, surrogate models use a small amount of simulation data for training and rapidly generate predictive results, greatly improving computational efficiency. Additionally, surrogate models can handle diverse data sources, including simulation data, measurement data, and expert knowledge, providing a comprehensive information fusion framework. Through surrogate models, we can better understand reservoir characteristics, quantify uncertainty, and provide accurate predictions for practical decision-making. In dealing with two-phase flow problems under multi-well conditions, surrogate models can capture the key features and behavioral patterns of the system through training and learning from existing numerical simulation data.

With the development of statistics and probability theory, researchers have begun to explore the use of statistical methods to establish surrogate models in the field of oil and gas field development. Statistical models, such as Gaussian Process Regression (GPR), have been widely applied for modeling and predicting underground multiphase flow problems through the statistical analysis and regression modeling of large amounts of actual data [17,18]. These models can fit the distribution and mapping relationships of existing data, enabling predictions for unknown data. Statistical models have the advantage of handling incomplete and noisy data, but their expressive power is limited for complex underground multiphase flow problems and often relies on extensive data support. However, with the rapid development of machine learning technology, various machine learning models, including artificial neural networks, support vector machines, and decision trees, have been introduced for constructing surrogate models [19,20]. These machine learning models, trained on large-scale actual data, can automatically learn the behavioral patterns of underground multiphase flow, enabling efficient prediction and simulation. Furthermore, the rise of deep learning technology has brought revolutionary advancements to surrogate models. Deep learning models improve the expressive power of models through hierarchical feature learning. In the context of underground multiphase flow problems, physics-constrained deep learning models have become a hot research topic. These models achieve accurate predictions for complex multiphase flow problems by incorporating equation residuals into the loss function. The application of this model in the field of underground multiphase flow has yielded significant research results, providing a highly effective approach to tackling complex problems.

Zhang, Wang et al. [21] developed a surrogate model using a fully convolutional encoder/decoder network based on dense convolutional networks. Similar to methods used in image/image regression tasks in deep learning, this model was used to predict field saturation. Compared to the traditional numerical reservoir simulation, the surrogate model not only achieved the same level of accuracy but also had lower computational costs and a shorter runtime. Ma, Zhang et al. [22] proposed a novel adaptive parameter control data-driven nested differential evolution algorithm (DNDE-APC) to address the non-uniqueness in inversion. This algorithm integrates clustering methods, nested techniques, and local surrogate-assisted approaches, aiming to balance exploration and convergence in solving multimodal inverse problems. By analyzing the fitting and prediction of production data, the implementation of history matching, the distribution of inversion parameters, and the quantification of predictive uncertainty, the results demonstrated that this new method effectively handles the non-uniqueness in inversion and produces more robust predictions. However, the application of these methods relies on data-driven approaches, necessitating a large volume of training data for the accurate training of surrogate models. Acquiring such labeled data in the field of oil field development is prohibitively expensive, and the labeled data obtained from real-world sources are often characterized by sparsity and incompleteness. Consequently, it may not be entirely feasible to apply these models built under ideal conditions to real-world scenarios.

In this scenario, a theory-guided machine learning framework has recently emerged [23,24,25]. Raissi was the pioneer in proposing the inclusion of physical constraints into the neural network’s loss function to constrain the training of network parameters. Building upon this, Wang, Chang et al. [26] introduced the TgCNN surrogate model, specifically designed to describe pressure in single-phase flow. Its objective is to efficiently quantify uncertainty and perform data assimilation for reservoir flow with uncertain model parameters. Trained surrogate models have demonstrated significant enhancements in the efficiency of uncertainty quantification and data assimilation processes. However, it is important to note that this model solely focuses on single-phase flow and does not account for variations in saturation, resulting in lower levels of nonlinearity. In our previous research [27], the TgNN surrogate model was introduced, which utilizes fully connected networks to capture the behavior of two-phase flow. During the construction of the neural network, we tested various network structures with different coupling forms based on the coupled two-phase flow equations. This rigorous evaluation process led to the selection of a novel coupled neural network model, specifically suited for addressing oil–water two-phase-flow challenges. Through extensive testing to evaluate the accuracy and robustness of the model, we found that the TgNN surrogate model efficiently and accurately captures the intricate mapping relationship between inputs and outputs. However, it is worth mentioning that our study focuses solely on the scenario involving two wells and does not consider the complexities associated with multi-well conditions.

In this study, a coupled TgCNN network framework is proposed, consisting of three modules. Firstly, input data are passed into the network through the coupled permeability module. The network is then divided into two parts, each dedicated to approximating pressure and saturation, respectively. By adjusting this network structure, the coupling of the two outputs (i.e., pressure and saturation) through the relative permeability in the governing equations is improved, resulting in enhanced prediction accuracy and computational efficiency. Tests demonstrate that the TgCNN-based surrogate approach effectively meets the accuracy requirements for two-phase flow in a two-dimensional setting under multi-well conditions. Subsequently, the TgCNN-based surrogate model is utilized to solve the inverse problem of permeability. The results reveal that, compared to traditional numerical simulation software, the TgCNN-based surrogate model achieves excellent inversion performance with significantly improved efficiency. This indicates that estimates of permeability can be obtained more rapidly, thereby enhancing the efficiency of the workflow.

2. Methods

2.1. Governing Equations

In a two-dimensional heterogeneous porous medium, the underground seepage of oil–water two-phase flow satisfies the following governing equations:

\frac{\partial}{\partial x} (ρ_{w} \frac{k k_{rw}}{μ_{w}} \frac{\partial P_{w}}{\partial x}) + \frac{\partial}{\partial y} (ρ_{w} \frac{k k_{rw}}{μ_{w}} \frac{\partial P_{w}}{\partial y}) + Q_{w} = \frac{\partial}{\partial t} (ϕ ρ_{w} S_{w})

\frac{\partial}{\partial x} (ρ_{o} \frac{k k_{ro}}{μ_{o}} \frac{\partial P_{o}}{\partial x}) + \frac{\partial}{\partial y} (ρ_{o} \frac{k k_{ro}}{μ_{o}} \frac{\partial P_{o}}{\partial y}) + Q_{o} = \frac{\partial}{\partial t} (ϕ ρ_{o} S_{o})

P_{C} = P_{o} - P_{w}

S_{o} + S_{w} = 1,

(1)

where ρ_{o} and ρ_{w} represent oil and water density, respectively, kg/m³; μ_{o} and μ_{w} are oil and water viscosity, respectively, mPa·s; k denotes permeability, 10⁻³ μm²; k_{ro} and k_{rw} are the relative permeability of oil and water, respectively; P_{o} and P_{w} denote the pressure of the oil and water phase, respectively, kPa; S_{o} and S_{w} denote the saturation of the oil and water phase, respectively; P_{C} represents the capillary pressure of the reservoir, kPa; ϕ is the porosity of the reservoir rock; and Q_{o} and Q_{w} represent the source/sink term of the oil and water phase, respectively.

The source/sink term in Equation (1) represents the injection or production wells in the reservoir, which are typically governed by one of two control conditions: production rate control or bottomhole pressure control. In this case, the source/sink term is specified through a bottomhole pressure control condition. Since the radius of a well is very small compared to the distance between wells, the well points can be approximated as Dirac delta functions in the governing equations. The Peaceman equation [28,29] can be used to approximate the flow at the well points as a steady-state flow, which is consistent with the planar radial flow equation and can be expressed as follows:

Q_{w,inj}(t) = 2 π h k_{inj} ρ_{w} \left( \frac{k_{rw}(S_{w})}{μ_{w}} + \frac{k_{ro}(S_{w})}{μ_{o}} \right) \frac{P_{INJ}(t) - P(t, x_{inj}, y_{inj})}{\ln(r_{e} / r_{w})},

(2)

Q_{w,pro}(t) = 2 π h k_{pro} ρ_{w} \frac{k_{rw}(S_{w})}{μ_{w}} \frac{P(t, x_{pro}, y_{pro}) - P_{PRO}(t)}{\ln(r_{e} / r_{w})},

(3)

Q_{o,pro}(t) = 2 π h k_{pro} ρ_{o} \frac{k_{ro}(1 - S_{o})}{μ_{o}} \frac{P(t, x_{pro}, y_{pro}) + P_{C} - P_{PRO}(t)}{\ln(r_{e} / r_{w})},

(4)

where P_{INJ}(t) and P_{PRO}(t) denote the bottomhole pressure of the injection well and the production well, respectively, kPa; (t, x_{inj}, y_{inj}) and (t, x_{pro}, y_{pro}) denote the coordinates of the collocation points for the injection well and the production well, respectively; Q_{w,inj}(t) denotes the water mass flow rate of the injection well; and Q_{w,pro}(t) and Q_{o,pro}(t) denote the water mass flow rate and oil mass flow rate of the production well, respectively. In the Peaceman model, h is the reservoir thickness, r_{e} is the equivalent radius of the well cell, and r_{w} is the wellbore radius.
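The well terms in Equations (2)–(4) can be sketched in a few lines of Python. This is an illustration only: the quadratic Corey-type relative permeability curves, the end-point saturations `swc` and `sor`, and every function name are assumptions rather than anything specified in the paper, and unit conversion factors are ignored.

```python
import math


def corey_krw(sw, swc=0.2, sor=0.2):
    """Hypothetical Corey-type water relative permeability (assumed, not from the paper)."""
    se = min(max((sw - swc) / (1.0 - swc - sor), 0.0), 1.0)
    return se ** 2


def corey_kro(sw, swc=0.2, sor=0.2):
    """Hypothetical Corey-type oil relative permeability (assumed, not from the paper)."""
    se = min(max((sw - swc) / (1.0 - swc - sor), 0.0), 1.0)
    return (1.0 - se) ** 2


def injection_rate(k, h, rho_w, mu_w, mu_o, sw, p_inj, p_cell, re, rw):
    """Equation (2): injected water rate, driven by the total mobility at the well cell."""
    total_mobility = corey_krw(sw) / mu_w + corey_kro(sw) / mu_o
    return 2.0 * math.pi * h * k * rho_w * total_mobility * (p_inj - p_cell) / math.log(re / rw)


def production_rates(k, h, rho_w, rho_o, mu_w, mu_o, sw, p_cell, p_pro, re, rw, pc=0.0):
    """Equations (3) and (4): produced water and oil rates at a producer cell."""
    geom = 2.0 * math.pi * h * k / math.log(re / rw)
    q_w = geom * rho_w * corey_krw(sw) / mu_w * (p_cell - p_pro)
    q_o = geom * rho_o * corey_kro(sw) / mu_o * (p_cell + pc - p_pro)
    return q_w, q_o
```

With these assumed curves, a producer cell at the irreducible water saturation yields oil only, which is the expected behavior before water breakthrough.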

2.2. Model Structure

Convolutional neural network (CNN) models have the ability to pass input data to the neural network in the form of images, thereby preserving the complete information of the permeability field and well locations during the transmission process. In other words, the convolutional and pooling layers of CNN models can retain the spatial structure of the input data, which is particularly useful for handling multi-well models since preserving spatial information is crucial for solving such problems. Furthermore, CNN models utilize convolutional and pooling layers to achieve a local perception of the input data. This means that CNN models can capture changes in local features such as images and spatial data, especially the drastic variations in the vicinity of well locations, resulting in a better identification and understanding of patterns in the input.

In this section, an encoding–decoding approach is employed to tackle the multi-well problem. The model first encodes the image information into a hidden fully connected layer, which is subsequently decoded and expanded. Following the same coupling approach as in [27], the hidden fully connected layer is divided into two parts that are decoded independently to describe the pressure field and the water saturation field, as illustrated in Figure 1.

The input layer receives the raw input data, which pertains to the permeability field and the matrix representing time. The downsampling section comprises 4 layers. The first convolutional layer consists of 16 channels, with subsequent layers mirroring this configuration. Following these, the third convolutional layer contains 64 channels, and the fourth convolutional layer hosts 128 channels. Subsequently, there is a fully connected network with 256 neurons, followed by an upsampling structure that mirrors the downsampling section’s configuration. In traditional encoder–decoder structures, information loss may occur due to multiple downsampling and upsampling operations. To prevent information loss in the encoding and decoding process, this study incorporates U-Net connections into both the encoding and decoding parts. The most prominent feature of U-Net is the skip connections, which directly connect feature maps from certain layers in the encoder to corresponding layers in the decoder, allowing the decoder to utilize feature information from different levels. This helps preserve both low-level and high-level features, enabling a better capture of local and global characteristics of the target. Moreover, the skip connections in U-Net help mitigate the vanishing gradient problem during the backpropagation process, aiding the model in learning and optimization.

The pressure and water saturation generated from 50 permeability fields are used as labeled data in this study, while 500 permeability fields are used as training data points (these 500 fields do not require iterative solutions for pressure and saturation; only the equation residuals need to be computed). Additionally, 5000 permeability fields are used as the test set.

Because the convolutional network operates on gridded images rather than on continuous coordinates, the derivatives in the governing equations are not obtained through automatic differentiation. Instead, theoretical guidance is incorporated into the network in finite-difference form for discrete training. Discretizing Equation (1) gives:

\frac{λ_{i+\frac{1}{2}} (P_{i+1} - P_{i})}{Δx^{2}} - \frac{λ_{i-\frac{1}{2}} (P_{i} - P_{i-1})}{Δx^{2}} + \frac{λ_{j+\frac{1}{2}} (P_{j+1} - P_{j})}{Δy^{2}} - \frac{λ_{j-\frac{1}{2}} (P_{j} - P_{j-1})}{Δy^{2}} = ρ C_{p} S \frac{P_{i}^{n+1} - P_{i}^{n}}{Δt} + ϕ ρ \frac{S_{i}^{n+1} - S_{i}^{n}}{Δt},

(5)

where λ represents the flux, which is calculated using the upstream weighting method. This differencing method compares the flow direction between adjacent grid cells based on the pressure gradient, as shown in Equation (6).

λ = \left( \frac{ρ_{l} k k_{rl}}{μ_{l}} \right)_{i+1/2, j}^{n+1} = k_{i+1/2, j} \left[ \frac{ρ_{l}(P_{l}^{n}) \cdot k_{rl}(S_{l}^{n})}{μ_{l}(P_{l}^{n})} \right]_{m, j}^{n}, \quad m = \begin{cases} i, & \text{if } P_{l,i,j} > P_{l,i+1,j} \\ i+1, & \text{if } P_{l,i,j} < P_{l,i+1,j} \end{cases} \quad (l = w, o).

(6)
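The upstream weighting of Equation (6) and the x-direction part of Equation (5) can be illustrated along a single grid row. The sketch below is a simplification under stated assumptions: density and viscosity are constants, the interface permeability is taken as the harmonic mean of the two neighboring cells (the paper does not state how k_{i+1/2} is formed), and all function names are hypothetical.

```python
def upstream_index(p, i):
    """Upstream cell of the interface between cells i and i + 1 (Equation (6))."""
    # Flow runs from high to low pressure. A tie falls back to i + 1, where the
    # pressure difference (and hence the flux) is zero anyway.
    return i if p[i] > p[i + 1] else i + 1


def harmonic_mean(a, b):
    """Interface permeability; the harmonic mean is an assumption of this sketch."""
    return 2.0 * a * b / (a + b)


def interface_lambda(k, sw, p, i, krl, rho=1.0, mu=1.0):
    """Coefficient at interface i + 1/2, with relative permeability taken upstream."""
    m = upstream_index(p, i)
    return harmonic_mean(k[i], k[i + 1]) * rho * krl(sw[m]) / mu


def x_flux_term(k, sw, p, i, dx, krl, rho=1.0, mu=1.0):
    """x-direction part of the left-hand side of Equation (5) at interior cell i."""
    lam_east = interface_lambda(k, sw, p, i, krl, rho, mu)
    lam_west = interface_lambda(k, sw, p, i - 1, krl, rho, mu)
    return (lam_east * (p[i + 1] - p[i]) - lam_west * (p[i] - p[i - 1])) / dx ** 2
```

For a uniform saturation and a linear pressure profile the term vanishes, while a saturation front upstream of the cell produces a net inflow, because the west interface inherits the higher relative permeability of the flooded cell.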

The model takes a permeability field and a time matrix as input, with dimensions of (2, 50, 51, 51), and outputs the pressure and saturation fields with dimensions of (1, 50, 51, 51), respectively. During the training process, the input data first pass through the coupled permeability module, and then the network splits into two parts representing the approximate simulations of pressure and saturation, respectively. Then, the model is trained sequentially, first on the labeled data and then by adding constraints from other scientific theories using the data points. Therefore, the residual of the governing equation in the water phase and oil phase can be added into the loss function of the neural network as a regularized term as follows:

\begin{aligned}
f_{l} = {} & \frac{ρ_{l}}{μ_{l} Δx^{2}} \left[ k_{i+\frac{1}{2}} k_{rl, i+\frac{1}{2}} \left( N_{P_{i+1}}(t, x, y; θ) - N_{P_{i}}(t, x, y; θ) \right) - k_{i-\frac{1}{2}} k_{rl, i-\frac{1}{2}} \left( N_{P_{i}}(t, x, y; θ) - N_{P_{i-1}}(t, x, y; θ) \right) \right] \\
& + \frac{ρ_{l}}{μ_{l} Δy^{2}} \left[ k_{j+\frac{1}{2}} k_{rl, j+\frac{1}{2}} \left( N_{P_{j+1}}(t, x, y; θ) - N_{P_{j}}(t, x, y; θ) \right) - k_{j-\frac{1}{2}} k_{rl, j-\frac{1}{2}} \left( N_{P_{j}}(t, x, y; θ) - N_{P_{j-1}}(t, x, y; θ) \right) \right] \\
& + 2 π h k k_{rl} ρ_{l} \frac{P_{l} - P_{wf}}{μ_{l} \ln(r_{e} / r_{w})} \\
& - ϕ ρ_{l} \left( C_{ϕ} N_{S}(t, x, y; θ) \frac{N_{P_{t}}(t, x, y; θ) - N_{P_{t-1}}(t, x, y; θ)}{Δt} + \frac{N_{S_{t}}(t, x, y; θ) - N_{S_{t-1}}(t, x, y; θ)}{Δt} \right).
\end{aligned}

(7)

Furthermore, the residuals of the labeled data, boundary conditions, and initial conditions can be calculated using the same methodology. Incorporating the residuals of the governing equations into the loss function makes the training directional: the model learns to rapidly produce pressure and saturation fields at different time steps for a given permeability field in a way that honors these equations. The impact of the amount of labeled data and training data on accuracy and training time has been discussed in previous work [30]. Hence, the total loss function for the TgCNN-based surrogate model in the context of the two-phase flow problem can be formulated as follows:

L(θ) = λ_{data} MSE_{data}(θ) + λ_{PDE} MSE_{PDE}(θ) + λ_{IC} MSE_{IC}(θ) + λ_{BC} MSE_{BC}(θ) + λ_{EK} MSE_{EK}(θ).

(8)
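Equation (8) is a weighted sum of mean squared residuals, which can be sketched as follows. The dictionary layout and the function names are hypothetical, and the residual lists are assumed to have been computed elsewhere (for example from Equation (7) for the PDE term).

```python
def mse(residuals):
    """Mean squared value of a flat list of residuals."""
    return sum(r * r for r in residuals) / len(residuals)


def total_loss(residuals, weights):
    """Weighted sum of MSE terms, one per entry (data, PDE, IC, BC, EK)."""
    return sum(weights[name] * mse(values) for name, values in residuals.items())


# Illustrative numbers only; the weights are not taken from the paper.
example_residuals = {"data": [1.0, -1.0], "PDE": [2.0, 0.0], "IC": [0.0], "BC": [0.0], "EK": [1.0]}
example_weights = {"data": 1.0, "PDE": 0.5, "IC": 1.0, "BC": 1.0, "EK": 1.0}
example_loss = total_loss(example_residuals, example_weights)
```

Keeping the terms in a dictionary makes it straightforward to adjust a single weight, which matters when the PDE term and the data term differ by orders of magnitude.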

3. Case Study

For the two-phase flow scenario, a numerical case study is designed in this paper. A square domain with dimensions of 510 m × 510 m is considered, uniformly divided into a grid of 51 × 51 cells. Within this domain, one injection well is located at grid position (26, 26), and four production wells are located at (1, 1), (1, 51), (51, 1), and (51, 51). The porosity of the reservoir in this domain is 20%, and the compressibility of the reservoir rock is 3 × 10⁻⁶/kPa. The initial pressure is set at 10,000 kPa, while the initial water saturation is 0.2, and the residual oil saturation is 0.2. The viscosities of water and oil are 1 mPa·s and 5 mPa·s, respectively. To simplify the problem, the capillary pressure P_{C} is set to 0, which means that P_{o} = P_{w}. In the remaining sections of the article, we use “pressure” as a general term to refer to the pressures of both the water phase and the oil phase. The four boundaries are set as closed boundaries, following the Neumann boundary conditions. The injection well is maintained at a pressure of 15,000 kPa, while the production wells continuously produce at 8000 kPa. The total simulation time is 1800 days, divided into 50 time steps of 36 days each.
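For reference, the case setup described above can be collected into a small configuration object. The class and field names are hypothetical; the values are those stated in the text, and the two derived properties simply check that the grid and time discretization are self-consistent.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CaseConfig:
    """Five-spot case parameters as stated in the text (names are illustrative)."""
    domain_length_m: float = 510.0
    n_cells: int = 51
    porosity: float = 0.20
    rock_compressibility_per_kpa: float = 3e-6
    initial_pressure_kpa: float = 10000.0
    initial_water_saturation: float = 0.2
    residual_oil_saturation: float = 0.2
    mu_w_mpas: float = 1.0
    mu_o_mpas: float = 5.0
    p_inj_kpa: float = 15000.0
    p_pro_kpa: float = 8000.0
    n_steps: int = 50
    dt_days: float = 36.0
    injector: tuple = (26, 26)  # 1-based grid indices, as in the text
    producers: tuple = ((1, 1), (1, 51), (51, 1), (51, 51))

    @property
    def cell_size_m(self):
        return self.domain_length_m / self.n_cells

    @property
    def total_days(self):
        return self.n_steps * self.dt_days


config = CaseConfig()
```

The derived cell size is 10 m and the total simulated time is 1800 days, matching the description.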

In this case, the model is established following a five-spot well pattern, which consists of one injection well and four production wells. Figure 2 illustrates one of the permeability fields and the locations of the wells. However, in the case of multiple wells, there can be multiple flow directions, making the scenario more complex. We encountered difficulties in achieving the convergence of the overall loss function, with fluctuating values. Additionally, we observed significant variations in the constraints imposed by scientific theories, even with small mismatches in the labeled data. Therefore, in this study, we first trained a highly accurate CNN model using a large amount of data, achieving an R² accuracy of 0.9999. We then evaluated the errors by applying the original data and the data-driven predictions to the equations, focusing on a point with the most dramatic variations. The results are presented in Table 1.

It can be observed that the predicted pressure at point P_{l, i+1, j} closely matches the true value, while at point P_{l, i, j}, the predicted pressure differs from the true value by only 0.53, resulting in a pressure error of approximately 0.4%. The true pressure at point P_{l, i, j} is greater than at point P_{l, i+1, j}, but the predicted pressure at point P_{l, i, j} is lower than at point P_{l, i+1, j}. This discrepancy leads to different signs in the calculation of the partial differential equation loss term when the predicted values are substituted into the equation residuals. Because the ordering of the pressures in neighboring grid cells is reversed, the differential equation loss term error increases abruptly from 0.00001 to 54. This also indicates that the data-driven model, although visually accurate in its predictions, exhibits some errors (sometimes significant ones) when the equations are evaluated. In other words, the data-driven approach has not fully captured the characteristics of the governing equations. However closely the mapping matches the true data on the test set, the predictions are likely to encounter problems when the model is applied to other, more complex test data. To address this, the study introduces soft constraints on the flow direction. The well locations are expressed in a binary manner, representing the comparison of pressure magnitudes between neighboring grid cells in the labeled dataset. Specifically, a binary value of 1 is assigned when the pressure difference is greater than zero, and a binary value of 0 is assigned when it is less than zero. The flow direction schematic diagram and the pressure difference in the horizontal direction are illustrated in Figure 3.
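The binary comparison of neighboring pressures can be sketched as below for the horizontal direction. How the binary image enters the loss function is not detailed in the text, so the mismatch fraction used here is only a hypothetical diagnostic (it is not differentiable and would need a smooth replacement during training); the function names are likewise assumptions.

```python
def direction_mask_x(p):
    """1 where pressure drops from cell (r, c) to (r, c + 1), else 0."""
    return [[1 if row[c] - row[c + 1] > 0 else 0 for c in range(len(row) - 1)]
            for row in p]


def direction_mismatch(p_pred, p_ref):
    """Fraction of horizontal interfaces whose flow direction disagrees."""
    pred_mask = direction_mask_x(p_pred)
    ref_mask = direction_mask_x(p_ref)
    total = sum(len(row) for row in ref_mask)
    wrong = sum(a != b
                for pred_row, ref_row in zip(pred_mask, ref_mask)
                for a, b in zip(pred_row, ref_row))
    return wrong / total


# A small error in one cell is enough to reverse the direction at one interface.
p_reference = [[10.0, 9.5, 9.0]]
p_predicted = [[10.0, 9.5, 9.6]]
mismatch = direction_mismatch(p_predicted, p_reference)
```

In this toy row, the prediction is off by 0.6 in a single cell, yet half of the interfaces change direction, which mirrors the sign flip described above.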

During the development of oil and gas fields, reservoir pressure gradually decreases from injection wells to production wells. Although the flow direction is not always strictly correct, under the assumption of fixed well locations and boundary conditions, the comparison of pressure magnitudes between neighboring grid cells remains relatively consistent. Therefore, this can be considered as a soft constraint and incorporated into the domain knowledge. In this study, the comparison of the pressure magnitudes between the neighboring grid cells is expressed as a binary image based on the well relationships and added to the loss function. This binary image constraint provides information that cannot be obtained from the governing flow equation.

3.1. Accuracy of the Prediction

In this section, the accuracy of the convolutional neural network surrogate model for multiple wells is evaluated and compared against results obtained from the numerical simulation software UNCONG (2020.1). The test set permeability fields are inputted into the trained surrogate model, resulting in corresponding pressure and water saturation field predictions. Using the results obtained from UNCONG as a reference, three permeability fields with significant differences are selected to compare the pressure and saturation predictions of the TgCNN surrogate model at 30 time steps, as shown in Figure 4.

To quantitatively evaluate the performance of four network models, the relative L_{2} error is introduced as follows:

L_{2}(u_{pre}, u_{true}) = \frac{‖u_{pre} - u_{true}‖}{‖u_{true}‖}

(9)
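As a minimal sketch, Equation (9) can be evaluated on flattened fields as follows, assuming the norm is the Euclidean norm; the function name is hypothetical.

```python
import math


def relative_l2(u_pre, u_true):
    """Relative L2 error between a predicted and a reference field (flat lists)."""
    num = math.sqrt(sum((a - b) ** 2 for a, b in zip(u_pre, u_true)))
    den = math.sqrt(sum(b * b for b in u_true))
    return num / den
```

A perfect prediction gives 0, and predicting zeros everywhere gives 1, so the metric can be read as an error fraction relative to the magnitude of the reference field.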

Using the results obtained from UNCONG as a reference, 100 permeability fields are selected to compare the pressure and saturation predictions of the TgCNN surrogate model using the relative L_{2} error, as shown in Figure 5.

The results indicate that the surrogate model can accurately approximate the pressure and water saturation fields, with errors in both pressure and water saturation maintained within an acceptable range. Afterwards, 200 permeability fields are randomly selected to compare the predicted values with the true values at two random spatiotemporal points, as shown in Figure 6. The results indicate that the TgCNN surrogate model can accurately simulate the pressure field and water saturation field.

3.2. Inverse Permeability Field

In this study, the Ensemble Random Maximum Likelihood (EnRML) algorithm is used for permeability inversion. According to Bayes’ theorem, the posterior probability density function of the model parameters is proportional to the product of the likelihood function and the prior probability density function. The objective function O(m) is defined as in Equation (10):

O(m) = \frac{1}{2} (m - m_{pr})^{T} C_{M}^{-1} (m - m_{pr}) + \frac{1}{2} (g(m) - d_{obs})^{T} C_{D}^{-1} (g(m) - d_{obs}).

(10)
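The objective function in Equation (10) can be sketched for the special case of diagonal covariance matrices, where each quadratic form reduces to a sum of squared mismatches divided by variances. The diagonal assumption and the function name are simplifications introduced here for illustration.

```python
def objective(m, m_pr, d_sim, d_obs, var_m, var_d):
    """Equation (10) with diagonal C_M and C_D given as lists of variances.

    d_sim stands for g(m), the simulated data for parameters m.
    """
    prior_term = sum((a - b) ** 2 / v for a, b, v in zip(m, m_pr, var_m))
    data_term = sum((a - b) ** 2 / v for a, b, v in zip(d_sim, d_obs, var_d))
    return 0.5 * prior_term + 0.5 * data_term
```

The first term penalizes departure from the prior and the second penalizes data mismatch, each scaled by its own uncertainty.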

The objective function O(m) exhibits a negative correlation with the posterior probability of the model parameters: the smaller O(m), the larger the posterior probability. Therefore, the problem of maximizing the posterior probability of the model parameters can be transformed into the problem of minimizing the objective function. The objective of the EnRML algorithm is to find a set of model parameters that maximizes the posterior probability, or equivalently minimizes the objective function. It is important to emphasize that the EnRML algorithm optimizes over multiple random realizations simultaneously. In the context of the EnRML algorithm, the term “ensemble” refers to all the realizations involved in the optimization process. The specific derivation of the EnRML algorithm can be found in [31]. Through the derivation, the following equation can be obtained:

\begin{aligned}
m_{j}^{l+1} = {} & m_{j}^{l} - \frac{1}{1 + λ_{l}} \left[ C_{M_{l}} - C_{M_{l}, D_{l}} \left( (1 + λ_{l}) C_{D} + C_{D_{l}} \right)^{-1} C_{M_{l}, D_{l}}^{T} \right] C_{M}^{-1} \left( m_{j}^{l} - m_{pr, j} \right) \\
& - C_{M_{l}, D_{l}} \left( (1 + λ_{l}) C_{D} + C_{D_{l}} \right)^{-1} \left( g(m_{j}^{l}) - d_{obs, j} \right), \quad j = 1, \dots, N_{e},
\end{aligned}

(11)

where m represents the sought-after model parameters, l denotes the iteration step, j indexes the realizations, and the coefficient λ acts similarly to a learning rate, with a larger value of λ corresponding to a smaller learning rate. The term C_{M_{l}} in the Hessian matrix represents the covariance matrix of the model parameters at the lth iteration step, while C_{M} outside the Hessian matrix represents the covariance matrix of the prior model parameters, which remains unchanged during the iteration process. C_{M_{l}, D_{l}} denotes the cross-covariance between the updated parameters and the estimated values at the lth iteration step, computed from the ensemble of realizations; C_{D_{l}} represents the covariance of the estimated values at the lth iteration step; d_{obs, j} is the observed data associated with realization j; and N_{e} indicates the number of random realizations.
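To make the update in Equation (11) concrete, the sketch below applies it to a toy problem with a single model parameter and a single datum, so that every covariance is a scalar and no matrix inversion is needed. This is not the configuration used in the paper: the fixed damping value `lam`, the prior and noise variances, and all function names are assumptions chosen for illustration.

```python
import random


def mean(xs):
    return sum(xs) / len(xs)


def cov(xs, ys):
    """Unbiased sample covariance of two equally long sequences."""
    mx, my = mean(xs), mean(ys)
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / (len(xs) - 1)


def enrml_step(m, m_pr, d_obs, g, c_m, c_d, lam):
    """One scalar update of Equation (11) applied to every ensemble member."""
    d = [g(mj) for mj in m]
    c_ml = cov(m, m)   # C_{M_l}: parameter covariance at this iteration
    c_md = cov(m, d)   # C_{M_l, D_l}: parameter/data cross-covariance
    c_dl = cov(d, d)   # C_{D_l}: covariance of the simulated data
    gain = c_md / ((1.0 + lam) * c_d + c_dl)
    shrink = (c_ml - gain * c_md) / ((1.0 + lam) * c_m)
    return [mj - shrink * (mj - mpj) - gain * (dj - doj)
            for mj, mpj, dj, doj in zip(m, m_pr, d, d_obs)]


def run_enrml(g, d_true, n_e=50, n_iter=10, c_m=1.0, c_d=0.01, lam=1.0, seed=0):
    """Sample a prior ensemble and perturbed observations, then iterate."""
    rng = random.Random(seed)
    m_pr = [rng.gauss(0.0, c_m ** 0.5) for _ in range(n_e)]
    d_obs = [d_true + rng.gauss(0.0, c_d ** 0.5) for _ in range(n_e)]
    m = list(m_pr)
    for _ in range(n_iter):
        m = enrml_step(m, m_pr, d_obs, g, c_m, c_d, lam)
    return m_pr, m
```

For example, with the linear forward model `g = lambda x: 2.0 * x` and `d_true = 3.0`, the ensemble mean moves from near the prior mean of 0 toward the value 1.5 that reproduces the datum. In practice λ is usually adapted between iterations rather than held fixed.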

In the EnRML algorithm, the Random Maximum Likelihood (RML) method is initially employed to generate multiple sets of parameters by randomly sampling the model parameters. Each set of parameters generated by RML is used to construct a random realization. In each realization, the observed data are modeled as the sum of the target variable (dependent variable) and random measurement errors. For each realization, the observed data are simulated by using the model, the corresponding parameter set, and the introduced random measurement errors. In each realization, the initial model parameters are randomly sampled from the prior probability density function of the model parameters. By generating multiple random realizations, the EnRML algorithm can comprehensively explore the parameter space and consider different possibilities of parameter sets, providing more comprehensive parameter estimations and uncertainty assessments.

The surrogate model is then used to solve the inverse problem and is compared with commercial simulation software in terms of efficiency. Based on the constructed TgCNN surrogate model, the permeability field is inverted using the iterative EnRML method [32,33,34,35]. In other words, the unknown permeability field is inferred from the known pressure and water saturation. To simplify the computation, the dimension of the permeability field is reduced using the Karhunen–Loève (K–L) expansion, which lowers the number of permeability field parameters from 2601 to 20; the details can be found in our previous publication [30]. To observe the influence of the ensemble size on the inversion, ensembles of 50 and 200 random realizations are used, with 10 iterations performed in each case. The inversion results are shown in Figure 7.
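
The dimension reduction from 2601 to 20 parameters can be illustrated with a truncated K–L expansion. The sketch below assumes a 51 × 51 grid (51² = 2601), a separable exponential covariance, and an arbitrary correlation length; none of these settings are taken from this study, and the function names are hypothetical.

```python
import numpy as np

def kl_basis_2d(n=51, length=1.0, corr_len=0.4, var=1.0, n_terms=20):
    """Leading K-L modes of a separable exponential covariance on an n x n grid."""
    x = np.linspace(0.0, length, n)
    C1 = np.exp(-np.abs(x[:, None] - x[None, :]) / corr_len)  # 1D correlation
    w, v = np.linalg.eigh(C1)            # eigenvalues in ascending order
    w, v = w[::-1], v[:, ::-1]           # reorder to descending
    lam2d = var * np.outer(w, w)         # 2D eigenvalues are products of 1D ones
    idx = np.argsort(lam2d, axis=None)[::-1][:n_terms]
    ii, jj = np.unravel_index(idx, lam2d.shape)
    # Each 2D mode is the outer product of two 1D eigenvectors, flattened.
    modes = np.stack(
        [np.outer(v[:, i], v[:, j]).ravel() for i, j in zip(ii, jj)], axis=1
    )
    return lam2d[ii, jj], modes          # shapes (n_terms,), (n*n, n_terms)

def kl_field(xi, eigvals, modes, mean=0.0, n=51):
    """Map n_terms random coefficients xi to an n x n (log-)permeability field."""
    return (mean + modes @ (np.sqrt(eigvals) * xi)).reshape(n, n)

eigvals, modes = kl_basis_2d()
xi = np.random.default_rng(0).standard_normal(20)   # the 20 unknowns to invert
log_k = kl_field(xi, eigvals, modes)
energy = eigvals.sum() / (51 * 51)   # fraction of the total variance retained
```

With such a parameterization, the inversion updates only the 20 coefficients `xi`, and the full field is rebuilt from them before each forward run.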

It can be observed that a larger ensemble leads to better inversion results. The spread and RMSE are used to track the inversion process. The spread represents the variance of the unknown parameters across the ensemble at the current iteration step, while the RMSE measures the difference between the unknown parameters at the current iteration step and the true values. Figure 8 shows that, as the iterations progress, the ensemble with 50 members reaches a relatively small spread but fails to converge properly in many areas. In contrast, the ensemble with 200 members starts from a more complex initial distribution than the ensemble with 50 members, and its inversion results accurately reflect the position of each point, resulting in better overall performance.
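
The two tracking metrics can be computed as sketched below. The formulas follow one common convention (RMSE of the ensemble mean against the truth, and the square root of the mean ensemble variance for the spread); the exact definitions used in this study may differ, and the function names are hypothetical.

```python
import numpy as np

def ensemble_rmse(M, m_true):
    """RMSE between the ensemble mean and the true parameters.

    M has shape (n_m, N_e): one column per realization.
    """
    return float(np.sqrt(np.mean((M.mean(axis=1) - m_true) ** 2)))

def ensemble_spread(M):
    """Square root of the ensemble variance averaged over all parameters."""
    return float(np.sqrt(np.mean(M.var(axis=1, ddof=1))))

# Small check: an ensemble whose members are all identical has zero spread,
# even though its mean can still be far from the truth.
M_collapsed = np.tile(np.array([[1.0], [2.0], [3.0]]), (1, 4))
m_true = np.array([1.0, 2.0, 5.0])
spread = ensemble_spread(M_collapsed)
rmse = ensemble_rmse(M_collapsed, m_true)
```

The check illustrates why a small spread alone does not indicate a good inversion: a collapsed ensemble has zero spread while its RMSE remains large, which is consistent with the behavior reported for the 50-member ensemble.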

Subsequently, this case study compares the commercial simulation software UNCONG with the TgCNN surrogate model as the forward model for the inverse problem. The reference permeability field used for the numerical experiments is shown in Figure 9a, while the mean of the initial ensemble for inversion is shown in Figure 9b. The ensemble means after inversion with the UNCONG numerical simulation software and with the TgCNN surrogate model are shown in Figure 9c,d, respectively. The computation time is 9.5 h (34,205 s) for the UNCONG software and 551 s for the TgCNN surrogate model, a speed-up of roughly 62 times. The results demonstrate that the inversion accuracy of the TgCNN surrogate model is similar to that of the UNCONG software, at a considerably lower computational cost.

4. Conclusions

Solving the inverse problem of the permeability field requires the model to undergo extensive forward computations. In this study, a surrogate model based on the TgCNN was successfully constructed through the design of the network structure and the introduction of physical constraints during the training process. With precision as a primary consideration, this research further employed a methodology that combines the TgCNN surrogate model with the EnRML algorithm for the inversion of the permeability field. The following conclusions were drawn:

(1) Considering the coupling nature of the two-phase flow equations for oil and water, an interconnected network structure was designed. This coupling model includes a permeability coupling module and two independent modules, namely, the pressure module and the water saturation module.

(2) The model is driven not only by data and equations but also by a soft constraint based on well locations, specifically the flow direction. This addition accelerates the model’s convergence and enhances predictive accuracy. Furthermore, compared to traditional numerical simulation software, the surrogate model significantly reduces computation time while maintaining predictive accuracy.

(3) The inversion accuracy of the TgCNN-based surrogate model is similar to that of commercial simulation software, while its computational efficiency is markedly higher (551 s versus 34,205 s in the case study).

5. Future Work

Several issues remain unresolved in this work. First, real-world reservoirs do not have the regular shapes depicted in our manuscript. Convolutional neural networks (CNNs) can handle irregular shapes on structured grids by treating the regions outside the reservoir boundary as 0 values, but they become difficult to apply to unstructured grids. In future work, graph neural networks (GNNs) could be employed to address irregularities between grids. Second, while this work demonstrates high accuracy and speed for two-dimensional reservoirs, the challenge escalates significantly for three-dimensional reservoirs: calculating bottomhole pressures for each layer becomes arduous, which increases the uncertainty in the data and makes it difficult to accurately invert for subsurface information. Lastly, the paper only establishes a surrogate model and compares it with traditional numerical simulations, without comparing it against other surrogate models [36]. This comparison is left for future work.
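
The zero-filling idea for irregular shapes on a structured grid can be sketched as follows. The elliptical outline, the grid size, and the names `active` and `masked_mse` are hypothetical and serve only to illustrate the principle; the sketch is not part of the presented model.

```python
import numpy as np

n = 51
yy, xx = np.mgrid[0:n, 0:n]

# Hypothetical irregular reservoir outline: an ellipse inside the n x n grid.
active = ((xx - 25) / 24.0) ** 2 + ((yy - 25) / 16.0) ** 2 <= 1.0

rng = np.random.default_rng(0)
log_k = rng.standard_normal((n, n))        # placeholder input field

# Cells outside the reservoir are passed to the CNN as 0 values.
log_k_in = np.where(active, log_k, 0.0)

def masked_mse(pred, ref, mask):
    """Mean squared error evaluated on active cells only."""
    return float(np.mean((pred[mask] - ref[mask]) ** 2))
```

Restricting the error to active cells keeps the padded region from influencing training. This approach relies on a regular array layout, which is why unstructured grids call for a different architecture such as a GNN.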

Author Contributions

Conceptualization, J.L.; Validation, Z.X.; Data curation, R.Z. and H.W.; Writing—original draft, J.L.; Writing—review & editing, Z.X. All authors have read and agreed to the published version of the manuscript.

Funding

This work was supported by the National Natural Science Foundation of China (52288101).

Data Availability Statement

All data originate from the UNCONG software.

Conflicts of Interest

Authors Ran Zhang and Haochen Wang were employed by the company Geosteering & Logging Research Institute. The remaining authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest. The Geosteering & Logging Research Institute had no role in the design of the study; in the collection, analyses, or interpretation of data; in the writing of the manuscript; or in the decision to publish the results.

References

1. Vermeulen, P.T.M.; Heemink, A.W.; Valstar, J.R. Inverse modeling of groundwater flow using model reduction. Water Resour. Res. 2005, 41, W06003.
2. Dow, E.; Szulczewski, M.; Kashinath, A. Inverse Modeling of Reservoirs with Tilted Fluid Contacts. SPE J. 2023, 28, 97–110.
3. Wang, N.; Chang, H.; Zhang, D. Theory-Guided Auto-Encoder for Surrogate Construction and Inverse Modeling. Comput. Methods Appl. Mech. Eng. 2021, 385, 114037.
4. Zhang, C.; Xi, L.; Wu, P.; Li, Z. A novel system for reducing CO₂-crude oil minimum miscibility pressure with CO₂-soluble surfactants. Fuel 2020, 281, 118690.
5. Zhang, C.; Gu, Z.; Cao, L.; Wu, H.; Liu, J.; Li, P.; Zhang, D.; Li, Z. Effect of Pressure and Temperature Variation on Wax Precipitation in the Wellbore of Ultradeep Gas Condensate Reservoirs. SPE J. 2023, 2023, 1589–1604.
6. Xu, Z.-X.; Li, S.-Y.; Li, B.-F.; Chen, D.-Q.; Liu, Z.-Y.; Li, Z.-M. A review of development methods and EOR technologies for carbonate reservoirs. Pet. Sci. 2020, 17, 990–1013.
7. Sheng, G.; Su, Y.; Zhao, H.; Liu, J. A unified apparent porosity/permeability model of organic porous media: Coupling complex pore structure and multi-migration mechanism. Adv. Geo-Energy Res. 2020, 4, 115–125.
8. Zhang, C.; Gu, Z.; Li, P.; Xu, G.; Zhang, D.; Li, Z. Investigation on the condensate gas composition variation and wax deposition mechanism during temperature-induced phase transition process. J. Clean. Prod. 2024, 442, 141109.
9. Zhang, C.; Liu, Y.; Gu, Z.; Li, P.; Li, Z.; Zhang, K. Chemicals-CO₂ mechanisms of inhibiting steam heat transfer and enhancing oil film strip: Steam flow through the wall-adhering oil film surface in porous medium. Fuel 2024, 356, 129572.
10. Xu, R.; Zhang, D.; Wang, N. Uncertainty quantification and inverse modeling for subsurface flow in 3D heterogeneous formations using a theory-guided convolutional encoder-decoder network. J. Hydrol. 2022, 613, 128321.
11. Moghaddam, M.B.; Mazaheri, M.; Samani, J.M.V. Inverse modeling of contaminant transport for pollution source identification in surface and groundwaters: A review. Groundw. Sustain. Dev. 2021, 15, 100651.
12. Zhou, H.; Gómez-Hernández, J.J.; Li, L. Inverse methods in hydrogeology: Evolution and recent trends. Adv. Water Resour. 2014, 63, 22–37.
13. Chang, H.; Zhang, D. Jointly Updating the Mean Size and Spatial Distribution of Facies in Reservoir History Matching. Comput. Geosci. 2015, 19, 727–746.
14. Eduardo, M.; Michael, J. Fast evaluation of pressure and saturation predictions with a deep learning surrogate flow model. J. Pet. Sci. Eng. 2022, 212, 110244.
15. Jayne, R.S.; Wu, H.; Pollyea, R.M. A probabilistic assessment of geomechanical reservoir integrity during CO₂ sequestration in flood basalt formations. Greenh. Gases Sci. Technol. 2019, 9, 979–998.
16. Bieker, H.P.; Slupphaug, O.; Johansen, T.A. Real-time production optimization of oil and gas production systems: A technology survey. SPE Prod. Oper. 2007, 22, 382–391.
17. Hriberšek, M.; Škerget, L. Iterative methods in solving Navier–Stokes equations by the boundary element method. Int. J. Numer. Methods Eng. 1996, 39, 115–139.
18. Xiu, D.; Karniadakis, G.E. The Wiener–Askey Polynomial Chaos for Stochastic Differential Equations. SIAM J. Sci. Comput. 2002, 24, 619–644.
19. Liu, J.; Gu, J.; Li, H.; Carlson, K.H. Machine learning and transport simulations for groundwater anomaly detection. J. Comput. Appl. Math. 2020, 380, 112982.
20. Rainer, N.; Johann, N.; Joerg, S. A surrogate model for the prediction of permeabilities and flow through porous media: A machine learning approach based on stochastic Brownian motion. Comput. Mech. 2023, 71, 563–581.
21. Zhang, K.; Wang, Y.; Li, G.; Ma, X.; Cui, S.; Luo, Q.; Wang, J.; Yang, Y.; Yao, J. Prediction of field saturations using a fully convolutional network surrogate. SPE J. 2021, 26, 1824–1836.
22. Ma, X.; Zhang, K.; Wang, J.; Yao, C.; Yang, Y.; Sun, H.; Yao, J. An efficient spatial-temporal convolution recurrent neural network surrogate model for history matching. SPE J. 2021, 27, 1160–1175.
23. Daw, A.; Karpatne, A.; Watkins, W.D.; Read, J.S.; Kumar, V. Physics-guided neural networks (PGNN): An application in lake temperature modeling. arXiv 2017, arXiv:1710.11431.
24. Raissi, M.; Karniadakis, G.E. Machine learning of linear differential equations using Gaussian processes. J. Comput. Phys. 2017, 348, 683–693.
25. Raissi, M.; Perdikaris, P.; Karniadakis, G.E. Physics-informed neural networks: A deep learning framework for solving forward and inverse problems involving nonlinear partial differential equations. J. Comput. Phys. 2018, 378, 686–707.
26. Wang, N.; Chang, H.; Zhang, D. Efficient Uncertainty Quantification and Data Assimilation via Theory-guided Convolutional Neural Network. SPE J. 2021, 26, 4128–4156.
27. Li, J.; Zhang, D.; He, T.; Zheng, Q. Uncertainty quantification of two-phase flow in porous media via the coupled-TgNN surrogate model. Geoenergy Sci. Eng. 2023, 221, 211368.
28. Peaceman, D.W. Numerical Solution of the Nonlinear Equations for Two-Phase Flow through Porous Media. In Nonlinear Partial Differential Equations; Academic Press: Cambridge, MA, USA, 1967; pp. 171–191.
29. Peaceman, D. Interpretation of well-block pressures in numerical reservoir simulation. SPE J. 1978, 18, 183–194.
30. Li, J.; Zhang, D.; Wang, N.; Chang, H. Deep learning of two-phase flow in porous media via theory-guided neural networks. SPE J. 2021, 27, 1176–1194.
31. Chen, Y.; Chang, H.; Meng, J.; Zhang, D. Ensemble Neural Networks (ENN): A gradient-free stochastic method. Neural Netw. 2018, 110, 170–185.
32. Chen, Y.; Oliver, D.S. Ensemble Randomized Maximum Likelihood Method as an Iterative Ensemble Smoother. Math. Geosci. 2011, 44, 1–26.
33. Chen, Y.; Oliver, D.S. Levenberg-Marquardt forms of the iterative ensemble smoother for efficient history matching and uncertainty quantification. Comput. Geosci. 2013, 17, 689–703.
34. Chen, Y.; Oliver, D.S. History Matching of the Norne Full-Field Model With an Iterative Ensemble Smoother. SPE Reserv. Eval. Eng. 2014, 17, 244–256.
35. Luo, X.; Stordal, A.S.; Lorentzen, R.J.; Nævdal, G. Iterative Ensemble Smoother as an Approximate Solution to a Regularized Minimum-Average-Cost Problem: Theory and Applications. SPE J. 2015, 20, 962–982.
36. Khormali, A.; Ahmadi, S. Experimental and modeling analysis on the performance of 2-mercaptobenzimidazole corrosion inhibitor in hydrochloric acid solution during acidizing in the petroleum industry. J. Pet. Explor. Prod. Technol. 2023, 13, 2217–2235.

Figure 1. The schematic diagram of the coupled TgCNN surrogate model.

Figure 2. The diagram of the permeability field and well locations.

Figure 3. Flow direction diagram and horizontal pressure binary diagram.

Figure 4. The comparison of pressure fields and saturation fields among the three stochastic permeability realizations at 30 time steps between the TgCNN-based surrogate model and the reference.

Figure 5. Relative $L_{2}$ error in 100 unlabeled permeability fields between the TgCNN-based surrogate model and the numerical simulation software UNCONG.

Figure 6. Correlation between predictions of TgCNN surrogate model and the reference values with two points among the 200 permeability fields.

Figure 7. The reference permeability field compared with the mean values of the beginning and end of the inversion.

Figure 8. Inversion results corresponding to different number of samples in the ensemble.

Figure 9. Comparison of inversion results between numerical simulation software and surrogate model.

Table 1. Evaluation of equation error from raw data and pure data-driven prediction data.

Test Point (47, 20, 22)	$P_{l, i, j}$	$P_{l, i + 1, j}$	$S_{l, i, j}$	$S_{l, i + 1, j}$	$λ_{i + \frac{1}{2}} (P_{l, i + 1} - P_{l, i})$	Loss Value
True value	135.14881	134.87471	0.797258	0.773633	−19,490.85	0.00001
Predicted value	134.61239	134.87471	0.794914	0.773633	17,914.35	54.287
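
The values in Table 1 can be checked with a few lines of arithmetic. The sketch below only reuses the numbers listed in the table; the variable names are introduced here for illustration, and the interpretation is a reading of the table rather than a statement from the analysis itself.

```python
# Pressures at the two neighboring cells (i, i + 1), taken from Table 1.
p_true = (135.14881, 134.87471)
p_pred = (134.61239, 134.87471)

# Tabulated values of the term lambda_{i+1/2} (P_{i+1} - P_i).
flux_true, flux_pred = -19490.85, 17914.35

dp_true = p_true[1] - p_true[0]     # pressure difference from the raw data
dp_pred = p_pred[1] - p_pred[0]     # pressure difference from the prediction

# Relative error of the predicted pressure at cell i.
rel_err = abs(p_pred[0] - p_true[0]) / p_true[0]

# The sign of the pressure difference decides the direction of the flux term.
sign_flipped = (dp_true < 0) and (dp_pred > 0)
```

Under this reading, a pointwise pressure error of about 0.4% is enough to reverse the sign of the pressure difference between the two cells, and with it the sign of the tabulated flux term, which is consistent with the large difference between the two loss values in the table.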

© 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
