1. Introduction
Global energy demand declined by 3.8% in the first quarter of 2020, with oil demand hit particularly hard (by about 5%) due to the pandemic [1]. Although the pandemic shifted common behavior from work-from-office to work-from-home and caused a sharp reduction in non-essential activities, oil demand is predicted to rebound to 2019 levels in 2023 [1]. This prediction is based on the recovery of the global economy, as the pandemic had varying impacts across countries [2]. This demand growth needs to be balanced by supply, yet the global oil supply was adversely affected by project delays and demand uncertainty; the currently active projects are barely able to cover the demand growth until 2026 [3]. At the same time, some companies have shifted their focus and business models in response to pressure from governments and activists to move towards net zero carbon emissions by 2050 [3].
The pandemic has also been one of the drivers of renewable energy growth. In 2020, renewable energy generation capacity, dominated by the increase in solar and wind capacity, grew by 10.3% compared to the previous year [4]. Efforts to reduce dependency on fossil fuels have also emerged, but some measures were affected by the pandemic: carbon tax rates, pricing initiatives, and allowance prices were reduced, and key meetings were delayed due to market uncertainty [5]. Despite this situation, several countries, such as Germany and China, planned to introduce carbon pricing schemes in 2021 [5]. Such plans create a paradigm shift in the operating plans of oil companies in these countries by adding an extra constraint to the tax calculation, i.e., an emissions tax, given that the volatility of oil price and demand will persist even after the pandemic.
Fulfilling the oil demand while reducing the carbon footprint are often considered two conflicting tasks. Achieving negative emissions is challenging for the oil industry, as many parameters are involved in every stage of reservoir exploitation. Nonetheless, near-zero emissions are attainable under some conditions. The Johan Sverdrup field (operated by Equinor), for example, emits only 4% of the world's average CO2 emissions per barrel of oil produced, thanks to power from shore [6]. Preplanning during exploration is important, but most existing fields were not preplanned as well as this one. Considering the production decline in most existing fields, an alternative approach to reach climate targets while meeting energy demand is CO2 Enhanced Oil Recovery (EOR). CO2-EOR is considered one of the answers to the demand for carbon capture, utilization, and storage (CCUS) while retaining profitability [7].
Assessing the feasibility of EOR projects requires several evaluations before field-scale application. Aside from the common steps, viz. technical and economic screening, reservoir design is needed for CO2-EOR [8]. Several considerations, for instance, the injection design, rate, cycle, composition, and existing field constraints, are accounted for in this phase. In the context of CO2-EOR injection design, water-alternating-gas (WAG) has been proven to be one of the best designs [9]. CO2-WAG increases the microscopic displacement efficiency and yields better mobility control to contact the unswept zones compared to continuous flooding. To design the parameters of the process, optimization needs to be performed. Optimization is computationally expensive, as numerous runs are required, and performing it with existing reservoir simulators demands an extensive amount of time and memory.
Building proxy models to solve the optimization problem is commonly performed for study assessment. A proxy model is a mathematically or statistically defined function that replicates the numerical simulation model outputs given selected input parameters. The ability of proxy models to learn complex reservoir model behavior has been proven in recent studies using both synthetic and field models [10]. A proxy model learns from training and validation data generated from the response surface or the simulation model output. It is an engineering tool that also honors the physics of the problem at hand, constructed by managing, clustering, and filtering data into information that is subsequently learned as knowledge.
The primary objective of this work (which is mainly based on the results obtained in [11]) is to find a strategy that allows a significant reduction in the runtime and storage space associated with commercial simulators without sacrificing accuracy. This study is performed using CO2-WAG, one of the most common EOR methods. A proxy model is built as a reservoir model substitute, and the total oil produced and CO2 stored are maximized as the objectives of the optimization problem. The study is performed on two geological models, where one acts as a simple model while the other represents the complexity expected in a real field model. The main points studied in this research are:
Development of a proxy model for the simple and complex reservoir models, involving sampling using design of experiments, proxy building, and proxy robustness assessment.
Analysis of the problems and complexity encountered when building a proxy of a complex reservoir model, compared with the results of proxy development for a simple reservoir model.
Solving the optimization problem with the generated proxy, both for the simple and complex reservoir models, using an optimization algorithm.
The paper is divided into several sections.
Section 1 introduces the background and the objective.
Section 2 summarizes the basic theory and workflow of the study.
Section 3 describes the materials and method to solve the optimization problem.
Section 4 focuses on the details of how the proxy model is built to solve the optimization problem.
Section 5 describes the results of the two proxy models. Finally, the discussion and conclusions are presented in the last two sections.
2. Basic Theory
2.1. Previous Study on Proxy Modeling
The study of proxy models was initiated in the 2000s in the field of petroleum/reservoir engineering, mostly to solve computationally expensive problems. Zubarev [10] reviewed the growth of proxy model studies from 1998 to 2008 and found that all proxy modeling techniques showed a dependence on the complexity of the reservoir model, the dimensions of the solution space, and the quality of the dataset.
Gholami [12] applied a proxy model, or surrogate reservoir model (SRM), in a study of a complex problem in which a smaller elemental volume was studied. The study was applied to CO2-WAG to mimic grid behaviors (pressure and saturation) and well-based behaviors (production rates of oil, gas, and water). Based on the results, the constructed proxy model learned the desired pressure, saturation, and rate behaviors, with one year used as the timestep interval for the reporting frequency.
Amini [13] performed a detailed study of a grid-based proxy model following Gholami's results. The cascading effect (in which error is carried over and accumulates over iterations) and a comparison between fine-grid and coarse-grid reservoir models were analyzed in a CO2 sequestration study. The study showed that the coarser model required fewer runs for training purposes than the fine-grid model. In addition, the cascading procedure showed significant errors at the last time step.
In one of the more recent studies, Chaki et al. [14] developed a proxy model to perform history matching using the Brugge field model as the reservoir model to be learned. They applied two methods, a deep neural network (DNN) and a recurrent neural network (RNN), to build a proxy that learns the behavior of the reservoir model (oil and water production rates and cumulative production). The RNN demonstrated better performance than the DNN, yet the time needed to construct it was 15 times longer than that for the DNN.
Nait Amar et al. [15,16] performed CO2-WAG optimization studies using proxy models. Two different machine learning (ML) techniques, namely an ANN [15] and a hybrid support vector approach [16], were implemented. These ML-based proxy models were able to learn the reservoir behavior (oil and water rates) and were then coupled with nature-inspired algorithms for the CO2-WAG optimization study. In addition, another study demonstrated how an ANN could be employed to build a proxy model for waterflooding optimization in a fractured reservoir model [17]. Thereafter, this study was extended to improve the methodology as discussed in [18].
2.2. Possibilities for Proxy Modeling Improvement
Although many studies related to proxy modeling have been performed, there are still research gaps to be filled. Most studies showed that a proxy model can be a powerful tool for various tasks, including predictive models for CO2 and H2 property estimation [19,20] and substitutes for reservoir models, as detailed in the previous section. In the domain of reservoir and production engineering, proxy models provide an alternative to overcome the current computational limitations in reservoir modeling (tremendous run time and memory consumption). In this respect, most studies have worked with only one geological model, and few studies have focused on the feasibility of applying this idea to different geological models (considering geological uncertainty) [21,22]. Some studies have also mentioned that the proxy reflects the complexity of the reservoir model, yet quantitative results are still to be reported [14]. Therefore, an investigation was performed to check whether a proxy model is a feasible substitute for the reservoir models studied in this paper. It is worth noting that proxy models are generally case specific.
2.3. Workflow
There are generally four main steps that can be used as a guideline for proxy modeling.
- 1. Determining the study objective: A proxy model only learns from the given sets of data and information; this limits the model and makes it case specific. Different proxies need to be built for different study objectives, which in turn determine the sampling plan, the proxy input–output combinations, the study limitations, and the algorithms required to solve the problem.
- 2. Data sampling: After determining the study objective, the proxy scale, and the input data for the proxy, data sampling can be performed. Data sampling is usually carried out by running the reservoir model; the results to be learned are then sampled for the proxy learning dataset. Data sampling can be performed using available statistical sampling techniques.
- 3. Data management: After obtaining the sampling plan, the reservoir model runs are performed. Many data points can be obtained, yet not all of them will be used for proxy model training.
- 4. Designing and building the proxy model: Using available machine learning or deep learning models, a proxy model can be built. The proxy model approximates the numerical reservoir model and should mimic the nonlinearity of its responses. The complexity of the proxy model itself reflects the complexity of the reservoir model. Studies on proxy modeling have used approximately the same workflow, shown in Figure 1.
2.4. Artificial Neural Network (ANN)
In this study, an ANN was used to build the proxy. An ANN is a computational model inspired by the biological behavior of the human neurological system. It consists of inputs, weights, biases, and activation functions. The structure, formed of several layers, each containing several nodes before reaching the output, is called the topology (Figure 2). The activation function gives the ANN the ability to model the nonlinearity of a reservoir model. An ANN learns through forward propagation and backpropagation, with the weights updated during each pass to minimize the error. During forward propagation, the signal moves from the input layer to the output layer, passing through each node and being transformed by the activation function, until the result is output. This is the neural network prediction.
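As a reminder of the standard formulation (not quoted from the paper), the prediction of a single node j with inputs x_i, weights w_{ij}, bias b_j, and activation function f is:

$$ y_j = f\left(\sum_{i} w_{ij}\, x_i + b_j\right) $$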
The difference between the predicted and the actual output dataset is then calculated using a loss function, such as the mean squared error (MSE), root mean square error (RMSE), or absolute error (AE). Backpropagation is then performed to minimize the loss function, with the loss propagated back towards the input layer as a fraction of the total loss signal. These two processes are repeated until the error satisfies the set limit or the number of iterations (epochs multiplied by batch size) is reached.
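For N data points with actual outputs y_i and predictions ŷ_i, the MSE takes its standard form, assumed here to correspond to the Equation (3) referenced later:

$$ \mathrm{MSE} = \frac{1}{N}\sum_{i=1}^{N}\left(y_i - \hat{y}_i\right)^2 $$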
A hyperparameter is a parameter that controls the learning process. The activation function, the number of hidden layers, and the number of nodes per layer are hyperparameters of an ANN. In addition, the learning rate, optimizer, batch size, dropout, and number of epochs are also considered. The learning rate is the step size for each iteration, and the optimizer is the optimization function used to minimize the loss function. The dropout is the probability that nodes are randomly disconnected during training, and the batch size is the number of data points that pass through the neural network in each step. The number of epochs is the number of cycles the training dataset goes through.
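To make these hyperparameters concrete, the sketch below assembles a small fully connected network of the kind described above using Keras; the layer count, node count, learning rate, dropout, batch size, and epochs are illustrative placeholders rather than the values used in this study (those are given in Table 3).

```python
from tensorflow import keras
from tensorflow.keras import layers

def build_proxy(n_inputs, n_hidden_layers=2, n_nodes=32,
                learning_rate=1e-3, dropout=0.1):
    """Minimal ANN proxy sketch; all hyperparameters here are placeholders."""
    model = keras.Sequential()
    model.add(keras.Input(shape=(n_inputs,)))
    for _ in range(n_hidden_layers):
        model.add(layers.Dense(n_nodes, activation="relu"))  # activation function
        model.add(layers.Dropout(dropout))                    # random node disconnection
    model.add(layers.Dense(1))  # single output, e.g., one normalized rate
    model.compile(optimizer=keras.optimizers.Adam(learning_rate),
                  loss="mse")   # MSE as the loss function
    return model

# Illustrative training call: X, y would be the normalized proxy inputs/outputs.
# model = build_proxy(n_inputs=4)
# model.fit(X_train, y_train, validation_data=(X_val, y_val),
#           batch_size=32, epochs=100)
```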
2.5. NSGA-II (Non-Dominated Sorting Genetic Algorithm II)
NSGA-II is an optimization algorithm for solving multi-objective optimization problems and is based on genetic algorithms. A genetic algorithm is inspired by the theory of natural evolution: it reflects the natural selection process, where the fittest individuals are selected to reproduce the offspring of the next generation. NSGA-II can be categorized as an evolutionary algorithm. This class of algorithms was developed due to issues with classical and gradient-based techniques, including performance that depends on the initial guess and convergence to sub-optimal solutions. Three key features of this algorithm are:
- 1. Elitist principle;
- 2. Explicit diversity-preserving mechanism;
- 3. Emphasis on the non-dominated solutions.
The built-in implementation in PyMoo [23] is used for NSGA-II in this study. For an illustration of the NSGA-II procedure, please refer to the work in [24].
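As an illustration only (not the authors' actual script), the NSGA-II implementation built into PyMoo can be invoked as follows on a standard bi-objective test problem; the import paths follow recent pymoo releases and the settings are placeholders.

```python
from pymoo.algorithms.moo.nsga2 import NSGA2
from pymoo.optimize import minimize
from pymoo.problems import get_problem

# Standard bi-objective test problem (ZDT1), used only to illustrate the call pattern.
problem = get_problem("zdt1")
algorithm = NSGA2(pop_size=40)

res = minimize(problem, algorithm, ("n_gen", 100), seed=1, verbose=False)
print(res.F)  # objective values of the final non-dominated (Pareto) set
```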
4. Proxy Development Workflow
In this section, the proxy development process is explained briefly and performed for both reservoir models.
4.1. Design of Experiments
The sampling in this study was performed using Latin Hypercube Sampling (LHS) [28]. This sampling can be considered a leveled sampling. It was chosen based on previous study results and successful applications of sampling for WAG studies, for example, by Nait Amar et al. [16]. The two levels of sampling performed in this study are as follows.
- 1. Half-cycle length: Sampling was performed for 3, 6, 9, and 12 month half-cycle lengths. This means that the proxy will not correctly predict the CO2-WAG behavior outside these half-cycle lengths.
- 2. Gas injection rate and water injection rate: The gas and water injection rates are sampled equally for each possible half-cycle length, since each half-cycle length value has equal probability. More samples can be added if the proxy cannot learn from the given number of samples. The probability distribution of these parameters is uniform, and no parameter dominates the others; hence, standard LHS can be applied (a minimal sampling sketch is given below).
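A minimal sketch of such a two-level sampling plan is shown below, using SciPy's qmc module; the rate bounds, the number of samples per half-cycle length, and the variable names are hypothetical placeholders, not the study's actual values.

```python
import numpy as np
from scipy.stats import qmc

# Hypothetical parameter bounds, for illustration only (not the actual study bounds):
# gas injection rate [Msm3/d] and water injection rate [sm3/d].
lower = np.array([1.0, 3000.0])
upper = np.array([3.0, 9000.0])
half_cycles = [90, 180, 270, 360]   # 3, 6, 9, 12 months, in days

samples_per_cycle = 17              # equal number of samples per half-cycle length (placeholder)
plan = []
for hc in half_cycles:
    sampler = qmc.LatinHypercube(d=2, seed=hc)
    unit = sampler.random(n=samples_per_cycle)   # points in the unit square
    rates = qmc.scale(unit, lower, upper)        # scale to the physical bounds
    plan += [(hc, g, w) for g, w in rates]

# Each tuple (half-cycle length, gas rate, water rate) defines one simulation run.
print(len(plan), plan[0])
```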
4.2. Data Preparation
Based on the formulation of the optimization problem, two simulation outputs are learned by the proxy, i.e., FOPR and FCO2PR. These variables are available as outputs of a simulation run. In addition, the half-cycle length, timestep, gas injection rate, and water injection rate are used as the proxy inputs. Normalization is performed to help the convergence of the ANN, using maximum–minimum (min–max) normalization.
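In its standard form, which is assumed here, min–max normalization maps each variable x to the [0, 1] range:

$$ x_{\mathrm{norm}} = \frac{x - x_{\min}}{x_{\max} - x_{\min}} $$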
4.3. Proxy Building
The proxy can be developed properly only if the type of proxy to be built and the data to be used are known. Failure to proceed in this sequence will cause an iterative loop between these three sections (problem formulation, model preparation, and proxy building) until the first two are determined. An overview of the study outline, together with the software and programming languages used, is shown in Figure 6.
WAG has a distinct behavior. For a WAG ratio higher than 1, a sharp increase in production is observed during the water injection phase, whereas for a WAG ratio lower than 1, the increase occurs during the gas injection phase. This distinct behavior is used as the basis for building the proxy: the database is split based on the injection phase. These concepts are applied to both the FCO2PR and FOPR proxies. For the Gullfaks model, a sharp production change is identified one month after the injection fluid (water or CO2) is switched, which can be seen as a delay compared to the Egg model. This results in proxy segmentation based on the injection phase.
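A minimal sketch of this phase-based segmentation is shown below, assuming the simulation outputs are collected in a pandas DataFrame with a hypothetical injection_phase column; neither the column names nor the function are taken from the study itself.

```python
import pandas as pd

# df is assumed to hold one row per report timestep with the proxy inputs/outputs,
# e.g., columns: half_cycle, timestep, gas_rate, water_rate, FOPR, FCO2PR, injection_phase.
def segment_by_phase(df: pd.DataFrame) -> dict[str, pd.DataFrame]:
    """Split the training database into one segment per injection phase."""
    return {phase: sub.reset_index(drop=True)
            for phase, sub in df.groupby("injection_phase")}

# segments = segment_by_phase(df)
# segments["water"] and segments["gas"] would then be used to train separate proxies.
```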
4.4. Proxy Training and Validation
An optimization process was conducted to find the best topology of the ANN for every proxy segment. The optimized parameters were the learning rate, the number of hidden layers, and the number of nodes in each layer, using 100 epochs and the MSE (Equation (3)) as the loss function. The networks with the optimum hyperparameters were then retrained with 1500 epochs to obtain the best performance. The detailed hyperparameters are listed in Table 3.
It is essential to treat the behavior of the segments as one proxy system. As one of the inputs is the result from the previous timestep, the value obtained from the proxy itself was used as that input. The segments were then tested as parts of one proxy to verify its robustness. Instead of relying on visual inspection, an error calculation was performed to quantify the robustness of the proxy model. For this purpose, the average percentage relative error (APRE, %) was used. The equation is shown below.
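In its usual form, which is assumed here (the paper may equivalently use the absolute-value variant), the APRE over N data points with simulated values y_i and proxy predictions ŷ_i is:

$$ \mathrm{APRE} = \frac{100\%}{N}\sum_{i=1}^{N}\frac{\hat{y}_i - y_i}{y_i} $$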
4.5. Blind Test
Following training, a blind test was performed on the proxy models. Here, 12 runs outside of the LHS sampling were prepared. The same assessment as explained above was performed before the proxy model was used, with the APRE used as the error measure to determine the performance.
4.6. Optimization Study
NSGA-II was employed in this study. It is worth mentioning that no optimization was performed for the hyperparameters of the optimization algorithm itself; doing so would require a deeper understanding of how the NSGA-II implementation is built. The population size was kept at 40, the number of generations for one study was set to 100, and ten new offspring were generated in each generation. The whole study therefore takes more than 1000 proxy runs to finish.
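A hedged sketch of how these settings (population size 40, 10 offspring per generation, 100 generations) could be wired into PyMoo is given below; WagProxyProblem, proxy_oil, and proxy_co2 are hypothetical placeholders standing in for the trained proxy evaluations, not code from this study.

```python
import numpy as np
from pymoo.core.problem import ElementwiseProblem
from pymoo.algorithms.moo.nsga2 import NSGA2
from pymoo.optimize import minimize

class WagProxyProblem(ElementwiseProblem):
    """Hypothetical wrapper around the trained proxies (placeholders only)."""

    def __init__(self, proxy_oil, proxy_co2, xl, xu):
        super().__init__(n_var=len(xl), n_obj=2, xl=np.array(xl), xu=np.array(xu))
        self.proxy_oil = proxy_oil  # callable returning total oil produced for a design x
        self.proxy_co2 = proxy_co2  # callable returning total CO2 stored for a design x

    def _evaluate(self, x, out, *args, **kwargs):
        # Both objectives are maximized, so they are negated for pymoo's minimization convention.
        out["F"] = [-self.proxy_oil(x), -self.proxy_co2(x)]

# algorithm = NSGA2(pop_size=40, n_offsprings=10)
# res = minimize(WagProxyProblem(proxy_oil, proxy_co2, xl=[...], xu=[...]),
#                algorithm, ("n_gen", 100), seed=1)
```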
6. Discussion
Each proxy model is case specific, based on the data provided for its learning. This results in limitations of the proxy models; for example, they are not one-size-fits-all solutions for optimization problems. In this work, the proxy models were created based on the discussed reservoir models and cannot reflect the physical trends of other reservoir models, which, to a certain extent, limits their application. Additionally, the established proxy models are field based, meaning that they can only depict the responses of the reservoir at the field scale, such as FOPR; they cannot predict any output at the grid or well scale. In this study, the methodology was illustrated without aiming to create a universal solution. Therefore, two proxy models representing different reservoir models were built to solve the optimization problem. One of the models (the Egg Model) was modified to make both models similar in size for comparison purposes. A comparative study is presented based on the steps of building the proxy model. It is important to note that this methodology was only tested with simulation data; therefore, validation with real field data is recommended (when available) in future work to achieve higher maturity in terms of implementation.
6.1. Data Complexity
The rate response to the WAG injection phase for both models is shown in Figure 12. Two cases are presented. One is for a 90 day half-cycle length, a 2 Msm3/d gas injection rate, and a 3000 sm3/d water injection rate; this case is shown up to 1800 days only. The other is for a 360 day half-cycle length, a 2 Msm3/d gas injection rate, and a 9000 sm3/d water injection rate. These cases were run as training cases for both models, as they lie on the edges of the Latin Hypercube.
Regarding the FCO2PR behavior, a sharp increase in CO2 production can be seen for the 90 day half-cycle. However, in the case of the 360 day half-cycle for the Gullfaks model, there is a smooth transition rather than a sharp increase or drop in production. These different behaviors would have to be learned by the same proxy model if it were not segmented based on half-cycle length. From the FOPR plots, higher production rates can be noticed compared to the base case shown in Figure 5.
The difference in the responses of the two models is primarily due to the different geological models and well placements. Other parameters, such as the fluid model, relative permeability relationships, and initialization, are identical for both models. These complex behaviors were learned by the proxies built for the respective reservoir models.
6.2. Number of Samples Needed
It can be concluded that the model complexity aligns with the number of samples needed to develop the proxy. For the Egg Model, 68 samples are enough, while the Gullfaks model requires 97 samples. The total time to run all samples scales with the number of samples needed. This phase is the most time-consuming part, as both models need more than 9 h to complete all runs for the training-validation dataset.
There is generally no strict guideline for deciding the exact number of samples used to build a proxy [29]. Adding extra runs increases the size of the dataset, the time needed to collect the data, and the time needed for the algorithm to learn from it. Therefore, there is a trade-off to be considered when selecting the number of samples. To tackle this, segmentation was performed, as explained in the next section.
6.3. Proxy Model Development Process
To improve robustness, segmenting the proxy was attempted first, rather than increasing the number of samples. Segmenting the time into several parts showed the best improvement compared to other methods. The question that arises is how many segments are needed. In this study, three segments were enough for the Egg model, but more than five were needed for the Gullfaks model.
Before assessing the proxy robustness, the computational time needed to build all the proxy segments was studied. The time presented includes the database segmentation, topology optimization, retraining of the ANN with a higher number of epochs, and running the whole training, validation, and blind dataset through the proxy model. More data and more segments increase the amount of time needed. All proxy development statistics are tabulated in Table 7.
6.4. Proxy Robustness
In this study, an average APRE of less than 2% for the training-validation phase was used as the target. In addition, the error window should be within a range of 10%. These numbers were selected arbitrarily as the bounds for this study. Both proxy models must fulfill these criteria before being tested using the blind test dataset. A more thorough assessment was also performed to determine the error variation for each training-validation case.
Both models were assessed using the blind test as the final check of robustness. In general, both models have similar APRE values for the training, validation, and blind tests. The robustness of the blind test was assessed in the same way as for the training and validation. It was observed in this study that when a proxy model meets the robustness criteria during training and validation, it tends to show the same robustness during the blind test.
6.5. Optimization Results
Both proxy models need an almost equal amount of time to finish the optimization; the difference in the time taken to finish a run is only 0.41 s. No hyperparameters were changed, and the total number of generations was the same for both models (100 generations).
The numbers of Pareto optima are different: there are 40 Pareto optima for the Egg model and only 32 for the Gullfaks model. The Pareto fronts of the two models also differ, yet both models show the same general behavior: if the water injection rate is increased while the gas injection rate stays the same, the oil production increases while the amount of CO2 stored in the reservoir decreases. For the Egg model, this behavior is almost linear; for the Gullfaks model, it is far from linear. All statistics for the optimization process are tabulated in Table 8.