#### **1. Introduction**

Enhanced oil recovery (EOR) processes create favorable conditions for producing oil through interfacial tension (IFT) reduction, wettability alteration or oil viscosity reduction [1]. CO2 injection is one of the most practical and effective EOR methods because it significantly reduces the oil viscosity and improves the sweep efficiency [2,3]. The minimum miscibility pressure (MMP) plays a significant role in designing gas flooding operations. The MMP can be measured using slim tube tests [4]; however, laboratory measurements are costly and time-consuming [5]. The MMP can also be estimated using empirical correlations, but these can deviate significantly, with absolute errors of up to 25% [6–10]. Several empirical correlations have been proposed to estimate the MMP from the reservoir conditions and the fluid compositions, and they were developed using regression approaches. Common examples are the Glaso, Sebastian et al. and Khazam et al. correlations [6,8,9]. Generally, the accuracy of an empirical correlation increases with the mathematical complexity of its equation. Most empirical correlations are used mainly for fast screening applications [10].

In addition, analytical methods have been coupled with an equation of state (EOS) to estimate the MMP, with a reported average absolute percentage error of 15.7%. The main advantage of analytical methods is that they can determine the MMP without introducing the uncertainties associated with the condensing or vaporizing (CV) displacement process. Because the displacement processes involve complex miscibility mechanisms that reduce the reliability of MMP prediction [11–13], Yuan et al. [14,15] applied the analytical gas theory to estimate the CO2-MMP. They used more than 180 data sets to build and evaluate the proposed semi-empirical model. They concluded that the developed model is more accurate and reliable than the empirical correlations, with a maximum average absolute percentage error of 9%. Furthermore, numerical approaches are used to determine the MMP. Fine-grid compositional simulations are utilized to solve the equation of state for particular grid sizes. The advantage of this approach is that it can be applied to heterogeneous systems with different pressure and temperature distributions as well as various fluid compositions. However, numerical models can suffer from stability problems and may require very small time steps to obtain stable solutions [10,15].

Artificial intelligence (AI) has proved to be an effective tool for prediction tasks because it can capture complex relationships between input and output parameters [16]. The artificial neural network (ANN) is the most widely known technique applied in the petroleum industry [17–20]. The ANN technique has been utilized to predict drilling fluid rheology, estimate reservoir permeability and characterize unconventional reservoirs [21–26]. AI has also been used to predict the performance of several EOR processes, such as the steam-assisted gravity drainage (SAGD) process in heavy oil reservoirs [27–29].

Edalat et al. [30] presented a new ANN model to determine the MMP during hydrocarbon injection operations. A multi-layer perceptron (MLP) with two-layer feed-forward backpropagation was used. A total of 52 data points from an Iranian oil reservoir were employed, with 80% for training and 20% for testing. They compared their results with slim tube tests and correlations, and reported a maximum error of 18.58% and an R-squared (R2) of 0.938. They concluded that their ANN model can determine the MMP for different fluid compositions.

Dehghani et al. [31] combined a genetic algorithm with an ANN technique (GA–ANN) to determine the minimum miscibility pressure for gas injection operations. A total of 46 data points from MMP experiments were utilized, and backpropagation with two hidden layers was used. The GA–ANN model predicts the MMP using the reservoir parameters and the injected-gas composition. They concluded that the developed model provides a high level of dependability and accuracy for determining the MMP.

Shokrollahi et al. [32] utilized a support vector machine technique to determine the MMP for CO2-injection operations. A total of 147 experimental CO2-MMP data points were used to develop the model and validate its reliability; the coefficient of determination (R2) and the average absolute percentage error were 0.90 and 9.6%, respectively. They reported that the proposed model shows high performance and a good match with the experimental data.

Liu et al. [33] suggested an improved method for estimating the CO2-MMP utilizing magnetic resonance imaging (MRI). The obtained results showed good agreement with the experimental measurements. Khazam et al. [9] developed a simple correlation to determine the CO2-MMP using a regression tool. A total of 100 PVT measurements from Libyan oilfields were used, covering a wide range of conditions, with CO2-MMP values between 1544 and 6244 psia. The developed correlation requires the oil properties and the system conditions (pressure and temperature) to estimate the CO2-MMP with a high degree of accuracy (R<sup>2</sup> of 0.95 and average absolute percentage error (AAPE) of 5.74%). However, the model was developed from limited data from a single region, as all samples were collected from Libyan oilfields. Czarnota et al. [5] presented a new approach to estimate the CO2-MMP using an acoustic separator by taking images of the CO2/oil system as a function of system pressure. They reported that the proposed approach can minimize the time required to obtain the CO2-MMP.

Rostami et al. [34] applied the support vector machine (SVM) technique to estimate the CO2-MMP during CO2 flooding. SVM was used to determine the CO2-MMP for live and dead crude oil systems. The developed model gave accurate predictions, with an average absolute relative deviation (AARD) of less than 3% and a minimum coefficient of determination (R2) of 0.99. However, no direct equation is reported, and the developed SVM model is considered a black box model. Alfarge et al. [35] used laboratory measurements and field studies to characterize CO2 flooding in shale reservoirs; more than 95 case studies were used. They constructed a proxy system to predict the incremental oil recovery based on the affecting parameters. The relationship between rock properties and incremental oil recovery was explained, and the effects of permeability, porosity, total organic carbon content and fluid saturations on oil production were investigated. They stated that their findings could help to understand the complex recovery mechanisms during CO2-EOR operations.

Based on an intensive literature review, significant deviations between the measured and predicted CO2-MMP were observed. Analytical and empirical models can lead to considerable estimation errors. Artificial intelligence methods can improve the prediction performance for CO2-MMP. However, the available AI models were developed based on hydrocarbon injection data, not CO2-flooding data, which may lead to unreliable predictions. The difference between hydrocarbon injection and CO2 flooding is significant in terms of system disturbance and miscibility mechanisms [10,14,15]. Hydrocarbon injection is usually implemented by injecting fluid of the same composition as the reservoir fluid, while CO2 flooding introduces new components that disturb the reservoir system. First contact miscibility (FCM) is usually associated with hydrocarbon injection, while injecting non-hydrocarbon fluids (such as CO2 and N2) leads to multiple contact miscibility (MCM) [10,15]. Considerable errors could be generated when hydrocarbon injection models are used to predict CO2-flooding performance [34,35]. Therefore, a reliable model that estimates the CO2-MMP based on actual CO2-flooding data is highly needed.

In this paper, a reliable approach is presented to determine the MMP during CO2 miscible flooding. Several artificial intelligence (AI) methods were studied, such as neural network, radial basis function, generalized neural network and fuzzy logic. The studied models investigate the significance of reservoir temperature, oil gravity, hydrocarbon composition and injected-gas composition for the CO2-MMP. More than 100 data sets from actual CO2-MMP experiments were used to develop the model and investigate its reliability. This work introduces an effective approach for estimating the MMP during CO2 flooding, which could be used to refine the current numerical or analytical models and result in a better determination of the CO2-MMP.

#### **2. Methodology**

The data used were gathered from several published papers [7,10,14,15,36,37]. The minimum miscibility pressures (MMP) were measured using slim tube tests. The dataset covers a wide range of reservoir conditions and hydrocarbon compositions; the main inputs are fluid composition, reservoir temperature and molecular weights. The dataset was randomly divided into two groups: a training group (70% of the total data set) and a testing group (30% of the total data set). Before developing the AI models, a statistical analysis was conducted by determining the minimum, maximum, mean, mode and other parameters, as listed in Table 1. The temperature data span a range of 229 °F, with a minimum value of 71 °F, a maximum of 330 °F and an arithmetic mean of 185.67 °F. The MMP values vary between 1100 and 5000 psia, with an arithmetic mean of 2583.49 psia. The statistical dispersion of the MMP results was measured by calculating the standard deviation, skewness and kurtosis; values of 876.98, 0.21 and 2.20 were obtained, respectively, which indicates that the data points are spread out over a wide range of values.
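The descriptive statistics and the random 70/30 split described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the procedure actually used in the study: the helper names `describe` and `split_train_test`, the sample values, the population (divide-by-n) moment formulas and the non-excess kurtosis convention are all assumptions, since the text does not state which conventions were applied.

```python
import math
import random


def describe(values):
    """Return min, max, range, mean, SD, skewness and kurtosis of a sample."""
    n = len(values)
    mean = sum(values) / n
    # Central moments in population form (dividing by n); this is an assumption.
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    m4 = sum((v - mean) ** 4 for v in values) / n
    return {
        "min": min(values),
        "max": max(values),
        "range": max(values) - min(values),
        "mean": mean,
        "sd": math.sqrt(m2),
        "skewness": m3 / m2 ** 1.5,
        "kurtosis": m4 / m2 ** 2,  # non-excess form: a normal distribution gives 3
    }


def split_train_test(rows, train_fraction=0.7, seed=0):
    """Shuffle rows reproducibly and split them into training and testing groups."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    cut = round(train_fraction * len(shuffled))
    return shuffled[:cut], shuffled[cut:]


# Hypothetical MMP values in psia, for illustration only (not the study's data).
mmp = [1100.0, 1800.0, 2000.0, 2300.0, 2600.0,
       2900.0, 3100.0, 3400.0, 4200.0, 5000.0]
stats = describe(mmp)
train, test = split_train_test(mmp)
print(stats)
print(len(train), "training points,", len(test), "testing points")
```

Fixing the random seed keeps the split reproducible, which makes it easier to compare different models on the same training and testing groups.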


**Table 1.** A statistical analysis of the input and output data used in this study.

In addition, frequency histograms were obtained for all data to give a rough estimate of the distribution density. The histograms indicate that most of the data set follows a multimodal pattern, as shown in Figure 1. Finally, the correlation coefficient was determined to measure the strength and direction of the linear relationship between the input data and the MMP data (Figure 2). Values of 0.7481, −0.493 and 0.1626 were obtained for temperature, mole fractions and molecular weight, respectively. The correlation coefficient analysis therefore reveals that the MMP has a strong linear relationship with the system temperature, a moderate relationship with the mole fractions and a weak relationship with the molecular weight.

**Figure 1.** Histograms of input and output data. (**A**) Reservoir temperature, (**B**) Mole % of C2–C6, (**C**) MW of C7<sup>+</sup> and (**D**) CO2-MMP (MMP = minimum miscibility pressure).

**Figure 2.** Correlation coefficient analysis: the effect of input parameters on CO2-MMP.

The correlation coefficient analysis showed that the molecular weight of C7<sup>+</sup> and the mole fractions of C2–C6 have only a small linear correlation with the MMP; however, these parameters play a significant role in controlling the MMP. Therefore, to improve these relationships, the input data were transformed to different domains using different approaches (i.e., log, power, sigmoidal, etc.) until the relationships with the highest correlation coefficient values were obtained. Using the power model with power values of −1 and −0.5 for the molecular weight and the mole fraction, respectively, gave the best relationships between the input and output parameters (results are listed in Table 2).
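The search for a transformation that maximizes the correlation coefficient can be illustrated with a short sketch. It scans only a handful of candidate power exponents (the log and sigmoidal transforms mentioned above are omitted), and the function names `pearson_r` and `best_power`, the candidate exponents and the synthetic data are assumptions made for illustration rather than the study's actual procedure or data.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def best_power(x, y, powers=(-2.0, -1.0, -0.5, 0.5, 1.0, 2.0)):
    """Return (power, r) for the transform x**power with the largest |r| against y."""
    scored = [(p, pearson_r([v ** p for v in x], y)) for p in powers]
    return max(scored, key=lambda item: abs(item[1]))


# Synthetic example: y is built to be exactly linear in 1/x,
# so the scan should select the reciprocal transform (power = -1).
mw = [150.0, 180.0, 210.0, 240.0, 300.0]
y = [1000.0 + 2.0e5 / v for v in mw]
power, r = best_power(mw, y)
print("best power:", power, "correlation:", r)
```

Ranking by the absolute value of the coefficient treats strong inverse relationships the same as strong direct ones, which matters here because some inputs correlate negatively with the MMP.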


**Table 2.** Correlation coefficient analysis for original and power transformed data.

#### **3. Results and Discussion**

Different artificial intelligence methods were tested to obtain the optimum model, defined as the one with the lowest average absolute percentage error (AAPE) for both the training and testing data and the highest correlation coefficient (R). Appendix A gives the equations used to calculate the AAPE and R. A sensitivity analysis was performed to fine-tune the model parameters; the most suitable models are reported in this paper.
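For readers who want to reproduce the two evaluation metrics, the sketch below uses their commonly used definitions: the AAPE as the mean of |actual − predicted| / |actual| expressed in percent, and R as the Pearson correlation between measured and predicted values. These definitions are assumed to correspond to those in Appendix A, which is not reproduced here; the function names and sample values are hypothetical.

```python
import math


def aape(actual, predicted):
    """Average absolute percentage error, in percent (actual values must be nonzero)."""
    total = sum(abs((a - p) / a) for a, p in zip(actual, predicted))
    return 100.0 * total / len(actual)


def correlation_coefficient(actual, predicted):
    """Pearson correlation coefficient R between measured and predicted values."""
    n = len(actual)
    ma, mp = sum(actual) / n, sum(predicted) / n
    cov = sum((a - ma) * (p - mp) for a, p in zip(actual, predicted))
    var_a = sum((a - ma) ** 2 for a in actual)
    var_p = sum((p - mp) ** 2 for p in predicted)
    return cov / math.sqrt(var_a * var_p)


# Hypothetical measured and predicted MMP values in psia (illustration only).
measured = [2000.0, 2500.0, 3000.0, 4000.0]
predicted = [2100.0, 2400.0, 3150.0, 3800.0]
print("AAPE (%):", aape(measured, predicted))
print("R:", correlation_coefficient(measured, predicted))
```

Reporting both metrics is useful because R only measures how well predictions track the trend, whereas the AAPE captures the size of the errors relative to the measured values.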

Initially, the original data were used to predict the MMP; however, significant errors were observed for all AI models. For example, the ANN model gave an error of 41.39%, and the fuzzy logic system showed an error of 26.14%. Therefore, data processing techniques were implemented by filtering the data to remove outliers based on the average values and the standard deviation (SD). The input data were also transformed into another domain using a power model. A trial-and-error technique was used to determine the best combination of input parameters. The fluid composition was categorized into two groups: the first group is the mole percentage of ethane to hexane (C2 to C6%) and the second group is the molecular weight of heptane plus (MW C7+). Then, the square root of the first group (C2 to C6%) and the reciprocal of the second group (MW C7+) were used as input parameters. As a result, the error was reduced significantly; for example, the error of the ANN model decreased from 41.392% to 9.682%. Moreover, the problem of local minima was avoided by running the AI models several times using different model parameters. The error profiles for the training and testing data sets were also monitored during model development and validation, and were used to avoid model memorization and local minima problems. The results from the artificial intelligence techniques are discussed below.
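The two preprocessing steps described in this paragraph, outlier removal based on the mean and SD followed by the square-root and reciprocal transforms, can be sketched as follows. The text does not give the filtering threshold, so the mean ± k·SD rule with a user-chosen `k`, the dictionary keys, the function names and the sample rows are all assumptions for illustration only.

```python
import math
import statistics


def remove_outliers(rows, column, k=3.0):
    """Keep rows whose value in `column` lies within mean ± k·SD (k is an assumption)."""
    values = [row[column] for row in rows]
    mean = statistics.mean(values)
    sd = statistics.pstdev(values)
    return [row for row in rows if abs(row[column] - mean) <= k * sd]


def transform_inputs(row):
    """Apply the square root to C2-C6 mole % and the reciprocal to MW of C7+."""
    return {
        "temperature": row["temperature"],
        "sqrt_c2_c6": math.sqrt(row["c2_c6"]),
        "inv_mw_c7plus": 1.0 / row["mw_c7plus"],
        "mmp": row["mmp"],
    }


# Hypothetical records (temperature in °F, C2-C6 in mole %, MMP in psia).
rows = [
    {"temperature": 160.0, "c2_c6": 16.0, "mw_c7plus": 200.0, "mmp": 2000.0},
    {"temperature": 170.0, "c2_c6": 25.0, "mw_c7plus": 210.0, "mmp": 2100.0},
    {"temperature": 150.0, "c2_c6": 36.0, "mw_c7plus": 190.0, "mmp": 1900.0},
    {"temperature": 165.0, "c2_c6": 20.0, "mw_c7plus": 205.0, "mmp": 2050.0},
    {"temperature": 155.0, "c2_c6": 30.0, "mw_c7plus": 195.0, "mmp": 1950.0},
    {"temperature": 150.0, "c2_c6": 9.0, "mw_c7plus": 250.0, "mmp": 9000.0},  # deliberately extreme
]
clean = remove_outliers(rows, "mmp", k=2.0)
features = [transform_inputs(row) for row in clean]
print(len(rows), "rows before filtering,", len(clean), "after")
print(features[0])
```

With very small samples a single extreme point inflates the SD, so a tighter `k` (2.0 in this example) may be needed for the filter to have any effect.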
