1. Introduction
The rate of penetration refers to the speed at which the bit breaks the rock beneath it. It measures drilling progress as depth drilled per unit time and is reported in feet per hour (ft/hr) in oil field units [1]. A faster rate of penetration is generally desirable because it saves drilling time and resources and is therefore cost-effective. However, a faster rate of penetration also brings problems, including poor hole cleaning and excessive vibrations, which can lead to further complications such as losing the bottom hole assembly (BHA) or wellbore instability [2].
The rate of penetration is affected by several interrelated factors. These factors fall into five categories: formation characteristics, rig efficiency, mud properties, hydraulic factors, and mechanical factors. As asserted by Hossain and Al-Majed [3], they can also be grouped into controllable and uncontrollable factors. Controllable factors are those that can be altered quickly without a significant impact on operational economics, such as the bit design, the drillstring revolutions per minute, and the weight on bit. Uncontrollable factors are those that cannot be altered for geological or economic reasons, such as the mud weight and type, underbalanced conditions relative to the formation pressure, and the bit size. With regard to fluid properties and their effect on the rate of penetration, it is hard to alter a single property without affecting the others, which makes it difficult to assess the real impact of a specific parameter on the penetration rate [4]. While drilling a well, three parameters must be considered [5], namely the weight on bit (WOB), the drillstring rotation (DSR), and the pumping rate (GPM) [6].
Established methods exist for determining the effects of drilling parameters and mud properties on the penetration rate. Among them are basic mathematical and physical equations that relate these quantities [7], as well as empirical correlations that link them. However, no fully reliable method currently exists because of the complexity of the drilling process, and it is not easy to capture every factor when predicting the penetration rate. Ricardo et al. [8] posited that no complete model could capture all the drilling parameters and mud properties that affect the rate of penetration. Therefore, it is important to treat them independently and develop individual correlations for reliable results.
The S-shape well profile consists of build, hold, and drop sections [9]. The well begins as vertical; at the desired depth, a directional BHA is picked up to build angle, the angle is held until a predefined departure is achieved, and then the angle is dropped back to zero [10]. In the upcoming sections, a rotary BHA is used; however, when the horizontal displacement is very large, a performance motor is selected to allow downhole rotation. This reduces the surface DSR, which is constrained by torque limitations. In addition, a casing protector needs to be used to reduce casing wear [11]. This profile is selected when the surface location is constrained, for example by a mountain above the reservoir, or when subsurface trouble zones such as fractures or salt domes must be avoided [3]. Different formations are drilled across the tangent, which can cause wellbore stability issues in some of them. The sidewall forces keep changing and cycling depending on the BHA position. Torque and drag are also present, so part of the applied vertical force (WOB) and rotational force (DSR) is lost. Hole cleaning is a major challenge since the cuttings can easily avalanche downward [12,13].
Effect of Drilling Parameters on the Rate of Penetration
Maurer [14] adopted the concept of optimum hole cleaning and derived a rate of penetration (ROP) model for tricone bits. He assumed that cuttings were removed as the bit tooth impacted the formation rock. Bingham [15] performed multiple laboratory experiments to develop his ROP model. He assumed that the weight on bit threshold could be neglected, which made the ROP a function of the rotary speed (DSR) and the applied weight on bit. His model includes a WOB exponent that is determined experimentally. Bourgoyne and Young [16] presented a mathematical regression model that used existing drilling data to calculate the multiple exponents required for the full model. Each exponent captured a physical or mechanical aspect of the drilling process, such as the effect of overbalance, overburden pressure, or bit tooth wear. Warren [17] developed his model for tricone bits under an optimum cleaning scenario, in which the rate of cuttings removal under the bit equaled the rate at which new cuttings were generated. Al-AbdulJabbar [18] developed a new ROP model that took into consideration drilling mechanical and hydraulic parameters, as well as mud properties. Using nine inputs, two parameters were calculated: the bit exponent and the formation compressive strength coefficient. Each formation type had its own compressive strength coefficient.
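To illustrate the kind of relationship these empirical models express, the sketch below uses the commonly quoted power-law form of Bingham's model, ROP = K (WOB / d_b)^a N. The function name, the constant `k`, the exponent `a`, and all numerical values are hypothetical placeholders chosen for illustration; they are not taken from this study or from the cited papers.

```python
def bingham_rop(wob_klbf, rpm, bit_diameter_in, k, a):
    """ROP from a Bingham-type power law: k * (WOB / bit diameter)**a * RPM.

    k (formation constant) and a (WOB exponent) have to be fitted to data;
    the values used below are made up for illustration only.
    """
    return k * (wob_klbf / bit_diameter_in) ** a * rpm


# Hypothetical example: with a = 1, doubling the WOB doubles the predicted ROP.
low = bingham_rop(wob_klbf=20.0, rpm=120.0, bit_diameter_in=12.0, k=0.2, a=1.0)
high = bingham_rop(wob_klbf=40.0, rpm=120.0, bit_diameter_in=12.0, k=0.2, a=1.0)
```

In practice the exponent is usually not 1, so the ROP response to WOB is nonlinear; the linear case is used here only to keep the arithmetic easy to check.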
Previous studies have suggested the use of artificial intelligence (AI) to improve the predictability of different parameters related to the oil industry [19,20,21,22,23,24,25,26,27]. Bilgesu et al. [28] suggested the use of AI techniques for ROP prediction. They developed two artificial neural network (ANN) models for predicting the ROP while drilling through nine different formations in different vertical wells. Amar and Ibrahim [29] developed two ANN models to evaluate the ROP based on the formation depth, equivalent circulating density (ECD), WOB, DSR, pore pressure gradient, drill bit tooth wear, and a Reynolds number function. A comparison with the available empirical equations showed that both ANN-based models estimated the ROP more accurately than the empirical equations.
Elkatatny [30] used a feedforward ANN to predict the ROP on three wells. Using two wells, the model was trained on 3333 data points with a correlation coefficient of 0.99 and an average absolute percentage error (AAPE) of 5%. Then, using 2700 unseen data points from the third well, the model predicted the rate of penetration with a correlation coefficient of 0.99 and an AAPE of 4%. Al-AbdulJabbar et al. [31] used a feedforward ANN to predict the ROP on three wells. Using 1500 data points from a single well, the model predicted the rate of penetration with a correlation coefficient of 0.92. The model was then used to predict the ROP for the other two wells, whose data it had not seen, with correlation coefficients of 0.95 and 0.94, respectively. Building the model from only one well showed the power of AI in modeling and prediction. Elkatatny et al. [32] demonstrated that once the ANN model was optimized and an empirical correlation was developed, the model could be converted from a black box to a white box, making it flexible to deploy in real field applications and environments. Ahmed et al. [33] developed a support vector machine model for the rate of penetration. Using 10 inputs representing drilling mechanical parameters and fluid properties, they developed a resilient ROP model with an AAPE of 2.83%. Al-AbdulJabbar et al. [34] used an ANN coupled with self-adaptive differential evolution (SaDE) to predict the ROP in horizontal carbonate reservoirs. Using six inputs that combined drilling mechanical parameters with formation petrophysical properties, such as gamma ray, resistivity, and bulk density, they achieved a correlation coefficient of 0.96 and an AAPE of 5.12% when building the model. Using another well with unseen data, they obtained R and AAPE values of 0.95 and 5.8%, respectively. The ROP model was turned from a black box into a white box by extracting the weights and biases in matrix form. A summary of these models, including inputs and equations, is presented in Table A1, Appendix B.
The main objective of this paper was to build, for the first time, ROP models for the S-shape well profile based on the optimized adaptive neuro-fuzzy inference system (ANFIS), functional neural networks (FN), random forests (RF), and support vector machine (SVM). These models were built to enable real-time ROP estimation from the data acquired by the rig real-time sensors, namely WOB, DSR, standpipe pressure (SPP), GPM, and torque (T).
2. Artificial Intelligence Models Theory
ANFIS is the first model used in this study; it is a fuzzy inference system based on fuzzy subtractive clustering. The fuzzy inference system consists of a multilayer feedforward adaptive network in which each training node applies a specific function to its incoming signal. The model is trained in two stages. First, in the forward pass, the functional signals of the input training data travel forward and the output parameters are identified through the least squares formula. Second, in the backward pass, the error rates propagate in the opposite direction while the input parameters are updated using the gradient method [35].
FN is the second model used in this work. It is very similar to the usual ANN model, but it uses a generalized functional model, whereas the ANN uses the common sigmoidal model. In addition, the neuron functions of the FN are learned from the training data, which means that weights associated with these neurons are not needed [36]. The FN model is also characterized by neural functions that can take several arguments, whereas the neural functions of the ANN take one argument [37].
The third model considered in this study is the RF, which was developed to perform classification and regression tasks [38]. This model combines hundreds of decision trees; every decision tree is trained on different observations, every tree consists of several nodes, and every node considers several features. The final prediction of the random forest model is the average of the predictions of all trees [39].
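The averaging idea described above can be sketched in a few lines of plain Python. This is a deliberately simplified illustration, not the model used in the study: each "tree" is a one-split stump with its threshold fixed at the median of one randomly chosen feature, whereas a real random forest grows deep trees and searches for the best split. All function names are hypothetical.

```python
import random
from statistics import mean


def fit_stump(X, y, feature):
    """Fit a one-split regression 'tree' on one feature (threshold = median value)."""
    values = sorted(row[feature] for row in X)
    threshold = values[len(values) // 2]
    left = [t for row, t in zip(X, y) if row[feature] < threshold]
    right = [t for row, t in zip(X, y) if row[feature] >= threshold]
    fallback = mean(y)  # used when one side of the split is empty
    left_mean = mean(left) if left else fallback
    right_mean = mean(right) if right else fallback
    return lambda row: left_mean if row[feature] < threshold else right_mean


def fit_forest(X, y, n_trees=50, seed=0):
    """Train each stump on a bootstrap sample and a randomly chosen feature."""
    rng = random.Random(seed)
    n, n_features = len(X), len(X[0])
    trees = []
    for _ in range(n_trees):
        idx = [rng.randrange(n) for _ in range(n)]  # sample rows with replacement
        feature = rng.randrange(n_features)         # random feature for this stump
        trees.append(fit_stump([X[i] for i in idx], [y[i] for i in idx], feature))
    return trees


def predict_forest(trees, row):
    """The forest prediction is the average of the individual tree predictions."""
    return mean(tree(row) for tree in trees)
```

Because every stump sees a different bootstrap sample and feature, the individual predictions differ, and averaging them smooths out the error of any single tree.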
The last artificial intelligence model considered in this study is the SVM, which was originally developed within the framework of statistical learning theory as a classification algorithm. This model uses a multidimensional hyperplane to divide the data into two or more classes, and its behavior is governed by the kernel, margin, gamma, and the regularization parameter (C) [40].
3. Data Overview and Preparation
In this work, the field data were obtained from three S-shape wells that share the same hole size and intersect the same geological lithology. These wells were drilled using a directional BHA and measurement while drilling (MWD). All the well profiles started as vertical; the section was then kicked off until the tangent section was reached. The tangent was held at ±25–30°, and then the well was dropped back to zero. The surface data were acquired through real-time sensors and recorded on a footage basis.
It is important to recognize that the MWD tool provides inclination and azimuth information through a mud pulse system, and that the measurement requires no circulation. The MWD data are transferred to the surface approximately every 100 ft, during the connection time for drill pipe stands. Therefore, it is not recommended to include the inclination data as an input parameter when predicting the ROP on a footage basis. The geological data, which are obtained from the logging while drilling (LWD) tool, were not available in this study because it is not common to run the LWD tool outside the reservoir section. At the same time, the changes in pipe speed and torque data compensate for the effect of the geological data on the ROP prediction. The inclination data, which are available only every 100 ft (unlike the drilling mechanical parameters, they are not a real-time record), did not affect the accuracy of the developed models for ROP prediction. This was confirmed by the very high accuracies of the models developed without the inclination. As shown in Figure 1, the ROP changed even when the inclination was constant, whereas the torque changed consistently with the ROP.
The collected data initially included all types of operations performed in the 12-inch section, including tripping, drilling, and deploying casing. The drilling data used were the weight on bit (WOB), drillstring rotation (DSR), torque (T), pumping rate (GPM), and standpipe pressure (SPP) [18].
Data cleaning was conducted to remove all unrealistic values and outliers. The first step was to extract only the portion where new drilling footage was made and discard the remaining data, which required manual data filtering and elimination. The second step was to clean the data based on the standard deviation: all values outside the range of ±3.0 standard deviations were removed. Although a rotary steerable system (RSS) was used, the data were still noisy. The RSS drive resulted in a much smoother hole than a mud motor would have. However, different formations were being drilled at ±25–30° inclination, which caused wellbore stability issues across some formations. In addition, sidewall forces kept changing and cycling depending on the BHA position. This introduced vibrations to the BHA, as well as excessive torque and drag. All of these affected the data quality, especially when the well inclination was approaching zero from its maximum value.
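The second cleaning step can be sketched as follows. This is a minimal illustration that assumes the ±3.0 standard deviation range is centered on the mean of each column and that the filter is applied once to all columns together; the text does not state these details, and the function name and dictionary-based row layout are hypothetical.

```python
from statistics import mean, stdev


def three_sigma_filter(rows, columns, k=3.0):
    """Keep only rows whose value in every listed column lies within mean ± k*std.

    rows    : list of dicts, one per recorded depth point (hypothetical layout)
    columns : names of the columns to check, e.g. ["WOB", "T"]
    """
    bounds = {}
    for col in columns:
        values = [row[col] for row in rows]
        mu, sigma = mean(values), stdev(values)
        bounds[col] = (mu - k * sigma, mu + k * sigma)
    return [
        row for row in rows
        if all(bounds[c][0] <= row[c] <= bounds[c][1] for c in columns)
    ]
```

A single extreme value inflates the standard deviation it is tested against, so in a small sample an outlier may survive a one-pass filter; with thousands of points per well this effect is minor.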
Figure 2 compares the correlation coefficients (R) between the ROP and each training parameter after the cleaning process. As indicated in this figure, the ROP is strongly correlated with T and WOB, with R values of 0.89 and 0.85, respectively, and moderately correlated with the pumping rate (Q), DSR, and SPP, with R values of 0.43, 0.53, and 0.70, respectively.
At the end of the data cleaning process, the data consisted of 4012, 1717, and 2500 data points from Well A, Well B, and Well C, respectively.
Table 1 shows the statistical analysis of the training data obtained from Well A and Well B after the cleaning process. As indicated in this table, Q ranges from 559 to 993 GPM, DSR from 99 to 159 rpm, SPP from 912 to 2490 psi, T from 4.11 to 15.9 klbf-ft, WOB from 5.14 to 57.2 klbf, and ROP from 6.34 to 64.5 ft/hr.
4. Artificial Intelligence Models Optimization
Since the S-shape profile, shown schematically in Figure 3, has a dual inclination behavior (build then drop), two wells were used for training (Well A and Well B) and one for validation (Well C), which gives a data ratio of 2.29:1. The two wells with the highest and lowest curvature were used to predict the well in between. The AI models (ANFIS, FN, RF, and SVM) were trained using the drilling parameters of the weight on bit (WOB), drillstring rotation (DSR), torque (T), pumping rate (GPM), and standpipe pressure (SPP). The data were loaded into the AI modeling software as separate input and output vectors. The data from Well A and Well B were combined and randomly divided in a 70:30 ratio for the training and testing stages, in which multiple AI model design parameters were varied on a trial-and-error basis to achieve the optimum prediction of the rate of penetration with the least error. The design parameters optimized in this step were the cluster radius and number of iterations for the ANFIS model; the training method and training function type for the FN model; the maximum depth and maximum features for the RF model; and the kernel, gamma, number of iterations, and C parameter for the SVM model. The optimum parameters were selected based on the correlation coefficient (R) and the average absolute percentage error (AAPE), which are defined in Appendix A.
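The random 70:30 division and the two selection metrics can be sketched as below. Appendix A is not reproduced in this section, so the sketch assumes the standard definitions: R is the Pearson correlation coefficient between actual and predicted ROP, and AAPE is the mean of |actual − predicted| / |actual| expressed in percent. The function names and the fixed random seed are illustrative choices.

```python
import math
import random


def pearson_r(actual, predicted):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(actual)
    mean_a = sum(actual) / n
    mean_p = sum(predicted) / n
    cov = sum((a - mean_a) * (p - mean_p) for a, p in zip(actual, predicted))
    var_a = sum((a - mean_a) ** 2 for a in actual)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    return cov / math.sqrt(var_a * var_p)


def aape(actual, predicted):
    """Average absolute percentage error in percent (actual values must be nonzero)."""
    errors = [abs((a - p) / a) for a, p in zip(actual, predicted)]
    return 100.0 * sum(errors) / len(errors)


def split_70_30(rows, seed=0):
    """Shuffle a copy of the rows and cut it into 70% training and 30% testing."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    cut = round(0.7 * len(shuffled))
    return shuffled[:cut], shuffled[cut:]
```

For example, actual ROP values of 10 and 20 ft/hr predicted as 11 and 18 ft/hr each carry a 10% error, so the AAPE is 10%.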
Table 2 summarizes the optimized artificial intelligence models. As indicated in this table, the optimized ANFIS model has a cluster radius of 0.1 and 300 iterations. The FN model uses the forward selection training method and a nonlinear function type without iteration terms. The optimized RF model has a maximum depth of 29 and a maximum features setting of sqrt. The optimum design parameters of the SVM model are the radial basis function kernel, a gamma setting of scale, 1000 iterations, and a C of 100.
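For readers who want to reproduce the RF and SVM settings, the sketch below shows how they could be expressed with scikit-learn. The paper does not name the modeling software, so the choice of scikit-learn is an assumption; it is used here only because the reported option names (sqrt, scale) coincide with that library's. The helper name and the random seed are hypothetical, and settings not reported in Table 2 are left at the library defaults.

```python
from sklearn.ensemble import RandomForestRegressor
from sklearn.svm import SVR


def build_models(seed=0):
    """Return unfitted RF and SVM regressors configured with the Table 2 settings."""
    rf = RandomForestRegressor(
        max_depth=29,         # maximum depth reported in Table 2
        max_features="sqrt",  # maximum features reported in Table 2
        random_state=seed,    # assumption: fixed seed for repeatability
    )
    svm = SVR(
        kernel="rbf",         # radial basis function kernel
        gamma="scale",        # gamma setting reported in Table 2
        C=100,                # regularization parameter
        max_iter=1000,        # iteration cap reported in Table 2
    )
    return {"RF": rf, "SVM": svm}
```

Each model would then be fitted with `model.fit(X_train, y_train)`, where the five columns of `X_train` hold WOB, DSR, T, GPM, and SPP and `y_train` holds the ROP.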
Using the optimized parameters for the ANFIS-ROP model, a sensitivity analysis was performed to evaluate the effect of including the inclination as an input parameter.
Table 3 shows that, for the training and testing data, including the inclination had a small effect, increasing the AAPE from 9.57% to 13.81%, whereas for the validation data it had a significant negative effect, increasing the error from 9% to 51%. On the basis of this sensitivity analysis, the inclination was excluded from the input parameters. Note that, because the inclination was recorded only every 100 ft, interpolation was performed to obtain a complete inclination profile.
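The interpolation step mentioned above can be sketched as follows. The text does not say which interpolation scheme was used, so linear interpolation between consecutive survey stations is assumed here; the function name and the example survey values are hypothetical.

```python
from bisect import bisect_right


def interpolate_inclination(survey_depths, survey_incl, depth):
    """Linearly interpolate the inclination (degrees) at an arbitrary depth (ft).

    survey_depths must be sorted in increasing order; depths outside the
    surveyed interval are clamped to the nearest station value.
    """
    if depth <= survey_depths[0]:
        return survey_incl[0]
    if depth >= survey_depths[-1]:
        return survey_incl[-1]
    i = bisect_right(survey_depths, depth)  # index of the first station deeper than depth
    d0, d1 = survey_depths[i - 1], survey_depths[i]
    a0, a1 = survey_incl[i - 1], survey_incl[i]
    return a0 + (a1 - a0) * (depth - d0) / (d1 - d0)


# Hypothetical stations every 100 ft in a build section.
stations_ft = [1000.0, 1100.0, 1200.0]
incl_deg = [0.0, 3.0, 6.0]
mid_point = interpolate_inclination(stations_ft, incl_deg, 1050.0)
```

Values filled in this way are estimates rather than measurements, which is consistent with the observation that the interpolated inclination added little information to the footage-based models.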
6. Conclusions
Actual field data from three S-shape wells were used for the rate of penetration prediction using AI models of the ANFIS, FN, RF, and SVM based on five inputs of the DSR, SPP, Q, T, and WOB. On the basis of the results, the following can be concluded:
The ANFIS, FN, and RF models could effectively predict the ROP from the drilling parameters in the S-shape well profile for training, testing, and validation data, whereas the SVM model showed very low accuracy in estimating the ROP.
The developed ANFIS, FN, and RF models predicted the ROP with AAPEs of 9.50%, 13.44%, and 3.25%, respectively, for the training data.
For the testing data, the optimized ANFIS, FN, and RF models estimated the ROP with AAPEs of 9.57%, 11.20%, and 8.37%, respectively.
The developed ANFIS, FN, and RF models outperformed the SVM model and the three published empirical correlations for estimating the ROP for the validation data.