1. Introduction
Driven by the rapid development of artificial intelligence and big data technologies, many researchers have studied the application of these techniques to drilling engineering. Drilling engineering is a large and complex engineering system. Automatically identifying the various drilling conditions can effectively improve drilling efficiency and reduce drilling costs.
Golitsyna et al. [1] proposed an automatic anomaly detection method during drilling, based on machine learning, which used drilling parameters to improve the intelligent detection effect of abnormal conditions in drilling.
Yin Qishuai et al. [2] proposed a program-based working condition recognition method, which effectively eliminated the influence of personnel on drilling condition recognition. However, because the method relied on programmed rules, it took into account neither the relationship between preceding and subsequent data nor the potential fluctuation of drilling data.
Eric van Oort, Ed Taylor, et al. [3] proposed an automatic drilling condition identification method that not only records the learning curve in batch drilling but also traces the sources of invisible lost time, which are then addressed by communicating with drilling personnel to shorten the invisible lost time.
Eric Maidla and William Maidla [4] proposed methods for data quality control and automatic identification of drilling conditions in the drilling process and successfully drilled a series of onshore wells using these methods. By ensuring the accuracy of drilling condition identification, providing training, and implementing new drilling workflows, 31% to 43% of drilling time was saved.
Sun Ting et al. [5] established a real-time drilling condition recognition model based on a support vector machine, which in application reduced the invisible lost time in the drilling process and improved drilling efficiency. The kernel function and support vector machine parameters were selected by cross-validation. The model recognizes six drilling conditions: rotary drilling, drilling, tubing running, single connection, backrow, and plug drilling. It offers good generalizability and high accuracy.
Ben et al. (2019) [6] pointed out that, because of top-drive vibration, it is difficult to distinguish the two drilling conditions “rotary drilling” and “sliding drilling” based on the surface rotary speed alone. To achieve the desired classification accuracy, they evaluated three machine learning models for identifying these two conditions, namely random forest (RF), convolutional neural network (CNN), and recurrent neural network (RNN) models, and found that the machine learning models were far superior to rule-based models.
Gabriel L. Dioliveira et al. [7] identified invisible lost time in the drilling process by establishing an automatic drilling condition identification method and calculating key performance indicators of drilling time, such as drilling start and trip time. Applied to three years of drilling data, the method revealed more than 700 days of invisible lost time, showing that there is considerable room to improve drilling efficiency.
Coley [8] developed a key component of an internal system to report key performance indicators of invisible lost time across drilling operations and developed a classification engine for common drilling conditions based on supervised machine learning.
Qiao Ying et al. [9] proposed a parallel hybrid network (CNN–BiGRU) that combines a convolutional neural network (CNN) with a bidirectional gated recurrent unit (BiGRU) network for intelligent identification of drilling conditions. The overall accuracy was 94%, and the experiments demonstrated that the model can recognize drilling conditions intelligently with high accuracy and efficiency.
In addition, research on traditional and improved stacking algorithms is summarized in Table 1.
Existing drilling condition identification methods rarely take knowledge of drilling engineering mechanisms into account. As a result, the models have poor interpretability, and it is difficult to understand their internal working principles and decision-making processes. A model that generates predictions only by identifying patterns in the training data, rather than by representing the underlying mechanism of the problem, may also have trouble adapting to new conditions and domains. This study therefore proposes an improved stacking ensemble learning algorithm for drilling condition identification that fuses mechanism knowledge. Integrating mechanism knowledge into feature construction can enhance the model’s generalization, interpretability, and reliability.
The drilling process can be monitored and managed in real time by segmenting the drilling process into particular drilling conditions and establishing the key performance indicators (KPIs) of these drilling conditions. Additionally, the drilling process can be made more efficient and the well construction cycle can be shortened by identifying the invisible lost time brought on by redundant operating procedures.
Through fine monitoring and analysis of the drilling process, the invisible lost time in the drilling process can be found, which has important practical significance for improving drilling efficiency, shortening the well construction cycle, and saving drilling costs. The purpose of this study is to effectively identify drilling conditions, further record the duration of various conditions, and uncover hidden time losses. Ultimately, this can significantly improve drilling efficiency and reduce drilling costs.
3. Results and Discussion
Using Python, we conducted an experiment on a drilling dataset. First, we divided the dataset into training and testing sets. Then, following the algorithm flow, we input the data into the models and calculated four evaluation metrics (accuracy, precision, recall, and F1 score) for each model. In this section, we compare the performance of the different drilling condition segmentation models, identify their respective strengths, and select the best-performing model. The comparative results are shown in Table 6.
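The four evaluation metrics can be illustrated with a minimal, dependency-free sketch. It assumes macro averaging over the condition classes, which the text does not specify; the function name `classification_metrics` and the example labels are hypothetical and are not taken from the study's dataset.

```python
from collections import Counter


def classification_metrics(y_true, y_pred):
    """Return accuracy and macro-averaged precision, recall, and F1 score."""
    labels = sorted(set(y_true) | set(y_pred))
    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1
        else:
            fp[pred] += 1   # predicted class gains a false positive
            fn[truth] += 1  # true class gains a false negative

    precisions, recalls, f1s = [], [], []
    for c in labels:
        prec = tp[c] / (tp[c] + fp[c]) if tp[c] + fp[c] else 0.0
        rec = tp[c] / (tp[c] + fn[c]) if tp[c] + fn[c] else 0.0
        f1 = 2 * prec * rec / (prec + rec) if prec + rec else 0.0
        precisions.append(prec)
        recalls.append(rec)
        f1s.append(f1)

    n = len(labels)
    return {
        "accuracy": sum(tp.values()) / len(y_true),
        "precision": sum(precisions) / n,  # macro average (assumption)
        "recall": sum(recalls) / n,
        "f1": sum(f1s) / n,
    }


# Hypothetical labels for illustration only.
y_true = ["rotary", "rotary", "sliding", "sliding", "tripping", "tripping"]
y_pred = ["rotary", "sliding", "sliding", "sliding", "tripping", "tripping"]
metrics = classification_metrics(y_true, y_pred)
```

With macro averaging, every drilling condition contributes equally to the score regardless of how many samples it has, which matters when some conditions are rare in the data.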
The improved ensemble learning model performed the best in terms of accuracy, precision, recall, and F1 score. It can fully leverage the advantages of multiple base learners, improving the predictive performance and generalization ability of the model.
The random forest and KNN models also achieved good results, with high accuracy, precision, recall, and F1 score. They are based on decision trees and instance distance, respectively, which enables them to handle complex feature relationships and classification problems.
The MLP and support vector machine (SVM) models showed slightly lower accuracy and F1 score. The MLP may require more training and parameter tuning to improve performance, while SVM may benefit from better feature selection and adjustment of the kernel function to optimize classification results.
The ensemble learning model performed well in terms of accuracy and F1 score but had slightly lower precision and recall compared to the improved ensemble learning model. It makes decisions by combining the predictions of multiple base learners, effectively reducing the risk of overfitting and improving the model’s robustness.
The improved ensemble learning algorithm has the following advantages:
Strong feature combination capability: The embedded mechanism model can automatically learn and discover complex relationships and interactions between features. By embedding mechanisms such as nonlinear transformations, cross-features, and high-order feature combinations in the model, it enhances the modeling capability for complex data patterns;
Strong adaptability: The embedded mechanism model can automatically adjust the model’s complexity and flexibility based on the characteristics of the data. It can select appropriate feature embedding methods and parameter settings automatically, thereby improving the model’s adaptability and performance;
Strong interpretability: Compared to some black-box models, the embedded mechanism model often has better interpretability. Due to its explicit feature embedding process and model parameters, it becomes easier to understand how the model processes input features, thus providing better explanations for the model’s predictions;
Strong generalization capability: The embedded mechanism model learns and extracts the intrinsic feature representation of the data, enabling it to capture the underlying patterns and distribution characteristics of the data, resulting in strong generalization ability. It can effectively handle new unseen samples and maintain good performance across different datasets.
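The cross-features and high-order feature combinations mentioned above can be illustrated with a degree-2 polynomial expansion. This is a minimal sketch, not the study's implementation: the function name `polynomial_features` and the feature names `hook_load`, `rpm`, and `bit_depth` are hypothetical, and the values are invented for illustration.

```python
from itertools import combinations_with_replacement
from math import prod


def polynomial_features(row, names, degree=2):
    """Expand one sample into all monomials of its features up to `degree`."""
    features = {}
    for d in range(1, degree + 1):
        # Each index tuple selects the features multiplied together,
        # e.g. (0, 1) -> names[0] * names[1], (1, 1) -> names[1] squared.
        for idx in combinations_with_replacement(range(len(row)), d):
            key = "*".join(names[i] for i in idx)
            features[key] = prod(row[i] for i in idx)
    return features


# Hypothetical feature names and values.
names = ["hook_load", "rpm", "bit_depth"]
sample = [2.0, 3.0, 5.0]
expanded = polynomial_features(sample, names)
```

Three input features yield nine expanded features at degree 2: the three originals, three squares, and three pairwise products.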
4. Drilling Time Efficiency Statistics and Enhancement
4.1. Invisible Lost Time
The division of drilling time is illustrated in Figure 3: the actual drilling cycle consists of productive time (PT) and non-productive time (NPT), and PT is in turn composed of technical limit time (TLT) and invisible lost time (ILT) [26]. ILT is the portion of productive time that is lost to low operational efficiency. It is called invisible because it does not appear in conventional daily drilling reports, whereas non-productive time is usually recorded in them.
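The time decomposition described above can be written as a small data structure. This is a sketch under stated assumptions: the class name `DrillingTime`, its field names, and the hour values are hypothetical; only the relations PT = TLT + ILT and actual cycle = PT + NPT come from the text.

```python
from dataclasses import dataclass


@dataclass
class DrillingTime:
    """Drilling cycle split into TLT, ILT, and NPT (all in hours)."""

    technical_limit_h: float   # TLT
    invisible_lost_h: float    # ILT
    non_productive_h: float    # NPT

    @property
    def productive_h(self) -> float:
        # PT = TLT + ILT
        return self.technical_limit_h + self.invisible_lost_h

    @property
    def total_h(self) -> float:
        # Actual drilling cycle = PT + NPT
        return self.productive_h + self.non_productive_h

    def ilt_share(self) -> float:
        """Fraction of the actual drilling cycle taken up by ILT."""
        return self.invisible_lost_h / self.total_h if self.total_h else 0.0


# Hypothetical values for illustration only.
well = DrillingTime(technical_limit_h=300.0, invisible_lost_h=60.0,
                    non_productive_h=40.0)
```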
4.2. Drilling Time Efficiency Statistics
The drilling time efficiency automatic statistics module automatically calculates the start time, end time, and duration of the various operations in drilling activities. It builds on the automatic drilling conditions recognition method, which produces an identification code for each drilling state. These codes are encapsulated together with the corresponding time data in the wellsite data transmission format and sent to the statistics module.
In the automatic drilling conditions recognition method, drilling conditions data are labeled and classified, and each state is assigned a unique identification code. The purpose of these codes is to differentiate between different conditions, allowing the statistics module to accurately identify and calculate the time for each operation. Once the identification codes and time data are encapsulated and sent to the drilling time efficiency automatic statistics module, the module starts analyzing the data.
The module parses the received data, extracts the start time and end time of each operation, and calculates its duration. In this way, the drilling time efficiency automatic statistics module compiles statistics on drilling operation times automatically and dynamically.
Drilling time efficiency analysis is performed based on the recognition of drilling conditions. Its objective is to measure and calculate the proportion of time occupied by different drilling operations during the entire drilling process. By analyzing the duration of each operation, the statistics module can calculate the time proportion of each operation in the overall drilling process. This enables drilling operation managers to understand the impact of each operation on the total drilling time and make appropriate optimizations and adjustments.
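The parsing and proportion calculation described in this subsection can be sketched as follows. The sketch assumes the module receives time-ordered `(timestamp, code)` records; the function names, the integer codes, and the rule that a segment ends where the next one starts are assumptions made for illustration, not the module's documented behavior.

```python
from datetime import datetime, timedelta
from itertools import groupby


def segment_operations(records):
    """Collapse consecutive records with the same condition code into segments.

    A segment ends at the first timestamp of the next segment; the final
    segment ends at its own last record (an assumption of this sketch).
    """
    groups = [(code, [t for t, _ in grp])
              for code, grp in groupby(records, key=lambda r: r[1])]
    segments = []
    for i, (code, times) in enumerate(groups):
        start = times[0]
        end = groups[i + 1][1][0] if i + 1 < len(groups) else times[-1]
        segments.append({"code": code, "start": start, "end": end,
                         "duration": end - start})
    return segments


def time_proportions(segments):
    """Return the share of total time spent in each condition code."""
    totals = {}
    for seg in segments:
        totals[seg["code"]] = totals.get(seg["code"], timedelta()) + seg["duration"]
    overall = sum(totals.values(), timedelta())
    if overall == timedelta():
        return {code: 0.0 for code in totals}
    return {code: total / overall for code, total in totals.items()}


# Hypothetical codes: 1 = rotary drilling, 2 = sliding drilling.
t0 = datetime(2023, 1, 1, 0, 0)
codes = [1, 1, 1, 1, 2, 2, 1, 1, 1, 1]
records = [(t0 + timedelta(minutes=i), c) for i, c in enumerate(codes)]
segments = segment_operations(records)
proportions = time_proportions(segments)
```

Grouping consecutive records rather than all records with the same code keeps each occurrence of an operation as its own segment, so individual start times, end times, and durations remain available for later comparison.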
4.3. Enhancing Drilling Efficiency
Step 1: Based on the results of the drilling time efficiency automatic statistics module, the time spent on each drilling operation by different drilling teams can be analyzed and calculated. To visually depict the time distribution of each drilling operation for each team, charts and graphs can be used for statistical and visual representation.
Step 2: Through comprehensive analysis of historical neighboring well data, geological conditions, drilling conditions, and other factors, it is possible to engage experts to determine key performance indicators (KPIs) for each drilling state. These KPIs serve as important metrics for evaluating drilling efficiency and quality. The expertise and knowledge of the experts assist in identifying suitable KPIs for specific situations, enabling better assessment of drilling performance.
Step 3: For drilling workers who fail to meet the set KPIs, training and guidance can be provided to standardize the drilling operation process. Through training and guidance, drilling workers can learn and master standardized procedural operations and understand and adhere to best practices in drilling operations. This helps improve operational efficiency and quality, thereby reducing the duration of drilling operations. Regular training and continuous supervision ensure that drilling workers maintain a proficient level of operation and continuously enhance the overall efficiency of drilling operations [27].
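The three steps above can be sketched as a comparison of each team's mean operation time with the expert-set KPI. Everything in this sketch is hypothetical: the function name `crews_exceeding_kpi`, the team labels, the operation name, and the KPI value are invented for illustration, and using the mean duration as the test statistic is an assumption.

```python
from statistics import mean


def crews_exceeding_kpi(durations_min, kpi_min):
    """Flag (crew, operation) pairs whose mean duration exceeds the KPI.

    durations_min: {crew: {operation: [durations in minutes]}}
    kpi_min:       {operation: target duration in minutes}
    """
    flagged = []
    for crew, operations in durations_min.items():
        for operation, values in operations.items():
            if operation not in kpi_min or not values:
                continue  # no KPI defined or no data for this operation
            average = mean(values)
            if average > kpi_min[operation]:
                flagged.append((crew, operation, average,
                                average - kpi_min[operation]))
    return flagged


# Hypothetical durations and KPI, for illustration only.
durations_min = {
    "A": {"connection": [4.0, 5.0, 6.0]},
    "B": {"connection": [8.0, 10.0]},
}
kpi_min = {"connection": 6.0}
flagged = crews_exceeding_kpi(durations_min, kpi_min)
```

In this example only team B is flagged, with a mean connection time 3 min above the KPI, which would make it a candidate for the training described in Step 3.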
In summary, measures such as automatic statistics of drilling time efficiency, KPI determination, and standardized training make it possible to comprehensively analyze and optimize the time distribution and efficiency of drilling operations, thereby improving the overall performance of drilling activities. This integrated analysis and improvement process can help drilling teams continuously enhance the quality and efficiency of drilling operations in practice. The time of different drilling conditions is shown in Figure 4.
4.4. Engineering Applications
In terms of sliding drilling, certain inclined wellbore sections are selected, and positive displacement motors are used for sliding drilling. Sliding time is an important indicator for evaluating sliding drilling operations. As shown in Figure 5, the overall sliding time is reduced, indicating an improvement in sliding drilling efficiency. The average sliding time is 82 min for the first batch of wells and 64.7 min for the second batch of three wells, an average reduction of 21.1%.
In terms of rotary drilling, as shown in Figure 6, the average rate of penetration (ROP) is 44.1 m/h for the first batch of five wells and 51 m/h for the second batch of three wells, an average increase of 15.65%. Although the ROP increased, the margin of improvement is limited; the drilling team should maintain the current operational status.
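The two reported percentages follow from the batch averages by a relative-change calculation, which the short check below reproduces. The helper name `percent_change` is hypothetical; the input values are those stated above.

```python
def percent_change(before, after):
    """Relative change from `before` to `after`, in percent."""
    return (after - before) / before * 100.0


# Sliding time fell from 82 min to 64.7 min; negate to express a reduction.
sliding_reduction = -percent_change(82.0, 64.7)

# Average ROP rose from 44.1 m/h to 51 m/h.
rop_increase = percent_change(44.1, 51.0)
```

Rounded, these values are 21.1% and 15.65%, matching the figures reported for the two batches.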
5. Conclusions
This article compares several models for drilling condition classification and selects an improved stacking ensemble learning algorithm. The chosen model improves performance by incorporating polynomial interaction features and domain knowledge into feature engineering, and its accuracy reaches 97%. The effective classification of drilling conditions lays the foundation for the subsequent time statistics.
Continuous improvement of drilling efficiency relies on setting key performance indicators (KPIs) for drilling conditions and operating according to standardized procedures aligned with those KPIs. By automating time tracking and monitoring drilling operation times in real time, wasteful time segments can be promptly identified and corrected. Through the establishment of KPIs and continuous improvement efforts, the drilling process can be optimized and drilling efficiency enhanced. In the engineering applications, the average sliding time was reduced by 21.1%, and the average rate of penetration (ROP) was increased by 15.65%. These measures contribute to reducing invisible lost time, improving drilling efficiency, and lowering drilling costs.