Suitability of Different Machine Learning Outlier Detection Algorithms to Improve Shale Gas Production Data for Effective Decline Curve Analysis
Abstract
1. Introduction
2. Outlier Detection Algorithms
2.1. MCD
2.2. PCA
2.3. OCSVM
2.4. ABOD
2.5. SOS
2.6. HBOD
2.7. KNN
2.8. LOF
2.9. COF
2.10. CBLOF
2.11. SOD
2.12. IF
3. Data and Methodology
4. Results and Interpretations
4.1. Results of Applying OD Algorithms
4.2. Results of Applying DCA Models after Removing Outliers
5. Limitations of This Work
6. Conclusions
- Although most OD algorithms are generic, not all of them are suitable for cleaning production data before applying DCA. The linear-based algorithms, IF, and HBOD flag entire portions of the production history as outliers, which makes DCA difficult to apply afterwards.
- CBOD, KNN, and ABOD are the most effective algorithms for improving data quality before applying DCA. They smooth the production profile by flagging the most scattered data points without disturbing any trend within the data.
- LOF is especially suitable for production profiles with scattered, isolated data points; however, at high assumed threshold values it can distort the trends within the production profile.
- SOS and SOD are the least effective algorithms, although they preserve the declining trend of the production profile. Unlike the other algorithms, they do not flag all of the scattered data points as outliers, so the goodness of fit of the DCA models remains almost the same as before the noise was removed.
- DCA models are fitted to the production history and then extrapolated for prediction, so improving the production data improves both their goodness of fit and the reliability of their predictions. However, some models, such as Arps, Duong, and Wang, are less sensitive to noise removal, whichever OD algorithm is applied. The SEPD, PLE, and LGM models, on the other hand, are more sensitive to outlier removal, and their production forecasts vary greatly from one OD algorithm to another.
- The assumed threshold of an OD algorithm should be optimized for the noise level in the production data. For a selected algorithm, several thresholds can be tried, and the lowest threshold beyond which the goodness of fit no longer changes appreciably should be adopted.
- Because each DCA model rests on different assumptions and has a different structure, it is highly recommended to use more than one model to evaluate the reserves of shale gas wells.
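
The threshold-selection idea in the conclusions above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the workflow used in the study: the detector is a plain k-nearest-neighbor distance score written from scratch, the production profile is synthetic with three injected spikes, and the goodness of fit is the R² of a straight line through ln(q) versus t (an exponential-decline stand-in for a full DCA model). The names `knn_scores`, `flag_outliers`, `exp_fit_r2`, and `sweep` are hypothetical.

```python
import math

def knn_scores(points, k=5):
    """Distance from each point to its k-th nearest neighbor (larger = more isolated)."""
    scores = []
    for i, p in enumerate(points):
        dists = sorted(math.dist(p, other) for j, other in enumerate(points) if j != i)
        scores.append(dists[k - 1])
    return scores

def flag_outliers(t, q, contamination, k=5):
    """Return booleans; True marks the most isolated `contamination` share of points."""
    # Scale both axes to [0, 1] so neither time nor rate dominates the distance.
    t_min, t_max, q_min, q_max = min(t), max(t), min(q), max(q)
    pts = [((ti - t_min) / (t_max - t_min), (qi - q_min) / (q_max - q_min))
           for ti, qi in zip(t, q)]
    scores = knn_scores(pts, k)
    n_out = max(1, round(contamination * len(pts)))
    cutoff = sorted(scores, reverse=True)[n_out - 1]
    return [s >= cutoff for s in scores]

def exp_fit_r2(t, q):
    """R^2 of a least-squares line through ln(q) versus t (exponential-decline stand-in)."""
    y = [math.log(v) for v in q]
    n = len(t)
    t_bar, y_bar = sum(t) / n, sum(y) / n
    sxx = sum((ti - t_bar) ** 2 for ti in t)
    sxy = sum((ti - t_bar) * (yi - y_bar) for ti, yi in zip(t, y))
    slope = sxy / sxx
    intercept = y_bar - slope * t_bar
    ss_res = sum((yi - (intercept + slope * ti)) ** 2 for ti, yi in zip(t, y))
    ss_tot = sum((yi - y_bar) ** 2 for yi in y)
    return 1.0 - ss_res / ss_tot

def sweep(t, q, thresholds=(0.01, 0.03, 0.05, 0.10)):
    """For each assumed threshold, remove flagged points and refit; return (threshold, removed, R^2)."""
    results = []
    for c in thresholds:
        mask = flag_outliers(t, q, c)
        t_clean = [ti for ti, m in zip(t, mask) if not m]
        q_clean = [qi for qi, m in zip(q, mask) if not m]
        results.append((c, sum(mask), exp_fit_r2(t_clean, q_clean)))
    return results

# Synthetic profile (hypothetical values): smooth decline plus three injected spikes.
t = list(range(1, 101))
q = [1000.0 * math.exp(-0.02 * ti) for ti in t]
for i in (20, 50, 80):
    q[i] += 600.0

for c, n_removed, r2 in sweep(t, q):
    print(f"threshold={c:.2f}  removed={n_removed:3d}  R2={r2:.4f}")
```

Reading the printed table from the lowest threshold upward, the value to keep is the first one after which R² stops changing appreciably; anything higher only discards valid data.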
Author Contributions
Funding
Data Availability Statement
Conflicts of Interest
Abbreviations
ABOD | Angle-based outlier detection |
ABOF | Angle-based outlier factor |
BDF | Boundary Dominated Flow |
CBOD | Cluster-Based Outlier Detector |
COF | Connectivity-Based Outlier Factor |
DCA | Decline Curve Analysis |
EUR | Estimated Ultimate Recovery |
HBOD | Histogram-Based Outlier Detector |
HEHD | Hyperbolic–Exponential Hybrid Decline |
IF | Isolation Forest |
KNN | k-nearest neighbors |
LDBOD | Local Density-Based Outlier Detector |
LGM | Logistic Growth Model |
LOF | Local Outlier Factor |
MCD | Minimum Covariance Determinant |
ML | Machine Learning |
MR | Mahalanobis Distance |
OCSVM | One-Class Support Vector Machine |
OD | Outlier Detection |
PCA | Principal Component Analysis |
PLE | Power-law Exponential |
RD | Robust Distance |
SEPD | Stretched Exponential Decline Model |
SOD | Subspace Outlier Detection |
SOS | Stochastic Outlier Selection |
SVM | Support Vector Machine |
VDMA | Variable Decline Modified Arps |
b | Decline-Curve Exponent |
D | Decline Rate (Day−1) |
Di | Initial Decline Rate (Day−1) |
D∞ | Decline Rate at Infinite Time (Day−1) |
Gp | Gas Cumulative Production (Mscf) |
q | Gas Flow Rate (Mscf/D) |
qi | Initial Gas Flow Rate (Mscf/D) |
t | Time (day) |
n | Time Exponent in Decline Curve Analysis Models |
τ | Characteristic Time Parameter (Day) |
m | Exponent Regression Parameter |
a | Regression Parameter |
MCD Estimate of Location | |
MCD Covariance Estimate | |
Cov(X) | Observations Covariance Matrix |
Feature Space | |
Nonlinear Function that Transforms the Points to The Hyperplane | |
Slack Variable Allows Minor Deviations from the Hyperplane | |
Parameter Characterizes the Upper Bound on the Fraction of Outliers and the Lower Bound on the Number of Training Examples Used as Support Vector. | |
The Probability of Observation to be an Outlier | |
Affinity that Data Point xi has with Data Point xj Decays Exponentially | |
The Inverse of the Average Reachability Distance Calculated Using p’s k-nearest Neighbors | |
Path Length of Observation x | |
Average Path Length of Unsuccessful Search in a Binary Search | |
n | Number of External Nodes in Outlier Detection Algorithm |
References
- Ibrahim, M.; Mahmoud, O.; Pieprzica, C. A New Look at Reserves Estimation of Unconventional Gas Reservoirs; OnePetro: Richardson, TX, USA, 2018.
- Mahmoud, O.; Ibrahim, M.; Pieprzica, C.; Larsen, S. EUR Prediction for Unconventional Reservoirs: State of the Art and Field Case; OnePetro: Richardson, TX, USA, 2018.
- Wahba, A.; Khattab, H.; Gawish, A. A Study of Modern Decline Curve Analysis Models Based on Flow Regime Identification. JUSST 2022, 24, 26.
- Mahmoud, O.; Elnekhaily, S.; Hegazy, G. Estimating Ultimate Recoveries of Unconventional Reservoirs: Knowledge Gained from the Developments Worldwide and Egyptian Challenges. Int. J. Ind. Sustain. Dev. 2020, 1, 60–70.
- Mostafa, S.; Hamid, K.; Tantawi, M. Studying Modern Decline Curve Analysis Models for Unconventional Reservoirs to Predict Performance of Shale Gas Reservoirs. JUSST 2021, 23, 36.
- Liang, H.-B.; Zhang, L.-H.; Zhao, Y.-L.; Zhang, B.-N.; Chang, C.; Chen, M.; Bai, M.-X. Empirical Methods of Decline-Curve Analysis for Shale Gas Reservoirs: Review, Evaluation, and Application. J. Nat. Gas Sci. Eng. 2020, 83, 103531.
- Hazlett, R.D.; Farooq, U.; Babu, D.K. A Complement to Decline Curve Analysis. SPE J. 2021, 26, 2468–2478.
- Molina, O.; Santos, L.; Herrero, F.; Monaco, A.; Schultz, D. Is Decline Curve Analysis the Right Tool for Production Forecasting in Unconventional Reservoirs? In Proceedings of the SPE Annual Technical Conference and Exhibition, Dubai, United Arab Emirates, 15–23 September 2021; SPE: Richardson, TX, USA, 2021; p. D031S060R001.
- Xu, Y.; Liu, X.; Hu, Z.; Nan, S.; Duan, X.; Chang, J. Production Effect Evaluation of Shale Gas Fractured Horizontal Well under Variable Production and Variable Pressure. J. Nat. Gas Sci. Eng. 2021, 97, 104344.
- Niu, W.; Lu, J.; Sun, Y. An Improved Empirical Model for Rapid and Accurate Production Prediction of Shale Gas Wells. J. Pet. Sci. Eng. 2022, 208, 109800.
- Alimohammadi, H.; Sadeghi, M.; Chen, S.N. A Novel Procedure for Analyzing Production Decline in Unconventional Reservoirs Using Probability Density Functions. In Proceedings of the SPE Canadian Energy Technology Conference, Calgary, AB, Canada, 11–16 March 2022; SPE: Richardson, TX, USA, 2022; p. D011S012R002.
- Wahba, A.; Khattab, H.; Tantawy, M.; Gawish, A. Modern Decline Curve Analysis of Unconventional Reservoirs: A Comparative Study Using Actual Data. J. Pet. Min. Eng. 2022. online ahead of print.
- Joshi, K.G.; Awoleke, O.O.; Mohabbat, A. Uncertainty Quantification of Gas Production in the Barnett Shale Using Time Series Analysis; OnePetro: Richardson, TX, USA, 2018.
- Tugan, M.F.; Weijermars, R. Improved EUR Prediction for Multi-Fractured Hydrocarbon Wells Based on 3-Segment DCA: Implications for Production Forecasting of Parent and Child Wells. J. Pet. Sci. Eng. 2020, 187, 106692.
- Arps, J.J. Analysis of Decline Curves. Trans. AIME 1945, 160, 228–247.
- Ilk, D.; Rushing, J.A.; Perego, A.D.; Blasingame, T.A. Exponential vs. Hyperbolic Decline in Tight Gas Sands: Understanding the Origin and Implications for Reserve Estimates Using Arps’ Decline Curves. In Proceedings of the SPE Annual Technical Conference and Exhibition, Denver, CO, USA, 21 September 2008; SPE: Richardson, TX, USA, 2008; p. SPE-116731-MS.
- Ilk, D.; Perego, A.D.; Rushing, J.A.; Blasingame, T.A. Integrating Multiple Production Analysis Techniques to Assess Tight Gas Sand Reserves: Defining a New Paradigm for Industry Best Practices. In Proceedings of the IPC/SPE Gas Technology Symposium 2008 Joint Conference, Calgary, AB, Canada, 16 June 2008; SPE: Richardson, TX, USA, 2008; p. SPE-114947-MS.
- Valko, P.P. Assigning Value to Stimulation in the Barnett Shale: A Simultaneous Analysis of 7000 plus Production Hystories and Well Completion Records; OnePetro: Richardson, TX, USA, 2009.
- Valkó, P.P.; Lee, W.J. A Better Way to Forecast Production from Unconventional Gas Wells. In Proceedings of the SPE Annual Technical Conference and Exhibition, Florence, Italy, 19 September 2010; SPE: Richardson, TX, USA, 2010; p. SPE-134231-MS.
- Duong, A.N. An Unconventional Rate Decline Approach for Tight and Fracture-Dominated Gas Wells. In Proceedings of the Canadian Unconventional Resources and International Petroleum Conference, Calgary, AB, Canada, 19 October 2010; SPE: Richardson, TX, USA, 2010; p. SPE-137748-MS.
- Duong, A.N. Rate-Decline Analysis for Fracture-Dominated Shale Reservoirs. SPE Reserv. Eval. Eng. 2011, 14, 377–387.
- Clark, A.J.; Lake, L.W.; Patzek, T.W. Production Forecasting with Logistic Growth Models. In Proceedings of the SPE Annual Technical Conference and Exhibition, Denver, CO, USA, 30 October 2011; SPE: Richardson, TX, USA, 2011; p. SPE-144790-MS.
- Zhang, H.; Cocco, M.; Rietz, D.; Cagle, A.; Lee, J. An Empirical Extended Exponential Decline Curve for Shale Reservoirs. In Proceedings of the SPE Annual Technical Conference and Exhibition, Houston, TX, USA, 28–30 September 2015; SPE: Richardson, TX, USA, 2015; p. D031S031R007.
- Wang, K.; Li, H.; Wang, J.; Jiang, B.; Bu, C.; Zhang, Q.; Luo, W. Predicting Production and Estimated Ultimate Recoveries for Shale Gas Wells: A New Methodology Approach. Appl. Energy 2017, 206, 1416–1431.
- Gupta, I.; Rai, C.; Sondergeld, C.; Devegowda, D. Variable Exponential Decline: Modified Arps to Characterize Unconventional-Shale Production Performance. SPE Reserv. Eval. Eng. 2018, 21, 1045–1057.
- Hawkins, D.M. Identification of Outliers; Springer: Dordrecht, The Netherlands, 1980.
- Suri, N.N.R.R.; Murty, N.M.; Athithan, G. Outlier Detection: Techniques and Applications: A Data Mining Perspective; Springer: Berlin/Heidelberg, Germany, 2019; ISBN 978-3-030-05127-3.
- Ahmed, T. Analysis of Decline and Type Curves. In Reservoir Engineering Handbook; Elsevier: Amsterdam, The Netherlands, 2019; pp. 1227–1310. ISBN 978-0-12-813649-2.
- Yehia, T.; Khattab, H.; Tantawy, M.; Mahgoub, I. Improving the Shale Gas Production Data Using the Angular-Based Outlier Detector Machine Learning Algorithm. JUSST 2022, 24, 152–172.
- Chaudhary, N.L.; Lee, W.J. Detecting and Removing Outliers in Production Data to Enhance Production Forecasting; OnePetro: Richardson, TX, USA, 2016.
- Jha, H.S.; Khanal, A.; Seikh, H.M.D.; Lee, W.J. A Comparative Study on Outlier Detection Techniques for Noisy Production Data from Unconventional Shale Reservoirs. J. Nat. Gas Sci. Eng. 2022, 105, 104720.
- Yehia, T.; Khattab, H.; Tantawy, M.; Mahgoub, I. Removing the Outlier from the Production Data for the Decline Curve Analysis of Shale Gas Reservoirs: A Comparative Study Using Machine Learning. ACS Omega 2022. online ahead of print.
- Simpson, D.G. Introduction to Rousseeuw (1984) Least Median of Squares Regression. In Breakthroughs in Statistics; Kotz, S., Johnson, N.L., Eds.; Springer Series in Statistics; Springer: New York, NY, USA, 1997; pp. 433–461. ISBN 978-1-4612-0667-5.
- Kotu, V.; Deshpande, B. Chapter 13—Anomaly Detection. In Data Science, 2nd ed.; Kotu, V., Deshpande, B., Eds.; Morgan Kaufmann: Amsterdam, The Netherlands, 2019; pp. 447–465. ISBN 978-0-12-814761-0.
- Rousseeuw, P.J.; Hubert, M. Anomaly Detection by Robust Statistics. WIREs Data Min. Knowl. Discov. 2018, 8, e1236.
- Schölkopf, B.; Williamson, R.C.; Smola, A.; Shawe-Taylor, J.; Platt, J. Support Vector Method for Novelty Detection. In Proceedings of the Advances in Neural Information Processing Systems; Solla, S., Leen, T., Müller, K., Eds.; MIT Press: Cambridge, MA, USA, 1999; Volume 12.
- Kriegel, H.-P.; Schubert, M.; Zimek, A. Angle-Based Outlier Detection in High-Dimensional Data. In Proceedings of the 14th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, Las Vegas, NV, USA, 24–27 August 2008; Association for Computing Machinery: New York, NY, USA, 2008; pp. 444–452.
- Kim, Y.; Lau, W.C.; Chuah, M.C.; Chao, H.J. Packetscore: Statistics-Based Overload Control against Distributed Denial-of-Service Attacks. In Proceedings of the IEEE INFOCOM 2004, Hong Kong, China, 7–11 March 2004; Volume 4, pp. 2594–2604.
- Goldstein, M.; Dengel, A. Histogram-Based Outlier Score (HBOS): A Fast Unsupervised Anomaly Detection Algorithm; German Research Center for Artificial Intelligence (DFKI): Kaiserslautern, Germany, 2012.
- Knorr, E.M.; Ng, R.T.; Tucakov, V. Distance-Based Outliers: Algorithms and Applications. VLDB J. 2000, 8, 237–253.
- Breunig, M.M.; Kriegel, H.-P.; Ng, R.T.; Sander, J. LOF: Identifying Density-Based Local Outliers. In Proceedings of the 2000 ACM SIGMOD International Conference on Management of Data, Dallas, TX, USA, 15–18 May 2000; Association for Computing Machinery: New York, NY, USA, 2000; pp. 93–104.
- Tang, J.; Chen, Z.; Fu, A.W.; Cheung, D.W. Enhancing Effectiveness of Outlier Detections for Low Density Patterns. In Proceedings of the Advances in Knowledge Discovery and Data Mining; Chen, M.-S., Yu, P.S., Liu, B., Eds.; Springer: Berlin/Heidelberg, Germany, 2002; pp. 535–548.
- Goldstein, M.; Uchida, S. A Comparative Evaluation of Unsupervised Anomaly Detection Algorithms for Multivariate Data. PLoS ONE 2016, 11, e0152173.
- Wang, Y.; Li, K.; Gan, S. A Kernel Connectivity-Based Outlier Factor Algorithm for Rare Data Detection in a Baking Process. IFAC-PapersOnLine 2018, 51, 297–302.
- Jiang, S.; An, Q. Clustering-Based Outlier Detection Method. In Proceedings of the 2008 Fifth International Conference on Fuzzy Systems and Knowledge Discovery, Jinan, China, 18–20 October 2008; Volume 2, pp. 429–433.
- Nguyen, M.Q.; Mark, L.; Omiecinski, E. Subspace Outlier Detection in Data with Mixture of Variances and Noise; Georgia Institute of Technology: Atlanta, GA, USA, 2008.
- Muller, E.; Schiffer, M.; Seidl, T. Statistical Selection of Relevant Subspace Projections for Outlier Ranking. In Proceedings of the 2011 IEEE 27th International Conference on Data Engineering, Hannover, Germany, 11–16 April 2011; p. 445.
- Riahi-Madvar, M.; Nasersharif, B.; Azirani, A.A. Subspace Outlier Detection in High Dimensional Data Using Ensemble of PCA-Based Subspaces. In Proceedings of the 2021 26th International Computer Conference, Computer Society of Iran (CSICC), Tehran, Iran, 3–4 March 2021; pp. 1–5.
- Trittenbach, H.; Böhm, K. Dimension-Based Subspace Search for Outlier Detection. Int. J. Data Sci. Anal. 2019, 7, 87–101.
- Liu, F.T.; Ting, K.M.; Zhou, Z.-H. Isolation Forest. In Proceedings of the 2008 Eighth IEEE International Conference on Data Mining, Pisa, Italy, 15–19 December 2008; pp. 413–422.
- SPE Data Repository: Data Set: {1}, Well Number: {12}. Available online: https://www.spe.org/datasets/dataset_1/spreadsheets/dataset_1_well_12.xlsx (accessed on 1 August 2022).
- SPE Data Repository: Data Set: {1}, Well Number: {29}. Available online: https://www.spe.org/datasets/dataset_1/spreadsheets/dataset_1_well_29.xlsx (accessed on 1 August 2022).
- SPE Data Repository: Data Set: {1}, Well Number: {40}. Available online: https://www.spe.org/datasets/dataset_1/spreadsheets/dataset_1_well_40.xlsx (accessed on 1 August 2022).
Model | (q) Versus (t) * | Reference |
---|---|---|
Hyperbolic Arps (1945) | | [15] |
Power Law Exponential (PLE) (2008) | | [16,17] |
Stretched Exponential Production Decline (SEPD) (2010) | | [18,19] |
Duong (2010, 2011) | | [20,21] |
Logistic Growth Model (LGM) (2011) | | [22] |
Hyperbolic–Exponential Hybrid Decline (HEHD) (2016) | | [23] |
Wang (2017) | | [24] |
Variable Decline Modified Arps (VDMA) (2018) | | [25] |
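
For orientation, the rate-time relations of four of the models listed above can be sketched in Python. The forms below are the ones commonly quoted from the cited papers, written with the symbols of the nomenclature where possible; they are an assumption about the expressions intended in the table, not a transcription of it. The parameter values in the demo loop are hypothetical, and `q1` (the rate at t = 1 day in the Duong model) is a name introduced here.

```python
import math

def arps_hyperbolic(t, qi, Di, b):
    """Arps hyperbolic decline: q = qi / (1 + b*Di*t)**(1/b), with b > 0."""
    return qi / (1.0 + b * Di * t) ** (1.0 / b)

def ple(t, qi, Di, D_inf, n):
    """Power-law exponential: q = qi * exp(-D_inf*t - Di*t**n)."""
    return qi * math.exp(-D_inf * t - Di * t ** n)

def sepd(t, qi, tau, n):
    """Stretched exponential: q = qi * exp(-(t/tau)**n)."""
    return qi * math.exp(-((t / tau) ** n))

def duong(t, q1, a, m):
    """Duong: q = q1 * t**(-m) * exp(a/(1-m) * (t**(1-m) - 1)), for t > 0 and m != 1."""
    return q1 * t ** (-m) * math.exp(a / (1.0 - m) * (t ** (1.0 - m) - 1.0))

# Hypothetical parameters, chosen only to show the calls; rates in Mscf/D, time in days.
for day in (1.0, 30.0, 365.0):
    print(
        f"t={day:6.0f}  "
        f"Arps={arps_hyperbolic(day, 1000.0, 0.01, 1.2):8.1f}  "
        f"PLE={ple(day, 1000.0, 0.05, 1e-4, 0.5):8.1f}  "
        f"SEPD={sepd(day, 1000.0, 50.0, 0.5):8.1f}  "
        f"Duong={duong(day, 1000.0, 1.5, 1.1):8.1f}"
    )
```

Because each function takes time and a small parameter set and returns a rate, any of them can be passed to a least-squares routine and fitted to the same cleaned history, which is what makes a multi-model comparison of reserves straightforward.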
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.
© 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Share and Cite
Yehia, T.; Wahba, A.; Mostafa, S.; Mahmoud, O. Suitability of Different Machine Learning Outlier Detection Algorithms to Improve Shale Gas Production Data for Effective Decline Curve Analysis. Energies 2022, 15, 8835. https://doi.org/10.3390/en15238835