Effect of Domaining in Mineral Resource Estimation with Machine Learning
Abstract
1. Introduction
2. Materials and Methods
2.1. Study Area
2.2. Methods Used for Spatial Estimation
2.2.1. XGBoost
2.2.2. SVR
2.2.3. Ensemble Methods
1. Bootstrap sampling
2. Model training
3. Aggregation
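The three steps above can be sketched as a minimal bagging ensemble; the 1-nearest-neighbour base learner used here is an illustrative stand-in, not one of the models evaluated in this study.

```python
import numpy as np

def bagging_predict(X_train, y_train, X_test, n_models=25, seed=0):
    """Minimal bagging sketch: bootstrap-sample the training set,
    fit one base learner per replicate (a trivial 1-nearest-neighbour
    regressor, for illustration only), and average the predictions."""
    rng = np.random.default_rng(seed)
    n = len(X_train)
    preds = []
    for _ in range(n_models):
        # 1. Bootstrap sampling: draw n rows with replacement
        idx = rng.integers(0, n, size=n)
        Xb, yb = X_train[idx], y_train[idx]
        # 2. Model training: here, "training" is just storing the
        #    replicate; prediction looks up the nearest stored sample
        d = np.abs(X_test[:, None, :] - Xb[None, :, :]).sum(axis=2)
        preds.append(yb[np.argmin(d, axis=1)])
    # 3. Aggregation: average the ensemble members (regression case)
    return np.mean(preds, axis=0)
```

Because each replicate sees a slightly different bootstrap sample, the averaged prediction is smoother and less sensitive to individual training points than any single base learner.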
3. Estimations
3.1. Estimation with XGBoost
3.2. Estimation with SVR
3.3. Estimation with Ensemble Technique
4. Results and Discussion
5. Conclusions
Funding
Data Availability Statement
Acknowledgments
Conflicts of Interest
Abbreviations
Abbreviation | Meaning
---|---
Cu | Copper
CPU | Central processing unit
DSM | Deposit single model
DseM | Deposit separate model
HGM | High-grade model
KNN | K-nearest neighbor
LGM | Low-grade model
ML | Machine learning
MSE | Mean square error
NN | Neural network
OK | Ordinary kriging
ppm | Parts per million
RMSE | Root mean square error
SVR | Support Vector Regression
XGBoost | Extreme Gradient Boosting Tree
References
Statistic | All Composite Data (ppm) | Composites in Low-Grade Domain (ppm) | Composites in High-Grade Domain (ppm)
---|---|---|---
Count | 483 | 346 | 137 |
Minimum | 1.2 | 1.2 | 8166 |
Average | 5697 | 3199 | 12,006 |
Median | 4020 | 3084 | 11,435 |
Maximum | 21,125 | 10,250 | 21,125 |
Standard Deviation | 4534 | 1897 | 2797 |
Skewness | 0.989 | 0.557 | 1.114 |
Kurtosis | 0.188 | −0.120 | 0.964 |
Hyper-Parameter | Grid Search Range | Explanation
---|---|---
Eta | 0.01, 0.05, 0.1, 0.3, 0.5, 1 | Step size shrinkage used to prevent overfitting
Max depth | 3, 6, 8, 10, 12, 15 | Maximum depth of a tree
Minimum child weight | 1, 3, 5, 8, 10 | Minimum sum of instance weight needed in a child
Evaluation metric | RMSE | Evaluation metric for validation data
Objective | Squared error | Learning objective
Subsample | 0.1, 0.3, 0.5, 0.6, 0.8, 1 | Subsample ratio of the training instances
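An exhaustive search over these ranges can be sketched as a plain loop over every parameter combination; `cv_rmse` below is a hypothetical placeholder for a routine that cross-validates an XGBoost model with the given parameters and returns its validation RMSE (the grid values are taken from the table above).

```python
from itertools import product
import math

# Grid values from the table above.
GRID = {
    "eta": [0.01, 0.05, 0.1, 0.3, 0.5, 1],
    "max_depth": [3, 6, 8, 10, 12, 15],
    "min_child_weight": [1, 3, 5, 8, 10],
    "subsample": [0.1, 0.3, 0.5, 0.6, 0.8, 1],
}

def grid_search(cv_rmse, grid=GRID):
    """Exhaustive grid search: evaluate every combination with the
    supplied cv_rmse callable and keep the one with the lowest RMSE."""
    best_params, best_rmse = None, math.inf
    keys = list(grid)
    for combo in product(*(grid[k] for k in keys)):
        params = dict(zip(keys, combo))
        rmse = cv_rmse(params)
        if rmse < best_rmse:
            best_params, best_rmse = params, rmse
    return best_params, best_rmse
```

With the four ranges above the search evaluates 6 × 6 × 5 × 6 = 1080 combinations, which is why the fixed metric (RMSE) and objective (squared error) are not part of the loop.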
Hyper-Parameter | Deposit Single Model | Low-Grade Domain Model | High-Grade Domain Model |
---|---|---|---|
Eta | 1 | 0.1 | 0.05 |
Max depth | 6 | 15 | 6 |
Minimum child weight | 1 | 1 | 8 |
Subsample | 0.8 | 0.6 | 0.1 |
Hyper-Parameter | Grid Search Range | Explanation
---|---|---
Poly_degree | 1, 2, 3 | Degree of polynomial
C | 1000 linearly spaced points between 0.1 and 100 | Regularization parameter
Degree | 2, 3, 4 | For polynomial kernels, degree of the polynomial function
Epsilon | 1000 linearly spaced points between 0 and 3 | Margin of tolerance
Gamma | 10 linearly spaced points between 0.25 and 1 | For RBF and polynomial kernels, defines the influence of a single training example
Coef0 | 1000 linearly spaced points between 0 and 1 | For polynomial and sigmoid kernels, independent term in the kernel function
Kernel | Linear, Polynomial (Poly), Radial Basis Function (RBF), Sigmoid | Kernel function type
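The epsilon hyper-parameter above defines SVR's tolerance tube: residuals that fall within ±epsilon of the prediction incur no loss, while larger residuals are penalized linearly. A minimal sketch of this epsilon-insensitive loss:

```python
def epsilon_insensitive_loss(y_true, y_pred, epsilon):
    """SVR's epsilon-insensitive loss: residuals inside the +/-epsilon
    tube cost nothing; outside the tube, cost grows linearly with the
    distance from the tube boundary."""
    residual = abs(y_true - y_pred)
    return max(0.0, residual - epsilon)
```

Widening epsilon therefore trades accuracy for a sparser, flatter model, which is why it is tuned jointly with the regularization parameter C.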
Hyper-Parameter | Deposit Single Model | Low-Grade Domain Model | High-Grade Domain Model |
---|---|---|---|
Kernel | Poly | RBF | Poly |
Poly Degree | 2 | 3 | 2 |
C | 9.3 | 75.1 | 45.3 |
Coef0 | 0.566 | 0.881 | 0.036 |
Degree | 2 | 4 | 4 |
Epsilon | 0.199 | 0.301 | 0.331 |
Gamma | 0.795 | 0.542 | 0.452 |
Base Learner | Deposit Single-Model Weights | Low-Grade Domain Model Weights | High-Grade Domain Model Weights |
---|---|---|---|
KNeighbors | 0.44 | 0.34 | 0.38 |
Random Forest | 0.17 | 0.28 | 0.25 |
CatBoost | 0.19 | 0.16 | 0.27 |
Extra Trees | 0.19 | 0.22 | 0.10 |
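The base-learner predictions are combined as a weighted average; a minimal sketch using the deposit single-model weights from the table above. Note the listed weights sum to 0.99, so this sketch renormalizes by their total — an assumption for illustration, since the exact aggregation step is not reproduced here.

```python
import numpy as np

# Deposit single-model weights from the table above (sum = 0.99).
WEIGHTS = {"kneighbors": 0.44, "random_forest": 0.17,
           "catboost": 0.19, "extra_trees": 0.19}

def weighted_ensemble(predictions, weights=WEIGHTS):
    """Combine base-learner outputs as a weighted average.

    predictions: dict mapping learner name -> array of grade estimates.
    The result is divided by the weight total so the combination stays
    an average even when the weights do not sum exactly to 1."""
    total = sum(weights.values())
    return sum(weights[k] * np.asarray(predictions[k], dtype=float)
               for k in weights) / total
```

For example, if every base learner predicted the same grade, the weighted average returns that grade unchanged regardless of the weights.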
ML Method | Statistic | Deposit Single Model | Low-Grade Domain in Single Model | High-Grade Domain in Single Model | Deposit Separate Models | Low-Grade Domain Model | High-Grade Domain Model
---|---|---|---|---|---|---|---
XGBoost | Min | 12 | 12 | 642 | 1698 | 1698 | 8908
XGBoost | Average | 5889 | 4580 | 9110 | 5515 | 3078 | 11,518
XGBoost | Median | 4622 | 3836 | 9733 | 3592 | 3032 | 11,523
XGBoost | Maximum | 20,239 | 20,103 | 20,239 | 17,538 | 4600 | 17,538
XGBoost | Std. | 3676 | 2932 | 3320 | 3953 | 753 | 1436
XGBoost | Count | 1697 | 1207 | 490 | 1697 | 1207 | 490
SVR | Min | 359 | 359 | 2305 | 251 | 251 | 8173
SVR | Average | 5390 | 4408 | 7807 | 5710 | 3038 | 12,294
SVR | Median | 4816 | 4061 | 7852 | 3726 | 3177 | 12,087
SVR | Maximum | 15,137 | 14,128 | 15,137 | 21,110 | 6617 | 21,110
SVR | Std. | 2798 | 2078 | 2868 | 4493 | 1282 | 2210
SVR | Count | 1697 | 1207 | 490 | 1697 | 1206 | 490
Ensemble | Min | 896 | 996 | 896 | 410 | 410 | 9319
Ensemble | Average | 5657 | 4504 | 8495 | 5536 | 3094 | 11,552
Ensemble | Median | 4480 | 3946 | 9087 | 3673 | 3112 | 11,561
Ensemble | Maximum | 16,094 | 15,518 | 16,094 | 15,829 | 6370 | 15,829
Ensemble | Std. | 2996 | 2103 | 2976 | 3976 | 1107 | 928
Ensemble | Count | 1697 | 1207 | 490 | 1697 | 1207 | 490
Domain | Single Model XGBoost | Single Model SVR | Single Model Ensemble | Domain-Based XGBoost | Domain-Based SVR | Domain-Based Ensemble
---|---|---|---|---|---|---
Deposit | 3893 | 8202 | 5908 | 7408 | 5530 | 7124 |
Low-grade domain | −16,431 | −14,720 | −14,762 | 385 | 193 | 3 |
High-grade domain | 38,173 | 49,784 | 43,224 | 6429 | −292 | 6117 |
© 2025 by the author. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Share and Cite
Atalay, F. Effect of Domaining in Mineral Resource Estimation with Machine Learning. Minerals 2025, 15, 330. https://doi.org/10.3390/min15040330