Article

Effect of Domaining in Mineral Resource Estimation with Machine Learning

Mining Engineering Department, Beytepe Campus, Hacettepe University (Hacettepe Üniversitesi), Ankara 06800, Türkiye
Minerals 2025, 15(4), 330; https://doi.org/10.3390/min15040330
Submission received: 9 February 2025 / Revised: 18 March 2025 / Accepted: 20 March 2025 / Published: 21 March 2025

Abstract

Machine learning (ML) is increasingly applied in earth sciences, including in mineral resource estimation. A critical step in this process is domaining, which significantly impacts estimation quality. However, the importance of domaining within ML-based resource estimation remains under-researched. This study aims to directly assess the effect of domaining on ML estimation accuracy. A copper deposit with well-defined, hard-boundary, low- and high-grade domains was used as a case study. Extreme Gradient Boosting (XGBoost), Support Vector Regression (SVR), and ensemble learning were employed to estimate copper distribution, both with and without domaining. Estimation performance was evaluated using summary statistics, swath plot analyses, and the quantification of out-of-range blocks. The results demonstrated that estimations without domaining exhibited substantial errors, with approximately 30% of blocks in the high-grade domain displaying values outside their expected range. These findings confirm that, analogous to classical methods, domaining is essential for accurate mineral resource estimation using ML algorithms.

1. Introduction

In the spatial estimation of mineral resources, classical methods, including kriging, inverse distance weighting, nearest neighbor interpolation, and spline interpolation, are widely employed [1,2,3,4,5]. Among these, kriging is the most prevalent [6]. The widespread application of kriging in spatial estimation has led to the conventional understanding and acceptance of its procedural steps. These steps encompass compositing drillhole data, generating wireframe models that respect mineralization, variogram modeling, and block modeling. Figure 1 illustrates the general workflow of kriging.
As shown in Figure 1, a critical step within the kriging workflow involves partitioning wireframe models into domains. This partitioning is typically based on distinct mineralization zones, grade behaviors, or a combination of parameters such as alteration and lithology, and it produces discrete, homogeneous sub-regions termed domains [7]. Domaining is fundamental to optimizing the accuracy of kriging interpolation because the kriging methodology relies on the assumption of second-order stationarity, that is, the invariance of the statistical properties of the regionalized variable under translation. In practical geostatistical applications, however, this assumption is often compromised by inherent spatial heterogeneity. Domaining mitigates this violation by delineating sub-domains where the stationarity criterion is more likely to be satisfied, thereby yielding more robust spatial estimations. Applying kriging to a non-stationary spatial field without prior domain decomposition can introduce significant bias and impair the reliability of the predictive outputs. The practical implementation of this complex domaining necessitates a subjective, manual interpretation of the mineral deposit, a process that is both time-consuming and laborious [8].
In contrast to classical spatial estimation methods, machine learning (ML) techniques are emerging. The question of whether spatial estimation using ML is superior to classical kriging remains an active area of research. The comparative literature between kriging and machine learning techniques can be broadly classified into three distinct categories: (1) studies demonstrating the superior performance of kriging, (2) investigations where machine learning serves as a viable alternative to kriging, and (3) analyses indicating the enhanced predictive capabilities of machine learning.
For instance, in the context of mineral resource estimation, research by Afeni et al. [9] provides evidence where kriging methodologies demonstrate superior performance compared to ML-based approaches. In their study, grade estimations were performed within an iron ore deposit utilizing the ordinary kriging (OK) method and a Multilayer Perceptron neural network. The estimated values obtained from both methods were compared against the actual sample values. Consequently, OK was reported to demonstrate superior performance. Jafrasteh et al. [10] conducted grade estimations in a copper mine using Random Forests, Deep Neural Networks, Gaussian Processes, and Indicator Kriging. Extensive empirical analyses within the study demonstrated that the Gaussian Process yielded the best results, followed by Indicator Kriging. Zaki et al. [11] conducted a comparative analysis of Gaussian Process Regression, Support Vector Regression (SVR), Decision Tree Ensemble, Fully Connected Neural Network, and K-Nearest Neighbors (K-NN) with OK and Indicator Kriging in a gold deposit characterized by high skewness. Utilizing the X, Y, and Z coordinates of the samples as input parameters and gold values as the output, the study demonstrated that Gaussian Process Regression outperformed the other methods, while the Fully Connected Neural Network was deemed an unreliable approach.
Conversely, studies also suggest that machine learning can be utilized as an alternative to classical geostatistical methods. For example, Galetakis et al. [12] predicted the Cu grade distribution in a copper deposit exhibiting a skewed distribution using neural networks (NN) and adaptive neuro-fuzzy inference systems. The X, Y, and Z coordinates of the samples were used as inputs, while Cu% values served as outputs. The results of this study suggest that the NN and adaptive neuro-fuzzy inference system methodologies produced results comparable to kriging. Mery and Marcotte [13] evaluated the performance of Multiple Linear Regression and NN on a synthetic dataset. The primary objective of their study was to model tonnage curves and the associated uncertainties derived from OK, Constrained Kriging, Uniform Conditioning, Indirect Lognormal Correction, discrete Gaussian, Multilayer Perceptron and NN. The authors concluded that these machine learning methods could serve as a viable alternative to classical geostatistical techniques. Samanta et al. [14] utilized the NN and OK methods to predict Al2O3, SiO2 content, and bauxite thickness within a bauxite deposit in India. The estimations employed sample X, Y, and Z coordinate values as input parameters. The study reported that the results obtained from the ML approach were comparable in quality to those derived from kriging.
Furthermore, numerous studies indicate the superior performance of ML algorithms. Rather than addressing these studies individually, it is more appropriate to focus on previously conducted reviews on the subject. The first review article to be discussed was published by Mahboob et al. in 2022 [15]. This study reviewed a total of thirty-one machine learning-related articles published between 1990 and 2019. The study evaluated and compared the types, performances, and capabilities of several machine learning methods against each other and against conventional geostatistical methods. Consequently, it was concluded that ML approaches outperformed classical geostatistical methods, with SVR being the most widely used ML technique. In 2021, Dumakor-Dupey and Arya [16] examined the use of ML approaches in mineral resource estimation. This study conducted a comparative analysis of ML and classical approaches, considering 131 studies. The review demonstrated that ML approaches are powerful tools for complex linear and nonlinear geological problems. Furthermore, it was emphasized that they produce superior results compared to classical geostatistical approaches.
Considering all the aforementioned sources, the question of whether ML approaches are superior to classical geostatistical approaches remains a subject of debate. This study does not intend to take part in that debate; it focuses exclusively on ML methods and is specifically targeted towards researchers and practitioners engaged in the research and application of ML algorithms. It neither compares classical methods with ML approaches nor seeks to determine which approach is superior. Instead, it investigates the significance of domaining within newly developed ML approaches. Consequently, the following sections exclusively address resource estimations conducted using ML approaches and delineate the general patterns observed in these studies.
The methods used for spatial estimation with ML approaches are data driven, and their estimation steps differ by nature from those of classical spatial estimation methods. In addition, the steps for executing ML approaches are not well established and are generally inherited from other engineering fields. In this context, in the spatial estimation of mineral resources, an ML model is trained, tested and validated based on the available data, which comprise composites of drillholes [12,15,16,17]. This trained model is used to predict the block model that is generated from the wireframe model, which is based on the underlying mineralization. A review of the resource estimations conducted using ML across all the sources cited in this study reveals that the general workflow presented in Figure 2 is typically employed.
As illustrated in Figure 2, the typical workflow for ML-based mineral resource estimation begins with dataset creation, incorporating spatial coordinates (X, Y, and Z for 3D) and the target variable. Subsequently, the dataset is partitioned into training and testing subsets to assess model generalization. Parameter tuning, a critical phase for ML accuracy, follows. The model is then trained with optimized parameters and evaluated using the test data. Finally, block-by-block estimation of the target variable is performed. However, a comparative analysis of Figure 1 and Figure 2 reveals a consistent omission: the domaining step. This oversight, wherein all data are utilized without explicit domain consideration, raises a crucial question regarding the impact of domaining on ML-based estimations. Addressing this question is pivotal. If domaining proves inconsequential, its elimination would significantly streamline resource estimation. Conversely, its significance necessitates its integration, currently often overlooked. Therefore, rigorously investigating the influence of domaining on ML-based resource estimations is imperative.
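A minimal Python sketch of this workflow is given below; all data, array names and parameter values are illustrative stand-ins, not the study's actual data or settings.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.ensemble import RandomForestRegressor  # any regressor fits this workflow

# Hypothetical stand-ins for composite coordinates (X, Y, Z) and Cu grades (ppm)
rng = np.random.default_rng(42)
coords = rng.uniform(0.0, 1000.0, size=(500, 3))
grades = rng.lognormal(mean=8.0, sigma=0.5, size=500)

# 1. Dataset creation and partitioning into training and testing subsets
X_train, X_test, y_train, y_test = train_test_split(
    coords, grades, test_size=0.2, random_state=0
)

# 2-3. Parameter tuning (omitted here) followed by model training
model = RandomForestRegressor(n_estimators=200, random_state=0).fit(X_train, y_train)

# 4. Evaluation on the held-out test data
print("test R^2:", model.score(X_test, y_test))

# 5. Block-by-block estimation at (hypothetical) block centroid coordinates
block_centroids = rng.uniform(0.0, 1000.0, size=(2248, 3))
block_grades = model.predict(block_centroids)
```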
The potential oversight of domaining in ML-based resource estimation may stem from the inherent pattern recognition capabilities and data-driven nature of ML algorithms. Researchers might assume that ML models, by learning from extensive datasets, inherently capture domain-specific behaviors. This presumption is reinforced by the observation that ML performance often improves with larger datasets [18,19,20,21,22,23,24,25,26]. Consequently, using all available data without domain distinctions is perceived as potentially advantageous. However, this contrasts with classical geostatistical practices, which emphasize domain-based estimations due to the structural homogeneity of data within domains. Domaining, however, naturally reduces the data volume within each domain, posing a challenge for data-hungry ML methods. This creates a dilemma: balancing the benefits of structurally similar data within domains against the advantages of larger, albeit less homogeneous, datasets. This crucial dilemma, specific to spatial estimations using ML, remains largely unexplored.
To address this knowledge gap, this paper aims to explore the significance of domaining in ML-based mineral resource estimation. To evaluate the significance of domaining in ML applications, the study utilizes three widely recognized algorithms: Extreme Gradient Boosting (XGBoost), SVR, and the ensemble method. Among the aforementioned methods, XGBoost has gained significant popularity in recent years and was selected due to its exceptional performance in regression problems. The SVR method is frequently employed in resource estimation, with several studies reporting highly satisfactory results. The ensemble method, by its nature, inherently utilizes multiple algorithms. Therefore, the ensemble approach implemented in this study, which combines Nearest Neighbors, Random Forest, CatBoost, and ExtraTrees, arrives at a final result through the integration of these diverse methods. A copper deposit serves as the case study. Initially, the models are trained on the entire dataset without domain distinctions. Subsequently, the deposit is segmented into high- and low-grade domains, and estimations are performed independently for each domain. Finally, the estimation results from both approaches are compared to assess the influence of domaining.

2. Materials and Methods

2.1. Study Area

A copper deposit located in Türkiye was selected as the study area; however, due to a confidentiality agreement, no further information can be given about its location. The high- and low-grade domains of the deposit are clearly separated, which makes it suitable for assessing the importance of domaining in grade estimation with ML approaches. A total of 46 drillholes were drilled at the deposit, with an average collar spacing of 57 m; the spacing between the closest drillholes is 5 m, and the most distant drillhole is 102 m away from its nearest neighbor. All the drillholes were drilled with variable dips and azimuths, with a total length of 6132 m (Figure 3).
In total, 484 m of this drilling intersected the copper mineralization, and the average sampling length is 1.08 m. For this reason, the raw samples were composited to 1 m lengths, as compositing is a vital step in mineral resource estimation. The summary statistics of the Cu composites are given in Table 1.
As seen in Table 1, the high-grade domain exhibits a copper grade nearly four times greater than that of the low-grade domain, indicating that the low- and high-grade domains occupy distinct concentration ranges. The high-grade domain exhibits higher average and median concentrations, as well as a wider spread of data and more pronounced positive skewness. All three datasets exhibit positive skewness, meaning the data are skewed to the right: there are many lower values and few high values. The high-grade domain has the highest skewness, meaning that a few remarkably high values pull its mean higher. To visually assess the data, histograms and boxplots of Cu (ppm) are given in Figure 4 and Figure 5, respectively.
As shown in Figure 4a, the relative frequencies of Cu grades reach local peaks approximately in the ranges of 2750–3500 Cu ppm and 8750–10,000 Cu ppm. This indicates the presence of two distinct sample populations. This separation is clearly evident in Figure 5, where it can be seen that the low-grade domain Cu values do not overlap with the range of the high-grade domain, with the exception of a single data point, which is considered an outlier for the corresponding domain. In total, two samples were considered outliers. Due to their limited number, these data points were not excluded from the drillhole database.
A wireframe (solid model) representing the copper mineralization was modelled by manually interpreting sections taken in the east–west direction. For estimation purposes, a block model corresponding to the solid model was created using blocks of 20 m × 20 m × 2 m in the X, Y and Z directions, respectively. The block sizes were determined based on multiple criteria. The horizontal dimensions were determined by considering both the possible production pattern and the average drillhole spacing. As mentioned previously, the average drillhole spacing was approximately 57 m; a value of 20 m is approximately one-third of this average spacing. These dimensions in the X and Y directions closely align with the possible open-pit production panels, and a block model that aligns with potential production panels is more usable for subsequent production scheduling. The Z dimension was selected as 2 m, which suits the possible production panel dimensions and also exceeds the composite length, consistent with general practice. In total, 2248 blocks were generated, including 1613 located in the low-grade domain and the remaining 635 in the high-grade domain (Figure 6).
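For illustration, centroids of such a block grid could be generated as follows; the model extents used here are hypothetical, and in practice only centroids inside the mineralization wireframe would be retained.

```python
import numpy as np

# Hypothetical model extents (m); real extents come from the wireframe's bounding box
x0, x1, y0, y1, z0, z1 = 0.0, 400.0, 0.0, 400.0, 0.0, 60.0
dx, dy, dz = 20.0, 20.0, 2.0  # block dimensions in the X, Y and Z directions

# Centroid coordinates, offset by half a block from the model origin
xs = np.arange(x0 + dx / 2, x1, dx)
ys = np.arange(y0 + dy / 2, y1, dy)
zs = np.arange(z0 + dz / 2, z1, dz)
gx, gy, gz = np.meshgrid(xs, ys, zs, indexing="ij")
centroids = np.column_stack([gx.ravel(), gy.ravel(), gz.ravel()])

# In a real model, centroids falling outside the solid model would be discarded here
print(centroids.shape)  # (number of candidate blocks, 3)
```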
Figure 6 shows that the low-grade domain is located beneath the high-grade zone, while the two zones have the same lateral extent. The ML methods explained in the next section are used to estimate the Cu grades of these individual blocks.

2.2. Methods Used for Spatial Estimation

2.2.1. XGBoost

In regression problems, the goal is to predict a continuous target variable based on input features. While various regression techniques exist, ensemble learning has emerged as a formidable paradigm within the field of regression, also exhibiting notable success across diverse domains [27,28,29,30,31]. In contrast to traditional methodologies that depend on a single model, ensemble methods strategically amalgamate multiple models to attain enhanced accuracy and robustness [32,33]. In ML, ensemble methods are generally categorized into three approaches: stacking, boosting, and bagging [34,35]. Among these alternatives, boosting is a powerful ensemble learning technique that sequentially combines weak learners to create a strong learner [36,37]. The method mainly focuses on iteratively improving the model by weighting training instances based on the performance of previous learners. Several prominent boosting algorithms have been adapted or specifically designed for regression such as AdaBoost [38,39], Gradient Boosting Machines [40,41], XGBoost, and Categorical Boosting [42,43].
Among these alternatives, XGBoost (Extreme Gradient Boosting) represents a highly efficient and extensively utilized implementation of the gradient boosting framework [32,44,45,46]. It demonstrates exceptional performance in both classification and regression tasks by sequentially constructing an ensemble of decision trees. Unlike conventional gradient boosting, XGBoost integrates several critical optimizations that enhance its performance and scalability.
A pivotal feature is its employment of a regularized objective function, which encompasses both a loss function and regularization terms. The objective function can be expressed as follows:
$$\mathrm{Obj}(\theta) = \sum_{i=1}^{n} L\left(y_i, \tilde{y}_i\right) + \sum_{k=1}^{K} \varphi\left(f_k\right) \tag{1}$$
where $L$ is the loss function measuring the discrepancy between the predicted $\tilde{y}_i$ and actual $y_i$ values, and $\varphi(f_k)$ represents the regularization term penalizing the complexity of each tree $f_k$ to mitigate overfitting. The regularization significantly bolsters the model's generalization capability, particularly when addressing high-dimensional data or limited sample sizes.
Moreover, XGBoost utilizes a second-order Taylor approximation of the loss function, which provides more precise gradient estimates and expedited convergence compared to the first-order approximations employed in traditional gradient boosting. The second-order Taylor approximation can be represented as follows:
$$L\left(y_i, \tilde{y}_i^{(t)} + f_t(x_i)\right) \approx L\left(y_i, \tilde{y}_i^{(t)}\right) + g_i f_t(x_i) + \frac{1}{2} h_i f_t^2(x_i) \tag{2}$$
where $g_i$ and $h_i$ are the first and second derivatives of the loss function with respect to the prediction $\tilde{y}_i^{(t)}$, and $f_t(x_i)$ is the prediction of the tree added at iteration $t$.
Additionally, XGBoost incorporates several algorithmic enhancements, including sparsity-aware split finding, a weighted quantile sketch and parallel processing. These optimizations render XGBoost a potent and versatile instrument for spatial estimation.
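As an illustration of how these regularization terms surface in practice, the sketch below configures the xgboost library's XGBRegressor; the parameter values are assumptions for demonstration, not the tuned values reported later in Table 3.

```python
from xgboost import XGBRegressor

# Illustrative configuration: reg_lambda/reg_alpha control the regularization
# term phi(f_k) in Equation (1), while the squared-error objective is optimized
# through its first and second derivatives (g_i, h_i) as in Equation (2).
model = XGBRegressor(
    objective="reg:squarederror",
    n_estimators=300,   # number of boosted trees f_k
    max_depth=4,        # limits tree complexity (part of the regularization)
    learning_rate=0.1,  # shrinkage applied to each tree's contribution
    reg_lambda=1.0,     # L2 penalty on leaf weights
    reg_alpha=0.0,      # L1 penalty on leaf weights
)
# model.fit(X_train, y_train); block_grades = model.predict(block_centroids)
```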

2.2.2. SVR

The Support Vector Machine (SVM) is the premise of Support Vector Regression: SVM was first proposed by Vapnik in 1979 [47], while SVR was introduced in 1996 [48]. For this reason, SVM is explained first. SVM is a member of the supervised machine learning family that also includes XGBoost and is designed for classification purposes. In basic terms, SVM aims to find the optimal hyperplane that separates two classes by maximizing the margin [49]. The margin is defined as the distance between the hyperplane and the closest data points from each class. These points are known as "support vectors", which give the method its name.
SVR is designed to find the best fit to the target variable while minimizing complexity and avoiding overfitting. The objective of SVR is to find the flattest function whose deviation from the targets does not exceed epsilon for all data. The flatness of the function is the main source of the reduction in overfitting and makes the model less sensitive to minor changes in the inputs. In mathematical terms, it is expressed as follows:
$$\min \; \frac{1}{2}\lVert w \rVert^2 + C \sum_{i=1}^{n} \left(\varepsilon_i + \varepsilon_i^*\right) \tag{3}$$
where $\lVert w \rVert^2$ is the regularization term that keeps the model simple, $C$ is a regularization parameter, and $\varepsilon_i$ and $\varepsilon_i^*$ are slack variables that measure the deviations of the predictions from the target values beyond the margin of tolerance.
The constraints in SVR ensure that the predictions are within an acceptable range of the actual target values, as defined by the margin of tolerance. The tolerances are given in Equation (4).
$$y_i - (w \cdot x_i + b) \le \epsilon + \varepsilon_i, \qquad (w \cdot x_i + b) - y_i \le \epsilon + \varepsilon_i^*, \qquad \varepsilon_i, \varepsilon_i^* \ge 0 \tag{4}$$
where $y_i$ is the target variable, $w \cdot x_i + b$ is the predicted value, and $\epsilon$ is the margin of tolerance.
These components together help SVR find a balance between fitting the data accurately and maintaining a simple model. Several parameters are optimized in estimation with SVR; these parameters are given in Table 4 with brief explanations.
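A minimal sketch of an SVR regressor with the C and epsilon parameters discussed above is shown below, using scikit-learn; the kernel choice and parameter values are illustrative assumptions, not the study's tuned settings.

```python
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import MinMaxScaler
from sklearn.svm import SVR

# Illustrative values: C weights the slack penalty in Equation (3) and epsilon
# is the tolerance margin of Equation (4); SVR inputs are normalized first
# (min-max normalization is used in this study, see Section 3.2).
model = make_pipeline(
    MinMaxScaler(),
    SVR(kernel="rbf", C=10.0, epsilon=0.1),
)
# model.fit(X_train, y_train); y_pred = model.predict(X_test)
```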

2.2.3. Ensemble Methods

Unlike classical ML methods, ensemble machine learning methods unify multiple machine learning methods to achieve better results than the underlying individual methods. As mentioned previously, ensemble methods are generally categorized into stacking, boosting and bagging. Among these categories, bagging (bootstrap aggregating) was introduced by Breiman [50]; in its heterogeneous form, it utilizes different machine learning methods, called base learners, to obtain the final result. The main purpose of bagging is to enhance the stability of predictions and reduce the variance of outcomes, which mitigates the risk of overfitting [51,52,53]. As the name bagging implies, the approach mainly consists of the steps of bootstrapping and aggregation, as follows.
1. Bootstrap sampling: Given a dataset $D$ with $n$ instances, $B$ bootstrap samples $D_1, D_2, \ldots, D_B$ are generated. Each bootstrap sample $D_i$ is created by randomly sampling $n$ instances from $D$ with replacement. This means some instances may appear multiple times in a sample, while others may be omitted.
2. Model training: A base model $M_i$ is trained on each bootstrap sample $D_i$. The base models can be of the same type or of distinct types, depending on the specific implementation of bagging.
3. Aggregation: For regression tasks, the final prediction $\hat{y} = \sum_{i=1}^{k} k_i \hat{y}_i$ is obtained by weighting the predictions of all base models, where $k_i$ are the weights of the predictions $\hat{y}_i$.
In this basic form of bagging, the approach can be seen as finding the weights of each underlying base learner, which can be different machine learning approaches. The weights $k_i$ are generally determined by minimizing the total expected prediction error (MSE); a sketch of this weighted aggregation is given after the base learner descriptions below. As base learners can be selected from all available ML algorithms, only the ML approaches used in this study are mentioned.
k-Nearest Neighbors: The k-Nearest Neighbors (kNN) algorithm for regression estimates the value of a data point by taking the average of its k closest neighbors. The effectiveness of kNN is significantly influenced by the selection of k and the distance metric used to find the neighbors. It is a non-parametric technique, meaning it does not assume any specific data distribution, which can be beneficial for certain datasets [54].
Random Forest: Random Forest for regression creates a collection of decision trees, each trained on a random subset of the data and features. The final prediction is derived by averaging the outputs of all the individual trees, which helps to minimize overfitting and enhance generalization. This method is resilient to noise and can efficiently manage large datasets with many features [55,56,57].
CatBoost: CatBoost is a gradient boosting algorithm that constructs an ensemble of trees in a sequential manner, with each new tree aiming to correct the errors of the previous ones. It is particularly effective at managing categorical features without extensive preprocessing, making it user-friendly and efficient. CatBoost also includes techniques to prevent overfitting, such as ordered boosting and gradient-based feature selection, which improve its performance in regression tasks [58,59].
ExtraTrees: Extra Trees, or Extremely Randomized Trees, is an ensemble method that introduces more randomness into the tree-building process compared to Random Forest. It splits nodes by selecting cut points at random, which helps to reduce variance and enhance model robustness. For regression, the final prediction is obtained by averaging the outputs of all the trees in the ensemble, making it a powerful method for capturing complex patterns in the data [60,61].
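The sketch below illustrates the weighted bagging scheme with the four base learners used in this study. The exact weighting procedure is not specified beyond minimizing the expected prediction error, so the non-negative least-squares fit on a validation set shown here is an assumption.

```python
import numpy as np
from scipy.optimize import nnls
from sklearn.neighbors import KNeighborsRegressor
from sklearn.ensemble import RandomForestRegressor, ExtraTreesRegressor
from catboost import CatBoostRegressor  # third-party package, assumed installed

# The four base learners used in this study (parameter tuning omitted)
base_learners = [
    KNeighborsRegressor(n_neighbors=5),
    RandomForestRegressor(n_estimators=200, random_state=0),
    CatBoostRegressor(iterations=300, verbose=0, random_state=0),
    ExtraTreesRegressor(n_estimators=200, random_state=0),
]

def fit_weighted_ensemble(X_train, y_train, X_val, y_val):
    """Fit the base learners, then find non-negative weights k_i that
    minimize the squared prediction error on a validation set."""
    preds = np.column_stack(
        [m.fit(X_train, y_train).predict(X_val) for m in base_learners]
    )
    weights, _ = nnls(preds, y_val)  # least-squares weights constrained to >= 0
    return weights / weights.sum()   # normalize so the weights sum to one

def predict_weighted_ensemble(weights, X):
    preds = np.column_stack([m.predict(X) for m in base_learners])
    return preds @ weights           # weighted aggregation of base predictions
```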

3. Estimations

The grade estimation process performed with ML methods such as XGBoost and SVR consists of very similar steps: feeding in the raw data, training the method-specific model, and estimating the block grades with the trained model [17]. The only difference between the methods stems from the determination of method-specific parameters. Predictions made with ensemble learning consist of similar steps but additionally include the stage of selecting the base learners.
In all estimations, the input data were the X, Y and Z coordinates of the composites, while the corresponding Cu grades were the outputs. One of the most important steps in the training phase is parameter tuning. In this study, the grid search approach, which seeks the best result by assessing all candidate combinations, was adopted. Overfitting is undesirable in any kind of model training: it occurs when a model learns the training data too well, capturing noise and outliers rather than the underlying patterns, so that it performs exceptionally well on the training data but poorly on unseen test data. Overfitting is characterized by high variance and low bias, meaning the model is too complex and sensitive to the specific details of the training data. To mitigate overfitting and robustly evaluate model performance and generalizability, K-fold cross-validation was employed [62]. This technique partitions the dataset into k mutually exclusive subsets, or folds. In each of k iterations, a distinct fold is designated as the validation set, while the remaining k-1 folds constitute the training set. Consequently, every data point is utilized for both training and validation, providing a comprehensive assessment of the model's generalization capabilities.
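A sketch of such a grid search wrapped in K-fold cross-validation, using scikit-learn and xgboost, is shown below; the parameter grid is illustrative, not the actual search ranges of Table 3.

```python
from sklearn.model_selection import GridSearchCV, KFold
from xgboost import XGBRegressor

# Illustrative grid, not the actual search ranges of Table 3
param_grid = {
    "n_estimators": [100, 300, 500],
    "max_depth": [3, 4, 6],
    "learning_rate": [0.05, 0.1, 0.2],
}
cv = KFold(n_splits=10, shuffle=True, random_state=0)  # K-fold cross-validation
search = GridSearchCV(
    XGBRegressor(objective="reg:squarederror"),
    param_grid,
    cv=cv,
    scoring="neg_root_mean_squared_error",  # RMSE, the metric reported in this study
)
# search.fit(X_train, y_train); best_model = search.best_estimator_
```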
To evaluate the importance of domaining in machine learning-based estimations, the estimations were initially made across the entire deposit without any distinction. Subsequently, estimations were made separately for low-grade and high-grade zones. In the estimation of the low-grade zone, only the composite data within this zone was trained. Similarly, in the estimation of the high-grade zone, only the data within the high-grade zone was used. Therefore, three separate machine learning models were developed for each of the three estimation methods.
The estimations were performed on a computer with 64 GB RAM and a 24-core CPU. All programs were written in Python 3.10. It is commonly believed that machine learning methods require a long time to produce results; however, these grade estimations were fast due to the limited number of inputs (X, Y, and Z values) and outputs (composite Cu grade values). After the models were trained on the current hardware, estimating all blocks took 2.56 × 10−4 s with XGBoost, 2.48 × 10−4 s with SVR, and 2.77 × 10−4 s with the ensemble method. To assess the applicability of this study to other deposits, an artificial block model consisting of one million blocks was created and estimated using the same models. These estimations took only 0.06, 0.05, and 0.07 s for XGBoost, SVR and the ensemble method, respectively, which shows that estimation takes an insignificant amount of time once the models are trained. The details of all estimation methods used in the study are provided below.

3.1. Estimation with XGBoost

The composites were divided into training and test sets, corresponding to 80% and 20% of the data, respectively. The samples were selected randomly. Some parameters should be tuned to obtain acceptable estimation results (Table 2).
The parameters given in Table 2 were tuned via grid search; the tuned values and search ranges are given in Table 3.
The tuned parameters in Table 3 were used to calculate the RMSE of the K-fold cross-validation, yielding values of 0.74, 0.45 and 0.38 for the deposit single model, low-grade domain model and high-grade domain model, respectively. These tuned models were also used to estimate the Cu values of the test data; the RMSEs of the actual vs. estimated test Cu values were 0.71, 0.46 and 0.36 for the deposit single model, low-grade domain model and high-grade domain model, respectively. The closeness of these RMSE values indicates that all the models generalize well. For this reason, the models were used to predict the Cu grades of individual blocks.

3.2. Estimation with SVR

As with the estimations using XGBoost, the dataset for SVR estimations was also divided into two groups: training and test data. As is well known, the data fed into SVR needs to be normalized. Among many normalization methods, min-max normalization was preferred. The formula for the method is as follows [63].
$$X_{norm} = \frac{X - X_{min}}{X_{max} - X_{min}} \tag{5}$$
where $X$ is the original value, $X_{min}$ is the minimum value in the dataset, $X_{max}$ is the maximum value in the dataset and $X_{norm}$ is the normalized value. This formula scales the original data to a range between 0 and 1.
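In scikit-learn, the same scaling is provided by MinMaxScaler; a minimal sketch with toy data follows.

```python
import numpy as np
from sklearn.preprocessing import MinMaxScaler

X_train = np.array([[0.0, 10.0], [5.0, 20.0], [10.0, 40.0]])  # toy coordinates
scaler = MinMaxScaler()                       # applies Equation (5) per column
X_train_norm = scaler.fit_transform(X_train)  # learns X_min and X_max from training data
# At prediction time, scaler.transform(...) reuses the training X_min and X_max,
# so no test-set statistics leak into the model.
```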
The normalized composite values were used as inputs, while the Cu values were determined as the target variable. As for the XGBoost estimations, the composites were divided into training and test sets, corresponding to 80% and 20% of the data, respectively. Some parameters need to be optimized for SVR estimations. The grid search approach was adopted for optimization. These parameters are shown in Table 4 and Table 5.
To assess the generalization of the models, the RMSE of the K-fold cross-validation (K = 10) on the training dataset was calculated, yielding values of 0.82, 0.53 and 0.28 for the deposit single model, low-grade domain model and high-grade domain model, respectively. The trained models behaved similarly on the test datasets, with RMSE values of 0.79, 0.58 and 0.32 for the deposit single model, low-grade domain model and high-grade domain model, respectively. This shows that the models are not overfitted to the training datasets and can be used for estimation purposes.

3.3. Estimation with Ensemble Technique

In the methods used so far, the basic strategy was based on tuning the parameters of the method in question. However, as explained in Section 2.2.3, ensemble learning is a different approach that combines the strengths of various models to achieve high performance. A bagging strategy is used to estimate the spatial distribution of Cu grades, with k-Nearest Neighbors, Random Forest, CatBoost and ExtraTrees as base learners. As with all other ML methods employed in this study, the composite data were separated into training and testing sets comprising 80% and 20% of the data, respectively. First, the parameters of the individual learners were tuned; then, the weights of the base learners were determined by minimizing the total expected prediction error.
As seen in Table 6, k-Nearest Neighbors has the highest relative weight across all estimates, while the weights of the other methods are quite similar. Due to the black-box nature of ML approaches, this behavior cannot be readily explained.

4. Results and Discussion

In spatial estimation, it is generally desired and expected that the statistics of the estimated grades are close to those of the composites. However, as with classical geostatistical methods, results obtained with machine learning are known to suffer from the smoothing effect [9], whereby the variance of the predicted values is lower than the variance of the composites. Due to the unbiasedness condition, however, the estimation mean is expected to be close to the composite mean. The statistics of the predictions made with the XGBoost, SVR, and ensemble methods are shown in Table 7. As stated previously, all the ML models were initially trained on the available data without considering domains; the statistics of these estimates are given, per ML method, in the column named 'deposit single model'. Without making additional estimations, the statistics of the blocks that fall within the low- and high-grade zones are provided in the columns 'low-grade domain single model' and 'high-grade domain single model', respectively. The columns 'low-grade domain model' and 'high-grade domain model' show the statistics of separate estimations made from the corresponding domain data, and 'deposit separate models' shows the statistics of the domain-based estimations combined over the entire deposit.
As shown in Table 7, the averages of the deposit single model and the deposit separate models are close to the averages of all composites given in Table 1, indicating that all the estimation methods produced acceptable results. However, considering only the averages of the estimates may lead to misleading conclusions; it is more accurate to evaluate the results separately in the low- and high-grade zones. As seen in Table 1, which provides summary statistics for the composites, the Cu averages of the composites in the low- and high-grade zones are 3199 and 12,006 ppm, respectively. Table 7 shows that the predictions made without considering the domains (given in the columns named 'low-grade domain single model' and 'high-grade domain single model') deviate significantly from the averages within the domains, whereas the statistics of the domain-based estimations are closer to those of the corresponding composites, which is desired in spatial estimation.
The maximum values of the low-grade domain in the single model for all ML approaches are significantly higher than the corresponding domain's maximum composite values. This is a serious flaw, as the block models contain higher values than are observed in the composites of that domain. Similarly, the minimum value of the high-grade domain in the single model is significantly lower than the domain-specific composites' minimum value, which is also undesirable in block model estimation. These results show that using a single model in the presence of distinct domains can be a source of significant issues.
To investigate the reason for this issue, the block model was assessed within the domains. First, blocks within the high-grade zone with values lower than 8166 Cu (ppm), the minimum composite value within this zone, were identified. Second, blocks within the low-grade zone with values higher than 10,250 Cu (ppm), the maximum composite value within the corresponding domain, were identified. In both cases, the identified blocks were located at the contacts of the zones. A total of 193 blocks in the high-grade zone were out of the zone's range (equivalent to 30% of the blocks in the high-grade zone), while 34 blocks (equivalent to 2% of the blocks in the low-grade zone) were out of range in the low-grade zone. This observation suggests that high- and low-grade composite data at the contacts influence blocks outside their respective zones: composite data from the high-grade zone may lead to overestimation of certain blocks in the low-grade zone, and data from the low-grade zone may result in underestimation of blocks in the high-grade zone. According to the results, the high-grade zone is more affected than the low-grade zone, probably because the greater amount of data in the low-grade zone dominated the trained model. However, due to the black-box nature of machine learning, providing mathematical proof of this phenomenon is not feasible; the lack of interpretability is a well-known inherent weakness of machine learning algorithms. To mitigate this issue in single-model estimations, attention mechanisms could be explored in future studies. An attention mechanism essentially involves training machine learning methods to assign greater importance to specific data points; by weighting the high-grade data more heavily in the high-grade zone, more realistic results could be achieved. To sum up, all these results show that predictions made without considering domains yield unreliable estimation outcomes from a domain perspective.
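A sketch of the out-of-range check described above is given below; the block estimates and domain labels are randomly generated stand-ins, while the two thresholds are the composite limits quoted above.

```python
import numpy as np

# Randomly generated stand-ins for block estimates (ppm) and domain labels
rng = np.random.default_rng(1)
est = rng.uniform(2000.0, 14000.0, size=2248)
domain = rng.choice(["low", "high"], size=2248)

HIGH_MIN = 8166.0   # minimum composite grade in the high-grade domain
LOW_MAX = 10250.0   # maximum composite grade in the low-grade domain

out_high = np.sum(est[domain == "high"] < HIGH_MIN)  # high-grade blocks below range
out_low = np.sum(est[domain == "low"] > LOW_MAX)     # low-grade blocks above range
print(f"out of range: {out_high} high-grade blocks, {out_low} low-grade blocks")
```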
The results were also evaluated from the perspective of changes in spatial averages. Swath plots are primarily used to compare the averages of data falling within specific intervals along a particular direction. The direction is discretized into a series of contiguous, non-overlapping segments termed "swaths". The dimensions of these swaths, primarily their width perpendicular to the path's direction, are crucial parameters influencing the resolution of the analysis. A narrower swath width captures finer-scale variations in the variable of interest along the path but may be more susceptible to localized noise or outliers; a wider swath width provides a smoother representation, averaging out local fluctuations but potentially masking smaller-scale features. For each defined swath, the average value of the target variable is calculated: all data points (composites and block estimates) that fall within the spatial boundaries of the swath are identified, and the arithmetic average of their values is computed, representing the central tendency of the variable within that swath. This process is repeated for each swath along the defined path, generating a series of average values corresponding to discrete locations along the path. In this study, swath plots were compared only in the Y direction; due to the geometric structure of the deposit, it was not possible to generate meaningful swaths in the X and Z directions.
In the assessment of swath plots, it is desirable for the averages of the data within the intervals to be close to the composite averages. The evaluation of swath plots is generally performed visually. As seen in Figure 7a, all the methods produced results close to the composite values, as desired. However, as clearly shown in Figure 7b,c, domain-specific estimates produced results closer to the composite values based on swath analysis, while all single-model-based estimates deviated significantly from the composite values. As an alternative to visual comparison, which might be subjective, a metric was developed to measure the deviation from the composites as follows:
$$Cu_{dev} = \sum_{i=1}^{n} \frac{Cu_{comp\_i} - Cu_{est\_i}}{Cu_{comp\_i}} \tag{6}$$
where $Cu_{comp\_i}$ is the average copper grade of the composites in interval $i$, $Cu_{est\_i}$ is the average copper grade of the block estimates in interval $i$, $Cu_{dev}$ is the cumulative relative deviation of the estimates from the composites, and $n$ is the number of intervals in the swath plot. Due to the nature of the formulation, values close to zero indicate spatial estimation results closer to the composites. Results according to Equation (6) are given in Table 8.
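A minimal implementation of Equation (6) might look as follows; the toy interval averages are invented for illustration.

```python
import numpy as np

def cu_dev(comp_means, est_means):
    """Equation (6): cumulative relative deviation of the swath-interval
    block-estimate averages from the corresponding composite averages."""
    comp = np.asarray(comp_means, dtype=float)
    est = np.asarray(est_means, dtype=float)
    return np.sum((comp - est) / comp)

# Toy example: estimates above the composites in every swath give a negative
# value, i.e. systematic overestimation (cf. the single-model results, Table 8).
print(cu_dev([3000.0, 3200.0, 3100.0], [3400.0, 3500.0, 3300.0]))  # < 0
```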
As seen in Table 8, in the low-grade zone, all the ML methods that use a single model systematically overestimate the grade distribution, as indicated by their significantly negative interval-based cumulative deviations. In the high-grade zone, single-model-based estimates underestimate the grade values. In terms of spatial variability, the estimations made within domains produced better results. Together, the summary statistics and swath plot analysis show that domain-based estimation generates better results. For these reasons, this study demonstrates that domaining is a crucial step in spatial estimation. As in classical geostatistics, using a smaller but more homogeneous dataset that recognizes domains yields better results than using a larger but less homogeneous dataset that does not.
In this study, a dataset with clearly separated domains was used; in other words, there was a hard boundary between the zones. However, in mineral deposits, domains may not always show such clear separation. Therefore, it is recommended to assess the results obtained here in deposits with more complex domains. This study emphasizes the importance of domaining in spatial estimation. As a natural result of domaining, the data in each domain are subsets of the deposit data, and these subsets may contain a limited amount of data. This study does not make a judgment on the minimum number of data points required for spatial estimation within domains; this remains an open research subject. As reported in previous studies [17,64], the results of all estimates were smoother than the composite values; in other words, the variance of the estimates was lower than that of the composites. Furthermore, this study only assesses the importance of domaining; it does not determine which approach is the best estimator within these domains. Further studies should therefore investigate which ML algorithms perform best in spatial estimation in the presence of domains. Also, in this study, only the X, Y and Z values were considered as input data; lithology, other structural features and geochemical variables were not. It is recommended that further studies assess the effects of these variables.

5. Conclusions

The widespread adoption of machine learning in engineering fields has extended to earth sciences, resulting in a growing body of research utilizing ML algorithms for mineral resource estimation. These studies commonly emphasize model training and subsequent block value estimations. However, the critical process of domaining is often neglected. To directly evaluate the significance of domaining, this study compared Cu estimations in a copper deposit performed with and without the explicit delineation of high- and low-grade zones. Estimations lacking domaining exhibited substantial errors, primarily stemming from the models’ failure to capture sharp grade discontinuities at zone boundaries. This resulted in the misallocation of high-grade values to low-grade zones and vice versa, with a notable 30% of blocks in the high-grade zone displaying out-of-range values. In contrast, estimations incorporating domaining yielded significantly improved accuracy, as validated by summary statistics, swath plot analyses, and out-of-range block assessments. These results unequivocally establish the vital importance of domaining in ML-driven resource estimation, regardless of the specific algorithm used. Nonetheless, all ML-derived estimates displayed a smoothing effect, consistently exhibiting lower variance than the composite data.

Funding

This research received no external funding.

Data Availability Statement

Due to data privacy, the data cannot be shared.

Acknowledgments

The author would like to thank two anonymous reviewers for their feedback that greatly improved the quality of the manuscript.

Conflicts of Interest

The author declares no conflicts of interest.

Abbreviations

The following abbreviations are used in this manuscript:
Cu: Copper
CPU: Central processing unit
DSM: Deposit single model
DseM: Deposit separate model
HGM: High-grade model
KNN: K-nearest neighbor
LGM: Low-grade model
ML: Machine learning
MSE: Mean square error
NN: Neural network
OK: Ordinary kriging
ppm: Parts per million
RMSE: Root mean square error
SVR: Support Vector Regression
XGBoost: Extreme Gradient Boosting Tree

References

  1. Siddiqui, F.I.; Pathan, A.G.; Ünver, B.; Tercan, A.E.; Hindistan, M.A.; Ertunç, G.; Atalay, F.; Ünal, S.; Kıllıoğlu, Y. Lignite resource estimations and seam modeling of Thar Field, Pakistan. Int. J. Coal Geol. 2015, 140, 84–96. [Google Scholar]
  2. Ertunc, G.; Tercan, A.E.; Hindistan, M.A.; Unver, B.; Unal, S.; Atalay, F.; Killioglu, S.Y. Geostatistical estimation of coal quality variables by using covariance matching constrained kriging. Int. J. Coal Geol. 2013, 112, 14–25. [Google Scholar] [CrossRef]
  3. Rossi, M.E.; Deutsch, C.V. Mineral Resource Estimation; Springer Science & Business Media: Berlin/Heidelberg, Germany, 2013. [Google Scholar]
  4. Xiang, J.; Xiao, K.; Carranza, E.J.M.; Chen, J.; Li, S. 3D mineral prospectivity mapping with random forests: A case study of Tongling, Anhui, China. Nat. Resour. Res. 2020, 29, 395–414. [Google Scholar]
  5. Amadu, C.C.; Owusu, S.; Foli, G.; Brako, B.A.; Abanyie, S.K. Comparison of ordinary kriging (OK) and inverse distance weighting (IDW) methods for the estimation of a modified palaeoplacer gold deposit: A case study of the Teberebie gold deposit, SW Ghana. Group 2022, 250, 700. [Google Scholar]
  6. Atalay, F.; Tercan, A.E.; Ünver, B.; Hindistan, M.A.; Ertunç, G. A Geostatistical Study of Tertiary Coal Fields in Turkey. In Proceedings of the Mathematics of Planet Earth, 15th Annual Conference of the International Association for Mathematical Geosciences, Madrid, Spain, 2–6 September 2014; pp. 723–726. [Google Scholar]
  7. Romary, T.; Rivoirard, J.; Deraisme, J.; Quinones, C.; Freulon, X. Domaining by clustering multivariate geostatistical data. In Geostatistics Oslo 2012; Springer: Berlin/Heidelberg, Germany, 2012; pp. 455–466. [Google Scholar]
  8. Fouedjio, F.; Hill, E.J.; Laukamp, C. Geostatistical clustering as an aid for ore body domaining: Case study at the Rocklea Dome channel iron ore deposit, Western Australia. Appl. Earth Sci. 2018, 127, 15–29. [Google Scholar]
  9. Afeni, T.B.; Lawal, A.I.; Adeyemi, R.A. Re-examination of Itakpe iron ore deposit for reserve estimation using geostatistics and artificial neural network techniques. Arab. J. Geosci. 2020, 13, 657. [Google Scholar] [CrossRef]
  10. Jafrasteh, B.; Fathianpour, N.; Suárez, A. Comparison of machine learning methods for copper ore grade estimation. Comput. Geosci. 2018, 22, 1371–1388. [Google Scholar] [CrossRef]
  11. Zaki, M.M.; Chen, S.; Zhang, J.; Feng, F.; Khoreshok, A.A.; Mahdy, M.A.; Salim, K.M. A Novel Approach for Resource Estimation of Highly Skewed Gold Using Machine Learning Algorithms. Minerals 2022, 12, 900. [Google Scholar] [CrossRef]
  12. Galetakis, M.; Vasileiou, A.; Rogdaki, A.; Deligiorgis, V.; Raka, S. Estimation of mineral resources with machine learning techniques. Mater. Proc. 2022, 5, 122. [Google Scholar] [CrossRef]
  13. Mery, N.; Marcotte, D. Quantifying Mineral Resources and Their Uncertainty Using Two Existing Machine Learning Methods. Math. Geosci. 2021, 54, 363–387. [Google Scholar] [CrossRef]
  14. Samanta, B.; Ganguli, R.; Bandopadhyay, S. Comparing the predictive performance of neural networks with ordinary kriging in a bauxite deposit. Min. Technol. 2013, 114, 129–139. [Google Scholar] [CrossRef]
  15. Mahboob, M.; Celik, T.; Genc, B. Review of machine learning-based Mineral Resource estimation. J. South. Afr. Inst. Min. Metall. 2022, 122, 655–664. [Google Scholar]
  16. Dumakor-Dupey, N.K.; Arya, S. Machine Learning—A Review of Applications in Mineral Resource Estimation. Energies 2021, 14, 4079. [Google Scholar] [CrossRef]
  17. Atalay, F. Estimation of Fe Grade at an Ore Deposit Using Extreme Gradient Boosting Trees (XGBoost). Min. Metall. Explor. 2024, 41, 2119–2128. [Google Scholar]
  18. Adadi, A. A survey on data-efficient algorithms in big data era. J. Big Data 2021, 8, 24. [Google Scholar]
  19. Obermeyer, Z.; Emanuel, E.J. Predicting the future—Big data, machine learning, and clinical medicine. N. Engl. J. Med. 2016, 375, 1216–1219. [Google Scholar]
  20. Van Der Ploeg, T.; Austin, P.C.; Steyerberg, E.W. Modern modelling techniques are data hungry: A simulation study for predicting dichotomous endpoints. BMC Med. Res. Methodol. 2014, 14, 137. [Google Scholar]
  21. Chia, M.Y.; Huang, Y.F.; Koo, C.H. Resolving data-hungry nature of machine learning reference evapotranspiration estimating models using inter-model ensembles with various data management schemes. Agric. Water Manag. 2022, 261, 107343. [Google Scholar]
  22. Gravesteijn, B.Y.; Nieboer, D.; Ercole, A.; Lingsma, H.F.; Nelson, D.; Van Calster, B.; Steyerberg, E.W.; Åkerlund, C.; Amrein, K.; Andelic, N. Machine learning algorithms performed no better than regression models for prognostication in traumatic brain injury. J. Clin. Epidemiol. 2020, 122, 95–107. [Google Scholar]
  23. Liu, T.-L.; Flückiger, B.; de Hoogh, K. A comparison of statistical and machine-learning approaches for spatiotemporal modeling of nitrogen dioxide across Switzerland. Atmos. Pollut. Res. 2022, 13, 101611. [Google Scholar]
  24. Shouval, R.; Fein, J.A.; Savani, B.; Mohty, M.; Nagler, A. Machine learning and artificial intelligence in haematology. Br. J. Haematol. 2021, 192, 239–250. [Google Scholar] [PubMed]
  25. Delaney, C.; Li, X.; Holmberg, K.; Wilson, B.; Heathcote, A.; Nieber, J. Estimating lake water volume with regression and machine learning methods. Front. Water 2022, 4, 886964. [Google Scholar]
  26. Zhao, T.; Wang, S.; Ouyang, C.; Chen, M.; Liu, C.; Zhang, J.; Yu, L.; Wang, F.; Xie, Y.; Li, J. Artificial intelligence for geoscience: Progress, challenges and perspectives. Innovation 2024, 5, 100691. [Google Scholar] [PubMed]
  27. Joshi, A.; Vishnu, C.; Mohan, C.K.; Raman, B. Application of XGBoost model for early prediction of earthquake magnitude from waveform data. J. Earth Syst. Sci. 2023, 133, 5. [Google Scholar]
  28. Shahani, N.M.; Zheng, X.; Liu, C.; Hassan, F.U.; Li, P. Developing an XGBoost regression model for predicting young’s modulus of intact sedimentary rocks for the stability of surface and subsurface structures. Front. Earth Sci. 2021, 9, 761990. [Google Scholar]
  29. Osman, A.I.A.; Ahmed, A.N.; Chow, M.F.; Huang, Y.F.; El-Shafie, A. Extreme gradient boosting (Xgboost) model to predict the groundwater levels in Selangor Malaysia. Ain Shams Eng. J. 2021, 12, 1545–1556. [Google Scholar]
  30. Jia, Y.; Jin, S.; Savi, P.; Gao, Y.; Tang, J.; Chen, Y.; Li, W. GNSS-R soil moisture retrieval based on a XGboost machine learning aided method: Performance and validation. Remote Sens. 2019, 11, 1655. [Google Scholar] [CrossRef]
  31. Badola, S.; Mishra, V.N.; Parkash, S. Landslide susceptibility mapping using XGBoost machine learning method. In Proceedings of the 2023 International Conference on Machine Intelligence for GeoAnalytics and Remote Sensing (MIGARS), Hyderabad, India, 27–29 January 2023; pp. 1–4. [Google Scholar]
  32. Dietterich, T.G. Ensemble methods in machine learning. In Proceedings of the International workshop on multiple classifier systems, Cagliari, Italy, 21–23 June 2000; pp. 1–15. [Google Scholar]
  33. Ganaie, M.A.; Hu, M.; Malik, A.K.; Tanveer, M.; Suganthan, P.N. Ensemble deep learning: A review. Eng. Appl. Artif. Intell. 2022, 115, 105151. [Google Scholar]
  34. Graczyk, M.; Lasota, T.; Trawiński, B.; Trawiński, K. Comparison of bagging, boosting and stacking ensembles applied to real estate appraisal. In Proceedings of the Intelligent Information and Database Systems: Second International Conference, ACIIDS, Hue City, Vietnam, 24–26 March 2010; Proceedings Part II 2. pp. 340–350. [Google Scholar]
  35. Shahhosseini, M.; Hu, G.; Pham, H. Optimizing ensemble weights and hyperparameters of machine learning models for regression problems. Mach. Learn. Appl. 2022, 7, 100251. [Google Scholar]
  36. Ridgeway, G.; Madigan, D.; Richardson, T.S. Boosting methodology for regression problems. In Proceedings of the Seventh International Workshop on Artificial Intelligence and Statistics, Fort Lauderdale, FL, USA, 3–6 January 1999. [Google Scholar]
  37. Natekin, A.; Knoll, A. Gradient boosting machines, a tutorial. Front. Neurorobotics 2013, 7, 21. [Google Scholar]
  38. Freund, Y.; Schapire, R.E. A decision-theoretic generalization of on-line learning and an application to boosting. J. Comput. Syst. Sci. 1997, 55, 119–139. [Google Scholar]
  39. Schapire, R.E. The boosting approach to machine learning: An overview. In Nonlinear Estimation and Classification; Springer: Berlin/Heidelberg, Germany, 2003; pp. 149–171. [Google Scholar]
  40. Friedman, J.H. Greedy function approximation: A gradient boosting machine. Ann. Stat. 2001, 29, 1189–1232. [Google Scholar]
  41. He, Z.; Lin, D.; Lau, T.; Wu, M. Gradient boosting machine: A survey. arXiv 2019, arXiv:1908.06951. [Google Scholar]
  42. Prokhorenkova, L.; Gusev, G.; Vorobev, A.; Dorogush, A.V.; Gulin, A. CatBoost: Unbiased boosting with categorical features. Adv. Neural Inf. Process. Syst. 2018, 31, 11. [Google Scholar]
  43. Dorogush, A.V.; Ershov, V.; Gulin, A. CatBoost: Gradient boosting with categorical features support. arXiv 2018, arXiv:1810.11363. [Google Scholar]
  44. Chen, T.; Guestrin, C. Xgboost: A scalable tree boosting system. In Proceedings of the 22nd ACM Sigkdd International Conference on Knowledge Discovery and Data Mining, San Francisco, CA, USA, 13–17 August 2016; pp. 785–794. [Google Scholar]
  45. Nielsen, D. Tree Boosting with xgboost-Why Does xgboost Win “Every” Machine Learning Competition? NTNU: Trondheim, Norway, 2016. [Google Scholar]
  46. Wade, C.; Glynn, K. Hands-On Gradient Boosting with XGBoost and Scikit-Learn: Perform Accessible Machine Learning and Extreme Gradient Boosting with Python; Packt Publishing Ltd.: Birmingham, UK, 2020. [Google Scholar]
  47. Drucker, H.; Wu, D.; Vapnik, V.N. Support vector machines for spam categorization. IEEE Trans. Neural Netw. 1999, 10, 1048–1054. [Google Scholar]
  48. Drucker, H.; Burges, C.J.; Kaufman, L.; Smola, A.; Vapnik, V. Support vector regression machines. Adv. Neural Inf. Process. Syst. 1996, 9, 1–7. [Google Scholar]
  49. Lachaud, A.; Adam, M.; Mišković, I. Comparative study of random forest and support vector machine algorithms in mineral prospectivity mapping with limited training data. Minerals 2023, 13, 1073. [Google Scholar] [CrossRef]
  50. Breiman, L. Bagging predictors. Mach. Learn. 1996, 24, 123–140. [Google Scholar]
  51. Patil, P.; Du, J.-H.; Kuchibhotla, A.K. Bagging in overparameterized learning: Risk characterization and risk monotonization. J. Mach. Learn. Res. 2023, 24, 1–113. [Google Scholar]
  52. Ghojogh, B.; Crowley, M. The theory behind overfitting, cross validation, regularization, bagging, and boosting: Tutorial. arXiv 2019, arXiv:1905.12787. [Google Scholar]
  53. Park, Y.; Ho, J.C. Tackling overfitting in boosting for noisy healthcare data. IEEE Trans. Knowl. Data Eng. 2019, 33, 2995–3006. [Google Scholar] [CrossRef]
  54. Song, Y.; Liang, J.; Lu, J.; Zhao, X. An efficient instance selection algorithm for k nearest neighbor regression. Neurocomputing 2017, 251, 26–34. [Google Scholar] [CrossRef]
  55. Ho, T.K. Random decision forests. In Proceedings of the 3rd International Conference on Document Analysis and Recognition, Montreal, QC, Canada, 14–16 August 1995; pp. 278–282. [Google Scholar]
  56. Breiman, L. Random forests. Mach. Learn. 2001, 45, 5–32. [Google Scholar] [CrossRef]
  57. Zhang, H.; Xie, M.; Dan, S.; Li, M.; Li, Y.; Yang, D.; Wang, Y. Optimization of Feature Selection in Mineral Prospectivity Using Ensemble Learning. Minerals 2024, 14, 970. [Google Scholar] [CrossRef]
  58. Hancock, J.T.; Khoshgoftaar, T.M. CatBoost for big data: An interdisciplinary review. J. Big Data 2020, 7, 94. [Google Scholar] [CrossRef]
  59. Fu, P.; Zhang, J.; Yuan, Z.; Feng, J.; Zhang, Y.; Meng, F.; Zhou, S. Estimating the heavy metal contents in entisols from a mining area based on improved spectral indices and Catboost. Sensors 2024, 24, 1492. [Google Scholar] [CrossRef]
  60. Ahmad, M.W.; Reynolds, J.; Rezgui, Y. Predictive modelling for solar thermal energy systems: A comparison of support vector regression, random forest, extra trees and regression trees. J. Clean. Prod. 2018, 203, 810–821. [Google Scholar]
  61. Geurts, P.; Ernst, D.; Wehenkel, L. Extremely randomized trees. Mach. Learn. 2006, 63, 3–42. [Google Scholar] [CrossRef]
  62. Nti, I.K.; Nyarko-Boateng, O.; Aning, J. Performance of machine learning algorithms with different K values in K-fold CrossValidation. Int. J. Inf. Technol. Comput. Sci. 2021, 13, 61–71. [Google Scholar]
  63. Henderi, H.; Wahyuningsih, T.; Rahwanto, E. Comparison of Min-Max normalization and Z-Score Normalization in the K-nearest neighbor (kNN) Algorithm to Test the Accuracy of Types of Breast Cancer. Int. J. Inform. Inf. Syst. 2021, 4, 13–20. [Google Scholar] [CrossRef]
  64. Samson, M. Mineral Resource Estimates with Machine Learning and Geostatistics. Master’ Thesis, University of Alberta, Edmonton, AB, USA, 2019. [Google Scholar]
Figure 1. General workflow of kriging.
Figure 2. General workflow of mineral resource estimation with ML.
Figure 3. (a) Plan and (b) oblique views of drillhole traces (the plus sign (+) marks drillhole collar locations).
Figure 4. Histograms of Cu grade: (a) all composites, (b) low-grade domain, (c) high-grade domain.
Figure 5. Boxplots of Cu grades (× marks the mean value; filled dots mark outliers).
Figure 6. Block model showing the low- and high-grade zones (red: high-grade zone; blue: low-grade zone).
Figure 7. Swath plots of the estimation results: (a) deposit, (b) low-grade domain, (c) high-grade domain. DSM: deposit single model, DseM: deposit separate model, LGM: low-grade model, HGM: high-grade model.
Table 1. Summary statistics of the Cu composites based on domains.

Statistic          | All Composite Data (ppm) | Composites in Low-Grade Domain (ppm) | Composites in High-Grade Domain (ppm)
Count              | 483    | 346    | 137
Minimum            | 1.2    | 1.2    | 8166
Average            | 5697   | 3199   | 12,006
Median             | 4020   | 3084   | 11,435
Maximum            | 21,125 | 10,250 | 21,125
Standard deviation | 4534   | 1897   | 2797
Skewness           | 0.989  | 0.557  | 1.114
Kurtosis           | 0.188  | −0.120 | 0.964
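Statistics of this kind are straightforward to reproduce with pandas. The sketch below uses a synthetic DataFrame in place of the real composite file; the column names cu_ppm and domain, and the synthetic grade distributions, are illustrative assumptions rather than the study's data.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
df = pd.DataFrame({
    "cu_ppm": np.concatenate([rng.lognormal(8.0, 0.5, 346),       # stand-in low-grade values
                              rng.normal(12000.0, 2800.0, 137)]),  # stand-in high-grade values
    "domain": ["low"] * 346 + ["high"] * 137,
})

def summarize(s: pd.Series) -> pd.Series:
    """The statistics reported in Table 1 for one group of composites."""
    return pd.Series({"count": s.count(), "min": s.min(), "mean": s.mean(),
                      "median": s.median(), "max": s.max(), "std": s.std(),
                      "skewness": s.skew(), "kurtosis": s.kurt()})

print(summarize(df["cu_ppm"]))                          # all composites
print(df.groupby("domain")["cu_ppm"].apply(summarize))  # per domain
```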
Table 2. Parameters that are tuned to obtain acceptable estimation results in XGBoost.

Hyper-Parameter      | Grid Search Range            | Explanations
Eta                  | 0.01, 0.05, 0.1, 0.3, 0.5, 1 | Step size shrinkage used to prevent overfitting
Max depth            | 3, 6, 8, 10, 12, 15          | Maximum depth of a tree
Minimum child weight | 1, 3, 5, 8, 10               | Minimum sum of instance weight (hessian) needed in a child
Evaluation metric    | RMSE                         | Evaluation metric for validation data
Objective            | Squared error                | Learning objective
Subsample            | 0.1, 0.3, 0.5, 0.6, 0.8, 1   | Subsample ratio of the training instances
Table 3. Tuned XGBoost parameters.

Hyper-Parameter      | Deposit Single Model | Low-Grade Domain Model | High-Grade Domain Model
Eta                  | 1                    | 0.1                    | 0.05
Max depth            | 6                    | 15                     | 6
Minimum child weight | 1                    | 1                      | 8
Subsample            | 0.8                  | 0.6                    | 0.1
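For readers who want to reproduce this kind of tuning, a minimal sketch is shown below, wiring the Table 2 ranges into scikit-learn's GridSearchCV around xgboost's scikit-learn wrapper (xgboost ≥ 1.6 accepts eval_metric in the constructor). The synthetic coordinates and grades and the 5-fold cross-validation are assumptions for illustration; the study's actual training data and validation scheme are not reproduced here.

```python
import numpy as np
from sklearn.model_selection import GridSearchCV
from xgboost import XGBRegressor

param_grid = {
    "learning_rate": [0.01, 0.05, 0.1, 0.3, 0.5, 1],  # "eta" in Table 2
    "max_depth": [3, 6, 8, 10, 12, 15],
    "min_child_weight": [1, 3, 5, 8, 10],
    "subsample": [0.1, 0.3, 0.5, 0.6, 0.8, 1],
}

search = GridSearchCV(
    XGBRegressor(objective="reg:squarederror", eval_metric="rmse"),
    param_grid, scoring="neg_root_mean_squared_error", cv=5,
)

rng = np.random.default_rng(0)
X = rng.uniform(0, 1000, (300, 3))  # stand-in easting/northing/elevation
y = rng.uniform(100, 21000, 300)    # stand-in Cu grades (ppm)
search.fit(X, y)  # in the study, one search per domain plus one deposit-wide
print(search.best_params_)
```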
Table 4. Parameters tuned to obtain acceptable estimation results in SVR.

Hyper-Parameter | Grid Search Range                                                | Explanations
Poly_degree     | 1, 2, 3                                                          | Degree of polynomial
C               | 1000 linearly spaced points between 0.1 and 100                  | Regularization parameter
Degree          | 2, 3, 4                                                          | Degree of the polynomial kernel function
Epsilon         | 1000 linearly spaced points between 0 and 3                      | Margin of tolerance
Gamma           | 10 linearly spaced points between 0.25 and 1                     | Specific to RBF and polynomial kernels; defines the influence of a single training example
Coef0           | 1000 linearly spaced points between 0 and 1                      | Independent term in the kernel function for polynomial and sigmoid kernels
Kernel          | Linear, Polynomial (Poly), Radial Basis Function (RBF), Sigmoid  | Kernel type used by SVR
Table 5. Tuned SVR parameters used in estimation.

Hyper-Parameter | Deposit Single Model | Low-Grade Domain Model | High-Grade Domain Model
Kernel          | Poly                 | RBF                    | Poly
Poly degree     | 2                    | 3                      | 2
C               | 9.3                  | 75.1                   | 45.3
Coef0           | 0.566                | 0.881                  | 0.036
Degree          | 2                    | 4                      | 4
Epsilon         | 0.199                | 0.301                  | 0.331
Gamma           | 0.795                | 0.542                  | 0.452
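A comparable search for SVR can be sketched as follows. Because the Table 4 ranges include 1000-point grids, a randomized search is used here purely to keep the example tractable, and min-max scaling is applied inside a pipeline so that the scaler is refit on each training fold; the search strategy, the pipeline, and the synthetic data are illustrative assumptions rather than the study's exact procedure.

```python
import numpy as np
from sklearn.model_selection import RandomizedSearchCV
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import MinMaxScaler
from sklearn.svm import SVR

param_distributions = {
    "svr__kernel": ["linear", "poly", "rbf", "sigmoid"],
    "svr__C": np.linspace(0.1, 100, 1000),
    "svr__epsilon": np.linspace(0, 3, 1000),
    "svr__gamma": np.linspace(0.25, 1, 10),
    "svr__coef0": np.linspace(0, 1, 1000),
    "svr__degree": [2, 3, 4],
}

search = RandomizedSearchCV(
    make_pipeline(MinMaxScaler(), SVR()),
    param_distributions, n_iter=200,
    scoring="neg_root_mean_squared_error", cv=5, random_state=42,
)

rng = np.random.default_rng(1)
X = rng.uniform(0, 1000, (200, 3))  # stand-in drillhole coordinates
y = rng.uniform(100, 21000, 200)    # stand-in Cu grades (ppm)
search.fit(X, y)
print(search.best_params_)
```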
Table 6. Weights of base learners.

Base Learner  | Deposit Single-Model Weights | Low-Grade Domain Model Weights | High-Grade Domain Model Weights
KNeighbors    | 0.44                         | 0.34                           | 0.38
Random Forest | 0.17                         | 0.28                           | 0.25
CatBoost      | 0.19                         | 0.16                           | 0.27
Extra Trees   | 0.19                         | 0.22                           | 0.10
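A weighted average of base learners such as this can be expressed with scikit-learn's VotingRegressor, which returns the weighted mean of the individual predictions. The sketch below plugs in the deposit single-model weights from Table 6; the default hyper-parameters and the synthetic data are assumptions, and CatBoostRegressor requires the catboost package.

```python
import numpy as np
from catboost import CatBoostRegressor
from sklearn.ensemble import ExtraTreesRegressor, RandomForestRegressor, VotingRegressor
from sklearn.neighbors import KNeighborsRegressor

ensemble = VotingRegressor(
    estimators=[
        ("knn", KNeighborsRegressor()),
        ("rf", RandomForestRegressor(random_state=0)),
        ("cat", CatBoostRegressor(verbose=0, random_state=0)),
        ("et", ExtraTreesRegressor(random_state=0)),
    ],
    weights=[0.44, 0.17, 0.19, 0.19],  # deposit single-model weights, Table 6
)

rng = np.random.default_rng(2)
X = rng.uniform(0, 1000, (150, 3))  # stand-in coordinates
y = rng.uniform(100, 21000, 150)    # stand-in Cu grades (ppm)
ensemble.fit(X, y)                  # prediction = weighted mean of the four learners
print(ensemble.predict(X[:3]))
```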
Table 7. Summary statistics of estimations *.

ML Method | Statistic | Deposit Single Model | Low-Grade Domain in Single Model | High-Grade Domain in Single Model | Deposit Separate Models | Low-Grade Domain Model | High-Grade Domain Model
XGBoost   | Minimum   | 12     | 12     | 642    | 1698   | 1698  | 8908
XGBoost   | Average   | 5889   | 4580   | 9110   | 5515   | 3078  | 11,518
XGBoost   | Median    | 4622   | 3836   | 9733   | 3592   | 3032  | 11,523
XGBoost   | Maximum   | 20,239 | 20,103 | 20,239 | 17,538 | 4600  | 17,538
XGBoost   | Std.      | 3676   | 2932   | 3320   | 3953   | 753   | 1436
XGBoost   | Count     | 1697   | 1207   | 490    | 1697   | 1207  | 490
SVR       | Minimum   | 359    | 359    | 2305   | 251    | 251   | 8173
SVR       | Average   | 5390   | 4408   | 7807   | 5710   | 3038  | 12,294
SVR       | Median    | 4816   | 4061   | 7852   | 3726   | 3177  | 12,087
SVR       | Maximum   | 15,137 | 14,128 | 15,137 | 21,110 | 6617  | 21,110
SVR       | Std.      | 2798   | 2078   | 2868   | 4493   | 1282  | 2210
SVR       | Count     | 1697   | 1207   | 490    | 1697   | 1206  | 490
Ensemble  | Minimum   | 896    | 996    | 896    | 410    | 410   | 9319
Ensemble  | Average   | 5657   | 4504   | 8495   | 5536   | 3094  | 11,552
Ensemble  | Median    | 4480   | 3946   | 9087   | 3673   | 3112  | 11,561
Ensemble  | Maximum   | 16,094 | 15,518 | 16,094 | 15,829 | 6370  | 15,829
Ensemble  | Std.      | 2996   | 2103   | 2976   | 3976   | 1107  | 928
Ensemble  | Count     | 1697   | 1207   | 490    | 1697   | 1207  | 490

* All statistics are given in Cu ppm.
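One diagnostic discussed in the paper is the share of blocks whose estimate falls outside the composite grade range of its own domain (roughly 30% in the high-grade domain for the single model). A minimal sketch of that check is given below, assuming a hypothetical block table with domain and cu_est columns and using the composite ranges from Table 1.

```python
import pandas as pd

# Composite grade ranges per domain, taken from Table 1 (min-max, Cu ppm).
domain_range = {"low": (1.2, 10250), "high": (8166, 21125)}

def out_of_range_fraction(blocks: pd.DataFrame) -> pd.Series:
    """Fraction of blocks per domain whose estimate leaves the composite range."""
    def frac(g: pd.DataFrame) -> float:
        lo, hi = domain_range[g.name]
        return float(((g["cu_est"] < lo) | (g["cu_est"] > hi)).mean())
    return blocks.groupby("domain").apply(frac)

blocks = pd.DataFrame({"domain": ["low", "low", "high", "high"],
                       "cu_est": [3000, 12000, 9000, 4600]})
print(out_of_range_fraction(blocks))  # low: 0.5, high: 0.5
```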
Table 8. Cumulative deviations from the composites in Y direction.

Zone              | Single Model XGBoost | Single Model SVR | Single Model Ensemble | Domain-Based XGBoost | Domain-Based SVR | Domain-Based Ensemble
Deposit           | 3893                 | 8202             | 5908                  | 7408                 | 5530             | 7124
Low-grade domain  | −16,431              | −14,720          | −14,762               | 385                  | 193              | 3
High-grade domain | 38,173               | 49,784           | 43,224                | 6429                 | −292             | 6117
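These figures are differences between block estimates and composites accumulated over swaths along the Y axis. A minimal sketch of such a computation is shown below; the column names and the 25 m swath width are illustrative assumptions, not the study's exact swath definition.

```python
import pandas as pd

def cumulative_swath_deviation(blocks: pd.DataFrame,
                               composites: pd.DataFrame,
                               swath: float = 25.0) -> float:
    """Sum over Y swaths of (mean block estimate - mean composite grade)."""
    est = blocks.groupby(blocks["y"] // swath)["cu_est"].mean()
    ref = composites.groupby(composites["y"] // swath)["cu_ppm"].mean()
    common = est.index.intersection(ref.index)  # swaths sampled by both
    return float((est[common] - ref[common]).sum())

blocks = pd.DataFrame({"y": [10, 30, 60], "cu_est": [3000, 3500, 9000]})
comps = pd.DataFrame({"y": [12, 33, 61], "cu_ppm": [3100, 3300, 9500]})
print(cumulative_swath_deviation(blocks, comps))  # (-100) + 200 + (-500) = -400
```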