Article

Geochemical Modeling of Copper Mineralization Using Geostatistical and Machine Learning Algorithms in the Sahlabad Area, Iran

1
Department of Mining Engineering, Amirkabir University of Technology (Tehran Polytechnic), Tehran 1591634311, Iran
2
Institute of Oceanography and Environment (INOS), University Malaysia Terengganu (UMT), Kuala Nerus 21030, Terengganu, Malaysia
*
Authors to whom correspondence should be addressed.
Minerals 2023, 13(9), 1133; https://doi.org/10.3390/min13091133
Submission received: 28 June 2023 / Revised: 22 August 2023 / Accepted: 24 August 2023 / Published: 27 August 2023

Abstract

Analyzing geochemical data from stream sediment samples is one of the most proactive tools in the geochemical modeling of ore mineralization and mineral exploration. The main purpose of this study is to develop a geochemical model for prospecting copper mineralization anomalies in the Sahlabad area, South Khorasan province, East Iran. In this investigation, 709 stream sediment samples were analyzed using inductively coupled plasma mass spectrometry (ICP-MS), and geostatistical and machine learning techniques. Subsequently, hierarchical analysis (HA), Spearman’s rank correlation coefficient, concentration–area (C–A) fractal analysis, Kriging interpolation, and descriptive statistics studies were performed on the geochemical dataset. Machine learning algorithms, namely K-means clustering, factor analysis (FA), and linear discriminant analysis (LDA) were employed to deliver a comprehensive geochemical model of copper mineralization in the study area. The identification of trace elements and the predictor composition of copper mineralization, the separation of copper geochemical communities, and the investigation of the geochemical behavior of copper vs. its trace elements were targeted and accomplished. As a result, the elements Ag, Mo, Pb, Zn, and Sn were distinguished as trace elements and predictors of copper geochemical modeling in the study area. Additionally, geochemical anomalies of copper mineralization were identified based on trace elements. Conclusively, the nonlinear behavior of the copper element versus its trace elements was modeled. This study demonstrates that the integration and synchronous use of geostatistical and machine learning methods can specifically deliver a comprehensive geochemical modeling of ore mineralization for prospecting mineral anomalies in metallogenic provinces around the globe.

1. Introduction

At the reconnaissance stage of mineral exploration, the location of potential mineral deposits (metallic or non-metallic) or prospecting areas must be exclusively determined [1,2,3,4]. The importance of the reconnaissance stage is due to the identification and separation of potential and barren areas in terms of ore mineralization; the potential areas will be lost or not identified if there is any error or lack of accuracy during this stage [5,6]. Geochemical modeling of stream sediment samples using methods such as multifractal analysis, clustering, principal component analysis (PCA), factor analysis, and artificial neural networks (ANN) is a valuable tool for mineral exploration [7,8,9,10,11]. Each method has advantages and disadvantages for geochemical modeling of stream sediment samples [12,13,14]. The advantages of using these methods are that they are simple in predicting the nonlinear geochemical behavior of the elements relative to each other and detecting the limits of the geochemical anomaly threshold. Perhaps the main disadvantage of the above methods is that the results are directly related to the sampled values and the smallest error in the raw data can greatly affect the output results. Accordingly, an integration and simultaneous use of the analyzing methods for geochemical modeling can diminish the percentage of errors and intensify the accuracy of results for mineral exploration [15,16,17].
Numerous studies have used fractal and correlation analysis and machine learning methods separately to identify potential mineralization zones [5,9,18,19,20,21,22,23,24,25]. However, only a few studies have used integrated and hybrid methods for the geochemical modeling of ore mineralization [26,27,28,29]. Geochemical studies based on correlation analysis can only determine the relationship between two or more elements and commonly provide the correlation coefficients [30,31]. The relationship between two elements is not necessarily a simple direct or inverse one; in most cases their geochemical behavior is nonlinear and must be described by a second- or third-degree curve. Clustering methods (e.g., k-means clustering) are effective tools for detecting nonlinear geochemical relationships between elements [32,33].
Machine learning methods have various applications in geochemical modeling, such as detecting trace elements, clustering of elements, anomaly separation, the combination of layers [34,35], handling large data volumes, and discovering specific trends and patterns [36]. However, there are some disadvantages to using machine learning methods, including the need for comprehensive data, long training times that require a powerful computer, and the difficulty of interpreting complex results [37,38]. Previous studies have examined machine learning methods for geochemical modeling of ore mineralization in metallogenic provinces [39,40,41]. It is noteworthy that combining machine learning methods with geostatistical methods can improve the geochemical modeling results [9]. For instance, it is essential to determine statistical communities when applying the linear discriminant analysis (LDA) algorithm [42,43]. Threshold limits of each community can be calculated with various methods, such as k-means clustering or concentration–area (C–A) fractal analysis. The way these communities are separated can affect how the elements are classified by the methods used. Integration of different methods (e.g., machine learning and geostatistical techniques) can reduce the weaknesses of the individual methods and reinforce their advantages for accurate geochemical modeling of ore mineralization [44,45,46,47,48,49,50].
The main purpose of this study is to develop an approach for comprehensive geochemical modeling of copper mineralization by integrating geostatistical methods (based on correlation analysis) and machine learning techniques for prospecting copper anomaly zones in the Sahlabad area, Birjand, East Iran. The Sahlabad region is located in the Sistan structural zone of Eastern Iran (Figure 1A,B) [50], which has prodigious potential for copper, gold, magnesite, chromium, and iron deposits [51]. In the Sahlabad area, there are three active copper mines, namely Mesgaran, Chah-Rasteh (Kooh Kheiri) and Zahri, two old inactive mines (i.e., Kasrab and Cheshmeh-Zangi), and nine copper mineralization occurrences [52,53]. The mineralization type of the copper deposits is massive sulfide. Only a few geological studies have been documented for the Sahlabad area [54,55,56,57]; however, no comprehensive geochemical modeling for copper mineralization has yet been accomplished or reported in this area. The main objectives of this study are (1) to identify trace elements and the predictor composition of copper mineralization in the study area; (2) to separate copper geochemical communities; (3) to analyze geochemical behavior of copper element versus its trace elements. Therefore, geostatistical and machine learning algorithms are used to separate geochemical communities, analyze the geochemical behavior of elements, and cluster stream sediment data in the Sahlabad area. Consequently, this investigation makes a key contribution to develop an integrated geochemical model for copper mineralization using hierarchical analysis, factor analysis, k-means clustering, concentration–area (C–A) fractal analysis, linear discriminant analysis, and correlation analysis.

2. Geology of the Study Area

The Sahlabad area is located in the east of Iran (Birjand, South Khorasan province). It is positioned between longitudes 59°30′ to 60° and latitudes 32° to 32°30′ (Figure 1A,B). The study area is entirely located in the flysch belt and ophiolite melange of eastern Iran in the Sistan structural zone [51]. This structural zone is situated between the Nehbandan (in the west) and the Harirod fault (in the east), which is 800 km long and 200 km wide [52,53]. This zone has undergone evolutionary stages from oceanic crust to continental crust and is one of the derivations of the “young Tethys” type [53,54,55]. In this area, igneous, metamorphic, and sedimentary lithological units are exposed from the late Cretaceous to the Neogene [56]. The geological formations in the Sahlabad area include rocks with the characteristics of the flysch belt and ophiolite melange belt, which are attributed to the upper Cretaceous and lower Tertiary, and the volcanic cover and younger Tertiary sediments [54].

Copper Deposits and Ore Mineralization

Copper, gold, nickel, chromium and magnesite mineralization and some old mining activity and excavations are documented in different lithological units (e.g., ultrabasic, intermediate and acidic rock units, metamorphic rocks, listwanites, etc.) of the study area [55,56,57,58,59,60]. The spatial location of copper mines, deposits, and indices are shown in the geology map of the study region (Figure 1A). Malachite, chalcopyrite, and chalcocite were documented in the copper mineralization zones [61,62,63]. Characteristics of copper mineralization zones in the study area are shown in Table 1.

3. Materials and Methods

3.1. Geochemical Sampling

Geochemical sampling in the Sahlabad area was carried out using the stream sediment sampling method. The initial design of the sampling points was mainly based on determining the center of gravity of the streams. For this purpose, a map of the drainage systems of the study area was generated using topographic maps and aerial images. Stream sediment samples (709 samples) with a particle size of −40 mesh were collected from the study area. The design and collection of stream sediment samples in the Sahlabad area were performed by the Geological Survey of Iran (GSI) [58]. Figure 2 shows the location of stream sediment samples in the Sahlabad area.
Inductively coupled plasma mass spectrometry (ICP-MS) [64] was used to analyze the geochemical samples collected from the study area. The elements Zn, Cr, Ti, Mn, Sr, Ba, Au, As, Sb, Bi, Hg, W, Pb, Ni, Mo, Sn, Ag, Co, Fe and Cu were selected for this analysis. In geochemical analysis, laboratory error is one of the main components of the overall error in exploration operations, and quantifying it is necessary to establish the precision of the analysis. In regional-scale geochemical projects, the goal is to measure the values of each element relative to one another in order to find promising areas, so the precision of the measurements is more important than their accuracy. For this reason, the precision of the operation was investigated by repeated analysis of geochemical samples: for every 10 to 15 geochemical samples, a duplicate sample (about 10% of the total samples) was analyzed. In this project, 79 samples were selected randomly in the study area. Figure 3 shows a diagram of the relative error rate for different elements. It shows that Au, W, Cr and As have high relative error rates.

3.2. Methodology and Data Processing

Geochemical data processing to evaluate relationships between elements, determine the threshold limits of geochemical communities, and identify the geochemical predictor composition of Cu mineralization was carried out using geostatistical and machine learning methods in this study. In doing so, an integrated geochemical model, including hierarchical analysis, factor analysis, k-means clustering, fractal analysis, linear discriminant analysis, and correlation analysis, was used to optimize geochemical analysis to identify copper anomalies in the Sahlabad area. To use raw data in statistical methods, censored and outlier data need to be identified and replaced. In the study area, censored data were unavailable. Outlier data were identified using the boxplot method [59,60] and corrected with the largest and smallest numbers. Typically, the research path with the mentioned approaches follows three main objectives, which are actually different dimensions of the geochemical model.
The first objective is to determine the geochemical predictor composition of Cu mineralization in the study area. For this purpose, hierarchical analysis was first applied to the geochemical data to identify elemental groups based on the Pearson correlation coefficient. The factor analysis method was then used to analyze the nature of the geochemical data and the obtained results. In this method, the data are evaluated in terms of principal components and variance justification. Thus, the geochemical predictor composition of Cu mineralization was determined and confirmed. After that, Spearman’s rank correlation coefficient analysis was used to quantify the relationships between the elements of the predictor composition.
The second objective is to separate the Cu geochemical communities based on the geochemical data of stream sediments. This approach was studied using the k-means clustering method as an innovative technique for this purpose. Threshold limits of background, anomaly, and enrichment communities were obtained through the results of k-means clustering analysis. In order to validate the results, the data were processed again using concentration–area fractal analysis.
The third objective is to investigate the behavior of Cu element versus its geochemical tracer elements. At first, a clustering analysis of predictor composition elements of Cu mineralization (Cu geochemical tracer elements) was performed using Linear Discriminant Analysis (LDA). The goal was to evaluate the clusters within the group. LDA analysis determines the clusters using geochemical threshold limits of the target element (Cu). As a consequence, the genetic relationship of the tracer elements versus the Cu can be investigated. Subsequently, the k-means method was implemented on the data with the aim of quantifying the behavior of geochemical tracer elements. The mechanism of this method is that it first divides the data into clusters based on the Euclidean distance, and the optimal number of clusters in this research was determined based on the utility function (S(i)). Then, with the regression method, the centers of the optimal clusters are trended, and in this way, the trend of changes in the concentration of different elements versus the Cu is evaluated.
Figure 4 shows an overview of the methodological flowchart in the study area. Additionally, in the following subsections, calculation principles and resources for further studies are presented.

3.2.1. Correlation Coefficients Method

In data mining and statistics, “hierarchical clustering” is a method that categorizes and groups observations and data in a hierarchical manner. The point that sets this method apart from other clustering methods is the top-down (or bottom-up) order that exists in this technique [61]. In this method, unlike other clustering methods, each observation may be placed in more than one cluster because clusters are formed based on different levels of distance; therefore, each cluster may be a subset of another cluster at a distance [62]. The clustering aims to classify the data into similar groups to identify elements with similar genetic characteristics in the area [63,64]. After identifying the copper element family, the correlation coefficient method was used for further study. Spearman’s rank method was used to understand the effective processes in the formation of deposits and determine the correlation coefficients in the geochemical family of copper element. This method was based on the degree of the dependence of two variables measured in a set of individual data and the dispersion of different elements in the rock units of a deposit.
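As a rough illustration of the correlation step, the sketch below computes Spearman's rank correlation coefficient as the Pearson correlation of average ranks. It is not the authors' implementation; the function names and the sample concentrations are hypothetical and chosen only to show the calculation.

```python
def ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with the value at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = mean_rank
        i = j + 1
    return result


def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def spearman(x, y):
    """Spearman's rho: Pearson correlation applied to the ranks."""
    return pearson(ranks(x), ranks(y))


# Hypothetical Cu and Mo concentrations (ppm) for five samples.
cu = [12.0, 18.5, 25.1, 40.2, 33.3]
mo = [0.6, 0.8, 1.1, 1.9, 1.5]
rho = spearman(cu, mo)  # both series have the same ordering, so rho is 1
```

Because the coefficient depends only on ranks, it captures any monotonic relationship between two elements, not just a linear one.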

3.2.2. Factor Analysis (FA) Method

Factor analysis (FA) is a multivariate statistical method that establishes a special relationship between a large set of seemingly unrelated variables under a hypothetical model. This method was used to reduce the size of the data in the first step and then identify the area’s main factors of copper mineralization [65]. In the FA method, a large number of variables are expressed in terms of a small number of dimensions or structures, which is called a factor [66]. Directions with maximum variability were identified using eigenvalues and eigenvectors.
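The role of eigenvalues in identifying the direction of maximum variability can be sketched with a simple power iteration on a correlation matrix. This is only an illustrative stand-in for the extraction step of a full factor analysis: the 3 × 3 matrix below is hypothetical, and the function name `dominant_eigen` is an assumption, not part of any package used in the study.

```python
def dominant_eigen(matrix, iterations=500):
    """Approximate the largest eigenvalue and its eigenvector by power iteration."""
    n = len(matrix)
    v = [1.0 + 0.1 * i for i in range(n)]  # arbitrary non-zero starting vector
    for _ in range(iterations):
        w = [sum(matrix[i][j] * v[j] for j in range(n)) for i in range(n)]
        norm = sum(c * c for c in w) ** 0.5
        v = [c / norm for c in w]
    mv = [sum(matrix[i][j] * v[j] for j in range(n)) for i in range(n)]
    eigenvalue = sum(a * b for a, b in zip(v, mv))  # Rayleigh quotient, |v| = 1
    return eigenvalue, v


# Hypothetical correlation matrix of three elements with pairwise r = 0.6.
corr = [
    [1.0, 0.6, 0.6],
    [0.6, 1.0, 0.6],
    [0.6, 0.6, 1.0],
]
eigenvalue, direction = dominant_eigen(corr)

# For a correlation matrix the total variance equals the number of variables,
# so the share of variance carried by the first component is eigenvalue / n.
variance_share = eigenvalue / len(corr)
```

For this matrix the exact largest eigenvalue is 1 + 2 × 0.6 = 2.2, so the first component accounts for about 73% of the total variance.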

3.2.3. K-Means Clustering Method

The K-means clustering method was used to group the data into clusters based on common characteristics [67]. Instead of simultaneously examining large numbers of data, clusters and their centers with specific features were used in the analysis. The K-means algorithm is presented as follows in five steps [68]:
(1) The n members are divided into k clusters. The number K is selected randomly.
(2) Equation (1) calculates the vector z_j, the center of each class C_j:
$$z_j = \frac{\sum_{x \in C_j} x}{\# C_j}, \quad j = 1, \dots, k \qquad (1)$$
(3) The center of each class is recalculated with Equation (1) while the algorithm is running. Here x denotes the vector of each member of C_j, and #C_j is the number of members of class C_j [69].
(4) The objective function is calculated based on Equation (2), which gives the total squared distance of each member from the center of its class:
$$f(C_1, C_2, \dots, C_k) = \sum_{j=1}^{k} \sum_{x \in C_j} \lVert x - z_j \rVert^2 \qquad (2)$$
(5) The objective function is minimized, and the number of optimal classes (K) is determined based on the minimum objective function.
Shirazy et al. [70,71] presented software to speed up the execution of the above algorithm. In the first step, the primary purpose of this method was to cluster the copper data and identify groups with common characteristics in the study area. In other words, the geochemical thresholds of copper were calculated for the background, anomaly, and enrichment communities. In the next step, the K-means method was used to investigate the geochemical behavior of the copper family. The previous methods identify the geochemical family of copper, whereas the K-means method shows how the members of this family behave with respect to copper.
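The five steps above can be sketched for a single element as follows. This is a minimal illustration rather than the software cited in the text: it starts from evenly spaced order statistics instead of a random partition (an assumption made so the result is reproducible), and the copper values are hypothetical.

```python
def kmeans_1d(values, k, iterations=100):
    """Cluster scalar concentrations into k classes; return centers, classes, objective."""
    data = sorted(values)
    # Assumption: deterministic initial centers taken from spread-out order statistics.
    centers = [data[(2 * j + 1) * len(data) // (2 * k)] for j in range(k)]
    clusters = [[] for _ in range(k)]
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for x in data:
            # Assign each member to the class with the nearest center.
            nearest = min(range(k), key=lambda j: (x - centers[j]) ** 2)
            clusters[nearest].append(x)
        # Class center = mean of its members (an empty class keeps its old center).
        new_centers = [
            sum(c) / len(c) if c else centers[j] for j, c in enumerate(clusters)
        ]
        if new_centers == centers:
            break
        centers = new_centers
    # Objective function: total squared distance of members from their class center.
    objective = sum(
        (x - centers[j]) ** 2 for j, c in enumerate(clusters) for x in c
    )
    return centers, clusters, objective


# Hypothetical Cu concentrations (ppm) with three visibly separate populations.
cu = [10, 12, 11, 13, 30, 32, 31, 60, 62, 61]
centers, clusters, objective = kmeans_1d(cu, k=3)

# The range of each class can serve as a simple estimate of community thresholds.
ranges = [(min(c), max(c)) for c in clusters]
```

With three classes ordered by concentration, the class ranges play the role of the background, anomaly, and enrichment communities described in the text.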

3.2.4. Concentration–Area (C–A) Fractal Method

Unusual phenomena with irregular shapes do not follow the principles of Euclid’s geometry. The geometry used to describe these phenomena is called fractal geometry. Fractal methods have many applications in surface geological and geochemical studies due to the geometric shape of the anomalies, the spatial distribution of the data, and the use of all data in the computational process. Many researchers have studied the use of fractal methods in Earth sciences. Shirazy et al. [70] proposed various methods of using this geometry to separate geochemical communities. An equation is presented below for the concentration of materials or fractal properties:
$$A(\nu) \propto \nu^{-\alpha} \qquad (3)$$
where A(ν) is the cumulative area enclosed by concentration level lines whose corresponding concentration is greater than or equal to ν. The value of α represents the fractal dimension of the different amplitudes [72,73]. The purpose of using the C–A fractal method in geochemical modeling was to determine the fractal dimension of copper concentration data in stream sediment samples. This method identifies the various geochemical communities of copper concentration in the area. It also confirms the results of K-means clustering.
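A minimal sketch of the concentration–area calculation is given below, assuming the interpolated map is available as a list of grid-cell values with equal cell area. It counts the area at or above each threshold and estimates the exponent as the least-squares slope on a log–log plot, taking the power-law form A(ν) ∝ ν^(−α). The function names and test values are hypothetical; detecting the break points between communities is left out.

```python
import math


def concentration_area(grid_values, thresholds, cell_area=1.0):
    """Return (nu, A(nu)) pairs, where A(nu) is the area of cells with value >= nu."""
    return [
        (nu, cell_area * sum(1 for v in grid_values if v >= nu))
        for nu in thresholds
    ]


def loglog_slope(pairs):
    """Least-squares slope of log10(A) versus log10(nu); equals -alpha for a power law."""
    usable = [(nu, area) for nu, area in pairs if nu > 0 and area > 0]
    xs = [math.log10(nu) for nu, _ in usable]
    ys = [math.log10(area) for _, area in usable]
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    return sxy / sxx


# Synthetic check: areas that follow A = 1000 * nu**-2 exactly give a slope of -2.
synthetic = [(nu, 1000.0 * nu ** -2) for nu in (1.0, 2.0, 4.0, 8.0)]
alpha = -loglog_slope(synthetic)
```

In practice the slope would be fitted separately on each straight segment of the log–log plot, and the concentrations at which the slope changes would be read as the thresholds between communities.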

3.2.5. Linear Discriminant Analysis (LDA) Method

Linear discriminant analysis is a statistical method of data analysis used to measure the relationship between a parameter and known communities. In this method, the target communities need to be known in advance or at least defined [74,75]. Therefore, the results of K-means clustering and C–A fractal methods were used as one of the inputs of this method [74]. The purpose of using the LDA method was to investigate the intragroup behavior of geochemical elements in the copper family. In other words, the genetic relationship of elements identified as group elements with copper was investigated using this method. It was determined which elements of the previously identified geochemical family are genetically related to the copper element.

4. Analysis and Results

4.1. Preparation and Descriptive Statistics Analysis of Raw Data

Outlier data were identified using the boxplot method for analysis of the geochemical stream sediment data. The data were corrected using the replacement method with the largest and smallest values. The boxplot and histogram of the copper element as an example in the raw and corrected data are shown in Figure 5 and Figure 6, respectively.
In order to correct the outlier values, the identified outliers were removed first and then the largest and smallest values were determined. The outlier datapoint whose value was greater than the largest value was replaced with it, and if the outlier datapoint was less than the smallest value, it was replaced with it. Table 2 presents the statistical characteristics of each element. Mahboob et al. [31] and Shirazy et al. [32] state that the data neither change continuously nor change in frequency in geochemical exploration. It is better to divide the total data into classes or categories with equal intervals and then draw their distribution curve. According to the resultant curve, the statistical population distribution can be identified [75,76].
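The replacement procedure described above can be sketched as follows. The text does not state the whisker length of the boxplot, so the conventional 1.5 × IQR fences are an assumption here, as are the function name and the sample values.

```python
import statistics


def replace_outliers(values, whisker=1.5):
    """Replace boxplot outliers with the largest/smallest non-outlier values."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    low_fence = q1 - whisker * iqr   # assumed fence: Q1 - 1.5 * IQR
    high_fence = q3 + whisker * iqr  # assumed fence: Q3 + 1.5 * IQR
    # Remove outliers first, then find the extreme values that remain.
    inliers = [v for v in values if low_fence <= v <= high_fence]
    smallest, largest = min(inliers), max(inliers)
    # High outliers take the largest remaining value, low outliers the smallest.
    return [min(max(v, smallest), largest) for v in values]


# Hypothetical Cu values (ppm) with one high outlier.
raw = [10, 12, 11, 13, 12, 14, 11, 95]
corrected = replace_outliers(raw)
```

Here the value 95 lies above the upper fence and is replaced by 14, the largest value that is not an outlier; all other values are unchanged.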
In this study, a histogram of the refined data (after replacement of outlier data) was drawn, and a bell curve was fitted (see Figure 5 and Figure 6). The normal distribution of the statistical population is a prerequisite for using data in many statistical methods. According to the statistical characteristics listed in Table 2, the skewness and kurtosis tests were used to determine the frequency distribution of elements. In general, the normal bell distribution has zero skewness and kurtosis, so the closer these values are to zero, the closer the distribution is to normal [77]. In practice, if the kurtosis and skewness of the data range from −2 to 2, the data distribution is accepted as normal and can be used in statistical methods [78,79,80,81,82,83,84]. In the study area, the skewness and kurtosis coefficients for the data range from −1 to +1. Accordingly, the frequency distribution of data can be considered normal. It should be noted that the raw data of most elements, including outlier data, show a non-normal distribution, and after correcting outlier data, the distribution returned to normal.
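The normality screening described here can be illustrated with the moment-based skewness and excess kurtosis, both of which are zero for a normal distribution. The sketch uses the simple population formulas; statistical packages often apply small-sample corrections, so their values may differ slightly. The function names and data are hypothetical.

```python
def skewness_and_excess_kurtosis(values):
    """Moment-based skewness and excess kurtosis (population formulas)."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    m4 = sum((v - mean) ** 4 for v in values) / n
    return m3 / m2 ** 1.5, m4 / m2 ** 2 - 3.0


def within_normal_range(values, limit=2.0):
    """Apply the -2 to +2 acceptance range to both coefficients."""
    skew, kurt = skewness_and_excess_kurtosis(values)
    return abs(skew) <= limit and abs(kurt) <= limit


symmetric = [1.0, 2.0, 3.0, 4.0, 5.0]
with_outlier = [1.0] * 20 + [100.0]

ok_symmetric = within_normal_range(symmetric)      # skewness 0, inside the range
ok_outlier = within_normal_range(with_outlier)     # strongly skewed, outside the range
```

The second data set shows how a single extreme value can push the skewness well beyond the accepted range, which is consistent with the observation that the raw data only approach normality after the outliers are corrected.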

4.2. Determining the Predictor Composition of Cu Mineralization

4.2.1. Hierarchical Clustering and Correlation Coefficients Analysis

Figure 7 shows the hierarchical clustering diagram, in which the elements Ag, Zn, Pb, Mo, and Sn group together with Cu. According to the geochemical data, these five elements are genetically related to copper and are effective in identifying one another. For further analysis, the correlation coefficients were calculated for this cluster. The correlation coefficient quantifies the strength of the relationship between two variables in a correlation analysis. The study of correlation coefficients is one of the most critical steps in geochemical studies because it can be used to understand the environment and processes influential in forming deposits [85,86,87]. As shown in Table 3, the results indicate a direct relationship between Ag, Zn, Pb, Mo, Sn, and Cu.

4.2.2. FA Results

The purpose of using the FA method, in addition to confirming the results of the hierarchical analysis, was to calculate the scores related to the main component of Cu mineralization. To carry out the FA method, the validity of the data was evaluated using the Kaiser–Meyer–Olkin (KMO) index and performed on geochemical data of stream sediment samples. The KMO index for the data is 0.885. It means that these data have high validity and adequacy for the FA. Table 4 shows the FA results of geochemical data in the Sahlabad area.
As shown in Table 4, the percentage of variance justification is presented by different components. From Figure 8, four factors (the fracture point of the diagram) were considered according to the Scree plot and eigenvalues related to each factor. The results show that the variance justification in the four components is more than 65%, which is a significant amount. Table 5 presents the matrix results of four factors from the factor analysis. From the results presented in Table 4, the elements were identified in relation to the highest scores of the copper element.
The appropriate component was factor number one, which included the elements of Ag, Cu, Mo, Pb, Zn, and Sn. Based on the FA method, metallic and quasi-metallic elements, presented in the linear composition of the first component with an appropriate variance justification, can be considered predictors of copper mineralization in the study area. The FA results confirmed the hierarchical analysis results, which indicated the proper performance of methods and data quality.
To delineate the potential areas indicated by the elements of the first factor, a map obtained from the matrix of this factor was produced using the kriging interpolation method (Figure 9). Because topography plays an essential role in controlling stream sediment anomalies, a three-dimensional model of potential areas based on the first component matrix of the FA, draped over Sahlabad’s topography, was also presented in Figure 9. In Figure 10, the earthy brown color indicates the depletion of the main factor, and the red color designates the enrichment. Since the geochemical data are stream sediment samples, the area’s topography controls this enrichment and depletion.

4.3. Copper Geochemical Communities Separation

4.3.1. K-Means Clustering Results

Using the k-means clustering method, the copper element in the geochemical data was divided into three clusters. Table 6 presents the results of this clustering, separated by the different specifications. Additionally, Figure 11 shows the map of copper element clustering. As shown in Figure 11, the three groups are separated by blue, green, and red colors, representing the background, anomaly, and enrichment of geochemical communities, respectively.

4.3.2. Concentration–Area (C–A) Fractal Results

Figure 12 shows a map resulting from the Kriging interpolation method. In this regard, a concentration–area table was prepared using map networking and area calculation, including different concentrations of the copper element. The threshold range of each statistical community was calculated by plotting the logarithm of concentration and the area containing the copper element, based on the breaking points of the diagram. According to the C–A fractal diagram presented in Figure 13, the statistical community of copper element concentration data in stream sediment samples is divided into three groups. The concentration threshold of these groups is shown in Table 7. It can be concluded that the results of K-means clustering and C–A fractal methods confirm each other with a good accuracy.

4.4. Geochemical Behavior of Copper Tracer Elements

4.4.1. Linear Discriminant Analysis (LDA) Method

The FA results showed that Ag, Cu, Mo, Pb, Zn, and Sn in the first factor have the highest variance justification. Since the target was copper and its behavior relative to other elements, these six elements were used in the LDA method. The communities separated by the K-means clustering and C–A fractal methods were used as the inputs of the LDA method. The number of canonical discriminant functions is less than the number of groups considered for the dependent quantity (i.e., three standard groups for the copper element). According to Figure 14, the data related to the elements are separated along function one with the least possible overlap. According to the data processed by the LDA method (Table 8), at least one sample in every 110 analyzed samples is lost and excluded from the analysis. Therefore, out of a total of 709 samples, 701 samples were analyzed. Figure 15 shows the histogram of discriminant scores calculated based on the discriminant function for the data used in each discriminated group.
Figure 16 presents the results of the linear discriminant analysis performed on the data for grouping the elements. It was observed that the silver element is not inherently correlated with other elements and is in an independent group. Although silver correlates well with other elements, it does not play a role in tracing the copper. The elements copper, molybdenum, tin, lead, and zinc are separated into a group, indicating their inherent correlation in the study area.
Furthermore, Wilks’ Lambda test was used to validate the linear discriminant analysis. Table 9 presents the canonical correlation coefficient and eigenvalues obtained from the discriminant functions. The first canonical discriminant functions were used in the analysis. The canonical correlation coefficient represents the Pearson correlation between the discriminant scores calculated by the above function and the initial grouping values. The eigenvalues represent the amount of variance expressed by the above functions, so eigenvalues greater than one indicate a more appropriate discriminant function. The canonical correlation coefficient also indicates the correlation between discriminant scores and the dependent variable of the analysis, and larger values of this parameter indicate a higher power of the discriminant function. The calculated value for the canonical correlation coefficient is 0.903. This means that the discriminant function can model more than 81% of the variability related to the groups. This value is highly significant for the data used in the analysis, given the scale of the study area. The value of Wilks’ Lambda in the discriminant analysis is 0.184, which is a low value and indicates the high ability of the function to classify the data correctly. This number also equals the proportion of the total variance that this discriminant function cannot express.
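The reported figures can be cross-checked with a short calculation. For a single discriminant function with canonical correlation r, the explained share of between-group variability is r², the eigenvalue is r² / (1 − r²), and Wilks’ Lambda is 1 / (1 + eigenvalue) = 1 − r². Treating the statistic as coming from one function is an assumption made for this check, since Table 9 is not reproduced here; the function name is hypothetical.

```python
def discriminant_summary(canonical_r):
    """Derive explained variance, eigenvalue and Wilks' Lambda from one canonical r."""
    explained = canonical_r ** 2                # share of variability modelled
    eigenvalue = explained / (1.0 - explained)  # between/within variance ratio
    wilks_lambda = 1.0 / (1.0 + eigenvalue)     # equals 1 - r**2 for one function
    return explained, eigenvalue, wilks_lambda


explained, eigenvalue, wilks_lambda = discriminant_summary(0.903)
```

Under this assumption, r = 0.903 gives an explained share of about 0.815 and a Wilks’ Lambda of about 0.185, in line with the "more than 81%" and 0.184 quoted in the text, and the corresponding eigenvalue is well above one.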

4.4.2. K-Means Clustering Results

In this analysis, the elements copper, molybdenum, lead, zinc, and tin were identified as a group of elements involved in tracing each other. A correlation coefficient summarizes the behavior of two elements relative to each other in a single number, which indicates only whether their correlation is direct or inverse. Therefore, the geochemical behavior of copper with each element genetically related to it was investigated in pairs. To identify the optimal number of clusters in the K-means clustering, the number K was increased from 3 to 10. Figure 17 shows the value of the utility function against the number of clusters for all elements in the same group as the copper element.
In line with the diagram shown in Figure 17, the number of three clusters is the optimal number of clusters for the behavior of copper element compared to other elements in its group. Figure 18 presents the profile of the clusters and the utility values for the optimal classification performed for the desired elements.
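The selection of K described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes that the utility function S(i) is the mean silhouette width, uses synthetic (Cu, Mo) pairs in place of the Sahlabad data, scans K = 2 to 6 rather than 3 to 10, and relies on a plain K-means with a deterministic farthest-point initialization on unscaled values. All function names are hypothetical.

```python
import math


def dist(p, q):
    """Euclidean distance between two (Cu, Mo) pairs."""
    return math.hypot(p[0] - q[0], p[1] - q[1])


def kmeans(points, k, max_iter=100):
    """Plain K-means with a deterministic farthest-point initialization."""
    centers = [points[0]]
    while len(centers) < k:
        centers.append(max(points, key=lambda p: min(dist(p, c) for c in centers)))
    labels = None
    for _ in range(max_iter):
        new_labels = [min(range(k), key=lambda j: dist(p, centers[j])) for p in points]
        if new_labels == labels:
            break  # assignments stopped changing
        labels = new_labels
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # keep the old center if a cluster becomes empty
                centers[j] = (sum(m[0] for m in members) / len(members),
                              sum(m[1] for m in members) / len(members))
    return labels, centers


def mean_silhouette(points, labels):
    """Mean silhouette width; singleton clusters contribute 0."""
    clusters = {}
    for p, lab in zip(points, labels):
        clusters.setdefault(lab, []).append(p)
    total = 0.0
    for p, lab in zip(points, labels):
        own = clusters[lab]
        if len(own) == 1:
            continue
        a = sum(dist(p, q) for q in own) / (len(own) - 1)
        b = min(sum(dist(p, q) for q in other) / len(other)
                for key, other in clusters.items() if key != lab)
        total += (b - a) / max(a, b)
    return total / len(points)


def synthetic_pairs():
    """Three tight, well-separated groups of hypothetical (Cu, Mo) values."""
    pairs = []
    for cu0, mo0 in ((10.0, 0.6), (40.0, 1.7), (80.0, 4.1)):
        for i in range(10):
            pairs.append((cu0 + (i - 4.5) * 0.4,
                          mo0 + ((i * 7) % 10 - 4.5) * 0.02))
    return pairs


points = synthetic_pairs()
scores = {k: mean_silhouette(points, kmeans(points, k)[0]) for k in range(2, 7)}
best_k = max(scores, key=scores.get)
for k, score in scores.items():
    print(f"K = {k}: mean silhouette = {score:.3f}")
print("selected K:", best_k)
```

On real data the elements would normally be scaled before clustering, because Cu and the paired element can differ by orders of magnitude; the sketch omits this step for brevity.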

Geochemical Behavior of Cu and Mo Elements

Optimal clustering was used to investigate the geochemical behavior of copper relative to Mo. The centers of the clusters for K = 3 are plotted in Figure 19.
Based on this clustering and the diagram in Figure 19, the concentration of Mo in the stream sediment samples of the Sahlabad area increases with the concentration of Cu. The relationship between Cu and Mo is nonlinear. The fitted curve, with a coefficient of determination of 1 (R² = 1), is a quadratic whose equation is given in Equation (4).
$$\mathrm{Mo} = 0.0003\,\mathrm{Cu}^{2} + 0.0236\,\mathrm{Cu} + 0.3133 \tag{4}$$
Equation (4) allows the concentrations of Cu and Mo to be calculated relative to each other and helps to characterize the behavior of these two elements in the stream sediments of the area.
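One point worth illustrating is why a quadratic fitted to three cluster centers reaches R² = 1. If the curve is fitted to the three centers only (an assumption here, since the fitting input is not stated explicitly), a parabola through three points with distinct Cu values is unique and passes through all of them exactly. The sketch below uses hypothetical center values generated from Equation (4) as printed; `quadratic_through` and `r_squared` are illustrative names.

```python
def mo_from_cu(cu):
    """Equation (4) as printed: Mo = 0.0003 Cu^2 + 0.0236 Cu + 0.3133."""
    return 0.0003 * cu ** 2 + 0.0236 * cu + 0.3133


def quadratic_through(p1, p2, p3):
    """Coefficients (a, b, c) of the unique parabola through three points.

    Uses the Lagrange form; the three x values must be distinct.
    """
    (x1, y1), (x2, y2), (x3, y3) = p1, p2, p3
    d1 = y1 / ((x1 - x2) * (x1 - x3))
    d2 = y2 / ((x2 - x1) * (x2 - x3))
    d3 = y3 / ((x3 - x1) * (x3 - x2))
    a = d1 + d2 + d3
    b = -(d1 * (x2 + x3) + d2 * (x1 + x3) + d3 * (x1 + x2))
    c = d1 * x2 * x3 + d2 * x1 * x3 + d3 * x1 * x2
    return a, b, c


def r_squared(points, coeffs):
    """Coefficient of determination of y = a x^2 + b x + c on the points."""
    a, b, c = coeffs
    ys = [y for _, y in points]
    mean_y = sum(ys) / len(ys)
    ss_res = sum((y - (a * x * x + b * x + c)) ** 2 for x, y in points)
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return 1.0 - ss_res / ss_tot


# Hypothetical Cu values for three cluster centers (not read from Figure 19).
centers = [(cu, mo_from_cu(cu)) for cu in (20.0, 45.0, 90.0)]
coeffs = quadratic_through(*centers)
fit_quality = r_squared(centers, coeffs)
print("recovered coefficients:", coeffs)
print("R^2 on the three centers:", fit_quality)
```

The same exact fit would be obtained for any three centers with distinct Cu values, so under this assumption R² = 1 reflects the number of fitted points rather than the scatter of the individual samples.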

Geochemical Behavior of Cu and Zn Elements

As shown in Figure 17, the three-cluster solution has the highest utility value among all the classifications. Figure 20 shows the concentrations at the cluster centers for K = 3, which describe the behavior of Cu relative to Zn.
Figure 20, which plots the concentration of Cu against that of Zn in the stream sediment samples, shows that the concentration of Zn increases with increasing Cu. The relationship between Cu and Zn is nonlinear, and the fitted curve is a quadratic with a coefficient of determination of 1 (R² = 1), given in Equation (5).
$$\mathrm{Zn} = 0.2327\,\mathrm{Cu}^{2} + 14.231\,\mathrm{Cu} + 124.98 \tag{5}$$

Geochemical Behavior of Cu and Pb Elements

As shown in Figure 17, the three-cluster solution (K = 3) was selected as the most appropriate classification because it has the highest utility function value (S(i)). The concentrations at the cluster centers of this classification are plotted in Figure 21 to describe the geochemical behavior of Cu versus Pb.
Figure 21 shows that the concentrations of the two elements increase together. The fitted curve describing their joint behavior is a quadratic with a coefficient of determination of 1 (R² = 1), given in Equation (6).
$$\mathrm{Pb} = 0.0125\,\mathrm{Cu}^{2} - 0.32\,\mathrm{Cu} + 15.359 \tag{6}$$

Geochemical Behavior of Cu and Sn Elements

Under the optimal clustering, the concentration of Sn in the stream sediment samples of the Sahlabad area increases with the concentration of Cu. This behavior was fitted with a quadratic curve, given in Equation (7). The copper and tin concentrations at the cluster centers are shown in Figure 22.
$$\mathrm{Sn} = 0.00007\,\mathrm{Cu}^{2} + 0.0625\,\mathrm{Cu} + 0.2074 \tag{7}$$
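For convenience, Equations (4) to (7) can be evaluated together. The sketch below is a hedged illustration: the coefficients are copied from the equations as printed, the linear term of Equation (6) is taken as negative (an assumption about its sign), and the helper names and the sample Cu value are hypothetical. It also reports the Cu value above which each fitted parabola rises, which is the range where the description of an increasing trend applies.

```python
# (a, b, c) for  element = a * Cu^2 + b * Cu + c
COEFFS = {
    "Mo": (0.0003, 0.0236, 0.3133),   # Equation (4)
    "Zn": (0.2327, 14.231, 124.98),   # Equation (5)
    "Pb": (0.0125, -0.32, 15.359),    # Equation (6); the minus sign is an assumption
    "Sn": (0.00007, 0.0625, 0.2074),  # Equation (7)
}


def predict(element, cu):
    """Concentration of `element` given Cu, in the units of the source data."""
    a, b, c = COEFFS[element]
    return a * cu * cu + b * cu + c


def slope(element, cu):
    """Derivative of the fitted curve with respect to Cu."""
    a, b, _ = COEFFS[element]
    return 2.0 * a * cu + b


def increasing_from(element):
    """Smallest non-negative Cu above which the curve rises (valid for a > 0)."""
    a, b, _ = COEFFS[element]
    return max(0.0, -b / (2.0 * a))


sample_cu = 30.0  # hypothetical Cu concentration, not a value from the paper
for element in COEFFS:
    print(f"{element}: predicted = {predict(element, sample_cu):.3f}, "
          f"rising for Cu > {increasing_from(element):.1f}")
```

With the signs as assumed here, the Mo, Zn, and Sn curves rise for every non-negative Cu value, while the Pb curve rises only above the vertex of its parabola.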

5. Discussion

The workflow presented in this research provides a path for the multi-faceted analysis of stream sediment geochemical data. The Self-Validation System (SVS) in this methodology confirms the accuracy of the results on the principle of reproducibility. The challenge, however, is to choose analytical methods appropriate to the system under study. One limitation of the workflow is therefore its dependence on the chosen analytical methods and on the procedures by which each of them is carried out. For example, could a different interpolation method make the results more accurate (closer to reality)? A detailed and documented answer to this question requires a separate study building on the present article. Another limitation is the lack of other types of geochemical data, such as heavy mineral or rock samples; if all types of geochemical data were used, more information could be deduced.
The relationships obtained between copper and the predictor elements of Cu mineralization (its geochemical family: Mo, Pb, Zn, and Sn) were all direct, meaning that as the grade of copper increases, the grades of these elements also increase, and vice versa. Geochemically, this supports the results of the previous steps, which identified these elements as predictors of Cu mineralization in the Sahlabad area. Geochemical halos related to Cu mineralization can therefore be identified using these elements. These results also show that the relationships between geochemical elements are not necessarily linear; here the elements behave nonlinearly with respect to one another.
The strength of the integrated geochemical model presented in this research is that the results of each method are evaluated against the results of the preceding and following methods, so that the results of each part geochemically confirm those of the other parts. In geochemical studies based on stream sediment data, it is always important to identify the predictor composition of the target element mineralization, determine the geochemical threshold of the target element, and investigate the geochemical behavior of the elements in the same group as the target element. Analyzing stream sediment data with the geochemical model proposed in this study allows a more accurate mineral exploration campaign to be carried out. Another strength of the model is its comprehensive treatment of geochemical data. This comprehensiveness (the steps presented in the flowchart; see Figure 4) gives geochemistry researchers a broader view of exploratory geochemical modeling based on stream sediment data.

6. Conclusions

In this study, correlation analysis and machine learning techniques were combined to develop a geochemical model of stream sediment data for the Sahlabad area, South Khorasan province, East Iran. The study aimed to identify the geochemical anomalies of copper based on its trace elements. To achieve this goal, trace elements were first identified using hierarchical analysis and factor analysis (FA). The map of copper mineralization anomalies was then produced using the main component scores of the factor analysis. Next, the group behavior was investigated to clarify the geochemical relationships of the trace elements with Cu, and the geochemical behavior of the trace elements versus copper was modeled as quadratic equations. Overall, the geochemical distribution of copper in the Sahlabad area is related to Mo, Sn, Pb, Zn, and Ag. Linear discriminant analysis (LDA), applied to the family of copper tracer elements, showed that Ag behaves differently from the other elements. Because Ag falls in a different group from the other elements, it has no direct genetic geochemical relationship with copper in the area. However, based on the initial clustering of the elements, it can still be used as a copper tracer in the area, because its genetic relationship with copper is indirectly defined. The geochemical model developed in this investigation is appropriate for copper prospectivity mapping in the Sahlabad area and in analogous metallogenic provinces around the world.

Author Contributions

All authors contributed to writing and editing this article. Original draft preparation, A.S. (Aref Shirazi) and A.H.; supervision, A.H. and A.B.P.; review and editing, A.S. (Adel Shirazy) and A.B.P. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Data Availability Statement

Not applicable.

Acknowledgments

We thank the Department of Mining Engineering at Amirkabir University of Technology (Tehran Polytechnic). This study is part of the postdoctoral research carried out by the first author at the Department of Mining Engineering at Amirkabir University of Technology (Tehran Polytechnic). The Institute of Oceanography and Environment (INOS), Universiti Malaysia Terengganu (UMT), is also acknowledged for providing facilities for editing, rewriting, and re-organizing the manuscript.

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. Roonwal, G.S. Mineral Exploration: Practical Application; Springer: Berlin/Heidelberg, Germany, 2018.
  2. Gourley, A.C. Key elements of a model mining code: A Middle East case study. Miner. Econ. 2018, 32, 187–204.
  3. Talapatra, A.K. Geochemical Exploration and Modelling of Concealed Mineral Deposits; Springer: Berlin/Heidelberg, Germany, 2020.
  4. Lindsay, M.D.; Piechocka, A.M.; Jessell, M.W.; Scalzo, R.; Giraud, J.; Pirot, G.; Cripps, E. Assessing the impact of conceptual mineral systems uncertainty on prospectivity predictions. Geosci. Front. 2022, 13, 101435.
  5. Shirazi, A.; Hezarkhani, A.; Pour, A.B.; Shirazy, A.; Hashim, M. Neuro-Fuzzy-AHP (NFAHP) Technique for Copper Exploration Using Advanced Spaceborne Thermal Emission and Reflection Radiometer (ASTER) and Geological Datasets in the Sahlabad Mining Area, East Iran. Remote Sens. 2022, 14, 5562.
  6. Revuelta, M.B. Mineral Resource Exploration. In Mineral Resources; Springer: Berlin/Heidelberg, Germany, 2018; pp. 121–222.
  7. Zuo, R.; Wang, J.; Xiong, Y.; Wang, Z. The processing techniques of geochemical exploration data: Past, present, and future. Appl. Geochem. 2021, 132, 105072.
  8. Shirazi, A.; Shirazy, A.; Saki, S.; Hezarkhani, A. Introducing a software for innovative neuro-fuzzy clustering method named NFCMR. Glob. J. Comput. Sci. Theory Res. 2018, 8, 62–69.
  9. Liu, G.; Fang, H.; Chen, Q.; Cui, Z.; Zeng, M. A Feature-Enhanced MPS Approach to Reconstruct 3D Deposit Models Using 2D Geological Cross Sections: A Case Study in the Luodang Cu Deposit, Southwestern China. Nat. Resour. Res. 2022, 31, 3101–3120.
  10. Yin, B.; Zuo, R.; Xiong, Y.; Li, Y.; Yang, W. Knowledge discovery of geochemical patterns from a data-driven perspective. J. Geochem. Explor. 2021, 231, 106872.
  11. Cui, Z.; Chen, Q.; Liu, G. Characterization of Subsurface Hydrogeological Structures with Convolutional Conditional Neural Processes on Limited Training Data. Water Resour. Res. 2022, 58, e2022WR033161.
  12. Behera, S.; Panigrahi, M.K. Mineral prospectivity modelling using singularity mapping and multifractal analysis of stream sediment geochemical data from the auriferous Hutti-Maski schist belt, S. India. Ore Geol. Rev. 2021, 131, 104029.
  13. Nforba, M.T.; Egbenchung, K.A.; Berinyuy, N.L.; Mimba, M.E.; Tangko, E.T.; Nono, G.D.K. Statistical evaluation of stream sediment geochemical data from Tchangue-Bikoui drainage system, Southern Cameroon: A regional perspective. Geol. Ecol. Landsc. 2020, 6, 1–13.
  14. Zomorrodian, M.; Shayestehfar, M.R. Indicator ratios and additive composite halos in stream sediment samples as a geochemical indicator for identifying promising epithermal gold deposit in the north of Kashmar. Arab. J. Geosci. 2019, 12, 331.
  15. Li, H.; Li, X.; Yuan, F.; Jowitt, S.M.; Zhang, M.; Zhou, J.; Zhou, T.; Li, X.; Ge, C.; Wu, B. Convolutional neural network and transfer learning based mineral prospectivity modeling for geochemical exploration of Au mineralization within the Guandian–Zhangbaling area, Anhui Province, China. Appl. Geochem. 2020, 122, 104747.
  16. Sun, T.; Li, H.; Wu, K.; Chen, F.; Zhu, Z.; Hu, Z. Data-Driven Predictive Modelling of Mineral Prospectivity Using Machine Learning and Deep Learning Methods: A Case Study from Southern Jiangxi Province, China. Minerals 2020, 10, 102.
  17. Chen, Q.; Cui, Z.; Liu, G.; Yang, Z.; Ma, X. Deep convolutional generative adversarial networks for modeling complex hydrological structures in Monte-Carlo simulation. J. Hydrol. 2022, 610, 127970.
  18. Li, C.; Liu, B.; Guo, K.; Li, B.; Kong, Y. Regional Geochemical Anomaly Identification Based on Multiple-Point Geostatistical Simulation and Local Singularity Analysis—A Case Study in Mila Mountain Region, Southern Tibet. Minerals 2021, 11, 1037.
  19. Zhou, S.-G.; Zhou, K.-F.; Wang, J.-L. Geochemical metallogenic potential based on cluster analysis: A new method to extract valuable information for mineral exploration from geochemical data. Appl. Geochem. 2020, 122, 104748.
  20. Helba, H.A.; El-Makky, A.M.; Khalil, K.I. Application of CN fractal model, factor analysis, and geochemical mineralization probability index (GMPI) for delineating geochemical anomalies related to Mn-Fe deposit and associated Cu mineralization in west-central Sinai, Egypt. Geochem. Explor. Environ. Anal. 2021, 21, geochem2021-031.
  21. Shirazy, A.; Ziaii, M.; Hezarkhani, A.; Timkin, T.V.; Voroshilov, V.G. Geochemical behavior investigation based on k-means and artificial neural network prediction for titanium and zinc, Kivi region, Iran. Bull. Tomsk. Polytech. Univ. 2021, 332, 113–125.
  22. Zhao, M.; Xia, Q.; Wu, L.; Liang, Y. Identification of Multi-Element Geochemical Anomalies for Cu–Polymetallic Deposits Through Staged Factor Analysis, Improved Fractal Density and Expected Value Function. Nat. Resour. Res. 2021, 31, 1867–1887.
  23. Shahsavar, S.; Rad, A.J.; Afzal, P.; Nezafati, N. Selection of Optimum Fractal Model for Detection of Stream Sediments Anomalies. Geopersia 2020, 10, 395–404.
  24. Khorshidi, N.; Parsa, M.; Lentz, D.R.; Sobhanverdi, J. Identification of heavy metal pollution sources and its associated risk assessment in an industrial town using the K-means clustering technique. Appl. Geochem. 2021, 135, 105113.
  25. Wang, J.; Zhou, Y.; Xiao, F. Identification of multi-element geochemical anomalies using unsupervised machine learning algorithms: A case study from Ag–Pb–Zn deposits in north-western Zhejiang, China. Appl. Geochem. 2020, 120, 104679.
  26. Farahmandfar, Z.; Jafari, M.; Afzal, P.; Ardalan, A.A. Description of gold and copper anomalies using fractal and stepwise factor analysis according to stream sediments in NW Iran. Geopersia 2019, 10, 135–148.
  27. Aryafar, A.; Moeini, H.; Khosravi, V. CRFA-CRBM: A hybrid technique for anomaly recognition in regional geochemical exploration; case study: Dehsalm area, east of Iran. Int. J. Min. Geo-Eng. 2020, 54, 33–38.
  28. Xiong, Y.; Zuo, R. Recognizing multivariate geochemical anomalies for mineral exploration by combining deep learning and one-class support vector machine. Comput. Geosci. 2020, 140, 104484.
  29. Hedayat, B.; Ahmadi, M.E.; Nazerian, H.; Shirazi, A.; Shirazy, A. Feasibility of Simultaneous Application of Fuzzy Neural Network and TOPSIS Integrated Method in Potential Mapping of Lead and Zinc Mineralization in Isfahan-Khomein Metallogeny Zone. Open J. Geol. 2022, 12, 215–233.
  30. de Campos, F.F.; Licht, O.A.B. Correlation diagrams: Graphical visualization of geochemical associations using the EzCorrGraph app. J. Geochem. Explor. 2021, 220, 106657.
  31. Mahboob, M.A.; Celik, T.; Genc, B. Predictive modeling and comparative evaluation of geostatistical models for geochemical exploration through stream sediments. Arab. J. Geosci. 2020, 13, 1080.
  32. Shirazy, A.; Ziaii, M.; Hezarkhani, A. Geochemical Behavior Investigation Based on K-means and Artificial Neural Network Prediction for Copper, in Kivi region, Ardabil province, IRAN. Iran. J. Min. Eng. 2020, 14, 96–112.
  33. Chen, Q.; Liu, G.; Ma, X.; Li, X.; He, Z. 3D stochastic modeling framework for Quaternary sediments using multiple-point statistics: A case study in Minjiang Estuary area, southeast China. Comput. Geosci. 2019, 136, 104404.
  34. Zuo, R. Machine Learning of Mineralization-Related Geochemical Anomalies: A Review of Potential Methods. Nat. Resour. Res. 2017, 26, 457–464.
  35. Zuo, R.; Wang, J.; Yin, B. Visualization and interpretation of geochemical exploration data using GIS and machine learning methods. Appl. Geochem. 2021, 134, 105111.
  36. Salkuti, S.R. A survey of big data and machine learning. Int. J. Electr. Comput. Eng. 2020, 10, 575–580.
  37. Gm, H.; Gourisaria, M.K.; Pandey, M.; Rautaray, S.S. A comprehensive survey and analysis of generative models in machine learning. Comput. Sci. Rev. 2020, 38, 100285.
  38. Shirazy, A.; Hezarkhani, A.; Timkin, T.; Shirazi, A. Investigation of Magneto-/Radio-Metric Behavior in Order to Identify an Estimator Model Using K-Means Clustering and Artificial Neural Network (ANN) (Iron Ore Deposit, Yazd, IRAN). Minerals 2021, 11, 1304.
  39. Keykhay-Hosseinpoor, M.; Kohsary, A.-H.; Hossein-Morshedy, A.; Porwal, A. A machine learning-based approach to exploration targeting of porphyry Cu-Au deposits in the Dehsalm district, eastern Iran. Ore Geol. Rev. 2019, 116, 103234.
  40. Qin, Y.; Liu, L.; Wu, W. Machine Learning-Based 3D Modeling of Mineral Prospectivity Mapping in the Anqing Orefield, Eastern China. Nat. Resour. Res. 2021, 30, 3099–3120.
  41. Dornan, T.; O’Sullivan, G.; O’Riain, N.; Stueeken, E.; Goodhue, R. The application of machine learning methods to aggregate geochemistry predicts quarry source location: An example from Ireland. Comput. Geosci. 2020, 140, 104495.
  42. Shabani, A.; Ziaii, M.; Monfared, M.S.; Shirazy, A.; Shirazi, A. Multi-Dimensional Data Fusion for Mineral Prospectivity Mapping (MPM) Using Fuzzy-AHP Decision-Making Method, Kodegan-Basiran Region, East Iran. Minerals 2022, 12, 1629.
  43. Hajsadeghi, S.; Asghari, O.; Mirmohammadi, M.; Meshkani, S.A. Discrimination of Mineralized Rock Types in a Copper-Rich Volcanogenic Massive Sulfide Deposit Through Fast Independent Component and Factor Analysis. Nat. Resour. Res. 2019, 29, 161–171.
  44. Steiner, B.M.; Rollinson, G.K.; Condron, J.M. An Exploration Study of the Kagenfels and Natzwiller Granites, Northern Vosges Mountains, France: A Combined Approach of Stream Sediment Geochemistry and Automated Mineralogy. Minerals 2019, 9, 750.
  45. Mohammadpour, M.; Bahroudi, A.; Abedi, M.; Rahimipour, G.; Jozanikohan, G.; Khalifani, F.M. Geochemical distribution mapping by combining number-size multifractal model and multiple indicator kriging. J. Geochem. Explor. 2019, 200, 13–26.
  46. Miftah, A.; El Azzab, D.; Attou, A.; Rachid, A.; Ouchchen, M.; Soulaimani, A.; Soulaimani, S.; Manar, A. Combined analysis of helicopter-borne magnetic and stream sediment geochemical data around an ancient Tiouit gold mine (Eastern Anti-Atlas, Morocco): Geological and mining interpretations. J. Afr. Earth Sci. 2020, 175, 104093.
  47. Liu, Y.; Cheng, Q.; Carranza, E.J.M.; Zhou, K. Assessment of Geochemical Anomaly Uncertainty Through Geostatistical Simulation and Singularity Analysis. Nat. Resour. Res. 2018, 28, 199–212.
  48. Tahmooresi, M.; Babaei, B.; Dehghan, S. Intelligent geochemical exploration modeling using multiclass support vector machine and integration it with continuous genetic algorithm in Gonabad region, Khorasan Razavi, Iran. Arab. J. Geosci. 2021, 14, 1–15.
  49. Zuo, R. Mineral Exploration Using Subtle or Negative Geochemical Anomalies. J. Earth Sci. 2020, 32, 439–454.
  50. Aali, A.A.; Shirazy, A.; Shirazi, A.; Pour, A.B.; Hezarkhani, A.; Maghsoudi, A.; Hashim, M.; Khakmardan, S. Fusion of Remote Sensing, Magnetometric, and Geological Data to Identify Polymetallic Mineral Potential Zones in Chakchak Region, Yazd, Iran. Remote Sens. 2022, 14, 6018.
  51. Tirrul, R.; Bell, I.; Griffis, R.; Camp, V. The Sistan suture zone of eastern Iran. Geol. Soc. Am. Bull. 1983, 94, 134–150.
  52. Reyre, D.; Mohafez, S. A First Contribution of the NIOC-ERAP Agreements to the Knowledge of Iranian Geology; Editions Technip: Paris, France, 1972.
  53. Nabavi, M. An Introduction to the Iranian Geology; Geological Survey of Iran: Tehran, Iran, 1976; p. 110.
  54. Samani, B.; Ashtari, M. The development of geology in Sistan and Baluchestan region. Q. J. Earth Sci. 1991, 4, 26.
  55. Carey, S.W. The Expanding Earth; Elsevier: Amsterdam, The Netherlands, 1976; p. 488.
  56. Alavi, M. Sedimentary and structural characteristics of the Paleo-Tethys remnants in northeastern Iran. GSA Bull. 1991, 103, 983–992.
  57. Navai, I. Geological Map of Sahlabad Area (On Scale 1:100,000); Iran Geological Survey (G.S.I.): Tehran, Iran, 1974.
  58. GSI. Report of Systematic Geochemical Explorations in the Sahlabad Area (Sheet on Scale 1:100,000—Geochemistry of Stream Sediments); Geological Survey of Iran (GSI): Tehran, Iran, 2001; p. 644.
  59. Schwertman, N.C.; Owens, M.A.; Adnan, R. A simple more general boxplot method for identifying outliers. Comput. Stat. Data Anal. 2004, 47, 165–174.
  60. Filzmoser, P.; Gregorich, M. Multivariate Outlier Detection in Applied Data Analysis: Global, Local, Compositional and Cellwise Outliers. Math. Geosci. 2020, 52, 1049–1066.
  61. Bouguettaya, A.; Yu, Q.; Liu, X.; Zhou, X.; Song, A. Efficient agglomerative hierarchical clustering. Expert Syst. Appl. 2014, 42, 2785–2797.
  62. Murtagh, F.; Contreras, P. Algorithms for hierarchical clustering: An overview. Wiley Interdiscip. Rev. Data Min. Knowl. Discov. 2012, 2, 86–97.
  63. Yu, X.; Xiao, F.; Zhou, Y.; Wang, Y.; Wang, K. Application of hierarchical clustering, singularity mapping, and Kohonen neural network to identify Ag-Au-Pb-Zn polymetallic mineralization associated geochemical anomaly in Pangxidong district. J. Geochem. Explor. 2019, 203, 87–95.
  64. Shirazi, A.; Hezarkhani, A.; Pour, A.B. Fusion of lineament factor (Lf) map analysis and multifractal technique for massive sulfide copper exploration: The Sahlabad area, East Iran. Minerals 2022, 12, 549.
  65. Saadati, H.; Afzal, P.; Torshizian, H.; Solgi, A. Geochemical exploration for lithium in NE Iran using the geochemical mapping prospectivity index, staged factor analysis, and a fractal model. Geochem. Explor. Environ. Anal. 2020, 20, 461–472.
  66. Lawley, D.N.; Maxwell, A.E. Factor analysis as a statistical method. J. R. Stat. Soc. Ser. D 1962, 12, 209–229.
  67. Likas, A.; Vlassis, N.; Verbeek, J.J. The global k-means clustering algorithm. Pattern Recognit. 2003, 36, 451–461.
  68. Saha, S.; Bandyopadhyay, S. A generalized automatic clustering algorithm in a multiobjective framework. Appl. Soft Comput. 2013, 13, 89–108.
  69. Shirazy, A.; Ziaii, M.; Hezarkhani, A.; Timkin, T. Geostatistical and Remote Sensing Studies to Identify High Metallogenic Potential Regions in the Kivi Area of Iran. Minerals 2020, 10, 869.
  70. Shirazi, A.; Shirazi, A.; Hezarkhani, A. Advanced Integrated Methods in Mineral Exploration; LAP LAMBERT Academic Publishing: Saarbrücken, Germany, 2022; p. 160.
  71. Shirazy, A.; Shirazi, A.; Hezarkhani, A. Predicting gold grade in Tarq 1: 100,000 geochemical map using the behavior of gold, Arsenic and Antimony by K-means method. J. Miner. Resour. Eng. 2018, 4, 11–23.
  72. Rantitsch, G. The fractal properties of geochemical landscapes as an indicator of weathering and transport processes within the Eastern Alps. J. Geochem. Explor. 2001, 73, 27–42.
  73. Nazarpour, A. Application of CA fractal model and exploratory data analysis (EDA) to delineate geochemical anomalies in the: Takab 1: 25,000 geochemical sheet, NW Iran. Iran. J. Earth Sci. 2018, 10, 173–180.
  74. Tharwat, A.; Gaber, T.; Ibrahim, A.; Hassanien, A.E. Linear discriminant analysis: A detailed tutorial. AI Commun. 2017, 30, 169–190.
  75. Xanthopoulos, P.; Pardalos, P.M.; Trafalis, T.B. Linear discriminant analysis. In Robust Data Mining; Springer: Berlin/Heidelberg, Germany, 2013; pp. 27–33.
  76. Roshani, P.; Mokhtari, A.R.; Tabatabaei, S.H. Objective based geochemical anomaly detection—Application of discriminant function analysis in anomaly delineation in the Kuh Panj porphyry Cu mineralization (Iran). J. Geochem. Explor. 2013, 130, 65–73.
  77. Garson, G.D. Testing Statistical Assumptions; Statistical Associates Publishing: Asheboro, NC, USA, 2012.
  78. Levin, R.I. Statistics for Management; Pearson Education India: Chennai, India, 2011.
  79. Kim, H.-Y. Statistical notes for clinical researchers: Assessing normal distribution (2) using skewness and kurtosis. Restor. Dent. Endod. 2013, 38, 52–54.
  80. George, D. SPSS for Windows Step by Step: A Simple Study Guide and Reference, 17.0 Update, 10/e; Pearson Education India: Chennai, India, 2011.
  81. Field, A. Discovering Statistics Using SPSS for Windows: Advanced Techniques for the Beginner/Andy Field; Sage Publications: London, UK, 2000.
  82. Field, A. Discovering Statistics Using IBM SPSS Statistics; SAGE: London, UK, 2013.
  83. Gravetter, F.J.; Wallnau, L.B.; Forzano, L.-A.B.; Witnauer, J.E. Essentials of Statistics for the Behavioral Sciences; Cengage Learning: Boston, MA, USA, 2020.
  84. Trochim, W.; Donnelly, J. The Research Methods Knowledge Base, 3rd ed.; Atomic Dog Publishing: Mason, OH, USA, 2006.
  85. El-Kammar, A.; El-Wakil, M.; El-Rahman, Y.A.; Fathy, M.; Abdel-Azeem, M. Stream sediment geochemical survey of rare elements in an arid region of the Hamadat area, central Eastern Desert, Egypt. Ore Geol. Rev. 2019, 117, 103287.
  86. Crisigiovanni, F.; Licht, O.; Ferrari, V.; Porto, C. Geochemical mapping based on regularly spaced composite stream sediment samples produced from stored aliquots—State of Paraná Pre-Cambrian shield, Brazil. Geochim. Bras. 2019, 33, 234–259.
  87. Seyid, M.; Rajendran, G.; Ayele, B. Geospatial analysis of stream sediment samples for gold and base metal concentration in Daya Dawa, West Guji, Oromia Region, Southern Ethiopia. Arab. J. Geosci. 2021, 14, 1–12.
Figure 1. (A) Geological map of the Sahlabad area, scale of 1:100,000, modified from Navai [57]. (B) The geographical location of the Sistan structural zone in Iran and study area. An: andesite; Ba: basalt; Co: conglomerate; Dd: dacitic dyke; Lm: limestone; Lv: listwanite; Ml: ophiolite melange; Mt: metadiabase; Qt: Quaternary sediments; Tu: tuff; Ub: ultrabasic rocks; Sch: schist; Sh: shale.
Figure 2. The location of stream sediment samples in the Sahlabad Area.
Figure 3. Relative error diagram of duplicate samples in stream sediment analysis.
Figure 4. An overview of the methodological flowchart used in this study.
Figure 5. Boxplot (A) and histogram (B) of raw copper element data. *: Data symbol.
Figure 6. Boxplot (A) and histogram (B) of corrected copper element data.
Figure 7. The result of hierarchical clustering of data in the form of a dendrogram.
Figure 8. Scree plot to identify the number of four factors in stream sediment data.
Figure 9. A three-dimensional model of potential areas based on the first component matrix of factor analysis adapted to the Sahlabad’s topography (for factor score please see Figure 10).
Figure 10. A map of high potential areas based on the first component matrix of factor analysis, matched to Sahlabad’s topography.
Figure 11. A map of K-means clustering applications on geochemical data of copper element in stream sediment samples.
Figure 12. An interpolation map of copper concentration in stream sediment samples.
Figure 13. Concentration–area (C–A) fractal diagram of the copper element in stream sediment samples.
Figure 14. Separation of statistical communities of the data based on thresholds of copper geochemical communities.
Figure 15. Histogram of discriminant scores based on the discriminant function for the data used in discriminated groups of (AC).
Figure 16. A plot of element functions according to the copper element’s geochemical communities in the LDA method.
Figure 17. A graph of the utility function S(i) value versus the number of clusters for Cu versus Mo, Pb, Sn, and Zn.
Figure 18. The profile of the clusters and the utility function values for the optimal classification (k = 3) performed for Cu vs. Mo, Pb, Sn, and Zn.
Figure 19. Cu and Mo concentrations in the centers of three clusters.
Figure 20. The behavior of Cu and Zn concentrations in the centers of three clusters.
Figure 21. The behavior of Cu and Pb concentrations in the centers of the three clusters.
Figure 22. The behavior of Cu and Sn concentrations in the centers of the three clusters.
Table 1. Characteristics of copper mineralization zones in the study area [64].

| Copper Mineralization | Longitude (E) | Latitude (N) | Size (km²) | Alteration Zones | Host Rock Lithology |
|---|---|---|---|---|---|
| Mesgaran Deposit | 59°52′49″ | 32°18′58″ | 8 | Phy + Arg + Pp + Chl + Qtz | Ba + Anb |
| Chah-Rasteh Deposit | 59°46′15″ | 32°21′19″ | 4 | Phy + Arg + Pp + Chl + Cab | An + Anb |
| Zahri Deposit | 59°32′52″ | 32°00′50″ | 2 | Phy + Arg + Pp + Hem | Ub + Sch |
| Kasrab Abandoned Mine | 59°59′45″ | 32°21′05″ | 3.8 | Phy + Arg + Pp + Sep | Ub |
| Cheshme-Zangi Abandoned Mine | 59°59′08″ | 32°25′02″ | 2.5 | Phy + Arg + Pp + Silicification | Limestone shale + Listwanite |
| Shir-Shotor Indice | 59°53′50″ | 32°14′28″ | 1 | Arg + Pp + Sep | An + Serpentinite (Ub) |
| Dastgerd Indice | 59°43′39″ | 32°21′03″ | 2 | Arg + Pp + Sep + Hem | Harzburgite |
| Torshaab Indice | 59°59′56″ | 32°28′48″ | 5 | Phy + Arg + Pp + Hem + Lm | Sch |
| Chah-Anjir Indice | 59°53′37″ | 32°15′44″ | 2 | Pp + Sep | Serpentinite (Ub) |
| Zargaran Indice | 59°47′09″ | 32°21′14″ | 1 | Phy + Arg + Pp + Lm + Goe + Hem | An + Db |
| West Mesgaran Indice | 59°52′26″ | 32°19′36″ | 1.5 | Arg + Pp + Hem + Lm | Mtd |
| Mirsimin Indice | 59°54′58″ | 32°17′53″ | 9 | Arg + Pp + Hem | Db |
| Kuharod Indice | 59°50′31″ | 32°18′01″ | 1 | Phy + Arg + Pp + Hem | Db |
| Barghan Indice | 59°39′38″ | 32°09′05″ | 2 | Arg + Pp + Lm + Goe + Hem | Db + Limestone |
Abbreviations: Ba = Basalt, An = Andesite, Anb = Andesite-Basalt, Ub = Ultrabasic, Sch = Schist, Db = Diabase, Mtd = Metadiabase, Chl = Chlorite Alteration, Qtz = Quartz Alteration, Cab = Carbonate Alteration, Pp = Propylitic Alteration, Arg = Argillic Alteration, Phy = Phyllic Alteration, Sep = Serpentine Alteration, Hem = Hematite Alteration, Lm = Limonite Alteration, Goe = Goethite Alteration.
Table 2. Statistical characteristics of each element in the stream sediment geochemical data.

| Element | Mean (ppm) | Standard Deviation (ppm) | Median (ppm) | Mode (ppm) | Coefficient of Variation | Skewness | Kurtosis |
|---|---|---|---|---|---|---|---|
| Ag | 0.07 | 0.01 | 0.04 | 0.04 | 14.28 | 0.84 | 1.26 |
| Ba | 313.83 | 72.53 | 3.00 | 279.00 | 23.11 | 0.78 | 0.35 |
| Bi | 0.17 | 0.039 | 0.16 | 0.16 | 22.94 | 0.48 | −0.46 |
| Co | 18.07 | 5.64 | 16.60 | 14.40 | 31.21 | 0.79 | −0.27 |
| Cu | 21.28 | 4.97 | 21.00 | 18.60 | 23.35 | 0.34 | −0.44 |
| Fe | 3.11 | 0.45 | 3.13 | 3.18 | 14.47 | −0.07 | −0.14 |
| Hg | 0.02 | 0.00 | 0.02 | 0.02 | 0 | 0.03 | −0.59 |
| Mn | 0.06 | 0.00 | 0.06 | 0.06 | 0 | 0.18 | −0.21 |
| Mo | 0.67 | 0.14 | 0.66 | 0.67 | 20.89 | 0.24 | −0.41 |
| Ni | 88.77 | 53.11 | 67.30 | 208.70 | 59.83 | 0.89 | −0.43 |
| Pb | 14.47 | 2.30 | 14.50 | 15.40 | 15.89 | 0.29 | −0.13 |
| Sb | 0.50 | 0.13 | 0.49 | 0.46 | 26 | −0.06 | −0.48 |
| Sn | 1.19 | 0.38 | 1.10 | 0.90 | 31.93 | 0.55 | −0.36 |
| Ti | 0.31 | 0.042 | 0.31 | 0.31 | 13.54 | −0.09 | −0.01 |
| Zn | 71.21 | 11.71 | 71.00 | 68.00 | 16.44 | 0.23 | −0.25 |
| Sr | 28.60 | 8.72 | 261.40 | 495.40 | 30.49 | 1.01 | 0.40 |
| As | 8.11 | 1.61 | 8.20 | 1.00 | 19.85 | −0.09 | −0.46 |
| Au (ppb) | 0.871 (ppb) | 0.268 (ppb) | 0.00 (ppb) | 0.80 (ppb) | 0.30 | 0.48 | −0.60 |
| Cr | 21.70 | 111.13 | 16.30 | 108.10 | 512.12 | 1.08 | −0.07 |
| W | 0.96 | 0.26 | 0.94 | 1.00 | 27.08 | 0.54 | −0.17 |
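The coefficient of variation column in Table 2 appears consistent with the usual definition, CV = 100 × standard deviation / mean. The following is a minimal Python sketch that recomputes the CV for four of the Cu-group elements under that assumed definition; the names `table2_rows` and `coefficient_of_variation` are illustrative and do not come from the study.

```python
# Mean and standard deviation (ppm) copied from Table 2 for four Cu-group elements.
table2_rows = {
    "Cu": (21.28, 4.97),
    "Mo": (0.67, 0.14),
    "Pb": (14.47, 2.30),
    "Zn": (71.21, 11.71),
}


def coefficient_of_variation(mean, std_dev):
    """Return the CV in percent, assuming CV = 100 * std_dev / mean."""
    return 100.0 * std_dev / mean


# Recompute the CV for each element from its tabulated mean and standard deviation.
cv_percent = {
    element: coefficient_of_variation(mean, std_dev)
    for element, (mean, std_dev) in table2_rows.items()
}

for element, cv in cv_percent.items():
    print(f"{element}: CV = {cv:.2f}%")
```

The recomputed values agree with the tabulated ones to within rounding, which suggests the CV column is expressed in percent for these elements.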
Table 3. Spearman’s rank correlation coefficients of Cu group elements.

|    | Ag | Cu | Mo | Pb | Sn | Zn |
|---|---|---|---|---|---|---|
| Ag | 1 | | | | | |
| Cu | 0.506 | 1 | | | | |
| Mo | 0.663 | 0.289 | 1 | | | |
| Pb | 0.683 | 0.256 | 0.665 | 1 | | |
| Sn | 0.643 | 0.378 | 0.654 | 0.646 | 1 | |
| Zn | 0.777 | 0.523 | 0.748 | 0.721 | 0.659 | 1 |
Table 4. The FA results of geochemical stream sediment data in the Sahlabad area.

| Factor | Eigenvalue (Total) | Variance (%) | Cumulative Variance (%) |
|---|---|---|---|
| 1 | 5.385 | 26.924 | 26.924 |
| 2 | 3.997 | 19.985 | 46.910 |
| 3 | 2.335 | 11.674 | 58.584 |
| 4 | 1.390 | 6.950 | 65.534 |
| 5 | 1.140 | 5.700 | 71.235 |
| 6 | 1.014 | 5.069 | 76.304 |
| 7 | 0.835 | 4.173 | 80.477 |
| 8 | 0.671 | 3.355 | 83.832 |
| 9 | 0.513 | 2.567 | 86.399 |
| 10 | 0.443 | 2.215 | 88.614 |
| 11 | 0.440 | 2.200 | 90.813 |
| 12 | 0.421 | 2.103 | 92.917 |
| 13 | 0.372 | 1.859 | 94.775 |
| 14 | 0.276 | 1.378 | 96.153 |
| 15 | 0.222 | 1.110 | 97.262 |
| 16 | 0.144 | 0.718 | 97.981 |
| 17 | 0.130 | 0.651 | 98.632 |
| 18 | 0.121 | 0.603 | 99.235 |
| 19 | 0.084 | 0.420 | 99.654 |
| 20 | 0.069 | 0.346 | 100 |
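The variance percentages in Table 4 can be reproduced from the eigenvalues if the factor analysis is assumed to operate on the correlation matrix of the 20 elements, in which case the eigenvalues sum to the number of variables. The sketch below makes that assumption explicit; `n_variables = 20` is inferred from the 20 rows of the table, and the variable names are illustrative.

```python
# First four eigenvalues from Table 4 (the four factors reported in Table 5).
eigenvalues = [5.385, 3.997, 2.335, 1.390]

# Assumption: 20 standardized variables, so the eigenvalues sum to 20.
n_variables = 20

# Share of total variance explained by each factor, in percent.
variance_pct = [100.0 * ev / n_variables for ev in eigenvalues]

# Running sum of the explained variance.
cumulative_pct = []
running_total = 0.0
for pct in variance_pct:
    running_total += pct
    cumulative_pct.append(running_total)

for i, (pct, cum) in enumerate(zip(variance_pct, cumulative_pct), start=1):
    print(f"Factor {i}: {pct:.3f}% of variance, {cum:.3f}% cumulative")
```

Under this assumption the first four factors account for about 65.5% of the total variance, in line with the cumulative value tabulated for factor 4.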
Table 5. The matrix of four factors obtained from the FA method.

| Element | Factor 1 | Factor 2 | Factor 3 | Factor 4 |
|---|---|---|---|---|
| Ag | 0.875 | 0.012 | −0.015 | 0.127 |
| As | 0.043 | −0.033 | 0.025 | 0.063 |
| Au | 0.024 | 0.523 | −0.074 | 0.221 |
| Ba | −0.089 | −0.339 | 0.845 | −0.039 |
| Bi | 0.127 | −0.285 | −0.336 | −0.068 |
| Co | 0.076 | 0.905 | −0.141 | 0.247 |
| Cr | −0.331 | 0.838 | −0.176 | 0.067 |
| Cu | 0.445 | −0.028 | 0.187 | 0.387 |
| Fe | −0.110 | 0.290 | −0.101 | 0.835 |
| Hg | 0.228 | 0.146 | −0.239 | 0.334 |
| Mn | 0.120 | −0.489 | 0.374 | 0.492 |
| Mo | 0.849 | −0.213 | 0.047 | −0.056 |
| Ni | −0.083 | 0.897 | −0.214 | 0.086 |
| Pb | 0.795 | −0.147 | −0.317 | −0.035 |
| Sb | 0.183 | −0.059 | −0.386 | −0.210 |
| Sn | 0.812 | −0.059 | −0.101 | 0.012 |
| Sr | −0.163 | −0.151 | 0.889 | 0.059 |
| Ti | 0.325 | −0.809 | −0.077 | 0.323 |
| W | 0.226 | 0.056 | −0.330 | 0.127 |
| Zn | 0.865 | −0.144 | −0.100 | 0.195 |
Table 6. Clustering results of copper element using the k-means clustering method.

| Cluster Number | 1 | 2 | 3 |
|---|---|---|---|
| Concentration of Cu in the Center of Cluster (ppm) | 17 | 23.38 | 30.54 |
| Threshold Concentration of Cu in Cluster (ppm) | Less than 19.3 | 19.3 to 29.4 | Larger than 29.4 |
Table 7. Geochemical communities of the copper element based on C–A fractal results.

| Community | Cu Threshold Concentration (ppm) |
|---|---|
| Background | <19.07 |
| Anomaly | 19.07 to 29.52 |
| Enrichment | >29.52 |
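As an illustration of how the C–A thresholds in Table 7 partition a sample set, the following Python sketch assigns a Cu concentration to one of the three communities. The function name `classify_cu` is hypothetical, and the handling of values exactly equal to a threshold (assigned here to the higher community) is an assumption, since the table leaves those boundary cases open.

```python
# C-A fractal thresholds for Cu (ppm) taken from Table 7.
BACKGROUND_MAX_PPM = 19.07
ANOMALY_MAX_PPM = 29.52


def classify_cu(cu_ppm):
    """Assign a Cu concentration (ppm) to a geochemical community.

    Assumption: a value exactly on a threshold goes to the higher community.
    """
    if cu_ppm < BACKGROUND_MAX_PPM:
        return "Background"
    if cu_ppm < ANOMALY_MAX_PPM:
        return "Anomaly"
    return "Enrichment"


# Example input: the three k-means cluster centers reported in Table 6.
cluster_centers_ppm = [17.0, 23.38, 30.54]
communities = [classify_cu(value) for value in cluster_centers_ppm]
print(communities)
```

Applied to the k-means cluster centers from Table 6, each center falls into a different community, which reflects how closely the k-means thresholds (19.3 and 29.4 ppm) track the C–A thresholds (19.07 and 29.52 ppm).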
Table 8. The data separation processed by the LDA method.

| Item | Count |
|---|---|
| Number of processed data | 709 |
| Excluded: missing or out-of-range group codes | 0 |
| Excluded: at least one missing discriminating variable | 110 |
| Number of data used in the output | 701 |
Table 9. Eigenvalue and canonical correlation coefficient of discriminant analysis function.

| Eigenvalue | % of Variance | Cumulative % | Canonical Correlation |
|---|---|---|---|
| 4.424 | 100 | 100 | 0.903 |
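In discriminant analysis, the canonical correlation of a function is related to its eigenvalue λ by R = sqrt(λ / (1 + λ)). The short Python check below applies this standard relation to the eigenvalue in Table 9; the variable names are illustrative.

```python
import math

# Eigenvalue of the discriminant function, taken from Table 9.
eigenvalue = 4.424

# Standard relation between eigenvalue and canonical correlation.
canonical_correlation = math.sqrt(eigenvalue / (1.0 + eigenvalue))

print(f"Canonical correlation: {canonical_correlation:.3f}")
```

The result rounds to 0.903, which agrees with the canonical correlation listed in Table 9.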
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.


Shirazi, A.; Hezarkhani, A.; Shirazy, A.; Pour, A.B. Geochemical Modeling of Copper Mineralization Using Geostatistical and Machine Learning Algorithms in the Sahlabad Area, Iran. Minerals 2023, 13, 1133. https://doi.org/10.3390/min13091133
