*4.5. Three-Dimensional MPM Based on Machine Learning*

To address the problem of quantitative mineralization prediction at great depth, the previous section extracted the deep geochemical mineralization signatures and constructed a quantitative geological and geochemical prediction model for mineral resources at depth. In this section, the MaxEnt model and the GMM are applied to carry out 3D MPM to quantitatively predict deep mineral resources, and the uncertainty of the two models is evaluated to improve the accuracy of mineralization prediction.

*Minerals* **2022**, *12*, x FOR PEER REVIEW 20 of 32

#### 4.5.1. Training Sample Construction

The MaxEnt model is a supervised machine learning algorithm: it learns an optimal model from a given training dataset and uses this model to output the corresponding classification result.

For mineralization prediction, the target variable of supervised learning (i.e., the label of the training samples) is either mineralized or nonmineralized (denoted by 1 and 0, respectively). A total of 39,306 positive samples were extracted from known orebodies and 49,686 negative samples were extracted from nonmineralized positions confirmed by drilling; together these form the training dataset. In contrast to mineralization, which is concentrated in a limited space, non-mineralization is widespread, so negative samples are selected to be distributed as randomly and uniformly as possible in wall rock without mineralization or alteration throughout the study area [33] (Figure 17).
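The sample construction described above can be sketched in a few lines. This is a minimal illustration only, assuming that voxels are addressed by integer grid indices and that a list of barren (unmineralized, unaltered) wall-rock voxels is already available; the function names, the toy grid, and the sample counts are hypothetical and are not taken from the study.

```python
import random

def sample_negatives(barren_voxels, n_negative, seed=0):
    """Draw negative samples uniformly at random, without replacement."""
    if n_negative > len(barren_voxels):
        raise ValueError("not enough barren voxels to sample from")
    rng = random.Random(seed)
    return rng.sample(barren_voxels, n_negative)

def build_training_set(ore_voxels, barren_voxels, n_negative, seed=0):
    """Return (voxel, label) pairs: 1 = mineralized, 0 = nonmineralized."""
    negatives = sample_negatives(barren_voxels, n_negative, seed)
    return [(v, 1) for v in ore_voxels] + [(v, 0) for v in negatives]

# Toy grid (illustrative): voxel indices (i, j, k).
ore = [(0, 0, 0), (0, 1, 0)]
barren = [(i, j, 1) for i in range(3) for j in range(3)]
training = build_training_set(ore, barren, n_negative=4)
```

Sampling without replacement from the full barren set gives every candidate voxel the same chance of selection, which is one simple way to approximate the "random and uniform" distribution of negatives.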

**Figure 17.** Distribution of positive samples and negative samples.


#### 4.5.2. Three-Dimensional MPM and Uncertainty Evaluation of MaxEnt Model

The MaxEnt method originated from statistical mechanics, and the MaxEnt software was developed by Phillips et al. in Java. This study uses version 3.4.1 of the MaxEnt software (https://biodiversityinformatics.amnh.org/open_source/maxent/ (accessed on 15 June 2022)) to carry out the 3D MPM.

The MaxEnt method estimates the probability distribution of the target variable that has maximum entropy subject to a set of constraints representing the incomplete information available about the target distribution. In mineralization prediction, the model's best interpretation of unknown occurrences is obtained by maximizing the entropy of the probability distribution used to estimate the location of orebodies, and many scholars have achieved good results with this approach [65,76,147].

When modeling with the MaxEnt software, improperly set model parameters may lead to overfitting or redundancy [148]. Overfitting can be controlled by the regularization multiplier *β* [149], and the best model performance is reported for *β* values between 2 and 4 [148,150]. Therefore, this study tested different values and selected *β* = 2 to reduce the influence of overfitting.
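For orientation, the regularized maximum entropy problem is commonly written as follows. The notation is an assumption introduced here and is not defined in this paper: $\pi$ is the estimated distribution over voxels $x$, $f_j$ are feature functions built from the prediction indicators, $\tilde{E}[f_j]$ is the empirical mean of $f_j$ over the $m$ positive samples, and $\beta_j$ is the regularization width of feature $j$, which is scaled by the multiplier *β*.

$$
\max_{\pi}\; H(\pi) = -\sum_{x} \pi(x)\,\ln \pi(x)
\quad \text{subject to} \quad
\left| E_{\pi}[f_j] - \tilde{E}[f_j] \right| \le \beta_j \;\; \text{for all } j .
$$

The solution takes the Gibbs form, and the weights $\lambda_j$ minimize an $\ell_1$-penalized negative log-likelihood:

$$
q_{\lambda}(x) = \frac{\exp\!\big(\sum_j \lambda_j f_j(x)\big)}{Z_{\lambda}},
\qquad
\min_{\lambda}\; -\frac{1}{m}\sum_{i=1}^{m} \ln q_{\lambda}(x_i) + \sum_j \beta_j\,|\lambda_j| .
$$

Under this formulation, a larger *β* widens every constraint and strengthens the penalty on $\lambda_j$, which yields a smoother model and is why the multiplier acts as a control on overfitting.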

Five prediction indicators are integrated into the MaxEnt model as input parameters. During simulation, data are randomly selected from the dataset, with 75% used as training data and 25% as test data. To reduce the randomness of the simulation results, the model is run 50 times with a convergence threshold of 0.00001. The maximum number of background points is set to 10,000, and the logistic output format is chosen because it is easier to interpret.
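The replicate scheme described above (random 75%/25% splits repeated 50 times, followed by averaging) can be sketched as below. This is not the MaxEnt software's internal procedure; it is a plain illustration of the resampling idea, and the function names and toy inputs are hypothetical.

```python
import random

def replicate_splits(n_samples, n_replicates=50, test_fraction=0.25, seed=0):
    """Yield (train_idx, test_idx) index lists for each random replicate."""
    rng = random.Random(seed)
    n_test = round(n_samples * test_fraction)
    for _ in range(n_replicates):
        idx = list(range(n_samples))
        rng.shuffle(idx)
        yield idx[n_test:], idx[:n_test]

def average_predictions(per_replicate):
    """Average per-voxel probabilities over replicates (element-wise mean)."""
    n = len(per_replicate)
    return [sum(vals) / n for vals in zip(*per_replicate)]

# Toy usage: 8 samples, 50 replicates; two replicate maps over two voxels.
splits = list(replicate_splits(n_samples=8))
mean_map = average_predictions([[0.2, 0.8], [0.4, 0.6]])
```

Averaging over replicates damps the effect of any single unlucky split on the final prospectivity values.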

The final prediction result of the MaxEnt model is the average of the 50 runs, and the contribution rate of each prediction indicator is shown in Table 4.



The AUC value is 0.844 for the test dataset and 0.848 for the training dataset, so the MaxEnt model has high accuracy in predicting mineral resources at depth, and the small gap between the two values suggests little overfitting (Figure 18).
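As a reminder of what the AUC measures, it equals the probability that a randomly chosen positive sample receives a higher score than a randomly chosen negative one. The sketch below computes it directly from that definition; the function name and toy values are illustrative, and the pairwise loop is quadratic in the sample count, so it is meant for explanation rather than for datasets of the size used here.

```python
def roc_auc(labels, scores):
    """AUC as the share of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as one half.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative sample")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Toy example: 3 positives, 2 negatives; 5 of the 6 pairs are ranked correctly.
auc = roc_auc([0, 0, 1, 1, 1], [0.1, 0.4, 0.35, 0.8, 0.9])
```

Computing the same quantity on the training and test subsets separately gives the two values compared in the text.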

The logistic probability output by the MaxEnt model ranges from 0.000804 to 0.927941; it is mapped to the interval from 0 to 1, and the 3D MPM is then formed (Figure 19).

Although the mineral prospectivity map shows a good relationship between high probabilities and known gold orebodies (Figure 19), it is difficult to fix a single logistic probability value as the prediction threshold.

We take the ratio of the predicted volume to the volume occupied by orebodies as a parameter. As expected, this ratio decreases as the logistic probability increases (Figure 20). The high potential area (logistic probability > 0.525) is defined by the logistic probability of 0.525, above which 80% of the known orebodies are covered; the medium potential area (0.3 < logistic probability < 0.525) is defined by the logistic probability of 0.3, which is another inflection point and above which all the known orebodies are covered (Figure 20). The spatial distribution of mineralization potential areas from the MaxEnt model is shown in Figure 21a, based on which two mineral exploration targets are circled (Figure 21b).
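The threshold analysis above can be illustrated with a short sketch. It assumes equal-volume voxels, so that voxel counts stand in for volumes, and it reports orebody coverage alongside the volume ratio. The function names, the toy data, and the label "low" for voxels below 0.3 are assumptions made here for illustration; only the thresholds 0.525 and 0.3 come from the text.

```python
def threshold_curve(probabilities, is_ore, thresholds):
    """For each threshold t, return (t, volume_ratio, ore_coverage).

    volume_ratio = predicted voxels (p > t) / known ore voxels
    ore_coverage = share of known ore voxels with p > t
    """
    n_ore = sum(is_ore)
    if n_ore == 0:
        raise ValueError("no known ore voxels")
    rows = []
    for t in thresholds:
        predicted = [p > t for p in probabilities]
        n_pred = sum(predicted)
        n_hit = sum(1 for flag, ore in zip(predicted, is_ore) if flag and ore)
        rows.append((t, n_pred / n_ore, n_hit / n_ore))
    return rows

def potential_class(p, high=0.525, medium=0.3):
    """Classify a voxel using the two thresholds reported in the text."""
    if p > high:
        return "high"
    if p > medium:
        return "medium"
    return "low"  # hypothetical label; the text names only two classes

# Toy data (illustrative only): six equal-volume voxels, three of them ore.
probs = [0.9, 0.6, 0.4, 0.35, 0.2, 0.1]
ore = [True, True, True, False, False, False]
curve = threshold_curve(probs, ore, [0.3, 0.525])
classes = [potential_class(p) for p in probs]
```

Plotting the volume ratio against the threshold and looking for inflection points corresponds to the kind of curve shown in Figure 20.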

**Table 4.** Contribution rate of prediction indicators.

| Prediction Indicator | Rate of Contribution (%) |
|---|---|
| Au | 77.6 |
| 30 m buffer zone | 13 |
| Au-Ag-Cu-Pb-Zn (B2) | 7.7 |
| As-Sb-Hg (B1)/W-Bi-Co-Mo (B3) | 0.8 |
| Hg-Sb (F4) | 0.9 |

**Figure 18.** ROC curve of MaxEnt model.

**Figure 19.** Three-dimensional MPM by MaxEnt model. (**a**) Three-dimensional MPM; (**b**) three-dimensional MPM with orebodies.

**Figure 20.** Logical probability versus the ratio of prediction volume to orebody occupied volume.

**Figure 21.** (**a**) The distribution of mineralization potential areas. (**b**) MaxEnt model-based exploration targets.

#### 4.5.3. Three-Dimensional MPM and Uncertainty Evaluation of GMM

When training and testing the model, the labeled data (which are used only in the evaluation) are divided into 75% for the training dataset and 25% for the test dataset. The GMM learns from the training dataset, and the ROC curve is then used to evaluate performance on the training dataset and the test dataset separately. The AUC value is 0.75 for both the test dataset and the training dataset (Figure 22).
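To make the unsupervised step concrete, the following is a minimal sketch of a two-component Gaussian mixture fitted by expectation-maximization on a single variable. It is a simplification under stated assumptions: the study uses five indicators, whereas this example is one-dimensional, and reading the posterior of the higher-mean component as a prospectivity score is only one plausible interpretation, not a description of the authors' implementation. All names and data are hypothetical.

```python
import math

def normal_pdf(x, mean, var):
    """Density of a univariate normal distribution."""
    return math.exp(-(x - mean) ** 2 / (2 * var)) / math.sqrt(2 * math.pi * var)

def fit_gmm_1d(xs, n_iter=100, min_var=1e-6):
    """Fit a two-component 1-D Gaussian mixture with EM (no labels used)."""
    n = len(xs)
    mean_all = sum(xs) / n
    var_all = max(sum((x - mean_all) ** 2 for x in xs) / n, min_var)
    weights = [0.5, 0.5]
    means = [min(xs), max(xs)]  # simple deterministic initialisation
    variances = [var_all, var_all]
    for _ in range(n_iter):
        # E-step: responsibility of each component for each sample.
        resp = []
        for x in xs:
            dens = [w * normal_pdf(x, m, v)
                    for w, m, v in zip(weights, means, variances)]
            total = sum(dens) or 1e-300  # guard against underflow to zero
            resp.append([d / total for d in dens])
        # M-step: re-estimate weights, means and variances.
        for k in range(2):
            nk = max(sum(r[k] for r in resp), 1e-12)
            weights[k] = nk / n
            means[k] = sum(r[k] * x for r, x in zip(resp, xs)) / nk
            var_k = sum(r[k] * (x - means[k]) ** 2 for r, x in zip(resp, xs)) / nk
            variances[k] = max(var_k, min_var)  # floor avoids collapse
    return weights, means, variances

def posterior_high(x, weights, means, variances):
    """Posterior probability that x belongs to the higher-mean component."""
    dens = [w * normal_pdf(x, m, v) for w, m, v in zip(weights, means, variances)]
    k = 0 if means[0] > means[1] else 1
    total = sum(dens)
    return dens[k] / total if total > 0 else 0.5

# Toy data (illustrative): a background cluster near 0 and an anomalous one near 5.
xs = [0.1, -0.2, 0.3, 0.0, -0.1, 4.8, 5.2, 5.1, 4.9, 5.0]
weights, means, variances = fit_gmm_1d(xs)
score = posterior_high(5.0, weights, means, variances)
```

Because the labels play no role in fitting, they remain available for the ROC evaluation described above.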


**Figure 22.** ROC curve of GMM.

Judged by the AUC value, 0.75 is not high, which may indicate that unsupervised training methods fall short in prediction with big data. However, judged by the mineral prospectivity map, the predicted area covers the orebodies well, so the model still has indicative value for mineralization prediction. Finally, two mineral exploration targets are delineated at depth (Figure 23).

**Figure 23.** Mineral resources prediction at large depth based on GMM.

#### **5. Discussion**

This study employed geostatistical interpolation to build a 3D geochemical model and a geochemical anomaly model. In addition to deterministic modeling of 3D geology and geochemistry [151,152], geostatistical techniques also include uncertainty modeling of the spatial distribution of subsurface heterogeneous structures and of the dynamic processes of fluid migration [153]. For 3D heterogeneous structures, the multipoint geostatistical method can overcome the weakness of traditional geostatistical simulation in delineating the geometric continuity of geological structures [154–159]. Meanwhile, traditional geostatistical simulation is limited by its heavy computation, complicated parameterization, and difficulty in characterizing multi-scale data. Applying machine learning and deep learning methods to reconstruct geological and geochemical structures can improve simulation efficiency and can accurately express complex heterogeneous spatial structures [160,161], which deserves further research.

In this study, the MaxEnt model and the GMM are applied to 3D MPM in the Zaozigou gold deposit. Compared with the GMM, the MaxEnt model detects ore-induced anomalies with higher precision, which indicates a more reliable 3D MPM (Figures 18 and 22). The prediction results of both methods correlate strongly with the known orebodies, and on this basis two mineral exploration targets are circled (Figures 21 and 23).

Target I of the MaxEnt model is located at an elevation of 1600 to 2000 m within the NE-trending orebody group and is likely an extension of the Au1 orebody. At this position, the Au and Sb concentrations (Figure 9) and the ratio of front halo to tail halo (Figure 16) both increase. In addition, the high logistic probability calculated by the MaxEnt model and the GMM at this position indicates that the Au1 orebody may extend deeper or that a concealed orebody exists there. Meanwhile, Target I of the GMM is located at an elevation of about 1300 m, reflecting the weak anomaly in the deep drill hole.

Target II of both methods is located in the NW-trending orebody group at an elevation of about 2500 m, where the fractures are complexly distributed and the anomalies of tail halo elements and front halo elements overlap (Figure 9).

#### **6. Conclusions**

In this paper, the three-dimensional primary halo anomaly data volume model is built based on the multifractal C-V model, which fully considers the nonlinear characteristics of the primary geochemical data. The C-V model is a three-dimensional extension of the two-dimensional multifractal method, according to which the geochemical concentrations are clearly illustrated at depth. The 3D geochemical anomaly data volume model provides an important element distribution indicator to the 3D MPM.

The data-driven CoDA method was carried out using the clr transformation and factor analysis, from which factor F4 is selected as a prediction indicator. The knowledge-driven CoDA method used the SBP approach to extract the element associations of the front halo, near-ore halo and tail halo; the near-ore halo association and the ratio of front halo to tail halo are selected as the other two prediction indicators. These geochemical association indicators are reliable because they reflect the metallogenic regularity well.

The results of this paper show that the MaxEnt model and the GMM are efficient machine learning methods for 3D MPM. Comparison with the spatial distribution of the orebodies and the indications of the metallogenic regularity suggests that the delineated mineral exploration targets can be regarded as mineral potential areas for further investigation. However, machine learning algorithms are fast and accurate on small datasets but generalize less well than deep learning algorithms on big data. As the amount of data increases, the prediction ability of the MaxEnt model and the GMM usually reaches a bottleneck, whereas deep learning can use more parameters to continuously optimize and improve the detection ability of the models. Deep learning-based 3D mineral prospectivity mapping of the Zaozigou gold deposit deserves more attention in further research.

**Author Contributions:** Ideas, Y.K., G.C. and B.L.; Methodology, Y.K., G.C., C.L. and Z.Y.; software, M.X., S.Z. and H.Z.; writing—original draft preparation, Y.K., G.C., Y.W. and Y.G.; writing—review and editing, G.C. and B.L.; visualization, Y.K., M.X., S.Z., H.Z., L.W. and R.T.; supervision, B.L.; funding acquisition, B.L. All authors have read and agreed to the published version of the manuscript.

**Funding:** This research was funded by the National Key Research and Development Program of China (Grants 2017YFC0601505); the National Natural Science Foundation of China (Grants 41602334; 42072322); the Key Laboratory of Geochemical Exploration, Ministry of Natural Resources (Grant AS2019P02-01); Sichuan Science and Technology Program (Grant 2022NSFSC0510); and the Opening Fund of the Geomathematics Key Laboratory of Sichuan Province (Grant scsxdz2020yb06, scsxdz2021zd04).

**Data Availability Statement:** Restrictions apply to the availability of these data. Data were obtained from the Development Research Center of China Geological Survey and No. 3 Geological and Mineral Exploration team, Gansu Provincial Bureau of Geology and Mineral Exploration and Development and are available from Bingli Liu with the permission of the Development Research Center of China Geological Survey and No. 3 Geological and Mineral Exploration team, Gansu Provincial Bureau of Geology and Mineral Exploration and Development.

**Acknowledgments:** The authors thank the anonymous reviewers and the editors for their hard work on this paper. We are grateful to the Development Research Center of China Geological Survey and No. 3 Geological and Mineral Exploration team, Gansu Provincial Bureau of Geology and Mineral Exploration and Development for their data support.

**Conflicts of Interest:** The authors declare no conflict of interest.

#### **References**

