Article

Inverse Design of Low-Resistivity Ternary Gold Alloys via Interpretable Machine Learning and Proactive Search Progress

1
Department of Chemistry, College of Sciences, Shanghai University, Shanghai 200444, China
2
Shanghai Shuzhiwei Information Technology Co., Ltd., 668 ShangDa Road, Shanghai 200444, China
*
Authors to whom correspondence should be addressed.
Materials 2024, 17(14), 3614; https://doi.org/10.3390/ma17143614
Submission received: 25 June 2024 / Revised: 15 July 2024 / Accepted: 17 July 2024 / Published: 22 July 2024

Abstract

Ternary gold alloys (TGAs) are highly regarded for their excellent electrical properties, and electrical resistivity is a crucial indicator of their electrical performance. To explore promising new TGAs with lower resistivity, we developed an inverse design approach that integrates machine learning techniques with the proactive searching progress (PSP) method. Among the models compared, support vector regression (SVR) performed best for resistivity prediction, yielding R² values of 0.73 and 0.77 on the training and test sets, respectively. Model interpretation indicated that lower electrical resistivity was associated with the following conditions: a van der Waals radius (Vrt) of 0, a second van der Waals radius descriptor (Vr) of less than 217, and a mass attenuation coefficient for MoKα (Macm) greater than 77.5 cm² g⁻¹. Applying the PSP method, we identified eight candidates, e.g., Au1.000Cu4.406Pt1.833 and Au1.000Pt2.232In1.502, whose resistivity was approximately 53–60% lower than that of the lowest-resistivity sample in the dataset. Finally, the pattern recognition method supported the low resistivity of the candidates.


1. Introduction

Ternary gold alloys (TGAs) are used in a diverse range of applications, including microelectronic devices, electrode materials, and electrocatalytic reactions [1,2,3]. In sectors related to electricity, electrical resistivity stands out as a critical performance metric. Usually, a lower electrical resistivity contributes to improved electricity transmission efficiency, reduced energy losses, decreased heat generation, and the mitigation of issues such as signal interference [4,5]. Historically, the identification of new alloy compositions has been a complex process, often relying on conventional trial-and-error experiments. However, for alloys containing precious metals, such as TGAs, trial-and-error experiments become cost-prohibitive and impractical, which may account for the scarcity of publications on this subject.
With the ongoing evolution of computer technology, a multitude of pertinent material design techniques have emerged, including quantum chemical computations, molecular dynamics simulations [6], density functional theory (DFT) [7,8], and Monte Carlo simulations [9]. These methods have since been extended to material resistivity calculations. For instance, Alfè et al. used collinear spin-polarized DFT to calculate the lattice resistivity of bcc iron [10], and the simulated estimates agreed with the experimental data. Yoshino et al. employed DFT and second-order Møller–Plesset perturbation theory (MP2) to analyze the crystal structures and resistivity of three distinct TMTSF (tetramethyltetraselenafulvalene) salts [11]. They proposed a computational method for determining the non-integer valence of TMTSF molecules within crystalline structures, and their results indicated that this method provided valence states of TMTSF molecules in the I3− salts consistent with their electrical properties. Zhang et al. used first-principles molecular dynamics (FPMD) and dynamical mean field theory (DMFT) to calculate the resistivity and thermal conductivity of Fe-Si alloys [12]. Raghuraman et al. used DFT to calculate the resistivity of high-entropy alloys [13].
While DFT calculations have proven effective in specific systems [14,15,16], applying them to alloy systems, which involve combinations of multiple elements, significantly increases computational complexity and demands substantial computational resources and time. Machine learning (ML) has proven to be a simple and efficient alternative that has been successfully applied to the design of alloy materials. For example, Roy et al. employed a generative adversarial network (GAN) model to design high-hardness multi-principal element alloys (MPEAs) [17]. Through a search in an 18-element space, one of the designed alloys exhibited a hardness (941 HV) 10% higher than that of the training data (857 HV). Deffrennes et al. developed a framework for predicting binary liquidus [18], utilizing data from 466 CALPHAD binary phase diagrams to establish three machine learning models for predicting the formation of liquid miscibility gaps, the equilibrium onset temperature of solidification, and the liquid miscibility gap temperature. Ma et al. combined the Non-dominated Sorting Genetic Algorithm II (NSGA-II) with virtual screening to optimize the composition of high-entropy alloys [19]. They ultimately recommended three candidate samples, all of which showed a significant improvement in Vickers hardness and compressive fracture strain compared to the original dataset. Feng et al. designed low-content Al-Mg-Si alloys that exhibited mechanical properties comparable to those of other series of aluminum alloys [20]. It is worth noting that low-Cu-content Al alloys tended to have better corrosion resistance. The successful application of ML techniques in these studies demonstrates their significant potential in alloy research. Although TGAs have been identified as suitable materials for various applications, the number of relevant studies on TGAs is extremely limited [3].
Recently, we used the SVR model to predict the electrical resistivity of TGAs and employed the high-throughput screening approach to design a series of candidate samples [21]. However, constrained by computational power and the impracticality of exhaustive enumeration, only a small fraction of the TGA chemical space has been explored. Therefore, more effort is still required to conduct more meaningful research into TGAs.
In this work, we introduced an inverse design framework that integrated an interpretable ML model and proactive searching progress (PSP) to design low-resistivity TGAs. First, 51 samples were collected from the literature to construct the dataset. Subsequently, feature engineering was conducted, including descriptor imputation, data preprocessing, and feature selection. After comparing a series of models, the support vector regression (SVR) model was selected for modeling and SHapley Additive exPlanations (SHAP) was used for model interpretation [22,23]. The self-developed PSP method was then employed to design new TGAs with low resistivity, and the candidates were validated using a pattern recognition method. This inverse design framework will provide significant guidance for discovering low-resistivity TGAs and can be applied to the design of other materials.

2. Method

The framework for the inverse design of low-resistivity TGAs is depicted in Figure 1. The dataset comprised 51 TGA samples paired with 111 generated atomic descriptors. Pearson correlation and a genetic algorithm (GA) [24] were used sequentially for feature selection to determine the optimal feature set. By comparing various algorithms, a well-fitted SVR model was established, and the model was interpreted using the SHAP method. The inverse design was carried out by integrating the SVR model with the PSP method to identify promising new TGAs. The search objective was a negative logarithm of resistivity, −lgρ, above 8, so as to exceed the highest value of 6.68 among the current samples by as much as possible. For the newly explored TGAs, pattern recognition was used for validation, and the results indicated that the new samples have the potential for lower resistivity.

2.1. Dataset

The 51 TGA samples were collected from studies accessible through the Materials Platform for Data Science (MPDS), with the majority featured in our previous work [21]. In these samples, Au occupies site A, and the B and C sites are occupied by 15 other elements. A total of 111 descriptors, populated from the Villars and Mendeleev databases, were used in the subsequent work.

2.2. Model Construction

Feature selection is important because it can improve model performance and reduce computational costs [25]. Of the 111 descriptors, Pearson correlation was first used to prune highly correlated features, leaving 80 descriptors. Subsequently, the GA method was used to determine the best feature subset from the remaining features. Various ML algorithms were compared to find the optimal model, including SVR, decision tree regression (DTR) [26], K-nearest neighbors (KNN) [27], random forest regression (RFR) [28], and gradient boosting regression (GBR) [29].

2.3. Evaluation Metrics

Leave-one-out cross-validation (LOOCV) is particularly suitable for studies with small datasets as it considers more possibilities for dividing the training and test sets, thus reducing bias from random divisions. This work is a typical example of small data machine learning, and using LOOCV effectively prevents model overfitting or underfitting caused by significant differences in dataset splits. In pursuit of a reasonable model, LOOCV was used on the training set to check the model’s robustness and fitness, and the test set was evaluated to assess the generalization ability. Additionally, the dataset was re-partitioned randomly 50 times for rebuilding the models to evaluate the model’s stability.
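The LOOCV procedure described above can be sketched in a few lines. This is a minimal illustration rather than the authors' code: the helper names (`loocv_predictions`, `fit_1nn`, `predict_1nn`) and the toy data are hypothetical, and a 1-nearest-neighbour regressor stands in for the SVR model so that the example needs only the standard library.

```python
def loocv_predictions(X, y, fit, predict):
    """Hold out each sample once, train on the rest, predict the held-out one."""
    preds = []
    for i in range(len(X)):
        X_train = X[:i] + X[i + 1:]
        y_train = y[:i] + y[i + 1:]
        model = fit(X_train, y_train)
        preds.append(predict(model, X[i]))
    return preds


# Stand-in learner (NOT the SVR used in the paper): 1-nearest-neighbour regression.
def fit_1nn(X_train, y_train):
    return list(zip(X_train, y_train))


def predict_1nn(model, x):
    def sq_dist(a, b):
        return sum((ai - bi) ** 2 for ai, bi in zip(a, b))

    _, nearest_y = min(model, key=lambda pair: sq_dist(pair[0], x))
    return nearest_y


# Hypothetical one-descriptor data with -lg(rho)-like targets.
X = [[0.0], [1.0], [2.5], [3.0], [4.5]]
y = [5.0, 5.2, 5.9, 6.1, 6.6]

preds = loocv_predictions(X, y, fit_1nn, predict_1nn)
print(preds)
```

Each sample is predicted by a model that never saw it, so the pooled predictions can be scored with the same metrics as an independent test set.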
The common metrics, the coefficient of determination (R²) and the root mean square error (RMSE), were used to indicate the performance of the model. R² measures how well the predictions reproduce the observations; its maximum value is 1, a value of 0 corresponds to always predicting the mean of the observations, and it becomes negative for models that perform worse than that. It is calculated as follows:
$$R^2 = 1 - \frac{\sum_i (\hat{y}_i - y_i)^2}{\sum_i (\bar{y} - y_i)^2}$$
where $\bar{y}$ is the mean of the observed values, $y_i$ is the observed value, and $\hat{y}_i$ is the predicted value. RMSE measures the overall error of the model; a smaller RMSE indicates more accurate predictions. It is calculated as follows:
$$\mathrm{RMSE} = \sqrt{\frac{1}{n}\sum_{i=1}^{n} (y_i - \hat{y}_i)^2}$$
where n is the number of samples, $y_i$ is the true value, and $\hat{y}_i$ is the predicted value.
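As a worked illustration of the two metrics, the short sketch below evaluates them on a handful of made-up values. The function names and the numbers are hypothetical and are not taken from the dataset.

```python
import math


def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((yp - yt) ** 2 for yt, yp in zip(y_true, y_pred))
    ss_tot = sum((mean_y - yt) ** 2 for yt in y_true)
    return 1.0 - ss_res / ss_tot


def rmse(y_true, y_pred):
    """Root mean square error over n samples."""
    n = len(y_true)
    return math.sqrt(sum((yt - yp) ** 2 for yt, yp in zip(y_true, y_pred)) / n)


# Hypothetical observed and predicted -lg(rho) values, for illustration only.
y_obs = [5.0, 5.5, 6.0, 6.5]
y_hat = [5.1, 5.4, 6.2, 6.3]

print(round(r2_score(y_obs, y_hat), 3))  # SS_res = 0.10, SS_tot = 1.25
print(round(rmse(y_obs, y_hat), 3))
```

Here the residual sum of squares is 0.10 and the total sum of squares is 1.25, so R² is 0.92 and RMSE is about 0.158.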

2.4. PSP

A well-established ML model captures the mapping between chemical composition and the target resistivity, which can facilitate the design of new materials. However, for TGAs, two elements in addition to Au must be chosen from the 15 elements (excluding Au) included in the dataset. Thus, with 15 choices for the B and C sites, there are $C_{15}^{1} \times C_{14}^{1}$ different elemental combinations. Given a mole fraction step of 0.01, the chemical space for TGAs comprises $C_{15}^{1} \times C_{14}^{1} \times 10^{6}$ different compositions, which is thousands of times larger than the one in our previous work. Obtaining model predictions for all of them would take an unaffordable amount of time, let alone for quaternary alloys.
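The size of this search space follows from simple counting. The sketch below reproduces the arithmetic; the variable names are illustrative, and the per-pair grid size of 10^6 is taken as stated in the text.

```python
from math import comb

n_elements = 15                                         # candidate elements for sites B and C (Au excluded)
pairs = comb(n_elements, 1) * comb(n_elements - 1, 1)   # 15 choices for B, then 14 for C
grid_per_pair = 10 ** 6                                 # compositions per element pair, as stated in the text
total = pairs * grid_per_pair

print(pairs)   # 210
print(total)   # 210000000
```

Roughly 2.1 × 10^8 model evaluations would therefore be needed for an exhaustive enumeration of the ternary space.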
In this study, the PSP method, proposed in our previous work [30], was employed to design new low-resistivity TGAs efficiently, in place of the traditional high-throughput screening method. The loss function of PSP was defined as the absolute error between the expected property and the property predicted by a high-accuracy model, i.e., $EE = |o_{\mathrm{expected}} - o_{\mathrm{predicted}}|$, where EE is the Expected Error. The core idea of the PSP method is to treat the composition as the set of parameters to be optimized and to adopt a computationally economical surrogate model, e.g., Gaussian process regression (GPR) [31], to replace the SVR model locally. Through this iterative process, TGAs that meet or closely match the predetermined resistivity criteria can be identified rapidly. Moreover, this inverse design method has already been applied successfully to materials design [32,33]. Hence, the PSP method could be used to identify potential low-resistivity TGAs directly, without evaluating all possible predictions.
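To make the idea of treating the composition as the optimization variable concrete, the following sketch minimizes the Expected Error for a toy predictor. It is not the PSP implementation of [30]: a plain random local search stands in for the surrogate-assisted optimizer, `predicted_property` is a made-up linear function rather than the trained SVR model, and all names are hypothetical. The bounds of 0 to 7 and the target of 8 mirror the values used later in this paper.

```python
import random


def predicted_property(b, c):
    """Toy stand-in for the trained model: maps (x_B, x_C) to -lg(rho)."""
    return 5.0 + 0.3 * b + 0.2 * c


def expected_error(target, b, c):
    """EE = |expected property - predicted property|."""
    return abs(target - predicted_property(b, c))


def local_search(target, start, bounds=(0.0, 7.0), steps=2000, sigma=0.5, seed=0):
    """Random local search: perturb the composition, keep it only if EE decreases."""
    rng = random.Random(seed)
    lo, hi = bounds
    best = start
    best_ee = expected_error(target, *best)
    for _ in range(steps):
        cand = tuple(min(hi, max(lo, v + rng.gauss(0.0, sigma))) for v in best)
        ee = expected_error(target, *cand)
        if ee < best_ee:
            best, best_ee = cand, ee
    return best, best_ee


start = (1.0, 1.0)
ee_start = expected_error(8.0, *start)       # 2.5 for the toy model
best, ee_best = local_search(8.0, start)
print(best, ee_best)
```

The point of the sketch is the loop structure: only compositions proposed by the optimizer are evaluated, instead of every point on the composition grid.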

2.5. Pattern Recognition Validation

Pattern recognition is the process of utilizing computer algorithms to identify patterns and regularities within data. Its applications span a wide range of fields, including image recognition, data analysis, structural health monitoring, and materials science, among others. There are various methods within pattern recognition, such as principal component analysis (PCA) [34,35], partial least squares (PLS) [36], Fisher projection [37], and many more. In this study, we used pattern recognition to analyze the 2-dimensional tendency of the TGA samples and determine whether the designed samples were localized around the samples with low resistivity.

3. Results and Discussions

3.1. Data Preparation

In our study, the dataset was composed of 51 TGA samples with experimental resistivity measurements [21]. As shown in Figure 2a, the resistivity of most samples in the dataset is centered around 1.0 × 10⁻⁵ Ω·m. A base-10 logarithmic transformation was applied to the resistivity values so that the dataset more closely resembles a normal distribution, with the aim of enhancing the model’s performance [38]; the transformed distribution, plotted in Figure 2b, is approximately normal. The average value of −lgρ is 5.68, with a standard deviation of 0.58. Following this preprocessing step, the dataset was partitioned in a 4:1 ratio, with 80% used for training and 20% reserved as an independent test set. This partitioning scheme facilitated the subsequent modeling and the evaluation of the model’s generalization capabilities.

3.2. Feature Selection

We populated the dataset with 111 atomic descriptors sourced from the Villars and Mendeleev databases; the descriptor details can be seen in Table S1. The features were standardized using the standardization method from the ‘preprocessing’ module of scikit-learn to avoid the effects of different scales. Feature selection plays a pivotal role in ML processes, as the judicious choice of features forms the foundation of model accuracy. To mitigate the risk of dimensionality issues and overfitting, it is essential to reduce the number of features as much as possible.
This study employed a two-step approach to feature selection. First, features were pre-filtered by setting the threshold on the Pearson correlation coefficient between features to 0.95, which retained 80 features. Subsequently, GA was utilized for feature selection. It is worth noting that GA-based selection is a wrapper-type method and must be coupled with a learning algorithm to score candidate subsets. We comprehensively evaluated the LOOCV results on the training set and the results on the test set, as shown in Figure 3a,b, where Figure 3a shows the GA combined with the SVR algorithm as an example. The features within the red box performed well on both the training set and the test set, so we selected these nine features for further analysis. The correlation among these nine features is illustrated in Figure 3c, which shows no strong inter-feature correlations. The meanings of these nine features are given in Table 1.
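The first filtering step can be illustrated with a small sketch. The source does not say which member of a highly correlated pair is discarded, so the greedy rule below (keep a feature only if its absolute Pearson correlation with every already-kept feature is at most 0.95) is an assumption, and the feature names and values are hypothetical.

```python
import math


def pearson(a, b):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((y - mb) ** 2 for y in b)
    return cov / math.sqrt(va * vb)   # undefined for constant columns


def prune_correlated(columns, threshold=0.95):
    """Greedy filter: keep a feature only if |r| <= threshold with all kept features."""
    kept = []
    for name in columns:
        if all(abs(pearson(columns[name], columns[k])) <= threshold for k in kept):
            kept.append(name)
    return kept


# Hypothetical descriptor columns (one list of values per feature).
features = {
    "f1": [1.0, 2.0, 3.0, 4.0, 5.0],
    "f2": [2.1, 4.0, 6.2, 8.1, 9.9],   # nearly proportional to f1
    "f3": [5.0, 1.0, 4.0, 2.0, 3.0],
}

kept = prune_correlated(features)
print(kept)   # ['f1', 'f3']
```

Here `f2` is dropped because its correlation with `f1` is about 0.999, while `f3` (r = −0.3 with `f1`) is retained.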

3.3. Model Construction and Evaluation

As seen in Figure 3d, in addition to the SVR model, we explored a range of other algorithms, including DTR, KNN, RFR, and GBR, using the same feature selection procedure as described above. The SVR model performed best, achieving the highest LOOCV R² value of 0.72 and the lowest RMSE of 0.288 on the training dataset. We also performed an analysis of variance (ANOVA) on the absolute values of the LOOCV residuals of these algorithms and conducted a post hoc analysis using the Tukey HSD test; details can be found in the Supplementary Materials. The ANOVA and Tukey HSD results indicate no statistically significant difference in prediction among the algorithms used in our study. However, since the SVR model showed relatively better prediction metrics, we proceeded with the SVR algorithm for the subsequent modeling. Additionally, because the CAS feature might not carry significant physical meaning, we considered removing it. With CAS removed and the remaining eight descriptors used, the R² value dropped to 0.464, the R value decreased to 0.736, the RMSE became 0.399, and the MSE became 0.159. Because the model performance declined, we decided to retain the CAS feature. The specific meanings and methods of feature generation related to CAS can be found in the Supplementary Materials.
Once the model was selected, fine-tuning its parameters became pivotal for enhancing modeling performance. In the case of the SVR model, this study employed a grid search method for parameter optimization. With these optimized parameters, the R² value reached 0.73 during LOOCV (Figure 4a), confirming the validity of our parameter choices.
Model generalization refers to how well the model performs on data outside the training set. In this study, we assessed the model’s generalization ability using an independent test dataset (Figure 4b). The R² and RMSE values on the testing dataset were 0.77 and 0.332, respectively, which are close to the results on the training dataset. This confirms that the model possesses satisfactory generalization capability.
To assess the stability of our model, we randomly divided the dataset into training and test sets in a 4:1 ratio and repeated this process 50 times for modeling. The results of the model stability test are shown in Table 2, and more details can be found in Table S2 of the Supplementary Materials. The average R² value of LOOCV was 0.68, with an average R of 0.84 and an average RMSE of 0.308. The average R² value on the test set was 0.69, with an average R of 0.87 and an average RMSE of 0.345. These findings demonstrated that the model exhibits good stability.

3.4. Model Interpretation

Model interpretation can help us better understand the model. SHAP was employed to interpret the model in this study. Figure 5a displays the importance of the ranked descriptors. In the depicted graph, the feature importance is arranged in descending order as follows: Vrt, Vr, Macm, and other features. To uncover more valuable patterns, we partitioned the SHAP values of the most important features into positive and negative regions, as depicted in Figure 5b–d. The color of the scatter plot, whether blue or red, represents the level of influence, which illustrates the main correlations between each feature of the SVR model and the target value.
Vrt refers to the van der Waals radius calculated by Mantina et al. [41]. As shown in Figure 5b, when Vrt equaled 0, the corresponding SHAP values were positive, and when it exceeded 0, the SHAP values became negative, indicating a negative correlation between Vrt and the SHAP values. In other words, given that the modeled target is −lgρ, Vrt was negatively correlated with −lgρ, and samples with Vrt equal to 0 tended to have lower ρ. In our dataset, samples with lower electrical resistivity, such as AuCu4Lu, AuCu0.25V0.013, and AuLu0.5In0.5, had electrical resistivity values of 2.09 × 10⁻⁷ Ω·m, 2.45 × 10⁻⁷ Ω·m, and 4.09 × 10⁻⁷ Ω·m, and their Vrt values were all equal to 0.
Vr represents another kind of van der Waals radius, defined by Bondi et al. [39] and Mantina et al. [41]. When Vr was less than 217, the corresponding SHAP values were primarily positive, and when it exceeded 217, the SHAP values became negative, indicating a negative correlation between Vr and the SHAP values. In other words, Vr was negatively correlated with −lgρ, so smaller Vr values were associated with lower electrical resistivity. For example, in the dataset, the electrical resistivity values of AuCu4.0Lu, AuCu0.25V0.013, and AuNd0.50Ge are 2.09 × 10⁻⁷ Ω·m, 2.45 × 10⁻⁷ Ω·m, and 2.04 × 10⁻⁵ Ω·m, while their Vr values are 203.67, 210.37, and 217.8, respectively.
Macm stands for the mass attenuation coefficient for MoKα. As shown in Figure 5c, when Macm exceeded 77.5 cm² g⁻¹, the SHAP values were positive, indicating a positive correlation between Macm and the SHAP values. Thus, when Macm exceeded 77.5 cm² g⁻¹, the electrical resistivity of TGAs tended to be lower. For AuLu0.5In0.5, AuY0.5In0.5, and AuCeGe, the electrical resistivity values are 2.51 × 10⁻⁷ Ω·m, 2.98 × 10⁻⁷ Ω·m, and 1.91 × 10⁻⁵ Ω·m, with Macm values of 86.88, 89.83, and 76.00, respectively.
From the analysis above, we can conclude that when Vrt = 0, Vr < 217, and Macm > 77.5 cm² g⁻¹, the ρ of TGAs tended to be lower (i.e., −lgρ tended to be higher). Additionally, the research of Zefirov and Zorkii indicates that van der Waals radii influence the packing within crystals [42], which significantly impacts their electrical conductivity.
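The two numeric thresholds can be checked directly against the example samples quoted in this section. The sketch below does only that; the dictionary and function names are illustrative, and the Vr and Macm values are those listed in the text.

```python
VR_THRESHOLD = 217.0      # Vr below this value was associated with lower resistivity
MACM_THRESHOLD = 77.5     # Macm (cm^2/g) above this value was associated with lower resistivity

# Descriptor values quoted in the text for the example samples.
vr_examples = {"AuCu4.0Lu": 203.67, "AuCu0.25V0.013": 210.37, "AuNd0.50Ge": 217.8}
macm_examples = {"AuLu0.5In0.5": 86.88, "AuY0.5In0.5": 89.83, "AuCeGe": 76.00}


def meets_vr_rule(vr):
    return vr < VR_THRESHOLD


def meets_macm_rule(macm):
    return macm > MACM_THRESHOLD


vr_flags = {name: meets_vr_rule(v) for name, v in vr_examples.items()}
macm_flags = {name: meets_macm_rule(v) for name, v in macm_examples.items()}

print(vr_flags)
print(macm_flags)
```

In both groups the two samples with resistivity on the order of 10⁻⁷ Ω·m satisfy the rule, while the sample on the order of 10⁻⁵ Ω·m (AuNd0.50Ge and AuCeGe, respectively) does not.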

3.5. Model Application

3.5.1. Proactive Search Process

In pursuit of the exploration and design of TGAs characterized by low resistivity, we employed the self-developed PSP method as outlined in reference [30]. High-throughput screening typically involves generating a large number of virtual samples for prediction and then selecting samples that meet the desired criteria based on the predicted target values. PSP leverages a meticulously trained model to pinpoint the samples within the designated compositional space that closely approximate the predefined target values. Notably, this approach, in contrast to forward design methodologies, offers substantial time savings and enhanced efficacy in uncovering samples that align with the specified criteria.
The compositional space fixed site A as Au with a proportion of 1, while sites B and C could host any of the other elements in the dataset, with proportions varying from 0 to 7. Given that the maximal value of −lgρ within the dataset was 6.68, we set the −lgρ target to 8, signifying the pursuit of samples with the lowest feasible resistivity. Through the proactive search, we identified a series of virtual candidates whose −lgρ exceeded 7. Among these, we selected eight candidates (Table 3) whose element combinations differ from one another, as references for prospective experiments. As seen in Table 3, the −lgρ values of these eight candidates ranged from 7.0074 to 7.0733, higher than those of the original samples. Accordingly, their ρ ranged from 8.45 × 10⁻⁸ to 9.83 × 10⁻⁸ Ω·m, approximately 53–60% lower than the lowest ρ of 20.9 × 10⁻⁸ Ω·m (−lgρ = 6.68) in the dataset, and also 18–30% lower than the ρ of 12.7 × 10⁻⁸ Ω·m in our previous work.
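The conversion from −lgρ back to ρ and the quoted percentage reductions can be verified with a few lines of arithmetic. The helper names below are hypothetical; the numerical inputs are the values given in the text.

```python
def rho_from_neg_log(neg_lg_rho):
    """Invert the transform: rho = 10 ** (-(-lg rho)), in ohm metres."""
    return 10.0 ** (-neg_lg_rho)


def percent_reduction(rho_new, rho_ref):
    """Relative decrease of rho_new with respect to rho_ref, in percent."""
    return 100.0 * (1.0 - rho_new / rho_ref)


rho_ref = 20.9e-8                      # lowest resistivity in the dataset (-lg rho = 6.68)
rho_low = rho_from_neg_log(7.0733)     # best candidate
rho_high = rho_from_neg_log(7.0074)    # weakest of the eight candidates

print(f"{rho_low:.3e}  {rho_high:.3e}")
print(f"{percent_reduction(rho_low, rho_ref):.1f}%  {percent_reduction(rho_high, rho_ref):.1f}%")
```

The candidates come out at about 8.45 × 10⁻⁸ and 9.83 × 10⁻⁸ Ω·m, that is, roughly 60% and 53% below the dataset minimum, in line with the range stated above.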

3.5.2. Pattern Recognition

In this study, we employed the Fisher projection method to project the samples from high-dimensional space into two-dimensional space, where the sample projections were plotted by FIS(1) and FIS(2) derived from the linear combinations of the original features:
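$$\mathrm{FIS}(1) = -0.05617\,\mathrm{RB} + 0.05980\,\mathrm{RC} - 0.01009\,\mathrm{Em} + 0.003075\,\mathrm{Macm} + 2.098\,D + 0.01701\,\mathrm{CAS} - 0.08600\,\mathrm{Vr} - 0.002259\,\mathrm{Vrb} - 0.01346\,\mathrm{Vrt} + 12.415$$
$$\mathrm{FIS}(2) = +0.4580\,\mathrm{RB} - 0.1134\,\mathrm{RC} - 0.09091\,\mathrm{Em} + 0.02982\,\mathrm{Macm} - 1.517\,D - 0.0002584\,\mathrm{CAS} - 0.01656\,\mathrm{Vr} + 0.002266\,\mathrm{Vrb} - 0.004873\,\mathrm{Vrt} + 6.050$$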
As seen in Figure 6, the red and blue colors represent higher and lower −lgρ values, respectively. A distinct distribution can be observed: the red samples are mostly located near the top right corner of the 2D scatter plot, while the blue samples are localized around the bottom left corner, indicating the regions of TGAs with higher and lower −lgρ values. The candidate samples designed from the PSP results (highlighted in green) are located on the right side of the plot, closer to the TGA samples with higher −lgρ values. This finding further supports the feasibility of PSP in designing ternary gold alloys with low electrical resistivity.

3.5.3. Data Visualization

Data visualization is the representation of data in the form of images, charts, graphs, or other visual formats, making it easier to comprehend trends and patterns within the data [43,44]. In this study, we employed data visualization to better understand the trends and patterns in the electrical resistivity of the samples generated through PSP.
As shown in Figure 7, certain trends could be observed among the PSP-generated samples: samples with Vrt equal to 0 tended to have lower resistivity, samples with lower resistivity were more concentrated around Vr near 217, and samples with lower resistivity were more frequent when Macm equaled 77.5. These patterns aligned with the findings in Section 3.4 of our study.
Furthermore, we noticed that among the PSP-generated samples, there was a higher number of samples surpassing the lowest resistivity observed in the original dataset (sample details can be found in the supplementary information). This illustrated the superiority of the PSP approach in reverse materials design compared to traditional virtual screening methods.

3.6. Comparison with Related Work

Previously, Wang et al. also employed machine learning methods [21], using virtual screening to design low-resistivity TGA materials. They designed 10 candidate materials, among which the one with the lowest resistivity was Au0.95Lu0.01In0.04, with a resistivity of 1.20 × 10⁻⁷ Ω·m (−lgρ = 6.9217). In our study, we used the PSP method, which not only eliminated the need to search the material space exhaustively but also yielded a candidate (Au1.000Cu4.406Pt1.833) with a minimum resistivity of 8.45 × 10⁻⁸ Ω·m, nearly 30% lower than that of Au0.95Lu0.01In0.04. Furthermore, compared with their candidates (AuxLuyInz), ours are more diverse in elemental composition, including Pt, In, Ga, and Cu in addition to Au. However, neither their work nor ours involved experimental synthesis.

4. Conclusions and Outlooks

In this study, we propose an inverse design framework for designing ternary gold alloys with low electrical resistivity.
(1)
Based on 51 TGA samples, an SVR model was established to predict electrical resistivity after comparison with counterpart models; its LOOCV R² and RMSE were 0.73 and 0.281, respectively. The independent test set also yielded an R² and RMSE of 0.77 and 0.332, indicating the good fit of the SVR model.
(2)
A series of samples was designed through the utilization of our self-developed PSP method, and eight candidates were selected. The results of pattern recognition further demonstrated the feasibility of our sample design method.
(3)
The SHAP method was introduced to indicate that lower electrical resistivity occurs when Vrt equals 0, Vr is less than 217, and Macm is greater than 77.5 cm² g⁻¹.
Our model can predict the resistivity of gold alloys with compositions different from those within the dataset, demonstrating its transferability to some extent. However, due to significant differences between materials, we are currently unable to validate the model’s transferability to other materials. Nonetheless, we will continue researching in this direction. Finally, our reverse design framework is applicable to other materials, and we encourage interested researchers to explore further with experimental validation.

Supplementary Materials

The following supporting information can be downloaded at: https://www.mdpi.com/article/10.3390/ma17143614/s1, Table S1: 111 features filled with data from the Villars and Mendeleev databases, Table S2: Repeatedly randomly partitioning the training and test sets to validate the model’s robustness, Table S3: ANOVA results, Table S4: Tukey HSD test results and PSP samples. Reference [45] is cited in the Supplementary Materials.

Author Contributions

Methodology, T.L.; Software, T.L.; Validation, H.C.; Investigation, H.C. and S.C.; Writing—original draft, H.C.; Writing—review and editing, M.L. and W.L.; Funding acquisition, M.L. and W.L. All authors have read and agreed to the published version of the manuscript.

Funding

This work was supported by the National Key Research and Development Program of China (2022YFB3707800), the Key Program of Science and Technology of Yunnan Province (202302AB080022), the Major Science and Technology Projects of Yunnan Precious Metals Laboratory (No. YPML-2023050205), and the Yunnan Precious Metals Laboratory Science and Technology Plan Project (No. YPML-2023050208, No. 2023050280).

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

The dataset and code are available on GitHub in the repository chehang228/PSP.

Conflicts of Interest

Author Tian Lu was employed by the company Shanghai Shuzhiwei Information Technology Co., Ltd. The remaining authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

References

  1. Bonfil, Y.; Brand, M.; Kirowa-Eisner, E. Characteristics of Subtractive Anodic Stripping Voltammetry of Lead, Cadmium and Thallium at Silver-Gold Alloy Electrodes. Electroanalysis 2003, 15, 1369–1376.
  2. Haochun, T.; Chun-Yi, C.; Tso-Fu Mark, C.; Takashi, N.; Daisuke, Y.; Toshifumi, K.; Katsuyuki, M.; Katsuyuki, M.; Kazuya, M.; Masato, S. Au–Cu Alloys Prepared by Pulse Electrodeposition toward Applications as Movable Micro-Components in Electronic Devices. J. Electrochem. Soc. 2018, 165, D58.
  3. Liu, H.; Xue, S.; Tao, Y.; Long, W.; Zhong, S. Design and solderability characterization of novel Au–30Ga solder for high-temperature packaging. J. Mater. Sci. Mater. Electron. 2020, 31, 2514–2522.
  4. Gao, R.; Wen, S.; Li, A.; Zhang, H.; Du, W.; Deng, B. A novel low-resistance damper for use within a ventilation and air conditioning system based on the control of energy dissipation. Build. Environ. 2019, 157, 205–214.
  5. Ran, G.; Ran, G.; Zhiyu, F.; Angui, L.; Kaikai, L.; Zhigang, Y.; Beihua, C. A novel low-resistance tee of ventilation and air conditioning duct based on energy dissipation control. Appl. Therm. Eng. 2017, 132, 790–800.
  6. Rapaport, D.C. The Art of Molecular Dynamics Simulation; Cambridge University Press: Cambridge, UK, 2004.
  7. Kohn, W.; Sham, L.J. Self-consistent equations including exchange and correlation effects. Phys. Rev. 1965, 140, A1133.
  8. Hohenberg, P.; Kohn, W. Inhomogeneous electron gas. Phys. Rev. 1964, 136, B864.
  9. Binder, K.; Heermann, D.W.; Binder, K. Monte Carlo Simulation in Statistical Physics; Springer: Berlin/Heidelberg, Germany, 1992.
  10. Alfè, D.; Pozzo, M.; Desjarlais, M.P. Lattice electrical resistivity of magnetic bcc iron from first-principles calculations. Phys. Rev. B 2012, 85, 024102.
  11. Yoshino, H.; Iwasaki, Y.; Tanaka, R.; Tsujimoto, Y.; Matsuoka, C. Crystal Structures and Electrical Resistivity of Three Exotic TMTSF Salts with I3−: Determination of Valence by DFT and MP2 Calculations. Crystals 2020, 10, 1119.
  12. Zhang, Y.; Luo, K.; Hou, M.; Driscoll, P.; Salke, N.P.; Minár, J.; Prakapenka, V.B.; Greenberg, E.; Hemley, R.J.; Cohen, R.E.; et al. Thermal conductivity of Fe-Si alloys and thermal stratification in Earth’s core. Proc. Natl. Acad. Sci. USA 2022, 119, e2119001119.
  13. Raghuraman, V.; Wang, Y.; Widom, M. An investigation of high entropy alloy conductivity using first-principles calculations. Appl. Phys. Lett. 2021, 119, 121903.
  14. Burke, K. Perspective on density functional theory. J. Chem. Phys. 2012, 136, 150901.
  15. Verma, P.; Truhlar, D.G. Status and challenges of density functional theory. Trends Chem. 2020, 2, 302–318.
  16. Cohen, A.J.; Mori-Sánchez, P.; Yang, W. Challenges for density functional theory. Chem. Rev. 2012, 112, 289–320.
  17. Roy, A.; Hussain, A.; Sharma, P.; Balasubramanian, G.; Taufique, M.F.N.; Devanathan, R.; Singh, P.; Johnson, D.D. Rapid discovery of high hardness multi-principal-element alloys using a generative adversarial network model. Acta Mater. 2023, 257, 119177.
  18. Deffrennes, G.; Terayama, K.; Abe, T.; Ogamino, E.; Tamura, R. A framework to predict binary liquidus by combining machine learning and CALPHAD assessments. Mater. Design 2023, 232, 112111.
  19. Ma, Y.; Li, M.; Mu, Y.; Wang, G.; Lu, W. Accelerated Design for High-Entropy Alloys Based on Machine Learning and Multiobjective Optimization. J. Chem. Inf. Model. 2023, 63, 6029–6042.
  20. Feng, X.; Wang, Z.; Jiang, L.; Zhao, F.; Zhang, Z. Simultaneous enhancement in mechanical and corrosion properties of Al-Mg-Si alloys using machine learning. J. Mater. Sci. Technol. 2023, 167, 1–13.
  21. Wang, X.; Lu, T.; Zhou, W.; Ji, X.; Lu, W.; Yang, J. Accelerated Discovery of Ternary Gold Alloy Materials with Low Resistivity via an Interpretable Machine Learning Strategy. Chem. Asian J. 2022, 17, e202200771. [Google Scholar] [CrossRef]
  22. Niu, B.; Lu, W.-C.; Yang, S.-S.; Cai, Y.-D.; Li, G.-Z. Support vector machine for SAR/QSAR of phenethyl-amines. Acta Pharmacol. Sin. 2007, 28, 1075–1086. [Google Scholar] [CrossRef]
  23. Lundberg, S.M.; Lee, S.-I. A unified approach to interpreting model predictions. Adv. Neural Inf. Process. Syst. 2017, 30. [Google Scholar]
  24. Hsu, W.H. Genetic Algorithms; Department of Computing and Information Sciences, Kansas State University: Manhattan, KS, USA, 2004; Volume 234. [Google Scholar]
  25. Guyon, I.; Elisseeff, A. An introduction to variable and feature selection. J. Mach. Learn. Res. 2003, 3, 1157–1182. [Google Scholar]
  26. Quinlan, J.R. Induction of decision trees. Mach. Learn. 1986, 1, 81–106. [Google Scholar] [CrossRef]
  27. Cover, T.; Hart, P. Nearest neighbor pattern classification. IEEE Trans. Inf. Theory 1967, 13, 21–27. [Google Scholar] [CrossRef]
  28. Breiman, L. Random Forests. Mach. Learn. 2001, 45, 5–32. [Google Scholar] [CrossRef]
  29. Chen, T.; Guestrin, C. Xgboost: A scalable tree boosting system. In Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, San Francisco, CA, USA, 13–17 August 2016; pp. 785–794. [Google Scholar]
  30. Lu, T.; Li, H.; Li, M.; Wang, S.; Lu, W. Inverse design of hybrid organic–inorganic perovskites with suitable bandgaps via proactive searching progress. ACS Omega 2022, 7, 21583–21594. [Google Scholar] [CrossRef] [PubMed]
  31. Rasmussen, C.E.; Williams, C.K. Gaussian Processes for Machine Learning; Springer: Berlin/Heidelberg, Germany, 2006. [Google Scholar]
  32. Xu, P.; Lu, T.; Ji, X.; Li, M.; Lu, W. Machine Learning Combined with Weighted Voting Regression and Proactive Searching Progress to Discover ABO3-δ Perovskites with High Oxide Ionic Conductivity. J. Phys. Chem. C 2023, 127, 17096–17108. [Google Scholar] [CrossRef]
  33. Wu, Y.; Shang, Z.; Lu, T.; Zhou, W.; Li, M.; Lu, W. Target-directed discovery for low melting point alloys via inverse design strategy. J. Alloys Compd. 2024, 971, 172664. [Google Scholar] [CrossRef]
  34. Jolliffe, I.T.; Cadima, J. Principal component analysis: A review and recent developments. Philos. Trans. R. Soc. A 2016, 374, 20150202. [Google Scholar] [CrossRef]
  35. Wold, S.; Esbensen, K.; Geladi, P. Principal component analysis. Chemometr. Intell. Lab. Syst. 1987, 2, 37–52. [Google Scholar] [CrossRef]
  36. Chen, C.; Cao, X.; Tian, L. Partial least squares regression performs well in MRI-based individualized estimations. Front. Neurosci. 2019, 13, 1282. [Google Scholar] [CrossRef] [PubMed]
  37. Belhumeur, P.N.; Hespanha, J.P.; Kriegman, D.J. Eigenfaces vs. Fisherfaces: Recognition using class specific linear projection. IEEE Trans. Pattern Anal. Mach. Intell. 1997, 19, 711–720. [Google Scholar] [CrossRef]
  38. Wu, M.; Tikhonov, E.; Tudi, A.; Kruglov, I.; Hou, X.; Xie, C.; Pan, S.; Yang, Z. Target-Driven Design of Deep-UV Nonlinear Optical Materials via Interpretable Machine Learning. Adv. Mater 2023, 35, 2300848. [Google Scholar] [CrossRef] [PubMed]
  39. Bondi, A. van der Waals Volumes and Radii. J. Phys. Chem. 1964, 68, 441–451. [Google Scholar] [CrossRef]
  40. Rowland, R.S.; Taylor, R. Intermolecular Nonbonded Contact Distances in Organic Crystal Structures: Comparison with Distances Expected from van der Waals Radii. J. Phys. Chem. C 1996, 100, 7384–7391. [Google Scholar] [CrossRef]
  41. Mantina, M.; Chamberlin, A.C.; Valero, R.; Cramer, C.J.; Truhlar, D.G. Consistent van der Waals radii for the whole main group. J. Phys. Chem. A 2009, 113, 5806–5812. [Google Scholar] [CrossRef]
  42. Zefirov, Y.V.; Zorkii, P. Van der Waals radii and their application in chemistry. Russ. Chem. Rev 1989, 58, 421. [Google Scholar] [CrossRef]
  43. Heer, J.; Shneiderman, B. Interactive dynamics for visual analysis: A taxonomy of tools that support the fluent and flexible use of visualizations. Queue 2012, 10, 30–55. [Google Scholar] [CrossRef]
  44. Heer, J.; Mackinlay, J.; Stolte, C.; Agrawala, M. Graphical Histories for Visualization: Supporting Analysis, Communication, and Evaluation. IEEE T. Vis. Comput. Gr. 2008, 14, 1189–1196. [Google Scholar] [CrossRef]
  45. Zhang, S.; Lu, T.; Xu, P.; Tao, Q.; Li, M.; Lu, W. Predicting the Formability of Hybrid Organic–Inorganic Perovskites via an Interpretable Machine Learning Strategy. J. Phys. Chem. Lett. 2021, 12, 7423–7430. [Google Scholar] [CrossRef]
Figure 1. Workflow of machine learning.
Figure 2. Frequency distribution histograms of (a) the distribution of the original dataset and (b) the distribution of the dataset after taking the negative logarithm.
Figure 3. Feature selection and model comparison. (a) The R2 and RMSE of each feature subset in the training set. (b) The R2 and RMSE of each feature subset in the test set. (c) Correlation of 9 features. (d) The RMSE and R2 values for each algorithm on the training set using LOOCV. Note that the red frames represent selected features.
Figure 4. Model performance after parameter optimization. (a) The LOOCV results of GA-SVR using the optimized parameters on the training dataset. (b) Predicted values and actual values on the independent test set and training set.
Figure 5. Explaining models using SHAP. (a) Feature importance ranking, (b) SHAP values of Vrt, (c) SHAP values of Vr, and (d) SHAP values of Macm. Note that red is positive and blue is negative.
Figure 6. Scatter plot for pattern recognition, where red represents positive samples, blue represents negative samples, and green represents designed samples.
Figure 7. Visualization of descriptors: (a) Vrt, (b) Vr, (c) Macm, and (d) D. Note that the horizontal dashed line represents the minimum resistivity value in the dataset (−lgρ = 6.68).
Table 1. The meanings and abbreviations of the 9 features used for model construction.
Feature   Description
RB        Proportion of element B
RC        Proportion of element C
Em        Enthalpy of melting (kJ mol−1)
Macm      Mass attenuation coefficient for MoKα (cm2 g−1)
D         Distance valence electron (Schubert) (Å)
CAS       CAS number
Vr        van der Waals radius [39,40,41]
Vrb
Vrt
Table 2. Results of the model stability test.
            LOOCV                   TEST
            R2      RMSE    R       R2      RMSE    R
average     0.68    0.31    0.84    0.69    0.34    0.87
σ           0.052   0.034   0.027   0.052   0.049   0.033
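As a rough illustration of how a stability summary like Table 2 can be assembled, the sketch below computes R2, RMSE, and the Pearson correlation R for several repeated evaluations and then takes the column-wise mean and standard deviation. The data values, the function names `regression_metrics` and `summarize`, and the use of the sample standard deviation for σ are all assumptions made for illustration; the paper's exact definitions and number of repetitions are not stated in this section.

```python
import math
import statistics

def regression_metrics(actual, predicted):
    """Return (R2, RMSE, Pearson R) for paired actual/predicted values."""
    n = len(actual)
    mean_a = sum(actual) / n
    mean_p = sum(predicted) / n
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_a) ** 2 for a in actual)
    cov = sum((a - mean_a) * (p - mean_p) for a, p in zip(actual, predicted))
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    r2 = 1.0 - ss_res / ss_tot          # coefficient of determination
    rmse = math.sqrt(ss_res / n)        # root mean squared error
    r = cov / math.sqrt(ss_tot * var_p) # Pearson correlation coefficient
    return r2, rmse, r

def summarize(runs):
    """Column-wise mean and sample standard deviation over repeated runs."""
    columns = list(zip(*runs))
    means = [statistics.mean(c) for c in columns]
    sigmas = [statistics.stdev(c) for c in columns]  # assumption: sample sigma
    return means, sigmas

# Hypothetical (actual, predicted) -lg(rho) pairs for three repeated splits.
splits = [
    ([6.0, 6.2, 6.5, 6.7], [6.1, 6.1, 6.4, 6.6]),
    ([5.9, 6.3, 6.4, 6.8], [6.0, 6.2, 6.6, 6.6]),
    ([6.1, 6.2, 6.6, 6.7], [6.0, 6.4, 6.5, 6.8]),
]
runs = [regression_metrics(a, p) for a, p in splits]
means, sigmas = summarize(runs)
print("average R2=%.3f RMSE=%.3f R=%.3f" % tuple(means))
print("sigma   R2=%.3f RMSE=%.3f R=%.3f" % tuple(sigmas))
```

A small σ relative to the average, as reported in Table 2, indicates that the metrics change little from one repetition to the next.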
Table 3. Eight candidates designed using the PSP method.
No.   Candidate                 Prediction (−lgρ)
C1    Au1.000Cu4.406Pt1.833     7.0733
C2    Au1.000Pt2.232In1.502     7.0671
C3    Au1.000Pt1.901Ga1.214     7.0412
C4    Au1.000Pt2.208Cu4.784     7.0306
C5    Au1.000Pt1.948Cu4.507     7.0304
C6    Au1.000Pt1.605Cu3.377     7.0283
C7    Au1.000Pt1.697Cu4.338     7.0281
C8    Au1.000Pt2.801Ga1.435     7.0074
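The predictions in Table 3 are given as −lgρ, so the relative drop in resistivity against the dataset minimum (−lgρ = 6.68, from the Figure 7 caption) is not obvious at a glance. The short sketch below converts each prediction into a percentage reduction. It assumes that lg denotes the base-10 logarithm; the function name `relative_reduction` is hypothetical and used only for this illustration.

```python
# Lowest-resistivity sample in the dataset, as -lg(rho) (Figure 7 caption).
BASELINE_NEG_LOG_RHO = 6.68

# Predicted -lg(rho) for the eight candidates listed in Table 3.
PREDICTIONS = {
    "C1": 7.0733,
    "C2": 7.0671,
    "C3": 7.0412,
    "C4": 7.0306,
    "C5": 7.0304,
    "C6": 7.0283,
    "C7": 7.0281,
    "C8": 7.0074,
}

def relative_reduction(neg_log_rho, baseline=BASELINE_NEG_LOG_RHO):
    """Percent by which rho falls below the baseline (assumes lg = log10)."""
    # rho_candidate / rho_baseline = 10**(-neg_log_rho) / 10**(-baseline)
    ratio = 10.0 ** (baseline - neg_log_rho)
    return 100.0 * (1.0 - ratio)

for name, value in PREDICTIONS.items():
    print(f"{name}: {relative_reduction(value):.1f}% lower than the dataset minimum")
```

Under this assumption the reductions run from roughly 53% (C8) to roughly 60% (C1), which is consistent with the 53–60% range quoted in the abstract. Because only the ratio of resistivities enters the calculation, the result does not depend on the unit of ρ.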
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

Share and Cite

MDPI and ACS Style

Che, H.; Lu, T.; Cai, S.; Li, M.; Lu, W. Inverse Design of Low-Resistivity Ternary Gold Alloys via Interpretable Machine Learning and Proactive Search Progress. Materials 2024, 17, 3614. https://doi.org/10.3390/ma17143614

