Article

Active Learning for Rapid Targeted Synthesis of Compositionally Complex Alloys

1 SLAC National Accelerator Laboratory, Menlo Park, CA 94025, USA
2 Materials Science and Engineering Department, University of Maryland, College Park, MD 20742, USA
* Author to whom correspondence should be addressed.
Materials 2024, 17(16), 4038; https://doi.org/10.3390/ma17164038
Submission received: 22 June 2024 / Revised: 3 August 2024 / Accepted: 6 August 2024 / Published: 14 August 2024
(This article belongs to the Special Issue Electrical and Optical Properties of Metal Oxide Thin Films)

Abstract
The next generation of advanced materials is tending toward increasingly complex compositions. Synthesizing a precise composition is time-consuming and becomes exponentially more demanding with increasing compositional complexity. An experienced human operator does significantly better than a novice but still struggles to consistently achieve precision when synthesis parameters are coupled. The time to optimize synthesis becomes a barrier to exploring scientifically and technologically exciting compositionally complex materials. This investigation demonstrates an active learning (AL) approach for optimizing the physical vapor deposition synthesis of thin-film alloys with up to five principal elements. We compared AL based on Gaussian process (GP) and random forest (RF) models. The best-performing models were able to discover synthesis parameters for a target quinary alloy in 14 iterations. We also demonstrate the capability of these models in transfer learning tasks. RF and GP models trained on lower-dimensional systems (i.e., ternary, quaternary) show an immediate improvement in prediction accuracy compared to models trained only on quinary samples. Furthermore, samples that share only a few elements with the target composition can be used for model pre-training. We believe that such AL approaches can be widely adapted to significantly accelerate the exploration of compositionally complex materials.

1. Introduction

Traditional alloy engineering mixes small additions of alloying elements into a primary element matrix for performance improvement. However, after centuries of incremental improvements, we are rapidly reaching the limit of performance from primary alloys. Over the last decade, compositionally complex alloys, sometimes called multi-principal element alloys or high-entropy alloys [1,2,3,4], containing many (3+) elements in significant proportions, have shown outstanding properties for a wide range of engineering applications, including structural alloys [5,6], batteries [7], thermoelectrics [8], shape-memory alloys [9,10], catalysts [1], high-entropy alloys [2,3,4,11], high-entropy ceramics [12], and more. Many of the desired alloys are composed of refractory and low-melting elements, and the final composition is seldom the same as the composition of the input reactant; it takes several iterations before the desired composition is reached. Discovering and fabricating precise alloy compositions in these high-dimensional spaces using a traditional approach is substantially slower and more expensive than desired.
The deposition of one element, and, consequently, reaching the desired alloy composition, is often influenced by the deposition of the other elements; therefore, a higher compositional complexity often means a significantly more complex synthesis optimization in coupled high-dimensional parameter space. The problem is further exacerbated because these new functional alloys are needed as thin films for catalysts and coatings or desired to be fabricated by advanced synthesis methods such as additive manufacturing or electroplating. High-throughput synthesis and characterization, guided by physical models, machine learning, or intuition (and human expertise), are suggested as a path for accelerated search of complex systems [13].
Currently, synthesis conditions are arrived at by a human operator relying on expertise (or intuition) in assessing the coupling between different elemental dimensions through trial iterations. An expert operator usually finds synthesis parameters for binary or ternary systems that come within a few percent of the desired composition in a few (<5) trial iterations. However, a human operator's biggest challenge is learning the complex coupling as the dimensionality increases. They struggle to improve the precision beyond a few percent and require exhaustive calibrations and iterative parameter tuning, especially if the coupling between elements is complex (non-linear). A common strategy employed to combat the curse of dimensionality is to reduce the problem's dimensionality and then add additional dimensions one at a time. For example, instead of synthesizing a 5-element sample immediately, researchers might first manufacture 3- and 4-element sub-alloys. This multi-step approach allows researchers to tune the composition a few elements at a time instead of trying to tune five elements simultaneously. The approach converges if the additional dimensions are weakly coupled, and the challenge is often to identify a strongly coupled base subset of the target space and separate it from any weakly coupled remainder.
Another common approach is to use physics- or chemistry-based models to map the complex coupling between deposition parameters. Bunn et al. demonstrated a computationally fast continuum model for optimizing film thickness in thin-film samples synthesized via magnetron sputtering [14]. Their method also requires very few initial data points before achieving high prediction accuracy. Furthermore, their approach demonstrates high interpretability, directly reporting parameters like gun power and angle. However, Bunn's model was demonstrated for thickness measurements only, has yet to be applied to composition optimization, and does not incorporate multiple elements. The physicochemical modeling approach is compelling, and in many ways it quantifies the information that a human acquires to build intuition. However, physicochemical models work when there is a substantial theoretical understanding of the synthesis process. Often, such a deep theoretical understanding is not available; what is available are empirical observations from trial synthesis. For example, Xia et al. detail latent causes that alter sputtering rates in magnetron sputtering of multi-element thin films and give qualitative insights on how the composition changes when multiple sources are used simultaneously [15]. However, quantitative prediction from theory of the different sputtering rates needed in multi-elemental synthesis to reach the desired chemistry is very challenging.
Efforts to incorporate the empirical information from human operators to formulate quantitatively accurate models for higher-dimensional target spaces are important. There are several traditional empirical methods for process optimization. One such method for magnetron sputtering is well detailed in an article by Alami et al. [16], which involves depositing at stepped power/sputtering rates and measuring the composition at each step. This exhaustive empirical method works well for low-dimensional systems like binary alloys. However, as the number of elements grows or finer composition control is needed, the number of empirical measurements becomes burdensome. Another category of process optimization for sputtered thin films measures individual elements' sputtering rates at various powers and angles. A sensor, like a quartz crystal monitor (QCM), can measure the sputtering rate directly as a function of cathode power and gun angle [17]. The sputtering rate of each element can then be set to achieve the desired atomic percent of that element in the final film. Measuring individual sputtering rates on a QCM requires fewer measurements than exhaustively going through all power and angle combinations for an n-element system. However, accurate QCM measurements require knowledge of the sputtered elements' Z-number and the deposited film's density. The sputtering rates of an n-ary system are tricky to measure using a quartz crystal monitor because the density of the alloyed system is (a) usually not known a priori and (b) changes as a function of the sputtering rate of each element. Furthermore, the number of sputtering rates and interaction terms grows as (n^2 + n)/2 for n elements. QCM measurements also do not capture the interactions occurring when multiple sputtering sources are turned on simultaneously. These traditional empirical methods are data-hungry and, therefore, not very useful when exploring a new compositional chemistry.
Machine learning (ML) approaches, such as active learning (AL) and transfer learning (TL), provide tools that allow empirical methods to start from the earliest stages of exploration when information and insights about a newly discovered target space are minimal [18,19,20,21,22]. These approaches overcome many challenges a human operator faces, including optimizing in high-dimensional target spaces and the ability to quickly transfer knowledge gained from one system to another. As the exploration progresses, it also provides real-time insights into the structure of the target space, including the strength of coupling between dimensions and identification of a lower dimensional strongly coupled target subspace, insights that the operator can exploit to fine-tune the exploration strategy further.
In this article, we illustrate these approaches for exploring and optimizing magnetron-sputtered synthesis of 5-element alloys containing refractory and volatile elements. The insights that have emerged from these studies and the approaches developed here are widely applicable to other alloy systems as well as synthesis methods. We will discuss the insights as they emerge and highlight how they can be broadly applied to transform research in compositionally complex alloys in the concluding section of this work.

2. Materials and Methods

2.1. Sample Preparation and Synthesis

The samples synthesized in this study are all predicted to be half-Heusler (space group F-43m) thermoelectrics. Half-Heuslers form at specific stoichiometries [8], specifically when the unit cell has a total of 18 valence electrons across all constituents. A three-element half-Heusler usually has equiatomic proportions. In a four-element half-Heusler, two elements occupy the first two Wyckoff sites and make up one-third of the atoms each; the other two elements split occupancy on the third Wyckoff site and make up one-sixth of the atoms each [23]. All training and target alloys in this study are listed in Figure 1.
The target composition for the ternary system is equiatomic, A1/3B1/3C1/3. For a quaternary alloy, the target composition is A1/6B1/6C1/3D1/3. For quinary compositions, the target is A1/9B1/9C1/9D1/3E1/3.
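As a quick sanity check, the repeating-decimal targets above correspond to simple fractions that sum to unity. A minimal verification (ours, for illustration only):

```python
from fractions import Fraction

# Target atomic fractions written exactly:
# ternary: 1/3 each; quaternary: 1/6, 1/6, 1/3, 1/3; quinary: 1/9 x3, 1/3 x2.
ternary    = [Fraction(1, 3)] * 3
quaternary = [Fraction(1, 6)] * 2 + [Fraction(1, 3)] * 2
quinary    = [Fraction(1, 9)] * 3 + [Fraction(1, 3)] * 2

# Each target composition must sum to 1 (i.e., 100 at.%).
assert sum(ternary) == sum(quaternary) == sum(quinary) == 1
```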
The sputtering system used was an AJA International Orion ATC system [24] (Hingham, MA, USA). The system uses a dual turbopump and cryogenic pump to achieve ultra-high vacuum; the chamber pressure before all depositions was 10^-8 Torr. The system has six sputtering guns; two use a radio-frequency power source and four use a direct-current power source. Elemental targets were sourced from Kurt J. Lesker [25] and were two inches in diameter. All non-magnetic targets had a thickness of 1/4 inch, whereas magnetic targets were 1/12 inch thick. Films were deposited on undoped single-crystal Si wafers with a <100> orientation (University Wafer). Wafers were nominally 380 μm thick and 3 in. in diameter. All wafers were cleaned with acetone and electrostatically shocked by a radio-frequency cathode at 100 W before deposition to clean the surface of any contaminants.
A pre-sputter routine was used for every deposition. The chamber is initially flooded with 30 mTorr of ultra-high-purity argon and all cathode guns are powered on at a constant value; this initiates sputtering on each gun. After a few seconds, the pressure is decreased to 3 mTorr and the shutters on the sputtering guns are closed. The targets are allowed to sputter with the shutters closed for two minutes so that the sputtering rate reaches steady state. After two minutes, the shutters are opened and all active guns sputter for one hour. Film thicknesses vary depending on the sputtering rates of the individual elements and the material density. In general, films have a thickness on the order of hundreds of nanometers.
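For reference, the fixed parameters of the routine above can be collected in one place. This is a descriptive sketch only; the dictionary structure and key names are ours, not from any instrument API:

```python
# Pre-sputter and deposition protocol constants from the text above;
# key names are illustrative, not tied to real control software.
PROTOCOL = {
    "ar_flood_pressure_mtorr": 30,   # initial ultra-high-purity Ar flood
    "deposition_pressure_mtorr": 3,  # working pressure after pump-down
    "shutter_closed_minutes": 2,     # sputter to steady state, shutters closed
    "deposition_hours": 1,           # all active guns, shutters open
}

# The Ar flood is ten times the working pressure.
assert PROTOCOL["ar_flood_pressure_mtorr"] // PROTOCOL["deposition_pressure_mtorr"] == 10
```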

2.2. Sample Characterization

Sample composition is analyzed using a JEOL JXA-8230 (Akishima, Japan) microprobe analyzer with wavelength dispersive spectroscopy (WDS) [26]. A thin-film correction term is calibrated and fit to the data for each sample [27]. Corrections are also made for peak overlap, depending on the composition system analyzed. Five WDS measurements are taken at different regions in the films to get an aggregate composition. The final composition reported is the average of all measurements for a single sample. Wavelength dispersive spectroscopy has been shown to have accuracy to within ± 3 atomic percent [28].

2.3. Active Learning

There are many applications where supervised learning may be helpful, but access to labeled data is sparse and obtaining new labeled data is non-trivial. This can be the case in synthesis studies: manufacturing a new sample or measuring a sample's properties can be time- or resource-intensive, and in many cases it is both. In active learning, regression models are trained on the currently labeled dataset, even if sparse, and the trained model is used to select the optimal next input to be labeled. The 'optimal' input can be the one that most improves the model's predictive accuracy, most lowers its uncertainty, or satisfies some other criterion. Data labeling occurs in an interactive cycle where the model chooses the samples that are most beneficial at each iteration. This contrasts with strategies like uniform or random sampling. Active learning enables machine learning models to achieve better performance with fewer labeled samples by allowing the model to choose the data it learns from. In prior research, active learning has shown orders-of-magnitude reductions in the number of labeled samples needed to train a model to a commensurate accuracy [29].
The primary elements of an active learning approach are a surrogate model and a query strategy. The surrogate model is the regression model used to make predictions. Surrogate models must provide some type of uncertainty or quality metrics so that sample optimality can be measured. The query strategy is the method used to determine the optimal next sample. There are different query strategies, such as uncertainty-based sampling, where the next sample to label is the one that the model is most uncertain about. Another is querying by committee, where an ensemble of models is trained. The variance in predictions across the ensemble reflects the uncertainty in prediction and the optimal next sample reduces this uncertainty by the largest margin. A third example is expected model change, which uses model gradients to identify inputs with maximal expected gradient lengths. In this investigation, we use uncertainty-based sampling with the Gaussian process surrogate model and querying by committee with the random forest surrogate model.
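The loop described above can be sketched in a few lines. This is a generic, illustrative example of uncertainty-based sampling on synthetic data; the variable names (`X_pool`, `y_pool`, `labeled`) are ours, and the toy response stands in for real composition measurements:

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor

# Synthetic candidate pool standing in for synthesis parameters (inputs)
# and measured compositions (labels).
rng = np.random.default_rng(0)
X_pool = rng.uniform(0, 1, size=(40, 2))
y_pool = X_pool.sum(axis=1) + 0.01 * rng.normal(size=40)

labeled = [0]  # start from a single labeled sample
for _ in range(5):
    # Retrain the surrogate on everything labeled so far.
    model = GaussianProcessRegressor().fit(X_pool[labeled], y_pool[labeled])
    unlabeled = [i for i in range(len(X_pool)) if i not in labeled]
    # Query strategy: label the candidate the model is most uncertain about.
    _, std = model.predict(X_pool[unlabeled], return_std=True)
    labeled.append(unlabeled[int(np.argmax(std))])
```

After five iterations, `labeled` holds the initial sample plus the five maximally informative candidates, each of which would be synthesized and measured in a real workflow.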
A Gaussian process (GP) [30] is a non-parametric model that places a probability density over the space of possible regression functions, offering a probabilistic model. Whereas a Gaussian distribution is characterized by a mean and covariance, a Gaussian process is defined through mean and covariance functions, denoted as Y ~ GP(m(X), k(X, X′)). In this notation, m(X) and k(X, X′) represent the mean and covariance functions. GP models can capture various complex relationships while providing credibility intervals for their predictions. Due to these advantages, they are considered the default surrogate model in active learning, and we employ them for uncertainty-based sampling.
In this investigation, the Gaussian process regression model was implemented using the sklearn library [31]. Several different kernels were tested, including Matern, Rational Quadratic, and radial basis functions, as well as combinations of these three kernels. All kernels incorporated a homoscedastic white noise kernel. Optimal performance was achieved with a combination of a Matern and Rational Quadratic kernel, along with a white noise kernel. During each training step, the GPR was queried on all remaining untrained samples. As GP regressors can be queried directly for uncertainty in predictions, the sample with the highest prediction uncertainty was selected as the next input for the model.
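The kernel combination described above can be written directly with sklearn's kernel algebra. A minimal sketch follows; the hyperparameter starting values and the toy data are illustrative, not the fitted values from this study:

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import Matern, RationalQuadratic, WhiteKernel

# Matern + Rational Quadratic, plus a homoscedastic white-noise term,
# as described in the text (initial hyperparameters are illustrative).
kernel = (Matern(length_scale=1.0)
          + RationalQuadratic(length_scale=1.0, alpha=1.0)
          + WhiteKernel(noise_level=1e-3))
gpr = GaussianProcessRegressor(kernel=kernel, normalize_y=True, random_state=0)

# Toy stand-ins for synthesis parameters and a composition response.
rng = np.random.default_rng(0)
X = rng.uniform(size=(10, 2))
y = X @ np.array([0.7, 0.3])
gpr.fit(X, y)

# return_std=True yields the per-point uncertainty used for sampling.
mean, std = gpr.predict(X, return_std=True)
```

In the active learning loop, `std` evaluated over the remaining untrained samples determines which sample is queried next.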
A random forest [32] is an ensemble model that employs a set of trained decorrelated classification and regression trees (CARTs), achieved through bootstrapping and feature bagging. This decorrelation ensures that the random forest has lower variance than the individual tree models while maintaining low bias, thus addressing the bias–variance tradeoff. The final prediction of the Random Forest is an average over the predictions of the trees in the ensemble. In the context of active learning, the variance between the predictions of the trained tree models in the random forest is taken as a measure of uncertainty in a querying by committee policy.
The random forest model used in this study was implemented using the sklearn library. The employed model consisted of a committee of 10 random forests, each containing 250 estimators. Larger models with up to 50 random forests and 500 estimators were tested; increasing the model size beyond 10 forests of 250 estimators did not significantly improve predictive accuracy but substantially increased training time. For active learning, samples were selected based on committee voting: all 10 forests were queried for predictions on the remaining untrained samples, and the sample exhibiting the highest variance in prediction was chosen as the next teaching input for the model.
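The querying-by-committee step above can be sketched as follows. The data here is synthetic and the variable names are ours; the committee structure (10 forests of 250 trees, differing only in random seed) follows the description in the text:

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

# Synthetic training set and candidate pool standing in for
# synthesis parameters and measured compositions.
rng = np.random.default_rng(1)
X_train, y_train = rng.uniform(size=(12, 3)), rng.uniform(size=12)
X_candidates = rng.uniform(size=(20, 3))

# Committee of 10 random forests, 250 estimators each, decorrelated by seed.
committee = [
    RandomForestRegressor(n_estimators=250, random_state=seed).fit(X_train, y_train)
    for seed in range(10)
]

# Disagreement across the committee serves as the uncertainty measure.
preds = np.stack([m.predict(X_candidates) for m in committee])  # shape (10, 20)
variance = preds.var(axis=0)
next_sample = int(np.argmax(variance))  # highest disagreement -> query next
```

The candidate with the largest committee variance is the one synthesized and measured next.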
A neural network (NN) model was also tested, due to the popularity and widespread application of NN-based approaches across diverse disciplines. The neural network model performed worse than all other models and the human operator, both in prediction accuracy and in the number of samples required to achieve a given accuracy. After training on all available samples in the training dataset, the neural network model achieved an MAE of 9%. This is consistent with the observation that neural network models, bereft of any additional inductive biases, need a larger number of samples to match the performance of classical ML approaches like random forest-based models [33,34,35,36]. The details of the neural network models are in the provided code in the Supplementary Information.

2.4. Transfer Learning

In the active learning approach described above, new training samples are selected from anywhere within the composition space. This implies that when training a model to make predictions for 5-element samples, active learning models can suggest samples of lower dimensions for testing. However, these models will only make suggestions within the target composition space. It is essential to consider that information from overlapping composition spaces can be utilized to train the model. For instance, a manufacturer working with an ABC alloy may find useful correlations in a BCD alloy. Although the two composition systems share only two elements (B and C), if the model can determine the relationship between B and C in the ABC system, it can apply this information to make predictions for the BCD system.
Moreover, human operators commonly adopt an approach of working towards progressively more complex systems due to the challenging nature of tuning sputtering parameters. If an operator aims to create an ABCDE alloy, they might initially develop an ABC alloy. After fine-tuning the parameters for the ABC alloy, they proceed to create an ABCD alloy. Following further adjustments, they will attempt the ABCDE alloy. This method reduces complexity by introducing only one new element for tuning at a time. Attempting to create an ABCDE alloy without prior observations of sputtering element interactions can be disastrous.
Many laboratories have previously manufactured and measured samples from past experiments. It would be beneficial to leverage these samples to help initialize tuning for a new target system, even if the old samples do not share all of the same elements. To address this, a transfer learning approach was adopted. Initially, the active learning models are trained solely on ternary samples that share at least one element with the target system. Once all of the available ternary samples have been taught to the model, quaternary samples are added; again, these quaternary samples must share at least one element with the target system. After the quaternary samples have been integrated into the model, the conventional active learning approach is utilized: the model is queried within the target ABCDE system to identify the best training sample to reduce model uncertainty.
In the context of transfer learning, we employ a concept known as dummy dimensions. This approach involves training an algorithm on all dimensions of a problem space, which, in this case, comprises six dimensions due to the six elements within the system. However, the input data typically contain only a few non-zero elements.
For a specific composition y, the algorithm receives a vector of length six. In the case of a ternary sample, only three out of the six entries contain non-zero values; for a quaternary sample, four entries are non-zero, and so forth. The position of each element in the vector y is preserved. For instance, the atomic percentage of Nb (niobium) is always represented as the first entry in y. If there is no Nb in the sample, then the first entry is set to zero. Similarly, the second entry always corresponds to titanium, and so on.
Initially, the model is trained using samples that have only three non-zero values for power, angle, and atomic percentage. After the model is proficient with these samples, it proceeds to train on samples with four non-zero values for power, angle, and atomic percentage. This process continues for samples with five non-zero values, and so on.
Several different transfer learning models were trained. Three models were trained on only one compositional subsystem (ternary only, quaternary only, and quinary only). A random sample was chosen from the manifold as the initial training point, then further samples were selected based on maximum uncertainty.
Next, models were created that were trained on the entirety of the ternary manifold (15 samples) and used to make predictions on the quaternary or quinary dataset. Again, one initial sample was chosen from the target manifold (quaternary or quinary), and further samples were selected using maximum uncertainty sampling.
Finally, a model was trained on the entirety of both the ternary and quaternary manifolds (30 samples), and predictions were made on the quinary dataset. A random quinary sample was chosen as the initial training point, and sampling again proceeded using maximum uncertainty.
An overview of the full workflow, including transfer learning from previous samples and maximum uncertainty sample selection, is shown in Figure 1.
All models shown in the paper were re-run 10 times with a different random sampling of the manifold for initial training points (5 for the full model, 1 for the transfer learning models). The MAEs reported in Figure 2 and Figure 3 are the average errors across all runs. The error bars on the MAE represent the standard deviation in MAE across all 10 training runs.

3. Results and Discussion

Active learning was implemented through two regression models: a Gaussian process regression (GPR) and a random forest (RF). These models were selected for several reasons. Random forests and Gaussian process-based regression models have been observed to be very effective at modeling tabular data in prior research [33,35]. Additionally, they are very popular for active learning applications; for instance, Gaussian process models are the default surrogate model in Bayesian optimization studies [30]. The models were trained on a dataset of sputtering synthesis parameters and compositions for ternary, quaternary, and quinary alloys. At each learning iteration, the models were queried for the next data point to test based on a maximum uncertainty (MU) schema. The goal was to correctly predict the synthesis parameters for a target alloy using as few training data points as possible.
Model performance was assessed in two ways: the models' ability to correctly predict a target composition (inset of Figure 2) and their error in predicting all compositions across the composition manifold (main graph in Figure 2). Active learning models were able to find a target quinary composition after only 14 sample iterations. For predicting target compositions, the models were terminated after the mean absolute error (MAE) reached 3%, since this is the uncertainty level of the composition measurement. The best-performing model was able to correctly predict synthesis parameters across the entire composition manifold to within 3% error after only 26 iterations.
The error is calculated as the absolute difference between the target atomic percentage Y and the measured atomic percentage Ŷ; this error is |Y_i − Ŷ_i| for the ith element in an n-ary composition. For a complete sample, the error is the mean of all differences, (1/n) Σ_{i=1}^{n} |Y_i − Ŷ_i|, often referred to as the mean absolute error (MAE). In this study, the training samples were ternary, quaternary, or quinary transition metal alloys. The target compositions in Figure 2 were all quinary alloys selected from six possible elements: titanium, vanadium, niobium, tantalum, antimony, and iron. The manifold error is taken over all compositional complexities; the main graph in Figure 2 represents the error in prediction for ternary, quaternary, and quinary alloys. The error shown is the average of 10 model runs, with each run sampling a different initial dataset. The error bars shown are the first standard deviation of error across all 10 runs.
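As a worked example of the error metric just defined (with made-up numbers), the MAE of a quinary sample measured against an equiatomic target is:

```python
import numpy as np

# Target (Y) and measured (Yhat) atomic fractions for a quinary sample;
# the numbers here are illustrative, not from the study's dataset.
target   = np.array([0.20, 0.20, 0.20, 0.20, 0.20])
measured = np.array([0.22, 0.18, 0.21, 0.19, 0.20])

# (1/n) * sum_i |Y_i - Yhat_i|
mae = np.mean(np.abs(target - measured))
```

Here the per-element errors are 0.02, 0.02, 0.01, 0.01, and 0.00, giving an MAE of 0.012, i.e., 1.2 atomic percent.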
The models' prediction ability stands in contrast to an expert human operator's performance; experts typically require 20 or more iterations to synthesize one target quinary alloy correctly. Moreover, a human expert who learns synthesis parameters for one alloy can predict parameters for alloys with incrementally different chemistry but struggles to predict a significantly different composition in the same quinary composition space. In contrast, the models learn the full composition space.
The sparsity of training data was not a barrier to high predictive performance in either the RF or GPR models. Magnetron sputtering synthesis, the method used in this study, is a labor-intensive and slow process compared to others like spin coating; it typically takes a researcher up to two days to manufacture and characterize five samples. With such labor-intensive synthesis processes, sparsity in data is a given. Researchers using active learning methods can still benefit from model-driven parameter guidance even with small dataset sizes.
As a baseline for the data-driven models, all of the AL-based models were compared against a least-squares linear regression model. As shown in the Supplemental Information, the least-squares regression model had absolute errors upwards of 15 after 5 training samples; this eventually decreased to an error of around 7 after all training samples had been fit to the model.
Models were also trained using a subset of the most important synthesis parameters, determined using the mutual information index (MII) [37] to test the impact of poorly informative inputs on model performance (MII is discussed in more detail below). These models are labeled as ‘Reduced Features’ in Figure 2. The performance of models with these reduced feature sets is within the bound of models trained on all features. This knowledge provides several practical insights for both the active learning model and the experiments. From a computational side, removing less-informative features in studies with large feature sets can significantly reduce the computational cost of running an active learning workflow. From an experimental side, identification of less-informative features helps reduce the complexity of synthesis studies. If a synthesis parameter is not informative of the desired measurement, then it is better to fix that parameter and not vary it at all. Additionally, the information regarding the relative importance of input features for the model enables the domain scientist to compare the model’s learned mapping to their understanding of the system [38,39]. This allows the scientist to interpret the model’s mapping and verify its rationale, leading to a higher degree of trust in the model [40].
The neural network models performed especially poorly at this problem. As shown in the Supplementary Figures, the best neural network model failed to achieve an MAE below 9% even after being trained on all available data. As such, the remainder of the study focuses on the GPR and RF results.

3.1. Transfer Learning into Higher Dimensional Systems

The most powerful feature of either the RF or GPR model is the ability to learn synthesis conditions from previously made samples, even if those samples are in a lower-dimensional composition space (ternary, quaternary) or share only a few elements in common with the target composition. Figure 3 shows the MAE for RF models trained on successively more complex samples. Each MAE in Figure 3 is assessed over all compositions at a given complexity (ternary, quaternary, quinary).
Models trained only on ternary samples generally performed poorly, achieving a final MAE of >3% for predicting ternary compositions. Models for predicting quaternary samples that were pre-trained on ternary samples show an immediate improvement in MAE over models trained only on quaternary samples. The biggest improvement is in quinary models pre-trained on ternary samples, quaternary samples, or both. GPR models for predicting quinary compositions that were trained on ternary samples have an initial MAE of less than 10%; the quinary prediction models trained on quaternary samples only or on both sub-systems showed an initial MAE of less than 5% and quickly approached an MAE of less than 2%. The advantage of pre-training, perhaps not surprisingly, is greatest at the earlier (sparser-data) stages of learning. Pre-training on the lower-dimensional spaces reduces the training time by nearly a factor of two for both the quaternary and quinary composition spaces. As synthesis of training/trial samples is slow and expensive, this is a significant savings, even though models without pre-training eventually achieve comparable accuracy. The difference between the quinary models pre-trained on quaternary only and on ternary+quaternary is marginal, suggesting that the quaternary samples capture all of the significant relationships learned from the ternary ones.
The GPR models showed similar performance to the RF models. A complementary plot to Figure 3 for GPR models is included in the Supplemental Information.
In both cases, the model performance indicates that training on lower-dimensional samples is beneficial for predicting on higher-dimensional systems. Synthesis laboratories often already have training data on lower-dimensional systems, and this prior data can seed AL-based regression models. Furthermore, humans often work their way up to complex sample synthesis. Instead of trying to synthesize a 5-element sample immediately, researchers might first manufacture 3- and 4-element analogues, tuning the composition a few elements at a time rather than attempting to tune five elements simultaneously. The active learning regression approach is compatible with this type of human calibration; as the human makes successively more complex samples, the regression model can be trained in parallel. Once the human is prepared to make the most complex samples, they can immediately rely on the model predictions.
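Mechanically, this kind of pre-training is straightforward for either model class: lower-dimensional samples are simply added to the training set in a shared input representation. The sketch below is our own illustration with synthetic stand-in data, not the authors' code; the zero-power encoding for absent elements and the toy sputter model are assumptions made for demonstration.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(1)

def make_samples(n, active, n_elements=5):
    """Synthetic stand-in data: gun powers for up to five elements
    (absent elements encoded as zero power) -> atomic percentages."""
    X = np.zeros((n, n_elements))
    X[:, active] = rng.uniform(20.0, 200.0, size=(n, len(active)))
    y = 100.0 * X / X.sum(axis=1, keepdims=True)  # toy sputter model
    return X, y

# Pre-train on previously synthesized lower-dimensional samples...
X_tern, y_tern = make_samples(40, [0, 1, 2])        # ternary
X_quat, y_quat = make_samples(40, [0, 1, 2, 3])     # quaternary
# ...plus the first few expensive samples in the quinary target space.
X_quin, y_quin = make_samples(5, [0, 1, 2, 3, 4])

model = RandomForestRegressor(random_state=0)
model.fit(np.vstack([X_tern, X_quat, X_quin]),
          np.vstack([y_tern, y_quat, y_quin]))

# Predict the composition for a proposed quinary gun-power setting.
proposed = np.array([[150.0, 90.0, 60.0, 120.0, 80.0]])
print(model.predict(proposed))  # predicted atomic percentages
```

Because the sub-system samples share the same fixed-length input space as the quinary target, a single multi-output regressor can consume all of them at once; no architectural change is needed to "transfer" the lower-dimensional knowledge.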

3.2. Feature Importance and Interdependence

Moving to higher-dimensional composition spaces is significantly harder, not only because every additional element brings in additional parameters, but also because these parameters are often strongly coupled with the parameters from the lower dimensions. Adding a new element thus requires learning not just the new parameters but also the interdependencies between the already-learned lower-dimensional parameters and the new ones. The interdependence of the synthesis parameters is best summarized by the mutual information index shown in Figure 4. The mutual information index encodes how knowledge of one variable decreases uncertainty about another variable. Unlike other correlation coefficients, it does not assume a linear relationship between the two variables.
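Such an analysis can be reproduced with scikit-learn's `mutual_info_regression`, which implements the nearest-neighbor estimator of Kraskov et al. [37]. The sketch below uses synthetic stand-in data (gun powers and angles with a hand-built coupling), not the study's dataset, to show how nonlinear dependence is scored while an irrelevant variable scores near zero.

```python
import numpy as np
from sklearn.feature_selection import mutual_info_regression

rng = np.random.default_rng(0)

# Synthetic stand-in for the synthesis dataset: gun power and gun angle
# for three guns (inputs), and a resulting atomic percentage (output).
n_samples = 200
power = rng.uniform(20.0, 200.0, size=(n_samples, 3))  # W, one column per gun
angle = rng.uniform(0.0, 30.0, size=(n_samples, 3))    # degrees off-normal
X = np.hstack([power, angle])

# Toy response: atomic percentage of element 0 driven mainly by its own
# gun power, weakly by a neighboring gun (coupling), not at all by angle.
at_pct = 0.4 * power[:, 0] - 0.1 * power[:, 1] + rng.normal(0, 2, n_samples)

# MI is nonnegative and, unlike Pearson correlation, assumes no linearity.
mi = mutual_info_regression(X, at_pct, random_state=0)
for name, score in zip(
    ["power_0", "power_1", "power_2", "angle_0", "angle_1", "angle_2"], mi
):
    print(f"{name}: {score:.3f}")
```

In this toy, `power_0` receives by far the largest score, the coupled `power_1` a smaller one, and the angle columns scores near zero, mirroring the structure of Figure 4.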
The mutual information index (MI) for the 12-dimensional target space (gun power and gun angle for six elements) shows that eight of those dimensions are strongly correlated. The sputtering power of each element strongly affects the atomic percentage of all the other elements. It is unsurprising that each element’s gun power has the highest mutual information with its own atomic percentage. Still, the MI between a gun’s power and its own atomic percentage is not uniformly high for every element; for example, Sb gun power affects the Ti atomic percentage as much as it affects its own. In contrast, the angles of the sputtering guns relative to the substrate play a negligible role for almost all elements; only the angles of the Ta and Sb guns impacted the atomic percentage of other elements. This is likely due to the high Sb and Ta sputtering rates relative to all the other elements. Because Sb and Ta sputter quickly, they can easily dominate the sample composition if both are pointed directly at the substrate, and moving a gun’s angle away from the substrate is an effective way to modulate these high sputtering rates. For the other elements, whose sputtering rates are significantly lower than Sb or Ta, pointing the gun away from the substrate drives the sputtering rate (and thus atomic percentage) quickly toward zero, so their angles must be set such that they always point directly at the substrate. In terms of mutual information, this means the angles of Ta and Sb have a high MI with the final composition while the angles of all other elements have a low MI; the MI for these values is shown below the black bar in Figure 4.
The AL models most often suggested the same sputtering angle for the slowly sputtering V, Fe, Ti, and Nb (i.e., pointed directly at the substrate), even though the datasets contained training samples with other angles. The model thus finds effective means of lowering the complexity of the target space without expert insight: when co-sputtering elements with very different sputtering rates, it is often better to point the slowly sputtering elements directly at the substrate and adjust only the angles of the high-sputtering-rate elements. The AL model, in effect, ‘discovers’ a good rule-of-thumb for human operators to follow when navigating a multi-element target space that requires very different sputtering rates.
The source of coupling between synthesis parameters in physical vapor deposition is somewhat elusive and has been the subject of prior research, although the literature on many-element sputtering is limited. Elastic scattering of different elements while in transit to the substrate has been observed and postulated as a reason for changes in thin-film composition under certain sputtering conditions [15,41]. Anecdotally, in this study, interactions between the magnetic fields of neighboring guns were sometimes observed through visible changes to the halo of argon surrounding each sputtering gun: when the power of one gun was turned up high, the magnetic fields of neighboring guns seemed to be altered. The degree of coupling between neighboring magnetic fields was not quantified in this study, which was concerned with a practical approach to dealing with complex coupling in the sputtering process rather than a physics-based explanation.

3.3. Comparison to Other Active Learning Optimization Approaches

The method presented herein offers high interpretability, high prediction accuracy, and a very low barrier to entry. The entire model can be executed in under 50 lines of code. The methods and classes used are well documented online, with plenty of supporting tutorials. An example notebook and the full dataset are included in this publication’s Supplementary Materials. There are also numerous open-source libraries and resources for performing the same process optimization beyond the ones used in this article.
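For orientation, such a loop really does fit in a few dozen lines. The sketch below is our own illustrative toy, not the authors' notebook: the three-element system, the `deposit` stand-in for an actual synthesis-and-measurement step, and the greedy acquisition rule are all assumptions made for demonstration.

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, WhiteKernel

rng = np.random.default_rng(2)

def deposit(params):
    """Stand-in for an actual deposition + EPMA measurement: maps gun
    powers to atomic percentages through toy element sputter rates."""
    rates = params * np.array([1.0, 1.6, 0.7])
    return 100.0 * rates / rates.sum()

target = np.array([40.0, 35.0, 25.0])  # desired atomic percentages

# Seed with a few previously synthesized samples.
X = rng.uniform(20.0, 200.0, size=(4, 3))
Y = np.array([deposit(x) for x in X])

kernel = RBF(length_scale=50.0) + WhiteKernel(noise_level=1e-2)
for _ in range(10):
    gp = GaussianProcessRegressor(kernel=kernel, normalize_y=True).fit(X, Y)
    # Query step: among random candidate gun settings, pick the one whose
    # predicted composition is closest to the target (greedy acquisition).
    cand = rng.uniform(20.0, 200.0, size=(500, 3))
    pred = gp.predict(cand)
    best = cand[np.argmin(np.abs(pred - target).mean(axis=1))]
    # "Synthesize" the suggested sample and fold it back into training.
    X = np.vstack([X, best])
    Y = np.vstack([Y, deposit(best)])

errs = np.abs(Y - target).mean(axis=1)
print(f"best MAE over seed samples: {errs[:4].min():.2f} at%")
print(f"best MAE after 10 AL steps: {errs.min():.2f} at%")
```

In a real campaign, the `deposit` call is replaced by a human operator running the instrument and measuring composition, and the loop terminates once the MAE falls below an acceptance threshold.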
The ability to perform transfer learning enables researchers to pre-train AL regression models even with data not directly relevant to the composition of interest. Exhaustive calibration studies, or QCM-guided calibration studies, can only make predictions within the target system being calibrated. The AL regression approach succeeds even when trained with samples from other compositional systems. For example, a sample with a TaNbFeSb composition can be used as a training data point for a VNbFeSb target; this is the case in Figure 3. The AL regression method can determine the interactions between Nb, Fe, and Sb from the TaNbFeSb samples and apply them to the VNbFeSb system.
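One simple way to make such cross-system training possible is to encode every sample over the union of all elements studied, with zero power for any gun that is off; this encoding is our illustration of the idea, not the authors' stated scheme. A TaNbFeSb sample and a VNbFeSb sample then live in the same input space, so a single regression model can learn from both.

```python
import numpy as np

# All elements appearing anywhere in the training campaign.
ELEMENTS = ["V", "Fe", "Ti", "Nb", "Ta", "Sb"]

def encode(gun_powers):
    """Map a {element: gun power} dict to a fixed-length feature vector,
    with 0.0 for any element whose gun is off."""
    return np.array([gun_powers.get(el, 0.0) for el in ELEMENTS])

# Hypothetical gun-power settings for two different quaternary systems.
x_tanbfesb = encode({"Ta": 120.0, "Nb": 80.0, "Fe": 150.0, "Sb": 60.0})
x_vnbfesb = encode({"V": 90.0, "Nb": 85.0, "Fe": 140.0, "Sb": 55.0})
print(x_tanbfesb)
print(x_vnbfesb)
```

With this representation, the shared Nb–Fe–Sb structure of the two samples is visible to the model directly in overlapping feature columns.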
Active learning methods have previously been applied to predict single scalar outputs from multivariate inputs [42,43]. Many materials engineering problems, however, require simultaneous optimization of multiple parameters, whether a multinary composition or competing material properties like hardness and strength. Furthermore, many active learning studies have focused on synthesis methods that can be automated or performed in a high-throughput manner [44]. The intersection of active learning, high-throughput experimentation, and automation is currently being pursued by many self-driving laboratories around the world. The Ada laboratory at the University of British Columbia has demonstrated several successes in machine learning-driven material synthesis [45]; notably, the system found a global maximum of hole mobility in 35 sample runs. In another publication, the authors demonstrated Ada performing multi-objective optimization of thin films with competing objectives [46].
Yet there still exist a multitude of material synthesis methods that are not yet automated or not easily converted into a high-throughput format. In these cases, researchers must optimize multivariate objectives from sparse data. Active learning can benefit these labor-intensive processes all the same: human operators working alongside active learning algorithms can reach optimized synthesis parameters much faster than a human operator alone. The high predictive performance of these models on sparse datasets enables researchers to begin active learning with as little as a single sample.
This study builds upon these previous investigations but demonstrates active learning-based, multi-output recommendations for targeted synthesis on a system that is not high-throughput or robotically controlled but, instead, labor-intensive and thus built on sparse and expensive measurements. The goal of this effort is not to replace a human operator with a robot but to augment human performance by quantifying the structure of the target space and identifying trends and insights that can be converted into rules-of-thumb to guide optimization, which matters most at the beginning of a synthesis campaign, when the empirical dataset is sparsest.

4. Conclusions

Materials science is increasingly moving in the direction of large datasets, whether computationally or experimentally generated. These large datasets are well suited to deep learning algorithms such as Google’s GNoME platform [47]. Concurrently, advances in robotic systems are pushing materials science towards automated synthesis and experimental discovery through self-driving laboratories.
Yet the vast majority of materials science research is not yet compatible with deep learning methods due to data sparsity, and most laboratories do not have access to high-throughput robotic systems. Active learning, together with transfer learning, stands to fill this gap and enable higher productivity for these traditional laboratories.
Although the AL approach was demonstrated for magnetron sputtering, it is extensible to other thin-film deposition techniques, including other physical vapor deposition systems [11,48,49,50], chemical vapor deposition [3,4], and pulsed laser deposition [51,52]. It may also be extensible to other synthesis techniques that are high-throughput compatible, such as additive manufacturing or continuous flow synthesis [53]. This active learning workflow applies as long as the system has a finite number of tunable parameters and those tunable parameters are strong predictors of sample composition.
This method also has the potential to serve as a ‘cold start’ for other active learning workflows, such as those used in self-driving laboratories. The operation of autonomous workflows still requires an initial point of entry, and it is not always clear where to begin with instrument process calibration before autonomous synthesis. The method detailed herein can be used to initialize synthesis workflows to achieve a targeted composition, or to efficiently explore compositional systems, by seeding them with previously collected data, even if those data are not entirely within the target space or lie in a lower-dimensional one.
We strongly encourage researchers working on labor-intensive synthesis, where datasets are too sparse for many machine learning approaches, to adopt active- and transfer-learning approaches in their day-to-day laboratory operations. Having humans work alongside active learning models can vastly improve synthesis efficiency, productivity, and throughput.

Supplementary Materials

The following supporting information can be downloaded at: https://www.mdpi.com/article/10.3390/ma17164038/s1. Supplementary Figure S1: Comparison of the MAE over the manifold for random forest, Gaussian process regression, and neural network models. Supplementary Figure S2: MAE for transfer learning of both random forest and Gaussian process regression models. Supplementary Figure S3: MAE for transfer learning of Gaussian process regression model. Supplementary Figure S4: Pearson correlation coefficient for all variables used in regression. Supplementary Figure S5: MAE of linear regression model over the training manifold. Supplementary Figure S6: MAE for transfer learning of random forest model.

Author Contributions

N.S.J.: conceptualization, material synthesis, material characterization, data collection, algorithm development, writing—all drafts, visualization, review, editing. A.A.M.: conceptualization, algorithm development, writing—all drafts, visualization, review, editing. D.J.K.: material synthesis, material characterization, data collection, writing—review, editing. A.M.: supervision, project administration, writing—review, editing. All authors have read and agreed to the published version of the manuscript.

Funding

This research was supported by the U.S. Department of Energy, Office of Energy Efficiency and Renewable Energy (EERE), specifically the Advanced Materials & Manufacturing Technologies Office (AMMTO), under contract DE-AC02-76SF00515. Aashwin Mishra was partially supported by the SLAC ML Initiative. Dylan J. Kirsch was partially supported by the NSF Graduate Research Fellowship and the UMD Clark Doctoral Scholars Fellowship.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

Data are contained within the article and Supplementary Materials.

Conflicts of Interest

The authors declare no conflicts of interest.

References

  1. Li, S.Y.; Nguyen, T.X.; Su, Y.H.; Lin, C.C.; Huang, Y.J.; Shen, Y.H.; Liu, C.P.; Ruan, J.J.; Chang, K.S.; Ting, J.M. Sputter-Deposited High Entropy Alloy Thin Film Electrocatalyst for Enhanced Oxygen Evolution Reaction Performance. Small 2022, 18, 2106127. [Google Scholar] [CrossRef] [PubMed]
  2. Jung, S.G.; Han, Y.; Kim, J.H.; Hidayati, R.; Rhyee, J.S.; Lee, J.M.; Kang, W.N.; Choi, W.S.; Jeon, H.R.; Suk, J.; Park, T. High critical current density and high-tolerance superconductivity in high-entropy alloy thin films. Nat. Commun. 2022, 13, 3373. [Google Scholar] [CrossRef] [PubMed]
  3. Zhou, B.; Wang, Y.; Xue, C.; Han, C.; Hei, H.; Xue, Y.; Liu, Z.; Wu, Y.; Ma, Y.; Gao, J.; et al. Chemical vapor deposition diamond nucleation and initial growth on TiZrHfNb and TiZrHfNbTa high entropy alloys. Mater. Lett. 2022, 309, 131366. [Google Scholar] [CrossRef]
  4. Han, C.; Zhi, J.; Zeng, Z.; Wang, Y.; Zhou, B.; Gao, J.; Wu, Y.; He, Z.; Wang, X.; Yu, S. Synthesis and characterization of nano-polycrystal diamonds on refractory high entropy alloys by chemical vapour deposition. Appl. Surf. Sci. 2023, 623, 157108. [Google Scholar] [CrossRef]
  5. Kim, Y.S.; Park, H.J.; Mun, S.C.; Jumaev, E.; Hong, S.H.; Song, G.; Kim, J.T.; Park, Y.K.; Kim, K.S.; Jeong, S.I.; et al. Investigation of structure and mechanical properties of TiZrHfNiCuCo high entropy alloy thin films synthesized by magnetron sputtering. J. Alloys Compd. 2019, 797, 834–841. [Google Scholar] [CrossRef]
  6. Rar, A.; Frafjord, J.J.; Fowlkes, J.D.; Specht, E.D.; Rack, P.D.; Santella, M.L.; Bei, H.; George, E.P.; Pharr, G.M. PVD synthesis and high-throughput property characterization of Ni—Fe—Cr alloy libraries. Meas. Sci. Technol. 2019, 16, 834–841. [Google Scholar] [CrossRef]
  7. Wang, K.; Nishio, K.; Horiba, K.; Kitamura, M.; Edamura, K.; Imazeki, D.; Nakayama, R.; Shimizu, R.; Kumigashira, H.; Hitosugi, T. Synthesis of High-Entropy Layered Oxide Epitaxial Thin Films: LiCr1/6Mn1/6Fe1/6Co1/6Ni1/6Cu1/6O2. Cryst. Growth Des. 2022, 22, 1116–1122. [Google Scholar] [CrossRef]
  8. Anand, S.; Xia, K.; Hegde, V.I.; Aydemir, U.; Kocevski, V.; Zhu, T.; Wolverton, C.; Snyder, G.J. A valence balanced rule for discovery of 18-electron half-Heuslers with defects. Energy Environ. Sci. 2018, 11, 1480–1488. [Google Scholar] [CrossRef]
  9. Zarnetta, R. Identification of quaternary shape memory alloys with near-zero thermal hysteresis and unprecedented functional stability. Adv. Funct. Mater. 2010, 20, 1917–1923. [Google Scholar] [CrossRef]
  10. Hasan, N.M.A.; Hou, H.; Sarkar, S.; Thienhaus, S.; Mehta, A.; Ludwig, A.; Takeuchi, I. Combinatorial Synthesis and High-Throughput Characterization of Microstructure and Phase Transformation in Ni-Ti-Cu-V Quaternary Thin-Film Library. Engineering 2020, 6, 637–643. [Google Scholar] [CrossRef]
  11. Liang, A.; Goodelman, D.C.; Hodge, A.M.; Farkas, D.; Branicio, P.S. CoFeNiTix and CrFeNiTix high entropy alloy thin films microstructure formation. Acta Mater. 2023, 257, 119163. [Google Scholar] [CrossRef]
  12. Oses, C.; Toher, C.; Curtarolo, S. High-entropy ceramics. Nat. Rev. Mater. 2020, 5, 295–309. [Google Scholar] [CrossRef]
  13. Gregoire, J.M.; Zhou, L.; Haber, J.A. Combinatorial synthesis for AI-driven materials discovery. Nat. Synth. 2023, 2, 493–504. [Google Scholar] [CrossRef]
  14. Bunn, J.K.; Voepel, R.Z.; Wang, Z.; Gatzke, E.P.; Lauterbach, J.A.; Hattrick-Simpers, J.R. Development of an Optimization Procedure for Magnetron-Sputtered Thin Films to Facilitate Combinatorial Materials Research. Ind. Eng. Chem. Res. 2016, 55, 1236–1242. [Google Scholar] [CrossRef]
  15. Xia, A.; Togni, A.; Hirn, S.; Bolelli, G.; Lusvarghi, L.; Franz, R. Angular-dependent deposition of MoNbTaVW HEA thin films by three different physical vapor deposition methods. Surf. Coatings Technol. 2020, 385, 119163. [Google Scholar] [CrossRef]
  16. Alami, J.; Eklund, P.; Emmerlich, J.; Wilhelmsson, O.; Jansson, U.; Högberg, H.; Hultman, L.; Helmersson, U. High-power impulse magnetron sputtering of Ti—Si—C thin films from a Ti3SiC2 compound target. Thin Solid Films 2006, 515, 1731–1736. [Google Scholar] [CrossRef]
  17. Deki, S.; Aoi, Y.; Asaoka, Y.; Kajinami, A.; Mizuhata, M. Monitoring the growth of titanium oxide thin films by the liquid-phase deposition method with a quartz crystal microbalance. J. Mater. Chem. 1997, 7, 733–736. [Google Scholar] [CrossRef]
  18. Gongora, A.E.; Xu, B.; Perry, W.; Okoye, C.; Riley, P.; Reyes, K.G.; Morgan, E.F.; Brown, K.A. A Bayesian experimental autonomous researcher for mechanical design. Sci. Adv. 2020, 6, eaaz1708. [Google Scholar] [CrossRef]
  19. Xue, D.; Balachandran, P.V.; Hogden, J.; Theiler, J.; Xue, D.; Lookman, T. Accelerated search for materials with targeted properties by adaptive design. Nat. Commun. 2016, 7, 11241. [Google Scholar] [CrossRef]
  20. Kusne, A.G.; Yu, H.; Wu, C.; Zhang, H.; Hattrick-Simpers, J.; DeCost, B.; Sarker, S.; Oses, C.; Toher, C.; Curtarolo, S.; et al. On-the-fly closed-loop materials discovery via Bayesian active learning. Nat. Commun. 2020, 11, 5966. [Google Scholar] [CrossRef]
  21. Nikolaev, P.; Hooper, D.; Webber, F.; Rao, R.; Decker, K.; Krein, M.; Poleski, J.; Barto, R.; Maruyama, B. Autonomy in materials research: A case study in carbon nanotube growth. NPJ Comput. Mater. 2016, 2, 16031. [Google Scholar] [CrossRef]
  22. Ament, S.; Amsler, M.; Sutherland, D.R.; Chang, M.C.; Guevarra, D.; Connolly, A.B.; Gregoire, J.M.; Thompson, M.O.; Gomes, C.P.; Dover, R.B.V. Autonomous materials synthesis via hierarchical active learning of nonequilibrium phase diagrams. Sci. Adv. 2021, 7, eabg4930. [Google Scholar] [CrossRef] [PubMed]
  23. Zeier, W.G.; Schmitt, J.; Hautier, G.; Aydemir, U.; Gibbs, Z.M.; Felser, C.; Snyder, G.J. Engineering half-Heusler thermoelectric materials using Zintl chemistry. Nat. Rev. Mater. 2016, 1, 16032. [Google Scholar] [CrossRef]
  24. AJA International. ATC Orion Magnetron Sputtering System. 2023. Available online: https://www.ajaint.com/atc-orion-series-sputtering-systems.html (accessed on 1 March 2024).
  25. Kurt J. Lesker Co. Sputtering Targets. 2023. Available online: www.lesker.com/materials-division.cfm/section-sputtering-targets (accessed on 1 December 2023).
  26. JEOL Ltd. JEOL JXA-8230. 2023. Available online: https://www.jeol.com/products/scientific/epma/ (accessed on 1 March 2024).
  27. Takakura, M.; Takahashi, H.; Okumura, T. Thin-Film Analysis with Electron Probe X-ray MicroAnalyzer; Elsevier: Amsterdam, The Netherlands, 1998; Volume 33E. [Google Scholar]
  28. Abou-Ras, D.; Caballero, R.; Fischer, C.H.; Kaufmann, C.; Lauermann, I.; Mainz, R.; Mönig, H.; Schöpke, A.; Stephan, C.; Streeck, C.; et al. Comprehensive Comparison of Various Techniques for the Analysis of Elemental Distributions in Thin Films. Microsc. Microanal. 2011, 17, 728–751. [Google Scholar] [CrossRef] [PubMed]
  29. Lookman, T.; Balachandran, P.V.; Xue, D.; Yuan, R. Active learning in materials science with emphasis on adaptive sampling using uncertainties for targeted design. NPJ Comput. Mater. 2019, 5, 21. [Google Scholar] [CrossRef]
  30. Rasmussen, C.E.; Williams, C.K. Gaussian Processes for Machine Learning; Springer: Cham, Switzerland, 2006; Volume 1. [Google Scholar]
  31. Pedregosa, F.; Varoquaux, G.; Gramfort, A.; Michel, V.; Thirion, B.; Grisel, O.; Blondel, M.; Prettenhofer, P.; Weiss, R.; Dubourg, V.; et al. Scikit-learn: Machine Learning in Python. J. Mach. Learn. Res. 2011, 12, 2825–2830. [Google Scholar]
  32. Breiman, L. Random forests. Mach. Learn. 2001, 45, 5–32. [Google Scholar] [CrossRef]
  33. Grinsztajn, L.; Oyallon, E.; Varoquaux, G. Why do tree-based models still outperform deep learning on typical tabular data? Adv. Neural Inf. Process. Syst. 2022, 35, 507–520. [Google Scholar]
  34. Mishra, A.A.; Edelen, A.; Hanuka, A.; Mayes, C. Uncertainty quantification for deep learning in particle accelerator applications. Phys. Rev. Accel. Beams 2021, 24, 114601. [Google Scholar] [CrossRef]
  35. Caruana, R.; Niculescu-Mizil, A. An empirical comparison of supervised learning algorithms. In Proceedings of the 23rd International Conference on Machine Learning, Pittsburgh, PA, USA, 25–29 June 2006; pp. 161–168. [Google Scholar]
  36. Rodriguez-Galiano, V.; Sanchez-Castillo, M.; Chica-Olmo, M.; Chica-Rivas, M. Machine learning predictive models for mineral prospectivity: An evaluation of neural networks, random forest, regression trees and support vector machines. Ore Geol. Rev. 2015, 71, 804–818. [Google Scholar] [CrossRef]
  37. Kraskov, A.; Stögbauer, H.; Grassberger, P. Estimating mutual information. Phys. Rev. E 2004, 69, 066138. [Google Scholar] [CrossRef] [PubMed]
  38. Arrieta, A.B.; Díaz-Rodríguez, N.; Del Ser, J.; Bennetot, A.; Tabik, S.; Barbado, A.; García, S.; Gil-López, S.; Molina, D.; Benjamins, R.; et al. Explainable Artificial Intelligence (XAI): Concepts, taxonomies, opportunities and challenges toward responsible AI. Inf. Fusion 2020, 58, 82–115. [Google Scholar] [CrossRef]
  39. Adhikari, A.; Tax, D.M.; Satta, R.; Faeth, M. LEAFAGE: Example-based and Feature importance-based Explanations for Black-box ML models. In Proceedings of the 2019 IEEE International Conference on Fuzzy Systems (FUZZ-IEEE), New Orleans, LA, USA, 23–26 June 2019; pp. 1–7. [Google Scholar]
  40. Davis, B.; Glenski, M.; Sealy, W.; Arendt, D. Measure utility, gain trust: Practical advice for XAI researchers. In Proceedings of the 2020 IEEE Workshop on Trust and Expertise in Visual Analytics (TREX), Salt Lake City, UT, USA, 25–30 October 2020; pp. 1–8. [Google Scholar]
  41. Neidhardt, J.; Mráz, S.; Schneider, J.M.; Strub, E.; Bohne, W.; Liedke, B.; Möller, W.; Mitterer, C. Experiment and simulation of the compositional evolution of Ti—B thin films deposited by sputtering of a compound target. J. Appl. Phys. 2008, 104, 063304. [Google Scholar] [CrossRef]
  42. Bishnoi, S.; Singh, S.; Ravinder, R.; Bauchy, M.; Gosvami, N.N.; Kodamana, H.; Krishnan, N.M.A. Predicting Young’s modulus of oxide glasses with sparse datasets using machine learning. J. Non-Cryst. Solids 2019, 524, 119643. [Google Scholar] [CrossRef]
  43. Tripathi, B.M.; Sinha, A.; Mahata, T. Machine learning guided study of composition-coefficient of thermal expansion relationship in oxide glasses using a sparse dataset. Mater. Today Proc. 2022, 67, 326–329. [Google Scholar] [CrossRef]
  44. Mekki-Berrada, F.; Ren, Z.; Huang, T.; Wong, W.K.; Zheng, F.; Xie, J.; Tian, I.P.S.; Jayavelu, S.; Mahfoud, Z.; Bash, D.; et al. Two-step machine learning enables optimized nanoparticle synthesis. NPJ Comput. Mater. 2021, 7, 55. [Google Scholar] [CrossRef]
  45. MacLeod, B.P.; Parlane, F.G.L.; Morrissey, T.D.; Häse, F.; Roch, L.M.; Dettelbach, K.E.; Moreira, R.; Yunker, L.P.E.; Rooney, M.B.; Deeth, J.R.; et al. Self-driving laboratory for accelerated discovery of thin-film materials. Sci. Adv. 2020, 6, eaaz8867. [Google Scholar] [CrossRef]
  46. MacLeod, B.P.; Parlane, F.G.L.; Rupnow, C.C.; Dettelbach, K.E.; Elliott, M.S.; Morrissey, T.D.; Haley, T.H.; Proskurin, O.; Rooney, M.B.; Taherimakhsousi, N.; et al. A self-driving laboratory advances the Pareto front for material properties. Nat. Commun. 2022, 13, 995. [Google Scholar] [CrossRef] [PubMed]
  47. Merchant, A.; Batzner, S.; Schoenholz, S.S.; Aykol, M.; Cheon, G.; Cubuk, E.D. Scaling deep learning for materials discovery. Nature 2023, 624, 80–85. [Google Scholar] [CrossRef]
  48. Kelly, P.; Arnell, R. Magnetron sputtering: A review of recent developments and applications. Vacuum 2000, 56, 159–172. [Google Scholar] [CrossRef]
  49. Musil, J.; Baroch, P.; Vlček, J.; Nam, K.; Han, J. Reactive magnetron sputtering of thin films: Present status and trends. Thin Solid Films 2005, 475, 208–218. [Google Scholar] [CrossRef]
  50. Sarakinos, K.; Alami, J.; Konstantinidis, S. High power pulsed magnetron sputtering: A review on scientific and engineering state of the art. Surf. Coatings Technol. 2010, 204, 1661–1684. [Google Scholar] [CrossRef]
  51. Sloyan, K.A.; May-Smith, T.C.; Eason, R.W.; Lunney, J.G. The effect of relative plasma plume delay on the properties of complex oxide films grown by multi-laser, multi-target combinatorial pulsed laser deposition. Appl. Surf. Sci. 2009, 255, 9066–9070. [Google Scholar] [CrossRef]
  52. Keller, D.A.; Ginsburg, A.; Barad, H.N.; Shimanovich, K.; Bouhadana, Y.; Rosh-Hodesh, E.; Takeuchi, I.; Aviv, H.; Tischler, Y.R.; Anderson, A.Y.; et al. Utilizing Pulsed Laser Deposition Lateral Inhomogeneity as a Tool in Combinatorial Material Science. ACS Comb. Sci. 2015, 17, 209–221. [Google Scholar] [CrossRef]
  53. Dunlap, J.H.; Ethier, J.G.; Putnam-Neeb, A.A.; Iyer, S.; Luo, S.X.L.; Feng, H.; Torres, J.A.G.; Doyle, A.G.; Swager, T.M.; Vaia, R.A.; et al. Continuous flow synthesis of pyridinium salts accelerated by multi-objective Bayesian optimization with active learning. Chem. Sci. 2023, 14, 8061–8069. [Google Scholar] [CrossRef] [PubMed]
Figure 1. Overview of the active learning loop used in the present study. Pre-training of the active learning models was done on all compositions that shared at least one element in common with the target system. After pre-training, the model was iteratively trained and queried for new samples until a sufficiently low MAE was achieved. The inset box lists all composition systems that are included in the training/target datasets.
Figure 2. Mean absolute error for a quinary composition as a function of the number of training samples; (inset) mean absolute error for a given target composition as a function of the number of training samples. Error bars represent the standard deviation in prediction error across 10 different target composition predictions.
Figure 3. Performance of active learning models trained on successively more complex compositions. Plots are organized by the composition being predicted: the leftmost plot shows predictions for ternary compositions, the middle for quaternary, and the rightmost for quinary.
Figure 4. Mutual information index for all input parameters and atomic percentages for all data.

