1. Introduction
Metal Additive Manufacturing (AM), particularly Directed Energy Deposition (DED), has made it feasible to produce large-scale metallic components with complex geometries and tailored microstructures, offering advantages in design freedom, waste reduction, and lead time [1,2]. Nonetheless, predicting mechanical behavior remains challenging owing to rapid thermal cycles, varied solidification patterns, and microstructural changes: the final ultimate tensile strength (UTS) and yield strength (YS) depend on the interplay between alloy composition, cooling rate, grain refinement, and thermal gradients [3,4]. Although DED flaws such as porosity, residual stress, and non-equilibrium phases are almost unavoidable, post-processing such as heat treatment or hot isostatic pressing can improve ductility, strength, and microstructural uniformity, and thereby the material's homogeneity and quality [5,6]. In a number of cases, well-designed, optimally tuned DED parts have shown similar or even superior mechanical properties compared with their conventionally processed counterparts, such as cast or wrought steel. However, to reduce dependence on post-processing and to better define as-built properties, predictive models are needed that accurately estimate performance from material and process parameters. The present study addresses this need by providing an interpretable, robust machine learning (ML) model that predicts major mechanical properties from DED-specific thermal and compositional characteristics, to support informed process design and quality control.
Characterization of the mechanical properties of DED-fabricated parts is often costly, time-consuming, and resource-intensive because of specimen preparation and calibration [7]. Experimental setups are material-dependent, which may complicate the identification of underlying mechanisms, and they typically cover only a limited number of processing parameters, so their results are not directly applicable to other alloys or DED systems. Simulation-based alternatives, such as the finite element method (FEM) or thermodynamic software, require elaborate calibration, substantial computing resources, and sequential simulation workflows to predict microstructure development and mechanical properties [8]. For DED, porosity, thermal history, and cooling rate must be simulated to model material behavior and predict the final properties [9]. There is therefore a call for hybrid methods that combine data-driven ML with physics-based simulations to predict mechanical properties accurately from process conditions and material parameters within reasonable computation times. This work employs such a hybrid approach, using JMatPro for the underlying physical properties and ML models to rapidly generalize structure–property relationships over broad design spaces.
Because experimental evaluation is costly and usually restricted to particular alloys and dedicated DED conditions with a limited range of process parameters, several studies have turned to ML. For instance, Akbari et al. [10] introduced a physics-informed ML framework trained on experimental data from more than 140 metal AM studies, not limited to DED, to predict mechanical properties (UTS, YS, hardness) accurately and to interpret the impact of features using SHAP. Sharma et al. [11] built random forest and gradient boosting regression models on a training set of 421 samples and reported high predictive performance (R² > 0.85) for tensile strength, yield strength, and elongation; post-processing condition and build orientation were the most important predictors for all three properties. Similarly, Mozaffar et al. [12] used a recurrent neural network with Gated Recurrent Units (GRUs) to predict high-dimensional thermal histories in DED and reached a test-set MSE of 3.84 × 10⁻⁵. The model achieved strong accuracy for builds of different geometries and time scales, establishing a foundation for data-driven, real-time thermal control. Furthermore, Kannapinn et al. [13] proposed a digital twin framework based on neural ordinary differential equations for real-time prediction of residual stress fields and tensile properties during deposition, using multi-physics DED simulations. Park et al. [14] reported amorphous-inspired 65.1Co28.2Cr5.3Mo lattice structures, designed through artificial intelligence and manufactured through AM, that show excellent specific compressive strength. Their combination of periodic nodal patterns and aperiodic strut patterns, together with heat treatment and nano-structuring, appears promising for high-stress applications such as valve cages in power plants. Nevertheless, predicting mechanical properties such as UTS, YS, and hardness (HV) through simulations alone is expensive and time-consuming and requires the integration of multiple discrete, computationally intensive models.
Given the constraints of experimental datasets, training ML models on simulation data provides a cost-effective and scalable way to model structure–property relationships for DED-fabricated low-alloy steels [15]. Thermodynamic software such as JMatPro can rapidly predict significant metallurgical variables, including phase fractions, grain size, and thermal properties, for a variety of compositions and cooling rates, which would take substantial time and expense to obtain through experimentation or physical simulation [15]. ML models trained on simulated datasets can estimate mechanical properties across different DED-relevant conditions while capturing the intricate interrelations among input parameters. Integrating JMatPro simulations with ML algorithms supports the prediction of UTS, YS, and HV and contributes to optimizing DED-relevant process parameters for mechanical performance. Barriers remain, however, in applying ML to metal AM, such as feature complexity, small dataset size, and limited variability compared with traditional ML applications.
Despite these obstacles and limitations, ML has found applications in metal AM DED processes. For instance, Rahman et al. [16] employed JMatPro-generated mechanical property data, based on alloy composition, cooling rates, grain sizes, and thermal gradients, for more than 2400 DED-produced low-alloy steel compositions. They used ML models such as polynomial regression (PR) and multivariable linear regression (MVLR) to predict UTS, YS, and hardness, obtaining R² > 0.79 and showing that simulation-augmented datasets can effectively inform mechanical property prediction. Sinha et al. [17] demonstrated a Random Forest regression model that predicts the elongation, UTS, and YS of alloy steel from compositional features and cold-rolling deformation. The model achieved R² ≈ 0.94 for elongation, 0.99 for UTS, and 0.85 for YS, which illustrates the power of ensemble learning in materials property prediction. Furthermore, Era et al. [18] showed that XGBoost outperformed Random Forest and Ridge Regression in predicting the tensile properties of L-DED SS 316L parts. With laser power, scanning speed, and layer thickness as inputs, XGBoost gave the most accurate response, with RMSE values of 11.38 MPa (YS), 12.22 MPa (UTS), and 3.22% (elongation), demonstrating the model's capacity for mechanical behavior prediction in AM. Qi et al. [19] predicted the YS, UTS, and elongation of low-alloy steels with an average R² of 0.73 using their proposed KD-GCN method, which they compared with classical ML models. In feature-importance analysis, KD-GCN also showed higher consistency with domain knowledge (90%) than an MLP (70%), which supports its interpretability. Chandraker et al. [20] established an ML model that maps alloy composition and DED printing parameters to YS in multi-principal element alloys (e.g., the Co–Cr–Fe–Mn–Ni system). They achieved an R² of 0.84 using ensemble methods such as Random Forest and XGBoost, showing the efficiency of ML in predicting mechanical performance from processing inputs. Xie et al. [21] presented a wavelet-CNN model for predicting the UTS of metal AM parts from their thermal history. It surpassed traditional ML, obtaining an R² of about 0.70. Certain temperature bands (1213–1365 °C and 654–857 °C) were observed to have a pronounced effect on UTS, and longer dwell times and rapid cooling produced finer-grained microstructures and better mechanical properties.
Although prior studies have successfully applied ML to predict the mechanical properties of metal AM parts, they usually relied on small experimental datasets, which is restrictive given the inherently large spread in processing parameters, and on individual materials and build strategies. In contrast, the present study is based on a large simulation-based dataset of 4900 instances produced by JMatPro for DED-fabricated low-alloy steels. The dataset includes comprehensive input information such as chemical composition, grain size, cooling rate, and thermal properties, which provides sufficient training data for ML models that predict UTS, YS, and HV. By encoding a wider range of material and process variables, our framework features enhanced generalization, supporting process optimization and quality monitoring over a broader scope of DED processing conditions. This approach allows a more systematic, physics-based investigation of the structure–property relations that matter in advanced manufacturing applications.
Using this extensive JMatPro-simulated dataset, this work adopts an ML framework to predict the mechanical properties (UTS, YS, and HV) of DED-manufactured low-alloy steels. MVLR, PR (Degree 2 and Degree 3), MLPR, and XGBoost regression algorithms were used to predict the mechanical properties and to compare model performance. Classification models were also trained to assign mechanical properties to ranges according to DED features such as cooling rate, grain size, temperature, and chemical composition. Each model was evaluated in terms of predictive accuracy and interpretability. In addition, SHAP (SHapley Additive exPlanations) analysis was employed to explain the contribution of each input feature to the predictions. Through physics-based feature creation and interpretable regression and classification models, this framework allows robust and explainable structure–property prediction in DED-manufactured low-alloy steels.
A key innovation of our framework lies in its ability to generalize across a wide range of alloy compositions and DED processing conditions. With physics-informed features computed using JMatPro simulations, the ML models learn a relation for mechanical properties such as UTS, YS, and HV as a function of DED-specific parameters such as cooling rate, grain size, alloy composition, and thermal behavior. This simulation-driven feature engineering enables the framework to extend its predictive capability to unseen alloy systems and DED parameter spaces, including novel compositions or thermal profiles not explicitly present in the dataset. In contrast to traditional empirical models, which are confined to narrow experimental ranges, this framework provides a flexible and extensible prediction platform that advances structure–property modeling in AM, in particular for DED-processed low-alloy steels.
2. Methodology
This section describes the ML framework, including the process used to acquire and preprocess data from the JMatPro (Version 14) simulations. It also discusses feature engineering based on DED parameters and the implementation of the MVLR, PR (Degree 2 and Degree 3), MLPR, XGBoost, and classification models. SHAP analysis and standard evaluation metrics are used to verify and interpret the models' predictions of the mechanical properties.
2.2. Data Acquisition and Pre-Processing
In this study, a comprehensive dataset comprising 4900 samples was generated from thermophysical calculations on low-alloy steels using JMatPro simulations, based on composition data collected from a commercial DED company's machine. Each example consists of 13 input descriptors and one of three target mechanical response variables (UTS, YS, and HV). The input features include the full elemental composition (C, Si, Mn, P, S, Ni, Cr, Mo, Cu, V, Al, N), grain size, thermal exposure temperature (temp), and cooling rate, all significant factors affecting microstructural evolution. The cooling rate takes three values that represent typical cooling media in materials processing: furnace cooling as slow cooling (0.01 °C/s), air cooling as medium-speed cooling (3.3 °C/s), and ice-brine quenching as fast cooling (375 °C/s). These cooling rates do not directly mimic the intricate, layer-dependent cooling of DED; instead, they were selected to encompass the primary metallurgical transformation pathways (ferritic/pearlitic, bainitic, and martensitic) [22,23,24], so that a range of conditions is covered across samples while the problem remains computationally tractable for ML. Furthermore, these cooling rate values (0.1 °C/s, 3.3 °C/s, 375 °C/s) were used in JMatPro to span a wide spectrum of solidification and phase-transformation regimes, enabling the ML framework to learn generalized thermal–mechanical trends. Grain size was held constant at ASTM 20, representing very fine grains (~3 µm), because it is interrelated with composition, cooling rate, and temperature in explaining the mechanical properties; it was included as an input descriptor for completeness. Preprocessing steps were applied to ensure data integrity, including format alignment, inspection for outliers, and normalization of features. The resulting dataset serves as a robust foundation for predictive modeling of mechanical properties in AM.
Data preprocessing is a key step in training predictive ML models, as it directly affects their performance, accuracy, and generalizability. It commonly includes excluding incomplete samples, removing redundant samples, and addressing outliers that can induce biased learning patterns. In this study, missing values were handled using the complete-case filtering strategy, retaining only those samples with fully observed features across all input dimensions. The filtering operation is expressed in Equation (1) [25], where $x_i$ denotes the $i$th data point and $x_{ij}$ its corresponding feature value:

$$D_{\text{clean}} = \{\, x_i \in D \mid x_{ij} \neq \text{NaN} \;\; \forall\, j \in \{1, \dots, m\} \,\} \tag{1}$$

where:
$D$ = original dataset with $n$ samples and $m$ features
$x_i$ = $i$th data sample
$x_{ij}$ = value of the $j$th feature in the $i$th sample
$D_{\text{clean}}$ = filtered dataset containing only complete records
This equation means that only the samples for which the values of the features are valid and not missing (i.e., not NaN) are retained for subsequent modelling. Here, NaN is the abbreviation for “Not a Number”, used to represent an undefined or missing number in a dataset which typically results from incomplete simulations, corrupted input data, or format irregularities.
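As a minimal sketch of the complete-case filtering in Equation (1), the following Python snippet keeps only rows whose features are all observed. The function name `complete_case_filter` and the toy rows are illustrative assumptions, not the code or data used in this study.

```python
import math

def complete_case_filter(dataset):
    """Keep only rows in which every feature value is observed (not None/NaN)."""
    def is_missing(value):
        # None and float NaN are treated as missing; anything else is observed.
        return value is None or (isinstance(value, float) and math.isnan(value))
    return [row for row in dataset if not any(is_missing(v) for v in row)]

# Hypothetical toy rows: (C wt.%, Mn wt.%, cooling rate in deg C/s)
D = [
    (0.20, 1.10, 3.3),
    (0.25, float("nan"), 375.0),  # dropped: Mn is NaN
    (0.18, 0.90, None),           # dropped: cooling rate is missing
    (0.30, 1.40, 3.3),
]
D_clean = complete_case_filter(D)
print(len(D), "->", len(D_clean))
```

Dropping rows rather than imputing values keeps every retained sample fully simulated, at the cost of discarding partially observed records.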
After this filtering, the dataset used for the regression models (MVLR, PR, MLPR, and XGBoost) comprised 4900 samples, following cleaning, deduplication, and handling of missing values. This ensured that only complete and consistent data contributed to model training and evaluation. The classification dataset likewise comprised 4900 samples, as its pre-processing focused on label binarization and class balancing and did not involve sample removal. No automated outlier detection (e.g., Mahalanobis distance or Z-score) was conducted; however, inspection of the input feature distributions confirmed that all samples remained consistent. Duplicate entries were inherently excluded because each simulation configuration was unique. The retained variability in chemical composition, thermal input, and cooling rate provided a large and diverse feature space suitable for both regression and classification tasks.
Table 1 presents a statistical overview of several variables within the dataset, including their mean, median, and standard deviation. These descriptive statistics highlight the spread and characteristics of the data and give insight into its general nature.
2.3. Dataset Featurizations
To ensure robust and generalizable predictions of mechanical properties, a dedicated physics-based feature selection approach was employed during ML model training. The featurization combines material-related and process-informed features obtained from extensive JMatPro simulations of low-alloy steels. Specifically, it uses elemental compositions (e.g., C, Mn, Si, Cr, Ni, Mo), grain size, cooling rate, and thermal history, all of which are known to decisively affect mechanical behavior in DED-produced low-alloy steels. These features, collated in Figure 2, form the foundation of the dataset and encode domain knowledge for understanding the complex structure–property relationships in DED fabrication.
Besides the continuous variables derived from process parameters and material compositions, the featurization framework includes categorical features to capture essential qualitative distinctions within the dataset. As illustrated in Figure 2, the categorical input (e.g., alloy code: MBB, MBC) represents key identifiers that differentiate compositional families or experimental groupings relevant to the DED process. These categorical variables are converted to machine-readable form via one-hot encoding, in which each category is expressed as a binary vector of size 1 × n, with “1” assigned to the $k$th index and zeros assigned to the remaining indices for any data point that belongs to the $k$th category. This encoding lets ML models learn discrete differences between alloy classes without adding prediction tasks. Although this study focuses on alloy code as the categorical input, the same featurization method can be applied straightforwardly to other types of input features, such as process parameters and material composition. Samples lacking any of the necessary categorical or continuous features are removed, which ensures the consistency and completeness of the training datasets. This pre-processing provides a full feature set for both training and evaluation, thereby promoting more consistent and reliable model training and evaluation.
This work used a physics-based featurization to keep the ML models rooted in metallurgical principles rather than purely statistical correlations. This strategy maps input descriptors into a physically meaningful material space that directly reflects their effect on microstructure and mechanical behavior. The relations that underlie the featurization framework are as follows.
Process parameters, which include temperature, cooling rate, and grain size, are related to the mechanical response through the thermal softening relation, the cooling rate–hardness relationship, and the Hall–Petch equation, respectively; see Equations (2)–(4) [26,27,28]. Alloy composition is related to solid-solution strengthening through Equation (5) [29]. Categorical features, such as alloy codes (e.g., MBB, MBC), were included using one-hot encoding, Equation (6) [30], so that the ML framework can distinguish family-specific differences in phase stability and alloy design.
In the thermal softening relation, $\sigma(T)$ is the strength at temperature $T$, $\sigma_{\mathrm{RT}}$ is the room-temperature strength, and $\alpha$ is a softening coefficient. This reflects the well-known decline in strength and hardness at elevated temperatures.
In the Hall–Petch relation, $\sigma_y = \sigma_0 + k_y d^{-1/2}$, $\sigma_y$ is the yield strength, $\sigma_0$ the lattice friction stress, $k_y$ the Hall–Petch slope, and $d$ the grain size. This captures the well-established strengthening due to grain refinement.
In the cooling rate–hardness relationship, $\dot{T}$ is the cooling rate. Cooling promotes phase transformation, and including the cooling rate as a feature captures its direct impact on hardness and strength.
In the solid-solution strengthening relation, $c_i$ is the atomic fraction of alloying element $i$ (C, Mn, Ni, Cr, Mo, etc.) and $k_i$ its strengthening coefficient. This relation ensures that compositional effects such as hardenability and lattice distortion are embedded in the feature space.
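To illustrate how such relations can be turned into numerical features, the sketch below evaluates a Hall–Petch grain-boundary term and a solid-solution term in Python. It rests on stated assumptions: the solid-solution contribution is taken to be linear in the atomic fractions, and the coefficient values are hypothetical placeholders rather than fitted or published constants. The thermal softening and cooling rate relations are omitted because their functional forms are not fixed by the symbol definitions alone.

```python
def hall_petch_term(grain_size_m, k_y):
    """Grain-boundary strengthening term k_y * d**(-1/2), with d in metres."""
    return k_y * grain_size_m ** -0.5

def solid_solution_term(atomic_fractions, coefficients):
    """Solid-solution term sum_i k_i * c_i (a linear form is ASSUMED here)."""
    return sum(coefficients[el] * c for el, c in atomic_fractions.items())

# Hypothetical inputs, for illustration only.
d = 3e-6                    # grain size of about 3 micrometres
k_y = 0.6                   # placeholder Hall-Petch slope in MPa*m^0.5
c = {"C": 0.009, "Mn": 0.012, "Cr": 0.010}     # placeholder atomic fractions
k = {"C": 5000.0, "Mn": 400.0, "Cr": 300.0}    # placeholder coefficients in MPa

features = {
    "hall_petch_MPa": hall_petch_term(d, k_y),
    "solid_solution_MPa": solid_solution_term(c, k),
}
print(features)
```

Features of this kind could be appended to the raw descriptors so that a regression model sees quantities that already scale with the expected strengthening mechanisms.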
One-hot encoding is used for categorical features such as alloy codes (MBB, MBC):

$$E_{\text{cat}} = \Big\{\, e_i \in \{0,1\}^n \;:\; \sum_{j=1}^{n} e_{ij} = 1 \,\Big\} \tag{6}$$

Here, $E_{\text{cat}}$ represents the set of encoded categorical vectors for all samples, where each $e_i$ is an $n$-dimensional binary vector. The condition $\sum_{j=1}^{n} e_{ij} = 1$ ensures that exactly one element of the vector is assigned a value of “1”, corresponding to the category that the sample belongs to, while all other entries remain “0”. This encoding enables the machine learning model to discern categorical differences between alloy families in a mathematically sound and machine-readable manner, without imposing pseudo-ordinal relationships among categories.
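A minimal Python sketch of this encoding is given below. The helper name `one_hot` and the three sample codes are assumptions for illustration; the alphabetical ordering of categories is a design choice of the sketch, not something specified in this study.

```python
def one_hot(labels):
    """Map each label to a binary vector with a single 1 at its category index."""
    categories = sorted(set(labels))               # fixed, reproducible order
    index = {cat: k for k, cat in enumerate(categories)}
    vectors = []
    for label in labels:
        vec = [0] * len(categories)
        vec[index[label]] = 1                      # exactly one entry is set
        vectors.append(vec)
    return categories, vectors

codes = ["MBB", "MBC", "MBB"]                      # illustrative alloy codes
categories, encoded = one_hot(codes)
print(categories)
print(encoded)

# Every encoded row satisfies the sum-to-one condition of the encoding.
assert all(sum(vec) == 1 for vec in encoded)
```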
2.4. Models Evaluations and Verification Strategy
To comprehensively evaluate the prediction accuracy of the developed models, both regression-based and classification-based evaluation strategies were used. These combined methods enabled robust validation of both continuous and discretized outputs for mechanical property prediction of DED-manufactured low-alloy steels.
For the regression tasks involving the MVLR, PR, MLPR, and XGBoost algorithms, performance was quantified using three standard metrics: mean absolute error (MAE), root mean squared error (RMSE), and the coefficient of determination (R²), given in Equations (7)–(9) [31,32,33]:

$$\mathrm{MAE} = \frac{1}{N} \sum_{i=1}^{N} \left| y_i - \hat{y}_i \right| \tag{7}$$

$$\mathrm{RMSE} = \sqrt{\frac{1}{N} \sum_{i=1}^{N} \left( y_i - \hat{y}_i \right)^2} \tag{8}$$

$$R^2 = 1 - \frac{\sum_{i=1}^{N} \left( y_i - \hat{y}_i \right)^2}{\sum_{i=1}^{N} \left( y_i - \bar{y} \right)^2} \tag{9}$$

where $\hat{y}_i$ and $y_i$ represent the predicted and observed values of the $i$th sample, $\bar{y}$ is the mean of the observed values, and $N$ denotes the number of data points. These metrics are based on absolute and squared deviations between predictions and labels and offer a sound assessment of the accuracy and generalization of the regression model. Lower MAE and RMSE values and higher R² values indicate better model performance.
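The three regression metrics of Equations (7)–(9) can be computed directly from paired observed and predicted values. The sketch below is a plain-Python illustration; the function name `regression_metrics` and the four strength values are hypothetical and are not results from this study.

```python
import math

def regression_metrics(y_true, y_pred):
    """Return (MAE, RMSE, R^2) for paired observed and predicted values."""
    n = len(y_true)
    residuals = [yt - yp for yt, yp in zip(y_true, y_pred)]
    mae = sum(abs(r) for r in residuals) / n
    ss_res = sum(r * r for r in residuals)          # residual sum of squares
    rmse = math.sqrt(ss_res / n)
    mean_y = sum(y_true) / n
    ss_tot = sum((yt - mean_y) ** 2 for yt in y_true)  # total sum of squares
    r2 = 1.0 - ss_res / ss_tot
    return mae, rmse, r2

# Hypothetical UTS values in MPa, for illustration only.
y_obs = [600.0, 700.0, 800.0, 900.0]
y_hat = [610.0, 690.0, 805.0, 895.0]
mae, rmse, r2 = regression_metrics(y_obs, y_hat)
print(f"MAE={mae:.2f} MPa, RMSE={rmse:.2f} MPa, R2={r2:.3f}")
```

Because RMSE squares the residuals before averaging, it is never smaller than MAE and reacts more strongly to a few large errors.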
To complement the regression analysis, a classification framework was used to interpret model performance when the continuous mechanical property outputs were transformed into categorical bins (e.g., low, medium, and high hardness). The confusion matrix and its derived metrics (accuracy, precision, recall, and F1-score) were used to report the performance. This was supplemented by multiclass receiver operating characteristic (ROC) curves and area under the curve (AUC) analysis, giving an overview of the discriminatory power of the model across the different classes.
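One possible way to discretize a continuous property into low, medium, and high classes is rank-based (tertile) binning, sketched below in Python. The cut points used in this study are not stated here, so the tertile rule, the function name `tertile_bins`, and the hardness values are all assumptions for illustration.

```python
def tertile_bins(values):
    """Label values low/medium/high using cut points at the 1/3 and 2/3 ranks."""
    ordered = sorted(values)
    n = len(ordered)
    low_cut = ordered[n // 3]           # values below this are "low"
    high_cut = ordered[(2 * n) // 3]    # values at or above this are "high"
    labels = []
    for v in values:
        if v < low_cut:
            labels.append("low")
        elif v < high_cut:
            labels.append("medium")
        else:
            labels.append("high")
    return labels, (low_cut, high_cut)

# Hypothetical hardness values in HV, for illustration only.
hardness = [180, 210, 250, 320, 400, 450]
classes, cuts = tertile_bins(hardness)
print(cuts)
print(classes)
```

Rank-based cut points give roughly balanced classes by construction, which simplifies stratified validation but ties the class boundaries to the sampled data rather than to engineering thresholds.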
To enhance model reliability and minimize bias caused by data heterogeneity, cross-validation was integrated into the validation strategy. The dataset was split into subsets, and each subset in turn was used as the test set while the rest served as training data. K-Fold cross-validation (CV) was employed, together with stratified K-Fold CV to maintain the class balance between the folds, and leave-one-out cross-validation (LOOCV) was also used so that the evaluation results are statistically robust and generalizable. The best model according to cross-validation loss was chosen and saved for each mechanical property task. This evaluation system, comprising continuous and discrete measures, multi-layered cross-validation, and optimized learning parameters, supports accurate prediction and interpretability of mechanical behavior in DED-manufactured low-alloy steels.
This dual-framework approach integrates regression-based precision with classification-based interpretability, provides full model verification, and demonstrates the potential of the developed ML framework for estimating mechanical properties from process-dependent features in DED-processed low-alloy steels.
2.5. ML Algorithms
This section presents the ML algorithms used to predict the mechanical properties (UTS, YS, and hardness) from the featurized dataset based on JMatPro simulations of DED-fabricated low-alloy steel. The algorithms are Multivariable Linear Regression (MVLR), Polynomial Regression (Degree 2 and 3), Multi-Layer Perceptron Regressor (MLPR), Extreme Gradient Boosting (XGBoost), and classification models, which are employed to enhance predictive accuracy and enable comparative analysis.
2.5.5. Classification Model
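The predictive performance and generalization capability of the classification models are evaluated using the training score and the test score, both based on classification accuracy. The training score measures how well the model performs on the training dataset, and the test score measures performance on the test dataset. Each is calculated as the percentage of correctly classified samples in the respective set. Similar values of the two scores suggest good generalization, while a training score that is much higher than the test score may indicate overfitting, and low values of both may indicate underfitting. The scores are given in Equations (18) and (19) [41]:

$$\text{Training score} = \frac{1}{N_{\text{train}}} \sum_{i=1}^{N_{\text{train}}} \mathbb{1}\!\left( \hat{y}_i^{\text{train}} = y_i^{\text{train}} \right) \times 100\% \tag{18}$$

$$\text{Test score} = \frac{1}{N_{\text{test}}} \sum_{i=1}^{N_{\text{test}}} \mathbb{1}\!\left( \hat{y}_i^{\text{test}} = y_i^{\text{test}} \right) \times 100\% \tag{19}$$

Here, $\hat{y}_i^{\text{train}}$ and $\hat{y}_i^{\text{test}}$ are the predicted class labels for the $i$th training sample and test sample, respectively; $y_i^{\text{train}}$ and $y_i^{\text{test}}$ are their corresponding true class labels; and $N_{\text{train}}$ and $N_{\text{test}}$ represent the total numbers of training and test samples. The indicator $\mathbb{1}(\cdot)$ equals 1 when its argument is true and 0 otherwise.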
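The training and test scores of Equations (18) and (19) reduce to a percentage of matching labels. The following Python sketch shows the computation and the train/test gap used as a generalization check; the helper name `accuracy_percent` and the label lists are hypothetical.

```python
def accuracy_percent(y_true, y_pred):
    """Percentage of samples whose predicted class equals the true class."""
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return 100.0 * correct / len(y_true)

# Hypothetical labels, for illustration only.
train_true = ["low", "low", "medium", "high", "high"]
train_pred = ["low", "low", "medium", "high", "high"]
test_true = ["low", "medium", "medium", "high"]
test_pred = ["low", "medium", "high", "high"]

train_score = accuracy_percent(train_true, train_pred)
test_score = accuracy_percent(test_true, test_pred)
gap = train_score - test_score   # a large positive gap hints at overfitting
print(train_score, test_score, gap)
```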
Additionally, classification performance was assessed with confusion-matrix-based metrics, including accuracy, precision, recall, and F1-score for each class, averaged with both the macro and the weighted approach, Equations (20)–(24) [42]. To evaluate the model's ability to distinguish among multiple classes, multi-class ROC curves (via the one-vs-rest method) and the corresponding area under the curve (AUC) were computed with macro averaging and weighted averaging, Equations (25)–(27) [43]. These performance criteria are used to verify the efficiency and dependability of the proposed ML models for predicting the mechanical properties of DED-manufactured low-alloy steels.

$$\text{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN} \tag{20}$$

$$\text{Precision} = \frac{TP}{TP + FP} \tag{21}$$

$$\text{Recall} = \frac{TP}{TP + FN} \tag{22}$$

$$F1 = 2 \cdot \frac{\text{Precision} \cdot \text{Recall}}{\text{Precision} + \text{Recall}} \tag{23}$$

$$F1_{\text{macro}} = \frac{1}{C} \sum_{c=1}^{C} F1_c \tag{24}$$

where TP, TN, FP, and FN are the numbers of true positives, true negatives, false positives, and false negatives. Precision measures the ratio of correctly predicted positive observations to the total predicted positives (positive predictive value), while recall measures the ratio of correctly predicted positive observations to all observations in the actual class. The F1-score is the harmonic mean of precision and recall, which balances the two. In a multiclass framework, $F1_c$ is the F1-score for class $c$, and $C$ is the number of classes. The macro-averaged F1-score is the unweighted average of the F1-scores across all classes, so that every class contributes equally.
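As a brief illustration of the per-class metrics and their macro average, the Python sketch below counts TP, FP, and FN for each class in a one-vs-rest fashion. The function names and the six labelled samples are assumptions made for this example only.

```python
def per_class_scores(y_true, y_pred, classes):
    """Return {class: (precision, recall, f1)} using one-vs-rest counts."""
    scores = {}
    for c in classes:
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == c and p == c)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != c and p == c)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t == c and p != c)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        scores[c] = (precision, recall, f1)
    return scores

def macro_f1(scores):
    """Unweighted mean of the per-class F1-scores."""
    return sum(f1 for _, _, f1 in scores.values()) / len(scores)

# Hypothetical labels, for illustration only.
classes = ["low", "medium", "high"]
y_true = ["low", "low", "medium", "medium", "high", "high"]
y_pred = ["low", "medium", "medium", "medium", "high", "high"]
scores = per_class_scores(y_true, y_pred, classes)
print(scores)
print(macro_f1(scores))
```

In this toy case one "low" sample is predicted as "medium", which lowers recall for "low" and precision for "medium" while leaving "high" unaffected.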
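Multiclass ROC-AUC (OvR):

$$\mathrm{AUC}_c = \int_0^1 \mathrm{TPR}_c(\mathrm{FPR}_c)\, d\,\mathrm{FPR}_c$$

The overall macro-average AUC is then

$$\mathrm{AUC}_{\text{macro}} = \frac{1}{C} \sum_{c=1}^{C} \mathrm{AUC}_c$$

Here, $\mathrm{TPR}_c(\mathrm{FPR}_c)$ is the true positive rate as a function of the false positive rate for class $c$ in a one-vs-rest (OvR) multiclass classification. The AUC for class $c$, denoted $\mathrm{AUC}_c$, is obtained by integrating $\mathrm{TPR}_c$ over [0, 1] with respect to the false positive rate. The macro-averaged ROC-AUC score, $\mathrm{AUC}_{\text{macro}}$, is calculated as the average of the individual class-wise $\mathrm{AUC}_c$ over all $C$ classes.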
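The one-vs-rest AUC and its macro average can be sketched in plain Python by sweeping score thresholds and integrating the resulting ROC curve with the trapezoidal rule. The function names, the class order, and the probability rows below are hypothetical; column k of each row is assumed to hold the score for `classes[k]`.

```python
def roc_auc_binary(labels, scores):
    """AUC by trapezoidal integration of the ROC curve; labels are 0 or 1."""
    pos = sum(labels)
    neg = len(labels) - pos
    points = [(0.0, 0.0)]
    # Sweep thresholds from the highest to the lowest distinct score.
    for thr in sorted(set(scores), reverse=True):
        tp = sum(1 for l, s in zip(labels, scores) if s >= thr and l == 1)
        fp = sum(1 for l, s in zip(labels, scores) if s >= thr and l == 0)
        points.append((fp / neg, tp / pos))
    auc = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        auc += (x1 - x0) * (y0 + y1) / 2.0
    return auc

def macro_ovr_auc(y_true, prob_rows, classes):
    """Per-class one-vs-rest AUCs and their unweighted mean."""
    aucs = {}
    for k, c in enumerate(classes):
        binary = [1 if t == c else 0 for t in y_true]
        aucs[c] = roc_auc_binary(binary, [row[k] for row in prob_rows])
    return aucs, sum(aucs.values()) / len(aucs)

# Hypothetical labels and class probabilities, for illustration only.
classes = ["low", "medium", "high"]
y_true = ["low", "low", "medium", "medium", "high", "high"]
probs = [
    [0.7, 0.2, 0.1], [0.4, 0.5, 0.1],
    [0.2, 0.6, 0.2], [0.3, 0.4, 0.3],
    [0.1, 0.3, 0.6], [0.2, 0.2, 0.6],
]
aucs, auc_macro = macro_ovr_auc(y_true, probs, classes)
print(aucs, auc_macro)
```

The result depends on each probability column being paired with the correct class label, since a permuted column order changes every class-wise AUC.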
To further ensure robustness, cross-validation methods are used: K-Fold, stratified K-Fold, and leave-one-out cross-validation (LOOCV). In K-Fold CV, the dataset is divided into K subsets (folds); each fold is used once for validation while the remaining K−1 folds are used for training. Stratified K-Fold builds on this by preserving the class distribution within each fold, which is particularly useful for imbalanced datasets. LOOCV, a special case of K-Fold with K = N, trains on all but a single data point and tests on that point, repeating this for every sample; it uses all available data at the expense of computational cost. The cross-validation scores are given in Equations (28)–(30) [44].

K-Fold and stratified K-Fold cross-validation:

$$\mathrm{CV}_{K} = \frac{1}{K} \sum_{k=1}^{K} M_k$$

Leave-one-out cross-validation (LOOCV):

$$\mathrm{CV}_{\mathrm{LOO}} = \frac{1}{N} \sum_{i=1}^{N} M_i$$

Here, $M_k$ (or $M_i$) is a performance measure (such as accuracy, F1-score, or RMSE) in the $k$th fold or $i$th iteration, $K$ is the total number of folds in K-Fold and stratified K-Fold cross-validation, and $N$ is the number of samples in the dataset for LOOCV.
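The three splitting schemes can be illustrated with index generators in plain Python. This is a sketch under simplifying assumptions: no shuffling is applied, the stratified variant assigns the members of each class to folds in round-robin order, and all function names are hypothetical. LOOCV is obtained by setting k equal to the number of samples.

```python
def k_fold_indices(n_samples, k):
    """Yield (train_idx, test_idx); every sample is tested exactly once."""
    indices = list(range(n_samples))
    sizes = [n_samples // k + (1 if f < n_samples % k else 0) for f in range(k)]
    start = 0
    for size in sizes:
        test_idx = indices[start:start + size]
        train_idx = indices[:start] + indices[start + size:]
        yield train_idx, test_idx
        start += size

def stratified_k_fold_indices(labels, k):
    """Yield (train_idx, test_idx) with class members spread over the folds."""
    folds = [[] for _ in range(k)]
    by_class = {}
    for i, lab in enumerate(labels):
        by_class.setdefault(lab, []).append(i)
    for members in by_class.values():
        for pos, i in enumerate(members):
            folds[pos % k].append(i)          # round-robin assignment
    for f in range(k):
        test_idx = sorted(folds[f])
        train_idx = sorted(i for g in range(k) if g != f for i in folds[g])
        yield train_idx, test_idx

def cv_mean(splits, metric_fn):
    """Average a per-fold metric M_k over all folds."""
    scores = [metric_fn(train, test) for train, test in splits]
    return sum(scores) / len(scores)

kfold = list(k_fold_indices(10, 3))
loocv = list(k_fold_indices(5, 5))            # LOOCV: K equals N
strat = list(stratified_k_fold_indices(["a"] * 4 + ["b"] * 2, 2))
print([len(test) for _, test in kfold])
```

The `metric_fn` passed to `cv_mean` would train a model on the training indices and return its score on the test indices; it is left abstract here.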
3. Results and Discussion
This section provides a detailed analysis of the prediction performance of the proposed ML models for the UTS, YS, and HV of DED-processed low-alloy steels, using the JMatPro-simulated dataset. The focus is on the regression and classification performance of models such as MVLR, PR, MLPR, and especially XGBoost, and the results are validated with error metrics, confusion matrices, SHAP analysis, and cross-validation.
3.8. Multi-Class ROC–AUC Evaluation of Mechanical Property Classification
Receiver operating characteristic (ROC) curves are employed to evaluate the performance of the multiclass classification models used for UTS, YS, and hardness predictions. Figure 10 shows the ROC curves, which plot the true positive rate against the false positive rate, and the AUC value of each class.
For UTS (Figure 10a), the AUC values for the three classes were 0.2511 (Class 1), 0.2742 (Class 2), and 0.2816 (Class 3). These are below the 0.5 chance level, which means that the ROC analysis is not consistent with the strong performance shown by the accuracy, precision, recall, and F1-score metrics. The same trend appeared in the YS classification (Figure 10b), with AUC values of 0.2516, 0.2826, and 0.2725 for Classes 1, 2, and 3, respectively. For hardness (Figure 10c), the AUC varied between 0.2481 and 0.2716, which indicates that the ROC-based separability of the model is low despite the high classification metrics.
Despite the low AUC values, prior evaluations, including confusion matrices, SHAP and drop-column importance, and F1-score metrics, consistently indicate high predictive performance. Thus, the ROC results may not be representative in this context and should be interpreted with caution. For a reliable ROC-based assessment, probability-calibrated outputs or one-vs-rest AUC averaging would be more appropriate for future analysis.
For UTS (Figure 11a), the AUC values of the three classes were 0.25 (Low), 0.27 (Medium), and 0.28 (High). For YS (Figure 11b), the values were 0.25, 0.28, and 0.27, and for hardness (Figure 11c), they were 0.25, 0.27, and 0.26, respectively. These AUCs are all well below 0.5, the value expected for a random or non-informative classifier; values this low mean that the scores rank samples worse than chance. The ROC curves do not rise clearly above the reference diagonal, which is an additional indication that the classifier does not achieve clear class separation at the probabilistic score level alone.
The low AUCs are inconsistent with the very good performance of the ML models on other metrics such as precision, recall, and F1-score (all ≥ 0.95) and with the confusion matrices (low off-diagonal values). The discrepancy is likely due to how the predicted probabilities are calibrated or how the multi-class AUC is computed, for example, using raw scores or omitting a proper one-vs-rest decomposition. Such problems are frequently encountered in multi-class ROC analysis, especially when softmax probabilities are not well separated or class imbalance distorts the ROC space.
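One generic mechanism that produces AUC values far below 0.5 alongside high accuracy is a reversed or misaligned score, because reversing the ranking of the scores maps an AUC of a to 1 − a. The Python sketch below demonstrates this property with a rank-based AUC. It is offered as an assumption about how such values can arise in general, not as a diagnosis of the pipeline used in this study; the function name and the data are hypothetical.

```python
def rank_auc(labels, scores):
    """AUC as the probability that a positive outscores a negative (ties 0.5)."""
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos for n in neg
    )
    return wins / (len(pos) * len(neg))

# Hypothetical one-vs-rest labels and scores, for illustration only.
labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.5, 0.2, 0.1]

auc_aligned = rank_auc(labels, scores)
auc_flipped = rank_auc(labels, [-s for s in scores])  # reversed ranking
print(auc_aligned, auc_flipped)
```

Under this reading, a class-wise AUC near 0.25 would correspond to roughly 0.75 once the scores are oriented correctly, so checking the class-to-column mapping of the predicted probabilities is a natural first step before recomputing the curves.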
4. Conclusions
This study presents a comprehensive, physics-informed ML-based prediction of the mechanical properties (UTS, YS, and HV) of DED-fabricated low-alloy steels. Using a simulation-driven dataset of 4900 samples produced in JMatPro, the framework includes important thermal and compositional characteristics, including cooling rate, temperature, grain size, and alloy chemistry, to capture complex process–structure–property relationships.
Five ML models were studied: MVLR, PR, MLPR, XGBoost, and a classification model. Among them, XGBoost achieved the best R² values of 0.98 for UTS, YS, and HV and the lowest prediction errors (MAE = 15.4 MPa for UTS, 15.43 MPa for YS, and 0.556 for HV). This high predictive capability is confirmed by residual plots, error distributions, and cross-validation (K-Fold, stratified K-Fold, LOOCV), in which R² was always above 0.99.
The classification models also showed good predictive ability (macro-averaged F1-scores of 0.95 for the three mechanical properties) and classification accuracies of 94–95% on the test datasets. These models can successfully differentiate low, medium, and high-performance classes, as indicated by the confusion matrices. Pearson correlation analysis showed strong linear correlations (R = 0.98 between UTS and YS; R = 0.97 between UTS and HV; R = 0.97 between YS and HV) that reflect the physical interdependence of strength and hardness in DED-processed steels. These high coefficients also confirm the internal consistency of the dataset and show the relevance of combining multi-property predictive methods.
The explainable AI (XAI) methods, especially SHAP and the drop-column analysis, revealed temperature and cooling rate as the most influential predictors, confirming their metallurgical significance in microstructural evolution. The high inter-property correlations noted above also strengthen the case for unified multi-property modeling.
Overall, this study highlights the potential of physics-based featurization and interpretable ML algorithms for efficiently predicting mechanical properties in DED-manufactured steels. In addition to the scientific insights gained, the approach is scalable toward industrial applications such as real-time quality assurance, reduced material qualification time, and intelligent AM process optimization in future Industry 5.0-ready AM systems.