4.1. EE Consumption Models
The R²-values for all variable batches and data cleaning strategies are shown in Figure 8. For the RF model framework, consistency across the variable batches can be observed for all four cleaning strategies. This consistency means that, for the RF model framework, the deciding factor for the performance is the variable batch and not the hyper-parameters of the statistical model framework. This is a desired outcome, since the performance of a model should depend mainly on factors stemming from the application domain and not on the optimization of parameters in an abstract framework.
Variable batches (VB) 1 and 2 produce the models with the lowest R²-values. These are also the VBs that do not use scrap representation variables. This provides evidence that scrap types are relevant factors when determining the EE consumption of the EAF and that scrap should not be treated collectively using only the total weight of all charged scrap.
The best-performing models are those from VB 5 and 6, which use categories from the visual scrap representation. This provides evidence that, among the representations tested, a categorization based on scrap shape is the most suitable when creating a statistical model predicting the EE in the steel plant under study.
In terms of performance, the models using the visual scrap representation are followed by the models using the plant scrap representation (VB 3 and 4) and the density scrap representation (VB 7 and 8), in that order. Essentially, this indicates that both a too fine and a too coarse representation of the charged scrap are sub-optimal for a statistical model predicting the EE. The steel plant scrap representation has numerous scrap types that contain the same scrap with respect to shape and dimension. The only difference is the varying alloying content of Ni, Cr, and Mo, which does not significantly affect the melting time.
The coarse representation is based on the apparent density of the scrap, which does not take the shape of the scrap into account. Scrap shape is closely related to the area-to-volume ratio, which is the strongest factor determining the melting time of scrap in the EAF, since the stirring in the EAF is low during a large part of the melting phase. For example, HM has the same apparent density as skulls, which consist of bulky mixtures of solid slag and metal that take a long time to melt. Likewise, thin and thick plate have similar densities but different area-to-volume ratios.
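The contrast between thin and thick plate can be illustrated with a short calculation. The following is a minimal sketch, assuming idealized rectangular plates; the function name and the dimensions are hypothetical and are not taken from the plant data. It only shows that two pieces of the same material can differ strongly in their surface-area-to-volume ratio.

```python
def area_to_volume_ratio(length_m: float, width_m: float, thickness_m: float) -> float:
    """Surface-area-to-volume ratio (1/m) of an idealized rectangular plate."""
    area = 2.0 * (length_m * width_m + length_m * thickness_m + width_m * thickness_m)
    volume = length_m * width_m * thickness_m
    return area / volume


# Hypothetical dimensions, chosen only for illustration.
thin_plate = area_to_volume_ratio(1.0, 1.0, 0.005)   # 5 mm thick plate
thick_plate = area_to_volume_ratio(1.0, 1.0, 0.050)  # 50 mm thick plate

print(f"thin plate:  {thin_plate:.0f} 1/m")
print(f"thick plate: {thick_plate:.0f} 1/m")
```

Under these assumed dimensions, the thin plate exposes roughly nine times more surface per unit volume than the thick plate, which is the kind of difference a density-based representation cannot capture.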
The consistency among the four sets of VB between the four cleaning strategies for both statistical model frameworks further strengthens the evidence regarding the effects of the chosen scrap representations on the predictive performance of the models.
The performance and meta data of the best models from each cleaning strategy and model framework are shown in Table 8. In general, the ANN models perform similarly to or better than the RF models with regard to the R²-values. However, the ANN models always have a smaller mean error and standard deviation of error.
With regard to the modeling meta data, the best ANN and RF models used the same VB on the data from each cleaning strategy. This is also a desired outcome, based on the same reasoning as before regarding the priority of domain-specific factors over abstract model-based factors.
The share of data points removed by cleaning was 25.8 percentage points higher for the Expert cleaning strategy than for the Domain-specific cleaning strategy. The R²-values increased only slightly using the Expert cleaning strategy: by 0.035 and 0.039 for the RF and ANN models, respectively. Using the statistical cleaning method Tukey’s fences, the share of removed data was 11 and 13.3 percentage points higher than for the Domain-specific cleaning strategy. In contrast to the Expert cleaning strategy, the R²-values were worse for the models involving Tukey’s fences. The Tukey strategy reduced the R²-values by 0.049 and 0.077 for the RF and ANN models, respectively. For the Tukey-Domain-specific strategy, the reductions in R² were 0.053 and 0.078, respectively. These results lead to two important findings. First, the use of statistical cleaning heuristics results in a model performance that is inferior to that of models using data cleaned with domain-specific knowledge, i.e., the Expert and Domain-specific cleaning strategies. Second, using data cleaned by an Expert yields the models with the best performance, which highlights the importance of knowledge about the specific EAF operations one intends to model. However, the large data loss using the Expert cleaning strategy (35.8%) compared with the Domain-specific cleaning strategy (10%) tilts the choice of cleaning strategy in favor of the latter, since the data loss percentage directly determines the percentage of future heats on which the model can predict. This finding is closely tied to the practical usefulness of the model.
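For readers unfamiliar with Tukey’s fences, the heuristic can be sketched as follows. This is a minimal illustration, not the cleaning code used in the study: the multiplier k = 1.5 is the conventional default and is an assumption here, the function name is hypothetical, and the sample values are invented.

```python
import numpy as np


def tukey_fences_mask(values, k=1.5):
    """Boolean mask that is True for values lying inside Tukey's fences."""
    values = np.asarray(values, dtype=float)
    q1, q3 = np.percentile(values, [25, 75])
    iqr = q3 - q1  # interquartile range
    lower, upper = q1 - k * iqr, q3 + k * iqr
    return (values >= lower) & (values <= upper)


# Invented values of a single input variable over ten heats.
x = np.array([10.0, 11.0, 12.0, 12.5, 13.0, 13.5, 14.0, 15.0, 16.0, 40.0])
keep = tukey_fences_mask(x)
data_loss_percent = 100.0 * (1.0 - keep.mean())

print(x[keep])
print(f"data loss: {data_loss_percent:.1f}%")
```

The heuristic removes points purely on the basis of their position in the empirical distribution, without any knowledge of whether the corresponding heat was operated abnormally, which is the distinction drawn above between statistical and domain-specific cleaning.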
4.2. Analysis of the Selected Model
The model selected for the analysis is the RF model with variable batch 6 and Domain-specific cleaning. This model was selected because of its relatively high R²-value (0.457) while still keeping 90% of the data, compared to the model using the Expert cleaning strategy, which provided the highest R²-value (0.490) but kept only 64.2% of the data. Hence, the chosen model can be used on 90% of future heats, given that the test data and training data come from the same distribution.
The SHAP main interaction effects of the variables Plate1, Internal1, HM, and Shredded on the EE consumption, together with the charged weight distributions of these scrap categories for each of the charge types A, B, and C, can be seen in Figure 9.
The EE contribution by Plate1 is lower for charge type A than for charge types B and C. However, charge type B can have both a positive and a negative EE contribution, since the densest part of its distribution lies in the steepest part of the EE interaction change for Plate1. Hence, it is hard to conclude whether charge type B gets a contribution similar to that of charge type A or charge type C. Charge type C commonly uses zero, or next to zero, Plate1. Hence, there is a large contribution to EE by Plate1 for this charge type. The EE contribution by Internal1 is slightly lower for charge type A than for charge types B and C. The EE contribution by HM is higher for charge type A than for charge types B and C, the latter two of which are charged similarly across all heats. For the Shredded scrap category, all charge types receive similar contributions to EE.
Based on these highlights, it is expected that charge type A has the lowest EE consumption, that charge type C has the highest EE consumption, and that the EE consumption of charge type B is in-between those of type A and type C. The following EE consumption, relative to the EE consumption by charge type A, was obtained based on the data from the heats used to create
Figure 9: charge type B
and charge type C
.
On the other hand, charge type B requires slightly less EE on average than charge type A (1.00). This is likely because Plate1 is closer to a negative contribution to EE for charge type B than was previously anticipated. Furthermore, the analysis focused on only 4 of the 12 scrap categories, out of which 3 are of particular interest to the selected charge types. In addition, the model uses a total of 21 input variables to predict the EE consumption of any given heat. Nevertheless, given these caveats, the analysis is in line with what could be expected from the analyzed model as well as from expert domain knowledge.
The main utility of the SHAP main interaction values is that they clarify how specific values, governed by the input variable distribution, contribute to the EE consumption prediction. The SHAP main interaction values only show the univariate relationship between an input variable and the EE consumption prediction. It is possible to use the SHAP interaction effects between the input variables, as explained in Section 3.5.1. However, this adds 20 SHAP plots for each input variable, giving a total of 441 SHAP plots for a complete analysis of the selected model. Although experience and knowledge about the specific EAF can guide the selection of SHAP interaction plots, a further analysis of the SHAP plots is best left as a future point of study. The above analysis shows what can be done with the available tools. Although interesting, an exhaustive analysis is out of the scope of the present paper.
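The figure of 441 follows from the 21 input variables of the selected model: 21 main effects plus 20 interaction effects for each of the 21 variables, i.e., 21 + 21 × 20 = 21² = 441. The short sketch below merely enumerates these combinations; the variable names are illustrative and no SHAP values are computed.

```python
from itertools import product

n_inputs = 21  # number of input variables of the selected model (from the text)

# Every ordered pair (i, j) corresponds to one plot: i == j is a main
# effect, i != j is an interaction effect of variable i with variable j.
pairs = list(product(range(n_inputs), repeat=2))
main_effects = [p for p in pairs if p[0] == p[1]]
interactions = [p for p in pairs if p[0] != p[1]]

print(len(main_effects), len(interactions), len(pairs))  # prints: 21 420 441
```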
The SHAP main interaction effects on EE by each scrap category of the selected model can be observed in Figure 10. Thin plate, i.e., Plate1, has been confirmed by the steel plant engineers to reduce the EE consumption. This is evident since a steep drop can be observed. The reduction in EE by Plate1 eventually flattens out, and the EE contribution increases when the amount of Plate1 approaches the upper limit of the furnace capacity. Internal1 contains heavy scrap with an apparent density of over 1.4 ton/m³. Heavy scrap takes a longer time to melt, which contributes to a steadily rising EE with an increased amount of Internal1 scrap. This has also been confirmed by the steel plant engineers, who refer to the use of Internal1 as the reciprocal of Plate1 in the steel plant charging strategies. One could observe a slight decrease in the EE contribution with an increasing amount of Plate2 and a slight increase in the EE contribution for Internal2. However, these scrap categories constitute less than 1% of the total charged weight in the studied heats. Hence, it is difficult to draw any clear conclusion on their contribution to the EE. Shredded scrap is charged based on operating practices rather than for specific charging strategies, which result in steel with a low amount of tramp elements. The EE contribution decreases with an increased amount of Shredded scrap from the nominal amount used. This was not confirmed by the process engineers. The decreasing EE contribution could be a model artifact, or it could arise because the shredded scrap does contribute to a decreased EE consumption in the process. The latter could be the case since shredded scrap is easily melted due to its high surface-area-to-volume ratio.
Incineration scrap is charged in low amounts, corresponding to only approximately 1.5% of the total charging weight on average for all heats and approximately 5% for the heats using charge types with lower requirements on impurity tramp elements. The increase in EE consumption is likely due to the melting requirements of the scrap weight rather than due to the surface-area-to-volume ratio. Skulls also require higher EE according to the SHAP main interaction effect. This was confirmed by the steel plant engineers, who reported that skulls are difficult to melt. Skulls are large concrete-like pieces of slag and metal mixture.
The SHAP main interaction values for the base variables can be seen in Figure 11. Here, the focus will be on the variables whose EE contribution is counter-intuitive from the standpoint of a practitioner in physico-chemical modeling. Specifically, these are Burner O₂, Carbon Injection, Burner oil, and Lance O₂. According to the steel plant engineers, Burner oil is only effective up to a certain amount, which is when the burner oxygen is used in tandem with burner oil. This agrees well with the observation from Figure 11. The burners are used at their maximum capacity when melting scrap. However, the burners still need to be active for the remainder of the heat to prevent them from getting clogged by slag and scrap. Thus, Burner oil is closely related to TTT, which is the reason Burner oil in higher amounts contributes positively to EE. Carbon Injection, which would be expected to contribute more heat through the carbon boil, instead contributes positively to EE. This was confirmed by the steel plant engineers to be related to the continuous injection of carbon fines throughout the heat. As soon as liquid steel is present, carbon fines are injected to facilitate foaming slag. Similar to Burner oil, Carbon Injection is also closely related to TTT. In the steel plant under study, Lance O₂ is only used to clear the slag door to enable sampling of the steel melt temperature and composition. Therefore, Lance O₂ does not have a consistent contribution to the EE.
The positive relationship between Carbon Injection and EE can also be observed from its Pearson correlation coefficient in Table 9. For both the Pearson correlation and dCor, Carbon Injection has the highest value with respect to the EE. In addition, the SHAP feature importance (Figure 12) ranks Carbon Injection as almost twice as important as the Total weight when the model predicts the EE consumption.
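The difference between the two correlation measures can be made concrete with a small sketch. The code below is an illustration on synthetic data, not the computation behind Table 9: it uses NumPy's `corrcoef` for the Pearson coefficient and a common biased sample estimator of the distance correlation for one-dimensional samples. The function names are hypothetical.

```python
import numpy as np


def pearson(x, y):
    """Pearson correlation coefficient of two 1-D samples."""
    return float(np.corrcoef(x, y)[0, 1])


def _centered_distances(v):
    """Double-centered pairwise distance matrix of a 1-D sample."""
    v = np.asarray(v, dtype=float)
    d = np.abs(v[:, None] - v[None, :])
    return d - d.mean(axis=0, keepdims=True) - d.mean(axis=1, keepdims=True) + d.mean()


def distance_correlation(x, y):
    """Sample distance correlation (dCor) of two 1-D samples."""
    a, b = _centered_distances(x), _centered_distances(y)
    dcov2_xy = (a * b).mean()
    dvar2_x, dvar2_y = (a * a).mean(), (b * b).mean()
    denom = np.sqrt(dvar2_x * dvar2_y)
    if denom <= 0.0:
        return 0.0  # at least one sample is constant
    return float(np.sqrt(max(dcov2_xy, 0.0) / denom))


# Synthetic data: a symmetric, purely non-linear dependence.
x = np.linspace(-1.0, 1.0, 201)
y = x ** 2

print(f"Pearson: {pearson(x, y):.3f}")
print(f"dCor:    {distance_correlation(x, y):.3f}")
```

On this synthetic example the Pearson coefficient is essentially zero while dCor is clearly positive, which illustrates why reporting both measures is informative: Pearson captures only linear dependence, whereas dCor also responds to non-linear dependence.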
The sometimes counter-intuitive relations between the input variables and the EE consumption prediction emphasize that a firm understanding of the physico-chemical relations governing the EAF, combined with process experience, is not sufficient on its own. It is also important to understand the relationship between the governing factors and the EE consumption of the specific EAF that one intends to model.
It is important to observe that the total weight and TTT are among the three most important variables for the model when predicting the EE. Both variables are also correctly considered by the model with respect to what is known from process metallurgical experience. This agrees well with the results from the previously reported SHAP analysis of a statistical model, created by the authors of the present study, predicting the EE of an EAF [10].