1. Introduction
The catalytic hydrogenation of nitrile butadiene rubber (NBR) into hydrogenated NBR (HNBR) enhances the polymer’s resilience, making it ideal for demanding environments in the automotive and industrial sectors. Despite the robust demand for HNBR, predominantly driven by the automotive industry, current semi-batch production methods relying on Ru- and Os-based catalysts are hindered by significant limitations in efficiency and scalability. These limitations underscore the urgent need for innovative approaches to HNBR production [1].
A promising solution lies in the adoption of continuous processing techniques, specifically the use of static mixer (SM) reactors with an open-blade structure. Recent studies, such as those by Madhuranthakam et al. [2], have demonstrated that SM reactors can achieve high conversion rates for NBR, exceeding 97% in continuous production settings without the auxiliary costs and extended production times associated with batch processes. Static mixers facilitate plug-flow behavior along with efficient mass and heat transfer, making them ideal for the chemical industry’s shift towards continuous processes.
The cornerstone of optimizing this continuous process is the development and validation of a mechanistic model that simulates the SM reactor’s performance. This model provides a detailed representation of the physical and chemical dynamics within the reactor, offering critical insights that are essential for refining the reactor design, enhancing operational efficiency, and ensuring consistent, high-quality HNBR production. The comprehensive understanding gleaned from this model not only supports operational improvements but also paves the way for scalable, efficient production methodologies in the HNBR industry [3].
One key advantage of a mechanistic model is its capacity to elucidate the intricate kinetics and thermodynamics governing the hydrogenation of NBR within a static mixer. By capturing the detailed mechanisms of chemical reactions, mass transfer, and heat transfer, the model enables researchers and engineers to analyze the complex interactions within a mixer. This comprehension facilitates the identification of critical parameters influencing reaction kinetics and the effects of operating conditions on product yield and quality. Moreover, the mechanistic model serves as a valuable predictive tool, allowing for the simulation of various scenarios and parameter variations without requiring extensive experimental trials. This predictive capability is instrumental in optimizing static mixer design, operational conditions, and catalyst selection [4]. Meanwhile, researchers can explore different strategies for maximizing conversion rates, minimizing by-products, and achieving desired product specifications. In addition, validating the mechanistic model with experimental results enhances the credibility and reliability of its predictive capabilities. It ensures that the model accurately represents the real-world behavior of the static mixer during the hydrogenation of NBR. The alignment between the model’s predictions and the experimental data verifies the robustness of the model and instills confidence in its ability to guide future experiments and process improvements [5].
On the other hand, although mechanistic models of SM reactor performance can provide valuable insights, they are not without shortcomings. For instance, these models often rely on simplifications and assumptions to make the system tractable. These simplifications may not fully capture the complexity of real-world phenomena, leading to a gap between model predictions and actual outcomes. Moreover, mechanistic models depend on accurate input parameters, and their predictions can be sensitive to parameter variations. Obtaining precise values for all parameters, especially in complex chemical processes, can be challenging and may introduce uncertainties. Additionally, many mechanistic models assume spatial and temporal homogeneity within a reactor, neglecting gradients and variations. In reality, the dynamics within a static mixer can be non-uniform, which impacts the accuracy of the model and imposes further uncertainty. Furthermore, detailed mechanistic models can be computationally intensive, requiring significant computational resources and simulation times, which may limit their practical use, especially for real-time process-control applications. Another issue is that mechanistic models developed for lab-scale reactors may not easily translate to larger-scale industrial reactors; the scale-up process involves additional complexities and considerations that the original model may not capture. Finally, mechanistic models may be developed under specific experimental conditions, limiting their generalizability to different operating environments, feedstock variations, or reactor geometries. Consequently, these models may be unable to handle the inherent operational uncertainties [6].
Addressing the challenges associated with mechanistic models of static mixer (SM) reactor performance involves a combination of careful considerations, methodologies, and strategies. For example, gathering more comprehensive experimental data can improve the accuracy of model parameterization and validation, since well-designed experiments provide valuable insights into reaction kinetics, fluid dynamics, and heat transfer within a static mixer. Developing a more detailed and accurate representation of the reaction mechanism is another alternative. In addition, advanced computational techniques, such as high-performance computing or machine learning algorithms, can handle the computational intensity associated with detailed mechanistic models, enhancing model efficiency without compromising accuracy. The implementation of nonlinear optimization algorithms and parameter estimation techniques to improve the accuracy of the model’s parameters is another remedy; sensitivity analyses can then identify critical parameters and quantify their impacts on the model’s predictions. Moreover, multi-scale models that account for variations in spatial and temporal scales allow for a more accurate representation of the dynamic processes occurring at different levels within a static mixer. Another option is to establish robust validation protocols by comparing model predictions with a diverse set of experimental data, which helps ensure that the model is applicable across a range of conditions and configurations. Furthermore, the model’s predictive capabilities under varying conditions can be improved by utilizing real-time data acquisition systems to continuously update and validate the mechanistic model during reactor operation.
Addressing scale-up challenges by incorporating scaling factors and considering the differences in fluid dynamics, heat transfer, and reaction kinetics between lab-scale and industrial-scale reactors is another solution. Finally, implementing uncertainty quantification techniques to estimate and manage the uncertainties associated with model parameters and predictions provides a measure of confidence in the model’s results [7]. By combining these strategies, researchers and engineers can improve the reliability and applicability of mechanistic models of SM reactor performance, making them valuable tools for optimizing processes in chemical engineering applications.
Among the aforementioned approaches, machine learning (ML)-based models can be powerful tools in overcoming the challenges associated with the development and optimization of mechanistic models for static mixer (SM) reactor performance. ML techniques, such as regression or neural networks, can be employed to calibrate mechanistic models by learning the relationships between the input parameters and the experimental outcomes. This can help improve the accuracy of the parameter estimation and reduce the reliance on hand-tuned parameters. Moreover, ML-based models can act as surrogate models for computationally expensive mechanistic models. By training ML models on a subset of data generated by the mechanistic model, they can provide rapid predictions, enabling quicker simulations and facilitating optimization tasks. In addition, ML models, especially regression models or neural networks, can learn complex relationships within reaction kinetics. This is particularly useful when dealing with intricate chemical reaction mechanisms, allowing the model to predict reaction rates based on input conditions. Also, ML algorithms can perform sensitivity analyses to identify critical parameters that significantly influence reactor performance. Understanding these sensitivities can guide researchers in refining mechanistic models and focusing on key aspects during experimental design. Further, ML-based optimization algorithms can be applied to explore the parameter spaces efficiently and identify optimal operating conditions for static mixers. This is particularly beneficial for achieving desired reaction yields while considering various constraints. It should be noted that combining mechanistic models with ML-based models in a hybrid approach can leverage the strengths of both methodologies. This fusion can enhance the accuracy of predictions, especially when dealing with complex and non-linear processes.
Finally, ML models trained on data from one type of static mixer can be adapted to work with similar mixers. This “transfer learning” approach can save computational resources and time when extending models to new reactor configurations [8]. Machine learning thus offers a suite of tools that complement mechanistic modeling efforts, providing enhanced predictive capabilities, efficiency gains, and insights into the underlying processes within static mixers. The integration of ML techniques can significantly contribute to overcoming the limitations associated with traditional mechanistic modeling approaches.
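As a concrete illustration of the surrogate idea discussed above, the sketch below fits a cheap polynomial to samples of an "expensive" model; the `mechanistic_model` function and all numbers are hypothetical stand-ins, not the actual SM reactor simulator:

```python
import numpy as np

# Hypothetical stand-in for an expensive mechanistic simulation
# (the real SM reactor model integrates coupled PDEs instead).
def mechanistic_model(pe):
    return 1.0 - np.exp(-0.05 * pe)  # toy conversion-vs-Peclet curve

# Sample the expensive model on a coarse grid of operating points...
pe_train = np.linspace(1.0, 100.0, 25)
y_train = mechanistic_model(pe_train)

# ...and fit a cheap surrogate (here, a quintic polynomial) to the samples.
surrogate = np.poly1d(np.polyfit(pe_train, y_train, deg=5))

# The surrogate now gives near-instant predictions at unseen points.
pe_new = 37.5
print(abs(surrogate(pe_new) - mechanistic_model(pe_new)) < 0.05)  # True
```

In practice the surrogate would be a regression model or neural network trained on many mechanistic-model runs, as described in the text; the polynomial merely keeps the sketch self-contained.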
It should be noted that in the context of sustainable materials development, the potential of biodegradable polymers, particularly polybutylene adipate-co-terephthalate (PBAT), has garnered significant attention due to their favorable mechanical properties and biodegradability. Recent advancements in the field, as illustrated in [9], have explored the application of PBAT in direct pellet 3D printing processes, demonstrating its efficacy in creating environmentally friendly plastic films. This aligns with our research on the continuous hydrogenation of nitrile butadiene rubber using a static mixer (SM) reactor, where we leveraged the environmental benefits of PBAT within the chemical engineering domain. By utilizing innovative processes that reduce reliance on non-degradable polymers, our approach not only enhanced the SM reactor’s performance but also contributed to the broader goal of reducing plastic pollution, a critical aspect emphasized in the current literature.
On the other hand, in parallel to the development of sophisticated computational models, recent advancements in material fabrication techniques have significantly influenced industrial applications. A notable example is the work presented in [10], which explored the utilization of direct pellet 3D printing techniques to produce super stretchable propylene-based elastomers with exceptional elongation properties, exceeding 4000%. This breakthrough demonstrates the potential of additive manufacturing to overcome traditional constraints in material processing, such as waste reduction and enhanced mechanical properties. Similar to these advancements, our research employs machine learning to refine the processing conditions in SM reactors, drawing on the precision and adaptability of modern manufacturing to enhance the hydrogenation process. Such comparisons underscore the synergistic potential of combining advanced material processing with computational modeling to address complex industrial challenges.
Based on the provided explanations, it can be asserted that the development of suitable machine learning (ML)-based models for static mixer (SM) reactors proves highly beneficial for various purposes. Therefore, this paper proposes an artificial neural network (ANN)-based model that characterizes the dynamics of the SM reactor, explicitly considering uncertainties in several key internal parameters. The results obtained highlight the effectiveness of utilizing a reliable ANN model, trained successfully using a comprehensive dataset. The data collection process involves a judicious data sampling technique covering the entire range of uncertainties, achieved through the application of a Monte Carlo sampling method. Consequently, a single ML-based model emerges that proficiently captures the reactor’s dynamics in the presence of uncertainty. This approach stands in contrast to the traditional use of multiple mechanistic models, emphasizing the advantages of employing a well-trained ANN model for dynamic representation and prediction within the SM reactor system.
3. Mechanistic Model of SM Reactor
To establish a dependable machine learning (ML)-based model of an SM reactor, the first crucial step involves gathering essential training and testing data. In this paper, we utilize the mechanistic model previously introduced in [2] as the foundational model. We then simulate diverse operating scenarios (i.e., corresponding to different values of the key parameters, to reflect the inherent uncertainty) and record the resulting data. A brief review of the fundamentals of the mechanistic model is presented in this section.
In general, understanding the residence time distribution (RTD) in static mixers (SMs) is crucial when employing a mixer as a chemical reactor. The RTD within an SM featuring an open-blade internal structure under laminar flow conditions is articulated by applying Danckwerts’s axial dispersion model. Through the inclusion of the residence time distribution (RTD) and considering the kinetics of the hydrogenation reaction, as well as the mass transfer of hydrogen gas into the unsaturated polymer solution, the ensuing model (represented by Equations (1)–(3)) is formulated to describe the conversion, hydrogen concentration within the polymer phase, and catalyst concentration, all presented in a dimensionless form.
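For the open–open axial dispersion model just mentioned, the dimensionless RTD has the standard closed form E(θ) = √(Pe/4πθ)·exp(−Pe(1−θ)²/4θ) (Levenspiel's result for an open–open vessel). The short sketch below evaluates it for an illustrative Peclet number and checks numerically that it integrates to unity:

```python
import numpy as np

def rtd_open_open(theta, pe):
    """Dimensionless RTD of the open-open axial dispersion model:
    E(theta) = sqrt(Pe/(4*pi*theta)) * exp(-Pe*(1-theta)**2/(4*theta))."""
    return np.sqrt(pe / (4.0 * np.pi * theta)) * np.exp(
        -pe * (1.0 - theta) ** 2 / (4.0 * theta))

# Evaluate on a fine grid for an illustrative Peclet number.
theta = np.linspace(0.01, 4.0, 4000)
e_curve = rtd_open_open(theta, pe=20.0)

# Any RTD must integrate to unity; trapezoidal quadrature confirms it here.
area = float(np.sum(0.5 * (e_curve[1:] + e_curve[:-1]) * np.diff(theta)))
print(abs(area - 1.0) < 0.02)  # True
```

Larger Pe values sharpen the curve around θ = 1 (approaching plug flow), while smaller values broaden it toward mixed-flow behavior.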
Meanwhile, the open–open (Danckwerts) boundary conditions are as follows: the initial condition at θ = 0 is υ(k, 0) = υ0; the first (inlet) boundary condition is υ − (1/Pe)(∂υ/∂k) = υin at k = 0; and the second (outlet) boundary condition is ∂υ/∂k = 0 at k = 1, in which υ stands for each of x, h, and ζ.
In the given equations, Pe represents the Peclet number, defined by Pe = UL/Da, where U is the flow velocity, L is a characteristic length, and Da is the axial dispersion coefficient. The variables include x for conversion, h for normalized hydrogen concentration, ζ for normalized catalyst concentration, θ for dimensionless time (t/τ), and k for dimensionless length (z/L). The essential parameters needed to solve the model are Pe, R, q, and θτ. The Peclet number for the Kenics KMX SM, involving a similar experimental system with hydrogen gas and polymer solution, has been modeled in [27] as a function of the liquid-side and gas-side hydraulic Reynolds numbers. The empirical model is given by the following:
in which β1, β2, and β3 are constants, and ReL−h and ReG−h are the Reynolds numbers defined based on the hydraulic mean diameter (DH) of the SM reactor. More details about this mechanistic model are available in [2].
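To make the role of this correlation concrete, the sketch below assumes a simple power-law form Pe = β1·ReL−h^β2·ReG−h^β3; both this functional form and the placeholder constants are illustrative only, since the fitted expression and constants are those reported in [27]:

```python
def peclet_estimate(re_l_h, re_g_h, beta1=1.0, beta2=0.5, beta3=-0.1):
    """Illustrative power-law correlation Pe = b1 * Re_Lh**b2 * Re_Gh**b3.
    The beta values are hypothetical placeholders, not the constants of [27]."""
    return beta1 * re_l_h ** beta2 * re_g_h ** beta3

# Example: liquid-side Re = 400, gas-side Re = 100 (hydraulic-diameter based).
print(round(peclet_estimate(400.0, 100.0), 2))  # → 12.62
```

The practical point is that, once the βi are fitted, Pe (and hence the dispersion term of the model) can be evaluated directly from measurable flow conditions.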
It should be noted that to use a mechanistic model as a base and then train a machine learning (ML)-based model on top of it, a few key steps are required. This process is often referred to as model enhancement or model augmentation, where machine learning is used to capture complexity or correct for discrepancies in the mechanistic model. To this end, the following steps are necessary:
- Understanding the Mechanistic Model
Having a clear understanding of the underlying mechanistic model, its equations, and the assumptions made is necessary. For this reason, in the section above, we thoroughly introduced the mechanistic model of the SM reactor.
- Data Generation
In the next step, we can generate a dataset using the mechanistic model. We can run simulations over a range of input conditions to produce data points that include both input parameters and corresponding outputs predicted by the mechanistic model. This dataset will be used for training and testing of the ML-based model.
- Identification of Model Discrepancies
Another important step is to identify and analyze the discrepancies or uncertainties in the mechanistic model. Areas where the model deviates from experimental data or where certain complexities are not adequately captured can become potential targets for ML.
- Selecting an ML Algorithm
The next step is to choose an ML algorithm suitable for the problem. For instance, regression algorithms, neural networks, or ensemble methods are common choices. The selection depends on the nature of the problem and the dataset.
- Training the ML Model
Now, we can train the ML model using the dataset generated from the mechanistic model. The ML model learns the relationships between the input parameters and the mechanistic model outputs, capturing any nonlinearities or discrepancies present in the data.
- Testing the ML Model
The final step is to test the trained ML model using a separate dataset not used during training, ideally, experimental data. Generally, metrics such as mean squared error, R-squared, or others can be used for evaluation.
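The evaluation metrics named in the last step are straightforward to compute; below is a minimal sketch with illustrative numbers (the reference and predicted values are made up for the example):

```python
import numpy as np

def mse(y_true, y_pred):
    """Mean squared error between reference and predicted values."""
    return float(np.mean((y_true - y_pred) ** 2))

def r_squared(y_true, y_pred):
    """Coefficient of determination (R-squared)."""
    ss_res = float(np.sum((y_true - y_pred) ** 2))
    ss_tot = float(np.sum((y_true - np.mean(y_true)) ** 2))
    return 1.0 - ss_res / ss_tot

# Illustrative degrees of hydrogenation (%) vs. a model's predictions.
y_ref = np.array([46.0, 62.0, 78.0, 96.0, 99.0])
y_hat = np.array([45.0, 63.0, 77.0, 96.0, 99.0])
print(round(mse(y_ref, y_hat), 2), round(r_squared(y_ref, y_hat), 3))  # 0.6 0.999
```

An R-squared close to one on held-out (ideally experimental) data indicates that the ML model reproduces the mechanistic model's behavior well.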
6. Proposed Methodology
When dealing with uncertainty in the parameters of a mechanistic model and aiming to train a reliable ANN model, we need to follow a probabilistic approach. The suggested steps in this regard are as follows:
- Uncertainty Identification
In the first step, we need to clearly define the parameters in the mechanistic model that have associated uncertainty. In addition, identification of the nature of this uncertainty—whether it is due to measurement errors, variability in material properties, or other sources—is also important.
- Probabilistic Sampling
Next, we need to sample the uncertain parameters probabilistically from their probability distributions. If the nature of the uncertainty is known, we can use statistical methods or experimental data to define the probability distributions for these parameters.
- Dataset Generation
Now, for each set of sampled parameters, we can run simulations using our available mechanistic model to generate the corresponding input-output pairs. Each record in the resulting dataset represents a specific combination of uncertain parameters and the corresponding mechanistic model prediction.
- Monte Carlo Simulation
In this stage, we need to employ a Monte Carlo simulation to generate a large number of samples by repeatedly sampling the uncertain parameters and running simulations. This approach accounts for the uncertainty in the parameter values and provides a diverse dataset for training the ANN.
- Dataset Splitting
At this time, we should split the generated dataset into training and test sets, ensuring that each set captures the variability in the uncertain parameters. A common splitting ratio can be 80% for training and 20% for testing.
- Scaling
After data gathering, the next step is to normalize or standardize the input/output variables to ensure that they are on a similar scale. This step is crucial for the convergence and performance of the neural network.
- ANN Architecture and Training
In this stage, we can define the architecture of the ANN, considering the uncertain parameters as input features. Thus, we can start training the ANN using the generated dataset, where the inputs are the uncertain parameters and the outputs are the corresponding mechanistic model predictions.
- Evaluation
Now, it is time to evaluate the performance of the trained ANN on the test dataset while assessing its ability to generalize across different instances of uncertain parameters.
- Iterative Improvement
Based on the evaluation results (and if needed), we may iteratively improve the ANN model by adjusting hyper-parameters, architecture, or other aspects. Here, the target is to find a model that provides reliable predictions under various scenarios of parameter uncertainty.
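The sampling, simulation, splitting, and scaling steps above can be sketched end to end as follows; the Gaussian statistics and the `mechanistic_model` stub are illustrative placeholders, not the values or simulator of the actual SM reactor study:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical Gaussian descriptions of the uncertain parameters
# (means/std-devs are illustrative, not the values of Table 1).
param_stats = {"Pe": (20.0, 2.0), "R": (1.0, 0.1),
               "q": (0.5, 0.05), "theta_tau": (1.0, 0.1)}

def mechanistic_model(pe, r, q, theta_tau):
    # Stand-in for the full SM reactor simulation: returns a toy
    # "degree of hydrogenation" so the pipeline runs end to end.
    return 100.0 * (1.0 - np.exp(-q * theta_tau * r * pe / 10.0))

# Monte Carlo sampling: draw each uncertain parameter from its
# distribution and record the corresponding model output.
n_samples = 1000
X = np.column_stack([rng.normal(mu, sd, n_samples)
                     for mu, sd in param_stats.values()])
y = np.array([mechanistic_model(*row) for row in X])

# 80/20 train/test split, then min-max scaling fitted on the training set only.
n_train = int(0.8 * n_samples)
X_train, X_test = X[:n_train], X[n_train:]
lo, hi = X_train.min(axis=0), X_train.max(axis=0)
X_train_s = (X_train - lo) / (hi - lo)
X_test_s = (X_test - lo) / (hi - lo)

print(X_train_s.shape, X_test_s.shape, y.shape)  # (800, 4) (200, 4) (1000,)
```

Fitting the scaling on the training set alone, and only applying it to the test set, avoids information leaking from the held-out data into the model.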
It should be noted that by adopting a probabilistic approach and incorporating uncertainty into the dataset generation process, we are able to train an ANN model that accounts for parameter uncertainties in the underlying mechanistic model. This approach enhances the reliability of the ANN predictions and provides valuable insights into the robustness of the model under varying conditions.
7. Simulation Results and Discussion
As we discussed earlier when introducing the mechanistic model of the SM reactor, Pe, R, q, and θτ are considered the key parameters of the model. To demonstrate how our methodology can address the presence of uncertainty in these parameters, we assume that all of them follow a Gaussian probability distribution function with specified means and standard deviations. Table 1 presents the selected statistical characteristics of these parameters.
Moreover, Figure 2 illustrates the general structure of the model we are developing. This structure outlines the information the model receives and its output. Specifically, as shown in Figure 2, our primary objective is to propose a model capable of predicting the degree of hydrogenation at each sampling time (input #6) and for each element (input #5) when provided with the four other inputs (Pe, R, q, and θτ) that impact the output, which is the degree of hydrogenation.
We also conduct a detailed investigation into how variations in the crucial input parameters (Pe, R, q, and θτ) influence the degree of hydrogenation across different stages of the process, as represented by the hydrogenation in elements 6, 12, 18, and 24. Our analysis reveals significant insights into the sensitivity of the hydrogenation process to changes in the input parameters at different stages. Specifically, we observe that the degree of hydrogenation exhibits greater variability in the initial elements compared to the later ones. For instance, at element #6, the degree of hydrogenation ranges from 46% to 78% upon reaching steady-state conditions after 30 samples. In contrast, at element #24, which represents a later stage in the process, the hydrogenation degree ranges more narrowly between 96% and 99% under similar conditions (Figure 3).
This pattern indicates that as the reactants progress through the elements, the process becomes less sensitive to fluctuations in the input parameters, likely due to approaching chemical equilibrium or saturation effects. Such findings not only highlight the complex interplay between the process parameters and their impacts on the outcome at various stages but also underscore the importance of considering element-specific dynamics in our machine learning models to enhance prediction accuracy and process optimization.
Therefore, based on the explanations provided in the previous section, the first step involves generating an appropriate dataset that accurately represents the dynamics of the hydrogenation process. To create this dataset for training the artificial neural network (ANN) model, various values for Pe, R, q, and θτ need to be set, utilizing suitable Gaussian probability distribution functions. Subsequently, simulations are to be conducted using the available mechanistic model for each set of Pe, R, q, and θτ. Simultaneously, the ‘degree of hydrogenation’ is recorded at different locations in the reactor as a percentage at each sampling time. We designate the desired sampling locations after 6, 12, 18, and 24 elements. In the generated dataset, the output variable is the ‘degree of hydrogenation’ and the input variables include the number of elements from which the sampling process is carried out (6, 12, 18, or 24); the corresponding values for Pe, R, q, and θτ for that specific simulation; and the time of sampling. Consequently, our dataset comprises six input variables and one output variable.
Now, the generated dataset can be utilized to train the ANN to predict the ‘degree of hydrogenation’ as the desired model output. For this purpose, we choose a multi-layer perceptron (MLP) with 20 neurons in the hidden layer. The general structure of the selected ANN is illustrated in Figure 4. The training criteria, including the mean squared error (MSE) for each training epoch (Figure 5), the training error histogram (Figure 6), and the regression model between the ANN output and the mechanistic output (Figure 7), are also presented. It is important to note that increasing the number of independent simulations can enhance our training dataset, allowing us to better capture the internal dynamics of the reactor. In this instance, we opt for 15 different sets of values for Pe, R, q, and θτ. We will demonstrate that the model generated from the training dataset performs well when tested with a new set of input variables (unseen values for Pe, R, q, and θτ randomly sampled from the corresponding Gaussian probability density functions). However, if the model’s performance on the test data is deemed unacceptable, this may indicate poor training of the ANN model. One solution is to expand the training dataset by adding new sets of input variables associated with the key parameters and recording the simulation results.
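A minimal sketch of this training setup, using scikit-learn's `MLPRegressor` with a single 20-neuron hidden layer; the six-column toy dataset and target function below merely stand in for the recorded mechanistic-model simulations:

```python
import numpy as np
from sklearn.neural_network import MLPRegressor
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(1)
n = 2000

# Six inputs as in the text: element number, Pe, R, q, theta_tau, sampling time.
X = np.column_stack([
    rng.choice([6.0, 12.0, 18.0, 24.0], n),  # element number
    rng.normal(20.0, 2.0, n),                # Pe
    rng.normal(1.0, 0.1, n),                 # R
    rng.normal(0.5, 0.05, n),                # q
    rng.normal(1.0, 0.1, n),                 # theta_tau
    rng.uniform(0.0, 60.0, n),               # sampling time
])
# Toy "degree of hydrogenation" standing in for the simulation output.
y = 3.0 * X[:, 0] + 50.0 * np.tanh(X[:, 5] / 30.0)

# Standardize inputs (fitted on the 80% training portion only).
scaler = StandardScaler().fit(X[:1600])
mlp = MLPRegressor(hidden_layer_sizes=(20,), solver="lbfgs",
                   max_iter=2000, random_state=0)
mlp.fit(scaler.transform(X[:1600]), y[:1600])

# R-squared on the held-out 20% should be high for this smooth target.
r2 = mlp.score(scaler.transform(X[1600:]), y[1600:])
print(r2 > 0.9)  # True
```

The same skeleton applies to the real dataset: only the arrays `X` and `y` change, while the single-hidden-layer architecture matches the 20-neuron MLP described above.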
For further data analysis of the constructed dataset, which may help to fully grasp the complexity of the machine learning problem, we also calculate the correlation coefficients between the selected input variables (Pe, R, q, and θτ) and the output. Across all studied elements (#6, #12, #18, and #24), the magnitude of the obtained correlation coefficients is generally very low (below 0.3 in absolute value), which typically suggests a weak linear relationship between these features and the degree of hydrogenation. In other words, the generally weak correlations suggest either that the process is dominated by non-linear effects not captured by simple linear correlation or that other factors not included in the analysis may influence hydrogenation more significantly. The lack of strong linear relationships therefore indicates the need for more complex modeling approaches that can capture non-linear interactions, such as neural networks. The findings presented in the following paragraphs support this conclusion.
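The point that weak linear correlations can mask a strong non-linear dependence is easy to demonstrate. In the toy example below, the output depends strongly on Pe, yet the Pearson coefficient is near zero because the dependence is symmetric rather than linear (all numbers are illustrative):

```python
import numpy as np

rng = np.random.default_rng(2)

# Toy illustration: a strongly non-linear dependence can still yield a
# near-zero *linear* (Pearson) correlation coefficient.
pe = rng.normal(20.0, 2.0, 5000)
hydrogenation = 100.0 * np.exp(-0.5 * ((pe - 20.0) / 2.0) ** 2)  # symmetric in Pe

r = float(np.corrcoef(pe, hydrogenation)[0, 1])
print(abs(r) < 0.1)  # weak linear correlation despite a strong dependence
```

This is exactly why a low correlation coefficient alone does not rule a feature out, and why non-linear models such as neural networks are appropriate here.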
After the training phase, the next step involves evaluating the performance of the trained ANN. For this purpose, we randomly select 15 different values for the key parameters Pe, R, q, and θτ from the associated probability density functions. We then compare the outputs of the corresponding mechanistic models, serving as references, with the output of the trained ANN after various elements (i.e., after 6, 12, 18, and 24 elements).
As illustrated in Figure 8, the predicted outputs from the ANN successfully align with the reference values. Each set of Pe, R, q, and θτ corresponds to 240 sampling times, with every 60 samples indicating the degree of hydrogenation in percentages at the respective sampling points (after 6, 12, 18, and 24 elements). In essence, we have developed a unique and accurate ANN model capable of representing the dynamics of the SM reactor in the presence of uncertainty regarding key parameters.
With this model, users can effectively analyze the behavior and performance of the reactor even when they are uncertain about the exact values of the key parameters. Consequently, users can make informed modifications to enhance the overall efficiency of the hydrogenation process.
In this section, we compare the performance of the proposed ANN with other machine learning models, including linear regression, decision tree, and random forest, to identify the most effective approach for predicting the degree of hydrogenation. In general, linear regression, a foundational statistical technique, offers a straightforward approach for modeling the relationship between independent variables and a continuous dependent variable through a linear function. Despite its simplicity and interpretability, linear regression can struggle with complex, nonlinear data structures, which are typical in chemical processes. A decision tree, on the other hand, segments data into branches to form predictions, making it intuitive and capable of handling nonlinear data. However, decision trees are often prone to overfitting, especially with noisy data. We also implemented a random forest, which may generalize to new data better than a single decision tree. Our empirical results show that while the decision tree and random forest models provided substantial improvements over linear regression, they did not outperform the ANN. The ANN achieved the lowest RMSE, indicating its superior ability to model the complex relationships hidden in the data (Table 2). The ANN’s architecture, which comprises interconnected layers and nodes, enables it to learn nonlinear transformations and interactions between features effectively, providing a nuanced mapping of inputs to outputs. This capability makes it particularly adept at capturing the intricate dynamics of chemical reactions, where interactions and process conditions may have nonlinear effects on the outcomes.
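A sketch of such a baseline comparison using scikit-learn, with a toy non-linear target in place of the hydrogenation data (the target function and sizes are illustrative):

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.tree import DecisionTreeRegressor
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import mean_squared_error

rng = np.random.default_rng(3)
X = rng.uniform(0.0, 1.0, size=(2000, 4))
y = np.sin(3.0 * X[:, 0]) + X[:, 1] ** 2   # non-linear toy target

X_tr, X_te, y_tr, y_te = X[:1600], X[1600:], y[:1600], y[1600:]

models = {
    "linear": LinearRegression(),
    "tree": DecisionTreeRegressor(random_state=0),
    "forest": RandomForestRegressor(n_estimators=100, random_state=0),
}
rmse = {}
for name, model in models.items():
    model.fit(X_tr, y_tr)
    rmse[name] = mean_squared_error(y_te, model.predict(X_te)) ** 0.5

# Non-linear models should beat the linear baseline on this target.
print(rmse["forest"] < rmse["linear"])  # True
```

In the paper's comparison, the trained MLP would be added to the `models` dictionary and ranked by the same held-out RMSE.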
Figure 9 depicts the performance of each model with respect to the test data. It is clear from the figure that the outputs of the decision tree model and the ANN are much closer to the actual data than those of linear regression and random forest.
Figure 10 and Figure 11 also demonstrate the error histograms and error signals associated with each model, respectively.
Finally, we conduct a comprehensive feature importance analysis to substantiate the significance of the input features in our predictive model. Our analysis reveals that ‘Time’ (Feature #6) exhibits the highest importance score, followed closely by ‘Element Number’ (Feature #2). These results are consistent with our initial hypotheses and the observational data presented in Figure 3, which clearly illustrate a pronounced dependency of the degree of hydrogenation on both the reaction time and the specific elements involved. The temporal dimension is critical because it encapsulates the kinetics of the hydrogenation process, with longer durations allowing for more complete reactions, thereby influencing the observed degree of hydrogenation. Similarly, the degree of hydrogenation increases with the ‘Element Number’. Following these, the next most influential features, in order of decreasing importance, are θτ, q, R, and Pe (Figure 12). These findings not only reinforce our theoretical understanding but also provide quantitative evidence supporting the model’s reliance on these features to accurately predict hydrogenation outcomes. This feature importance analysis substantiates the critical nature of the selected input parameters and justifies their inclusion and prioritization within our modeling framework.
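One common way to obtain such importance scores is permutation importance; the sketch below applies scikit-learn's implementation to toy data in which the first feature dominates, mimicking the kind of ranking reported above:

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.inspection import permutation_importance

rng = np.random.default_rng(4)

# Toy data where feature 0 dominates the target, feature 1 matters less,
# feature 2 is weak, and feature 3 is irrelevant.
X = rng.uniform(0.0, 1.0, size=(1500, 4))
y = 5.0 * X[:, 0] + 1.0 * X[:, 1] + 0.2 * X[:, 2]

model = RandomForestRegressor(n_estimators=100, random_state=0).fit(X, y)
result = permutation_importance(model, X, y, n_repeats=5, random_state=0)

# Rank features by mean importance, highest first.
ranking = np.argsort(result.importances_mean)[::-1]
print(ranking[0] == 0)  # the dominant feature is ranked first
```

Permutation importance measures how much the model's score degrades when a single feature's values are shuffled, so it applies equally well to the trained ANN.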
Finally, we note that since the proposed ML model is based on data generated from our previous work (the mechanistic model development [2]), all underlying assumptions for deriving that mechanistic model also apply here. These assumptions include: (i) the validity of the film model for mass transfer, with a negligible reaction extent in the liquid film; (ii) steady and constant temperature and pressure conditions; (iii) a constant mass transfer coefficient and interfacial area; (iv) the independence of the axial dispersion from concentration and position; and (v) a reaction occurring primarily in the bulk of the polymer solution, with negligible reaction in the gas phase.
Also, it is important to note that our model demonstrates its capability in handling high variability in parameters and robustness in predicting the reactor dynamics under diverse conditions. Utilizing this model can lead to reduced computational demands, lower operational costs, and improved decision-making capabilities. However, its potential limitations include susceptibility to overfitting, sensitivity to specific parameter variations, and limited generalizability due to the specific dataset used for training. Additionally, the scale of the experiments and the specific conditions under which the model was developed may also impose limitations.