1. Introduction
Industrial fermentation processes for producing different compounds, such as pharmaceuticals and food ingredients, have long been used in the biotechnology industry. Various microorganisms can be used for these processes, from different species of bacteria to yeast and fungi [
1]. Furthermore, cell culture processes with mammalian cells are relevant in the biopharmaceutical context, particularly for the production of monoclonal antibodies. Each organism brings its own advantages and challenges, but some challenges are common to all upstream bioprocesses. Fundamentally, these organisms are biologically complex: each cell hosts multiple metabolic networks, which are activated differently under different circumstances. Furthermore, the interaction with the external environment affects the biological phenomena at the cellular level [
2]. For example, one of the major challenges when deploying a new fermentation process is its scale-up, since changes in geometry and physical conditions at large scale can affect process performance through the formation of gradients (e.g., of substrate concentration) [
3,
4].
Models are essential tools in the field of bioprocesses [
5]. They can be used to transform process knowledge and data into relevant predicted variables. Ultimately, they can be used in real-time for monitoring and process optimization [
6,
7]. Regarding the topic of this review article, fermentation processes, models can also be useful tools to predict differences in performance once the scale changes [
8]. In this way, scale-up can be made more seamless, reducing the time spent deploying a new process to full-scale production. These models can essentially be divided into two categories: mechanistic and data-driven.
Mechanistic models, also commonly referred to as “white-box” models, are mathematical representations of the knowledge of the process. They are described by first-principles equations, whose parameters have physical meaning [
9,
10]. On the other hand, data-driven models (“black box”) do not require any previous knowledge of the process but rather predict process outputs solely based on available process data [
11,
12].
Both of these modeling approaches have their strengths and weaknesses, and because of that, their application also differs. This will be further discussed in specific sections in this article,
Section 2 and
Section 3, for mechanistic and data-driven modeling, respectively. Combining both approaches is a strategy to overcome their limitations [
11]. This is broadly defined as hybrid or gray-box modeling [
12]; this is the terminology adopted in this work. Hybrid modeling creates the possibility of taking advantage of the process knowledge available to build a model that can be easily adapted to different cases (within a specific range) [
13,
14] and simultaneously use data-driven approaches to explain parts of the process for which there is no mechanistic knowledge available [
15,
16].
This review aims to describe each of the above-mentioned types of models, their advantages and disadvantages, how they have been applied in the context of fermentation processes in the past, as well as up-to-date examples. Furthermore, we will discuss how their combination—hybrid modeling—has been used and the opportunities it opens in topics that can be aided by modeling. Finally, we will present perspectives on how such models can enhance scale-up, a topic of broad interest in this research area.
The article is divided into six sections, the first being the Introduction section.
Section 2 and
Section 3 refer to mechanistic and data-driven models, respectively. The following
Section 4 focuses on hybrid modeling, its structure, and potential applications in fermentation processes.
Section 5 highlights some of the issues concerning scale-up and the role that models can play in solving them. Finally,
Section 6 presents the conclusion of the review, summarizing the main take-home messages and future perspectives.
2. Mechanistic Modeling
Mechanistic models are based on the fundamental laws of natural science and aim to describe systems and their mechanisms using mathematical equations derived from process knowledge. Equation parameters can have a biological, chemical, or physical meaning. They can be based on mass, heat, and momentum balances as well as kinetic rate equations.
Figure 1 illustrates the process of developing such models. In fermentation processes, the most critical aspects to predict are biomass growth, substrate consumption, and product formation [
10]. These models can have different levels of complexity regarding the assumptions made about cell heterogeneity (segregated vs. unsegregated models) and the detail considered when calculating cell growth (structured vs. unstructured models) [
Segregated models focus on describing heterogeneity in cell populations. For the purpose of bioprocess optimization, however, their high complexity makes them challenging and time-consuming to apply, and therefore an average description of the cell is more commonly used [17]. This review therefore focuses on unsegregated models, and the differences between segregated and unsegregated models are not discussed further. A review of the methods utilized for cell population modeling and its applications can be found in the publication of Waldherr [
18].
Table 1 highlights the differences between structured and unstructured mechanistic models and their advantages and disadvantages.
Unstructured kinetic models are widespread because they usually contain few parameters and, as a consequence, are not computationally expensive. They treat biomass as a black box that converts substrate into the product of interest, without detailing the chemical reactions occurring inside the cells. Therefore, this type of model focuses on the impact that external parameters have on the process: for example, temperature and pH on its biological component, or agitation power and aeration on its physical component. The lower resolution of the biological component allows for more detail in the physical characterization of the process [
10]. Some successful applications of these types of models are (a) modeling of overflow metabolism in
Escherichia coli [
19]; (b) modeling of enzyme production in
Aspergillus oryzae under different aeration and agitation conditions [
20]; and, (c) kinetic modeling of glucose and xylose co-fermentation for the production of lignocellulosic ethanol [
21].
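The black-box view of biomass described above can be made concrete with a minimal unstructured batch model. The sketch below assumes Monod growth kinetics, constant yield coefficients, and growth-associated product formation; all parameter values and variable names are illustrative and are not taken from the cited studies.

```python
# Minimal unstructured batch model: biomass X converts substrate S into
# product P. Monod growth kinetics, constant yields, growth-associated
# product formation. All parameter values are illustrative.
MU_MAX = 0.4   # 1/h, maximum specific growth rate
K_S = 0.5      # g/L, Monod half-saturation constant
Y_XS = 0.5     # g biomass per g substrate
Y_PX = 0.2     # g product per g biomass

def simulate_batch(x0=0.1, s0=10.0, p0=0.0, t_end=24.0, dt=0.01):
    """Explicit-Euler integration of dX/dt = mu*X, dS/dt = -mu*X/Y_XS,
    dP/dt = Y_PX*mu*X; returns final (X, S, P)."""
    x, s, p = x0, s0, p0
    for _ in range(int(t_end / dt)):
        mu = MU_MAX * s / (K_S + s) if s > 0 else 0.0
        dx = mu * x * dt
        x += dx
        s = max(s - dx / Y_XS, 0.0)
        p += Y_PX * dx
    return x, s, p

x_final, s_final, p_final = simulate_batch()
```

With these numbers the substrate is exhausted well before 24 h, so the final biomass approaches x0 + Y_XS*s0 and the product tracks the biomass gain, which provides a quick sanity check on the balances.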
On the other hand, structured models consider biomass as multi-component organisms with internal structure [
22]. They can be useful, for example, for improving the metabolic efficiency (yield) of a target product. For example, Tang et al. [
23] used a pooled metabolic model to predict the metabolic impact in
P. chrysogenum of a feast–famine cycle, both on the hour and minute time scales, making it relevant for studying the influence of large-scale mixing. Jahan et al. [
24] developed a model to estimate specific growth rates based on reaction kinetics for wild-type and genetic mutants of
Escherichia coli. In another study, Çelik et al. [
25] developed a structured kinetic model for
Pichia pastoris growth and recombinant protein production for optimizing feeding strategy.
Unstructured models are simpler but can still provide a useful description of the process and, therefore, can be easily used for design purposes [
10]. Special attention must be paid, however, since they cannot be easily extrapolated. On the other hand, dynamic metabolic modeling provides a more accurate description of growth metabolism, increasing extrapolation capabilities [
26]. However, it requires many equations, and a significant number of parameters must be experimentally identified. Ultimately, what defines a good model depends on the desired application context, and thus, model complexity should be decided according to the question to be answered.
In
Table 2, more examples are summarized, both for structured and unstructured approaches. Most applications concern the prediction of growth or product formation under different process conditions. Nevertheless, these models have also been used to determine optimal process parameters or to understand the cell’s metabolic responses. In summary, mechanistic models are based on process understanding, which means that they can be extrapolated, to some extent, outside the specific context in which they are developed by tuning some of the model parameters [
10]. On the downside, they require intensive study/knowledge of the process and thus are time- and resource-consuming to develop and maintain [
9].
3. Data-Driven Modeling
Unlike mechanistic models, data-driven approaches ignore relationships that originate from process knowledge, and the parameters of their mathematical equations have no physical meaning.
Figure 2 illustrates how historical process data are used to develop these models. In a fermentation context, these models aim to predict critical quality attributes (CQA), such as titer, productivity, and carbon efficiency, based on critical process parameters, without accounting for the mechanistic causalities that describe the relationships [
11].
Machine learning is a widely used data-driven modeling method and can be divided into two categories: supervised and unsupervised learning. Some authors also introduce the classification of reinforcement learning, though others argue that it should not be considered a specific class of learning methods but rather a paradigm in which an agent learns how to behave in an environment based on a reward signal [
33,
34]. Nonetheless, the three types (supervised, unsupervised, and reinforcement) of learning will be discussed and examples will be given on their application in the modeling of fermentation processes (
Table 3).
In supervised learning, the data are labeled, which means that in addition to inputs, there are also predetermined output attributes that are considered in the modeling process [
33]. In the case of fermentation processes, the output attributes would be, e.g., biomass and product concentration, and the input attributes, online process data. Therefore, the algorithm will identify the relationship between input and output variables and use it to predict the target values based on new input values. Supervised learning includes the use of artificial neural networks (ANN). These have been successfully implemented for fermentation processes. For example, Tavasoli et al. [
35] used neural networks to develop a μ-stat approach to control methanol feeding in an
E. coli fermentation for recombinant protein production. The results showed significant improvements compared to previously used approaches for methanol feeding. Furthermore, Nagy [
36] has used dynamic neural networks to develop a model predictive controller of temperature for continuous yeast fermentation.
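The input–output mapping at the core of supervised learning can be illustrated with a minimal sketch: a linear model fitted by gradient descent to labelled data. The inputs stand in for online process data and the label for a measured output such as biomass concentration; the data, model, and parameter values are synthetic and do not reproduce the cited studies.

```python
# Toy supervised learning: fit y = w1*u1 + w2*u2 + b to labelled pairs
# by full-batch gradient descent. Inputs u stand in for online process
# data, the label y for a measured output (e.g. biomass concentration).
# Synthetic, noiseless data; everything here is illustrative.
import random

random.seed(0)
TRUE_W, TRUE_B = (0.8, -0.3), 2.0
data = []
for _ in range(200):
    u = (random.uniform(0, 1), random.uniform(0, 1))
    y = TRUE_W[0] * u[0] + TRUE_W[1] * u[1] + TRUE_B  # label
    data.append((u, y))

w, b, lr = [0.0, 0.0], 0.0, 0.1
for _ in range(10000):
    gw, gb = [0.0, 0.0], 0.0
    for u, y in data:
        err = w[0] * u[0] + w[1] * u[1] + b - y
        gw[0] += err * u[0]
        gw[1] += err * u[1]
        gb += err
    n = len(data)
    w[0] -= lr * gw[0] / n
    w[1] -= lr * gw[1] / n
    b -= lr * gb / n
```

An ANN generalizes this idea by stacking such weighted sums with nonlinear activations, but the training principle, minimizing the error between prediction and label, is the same.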
On the other hand, unsupervised learning focuses on identifying hidden patterns in the data without considering a target attribute. All variables in the dataset are used as inputs. Thus, these methods are suitable for clustering and association techniques [
33]. Some examples are PCA (principal component analysis) and PLS (partial least squares) regressions. Andersen et al. [
37] applied a partitioned PLS model to predict the yield of a batch fermentation based on selected process variables. Another example is the use of a data-driven Gaussian process regression model by Barton et al. [
38] on a batch fermentation model. The model was used to increase productivity from batch to batch by manipulating process variables, e.g., batch cycle time. Nucci et al. [
39] used a PCA algorithm to detect when the process is not progressing as planned, providing decision-making support. A sub-category of unsupervised learning methods makes use of maximum-likelihood estimation. These methods take measurement error variance into account, making them suitable for processes with limited and noisy data, such as fermentation or cell culture. In the works of Dewasme et al. [
40] and Pimentel et al. [
41], they are applied to PCA and nonnegative matrix decomposition (NMD), respectively, to reduce data dimensionality and identify relevant process models in hybridoma cell culture for the production of monoclonal antibodies.
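The dimensionality-reduction idea behind PCA can be illustrated on two synthetic, correlated "process variables", for which the principal component follows in closed form from the 2×2 covariance matrix. This is a toy sketch, far simpler than the multivariate and maximum-likelihood variants discussed above.

```python
# Toy PCA on two correlated "process variables": the principal
# component is computed in closed form from the 2x2 covariance matrix.
# Synthetic data; real applications use many variables and a solver.
import math
import random

random.seed(1)
a = [random.gauss(0, 1) for _ in range(500)]
b = [2.0 * ai + random.gauss(0, 0.1) for ai in a]  # b ~ 2*a + noise

n = len(a)
ma, mb = sum(a) / n, sum(b) / n
caa = sum((x - ma) ** 2 for x in a) / n
cbb = sum((y - mb) ** 2 for y in b) / n
cab = sum((x - ma) * (y - mb) for x, y in zip(a, b)) / n

# Eigenvalues of [[caa, cab], [cab, cbb]] in closed form.
tr, det = caa + cbb, caa * cbb - cab * cab
disc = math.sqrt(tr * tr / 4.0 - det)
lam1, lam2 = tr / 2.0 + disc, tr / 2.0 - disc   # lam1 >= lam2 >= 0
explained = lam1 / (lam1 + lam2)   # fraction of variance on PC1
slope = (lam1 - caa) / cab         # direction of PC1 as a b-vs-a slope
```

Because the second variable is essentially a multiple of the first, one component captures nearly all the variance; this redundancy among correlated process measurements is exactly what PCA exploits.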
Positioned between supervised and unsupervised learning, reinforcement learning is a class of algorithms in which an agent (in this case, the model) learns how to behave in a dynamic environment through trial-and-error interactions, with a scalar reward signal as the only feedback [
34,
42]. Two main strategies are used for solving this type of problem: (a) search the space of behaviors for one that performs well in the environment; this is achieved, for example, using genetic algorithms; (b) use statistical and dynamic programming methods for estimating the effect that taking different actions has on the different states of the system [
42]. These types of models find applications in fermentation processes, particularly in the development of feed control strategies. Treloar et al. [
43] applied a deep reinforcement learning method to control substrate feeding rates to maintain the desired population levels (in a co-culture) to optimize product formation. In another example, Kim et al. [
44] used model-based reinforcement learning to develop a feed rate control that led to an increase in yield and productivity in an in silico penicillin production plant.
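Strategy (b) above can be sketched with tabular Q-learning on a toy feeding problem: the agent chooses whether to feed, is rewarded for keeping the substrate level in a favorable range, and is penalized for overfeeding. The states, dynamics, and rewards are invented for illustration and are far simpler than the deep and model-based methods used in the cited works.

```python
# Tabular Q-learning on a toy substrate-feeding problem. States are a
# coarse substrate level (0 low, 1 favorable, 2 overflow); action 1
# feeds, action 0 does not. Entirely invented for illustration.
import random

random.seed(42)
N_STATES = 3
ACTIONS = (0, 1)  # 0 = no feed, 1 = feed

def step(state, action):
    """Deterministic toy dynamics: feeding raises the substrate level,
    not feeding lowers it; reward +1 in the favorable zone, -1 in the
    overflow zone."""
    nxt = min(state + 1, 2) if action == 1 else max(state - 1, 0)
    reward = 1.0 if nxt == 1 else (-1.0 if nxt == 2 else 0.0)
    return nxt, reward

Q = [[0.0, 0.0] for _ in range(N_STATES)]
alpha, gamma, eps = 0.1, 0.9, 0.1
s = 0
for _ in range(20000):
    # epsilon-greedy action selection
    if random.random() < eps:
        a = random.choice(ACTIONS)
    else:
        a = 0 if Q[s][0] >= Q[s][1] else 1
    s2, r = step(s, a)
    Q[s][a] += alpha * (r + gamma * max(Q[s2]) - Q[s][a])
    s = s2

# greedy policy learned from the Q-table
policy = [0 if Q[st][0] >= Q[st][1] else 1 for st in range(N_STATES)]
```

The learned policy feeds only when the substrate is low and stops otherwise, i.e., the agent discovers a bang-bang feeding strategy purely from the reward signal.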
Table 3 summarizes the examples given in the text above, in addition to more relevant examples of the application of data-driven models. It highlights the capabilities of these approaches. To conclude, the main advantage of data-driven models is the automatic assembly of the models and the low computational burden, which makes them suitable for real-time monitoring and control [
11]. However, unlike mechanistic models, their predictive capabilities are limited to the space in which they were validated, restricting their use for bioprocess control and optimization to very specific cases [
12].
Table 3.
Examples of applications of data-driven modeling of fermentation processes.
Microorganism | Type of Model | Studied Parameter | Main Findings | Reference |
---|---|---|---|---|
Bacillus megaterium | PCA | Fault detection | On-line fault detection providing decision-making support | [39] |
Escherichia coli | Neural networks | μ-stat feeding control | Increased cell growth and target protein production | [35] |
 | Reinforcement learning | Feed rate control | Product formation optimization in a simulated chemostat with co-cultures | [43] |
Hybridoma cells | Maximum-likelihood PCA | Macroscopic reactions | Determine minimum number of reactions and parameters for process model | [40] |
 | Maximum-likelihood NMD | Prediction of relevant parameters | Identified model with good prediction results | [41] |
Penicillium chrysogenum | Reinforcement learning | Feed rate control | Optimized yield and productivity of in silico penicillin production plant | [44] |
 | Reinforcement learning | Feed rate control | Outperformed other feed control strategies for a digital industrial penicillin plant | [45] |
Saccharomyces cerevisiae | Neural networks | Temperature model predictive controller | More robust temperature control, compared to linear model predictive controller | [36] |
 | Neural networks | Monitoring of relevant parameters | Prediction results with on-line fluorescence spectroscopy and a process model were equivalent to those of the model where offline calibration data were used | [46] |
 | Gaussian process regression | Manipulation of cycle time | Increased productivity from each batch to the following one | [38] |
Streptomycetes sp. | PLS | API production | Identification of process variables responsible for variation in API production | [47] |
Not disclosed | PLS | Yield prediction | Similar performance to more complex genetic algorithm | [37] |
4. Hybrid Modeling
As stated in the preceding sections, despite their advantages, both mechanistic and data-driven models have their shortcomings, as summarized in
Table 4. Semi-parametric hybrid modeling, here simply called hybrid modeling, combines these approaches. The result is either a more accurate mechanistic model, obtained by incorporating historical process data, or a data-driven model that can be extrapolated outside the specific context in which it was trained [
48]. This is particularly relevant for modeling complex systems where only partial process understanding exists. An example would be a fermentation process in which the mass and energy balances are well defined, but the parameters of the kinetic rate equations are complicated to determine [
49]. Process data are used by data-driven models to fill in knowledge gaps. An advantage of this type of model is that it integrates existing knowledge into a structured data-driven framework, allowing predictions to be improved as more experimental data on the process are collected and added to the model [
50]. For monitoring purposes, the models should predict the relevant parameters based on online measurements. In fermentation processes, this can be achieved straightforwardly with data-driven techniques for biomass concentration (e.g., from oxygen and carbon evolution rates) [51]; however, it is more challenging for product concentration, due to, for example, relatively low product titers [
52]. By integrating the two types of knowledge, mechanistic and data-driven, the limitations presented in
Table 4 can be reduced [
52]. Hybrid modeling is a relevant approach for model predictive control since the model needs to remain robust in untested regions of the process [
50]. In addition, the extrapolation capabilities that characterize hybrid models make them suitable for process control outside the tested process conditions [53]. For this application, the extrapolation capabilities of the mechanistic component are an important complement to data-driven approaches. Furthermore, the underlying mechanistic structure makes these models more transparent and easier to scrutinize than their purely data-driven counterparts [
53].
The work of Narayanan et al. [
13] highlights the benefits of hybrid modeling by comparing the performance metrics of a process model with varying degrees of hybridization. This study evaluates seven process models in which 0% (equal to a fully data-driven model) to 100% (a fully mechanistic model) of process knowledge is included. The fully data-driven model utilized was PLS since it was the best performing among other tested structures (e.g., NN); when incorporating process knowledge, NN were chosen as the data-driven components. As for the mechanistic component, the added knowledge to each of the five hybrid models was (1) the rate of accumulation, (2) mass balances, (3) specific rate, (4) specific growth and death rate, and (5) kinetic terms. The fully mechanistic model further included Monod equations for the metabolites.
Table 5 summarizes the results obtained. Essentially, the models were tested in two contexts: interpolation and extrapolation, i.e., within process conditions present in the training dataset and conditions not observed in the training data, namely the feed profiles. For both cases, hybrid approaches showed superior performance. Most interestingly, the data requirements for each level of knowledge incorporation differed. In the interpolation scenario, it is possible to observe that, by adding an equation for the accumulation rates, the same performance as the data-driven model was achieved with 20 fewer training runs. Hybrid Model 3, which contained a variable for the specific formation/consumption rate of each variable, was the best performer, having the lowest MSE (mean squared error) and the least training data. As more knowledge was added to the model, more training data were necessary to achieve equal performance. In these cases, the additional knowledge of the process means that a larger number of outputs need to be predicted using the NN, thus having more parameters and requiring a larger quantity of data. As for the mechanistic model, the MSE obtained is the highest observed; however, only 10 runs are necessary. With the same number of runs, the data-driven model exhibits significantly worse performance (MSE of 0.15).
The differences became more striking in the extrapolation test scenario. The data-driven model performed poorly even with 50 training runs. By adding some knowledge, the performance of hybrid model 1 improved significantly, although its performance was still not satisfactory. Similarly to the interpolation case, hybrid model 3 was the best performer. It exhibited the lowest MSE while also requiring the least training data. The models with a larger mechanistic component once again required more training data and had an equally low MSE. Finally, the fully mechanistic model presented an MSE higher than that of the best hybrid models, although it needed the least amount of data. Considering only 30 training runs, it was only outperformed by hybrid model 3. Furthermore, when comparing the two extremes, data-driven and mechanistic, it is clear that the latter is superior when extrapolation is necessary. Overall, with an adequate selection of the mechanistic component, hybrid models present several advantages, resulting in more accurate models with good extrapolation properties and with lower data requirements than data-driven counterparts.
Depending on how the different types of models are combined, two hybrid model structures can be defined, parallel and serial (
Figure 3). The parallel structure (
Figure 3a) is suitable when the parametric (mechanistic) model exists independently, but its prediction capabilities are limited due to, e.g., unmodeled effects and nonlinearities [
12]. As such, the parametric model can be used by itself, and the non-parametric component only improves the quality of the predictions [
48]. The downside of this approach is that the model’s prediction will remain poor for the input space in which the data-driven model has not been trained. As for the serial structure (
Figure 3b,c), the “white-box” model will be composed of first-principles equations, such as mass and energy balances, for example, and the “black-box” model component will be used to represent, for example, kinetic terms, since these are harder to validate [
12]. The serial structure is particularly suitable when there is insufficient knowledge of the underlying process mechanisms to build a fully mechanistic model, but sufficient process data are available to calibrate the data-driven component. On the other hand, a serial structure can also be applied in the case where the predictions of the mechanistic model are used as input to the data-driven model, establishing relationships between the process parameters or the inputs [
12].
The main determinant of the best structure to adopt is the structure of the mechanistic model, as the assumptions made in that model constrain the solution space [
54]. As such, when the mechanistic model cannot correctly represent some aspects of the process, e.g., complex nonlinear kinetics, a parallel structure is preferred. It can perform better than the serial arrangement since the data-driven model can partially compensate for the structural weakness in the mechanistic model. When the structure of the mechanistic model is accurate, the serial model gives better predictions compared to the parallel model. In addition, the extrapolation properties will be significantly better.
Hybrid modeling is a relatively recent field. Despite significant efforts in this area, as evidenced by applications in fermentation processes (refer to
Table 6), there are still challenges to overcome. These challenges should be addressed to allow their widespread application in the field of biochemical engineering. A detailed discussion of current challenges can be found in the review of Schweidtmann et al. [
48]. Some examples that are found particularly relevant for fermentation processes are: (a) the complexity in parameter estimation in dynamic hybrid models since this could lead to an increase in computational demand [
55,
56]; (b) the lack of well-documented methods for incremental learning, i.e., being able to train the model on new data without requiring access to the original data, which could be essential to improve the model’s predictions as more experimental data are collected. This is relevant since the most common approach at the moment is batch incremental learning, which requires the model to be re-trained on the whole dataset (original plus new data), which can become computationally expensive as the quantity of data grows [
57]; and (c) the use in adaptive and evolving systems, since this can be the case in fermentation processes, such as processes with distinct phases for growth and product synthesis [
58]. This results in a metabolic shift, which can be represented by keeping the same structure (e.g., the equation used to describe the growth rate) and changing the value of certain parameters—adaptive—or by constructing a different model for each phase (e.g., choosing a new equation that better describes the growth rate in the new conditions)—evolving.
Applications in Fermentation Processes
This section will focus on applications of hybrid modeling in the field of fermentation. Hybrid modeling can be used as a prediction tool in the process development stage [
16,
49], and its applications extend to monitoring, control, and optimization.
Table 6 summarizes some examples of the application of this type of model.
As an early development stage example, von Stosch et al. [
49] applied a hybrid model to reduce the design of experiments (DOE) for an
E. coli fermentation process. The selected structure for the model was a serial structure (
Figure 3c), where the data-driven component (ANN) is used to predict the rates and correlation parameters included in the mechanistic component (material balances). Essentially, the mechanistic model uses ODEs to describe the variation over time of volume, biomass, and product concentrations, by establishing the relationships between these variables and rates (biomass and product formation) and the added base and feed solution. The data-driven model is an artificial neural network with three layers. The inputs for this model are the process parameters X (biomass concentration), P/X (specific productivity), T (temperature), pH, and the carbon feed rate. The structure of the neural network (number of nodes and hidden layers) as well as the most relevant process parameters to include were decided on the basis of the performance of the model on the validation set. The hybrid model could predict the impact of different induction conditions (e.g., temperature and pH) on biomass growth and recombinant protein formation, allowing for better process understanding without added experiments.
For monitoring purposes, the models can be used as soft sensors, where online measurements are fed to the model and turned into relevant predictions. For example, Brunner et al. [51] used the CO₂ concentration measured online in the off-gas for real-time prediction of the biomass concentration throughout the different stages of a fermentation process (batch, transition phase, and fed-batch), resorting to a simple carbon balance model coupled with multiple linear regression. In this case, a serial structure is adopted as well; however, unlike the previous example, the mechanistic model’s predictions are fed into the data-driven component (Figure 3b). The rate of carbon production is calculated mechanistically based on mass balances. This value, along with the volume of base solution that has been fed into the reactor, is input to the multiple linear regression model, which calculates the biomass concentration. Furthermore, a phase detection algorithm determines the current process stage and automatically adapts the model’s parameter values, based on the CO₂ concentration measured online in the off-gas.
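The serial soft-sensor structure just described can be sketched as follows. This is loosely inspired by, and much simpler than, the approach in [51]: a mechanistic carbon balance integrates synthetic off-gas readings into a cumulative carbon evolution, and an ordinary least-squares regression maps that quantity to biomass concentration. All data, coefficients, and names are invented.

```python
# Soft-sensor sketch (loosely inspired by the structure of [51], not a
# reproduction): a carbon balance integrates off-gas CO2 readings into
# cumulative carbon evolved, and ordinary least squares maps that to
# biomass concentration. All data are synthetic.
import random

random.seed(7)
dt = 0.1  # h, sampling interval
# synthetic CO2 evolution rate profile over a fed-batch (arbitrary)
cer = [0.02 * 1.05 ** i for i in range(100)]

# mechanistic step: integrate the rate into cumulative carbon evolved
cum_c, total = [], 0.0
for r in cer:
    total += r * dt
    cum_c.append(total)

# "offline" biomass measurements used to calibrate the regression
x_meas = [0.3 + 12.0 * c + random.gauss(0, 0.05) for c in cum_c]

# data-driven step: ordinary least squares X = a * cum_c + b
n = len(cum_c)
mc, mx = sum(cum_c) / n, sum(x_meas) / n
a = sum((c - mc) * (x - mx) for c, x in zip(cum_c, x_meas)) \
    / sum((c - mc) ** 2 for c in cum_c)
b = mx - a * mc

def soft_sensor(cum_carbon):
    """Real-time biomass estimate from the integrated carbon signal."""
    return a * cum_carbon + b
```

Once calibrated, the regression runs on each new off-gas sample, giving a biomass estimate without any offline measurement.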
In another approach, Boareto et al. [
61] used NNs to improve a previously developed mechanistic model of the produced lipolytic enzyme titer. The model utilized carbon evolution rate and substrate feed rate measurements to predict, in real time, the enzyme titer, as well as the substrate and biomass concentrations. In this approach, a parallel structure is adopted to compensate for the structural mismatch in the original mechanistic model. The mechanistic component of the model was adapted from the literature and reduced so that it would include ODEs only for the biomass and substrate concentration, using the well-known Monod equation to predict the growth rate. The equations describing the evolution in enzyme activity were removed due to their inaccurate predictions. The data-driven component of the model (ANN) is used to calculate the enzyme’s titer based on the carbon evolution rate, substrate feed rate (online measurements), and biomass concentration (calculated by the mechanistic component). The selected neural network model had three layers, and its structure was determined by cross-validation. The final model significantly improved the accuracy of the enzyme titer predictions while maintaining the same performance for the prediction of biomass concentration.
Cabaneros Lopez et al. [
65] used mid-infrared spectroscopy data to feed a PLS model which, combined with a kinetic model, predicts the glucose, biomass, and ethanol concentrations in a lignocellulosic fermentation. The model presents a parallel structure, and the predictions of both components are fused by a continuous-discrete extended Kalman filter (CD-EKF). The mechanistic component is a kinetic model composed of eight ODEs describing all the variables of interest, with parameter estimation performed by the non-linear least-squares method. For the data-driven component, PLS models were used to predict glucose, xylose, and ethanol concentrations from the spectral data. The predictions of the hybrid model were compared with those of the mechanistic and data-driven models on their own. In all but one case, the hybrid model presented lower RMSE values than the other models; in one of the test fermentations, the RMSE for the prediction of ethanol concentration was lower for the mechanistic model than for the hybrid model.
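The fusion step performed by the CD-EKF can be illustrated with a scalar Kalman-style update that weights a mechanistic prediction and a spectroscopy-based estimate by their variances. This one-variable toy ignores the continuous-discrete machinery of the cited work; all numbers are illustrative.

```python
# Scalar Kalman-style fusion: combine a mechanistic model prediction
# with a spectroscopy-based estimate, weighting each by its variance.
# A one-variable stand-in for a CD-EKF; numbers are illustrative.

def fuse(pred, var_pred, meas, var_meas):
    """Variance-weighted fusion of prediction and measurement."""
    k = var_pred / (var_pred + var_meas)  # Kalman gain
    est = pred + k * (meas - pred)
    var = (1.0 - k) * var_pred
    return est, var

# mechanistic model: 5.0 g/L glucose (variance 0.4);
# PLS-on-spectra estimate: 4.0 g/L (variance 0.1)
est, var = fuse(5.0, 0.4, 4.0, 0.1)  # -> (4.2, 0.08)
```

The fused estimate leans toward the more certain source, and its variance is lower than either input variance, which is why fusing the two model types can outperform each one alone.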
For control purposes, robust models with high extrapolation possibilities are required [
52], a characteristic of hybrid models; nevertheless, not many applications have been reported. Some examples are the work of Dors et al. [
63] and Jenzsch et al. [
62], in which the predictions of the hybrid model are used as input to control the fermentation feed profile, leading to a more stable process and improved batch-to-batch reproducibility, respectively. In the first case, a parallel structure is adopted. The mechanistic component consists of ODEs that describe the mass balances for all relevant process variables and uses Monod relationships to describe the kinetics of the process. The data-driven component (ANN) is used to partially calculate the consumption and production rates. The predictions of both components are weighted according to the process data available to train the neural network in the region corresponding to the current process state; i.e., if sufficient historical data for the current state exist, the prediction of the neural network is given a higher weight than that of the mechanistic component. Finally, the hybrid model is used to calculate an optimal feed rate. The use of hybrid modeling for online applications, namely monitoring, is already significant. Real-time predictions of key process variables make it possible to detect whether the process is running as expected and can aid operators' decision-making. This pushes the industry towards a more digital operation, less dependent on variation introduced by human intervention.
5. Model Aided Scale-Up
Scale-up has been a central topic in fermentation research for years. It entails taking a newly developed process from the laboratory to full production scale, with the aim of producing large product quantities while maintaining the CQAs observed at the small scale. The challenge is that as the reactor size increases, favorable growth conditions become harder to attain, e.g., due to less efficient mixing. This, in turn, can lead to lower process reproducibility, yields, and product quality [
66]. Process scale-up is still, to this day, a major challenge in the fermentation industry, as it is usually not based on mathematical process models but on empirical correlations.
The ideal way to tackle the challenges found on large scales would be to perform experiments on the actual production site. However, this is not economically feasible, not only due to the large amount of resources consumed but also due to the loss of production capacity [
The alternative is the use of scale-down approaches, in which a laboratory- or pilot-scale reactor is used to replicate the conditions experienced at an industrial scale, so that the results are relevant for production process optimization. Achieving a successful scale-down is itself a challenge, since some conditions, such as oxygen transfer, shear rate, and flow patterns, might be difficult to replicate at smaller scales. Several platforms can be used for this purpose, including pilot-scale reactors, microtiter plates [
68], shake flasks, microbioreactors, or milliliter-scale stirred reactors [
69]. Another interesting approach is the use of two connected STRs, or of an STR and a PFR [
70]. This approach enables the study of gradients by having each small-scale reactor represent a specific zone of the large-scale reactor, for example, a zone of substrate or oxygen depletion.
5.1. Use of CFD-Coupled Kinetic Models
A relevant problem in industrial-scale reactors that should be taken into account when planning a scale-up is the formation of gradients, e.g., of substrate or dissolved oxygen concentration, which may significantly impact fermentation process performance. These gradients occur when the local rate of consumption is higher than the rate of transport [
3]. The consequence is the occurrence of different zones inside the bioreactor with a surplus or a deficiency of nutrients or oxygen. For example, if the substrate is dosed at the top, the cells located there will experience high nutrient concentrations, which can lead to overflow metabolism in
E. coli, with the production of inhibitory by-products such as acetate.
A useful tool to study gradients is the combination of computational fluid dynamics (CFD) simulations and biological models. By combining the fluid flow information from the CFD model with the cell metabolism information from the biological model, it is possible to predict the cell response to environmental factors and, ultimately, the impact this has on CQAs [
71].
Several studies have been conducted, with the complexity of the biological model ranging from simple unstructured approaches to complex structured metabolic models. Although unstructured approaches are useful for a better understanding of gradients, they cannot capture the response of the organisms to the different conditions, as metabolic models can.
Table 7 summarizes examples of the models developed. To highlight a few, Pigou and Morchain [
4] were able to predict the areas of the bioreactor where acetate would be consumed or produced by
E. coli due to the formation of a glucose gradient. They used a population balance model in combination with a compartment model, reducing the computational burden compared to a CFD simulation. In the work of Sibler et al. [
72], the impact of the CO gradient on the cell population could be predicted. Their results indicate that the scale-up of this syngas fermentation should take into account the need to improve CO mass transfer or to engineer strains that better cope with this limitation.
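As a toy illustration of the compartment-model idea, the sketch below solves for a quasi-steady glucose profile along a loop of stirred compartments and flags the zones where the specific uptake rate exceeds a critical (overflow) threshold, i.e., where acetate production would be expected. The geometry, the kinetics, and the overflow criterion value are assumptions for illustration only, not taken from the cited studies.

```python
import numpy as np

N = 5                                 # compartments, top (feed) to bottom
Q_CIRC = 10.0                         # circulation flow, L/h
V = np.full(N, 2.0)                   # compartment volumes, L
F_FEED, S_FEED = 0.2, 400.0           # feed into compartment 0 (L/h, g/L)
X = 20.0                              # biomass, g/L (assumed uniform)
QS_MAX, KS, QS_CRIT = 1.2, 0.05, 0.8  # uptake kinetics, g/gX/h; g/L; g/gX/h

def steady_state_glucose(n_iter=20000, dt=1e-3):
    """Relax the compartment mass balances to a quasi-steady profile."""
    S = np.full(N, 0.1)
    for _ in range(n_iter):
        qs = QS_MAX * S / (KS + S)            # local specific uptake rate
        dS = -qs * X
        # circulation loop: each compartment receives from the one above,
        # and the bottom compartment returns broth to the top
        dS += Q_CIRC * (np.roll(S, 1) - S) / V
        dS[0] += F_FEED * S_FEED / V[0]       # substrate dosed at the top
        S = np.maximum(S + dt * dS, 0.0)
    return S

S = steady_state_glucose()
qs = QS_MAX * S / (KS + S)
overflow_zone = qs > QS_CRIT    # zones where overflow metabolism is expected
```

The same logic, i.e., local kinetics evaluated on a spatially resolved concentration field, is what CFD-coupled models apply on a much finer grid; compartment models trade spatial resolution for a drastically lower computational burden.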
5.2. Use of Hybrid Modelling in Scale-Up
So far, this section has illustrated how modeling can be used to predict a challenge that can be encountered when scaling up: the formation of gradients. However, models that can, to some extent, predict the fermentation performance on a large scale, namely what the CQAs would look like under different conditions, would be extremely beneficial when planning the scale transition. Although there are numerous instances of hybrid models being applied to fermentation processes, their utilization for scale-up is rarely reported. We believe that this modeling approach offers a significant opportunity to accelerate the scale-up stage of fermentation process development.
As mentioned in
Section 4, hybrid models can have good extrapolation properties (determined by the mechanistic component of the model) while being less time-consuming to develop than purely mechanistic models. This makes them well suited for scale-up, since the model can be developed and calibrated at a smaller scale and then extrapolated to a larger one. One strategy is to develop a mechanistic model at small scale and complement it with data-driven approaches that represent the scale-specific parts of the model. The data-driven component then accounts for scale-up factors and for assumptions made in the mechanistic model that are not valid at larger scales [
12,
76]. Another option is to use mainly small-scale experimental data to develop the model, but to include a few validation experiments at a larger scale. This can be sufficient to significantly improve the model’s prediction capabilities at the large scale, without requiring a large increase in resource consumption [
50].
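The second strategy, i.e., calibrating at small scale and re-calibrating with only a few larger-scale runs, can be sketched with a toy growth model. The logistic model, the parameter values, and the synthetic "data" below are all illustrative assumptions.

```python
import numpy as np
from scipy.optimize import curve_fit

def logistic(t, mu, x_max, x0=0.5):
    """Simple logistic growth curve (illustrative stand-in model)."""
    return x_max / (1 + (x_max / x0 - 1) * np.exp(-mu * t))

t = np.linspace(0, 24, 25)
rng = np.random.default_rng(0)

# "Small-scale" data: dense sampling, true mu = 0.30 1/h, x_max = 10 g/L
small = logistic(t, 0.30, 10.0) + rng.normal(0, 0.1, t.size)
(mu_s, xmax_s), _ = curve_fit(logistic, t, small, p0=[0.2, 8.0])

# "Large-scale" data: mixing limits growth (true mu = 0.24); keep x_max
# fixed from the small scale and re-fit only mu on sparse data
t_sparse = t[::6]
large = logistic(t_sparse, 0.24, 10.0) + rng.normal(0, 0.1, t_sparse.size)
(mu_l,), _ = curve_fit(lambda tt, mu: logistic(tt, mu, xmax_s),
                       t_sparse, large, p0=[mu_s])
```

Only the scale-sensitive parameter is re-estimated, so a handful of large-scale points suffices, mirroring the resource argument made above; in a hybrid model, the re-calibrated part would typically be the data-driven component or a small subset of kinetic parameters.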
Here, two examples are described where a hybrid model is used to predict larger-scale performance. In the work of Bayer et al. [
50], a hybrid model was developed from a DOE with 300 mL shake flasks and re-calibrated with only three batches in a 15 L lab-scale reactor. These few experimental runs at the larger scale were sufficient for the model to accurately describe cell behavior and product formation at the 15 L scale under different process conditions, including shifts in temperature and in the substrate concentration of the feed during the fermentation, even though the model had only been trained on static conditions. Regarding the work of Rogers et al. [
14], three hybrid models are developed at the 1 L scale, each incorporating a different amount of kinetic knowledge, i.e., a different number of parameters. The models are then used to predict the performance of fermentations in 5 L reactors under a temperature change that was not present in the training data. They found that the model with intermediate knowledge incorporation performed best in this case and is suitable for model-based bioreactor optimization and scale-up. Although the model with the largest amount of kinetic information produced more confident predictions, it showed lower accuracy. This highlights that incorporating information that is not fully understood can introduce incorrect bias and degrade model performance, i.e., overfit the model.
No examples of applications of similar strategies to industrial scales have been found in the literature. However, an adaptation of the described strategies to larger scales could be of interest.