*Article* **Use of Food Spoilage and Safety Predictor for an "A Priori" Modeling of the Growth of Lactic Acid Bacteria in Fermented Smoked Fish Products**

**Angela Racioppo, Daniela Campaniello, Milena Sinigaglia, Antonio Bevilacqua, Barbara Speranza \* and Maria Rosaria Corbo**

> Department of Agriculture, Food, Natural Resources and Engineering (DAFNE), University of Foggia, 71122 Foggia, Italy; angela.racioppo@unifg.it (A.R.); daniela.campaniello@unifg.it (D.C.); milena.sinigaglia@unifg.it (M.S.); antonio.bevilacqua@unifg.it (A.B.); mariarosaria.corbo@unifg.it (M.R.C.) **\*** Correspondence: barbara.speranza@unifg.it

**Abstract:** Fermentation is one of the oldest methods to assure the safety and quality of foods and to prolong their shelf life. However, a successful fermentation relies on correct kinetics, which depend on several factors (i.e., ingredients, preservatives, temperature, inoculum of starter cultures). Predictive microbiology is a precious tool in modern food safety and quality management; based on the product characteristics and the conditions occurring in food processing, the inactivation of or increase in microbial populations can be accurately predicted as a function of the relevant intrinsic or extrinsic variables. The main aim of this study was the optimization of the formula of a smoked fermented fish product using predictive modeling tools (tertiary and secondary models) in order to define the role of each factor involved in the formulation and assure a correct course of fermentation. Product optimization was conducted through the software Food Spoilage and Safety Predictor (FSSP), by modeling the growth of lactic acid bacteria (LAB) as a function of some key parameters such as temperature, pH, salt, liquid smoke, carbon dioxide, and nitrites. The variables were combined through a fractional design of experiments (DoE) (3<sup>k−p</sup>), and the outputs of the software, i.e., the maximal growth rate (μmax) and the time to attain the critical threshold (tcrit), were modeled through a multiple regression procedure. The simulation, through FSSP and DoE, showed that liquid smoke is the most critical factor affecting fermentation, followed by temperature and salt. Concerning temperature, fermentation at 20–25 °C is advisable, although a low fermentation temperature is also possible. Other parameters are not significant.

**Keywords:** predictive microbiology; FSSP; DoE; smoke; fermentation; fish

#### **1. Introduction**

The application of fermentation to fish goes back many thousands of years, with evidence of fermented fish products in ancient Greece (*aimeteon*) and in the Roman era (*garum*). However, although the fermentation process was once used only as a preservation method, today, it is also used for health purposes. In fact, a recent trend in the design of innovative fish products is the identification of fish formulas fermented with appropriate starter cultures (sometimes with probiotic or other functional traits) to produce foods with improved sensory attributes, high nutritive value, and health benefits [1,2].

Fermented fish products usually have unique characteristics, especially in terms of aroma, flavor, and texture: this is the result of the transformation of organic materials into simpler compounds by the activity of microorganisms or enzymes during the fermentation process. The fermented products are more digestible, at the same time preserving their nutritional properties, even better than unfermented raw fish. These foods are a source not only of proteins, amino acids, and polypeptides, but also of minerals (calcium and iron), some B group vitamins, and, above all, polyunsaturated fatty acids [3].

**Citation:** Racioppo, A.; Campaniello, D.; Sinigaglia, M.; Bevilacqua, A.; Speranza, B.; Corbo, M.R. Use of Food Spoilage and Safety Predictor for an "A Priori" Modeling of the Growth of Lactic Acid Bacteria in Fermented Smoked Fish Products. *Foods* **2022**, *11*, 946. https://doi.org/10.3390/foods11070946

Academic Editors: Carlos Vilas, Míriam R. García and Jose A. Egea

Received: 28 February 2022 Accepted: 21 March 2022 Published: 25 March 2022

**Publisher's Note:** MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

**Copyright:** © 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

Unfortunately, most fermented fish products are still local and not so easily found nationwide, with Asia and Africa being the main producers; only some fish sauces and shrimp pastes are widely known in Europe [4]. Each country has its own types of fermented fish products characterized by different production processes and formulations.

Recently, these products have been gaining increasing interest among consumers due to their healthy characteristics, supported by several studies in the literature [3,5–15]. Apart from the well-known health effects of fish meat, correct fermentation, led by protechnological microorganisms, could result in the biofortification of amino acids and peptides with some physiological and beneficial functions (antioxidant, antihypertensive, antiproliferative, hypoglycemic, immune-stimulating, and anticoagulant effects) [1]. In addition, a correct and predictable process piloted by starter cultures, either autochthonous or commercial, can limit the production of undesired metabolites and produce a good product; entrusting the quality of fermented products to spontaneous fermentation (natural microbiota) is no longer advisable, because of the very high risk of arrested fermentation and of health problems. This is why recent research is increasingly focused on the selection of microorganisms able to guarantee the best performances [16–18]. The most promising microorganisms selected as starters are generally those isolated from the native microbiota of traditional products, since they are well adapted to the environmental conditions of the considered food [19–21].

Regardless of the type of starter, the correct course of fermentation is essential to produce a safe and high-quality product. It is also crucial to define the effects of each factor involved in the formulation (ingredients, preservatives, temperature, inoculum of starter cultures) on the evolution of the process. In this context, predictive microbiology stands out as a precious tool in modern food safety and quality management. Based on the product characteristics and the conditions occurring in food processing, the inactivation of or increase in microbial populations in foods, as a function of the relevant intrinsic or extrinsic variables, can be accurately predicted [22]. A new frontier goal in predictive microbiology is the use of models and databases for product optimization, as also shown by the authors of some seafood applications [23,24].

Generally, the optimization of new formulas relies on different steps and phases and could start with an "a priori" modeling as a step prior to actual challenge tests at the laboratory level, followed by scaling up and optimization. However, before planning and executing experiments, it is advisable to understand the role of most of the variables involved, and to perform a screening to exclude less significant or non-significant factors. The reduction in the number of variables is important from a mathematical point of view, because the authors' experience suggests that the use of many variables could lead to confounding phenomena; in addition, a priori modeling could help researchers minimize the number of experiments required in the lab, as well as helping them to better define the conditions of assays.

Therefore, the main aim of this study was an "a priori" optimization of the formula of a smoked fermented fish product using predictive modeling tools to define the role of each factor involved in the formulation (temperature, pH, salt, liquid smoke, carbon dioxide, and nitrites) for a correct course of fermentation. The variables were combined through a fractional design of experiments (DoE) (3<sup>k−p</sup>), and the outputs of the software, i.e., the maximal growth rate (μmax) and the time to attain the critical threshold (tcrit), were modeled through a multiple regression procedure.

#### **2. Materials and Methods**

#### *2.1. Software and Design*

Product optimization was conducted through Food Spoilage and Safety Predictor (FSSP), a free software package available as an additional tool in ComBase or on the website of the developers (http://fssp.food.dtu.dk/, accessed on 6 September 2021).

The focus was the modeling of lactic acid bacteria (LAB) growth as a function of some key parameters for a fermented fish product, including intrinsic and extrinsic factors (pH and temperature), processing parameters (salt, liquid smoke, nitrites, or carbon dioxide), and parameters connected to LAB growth (relative lag time). The choice of liquid smoke relies on the fact that consumer appeal towards smoked fish products is increasing, and the use of liquid smoke instead of the traditional method is advisable because it is more eco-friendly, while carbon dioxide is crucial to simulate fermentation and storage under anoxic conditions. Finally, nitrites are the preservatives generally used in fermented meat products to counteract *Clostridium botulinum* growth. The variables were combined through a fractional design (3<sup>k−p</sup>), and each factor had three levels (minimum or "−1", mean value or "0", and maximum or "+1"), as shown in Table 1. The combination of these variables resulted in a design consisting of 243 runs, as shown in the Supplementary Materials File S1. The input conditions for modeling were as follows: initial concentration of LAB, 5 log CFU/g; maximal population density, 9 log CFU/g; critical threshold, 8.9 log CFU/g; weak acids, 0 ppm. For modeling purposes, weak acids were excluded to reduce the number of variables to 7; in a mixed design, the randomization of 7 variables set at 3 levels produces a total of 243 combinations and makes it possible to estimate both the individual effects of each variable and their two-way interactions.


**Table 1.** Coded levels of the variables.

An increase in the number of variables causes an increase in the number of the combinations of the design (up to 729 combinations with 9 variables); a secondary effect of increasing the number of variables is the possibility of a statistical artifact due to the high number of individual and interactive terms. Thus, weak acids were not used for this design, because they are added at the end, and not throughout fermentation, to prolong the shelf life of the product.
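As an illustration, a 3<sup>(7−2)</sup> fractional design with 243 runs can be built by taking a full three-level factorial on five base factors and deriving the remaining two factors as mod-3 linear combinations; the generators below are hypothetical, not necessarily those used to build File S1:

```python
from itertools import product

# Full 3-level factorial on 5 base factors (3^5 = 243 runs); the two
# remaining factors are generated as mod-3 linear combinations of the
# base factors -- a standard construction for a 3^(7-2) fractional design.
# NOTE: the generators below are illustrative, not the paper's actual ones.
levels = (0, 1, 2)                       # later mapped to coded -1/0/+1
runs = []
for x1, x2, x3, x4, x5 in product(levels, repeat=5):
    x6 = (x1 + x2 + x3) % 3              # generator for factor 6
    x7 = (x1 + 2 * x2 + x4) % 3          # generator for factor 7
    runs.append(tuple(v - 1 for v in (x1, x2, x3, x4, x5, x6, x7)))

print(len(runs))                         # 243 combinations, 7 factors each
```

Each tuple then maps back to actual values (e.g., coded "+1" for temperature to 25 °C) before being entered into FSSP.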

The tool works on psychrotolerant LAB (*Latilactobacillus sakei*; *Latilactobacillus curvatus*; *Lactobacillus* spp.; and other related genera from autochthonous microbiota of fish) [25].

#### *2.2. Modeling*

All 243 combinations built by the randomization of the variables were used as inputs for the software, and for each combination, the maximal growth rate (μmax) and the time to attain the critical threshold (tcrit), as an indirect measure of the time required to attain the steady state, were evaluated; the values are presented in Supplementary Materials File S1. Then, μmax and tcrit were modeled through a multiple regression procedure, using the option DoE (design of experiments) of the software Statistica for Windows (Version 7, Tulsa, OK, USA), to obtain a second-order model as follows:

$$y = B_0 + \sum B_i x_i + \sum B_{ii} x_i^2 + \sum B_{ij} x_i x_j \tag{1}$$

where "*y*" is the modeled dependent variable (μmax and tcrit); *B<sub>i</sub>*, *B<sub>ii</sub>*, and *B<sub>ij</sub>* are the coefficients of the model, associated with the independent variables; the term *x<sub>i</sub>* highlights the individual effect of each predictor; the terms *x<sub>i</sub>*<sup>2</sup> and *x<sub>i</sub>x<sub>j</sub>* indicate the quadratic and interactive effects. Variables showing a significance <95% (*p* > 0.05) were not included in the equation by the software; moreover, due to the high number of variables, interactive terms were excluded.
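Since interactive terms were excluded, Equation (1) reduces to an intercept plus linear and quadratic terms, which can be fitted by ordinary least squares. The sketch below is illustrative (the function name, toy factors, and coefficient values are not from the paper; the study used Statistica's DoE module):

```python
import numpy as np

def fit_second_order(X, y):
    """Least-squares fit of Eq. (1) without interactive terms:
    y = B0 + sum(Bi * xi) + sum(Bii * xi**2).
    X: (n_runs, n_factors) coded levels; y: (n_runs,) response (mu_max or t_crit)."""
    design = np.hstack([np.ones((X.shape[0], 1)), X, X ** 2])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    return coef  # [B0, B1..Bk, B11..Bkk]

# Synthetic check: recover known coefficients from a toy 2-factor design.
rng = np.random.default_rng(0)
X = rng.choice([-1.0, 0.0, 1.0], size=(50, 2))
y = 2.0 + 1.5 * X[:, 0] - 0.5 * X[:, 1] ** 2
print(np.round(fit_second_order(X, y), 3))
```

With noise-free synthetic data the known coefficients are recovered exactly, which is a quick sanity check before fitting the 243 FSSP outputs.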

The effect of each independent variable on the outputs was also evaluated through the individual desirability functions, estimated as follows:

$$d = \begin{cases} 0, & y \le y_{unfav} \\ \dfrac{y - y_{unfav}}{y_{fav} - y_{unfav}}, & y_{unfav} \le y \le y_{fav} \\ 1, & y \ge y_{fav} \end{cases} \tag{2}$$

where *yunfav* and *yfav* are the most unfavorable (high value of tcrit and low value of μmax) and the most favorable (low value of tcrit and high value of μmax) values of the dependent variables.
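Equation (2) translates directly into code; the clamped linear form below handles both orientations (high μmax favorable, low tcrit favorable), since the sign of the denominator flips with the direction of desirability:

```python
def desirability(y, y_unfav, y_fav):
    """Individual desirability of Eq. (2): 0 at the least favorable value,
    1 at the most favorable, linear in between. Works whether the favorable
    value is above (mu_max) or below (t_crit) the unfavorable one."""
    d = (y - y_unfav) / (y_fav - y_unfav)
    return min(max(d, 0.0), 1.0)

# mu_max: higher is better (illustrative range 0-0.6 per hour)
print(desirability(0.3, y_unfav=0.0, y_fav=0.6))    # 0.5
# t_crit: lower is better (illustrative range 10-100 h)
print(desirability(55.0, y_unfav=100.0, y_fav=10.0))  # 0.5
```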

#### **3. Results and Discussion**

The use of tertiary models to predict safety and food shelf life has also been regarded by the European Union as a tool for quality assurance. The tertiary model used in this work was originally designed for seafood products [26,27], but it has recently been updated for other foods (for example, cheese), and its name has been changed to FSSP. In the latest version, it contains a collection of several options/models, including RRS models (relative rate of spoilage), models to predict spoilers (*Photobacterium phosphoreum*, *Morganella psychrotolerans*, *Morganella morganii*, *Shewanella putrefaciens*) and biogenic amine formation, pathogens (*Listeria monocytogenes*), and lactic acid bacteria. As observed in the study of Mejlholm et al. [28], predictions performed through the FSSP software have good precision when the complexity of the growth model matches the complexity of the foods of interest; this was also confirmed by the study of Bolívar et al. [29] on the modeling of the growth of *L. monocytogenes* in Mediterranean fish species, where the software was used to estimate μmax based on the values of pH, a<sub>w</sub>, storage temperature, and atmospheric conditions of the studied fish species. The goodness of fit of FSSP was regarded as acceptable for *Lactobacillus* spp. (now *Lactobacillus* and related genera); in addition, the tool is flexible and covers a high number of parameters, and for some of them (weak acids and preservatives), it reliably allows the substitution of some compounds with others chosen by users [25].

LAB, mainly psychrotolerant strains, grow to high concentrations during the storage of seafood, resulting in the production of off-odors, gas, slime, and undesired metabolites [25,30]. However, various positive and pro-technological species and strains can contribute to shelf life and healthy characteristics, because of their metabolism and health-promoting properties [21,23,30,31]. Apart from strain characterization and validation in some matrices, to the best of the authors' knowledge, there are no models able to predict the performance of LAB during fish fermentation. These models, hereby labeled as a priori functions, could be extremely useful in the first phase of product design, because they could help researchers to choose only some variables for laboratory confirmation.

It is important to include in such models both the matrix's variables and processing factors (pH, temperature, carbon dioxide), including preservatives used in the process (salt and nitrites). Another variable tested here was the relative lag time (RLT). This parameter is different from the classical lag time: it relies on the physiological state of cells introduced into a new environment (for a starter culture, from a bioreactor to food) and should be read as the "amount of work done by cells to adapt to the new environment" [32]. From a theoretical and practical point of view, RLT is at its minimum when the microorganism has a very short lag phase and grows at its optimal growth rate, and it can be 0 when the microorganism has no lag phase.

The inclusion of RLT as a main variable of the model relies on the fact that, in a guided fermentation, it is crucial to inoculate well-adapted microorganisms into the matrix (low RLT), or to use them in an environment with the same characteristics as the media used to produce their biomass. This factor was described in the past as the physiological state function of the Baranyi and Roberts model and was shown to strongly affect the growth curve [33].
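For concreteness, RLT is often defined as the lag time expressed in multiples of the generation time, i.e., RLT = lag · μmax/ln 2. The converter below assumes this definition (FSSP's exact parameterization may differ):

```python
import math

def lag_from_rlt(rlt, mu_max):
    """Lag time implied by a relative lag time, under the common definition
    RLT = lag / generation time, with generation time = ln(2) / mu_max.
    NOTE: this definition is an assumption; FSSP may parameterize RLT
    differently. mu_max is the specific growth rate in ln units per hour."""
    return rlt * math.log(2) / mu_max

# A well-adapted starter (RLT = 0) has no lag at all:
print(lag_from_rlt(0.0, 0.5))   # 0.0
```

Under this reading, a high RLT penalizes the starter by the equivalent of several generations of "adaptation work" before growth begins.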

The steps taken in this research are similar to those used when a secondary model of predictive microbiology is developed: the first step was the evaluation of the growth curves under some conditions; then, the fitting parameters of the growth curves were modeled as a function of pH, temperature, composition, etc. In this research, FSSP substitutes for the first step.

Figure 1 shows an example of the growth curves (combinations S34 and S41) predicted for LAB using FSSP. For each combination studied, the outputs of the software (μmax, maximal growth rate, and tcrit, time to attain the critical threshold of 8.9 log CFU/g) were recorded and subsequently used as input data for a multiple regression procedure.

**Figure 1.** Example of growth curves (combinations S34 and S41) predicted for LAB using FSSP.

The growth rate is the classical parameter for secondary models, but tcrit was added in this research. To the authors' knowledge, this is the first time tcrit has been included in a modeling study of LAB cultures. Put simply, tcrit could be defined as the time to attain the steady state; thus, it is an indirect measure of the performance of starter cultures and of their acidification kinetics. Moreover, the time to attain the steady state could have other implications related to the Jameson effect: a dominant group (a starter culture) could stop the growth of the other subpopulations (spoilage microorganisms) and induce a stationary phase, thus reducing the maximum level they could attain [34]. For a starter culture, the Jameson effect, and thus the time to attain the steady state, could have a strong effect on the inhibition of undesirable microorganisms and on bioprotective effects.
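As a simplified stand-in for the FSSP output, tcrit has a closed form under a plain logistic curve in log10 counts using the inputs of Section 2.1 (N0 = 5, Nmax = 9, threshold = 8.9 log CFU/g). This is an illustrative model, not FSSP's actual one:

```python
import math

def t_crit(mu_max, lag=0.0, n0=5.0, n_max=9.0, threshold=8.9):
    """Time (h) for a logistic growth curve expressed in log10 CFU/g to
    reach the critical threshold -- an illustrative stand-in for the FSSP
    output, using the paper's inputs (N0 = 5, Nmax = 9, threshold = 8.9).
    mu_max is the specific growth rate in ln units per hour."""
    num = 10 ** (n_max - n0) - 1.0        # distance from inoculum to ceiling
    den = 10 ** (n_max - threshold) - 1.0  # residual distance at the threshold
    return lag + math.log(num / den) / mu_max

print(round(t_crit(mu_max=1.0), 2))   # ~10.56 h with the paper's inputs
```

The closed form makes the qualitative behavior explicit: halving μmax doubles the growing time, and any lag adds linearly, which is consistent with tcrit tracking the factors that slow the starter.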

The first output of the statistical approach used was the table of standardized effects (Table 2), which shows the statistical weight of each studied factor; in particular, it summarizes the effects of the linear (L), quadratic (Q), and interactive terms of RLT (relative lag time), temperature, concentration of NaCl, liquid smoke, and the amount of nitrite and CO2 at equilibrium in the headspace on the maximal growth rate (μmax) and on tcrit of LAB in fish products. The standardized effect was evaluated as the ratio of the mathematical coefficient of each factor (from multiple regression) vs. its standard error.

**Table 2.** Standardized effects of linear (L), quadratic (Q), and interactive terms of RLT (relative lag time), temperature, concentration of NaCl, liquid smoke, and the amount of nitrite and CO2 at equilibrium in the headspace on the maximal growth rate (μmax) and time to attain the steady state (tcrit) of lactic acid bacteria in fish products. The standardized effect was evaluated as the ratio of the mathematical coefficient of each factor (from multiple regression) vs. its standard error. The degrees of freedom for the *t*-test were 204; R<sup>2</sup><sub>adj</sub>, regression coefficient adjusted for multiple regression. \* Not significant.




For the maximal growth rate, the most significant factor was liquid smoke, followed by temperature and salt; pH, nitrites, RLT, and CO2 did not play a significant role. For the time to attain the critical threshold (tcrit), the data point out the significance of liquid smoke, temperature, RLT, and salt. Although only for qualitative purposes, the main benefit of this model, compared to the boundary approach used for LAB in FSSP [25], is that it does not offer a negative overview (combinations of parameters causing growth inhibition); the rationale behind this approach is a positive one (from a mathematical point of view), that is, some of the factors and interactions are related to growth and are the parameters able to cause an increase in the dependent variables (rate and tcrit).

A quantitative estimation of the role of the significant variables could be gained through the use of surface response plots. Concerning the interactions of liquid smoke × temperature (Figure 2a) and liquid smoke × salt (Figure 2b), the growth rate was at its maximum at the highest value of temperature (coded level "+1", 25 °C) and at the lowest values of salt and liquid smoke; however, the model predicted the complete inhibition of LAB growth only for a few combinations of temperature and smoke (level "−1" for temperature and "0.8–1" for liquid smoke, corresponding to actual values of 10 °C and 35–40 ppm of liquid smoke). The surface response plot for the interaction of nitrite × liquid smoke (Figure 2c) underlines the non-significant effect of nitrites. The effect of liquid smoke on the growth of LAB was also confirmed by the results obtained for tcrit, as the highest values of this parameter were predicted for amounts of liquid smoke over 20–25 ppm (data not shown).

Surface plots are a good tool for treating data from a multiple regression, for highlighting the individual and quadratic terms of each factor, and for showing the interactive effects among the different independent variables. However, despite these benefits, it is not possible to see the effect of each variable in isolation, i.e., the effect of each factor is linked to the effects of the other variables. To this end, other approaches could be used, such as desirability, a dimensionless parameter able to answer the following question: how strongly is an output desired? [35] In addition, through mathematical extrapolation, desirability is an important tool to highlight critical values for each factor.


**Figure 2.** Surface response plots for the interactions liquid smoke × temperature (**a**), liquid smoke × salt (**b**), and nitrite × liquid smoke (**c**).

Figures 3 and 4 show the desirability profiles for μmax and tcrit, respectively; the profiles highlight the different quantitative weights of each studied variable. Focusing on the results obtained for μmax (Figure 3), only temperature, salt, and liquid smoke are able to affect fermentation, whereas other parameters such as pH, use of anoxic conditions, and nitrites are not significant. Concerning temperature, fermentation at 20–25 °C is advisable, although a low fermentation temperature is also possible (10–15 °C). This result is corroborated by numerous other studies using mathematical models to predict the growth of different microorganisms in foods at various storage temperatures [36–38], where a strong dependency was found between microbial growth and the storage temperature assayed.

**Figure 3.** Desirability profile for the maximal growth rate (μmax) of lactic acid bacteria in fish products.

Although a temperature of 25 °C is generally not the optimal temperature for LAB, and a further increase in desirability at higher temperatures could be hypothesized, the choice of the range was due to two main reasons: (i) higher temperatures could have masked the effect of other variables because of the strong effect of temperature on starter culture kinetics [39]; (ii) fish fermentation is advisable at a lower temperature to avoid massive production of undesirable compounds, such as biogenic amines [40]. Finally, 25 °C is the maximum temperature level for LAB in FSSP.

As expected, the effect of salt was less significant than that of temperature (at least in the range tested in this study, 0–6%): lactic acid bacteria, in fact, are able to tolerate high salt concentrations, and this tolerance gives them an advantage over other less tolerant species, allowing a rapid start of fermentation [21]. The results (see the NaCl box in Figure 3) show that the decrease in μmax begins to be significant beyond a salt concentration of 3%; however, 3–3.5% (*w*/*w*) is the salt concentration recommended by the Codex Alimentarius [41] in smoked-flavored fish products to avoid the proliferation of spoilage bacteria. Both Figures 3 and 4 point out that a smoke concentration >20 ppm (coded level 0) is not advisable because it could result in a significant delay of starter growth. This inhibitory effect is probably attributable to the bactericidal effects of smoke components (phenols, polycarboxylic acids) on most types of bacteria and/or fungi, including LAB [42]. Many other recent studies have focused on the use of liquid smoke in fish products [43–45], since consumer appeal towards these foods has been increasing recently. The traditional smoking method is under scrutiny since it can cause the production of harmful compounds, such as polycyclic aromatic hydrocarbons, mainly benzo(a)pyrene [46]. Consequently, the use of liquid smoke is advisable because it is more eco-friendly, requires a shorter smoking time, and ensures a longer product durability [44].

**Figure 4.** Desirability profile for the time to attain the steady state (tcrit) of lactic acid bacteria in fish products.

In this study, model building was based on a completely randomized design, which was able to estimate possible interactive and quadratic effects. This type of design is useful for screening purposes, and to reduce the number of variables; however, for robust predictive equations, other designs (for example, central composite design) are advisable, because a higher number of levels allows for good precision in the prediction, at least in the ranges tested. Thus, further investigations are required in this field, in order to assess the significance of other factors not considered in FSSP (food structure, food components, effects of natural microbiota, etc.) which could strongly and significantly affect the growth/survival of lactic acid bacteria.

#### **4. Conclusions**

The design of a fermented product is a complex process, and the definition of the formulation is a critical step, above all for the performance of the starter cultures. The use of LAB is a promising way to valorize seafood products and to design safe, stable, and consumer-appealing products; however, some variables should be considered.

The use of liquid smoke and the amount of phenols are critical, because the simulation through FSSP suggested that there is a critical amount (20 ppm) after which the fermentation could experience a delay.

Concerning temperature, fermentation at 20–25 °C is advisable because of higher growth rate values, although a low fermentation temperature is also possible. The model, in fact, also predicted growth at 10–15 °C.

RLT could strongly affect the performances of a starter culture, thus suggesting that the physiological state of cells could affect their performances during fish fermentation.

Other parameters are not significant (at least those tested in this case study: pH, use of anoxic conditions, nitrites), and if they are included in the formulation, their amount should be determined through other considerations (chemistry, costs, safety for consumers, etc.).

This model could be the background for future studies devoted to the optimization of fish fermentation, as it offers some details on the effect of many factors, although it is important to consider that each a priori modeling should be followed by validation, to check if the mathematical effects have the same weight in real systems, or if other factors could play a role.

**Supplementary Materials:** The following supporting information can be downloaded at: https://www.mdpi.com/article/10.3390/foods11070946/s1, Supplementary Material File S1: Coded levels and combinations used for the simulation of FSSP.

**Author Contributions:** Conceptualization, A.B. and B.S.; methodology, M.R.C., A.B. and B.S.; software, A.B.; investigation, A.R. and D.C.; data curation, A.B.; writing—original draft preparation, A.B. and B.S.; writing—review and editing, A.B., A.R. and B.S.; supervision, M.R.C. and M.S.; funding acquisition, M.R.C. and M.S. All authors have read and agreed to the published version of the manuscript.

**Funding:** This research study was funded by the Puglia region through the project PO FEAMP 2014/2020—Measure 1.26 "Valorizzazione di specie ittiche affumicate mediante tecniche tradizionali e innovative" (CUP N. B71B17000990009; project leader: UNCI-Agroalimentare).

**Data Availability Statement:** Not applicable.

**Conflicts of Interest:** The authors declare no conflict of interest.

#### **References**


### *Article* **Consideration of Maintenance in Wine Fermentation Modeling**

**Alain Rapaport <sup>1,\*</sup>, Robert David <sup>2</sup>, Denis Dochain <sup>3</sup>, Jérôme Harmand <sup>4</sup> and Thibault Nidelet <sup>5</sup>**


**Abstract:** We show that a simple model with a maintenance term can satisfactorily reproduce the simulations of several existing models of wine fermentation from the literature, as well as experimental data. The maintenance term describes a consumption of nitrogen that is not entirely converted into biomass. We also show that considering a maintenance term in the model is equivalent to writing a model with a variable yield that can be estimated from data.

**Keywords:** wine fermentation; nitrogen; mathematical modeling; population model; maintenance; variable yield

#### **1. Introduction**

The overall principle of wine fermentation consists of the conversion of sugar into ethanol by yeast. It has long been observed that nitrogen consumed during yeast growth also plays an important role. The fermentation can indeed be modeled as a two-step process in which the yeast first grows on nitrogen as a limiting resource and then degrades the non-limiting sugar into ethanol and carbon dioxide. However, experimental observations have shown that the consumed nitrogen is not entirely converted into biomass. Several mathematical models were proposed to take these characteristics into consideration. For instance, in [1,2], the biomass growth follows a logistic law whose carrying capacity depends on the initial quantity of nitrogen. In [3], a model was developed that distinguishes the part of nitrogen used for yeast growth from another part responsible for the synthesis of proteins (hexose transporters [4]). Both models were calibrated with different sets of experimental data and provide a satisfactory fit. However, both models present some drawbacks. The dependency of the dynamics on the initial condition in the first model makes it sensitive to precise knowledge of the initial quantity of nitrogen (which needs to be "memorized" in the dynamical equations of the model). Moreover, it does not allow consideration of non-batch operations or continuous addition of nitrogen, such as in [5], for instance. The second model relies on knowledge of the time-varying concentration of transporters, which is in general not easily accessible for experimental measurements, and several assumptions were necessary to estimate it from biomass measurements.

The objective of the present work is to propose a new model that reconciles both approaches in a single one.

The observation of the ratio of produced biomass over nitrogen consumption along the whole fermentation, determined from experimental databases or numerical simulations of models [1,3], shows that this ratio is non-constant and depends on the initial quantities. This highlights that the conversion of nitrogen into biomass can be viewed as a variable yield process. The experimental evidence that nitrogen is not entirely converted into biomass therefore advocates for the consideration of a maintenance term in the modeling (see, for instance, [6]), without necessarily requiring a detailed representation of the internal mechanisms of cells.

**Citation:** Rapaport, A.; David, R.; Dochain, D.; Harmand, J.; Nidelet, T. Consideration of Maintenance in Wine Fermentation Modeling. *Foods* **2022**, *11*, 1682. https://doi.org/ 10.3390/foods11121682

Academic Editors: Carlos Vilas, Míriam R. García and Jose A. Egea

Received: 20 April 2022 Accepted: 17 May 2022 Published: 8 June 2022

**Publisher's Note:** MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

**Copyright:** © 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https:// creativecommons.org/licenses/by/ 4.0/).

Indeed, different mechanisms in the internal functioning of the cells have been investigated in the literature, particularly the role of carbohydrate accumulation [7–9], which could explain why the growth dynamics of yeast in wine fermentation do not follow the classical mass-balanced models [10,11]. However, the measurements of these biochemical compounds are experimentally very difficult and almost impossible in an industrial framework.

The rationale of the results presented here is to test whether the introduction of a maintenance term (see [12–14] or [15–17]) can improve wine fermentation modeling. One of the original features of the proposed approach is to view nitrogen consumption as a global consumption for growth by considering a variable yield. This allows us to avoid specifying a particular structure for the maintenance. Thus, the purpose of the present work is to investigate the ability of a simpler model with a maintenance term to reproduce and predict wine fermentation kinetics.

Here, we propose a new modeling approach based on a maintenance term (which gives rise to a variable yield), a feature that, to the best of our knowledge, has not yet been considered in the wine fermentation literature. The approach focuses mainly on a new model of the growth of yeast on nitrogen.

This new model was validated using both data generated by existing models (Section 4) and experimental data (Section 5).

#### **2. The Proposed Model**

We denote by *N*, *S*, *E*, CO2 and *X* the concentrations of (total) nitrogen, sugar, ethanol, carbon dioxide and biomass, respectively. For simplicity, we derive a model under isothermal conditions.

For the first step *N* → *X* (yeast growth on nitrogen), we propose the following equations

$$\frac{dX}{dt} = \mu\_N(N, X)X \tag{1}$$

$$\frac{dN}{dt} = -\frac{\mu\_N(N,X)X}{Y} - m(N,X)X \tag{2}$$

where *Y* is the growth yield and *μ<sub>N</sub>* the Contois growth function

$$
\mu\_N(N, X) = \frac{\mu\_N^{max} N}{N + K\_N X}
$$

and *m* a *maintenance* function, which is positive for *N* > 0 and *X* > 0. We choose here a ratio-dependent kinetic function *μ<sub>N</sub>* to reproduce the observation that growth slows down under an excess of yeast, with a Contois expression as in [3]. In the literature, the maintenance *m* is often considered constant [12,13], which was validated in continuous culture (chemostat). In general, continuous cultures are intended to be operated at a stationary phase, very differently from the batch-operating mode. However, as already investigated in [17], maintenance terms have to depend on the level of the available resource, say *R* (*N* here). In particular, a constant maintenance in a batch model would imply d*R*/d*t* < 0 when the resource is exhausted, i.e., *R* = 0, and thus *R* could take unrealistic negative values, as underlined in [14]. In [15,16], the maintenance is directly related to the microbial activity, which stops in the absence of nutrients. This is why we consider a maintenance function proportional to the growth activity, with a factor that might depend on the nitrogen concentration (one may expect that it decreases when the substrate *N* becomes rare)

$$m(N, X) = \alpha(N)\mu\_N(N, X)$$

where *α* is a positive function equal to zero for *N* = 0. Then, one can consider the function *y* defined as follows

$$y(N) := \frac{Y}{1 + \alpha(N)Y}, \quad N \ge 0$$

Formally, model (1) and (2) can be rewritten equivalently as

$$\frac{dX}{dt} = \mu\_N(N, X)X\tag{3}$$

$$\frac{dN}{dt} = -\frac{\mu\_N(N, X)X}{y(N)}\tag{4}$$

where the function *y* plays the role of a *variable yield*. Identifying the function *m* or the function *y* is thus formally equivalent. However, we shall see in the next section that identifying the function *y* instead of *m* presents some practical advantages.
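For readers who wish to experiment with the variable-yield form (3) and (4), the system can be integrated with any standard ODE solver. The sketch below uses scipy in place of the scilab routines used later by the authors; the parameter values and the shape of *y* are illustrative placeholders, not the identified values of the later sections.

```python
import numpy as np
from scipy.integrate import solve_ivp

# Illustrative parameter values only (not taken from the paper's tables)
MU_MAX, K_N = 0.2, 2.0           # Contois parameters
A_Y, B_Y, N_DAG = 8.0, 1.5, 0.1  # parameters of an illustrative variable yield

def mu_N(N, X):
    """Contois (ratio-dependent) specific growth rate, Eq. (1)."""
    return MU_MAX * N / (N + K_N * X) if N > 0 else 0.0

def y_var(N):
    """Illustrative variable yield, increasing as N drops below N_DAG."""
    return A_Y + B_Y * max(0.0, (N_DAG - N) / N_DAG) ** 2

def rhs(t, z):
    X, N = z
    g = mu_N(N, X) * X          # growth term of Eq. (3)
    return [g, -g / y_var(N)]   # Eq. (4): consumption with a variable yield

sol = solve_ivp(rhs, [0.0, 350.0], [0.01, 0.3], max_step=0.5)
X, N = sol.y
```

With this structure, biomass increases and nitrogen monotonically decreases toward zero, reproducing the qualitative behavior described above.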

For the second step *S* → *E* + CO2, we follow the model proposed in the literature [3]

$$\frac{dE}{dt} = \frac{d\mathbf{CO}\_2}{dt} = \left[\mu\_N(N, X) + \beta\nu\_E(E)\right]\mu\_S(S)X \tag{5}$$

$$\frac{dS}{dt} = -k\frac{dE}{dt}\tag{6}$$

where *μ<sub>S</sub>* is a Monod function and *ν<sub>E</sub>* a function inhibited by ethanol

$$
\mu\_S(S) = \frac{\mu\_S^{\max} S}{K\_S + S} \quad \nu\_E(E) = \frac{1}{1 + K\_E E} \tag{7}
$$

Inhibition of the consumption of sugar *S* by ethanol *E* has been reported many times in the literature [18–22]. The constant yield *k* of CO2 production and *S* consumption follows a mass-balance assumption, verified experimentally [23], that can be determined using thermodynamic considerations [24].

Note that this model can be extended to anisothermal conditions, considering that the maximal specific rates *μ<sub>N</sub><sup>max</sup>*, *μ<sub>S</sub><sup>max</sup>* and the affinity constants *K<sub>S</sub>*, *K<sub>E</sub>* are temperature dependent, as in [3].

#### **3. Calibration of the Model**

From model Equation (1), the parameters of the function *μ<sub>N</sub>* can be identified independently of the yield and maintenance terms. To validate the hypothesis of ratio dependency of the function *μ<sub>N</sub>*, one can first use experimental data to plot the slope of the logarithm of *X* versus the ratio *r* = *N*/*X* and check whether it qualitatively follows a function of the form

$$\mu(r) = \frac{\mu\_N^{\max} r}{\mathcal{K}\_N + r}$$

A classical least-square method can be applied to fit the parameters *μ<sub>N</sub><sup>max</sup>*, *K<sub>N</sub>* on the data. Alternatively, one can plot the inverse of the slope of the logarithm of *X* versus the inverse of the ratio *r* to check whether it qualitatively follows a linear dependency, as obtained from Equation (1)

$$\left(\frac{d\log X}{dt}\right)^{-1} = \frac{1}{\mu\_N^{\max}} + \frac{K\_N}{\mu\_N^{\max}} \left(\frac{N}{X}\right)^{-1} \tag{8}$$

However, for an accurate identification of the parameters *μ<sub>N</sub><sup>max</sup>*, *K<sub>N</sub>*, a linear regression on Equation (8) is expected to be less reliable than a non-linear least-square optimization of the solution *X*(·) of (3), because the data points ((d log *X*/d*t*)<sup>−1</sup>, (*N*/*X*)<sup>−1</sup>) might be far from uniformly distributed.
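As a minimal illustration of the non-linear fit of the Contois parameters, the sketch below generates noisy synthetic (ratio, growth-rate) pairs and recovers the two parameters with scipy's `curve_fit` (playing the role of scilab's `leastsq`); all numerical values are made up for illustration.

```python
import numpy as np
from scipy.optimize import curve_fit

def contois_of_ratio(r, mu_max, K_N):
    # Specific growth rate as a function of the ratio r = N/X (Contois form)
    return mu_max * r / (K_N + r)

rng = np.random.default_rng(0)
mu_true, K_true = 0.25, 1.2          # illustrative "ground truth"
r = np.linspace(0.05, 5.0, 40)       # observed ratios N/X
slope = contois_of_ratio(r, mu_true, K_true) * (1 + 0.02 * rng.standard_normal(r.size))

# Non-linear least squares on the (r, d log X / dt) data
popt, _ = curve_fit(contois_of_ratio, r, slope, p0=[0.1, 1.0])
mu_hat, K_hat = popt
```

The same call, applied to the measured slopes of log *X*, would give estimates of *μ<sub>N</sub><sup>max</sup>* and *K<sub>N</sub>* directly.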

Note from Equations (1) and (2) that one has

$$\lim\_{t \to +\infty} N(t) = 0$$

(because the derivative of *N* cannot vanish when *N* is not exhausted). In the absence of the maintenance term *m*, one gets d*X*/d*t* + *Y* d*N*/d*t* = 0, which implies that one should have

$$Y = \frac{X(+\infty) - X(0)}{N(0) - N(+\infty)} = \frac{X(+\infty) - X(0)}{N(0)}$$

To test the validity of the model with maintenance, one can plot from experimental data the ratio (*X*(+∞) − *X*(0))/*N*(0) for different values of *N*(0) and check that it is not constant. If this is the case, one can then look at identifying a non-constant function *y*. For this purpose, we write from Equations (3) and (4)

$$X(+\infty) - X(0) = -\int\_0^{+\infty} y(N(t)) \frac{dN}{dt}(t) \, dt$$

and as *t* ↦ *N*(*t*) is a monotone-decreasing function, one can make the change of variable *n* = *N*(*t*) in this last integral to obtain

$$X(+\infty) - X(0) = \int\_0^{N(0)} y(n) \, dn$$

Therefore, if one fits a differentiable function *f* such that *f*(0) = 0 that satisfies

$$X(+\infty) - X(0) = f(N(0))$$

for experimental data with different values of *N*(0), then one simply gets *y* = *f*′.
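This identification procedure (fit *f* to the endpoint data, then differentiate to obtain *y*) can be sketched numerically. The values of *a*, *b* and the threshold (written `Nd` in the code) below are made up; the synthetic yield only borrows the quadratic shape used later in Section 4.1.

```python
import numpy as np

# Illustrative yield of the quadratic shape of Section 4.1 (values are not the paper's)
a, b, Nd = 5.0, 0.3, 0.6   # Nd plays the role of the threshold N-dagger

def y_true(n):
    return a + 3.0 * b * (Nd - n) ** 2 / Nd ** 3

def f_true(n0, m=400):
    """Biomass production X(inf) - X(0) = integral of y over [0, n0] (trapezoidal rule)."""
    grid = np.linspace(0.0, n0, m)
    vals = y_true(grid)
    return float(np.sum(0.5 * (vals[1:] + vals[:-1]) * np.diff(grid)))

# "Experiments" with different initial nitrogen levels (all below Nd)
N0 = np.linspace(0.05, 0.55, 12)
dX = np.array([f_true(n0) for n0 in N0])

# Fit a cubic f with f(0) = 0 by least squares, then differentiate: y = f'
M = np.column_stack([N0, N0 ** 2, N0 ** 3])
c1, c2, c3 = np.linalg.lstsq(M, dX, rcond=None)[0]

def y_hat(n):
    return c1 + 2.0 * c2 * n + 3.0 * c3 * n ** 2
```

Note that no knowledge of the kinetics *μ<sub>N</sub>* is needed at any point of this procedure, which is the robustness advantage discussed below.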

Let us underline that identifying the function *y* in this way can be achieved independently of the knowledge of the kinetics *μ<sub>N</sub>*, unlike the function *m*; this clearly presents some robustness advantages. Once the function *μ<sub>N</sub>* is identified, the maintenance function can then be determined as

$$m(N, X) = \left(\frac{1}{y(N)} - \frac{1}{Y}\right) \mu\_N(N, X)$$

where *Y* = *y*(0) (to fulfill *α*(0) = 0).

For model Equations (5) and (6), the coefficient *k* is kept from the literature, and the parameters *β*, *μ<sub>S</sub><sup>max</sup>*, *K<sub>S</sub>*, *K<sub>E</sub>* are identified (with a least-square method) from experimental data of the CO2 production rate.

#### **4. Validation of the Model on Synthetic Data**

We have used synthetic data generated by models of the literature that were previously validated on experimental data [1,3] for a range of initial conditions and operating conditions.

Fitting comparisons of the proposed model with the different data sets are reported in Section 6.

#### *4.1. Validation on Simulations of a Model with Transporter*

We have considered the model with transporters developed in [3], which is more complex with two additional state variables: the concentrations of hexose transporters and the nitrogen dedicated to these transporters. Data were generated by simulating this model with the parameters given in [3] and operating conditions given in Table 1.


**Table 1.** Operating conditions for the simulation of the model with transporters.

This model explicitly distinguishes two forms of nitrogen, one available for the yeast, *N<sub>X</sub>*, and the other one, *N<sub>tr</sub>*, for the transporters. To compare with the variable *N* of our model, we have considered the total nitrogen *N* = *N<sub>X</sub>* + *N<sub>tr</sub>*.

#### 4.1.1. Estimation of the Contois Function

We have used a non-linear least-square method based on a Newton algorithm with a finite-difference approximation of the Jacobian matrix (function leastsq of scilab). Figure 1 shows a good fit of the Contois function *μ<sub>N</sub>* on the data (*N*/*X*, (d*X*/d*t*)/*X*) of the transporter model, with the parameters given in Table 2.

**Figure 1.** Result of the fitting of the Contois function on data from the model with transporters.

**Table 2.** Parameters of the Contois function *μN*.


#### 4.1.2. Estimation of the Variable Yield Function

In Figure 2, the data *X*(*T*) − *X*(0) versus *N*(0) from the model with transporters are plotted for *T* = 350 h (we checked that *N* is quasi-null at *T* and that *X* no longer increases after *T*). One can see that the points are aligned. However, the line passing through these points does not pass through the origin, which rules out a constant yield: with a constant yield, the points would have to lie on a line through the origin, because no biomass is produced when *N*(0) = 0.

Then, we fitted a *C*<sup>2</sup> function *f* such that *f*(0) = 0 with the following expression

$$f(N) = \begin{cases} aN + b\left(1 - \left(\frac{N\_\dagger - N}{N\_\dagger}\right)^3\right), & N < N\_\dagger\\ aN + b, & N \ge N\_\dagger \end{cases}$$

whose parameters are given in Table 3.

The calibration of the parameters *a*, *b* of the function *f* was performed with a linear regression (function reglin of scilab).

**Table 3.** Parameters of the variable yield function *y*.


Then, we obtain the variable yield function *y* as the *C*<sup>1</sup> function

$$y(N) = f'(N) = \begin{cases} a + b \frac{3(N\_\dagger - N)^2}{N\_\dagger^3}, & N < N\_\dagger \\ a, & N \ge N\_\dagger \end{cases}$$

and the function *α*, which describes the maintenance as

$$\alpha(N) = \frac{1}{y(N)} - \frac{1}{y(0)} = \begin{cases} \frac{N\_\dagger^3}{aN\_\dagger^3 + 3b(N\_\dagger - N)^2} - \frac{N\_\dagger}{3b + aN\_\dagger}, & N < N\_\dagger\\ \frac{3b}{a(3b + aN\_\dagger)}, & N \ge N\_\dagger \end{cases}$$

which are both depicted on Figure 3.

**Figure 3.** Graphs of the obtained variable yield function *y* and of the function *α*.
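As a numerical sanity check of these piecewise expressions, the sketch below implements *y* and *α* with placeholder values of *a*, *b* and *N*† (the identified values are in Table 3) and verifies continuity at the threshold.

```python
# Illustrative values for a, b and the threshold N-dagger (the fitted ones are in Table 3)
a, b, Nd = 5.0, 0.3, 0.6

def y(N):
    """C1 variable yield y = f'(N), constant above the threshold."""
    return a + 3.0 * b * (Nd - N) ** 2 / Nd ** 3 if N < Nd else a

def alpha(N):
    """Maintenance factor alpha(N) = 1/y(N) - 1/y(0), zero at N = 0."""
    return 1.0 / y(N) - 1.0 / y(0.0)
```

As expected, *α* vanishes at *N* = 0 (so that *Y* = *y*(0)) and increases with *N*, since *y* decreases toward its plateau value *a*.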

Note that the model with transporters was validated only for *N*(0) in the interval [0.071, 0.57] g·L<sup>−1</sup>, and we have no a priori information about the behavior of the yield for values of *N*(0) smaller than 0.071 g·L<sup>−1</sup>. The threshold parameter *N*† was simply chosen so that the simulations of the variables *X* and *N* of the model (3) and (4) were the closest to those of the transporter model.

4.1.3. Estimation of the Other Parameters and Comparison of the Models

For the model of the second step *S* → *E* + CO2, the stoichiometric parameter *k* was taken from the literature, while the other parameters *β*, *μ<sub>S</sub><sup>max</sup>*, *K<sub>S</sub>*, *K<sub>E</sub>* were estimated with a least-square optimization on the CO2 chronicles only (the CO2 production rate being a variable that is usually measured in experiments), starting from the values in [3]. Values are given in Table 4.

**Table 4.** Parameters for the second step *S* → *E* + CO2 model.


Here, we also used a non-linear least-square method based on a Newton algorithm with a finite-difference approximation of the Jacobian matrix (function leastsq of scilab). All data were re-normalized to 1 (i.e., for each variable, the values were divided by the largest one).

Finally, we present in Figures 4–6 simulations of the new model for three widely different initial values of nitrogen, from 0.170 g·L<sup>−1</sup> to 0.567 g·L<sup>−1</sup>. The evolution of the ethanol concentration *E* is not shown as it is proportional to the CO2 concentration.

These simulations show the ability of the new model to produce, with a single set of parameters, simulations close to those of the model with transporters, in terms of production of biomass and carbon dioxide, estimation of the peak of the CO2 production rate, and depletion of (total) nitrogen and sugar.

**Figure 4.** Comparison with the model with transporters (in dashed) for *N*(0) = 0.170 g·L<sup>−1</sup> (constant temperature of 24 °C).

**Figure 5.** Comparison with the model with transporters (in dashed) for *N*(0) = 0.283 g·L<sup>−1</sup> (constant temperature of 24 °C).

**Figure 6.** Comparison with the model with transporters (in dashed) for *N*(0) = 0.567 g·L<sup>−1</sup> (constant temperature of 24 °C).

#### *4.2. Validation on the SOFA Model*

The model proposed in [1] does not explicitly consider transporters with an additional state variable, as the previous model does; instead, it presents a more sophisticated expression of the dynamics that depends on the initial condition, with an additional latency term at the beginning of the simulations.

Unlike the previous model, which is built as a "mass-balanced" model, this one relies on empirical dynamics of logistic shape for the biomass growth, with some parameters that depend on the initial concentration of nitrogen *N*(0), instead of the two-dimensional model (3) and (4).

Therefore, this is not a Markovian model. It has been validated on different operating conditions and has been encoded into the SOFA software exploited for decision-making [2]. We launched simulations of this model for the same operating conditions as for the previous model (Table 1). Although the simulations look qualitatively similar, they do not overlap, especially for the biomass chronicle. This could be explained by the fact that this model is intended to predict a number of cells and not a precise biomass (an average number of 4.15 × 10<sup>9</sup> cells per gram of biomass was used to have *X* expressed in g·L<sup>−1</sup>, as for the previous model). We proceeded to a new validation of our model on these data.

#### 4.2.1. Estimation of the Contois Function

Figure 7 shows that the data (*N*/*X*, (d*X*/d*t*)/*X*) do not precisely follow the graph of a function (this is most probably due to the fact that the model is not Markovian). Indeed, this happens mainly for large values *N*<sub>0</sub> of the initial nitrogen. We believe that this could be explained by the dynamics of the biomass *X* of this model, which is a logistic law with a carrying capacity given by a heuristic expression that depends on *N*<sub>0</sub>, rather than dynamics coupled with those of *N* (indeed, the interval of tested values of *N*<sub>0</sub> might exceed the domain of validity of this model). However, we fitted the graph of a Contois function to these data with the parameters given in Table 5, which was able to satisfactorily reproduce the trajectories of the model over a large range of values of *N*<sub>0</sub>, as we shall see later on.

As for the previous model, we used a non-linear least-square method based on a Newton algorithm with a finite-difference approximation of the Jacobian matrix (function leastsq of scilab). As one can see in Table 5, the values of *μ<sub>N</sub><sup>max</sup>* and *K<sub>N</sub>* are significantly larger and smaller, respectively, than in Table 2, which is consistent with the observation that this model predicts a faster convergence of the biomass to its maximal value, despite the latency term (compare Figures 4–6 with Figures 8–10).

**Figure 7.** Result of the fitting of the Contois function on data from the SOFA model.

**Table 5.** Parameters of the Contois function *μN*.

**Figure 8.** Comparison with the SOFA model (in dashed) for *N*(0) = 0.170 g·L<sup>−1</sup> (constant temperature of 24 °C).

**Figure 9.** Comparison with the SOFA model (in dashed) for *N*(0) = 0.283 g·L<sup>−1</sup> (constant temperature of 24 °C).

**Figure 10.** Comparison with the SOFA model (in dashed) for *N*(0) = 0.567 g·L<sup>−1</sup> (constant temperature of 24 °C).

4.2.2. Estimation of the Variable Yield Function

Data *X*(*T*) − *X*(0) from the simulation of the SOFA model are plotted in Figure 11 at *T* = 350 h, for different values of *N*(0) in the interval [0.071, 0.57] g·L<sup>−1</sup> (here, we also checked that the fermentation was quasi-ended at *T*). One can see that the points follow an increasing concave curve and then increase very slowly, quite differently from the model with transporters (see Figure 2).

**Figure 11.** Result of the fitting of the function *f* on data from the SOFA model.

We then fitted a *C*<sup>2</sup> function *f* with *f*(0) = 0 with the following expression

$$f(N) = \begin{cases} bN - aN^2, & N < N\_\dagger \\ bN\_\dagger - aN\_\dagger^2 + \frac{A}{B} \left( e^{-BN\_\dagger} - e^{-BN} \right), & N \ge N\_\dagger \end{cases}$$

with

$$A = (b - 2aN\_\dagger)e^{BN\_\dagger}, \quad B = \frac{2a}{b - 2aN\_\dagger}$$

and parameters *a*, *b*, *N*† given in Table 6.

Parameters *a* and *b* were determined with a linear regression (function reglin of scilab).
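The constants *A* and *B* above are precisely those making *f* of class *C*<sup>2</sup> at *N*†. This can be checked numerically, e.g. with the placeholder values below (not the fitted ones of Table 6; the construction requires *b* > 2*aN*†).

```python
import math

# Illustrative values for a, b and the threshold N-dagger; requires b > 2*a*Nd
a, b, Nd = 3.0, 6.0, 0.4
B = 2.0 * a / (b - 2.0 * a * Nd)           # matches f'' on both sides of Nd
A = (b - 2.0 * a * Nd) * math.exp(B * Nd)  # matches f' on both sides of Nd

def f(N):
    if N < Nd:
        return b * N - a * N * N
    return b * Nd - a * Nd * Nd + (A / B) * (math.exp(-B * Nd) - math.exp(-B * N))

def yv(N):
    """Variable yield y = f' on both sides of the threshold."""
    return b - 2.0 * a * N if N < Nd else A * math.exp(-B * N)

# One-sided second derivatives of f at Nd (finite differences) should agree: f is C^2
h = 1e-5
d2_left = (f(Nd - 2 * h) - 2 * f(Nd - h) + f(Nd)) / h ** 2
d2_right = (f(Nd) - 2 * f(Nd + h) + f(Nd + 2 * h)) / h ** 2
```

Both one-sided second derivatives equal −2*a* at the junction, confirming the smooth pasting of the quadratic and exponential branches.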

**Table 6.** Parameters of the variable yield function *y*.


Then, we obtain the expression of the variable yield function

$$y(N) = f'(N) = \begin{cases} b - 2aN, & N < N\_\dagger \\ Ae^{-BN}, & N \ge N\_\dagger \end{cases}$$

as well as the function *α*

$$\alpha(N) = \frac{1}{y(N)} - \frac{1}{y(0)} = \begin{cases} \frac{1}{b - 2aN} - \frac{1}{b}, & N < N\_\dagger \\\\ \frac{e^{B(N - N\_\dagger)}}{b - 2aN\_\dagger} - \frac{1}{b}, & N \ge N\_\dagger \end{cases}$$

whose graphs are drawn on Figure 12.

**Figure 12.** Graphs of the obtained variable yield function *y* and of the function *α*.

4.2.3. Estimation of the Other Parameters and Comparison of the Models

For the second step, the same stoichiometric parameter *k* was taken from the literature, and the other parameters *β*, *μ<sub>S</sub><sup>max</sup>*, *K<sub>S</sub>*, *K<sub>E</sub>* were estimated with a least-square optimization on the CO2 chronicles only, as for the data generated by the model with transporters (see Table 7).

**Table 7.** Parameters for the second step *S* → *E* + CO2 model.


Figures 8–10 show the comparison between the SOFA model and our calibrated model for the same initial conditions as for the former comparison with the model with transporters. Here also, we see that the proposed model reproduces quite faithfully the simulations of the SOFA model, with the advantage of being a simpler Markovian model. Indeed, the difference between the model with transporters and the SOFA model can be translated into different maintenance terms (see Figures 3 and 12): for large values of nitrogen, the model with transporters behaves like a model with a maintenance proportional to the growth, while the SOFA model amounts to a strongly increasing maintenance. Recall that the simulations for the largest value of *N*(0) showed the most differences between these two models (for *N*(0) = 0.567 g·L<sup>−1</sup>, the model with transporters predicts a biomass production of 5.11 g·L<sup>−1</sup>, while the SOFA model predicts 3.88 g·L<sup>−1</sup>; see Figures 6 and 10). While the model with transporters was validated experimentally for *N*(0) in the interval [0.170, 0.567] g·L<sup>−1</sup>, we believe the validation of the SOFA model for initial concentrations of nitrogen larger than 0.4 g·L<sup>−1</sup> might need to be revisited (although our model, once calibrated, is able to reproduce the SOFA simulations).

#### **5. Calibration of the Model on Real Data**

We considered data from experiments conducted at the SPO Lab (INRAE, Montpellier, France) in 2004, which were used to calibrate the model with transporters and the SOFA model (see [1,3]). The data consisted of a set of three experiments with the same operating conditions given in Table 1 and different initial concentrations *N*(0) of nitrogen, exactly the same as for the simulations of Sections 4.1 and 4.2. For each experiment, one had


We first calibrated a function *f*(·) to the data (*N*(0), *X*(*T*) − *X*(0)), with the same expression as in Section 4.2, to determine a yield function *y*(·) (see Figure 13), using a linear regression to estimate parameters *a* and *b*.

**Figure 13.** Results of the fitting of the function *f* on the experimental data (**left**) and of the corresponding variable yield function *y* (**right**).

As we do not have measurements of *N* over the time, we cannot estimate the Contois parameters independently of the CO2 measurements, as we did with the synthetic data. All the parameters of the model were fitted simultaneously with a least-square method (values are given in Table 8), except for the sugar-conversion yield, for which we have used the value of the literature *k* = 2.17, as before.

The non-linear least-square method uses a Newton algorithm with a finite difference approximation of the Jacobian matrix (function leastsq of scilab), and the data set was re-normalized to the maximal value of 1.
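The re-normalization used here and in the previous sections (dividing each variable by its largest measured value so that all data points weigh comparably in the least-square criterion) can be sketched as follows, with made-up measurement vectors:

```python
import numpy as np

# Hypothetical measurements of two variables with very different magnitudes
X_meas = np.array([0.2, 1.1, 2.3, 2.4])       # biomass (g/L)
CO2_meas = np.array([1.0, 22.0, 61.0, 75.0])  # CO2 (g/L)

# Scale factors: each variable is divided by its largest measured value
sX, sC = X_meas.max(), CO2_meas.max()

def residuals(X_model, CO2_model):
    # Stacked normalized residual vector, as fed to a least-squares solver
    return np.concatenate([(X_model - X_meas) / sX,
                           (CO2_model - CO2_meas) / sC])

r = residuals(X_meas * 1.1, CO2_meas * 1.1)  # a 10% model offset everywhere
```

Without the scaling, the CO2 residuals (tens of g/L) would dominate the biomass residuals (a few g/L) in the sum of squares; after scaling, a 10% offset contributes equally through both variables.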



Figures 14–16 show the results of the fitting for the three experiments. One can appreciate the goodness of fit for a unique set of parameters. In particular, the production of biomass and CO2, as well as the height and date of the peak of *d*CO2/*dt*, are well predicted with this model.

**Figure 14.** Simulation for *N*(0) = 0.170 g·L<sup>−1</sup> (experimental data in blue).

**Figure 15.** Simulation for *N*(0) = 0.283 g·L<sup>−1</sup> (experimental data in blue).

**Figure 16.** Simulation for *N*(0) = 0.567 g·L<sup>−1</sup> (experimental data in blue).

#### **6. Fitting Comparisons**

For the calibration of the variable yield function on both synthetic and experimental data (Sections 4 and 5), we used a linear regression (function reglin of scilab) to determine the parameters *a*, *b* of the two expressions of the function *f* (one for the model with transporters, the other for the SOFA model and the experimental data). The residual error is given in Table 9.


This shows that the model with transporters behaves very closely to a variable-yield model. The fitting performances for the SOFA model and the experimental data are more difficult to interpret, because the validity of the SOFA model over the large range of initial concentrations of nitrogen we considered is questionable, and the quantity of experimental data is quite limited compared to the synthetic data.

For the synthetic data, the calibration of the growth characteristics (parameters *μ<sub>N</sub><sup>max</sup>*, *K<sub>N</sub>* of the Contois function) was performed first, independently of the CO2 data. Then, the parameters for the second step (*k*, *β*, *μ<sub>S</sub><sup>max</sup>*, *K<sub>S</sub>*, *K<sub>E</sub>* for the CO2 production) were calibrated. In both cases, a non-linear least-square method based on a Newton algorithm with a finite-difference approximation of the Jacobian matrix (function leastsq of scilab) was used. Table 10 shows a good fitting quality.

**Table 10.** Root Mean Square Error (RMSE) for the calibration of the growth function *μ* and the CO2 chronicles.


We recall that for the experimental data, we do not have measurements of *N* over time, so it was not possible to estimate the growth function independently of the CO2 measurements. The estimation of all the parameters was made on the CO2 measurements only. We used the same non-linear least-square method, with data re-normalized to 1 (i.e., the values were divided by the largest one), so that all points have equal weight in the criterion. The errors show a good fit of the CO2 curves with the model with maintenance.

#### **7. Conclusions**

In this work, we demonstrated that the consideration of a maintenance term, or equivalently, a variable yield, in wine fermentation modeling can satisfactorily replace more sophisticated models with a simpler structure. Indeed, the effects of the underlying mechanisms of transporters or carbohydrate accumulation, which are difficult to capture experimentally, are somehow encoded into a maintenance term and translated into a variable yield between biomass and nitrogen. We showed that this variable yield, as a function of the nitrogen concentration, can be estimated from experimental data of biomass growth and nitrogen depletion, without the need to measure internal compounds. This brings the flexibility to fit different kinds of models or experimental data (once calibrated) with a single common structure, which could correspond to different operating conditions or hypotheses in wine fermentation. This new approach provides new perspectives for the control of fermentation with nitrogen addition, based on a simple Markovian model, as well as for model extensions with aromatic compounds [25] or multi-strain populations [26].

**Author Contributions:** All authors have contributed equally. All authors have read and agreed to the published version of the manuscript.

**Funding:** This research received no external funding.

**Institutional Review Board Statement:** Not applicable.

**Informed Consent Statement:** Not applicable.

**Data Availability Statement:** The data presented in this study are available on request from the corresponding author.

**Conflicts of Interest:** The authors declare no conflict of interest. Technord participated in the data analysis and the modeling discussion as Robert David developed in the past one of the studied model as senior researcher at Université Catholique de Louvain with Denis Dochain. Technord does not intend to use any application of Intellectual property (licensing) for these developments. The interest of this publication is purely scientific and legitimates the Data Science activity of Technord. Data were generated by INRAE, where Alain Rapaport, Jérôme Harmand and Thibault Nidelet are employed.

#### **References**


## *Article* **Parameter Estimation of Dynamic Beer Fermentation Models**

**Jesús Miguel Zamudio Lara 1,2, Laurent Dewasme 1, Héctor Hernández Escoto <sup>2</sup> and Alain Vande Wouwer 1,\***

<sup>1</sup> Systèmes, Estimation, Commande et Optimisation, Université de Mons, 7000 Mons, Belgium

<sup>2</sup> Departamento de Ingeniería Química, Universidad de Guanajuato, Guanajuato 36050, Mexico

**\*** Correspondence: alain.vandewouwer@umons.ac.be

**Abstract:** In this study, two dynamic models of beer fermentation are proposed, and their parameters are estimated using experimental data collected during several batch experiments initiated with different sugar concentrations. Biomass, sugar, ethanol, and vicinal diketone concentrations are measured off-line with an analytical system, while two on-line immersed probes deliver temperature, ethanol concentration, and carbon dioxide exhaust rate measurements. Before proceeding to the estimation of the unknown model parameters, a structural identifiability analysis is carried out to investigate the measurement configuration and the kinetic model structure. The model predictive capability is investigated in cross-validation, in view of opening up new perspectives for monitoring and control purposes. For instance, the dynamic model could be used as a predictor in receding-horizon observers and controllers.

**Keywords:** parameter estimation; mathematical modeling; beer fermentation; food industry

#### **1. Introduction**

Beer is the most consumed alcoholic beverage worldwide and is produced by the fermentation of sugars in the wort by yeasts [1]. The production of beer in 2019 was 1912 million hectoliters (hl), while in 2020, production fell to 1820 million hl due to the COVID-19 pandemic [2]. Beer production is a complex biochemical process in which the main ingredients are water, malt (the sugar source), yeast, and hops [3]; however, other products can be added, such as fruits, chocolate, and coffee grains, among others.

The fermentation stage is crucial to guarantee good quality beer since it is when all the nutrients, flavor, and odor components are produced, in addition to ethanol. At this stage, yeast is introduced in the wort (broth that is rich in sugars) from the boiling stage at the desired temperature. The main chemical reaction is the conversion of these sugars into ethanol and carbon dioxide, along with biomass growth and heat generation. At the same time, several secondary reactions occur, generating several components at lower concentrations that contribute to the flavor and aroma characteristics.

To enhance the fermentation, several factors, such as the yeast pitching rate, dissolved oxygen, batch pressure, and temperature, must be taken care of by the brewers [4]. Among these factors, temperature is important as it helps accelerate the fermentation, but it needs to remain within controlled bounds to avoid yeast death (above 30 °C), the production of undesirable byproducts, and the growth of bacteria, which would damage the final product. Therefore, rigorous control of the temperature inside the fermenter must be exercised to ensure product quality and alleviate variations between batches.

In the brewing industry, time-varying temperature profiles are established along the fermentation process in order to alleviate the above-mentioned potential issues [5]. Looking for the appropriate temperature profile is, however, not an easy task, and experimental determination can be time-consuming. Model-based optimization is, therefore, an appealing alternative. Dynamic models can be useful not only to optimize the operating conditions, but also to design state estimators reconstructing online non-measured variables or designing controllers to ensure close setpoint tracking [6].

**Citation:** Zamudio Lara, J.M.; Dewasme, L.; Hernández Escoto, H.; Vande Wouwer, A. Parameter Estimation of Dynamic Beer Fermentation Models. *Foods* **2022**, *11*, 3602. https://doi.org/10.3390/ foods11223602

Academic Editors: Jose A. Egea, Carlos Vilas and Míriam R. García

Received: 3 October 2022 Accepted: 7 November 2022 Published: 11 November 2022



To this end, Gee and Ramirez [7] proposed a detailed model of beer fermentation describing biomass growth and the production of flavor compounds through macroscopic reactions inferred from biological pathways. This model segregates the sugars into glucose, maltose, and maltotriose. The derivation of the corresponding sugar uptake kinetics is, therefore, the center of interest, and the related parameters are assumed to be temperature-dependent. However, this model is unlikely to be applicable for control purposes since it involves variables that require specific monitoring equipment beyond the standards of most fermenters. Andrés-Toro et al. [8], conversely, proposed to segregate the yeast (biomass) into three types: lag, active, and dead, while the sugars are considered as a whole. Sugar monitoring, therefore, appears simpler, and the model also takes account of industrial operational characteristics as well as the undesired beer flavors caused by ethanol and byproduct (diacetyl and ethyl acetate) formation. The main drawback of this model is the difficulty in obtaining accurate biomass data. Trelea et al. [9] developed what can be considered, so far, as the most practical model, where only three states are considered: the dissolved carbon dioxide concentration, used as an image of the growing biomass, the ethanol concentration, and the sugar concentration. The main advantage of this latter model lies in its practical, control-oriented description of the fermentation process, considering variables that can be easily measured and tracked.

The objective of this work is to revisit these classical models, propose a few adaptations, and develop a thorough study of the parameter estimation problem based on a popular fermentation device, i.e., a 30-L stainless-steel Grainfather® fermenter. Two alternative mathematical models are considered, one based on yeast (biomass) and the other on carbon dioxide. The difference between these models is discussed in terms of biological interpretation, bioreactor instrumentation, and data collection (i.e., parameter estimation, model validation, and process control perspectives). As a result, models with good predictive capability are proposed together with their experimental validation.

This paper is organized as follows. The next section describes the experimental setup, while Section 3 presents a review of dynamic models of beer fermentation, together with possible model adaptations required to represent the considered case study. Section 4 develops a structural identifiability analysis based on the software tool Strike Goldd [10]. Section 5 introduces a parameter identification procedure, including parametric sensitivity analysis and model validation. The last section draws the main conclusions of this work and discusses the monitoring and control perspectives.

#### **2. Beer Fermentation Experimental Set-Up**

The pilot plant consists of a stainless-steel conical fermenter (30 L, Grainfather®), which has a built-in sensor to measure the temperature of the liquid content. This sensor is paired with a control system connected to a glycol chiller (Grainfather®) to keep the temperature regulated. Ethanol and carbon dioxide concentrations are measured online with, respectively, a Tilt® hydrometer and a Plaato® airlock.

The hydrometer is introduced into the wort and keeps floating in a tilted position, measuring the specific gravity, which also allows, based on some predetermined correlations, for assessing the percentage of alcohol. The hydrometer also integrates a temperature sensor. The airlock consists of four components: a lid, a bubbler, a Tritan, and a smart part (containing the temperature and infrared sensors). This device measures the evenly sized bubbles of carbon dioxide released by the wort and converts them into liters of *CO*<sub>2</sub>. The data are stored and displayed in the Brewblox® interface.

Besides the two online probes, a CDR BeerLab® analyzer is used to obtain offline measurements of sugar, ethanol, and vicinal diketone (VDK) concentrations. The scheme of the full experimental setup is displayed in Figure 1.

**Figure 1.** Stainless-steel fermenter pilot plant monitoring set-up.

In this study, ale beer fermentation was considered and carried out at temperatures ranging from 17 to 26 °C. To obtain the wort, the malt is crushed using a mill. Indeed, the grain must only be cracked, not ground. The next step is mashing, where sugars are obtained from starch. The crushed grain is added to a boiler tank (35 L, Grainfather®) containing 19 L of water at 48 °C. The mashing step consists of four stages at different temperatures and times, described in Table 1. Once mashing is finished, the grain is rinsed with water at 75 °C until 24 L of wort is obtained. Usually, the quantity of added water is 8 L. Then, the wort is boiled to sterilize the liquid. This latter step is carried out for 80 min at 100 °C. Hops are added at 40 min and 65 min. Eventually, the wort is cooled down as fast as possible to the desired fermentation temperature with the help of a counter-flow wort cooler. The cold wort is transferred into the fermentation tank, which is filled up to 17 L.

A set of four isothermal batch fermentations without agitation is carried out using the different operating conditions described in Table 2. Each experiment is carried out once, but two replicates are taken and analyzed for each sample. The total sampling volume represents less than 10% of the initial wort volume (17 L), a condition that allows volume changes to be neglected. Samples are taken every 2 to 3 h during the first 36 h. After this period, the process enters a stationary phase and the sampling time is therefore adapted to irregular, longer time intervals. To analyze the samples with the CDR BeerLab®, a preprocessing step is necessary, including degasification and centrifugation, to eliminate everything that could interfere with the measurement.

**Table 1.** Operating conditions of the scheduled mash steps: temperatures and times.



**Table 2.** Operating conditions of each fermentation batch: duration, temperature, and initial sugar and biomass concentrations.

#### **3. Mathematical Models of Beer Fermentation**

Beer fermentation has been studied extensively, providing various mathematical models, and this section proposes a brief description of three of them, considered as milestones in the research field and used as a reference in the upcoming model development. These mathematical models can provide precious support to design the fermentation operating conditions.

#### *3.1. The Model of Gee and Ramirez*

This model published in 1994 includes the main components of fermentation, e.g., sugars, biomass, and ethanol, as well as amino acids, fusel alcohols, VDKs, and acetaldehydes, which impact the flavor, often in an undesirable way [7]. Esters also have an important role in the aroma and may add some pleasant character in moderate ranges, but undesired hard fruity tastes at higher levels.

The sugar uptake model reads as follows:

$$\frac{dG}{dt} = -\mu_G X, \tag{1a}$$

$$\frac{dM}{dt} = -\mu_M X, \tag{1b}$$

$$\frac{dN}{dt} = -\mu_N X, \tag{1c}$$

where *G*, *M*, and *N*, respectively, stand for glucose, maltose, and maltotriose. The specific growth rates are built upon classical kinetic activation (Monod law) and inhibition factors and are given by:

$$\mu_G = \frac{\mu_G G}{K_G + G}, \tag{2a}$$

$$\mu_M = \frac{\mu_M M}{K_M + M}\,\frac{K'_G}{K'_G + G}, \tag{2b}$$

$$\mu_N = \frac{\mu_N N}{K_N + N}\,\frac{K'_G}{K'_G + G}\,\frac{K'_M}{K'_M + M}, \tag{2c}$$

where *μ<sub>i</sub>* are maximum rate constants (*i* = *G*, *M*, *N*), while *K<sub>i</sub>* and *K′<sub>j</sub>* (*j* = *G*, *M*) are, respectively, the half-saturation and inhibition constants, all assumed to depend on the temperature following an Arrhenius law of the form:

$$r = A \exp\left(\frac{B}{RT}\right),\tag{3}$$

where *A* and *B* are, respectively, the Arrhenius frequency factor and the activation energy. *R* is the ideal gas constant.
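As a minimal numerical illustration, the Arrhenius dependence (3) can be evaluated as sketched below; the frequency factor and activation-energy values are purely illustrative placeholders, not values from the models discussed here.

```python
import math

R = 8.314  # ideal gas constant, J/(mol K)

def arrhenius(A, B, T):
    """Arrhenius-type parameter r = A * exp(B / (R * T)); with the sign
    convention of Equation (3), B < 0 yields a rate that increases with
    temperature."""
    return A * math.exp(B / (R * T))

# Purely illustrative coefficients (not taken from the models):
# this choice roughly doubles the rate between 15 degC and 25 degC.
r_15 = arrhenius(1.0e9, -5.0e4, 288.15)
r_25 = arrhenius(1.0e9, -5.0e4, 298.15)
```

In practice, *A* and *B* are themselves estimated from data collected at several temperatures, one pair per temperature-dependent parameter.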

Biomass production is represented by the mass-balance ODE:

$$\frac{dX}{dt} = \mu_X X, \tag{4}$$

where

$$\mu_X = Y_{XG}\,\mu_G + Y_{XM}\,\mu_M + Y_{XN}\,\mu_N \tag{5}$$

is the biomass growth rate, a function of the individual sugar uptake rates. *Y<sub>Xi</sub>* (*i* = *G*, *M*, *N*) are the yield coefficients of biomass with respect to the specific sugars.

The ethanol concentration is assumed to evolve proportionally to the variations of the sugar concentrations, resulting in an algebraic equation of the form:

$$E = E_0 + Y_{EG}(G_0 - G) + Y_{EM}(M_0 - M) + Y_{EN}(N_0 - N), \tag{6}$$

where *Y<sub>Ei</sub>* (*i* = *G*, *M*, *N*) are the yield coefficients of ethanol with respect to the consumed sugars.
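To make the structure of Equations (1)–(6) concrete, the sketch below integrates the sugar uptake and biomass equations at a fixed temperature with scipy; all numerical values (rate constants, yields, initial concentrations) are illustrative placeholders, not the temperature-dependent estimates of [7].

```python
import numpy as np
from scipy.integrate import solve_ivp

# Illustrative constants at a fixed temperature (g/L, 1/h); the paper's
# parameters follow the Arrhenius law (3) and are temperature-dependent.
mu_G0, mu_M0, mu_N0 = 0.5, 0.4, 0.2   # maximum rate constants
K_G, K_M, K_N = 1.0, 5.0, 10.0        # half-saturation constants
Kp_G, Kp_M = 2.0, 20.0                # inhibition constants K'_G, K'_M
Y_XG, Y_XM, Y_XN = 0.1, 0.1, 0.1      # biomass yields
Y_EG, Y_EM, Y_EN = 0.45, 0.45, 0.45   # ethanol yields

def ode(t, y):
    G, M, N, X = y
    mu_G = mu_G0 * G / (K_G + G)                                          # (2a)
    mu_M = mu_M0 * M / (K_M + M) * Kp_G / (Kp_G + G)                      # (2b)
    mu_N = mu_N0 * N / (K_N + N) * Kp_G / (Kp_G + G) * Kp_M / (Kp_M + M)  # (2c)
    mu_X = Y_XG * mu_G + Y_XM * mu_M + Y_XN * mu_N                        # (5)
    return [-mu_G * X, -mu_M * X, -mu_N * X, mu_X * X]                    # (1), (4)

G0, M0, N0, X0, E0 = 10.0, 60.0, 20.0, 4.0, 0.0
sol = solve_ivp(ode, (0.0, 100.0), [G0, M0, N0, X0], max_step=1.0)
G, M, N, X = sol.y[:, -1]
# Ethanol follows algebraically from the consumed sugars:
E = E0 + Y_EG * (G0 - G) + Y_EM * (M0 - M) + Y_EN * (N0 - N)              # (6)
```

Note how the inhibition factors in (2b)–(2c) enforce the sequential consumption of the sugars: maltose and maltotriose uptake remains repressed until glucose is nearly depleted.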

Three main amino acids are considered, which are responsible for the formation of flavor compounds in the beer, such as the fusel alcohols. The amino acids are described by the following differential equations:

$$\frac{d\xi}{dt} = -Y_{\xi}\,\frac{\xi}{K_{\xi} + \xi}\,\frac{dX}{dt} = -Y_{\xi}\,\mu_X\,\frac{\xi}{K_{\xi} + \xi}\,X, \tag{7}$$

where *Y<sup>ξ</sup>* and *K<sup>ξ</sup>* are, respectively, the yield coefficients and inhibition constants with species *ξ* = leucine (L), isoleucine (I), and valine (V).

The impact of fusel alcohols is a plastic, solvent-like flavor. Moreover, some experiments achieved in [11] have also linked higher alcohol levels with physiological effects associated with hangovers. The four fusel alcohols represented in the model are isobutyl alcohol (IB), isoamyl alcohol (IA), 2-methyl-1-butanol (MB), and n-propanol (P).

$$\frac{dIB}{dt} = Y_{IB}\,\mu_V X, \tag{8a}$$

$$\frac{dIA}{dt} = Y_{IA}\,\mu_L X, \tag{8b}$$

$$\frac{dMB}{dt} = Y_{MB}\,\mu_I X, \tag{8c}$$

$$\frac{dP}{dt} = Y_P(\mu_V + \mu_I)X, \tag{8d}$$

where *Y<sub>ζ</sub>* (*ζ* = *IB*, *IA*, *MB*, *P*) are yield coefficients, and *μ<sub>ξ</sub>* are specific rates expressed as $\mu_\xi = -\frac{1}{X}\frac{d\xi}{dt} = Y_\xi\,\mu_X\,\frac{\xi}{K_\xi + \xi}$ (*ξ* = *L*, *I*, *V*).

Esters contribute mainly to the aroma of the beer due to their high volatility. In moderate concentrations, they can confer a pleasant character to the beer. However, once in excess, the aroma becomes overly fruity, which is undesired by most consumers. Principal esters are ethyl acetate (*EA*), ethyl caproate (*EC*), and isoamyl acetate (*IAc*).

$$\frac{dEA}{dt} = Y_{EA}(\mu_G + \mu_M + \mu_N)X, \tag{9a}$$

$$\frac{dEC}{dt} = Y_{EC}\,\mu_X X, \tag{9b}$$

$$\frac{dIAc}{dt} = Y_{IAc}\,\mu_{IAc} X, \tag{9c}$$

where *Y<sub>γ</sub>* are the yield coefficients (*γ* = *EA*, *EC*, *IAc*) and *μ<sub>IAc</sub>* is the maximum isoamyl acetate formation rate.

The common practice recommends completely removing vicinal diketones (VDKs) since they add some undesired buttery flavor notes. VDK production is assumed to be proportional to the growth rate, while their possible re-assimilation by yeast to form other by-products is proportional to their concentration. It must be noticed that acetaldehyde (*AAl*) is another compound showing similar behavior to the VDKs. The mass-balance equations read:

$$\frac{dVDK}{dt} = Y_{VDK}\,\mu_X X - r_{VDK}\,VDK\,X, \tag{10a}$$

$$\frac{dAAl}{dt} = Y_{AAl}(\mu_G + \mu_M + \mu_N)X - r_{AAl}\,AAl\,X, \tag{10b}$$

where *Y<sub>ω</sub>* defines the yield coefficients and *r<sub>ω</sub>* the first-order rate constants (*ω* = *VDK*, *AAl*). *μ<sub>X</sub>* is the biomass growth rate and *μ<sub>G</sub>*, *μ<sub>M</sub>*, *μ<sub>N</sub>* are the sugar consumption rates.

This model therefore proposes a detailed description of the flavor and aroma of the beer but also presents the drawback of being difficult to apply in a realistic context since it requires numerous and often expensive advanced on-line monitoring devices. Indeed, in practice, biomass concentration is only measured at the beginning of a batch without further monitoring. Moreover, most of the parameters are temperature-dependent, either imposing rigorous operating conditions (i.e., only one constant temperature level) or parameter estimation of the temperature-dependent functions (which requires data collection at various temperatures).

#### *3.2. Model of De Andrés-Toro et al.*

This model, published in 1998, is more concise than the previous one as it only considers five state variables: sugars, biomass, ethanol, ethyl acetate, and diacetyl (i.e., vicinal diketones) [8]. Ethyl acetate and diacetyl are assumed to be the most influencing compounds regarding aroma and flavor. In the following, the model dynamics are described state-by-state. Biomass is segregated into three types: lagged, active, and dead. It is indeed assumed that part of the biomass goes through several states during the process, first in a lag phase when the fermentation starts, then in an active (growing) state, and eventually in an inactive (dead) state.

*Lagged biomass*

$$\frac{dX_L}{dt} = -\mu_L X_L, \tag{11a}$$

*Active biomass*

$$\frac{dX_A}{dt} = \mu_X X_A + \mu_L X_L - \mu_{DT} X_A, \tag{11b}$$

*Dead biomass*

$$\frac{dX_D}{dt} = \mu_{DT} X_A - \mu_{SD} X_D, \tag{11c}$$

where the lagged biomass becomes active at the specific rate *μL*, the active biomass grows at the specific rate *μ<sup>X</sup>* and dies at the rate *μDT*, while the dead biomass settles in the bottom of the reactor at the rate *μSD*. *μ<sup>X</sup>* and *μSD* are further defined as

$$\mu_X = \frac{\mu_{X_0}\,S}{0.5\,S_0 + E}, \tag{12}$$

$$\mu_{SD} = \frac{\mu_{SD_0}\,0.5\,S_0}{0.5\,S_0 + E}. \tag{13}$$

*μ<sub>X</sub>* represents an activation by the substrate *S* and an inhibition by ethanol *E*. The inhibition constant is taken equal to half the initial substrate concentration *S*<sub>0</sub> (indeed, two units of S give one unit of E in the stoichiometry of the reaction). *μ<sub>SD</sub>* describes an inhibition by ethanol, which is directly related to *CO*<sub>2</sub>; carbon dioxide is not a variable in this model, but its bubbles impair the settling phenomenon. The inhibition constant is again equal to half the initial substrate concentration (i.e., the maximal quantity of ethanol that can be produced).

Sugar consumption follows a Monod law according to:

$$\frac{dS}{dt} = -\mu_S X_A, \tag{14}$$

with

$$\mu_S = \frac{\mu_{S_0}\,S}{K_S + S}, \tag{15}$$

in which *μ<sub>S0</sub>* is the maximum specific consumption rate and *K<sub>S</sub>* is the half-saturation constant.

Ethanol production is described by

$$\frac{dE}{dt} = \mu_E X_A, \tag{16}$$

where the specific rate includes a Monod factor with respect to the substrate *S* and an inhibition factor related to the ethanol concentration. This factor vanishes when the ethanol reaches the maximum value 0.5 *S*<sub>0</sub>:

$$\mu_E = \frac{\mu_{E_0}\,S}{k_E + S}\left(1 - \frac{E}{0.5\,S_0}\right). \tag{17}$$

Ethyl acetate is produced as a byproduct of active biomass growth:

$$\frac{dEA}{dt} = Y_{EA}\,\mu_X X_A, \tag{18}$$

where *YEA* is the yield coefficient.

Diacetyl is a component belonging to the vicinal diketones (VDKs), which is produced as the biomass grows by consuming the sugars. Afterward, diacetyl is reduced into acetoin with a reduction rate *rVDK* activated in the presence of ethanol:

$$\frac{dVDK}{dt} = k_{VDK}\,S\,X_A - r_{VDK}\,VDK\,E. \tag{19}$$

All the parameters are assumed to be affected by temperature according to the Arrhenius law (Equation (3)).

The biomass segregation model provides an accurate and consistent description of the process but, as a drawback, requires the corresponding monitoring equipment. In the study of [8], total biomass was measured online by absorbance change detection in a photocell, while the biomass state classification was made based on pre-established assumptions.
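Under the same caveat (illustrative constants at a fixed temperature, not the Arrhenius-fitted values of [8]), the segregated-biomass model (11)–(17) can be simulated as follows:

```python
import numpy as np
from scipy.integrate import solve_ivp

# Illustrative parameter values at a fixed temperature (1/h, g/L);
# in the original model they all follow the Arrhenius law (3).
mu_L, mu_DT = 0.2, 0.01       # lag-to-active and death rates
mu_X0, mu_SD0 = 0.4, 0.05     # max growth and settling rates
mu_S0, mu_E0 = 1.5, 0.7       # max sugar uptake and ethanol production
K_S, k_E = 20.0, 10.0         # half-saturation constants

S0 = 100.0                    # initial sugar, fixes E_max = 0.5 * S0

def ode(t, y):
    XL, XA, XD, S, E = y
    mu_X = mu_X0 * S / (0.5 * S0 + E)                      # (12)
    mu_SD = mu_SD0 * 0.5 * S0 / (0.5 * S0 + E)             # (13)
    mu_S = mu_S0 * S / (K_S + S)                           # (15)
    mu_E = mu_E0 * S / (k_E + S) * (1.0 - E / (0.5 * S0))  # (17)
    return [-mu_L * XL,                                    # (11a) lagged
            mu_X * XA + mu_L * XL - mu_DT * XA,            # (11b) active
            mu_DT * XA - mu_SD * XD,                       # (11c) dead
            -mu_S * XA,                                    # (14)
            mu_E * XA]                                     # (16)

y0 = [2.0, 0.0, 0.0, S0, 0.0]  # all biomass starts in the lag state
sol = solve_ivp(ode, (0.0, 120.0), y0, max_step=0.5)
XL, XA, XD, S, E = sol.y[:, -1]
```

The inhibition factor (1 − *E*/0.5 *S*<sub>0</sub>) keeps ethanol below its stoichiometric maximum, while the lag state delays the onset of growth at the start of the batch.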

#### *3.3. Model of Trelea et al.*

The originality of the model of Trelea et al. [9], with respect to the previous ones, is that it considers the carbon dioxide dynamics instead of the biomass dynamics. Carbon dioxide sensors are indeed easily implemented and calibrated on-line, reliable, and significantly cheaper than biomass measurement devices.

The evolution of carbon dioxide is related to yeast growth, sugar consumption, and ethanol production. *CO*<sub>2</sub> dynamics are assumed to be driven by a Monod law describing sugar activation and saturation effects, combined with an inhibition factor taking account of the decreasing cell respiratory capacity following ethanol accumulation. The influence of the initial biomass concentration on the initial *CO*<sub>2</sub> production rate is also taken into account:

$$\frac{dCO_2}{dt} = \mu_{\max}\,\frac{S}{K_S + S}\,\frac{1}{1 + K_I E^2}\,(CO_2 + C_0 X_0), \tag{20}$$

where *KS* is the half-saturation coefficient, *KI* is the inhibition constant, and *C*<sup>0</sup> is a conversion factor.

Algebraic equations describe the evolution of the sugar and ethanol concentrations in relation to *CO*<sup>2</sup> as suggested in [12]:

$$S = S_0 - Y_S\,CO_2, \tag{21}$$

$$E = Y_E\,CO_2, \tag{22}$$

where *Y<sub>S</sub>* and *Y<sub>E</sub>* are the corresponding yield coefficients. The main advantage of this model lies in its practical and control-oriented description of the fermentation process, considering a variable that can be easily measured, such as carbon dioxide. The maximum specific growth rate, *μ<sub>max</sub>*, is, however, assumed to depend on several operating parameters, such as pressure, temperature, and initial yeast concentration, as follows:

$$\mu_{\max} = a_T T_n + a_P P_n + a_X X_{0,n} + a_{TP}\,T_n P_n + a_{TX}\,T_n X_{0,n} + a_0, \tag{23}$$

where *a<sub>i</sub>* are parameters to be estimated.
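A sketch of the model (20)–(23) is given below; the yields, kinetic constants, and response-surface coefficients are hypothetical placeholders. Note how sugar and ethanol are recovered algebraically from the simulated *CO*<sub>2</sub>:

```python
import numpy as np
from scipy.integrate import solve_ivp

# Illustrative constants (not the estimates of [9]).
K_S, K_I, C0 = 30.0, 0.002, 1.0
Y_S, Y_E = 2.0, 0.9          # yields linking sugar/ethanol to CO2
S0, X0 = 100.0, 2.0

def mu_max(Tn, Pn, X0n, a=(0.05, -0.01, 0.02, 0.0, 0.0, 0.1)):
    """Response surface (23) in normalized operating conditions, with
    hypothetical coefficients a = (a_T, a_P, a_X, a_TP, a_TX, a_0)."""
    aT, aP, aX, aTP, aTX, a0 = a
    return aT * Tn + aP * Pn + aX * X0n + aTP * Tn * Pn + aTX * Tn * X0n + a0

def dCO2(t, y, mu):
    (co2,) = y
    S = max(S0 - Y_S * co2, 0.0)                     # (21)
    E = Y_E * co2                                    # (22)
    return [mu * S / (K_S + S) / (1.0 + K_I * E**2) * (co2 + C0 * X0)]  # (20)

mu = mu_max(Tn=0.5, Pn=0.0, X0n=0.2)
sol = solve_ivp(dCO2, (0.0, 200.0), [0.0], args=(mu,), max_step=1.0)
co2_end = sol.y[0, -1]
S_end = max(S0 - Y_S * co2_end, 0.0)
```

The term *C*<sub>0</sub>*X*<sub>0</sub> lets *CO*<sub>2</sub> production start from a zero initial *CO*<sub>2</sub> level, encoding the influence of the initial biomass.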

#### *3.4. Proposed Mathematical Models*

In this study, two mathematical models of beer fermentation are proposed, one based on biomass dynamics and the other on *CO*<sup>2</sup> dynamics. These models take inspiration from the mathematical developments of the previous sections and attempt to describe experiments performed with the beer fermenter described in Section 2.

Figure 2 shows some data collected in a batch experiment at 19 °C. It is apparent that sugar was not completely consumed at the end of the fermentation, with a residual concentration of about 12 g/L (this behavior was confirmed in repeated experiments under the same and different conditions). Two possible causes of this sluggish end-of-batch fermentation were explored in additional experiments: biomass settling and water quality. However, gentle agitation and tests with different water sources did not alter the initial observation. Other causes, such as the depletion of some components required for yeast proliferation and maintenance (such as nitrogen, sterols, or fatty acids), could not be assessed. The models were, therefore, adapted to describe the experimental observations. As the published models only consider the total consumption of sugars, structural modifications were made to cope with this type of behavior. The changes mostly impact the definition of the specific growth rate, as explained in the following sections.

#### 3.4.1. Dynamic Model Based on Biomass

An overall alcoholic fermentation reaction can be written as follows:

$$k_S\,\text{S} \xrightarrow{r_1} k_E\,\text{E} + k_V\,\text{VDK} + k_{CO_2}\,\text{CO}_2 + \text{X}, \tag{24}$$

where sugars *S* are consumed and converted by yeasts *X* into ethanol *E*, vicinal diketones *VDK*, and carbon dioxide *CO*2. *kS*, *kE*, *kV*, and *kCO*<sup>2</sup> represent the yield coefficients of sugar, ethanol, vicinal diketones, and carbon dioxide, respectively.

**Figure 2.** Direct validation of the biomass model with the data from experiment 2. Stars: experimental data. Error bars: 95% confidence intervals. Continuous blue line: model prediction.

Vicinal diketones are later reduced by yeasts, producing 2,3-butanediol *P* in the following second reaction:

$$\text{VDK} \xrightarrow{r_2} \text{P} \tag{25}$$

From Equations (24) and (25), a set of mass-balance ODEs can be derived:

$$\frac{dX}{dt} = \mu_X X - \delta_X X, \tag{26a}$$

$$\frac{dS}{dt} = -k_S\,\mu_X X, \tag{26b}$$

$$\frac{dE}{dt} = k_E\,\mu_X X, \tag{26c}$$

$$\frac{dCO_2}{dt} = k_{CO_2}\,\mu_X X, \tag{26d}$$

$$\frac{dVDK}{dt} = k_V\,\mu_X X - r_{VDK}\,VDK. \tag{26e}$$

The specific growth rate is defined as:

$$\mu_X = \mu_{\max}\left(1 - \frac{S_{\min}}{S}\right) \qquad \text{for} \quad S \ge S_{\min}, \tag{27a}$$

$$\mu_X = 0 \qquad \text{for} \quad S < S_{\min}. \tag{27b}$$

The specific growth rate, usually represented by a Monod law, is replaced here by a Droop-like factor [13], commonly used to describe microalgae growth. This kinetic structure expresses that a minimum level of sugar *S<sub>min</sub>* is necessary to trigger growth. Above this threshold, the Droop factor has an activation/saturation effect similar to a Monod factor. Furthermore, a decrease in biomass is observed due to settling and is represented by the coefficient *δ<sub>X</sub>* in Equation (26a).

In classical batch fermentation, the temperature is the only variable that can be manipulated during the batch to control the evolution of the fermentation. Based on the previously discussed modeling studies [7–9] and experimental observation, a temperature-dependency of the maximum specific growth rate *μ<sub>max</sub>* is expected, which can be described by several structural laws [14]. Moreover, the reduction speed of the VDKs, *r<sub>VDK</sub>*, is also affected by temperature. In Table A1, the definitions and units of each parameter are listed.
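A minimal simulation of Equations (26)–(27) illustrates the role of the Droop-like threshold; the parameter values are illustrative, not the estimates of Table 5, and *CO*<sub>2</sub> is treated as a reaction product (positive yield term), consistent with reaction (24). Growth stops as the sugar approaches *S<sub>min</sub>*, leaving a residual concentration as observed experimentally.

```python
import numpy as np
from scipy.integrate import solve_ivp

# Illustrative parameter values (the paper estimates them from data).
k_S, k_E, k_CO2, k_V = 2.0, 0.9, 0.8, 1e-4
delta_X, r_VDK = 0.01, 0.05
mu_max, S_min = 0.2, 12.0     # Droop-like threshold (g/L)

def ode(t, y):
    X, S, E, CO2, VDK = y
    mu_X = mu_max * (1.0 - S_min / S) if S > S_min else 0.0  # (27)
    return [mu_X * X - delta_X * X,          # (26a)
            -k_S * mu_X * X,                 # (26b)
            k_E * mu_X * X,                  # (26c)
            k_CO2 * mu_X * X,                # (26d)
            k_V * mu_X * X - r_VDK * VDK]    # (26e)

y0 = [4.0, 100.0, 0.0, 0.0, 0.0]             # X, S, E, CO2, VDK
sol = solve_ivp(ode, (0.0, 150.0), y0, max_step=0.5)
X, S, E, CO2, VDK = sol.y[:, -1]
# S settles near S_min instead of zero: the incomplete sugar
# consumption of Figure 2 is reproduced structurally.
```

After the sugar reaches the threshold, the biomass slowly decreases at the settling rate *δ<sub>X</sub>* while the VDKs are reduced at the rate *r<sub>VDK</sub>*.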

#### 3.4.2. Dynamic Model Based on Carbon Dioxide

The yeast concentration is difficult to monitor during the batch and, most of the time, this variable is either indirectly measured by the turbidity of the wort, or simply by sampling and cell counting using a cytometer or a dry-weight method. Hence, there is a strong motivation to monitor other, more accessible, process variables and develop dynamic models describing their evolution. Carbon dioxide may be such an indicator since it is a product of the biochemical reactions related to sugar oxidation, which provides the necessary energy to the yeast cells to grow and leads to ethanol production if sugar is in excess, activating the overflow metabolism [15,16]. In this study, the model of [9] was adapted in the following way:

$$\frac{dCO_2}{dt} = \mu_X\,CO_2, \tag{28a}$$

$$S = S_0 - k_S\,CO_2, \tag{28b}$$

$$E = E_0 + k_E\,CO_2, \tag{28c}$$

$$\frac{dVDK}{dt} = k_V\,\mu_X\,CO_2 - r_{VDK}\,VDK, \tag{28d}$$

where the specific growth rate is given by:

$$\mu_X = \mu_{\max}\,\frac{S}{K_S + S}\left(1 - \frac{CO_2}{Cp_{\max}\,S_0}\right). \tag{29}$$

The kinetic description contains a Monod law for the sugar activation and saturation effects, and a logistic factor to mimic the observed sigmoidal evolution of carbon dioxide. The data sets also reveal that the maximum production of carbon dioxide is correlated with the initial sugar concentration *S*0, which therefore defines the maximum *CO*<sup>2</sup> level (i.e., the carrying capacity of the logistic model). The initial biomass concentration, which is assumed to be known (measured) and directly correlated to the *CO*<sup>2</sup> dynamics in [9], was not used in the current study since the initial condition of *CO*<sup>2</sup> was available. The rates *μmax* and *rVDK* are assumed to depend on temperature and Table A2 lists the definitions and units of some parameters.
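Equations (28)–(29) reduce to a single ODE for *CO*<sub>2</sub> plus algebraic relations; a sketch with illustrative parameter values (not the estimates reported later in the paper) follows:

```python
import numpy as np
from scipy.integrate import solve_ivp

# Illustrative values; Cp_max * S0 sets the logistic carrying capacity.
mu_max, K_S, Cp_max = 0.15, 25.0, 0.45
k_S, k_E, k_V, r_VDK = 2.0, 0.9, 1e-4, 0.05
S0, E0 = 100.0, 0.0

def ode(t, y):
    CO2, VDK = y
    S = max(S0 - k_S * CO2, 0.0)                                 # (28b)
    mu_X = mu_max * S / (K_S + S) * (1.0 - CO2 / (Cp_max * S0))  # (29)
    return [mu_X * CO2, k_V * mu_X * CO2 - r_VDK * VDK]          # (28a), (28d)

# A nonzero initial CO2 level is required since (28a) is multiplicative;
# the paper notes this initial condition is available from the airlock.
sol = solve_ivp(ode, (0.0, 200.0), [0.5, 0.0], max_step=1.0)
CO2 = sol.y[0, -1]
E = E0 + k_E * CO2                                               # (28c)
```

The logistic factor gives the sigmoidal *CO*<sub>2</sub> profile seen in the data, with the plateau fixed by *Cp<sub>max</sub>* *S*<sub>0</sub> rather than by full sugar depletion.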

#### **4. Structural Identifiability and Observability of the Models**

Identifiability globally refers to the possibility of identifying the model parameters from the available data. A model is structurally identifiable if all the parameters can be uniquely determined from ideal measurements of its outputs, i.e., collected in continuous time without errors or noise, and the knowledge of the dynamic equations [17]. If this property is not met, any further effort to estimate the non-identifiable parameters will be vain. However, the identifiability analysis is often omitted due to the assumed complexity of the mathematical developments required to achieve the analysis. Recently, several methodologies and toolboxes have been developed to significantly ease the task, as reviewed in [18]. Some of these software tools are DAISY [19], GENSSI [20], STRIKE-GOLDD [10], and SIAN [21].

On the other hand, practical identifiability deals with the possibility of assessing all or some of the model parameters under realistic conditions, e.g., sampled data and measurement noise. The Fisher Information Matrix (FIM) is useful to assess practical identifiability through a rank test condition. An ill-conditioned FIM can indicate poor practical parameter identifiability even if structural identifiability is met.

In this work, STRIKE-GOLDD (STRuctural Identifiability taken as Extended-Generalized Observability with Lie Derivatives and Decomposition) was used to investigate the structural identifiability of the proposed beer fermentation models. This software tool has been developed in MATLAB® and addresses identifiability based on the concept of observability. To this end, the model is extended by considering its model parameters as state variables with zero dynamics. The results obtained for both models indicate that structural identifiability is ensured only when all the state variables are measured.

Another property of interest is observability, which is a prerequisite to the design of a state observer to reconstruct nonmeasured state variables. The results of the analysis are provided in Tables 3 and 4 for the two dynamic models. For the model based on biomass, the analysis reveals that the set of three measurements [*CO*2, *E*, *VDK*] is necessary to guarantee observability. The set [*CO*2, *E*, *S*] shows partial observability as *VDK* cannot be reconstructed but biomass *X* could. The carbon dioxide model requires the measurement of *VDK* together with another variable (*CO*<sup>2</sup> or *E* or *S*) to fulfill the observability condition. An observer could, therefore, be designed to estimate the sugar concentration online, which is the most expensive measurement using an online or at-line hardware probe.
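A simple numerical cross-check of such results is an empirical observability test: perturb each initial state, stack the finite-difference sensitivities of the measured outputs over the sampling instants, and examine the rank of the resulting matrix. The sketch below applies this to the biomass model of Section 3.4.1 with illustrative parameters; it is a local, numerical indication only, not a substitute for the structural analysis performed with STRIKE-GOLDD.

```python
import numpy as np
from scipy.integrate import solve_ivp

# Biomass model (26) with illustrative parameters (see Section 3.4.1).
k_S, k_E, k_CO2, k_V = 2.0, 0.9, 0.8, 1e-4
delta_X, r_VDK, mu_max, S_min = 0.01, 0.05, 0.2, 12.0

def ode(t, y):
    X, S, E, CO2, VDK = y
    mu_X = mu_max * (1.0 - S_min / S) if S > S_min else 0.0
    return [mu_X * X - delta_X * X, -k_S * mu_X * X, k_E * mu_X * X,
            k_CO2 * mu_X * X, k_V * mu_X * X - r_VDK * VDK]

def outputs(y0, measured, t_eval):
    """Measured trajectories stacked into one vector."""
    sol = solve_ivp(ode, (0.0, t_eval[-1]), y0, t_eval=t_eval, max_step=0.5)
    return sol.y[measured, :].ravel()

def empirical_obs_rank(y0, measured, t_eval, eps=1e-4):
    """Rank of the stacked finite-difference sensitivities d(outputs)/dx0;
    full rank suggests the initial state is observable from 'measured'."""
    base = outputs(y0, measured, t_eval)
    cols = []
    for i in range(len(y0)):
        pert = np.array(y0, float)
        pert[i] += eps
        cols.append((outputs(pert, measured, t_eval) - base) / eps)
    return np.linalg.matrix_rank(np.array(cols).T)

y0 = [4.0, 100.0, 0.0, 0.0, 0.0]
t_eval = np.linspace(0.0, 60.0, 13)
rank_all = empirical_obs_rank(y0, [0, 1, 2, 3, 4], t_eval)  # all states
# With every state measured, the sensitivity matrix contains an identity
# block at t = 0, so full rank (5) is guaranteed.
```

Repeating the computation for reduced measurement sets mimics the configurations listed in Tables 3 and 4.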


**Table 3.** Observability analysis of the biomass model for several measurement configurations.

**Table 4.** Observability analysis of the carbon dioxide model for several measurement configurations.


#### **5. Parameter Identification Problem**

Parameter identification is achieved using classical nonlinear parameter estimation techniques [22]. The procedure considers a Weighted Least-Square (WLS) criterion, i.e., a weighted sum of squared differences between model predictions and experimental measurements:

$$\mathbf{J}(\theta) = \sum_{i=1}^{M} \left(y(t_i) - y_{model}(t_i, \theta)\right)^T \mathbf{W}^{-1} \left(y(t_i) - y_{model}(t_i, \theta)\right), \tag{30}$$

where J is the value of the cost function, *y*(*t<sub>i</sub>*) is the vector of *N* measured variables at the measurement instant *t<sub>i</sub>* (*i* = 1, ..., *M*), *y<sub>model</sub>*(*t<sub>i</sub>*, *θ*) is the model prediction depending on the set of *P* parameters *θ* to be identified, and **W** is a normalization matrix whose diagonal elements are chosen as the squares of the maximum measured values of each component concentration. This choice normalizes the prediction errors and is particularly well-suited to a relative error model, in which the error is assumed proportional to the maximum values of the observed variables:

$$\mathbf{W} = \begin{bmatrix} \max_{t_i}(y_1)^2 & 0 & \dots & 0 \\ 0 & \max_{t_i}(y_2)^2 & \dots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \dots & \max_{t_i}(y_N)^2 \end{bmatrix} \tag{31}$$

The estimated parameter set is obtained by minimizing the cost function J(*θ*):

$$\hat{\theta} = \arg\min_{\theta} \mathbf{J}(\theta) \tag{32}$$

To achieve the minimization of (32), a two-step procedure is implemented: (a) a multistart strategy defines random sets of initial parameter values to cover the search space as broadly as possible, and the minimization of (32) is first performed using the Matlab® optimizer fminsearch (Nelder–Mead algorithm); (b) the Matlab® optimizer lsqnonlin is subsequently used, starting from the identified global minimum (i.e., the smallest local minimum found in the search space by fminsearch), to refine the minimization and compute the Jacobian matrix containing the model parameter sensitivities, denoted *y<sub>θ</sub>*. These sensitivities can be exploited to compute the Fisher Information Matrix (FIM), defined as:

$$\text{FIM} = \sum_{i=1}^{M} y_{\theta}^{T}(t_i)\,\hat{\Omega}^{-1}\,y_{\theta}(t_i), \tag{33}$$

where $\hat{\Omega} = \hat{\varepsilon}^2\,\mathbf{W}$ is the a posteriori covariance matrix of the measurement errors, which can be evaluated using the weighting matrix **W** (Equation (31)) and an a posteriori estimator of the relative measurement error:

$$\hat{\varepsilon}^2 = \frac{J^*}{MN - P}, \tag{34}$$

where *J*∗ is the value of the cost function at the optimum, *MN* represents the total number of data, and *P* is the number of estimated parameters *θ*. An estimate of the parameter estimation error covariance matrix can then be inferred from the Cramer–Rao bound as follows:

$$
\hat{\Sigma} = \text{FIM}^{-1} \tag{35}
$$

From the diagonal of the covariance matrix $\hat{\Sigma}$, the standard deviations of each parameter can be extracted, and the corresponding coefficients of variation can be calculated as:

$$\text{CV}_i = \frac{\sigma_i}{\theta_i} \tag{36}$$
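The chain (33)–(36) can be sketched numerically on a toy single-output model; the parameter values, sensitivities, and optimal cost *J*<sup>∗</sup> below are invented for illustration only.

```python
import numpy as np

# Toy single-output model y(t) = theta_1 * (1 - exp(-theta_2 * t)).
theta = np.array([45.0, 0.08])            # assumed "estimated" parameters
t = np.linspace(1.0, 80.0, 20)
M, N, P = t.size, 1, theta.size

def model(th):
    return th[0] * (1.0 - np.exp(-th[1] * t))

# Output sensitivities y_theta (analytical here; in practice the
# Jacobian returned by the optimizer plays this role).
y_th = np.column_stack([1.0 - np.exp(-theta[1] * t),
                        theta[0] * t * np.exp(-theta[1] * t)])

W = np.array([[model(theta).max() ** 2]])  # weighting matrix (31), N = 1
J_star = 0.01                              # optimal cost (assumed value)
eps2 = J_star / (M * N - P)                # relative error estimate (34)
Omega_inv = np.linalg.inv(eps2 * W)        # inverse error covariance

# Fisher Information Matrix (33), Cramer-Rao bound (35), and CVs (36).
FIM = sum(np.outer(y_th[i], y_th[i]) * Omega_inv[0, 0] for i in range(M))
Sigma = np.linalg.inv(FIM)
CV = np.sqrt(np.diag(Sigma)) / theta
```

Small CV values indicate well-estimated parameters; a near-singular FIM would instead flag poor practical identifiability.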

To estimate the parameters of the beer fermentation models, a total of 4 batch experiments are considered, as shown in Table 2. Out of these 4 experiments, 3 are used for parameter estimation and direct model validation (experiments 2 to 4), while experiment 1 is used for cross-validation. An important point of the current work is the use of all the samples of experiments 2, 3, and 4 to achieve the identification, including two temperature-varying parameters, i.e., the specific growth rate *μ<sub>max</sub>* and the *VDK* reduction rate *r<sub>VDK</sub>*. Previous studies have indeed demonstrated that the other parameters do not change significantly with temperature. In addition to the stoichiometric and kinetic parameters, the initial conditions are also considered unknown (and are therefore estimated), since they are possibly corrupted by measurement noise.
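The two-step procedure (multistart Nelder–Mead followed by a least-squares refinement) can be mimicked in a self-contained way on synthetic data; the logistic-type model and all numerical values below are illustrative stand-ins, with scipy's minimize and least_squares playing the roles of fminsearch and lsqnonlin.

```python
import numpy as np
from scipy.integrate import solve_ivp
from scipy.optimize import minimize, least_squares

# Synthetic-data sketch of the two-step WLS identification (30)-(32):
# the "true" parameters theta = (mu_max, Cp_max) are recovered below.
def simulate(theta, t_eval, co2_0=0.5, S0=100.0, K_S=25.0, k_S=2.0):
    mu_max, Cp_max = theta
    def ode(t, y):
        S = max(S0 - k_S * y[0], 0.0)
        return [mu_max * S / (K_S + S) * (1.0 - y[0] / (Cp_max * S0)) * y[0]]
    return solve_ivp(ode, (0.0, t_eval[-1]), [co2_0], t_eval=t_eval).y[0]

rng = np.random.default_rng(0)
t = np.linspace(0.0, 80.0, 20)
y_meas = simulate([0.15, 0.45], t) * (1.0 + 0.02 * rng.standard_normal(t.size))

w = y_meas.max() ** 2                     # normalization (31), N = 1
def residuals(theta):
    if min(theta) <= 0.0:                 # penalize infeasible guesses
        return np.full(t.size, 1e3)
    return (y_meas - simulate(theta, t)) / np.sqrt(w)

def J(theta):                             # WLS criterion (30)
    return float(np.sum(residuals(theta) ** 2))

# Step (a): multistart Nelder-Mead over random initial guesses.
starts = rng.uniform([0.01, 0.1], [0.5, 1.0], size=(8, 2))
best = min((minimize(J, s, method="Nelder-Mead") for s in starts),
           key=lambda r: r.fun)
# Step (b): a least-squares solver refines the estimate from the best
# local minimum and provides the Jacobian (parameter sensitivities).
refined = least_squares(residuals, best.x)
```

The Jacobian `refined.jac` is the ingredient needed for the FIM-based uncertainty analysis of Equations (33)–(36).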

#### *5.1. Biomass Model*

This model counts 8 parameters (*k<sub>S</sub>*, *k<sub>E</sub>*, *S<sub>min</sub>*, *δ<sub>X</sub>*, *k<sub>V</sub>*, *k<sub>CO2</sub>*, *μ<sub>max</sub>*, *r<sub>VDK</sub>*) to be estimated. The dependence on temperature of the parameters *μ<sub>max</sub>* and *r<sub>VDK</sub>* is formulated as follows:

$$\mu_{\max} = a \ln(T) + b; \qquad r_{VDK} = c \ln(T) + d, \tag{37}$$

introducing the additional parameters *a*, *b*, *c*, *d*. Several nonlinear model structures have been considered to correlate the parameters with the temperature. It turned out that the selected logarithmic structure provides the best results.
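Fitting the logarithmic law (37) is a linear least-squares problem in ln(*T*); the (*T*, *μ<sub>max</sub>*) pairs below are hypothetical values, not the paper's estimates.

```python
import numpy as np

# Hypothetical per-batch estimates of mu_max at each fermentation
# temperature (step (a) of the identification); values are invented.
T = np.array([17.0, 19.0, 22.0, 26.0])       # degC
mu = np.array([0.110, 0.125, 0.146, 0.168])  # 1/h

# mu_max = a * ln(T) + b is linear in (ln T, 1): ordinary least squares.
A = np.column_stack([np.log(T), np.ones_like(T)])
(a, b), *_ = np.linalg.lstsq(A, mu, rcond=None)
mu_at_20 = a * np.log(20.0) + b  # interpolated value at 20 degC
```

The same construction applies to *r<sub>VDK</sub>* with coefficients *c* and *d*.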

In practice, the identification proceeds in three steps: (a) a first parameter estimation without explicit temperature dependence (i.e., *μmax* and *rVDK* are considered constant), (b) an estimation of the four parameters linked to the temperature dependence (the others being fixed at their previously estimated values), and (c) a global identification of all the parameters starting from the previous estimates.

Good practice recommends partitioning the data, with approximately 75% used for parameter estimation (and subsequent direct validation) and 25% for cross-validation. Accordingly, three experimental data sets are used in direct validation and the remaining one in cross-validation. Since the parameter estimation procedure aims at capturing information on the process over a wide range of operating conditions, it is legitimate to include experiments with different initial sugar and biomass concentrations and temperature levels. In particular, it is important to collect informative data regarding the evolution of *μmax* and *rVDK* with temperature in (37). Among the several possible data partitions, a natural combination is: experiments 2, 3, and 4 for parameter estimation (and direct validation) and experiment 1 for cross-validation. Indeed, experiments 1 and 2 are carried out at the same temperature, but experiment 2 also includes different initial sugar and biomass conditions. Table 5 reports the values of the estimated parameters and their coefficients of variation.


**Table 5.** Parameter estimate values and coefficients of variation (CV) for the biomass model.

Figures 2 and 3 show some direct validation results, i.e., the fitting of the model to the experimental data collected in experiments 2 and 4, together with the a posteriori error bars on the experimental data. The model reproduces the dynamics of the several variables quite well, even if the biomass predictions sometimes deviate from the confidence intervals of the data, and some deviations in the VDK production are also observed in the early hours. The coefficients of variation confirm the good estimation results, as the maximum relative CV is 11%, for the minimum substrate quota *Smin*.

In order to assess the model predictive capacity, cross-validation is performed using the dataset from experiment 1. In this case, only the initial conditions are estimated, while the parameters are kept fixed. As shown in Figure 4, the model predicts the experimental data satisfactorily. The biomass data again show some uncertainty, which can probably be linked to several factors such as cell counting errors, biomass mixing (needed to counteract biomass settling and collect representative samples), and nitrogen limitation [23].

**Figure 3.** Direct validation of the biomass model with the data from experiment 4. Stars: experimental data. Error bars: 95% confidence intervals. Continuous blue line: model prediction.

**Figure 4.** Cross-validation of the biomass model with the data from experiment 1. Stars: experimental data. Error bars: 95% confidence intervals. Continuous blue line: model prediction.

Vicinal diketone dynamics present a latency phase of varying duration, followed by production and consumption, both driven by the biomass dynamics. The uncertainty on the latency period compromises the resulting fit, since the biomass does not exhibit the same behavior in the early phase of fermentation.

#### *5.2. Carbon Dioxide Model*

A similar procedure was applied to estimate the values of the carbon dioxide model parameters. In this case, the dependence on temperature of *μmax* and *rVDK* is best represented by:

$$\mu\_{\text{max}} = a\ln(T) + b; \qquad r\_{VDK} = cT^2 + dT + e \tag{38}$$

Hence, the resulting model has 10 parameters (*kS*, *kE*, *KS*, *Cpmax*, *kV*, *a*, *b*, *c*, *d*, *e*), and Table 6 reports the estimated values with their respective coefficients of variation. As can be noticed, the latter are smaller than those of the previous model, mainly due to the absence of biomass measurements and their associated uncertainty.

The identification is again decomposed into distinct steps: (a) estimation of the parameters without temperature dependence and with an arbitrary value for *KS*, whose practical identifiability is poor, (b) estimation of *KS* with all the other parameters fixed at their previously estimated values, (c) estimation of the parameters linked to the temperature dependence (*a* to *e*) with all the others fixed to their previous values, and (d) final re-estimation of all the parameters.

Figures 5 and 6 show the direct validation with experiments 2 and 4, as well as the a posteriori error bars on the experimental data.

**Figure 5.** Direct validation of the carbon dioxide model with the data from experiment 2. Stars: experimental data. Error bars: 95% confidence intervals. Continuous blue line: model prediction.

**Figure 6.** Direct validation of the carbon dioxide model with the data from experiment 4. Stars: experimental data. Error bars: 95% confidence intervals. Continuous blue line: model prediction.


**Table 6.** Parameter estimate values and coefficients of variations (CV) for the carbon dioxide model.

Model cross-validation using experiment 1 is shown in Figure 7 and confirms the satisfactory predictive capacity of the model, except for some minor deviations in the evolution of the VDKs due to the presence of a time-varying latency phase.

In Table 7, the root mean square errors (RMSEs) are provided for each experiment and each variable separately. The cost function residuals of the direct validations are also provided. It can be observed that, overall, the RMSE values are small for both models. Regarding the biomass model, *X* and *VDK* present larger RMSEs than the other variables, due to the observed deviations between the model predictions and the experimental data in Figures 2 and 3. Regarding the carbon dioxide model, the RMSEs indicate a better fit to the experimental data, as confirmed by Figures 5 and 6. Cross-validation results also support this analysis, since the RMSEs of the biomass and carbon dioxide models are, respectively, 1.997 and 0.846.


**Table 7.** Cost function residuals and relative RMSEs for each state variable of the two models, resulting from direct validation.

Discriminating among the proposed models is difficult, since they target different variables. However, taking into account the cost function residuals, the carbon dioxide model fits the current operating conditions and monitoring set-up better (J = 0.72) than the biomass model (J = 1.76). Furthermore, from a practical point of view, the identification of the carbon dioxide model requires a sensor configuration that is easier to set up, limiting offline analytical work. Conversely, the identification of the biomass model requires offline cell counting to measure the yeast concentration. Moreover, considering process control, carbon dioxide online sensors are affordable, whereas biomass sensors are expensive (alternatively, a biomass software sensor could be developed based on the measurements of *CO*2, E, and VDK). The main advantage of the biomass model lies in the information it provides about the biomass metabolic state during the fermentation process, allowing a more straightforward detection of possible contamination.

**Figure 7.** Cross-validation of the carbon dioxide model with the data from experiment 1. Stars: experimental data. Error bars: 95% confidence intervals. Continuous blue line: model prediction.

#### **6. Conclusions**

The demand for processes with more rigorous quality standards, as is the case in the pharmaceutical industry, has led to the development of approaches such as process analytical technologies (PAT), now being extended to the agro-food sector and, more specifically, to the brewing industry. This work is motivated by the growing importance of mathematical modeling, in the context of PAT, to design process digital twins that can support lab-scale operations. Model-based advanced monitoring and control techniques can indeed be developed with a view to optimizing and improving the process. In this study, two alternative models, initially proposed in seminal works, are adapted and identified under realistic experimental conditions. One of the models is based on the description of the biomass evolution, while the other, more pragmatic, considers carbon dioxide, a more accessible variable that can be measured with cheap sensors. These models account for the temperature influence in a simple way. A systematic identification procedure is described. Cross-validation highlights the good predictive capability of both models, which are good candidates for model-based control.

**Author Contributions:** Conceptualization, J.M.Z.L., L.D., and A.V.W.; methodology, J.M.Z.L., L.D., H.H.E., and A.V.W.; software, J.M.Z.L.; validation, J.M.Z.L.; formal analysis, J.M.Z.L.; investigation, J.M.Z.L., L.D., H.H.E., and A.V.W.; data curation, J.M.Z.L.; writing—original draft preparation, J.M.Z.L., L.D., and A.V.W.; writing—review and editing, L.D., H.H.E., and A.V.W.; supervision, A.V.W. and H.H.E.; project administration, H.H.E. and A.V.W.; funding acquisition, H.H.E. and A.V.W. All authors have read and agreed to the published version of the manuscript.

**Funding:** This research received no external funding.

**Data Availability Statement:** The data presented in this study are available on request from the corresponding author.

**Acknowledgments:** The authors are grateful to Vincent Moeyaert, Perla Melendez, and Ivette Navarro for their help in the experimental set-up. J.M.Z.L. acknowledges the support of CONACYT under grant 782850.

**Conflicts of Interest:** The authors declare no conflict of interest.

#### **Appendix A**


**Table A1.** Parameter nomenclature list for biomass model.


**Table A2.** Parameter nomenclature list for carbon dioxide model.

#### **References**


*Article* **Dynamics of Microbial Inactivation and Acrylamide Production in High-Temperature Heat Treatments**

**Jose Lucas Peñalver-Soto 1,2, Alberto Garre 3, Arantxa Aznar 1, Pablo S. Fernández <sup>1</sup> and Jose A. Egea 2,\***


**Abstract:** In food processes, optimizing processing parameters is crucial to ensure food safety, maximize food quality, and minimize the formation of potentially toxigenic compounds. This research focuses on the simultaneous impacts that severe heat treatments applied to food may have on the formation of harmful chemicals and on microbiological safety. The case studies analysed consider the appearance/synthesis of acrylamide after a sterilization heat treatment for two different foods, pureed potato and prune juice, using *Geobacillus stearothermophilus* as an indicator. They present two contradictory situations: on the one hand, the application of a high-temperature treatment to a low-acid food containing *G. stearothermophilus* spores causes their inactivation, achieving food safety and stability from a microbiological point of view. On the other hand, high temperatures favour the appearance of acrylamide. In this way, the two objectives (microbiological safety and acrylamide production) are opposed. In this work, we analyse the effects of high-temperature thermal treatments (isothermal conditions between 120 and 135 ◦C) on food from two perspectives: microbiological safety/stability and acrylamide production. After analysing both objectives simultaneously, it is concluded that, contrary to what is expected, heat treatments at higher temperatures result in lower acrylamide production for the same level of microbial inactivation. This is due to the different dynamics and sensitivities of the two processes at high temperatures. These results, as well as the presented methodology, can serve as a basis for decision makers to design heat treatments that ensure food safety while minimizing the amount of acrylamide (or other harmful substances) produced.

**Keywords:** food safety; acrylamide formation; thermal resistance; dynamic models; simulation

#### **1. Introduction**

Different conflicting objectives often arise in many food processes (e.g., quality vs. economic cost). Finding the optimal solutions that balance all the existing objectives is not an easy task, due to the complexity of the mathematical models describing such processes [1–4]. This optimization step is crucial for an efficient decision-making process [5]. Recent research has been devoted to optimizing food processes where two or more conflicting objectives appear. The most common are usually related to product quality and process economy [6–8], different quality parameters [9–11], economic and environmental parameters [12,13], or, as in the present study, product quality and safety [14].

One of the most important methods of food preservation in the food industry is thermal processing. Historically, the focus was on optimizing heat treatments to improve the processes related to microbial destruction, nutrient retention, cooking values, and loss of quality [15]. The main function of heat treatments is to inactivate microorganisms and enzymes to achieve safe, long shelf-life food. The associated disadvantages are related to food quality (e.g., nutrient or texture preservation) due to the effect of high temperatures. Therefore, the design of thermal processes in the food safety sector must balance different objectives, such as food quality or energy consumption vs. microbial inactivation.

The use of high temperatures implies not only food degradation but also the formation of substances that can be harmful to humans [16]. One example is acrylamide, a chemical produced during high-temperature processing of foods that contain reducing sugars (such as fructose and glucose) and asparagine [17]. Acrylamide was found to be carcinogenic in rodents, and the International Agency for Research on Cancer has classified it as a probable carcinogen [18,19]. This has motivated food authorities to propose methodologies to minimize acrylamide content in commercial and homemade foods; the European Food Safety Authority (EFSA) has set recommendation levels for some foods [20]. The influence of high temperature on the formation of acrylamide has been previously demonstrated [21]: the higher the processing temperature, the more acrylamide is formed, so heat treatments need to be optimized to decrease the amount of acrylamide produced. The formation of acrylamide in a process that involves the heating of food is explained by the Maillard reaction [22]. Since its formulation, this reaction has been studied from different points of view: traditionally, the focus was put on components that affect colour, flavour, and taste, whereas more recently it has moved to the formation of mutagens and carcinogens. Acrylamide is one of the chemicals in the spotlight due to its potential formation in highly consumed foods such as potato chips [23] or Asian noodles [24]; French fries, coffee, and bread have also presented high levels of acrylamide [21]. In fact, a wide range of food products containing fructose and asparagine can contain acrylamide.

**Citation:** Peñalver-Soto, J.L.; Garre, A.; Aznar, A.; Fernández, P.S.; Egea, J.A. Dynamics of Microbial Inactivation and Acrylamide Production in High-Temperature Heat Treatments. *Foods* **2021**, *10*, 2535. https://doi.org/10.3390/foods10112535

Academic Editor: Barbara Cardazzo. Received: 10 September 2021; Accepted: 19 October 2021; Published: 21 October 2021.

In the case of baby food, the recommendations for acrylamide levels are more restrictive than for other types of food, and a maximum allowed amount of 30 μg/kg is set [20]. Different studies [25,26] have shown that various foods exceed these limits. Specifically, potato-based products are highly susceptible to containing high levels of acrylamide. On the other hand, although the EFSA has not yet established recommendations for foods based on vegetables or fruits [27], it has been shown that products such as prune juice can contain high concentrations of acrylamide, reaching values even higher than in potato-based foods [28]. Therefore, this study focuses on foods such as potato puree and prune juice, which can be catalogued as baby food and may not meet EFSA's recommendations for those products.

From a quality vs. microbiological point of view, heat-labile quality factors present a higher z-value than those typical of bacteria. Hence, high-temperature short-time (HTST) processes are less deleterious to food quality while ensuring microbial food safety and stability [29], although the impact on acrylamide formation has not been considered. In this regard, the two proposed objectives (i.e., microbial inactivation and acrylamide formation) are opposed, and the trade-off between them must be analysed.

The microorganism considered here is *Geobacillus stearothermophilus*, a Gram-positive, thermophilic, spore-forming bacterium with an optimal growth temperature around 55 ◦C. Its spores are very heat-resistant and usually survive canning and sterilization operations. Furthermore, it has been detected in different foods such as canned vegetables, ready-to-eat meals containing meat, fruit preparations, and dehydrated ingredients [30]. Other relevant pathogenic spore formers of interest in the food industry, such as *Clostridium botulinum* or *Bacillus cereus*, have not been considered here, as their inactivation is generally not a problem within the range of temperatures considered in this study, which is the range that gives rise to significant amounts of acrylamide.

As *Geobacillus stearothermophilus* spores are used to validate heat sterilization processes [31–34], this study evaluates the inactivation of this microorganism in a heat treatment within the typical temperatures applied to the studied products (120–135 ◦C).

In this work, we have simulated and analysed the dynamics of the two objectives in a thermal inactivation operation. The aim is to use mathematical models to determine the conditions where the amount of acrylamide formed is minimal while ensuring microbial inactivation. To the authors' knowledge, no previous work has jointly addressed the inactivation of microorganisms and acrylamide production.

#### **2. Materials and Methods**

#### *2.1. Case Study*

This study analyses the dynamics of two processes associated with the application of severe heat treatments in some foods: microbial inactivation and acrylamide production. In principle, higher temperatures ensure a safer operation from the microbiological point of view, but can also induce a higher acrylamide production. We have applied this analysis to two practical cases of the food industry (potato puree and prune juice) where different time/temperature treatments are simulated to assess the inactivation of a thermo-resistant microorganism and the acrylamide formation. The microorganism is first characterized (see Section 2.2) and then heat treatments scenarios are simulated for each of the considered foods to assess the dynamics of microbial inactivation and acrylamide formation and their balance.

#### *2.2. Microbial Inactivation Model and Parameter Estimation*

To characterize the behaviour of the microorganism, the Bigelow model [35] was chosen. This choice is motivated by the widespread use of this model to characterize the microbial inactivation of *Geobacillus stearothermophilus* in the literature (e.g., [36–38]). In any case, the use of other models would not invalidate the methodology presented here, and the whole procedure would be similar. The model considers a log-linear relationship between the fraction of survivors (*S*) and treatment time (*t*), as shown in Equation (1).

$$
\log\_{10} S = \frac{-t}{D(T)} \tag{1}
$$

$$
\log\_{10} D(T) = \log\_{10} D(T\_{ref}) - \frac{T - T\_{ref}}{z} \tag{2}
$$

The influence of treatment temperature (*T*) on the microorganism is reflected in the D-value, which is log dependent on the temperature, as shown in Equation (2). The D-value represents the time required to reduce the microbial population by 90% at a constant temperature, and the z-value quantifies the sensitivity of the D-value to temperature changes. The reference temperature, *Tref*, is a parameter without biological meaning, but it can improve parameter identifiability [39,40].
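Equations (1) and (2) are straightforward to evaluate; a sketch using the parameter values estimated later in this section (*log*10*D*(*Tref*) = −0.0468, *z* = 8.66 ◦C, *Tref* = 125 ◦C):

```python
# Bigelow model (Eqs. 1 and 2): log-linear survival with a D-value
# that depends log-linearly on temperature.
LOG10_D_REF = -0.0468   # log10 of D at T_ref (D in min)
Z = 8.66                # z-value (degC)
T_REF = 125.0           # reference temperature (degC)

def log10_D(T):
    """log10 of the D-value (min) at temperature T (degC), Eq. (2)."""
    return LOG10_D_REF - (T - T_REF) / Z

def log10_S(t, T):
    """log10 of the fraction of survivors after t min at T degC, Eq. (1)."""
    return -t / 10.0 ** log10_D(T)

# Example: log reductions achieved by 5 min at the reference temperature
print(-log10_S(5.0, 125.0))
```

Note that increasing the temperature by one z-value (8.66 ◦C) divides the D-value, and hence the required treatment time, by ten.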

To estimate the model parameters, several D-values (n = 113) for thermal inactivation of *Geobacillus stearothermophilus* were collected from the literature (Web of Science database), as described in [41]. A temperature range between 97.5 ◦C and 137.5 ◦C was considered, and only food matrices (especially vegetable-based) were included [38,42–50]. Non-linear regression was applied to obtain mean *log*10*D*(*Tref*)-values and *z*-values with their respective standard errors.

The estimated model parameters are *log*10*D*(*Tref*) = −0.0468 ± 0.03789 and *z* = 8.66 ± 0.283 (◦C). The experimental data, as well as the fitted model, are presented in Figure S1, which shows a good model fit, confirming the suitability of the Bigelow model to characterize the microorganism under the conditions considered. The reference temperature was set to a value near the middle of the temperature range (*Tref* = 125 ◦C), as recommended by [40]. Monte Carlo simulations were used to calculate the probability that a heat treatment (time, temperature) produces at least 6 logarithmic reductions in the microbial load (symbolized as *P*(*log*10*S* ≥ 6) within the text), which is considered to be sufficient for many inactivation processes. In any case, changing this value would not change the proposed methodological approach. For this, we selected 1000 pairs of values of the model parameters (*log*10*D*(*Tref*), *z*), obtained by simulation from two independent normal distributions, where *log*10*D*(*Tref*) ∼ *N*(*μ* = −0.0468, *σ* = 0.03789) and *z* ∼ *N*(*μ* = 8.66, *σ* = 0.283). Next, the expected log reduction was calculated with Equations (1) and (2) from the heat treatment conditions (time, temperature) for each of the 1000 pairs of parameter values. Finally, the probability of achieving the target inactivation was calculated by dividing the number of cases that comply with *log*10*S* ≥ 6 by 1000. This procedure was used to address the first objective: maximize *P*(*log*10*S* ≥ 6).
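The Monte Carlo procedure just described can be sketched as follows; the parameter distributions are those estimated above, while the treatment conditions in the example (and the random seed) are illustrative:

```python
import numpy as np

# 1000 parameter pairs drawn from the two independent normal distributions
rng = np.random.default_rng(42)
N = 1000
log10_D_ref = rng.normal(-0.0468, 0.03789, N)
z = rng.normal(8.66, 0.283, N)
T_REF = 125.0

def p_at_least_6_log(t, T):
    """Fraction of parameter draws giving >= 6 log reductions at (t, T)."""
    log10_D = log10_D_ref - (T - T_REF) / z   # Eq. (2), one value per draw
    reductions = t / 10.0 ** log10_D          # -log10 S, from Eq. (1)
    return float(np.mean(reductions >= 6.0))

# Example: a long treatment at 120 degC vs. a very short one
print(p_at_least_6_log(24.0, 120.0))
print(p_at_least_6_log(0.5, 120.0))
```

Each call evaluates all 1000 draws vectorially, so scanning a (time, temperature) grid, as done for Figure 1, is inexpensive.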

#### *2.3. Acrylamide Production Objective*

To quantify the acrylamide formation, the multi-response kinetics in a glucose-asparagine reaction at high temperatures (120–200 ◦C) proposed by [51] were used. The model is based on the reaction network shown in Figure S2.

Glucose and asparagine react to form a Schiff base. Fructose is formed by glucose isomerization, and it also reacts with asparagine to form the Schiff base. At the same time, the Schiff base is degraded into melanoidins and acrylamide, whereas acrylamide is degraded into unknown species (named *Product X*). Study [51] calculated the equilibrium constants for the temperature range (120–200 ◦C), which showed a logarithmic relationship with temperature. Using this observed relationship, a temperature-dependent function was fitted for each constant (*Ki*(*T*) ∀*i* = 1, 2, . . . , 6). Although the range of temperatures used by Knol et al. does not coincide with the range considered for the *log*10*D*(*T*) vs. *T* values in this study (i.e., from 97.5 to 137.5 ◦C), we consider that the same logarithmic relationship applies in our case. The estimated kinetic constant values were used in the system of ordinary differential equations that quantifies the amount of acrylamide formed for a specific time and temperature, shown in Equation (3).

$$\begin{cases} \frac{d[\text{Glucose}]}{dt} = -\mathcal{K}\_{1}(T) \cdot [\text{Glucose}] \cdot [\text{Asparagine}] - \mathcal{K}\_{2}(T) \cdot [\text{Glucose}]\\ \frac{d[\text{Fructose}]}{dt} = -\mathcal{K}\_{3}(T) \cdot [\text{Fructose}] \cdot [\text{Asparagine}] + \mathcal{K}\_{2}(T) \cdot [\text{Glucose}]\\ \frac{d[\text{Asparagine}]}{dt} = -\mathcal{K}\_{1}(T) \cdot [\text{Glucose}] \cdot [\text{Asparagine}] - \mathcal{K}\_{3}(T) \cdot [\text{Fructose}] \cdot [\text{Asparagine}]\\ \frac{d[\text{Schiff base}]}{dt} = \mathcal{K}\_{1}(T) \cdot [\text{Glucose}] \cdot [\text{Asparagine}] + \mathcal{K}\_{3}(T) \cdot [\text{Fructose}] \cdot [\text{Asparagine}] - \mathcal{K}\_{4}(T) \cdot [\text{Schiff base}] - \mathcal{K}\_{5}(T) \cdot [\text{Schiff base}]\\ \frac{d[\text{Acrylamide}]}{dt} = \mathcal{K}\_{4}(T) \cdot [\text{Schiff base}] - \mathcal{K}\_{6}(T) \cdot [\text{Acrylamide}] \end{cases} \tag{3}$$

One of the solutions of the system of differential equations is the acrylamide concentration formed for a certain heat treatment (time, temperature), which is the second objective in our formulation. For this second objective, apart from the heat treatment conditions (i.e., the time and temperature variables), it is necessary to set the initial amounts of glucose, fructose, and asparagine. Two foods were selected: potato puree (typical pH of 5.1–6.0) and prune juice (typical pH of 4.0–5.0). The initial concentrations for each of the two selected foods are shown in Table 1; details on these calculations are provided in the Supplementary Material. From the existing potato varieties, the *red potato* was chosen, as it produces the highest amount of acrylamide [52].
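Equation (3) can be integrated numerically with a standard ODE solver. In the sketch below, the rate constants and initial concentrations are illustrative placeholders: the temperature-dependent functions *Ki*(*T*) fitted in [51] are not reproduced here, so fixed values stand in for their evaluation at one treatment temperature.

```python
from scipy.integrate import solve_ivp

# ILLUSTRATIVE rate constants (stand-ins for K1(T)..K6(T) of [51]
# evaluated at one temperature; units consistent with min and mmol/L).
K1, K2, K3, K4, K5, K6 = 2e-3, 1e-3, 1e-3, 5e-2, 1e-1, 1e-2

def maillard(t, y):
    """Right-hand side of the reaction network in Eq. (3)."""
    glu, fru, asn, schiff, acr = y
    d_glu = -K1 * glu * asn - K2 * glu
    d_fru = -K3 * fru * asn + K2 * glu
    d_asn = -K1 * glu * asn - K3 * fru * asn
    d_schiff = K1 * glu * asn + K3 * fru * asn - K4 * schiff - K5 * schiff
    d_acr = K4 * schiff - K6 * acr
    return [d_glu, d_fru, d_asn, d_schiff, d_acr]

# Illustrative initial concentrations [glucose, fructose, asparagine,
# Schiff base, acrylamide] and a 25 min isothermal treatment.
y0 = [50.0, 10.0, 20.0, 0.0, 0.0]
sol = solve_ivp(maillard, (0.0, 25.0), y0)
acrylamide = sol.y[4, -1]  # concentration formed at the end of treatment
print(acrylamide)
```

The acrylamide concentration at the final time is the quantity reported in Figures 2 and 4 for each (time, temperature) pair.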


**Table 1.** Initial concentrations of glucose, fructose, and asparagine in pureed potato and prune juice.

#### **3. Results and Discussion**

This section analyses the results of the simulation of isothermal heat treatments with temperatures between 120 and 135 ◦C for two food matrices, pureed potatoes and prune juice. On the one hand, to quantify inactivation, a Monte Carlo simulation approach is used, as explained in Section 2.2, which provides the probability that the heat treatment (time, temperature) produces at least six logarithmic reductions (*P*(*log*10*S* ≥ 6)). Therefore, all results for inactivation in both foods are related to that probability.

On the other hand, acrylamide production is reported in μg/kg, the unit of measurement recommended by the EFSA for comparisons [20].

#### *3.1. Single-Objective Analysis*

Regarding the first objective (i.e., microbial inactivation), *P*(*log*10*S* ≥ 6) as a function of temperature and time of heat treatment is presented in Figure 1. As expected, the areas with the highest probability of having more than six log-reductions in the microbial count are those related to higher temperatures and longer treatment times. It is notable that, in the ranges of time and temperature considered (Figure 1), *P*(*log*10*S* ≥ 6) is very sensitive to small changes in temperature or in processing time. This sensitivity is higher as the temperature increases.

**Figure 1.** Probability of six or more logarithmic reductions *P*(*log*10*S* ≥ 6) for *Geobacillus stearothermophilus* as a function of heat treatment (time, temperature).

On the other hand, the impact of the heat treatment on acrylamide formation is represented in Figure 2A for pureed potato and Figure 2B for prune juice. The area showing the highest acrylamide formation is defined by the highest treatment temperature. As expected, higher temperature and/or longer duration of the heat treatment had a positive correlation with acrylamide formation, although significant differences were found between the foods tested. The main difference was the amount of acrylamide that could be produced, which was higher for prune juice at all the time/temperature combinations. For example, for the most severe treatments (upper right corner of Figure 2A,B) the concentration was around two times higher in the case of prune juice. Comparing the isolines of both objectives, larger changes in temperature or processing time are needed to produce significant changes in acrylamide production (Figure 2) than in *P*(*log*10*S* ≥ 6) (Figure 1). In other words, acrylamide formation is less sensitive to temperature and time than inactivation.

In any case, this single-objective analysis confirms that the two objectives counter each other, so a balance in the operating parameters must be found. As expected, an increase in temperature or processing time in thermal treatments favours the inactivation of the microorganism but also increases acrylamide formation. However, the sensitivity of the two responses to changes in temperature or time is significantly different.

**Figure 2.** Amount of acrylamide formed as a function of heat treatment (time, temperature) for the potato puree (**A**) and prune juice (**B**).

#### *3.2. Dynamics of Geobacillus Inactivation and Acrylamide Formation*

As previously discussed, the two objectives conflict. Therefore, to analyse the influence of temperature on both objectives simultaneously, the inactivation rates and the amount of acrylamide formed at different treatment temperatures are evaluated. Simulated inactivation curves for the ranges of temperature and time considered are represented in Figure 3, following the Bigelow model (i.e., Equations (1) and (2)). An increase in temperature produces an increase in the slope of the inactivation curve and therefore a faster inactivation. On the other hand, acrylamide formation is explained by the Maillard reaction (Equation (3)), and its dynamics are represented in Figure 4. Figure 4A shows the formation rates for pureed potato: the higher the temperature, the higher the formation. This is confirmed by Figure 4B, which corresponds to the amount of acrylamide formed in prune juice.

**Figure 3.** *Geobacillus stearothermophilus* inactivation dynamics according to the Bigelow model.

**Figure 4.** Acrylamide formation dynamics according to the Maillard reaction: potato puree (**A**) and prune juice (**B**).

Figure 5A,B represent the inactivation curves for five different temperatures between 120 and 135 ◦C and the acrylamide formed for each temperature when *P*(*log*10*S* ≥ 6) = 0.95 for pureed potato and prune juice, respectively. Inactivation curves are represented as coloured lines and the left y-axis measures the number of logarithmic reductions. As discussed before, higher temperatures result in a higher slope of the inactivation rate. Bars placed at the times required for each treatment temperature show that processes at higher temperatures need less treatment time to achieve at least six log-reductions with a 95% probability. On the other hand, the height of each bar represents the acrylamide formed for each treatment (time–temperature).

**Figure 5.** Inactivation rates represented as lines and acrylamide formed represented as bars for different temperatures when *P*(*log*10*S* ≥ 6) = 0.95, potato puree (**A**) and prune juice (**B**).

It is observed that, both for the potato puree (Figure 5A) and the prune juice (Figure 5B), the lower the temperature, the greater the amount of acrylamide formed, due to the longer treatment time needed to achieve *P*(*log*10*S* ≥ 6) = 0.95. As the temperature increases, the treatment time required to inactivate the spores decreases, and therefore the amount of acrylamide formed is also lower. We must recall that both objectives (i.e., microbial inactivation and acrylamide formation) are determined by the combination of temperature and time. The processing time for each temperature is determined by the defined level of inactivation (i.e., *P*(*log*10*S* ≥ 6) = 0.95). This time is different for each temperature, and the balance between the dynamics explains the unexpected differences in the calculated acrylamide amount.

Table 2 quantifies the treatment time and the acrylamide formed for each temperature and three values of *P*(*log*10*S* ≥ 6). For an inactivation probability of 95%, if the temperature is increased by 12.5% (from 120 to 135 ◦C), the treatment time is reduced by 98% (from 23.92 to 0.46 min). Higher acrylamide amounts were observed for prune juice, as its content of reducing sugars is higher than that of pureed potato, although the qualitative appearance of the dynamics of the objectives is the same for both foods. The previously mentioned 12.5% increase in temperature (implying a 98% reduction in the treatment time) would result in a 99.5% and 99.3% reduction in acrylamide for pureed potato and prune juice, respectively. Therefore, as the kinetics respond differently to temperature changes, an additional analysis to find the optimal trade-off between both objectives was performed, based on a multi-objective approach.
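The steep reduction in treatment time can be illustrated with the deterministic Bigelow times, *t* = 6·*D*(*T*), using the mean parameter values of Section 2.2; this ignores the Monte Carlo safety margin, so the times come out somewhat shorter than those reported in Table 2:

```python
# D-value (min) from Eq. (2) with the mean estimates of Section 2.2
D = lambda T: 10.0 ** (-0.0468 - (T - 125.0) / 8.66)

t_120 = 6.0 * D(120.0)   # deterministic time for 6 log reductions at 120 degC
t_135 = 6.0 * D(135.0)   # same target at 135 degC
reduction = 100.0 * (1.0 - t_135 / t_120)  # % time saved by heating hotter
print(t_120, t_135, reduction)
```

Even without the probabilistic margin, the required time shrinks by well over 90% between 120 and 135 ◦C, consistent with the 98% reduction reported above.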

**Table 2.** Acrylamide formed and duration for different heat treatments and inactivation probabilities.


#### *3.3. Multi-Objective Approach*

The dependence of both objectives on time and temperature calls for a multi-objective optimization approach, where the aim is to find the (temperature, time) pairs that form the Pareto front of optimal solutions (i.e., the set of temperature/time pairs for which neither objective can be improved without worsening the other). However, the application of such an approach led to a set of Pareto solutions consisting of the maximum temperature tested (135 ◦C) combined with different processing times (data not shown). This behaviour is explained by the different dynamics of the objectives analysed above: an increase in temperature drastically reduces the processing time needed to achieve *P*(*log*10*S* ≥ 6) = 0.95, which in turn reduces the amount of acrylamide formed, given the slower dynamics of that process. For that reason, the non-dominated solutions consist of pairs of the maximum temperature tested and different processing times.
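The non-domination filter underlying a Pareto front can be sketched as follows, with both objectives to be minimized (e.g., processing time and acrylamide formed). The quadratic-time scan and the toy candidate values are for illustration only, not the optimizer used in this work.

```python
def pareto_front(points):
    """Return the non-dominated subset of (obj1, obj2) pairs,
    assuming both objectives are to be minimized."""
    front = []
    for p in points:
        dominated = any(
            q != p and q[0] <= p[0] and q[1] <= p[1]
            for q in points
        )
        if not dominated:
            front.append(p)
    return front

# Toy (time, acrylamide) candidates; only the trade-off curve survives.
candidates = [(1, 5), (2, 4), (3, 3), (2, 6), (4, 4)]
print(pareto_front(candidates))  # [(1, 5), (2, 4), (3, 3)]
```

When one decision variable (here, temperature) improves both objectives simultaneously, the front collapses onto its extreme value, as observed above, which motivates fixing temperature to discrete values in the following analysis.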

While this is a perfectly valid mathematical result, it is not very useful from the engineering point of view if we want to check the effect of other temperatures on the objectives, which can be critical if additional objectives are considered. For this reason, we eliminated temperature as a decision variable and fixed it to discrete values within the tested range, analysing the evolution of both objectives over time for each of these temperatures. Figure 6 shows the acrylamide formed (y-axis) in a heat treatment that inactivates the microorganism for different values of *P*(*log*10*S* ≥ 6), represented on the x-axis.

The highest temperature tested (135 ◦C, orange dots) produced the lowest amount of acrylamide at any probability level, as the treatment time was short. Conversely, the lowest temperature (120 ◦C, purple dots), associated with long treatment times, produced the highest amount of acrylamide. This reaffirms the results of the previous section and also broadens the perspective of the problem: increasing the probability of reaching the target microbial inactivation promotes acrylamide formation, although with a very low sensitivity, except for probabilities close to 100%, where acrylamide formation rises more sharply. The differences between foods (Figure 6A,B) appear only in the acrylamide levels; the qualitative behaviour is similar.

**Figure 6.** Acrylamide formed in a heat treatment that inactivates the microorganism for different values of *P*(*log*10*S* ≥ 6): potato puree (**A**) and prune juice (**B**).

For the lowest temperature (120 ◦C), the duration of the treatment exceeds 23 min for *P*(*log*10*S* ≥ 6) = 90% (Table 2). Increasing the temperature from 120 ◦C to 135 ◦C (+12.5%) reduces the time needed for a 90% probability to around half a minute and decreases acrylamide by up to 99.5% and 99.3% for pureed potato and prune juice, respectively.

To increase *P*(*log*10*S* ≥ 6) from 90% to 99%, the processing time must increase by approximately 10–13%, depending on the temperature considered. As a result, the amount of acrylamide increases by about 15–23%, again depending on the chosen temperature, for both products.

To assist the decision-making process, Figure 7A,B collect the information needed to design a heat treatment that maximizes food safety by inactivating the microorganism while minimizing acrylamide formation. These figures show the amount of acrylamide formed (y-axis) as a function of the temperature of the heat treatment (x-axis) and of *P*(*log*10*S* ≥ 6) (colour of the lines). The treatment time is determined by both the chosen temperature and the probability level, as explained below. The horizontal black line represents the maximum amount recommended in baby food (30 μg/kg) [20], taken here as a worst-case scenario. Figure 7A refers to pureed potato, whereas Figure 7B refers to prune juice. These figures provide a global vision of how the heat treatment directly determines the amount of acrylamide formed. The area above the horizontal line is undesirable, as the amount of acrylamide exceeds the recommendation. Ideally, we should remain in the lower region, so that the product has a low level of acrylamide while the inactivation of the microorganism is ensured for a given value of *P*(*log*10*S* ≥ 6). In both cases, acrylamide falls as temperature increases over this temperature range, as discussed above.

In Figure 7A, for pureed potato, the horizontal line divides the desirable region at around 126–127 ◦C for all the considered probabilities. Therefore, the heat treatments should be above that temperature. At 126–127 ◦C we would lie within the recommended limit of acrylamide (25–30 μg/kg) whereas an increase of 8–9 ◦C would produce around 1 μg/kg (Table 2). For the case of prune juice, Figure 7B, the same behaviour is observed. However, the amount of acrylamide formed is higher, as the initial concentrations of fructose and asparagine are also higher. In this case, the recommended limit for all the considered probabilities would lie within 129–130 ◦C. This information can be useful to consider the use of these foods in, e.g., baby-food products, to avoid exceeding the recommendation.
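The threshold reading taken from Figure 7 amounts to scanning an acrylamide-versus-temperature curve for the lowest temperature whose prediction stays within the 30 μg/kg line. The curve values below are hypothetical, chosen only to reproduce the qualitative shape discussed for pureed potato:

```python
LIMIT = 30.0  # µg/kg, EFSA-based benchmark for baby food (worst-case scenario)

# Hypothetical (temperature °C, acrylamide µg/kg) pairs at a fixed P(log10 S >= 6).
curve = [(120, 210.0), (125, 55.0), (127, 28.0), (130, 9.0), (135, 1.2)]

def min_safe_temperature(curve, limit=LIMIT):
    """Lowest grid temperature whose acrylamide prediction is within the limit."""
    for temp, acrylamide in sorted(curve):
        if acrylamide <= limit:
            return temp
    return None

print(min_safe_temperature(curve))  # 127 with these illustrative values
```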

**Figure 7.** Acrylamide formed as a function of time for each temperature and *P*(*log*10*S* ≥ 6): potato puree (**A**) and prune juice (**B**).

Decision makers can use plots such as Figure 7 to assess which levels of acrylamide are likely to be present in the final product depending on *P*(*log*10*S* ≥ 6) and temperature, which could affect other properties of the food not considered here. Another approach would be to select the desired acrylamide level and *P*(*log*10*S* ≥ 6), which then determines the treatment temperature. Figure 7 shows that, at high temperatures, the duration of the treatment is very short regardless of the inactivation probability chosen, so acrylamide formation is low in every case. At lower temperatures, however, the sensitivity is higher: small changes in temperature (linked to longer exposure times) result in significant changes in acrylamide production.

To complement these design steps, the duration of the treatment should be calculated. For that purpose, Figure 8 shows the time (x-axis) required at each temperature (y-axis) for three selected values of *P*(*log*10*S* ≥ 6). For example, if the required temperature is 130 ◦C for a probability level of 85%, the middle plot gives a treatment duration of 2.5 min. This value can also be retrieved by simulating both models (acrylamide production and microbial inactivation).
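The probabilistic layer, *P*(*log*10*S* ≥ 6), can be sketched by propagating parameter uncertainty through the Bigelow model with Monte Carlo sampling and taking the corresponding quantile of the required times. The sampling distributions below are illustrative assumptions, not the ones fitted in this work:

```python
import random

def treatment_time(temp_c, prob=0.85, n=10_000, seed=42):
    """Monte-Carlo sketch: sample uncertain Bigelow parameters and return the
    prob-quantile of the times giving six log reductions, so that the sampled
    probability of reaching log10 S >= 6 is at least prob."""
    rng = random.Random(seed)
    times = []
    for _ in range(n):
        d_ref = rng.gauss(4.0, 0.4)  # D at 121.1 °C, in min (hypothetical)
        z = rng.gauss(9.0, 0.5)      # z-value, °C (hypothetical)
        times.append(6 * d_ref * 10 ** ((121.1 - temp_c) / z))
    times.sort()
    return times[min(int(prob * n), n - 1)]

# Hotter treatments need less time; stricter probabilities need more.
print(f"{treatment_time(130):.2f} min at 130 °C, "
      f"{treatment_time(135):.2f} min at 135 °C")
```

Reading Figure 8 replaces this simulation with a direct lookup: fix the probability level, pick the temperature, and read off the time.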

**Figure 8.** Treatment time required for each temperature and probability.

#### **4. Conclusions**

We have analysed the balance between microbial inactivation and acrylamide formation in the case where a thermoresistant microorganism may be present in a food in which acrylamide can form during high-temperature thermal treatments. As a case study, we chose the inactivation of *Geobacillus stearothermophilus* in two particular foods (i.e., pureed potato and prune juice), characterized with the Bigelow model; acrylamide formation was modelled with Maillard-reaction kinetics. The analysis of the dynamics of both processes reveals that, to ensure a certain level of microbial inactivation, heat treatments at higher temperatures lead to decreased acrylamide formation, similar to the behaviour of quality components. This is due to the processes' different sensitivities to temperature: while microbial inactivation is very sensitive (i.e., the time needed to reach a level of inactivation with a certain probability decreases dramatically with temperature), acrylamide formation is not. The methodology presented here can be used by decision makers to design heat treatments that meet food safety objectives, maximizing inactivation while minimizing the amount of acrylamide (or other target substances). As the selected foods can be used as ingredients in baby foods, the outcomes were compared with EFSA's recommendations on maximum acrylamide concentrations in baby food. Even at the highest temperatures, where the least acrylamide is formed thanks to the short processing times needed to ensure microbial inactivation, the expected acrylamide amounts are very close to EFSA's maximum recommendation, so this should be taken into account when using these foods in infant formulations.

The methodology presented here can serve as a basis for re-designing processes in which food safety and acrylamide formation are important issues. It can be used to make decisions when there are unexpected deviations in process variables (e.g., lower treatment temperatures than expected). It can also be extended to other processes and microorganisms; in future work we plan to include other conflicting objectives that depend on temperature and time, such as quality or cost. Future work will also aim to experimentally validate the conclusions obtained here, to refine the model fitting under other conditions, and to check the effects of different food matrices.

**Supplementary Materials:** The following are available online at https://www.mdpi.com/article/10.3390/foods10112535/s1, Figure S1: Experimental data and fitted model for the considered *log*10*D*/*T* values, Figure S2: Kinetic model for acrylamide formation from glucose, fructose and asparagine proposed by Knol et al. 2005. Supp\_file1: Calculation of initial concentrations of glucose, fructose and asparagine for pureed potato and prune juice.

**Author Contributions:** Conceptualization: J.A.E. and P.S.F.; methodology, J.A.E., A.G. and J.L.P.-S.; software, J.L.P.-S. and A.G.; validation, J.A.E., A.G., A.A. and P.S.F.; formal analysis, J.A.E., A.G., A.A. and P.S.F.; investigation, J.L.P.-S.; resources, J.A.E. and P.S.F.; data curation, J.L.P.-S. and A.G.; writing—original draft preparation, J.L.P.-S.; writing—review and editing, J.A.E., A.G., A.A., P.S.F. and J.L.P.-S.; visualization, J.L.P.-S.; supervision, J.A.E., A.G. and P.S.F.; project administration, J.A.E. and P.S.F.; funding acquisition, J.A.E. and P.S.F. All authors have read and agreed to the published version of the manuscript.

**Funding:** The financial support of this research work was provided by the Ministry of Science, Innovation and Universities of the Spanish Government and European Regional Development Fund (ERDF) through project AGL2017-86840-C2-1-R. J.L.P.-S. is grateful to the JAE-INTRO program from CSIC (Grant no JAEINT19\_EX\_0797). A.G. was supported by a postdoctoral grant from the Fundación Séneca (20900/PD/18).

**Institutional Review Board Statement:** Not applicable.

**Informed Consent Statement:** Not applicable.

**Conflicts of Interest:** The authors declare no conflict of interest.

#### **References**

