**Bioprocess Monitoring and Control**

Editor

**Bernd Hitzmann**

MDPI • Basel • Beijing • Wuhan • Barcelona • Belgrade • Manchester • Tokyo • Cluj • Tianjin

*Editor* Bernd Hitzmann University of Hohenheim Germany

*Editorial Office* MDPI St. Alban-Anlage 66 4052 Basel, Switzerland

This is a reprint of articles from the Special Issue published online in the open access journal *Processes* (ISSN 2227-9717) (available at: https://www.mdpi.com/journal/processes/special\_issues/bioprocess\_control).

For citation purposes, cite each article independently as indicated on the article page online and as indicated below:

LastName, A.A.; LastName, B.B.; LastName, C.C. Article Title. *Journal Name* **Year**, *Article Number*, Page Range.

**ISBN 978-3-03936-932-4 (Hbk) ISBN 978-3-03936-933-1 (PDF)**

Cover image courtesy of Ana Juan-García.

© 2020 by the authors. Articles in this book are Open Access and distributed under the Creative Commons Attribution (CC BY) license, which allows users to download, copy and build upon published articles, as long as the author and publisher are properly credited, which ensures maximum dissemination and a wider impact of our publications.

The book as a whole is distributed by MDPI under the terms and conditions of the Creative Commons license CC BY-NC-ND.


## **About the Editor**

**Bernd Hitzmann**, Dr. rer. nat., is a professor at the University of Hohenheim. After his studies and doctorate at the University of Hannover, he worked at the California Institute of Technology, Pasadena, USA, and at ABB's corporate research center, in Heidelberg, Germany. He habilitated at the University of Hannover. Since 2011, he has been Head of the Department of Process Analytics and Cereal Science, Institute of Food Science and Biotechnology, at the University of Hohenheim, Germany. His expertise lies in bioprocess monitoring and control, spectroscopy, modelling, and chemometrics.

## **Preface to "Bioprocess Monitoring and Control"**

Bioprocesses have been carried out for thousands of years. When they were first used, knowledge of how to run them so that the required product, mainly food, was obtained in the desired quality was based on experience. In the time of Antoni van Leeuwenhoek (1632–1723), the important protagonists of bioprocesses, the microorganisms, became accessible. In particular, the production of beer, i.e., ethanol, was a major driver that significantly improved our knowledge of bioprocesses. That temperature is one of the most important process variables was recognized very early. During the 20th century, many improvements in process development were achieved. Examples from the beginning of that century are the development of a fed-batch process for baker's yeast production and the patent on the production process for acetone and butanol (1915). In 1949, Monod published his model for the specific growth rate of microorganisms. Not only did process development evolve, but so did process measurement systems such as spectroscopy, for example at BASF in 1938. These are important for the collection of on-line process information. From 1941 to 1944, industrially produced penicillin, i.e., a medicinal drug, began to be used, and glutamic acid production began in 1957. Around that time (1960), Rudolf E. Kalman published a new approach to linear filtering and prediction problems, i.e., the Kalman filter.

That all these themes are still of high importance is obvious. In particular, due to the initiative "A European Green Deal" of the European Commission, the further development and optimization of the bioprocesses involved in the bioeconomy are of fundamental importance. For a sustainable economy, bioprocesses must be central. They must be carried out in an optimal way, so that resources are saved.

In this book, methods and techniques are provided for the monitoring and control of bioprocesses. The information presented ranges from new developments for sensors, the application of spectroscopy and modelling approaches, and estimation and observer implementation for ethanol production to the development and scale-up of various bioprocesses and their closed loop control. The processes discussed here are very diverse. The major applications are cultivation processes, in which microorganisms are grown, but the incubation process of birds' eggs, as well as indoor climate control for humans, is also discussed. Altogether, the 12 chapters comprise nine original research papers and three reviews. In the first chapter, a model-based soft sensor is presented, which is based on the adjustable structure geometric observer, and can be used to estimate important variables in a bioreactor. The performance of this approach was compared to the performance of an extended Kalman filter. In the second chapter, a new control method for a fed-batch yeast cultivation is discussed, based on dielectric spectroscopy to control the specific growth rate at different set points. Siderophores are important in areas where the availability of iron is low, because iron is one of the critical growth limiting factors for almost all aerobic microorganisms. How the production of siderophores can be carried out is demonstrated in the third chapter. The production of an antimicrobial agent is discussed in the fourth chapter, as well as its inhibitory properties against different microorganisms. In the fifth chapter, the effect of nitrate and perchlorate on selenate reduction in a batch reactor is presented. A study of a bioleaching process for the extraction of metals from a flotation concentrate is discussed in the sixth chapter. In this contribution, results regarding the influence of two typical flotation frothers on the sensitivity of bacteria in the mesophilic mixed culture are presented.
Due to the coronavirus, medicinal drugs are more important than ever. The monitoring of monoclonal antibody breakthrough curves in chromatographic downstream operations is presented in chapter seven. As a measurement system, a Raman spectrometer is used and complemented with an extended Kalman filter. It is demonstrated that this approach allows the estimation of the antibody concentrations with reduced noise and increased robustness. In the eighth chapter, a model predictive controller is presented, which can regulate the heating power of an incubator for bird eggs. It is demonstrated that the air temperature can be kept constant, although the eggshell temperatures are different in different zones of the incubator. An adaptive occupant-based predictive controller for a heating, ventilation, and air conditioning system using a predictive classification model to provide an optimal indoor climate with respect to temperature and humidity is discussed in the ninth chapter. As mentioned already, sensors are fundamental for optimal process performance. In chapter ten, examples of well-defined conjugated macromolecules based on oligo(arylene ethynylene) skeletons are reviewed for their use in sensor applications. In chapter eleven, the most important models for physiological, biochemical, and physical properties governed by temperature are discussed. In the review, a toolset for the future exploitation of temperature as a control variable for optimization, monitoring, and control applications in bioprocess engineering is presented. The importance of the control of the specific growth rate to enable the improvement of the quality and reproducibility of bioprocesses is reviewed in the last chapter. Requirements are given that must be met to successfully implement the specific growth rate control system. Furthermore, recommendations are presented for the selection of particular control systems for specific biotechnological processes.

I would like to thank the authors of the chapters for their excellent work, as well as the staff of MDPI, who supported me in making this book a reality.

Summer 2020

> **Bernd Hitzmann** *Editor*

## *Editorial* **Special Issue: Bioprocess Monitoring and Control**

#### **Bernd Hitzmann**

Department of Process Analytics and Cereal Science, Institute of Food Science and Biotechnology, University of Hohenheim, 70599 Stuttgart, Germany; Bernd.Hitzmann@Uni-Hohenheim.de

Received: 14 July 2020; Accepted: 14 July 2020; Published: 16 July 2020

Bioprocesses can be found in different areas such as the production of food, feed, energy, chemicals, and pharmaceuticals. From bio-catalysis to fermentation processes or mammalian cell cultures, different reaction systems are applied. Due to the bio-economy initiatives in different countries, the number of bioprocesses will grow further in the future. One characteristic feature of all these different bioprocesses is a complex reaction matrix where different substances play an important role. Frequently, one must deal with a three-phase system, i.e., a liquid, a gas, and a solid phase. For the optimal operation of these processes, monitoring and supervision systems are required for all phases. Additionally—although bioprocesses have been applied for several thousand years, such as the fermentation of dough—on-line measurement systems for important process variables are still rare. Although the measurement of key variables is a challenge, the control of them to guarantee optimal yields is an even greater challenge.

This Special Issue of *Processes* entitled "Bioprocess Monitoring and Control" presents novel examples of on-line monitoring and closed loop control techniques applied to different bioprocesses. The accepted manuscripts cover a range of important topics in different bioprocess areas, where microorganisms, birds' eggs, and humans are involved. Different techniques such as those for the construction of sensors, the production of a biocontrol agent, scale-up procedures, the application of observers, closed loop control, and the model-based monitoring of a downstream process are presented. The accepted manuscripts are nine original research papers and three reviews, which are summarized below.

Lisci et al. [1] studied a model-based soft sensor, which is based on the adjustable-structure geometric observer and can be used to estimate important variables in a bioreactor. The performance of this approach was compared to the performance of an extended Kalman filter. Using simulations, they were able to show that both estimators lead to good estimation performance. The geometric observer estimation is more sensitive to measurement noise, probably because of the presence of the Lie derivative in the correction term. Lisci et al. concluded that the systematic geometric approach led to the best solution for the estimation problem, giving a structure that did not depend on the correction algorithm.

Brignoli et al. [2] present a new control method for the fed-batch cultivation of *Kluyveromyces marxianus*. They counter the problem of noise and oscillations in the control variable and address the exponential growth dynamics more effectively. Based on dielectric spectroscopy for the on-line biomass concentration measurements, the specific growth rate was estimated. Using a feedforward-feedback controller, the authors could demonstrate that the specific growth rate could be maintained at different set point values. Therefore, the feasibility of the closed loop control of the specific growth rate of yeast in long-duration fed-batch cultures was demonstrated successfully.

Abo-Zaid et al. [3] present results from tests of twenty fluorescent *Pseudomonas* isolates for their ability to produce siderophores. The assessment of their antagonistic activity against six plant pathogenic fungal isolates is demonstrated. For the promising strains, a scaling-up production of siderophores from fluorescent *Pseudomonads* was carried out. They could show that the exponential fed-batch fermentation of *P. aeruginosa* F2 and *P. fluorescens* JY3 gave higher concentrations of siderophores and biomass than batch fermentation. Furthermore, they demonstrated that formulations of siderophore-producing fluorescent *Pseudomonads* were effective in controlling soil-borne fungi and for the stimulation of plant growth. Therefore, they concluded that bio-friendly formulations of siderophore-producing fluorescent *Pseudomonas* isolates could be used as biocontrol agents for controlling some plant fungal diseases.

Zhang et al. [4] isolated a bactericide-secreting *Bacillus* strain, i.e., *Bacillus subtilis* natto, from the commercial Yanjing Natto food; the strain is potentially useful as a biocontrol agent. Upon the optimization of the growth medium for optimal bactericide secretion, the antimicrobial activity of the strain was enhanced significantly. They could demonstrate the inhibitory properties of the obtained agent against *S. aureus*, *E. coli*, and *S. typhimurium*. Using HPLC, <sup>13</sup>C nuclear magnetic resonance, and mass spectral analyses, the structure of the purified new bactericides could be identified.

The effect of nitrate and perchlorate on selenate reduction in a batch reactor was investigated by Kim et al. [5]. They selectively enriched selenate-reducing bacteria in bench-scale sequencing batch reactors, which were seeded with activated sludge, and operated them semi-continuously in parallel for more than one and a half months. They show that complete selenate and nitrate reduction can be accomplished simultaneously. Kim et al. concluded that selenate-reducing bacteria are capable of enduring the competition associated with the reduction of other oxyanions and electron donors without significant inhibition after appropriate acclimation.

Jafari et al. [6] studied a bioleaching process for the extraction of metals from a flotation concentrate. In their contribution, results regarding the influence of two typical flotation frothers on the sensitivity of bacteria in the mesophilic mixed culture are presented. As a traditional mixed mesophilic microorganism culture, *Acidithiobacillus ferrooxidans*, *Leptospirillum ferrooxidans*, and *Acidithiobacillus thiooxidans* were used. By increasing the dosage of the frothers, they could show a negative correlation with bacterial activities. However, the mixed culture showed a lower sensitivity to the toxicity of the frothers than the examined pure cultures.

Feidl et al. [7] investigated the monitoring of monoclonal antibody breakthrough curves in chromatographic downstream operations. As a measurement system, they used a Raman spectrometer connected to the process by a self-developed flow cell. An extended Kalman filter was developed by complementing the measurement information with information coming from a lumped kinetic model. Feidl et al. demonstrate in their contribution that this approach allows the estimation of the antibody concentrations with reduced noise and increased robustness.

For the incubation of bird eggs, Youssef et al. [8] developed a model predictive controller to regulate the heating power. They used several IR radiators divided into three zones to adjust the eggshell temperatures individually in each zone. To test and implement the developed controller, four full incubation trials were performed. The authors could demonstrate that the controllers were able to follow the reference trajectory defined for each individual zone. They could keep the air temperature constant, although the eggshell temperatures within the middle zone were different from those in the sidelong zones.

That we as humans are also part of a kind of bioprocess and have to be maintained under optimal conditions is considered by Youssef et al. [9]. They propose an adaptive occupant-based predictive controller for a heating, ventilation, and air conditioning system using a predictive classification model to provide an optimal indoor climate with respect to temperature and humidity. To estimate the individual metabolic rates of 25 participants, three input variables—aural temperature, heart rate, and average skin heat-flux—were used. The least squares support vector machine technique was applied to predict the individual's thermal sensation. Based on that, they recommend an adaptive model predictive controller to adjust the indoor climate.

Without reliable sensors, the control of processes is not possible. Krywko-Cendrowska et al. [10] emphasize the importance of monitoring in general but also for bioprocesses. They review examples of well-defined conjugated macromolecules based on an oligo(arylene-ethynylene) skeleton used for sensor applications and discuss their relevance and their perspectives, not only for biological samples. In the review, they focus exclusively on examples of uniform macromolecules.

Noll et al. [11] summarize the most important models for physiological, biochemical, and physical properties governed by temperature. A timeline of the publication of different temperature models is presented, as are the pros and cons of mechanistic and empirical models. In their review, a toolset for the future exploitation of temperature as a control variable for optimization, monitoring, and control applications in bioprocess engineering is presented.

Galvanauskas et al. [12] emphasize in their review the importance of the control of the specific growth rate because this enables the improvement of the quality and reproducibility of bioprocesses. Requirements are given that must be met to successfully implement the specific growth rate control system. Furthermore, recommendations are presented for the selection of particular control systems for specific biotechnological processes.

The articles in this Special Issue highlight the diversity of bioprocesses and new applications in the development and management of these processes. Especially for closed loop control, the reliability of the measurements is central. Besides the measurements, the applicability of the models is equally important: they must provide the required accuracy in estimating bioprocess variables. The articles in this Special Issue represent a major step forward towards efficient bioprocesses, but further research is still needed. The papers from this Special Issue can be accessed at the following link: https://www.mdpi.com/journal/processes/special\_issues/bioprocess\_control.

#### **References**


© 2020 by the author. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).

## *Article* **A Geometric Observer-Assisted Approach to Tailor State Estimation in a Bioreactor for Ethanol Production**

#### **Silvia Lisci, Massimiliano Grosso and Stefania Tronci \***

Dipartimento di Ingegneria Meccanica Chimica e dei Materiali, Università degli Studi di Cagliari, Via Marengo 2, 09123 Cagliari, Italy; s.lisci@dimcm.unica.it (S.L.); massimiliano.grosso@dimcm.unica.it (M.G.)

**\*** Correspondence: stefania.tronci@dimcm.unica.it; Tel.: +39-070-675-5050

Received: 28 February 2020; Accepted: 16 April 2020; Published: 20 April 2020

**Abstract:** In this work, a systematic approach based on the geometric observer is proposed to design a model-based soft sensor, which allows the estimation of quality indexes in a bioreactor. The study is focused on the structure design problem where the set of innovated states has to be chosen. On the basis of robust exponential estimability arguments, it is found that it is possible to distinguish all the unmeasured states if temperature and dissolved oxygen concentration measurements are combined with substrate concentrations. The proposed estimator structure is then validated through numerical simulation considering two different measurement processor algorithms: the geometric observer and the extended Kalman filter.

**Keywords:** nonlinear state estimation; geometric observer; bioreactor; continuous system; extended Kalman filter; model-based sensor

#### **1. Introduction**

Bioreactors are units where a wide variety of products are made in industrial plants and where a diversity of important processes, such as fermentation, occur. Usually, the control of bioreactors is accomplished through the regulation of variables, such as pH and temperature, for optimizing the microbial growth [1,2]. Product quality indexes such as biomass, substrate, product or by-product, and dissolved oxygen concentrations are not usually controlled, because they are difficult to measure in real-time [3]. Even though many works report the availability and advantages of monitoring techniques, industrial biotechnology processes have a scarce capacity for real-time monitoring, which implies a limited implementation of efficient control of the process [4,5]. Because an unpredicted perturbation may lead to significant changes in the qualitative behavior of the system [6–8], it is crucial to accurately monitor the process.

Model-driven soft sensors can be a possible approach to estimate variables from secondary measurements. They rely on first principles process models and on algorithms that reconcile the available measurements with predictions carried out by the model. Several estimation techniques have been proposed in the literature for chemical and biochemical processes. Among them, the following have been recognized to have strong potential in the online estimation of nonlinear systems: (i) extended Kalman filter [9], (ii) high gain observer [10], (iii) sliding mode observer [11], (iv) geometric observer [12,13]. Many of the strategies to estimate unmeasurable states and disturbances for partially known systems are based on the extended Kalman filter (EKF) because its design is quite simple and it is widely accepted by relevant industries [14,15].

In this paper, the problem of estimating unmeasured states in a bioreactor is addressed. The study is based on the detailed model proposed by [16], which is considered as the virtual plant. The main objective is to compare different estimation solutions depending on the available measurements and the characteristics of the sensors. An adjustable-structure geometric estimation approach is implemented, considering the estimator structure as a degree of freedom in the design with the aim of improving performance versus robustness estimation behavior [13,17,18]. The estimation algorithm used is the geometric observer (GO) with proportional innovation [19], which offers simplicity of tuning and implementation. In order to show that the proposed procedure for choosing the estimation structure can be applied to other estimation techniques, the extended Kalman filter (EKF) is also used as the measurement processor algorithm.

#### **2. Process Model**

The biochemical process considered in the present paper is a fermentation reactor for the production of ethanol. The model was developed by [16] and, for the sake of clarity, it is reported hereafter (Equations (1)–(6)). It is assumed that there is perfect mixing in the reactor, with constant pH and constant volume. The dynamics of biomass (*C<sub>X</sub>*), substrate (*C<sub>S</sub>*), ethanol (*C<sub>P</sub>*), along with the oxygen concentration (*C<sub>O2</sub>*) are considered. Energy balances are also taken into account, describing the reactor temperature (*T<sub>r</sub>*) and jacket temperature (*T<sub>ag</sub>*) dynamics. A low dilution rate has been considered, allowing a balance between the biomass exiting from the system and the biomass produced in the reactor.

$$\frac{dC\_X}{dt} = \mu\_X C\_X \frac{C\_S}{K\_S + C\_S} e^{-K\_P C\_P} - \frac{F\_e}{V} C\_X \tag{1}$$

$$\frac{dC\_P}{dt} = \mu\_P C\_X \frac{C\_S}{K\_{S1} + C\_S} e^{-K\_{P1} C\_P} - \frac{F\_e}{V} C\_P \tag{2}$$

$$\frac{dC\_S}{dt} = -\frac{1}{R\_{SX}}\mu\_X C\_X \frac{C\_S}{K\_S + C\_S} e^{-K\_P C\_P} - \frac{1}{R\_{SP}}\mu\_P C\_X \frac{C\_S}{K\_{S1} + C\_S} e^{-K\_{P1} C\_P} + \frac{F\_i}{V} C\_{S,in} - \frac{F\_e}{V} C\_S \tag{3}$$

$$\frac{dC\_{O\_2}}{dt} = k\_l a \left(C\_{O\_2}^\* - C\_{O\_2}\right) - \mu\_{O\_2} \frac{1}{Y\_{O\_2}} C\_X \frac{C\_{O\_2}}{K\_{O\_2} + C\_{O\_2}} \tag{4}$$

$$\frac{dT\_r}{dt} = \left(\frac{F\_i}{V}\right)(T\_{in} + 273) - \left(\frac{F\_e}{V}\right)(T\_r + 273) - \mu\_{O\_2} \frac{1}{Y\_{O\_2}} C\_X \frac{C\_{O\_2}}{K\_{O\_2} + C\_{O\_2}} \frac{\Delta H\_r}{32 \,\rho\_r \,C\_{heat,r}} - \frac{K\_T A\_T \left(T\_r - T\_{ag}\right)}{V \,\rho\_r \,C\_{heat,r}} \tag{5}$$

$$\frac{dT\_{ag}}{dt} = \left(\frac{F\_{ag}}{V\_j}\right) \left(T\_{in,ag} - T\_{ag}\right) + \frac{K\_T A\_T \left(T\_r - T\_{ag}\right)}{V\_j \,\rho\_{ag} \,C\_{heat,ag}} \tag{6}$$

The oxygen equilibrium concentration is affected by the inorganic salts, which are added to the solution as a source of inorganic nitrogen. The dependence is reported in Equation (7)

$$C\_{O\_2}^\* = C\_{O\_2,0}^\* \cdot 10^{-\sum H\_i I\_i} \tag{7}$$

where the equilibrium concentration as a function of temperature has been calculated using the equation proposed by [20] for distilled water

$$C\_{O\_2,0}^\* = 14.6 - 0.3943 \,T\_r + 0.007714 \,T\_r^2 - 0.0000646 \,T\_r^3 \tag{8}$$

In the present work, the distillation strength *H<sub>i</sub>I<sub>i</sub>* is kept constant and it has been calculated using the equations reported in [16].
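As an illustration only, Equations (1)–(8) can be transcribed into a right-hand-side function for a numerical integrator. The following is a minimal Python sketch under several assumptions: the kinetic coefficients μ<sub>X</sub>, μ<sub>P</sub> and μ<sub>O2</sub> are treated as constants, the oxygen transfer coefficient of Equation (4) is named `kla`, and every value in `demo_params` is a hypothetical placeholder chosen to make the code runnable, not a value from Table 1 or from [16].

```python
import numpy as np


def oxygen_saturation(T_r, sum_HI):
    """Equations (7)-(8): equilibrium O2 concentration corrected for dissolved salts."""
    c_star_0 = 14.6 - 0.3943 * T_r + 0.007714 * T_r**2 - 0.0000646 * T_r**3
    return c_star_0 * 10.0 ** (-sum_HI)


def bioreactor_rhs(x, u, p):
    """Right-hand side of Equations (1)-(6).

    x = [C_X, C_P, C_S, C_O2, T_r, T_ag], u = [C_S_in, T_in, T_in_ag].
    """
    C_X, C_P, C_S, C_O2, T_r, T_ag = x
    C_S_in, T_in, T_in_ag = u
    growth = p["mu_X"] * C_X * C_S / (p["K_S"] + C_S) * np.exp(-p["K_P"] * C_P)
    production = p["mu_P"] * C_X * C_S / (p["K_S1"] + C_S) * np.exp(-p["K_P1"] * C_P)
    uptake = p["mu_O2"] / p["Y_O2"] * C_X * C_O2 / (p["K_O2"] + C_O2)
    exchange = p["K_T"] * p["A_T"] * (T_r - T_ag)  # heat flow from reactor to jacket
    d_in, d_out = p["F_i"] / p["V"], p["F_e"] / p["V"]

    dC_X = growth - d_out * C_X
    dC_P = production - d_out * C_P
    dC_S = -growth / p["R_SX"] - production / p["R_SP"] + d_in * C_S_in - d_out * C_S
    dC_O2 = p["kla"] * (oxygen_saturation(T_r, p["sum_HI"]) - C_O2) - uptake
    dT_r = (d_in * (T_in + 273) - d_out * (T_r + 273)
            - uptake * p["dH_r"] / (32 * p["rho_r"] * p["C_heat_r"])
            - exchange / (p["V"] * p["rho_r"] * p["C_heat_r"]))
    dT_ag = (p["F_ag"] / p["V_j"] * (T_in_ag - T_ag)
             + exchange / (p["V_j"] * p["rho_ag"] * p["C_heat_ag"]))
    return np.array([dC_X, dC_P, dC_S, dC_O2, dT_r, dT_ag])


# Hypothetical placeholder values, for illustration only (not from Table 1 or [16]).
demo_params = dict(
    mu_X=0.2, mu_P=1.0, mu_O2=0.5, K_S=1.0, K_S1=2.0, K_P=0.1, K_P1=0.07,
    R_SX=0.6, R_SP=0.4, Y_O2=1.0, K_O2=8.0, kla=40.0, sum_HI=0.03,
    F_i=51.0, F_e=51.0, V=1000.0, V_j=50.0, F_ag=18.0, K_T=3.6e5, A_T=1.0,
    dH_r=518.0, rho_r=1080.0, rho_ag=1000.0, C_heat_r=4.18, C_heat_ag=4.18,
)
```

A function of this shape can be handed to a standard ODE solver, for instance wrapped as `lambda t, x: bioreactor_rhs(x, u, demo_params)` for `scipy.integrate.solve_ivp`, once the placeholders are replaced by the parameter values of the actual study.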

The model is here used to simulate a real process and to develop the model-based soft sensor (estimator). Because the aim of the work is to mimic a real situation, the simulation using the model parameters reported in [16] is considered as the real plant (hereafter referred to as the virtual plant). On the other hand, some of the parameters used for the model in the estimator algorithm have been modified. The aim was to insert modeling errors to simulate what usually happens in a real situation where parameter uncertainty is present. This is often the case when dealing with complex systems such as biological reactors [21]. A Monte Carlo method has been used to produce empirical error estimates on the model parameters, using a uniform noise distribution with a maximum deviation equal to ±8%. Table 1 summarizes the parameters of the model used in the estimator algorithm obtained by performing 100 simulations and leading to the maximum error calculated on the trajectories of the six states. The other parameters of the model are the same as reported in [16]. The nominal conditions of the virtual plant are reported in Table 2.
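The Monte Carlo selection described above can be sketched as follows. This is a hedged illustration, not the authors' code: `trajectory_error` is a hypothetical user-supplied function that simulates the model with a perturbed parameter set and returns a scalar error with respect to the virtual plant, and the function names and the seed are assumptions.

```python
import numpy as np


def perturb_parameters(nominal, max_dev=0.08, rng=None):
    """Scale each nominal parameter by a uniform factor in [1 - max_dev, 1 + max_dev)."""
    rng = np.random.default_rng() if rng is None else rng
    return {k: v * (1.0 + rng.uniform(-max_dev, max_dev)) for k, v in nominal.items()}


def worst_case_parameters(nominal, trajectory_error, n_draws=100, max_dev=0.08, seed=0):
    """Draw n_draws perturbed sets and keep the one with the largest trajectory error."""
    rng = np.random.default_rng(seed)
    draws = [perturb_parameters(nominal, max_dev, rng) for _ in range(n_draws)]
    errors = [trajectory_error(p) for p in draws]
    worst = int(np.argmax(errors))
    return draws[worst], errors[worst]
```

Keeping the draw with the largest error gives the estimator a deliberately pessimistic model, so that any performance observed later is not an artifact of a favorable parameter mismatch.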


**Table 1.** Parameters for the virtual plant and for the estimator.

**Table 2.** Nominal operating conditions of the process.


The model simulations have also been made more realistic by adding noise to the available measurements, and the precision of the sensors is reported in Table 3.

**Table 3.** Noise for the different measuring sensors.


It is important to specify that all tests performed in the following have been carried out by imposing step changes to three inputs of the system (*C<sub>S,in</sub>*, *T<sub>in,ag</sub>*, *T<sub>in</sub>*). The description of input variations is reported in Table 4.



#### **3. Estimation Problem**

The current real-time monitoring methods used in ethanol production consist of secondary measurements such as pH, turbidity, gas composition and temperature [4]. Even if such variables provide important information about the process, they do not directly relate to the state of the system, making it difficult to apply advanced control strategies. Furthermore, even the best process measurements are corrupted by some amount of signal noise and their true values are somewhat uncertain. State estimation techniques can be used to improve the output signal of measured process states in the presence of uncertainty and when it is not possible to directly measure all the variables of interest.

The estimation problem consists of jointly designing the estimation structure (i.e., estimator model, sensors, innovated states and data assimilation mechanisms), and the estimation algorithm (i.e., the dynamic data processor), to infer some or all the states of the bioreactor on the basis of the available model in conjunction with available measurements, according to a specific estimation objective. In the present fermentation reactor estimation study, the emphasis has been placed on: (i) the identification of the most adequate measured outputs, i.e., those leading to the best performance, and (ii) the selection of the innovated states, meaning the states that are updated by using the available measurements.

To simplify the formulation of the problem, the model in Equations (1)–(6) is written in compact form as reported in Equation (9)

$$\dot{x} = f(x, u), \quad x(t\_0) = x\_0 \tag{9a}$$

$$y = h(x) \tag{9b}$$

where *x* is the *n*-dimensional state vector, equal to *x*<sub>0</sub> at the initial time *t*<sub>0</sub>, *u* is the *p*-dimensional input vector, *f* is the *n*-dimensional vector field, *y* is the *m*-dimensional vector of the measured outputs and *h* is the map relating states and measurements. The number of measured outputs is less than the number of states, that is *m* < *n*. Using the geometric approach [19,22], it is possible to define the nonlinear estimation map φ as in Equation (10)

$$\phi(x, u) = \begin{bmatrix} \phi\_1, \dots, \phi\_i, \dots, \phi\_m \end{bmatrix}^T \tag{10a}$$

$$\phi\_i = \begin{bmatrix} h\_i, \ L\_f h\_i, \ \dots, \ L\_f^{\kappa\_i - 1} h\_i \end{bmatrix} \tag{10b}$$

where *L<sub>f</sub><sup>j</sup>h<sub>i</sub>* is the *j*th Lie derivative of the time-varying scalar field *h<sub>i</sub>* along the vector field *f*, κ<sub>*i*</sub> is the observability index of the *i*th output and κ is the estimator order defined in Equation (11)

$$
\kappa\_1 + \kappa\_2 + \dots + \kappa\_m = \kappa = n \tag{11}
$$

If the map φ(*x*, *u*) is invertible with respect to *x* (Equation (12)), the system is observable, and the states can be reconstructed using the available model (Equations (1)–(6)) and a proper measurement processor algorithm [14].

$$\operatorname{rank}\left(\partial\_x \phi(x, u)\right) = n \tag{12}$$

If the Jacobian matrix ∂<sub>*x*</sub>φ(*x*, *u*) is rank deficient, there are unobservable states. In this case, the system is detectable only if all the unobservable modes have negative real parts [19].
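To make the construction of the map in Equation (10) and the rank test in Equation (12) concrete, the sketch below builds the Jacobian of φ numerically, using central finite differences for the Lie derivatives. It is an illustrative approximation rather than the procedure used in this work: the function names, the step size `eps`, the explicit rank tolerance and the two-state linear toy system are all assumptions, and the inputs *u* are taken as fixed inside `f`.

```python
import numpy as np


def gradient(g, x, eps=1e-4):
    """Central-difference gradient of the scalar function g at the point x."""
    x = np.asarray(x, dtype=float)
    grad = np.zeros_like(x)
    for i in range(x.size):
        step = np.zeros_like(x)
        step[i] = eps
        grad[i] = (g(x + step) - g(x - step)) / (2.0 * eps)
    return grad


def lie_derivative(g, f):
    """Return the function x -> L_f g(x) = grad g(x) . f(x)."""
    return lambda x: float(gradient(g, x) @ np.asarray(f(x), dtype=float))


def estimation_map_jacobian(f, outputs, indices, x):
    """Jacobian of phi = [h_i, L_f h_i, ..., L_f^(kappa_i - 1) h_i] over all outputs."""
    rows = []
    for h, kappa in zip(outputs, indices):
        g = h
        for _ in range(kappa):
            rows.append(gradient(g, x))   # row of the Jacobian for the current term
            g = lie_derivative(g, f)      # next Lie derivative along f
    return np.array(rows)


# Toy linear system (hypothetical): x' = A x, with the single output y = x_1.
A_demo = np.array([[0.0, 1.0], [-2.0, -3.0]])
J_demo = estimation_map_jacobian(lambda x: A_demo @ x, [lambda x: x[0]], [2],
                                 np.array([1.0, 0.5]))
rank_demo = np.linalg.matrix_rank(J_demo, tol=1e-6)  # explicit tol absorbs FD noise
```

For this toy system the map is φ = [*x*<sub>1</sub>, *x*<sub>2</sub>], so the Jacobian is close to the identity and the rank equals the number of states. Nested finite differences amplify rounding errors, so for higher observability indices a symbolic or automatic differentiation tool would be a more robust choice.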

#### *3.1. Robust Estimability and Robust Detectability*

Full observability of all states requires the observability matrix to have full rank. Practical observability can additionally be assessed by requiring that the condition number of the observability matrix (Σ) is small [23]. Furthermore, a small minimum singular value of the observability matrix implies a poor estimate of the states [24]

$$\operatorname{rank}(\Upsilon) = n, \quad \Upsilon = \partial\_x \phi(x, u) \tag{13a}$$

$$\frac{\overline{\sigma}(\Upsilon)}{\underline{\sigma}(\Upsilon)} = \Sigma < \Xi \tag{13b}$$

$$\underset{t}{\operatorname{avg}} \ \underline{\sigma}(\Upsilon) \geq \varepsilon\_0 \tag{13c}$$

where Ξ and ε<sub>0</sub> are the selected thresholds for the condition number and the minimum singular value, respectively.
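The conditions in Equation (13) can be checked numerically from the singular values of Υ sampled along a trajectory. The following sketch is one possible reading, with assumptions stated explicitly: the rank and condition-number conditions are enforced at every sample (worst case), the minimum singular value is averaged over the samples, and the function name, the returned keys and `rank_tol` are hypothetical.

```python
import numpy as np


def robust_estimability(jacobians, cond_max, sigma_min_avg, rank_tol=1e-9):
    """Check rank, condition number and averaged minimum singular value of each Jacobian."""
    full_rank = True
    conds, sigma_mins = [], []
    for J in jacobians:
        J = np.asarray(J, dtype=float)
        s = np.linalg.svd(J, compute_uv=False)  # singular values, in descending order
        if np.sum(s > rank_tol) < min(J.shape):
            full_rank = False
        sigma_mins.append(s[-1])
        conds.append(s[0] / s[-1] if s[-1] > 0 else np.inf)
    worst_cond = float(max(conds))
    avg_sigma = float(np.mean(sigma_mins))
    passes = bool(full_rank and worst_cond < cond_max and avg_sigma >= sigma_min_avg)
    return {"full_rank": full_rank, "worst_cond": worst_cond,
            "avg_sigma_min": avg_sigma, "passes": passes}
```

Because the rank is compared with the smaller matrix dimension, the same function can be applied to a rectangular Jacobian of a reduced map when only a subset of the states is innovated.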

On the other hand, if matrix Υ is rank deficient and the unobservable states are stable, it is necessary to distinguish between states that can be innovated (distinguishable states) and states that cannot (undistinguishable states). In this case, the dimension of the map in Equation (10) is equal to the dimension of the distinguishable states, and robust detectability can be assessed if the following conditions are satisfied (Equation (14))

$$\frac{\overline{\sigma}(\Upsilon\_p)}{\underline{\sigma}(\Upsilon\_p)} = \Sigma\_p < \Xi\_p \tag{14a}$$

$$\underset{t}{\operatorname{avg}} \ \underline{\sigma}(\Upsilon\_p) \geq \varepsilon\_{p0} \tag{14b}$$

$$\Upsilon\_p = \partial\_x \phi\_p(x, u) \tag{14c}$$

$$\phi\_p = \begin{bmatrix} h\_1, \ \dots, \ L\_f^{\kappa\_1 - 1} h\_1, \ \dots, \ h\_m, \ \dots, \ L\_f^{\kappa\_m - 1} h\_m \end{bmatrix} \tag{14d}$$

$$
\kappa\_1 + \kappa\_2 + \dots + \kappa\_m = \kappa = p \tag{14e}
$$

The constants Ξ<sub>p</sub> and ε<sub>p0</sub> are, again, the selected thresholds.

#### *3.2. Selection of the Estimator Structure*

The performance of an estimator is strongly affected by the process model and by the quality of the available measurements. Biological processes are complex systems; model uncertainty, in the form of parameter errors and neglected dynamics, is therefore generally to be expected [21]. As a consequence, the complete reconstruction of the states generally requires a combination of different measurements [4]. It should also be stressed that a gap remains between the sensors available for laboratory use and those suited to large-scale, real-time monitoring [4]. The selection of the estimator structure therefore focuses on choosing the best monitoring strategy, that is, on identifying the most representative measured outputs while accounting for parameter errors in the model used by the estimator. Because monitoring can be expensive in terms of both fixed and operating costs, it is useful to achieve the required performance with the smallest number of sensors. This analysis has been carried out by comparing the condition number and the minimum singular value of the matrix Υ or Υ<sub>p</sub> (Equations (13) and (14)). Performance has also been evaluated by simulating different trajectories and assessing the convergence rate, the presence of offset, and the effect of signal noise.

#### *3.3. Algorithms for the Estimation Problem*

In this study, two different algorithms have been selected and compared. The first is the geometric observer (GO) [19], which is formally connected with the observability properties reported in the previous section. The GO can also be applied to detectable systems [13], and it has proven simple to implement and tune [18,22]. The geometric approach is also used to select the estimator structure, that is, the measurements and the states to be innovated.

The geometric observer algorithm is reported in Equation (15), where it is assumed that some states are not innovated. Leaving states without innovation may be imposed by the rank deficiency of the observability matrix, or it may be a design choice intended to improve the robustness and efficiency of the estimator.

$$\dot{\hat{\mathbf{x}}}\_{i} = \hat{f}\_{i}(\hat{\mathbf{x}}, \mathbf{u}) + \left(\partial\_{\mathbf{x}\_{i}}\Phi(\hat{\mathbf{x}}, \mathbf{u})\right)^{-1} \mathbf{K}\left(\mathbf{y} - \mathbf{h}(\hat{\mathbf{x}})\right), \quad \hat{\mathbf{x}}\_{i0} = \hat{\mathbf{x}}\_{i}(t\_{0}) \tag{15a}$$

$$\dot{\hat{\mathbf{x}}}\_{u} = \hat{f}\_{u}(\hat{\mathbf{x}}, \mathbf{u}), \quad \hat{\mathbf{x}}\_{u0} = \hat{\mathbf{x}}\_{u}(t\_{0}) \tag{15b}$$


The inverse of the observability matrix ∂<sub>*xi*</sub>φ(*x̂*, *u*) in Equation (15a) is calculated at each time step, and *K* is a block diagonal matrix (Equation (16)) whose coefficients are constant tuning parameters. The estimated state vector *x̂* in Equation (15) comprises the innovated states (*x̂*<sub>*i*</sub>), for which the dynamics predicted by the model are adjusted by means of the available measurements *y*, and the non-innovated states (*x̂*<sub>*u*</sub>), which are only predicted by the model (referred to in the following as open-loop states).

$$\mathbf{K} = \begin{pmatrix} \mathbf{B}\_1 & \mathbf{0} & \dots & \mathbf{0} \\ \mathbf{0} & \mathbf{B}\_2 & \dots & \mathbf{0} \\ \vdots & \vdots & \ddots & \vdots \\ \mathbf{0} & \mathbf{0} & \dots & \mathbf{B}\_m \end{pmatrix}, \quad \mathbf{B}\_1 = \begin{bmatrix} k\_{11} \\ \vdots \\ k\_{1\nu\_1} \end{bmatrix}, \quad \mathbf{B}\_2 = \begin{bmatrix} k\_{21} \\ \vdots \\ k\_{2\nu\_2} \end{bmatrix}, \quad \mathbf{B}\_m = \begin{bmatrix} k\_{m1} \\ \vdots \\ k\_{m\nu\_m} \end{bmatrix} \tag{16}$$
 
$$\nu\_i = \kappa\_{i-1}$$

Tuning guidelines are provided in [17], which shows that a set of tuning parameters *k*<sub>*ij*</sub> is required for every measurement. For observability indexes equal to 1 or 2 (κ<sub>*i*</sub> = 1, 2), the proportional gains can be obtained from Equation (17).

$$k\_{i1} = 2\zeta\omega\_{0}, \quad k\_{i2} = \omega\_{0}^{2} \tag{17a}$$

$$
\omega\_0 \in [10\omega\_c, \Re 0\omega\_c], \quad \zeta = [1, \Im ]\tag{17b}
$$
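
To make the structure of Equations (15)–(17) concrete, the following minimal sketch applies a geometric observer to a hypothetical two-state Monod chemostat, not to the ethanol reactor model of this paper. It assumes that only the substrate concentration is measured, uses the map φ = [S, L<sub>f</sub>S] (observability index 2), and reads Equation (17a) as k<sub>i1</sub> = 2ζω<sub>0</sub> and k<sub>i2</sub> = ω<sub>0</sub><sup>2</sup>. All parameter values, function names and the explicit Euler integration are illustrative assumptions.

```python
# Hypothetical Monod chemostat: states (S, X), measured output y = S.
MU_MAX, KS, YIELD, S_IN, D = 0.4, 0.5, 0.5, 10.0, 0.1

def mu(s):
    """Monod specific growth rate."""
    return MU_MAX * s / (KS + s)

def f(s, x):
    """Model vector field (dS/dt, dX/dt)."""
    return D * (S_IN - s) - mu(s) * x / YIELD, (mu(s) - D) * x

def go_gains(zeta, omega0):
    """Gains for an observability index of 2, as read from Eq. (17a)."""
    return 2.0 * zeta * omega0, omega0 ** 2

def go_correction(s_hat, x_hat, y, k1, k2):
    """Innovation term (d_x phi)^-1 K (y - h(x_hat)) for phi = [S, L_f S]."""
    dmu_ds = MU_MAX * KS / (KS + s_hat) ** 2
    a = -D - dmu_ds * x_hat / YIELD     # d(L_f S)/dS
    b = -mu(s_hat) / YIELD              # d(L_f S)/dX, nonzero while S > 0
    e = y - s_hat
    # The Jacobian [[1, 0], [a, b]] has inverse [[1, 0], [-a/b, 1/b]].
    return k1 * e, (-a * k1 + k2) * e / b

def simulate(t_end=20.0, dt=1e-3, zeta=1.0, omega0=2.0):
    k1, k2 = go_gains(zeta, omega0)
    s, x = 2.0, 1.0             # simulated "plant"
    s_hat, x_hat = 2.0, 3.0     # observer starts from a wrong biomass guess
    for _ in range(int(t_end / dt)):
        ds, dx = f(s, x)
        ds_hat, dx_hat = f(s_hat, x_hat)
        cs, cx = go_correction(s_hat, x_hat, s, k1, k2)   # noise-free y = S
        s, x = s + dt * ds, x + dt * dx                   # explicit Euler step
        s_hat, x_hat = s_hat + dt * (ds_hat + cs), x_hat + dt * (dx_hat + cx)
    return (s, x), (s_hat, x_hat)

(plant_s, plant_x), (est_s, est_x) = simulate()
print(f"plant X = {plant_x:.4f}, estimated X = {est_x:.4f}")
```

Because the correction is multiplied by the inverse Jacobian, the gain acting on the unmeasured biomass grows as the entry `b` approaches zero, which is one way to see why a poorly conditioned map amplifies measurement noise.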

The GO has then been compared with the extended Kalman filter (EKF), which is the most widely used estimation algorithm in industry because of its straightforward construction [17]. Although the EKF is usually applied to completely observable systems, in this investigation it has also been used when the choice of measurements leads to a rank-deficient observability matrix. The EKF has been applied in its continuous form, reported in Equation (18).

$$\dot{\hat{\mathbf{x}}}\_{i} = \hat{f}\_{i}(\hat{\mathbf{x}}, \mathbf{u}) + \mathbf{K}\_{EKF}\left(\mathbf{y} - \mathbf{h}(\hat{\mathbf{x}})\right), \quad \hat{\mathbf{x}}\_{i0} = \hat{\mathbf{x}}\_{i}(t\_{0}) \tag{18a}$$

$$\dot{\hat{\mathbf{x}}}\_{u} = \hat{f}\_{u}(\hat{\mathbf{x}}, \mathbf{u}), \quad \hat{\mathbf{x}}\_{u0} = \hat{\mathbf{x}}\_{u}(t\_{0}) \tag{18b}$$

$$\mathbf{K}\_{EKF} = \mathbf{P}(t)\mathbf{H}^{T}\mathbf{R}^{-1} \tag{18c}$$

$$\dot{\mathbf{P}}(t) = \mathbf{F}(t)\mathbf{P}(t) + \mathbf{P}(t)\mathbf{F}^{T}(t) + \mathbf{Q}(t) - \mathbf{K}\_{EKF}\mathbf{H}\mathbf{P}(t), \quad \mathbf{P}(t\_{0}) = \mathbf{P}\_{0} \tag{18d}$$

*F*(*t*) is the Jacobian of the vector field *f̂*<sub>*i*</sub>, calculated with respect to the innovated states; *P* is the error covariance matrix of the innovated states; *H* is the derivative of the map *h* with respect to the states; and *Q* and *R* are the covariance matrices of the model and measurement errors, respectively [9]. The constant matrices *Q*, *R*, and *P*<sub>0</sub> are tuning parameters of the estimator, and they have been calculated by minimizing the error between the states of the simulated plant and those of the estimator along a reference trajectory.
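
As an illustration of the continuous EKF in Equation (18), the sketch below integrates the estimate and the covariance with explicit Euler steps for a hypothetical two-state Monod chemostat in which only the substrate is measured. The covariance equation is written in the standard Kalman–Bucy ordering, dP/dt = FP + PF<sup>T</sup> + Q − K<sub>EKF</sub>HP. The model, the parameter values and the matrices `Q`, `R` and the initial `P` are assumptions chosen for illustration only; they are not the tuned values used in this study.

```python
import numpy as np

# Hypothetical Monod chemostat: x = (S, X), measured output y = S.
MU_MAX, KS, YIELD, S_IN, D = 0.4, 0.5, 0.5, 10.0, 0.1

def f(x):
    """Model vector field."""
    s, xb = x
    mu = MU_MAX * s / (KS + s)
    return np.array([D * (S_IN - s) - mu * xb / YIELD, (mu - D) * xb])

def jac_f(x):
    """Jacobian F of the vector field with respect to the states."""
    s, xb = x
    mu = MU_MAX * s / (KS + s)
    dmu = MU_MAX * KS / (KS + s) ** 2
    return np.array([[-D - dmu * xb / YIELD, -mu / YIELD],
                     [dmu * xb, mu - D]])

H = np.array([[1.0, 0.0]])      # derivative of h(x) = S with respect to x
Q = np.diag([1e-3, 1e-3])       # assumed model error covariance
R = np.array([[1e-2]])          # assumed measurement error covariance
R_INV = np.linalg.inv(R)

def ekf_step(x_hat, P, y, dt):
    """One explicit Euler step of the continuous EKF."""
    F = jac_f(x_hat)
    K = P @ H.T @ R_INV                          # gain, Eq. (18c)
    x_dot = f(x_hat) + K @ (y - H @ x_hat)       # estimate, Eq. (18a)
    P_dot = F @ P + P @ F.T + Q - K @ H @ P      # covariance equation
    return x_hat + dt * x_dot, P + dt * P_dot

def run(t_end=20.0, dt=1e-3):
    x = np.array([2.0, 1.0])        # simulated "plant"
    x_hat = np.array([2.0, 3.0])    # estimator starts from a wrong biomass guess
    P = np.eye(2)
    for _ in range(int(t_end / dt)):
        y = H @ x                   # noise-free measurement
        x_hat, P = ekf_step(x_hat, P, y, dt)
        x = x + dt * f(x)
    return x, x_hat, P

x_true, x_est, P_final = run()
print("plant:", x_true, "estimate:", x_est)
```

Unlike the geometric observer, the gain here is not fixed in advance: it follows from the covariance `P`, so the choice of `Q`, `R` and the initial `P` plays the role of the tuning parameters.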

#### **4. Results**

#### *4.1. Estimation Structure*

The estimation structure has been chosen by considering (i) the condition number and the minimum singular value of the Jacobian matrix Υ (or Υ<sub>p</sub>) for different choices of measurements and innovated states and (ii) the responses of the reconstructed states along a given trajectory. Temperature and dissolved oxygen measurements have always been considered available, in line with laboratory and industrial practice. On the other hand, sensors suited to measuring ethanol, substrate, and biomass are not always available for large-scale, real-time applications [4]. According to the analysis reported in [3], two possible scenarios have been considered: (i) the biomass concentration in the reactor is measured online or (ii) the substrate concentration in the reactor is measured online. Using the representation in (9), the considered cases are reported in Equation (19):

$$\mathbf{y} = \begin{pmatrix} C\_{x}, & C\_{O\_2}, & T\_{r}, & T\_{ag} \end{pmatrix} \tag{19a}$$

$$\mathbf{y} = \begin{pmatrix} C\_{s}, & C\_{O\_2}, & T\_{r}, & T\_{ag} \end{pmatrix} \tag{19b}$$

where *y* represents the measured output vector.

According to Equation (13), it is easy to demonstrate that no combination of indexes κ<sub>*i*</sub> satisfies the observability property for the output vector in Equation (19a). A full-order observer is therefore possible only if the substrate concentration is measured online, that is, with the output configuration reported in Equation (19b). In this case, the nonlinear estimation maps satisfying Equation (13a) are reported in Equation (20):

$$\Phi\_1 = \left[C\_{s},\ L\_f C\_{s},\ C\_{O\_2},\ L\_f C\_{O\_2},\ T\_{r},\ T\_{ag}\right], \quad \Phi\_2 = \left[C\_{s},\ L\_f C\_{s},\ C\_{O\_2},\ T\_{r},\ L\_f T\_{r},\ T\_{ag}\right] \tag{20}$$

A first comparison of the two structures can be carried out by considering the values of condition number and minimum singular value for the Jacobian of the maps (20), calculated by averaging along the trajectories obtained with input step changes T1 and T2 (Table 4) and reported in Table 5. The structure φ<sup>2</sup> seems to be more robust (lower condition number), but it shows a lower minimum singular value, indicating that changes in the states should affect the outputs to a lesser extent.

**Table 5.** Condition number and minimum singular value with four measurements.


The reconstruction capabilities of the two structures are then evaluated with the geometric observer, using the input variations T1 and T2 reported in Table 4. Results are shown in Figures 1–4 for the unmeasured variables only, namely ethanol and biomass concentration. The state values calculated with the model used in the estimation algorithm but without innovation (open-loop model) are also reported, in order to highlight the correction provided by the estimation algorithm.

The map φ<sub>1</sub> allows a good reconstruction of the biomass behavior (Figure 1), while there is a large mismatch between the ethanol concentration obtained with the virtual plant and the reconstructed one (Figure 2).

With the second configuration, results worsen for both biomass (Figure 3) and ethanol (Figure 4) concentration. The state values estimated with map φ<sub>2</sub> are also more corrupted by measurement noise, because in this case a larger observer gain has been used to decrease the offset.

**Figure 1.** Dynamic response of biomass concentration calculated with the virtual plant (continuous line), open-loop model (dashed line) and geometric observer (GO) (dotted grey line) for structure φ<sup>1</sup> along trajectory T1 (**left** panel) and T2 (**right** panel).

**Figure 2.** Dynamic response of ethanol concentration calculated with the virtual plant (continuous line), open-loop model (dashed line) and GO (dotted grey line) for structure φ<sup>1</sup> along trajectory T1 (**left** panel) and T2 (**right** panel).

**Figure 3.** Dynamic response of biomass concentration calculated with the virtual plant (continuous line), open-loop model (dashed line) and GO (dotted grey line) for structure φ<sup>2</sup> along trajectory T1 (**left** panel) and T2 (**right** panel).

**Figure 4.** Dynamic response of ethanol concentration calculated with the virtual plant (continuous line), open-loop model (dashed line) and GO (dotted grey line) for structure φ<sup>2</sup> along trajectory T1 (**left** panel) and T2 (**right** panel).

The two full-order structures cannot adequately estimate the product of the reactor; a different solution is therefore required to improve the ethanol concentration estimate. Using the same measured outputs, it is possible to improve estimation performance by reducing the order of the observer and using only one Lie derivative [22]. The maps reported in Equation (21) lead to five observable states and one state that is only detectable.

$$\Phi\_{p3} = \left[C\_{s},\ C\_{O\_2},\ T\_{r},\ L\_f T\_{r},\ T\_{ag}\right], \quad \Phi\_{p4} = \left[C\_{s},\ L\_f C\_{s},\ C\_{O\_2},\ T\_{r},\ T\_{ag}\right] \tag{21}$$

The rank of the Jacobian of the maps φ<sub>p*i*</sub> (*i* = 3, 4) depends on which of the two unmeasured states, ethanol or biomass concentration, is chosen as the non-innovated state (*x̂*<sub>*u*</sub>). It can be verified that the map φ<sub>p3</sub> can be inverted only if *C*<sub>*x*</sub> is innovated and *C*<sub>*p*</sub> is not. On the other hand, the Jacobian of map φ<sub>p4</sub> always has rank five, regardless of the choice of the innovated states. Recalling Equation (15), the following partitions are considered:

$$\mathbf{x}\_{i} = \left[C\_{x},\ C\_{s},\ C\_{O\_2},\ T\_{r},\ T\_{ag}\right], \quad \mathbf{x}\_{u} = \left[C\_{p}\right] \tag{22a}$$

$$\mathbf{x}\_{i} = \left[C\_{p},\ C\_{s},\ C\_{O\_2},\ T\_{r},\ T\_{ag}\right] \tag{22b}$$

The map φ<sub>p3</sub> can be used with the partition in Equation (22a), while the map φ<sub>p4</sub> can be used with both partitions in Equation (22a,b). Two different solutions are therefore identified for φ<sub>p4</sub>: φ<sub>p4,1</sub> for partition (22b) and φ<sub>p4,2</sub> for partition (22a). A first analysis of the possible configurations can be obtained by considering the minimum singular value and the condition number reported in Table 6. The index values are comparable; the best structure has therefore been identified by analyzing the reconstruction performance. Figures 5 and 6 show the estimates of the unmeasured states (ethanol and biomass concentration) for the input step changes T1 and T2 described in Table 4. The best reconstruction capabilities are shown by configuration φ<sub>p3</sub> for both states. This result suggests that the conditions in Equation (14) are informative when the index values of the different configurations differ significantly; otherwise, the estimation capabilities must be assessed from the estimator response to given input changes.

**Table 6.** Mean condition number and minimum singular value for low order structures.


**Figure 5.** Dynamic response of the biomass concentration calculated with the virtual plant (continuous black line), GO with map φ<sub>p3</sub> (dashed dark grey line), GO with map φ<sub>p4,1</sub> (dotted black line), GO with map φ<sub>p4,2</sub> (dashed-dotted grey line) along the trajectory T1 (**left** panel) and T2 (**right** panel).

**Figure 6.** Dynamic response of the product concentration calculated with the virtual plant (continuous line), GO with map φp3 (dashed black line), GO with map φp4,1 (dotted black line), GO with map φp4,2 (dashed-dotted grey line) along trajectory T1 (**left** panel) and T2 (**right** panel).

#### *4.2. Validation*

The analysis carried out in the previous section identified the best estimation structure with four measured outputs. To validate this result, a new test was carried out using as reference trajectory the variation of the input temperature (*T*<sub>*in*</sub>) shown in Table 4 (Case T3). Figure 7 shows the dynamic behavior of biomass and product concentration and confirms that the proposed structure can effectively reconstruct the unmeasured states under different process conditions as well. It is worth noting that, although the ethanol concentration is not innovated, the correction of the other states also has a positive impact on its estimate.

Using the same measured outputs (Equation (19b)) and the same partition between innovated and non-innovated states (Equation (22a)), the estimation task has also been addressed with the extended Kalman filter (Figure 8). The main reason for using another algorithm as measurement processor is to demonstrate that the estimator performance depends on the structure selection rather than on the estimation algorithm. The EKF has been chosen for this validation because it is widely used in industrial practice, being easy to implement and robust if adequately calibrated [25,26].

**Figure 7.** Dynamic response of biomass concentration (**left** panel) and ethanol concentration (**right** panel) calculated with the virtual plant (continuous line), open-loop model (dashed line) and GO (dotted grey line) for structure φp3 along the trajectory T3.

**Figure 8.** Dynamic response of biomass concentration and ethanol concentration calculated with the virtual plant (continuous line), open-loop model (dashed line) and extended Kalman filter (EKF) (dotted line) for structure φp3 along the trajectory T3.

Results show that EKF can effectively reconstruct the unmeasured states, revealing that estimator structure design is the key step for a successful achievement of the estimation goals. The only difference between the two approaches is that the biomass calculated with the geometric observer is more affected by noise. This behavior can be explained by the presence of the Lie derivative in GO, which implies a higher sensitivity to measurement noise with respect to the EKF.

#### **5. Conclusions**

The problem of estimating unmeasured states in a bioreactor was addressed, and it was demonstrated that the estimation performance relies on an appropriate structure selection rather than on the chosen measurement processor algorithm. An adjustable-structure geometric estimation approach was used, in which the estimator structure constitutes a design degree of freedom for trading off performance against robustness. The estimation structure design was based on estimability and detectability properties used together with a geometric approach. The analysis of the estimability measures distinguished ill-conditioned from well-conditioned structures (condition number of the observability matrix) and indicated the poorest estimation performance for a given structure (minimum singular value of the observability matrix). The simulations agreed with the structural assessment when the estimability measures calculated for the different structures were significantly different. The estimation algorithm used was the geometric observer with proportional innovation, which offers simplicity of tuning and implementation. To show that the proposed procedure for choosing the estimation structure can be applied to other estimation techniques, the extended Kalman filter was also used as measurement processor algorithm. The results showed that both estimators lead to good estimation performance, the only difference being that the geometric observer estimate is more sensitive to measurement noise, probably because of the presence of the Lie derivative in the correction term. In summary, the systematic geometric approach led to the best solution for the estimation problem, giving a structure that did not depend on the correction algorithm. The latter can be chosen according to the preferences of the plant personnel or the experience of the developer. It is worth noting that the systematic tuning procedure of the geometric approach was very useful for comparing the reconstruction capabilities of the different structures. In terms of methodology, the results obtained in this paper could be applied to more complex biotechnological processes, such as the production of ethanol from cellulosic material, for which measurement devices for real-time industrial application are still missing. In this case, the proposed approach can be used to identify the measurements that lead to the best reconstruction capabilities and to direct investment towards them.

**Author Contributions:** S.L. performed the analytic calculations and performed simulations; S.T. conceived the idea, proposed the computational model and wrote the paper; M.G. contributed to the analysis of the results and provided critical feedback; S.T. and M.G. reviewed the manuscript. All authors have read and agreed to the published version of the manuscript.

**Funding:** This research received no external funding.

**Conflicts of Interest:** The authors declare no conflict of interest.

#### **References**


© 2020 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).

## *Article* **Control of Specific Growth Rate in Fed-Batch Bioprocesses: Novel Controller Design for Improved Noise Management**

## **Yann Brignoli <sup>1</sup>, Brian Freeland <sup>2,3,4</sup>, David Cunningham <sup>2</sup> and Michal Dabros <sup>1,</sup>\***


Received: 27 April 2020; Accepted: 6 June 2020; Published: 9 June 2020

**Abstract:** Accurate control of the specific growth rate (μ) of microorganisms is dependent on the ability to quantify the evolution of biomass reliably in real time. Biomass concentration can be monitored online using various tools and methods, but the obtained signal is often very noisy and unstable, leading to inaccuracies in the estimation of μ. Furthermore, controlling the growth rate is challenging as the process evolves nonlinearly and is subject to unpredictable disturbances originating from the culture's metabolism. In this work, a novel feedforward-feedback controller logic is presented to counter the problem of noise and oscillations in the control variable and to address the exponential growth dynamics more effectively. The controller was tested on fed-batch cultures of *Kluyveromyces marxianus*, during which μ was estimated in real time from online biomass concentration measurements obtained with dielectric spectroscopy. It is shown that the specific growth rate can be maintained at different setpoint values with an average root mean square control error of 23 ± 6%.

**Keywords:** bioprocess monitoring and control; specific growth rate control; signal noise management; dielectric spectroscopy; PAT; microbial bioprocessing

#### **1. Introduction**

The field of biotechnology is experiencing continued progress, owing to advances in online process monitoring and control. The application of various process analytical technologies enables researchers and end users to have insight into the process and to monitor its substrate uptake, biomass evolution, and cell metabolism. Traditionally, cultures were mostly performed in batch mode. One of the main drawbacks of this mode is that the growth dynamics cannot be controlled during the culture [1,2]. Significant improvement in this respect can be achieved by switching to the fed-batch mode and putting in place appropriate process supervision methodologies. In addition to greater control over the product quality, the cost efficiency of the process increases along with the enhancement of the culture yield [1,3,4].

The key variable that can be controlled in a fed-batch culture is the specific growth rate, μ. This variable has a direct effect upon the cell metabolism, and controlling it can be used to suppress or to induce, depending on the need, the formation of secondary metabolites [3,5–8]. The challenge in controlling the specific growth rate comes from the fact that the determination of its instantaneous value depends on the online estimation of the biomass concentration. Indeed, there does not yet exist a direct online measurement tool for cell concentration. All the currently used technologies offer an indirect estimation, where the measured signal is correlated to the biomass concentration through an external calibration [9]. Two different monitoring approaches can be identified. The first approach is noninvasive towards the process and typically involves the use of off-gas analysis combined with a metabolic model to estimate the biomass concentration and its evolution [10–12]. The second approach, invasive to the process, includes the application of an in-situ optical density probe [13–15], dielectric (capacitance) spectroscopy [16–18], or fluorescence spectroscopy [19,20] to monitor biomass. Other methods rely on a significant capital investment, including biocalorimetry [21]. The main issue with the in-situ monitoring technologies is their sensitivity to the process conditions. In particular, aeration and agitation interfere with the probe, and thus create noise in the signal [16,19]. Noise and instabilities in the biomass concentration signal are directly propagated to the estimation of μ, making control of the latter a challenging task. For these reasons, biomass measuring technologies require appropriate data treatment methods, ranging from basic signal smoothing and baseline adjustment to complex filtering algorithms such as the Kalman filter [22]. Signal filtering always involves a tradeoff between residual noise and measurement delay, both of which are undesirable in process control applications.

With the appropriate tools set in place to estimate the cell concentration online, several methodologies are available to control the specific growth rate. The easiest approach is the application of a predetermined exponential feed. Here, the initial conditions of the fed-batch phase are used to calculate the feed rate profile corresponding to the specific growth rate setpoint, and a feedforward controller is used to implement the profile [1]. However, no feedback action is available to intervene in case the process does not follow the desired growth rate. Feedback control typically involves the use of the following two classical types of controllers: proportional (P) and proportional-integral (PI) (and, more rarely, proportional-integral-derivative (PID)). They are considered to be simple gain controllers since they are not based on a model [6,23–25]. Dabros et al. [24] showed that the use of a mixed, feedforward-feedback controller reduced the mean control error by 32% as compared with applying a purely feedforward exponential feed profile. As mentioned above, the noise present in the biomass measurement and in the μ estimation, in particular, is a challenge when using feedback controllers to regulate the specific growth rate. Both the proportional and the integral parts of the controller are predisposed to amplify the noise and induce undesirable oscillations in the process. Therefore, the data must be treated and smoothed prior to use [26].
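
The predetermined exponential feed mentioned above can be sketched with the textbook relation F(t) = μ<sub>set</sub>·X<sub>0</sub>·V<sub>0</sub>·exp(μ<sub>set</sub>·t) / (Y<sub>X/S</sub>·S<sub>feed</sub>). This form assumes a constant biomass yield, negligible maintenance and negligible residual substrate in the reactor; it is a generic illustration rather than the feed law used in this work, and all numerical values below are hypothetical.

```python
import math

def exponential_feed_rate(t, mu_set, x0, v0, y_xs, s_feed):
    """Feedforward feed rate (L/h) that sustains growth at mu_set (1/h).

    x0, v0 : biomass concentration (g/L) and volume (L) at the start of feeding
    y_xs   : biomass yield on substrate (g/g), assumed constant
    s_feed : substrate concentration in the feed (g/L)
    """
    return mu_set * x0 * v0 * math.exp(mu_set * t) / (y_xs * s_feed)

# Hypothetical operating point, for illustration only.
MU_SET, X0, V0, Y_XS, S_FEED = 0.2, 5.0, 2.5, 0.5, 200.0
for t in (0.0, 2.0, 4.0):
    rate = exponential_feed_rate(t, MU_SET, X0, V0, Y_XS, S_FEED)
    print(f"t = {t:.1f} h  ->  F = {rate:.4f} L/h")
```

Since the profile is fixed by the initial conditions, any error in `x0` or `y_xs` propagates unchecked through the whole feeding phase, which is the motivation for adding a feedback term.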

More advanced controllers are often model-based, and include the popular model predictive controller (MPC) and artificial neural networks [1,2]. These control approaches require extensive process knowledge in order to build and validate a data-driven model [1,27,28], which can be difficult to achieve for a new project or cell strain. Recently, successful implementation of growth rate controllers based on MPC, where noninvasive Process Analytical Technology (PAT) tools were applied to microbial cultures, has been reported [29–32]. In addition, multisensor techniques have been applied by combining dielectric capacitance monitoring with biocalorimetry to reduce measurement noise [33]. However, these approaches add an extra layer of process complexity, resulting either from the extra modeling effort required or from the application of additional PAT tools. A critical advantage of dielectric capacitance monitoring is that it offers a direct and easily scalable method of biomass estimation [34,35] that can also be applied to single-use bioreactors [36,37]. Indeed, the signal-to-noise ratio often increases with scale, due to the reduced effect of bubbles on the probe, which facilitates fast process development from bench scale to pilot scale [34]. Thus, control applications that require process scale-up would benefit from the reduced process complexity of dielectric capacitance monitoring as compared with multivariate approaches or biocalorimetry monitoring, where significant investment is required on scale-up [38]. Additionally, as no modeling is involved in implementing control based on direct dielectric monitoring, the systems developed should be easily transferable to other organisms, as previously demonstrated [24,39].

The technique used to control the specific growth rate depends on the ability to monitor the biomass and to manage the nonlinear growth dynamics of the process. A simple exponential feed cannot cope with unforeseen metabolic changes or process variability, while P or PI feedback controllers can have issues dealing with the exponential nature of cellular growth and with signal noise. Ensuring reliable biomass monitoring and a robust way of estimating μ is essential, as is the proper design of a controller capable of addressing the particularities of the process.

In this work, a revised control logic is presented to counter the above-mentioned issues. Biomass concentration measurements, provided by a dielectric spectroscopy probe, are smoothed with the Savitzky–Golay algorithm and used to estimate the current specific growth rate of the process. A modified feedforward-feedback controller is designed and optimized to maintain μ at the desired setpoint. The novel control design is adapted to the culture's exponential dynamics, resulting in improved noise management.

#### **2. Materials and Methods**

#### *2.1. Cell Strain and Culture Conditions*

In this study, the fed-batch cultures were performed with the wild-type strain of *Kluyveromyces marxianus* DSMZ 5422. This strain was chosen mainly for its fast growth dynamics and easy cultivation conditions. A cell bank of 24 cryovials was prepared from four colonies of *K. marxianus* isolated on a fresh YPLA (yeast extract 10 g/L, peptone 20 g/L, lactose 40 g/L, and agar 20 g/L) petri dish. Each colony was inoculated in a sterile tube with 5 mL of YPL (yeast extract 10 g/L, peptone 20 g/L, and lactose 40 g/L), previously autoclaved at 121 °C for 15 min, and cultivated overnight at 30 °C and 150 rpm. The 5 mL of media were transferred to a 500 mL conical flask containing 100 mL of the previously described YPL medium and cultivated overnight at 30 °C and 150 rpm. Then, the cultures were transferred into four sterile tubes of 25 mL and centrifuged at 3500 rpm for 3 min. The biomass deposited in the four tubes was resuspended in a 5.5 mL solution of NaCl 0.9% and peptone 0.1%, and a 5.5 mL solution of glycerol 40%, reaching a total volume of 12 mL. Finally, sterile cryovials were filled with 2 mL each and stored in the freezer at −18 °C.

All the precultures were inoculated with a frozen vial retrieved from the cell bank. The cells were grown for 24 h at 30 °C and 120 rpm in a 1000 mL baffled conical flask containing 200 mL of the YPL solution described above. The preculture media were autoclaved for 10 min at 115 °C. After 24 h, the cells were isolated by centrifugation at 3500 rpm for 3 min and resuspended in a 4.5 mL solution of NaCl 0.9% and peptone 0.1%, then transferred to a 10 mL sterile syringe for inoculation.

The cultures were performed at 30 °C in a 3.6 L bench scale bioreactor (KLF, BioEngineering AG, Wald, Switzerland), with a working volume of 2.5 L. The agitation rate was set to 1000 rpm and the air flow rate to 250 NL/h, corresponding to 2 vvm, to maintain the dissolved oxygen (DO) levels above 20% throughout the experiments. In addition, to accommodate the strain's particularly high respiration demands, a sintered sparger was used to enhance oxygen transfer and obtain a kLa value of 0.06 s<sup>−1</sup>. The bioreactor was equipped with a double 6-blade Rushton type agitator, a PT-100 temperature probe, pH and DO probes (Mettler-Toledo, Columbus, OH, USA), inlet gas flow controller, exhaust gas port, acid and base ports, a feed port, an antifoam port, and a sampling port, as illustrated in Figure 1. The exhaust gas passed through a condenser cooled to 4 °C to minimize liquid loss. Temperature, pH, and pO2 were controlled using the software provided by BioEngineering. Medium pH was controlled at 5 using solutions of NaOH 3 M and HCl 1 M. A solution of 10% antifoam in water was used to prevent foam formation.

**Figure 1.** Fed-batch bioreactor setup.

Biomass concentration was monitored with a dielectric spectroscopy probe (ABER Instruments Ltd., Aberystwyth, UK). The probe was used in dual-frequency mode at 580 and 15,650 Hz with the polarization correction BM220 PolC turned on and the filter value set at 140 to decrease the impact of aeration on the reading. Using the dual-frequency mode is more reliable because it minimizes the influence of baseline shifts and medium conductivity changes on the dielectric signal. Feeding was performed with a Peripex W2 pump (BioEngineering), which was controlled by an analog 4–20 mA signal from the current loop box ED-550 (Brainboxes, Liverpool, UK). Oxygen and CO<sub>2</sub> concentration levels in the exhaust gas were analyzed using a Tandem Pro gas analyzer (Magellan BioTech, Borehamwood, UK).

The cells were first grown in batch mode until depletion of the carbon source (which was detected by monitoring the dissolved oxygen and off-gas analysis signals), and then in fed-batch mode. Table 1 shows the semi-defined batch and feed media composition. Yeast extract was used as an iron and vitamin source [40–42], whereas peptone was used as an additional source of vitamins and protein, replacing a trace elements solution typically used [16,24,43].


**Table 1.** Composition of the batch and feed media.

#### *2.2. Signal Filtering and Smoothing*

A data acquisition program was developed using LabVIEW 18 (National Instruments, Austin, TX, USA) to handle all data communications between the KLF bioreactor and process analyzers (dielectric probe and off-gas analyzer). The program also treated the dielectric signal, reducing measured noise online. Basic moving average and the Savitzky–Golay smoothing algorithms were implemented and evaluated, with the latter showing better performance in terms of striking a compromise between minimizing signal delay and maximizing the accuracy of the filtered signal. A window of 121 data points and a first-order polynomial fit was used within the Savitzky–Golay algorithm to smooth the dielectric signal ahead of the specific growth rate estimation.
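The smoothing step can be illustrated with a minimal pure-Python sketch. It is not the authors' LabVIEW implementation: the class name `OnlineSavGol` and the choice to evaluate the fitted line at the newest sample (a causal, online variant) are assumptions, since the text does not state where in the window the fit was evaluated. Only the window length of 121 points and the first-order polynomial come from the text.

```python
from collections import deque


def linear_fit_endpoint(window):
    """Fit a least-squares line to equally spaced samples and
    return the value of that line at the last (newest) sample."""
    n = len(window)
    if n == 1:
        return window[0]
    x_mean = (n - 1) / 2.0
    y_mean = sum(window) / n
    sxx = sum((i - x_mean) ** 2 for i in range(n))
    sxy = sum((i - x_mean) * (y - y_mean) for i, y in enumerate(window))
    slope = sxy / sxx
    return y_mean + slope * ((n - 1) - x_mean)


class OnlineSavGol:
    """First-order Savitzky-Golay style smoother over a sliding window.

    Hypothetical helper: the endpoint evaluation is an assumption.
    """

    def __init__(self, window_length=121):
        self.buffer = deque(maxlen=window_length)

    def update(self, sample):
        # Keep only the most recent `window_length` samples.
        self.buffer.append(sample)
        return linear_fit_endpoint(list(self.buffer))


if __name__ == "__main__":
    smoother = OnlineSavGol(window_length=121)
    for k in range(5):
        print(smoother.update(10.0 + 0.1 * k))
```

For a centered window, a first-order fit reduces to a plain moving average, so in this sketch it is the endpoint evaluation that distinguishes the two filters: a noise-free linear ramp passes through without lag.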

#### *2.3. Dielectric Spectroscopy Adaptive Calibration*

Dielectric spectroscopy, when used in dual-frequency mode, measures the drop in the medium permittivity between the two electric field frequencies. This drop is assumed to be linearly proportional to the concentration of live cells; however, the correlation differs for each cell strain and set of culture conditions [44]. The electric field used by the sensor is sensitive to the average bubble size, medium conductivity and, to some extent, fluid viscosity. To keep the sensor's response accurate despite these changing parameters, the probe was recalibrated for each culture using the cell concentration data acquired at the end of the batch phase. The adapted calibration model was included within the LabVIEW on-line monitoring program prior to the fed-batch phase, accounting for changes in medium conductivity and fluid viscosity, and applied for the fed-batch segment of the experiment.

For reference, cell concentration was measured off-line as dry cell weight, at intervals of 60 min during the batch phase and 3 h during the fed-batch phases. Samples of 5 mL of broth were filtered through a pre-weighed membrane filter. The filter was then dried to constant weight, and the yeast mass was calculated and converted to grams per litre.
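The reference measurement and the per-culture recalibration can be sketched as follows. The function names are hypothetical, and the calibration shown is an assumed single-point proportional model through the origin. The text only states that the permittivity drop is taken as linearly proportional to live cell concentration and that end-of-batch data were used; the form of the authors' model is not given.

```python
def dry_cell_weight(filter_dry_g, filter_tare_g, sample_ml=5.0):
    """Dry cell weight in g/L from a dried, pre-weighed filter."""
    return (filter_dry_g - filter_tare_g) / (sample_ml / 1000.0)


def calibration_factor(dcw_g_per_l, delta_permittivity):
    """Slope of an assumed proportional model, from one reference point
    taken at the end of the batch phase."""
    if delta_permittivity <= 0:
        raise ValueError("permittivity drop must be positive")
    return dcw_g_per_l / delta_permittivity


def biomass_from_signal(delta_permittivity, factor):
    """Convert the dual-frequency permittivity drop to biomass (g/L)."""
    return factor * delta_permittivity


if __name__ == "__main__":
    # Illustrative numbers only, not measurements from the study.
    dcw = dry_cell_weight(filter_dry_g=0.125, filter_tare_g=0.075)
    k = calibration_factor(dcw, delta_permittivity=4.0)
    print(dcw, k, biomass_from_signal(6.0, k))
```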

#### *2.4. Specific Growth Rate Estimation*

The filtered biomass signal was used to calculate the estimated specific growth rate. As cell growth is an exponential process, the linearization involves the use of a logarithm within the derivative. The theoretical way of calculating μ is given in Equation (1), where *Cx* is the current biomass concentration and *V* the medium volume.

$$\mu(t) = \frac{d\ln(C_X V)}{dt} \tag{1}$$

Specific growth rate values obtained with Equation (1) are, however, very sensitive to any residual noise in the biomass measurement. In order to minimize this effect, a Δ*t* window of 15 min was used in this work to calculate μ, as shown in Equation (2). This window size was the result of a compromise between reducing noise and minimizing signal delay.

$$\mu_{est}(t) = \frac{\ln\left(\frac{C_{X,t}\, V_t}{C_{X,t-15\,\mathrm{min}}\, V_{t-15\,\mathrm{min}}}\right)}{\Delta t} \tag{2}$$

Throughout the batch phase, the volume of the medium remained constant; however, during the fed-batch phase, the culture volume increased as a function of the feed flow rate. The instantaneous value of the volume was updated by monitoring the mass of medium fed into the reactor and taking into account media density. Equation (3) shows the calculation used to correct the medium volume, *Vt,* throughout the culture. Here *Vini* is the initial volume of the batch medium, *mfed* is the mass of the feed solution fed into the vessel, as recorded by the scale, and ρ*feed* is the feed density.

$$V_t = V_{ini} + \frac{m_{fed}}{\rho_{feed}}\tag{3}$$
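Equations (2) and (3) translate directly into a short calculation. The sketch below is illustrative only: the function names and the numerical values in the usage example are assumptions, while the 15 min window (0.25 h) comes from the text.

```python
import math


def corrected_volume(v_ini_l, m_fed_g, rho_feed_g_per_l):
    """Equation (3): culture volume (L) from initial volume and fed mass."""
    return v_ini_l + m_fed_g / rho_feed_g_per_l


def mu_est(cx_now, v_now, cx_past, v_past, dt_h=0.25):
    """Equation (2): specific growth rate (1/h) over a window of dt_h hours.

    cx_* are biomass concentrations (g/L), v_* the culture volumes (L).
    """
    return math.log((cx_now * v_now) / (cx_past * v_past)) / dt_h


if __name__ == "__main__":
    # Hypothetical readings taken 15 min apart.
    v_past = corrected_volume(2.5, 100.0, 1100.0)
    v_now = corrected_volume(2.5, 110.0, 1100.0)
    print(mu_est(cx_now=10.2, v_now=v_now, cx_past=10.0, v_past=v_past))
```

Using the total biomass (concentration times volume) rather than concentration alone keeps dilution by the feed from being read as a drop in growth rate.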

#### *2.5. Controller Design*

The classic method of regulating the specific growth rate is open-loop (feed-forward) control; this technique implies implementing a predefined, exponential feed rate of medium to the culture. However, this approach cannot react to any potential process disturbances or deviations from the setpoint. In order to enhance the open-loop action, a feedback logic can be implemented [5,6,8]. Dabros et al. [24] proposed a new type of PI controller where the controller gains were included within the exponential term of the control equation. The advantage of this control logic is that it is well adapted to the exponential dynamics of the culture, making it more effective and robust over the course of the fed-batch culture.

In this work, we demonstrate the use of a feedforward-feedback controller similar to the one proposed in [24], but with a slight modification. The proposed alteration to the control logic allows for better noise management and makes the controller more robust, particularly for long-duration fed-batch cultures.

The feed-forward part of the controller action, for a given specific growth rate setpoint, μ*sp*, is given by the following expression:

$$F_{FF}(t) = F_0 \exp(\mu_{sp}t) \tag{4}$$

where *F*<sub>0</sub> is the theoretical initial feed flow rate, calculated as follows:

$$F_0 = C_{X,0} V_0 \frac{\mu_{sp}}{Y_{X/S} \, S_F} \tag{5}$$

Here, *C<sub>X,0</sub>* is the initial biomass concentration, *V*<sub>0</sub> the initial volume of the culture, *Y<sub>X/S</sub>* is the biomass yield coefficient, and *S<sub>F</sub>* is the substrate concentration in the feed medium. The initial biomass concentration is obtained with the dielectric measurement, while the biomass yield is determined at the end of the initial batch phase.
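As a worked illustration of Equations (4) and (5), the sketch below computes the initial feed rate and the feed-forward profile. All numerical inputs in the usage example are hypothetical, since the feed substrate concentration and the yields of individual runs are not given in this section.

```python
import math


def initial_feed_rate(cx0, v0, mu_sp, y_xs, s_f):
    """Equation (5): F0 in L/h.

    cx0 and s_f in g/L, v0 in L, mu_sp in 1/h, y_xs in g biomass / g substrate.
    """
    return cx0 * v0 * mu_sp / (y_xs * s_f)


def feed_forward(f0, mu_sp, t_h):
    """Equation (4): exponential feed rate at time t_h (hours)."""
    return f0 * math.exp(mu_sp * t_h)


if __name__ == "__main__":
    # Hypothetical values chosen only to show the units working out.
    f0 = initial_feed_rate(cx0=10.0, v0=2.5, mu_sp=0.1, y_xs=0.5, s_f=200.0)
    for t in (0.0, 1.0, 2.0):
        print(t, feed_forward(f0, 0.1, t))
```

With these inputs, the cells require 10 × 2.5 × 0.1 / 0.5 = 5 g/h of substrate, which a 200 g/L feed delivers at 0.025 L/h.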

The feedback action of the controller is based on the process control error, calculated as follows:

$$\varepsilon(t) = \mu_{sp} - \mu_{est}(t) \tag{6}$$

The feedback controller contains proportional and integral gains (respectively, *Kp* and *Ki*). In order to avoid the use of adaptive gains, both *Kp* and *Ki* were included within the exponential term of the feedback control equation, *FFB*, as shown in the following expression:

$$F_{FB}(t) = F_0 \exp\left( \left( K_p \varepsilon(t) + K_i \int_0^t \varepsilon(t)dt \right) t \right) \tag{7}$$

Both parts of the controller, feedforward and feedback, are used simultaneously and can be combined into a single final control equation, given below:

$$F(t) = F_0 \exp\left( \left(\mu_{sp} + K_p \varepsilon(t) + K_i \int_0^t \varepsilon(t)dt \right) t \right) \tag{8}$$

This form of feedforward-feedback controller was used by Dabros et al. [24]. Its main drawbacks include sensitivity to noise and oscillatory behavior, observed to increase in time. Indeed, if the error term is unstable and noisy, the controller action can intensify the problem. To reduce the oscillations, the controller gains have to be chosen meticulously. In this work, two different methods were applied to tune the gains' values. The first approach involved manual tuning from the initial values reported in [24], which were 1.5 [-] for *Kp* and 0.5 h<sup>−1</sup> for *Ki*. After several cultures, the gains were adapted according to a manual tuning methodology [45] as follows: *Kp* was reduced to 0.75 [-] and *Ki* increased to 1 h<sup>−1</sup>. The second method used to calculate the controller gains was the Ziegler–Nichols open-loop tuning table [46]. This approach is based on the system's response following a step input applied to the control variable. In this work, a predefined feed flow rate was applied according to Equation (4), and the specific growth rate response was used to determine the proportional and integral terms. The calculated *Kp* was 0.66 [-] and *Ki* was 0.6 h<sup>−1</sup>, both values relatively close to the ones obtained with manual tuning. The two sets of gains were tried in order to make the final choice: 0.75 [-] and 1 h<sup>−1</sup> for *Kp* and *Ki*, respectively.

By analyzing Equation (8), it can be noted that the proportional and integral terms of the controller are time dependent. This design effectively makes the weights of the two terms increase with time. Indeed, in this work, it was noticed that the oscillatory behavior increased with time not only in the control variable, μ, but also in the manipulated variable, *F(t)*. This meant that the oscillations were time dependent, and therefore the controller logic needed to be changed to remove the gains' dependence on time. To address this issue, a novel approach is proposed as follows.

First, only the proportional term was changed, yielding the following expression:

$$F(t) = F_0 \exp\left( \left(\mu_{sp} + \frac{K_p}{t}\varepsilon(t) + K_i \int_0^t \varepsilon(t)dt \right)t \right) \tag{9}$$

By dividing *Kp* by the time, the effect of this term becomes constant over time. This change should increase the performance of the controller by reducing the high-frequency oscillations in the control and manipulated variables.

Similarly, the time dependence of the integral term was suspected to induce increasing oscillations in the control and the manipulated variables. In this case, the disturbance was expected to be low in frequency, since the integral term's effect is delayed in time. Thus, by making the integral term constant, the low-frequency oscillations should be reduced. To verify this hypothesis, Equation (9) was adapted as follows:

$$F(t) = F_0 \exp\left(\mu_{sp}t + K_p \varepsilon(t) + K_i \int_0^t \varepsilon(t)dt\right) \tag{10}$$

This last modification was thought to be necessary particularly for long-duration fed-batch cultures. It should be noted that with the final controller logic (Equation (10)), both parts of the feedback controller have a constant weight over time, but they are still included in the exponential term of the equation. This is important because it allows for exponential gain scheduling, making the otherwise linear PI controller robust with respect to the exponential growth dynamics of the fed-batch culture.
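The difference between the three control laws can be made concrete by writing Equations (8), (9) and (10) side by side. The sketch below is illustrative: the function names are hypothetical, and `err` and `err_int` stand for ε(t) and its time integral, computed elsewhere.

```python
import math


def feed_eq8(f0, mu_sp, kp, ki, err, err_int, t_h):
    """Equation (8): P and I terms are both multiplied by elapsed time."""
    return f0 * math.exp((mu_sp + kp * err + ki * err_int) * t_h)


def feed_eq9(f0, mu_sp, kp, ki, err, err_int, t_h):
    """Equation (9): P term has constant weight, I term still scales with t."""
    return f0 * math.exp(mu_sp * t_h + kp * err + ki * err_int * t_h)


def feed_eq10(f0, mu_sp, kp, ki, err, err_int, t_h):
    """Equation (10): P and I terms both have constant weight."""
    return f0 * math.exp(mu_sp * t_h + kp * err + ki * err_int)


if __name__ == "__main__":
    # Same error, evaluated early and late in a culture (illustrative values).
    for t in (1.0, 8.0):
        ff = 0.02 * math.exp(0.2 * t)
        args = (0.02, 0.2, 0.75, 1.0, 0.05, 0.0, t)
        print(t, feed_eq8(*args) / ff, feed_eq10(*args) / ff)
```

The printed ratios show the feedback correction relative to the pure feed-forward rate. For the same error, the correction from Equation (8) grows with culture time, whereas the one from Equation (10) stays fixed.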

Each time a new specific growth rate setpoint was defined during a running culture, the initial feed flow rate, *F*0, was calculated anew using Equation (5), applying the latest value of the biomass concentration, *CX,t*, and the current volume, *Vt*. To prevent integral windup, the window of integration was limited to 3 h. This limitation allowed maintaining the controller action continuous throughout the culture without having to reset the integral term between two setpoints.

Figure 2 shows the controller block diagram used in this work. The final μ controller logic is given by Equation (10). The control loop was executed every 20 s by the LabVIEW supervision program, registering all the process measurements and sending commands to the feed pump.
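A minimal sketch of one way the control loop could be organized is given below, combining Equation (10), the 3 h integration window and the 20 s execution period. The class `MuController`, the rectangle-rule integration and the way time is tracked are assumptions made for illustration; the authors' LabVIEW program is not reproduced here. The default gains are the final values reported above.

```python
import math
from collections import deque


class MuController:
    """Feedforward-feedback PI law of Equation (10) with a windowed integral."""

    def __init__(self, f0, mu_sp, kp=0.75, ki=1.0,
                 period_h=20.0 / 3600.0, window_h=3.0):
        self.f0, self.mu_sp, self.kp, self.ki = f0, mu_sp, kp, ki
        self.period_h = period_h
        self.t_h = 0.0
        # Only errors from the last `window_h` hours enter the integral,
        # which bounds the integral term (anti-windup).
        self.errors = deque(maxlen=int(round(window_h / period_h)))

    def step(self, mu_est):
        """Run one control cycle and return the feed rate command."""
        err = self.mu_sp - mu_est
        self.errors.append(err)
        err_int = sum(self.errors) * self.period_h  # rectangle rule
        self.t_h += self.period_h
        return self.f0 * math.exp(
            self.mu_sp * self.t_h + self.kp * err + self.ki * err_int
        )


if __name__ == "__main__":
    controller = MuController(f0=0.02, mu_sp=0.1)  # f0 is a hypothetical value
    for mu in (0.08, 0.09, 0.11):
        print(controller.step(mu))
```

At a setpoint change, the text indicates that *F*<sub>0</sub> is recomputed from the latest biomass and volume. In this sketch that would correspond to updating `f0` and `mu_sp` and restarting `t_h` while keeping the error buffer, which is one possible reading rather than a documented detail.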

The performance of the controller was assessed, at each setpoint, by calculating the root mean square error (RMSE, Equation (11)). In order to allow for adequate stabilization time, the calculation window started one hour from the moment when the setpoint was initially applied and continued until the next setpoint change. The stabilization time was necessary because the dielectric signal showed signs of increased disturbance resulting from the metabolic adjustments following each setpoint change.

$$RMSE = \sqrt{\frac{\sum_{i=1}^{n} \left(\mu_{est,i} - \mu_{sp}\right)^2}{n}} \tag{11}$$
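Equation (11), together with the one-hour stabilization period, can be sketched as follows. The function name and argument layout are hypothetical, and the caller is assumed to pass only samples recorded before the next setpoint change.

```python
import math


def rmse_after_stabilization(times_h, mu_est, mu_sp, t_setpoint_h, settle_h=1.0):
    """RMSE of the estimated growth rate against the setpoint (Equation (11)),
    ignoring samples within `settle_h` hours of the setpoint change."""
    kept = [m for t, m in zip(times_h, mu_est) if t >= t_setpoint_h + settle_h]
    if not kept:
        raise ValueError("no samples after the stabilization period")
    return math.sqrt(sum((m - mu_sp) ** 2 for m in kept) / len(kept))


if __name__ == "__main__":
    # Illustrative data only.
    times = [0.0, 0.5, 1.0, 1.5, 2.0]
    mu = [0.50, 0.30, 0.12, 0.08, 0.10]
    print(rmse_after_stabilization(times, mu, mu_sp=0.1, t_setpoint_h=0.0))
```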

**Figure 2.** Controller block diagram.

#### **3. Results**

First, the performance of the original, unmodified feedforward-feedback PI controller (Equation (8), [24]) was assessed by analyzing the profiles of the specific growth rate and of the feed flow rate over the duration of a fed-batch culture. Analyzing the μ profile, shown in Figure 3, it can be observed that the process showed increasing oscillations starting after two hours of the fed-batch culture. The RMSE for this experiment was 0.063 h<sup>−1</sup>, corresponding to a mean controller error of 32%.

**Figure 3.** Setpoint (dash line) and estimated specific growth rate (solid line) during a controlled fed-batch culture, controller logic (Equation (8)).

To improve the controller and reduce the amplified oscillations, it was necessary to reduce the effect of the part of the controller action that reacts fast to the evolution of the error, ε(*t*). Indeed, Figure 4 confirms that oscillations were equally present in the manipulated variable. This meant that the action responsible for this behavior was dictated by the proportional term.

**Figure 4.** Feed flow rate applied during a controlled fed-batch culture, controller logic (Equation (8)).

For the subsequent experiment, the controller logic was changed to the one given by Equation (9). By keeping the weight of the proportional gain, *Kp*, constant over time, the high-frequency oscillatory behavior was reduced. In order to assess the controller's improvement, a longer step was performed. Figure 5 shows the specific growth rate profile over a nine-hour fed-batch culture. The process still shows oscillations, but ones that are constant and not amplified over time. In this experiment, the root mean square error (RMSE) was 0.072 h<sup>−1</sup>, which corresponded to a mean controller error of 36%. Examination of the corresponding feed flow rate profile, shown in Figure 6, suggested that the controller could still be improved. The oscillatory behavior was still present in this signal, but its frequency was lower (~0.5 h<sup>−1</sup>) than that observed previously (Figure 3, ~1 h<sup>−1</sup>). Nevertheless, the amplitude of the oscillations did increase in time.

**Figure 5.** Setpoint (dash line) and estimated specific growth rate (solid line) during a controlled fed-batch culture, controller logic (Equation (9)).

**Figure 6.** Feed flow rate profile obtained during a controlled fed-batch culture, controller logic (Equation (9)).

As the proportional gain remained constant, it could not be responsible for amplifying the oscillations in the feed rate. On the other hand, the integral term was still time dependent, its effect increasing over time. Moreover, since the oscillatory frequency was lower, suggesting that the term responsible for inducing them was a slow-acting term, it pointed at the integral term. Following this consideration, the controller logic was changed to that given in Equation (10).

To assess the improvement in the controller's performance, another experiment was performed where the specific growth rate setpoint was changed mid-culture. With the theoretical maximal specific growth rate of *K. marxianus* being close to 0.6 h<sup>−1</sup>, it was decided to control the growth rate at a high and a low setpoint within the same culture [40,47,48].

Figure 7 shows the specific growth rate obtained during this fed-batch experiment. It was possible to control the growth rate at 0.4 h<sup>−1</sup>; however, the culture could not be run at that setpoint for a long time. Indeed, the cell concentration increased too rapidly to allow for full aerobic conditions (DO > 20%). After 3 h, the limit of 20% DO was reached, and the setpoint was lowered to maintain the culture in aerobic respiration. It is unclear whether the remaining mild oscillatory behavior was caused by the residual noise in the filtered dielectric signal or by the controller itself. Figure 8 shows the feed flow rate applied throughout the culture; this time, little oscillation was observed in the manipulated variable. The resulting smooth control action allowed the controller to reach and maintain the growth rate very close to the setpoint throughout the experiment. The root mean square error (RMSE) for the μ setpoint of 0.4 h<sup>−1</sup> was 0.061 h<sup>−1</sup>, and for the setpoint of 0.1 h<sup>−1</sup>, the RMSE was 0.019 h<sup>−1</sup>. The combined mean controller error for this culture was 17%, about half of the errors observed during the cultures where the controllers given by Equations (8) and (9) were used (Figures 3 and 5, mean controller errors of 32% and 36%, respectively).
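The text does not define the mean controller error explicitly. One reading that reproduces the reported 17%, offered here as an assumption, is the RMSE divided by the setpoint, averaged over the two setpoints of this culture:

$$\frac{0.061}{0.4} \approx 15\%, \qquad \frac{0.019}{0.1} = 19\%, \qquad \frac{15\% + 19\%}{2} \approx 17\%$$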

**Figure 7.** Setpoint (dash line) and estimated specific growth rate (solid line) during a controlled fed-batch culture, controller logic (Equation (10)).

**Figure 8.** Feed flow rate profile during a controlled fed-batch culture, controller logic (Equation (10)).

Figure 9 shows the evolution of the biomass concentration during this fed-batch culture. The individual stages of the culture can easily be identified, i.e., the initial batch phase (until culture time of 4.4 h), followed by the two fed-batch phases at μ setpoints of 0.4 h<sup>−1</sup> and 0.1 h<sup>−1</sup>.

**Figure 9.** Evolution of biomass concentration (raw signal in solid line, filtered signal in dashed line) during the initial batch phase (blue) and two fed-batch phases at μ*sp* = 0.4 h<sup>−1</sup> (red) and 0.1 h<sup>−1</sup> (black).

#### **4. Discussion**

In this work, it was demonstrated that the logic of the specific growth rate controller was equally, if not more, important than the controller gains. Indeed, the control performance changed significantly depending on the approach chosen.

As a first step in the study, we have addressed the weakest element in the control loop, namely, the biomass measurement and the specific growth rate estimation. The biomass measurement signal was inherently noisy and sensitive to process conditions, such as aeration and agitation. The filtering approach was optimized to minimize the effect of the bubbles without inducing too long a lag time. The window of 15 min chosen for the specific growth rate estimator was a trade-off with respect to the lag time of the filtering algorithm and the remaining noise on the biomass signal.

Secondly, it was illustrated that by adjusting the controller logic, the signal oscillations could be managed effectively, and the controller stability improved. Using the original controller (Equation (8)), a fed-batch culture lasting longer than 2 h was difficult to achieve because of increasing oscillations in both the control and the manipulated variables. With the new control logic, the experiment was shown to be stable in time, and the mean controller error was reduced to 17%. The remaining error seems to be limited by the residual noise amplitude on the biomass measurement.

Finally, the reproducibility of the controller's performance was assessed by comparing three fed-batch cultures run at the same specific growth rate setpoint (0.1 h<sup>−1</sup>). The result, shown in Figure 10, demonstrates that after the initial period of instability (1 h), the estimated μ tracked the assigned setpoint in all three runs. The level of accuracy was similar, showing an average relative RMSE of 23% ± 6% (1σ) despite the variable starting biomass concentration, indicating controller robustness.

All fed-batch cultures were run between 6 and 9 h in total duration, with variable μ setpoints. Throughout this work, the maximum biomass concentration allowing aerobic conditions was around 25 g/L of dry cell weight. Above this concentration, regardless of the specific growth rate setpoint, the oxygen transfer rate was insufficient to allow purely aerobic respiration of the cells. The aim of fed-batch cultures is to ensure a permanent state of substrate limitation. If the culture conditions changed from aerobic to anaerobic, the limiting factor would be a combination of substrate and oxygen. The agitation rate could be increased to enhance the oxygen transfer and allow for higher biomass concentrations. However, this change was observed to deteriorate the growth rate estimation by increasing significantly the noise in the biomass concentration measurement.

**Figure 10.** Biomass concentration profiles (**a**) and specific growth rate estimates (**b**) during three fed-batch phases run at μ*sp* = 0.1 h<sup>−1</sup>.

#### **5. Conclusions**

The aim of this work was to implement and optimize a novel proportional-integral (PI) feedforward-feedback controller, designed to maintain a desired specific growth rate of a microbial culture. The proposed new control logic provides robust setpoint tracking of an exponentially evolving fed-batch culture, while ensuring improved noise and oscillation management. To reduce the noise present in the biomass measurement, the filtering approach was optimized using a first-order Savitzky–Golay algorithm instead of a simple moving average. Then, to limit the oscillations in the manipulated and control variables, the time dependency of the P and I feedback terms was removed, as shown in Equation (10). This change enhanced the stability of the system for longer processes without any noticeable increase in signal oscillations, as was previously observed. Despite the remaining noise in the biomass measurement, the controller was able to cope with this issue and maintain the desired specific growth rate over the duration of the culture. In conclusion, it was shown that a strain of *K. marxianus* could be grown successfully in fed-batch mode, under substrate-limited, aerobic conditions, at different setpoints ranging from 0.1 h<sup>−1</sup> to 0.4 h<sup>−1</sup>.

Some aspects could still be improved, particularly the biomass measurement signal filtration. In order to diminish the undesired effect of aeration on the dielectric signal, two possibilities become apparent. The first solution is to work under a slight pressure in the reactor, which would increase the solubility of oxygen, and thereby reach aerobic conditions with a reduced aeration rate. The second solution is to use an external bypass (flow-through cell) where the probe could be installed to measure the medium permittivity without the presence of bubbles. The latter approach would require modifications to the existing bioreactor. Finally, a more powerful signal filtration technique could be applied, such as the Kalman filter [22].

It was noticed that the biomass yield varied between the fed-batch cultures that were performed. Depending on the specific growth rate, the obtained biomass yield, *YX*/*S*, varied from 0.3 to 0.65 g/g. This complicated the calculation of the initial feed flow rate (*F*<sub>0</sub>, Equation (5)), since both terms, μ*sp* and *YX*/*S*, appear in this equation and yet are interdependent. It would be interesting to investigate in more depth the effect of the specific growth rate on the biomass yield, as suggested in [5]. By knowing at which growth rate a particular strain is more productive, industrial applications could be optimized and their costs reduced.

This work highlighted the feasibility of controlling the specific growth rate of yeast in long-duration fed-batch cultures and in spite of a noisy biomass measurement signal. The proposed approach should be applicable to other microbial systems, as well as to mammalian cell cultures. In the latter case, it is expected that the slower growth dynamics should allow a further reduction in signal noise and oscillations.

**Author Contributions:** Y.B. performed the experimental work with technical help from D.C. and under the supervision of B.F. and M.D.; Y.B. wrote the original manuscript, which was subsequently revised by M.D. and B.F. All authors have read and agreed to the published version of the manuscript.

**Funding:** This research received no external funding.

**Acknowledgments:** This publication has emanated from research supported by the School of Biotechnology, Dublin City University. The authors would like to thank Rachel Crossley, European Technical Sales Specialist at Aber Instruments Ltd. and Karin Philipp of Bioengineering AG for support.

**Conflicts of Interest:** The authors declare no conflict of interest.

#### **References**


© 2020 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).

## **Maximization of Siderophores Production from Biocontrol Agents,** *Pseudomonas aeruginosa* **F2 and** *Pseudomonas fluorescens* **JY3 Using Batch and Exponential Fed-Batch Fermentation**

**Gaber Attia Abo-Zaid 1,\*, Nadia Abdel-Mohsen Soliman 1, Ahmed Salah Abdullah 1, Ebaa Ebrahim El-Sharouny 2,3, Saleh Mohamed Matar 1,4 and Soraya Abdel-Fattah Sabry <sup>2</sup>**


Received: 15 February 2020; Accepted: 31 March 2020; Published: 12 April 2020

**Abstract:** Twenty fluorescent *Pseudomonas* isolates were tested for their ability to produce siderophores on chrome azurol S (CAS) agar plates and their antagonistic activity against six plant pathogenic fungal isolates was assessed. Scaling-up production of siderophores from the promising isolates, *P*. *aeruginosa* F2 and *P*. *fluorescens* JY3, was performed using batch and exponential fed-batch fermentation. Finally, culture broth of the investigated bacterial isolates was used for the preparation of two economical bioformulations for controlling *Fusarium oxysporum* and *Rhizoctonia solani*. The results showed that both isolates yielded high siderophore production and they were more effective in inhibiting the mycelial growth of the tested fungi compared to the other bacterial isolates. Exponential fed-batch fermentation gave higher siderophore concentrations (estimated in 10 μL), which reached 67.05% at 46 h and 45.59% at 48 h for isolates F2 and JY3, respectively, than batch fermentation. Formulated *P*. *aeruginosa* F2 and *P*. *fluorescens* JY3 decreased the damping-off percentage caused by *F*. *oxysporum* by the same percentage (80%), while the reduction in damping-off percentage caused by *R*. *solani* reached 87.49% and 62.5% for F2 and JY3, respectively. Furthermore, both formulations increased the fresh and dry weight of shoots and roots of wheat plants. In conclusion, bio-friendly formulations of siderophore-producing fluorescent *Pseudomonas* isolates can be used as biocontrol agents for controlling some plant fungal diseases.

**Keywords:** *Pseudomonas*; siderophores; antagonism; batch fermentation; exponential fed-batch fermentation; bio-friendly formulations; biocontrol

#### **1. Introduction**

Siderophores are a group of low-molecular-weight molecules (400–1500 Da) with a high affinity for ferric ions. These molecules are secreted by different microorganisms in response to low-iron conditions; a siderophore molecule forms a Fe-siderophore complex, which is recognized by membrane-receptor proteins within the microorganism. The Fe-siderophore complex is then transported into the cell periplasm, where it facilitates iron uptake [1,2]. Siderophores are iron chelators that have many applications, such as iron chelation therapy, antibiotic carriers, inhibitors of metalloenzymes, promotion of plant growth, biocontrol of plant and fish pathogens, biocontrol of algal blooms, removal of petroleum hydrocarbons from the marine environment, soil bioremediation, recovery of rare earth elements and modification of surfaces [3]. The role of siderophore production in the biocontrol of *Erwinia carotovora* was demonstrated by Kloepper et al., 1980, who were the first to use *Pseudomonas fluorescens* producing pseudobactin, a type of siderophore, as a biocontrol agent. Several species of fluorescent pseudomonads are able to secrete siderophores, such as *P*. *aeruginosa*, *P*. *fluorescens* and *P*. *syringae* [4,5]. *P*. *fluorescens*, which secretes a hydroxamate-type siderophore, showed high efficiency against *Macrophomina phaseolina*, the pathogen causing peanut charcoal rot [6], while a catechol-type siderophore produced by *P*. *syringae* had inhibitory effects on spore germination and mycelium morphology of *Fusarium oxysporum* [7]. Maximization of siderophore production from fluorescent pseudomonad isolates is performed in the bioreactor using different fermentation strategies [8–10]. There are three common operational modes of the fermentation process: batch, fed-batch and continuous fermentation [11]. Batch fermentation is a very simple process, where all the medium components are sterilized in the vessel of the bioreactor; however, some sensitive components might be sterilized separately and then added to the bioreactor after sterilization. Then, the organism is inoculated into the vessel and its growth is maintained inside this closed system until the end of the run without any further modification. The main objective of the fermentation process is maximizing production of the biological product or biomass, so the process ends when the target yield or product level is reached, when nutrients and minerals have been consumed, or when toxic products accumulate [11,12]. 
Fed-batch fermentation starts with a batch phase and is completed with a fed-batch phase, in which a carbon source or any other nutrients are added to the bioreactor at the end of the exponential phase of the culture growth (after consumption of the initial carbon source) without any removal of the fermentation products until the end of the fermentation process. The medium components can be added to the vessel according to a designated feed rate (according to a model) to avoid the inhibitory effect of the initial high concentration of any medium component that occurs in batch fermentation. This ensures that the fed-batch fermentation process achieves high concentrations of the desired products [13]. There are different types of fed-batch fermentation according to the feeding strategy: constant [12], exponential [14,15], linear [16], pulse [12], and feeding based on other models [17]. Both batch and fed-batch strategies have been used for maximizing pyocyanin production from *P*. *aeruginosa* JY21 [12], and to improve the production of biosurfactant from *P*. *aeruginosa* USM AR2 and MR01 [18,19]. In addition, these strategies have also been used for increasing the production of acetyl esterase from *Pseudomonas* sp. ECU1011 [20]. Radha et al., 2014 enhanced the production of alkaline protease using batch and fed-batch fermentation strategies from a mixed culture of *P*. *putida* and *Staphylococcus aureus*. They reported a 1.18-fold increase in the activity of alkaline protease when the enzyme was produced in fed-batch fermentation compared to the batch fermentation process [21]. Siderophores, which have high affinity for ferric ions, are naturally produced by microorganisms under low-iron conditions. The resulting Fe-siderophore complex becomes unavailable to other organisms, but the producing strain can take up this complex via a very specific receptor in its outer cell membrane [22,23]. 
By this strategy, siderophore-producing fluorescent *Pseudomonas* can restrict the growth of plant pathogens in the rhizosphere of plants. This mechanism is known as competition for iron nutrition, which is one of the most important mechanisms of biological control of plant pathogens. Pyoverdin, a siderophore produced by *P*. *putida* WCS358, was effective in controlling radish fusarium wilt disease, which is caused by *F*. *oxysporum* f. sp. *raphani* [24]. In addition, *P. fluorescens* strains SPs9 and SPs20 that produced siderophores were effective in the inhibition of the tomato wilt pathogen, *F*. *oxysporum* f. sp. *lycopersici* [5]. Siderophores play an imperative role in iron nutrition of plants, and consequently, promote plant growth. Iron uptake by plants is usually enhanced by microbial siderophores, when the plant is able to recognize the bacterial ferric-siderophore complex. Plants take in iron from bacterial siderophores by means of different mechanisms, for example, chelation and release of iron, the direct uptake of siderophore-Fe complexes, or a ligand exchange reaction [25]. *Arabidopsis thaliana* took up the Fe-pyoverdine complex synthesized by *P*. *fluorescens* C7, resulting in an increase in iron concentration inside the plant, and hence, improved plant growth [26]. The main goal of our study is maximizing the production of siderophores from two *Pseudomonas* species, *P*. *aeruginosa* and *P*. *fluorescens*, using batch and exponential fed-batch fermentation strategies. We then test the ability of siderophore-producing fluorescent pseudomonads to act as biocontrol agents against soil-borne fungi such as *F*. *oxysporum* and *Rhizoctonia solani*, the major agents of damping-off and root rot diseases in many different plant crops, which cause severe losses in the productivity of food and feed crops.

#### **2. Materials and Methods**

#### *2.1. Bacterial and Fungal Isolates*

#### 2.1.1. Isolation of Fluorescent *Pseudomonas* Isolates

Isolation was carried out by the serial dilution method from forty soil samples of wheat, corn, eggplant, cotton, pepper and clover collected from Alexandria, Monufia and Sohag in Egypt. Ten grams of soil were suspended in 90 mL of sterile double distilled water and shaken for 1 h at 200 rpm. Next, 200 μL of the 10<sup>−5</sup> and 10<sup>−6</sup> dilutions were spread onto King's B medium plates. After inoculation, plates were incubated at 28 to 30 °C for 1 to 2 days. Fluorescent *Pseudomonas* isolates were recognized by the formation of green fluorescent pigments in and around the colonies when exposed to ultraviolet light. Fluorescent *Pseudomonas* cultures were purified using the single colony technique and examined under a light microscope after Gram staining to confirm their purity.

#### 2.1.2. Bacterial and Fungal Isolates from Culture Collection

Four fluorescent pseudomonad isolates, *P*. *fluorescens* JY3, JY7, JY8 and JY13 (GenBank accession numbers KF922490, KF922494, KF922495 and KF922500, respectively) (Table 1), and six fungal isolates, *Alternaria* spp., *F*. *culmorum*, *F*. *oxysporum* isolate A, *F*. *oxysporum* isolate B, *F*. *solani* and *R*. *solani*, used in the current study were kindly provided by the City of Scientific Research and Technological Applications (SRTA-City).


**Table 1.** Fluorescent *Pseudomonas* isolates used in the current study.

#### *2.2. Screening for Siderophores Production Using CAS Assay*

#### 2.2.1. Qualitative Assay

All fluorescent *Pseudomonas* isolates were checked for their capability to secrete siderophores. Siderophores were detected on chrome azurol S (CAS) agar plates [27]. Detection depends on the high affinity of siderophores for iron: in the presence of siderophores, the color of the medium changes from greenish blue to orange. *Pseudomonas* isolates were grown in succinic medium containing 4 g succinic acid; 6 g KH2PO4; 4 g K2HPO4; 0.2 g MgSO4·7H2O; 1 g (NH4)2SO4 and dH2O up to 1000 mL [28] for 24 h at 200 rpm and 30 °C. Next, 10 mL of each culture was centrifuged at 10,000 rpm for 10 min and the supernatant was filtered through a 0.2 μm syringe filter. Five wells were made in each CAS agar plate and each well was filled with 80 μL of culture filtrate. Plates were then incubated at 30 °C for 24 h and the presence of siderophores was detected visually.

#### 2.2.2. Quantitative Assay

*Pseudomonas* isolates were cultured in succinic medium for 24 h at 200 rpm and 30 °C. A 1.5 mL volume of each culture was centrifuged at 10,000 rpm for 10 min. The relative level of siderophores was measured in a fixed volume of supernatant (10 μL) using the CAS assay method according to Schwyn and Neilands (1987) [27]. Next, 0.5 mL of CAS assay solution was added to 10 μL of culture supernatant and mixed well, then 10 μL of shuttle solution was added and mixed, and the mixture was left at room temperature for a few minutes. The disappearance of the blue color indicates the presence of siderophores. The absorbance was measured at 630 nm using the medium as a blank. The relative level of siderophores was calculated using the following formula:

$$\text{Relative level of siderophores (\%)} = \frac{A_r - A_s}{A_r} \times 100$$

where $A_r$ (Ar) is the absorbance of the CAS solution plus medium plus shuttle solution, and $A_s$ (As) is the absorbance of the CAS solution plus culture supernatant plus shuttle solution.
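The calculation can be illustrated with a minimal Python sketch, reading the formula as (Ar − As)/Ar × 100. The function name and the absorbance readings are hypothetical and are not taken from the study.

```python
def relative_siderophore_level(a_ref: float, a_sample: float) -> float:
    """Relative siderophore level in percent: (Ar - As) / Ar * 100.

    a_ref    -- absorbance at 630 nm of CAS solution + medium + shuttle (Ar)
    a_sample -- absorbance at 630 nm of CAS solution + supernatant + shuttle (As)
    """
    if a_ref <= 0:
        raise ValueError("reference absorbance must be positive")
    return (a_ref - a_sample) / a_ref * 100.0


if __name__ == "__main__":
    # Hypothetical readings, for illustration only.
    level = relative_siderophore_level(a_ref=0.80, a_sample=0.60)
    print(f"Relative level of siderophores: {level:.2f}%")
```

A lower sample absorbance (more decolorization of the CAS dye) gives a higher relative level.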

#### *2.3. Antagonistic Effect of Fluorescent Pseudomonad Isolates (Dual Culture Method)*

Antagonistic effects of all fluorescent pseudomonad isolates were tested against *Alternaria* spp., *F*. *culmorum*, *F*. *oxysporum* isolate A, *F*. *oxysporum* isolate B, *F*. *solani* and *R*. *solani* using the dual culture method according to Toure et al., 2004 [29]. Each antagonistic isolate was streaked as a line with a loopful of a 2-day-old culture on potato dextrose agar plates and incubated for 48 h prior to inoculation with the test fungus. A mycelial disc (5 mm in diameter) from an actively growing culture of the test fungus was placed in the center at a standard distance close to the other edge of the Petri plate and incubated at 30 °C for 3–7 days. Inhibition zones (the distance between the edge of antagonistic bacterial growth and the edge of fungal growth) were measured. All experiments were carried out in three replicates for each fungus.

#### *2.4. Molecular Identification of Bacterial Isolates*

Total DNA was extracted according to Istock et al., 2001 [30] from fluorescent *Pseudomonas* isolates F2, F7 and F8, which produced higher levels of siderophores than the other *Pseudomonas* isolates obtained in the current investigation. The full-length 16S rRNA gene was amplified according to Matar et al., 2009 [31] using two universal primers, Start (forward) 5′-AGAGTTTGATCMTGGCTCAG-3′ and End (reverse) 5′-TACGGYACCTTGTTACGACTT-3′. The amplified 16S rRNA gene of the *Pseudomonas* isolates was purified and sequenced by the enzymatic chain terminator technique using a Big Dye terminator sequencing kit. The nucleotide sequences were then aligned with pseudomonad 16S rRNA gene sequences obtained from the GenBank database (http://www.ncbi.nlm.nih.gov). The phylogenetic tree was constructed with the UPGMA method using MEGA software version 5 with 2000 bootstrap replications.

#### *2.5. Fermentation Experiments*

#### 2.5.1. Bioreactor

Fermentation experiments were carried out in a 10-L bench-top bioreactor (Cleaver, Saratoga, CA, USA) [26]. Temperature was maintained at 25 °C while the pH was adjusted to 7 by adding 2 N HCl and 2 N NaOH using feeding pumps. Aeration was performed using sterilized air supplied at 0.5 VVM (air volume per broth volume per minute). Agitation speed ranged between 200 and 600 rpm to keep the dissolved oxygen (DO) above 30%. DO percentage and pH values were recorded automatically using the online module.

#### 2.5.2. Batch Fermentation

*P*. *aeruginosa* F2 and *P*. *fluorescens* JY3 (GenBank accession numbers MG210480 and KF922490), which showed the highest relative levels of siderophores, were cultivated in the bioreactor using 5 L of optimized medium for siderophore production. For *P*. *aeruginosa* F2, the medium contained 20 mL glycerol; 14.5 g glucose; 1 g glutamic acid; 4 g sodium succinate; 5 g asparagine; 0.1 g urea; 1 g (NH4)2SO4; 1 g KH2PO4; 3.5 g K2HPO4; 1 g MgSO4; 0.5 μM FeCl3 and dH2O up to 1000 mL, pH 7 [27]. *P*. *fluorescens* JY3 was cultivated on optimized medium containing 10 mL glycerol; 1 g glucose; 0.5 g glutamic acid; 3.14 g sodium succinate; 1 g asparagine; 0.1 g urea; 0.1 g (NH4)2SO4; 6 g KH2PO4; 4 g K2HPO4; 0.1 g MgSO4; 0.62 μM FeCl3 and dH2O up to 1000 mL, pH 7 [32]. Cultivation of *P*. *aeruginosa* F2 and *P*. *fluorescens* JY3 in the bioreactor started at an optical density (OD600nm) of 0.3, inoculated from a seed culture of each strain in LB broth. Samples were taken from each culture at time intervals for the determination of the relative level of siderophores, biomass, glucose and glycerol.

#### 2.5.3. Fed-batch Fermentation

Fed-batch fermentation was initiated with a batch phase using 5 L of optimized medium as previously described. At the end of the exponential phase, feeding was started by adding the feeding medium. Each litre of feeding medium contained a 10-fold amount of each component of the optimized medium except KH2PO4 and K2HPO4, which were added at the original weights given for the optimized media, without the 10-fold increase. The feeding strategy for each strain used an exponential programmed feed rate that started at 1.5 mL min<sup>−1</sup> and finished at 22.5 mL min<sup>−1</sup> [15].
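An exponential feed profile between the two stated rates can be sketched as follows. This is a minimal illustration, not the authors' feed program: the source gives only the initial and final rates, so the profile form F(t) = F0·exp(k·t) and the 20 h feeding duration (`duration_h`) are assumptions made here.

```python
import math


def exponential_feed_rate(t_h, f_start=1.5, f_end=22.5, duration_h=20.0):
    """Feed rate in mL/min at t_h hours after the start of feeding.

    f_start and f_end are the rates quoted in the text; duration_h is a
    hypothetical feeding duration, because the source does not state one.
    """
    if not 0.0 <= t_h <= duration_h:
        raise ValueError("t_h must lie within the feeding window")
    # Rate constant chosen so that the profile reaches f_end at duration_h.
    k = math.log(f_end / f_start) / duration_h
    return f_start * math.exp(k * t_h)


if __name__ == "__main__":
    for t in (0, 5, 10, 15, 20):
        print(f"{t:>2} h: {exponential_feed_rate(t):6.2f} mL/min")
```

In a model-based design the rate constant would normally be tied to a target specific growth rate rather than to a chosen end time; the end-point form is used here only because the two rates are the only values the text provides.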

#### *2.6. Analytical Procedures*

#### 2.6.1. Relative Level of Siderophores

The relative level of siderophores was determined by the CAS assay method according to Schwyn and Neilands (1987), as described above [27].

#### 2.6.2. Biomass Estimation

Dry cell weight was estimated according to Van Dam-Mieras et al., 1992 [33].

#### 2.6.3. Glucose Estimation

Glucose concentration was estimated using an enzymatic colorimetric kit (Diamond Diagnostics, Egypt).

#### 2.6.4. Glycerol Estimation

Glycerol concentration was estimated by the method developed by Bok and Demain (1977) [34]. One mL of supernatant containing glycerol was added to 1 mL of 15 mM sodium metaperiodate in 0.12 M HCl in a test tube and incubated at room temperature for 10 min. After oxidation by periodate, 2 mL of 0.1% (w/v) L-rhamnose solution was added to the test tube to remove the excess periodate ions. After mixing, 4 mL of Nash reagent containing ammonium acetate (150 g L<sup>−1</sup>), acetic acid (0.2%, v/v) and acetylacetone (0.2%, v/v) was added. The color of the mixture was allowed to develop for 15 min in a water bath at 53 °C. After cooling, the optical density of the mixture was measured at 412 nm using a spectrophotometer and converted into glycerol concentration (g L<sup>−1</sup>) according to a calibration curve (ranging from 0 to 40 mg L<sup>−1</sup>).
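The conversion from optical density to glycerol concentration can be sketched as a linear calibration. The sketch below assumes a straight-line calibration over 0 to 40 mg L−1; the standard readings and the `dilution_factor` parameter are hypothetical, since the source does not report the calibration data or how samples were diluted into the calibration range.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def glycerol_g_per_l(absorbance, slope, intercept, dilution_factor=1.0):
    """Convert an OD412 reading to g/L via the calibration line.

    The line maps mg/L to absorbance, so it is inverted here; the result
    is scaled by the (assumed) dilution factor and converted to g/L.
    """
    mg_per_l = (absorbance - intercept) / slope
    return mg_per_l * dilution_factor / 1000.0


if __name__ == "__main__":
    # Hypothetical standards (mg/L) and OD412 readings, for illustration only.
    standards = [0.0, 10.0, 20.0, 30.0, 40.0]
    readings = [0.01, 0.21, 0.41, 0.61, 0.81]
    slope, intercept = fit_line(standards, readings)
    print(glycerol_g_per_l(0.41, slope, intercept, dilution_factor=500.0))
```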

#### *2.7. Application of Siderophores-Producing Pseudomonas Isolates as Biocontrol Agents*

#### 2.7.1. Formulation Experiment

Culture broth of *P. aeruginosa* F2 and *P. fluorescens* JY3 obtained from fed-batch fermentation, containing 2.0 × 10<sup>9</sup> cfu mL<sup>−1</sup>, was formulated using talc powder (TP) as a carrier with other additives, including glycerol or glucose as a carbon source and carboxymethylcellulose (CMC) as an adhesive. Calcium carbonate was also added to the mixture to adjust the pH to 7 [35].

#### 2.7.2. Fungal Inoculum Preparation

Inocula of *F*. *oxysporum* and *R. solani* were prepared using sorghum/coarse sand/water (2:1:2 v/v/m) medium. All components were combined, packed and sterilized for 2 h. Agar discs obtained from the edge of a 5-day-old culture of each investigated fungus were inoculated into the sterilized medium. After two weeks of incubation at 30 °C, the fungal inocula were ready for soil infestation [36].

#### 2.7.3. Soil Infestation

Inocula of *F*. *oxysporum* and *R. solani* were added separately to the soil surface of each pot at a rate of 2% w/w and then covered with a thin layer of autoclaved soil. The infested pots were irrigated and kept for 7 days before sowing.

#### 2.7.4. Disease Assessment

Disease was assessed as the percentage of damping-off (pre- and post-emergence) 7 and 21 days after sowing, respectively, using the following formulae:

% Pre-emergence = No. of non-emerged seedlings/No. of sown seeds × 100

% Post-emergence = No. of dead emerged seedlings/No. of sown seeds × 100

% Damping-off = % Pre-emergence + % Post-emergence
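The three formulae can be combined in a short Python sketch. The seed count of 15 follows from the greenhouse design (five seeds per pot, three pots per treatment); the counts of non-emerged and dead seedlings are hypothetical, chosen only to illustrate the arithmetic.

```python
def damping_off(sown, non_emerged, dead_after_emergence):
    """Return (pre-emergence %, post-emergence %, total damping-off %)."""
    if sown <= 0:
        raise ValueError("number of sown seeds must be positive")
    pre = 100 * non_emerged / sown
    post = 100 * dead_after_emergence / sown
    return pre, post, pre + post


if __name__ == "__main__":
    # Hypothetical counts for one treatment with 15 sown seeds.
    pre, post, total = damping_off(sown=15, non_emerged=3, dead_after_emergence=2)
    print(f"pre = {pre:.2f}%, post = {post:.2f}%, damping-off = {total:.2f}%")
```

With 15 seeds per treatment, each affected seedling contributes 6.67 percentage points, which is why values such as 6.67%, 20%, 33.3% and 53.3% are the kind of steps this design can produce.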

#### 2.7.5. Greenhouse Experiment

A pot experiment was performed to study the effect of the formulated *P. aeruginosa* F2 and *P. fluorescens* JY3 on the biocontrol of *F*. *oxysporum* isolate A and *R. solani*. Pots (18 × 18 cm diameter) contained sterilized soil infested as previously described. Nine treatments were performed as follows: (1) *F*. *oxysporum*; (2) *R. solani*; (3) untreated and uninfected control (healthy); (4) *P. aeruginosa* F2 formulation; (5) *P. fluorescens* JY3 formulation; (6) F2 formulation + *F*. *oxysporum*; (7) JY3 formulation + *F*. *oxysporum*; (8) F2 formulation + *R. solani* and (9) JY3 formulation + *R. solani*. Five wheat seeds were sown per pot, and three replicates (pots) were used for each treatment. Wheat seeds were treated with the formulations as a seed drench at a dose of 10 g kg<sup>−1</sup> of seeds. Formulations were applied twice as a soil drench at a dose of 3 kg acre<sup>−1</sup>, 15 and 30 days after sowing.

#### *2.8. Statistical Analysis*

Analysis of variance (ANOVA) was used to analyze the results achieved in this study using CoStat software. The least significant difference (LSD) at P ≤ 0.05 level of probability was utilized for detecting significant differences among treatments.

#### **3. Results**

#### *3.1. Isolation of Fluorescent Pseudomonas Isolates*

Sixteen fluorescent *Pseudomonas* isolates were isolated using King's B medium (Table 1). The green fluorescent colonies were picked up under UV light, and further purified by repeated streaking (single colony) on the same medium, and examined under light microscope.

#### *3.2. Screening for Siderophores Production*

All bacterial isolates were screened qualitatively and quantitatively for siderophore production. As a first step, qualitative screening was performed using CAS agar plates. Isolates that formed a yellow to orange zone around the pure colony were considered positive for siderophore production. In addition, in the agar well diffusion method, when cell-free supernatant of each culture was applied to the wells of CAS agar plates, a yellow to orange halo was observed around the wells, indicating the production of siderophores (Figure 1A). To confirm these results, siderophores produced by all bacterial isolates were measured quantitatively in liquid cultures. *P*. *fluorescens* JY3 and *P*. *aeruginosa* F2 showed the highest production of siderophores (1.758% and 1.749%, respectively). *P*. *aeruginosa* F7 and F8, which were isolated in the current work, also gave a high percentage of siderophore production, whereas *P. fluorescens* JY8 showed the lowest production of siderophores (0.011%) in a fixed 10 μL volume using succinic medium (Figure 1B).


**Table 2.** Antagonistic effects of fluorescent *Pseudomonas* isolates against growth of some phytopathogenic fungi using the dual culture method.

\* Means in each column followed by the identical letter do not differ significantly at P ≤ 0.05 level; \*\* Significant letters.

**Figure 1.** (**A**) Qualitative assay of siderophore production on chrome azurol S (CAS) agar plates using supernatant of some fluorescent *Pseudomonas* isolates; (**B**) relative level of siderophores produced by fluorescent *Pseudomonas* isolates grown on succinic medium using the CAS assay; and (**C**) antagonistic effect of *Pseudomonas aeruginosa* F2 and *Pseudomonas fluorescens* JY3 against *Fusarium solani*. The plate on the left in each photo is the corresponding control without the antagonist (data are presented in Table 2).

#### *3.3. Antagonistic Effect of Fluorescent Pseudomonad Isolates (Dual Culture Method)*

This experiment investigated the antagonistic effect of the experimental *Pseudomonas* isolates on the growth of the tested plant pathogenic fungi using the dual culture technique. Isolate F2 was the most efficient isolate in inhibiting the mycelial growth of all tested pathogens. On the other hand, isolates F14, F15, F16, JY7, JY8 and JY13 showed weak antagonistic activity (Table 2). All tested *Pseudomonas* isolates showed antagonistic activity against *Alternaria* sp. except isolates F14, F15, F16 and JY13. Isolate F5, followed by isolates F2, F9 and F10, which did not differ significantly from each other, were the most effective isolates in inhibiting the mycelial growth of *Alternaria* sp. Meanwhile, F2 had more antagonistic activity than the other fluorescent *Pseudomonas* isolates against *F*. *culmorum*. All *Pseudomonas* isolates showed antagonistic activity against *F. oxysporum* isolate A except F16, JY7, JY8 and JY13; isolate F1 had the strongest antagonistic effect on *F. oxysporum* isolate A, followed by F7. Isolates F1 and F2, followed by F4 and F5 and then by F7 and F8, were the most effective isolates in inhibiting the mycelial growth of *F. oxysporum* isolate B, whereas isolate F2 had more antagonistic activity than the other fluorescent *Pseudomonas* isolates against *F*. *solani* and *R*. *solani* (Figure 1C).

#### *3.4. Molecular Identification of Bacterial Isolates*

*Pseudomonas* isolates F2, F7 and F8, which produced higher levels of siderophores than the other *Pseudomonas* isolates obtained in the current study, were identified by sequencing of the 16S rRNA gene. A BLAST search at the National Center for Biotechnology Information site (http://www.ncbi.nlm.nih.gov) was used to identify the bacterial isolates. BLAST results revealed that the sequences of isolates F2, F7 and F8 were highly similar to several *P. aeruginosa* strains, with a homology percentage of 99%. *Pseudomonas* isolates F2, F7 and F8 were identified as *P. aeruginosa* with accession numbers MG210480, MG076939 and MG210481, respectively (Table 1). A phylogenetic tree of the 16S rRNA gene, generated using the nucleotide sequences of the fluorescent *Pseudomonas* isolates F2, F7 and F8 (obtained in the current investigation) and other pseudomonad 16S rRNA gene sequences (obtained from the GenBank database), revealed two major clusters. Cluster 1 included *Cellvibrio ostraviensis*, while all fluorescent *Pseudomonas* isolates obtained in this study and those retrieved from the GenBank database fell in cluster 2. This cluster is divided into two groups: the first group included *P. aeruginosa* F2, F7 and F8 with the other *P. aeruginosa* strains from GenBank, while the second group contained *P. fluorescens*, *P*. *putida*, *P*. *chlororaphis*, *P*. *syringae* and *P*. *meliae* (Figure 2).

**Figure 2.** Phylogenetic tree of fluorescent *Pseudomonas* isolates F2, F7 and F8 obtained in the current study and validly described members of the genus *Pseudomonas* based on the nucleotide sequences of the 16S rRNA gene. The phylogenetic tree was constructed with the UPGMA method using MEGA version 5 with 2000 bootstrap replications.

#### *3.5. Batch Fermentation*

Batch fermentation in the bioreactor was used in this study to enhance siderophore production from *P*. *aeruginosa* F2 and *P*. *fluorescens* JY3 compared to cultivation in shake flasks, which produced a low relative level of siderophores in our previous study. Batch fermentation was performed in a 10-L bench-top bioreactor (Cleaver, Saratoga, CA, USA) using *P*. *aeruginosa* F2 and *P*. *fluorescens* JY3, both of which produced higher levels of siderophores than the other *Pseudomonas* isolates screened in the current work. Figure 3A shows the relative level of siderophores, cell biomass, and glucose and glycerol concentrations of the culture broth of *P*. *aeruginosa* F2 plotted against time. Cell biomass of *P*. *aeruginosa* F2 increased slowly during the lag phase, which lasted 5 h. The culture then entered the exponential (log) phase, in which the biomass increased rapidly with a constant specific growth rate, μ, of 0.043 h<sup>−1</sup>. The dissolved oxygen decreased as a result of the increasing demand for O2 required for bacterial growth (Figure 3B). To ensure a sufficient oxygen supply, dissolved oxygen was maintained above 30% by increasing the agitation rate. After 41 h, the dissolved oxygen increased gradually until the end of the run. The maximum biomass achieved was 10.3 g L<sup>−1</sup> at 38 h. The relative level of siderophores reached 1.91% at 12 h and its maximum value of 31.9% at 40 h. Glycerol concentration decreased slowly and glycerol was not fully consumed, reaching 4.5 g L<sup>−1</sup> at 50 h, whereas glucose concentration decreased rapidly and glucose was fully consumed at 26 h. The relative level of siderophores, biomass and glycerol concentration in the culture broth of *P*. *fluorescens* JY3 against time are shown in Figure 4A; the relative level of siderophores was 1.97% at 12 h and reached its maximum value of 23.9% at 48 h. Moreover, the maximum biomass achieved was 5.2 g L<sup>−1</sup> at 24 h, and glycerol concentration decreased gradually to reach 0.25 g L<sup>−1</sup> at 52 h. *P*. *fluorescens* JY3 cell biomass increased exponentially with a specific growth rate, μ, of 0.096 h<sup>−1</sup>. The dissolved oxygen decreased rapidly and reached 30% in the first hour, so the motor speed, and accordingly the agitation rate, was increased until 6 h; consequently, dissolved oxygen increased rapidly (Figure 4B).
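The reported specific growth rates can be put in context with the standard relations for exponential growth, μ = ln(X2/X1)/(t2 − t1) and doubling time td = ln 2/μ. The sketch below is illustrative: the doubling times are derived here from the reported μ values and are not stated in the source, and the function names are hypothetical.

```python
import math


def specific_growth_rate(x1, x2, t1_h, t2_h):
    """Specific growth rate (1/h) from two biomass values in the exponential phase."""
    if x1 <= 0 or x2 <= 0 or t2_h <= t1_h:
        raise ValueError("need positive biomass values and t2 > t1")
    return math.log(x2 / x1) / (t2_h - t1_h)


def doubling_time(mu_per_h):
    """Doubling time (h) for exponential growth at rate mu."""
    if mu_per_h <= 0:
        raise ValueError("mu must be positive")
    return math.log(2) / mu_per_h


if __name__ == "__main__":
    # mu values as reported for the batch runs.
    for strain, mu in (("P. aeruginosa F2", 0.043), ("P. fluorescens JY3", 0.096)):
        print(f"{strain}: mu = {mu} 1/h, doubling time = {doubling_time(mu):.1f} h")
```

Under these relations, the reported rates correspond to doubling times of roughly 16 h for F2 and 7 h for JY3.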

**Figure 3.** (**A**) Relative level of siderophores, biomass, glucose concentration and glycerol concentration as a function of time for batch fermentation culture broth of *Pseudomonas aeruginosa* F2 and (**B**) dissolved oxygen, agitation and aeration as a function of time during batch fermentation of *Pseudomonas aeruginosa* F2.

**Figure 4.** (**A**) Relative level of siderophores, biomass and glycerol concentration as a function of time for batch fermentation culture broth of *Pseudomonas fluorescens* JY3 and (**B**) dissolved oxygen, agitation and aeration as a function of time during batch fermentation of *Pseudomonas fluorescens* JY3.

#### *3.6. Fed-Batch Fermentation*

Siderophore production from *P*. *aeruginosa* F2 and *P*. *fluorescens* JY3 was maximized using the fed-batch technique. Fed-batch fermentation was carried out in a 10-L bench-top bioreactor (Cleaver, Saratoga, CA, USA) using *P*. *aeruginosa* F2 and *P*. *fluorescens* JY3, which produced the highest levels of siderophores among the *Pseudomonas* isolates screened in this study. The fed-batch process was started with a batch phase and completed with a fed-batch phase using an exponential feed rate. The fed-batch stage was initiated by adding the feeding medium at a preprogrammed exponential feed rate. Feeding started at an initial rate of 1.5 mL min<sup>−1</sup> and increased to a final rate of 22.5 mL min<sup>−1</sup>. Figure 5A illustrates the relative level of siderophores, biomass and glucose/glycerol concentrations of the culture broth of *P*. *aeruginosa* F2 as a function of time. During the batch step, the culture grew exponentially with a constant specific growth rate. Dissolved oxygen decreased quickly during the exponential phase due to the increasing O2 demand of the growing cell mass. Dissolved oxygen was maintained above 30% by controlling the motor speed and consequently the agitation rate (Figure 5B).

**Figure 5.** (**A**) Relative level of siderophores, biomass, glucose concentration and glycerol concentration as a function of time for fed-batch fermentation culture broth of *Pseudomonas aeruginosa* F2 and (**B**) dissolved oxygen, agitation and aeration as a function of time during fed-batch fermentation of *Pseudomonas aeruginosa* F2.

The relative level of siderophores reached 1.8% at 18 h, increased rapidly to 37.8% at 40 h, and reached its maximum value of 67.05% at 46 h. The maximum biomass was 15.8 g L<sup>−1</sup> at 39 h. Glucose concentration decreased rapidly and glucose was fully consumed at 24 h. Glycerol concentration decreased slowly and glycerol was not fully consumed at 24 h, when its concentration was 10.36 g L<sup>−1</sup>. After feeding, the glycerol concentration increased for a short period and then decreased rapidly to near zero at the end of the run.

In the case of *P*. *fluorescens* JY3, cell biomass increased exponentially with a specific growth rate. The relative level of siderophores, biomass and glycerol concentration against time are shown in Figure 6A. The relative level of siderophores reached 1.92% at 16 h and increased rapidly to 19.24% at 36 h, achieving its maximum value of 45.59% at 48 h. The maximum biomass achieved was 13.7 g L<sup>−1</sup> at 44 h. Glycerol concentration decreased gradually and reached 2.89 g L<sup>−1</sup> at 24 h. After feeding, glycerol concentration increased gradually to reach 4.75 g L<sup>−1</sup> at 42 h and decreased again to 2.85 g L<sup>−1</sup> at the end of the process. The dissolved oxygen decreased rapidly, so it was maintained at 30% by increasing the motor speed to the desired agitation rate. The rise in agitation rate increased the dissolved oxygen concentration in the medium until 8 h; after this point, dissolved oxygen increased gradually. Once feeding started, the dissolved oxygen decreased rapidly, and consequently the agitation speed was increased gradually (Figure 6B). Fed-batch fermentation produced the highest relative level of siderophores of all methods, so cultures of *P. aeruginosa* F2 and *P*. *fluorescens* JY3 obtained by fed-batch fermentation were used to prepare talc formulations for use as biocontrol agents.

**Figure 6.** (**A**) Relative level of siderophores, biomass and glycerol concentration as a function of time for fed-batch fermentation culture broth of *Pseudomonas fluorescens* JY3 and (**B**) dissolved oxygen, agitation and aeration as a function of time during fed-batch fermentation of *Pseudomonas fluorescens* JY3.

#### *3.7. In-Vivo Antagonistic Effect of P. aeruginosa F2 and P. fluorescens JY3 against Some Phytopathogenic Fungi*

Formulations of siderophore-producing *Pseudomonas* isolates using talc powder as a carrier were applied as biocontrol agents in soil infested with *F*. *oxysporum* and *R. solani*. Wheat seeds were treated with the formulations and sown seven days after soil infestation. Damping-off percentage and reduction percentage were evaluated. In addition, fresh and dry weights of shoots and roots were measured. Formulations of *P. aeruginosa* F2 and *P*. *fluorescens* JY3, which produced higher levels of siderophores than the other *Pseudomonas* isolates used in this work, were effective in reducing damping-off caused by *F. oxysporum* and *R*. *solani* compared to the control infested with plant pathogens. Damping-off percentage caused by *F. oxysporum* and *R*. *solani* reached 33.3% and 53.3%, respectively. Treatments using formulations of F2 and JY3 in soil infected with *F. oxysporum* reduced the damping-off percentage to 6.67%, a reduction of 80%. The formulation of *P. aeruginosa* F2 in soil infected with *R*. *solani* was more effective in reducing damping-off than the formulation of *P*. *fluorescens* JY3: the damping-off percentages were 6.67% and 20%, with reductions of 87.49% and 62.5%, respectively (Table 3). Both formulations stimulated the growth of wheat plants in soil infected with *F*. *oxysporum* and *R*. *solani* compared to the control treatments inoculated with each fungus alone. Fresh and dry weights of shoots and roots increased significantly in treatments with formulations of *P. aeruginosa* F2 and *P*. *fluorescens* JY3 in soil infected with plant pathogenic fungi. Treatment using the F2 formulation in soil infected with *F*. *oxysporum* increased the fresh weights of shoots and roots of wheat plants to 3.77 g and 3.3 g, with increases of 45.09% and 49.39%, respectively.
In the case of treatment using the JY3 formulation in soil infected with *F*. *oxysporum*, the fresh weights of shoots and roots reached 3.33 g and 2.93 g, with increases of 37.84% and 43%, respectively. In the treatment with *F*. *oxysporum* alone, fresh weights of shoots and roots were 2.07 g and 1.67 g, respectively. In soil infected with *R*. *solani*, fresh weights of shoots were 3.53 g and 3.27 g, with increases of 72.52% and 70.34%, in treatments using the F2 and JY3 formulations, respectively, compared to 0.97 g for the fungus alone. The fresh weights of roots reached 3.1 g and 2.83 g, with increases of 80.65% and 78.8%, using the same formulations, compared to 0.6 g for the fungus alone (Table 4). The dry weights of shoots and roots obtained in this study showed the efficiency of treatments using formulations of *P. aeruginosa* F2 and *P*. *fluorescens* JY3 in increasing dry weights in soil infected with *F*. *oxysporum* and *R*. *solani*. Shoot dry weights reached 2.83 g, 2.47 g, 2.63 g and 2.37 g, with increases of 63.6%, 58.3%, 87.45% and 86.08%, respectively, while the dry weights of shoots in treatments with *F*. *oxysporum* and *R*. *solani* alone were 1.03 g and 0.33 g, respectively. The dry weights of roots were 2.3 g, 2.03 g, 2.2 g and 1.83 g, with increases of 55.22%, 49.26%, 91.82% and 90.16%, respectively, whereas treatments with *F*. *oxysporum* and *R*. *solani* alone recorded 1.03 g and 0.18 g, respectively (Table 4).
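The source does not state how the increase percentages were calculated. The reported values are consistent with expressing the gain relative to the treated weight, (treated − fungus alone)/treated × 100, and the sketch below checks this assumed form against weights quoted in the text. The function name is hypothetical.

```python
def increase_percent(treated_g, fungus_alone_g):
    """Weight gain as a percentage of the treated weight (assumed formula)."""
    if treated_g <= 0:
        raise ValueError("treated weight must be positive")
    return (treated_g - fungus_alone_g) / treated_g * 100.0


if __name__ == "__main__":
    # (label, treated weight in g, fungus-alone weight in g) as quoted in the text.
    cases = [
        ("F2 + F. oxysporum, shoot fresh weight", 3.77, 2.07),
        ("F2 + F. oxysporum, root fresh weight", 3.30, 1.67),
        ("F2 + R. solani, shoot fresh weight", 3.53, 0.97),
    ]
    for label, treated, control in cases:
        print(f"{label}: {increase_percent(treated, control):.2f}%")
```

Note that this convention differs from the more common one of dividing by the control weight, which would give larger percentages for the same data.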


**Table 3.** Effect of culture formulations of *Pseudomonas aeruginosa* F2 and *Pseudomonas fluorescens* JY3 on damping-off of wheat plants caused by *Fusarium oxysporum* and *Rhizoctonia solani*.

<sup>W</sup> Reduction % (in case of *F*. *oxysporum*) = (check − treatment)/check × 100; <sup>X</sup> Reduction % (in case of *R*. *solani*) = (check − treatment)/check × 100; \* Means in each column followed by the identical letter do not differ significantly at P ≤ 0.05 level; \*\* Significant letters; "n.d." not determined.
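The reduction formula in the footnote can be checked against the damping-off percentages quoted in Section 3.7. This is a minimal sketch with a hypothetical function name; "check" is taken to be the treatment infested with the fungus alone.

```python
def reduction_percent(check, treatment):
    """Reduction % = (check - treatment) / check * 100."""
    if check <= 0:
        raise ValueError("check value must be positive")
    return (check - treatment) / check * 100.0


if __name__ == "__main__":
    # Damping-off percentages quoted in the text: (label, check, treatment).
    cases = [
        ("F. oxysporum, F2 or JY3 formulation", 33.3, 6.67),
        ("R. solani, F2 formulation", 53.3, 6.67),
        ("R. solani, JY3 formulation", 53.3, 20.0),
    ]
    for label, check, treatment in cases:
        print(f"{label}: {reduction_percent(check, treatment):.2f}%")
```

The computed values agree with the reported reductions (80%, 87.49% and 62.5%) to within rounding.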


*Processes* **2020** , *8*, 455



#### **4. Discussion**

Siderophores are produced by several microorganisms such as *Pseudomonas*, *Enterobacter* and *Escherichia coli* [5,8,37,38]. In our study, twenty siderophore-producing *Pseudomonas* isolates were evaluated as biocontrol agents for their in-vitro antagonistic effect against the growth of some plant pathogenic fungi (*Alternaria* spp., *F*. *culmorum*, *F*. *oxysporum* isolate A, *F*. *oxysporum* isolate B, *F*. *solani* and *R*. *solani*). *P*. *aeruginosa* isolate F2 inhibited the mycelial growth of all the tested fungi significantly more effectively than the other *Pseudomonas* isolates. These results are in agreement with those of Pàez et al., 2005, who reported that *P*. *aeruginosa* was more efficient than *P*. *fluorescens* in inhibiting some plant pathogenic fungi [39]. Similarly, Sasirekha and Srividya (2016) reported that siderophore-producing *P*. *aeruginosa* FP6 had antagonistic activity against *R*. *solani* [40], and Islam et al., 2018 showed that *P*. *aeruginosa* RKA5 inhibited mycelial growth of *F*. *oxysporum* f. sp. *cucumerinum* [41]. On the other hand, Ahmedzedah et al., 2006 reported that 13 out of 47 fluorescent *Pseudomonas* isolates showed antagonistic effects against *Fusarium* sp. and *R*. *solani* [42].

The optimal performance of the upstream processing using on-line and off-line sensors in the two basic types of fermentation (batch and fed-batch) was studied. Batch and fed-batch fermentation processes were carried out to enhance the cell biomass and to achieve high concentrations of siderophores. In batch fermentation of *P*. *aeruginosa* F2 and *P*. *fluorescens* JY3, cells grew rapidly after the lag phase, which continued for 5 h. Biomass subsequently increased exponentially with constant specific growth rates of 0.043 h<sup>−1</sup> and 0.096 h<sup>−1</sup>, respectively, within the exponential phase. Batch fermentation of isolates F2 and JY3 in the bioreactor produced high concentrations of biomass and siderophores. For isolate F2, maximum values of 10.3 g L<sup>−1</sup> at 38 h and 31.9% at 40 h, respectively, were reached, while values of 5.2 g L<sup>−1</sup> at 42 h and 23.9% at 48 h, respectively, were recorded for isolate JY3. In comparison, the maximum biomass of *P*. *aeruginosa* PAO1 and *P*. *fluorescens* NCIM5096 was 1.6 and 1.96 g L<sup>−1</sup>, respectively [8,43]. The high biomass and siderophore concentrations in the bioreactor may be related to the optimal conditions of pH, agitation and aeration that the bioreactor provides for cell growth and siderophore production. However, batch fermentation cannot achieve high cell density and high product concentration because cells suffer from substrate inhibition and catabolite repression. Accumulation of by-products such as acetic acid and propionic acid during batch fermentation has been documented [9,44,45]. High partial pressure of CO2, high concentration of carbon source and high specific growth rate might be responsible for the accumulation of by-products that suppress bacterial growth and product formation [46].
So, to obtain high cell density and high concentrations of siderophores, fed-batch fermentation is favored to reduce by-products production and to get rid of substrate suppression. Exponential fed-batch fermentation of *P*. *aeruginosa* F2 and *P. fluorescens* JY3 gave higher cell mass and siderophores concentration than batch fermentation. Biomass and siderophore estimation of *P*. *aeruginosa* F2 reached its maximum value of 15.8 g L−<sup>1</sup> at 39 h and 67.05% at 46 h, respectively while they reached 13.7 g L−<sup>1</sup> at 44 h and 45.59% at 48 h by *P*. *fluorescens* JY3, respectively. Sarma et al., 2010, obtained high cell density from fluorescent pseudomonad R81 using fed-batch fermentation [9].
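
The growth rates and maxima quoted above can be put side by side with a little arithmetic. The sketch below is illustrative only: it takes the reported specific growth rates and maximum values from the text, converts each growth rate into a doubling time with the standard relation t_d = ln 2 / μ, and computes the fed-batch to batch ratio for biomass and siderophores. The dictionary layout and function names are assumptions made for this example, not part of the original study.

```python
import math

# Values quoted in the discussion above.
mu_per_h = {"F2": 0.043, "JY3": 0.096}           # specific growth rate (1/h)
batch = {                                         # batch maxima
    "F2": {"biomass_g_per_L": 10.3, "siderophore_pct": 31.9},
    "JY3": {"biomass_g_per_L": 5.2, "siderophore_pct": 23.9},
}
fed_batch = {                                     # exponential fed-batch maxima
    "F2": {"biomass_g_per_L": 15.8, "siderophore_pct": 67.05},
    "JY3": {"biomass_g_per_L": 13.7, "siderophore_pct": 45.59},
}


def doubling_time_h(mu):
    """Doubling time for exponential growth: t_d = ln(2) / mu."""
    return math.log(2) / mu


summary = {}
for isolate in mu_per_h:
    summary[isolate] = {
        "doubling_time_h": round(doubling_time_h(mu_per_h[isolate]), 2),
        # Ratio of fed-batch maximum to batch maximum for each quantity.
        "biomass_fold": round(
            fed_batch[isolate]["biomass_g_per_L"] / batch[isolate]["biomass_g_per_L"], 2
        ),
        "siderophore_fold": round(
            fed_batch[isolate]["siderophore_pct"] / batch[isolate]["siderophore_pct"], 2
        ),
    }

for isolate, row in summary.items():
    print(isolate, row)
```

On these figures, fed-batch operation raised biomass by roughly 1.5-fold for F2 and 2.6-fold for JY3, and siderophore levels by about 2-fold for both isolates.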

*P*. *aeruginosa* is classified as a risk group 2 biological agent, which rarely represents a serious health threat to humans [47–49]. In addition, the fact that this organism was isolated from the rhizosphere of edible plants from agricultural fields, as a member of the natural microflora found in the native soil, gave us confidence to continue working with this microorganism. Therefore, the current study used *P*. *aeruginosa* as an effective biocontrol agent. Formulations of siderophores-producing *Pseudomonas* isolates, which were used as biocontrol agents in a soil infected with *F*. *oxysporum* and *R*. *solani*, were efficient in decreasing damping-off. The biocontrol mechanism of siderophores-producing fluorescent pseudomonads could be explained by their ability to chelate ferric ions and thereby reduce the amount available in the rhizosphere (competition for iron nutrition). By this means, siderophores-producing fluorescent pseudomonads may restrict plant pathogens in the rhizosphere and reduce their ability to colonize plant roots. In addition, siderophores-producing fluorescent pseudomonads can induce plant systemic resistance, which reduces pathogen infection.

Other mechanisms, in addition to siderophores production by fluorescent pseudomonads, have been reported that may help in suppressing and controlling fungal pathogens, such as the production of antifungal compounds and lytic enzymes. Solans et al., 2016 studied the role of siderophores as biocontrol agents [50], whereas Arya et al., 2018 revealed that inoculating tomato seedlings with the siderophore-producing *P*. *fluorescens* strains SPs9 and SPs20 in soil infected with *F*. *oxysporum* controlled wilt disease with high efficiency [5]. Leeman et al., 1996 reported that siderophore-producing *P*. *fluorescens* induced systemic resistance against Fusarium wilt of radish [51]. Formulations of siderophores-producing *P*. *aeruginosa* F2 and *P*. *fluorescens* JY3 were effective in increasing fresh and dry weights of shoots and roots of wheat plants. These results may be due to the ability of both isolates to induce the plant to produce phytohormones such as auxin, cytokinin, and gibberellins. They may also be related to the capability of the plant to recognize the microbial Fe-siderophore complex and consequently take up iron from this complex (iron nutrition). In agreement with the current findings, Sharma et al., 2003 showed increased iron, chlorophyll a, and chlorophyll b concentrations in *Vigna radiata* plants inoculated with siderophore-producing *Pseudomonas* strain GRP3 [52]. Additionally, our results were in agreement with Sayyed et al., 2005 and Manwar et al., 2000, who documented that inoculation with *P. fluorescens* enhanced seed germination and shoot and root length of wheat [8,53]. Arya et al., 2018, reported that siderophores-producing *P*. *fluorescens* strains SPs9 and SPs20 were effective in increasing fresh and dry weight of tomato plants in a soil infected with *F*. *oxysporum* [5]. The same results were obtained by inoculating *Triticum aestivum* with fluorescent pseudomonad R62 [54].

#### **5. Conclusions**

In this study, the scale-up of siderophore production from fluorescent pseudomonads was accomplished using fermentation technology. Exponential fed-batch fermentation of *P*. *aeruginosa* F2 and *P*. *fluorescens* JY3 gave higher concentrations of siderophores and biomass than batch fermentation. Formulations of siderophores-producing fluorescent pseudomonads were effective in controlling soil-borne fungi and in stimulating plant growth. These formulations can therefore be utilized as plant growth promoters and biocontrol agents.

**Author Contributions:** G.A.A.-Z. planned and designed the research; A.S.A. performed the experiments; G.A.A.-Z. and A.S.A. wrote the manuscript; N.A.-M.S., E.E.E.-S., S.M.M. and S.A.-F.S. revised the manuscript. All authors have read and agreed to the published version of the manuscript.

**Funding:** This research received no external funding.

**Acknowledgments:** The authors would like to thank the City of Scientific Research and Technological Applications for housing and facilitating this work. The authors also thank Kenawy, A.M. of the City of Scientific Research and Technological Applications, Alexandria, Egypt for critical reading and revision.

**Conflicts of Interest:** The authors declare no conflict of interest.

#### **References**


© 2020 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).

## *Article* **Isolation, Identification and Antimicrobial Evaluation of Bactericides Secreting** *Bacillus subtilis* **Natto as a Biocontrol Agent**

**Jing Zhang 1, Muhammad Bilal 1, Shuai Liu 1, Jiaheng Zhang 1, Hedong Lu 1, Hongzhen Luo 1, Chuping Luo 1, Hao Shi 1, Hafiz M. N. Iqbal <sup>2</sup> and Yuping Zhao 1,\***


Received: 3 January 2020; Accepted: 20 February 2020; Published: 25 February 2020

**Abstract:** Herein, a bactericide-secreting *Bacillus* strain, potentially useful as a biocontrol agent, was isolated from the commercial Yanjing Natto food. Following the biochemical and physiological evaluation, the molecular identification was performed using 16S rDNA sequencing of polymerase chain reaction-amplified DNA that confirmed the natto isolate as *Bacillus subtilis* natto (*B. subtilis* natto). The biocontrol (microbial inhibitory) capability of *B. subtilis* natto was investigated against *Staphylococcus aureus*, *Escherichia coli*, *Salmonella typhimurium*, and yeast (*Yarrowia lipolytica*) and recorded. The antimicrobial activity of *B. subtilis* natto was further enhanced by optimizing the growth medium for optimal bactericides secretion. Under optimized conditions, *B. subtilis* natto exhibited much higher inhibitory activity against *S. aureus* with a zone of inhibition diameter up to 27 mm. After 48 h incubation, the optimally yielded *B. subtilis* natto broth was used to extract and purify the responsible bactericides by silica gel column chromatography, gel column chromatography, and semi-preparative high-performance liquid chromatography. Structural identification of purified bactericides (designated as NT-5, NT-6, and NT-7) from *B. subtilis* natto was performed by 13C-nuclear magnetic resonance (NMR) and mass spectral analyses. The NMR comparison also revealed that NT-5, NT-6, and NT-7 had identical structures, except for the fatty chain. In summary, the present study suggests the improved biocontrol and/or microbial inhibitory potential of newly isolated bactericides secreting *B. subtilis* natto.

**Keywords:** biocontrol agent; *Bacillus subtilis* natto; isolation; molecular identification; medium optimization; antimicrobial activity; bactericides; spectral analyses

#### **1. Introduction**

Natto, a traditional food with a history of two thousand years, has various health functions, such as prevention of osteoporosis and procoagulant, antimicrobial, anti-aging, anti-cancer, and gastrointestinal activities [1–3]. *Bacillus subtilis* natto (*B. subtilis* natto) is an aerobic Gram-positive probiotic that possesses strong heat and acid-base stability [4]. This edible strain is claimed to be the organism primarily responsible for natto fermentation. It has a variety of functions, including promoting the absorption of iron, calcium, and vitamin D by generating intestinal acidification [5]. *B. subtilis* natto metabolites also have antihypertensive, anti-tumor, anti-oxidation, thrombolytic and other functions. Among all these properties, potent antibacterial activity is one of the most important functions of *B. subtilis* natto and might be due to the production of bacitracin, polymyxin, 2,6-pyridine dicarboxylic acid, and other antibiotics. It inhibits pathogens such as *Salmonella typhimurium* and dysentery bacteria. Along with a broad antimicrobial spectrum, *B. subtilis* natto-derived antimicrobial substances are safe for humans as compared to other drugs [3,6,7].

A pronounced antibacterial activity against *Helicobacter pylori* has been found in *B. subtilis* natto cells. Minimum inhibitory concentration (MIC) tests confirmed that *B. subtilis* natto possessed anti-platelet aggregation and anti-*H. pylori* activity due to the presence of dipicolinic acid [8]. Based on morphological studies, a bacterial isolate BH072 from a honey sample exhibited a broad-spectrum inhibitory activity against a range of molds including *Aspergillus niger*, *Botrytis cinerea*, and *Pythium* [9]. A lipopeptide biosurfactant was purified from *B. natto* TK-1 by acidic precipitation, methanol extraction, and thin-layer chromatography (TLC), and its in vitro anti-adhesion activity was investigated. The results indicated that the lipopeptide had noteworthy antimicrobial properties and significantly inhibited the adhesion of *Escherichia coli*, *Staphylococcus aureus*, and *S. typhimurium* [3]. *Vibrio parahaemolyticus* is an emerging foodborne pathogen in seafood products. Effective measures to prevent *V. parahaemolyticus* in seafood would help minimize the probability of foodborne illness and large outbreaks of *Vibrio*. *B. subtilis* NT-6 isolated from natto secretes a novel antibacterial peptide, AMPNT-6, that is a potent inhibitor of *V. parahaemolyticus* (inhibition zone up to 15.5 mm in diameter). These results indicated that AMPNT-6 might be employed as a natural inhibitor of *V. parahaemolyticus* on shrimp substrates [10].

As people's health requirements increase, food safety standards have become more stringent, and considerable attention has been focused on food preservatives. Low-cost, natural, broad-spectrum food preservatives are an emerging trend around the world. Compared with chemical preservatives, food preservatives produced using traditional fermented foods are newer, safer and more effective. Antibacterial substances in natto not only offer high safety and a broad antibacterial spectrum but are also rich in nutrition and functionality [11,12]. Nevertheless, there are few reports on the antibacterial activity of *B. subtilis* natto. Moreover, the main components and structure of the antibacterial substances are still unclear, and the antibacterial ability is weak. This study aims to isolate a bactericide-secreting *B. subtilis* natto strain and to purify its bactericides. The biocontrol capability of *B. subtilis* natto was also evaluated against *E. coli*, *S. aureus*, *S. typhimurium*, and yeast (*Y. lipolytica*). Moreover, advanced instrumental techniques were used to elucidate the structural features of bactericides extracted from *B. subtilis* natto. Research on the extraction and purification of these antibacterial compounds promotes their application in medicines and preservatives, and they are expected to be developed into new antibacterial drugs.

#### **2. Materials and Methods**

#### *2.1. Materials, Chemicals, and Reagents*

The natto food was obtained from Beijing Yanjing Zhongwu Material Technology Co., Ltd. (Beijing, China). Ethylenediaminetetraacetic acid (EDTA), glucose, cornflour, sucrose, soluble starch, soy flour, NaCl, tryptone, beef extract, ammonium nitrate, ZnSO4, FeSO4, and K2SO4 were purchased from Sinopharm Chemical Reagent Co., Ltd. (Shanghai, China). Shanghai Aladdin Biochemical Technology Co., Ltd. (Shanghai, China) provided *n*-hexane, ethyl acetate, methylene dichloride, and methanol. All aqueous solutions were prepared with deionized water. All other chemicals and reagents used in this study were of the highest quality and were used as received without any further processing unless otherwise stated.

#### *2.2. Microbial Strains*

*Escherichia coli*, *S. aureus*, *S. typhimurium*, and *Y. lipolytica* were procured from Shanghai Luwei Microbial Science and Technology Co. Ltd. (Shanghai, China). *B. subtilis* natto strain was isolated from natto food. All collected and isolated microbial cultures were maintained on nutrient agar slopes at 37 °C for 24 h. After the stipulated period of 24 h incubation, the viable cultures were preserved at 4 °C and sub-cultured every two weeks to maintain viability.

#### *2.3. Isolation and Screening of B. subtilis Natto*

As-received natto was washed twice with warm water to eliminate dust particles. One gram of natto was soaked in 10 mL sterile saline solution and heated in a water bath at 80 °C for 1.0 h. After removal of the solid matter, the natto mixture was allowed to cool to room temperature (25 ± 3 °C) and used as a bacterial suspension. The bacterial suspension was then serially diluted from 10<sup>−1</sup> to 10<sup>−9</sup> to reduce the concentration of microorganisms and spread on a sterile agar plate containing 20 mL of solidified screening medium. The freshly inoculated plates were incubated at 37 °C for 24 h. The resulting bacterial colonies were further purified by streaking on the screening medium, Gram-stained, and observed under an optical microscope using the oil-immersion lens. A single bacterial colony was then subjected to a series of physiological and biochemical identification tests. Following Bergey's Manual of Determinative Bacteriology, catalase test, starch hydrolysis, salt tolerance (i.e., 2%, 5%, 7%, and 10%), gelatin liquefaction, acetyl methyl, Voges-Proskauer (VP), D-glucose and D-mannitol fermentation, and nitrate reduction were performed.
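
The ten-fold dilution series described above is what allows colony counts on a plate to be related back to the original suspension. The following sketch is a generic illustration of that back-calculation; the colony count and plated volume are hypothetical placeholders, since the text reports neither.

```python
def dilution_series(first_exp=1, last_exp=9):
    """Ten-fold dilution factors 10^-1 ... 10^-9, as used in the text."""
    return [10.0 ** (-n) for n in range(first_exp, last_exp + 1)]


def cfu_per_ml(colonies, dilution_factor, plated_volume_ml):
    """Back-calculate CFU/mL of the undiluted suspension.

    CFU/mL = colonies / (dilution factor x volume plated in mL)
    """
    return colonies / (dilution_factor * plated_volume_ml)


dilutions = dilution_series()

# Hypothetical example: 42 colonies counted on the 10^-6 plate,
# with 0.1 mL spread per plate (both numbers are assumptions).
example = cfu_per_ml(colonies=42, dilution_factor=dilutions[5], plated_volume_ml=0.1)
print(f"{example:.2e} CFU/mL")
```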

#### *2.4. Molecular Identification: DNA Extraction and PCR Amplification*

A bacterial genomic DNA extraction kit (*EasyTaq*® DNA Polymerase, Transgen Biotech, Beijing, China) was used for molecular identification purposes. Briefly, an overnight-grown *B. subtilis* natto culture was centrifuged at 4000× *g* for 10 min. The extracted DNA was used as a template for polymerase chain reaction (PCR) to amplify a segment of about 500 or 1500 bp of the 16S rRNA gene using the universal primers 27F (5′-AGAGTTTGTCCTGGCTCAG-3′) and 1492R (5′-TACGGCTACCTTGTTACGACTT-3′). The PCR amplification system (50 μL) contained 1 μL genomic DNA, 1 μL each of upstream and downstream primers, 1 μL EasyTaq DNA Polymerase, 4 μL High pure dNTPs (EasyPure Genomic DNA Kit, Transgen Biotech), 5 μL 10× EasyTaq Buffer, and ddH2O. The amplification was performed in a thermal cycler (Thermo Fisher Scientific, Waltham, MA, USA) by DNA denaturation at 94 °C for 3 min followed by 30 cycles of 30 s at 94 °C, 30 s at 55 °C (Tm), and 30 s at 72 °C, and a final extension of 5–10 min at 72 °C.
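
The reaction recipe above does not state the water volume explicitly, but it follows from the listed components. The sketch below works it out, together with the final buffer concentration and the programmed hold time of the cycling protocol. It assumes that ddH2O simply makes up the remainder of the 50 μL reaction and ignores instrument ramp times; the variable names are chosen for this example only.

```python
TOTAL_UL = 50.0

# Component volumes (uL) as listed in the text.
components_ul = {
    "genomic DNA": 1.0,
    "upstream primer (27F)": 1.0,
    "downstream primer (1492R)": 1.0,
    "EasyTaq DNA polymerase": 1.0,
    "dNTPs": 4.0,
    "10x EasyTaq buffer": 5.0,
}

# Assumption: ddH2O fills the reaction up to the stated 50 uL.
water_ul = TOTAL_UL - sum(components_ul.values())

# Final buffer strength: 10x stock diluted 5 uL into 50 uL.
buffer_final_x = 10.0 * components_ul["10x EasyTaq buffer"] / TOTAL_UL

# Programmed hold time (min), excluding ramping between temperatures.
initial_denaturation_min = 3.0
cycles = 30
seconds_per_cycle = 30 + 30 + 30          # denature, anneal, extend
cycling_min = cycles * seconds_per_cycle / 60.0
final_extension_min = (5.0, 10.0)         # range given in the text
hold_time_min = tuple(
    initial_denaturation_min + cycling_min + ext for ext in final_extension_min
)

print(f"ddH2O: {water_ul} uL, buffer: {buffer_final_x}x, hold time: {hold_time_min} min")
```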

#### *2.5. 16S rDNA Sequencing and Data Analysis*

Samples were prepared and sequenced by the Shanghai Bioengineering Co., Ltd. China accredited laboratory. The target gel band was purified using the TaKaRa MiniBEST Purification Kit. The sequences obtained with the 1492R primer were aligned and compared with other partial 16S rDNA sequences in the GenBank database (http://www.ncbi.nlm.nih.gov/BLAST). The obtained 16S rDNA sequence was deposited to GenBank for assigning accession numbers by NCBI.

#### *2.6. Microbial Cultural Conditions*

*B. subtilis* natto strain was cultivated in seed culture medium (pH 7.2–7.4) containing peptone 10 g/L, beef extract 5 g/L, NaCl 6 g/L, and glucose 5 g/L at 37 °C for 24 h. The screening medium comprised skim milk powder 50 g/L, NaCl 5 g/L, glucose 10 g/L, and agar 20 g/L. The medium was sterilized in a laboratory-scale autoclave at 115 °C and 15 psi for 30 min, whereas the agar, with added water, was sterilized separately at 121 °C and 0.1 MPa for 15 min. After cooling slightly, the two were mixed uniformly, and the plates were incubated at constant temperature and humidity at 37 °C for 24 h. The fermentation medium comprised peptone 5 g/L, sodium chloride 5 g/L, glucose 20 g/L, KH2PO4 2 g/L, and K2HPO4 4 g/L. This medium was sterilized at 115 °C for 30 min, and the culture was incubated at 37 °C and 150 rpm. Potato dextrose agar (PDA) containing potato 200 g/L, glucose 20 g/L, and agar 15–20 g/L was used for the cultivation of yeast, whereas a medium comprising tryptone 10 g/L, yeast extract 5 g/L, and NaCl 10 g/L was used for *S. typhimurium* cultivation.

#### *2.7. Antimicrobial Activity of B. subtilis Natto Extract*

*B. subtilis* natto has a unique potential to secrete bactericidal compounds of high interest. For this reason, a cell-free extract of freshly grown *B. subtilis* natto was used to test the inhibitory activity towards *E. coli*, *S. aureus*, *S. typhimurium*, and yeast. Briefly, around 100 mL of liquid seed medium was inoculated with 5% of a freshly grown *B. subtilis* natto inoculum suspension. The inoculated seed medium was incubated at 37 °C for up to 48 h under uninterrupted shaking at 160 rpm. After 48 h incubation, the fermented broth was centrifuged at 4000× *g* for 15 min, and the cell-free supernatant was further filtered through a 0.45 μm polysulphonate membrane filter. The resulting filtered cell-free supernatant was considered a bactericide-containing crude extract and tested using the agar well diffusion assay against the selected strains. Briefly, sterilized discs were gently placed onto the center of the agar plate with sterile forceps, and 20 μL of bactericide-containing cell-free supernatant was added to the discs. The inhibition zone diameter (mm) formed around the discs was recorded after incubation of the plates at 37 °C for 24 h [13]. Kanamycin at a concentration of 30 μg/mL and water were used as the positive and negative controls, respectively.

#### *2.8. Minimal Inhibitory Concentration of B. subtilis Natto against Pathogenic Bacteria*

The broth dilution method, in test tubes, was adopted to determine the minimum inhibitory concentration (MIC) against pathogenic bacteria, according to the procedure described earlier [14–16] with some modifications. Briefly, different extract preparations were serially diluted using sterile nutrient broth medium as a diluent to prepare final crude extract concentrations between 0.578 and 2400 μg/mL. All the tubes were inoculated with the bacterial suspension (adjusted to 10<sup>7</sup> CFU/mL broth), mixed, and incubated for 24 h at 37 °C. The lowest concentration of *B. subtilis* natto extract (highest dilution) displaying no noticeable growth of pathogenic bacteria was defined as the MIC. A tube with an equal volume of water was used as a control in parallel to evaluate the influence of the sterile medium on the growth of the test bacteria.
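
The MIC definition above (lowest concentration with no noticeable growth) can be expressed as a small lookup over the tube readouts. The sketch below is a minimal illustration under stated assumptions: the two-fold dilution step and the growth readouts are hypothetical, because the text gives only the concentration range, and the function name `read_mic` is introduced here for the example.

```python
def read_mic(growth_by_conc):
    """Return the MIC from tube readouts.

    growth_by_conc maps extract concentration (ug/mL) to True when visible
    growth was observed. Walking down from the highest concentration, the MIC
    is the last clear tube before the first tube showing growth. Returns None
    if even the highest concentration shows growth.
    """
    mic = None
    for conc in sorted(growth_by_conc, reverse=True):
        if growth_by_conc[conc]:
            break
        mic = conc
    return mic


# Hypothetical two-fold series starting at the upper limit quoted in the text.
concentrations = [2400.0 / 2 ** i for i in range(12)]

# Hypothetical readout: growth appears only below 9.375 ug/mL.
readout = {c: c < 9.375 for c in concentrations}

mic_example = read_mic(readout)
print(f"MIC = {mic_example} ug/mL")
```

Stopping at the first turbid tube, rather than taking the minimum over all clear tubes, guards against an isolated clear tube at a low concentration being misread as the MIC.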

#### *2.9. Optimization of Growth Medium to Induce the Antibacterial Activity of B. subtilis Natto*

The types of carbon source, nitrogen source, and inorganic salts in the fermentation medium also influence the metabolism of the bacteria. Therefore, in this experiment, different carbon sources (2.0% corn flour, glucose, sucrose, and soluble starch), nitrogen sources (0.5% soy flour, tryptone, beef extract, and ammonium nitrate), and inorganic salts (0.5% potassium dihydrogen phosphate, zinc sulfate, ferrous sulfate, and potassium sulfate) were screened and optimized by single-factor optimization. The size of the inhibition zone was measured after 48 h of culture.

#### *2.10. Extraction and Purification of Bactericides from B. subtilis Natto*

According to the results of orthogonal experiments, *B. subtilis* natto was aerobically grown in an Erlenmeyer flask (250-mL capacity) containing sucrose 1%, soy flour 0.8%, potassium sulfate 0.2%, sodium chloride 0.5%, and water 1 L under optimized culture conditions at 37 °C for 36 h on a rotary shaker (120 rpm). After cultivation, the fermentation broth was harvested by centrifugation at 4000× *g* and 4 °C for 20 min. Ethyl acetate was added to the fermentation broth in a ratio of 1:1, and the mixture was thoroughly mixed and allowed to stand in a separating funnel for 2 h. The extract was collected and concentrated using a rotary evaporator. The resulting brown extract (10.1 g) was purified by silica gel vacuum liquid chromatography using a 20 × 8 cm column eluted with gradient solvents of increasing polarity (n-hexane−EtOAc, 9:1, 7:3, 1:1, 3:7; CH2Cl2−MeOH, 15:1, 9:1, 7:3, 0:10; 600 mL each gradient) yielding a total of three fractions. Fractions A-4 and A-5 exhibiting antibacterial activity against *E. coli* and *S. aureus* were chosen for additional separation. Fractions A-4 (280.7 mg) and A-5 were passed through a Sephadex LH-20 column (60 × 3 cm) using methanol as the mobile phase to remove pigments, and then purified by semi-preparative high-performance liquid chromatography (Eurospher C18 column; MeOH−H2O: 0 min, 20%; 2 min, 30%; 7 min, 50%; 10–15 min, 100%), giving NT-5 (15.3 mg), NT-6 (45.5 mg) and NT-7 (51 mg).

#### *2.11. Structural Identification of Bactericides from B. subtilis Natto*

The samples were dissolved in 2% C5D5N or CDCl3. 13C-NMR spectra were obtained at 150 MHz on a Bruker Avance III NMR spectrometer (Bremen, Germany). HRESIMS (High-resolution electrospray ionization mass spectrometry) data were obtained from an ultra-high-resolution quadrupole time-of-flight (UHR-qTOF) Maxis 4G mass spectrometer (Bruker Daltonics, Bremen, Germany).

#### **3. Results and Discussion**

#### *3.1. Isolation and Purification of B. subtilis Natto*

For initial isolate screening, Gram staining was performed on a 24 h *B. subtilis* natto culture incubated at 37 °C under uninterrupted shaking (150 rpm) in LB liquid medium. Based on the characteristic protease secretion of *Bacillus natto*, protease-producing *Bacillus* was isolated using a milk culture medium (Figure 1A). Microscopic observation of the Gram-stained isolate revealed the following characteristics: Gram-positive, rod-like in shape, flagella to form buds (full or sub-elastic), and elliptical with no obvious swelling (Figure 1B). After that, a second screening covering various physiological and biochemical parameters was performed, and the results obtained are listed in Table 1. Based on the characteristics recorded during the first and second screening, the isolate was assumed to be *Bacillus subtilis* natto and was further subjected to molecular identification via agarose gel electrophoresis and 16S rDNA sequencing data analysis.

**Figure 1.** (**A**) Milk screening medium at 37 °C for 48 h; (**B**) Microscopic morphology of bacterial isolate cultured in LB liquid medium at 37 °C, 150 rpm for 24 h.


**Table 1.** Physiological and biochemical identification tests to identify *B. subtilis* natto.

#### *3.2. Molecular Identification: 16S Ribosomal DNA Sequence Analysis*

The molecular identification of the bacterial isolate from natto food was carried out based on 16S rDNA sequencing data analysis. The 16S rDNA sequence of the amplified PCR product was found to be about 1500 bp long. Comparative analysis of the 16S rDNA nucleotide sequence from the newly isolated strain with the GenBank database sequences showed 99% similarity with the equivalent sequences from the species *B. subtilis*. According to Bergey's Manual of Determinative Bacteriology and the 16S rRNA results, the bacterium is *B. subtilis* natto.

#### *3.3. Antibacterial Activity of B. subtilis Natto*

The antibacterial activity spectrum of *B. subtilis* natto was evaluated as the extent of growth inhibition of four microbes, i.e., *S. aureus*, *E. coli*, *S. typhimurium*, and yeast, by direct antagonism on agar plates using a two-layer plate method. The results revealed that *B. subtilis* natto has significant inhibitory activity against *S. aureus* and *E. coli* (Figure 2), whereas the antibacterial activity against *S. typhimurium* was very weak, and there was almost no inhibitory activity against yeast (Figure 2). In addition, MIC test results showed that the MIC value of *B. subtilis* natto against *E. coli* was 6.1 μg/mL, and the MIC value against *S. aureus* was 8.9 μg/mL.

**Figure 2.** Inhibition zone of pathogenic bacteria, (**A**) *Staphylococcus aureus*, (**B**) *Escherichia coli*, (**C**) Yeast, (**D**) *Salmonella typhimurium* (**E**) Water, and (**F**) Kanamycin.

#### *3.4. Optimization of Growth Medium to Induce the Antibacterial Activity of B. subtilis Natto*

A classical optimization method (one factor at a time) was used to tune the growth of *B. subtilis* natto for optimum performance. To this end, the influence of various carbon sources, nitrogen sources and inorganic salts on the growth of bactericide-secreting *B. subtilis* natto was evaluated, and the results obtained are shown in Figure 3A–C. Among the different carbon sources, sucrose had the greatest effect on the antibacterial activity of *B. subtilis* natto (inhibition zone, 21.1 mm), followed by soluble starch, D-glucose and corn flour (Figure 3A). Similarly, the addition of ammonium nitrate and potassium dihydrogen phosphate showed a profound inhibitory effect, with inhibition zone diameters of 23.0 mm and 20.1 mm among the nitrogen sources (Figure 3B) and inorganic salts (Figure 3C), respectively. Based on the single-factor screening results, ammonium nitrate (nitrogen source) had the greatest influence on the antimicrobial activity against the different pathogenic strains, followed by sucrose (carbon source) and KH2PO4 (inorganic salt) (Figure 3C). Therefore, these selected factors were further optimized using three levels for each factor in nine experimental runs designed according to Taguchi's orthogonal method (orthogonal layout L9). The three variables and their respective levels (coded as 1, 2, 3) are listed in Table 2. Response results, in terms of the inhibition zone diameter for all these combinations of the three factors, were appraised using the analysis of variance (ANOVA). Scores (response results) are summed for each factor and each level (k1, k2 and k3 for levels 1, 2 and 3, respectively) and also averaged (K1, K2 and K3 for levels 1, 2 and 3, respectively). The orthogonal array experiment shows that the inhibition zone reached 27 mm under the optimized medium combination of experiment 6 (Table 3).
The optimal scheme was AB3C1, indicating that any of the tested sucrose levels (1%, 2%, or 3%) can be used, whereas soy flour at 0.8% and potassium sulfate at 0.2% gave the best antibacterial effect. Notably, the bacteriostatic effect was significantly improved after the optimization of medium components. Most previous reports have shown that *B. natto* has the ability to inhibit *E. coli* and *S. aureus*, but its antibacterial activity was found to be very weak. Through the optimization of the medium, *B. subtilis* natto exhibited a potent inhibitory activity towards *S. aureus* with an inhibition zone diameter of up to 27 mm (Figure 4, Table 3). Earlier studies have mainly focused on the antifungal or antitumor activity of *B. natto*, and there are few reports on inhibition of *S. aureus* and *E. coli*.
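
The per-level sums (k) and averages (K) described above are straightforward to compute once the L9 layout is written down. The sketch below illustrates that bookkeeping only, not the ANOVA itself. It uses the first three columns of the standard L9 orthogonal array and assumes the paper's runs follow that standard ordering. The response values are hypothetical placeholders rather than the data of Table 3, and were chosen merely so that the example reproduces the B3C1 pattern reported in the text.

```python
# First three columns of the standard L9 orthogonal array (levels 1-3).
# Factors: A = carbon source, B = nitrogen source, C = inorganic salt.
L9 = [
    (1, 1, 1), (1, 2, 2), (1, 3, 3),
    (2, 1, 2), (2, 2, 3), (2, 3, 1),
    (3, 1, 3), (3, 2, 1), (3, 3, 2),
]
FACTORS = ("A", "B", "C")

# Hypothetical inhibition zone diameters (mm) for runs 1-9; placeholders only.
responses_mm = [18, 20, 22, 19, 21, 27, 17, 23, 24]


def range_analysis(design, responses, factors):
    """Per-factor level sums (k), level means (K), range (R) and best level."""
    table = {}
    for j, name in enumerate(factors):
        sums = {1: 0.0, 2: 0.0, 3: 0.0}
        counts = {1: 0, 2: 0, 3: 0}
        for row, y in zip(design, responses):
            sums[row[j]] += y
            counts[row[j]] += 1
        means = {lv: sums[lv] / counts[lv] for lv in sums}
        table[name] = {
            "k": sums,
            "K": means,
            "R": max(means.values()) - min(means.values()),
            "best": max(means, key=means.get),
        }
    return table


result = range_analysis(L9, responses_mm, FACTORS)
for name in FACTORS:
    print(name, result[name])
```

A large range R marks a factor whose level matters, while a small R, as for factor A in this placeholder data, is the situation in which any level of that factor can be used.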


**Table 2.** Factors and their respective assigned levels.


**Table 3.** Results of the L9 (3 × 3) orthogonal array experiment.


**Figure 3.** Effects of different (**A**) carbon sources (**B**) nitrogen sources and (**C**) inorganic salts on the antibacterial potential of *B. subtilis* natto.

**Figure 4.** The optimal medium of fermentation broth against the inhibition zone diameter of *Staphylococcus aureus*.

#### *3.5. Purification of Bactericides from B. subtilis Natto*

The fermentation broth obtained from *B. subtilis* natto under optimized cultivation conditions was harvested by centrifugation at 4000× *g* and 4 °C for 15 min. The resultant cell-free supernatant was mixed with ethyl acetate and concentrated on a rotary evaporator to obtain a concentrated extract. The resulting extract was subjected to silica gel vacuum liquid chromatography, which yielded three fractions. Among these, fractions A-4 and A-5, which showed inhibitory activity towards *E. coli* and *S. aureus*, were further separated using a Sephadex LH-20 column to remove pigments and then purified by semi-preparative HPLC into three fractions designated as NT-5, NT-6, and NT-7 (Figure 5). As evident from the HPLC pattern, the fractions NT-5, NT-6 and NT-7 were 100% purified, with retention times of 30.613, 35.583 and 36.067 min, respectively.

**Figure 5.** HPLC patterns; (**A**) NT-5, (**B**) NT-6 and (**C**) NT-7.

#### *3.6. Structural Elucidation by 13C-NMR and Mass Spectral Analyses*

The structures of the purified bactericides designated as NT-5, NT-6 and NT-7 were further confirmed by 13C NMR analysis at 150 MHz. 13C-NMR displayed different peaks for each tested fraction, which are summarized in Table 4. Compound NT-5, eluting at 30.613 min in the HPLC analysis, displayed a protonated molecular ion [M + H]<sup>+</sup> at *m*/*z* 1008.6 (Figure 6A). Compound NT-6, eluting at 35.583 min in the HPLC analysis, showed a protonated molecular ion [M + H]<sup>+</sup> at *m*/*z* 1022.5 (Figure 6B). Compound NT-7, eluting at 36.067 min in the HPLC analysis, revealed a protonated molecular ion [M + H]<sup>+</sup> at *m*/*z* 1036.8 (Figure 6C). The NMR data of NT-5, NT-6, and NT-7 summarized in Table 4 were consistent with the literature [17], and NT-5, NT-6, and NT-7 were verified to have an inhibitory effect on *E. coli* and *Staphylococcus aureus* (Figure 7). The proposed structures of the three active compounds are portrayed in Figure 8.
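
The three [M + H]<sup>+</sup> values reported above are spaced by roughly 14 mass units, which is what would be expected if the compounds differ by one methylene (CH2) unit each, in line with the statement in the abstract that the structures are identical except for the fatty chain. The sketch below checks this arithmetic. The 0.5 Da tolerance is an assumption chosen for this example to allow for the one-decimal precision of the reported values.

```python
# Monoisotopic mass of a CH2 unit: 12.000000 (C) + 2 x 1.007825 (H).
CH2_MASS = 12.000000 + 2 * 1.007825

# Protonated molecular ions [M + H]+ (m/z) as reported in the text.
mz = {"NT-5": 1008.6, "NT-6": 1022.5, "NT-7": 1036.8}

# Differences between neighbouring compounds, ordered by m/z.
names = sorted(mz, key=mz.get)
differences = {
    f"{heavier} - {lighter}": round(mz[heavier] - mz[lighter], 1)
    for lighter, heavier in zip(names, names[1:])
}

# Assumed tolerance (Da) for calling a spacing "one CH2 unit".
TOLERANCE_DA = 0.5
consistent_with_ch2 = all(
    abs(delta - CH2_MASS) <= TOLERANCE_DA for delta in differences.values()
)

print(differences)
print("Spacing consistent with one CH2 per step:", consistent_with_ch2)
```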

**Table 4.** 13C-NMR data for NT-5, NT-6, and NT-7 fractions.



**Figure 7.** Inhibition zone of pathogenic bacteria, (**1**) *Staphylococcus aureus*, (**2**) *Escherichia coli.* (**A**) NT-5, (**B**) NT-6 and (**C**) NT-7.

**Figure 8.** Proposed structures of three active compounds.

Antagonism is omnipresent in nature among diverse species, and there has long been interest in putting it to practical use in agricultural defense, disease therapy, and food preservation. Bacterial reproduction is challenging to control and can cause massive food losses. Extensive use of food additives is another serious problem that may harm human health. Using biological antibacterial agents to control bacterial diseases has been widely investigated [18–24]. In contrast to chemical biocides, numerous antibiotics produced by antagonistic strains have the advantage of being decomposable, leaving no detrimental residues. The rapid emergence of resistant pathogenic strains and the identification of adverse chemical moieties in the food chain have rekindled researchers' attention towards green approaches to tackling pathogenic bacteria effectively. Among the alternatives, biological control by means of natural antagonistic microorganisms has been widely studied, and some *Bacillus* strains are effective against various microbial pathogens under the optimal medium composition [3,9,20]. For example, Tabbene et al. [25] found that nutrients such as carbon sources, nitrogen sources, and inorganic salts enhanced the antimicrobial activity of *B. subtilis* B38 against pathogenic *Salmonella enteritidis*, *Listeria monocytogenes*, and methicillin-resistant *Staphylococcus* species. In comparison to the basal medium, the antibacterial activity was improved 2- to 4-fold in the modified culture medium consisting of 0.15% (w/v) ammonium succinate, 1.5% (w/v) lactose, and 0.3 mg/L manganese. The results indicate that the nutrients act as environmental factors, qualitatively and quantitatively influencing the synthesis of antimicrobial compounds by *B. subtilis* B38.

Antimicrobial peptides secreted by *Bacillus* spp. have been demonstrated as potent biocontrol candidates against a plethora of phytopathogens [26,27]. This study isolated and identified a strain that showed significant inhibitory activity against various bacterial strains as well as potential for the synthesis of antimicrobial agents. A novel antibacterial peptide, AMPNT-6, secreted by *B. subtilis* NT-6 showed evident inhibitory activity against *Vibrio parahaemolyticus*, with an inhibition zone diameter and MIC of 15.5 mm and 1.25 mg/mL, respectively. Compared with the control, the significant bactericidal activity reflected in the changes of the *V. parahaemolyticus* growth curve indicates that AMPNT-6 can potentially be used as a natural inhibitor to reduce the likelihood of foodborne outbreaks of *Vibrio* [10]. Wang et al. [28] isolated a *B. subtilis natto* CSUF5 strain and extracted its antifungal metabolites from the medium of the inhibition zone on the dual culture plate. The identified V7-surfactins and I/L7-surfactins exhibited slight antifungal activities against *Aspergillus niger*, and their MIC50 decreased in the order V7 > I/L7. Microscopic analysis revealed that surfactin variants delayed fungal spore germination and thus suppressed hyphal growth, mainly displayed as hyphal shriveling and distortion [28]. Surfactin, a biosurfactant produced by *B. subtilis*, was first observed in 1968 in a culture broth of *B. subtilis* and was initially purified as a fibrin-clotting inhibitor. It is one of the most effective and potent lipopeptide-type biosurfactants produced by different Gram-positive and endospore-producing *B. subtilis* strains [28–30]. It is a cyclic lipopeptide characterized by a 9-hydroxycarbonic acid moiety with profound surface activities as well as antibiotic activity [31]. Cyclic lipopeptides including fengycin, iturin, lichenysin, and surfactin are among the major categories of biosurfactants secreted by *Bacillus* species [32].

Biosurfactants from *Bacillus* strains have been reported to exhibit profound antibacterial and antifungal activities, and the antimicrobial activity increased with increasing biosurfactant concentration. A lipopeptide biosurfactant produced by *B. natto* TK-1 showed strong antimicrobial activity against *Botrytis cinerea*, *Fusarium moniliforme*, *Micrococcus luteus*, and *S. typhimurium*. The biosurfactant also significantly reduced tumor cell viability in a dose-responsive manner [3]. Various lipopeptide biosurfactants secreted by *B. licheniformis* [33,34] and *B. subtilis* [35,36] have also shown antimicrobial activities. For instance, the inhibition of phytopathogenic fungi by surfactin and iturin produced by *Bacillus* strains has been described [34,37,38]. Standard surfactin, which was originally purified from *B. subtilis*, contained a macrolide with the heptapeptide sequence Glu-Leu-Leu-Val-Asp-Leu-Leu and a lipid fraction consisting of a mixture of several hydroxy-fatty acids (chain length of 13–15 carbon atoms) [39]. Taken together, the present results show that the lipopeptide surfactants produced by the *Bacillus* genus have high potential for biopharmaceutical and biotechnological applications owing to their biocontrol activities.

#### **4. Conclusions**

In conclusion, a Gram-positive bacterial strain isolated from commercially available natto and identified as *B. subtilis* natto has considerable inhibitory properties against *S. aureus*, *E. coli*, and *S. typhimurium*. The bactericide secretion potential of *B. subtilis* natto was further optimized via the classical approach under different growth conditions. Through the optimization of the growth medium, its bacteriostatic performance was reaffirmed and found to be highest against *S. aureus*. Previous reports have shown that *B. subtilis* natto can inhibit *E. coli* and *S. aureus*, but that its antibacterial potential is weak. Herein, however, we have overcome this drawback: under optimized conditions, *B. subtilis* natto exhibited much higher inhibitory activity towards *S. aureus*, with a zone of inhibition diameter of up to 27 mm. The fermentation broth of *B. subtilis* natto was purified via silica gel and Sephadex column chromatography to extract the active bactericidal compounds, which were designated NT-5, NT-6, and NT-7. The active compound fractions were further analyzed by HPLC, MS, and <sup>13</sup>C-NMR to elucidate their structural features. Comparison of the NMR data of NT-5, NT-6, and NT-7 indicated that they share the same structure except for the fatty chain. Further studies are ongoing in our lab to scrutinize the exact chemical structure and cellular toxicity of these compounds as potential candidates for biomedical purposes.

**Author Contributions:** Conceptualization, J.Z. (Jing Zhang) and Y.Z.; Data curation, J.Z. (Jing Zhang), H.L. (Hedong Lu), and H.S.; Formal analysis, J.Z. (Jiaheng Zhang) and S.L.; Investigation, H.L. (Hongzhen Luo); Methodology, M.B.; Project administration, Y.Z.; Supervision, Y.Z.; Validation, C.L.; Writing—original draft, M.B., H.M.N.I., and Y.Z.; Writing—review & editing, M.B., H.M.N.I., and Y.Z. All authors have read and agreed to the published version of the manuscript.

**Funding:** This work was financially supported by Young academic leaders in Jiangsu Province, Six talent peaks project in Jiangsu Province (2015-SWYY-026). The authors also acknowledge the support from the Postgraduate Research & Practice Innovation Program of Jiangsu Province (SJCX17\_0700), a study on highly efficient biotransformation of oleic acid and linoleic acid to γ-decalactone in *Yarrowia lipolytica* based on synthetic biology (21606097), Huai'an Agricultural and Social Development Project (HAN201812), and Huai'an Special Fund Project for the Transformation of Scientific and technological achievements (HA201803). We would also like to express sincere appreciation to Professor Peter Proksch (Institute of Pharmaceutical Biology and Biotechnology, Heinrich-Heine-Universität Düsseldorf, 40225 Düsseldorf, Germany) for his assistance during the experiment.

**Conflicts of Interest:** The authors declare no conflict of interest.

### **References**


© 2020 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).

## *Article* **Effect of Nitrate and Perchlorate on Selenate Reduction in a Sequencing Batch Reactor**

**Hyun-Woo Kim <sup>1</sup>, Seong Hwan Hong <sup>2</sup> and Hyeoksun Choi <sup>3,</sup>\***


Received: 12 February 2020; Accepted: 13 March 2020; Published: 16 March 2020

**Abstract:** Selenate removal from water bodies is being vigorously debated owing to its severe health impact, but inhibition by coexisting anions has been reported. To suggest a viable treatment option, this study investigates the effect of nitrate and perchlorate on selenate reduction in a laboratory-scale sequencing batch reactor. The experimental design tests how competing electron acceptors (NO3<sup>−</sup> and ClO4<sup>−</sup>) and electron donor (acetate) limitations affect selenate reduction in the reactor. Results show that the reactor achieves almost complete selenate reduction within the initial concentration range of 0.1–1 mM by enriching selenate-reducing bacteria at an appropriate temperature (30 °C) and acclimation period (50 days). We observed simultaneous selenate and nitrate reduction in the reactor without specific inhibition, owing to a difference in microbial growth strategy related to electron donor status. Because perchlorate-reducing bacteria were lacking, perchlorate addition (0.2 mM) was not closely associated with dissimilatory perchlorate reduction. These results provide information that can help us to understand the effect of competing electron acceptors on selenate reduction and the kinetics of potential parallel reactions in the reactor.

**Keywords:** biological selenate reduction; electron donor competition; nitrate; perchlorate; sequencing batch

#### **1. Introduction**

Selenium (Se) is an essential micronutrient but can cause adverse health effects (e.g. hair loss, fingernail loss, numbness in fingers or toes, and circulatory problems) with long-term and heavy exposure [1,2]. Since Se in water originates from not only geological sources such as weathering of seleniferous soils/rocks but also anthropogenic processes such as mining, fossil fuel combustion, and other industrial activities [3], the World Health Organization has set a provisional total Se guideline of 40 μg/L in drinking water [4]. The United States Environmental Protection Agency permits the maximum concentration limit (MCL) of total Se as 50 μg/L and the regulations of national primary drinking water as 5 μg/L [2]. Likewise, the Korean Ministry of Environment is reducing the MCL to 10 μgSe/L in drinking water [5].

Se has four oxidation states (−II, 0, IV, VI) and forms several organic complexes [6]. In surface water, most Se exists as either selenate (SeO4<sup>2−</sup>) or selenite (SeO3<sup>2−</sup>). Both oxyanions are toxic to living organisms; thus, various treatment technologies have been investigated to remove Se from water [7]. Although physicochemical technologies effectively separate Se from the water supplied for domestic and industrial use, eventual post-treatment of the byproducts is required and technical limitations still exist [8]. Fortunately, biological treatment can reduce selenate and selenite to insoluble elemental Se (Se<sup>0</sup>) via anaerobic microbial metabolism [6,9]. Selenate- or selenite-reducing bacteria have been isolated from a wide variety of environments [10,11].

When selenate or selenite coexists with other anions such as nitrate (NO3<sup>−</sup>), sulfate (SO4<sup>2−</sup>), and perchlorate (ClO4<sup>−</sup>), biological Se reduction can be inhibited by the electron scavenging of denitrifying bacteria, sulfate-reducing bacteria, or perchlorate-reducing bacteria, because most selenate-reducing bacteria are heterotrophic facultative anaerobes that compete for electron donors under anoxic or anaerobic conditions. Another limiting factor might be drastic changes in the selenate concentration of the water body due to irrigated agricultural drainage [11], sedimentary soil erosion [3], surface mining [12], coal-fired power plants [13], and so on.

Most biological selenate reduction studies have targeted either pure cultures or the up-flow anaerobic sludge blanket process [14]. Relatively few reports are available on the simultaneous reduction of selenate in a mixed culture when competing anions exist [14–17]. This study, therefore, investigates the feasibility of simultaneous selenate, nitrate, and perchlorate reduction in a sequencing batch reactor (SBR) and evaluates the inhibitory effects of nitrate and perchlorate on biological selenate reduction.

#### **2. Materials and Methods**

#### *2.1. Selective Enrichment of Selenate-Reducing Bacteria*

To selectively enrich selenate-reducing bacteria, bench-scale SBRs were semi-continuously operated in parallel for more than one and a half months. The seed sludge was activated sludge taken from a local municipal wastewater treatment plant with a treatment capacity of 30,000 m<sup>3</sup>/d in the northern part of I-city, Korea. With selenate as the sole electron acceptor, the reactors were kept under anoxic conditions throughout the enrichment period. To apply selective pressure favoring selenate-reducing bacteria, the temperature was controlled at 30 °C by aquarium heaters following previous literature [18].

#### *2.2. Operating Condition of SBRs*

Figure 1 shows the schematic diagram of the triplicate SBRs. The working volume of each SBR was 5 L. To verify the proper temperature condition (25 °C and 30 °C), SBRs were continuously monitored for more than 200 h until complete selenate reduction in the first batch. All the reactors were then operated on a 24 h sequence at the optimal temperature of 30 °C using the biomass pre-acclimated at that temperature. Each SBR was completely mixed for 23 h, followed by a one-hour settling period, a rapid draw of the upper liquid (2.5 L), and a fill with fresh feed solution. The feed solution contained selenate, acetate (CH3COO<sup>−</sup>), buffer, and essential minerals: 50 mg/L of SeO4<sup>2−</sup>, 200 mg/L of CH3COO<sup>−</sup>, 46 mg/L of (NH4)2SO4, 13.7 mg/L of K2HPO4, 84 mg/L of NaHCO3, 51.3 mg/L of MgSO4·7H2O, 43 mg/L of CaSO4·2H2O, and 2.5 mg/L of FeSO4·7H2O. Other micronutrients were available from the inoculum and endogenous cell decay. Acetate was the sole carbon source (electron donor). To test the effects of nitrate and perchlorate on selenate reduction, we designed the experiments as shown in Table 1.
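As a reading aid, the following minimal sketch converts the base feed concentrations of selenate and acetate from mg/L to mM and derives the volumetric exchange ratio of the draw-and-fill sequence. The molar masses are computed from standard atomic weights, and the names `mg_per_l_to_mM`, `exchange_ratio`, and `hrt_days` are illustrative assumptions rather than quantities reported by the authors.

```python
# Standard atomic weights (g/mol); rounding is an assumption of this sketch.
SE, O, C, H = 78.971, 15.999, 12.011, 1.008

MOLAR_MASS_SELENATE = SE + 4 * O          # SeO4(2-), about 142.97 g/mol
MOLAR_MASS_ACETATE = 2 * C + 3 * H + 2 * O  # CH3COO(-), about 59.04 g/mol


def mg_per_l_to_mM(mg_per_l: float, molar_mass: float) -> float:
    """Convert a mass concentration (mg/L) to a molar concentration (mM)."""
    return mg_per_l / molar_mass


# Feed concentrations stated in the text.
selenate_mM = mg_per_l_to_mM(50.0, MOLAR_MASS_SELENATE)
acetate_mM = mg_per_l_to_mM(200.0, MOLAR_MASS_ACETATE)

# Reactor geometry and cycle length stated in the text.
WORKING_VOLUME_L = 5.0
DRAW_VOLUME_L = 2.5
CYCLE_H = 24.0

# Fraction of the reactor content replaced per cycle.
exchange_ratio = DRAW_VOLUME_L / WORKING_VOLUME_L
# Nominal hydraulic retention time implied by one draw per cycle (derived, not reported).
hrt_days = (WORKING_VOLUME_L / DRAW_VOLUME_L) * (CYCLE_H / 24.0)

print(f"selenate: {selenate_mM:.2f} mM, acetate: {acetate_mM:.2f} mM")
print(f"exchange ratio: {exchange_ratio:.2f}, nominal HRT: {hrt_days:.1f} d")
```

Under these assumptions, 200 mg/L of acetate corresponds to roughly 3.4 mM.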

**Figure 1.** Schematic diagram of triplicate sequencing batch reactors (SBRs).


**Table 1.** Operating conditions of SBRs according to experimental design.

<sup>a</sup> Initial acclimation period.

#### *2.3. Analytical Methods*

Influent and effluent liquid samples were filtered using a 0.2 μm syringe filter (Whatman, GE Healthcare Life Sciences, Marlborough, MA, USA) and kept in a refrigerator at 4 °C before analysis. Selenate was determined using an ion chromatograph (Dionex ICX-1100, Dionex, Sunnyvale, CA, USA) equipped with an IonPac AS15 analytical column and AG15 guard column. The eluent was a 36.5 mM NaOH solution (Daejung Chemicals, Siheung, Korea), and the sample loop volume for selenate determination was 100 μL. For perchlorate determination, we used the same ion chromatograph equipped with an IonPac AS16 analytical column and AG16 guard column (Dionex, Thermo Fisher Scientific, Waltham, MA, USA). In this case, we used a sample loop volume of 1000 μL with a 50 mM NaOH eluent. Nitrate and acetate concentrations were monitored using an IonPac AS9-HC analytical column and AG9-HC guard column with 9 mM Na2CO3 eluent and a 25 μL sample loop. The detection limits for selenate and perchlorate were 5 μg/L each, and those for acetate and nitrate were 0.5 mg/L. All regressions of the experimental data were conducted with Sigmaplot software (Systat Software Inc., San Jose, CA, USA) based on the assumption of first-order removal [19].
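The first-order assumption means that the concentration follows C(t) = C0·exp(−kt), so ln C is linear in time with slope −k. The sketch below estimates k by ordinary least squares on ln C. It is not the authors' Sigmaplot procedure, the function name `fit_first_order` is hypothetical, and the data are synthetic and noise-free, generated only to illustrate the calculation.

```python
import math


def fit_first_order(times_h, concentrations):
    """Estimate (k, C0) for C(t) = C0 * exp(-k * t) by linear regression on ln C.

    times_h: sampling times in hours; concentrations: positive values in any unit.
    Returns k in 1/h and C0 in the unit of the input concentrations.
    """
    logs = [math.log(c) for c in concentrations]
    n = len(times_h)
    t_mean = sum(times_h) / n
    y_mean = sum(logs) / n
    sxx = sum((t - t_mean) ** 2 for t in times_h)
    sxy = sum((t - t_mean) * (y - y_mean) for t, y in zip(times_h, logs))
    slope = sxy / sxx
    intercept = y_mean - slope * t_mean
    return -slope, math.exp(intercept)


# Synthetic illustration only (not measured data): C0 = 0.9 mM, k = 0.96 1/h.
times = [0.0, 1.0, 2.0, 3.0, 4.0]
concs = [0.9 * math.exp(-0.96 * t) for t in times]

k_est, c0_est = fit_first_order(times, concs)
print(f"k = {k_est:.2f} 1/h, C0 = {c0_est:.2f} mM")
```

A log-linear fit weights low concentrations more heavily than a nonlinear fit on the raw data would, so the two approaches can give slightly different rate constants on noisy measurements.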

#### **3. Results and Discussion**

#### *3.1. Appropriate Temperature for Selenate-Reducing Bacteria Acclimation in SBRs*

To increase the activity of selenate-reducing bacteria in the seed sludge, initial acclimation (phase 0) was conducted for about 50 days using two sets of triplicate SBRs. Figure 2a shows the variations of selenate concentrations in the very first batch of the SBRs. During nine days of phase 0, only 27% of the initial selenate (0.72 mM SeO4<sup>2−</sup>) was reduced on average in the SBRs at 25 °C. However, in the SBRs at 30 °C, selenate was reduced to below detection level after nine days. This result indicates that 30 °C, higher than room temperature, is more appropriate for the growth of selenate-reducing bacteria, which is consistent with previous literature [14,20,21]. Based on this finding, all the SBRs were operated at 30 °C for the rest of phase 0 to enrich selenate-reducing bacteria for further experiments.

At the end of phase 0, monitoring results indicate that the SBRs could reduce selenate (0.9 mM) to below detection level in less than four hours. This enhancement indicates that phase 0 successfully acclimated the selenate-reducing bacteria, so that selenate reduction started immediately after the fill sequence without a lag period. Figure 2b demonstrates that the enriched microorganisms actively reduce selenate to Se<sup>0</sup> biologically in the last batch of phase 0, consistent with the literature [6,14,22,23]. Regression indicates that the observed selenate reduction rate was as rapid as 0.96 h<sup>−1</sup>.

**Figure 2.** Dynamics of selenate concentration in SBRs of phase 0: (**a**) before acclimation, (**b**) after acclimation.

#### *3.2. Effect of Nitrate on Selenate Reduction*

In phases 1 and 2, this study tests the effect of the most probable electron-competing anion, nitrate, on selenate reduction (Table 1). We set up low (0.1 mM, phase 1) and high (1 mM, phase 2) selenate conditions for better interpretation. Figure 3 illustrates the dynamics of the average (n = 3) selenate and nitrate concentrations over a whole sequence of the SBRs at steady state. When 3.8 mM CH3COO<sup>−</sup> was added to 0.1 mM SeO4<sup>2−</sup> (phase 1) as an excess electron donor in the presence of 0.96 mM NO3<sup>−</sup>-N (an influent SeO4<sup>2−</sup>:NO3<sup>−</sup> mole ratio of approximately 1:10), selenate and nitrate were simultaneously reduced to below detection level within six hours in the SBRs (Figure 3a). In the case of phase 2, nitrate was completely reduced to below detection level, whereas a small amount of selenate was detected (0.02 mM, 98% reduction) after six hours in the SBRs (Figure 3b). Close to the end of the sequence, the selenate concentration decreased to below detection level.

Within the SeO4<sup>2−</sup>:NO3<sup>−</sup> ratios between 1:1 and 1:10 tested in this study, both selenate and nitrate could be simultaneously reduced without significant inhibition. The selenate reduction rate was maintained at 0.55–0.57 h<sup>−1</sup> regardless of initial concentration. This result indicates that selective enrichment and long acclimation (>30 days) could make selenate-reducing bacteria endure competitive inhibition, as described previously [24]. In addition, the denitrification rate was not related to the selenate concentration and remained almost constant at 0.88 h<sup>−1</sup>, which supports simultaneous selenate and nitrate reduction under excess electron donor conditions.

**Figure 3.** Dynamics of SeO4<sup>2−</sup> and NO3<sup>−</sup> in SBRs: (**a**) SeO4<sup>2−</sup>:NO3<sup>−</sup> = 1:10 (phase 1) and (**b**) SeO4<sup>2−</sup>:NO3<sup>−</sup> = 1:1 (phase 2).

#### *3.3. Effect of External Carbon Limitation on Selenate Reduction*

Two sets of experiments were performed to investigate the effect of carbon source limitation on simultaneous selenate and nitrate reduction in the SBRs under low (0.1 mM, phase 3) and high (1 mM, phase 4) selenate conditions. Acetate concentration was limited to 0.8 mM for phase 3, when the initial SeO4<sup>2−</sup> concentration was 0.1 mM. Keeping the nitrate concentration at 1.0 mM decreased the C:N ratio from 6.7:1 to 1.2:1 compared to phase 1. Phase 4 was conducted with 1 mM SeO4<sup>2−</sup>, reducing the C:N ratio from 11.1:1 (phase 2) to 2.3:1 (phase 4). Phase 3 and phase 4 were directly comparable to phase 1 and phase 2, respectively. Figure 4a,b show the variations of selenate, nitrate, and acetate concentrations in the SBRs at phases 3 and 4, respectively, as described in Table 1.

Figure 4a (phase 3) shows that all the selenate was reduced within two hours, but the accompanying nitrate reduction decelerated significantly when the acetate was depleted at around 3 h. Figure 4b (phase 4) illustrates that nitrate reduction similarly stopped when the acetate was depleted, but selenate reduction gradually progressed further despite the depletion of the external carbon source. This result indicates that denitrifying bacteria are more sensitive to electron donor availability than selenate-reducing bacteria. The increase of acetate concentration from 0.7 mM to 1.2 mM enhanced the nitrate reduction rate by about 80% (from 0.79 h<sup>−1</sup> to 1.42 h<sup>−1</sup>) at phase 4, but nitrate reduction ceased abruptly once the carbon source was depleted. The selenate reduction rate also decreased by 27.4% (from 0.95 h<sup>−1</sup> to 0.69 h<sup>−1</sup>), possibly owing to inhibition associated with carbon source competition.
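The relative changes quoted above follow directly from the reported rate constants. The short check below reproduces them; the helper name `percent_change` is an illustrative assumption.

```python
def percent_change(old: float, new: float) -> float:
    """Relative change from old to new, in percent (positive means an increase)."""
    return (new - old) / old * 100.0


# Rate constants (1/h) as reported in the text.
nitrate_change = percent_change(0.79, 1.42)   # nitrate reduction rate
selenate_change = percent_change(0.95, 0.69)  # selenate reduction rate

print(f"nitrate: {nitrate_change:+.1f}%, selenate: {selenate_change:+.1f}%")
```

The nitrate value comes out just under 80% and the selenate value at about −27.4%, matching the figures in the text.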

**Figure 4.** Dynamics of SeO4<sup>2−</sup>, NO3<sup>−</sup>, and CH3COO<sup>−</sup> under carbon limitation condition: (**a**) phase 3, (**b**) phase 4.

When selenate and nitrate coexist, selenate-reducing bacteria might be able to compete successfully for limited carbon resources like K-strategist microorganisms [25], while nitrate-reducing bacteria favor rapid reproduction like r-strategist microorganisms [26] in this study. This result suggests that selenate-reducing bacteria have a competitive advantage over nitrate-reducing bacteria in withstanding harsh carbon-limiting conditions.

#### *3.4. Nitrate and Perchlorate Effect on Selenate Reduction in SBRs*

To investigate the effect of another oxyanion, perchlorate, on the simultaneous selenate reduction, SBRs were operated with a feed solution containing selenate (0.1 mM), nitrate (1.0 mM), and perchlorate (0.15 mM) with an excess amount of external carbon source (3.4 mM).

Figure 5 demonstrates that selenate and nitrate reduction were not significantly affected by perchlorate. It was observed that 38% of perchlorate (reduction rate of 0.02 h<sup>−1</sup>) can be reduced together with selenate and nitrate in the SBRs during the 24 h sequence. This result indicates that dissimilatory perchlorate-reducing bacteria can grow together with selenate- and nitrate-reducing bacteria under anaerobic conditions if the carbon source (electron donor) is not limiting [27]. In this study, an insufficient population of perchlorate-reducing bacteria might have prevented perchlorate from being a competitive inhibitor of selenate or nitrate reduction under excess electron donor conditions. Owing to perchlorate, the reduction rate of nitrate was significantly reduced from 0.9–1.4 h<sup>−1</sup> (excess electron donor condition) to 0.5 h<sup>−1</sup> at phase 5. However, that of selenate did not decline but was maintained at around 0.7–1.3 h<sup>−1</sup>, which evidences the ability of selenate-reducing bacteria to endure harmful perchlorate as well as electron donor competition without significant inhibition. This result also means that selenate-reducing bacteria can be dominantly enriched from activated sludge within a reasonable period of time if the carbon source is not limiting.
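The 38% perchlorate removal and the rate of 0.02 h<sup>−1</sup> are consistent with each other under the first-order assumption used for the regressions. A minimal sketch of that check, with the hypothetical helper `fraction_removed`:

```python
import math


def fraction_removed(k_per_h: float, hours: float) -> float:
    """Fraction removed after a given time for first-order decay C(t) = C0 * exp(-k t)."""
    return 1.0 - math.exp(-k_per_h * hours)


K_PERCHLORATE = 0.02  # 1/h, as reported in the text
CYCLE_H = 24.0        # length of one SBR sequence

removed = fraction_removed(K_PERCHLORATE, CYCLE_H)
half_life_h = math.log(2) / K_PERCHLORATE  # derived quantity, not reported by the authors

print(f"removed in one cycle: {removed:.1%}, half-life: {half_life_h:.1f} h")
```

With these inputs the removed fraction is about 0.38, in line with the reported 38%, and the implied half-life is longer than one 24 h sequence.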

**Figure 5.** Dynamics of SeO4<sup>2−</sup>, NO3<sup>−</sup>, and ClO4<sup>−</sup> with excess carbon source in SBRs.

#### **4. Conclusions**

This research provides information about how the competing anions nitrate and perchlorate affect selenate reduction in SBRs seeded with activated sludge. Based on the observed data, the following conclusions are drawn:

(1) SBRs can rapidly enrich selenate-reducing bacteria from the activated sludge by using the selective pressure of temperature (30 °C) and a sufficient acclimation period of >40 days.

(2) Complete selenate and nitrate reduction can be accomplished simultaneously in anaerobic SBRs by supplying an excess amount of electron donor. Limitation of the electron donor may decrease the activity of nitrate-reducing bacteria instantaneously, while selenate-reducing bacteria respond slowly, using the limited resources more efficiently.

(3) Coexistence of perchlorate in the feed did not affect selenate reduction significantly owing to the shortage of dissimilatory perchlorate reducing bacteria. However, together with selenate and nitrate, 38% of perchlorate could be reduced without acclimation when electron donor is not limited.

Overall, these results evidence that selenate-reducing bacteria are capable of enduring competitions associated with other oxyanions reduction and electron donor without significant inhibition after appropriate acclimation. This study may contribute to understanding biological Se reduction better in relation to competing anions and electron donor conditions.

**Author Contributions:** H.C. conceived and designed the study. S.H.H. conducted the whole experiment. H.-W.K. and H.C. analyzed the data and wrote the paper. All authors have read and agreed to the published version of the manuscript.

**Funding:** This research was supported by Basic Science Research Program through the National Research Foundation of Korea (NRF) funded by the Ministry of Education (NRF-2016R1D1A1B03933921).

**Conflicts of Interest:** The authors declare no conflict of interest.

#### **References**

1. Stranges, S.; Navas-Acien, A.; Rayman, M.P.; Guallar, E. Selenium status and cardiometabolic health: state of the evidence. *Nutr. Metab. Cardiovasc. Dis.* **2010**, *20*, 754–760. [CrossRef] [PubMed]


© 2020 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).

## *Article* **Effects of Conventional Flotation Frothers on the Population of Mesophilic Microorganisms in Different Cultures**

**Mohammad Jafari <sup>1</sup>, Mehdi Golzadeh <sup>2</sup>, Sied Ziaedin Shafaei <sup>1</sup>, Hadi Abdollahi <sup>1,</sup>\*, Mahdi Gharabaghi <sup>1</sup> and Saeed Chehreh Chelgani <sup>3,</sup>\***


Received: 1 September 2019; Accepted: 23 September 2019; Published: 25 September 2019

**Abstract:** Bioleaching is an environment-friendly and low-investment process for the extraction of metals from flotation concentrate. Surfactants such as collectors and frothers are widely used in the flotation process. These chemical reagents may have inhibitory effects on the activity of microorganisms during a bioleaching process; however, there is no report on the influence of these reagents on the activity of microorganisms in the mixed cultures that are mostly used in industry. In this investigation, the influences of typical flotation frothers (methyl isobutyl carbinol and pine oil) at different concentrations (0.01, 0.10, and 1.00 g/L) on the activities of bacteria in a mesophilic mixed culture (*Acidithiobacillus ferrooxidans, Leptospirillum ferrooxidans,* and *Acidithiobacillus thiooxidans*) were examined. For comparison purposes, experiments were repeated with pure cultures of *Acidithiobacillus ferrooxidans* and *Leptospirillum ferrooxidans* under the same conditions. Results indicated that the dosage of frothers is negatively correlated with bacterial activity, while the mixed culture showed a lower sensitivity to the toxicity of these frothers than the examined pure cultures. Outcomes showed that the toxicity of pine oil is lower than that of methyl isobutyl carbinol (MIBC). These results can be used for designing flotation separation procedures and for producing cleaner products for the bioextraction of metals.

**Keywords:** flotation; bioleaching; frother; mixed culture; machine learning

#### **1. Introduction**

Pyrometallurgy and high-pressure leaching are two typical methods used for the extraction of metals from concentrates of flotation separation [1–3]. These methods have several disadvantages, such as high investment and operation costs, environmental pollution (chemical reagents in the waste waters of hydrometallurgical plants and SO2 gas generation from pyrometallurgical plants), high energy consumption in pyrometallurgical processes, high technology requirements for pyro/hydro-metallurgical processes, and finally the special expertise required of system operators [4,5]. Variations in metal prices have caused very intense competition among highly prestigious mining companies (Anglo American, BHP, Rio Tinto, Glencore, etc.) to revise their feasibility studies, since the feasibility of mining projects depends significantly on project costs. Moreover, the problems of global warming and environmental pollution have led the mineral processing industry to focus on low-cost, low-energy, and environmentally friendly methods [4,6–10]. Thus, several investigations have focused on the operation and optimization of bioleaching processes for the extraction of metals from low-grade deposits (by heap bioleaching [11,12]), waste (by columns [13,14]), and concentrates (by bioleaching tanks [13,15,16]), which have considerably lower costs and environmental effects [11,16–24].

However, on the one hand, few investigations have studied the effects of flotation reagents on the bioleaching of sulfide flotation concentrates [25–32]. On the other hand, those studies (Table 1) mainly focused on cultures consisting of one specific microorganism, while in industry mixed microorganisms are mostly used for the bioleaching process [33]. Using a mixed culture with different microorganisms can lead to cooperative effects, and bioleaching may then show a higher efficiency than with pure cultures [34–40].



*Processes* **2019**, *7*, 653

**Table 1.** [Table body not recoverable from the source; only fragments of rows 8–11 survive. Recoverable column headers: No., Ref. Reagents named: sodium ethyl xanthate, sodium n-propyl xanthate, sodium isobutyl xanthate, potassium amyl xanthate, n-butyl xanthate, sodium (alkyl) dithiocarbamate, sodium di-(alkyl) dithiophosphate, isopropyl thionocarbamate, dithiophosphate (mixture), and 2-mercaptobenzthiazole. Microorganism named: *Leptospirillum*. Process named: iron (ferrous) biooxidation. Reported finding: "All reagents have a negative effect on the biooxidation of iron." References named: [32], [45].]

This study investigated the influences of two typical flotation frothers, pine oil (PO) and methyl isobutyl carbinol (MIBC), on the population (microorganism count) of a traditional mixed culture of mesophilic microorganisms (*Acidithiobacillus ferrooxidans, Leptospirillum ferrooxidans,* and *Acidithiobacillus thiooxidans*). This is because there is a direct relationship between bioleaching rate (recovery of valuable metals from ores) and the population of microorganisms [46]. Three different concentrations of frothers were examined (0.01, 0.10, and 1.00 g/L). For comparison purposes, outcomes were compared with the results obtained under the same conditions for pure cultures of *Acidithiobacillus ferrooxidans* and *Leptospirillum ferrooxidans.* Various parameters were measured in the control tests: pH, ORP (oxidation-reduction potential), total iron (FeT) in the solution, and DO (dissolved oxygen in the media). Mutual information (MI) assisted by Pearson correlation was used to explore the relationship among these measured variables and to select the most important parameters for further assessments. The outputs of this investigation can be used to design ambient conditions in mineral processing plants whose main beneficiation method is flotation separation. This approach helps to produce cleaner products in a leaching tank, which is the most efficient route for processing flotation concentrate because it can process high-grade feeds with a high process recovery [15], to the benefit of the downstream processes and the environment.

#### **2. Materials and Methods**

#### *2.1. Bacterial Strain and Growth Conditions*

Pure strains of *Acidithiobacillus ferrooxidans* (T.f), *Leptospirillum ferrooxidans* (L.f), and *Acidithiobacillus thiooxidans* (T.t), which have different oxidation abilities (Table 2), were obtained from the research and development center of Sarcheshmeh mine, Kerman, Iran. Microorganism strains were cultivated in the environment presented in Table 3. To build the mixed culture, 5 cc of each pure culture was used.


**Table 2.** Oxidation ability of microorganisms.



#### *2.2. Flotation Reagents*

Flotation frothers (MIBC and PO) were prepared in the mineral processing laboratory at the University of Tehran, Iran. A wide range of their concentrations which are common in the various flotation plants (0.01, 0.10, and 1.00 g/L) was used and their influences were explored by different analyses.

#### *2.3. Experimental Procedure*

Twenty-one experiments were conducted: nine tests for each frother (three different dosages and three different cultures) and one control test (without frother) for each culture. For the experiments, microorganisms were cultivated in a 9K medium containing five different mineral salts ((NH4)2SO4: 3 g/L, MgSO4·7H2O: 0.5 g/L, K2HPO4: 0.5 g/L, KCl: 1 g/L, and Ca(NO3)2·H2O: 0.01 g/L). The initial pH of the media was adjusted to 1.8 with H2SO4. As a source of energy, 44.22 g/L FeSO4·7H2O and 10 g/L sulfur were added to the media. Incubation was performed at 34 °C in an incubator shaker with a rotation speed of 140 rpm. The effects of MIBC and PO on the microorganism count and activity in the various cultures were assessed by measuring different parameters (pH, ORP, FeT, and DO) (Table 4). All the cultures were monitored and the mentioned parameters were measured for 21 days. To save time and cost, mutual information and Pearson correlation were used for feature selection (FS). FS indicates the relationship among parameters and can be used to rank them.



#### *2.4. Feature Selection*

Feature (or variable) selection is used to select the variables that most strongly affect specific responses. It helps to reduce the number of variables that have to be measured during a process and thus saves cost and time. Without it, collinearity may lead to measuring several parameters that convey the same information [47–49]. Therefore, FS was applied to the parameters measured in the control tests (pH, ORP, FeT, and DO) to find those most strongly related to the microorganism count (MC). The selected parameters were used as indicative factors for further assessments.

#### 2.4.1. Pearson Correlation

Pearson correlation (r) quantifies the strength and direction of the linear relationship between two variables. It takes negative values (−1 ≤ r < 0) when one variable decreases as the other increases and positive values (0 < r ≤ 1) when both change in the same direction. An "r" close to 0 means that there is no linear relationship [50,51]. Pearson correlation was used to explore linear correlations between the measured parameters (pH, ORP, FeT, and DO) in the control tests through 21 days of monitoring.
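As a minimal sketch of how such a coefficient can be computed, the following Python snippet evaluates "r" for two short series. The function name `pearson_r` and the numbers are illustrative assumptions, not the measured data of this study.

```python
import math

def pearson_r(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)

# Illustrative values only (not measured data): a count rising while pH falls.
mc = [1.0, 2.0, 3.0, 4.0, 5.0]
ph = [1.8, 1.7, 1.6, 1.5, 1.4]
r = pearson_r(mc, ph)  # perfectly linear, opposite direction, so r is close to -1
```

A value near −1, as in this toy case, corresponds to the situation described above in which one variable decreases while the other increases.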

#### 2.4.2. Mutual Information

Mutual information (MI) is a measure that captures both linear and nonlinear dependence between variables. The MI between two variables *x* and *y* is non-negative and is defined as:

$$\text{MI}(x; y) = \sum_{y \in \mathcal{R}} \sum_{x \in \mathcal{S}} p(x, y) \log_2 \frac{p(x, y)}{p(x)\,p(y)},\tag{1}$$

where *p*(*x*) and *p*(*y*) are the marginal probability distributions of *x* and *y*, and *p*(*x*,*y*) is their joint probability distribution.
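Equation (1) can be illustrated with a short Python sketch that estimates the probabilities from relative frequencies of discrete values. The function name `mutual_information` and the toy sequences are assumptions for illustration; continuous measurements such as pH would first have to be binned, and the binning is not specified here.

```python
import math
from collections import Counter

def mutual_information(x, y):
    """MI in bits between two discrete sequences, using relative frequencies."""
    n = len(x)
    joint_counts = Counter(zip(x, y))
    x_counts = Counter(x)
    y_counts = Counter(y)
    mi = 0.0
    for (a, b), count in joint_counts.items():
        p_joint = count / n
        p_a = x_counts[a] / n
        p_b = y_counts[b] / n
        mi += p_joint * math.log2(p_joint / (p_a * p_b))
    return mi

x = [0, 0, 1, 1]
mi_dependent = mutual_information(x, [1, 1, 0, 0])    # y fully determined by x
mi_independent = mutual_information(x, [0, 1, 0, 1])  # y unrelated to x
```

In this toy case the fully dependent pair gives 1 bit and the independent pair gives 0, which matches the interpretation of MI values used for ranking.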

#### **3. Results**

#### *3.1. Control Test*

Exploring the MC in three different cultures in the absence of frothers (Figure 1a) shows that the MC increases during the 21 days of activity, and that the ratio (MC after 21 days)/(initial MC) is higher for the mixed culture than for the T.f and L.f cultures (6.4/0.66 vs. 6.8/0.78 and 3.2/0.76, respectively). These results indicate that the MC grows faster in the mixed culture than in the two other cultures and/or that bacteria may have higher activities in the mixed culture. Figure 1b shows that the ratio (ORP after 21 days)/(initial ORP) is also higher for the mixed culture than for the T.f and L.f cultures (680/335 vs. 652/408 and 675/376, respectively). The mixed culture has the lowest DO and FeT, while the T.f culture has the lowest pH throughout the assessments (Figure 1c–e). These results show correlations among the measured variables (pH, ORP, FeT, and DO). Statistical analyses were used to perform variable importance measurement (VIM) and to select the most representative parameters for further evaluations.

**Figure 1.** Exploring variation of process parameters in three different cultures for control tests during 21 days monitoring. (**a**) Bacterial count; (**b**) ORP; (**c**) DO; (**d**) FeT; (**e**) pH.

#### Feature Selection

Exploring linear relationships by Pearson correlation among the parameters measured in the three cultures over 21 days indicates that pH and FeT have the most negative "r" values with MC (Figure 2), while DO shows an insignificant correlation. In other words, when pH and FeT decrease, MC increases (Figure 2). Moreover, Pearson correlation shows a strong relationship between ORP and FeT. MI was used to explore nonlinear relationships between parameters and to rank them by importance. An MI close to 1 means that there is a strong dependence between X and Y, and a value close to 0 means that there is no relationship. MI can therefore be used to rank independent variables by their importance for a dependent variable (VIM) [52]. In this study, MI was used to rank the parameters measured in the control tests (pH, ORP, FeT, and DO) by the strength of their dependence on the MC value. Using VIM by MI together with Pearson correlation provides a direct basis for deciding whether to include an additional variable in the assessments. The MI results (Figure 3) illustrate that, among all measured parameters, pH and FeT are the most strongly affected by MC. Thus, these two parameters were selected to study the effect of the conventional flotation frothers (MIBC and PO) on the different cultures.

**Figure 2.** Pearson correlations between the measured parameters in the control tests.

**Figure 3.** Ranking effectiveness of the measured parameters on microorganism population by Mutual Information.

#### *3.2. Frothers*

#### 3.2.1. Population of Microorganisms

A comparison of the population ratio of microorganisms ((MC after 21 days)/(initial MC)) in the absence (control tests) and presence of frothers (Table 5) indicates that MIBC and PO reduce the MC during the process. In general, the MC decreases as the frother dosage increases. This decrease was considerably larger at the highest examined dosage (1 g/L) than at the other dosages, and for the L.f and mixed cultures the population was even lower than on the initial day (MC ratio < 1) (Table 5). In the presence of frothers at their different dosages, the MC ratio has the following order: T.f > mixed > L.f. In general, the MC ratio is higher in the presence of PO than in the presence of MIBC.


**Table 5.** The (MC after 21 days)/(initial MC) ratio in various conditions.

#### 3.2.2. Fe Total

Figure 4 shows the negative relationship between FeT and MC for the three cultures in all tests: as the bacterial population increases, the FeT decreases. In general, increasing the frother dosage stops the growth of the MC, so the amount of FeT in the solution remains high throughout the process (Figure 4). Furthermore, these results illustrate that, after 21 days, the amount of FeT in the solution is higher for the T.f culture than for the two other cultures, since the slope of the FeT reduction with MC is moderate for the T.f culture in all experiments. The mixed culture generally shows the highest FeT reduction in the solution compared with the two other cultures, and the FeT reduction ratio during the process has the following order: mixed > T.f > L.f. The FeT in the solution is generally lower in the presence of MC.

**Figure 4.** Relationship between population of bacteria and Fe total in different conditions. (**a**) 0.01 g/L MIBC; (**b**) 0.01 g/L PO; (**c**) 0.1 g/L MIBC; (**d**) 0.1 g/L PO; (**e**) 1 g/L MIBC; (**f**) 1 g/L PO; (**g**) Control test.

#### 3.2.3. pH

Figure 5 illustrates the negative relationship between pH and MC for the three cultures in all conditions: as the bacterial population (MC) increases, and as a result of its activity, the pH decreases. In general, the rate of pH reduction decreases as the frother dosage increases (Figure 5). These results indicate that L.f has the highest and T.f the lowest pH during the process monitoring (L.f > mixed > T.f). In the presence of PO, the pH reduction is moderately continuous for all cultures, while in the presence of MIBC (above 0.01 g/L) the pH reduction is only detectable for the T.f culture (Figure 5).

**Figure 5.** Relationship between population of bacteria and pH in different conditions. (**a**) 0.01 g/L MIBC; (**b**) 0.01 g/L PO; (**c**) 0.1 g/L MIBC; (**d**) 0.1 g/L PO; (**e**) 1 g/L MIBC; (**f**) 1 g/L PO; (**g**) Control test.

#### **4. Discussion**

Oxidizing metal sulfides to sulfate via contact between bacteria and mineral (direct) and oxidizing Fe<sup>2+</sup> to Fe<sup>3+</sup> and/or S<sup>0</sup> to SO<sub>4</sub><sup>2−</sup> (without contact: indirect) are the main mechanisms of metal extraction in the bioleaching process [53–59]. Moreover, it is well understood that the oxidation of Fe<sup>2+</sup> to Fe<sup>3+</sup> during the bioprocess decreases the pH and the FeT in the solution (precipitation of iron as jarosite and other iron oxides/hydroxides: Equations (2)–(4)) [60–62]. These phenomena lead to both direct and indirect bioleaching mechanisms [63–69]. Moreover, the increasing bacteria population (MC) plays a fundamental role in the bioleaching process, mostly affecting pH and FeT. Thus, there should be negative correlations between MC and pH and between MC and FeT during bio-activities (Figures 2 and 3). On the other hand, there should also be a positive relationship between pH and FeT (Figure 2). Akinci et al. demonstrated that the rate of pH reduction during bioleaching in different bacterial cultures decreases in the following order: *A. thiooxidans* > mixed culture > *A. ferrooxidans* [70], which supports the outcomes presented in Table 5.

$$\text{Fe}^{3+} + 2\text{H}_2\text{O} \rightarrow \text{FeOOH} \downarrow + 3\text{H}^+ \tag{2}$$

$$\text{Fe}^{3+} + 3\text{H}_2\text{O} \rightarrow \text{Fe(OH)}_3 \downarrow + 3\text{H}^+ \tag{3}$$

$$3\text{Fe}^{3+} + \text{M}^{+} + 2\text{HSO}_4^{-} + 6\text{H}_2\text{O} \rightarrow \text{MFe}_3(\text{SO}_4)_2(\text{OH})_6 \downarrow + 8\text{H}^+ \tag{4}$$

The toxicity of flotation reagents is well documented, and the presence of their substrates in flotation products raises many environmental issues and may inactivate bacterial metabolism [71–73]. Thus, frothers can change the surface properties of energy resources in the culture, limit the surface tension of the media, and inhibit microorganism activities [74,75]. The toxicity of flotation reagents depends on their chemical composition and their dosages [25–27,32,76]. When a flotation concentrate of sulfides is subjected to metal extraction via bioleaching, the presence of frothers in the solution can increase the pH at the initial stage of the process [28,29,43,77]. Loon and Madgwick reported that flotation reagents reduced bacterial growth and limited the formation of soluble iron in the bioleaching process. Since MIBC and PO are unstable at pH below 3, they may consume H<sup>+</sup> from the solution, decompose, and increase the pH. Thus, the results presented in Figures 4 and 5 are in good agreement with the literature: as the dosages of these frothers increase, the decrease in pH and FeT in the solution slows down [29].

It was reported that the growth rate of L.f is lower than that of T.f (around half) [58]. This implies that its activity over a given period of the process (21 days) is also lower than that of the two other examined cultures (Figure 1 and Table 5). This means that the magnitude of the negative effect of reagents can be related to the sensitivity of the bacteria. Okibe and Johnson reported that L.f is more sensitive than T.f in the presence of flotation reagents, which is consistent with the results presented in Figures 4 and 5 [45]. In general, bacteria in the mixed culture show better activity and lower sensitivity to the toxicity of frothers than those in the other cultures, while at the highest frother dosage (1 g/L) the sensitivity of T.f is lower than that of the two other cultures (Figures 4 and 5). This can be a result of the simultaneous presence of iron- and sulfur-oxidizing bacteria in the mixed culture, which improves microorganism activities. Zhang et al. reported that the oxidation activity of the mixed culture (*Acidithiobacillus ferrooxidans, Acidithiobacillus thiooxidans,* and *Leptospirillum*) is higher than that of the pure culture, and that the mixed culture has the highest adaptability to the bioleaching conditions [78]. Furthermore, since the solubility of MIBC is six times higher than that of PO (under the same conditions) [47], its inhibitory effect on bacterial activity can be higher than that of PO. This is also in good agreement with the outcome of the analyses (Figures 4 and 5). Thus, although PO produces larger and less stable bubbles than MIBC in flotation separation, its toxicity in terms of bioleaching and environmental issues is lower than that of MIBC.

#### **5. Conclusions**

A comparison between the mixed and pure cultures during 21 days of monitoring indicated that the bacteria concentration of the mixed culture is higher than that of the pure ones. These results indicated that the mixed culture is less sensitive than the pure cultures to the toxicity of MIBC and PO, as conventional flotation frothers, at low dosages (0.001 and 0.01 g/L). MC showed the highest population in the presence of frothers (0.001 and 0.01 g/L). Mutual information and Pearson correlation assessments revealed that the pH and the total iron in the solution are the main parameters during bacteria activities. There is a significant negative correlation between bacteria population and pH (as the most important factor of bioleaching). The presence of frothers disrupted bacterial activities; thus, the rates of pH reduction and of iron oxidation-reduction decreased as the frother dosage increased. In the absence and presence (0.001 g/L) of the flotation frothers, the rate of pH reduction during the process has the following order for the examined cultures: mixed > T.f > L.f. In general, the mixed culture has the highest Fe oxidation-reduction ratio in both the absence and presence of frothers. The results demonstrated that, although MIBC can produce smaller and more stable bubbles than PO during flotation, its toxicity for various microorganisms is higher than that of PO.

**Author Contributions:** M.J., S.Z.S., M.G. (Mahdi Gharabaghi), and H.A. conceived and designed the experiments; M.J. performed the experiments; M.G. (Mehdi Golzadeh) and S.C.C. analyzed the data; S.C.C. and M.J. wrote the paper.

**Acknowledgments:** The authors thank the Center of Research and Development in Sarcheshmeh mine, Kerman, Iran, for its support and for providing the microorganisms for this investigation. Additionally, we would like to thank Ali Rezai, and Hossayni for their assistance in the mineral processing laboratory at the University of Tehran, and Ghanbar Zad for the AAS analyses in the geochemistry laboratory at the University of Tehran.

**Conflicts of Interest:** The authors declare no conflict of interest.

#### **References**


© 2019 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).

## *Article* **Combining Mechanistic Modeling and Raman Spectroscopy for Monitoring Antibody Chromatographic Purification**

**Fabian Feidl 1, Simone Garbellini 1, Martin F. Luna 1, Sebastian Vogg 1, Jonathan Souquet 2, Hervé Broly 2, Massimo Morbidelli <sup>1</sup> and Alessandro Butté 1,\***


Received: 30 August 2019; Accepted: 23 September 2019; Published: 1 October 2019

**Abstract:** Chromatography is widely used in biotherapeutics manufacturing, and the corresponding underlying mechanisms are well understood. To enable process control and automation, spectroscopic techniques are very convenient as on-line sensors, but their application is often limited by their sensitivity. In this work, we investigate the implementation of Raman spectroscopy to monitor monoclonal antibody (mAb) breakthrough (BT) curves in chromatographic operations with a low titer harvest. A state estimation procedure is developed by combining information coming from a lumped kinetic model (LKM) and a Raman analyzer in the frame of an extended Kalman filter approach (EKF). A comparison with suitable experimental data shows that this approach allows for the obtainment of reliable estimates of antibody concentrations with reduced noise and increased robustness.

**Keywords:** Raman spectroscopy; downstream processing; chromatography; flow cell; extended Kalman filter

#### **1. Introduction**

The application of spectroscopic techniques to monitor chromatographic processes in the frame of the so-called process analytical technology (PAT) initiative is very promising due to its potential for gathering important on-line process information in a non-invasive way [1–5]. Available spectroscopic techniques range from UV/vis and Fourier transform infrared spectroscopy to dynamic light scattering [6]. Several applications of Raman spectroscopy have been reported in upstream processing [7–9], showing the potential of this technology, which often requires specific modeling techniques, such as partial least squares (PLS) regression, to extract the desired information from the measured spectra. Recently, a successful implementation of Raman spectroscopy for the on-line monitoring of monoclonal antibody (mAb) concentrations in downstream processing was reported by Feidl et al. [10]. An ad hoc developed flow cell enabled the integration of the Raman technology into the capture (protein A) step of a mAb manufacturing process, providing accurate on-line estimates of mAb concentration. However, in spite of these results, the use of this technology remains limited due to the intrinsic weakness of the Raman signal [11–13].

In this work, we explore the possibility of overcoming these difficulties by combining estimates from the Raman signal with the predictions of a mechanistic model. This is particularly convenient in the chromatographic purification of mAbs because these processes are well understood and reliable mechanistic simulation models are available [14]. It has been already shown in many other areas that the combination of deterministic knowledge and on-line measurements can lead to more accurate and reliable estimates [15–18], e.g., by using extended Kalman filtering (EKF) [19]. While a general introduction to Kalman filtering is given in [20], a more detailed description is provided in [21].

In this work, the implementation of an EKF for a chromatographic capture step to estimate antibody concentration is shown by combining the information of a deterministic chromatographic model with on-line information derived from Raman-based PLS estimates. In particular, the objective is to monitor the chromatographic breakthrough curves of a low-concentrated monoclonal antibody harvest. The beneficial effect of the combination of the two approaches with respect to the stand-alone Raman and chromatographic model is discussed.

#### **2. Materials and Methods**

#### *2.1. Raman Spectral Acquisition and Flow Cell*

Raman spectra were acquired with a Kaiser RamanRxn2 analyzer (Kaiser Optical Systems, Inc., Ann Arbor, MI, USA), including a 785 nm laser at 400 mW and a cooled charge-coupled device (CCD) detector, measuring inelastic photon scattering across a 150–3425 cm<sup>−1</sup> wavenumber range. A laser exposure time of 30 s was chosen to collect the single scan spectra. A flow cell with an optimized flow characteristic, signal enhancement, pressure tolerance, and a single use potential was developed. A schematic illustration of the flow cell is shown in Figure 1. It includes four main modules: (A) An analyzer adapter, which connects the flow cell to the Raman analyzer via a fiber cable; (B) a non-contact objective to focus the laser beam within the flow path; (C) a flow path, which guides the sample longitudinally to the laser beam; and (D) a reflector, reflecting scattered and unscattered light to the analyzer via the flow path. In the application to chromatographic purifications, the inlet connection is coupled to the elution stream, and the outlet connection is linked to the sample fractionator.

**Figure 1.** Schematic illustration of the developed flow cell.

#### *2.2. Cell Culture Supernatant*

Two cell culture supernatant pools containing a recombinant mAb with product concentrations between 0.30 and 0.60 mg/mL were obtained from a CHO cell perfusion process, as reported in [22]. Besides cell filtering through the perfusion hollow fiber module (0.5 μm pore size, Spectrum Laboratories, Netherlands), no other treatment was applied to the supernatant, which therefore contained a large quantity of impurities, e.g., media components, host cell proteins (HCP, 3 × 10<sup>5</sup> ppm), DNA (4 × 10<sup>4</sup> ppm) and high molecular weight (HMW) species (1.1%).

#### *2.3. Breakthrough Runs and Reference Analytic*

Fifteen breakthrough (BT) runs were performed on MabSelect SuRe columns (GE Healthcare, Uppsala, Sweden), prepacked by Repligen GmbH (Ravensburg, Germany, 0.5 × 5 cm), as described in [10]. The feed concentration, flow rate, and the fraction duration were changed between the different BT runs, as described in Table 1. Since several Raman measurements were acquired while collecting the sample of a single fraction (between 4 and 16 spectra per fraction), a spline approach was applied to interpolate missing reference measurements for each Raman spectrum.


**Table 1.** Chromatographic settings and Raman measurements availability of performed breakthrough curves.

The mAb concentrations were determined off-line by HPLC with an analytical standard deviation of 0.01 mg/mL as described in [23]. The HMW, HCP, and DNA content of the harvest were determined as described in [10]. A schematic illustration of the experimental set-up is shown in Figure 2.

**Figure 2.** Schematic illustration of the experimental set up to perform chromatographic breakthrough runs and collecting synchronized Raman measurements and breakthrough fractions.

#### *2.4. Chemometric Modeling Procedure*

All calculations were performed with MATLAB R2018a (Mathworks, Natick, MA, USA) using in-house developed routines, if not stated otherwise. The modeling procedure included Savitzky–Golay smoothing with a second polynomial order and a frame size of 51 Raman shift wavenumbers [24], spectrum wise standard normal variate (SNV) processing [25], and Raman shift wavenumber wise mean centering on spectra and reference values [26]. The removal of spectral regions based on the bioprocess modeling experience resulted in spectral ranges of 450–1820 cm<sup>−1</sup>, 1880–2530 cm<sup>−1</sup> and 2590–3100 cm<sup>−1</sup> to eliminate interferences with the window material and water as well as non-informative regions. No derivative, Raman shift wavenumber selection, or automated outlier removal tools were applied. The nonlinear iterative partial least squares (NIPALS) algorithm [27] was used to calibrate predictive PLS models, which regressed spectral data on HPLC reference values, including a threefold cross validation (CV) [28]. The optimal number of latent variables (LV) was determined based on the minimum CV error.
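Two of the preprocessing steps named above, SNV and mean centering, can be sketched in a few lines of Python. This is an illustration under stated assumptions rather than the in-house MATLAB routines: the function names are hypothetical, the sample standard deviation is assumed for SNV, and the toy "spectra" have only three points.

```python
import statistics

def snv(spectrum):
    """Standard normal variate: scale one spectrum by its own mean and std."""
    mean = statistics.fmean(spectrum)
    std = statistics.stdev(spectrum)  # sample standard deviation (assumption)
    return [(value - mean) / std for value in spectrum]

def mean_center(spectra):
    """Subtract the wavenumber-wise (column-wise) mean from every spectrum."""
    n = len(spectra)
    col_means = [sum(column) / n for column in zip(*spectra)]
    centered = [[v - m for v, m in zip(row, col_means)] for row in spectra]
    return centered, col_means

# Toy data: two "spectra" with three wavenumbers each (illustrative only).
spectra = [[1.0, 2.0, 3.0], [2.0, 4.0, 9.0]]
snv_spectra = [snv(s) for s in spectra]
centered, col_means = mean_center(snv_spectra)
```

The stored column means would be reused to center any new spectrum before prediction, so that calibration and external data are treated identically.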

After calibrating the model on five different BT runs, the model was tested on an external BT run, i.e., not included in the calibration. In order to evaluate the model performance, the root mean square error was calculated for both cross validation (RMSECV) and external prediction (RMSEP), using Equation (1), where *ŷ<sub>i</sub>* is the predicted value of the *i*-th observation, *y<sub>i</sub>* is the corresponding measured value, and *n* is the total number of observations:

$$RMSE = \sqrt{\frac{1}{n} \sum_{i=1}^{n} \left( y_i - \hat{y}_i \right)^2} \tag{1}$$

Furthermore, the coefficient of determination (*R*<sup>2</sup>) was calculated for the external prediction as follows, with *ȳ* denoting the mean of the measured values:

$$R^2 = 1 - \frac{\sum_{i=1}^{n} (y_i - \hat{y}_i)^2}{\sum_{i=1}^{n} (y_i - \overline{y})^2} \tag{2}$$
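Equations (1) and (2) translate directly into code. The following Python sketch uses made-up reference and predicted concentrations, chosen only so that the results are easy to check by hand.

```python
import math

def rmse(y_true, y_pred):
    """Root mean square error, Equation (1)."""
    n = len(y_true)
    return math.sqrt(sum((y - yh) ** 2 for y, yh in zip(y_true, y_pred)) / n)

def r_squared(y_true, y_pred):
    """Coefficient of determination, Equation (2)."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((y - yh) ** 2 for y, yh in zip(y_true, y_pred))
    ss_tot = sum((y - mean_y) ** 2 for y in y_true)
    return 1.0 - ss_res / ss_tot

# Illustrative values in mg/mL (not data from this work).
y_ref = [0.0, 0.1, 0.2, 0.3]
y_hat = [0.0, 0.1, 0.2, 0.5]
error = rmse(y_ref, y_hat)     # sqrt(0.04 / 4) = 0.1
fit = r_squared(y_ref, y_hat)  # 1 - 0.04 / 0.05 = 0.2
```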

A rotational approach was used to judge the model transferability and data set similarity. Hence, sets were rotated in order to have every BT run in the external prediction set once, as shown in Table 2. As an example, to build the model for rotation 1 (ROT1), the data of BT#2–6 were used for model calibration and tested on BT#1 (for further details, see [10]).
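The rotation scheme amounts to holding out each of the six BT runs once. A minimal Python sketch, with a hypothetical helper `rotations` and run numbers standing in for the data sets, could look as follows.

```python
def rotations(runs):
    """Yield (calibration_runs, external_run), holding out each run once."""
    for held_out in runs:
        calibration = [run for run in runs if run != held_out]
        yield calibration, held_out

bt_runs = [1, 2, 3, 4, 5, 6]  # BT#1-6, the runs with Raman data used for PLS
plan = {f"ROT{i}": split for i, split in enumerate(rotations(bt_runs), start=1)}
```

Here `plan["ROT1"]` pairs the calibration runs BT#2–6 with the external run BT#1, as in the example given in the text.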

**Table 2.** Modeling procedure including the Raman-PLS (partial least squares) calibration, mechanistic model fitting, as well as the tuning, validation and perturbation of the extended Kalman filtering (EKF).

```
For a given Rot i
      1. Raman-PLS calibration:
           PLS_Rot_i ← calibrate Raman-PLS on BT#1–6 except BT#i
      2. Mechanistic model fitting:
           LKM ← fit LKM on BT#7–15 with respective process inputs (PI_BT#7–15)
      3. EKF tuning:
           EKF_Rot_i ← tune EKF (Q, R, kQ_Rot_i) on BT#1–6 except BT#i
                  for n = BT#1–6 except BT#i
                      kQbest_n = argmin over kQ of RMSEP_EKF_n
                  end for
                  kQ_Rot_i = average of kQbest_n for all n
      4. EKF validation:
           Run EKF_Rot_i with inputs LKM, PI_Rot_i, PLS_Rot_i and kQ_Rot_i
      5. EKF perturbation:
             for n = 1–200 simulations
               PI_RS = random sampling of ε, c_in, Q_flow from Gaussian probability distribution
               Run EKF_Rot_i with inputs LKM, PI_RS, PLS_Rot_i and kQ_Rot_i
             end for
```
#### *2.5. Deterministic Modeling Procedure*

The chromatographic process was modelled using the lumped kinetic model (LKM) [14,29]:

$$\frac{\partial c}{\partial t} = -v \frac{\partial c}{\partial x} + D_L \frac{\partial^2 c}{\partial x^2} - \phi \frac{\partial q}{\partial t}, \quad t \in [0, t_{\text{end}}],\ x \in [0, L_{\text{Col}}] \tag{3}$$

$$\frac{\partial q}{\partial t} = k_m (q^* - q) \tag{4}$$

where *c* is the liquid phase concentration of the protein, *t* is the time, *t<sub>end</sub>* is the end of the BT run, *v* is the interstitial velocity (*v* = *Q<sub>flow</sub>*/(*A<sub>col</sub>* ε)), *Q<sub>flow</sub>* is the volumetric flow rate (see Table 1), *A<sub>col</sub>* is the column cross sectional area (*A<sub>col</sub>* = 0.196 cm<sup>2</sup>), ε is the bed porosity (ε = 0.36 [30]), *x* is the coordinate along the column longitudinal axis, *L<sub>Col</sub>* is the column length (*L<sub>Col</sub>* = 5 cm), *D<sub>L</sub>* is the apparent axial dispersion coefficient, ϕ = (1 − ε)/ε is the phase ratio of the column, *q* is the solid phase concentration of the protein, *k<sub>m</sub>* is the mass transfer coefficient, and *q*<sup>∗</sup> is the equilibrium solid phase concentration of the protein.

The following initial conditions were applied:

$$c(t=0, x) = c_0(x) \tag{5}$$

$$q(t=0, x) = q_0(x) \tag{6}$$

They were then combined with the classical Danckwerts' boundary conditions:

$$c(t>0, x=0) = c_{\text{in}}(t) + \frac{D_L}{v} \left. \frac{\partial c}{\partial x} \right|_{x=0} \tag{7}$$

$$\left. \frac{\partial c}{\partial x} \right|_{x=L_{\text{Col}}} = 0 \tag{8}$$

where *c<sub>in</sub>*(*t*) is the feed concentration (see Table 1). Both *c*<sub>0</sub>(*x*) and *q*<sub>0</sub>(*x*) are zero for all values of *x*. The apparent axial dispersion coefficient *D<sub>L</sub>* can be estimated from the reduced van Deemter equation [31]:

$$D_L = A \frac{d_p}{2} v \tag{9}$$

where *A* is the intercept of the reduced van Deemter equation and *d<sub>p</sub>* is the average particle diameter (*d<sub>p</sub>* = 85 μm). The van Deemter eddy diffusion coefficient *A* was experimentally determined from pulse injection experiments at different flow rates with mAb under non-adsorbing conditions (*A* = 17.15; data not shown). An empirical correlation was used for the mass transfer coefficient *k<sub>m</sub>*, approximating hindered mass transfer due to pore blockage and other effects [32]:

$$k_m = k_m^{\max} \left( S_1 + (1 - S_1) \left( 1 - \frac{q}{q_{\text{sat}}} \right)^{S_2} \right) \tag{10}$$

where *k<sub>m</sub><sup>max</sup>* is the maximum mass transfer coefficient, *q<sub>sat</sub>* is the saturation capacity of the resin, and *S*<sub>1</sub> is a maximum hindrance coefficient (0 < *S*<sub>1</sub> ≤ 1). The coefficient *S*<sub>2</sub> (with *S*<sub>2</sub> > 0) accounts for the nonlinear increase of the hindrance. The protein adsorption process was described using a Langmuir isotherm, where *H* is the Henry coefficient:

$$q^* = \frac{Hc}{1 + \frac{Hc}{q_{\text{sat}}}} \tag{11}$$
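To make the behavior of Equations (10) and (11) concrete, the two closures can be written as small Python functions. The parameter values below are hypothetical placeholders chosen for easy arithmetic; they are not the fitted values of this work.

```python
def langmuir_q_star(c, H, q_sat):
    """Equilibrium solid phase concentration, Equation (11)."""
    return H * c / (1.0 + H * c / q_sat)

def mass_transfer_coefficient(q, k_m_max, q_sat, S1, S2):
    """Empirical hindered mass transfer coefficient, Equation (10)."""
    return k_m_max * (S1 + (1.0 - S1) * (1.0 - q / q_sat) ** S2)

# Hypothetical parameter values for illustration only.
H, q_sat, k_m_max, S1, S2 = 100.0, 50.0, 1.0, 0.2, 2.0

q_eq = langmuir_q_star(0.5, H, q_sat)                               # 50 / (1 + 1) = 25
k_empty = mass_transfer_coefficient(0.0, k_m_max, q_sat, S1, S2)    # equals k_m_max
k_full = mass_transfer_coefficient(q_sat, k_m_max, q_sat, S1, S2)   # equals k_m_max * S1
```

The two limiting cases show the role of *S*<sub>1</sub>: an empty resin transfers mass at the maximum rate, while a saturated resin is slowed to the fraction *S*<sub>1</sub> of it.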

Coefficients *km*, *S*1, *S*2, *qsat* and *H* were fitted on BT#7–15 using the corresponding process inputs (PI), such as ε, *cin* and *Qflow*, as shown in Table 2. The partial differential equations were discretized along the x coordinate using a first order central finite difference method, and the resulting system of ordinary differential equations was solved using 100 grid points.
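The spatial discretization can be sketched as a method-of-lines right-hand side in Python. This is an illustration under assumptions, not the implementation used here: the ghost-node treatment of the Danckwerts conditions, the function name `lkm_rhs`, the flow rate, the feed concentration, and the isotherm parameters are all hypothetical, while the column geometry, porosity, particle diameter, and *A* are taken from the text.

```python
def lkm_rhs(c, q, c_in, v, D_L, phi, dx, k_m, q_star):
    """Semi-discretized right-hand sides of Eqs. (3) and (4) on a uniform grid.

    k_m(q) and q_star(c) are callables; ghost nodes enforce the Danckwerts
    boundary conditions, Eqs. (7) and (8), using central differences.
    """
    n = len(c)
    # Inlet: c(0) = c_in + (D_L / v) * dc/dx, solved for the ghost value.
    ghost_in = c[1] - 2.0 * dx * (v / D_L) * (c[0] - c_in)
    # Outlet: dc/dx = 0, so the ghost value mirrors the last interior node.
    ghost_out = c[n - 2]
    padded = [ghost_in] + list(c) + [ghost_out]
    dc_dt, dq_dt = [], []
    for i in range(n):
        left, mid, right = padded[i], padded[i + 1], padded[i + 2]
        dq = k_m(q[i]) * (q_star(mid) - q[i])
        convection = -v * (right - left) / (2.0 * dx)
        dispersion = D_L * (right - 2.0 * mid + left) / dx ** 2
        dc_dt.append(convection + dispersion - phi * dq)
        dq_dt.append(dq)
    return dc_dt, dq_dt

# Values from the text (cm2, -, cm, cm, -); d_p = 85 um = 85e-4 cm.
A_col, eps, L_col, d_p, A_vd = 0.196, 0.36, 5.0, 85e-4, 17.15
Q_flow = 1.0                    # mL/min, hypothetical flow rate
c_feed = 0.5                    # mg/mL, hypothetical feed concentration
v = Q_flow / (A_col * eps)      # interstitial velocity, cm/min
D_L = A_vd * d_p / 2.0 * v      # Eq. (9), cm2/min
phi = (1.0 - eps) / eps         # phase ratio
n_grid = 100
dx = L_col / (n_grid - 1)

# Hypothetical closures standing in for Eqs. (10) and (11).
q_star = lambda c: 100.0 * c / (1.0 + 100.0 * c / 50.0)
k_m = lambda q: 1.0

# Empty column at t = 0: only the inlet node starts to fill.
dc_empty, dq_empty = lkm_rhs([0.0] * n_grid, [0.0] * n_grid,
                             c_feed, v, D_L, phi, dx, k_m, q_star)
# Fully equilibrated column: nothing changes any more.
q_equil = q_star(c_feed)
dc_equil, dq_equil = lkm_rhs([c_feed] * n_grid, [q_equil] * n_grid,
                             c_feed, v, D_L, phi, dx, k_m, q_star)
```

The returned derivatives would be passed to an ODE integrator; the two checks at the end only confirm that the boundary treatment behaves sensibly for an empty and for a saturated column.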

#### *2.6. Extended Kalman Filter Tuning, Validation and Perturbation*

A prerequisite for this technique is a general nonlinear time-invariant system in continuous time, which generates measurements at discrete time steps *tk* = *k*Δ*t* [33,34]:

$$\frac{\partial x}{\partial t} = f(x(t), u(t), p) + w(t) \tag{12}$$

$$y(t_k) = h(x(t_k)) + v(t_k) \tag{13}$$

where *x* denotes the states, *u* is the deterministic inputs, *p* is the time-invariant parameters, and *y* is the measurements of the system. The nonlinear function *f*() describes the state dynamics, and *h*() is the measurement function that relates state *x* with measurement *y*. The process noise *w* and the measurement noise *v* are assumed to be uncorrelated zero-mean Gaussian random processes with covariances *Q*(*t*) and *R*(*t*), respectively (*E* denotes the expectation operator):

$$E[w(t)] = E[v(t_k)] = 0 \tag{14}$$

$$E\left[w(t)\,w^{T}(\tau)\right] = Q(t)\,\delta(t-\tau) \tag{15}$$

$$E\left[v(t_k)\,v^T(t_k)\right] = R(t_k) \tag{16}$$

$$E\left[w(t)\,v^T(t_k)\right] = 0 \tag{17}$$

Given such a system, the EKF can estimate states from noisy measurements through a recursive procedure, including two main steps [34,35]. In the first step (prediction step), the a posteriori state estimate *x̂*(*t*<sub>*k*−1</sub><sup>+</sup>) and covariance matrix *P*(*t*<sub>*k*−1</sub><sup>+</sup>) are propagated from *t*<sub>*k*−1</sub><sup>+</sup> to *t*<sub>*k*</sub><sup>−</sup>, leading to the a priori state estimate and covariance matrix (superscripts indicate values before (−) and after (+) the measurement update):

$$\hat{x}(t_k^-) = \hat{x}(t_{k-1}^+) + \int_{t_{k-1}^+}^{t_k^-} f(\hat{x}(\tau), u(\tau), p)\, d\tau \tag{18}$$

$$\hat{y}(t_k^-) = h\left(\hat{x}(t_k^-)\right) \tag{19}$$

As well as the state covariance matrix:

$$P(t_k^-) = P(t_{k-1}^+) + \int_{t_{k-1}^+}^{t_k^-} \left( Z(\tau)P(\tau) + P(\tau)Z^T(\tau) + Q(\tau) \right) d\tau \tag{20}$$

The EKF formulation uses linearized models of the nonlinear system for state estimation. Hence, the system is linearized at each time *tk* to obtain local state–space matrices:

$$Z(t) = \left(\frac{\partial f}{\partial x}\right)_{x(t), u(t), p} \tag{21}$$

$$C(t_k) = \left(\frac{\partial h}{\partial x}\right)_{x(t_k^-)} \tag{22}$$

In the second step (update step), conducted as soon as a new measurement *y*(*tk*) becomes available, the Kalman filter gain *K*(*tk*) is calculated and used to update the a priori state estimates and covariance matrix to the a posteriori values:

$$K(t\_k) = P(t\_k^-) \mathbb{C}(t\_k)^T \left( \mathbb{C}(t\_k) P(t\_k^-) \mathbb{C}(t\_k)^T + R(t\_k) \right)^{-1} \tag{23}$$

$$\mathfrak{X}\begin{pmatrix}t\_k^+\\k\end{pmatrix} = \mathfrak{X}\begin{pmatrix}t\_k^-\\k\end{pmatrix} + \mathcal{K}(t\_k)\left(\mathcal{Y}(t\_k) - h\left(\mathfrak{X}\begin{pmatrix}t\_k^-\\k\end{pmatrix}\right)\right) \tag{24}$$

$$P(t\_k^+) = (I - K(t\_k)C(t\_k))P(t\_k^-) \tag{25}$$
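The two steps in Equations (18)–(25) can be illustrated for a single state with a short Python sketch. This is not the MATLAB implementation used in this work: the explicit Euler integration, the function names and the toy model (`toy_f`, `toy_h`) are assumptions chosen only to keep the example self-contained.

```python
def ekf_predict(x_post, p_post, f, dfdx, q, dt, n_sub=20):
    """Propagate estimate and covariance from t_{k-1}^+ to t_k^- (Eqs. 18 and 20).

    Scalar case, explicit Euler with n_sub sub-steps (an assumption of this sketch).
    """
    step = dt / n_sub
    x, p = x_post, p_post
    for _ in range(n_sub):
        z = dfdx(x)  # local linearization, Eq. 21
        # The scalar form of Z P + P Z^T + Q is 2 z p + q.
        x, p = x + step * f(x), p + step * (2.0 * z * p + q)
    return x, p


def ekf_update(x_prior, p_prior, y, h, dhdx, r):
    """Measurement update at t_k (Eqs. 22-25), scalar case."""
    c = dhdx(x_prior)                          # Eq. 22
    k = p_prior * c / (c * p_prior * c + r)    # Eq. 23
    x_post = x_prior + k * (y - h(x_prior))    # Eq. 24
    p_post = (1.0 - k * c) * p_prior           # Eq. 25
    return x_post, p_post, k


# Hypothetical toy model, not the lumped kinetic model of this work.
def toy_f(x):
    return -0.5 * x * x


def toy_dfdx(x):
    return -x


def toy_h(x):
    return x


def toy_dhdx(x):
    return 1.0


x_prior, p_prior = ekf_predict(1.0, 0.1, toy_f, toy_dfdx, q=0.01, dt=1.0)
x_post, p_post, gain = ekf_update(x_prior, p_prior, y=0.7,
                                  h=toy_h, dhdx=toy_dhdx, r=0.04)
```

In this scalar setting the gain lies between 0 and 1, so the updated estimate always falls between the model prediction and the measurement.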

The aim of the procedure is to obtain improved state estimates *x̂*(*t*<sub>*k*</sub><sup>+</sup>), characterized by small values of the covariance matrix *P*(*t*<sub>*k*</sub><sup>+</sup>). To achieve this for a specific application, the design parameters of the EKF, such as the measurement noise covariance *R*, the initial state estimate *x̂*(*t*<sub>0</sub><sup>+</sup>), the corresponding covariance *P*<sub>0</sub>, and the process noise covariance *Q*, need to be carefully selected. To initialize the filter, a consistent pair of *x̂*<sub>0</sub> and *P*<sub>0</sub> must be chosen to enable fast convergence to the correct estimate [36].

In this work, the built-in KF toolbox of MATLAB was used. The discretized lumped kinetic model served as the state transition function *f*(), and the Raman-PLS results were used as physical measurements of the outlet concentration of the column. The variance corresponding to the rotation-specific RMSEP, multiplied by the identity matrix, was used as the error covariance matrix *R*. The time-varying process noise covariance *Q* was computed on-line at any given time *t*<sub>*k*</sub> using a Monte Carlo approach, as reported in [33]. This takes the knowledge from the LKM parameter identification step into account by using the nominal parameter values and their covariance matrix, resulting in *Q*<sub>MC</sub>. Additionally, a model mismatch factor *k*<sub>*Q*</sub> [34] was used and tuned for each rotation:

$$Q = Q\_{\text{MC}} \, k\_Q \tag{26}$$
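One common way to obtain a Monte Carlo estimate such as *Q*<sub>MC</sub> is to sample the model parameters from their estimated distribution and to take the variance of the resulting model deviations. The sketch below illustrates this idea for a scalar state. It is only an assumption about the general approach: the procedure in [33] may differ in detail, and the function names, the independent Gaussian parameters and the toy right-hand side are hypothetical.

```python
import random
import statistics


def monte_carlo_q(f, x, p_nominal, p_std, n_samples=500, seed=0):
    """Estimate a scalar process noise variance from parameter uncertainty.

    Assumptions of this sketch: parameters are independent Gaussians, and the
    process noise is the variance of the deviations of the model right-hand
    side from its nominal value.
    """
    rng = random.Random(seed)
    f_nominal = f(x, p_nominal)
    deviations = []
    for _ in range(n_samples):
        p = [rng.gauss(mean, std) for mean, std in zip(p_nominal, p_std)]
        deviations.append(f(x, p) - f_nominal)
    return statistics.variance(deviations)


def toy_rhs(x, p):
    """Hypothetical right-hand side dx/dt = -p[0] * x + p[1]."""
    return -p[0] * x + p[1]


q_mc = monte_carlo_q(toy_rhs, x=2.0, p_nominal=[1.0, 0.5], p_std=[0.1, 0.05])
k_q = 1.0e4        # model mismatch factor (illustrative value)
q = q_mc * k_q     # Eq. 26
```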

For a given rotation, as shown in Table 2, the EKF approach was applied separately to all *n* BT curves of the calibration set. For each BT curve, *k*<sub>*Q*</sub> was optimized (*k*<sub>*Q*,best</sub><sup>*n*</sup>) based on the respective prediction error (RMSEP<sub>EKF,*n*</sub>). Subsequently, the resulting *k*<sub>*Q*,best</sub><sup>*n*</sup> of all *n* BT curves of the calibration set were averaged, giving the rotation-specific model mismatch factor (*k*<sub>*Q*</sub><sup>Rot *i*</sup>). The EKF was then applied to the external BT curve, which was included in neither the Raman-PLS calibration nor the EKF tuning, to externally validate its effect. In a perturbation study, 200 simulations of the mechanistic model were run with random process input values (*PIRS*) for the bed porosity ε, the feed concentration *c*<sub>in</sub> and the volumetric flow rate *Q*<sub>flow</sub>, sampled from a Gaussian probability distribution with a standard deviation of 5%, to compare the robustness of the LKM and the EKF.
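The tuning procedure described above can be sketched as a grid search per calibration curve followed by averaging. The code is a minimal illustration under stated assumptions: `rmsep_ekf` stands for a user-supplied function that runs the EKF on one curve and returns its prediction error, the logarithmic grid is an assumed discretization of the range 10 to 1 × 10<sup>6</sup>, and `toy_rmsep` is a hypothetical stand-in used only to make the example runnable.

```python
import math
import statistics


def log_grid(low_exp=1.0, high_exp=6.0, per_decade=4):
    """Logarithmically spaced k_Q candidates between 10**low_exp and 10**high_exp."""
    n = int(round((high_exp - low_exp) * per_decade))
    return [10.0 ** (low_exp + i / per_decade) for i in range(n + 1)]


def tune_kq(curves, rmsep_ekf, kq_grid):
    """Return the averaged best k_Q and the list of per-curve optima.

    rmsep_ekf(curve, k_q) must return the EKF prediction error for one curve.
    """
    best_per_curve = []
    for curve in curves:
        errors = [rmsep_ekf(curve, k_q) for k_q in kq_grid]
        best_per_curve.append(kq_grid[errors.index(min(errors))])
    # Arithmetic mean: the text only says "averaged" (assumption of this sketch).
    return statistics.mean(best_per_curve), best_per_curve


def toy_rmsep(curve, k_q):
    """Hypothetical error curve with a minimum at k_Q = 10**curve['opt_exp']."""
    return 0.02 + 0.01 * (math.log10(k_q) - curve["opt_exp"]) ** 2


calibration_curves = [{"opt_exp": 4.0}, {"opt_exp": 4.0}, {"opt_exp": 4.25}]
k_rot, k_best = tune_kq(calibration_curves, toy_rmsep, log_grid())
```

Because the candidate values span several orders of magnitude, averaging the exponents (a geometric mean) would be an alternative design choice; the arithmetic mean is used here only because it is the most literal reading of the text.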

#### **3. Results and Discussion**

#### *3.1. Partial Least Squares Raman Modeling*

The acquired Raman spectra of the BT curves are comparable with the spectra described in [10] and are shown in Figure S1A. Due to the high impurity content in the harvest, the spectral features of different species (i.e., target mAb, media components, HCPs, DNA and HMW) overlapped, leading to broad bands and no distinct peak profiles within single spectra. Hence, only small variations between different spectra could be observed, and suitable data pretreatment and multivariate model calibration were needed to extract useful information, such as the target protein titer. The pretreatment had an obvious influence on the spectral appearance, as shown in Figure S1B. The variable importance in projection (VIP), shown in Figure S1C, indicates that the regions between 2300 and 2700 cm<sup>−1</sup> and between 2900 and 3000 cm<sup>−1</sup> are very important. This is in line with the fact that proteins exhibit several Raman bands in the region between 2500 and 4000 cm<sup>−1</sup> [37]. Results of the PLS modeling for the different rotations are shown in Table 3. Although the number of observations in the calibration set varied slightly between rotations, the optimal number of selected latent variables (LVs) was consistent among rotations and was either 11 or 12. The RMSECV was nearly constant at around 0.040 mg/mL over a calibration range from 0 to 0.42 mg/mL and was thus almost independent of the rotation scheme. However, variations in the RMSEP between 0.045 and 0.072 mg/mL, as well as values of R<sup>2</sup> varying between 0.70 and 0.86, indicated slight differences between the BT runs.

**Table 3.** Data set information and PLS modeling results of all rotations.


The prediction of titer as a function of time for ROT1 is shown as an example in Figure 3. The red dots represent the off-line HPLC titer measurements for each fraction, whereas the continuous blue line represents the Raman-PLS-based prediction. It can be seen that the trend of the BT curve was generally well captured by Raman. However, the predictions were clearly scattered around the reference values. It is worth mentioning that increasing the number of scans per Raman measurement could improve the signal-to-noise ratio. However, this would extend the measurement duration, which is critical in most downstream processing applications.

**Figure 3.** Time evolutions of Raman-PLS predictions (blue) for rotation 1 (ROT1) compared to off-line HPLC measurements (red).

As for the other rotations, shown in Figure 4, one can observe a clear offset of the predictions in the initial phase (i.e., before breakthrough started). In spite of an optimized laser exposure time, signal enhancement through the flow cell and spectral pretreatment methods, the signal-to-noise ratio was rather low and the signal was probably close to the detection limit. Although the obtained average RMSEP of 0.05 mg/mL and average R<sup>2</sup> of 0.8 are remarkable, one must probably conclude that Raman-PLS is insufficient for precise monitoring of a breakthrough at such low concentrations.

#### *3.2. Mechanistic Modeling*

As a next step, an additional set of nine BT runs (BT#7–15) was used to fit the LKM parameters by minimizing the RMSE with respect to the measured concentration values in the breakthrough. The corresponding estimated values (along with 95% confidence intervals) are reported in Table 4 and comprise the Henry coefficient *H*, the saturation capacity *q*<sub>sat</sub>, the maximum mass transfer coefficient *k*<sub>m</sub><sup>max</sup>, the maximum hindrance coefficient *S*<sub>1</sub>, and the nonlinearity increase of the hindrance coefficient *S*<sub>2</sub>.

**Table 4.** Fitted parameters of the lumped kinetic model (LKM) optimized on breakthrough (BT)#7–BT#15 (ε = 0.36; *A* = 17.15; *d*<sub>p</sub> = 85 μm).


The corresponding model predictions of the breakthrough together with the fitting data sets are shown in Figure 5. The red dots represent the HPLC titer measurements, whereas the continuous blue lines represent the predictions of the LKM. Additionally, the RMSE in fitting (RMSEF) is indicated.

**Figure 4.** Titer predictions of Raman-PLS (grey), the LKM (green), the EKF model with tuned *k*<sub>*Q*</sub> (black) and HPLC off-line measurements (red) for rotations 1–6 (**A**–**F**).

**Figure 5.** Internal mechanistic model (LKM) predictions of titer for BT#7–15 (blue) and corresponding HPLC off-line measurements (red).

It can be seen that the shapes of the BT curves vary in steepness, inflection point and saturation level. This is due to the differences in feed concentration and loading flow rate. At complete column saturation, the outlet concentration in the BT asymptotically approached the feed concentration. Moreover, higher feed concentrations generally produced earlier breakthroughs. Similarly, larger flow rates not only reduced the residence time in the column but also increased the convection rate along the column with respect to the diffusion rate to the resin, thus producing faster and flatter BT curves. In most cases, the LKM was able to closely predict the trend of the reference measurements. The worst results were obtained for BT#10–12, where the RMSEF ranged from 0.016 to 0.026 mg/mL. For these runs, which correspond to the lower feed concentrations, the error was mostly due to a shift of the predicted BT time with respect to the measured one. In spite of this, the slope of the predicted BT curves at the inflection point was very similar to the measured one. This result could be explained by an underestimation of the effective column capacity. However, it is difficult to identify the exact origin of this disagreement.

The model's ability to predict new runs was tested by applying the LKM to the first six BT curves (BT#1–6), which were not included in the fitting of the LKM. The results are shown in Figure 6, where the red dots represent the HPLC measurements and the blue lines the LKM predictions.

**Figure 6.** Mechanistic model (LKM) predictions of titer for BT# 1–6 with parameter values fitted on BT# 7–15 (blue) compared to HPLC off-line measurements (red).

It can be observed that the model was able to predict the shape of the BT curves and, in particular, their steepness, indicating a good estimation of the mass transport properties. Additionally, the saturation level seemed to be well predicted, since there is no significant mismatch between the estimated and measured times for reaching saturation conditions. However, BT#1–4 showed a significant offset, leading to RMSEP values of up to 0.039 mg/mL. This may be because the model did not precisely capture the adsorption mechanism at lower feed concentrations, which could also explain the offsets in Figure 5. Moreover, one can observe a different behavior of the curves at early BT times: the measured BT curves appear sharper than those predicted by the model. Again, this might be due to unaccounted differences in the feed composition, resin aging, column packing quality or a more complex behavior of the system than described by the model. Of course, more complex models could be introduced to improve the description, particularly of the mass transfer process [31,38]. On the other hand, the model's ability to predict the shape of the BT curve well, in spite of its generality and simplicity, makes it a good candidate for use within the EKF, where such inaccuracies can be corrected in real time by experimental measurements.

#### *3.3. Extended Kalman Filter Tuning*

Before applying the EKF concept, the filter design parameters *R*, *x̂*(*t*<sub>0</sub><sup>+</sup>), *Q* and *k*<sub>*Q*</sub> needed to be carefully selected and tuned. For this, the variance of the Raman-PLS model (RMSECV<sub>PLS</sub><sup>2</sup>) was used as the measurement noise *R*, while the process noise *Q* was computed on-line at any given *t*<sub>*k*</sub>, as described in Section 2.6. Since mismatches between the LKM predictions and the external data set were expected, a rotation-specific mismatch factor *k*<sub>*Q*</sub> was applied. In the following, the determination of *k*<sub>*Q*</sub> for ROT1 (*k*<sub>*Q*</sub><sup>Rot 1</sup>) is described. The EKF was applied to each BT curve of the calibration set of ROT1, i.e., BT#2–6, using the Raman-PLS model calibrated for ROT1 and the LKM predicting the respective BT curve, while varying *k*<sub>*Q*</sub> in a range between 10 and 1 × 10<sup>6</sup>. The resulting RMSEP<sub>EKF</sub> as a function of *k*<sub>*Q*</sub> for each BT curve of the training set is shown in Figure 7.

**Figure 7.** Error in the EKF prediction of the monoclonal antibody (mAb) concentration as a function of the model mismatch factor *k*<sub>*Q*</sub> for each of the BT curves of the calibration set of ROT1.

It can be seen that a minimal RMSEP<sub>EKF</sub> could be obtained by selecting *k*<sub>*Q*</sub> around 1 × 10<sup>4</sup> for all BT curves. Note that *k*<sub>*Q*</sub> can be regarded as a measure of the confidence in the Raman measurements versus the confidence in the mechanistic model. The higher *k*<sub>*Q*</sub>, the more the EKF relied on Raman-PLS, while for small *k*<sub>*Q*</sub>, the LKM was considered more trustworthy. It is also important to note that the absolute value of *k*<sub>*Q*</sub> depended strongly on the absolute values of the estimates of both measurement and process noise. The presence of the minimum in the middle of the investigated *k*<sub>*Q*</sub> range indicates the beneficial effect of considering both types of information in producing the estimates, and thus that this approach is better than using only the LKM (small *k*<sub>*Q*</sub>) or only the Raman-PLS model (large *k*<sub>*Q*</sub>). It can be assumed that by further increasing *k*<sub>*Q*</sub> above 1 × 10<sup>6</sup>, an even higher RMSEP would be obtained, since the filter would rely solely on Raman-PLS. In contrast, when *k*<sub>*Q*</sub> tended towards 10, the filter relied solely on the LKM. This procedure was repeated for all rotations, and the resulting rotation-specific model mismatch factors (*k*<sub>*Q*</sub><sup>Rot *i*</sup>) are summarized in Table 5. It can be seen that the values of *k*<sub>*Q*</sub><sup>Rot *i*</sup> were similar for all rotations, thus indicating a robust confidence balancing between Raman-PLS and LKM predictions.
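The role of the mismatch factor as a confidence weight can be made explicit for a single state that is measured directly (*C* = 1). The relations below are a simplified illustration and not part of the original derivation: they assume that the process noise is roughly constant over one sampling interval Δ*t* and that the *Z* terms in Equation (20) can be neglected.

$$P(t\_k^-) \approx P(t\_{k-1}^+) + k\_Q \, Q\_{\text{MC}} \, \Delta t, \qquad K(t\_k) = \frac{P(t\_k^-)}{P(t\_k^-) + R}$$

$$k\_Q \, Q\_{\text{MC}} \, \Delta t \gg R \;\Rightarrow\; K(t\_k) \to 1, \quad \hat{x}(t\_k^+) \to y(t\_k)$$

$$P(t\_k^-) \ll R \;\Rightarrow\; K(t\_k) \to 0, \quad \hat{x}(t\_k^+) \to \hat{x}(t\_k^-)$$

Under these assumptions, a large mismatch factor pushes the estimate towards the Raman-PLS measurement, whereas a small mismatch factor (together with a small propagated covariance) leaves the LKM prediction almost unchanged.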

**Table 5.** Rotation-specific model mismatch factor *k*<sub>*Q*</sub><sup>Rot *i*</sup> for all rotations.


#### *3.4. Extended Kalman Filter Validation*

To externally validate the EKF, the rotational approach explained in Section 2.6 and Table 2 was applied to all rotations. Figure 4A–F shows the final results of Raman-PLS, the LKM and the EKF for ROT1–6, respectively.

The red dots represent the off-line HPLC measurements; the continuous grey and green lines represent the Raman-PLS and LKM predictions, respectively; the black line represents the results of the EKF. As mentioned above, the Raman-PLS predictions showed significant noise, although they captured the trend of the actual BT curve. A significant prediction offset can be seen at the beginning of the breakthrough. Here, the Raman-PLS prediction was rather untrustworthy, which might be explained by difficulties in training the model on samples that did not contain the target molecule or contained only a very small amount of it, close to the limit of detection. On the other hand, the LKM was able to smoothly predict the shape of the BT curve, although it showed a constant and significant error. As also noted before, the LKM tended to anticipate the onset of breakthrough, which appeared sharper in the experiments. As expected, the EKF curve appeared to be a combination of the two curves above, with the beneficial effect of eliminating the mechanistic model offsets on the one hand and smoothing the high noise of the Raman predictions on the other. The EKF was particularly reliable in the region of the incipient breakthrough, where its predictions relied mostly on the LKM results, thus eliminating the negative concentration values predicted by the Raman-PLS. On the other hand, at concentration values of 0.1 mg/mL, the EKF eliminated the offset with respect to the off-line measurements and simultaneously reduced the noise. The same pattern is found in all other rotations, as seen from the data summarized in Table 6, where the errors in terms of RMSEP for the Raman-PLS, LKM and EKF predictions are compared for all considered rotations.


**Table 6.** Prediction errors of the Raman-PLS, the LKM and EKF model for all rotations.

As can be observed in Table 6, the RMSEP<sub>PLS</sub> and RMSEP<sub>LKM</sub> could be reduced by applying the EKF. The only exception to this trend was ROT3, where the LKM was slightly better. In this case, both the Raman-PLS and the LKM largely anticipated the BT time. Nevertheless, the advantage of using an EKF estimator appears very clear from this table, especially in significantly reducing the error of those rotations exhibiting a large Raman-PLS model error.

#### *3.5. Extended Kalman Filter Perturbation*

One of the major drawbacks of the LKM approach is its sensitivity to the input process parameters. This is illustrated in Figure 8 with reference to ROT1, where the effect of a 5% Gaussian distributed perturbation in feed concentration and flow rate, as well as bed porosity on the predictions of the LKM and the EKF, is compared for 200 simulations. The selected perturbed process parameters are representative of the variables that are actually subject to perturbations in real applications.

The solid lines indicate the mean of the predictions of all 200 simulations using the LKM (red) and the EKF (blue), whereas the shaded areas show the corresponding 68% confidence intervals. While the red shaded area is broad and clearly deviates from the reference off-line HPLC measurements, the blue shaded area is rather narrow and distributed around the reference values. It can be concluded that the LKM was strongly affected by perturbations of its input parameters, while this sensitivity was reduced for the EKF predictions due to the influence of the Raman-based estimates.
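The sampling part of such a perturbation study can be sketched in a few lines of Python. The example is a minimal illustration and not the code used in this work: `toy_breakthrough` is a hypothetical logistic curve standing in for the LKM, its coefficients and the nominal flow rate are invented for the example, and the band is computed as the mean plus or minus one standard deviation.

```python
import math
import random
import statistics


def toy_breakthrough(t, eps, c_in, q_flow):
    """Hypothetical logistic breakthrough curve (not the LKM of this work)."""
    t_bt = 60.0 * (1.0 - eps) / (c_in * q_flow)   # assumed breakthrough time
    return c_in / (1.0 + math.exp(-(t - t_bt) / 5.0))


def perturbation_study(model, times, nominal, rel_std=0.05, n_runs=200, seed=1):
    """Mean and standard deviation of the model output under input perturbations."""
    rng = random.Random(seed)
    runs = []
    for _ in range(n_runs):
        # Each input is drawn from a Gaussian with 5% relative standard deviation.
        inputs = {name: rng.gauss(value, rel_std * value)
                  for name, value in nominal.items()}
        runs.append([model(t, **inputs) for t in times])
    mean = [statistics.mean(column) for column in zip(*runs)]
    std = [statistics.stdev(column) for column in zip(*runs)]
    return mean, std


times = [float(t) for t in range(0, 201, 10)]
nominal_inputs = {"eps": 0.36, "c_in": 0.4, "q_flow": 1.0}
mean_curve, std_curve = perturbation_study(toy_breakthrough, times, nominal_inputs)
# For Gaussian scatter, mean +/- one standard deviation covers roughly 68% of runs.
lower = [m - s for m, s in zip(mean_curve, std_curve)]
upper = [m + s for m, s in zip(mean_curve, std_curve)]
```

Running the same sampled inputs through both the mechanistic model and the filter would give the two bands that are compared in the text.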

**Figure 8.** Effect of 5% Gaussian distributed perturbations in feed concentration and flow rate, as well as bed porosity on the predictions of the LKM (mean = dark red; standard deviation interval = light red) and the EKF (mean = dark blue; standard deviation interval = light blue) of ROT1 in 200 simulations.

#### **4. Conclusions**

In this work, an EKF estimator of the monoclonal antibody concentration at the outlet of a chromatographic column was developed. Its predictions were considerably better than those of Raman-PLS or the LKM alone. This was demonstrated for the case of a low-titer harvest (mAb conc. < 0.42 mg/mL), which is typical for perfusion bioreactors. In general, the PLS-based predictive Raman models were able to capture the shape of the breakthrough curves, but the obtained results were too noisy for practical applications such as process control. On the other hand, the mechanistic LKM, properly tuned on an external data set, was able to capture the qualitative shape of the breakthrough curves, but it exhibited deviations that were too large with respect to the off-line reference measurements. The proper application of the EKF requires the preliminary estimation of the parameter *k*<sub>*Q*</sub>, which weights the contributions of the Raman-PLS and the LKM predictions in the final estimate of the filter. By applying the tuned EKF to an external data set, its superiority over Raman-PLS and the LKM became obvious. While the LKM predictions served as a solid backbone for the EKF, the real-time Raman-PLS information updated the state estimates and significantly reduced the LKM offset. Although the LKM showed, in some cases, comparable prediction errors, the perturbation analysis showed the additional benefit of the EKF through its increased robustness with respect to the model input parameter values. It is worth noting that the RMSE of the EKF of 0.026 mg/mL over a range of 0–0.42 mg/mL is very close to the analytical standard deviation of the reference off-line HPLC measurements (0.01 mg/mL). However, it needs to be mentioned that the detection limit of Raman spectroscopy becomes critical at very low protein concentrations. Here, the LKM could of course be of help, and the EKF predictions should rely mostly on these values.
To fully benefit from the LKM, it should be specifically tuned in the region of low concentrations, i.e., at the incipient breakthrough. Nevertheless, the EKF performance reported in this work is already sufficient for implementation in the control of a capture chromatography step, where the range of interest is around 70% of the breakthrough value [39]. It can be concluded that the EKF is a powerful tool for smart sensors and should be considered more often for monitoring and control within bioprocesses. In the future, this approach might be extended to other applications where deterministic knowledge is available, such as the monitoring of protein aggregation [40], crystallization [41] or in-line buffer preparation. At the same time, research on the different components of a Raman analyzer aimed at increasing signal intensities should be continued to further increase the prediction accuracy and reduce the measurement duration.

**Supplementary Materials:** The following are available online at http://www.mdpi.com/2227-9717/7/10/683/s1, Figure S1: Acquired Raman spectra of the calibration set of ROT1.

**Author Contributions:** Conceptualization, F.F.; formal analysis, M.F.L.; investigation, S.G. and S.V.; validation, J.S. and H.B.; supervision, M.M. and A.B.

**Funding:** This research was funded by the KTI (CTI)-Program of the Swiss Economic Ministry Project 19190.2 PFIW-IW.

**Acknowledgments:** We would like to acknowledge Merck-Serono SA for providing the cell line and nutrition media, ChromaCon AG and Kaiser Optical Systems Inc. for the discussions and partnership as well as Dr. Moritz Wolf for producing the harvest.

**Conflicts of Interest:** The authors declare no conflicts of interest.

#### **References**


© 2019 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).

## *Article* **Bioenvironmental Zonal Controlling of Incubated Avian Embryo Using Localised Infrared Heating**

#### **Ali Youssef, Tomas Norton \* and Daniel Berckmans**

Department of Biosystems, Animal and Human Health Engineering Division, M3-BIORES: Measure, Model & Manage of Bioresponses Laboratory, KU Leuven, Kasteelpark Arenberg 30, 3001 Heverlee, Belgium; ali.youssef@kuleuven.be (A.Y.); daniel.berckmans@kuleuven.be (D.B.)

**\*** Correspondence: tomas.norton@kuleuven.be

Received: 25 August 2019; Accepted: 17 September 2019; Published: 23 September 2019

**Abstract:** The main objective of any bioenvironmental controller is to create favourable bioenvironmental conditions around the living system. In the industrial incubation of chicken embryos, it is sometimes difficult to fill large incubators with uniform eggs, which leads to suboptimal results. The ideal incubation solution is a machine that is capable of coping with all sorts of variability in eggs. This can be realised in practice by creating different zones with different environmental conditions within the same machine. In the present study, a two-level controller was designed and implemented to combine convective and radiative heating to incubate eggs. On the higher level, three constrained model-predictive-control (MPC) controllers were developed to regulate the power applied to nine IR radiators divided into three zones, based on continuous feedback of the eggshell temperatures in each zone. On the lower level, a PID controller was used to maintain the air temperature within an experimental incubator at a fixed level (34 °C), lower than the standard incubation temperature. Four full incubation trials were carried out to test and implement the developed zonal controllers. The implementation results showed that the developed controllers were able to follow the reference trajectory defined for each zone. It was possible to keep the eggshell temperatures within the middle region (zone) different from those in the sidelong regions (zones) while the air temperature was kept constant at 34 °C. The average hatching result (HOF) of the four full incubation trials was 84.0% (±0.5). The developed two-level control system is a promising technique for demand-based climate control and for optimising energy use by using multi-objective MPCs with a constraint on total energy consumption.

**Keywords:** bioenvironmental control; model-predictive controller; zonal controlling; dynamic modelling

#### **1. Introduction**

The production of meat and eggs worldwide is increasing because of the growing population and the high demand for animal protein [1]. In the poultry industry, the meat and egg production sectors require large-scale incubation of eggs at a hatchery and a well-controlled and monitored environment [2].

Eggs of different origins and with different pre-incubation treatments are put together in an incubator, resulting in non-uniform hatching times of the different eggs [3]. This non-synchronised time of hatch, referred to as a large hatch window, is negative in terms of animal welfare and post-hatching performance, as the chicks are deprived of food and water (until the rest of the eggs have hatched and the chicks are transferred to the farm) [4–7]. Hatcheries therefore have a strong interest in synchronising the hatch time.

In practice, the incubation process takes place in two different machines, namely, (i) the setter (from incubation day 0 to day 16–18) where the embryos stay during the largest part of their development and (ii) the hatcher (day 17–19 to day 20) where the embryos hatch.

The current trend is towards larger incubators. This, however, has implications for hatchery management. As hen houses are not increasing in size, it becomes more difficult to fill such large incubators with uniform eggs from a single flock and with a single storage time. Therefore, it regularly happens that eggs from two or three different flocks are combined in one machine, which can lead to sub-optimal results. For instance, putting eggs from a 30-week-old flock together with those from a 60-week-old flock in the same incubator will lead to a decrease in the number of hatched chicks, poorer chick quality, higher post-hatch mortality, etc., under present conditions where all eggs in the machine are treated equally [8–10].

The ideal incubation solution is a machine that is capable of coping with all sorts of variability in eggs (e.g., flock age, strain, storage time). This can be realised in practice by creating different zones with different environmental conditions within the same machine. Localised manipulation of the environmental variables inside the incubator (mainly temperature) is a potential solution for creating different zones within the incubator space. The creation of different temperature zones by means of air-heat flow control is a very complex and expensive process [11–14]. The best option, according to many scientists, is radiant heating. In practice, heating systems based on electric radiation (infrared) heaters have been widely used to create localised zones with a higher temperature [11]. Radiation (infrared) heating systems heat surfaces rather than air volumes, which allows them to heat individual zones of eggs; this cannot be achieved with conventional forced convective heating.

During incubation, the thermoregulatory system of the chicken embryo evolves through different stages from a poikilothermic to a homeothermic system [15,16]. The incubated egg is considered a complex, individually different, time-varying and dynamic (CITD) system, as introduced by Berckmans et al. (e.g., [15,17,18]). Hence, the dynamic thermal response of the fertile egg to changes in ambient temperature differs from one day to another during embryonic development [15]. As such, modelling and controlling a biological system such as fertile incubated eggs is more complex than modelling non-living physical systems (such as electric circuits). Most biological responses, including the heat production of incubated eggs, result from a complex network of interactions among many components inside the egg. We have previously implemented multi-zonal control successfully in an empty ventilated chamber using a multi-objective proportional-integral-plus (PIP) controller [12]. However, implementing such a multi-zonal controller in a bio-environment around living systems (e.g., incubated embryos) is a great challenge because of the inherent non-linearity of the system. One of the challenges associated with controlling different thermal zones simultaneously (multi-zonal control) is the controllability of the system in question. The controllability of a system plays a crucial role in many control problems, such as the stabilization of unstable systems by feedback, or optimal control [19,20]. Controllability can be roughly defined as the ability to do whatever we want with the system or, in more technical terms, the ability to transfer the system from any initial state to any desired final state in a finite time [18]. The challenge in controlling multiple thermal zones was to control the temperature in a certain number of zones inside the test chamber with a minimal number of control variables.
Model-based control techniques, such as model-predictive control (MPC), are suitable approaches to handle the inherent nonlinearity of living systems and the interaction between the different thermal zones. MPC is well known and frequently used in industry for the optimal control of time-varying systems with constraints [21]. MPC benefits from simple and intuitive tuning and the ability to control a range of simple and complex phenomena, including systems with time delays, non-minimum phase dynamics, and instability [22]. Additionally, the MPC framework straightforwardly incorporates system constraints and multiple operating conditions, exhibits an intrinsic compensation for dead time, and provides the flexibility to formulate and tailor a control objective [21,22].
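The receding-horizon principle behind MPC can be illustrated with a deliberately small Python sketch. It is not the controller developed in this study: the first-order temperature model, its coefficients, the set point of 37.8 °C and the restriction of the heater power to five discrete levels are all assumptions made for illustration. The constraint on the input is handled simply by enumerating only admissible power levels, whereas practical MPC implementations usually solve a constrained optimisation problem.

```python
import itertools


def simulate_step(temp, power, t_air=34.0, gain=0.05, alpha=0.2):
    """Hypothetical first-order eggshell temperature model (one sampling step).

    The temperature relaxes towards t_air + gain * power; all coefficients
    are assumed values.
    """
    return temp + alpha * (t_air + gain * power - temp)


def mpc_action(temp, reference, levels, horizon=3):
    """Return the first move of the input sequence with the lowest predicted cost."""
    best_cost, best_first = None, levels[0]
    for sequence in itertools.product(levels, repeat=horizon):
        predicted, cost = temp, 0.0
        for power in sequence:
            predicted = simulate_step(predicted, power)
            cost += (predicted - reference) ** 2
        if best_cost is None or cost < best_cost:
            best_cost, best_first = cost, sequence[0]
    return best_first


power_levels = [0.0, 25.0, 50.0, 75.0, 100.0]   # admissible inputs, 0-100 W
reference_temp = 37.8                           # assumed set point in deg C
temperature, applied = 34.0, []
for _ in range(40):
    u = mpc_action(temperature, reference_temp, power_levels)
    applied.append(u)
    temperature = simulate_step(temperature, u)   # plant equals model here
```

At every sampling instant only the first input of the best sequence is applied, and the optimisation is repeated from the newly measured state; this repetition is what provides feedback.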

The main objective of this paper is to investigate the possibility of zonal control of the bio-environment of incubated chicken embryos by combining forced convection and localised infrared heating using an adaptive predictive control approach.

#### **2. Materials and Methods**

#### *2.1. Experimental Setup*

#### 2.1.1. Experimental Incubator

Experiments were carried out in a small-scale experimental incubator (see Figure 1) at the division of Measure, Model and Manage Bio-responses (M3-BIORES), Leuven University (KU Leuven), Belgium. The experimental incubator is composed of a main chamber and an air preparation chamber with inner dimensions of 0.8 × 0.6 × 0.4 m and 0.8 × 0.25 × 0.4 m (l × w × h), respectively (Figure 1). Both chambers have 0.04 m thick surrounding walls made of propylene. The air preparation chamber is equipped with air re-circulation, one inlet opening for fresh air, four re-circulation openings, and a pipe system (Figure 1) with a three-way valve to regulate the flow rates of fresh air (refreshment) and of re-circulated air coming from the main chamber. The fresh air and re-circulated air are mixed within the air preparation chamber and pumped into the main chamber through two inlet pipes (Figure 1). The ratio of the volumetric flow rate of fresh air to that of re-circulated air entering the main chamber is set by the degree of opening of the three-way valve. With the valve 100% open, a maximum ventilation rate of 0.17 m<sup>3</sup> h<sup>−1</sup> (0.885 volume refreshments per hour) was achieved with no air re-circulation. The main chamber is equipped with two heater fans connected to the two inlet pipes, which create forced air ventilation and supply a maximum total heat of 200 W to the system (Figure 1). The prepared air (from the air preparation chamber) is pumped through the heater fans, where it is heated if the heaters are on, before flowing into the main chamber (Figure 1). The air inside the main chamber is exhausted through the four openings on the top sidewall. A portion of the exhausted air (as determined by the controller) is mixed with the fresh inlet air, and the rest is removed from the incubator through the main outlet opening. A mixing fan is positioned in the middle of the main chamber to accelerate the mixing of the inlet air with the air inside the chamber.
The main chamber was designed to hold a standard incubation egg tray above the mixing fan and the air inlet pipes. The egg tray used was a Petersime N.V. standard setter tray (B00568) for chicken eggs, made of polypropylene, with 150 egg places arranged in a 10 × 15 matrix. Air temperature, eggshell temperature and relative humidity inside the test chamber were automatically controlled using the Petersime Focus™ controller. Control actions were calculated based on continuous feedback from a temperature/humidity probe placed inside the main chamber. The controller compares the measurements from these sensors with the set points to decide whether to heat by turning on the heater fans, cool by ventilating with cool air, humidify by supplying steam from the steam generator, or dehumidify by ventilating with dry air. The Petersime Focus™ control box is connected to the data acquisition and control PC, where the incubation process and control actions were programmed using the Petersime Focus Software (v.2.0).
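The stated refreshment rate follows directly from the main-chamber volume, assuming that the inner dimensions given above define the ventilated volume:

$$V = 0.8 \times 0.6 \times 0.4\ \text{m}^3 = 0.192\ \text{m}^3, \qquad \frac{\dot{V}\_{\max}}{V} = \frac{0.17\ \text{m}^3\,\text{h}^{-1}}{0.192\ \text{m}^3} \approx 0.885\ \text{h}^{-1}$$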

**Figure 1.** Schematic representation of the experimental incubator showing the main chamber, air preparation chamber, and the infrared cover.

#### 2.1.2. Infrared Cover

The cover of the experimental incubator was designed to provide infrared heating to the top of the incubated eggs. The infrared cover was equipped with nine ceramic infrared (IR) radiators (lamps) of the type Elstein® IOT/75. These lamps are ceramic infrared dark radiators with a maximum power of 100 W and an operating infrared wavelength range of 3–10 μm. The IR lamps have E27 threads, so they can be screwed into porcelain sockets like light bulbs. Nine porcelain sockets were fixed on three metal bars, with three sockets each (Figure 2). The three metal bars were fixed to the inner side of a wooden frame (inner dimensions 0.8 × 0.60 × 0.20 m) in such a way that the IR lamps faced downward (Figure 2). The distance (*h*) between the IR lamps and the eggs was adjustable between 0.1 and 0.3 m using four adjustment knobs (Figure 2, top view). A plexiglass cover was placed on top of the wooden frame. A rubber washer was placed between the wooden frame and the incubator chamber to prevent air leakage. The nine IR lamps were divided into three groups of three lamps each. The power applied to each of the three groups was individually controlled via a model predictive controller (MPC) designed for this purpose (see Section 2.4).

**Figure 2.** Schematic representation of the infrared cover showing the nine IR lamps (top view graph) and the adjustable distance from the eggs (h) using the adjustment knob (cross section graph).

#### *2.2. Measurements and Data Acquisition*

Measurements from the incubator's built-in air temperature/humidity probe and CO<sub>2</sub> readings were recorded every two minutes and saved, together with the controller set points, on the data acquisition PC. The egg tray was spatially divided (through the short side) into three regions, each consisting of 50 eggs and facing one corresponding group of three IR lamps (Figure 3). In each region, two type-T thermocouples were placed on the equators of two eggs to represent the average eggshell temperature within that region (Figure 3). All sensors were covered with aluminium foil to protect them from overheating caused by direct exposure to radiative heating.

**Figure 3.** The location of the eggshell temperature sensors within each region (I, II and III) in the experimental incubator (left picture) and the temperature sensor placed on the equator of the eggshell covered with an aluminium foil to be protected from the IR heating (right picture).

#### *2.3. Experiments*

#### 2.3.1. Pilot Experiments

In total, 12 pilot experiments were conducted to investigate the thermal profile of the IR lamps (i.e., the main effective diameter of thermal radiation over the egg surface, which is a function of the distance *h* between the lamp and the eggs). These pilot experiments were also used to determine the maximum allowable operating power of the IR lamps that avoids overheating the incubated eggs (to define the controller constraints). To investigate the thermal profile of the IR lamps over the incubated eggs, a thermal camera (VarioCAM®, *InfraTec*) with a thermal resolution of 640 × 480 was used. Two infertile eggs were placed on a small tray and one IR lamp was suspended above them at an adjustable distance (*h*) from the surface of the two eggs.

Two infertile eggs ("equipped eggs") were each equipped with two thermocouples. One thermocouple was fixed to the eggshell on the equator to measure the eggshell temperature (*Tegg*) and the other was inserted 2 cm into the egg, through a hole drilled in the narrow end (Figure 4), to measure the egg core temperature. The equipped eggs (Figure 4) were used to investigate how much the egg's internal temperature differs from the eggshell temperature and how fast heat is transferred from the eggshell to the interior of the egg under IR heating. This made it possible to define the constraints needed to design the IR controller so that the living embryo is not harmed during incubation.

**Figure 4.** The "equipped egg" is an infertile egg, which was equipped with two thermocouples (type-T), one being placed on the equator of the egg, and the other one placed inside the egg through a drilled hole in the narrow end of the egg.

#### 2.3.2. Control and System Identification Experiments

During the course of the research work reported in this paper, a set of five step experiments was conducted to model the dynamic responses of eggshell temperatures to changes in the power (pulse width modulation, or PWM) applied to the IR radiators. The main goal was to develop a model predictive controller (MPC) to regulate, locally, the eggshell temperature of the incubated eggs within the incubator. The power applied to the IR radiators was manipulated by changing the duty cycle (percentage) of the PWM signals. The step experiments were carried out by applying step changes in the PWM duty cycle to the IR radiators over the range 10–20%, while maintaining a fixed distance (*h* = 0.15 m, obtained from the pilot experiments) between the radiators and the eggs and a constant air temperature (*Tair* ≈ 34 °C). Figure 5 shows an example of the applied step changes in the power applied to the IR radiators and the corresponding dynamic response of the eggshell temperature for two incubated eggs in region II.
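
To make the link between duty cycle and applied power concrete, the sketch below converts a PWM duty cycle into the time-averaged electrical power of one lamp group. It assumes an ideal PWM stage in which mean power scales linearly with duty cycle, and it takes the 100 W maximum lamp power quoted in Section 2.1.2 as the rated power; the function name `mean_power_w` is hypothetical and not part of the authors' software.

```python
def mean_power_w(duty_cycle_percent, rated_power_w=100.0, lamps_per_group=3):
    """Time-averaged power of one lamp group, assuming ideal (linear) PWM."""
    if not 0.0 <= duty_cycle_percent <= 100.0:
        raise ValueError("duty cycle must lie between 0 and 100 %")
    return lamps_per_group * rated_power_w * duty_cycle_percent / 100.0


# Duty cycles spanning the range used in the step experiments (10-20 %).
for duty in (10, 15, 20):
    print(f"{duty:>2d} % duty cycle -> {mean_power_w(duty):.0f} W per group of three lamps")
```

Real radiators have thermal inertia and a nonlinear radiated output, so this is only a first-order view of the manipulated variable.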

#### 2.3.3. Full Incubation Trials and Controller Implementation

Four full incubation trials were carried out to implement the developed MPCs. To test the performance of the developed predictive controllers in regulating the eggshell temperatures within the three regions (I, II and III) of the experimental incubator, 150 eggs were incubated, with 50 eggs per region, until hatching. During each incubation trial, reference trajectories with different set points were applied to evaluate the performance of each controller. To test the capability of the developed MPCs to create different thermal zones of eggshell temperature within the IRinc1, the reference trajectories of the three MPCs were defined in such a way that the eggshell temperature within the middle region (region II) was kept different (from day 0 until day 6) from that of the two sidelong regions (regions I and III).

**Figure 5.** Step changes in the pulse width modulation (PWM) duty cycle (%) and the corresponding responses of the eggshell temperatures of two eggs in region II.

#### *2.4. Model Predictive Controller (MPC) and System Identification*

The control strategy proposed in this paper is based on controlling the temperature of the incubated eggs by combining convective and radiative heating mechanisms (Figure 6). The forced convective heating/cooling was controlled using the Petersime Focus™ controller and set to maintain the air temperature (*Tair*) within the main chamber at a lower value (set point ≈ 34 °C) than the standard temperature for egg incubation (*Tstd* = 37.8 °C). The radiative heating, using the IR radiators, was used to bring the eggshell temperature (*Tegg*) to the desired reference value *RTegg*(*k*) at time *k*. A model predictive controller (MPC) was developed with the objective of maintaining the eggshell temperature (*Tegg*) around a predefined reference trajectory (*RTegg*) throughout the incubation period, which was achieved by actively manipulating the power applied to the IR radiators. The input power to the IR radiators was applied through pulse-width-modulated (PWM) signals generated using an NI USB-6251 interface.

**Figure 6.** Block diagram of the control strategy, which combines convective and radiative heating and uses a model predictive controller (MPC) to regulate the eggshell temperature of incubated eggs.

#### 2.4.1. System Identification and Parameter Estimation

A single-input, single-output (SISO) discrete transfer function (DTF) model was used to describe the static and dynamic responses of the eggshell temperature (*Tegg*) to step changes in the power (PWM duty cycle '%') applied to the IR radiators. The model has the following general structure [23]:

$$T\_{egg}(k) = \frac{B(z^{-1})}{A(z^{-1})} u(k-\delta) + \xi(k) \tag{1}$$

where *Tegg*(*k*) is the output (average eggshell temperature per region) at time *k*; *u*(*k*) is the input (PWM duty cycle applied to the IR radiators) at time *k* (min); δ is the pure time delay; and ξ(*k*) is additive noise, assumed to be a zero-mean, serially uncorrelated sequence of random variables with variance σ<sup>2</sup>, accounting for measurement noise, modelling errors and the effects of unmeasured inputs to the process.

The two polynomials *A*(*z*<sup>−1</sup>) and *B*(*z*<sup>−1</sup>) are given by:

$$\begin{aligned} A(z^{-1}) &= 1 + a\_1 z^{-1} + a\_2 z^{-2} + \dots + a\_n z^{-n} \\ B(z^{-1}) &= b\_0 + b\_1 z^{-1} + b\_2 z^{-2} + \dots + b\_m z^{-m} \end{aligned} \tag{2}$$

where *a<sub>i</sub>* and *b<sub>i</sub>* are the model parameters to be estimated; *z*<sup>−*i*</sup> is the backward shift operator, i.e., *z*<sup>−1</sup>*y*(*k*) = *y*(*k* − 1); and *n* and *m* are the orders of the respective polynomials. In the present paper, the simplified refined instrumental variable (SRIV) algorithm was used to identify and estimate the models [24]. First, the appropriate model structure was identified, i.e., the most appropriate values for the triad [*n*, *m*, δ] (see Equation (1)). Two main statistical measures were employed to determine the most appropriate values of this triad: the coefficient of determination *R*<sub>T</sub><sup>2</sup>, based on the response error; and *YIC* (Young's information criterion), which provides a combined measure of model fit and parametric efficiency, with large negative values indicating a model that explains the output data well and yet avoids over-parameterisation [25,26].
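
As an illustration of the first of these measures, the sketch below computes a response-error coefficient of determination from a measured and a simulated output sequence. It assumes the commonly used definition, one minus the ratio of the variance of the simulation error to the variance of the measured output; the function name and the short data series are made up for illustration and are not the authors' data or code.

```python
def coefficient_of_determination(measured, simulated):
    """R_T^2 = 1 - var(response error) / var(measured output)."""
    if len(measured) != len(simulated) or len(measured) < 2:
        raise ValueError("need two equally long sequences with at least 2 samples")
    n = len(measured)
    errors = [y - y_hat for y, y_hat in zip(measured, simulated)]
    mean_y = sum(measured) / n
    mean_e = sum(errors) / n
    var_y = sum((y - mean_y) ** 2 for y in measured) / n
    var_e = sum((e - mean_e) ** 2 for e in errors) / n
    return 1.0 - var_e / var_y


# Illustrative (synthetic) eggshell temperatures in degrees Celsius.
measured = [34.0, 34.8, 35.5, 36.0, 36.3, 36.5]
simulated = [34.0, 34.7, 35.5, 36.1, 36.3, 36.4]
r_t2 = coefficient_of_determination(measured, simulated)
print(f"R_T^2 = {r_t2:.4f}")
```

A value close to 1 indicates that the simulated model output, driven by the input alone, reproduces the measured response well.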

#### 2.4.2. MPC and Cost Function Formulation

The general idea behind any MPC design is to select a sequence of *Nc* future control moves that minimises a cost function *J* (Equation (3)) over a prediction horizon of *Np* sample times [27]. In this paper, a quadratic programming (QP) formulation, with a quadratic objective function and linear constraints, was used. The quadratic programming form leads to smoother control actions than the linear form. The model predictive controller uses the model (Equation (1)) to predict the response of the system based on the past measured inputs and outputs. This predicted output (*T̂egg*) was then used to calculate the optimal input by mathematical optimisation, in order to reduce the difference between this output and the desired one (*RTegg*). This optimal input was calculated by minimizing the following cost function [21]:

$$J(N\_1, N\_p, N\_c) = \sum\_{j=N\_1}^{N\_p} \alpha\_j \left[\hat{T}\_{egg}(k+j|k) - R\_{Tegg}(k+j)\right]^2 + \sum\_{j=1}^{N\_c} \lambda\_j \left[\Delta u(k+j-1)\right]^2 \tag{3}$$

where Δ*u*(*k*) is the change in input (power applied to the IR radiators), *T̂egg*(*k* + *j*|*k*) is the predicted output (eggshell temperature) sequence, *RTegg*(*k*) is the desired value of the output, *N*<sub>1</sub> is the minimum of the prediction horizon, and α and λ are the weighting factors. The block diagram in Figure 7 shows the basic structure of the MPC system designed in the present work to control the eggshell temperature using localised IR heating.

**Figure 7.** Block diagram representing the basic structure of the proposed MPC system to control the eggshell temperature of incubated eggs.

The model (Equation (1)) is the cornerstone of the MPC system and should be robust enough to fully capture the process dynamics [21]. In other words, the identified model should be able to describe the dynamic responses of the eggshell temperature to changes in the control input (i.e., power applied to the IR radiators). There is a wide family of MPC algorithms, each member of which is defined by the choice of the prediction model, the cost function and the procedure for obtaining the control law [21]. In the present paper, the dynamic matrix control (DMC) algorithm was considered. The DMC formulation uses the step response to model the process [21,28]. The process model employed in this formulation is the step response of the eggshell temperature to a step increase in the input, while the disturbance was considered constant along the prediction horizon (*Np*). The procedure to obtain the predictions is as follows. A *step response* model (based on Equation (1)) was employed:

$$T\_{egg}(k) = \sum\_{i=1}^{\infty} g\_i \Delta u(k-i) \tag{4}$$

where *g<sub>i</sub>* is the *i*-th *step response* coefficient.

The predicted eggshell temperature along the prediction horizon (*j* = 1, …, *Np*) is:

$$\hat{T}\_{egg}(k+j|k) = \sum\_{i=1}^{j} g\_i \Delta u(k+j-i) + f(k+j) \tag{5}$$

where *f*(*k* + *j*) is the *free response* of the system, i.e., the part of the prediction that does not depend on the future control moves.

Equation (5) can be written as follows:

$$
\begin{bmatrix}
\hat{T}\_{egg}(k+1|k) \\
\hat{T}\_{egg}(k+2|k) \\
\vdots \\
\hat{T}\_{egg}(k+N\_p|k)
\end{bmatrix} = \begin{bmatrix}
g\_1 & 0 & \dots & 0 \\
g\_2 & g\_1 & \dots & 0 \\
\vdots & \vdots & \ddots & \vdots \\
g\_{N\_p} & g\_{N\_p-1} & \dots & g\_{N\_p-N\_c+1}
\end{bmatrix} \begin{bmatrix}
\Delta u(k) \\
\Delta u(k+1) \\
\vdots \\
\Delta u(k+N\_c-1)
\end{bmatrix} + \begin{bmatrix}
f(k+1) \\
f(k+2) \\
\vdots \\
f(k+N\_p)
\end{bmatrix} \tag{6}
$$

or

$$
\hat{T} = \mathbf{G}\mathbf{u} + \mathbf{f} \tag{7}
$$

where **G** is the *Np* × *Nc* *dynamic matrix*, *T̂* is the *Np*-dimensional vector containing the predicted eggshell temperatures along the prediction horizon, **u** is the *Nc*-dimensional vector of the control inputs (the input changes Δ*u*) and **f** is the *free response* vector. Hence, using Equation (7), and taking unit output weights α and a constant move weight λ, the cost function (3) can be represented in the following form [21]:

$$J = \left(\mathbf{G}\mathbf{u} + \mathbf{f} - \mathbf{R}\right)^{\mathrm{T}} \left(\mathbf{G}\mathbf{u} + \mathbf{f} - \mathbf{R}\right) + \lambda\mathbf{u}^{\mathrm{T}}\mathbf{u} \tag{8}$$

$$J = \frac{1}{2} \mathbf{u}^T \mathbf{H} \mathbf{u} + \mathbf{b}^T \mathbf{u} + \mathbf{f}\_0 \tag{9}$$

with **H** = 2(**G**<sup>T</sup>**G** + λ**I**), **b**<sup>T</sup> = 2(**f** − **R**)<sup>T</sup>**G** and **f**<sub>0</sub> = (**f** − **R**)<sup>T</sup>(**f** − **R**).

The cost function (9) is quadratic and, because **H** is positive definite for λ > 0, its minimum is unique. In the unconstrained case, the optimal input change can be calculated by setting the derivative equal to zero:

$$\frac{dJ}{d\mathbf{u}} = 2(\mathbf{G}^T \mathbf{G} + \lambda \mathbf{I})\mathbf{u} + 2\mathbf{G}^T(\mathbf{f} - \mathbf{R}) = 0 \tag{10}$$

Then the optimal input change is given by:

$$\mathbf{u} = \left(\mathbf{G}^{\mathrm{T}}\mathbf{G} + \lambda\mathbf{I}\right)^{-1}\mathbf{G}^{\mathrm{T}}(\mathbf{R} - \mathbf{f})\tag{11}$$

For the control law, as such, only the first element of vector **u** was implemented.
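
The unconstrained control law of Equation (11) can be sketched in a few lines. The code below is an illustrative, dependency-free implementation and not the authors' MATLAB code: the helper names (`dynamic_matrix`, `solve_linear`, `dmc_moves`), the first-order step response used to generate the coefficients, and the tuning values are all assumptions chosen for the example.

```python
def dynamic_matrix(g, n_p, n_c):
    """Np x Nc lower-triangular Toeplitz matrix G, with g[0] = g_1, g[1] = g_2, ..."""
    return [[g[i - j] if i >= j else 0.0 for j in range(n_c)] for i in range(n_p)]


def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b_i] for row, b_i in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (m[r][n] - sum(m[r][c] * x[c] for c in range(r + 1, n))) / m[r][r]
    return x


def dmc_moves(g, free_response, reference, n_c, lam):
    """Unconstrained DMC law: u = (G^T G + lam I)^-1 G^T (R - f)."""
    n_p = len(free_response)
    G = dynamic_matrix(g, n_p, n_c)
    err = [r - f for r, f in zip(reference, free_response)]
    gtg = [[sum(G[i][p] * G[i][q] for i in range(n_p)) + (lam if p == q else 0.0)
            for q in range(n_c)] for p in range(n_c)]
    gte = [sum(G[i][p] * err[i] for i in range(n_p)) for p in range(n_c)]
    return solve_linear(gtg, gte)


# Hypothetical first-order step response (gain and pole are assumptions).
gain, pole = 0.42, 0.85
n_p, n_c, lam = 10, 3, 0.1
g = [gain * (1.0 - pole ** j) for j in range(1, n_p + 1)]
free_response = [36.0] * n_p   # prediction if no further moves are made
reference = [37.8] * n_p       # desired eggshell temperature
moves = dmc_moves(g, free_response, reference, n_c, lam)
print("first move to apply (in % PWM):", round(moves[0], 3))
```

Only `moves[0]` would be sent to the lamps; the calculation is repeated at the next sample with updated measurements (the receding-horizon principle).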

#### **3. Results**

#### *3.1. Pilot Experiments*

The pilot experiments showed that the optimal thermal profile of the IR lamps was achieved at a distance (*h*) of 0.15 m, which corresponds to an effective diameter of 0.30 m.

Figure 8 shows a comparison between the temperature responses of the equipped eggs under convective heating and radiative heating (at a PWM of 10%). The results (an example is shown in Figure 8) showed that the temperature difference between the egg core and the eggshell vanished (became quasi-zero) faster under radiative heating (5.2 ± 1.5 min) than under convective heating (9.4 ± 1.8 min). The average time constants of the eggshell temperature under radiative and convective heating were 7.6 ± 1.2 min and 7.45 ± 1.23 min, respectively. The steady-state eggshell temperatures at different power levels (PWM) applied to the radiative lamps are shown in Figure 9. A linear regression model was fitted to the relation between the eggshell temperature (°C) and PWM (%), giving a slope of 0.42 °C per % PWM.

**Figure 8.** Step responses of eggshell and egg core temperatures to step up increases in convective-heating (left graphs) and radiative-heating (right graphs), with PWM = 10%, inside the experimental incubator. The 'temperature diff' is the difference between the egg-core and the eggshell temperatures.

**Figure 9.** The resulting steady-state eggshell temperature at different PWM duty cycles (%) applied to the IR lamps at a distance *h* = 0.15 m.
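
A slope such as the one reported above is typically obtained with an ordinary least-squares line fit. The sketch below shows the calculation on synthetic points that were generated from the reported slope of 0.42 and an assumed intercept of 34.0 °C (the intercept is an assumption, not a value reported in the text); the measured data themselves are not reproduced here.

```python
def fit_line(x, y):
    """Ordinary least-squares fit y = intercept + slope * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


pwm_percent = [0, 5, 10, 15, 20]
# Synthetic steady-state temperatures (NOT the measured data): assumed
# intercept of 34.0 degC plus the reported slope of 0.42 degC per % PWM.
temps_c = [34.0 + 0.42 * p for p in pwm_percent]

slope, intercept = fit_line(pwm_percent, temps_c)
print(f"slope = {slope:.2f} degC per % PWM, intercept = {intercept:.1f} degC")
```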

#### *3.2. System Identification and Predictive Model Generation*

It should be stated here that the objective of this stage was to identify an approximate model of the incubated fertile-egg system. The identification step in the present paper was geared towards the main objective of control-oriented model design.

The SRIV algorithm, combined with the *YIC* and *R*<sub>T</sub><sup>2</sup> criteria, suggested that a second-order (number of poles, *n* = 2) DTF model with one minute pure delay (δ = 1 min) was most suitable (i.e., *R*<sub>T</sub><sup>2</sup> = 0.98 ± 0.01 and *YIC* = −13.00 ± 1.45) to describe the dynamic responses of the eggshell temperature (*Tegg*(*k*)) to step changes in the power applied to the IR radiators (*u*(*k*)). More specifically, the SRIV algorithm identified the following general DTF model structure (denoted by the triad [2 2 2], i.e., *n* = 2, *m* = 2 and δ = 2):

$$T\_{egg}(k) = \frac{b\_1 z^{-1} + b\_2 z^{-2}}{1 + a\_1 z^{-1} + a\_2 z^{-2}} u(k - \delta) \tag{12}$$

or in the following difference equation form,

$$T\_{egg}(k) = -a\_1 T\_{egg}(k-1) - a\_2 T\_{egg}(k-2) + b\_1 u(k-\delta-1) + b\_2 u(k-\delta-2) \tag{13}$$

Table 1 shows the average parameter estimates for the identified model structure for the three eggs.

**Table 1.** The resulting model parameter estimates (average ± standard error) obtained from 15 incubated eggs at embryonic day (ED) 15.
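
The difference equation (13) can be simulated directly to inspect the step response implied by a given parameter set. The sketch below does this with zero initial conditions, so the output is the temperature change relative to the baseline. The parameter values are hypothetical (chosen to give stable poles at 0.8 and 0.5 and a steady-state gain of 0.42 °C per % PWM) and are not the estimates of Table 1; `simulate_dtf` is an illustrative name.

```python
def simulate_dtf(a1, a2, b1, b2, delay, u):
    """Simulate T(k) = -a1 T(k-1) - a2 T(k-2) + b1 u(k-delay-1) + b2 u(k-delay-2)."""
    def past(seq, idx):
        # Samples before the start of the record are taken as zero.
        return seq[idx] if idx >= 0 else 0.0

    t = []
    for k in range(len(u)):
        t.append(-a1 * past(t, k - 1) - a2 * past(t, k - 2)
                 + b1 * past(u, k - delay - 1) + b2 * past(u, k - delay - 2))
    return t


# Hypothetical parameters: A(z^-1) = (1 - 0.8 z^-1)(1 - 0.5 z^-1).
a1, a2 = -1.3, 0.4
b1, b2 = 0.03, 0.012
delay = 1

# Steady-state gain of the transfer function: (b1 + b2) / (1 + a1 + a2).
steady_state_gain = (b1 + b2) / (1.0 + a1 + a2)

# Response to a 10 % step in PWM duty cycle applied at k = 0.
step_input = [10.0] * 200
response = simulate_dtf(a1, a2, b1, b2, delay, step_input)
print(f"gain = {steady_state_gain:.2f} degC per %, final rise = {response[-1]:.2f} degC")
```

The first `delay + 1` output samples stay at zero, which makes the pure time delay of the model visible in the simulated response.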


#### *3.3. Model Predictive Control Design*

#### 3.3.1. MPC Cost Function and Constraints

During the incubation process, the thermoregulatory system of the chicken embryo evolves through different stages from a poikilothermic to a homeothermic system. Hence, the thermal response of the fertile egg to changes in ambient temperature is different from one day to another during the embryonic development [15]. Such a complex and sensitive process is subject to a number of limitations and constraints pertaining to the ranges of tolerable eggshell temperatures and acceptable IR operating power range. Table 2 shows the applied constraints on the controlled, *Tegg*, and the manipulated, *u*(*k*), variables during designing the MPC system. Based on the results of pilot experiments, the control signal, *u*(*k*), was constrained within the allowable PWM range between 0 and 20%. In order to prevent large increments in the rate of change in the manipulated variable Δ*u*(*k*), the maximum boundary was set as small as 0.5%. On the other hand, to prevent overheating a fast decrease in the IR heating was allowed by setting the minimum boundary to −2%. Additionally, the controlled variable, *Tegg*(*k*), was constrained within the allowable incubation range between 36 and 40 °C.

**Table 2.** The predefined constraints on both the controlled and manipulated variables, which should be considered for designing the MPC system.


The developed MPC system should anticipate constraint violations and correct them in an appropriate way. Therefore, the minimization of the cost function (9) is subject to constraints on the output (*Tegg*(*k*), eggshell temperature), the input (*u*(*k*), power applied to the IR radiators) and the changes in the input, Δ*u*(*k*), giving the following quadratic programming (QP) formulation:

$$\min\_{\mathbf{u}} \frac{1}{2} \mathbf{u}^T \mathbf{H} \mathbf{u} + \mathbf{b}^T \mathbf{u}, \quad \text{subject to} \begin{cases} 0 \le u(k) \le 20 \\ -2 \le \Delta u(k) \le 0.5 \\ 34 \le T\_{egg}(k) \le 40 \end{cases} \tag{14}$$
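
To show how the input and rate limits in (14) interact, the sketch below clips a single candidate move so that both limits hold at once. This is a simple saturation heuristic and not the QP solution used for the controller: it ignores the output constraint and the coupling between moves along the horizon. The function name `admissible_move` is hypothetical.

```python
def admissible_move(du_candidate, u_prev, u_min=0.0, u_max=20.0,
                    du_min=-2.0, du_max=0.5):
    """Clip one control move so that rate and amplitude limits both hold.

    Assumes u_prev already lies within [u_min, u_max].
    """
    lower = max(du_min, u_min - u_prev)  # most negative move allowed
    upper = min(du_max, u_max - u_prev)  # most positive move allowed
    return min(max(du_candidate, lower), upper)


# Illustrative cases (all values in % PWM duty cycle).
print(admissible_move(3.0, u_prev=10.0))   # limited by the +0.5 rate bound
print(admissible_move(-5.0, u_prev=10.0))  # limited by the -2 rate bound
print(admissible_move(-5.0, u_prev=1.0))   # limited by the lower amplitude bound
```

The asymmetric rate bounds reflect the design intent stated above: heating may increase only slowly, whereas it can be reduced quickly to avoid overheating.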

#### 3.3.2. MPC Simulation

To simulate the closed-loop MPC of the eggshell temperature (controlled variable, *Tegg*(*k*)) using the power applied to the IR radiators (manipulated variable, *u*(*k*)) the following design parameters were defined:


The controller algorithm is initialized by computing the *dynamic matrix* **G** for the system (13), which is defined, based on (6), as follows:

$$\mathbf{G} = \begin{bmatrix} g\_1 & 0 & 0 \\ g\_2 & g\_1 & 0 \\ g\_3 & g\_2 & g\_1 \end{bmatrix} \tag{15}$$

where,

$$\begin{cases} g\_1 = -a\_1 T\_{egg}(k-1) - a\_2 T\_{egg}(k-2) + b\_1 u(k-\delta-1) + b\_2 u(k-\delta-2) \\ g\_2 = -a\_1 T\_{egg}(k) - a\_2 T\_{egg}(k-1) + b\_1 u(k-\delta) + b\_2 u(k-\delta-1) \\ g\_3 = -a\_1 T\_{egg}(k+1) - a\_2 T\_{egg}(k) + b\_1 u(k-\delta+1) + b\_2 u(k-\delta) \end{cases}$$

and computing the control *gain matrix* **K**, which is defined as follows:

$$\mathbf{K} = \left(\mathbf{G}^{\mathsf{T}}\mathbf{G} + \lambda\mathbf{I}\right)^{-1}\mathbf{G}^{\mathsf{T}}$$

A zero-mean (μ = 0) white noise term with a variance (σ<sup>2</sup>) of 0.1 was added to both the output *Tegg*(*k*) and the input *u*(*k*) signals to simulate the measurement and actuator noise (unmeasured disturbances), respectively. A simulation example of the closed-loop response of the designed MPC controller based on the DTF model (12) is depicted in Figure 10. Despite the added disturbances in both input and output signals, the simulated MPC closed-loop response followed the reference signal (set point) efficiently, with a maximum error (*Tegg* − *RTegg*) of ±0.4 °C.

**Figure 10.** A simulation example of the closed-loop step response *Tegg*(*k*) (upper graph) and the control signal *u*(*k*) (lower graph), with unmeasured disturbances in both input and output, using the designed constrained MPC controller based on the general TF model structure (12).

The closed-loop simulation of the developed MPC was implemented in MATLAB on a computer with an Intel® 8-core i7 (2.7 GHz) processor and 16 GB RAM. The average computational time for one iteration on this computer was 15.6 s.

#### 3.3.3. MPC Implementation and Full Incubation Experiments

A two-level zonal control system was developed, combining convective and radiative heating to regulate the eggshell temperatures within three different zones (regions I, II and III) simultaneously. On the higher level, three MPC systems were used to regulate the eggshell temperatures within the three regions. On the lower level, a PID controller (Petersime Focus™) was employed to regulate the incubation air temperature within the experimental incubator.

Four full incubation trials were carried out to implement and tune the developed MPC system to regulate the eggshell temperatures of incubated eggs in three different zones inside the experimental incubator (see Figure 11). To investigate the ability of the controllers to regulate the eggshell temperatures within the three regions, the reference trajectory for region II (middle region) was different from those for the sidelong regions (i.e., regions I and III). The programmed reference trajectories included some extreme set points (e.g., 34 °C), which do not follow the standard eggshell temperature (around 38 °C). Therefore, it was expected that such treatments might affect the final hatching results. Figure 11 shows an example of the implementation results of the developed MPCs to regulate the eggshell temperatures inside the experimental incubator IRin1. By employing different set points, it was possible to create, between days 0 and 6, two thermal zones of incubated eggs (regions I and III) with more or less the same eggshell temperature, sandwiching a third zone with a different eggshell temperature.

**Figure 11.** Example of the implementation results of the developed MPCs to regulate the eggshell temperatures in three different regions, I (upper graphs), II (middle graphs) and III (lower graphs), simultaneously.

Although the controllers were able to follow the reference trajectories in each zone (region), it was noticed (Figure 11) that the eggshell temperature responses in each zone oscillated around the set point values with an average error of ±0.5 °C. This can be attributed to unmeasured disturbances resulting from the interaction between adjacent zones at different temperatures and to the control actions of the PID (Petersime Focus™) controller, which regulates the incubation air temperature around the eggs in the whole incubator. Another possible reason for this oscillating deviation between the actual eggshell temperature and the set point is that, in this study, the MPCs were designed based on a single predictive model (12), under the assumption that the DTF model (12) is representative of the controlled dynamic system (the incubated embryo). In reality, however, as shown in previous studies (e.g., [15]), the incubated embryo is inherently a nonlinear system, which exhibits different dynamics and responses almost every embryonic day. Therefore, we propose for future work an adaptive control approach, in which a linear model (with a fixed model structure as in (12)) is estimated on the fly as the operating conditions change, so that the internal system model of the MPC is updated at each scheduled time period (e.g., each day).

Previous studies (e.g., [31–34]) showed that incubated chicken embryos evolve at an early stage of development (between incubation days ED 5–7) from an ectothermic organism (gaining its required heat from the surrounding environment) to an endothermic organism (producing its own heat). Hence, in industrial incubation practice, most of the energy is used to cool the incubated embryos down to the standard eggshell temperature (~38 °C). Therefore, the proposed two-level control system, which combines convective and radiative heating, is believed to be a promising technique for using energy more efficiently during incubation. This can be achieved by heating only the zones that require it using localized IR heating (i.e., demand-based climate control). Additionally, a multi-objective cost function can be used to optimize energy use by including an extra constraint on total energy consumption.

The results of the four full incubation trials showed that eggs were successfully hatched when convective and radiative heating mechanisms were combined, with an average hatch-of-fertile (HOF = (hatched chicks/number of true fertile eggs) × 100) of 84.0% (±0.5). A breakout of the unhatched eggs was performed [35,36] at the end of each incubation trial. The average breakout results of the unhatched eggs are shown in Figure 12.

**Figure 12.** Average breakout results of the unhatched eggs during the four full incubation trials showing the percentage of hatched, infertile (inf), contaminated (Cont.), early death (Er. Death), middle death (M.D) and malformed embryo.
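
The hatch-of-fertile definition given above translates directly into a small helper. In the sketch below, the function name and the egg counts are hypothetical: the counts were chosen only so that the example reproduces the reported average of 84.0% and are not the trial data.

```python
def hatch_of_fertile(hatched_chicks, fertile_eggs):
    """HOF (%) = (hatched chicks / number of true fertile eggs) x 100."""
    if fertile_eggs <= 0:
        raise ValueError("number of fertile eggs must be positive")
    if not 0 <= hatched_chicks <= fertile_eggs:
        raise ValueError("hatched chicks must lie between 0 and the number of fertile eggs")
    return 100.0 * hatched_chicks / fertile_eggs


# Hypothetical counts for illustration only (not the trial data).
print(f"HOF = {hatch_of_fertile(105, 125):.1f} %")
```

Infertile eggs identified in the breakout are excluded from the denominator, which is why HOF differs from the hatch rate of all eggs set.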

#### **4. Conclusions**

In the present study, a two-level controller was designed and implemented to combine convective and radiative heating to incubate eggs. On the higher level, three constrained MPC controllers were developed to regulate the power applied to nine IR radiators, divided into three zones, based on continuous feedback of the eggshell temperatures in each zone. On the lower level, a PID controller (Petersime Focus™) was used to maintain the air temperature within an experimental incubator at a fixed level (34 °C), lower than the standard incubation temperature. Four full incubation trials were carried out to test and implement the developed zonal controllers. The implementation results showed that the developed controllers were able to follow the reference trajectory defined for each zone. It was possible to keep the eggshell temperatures within the middle region (zone) different from those in the sidelong regions (zones) while the air temperature was kept fixed at 34 °C. The average hatching result (HOF) of the four full incubation trials was 84.0% (±0.5). The developed two-level control system is a promising technique for demand-based climate control and for optimizing energy use by means of multi-objective MPCs with a constraint on total energy consumption.

**Author Contributions:** Conceptualization, A.Y. and D.B.; methodology, A.Y.; software, A.Y.; validation, A.Y.; formal analysis, A.Y.; investigation, A.Y.; resources, D.B. and T.N.; data curation, A.Y.; writing—original draft preparation, A.Y.; writing—review and editing, A.Y. and T.N.; visualization, A.Y.; supervision, A.Y., T.N. and D.B.; project administration, T.N. and D.B.; funding acquisition, D.B.

**Funding:** This research was funded by the Belgian government agency for Innovation by Science and Technology (IWT), grant number IWT110404: 2011–2014.

**Acknowledgments:** The authors gratefully acknowledge the technical and financial support of Petersime N.V. (Belgium) and the support of the Petersime R&D team, namely, Luc Gabrial, Rudy Verhelst, Pascal Garain and Paul Degraeve.

**Conflicts of Interest:** The authors declare no conflict of interest.

#### **References**


© 2019 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).

## *Article* **Model-Based Monitoring of Occupant's Thermal State for Adaptive HVAC Predictive Controlling**

#### **Ali Youssef, Nicolás Caballero and Jean-Marie Aerts \***

Department of Biosystems, Animal and Human Health Engineering Division, M3-BIORES: Measure, Model & Manage of Bioresponses Laboratory, KU Leuven, Kasteelpark Arenberg 30, 3001 Heverlee, Belgium; ali.youssef@kuleuven.be (A.Y.); nicolas.caballero@gmail.com (N.C.)

**\*** Correspondence: jean-marie.aerts@kuleuven.be

Received: 31 August 2019; Accepted: 5 October 2019; Published: 10 October 2019

**Abstract:** Conventional indoor climate design and control approaches are based on static thermal comfort/sensation models that view the building occupants as passive recipients of their thermal environment. Recent advances in wearable sensing technologies and the streaming data they generate are providing a unique opportunity to understand the user's behaviour and to predict future needs. Estimation of thermal comfort is a challenging task given the subjectivity of human perception; this subjectivity is reflected in the statistical nature of comfort models, as well as in the plethora of comfort models available. Additionally, such models rely on variables that are difficult or invasive to measure (e.g., core temperature and metabolic rate), which makes them impractical and undesirable in use. The main goal of this paper was to develop a dynamic model-based monitoring system of the occupant's thermal state and their thermoregulation responses under two different activity levels. In total, 25 participants were subjected to three different environmental temperatures at two different activity levels. The results have shown that a reduced-order (second-order) multi-input single-output discrete-time transfer function (MISO-DTF), including three input variables (wearables), namely, aural temperature, heart rate, and average skin heat-flux, is best to estimate the individual's metabolic rate (non-wearable), with a mean absolute percentage error of 8.7%. A general classification model based on a least squares support vector machine (LS-SVM) technique was developed to predict the individual's thermal sensation. For a seven-class classification problem, the results have shown that the overall model accuracy of the developed classifier is 76% with an F1-score value of 84%. 
The developed LS-SVM classification model for prediction of occupant's thermal sensation can be integrated in the heating, ventilation and air conditioning (HVAC) system to provide an occupant thermal state-based climate controller. In this paper, we introduced an adaptive occupant-based HVAC predictive controller using the developed LS-SVM predictive classification model.

**Keywords:** thermal sensation; thermal comfort; machine-learning; prediction; adaptive controlling

#### **1. Introduction**

Thermal comfort (TC) is an ergonomic aspect determining the satisfaction with the surrounding environment and is defined as "that condition of mind which expresses satisfaction with the thermal environment and is assessed by subjective evaluation" [1]. The effect of thermal environments on occupants might also be assessed in terms of thermal sensation (*TS*), which can be defined as "a conscious feeling commonly graded into the categories cold, cool, slightly cool, neutral, slightly warm, warm, and hot" [1]. Thermal sensation and thermal comfort are both subjective judgments; however, thermal sensation is related to the perception of one's thermal state, and thermal comfort to the evaluation of this perception [2]. The assessment of thermal sensation has been regarded as more reliable and, as such, is often used to estimate thermal comfort [3].

Human thermal sensation mainly depends on the human body temperature (core body temperature), which is a function of sets of comfort factors [4,5]. These comfort factors include indoor environmental factors, namely the mean air temperature around the body, relative air velocity around the body, humidity, and the mean radiant temperature to the body [5]. Additionally, some personal (individual-related) factors are included, namely, metabolic rate or internal heat production in the body, which varies with the activity level, and clothing thermal-physical properties (such as clothing insulation and clothing vapour resistance). It should be mentioned that individual thermal perception also depends on psychological factors, including naturalness (an environment where people tolerate wide changes of the physical environment), expectations and short/long-term experience, which directly affect individuals' perceptions, time of exposure, perceived control and environmental stimulation [6]. The most widely used method for an accurate assessment of *TS* is to ask the individuals directly about their thermal sensation perception [4,5]. Thermal sensation mathematical models have been developed in order to overcome the difficulties of direct enquiry of subjects. The development of such models is mostly dependent on statistical approaches that correlate experimental condition data (i.e., environmental and person-related variables) with thermal sensation votes obtained from human subjects [3,5]. Most of these models (e.g., predicted mean vote, PMV) are static in the sense that they predict the average vote of a large group of people based on the seven-point thermal sensation scale. Instead of individual thermal comfort, they only describe the overall thermal sensation of multiple occupants in a shared thermal environment.

To overcome the disadvantages of static models, adaptive thermal comfort models aim to provide insights in increasing opportunities for personal and responsive control, thermal comfort enhancement, energy consumption reduction and climatically responsive and environmentally responsible building design [7,8]. The idea behind adaptive models is that occupants and individuals are no longer regarded as passive recipients of the thermal environment but, rather, play an active role in creating their own thermal preferences [8]. In addition to regression analysis, thermal sensation prediction can also be seen as a classification problem where various classification algorithms can be implemented [7]. Recently, a number of studies (e.g., [9–13]) have demonstrated the possibility of using machine learning techniques, such as a support vector machine (SVM), to assess and predict human thermal sensation. It can be concluded, based on the published work (see the recent literature review [7]), that classification-based models have performed as well as regression models.

Recent advances in mobile technologies in healthcare, in particular wearable technologies (m-health) and smart clothing, have positively contributed to new possibilities in controlling and monitoring health conditions and human wellbeing in daily life applications. The wearable sensing technologies and their generated streaming data are providing a unique opportunity to understand the user's behaviour and to predict future needs [14]. The generated streaming data are unique due to the personal nature of the wearable devices. However, these streaming data also pose a challenge, as they call for personalised adaptive models that can handle newly arrived personal data.

Current heating, ventilation, and air conditioning (HVAC) control systems can be divided into two types: air temperature regulator (ATR) and thermal comfort regulator (TCR). Most TCR controllers use static models, mainly PMV, as a performance criterion.

This paper aims to develop an adaptive model for real-time monitoring of human thermal states using personal non-intrusive sensing techniques. The developed model should be suitable for real-time adaptive control of indoor climate systems and smart wearable applications.

#### **2. Materials and Methods**

#### *2.1. Experiments and Experimental Setup*

#### 2.1.1. Climate Chambers (Body and Mind Room)

The "*Body and Mind Room*" consists of three climate-controlled chambers (A, B and C) designed and built to investigate the dynamic mental and physiological responses of humans to specific indoor climate conditions. The Body and Mind rooms are experimental facilities at the M3-BIORES laboratory (Animal and Human Health Engineering Division, KU Leuven, Leuven, Belgium). The three rooms are dimensionally identical; however, each room is designed to provide different ranges of climate conditions, as shown in Table 1.

**Table 1.** The different temperature and relative humidity ranges that can be provided by the different Body and Mind rooms (A, B and C).


The three rooms are equipped with axial fans to simulate wind velocities between 2.5 and 50 km·h<sup>−1</sup>.

#### 2.1.2. Experimental Protocol

The experimental protocol used in the present study is designed to investigate the subjects' thermal and physiological responses to three predefined temperatures (*low*, *normal* and *high*) under two levels of physical activity (seated = *low* and cycling = *high*). The three predefined temperatures (*low* = 5 °C, *normal* = 24 °C and *high* = 37 °C) are chosen based on the thermal comfort chart from [15] and the effects on health according to the Wind Chill Chart for cold exposure (National Weather Service of the US) and, for hot temperature exposure, according to [16]. The conducted experiments consist of two phases (Figure 1, upper graph), namely, low activity and high activity phases. During the first experimental phase, the low activity phase, the test subjects (while being seated = low activity) are exposed, for 55 min, to three levels of temperature in the following order: normal, low, high and normal again (Figure 1). During the high activity phase, the test subjects are exposed to 15 min of light physical stress (cycling at 80 W on a fastened racing bicycle). During the course (75 min) of the active phase, each test subject is exposed to the three predefined temperature levels (Figure 1, lower graph). At each temperature level, starting from the normal level (24 °C), the test subjects perform 15 min of cycling (at 80 W power) followed by 4 min of resting (seated). During the course of the conducted experiments, the clothing insulation factor (*Col*) is kept constant at *Col* = 0.34, which accounts for cotton shorts and a T-shirt as standard clothing for all test subjects. The experimental protocol was approved by the SMEC (Sociaal-Maatschappelijke Ethische Commissie) on 16 January 2019 with number G-2018-12-1464.

#### 2.1.3. Test Subjects

In total, 25 healthy participants (six females and 19 males), between the ages of 25 and 35 (average age 26 ± 4.2) years, with average weight and height of 70.90 (±12.70) kg and 1.74 (±0.10) m, respectively, volunteered to perform the aforementioned experimental protocol. All participants were asked before the experiments whether they had been diagnosed with any cardiac problems, diabetes or any other health problems.

#### 2.1.4. Measurements and Gold Standards

During the course of the experiments, participants' heart rate, metabolic rate, average skin temperature, heat flux between the skin and the ambient air and core body temperature represented by the aural temperature are measured continuously. Heart rate monitoring is performed using a Polar H7 ECG (Polar, Kempele, Finland) strap that is placed under the chest, with a sampling frequency of 128 Hz. The metabolic rate, as metabolic equivalent tasks (METs) of each test subject, is calculated based on indirect calorimetry using a MetaMAX 3B (CORTEX-Medical, Leipzig, Germany) spiroergometer sensor. The average skin temperature is calculated based on measurements from three body-placed sensors, namely, scapula, chest and arm (Figure 2). The skin temperature measurements are performed using one Shimmer (Shimmer-Sensing, Dublin, Ireland) temperature sensor and two gSKIN® bodyTEMP patches (greenTEG, Zurich, Switzerland). Two heat flux gSKIN® patches are placed on the chest and the left arm (Figure 2). The skin temperature and heat flux measurements are acquired at a sampling frequency of 1 Hz. All the data measured by the wearable sensors were received and saved on a smartphone. Core body temperature is estimated based on aural temperature measurements, which are performed using an in-ear wireless (Bluetooth) temperature sensor (Cosinuss One, Düsseldorf, Germany) with a sampling rate of 1 Hz. At the end of each applied temperature level during the course of both experimental phases, a thermal sensation questionnaire, based on the ASHRAE seven-point thermal scale, is completed by each test subject.

**Figure 1.** Plots showing the climate chambers' set-point temperatures programmed during the 55 min low activity phase (upper graph) and the 75 min high activity phase (lower graph).

**Figure 2.** Sensor placement. (**A**) Ear channel for aural temperature measurement via the Cosinuss One; (**B**) upper arm where the skin temperature and heat flux are measured with the gSKIN patch; (**C**) middle upper chest where the skin temperature and heat flux are measured with the gSKIN patch; (**D**) lower chest where the heart rate is measured with the Polar H7; (**E**) scapula where skin temperature is measured with the Shimmer sensor; (**F**) mouth and nose where metabolic rate is measured via a MetaMAX-3B spiroergometer sensor.

#### *2.2. Modelling and Classification*

For the purposes of the present study, the measured variables are divided into wearables, which are variables easily measured using wearable sensors, and gold standard (reference) variables, which are not suitable for wearable technologies. The wearables include heart rate *HR*, aural temperature *Ter*, average skin temperature *Tsk*, skin heat flux *qsk* and ambient air temperature *T*∞. On the other hand, the gold standards consist of the core temperature *Tc*, which is derived from the aural temperature [*Tc* = *f*(*Ter*)], metabolic rate *Mr* and personal thermal sensation votes *TS*. The ultimate goal of this work is to develop an adaptive classification model to predict the individual thermal sensation depending solely on the wearables or estimated variables. Hence, both the metabolic rate and the core body temperature are estimated using an online dynamic modelling approach (Figure 3). Then, the individual thermal sensation is predicted using a classification model (classifier) whose inputs are the wearables and the estimated metabolic rate and core body temperature (Figure 3).

**Figure 3.** Overview of the main steps to predict the individual thermal sensation.

#### 2.2.1. Dynamic Modelling

Although the system under study (occupant's thermoregulation) is inherently a non-linear system, the essential perturbation behaviour can often be approximated well by simple linearized transfer function (TF) models [17–19]. For the purposes of the present paper, therefore, the following linear, multi-input, single-output (MISO) discrete time systems are considered to estimate metabolic rate and core body temperature [18]:

$$y(k) = \sum_{r=1}^{R} \frac{B_r(z^{-1})}{A_r(z^{-1})} u_r(k - \delta_r) + \xi(k), \tag{1}$$

where *k* denotes the value of the associated variable at the *k*th sampling instant; *y*(*k*) is the output variable; *u<sub>r</sub>*(*k*), *r* = 1, 2, ..., *R* are the input variables, with δ<sub>*r*</sub> the time delay of the *r*th input; *A<sub>r</sub>*(*z*<sup>−1</sup>) and *B<sub>r</sub>*(*z*<sup>−1</sup>) are appropriately defined polynomials in the backshift operator *z*<sup>−1</sup>, i.e., *z*<sup>−*i*</sup>*y*(*k*) = *y*(*k* − *i*); and ξ(*k*) is additive noise, a serially uncorrelated sequence of random variables with variance σ<sup>2</sup> that accounts for measurement noise. The simplified refined instrumental variable (SRIV) algorithm was utilised in the identification and estimation of the models (model parameters and model structure) [20]. Two main statistical measures were employed to determine the most appropriate model structure, namely, the coefficient of determination *R*<sub>T</sub><sup>2</sup>, based on the response error, and *YIC* (Young's information criterion), which provides a combined measure of model fit and parametric efficiency, with large negative values indicating a model which explains the output data well and yet avoids over-parameterisation [21]. Additionally, the estimation performance of the selected models is evaluated using the mean absolute error (MAE) value.
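To make the structure of Equation (1) concrete, the following is a minimal Python sketch of how a noise-free MISO discrete-time transfer function can be simulated once its polynomials and delays are known. It does not implement the SRIV estimation itself. The function names, the dictionary layout and all coefficient values are illustrative assumptions, and `r_t_squared` assumes the common definition of *R*<sub>T</sub><sup>2</sup> as one minus the ratio of the simulation-error variance to the output variance.

```python
import numpy as np

def simulate_siso(b, a, u, delay=0):
    """Simulate y(k) = [B(z^-1)/A(z^-1)] u(k - delay) with zero initial conditions.

    b = [b0, b1, ...] are the numerator coefficients,
    a = [1, a1, a2, ...] are the denominator coefficients (a[0] is assumed to be 1).
    """
    u = np.asarray(u, dtype=float)
    y = np.zeros(len(u))
    for k in range(len(u)):
        acc = 0.0
        for i, bi in enumerate(b):          # numerator acts on the delayed input
            j = k - delay - i
            if j >= 0:
                acc += bi * u[j]
        for i in range(1, len(a)):          # denominator acts on past outputs
            if k - i >= 0:
                acc -= a[i] * y[k - i]
        y[k] = acc
    return y

def simulate_miso(models, inputs):
    """Sum the responses of R single-input transfer functions (Equation (1) without noise)."""
    return sum(simulate_siso(m["b"], m["a"], u, m["delay"]) for m, u in zip(models, inputs))

def r_t_squared(y, y_hat):
    """Coefficient of determination based on the simulation (response) error."""
    y, y_hat = np.asarray(y, dtype=float), np.asarray(y_hat, dtype=float)
    return 1.0 - np.var(y - y_hat) / np.var(y)
```

For example, `simulate_siso([0.5], [1.0, -0.5], np.ones(50), delay=1)` gives the step response of a hypothetical first-order system with unit steady-state gain and a one-sample delay.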

#### 2.2.2. Classification Model

To predict the individual thermal sensation, a classification model (classifier) is developed and trained based on the wearables and estimated variables (metabolic rate and core body temperature), together with the thermal sensation votes (gold standard). A modified support vector machine (SVM) technique, namely, the least squares support vector machine (LS-SVM), is used to develop and train the thermal sensation classifier [22,23]. SVMs were originally presented as binary classifiers [22] that assign each data instance *x* ∈ ℝ<sup>*d*</sup> to one of two classes described by a class label *y* ∈ {−1, 1} based on the decision boundary that maximises the margin 2/‖**w**‖<sub>2</sub> between the two classes. Generally, a feature map ϕ : ℝ<sup>*d*</sup> → ℝ<sup>*p*</sup> is used to transform the geometric boundary between the two classes to a linear boundary *L* : **w**<sup>*T*</sup>ϕ(*x*) + *b* = 0 in feature space, for some weight vector **w** ∈ ℝ<sup>*p*×1</sup> and *b* ∈ ℝ. The class of each instance can then be found by *y* = *sign*(**w**<sup>*T*</sup>ϕ(*x*) + *b*), where *sign* refers to the sign function. Because the standard SVM requires solving a computationally demanding quadratic programming problem, the least squares support vector machine (LS-SVM) was introduced to overcome this problem. LS-SVM, in contrast with standard SVM, relies on a least squares cost function as follows:

$$\min_{\mathbf{w},\, b,\, e} \; \frac{1}{2} \mathbf{w}^T \mathbf{w} + \gamma \sum_{i=1}^{N} e_i^2, \tag{2}$$

such that *y<sub>i</sub>*[**w**<sup>*T*</sup>ϕ(*x<sub>i</sub>*) + *b*] = 1 − *e<sub>i</sub>*, *i* = 1, 2, ..., *N*, where the *e<sub>i</sub>* are errors such that 1 − *e<sub>i</sub>* is proportional to the signed distance of *x<sub>i</sub>* from the decision boundary, and γ represents the regularisation constant. Because these constraints are equalities, LS-SVM does not require solving a quadratic programming problem; solving a set of linear equations is sufficient to find the optimal solution of the classifier. The LS-SVMlab (Least Squares Support Vector Machine lab) Matlab-based toolbox is used to implement the LS-SVM classification algorithm [22].
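The reduction of LS-SVM training to a set of linear equations can be illustrated with a short Python sketch. It is not the LS-SVMlab implementation used in this study. It assumes the equality-constrained binary LS-SVM formulation of Suykens and Vandewalle with the cost function of Equation (2), an RBF kernel, and hypothetical function names and hyperparameter values (`gamma`, `sigma`). With the dual variables α<sub>*i*</sub>, the optimality conditions give the linear system [0, **y**<sup>*T*</sup>; **y**, Ω + *I*/(2γ)] [*b*; α] = [0; **1**], where Ω<sub>*ij*</sub> = *y<sub>i</sub>y<sub>j</sub>K*(*x<sub>i</sub>*, *x<sub>j</sub>*).

```python
import numpy as np

def rbf_kernel(A, B, sigma=1.0):
    """Gaussian (RBF) kernel matrix between the rows of A and the rows of B."""
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(axis=2)
    return np.exp(-d2 / (2.0 * sigma ** 2))

def lssvm_train(X, y, gamma=10.0, sigma=1.0):
    """Solve the LS-SVM dual linear system for the bias b and the dual weights alpha."""
    n = len(y)
    omega = np.outer(y, y) * rbf_kernel(X, X, sigma)
    M = np.zeros((n + 1, n + 1))
    M[0, 1:] = y
    M[1:, 0] = y
    # The factor 2 matches the cost gamma * sum(e_i^2) in Equation (2).
    M[1:, 1:] = omega + np.eye(n) / (2.0 * gamma)
    rhs = np.concatenate(([0.0], np.ones(n)))
    sol = np.linalg.solve(M, rhs)
    return sol[0], sol[1:]

def lssvm_predict(X_train, y_train, b, alpha, X_new, sigma=1.0):
    """Classify new points with sign(sum_i alpha_i y_i K(x, x_i) + b)."""
    K = rbf_kernel(X_new, X_train, sigma)
    return np.sign(K @ (alpha * y_train) + b)
```

A seven-class problem such as the thermal sensation scale would need a multi-class coding scheme on top of this binary building block (for example, one-versus-one or one-versus-all); which scheme was used in the study is not stated here.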

The performance of the classification model is determined based on accuracy, sensitivity, precision and F1-score as follows:

$$Accuracy = \frac{TP + TN}{TP + TN + FP + FN}, \quad Sensitivity = \frac{TP}{TP + FN},$$

$$Precision = \frac{TP}{TP + FP}, \quad F1\text{-}score = \frac{2 \cdot Precision \cdot Sensitivity}{Precision + Sensitivity},$$

where *TP*, *TN*, *FP* and *FN* are the numbers of true positives, true negatives, false positives and false negatives, respectively.
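As a worked illustration of these four definitions, the following Python sketch computes them from raw counts. The function name and the example counts are hypothetical and are not taken from the study.

```python
def binary_metrics(tp, tn, fp, fn):
    """Accuracy, sensitivity, precision and F1-score from confusion counts."""
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    sensitivity = tp / (tp + fn)
    precision = tp / (tp + fp)
    f1 = 2 * precision * sensitivity / (precision + sensitivity)
    return {
        "accuracy": accuracy,
        "sensitivity": sensitivity,
        "precision": precision,
        "f1": f1,
    }

# Hypothetical counts: 8 true positives, 5 true negatives,
# 2 false positives and 1 false negative.
example = binary_metrics(tp=8, tn=5, fp=2, fn=1)
```

With these counts the accuracy is 13/16, the sensitivity 8/9, the precision 8/10 and the F1-score 16/19, which shows how the F1-score ignores the true negatives that inflate accuracy.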

#### **3. Results and Discussion**

#### *3.1. Dynamic Modelling and Estimation of an Individual's Metabolic Rate*

The average metabolic rate obtained from the 25 participants at the three temperature levels (24, 5 and 37 °C) during low and high activity phases is presented in Table 2.


**Table 2.** Average (±standard deviation) of the measured metabolic rate obtained from the 25 test subjects during low and high activity phases.

Different combinations of input variables (wearables) are tested for the best estimation of an individual's metabolic rate. The SRIV algorithm, combined with the *YIC* and *R*<sub>T</sub><sup>2</sup> selection criteria, suggested that a second-order MISO discrete-time TF with heart rate (*HR*), average skin heat flux (*qsk*) and aural temperature (*Ter*) as input variables is the best (with average *R*<sub>T</sub><sup>2</sup> = 0.89 ± 0.04 and *YIC* = −13.62 ± 2.33) to describe and estimate the dynamic behaviour of the individual's metabolic rate. More specifically, the SRIV algorithm identified the following general MISO discrete-time TF model structure:

$$\hat{M}_r(k) = \begin{bmatrix} \dfrac{B_1(z^{-1})}{A(z^{-1})} & \dfrac{B_2(z^{-1})}{A(z^{-1})} & \dfrac{B_3(z^{-1})}{A(z^{-1})} \end{bmatrix} \begin{bmatrix} T_{er}(k-\delta_1) \\ HR(k-\delta_2) \\ \overline{q}_{sk}(k-\delta_3) \end{bmatrix} + \xi(k), \tag{3}$$

where *M̂<sub>r</sub>*(*k*) is the estimated metabolic rate and the numerator polynomials *B*<sub>1</sub>, *B*<sub>2</sub> and *B*<sub>3</sub> are of the following orders (number of zeros): 2, 3 and 2, respectively. The system delays δ<sub>1</sub>, δ<sub>2</sub> and δ<sub>3</sub> vary from one person to another (inter-personal), with average values of 1.4, 0.20 and 0.21 min, respectively (Table 3). A simulation example of the developed estimation model (Equation (3)) for one test subject during the low activity experimental phase at normal temperature (24 °C) is depicted in Figure 4.

**Table 3.** Average *R*<sub>T</sub><sup>2</sup>, *YIC*, model delays and MAPE for the selected MISO-DTF model to estimate the individual's metabolic rate obtained from the 25 test subjects during low and high activity phases.


**Figure 4.** A simulation example of the developed MISO-DTF model (Equation (3)) to estimate the metabolic rate during the low activity experimental phase at normal temperature (24 °C).

The estimation performance of the selected general MISO-DTF (Equation (3)) is evaluated based on the mean absolute percentage error (MAPE) value:

$$MAPE = \frac{100\%}{N} \sum_{k=1}^{N} \left| \frac{\hat{M}_r(k) - M_r(k)}{M_r(k)} \right|.$$

The results have shown that the developed general model (Table 3) yielded, for all test subjects, a higher average MAPE value (10 ± 2.2%) during the low activity phases than during the high activity phases (7.6 ± 2.6%). The METs (metabolic equivalent tasks) are a measure which accounts for a normalized form of energy expenditure per kilogram of mass. There is a consensus that the measurement of the metabolic rate might vary among individuals (interpersonal) by up to 75% [24] and, for the same subject (intrapersonal), by up to 6% even within the same day from morning to afternoon, though measurements on different days might be comparable on fasted subjects [25]. Hence, a single general estimation model of individual metabolic rate is unlikely to be accurate for every individual. However, the estimation performance of the suggested general MISO model can be enhanced by using the online adaptive form of the SRIV algorithm [26]. The online adaptive (closed-loop) SRIV algorithm provides the possibility to personalise the developed general model by retuning the model parameters and model delays based on the streaming data acquired from the wearable sensors.
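The MAPE used above to compare the low and high activity phases can be computed as in the following Python sketch. The function name and the numbers in the example are illustrative assumptions, not measurements from the study.

```python
import numpy as np

def mape(measured, estimated):
    """Mean absolute percentage error (in %) between measured and estimated series."""
    measured = np.asarray(measured, dtype=float)
    estimated = np.asarray(estimated, dtype=float)
    return 100.0 * np.mean(np.abs((estimated - measured) / measured))

# Hypothetical metabolic rates in METs: relative errors of 10%, 10% and 0%.
example_mape = mape([1.0, 2.0, 4.0], [1.1, 1.8, 4.0])
```

Because each error is divided by the measured value, the same absolute error weighs more heavily when the metabolic rate is low, which is one possible reason why MAPE can be higher during the low activity phase.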

#### *3.2. Classification Model and Prediction of Individual's Thermal Sensation*

In order to give an idea of the relationships between the considered variables, the correlations between all measured variables are calculated and represented as a colour-map of the Pearson correlation coefficient (*r*) matrix, as shown in Figure 5. High positive or negative correlation coefficient values, such as that between heat flux and skin temperature, reflect a strong interaction between these variables, which can affect the feature selection of the classification model.

**Figure 5.** Colour-map of the correlation matrix representing the correlation levels (*r* ∈ [−1, 1]) between the mean values of all measured variables, namely, heart rate (HR), aural (core) temperature, arm temperature, chest temperature, scapula temperature, heat flux from the arm skin (Arm HF), heat flux from the chest skin (Chest HF) and metabolic rate (METs).
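A correlation matrix of the kind shown in Figure 5 can be obtained with a few lines of Python. This is a minimal sketch assuming the data are arranged with one row per observation and one column per measured variable; the function name and the synthetic data are hypothetical.

```python
import numpy as np

def correlation_matrix(data):
    """Pearson correlation matrix; rows are observations, columns are variables."""
    return np.corrcoef(np.asarray(data, dtype=float), rowvar=False)

# Synthetic example: the second column rises with the first, the third falls with it.
x = np.arange(10.0)
r = correlation_matrix(np.column_stack([x, 2.0 * x + 1.0, -x]))
```

Entries of `r` close to +1 or −1 flag pairs of variables that carry largely redundant information, which is worth knowing before selecting features for the classifier.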

The classification model for predicting the individual's thermal sensation is developed, based on the LS-SVM approach, by training the classifier on 80% of the data points, while the rest of the data (20%) is used for testing. The model accuracy, sensitivity, F1-score and overall confusion matrix are computed to evaluate the performance of the developed classifier. The feature space includes all the measured and estimated input variables, namely, *Ter*, *HR*, *qsk*, *Tsk*, Δ*T* = *Ter* − *Tsk* and *M̂<sub>r</sub>*. Additionally, other features are extracted by computing the variance, min, max, root mean square (RMS) and first derivative (d*x*/d*t*, where *x* is the variable) of the aforementioned measured and estimated variables. The age and gender of the test subjects are also included in the feature space.
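The statistical features listed above can be sketched in Python as follows. This is only an illustration: the function name is hypothetical, the window over which the statistics are computed is not specified in the text, and summarising the first derivative by its mean over the window is an assumption made here to obtain a single number per signal.

```python
import numpy as np

def window_features(x, dt=1.0):
    """Variance, min, max, RMS and mean first derivative of one signal window.

    x is a 1-D sequence with at least two samples; dt is the sampling interval.
    """
    x = np.asarray(x, dtype=float)
    dxdt = np.gradient(x, dt)  # finite-difference approximation of dx/dt
    return {
        "variance": float(np.var(x)),
        "min": float(np.min(x)),
        "max": float(np.max(x)),
        "rms": float(np.sqrt(np.mean(x ** 2))),
        "mean_derivative": float(np.mean(dxdt)),  # assumption: derivative summarised by its mean
    }

# Hypothetical window sampled at 1 Hz.
features = window_features([1.0, 2.0, 3.0, 4.0], dt=1.0)
```

Applying such a function to each measured or estimated variable, and concatenating the results, yields one feature vector per thermal sensation vote.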

Figure 6 shows the distribution of the participants' thermal sensation votes at the three environment temperatures (24, 5 and 37 °C). The figure also shows the 'confusion space', i.e., the area in which the reported thermal sensation votes at the three environmental temperatures overlap. This observation shows that thermal perception may overlap even with large differences in the surrounding environmental temperatures. Such a confusion space is a great challenge for any predictive model of thermal sensation, especially for static models such as PMV.

For the sake of the main objective of the present work, the computational cost of the developed algorithm should be low enough to be compatible with wearable technology and online modelling. Hence, a feature selection procedure is employed to obtain the most reduced-dimension model yet with the best error performance. Feature selection here is based on evaluating all possible feature combinations and selecting the combination with best error performance. The feature selection step results in a feature space including 25 features, as shown in Table 4.
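An exhaustive search over feature combinations, as described above, can be sketched in Python as follows. The function name and the `error_fn` callable are hypothetical; in practice `error_fn` would train and validate a classifier on the given subset and return its error. The number of subsets grows as 2<sup>*n*</sup> − 1 with the number of candidate features *n*, so this brute-force form is only practical for small candidate sets.

```python
from itertools import combinations

def exhaustive_feature_search(features, error_fn):
    """Evaluate every non-empty feature subset and return the one with the lowest error.

    Subsets are visited from small to large, and a strict comparison is used,
    so ties are resolved in favour of the smaller (earlier) subset.
    """
    best_subset, best_error = None, float("inf")
    for size in range(1, len(features) + 1):
        for subset in combinations(features, size):
            err = error_fn(subset)
            if err < best_error:
                best_subset, best_error = subset, err
    return best_subset, best_error

# Toy error function (an assumption for illustration only).
def toy_error(subset):
    return 0.1 if {"a", "c"} <= set(subset) else 0.5

best = exhaustive_feature_search(["a", "b", "c"], toy_error)
```

In the toy example, both `("a", "c")` and `("a", "b", "c")` reach the lowest error, and the smaller subset is retained, which mirrors the stated aim of the most reduced-dimension model with the best error performance.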

**Figure 6.** Distribution (distrib.) of the participants' (25) thermal sensation votes at the three environment temperatures (24, 5 and 37 °C) showing the confusion space.

**Table 4.** An overview of the selected feature space including the measured and estimated variables (six variables) and some operations on these variables (× = selected).


A classification model is developed based on the selected 25 input features and trained using the LS-SVM approach. The resulting confusion matrix from the developed classification model based on the selected feature space is shown in Figure 7.

The overall error performance results of the developed classification model are presented in Table 5. For a seven-class classification problem, the developed classifier has shown an overall accuracy of 76% in predicting the individual's thermal sensation. The developed classifier has also shown a high (84%) F1-score, which reflects low numbers of false positives and false negatives.

**Table 5.** Overall error performance (accuracy, sensitivity, precision and F1-score) of the developed general LS-SVM classification model.


**Figure 7.** The resulting confusion matrix from testing the developed LS-SVM classifier. The diagonal represents the correctly-classified data points.

The error performance results of the developed general classification for each class separately are shown in Table 6. The results showed that the error performances of classes 1, 2, 6 and 7 are very low (see Table 6), which can be attributed to the low number (0, 2, 4 and 2, respectively) of obtained votes for these classes, or, in other words, due to the uneven class distribution. Therefore, the overall F1-score is a more reliable and efficient measure of performance than the accuracy in this case.
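Per-class scores of the kind reported in Table 6 can be derived from a multi-class confusion matrix as in the following Python sketch. It assumes that rows hold the true classes and columns the predicted classes; the function name and the example matrix are hypothetical. Classes that receive no votes and no predictions yield undefined (NaN) scores here, which is one way to make the effect of an uneven class distribution explicit.

```python
import numpy as np

def per_class_metrics(cm):
    """Per-class precision, sensitivity and F1-score, plus overall accuracy.

    cm[i, j] is the number of samples of true class i predicted as class j.
    """
    cm = np.asarray(cm, dtype=float)
    tp = np.diag(cm)
    fp = cm.sum(axis=0) - tp   # predicted as the class but belonging to another
    fn = cm.sum(axis=1) - tp   # belonging to the class but predicted as another
    with np.errstate(divide="ignore", invalid="ignore"):
        precision = tp / (tp + fp)
        sensitivity = tp / (tp + fn)
        f1 = 2 * tp / (2 * tp + fp + fn)
    accuracy = tp.sum() / cm.sum()
    return precision, sensitivity, f1, accuracy

# Hypothetical three-class example in which the third class has no samples.
precision, sensitivity, f1, accuracy = per_class_metrics(
    [[5, 1, 0],
     [2, 3, 0],
     [0, 0, 0]]
)
```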

**Table 6.** The error performance (precision, sensitivity and F1-score) of the developed general LS-SVM model for the seven-class classification problem.


SVM has been used in recent studies to assess the occupant's thermal demands [12] and to predict thermal comfort/sensation [11]. In these studies, the results have shown that SVM is able to predict thermal comfort/sensation with accuracy and F1 scores of 76.7% and 84%, respectively. However, these results were only obtained by reducing the seven-class classification problem to a three-class problem. Hence, we believe that reducing the number of classes would improve the performance of our suggested general model. Moreover, based on streaming data obtained from wearable sensor technologies, a personalised adaptive classification model, based on the same extracted features, is expected to enhance the model performance in predicting the individual's thermal sensation. Different related works have investigated the problem of thermal sensation and comfort prediction via machine learning algorithms. Ghahramani et al. [27] applied the hidden Markov model (HMM) technique to the thermal comfort prediction problem with three levels of thermal comfort. The main issue with the dataset used in that study is the class imbalance, which is not tackled by the proposed methodology. A recent study by Lu et al. [7] proposed a personalised model; however, the study investigated only two subjects and developed a dedicated model for each subject.

Testing of the trained model for each test subject was implemented using the Matlab platform on a computer with an Intel® 8 Core i7 (2.7 GHz) processor and 16 GB of RAM. The average computational time for one test run on this computer was 100 ms. The developed model should be trained using data from different populations (e.g., different ages, ethnicities and physical conditions) and different environmental conditions (e.g., broader ranges of temperature and humidity). In future research, the mental status of the participants should be taken into account to investigate the capability of the developed model to cope with different mental conditions (e.g., stress). Moreover, a sensitivity analysis should be performed considering different accuracy and sensitivity levels of the wearable sensors. On the other hand, the LS-SVM approach used in this paper is suitable for online adaptation, with the flexibility to receive new (streaming) data. Hence, the developed model can be used for adaptive real-time model-based monitoring of individual thermal sensation. As such, the developed model is also suitable for different applications, such as the simulation of the human thermal state under different environmental conditions and for building design and control. In this paper, we present the possibility of using this model for adaptive HVAC control systems.

#### *3.3. Adaptive Occupant-Based HVAC Predictive Controller*

In modern buildings, HVAC control systems are commonly designed to ensure parsimonious energy use and cost-effective building operation. This often happens by tuning HVAC control parameters (e.g., set points) to exploit the inherent trade-off between energy consumption and thermal comfort, with the latter acting as a constraint defining a theoretical and practical upper bound on potential energy savings [28–30]. In this paper, we suggest a model-predictive control (MPC) strategy, which is based on continuous feedback of the occupant's thermal state (sensation/comfort), with the main control objective of achieving the occupant's thermal comfort. Energy use can then be included in the controller's cost function as a constraint.

In this section, we introduce an adaptive occupant-based HVAC predictive controller using the developed LS-SVM predictive classification model. The general framework of the proposed HVAC predictive controller approach is depicted in Figure 8. In this paper, we only describe the main methodology for using the LS-SVM predictive classification model of the occupant's thermal state in the generalized predictive control (GPC) scheme, which will be studied and investigated further in future work. GPC is one of the most popular model predictive control (MPC) methods in a broad number of fields. The basic principle of GPC is to calculate a sequence (control horizon, *N<sub>c</sub>*) of future control signals that minimizes a cost function defined over a prediction horizon (*N<sub>p</sub>*) [31]. In general, the GPC algorithm consists of two main subsystems, namely, a prediction model and an optimizer. As shown in Figure 8, two main components, namely, the adaptive algorithm for the LS-SVM predictive model and the GPC algorithm, are suggested.

**Figure 8.** Block diagram of the proposed adaptive HVAC LS-SVM-based model predictive controller.

3.3.1. Adaptive LS-SVM-Based Algorithm for Predicting the Occupant's Thermal Sensation

The availability of the real-time sensors data, from the wearable technologies, has given the possibility of streaming data, which are processed via the proposed online streaming algorithm to adapt the classifier model. This adaptive algorithm is needed to handle the newly arrived data (streaming data) in the training set. Different approaches are available in the literature to handle these challenges, such as incremental learning methods [32], which work on adapting and retuning the parameters of the general model based on the newly collected data. Another approach is the localized learning, which is based on developing a local model for each test point or subset of the test set [33]. The streaming data includes:


#### 3.3.2. The GPC Algorithm

In general, the goal of any controller is to calculate the input (control signal) to the controlled system (plant) such that the output follows the desired reference. However, in the case of the predictive controller, the GPC algorithm aims to find the best-predicted output sequence (using the prediction model) that is the closest to a predefined reference trajectory (desired thermal state in our case). The *prediction model* in our case is the adaptive LS-SVM classification model that predicts the occupant's thermal sensation (*TS*). The algorithm simulates multiple future scenarios (predicted output sequence) in a systematic way using the *optimizer*, then the predicted output *T̂<sub>S</sub>*(*k* + *N<sub>p</sub>*|*k*) is used to calculate the optimal future input (ambient temperature, *T̂<sub>a</sub>*(*k* + *N<sub>c</sub>*|*k*)). The optimizer solves an online optimization problem based on a defined *cost function* (Figure 8), which minimizes the predicted error *ê*(*k* + *N<sub>p</sub>*|*k*) between the reference *R<sub>S</sub>*(*k*) and the predicted output *T̂<sub>S</sub>*(*k* + *N<sub>p</sub>*|*k*). The cost function is given as follows [34]:

$$J(N_1, N_p, N_c) = \sum_{j=N_1}^{N_p} \alpha_j \left[\hat{T}_S(k+j \mid k) - R_S(k+j)\right]^2 + \sum_{j=1}^{N_c} \lambda_j \left[\Delta T_a(k+j-1)\right]^2 \tag{4}$$

where $\Delta T_a(k)$ is the change in the control signal (ambient temperature), $\hat{T}_S(k+j \mid k)$ is the predicted output (thermal sensation) sequence, $R_S(k)$ is the reference (desired level of thermal sensation), $N_1$ is the minimum prediction horizon, $N_p$ is the maximum prediction horizon, $N_c$ is the control horizon, and $\alpha$ and $\lambda$ are the weighting factors. The suggested control signal ($\hat{T}_a$) can be incorporated into the HVAC system by feeding it as a set-point to the HVAC controller. The sequence of predicted thermal sensations is crucial in the optimisation (cost function) of the control (manipulated) variables [35,36]. In the presented approach, we have considered the air temperature as the only manipulated variable; however, more HVAC-related variables can be added to the optimisation step (e.g., ventilation rate and energy consumption).
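To make the structure of Equation (4) concrete, the following minimal Python sketch evaluates the cost for one candidate sequence. It is only an illustration under stated assumptions: the function name `gpc_cost`, the constant weights, and all numeric values are hypothetical and not taken from the paper, and in practice the predicted sensation sequence would come from the LS-SVM prediction model rather than being supplied by hand.

```python
from typing import Sequence


def gpc_cost(ts_pred: Sequence[float], rs_ref: Sequence[float],
             delta_ta: Sequence[float], n1: int = 1,
             alpha: float = 1.0, lam: float = 1.0) -> float:
    """Evaluate Equation (4), assuming constant weights alpha_j = alpha and lambda_j = lam.

    ts_pred[j-1]  : predicted thermal sensation for step k+j, j = 1..Np
    rs_ref[j-1]   : reference thermal sensation for step k+j, j = 1..Np
    delta_ta[j-1] : control increment dT_a(k+j-1), j = 1..Nc
    """
    if len(ts_pred) != len(rs_ref):
        raise ValueError("prediction and reference must have the same length")
    n_p = len(ts_pred)
    # Tracking term: weighted squared error from j = N1 to j = Np.
    tracking = sum(alpha * (ts_pred[j - 1] - rs_ref[j - 1]) ** 2
                   for j in range(n1, n_p + 1))
    # Control-effort term: weighted squared control increments, j = 1..Nc.
    effort = sum(lam * d ** 2 for d in delta_ta)
    return tracking + effort


# Hypothetical candidate: sensation predicted to fall from +2 to 0 (neutral),
# achieved with two temperature decrements of 1.0 and 0.5 degrees.
cost = gpc_cost([2.0, 1.0, 0.0], [0.0, 0.0, 0.0], [-1.0, -0.5], lam=0.5)
print(cost)  # 5.625
```

An optimizer would evaluate this cost for many candidate control sequences and keep the one with the lowest value; a larger `lam` penalises abrupt set-point changes more strongly.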

The proposed approach (Figure 8) depends on features extracted from easily measured variables (*Ter*, *HR*, *qsk* and *Tsk*) that can be collected from already available (off-the-shelf) wearable sensors (e.g., smart watches and on-body smart tags). As such, the proposed approach has an advantage over other models (e.g., [37]), which depend on difficult and/or invasive measurements (core body temperature and metabolic rate) and are consequently not convenient for long-term monitoring. Moreover, the LS-SVM used in this approach is suitable for online prediction of an individual's thermal state with minimal computational cost (100 ms for the prediction of the thermal sensation of each individual).

#### **4. Conclusions**

In this paper, 25 participants were subjected to three different environmental temperatures, namely 5 °C (cold), 20 °C (moderate) and 37 °C (hot), at two different activity levels, namely low (rest) and high (cycling at 80 W power). Metabolic rate, heart rate, average skin temperature (from three different body locations), heat flux and aural temperature were measured continuously during the course of the experiments. The thermal sensation votes were collected from each test subject based on the ASHRAE seven-point questionnaire. The results have shown that a reduced-order (second-order) MISO-DTF including three input variables (wearables), namely aural temperature, heart rate and average heat flux, is best suited to estimate the individual's metabolic rate (non-wearable), with an average MAPE of 8.7%. A general classification model based on the LS-SVM technique was developed to predict the individual's thermal sensation. For a seven-class classification problem, the results have shown that the overall model accuracy and F1-score of the developed classifier are 76% and 84%, respectively. It is suggested in this paper that the overall performance of the model can be enhanced by using a personalised adaptive classification algorithm based on streaming data from wearable sensors. The developed LS-SVM classification model for the prediction of the occupant's thermal sensation can be integrated into the HVAC system to provide an occupant thermal state-based climate controller. In this paper, we introduced an adaptive occupant-based HVAC predictive controller using the developed LS-SVM predictive classification model.

**Author Contributions:** Conceptualisation: A.Y. and J.-M.A.; methodology: A.Y. and N.C.; software: A.Y.; validation: A.Y.; formal analysis: A.Y. and N.C.; investigation: A.Y. and N.C.; resources: J.-M.A.; data curation: A.Y. and N.C.; writing—original draft preparation: A.Y. and N.C.; writing—review and editing: A.Y. and J.-M.A.; visualisation: A.Y.; supervision: J.-M.A. and A.Y.; project administration: J.-M.A.; funding acquisition: J.-M.A.

**Funding:** This research received no external funding.

**Conflicts of Interest:** The authors declare no conflict of interest.

#### **References**


© 2019 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).

## *Review* **Well-Defined Conjugated Macromolecules Based on Oligo(Arylene Ethynylene)s in Sensing** †

## **Agata Krywko-Cendrowska <sup>1</sup>, Dawid Szweda <sup>2</sup> and Roza Szweda <sup>2,</sup>\***


Received: 8 April 2020; Accepted: 29 April 2020; Published: 3 May 2020

**Abstract:** Macromolecules with well-defined structures in terms of molar mass and monomer sequence have become interesting building blocks for modern materials. The precision of the macromolecular structure makes fine-tuning of the properties of the resulting materials possible. Conjugated macromolecules exhibit excellent optoelectronic properties that make them exceptional candidates for sensor construction. Chain length and monomer sequence are particularly important in conjugated systems. The oligomer length, monomer sequence, and structural modification often influence the energy band gap between the highest occupied molecular orbital (HOMO) and the lowest unoccupied molecular orbital (LUMO) of the molecules, which is reflected in their properties. Moreover, the supramolecular aggregation that is often observed in oligo-conjugated systems is usually strongly affected by even minor structural changes, which can be exploited in sensor design. This review discusses examples of well-defined conjugated macromolecules based on the oligo(arylene ethynylene) skeleton used for sensor applications. Only examples of uniform macromolecules are summarized here. The sensing mechanisms and the importance of structural uniformity are discussed.

**Keywords:** well-defined macromolecules; sequence-defined macromolecules; sequence-defined polymers; conjugated oligomers; oligo(arylene ethynylene)s; biosensors; sensors; process monitoring

#### **1. Introduction**

Given the development of precise polymer chemistry, in particular new synthetic methods that allow for monomer sequence control, we are looking for new areas of application of macromolecules in which the sequence matters. To design new applications of macromolecules, the sequence–property relationship has to be well understood. Sensing and process monitoring are expanding areas in which the structure of macromolecules and the sequence of monomers have become crucial parameters for achieving specificity and selectivity in detection and bioprocess monitoring.

The monitoring of bioprocesses in an organism is performed through cascade communication within a network of biomolecules [1]. Biological components often react very sensitively to environmental changes (e.g., pH, temperature, nutrients), which may result in adverse effects on the activity of the cells or the reproducibility of the process. Uniform, sequence-defined macromolecules such as proteins and DNA are key elements in the regulation of biological processes. The well-defined and sequence-controlled structures of those biomolecules enable precise recognition of specific molecular patterns or environmental changes (e.g., temperature, pH) to regulate the cascade events in living organisms. The monomer sequence, for instance, of amino acids in proteins or nucleotides in DNA, determines function and is responsible for the regulation of thousands of cascade events in our body. The whole mammalian immune system relies on a large array of nucleic acid sensors [2,3].

The precision of the primary structure is also very important in man-made (bio)sensor systems based on conjugated macromolecules [4]. In recent years, synthetic strategies leading to well-defined and sequence-defined macromolecules, which enable better control of the properties of the resulting materials, have been developed [5]. The monomer composition [6–19], sequence [4,6,12,19], and oligomer length [6,9] are parameters that influence the energy levels of the HOMO and LUMO in conjugated molecules. The structure determines the energy gap between HOMO and LUMO and the absorption and emission properties of conjugated systems [6,19]. Even a slight modification of the oligomer structure, e.g., in the side [20] or end groups [21,22], can affect its optoelectronic properties and influence sensing.

Conjugated macromolecules, owing to their excellent optoelectronic properties, have found wide use in the construction of different types of sensors [23–25], e.g., for monitoring of enzymatic activity [26], chirality sensing [27], protein sensing [28], material self-healing [29], diagnosis and drug discovery [25], and biosensing and therapeutics [30]. The π-conjugated structure, which allows communication between monomers along the molecular backbone, is responsible for optoelectronic properties that are susceptible to environmental changes. The signal can be revealed in one or more dimensions [24], which provides selectivity of the read-out, e.g., in the chemical nose approach [25,31]. In a multidimensional sensor, multiple sensor elements interact selectively with the sample and produce a distinct response pattern, enabling specific identification of target components.

Among conjugated macromolecules, oligo(arylene ethynylene)s (OArEs) have gained considerable attention due to their excellent optoelectronic properties and emerging applications [32–35]. Current synthesis methods provide access to uniform OArEs of precise length and with full sequence control [6,12,36]. OArEs are an important class of sensory materials [26,30,37–40]. OArE sensors can be used in organic solvents as well as in aqueous environments or as solid films. Their sensing mechanism usually, but not exclusively, takes advantage of their fluorescence properties.

This review aims to summarize the examples of linear oligo(arylene ethynylene)s applied in sensing and to discuss their relevance and perspectives. Our study highlights the importance of adjustment and manipulation of their sequence and length to improve the performance of oligomer-based fluorescent sensors. The OArEs sensors in solution and solid films are discussed in the context of their applications in detection and process monitoring. The examples of oligomers that change properties upon the presence of a particular analyte or environmental changes are also included. Here, we focused on well-defined, uniform oligomers built from at least three arylene units connected via ethynylene linkage.

#### **2. Synthesis of Well-Defined Conjugated Macromolecules**

Synthetic strategies leading to uniform macromolecules that enable control over the monomer sequence have been introduced to polymer science during the last decades [41–43]. The nature-inspired need for defined macromolecular structures was a driving force for new synthetic methodologies that combine current achievements of polymer chemistry, organic synthesis, and biochemistry to yield uniform, sequence-defined macromolecules (Figure 1) [41,42,44,45].

In general, discrete macromolecules can be accessed by iterative synthesis. In principle, the synthesis relies on the stepwise attachment of protected monomers followed by deprotection. These two steps are repeated cyclically until the desired molecule is obtained. When the monomers are equipped with orthogonal functional groups, there is no need for protecting groups [46,47]. Three main approaches can be distinguished: classical solution synthesis (Figure 1b), synthesis using a soluble polymer as a support (Figure 1c), and solid-phase synthesis (Figure 1d) [48]. Syntheses performed by classical organic chemistry methods are associated with cumbersome purification after each step. The use of polymeric supports in a soluble [6] or a solid phase [49] significantly simplifies the purification process.

In iterative synthesis it is very important to achieve high stepwise yields, because the stepwise yield physically limits the accessible length of the macromolecules. The total yield equals the product of the individual step yields and thus decreases dramatically with the number of steps, according to the formula:

$$Y_{total}[\%] = 100 \times \prod_{i=1}^{n} \frac{Y_i}{Y_{i,\text{th}}} \tag{1}$$

where $n$ is the number of steps, $Y_i$ is the actual yield of step $i$ [g], and $Y_{i,\text{th}}$ is the theoretical yield of step $i$ expressed in the same mass units.

The total yield drops dramatically with the number of steps. For example, if every step proceeds with a yield of 95%, the total yield after 50 steps is only about 7.7%.
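The effect of the stepwise yield on the accessible chain length can be checked numerically. The short Python sketch below multiplies fractional step yields as in Equation (1); the function names `total_yield_percent` and `max_steps` and the 50% threshold are illustrative assumptions, not part of the original text.

```python
import math


def total_yield_percent(step_yields):
    """Overall yield [%] of an iterative synthesis from fractional step yields (0 to 1)."""
    return 100.0 * math.prod(step_yields)


def max_steps(step_yield, min_total_yield):
    """Largest number of identical steps that keeps the overall yield at or above
    min_total_yield (both given as fractions between 0 and 1)."""
    return math.floor(math.log(min_total_yield) / math.log(step_yield))


# 50 steps at 95% each, as in the example in the text.
print(round(total_yield_percent([0.95] * 50), 1))  # 7.7

# Hypothetical question: how many 95% steps keep the overall yield above 50%?
print(max_steps(0.95, 0.50))  # 13
```

Raising the stepwise yield from 95% to 99% in the same calculation leaves roughly 60% of the material after 50 steps, which illustrates why near-quantitative couplings are needed for longer oligomers.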

**Figure 1.** (**a**) The main approaches for the synthesis of sequence-defined conjugated macromolecules. Sequence-defined macromolecules can be obtained by multistep-growth synthesis using three main approaches: solid-phase synthesis, synthesis on soluble support, or solution synthesis. The monomers are used in a protected form that demands performance of the deprotection step after each coupling or by chemoselective reactions where monomers are equipped with orthogonal functional groups. The examples are of (**b**) solution synthesis [42], (**c**) synthesis on soluble support [6], and (**d**) solid-phase synthesis [11].

To obtain discrete conjugated oligomers (COs) based on the oligo(arylene ethynylene) skeleton, several stepwise synthetic strategies have been developed [42,50]. Usually, the oligomers are obtained by iterative solution synthesis that involves protected monomers. Because arylene ethynylene oligomers are poorly soluble, they are usually synthesized from monomers functionalized with solubilizing substituents. Oligomers are produced by successive coupling and deprotection steps that are repeated in a cycle until the desired macromolecule is obtained [8,9,11,51–53]. An interesting alternative to classical solution-phase synthesis is the use of soluble or solid supports [7,10], in particular for the synthesis of longer oligomers. Those approaches significantly simplify purification after each step. The synthesis can be facilitated by divergent/convergent approaches [54] or bidirectional growth [55].

For example, the group of M.A.R. Meier established a solution synthesis protocol for sequence-ordered, uniform pentamers built from five different monomers [12]. The oligomers were synthesized by Sonogashira cross-coupling reaction followed by deprotection (Figure 1b). The photophysical properties of the monodisperse oligomers differed only slightly, but the sequence had an impact on their thermal properties and the hydrodynamic volume.

Oligo(arylene ethynylene)s without solubilizing substituents can be obtained by a soluble-support approach. An interesting example of the polystyrene-tethered synthesis of uniform and sequence-defined oligomers without solubilizing substituents was reported by R. Szweda et al. [6]. The use of an ATRP-made, tailored polystyrene support enabled the synthesis of OArEs containing sequence-ordered pyridine and benzene units. The Sonogashira cross-coupling reaction was used for oligomer synthesis (Figure 1c). The soluble-support approach gave access to unsubstituted oligomers that are inaccessible by other methods due to their limited solubility. The UV absorption and fluorescence properties changed with the oligomer length and composition.

The solid-phase synthesis of oligo(phenylene ethynylene)s (OPEs) was reported by the group of J. M. Tour, who demonstrated the synthesis of 16-unit oligomers [9]. The authors applied an iterative divergent/convergent approach on Merrifield resin leading to oligo(2-alkyl-1,4-phenylene ethynylene)s. At each stage of the iteration, the length of the oligomer doubled. Another example of solid-phase synthesis, leading to oligomers of 18 repeating units, was developed by J. Moore [11]. In this approach, monoprotected bisethynylarene and 3-bromo-5-iodo arene monomers bearing orthogonal reactivity were used.
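The advantage of a length-doubling (divergent/convergent) scheme over one-unit-at-a-time growth can be seen by counting iterations. The following Python sketch assumes, purely for illustration, that growth starts from a single repeat unit; the function name `doublings_needed` is hypothetical.

```python
def doublings_needed(target_units: int, start_units: int = 1) -> int:
    """Number of length-doubling iterations needed to reach at least target_units."""
    iterations, length = 0, start_units
    while length < target_units:
        length *= 2          # each divergent/convergent cycle doubles the chain
        iterations += 1
    return iterations


print(doublings_needed(16))  # 4 doubling cycles: 1 -> 2 -> 4 -> 8 -> 16
print(16 - 1)                # 15 couplings if one unit were added per cycle
```

Fewer iterations mean fewer yield-limiting steps, which connects directly to the stepwise-yield limitation expressed in Equation (1).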

Alternatively, discrete conjugated macromolecules can be obtained by purification of oligomer mixtures, e.g., using reverse-phase chromatography [56] or automated flash chromatography [19], as demonstrated by the group of C. J. Hawker. Automated flash chromatography is an efficient separation method and can be used for the separation of thiophene oligomers of 2–14 units on the gram scale [19].

Other methods often employed for the synthesis of COs, such as step-growth polymerization, lead to mixtures of products with dispersity often higher than 1.2 [57]. Disperse polymers consist of a mixture of structures, which has a significant impact on their properties and optical behavior and may influence their sensing performance [58–60]. Moreover, the polymerization technique is not easily reproducible, and in the context of sensor applications reproducibility might be crucial for obtaining materials with the same properties. Even under the best-controlled conditions, the resulting polymer does not consist of uniform macromolecules, and two different polymerization processes are unlikely to yield exactly the same mixture of macromolecules.

Although much effort has been made, all of the existing methods show limitations. The usual problems are low yields, limited molar mass (chain length), small synthesis scale, high synthesis cost, and high time and labor consumption. In the context of applications, those problems challenge chemists to optimize synthetic strategies that will overcome the existing limitations and enable further development of the field.

#### **3. Macromolecular Conjugated Sensor Probes**

Sequence-defined, uniform macromolecules based on linear oligo(arylene ethynylene)s backbone have been used in different types of sensors that can be classified into two main approaches: (i) solution probes, where oligomers are dissolved in a medium (Section 3.1) and (ii) solid-phase probes (Section 3.2), where oligomers are used as films or they are immobilized on a solid support. In the solution phase, we can distinguish two main sensor categories: classical OArEs (Section 3.1.1) and oligo(arylene ethynylene) electrolytes (Section 3.1.2), that possess ionic side-chains or end groups and exhibit water solubility. In this review, the compounds were divided according to the applied approach.

The oligo(arylene ethynylene)-based sensors can also be categorized according to the type of signal used in sensing, e.g., fluorescence, electrochemical, UV-vis, or circular dichroism (CD). Among the typical detection methods, the most popular is based on fluorescence, which takes advantage of the excellent optoelectronic properties of OArEs.

The sensing properties of OArEs strongly depend on their structure and the chosen monomers. It was demonstrated that certain oligomers have specific structures and can selectively respond to the presence of particular species, e.g., in the selective detection of fibrillar and prefibrillar amyloid protein aggregates [61]. Sensing can be influenced even by small structural changes such as oligomer length and monomer composition. In the following sections, examples illustrating the influence of structural factors on detection efficiency are discussed.

The examples of linear, discrete oligo(arylene ethynylene)s applied in sensing for detection and process monitoring are listed in Table 1. The oligomers that change properties upon the presence of a particular analyte or environmental changes are also included.






**Table 1.** *Cont*.

#### *Processes* **2020** , *8*, 539














TAS: transient-absorption spectroscopy; THF: tetrahydrofuran; UV-vis: ultraviolet-visible spectroscopy.

#### *3.1. Oligo(Arylene Ethynylene)s as Sensor Probes in Solution*

#### 3.1.1. Oligo(Arylene Ethynylene) Sensors

Oligo(arylene ethynylene)s have been used as sensors to detect different species e.g., chemicals [65], saccharides [38,62], amino acids [63], explosives [84], ions [66], and physical changes e.g., temperature [67] or solvent polarity [67]. OArEs were also used for process monitoring to track self-healing of polymers [29].

For example, oligo(phenylene ethynylene) foldamers with urea end-groups (Table 1, no. 4) were used for the detection of chiral carboxylic acids, e.g., tartaric acid [64]. The stereodynamic oligomer-carboxylic acid complexes formed chiral structures easily detectable by CD measurements. It was demonstrated that the chiroptical signal could be used for quantitative analyses, providing a fast and simple method for chirality sensing assays (Figure 2).

**Figure 2.** Circular dichroism (CD) spectra of the mixture obtained with oligo(phenylene ethynylene)s (Table 1, no. 4), Et3N, and samples of tartaric acid and linear relationship between the CD amplitude at 370 nm and the sample enantiomeric excess. Reprinted from [64] with permission from Elsevier.

Conjugated oligomers functionalized with boronic acid have been used as sensors to detect different saccharides (D-fructose, D-galactose, D-ribose, and D-glucose) in potassium phosphate buffer/DMSO (99/1, *v*/*v*) [38]. Upon addition of saccharides, significant fluorescence enhancement was observed, and the response differed depending on the saccharide. However, it was shown that the fluorescence response can be observed only for well-designed oligomer structures. The oligo(phenylene ethynylene)s with –OC10H21 side chains (Table 1, no. 1) and boronic acid groups attached via a triazole linker were sensitive exclusively to the presence of fructose. It was found that the fluorescence response depends on supramolecular interactions between sensor and analyte molecules, which are highly structure dependent. This study highlighted the need for a specific design of the oligomer fluorophore in the development of effective saccharide sensors.

Ortho-oligo(phenylene ethynylene)s (Table 1, no. 7, 8) were used as circularly polarized luminescence probes for the detection of silver ions [66]. The enantiopure helical core was prepared by a new macrocyclization reaction. The combination of such an o-OPE helical skeleton and pyrene reporter units led to two characteristic circularly polarized emission features. The intensity of the bands scales linearly with the silver(I) concentration.
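A linear signal-concentration relationship of this kind is what makes a simple calibration curve usable for quantification. As a generic sketch (not the procedure used in [66]), the Python code below fits a straight line to calibration standards by ordinary least squares and inverts it for an unknown sample. All numbers are made-up placeholders in arbitrary units, not measured data, and the function names are hypothetical.

```python
def fit_line(x, y):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def concentration_from_signal(signal, slope, intercept):
    """Invert the calibration line to estimate the analyte concentration."""
    return (signal - intercept) / slope


# Placeholder calibration standards (arbitrary units), NOT data from the source.
conc = [0.0, 1.0, 2.0, 3.0, 4.0]
signal = [0.5, 2.5, 4.5, 6.5, 8.5]  # constructed to be exactly linear

slope, intercept = fit_line(conc, signal)
unknown = concentration_from_signal(5.5, slope, intercept)
print(slope, intercept, unknown)
```

The inversion is only meaningful inside the concentration range covered by the standards, where linearity has actually been verified.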

Interestingly, temperature change, a physical process, can also be followed by oligo(arylene ethynylene)s [67]. The acetylene-bridged pentiptycene oligomers (n = 2, 3, and 4) (Table 1, no. 9) and a phenylene−pentiptycene−phenylene three-ring system (Table 1, no. 10) were evaluated as fluorescent temperature sensors. The trimer and tetramer showed a unique response to temperature and solvent polarity driven by intramolecular interactions (Figure 3). It was found that the twisted region of their rotational potentials occurs at a local energy minimum, and the distribution of rotational conformers is sensitive to temperature and solvent polarity. Twisting of the π-conjugated backbones was reflected in fluorescence spectra blue-shifted by 40 nm. It was demonstrated that upon temperature change in the range between 80 and 320 K, the fluorescence emission of the acetylene-bridged pentiptycene tetramer shifted significantly. This property can be used for the development of low-temperature sensors.

**Figure 3.** The change of conformation caused by a temperature difference influences the fluorescence properties. Temperature dependence of fluorescence spectra (λex = 303 nm) of the tetramer in methyltetrahydrofuran at 20- and 10-K intervals, respectively, between 80 and 320 K. Reprinted with permission from [67]. Copyright 2006 American Chemical Society.

Recently it was demonstrated that oligo(arylene ethynylene)s (Table 1, no. 11) can be used as a fluorescent probe for monitoring of the self-healing process [29]. The OPE incorporated into a mussel-inspired scratch-healing polymer network helped to determine detailed depth- and time-dependent self-healing efficiency using confocal laser scanning microscopy (Figure 4). The damage of the network resulted in decreased fluorescence emission of polymer within the scratch. The mobility of the fluorescence marker is connected with the plasticity of the polymeric material, thus during scratch refilling, no independent migration of dye within the polymeric material was detected.

#### 3.1.2. Oligo(Arylene Ethynylene) Electrolytes

Oligo(arylene ethynylene) electrolytes are very attractive macromolecules for application in sensing. As sensor probes, they combine the excellent fluorescence properties of a conjugated aryl-alkyne system with the advantages of electrolytes, especially water solubility [35]. Due to the presence of ionic groups, these oligomers are very sensitive to environmental changes, e.g., ionic strength, pH, and the presence of ions or electrolytes. The charged pendant groups can induce electrostatic interactions with oppositely charged (macro)molecules, which are reflected in variations of the fluorescence properties [37]. Moreover, the charges distributed along the oligomer molecules affect their aggregation; thus, they exhibit a strong fluorescence response to alterations of aggregate structure and to conformational changes. Those changes, caused by the presence of individual charged molecules, may produce a unique response in the photophysical properties of the conjugated chromophore. The resulting fluorescence quenching or enhanced emission can indicate the presence of ions [70], oxygen [71], surfactants [70,72,75,79,92], detergents [76,79], MV<sup>2+</sup> ions [68], solvent polarity [80], the anionic biopolymer carboxymethylcellulose [77,78], biomolecules [93], and bacteria [69]. The oligomers were also used for process monitoring, e.g., of amyloid formation [61,73], enzymatic activity [74], and photochemical reaction processes [71].

**Figure 4.** Time- and depth-dependent confocal laser scanning microscopy (CLSM) measurements in fluorescence mode (λex = 405 nm) with the fluorescence channel (λem = 406−510 nm) monitoring thermally triggered self-healing procedure, in particular the virgin damaged cross-linked copolymer film, after 1, 2, and 8 h of thermal treatment at 60 °C: (red) homogeneous area within the scratch, (yellow) heterogeneous area covering the majority of the analyzed defect, (orange) specific area with residual removed film material, (blue) intact and undamaged reference area for each measurement, and (green) photo-bleached marker area. Reprinted with permission from [29]. Copyright 2018 American Chemical Society.

For instance, oligo(p-phenylene ethynylene) electrolytes (OPEs) were successfully applied to track amyloid formation [61,73]. Oligomers of different lengths, OPEn (n = 1, 2, and 3), with ester terminal moieties and positively charged –(CH2)3N(CH3)3<sup>+</sup> pendant groups (Table 1, no. 21), and OPE1 negatively charged with pendant –(CH2)3SO3<sup>−</sup> groups (Table 1, no. 22), were evaluated as probes for monitoring of the fibrillation process (Figure 5) [61,73]. The carboxyester terminal groups of the OPEs cause strong fluorescence quenching due to strong interactions with the solvent; in a water-poor environment, on the other hand, the oligomers show strong fluorescence emission. These environment-dependent fluorescence properties were used for the sensor design. It was demonstrated that positively charged OPEs used in a 10:1 (protein:OPE) molar ratio are effective molecular taggants for selective sensing of amyloid fibrils of the model protein HEWL. Upon fibril formation, OPEs form clusters with the fibrils in which the carboxyester terminal groups are isolated from water. In this non-aqueous environment, they form superluminescent chiral J aggregates [94], and significant fluorescence enhancement is observed (Figure 5). It was found that, due to energy transfer, excitation at 280 nm, characteristic for HEWL, results in OPE emission only in solutions containing both OPEs and HEWL amyloids, which indicates amyloid formation.

**Figure 5.** Oligo(p-phenylene ethynylene) (OPE) electrolytes form specific chiral constructs together with amyloid fibrils. The construct exhibits enhanced fluorescence and a unique CD signal. Circular dichroism spectra of OPE n = 3 (10 μM) in phosphate buffer with hen egg white lysozyme (HEWL) monomer (black trace) and with HEWL (10 μM) amyloid (red trace). Emission spectra of OPE n = 3 in phosphate buffer (PB, pH 7.4, 10 mM) alone (black long dashed line), with HEWL monomers (red short dashed line), and with HEWL amyloids (blue solid line); concentration: 500 nM, protein concentration: 5 μM monomer basis/0.25 mg/mL. Reprinted with permission from [73]. Copyright 2015 American Chemical Society.

p-Phenylene ethynylene oligomers can also be used for monitoring of enzymatic processes. Complexes of oligomers (Table 1, no. 24, 25) with enzyme substrates were successfully used to follow the activity and inhibition of two biomarkers: phospholipase, which indicates heart and circulatory disease, and acetylcholinesterase, which is relevant for Alzheimer's diagnosis (Figure 6) [74]. In a buffer solution, the oligomers form complexes with oppositely charged substrates, e.g., 1,2-dilauroyl-sn-glycero-3-phosphoglycerol (DLPG) and lauroyl choline (LaCh). In the presence of phospholipases, the DLPG-oligomer (Table 1, no. 24) complex undergoes transformation due to cleavage of the DLPG phosphate bond, which results in swift fluorescence quenching. The aggregates of an anionic oligomer (Table 1, no. 25) with lauroyl choline were used as a sensor to detect the activity and inhibition of acetylcholinesterase.

**Figure 6.** (**A**) Fluorescence of the oligomer/1,2-dilauroyl-sn-glycero-3-phosphoglycerol (DLPG) (Table 1, no. 24) aggregates over the course of Phospholipase A1 activity with 1.4 μM OPE and a DLPG concentration of 7.27 μM, with enzyme added ranging from 0.5 to 5 mU of Phospholipase A1. (**B**) A concentration of 1.4 μM of +2C with DLPG at a series of concentrations from 10.6 to 35.6 μM (7.5−25.4 DLPG:OPE ratio), followed by the addition of 4 mU of Phospholipase A1. (**C**) Fluorescence of the oligomer/DLPG aggregates over the course of Phospholipase A2 activity with 1.4 μM oligomer and a DLPG concentration of 7.27 μM, with enzyme added ranging from 0.5 to 5 mU of Phospholipase A2. (**D**) A concentration of 1.4 μM of +2C with DLPG at a series of concentrations from 2.37 to 17.8 μM (1.7−12.7 DLPG:oligomer ratio), followed by addition of 40 mU of Phospholipase A2. t = −1 s is the time of enzyme addition. Wavelength of excitation is 375 nm, emission is 440 nm. Reprinted with permission from [74]. Copyright 2015 American Chemical Society.

OArEs (Table 1, no. 18) can be used to monitor chemical processes, e.g., photolysis [71]. For example, the photo-induced degradation of the oligomer (Table 1, no. 18) occurred by three main routes: photoprotonation of the triple bond followed by the addition of water, addition of singlet oxygen across the triple bond, and cleavage of the quaternary ammonium side-chains. The degradation led to the formation of different products depending on the reaction atmosphere (argon or oxygen). All those structural changes were reflected in the fluorescence properties, indicating the rate and mechanism of the degradation. Depending on whether the process was performed in the presence or absence of oxygen, products with different fluorescence properties were formed. This dependence of the fluorescence properties on the reaction atmosphere led to the development of an oxygen-sensing methodology based on the fluorescence read-out of OArE photo-degradation.

Constructs of oligo(phenylene ethynylene) electrolytes and gold nanoparticles can be used for selective bacteria identification using the "chemical nose" sensing concept (Table 1, no. 14) [69]. In solution, positively charged gold nanoparticles form complexes with negatively charged oligo(phenylene ethynylene)s, and the oligomer fluorescence is quenched. In the presence of bacteria, some OPEs are released into the solution, and fluorescence is restored as a consequence of the presence of free oligomers (Figure 7). The branched oligo(ethylene glycol) side chain of the applied oligomer reduces non-specific interaction between oligomer and bacteria. The extent of oligomer displacement differs depending on the bacteria, which results in a selective fluorescence response.

**Figure 7.** (**a**) Fluorescence intensity patterns of nanoparticle–oligomer (Table 1, no. 14) constructs in the presence of various bacteria strains. (**b**) Schematic presentation of the sensor design. Bacteria interact with the gold nanoparticle–oligomer construct, and as oligomer macromolecules are released into the solution, fluorescence enhancement is observed. For each bacterium, the interactions with the nanoparticles are unique. In the figure, columns represent bacteria of different types, and rows represent the oligomer–nanoparticle constructs. Reprinted with permission from [69]. Copyright 2008 John Wiley and Sons.

The π-conjugated oligo(phenylene ethynylene) backbones with two negatively charged –CH2COO<sup>−</sup> groups on each repeating unit and lengths of n = 5, 7, and 9 (Table 1, no. 13) were used to detect Ca<sup>2+</sup> ions and ionic quenching agents [68]. In the presence of bivalent calcium ions, the oligomers aggregated, causing a fluorescence shift. The shift depended on oligomer length: for the shorter oligomers (n = 5, 7), the effects were less pronounced than for the longer one (n = 9). This shift can be explained by the planarization of the phenylene ethynylene backbone and the formation of "excimer-like" excited states that are not observed in the absence of Ca<sup>2+</sup> ions. The oligomers were also evaluated for fluorescence quenching in the presence of methyl viologen and 3,3′-diethyloxacarbocyanine, two well-known fluorescence quenching agents. It was found that the quenching efficiency depends on oligomer length. Taken together, elongation of the oligomer increased the ionic charge of the macromolecules, which, in the presence of counterions, favors the formation of ordered, backbone-overlapped aggregates.

For example, while the fluorescence of cationic OPEs with amine end groups is quenched in water, the addition of a small amount of oppositely charged detergent, sodium dodecyl sulfate (SDS), causes a significant increase in the OPE fluorescence due to the formation of a complex (Table 1, no. 16, 17) [35]. These OPE-detergent complexes exhibited antimicrobial properties [95], which, in addition to the fluorescence emission during their formation, can be utilized for the development of multifunctional biosensors.

#### *3.2. Oligo(Arylene Ethynylene) Sensor Films*

Oligo(arylene ethynylene) films consist of packed macromolecules with a π-conjugated backbone and thus exhibit high fluorescence emission, which can be altered upon binding of an analyte molecule. OArE films are excellent materials for the detection of amino acids [63], bacteria [81,82], explosives [35,83,84,87], pH [86], inorganic acids [85], gas [88], digital information [89], or chemicals [90,91]. Usually, the detection of an analyte is based on fluorescence changes, i.e., enhancement or quenching upon binding of the sensed molecule. OArE films can be obtained by covalent immobilization, e.g., by reaction between an aldehyde and an amine-functionalized surface [85,87] or between a triethoxysilane group and glass [86], by electrostatic binding [96], or by drop-casting [90].

For instance, oligo(phenylene ethynylene)s bearing 4-aminophenyl-D-mannopyranoside groups (Table 1, no. 36, 37) in combination with laser scanning confocal microscopy have been used for the detection of *E. coli* bacteria [81]. Oligomer probes with two mannose groups enable discrimination between the uropathogenic and the non-uropathogenic *E. coli* mutant. Moreover, films of the oligomer on an aluminum support together with SPR allowed quantitative biosensing of uropathogenic *E. coli*, achieving a LOD of 10<sup>4</sup> CFU/mL. These findings point towards robust biochips for detecting bacteria.

For example, oligo(p-phenylene ethynylene) (Table 1, no. 44, 45) films have been examined for sensing common explosive nitroaromatic compounds (NACs), i.e., 2,4,6-trinitrophenol (picric acid, PA), 2,4,6-trinitrotoluene (TNT), 2,4-dinitrotoluene (DNT), and nitrobenzene (NB) [87]. Interestingly, the film with cholesterol side groups (Table 1, no. 45) exhibited sensitivity to changes in the water/THF solvent ratio (Figure 8a). In water, the film adopted a compact structure, causing a decrease in fluorescence intensity, whereas in THF the chains attained an extended conformation. In the presence of NAC molecules, complete fluorescence quenching was observed as a result of the formation of nonfluorescent OPE-NAC complexes. This effect was not disturbed by the presence of other compounds, including methanol, THF, toluene, dichloromethane, ammonia, HCl, NaOH, NaCl, copper salts, or seawater (Figure 8b). The experiments revealed that the cholesterol chains incorporated in the oligomer structure increased the sensitivity of the films towards the detected molecules by at least one order of magnitude. Thus, films of oligo(p-phenylene ethynylene) with cholesterol groups can be used as effective sensors for explosives.

Surface-immobilized monolayers of short, length-defined oligo(p-phenylene ethynylene) oligomers end-capped by fluorescein (Table 1, no. 43) have been used as narrow-range threshold fluorescent pH indicators (Figure 9a) [86]. At low pH, fluorescein was in its lactone form and the observed emission mostly originated from the oligomer. Upon a pH increase, fluorescein changes to its anionic form, which favors electron delocalization with a corresponding decrease in the HOMO-LUMO gap. A smaller energy gap facilitates exciton migration, which results in fluorescence enhancement. Moreover, an increase in pH causes a bathochromic shift of the emission due to energy transfer from the oligomer backbone to fluorescein (Figure 9b). This unique pH-dependent response was observed only for oligomer-fluorescein dyad structures immobilized on the surface. The dyad structure was crucial for sensor selectivity: experiments performed with immobilized fluorescein alone did not reveal such a selective sensor response. For comparison, the same experiment was performed for dyad oligomers in solution; however, the fluorescence signal was much weaker and the pH range of the response was significantly narrower (pH 8 to 10).

**Figure 8.** (**a**) The two states adopted by the oligomers containing cholesterol side chains (Table 1, no. 45) in water and tetrahydrofuran (THF), respectively. THF is a good solvent for the oligomer and its cholesterol side chains, and the macromolecules in the film attain an extended conformation. In contrast, water is a poor solvent for both the oligomer backbone and the side chains, thus the oligomer film is collapsed. Plots of the ratios of Ix/Iy of a given fluorescent film (Film 1: oligo(p-phenyleneethynylene) with cholesterol moieties; Film 2: pristine oligo(p-phenyleneethynylene)) against the compositions of the mixture solvents in which the fluorescence measurements were conducted (for Film 1 (Table 1, no. 45), λem = 445 nm, λex = 500 nm; for Film 2 (Table 1, no. 44), λem = 374 nm, λex = 420 nm). (**b**) Quenching efficiencies of various common explosive nitroaromatic compounds (NACs) on the fluorescence emission of Film 1 and Film 2 in water and THF, respectively, and the interferences of commonly found interferents in the sensing of Film 1 (concentrations of NACs and interferents are 50 mM). Adapted from [87] with permission from The Royal Society of Chemistry.

**Figure 9.** (**a**) Structure of the sensor and its assembly into a surface-immobilized monolayer. (**b**) The general principle of generating pH-dependent fluorescent response. (**c**) pH-dependent absorption (left) and fluorescence (right) spectra of monolayer fluorescein-oligomer film. Reprinted with permission from [86]. Copyright 2013 John Wiley and Sons.

Additionally, immobilized oligo(p-phenylene ethynylene)s can act as chemosensors for the detection of polar species in an aprotic solvent. For example, a self-assembled monolayer of oligo(p-phenylene ethynylene) with cholic acid moieties (Table 1, no. 42), immobilized on a glass slide, has been used as a sensor for trace amounts of inorganic acids, such as HCl, H2SO4, HNO3, and H3PO4, in acetone [85]. The presence of the cholic acid units induced the formation of hydrophilic pockets in the upper part of the layer (Figure 10a). These pockets contain imino groups and were able to trap ions that interacted with them. Based on comparative studies performed for different acids, it was revealed that for the anaerobic acids, the quenching efficiency depended on the size of the molecule and on hydrogen bonds between the anions (Figure 10b). In other words, to observe efficient quenching, the acid ions had to fit the cavity of the hydrophilic pocket. When a chloride anion was trapped in the pocket, fluorescence quenching was observed, which originated from the protonation of the imino group next to the phenylene ethynylene segment.

**Figure 10.** (**a**) Illustration of immobilized oligomer (Table 1, no. 42) conformations in different media representing a good solvent (acetone, for example) or a poor solvent (water, for example). In a good solvent, the hydrophilic pocket is formed as an upper layer of the film. (**b**) Quenching efficiencies of various acids on the fluorescence emission of Film 1, oligo(p-phenylene ethynylene) with cholic acid side chains (Table 1, no. 42), and Film 2, oligo(p-phenylene ethynylene) (Table 1, no. 41), in water and acetone, respectively (concentrations of acids are 20 μM). Reprinted with permission from [85]. Copyright 2012 American Chemical Society.

A very sensitive sensor response can be achieved using electrochemical sensing methods. An electrochemical sensor based on a nanocomposite of an oligo(phenylene ethynylene) (Table 1, no. 49) and chemically reduced graphene oxide (rGO) was used for the quantification of dopamine (DA) [90]. This nanocomposite was synthesized by a simple ultrasonication method and then drop-cast onto a polished glassy carbon electrode, followed by casting of a Nafion ethanol solution (0.25 wt%). The formation of the oligomer nanocomposite was attributed to the π–π stacking interaction between the conjugated structure of the oligo(phenylene ethynylene) and rGO as well as the electrostatic force between the amino group of the oligomer and the carboxyl group on rGO. Anchoring of the oligomer changed the configuration of the multiple bonds so that the conjugated system showed the characteristic features of conducting polymers. The developed sensor exhibited significantly enhanced electrocatalytic activity toward the oxidation of DA in a human serum PBS solution in the concentration range of 0.01–60 μM with a LOD of 5 nM, a significantly lower value than those reported for other DA sensors [97].

A chemical sensor based on GO-oligo(phenylene ethynylene) (Table 1, no. 48) nanocomposites was developed for amino acid detection [63]. The oligo(phenylene ethynylene) with cyanoacrylate groups changes its fluorescence properties in the presence of cysteine residues. As a result of the interaction between the oligomer and cysteine, a blue-shifted and decreased fluorescence emission was observed. For the oligomer-GO nanocomposite, the behavior was the opposite, and fluorescence enhancement occurred. The strong response of the oligomer to cysteine can be exploited for highly sensitive sensing.

Oligo(phenylene ethynylene)-based temperature sensors have been used to encode digital information [89]. The oligomers (Table 1, no. 47) were used as junctions between two Au electrodes (Figure 11a). Interestingly, during local temperature changes, the oligomers were able to switch their structure between a norbornadiene (NB) state and a quadricyclane (QC) state (Figure 11d). The two molecular states exhibited different conductance values that can be assigned to the digital symbols "1" and "0". The temperature-dependent conducting properties of the oligomers could be used for local temperature monitoring. Owing to its clear response, which translates into two states, this system can be further exploited as a new approach for encoding digital information.

**Figure 11.** (**a**) Schematic of the molecular device with a modulating bias. (**b**) Reversible switching behavior of single-molecule devices and the applied waveform. (**c**) Energy landscape of isomerization processes. Blue and orange arrows indicate the electrically controlled forward and reverse switching processes, respectively. (**d**) Schematic describing the processes for controlling the norbornadiene (NB)-quadricyclane (QC) switching within a molecular junction (blue and orange arrows). The switching processes within a molecular junction are controlled in the forward direction (NB to QC) by electrically controlling the local temperature and in the reverse process (QC to NB) by catalyzing the reaction through a single electron transfer (SET) process. These two states possess different conductance values and can be used to encode digital information. Reprinted with permission from [89]. Copyright 2020 John Wiley and Sons.

#### **4. Conclusions**

Uniform, π-conjugated oligomers based on an arylene ethynylene core are attractive sensory materials. They can respond to environmental changes (polarity, temperature) and to the presence of chemicals (amino acids, saccharides, ions), macromolecules (proteins, polymers), and bacteria, and they can be used for process monitoring. Successfully designed oligo(arylene ethynylene)-based sensors can serve as selective probes to detect particular analytes in a mixture and for selective process monitoring. However, their huge potential has not yet been fully explored.

Well-defined conjugated arylene ethynylenes can be accessed by iterative chemistry protocols that permit full structural precision and sequence definition. The solubility issues occurring for oligo(arylene ethynylene)s can be overcome by a synthesis approach that uses a soluble support. Nevertheless, achieving large synthesis scales and high yields remains a challenge.

The sensing parameters (sensitivity, selectivity, specificity) are strongly connected with the oligomer structure. Even a small difference in structure, e.g., a length difference of one unit, may result in a loss of sensor selectivity and sensitivity. Although a variety of examples have been described, it is still difficult to rationally design arylene ethynylene oligomers with high selectivity and affinity; more systematic studies in the field are therefore needed.

In the near future, sensing and process monitoring can become an interesting and emerging application for sequence-defined polymers built from π-conjugated segments. As shown by many examples in this review, sensing is one of the applications where monomer sequence, composition, and length matter. Systematic studies on the sequence–property relationship can open an avenue for more specific and selective sensors relevant to biological samples.

**Author Contributions:** R.S. developed the concept of the paper and wrote the manuscript. A.K.-C. and D.S. contributed to the preparation of the manuscript, commented on the content, and helped with editing. All authors have read and agreed to the published version of the manuscript.

**Funding:** This research received funding from project No. 2018/31/D/ST5/01365 of the Polish National Science Centre.

**Acknowledgments:** R. Szweda acknowledges the Polish National Science Centre project No. 2018/31/D/ST5/01365 for the funding received.

**Conflicts of Interest:** The authors declare no conflict of interest.

#### **References**


© 2020 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).

## *Review* **Modeling and Exploiting Microbial Temperature Response**

#### **Philipp Noll, Lars Lilge, Rudolf Hausmann and Marius Henkel \***

Institute of Food Science and Biotechnology, Department of Bioprocess Engineering (150k), University of Hohenheim, Fruwirthstr. 12, 70599 Stuttgart, Germany; philipp.noll@uni-hohenheim.de (P.N.); lars.lilge@uni-hohenheim.de (L.L.); rudolf.hausmann@uni-hohenheim.de (R.H.) **\*** Correspondence: marius.henkel@uni-hohenheim.de

Received: 15 December 2019; Accepted: 14 January 2020; Published: 17 January 2020

**Abstract:** Temperature is an important parameter in bioprocesses, influencing the structure and functionality of almost every biomolecule, as well as affecting metabolic reaction rates. In industrial biotechnology, the temperature is usually tightly controlled at an optimum value. Smart variation of the temperature to optimize the performance of a bioprocess brings about multiple complex and interconnected metabolic changes and is so far only rarely applied. Mathematical descriptions and models facilitate a reduction in complexity, as well as an understanding, of these interconnections. Starting in the 19th century with the "primal" temperature model of Svante Arrhenius, a variety of models have evolved over time to describe growth and enzymatic reaction rates as functions of temperature. Data-driven empirical approaches, as well as complex mechanistic models based on thermodynamic knowledge of biomolecular behavior at different temperatures, have been developed. Even though underlying biological mechanisms and mathematical models have been well-described, temperature as a control variable is only scarcely applied in bioprocess engineering, and as a conclusion, an exploitation strategy merging both in context has not yet been established. In this review, the most important models for physiological, biochemical, and physical properties governed by temperature are presented and discussed, along with application perspectives. As such, this review provides a toolset for future exploitation perspectives of temperature in bioprocess engineering.

**Keywords:** thermal growth curve; temperature modeling; thermoregulation; monitoring and control; bioprocess engineering; calorimetry

#### **1. Introduction**

Predetermined by the applied system, bioprocesses are generally very sensitive to most changes in environmental conditions. It is for this reason that conditions such as temperature, pO2, or pH are generally tightly controlled. In most cases, even small deviations from optimum values may lead to a significant reduction in the overall productivity and reproducibility of the process. Therefore, special consideration must be given to control tasks, which are typically defined by maintaining process variables within a narrow optimum [1]. In contrast to artificial laboratory conditions, microorganisms are usually exposed to a changing environment, with changes in pH, nutrient availability, competitors, and elevated or decreased temperature, etc. A crucial environmental factor for microorganisms is temperature. It affects the folding, structure, and stability of almost every biomolecule, as well as the metabolic reaction rate. Detection of temperature changes and subsequent adaptation of the metabolism are essential for microbial survival, for example, for pathogens sensing intrusion into a host. Organisms can sense temperature shifts indirectly or by specialized sensing systems that evolved to detect changes in temperature in order to respond with adapted gene expression. This was extensively reviewed by Klinkert and Narberhaus [2]. Indirect temperature sensing is possible via the accumulation of aggregated proteins after a heat shock and via stalled ribosomes after a cold shock. Molecular thermosensors may consist of DNA, RNA, proteins, or lipids. DNA topology changes, e.g., supercoiling caused by heat stress, stable RNA structures preventing translation at sub-optimal temperatures, temperature-responsive regulatory proteins, and alterations in lipid membrane stability with respect to fluidity are just a few examples of direct temperature sensing. Temperature plays an essential role and has a crucial effect on biological processes. Targeted temperature adjustments for triggering a desired response may also be exploited for biotechnological applications. Noll et al. published an example of a thermosensitive structure used to direct the carbon flow of a substrate into a product rather than into biomass, by exploiting an RNA thermometer to optimize the heterologous production of rhamnolipid biosurfactants [3].

Alterations in temperature lead to multiple, often complex, interconnected metabolic changes. Models describing a biological process as a function of temperature are therefore indispensable for reducing complexity and facilitating understanding of those interconnections. Mathematical descriptions of how (bio-)chemical reactions respond to high or low temperatures emerged as early as the 19th century with the "primal" temperature model of Arrhenius [4]. He investigated the reaction kinetics of the inversion of cane sugar by acids, depending on temperature. Popularized by Arrhenius, a variety of temperature models evolved over time, as shown in Figure 1. These models range from data-driven empirical approaches to complex mechanistic models that are based on thermodynamic knowledge of biomolecular behavior at different temperatures. Models are readily available to be used as a tool for process control and design. An overview of the current state of thermo-modeling is crucial for reasonably selecting a suitable model for the bioprocess to be monitored or optimized. The aim of this review is to provide an overview of available temperature models to facilitate understanding of model intention and reasonable selection for application. So far, only a few applications for temperature in (industrial) process design, monitoring, and control have been described. Applied model approaches are usually based on fuzzy logic or artificial neural networks and do not harness the full potential of deterministic approaches. There are a few examples of applied model-assisted temperature control strategies in industrial biotechnology. These include heat balancing for an estimation of metabolic activity to improve batch-to-batch reproducibility by applying a process control module which uses the difference between the culture temperature and temperature of a coolant to predict oxygen mass transfer rates and *k<sub>L</sub>a* values [5–7].
Furthermore, deterministic process models can be used to describe a biological process as a function of a physical condition, like the temperature. They may be used in the food sector to estimate product shelf life, to determine critical control points of a process, or to maximize productivities and ensure safe distribution chains [8]. These examples highlight a potential for temperature and deterministic models in process design. An approach for experimental design to optimize processes depending on multiple parameters is the Response Surface Method (RSM): "RSM is a collection of statistical techniques used for studying the relationships between measured responses and independent input variables" [9]. This method may be used for optimization purposes in experimental design in the form of a metamodel. It connects the response of an objective function to input variables and determines their relations, for example, by means of first- or second-order polynomial equations. The MATLAB software package from The MathWorks, Inc. (Natick, MA, USA) [10] or the Minitab statistics software (GMSL s.r.l., UK) [11] may be used to conduct an RSM analysis. In bioprocesses, usually very few process variables that continuously display the process course are available online. Furthermore, arguably the only control variable to direct a bioprocess towards a desired outcome is, in most cases, the addition of fresh media. Other variables like pH, temperature, or pO2 are typically controlled at a constant value. Therefore, investigating and exploiting novel potentially available monitoring and control variables like temperature is a reasonable strategy to extend existing toolsets of bioprocess monitoring and control. Even though several systems inducible by temperature have been discovered and made available to biotechnologists in the last decades, only a few have been exploited for practical purposes, such as for process design or optimization [12].
For monitoring and control purposes, calorimetric approaches have been presented [5,6,13,14]. For control purposes, temperature may be used to directly address biological traits like RNA thermometers to provoke a desired response, as previously evaluated [3]. Furthermore, indirect metabolic effects may be exploited, like elevated metabolic rates at high temperatures or the correct folding of proteins at low temperatures. Even though knowledge on microbial temperature responses and adaptation, along with descriptions developed by mathematical means, is available, its potential for applied industrial bioprocesses has not been sufficiently exploited. This review provides an overview on available thermo-models with the potential to develop model-assisted or model-derived process control strategies using temperature as a crucial parameter.
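The second-order polynomial metamodel mentioned above for the Response Surface Method can be illustrated with a minimal sketch. It assumes two coded input factors (for instance, temperature and pH scaled to the range −1 to 1), a 3 × 3 full factorial design, and a synthetic, noise-free response; none of these values come from the cited studies, and the function names are hypothetical.

```python
import numpy as np


def design_matrix(x1, x2):
    """Columns of a full second-order model: 1, x1, x2, x1^2, x2^2, x1*x2."""
    return np.column_stack([np.ones_like(x1), x1, x2, x1**2, x2**2, x1 * x2])


def fit_rsm(x1, x2, y):
    """Least-squares estimate of the six polynomial coefficients."""
    coef, *_ = np.linalg.lstsq(design_matrix(x1, x2), y, rcond=None)
    return coef


def stationary_point(coef):
    """Solve grad(y) = 0 for the fitted quadratic surface."""
    _, b1, b2, b11, b22, b12 = coef
    hessian = np.array([[2.0 * b11, b12], [b12, 2.0 * b22]])
    return np.linalg.solve(hessian, -np.array([b1, b2]))


# Assumed 3 x 3 full factorial design in coded units (-1, 0, 1).
levels = np.array([-1.0, 0.0, 1.0])
x1, x2 = (g.ravel() for g in np.meshgrid(levels, levels))

# Synthetic response (illustration only) with its maximum at (0.3, -0.2).
y = 10.0 - 2.0 * (x1 - 0.3) ** 2 - 1.5 * (x2 + 0.2) ** 2

coef = fit_rsm(x1, x2, y)
opt = stationary_point(coef)
print("coefficients:", np.round(coef, 3))
print("stationary point (coded units):", np.round(opt, 3))
```

With real, noisy data, the stationary point is only an estimate, and the signs of the quadratic coefficients should be checked to confirm that it is a maximum rather than a saddle point or a minimum.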

#### **2. History of Temperature Modeling—17th–20th Century**

As early as the 17th century, there were theories on temperature being a form of particle movement. The kinetic theory of gases, with its origins in the 18th century, first specifically associated translational motions of molecules with heat and not with their vibrational or rotational motions [15]. Daniel Bernoulli was the pioneer of the kinetic theory of gases. He hypothesized that gases consist of a finite number of small spherical particles, which move through space along a straight line with high velocities. He assumed that heat increases the particle speed (*v*) and demonstrated that air pressure correlates with *v*<sup>2</sup>. Air temperature can therefore be measured by this pressure at a constant density, making temperature proportional to *v*<sup>2</sup> [16,17]. The kinetic energies of the molecules are correlated with the ideal gas law of Equation (1), whose history began with the French engineer Émile Clapeyron in 1834 [18,19]. In the following, only SI units are used. Parameters with non-SI units, used by cited authors, were converted into SI units.

$$p \cdot V = n \cdot R \cdot T \tag{1}$$

where *p* is the pressure, Pa; *V* is the volume, m<sup>3</sup>; *n* is the amount of substance, mol; *R* is the universal gas constant ~8.314 J mol<sup>−1</sup> K<sup>−1</sup>; and *T* is the absolute temperature, K. The Dutch chemist and first Nobel laureate in chemistry J. H. Van't Hoff observed that the chemical reaction rate doubles or triples when the temperature is increased by 10 K, which he expressed with the following equation:

$$Q_{10} = \left(\frac{k_2}{k_1}\right)^{\frac{10}{T_2 - T_1}} \tag{2}$$

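where *k*<sub>1</sub> and *k*<sub>2</sub> are the reaction rate constants at the temperatures *T*<sub>1</sub> and *T*<sub>2</sub>, K, respectively. This "rule of thumb" for chemical kinetics allows estimations for various phenomena in chemistry, biochemistry, and ecology. He furthermore described the change in the equilibrium constant *K*<sub>eq</sub> of a chemical reaction with respect to the change in temperature at constant pressure with the Van't Hoff Equation (3):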

$$\frac{d}{dT}\ln K_{\text{eq}} = \frac{\Delta H}{R \cdot T^2} \tag{3}$$

where *K*<sub>eq</sub> is the dimensionless equilibrium constant; Δ*H* is the standard enthalpy change, J mol<sup>−1</sup>; *R* is the universal gas constant ~8.314 J mol<sup>−1</sup> K<sup>−1</sup>; and *T* is the absolute temperature, K. Van't Hoff's student and the father of temperature models, Svante Arrhenius, continued the work of his teacher on the description of temperature-dependent specific reaction rate constants in chemical reactions with his essay "On the Reaction Velocity of the Inversion of Cane Sugars by Acids" [4]. Arrhenius observed that the reaction velocity of chemical reactions increased between 10% and 15% for each degree of rising temperature and postulated a semi-empirical model based on the Van't Hoff equation, which is shown in its integrated form in Equation (4).

$$k = A \cdot e^{\frac{-E_a}{R \cdot T}} \tag{4}$$

where *k* is the rate constant, s<sup>−1</sup> for a first-order reaction; *A* is the pre-exponential (frequency or collision) factor, s<sup>−1</sup> for a first-order reaction; *E<sub>a</sub>* is an empirical parameter, the (Arrhenius) activation energy, J mol<sup>−1</sup>, characterizing the exponential temperature dependence of *k*; *R* is the universal gas constant ~8.314 J mol<sup>−1</sup> K<sup>−1</sup>; and *T* is the absolute temperature, K. In 1935, Henry Eyring formulated a statistical mechanistic equation following the transition state theory (formerly activated-complex theory) that assumed a transition state complex (‡TC) and a quasi-equilibrium between the educts (e<sub>1</sub>; e<sub>2</sub>), the transition state, and the product (P) [20,21]. The model describes the variation of the rate of a chemical reaction with temperature in a similar way to the equation of Arrhenius. Therefore, it underlined Arrhenius' previous observations and assumptions with a mechanistic approach.

$$\mathrm{e}_1 + \mathrm{e}_2 \underset{k_2}{\overset{k_1}{\rightleftharpoons}} {}^{\ddagger}\mathrm{TC} \overset{k_3}{\longrightarrow} \mathrm{P} \tag{5}$$

According to the transition state theory, the rate constant can be described as follows:

$$k(T) = \frac{k_B \cdot T}{h} \cdot K^{\ddagger} \tag{6}$$

where *k<sub>B</sub>* is the Boltzmann constant ~1.381·10<sup>−23</sup> J K<sup>−1</sup>; *T* is the absolute temperature, K; *h* is Planck's constant ~6.626·10<sup>−34</sup> J s; and *K*<sup>‡</sup> is the dimensionless equilibrium constant. A different way to express the rate constant is by replacing the equilibrium constant with a term containing the standard molar changes of entropy and enthalpy:

$$k(T) = \frac{k_B \cdot T}{h} \cdot e^{\Delta^{\ddagger} S^{\circ}/R} \cdot e^{-\Delta^{\ddagger} H^{\circ}/(R \cdot T)} \tag{7}$$

where Δ<sup>‡</sup>*S*° is the standard molar entropy of activation, J K<sup>−1</sup> mol<sup>−1</sup>, i.e., the change in entropy when the reactants form the transition state (activated) complex; Δ<sup>‡</sup>*H*° is the standard molar enthalpy of activation, J mol<sup>−1</sup>; and *R* is the universal gas constant ~8.314 J mol<sup>−1</sup> K<sup>−1</sup>. The (Arrhenius) activation energy (*E<sub>a</sub>*) and the enthalpy of activation are not identical but approximately equal; they can be converted into each other, depending on the molecularity [22].
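The temperature relations of Equations (2), (4), and (7) can be evaluated numerically, as in the following minimal sketch. The parameter values (pre-exponential factor, activation energy, entropy and enthalpy of activation) are assumed for illustration only and are not taken from the cited literature; the function names are likewise hypothetical.

```python
import math

R = 8.314            # universal gas constant, J mol^-1 K^-1
K_B = 1.381e-23      # Boltzmann constant, J K^-1
H_PLANCK = 6.626e-34 # Planck constant, J s


def arrhenius(A, Ea, T):
    """Rate constant according to Equation (4)."""
    return A * math.exp(-Ea / (R * T))


def q10(k1, k2, T1, T2):
    """Temperature coefficient according to Equation (2)."""
    return (k2 / k1) ** (10.0 / (T2 - T1))


def eyring(dS, dH, T):
    """Rate constant according to Equation (7)."""
    return (K_B * T / H_PLANCK) * math.exp(dS / R) * math.exp(-dH / (R * T))


# Assumed, illustrative parameters for a first-order reaction.
A, Ea = 1.0e10, 50_000.0   # s^-1, J mol^-1
T1, T2 = 298.15, 308.15    # K

k1, k2 = arrhenius(A, Ea, T1), arrhenius(A, Ea, T2)
print(f"Arrhenius: k(T1) = {k1:.3e} s^-1, k(T2) = {k2:.3e} s^-1")
print(f"Q10 between T1 and T2 = {q10(k1, k2, T1, T2):.2f}")
print(f"Eyring: k(T1) = {eyring(-50.0, 48_000.0, T1):.3e} s^-1")
```

For an activation energy of this assumed magnitude, the rate constant roughly doubles over a 10 K interval near room temperature, in line with the rule of thumb attributed to Van't Hoff.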

#### *2.1. Temperature in Biological Systems—The History Began with Arrhenius*

Microbiologists have noticed a major effect of temperature on the growth rate of microbial populations and described this effect with the Arrhenius equation by simply replacing the rate constant *k* in Equation (4) with the growth rate (μ), meaning the reciprocal of the generation time. The so-called Arrhenius plot, where ln(μ) is plotted against the reciprocal temperature, was used in the past and is still applied today to describe a relation between the temperature and growth of different bacteria and molds [23–26]. From this plot, Arrhenius parameters can easily be derived. These plots show a good fit for lower temperatures. The Arrhenius model does not represent cell death, i.e., the decrease of the growth curve at non-physiological temperatures. The lack of fit of the Arrhenius model for some temperature-dependent biological processes gave rise to the development of improved models describing growth as a function of temperature. Most of these models are based on Arrhenius' parental model and evolved over time.
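Deriving Arrhenius parameters from such a plot amounts to a linear regression, since taking the logarithm of Equation (4) with μ in place of *k* gives ln(μ) = ln(*A*) − *E<sub>a</sub>*/(*R*·*T*). The following minimal sketch assumes synthetic, noise-free growth rates generated from chosen parameter values; the data and the function name are illustrative and do not stem from the cited studies.

```python
import numpy as np

R = 8.314  # universal gas constant, J mol^-1 K^-1


def fit_arrhenius(T, mu):
    """Fit ln(mu) against 1/T; the slope is -Ea/R and the intercept is ln(A)."""
    inv_T = 1.0 / np.asarray(T, dtype=float)
    ln_mu = np.log(np.asarray(mu, dtype=float))
    slope, intercept = np.polyfit(inv_T, ln_mu, 1)
    return np.exp(intercept), -slope * R


# Synthetic growth rates (assumed values) in the sub-optimal temperature range.
T = np.array([288.15, 293.15, 298.15, 303.15, 308.15])  # K
A_true, Ea_true = 2.0e9, 60_000.0                       # h^-1, J mol^-1
mu = A_true * np.exp(-Ea_true / (R * T))                # h^-1

A_fit, Ea_fit = fit_arrhenius(T, mu)
print(f"A  = {A_fit:.3e} h^-1")
print(f"Ea = {Ea_fit / 1000.0:.1f} kJ mol^-1")
```

Because the model contains no term for thermal inactivation, such a fit is only meaningful for data below the growth optimum; including points from the declining part of the growth curve would distort both parameters.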

#### *2.2. Biological Mechanisms Involved in Temperature Responses*

Microorganisms have developed molecular traits to respond to changing environmental temperatures. These traits have been extensively reviewed [2,27,28]. The principles of microbial thermo responses range from changing DNA topology, e.g., supercoiling caused by heat stress, stalled ribosomes, or stable RNA preventing translation during cold stress to proper folding of proteins, working optima of enzymes, or lipid membrane stability and fluidity. Biomolecules are generally thermosensitive. Therefore, various options for direct molecular thermosensing are possible. Molecular thermosensors may consist of DNA, RNA, proteins, or lipids [2]. The accessibility of DNA for the transcriptional machinery is crucial for transferring genetic information via RNA into a protein and is influenced by DNA topology [29]. DNA supercoiling, and thus accessibility, is altered in response to a shifting temperature, as has been reported for the plasmid DNA of meso- and thermophiles [30]. Mesophiles have negatively supercoiled plasmid DNA, whereas hyperthermophilic archaea with a growth optimum ≥80 °C have relaxed or positively supercoiled plasmid DNA [30–32]. Proteins such as the histone-like nucleoid-structuring proteins (H-NS) work as temperature-dependent regulators, governing >200 temperature-regulated genes in *Salmonella* sp. and more than two thirds of the temperature-regulated genes of *E. coli* K-12 [33,34]. The inhibition of gene expression by H-NS is caused by trapping RNA polymerase and mediating DNA looping, thereby disturbing the progression of RNA polymerase [35–38]. Temperature-dependent gene expression is also influenced on the RNA level, where RNA can form inhibitory loop structures called RNA thermometers (RNATs). Here, base pairing blocks the Shine-Dalgarno (SD) sequence and AUG start codons, inhibiting ribosomal binding and translation initiation. When the temperature rises to a threshold (melting temperature), the hairpin structure opens and permits the access of ribosomes to the translation initiation site [2,39]. The secondary structure, and thereby the functionality, of RNATs is characterized by canonical or non-canonical base pairing, internal loops or mismatches, and the total number of loop structures. Based on these characteristics, RNA thermometers may be subdivided into three categories: (i) ROSE-like RNATs (repression of heat shock gene expression), (ii) FourU RNATs, and (iii) additional types of RNATs [40]. Most RNATs have been identified in the 5′-UTR of mRNA. The ROSE-like RNAT family is probably the most abundant temperature-sensing mRNA structure. ROSE-like RNATs usually control the repression of heat shock gene expression, but have also been reported to control expression of the rhamnosyl transferase, which is associated with *Pseudomonas aeruginosa* virulence [41]. ROSE RNATs are located in the 5′-UTR, are between 60 and 100 nt in length, and consist of 2–4 loop structures [41–43]. The majority of described RNATs of the second family, the FourU RNATs, govern the expression of virulence genes, and only two FourU RNATs are known to control heat shock protein formation.
FourU RNATs contain a sequence of four uridines that occlude the SD sequence by canonical A-U and/or non-canonical G-U base pairing [40]. The virulence gene *lcrF* (*virF*) of *Yersinia pestis* and the *agsA* small heat shock gene of *Salmonella enterica* were among the first genes described to be governed by a FourU RNAT [44]. Furthermore, attempts have been made to design synthetic RNATs with tailor-made characteristics, which differ by up to 10-fold in sensitivity and around 3-fold in threshold compared to a starting thermometer sequence [45].

On the protein level, global repressors, sensor kinases, methyl-accepting chemotaxis proteins (MCPs), M-like proteins, chaperones, and proteases are involved in microbial temperature responses [2]. The global transcriptional repressor CtsR has been termed a "protein thermosensor" and releases DNA upon an upshift in temperature, which is connected to the expression of heat shock proteins. Due to a glycine-rich loop structure, CtsR exhibits intrinsic heat-sensing characteristics [46]. MCPs function as transmembrane receptors and consist of a periplasmic ligand-binding domain and a signaling domain in the cytosol that can interact with cytosolic sensor kinases [47]. The Tar MCP, for example, is convertible from a heat to a cold sensor in the presence of aspartate and consequent methylation at up to four sites [48]. The surface M and M-like proteins of the human-pathogenic group A streptococci bind to a variety of human plasma proteins in a temperature-dependent manner. The affinity of the M-like coiled-coil protein Arp4 to IgA is high at 10 and 20 °C, but low at 37 °C, due to a conformational change of Arp4 and the consequent loss of the coiled-coil conformation and binding activity [49]. The diverse class of small heat shock proteins (sHsp) can act as molecular chaperones upon heat shock. They are temperature sensors with different molecular mechanisms.
For example, the sHsp Hsp26 of *Saccharomyces cerevisiae* consists of 24 subunits and changes its affinity state towards unfolded proteins at high temperatures by undergoing a conformational change [50,51].

#### *2.3. Characteristic Graph for Growth as a Function of Temperature*

A representative thermal growth curve, that is, a model curve depicting the specific growth rate as a function of temperature, is shown in Figure 2. The simulated optimal specific growth rate (μ*opt*), reached at the temperature optimum (*Topt*), and the growth rate at half of μ*opt* (μ*50%opt*) are marked on the curve. It has been pointed out that the term "optimal temperature" may need further specification to distinguish between the temperature for the optimal growth rate and the temperature for the maximum biomass yield [52]. The minimum and maximum temperatures for growth (*Tmin* and *Tmax*, respectively) flank the asymmetric function and mark the thermal tolerance or thermal niche of an organism [53,54]. These three temperatures (*Tmin*, *Topt*, and *Tmax*) are commonly referred to as cardinal temperatures. Bacteria can adapt to changing temperatures in the short run by producing cold- or heat-shock proteins. Furthermore, it was reported that the performance optimum of *E. coli* can be shifted when the organism is exposed to suboptimal temperatures for ~2000 generations, whereas the thermal niche breadth remained constant in that case [54]. The result is a reshaped thermal growth curve with the same upper and lower limits. The asymmetry of the thermal growth curve indicates that bacteria adapted to high temperatures can survive at lower temperatures quite well. In contrast, fitness decreases sharply when temperatures exceed the optimum, resulting in thermal shock [55,56]. In one of the most recent approaches, the growth of psychrophiles, mesophiles, thermophiles, and hyperthermophiles was modeled, covering a temperature range of 124 °C, from −2 to +122 °C. The model was applied to 230 different strains of uni- and multicellular organisms, with growth temperatures ranging from below freezing to the highest temperature known so far to permit biological growth [57,58].

**Figure 2.** Scheme of the thermal growth curve, where the growth rate (s<sup>−1</sup>) is plotted against the temperature (K). Cardinal temperatures (*Tmin*, *Topt*, and *Tmax*) and the growth rates μ*opt* and μ*50%opt* are indicated.

#### *2.4. Mechanistic Versus Empirical Models*

Often cited approaches for modeling the thermal growth curve of microorganisms are the empirical approach of Ratkowsky et al. [59] (Scopus: cited by 615, 6 November 2019) and the semi-empirical model of Arrhenius. The development of mechanistic approaches started with the master reaction model of Johnson et al. in 1946 [60] (Scopus: cited by 80, 6 November 2019). The mechanistic models consider the thermal stability of a single essential protein (master reaction model) or the thermal stability of the whole proteome (proteome model) as the key to modeling the thermal growth curve. The transition of the protein from the native state to an active and/or inactive state is considered. Grimaud et al. have extensively reviewed temperature growth models and concluded that empirical models display a better fit than mechanistic models for balanced growth in non-limiting conditions. Conversely, mechanistic models offer a complementary point of view for modeling thermal growth and can accurately represent temperature responses for growth under non-balanced conditions [61].

#### **3. Temperature Modeling—From the 20th Century until Today**

#### *3.1. The Model of Hinshelwood (1946)*

Hinshelwood expanded Arrhenius' model by adding a temperature-dependent term describing a "rate of degeneration" that becomes relevant at temperatures above *Topt* [62]. Hinshelwood assumes balanced growth for his model, meaning that the total amount of compounds in a cell is constant. The model is based on the assumptions that just one enzymatic reaction is rate-controlling and that the product of this reaction is a thermosensitive essential biomolecule, which denatures irreversibly when temperatures are raised beyond the optimum. Both the rate-controlling reaction and the denaturation at high temperatures are assumed to be of zero order, each with an Arrhenius-type temperature dependency. The model represents the rate of synthesis in the minuend and the rate of degeneration in the subtrahend. At low temperatures, the subtrahend is insignificantly small; in a small temperature region, both terms are almost equal and cancel each other out; and at high temperatures, the subtrahend accounts for the rapid decrease of the rate to zero.

$$
\mu(T) = A\_1 \cdot \mathbf{e}^{-\frac{E\_1}{R \cdot T}} - A\_2 \cdot \mathbf{e}^{-\frac{E\_2}{R \cdot T}} \tag{8}
$$

where *A*<sub>1</sub> and *A*<sub>2</sub> are referred to as pre-exponential (non-exponential), collision, or frequency factors, s<sup>−1</sup>, related to entropy [62]; *E*<sub>1</sub> and *E*<sub>2</sub> are related to enthalpy [22] and represent the activation energies, J mol<sup>−1</sup>, of the rate-determining enzyme reaction and of high-temperature denaturation, respectively; *R* is the universal gas constant, ~8.314 J mol<sup>−1</sup> K<sup>−1</sup>; and *T* is the temperature, K.
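The interplay of the two terms in Equation (8) can be sketched numerically. Setting both terms equal gives the temperature at which synthesis and degeneration cancel, *T* = (*E*<sub>2</sub> − *E*<sub>1</sub>)/(*R*·ln(*A*<sub>2</sub>/*A*<sub>1</sub>)), which follows directly from Equation (8). The function names and all parameter values below are hypothetical and chosen only so that this temperature is 318 K.

```python
import math

R = 8.314  # universal gas constant, J mol^-1 K^-1


def hinshelwood(T, A1, E1, A2, E2):
    """Equation (8): Arrhenius-type synthesis minus Arrhenius-type degeneration."""
    return A1 * math.exp(-E1 / (R * T)) - A2 * math.exp(-E2 / (R * T))


def hinshelwood_zero_temperature(A1, E1, A2, E2):
    """Temperature at which both terms of Equation (8) are equal (mu = 0)."""
    return (E2 - E1) / (R * math.log(A2 / A1))


# Hypothetical parameters; A2 is chosen so that growth ceases at T_ZERO
A1, E1, E2, T_ZERO = 1.0e10, 6.0e4, 3.0e5, 318.0
A2 = A1 * math.exp((E2 - E1) / (R * T_ZERO))

mu_300 = hinshelwood(300.0, A1, E1, A2, E2)  # below T_ZERO: synthesis dominates
mu_325 = hinshelwood(325.0, A1, E1, A2, E2)  # above T_ZERO: degeneration dominates
```

Because *E*<sub>2</sub> is much larger than *E*<sub>1</sub> in this example, the degeneration term is negligible at low temperatures and takes over within a few kelvin of the zero crossing.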

#### *3.2. The Model of Johnson (1946)*

In the same year, Johnson and Lewin [60] proposed another mechanistic model, which also assumes the simple case of a single reaction controlling growth, and called it their "master reaction model". In contrast to Hinshelwood, they assumed that a reversibly denaturable "master enzyme" *E*<sub>0</sub> controls an essential reaction for growth (assuming no substrate limitation). They reported that *E. coli* stopped growing at a non-permissive temperature of 45 °C, but started growing exponentially again when transferred back to 37 °C. Increasing the exposure time of *E. coli* to non-permissive temperatures led to lowered growth rates at 37 °C compared to the control. Hence, they assumed reversible protein denaturation and described it in their model by integrating an equilibrium constant (*K*<sub>1</sub>). The constant accounts for the equilibrium between reversibly denatured inactive (*E<sub>d</sub>*) and native active enzymes (*E<sub>n</sub>*).

$$K\_1 = \frac{E\_d}{E\_n} \tag{9}$$

Hence, the amount of native active enzyme is given by Equation (10), with *E*<sub>0</sub> being the total amount of enzyme (native and denatured, mol).

$$E\_n = \frac{E\_0}{1 + K\_1} = \frac{E\_0}{1 + e^{\frac{-\Delta H}{R \cdot T}} \cdot e^{\frac{\Delta S}{R}}} \tag{10}$$

Johnson and Lewin then referred to Equation (7) proposed by Eyring, adding Equation (10) to account for the amount of active enzyme and substrate concentration and yielding Equation (11) for the temperature-dependent specific reaction rate (*k*).

$$k(T) = \frac{k_{\rm B} \cdot T}{h} \cdot e^{\Delta^{\ddagger} S^{\circ} / R} \cdot e^{-\Delta^{\ddagger} H^{\circ} / (R \cdot T)} \cdot [S] \cdot [E_n] \tag{11}$$

By assuming that one single enzymatic reaction governs temperature-dependent growth at a constant substrate concentration and by substituting *En* in Equation (11) with rearranged Equation (10), temperature-dependent growth can be described as Equation (12):

$$\mu(T) = c \cdot T \cdot E_0 \cdot e^{\Delta^{\ddagger} S^{\circ} / R} \cdot e^{-\Delta^{\ddagger} H^{\circ} / (R \cdot T)} \cdot \frac{1}{1 + e^{-(\Delta H - T \cdot \Delta S) / (R \cdot T)}}\tag{12}$$

where *c* is a constant derived from the Boltzmann and Planck constants of the Eyring model in Equation (7), s<sup>−1</sup> K<sup>−1</sup>; Δ‡*H*° and Δ‡*S*° are the standard molar change of enthalpy and entropy of activation, respectively (as described for Equation (7)); Δ*H* and Δ*S* are the enthalpy change, J, and entropy change, J K<sup>−1</sup>, respectively, between native and denatured enzymes; *R* is the universal gas constant, ~8.314 J mol<sup>−1</sup> K<sup>−1</sup>; and *T* is the absolute temperature, K. The equation can then be shortened to the model in Equation (14) using the expression for the Gibbs free energy change (between a catalytically active and a reversibly denatured inactive state) at a constant temperature (Equation (13)):

$$
\Delta G^{\circ} = \Delta H - T \cdot \Delta S \tag{13}
$$

$$\mu(T) = C \cdot T \cdot e^{-\Delta^{\ddagger} H^{\circ} / (R \cdot T)} \cdot \frac{1}{1 + e^{-\Delta G / (R \cdot T)}}\tag{14}$$

where *E*<sub>0</sub> is assumed to be constant and *c*·e<sup>Δ‡*S*°/*R*</sup>·*E*<sub>0</sub> is combined into *C*. The fraction containing the Gibbs free energy change can be interpreted as the probability that the enzyme is in its native, and not its inactive, state. In the temperature region in which the enzyme is catalytically active, Δ*G*, J, has high positive values, so that the exponential term in the denominator is almost zero and the probability of finding a catalytically active enzyme is almost one.
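The probability interpretation of Equation (14) can be made concrete with a short sketch. It evaluates the native fraction 1/(1 + e<sup>−Δ*G*/(*R*·*T*)</sup>) with Δ*G* = Δ*H* − *T*·Δ*S* from Equation (13); at *T* = Δ*H*/Δ*S* the free energy change is zero and half of the enzyme is native. The function names and parameter values are hypothetical, and Δ*H* and Δ*S* are treated here as molar quantities so that they are compatible with *R*.

```python
import math

R = 8.314  # universal gas constant, J mol^-1 K^-1


def native_fraction(T, dH, dS):
    """Probability that the master enzyme is native (fraction term of Eq. (14))."""
    dG = dH - T * dS  # Equation (13)
    return 1.0 / (1.0 + math.exp(-dG / (R * T)))


def master_reaction(T, C, dH_act, dH, dS):
    """Equation (14): Eyring-type rate multiplied by the native fraction."""
    return C * T * math.exp(-dH_act / (R * T)) * native_fraction(T, dH, dS)


# Hypothetical parameters (molar units assumed): denaturation midpoint near 315 K
DH, DS = 4.0e5, 1270.0     # J mol^-1 and J mol^-1 K^-1
C, DH_ACT = 1.0e5, 6.0e4   # lumped constant and enthalpy of activation

T_MID = DH / DS            # temperature at which dG = 0
```

Below the midpoint the native fraction is close to one and the model behaves like the Eyring expression; above it the fraction collapses and the growth rate drops sharply.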

#### *3.3. The Model of Sharpe (1977)*

In 1977, Sharpe et al. [63,64] merged the model of Johnson and Lewin with the model of Hultin, both of which were founded on Eyring's theory of the activated complex in chemical reactions [20,60,65,66]. The result was a unified rate model for the description of biological processes at physiological temperatures. Sharpe's model was originally developed for poikilotherms. Sharpe assumed balanced growth with a constant total amount of compounds per cell and just a single rate-controlling enzyme determining the development rate at all temperatures. Its reaction rate is of zero order. The total concentration of enzyme (active + inactive) is assumed to be constant at all temperatures. Three enzyme states are considered: an inactive state at low temperatures, an inactive state at high temperatures, and an active state. Sharpe's model describes the transition between these states and depicts the thermal growth curve with the following equation:

$$\mu(T) = \frac{T \cdot e^{(\Phi - \Delta^{\ddagger} H^{\circ}/T)/R}}{1 + e^{(\Delta S_L - \Delta H_L/T)/R} + e^{(\Delta S_H - \Delta H_H/T)/R}} \tag{15}$$

where *T* is the temperature, K; *R* is the universal gas constant, ~8.314 J mol<sup>−1</sup> K<sup>−1</sup>; and the other parameters describe the rate-controlling enzyme reaction: Φ is a dimensionless conversion factor; Δ‡*H*° is the enthalpy of activation of the reaction catalyzed by the rate-controlling enzyme, J mol<sup>−1</sup>; the subscript *L* accounts for low-temperature inactivation and the subscript *H* for high-temperature inactivation; and Δ*S*, J K<sup>−1</sup> mol<sup>−1</sup>, and Δ*H*, J mol<sup>−1</sup>, mark the entropic and enthalpic change, respectively, upon high- or low-temperature inactivation, as specified by the subscript. In 1991, Zwietering et al. [8] rewrote the model of Sharpe in a form with an Arrhenius-type temperature dependency, using activation energies rather than changes in enthalpy to describe growth. As described by the International Union of Pure and Applied Chemistry (IUPAC), the (Arrhenius) activation energy (*E<sub>a</sub>*) and the enthalpy of activation are not the same, but approximately equal, as they are convertible, depending on the molecularity [22].

$$\mu(T) = \frac{k_a \cdot e^{-\frac{E_a}{R \cdot T}}}{1 + k_l \cdot e^{-\frac{E_l}{R \cdot T}} + k_h \cdot e^{-\frac{E_h}{R \cdot T}}} \tag{16}$$

where *k<sub>a</sub>* (s<sup>−1</sup>), *k<sub>l</sub>* (-), and *k<sub>h</sub>* (-) are collision factors, which are dimensionless in the denominator, as described by Zwietering et al. [8]; *E*, J mol<sup>−1</sup>, represents the activation energy; the subscript *a* accounts for the rate-determining enzyme reaction; the subscripts *h* and *l* describe high- and low-temperature inactivation, respectively; *R* is the gas constant, ~8.314 J mol<sup>−1</sup> K<sup>−1</sup>; and *T* is the temperature, K.

#### *3.4. The Model of Mohr (1980)*

Mohr and Krawiec [24] analyzed the thermal growth curves of 12 bacterial species. Among them were thermophiles, mesophiles, and psychrophiles. They used the Arrhenius plot for their data, where ln(μ) is plotted against the reciprocal temperature in Kelvin. They reported two different slopes in the Arrhenius profiles of some mesophiles and thermophiles at suboptimal temperatures. The temperature at the intersection point where both slopes meet is referred to as the "critical temperature" (*Tcrit*) [24,67]. This point marks the turning point where the organization of an organism, and hence its growth behavior, changes. In order to describe the two different slopes, they proposed two equations, which they reduced to one with the assumption that "[ ... ] a balance of organizations exists at any temperature":

$$
\mu(T) = A_1 \cdot e^{-E_1/(R \cdot T)} - A_2 \cdot e^{-E_2/(R \cdot T)}, \quad T_{crit} < T < T_{max} \tag{17}
$$

$$
\mu(T) = A_1' \cdot e^{-E_1'/(R \cdot T)}, \quad T_{min} < T < T_{crit} \tag{18}
$$

$$\mu(T) = \frac{1}{A\_1^{\star} \cdot e^{E\_1'/R \cdot T} + A\_1^{\star \star} \cdot e^{E\_1/R \cdot T}} - A\_2 \cdot e^{-E\_2/R \cdot T} \tag{19}$$

where *A*<sub>1</sub>, *A*′<sub>1</sub>, and *A*<sub>2</sub> are referred to as pre-exponential (non-exponential), collision, or frequency factors, s<sup>−1</sup>; *E*<sub>1</sub>, *E*′<sub>1</sub>, and *E*<sub>2</sub> are referred to as temperature characteristics, J mol<sup>−1</sup>; *R* is the gas constant, ~8.314 J mol<sup>−1</sup> K<sup>−1</sup>; and *T* is the temperature, K. The parameters marked with a prime (′) are used to describe temperature-dependent growth for *Tmin* < *T* < *Tcrit*, whereas parameters without a prime are used to describe temperature-dependent growth for *Tcrit* < *T* < *Tmax*. In Equation (19), *A*<sup>∗</sup><sub>1</sub> = 1/*A*′<sub>1</sub> and *A*<sup>∗∗</sup><sub>1</sub> = 1/*A*<sub>1</sub>.
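The way Equation (19) joins the two Arrhenius branches can be checked with a small sketch. With *A*<sup>∗</sup><sub>1</sub> = 1/*A*′<sub>1</sub> and *A*<sup>∗∗</sup><sub>1</sub> = 1/*A*<sub>1</sub>, the first term of Equation (19) equals 1/(1/*r*′ + 1/*r*), where *r*′ and *r* are the primed and unprimed Arrhenius branches, so the slower branch dominates and both contribute equally at *Tcrit*. The function names and parameter values are hypothetical, and the degeneration term is switched off (*A*<sub>2</sub> = 0) to isolate the synthesis term.

```python
import math

R = 8.314  # gas constant, J mol^-1 K^-1


def arrhenius_branch(T, A, E):
    """Single Arrhenius branch A * exp(-E / (R*T))."""
    return A * math.exp(-E / (R * T))


def mohr(T, A1p, E1p, A1, E1, A2, E2):
    """Equation (19) with A1* = 1/A1' and A1** = 1/A1."""
    a1_star = 1.0 / A1p
    a1_2star = 1.0 / A1
    synthesis = 1.0 / (a1_star * math.exp(E1p / (R * T))
                       + a1_2star * math.exp(E1 / (R * T)))
    return synthesis - A2 * math.exp(-E2 / (R * T))


# Hypothetical parameters; A1p is chosen so that both branches meet at T_CRIT
A1, E1, E1P, T_CRIT = 1.0e8, 5.0e4, 1.2e5, 295.0
A1P = A1 * math.exp((E1P - E1) / (R * T_CRIT))

# At T_CRIT both branches are equal, so the combined term is half of either one
mu_crit = mohr(T_CRIT, A1P, E1P, A1, E1, 0.0, 3.0e5)
half_branch = 0.5 * arrhenius_branch(T_CRIT, A1, E1)
```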

#### *3.5. The Model of Schoolfield (1981)*

Schoolfield developed a non-linear regression model [68] based on the model proposed by Sharpe. His group reformulated the model of Sharpe and eliminated the high correlations of Sharpe's parameters (e.g., 0.9996). Furthermore, Schoolfield et al. argued that there is no "readily apparent" initial guess for beginning iterations for parameter estimation. Hence, they also aimed to facilitate the regression process and parameter estimation.

$$\mu(T) = \frac{\mu_{25} \cdot \frac{T}{298} \cdot e^{\left[\frac{\Delta^{\ddagger} H^{\circ}}{R} \cdot \left(\frac{1}{298} - \frac{1}{T}\right)\right]}}{1 + e^{\left[\frac{\Delta H_L}{R} \cdot \left(\frac{1}{T_{1/2L}} - \frac{1}{T}\right)\right]} + e^{\left[\frac{\Delta H_H}{R} \cdot \left(\frac{1}{T_{1/2H}} - \frac{1}{T}\right)\right]}}\tag{20}$$

Schoolfield et al. chose 25 °C (298 K) and the respective specific growth rate μ<sub>25</sub>, s<sup>−1</sup>, as a reference because enzyme inactivation would be low or absent at that temperature in most biological systems. Δ‡*H*° is the enthalpy of activation of the reaction catalyzed by the rate-controlling enzyme, J mol<sup>−1</sup>; the subscript *L* accounts for low-temperature inactivation and the subscript *H* for high-temperature inactivation of the enzyme; and Δ*H*<sub>*L*</sub> and Δ*H*<sub>*H*</sub> mark the enthalpic change upon low- or high-temperature inactivation, respectively, J mol<sup>−1</sup>. *T*<sub>1/2*H*</sub> and *T*<sub>1/2*L*</sub>, K, are the temperatures at which 50% of the rate-controlling enzyme is inactivated by high or low temperature, respectively, as previously described by Hultin [66] and adopted by Schoolfield et al. [68].
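The practical benefit of Schoolfield's reparameterization is that every parameter can be read approximately from a plot, which gives usable initial guesses for regression. The following sketch implements the model of Equation (20) with the 298 K reference stated in the text and checks two of these readings: the rate at 298 K is close to μ<sub>25</sub>, and at *T*<sub>1/2*H*</sub> the rate is about half of the uninhibited numerator. Function names and parameter values are hypothetical; Δ*H*<sub>*L*</sub> is taken as negative and Δ*H*<sub>*H*</sub> as positive, which is an assumption about the sign convention.

```python
import math

R = 8.314      # gas constant, J mol^-1 K^-1
T_REF = 298.0  # reference temperature, K (25 degC)


def schoolfield_numerator(T, mu25, dH_act):
    """Uninhibited rate: numerator of Equation (20)."""
    return mu25 * (T / T_REF) * math.exp((dH_act / R) * (1.0 / T_REF - 1.0 / T))


def schoolfield(T, mu25, dH_act, dH_L, T_half_L, dH_H, T_half_H):
    """Equation (20) with low- and high-temperature inactivation terms."""
    low = math.exp((dH_L / R) * (1.0 / T_half_L - 1.0 / T))
    high = math.exp((dH_H / R) * (1.0 / T_half_H - 1.0 / T))
    return schoolfield_numerator(T, mu25, dH_act) / (1.0 + low + high)


# Hypothetical parameter set
MU25, DH_ACT = 1.0e-4, 6.0e4
DH_L, T_HALF_L = -2.0e5, 278.0
DH_H, T_HALF_H = 3.0e5, 315.0
PARAMS = (MU25, DH_ACT, DH_L, T_HALF_L, DH_H, T_HALF_H)

mu_ref = schoolfield(T_REF, *PARAMS)
ratio_at_half = schoolfield_numerator(T_HALF_H, MU25, DH_ACT) / schoolfield(T_HALF_H, *PARAMS)
```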

#### *3.6. The Models of Ratkowsky and Zwietering (1982–1991)*

As extensively reviewed by Grimaud et al. [61], several models for temperature-dependent growth in biological systems have been developed. Most of these models were developed to describe food spoilage and medical applications. One often cited (802, Scopus, 8 August 2019) empirical model is the square root model proposed by Ratkowsky et al., as an alternative to the widely used Arrhenius model, to describe growth as a function of temperature [26]:

$$\mu(T) = \left[b\_1 \cdot (T - T\_{\min})\right]^2 \tag{21}$$

where *b*<sub>1</sub> is a Ratkowsky parameter, K<sup>−1</sup> s<sup>−0.5</sup>, and *Tmin* is the minimum temperature of growth, K. This model was extended to the complete biokinetic range in 1983 by the same author [59]:

$$\mu(T) = \left(b\_2 \cdot (T - T\_{\min}) \cdot \left| 1 - e^{\left[c\_2 \cdot (T - T\_{\max})\right]} \right|\right)^2 \tag{22}$$

where *c*<sub>2</sub> is a Ratkowsky parameter, K<sup>−1</sup>, and *Tmax* is the maximum temperature, K, at which growth is observed. Zwietering et al. [8] argued that Ratkowsky's model could not be used for temperatures above *Tmax* because, owing to the squared term, it predicts positive growth rates beyond the high-temperature end of the thermal niche. They therefore adapted the model, and the result is shown in Equation (23).

$$\mu(T) = \left[b\_3 \cdot (T - T\_{\rm min})\right]^2 \cdot \left\{1 - e^{\left[c\_3 \cdot (T - T\_{\rm max})\right]}\right\} \tag{23}$$
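The difference between Equations (22) and (23) above *Tmax* is easy to verify numerically. In the sketch below, the extended Ratkowsky model squares the whole expression and therefore returns a positive value above *Tmax*, whereas the Zwietering variant squares only the first factor and returns a negative value. The function names and parameter values are hypothetical.

```python
import math


def ratkowsky_1982(T, b, T_min):
    """Equation (21): square root model for the suboptimal range."""
    return (b * (T - T_min)) ** 2


def ratkowsky_1983(T, b, T_min, c, T_max):
    """Equation (22): whole expression squared, hence never negative."""
    return (b * (T - T_min) * (1.0 - math.exp(c * (T - T_max)))) ** 2


def zwietering_1991(T, b, T_min, c, T_max):
    """Equation (23): only the first factor is squared."""
    return (b * (T - T_min)) ** 2 * (1.0 - math.exp(c * (T - T_max)))


# Hypothetical parameters
B, T_MIN, C, T_MAX = 0.03, 278.0, 0.3, 320.0

above_1983 = ratkowsky_1983(T_MAX + 2.0, B, T_MIN, C, T_MAX)
above_1991 = zwietering_1991(T_MAX + 2.0, B, T_MIN, C, T_MAX)
```

Both variants return zero at *Tmax* and positive values inside the thermal niche; they differ only in the sign of the prediction beyond it.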

#### *3.7. The Model of Roels (1983)*

In 1983, Roels et al. [69] developed a model to describe the growth rate as a function of temperature. The numerator has an Arrhenius-type appearance, whereas in the denominator the activation energy is replaced by the Gibbs free energy change upon denaturation of a rate-controlling enzyme.

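$$\mu(T) = \frac{A \cdot e^{\left(\frac{-E_C}{R \cdot T}\right)}}{1 + B \cdot e^{\left(\frac{-\Delta G_d}{R \cdot T}\right)}}\tag{24}$$

where *A* and *B* are pre-exponential factors, s<sup>−1</sup>; *E<sub>C</sub>* is the energy term of the Arrhenius-type numerator, J mol<sup>−1</sup>; Δ*G<sub>d</sub>* is the Gibbs free energy change upon denaturation of the rate-controlling enzyme, J mol<sup>−1</sup>; *R* is the gas constant, ~8.314 J mol<sup>−1</sup> K<sup>−1</sup>; and *T* is the temperature, K.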

#### *3.8. The Model of Davey (1989)*

Davey proposed an empirical generalized predictive model based on a modified linear-Arrhenius equation, which combined the influence of temperature and water activity to describe microbial growth [70]. The activation energy parameter from the original Arrhenius model was replaced in Davey's model by two coefficients of inverse temperature. In his work, Davey provided and evaluated models for the influence of environmental factors like the *aw* value or pH in combination with temperature, as well as temperature as the sole influencing factor, on growth [70–72]. The model describing temperature as the sole environmental factor is shown in Equation (25):

$$
\mu(T) = e^{C_0 + C_1/T + C_2/T^2} \tag{25}
$$

where *C*<sub>0</sub>–*C*<sub>2</sub> are dimensionless Davey coefficients, and the energy of activation given in the Arrhenius equation is replaced by two parameters of the reciprocal temperature (*T* in K). Two years later, in 1991, Davey used the model for predicting the temperature-dependent lag time [72].
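The relation between Equation (25) and the Arrhenius model can be illustrated briefly. Assuming the Arrhenius form μ = *A*·e<sup>−*E*/(*R*·*T*)</sup>, Davey's model reduces to it for *C*<sub>2</sub> = 0 with *C*<sub>0</sub> = ln(*A*) and *C*<sub>1</sub> = −*E*/*R*; the additional *C*<sub>2</sub>/*T*<sup>2</sup> term bends the otherwise straight Arrhenius plot. Since ln(μ) is linear in the coefficients, they can be estimated by ordinary linear regression. Function names and values below are hypothetical.

```python
import math

R = 8.314  # gas constant, J mol^-1 K^-1


def davey(T, C0, C1, C2):
    """Equation (25): ln(mu) is a quadratic in 1/T."""
    return math.exp(C0 + C1 / T + C2 / T ** 2)


def arrhenius(T, A, E):
    """Arrhenius form assumed for comparison."""
    return A * math.exp(-E / (R * T))


# Hypothetical Arrhenius parameters mapped onto Davey coefficients (C2 = 0)
A, E = 1.0e8, 6.0e4
C0, C1 = math.log(A), -E / R

mu_davey = davey(300.0, C0, C1, 0.0)
mu_arrhenius = arrhenius(300.0, A, E)
```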

#### *3.9. The Models of Lobry and Rosso (1991–1993)*

In 1991, Lobry [73] developed an empirical model that includes the three cardinal temperatures (*Tmin*, *Topt*, and *Tmax*) as parameters. The cardinal temperature model (CTM) estimates positive values for growth rates at temperatures between the low-temperature and high-temperature ends of the thermal niche (*Tmin* and *Tmax*), with the highest growth rate (μ*opt*, s<sup>−1</sup>) at *Topt*. Outside the thermal niche (*T* < *Tmin*; *T* > *Tmax*), negative values are predicted. Each microbial species exhibits these three characteristic cardinal temperatures, which permits a direct biological interpretation of the model parameters and facilitates parameter estimation from experimental data for Lobry's CTM (Equation (26)). The authors emphasize the "absence of high structural correlations" between the parameters of their model. In 1993, Rosso et al. [74] further elaborated Lobry's empirical model by including the point of inflection in the suboptimal range of temperatures, which was experimentally determined. The resulting cardinal temperature model with inflection (CTMI; Equation (27)) could be used to accurately predict growth in the suboptimal range of temperature. Rosso's group noted an "unexpected" high linear correlation between cardinal temperatures, especially between *Tmax* and *Topt*, with r = 0.991. They then argued that, due to the correlations found, one instead of three cardinal temperatures could sufficiently describe the permissive temperature range for growth. They also mentioned the growth behavior of *Vibrio* sp. as an exception to the stated relationships between the cardinal temperatures. In total, they analyzed 47 different data sets describing the growth of psychrophilic, mesophilic, and thermophilic strains.

$$\mu(T) = \mu_{opt} \cdot \left[ 1 - \frac{\left(T - T_{opt}\right)^2}{\left(T - T_{opt}\right)^2 + T \cdot \left(T_{max} + T_{min} - T\right) - T_{max} \cdot T_{min}} \right] \tag{26}$$

$$\mu(T) = \mu_{opt} \cdot \frac{(T - T_{max}) \cdot (T - T_{min})^2}{\left(T_{opt} - T_{min}\right) \cdot \left[\left(T_{opt} - T_{min}\right) \left(T - T_{opt}\right) - \left(T_{opt} - T_{max}\right) \left(T_{opt} + T_{min} - 2 \cdot T\right)\right]} \tag{27}$$
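Because the parameters of Equations (26) and (27) are the cardinal temperatures themselves, both models can be checked directly at these points. The sketch below implements the CTM and the CTMI and confirms that each returns μ*opt* at *Topt* and zero at *Tmin* and *Tmax*. The function names and the cardinal temperatures used are hypothetical.

```python
def ctm(T, mu_opt, T_min, T_opt, T_max):
    """Equation (26): cardinal temperature model (Lobry)."""
    d = (T - T_opt) ** 2
    return mu_opt * (1.0 - d / (d + T * (T_max + T_min - T) - T_max * T_min))


def ctmi(T, mu_opt, T_min, T_opt, T_max):
    """Equation (27): cardinal temperature model with inflection (Rosso)."""
    num = (T - T_max) * (T - T_min) ** 2
    den = (T_opt - T_min) * ((T_opt - T_min) * (T - T_opt)
                             - (T_opt - T_max) * (T_opt + T_min - 2.0 * T))
    return mu_opt * num / den


# Hypothetical cardinal temperatures of a mesophile, K
MU_OPT, T_MIN, T_OPT, T_MAX = 1.0e-4, 278.0, 310.0, 320.0
```

Within the thermal niche both curves are positive and asymmetric, with the steeper flank on the high-temperature side whenever *Topt* lies closer to *Tmax* than to *Tmin*.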

#### *3.10. The Model of Blanchard (1996)*

Blanchard et al. [75] originally developed their model to quantify the short-term temperature effect on natural assemblages of microphytobenthos' photosynthetic capacity. Blanchard described a progressive increase in photosynthetic capacity during a temperature increase up to an optimum temperature, with a rapid decrease when the temperature was raised beyond the optimum. For the model, cardinal temperatures (*Tmin*, *Topt*, and *Tmax*) with a biological meaning are used to facilitate a reasonable initial guess for parameter estimation. Grimaud et al. [61] rewrote the Blanchard model to represent growth as a function of temperature instead of photosynthetic capacity, as shown in Equation (28):

$$
\mu(T) = \mu\_{opt} \left( \frac{T\_{\text{max}} - T}{T\_{\text{max}} - T\_{opt}} \right)^{\beta} \cdot \mathbf{e}^{-\beta \cdot (T\_{opt} - T) / (T\_{\text{max}} - T\_{opt})} \tag{28}
$$

where μ*opt*, s−1, is the maximal specific growth rate at optimal temperature *Topt*, K; *Tmax*, K, is the maximum temperature where growth is observed; and β is a dimensionless Blanchard parameter.
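A minimal sketch of Equation (28) shows how the cardinal temperatures enter the model. At *Topt* both factors equal one, so the function returns μ*opt*, and at *Tmax* the first factor vanishes. Returning zero above *Tmax* is an assumption made here for numerical convenience, because a negative base raised to a non-integer β is not a real number; it is not part of the published model. The function name and parameter values are hypothetical.

```python
import math


def blanchard(T, mu_opt, T_opt, T_max, beta):
    """Equation (28); returns 0 at and above T_max (assumption, see lead-in)."""
    if T >= T_max:
        return 0.0
    ratio = (T_max - T) / (T_max - T_opt)
    return mu_opt * ratio ** beta * math.exp(-beta * (T_opt - T) / (T_max - T_opt))


# Hypothetical parameters
MU_OPT, T_OPT, T_MAX, BETA = 1.0e-4, 310.0, 320.0, 1.5
```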

#### *3.11. The Models of Eppley and Norberg (2004)*

Eppley [76] proposed a simple function with a positive exponential correlation between temperature and maximum expected growth, as shown in Equation (29). He stated that this model may be used for a generalized estimate of μ<sub>max</sub> in unicellular algae for temperatures <40 °C.

$$
\mu(T) = 0.851 \cdot 1.066^T \tag{29}
$$

The proposed Eppley curve or envelope function of Equation (29) shows the evolutionary interspecific upper limit for the maximum specific growth rate at any temperature up to 40 °C. In Eppley's function, this limit increases exponentially with temperature until 40 °C. For model assembly, Eppley used almost 200 data points of different species of unicellular algae [76]. Based on Eppley's findings, Jon Norberg developed a model for temperature-dependent growth in 2004 [77], shown in Equation (30).

$$
\mu(T) = \left[1 - \left(\frac{T-Z}{w}\right)^2\right] \cdot 0.59 \cdot \mathbf{e}^{0.0633 \cdot T} \tag{30}
$$

The envelope function 0.59·e<sup>0.0633·*T*</sup> according to Eppley is contained in Norberg's model for temperature-dependent growth, where *T*, K, is the ambient temperature and *Z*, K, is the temperature at which the growth curve reaches the maximum specific growth rate given by the envelope function, thus representing *Topt*. The width of the temperature response function is determined by the parameter *w*, K, meaning the width of the thermal niche. A generalized form of the Eppley–Norberg model, Equation (31), adds *a* and *b* as dimensionless parameters, generalizing the Eppley envelope function.

$$\mu(T) = \left[1 - \left(\frac{T-Z}{w}\right)^2\right] \cdot a \cdot e^{b \cdot T} \tag{31}$$
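The structure of Equation (31), a parabola multiplied by an exponential envelope, can be sketched as follows. The growth curve touches the envelope at *T* = *Z*, falls to zero at *T* = *Z* ± *w*, and never exceeds the envelope. The defaults for *a* and *b* are the constants of Equation (30). As an assumption of this sketch, temperatures are entered in °C for these defaults, since the envelope is quoted for temperatures up to 40 °C; the function names and the values of *Z* and *w* are hypothetical.

```python
import math


def envelope(T, a=0.59, b=0.0633):
    """Eppley-type envelope a * exp(b*T); defaults taken from Equation (30)."""
    return a * math.exp(b * T)


def eppley_norberg(T, Z, w, a=0.59, b=0.0633):
    """Equation (31): parabolic niche term multiplied by the envelope."""
    return (1.0 - ((T - Z) / w) ** 2) * envelope(T, a, b)


# Hypothetical species: curve touches the envelope at 20 degC, niche half-width 15 degC
Z, W = 20.0, 15.0
```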

#### *3.12. The Modified Master Reaction Model (2005)*

In 1967, Brandts recognized that the master reaction model proposed by Johnson and Lewin (see Equation (12)) failed to describe enzymatic reactions adequately when applied to the full biokinetic temperature range [78,79]. Arguably, the limitations of the model arise from the assumed temperature independence of Δ*G* upon protein denaturation. Therefore, Brandts et al. introduced a temperature dependency of Δ*G* by simply using an empirical polynomial expression relating Δ*G* to *T*. In 1974, Privalov et al. [80] reported a linear relation between enthalpy and entropy, respectively, and temperature upon protein unfolding when assuming a constant heat capacity change for a specific protein. Almost 20 years later, it was reported that the change in enthalpy upon denaturation (Δ*H<sub>d</sub>*) normalized to the number of amino acid residues (or molecular weight, respectively) converged to a common value (Δ*H*<sup>∗</sup>) at a specific temperature (*T*<sup>∗</sup><sub>*H*</sub> ~373 K). Likewise, the entropy change upon denaturation (Δ*S<sub>d</sub>*) normalized to the number of amino acid residues was described to converge to one common value (Δ*S*<sup>∗</sup>) at a specific temperature (*T*<sup>∗</sup><sub>*S*</sub> ~385 K) for a number of homologous compounds [80–83]. The so-called convergence temperatures (*T*<sup>∗</sup><sub>*H*</sub> and *T*<sup>∗</sup><sub>*S*</sub>) were obtained as the temperatures at which the apolar contributions (apolar hydrogen atoms, CH) to the corresponding changes in enthalpy or entropy, respectively, upon denaturation approach zero [81,82,84]. Therefore, Δ*H*<sup>∗</sup> describes only polar and van der Waals interactions and Δ*S*<sup>∗</sup> primarily accounts for configurational entropy [83]. In 1990, Murphy et al. [81] analyzed the convergence behavior by plotting Δ*H<sub>d</sub>* or Δ*S<sub>d</sub>* normalized to mol amino acid residue against the normalized heat capacity change (Δ*C<sub>p</sub>*) upon denaturation and obtained the following correlations:

$$
\Delta S_d = \Delta S^* + \Delta C_p \cdot \ln \left( \frac{T}{T_S^*} \right) \tag{32}
$$

$$
\Delta H_d = \Delta H^* + \Delta C_p \cdot \left( T - T_H^* \right) \tag{33}
$$

The above-mentioned equations describe a temperature-dependent enthalpic and entropic change upon denaturation normalized to the number of amino acid residues in a protein (*n*). From Murphy's findings [81], the change in Gibbs free energy upon protein denaturation (~protein thermal stability) is given by

$$
\Delta G\_d(T) = n \cdot \left[ \Delta H^\star - T \cdot \Delta S^\star + \Delta C\_p \cdot \left[ \left( T - T\_H^\star \right) - T \cdot \ln \left( \frac{T}{T\_S^\star} \right) \right] \right] \tag{34}
$$

where Δ*C<sub>p</sub>*·[(*T* − *T*<sup>∗</sup><sub>*H*</sub>) − *T*·ln(*T*/*T*<sup>∗</sup><sub>*S*</sub>)] accounts for the hydrophobic contribution to the Gibbs free energy change upon denaturation of the rate-determining "master enzyme". Ross connected Murphy's findings with the rewritten master reaction model of Johnson and Lewin, where the change in enthalpy between the catalytically active and inactive state of the rate-limiting enzyme was replaced by the temperature-dependent Gibbs free energy change [79,85].

$$\mu(T) = \frac{c \cdot T \cdot e^{-\Delta^{\ddagger} H^{\circ}/(R \cdot T)}}{1 + e^{-\Delta G_d/(R \cdot T)}}\tag{35}$$

Replacing the description of the Gibbs free energy in the denominator of Equation (35) with the term in Equation (34) resulted in a modified master reaction model:

$$\mu(T) = \frac{c \cdot T \cdot e^{-\Delta^{\ddagger} H^{\circ}/(R \cdot T)}}{1 + e^{-n \cdot \left\{\Delta H^{\star} - T \cdot \Delta S^{\star} + \Delta C_p \cdot \left[\left(T - T_H^{\star}\right) - T \cdot \ln\left(T/T_S^{\star}\right)\right]\right\}/(R \cdot T)}} \tag{36}$$

In the denominator of Equation (36), the thermodynamic parameters Δ*H*<sup>∗</sup>, Δ*S*<sup>∗</sup>, and Δ*C<sub>p</sub>* are normalized to mol amino acid residue. In 2005, Ratkowsky et al. [79] reduced the eight-parameter model given in Equation (36) to a five-parameter model by simply applying the universal constants for globular proteins found by Murphy et al. [81,82], *T*<sup>∗</sup><sub>*H*</sub> = 373.6 K, *T*<sup>∗</sup><sub>*S*</sub> = 385.2 K, and Δ*S*<sup>∗</sup> = 18.1 J K<sup>−1</sup>, to the model. Ratkowsky's group fitted the reduced five-parameter modified master reaction model to data from 35 bacterial strains. The universal constant Δ*H*<sup>∗</sup> of 5640 J mol<sup>−1</sup> amino acid residue suggested by Murphy's group was found to be unsuitable for representing bacterial growth when applied to five data sets of Ross [86]. The reduced modified master reaction model with applied universal constants evaluated in the work of Ratkowsky et al. [79] is given in Equation (37).

$$\mu(T) = \frac{c \cdot T \cdot e^{\left(-\Delta^{\ddagger} H^{\circ}/8.314 \cdot T\right)}}{1 + e^{\left(-n \cdot \left\{\Delta H^{*} - T \cdot 18.1 + \Delta C\_p \cdot \left[(T - 373.6) - T \cdot \ln(T/385.2)\right]\right\}/8.314 \cdot T\right)}}\tag{37}$$
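
The reduced model in Equation (37) can be evaluated numerically to see how the five remaining parameters shape the curve. The following is a minimal sketch, not the fitting procedure of Ratkowsky et al.: the function name and all parameter values are hypothetical and chosen only so that a hunchback-shaped curve appears in a mesophilic temperature range.

```python
import math

R = 8.314  # universal gas constant, J mol^-1 K^-1


def reduced_master_reaction_rate(T, c, dH_act, n, dH_star, dCp):
    """Evaluate Equation (37) at absolute temperature T (K)."""
    # Per-residue Gibbs free energy of unfolding with the universal constants
    # dS* = 18.1 J K^-1, T_H* = 373.6 K and T_S* = 385.2 K already inserted.
    dG_residue = dH_star - T * 18.1 + dCp * ((T - 373.6) - T * math.log(T / 385.2))
    numerator = c * T * math.exp(-dH_act / (R * T))
    denominator = 1.0 + math.exp(-n * dG_residue / (R * T))
    return numerator / denominator


# Hypothetical parameter values, for illustration only (not fitted to any data set).
params = dict(c=1.0e8, dH_act=70_000.0, n=300, dH_star=5_500.0, dCp=60.0)

temperatures = [280.0 + 0.5 * i for i in range(121)]  # 280 K to 340 K
rates = [reduced_master_reaction_rate(T, **params) for T in temperatures]
T_opt = temperatures[rates.index(max(rates))]
print(f"Grid-based optimum temperature: {T_opt:.1f} K")
```

In this sketch, the numerator produces the Arrhenius-type increase, while the denominator stays close to 1 as long as the per-residue free energy of unfolding is positive and grows steeply once it changes sign.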

#### *3.13. The Model of Zeldovich (2007–2016)*

The group of Zeldovich [87] argued that the whole proteome has to be considered when describing the temperature response and sensitivity of an organism. Ghosh and Dill [88] continued the work of Zeldovich et al. and proposed a model that considers the folding stabilities across an organism's proteome to describe temperature-dependent growth rates of bacteria. They assumed that the growth rate was a function of temperature composed of a product of two factors: first, an Arrhenius-type low-temperature activation for one or more activated metabolic processes controlling the increase of growth rate at low temperatures, and second, a term accounting for the folded part of the proteome at any temperature, which also depicts the "denaturation catastrophe" when reaching high temperatures.

$$\mu(T) = \mu\_0 \cdot e^{\left(-\Delta^{\ddagger} H^{\circ} / R \cdot T\right)} \cdot \prod\_{i=1}^{\Gamma} \frac{1}{1 + e^{\left(-\Delta G\_{\text{un}}\left(N\_i, T\right)/R \cdot T\right)}},\tag{38}$$

where μ<sub>0</sub> is a growth rate reference, s<sup>−1</sup>. The parameter Δ<sup>‡</sup>*H*<sup>◦</sup>, J mol<sup>−1</sup>, describes the activation barrier for growth (e.g., an essential growth-limiting metabolic rate). The authors found that the activation barrier for growth (~68.2 kJ mol<sup>−1</sup>) in *E. coli* approximately corresponds to the energy needed by ribosomes to form a peptide bond. Hence, the authors identified ribosomal action to grow protein chains as one of the key growth rate-limiting factors, along with protein motions necessary for enzymatic reactions. Γ describes the number of essential proteins required for growth. The product term accounts for the probability that the *i*th essential protein composed of *N<sub>i</sub>* amino acids is in its native state, which is expressed by Δ*G<sub>un</sub>* in Equation (39):

$$
\Delta G\_{\text{un}} = -k\_B \cdot T\_0 \cdot n \cdot \left[ \frac{g\_0 + m\_1 \cdot c}{k\_B \cdot T\_0} + \frac{\Delta C\_p}{k\_B \cdot T\_0} \cdot \left( T - T\_H^{*} \right) + \frac{T}{T\_0} \cdot \ln(z) - \frac{T}{T\_0} \cdot \Delta C\_p \cdot \ln\left(\frac{T}{T\_S^{*}}\right) + \frac{l\_b}{2 \cdot n} \cdot \left( \frac{Q\_n^2}{R\_n \cdot \left( 1 + \kappa \cdot R\_n \right)} - \frac{Q\_d^2}{R\_d \cdot \left( 1 + \kappa \cdot R\_d \right)} \right) \right] \tag{39}
$$

where *k<sub>B</sub>* is the Boltzmann constant, ~1.381 · 10<sup>−23</sup> J K<sup>−1</sup>; *T*<sub>0</sub> = 300 K; *n* is the number of amino acids, corresponding to the chain length of a protein; *g*<sub>0</sub> is the free energy upon amino acid desolvation and upon contact; *c* is the concentration of denaturant; Δ*Cp* is the heat capacity change upon denaturation, J K<sup>−1</sup> mol-residue<sup>−1</sup>; *T* is the absolute temperature, K; *T*<sub>*H*</sub><sup>∗</sup> = 373 K and *T*<sub>*S*</sub><sup>∗</sup> = 385 K are the enthalpic and entropic convergence temperatures, respectively; *z* is defined as the loss of average conformational freedom per backbone bond; *l<sub>b</sub>* is the average Bjerrum length; *Q<sub>n</sub>* and *Q<sub>d</sub>* are the total net charge of native and denatured protein, respectively; *R<sub>n</sub>* and *R<sub>d</sub>* account for the radii of native and denatured protein, respectively; κ is the reciprocal of the Debye length (for further details, see [89,90]). To obtain the probability distribution for protein stabilities of a proteome, p(Δ*G*), Equation (39) can be used. The equation accounts for the stability of an average single protein of length *n* and may be used in combination with the distribution of protein chain lengths (P(*n*)) of a cell, available for various cell types from genomic/proteomic data, in order to calculate temperature-dependent proteome stability.
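
The structure of Equation (38), an Arrhenius factor multiplied by the folded fraction of every essential protein, can be sketched as follows. This is only an illustration under stated assumptions: the stability function is a simplified per-residue stand-in borrowed from the form used in Equation (36), not Equation (39), and the function names, chain lengths, and all numerical values except the ~68.2 kJ mol<sup>−1</sup> activation barrier mentioned above are hypothetical.

```python
import math

R = 8.314  # universal gas constant, J mol^-1 K^-1


def stand_in_unfolding_energy(n_residues, T, dH_star=5_500.0, dS_star=18.1, dCp=60.0):
    """Placeholder for Delta G_un(N_i, T) in J mol^-1.

    Assumption: a per-residue convergence-temperature form scaled by chain
    length. It is NOT Equation (39) and ignores denaturant and charge terms.
    """
    per_residue = dH_star - T * dS_star + dCp * ((T - 373.6) - T * math.log(T / 385.2))
    return n_residues * per_residue


def proteome_growth_rate(T, mu_0, dH_act, chain_lengths, dG_un=stand_in_unfolding_energy):
    """Equation (38): Arrhenius activation times the product of folded fractions."""
    activation = mu_0 * math.exp(-dH_act / (R * T))
    folded = math.prod(
        1.0 / (1.0 + math.exp(-dG_un(n, T) / (R * T))) for n in chain_lengths
    )
    return activation * folded


# Hypothetical mini-proteome of three essential proteins (chain lengths in residues).
chain_lengths = [150, 300, 450]
temperatures = [280.0 + 0.5 * i for i in range(131)]  # 280 K to 345 K
rates = [proteome_growth_rate(T, 1.0e8, 68_200.0, chain_lengths) for T in temperatures]
T_opt = temperatures[rates.index(max(rates))]
print(f"Grid-based optimum temperature: {T_opt:.1f} K")
```

Because every factor of the product is below 1, adding a further essential protein can only lower the predicted rate, which is how the sketch reproduces the sharp drop at high temperatures.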

#### *3.14. The Model of Daniel (2010)*

In 2010, Daniel et al. [91] proposed the equilibrium model to describe temperature-dependent catalytic activity of enzymes in non-limiting conditions. The group reported that the decrease of the catalytic rate constant (*k<sub>cat</sub>*) of tested enzymatic reactions above the optimal temperature (*Topt*) does not entirely correspond to thermal stability data and irreversible denaturation. Furthermore, they found that part of the activity loss above *Topt* was reversible and probably associated with a "conformational, dynamic and solvent-based effect" altering the active site of the enzyme. To explain the higher-than-expected decrease of *k<sub>cat</sub>* at a certain temperature, they suggested that an enzyme may be present in three states: (i) catalytically active (E<sub>act</sub>); (ii) catalytically inactive, but not (significantly) unfolded (E<sub>inact</sub>); and (iii) irreversibly denatured (X). The rapid changes of the Michaelis constant (*K<sub>m</sub>*), describing an enzyme's affinity towards a substrate, that occur with temperature support their hypothesis of an "E<sub>inact</sub> state", where the active site is altered. They argue that the active site may need a certain degree of flexibility to function properly and is therefore more prone to changes in temperature affecting conformation and dynamics.

$$\text{E}\_{\text{act}} \stackrel{K\_{\text{eq}}}{\rightleftharpoons} \text{E}\_{\text{inact}} \stackrel{k\_{\text{inact}}}{\longrightarrow} \text{X} \tag{40}$$

Daniel's group assumed a rapid equilibrium between Eact and Einact and time-dependent denaturation at a certain temperature. The conversion rate of Eact to Einact is thereby assumed to be faster than the rate of denaturation and catalytic reaction rate, respectively. The authors investigated the applicability of the model for >50 datasets of >30 enzymes of different reaction classes and structures (monomeric to hexameric) and concluded that their model is universally applicable and independent of the reaction type and enzymatic structure [91]. The model may therefore be suitable for describing a thermal growth curve, where a rate-controlling enzymatic reaction is often assumed to describe temperature-dependent growth (e.g., master reaction model, [60]).

$$\mu(T) = \frac{k\_B}{h} \cdot T \cdot E\_0 \cdot e^{-\left(\frac{\Delta G^{*}\_{\text{cat}}}{R \cdot T}\right)} \cdot e^{\left(-\frac{\frac{k\_B}{h} \cdot T \cdot e^{-\left(\frac{\Delta G^{*}\_{\text{inact}}}{R \cdot T}\right)} \cdot e^{\left(\Delta H\_{eq} \cdot \left(\frac{1}{T\_{eq}} - \frac{1}{T}\right)/R\right)} \cdot t}{1 + e^{\left(\Delta H\_{eq} \cdot \left(\frac{1}{T\_{eq}} - \frac{1}{T}\right)/R\right)}}\right)} \cdot \left[1 + e^{\left(\Delta H\_{eq} \cdot \left(\frac{1}{T\_{eq}} - \frac{1}{T}\right)/R\right)}\right]^{-1} \tag{41}$$

where *k<sub>B</sub>* is the Boltzmann constant, ~1.381 · 10<sup>−23</sup> J K<sup>−1</sup>; *h* is Planck's constant, ~6.626 · 10<sup>−34</sup> J s; *T* is the absolute temperature, K; *E*<sub>0</sub> is the total enzyme concentration composed of the sum of E<sub>act</sub>, E<sub>inact</sub>, and X, mol m<sup>−3</sup>; Δ*G*<sup>∗</sup><sub>cat</sub> is the Gibbs free energy of activation of the enzymatic reaction; *R* is the universal gas constant, ~8.314 J mol<sup>−1</sup> K<sup>−1</sup>; Δ*G*<sup>∗</sup><sub>inact</sub> is the Gibbs free energy of activation for the irreversible denaturation of the rate-controlling enzyme; Δ*H<sub>eq</sub>* is the enthalpic change between E<sub>act</sub> and E<sub>inact</sub>; *T<sub>eq</sub>* is the equilibrium temperature where the rate-controlling enzyme is present as 50:50 E<sub>act</sub> and E<sub>inact</sub>, K; and *t* is the time, s.
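
To make the role of the three enzyme states in Equation (41) more tangible, the model can be split into its building blocks: an Eyring-type catalytic rate constant, an Eyring-type inactivation rate constant, and the equilibrium constant between E<sub>act</sub> and E<sub>inact</sub>. The sketch below assumes this decomposition; the function name and all parameter values are hypothetical, and *E*<sub>0</sub> is set to 1 so that the result is a relative rate.

```python
import math

K_B = 1.381e-23  # Boltzmann constant, J K^-1
H = 6.626e-34    # Planck's constant, J s
R = 8.314        # universal gas constant, J mol^-1 K^-1


def equilibrium_model_rate(T, t, E_0, dG_cat, dG_inact, dH_eq, T_eq):
    """Evaluate Equation (41) at temperature T (K) after time t (s)."""
    k_cat = (K_B * T / H) * math.exp(-dG_cat / (R * T))      # catalytic rate constant
    k_inact = (K_B * T / H) * math.exp(-dG_inact / (R * T))  # irreversible denaturation
    K_eq = math.exp(dH_eq * (1.0 / T_eq - 1.0 / T) / R)      # E_inact / E_act ratio
    # Fraction of enzyme not yet irreversibly denatured after time t.
    surviving = math.exp(-k_inact * K_eq * t / (1.0 + K_eq))
    # Only the E_act share, 1 / (1 + K_eq), of the surviving enzyme is active.
    return k_cat * E_0 * surviving / (1.0 + K_eq)


# Hypothetical parameter values, for illustration only.
params = dict(E_0=1.0, dG_cat=75_000.0, dG_inact=95_000.0, dH_eq=150_000.0, T_eq=320.0)

temperatures = [290.0 + 0.5 * i for i in range(121)]  # 290 K to 350 K
rates = [equilibrium_model_rate(T, t=60.0, **params) for T in temperatures]
T_opt = temperatures[rates.index(max(rates))]
print(f"Grid-based optimum after 60 s: {T_opt:.1f} K")
```

Setting `t=0.0` removes the irreversible denaturation term, so any remaining decline above the optimum in that case stems from the reversible E<sub>act</sub>/E<sub>inact</sub> equilibrium alone.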

#### *3.15. The Model of Kooijman (2010)*

The DEB or Dynamic Energy Budget theory deals with the description of rates for physiological processes. Assimilation, growth, respiration, maintenance, or reproduction in individual, not further specified, "organisms" are analyzed and described by the generalized theory. The rates are described as a function of the environment, like the temperature or nutrient availability, and the state of the organism like size or age, for example. S.A.L.M. Kooijman, whose early concepts on energy budgets were published in 1986 [92], summarized his work in "DEB theory for Metabolic Organization" in 2010 [93]. In DEB theory, temperature-dependent growth is described by a reformulated Arrhenius equation in the numerator and complemented by a term inspired by Sharpe's model [63] for reduced rates at the high- and low-temperature end in the denominator. The equation therefore accounts for the amount of enzyme in its native state and considers a possible transition to an inactive state via hot and cold denaturation. Kooijman argues that Eyring's thermodynamic interpretation of the Arrhenius type of temperature dependence might only be understood as an approximation. It is an enormous step from Eyring's model, considering bimolecular reactions in the gas phase, to physiological rates with many compounds [94].

$$\mu(T) = \frac{k\_1 \cdot \mathbf{e}^{T\_A/T\_1 - T\_A/T}}{1 + \mathbf{e}^{T\_{Al}/T - T\_{Al}/T\_l} + \mathbf{e}^{T\_{Ah}/T\_h - T\_{Ah}/T}} \tag{42}$$

where *T*<sub>1</sub> is a reference temperature, K, with the corresponding rate *k*<sub>1</sub>, s<sup>−1</sup>; *T<sub>A</sub>* is the Arrhenius temperature (i.e., linear slope of the Arrhenius plot), K; *T<sub>l</sub>* and *T<sub>h</sub>* mark the cardinal temperatures flanking the thermal niche (low- and high-temperature denaturation, respectively), K; and *T<sub>Al</sub>* and *T<sub>Ah</sub>* account for Arrhenius temperatures at temperature boundaries of the thermal niche, K.
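
Equation (42) translates directly into a short function. The following sketch uses hypothetical parameter values (not taken from the DEB literature) to show that the rate is close to *k*<sub>1</sub> at the reference temperature and is suppressed towards both ends of the thermal niche.

```python
import math


def deb_rate(T, k_1, T_1, T_A, T_l, T_h, T_Al, T_Ah):
    """Evaluate Equation (42) at absolute temperature T (K)."""
    numerator = k_1 * math.exp(T_A / T_1 - T_A / T)
    low_end = math.exp(T_Al / T - T_Al / T_l)    # grows below T_l (cold inactivation)
    high_end = math.exp(T_Ah / T_h - T_Ah / T)   # grows above T_h (heat inactivation)
    return numerator / (1.0 + low_end + high_end)


# Hypothetical parameter values, for illustration only.
params = dict(k_1=1.0, T_1=293.15, T_A=8_000.0, T_l=278.0, T_h=308.0,
              T_Al=50_000.0, T_Ah=190_000.0)

temperatures = [270.0 + 0.25 * i for i in range(201)]  # 270 K to 320 K
rates = [deb_rate(T, **params) for T in temperatures]
T_opt = temperatures[rates.index(max(rates))]
print(f"Grid-based optimum temperature: {T_opt:.2f} K")
```

Inside the thermal niche both exponential terms in the denominator are small, so the curve follows the plain Arrhenius relation given by the numerator.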

#### *3.16. The Model of Huang (2011)*

The group of Huang [95] developed a model by modifying and combining the Arrhenius equation with the theory and model of Eyring et al. [4,20,65].

$$
\mu(T) = A \cdot T \cdot e^{-\left(\frac{\Delta G'}{R \cdot T}\right)^{\alpha}} \tag{43}
$$

where Δ*G*′, J mol<sup>−1</sup>, accounts for an energy term; *R* is the universal gas constant, ~8.314 J mol<sup>−1</sup> K<sup>−1</sup>; *T* is the absolute temperature, K; α is a Huang parameter, K<sup>−1</sup>; and *A* describes the collision or frequency factor, s<sup>−1</sup>. Huang et al. report that their model can only sufficiently describe growth behavior at suboptimal temperatures. To extend their model to the entire physiological temperature range, they used an expression known from Ratkowsky et al. [59]:

$$
\mu(T) = A \cdot T \cdot e^{-\left(\frac{\Delta G'}{R \cdot T}\right)^{\alpha}} \cdot \left[1 - e^{c\_2 \cdot (T - T\_{\max})}\right] \tag{44}
$$

where *c*<sub>2</sub> is a Ratkowsky parameter, K<sup>−1</sup>, and *T<sub>max</sub>* is the maximum growth temperature, K. Huang's group reported good fits (R<sup>2</sup> = 0.985) for their model using data for thermal growth rates from five different bacteria.
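
The effect of the additional Ratkowsky-type factor in Equation (44) can be illustrated with a small sketch. The function name and all parameter values below are hypothetical and serve only to show that the extended model drops to zero at *T<sub>max</sub>* and therefore has an optimum below it.

```python
import math

R = 8.314  # universal gas constant, J mol^-1 K^-1


def huang_rate(T, A, dG_prime, alpha, c_2, T_max):
    """Evaluate Equation (44) at absolute temperature T (K)."""
    # Suboptimal branch, identical to Equation (43).
    suboptimal = A * T * math.exp(-((dG_prime / (R * T)) ** alpha))
    # Ratkowsky-type factor that forces the rate to zero at T_max.
    high_temperature = 1.0 - math.exp(c_2 * (T - T_max))
    return suboptimal * high_temperature


# Hypothetical parameter values, for illustration only.
params = dict(A=1.0, dG_prime=5_000.0, alpha=3.0, c_2=0.1, T_max=320.0)

temperatures = [280.0 + 0.5 * i for i in range(81)]  # 280 K to 320 K
rates = [huang_rate(T, **params) for T in temperatures]
T_opt = temperatures[rates.index(max(rates))]
print(f"Grid-based optimum temperature: {T_opt:.1f} K")
```

Far below *T<sub>max</sub>* the bracketed factor approaches 1, so Equation (44) reduces to Equation (43) in the suboptimal range.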

#### *3.17. The Model of Corkrey (2014)*

In 2014, Corkrey et al. attempted to build a universal mechanistic model [58]. It was used to model the growth of 230 different strains of unicellular and multicellular organisms ranging from psychrophilic to hyperthermophilic, covering a temperature range of 124 °C, from −2 °C to +122 °C. They therefore argued that their findings might be used to model the dependence of the growth rate on temperature for all unicellular and multicellular life forms. Being able to find a good fit for their universal model to the thermal growth curves of various life forms, they concluded that there might be evidence for the presence of a single highly conserved reaction in the last universal common ancestor. Under all limiting conditions, a single-enzyme-catalyzed reaction rate, which controls the growth rate, is described in the numerator of the Corkrey model by an Arrhenius type of temperature dependency. The denominator accounts for the effects of temperature on protein conformation, which causes alterations in catalytic activity of the putative enzyme and therefore a change in the expected rate.

$$\mu(T) = \frac{T \cdot e^{\left(c - \frac{\Delta^{\ddagger} H^{\circ}}{R \cdot T}\right)}}{1 + e^{\left(-n \cdot \frac{\Delta H^{*} - T \cdot \Delta S^{*} + \Delta C\_p \cdot \left[(T - T\_H^{*}) - T \cdot \ln(T/T\_S^{*})\right]}{R \cdot T}\right)}}\tag{45}$$

where Δ<sup>‡</sup>*H*<sup>◦</sup> is the enthalpy of activation, J mol<sup>−1</sup>; *R* is the universal gas constant, ~8.314 J mol<sup>−1</sup> K<sup>−1</sup>; *c* is a dimensionless scaling factor; *T* is the temperature, K; Δ*Cp* is the heat capacity change, J K<sup>−1</sup> mol<sup>−1</sup> amino acid residue, upon denaturation of the putative enzyme; Δ*H*<sup>∗</sup> is the change of enthalpy, J mol<sup>−1</sup> amino acid residue, at the convergence temperature *T*<sub>*H*</sub><sup>∗</sup>, K, for the enthalpy of protein unfolding; and Δ*S*<sup>∗</sup> is the change of entropy, J K<sup>−1</sup>, at the convergence temperature *T*<sub>*S*</sub><sup>∗</sup>, K, for the entropy of protein unfolding.

#### *3.18. The Model of Hobbs (2014)*

The heat capacity model or macromolecular rate theory was proposed by Hobbs et al. [96] and applied by Schipper et al. [97]. They state that enzymatic rates show Arrhenius behavior when increasing with temperature, until an optimum (*Topt*), but argue that decreasing rates above *Topt* cannot sufficiently be explained by denaturation of the enzymes alone. They reported an effect of a heat capacity change upon activation (between the ground state and transition state of a rate-limiting enzyme) that shapes the thermal growth curve. The change in heat capacity affects the temperature dependency of Δ<sup>‡</sup>*G*<sup>◦</sup> (the Gibbs free energy difference between a ground state and transition state), which in turn determines the temperature dependency of enzymatic activity. They state that the change in heat capacity influencing enzymatic rates implicates temperature dependence for various biological rates, ranging from "enzymes to ecosystems". Hobbs et al. formulated their findings using Eyring's equation (Equation (46)) as a scaffold, together with a term that describes the degree of temperature dependence of Δ<sup>‡</sup>*G*<sup>◦</sup> through Δ<sup>‡</sup>*C*<sub>*p*</sub><sup>◦</sup> (Equation (47)). Assuming Δ<sup>‡</sup>*C*<sub>*p*</sub><sup>◦</sup> to be zero, Δ<sup>‡</sup>*G*<sup>◦</sup> would be independent of temperature and the reaction behavior of the growth rate-limiting enzyme would follow an Arrhenius type of temperature dependence. Large negative values for Δ<sup>‡</sup>*C*<sub>*p*</sub><sup>◦</sup> would lead to a significant temperature dependence of Δ<sup>‡</sup>*G*<sup>◦</sup>, leading to non-Arrhenius behavior and explaining a decrease in reaction rate above *Topt*, independent of denaturation. Compared to the master reaction model, heat capacity theory takes into account that enzymes are in fast equilibrium with the transition state and denaturation does not easily occur [61,96,97].

$$\mu\left(T\right) = \frac{k\_B}{h} \cdot T \cdot e^{\left(-\Delta^{\ddagger} G^{\circ}\left(T\right)/R \cdot T\right)}\tag{46}$$

$$
\Delta^{\ddagger} G^{\circ}(T) = \Delta^{\ddagger} H\_{T\_0}^{\circ} + \Delta^{\ddagger} C\_p^{\circ} \cdot (T - T\_0) - T \cdot \left(\Delta^{\ddagger} S\_{T\_0}^{\circ} + \Delta^{\ddagger} C\_p^{\circ} \cdot \ln\left(\frac{T}{T\_0}\right)\right) \tag{47}
$$

where *k<sub>B</sub>* is the Boltzmann constant, ~1.381 · 10<sup>−23</sup> J K<sup>−1</sup>; *h* is Planck's constant, ~6.626 · 10<sup>−34</sup> J s; Δ<sup>‡</sup>*G*<sup>◦</sup> is the Gibbs free energy difference between a ground state and transition state, J mol<sup>−1</sup>; *R* is the universal gas constant, ~8.314 J mol<sup>−1</sup> K<sup>−1</sup>; *T* is the absolute temperature, K; *T*<sub>0</sub> is a reference temperature, K; Δ<sup>‡</sup>*H*<sup>◦</sup><sub>*T*0</sub> and Δ<sup>‡</sup>*S*<sup>◦</sup><sub>*T*0</sub> are the enthalpic and entropic change between reactants and the transition state at the reference temperature *T*<sub>0</sub>; and Δ<sup>‡</sup>*C*<sub>*p*</sub><sup>◦</sup> is the heat capacity change between reactants and the transition state, J mol<sup>−1</sup> K<sup>−1</sup>.
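
How a negative heat capacity change alone produces an optimum can be checked numerically. The sketch below assumes the standard thermodynamic relation Δ*G* = Δ*H* − *T*·Δ*S* for Equation (47), with Δ*H* and Δ*S* each corrected from the reference temperature by the heat capacity term. Under that assumption, setting the derivative of ln μ(*T*) to zero gives *Topt* = (Δ<sup>‡</sup>*H*<sup>◦</sup><sub>*T*0</sub> − Δ<sup>‡</sup>*C*<sub>*p*</sub><sup>◦</sup>·*T*<sub>0</sub>)/(−Δ<sup>‡</sup>*C*<sub>*p*</sub><sup>◦</sup> − *R*), which the code compares against a grid search. Function names and parameter values are hypothetical.

```python
import math

K_B = 1.381e-23  # Boltzmann constant, J K^-1
H = 6.626e-34    # Planck's constant, J s
R = 8.314        # universal gas constant, J mol^-1 K^-1


def activation_free_energy(T, T_0, dH_0, dS_0, dCp):
    """Equation (47), assuming Delta G = Delta H - T * Delta S."""
    dH = dH_0 + dCp * (T - T_0)            # enthalpy of activation at T
    dS = dS_0 + dCp * math.log(T / T_0)    # entropy of activation at T
    return dH - T * dS


def mmrt_rate(T, T_0, dH_0, dS_0, dCp):
    """Equation (46) with the temperature-dependent free energy of activation."""
    return (K_B * T / H) * math.exp(-activation_free_energy(T, T_0, dH_0, dS_0, dCp) / (R * T))


def mmrt_optimum(T_0, dH_0, dCp):
    """Analytical optimum from d ln(mu) / dT = 0 (requires dCp < -R)."""
    return (dH_0 - dCp * T_0) / (-dCp - R)


# Hypothetical parameter values, for illustration only.
params = dict(T_0=298.15, dH_0=60_000.0, dS_0=-40.0, dCp=-3_000.0)

temperatures = [300.0 + 0.01 * i for i in range(4001)]  # 300 K to 340 K
rates = [mmrt_rate(T, **params) for T in temperatures]
T_opt_grid = temperatures[rates.index(max(rates))]
T_opt_analytic = mmrt_optimum(params["T_0"], params["dH_0"], params["dCp"])
print(f"grid: {T_opt_grid:.2f} K, analytic: {T_opt_analytic:.2f} K")
```

Note that the entropy term Δ<sup>‡</sup>*S*<sup>◦</sup><sub>*T*0</sub> scales the height of the curve but does not appear in the expression for the optimum.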

#### *3.19. The Model of DeLong (2017)*

In 2017, DeLong et al. [98] argued that models describing the thermal growth curve lack the assumption that the catalyzing enzyme lowers the activation energy of the rate-determining reaction. They therefore introduced a model that describes the reduction of required activation energy for the rate-determining reaction as a function of free energy (enzyme stability) of the catalyzing enzyme. This term was incorporated into the numerator of the exponent of the Arrhenius function. The Arrhenius activation energy (*E<sub>a</sub>*; see Equation (4)) was replaced with the difference of a baseline energy (*E<sub>b</sub>*, J), describing kinetic requirements as if the reaction took place outside an organism, lowered by the enzymatic contribution (*E<sub>c</sub>*, J) inside an organism, yielding Equation (48):

$$
\mu(T) = A \cdot e^{\frac{-(E\_b - E\_c)}{k\_B \cdot T}} \tag{48}
$$

where the authors used the Boltzmann constant *k<sub>B</sub>*, ~1.381 · 10<sup>−23</sup> J K<sup>−1</sup>, instead of the universal gas constant in the denominator of the exponent, and *T* is the absolute temperature, K. The numerator of the exponent then accounts for the activation energy lowered by the enzymatic action. The extent of enzymatic contribution (*E<sub>c</sub>*) depends on the activity status of the enzyme, which is given by probability terms of protein stability. The temperature-dependent protein stability is given by Δ*G* (Equation (49)), where Δ*H*, J, is the change of enthalpy and Δ*Cp*, J mol<sup>−1</sup> K<sup>−1</sup>, the change in heat capacity, both relative to the melting temperature *T<sub>m</sub>*, K, between the folded and unfolded state. At *T<sub>m</sub>*, Δ*G* is by definition zero and increases for temperatures below *T<sub>m</sub>*. Δ*Cp* reflects the extent of free energy that can be kept by the enzyme without changing the temperature, which increases below *T<sub>m</sub>* with a decreasing temperature. The authors assumed that the probability of the maximum contribution of the enzyme to reduce the activation energy by the amount *E<sub>L</sub>* approaches 1 at Δ*G*<sub>max</sub>. Hence, the probability function in Equation (50) is composed of the maximum amount (*E<sub>L</sub>*) by which an active enzyme can lower the activation energy and the probability of the enzyme being correctly folded and active, given by the ratio of Δ*G* to Δ*G*<sub>max</sub>. This transformation yields probability terms for each parameter in Equation (49), as presented in Equations (51) and (52). Therefore, Equation (49) can be rewritten as Equation (53).

$$
\Delta G = \Delta H \cdot \left(1 - \frac{T}{T\_m}\right) + \Delta C\_p \cdot \left(T - T\_m - T \cdot \ln\left(\frac{T}{T\_m}\right)\right) \tag{49}
$$

$$E\_c = \,\,E\_L \cdot \frac{\Delta G}{\Delta G\_{\text{max}}}\tag{50}$$

$$E\_{\Delta H} = E\_L \cdot \frac{\Delta H}{\Delta G\_{\text{max}}} \tag{51}$$

$$E\_{\Delta \mathcal{C}\_p} = E\_L \cdot \frac{\Delta \mathcal{C}\_p}{\Delta \mathcal{G}\_{\text{max}}} \tag{52}$$

$$E\_c = E\_{\Delta H} \cdot \left(1 - \frac{T}{T\_m}\right) + E\_{\Delta C\_p} \cdot \left(T - T\_m - T \cdot \ln\left(\frac{T}{T\_m}\right)\right) \tag{53}$$

Combining the probability function Equation (53) for the degree of enzymatic contribution to lowering the baseline energy *Eb* of the rate-determining reaction with the Arrhenius-type Equation (48) yields the enzyme-assisted Arrhenius model in Equation (54).

$$\mu(T) = A \cdot e^{\frac{-\left(E\_b - \left(E\_{\Delta H} \cdot \left(1 - \frac{T}{T\_m}\right) + E\_{\Delta C\_p} \cdot \left(T - T\_m - T \cdot \ln\left(\frac{T}{T\_m}\right)\right)\right)\right)}{k\_B \cdot T}}\tag{54}$$
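
The enzyme-assisted Arrhenius model can be sketched in two steps that mirror Equations (53) and (54): first the enzymatic contribution *E<sub>c</sub>*, then the Arrhenius-type rate with the lowered activation energy. The function names and all parameter values are hypothetical and given in joules per molecule, consistent with the use of the Boltzmann constant.

```python
import math

K_B = 1.381e-23  # Boltzmann constant, J K^-1


def enzyme_contribution(T, E_dH, E_dCp, T_m):
    """Equation (53): lowering of the activation energy by the enzyme, J."""
    return E_dH * (1.0 - T / T_m) + E_dCp * (T - T_m - T * math.log(T / T_m))


def enzyme_assisted_arrhenius(T, A, E_b, E_dH, E_dCp, T_m):
    """Equation (54): Arrhenius rate with baseline energy E_b reduced by E_c."""
    E_c = enzyme_contribution(T, E_dH, E_dCp, T_m)
    return A * math.exp(-(E_b - E_c) / (K_B * T))


# Hypothetical parameter values, for illustration only.
params = dict(A=1.0e10, E_b=1.0e-19, E_dH=1.5e-19, E_dCp=5.0e-21, T_m=330.0)

temperatures = [280.0 + 0.1 * i for i in range(501)]  # 280 K to 330 K
rates = [enzyme_assisted_arrhenius(T, **params) for T in temperatures]
T_opt = temperatures[rates.index(max(rates))]
print(f"Grid-based optimum temperature: {T_opt:.1f} K")
```

For positive *E*<sub>Δ*Cp*</sub>, setting the derivative of ln μ(*T*) in Equation (54) to zero gives an optimum at *T<sub>m</sub>* + (*E<sub>b</sub>* − *E*<sub>Δ*H*</sub>)/*E*<sub>Δ*Cp*</sub>, which is 320 K for the values chosen here and agrees with the grid search.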

#### *3.20. Additional Temperature Models*

Table 1 summarizes further models describing a hunchback-shaped curve for temperature-dependent rates.

**Table 1.** Models describing a hunchback-shaped curve of temperature-dependent rates (not explained in the text).


*T* is the absolute temperature, K; *T<sub>opt</sub>* is the optimal growth temperature, K; *T<sub>max</sub>* and *T<sub>min</sub>* are the upper and lower limit of the thermal curve, respectively; *T<sub>low</sub>* and *T<sub>high</sub>* are shaping parameters that determine asymmetry of the growth curve, -; *A<sub>i</sub>* represents frequency factors, s<sup>−1</sup>; *E<sub>i</sub>* represents activation energies, J mol<sup>−1</sup>; *R* is the universal gas constant, ~8.314 J mol<sup>−1</sup> K<sup>−1</sup>; *T<sub>ref</sub>* is a reference temperature, K; μ<sub>*opt*</sub> is the optimal specific growth rate, s<sup>−1</sup>; *E<sub>d</sub>* is the activation energy for enzyme denaturation, J mol<sup>−1</sup>; *K* is a dimensionless inactivation constant.

#### **4. Biotechnological Applications for Targeted Temperature Variation Assisted by Temperature Models**

#### *4.1. Temperature with Potential for Bioprocess Design*

Temperature is an easily measurable (continuously, in situ, and in real-time) and controllable process variable. In most cases, temperature is controlled at constant values to maintain suitable physiological conditions at a given optimum value. Besides temperature, these requirements are also common for other basic physico-chemical characteristics, including dissolved oxygen and the pH value, which are routinely controlled in most bioreactor cultivations. For bioprocess design and optimization, very few correcting variables (e.g., the addition of fresh feed media) are available to direct a bioprocess towards a desired outcome. There are two strategies for efficiently using the temperature to influence the process outcome. First, alterations in the process temperature may be used to target metabolism, deliberately trigger stress responses, modify enzymatic turnover, or activate existing regulatory mechanisms. Secondly, natural or synthetic regulation mechanisms may be introduced to engineer host microorganisms (see Section 2.2). As such, processes can be designed to include these regulatory mechanisms as an effective tool to influence the process. Even though several systems inducible by temperature have been discovered and made available to biotechnologists in the last decades, only very few have been applied for process design or optimization [12].

#### *4.2. Application of Temperature Models and Temperature for Bioprocess Design*

Mukhtar et al. modeled the temperature-dependent soil nitrification potential rate with the square root model of Ratkowsky et al. and estimated the optimum temperature for the nitrification potential rate with the heat capacity model. The authors suggest that knowledge of the thermodynamic properties of the soil nitrification response may be used to improve the application of large-scale fertilization, while reducing eutrophication and the associated negative environmental impacts [105]. To clean water contaminated with excess nitrogen, denitrifying fixed-bed bioreactors can be used for NO<sub>3</sub><sup>−</sup> removal [106]. The group of Nordström et al. used the Eyring equation to simulate the temperature dependency of NO<sub>3</sub><sup>−</sup> removal rates in a denitrifying fixed-bed wood chip bioreactor [107]. They used the heat capacity model of Hobbs et al. [96] to derive the optimum temperature for NO<sub>3</sub><sup>−</sup> removal by their bioreactor. Nordström et al. reported that the NO<sub>3</sub><sup>−</sup> removal rates of the reducing microbial consortium change over time, while the temperature optimum shifts towards lower temperatures (from 24.2 to 16.0 °C) [107]. Downshifts in temperature may be beneficial for a bioprocess, depending on the intended outcome. Seel et al. were able to increase biomass yields to optimize nutrient assimilation at suboptimal temperatures (10 °C) for mesophilic isolates from chilled foods and refrigerators in defined medium compared to their reference strains [52]. They emphasized the importance of defining the "optimum temperature". Seel et al. distinguish between the optimum temperature for the maximum growth rate and the optimum for the maximum biomass yield. At 10 °C, which was around 15–25 °C below the optimum for the maximum growth rate of isolates, they reported an increase of 20%–110% of biomass formation. They argue that the generally assumed optimum growth temperature at μ<sub>max</sub> may not be the optimum for all biological processes in the host.
This has been described for protein production, membrane permeability, and cellular stress [108–112]. The works of Corkrey et al. [58,113] thermodynamically justified the applied findings of Seel et al. Corkrey's model described the connection between the temperature stability of proteins and the growth rate governed by an assumed essential enzymatic reaction, with the temperature for optimal enzyme stability being 10–15 °C below the temperature of μ<sub>max</sub>. Seel et al. state that due to correct protein folding and protein turnover, energy can be conserved and the biomass yield improved. Perhaps the most popular example of altering temperature to optimize the process outcome is the production of recombinant protein, for instance with *E. coli* as a heterologous expression system [12]. During the process, typically at the stage of the induction of protein expression, a temperature shift is performed, lowering the process temperature by several degrees Celsius. This temperature shift does not benefit overall transcription or translation activity but instead results in an increased amount of correctly folded protein. This is because lower amounts of recombinant protein are less likely to form inactive, insoluble aggregates (inclusion bodies), and lower protein production rates allow more time for correct folding after translation. In the case of enzyme production, correctly folded protein is a prerequisite of enzymatic activity, and therefore, in most cases, temperature shifts are applied [114]. Another possibility for using temperature as a method for optimizing protein production is the application of thermo-inducible promoter systems. From the perspective of bioprocess engineering, it is of high interest to combine strain engineering and process development approaches to maximize overall productivity and yields.
Strong chemically inducible promoters and expression systems that are commonly used in laboratory-scale protein expression [115] impose a high metabolic burden on the microorganisms, and as such, protein production results in a simultaneous reduction of growth. It is therefore beneficial for optimized process design to be able to uncouple biomass growth from protein biosynthesis. Furthermore, production of recombinant proteins at early stages of cultivation often results in reduced overall yields, as many proteins are sensitive to degradation by proteases [114,116]. To counteract this instability of recombinant proteins, inducible promoter systems are a possible tool. Thermo-inducible promoter systems have the advantage of not requiring intrusion into the process, as chemical inducers are not required. As such, the risk of contamination, as well as the costs of the process, may be reduced. Nalley et al. evaluated the effect of temperature on growth, fatty acid production, and the fatty acid profile for algae suitable for mass cultivation and biofuel production [117]. They assessed the effect of temperature on microalgae with the model of Norberg [77]. Nalley et al. report temperature-specific fatty acid production, which is mostly controlled by the temperature-dependent growth rate. Furthermore, they found that temperature dramatically influences the fatty acid profile, with an increase in polyunsaturated fatty acids and decrease in monounsaturated and short fatty acids when increasing the temperature. The thermostable phosphotriesterase-like lactonases (PLLs) from the hyperthermophilic genus *Sulfolobus* are industrially relevant molecules for bioremediation processes, such as the degradation of highly toxic pesticides like organophosphates [118]. Thermostable PLLs are of particular interest due to their wide temperature and pH working range, as well as resistance to organic solvents. In contrast to mesophilic enzyme isolates, extremozymes are not prone to low stability in solution or at elevated temperatures (>30 °C) [119,120]. In the work of Restaino et al., a high-yield pre-industrial-scale process with an optimized purification method for PLLs was developed [11]. The authors exploited the thermostability of PLLs in their downstream processing, and recombinant PLL production in fast-growing mesophilic *E. coli* for their upstream and bioproduction strategy. Impurities were removed by a thermo-precipitation step (65–75 °C), which was optimized using a statistical response surface method to compute optimal precipitation temperatures. The solubility of proteins can be altered by different variables, like the pH, protein concentration, ionic strength, or temperature. Ethanol may be used as a solvent for precipitation, but exhibits the tendency to denature proteins at temperatures above 0 °C. Therefore, cold EtOH is often used for protein fractionation [121]. Cimini et al. investigated the influence of temperature on the industrially relevant capsular polysaccharide (CPS) of *E. coli* K4 [122]. It exhibits a high similarity to chondroitin, which is economically valuable but expensive to extract from animal tissue. Chondroitin is, for example, used in the pharmaceutical sector to prevent osteoarthritis [123]. Cimini et al. found a positive correlation between CPS production and temperature.
As stated before, CPS production is thermoregulated and *E. coli* CPS are not expressed at temperatures <20 °C [124]. Another pharmaceutically relevant product, and a precursor for the commonly used anti-inflammatory drug desfluorotriamcinolone, is 16α-hydroxy hydrocortisone. Hydrocortisone is converted in a temperature-dependent manner by *Streptomyces roseochromogenes* to 16α-hydroxy hydrocortisone. The group of Restaino et al. was able to maximize the bioconversion of hydrocortisone to 16α-hydroxy hydrocortisone, while lowering side-product formation. By adjusting the process temperature to 26 °C and the pH to 6, they were able to almost entirely (95%) convert hydrocortisone into the desired product 16α-hydroxy hydrocortisone [125]. Another example of lowering the temperature to obtain optimal expression of correctly folded, functional recombinant enzymes is the mammalian enzyme 6-*O*-sulfotransferase (6-OST). It can be recombinantly produced in *E. coli*. 6-OST is of particular interest as it is required for the industrial and biotechnological production of heparin. So far, the blood anticoagulant heparin has only been derived from animals. 6-OST site-specifically sulfonates a heparin precursor and marks a key step in heparin bioproduction. The group of Restaino et al. reported high cell density cultivation of *E. coli* in which recombinant mammalian 6-OST was produced using an induction strategy optimized for yield and productivity. The strategy involved lowering the temperature (37 to 25 °C) upon induction and using a combination of two inducer molecules to balance the metabolic burden. The combination of balanced biomass growth and the induction strategy resulted in an optimal recombinant enzyme expression and enhanced biomass productivity [126].

An interesting application involving measuring and controlling the temperature during bioprocesses is the estimation of metabolic activity by heat balancing. In a partially isolated bioreactor system, heat generation by metabolic processes can be calculated by measuring heat transfer from or to the bioreactor [5]. This calorimetric technique typically involves establishing a heat balance from the heat transfer from or to the heat exchanger, enthalpy balancing in exhaust gas, energy dissipation by stirring, and monitoring the temperature of added liquids [6]. Calorimetric control strategies that adjust feed rates were developed for the growth rate of *Escherichia coli* [13] and *Saccharomyces cerevisiae* [14]. The authors report the successful establishment and control of a high-cell density cultivation, with the feed rate solely relying on heat balancing. Furthermore, besides applications in process control, a calorimetric approach for the detection of prophage activation and release was recently proposed. The authors report that by evaluating differences in metabolic heat, reactivation of dormant infected bacterial cells can be detected [127]. In general, however, the development of calorimetric control strategies at a laboratory scale is difficult, as sufficient isolation and sensitive equipment to detect heat generated at a small scale are required [128]. Even though, at a larger scale, increasing ratios of volume to surface favor the sensitivity of calorimetric approaches, they are only rarely applied in industrial biotechnology. Table 2 provides an overview of biological traits, and temperature models and/or temperature adjustments used to achieve a desired process outcome.


**Table 2.** Biological traits associated with modeling techniques and/or targeted temperature adjustments for the control, monitoring, and optimization of biotechnological processes.

#### **5. Summary and Conclusions**

In bioprocess engineering, very few process variables are usually available online. Therefore, exploiting existing control variables to their full extent is a reasonable strategy for broadening existing toolsets of monitoring and control. Both underlying biological mechanisms of temperature sensing and adaptation and mathematical models for temperature effects have been well-described. However, temperature as a control variable is only scarcely applied in bioprocess engineering, so an exploitation strategy merging both in context has not yet been established.

This review presents and discusses the most important models for physiological, biochemical, and physical properties governed by temperature, along with application perspectives. As such, this review provides a toolset for the future exploitation of temperature as a control variable for optimization, monitoring, and control applications in bioprocess engineering.

**Author Contributions:** Conceptualization, P.N. and M.H.; Formal Analysis, P.N., L.L., R.H. and M.H.; Investigation, P.N., L.L., R.H. and M.H.; Writing-Original Draft Preparation, P.N.; Writing-Review & Editing, P.N., L.L., R.H. and M.H.; Supervision, M.H. All authors have read and agreed to the published version of the manuscript.

**Funding:** P.N. is a member of the "BBW ForWerts" graduate program and receives a scholarship within the frame of the Baden-Wuerttemberg Landesgraduiertenfoerderung (LGF) awarded by the Ministry of Science, Research and the Arts (MWK) of Baden-Wuerttemberg, Germany.

**Conflicts of Interest:** The authors declare no conflicts of interest.

#### **References**


© 2020 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).

## *Review* **Practical Solutions for Specific Growth Rate Control Systems in Industrial Bioreactors**

#### **Vytautas Galvanauskas \*, Rimvydas Simutis, Donatas Levišauskas and Renaldas Urniežius**

Department of Automation, Kaunas University of Technology, Kaunas 51367, Lithuania; rimvydas.simutis@ktu.lt (R.S.); donatas.levisauskas@ktu.lt (D.L.); renaldas.urniezius@ktu.lt (R.U.)

**\*** Correspondence: vytautas.galvanauskas@ktu.lt; Tel.: +370-37-300-291

Received: 27 August 2019; Accepted: 30 September 2019; Published: 2 October 2019

**Abstract:** This contribution discusses the main challenges related to successful application of automatic control systems used to control specific growth rate in industrial biotechnological processes. It is emphasized that, after the implementation of basic automatic control systems, primary attention shall be paid to the specific growth rate control systems because this process variable critically affects the physiological state of microbial cultures and the formation of the desired product. Therefore, control of the specific growth rate enables improvement of the quality and reproducibility of the biotechnological processes. The main requirements have been formulated that shall be met to successfully implement the specific growth rate control systems in industrial bioreactors. The relatively easy-to-implement schemes of specific growth rate control systems have been reviewed and discussed. The recommendations for selection of particular control systems for specific biotechnological processes have been provided.

**Keywords:** biotechnological processes; bioreactor control; specific growth rate control; batch-to-batch reproducibility

#### **1. Introduction**

Biotechnological processes play an increasingly important role in modern industry and health sectors. Many of the important active pharmaceutical ingredients are recombinant therapeutic proteins produced by the cultivation of genetically modified microorganisms or mammalian cells in bioreactors. These biotechnological processes are highly nonlinear and non-stationary. Therefore, modeling and control of these bioprocesses are complicated control engineering tasks, especially in industrial recombinant protein production processes, in which high safety requirements and operational restrictions must be met [1,2]. The goal of this contribution is to review and recommend practical and easily implementable control system schemes for biomass specific growth rate (hereafter referred to as SGR or μ) control in industrial bioreactors. The recommendations are based on an analysis of the existing SGR control solutions and the availability of control schemes suitable for practical implementation in industrial bioreactors. The specific growth rate μ (1/h) is defined as the ratio of the absolute growth rate of the cells to the amount of cells:

$$
\mu = \frac{dX}{dt} \,\,\frac{1}{X} \tag{1}
$$

where *X* = *xV* (g) is the cell (biomass) amount; *x* (g/L) is the cell (biomass) concentration; and *V* (L) is the cultivation broth volume. The SGR is the most important variable in biotechnological processes, which influences the physiological state of microbial culture, production of cell biomass and desired products, and quantity and quality of products [3–8].
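Because *X* = *xV* and the broth volume changes during fed-batch operation, it can help to expand Equation (1) with the product rule. The following short derivation is added as an illustration; it uses only the definitions given above.

$$
\mu = \frac{1}{xV}\,\frac{d(xV)}{dt} = \frac{1}{x}\,\frac{dx}{dt} + \frac{1}{V}\,\frac{dV}{dt}
$$

Under this definition, the relative rate of change of the concentration *x* alone underestimates μ whenever feeding increases the volume *V*.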

The development of relatively simple and reliable methods for SGR monitoring and control in industrial bioreactors is one of the most important control engineering tasks for successful implementation of the Process Analytical Technology (PAT) framework in bioengineering [9,10]. However, to properly exploit the benefits of SGR control systems in microbial and mammalian cell cultivation processes, basic bioprocess variables (temperature, pH, dissolved oxygen concentration, etc.) need to be controlled by commonly available and well-functioning control systems. Unfortunately, in many cases these systems do not ensure the control quality [1,11,12] that is needed before proceeding to SGR monitoring and control.

This paper is structured as follows: In Section 2, the importance of the control quality of the basic control systems in biotechnological processes is analyzed. Section 3 introduces important preconditions for implementation of SGR control systems in industrial bioreactors. Section 4 expands on strategies for SGR control suitable for industrial bioreactors. Finally, the authors give recommendations for application of the discussed SGR control solutions in various biotechnological processes.

#### **2. Quality of Basic Control Systems in Industrial Bioreactors**

The performance quality of automatic control systems for basic process variables is still low in most industrial microbial and mammalian cell cultivation processes [1,11]. Despite the fact that sophisticated control strategies for microbial cultivation processes are widely discussed in the academic community and research papers [2,13–15], the authors' experience shows that, at present, only simple, conventional automatic control systems are realized in the majority of industrial-scale (bio)reactors [11,16]. This situation is related to a common underestimation of the control systems' importance in improving the productivity and quality of the biotechnological processes. It is also related to the relatively high costs of implementation and maintenance of these advanced control systems and the resultant low acceptance of these systems by plant managers.

Bioreactors are the key operation units in biochemical and biopharmaceutical processes, in which the basic control systems attempt to control the cultivation environment outside of the cell. The commonly controlled variables of the cells' environment are temperature, pressure, pH, and dissolved oxygen concentration. The basic feedback control systems of industrial bioreactors for controlling bacterial cell cultures that produce biopharmaceutical products are presented in Figure 1. The temperature controller manipulates the flow rate of cooling water in the jacket. The pressure inside the bioreactor is controlled by manipulation of the off-gas flow rate. The pH controller manipulates the flow rate of ammonia solution (usually, the acid solution does not need to be added to bacterial cell culture cultivations, unless compensation of base excess is required). The dissolved oxygen controller output is split to manipulate the air flow rate and the agitation speed (at high cell density cultivation the air flow may be enriched by additional oxygen).

Today, the most important industrial cultivations of microbial and mammalian cells are carried out in the fed-batch mode. In fed-batch processes, one or more substrates are fed into the bioreactor during the process. The product remains in the bioreactor until the end of the cultivation cycle. Fed-batch processes overcome substrate inhibition and overflow effects. Such an operational mode allows a high cell density and product concentrations to be achieved [6]. By controlling the substrate feeding rate, the optimal conditions for the biotechnological process can be secured.

To realize the bacterial growth rate control systems, efficient glucose feeding algorithms need to be implemented, and in the mammalian cell cultivation processes, additionally, the feeding rate of glutamine needs to be controlled. It is important to note that modern industrial bioreactors are equipped with inexpensive and reliable devices to measure the composition of aeration gas in the inlet and outlet (fraction of *O2, CO2*) and the molar flow rate, *Q*. Hence, the oxygen uptake rate (OUR) and the carbon dioxide production rate (CPR) can be calculated from the online measurements as follows:

$$\text{OUR} = Q\left(\text{O}_2^{\text{in}} - \text{O}_2^{\text{out}}\right),\tag{2}$$

$$\text{CPR} = Q\left(\text{CO}_2^{\text{out}} - \text{CO}_2^{\text{in}}\right). \tag{3}$$

The above measurements allow an online estimation of the important process variables, OUR and CPR, during bacterial and mammalian cell cultivation processes [17,18]. Because of the lower cell density and respiratory intensity, OUR and CPR measurements based on the off-gas composition may cause larger measurement errors in mammalian cell cultivation processes. As an alternative technique, an OUR estimation using dissolved oxygen (DO) measurements may also be applied [19]. The OUR and CPR are the most important variables for indirect monitoring of the biomass growth rate in industrial bioreactors, as they comprehensively reflect the physiological state and metabolic activity of the aerobic biotechnological processes [1–3,11,20].
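As a minimal sketch of the gas-balance calculation in Equations (2) and (3), the following Python function computes OUR and CPR from one set of gas-analyzer readings. The function name, argument names, and the numerical readings are hypothetical; the gas fractions are assumed to be mole fractions, and the same molar flow rate *Q* is assumed for inlet and outlet, as in the equations.

```python
def gas_exchange_rates(q_mol_per_h, o2_in, o2_out, co2_in, co2_out):
    """Return (OUR, CPR) in mol/h from Equations (2) and (3).

    Gas compositions are mole fractions; q_mol_per_h is the molar gas flow rate Q.
    """
    our = q_mol_per_h * (o2_in - o2_out)      # Equation (2)
    cpr = q_mol_per_h * (co2_out - co2_in)    # Equation (3)
    return our, cpr


# Hypothetical readings: 100 mol/h of air, off-gas slightly depleted in O2.
our, cpr = gas_exchange_rates(100.0, 0.2095, 0.1900, 0.0004, 0.0180)
print(f"OUR = {our:.2f} mol/h, CPR = {cpr:.2f} mol/h")
```

In practice the readings would be filtered before use, since the later SGR estimators divide by or differentiate these signals.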

**Figure 1.** Basic control system loops for a typical microbial cultivation process.

Good performance of the basic control systems improves the batch-to-batch reproducibility/repeatability of the processes [11,21]. The other advantage of a well-controlled bioreactor is the possibility to run the bioreactor at higher capacity or with better efficiency by operating the process closer to physical constraints. Good reproducibility is also an important condition for possible process improvements and modifications, as improvement in reproducibility by means of well-operating control systems allows a reduction in the number of expensive and time-consuming experiments required to compare the performance indices of the modified processes and to optimize the controlled technological regime [1].

Proportional-integral-derivative (PID) controllers are the predominant controllers used in the basic control systems of microbial and mammalian cell cultivation processes. The quality of the bioprocess control depends on the complexity of the process dynamics, the process variable measurement noise and errors, the tuning of the system controllers, and the accuracy of the executive devices (valves and speed drives). The dynamics of a particular control loop can be characterized by three parameters: dead time, time constant, and process gain. These parameters are commonly used to determine the tuning parameters of a PID controller. Control quality of the bioreactor operation mode critically depends on how well the controllers are set up and tuned to deal with the sources of the process variability [1,2,11,13]. Because of the nonlinearity and nonstationarity of the bioprocesses, proper tuning of the controllers requires appropriate effort. The performance of PID controllers with fixed tuning parameters is not sufficiently accurate because of the significant variations in the process dynamics. Consequently, various approaches have been proposed to tune the PID controller parameters in microbial cultivation processes under time-varying operating conditions, including gain-scheduling methods [12,22–24], first-principle models [25], tendency models [26], rule-based fuzzy systems [13], and other techniques [1,2,14,15]. The proposed approaches give a sound theoretical and practical basis to implement adaptive control schemes in bioreactor systems and show that implementation of the adaptive algorithms in basic control systems can significantly increase the performance of the systems. Gain-scheduling methods and tendency models are the most appropriate solutions for improving the quality of basic control systems in microbial cultivation processes because they are relatively simple to implement and pose low requirements on model complexity. The advantages have been extensively discussed and substantiated in previous studies [12,22–24,26]. An important task now is to broaden implementation of these algorithms in industrial bioreactors.
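To illustrate how dead time, time constant, and process gain can be turned into controller settings, the sketch below applies the SIMC rule for a PI controller. This rule is chosen here purely as an example and is not named in the cited works; the function name and the numerical values are hypothetical. Re-evaluating the rule whenever the identified process gain changes is one simple way to build a gain schedule.

```python
def simc_pi(process_gain, time_constant, dead_time, closed_loop_tc=None):
    """PI settings (controller gain, integral time) from a first-order-plus-dead-time model.

    Uses the SIMC rule; closed_loop_tc is the desired closed-loop time constant.
    """
    if closed_loop_tc is None:
        closed_loop_tc = dead_time  # common default: tau_c equal to the dead time
    kc = time_constant / (process_gain * (closed_loop_tc + dead_time))
    ti = min(time_constant, 4.0 * (closed_loop_tc + dead_time))
    return kc, ti


# Hypothetical schedule: the process gain is assumed to rise during cultivation,
# so the controller gain is lowered accordingly.
for phase, gain in (("early", 1.0), ("mid", 2.0), ("late", 4.0)):
    kc, ti = simc_pi(process_gain=gain, time_constant=10.0, dead_time=1.0)
    print(f"{phase:5s}  Kc = {kc:.3f}  Ti = {ti:.1f}")
```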

Well-functioning basic control systems create opportunities for further process improvements and also for implementation of the SGR control systems in bioreactors [1,11]. Development of relatively simple solutions to control the specific growth rate in fed-batch processes remains a timely and important task in view of implementing the PAT framework in industrial biotechnological processes.

#### **3. Preconditions for Implementation of SGR Control Systems in Industrial Bioreactors**

The basic requirements for SGR control systems designed to control microbial and mammalian cell cultivation processes can be formulated as follows:


According to the above requirements, most of the solutions for SGR control systems presented in scientific literature [3,6] are not attractive enough for industrial implementation. More complex monitoring and control systems, even if equipped with easy-to-use interfaces, may require retuning, model identification, and maintenance tasks in case the operational modes or microbial cultures have been changed. Often these tasks cannot be carried out by the biotechnology companies alone and may cause additional expenses for outsourcing and production delays. In the authors' opinion, this is the main reason SGR control systems have rarely been used in industrial bioreactors so far.

In this contribution, the authors provide an overview of those SGR control systems that meet the aforementioned requirements. In the most widespread control systems, SGR is usually controlled by manipulating the substrate feeding rate [6,27]. In recombinant protein production processes, the temperature of the medium is also used to control cell growth [28]. Although the growth rate could also be controlled by other means (e.g., by manipulating the dissolved oxygen concentration in the cultivation medium [29], the temperature of the medium [28], or the pH), to date these techniques have not been sufficiently investigated and have not been widely implemented in industrial practice [11].

When the growth rate is controlled by manipulating the substrate feeding rate, the substrate concentration in the cultivation medium remains relatively low [30]. This avoids the production of overflow metabolites in some of the most important microbial expression systems, so-called Crabtree-positive organisms, such as *S. cerevisiae* and *E. coli*. The presence of overflow metabolites, such as acetate or ethanol, frequently inhibits both biomass growth and protein formation.

To ensure controllability and batch-to-batch reproducibility, the SGR needs to be controlled during the process at a level that is lower than the maximum SGR [11]. The maximum available SGR is observed during particular growth phases of the process, when the growth is not limited by substrate concentration [12,31], and depends on the specific culture, medium composition, concentrations of biomass and metabolites, as well as the oxygen transfer capabilities of the bioreactor. It is worth mentioning that direct control of the substrate concentration in bioreactor at the set-point does not guarantee that a constant SGR will be kept. This is because [11]:


In the majority of recombinant protein production processes, the control objective is to maximize the amount of target protein at the end of the process while maintaining high batch-to-batch reproducibility. To achieve this goal, two steps most often are implemented for SGR control:


SGR control systems can be realized using open-loop and closed-loop control systems [6,11]. In the following sections, the authors analyze and evaluate SGR control systems that are best suited for industrial applications. The analyzed and evaluated control solutions are ordered in this review by their complexity (i.e., starting with the simplest open-loop systems and ending up with the control systems that employ cascade control schemes and SGR estimators).

#### **4. Schemes for SGR Practical Control Systems**

#### *4.1. Open-Loop SGR Control Systems*

The majority of industrial fed-batch microbial cultivation processes are operated using open-loop SGR control systems [11], in which the time profile of the substrate feeding rate is calculated using simple mass-balance models and a desired time profile of the SGR during the process. The desired SGR values, μ*set,* can be described by the following equation:

$$
\mu_{\text{set}} = \begin{cases}
\mu_{\text{set}_1} = (0.85 \dots 0.90)\,\mu_{\max} & \text{for growth phase,} \\
\mu_{\text{set}_2} = \mu_{\text{opt}} & \text{for production phase.}
\end{cases}
\tag{4}
$$

Based on the desired set values for SGR, the corresponding substrate feeding rate can be determined. Accumulation of the total biomass during cultivation and the substrate feeding rate for both stages of the process can be estimated from simple mass-balance equations:

$$\frac{dX}{dt} = \mu_{\text{set}_i} X, \quad i = 1, 2. \tag{5}$$

The amount of biomass accumulated in the growth stage can be calculated from the equation

$$X(t) = X_0 e^{\mu_{\text{set}_1} t},\tag{6}$$

and the amount of biomass accumulated in the production stage can be calculated from the equation

$$X(t) = X_0 e^{\mu_{\text{set}_1} t_g} \cdot e^{\mu_{\text{set}_2} \left(t - t_g\right)},\tag{7}$$

where *tg* (h) is the end time of the growth stage.

Using the predicted time trajectories of the biomass accumulation, *X(t)*, the substrate feeding rate *F1(t)* in the growth stage can be derived from the dynamic mass-balance equation for the substrate under steady-state conditions [2,6,8,20,30], which results in the following equation

$$F_1(t) = \frac{X_0 e^{\mu_{\text{set}_1} t}\, \mu_{\text{set}_1}}{Y_{xs1} S_F},\tag{8}$$

and the substrate feeding rate *F2(t)* in the production stage can be estimated from the equation

$$F_2(t) = \frac{X_0 e^{\mu_{\text{set}_1} t_g}\, e^{\mu_{\text{set}_2} \left(t - t_g\right)}\, \mu_{\text{set}_2}}{Y_{xs2} S_F},\tag{9}$$

where *X0* (g) is the total amount of biomass in the bioreactor at the beginning of cultivation process; *X(t)* (g) is the time trajectory of the total biomass accumulated during the process; *SF* (g/L) is the concentration of the substrate in the feeding solution; and *Yxs1* and *Yxs2* (g/g) are the yields of biomass on substrate in the growth and production phases, respectively. In substrate-limited processes, the substrate concentration in the bioreactor is low. Therefore, this concentration is not taken into account in Equations (8) and (9).

When the SGR control algorithm based on Equations (8) and (9) is developed, implementation of the control system is straightforward. For this purpose, only an actuator to dose the feeding substrate to the bioreactor is needed. For some recombinant protein production processes, the yield of biomass on substrate can be different in the growth (*Yxs*1) and the production stages (*Yxs*2). In such cases, the yields must be identified from experimental data for the particular process phase and must be taken into account when using Equations (8) and (9) to estimate the substrate feeding rates. Additionally, μ*max* and μ*opt* may vary during the process because of the increasing concentrations of metabolites, biomass, and other process variables. In this case, μ*max(t)* and μ*opt(t)* should be presented as time profiles, and a numerical integration procedure to predict the biomass growth and substrate feeding time profiles needs to be applied.
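The open-loop feeding profile described above can be sketched in a few lines of Python. This is a minimal illustration, assuming constant set-points and yields in each stage; the production-stage rate is written as μ*set2*·*X(t)*/(*Yxs2*·*SF*), by analogy with Equation (8). The function names and all numerical values are hypothetical and not taken from the cited works.

```python
import math


def biomass_profile(t_h, x0_g, mu_set1, mu_set2, t_g_h):
    """Total biomass X(t) in g, following Equations (6) and (7)."""
    if t_h <= t_g_h:
        return x0_g * math.exp(mu_set1 * t_h)
    return x0_g * math.exp(mu_set1 * t_g_h) * math.exp(mu_set2 * (t_h - t_g_h))


def feed_rate(t_h, x0_g, mu_set1, mu_set2, t_g_h, y_xs1, y_xs2, s_f_g_per_l):
    """Open-loop substrate feeding rate F(t) in L/h, following Equations (8) and (9)."""
    x = biomass_profile(t_h, x0_g, mu_set1, mu_set2, t_g_h)
    if t_h <= t_g_h:
        return mu_set1 * x / (y_xs1 * s_f_g_per_l)
    return mu_set2 * x / (y_xs2 * s_f_g_per_l)


# Hypothetical process data, chosen only to exercise the formulas.
params = dict(x0_g=10.0, mu_set1=0.5, mu_set2=0.175, t_g_h=8.0,
              y_xs1=0.5, y_xs2=0.5, s_f_g_per_l=500.0)
for t in (0.0, 4.0, 8.0, 12.0):
    print(f"t = {t:4.1f} h  F = {feed_rate(t, **params):.3f} L/h")
```

With equal yields in both stages, the feed rate drops by the factor μ*set2*/μ*set1* at the switch from growth to production and then rises exponentially again at the lower rate.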

The substrate feeding time profiles estimated from Equations (8) and (9) can be directly used for implementing the open-loop SGR control systems in various biotechnological processes [6,7,22,30,32,33]. Certainly, more sophisticated bioprocess models and optimization procedures can be used to determine the feeding rate control algorithms in open-loop SGR control systems. These methods are widely reviewed and analyzed in many research and academic papers [3,6,8,27,34,35]. However, implementation of more sophisticated procedures in industrial bioprocesses requires specific knowledge in process modeling and efforts to develop more accurate models. Consequently, the application of complex methods in an industrial environment is not commonplace.

The SGR open-loop control systems based on Equations (8) and (9) are easy to implement and do not require additional measurements. On the other hand, open-loop systems do not compensate for process disturbances. Consequently, possible variations in the substrate concentration of the feeding solution or deviations of the feeding flow rate are not compensated. These disturbances can decrease the performance of the biotechnological process. In the next sections, the authors analyze and provide relatively simple and already existing solutions to overcome these problems.

#### *4.2. SGR Control Systems Based on CPR*/*OUR Estimations*

Reliability and accuracy of SGR control can be increased by employing closed-loop control systems. One of the simplest closed-loop SGR control systems is proposed in Reference [27]. Here, based on a simplified assumption that the carbon dioxide production rate (CPR) during the process is in a linear relationship with the biomass growth rate

$$
\text{CPR}(t) = \alpha\,\mu(t)\,X(t),\tag{10}
$$

the specific growth rate μ can be estimated using the following equation

$$\mu(t) = \frac{\text{CPR}(t)}{\int_0^t \text{CPR}(\tau)\,d\tau},\tag{11}$$

where α is a model parameter, and τ is the integration time variable.

If real-time estimation of SGR is available, feedback control systems can be developed to automatically track a desired SGR time profile by manipulating the substrate feeding rate.
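A minimal sketch of the estimator in Equation (11) is shown below, assuming sampled *CPR* values and trapezoidal integration. The function name and the synthetic test signal are hypothetical. Because the integral starts from zero, the first samples are unreliable; on a noise-free exponential signal the estimate approaches the true value as CO2 accumulates.

```python
import math


def estimate_sgr(times_h, cpr):
    """Estimate mu(t_k) = CPR(t_k) / integral of CPR from 0 to t_k (Equation (11))."""
    mu_est = [float("nan")]  # undefined at t = 0 because the integral is zero
    cumulative = 0.0
    for k in range(1, len(times_h)):
        dt = times_h[k] - times_h[k - 1]
        cumulative += 0.5 * (cpr[k] + cpr[k - 1]) * dt  # trapezoidal rule
        mu_est.append(cpr[k] / cumulative if cumulative > 0.0 else float("nan"))
    return mu_est


# Synthetic, noise-free CPR for a constant SGR of 0.3 1/h (hypothetical values).
times = [0.01 * k for k in range(2001)]            # 0 to 20 h
cpr = [0.9 * 0.3 * math.exp(0.3 * t) for t in times]
mu = estimate_sgr(times, cpr)
print(f"estimated SGR at 20 h: {mu[-1]:.3f} 1/h")
```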

Results of the SGR control obtained in Reference [36] show that an acceptable control quality can be obtained by applying a typical control system based on PI controllers. To achieve better control quality, it is straightforward to adapt the controller parameters to the time-varying dynamics of the controlled process by applying gain-scheduling algorithms mentioned in Section 2 and using the *CPR* signal as a scheduling variable. The main drawback of the analyzed control approach is that, during SGR estimation, an assumption is made that the *CPR* during the process is proportional to the absolute biomass growth rate. In fact, it is known that more accurate results may be achieved if the Luedeking–Piret-type relationship is applied to correlate the *CPR* and the biomass growth rate [20,37]. This relationship additionally takes into account the *CPR* fraction that is related to the maintenance of the cell's vital functions and accounts for a significant part of the total *CPR* (for instance, in high-cell-density bacterial cultivation processes). Figure 2 shows the simulated trajectories of the biomass growth and the *CPR* of the recombinant *E. coli* cultivation process in a 1 m<sup>3</sup> volume bioreactor as well as the comparison of the actual and the estimated SGRs. The latter is calculated from Equations (10) and (11). The actual *CPR* of the process is modeled using the equation

$$
\text{CPR}(t) = \alpha\,\mu(t)\,X(t) + \beta X(t), \tag{12}
$$

where parameter β determines the *CPR* fraction related to maintenance of the cell's vital functions.

**Figure 2.** Simulated trajectories of the biomass growth and carbon dioxide production rate (*CPR*) (**a**), and the trajectories of the real specific growth rate (SGR) and that estimated from Equation (11) (**b**) in a typical recombinant *E. coli* cultivation process in a 1 m<sup>3</sup> bioreactor.

In the simulation experiment, parameter values typical for recombinant *E. coli* cultivation processes were used (α = 0.9 gCO2/gX, β = 0.1 gCO2/(gX·h), induction at *t* = 8 h) [38]. The simulation results, presented in Figure 2, show that the deviation of the estimated SGR from the real trajectory increases with an increasing amount of biomass (the estimated rate at the end of the process is 0.05 1/h higher than the real one). Hence, it is advantageous to introduce empirical correlations to correct the estimated SGR when applying this method in high-density cultivation processes. The magnitude of the correction should be determined from earlier cultivation experiments.
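The effect of the maintenance term can be reproduced qualitatively with a small simulation. The sketch below generates *CPR* from Equation (12) with the α and β values quoted above and then applies the estimator of Equation (11). The step-shaped SGR profile, the initial biomass, and the function names are assumptions made for illustration only; they are not the settings behind Figure 2, so the size of the deviation will differ from the published one.

```python
import math

ALPHA = 0.9  # g CO2 per g biomass (value quoted in the text)
BETA = 0.1   # g CO2 per g biomass per hour (value quoted in the text)


def mu_true(t_h):
    """Hypothetical SGR profile: 0.5 1/h before induction at 8 h, 0.175 1/h after."""
    return 0.5 if t_h < 8.0 else 0.175


def cpr_model(mu, x_g):
    """Equation (12): growth-associated plus maintenance CO2 production."""
    return ALPHA * mu * x_g + BETA * x_g


def simulate(t_end_h=20.0, dt_h=0.001, x0_g=1.0):
    """Return (true SGR, SGR estimated by Equation (11)) at t_end_h."""
    x = x0_g
    cpr_prev = cpr_model(mu_true(0.0), x)
    integral = 0.0
    steps = int(round(t_end_h / dt_h))
    for k in range(steps):
        x *= math.exp(mu_true(k * dt_h) * dt_h)        # exact for piecewise-constant mu
        cpr_now = cpr_model(mu_true((k + 1) * dt_h), x)
        integral += 0.5 * (cpr_prev + cpr_now) * dt_h  # trapezoidal rule
        cpr_prev = cpr_now
    return mu_true(t_end_h), cpr_prev / integral


true_mu, est_mu = simulate()
print(f"true SGR = {true_mu:.3f} 1/h, estimated SGR = {est_mu:.3f} 1/h")
```

Under these assumptions the estimate ends above the true value, which is the direction of the deviation described in the text.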

To estimate SGR, *OUR* data can also be used. However, the measurements related to *OUR* estimation may be corrupted by the noise related to the off-gas composition, pressure, and the gas flow rate fluctuations if additional oxygen is used to enrich the aeration air. Therefore, to control the high-density cultivation processes, it is recommended to use *CPR* data in SGR estimation relationships.

A more accurate SGR control system based on *OUR* or *CPR* measurements is proposed in References [25,39]. Realization of the proposed control system does not require a mathematical model and *a priori* knowledge of the culture of the microorganisms under control. It can be realized using standard programmable controllers/measurement devices and is well suited for control of industrial biotechnological processes. In the cited contributions, it is shown that if the substrate feeding rate is manipulated to control *OUR* during the process, in such a way that the *OUR* data-based ratio *R*

$$R = \frac{d\,\text{OUR}}{dt}\,\frac{1}{\text{OUR}},\tag{13}$$

is stabilized at the desired SGR set-point *R* = μ*set*, then the specific growth rate μ will asymptotically approach the set-point μ*set* and will be controlled at that point. For control of the ratio *R*, the PI control algorithm was recommended, and controller gain was adapted to the time-varying dynamics of the controlled process using the gain-scheduling approach with the feeding rate as the scheduling variable. The block-scheme of the SGR control system and the simulation results of the system's performance are presented in Figure 3. The simulation and experimental investigation tests of the proposed SGR automatic control system have shown a stable performance and sufficiently accurate control of the SGR under stepwise changes to the process parameter values and high-level noise of the feedback signal measurements [25,39]. This control system can be efficiently applied in controlling biotechnological processes, in which the SGR set-point is constant or changes slowly. For realization of the system, either *OUR* or *CPR* online estimates can be used.
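The ratio *R* in Equation (13) equals the time derivative of ln(*OUR*), so one noise-tolerant way to evaluate it is to fit a straight line to ln(*OUR*) over a short window. The sketch below does this and adds one step of an incremental PI law whose gain is scaled with the current feeding rate. Both the regression-based estimate and the linear gain schedule are assumptions made for illustration, not the implementation of References [25,39]; all names and numbers are hypothetical.

```python
import math


def growth_ratio(times_h, our):
    """Least-squares slope of ln(OUR) versus time, i.e. R = (dOUR/dt) / OUR."""
    n = len(times_h)
    log_our = [math.log(v) for v in our]
    t_mean = sum(times_h) / n
    y_mean = sum(log_our) / n
    num = sum((t - t_mean) * (y - y_mean) for t, y in zip(times_h, log_our))
    den = sum((t - t_mean) ** 2 for t in times_h)
    return num / den


def pi_feed_increment(r_set, r_now, r_prev, feed_rate, dt_h, k_rel=1.0, ti_h=1.0):
    """Incremental (velocity-form) PI step; returns the change in feeding rate.

    k_rel (in h) scales the controller gain with the feeding rate, which serves
    as the scheduling variable in this hypothetical schedule.
    """
    error_now = r_set - r_now
    error_prev = r_set - r_prev
    gain = k_rel * feed_rate
    return gain * ((error_now - error_prev) + dt_h / ti_h * error_now)


# Synthetic window: OUR growing at 0.2 1/h (hypothetical data).
times = [0.1 * k for k in range(11)]
our = [2.0 * math.exp(0.2 * t) for t in times]
r = growth_ratio(times, our)
delta_f = pi_feed_increment(r_set=0.25, r_now=r, r_prev=r, feed_rate=0.5, dt_h=0.1)
print(f"R = {r:.3f} 1/h, feed increment = {delta_f:.4f} L/h")
```

A positive increment results when *R* is below the set-point, which matches substrate-limited operation, where more feed raises the growth rate.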

**Figure 3.** Block-scheme of the SGR control system (**a**) and the simulation results of the system performance (**b**). Reproduced with permission from D. Levišauskas, Biotechnology Letters; published by Springer Nature, 2001.

It should be stressed that the SGR control systems based on Equation (11), when applied in high-density cultivation processes, cause noticeable deviations at high cell concentrations. SGR control systems based on Equation (13) are less efficient when tracking time-varying SGR set-point profiles.

In the next sections, more complex control systems are discussed that overcome the above shortcomings.

#### *4.3. SGR Control Systems Based on CPR*/*OUR Estimations and the Mass of CO2*/*O2 Produced*/*Consumed During Cultivation*

Robust control of the SGR is a crucial problem when designing an efficient process, in which the SGR is to be controlled at the value μ*set* < μ*max* in order to secure reproducibility of the processes. However, the already discussed SGR closed-loop control systems have two shortcomings: (a) for system implementation, an online estimation of the μ-values is required, and (b) high batch-to-batch reproducibility is not guaranteed. For example, if disturbances occur during a process (e.g., variation in the initial amount of biomass *X0*) or in the instrumentation (e.g., if the substrate feeding is briefly interrupted), they cause slight deviations in the biomass growth trajectory from the desired trajectory, and such an offset cannot be eliminated later on, even if the controller exactly tracks the predefined μ-profile. An approach to cope with the disturbances that cause process reproducibility problems is proposed in Reference [40]. In this work, a desired SGR time profile μ*set(t)*, the initial amount of biomass *X0*, and Equation (1) were used to estimate the biomass growth time profile *X(t)* during the process. If the biomass growth profile *X(t)* can be tightly controlled by manipulating the substrate feeding rate, the corresponding SGR profile will follow the desired μ*set(t)* profile. This control system is more robust as compared to direct SGR control systems, as the short-term disturbances that occur in the control equipment and the process itself are compensated by controlling an integral variable, the amount of accumulated biomass *X(t)*. However, implementation of the above control system requires development of a reliable soft-sensor for the online estimation of the amount of accumulated biomass during the process. Therefore, the *X(t)* online estimation problem complicates the practical realization of this control approach.

To eliminate this shortcoming, simplified SGR control systems were developed and experimentally tested in bacterial and mammalian cell cultivation processes [41–43]. The main idea behind these control systems is to use the predetermined time profile *CPRset(t)* as the system's time-varying set-point, and the mass of CO2 produced during the process, *mCO2set(t)*, as an indirect metric for SGR control purposes. *CPR(t)* is stoichiometrically related to the SGR and the biomass (Equation (10)), and the integral of this equation gives the mass *mCO2set(t)* produced during the process.
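The predetermined set-point profiles can be generated offline from a desired SGR profile. The sketch below is one possible way to do this, assuming a *CPR* model of the form of Equation (12) (which reduces to Equation (10) for β = 0) and biomass growth according to Equation (1). The function name, the time grid, and the example values are hypothetical.

```python
import math


def setpoint_profiles(mu_set, x0_g, t_end_h, dt_h, alpha, beta=0.0):
    """Return (times, CPRset, mCO2set) for a desired SGR profile mu_set(t).

    CPRset follows alpha*mu*X + beta*X; mCO2set is its trapezoidal time integral.
    """
    x = x0_g
    times = [0.0]
    cpr_set = [(alpha * mu_set(0.0) + beta) * x]
    m_co2_set = [0.0]
    steps = int(round(t_end_h / dt_h))
    for k in range(1, steps + 1):
        x *= math.exp(mu_set((k - 1) * dt_h) * dt_h)  # Equation (1), mu held over one step
        t = k * dt_h
        cpr = (alpha * mu_set(t) + beta) * x
        m_co2_set.append(m_co2_set[-1] + 0.5 * (cpr_set[-1] + cpr) * dt_h)
        times.append(t)
        cpr_set.append(cpr)
    return times, cpr_set, m_co2_set


# Hypothetical example: constant set-point of 0.5 1/h for 2 h, 1 g initial biomass.
times, cpr_set, m_co2_set = setpoint_profiles(lambda t: 0.5, 1.0, 2.0, 0.01, alpha=0.9)
print(f"CPRset(2 h) = {cpr_set[-1]:.3f} g/h, mCO2set(2 h) = {m_co2_set[-1]:.3f} g")
```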

By manipulating the substrate feeding rate to control the predetermined set-point time profile *mCO2set(t)*, the control system indirectly maintains the desired SGR during the process. The structure of the discussed cascade control system is depicted in Figure 4. The PI controller of the inner loop controls the *CPRset(t)* time profile, and the PI controller of the outer loop controls the set *mCO2set(t)* profile.

**Figure 4.** Cascade control system for indirect SGR control based on predetermined *CPRset(t)* and *mCO2set(t)* time profiles.

If the controlled process is tightly kept on the *CPRset(t)* and *mCO2set(t)* time profiles, the process will also follow the desired SGR time profile. The proposed control system ensures good quality of the SGR control, and, because of the cumulative nature of the set-point variable *mCO2(t)*, random disturbances do not significantly distort the course and reproducibility of the process.
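
A minimal sketch of one sampling step of such a cascade is given below. The wiring is an assumption, since Figure 4 is not reproduced here: the outer PI controller turns the *mCO2* error into a correction of the CPR set-point, and the inner PI controller turns the CPR error into the substrate feeding rate. The class, the tuning values, and the conditional-integration anti-windup are illustrative choices, not the tuning or structure used in References [41–43].

```python
from dataclasses import dataclass

@dataclass
class PI:
    kp: float                      # proportional gain
    ti: float                      # integral time, h
    u_min: float = float("-inf")   # lower output limit
    u_max: float = float("inf")    # upper output limit
    integral: float = 0.0          # integral of the error

    def update(self, error, dt):
        candidate = self.integral + error * dt
        u = self.kp * (error + candidate / self.ti)
        if self.u_min <= u <= self.u_max:
            # Keep integrating only while the output is unsaturated (anti-windup)
            self.integral = candidate
            return u
        return min(max(u, self.u_min), self.u_max)

def cascade_step(outer, inner, m_set, m_meas, cpr_set, cpr_meas, dt):
    """Return (feed rate, corrected CPR set-point) for one sampling step."""
    cpr_sp = cpr_set + outer.update(m_set - m_meas, dt)   # outer loop: mCO2
    feed = inner.update(cpr_sp - cpr_meas, dt)            # inner loop: CPR
    return feed, cpr_sp

outer = PI(kp=2.0, ti=1.0)
inner = PI(kp=0.5, ti=0.5, u_min=0.0, u_max=1.0)   # feed rate limited by the pump
feed, cpr_sp = cascade_step(outer, inner, m_set=10.0, m_meas=9.0,
                            cpr_set=2.0, cpr_meas=2.0, dt=0.1)
```

In this example, the cumulative CO2 mass lags its set-point, so the outer loop raises the CPR set-point and the inner loop drives the feeding rate to its upper limit.
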

Implementation of the proposed control system can be realized in the following steps:


Various realizations of the above control system have been investigated by computer simulations of the system's performance and by controlling real processes of recombinant *E. coli* and mammalian cells (CHO) [18,41,42]. Typical results of the applied control system for controlling the recombinant *E. coli* fed-batch cultivation processes over six runs are presented in Figure 5. The laboratory-scale experimental results show that the proposed control approach leads to a stable and robust behavior of the controlled process. It should also be stressed that small variations in the initial amount of biomass *X0* and short instrumentation disturbances do not significantly affect the reproducibility of the process.

**Figure 5.** Typical experimental results of the total cumulative CPR (**a**) and SGR indirect control (μ*set(t)* = 0.5 1/h at the first process stage and μ*set(t)* = 0.175 1/h at the second stage) (**b**) during the recombinant *E. coli* cultivation process. Reproduced with permission from M. Jenzsch et al. J. of Biotechnology; published by Elsevier, 2007.

Because of the significant changes in the process dynamics during cultivation, it is possible to improve control quality of the cascade control system by adapting controller parameters. Tuning parameters of the PI controllers can be adapted to time-varying dynamics of the controlled process using the gain-scheduling approach. The controller adaptation scheme using the gain-scheduling algorithm is shown in Figure 4 by the dashed lines. In the gain-scheduling algorithms, one can use *CPR* or *OUR* measurements as the gain-scheduling variables.
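
One simple way to realize gain scheduling is a lookup table that is interpolated over the scheduling variable. The sketch below uses *CPR* as that variable; the breakpoints and the PI parameters are purely illustrative and would have to be obtained from tuning experiments or simulations for the actual process.

```python
from bisect import bisect_right

# (CPR in g/h, Kp, Ti in h): hypothetical tuning table
SCHEDULE = [
    (0.5, 0.80, 1.00),
    (2.0, 0.40, 0.60),
    (8.0, 0.15, 0.30),
]

def scheduled_gains(cpr, schedule=SCHEDULE):
    """Linearly interpolate (Kp, Ti) for the measured CPR.

    Values outside the table are clamped to the first or last row.
    """
    points = [row[0] for row in schedule]
    if cpr <= points[0]:
        return schedule[0][1:]
    if cpr >= points[-1]:
        return schedule[-1][1:]
    i = bisect_right(points, cpr)
    (x0, kp0, ti0), (x1, kp1, ti1) = schedule[i - 1], schedule[i]
    w = (cpr - x0) / (x1 - x0)
    return (kp0 + w * (kp1 - kp0), ti0 + w * (ti1 - ti0))

kp, ti = scheduled_gains(1.25)   # halfway between the first two breakpoints
```
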

Instead of using the *OUR(t)* or *CPR(t)* time profiles, the performance of the inner control loop of the cascade control system can also be improved by implementing the SGR estimator, developed from Equations (12) and (13) [44]. Investigation results presented in Reference [44] have shown that the control system with the SGR estimator outperforms the control system depicted in Figure 4 when the controlled process is affected by disturbances to the substrate feeding rate. On the other hand, implementation of the modified control system requires additional calculations related to online estimation of the SGR.

The structure of the SGR control system presented in Figure 4 may be used as a basis for developing closed-loop control systems for various microbial cultures in industrial bioreactors. Because it is technically simple to implement and can improve batch-to-batch reproducibility, this system could be used as a benchmark to compare the control quality of various SGR control systems and to evaluate their potential implementation in industrial bioreactors.

#### **5. Concluding Remarks and Recommendations**

In recent years, numerous research papers have been published in which original solutions and sophisticated control techniques were developed for the automatic control of microbial and mammalian cell cultivation processes. However, the majority of the proposed control systems are too complicated to be attractive for robust control of industrial biotechnological processes. Therefore, the well-known statement of Luyben [16], "Complex elegant control systems look great on paper but soon end up on 'manual' in an industrial environment", is also valid for the majority of the control systems developed for biotechnological processes.

In this paper, relatively simple control approaches that can be applied in microbial and mammalian cell cultivation processes are discussed and recommended for practical application. The reviewed algorithms and systems for indirect control of the specific growth rate can significantly increase the robustness and batch-to-batch reproducibility of industrial-scale biotechnological processes. The recommended control algorithms and systems are based on online *CPR* or *OUR* measurements and on the total mass of oxygen consumed or the total mass of carbon dioxide produced during the process. When additional oxygen is used during the process, it is recommended to use the *CPR* and *mCO2* signals in the control algorithms because their estimation errors are lower than those of the *OUR* and *mO2* signals. To estimate oxygen uptake and carbon dioxide production rates, several industrially well-established gas analyzers and mass flow meters are available. The basic instrument needed to install the SGR control systems is an online gas analyzer that measures the *CO2* and *O2* concentrations in the off-gas in parallel using two space-saving sensors. Such an analyzer can be used for both lab- and industrial-scale bioreactors. Industrial gas analyzers incorporate compensation for gas pressure and humidity, which ensures good precision and reliability of the measurements.

The available instrumentation and the discussed control methods and systems make wider application of SGR control in biotechnological processes possible. At the very beginning of the process, the accuracy of the indirect measurements is usually too low to track the SGR set-point time profile exactly in closed-loop control systems. Consequently, it is recommended to start the process with the open-loop feeding rate control strategies determined by Equations (8) and (9) and to switch to closed-loop SGR control after three to four hours. For low-density cell cultivation processes, the SGR control system based on Equation (11) is recommended. For processes in which the SGR set-point is kept constant, the control system based on Equation (13) (Figure 3) is well suited. For more advanced applications, the SGR control system presented in Figure 4 is recommended. This system can also serve as a benchmark for comparing the control quality of various SGR control systems.

**Author Contributions:** Conceptualization, V.G., R.S., and D.L.; methodology, V.G., R.S., and D.L.; software, V.G. and R.S.; validation, V.G., R.S., and D.L.; formal analysis, V.G., R.S. and D.L.; investigation, V.G., R.S., D.L., and R.U.; resources, V.G., R.S. and D.L.; data curation, V.G.; writing—original draft preparation, V.G. and R.S.; writing—review and editing, V.G., R.S., D.L., and R.U.; visualization, V.G.; supervision, V.G.; project administration, V.G.; funding acquisition, V.G., R.S., D.L., and R.U.

**Funding:** This research was funded by the European Regional Development Fund according to the supported activity "Research Projects Implemented by World-class Researcher Groups" under the Measure No. 01.2.2-LMT-K-718.

**Conflicts of Interest:** The authors declare no conflicts of interest.

#### **References**


© 2019 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).

MDPI St. Alban-Anlage 66 4052 Basel Switzerland Tel. +41 61 683 77 34 Fax +41 61 302 89 18 www.mdpi.com

*Processes* Editorial Office E-mail: processes@mdpi.com www.mdpi.com/journal/processes
