
Batch-to-Batch Optimization Control of Fed-Batch Fermentation Process Based on Recursively Updated Extreme Learning Machine Models

School of Engineering, Merz Court, Newcastle University, Newcastle upon Tyne NE1 7RU, UK
Author to whom correspondence should be addressed.
Algorithms 2025, 18(2), 87
Submission received: 30 December 2024 / Revised: 31 January 2025 / Accepted: 4 February 2025 / Published: 6 February 2025


This paper presents a new method of batch-to-batch optimization control for a fed-batch fermentation process. A recursively updated extreme learning machine (ELM) neural network model is used to model the process. ELM models have advantages over other neural networks in that they can be trained very quickly and have good generalization performance. However, an ELM model loses its predictive ability in the presence of batch-to-batch process variations or disturbances, which lead to a process–model mismatch. The recursive least squares (RLS) technique takes the model prediction error from the previous batch and uses it to update the model parameters for the next batch. This improves the performance of the model and helps it to respond to changes in process conditions or disturbances. The updated model is used in an optimization control procedure, which generates an improved control profile for the next batch. Updating the model through RLS enables the optimization control strategy to maintain a high final product quality in the presence of disturbances. The proposed batch-to-batch optimization control method is demonstrated on a simulated fed-batch fermentation process.

1. Introduction

Batch and fed-batch processes are commonplace in industrial fermentation processes. In a fed-batch process, a feed stream is used to supply additional substrate to the reactor during the reaction. The feed rate profile of this feed stream can be changed during the process, and the final product is removed at the end of the reaction. Introducing a feed stream helps to improve the control of reactant concentrations and, therefore, can be used to discourage undesired reactions from taking place and to improve product selectivity [1]. These benefits only transpire when a suitable feeding strategy is used. Therefore, determining the feeding strategy that yields the maximum amount of desired product is the main design objective for fed-batch reactors (i.e., the optimal control strategy) [2]. In the case of fermentations, this is usually where the rate of substrate being added to the reactor is equal to the rate at which it is used up by the organism [3]. The ever-increasing market competition and the stricter regulations on product quality and process efficiency mean that the optimization of fed-batch fermentations is becoming increasingly important [4]. An optimal control strategy will also help improve profits from the process [5].
The simplest way of optimizing the fed-batch process is to build an off-line control strategy using a model of the process. However, the presence of process–model mismatches (PMMs) and unknown disturbances renders the control strategy suboptimal when applied to the real fed-batch process [4]. Therefore, strategies have been proposed to mitigate this issue. Batch-to-batch optimization involves taking data from the previous batch and using them, with historical batch data, to optimize the performance of the following batch.
In order to develop an optimal control strategy, a suitable model of the process is required. There are three different types of models that can be used: mechanistic models, data-driven models, or hybrid models. A mechanistic model uses knowledge of the kinetics, stoichiometry, and mass and energy balances of the reaction system. Data-driven or “black-box” models are constructed from process operational data and used to describe the relationship between the manipulated variables and the output variables [2,6]. Hybrid models combine elements of data-driven and mechanistic models, taking advantage of both [7,8,9,10]. Von Stosch et al. [11] give a detailed review of hybrid modeling in process systems engineering.
A common approach for developing a data-driven model is to use neural networks [12,13]. The most commonly used neural networks are multi-layer feedforward neural networks consisting of an input layer, an output layer, and one or more hidden intermediate layers [14]. Each layer has several neurons, which are connected to neurons in the adjacent layers via weighted connections. Data are processed in these neurons as they pass through the network until they reach the output layer [15]. Fed-batch processes are nonlinear systems and are often modeled using neural networks due to their excellent approximation ability [16,17]. Xiong et al. [18] developed a batch-to-batch control strategy based on a control affine feedforward neural network, which used modeling errors from the previous batch to improve the model. The model prediction improved after each batch, and so process–model mismatch was reduced. Zhang et al. [4] developed a linearized model that was rebuilt after each batch by adding data from the recently completed batch to the pool of historical process data. The optimal control strategy using the updated model was able to respond to disturbances. Jewaratnam et al. [3] expanded on the updating linearized model by introducing a “sliding window” approach in which only data from a window of the most recent batches were used to update the model. The sliding window improved the convergence rate and stability following a disturbance.
ELM is a type of neural network developed by Huang et al. [19]. It differs from standard single-layer feedforward neural networks (SLFNs) in the way the network is trained. In standard SLFNs, hidden-layer weights must be solved iteratively using a training algorithm like back propagation, and a large number of training iterations are typically required [20]. In ELM, by contrast, the weights of the hidden layer are assigned randomly and are not repeatedly adjusted [21]. This allows the output-layer weights to be solved as a linear system of equations, which makes training the ELM model significantly faster [22]. ELM also provides very high generalization performance [23]. Alli and Zhang [21] proposed using ELM in conjunction with recursive least squares (RLS) for batchwise model updating. Unknown disturbances or PMMs are detrimental to the ELM model prediction. RLS is, therefore, used to address any PMM. After each batch, RLS uses the latest model prediction error to update the weights of the ELM neural network output layer, allowing the ELM model to respond to disturbances. The method was shown to be successful in suppressing the effects of PMM and disturbances. Recursive methods have also recently been used in developing process monitoring methods. Li et al. [24] presented a recursive principal component analysis method for adaptive process monitoring. Yu and Zhao [25] developed a recursive exponential slow feature analysis method for adaptive process monitoring. Yu et al. [26] proposed a recursive cointegration approach for the adaptive monitoring of nonstationary industrial processes.
This paper proposes using the recursively updated ELM model in a batch-to-batch optimization control strategy. An ELM model is initially developed from historical process operation data. After each batch, the ELM model will be updated using RLS and used in an optimization framework to maximize the final biomass of a fed-batch fermentation in the next batch. Using a recursively updated model will allow the model to adapt to disturbances and reduce PMM with each iteration. A mechanistic model of a fed-batch fermentation will be used to simulate a real fed-batch process.
This paper is organized as follows: Section 2 presents the fed-batch fermentation process, ELM, batch-to-batch ELM model updating using RLS, and batch-to-batch optimization control. Section 3 presents the results and discussions. Section 4 concludes this paper.

2. Materials and Methods

2.1. Fed-Batch Fermentation Model

A mechanistic model for the production of baker’s yeast in a fed-batch fermenter is used in this study. The model was taken from Yüzgeç et al. [27]. Equations (1)–(12) outline the kinetic model.
Glucose uptake rate:
$$Q_s = Q_{s,\max}\,\frac{C_s}{K_s + C_s}\left(1 - \exp\left(-\frac{t}{t_d}\right)\right) \tag{1}$$
Oxidation capacity:
$$Q_{o,lim} = Q_{o,\max}\,\frac{C_o}{K_o + C_o}\,\frac{K_i}{K_i + C_e} \tag{2}$$
Specific growth rate limit:
$$Q_{s,lim} = \frac{\mu_{cr}}{Y_{x/s}^{ox}} \tag{3}$$
Oxidation glucose metabolism:
$$Q_{s,ox} = \min\left(Q_s,\; Q_{s,lim},\; Q_{o,lim}/Y_{o/s}\right) \tag{4}$$
Reductive glucose metabolism:
$$Q_{s,red} = Q_s - Q_{s,ox} \tag{5}$$
Ethanol uptake rate:
$$Q_{e,up} = Q_{e,\max}\,\frac{C_e}{K_e + C_e}\,\frac{K_i}{K_i + C_e} \tag{6}$$
Oxidative ethanol metabolism:
$$Q_{e,ox} = \min\left(Q_{e,up},\; \left(Q_{o,lim} - Q_{s,ox} Y_{o/s}\right) Y_{e/o}\right) \tag{7}$$
Ethanol production rate:
$$Q_{e,pr} = Q_{s,red}\, Y_{e/s} \tag{8}$$
Total specific growth rate:
$$\mu = Q_{s,ox} Y_{x/s}^{ox} + Q_{s,red} Y_{x/s}^{red} + Q_{e,ox} Y_{x/e} \tag{9}$$
Carbon dioxide production rate:
$$Q_c = Q_{s,ox} Y_{c/s}^{ox} + Q_{s,red} Y_{c/s}^{red} + Q_{e,ox} Y_{c/e} \tag{10}$$
Oxygen consumption rate:
$$Q_o = Q_{s,ox} Y_{o/s} + Q_{e,ox} Y_{o/e} \tag{11}$$
Respiratory quotient:
$$RQ = Q_c / Q_o \tag{12}$$
In the above equations, Qi is the specific consumption or production rate of component i (g g−1 h−1); Ci is the concentration of the ith component (g L−1); Ke is the saturation constant for ethanol (g L−1); Ki is the inhibition constant (g L−1); Ko is the saturation constant for oxygen (g L−1); Ks is the saturation constant for the substrate (glucose) (g L−1); t is time (h); td is the time delay (h); the subscripts or superscripts c, cr, e, lim, max, o, ox, red, s, up, and x represent carbon dioxide, critical, ethanol, limitation, maximum, oxygen, oxidative, reductive, substrate (glucose), uptake, and biomass, respectively; and μ is the specific growth rate (h−1).
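The kinetic relations in Equations (1)–(12) can be evaluated in sequence, since each rate depends only on the concentrations, time, and rates computed before it. The following is a minimal Python sketch of that evaluation order. The function name `specific_rates`, the dictionary keys, and every numerical value in `PARAMS` are placeholders chosen for illustration; they are not the values in Table 1, and the study itself used MATLAB.

```python
import math

# Placeholder parameter values for illustration only (NOT the values in Table 1).
PARAMS = {
    "Qs_max": 0.5, "Qo_max": 0.2, "Qe_max": 0.2,     # maximum specific rates
    "Ks": 0.5, "Ko": 0.001, "Ke": 0.1, "Ki": 0.1,    # saturation / inhibition constants
    "mu_cr": 0.2, "td": 2.0,                          # critical growth rate, time delay
    "Yxs_ox": 0.5, "Yxs_red": 0.05, "Yxe": 0.7,       # biomass yields
    "Ycs_ox": 0.6, "Ycs_red": 0.5, "Yce": 0.7,        # carbon dioxide yields
    "Yos": 0.4, "Yoe": 1.0, "Yeo": 1.0, "Yes": 0.5,   # oxygen and ethanol yields
}


def specific_rates(Cs, Co, Ce, t, p):
    """Evaluate Equations (1)-(12) for given concentrations (g/L) and time (h)."""
    Qs = p["Qs_max"] * Cs / (p["Ks"] + Cs) * (1.0 - math.exp(-t / p["td"]))  # Eq. (1)
    Qo_lim = p["Qo_max"] * Co / (p["Ko"] + Co) * p["Ki"] / (p["Ki"] + Ce)   # Eq. (2)
    Qs_lim = p["mu_cr"] / p["Yxs_ox"]                                        # Eq. (3)
    Qs_ox = min(Qs, Qs_lim, Qo_lim / p["Yos"])                               # Eq. (4)
    Qs_red = Qs - Qs_ox                                                      # Eq. (5)
    Qe_up = p["Qe_max"] * Ce / (p["Ke"] + Ce) * p["Ki"] / (p["Ki"] + Ce)    # Eq. (6)
    Qe_ox = min(Qe_up, (Qo_lim - Qs_ox * p["Yos"]) * p["Yeo"])               # Eq. (7)
    Qe_pr = Qs_red * p["Yes"]                                                # Eq. (8)
    mu = (Qs_ox * p["Yxs_ox"] + Qs_red * p["Yxs_red"]
          + Qe_ox * p["Yxe"])                                                # Eq. (9)
    Qc = (Qs_ox * p["Ycs_ox"] + Qs_red * p["Ycs_red"]
          + Qe_ox * p["Yce"])                                                # Eq. (10)
    Qo = Qs_ox * p["Yos"] + Qe_ox * p["Yoe"]                                 # Eq. (11)
    RQ = Qc / Qo if Qo > 0.0 else float("nan")                               # Eq. (12)
    return {"Qs": Qs, "Qs_ox": Qs_ox, "Qs_red": Qs_red, "Qe_ox": Qe_ox,
            "Qe_pr": Qe_pr, "mu": mu, "Qc": Qc, "Qo": Qo, "RQ": RQ}
```

The respiratory quotient is returned as NaN when no oxygen is consumed, which is a choice made for this sketch rather than something specified in the model.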
The reactor was modeled as isothermal. The dynamic model is outlined in Equations (13)–(18).
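$$\frac{dC_s}{dt} = \frac{F}{V}\left(S_o - C_s\right) - \left(\frac{\mu}{Y_{x/s}^{ox}} + \frac{Q_{e,pr}}{Y_{e/s}} + Q_m\right) C_x \tag{13}$$
$$\frac{dC_o}{dt} = -Q_o C_x + k_L a_o \left(C_o^{*} - C_o\right) - \frac{F}{V} C_o \tag{14}$$
$$\frac{dC_e}{dt} = \left(Q_{e,pr} - Q_{e,ox}\right) C_x - \frac{F}{V} C_e \tag{15}$$
$$\frac{dC_x}{dt} = \mu C_x - \frac{F}{V} C_x \tag{16}$$
$$\frac{dV}{dt} = F \tag{17}$$
$$k_L a_o = 113 \left(\frac{F_a}{A_R}\right)^{0.25} \tag{18}$$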
where Cs, Co, Ce, and Cx represent, respectively, the concentrations of glucose, oxygen, ethanol, and biomass; F and Fa stand for feed rate and air feed rate, respectively; kLao is the total volumetric mass transfer coefficient (h−1); V is the volume of the content in the reactor; and AR denotes the cross-sectional area of the reactor.
The values of the parameters used in the model are given in Table 1. The initial conditions used in the model are given in Table 2. A simulation based on this model was built in MATLAB [28] and was used to generate historical batch data and test new feed profiles obtained from the optimization process.

2.2. Extreme Learning Machine

An example of a single-hidden-layer feedforward neural network (SLFN) is shown in Figure 1. An input layer consisting of n input neurons, x, is connected to a hidden layer of $\tilde{N}$ neurons via a series of hidden-layer weights, w, and biases, b. The output values of the hidden-layer neurons are computed using these inputs, weights, and biases in an activation function. The value of the output-layer neurons is then calculated using the linear relationship between the hidden-layer outputs and the output-layer weights, β.
For the SLFN used in this investigation, each input neuron corresponds to the flowrate at the respective feed interval, and the single output neuron represents the final biomass concentration.
The extreme learning machine (ELM), proposed by Huang et al. [23], is a learning algorithm for an SLFN. They proved that the method was several orders of magnitude faster and could achieve improved generalization performance, when compared to other methods (such as back propagation). The main principle of ELM is that the hidden-layer weights and biases are chosen randomly.
For $N$ distinct samples, $(x_j, t_j)$, where $x_j = [x_{j1}, x_{j2}, \ldots, x_{jn}]^T \in \mathbb{R}^n$ is the $j$th set of input values and $t_j = [t_{j1}, t_{j2}, \ldots, t_{jm}]^T \in \mathbb{R}^m$ is the $j$th set of target values, the hidden-layer output matrix of the neural network is given by $H$.
$$H(w_1, \ldots, w_{\tilde{N}}, b_1, \ldots, b_{\tilde{N}}, x_1, \ldots, x_N) = \begin{bmatrix} g(w_1 \cdot x_1 + b_1) & \cdots & g(w_{\tilde{N}} \cdot x_1 + b_{\tilde{N}}) \\ \vdots & \ddots & \vdots \\ g(w_1 \cdot x_N + b_1) & \cdots & g(w_{\tilde{N}} \cdot x_N + b_{\tilde{N}}) \end{bmatrix}_{N \times \tilde{N}} \tag{19}$$
where $w_i = [w_{i1}, w_{i2}, \ldots, w_{in}]^T$, $i = 1, \ldots, \tilde{N}$, is the weight vector connecting the input neurons to the $i$th hidden-layer neuron, and $b_i$ is the bias for the $i$th hidden neuron.
Using ELM training, the hidden-layer weights and biases are chosen randomly and are used in the activation function $g(x)$, along with the input sample values, to compute $H$. The output neurons are linear in nature for ELM such that
$$\sum_{i=1}^{\tilde{N}} \beta_i H_{ji} = \sum_{i=1}^{\tilde{N}} \beta_i\, g(w_i \cdot x_j + b_i) = o_j, \tag{20}$$
where $\beta_i = [\beta_{i1}, \beta_{i2}, \ldots, \beta_{im}]^T$, $i = 1, \ldots, \tilde{N}$, is the weight vector connecting the $i$th hidden-layer neuron to the output neurons, and $o_j$ is the network output for the $j$th sample.
So long as the activation function is infinitely differentiable, this SLFN can approximate the $N$ samples with zero error, $\sum_{j=1}^{N} \| o_j - t_j \| = 0$, i.e., there exist $\beta_i$, $w_i$, and $b_i$ such that
$$\sum_{i=1}^{\tilde{N}} \beta_i\, g(w_i \cdot x_j + b_i) = t_j, \quad j = 1, \ldots, N. \tag{21}$$
This equation can be written more simply as follows:
$$H \beta = T \tag{22}$$
$$\beta = \begin{bmatrix} \beta_1^T \\ \vdots \\ \beta_{\tilde{N}}^T \end{bmatrix}_{\tilde{N} \times m} \quad \text{and} \quad T = \begin{bmatrix} t_1^T \\ \vdots \\ t_N^T \end{bmatrix}_{N \times m}$$
The hidden-layer output matrix, $H$, can be calculated simply as shown in Equation (19), and the matrix of target output values, $T$, is already known. Therefore, training the SLFN involves finding the least-squares solution $\hat{\beta}$ of the linear system given in Equation (22). The matrix of weights connecting the hidden layer to the output layer, $\beta$, is calculated as follows:
$$\hat{\beta} = H^{\dagger} T$$
where $H^{\dagger}$ is the Moore–Penrose generalized inverse of matrix $H$.
Because the number of hidden neurons, $\tilde{N}$, usually differs from the number of distinct training samples, $N$, $H$ is rarely a square matrix. The hidden-layer output matrix is therefore generally not invertible, and the Moore–Penrose generalized inverse is used to solve the system in Equation (22). The generalized inverse is computed using singular value decomposition (SVD).
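The training procedure described above (random hidden-layer weights, then a linear least-squares solve for the output weights) can be sketched in a few lines of standard-library Python. This is an illustration under stated assumptions, not the implementation used in this study: it handles a single output neuron, uses a sigmoid activation and weights drawn uniformly from [−1, 1] (neither is specified in the text), and, to stay free of external libraries, replaces the SVD-based Moore–Penrose inverse with a ridge-regularized normal-equation solve, $(H^T H + rI)\beta = H^T T$. The names `elm_train`, `elm_predict`, `hidden_row`, `solve_linear`, and the parameter `ridge` are hypothetical.

```python
import math
import random


def solve_linear(A, b):
    """Solve A z = b by Gaussian elimination with partial pivoting."""
    n = len(A)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    z = [0.0] * n
    for i in range(n - 1, -1, -1):  # back substitution
        s = sum(M[i][j] * z[j] for j in range(i + 1, n))
        z[i] = (M[i][n] - s) / M[i][i]
    return z


def hidden_row(x, W, b):
    """One row of H: sigmoid g(w_i . x + b_i) for every hidden neuron."""
    return [1.0 / (1.0 + math.exp(-(sum(wk * xk for wk, xk in zip(w, x)) + bi)))
            for w, bi in zip(W, b)]


def elm_train(X, t, n_hidden, ridge=1e-3, seed=0):
    """Draw random hidden weights once, then solve for the output weights."""
    rng = random.Random(seed)
    n_in = len(X[0])
    W = [[rng.uniform(-1.0, 1.0) for _ in range(n_in)] for _ in range(n_hidden)]
    b = [rng.uniform(-1.0, 1.0) for _ in range(n_hidden)]
    H = [hidden_row(x, W, b) for x in X]
    # Ridge-regularized normal equations (a stand-in for the pseudoinverse).
    A = [[sum(h[i] * h[j] for h in H) + (ridge if i == j else 0.0)
          for j in range(n_hidden)] for i in range(n_hidden)]
    rhs = [sum(h[i] * tj for h, tj in zip(H, t)) for i in range(n_hidden)]
    beta = solve_linear(A, rhs)
    return W, b, beta


def elm_predict(x, W, b, beta):
    """Network output: linear combination of the hidden-layer outputs."""
    return sum(bk * hk for bk, hk in zip(beta, hidden_row(x, W, b)))
```

Only `beta` is fitted; `W` and `b` stay at their random initial values, which is what makes the training a single linear solve. With a numerical library available, the solve would typically be replaced by a pseudoinverse or least-squares routine.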

2.3. Recursive Least Squares

Recursive least squares (RLS) is used to update the ELM model parameters following each subsequent batch. It works in such a way that the parameters calculated from N observations are used in the estimation of parameters for N + 1 observations. This is more computationally efficient than simply recalculating the least squares after each observation from scratch [29].
Consider a system of historical data based on N pairs of input and output data,
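$$\mathbf{X}_N = \begin{bmatrix} x_1^T \\ \vdots \\ x_N^T \end{bmatrix}, \qquad \mathbf{y}_N = \begin{bmatrix} y_1 \\ \vdots \\ y_N \end{bmatrix}$$
where $x_N$ is the column vector of system input values for the $N$th measurement (so that $x_N^T$ forms the $N$th row of $\mathbf{X}_N$), and $y_N$ is the scalar system output for the $N$th measurement. The system is modeled by the following linear model:
$$\mathbf{y} = \mathbf{X}\theta + \varepsilon$$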
The model parameters relating y to X can be estimated through least-squares estimation by minimizing the loss function as follows:
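$$\min_{\theta} J(\theta) = \|\varepsilon\|^2$$
where $\varepsilon = \mathbf{y} - \hat{\mathbf{y}}$ are the estimation errors, and $\hat{\mathbf{y}} = \mathbf{X}\theta$ are the estimated values. The solution to the minimization is to find an estimate of the parameters, $\hat{\theta}$, that satisfies the following equation:
$$\mathbf{y} = \mathbf{X}\hat{\theta}$$
By left multiplying both sides of the above equation by $\mathbf{X}^T$ and then left multiplying both sides by $(\mathbf{X}^T\mathbf{X})^{-1}$, the least-squares solution is given by the following:
$$\hat{\theta} = \left(\mathbf{X}^T \mathbf{X}\right)^{-1} \mathbf{X}^T \mathbf{y} \tag{27}$$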
Upon obtaining additional measurements of the system, the matrix X is expanded by a row, and the vector y is expanded by an element as follows:
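$$\mathbf{X}_{N+1} = \begin{bmatrix} \mathbf{X}_N \\ x_{N+1}^T \end{bmatrix}, \qquad \mathbf{y}_{N+1} = \begin{bmatrix} \mathbf{y}_N \\ y_{N+1} \end{bmatrix}$$
The model parameter estimation from Equation (27) can then be written as follows:
$$\hat{\theta}_{N+1} = \left(\mathbf{X}_{N+1}^T \mathbf{X}_{N+1}\right)^{-1} \mathbf{X}_{N+1}^T \mathbf{y}_{N+1} = \left(\mathbf{X}_N^T \mathbf{X}_N + x_{N+1} x_{N+1}^T\right)^{-1} \left(\mathbf{X}_N^T \mathbf{y}_N + x_{N+1} y_{N+1}\right)$$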
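Assuming the matrix $\mathbf{X}_N^T \mathbf{X}_N$ is positive definite, the recursive least-squares estimate can be computed from the following equations:
$$\hat{\theta}_{N+1} = \hat{\theta}_N + K_N e_{N+1} \tag{29}$$
$$e_{N+1} = y_{N+1} - x_{N+1}^T \hat{\theta}_N \tag{30}$$
$$K_N = P_N x_{N+1} \left(1 + x_{N+1}^T P_N x_{N+1}\right)^{-1} \tag{31}$$
$$P_{N+1} = \left(I - K_N x_{N+1}^T\right) P_N \tag{32}$$
$$P_0 = p_0 I, \qquad p_0 > 100\sigma^2 \quad (p_0 \text{ scalar}) \tag{33}$$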
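The step from the batch least-squares solution to the recursion is not spelled out above. One way to see it, sketched here under the assumption that $P_N$ denotes $(\mathbf{X}_N^T \mathbf{X}_N)^{-1}$ (the text does not define it explicitly), is to apply the Sherman–Morrison identity to the rank-one update of $\mathbf{X}_N^T \mathbf{X}_N$:

```latex
\begin{aligned}
P_N &:= \left(\mathbf{X}_N^T \mathbf{X}_N\right)^{-1}, \\
P_{N+1} &= \left(P_N^{-1} + x_{N+1} x_{N+1}^T\right)^{-1}
         = P_N - \frac{P_N x_{N+1} x_{N+1}^T P_N}{1 + x_{N+1}^T P_N x_{N+1}}
         = \left(I - K_N x_{N+1}^T\right) P_N, \\
K_N &= P_N x_{N+1} \left(1 + x_{N+1}^T P_N x_{N+1}\right)^{-1} = P_{N+1} x_{N+1}, \\
\hat{\theta}_{N+1} &= P_{N+1}\left(\mathbf{X}_N^T \mathbf{y}_N + x_{N+1} y_{N+1}\right)
         = P_{N+1}\left(P_N^{-1} \hat{\theta}_N + x_{N+1} y_{N+1}\right) \\
        &= \left(I - K_N x_{N+1}^T\right)\hat{\theta}_N + K_N y_{N+1}
         = \hat{\theta}_N + K_N \left(y_{N+1} - x_{N+1}^T \hat{\theta}_N\right).
\end{aligned}
```

When the recursion is started from a scaled identity matrix rather than from an exact inverse, $P_N$ only approximates $(\mathbf{X}_N^T \mathbf{X}_N)^{-1}$, and the approximation improves as the initial scale grows.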
For time-varying systems, it is necessary to introduce a forgetting factor, $\lambda$ (typically $0.9 < \lambda \le 1$), which reduces the influence of old data. A smaller forgetting factor will “forget” the old data faster. Using a forgetting factor, Equations (31) and (32) become
$$K_N = P_N x_{N+1} \left(\lambda + x_{N+1}^T P_N x_{N+1}\right)^{-1} \tag{34}$$
$$P_{N+1} = \left(I - K_N x_{N+1}^T\right) P_N / \lambda \tag{35}$$
Equations (29), (30), (34) and (35) are evaluated after each batch. The new ELM parameters (output-layer weights and bias) are taken from $\hat{\theta}_{N+1} = [\beta^T, b_2]^T$, where $b_2$ is the bias of the output-layer neuron.
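A single RLS step with a forgetting factor, i.e., Equations (29), (30), (34) and (35), can be sketched as follows in standard-library Python. The function name `rls_update` and its argument layout are hypothetical. For the ELM application, `x` would be the vector of hidden-layer outputs for the latest batch with a constant 1 appended for the output bias, and `theta` would hold the output-layer weights followed by that bias; this mapping is an interpretation of the text rather than code from the study.

```python
def rls_update(theta, P, x, y, lam=1.0):
    """One recursive least-squares step with forgetting factor lam.

    theta : current parameter estimate (list of n floats)
    P     : current n-by-n matrix P_N (list of lists)
    x     : new input vector x_{N+1} (list of n floats)
    y     : new scalar output y_{N+1}
    Returns the updated estimate, the updated P, and the prediction error.
    """
    n = len(theta)
    Px = [sum(P[i][j] * x[j] for j in range(n)) for i in range(n)]  # P_N x
    denom = lam + sum(x[i] * Px[i] for i in range(n))               # lam + x^T P_N x
    K = [v / denom for v in Px]                                     # Eq. (34)
    e = y - sum(x[i] * theta[i] for i in range(n))                  # Eq. (30)
    theta_new = [theta[i] + K[i] * e for i in range(n)]             # Eq. (29)
    xP = [sum(x[i] * P[i][j] for i in range(n)) for j in range(n)]  # x^T P_N
    P_new = [[(P[i][j] - K[i] * xP[j]) / lam for j in range(n)]
             for i in range(n)]                                     # Eq. (35)
    return theta_new, P_new, e
```

Setting `lam=1.0` recovers Equations (31) and (32). The update costs on the order of n² operations per batch and needs no stored history, which is the practical advantage over refitting from all past batches.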

2.4. Batch-to-Batch Optimization Control

The aim of batch-to-batch optimization control is to maximize the amount of biomass at the end of the batch by adjusting the substrate feed profile. In order to carry out the optimization, an ELM model linking the batch feed profile and the final biomass concentration at the end of a batch is developed. The ELM model is of the following form:
y = f ( U )
where y is the final biomass concentration at the end of a batch; U = (u1, u2, …, u8) is the feed profile for the batch; and f() is a nonlinear function represented by an ELM. Here, we divide the batch duration into 8 equal intervals, and the substrate feed rate is kept constant during each interval. Hence, the feed profile is a vector containing 8 elements.
For the calculation of the optimal feed profile for the current batch (the kth batch), Uk, the optimization uses the feed profile from the previous batch, Uk−1, as the initial search value. This feed profile is applied as the inputs to the ELM model in order to predict the final biomass concentration, which is used in a numerical optimization procedure. Therefore, the ELM model has 8 input neurons and one output neuron. The final volume is also calculated and multiplied with the predicted final biomass concentration to find the final amount of biomass. The optimization problem is solved using the MATLAB function, “fmincon”, from the MATLAB Optimization Toolbox and is outlined by Equation (37). Note that minimization is typically considered in optimization tools, so a negative sign is added to the predicted amount of the final biomass to convert the maximization problem into a minimization problem. The optimization solution is subject to the constraints that the feed at each interval cannot exceed 2600 L h−1, and the final volume in the reactor should not exceed the maximum.
$$\min_{U} J = -V y = -V f(U) = -\left(V_0 + \sum_{i=1}^{8} u_i t_s\right) f(U) \tag{37}$$
$$\text{subject to:} \quad 0 \le u_i \le 2600\ \mathrm{L\,h^{-1}}, \qquad V < V_f\ \mathrm{L}$$
In the above equation, V0 is the initial volume of the content in the reactor at the start of a batch; V is the final volume of the content in the reactor at the end of a batch; U is the feed profile; and ts is the interval length (or sampling time).
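The objective and constraints of Equation (37) translate directly into code. The sketch below is in Python rather than MATLAB and does not call an optimizer; it only shows what a solver such as fmincon would be asked to minimize and which conditions a candidate feed profile must satisfy. The function names are hypothetical, `f` stands for any callable that returns the predicted final biomass concentration (for example, a trained ELM), and `V_max` is a required argument because the maximum reactor volume is not stated in this section.

```python
def final_volume(U, V0, ts):
    """Final volume (L): initial volume plus feed rate (L/h) times interval length (h)."""
    return V0 + ts * sum(U)


def objective(U, f, V0, ts):
    """Negative final amount of biomass, J(U) = -(V0 + sum(u_i) * ts) * f(U)."""
    return -final_volume(U, V0, ts) * f(U)


def is_feasible(U, V0, ts, V_max, u_max=2600.0):
    """Check the feed-rate bounds and the final-volume limit of Equation (37)."""
    within_bounds = all(0.0 <= u <= u_max for u in U)
    return within_bounds and final_volume(U, V0, ts) < V_max
```

Because the final volume is a linear function of the feed rates, the volume limit is a linear inequality in U and the only nonlinearity in the problem comes from the model f.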
It should be noted that this optimization control strategy is not a conventional feedback control strategy in which a manipulated variable is continuously adjusted by a feedback controller. In this optimization control strategy, an ELM model of the batch process is used in an optimization framework to find the best feeding profile.

2.5. Batch-to-Batch Optimization Control Strategy Integrating ELM and RLS

In this section, the batch-to-batch optimizing control strategy integrating ELM and RLS for the fed-batch fermentation process is outlined. The RLS algorithm updates the parameters of the ELM model, i.e., the output-layer weights and bias, by recursively solving the least-squares problem. This compensates for plant–model mismatches and the effects of unknown disturbances. The output-layer weights and bias of the ELM for the previous batch are used as the initial guess for the RLS procedure. The parameters are updated iteratively in the RLS algorithm. Combining ELM with RLS is beneficial when used on real plant systems because the historical process data are often inaccurate or insufficient. The updated ELM model can then be used for the optimization of the next batch. The optimization will benefit from the ELM model being up-to-date and accurate. Figure 2 gives the framework of the proposed approach, and the details are explained below.
To address the situation in which only limited historical process operation data are available, we started with a small number of historical batches. A set of 25 feed profiles, each consisting of 8 feed intervals, was created. Each interval had a duration of 2 h, and the feed rate was kept constant throughout the interval. These feed profiles were generated by adding uniformly distributed random variations (up to ±50 L h−1) to a baseline profile to emulate how different operators might have different batch strategies based on their own experience. Figure 3 shows all the 25 feed profiles used. Due to the large number of curves, legends have not been added to the figure. It can be seen from Figure 3 that these feed profiles have a general increasing trend as the batch progresses. This is consistent with the general feed profiles for this process reported in previous studies [3,27,28,30]. Each feed profile was simulated in the fed-batch fermentation mechanistic model to find the corresponding final biomass concentrations. To represent real processes in which measurement noise always exists, normally distributed noise with zero mean and a standard deviation of 0.1 (g L−1) was added to the final biomass concentrations. The data in this study were generated through simulation. In practical applications, the substrate feed rate can be measured using a flow meter, and the biomass concentration at the end of a batch can be either measured on-line or through laboratory analysis.
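The data-generation step described above (random perturbation of a baseline feed profile, then additive measurement noise on the simulated outputs) could look as follows in Python. The baseline values in `BASELINE` are hypothetical and only mimic the increasing trend mentioned in the text; the actual baseline profile is not given here. The function names and seeds are likewise illustrative.

```python
import random

# Hypothetical increasing baseline feed profile (L/h), one value per 2 h interval.
# These numbers are placeholders, not the baseline used in this study.
BASELINE = [400.0, 600.0, 800.0, 1000.0, 1200.0, 1400.0, 1600.0, 1800.0]


def make_feed_profiles(baseline, n_profiles=25, max_dev=50.0, seed=0):
    """Perturb each interval of the baseline by a uniform deviation in [-max_dev, max_dev]."""
    rng = random.Random(seed)
    return [[max(0.0, u + rng.uniform(-max_dev, max_dev)) for u in baseline]
            for _ in range(n_profiles)]


def add_measurement_noise(values, sd=0.1, seed=1):
    """Add zero-mean Gaussian noise with standard deviation sd (g/L) to each value."""
    rng = random.Random(seed)
    return [v + rng.gauss(0.0, sd) for v in values]
```

Each generated profile would then be passed to the process simulation, and `add_measurement_noise` applied to the resulting final biomass concentrations before model training.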
The historical batch data were normalized by scaling to zero mean and unit variance. They were then randomly partitioned into training, validation, and unseen testing data sets. The number of hidden neurons in the ELM model was determined using cross validation. The ELM model was trained using the training data set and with different numbers of hidden neurons. In this study, we consider the number of hidden neurons to be in the range of 1 to 60. The number of hidden neurons that resulted in a low combined root-mean-square error (RMSE) in the training and validation data was selected. Figure 4 shows how the modeling error changed with the number of hidden neurons. The final selected ELM model had 35 hidden neurons based on Figure 4. Note that, if the final selected number of hidden neurons is close to 60 (the upper end), then the upper end of the range should be extended. A plot of the predicted against the actual final biomass concentration, showing the performance of the trained ELM model on all 3 data sets, is given in Figure 5. Note that the values shown in the plot are normalized values.
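The normalization and model-size selection steps can be sketched as below. This is an illustrative Python version with hypothetical function names. It assumes the population standard deviation (division by the number of samples) and assumes that "low combined RMSE" is read as the smallest sum of training and validation RMSE; the text does not pin down either choice.

```python
import math


def fit_scaler(rows):
    """Column-wise mean and population standard deviation of a list of rows."""
    n = len(rows)
    columns = list(zip(*rows))
    means = [sum(col) / n for col in columns]
    stds = [math.sqrt(sum((v - m) ** 2 for v in col) / n)
            for col, m in zip(columns, means)]
    return means, stds


def transform(rows, means, stds):
    """Scale each column to zero mean and unit variance (stds must be nonzero)."""
    return [[(v - m) / s for v, m, s in zip(row, means, stds)] for row in rows]


def inverse_transform(rows, means, stds):
    """Map normalized values back to the original scale."""
    return [[v * s + m for v, m, s in zip(row, means, stds)] for row in rows]


def select_hidden_neurons(rmse_by_size):
    """Pick the hidden-layer size with the smallest train + validation RMSE.

    rmse_by_size maps a hidden-layer size to a (train RMSE, validation RMSE) pair.
    """
    return min(rmse_by_size, key=lambda size: sum(rmse_by_size[size]))
```

The scaler should be fitted on the historical data once and then reused, so that new feed profiles and model predictions are mapped to and from the same scale.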
A given batch feed profile was run in both the fed-batch simulation and the trained ELM model. In order for the feed profile to run in the fed-batch simulation, it was scaled back to the original scale. The prediction error of the ELM model was used in the RLS updating algorithm, i.e., Equations (29), (30), (34) and (35), to update the output-layer weights and bias of the ELM model. The updated model was then used in the optimization procedure to find an optimal feed profile for the next batch. The optimization yields a new feed profile that provides the maximum amount of the final biomass that the current model can achieve, subject to the constraints of the optimization. The process was repeated from batch to batch, using the new optimized feed profile with each batch. The reason for continuously updating the model is that it can then handle unknown disturbances. Also, in the case of limited historical batches for training the model, continuously updating the model will help it to adapt and improve its generalization capabilities.
ELM models can be trained quickly, so redeveloping the model after each batch is feasible when the data set is small. However, when the data set is large, as in practical situations in which a batch process can run for years, redeveloping an ELM model after each batch is costly. This issue can be addressed by recursively updating the ELM model from batch to batch using RLS.

3. Results and Discussions

3.1. Adaptive Modeling

To assess the RLS model update, a series of feed profiles, similar to the ones used to train the ELM model, were run through both the fed-batch simulation and the ELM model. The prediction error of the model was calculated after each batch. Recursive updating of the model parameters through RLS was performed after each batch. An unmeasured disturbance in the feed concentration was introduced from the 51st batch. The prediction error of the model for each subsequent batch is plotted in Figure 6, which also shows the results of this process when RLS model updating was not performed. Prior to the disturbance, the systems with and without RLS updating performed about equally, with root-mean-square errors (RMSEs) of 0.075 and 0.079, respectively. The RMSE of the ELM model with batch-to-batch updating was only marginally lower than that of the ELM model without batch-to-batch updating. Even with the recursive model updating, the prediction error was unable to converge to zero due to the added measurement noise. A small number of historical training batches (25 batches) were used to build the model; therefore, it is expected that the predictive capability of the ELM model will improve when more batches are performed and used to train the model through RLS updating. When Alli and Zhang [21] performed a similar experiment on a different case study, the prediction error remained steady and close to zero both with and without RLS model updating. This is most likely due to the relatively small measurement noise used in [21].
Following the disturbance, the system without RLS updating continually overpredicted the final biomass concentration and was unable to correct itself. This is apparent in the prediction error remaining below zero. For the system with RLS model updating, the prediction error reduced back toward zero. However, the error does not truly converge to zero, probably due to the added measurement noise, and it takes 30–40 batches for the prediction accuracy to return to a level similar to that of the batches before the disturbance occurred. The RLS system was able to successfully modify the ELM model output-layer weights so that the prediction error for the following batches returned to near zero. Overall, Figure 6 shows that RLS model updating is able to improve the accuracy of the ELM model following a disturbance, but the performance of this system still has considerable room for improvement. A possible reason for this is that the RLS algorithm was not tuned well, with suboptimal values used for the forgetting factor or P0. The use of a forgetting factor is important as it causes the RLS algorithm to put more focus on the most recent batch data. This is significant, especially when there is a disturbance to the system, as it helps the updated model to better represent the current process. Another likely reason could be that the historical data used to train the model were not sufficient. The feed profiles could have been too varied or not varied enough, and the volume of data (25 batches) may not have been sufficient. This highlights a possible issue of ELM modeling and data-driven modeling as a whole: the accuracy of a model is reliant on it being provided with “good” data, which is not always possible, especially in industrial environments.

3.2. Batch-to-Batch Optimization with RLS Model Updating

The final biomass concentrations from both the ELM model and the fed-batch simulation for the batch-to-batch optimization with RLS model updating outlined in Section 2.5 are shown in Figure 7. A change in the substrate feed concentration (325 g L−1 to 322 g L−1) was introduced from batch 11 to represent an unknown disturbance. It should be noted that such a disturbance is typically associated with raw material variations and typically lasts for a number of consecutive batches. For the first iteration (batch) of the process, the final biomass concentrations of the simulation and the model prediction were very similar as shown in Figure 8. This is because the first feed profile used in batch 1 came from the training data, and so the ELM model was well fitted for it. In the next iteration (batch), the feed profile was optimized using the updated ELM model. As a result, the final biomass concentration improved for both simulation and ELM model prediction. However, the ELM model significantly overpredicted and caused a large prediction error. This is due to the new optimized feed profile being different from the training data and so being outside the bounds of the ELM model’s predictive capabilities. For the next batch, the ELM model’s prediction was much closer to the result of simulation. This shows that the model was successfully updated through RLS so that it could better predict the final biomass concentration for the new optimized profiles. The prediction error continued to improve with each batch and converged to zero after six batches. The unmeasured disturbance to the feed concentration introduced from batch 11 caused the final biomass concentration to decrease. This caused the ELM model to overpredict again. However, RLS was again able to update the model parameters so that after about eight batches, the prediction error was back close to zero.
The final biomass concentration following the disturbance was not as large as what was achieved prior to the disturbance. This was likely due to the physical limitations of the new conditions.
Figure 8a shows how the mass of biomass at the end of the batch changed with each iteration of the batch-to-batch optimization, with and without RLS model updating. The new optimized feed profile in batch 2 caused large model prediction errors for both systems. The system with RLS model updating used the prediction error to update the parameters of the ELM model and brought the prediction error back close to zero in the subsequent batches. In the system without RLS updating, however, the large prediction error persisted because the ELM model was unable to adapt. The objective function in the optimization was then based on this inaccurate model, so the generated feed profile was not optimal for the true fermentation process. As a result, the final amount of biomass was lower than that from the system with RLS updating, which used a more accurate model in the optimization. The performance also could not improve after the second batch because the model was not updated, so the optimization was solving the same problem in every batch. The batch-to-batch optimization without RLS model updating was also unable to adapt to the disturbance introduced from batch 11, as shown in Figure 8. In contrast, the batch-to-batch optimization with the batchwise-updated ELM model overcame the effect of the disturbance and quickly converged to stable operation (after about five batches). It can be seen from Figure 8 that the final amount of biomass under the batch-to-batch optimization control with RLS model updating was about 40 kg higher than that for the scheme with the fixed ELM model. Figure 9 shows the optimized feed profiles for batch 10 (before the introduction of the disturbance) and batch 30 (after the effect of the disturbance had been captured by the RLS model updating).
It can be seen from Figure 9 that only small adjustments to the feed profile during the second, third, and fifth stages (i.e., 2–6 h and 8–10 h) were required to overcome the effect of the unknown disturbance.
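The stage-to-time mapping quoted above follows from simple arithmetic if the feed profile is piecewise constant over equal intervals. The short sketch below assumes a batch length of 16 h and that the 2 h interval listed in Table 2 is the length of each feed stage; the function name is illustrative.

```python
BATCH_LENGTH_H = 16  # t_f from Table 2
STAGE_LENGTH_H = 2   # t_int from Table 2, assumed to be the feed-stage length

def stage_window(stage):
    """Return (start_h, end_h) of a 1-indexed feed stage."""
    n_stages = BATCH_LENGTH_H // STAGE_LENGTH_H
    if not 1 <= stage <= n_stages:
        raise ValueError(f"stage must be in 1..{n_stages}")
    return ((stage - 1) * STAGE_LENGTH_H, stage * STAGE_LENGTH_H)

adjusted = {stage: stage_window(stage) for stage in (2, 3, 5)}
print(adjusted)  # {2: (2, 4), 3: (4, 6), 5: (8, 10)}
```

Under these assumptions the batch has eight stages, and stages 2, 3, and 5 cover 2–6 h and 8–10 h.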

3.3. Improvement Suggestions

The new optimized feed profiles shown in Figure 9 differed substantially from the historical training feed profiles shown in Figure 3. Prior to any RLS model updating, the model struggled to predict the final biomass concentration for these new feed profiles, as shown by the large prediction error for batch 2 in Figure 8b. Rebuilding the model by including these feed profiles in the training data should improve its predictive capabilities.
In this study, 25 historical batches were used to build the model, which is a small amount of data for model building. Simply using more historical batches to build the ELM model could improve its performance. However, a larger training set requires operating data from many more batches, which takes more time and money to collect. The fewer batches used, the sooner the model can be built and implemented. Therefore, a balance must be found when choosing the number of historical batches.
The ELM model only predicted the final biomass concentration of the desired product and was used in the optimization to maximize the amount of this product at the end of the batch. The new optimized feed profiles may therefore cause excessive production of undesirable by-products. To address this, the ELM model could be trained with two outputs in the output layer: the final desired product concentration and an undesired by-product concentration. The objective function could then be adapted to maximize the amount of desired product whilst minimizing the amount of undesired by-product. Alternatively, two ELM models, one for each product, could be built and updated separately.
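One simple way to combine the two goals is a weighted-sum objective. The sketch below is only an illustration of that idea. The toy model, the weight values, and the candidate feed profiles are hypothetical stand-ins, not the ELM model or the optimizer used in this study.

```python
def weighted_objective(product, byproduct, weight):
    """Reward the desired product and penalize the by-product."""
    return product - weight * byproduct

def toy_two_output_model(feed_profile):
    """Hypothetical stand-in for a two-output model: returns (product, by-product)."""
    product = sum(feed_profile)
    byproduct = sum(rate ** 2 for rate in feed_profile)
    return product, byproduct

def best_profile(candidates, model, weight):
    """Pick the candidate feed profile with the highest weighted objective."""
    return max(candidates, key=lambda u: weighted_objective(*model(u), weight))

candidates = [[0.5, 0.5], [1.0, 1.0], [2.0, 2.0]]
product_only = best_profile(candidates, toy_two_output_model, weight=0.0)
balanced = best_profile(candidates, toy_two_output_model, weight=0.5)
print(product_only, balanced)  # [2.0, 2.0] [1.0, 1.0]
```

With a zero weight the search simply maximizes the product and picks the most aggressive profile. A non-zero weight shifts the choice toward a profile that gives less by-product. How the weight should be set is a design decision that this sketch does not address.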
Baron and Zhang [30] showed that combining multiple ELM models, developed from bootstrap re-sampling copies of the original training data, improved the accuracy of the model. Additionally, the confidence bounds of the model predictions can be found and used in the optimization to improve the reliability of the optimal feed profile. Using a bootstrap-aggregated extreme learning machine (BA-ELM) with the RLS model updating could provide further improved results.
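The bootstrap aggregation idea can be sketched as follows. To keep the example short, a least-squares straight line stands in for each individual ELM, so this is not a BA-ELM implementation. The data, the number of models, and the 1.96 multiplier for the bounds are illustrative assumptions.

```python
import random
import statistics

def fit_line(xs, ys):
    """Least-squares straight line; a simple stand-in for training one ELM."""
    mean_x, mean_y = statistics.fmean(xs), statistics.fmean(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx if sxx else 0.0
    return slope, mean_y - slope * mean_x

def bootstrap_models(xs, ys, n_models=20, seed=0):
    """Fit one model per bootstrap resample (sampling with replacement)."""
    rng = random.Random(seed)
    n = len(xs)
    models = []
    for _ in range(n_models):
        idx = [rng.randrange(n) for _ in range(n)]
        models.append(fit_line([xs[i] for i in idx], [ys[i] for i in idx]))
    return models

def predict_with_bounds(models, x, z=1.96):
    """Aggregate prediction (mean) with bounds from the spread across the models."""
    preds = [slope * x + intercept for slope, intercept in models]
    mean = statistics.fmean(preds)
    spread = statistics.stdev(preds)
    return mean, mean - z * spread, mean + z * spread

# Hypothetical noisy data for illustration only.
xs = [float(i) for i in range(10)]
ys = [2.0 * x + 1.0 + 0.1 * (-1) ** int(x) for x in xs]
models = bootstrap_models(xs, ys)
mean, lower, upper = predict_with_bounds(models, 5.0)
print(mean, lower, upper)
```

A wide interval indicates that the individual models disagree for that input, which an optimizer could penalize so that it prefers feed profiles whose predictions are more reliable.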

4. Conclusions

Batch-to-batch optimization control based on a recursively updated extreme learning machine (ELM) model of a fed-batch fermentation process is presented in this paper. ELM is a type of single-hidden-layer feedforward neural network that uses randomly assigned hidden-layer weights and biases, which makes model training very fast. Iterative learning in the form of batch-to-batch updating of the output-layer weights and bias through recursive least squares (RLS) improved the modeling performance. The predictive performance of the batchwise-updated ELM model was better than that of the original ELM model, especially in the presence of an unknown disturbance. The batchwise-updated ELM model was also able to track the influence of unknown disturbances (variations in substrate feed concentration) by adapting to the new conditions. An offline optimization was performed on the updated model after each batch and was able to improve the final mass of biomass for the next batch. The batch-to-batch optimization policy was also able to respond to unknown disturbances (variations in substrate feed concentration) by altering the feed profile so that the batch product quality following the disturbance returned to a level similar to that achieved before the disturbance. The batch-to-batch optimization with the updated ELM model produced about 40 kg more final biomass per batch than that without model updating when the feed concentration was disturbed from its nominal value of 325 g L−1 to 322 g L−1. The effectiveness of the proposed modeling and optimization method has been demonstrated on a simulated fed-batch fermentation process, showing how it can be used to handle process–model mismatch and unknown disturbances.

Author Contributions

Conceptualization, A.M. and J.Z.; methodology, A.M. and J.Z.; software, A.M.; validation, A.M.; formal analysis, A.M.; investigation, A.M.; resources, J.Z.; data curation, A.M.; writing—original draft preparation, A.M.; writing—review and editing, J.Z.; visualization, A.M.; supervision, J.Z.; project administration, J.Z.; funding acquisition, J.Z. All authors have read and agreed to the published version of the manuscript.


Funding

This research received no external funding.

Data Availability Statement

Data will be made available on request.

Conflicts of Interest

The authors declare no conflicts of interest.


References

  1. Ochoa, S. Fed-Batch Fermentation—Design Strategies. In Comprehensive Biotechnology, 3rd ed.; Moo-Young, M., Ed.; Pergamon: Oxford, UK, 2019; pp. 586–600.
  2. Bonvin, D. Optimal operation of batch reactors—A personal view. J. Process Control 1998, 8, 355–368.
  3. Jewaratnam, J.; Zhang, J.; Morris, J.; Hussain, A. Batch-to-batch iterative learning control using linearised models with adaptive model updating. In Proceedings of the 2012 UKACC International Conference on Control, Cardiff, UK, 3–5 September 2012; pp. 271–276.
  4. Zhang, J.; Nguyan, J.; Morris, J.; Xiong, Z. Batch to batch iterative learning control of a fed-batch fermentation process using linearised models. In Proceedings of the 2008 10th International Conference on Control, Automation, Robotics and Vision, Hanoi, Vietnam, 17–20 December 2008; pp. 745–750.
  5. Thomas, I.M.; Kiparissides, C. Computation of the near-optimal temperature and initiator policies for a batch polymerization reactor. Can. J. Chem. Eng. 1984, 62, 284–291.
  6. Souza, F.A.A.; Araújo, R.; Mendes, J. Review of soft sensor methods for regression applications. Chemom. Intell. Lab. Syst. 2016, 152, 69–79.
  7. Tian, Y.; Zhang, J.; Morris, A.J. Modelling and optimal control of a batch polymerisation reactor using a hybrid stacked recurrent neural network model. Ind. Eng. Chem. Res. 2001, 40, 4525–4535.
  8. Chen, L.; Hontoir, Y.; Huang, D.; Zhang, J.; Morris, A.J. Combining first principles with black-box techniques for reaction systems. Control Eng. Pract. 2004, 12, 819–826.
  9. Psichogios, D.C.; Ungar, L.H. A hybrid neural network-first principles approach to process modeling. AIChE J. 1992, 38, 1499–1511.
  10. Thompson, M.L.; Kramer, M.A. Modeling chemical processes using prior knowledge and neural networks. AIChE J. 1994, 40, 1328–1340.
  11. von Stosch, M.; Oliveira, R.; Peres, J.; Feyo de Azevedo, S. Hybrid semi-parametric modeling in process systems engineering: Past, present and future. Comput. Chem. Eng. 2013, 60, 86–101.
  12. Zhan, Y.; Zhu, J. Response surface methodology and artificial neural network-genetic algorithm for modeling and optimization of bioenergy production from biochar-improved anaerobic digestion. Appl. Energy 2024, 355, 122336.
  13. Sun, P.; Chen, J.; Que, H. A priori knowledge-based dual hierarchical RNN for spatial-temporal process modeling: Using a multitubular reactor as a case study. IEEE Trans. Ind. Inform. 2024, 20, 899–910.
  14. Himmelblau, D.M. Accounts of experiences in the application of artificial neural networks in chemical engineering. Ind. Eng. Chem. Res. 2008, 47, 5782–5796.
  15. Misra, S.; Li, H. Comparative study of shallow and deep machine learning models for synthesizing in situ NMR T2 distributions. In Machine Learning for Subsurface Characterization; Misra, S., Li, H., He, J., Eds.; Gulf Professional Publishing: Houston, TX, USA, 2020; pp. 219–241.
  16. Ławryńczuk, M. Neural Modelling of a Yeast Fermentation Process Using Extreme Learning Machines; Springer: Berlin/Heidelberg, Germany, 2016.
  17. Kulkarni, S.G.; Chaudhary, A.K.; Nandi, S.; Tambi, S.S.; Kulkarni, B.D. Modeling and monitoring of batch processes using principal component analysis (PCA) assisted generalized regression neural networks (GRNN). Biochem. Eng. J. 2004, 18, 193–210.
  18. Xiong, Z.; Xu, Y.; Zhang, J.; Dong, J. Batch-to-batch control of fed-batch processes using control-affine feedforward neural network. Neural Comput. Appl. 2008, 17, 425–432.
  19. Huang, G.-B.; Zhu, Q.-Y.; Siew, C.-K. Extreme learning machine: Theory and applications. Neurocomputing 2006, 70, 489–501.
  20. Sharma, N.; Deo, R. Wind speed forecasting in Nepal using self-organizing map-based online sequential extreme learning machine. In Predictive Modelling for Energy Management and Power Systems Engineering; Deo, R., Samui, P., Roy, S.S., Eds.; Elsevier: Amsterdam, The Netherlands, 2021; pp. 437–484.
  21. Alli, K.; Zhang, J. Adaptive Modelling of Fed-batch Processes with Extreme Learning Machine and Recursive Least Square Technique. In Proceedings of the 12th International Conference on Agents and Artificial Intelligence, Valletta, Malta, 22–24 February 2020; SciTePress: Setúbal Municipality, Portugal, 2020; Volume 2, pp. 668–674.
  22. Huang, G.; Huang, G.-B.; Song, S.; You, K. Trends in extreme learning machines: A review. Neural Netw. 2015, 61, 32–48.
  23. Huang, G.-B.; Zhu, Q.-Y.; Siew, C. Extreme learning machine: A new learning scheme of feedforward neural networks. In Proceedings of the IEEE International Conference on Neural Networks, Budapest, Hungary, 25–29 July 2004; pp. 985–990.
  24. Li, W.H.; Yue, H.H.; Valle-Cervantes, S.; Qin, S.J. Recursive PCA for adaptive process monitoring. J. Process Control 2000, 10, 471–486.
  25. Yu, W.K.; Zhao, C.H. Recursive exponential slow feature analysis for fine-scale adaptive processes monitoring with comprehensive operation status identification. IEEE Trans. Ind. Inform. 2019, 15, 3311–3323.
  26. Yu, W.K.; Zhao, C.H.; Huang, B. Recursive cointegration analytics for adaptive monitoring of nonstationary industrial processes with both static and dynamic variations. J. Process Control 2020, 92, 319–332.
  27. Yüzgeç, U.; Türker, M.; Hocalar, A. On-line evolutionary optimization of an industrial fed-batch yeast fermentation process. ISA Trans. 2008, 48, 79–92.
  28. Zhang, J.; Feng, Y.; Al-Mahrouqi, M.H. Reliable optimal control of a fed-batch fermentation process using ant colony optimisation and bootstrap aggregated neural network models. In Computer-Aided Chemical Engineering, Proceedings of the 21st European Symposium on Computer Aided Process Engineering, Chalkidiki, Greece, 29 May–1 June 2011; Pistikopoulos, E.N., Georgiadis, M.C., Kokossis, A.C., Eds.; Elsevier: Amsterdam, The Netherlands, 2011; Volume 29, pp. 664–667.
  29. Åström, K.J.; Wittenmark, B. Computer-Controlled Systems: Theory and Design, 3rd ed.; Dover Publications: Mineola, NY, USA, 2011.
  30. Baron, C.; Zhang, J. Reliable On-Line Re-Optimization Control of a Fed-Batch Fermentation Process Using Bootstrap Aggregated Extreme Learning Machine; Springer Nature: Cham, Switzerland, 2020.
Figure 1. A single-hidden-layer feedforward neural network (SLFN) configured appropriately for training by ELM.
Figure 2. Flow diagram of the proposed batch-to-batch optimization control strategy integrating ELM and RLS: (a) data generation; (b) ELM modeling; (c) batch-to-batch optimization with ELM model updating using RLS.
Figure 3. Feed profiles of the historical batches used to build the ELM model.
Figure 4. RMSE of the model prediction for the training and validation data sets.
Figure 5. Predicted (ELM) vs. actual (fed-batch simulation) final biomass concentration (normalized values) for the training, validation, and unseen testing data sets.
Figure 6. Prediction error of the ELM model for recursive batches with and without RLS model updating. An unmeasured disturbance is introduced from the 51st batch.
Figure 7. Actual (fed-batch simulation) and predicted (ELM model) final biomass concentration for the ELM with RLS model updating system. An unmeasured disturbance is introduced from the 11th batch.
Figure 8. Final mass of biomass from the fed-batch simulation (a) and model prediction error (b) for the system with and without RLS model updating. An unmeasured disturbance is introduced from the 11th batch.
Figure 9. The optimized feed profiles for batch 10 and batch 30.
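Table 1. Parameter values used in the fed-batch fermentation model.

| Parameter | Unit | Value | Parameter | Unit | Value |
|---|---|---|---|---|---|
| $K_e$ | g L$^{-1}$ | 0.1 | $Y_{e/o}$ | g g$^{-1}$ | 1.1236 |
| $K_o$ | g L$^{-1}$ | $9.6 \times 10^{-5}$ | $Y_{e/s}$ | g g$^{-1}$ | 0.4859 |
| $K_i$ | g L$^{-1}$ | 3.5 | $Y_{x/e}$ | g g$^{-1}$ | 0.7187 |
| $K_s$ | g L$^{-1}$ | 0.612 | $Y_{o/s}$ | g g$^{-1}$ | 0.3857 |
| $Y_{x/s}^{ox}$ | g g$^{-1}$ | 0.585 | $Y_{c/e}$ | g g$^{-1}$ | 0.6450 |
| $Y_{x/s}^{red}$ | g g$^{-1}$ | 0.050 | $Y_{o/e}$ | g g$^{-1}$ | 0.8904 |
| $Y_{c/s}^{ox}$ | g g$^{-1}$ | 0.5744 | $S_o$ | g L$^{-1}$ | 325 |
| $Y_{c/s}^{red}$ | g g$^{-1}$ | 0.4620 | $C_o$ | g L$^{-1}$ | 0.006 |
| $Q_{o,max}$ | g g$^{-1}$ h$^{-1}$ | 0.255 | $\mu_{cr}$ | h$^{-1}$ | 0.21 |
| $Q_{s,max}$ | g g$^{-1}$ h$^{-1}$ | 2.943 | $A_R$ | m$^{2}$ | 12.56 |
| $Q_{e,max}$ | g g$^{-1}$ h$^{-1}$ | 0.238 | $t_d$ | h | 2 |
| $Q_m$ | g g$^{-1}$ h$^{-1}$ | 0.03 | $F_a$ | m$^{3}$ h$^{-1}$ | 9983.4451 |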
Table 2. Initial conditions and settings used in the fed-batch fermentation model.

| Variable | Unit | Value | Variable | Unit | Value |
|---|---|---|---|---|---|
| $C_s(t=0)$ | g L$^{-1}$ | 7 | $V(t=0)$ | L | 50,000 |
| $C_o(t=0)$ | g L$^{-1}$ | $7.8 \times 10^{-3}$ | $V_{max}$ | L | 100,000 |
| $C_e(t=0)$ | g L$^{-1}$ | 0 | $t_f$ | h | 16 |
| $C_x(t=0)$ | g L$^{-1}$ | 15 | $t_{int}$ | h | 2 |
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

Share and Cite

MDPI and ACS Style

Moore, A.; Zhang, J. Batch-to-Batch Optimization Control of Fed-Batch Fermentation Process Based on Recursively Updated Extreme Learning Machine Models. Algorithms 2025, 18, 87.
