Parallel Solution of Robust Nonlinear Model Predictive Control Problems in Batch Crystallization

Cao, Yankai; Kang, Jia; Nagy, Zoltan K.; Laird, Carl D.

doi:10.3390/pr4030020


by Yankai Cao ¹, Jia Kang ², Zoltan K. Nagy ¹ and Carl D. Laird ¹,*

¹ School of Chemical Engineering, Purdue University, 480 Stadium Mall Drive, West Lafayette, IN 47907, USA

² Department of Chemical Engineering, Texas A&M University, 3122 TAMU, College Station, TX 77843, USA

* Author to whom correspondence should be addressed.

Processes 2016, 4(3), 20; https://doi.org/10.3390/pr4030020

Submission received: 6 May 2016 / Revised: 20 June 2016 / Accepted: 22 June 2016 / Published: 30 June 2016

(This article belongs to the Special Issue Algorithms and Applications in Dynamic Optimization)


Abstract:

Representing the uncertainties with a set of scenarios, the optimization problem resulting from a robust nonlinear model predictive control (NMPC) strategy at each sampling instance can be viewed as a large-scale stochastic program. This paper solves these optimization problems using the parallel Schur complement method developed to solve stochastic programs on distributed and shared memory machines. The control strategy is illustrated with a case study of a multidimensional unseeded batch crystallization process. For this application, a robust NMPC based on min–max optimization guarantees satisfaction of all state and input constraints for a set of uncertainty realizations, and also provides better robust performance compared with open-loop optimal control, nominal NMPC, and robust NMPC minimizing the expected performance at each sampling instance. The performance of robust NMPC can be improved by generating optimization scenarios using Bayesian inference. With the efficient parallel solver, the solution time of one optimization problem is reduced from 6.7 min to 0.5 min, allowing for real-time application.

Keywords:

dynamic optimization; robust NMPC; parallel NLP; batch crystallization


1. Introduction

Nonlinear model predictive control (NMPC) is an advanced control technique based on the online solution of a nonlinear optimal control problem at each sampling instance using new measurements and updated state estimates. The quality of NMPC depends on the accuracy of the underlying model. Despite the high fidelity of nonlinear models based on first principles, there are still uncertainties associated with external and internal disturbances. Although the inherent robust Input-to-State Stability (ISS) of NMPC can be proven for ideal NMPC [1,2], the assumption that the existence of uncertainties does not change the feasibility (e.g., no state and input constraints) is not valid for many applications. Even if robust stability holds, it is of limited use in analyzing the robust performance, especially for batch processes.

Several approaches have been proposed to take uncertainty into consideration in the design of NMPC algorithms. The most widely studied approach is to solve a min–max optimization at each sampling instance to minimize the performance index of the worst case while satisfying the state and input constraints for a set of uncertainty realizations [3]. One concern about this approach is that the nominal performance is sacrificed, as the min–max optimization often chooses a very conservative control strategy. Huang et al. [4] propose to minimize the expected value of the performance index based on multiple uncertainty scenarios. Nagy and Braatz [5] minimize a weighted sum of the expected value and the variance of the performance index. While all of these approaches can be implemented within a feedback framework, this feedback is not considered in the NMPC optimization formulation itself. By contrast, Magni et al. [6] optimize the control laws instead of the control steps at each sampling step. However, if the form of the control law is overly complex, this approach may not be computationally feasible. Recently, several other methods, including multi-stage NMPC [7], Riccati differential equations [8] and a relaxation-based approach [9], were reported.

If we represent the uncertainties with a set of scenarios, the multi-scenario-based robust NMPC problem can be viewed as a large-scale stochastic program. The problem size becomes too large to be solved efficiently online by a serial solver, driving the need for parallel algorithms. For stochastic programs, an efficient parallel algorithm often exploits the structure at the problem formulation level (e.g., Benders decomposition, Lagrangian decomposition, Lagrangian relaxation, progressive hedging) or at the linear algebra level. Although the parallelization of the first class can be easily implemented, the convergence rate is typically slow, especially for nonlinear problems. In contrast, the second class of approaches can retain the fast convergence properties of the original host algorithms. For this class, interior-point methods are popular because the structure of the linear system remains the same at each iteration. The linear systems derived using interior-point methods for stochastic programming problems have a block-bordered-diagonal form, and they can be decomposed using the Schur complement method [10]. When the number of first stage variables is small, this approach has almost perfect strong scaling. However, when the number of first stage variables is large, forming and solving the dense Schur complement becomes a computational bottleneck.

Many approaches have been proposed to deal with stochastic programs with large first-stage dimensionality. Kang et al. [11] use a preconditioned conjugate gradient (PCG) procedure to solve the Schur system with an automatic L-BFGS preconditioner. This approach avoids both forming and factorizing the Schur complement explicitly. Lubin et al. [12] form the Schur system as a by-product of a sparse factorization and factorize the Schur system in parallel. Cao et al. [13] perform adaptive clustering of scenarios inside the solver and form a sparse compressed representation of the large Karush–Kuhn–Tucker (KKT) system as a preconditioner. The matrix that needs to be factorized in this approach is much smaller than the full-space KKT system and sparser than the Schur system.

In addition to the parallel solution of the KKT system, a scalable parallel algorithm also requires parallel evaluations of the nonlinear programming (NLP) functions and gradients, and parallel implementations of all other linear algebra operations (e.g., vector-vector operations and matrix-vector multiplications). While the latter is easy for many parallel architectures, the former is not. To the best of the authors' knowledge, there is no efficient modeling language supporting parallel evaluations of functions and gradients for general NLP problems. However, for structured problems such as stochastic programs, Kang et al. [11] and Zavala et al. [10] build a single AMPL [14] model instance for each scenario and evaluate all these instances in parallel. Several packages (e.g., PySP [15], StochJuMP [16]) have also been developed to support the parallel evaluation of functions and gradients for structured NLP problems.

This paper solves optimization problems arising from robust NMPC using the parallel algorithm developed to solve stochastic programs. This paper is organized as follows: Section 2 presents both the NMPC and robust NMPC approaches. Section 3 describes one parallel algorithm to solve large-scale stochastic programs based on the Schur complement method. Section 4 illustrates this approach with a case study of a batch crystallization process, and compares the performance of robust NMPC with open-loop control and nominal NMPC. Final conclusions are presented in Section 5.

2. Problem Formulations

This section demonstrates the problem formulations in the context of batch processes, while the solution strategy described in Section 3 can also be applied to continuous processes.

2.1. NMPC Formulation

For a batch process in the interval $[t_0, t_f]$, the optimal control problem solved online at a sampling instance $t_k$ is of the following form:

\begin{align}
\min_{u(t)} \quad & J(z(t), u(t), p), \tag{1a} \\
\text{s.t.} \quad & \frac{dz(t)}{dt} = f(z(t), u(t), p), \tag{1b} \\
& y(t) = c(z(t), u(t), p), \tag{1c} \\
& z(t_k) = \hat{z}(t_k), \tag{1d} \\
& g(z(t), u(t), p) \leq 0, \quad t \in [t_k, t_f], \tag{1e}
\end{align}

where $J$ is the objective function, $t$ is the time, $z(t)$ is the vector of $n_z$ state variables, $u$ denotes the vector of $n_u$ input variables, $p$ represents the vector of $n_p$ uncertain parameters, and $y(t)$ is the vector of $n_y$ output variables. The initial state values $\hat{z}(t_k)$ of the process are estimated using moving horizon estimation (MHE) from the historical measurements of $y(t)$, $t \in [t_0, t_k]$. The function $f$ describes the system dynamics and the function $g$ represents the constraints on the inputs and state variables. After solving the above optimal control problem, the input trajectory in the interval $[t_k, t_{k+1})$ is injected into the plant. The optimization is repeated with the updated estimate $\hat{z}(t_{k+1})$ at the next sampling instance $t_{k+1}$.

For batch processes, the objective function usually only depends on the product quality at the end of the process. Therefore, one popular expression of the objective function is:

\begin{equation}
\| y(t_f) - y_{\mathrm{set}} \|_{\Pi}^{2}, \tag{2}
\end{equation}

where $\Pi$ is a weight matrix and $y_{\mathrm{set}}$ is the setpoint. We want the product quality at the end of the batch process to be as close to the setpoint as possible.

2.2. MHE Formulation

NMPC requires the initial value of the states $\hat{z}(t_k)$, but often not all states can be measured. Therefore, we need to estimate the unmeasured state variables from the available measurements. At each sampling instance $t_k$, before solving the optimal control problem (1), we solve a state estimation problem of the following form:

\begin{align}
\min_{p,\, w(t)} \quad & \int_{t_0}^{t_k} \| w(t) \|_{R}^{2} \, dt + \sum_{i=1}^{k} \| y(t_i) - y^{m}(t_i) \|_{W}^{2} + \| p - p^{\mathrm{ref}} \|_{Z}^{2}, \tag{3a} \\
\text{s.t.} \quad & \frac{dz(t)}{dt} = f(z(t), u(t), p) + w(t), \tag{3b} \\
& y(t) = c(z(t), p), \tag{3c} \\
& z(t_0) = \tilde{z}_0, \tag{3d} \\
& z(t) \geq 0, \quad t \in [t_0, t_k], \tag{3e}
\end{align}

where $y^{m}(t_i)$ is the vector of measured values at sampling instance $t_i$, $w$ is the vector of model noise, $p^{\mathrm{ref}}$ is the vector of reference values for $p$, and $R$, $W$ and $Z$ are weighting matrices. The objective function requires the predicted output to fit the measurements, the estimated parameters to be close to their reference values, and the model noise to be small. Here, we assume the initial state value at $t_0$ is available; otherwise, $\tilde{z}_0$ is also a variable, and a term penalizing the deviation of $\tilde{z}_0$ from its reference should be included in the objective function.

2.3. Robust NMPC Formulation

Despite the high fidelity obtained with nonlinear models based on first principles, there are still uncertainties associated with external and internal disturbances. A decision made without a consideration of these uncertainties might not only result in low-quality products but also carry the risk of violating some safety constraints. In order to deal with the parameter uncertainties, robust NMPC minimizes the expected or worst-case performance. For a batch process controlled by robust NMPC minimizing the expected performance, we solve with the following objective instead of (2):

\begin{equation}
E\left( \| y(t_f) - y_{\mathrm{set}} \|_{\Pi}^{2} \right), \tag{4}
\end{equation}

where $E$ represents the expected value with respect to the uncertain parameters $p$, and $p$ follows a known distribution on the set $P \subseteq \mathbb{R}^{n_p}$.

To solve this problem numerically, one method is to assume that $p$ has a finite number of realizations $p_1, \dots, p_S$, with probabilities $\xi_1, \dots, \xi_S$. Here $\mathcal{S} := \{1, \dots, S\}$ is the scenario set and $S$ is the number of scenarios. With this assumption, the objective function can be formulated as follows:

\begin{equation}
E\left( \| y(t_f) - y_{\mathrm{set}} \|_{\Pi}^{2} \right) = \sum_{s \in \mathcal{S}} \xi_s \| y_s(t_f) - y_{\mathrm{set}} \|_{\Pi}^{2}. \tag{5}
\end{equation}

Then, we can derive the following extensive form of the robust NMPC problem, dropping $\xi_s$ from the notation by defining $\Pi \leftarrow \xi_s \Pi$:

\begin{align}
\min_{u(t)} \quad & \sum_{s \in \mathcal{S}} \| y_s(t_f) - y_{\mathrm{set}} \|_{\Pi}^{2}, \tag{6a} \\
\text{s.t.} \quad & \frac{dz_s(t)}{dt} = f(z_s(t), u(t), p_s), \tag{6b} \\
& y_s(t) = c(z_s(t), u(t), p_s), \tag{6c} \\
& z_s(t_k) = \hat{z}(t_k), \tag{6d} \\
& g(z_s(t), u(t), p_s) \leq 0, \tag{6e} \\
& t \in [t_k, t_f], \quad \forall s \in \mathcal{S}, \tag{6f}
\end{align}

where $z_s$ is the vector of states corresponding to $p = p_s$. The control profile $u$ needs to be determined before the realization of $p$ is known. Hence, we can view $u$ as the first stage variables and $z_s$ and $y_s$ as the second stage variables.

In many cases, the number of possible realizations of $p$ is infinite. To deal with that situation, a number of scenarios are generated using Monte Carlo sampling. Although Equation (5) is no longer exact, it is often a good approximation when the number of scenarios is sufficiently large. This method is called the sample average approximation (SAA) method. The optimal value of the extensive form problem (6) converges to that of the original problem with objective function (4) with probability 1 as $S \to \infty$ [17].
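The convergence of the sample average can be illustrated on a toy problem. The following Python sketch is not the crystallization model: the scalar map `terminal_output`, the sampling interval, the input value and the setpoint are all hypothetical choices, made only so that the exact expectation is available in closed form for comparison with the equally weighted scenario average of Equation (5).

```python
import random

def terminal_output(u, p):
    """Hypothetical scalar model: terminal output for input u and parameter p."""
    return p * u

def saa_cost(u, scenarios, y_set):
    """Equally weighted scenario average of the squared setpoint deviation."""
    total = sum((terminal_output(u, p) - y_set) ** 2 for p in scenarios)
    return total / len(scenarios)

def exact_cost(u, lo, hi, y_set):
    """Closed-form E[(p*u - y_set)^2] for p ~ Uniform(lo, hi)."""
    mean = 0.5 * (lo + hi)
    var = (hi - lo) ** 2 / 12.0
    return u * u * (var + mean * mean) - 2.0 * u * y_set * mean + y_set ** 2

rng = random.Random(0)                      # fixed seed for reproducibility
lo, hi, u, y_set = 0.8, 1.2, 2.0, 2.0       # illustrative values only
exact = exact_cost(u, lo, hi, y_set)

errors = {}
for n_scen in (10, 100000):
    scenarios = [rng.uniform(lo, hi) for _ in range(n_scen)]
    errors[n_scen] = abs(saa_cost(u, scenarios, y_set) - exact)
```

With a large sample, the entry `errors[100000]` is expected to be small relative to `exact`; the error for 10 scenarios depends strongly on the particular draw.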

If we want to minimize the worst-case performance index instead of expected performance at each sampling instance, we can replace the objective function (6a) with the following equations:

\begin{align}
\min_{u(t),\, \mathit{worst}} \quad & \mathit{worst}, \tag{7a} \\
\text{s.t.} \quad & \mathit{worst} \geq \| y_s(t_f) - y_{\mathrm{set}} \|_{\Pi}^{2}. \tag{7b}
\end{align}
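The difference between the two robust objectives can be seen on a small example. The Python sketch below uses a hypothetical scalar model (terminal output equal to `p * u`), three made-up scenarios, and a brute-force grid search in place of an NLP solver. For a fixed input, the smallest feasible value of the epigraph variable in (7b) is the largest scenario cost, which is what `min_max` minimizes.

```python
def scenario_costs(u, params, y_set):
    """Squared setpoint deviation for each scenario (hypothetical model y = p*u)."""
    return [(p * u - y_set) ** 2 for p in params]

def min_expected(params, y_set, grid):
    """Grid point minimizing the sum of scenario costs, as in (6a)."""
    return min(grid, key=lambda u: sum(scenario_costs(u, params, y_set)))

def min_max(params, y_set, grid):
    """Grid point minimizing the worst scenario cost, as in (7a)-(7b)."""
    return min(grid, key=lambda u: max(scenario_costs(u, params, y_set)))

params = [0.8, 1.0, 1.6]                     # illustrative scenarios
y_set = 1.0
grid = [i / 1000.0 for i in range(2001)]     # candidate inputs in [0, 2]

u_exp = min_expected(params, y_set, grid)
u_mm = min_max(params, y_set, grid)
worst_exp = max(scenario_costs(u_exp, params, y_set))
worst_mm = max(scenario_costs(u_mm, params, y_set))
```

In this toy case the two inputs differ, and the min–max input gives the smaller worst-case cost by construction, at the price of a larger summed cost.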

2.4. Efficient Optimization via the Simultaneous Approach

The above optimization problems are all differential-algebraic equation (DAE) constrained optimization problems. The simultaneous method can be used to reformulate these DAE-constrained problems by discretizing the DAE system using collocation methods [18]. As an example of the simultaneous approach, we consider the formulation for robust NMPC with expected performance as the objective. The time domain $[t_k, t_f]$ is partitioned into $n_e$ stages with lengths $h_i$, $i = 1, \dots, n_e$, where $\sum_{i=1}^{n_e} h_i = t_f - t_k$, and each stage is discretized using $n_c$ collocation points. The problem after discretization is of the following form:

\begin{align}
\min_{u^{i,j},\, z_s^{i,j},\, y_s^{i,j},\, \dot{z}_s^{i,j}} \quad & \sum_{s \in \mathcal{S}} \| y_s^{n_e, n_c} - y_{\mathrm{set}} \|_{\Pi}^{2}, \tag{8a} \\
\text{s.t.} \quad & z_s^{i,j} = z_s^{i} + h_i \sum_{k=1}^{n_c} w_{j,k} \, \dot{z}_s^{i,k}, \tag{8b} \\
& \dot{z}_s^{i,j} = f(z_s^{i,j}, u^{i,j}, p_s), \tag{8c} \\
& y_s^{i,j} = c(z_s^{i,j}, u^{i,j}, p_s), \tag{8d} \\
& z_s^{1} := \hat{z}(t_k), \tag{8e} \\
& z_s^{i+1} := z_s^{i, n_c}, \tag{8f} \\
& g(z_s^{i,j}, u^{i,j}, p_s) \leq 0, \tag{8g} \\
& \forall i = 1, \dots, n_e, \quad j = 1, \dots, n_c, \quad s \in \mathcal{S}, \tag{8h}
\end{align}

where $w_{j,k}$ are the coefficients of the Radau collocation method. If we view $u^{i,j}$ as first stage variables, and $z_s^{i,j}$, $y_s^{i,j}$, and $\dot{z}_s^{i,j}$ as second stage variables, the above problem fits the formulation of a two-stage stochastic program.
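To make the collocation constraints concrete, the following Python sketch applies them to a single scalar test equation, $dz/dt = -a z$, with $n_c = 2$. The test equation, the step count and the use of the standard two-point Radau IIA coefficient matrix are assumptions for illustration; the text does not state which Radau order is used. Because the test equation is linear, the stage equations can be solved in closed form rather than by an NLP solver.

```python
import math

# Two-point Radau IIA coefficients (collocation points at 1/3 and 1 of a stage)
W = [[5.0 / 12.0, -1.0 / 12.0],
     [3.0 / 4.0, 1.0 / 4.0]]

def radau_stage(z, h, a):
    """Solve the collocation equations of one stage for dz/dt = -a*z.

    The equations Z_j = z + h * sum_k W[j][k] * (-a * Z_k), j = 1, 2, form a
    2x2 linear system (I + h*a*W) Z = z * [1, 1], solved with Cramer's rule.
    """
    m11 = 1.0 + h * a * W[0][0]
    m12 = h * a * W[0][1]
    m21 = h * a * W[1][0]
    m22 = 1.0 + h * a * W[1][1]
    det = m11 * m22 - m12 * m21
    z1 = (z * m22 - m12 * z) / det
    z2 = (m11 * z - m21 * z) / det
    return z1, z2

def integrate(z0, a, t_final, n_e):
    """Chain n_e equal stages; each stage starts at the previous last point."""
    h = t_final / n_e
    z = z0
    for _ in range(n_e):
        _, z = radau_stage(z, h, a)   # continuity condition, as in (8f)
    return z

z_end = integrate(1.0, 1.0, 1.0, 10)
error = abs(z_end - math.exp(-1.0))   # compare with the analytic solution
```

In the simultaneous approach these stage equations are not solved sequentially as above; they are passed to the optimizer as equality constraints together with the objective.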

3. Efficient Parallel Schur Complement Method for Stochastic Programs

The robust NMPC problem formulations discussed in this paper match the structure of stochastic programming problems. A general extensive form of a two-stage stochastic program is:

\begin{align}
\min \quad & f_0(x_0) + \sum_{s \in \mathcal{S}} f_s(x_s, x_0), \tag{9a} \\
\text{s.t.} \quad & c_0(x_0) = 0, && (\lambda_0) \tag{9b} \\
& c_s(x_s, x_0) = 0, && (\lambda_s) \tag{9c} \\
& x_0 \geq 0, && (\nu_0) \tag{9d} \\
& x_s \geq 0, && (\nu_s) \tag{9e} \\
& \forall s \in \mathcal{S}, \tag{9f}
\end{align}

where $x_0 \in \mathbb{R}^{n_0}$ are the first stage variables, $\lambda_0 \in \mathbb{R}^{m_0}$ and $\nu_0 \in \mathbb{R}^{n_0}$ are the dual variables for the first stage equality constraints and bounds, $x_s \in \mathbb{R}^{n_s}$ are the second stage variables for scenario $s$, and $\lambda_s \in \mathbb{R}^{m_s}$ and $\nu_s \in \mathbb{R}^{n_s}$ are the dual variables for the second stage equality constraints and bounds. The total number of variables is $n := n_0 + \sum_{s \in \mathcal{S}} n_s$ and the total number of equality constraints is $m := m_0 + \sum_{s \in \mathcal{S}} m_s$.

In our implementation, instead of solving the original stochastic program (9), we solve problem (10), obtained by duplicating the first stage variables $x_0$ as $x_{0,s}$, $s \in \mathcal{S}$:

\begin{align}
\min \quad & f_0(x_{0,1}) + \sum_{s \in \mathcal{S}} f_s(x_s, x_{0,s}), \tag{10a} \\
\text{s.t.} \quad & c_0(x_{0,1}) = 0, && (\lambda_0) \tag{10b} \\
& c_s(x_s, x_{0,s}) = 0, && (\lambda_s) \tag{10c} \\
& x_{0,1} \geq 0, && (\nu_0) \tag{10d} \\
& x_s \geq 0, && (\nu_s) \tag{10e} \\
& x_{0,s} = x_0, && (\sigma_s) \tag{10f} \\
& \forall s \in \mathcal{S}, \tag{10g}
\end{align}

where the equality and bound constraints previously imposed on $x_0$ are applied only to $x_{0,1}$ to prevent redundant constraints.

Without Equation (10f), the above formulation can be decomposed into $S$ independent subproblems. The Lagrangian function of subproblem 1 is defined as

\begin{equation}
\begin{aligned}
L_1(x_{0,1}, x_1, \lambda_1, \lambda_0, \nu_1, \nu_0) = {} & f_0(x_{0,1}) + f_1(x_1, x_{0,1}) + \lambda_1^{T} c_1(x_1, x_{0,1}) \\
& + \lambda_0^{T} c_0(x_{0,1}) - \nu_1^{T} x_1 - \nu_0^{T} x_{0,1},
\end{aligned} \tag{11}
\end{equation}

and the Lagrangian function of each remaining subproblem $s \in \{2, \dots, S\}$ is defined as

\begin{equation}
L_s(x_{0,s}, x_s, \lambda_s, \nu_s) = f_s(x_s, x_{0,s}) + \lambda_s^{T} c_s(x_s, x_{0,s}) - \nu_s^{T} x_s. \tag{12}
\end{equation}

The Lagrangian of the whole problem (10) can be formulated as

\begin{equation}
L(x, \lambda, \nu, \sigma) = \sum_{s \in \mathcal{S}} \left[ L_s + \sigma_s^{T} (x_{0,s} - x_0) \right]. \tag{13}
\end{equation}

If we use an interior-point method to solve the problem (10), typically the dominant computational cost is the solution of the KKT system. Given the structure of problem (10), the KKT system has the following arrowhead form:

\begin{equation}
\begin{bmatrix}
K_1 & & & & B_1 \\
& K_2 & & & B_2 \\
& & \ddots & & \vdots \\
& & & K_S & B_S \\
B_1^{T} & B_2^{T} & \cdots & B_S^{T} & K_0
\end{bmatrix}
\begin{bmatrix} \Delta w_1 \\ \Delta w_2 \\ \vdots \\ \Delta w_S \\ \Delta w_0 \end{bmatrix}
=
\begin{bmatrix} r_1 \\ r_2 \\ \vdots \\ r_S \\ r_0 \end{bmatrix}, \tag{14}
\end{equation}

where

\begin{equation}
\begin{aligned}
\Delta w_0^{T} & := [\Delta x_0^{T}], \\
\Delta w_1^{T} & := [\Delta x_1^{T}, \Delta x_{0,1}^{T}, \Delta \lambda_1^{T}, \Delta \lambda_0^{T}, \sigma_1^{T}], \\
\Delta w_s^{T} & := [\Delta x_s^{T}, \Delta x_{0,s}^{T}, \Delta \lambda_s^{T}, \sigma_s^{T}], && \forall s \in \{2, \dots, S\}, \\
r_0^{T} & := \sum_{s \in \mathcal{S}} \sigma_s, \\
r_1^{T} & = - [(\nabla_{x_1} L_1 + \nu_1 - \mu_{in} X_1^{-1} e)^{T}, c_1^{T}, c_0^{T}, (x_{0,1} - x_0)^{T}], \\
r_s^{T} & = - [(\nabla_{x_s} L_s + \nu_s - \mu_{in} X_s^{-1} e)^{T}, c_s^{T}, (x_{0,s} - x_0)^{T}], && \forall s \in \{2, \dots, S\}, \\
K_0 & := \begin{bmatrix} 0_{n_0} \end{bmatrix}, \\
K_1 & := \begin{bmatrix}
W_1 & H_{0,1}^{T} & A_1 & A_0 & 0 \\
H_{0,1} & W_{0,1} & T_1 & 0 & I \\
A_1^{T} & T_1^{T} & 0 & 0 & 0 \\
A_0^{T} & 0 & 0 & 0 & 0 \\
0 & I & 0 & 0 & 0
\end{bmatrix}, \\
K_s & := \begin{bmatrix}
W_s & H_{0,s,s}^{T} & A_s & 0 \\
H_{0,s,s} & W_{0,s} & T_s & I \\
A_s^{T} & T_s^{T} & 0 & 0 \\
0 & I & 0 & 0
\end{bmatrix}, && \forall s \in \{2, \dots, S\}, \\
B_1 & := \begin{bmatrix} 0 & 0 & 0 & 0 & -I \end{bmatrix}, \\
B_s & := \begin{bmatrix} 0 & 0 & 0 & -I \end{bmatrix}, && \forall s \in \{2, \dots, S\}, \\
W_s & := H_s + X_s^{-1} V_s, && \forall s \in \{1, \dots, S\}, \\
W_{0,1} & := H_{0,1} + X_{0,1}^{-1} V_{0,1}, \\
W_{0,s} & := H_{0,s}, && \forall s \in \{2, \dots, S\},
\end{aligned} \tag{15}
\end{equation}

where $c_s = c_s(x_s, x_{0,s})$, $A_s = \nabla_{x_s} c_s(x_s, x_{0,s})$, $T_s = \nabla_{x_{0,s}} c_s(x_s, x_{0,s})$, $H_s = \nabla_{x_s x_s}^{2} L_s$, $H_{0,s} = \nabla_{x_{0,s} x_{0,s}}^{2} L_s$, and $H_{0,s,s} = \nabla_{x_{0,s} x_s}^{2} L_s$.

Assuming that all $K_s$ are of full rank, we can show with the Schur complement method that the solution of Equation (14) is equivalent to that of the following system:

\begin{align}
\underbrace{\left( K_0 - \sum_{s \in \mathcal{S}} B_s^{T} K_s^{-1} B_s \right)}_{:= Z} \Delta w_0 & = \underbrace{r_0 - \sum_{s \in \mathcal{S}} B_s^{T} K_s^{-1} r_s}_{:= r_Z}, \tag{16a} \\
K_s \Delta w_s & = r_s - B_s \Delta w_0, \quad \forall s \in \mathcal{S}. \tag{16b}
\end{align}

The system (16) can be solved in three steps. The first step is to form $Z$ and $r_Z$ by adding the contribution from each scenario $s$. This step requires the factorization of one sparse matrix $K_1$ of size $n_1 + 2 n_0 + m_1 + m_0$ and of $S - 1$ sparse matrices $K_s$ of size $n_s + 2 n_0 + m_s$. Besides these $S$ factorizations of block matrices, this step also requires a total of $(S + 1) n_0$ backsolves. The second step is to solve Equation (16a) for the step in the first stage variables, $\Delta w_0$. This step requires one factorization and one backsolve of the dense matrix $Z$. With $\Delta w_0$ known, the third step is to compute $\Delta w_s$ from Equation (16b). This step requires a total of $S$ backsolves with the sparse block matrices. A straightforward implementation of these three steps leads to the explicit Schur complement method.
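The three steps can be sketched in Python on a small dense example. This is only an illustration under stated assumptions: the blocks below are made-up 2×2 matrices rather than KKT blocks, each $B_s$ is stored with one row per entry of $\Delta w_s$ and one column per first stage variable, and the helper `solve` refactorizes the block on every call, whereas a practical implementation would factorize each sparse $K_s$ once and reuse the factors for all backsolves.

```python
def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting (small dense systems)."""
    n = len(A)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[piv] = M[piv], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (M[i][n] - sum(M[i][j] * x[j] for j in range(i + 1, n))) / M[i][i]
    return x

def matvec(A, v):
    return [sum(a * x for a, x in zip(row, v)) for row in A]

def schur_solve(K, B, K0, r, r0):
    n0 = len(K0)
    Z = [row[:] for row in K0]
    rZ = r0[:]
    # Step 1: add each scenario's contribution to Z and r_Z (independent per scenario)
    for Ks, Bs, rs in zip(K, B, r):
        Kinv_r = solve(Ks, rs)
        for j in range(n0):
            Kinv_Bj = solve(Ks, [row[j] for row in Bs])   # one backsolve per column of B_s
            for i in range(n0):
                Z[i][j] -= sum(Bs[k][i] * Kinv_Bj[k] for k in range(len(Bs)))
        for i in range(n0):
            rZ[i] -= sum(Bs[k][i] * Kinv_r[k] for k in range(len(Bs)))
    # Step 2: solve the small dense Schur system (16a)
    dw0 = solve(Z, rZ)
    # Step 3: recover each scenario step from (16b) (independent per scenario)
    dw = [solve(Ks, [ri - bi for ri, bi in zip(rs, matvec(Bs, dw0))])
          for Ks, Bs, rs in zip(K, B, r)]
    return dw0, dw

def full_residual(K, B, K0, r, r0, dw0, dw):
    """Largest absolute residual of the arrowhead system (14)."""
    worst = 0.0
    bottom = [x - y for x, y in zip(matvec(K0, dw0), r0)]
    for Ks, Bs, rs, dws in zip(K, B, r, dw):
        top = [a + b - c for a, b, c in zip(matvec(Ks, dws), matvec(Bs, dw0), rs)]
        worst = max([worst] + [abs(t) for t in top])
        for i in range(len(dw0)):
            bottom[i] += sum(Bs[k][i] * dws[k] for k in range(len(dws)))
    return max([worst] + [abs(t) for t in bottom])

# Made-up data: two scenarios, two unknowns per scenario, one first stage unknown
K = [[[4.0, 1.0], [1.0, 3.0]], [[2.0, 0.0], [0.0, 5.0]]]
B = [[[1.0], [0.0]], [[0.0], [1.0]]]
K0 = [[1.0]]
r = [[1.0, 2.0], [3.0, 4.0]]
r0 = [5.0]
dw0, dw = schur_solve(K, B, K0, r, r0)
residual = full_residual(K, B, K0, r, r0, dw0, dw)
```

The loops in steps 1 and 3 touch one scenario at a time and share no data across iterations, which is what makes them straightforward to distribute over processes.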

Using the Schur complement method, both step 1 and step 3 can be easily parallelized. When $n_0$ is relatively small, the cost of factorizing the matrix $Z$ in step 2 is negligible, and the parallel efficiency of steps 1 and 3 can be close to one if the blocks are of similar size. In addition, the memory requirement per node of the parallel Schur complement method is much smaller than that of solving the system (14) in serial, since the data for each block can be stored on a separate node.

One advantage of using formulation (10) is that the Schur complement matrix is positive definite (P.D.) if the original KKT system and each $K_s$ block satisfy the inertia condition for descent [11,19]. This property enables the use of a PCG procedure to solve the Schur system [11], leading to the implicit Schur complement method. This approach avoids both the explicit formation and the factorization of the dense Schur complement matrix. Therefore, it is more efficient when $n_0$ is relatively large.
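The idea of the implicit method, applying $Z$ to a vector without ever forming it, can be sketched in Python as follows. This is a simplified stand-in and not the procedure of [11]: it uses plain conjugate gradients without the L-BFGS preconditioner, and the toy blocks $K_s$ are taken to be diagonal with negative entries (with $K_0 = 0$) purely so that $Z$ is positive definite and applying $K_s^{-1}$ is an elementwise division. In practice that division would be a backsolve with the sparse factors of $K_s$.

```python
def apply_Z(v, K_diag, B, K0):
    """Return Z v = K0 v - sum_s B_s^T K_s^{-1} (B_s v) without forming Z."""
    out = [sum(a * x for a, x in zip(row, v)) for row in K0]
    for ks, Bs in zip(K_diag, B):
        Bv = [sum(b * x for b, x in zip(row, v)) for row in Bs]    # B_s v
        y = [bv / k for bv, k in zip(Bv, ks)]                      # K_s^{-1} (B_s v)
        for i in range(len(v)):
            out[i] -= sum(Bs[k][i] * y[k] for k in range(len(y)))  # subtract B_s^T y
    return out

def conjugate_gradient(matvec, rhs, tol=1e-12, max_iter=100):
    """Unpreconditioned CG for a symmetric positive definite operator."""
    x = [0.0] * len(rhs)
    res = rhs[:]
    d = res[:]
    rr = sum(t * t for t in res)
    for _ in range(max_iter):
        if rr ** 0.5 < tol:
            break
        Zd = matvec(d)
        alpha = rr / sum(a * b for a, b in zip(d, Zd))
        x = [xi + alpha * di for xi, di in zip(x, d)]
        res = [ri - alpha * zi for ri, zi in zip(res, Zd)]
        rr_new = sum(t * t for t in res)
        d = [ri + (rr_new / rr) * di for ri, di in zip(res, d)]
        rr = rr_new
    return x

# Made-up data: two scenarios, two first stage unknowns
K_diag = [[-2.0, -4.0], [-1.0, -5.0]]        # diagonals of the toy K_s blocks
B = [[[1.0, 0.0], [1.0, 1.0]], [[2.0, 1.0], [0.0, 1.0]]]
K0 = [[0.0, 0.0], [0.0, 0.0]]
r_Z = [1.0, 2.0]

dw0 = conjugate_gradient(lambda v: apply_Z(v, K_diag, B, K0), r_Z)
check = apply_Z(dw0, K_diag, B, K0)          # should reproduce r_Z
```

Each CG iteration needs one application of `apply_Z`, which again decomposes into independent per-scenario work.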

Another advantage of using formulation (10) is that it facilitates the software development process. Equation (15) indicates that the KKT system of the whole problem can be constructed from the Jacobian, Hessian, and function evaluations of the subblocks. In other words, the whole model can be constructed by generating one model representation (e.g., an AMPL file) for each subblock and setting appropriate suffixes in each model file to identify the first stage variables. Therefore, the model evaluation can be performed in parallel. A special feature of formulation (10) is that the Hessian and Jacobian of the subblocks can be used directly. For example, the Jacobian evaluated for subproblem $s$, $s \in \{2, \dots, S\}$, is $\nabla_{x_s, x_{0,s}} c_s(x_s, x_{0,s})^{T} = [A_s^{T}, T_s^{T}]$. For formulation (10), $\nabla_{x_s, x_{0,s}} c_s(x_s, x_{0,s})^{T}$ can be used directly in Equation (15) without splitting it into $A_s^{T}$ and $T_s^{T}$, and the remaining matrices in Equation (15) can be obtained straightforwardly from each model representation.

4. Performance of Robust NMPC on Batch Crystallization

In this section, we illustrate the performance of an implementation of robust NMPC with a batch crystallization process.

4.1. Case Study: Multidimensional Unseeded Batch Crystallization Model

This section briefly describes a multidimensional unseeded batch crystallization model of the $\mathrm{KH_2PO_4}$-$\mathrm{H_2O}$ system. The details can be found in Mesbah et al. [20], Acevedo and Nagy [21], and Cao et al. [22]. If we only consider the length $L$ and the width $W$ of the crystals, using the population balance model (PBM) and the method of moments (MOM), the batch crystallization model can be expressed as the following system of differential-algebraic equations:

\begin{align}
\frac{d\mu_{00}}{dt} & = B, \tag{17a} \\
\frac{d\mu_{10}}{dt} & = G_1 \mu_{00}, \tag{17b} \\
\frac{d\mu_{01}}{dt} & = G_2 \mu_{00}, \tag{17c} \\
\frac{d\mu_{11}}{dt} & = G_1 \mu_{01} + G_2 \mu_{10}, \tag{17d} \\
\frac{d\mu_{20}}{dt} & = 2 G_1 \mu_{10}, \tag{17e} \\
\frac{dC}{dt} & = -2 \rho_c k_v G_1 (\mu_{11} - \mu_{20}) - \rho_c k_v G_2 \mu_{20}, \tag{17f} \\
G_1 & = k_{g_1} S^{g_1}, \tag{17g} \\
G_2 & = k_{g_2} S^{g_2}, \tag{17h} \\
B & = k_b S^{b}, \tag{17i} \\
S & = \frac{C - C_s(T)}{C_s(T)}, \tag{17j} \\
C_s(T) & = c T^{2} + d T + e, \tag{17k}
\end{align}

where $\mu_{ij}$ is the cross-moment, $C$ is the solute concentration, $B$ is the nucleation rate, $G_1$ and $G_2$ are the growth rates along $L$ and $W$, respectively, $S$ is the relative supersaturation, $C_s$ is the saturation concentration, $k_{g_1}$, $k_{g_2}$, $g_1$, $g_2$, $k_b$, and $b$ are kinetic parameters, $\rho_c$ is the density of the solution, $c$, $d$, and $e$ are polynomial coefficients describing the relationship between saturation concentration and temperature, and $k_v$ is a constant volumetric shape factor. The temperature $T$ is the control input in this system. Two important indices of the crystals are the mean length ($ML$) and the aspect ratio ($AR$), which can be determined with the following equations:

\begin{align}
ML & = \frac{\mu_{01}}{\mu_{00}}, \tag{18a} \\
AR & = \frac{\mu_{01}}{\mu_{10}}. \tag{18b}
\end{align}
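As a reading aid, Equations (17) and (18) can be transcribed into a short Python function. All numerical values in the example call are hypothetical placeholders chosen for easy arithmetic, not the nominal parameters of the case study, and the dictionary keys are names introduced here for illustration. The sketch assumes a supersaturated solution, since the power laws are real-valued only for a non-negative relative supersaturation.

```python
def crystallization_rhs(state, T, par):
    """Right-hand side of (17a)-(17f); state = (mu00, mu10, mu01, mu11, mu20, C).

    Assumes S >= 0 so that the power laws in (17g)-(17i) are real-valued.
    """
    mu00, mu10, mu01, mu11, mu20, C = state
    C_sat = par["c"] * T ** 2 + par["d"] * T + par["e"]    # (17k)
    S = (C - C_sat) / C_sat                                  # (17j)
    G1 = par["kg1"] * S ** par["g1"]                         # (17g)
    G2 = par["kg2"] * S ** par["g2"]                         # (17h)
    B = par["kb"] * S ** par["b"]                            # (17i)
    rk = par["rho_c"] * par["kv"]
    return (
        B,                                                   # (17a)
        G1 * mu00,                                           # (17b)
        G2 * mu00,                                           # (17c)
        G1 * mu01 + G2 * mu10,                               # (17d)
        2.0 * G1 * mu10,                                     # (17e)
        -2.0 * rk * G1 * (mu11 - mu20) - rk * G2 * mu20,     # (17f)
    )

def mean_length(state):
    return state[2] / state[0]      # (18a): mu01 / mu00

def aspect_ratio(state):
    return state[2] / state[1]      # (18b): mu01 / mu10

# Placeholder values for illustration only (not taken from the case study)
par = {"kb": 1.0, "b": 2.0, "kg1": 0.1, "g1": 1.5, "kg2": 0.5, "g2": 1.5,
       "rho_c": 2.0, "kv": 0.5, "c": 0.0, "d": 0.0, "e": 0.2}
state = (2.0, 4.0, 6.0, 1.0, 1.0, 0.3)
rhs = crystallization_rhs(state, 30.0, par)
```

A function of this shape is what would be evaluated at every collocation point and for every scenario in the discretized problem.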

The nominal kinetic parameters are available in Acevedo and Nagy [21], Cao et al. [22], Gunawan et al. [23] and Majumder and Nagy [24].

4.2. Numerical Results

The kinetic parameters in this model are subject to large uncertainties. For the purpose of this case study, we assume that $k_b$, $b$, $k_{g_1}$, $g_1$, $k_{g_2}$ and $g_2$ follow uniform distributions on the intervals $[3.494 \cdot 10^{6},\ 5.494 \cdot 10^{6}]$ #/cm$^3$ min, $[2.03,\ 2.05]$, $[0.06726,\ 0.07926]$ cm/min, $[1.47,\ 1.49]$, $[0.5445,\ 0.6645]$ cm/min, and $[1.73,\ 1.75]$, respectively. We also assume that measurements of $ML$, $AR$ and $C$ are available and that the corresponding measurement noise follows truncated normal distributions on the intervals $[-12,\ 12]$ μm, $[-0.2,\ 0.2]$, and $[-0.008,\ 0.008]$ g/cm$^3$, respectively. The means of the underlying normal distributions are all zero, and the standard deviations are 6 μm, 0.1, and 0.004 g/cm$^3$, respectively.
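A scenario generator consistent with these assumptions might look like the following Python sketch. The dictionary keys, the seed, and the use of rejection sampling for the truncated normal noise are choices made here for illustration; the text does not say how the samples were drawn.

```python
import random

# Uniform intervals for the uncertain kinetic parameters, as listed above
PARAM_BOUNDS = {
    "kb": (3.494e6, 5.494e6),
    "b": (2.03, 2.05),
    "kg1": (0.06726, 0.07926),
    "g1": (1.47, 1.49),
    "kg2": (0.5445, 0.6645),
    "g2": (1.73, 1.75),
}

# (standard deviation, truncation bound) for the zero-mean measurement noise
NOISE = {"ML": (6.0, 12.0), "AR": (0.1, 0.2), "C": (0.004, 0.008)}

def sample_scenarios(n, rng):
    """Draw n independent parameter scenarios from the uniform distributions."""
    return [{name: rng.uniform(lo, hi) for name, (lo, hi) in PARAM_BOUNDS.items()}
            for _ in range(n)]

def truncated_normal(std, bound, rng):
    """Zero-mean normal truncated to [-bound, bound], drawn by rejection sampling."""
    while True:
        x = rng.gauss(0.0, std)
        if abs(x) <= bound:
            return x

def sample_noise(rng):
    """One noise realization for the three measured quantities."""
    return {name: truncated_normal(std, bound, rng)
            for name, (std, bound) in NOISE.items()}

rng = random.Random(2016)                   # arbitrary fixed seed
scenarios = sample_scenarios(100, rng)
noise = sample_noise(rng)
```

Because the truncation bounds sit at two standard deviations, the rejection loop accepts roughly 95% of the draws and terminates quickly.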

The setpoint is $AR_{\mathrm{set}} = 2.9$ and $ML_{\mathrm{set}} = 200$ μm, which is selected using the Pareto front reported in Cao et al. [22]. The following cost function is used as the objective function in the NMPC and to evaluate the performance of a specific test simulation:

\begin{equation}
\mathit{cost} = 100 \left( AR(t_f) - AR_{\mathrm{set}} \right)^{2} + \left( ML(t_f) - ML_{\mathrm{set}} \right)^{2}. \tag{19a}
\end{equation}
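For reference, the cost function is a one-line computation. The Python sketch below evaluates it for a made-up final product with an aspect ratio of 3.0 and a mean length of 190 μm; these two values are illustrative and do not come from the reported results. Reading the weighted norm in (2) as a quadratic form (an assumption about its definition), the cost corresponds to a diagonal weight matrix with entries 100 and 1 for the outputs $AR$ and $ML$.

```python
AR_SET = 2.9       # aspect ratio setpoint
ML_SET = 200.0     # mean length setpoint in micrometres

def cost(ar_final, ml_final, ar_set=AR_SET, ml_set=ML_SET):
    """Weighted squared deviation of the final product quality from the setpoint."""
    return 100.0 * (ar_final - ar_set) ** 2 + (ml_final - ml_set) ** 2

# Hypothetical final product: AR off by 0.1, ML off by 10 micrometres
example = cost(3.0, 190.0)   # approximately 1 + 100 = 101
```

With these weights, a deviation of 0.1 in the aspect ratio is penalized as much as a deviation of 1 μm in the mean length.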

We assume that the batch process lasts for 90 min and that there are 18 sampling and control steps. The total number of first stage variables is small enough that the explicit Schur complement method is still efficient. For practical considerations, we assume the batch process is also subject to the following constraints, so that the temperature profile stays within the operating range and a certain yield is guaranteed:

\begin{align}
& T_{min} \leq T(t) \leq T_{max}, \tag{20a} \\
& -R_{max} \leq \frac{dT(t)}{dt} \leq 0, \tag{20b} \\
& C(t_f) - C_{max} \leq 0. \tag{20c}
\end{align}

For the numerical results shown later, $T_{max}$ is 45 °C, $T_{min}$ is 5 °C, $R_{max}$ is 4 °C/min, and $C_{max}$ is 0.237 g/cm$^3$.
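The constraints (20a)-(20c) with these limits can be checked for a candidate input profile as in the following Python sketch. It assumes, purely for illustration, that the temperature is piecewise linear between 19 node values, one at each boundary of the 18 control steps; the actual input parameterization is not specified in this section, and the final concentrations passed in the examples are made-up numbers.

```python
T_MIN, T_MAX = 5.0, 45.0        # temperature limits in °C
R_MAX = 4.0                     # maximum cooling rate in °C/min
C_MAX = 0.237                   # maximum final concentration in g/cm^3
BATCH_TIME, N_STEPS = 90.0, 18  # minutes, number of control steps

def check_profile(temps, c_final, tol=1e-9):
    """Check (20a)-(20c) for a piecewise-linear profile given by its node temperatures."""
    dt = BATCH_TIME / N_STEPS
    bounds_ok = all(T_MIN - tol <= T <= T_MAX + tol for T in temps)      # (20a)
    rates = [(b - a) / dt for a, b in zip(temps, temps[1:])]
    rate_ok = all(-R_MAX - tol <= rate <= tol for rate in rates)         # (20b)
    yield_ok = c_final - C_MAX <= tol                                    # (20c)
    return bounds_ok and rate_ok and yield_ok

# Linear cooling from 45 °C to 5 °C over the batch (hypothetical profile)
linear = [45.0 - 40.0 * i / N_STEPS for i in range(N_STEPS + 1)]
ok_linear = check_profile(linear, c_final=0.230)

# The same profile with one reheating step violates the monotonic cooling bound
reheated = linear[:]
reheated[5] = reheated[4] + 1.0
ok_reheated = check_profile(reheated, c_final=0.230)
```

In the optimization itself these conditions are imposed as inequality constraints for every scenario rather than checked after the fact.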

Table 1 shows the robust performance of different control strategies when exact information is available. For each control strategy, we test the robust performance over 100 scenarios generated from the uncertain parameter distributions, and we will refer to these as test scenarios. For the case of ideal NMPC, we assume that the state of the system is perfectly known, and the controller performance is estimated using exact information from each test scenario. Both open-loop control and nominal NMPC perform the optimization using nominal values for the parameters. For the two robust formulations, exact min–max and exact min–expected, we need to select scenarios for the multi-scenario optimization. We refer to these as optimization scenarios. In Table 1, we show results for the exact case where the optimization scenarios are the same as the test scenarios. Later in this section (and in Table 2), we will consider the more realistic case in which the optimization scenarios are not the same as the test scenarios. While we assume that ideal NMPC knows the exact values of the state variables, both nominal NMPC and the two robust formulations use MHE to estimate the unknown state variables.

Although the ideal NMPC knows the true values of the uncertain parameters, it cannot achieve the setpoint for several test scenarios. The worst-case performance for the ideal NMPC with exact parameters is 499, which is a lower bound on the worst-case performance of all other control strategies. The deviation of the product quality from the setpoint using the open-loop control strategy is very large. Because of the feedback mechanism, the performance of nominal NMPC improves significantly compared with open-loop control. By considering uncertainty in the design of the NMPC, the performance of exact min–max NMPC is much better than that of nominal NMPC in terms of the average, standard deviation and worst-case $\mathit{cost}$ evaluated over the 100 test scenarios. However, the robust NMPC sacrifices performance when the uncertain parameters are all at their nominal values. Compared with the reduction in the worst-case $\mathit{cost}$, the nominal $\mathit{cost}$ is still small. It is interesting to observe that the performance of exact min–expected NMPC is much worse than that of exact min–max NMPC and nominal NMPC, even in terms of the average $\mathit{cost}$. The reason is that, although the control minimizes the expected cost at each sampling instance, the optimization formulation does not explicitly consider feedback. One advantage of robust NMPC methods minimizing worst-case or expected performance is that they can fulfill all input and state constraints for all optimization scenarios, which is not guaranteed with nominal NMPC. For this application, although nominal NMPC violates constraints for several test scenarios, the violation is small.

Figure 1 shows the optimal temperature profiles obtained using nominal NMPC and the robust NMPC methods. It is clear that the input profiles from the three methods are quite different.

The results of robust NMPC shown in Table 1 are ideal in that the test scenarios are the same as the optimization scenarios. We now show results for the more realistic case when they are not the same. To this end, we generate a new set of optimization scenarios from the uncertain parameter distributions. Table 2 shows the robust performance of robust NMPC using different numbers of optimization scenarios. In theory, increasing the number of optimization scenarios makes the uncertainty distribution considered in the optimization a better approximation of the true uncertainty distribution. Since the number of test scenarios is limited, many other factors (e.g., similarity of optimization scenarios and test scenarios) also influence the robust performance. For min–expected NMPC, increasing the number of optimization scenarios from 50 to 100 changes the performance only slightly. In contrast, increasing the number of optimization scenarios from 50 to 100 significantly improves the performance of min–max NMPC. This shows that min–max NMPC is more sensitive to the number of optimization scenarios. The performance of both robust formulations using a new set of 100 optimization scenarios is close to the performance obtained when the test scenarios themselves are used as optimization scenarios, as shown in Table 1, indicating that 100 optimization scenarios appear to be sufficient for this case study.

The size of the problem solved in Table 2 with 100 optimization scenarios is very large: it has 434,219 variables and 434,200 constraints. Table 3 shows the solution time for the optimization problem at step $t = 0$. The total time is composed of both the time to construct the model and the time to solve the NLP. The Schur complement implementation not only solves the problem in parallel, but also builds and evaluates the model in parallel. It achieves a 14-fold speedup on a computer with 25 cores compared with its own serial implementation. Our solver using a full-factorization method similar to Ipopt takes 6.7 min to solve the problem, while the parallel Schur complement solver requires only half a minute, allowing for real-time application of this control strategy.
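The speedup figures can be reproduced directly from the timings in Table 3. The short sketch below recomputes the speedup of the Schur complement implementation relative to its own serial run. The parallel efficiency (speedup divided by the number of processors) is an additional quantity computed here for illustration and is not reported in the paper.

```python
# Times in seconds for the Schur complement method, copied from Table 3.
build_time = {1: 64.2, 2: 34.8, 5: 14.9, 10: 8.6, 20: 6.3, 25: 4.7}
solve_time = {1: 426.9, 2: 216.3, 5: 90.8, 10: 51.0, 20: 35.8, 25: 30.0}


def speedup(times, n_proc):
    """Serial time divided by the time on n_proc processors."""
    return times[1] / times[n_proc]


def efficiency(times, n_proc):
    """Fraction of ideal linear speedup achieved (illustrative metric)."""
    return speedup(times, n_proc) / n_proc


for n_proc in sorted(build_time):
    total = build_time[n_proc] + solve_time[n_proc]
    print(
        f"{n_proc:2d} processors: "
        f"build speedup {speedup(build_time, n_proc):5.1f}, "
        f"solve speedup {speedup(solve_time, n_proc):5.1f}, "
        f"total time {total:6.1f} s"
    )
```

The NLP solve time on 25 processors is 30.0 s, which is the half minute quoted above. The efficiency falls as processors are added, so the speedup is sublinear over this range.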

Uncertain parameters can be estimated using MHE. However, in the presence of significant noise and large uncertainties, point estimates might not be accurate. Instead, we can use Bayesian inference to update the posterior distributions of the uncertain parameters and, at each sampling instance, generate optimization scenarios according to the posterior distribution rather than the prior distribution. Specifically, the posterior distribution is:

\begin{matrix} f (p | y^{m}) = \frac{f (y^{m} | p) f (p)}{f (y^{m})} \propto f (y^{m} | p) f (p), \end{matrix}

(21)

where $f(p)$ is the prior probability density before $y^{m}$ is observed, $f(y^{m} | p)$ is the probability density of observing $y^{m}$ for a given $p$, and $f(y^{m})$ is the probability density of observing $y^{m}$, which can be viewed as a constant. For a given $p$, we can obtain a corresponding $y(p)$ from simulation. Therefore, $f(y^{m} | p)$ is equivalent to $f(y^{m} | y(p))$ and can be computed according to the measurement error distribution. With this information, Markov chain Monte Carlo (MCMC) can be used to generate a set of scenarios based on the posterior distribution.
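As an illustration of how scenarios could be drawn from the posterior, the sketch below uses a random-walk Metropolis sampler, one common MCMC variant. The specific MCMC algorithm used in this work is not stated here, so this choice is an assumption. The scalar parameter, the linear stand-in for the simulation $y(p)$, the Gaussian prior and the Gaussian measurement error are likewise hypothetical. Only the unnormalized posterior $f(y^{m} | p) f(p)$ is needed, because the constant $f(y^{m})$ cancels in the acceptance ratio.

```python
import math
import random


def simulate(p):
    """Hypothetical stand-in for the process simulation y(p)."""
    return 2.0 * p


def log_prior(p, mean=1.0, std=0.5):
    """Log of an assumed Gaussian prior f(p), up to an additive constant."""
    return -0.5 * ((p - mean) / std) ** 2


def log_likelihood(y_meas, p, noise_std=0.1):
    """Log of f(y^m | y(p)) for assumed Gaussian measurement error."""
    return -0.5 * ((y_meas - simulate(p)) / noise_std) ** 2


def sample_posterior(y_meas, n_samples, step=0.1, burn_in=1000, seed=0):
    """Random-walk Metropolis sampling of the posterior f(p | y^m)."""
    rng = random.Random(seed)
    p = 1.0  # start the chain at the prior mean
    log_post = log_prior(p) + log_likelihood(y_meas, p)
    samples = []
    for i in range(burn_in + n_samples):
        candidate = p + rng.gauss(0.0, step)
        cand_log_post = log_prior(candidate) + log_likelihood(y_meas, candidate)
        # Accept with probability min(1, posterior ratio).
        accept_prob = math.exp(min(0.0, cand_log_post - log_post))
        if rng.random() < accept_prob:
            p, log_post = candidate, cand_log_post
        if i >= burn_in:
            samples.append(p)
    return samples


samples = sample_posterior(y_meas=2.4, n_samples=5000)
scenarios = samples[::200]  # thin the chain to 25 optimization scenarios
posterior_mean = sum(samples) / len(samples)
print(len(scenarios), posterior_mean)
```

For this linear-Gaussian toy problem the posterior mean is known in closed form (484/404, about 1.198), which gives a simple check on the sampler. Thinning the chain reduces the correlation between the scenarios passed to the optimizer.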

Table 4 illustrates that the performance of robust NMPC with Bayesian inference is better than that of robust NMPC with scenarios generated from the prior distribution alone. This is because the posterior distribution takes the measurements into consideration and is therefore more accurate than the prior distribution. Specifically, the performance of robust min–max NMPC with 25 optimization scenarios from Bayesian inference is close to the ideal performance. Increasing the number of optimization scenarios from 25 to 50 slightly deteriorates the performance because the optimization then considers some scenarios that have very low probability.

5. Conclusions

In conclusion, this paper solves the optimization problems arising from robust NMPC using the parallel algorithm developed to solve stochastic programs on distributed and shared memory machines. The optimization problem resulting from robust NMPC at each sampling instance can be viewed as a large-scale stochastic program. Using an interior-point method to solve this problem results in a KKT system of arrowhead form, and these linear systems can be decomposed using the Schur complement method, which can be implemented in parallel.
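The decomposition summarized above can be sketched on a small dense example. The code below is not the authors' solver. It is a minimal serial illustration, assuming a block-arrowhead system with one diagonal block per scenario and a small block of coupling variables. The names `K_blocks`, `B_blocks` and `K0` are chosen here for illustration, and the test blocks are symmetric positive definite for simplicity, whereas real KKT blocks are symmetric indefinite.

```python
import numpy as np


def schur_solve(K_blocks, B_blocks, K0, r_blocks, r0):
    """Solve a block-arrowhead linear system via the Schur complement.

    The system is
        K_s x_s + B_s x_0 = r_s          for each scenario s,
        sum_s B_s^T x_s + K0 x_0 = r0    for the coupling variables x_0.
    """
    S = K0.copy()
    rhs = r0.copy()
    # Each scenario contributes independently, so this loop is the part
    # that could be distributed over processors.
    for K, B, r in zip(K_blocks, B_blocks, r_blocks):
        S -= B.T @ np.linalg.solve(K, B)
        rhs -= B.T @ np.linalg.solve(K, r)
    x0 = np.linalg.solve(S, rhs)  # small dense Schur complement system
    # Back-substitution for the scenario variables is again independent.
    x_blocks = [np.linalg.solve(K, r - B @ x0)
                for K, B, r in zip(K_blocks, B_blocks, r_blocks)]
    return x_blocks, x0


rng = np.random.default_rng(0)
n_scen, n, m = 4, 3, 2  # scenarios, variables per scenario, coupling variables


def random_spd(size):
    """Random symmetric positive definite test matrix."""
    M = rng.standard_normal((size, size))
    return M @ M.T + size * np.eye(size)


K_blocks = [random_spd(n) for _ in range(n_scen)]
B_blocks = [0.1 * rng.standard_normal((n, m)) for _ in range(n_scen)]
K0 = random_spd(m)
r_blocks = [rng.standard_normal(n) for _ in range(n_scen)]
r0 = rng.standard_normal(m)

x_blocks, x0 = schur_solve(K_blocks, B_blocks, K0, r_blocks, r0)

# Assemble the full arrowhead matrix and solve it densely for comparison.
N = n_scen * n + m
A = np.zeros((N, N))
for s, (K, B) in enumerate(zip(K_blocks, B_blocks)):
    rows = slice(s * n, (s + 1) * n)
    A[rows, rows] = K
    A[rows, N - m:] = B
    A[N - m:, rows] = B.T
A[N - m:, N - m:] = K0
x_dense = np.linalg.solve(A, np.concatenate(r_blocks + [r0]))
x_schur = np.concatenate(x_blocks + [x0])
print(np.allclose(x_schur, x_dense))
```

Only the small Schur complement system couples the scenarios. The per-scenario solves before and after it are independent, which is what makes a parallel implementation possible.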

Using a case study of a multidimensional unseeded batch crystallization process, we show that robust min–max NMPC provides better robust performance compared with open-loop optimal control, nominal NMPC, and robust NMPC minimizing the expected performance at each sampling instance. We further improve the performance by generating optimization scenarios using Bayesian inference. The efficient parallel framework can dramatically reduce both the time to build the model and the time to solve the optimization problem, and thus allows for real-time application.

Acknowledgments

The authors gratefully acknowledge the financial support provided to Yankai Cao and partial financial support provided to Carl Laird by the National Science Foundation (CAREER Grant CBET# 0955205).

Author Contributions

Yankai Cao and Carl Laird conceived the research; Jia Kang provided the NLP solver; Zoltan Nagy provided the case study; Yankai Cao wrote the paper; Carl Laird revised the final document.

Conflicts of Interest

The authors declare no conflict of interest.

References

Jiang, Z.P.; Wang, Y. Input-To-State stability for discrete-time nonlinear systems. Automatica 2001, 37, 857–869.
Magni, L.; Scattolini, R. Robustness and robust design of MPC for nonlinear discrete-time systems. In Assessment and Future Directions of Nonlinear Model Predictive Control; Springer: Berlin/Heidelberg, Germany, 2007; pp. 239–254.
Scokaert, P.; Mayne, D. Min-Max feedback model predictive control for constrained linear systems. IEEE Trans. Autom. Control 1998, 43, 1136–1142.
Huang, R.; Patwardhan, S.C.; Biegler, L.T. Multi-Scenario-Based robust nonlinear model predictive control with first principle models. Comput. Aided Chem. Eng. 2009, 27, 1293–1298.
Nagy, Z.K.; Braatz, R.D. Robust nonlinear model predictive control of batch processes. AIChE J. 2003, 49, 1776–1786.
Magni, L.; De Nicolao, G.; Scattolini, R.; Allgöwer, F. Robust model predictive control for nonlinear discrete-time systems. Int. J. Robust Nonlinear Control 2003, 13, 229–246.
Lucia, S.; Finkler, T.; Engell, S. Multi-Stage nonlinear model predictive control applied to a semi-batch polymerization reactor under uncertainty. J. Process Control 2013, 23, 1306–1319.
Telen, D.; Houska, B.; Logist, F.; Van Derlinden, E.; Diehl, M.; Van Impe, J. Optimal experiment design under process noise using Riccati differential equations. J. Process Control 2013, 23, 613–629.
Streif, S.; Kögel, M.; Bäthge, T.; Findeisen, R. Robust Nonlinear Model Predictive Control with Constraint Satisfaction: A relaxation-based Approach. In Proceedings of the 19th IFAC World Congress, Cape Town, South Africa, 24–29 August 2014; pp. 11073–11079.
Zavala, V.M.; Laird, C.D.; Biegler, L.T. Interior-Point decomposition approaches for parallel solution of large-scale nonlinear parameter estimation problems. Chem. Eng. Sci. 2008, 63, 4834–4845.
Kang, J.; Cao, Y.; Word, D.P.; Laird, C. An interior-point method for efficient solution of block-structured NLP problems using an implicit Schur-complement decomposition. Comput. Chem. Eng. 2014, 71, 563–573.
Lubin, M.; Petra, C.; Anitescu, M. The parallel solution of dense saddle-point linear systems arising in stochastic programming. Optim. Methods Softw. 2012, 27, 845–864.
Cao, Y.; Laird, C.; Zavala, V. Clustering-Based Preconditioning for Stochastic Programs. Comput. Optim. Appl. 2015, 64, 379–406.
Gay, D.M.; Kernighan, B. AMPL: A Modeling Language for Mathematical Programming, 2nd ed.; Cengage Learning: Boston, MA, USA, 2002; Volume 2.
Watson, J.P.; Woodruff, D.L.; Hart, W.E. PySP: Modeling and solving stochastic programs in Python. Math. Program. Comput. 2012, 4, 109–149.
Huchette, J.; Lubin, M.; Petra, C. Parallel algebraic modeling for stochastic optimization. In Proceedings of the 1st First Workshop for High Performance Technical Computing in Dynamic Languages, New Orleans, Louisiana, 16–21 November 2014; pp. 29–35.
Shapiro, A.; Dentcheva, D.; Ruszczynski, A. Lectures on Stochastic Programming: Modeling and Theory; SIAM-Society for Industrial and Applied Mathematics: Philadelphia, PA, USA, 2014; Volume 16.
Cuthrell, J.E.; Biegler, L.T. On the optimization of differential-algebraic process systems. AIChE J. 1987, 33, 1257–1270.
Forsgren, A.; Gill, P.E.; Wright, M.H. Interior methods for nonlinear optimization. SIAM Rev. 2002, 44, 525–597.
Mesbah, A.; Nagy, Z.; Huesman, A.; Kramer, H.; Van den Hof, P. Real-time control of industrial batch crystallization processes using a population balance modeling framework. IEEE Trans. Control Syst. Technol. 2012, 20, 1188–1201.
Acevedo, D.; Nagy, Z.K. Systematic classification of unseeded batch crystallization systems for achievable shape and size analysis. J. Cryst. Growth 2014, 394, 97–105.
Cao, Y.; Acevedo, D.; Nagy, Z.K.; Laird, C.D.; School of Chemical Engineering, Purdue University, West Lafayette, IN, USA. Unpublished work. 2015.
Gunawan, R.; Ma, D.L.; Fujiwara, M.; Braatz, R.D. Identification of kinetic parameters in multidimensional crystallization processes. Int. J. Modern Phys. B 2002, 16, 367–374.
Majumder, A.; Nagy, Z.K. Prediction and control of crystal shape distribution in the presence of crystal growth modifiers. Chem. Eng. Sci. 2013, 101, 593–602.

Figure 1. Optimal temperature profile for nominal NMPC (nonlinear model predictive control) and robust NMPC.

**Table 1.** The robust performance (value of $cost$) of different control strategies evaluated using 100 test scenarios and exact information.
Control Strategies	Nominal	Average	Standard Deviation	Worst-Case
Ideal	$2 \times 10^{- 4}$	30	66	499
Open-loop	$2 \times 10^{- 4}$	167	223	1339
Nominal NMPC	0.2	93	159	955
Exact Min–max NMPC	32	78	113	677
Exact Min–expected NMPC	12	99	169	1076

**Table 2.** The robust performance of the robust NMPC using different numbers of scenarios evaluated using 100 test scenarios.
Type	S	Nominal	Average	Standard Deviation	Worst-Case
Min–max	25	15	99	170	1062
	50	13	102	178	1129
	75	13	95	156	946
	100	25	80	120	767
Min–expected	25	21	89	138	902
	50	11	100	172	1085
	75	12	99	169	1064
	100	12	99	169	1074

**Table 3.** The solution time of solving a robust optimization problem with 100 optimization scenarios.
Task	# Processors	Full Factorization Time (s)	Schur Complement Time (s)	Schur Complement Speedup
Building Model	1	44.3	64.2	-
	2	-	34.8	1.8
	5	-	14.9	4.3
	10	-	8.6	7.5
	20	-	6.3	10.2
	25	-	4.7	13.7
Solving NLP	1	406	426.9	-
	2	-	216.3	2.0
	5	-	90.8	4.7
	10	-	51.0	8.4
	20	-	35.8	11.9
	25	-	30.0	14.2

**Table 4.** Robust performance of min–max NMPC with different numbers of optimization scenarios from Bayesian inference evaluated using 100 simulations.
Type	S	Nominal	Average	Standard Deviation	Worst-Case
Min–max	12	18	74	120	744
	25	13	61	96	584
	50	11	71	114	655
Min–expected	12	17	81	141	943
	25	12	84	145	949
	50	11	84	145	934

© 2016 by the authors; licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC-BY) license (http://creativecommons.org/licenses/by/4.0/).
