EPO Dosage Optimization for Anemia Management: Stochastic Control under Uncertainty Using Conditional Value at Risk

McAllister, Jayson; Li, Zukui; Liu, Jinfeng; Simonsmeier, Ulrich

doi:10.3390/pr6050060


by Jayson McAllister ¹, Zukui Li ¹,*, Jinfeng Liu ¹ and Ulrich Simonsmeier ²

¹ Department of Chemical and Materials Engineering, University of Alberta, Edmonton, AB T6G 1H9, Canada

² Cybernius Medical Ltd., St. Albert, AB T8N 2T7, Canada

* Author to whom correspondence should be addressed.

Processes 2018, 6(5), 60; https://doi.org/10.3390/pr6050060

Submission received: 8 April 2018 / Revised: 3 May 2018 / Accepted: 15 May 2018 / Published: 21 May 2018

(This article belongs to the Special Issue Modeling & Control of Disease States)


Abstract

Due to insufficient endogenous production of erythropoietin, chronic kidney disease patients with anemia are often treated by the administration of recombinant human erythropoietin (EPO). The target of the treatment is to keep the patient’s hemoglobin level within a normal range. While conventional methods used by clinicians for guiding EPO dosing normally rely on a set of rules based on past experience or retrospective studies, model predictive control (MPC) based dosage optimization has recently received attention. The objective of this paper is to incorporate the hemoglobin response model uncertainty into the dosage optimization decision making. Two methods utilizing Conditional Value at Risk (CVaR) are proposed for hemoglobin control in chronic kidney disease under model uncertainty. The first method includes a set-point tracking controller with the addition of CVaR constraints. The second method involves the use of CVaR directly in the cost function of the optimal control problem. The methods are compared to set-point tracking MPC and zone-tracking MPC through computer simulations. Simulation results demonstrate the benefits of utilizing CVaR in stochastic predictive control for EPO dosage optimization.

Keywords:

anemia management; hemoglobin level control; model predictive control; Conditional Value at Risk

1. Introduction

One of the major side effects of chronic kidney disease (CKD) is the reduced ability to produce endogenous erythropoietin (EPO), which is a hormone that regulates the production of red blood cells in the body. When the natural production of EPO drops significantly, these patients suffer from a condition called anemia, which is characterized by a reduced mass of red blood cells and hemoglobin level. Recombinant human erythropoietin has become the standard for treating anemia in chronic kidney disease [1]. Recently, optimization of the EPO dosage has been considered to achieve reduced deviation of patient hemoglobin levels from a normal range (zone), while also reducing drug use and expense [2].

Zone model predictive controllers have become a popular area of research in the biomedical field [3]. They have been used successfully in clinical trials in the control of blood glucose in Type 1 Diabetes Mellitus [4,5]. They have gained significant popularity over set-point tracking model predictive control (MPC) for biomedical applications due to their ability to reject sensor noise, and maintain a higher degree of stability in the presence of large plant–model mismatch [6]. They also seem more appropriate for these applications as many of them do not have a target state, but rather a target zone.

The EPO dosing algorithms that have been developed and tested previously are open-loop devices. The patient will typically have their blood labs completed once every 2–4 weeks, and the EPO dose will be optimized for the time period in between sampling times. The EPO medications are approved manually by a clinician and administered during regular dialysis. It has been recognized by the clinical community that, while low Hb level leads to anemia, too high Hb levels can increase the risk of mortality for the patient [7]. Hence, effective methods are needed to determine the appropriate dose of EPO to maintain the target Hb level.

To develop an automatic decision support system and achieve effective anemia management, some of the first published papers on the use of advanced process control in CKD utilized set-point tracking MPC combined with neural network hemoglobin response modeling and reported promising results in clinical trials [8,9,10]. The MPC algorithm in this case used an L1-norm of the state deviation from the target. In another case [11], a pharmacokinetic and pharmacodynamic model was used to describe the patient’s hemoglobin response, and controllers were then designed based on Quantitative Feedback Theory. In a small pilot study, these researchers also showed promising results. Modeling of hemoglobin response to EPO administration was also done under uncertainty via a recent method called semi-blind Robust Identification [12]. The work presented in this paper differs significantly from these approaches. One of the more challenging aspects of hemoglobin control is the time-varying nature of the patient’s system dynamics. Patients can become EPO resistant [13], or their health can improve drastically if endogenous EPO production increases [14]. To address the time-varying nature of the hemoglobin response, constrained recursive system identification is used. There also exists a great deal of uncertainty in the model parameters, and the system is often subjected to acute disturbances, such as infection [15] and blood loss [16].

The aim of this work is to develop an advanced control system to calculate optimal EPO dosages for anemia patients, while considering process uncertainty in the control method. The overall block diagram of the control system is presented in Figure 1. The method uses an autoregressive with exogenous inputs (ARX) model with a model predictive controller. Process uncertainty is represented in the form of different scenarios, which are then used within a model predictive control formulation through the Conditional Value at Risk (CVaR) technique. CVaR was introduced by Rockafellar and Uryasev and is widely used in the finance industry [17]. CVaR is a popular tool used in many risk management applications. For example, it has been shown to work well in heating, ventilation and air conditioning control systems [18]; portfolio/asset optimization [19]; and quantifying flood damage [20]. However, it has been studied in very few applications related to model predictive control. The existing methods for model predictive control under uncertainty mostly include robust MPC [21] and stochastic MPC based methods [22]. The two methods address the worst-case performance and the expected performance, respectively. CVaR is a method that addresses both of these criteria. In addition, CVaR provides an effective convex approximation of a probabilistic constraint used in stochastic MPC. Thus, in this paper, the applications of CVaR for handling both risk-averse performance and the probabilistic constraint are studied.

The rest of the paper is organized as follows. Section 2 presents the hemoglobin response modeling through constrained optimization. Deterministic set-point and zone model predictive control algorithms are introduced in Section 3, for comparison purposes to the CVaR methods introduced later. An overview of Conditional Value at Risk in general is introduced in Section 4, which is a technique for addressing uncertainty. A set-point tracking model predictive controller using CVaR constraints on the upper and lower zone boundaries is then introduced in Section 5 and later compared to the set-point tracking MPC. A second controller, a zone-tracking model predictive controller using CVaR directly in the cost function, is introduced in Section 6 and later compared to the zone-tracking model predictive controller. In Section 7, the controllers are tested in computer simulations. The experimental designs are outlined for a simple ARX model based simulator as well as for a physiological model based simulator, using both Gaussian and uniform uncertainty distributions. Section 8 concludes the paper.

2. Hemoglobin Response Modeling

The system model used in the following derivations has an ARX model structure. This kind of model is one type that can be estimated through classic System Identification methods. The goal of System Identification is to map the response of a system’s output to that of its input. The internal dynamics of the system are not known or estimated. This type of modeling is also referred to as black box modeling. In this work, the model parameters are obtained through constrained optimization. The model uses a weekly sampling time (Δ t = 1 week), the last Hb measurement y_{k}, and several past EPO doses u_{k} (eight-week history of dosages in this work), to predict the future Hb measurement. The order of the b (z) polynomial is 8 and was determined empirically [23]. The order of the a (z) polynomial in the model is set to 1. Using a higher order for the a (z) polynomial may lead to inaccuracies in the model estimation and predictions because of the large amount of measurement error, mixed with the infrequent measurements typically used in the medical field. It is possible that, with more accurate measurements, a larger number of a_{k} parameters could yield better modeling results, but this work does not address this aspect.

Iron is necessary for the creation of red blood cells, but the effect of iron levels in the patient is ignored in this model. It is assumed that, if the treating clinician were to maintain stable and adequate iron levels within the patient, the response of the hemoglobin should be mostly related to the administration of EPO, and not related to fluctuating or low values of iron within the patient. If a patient were to have low iron levels, they would likely become EPO resistant, which could affect the quality of the current model should the change (from adequate to low levels) occur quickly. Future work should try to include the iron dynamics within the model, as there certainly are some dynamic interactions between these two systems.

The ARX model structure that maps the EPO administration events to the hemoglobin concentration is

y_{k + 1} = a_{1} y_{k} + b_{1} u_{k} + \dots + b_{8} u_{k - 7}

(1)

where the input u represents the EPO doses divided by 5000 international units (IU), and y represents the hemoglobin concentration in g/dL. Clinical data for an example patient are shown in Figure 2. The hemoglobin measurements (top figure) are shown as the actual measurements recorded in approximately two-week intervals. The EPO doses (bottom figure) are presented as one-week dose totals. Typically, a dialysis patient with late stage renal disease will receive EPO medications 1–3 times per week, depending on dose size. With traditional dosing protocols, the oscillatory behavior observed in the hemoglobin is quite common [24].
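As a minimal sketch of how Equation (1) turns a dose history into a one-week-ahead prediction, the following Python snippet evaluates the ARX model for a single step. The function name `arx_predict` and all numerical values are illustrative assumptions and are not patient-derived parameters.

```python
def arx_predict(y_k, u_hist, a1, b):
    """One-step ARX prediction following Equation (1).

    y_k    : last hemoglobin measurement (g/dL)
    u_hist : eight most recent weekly doses [u_k, u_{k-1}, ..., u_{k-7}],
             each already divided by 5000 IU
    a1, b  : model parameters, with b = [b_1, ..., b_8]
    """
    if len(u_hist) != 8 or len(b) != 8:
        raise ValueError("expected eight doses and eight b parameters")
    return a1 * y_k + sum(bi * ui for bi, ui in zip(b, u_hist))


# Hypothetical parameter values chosen only for illustration.
a1 = 0.9
b = [0.0, 0.08, 0.06, 0.04, 0.03, 0.02, 0.01, 0.004]
u_hist = [2.0] * 8  # 10,000 IU per week for eight weeks, scaled by 5000 IU
y_next = arx_predict(10.0, u_hist, a1, b)
```

With these assumed values, a constant dose of 10,000 IU per week and a current hemoglobin of 10 g/dL give a predicted value of about 9.49 g/dL for the following week.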

To estimate a patient’s hemoglobin response model, the cost function in Equation (2) is used. It is the weighted least squares of the error between model predictions and actual hemoglobin values. The parameter λ is used as a weighting value (0 < λ < 1), to put a higher priority on newer measurements. The output y represents the hemoglobin measurements.

min_{a_{1}, b_{1}, \dots, b_{8}} \sum_{i = 1}^{t_{f}} λ^{(t_{f} - i)} {({\hat{y}}_{k + i} - y_{k + i})}^{2}

(2)

where {\hat{y}}_{k + i} and y_{k + i} represent the predicted and actual hemoglobin value, respectively. Equation (3) represents the predicted hemoglobin using the historical data and the estimated model parameters (b_{1}, \dots, b_{8}, a_{1}) for a time horizon of 1 to t_{f}.

\begin{bmatrix} {\hat{y}}_{k + 1} \\ {\hat{y}}_{k + 2} \\ \vdots \\ {\hat{y}}_{k + t_{f}} \end{bmatrix} = \begin{bmatrix} u_{k} & \dots & u_{k - 7} & y_{k} \\ u_{k + 1} & \dots & u_{k - 6} & y_{k + 1} \\ \vdots & & \vdots & \vdots \\ u_{k + t_{f} - 1} & \dots & u_{k + t_{f} - 8} & y_{k + t_{f} - 1} \end{bmatrix} \begin{bmatrix} b_{1} \\ b_{2} \\ \vdots \\ b_{8} \\ a_{1} \end{bmatrix}

(3)

which can be compactly written as

\hat{Y} = X θ

(4)

where θ is a vector of the optimized model parameters, Y is the measured hemoglobin, \hat{Y} is the predicted hemoglobin, and X is a matrix of the EPO dosage and measured hemoglobin concentration. Then, the objective function in Equation (2) can be written as

min θ^{T} H θ + 2 θ^{T} f + c

where H = X^{T} R X, f = - X^{T} R Y, and c = Y^{T} R Y. The measurement weight in Equation (2) is represented by the weighting matrix R, which is a diagonal matrix of the weighting values λ for each measurement.
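For completeness, the step from Equation (2) to the quadratic form can be written out as follows. This expansion assumes that R is the diagonal matrix with entries λ^{(t_{f} - i)}, i = 1, \dots, t_{f}, as implied by Equation (2); the symbol J is introduced here only as a label for the cost.

```latex
\begin{aligned}
J(\theta) &= (X\theta - Y)^{T} R\,(X\theta - Y) \\
          &= \theta^{T} X^{T} R X\,\theta - 2\,\theta^{T} X^{T} R\,Y + Y^{T} R\,Y \\
          &= \theta^{T} H\,\theta + 2\,\theta^{T} f + c,
\qquad R = \operatorname{diag}\left(\lambda^{t_f - 1}, \dots, \lambda^{1}, \lambda^{0}\right).
\end{aligned}
```

The middle line uses the symmetry of R. Since c does not depend on θ, it has no influence on the minimizer.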

Recursive modeling has been shown to work well in cases where the model parameters are time varying, such as in Type 1 Diabetes Mellitus [25]. Recursive modeling is the process of estimating a new patient model after each updated measurement. In this work, the patient’s model parameters are relearnt after each measurement, based on a moving window of estimation data. In the absence of validation data, constraints are used in the modeling formulation to ensure model stability, as well as to enforce that the model parameters follow patterns that are well described by many patients [26]. The complete constrained ARX (C-ARX) model parameter estimation problem can be represented by Equation (5). Equation (5) assumes that the delay of the system is approximately two weeks. The peak time b_{k} parameter is shown here with a delay of two weeks (Equation (5f)), and represents the largest contribution to the system of all the b_{k} parameters. The peak time b_{k} parameter can be shifted to the third or fourth week by reformulating constraints (Equation (5f,g)). Individual patient hemoglobin response times can vary. Some patients may begin to observe a substantial increase of hemoglobin in as little as two weeks, whereas others take much longer before a significant effect on the hemoglobin can be observed. The most appropriate way of dealing with the system’s unknown delay is to solve the same problem with the peak time b_{k} parameter at the second, third and fourth delay, and then choose the solution with the lowest cost function value. Equation (5b) is used as a minimum and maximum range for the model estimates, as almost all the measurements should be within this range. Measurements that exist outside of this range may not be realistic, or may coincide with some abnormal event. Equation (5c) is used to enforce model stability. Equation (5d) is used to enforce a minimum response delay of two weeks. Equation (5e) enforces that the model parameters are always non-negative, because the response of the system to the inputs is always positive. Equations (5f)–(5h) are defined through experience, and are used to enforce a particular parameter shape learned in [26]. Finally, Equation (5i) is used to ensure the model relies somewhat on the inputs to the system, and was determined through trial and error.

min θ^{T} H θ + 2 θ^{T} f + c

(5a)

s.t. 7.0 \leq {\hat{y}}_{k + i} \leq 15.0, \quad i = 1, \dots, t_{f}

(5b)

0.7 \leq a_{1} \leq 0.99

(5c)

b_{1} = 0

(5d)

b_{k} \geq 0, \quad k = 2, \dots, 8

(5e)

b_{2} \geq 0.05

(5f)

b_{k} < 0.8 b_{k - 1}, \quad k = 3, \dots, 8

(5g)

b_{8} < 0.005

(5h)

\sum_{k = 1}^{8} b_{k} \geq 0.1

(5i)

The above model is a constrained quadratic optimization problem and it is solved using the quadprog solver in MATLAB.
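The parameter constraints in Equation (5) can be checked for any candidate parameter set with a few lines of code. The sketch below is only a feasibility check and not the quadprog-based estimation itself; the function name and the example parameter values are hypothetical.

```python
def satisfies_carx_constraints(a1, b, tol=1e-12):
    """Check the parameter constraints (5c)-(5i) for a candidate C-ARX model.

    a1 : autoregressive parameter
    b  : list [b_1, ..., b_8]
    The output-range constraint (5b) depends on the data and is not checked.
    Returns (all_satisfied, per_constraint_results).
    """
    if len(b) != 8:
        raise ValueError("expected eight b parameters")
    checks = {
        "5c": 0.7 <= a1 <= 0.99,
        "5d": abs(b[0]) <= tol,
        "5e": all(bk >= 0.0 for bk in b[1:]),
        "5f": b[1] >= 0.05,
        # b_k < 0.8 * b_{k-1} for k = 3, ..., 8 (list indices 2, ..., 7)
        "5g": all(b[k] < 0.8 * b[k - 1] for k in range(2, 8)),
        "5h": b[7] < 0.005,
        "5i": sum(b) >= 0.1,
    }
    return all(checks.values()), checks


# Hypothetical parameter set with a peak at the two-week delay.
ok, checks = satisfies_carx_constraints(
    0.9, [0.0, 0.08, 0.06, 0.04, 0.03, 0.02, 0.01, 0.004]
)
```

Returning the per-constraint results alongside the overall flag makes it easy to see which shape requirement a rejected parameter set fails.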

3. Deterministic Control Formulation: Setpoint MPC and Zone-MPC

The stochastic control model described in later sections is compared to the traditional method of implementing model predictive control (MPC), including a set-point tracking MPC and a zone-tracking MPC. First, the set-point tracking model is presented in Equation (6). The target hemoglobin level is represented by y_{sp}. Q is the state tuning parameter. N is the prediction horizon.

min_{u} \frac{1}{2} \sum_{i = 1}^{N} Q ∥{\hat{y}}_{k + i} - y_{sp}∥ + \frac{1}{2} \sum_{i = 0}^{N - 2} ∥Δ u_{k + i}∥

(6a)

s.t. u_{L} \leq u_{k + i} \leq u_{H}, \quad i = 0, \dots, N - 2

(6b)

Δ u_{L} \leq Δ u_{k + i} \leq Δ u_{H}, \quad i = 0, \dots, N - 2

(6c)

Figure 3a is an illustration of set-point tracking MPC, where the output is driven to the set-point. Control input chatter is often observed in set-point tracking MPC.

The zone-tracking MPC formulation is presented in Equation (7). Zone control is realized through the constraints set on δ_{k + i} in Equation (7b). Set-point tracking MPC is a special case of zone control, where the zone limits y_{H} and y_{L} are equal.

min_{u, δ} \frac{1}{2} \sum_{i = 1}^{N} Q ∥{\hat{y}}_{k + i} - δ_{k + i}∥ + \frac{1}{2} \sum_{i = 0}^{N - 2} ∥Δ u_{k + i}∥

(7a)

s.t. y_{L} \leq δ_{k + i} \leq y_{H}, \quad i = 1, \dots, N

(7b)

u_{L} \leq u_{k + i} \leq u_{H}, \quad i = 0, \dots, N - 2

(7c)

Δ u_{L} \leq Δ u_{k + i} \leq Δ u_{H}, \quad i = 0, \dots, N - 2

(7d)

In zone-MPC, it is common to see the output oscillate slightly between the upper and lower limits while the system stabilizes to a steady state within the control zone limits. Figure 3b is an illustration of zone-MPC.

It is worth pointing out that the cost function can be modeled using either absolute value or a quadratic function for both controllers presented above.
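To make the difference between the two tracking terms concrete, the sketch below evaluates the output-tracking part of Equations (6a) and (7a) for a fixed predicted trajectory, using the absolute-value form. It relies on the observation that, for fixed predictions, the best choice of δ_{k + i} within the bounds of Equation (7b) is the prediction clipped to the zone. The input-move term is left out, and the zone limits, set-point and trajectory are assumed values for illustration.

```python
def setpoint_tracking_term(y_pred, y_sp, q=1.0):
    """Output-tracking part of Equation (6a) with an absolute-value penalty."""
    return 0.5 * q * sum(abs(y - y_sp) for y in y_pred)


def zone_tracking_term(y_pred, y_low, y_high, q=1.0):
    """Output-tracking part of Equation (7a) with an absolute-value penalty.

    For fixed predictions, the minimizing delta is the prediction clipped
    to [y_low, y_high], so only excursions outside the zone are penalized.
    """
    deltas = [min(max(y, y_low), y_high) for y in y_pred]
    return 0.5 * q * sum(abs(y - d) for y, d in zip(y_pred, deltas))


# Assumed example: predicted Hb in g/dL, zone 10 to 12 g/dL, set-point 11 g/dL.
y_pred = [9.0, 10.5, 11.5, 12.5]
sp_cost = setpoint_tracking_term(y_pred, y_sp=11.0)
zone_cost = zone_tracking_term(y_pred, y_low=10.0, y_high=12.0)
```

In this example the set-point term is 2.25 while the zone term is 0.75, because the two predictions inside the zone incur no penalty. Setting y_low = y_high = y_sp recovers the set-point term, in line with the special case noted for Equation (7).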

4. Conditional Value at Risk

The uncertainty considered in this work is in the form of process uncertainty on the output and can be represented by the variable w_{k} in Equation (8).

{\tilde{y}}_{k + 1} = a_{1} y_{k} + b_{1} u_{k} + \dots + b_{8} u_{k - 7} + w_{k}

(8)

To address the uncertainty in the hemoglobin response to EPO dosage, this work studies two Conditional Value at Risk (CVaR) techniques for control under uncertainty. In general, Value at Risk (VaR) at confidence level β (VaR_{β}) is defined as the maximum loss value that is assigned to some desired probability, β. That is, a loss less than or equal to VaR_{β} will occur β \times 100% of the time. What VaR_{β} fails to account for is the losses that occur (1 - β) \times 100% of the time with a loss larger than VaR_{β}. CVaR at confidence level β (CVaR_{β}) addresses this tail loss: it is defined as the average of the losses exceeding VaR_{β}, and minimizing it reduces the tail loss. For instance, a slightly higher VaR_{β} may be tolerated if the average value of the β-tail distribution is lower. This would ensure that when a loss greater than VaR_{β} does occur, it will likely be smaller. A diagram is included in Figure 4 to illustrate the quantity that CVaR seeks to minimize.
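The two quantities can be illustrated on a finite sample of losses. The sketch below assumes equally likely samples and that (1 - β) M is a whole number, so the tail can be averaged without splitting a sample; the function name and the sample values are hypothetical.

```python
def empirical_var_cvar(losses, beta):
    """Return (VaR, CVaR) at level beta for equally likely loss samples.

    VaR is the largest loss among the lowest beta share of the samples;
    CVaR is the mean of the remaining (worst) samples.
    """
    ordered = sorted(losses)
    m = len(ordered)
    n_below = round(beta * m)  # number of samples at or below VaR
    if not 0 < n_below < m:
        raise ValueError("beta must leave samples both below and in the tail")
    var = ordered[n_below - 1]
    tail = ordered[n_below:]
    cvar = sum(tail) / len(tail)
    return var, cvar


losses = [3.0, 9.0, 1.0, 7.0, 10.0, 2.0, 5.0, 8.0, 4.0, 6.0]
var_80, cvar_80 = empirical_var_cvar(losses, beta=0.8)
```

Here 80% of the sampled losses are at or below the VaR of 8.0, while the CVaR of 9.5 averages the two losses beyond it.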

Another major advantage of using CVaR constraints over other types of chance constraints is that the uncertainty can follow any distribution. With a known distribution, different scenarios are generated for each sampling instant by sampling the random distribution. In this fashion, the controller optimization problem can take into account multiple scenarios, which help to approximate the different possibilities of measurement and process uncertainties, to ultimately provide a more robust solution.
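Scenario generation can be sketched as follows: each scenario draws one disturbance per week and propagates Equation (8) forward, feeding each simulated value back as the next y. The Gaussian disturbance, its standard deviation, the parameter values and the function name are assumptions made for illustration; any distribution that can be sampled could be substituted.

```python
import random


def simulate_scenarios(y_k, doses, a1, b, n_steps, m, sigma, seed=0):
    """Generate m hemoglobin trajectories of length n_steps from Equation (8).

    doses : oldest-first list holding the 7 doses before week k followed by
            the n_steps planned doses u_k, ..., u_{k+n_steps-1}
    sigma : standard deviation of the assumed Gaussian disturbance w
    """
    if len(doses) != 7 + n_steps or len(b) != 8:
        raise ValueError("need 7 past plus n_steps planned doses and 8 b values")
    rng = random.Random(seed)
    scenarios = []
    for _ in range(m):
        y, path = y_k, []
        for i in range(n_steps):
            # window[0] = u_{k+i}, ..., window[7] = u_{k+i-7}
            window = [doses[7 + i - n] for n in range(8)]
            w = rng.gauss(0.0, sigma)
            y = a1 * y + sum(bn * un for bn, un in zip(b, window)) + w
            path.append(y)
        scenarios.append(path)
    return scenarios


# Hypothetical model parameters and a constant dose plan.
b = [0.0, 0.08, 0.06, 0.04, 0.03, 0.02, 0.01, 0.004]
scenarios = simulate_scenarios(
    y_k=10.0, doses=[2.0] * 15, a1=0.9, b=b, n_steps=8, m=200, sigma=0.3
)
```

Fixing the seed makes the scenario set reproducible, which is convenient when comparing controllers on the same disturbance realizations.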

Next, the CVaR technique is incorporated into the MPC formulation in two different ways. The first method handles chance constraints through CVaR approximation, whereas the second method directly optimizes CVaR in the cost function. Both controllers were tested in computer simulations against traditional zone-MPC and set-point tracking MPC to show the benefits that the two methods may provide.

5. Stochastic Control Using CVaR Constraints

The uncertainty is located within the output, {\tilde{y}}_{k}, in the form of a random variable w_{k}. w_{k} is assumed to follow a certain distribution. The controller is designed to take into account a process uncertainty from a known distribution. The optimization problem uses two chance constraints (Equations (9b) and (9c)) on the output of the system, one on the lower bound and the other on the upper bound. Because the CVaR constraint is a conservative approximation of the chance constraint, the constraints are only used for k = 3, as shown below. If more constraints are used, the problem will become infeasible. k = 2 is not used because b_{1} = 0, which means that, if the hemoglobin were to fall outside of the zone, the controller would only have a single input to try and move the hemoglobin into the zone to avoid infeasibility. The result would be a very aggressive control move in these scenarios that should be avoided. It should also be noted that the cost function is computed with Equation (1), where {\hat{y}}_{k} represents the predicted hemoglobin without uncertainty. Chance constraints are used in solving optimization problems under uncertainty. There exists uncertainty in the output prediction {\tilde{y}}_{k}, which is computed through Equation (8). The meaning of the chance constraint is that, utilizing some knowledge of the uncertainty in {\tilde{y}}_{k}, the constraint will be violated no more than a fraction ϵ of the time. ϵ is the probability of violation. The chance constrained set-point MPC formulation is proposed as follows:

min Q \sum_{k = 1}^{N} | {\hat{y}}_{k} - y_{sp} | + \sum_{k = 0}^{N - 2} | Δ u_{k} |

(9a)

s.t. Pr \{ {\tilde{y}}_{k} > y_{H} \} \leq ϵ, \quad k = 3

(9b)

Pr \{ {\tilde{y}}_{k} < y_{L} \} \leq ϵ, \quad k = 3

(9c)

u_{L} \leq u_{k + i} \leq u_{H}, \quad i = 0, \dots, N - 2

(9d)

Δ u_{L} \leq Δ u_{k + i} \leq Δ u_{H}, \quad i = 0, \dots, N - 2

(9e)

Due to the conservatism of the CVaR constraints, only a single point along the prediction horizon will utilize the constraints. Note that the Δ u_{k} term is summed from 0 to N - 2 because b_{1} is always estimated as zero in the ARX model. The following derivations use a prediction horizon, N, of eight weeks, and the ARX model introduced previously. Equation (10), which is the upper bound constraint, is used to demonstrate the derivation [17].

Pr \{ {\tilde{y}}_{k} > y_{H} \} \leq ϵ

(10)

In the derivation below, the indicator function 1_{(0, \infty)} (u) is used. This function holds a value of 1 if u > 0 and a value of 0 if u \leq 0. For any positive parameter α, we have 1_{(0, \infty)} (u) = 1_{(0, \infty)} (\frac{1}{α} u). Defining an upper bounding function ϕ (u) for the indicator function, the following inequality can be written.

1_{(0, \infty)} (\frac{1}{α} u) \leq ϕ (\frac{1}{α} u)

(11)

Replacing u with {\tilde{y}}_{k} - y_{H}, the following inequality can be written.

Pr \{ {\tilde{y}}_{k} - y_{H} > 0 \} = E [1_{(0, \infty)} ({\tilde{y}}_{k} - y_{H})] \leq E [ϕ (\frac{1}{α} ({\tilde{y}}_{k} - y_{H}))] \leq ϵ

(12)

If {\tilde{y}}_{k} - y_{H} > 0, the constraint is violated and the indicator function holds a value of 1, and vice versa. ϕ (u) is the upper bounding function for the indicator function, so its expectation is a conservative estimate of the probability of violation, which is required to be at most ϵ. From this, any solution that satisfies the rightmost inequality will also satisfy the chance constraint. The focus remains on the right side of this inequality. (u)^{+} is defined as the maximum operator, which returns the larger of 0 and the input u. Applying the upper bounding function ϕ (u) = {(u + 1)}^{+}, a conservative approximation of the chance constraint is defined as

E [{(\frac{1}{α} ({\tilde{y}}_{k} - y_{H}) + 1)}^{+}] \leq ϵ

(13)

Multiplying by \frac{α}{ϵ} gives

\frac{1}{ϵ} E [{(({\tilde{y}}_{k} - y_{H}) + α)}^{+}] \leq α

(14)

and moving α to the left-hand side gives

- α + \frac{1}{ϵ} E [{(({\tilde{y}}_{k} - y_{H}) + α)}^{+}] \leq 0

(15)

To reduce conservatism, it is necessary to find the α that minimizes the left-hand side of the inequality. That is, it is desired to find the smallest upper bound. This is a CVaR constraint, but it is difficult to calculate its value in its present form.

min_{α} \{ - α + \frac{1}{ϵ} E [{(({\tilde{y}}_{k} - y_{H}) + α)}^{+}] \} \leq 0

(16)

The expectation operator is removed with the introduction of sampling to approximate the CVaR constraint. M is the total number of scenarios. For instance, if there were a single random variable per scenario, and the prediction horizon was 1, it would be necessary to sample the random variable’s distribution M times. π_{j} is the probability that a single scenario will occur, and is equal to \frac{1}{M}. Note that \tilde{y} changes to y_{k, j} because the output is now evaluated for scenario j with the sampled uncertainty added in.

- α + \frac{1}{ϵ} \sum_{j = 1}^{M} π_{j} {((y_{k, j} - y_{H}) + α)}^{+} \leq 0

(17)

The expression inside of the max operator can be replaced by two separate constraints. Equation (18) replaces the max operator term with the variable v_{j, k}, and constraints are set on v_{j, k} to satisfy the original max operator’s functionality. The combination of all three constraints in Equation (18) can be used to provide a conservative estimate of the original chance constraint in Equation (10) through the use of sampling [18]. These constraints are solved using many different scenarios of the random variable distribution, which places the controller in a class of controllers called Scenario-Based MPC.

- α_{k} + \frac{1}{ϵ} \sum_{j = 1}^{M} π_{j} v_{j, k} \leq 0, \quad k = 3

(18a)

v_{j, k} \geq (y_{k, j} - y_{H}) + α_{k}, \quad j = 1, \dots, M, \; k = 3

(18b)

v_{j, k} \geq 0, \quad j = 1, \dots, M, \; k = 3

(18c)

Similarly, for the lower bound constraints, we have the following approximation

- γ_{k} + \frac{1}{ϵ} \sum_{j = 1}^{M} π_{j} ω_{j, k} \leq 0, \quad k = 3

(19a)

ω_{j, k} \geq (y_{L} - y_{k, j}) + γ_{k}, \quad j = 1, \dots, M, \; k = 3

(19b)

ω_{j, k} \geq 0, \quad j = 1, \dots, M, \; k = 3

(19c)
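For a given set of scenario outputs, the sample-based CVaR constraint can be evaluated directly, which also shows how conservative it is relative to the original chance constraint. The sketch below assumes equally likely scenarios (π_{j} = 1/M) and minimizes the left-hand side of Equation (17) over α by checking its breakpoints; it does not separately enforce α > 0. The function names and scenario values are hypothetical.

```python
def cvar_constraint_lhs(y_scen, y_high, eps):
    """Minimum over alpha of the left-hand side of Equation (17).

    Scenarios are assumed equally likely (pi_j = 1/M). The function
    g(alpha) = -alpha + (1/eps) * mean(max(y_j - y_high + alpha, 0))
    is convex and piecewise linear for 0 < eps < 1, so its minimum is
    reached at one of the breakpoints alpha = y_high - y_j.
    """
    if not 0.0 < eps < 1.0:
        raise ValueError("eps must lie strictly between 0 and 1")
    m = len(y_scen)

    def g(alpha):
        excess = sum(max(y - y_high + alpha, 0.0) for y in y_scen)
        return -alpha + excess / (eps * m)

    return min(g(y_high - y) for y in y_scen)


def upper_cvar_constraint_holds(y_scen, y_high, eps):
    """True if the sampled CVaR constraint on the upper limit is satisfied."""
    return cvar_constraint_lhs(y_scen, y_high, eps) <= 0.0


# Hypothetical week-3 hemoglobin values (g/dL) from five scenarios.
y_week3 = [10.0, 10.5, 11.0, 11.5, 12.5]
lhs_12 = cvar_constraint_lhs(y_week3, y_high=12.0, eps=0.2)
fraction_above_12 = sum(y > 12.0 for y in y_week3) / len(y_week3)
```

In this example only one of five scenarios (20%) exceeds an upper limit of 12 g/dL, so a chance constraint with ϵ = 0.2 would be met on the samples. The CVaR constraint is nevertheless violated, because the average of the worst 20% of outcomes lies above the limit.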

Finally, notice that the cost function in Equation (9) contains absolute terms. This equation can be manipulated to remove the absolute sign and the resulting linear program with all the constraints is outlined in Equation (20).

min Q \sum_{k = 1}^{N} h_{k} + \sum_{k = 0}^{N - 2} g_{k}

(20a)

s.t. h_{k} \geq {\hat{y}}_{k} - y_{sp}, \quad k = 1, \dots, N

(20b)

h_{k} \geq - ({\hat{y}}_{k} - y_{sp}), \quad k = 1, \dots, N

(20c)

g_{k} \geq Δ u_{k}, \quad k = 0, \dots, N - 2

(20d)

g_{k} \geq - Δ u_{k}, \quad k = 0, \dots, N - 2

(20e)

where {\hat{y}}_{k} is the Hb prediction from the ARX model without the process uncertainty term.

Summarizing the above derivations, the complete deterministic approximation to the chance constrained set-point MPC formulation is given as follows:

min Q \sum_{k = 1}^{8} h_{k} + \sum_{k = 0}^{6} g_{k}

(21a)

s.t. - α_{k} + \frac{1}{ϵ} \sum_{j = 1}^{M} π_{j} v_{j, k} \leq 0, \quad k = 3

(21b)

v_{j, k} \geq (y_{k, j} - y_{H}) + α_{k}, \quad j = 1, \dots, M, \; k = 3

(21c)

v_{j, k} \geq 0, \quad j = 1, \dots, M, \; k = 3

(21d)

- γ_{k} + \frac{1}{ϵ} \sum_{j = 1}^{M} π_{j} ω_{j, k} \leq 0, \quad k = 3

(21e)

ω_{j, k} \geq (y_{L} - y_{k, j}) + γ_{k}, \quad j = 1, \dots, M, \; k = 3

(21f)

ω_{j, k} \geq 0, \quad j = 1, \dots, M, \; k = 3

(21g)

h_{k} \geq {\hat{y}}_{k} - y_{sp}, \quad k = 1, \dots, N

(21h)

h_{k} \geq - ({\hat{y}}_{k} - y_{sp}), \quad k = 1, \dots, N

(21i)

g_{k} \geq Δ u_{k}, \quad k = 0, \dots, N - 2

(21j)

g_{k} \geq - Δ u_{k}, \quad k = 0, \dots, N - 2

(21k)

u_{L} \leq u_{k} \leq u_{H}, \quad k = 0, \dots, N - 2

(21l)

Δ u_{L} \leq Δ u_{k} \leq Δ u_{H}, \quad k = 0, \dots, N - 2

(21m)

There are 4 M + 76 constraints and the optimization vector consists of 2 M + 34 variables. The controller optimization problem involves the use of M scenarios, where M typically holds a value larger than 100 and is selected by the user. A single scenario is simply the trajectory of the hemoglobin response while including uncertainty values drawn randomly from the known uncertainty distribution. It should quickly become apparent that scenario-based constraints such as these lead to a very large number of constraints, with 4 M + 2 constraints related to the CVaR constraints alone. With this in mind, the controller design is formulated as a linear program, and MATLAB’s linprog function was used along with its dual-simplex optimization algorithm. This controller is referred to as CVaR_{1} below.

6. Stochastic Control Using a CVaR Cost Function

The second method discussed in this work optimizes the control inputs using conditional value at risk directly within the cost function. In contrast to the first method, the solution to this optimization problem is always feasible. The controller derivation follows the work of [17,19] closely, with the exception that the algorithm is modified to use a zone-MPC formulation. The optimal solution can be thought of as the solution that minimizes the average loss that occurs in the β-tail distribution of the probability density function (pdf) of the loss function. The controller formulation in Equation (22) uses the distribution of the random variable, w_{k}, to generate several different plausible scenarios of process uncertainty.

min_{u, δ} CVaR_{β} f (u, δ, w)

(22a)

s.t. y_{L} \leq δ_{k + i} \leq y_{H}, \quad i = 1, \dots, N

(22b)

u_{L} \leq u_{k + i} \leq u_{H}, \quad i = 0, \dots, N - 2

(22c)

Δ u_{L} \leq Δ u_{k + i} \leq Δ u_{H}, \quad i = 0, \dots, N - 2

(22d)

where the cost function is given as

f (u, δ, w) = Q \sum_{i = 1}^{N} | {\tilde{y}}_{k + i} - δ_{k + i} | + \sum_{i = 0}^{N - 2} | Δ u_{k + i} |

(23)

The loss function is associated with decision variables u_{k}, auxiliary variables δ_{k} and random variables w_{k}. Note that Q is the tuning parameter in this equation and zone control is facilitated through the use of the variable δ_{k}. The conditional expectation of loss in the bad tail leads to the idea of Conditional Value at Risk. CVaR_{β} is defined as the conditional expectation of the loss above ℓ_{β}:

CVaR_{β} f = ϕ_{β} = \frac{1}{1 - β} \int_{f \geq ℓ_{β}} f (u, δ, w) p (w) d w

(24)

Rockafellar and Uryasev [17] showed that CVaR_{β} can be determined as follows

ϕ_{β} = min_{ℓ} ℓ + \frac{1}{1 - β} E {[f (u, δ, w) - ℓ]}^{+} = min_{ℓ} ℓ + \frac{1}{1 - β} \int_{w} {[f (u, δ, w) - ℓ]}^{+} p (w) d w

(25)

{[f]}^{+} denotes the max of [f, 0]. ℓ is an auxiliary variable (it can be viewed as VaR_{β}) to be optimized. The integral can be approximated by sampling the random variable, w_{1}, \dots, w_{M}, with probabilities π_{1}, \dots, π_{M}:

ϕ_{β} = min_{ℓ} ℓ + \frac{1}{1 - β} \sum_{j = 1}^{M} π_{j} {[f (u, δ, w_{j}) - ℓ]}^{+}

(26)
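The sampled form in Equation (26) can be checked numerically for a fixed set of scenario losses. The following is a minimal sketch, not the authors' implementation: the function name and the data are illustrative assumptions. It relies on the fact that the objective is convex and piecewise linear in $ℓ$, with kinks only at the sampled losses, so the minimum can be found by evaluating the objective at those points.

```python
def sample_cvar(losses, probs, beta):
    """Sample-based CVaR: min over l of  l + (1 - beta)^-1 * sum_j pi_j * max(f_j - l, 0)."""

    def objective(level):
        excess = sum(p * max(f - level, 0.0) for f, p in zip(losses, probs))
        return level + excess / (1.0 - beta)

    # Convex, piecewise linear in the level, with kinks only at the sampled
    # losses, so evaluating the objective at the samples is sufficient.
    return min(objective(f) for f in losses)


# Illustrative data (not from the paper): 20 equally likely losses 1, 2, ..., 20.
losses = [float(v) for v in range(1, 21)]
probs = [1.0 / len(losses)] * len(losses)
cvar_90 = sample_cvar(losses, probs, beta=0.9)
```

For these samples the 0.9-level value is 19.5, the mean of the two largest losses, which matches the interpretation of CVaR as the average loss in the tail.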

The absolute-value terms in Equation (23) can be linearized by introducing the variables $h_{k}$ and $g_{k}$. Note that $y_{k+1+i,j}$ represents the expected output for scenario $j$.

\min_{u_{k}, δ_{k}, ℓ, v_{j}} \quad Q \sum_{i=1}^{N} h_{k+i,j} + \sum_{i=0}^{N-2} g_{k+i}

(27a)

\text{s.t.} \quad h_{k+i,j} \geq y_{k+1+i,j} - δ_{k+i,j}, \quad i = 1, \dots, N, \; j = 1, \dots, M

(27b)

h_{k+i,j} \geq -(y_{k+1+i,j} - δ_{k+i,j}), \quad i = 1, \dots, N, \; j = 1, \dots, M

(27c)

g_{k+i} \geq Δu_{k+i}, \quad i = 0, \dots, N-2

(27d)

g_{k+i} \geq -Δu_{k+i}, \quad i = 0, \dots, N-2

(27e)

Then, by combining Equations (26) and (27), the CVaR cost can be written as

\min_{u_{k}, δ_{k}, ℓ} \; ℓ + (1-β)^{-1} \sum_{j=1}^{M} π_{j} \left[ Q \sum_{i=1}^{N} h_{k+i,j} + \sum_{i=0}^{N-2} g_{k+i} - ℓ \right]^{+}

(28)

This is equivalent to

\min_{u_{k}, δ_{k}, ℓ, v_{j}} \quad ℓ + (1-β)^{-1} \sum_{j=1}^{M} π_{j} v_{j}

(29a)

\text{s.t.} \quad v_{j} \geq Q \sum_{i=1}^{N} h_{k+i,j} + \sum_{i=0}^{N-2} g_{k+i} - ℓ, \quad j = 1, \dots, M

(29b)

v_{j} \geq 0, \quad j = 1, \dots, M

(29c)

The variable $v_{j}$ is introduced to allow the calculation of the max operator. The final form of the stochastic MPC problem is the linear program outlined in Equation (30), which minimizes $\mathrm{CVaR}_{β}$, where $k$, $j$ and $i$ represent the sampling instant, the scenario number and the sampling instant along the prediction horizon, respectively.

\min_{u_{k}, δ_{k}, ℓ, v_{j}} \quad ℓ + (1-β)^{-1} \sum_{j=1}^{M} π_{j} v_{j}

(30a)

\text{s.t.} \quad v_{j} \geq Q \sum_{i=1}^{N} h_{k+i,j} + \sum_{i=0}^{N-2} g_{k+i} - ℓ, \quad j = 1, \dots, M

(30b)

v_{j} \geq 0, \quad j = 1, \dots, M

(30c)

h_{k+i,j} \geq y_{k+1+i,j} - δ_{k+i,j}, \quad i = 1, \dots, N, \; j = 1, \dots, M

(30d)

h_{k+i,j} \geq -(y_{k+1+i,j} - δ_{k+i,j}), \quad i = 1, \dots, N, \; j = 1, \dots, M

(30e)

g_{k+i} \geq Δu_{k+i}, \quad i = 0, \dots, N-2

(30f)

g_{k+i} \geq -Δu_{k+i}, \quad i = 0, \dots, N-2

(30g)

y_{L} \leq δ_{k+i,j} \leq y_{H}, \quad i = 1, \dots, N, \; j = 1, \dots, M

(30h)

u_{L} \leq u_{k+i} \leq u_{H}, \quad i = 0, \dots, N-2

(30i)

Δu_{L} \leq Δu_{k+i} \leq Δu_{H}, \quad i = 0, \dots, N-2

(30j)

Equation (30) results in $4MN + 2M + 6N$ constraints and an optimization vector of length $2MN + 2N + M + 1$. This controller is referred to as CVaR$_{2}$ below.
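To give a feel for the size of this linear program, the short sketch below evaluates the two counts stated above. The scenario count of 500 is the value of M listed in the controller settings tables; the horizon of N = 8 is an assumption, taken from the statement in the simulation section that the second CVaR controller draws eight weeks of values per scenario. The function name is hypothetical.

```python
def cvar2_lp_size(num_scenarios, horizon):
    """Constraint and variable counts of the CVaR-cost LP, using the stated formulas."""
    m, n = num_scenarios, horizon
    constraints = 4 * m * n + 2 * m + 6 * n
    variables = 2 * m * n + 2 * n + m + 1
    return constraints, variables


# M = 500 scenarios (controller settings); N = 8 is an assumed horizon.
constraints, variables = cvar2_lp_size(num_scenarios=500, horizon=8)
```

Under these assumptions the problem has 17,048 constraints and 8517 variables, and both counts grow with the product of the number of scenarios and the horizon.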

7. Computer Simulation Results

This section outlines the experimental design and the simulation results comparing the CVaR methods to set-point tracking MPC and zone-MPC. Gaussian and uniform uncertainty distributions are considered for both an ARX model based patient simulator and a pharmacokinetics and pharmacodynamics (PK/PD) model based simulator.

Both simulator cases use the same Gaussian and uniform distributions for the random variable $w_{k}$. The histograms are outlined in Figure 5. Figure 5a follows a Gaussian distribution, $\mathcal{N}(0, 0.2^{2})$, while Figure 5b is a uniform distribution with $-0.4 \leq w_{k} \leq 0.4$. In all cases, the controllers that use the CVaR technique were given the exact distribution of $w_{k}$ to calculate the scenarios for each controller iteration. For these simulations, each scenario draws one random value for each future sampling time that is used in the controller design. For instance, in the first CVaR controller, the uncertainty is only needed up to $k+3$ for each sampling instant, meaning that only three values are drawn from the uncertainty distribution for each scenario. The second CVaR controller draws the full eight weeks of values for each scenario.
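The scenario generation described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' MATLAB code: the function name is hypothetical, the scenarios are assumed to be independent draws, and each scenario is assumed to carry the equal probability 1/M.

```python
import random


def draw_scenarios(num_scenarios, horizon, distribution, rng):
    """Draw disturbance scenarios w_k over the horizon and their probabilities."""
    if distribution == "gaussian":
        sample = lambda: rng.gauss(0.0, 0.2)      # N(0, 0.2^2)
    elif distribution == "uniform":
        sample = lambda: rng.uniform(-0.4, 0.4)   # -0.4 <= w_k <= 0.4
    else:
        raise ValueError("distribution must be 'gaussian' or 'uniform'")
    scenarios = [[sample() for _ in range(horizon)] for _ in range(num_scenarios)]
    # Assumption: equally weighted scenarios.
    probabilities = [1.0 / num_scenarios] * num_scenarios
    return scenarios, probabilities


# Three values per scenario for the first CVaR controller, eight for the second.
rng = random.Random(0)
short_scenarios, short_probs = draw_scenarios(500, 3, "gaussian", rng)
long_scenarios, long_probs = draw_scenarios(500, 8, "uniform", rng)
```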

The performance of the controllers is measured using four metrics. The state performance metrics are the integrated output error (IOE) outside of the control zone and the percentage of points in the zone (PIZ). The input metrics are the average weekly EPO dose (EPO/week) and the average weekly change in dose (Avg $Δ$EPO). The four performance metrics are calculated using Equation (31).

\mathrm{IOE} = \sum_{k=1}^{t_{f}} f_{k}, \quad f_{k} = \begin{cases} \mathrm{Hgb}_{k} - y_{H}, & \text{if } \mathrm{Hgb}_{k} > y_{H} \\ y_{L} - \mathrm{Hgb}_{k}, & \text{if } \mathrm{Hgb}_{k} < y_{L} \\ 0, & \text{if } y_{L} \leq \mathrm{Hgb}_{k} \leq y_{H} \end{cases}

(31a)

\mathrm{PIZ} = 100 \sum_{k=1}^{t_{f}} \frac{c_{k}}{t_{f}}, \quad c_{k} = \begin{cases} 1, & \text{if } y_{L} \leq \mathrm{Hgb}_{k} \leq y_{H} \\ 0, & \text{if } \mathrm{Hgb}_{k} < y_{L} \text{ or } \mathrm{Hgb}_{k} > y_{H} \end{cases}

(31b)

\mathrm{EPO/week} = \sum_{k=1}^{t_{f}} \frac{\mathrm{EPO}_{k}}{t_{f}}

(31c)

\mathrm{Avg}\,Δ\mathrm{EPO} = \sum_{k=2}^{t_{f}} \frac{| \mathrm{EPO}_{k} - \mathrm{EPO}_{k-1} |}{t_{f}}

(31d)
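The four metrics in Equation (31) translate directly into a few lines of code. The sketch below is illustrative only: the function name and the sample data are assumptions, and the average change in dose is divided by the full number of samples, as written in Equation (31d).

```python
def performance_metrics(hgb, epo, y_low, y_high):
    """Return (IOE, PIZ, EPO/week, Avg dEPO) for equally long weekly records."""
    tf = len(hgb)
    # IOE: accumulated distance of the hemoglobin from the zone (31a).
    ioe = sum(max(h - y_high, y_low - h, 0.0) for h in hgb)
    # PIZ: percentage of samples inside the zone (31b).
    piz = 100.0 * sum(1 for h in hgb if y_low <= h <= y_high) / tf
    # Average weekly dose (31c).
    epo_per_week = sum(epo) / tf
    # Average weekly change in dose, divided by tf as in (31d).
    avg_delta_epo = sum(abs(epo[k] - epo[k - 1]) for k in range(1, tf)) / tf
    return ioe, piz, epo_per_week, avg_delta_epo


# Hypothetical four-week record with a 10-11 g/dL zone.
ioe, piz, epo_per_week, avg_delta_epo = performance_metrics(
    hgb=[9.5, 10.2, 11.4, 10.8],
    epo=[5000, 6000, 6000, 4000],
    y_low=10.0,
    y_high=11.0,
)
```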

7.1. Test under An ARX Model Based Simulator

The ARX model based simulations were performed without time-varying parameters and without recursive modeling. These simulations represent a nominal case, where the patient model does not change. A single constrained-ARX model was used. A simulation time of 2000 weeks was used to give the results better statistical power, with a sampling time of one week. The controller tuning parameters are shown in Table 1. The tuning parameters were chosen empirically to keep the average change in EPO dose low. In practice, EPO is often administered in 1000 IU increments, so the controllers were tuned to keep the average weekly change below this value.

The results for the Gaussian distribution simulation are outlined in Table 2. The CVaR formulations offer significant improvements over traditional zone-MPC, but their state performance is only comparable to that of set-point tracking MPC (IOE of 113.2 and 108.0 versus 112.1, and PIZ of 74.0% versus 74.2%). The time to solve is also included in the table. Due to the large number of variables introduced by sampling in the CVaR controllers, the time per iteration increases from 0.005 s to 0.091 s for CVaR$_{1}$ and to 0.588 s for CVaR$_{2}$.

The results for the uniform distribution are outlined in Table 3 and Table 4. CVaR$_{1}$ shows a large increase in state performance over set-point tracking MPC (PIZ of 73.4% versus 69.9%), but this comes at the cost of a more aggressive control action (Avg $Δ$EPO of 1413 versus 701). Although the overall statistics show the controller to be more aggressive, its control input is in fact quite stable most of the time.

Figure 6 depicts a small window of the full 2000-week simulation results, comparing MPC to CVaR$_{1}$. CVaR$_{1}$ behaves similarly to zone-MPC, but the controller is extra aggressive outside of the zone boundary, as the hemoglobin approaches the constraint boundaries. The overall aggressiveness of the controller is much larger due to these larger moves, even though the doses are typically more stable. Figure 7 compares CVaR$_{2}$ to zone-MPC. Compared to zone-MPC, the response is much better, as CVaR$_{2}$ is able to pre-emptively change the dose in anticipation of the hemoglobin leaving the zone, whereas zone-MPC only reacts once the hemoglobin has left the zone, causing it to traverse beyond the zone boundaries more often.

7.2. Test under PK/PD Model Based Simulator

For this test, the patient simulator shown in Figure 8, which is described in detail in [23], was used. It represents a more realistic scenario with which to test the controllers. The patient simulator uses a system of nonlinear delayed differential equations (DDE) based on pharmacokinetics and pharmacodynamics to model the hemoglobin response.

The DDE model was proposed by Chait et al. [11]. Pharmacokinetics is the study of the movement of drugs within the body while pharmacodynamics is related to the mechanisms by which the drugs affect the body. The PK/PD model is described below.

E_{en} = \frac{C H_{en}}{μ K_{H} S - H_{en}}

(32a)

\frac{dE(t)}{dt} = \frac{-V E(t)}{K_{m} + E(t)} - α E(t) + \mathrm{dose}(t)

(32b)

\frac{dR(t)}{dt} = \frac{S (E_{en} + E(t-D))}{C + E_{en} + E(t-D)} - \frac{4 x_{1}(t)}{μ^{2}}

(32c)

\frac{dx_{1}(t)}{dt} = x_{2}(t)

(32d)

\frac{dx_{2}(t)}{dt} = \frac{S (E_{en} + E(t-D))}{C + E_{en} + E(t-D)} - \frac{4 x_{1}(t)}{μ^{2}} - \frac{4 x_{2}(t)}{μ}

(32e)

In Equation (32), the states $E(t)$ and $R(t)$ represent the pool of exogenous erythropoietin and the population of red blood cells (RBC) within the body, respectively. The states $x_{1}(t)$ and $x_{2}(t)$ are internal states that aid in calculating $R(t)$. $E_{en}$ is the endogenous erythropoietin naturally produced by the body. $K_{H}$ is the average amount of hemoglobin per RBC (also known as the mean corpuscular hemoglobin, MCH). The value used here is fixed at 29.5 pg/cell, which is within the reference range of 27–33 pg/cell [11]. The hemoglobin value is directly proportional to the RBC population; it is obtained by multiplying the RBC population estimate by the MCH value. The function $\mathrm{dose}(t)$ is a train of impulses representing the EPO injections. The model has the initial condition given in Equation (33), which requires two measurements of hemoglobin ($Hb_{1}$ and $Hb_{2}$) and the times at which they were taken ($t_{1}$ and $t_{2}$) to estimate. The prior history of the exogenous EPO is unknown and is assumed to be zero for all $t \leq 0$.

\dot{R}_{0} = \frac{Hb_{2} - Hb_{1}}{K_{H} (t_{2} - t_{1})}

(33a)

E(0) = 0

(33b)

R(0) = \frac{Hb_{1}}{K_{H}}

(33c)

x_{1}(0) = \frac{μ (H_{en} - μ K_{H} \dot{R}_{0})}{4 K_{H}}

(33d)

x_{2}(0) = R(0) - \frac{4 x_{1}(0)}{μ}

(33e)

Table 5 contains parameters that are estimated for an individual patient using clinical data and nonlinear least-squares regression. The dde23 and lsqnonlin functions in MATLAB were used for DDE solution and parameter estimation, respectively.

The DDE model is solved in continuous time and sampled every two weeks. The output of the DDE model is the red blood cell population ($R_{k}$), which is multiplied by the mean corpuscular hemoglobin ($K_{H}$) to obtain the hemoglobin measurement. An integrating process uncertainty ($w_{k}$) is then added to the hemoglobin measurement. It is important to note the difference between the ARX model simulator and the PK/PD model simulator with regard to the way $w_{k}$ is used. In the ARX simulator case, $w_{k}$ is filtered with the patient model. In the PK/PD simulator case, the disturbance is integrating. The magnitudes are drawn from the same distributions in both cases.

The simulator also includes the possibility of acute step and ramp disturbances to simulate the infections and blood losses frequently observed in actual patient data. Acute disturbances are implemented through the variable $A_{D}$, which holds a value of one unless a disturbance has occurred. If an acute disturbance occurs, $A_{D}$ holds a value less than one, and the existing pool of red blood cells in the patient simulator model is multiplied by this fractional value to reduce it. A step disturbance is then modeled by an impulse in $A_{D}$, while an infection is modeled by a step function in $A_{D}$. Plant–model mismatch exists, as the controller model is linear. The patient simulator was shown to mimic the real-life hemoglobin dynamics of patients well [23].

The total simulation time was 500 weeks. Recursive modeling is necessary to capture the time-varying nature of the system model. Although the patient model does not change significantly from one sampling interval to the next, it is still re-estimated at each sampling interval to provide continuity between the models. A hemoglobin sampling time of two weeks is used, and the two weekly dose totals computed by the controller are applied during each sampling interval. The recursive modeling method applies a data re-sampling algorithm to obtain one-week measurements (through linear interpolation) and weekly dose totals. The one-week re-sampled hemoglobin history is also smoothed, using a five-point moving average filter, before modeling occurs. The weighting parameter $λ$ is set to 0.965. The tuning parameters for each of the controllers are outlined in Table 6. The tuning parameters were again defined empirically. For these simulations, the Avg $Δ$EPO was tuned to be approximately equal in most cases, to better compare the controller results to one another.
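The re-sampling and smoothing steps used before recursive modeling can be sketched as below. This is a minimal illustration under assumptions, not the authors' code: the function names are hypothetical, the biweekly record is assumed to be non-empty and evenly spaced, and the moving average is assumed to be centered with a window that shrinks at the ends of the record (the source does not state how the edges are handled).

```python
def resample_weekly(hgb_biweekly):
    """Linearly interpolate evenly spaced biweekly samples to weekly samples."""
    weekly = []
    for current, following in zip(hgb_biweekly, hgb_biweekly[1:]):
        weekly.append(current)
        weekly.append(0.5 * (current + following))  # midpoint = in-between week
    weekly.append(hgb_biweekly[-1])
    return weekly


def moving_average(values, window=5):
    """Centered moving average; the window shrinks near the ends (assumption)."""
    half = window // 2
    smoothed = []
    for i in range(len(values)):
        lo = max(0, i - half)
        hi = min(len(values), i + half + 1)
        smoothed.append(sum(values[lo:hi]) / (hi - lo))
    return smoothed


# Hypothetical biweekly hemoglobin record (g/dL).
weekly = resample_weekly([10.0, 11.0, 10.0])
smoothed = moving_average(weekly, window=5)
```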

For reference, more information about the actual disturbances is shown in Figure 9 and Figure 10. The top panels show the summation of the disturbance over the course of the simulation. The center panels show the magnitude of the disturbance at each point. The bottom panels show the acute disturbances that occur during the simulation, which are the same in both cases: seven blood loss events and three infection events. The PK/PD parameters used in the simulator are as follows: $α = 0.2718$, $C = 22.45$, $D = 6.33$, $H_{en} = 7.81$, $K_{m} = 76.05$, $μ = 91.54$, $S = 0.00554$, and $V = 1655$.
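As a quick consistency check on these values, the sketch below evaluates Equation (32a) and the production term of Equation (32c) with no exogenous EPO. The units of the parameters are not restated here and are taken as given; the function names and the interpretation in the comments are assumptions. Rearranging Equation (32a) shows that $μ K_{H}$ times the endogenous production rate equals $H_{en}$ exactly, which the code reproduces numerically.

```python
# Parameter values as listed in the text; K_H is the fixed MCH value.
PARAMS = {
    "alpha": 0.2718, "C": 22.45, "D": 6.33, "H_en": 7.81,
    "K_m": 76.05, "mu": 91.54, "S": 0.00554, "V": 1655.0,
}
K_H = 29.5


def endogenous_epo(p, k_h):
    """E_en from Equation (32a)."""
    return p["C"] * p["H_en"] / (p["mu"] * k_h * p["S"] - p["H_en"])


def production_rate(epo_total, p):
    """Saturating RBC production term S * E / (C + E) from Equation (32c)."""
    return p["S"] * epo_total / (p["C"] + epo_total)


e_en = endogenous_epo(PARAMS, K_H)
# With E(t - D) = 0 only endogenous EPO drives production; scaling the rate by
# the mean RBC life span and K_H recovers H_en (algebraic identity from (32a)).
baseline_hgb = PARAMS["mu"] * K_H * production_rate(e_en, PARAMS)
```

With the listed values, `e_en` is roughly 24.5 and `baseline_hgb` returns 7.81, the hemoglobin level attributed to endogenous erythropoietin.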

The results for the Gaussian uncertainty case are presented in Table 7. The performance of the four controllers is shown in Figure 11 for a portion of the full 500-week simulation. The tuning parameters were chosen to keep the average $Δ$EPO of the controllers approximately equal, so that it is easier to compare the state performance between the controllers. However, the zone-MPC controller is unable to reach the same level of aggressiveness as the other three without becoming relatively unstable. Using the CVaR constraints improves the performance drastically over zone-MPC and offers modest improvements over traditional MPC. Using CVaR in the cost function gives the same PIZ as set-point tracking MPC (80.0%) with a nearly identical average control action, although its IOE is higher (39.05 versus 25.99). Using CVaR in the cost function allows the medication dosing to remain stable when the hemoglobin is within the zone. As the hemoglobin nears the borders, the controller will typically make large moves in the medication dose. Overall, the medication dose remains very stable, even though the average $Δ$EPO does not portray this feature in the statistics.

For the uniform distribution case, the overall results are shown in Table 8. A short window of the overall simulation can be seen in Figure 12. The CVaR constraints allow the controller to significantly improve the PIZ metric over the other controllers while keeping a similar average controller aggressiveness. Using CVaR in the cost function gives an average change in dose similar to that of MPC and CVaR$_{1}$, but Figure 12 again shows that the controller rarely changes the medication dose and that, when it does, the move is typically a significant one.

8. Conclusions

This work explores the use of Conditional Value at Risk (CVaR) in addressing process uncertainty in hemoglobin level control in chronic kidney disease patients. CVaR is able to utilize any type of uncertainty distribution and does not rely on Gaussian assumptions. Two methods of utilizing CVaR were introduced: one where CVaR constraints are used, and the other where the cost function is the CVaR. Simulations were performed under both Gaussian and uniform uncertainty distributions. A simple ARX based patient simulator and a more complex PK/PD based patient simulator were used in the control performance tests. Simulation results show that adding CVaR constraints to the setpoint tracking MPC problem increased the state performance over all of the other controllers. Using CVaR directly in the cost function resulted in improvements over traditional zone-MPC. Moreover, it offers the benefit of having the medication doses remain more stable over the course of the treatment, which may be a feature of a control method that is highly sought after by clinicians. Based on these results, Conditional Value at Risk is a promising tool for improving control performance in hemoglobin concentration control and EPO dosage optimization. Finally, this work has focused only on process uncertainty; model parameter uncertainty has not been considered and will be investigated in future work.

Author Contributions

Z.L., J.M. and J.L. participated in the design and analysis of control methods and experiments. J.M. wrote the MATLAB code and performed all the simulations. J.M. wrote the paper. Z.L., J.M. and J.L. participated in the revisions and editing of the paper. U.S. provided the clinical data for modeling purposes.

Acknowledgments

This work was supported by the Natural Sciences and Engineering Research Council of Canada.

Conflicts of Interest

The authors declare no conflict of interest.

Abbreviations

The following abbreviations are used in this manuscript:

ARX	Autoregressive with exogenous inputs
CKD	Chronic Kidney Disease
CVaR	Conditional Value at Risk
DDE	Delayed Differential Equation
EPO	Erythropoietin
IOE	Integrated Output Error
IU	International Units
PIZ	Percent of Points in the Zone
PK/PD	Pharmacokinetic and pharmacodynamic
MPC	Model Predictive Control
RBC	Red Blood Cells
VaR	Value at Risk

References

1. Hayat, A.; Haria, D.; Salifu, M. Erythropoetin stimulating agents in the management of anemia of chronic kidney disease. Patient Preference Adherence 2008, 2, 195–200.
2. Miskulin, D.; Weiner, D.; Tighiouart, H.; Ladik, V.; Servilla, K.; Zager, P.; Martin, A.; Johnson, H.; Meyer, K. Computerized decision support for EPO dosing in hemodialysis patients. Am. J. Kidney Dis. 2009, 54, 1081–1088.
3. Grosman, B.; Dassau, E.; Zisser, H.; Jovanovic, L.; Doyle, F. Zone Model Predictive Control: A Strategy to Minimize Hyper- and Hypoglycemic Events. J. Diabetes Sci. Technol. 2010, 4, 961–975.
4. Batora, V.; Tarnik, M.; Murgas, J.; Schmidt, S.; Nørgaard, K.; Poulsen, N.; Madsen, H.; Boiroux, D.; Jørgensen, J. The Contribution of Glucagon in an Artificial Pancreas for People with Type 1 Diabetes. In Proceedings of the 2015 American Control Conference (ACC), Chicago, IL, USA, 1–3 July 2015; pp. 5097–5102.
5. Rivadeneira, P.; Ferramosca, A.; Gonzalez, A. Impulsive Zone Model Predictive Control with Application to Type I Diabetic Patients. In Proceedings of the 2016 IEEE Conference on Control Applications (CCA), Buenos Aires, Argentina, 19–22 September 2016; pp. 544–549.
6. Lee, J.; Gondhalekar, R.; Doyle, F. Design of an Artificial Pancreas using Zone Model Predictive Control with a Moving Horizon State Estimator. In Proceedings of the 2014 IEEE 53rd Annual Conference on Decision and Control (CDC), Los Angeles, CA, USA, 15–17 December 2014; pp. 6975–6980.
7. Bradbury, B.D.; Danese, M.D.; Gleeson, M.; Critchlow, C.W. Effect of Epoetin alfa dose changes on hemoglobin and mortality in hemodialysis patients with hemoglobin levels persistently below 11 g/dL. Clin. J. Am. Soc. Nephrol. 2009, 4, 630–637.
8. Gaweda, A.; Jacobs, A.; Aronoff, G.; Brier, M. Model predictive control of erythropoietin administration in the anemia of ESRD. Am. J. Kidney Dis. 2008, 51, 71–79.
9. Brier, M.; Gaweda, A.; Dailey, A.; Jacobs, A.; Aronoff, G. Randomized trial of model predictive control for improved anemia management. Clin. J. Am. Soc. Nephrol. 2010, 5, 814–820.
10. Gaweda, A.; Jacobs, A.; Aronoff, G.; Rai, S.; Brier, M. Individualized anemia management reduces hemoglobin variability in hemodialysis patients. J. Am. Soc. Nephrol. 2014, 25, 159–166.
11. Chait, Y.; Horowitz, J.; Nichols, B.; Shrestha, R.P.; Hollot, C.V.; Germain, M.J. Control-Relevant Erythropoiesis Modeling in End-Stage Renal Disease. IEEE Trans. Biomed. Eng. 2014, 61, 658–664.
12. Akabua, E.; Inanc, T.; Gaweda, A.; Brier, M.; Kim, S.; Zurada, J. Individualized Model Discovery: The Case of Anemia Patients. J. Comput. Methods Programs Biomed. 2015, 118, 23–33.
13. Alves, M.; Vilaca, S.; Carvalho, M.; Fernandes, A.; Dusse, L.; Gomes, K. Resistance of dialyzed patients to erythropoietin. Rev. Bras Hematol. Hemoter. 2015, 37, 190–197.
14. Weis, L.; Metzger, M.; Haymann, J.; Thervet, E.; Flamant, M.; Vrtovsnik, F.; Gauci, C.; Houillier, P.; Froissart, M.; Letavernier, E.; et al. Renal Function Can Improve at Any Stage of Chronic Kidney Disease. PLoS ONE 2013, 8, e81835.
15. Malhotra, V.; Beniwal, P.; Pursnani, L. Infections in Chronic Kidney Disease. Clin. Queries Nephrol. 2012, 1, 253–258.
16. Saeed, F.; Agrawal, N.; Greenberg, E.; Holley, J. Lower Gastrointestinal Bleeding in Chronic Hemodialysis Patients. Int. J. Nephrol. 2011, 2011, 272535.
17. Rockafellar, R.; Uryasev, S. Optimization of Conditional Value-at-Risk. J. Risk 2000, 2, 21–42.
18. Parisio, A.; Molinari, M.; Varagnolo, D.; Johansson, K. A Scenario-based Predictive Control Approach to Building HVAC Management Systems. In Proceedings of the 2013 IEEE International Conference on Automation Science and Engineering (CASE), Madison, WI, USA, 17–20 August 2013; pp. 428–435.
19. Bemporad, A.; Puglia, L.; Gabbriellini, T. A Stochastic Model Predictive Control Approach to Dynamic Option Hedging with Transaction Costs. In Proceedings of the 2011 American Control Conference (ACC), San Francisco, CA, USA, 29 June–1 July 2011; pp. 3862–3867.
20. Zhang, X.; Liu, P.; Xu, C.; Ming, B.; Xie, A.; Feng, M. Conditional Value-at-Risk for Nonstationary Streamflow and Its Application for Derivation of the Adaptive Reservoir Flood Limited Water Level. J. Water Resour. Plan. Manag. 2018, 144, 04018005.
21. Bemporad, A.; Morari, M. Robust model predictive control: A survey. In Robustness in Identification and Control; Springer: London, UK, 1999; pp. 207–226.
22. Mesbah, A. Stochastic model predictive control: An overview and perspectives for future research. IEEE Control Syst. 2016, 36, 30–44.
23. McAllister, J.; Li, Z.; Liu, J.; Simonsmeier, U. Erythropoietin Dose Optimization for Anemia in Chronic Kidney Disease Using Recursive Zone Model Predictive Control. IEEE Trans. Control Syst. Technol. 2018, in press.
24. Thanakitcharuand, P.; Jirajan, B. Prevalence of hemoglobin cycling and its clinical impact on outcomes in Thai end-stage renal disease patients treated with hemodialysis and erythropoiesis-stimulating-agents. J. Med. Assoc. Thail. 2016, 99, 28–37.
25. Turksoy, K.; Bayrak, E.; Quinn, L.; Littlejohn, E.; Cinar, A. Multivariable Adaptive Closed-Loop Control of an Artificial Pancreas Without Meal and Activity Announcement. Diabetes Technol. Ther. 2013, 15, 386–400.
26. Ren, J.; McAllister, J.; Li, Z.; Liu, J.; Simonsmeier, U. Modeling of Hemoglobin Response to Erythropoietin Therapy through Constrained Optimization. In Proceedings of the 6th International Symposium on Advanced Control of Industrial Processes, Taipei, Taiwan, 28–31 May 2017; pp. 245–250.

Figure 1. Overall control block diagram.

Figure 2. Clinical data for an anemia patient.

Figure 3. Comparison of setpoint and zone tracking MPC.

Figure 4. Depiction of the definition of Conditional Value at Risk.

Figure 5. Uncertainty distributions.

Figure 6. Comparisons to CVaR$_{1}$ for the uniform distribution.

Figure 7. Comparisons to CVaR$_{2}$ for the uniform distribution.

Figure 8. Block diagram of the patient simulator.

Figure 9. Uncertainty in the PK/PD based simulations: Gaussian uncertainty plus acute disturbance.

Figure 10. Uncertainty in the PK/PD based simulations: uniform uncertainty plus acute disturbance.

Figure 11. Results under Gaussian process uncertainty.

Figure 12. Results under uniform process uncertainty.

Table 1. Controller settings.

Controller Identifier	MPC	Zone-MPC	CVaR $_{1}$	CVaR $_{2}$
Q	0.8	10	0.8	0.3
M	-	-	500	500
$Δ u_{H}$ (IU)	20,000	20,000	20,000	20,000
$β$	-	-	-	0.95
$ϵ$	-	-	0.1	-
$y_{L}$	10.5	10	10.5	10
$y_{H}$	10.5	11	10.5	11
Target	10.5	10–11	10.5	10–11
Constraint Limits	-	-	9.5–11.5	-

Table 2. Simulation results under Gaussian distribution.

Performance Statistic	MPC	Zone-MPC	CVaR $_{1}$	CVaR $_{2}$
IOE ( $\frac{g}{dL\,week}$ )	112.1	174.3	113.2	108.0
PIZ (%)	74.2	64.8	74.0	74.0
EPO/week (IU)	5449	5450	5450	5448
Avg $Δ$EPO (IU)	603	489	938	541
Time per Iteration (sec)	0.005	0.005	0.091	0.588

Table 3. Setpoint MPC and CVaR$_{1}$ model performance under uniform distribution.

Performance Statistic	MPC	CVaR $_{1}$
IOE ( $\frac{g}{dL\,week}$ )	146.7	115.5
PIZ (%)	69.9	73.4
EPO/week (IU)	5435	5438
Avg $Δ$EPO (IU)	701	1413

Table 4. Zone MPC and CVaR$_{2}$ model performance under uniform distribution.

Performance Statistic	Zone-MPC	CVaR $_{2}$
IOE ( $\frac{g}{dL\,week}$ )	189.2	140.8
PIZ (%)	65.3	71.0
EPO/week (IU)	5433	5435
Avg $Δ$EPO (IU)	592	698

Table 5. Parameters for the PK/PD Model.

Parameter	Description
$H_{e n}$	Hemoglobin Level due to Endogenous Erythropoietin
$μ$	Mean RBC life span
V	Maximal Exogenous Erythropoietin clearance rate
$K_{m}$	Exogenous Erythropoietin level that produces half maximal clearance rate
$α$	Linear clearance constant
S	Maximal RBC production rate stimulated by $E_{P}$
C	Amount of EPO that produces half maximal RBC production rate
D	Time required for EPO-stimulated RBCs to start forming

Table 6. Controller settings.

Controller Identifier	MPC	Zone-MPC	CVaR $_{1}$	CVaR $_{2}$
Q	3	10	3	1.5
M	-	-	500	500
$Δ u_{H}$ (IU)	20,000	20,000	20,000	20,000
$u_{H}$ (IU)	30,000	30,000	30,000	30,000
$β$	-	-	-	0.95
$ϵ$	-	-	0.3	-
$y_{L}$	11	10	10	10
$y_{H}$	11	12	12	12
Target	11	10–12	10–12	10–12
Constraint Limits	-	-	10–12	-

Table 7. Simulation results under Gaussian distribution.

Performance Statistic	MPC	Zone-MPC	CVaR $_{1}$	CVaR $_{2}$
IOE ( $\frac{g}{dL\,week}$ )	25.99	30.47	25.04	39.05
PIZ (%)	80.0	76.5	81.3	80.0
EPO/week (IU)	20,044	20,059	20,399	19,795
Avg $Δ$EPO (IU)	1333	805	1267	1343

Table 8. Simulation results under uniform distribution.

Performance Statistic	MPC	Zone-MPC	CVaR $_{1}$	CVaR $_{2}$
IOE ( $\frac{g}{dL\,week}$ )	44.58	55.12	40.7	45.10
PIZ (%)	70.4	65.2	77.8	71.7
EPO/week (IU)	20,429	20,548	20,793	20,779
Avg $Δ$EPO (IU)	1223	838	1236	1289

© 2018 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).

