#### *2.1. Notation*

Throughout the paper, $|\cdot|$ denotes the Euclidean norm of a vector and $|\cdot|_Q$ denotes a weighted Euclidean norm (i.e., $|x|_Q^2 = x^T Q x$, where $Q$ is a positive definite matrix). $x^T$ denotes the transpose of $x$. $\mathbf{R}_+$ denotes the set $[0, \infty)$. $L_f V(x)$ denotes the standard Lie derivative $L_f V(x) := \frac{\partial V(x)}{\partial x} f(x)$. For given positive real numbers $\beta$ and $\epsilon$, $B_\beta(\epsilon) := \{x \in \mathbf{R}^n \mid |x - \epsilon| < \beta\}$ is an open ball around $\epsilon$ with a radius of $\beta$. Set subtraction is denoted by "\", i.e., $A \backslash B := \{x \in \mathbf{R}^n \mid x \in A,\ x \notin B\}$. $\lceil x \rceil$ maps $x$ to the least integer greater than or equal to $x$, and $\lfloor x \rfloor$ maps $x$ to the greatest integer less than or equal to $x$. A function $f(\cdot)$ is of class $\mathcal{C}^1$ if it is continuously differentiable in its domain. A continuous function $\alpha : [0, a) \to [0, \infty)$ is said to belong to class $\mathcal{K}$ if it is strictly increasing and is zero only when evaluated at zero.

#### *2.2. Class of Systems*

The class of continuous-time nonlinear systems considered is described by the following state-space form:

$$
\dot{x} = f(x) + g(x)u + d(x)w, \quad x(t_0) = x_0 \tag{1}
$$

where $x \in \mathbf{R}^n$ is the state vector, $u \in \mathbf{R}^m$ is the manipulated input vector, and $w \in W$ is the disturbance vector, where $W := \{w \in \mathbf{R}^q \mid |w| \le \theta,\ \theta \ge 0\}$. The control action constraint is defined by $u \in U = \{u_{\min} \le u \le u_{\max}\} \subset \mathbf{R}^m$, where $u_{\min}$ and $u_{\max}$ represent the minimum and maximum allowed input value vectors, respectively. $f(\cdot)$, $g(\cdot)$ and $d(\cdot)$ are sufficiently smooth vector and matrix functions of dimensions $n \times 1$, $n \times m$ and $n \times q$, respectively. Without loss of generality, the initial time $t_0$ is taken to be zero ($t_0 = 0$), and it is assumed that $f(0) = 0$; thus, the origin is a steady-state of the system of Equation (1) with $w(t) \equiv 0$ (i.e., $(x_s^*, u_s^*) = (0, 0)$). In this manuscript, we assume that every state is measured by multiple sensors that are isolated from one another, so that if one sensor measurement is tampered with by a cyber-attack, a secure network can still be used to send correct sensor measurements of $x(t)$ to the controller. This can also be viewed as having secure, redundant sensors, or simply an alternative, secure network for sending the sensor measurements to the controller. However, if this assumption does not hold (i.e., no secure sensors are available), then after the detection of a cyber-attack the system has to be shut down, or operated thereafter in an open-loop manner with an accurate process model.

#### *2.3. Stabilizability Assumptions and Lyapunov-Based Control*

Consider the nominal system of Equation (1) with $w(t) \equiv 0$. We first assume that there exists a stabilizing feedback control law $u = \Phi(x) \in U$ such that the origin of the nominal system of Equation (1) can be rendered asymptotically stable for all $x \in D_1 \subset \mathbf{R}^n$, where $D_1$ is an open neighborhood of the origin, in the sense that there exists a positive definite $\mathcal{C}^1$ control Lyapunov function $V$ that satisfies the small control property and the following inequalities:

$$\alpha_1(|x|) \le V(x) \le \alpha_2(|x|), \tag{2a}$$

$$\frac{\partial V(x)}{\partial x} F(x, \Phi(x), 0) \le -\alpha_3(|x|), \tag{2b}$$

$$\left|\frac{\partial V(x)}{\partial x}\right| \le \alpha_4(|x|) \tag{2c}$$

where $\alpha_j(\cdot)$, $j = 1, 2, 3, 4$ are class $\mathcal{K}$ functions. $F(x, u, w)$ is used to represent the right-hand side of Equation (1) (i.e., $F(x, u, w) = f(x) + g(x)u + d(x)w$).

An example of a feedback control law that is continuous for all *x* in a neighborhood of the origin and renders the origin asymptotically stable is the following control law [18]:

$$\phi_i(x) = \begin{cases} -\dfrac{p + \sqrt{p^2 + |q|^4}}{|q|^2}\, q_i, & \text{if } q \neq 0 \\ 0, & \text{if } q = 0 \end{cases} \tag{3a}$$

$$\Phi_i(x) = \begin{cases} u_i^{\min}, & \text{if } \phi_i(x) < u_i^{\min} \\ \phi_i(x), & \text{if } u_i^{\min} \le \phi_i(x) \le u_i^{\max} \\ u_i^{\max}, & \text{if } \phi_i(x) > u_i^{\max} \end{cases} \tag{3b}$$

where $p$ denotes $L_f V(x)$ and $q$ denotes $(L_g V(x))^T = [L_{g_1} V(x) \cdots L_{g_m} V(x)]^T$. $\phi_i(x)$ of Equation (3a) represents the $i$th component of the control law $\Phi(x)$ before considering saturation of the control action at the input bounds. $\Phi_i(x)$ of Equation (3b) represents the $i$th component of the saturated control law $\Phi(x)$ that accounts for the input constraints $u \in U$. Based on the controller $\Phi(x)$ that satisfies Equation (2), the set of initial conditions from which the controller $\Phi(x)$ can stabilize the origin of the input-constrained system of Equation (1) is characterized as $\phi_n = \{x \in \mathbf{R}^n \mid \dot{V} + \kappa V(x) \le 0,\ u = \Phi(x) \in U,\ \kappa > 0\}$. Additionally, we define a level set of $V(x)$ inside $\phi_n$ as $\Omega_\rho := \{x \in \phi_n \mid V(x) \le \rho\}$, which represents a stability region of the closed-loop system of Equation (1).
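The control law of Equations (3a) and (3b) can be sketched numerically. In the snippet below, the scalar dynamics $\dot{x} = x + u$, the choice $V(x) = x^2$, and the input bounds are illustrative assumptions, not the paper's process example:

```python
import numpy as np

# Hypothetical single-input system x_dot = x + u with V(x) = x^2 (assumed),
# so p = L_f V(x) = 2x * x and q = L_g V(x) = 2x * 1.

def sontag_phi(p, q):
    """Unsaturated control law of Eq. (3a) for the single-input case."""
    if q == 0:
        return 0.0
    return -(p + np.sqrt(p ** 2 + abs(q) ** 4)) / abs(q) ** 2 * q

def saturate(phi_i, u_min, u_max):
    """Saturation of Eq. (3b): clip the i-th component at the input bounds."""
    return min(max(phi_i, u_min), u_max)

def Phi(x, u_min=-5.0, u_max=5.0):
    p = 2 * x * x          # L_f V(x) = (dV/dx) f(x)
    q = 2 * x              # L_g V(x) = (dV/dx) g(x)
    return saturate(sontag_phi(p, q), u_min, u_max)
```

Away from the origin, the resulting $\dot{V} = 2x(x + \Phi(x))$ is negative whenever the input bounds permit, which is the property the stability region $\Omega_\rho$ is built on.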

#### **3. Cyber-Attack and Detection Methodology**

From the perspective of process control systems, cyber-attacks are malicious signals that can compromise actuators, sensors or their communication networks. Specifically, among sensor cyber-attacks, DoS attacks, replay attacks and deception attacks are the three most common and easily implementable ones by attackers [5]. On the other hand, since stealthy cyber-attacks are designed to damage the performance of CPS (e.g., stability and safety), developing more reliable detection and control methods that can detect, locate and mitigate cyber-attacks in a timely fashion and control the damage within a tolerable limit is imperative.

In this section, the min-max cyber-attack designed to damage the closed-loop stability of the system of Equation (1) is first introduced. Subsequently, a general model-based detection method [4] and the corresponding stealthy cyber-attacks that can evade such detection are presented. Finally, to better detect different types of cyber-attacks, a data-based detection scheme that utilizes machine learning methods is developed with a sliding detection window.

#### *3.1. Min-Max Cyber-Attack*

In this subsection, we first consider a deception sensor cyber-attack in which the minimum or maximum allowable sensor measurement values are fed into the process control system (e.g., a Lyapunov-based control system with a stability region $\Omega_\rho$ defined by a level set of the Lyapunov function $V(x)$) to drive the closed-loop states away from their expected values and ultimately ruin the stability of the closed-loop system. Since for all $x \in \Omega_\rho$ there exists a feasible control action $u = \Phi(x)$ such that $\dot{V} < 0$, closed-loop stability is maintained within the stability region $\Omega_\rho$ under $\Phi(x)$. Assuming that attackers know the stability region of the system of Equation (1) in advance and have access to some of the sensors (but not all), then, in order to remain undetectable by a simple stability-region-based detection method (i.e., a cyber-attack is detected if the state is outside the stability region), the min-max cyber-attack is designed in the following form so that the fake sensor measurements remain inside $\Omega_\rho$:

$$\bar{x} = \underset{x \in \mathbf{R}^n}{\arg\max} \{ V(x) \le \rho \} \tag{4}$$

where $\bar{x}$ is the tampered sensor measurement. Since the controller needs access to true state measurements to maintain closed-loop stability in a state feedback control system, wrong state measurements under cyber-attacks can affect control actions and eventually drive the state away from its set-point. In the section "Application to a chemical process example", it is shown that if attackers apply a min-max cyber-attack to safety-critical sensors (e.g., temperature or pressure sensors in a chemical reactor) in process control systems, closed-loop stability may not be maintained (i.e., the closed-loop state leaves $\Omega_\rho$) and the system may have to be shut down.
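For a quadratic Lyapunov function, the tampered measurement of Equation (4) lies on the boundary of $\Omega_\rho$. A one-line scalar sketch, where $V(x) = x^2$ is an assumed choice rather than the paper's example:

```python
import math

# Sketch of the min-max attack of Eq. (4) for a scalar state with V(x) = x^2:
# the tampered reading is the extreme value still satisfying V(x) <= rho.
def minmax_attack(rho, direction=+1):
    """Return the max (direction=+1) or min (direction=-1) value inside Omega_rho."""
    return direction * math.sqrt(rho)  # V(x) = x^2 <= rho  <=>  |x| <= sqrt(rho)
```

The reported value sits exactly on the level set $V(\bar{x}) = \rho$, so a detector that only checks membership in $\Omega_\rho$ never fires.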

#### *3.2. Model-Based Detection and Stealthy Cyber-Attack*

Based on the known process model of Equation (1), a cumulative sum (CUSUM) statistic detection method [4] can be developed to minimize the detection time when a cyber-attack occurs. Specifically, the CUSUM statistic method detects cyber-attacks by calculating the cumulative sum of the deviation between expected and measured states. The method is developed by the following equations:

$$S(k) = (S(k-1) + z(k))^+,\ S(0) = 0\tag{5a}$$

$$D(S(k)) = \begin{cases} 1, & \text{if } S(k) > S\_{TH} \\ 0, & \text{otherwise} \end{cases} \tag{5b}$$

where $S(k)$ is the nonparametric CUSUM statistic and $S_{TH}$ is the threshold for the detection of cyber-attacks. $(S)^+ = S$ if $S \ge 0$ and $(S)^+ = 0$ otherwise. $D$ is the detection indicator: $D = 1$ indicates that a cyber-attack is confirmed, while $D = 0$ indicates no cyber-attack. $z(k)$ is the deviation between the expected state $\tilde{x}(t_k)$ and the measured state $x(t_k)$ at time $t = t_k$: $z(k) := |\tilde{x}(t_k) - x(t_k)| - b$, where $\tilde{x}(t_k)$ is derived using the known process model, the state and the control action at $t = t_{k-1}$, and $b$ is a small positive constant introduced to reduce the false alarm rate due to disturbances.
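Equations (5a) and (5b) can be sketched as follows for a scalar state; the threshold $S_{TH}$ and bias $b$ below are assumed placeholder values:

```python
# Sketch of the CUSUM detector of Eqs. (5a)-(5b); threshold and bias are assumed.
def cusum_detect(x_expected, x_measured, s_th=1.0, b=0.05):
    """Return a list of (S(k), D) pairs for paired expected/measured samples."""
    S, out = 0.0, []
    for xe, xm in zip(x_expected, x_measured):
        z = abs(xe - xm) - b                   # z(k) = |x~(t_k) - x(t_k)| - b
        S = max(S + z, 0.0)                    # S(k) = (S(k-1) + z(k))^+, S(0) = 0
        out.append((S, 1 if S > s_th else 0))  # D = 1 once S(k) > S_TH
    return out
```

With clean measurements the bias $b$ keeps $S(k)$ pinned at zero, while a persistent offset accumulates until the threshold is crossed.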

With a carefully selected $S_{TH}$, the model-based detection method can detect many sensor cyber-attacks efficiently. However, the above model-based method may be evaded and become invalid for stealthy cyber-attacks if attackers know more about the system (e.g., the system model and the principles of the detection method). For example, three advanced stealthy cyber-attacks were proposed in [4] to damage the system without triggering the threshold of the model-based detection method. Specifically, a surge cyber-attack is designed to maximize the damage for the first few steps (similarly to min-max cyber-attacks) and then switch to small perturbations for the remainder of the time once $S(k)$ reaches $S_{TH}$. The form of a surge cyber-attack is given by the following equation:

$$\bar{x}(t_k) = \begin{cases} x^{\min}(t_k), & \text{if } S(k) \le S_{TH} \\ \tilde{x}(t_k) - \left| S_{TH} + b - S(k-1) \right|, & \text{otherwise} \end{cases} \tag{6}$$

The above surge cyber-attack is able to maintain $S(k)$ within its threshold and is therefore undetectable by the above detection method. In this case, the defenders should either develop more advanced detection methods for stealthy cyber-attacks (i.e., it becomes an interactive decision-making process between an attacker and a defender [19]), or develop a detection method from another perspective, for example, a data-based method. Since the purpose of any type of stealthy cyber-attack is to alter the normal operation and degrade the performance of the system of Equation (1), the dynamic operation of the system of Equation (1) (e.g., dynamic trajectories in state-space) under cyber-attacks becomes different from that of the nominal system of Equation (1). This deviation in the data can be regarded as an intrinsic indicator for the detection of cyber-attacks. In this direction, a data-based detection system is developed via machine learning methods in the next subsection.
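The surge logic of Equation (6) against the CUSUM update of Equation (5a) can be sketched as follows; the scalar state, the minimum allowable value $x^{\min}$, $S_{TH}$ and $b$ are all illustrative assumptions:

```python
# Sketch of the surge attack of Eq. (6) evading the CUSUM statistic of Eq. (5a).
# Scalar state; x_min, S_TH, and b are illustrative assumptions.
def surge_attack(x_expected, x_min=-0.5, s_th=1.0, b=0.05):
    """Return tampered measurements that never let S(k) exceed S_TH."""
    S, tampered = 0.0, []
    for xe in x_expected:
        z_surge = abs(xe - x_min) - b          # deviation if we report x_min
        if S + z_surge <= s_th:
            xbar = x_min                       # surge phase: report x_min
        else:
            # back off: |x~ - xbar| - b = S_TH - S(k-1), so S(k) lands on S_TH
            xbar = xe - abs(s_th + b - S)
        S = max(S + abs(xe - xbar) - b, 0.0)   # defender's CUSUM update, Eq. (5a)
        tampered.append(xbar)
    return tampered, S
```

Starting the surge drives $S(k)$ up to exactly $S_{TH}$ and parks it there, so the indicator $D$ of Equation (5b) never fires even though every reported measurement is false.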

#### *3.3. Detection via Machine Learning Techniques*

Machine learning has a wide range of applications in classification, regression, and clustering problems. To detect cyber-attacks, classification methods can be utilized to determine whether or not there is a cyber-attack on the system of Equation (1). Data-based learning problems are usually categorized into unsupervised learning and supervised learning.

Unsupervised learning (e.g., k-means clustering) uses unlabeled data to derive a model that can split the data into different categories. Supervised learning, on the other hand, aims to develop a function that maps an input to an output based on a labeled dataset (input-output pairs). There are two types of supervised learning tools: (1) classification tools (e.g., k-nearest neighbors (k-NN), support vector machines (SVM), random forests, neural networks), which develop a function from labeled training datasets to predict the class of a new set of data that was not used in the training stage; and (2) regression tools (e.g., linear regression, support vector regression), which aim to predict the outcome of an event based on the relationship between variables obtained from the training datasets (labeled input-output pairs) [20]. Since supervised learning relies on labeled training data, we utilize a neural network (NN) algorithm to predict whether the system of Equation (1) is operating nominally, under disturbances, or under cyber-attacks. Subsequently, a Lyapunov-based model predictive controller is proposed to stabilize the closed-loop system in both the absence and the presence of cyber-attacks.

#### *3.4. NN-Based Detection System*

Since the evolution of the closed-loop state from the initial condition $x(0) = x_0 \in \Omega_\rho$ is determined by both the nonlinear system model of Equation (1) and the design of the process control system, it is difficult to distinguish normal operation from operation under cyber-attacks. Moreover, even if a detection method is developed for a specific cyber-attack (e.g., the min-max cyber-attack), the detection strategy is not guaranteed to identify a different type of cyber-attack. Motivated by these concerns, this work proposes a data-based detection system for different types of cyber-attacks using machine learning methods.

As a widely used machine learning method, neural networks build a general class of nonlinear functions from input variables to output variables. The basic structure of a feed-forward multiple-input-single-output neural network with one hidden layer is given in Figure 1, where $N_{u_j}$, $j = 1, 2, \ldots, n$ denotes the input variables in the input layer, $\theta_{1i}$, $i = 1, 2, \ldots, h$ denotes the neurons in the hidden layer, and $N_y$ denotes the output in the output layer. Specifically, the hidden neurons $\theta_{1i}$ and the output $N_y$ (i.e., the classification result) are obtained by the following equations, respectively [21]:

$$\theta_{1i} = \sigma_1\left(\sum_{j=1}^{n} N_{w_{ij}}^{(1)} N_{u_j} + N_{w_{i0}}^{(1)}\right) \tag{7}$$

$$N_y = \sigma_2\left(\sum_{j=1}^{h} N_{w_j}^{(2)} \theta_{1j} + N_{w_0}^{(2)}\right) \tag{8}$$

where $\sigma_1$, $\sigma_2$ are nonlinear activation functions, $N_{w_{ij}}^{(1)}$ and $N_{w_j}^{(2)}$ are weights, and $N_{w_{i0}}^{(1)}$, $N_{w_0}^{(2)}$ are biases. For simplicity, the input vector $\mathbf{N_u}$ will be used to denote all the inputs $N_{u_j}$, and the weight matrix $\mathbf{N_w}$ will be used to represent all the weights and biases in Equations (7) and (8). The neurons in the hidden layer receive the weighted sum of the inputs and use the activation function $\sigma_1$ (e.g., the ReLU function $\sigma(x) = \max(0, x)$ or the sigmoid function $\sigma(x) = 1/(1 + e^{-x})$) to introduce nonlinearity, so that the NN is not a simple linear combination of the inputs. The output neuron generates the class label via a linear combination of the hidden neurons and an activation function $\sigma_2$ (e.g., the sigmoid function for two classes, or the softmax function $\sigma_i(x) = e^{x_i} / \sum_{k=1}^{K} e^{x_k}$ for multiple classes, where $K$ is the number of classes).
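The forward pass of Equations (7) and (8) can be sketched with NumPy; the layer sizes and the random (untrained) weights below are placeholders, not values from the paper:

```python
import numpy as np

# Forward pass of the one-hidden-layer network of Eqs. (7)-(8);
# sizes and weights are random placeholders, not trained values.
rng = np.random.default_rng(0)
n, h, K = 4, 8, 3                                # inputs, hidden neurons, classes

W1 = rng.normal(size=(h, n)); b1 = np.zeros(h)   # weights N_w^(1), biases N_wi0^(1)
W2 = rng.normal(size=(K, h)); b2 = np.zeros(K)   # weights N_w^(2), bias N_w0^(2)

def relu(x):                 # sigma_1: ReLU activation
    return np.maximum(0.0, x)

def softmax(x):              # sigma_2: softmax over K classes
    e = np.exp(x - x.max())
    return e / e.sum()

def forward(Nu):
    theta1 = relu(W1 @ Nu + b1)        # hidden neurons theta_1i, Eq. (7)
    return softmax(W2 @ theta1 + b2)   # output N_y, Eq. (8)

Ny = forward(rng.normal(size=n))       # class probabilities for one input vector
```

The softmax output sums to one, so the predicted class is simply the index of the largest component.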

Given a set of training data including the input vectors $\mathbf{N_u^i}$, $i = 1, 2, \ldots, N_T$ and the corresponding class labels (i.e., target vectors $\mathbf{N_t^i}$), the NN model is trained by minimizing the following error function (i.e., loss function):

$$E(\mathbf{N_w}) = \frac{1}{2} \sum_{i=1}^{N_T} \left| \mathbf{N_y^i}(\mathbf{N_u^i}, \mathbf{N_w}) - \mathbf{N_t^i} \right|^2 \tag{9}$$

where $\mathbf{N_y^i}(\mathbf{N_u^i}, \mathbf{N_w})$ is the predicted class for the input $\mathbf{N_u^i}$ under the weights $\mathbf{N_w}$. The above nonlinear optimization problem is solved using the stochastic gradient descent (SGD) method, in which backpropagation is utilized to calculate the gradient of $E(\mathbf{N_w})$. The weight matrix $\mathbf{N_w}$ is updated by the following equation:

$$\mathbf{N_w} := \mathbf{N_w} - \eta \nabla E(\mathbf{N_w}) \tag{10}$$

where $\eta$ is the learning rate that controls the speed of convergence. Additionally, to avoid over-fitting during the training process, k-fold cross-validation is employed: the original dataset is randomly partitioned into $k$ subsets, of which $k - 1$ are used for training and the remaining one for validation, and early stopping is activated once the error on the validation set stops decreasing.
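A minimal sketch of the update of Equation (10) with validation-based early stopping, using a hand-differentiable linear model in place of the NN; the synthetic data, split, and learning rate are all assumptions for illustration:

```python
import numpy as np

# Gradient-descent update of Eq. (10) with early stopping on a validation split.
# A linear model stands in for the NN so the gradient can be written by hand.
rng = np.random.default_rng(1)
X = rng.normal(size=(200, 3))
w_true = np.array([1.0, -2.0, 0.5])
y = X @ w_true + 0.01 * rng.normal(size=200)

X_tr, y_tr = X[:160], y[:160]            # training split
X_val, y_val = X[160:], y[160:]          # held-out validation split

def loss(w, X, y):                       # E(N_w) = 1/2 sum |pred - target|^2, Eq. (9)
    return 0.5 * np.sum((X @ w - y) ** 2)

w, eta = np.zeros(3), 1e-3               # eta: learning rate in Eq. (10)
best_val, patience = np.inf, 0
for epoch in range(500):
    grad = X_tr.T @ (X_tr @ w - y_tr)    # gradient of E on the training set
    w = w - eta * grad                   # N_w := N_w - eta * grad E(N_w), Eq. (10)
    val = loss(w, X_val, y_val)
    if val < best_val - 1e-12:
        best_val, patience = val, 0
    else:
        patience += 1
        if patience >= 5:                # early stopping: validation error stalled
            break
```

For the full NN, the same loop applies with the hand-written gradient replaced by backpropagation and the batch gradient replaced by stochastic mini-batches.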

Finally, the classification accuracy of the validation dataset is utilized to demonstrate the performance of the neural network since the validation dataset is independent of the training dataset and is not used in training the NN model. Specifically, the classification accuracy (i.e., the test accuracy) of the trained NN model is obtained by the following equation:

$$N_{acc} = \frac{n_c}{n_{val}} \tag{11}$$

where $n_c$ is the number of data samples with correctly predicted classes, and $n_{val}$ is the total number of data samples in the validation dataset. In general, the NN performance depends on many factors, e.g., the size of the dataset, the number of hidden layers and nodes, and the intensity and the amount of disturbance applied [22–24]. In Remark 1, the method for determining the number of layers and nodes is introduced.

 

**Figure 1.** Basic structure of a feed-forward neural network used for cyber-attack detection.

In this paper, the NN is developed to derive a model $M$ that classifies three classes: the nominal closed-loop system, the closed-loop system with disturbances, and the closed-loop system under cyber-attacks. A large dataset of time-varying states for various initial conditions (i.e., dynamic trajectories) of the above three cases is used as the input to the neural network, and the output of the neural network is the predicted class. Since the feed-forward NN is a static model with a fixed input dimension (i.e., a fixed time length), while the detection method should be applied during the dynamic operation of the system of Equation (1), multiple NN models with input datasets of various sizes (i.e., various time lengths) are used for real-time detection of cyber-attacks until the amount of data accumulated since the beginning of operation reaches the time length chosen for the remainder of the operating time. Specifically, given a training dataset of time-series state vectors (i.e., closed-loop trajectories) $N_u \in \mathbf{R}^{n \times T}$, where $n$ is the number of states and $T$ is the number of sampling steps of each trajectory, the NN models are obtained and applied as follows: (1) the NNs are trained with data corresponding to time lengths from the initial time to $T$ sampling steps, in intervals of $N_a$ sampling steps, i.e., the $i$th NN model $M_i$ is trained using data from $t = 0$ to $t = iN_a$, where $i = 1, 2, \ldots, T/N_a$ and $T$ is an integer multiple of $N_a$; (2) when incorporating the NN-based detection system in MPC, real-time state measurement data can be readily fed into the corresponding NN model $M_i$ to check whether a cyber-attack has occurred so far.
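The growing-window scheme above can be sketched as follows; the array shapes and the random stand-in trajectories are illustrative assumptions:

```python
import numpy as np

# Sketch of the growing-window scheme: model M_i is trained on the first
# i*N_a sampling steps of each trajectory. Shapes are illustrative.
n, T, Na = 4, 20, 5                    # states, steps per trajectory, window step
trajs = np.random.default_rng(2).normal(size=(100, n, T))  # 100 stand-in trajectories

def training_inputs(trajs, i, Na):
    """Flattened inputs for model M_i: data from t = 0 to t = i*N_a."""
    window = trajs[:, :, : i * Na]     # first i*N_a sampling steps of each trajectory
    return window.reshape(len(trajs), -1)

def model_index(k, Na, T):
    """Which model M_i to query online at sampling step k (capped at T/N_a)."""
    return min(max(k // Na, 1), T // Na)

X1 = training_inputs(trajs, 1, Na)     # inputs for M_1: n * N_a features
X4 = training_inputs(trajs, 4, Na)     # inputs for M_4: n * 4*N_a features
```

Online, `model_index` selects the largest model whose input window fits the data accumulated so far; once $k \ge T$, the final model $M_{T/N_a}$ is used for the remainder of operation.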

**Remark 1.** *With an appropriate structure (i.e., number of layers and hidden neurons) of the neural network, the weight matrix $\mathbf{N_w}$ is calculated by Equation (10) and is then used to derive the classification accuracy of Equation (11). However, in general, there is no systematic method to determine the structure of a neural network, since it depends strongly on the number of training data samples and on the complexity of the model needed for classification. Therefore, in practice, the neural network is initiated with one hidden layer containing a few hidden neurons. If the classification result is unsatisfactory, the number of hidden neurons is increased, and further layers with appropriate regularization are added to improve the performance.*

**Remark 2.** *It is noted that the above classification accuracy of the NN model represents the ratio of the number of correct predictions to the total number of predictions for all classes. If we only consider the case of binary classification (i.e., whether the system is under cyber-attacks or not), sensitivity (also called recall or true positive rate) and specificity (also called true negative rate) are also useful measures. Specifically, sensitivity measures the proportion of actual cyber-attacks that are correctly identified as such, while specificity measures the proportion of actual non-cyber-attacks that are correctly identified as such. Therefore, in the presence of multiple types of cyber-attacks or disturbances, it becomes straightforward to learn the performance of the NN-based method to detect true cyber-attacks via sensitivity and specificity.*
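The sensitivity and specificity discussed in Remark 2 follow directly from the four confusion counts; a minimal sketch for the binary attack/no-attack case:

```python
# Sensitivity and specificity for binary attack labels (1 = attack, 0 = normal).
def sens_spec(y_true, y_pred):
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    sensitivity = tp / (tp + fn)       # true positive rate: attacks caught
    specificity = tn / (tn + fp)       # true negative rate: normal runs cleared
    return sensitivity, specificity
```

Unlike the overall accuracy of Equation (11), these two rates separate missed attacks from false alarms, which matter very differently in a safety-critical plant.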

#### *3.5. Sliding Detection Window*

Since the classification accuracy of an NN is not perfect, false alarms may be triggered by a one-time detection (i.e., a non-cyber-attack case may be identified as a cyber-attack). In order to reduce the false alarm rate, a detection indicator $D_i$ generated by each sub-model $M_i$ and a sliding detection window of length $N_s$ are proposed as follows:

$$D\_i = \begin{cases} 1, & \text{if attack is detected by } M\_i \\ 0, & \text{if no attack is detected by } M\_i \end{cases} \tag{12}$$

Based on the detection indicator $D_i$ computed every $N_a$ sampling steps, the weighted sum of detection indicators within the sliding detection window, $D_I$, shown in Figure 2, is calculated at $t = t_k = k\Delta$ as follows:

$$D_I = \sum_{j=\lceil (k-N_s+1)/N_a \rceil}^{\lfloor k/N_a \rfloor} \gamma^{\lfloor k/N_a \rfloor - j} D_j \tag{13}$$

where $\gamma$ is a detection factor that gives more weight to recent detections within the sliding window, because the classification accuracy of the NN increases as more data is used for training. If $D_I \ge D_{TH}$, where $D_{TH}$ is a threshold indicating a real cyber-attack in the closed-loop system, then the cyber-attack is confirmed and reported by the NN-based detection system; otherwise, the detection system remains silent and the sliding window is rolled forward one sampling time. To balance false alarms and missed detections, the threshold $D_{TH}$ is determined via extensive closed-loop simulations under cyber-attacks so as to achieve a desired detection rate.
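Equation (13) can be computed directly; the window length $N_s$, activation interval $N_a$ and factor $\gamma$ below are assumed values, and with $\gamma < 1$ the most recent indicator gets weight one while older ones are discounted:

```python
import math

# Sketch of the weighted indicator sum of Eq. (13); N_s, N_a, gamma are assumed.
def detection_sum(D, k, Ns=10, Na=2, gamma=0.8):
    """D[j] is the indicator produced by model M_j (D[0] is an unused filler);
    returns D_I at sampling step k."""
    j_lo = math.ceil((k - Ns + 1) / Na)    # oldest model index inside the window
    j_hi = k // Na                         # most recent model index
    return sum(gamma ** (j_hi - j) * D[j]
               for j in range(max(j_lo, 1), j_hi + 1))
```

Comparing the returned $D_I$ against $D_{TH}$ then implements the confirmation rule: a single spurious $D_j = 1$ contributes at most a discounted term, while several consecutive detections push $D_I$ over the threshold.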

Additionally, since there is no guaranteed feasible control action that can drive the state back towards the origin once the state of the system of Equation (1) is outside the stability region $\Omega_\rho$ (due to the way $\phi_n$ and $\Omega_\rho$ are characterized), it is also necessary to check whether the state is in $\Omega_\rho$, especially when cyber-attacks have occurred but have not yet been detected. Therefore, to prevent the system state from entering a region in state-space where closed-loop stability is not guaranteed, the boundedness of the state vector within the stability region is also checked, using the state measurement from the redundant, secure sensors, at the time when $D_i = 1$. If the state $x$ has already left $\Omega_\rho$, closed-loop stability is no longer guaranteed, and in this case further safety system components (e.g., physical safety devices) need to be activated to avoid dangerous operation [25]. However, if $x \in \Omega_\rho$, the state measurement is taken from the redundant, secure sensors instead of the original sensors, to avoid deterioration of stability under the potential cyber-attack indicated by $D_i = 1$.

**Figure 2.** The sliding detection window with detection activated every *Na* sampling steps, where triangles represent the detection indicator *Di* and the box with length *Ns* represents the sliding detection window.

**Remark 3.** *The sliding window of length $N_s$ is employed to reduce the false alarm rate. Considering that the derived classification accuracy is not perfect, the idea behind the sliding detection window is that a cyber-attack is confirmed only if it has been detected several times in a row, rather than by a one-time detection. The length of the sliding window $N_s$ balances detection efficiency against the false alarm rate. Specifically, a larger $N_s$ and a higher detection threshold $D_{TH}$ ($D_I \ge D_{TH}$ within the sliding detection window represents the confirmation of a cyber-attack) lead to a longer detection time but a lower false alarm rate, while a smaller $N_s$ and a lower $D_{TH}$ have the opposite effect. Therefore, $N_s$ and $D_{TH}$ should be chosen carefully to achieve a balanced performance between detection efficiency and false alarm rate.*

**Remark 4.** *The above supervised learning-based cyber-attack detection method is able to distinguish the normal operation of the system of Equation (1) from abnormal operation under cyber-attacks, provided that a large amount of labeled data is available for training. However, for unknown cyber-attacks that were never used for training, detection is not guaranteed. Specifically, if there exists an unknown cyber-attack that is distinct from the trained cyber-attacks, the NN-based detection method may not be able to identify it as a cyber-attack. In this case, an unsupervised learning-based detection method may achieve better performance by clustering unknown cyber-attack data into a new class. However, if the unknown cyber-attack shares similar properties (e.g., a similar attack mechanism) with a trained cyber-attack, the NN method may still be able to detect it and classify it as one of the available classes. For example, it is demonstrated in the section "Application to a chemical process example" that the unknown surge cyber-attack can still be detected by the NN-based detection system trained for min-max cyber-attacks, because of the similarity between these two cyber-attacks.*

**Remark 5.** *Since different types of cyber-attacks may have various purposes, targeted sensors and attack durations, the dynamic behavior of a closed-loop system varies with different cyber-attacks, which is ultimately reflected in the state data. Besides the detection of cyber-attacks, the above NN-based detection method is also able to recognize the types of cyber-attacks by training the NN model with data of various types of cyber-attacks labeled as different classes. As a result, the NN model can not only detect the occurrence of cyber-attacks, but also identify the type of a cyber-attack if the data of that particular cyber-attack has been utilized for training.*

#### **4. Lyapunov-Based MPC (LMPC)**

To cope with the threats of the above sensor cyber-attacks, a feedback control method that accounts for the corruption of some sensor measurements should be designed by defenders to mitigate the impact of cyber-attacks and still stabilize the system of Equation (1) at its steady-state. Based on the assumed existence of a Lyapunov function $V(x)$ and a controller $u = \Phi(x)$ that satisfy Equation (2), the LMPC that utilizes the accurate measurements from redundant, secure sensors is formulated as the following optimization problem:

$$\mathcal{J} = \min_{u \in S(\Delta)} \int_{t_k}^{t_{k+N}} L_t(\tilde{x}(t), u(t))\, dt \tag{14a}$$

$$\text{s.t.} \quad \dot{\tilde{x}}(t) = f(\tilde{x}(t)) + g(\tilde{x}(t))u(t) \tag{14b}$$

$$\tilde{x}(t_k) = x(t_k) \tag{14c}$$

$$u(t) \in U, \ \forall\, t \in [t_k, t_{k+N}) \tag{14d}$$

$$\dot{V}(x(t_k), u(t_k)) \le \dot{V}(x(t_k), \Phi(x(t_k))), \ \text{if } V(x(t_k)) > \rho_{\min} \tag{14e}$$

$$V(\tilde{x}(t)) \le \rho_{\min}, \ \forall\, t \in [t_k, t_{k+N}), \ \text{if } V(x(t_k)) \le \rho_{\min} \tag{14f}$$

where $\tilde{x}(t)$ is the predicted state trajectory, $S(\Delta)$ is the set of piecewise constant functions with period $\Delta$, and $N$ is the number of sampling periods in the prediction horizon. $\dot{V}(x(t_k), u(t_k))$ represents the time derivative of $V(x)$, i.e., $\frac{\partial V}{\partial x}(f(\tilde{x}(t)) + g(\tilde{x}(t))u(t))$. We assume that the states of the closed-loop system are measured at each sampling time instance and used as the initial condition of the LMPC optimization problem at the next sampling step. Specifically, based on the measured state $x(t_k)$ at $t = t_k$, the above optimization problem is solved to obtain the optimal solution $u^*(t)$ over the prediction horizon $t \in [t_k, t_{k+N})$. The first control action of $u^*(t)$, i.e., $u^*(t_k)$, is sent to the control actuators to be applied over the next sampling period. Then, at the next sampling time $t_{k+1} := t_k + \Delta$, the optimization problem is solved again, and the horizon is rolled forward one sampling time.

In the optimization problem of Equation (14), the objective function of Equation (14a) is the integral of $L_t(\tilde{x}(t), u(t))$ over the prediction horizon, where the function $L_t(x, u)$ usually takes a quadratic form (i.e., $L_t(x, u) = x^T R x + u^T Q u$, where $R$ and $Q$ are positive definite matrices). The constraint of Equation (14b) is the nominal system of Equation (1) (i.e., with $w(t) \equiv 0$), used to predict the evolution of the closed-loop state. Equation (14c) defines the initial condition of the nominal system of Equation (14b), and Equation (14d) defines the input constraints over the prediction horizon. The constraint of Equation (14e) requires that $V(\tilde{x})$ decrease at least at the rate achieved under $\Phi(x)$ at $t_k$ when $V(x(t_k)) > \rho_{\min}$. However, if $x(t_k)$ enters a small neighborhood around the origin, $\Omega_{\rho_{\min}} := \{x \in \phi_n \mid V(x) \le \rho_{\min}\}$, in which $\dot{V}$ is not required to be negative due to the sample-and-hold implementation of the LMPC, the constraint of Equation (14f) is activated to maintain the state inside $\Omega_{\rho_{\min}}$ thereafter.
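A minimal receding-horizon sketch of Equation (14), with the continuous optimization replaced by enumeration over a coarse input grid and an assumed scalar system ($\dot{x} = x + u$, $V(x) = x^2$, quadratic stage cost). This illustrates the constraint logic of (14a)-(14f), not the paper's implementation:

```python
import numpy as np

# Minimal LMPC sketch for a hypothetical scalar system x_dot = x + u, V(x) = x^2.
# System, horizon, bounds, and costs are illustrative assumptions.

def f(x): return x                    # open-loop unstable dynamics (assumed)
def g(x): return 1.0
def V(x): return x ** 2
def Vdot(x, u): return 2 * x * (f(x) + g(x) * u)    # (dV/dx)(f + g u)

def phi(x, u_min=-4.0, u_max=4.0):
    """Saturated stabilizing law used in constraint (14e); simple proportional."""
    return float(np.clip(-2.0 * x, u_min, u_max))

def lmpc_step(x_k, N=5, dt=0.1, u_min=-4.0, u_max=4.0, rho_min=0.01):
    """Approximate Eq. (14): enumerate first-move candidates on a coarse grid
    (tail of the prediction uses phi); return the applied action u*(t_k)."""
    candidates = np.linspace(u_min, u_max, 17)      # crude stand-in for a solver
    best_u, best_cost = None, np.inf
    for u0 in candidates:
        if V(x_k) > rho_min and Vdot(x_k, u0) > Vdot(x_k, phi(x_k)):
            continue                                # contractive constraint (14e)
        x, u, cost, feasible = x_k, u0, 0.0, True
        for _ in range(N):
            cost += (x * x + 0.1 * u * u) * dt      # quadratic stage cost, (14a)
            x = x + dt * (f(x) + g(x) * u)          # nominal model (14b), Euler step
            if V(x_k) <= rho_min and V(x) > rho_min:
                feasible = False                    # containment constraint (14f)
                break
            u = phi(x)                              # tail policy for the prediction
        if feasible and cost < best_cost:
            best_cost, best_u = cost, u0
    return best_u

# Closed loop: apply u*(t_k) for one sampling period, then re-solve (receding horizon).
x = 1.0
for _ in range(40):
    x = x + 0.1 * (f(x) + g(x) * lmpc_step(x))
x_final = x
```

Despite the crude grid, the closed-loop state decays into the small terminal region, mirroring the role of the two Lyapunov constraints: (14e) forces decay outside $\Omega_{\rho_{\min}}$, and (14f) keeps the state trapped once inside.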

When a cyber-attack is detected by $D_i = 1$ but not yet confirmed by $D_I \ge D_{TH}$, the LMPC of Equation (14) uses the state measurement from the redundant, secure sensors instead of the original sensors as the initial condition $x(t_k)$ until the next instance of detection. However, if the cyber-attack is finally confirmed by $D_I \ge D_{TH}$, the misbehaving sensor is isolated, and the LMPC of Equation (14) uses the state measurement from the secure sensors instead of the compromised measurement as the initial condition $x(t_k)$ for the remaining time of process operation. The structure of the entire cyber-attack detection and control system is shown in Figure 3.

**Figure 3.** Basic structure of the proposed integrated NN-based detection and LMPC control method.

If the cyber-attack is detected and confirmed before the closed-loop state is driven out of the stability region, it follows that the closed-loop state is always bounded in the stability region <sup>Ω</sup>*ρ* thereafter and ultimately converges to a small neighborhood <sup>Ω</sup>*ρmin* around the origin for any *x*0 ∈ <sup>Ω</sup>*ρ* under the LMPC of Equation (14). The detailed proof can be found in [11]. An example trajectory is shown in Figure 4.

**Figure 4.** A schematic representing the stability region <sup>Ω</sup>*ρ* and the small neighborhood <sup>Ω</sup>*ρmin* around the origin. The trajectory first moves away from the origin due to the cyber-attack and finally re-converges to <sup>Ω</sup>*ρmin* under the LMPC of Equation (14) after the detection of the cyber-attack by the proposed detection scheme.

**Remark 6.** *It is noted that the speed of detection (which depends heavily on the size of the input data to the NN, the number of hidden layers and the type of activation functions) plays an important role in stabilizing the closed-loop system of Equation (1) since the operation of the closed-loop system under the LMPC of Equation (14) becomes unreliable after cyber-attacks occur. In other words, if we can detect cyber-attacks in a short time, the LMPC can switch to redundant, secure sensors and still be able to stabilize the system at the origin before the state leaves the stability region* <sup>Ω</sup>*ρ. Additionally, the probability of closed-loop stability can be derived based on the classification accuracy of the NN-based detection method and its activation frequency Na. Specifically, given the classification accuracy pnn* ∈ [0, 1]*, if the NN-based detection system is activated every Na* = 1 *sampling step, the probability of the cyber-attack being detected at each sampling step (i.e., Di* = 1*) is equal to pnn, which implies that the probability of closed-loop stability* ∀*<sup>x</sup>*0 ∈ <sup>Ω</sup>*ρ is no less than pnn. Moreover, for safety reasons, the region of initial conditions can be chosen as a conservative sub-region (i.e.,* <sup>Ω</sup>*ρe* := {*x* ∈ **R***n* | *<sup>V</sup>*(*x*) ≤ *ρe*}*, where ρe* < *ρ) inside the stability region to avoid the rapid divergence of states under cyber-attacks and improve closed-loop stability. For example, let ρe* = *max*{*V*(*x*(*t*)) | *<sup>V</sup>*(*x*(*<sup>t</sup>* + Δ)) ≤ *ρ*, *u* ∈ *U*} *such that* ∀*x*(*tk*) ∈ <sup>Ω</sup>*ρe, <sup>x</sup>*(*tk*+<sup>1</sup>) *still stays in* <sup>Ω</sup>*ρ despite a missed detection of cyber-attacks. Therefore, the probability of closed-loop stability* ∀*<sup>x</sup>*0 ∈ <sup>Ω</sup>*ρe under the LMPC of Equation (14) reaches* 1 − (1 − *pnn*)<sup>2</sup> *(i.e., the probability of cyber-attacks being detected within two sampling periods).*
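As a quick numerical illustration of the bound 1 − (1 − *pnn*)<sup>2</sup> at the end of the remark (the accuracy value 0.95 below is an assumed example, not a value from the paper):

```python
def stability_probability(p_nn, k):
    """Probability that the attack is detected within k sampling periods,
    assuming independent detection attempts with per-step accuracy p_nn."""
    return 1.0 - (1.0 - p_nn) ** k

# with an assumed p_nn = 0.95, detection within two sampling periods:
print(round(stability_probability(0.95, 2), 4))  # 0.9975
```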

**Remark 7.** *It is demonstrated in [11] that in the presence of sufficiently small bounded disturbances (i.e.,* |*w*(*t*)| ≤ *θ), closed-loop stability is still guaranteed for the system of Equation (1) under the sample-and-hold implementation of the LMPC of Equation (14) with a sufficiently small sampling period* Δ*. In this case, it is undesirable to treat the disturbance as a cyber-attack and trigger the false alarm. Therefore, the detection system should account for the disturbance case and have the capability to distinguish cyber-attacks from disturbances (i.e., the system with disturbances should be classified as a distinct class or treated as the nominal system).*

#### **5. Application to a Chemical Process Example**

In this section, we utilize a chemical process example to illustrate the application of the proposed detection and control methods for potential cyber-attacks. Consider a well-mixed, non-isothermal continuous stirred tank reactor (CSTR) where an irreversible first-order exothermic reaction takes place. The reaction converts the reactant *A* to the product *B* via the chemical reaction *A* → *B*. A heating jacket that supplies or removes heat from the reactor is used. The CSTR dynamic model derived from material and energy balances is given below:

$$\frac{dC\_A}{dt} = \frac{F}{V\_L}(C\_{A0} - C\_A) - k\_0 e^{-E/RT} C\_A \tag{15a}$$

$$\frac{dT}{dt} = \frac{F}{V\_L}(T\_0 - T) - \frac{\Delta H k\_0}{\rho C\_p} e^{-E/RT} C\_A + \frac{Q}{\rho C\_p V\_L} \tag{15b}$$

where *CA* is the concentration of reactant *A* in the reactor, *T* is the temperature of the reactor, *Q* denotes the heat supply/removal rate, and *VL* is the volume of the reacting liquid in the reactor. The feed to the reactor contains the reactant *A* at a concentration *CA*0, temperature *T*0, and volumetric flow rate *F*. The liquid has a constant density of *ρ* and a heat capacity of *Cp*. *k*0, *E* and Δ*H* are the reaction pre-exponential factor, activation energy and the enthalpy of the reaction, respectively. Process parameter values are listed in Table 1. The control objective is to operate the CSTR at the equilibrium point (*CAs*, *Ts*)=(0.57 kmol/m3, 395.3 K) by manipulating the heat input rate Δ*Q* = *Q* − *Qs*, and the inlet concentration of species *A*, Δ*CA*0 = *CA*0 − *CA*<sup>0</sup>*s*. The input constraints for Δ*Q* and Δ*CA*0 are |Δ*Q*| ≤ 0.0167 kJ/min and |Δ*CA*0| ≤ 1 kmol/m3, respectively.

**Table 1.** Parameter values of the CSTR.


To place Equation (15) in the form of the class of nonlinear systems of Equation (1), deviation variables are used in this example, such that the equilibrium point of the system is at the origin of the state-space. *x<sup>T</sup>* = [*CA* − *CAs T* − *Ts*] represents the state vector in deviation variable form, and *u<sup>T</sup>* = [Δ*CA*0 Δ*Q*] represents the manipulated input vector in deviation variable form.

The explicit Euler method with an integration time step of *hc* = 10−<sup>5</sup> min is applied to numerically simulate the dynamic model of Equation (15). The nonlinear optimization problem of the LMPC of Equation (14) is solved using the IPOPT software package [26] with the sampling period Δ = 10−<sup>3</sup> min.
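The explicit-Euler simulation of the CSTR model can be sketched as follows. The parameter values below are hypothetical placeholders chosen only for illustration (the actual values are those of Table 1); units follow the model of Equation (15).

```python
import numpy as np

# Hypothetical parameter values for illustration only (see Table 1 for the
# actual values); units follow the model of Equation (15).
F, V_L = 5.0, 1.0                      # m^3/min, m^3
T0 = 300.0                             # feed temperature, K
k0, E, R = 8.46e6, 5.0e4, 8.314        # 1/min, kJ/kmol, kJ/(kmol K)
dH, rho, Cp = -1.15e4, 1000.0, 0.231   # kJ/kmol, kg/m^3, kJ/(kg K)

def cstr_rhs(x, u):
    """Right-hand side of Equation (15); x = [C_A, T], u = [C_A0, Q]."""
    CA, T = x
    CA0, Q = u
    r = k0 * np.exp(-E / (R * T)) * CA                   # reaction rate
    dCA = F / V_L * (CA0 - CA) - r                       # Equation (15a)
    dT = F / V_L * (T0 - T) - dH * r / (rho * Cp) \
         + Q / (rho * Cp * V_L)                          # Equation (15b)
    return np.array([dCA, dT])

def euler_step(x, u, hc=1e-5):
    """One explicit-Euler step with integration time step hc = 1e-5 min."""
    return x + hc * cstr_rhs(x, u)

# integrate over one sampling period Delta = 1e-3 min (100 Euler steps)
x = np.array([0.57, 395.3])            # operating point of the example
for _ in range(100):
    x = euler_step(x, u=np.array([1.0, 0.0]))
```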

We construct a Control Lyapunov Function using the standard quadratic form *<sup>V</sup>*(*x*) = *<sup>x</sup>TPx*, with the following positive definite *P* matrix:

$$P = \begin{bmatrix} 9.35 & 0.41 \\ 0.41 & 0.02 \end{bmatrix} \tag{16}$$

Under the LMPC of Equation (14) without cyber-attacks, closed-loop stability is achieved for the nominal system of Equation (15) in the sense that the closed-loop state is always bounded in the stability region <sup>Ω</sup>*ρ* with *ρ* = 0.2 and ultimately converges to <sup>Ω</sup>*ρmin* with *ρmin* = 0.002 around the origin. However, if a min-max cyber-attack is added to tamper with the temperature sensor measurement of the system of Equation (15), closed-loop stability is no longer guaranteed. Specifically, the min-max cyber-attack is designed to be of the following form:
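With the *P* matrix of Equation (16), checking whether a state lies in the stability region <sup>Ω</sup>*ρ* is a direct evaluation of *V*(*x*) = *xTPx*:

```python
import numpy as np

P = np.array([[9.35, 0.41],
              [0.41, 0.02]])   # matrix of Equation (16)
rho, rho_min = 0.2, 0.002      # region levels used in this example

def V(x):
    """Control Lyapunov Function V(x) = x^T P x."""
    x = np.asarray(x, dtype=float)
    return float(x @ P @ x)

def in_stability_region(x):
    """True if x lies in Omega_rho = {x | V(x) <= rho}."""
    return V(x) <= rho

x0 = np.array([-0.25, 3.0])    # initial condition used in Figures 5 and 6
print(round(V(x0), 4), in_stability_region(x0))  # 0.1494 True
```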

$$\bar{x}\_1 = x\_1 \tag{17a}$$

$$\bar{x}\_2 = \min\{x\_2 \in \mathbf{R} \mid \mathbf{x}^T P \mathbf{x} \le \rho\} \tag{17b}$$

where *x*1 = *CA* − *CAs*, *x*2 = *T* − *Ts*, and *x*¯1, *x*¯2 are the corresponding state measurements under min-max cyber-attacks. In this example, the min-max cyber-attack of Equation (17) is designed such that the measurement of concentration remains unchanged, and the measurement of temperature is tampered to be the minimum value that keeps the state at the boundary of the stability region <sup>Ω</sup>*ρ*.
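With the quadratic *V* of Equation (16), the tampered measurement of Equation (17b) reduces to the smaller root of a scalar quadratic in *x*2 (a sketch under that assumption; the function name is ours):

```python
import math

P = [[9.35, 0.41], [0.41, 0.02]]   # matrix of Equation (16)
rho = 0.2                          # stability region level

def minmax_attack_x2(x1):
    """Minimum x2 such that [x1, x2]^T P [x1, x2] = rho (Equation (17b)).

    Expanding x^T P x = rho gives
        p22*x2^2 + 2*p12*x1*x2 + (p11*x1^2 - rho) = 0,
    whose smaller root places the reported state on the boundary of
    Omega_rho."""
    p11, p12, p22 = P[0][0], P[0][1], P[1][1]
    a, b, c = p22, 2.0 * p12 * x1, p11 * x1 ** 2 - rho
    disc = b * b - 4.0 * a * c
    if disc < 0:
        raise ValueError("no real solution: |x1| too large for this rho")
    return (-b - math.sqrt(disc)) / (2.0 * a)

x2_bar = minmax_attack_x2(-0.25)
# verify the reported state sits on the boundary of Omega_rho
x = [-0.25, x2_bar]
Vx = x[0] * (P[0][0] * x[0] + P[0][1] * x[1]) \
   + x[1] * (P[1][0] * x[0] + P[1][1] * x[1])
assert abs(Vx - rho) < 1e-9
```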

In Figures 5 and 6, the temperature sensor measurement is intruded by a min-max cyber-attack at time *t* = 0.067 min. Without any cyber-attack detection system, it is shown in Figure 5 that the LMPC of Equation (14) keeps operating the system of Equation (15) using false sensor measurements blindly and finally drives the closed-loop state out of the stability region <sup>Ω</sup>*ρ*.

**Figure 5.** The state-space profile for the CSTR of Equation (15) under the LMPC of Equation (14) and under a min-max cyber-attack for the initial condition ( −0.25, 3).

**Figure 6.** The true state profile (*<sup>x</sup>*2 = *T* − *Ts*) and the sensor measurements (*x*¯2 = *T*¯ − *Ts*) of the closed-loop system under the LMPC of Equation (14) and under a min-max cyber-attack for the initial condition ( −0.25, 3), where the vertical dotted line shows the time the cyber-attack is added.

To handle the min-max cyber-attack, the model-based detection system of Equation (5) and the NN-based detection method are applied to the system of Equation (15). The simulation results are shown in Figures 7–13. Subsequently, the application of the NN-based detection method to the system under other cyber-attacks and in the presence of disturbances is demonstrated in Figures 14–16. Specifically, we first demonstrate the application of the model-based detection system of Equation (5) and of the LMPC of Equation (14), where *STH* = 1 and *b* = −0.5 are chosen through closed-loop simulations. In Figure 7, the min-max cyber-attack of Equation (17) is added at 0.06 min and is detected at 0.1 min, before the closed-loop state leaves <sup>Ω</sup>*ρ*. The variation of the CUSUM statistic *S*(*k*) is shown in Figure 8, in which *S*(*k*) remains at *b* when there is no cyber-attack and exceeds *STH* at 0.1 min. After the min-max cyber-attack is detected, the true states are obtained from redundant, secure sensors and the LMPC of Equation (14) drives the closed-loop state into <sup>Ω</sup>*ρmin*.
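The behavior of *S*(*k*) described above (resting at *b* with no attack, crossing *STH* under attack) can be sketched as a one-sided CUSUM update. This is an illustrative form, not the exact statistic of Equation (5); the residual sequence and the optional drift term are placeholders.

```python
def cusum_update(s_prev, residual, b=-0.5, drift=0.0):
    """One-sided CUSUM update: accumulate residuals, never dropping
    below the reset value b (b = -0.5 as in the closed-loop simulations)."""
    return max(b, s_prev + residual - drift)

def detect(residuals, s_th=1.0, b=-0.5):
    """Return the first index at which S(k) exceeds S_TH, or None."""
    s = b
    for k, r in enumerate(residuals):
        s = cusum_update(s, r, b=b)
        if s > s_th:
            return k
    return None
```

With zero residuals, *S*(*k*) stays pinned at *b*; once large residuals appear, the statistic climbs and crosses the threshold a few steps later.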

**Figure 7.** Closed-loop state profiles (*<sup>x</sup>*2 = *T* − *Ts*, *x*¯2 = *T*¯ − *Ts*) for the initial condition ( −0.25, 3) under the LMPC of Equation (14) and the model-based detection system.

**Figure 8.** The variation of *S*(*k*) for the initial condition ( −0.25, 3) under the LMPC of Equation (14) and the model-based detection system.

Next, the NN-based detection system and the LMPC of Equation (14) are implemented to mitigate the impact of cyber-attacks. The feed-forward NN model with two hidden layers is built in Python using the Keras library. Specifically, 3000 balanced time-series data samples of the closed-loop states of the nominal system, the system with disturbances, and the system under min-max cyber-attacks from *t* = 0 to *t* = 1 min are used to train the neural network to classify three classes, where class 0, 1, and 2 stand for the system under min-max cyber-attacks, the nominal system, and the system with disturbances, respectively. 3000 time-series data samples are sufficient to build the NN for the CSTR example: in our calculations, datasets smaller than 3000 samples led to lower classification accuracy, while increasing the dataset size beyond 3000 did not significantly improve the classification accuracy but increased the computation time. The 3000 data samples are split into 2000 training, 500 validation and 500 test samples, respectively. *<sup>V</sup>*(*x*) = *xTPx* is utilized as the input vector to the NN model. The structure of the NN model is listed in Table 2. Additionally, batch normalization is applied after each hidden layer to improve the performance of the NN model.
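A minimal NumPy sketch of the forward pass of such a feed-forward classifier is given below. The layer sizes, random weights, and ReLU activations are illustrative assumptions (the actual architecture is that of Table 2, built in Keras); only the two-hidden-layer shape and the three-class softmax output follow the text.

```python
import numpy as np

rng = np.random.default_rng(0)

def relu(z):
    return np.maximum(0.0, z)

def softmax(z):
    e = np.exp(z - z.max(axis=-1, keepdims=True))  # numerically stable
    return e / e.sum(axis=-1, keepdims=True)

# illustrative layer sizes: a window of V(x) values in, 3 classes out
sizes = [10, 32, 32, 3]
weights = [rng.standard_normal((m, n)) * 0.1 for m, n in zip(sizes, sizes[1:])]
biases = [np.zeros(n) for n in sizes[1:]]

def forward(x):
    """Two hidden ReLU layers followed by a softmax output layer
    (class 0: min-max attack, class 1: nominal, class 2: disturbances)."""
    h = x
    for W, b in zip(weights[:-1], biases[:-1]):
        h = relu(h @ W + b)
    return softmax(h @ weights[-1] + biases[-1])

probs = forward(rng.standard_normal(10))
```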


**Table 2.** Feed-forward NN model.

To apply the NN-based detection method, we first investigate the relationship between the classification accuracy of the NN and the size of the dataset. Specifically, assuming that the min-max cyber-attack occurs at a random sampling step before 0.1 min, the first NN model *Mt*=0.1 is trained at *t* = 0.1 min using the data of states from *t* = 0 to 0.1 min. As shown in Figure 9, early-stopping is activated at the 8th iteration (epoch) of training, when validation accuracy ceases to increase. The averaged classification accuracy at *t* = 0.1 min is obtained by training the same model *Mt*=0.1 10 times independently. The above process is repeated by increasing the size of the dataset by 0.02 min each time to derive the models for different time instances (i.e., *Mt*=0.12, *Mt*=0.14, ...). The minimum, the maximum and the averaged classification accuracy at each detection time instance are shown in Figure 10.

**Figure 9.** The variation of training accuracy and validation accuracy for the NN model *M*0.1, where early-stopping is activated at the 8th epoch of training.

Figure 10 shows that the averaged test accuracy increases as more state measurements are collected after the cyber-attack occurs, and reaches up to 95% when state measurements over a sufficiently long period are available. This suggests that the detection based on recent models is more reliable and deserves higher weights in the sliding window. The confusion matrix of the above NN for the three classes (the system under min-max cyber-attacks, the nominal system, and the system with disturbances) is given in Table 3. Additionally, besides the NN method, other supervised learning-based classification methods, including k-NN, SVM and random forests, are applied to the same dataset; the averaged test accuracies, sensitivities and specificities within 0.28 min are listed in Table 4.

**Figure 10.** The test accuracy of the neural network with respect to the size of the training and test data.

**Table 3.** Confusion matrix of the neural network.



**Table 4.** Comparison of the performance of different detection models.

When the detection of cyber-attacks is incorporated into the closed-loop system of Equation (15) under the LMPC of Equation (14), the detection system is called every *Na* = 5 sampling periods. The sliding window length is *Ns* = 15 sampling periods and the threshold for the detection indicator is *DTH* = 1.6. The detection system is activated from *t* = 0.1 min such that a desired test accuracy is achieved with enough data. The closed-loop state-space profiles under the NN-based detection system with the stability region <sup>Ω</sup>*ρ* check and the detection system without the <sup>Ω</sup>*ρ* check are shown in Figures 11 and 12.
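The confirmation logic combining the per-check detections *Di* into the indicator *DI* over the sliding window can be sketched as follows. The linearly increasing weighting is an assumption chosen to reflect the earlier observation that later models are more accurate; the exact weighting of the paper is not reproduced here.

```python
def detection_indicator(d_history, ns=15, dth=1.6):
    """Weighted sum of the NN detections D_i over the last ns sampling
    periods; later entries get larger weights since later models are
    more accurate. Returns (DI, confirmed)."""
    window = d_history[-ns:]
    n = len(window)
    # linearly increasing weights -- a simple illustrative choice
    weights = [(i + 1) / n for i in range(n)]
    di = sum(w * d for w, d in zip(weights, window))
    return di, di >= dth
```

For example, a window of all ones (every check flags an attack) yields *DI* well above *DTH* = 1.6, confirming the attack, while a single isolated alarm does not.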

Specifically, in Figure 11, it is demonstrated that without the stability region check, the closed-loop state leaves <sup>Ω</sup>*ρ* before the cyber-attack is confirmed. However, under the detection system with the boundedness check of <sup>Ω</sup>*ρ*, the closed-loop state is always bounded in <sup>Ω</sup>*ρ* by switching to redundant sensors at the first detection of the min-max cyber-attack. In Figure 12, it is shown that after the min-max cyber-attack is confirmed at *t* = 0.115 min, the misbehaving sensor is isolated and the LMPC of Equation (14) starts using the measurement of temperature from redundant sensors and re-stabilizes the system at the origin. The simulations demonstrate that it takes around 0.8 min for the closed-loop state trajectory to enter and remain in <sup>Ω</sup>*ρmin* under the LMPC of Equation (14) once the min-max cyber-attack is detected. The corresponding input profiles for the closed-loop system of Equation (1) under the NN-based detection system with the <sup>Ω</sup>*ρ* check are shown in Figure 13, where it is observed that a sharp change of Δ*CA*0 occurs from *t* = 0.095 min to *t* = 0.115 min due to the min-max cyber-attack.

**Figure 11.** The state-space profile for the closed-loop CSTR with the initial condition (0.24, −2.78), where a min-max cyber-attack is detected by the NN-based detection system and mitigated by the LMPC of Equation (14).

**Figure 12.** Closed-loop state profiles (*<sup>x</sup>*2 = *T* − *Ts*, *x*¯2 = *T*¯ − *Ts*) for the initial condition (0.24, −2.78) under the LMPC of Equation (14) and the NN-based detection system.

**Figure 13.** Manipulated input profiles (*<sup>u</sup>*1 = Δ*CA*0, *u*2 = Δ*Q*) for the initial condition (0.24, −2.78) under the LMPC of Equation (14) and the NN-based detection system.

Additionally, when both disturbances and min-max cyber-attacks are present, it is demonstrated that the NN-based detection system is still able to detect the min-max cyber-attack and re-stabilize the closed-loop system of Equation (15) in the presence of disturbances by following the same steps as in the pure-cyber-attack case. The bounded disturbances *w*1 and *w*2 are added to Equations (15a) and (15b) as Gaussian white noise with zero mean and variances *σ*1 = 0.1 kmol/(m<sup>3</sup> min) and *σ*2 = 2 K/min, respectively. Also, the disturbance terms are bounded as follows: |*<sup>w</sup>*1| ≤ 0.1 kmol/(m<sup>3</sup> min) and |*<sup>w</sup>*2| ≤ 2 K/min, respectively. The closed-loop state and input profiles are shown in Figures 14–16. Specifically, in Figure 15, it is demonstrated that the min-max cyber-attack occurs at 0.08 min and is confirmed at 0.115 min, before the closed-loop state leaves <sup>Ω</sup>*ρ*. In the presence of disturbances, the misbehaving sensor is isolated and the closed-loop states are driven to a neighborhood around the origin under the LMPC of Equation (14). In Figure 16, it is demonstrated that the manipulated inputs vary around the steady-state values (0, 0) when the closed-loop system reaches a neighborhood of the steady-state, due to the bounded disturbances.

**Figure 14.** The state-space profiles for the closed-loop CSTR with bounded disturbances and the initial condition (0.25, −3), where a min-max attack is detected by the NN-based detection system and mitigated by the LMPC of Equation (14).

**Figure 15.** State profiles (*<sup>x</sup>*2 = *T* − *Ts*, *x*¯2 = *T*¯ − *Ts*) for the closed-loop CSTR with bounded disturbances and the initial condition (0.25, −3) under the LMPC of Equation (14) and the NN-based detection system.

**Figure 16.** Manipulated input profiles (*<sup>u</sup>*1 = Δ*CA*0, *u*2 = Δ*Q*) for the closed-loop CSTR with bounded disturbances and the initial condition (0.25, −3) under the LMPC of Equation (14).

Lastly, since the surge cyber-attack of Equation (6) is undetectable by the model-based detection method, we also test the performance of the NN-based detection on the surge cyber-attack due to the similarity between surge cyber-attacks and min-max cyber-attacks (i.e., the surge cyber-attack works as a min-max attack for the first few sampling steps). It is demonstrated in simulations that 89% of surge cyber-attacks can be detected by the NN-based detection system that is trained for min-max cyber-attacks only, which implies that the NN-based detection method can be applied to many other cyber-attacks with similar properties.

Moreover, when cyber-attacks with different properties are taken into account, for example, the replay attack (i.e., *x*¯ ∈ *X*, where *X* is the set of past state measurements), the NN-based detection system can still efficiently distinguish between the types of cyber-attacks and disturbances after re-training the NN model. The new NN model is built with labeled training data for the min-max, replay, nominal and disturbance cases, for which the classification accuracy within 0.28 min is up to 85%. As a result, the NN-based detection model can be readily updated with the data of new cyber-attacks without changing the entire structure of the detection or control systems.
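A replay attack of this kind can be sketched as a buffer that records clean measurements while dormant and, once activated, replays them in place of the live signal (an illustrative form; the class name and cycling policy are our assumptions):

```python
from collections import deque

class ReplayAttack:
    """Record clean state measurements, then replay them once active."""

    def __init__(self):
        self.buffer = deque()
        self.active = False

    def measure(self, x_true):
        if not self.active:
            self.buffer.append(x_true)   # record while dormant
            return x_true
        # replay past measurements in recorded order, cycling the buffer
        x_replayed = self.buffer.popleft()
        self.buffer.append(x_replayed)
        return x_replayed
```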
