**5. Variable Dummy Value Model**

In the fixed dummy value model, an attacker may remain undetected because the dummy value is the same at every instant. The dummy value should therefore vary over time, which motivates the variable dummy value model. Here, the dummy value changes at every instant and depends on the actual value measured by the meter as well as on another power measurement in the system, so it changes whenever either of those values changes. A linear function computes the dummy value from the actual value measured by the meter and the value measured by another meter that is related to it. The function is known only to the control room. Whereas the fixed dummy value model embeds the precomputed dummy values into the meters, the variable dummy value model embeds the function used to calculate them. This work assumes that the intruder has no access to the meters, i.e., the intruder can access only the measurements sent to the control room. The following functions are used in the variable dummy value model to calculate the dummy values for the buses:

$$p'\_{v(y)}(i) = \beta\_{1vpi} p\_{v(y)}(i) + \beta\_{2vpi} p\_{viz(y)} + \beta\_{3vpi} \tag{25}$$
 
$$i = 1, 2, 3, \dots, b$$

$$q\_{v(y)}'(i) = \beta\_{1vqi} q\_{v(y)}(i) + \beta\_{2vqi} q\_{viz(y)} + \beta\_{3vqi} \tag{26}$$
 
$$i = 1, 2, 3, \dots, b$$

where *p′v*(*y*)(*i*) and *q′v*(*y*)(*i*) represent the *i*th entries of the dummy value vectors **p′v**(**y**) and **q′v**(**y**), respectively. *pv*(*y*)(*i*) and *qv*(*y*)(*i*) denote the *i*th entries of the actual value vectors **pv**(**y**) and **qv**(**y**), respectively. Similarly, *pviz*(*y*) and *qviz*(*y*) represent the active power and reactive power flowing through the first transmission line connected to the *i*th bus at the *y*th instant, respectively. *β*1*vpi*, *β*2*vpi*, *β*3*vpi*, *β*1*vqi*, *β*2*vqi*, and *β*3*vqi* are the constants that have to be learned to calculate the dummy values. Similarly, the dummy values of the active and reactive powers flowing through the transmission lines can be calculated using the following functions:

$$\begin{aligned} p'\_{vw(y)}(i) &= \beta\_{1vwpi} p\_{v(y)}(i) + \beta\_{2vwpi} p\_{w(y)}(i) + \beta\_{3vwpi} \\ &i = 1, 2, 3, \dots, t \end{aligned} \tag{27}$$

$$\begin{aligned} q'\_{vw(y)}(i) &= \beta\_{1vwqi} q\_{v(y)}(i) + \beta\_{2vwqi} q\_{w(y)}(i) + \beta\_{3vwqi} \\ &i = 1, 2, 3, \dots, t \end{aligned} \tag{28}$$

$$\begin{aligned} p'\_{wv(y)}(i) &= \beta\_{1wvpi} p\_{w(y)}(i) + \beta\_{2wvpi} p\_{v(y)}(i) + \beta\_{3wvpi} \\ &i = 1, 2, 3, \dots, t \end{aligned} \tag{29}$$

$$\begin{aligned} q'\_{wv(y)}(i) &= \beta\_{1wvqi} q\_{w(y)}(i) + \beta\_{2wvqi} q\_{v(y)}(i) + \beta\_{3wvqi} \\ &i = 1, 2, 3, \dots, t \end{aligned} \tag{30}$$

*p′vw*(*y*)(*i*) and *q′vw*(*y*)(*i*) denote the *i*th entries of the vectors **p′vw**(**y**) and **q′vw**(**y**), respectively, which contain the dummy values of the powers flowing through the transmission lines in the forward direction at the *y*th instant. *pw*(*y*)(*i*) and *qw*(*y*)(*i*) represent the active power and reactive power injected into the *i*th bus at the *y*th instant, respectively. *pw*(*y*)(*i*) and *qw*(*y*)(*i*) belong to **pv**(**y**) and **qv**(**y**), respectively. Similarly, *p′wv*(*y*)(*i*) and *q′wv*(*y*)(*i*) denote the *i*th entries of the vectors **p′wv**(**y**) and **q′wv**(**y**), respectively, which contain the dummy values of the active power and reactive power flowing through the transmission lines in the backward direction. As in Equations (25) and (26), each of these equations has its own set of constants.

Equations (25)–(30) give the dummy values at the *y*th instant. Because the dummy values in the variable dummy value model are calculated from the real-time measurement values, they change at every instant.
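As a minimal sketch of Equation (25), the following Python snippet computes the dummy value of the active power at one bus for three consecutive instants. The function name `bus_dummy_value`, the constants in `beta`, and the readings are hypothetical values chosen only for illustration; they are not taken from this work.

```python
def bus_dummy_value(p_actual, p_line, beta):
    """Linear dummy value in the form of Eq. (25).

    p_actual: actual power measured at the bus
    p_line:   power on the first line connected to that bus
    beta:     (beta1, beta2, beta3), the constants for this bus
    """
    beta1, beta2, beta3 = beta
    return beta1 * p_actual + beta2 * p_line + beta3


# Hypothetical constants and (bus power, line power) readings at three instants.
beta = (0.9, 0.05, 0.02)
readings = [(1.00, 0.40), (1.10, 0.42), (0.95, 0.38)]

# The dummy value follows the measurements, so it differs at every instant.
dummies = [bus_dummy_value(p, pz, beta) for p, pz in readings]
for (p, pz), d in zip(readings, dummies):
    print(f"actual={p:.2f} line={pz:.2f} dummy={d:.4f}")
```

The same function shape would apply to Equations (26)–(30), with the second input and the constants swapped for the corresponding quantities.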

A key point when selecting the dummy value is that the dummy value of a meter should be close to its actual value. If the difference between the two is too large, the attacker can identify the dummy value and construct an undetectable attack. However, the linear functions above may yield a dummy value that is far from the actual value, because the dummy value depends on two different power values and the readings of a given meter may vary widely with the load connected to the bus. If the variance of either of the two actual values is high over a whole day, the dummy value will not stay close to the actual value.

This problem can be mitigated by selecting appropriate values of the constants. The constants are chosen so that all dummy values of a specific power remain close to the actual value of that power over the whole day. For this purpose, a machine-learning technique, namely, multivariate linear regression (MLR), was used to find the best values of the constants. The MLR procedure is explained here for the constants of the equation that calculates the dummy values of the active power injected into the buses. In this case, the hypothesis is written as

$$\mathbf{g\_{\beta\_k}}(\mathbf{p\_k}) = \beta\_{1vpk} p\_v(k) + \beta\_{2vpk} p\_{vkz} + \beta\_{3vpk} \tag{31}$$

Here, **g**β**k**(**pk**) is a function of **pk** that is parameterized by β**k**. **pk** represents the *k*th input vector, where *k* = 1, 2, 3, ..., *b* and **pk** = [1 *pvkz* *pv*(*k*)]<sup>T</sup>. β**k** denotes the *k*th parameter vector, with β**k** = [*β*3*vpk* *β*2*vpk* *β*1*vpk*]<sup>T</sup>. *β*1*vpk*, *β*2*vpk*, and *β*3*vpk* are the constants to be learned for each dummy value of the active power injected into the buses. Therefore, for each dummy value, a different vector of constants is used. Based on this hypothesis, the cost function for the multivariate linear regression can be written as

$$J(\boldsymbol{\beta}\_{\mathbf{k}}) = \frac{1}{2mt} \sum\_{y=1}^{mt} \left( \left( \sum\_{f=1}^{3} \beta\_{kf} p\_{kf(y)} \right) - p\_{v(y)}(k) \right)^2 \tag{32}$$

Here, *mt* represents the total number of instants, i.e., the total number of training examples in this case. *pv*(*y*)(*k*) represents the output of the *y*th training example of the active power injected into the *k*th bus. The cost function must be minimized to obtain the best values of the parameters. For this purpose, the gradient descent algorithm was applied, whose update rule can be written as

$$\beta\_{kf} := \beta\_{kf} - \alpha \frac{1}{mt} \sum\_{y=1}^{mt} \left( \mathbf{g}\_{\boldsymbol{\beta}\_k} \left( \mathbf{p}\_{\mathbf{k}(y)} \right) - p\_{v(y)}(k) \right) p\_{kf(y)} \tag{33}$$

*βkf* represents the *f*th entry of the *k*th parameter vector, *pkf*(*y*) denotes the *f*th entry of the *k*th input vector at the *y*th instant, and *α* is the learning rate. The parameters are updated repeatedly, and after each update the cost is re-evaluated with the new parameters. This process is repeated until the cost converges, which yields the best values of the parameters.
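The update rule in Equation (33) can be sketched in plain Python as follows. This is an illustrative implementation under stated assumptions, not the code used in this work: the function names `cost` and `fit_constants`, the learning rate, the fixed iteration count (used here in place of a convergence test), and the sample measurements are all hypothetical.

```python
def predict(beta, row):
    """Hypothesis of Eq. (31): inner product of parameters and inputs."""
    return sum(b * x for b, x in zip(beta, row))


def cost(inputs, targets, beta):
    """Cost of Eq. (32): half the mean squared error over all instants."""
    m = len(inputs)
    return sum((predict(beta, row) - t) ** 2
               for row, t in zip(inputs, targets)) / (2 * m)


def fit_constants(inputs, targets, alpha=0.1, iterations=5000):
    """Batch gradient descent following the update rule of Eq. (33)."""
    m = len(inputs)
    n = len(inputs[0])
    beta = [0.0] * n
    for _ in range(iterations):
        errors = [predict(beta, row) - t for row, t in zip(inputs, targets)]
        grads = [sum(e * row[f] for e, row in zip(errors, inputs)) / m
                 for f in range(n)]
        # All entries are updated simultaneously from the same error vector.
        beta = [b - alpha * g for b, g in zip(beta, grads)]
    return beta


# Hypothetical measurements for one bus over six instants.
line_flow = [0.40, 0.42, 0.38, 0.45, 0.41, 0.36]
bus_power = [1.00, 1.10, 0.95, 1.20, 1.05, 0.90]

# Input vector [1, p_line, p_bus]; the target is the actual bus power, as in Eq. (32).
inputs = [[1.0, pz, p] for pz, p in zip(line_flow, bus_power)]
targets = bus_power

initial_cost = cost(inputs, targets, [0.0, 0.0, 0.0])
beta = fit_constants(inputs, targets)
final_cost = cost(inputs, targets, beta)
print("learned constants:", beta)
print("cost before and after:", initial_cost, final_cost)
```

A fixed iteration count keeps the sketch short; in practice the loop could instead stop once the change in cost between iterations falls below a chosen threshold.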

By adopting the same procedure, the constants for the remaining equations are also found and those constants are put in their respective functions to calculate the dummy values of the active and reactive power. Then, these functions are embedded into the meters for the calculation of the dummy values. The meters measure the actual values of power and then use those functions to calculate the dummy values of power to send them to the control room. These functions are only known to the control room.

In the control room, to detect the FDI attacks, these functions are used to recalculate the dummy value by using the actual values obtained from the measurement vector. Then, the recalculated dummy value is compared with the dummy value obtained from the measurement vector for attack detection. The following equations are used in the control room to compare the calculated dummy values and received dummy values of active and reactive powers injected into all the buses:

$$\begin{aligned} r\_{vp(y)}(j) &= p'\_{vr(y)}(j) - \left(\beta\_{1vpj} p\_{vr(y)}(j) + \beta\_{2vpj} p\_{vrjz(y)} + \beta\_{3vpj}\right) \\ j &= 1, 2, 3, \dots, b \end{aligned} \tag{34}$$

$$\begin{aligned} r\_{\text{vq}(y)}(j) &= q'\_{\text{vr}(y)}(j) - \left(\beta\_{1\text{vq}j} q\_{\text{vr}(y)}(j) + \beta\_{2\text{vq}j} q\_{\text{vr}j\text{z}(y)} + \beta\_{3\text{vq}j} \right) \\ &\quad j = 1, 2, 3, \dots, b \end{aligned} \tag{35}$$

The measurement vector received in the control room at the *y*th instant is **zdyr**. Here, *p′vr*(*y*)(*j*) and *q′vr*(*y*)(*j*) represent the *j*th entries of the received vectors **p′vr**(**y**) and **q′vr**(**y**), respectively, which contain the dummy values of the active power and reactive power received in the control room at the *y*th instant. *pvr*(*y*)(*j*) and *qvr*(*y*)(*j*) denote the *j*th entries of the received vectors **pvr**(**y**) and **qvr**(**y**), respectively, which contain the actual values of the active power and reactive power received in the control room at the *y*th instant. *pvrjz*(*y*) and *qvrjz*(*y*) are taken from the received measurement vector. *rvp*(*y*)(*j*) and *rvq*(*y*)(*j*) represent the *j*th entries of the residue vectors **rvp**(**y**) and **rvq**(**y**), respectively, which contain the residues for the active and reactive powers injected into the buses at the *y*th instant. Similarly, the equations for calculating the residues for the forward and backward powers flowing through the transmission lines are given by:

$$r\_{vwp(y)}(j) = p'\_{vwr(y)}(j) - \left(\beta\_{1vwpj} p\_{vr(y)}(j) + \beta\_{2vwpj} p\_{wr(y)}(j) + \beta\_{3vwpj}\right) \tag{36}$$

$$r\_{vwq(y)}(j) = q'\_{vwr(y)}(j) - \left(\beta\_{1vwqj}q\_{vr(y)}(j) + \beta\_{2vwqj}q\_{wr(y)}(j) + \beta\_{3vwqj}\right) \tag{37}$$

$$\begin{aligned} r\_{wvp(y)}(j) &= p'\_{wvr(y)}(j) - \left(\beta\_{1wvpj} p\_{wr(y)}(j) + \beta\_{2wvpj} p\_{vr(y)}(j) + \beta\_{3wvpj}\right) \\ &\quad j = 1, 2, 3, \dots, t \end{aligned} \tag{38}$$

$$r\_{wvq(y)}(j) = q'\_{wvr(y)}(j) - \left(\beta\_{1wvqj} q\_{wr(y)}(j) + \beta\_{2wvqj} q\_{vr(y)}(j) + \beta\_{3wvqj}\right) \tag{39}$$

In these equations, the dummy and actual values are obtained from the received measurement vector in the control room. *rvwp*(*y*)(*j*), *rvwq*(*y*)(*j*), *rwvp*(*y*)(*j*), and *rwvq*(*y*)(*j*) represent the *j*th entries of the residue vectors **rvwp**(**y**), **rvwq**(**y**), **rwvp**(**y**), and **rwvq**(**y**), respectively, which contain the residues for the active and reactive powers flowing through the transmission lines in the forward and backward directions at the *y*th instant. The overall residue at the *y*th instant is calculated using

$$r = \left| \mathbf{r\_{vp(y)}} \right| + \left| \mathbf{r\_{vq(y)}} \right| + \left| \mathbf{r\_{vwp(y)}} \right| + \left| \mathbf{r\_{vwq(y)}} \right| + \left| \mathbf{r\_{wvp(y)}} \right| + \left| \mathbf{r\_{wvq(y)}} \right| \tag{40}$$

For a secure system:

$$r = 0$$

If the total residue is nonzero, the system is considered to be under attack. The attacker intercepts the measurement vector **zdy**, alters it, and sends the resulting vector **zdyr** to the control room. Because the attacker does not know which entries are dummy values, the attack alters the dummy values as well. The attacker also does not know the relationship used to calculate the dummy values. As a result, the altered actual and dummy values no longer satisfy that relationship, *r* is not equal to zero, and the attack is easily detected in the control room.
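The control-room check can be sketched as follows for the bus active powers only, i.e., Equation (34) and the first term of Equation (40). The function names, the constants, the sample readings, and the small tolerance `tol` are assumptions made for illustration. The text requires *r* = 0 exactly; the tolerance is added here only to absorb floating-point rounding. The sketch also interprets |·| in Equation (40) as the sum of the absolute values of the entries.

```python
def bus_residue(dummy_received, p_received, p_line_received, beta):
    """Residue of Eq. (34): received dummy value minus recalculated dummy value."""
    beta1, beta2, beta3 = beta
    recalculated = beta1 * p_received + beta2 * p_line_received + beta3
    return dummy_received - recalculated


def total_residue(records, beta):
    """Sum of absolute residues over all buses (one term of Eq. (40))."""
    return sum(abs(bus_residue(d, p, pz, beta)) for d, p, pz in records)


def is_attacked(records, beta, tol=1e-9):
    """Flag an attack when the total residue is not (numerically) zero."""
    return total_residue(records, beta) > tol


# Hypothetical constants and (bus power, line power) readings for three buses.
beta = (0.9, 0.05, 0.02)
readings = [(1.00, 0.40), (1.10, 0.42), (0.95, 0.38)]

# Each record is (dummy value, actual value, line value) as sent by a meter.
clean = [(beta[0] * p + beta[1] * pz + beta[2], p, pz) for p, pz in readings]

# An attacker who cannot tell dummy from actual values shifts both by 0.1.
tampered = [(d + 0.1, p + 0.1, pz) for d, p, pz in clean]

print("clean flagged:", is_attacked(clean, beta))
print("tampered flagged:", is_attacked(tampered, beta))
```

In this example the same offset on the actual and dummy entries leaves a residue of 0.1 − 0.9 × 0.1 = 0.01 per bus, because the injected values do not respect the linear relationship between them.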

This proposed model of the variable dummy value can tackle the limitations of the fixed dummy value model and the stealth FDI attacks can be detected in an efficient way.
