2.4.2. Customers Heating

The thermal energy storage is intended to exchange heat with the circulating water to maintain the customers' room temperature greater than or equal to the designed heating temperature *T*cri a,room. The temperatures of supply and return water are obtained by the room and the ambient temperature in Equation (12).

$$\begin{cases} \begin{aligned} T\_{\rm w,o} &= T\_{\rm a,room}^{\rm ri} + \left( \frac{T\_{\rm w,o}^{\rm ri} - T\_{\rm w,in}^{\rm ri}}{2} \right) \left( \frac{T\_{\rm a,room}^{\rm ri} - T\_{\rm a,amb}^{\rm ri}}{T\_{\rm a,room}^{\rm ri} - T\_{\rm a,amb}^{\rm ri}} \right) \\ &+ \left( \frac{T\_{\rm w,in}^{\rm ri} + T\_{\rm w,o}^{\rm ri}}{2} - T\_{\rm a,room}^{\rm ri} \right) \left( \frac{T\_{\rm a,room}^{\rm ri} - T\_{\rm a,amb}}{T\_{\rm a,room}^{\rm ri} - T\_{\rm a,amb}^{\rm ri}} \right)^{\frac{1}{1+\eta}} \\\ T\_{\rm w,in} &= T\_{\rm w,o} - \left( T\_{\rm w,o}^{\rm ri} - T\_{\rm w,in}^{\rm ri} \right) \left( \frac{T\_{\rm a,nom}^{\rm ri} - T\_{\rm a,amb}}{T\_{\rm a,room}^{\rm ri} - T\_{\rm a,amb}^{\rm ri}} \right) \end{aligned} \tag{12}$$

where *T*cri w,o and *T*cri w,in are the respective temperatures of water supply and return designed for SETS. *η* is the heat transfer coefficient of the radiator. According to the standard of heating in the north of China, the minimum value of *T*cri a,room is 18 ◦C. During the winter heating period, the average ambient temperature of Shenyang is −16.9 ◦C.

The overshoot of water temperature and the frequent start of the fan are easily caused by the use of *T*w,o and *T*w,in owing to the inertia of temperature change. Therefore, in this paper, the highest supply water temperature *T*w,o,max and lowest return water temperature *T*w,i,min during the heating period are designed. Taking Shenyang city as an example, the minimum and maximum temperatures of *T*a,amb are taken into Equation (12) to obtain the temperature curves of supply and return water as shown in Figure 4. The minimum temperature of return water *T*w,i,min is selected as the benchmark so that the change of temperature *T*w,o − *T*w,in = *γ* can be deduced as shown in Equation (13).

$$\gamma = \left( 1 + \frac{T\_{\rm w,o} - T\_{\rm w,i,min}}{T\_{\rm w,o,max} - T\_{\rm w,i,min}} \right) T\_{\rm w,i,min} \tag{13}$$

Substituting *γ* into Equation (11), *T*a,in is obtained as shown in Equation (14).

**Figure 4.** Influence curve of ambient temperature on circulating water temperature.

$$T\_{\rm a,in} = T\_{\rm a,o} + \frac{\dot{m}\_{\rm W}c\_{\rm W} \left(1 + \frac{T\_{\rm w,o} - T\_{\rm w,i,min}}{T\_{\rm w,o,max} - T\_{\rm w,i,min}}\right) T\_{\rm w,i,min}}{\dot{m}\_{\rm s}c\_{\rm s}}.\tag{14}$$

Based on the above analysis, the thermal energy store in bricks at time *t* can be obtained as shown in Equation (15).

$$E\_{\rm pro,t} = E\_{\rm stto,t} = \int\_0^i \vec{Q}\_{\rm stto,t} \,\mathrm{d}t \qquad (i \in t), \tag{15}$$

where, *Q*¯ sto,*<sup>t</sup>* is the average power consumption of SETS at time *t*. *E*pro,*<sup>t</sup>* is the electricity consumption by the heating wires, and *E*sto,*t* is the stored heat energy in the bricks. The working time *t* is related to the behavior characteristics, and the superscript *i* is the time index.

#### *2.5. Customers' Behavior Characteristics Extraction*

The working time *t* of the SETS is determined by the different electricity consumption behavior characteristics of customers. For example, two types of behavior characteristics of SETS customer within one month is shown in Figure 5. One type starts from the previous night 22:00 and stops heating at 1:00. Then, SETS starts again at 3:00 and ends the reservoir at 5:00. The other type starts from the previous night 22:00 and then continues to work until the end at 5:00. The consumption time *t* of one type is *t* = 3.5*h* on the left curve, and the other type is *t* = 7*h* on the right curve.

Since the load curves of the same electrical behavior have regular working time, it is necessary to cluster the same electricity consumption behavior characteristics with the heat load data. Then, the customers' behavior types are used to determine the working time *t* of SETS. Also, the *E*pro,*t* daily electricity consumption of SETS is predicted by Equation (15). The method of K-means is used to identify customers' behavior types. After all SETS data clustering, the customers' behavior types are obtained. The load data with the same type has a similar working time, the same type is identified using the support vector machine that has been widely used to learn target class [26]. When the target types are larger than binary, the one-versus-one is one of the effective methods to find the discriminative binary classifiers to solve a multiclass problem [27]. Thus, the one-versus-one method is adopted in behavior extraction. The overall process of the behavior characteristics extraction is shown in Algorithm 1.

**Figure 5.** Customers' electricity consumption behavior curve.

#### **Algorithm 1** The behavior characteristics extraction algorithm process.


vector machine output time labels;


#### *2.6. Summary of the PM Prediction*

Based on the above analysis, the proposed PM of SETS can be used to predict the daily electricity consumption. The overall PM flowchart is shown in Figure 6. According to the initial values, the range of Reynolds-Number (*Re*) is estimated. If *Re* > 206 or *Re* ≤ 102, return the initialization parameters, otherwise, input the ambient temperature *T*a,amb to obtain temperatures of *T*w,o and *T*w,in. Select the maximum value of supply water *T*w,o,max and the minimum value of return water *T*w,in,min. Then, obtain the inlet temperature *T*a,in. *Nu* is selected based on the range of *Re*, so input the *Nu* and *<sup>T</sup>*a,in <sup>≈</sup> *<sup>T</sup>*fin to predict the average power consumption *<sup>Q</sup>*¯ sto. The *<sup>Q</sup>*¯ sto,*<sup>t</sup>* is integrated over working hours *t*. According to the stored thermal energy *Q*¯ sto,*<sup>t</sup>* in the bricks, the predicted value of SETS heating wires electricity consumption *E*pro,*t* can be obtained.

## *2.7. Influencing Factors*

Since influencing factors (including humidity, wind speed, and temperature) influence the prediction of the CPM, it is necessary to select the strong correlation factors as the input of the cyber and PMs. Thus, the Pearson correlation analysis in Equation (16) is adopted to analyze which one is strongly correlated. The influencing factors and the SETS load are represented as *a* and *b*, respectively.

$$\rho\_{a,b} = \frac{\left(\sum ab - \frac{\sum a \sum b}{N}\right)}{\sqrt{\left(\sum a^2 - \frac{\left(\sum a\right)^2}{N}\right)\left(\sum b^2 - \frac{\left(\sum b\right)^2}{N}\right)}} \,\tag{16}$$

The statistics on the correlation between the influencing factors and SETS load within 24 h is as shown in Figure 7. The histogram explains which influencing factors are strongly correlated with the SETS load. The influencing factors (including humidity, wind speed, and temperature) are negatively correlated with the load, and humidity is positively correlated with the SETS load only in the range of 8:00 to 10:00. The ranges of correlation coefficient of humidity (blue), wind speed (green) and temperature (yellow) are [−0.24, 0.21], [−0.47, 0.13], and [−0.82, −0.66], respectively. It can be seen that the correlation between the temperature and SETS load is strongly negative, while humidity and wind speed are weakly correlated with SETS load. Thus, the temperature is selected as the input of the cyber and PMs.

**Figure 6.** Flowchart for the physical model (PM).

**Figure 7.** Correlation of SETS load influencing factors.

## **3. Cyber–Physical Approach**

The parameters of SETS and ambient temperature are the inputs of the SETS PM. The PM prediction result is shown in Figure 8. It can be observed that January has the best fitting with the least error. The test result with the PM still has an irregular error as shown in Table 1. The root mean square error (RMSE), mean absolute error (MAE), and mean absolute percentage error (MAPE) are used to measure the prediction performance of the PM with a deviation between the predicted value and the real value.

$$\begin{cases} \text{RMSE} = \sqrt{\frac{1}{m} \sum\_{i=1}^{m} \left( y\_i - \hat{y}\_i \right)^2} \\ \text{MAE} = \frac{1}{m} \sum\_{i=1}^{m} |\mathcal{Y}\_i - \mathcal{Y}\_i| \\ \text{MAPE} = \frac{100\%}{m} \sum\_{i=1}^{m} \left| \frac{y\_i - \hat{y}\_i}{y\_i} \right| \end{cases} \tag{17}$$

where *m* is the number of data. *yi* is the real value of SETS load, *y*ˆ*<sup>i</sup>* is the prediction value of SETS PM.

**Figure 8.** PM prediction of the load curve of a customer's SETS.


**Table 1.** Physical model (PM) prediction parameters.

The PM predicts error according to the result of RMSE, MAE, and MAPE, and obtains a consistent evaluation. Selecting the month of January which has the smallest RMSE in five months, the predicted result RMSE error is 167.36 kWh. The electricity consumption of real value is 5616 kWh, the prediction accuracy is approximately 29.8%. Therefore, the prediction accuracy of the PM still needs to be further improved.

To improve the prediction accuracy, other prediction methods are considered. The CM is widely used in load prediction because it can accurately model the past. Hence, the application of the CM is considered to predict the electricity consumption of SETS. However, the electric behavior of SETS is different from the other conventional electric loads, it is difficult to predict a sudden increase and decrease in heat load, that is, the predictability of the CM is not good enough. On the contrary, the PM is sensitive to a sudden increase and decrease in the heat load. Therefore, the CM and PM can be combined to improve the prediction accuracy with mutual verification. CPS is widely used in many fields, for example, measurement recovery in false data injection attacks [28], smart home cyberattack detection [29], energy theft detection in multi-tenant data centers [30], and inter-well connectivity estimation in petroleum [31], etc. Hence, the cyber–physical approach is developed to predict electricity consumption of SETS. The CM established by the load data of SETS is used to calibrate the output error of the PM.

The cyber components (including influencing factors) and the physical components (including behavior characteristics and average power consumption) of SETS are integrated using the cyber–physical approach. The cyber–physical approach is used to predict the electricity consumption of SETS. The Pearson correlation analysis formula is used to solve the correlation coefficients *ρa*,*<sup>b</sup>* between power consumption of SETS and influencing factors. The ambient temperature has the greatest impact on the power consumption of SETS, which is considered as an input of the PM flow.

Firstly, the PM of SETS is utilized to predict the daily average power consumption based on ambient temperature. Secondly, the integration of K-means and one-versus-one support vector machine is applied to extract the behavior characteristics of SETS to obtain the work time *t*. Then the electricity consumption of SETS is obtained by integrating average power consumption. The performance of the prediction is tested by the error calculation Equation (17). The error levels are divided according to the size of the errors.

Since RMSE is a good reflection of the precision of the measurement, it is used to design the error levels of the PM. According to the result of RMSE, which is represented by *s*, the subscript indicates *s* = 1, 2, 3, 4, 5 which represents the months from November 2017 to March 2018, respectively. The number <sup>1</sup> to <sup>5</sup> are 372.05, 310.29, 167.36, 227.1, 389.45, respectively. Therefore, according to the order of *<sup>s</sup>* (<sup>5</sup> > <sup>1</sup> > <sup>2</sup> > <sup>4</sup> > 3), the error is divided into 5 levels, and then the data is input in the CM which is used in BP neural network.

Subsequently, the output of PM is calibrated by the error of CM output to obtain the electricity consumption of the SETS. The specific method of the CM is introduced as follows.

The error levels, ambient temperature, and PM prediction values are taken as input vectors *xn* of BP neural network. Output vectors *on* is the error as shown in Figure 9. The number of hidden neurons is 13, the sigmoid function is selected as the activation function for the hidden layer and the output layer, and the gradient descent method is used to update the parameters. In the CM, the influencing factors and the power consumption of SETS are the input vectors *xn* of BP neural network, and the electricity is the output vectors *on*. The number of hidden neurons is set as 25.

**Figure 9.** Cyber model (CM) structure of back propagation (BP) neural network.

During training, the CM of the BP neural network can automatically extract the "reasonable rules" between the input layer and output layer data and memorize the learning content in the weight of the network adaptively. The CM can predict the load deviation after training even if there is unknown noise-pollution. BP neural network is used to train the CM. The CM prediction result plus the PM predicted value is the electricity consumption of SETS.
