1. Introduction
In view of the shortage of energy supply and environmental problems caused by fossil fuels, the idea of an integrated energy system (IES) provides a feasible framework for future energy systems due to the complementarity, flexibility, and reciprocity among multiple energy forms [
1]. Through the conversion between different energy forms, IES is able to adjust the energy flow in the links of production, transmission, distribution, and consumption. It is a complex cyber-physical system interconnected with the smart grid, the thermal system, the natural gas system, the transportation system, and other types of energy systems, in order to realize the coordinated operation of multiple energy subsystems [
2]. However, across the links of energy supply, transmission, and consumption, a cyber-physical integrated energy system is characterized by strong coupling among different energy subsystems, as well as deep interaction between the physical system and the information system. With the increasing penetration of renewable energy and of newly emerging distributed sources and loads such as heat pumps and electric vehicles, the safety and reliability of IES face new challenges and requirements. Because the physical systems of each energy form, as well as the information system, are all interconnected in an IES, a fault in any single energy subsystem may affect the rest of the interconnected energy subsystems and the information system, expanding the fault range. Therefore, it is of great significance to study the cyber-physical security of IES under the new challenges of high randomness of system operation, high nonlinearity and complexity of system modelling, and uncertain interaction between the physical system and the information system.
Since the 21st century, a number of malicious network attacks on cyber-physical systems have occurred worldwide. Attacks on information systems of power systems have resulted in the loss of power supply capacity of the actual physical power grid, which eventually led to significant energy accidents and serious economic and social losses. In 2010, Stuxnet invaded the SIEMENS PCS7 control system used in Iran’s Natanz uranium enrichment base and the Bushehr nuclear power plant, resulting in the failure of a large number of uranium enrichment centrifuges and generator units in the Bushehr nuclear power plant, which delayed Iran’s nuclear program for at least two years. In 2015, the Ukrainian power grid suffered a malicious network attack in which the implantation of computer viruses made the control server fail to sense and control the physical equipment of the power grid, leading to a large-scale power outage. In March 2018, the U.S. Department of Homeland Security and the Federal Bureau of Investigation issued an alert that energy and other key systems had been subject to network attacks in which a number of confidential documents of the industrial control system workstation and the supervisory control and data acquisition (SCADA) system were stolen [
3]. The above network security incidents demonstrate that network attacks on an IES information system can cause huge damage to the physical system, including core equipment failure, system instability, heavy load loss, and power failure. Current research mainly focuses on the simulation, operation optimization, and planning of IES, whereas research on the anomaly detection and defense of a cyber-physical integrated energy system under network attack remains scarce, which is an important gap that needs to be filled.
The main target of a network attack on an IES is to destroy the confidentiality, accuracy, and availability of the information in the IES. There are various forms of network attacks based on the different attack targets. At present, common attack forms include false data injection attacks (FDIAs), denial of service (DOS), and man-in-the-middle attacks (MITM). An FDIA aims to destroy the data accuracy in an IES. Attackers invade remote measurement instruments and use carefully designed and camouflaged data to tamper with measurement data, or cut off normal measurement data and upload erroneous camouflaged data during communication. The decision-making center obtains the wrong data after the attack and its decision-making process is therefore disrupted, threatening the stability of the whole system and causing great economic losses [
4]. DOS aims at destroying the availability of information elements or centers in the system, and then hindering the information transmission and intervening in the normal communication of dispatching and operation commands in the system [
5]. MITM aims at destroying the confidentiality of IES information, during which the attacker steals and monitors the communication channel to illegally obtain users’ private energy data [
6]. Among them, an FDIA can easily exert a significant impact on the economic and stable operation of the integrated energy system owing to its high concealment and destructiveness [7,8,9]. This paper studies FDIAs as the main attack form.
Defense against network attacks is an important guarantee for maintaining the safe and stable operation of an IES, and the first step in the defense is to detect the existence of a network attack, distinguish the mode of attack, and predict the attacker’s purpose and potential impact. The detection methods proposed for the above attacks can be divided into three categories. The first is the detection method based on trajectory prediction. Trajectory analysis is carried out on historical data to predict the current state of the system, and the prediction is compared with the actual measurement to identify the areas that may be under attack. Reference [
10] proposed a multi-sensor track fusion-based prediction method to extract the initial correlation information about the attacked oscillation parameters, using a Kalman-like particle filter-based smoother at each monitoring node. The characteristics of moving horizon estimation are used to deal with the continuous load fluctuation and disturbance caused by data injection in the power grid. Reference [
11] proposed a quick intrusion detection algorithm to detect FDIAs in smart grids, in which the statistical characteristics of dynamic grid states are analyzed and a time-varying dynamic model is established to accurately capture the dynamic state transition caused by the change in system configuration. The second category is the detection method based on state estimation, mainly including the residual detection method, the measurement sudden-change detection method, etc. Reference [
12] proposed a new method for false data detection, in which equivalent measurement transformation is applied instead of traditional weighted least squares state estimation and the false data are distinguished by the residual researching method. Reference [
13] proposed a method to screen and protect a specific subset of measurement devices, through which the data mutations in the selected subset of measurement devices can be quickly detected and significant security losses caused by attacks can be avoided as much as possible under limited protection resources. In recent years, the third kind of detection method based on artificial intelligence has been developing quickly. Reference [
14] used the compressive sampling method in compressed sensing theory and SVM to develop an anomaly detection model, and SVM was adopted to classify the feature compression results. Reference [
15] used statistical tools of binary logistic regression to detect attacks. Reference [
16] proposed a decision tree algorithm based on relative decision entropy, which is combined with rough set attribute reduction technology to delete redundant features and improve attack detection accuracy.
In current research, the selection of features in the anomaly detection process has not been extended to the time-frequency domain. However, the operation data of each energy subsystem in an IES (power subsystem, heating subsystem, gas subsystem, etc.) are mostly time-series data. For time-series data, time-frequency features can be more accurate and effective than time-domain features alone. The typical time-frequency transform methods include wavelet transform (WT), discrete wavelet transform (DWT), empirical mode decomposition (EMD), and empirical wavelet transform (EWT). These methods have already proven their excellent feature extraction ability for time-series data in current studies. In Reference [
17], a wavelet representation method is proposed, in which the difference of information between the approximations of a signal at successive resolutions can be extracted on the basis of WT. The application of this representation to data compression in image coding, texture discrimination, and fractal analysis is discussed. Reference [
18] established a protection scheme for microgrids using DWT and a decision tree, in which the DWT method is applied to the time-series data for fault classification. Reference [
19] verified the effectiveness of the EMD method for analyzing nonlinear and non-stationary data and interpreted the final presentation of the results as an energy–frequency–time distribution. Reference [
20] explored the performance of EMD based on numerical experiments in cases with fractional Gaussian noise, and the results showed that EMD can act effectively as a dyadic filter bank in stochastic situations involving broadband noise. Reference [
21] proposed a simple and logical definition of “trend” for nonlinear and nonstationary time-series data based on EMD. Climate data are used to illustrate how the intrinsic trend is determined and how the variability of the data on various time scales can be derived. Reference [
22] explored power quality disturbances based on EWT and DWT and the results showed that the two methods had different performances in combination with different classification methods.
Though the above time-frequency transform methods have already been adopted in feature extraction of time-series data in some other fields, the methods have not been applied in the anomaly detection process of an IES. In this paper, three time-frequency transform methods (DWT, EMD, and EWT) are used to extract the time-frequency features of IES time-series data, and the extracted time-frequency features are used for prediction instead of the conventional time-domain prediction for anomaly detection.
Besides, previous studies seldom consider the uncertainty of prediction error, and often use the percentage of prediction error distribution as the judgment threshold. In this paper, kernel density estimation (KDE) is used to estimate the probability density function of prediction error. Based on this, the interval of the predicted data considering prediction error is estimated, and the lower estimated interval value is used as the attack judgment threshold.
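As a rough illustration of the thresholding idea in the preceding paragraph, the following Python sketch fits a Gaussian kernel density estimate to a set of prediction errors and reads off a lower quantile, which is then added to a predicted value to give a lower bound. The Gaussian kernel, the normal-reference bandwidth rule, the 5% level, the sign convention of the error (actual minus predicted), and all function names and sample values are assumptions made for illustration only.

```python
import math

def normal_reference_bandwidth(errors):
    """Rule-of-thumb bandwidth 1.06 * std * n^(-1/5) (assumed choice)."""
    n = len(errors)
    mean = sum(errors) / n
    std = math.sqrt(sum((e - mean) ** 2 for e in errors) / (n - 1))
    return 1.06 * std * n ** (-1 / 5)

def kde_cdf(x, errors, h):
    """CDF of a Gaussian KDE: the average of normal CDFs centred on each error."""
    total = sum(0.5 * (1.0 + math.erf((x - e) / (h * math.sqrt(2.0)))) for e in errors)
    return total / len(errors)

def kde_quantile(errors, p, h=None):
    """Invert the KDE CDF by bisection to find the p-quantile of the error."""
    if h is None:
        h = normal_reference_bandwidth(errors)
    lo, hi = min(errors) - 10 * h, max(errors) + 10 * h
    for _ in range(100):
        mid = (lo + hi) / 2
        if kde_cdf(mid, errors, h) < p:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

# Hypothetical prediction errors (actual minus predicted) and one predicted value.
errors = [-2.0, -1.0, 0.0, 1.0, 2.0]
predicted_load = 50.0
lower_error = kde_quantile(errors, 0.05)   # lower end of the error interval
threshold = predicted_load + lower_error   # lower bound used as the alarm threshold
```

Under these assumptions, a measurement falling below `threshold` would be flagged. An upper bound could be obtained in the same way from a high quantile.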
In summary, this paper proposes an anomaly detection method for time-series data in a cyber-physical integrated energy system based on time-frequency feature prediction. The technical framework of this paper is shown in Figure 1. In Section 2, the time-frequency features are constructed on the basis of three time-frequency transform methods: DWT, EMD, and EWT. Then in Section 3, an anomaly detection method is proposed, in which the autoencoder (AE) and regression prediction are applied to the time-frequency features in order to predict the time-domain data within the detection period. Considering the prediction uncertainty, KDE is used to estimate the interval of the time-domain prediction data. The attack judgment threshold is determined as the lower estimated bound of the predicted data. A case study is conducted for method validation; its data processing is described in Section 4 and the analysis of the results is presented in Section 5. The results verify that the proposed method can significantly improve the anomaly detection accuracy on time-series data. Thus, the proposed model can effectively help resist network attacks and maintain the economic and stable operation of an IES.
2. Construction of the Time-Frequency Features
As the operation data of an IES are mostly time-series data, anomaly detection on time-series data can effectively detect attack intrusion on an IES and help ensure system safety and stability. The operation data of an IES usually undergo complex changes within a day without explicit seasonal or periodic patterns. When an FDIA occurs, the spatial and temporal correlation of the operation data in an IES may also differ from that in the normal state. Therefore, time-frequency transform methods, namely discrete wavelet transform (DWT), empirical mode decomposition (EMD), and empirical wavelet transform (EWT), are adopted to analyze the implicit time-frequency characteristics of IES time-series data. The excellent feature extraction ability of these methods was verified in Refs. [17,18,19,20,21,22]. Based on the extracted time-frequency features, the autoencoder (AE) is further adopted in Section 3 to mine the nonlinear hidden structure inside the features; together, these steps address the correlation inconsistency and improve the anomaly detection accuracy.
2.1. Time-Frequency Transform Methods
To extract comprehensive and sufficient time-frequency characteristics of IES time-series data, three typical time-frequency transform methods are employed. The methods of DWT, EMD, and EWT are briefly introduced in this section.
2.1.1. Discrete Wavelet Transform (DWT)
DWT is a time-frequency extraction method that can locate in time the spectral components of a non-stationary signal. For a given time series f(t), the DWT of f(t) includes both scale transform and translation transform, and equals the convolution of f(t) with the wavelet function, as shown in Equation (1) [23], where m and n are the scale and position parameters of the wavelet function, respectively. The wavelet function ψm,n(t) is shown in Equation (2).
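To make the scale-and-translation idea concrete, the following Python sketch computes one decomposition level with the Haar wavelet, the simplest wavelet, and reconstructs the signal from the result. The Haar wavelet is swapped in purely for illustration and is not the wavelet used in this paper; the function names and the sample values are hypothetical.

```python
import math

def haar_dwt_level1(signal):
    """One level of the Haar DWT: returns (approximation, detail) coefficients."""
    if len(signal) % 2 != 0:
        raise ValueError("signal length must be even")
    scale = 1.0 / math.sqrt(2.0)
    pairs = range(0, len(signal), 2)
    approx = [(signal[i] + signal[i + 1]) * scale for i in pairs]
    detail = [(signal[i] - signal[i + 1]) * scale for i in pairs]
    return approx, detail

def haar_idwt_level1(approx, detail):
    """Inverse of haar_dwt_level1: rebuilds the original samples."""
    scale = 1.0 / math.sqrt(2.0)
    out = []
    for a, d in zip(approx, detail):
        out.append((a + d) * scale)
        out.append((a - d) * scale)
    return out

# Hypothetical load samples with a single spike at index 4.
load = [4.0, 4.0, 5.0, 5.0, 9.0, 5.0, 4.0, 4.0]
approx, detail = haar_dwt_level1(load)
rebuilt = haar_idwt_level1(approx, detail)
```

In this toy series only the detail coefficient covering the spike is non-zero, which illustrates why first-level detail coefficients are useful for locating an abrupt change in time.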
2.1.2. Empirical Mode Decomposition (EMD)
EMD decomposes time-series data according to the time-scale characteristics of the data themselves instead of preset basis functions. Therefore, EMD is suitable for analyzing nonlinear and nonstationary time-series data. It can decompose a complex signal into a finite number of intrinsic mode functions (IMFs), and each IMF component contains the local characteristic signals of different time scales of the original signal. The time-frequency spectrum can then be obtained through a Hilbert transform of each IMF, which yields frequencies with physical significance. Compared with short-time Fourier transform, wavelet transform (WT), and other methods, EMD decomposes the data themselves to form the basis functions, which can preserve the local time-scale characteristics of the original data [19,20,21]. As a result, the time-frequency features obtained by EMD are an effective supplement to the features extracted by DWT.
The decomposition process of EMD is as follows [19].
For a given signal f(t), the upper and lower envelopes are obtained by cubic spline interpolation of the local maxima and local minima, respectively. m1 denotes the mean of the upper and lower envelopes, and h1 is calculated as follows.
$$h_1 = f(t) - m_1 \tag{3}$$
In the second sifting step, h1 is the data to be decomposed, and m11 is the mean of the upper and lower envelopes of h1. h11 is calculated as follows.
$$h_{11} = h_1 - m_{11} \tag{4}$$
The sifting is repeated K times until h1K is an intrinsic mode function. h1K is calculated as follows.
$$h_{1K} = h_{1(K-1)} - m_{1K} \tag{5}$$
Then the first IMF component of f(t) is obtained as Equation (6), which contains the shortest-period component.
$$c_1 = h_{1K} \tag{6}$$
Then c1 is separated from the rest of the data, as shown in Equation (7).
$$r_1 = f(t) - c_1 \tag{7}$$
The above process is repeated n times, with each residual treated as the new data, as shown in Equation (8).
$$r_j = r_{j-1} - c_j, \quad j = 2, \ldots, n \tag{8}$$
Through the process of EMD, f(t) is decomposed into the sum of a set of IMFs c1, …, cn and the final residual rn, where n is determined by the features of f(t) itself.
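A single sifting step of the procedure above can be sketched in Python as follows. To keep the example dependency-free, piecewise-linear envelopes are swapped in for the cubic spline interpolation described in the text, and the signal end points are used as extra envelope anchors; both are simplifications, and the function names and the test signal are hypothetical.

```python
import math

def local_extrema(x):
    """Indices of the strict local maxima and minima of the sequence x."""
    maxima, minima = [], []
    for i in range(1, len(x) - 1):
        if x[i] > x[i - 1] and x[i] > x[i + 1]:
            maxima.append(i)
        elif x[i] < x[i - 1] and x[i] < x[i + 1]:
            minima.append(i)
    return maxima, minima

def envelope(x, idx):
    """Piecewise-linear envelope through (i, x[i]) for i in idx.

    Simplification: linear instead of cubic spline interpolation; the first
    and last samples are added as anchors so the envelope spans the signal.
    """
    knots = sorted(set([0] + idx + [len(x) - 1]))
    env = []
    for left, right in zip(knots[:-1], knots[1:]):
        for i in range(left, right):
            w = (i - left) / (right - left)
            env.append((1 - w) * x[left] + w * x[right])
    env.append(x[-1])
    return env

def sift_once(x):
    """One sifting step: h = x - m, where m is the mean of the two envelopes."""
    maxima, minima = local_extrema(x)
    upper = envelope(x, maxima)
    lower = envelope(x, minima)
    mean = [(u + l) / 2 for u, l in zip(upper, lower)]
    h = [xi - mi for xi, mi in zip(x, mean)]
    return h, mean

# Hypothetical signal: a fast oscillation riding on a slow one.
f = [math.sin(2 * math.pi * t / 8) + 0.5 * math.sin(2 * math.pi * t / 64)
     for t in range(128)]
h1, m1 = sift_once(f)
```

Repeating `sift_once` on its own output until a stopping criterion is met would give the first IMF; the stopping criterion is left out of this sketch.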
2.1.3. Empirical Wavelet Transform (EWT)
Although EMD has the advantage of self-adaption and performs particularly well when dealing with nonlinear and non-stationary signals, problems such as over envelope, under envelope, endpoint effect, and mode aliasing still exist in varying degrees in the application process. Empirical wavelet transform (EWT) combines the self-adaptive decomposition concept of EMD and the compact support feature of WT, which provides a new idea of self-adaptive time-frequency analysis. Compared with EMD, the EWT method can adaptively select the frequency band and overcome the problem of modal aliasing caused by the discontinuity of the time-frequency scale. At the same time, it has a complete and reliable mathematical theoretical basis, low computational complexity, and can also overcome the problems of over envelope and under envelope [
24]. Therefore, this paper selected EWT as one of the methods to extract the time-frequency features of time-series data so as to further enrich the time-frequency feature set.
In the computation process of EWT, the Fourier spectrum of the original signal is divided into contiguous intervals. Wavelet filter banks are then constructed on each interval for filtering, and a group of amplitude-modulated and frequency-modulated (AM-FM) components is finally obtained through signal reconstruction. This method can identify the position of the signal’s feature information in the Fourier spectrum through the wavelet filter bank and adaptively extract the different frequency components of the signal. The design of the filter bank follows the construction of Littlewood-Paley and Meyer wavelets. For n > 0, the empirical wavelet function ψn(ω) and empirical scaling function ϕn(ω) are shown in Equations (9) and (10) [22], where γ is a parameter determining the spectrum interval width and γ ∈ (0, 1). β(x) ∈ C^k([0, 1]) and satisfies the conditions in Equation (11).
The functions satisfying the conditions in Equation (11) are diverse, among which the most used one is shown in Equation (12).
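For orientation, the transition function β(x) can be sketched in Python as below. The polynomial is the choice commonly given in the EWT literature, and it is assumed here, not confirmed by the text, that Equation (12) takes this form; likewise the conditions checked (β = 0 for x ≤ 0, β = 1 for x ≥ 1, and β(x) + β(1 − x) = 1 on [0, 1]) are the usual ones and are assumed to match Equation (11).

```python
def beta(x):
    """Transition polynomial commonly used when building empirical wavelets.

    Assumed form: beta(x) = x^4 * (35 - 84x + 70x^2 - 20x^3) on (0, 1),
    clamped to 0 below the interval and to 1 above it.
    """
    if x <= 0:
        return 0.0
    if x >= 1:
        return 1.0
    return x ** 4 * (35 - 84 * x + 70 * x ** 2 - 20 * x ** 3)

# Check the complementarity condition on a grid of hypothetical sample points.
grid = [k / 10 for k in range(11)]
complementarity_error = max(abs(beta(x) + beta(1 - x) - 1.0) for x in grid)
```

The complementarity condition is what lets neighbouring filters share a transition band without losing or double-counting spectral energy.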
Following the idea of classical WT, the empirical wavelet coefficients constructed by Gilles are generated by inner products. The detail coefficient EWT_d(n, t) is generated by the inner product of the empirical wavelet function ψn(ω) and the signal f(t), as shown in Equation (13).
The approximate coefficient is determined by the inner product of the empirical scaling function ϕn(ω) and the signal f(t), as shown in Equation (14).
2.2. Modeling of FDIA
Due to the interconnection of the cyber-physical integrated energy system, even a small network attack may give rise to huge stability problems and economic losses. The detection and defense of network attacks is of huge significance for the safety and stability of an IES. Though the forms of attack are various, the core attack purpose is consistently to destroy the operation stability of the IES and obtain economic benefits from the intrusion. In the introduction section, three common forms of network attack against cyber-physical integrated energy systems were introduced, namely, false data injection attacks (FDIAs), denial of service (DOS), and man-in-the-middle attacks (MITM). Among the three attack forms, it is easier for FDIA to have a significant and profound impact on the economic and stable operation of an IES in view of its high concealment and strong destructiveness. Therefore, this paper selects FDIA as the main attack form.
Since the control center in an IES is directly connected with the network dispatching center, the measurement data of the energy supply is difficult to modify. By comparison, the measuring instruments at the load end are widely distributed, and an FDIA against the load end is easier to implement. Therefore, this paper uses the method in Ref. [25] to model and analyze the load redistribution (LR) attack, a specific type of FDIA. The attacker intends to threaten the overall economy and stability of the system by modifying the measurement data at part of the load end of an IES. As a result, the dispatching system receives wrong load data and issues wrong dispatching orders, which redistributes the power flow in the whole network. The imbalance of the power flow distribution may cause further load loss at the load end, heavy load or overload of transmission lines, equipment damage, and eventually even large-scale energy supply interruptions. From an economic perspective, the attack may also raise the system operation cost and affect relevant economic transactions, which may eventually cause huge economic losses [26,27,28].
To further model the LR attack process, consider the load of node i in subsystem m before the attack, where m ∈ {e, h, g} and e, h, and g are the subscripts of the power, heating, and gas subsystems, respectively. The attacker modifies the node load so that an attack vector is superimposed on the original node load, which in turn superimposes a corresponding change on the transmission power of each transmission line and influences the energy flow distribution of the whole system.
In addition, if the sudden increase or decrease in node load or line transmission power is too much, the system detector can detect the change behavior through residual inspection and give an alarm. Therefore, assuming that the attacker has certain knowledge and experience of LR attacks in advance, the attacker limits the change in load and transmission power to within a range so as to bypass the residual inspection and avoid sounding an alarm. The limiting conditions are shown in Equations (15) and (16).
Equation (15) limits the attack vector at node i of subsystem m using the lower and upper limit factors of the load change range at that node, together with the initial load at node i of subsystem m before the attack. Equation (16) likewise limits the power change caused by the attack on the line from node i to node j in subsystem m using the lower and upper limit factors of the transmission power range on that line, together with the initial transmission power on the line from node i to node j in subsystem m before the attack.
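The role of the limit factors can be illustrated with the short Python sketch below. It assumes a proportional form of the constraint, that is, lower factor × initial value ≤ change ≤ upper factor × initial value, with non-negative initial values; this form, the factor values of ±0.5, and the function names are assumptions for illustration and are not taken from Equations (15) and (16).

```python
def within_limits(initial, change, lower, upper):
    """True if lower * initial <= change <= upper * initial (assumed form)."""
    return lower * initial <= change <= upper * initial

def clip_attack(initials, changes, lower=-0.5, upper=0.5):
    """Clip each attack value into the assumed admissible band.

    Works the same way for node loads and for line transmission powers,
    provided the initial values are non-negative.
    """
    clipped = []
    for initial, change in zip(initials, changes):
        lo, hi = lower * initial, upper * initial
        clipped.append(min(max(change, lo), hi))
    return clipped

# Hypothetical initial loads and a candidate attack vector.
loads = [10.0, 20.0]
candidate = [8.0, -3.0]
stealthy = clip_attack(loads, candidate)
```

In this toy case the first component is reduced to the band edge, while the second already lies inside the band and is left unchanged.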
2.3. Time-Frequency Feature Analysis of IES Load Data under FDIA
To further explore the feature extraction potential of the three time-frequency transform methods (DWT, EMD, and EWT), a case study was conducted on the time-series data of a power subsystem node under LR attack. The load data of 100 time points were selected, and three different types of FDIA (slope attack, incentive attack, and delay attack) were assumed to be imposed on this node. The time-frequency features were qualitatively analyzed after applying DWT, EMD, and EWT to the attacked data so as to lay a foundation for the subsequent selection and construction of time-frequency features.
Figure 2, Figure 3 and Figure 4 present the three attack modes of slope attack, incentive attack, and delay attack, respectively.
Figure 2 shows the first attack mode, a slope attack. The attack vector is depicted by the orange line, which increases and decreases linearly. The red line indicates the normal state of the time-series load data, and the blue line presents the series data after the slope attack.
Figure 3 shows the second attack mode, an incentive attack. The attack vector is suddenly superimposed and lasts until the end of the attack.
Figure 4 presents the third attack mode, a delay attack. The attacked data repeat the signal value of the previous time period.
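The three attack modes can be sketched in Python as below. The sketch assumes a symmetric triangular ramp for the slope attack, a constant offset for the incentive attack, and a fixed-lag replay for the delay attack; the function names, parameters, and the sample series are hypothetical and only one possible reading of the descriptions above.

```python
def slope_attack(x, start, end, peak):
    """Add a triangular attack vector that rises linearly to `peak`, then falls."""
    if end <= start:
        raise ValueError("end must be greater than start")
    y = list(x)
    mid = (start + end) / 2
    half = (end - start) / 2
    for t in range(start, end + 1):
        y[t] += peak * (1 - abs(t - mid) / half)
    return y

def incentive_attack(x, start, end, offset):
    """Add a constant offset that appears abruptly and lasts until `end`."""
    y = list(x)
    for t in range(start, end + 1):
        y[t] += offset
    return y

def delay_attack(x, start, end, lag):
    """Replay the values observed `lag` steps earlier during the attack window."""
    if lag > start:
        raise ValueError("lag must not exceed start")
    y = list(x)
    for t in range(start, end + 1):
        y[t] = x[t - lag]
    return y

# Hypothetical 100-point load series with a period of 10 samples.
x = [float(t % 10) for t in range(100)]
y_slope = slope_attack(x, 20, 40, 5.0)
y_step = incentive_attack(x, 20, 40, 2.0)
y_delay = delay_attack(x, 20, 40, 5)
```

A negative `peak` or `offset` gives the corresponding negative attack.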
To observe the change in the time-frequency features under the three attack modes, the time-frequency transform methods of DWT, EMD, and EWT were applied to the case data. As for the setting of DWT, it can be seen from Equation (1) that the selection of the wavelet function ψm,n(k) and the decomposition level m leads to different decomposition coefficients. These coefficients further affect the feature extraction ability of a DWT-based feature extractor. Although some combination of wavelet function and decomposition level m gives the best detection performance, it is impractical to test all types of wavelet functions. In this paper, the bior wavelet was chosen as the basis wavelet function in consideration of its biorthogonality, its compact support, and its availability in MATLAB. In addition, the mode components decomposed by EMD and EWT together form the original time-series data, and the coefficients of all mode function components are retained for feature extraction.
Figure 5, Figure 6 and Figure 7 present the decomposition coefficients of DWT, EMD, and EWT under the three attack modes, where the results of DWT only preserve the first-level detail coefficients. Under each attack mode, both the positive attack and the negative attack were studied.
As shown in Figure 5, different feature extraction methods had different sensitivity to different attack modes. For example, under a positive slope attack, the decomposition coefficients of EMD were very different from those of the normal state, whereas there was almost no difference under a negative slope attack. The decomposition coefficients of DWT and EWT differed considerably from the normal state under both positive and negative attacks. As presented in Figure 6 and Figure 7, for incentive and delay attacks, the decomposition coefficients of EMD and EWT were very different from the normal state, whereas the differences in those of DWT were not obvious.
Observing the decomposition coefficient diagrams in Figure 5, Figure 6 and Figure 7 as a whole, it can be seen that DWT was more accurate at locating the time point of the abnormal value, whereas EMD and EWT had more advantages in judging the occurrence of the abnormal value. Therefore, the three time-frequency transform methods, DWT, EMD, and EWT, can together provide good screening features both for judging whether an attack is occurring and for locating the beginning and end time of the attack. The three methods complement each other and together form rich and diverse time-frequency features of the time-series data, which provides strong support for subsequent attack anomaly detection.
2.4. Composition of Time-Frequency Features
Based on the conclusions drawn in Section 2.3, the decomposition coefficients obtained by DWT, EMD, and EWT can help distinguish the attacked time-series data in an IES. References [29,30,31] showed that it is effective to analyze the statistical characteristics of the decomposition coefficients obtained by time-frequency transform of time-series data. Therefore, the statistical features of the decomposition coefficients are an important part of the time-frequency feature composition of IES time-series data. Based on the existing research, the statistical features are chosen to be the variance, the distribution of local maxima and local minima, and the averages of the local maxima and local minima of the time-frequency decomposition coefficients. To effectively reduce the feature dimension and avoid the curse of dimensionality in the subsequent application of the autoencoder, the vulnerability evaluation method proposed in Ref. [26] can first be adopted to determine the vulnerable nodes of the system, and the time-frequency feature extraction can then be conducted only on the selected vulnerable nodes.
Figure 8 shows an overview of the time-frequency features and Table 1 presents the detailed composition of the time-frequency features.
In Figure 8, n1, n2, and n3 represent the numbers of features related to DWT, EMD, and EWT, respectively. Node 1 to Node K represent the K nodes whose data are used for anomaly detection.
In Table 1, m indexes the first-level detail coefficients of DWT, the coefficients of the m-th IMF component of EMD, or the values of the m-th mode component in EWT. The distribution mark sequence of local maxima and minima is a sequence whose value is 1 at every position where a local maximum appears, −1 at every position where a local minimum appears, and 0 elsewhere.
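The statistical features described above can be computed for one coefficient sequence with a few lines of Python. The sketch assumes strict local extrema, marks the two end points as 0, and uses the population variance; these conventions, the function names, and the sample coefficients are assumptions, since the text does not fix them.

```python
def extrema_marks(coeffs):
    """Distribution mark sequence: 1 at local maxima, -1 at local minima, else 0."""
    marks = [0] * len(coeffs)
    for i in range(1, len(coeffs) - 1):
        if coeffs[i] > coeffs[i - 1] and coeffs[i] > coeffs[i + 1]:
            marks[i] = 1
        elif coeffs[i] < coeffs[i - 1] and coeffs[i] < coeffs[i + 1]:
            marks[i] = -1
    return marks

def coefficient_features(coeffs):
    """Variance, extrema mark sequence, and mean of local maxima and minima."""
    n = len(coeffs)
    mean = sum(coeffs) / n
    variance = sum((c - mean) ** 2 for c in coeffs) / n  # population variance
    marks = extrema_marks(coeffs)
    maxima = [c for c, m in zip(coeffs, marks) if m == 1]
    minima = [c for c, m in zip(coeffs, marks) if m == -1]
    return {
        "variance": variance,
        "marks": marks,
        "mean_max": sum(maxima) / len(maxima) if maxima else None,
        "mean_min": sum(minima) / len(minima) if minima else None,
    }

# Hypothetical decomposition coefficients of one mode component.
features = coefficient_features([0.0, 2.0, 1.0, 3.0, 0.0])
```

Applying `coefficient_features` to each retained DWT, EMD, and EWT coefficient sequence of each selected node, and concatenating the results, would give one multi-dimensional feature vector per sample.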
4. Data Processing of the Case Study
The studied case of an IES is derived from the UoM MED test case in Ref. [33], which comprises a 22-node power network, a 31-node heat network, and a 37-node gas network. The system is assumed to be fully observable, which means that all loads, flow rates, pressures, and temperatures, as well as other state variables on the transmission carriers, can be measured and monitored. The detailed structure of the studied IES case is presented in Figure 15.
In the studied case, there are three coupling devices in total, which involve eight nodes within the whole IES. The coupling relationships are listed in Table 2.
As shown in Table 2, the CHP unit is a power-heat-gas coupling device, which acts as a load in the gas network and as a source in the electric and heat networks. The gas boiler couples the heat and gas networks, behaving as a load in the gas network and a heat source in the heat network.
It is worth noting that the load data used in the UoM MED test case are not time-series data, so this study uses capacity-matched time-series load data obtained from multiple sources. More specifically, the power load data are from Belgian Grid’s 2020 annual report [34], with a sampling period of 15 min. As the data are on a yearly scale, a time resolution of 1 h was selected for broad detection in the case study, so the power load data were resampled to produce hourly data. The gas load data came from the third-party datasets of specific open-source projects [35]. The thermal load data were acquired from the building thermal simulation data offered by TRNSYS, with a data granularity of 1 h. A total of 8760 data points exist for each of the electrical and thermal loads over an entire year. The gas data were expanded through interpolation so that the size of the dataset could be upscaled to 8760 samples.
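The two resampling operations mentioned above can be sketched in Python as follows. The sketch assumes that the 15 min power data are reduced to hourly values by averaging groups of four samples and that the gas data are expanded by linear interpolation onto an evenly spaced grid; neither choice is stated in the text, and the function names and sample values are hypothetical.

```python
def downsample_mean(samples, factor):
    """Average consecutive groups of `factor` samples (e.g. 4 x 15 min -> 1 h)."""
    usable = len(samples) - len(samples) % factor  # drop an incomplete last group
    return [sum(samples[i:i + factor]) / factor for i in range(0, usable, factor)]

def upsample_linear(samples, target_len):
    """Linearly interpolate `samples` onto `target_len` evenly spaced points."""
    n = len(samples)
    if n < 2 or target_len < 2:
        raise ValueError("need at least two input samples and two output points")
    out = []
    for k in range(target_len):
        pos = k * (n - 1) / (target_len - 1)  # fractional position in the input
        i = min(int(pos), n - 2)
        w = pos - i
        out.append((1 - w) * samples[i] + w * samples[i + 1])
    return out

# Hypothetical 15 min power samples covering two hours.
hourly_power = downsample_mean([1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0], 4)
# Hypothetical coarse gas series of 365 values expanded to 8760 points.
hourly_gas = upsample_linear([float(d) for d in range(365)], 8760)
```

After both steps the three load series share the 1 h resolution and the same length, which is what allows their time-frequency features to be combined.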
The data of the electric, gas, and heat systems were adjusted to the same scale with a time resolution of 1 h so that their time-frequency features were also on the same scale. The time-frequency features of the electric, gas, and heat systems could then together compose multi-dimensional time-frequency features. The multi-dimensional features were then input into the AE to extract the hidden features within and between different nodes and subsystems.
The attacked samples were constructed based on the three attack modes described in Section 2.3. The time at which the attack started and the corresponding duration were randomly generated. The samples were split into a training set and a test set in the ratio of 7:3.
Due to improper collection, equipment failure, and other reasons, the raw data may be missing, repeated, abnormal, or redundant. The following preprocessing steps were carried out for data cleaning.
Step 1. Fill in missing values: For a single value that is missing, the method of interpolation is used to fill in the vacancy.
Step 2. Correct abnormal values: For data that are apparently out of the normal operation interval, the values should be modified. The method of interpolation is still adopted in this case.
Step 3. Eliminate redundant values: Data that are not consistent with the time resolution will be eliminated. When multiple values exist for the same time point, the average value will be retained as the only value.
Step 4. Normalize the data: Due to the significant difference in the magnitude of different types of load data, the samples are normalized to eliminate possible impact on the prediction results. The normalization process is implemented according to Equation (22).
$$x' = \frac{x - \mu}{\sigma} \tag{22}$$
where x is the raw data before normalization, and x’ is the normalized data. μ and σ denote the mean and standard deviation of the raw data, respectively.
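The cleaning steps above can be sketched in Python as follows. The sketch assumes that an isolated missing value is filled with the mean of its two neighbours (one simple form of interpolation), that duplicates are keyed by timestamp, and that the population standard deviation is used for normalization; these details, the function names, and the sample values are assumptions.

```python
import math

def fill_missing(values):
    """Step 1: replace an isolated None by the mean of its two neighbours."""
    out = list(values)
    for i, v in enumerate(values):
        if v is None:
            inside = 0 < i < len(values) - 1
            if not inside or values[i - 1] is None or values[i + 1] is None:
                raise ValueError("only isolated interior gaps are handled")
            out[i] = (values[i - 1] + values[i + 1]) / 2
    return out

def average_duplicates(records):
    """Step 3: keep one value per timestamp, the mean of all values recorded."""
    grouped = {}
    for timestamp, value in records:
        grouped.setdefault(timestamp, []).append(value)
    return {ts: sum(vs) / len(vs) for ts, vs in grouped.items()}

def z_score(values):
    """Step 4: subtract the mean and divide by the standard deviation."""
    n = len(values)
    mu = sum(values) / n
    sigma = math.sqrt(sum((v - mu) ** 2 for v in values) / n)  # population std
    if sigma == 0:
        raise ValueError("constant series cannot be normalized")
    return [(v - mu) / sigma for v in values]

# Hypothetical raw samples.
filled = fill_missing([1.0, None, 3.0])
merged = average_duplicates([(0, 1.0), (0, 3.0), (1, 5.0)])
normalized = z_score(filled)
```

Step 2 (correcting values outside the normal operating interval) can reuse `fill_missing` after such values have been set to `None`.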
The normalized time-series sample dataset under normal state is depicted in Figure 16, illustrating the time-series sample load data at grid node 9, heat network node 20, and gas network node 12.
To simulate the scenario of being attacked, 100 incentive attacks with random start times and an attack magnitude of 20% were imposed on the original data. A detailed diagram of the incentive attack process for certain load samples of the electric, heat, and gas networks is shown in Figure 17.