2.1. LSTM
As the number of layers in an MLP increases, on the one hand the optimization easily falls into a local optimum; on the other hand, the vanishing-gradient problem appears. A DNN alleviates the vanishing-gradient problem but still cannot model time series and suffers from parameter explosion. An RNN can model time series, but when it faces information from long sequences it suffers from the long-term dependence problem. The LSTM was finally developed so as not to fall easily into local optima, vanishing gradients, or long-term dependence. Therefore, for long time-series problems, the LSTM is more reliable and accurate than the above methods.
The main structure of the LSTM [12] model includes four parts: the forget gate, the input gate, the output gate, and the memory unit. They interact with each other in a specific way to filter and retain information, as shown in Figure 1.
In the figure, each arrow carries a vector from the output of one node to the inputs of other nodes; the circles represent pointwise operations, merging arrows represent concatenation, and branching arrows indicate that their content is copied and sent to different destinations. The four components of the LSTM are embedded in this structure.
The input, output, and state of the LSTM are all one-dimensional vectors, and the key formulas are shown in (1) to (5):

  i_t = σ(W_{xi} x_t + W_{hi} h_{t−1} + b_i)  (1)
  f_t = σ(W_{xf} x_t + W_{hf} h_{t−1} + b_f)  (2)
  c_t = f_t ⊙ c_{t−1} + i_t ⊙ tanh(W_{xc} x_t + W_{hc} h_{t−1} + b_c)  (3)
  o_t = σ(W_{xo} x_t + W_{ho} h_{t−1} + b_o)  (4)
  h_t = o_t ⊙ tanh(c_t)  (5)

In the formulas, σ is the sigmoid function, tanh is the hyperbolic tangent, and ⊙ denotes element-wise multiplication. x_t is the input of the neuron at time t, and h_{t−1} is the output of the neuron at time t − 1. W_{xi}, W_{xf}, W_{xc}, and W_{xo} are the weight matrices applied to x_t in the input gate, forget gate, memory unit, and output gate, respectively; W_{hi}, W_{hf}, W_{hc}, and W_{ho} are the corresponding weight matrices applied to h_{t−1}; and b_i, b_f, b_c, and b_o are the corresponding threshold (bias) terms. i_t controls the input gate, which applies the sigmoid function to the new input data and the output of the previous neuron to determine the input information of the neuron. f_t controls the forget gate; its inputs are the same as those of the input gate, and it forgets the output of the previous neuron in proportion, that is, it decides how much information from the previous neuron is retained and enters the current neuron. c_t is the cell state of the neuron at time t, which combines the cell state of the previous neuron with the new data input at time t to form a new cell state. The last part is the output gate: through its gating, the current cell state is output in proportion, giving h_t, the current cell output.
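To make the gate computations concrete, the following is a minimal PyTorch sketch of a single LSTM step that mirrors formulas (1) to (5). The function name, weight layout, and dimensions are illustrative choices and are not taken from the model configuration used in this paper.

```python
import torch

def lstm_step(x_t, h_prev, c_prev, W_x, W_h, b):
    """One LSTM step following formulas (1)-(5); names are illustrative."""
    i_t = torch.sigmoid(W_x['i'] @ x_t + W_h['i'] @ h_prev + b['i'])   # (1) input gate
    f_t = torch.sigmoid(W_x['f'] @ x_t + W_h['f'] @ h_prev + b['f'])   # (2) forget gate
    c_t = f_t * c_prev + i_t * torch.tanh(                             # (3) cell state update
        W_x['c'] @ x_t + W_h['c'] @ h_prev + b['c'])
    o_t = torch.sigmoid(W_x['o'] @ x_t + W_h['o'] @ h_prev + b['o'])   # (4) output gate
    h_t = o_t * torch.tanh(c_t)                                        # (5) cell output
    return h_t, c_t

# Toy usage with random weights (sizes are arbitrary, for illustration only).
n_in, n_hid = 4, 8
W_x = {g: 0.1 * torch.randn(n_hid, n_in) for g in 'ifco'}
W_h = {g: 0.1 * torch.randn(n_hid, n_hid) for g in 'ifco'}
b   = {g: torch.zeros(n_hid) for g in 'ifco'}
h, c = torch.zeros(n_hid), torch.zeros(n_hid)
for x_t in torch.randn(5, n_in):          # a toy sequence of five time steps
    h, c = lstm_step(x_t, h, c, W_x, W_h, b)
print(h.shape)  # torch.Size([8])
```

The loop illustrates how the same weights are reused at every time step, while the cell state c carries information forward along the sequence.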
2.2. ConvLSTM
Considering the spatial correlation of power system data, this paper proposes to use the ConvLSTM [11] network designed by Shi et al. to predict the importance of power system nodes, expecting to achieve better prediction results than the LSTM. The main structure of the ConvLSTM network is shown in Figure 2.
The core of the ConvLSTM is still the same as that of the LSTM. The difference is that the ConvLSTM model introduces convolution operations into the LSTM model, so the model can not only capture the time-series relationship but also extract spatial features in the same way as a convolutional layer. In this way, spatiotemporal sequence features are obtained, which enables the ConvLSTM network to solve spatiotemporal sequence prediction problems. The key formulas of the ConvLSTM network are as follows.
  i_t = σ(W_{xi} ∗ X_t + W_{hi} ∗ H_{t−1} + b_i)
  f_t = σ(W_{xf} ∗ X_t + W_{hf} ∗ H_{t−1} + b_f)
  C_t = f_t ⊙ C_{t−1} + i_t ⊙ tanh(W_{xc} ∗ X_t + W_{hc} ∗ H_{t−1} + b_c)
  o_t = σ(W_{xo} ∗ X_t + W_{ho} ∗ H_{t−1} + b_o)
  H_t = o_t ⊙ tanh(C_t)

In the formulas, ∗ represents the convolution operation and ⊙ represents the Hadamard (element-wise) product; X_t is the input of the neuron at time t, and H_{t−1} is the output of the neuron at time t − 1. W_{xi}, W_{xf}, W_{xc}, and W_{xo} are the weights of X_t in the different convolution operations; W_{hi}, W_{hf}, W_{hc}, and W_{ho} are the weights of H_{t−1}; and b_i, b_f, b_c, and b_o are the bias values of the convolution operations. i_t controls the input gate, which applies the sigmoid function to the new input data and the output of the previous neuron to determine the input information of the neuron. f_t controls the forget gate; its inputs are the same as those of the input gate, and it determines how much information from the previous neuron enters the current neuron. C_t is the cell state of the neuron at time t: the cell state of the previous neuron and the newly input data are combined into a new cell state. Finally, the output gate outputs the cell state in a certain proportion; that is, H_t, the current cell output, is obtained. It should be noted that X, C, H, i, f, and o are all three-dimensional tensors, and their last two dimensions represent the spatial information of rows and columns.
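As a complement to the formulas, the following is a minimal PyTorch sketch of a ConvLSTM cell. For compactness, the input-to-state and state-to-state convolutions are folded into one convolution over the concatenated [X_t, H_{t−1}], which is equivalent to applying the W_{x·} and W_{h·} convolutions separately and summing. The channel counts and kernel size are illustrative assumptions, not the settings used in this paper.

```python
import torch
import torch.nn as nn

class ConvLSTMCell(nn.Module):
    """A minimal ConvLSTM cell: LSTM gates computed with 2-D convolutions."""
    def __init__(self, in_channels, hidden_channels, kernel_size=3):
        super().__init__()
        padding = kernel_size // 2  # keep the row/column size unchanged
        # One convolution produces all four gates (i, f, candidate c, o) at once.
        self.gates = nn.Conv2d(in_channels + hidden_channels,
                               4 * hidden_channels, kernel_size, padding=padding)
        self.hidden_channels = hidden_channels

    def forward(self, x_t, h_prev, c_prev):
        # x_t: (batch, in_channels, rows, cols); h_prev, c_prev: (batch, hidden, rows, cols)
        z = self.gates(torch.cat([x_t, h_prev], dim=1))   # convolution over [X_t, H_{t-1}]
        i, f, g, o = torch.chunk(z, 4, dim=1)
        i, f, o = torch.sigmoid(i), torch.sigmoid(f), torch.sigmoid(o)
        g = torch.tanh(g)                 # candidate cell state
        c_t = f * c_prev + i * g          # Hadamard (element-wise) state update
        h_t = o * torch.tanh(c_t)         # output keeps the row/column structure
        return h_t, c_t

# Toy usage: batch of 2, one input channel on an 8x8 spatial grid.
cell = ConvLSTMCell(in_channels=1, hidden_channels=4)
x = torch.randn(2, 1, 8, 8)
h = torch.zeros(2, 4, 8, 8)
c = torch.zeros(2, 4, 8, 8)
h, c = cell(x, h, c)
print(h.shape)  # torch.Size([2, 4, 8, 8])
```

Because the gates are produced by convolutions, the hidden state H and cell state C retain the two spatial dimensions of the input, which is what allows the model to capture spatial correlation alongside the temporal dynamics.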