*3.1. Bi-LSTM Model*

Generally, a Bi-LSTM is composed of two LSTMs, one running in the forward direction and one in the backward direction; a single LSTM can capture long-term dependencies in only one direction. The Bi-LSTM therefore preserves more information by capturing long-term dependencies in both directions, which suits power generation forecasting scenarios that require large-scale data processing. The architecture of the Bi-LSTM model is shown in Figure 1.

**Figure 1.** Architecture of the Bi-LSTM model.

From Figure 1, it can be seen that the forward LSTM extracts past information from the input sequence, while the backward LSTM extracts future information. The final output is obtained by combining, at each time step, the corresponding outputs of the forward and backward LSTMs, which can be expressed as:

$$h\_t = f(w\_1 x\_t + w\_2 h\_{t-1})\tag{3}$$

$$h\_t' = f\left(w\_3 x\_t + w\_5 h\_{t+1}'\right) \tag{4}$$

$$o\_t = g\left(w\_4 h\_t + w\_6 h\_t'\right) \tag{5}$$

where *h<sub>t</sub>* and *h′<sub>t</sub>* are the current node outputs of the forward and backward directions, respectively; *o<sub>t</sub>* is the output of the current cell; and *w*<sub>1</sub>, *w*<sub>2</sub>, *w*<sub>3</sub>, *w*<sub>4</sub>, *w*<sub>5</sub> and *w*<sub>6</sub> are the weight coefficients.

According to Equations (3)–(5), *w*<sub>1</sub> and *w*<sub>3</sub> are the weights from the input to the forward and backward hidden layers, *w*<sub>2</sub> and *w*<sub>5</sub> are the recurrent weights within each hidden layer, and *w*<sub>4</sub> and *w*<sub>6</sub> are the weights from the forward and backward hidden layers to the output layer. Compared with a standard LSTM, the Bi-LSTM improves the globality and completeness of feature extraction.
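To illustrate, the recurrences in Equations (3)–(5) can be sketched in NumPy with scalar weights (a toy simplification for clarity; a real Bi-LSTM cell uses gated units and weight matrices, and `f`, `g` here are assumed to be tanh and the identity):

```python
import numpy as np

def bilstm_combine(x, w1, w2, w3, w4, w5, w6, f=np.tanh, g=lambda z: z):
    """Toy sketch of Equations (3)-(5): a bidirectional recurrent pass
    over a 1-D input sequence x with scalar weights."""
    T = len(x)
    h_fwd = np.zeros(T)  # forward hidden states, Eq. (3)
    h_bwd = np.zeros(T)  # backward hidden states, Eq. (4)
    for t in range(T):   # forward pass uses the past context h_{t-1}
        prev = h_fwd[t - 1] if t > 0 else 0.0
        h_fwd[t] = f(w1 * x[t] + w2 * prev)
    for t in reversed(range(T)):  # backward pass uses the future context h'_{t+1}
        nxt = h_bwd[t + 1] if t < T - 1 else 0.0
        h_bwd[t] = f(w3 * x[t] + w5 * nxt)
    # Eq. (5): combine both directions at each time step
    o = g(w4 * h_fwd + w6 * h_bwd)
    return o, h_fwd, h_bwd
```

Note that the forward states depend only on earlier inputs and the backward states only on later ones, so the combined output at each time step sees the entire sequence.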
