Ultra-Short-Term Wind Power Forecasting Based on the MSADBO-LSTM Model

Zhao, Ziquan; Bai, Jing

doi:10.3390/en17225689

Open AccessArticle

Ultra-Short-Term Wind Power Forecasting Based on the MSADBO-LSTM Model

by

Ziquan Zhao

and

Jing Bai

^*

School of Electrical and Information Engineering, Beihua University, Jilin 132013, China

^*

Author to whom correspondence should be addressed.

Energies 2024, 17(22), 5689; https://doi.org/10.3390/en17225689

Submission received: 2 October 2024 / Revised: 7 November 2024 / Accepted: 8 November 2024 / Published: 14 November 2024

(This article belongs to the Section A3: Wind, Wave and Tidal Energy)

Download

Browse Figures

Versions Notes

Abstract

:

To address the challenges of the strong randomness and intermittency of wind power generation that affect wind power grid integration, power system scheduling, and the safe and stable operation of the system, an improved Dung Beetle Optimization Algorithm (MSADBO) is proposed to optimize the hyperparameters of the Long Short-Term Memory neural network (LSTM) for ultra-short-term wind power forecasting. By applying Bernoulli mapping for population initialization, the model’s sensitivity to wind power fluctuations is reduced, which accelerates the algorithm’s convergence speed. Incorporating an improved Sine Algorithm (MSA) into the forecasting model for this nonlinear problem significantly improves the position update strategy of the Dung Beetle Optimization Algorithm (DBO), which tends to be overly random and prone to local optima. This enhancement boosts the algorithm’s exploration capabilities both locally and globally, improving the rapid responsiveness of ultra-short-term wind power forecasting. Furthermore, an adaptive Gaussian–Cauchy mixture perturbation is introduced to interfere with individuals, increasing population diversity, escaping local optima, and enabling the continued exploration of other areas of the solution space until the global optimum is ultimately found. By optimizing three hyperparameters of the LSTM using the MSADBO algorithm, the prediction accuracy of the model is greatly enhanced. After simulation validation, taking winter as an example, the MSADBO-LSTM predictive model achieved a reduction in the MAE metric of 40.6% compared to LSTM, 20.12% compared to PSO-LSTM, and 3.82% compared to DBO-LSTM. The MSE decreased by 45.4% compared to LSTM, 40.78% compared to PSO-LSTM, and 16.62% compared to DBO-LSTM. The RMSE was reduced by 26.11% compared to LSTM, 23.05% compared to PSO-LSTM, and 8.69% compared to DBO-LSTM. Finally, the MAPE declined by 79.83% compared to LSTM, 31.88% compared to PSO-LSTM, and 29.62% compared to DBO-LSTM. This indicates that the predictive model can effectively enhance the accuracy of wind power forecasting.

Keywords:

ultra-short-term wind power forecasting; long short-term memory neural network; improved dung beetle optimization algorithm; sine algorithm; adaptive Gaussian–Cauchy mixture perturbation

1. Introduction

As society develops, the drawbacks of traditional fossil fuel usage have become increasingly prominent. In contrast to the use of conventional fossil fuels, which can cause severe harm to the environment and ecology, new types of environmentally friendly energy, such as wind power generation, are proving to be more sustainable and worthy of promotion [1]. Wind power generation has the potential to alleviate the shortage of conventional energy sources and mitigate the increasingly severe environmental pollution [2,3]. However, due to the inherent properties of natural wind, such as intermittency, randomness, and volatility, wind power generation exhibits strong randomness, intermittency, and high variability, which pose significant challenges for wind power grid integration and power system scheduling, consequently affecting the quality of electricity and the safe operation of the system [4]. Therefore, accurate wind power forecasting is crucial for alleviating peak load pressure on the grid, reducing the backup capacity of the power system, and enhancing the level of wind power injection and system reliability [5].

Wind power forecasting can be categorized into indirect and direct methods based on the acquisition approach. Indirect methods refer to physical-driven prediction techniques, while direct methods utilize statistical methods and machine learning to learn patterns from historical data [5]. Among these, physical-driven methods are often complex in modeling, costly, and computationally intensive [6]. Physical methods require the establishment of models based on complex physical relationships between various physical quantities, such as meteorological and topographical information. The computational cost for predictions is high, making them unsuitable for ultra-short-term forecasting [7]. Traditional statistical methods leverage historical data from wind farms to generate linear characteristics of wind output, such as Autoregressive Integrated Moving Average (ARIMA) [8] and Bayesian regression [9]. However, the linear characteristics of statistical models are not suitable for nonlinear and non-stationary predictions. With the rapid advancement of information technology and artificial intelligence, machine learning methods have been widely applied in the field of wind power forecasting, including artificial neural networks (ANNs) [10] and Support Vector Machines (SVMs) [11].

Early machine learning methods struggled to handle multivariate time-series data characterized by high dimensionality, temporal dynamics, and complexity. The expressive power and feature extraction capabilities of neural networks improve with increased network depth, allowing deep learning methods to better mine the high-dimensional, deep features contained within the data [12]. Due to their outstanding performance in extracting and fitting data features, deep learning methods have been widely applied in wind power forecasting in recent years [13], Among these methods, Long Short-Term Memory (LSTM) neural networks [14] stand out due to their strong memory retention capabilities, enabling them to effectively extract valuable information from long sequence data. Therefore, they have found widespread applications in the field of forecasting. References [15,16,17,18,19] propose several wind power forecasting models based on LSTM. These models utilize LSTM networks to learn the temporal features in wind power data, achieving higher prediction accuracy than linear models, traditional machine learning models, and artificial neural networks. Reference [20] proposes an improved Long Short-Term Memory (LSTM) neural network for wind power forecasting, demonstrating excellent predictive performance. However, standalone LSTM neural networks often face issues such as a high number of gate units, slow training speeds, and relatively low stability in predictive models. Reference [21] highlights the current research trend of continuously improving the prediction accuracy of LSTM models by combining them with bio-inspired ensemble forecasting models. Among these, the Dung Beetle Optimizer (DBO) algorithm [22], inspired by the rolling, dancing, foraging, stealing, and reproductive behaviors of dung beetles, generates diverse regional search strategies and update rules, leading to the development of a DBO-LSTM neural network for short-term power load forecasting. In this model, different populations of dung beetles are used for searching, replacing the traditional approach of manually setting parameters based on human experience. Additionally, this method enhances the generalization ability of the LSTM network in handling computational time-series problems.

However, the Dung Beetle Optimizer (DBO) may exhibit low convergence accuracy and be prone to local optima in certain situations. To further enhance the accuracy of wind power forecasting, an improved DBO algorithm is proposed to address the global optimization problem, which we name MSADBO. Inspired by the Modified Sine Algorithm (MSA), we endow the dung beetle with the global exploration and local development capabilities of MSA, expanding its search range, improving global exploration ability, and reducing the likelihood of falling into local optima. Additionally, chaotic mapping initialization and mutation operators are introduced for perturbation.

The improved DBO optimizes three hyperparameters in the LSTM, significantly enhancing the model’s diagnostic accuracy. Ultimately, when compared with other models, MSADBO-LSTM demonstrates the best prediction accuracy, robustness, and the least lag, indicating that this model can accurately capture the changing trends of wind power and respond promptly to future variations, showcasing high practicality and reliability.

The following sections will introduce the principles of the MSADBO-LSTM model and highlight the improvements made based on the original DBO-LSTM. A performance comparison between MSADBO and other optimization algorithms, as well as a comparison of wind power forecasting results, will also be presented. Finally, the superiority of this algorithm will be demonstrated.

2. Model Principles

2.1. Long Short-Term Memory Network

LSTM (Long Short-Term Memory network) is a specially designed recurrent neural network (RNN) aimed at addressing the gradient vanishing and exploding issues faced by standard RNNs when processing long sequence data. The basic structure of an LSTM unit is illustrated in Figure 1. What makes the LSTM unit unique is its inclusion of three different types of gates: the forget gate, input gate, and output gate. These gates precisely control the flow of information between units by weighting the input data and hidden states, effectively managing long-term dependencies.

The main function of the forget gate is to determine which information should be discarded from the cell state. It generates a value between 0 and 1 by weighting and activating the hidden state from the previous time step and the current input, where 0 indicates complete removal and 1 indicates complete retention. The input gate consists of two parts: a sigmoid layer and a tanh layer. The sigmoid layer determines which input values should be updated, outputting a value between 0 and 1 that controls the new information to be introduced. The tanh layer generates new candidate values that, after being controlled by the sigmoid layer, may be added to the cell state to update its value. The output gate is responsible for determining the output value based on the current cell state and the hidden state from the previous time step. Specifically, it first uses a sigmoid activation function to decide which information should be output, then applies a tanh activation function to transform the cell state, ultimately generating the new hidden state and output.

Through these gate mechanisms, LSTM can effectively retain and utilize long-term information when processing long sequence data, significantly improving the performance of traditional RNNs, especially in tasks such as language modeling and time-series prediction. As a result, LSTM networks are capable of capturing dependencies in sequence data over extended time horizons, thereby enhancing the model’s predictive ability and accuracy.

If the input sequence is

(x_{1}, x_{2}, \dots {, x}_{t})

and the hidden layer is

(h_{1}, h_{2}, \dots, h_{t})

, then at time t, we have:

f_{t} = σ (W_{f} h_{t - 1} + U_{f} x_{t} + b_{f}),

(1)

i_{t} = σ (W_{i} h_{t - 1} + U_{i} x_{t} + b_{i}),

(2)

{\tilde{c}}_{t} = \tan h (W_{c} h_{t - 1} + U_{c} x_{t} + b_{c}),

(3)

c_{t} = f_{t} \times c_{t - 1} + i_{t} \times {\tilde{c}}_{t},

(4)

o_{t} = σ (W_{o} h_{t - 1} + U_{o} x_{t} + b_{o}),

(5)

h_{t} = o_{t} \times \tan h (C_{t}),

(6)

In the equation, f_t, i_t and o_t represent the forget gate, input gate, and output gate, respectively; c_t is used to update the memory cell state. W_f, W_i, W_c, W_o, U_f, U_i, U_c and U_o represent the weights for each network layer; b_f, b_i, b_c and b_o represent the biases for each function;

σ a n d \tan h

are the activation functions, respectively.

2.2. Dung Beetle Optimization Algorithm

The Dung Beetle Optimization (DBO) algorithm is a swarm intelligence optimization algorithm based on the behavioral characteristics of dung beetles. The algorithm simulates various behaviors of dung beetles, such as rolling, dancing, breeding, foraging, and stealing, and designs a series of update rules and strategies. Each dung beetle group consists of four different types of agents: rolling beetles, breeding beetles (breeding balls), small beetles, and stealing beetles.

(1): Rolling beetles

Dung beetles roll balls of dung to suitable locations. While rolling, they use cues such as the sun or wind direction to maintain a straight path. To simulate this behavior in the algorithm, dung beetles need to move in a given direction within the search space. During the rolling process, the positions of these beetles are updated, their position changes are shown in Equations (7) and (8):

x_{i} (t + 1) = x_{i} (t) + α k x_{i} (t - 1) + b Δ x,

(7)

Δ x = |x_{i} (t) - X^{w}|,

(8)

where t represents the iteration count; x_i(t) indicates the position information of the i-th dung beetle at the t-th iteration;

α

represents the natural coefficient, assigned a value of 1 or −1. When

α = 1

, it indicates no deviation from the direction; when

α = - 1

, it indicates a deviation from the direction. K

ϵ

(0,0.2] represents the deflection coefficient; b indicates a constant value belonging to (0,1); X^w represents the global worst position; and

Δ x

is used to simulate changes in light intensity.

When these dung beetles encounter an obstacle blocking their path, they can use their dancing behavior to replan their route. In this case, the position update formula for these dung beetles is shown in Equation (9):

x_{i} (t + 1) = x_{i} (t) + \tan (θ) |x_{i} (t) - x_{i} (t - 1)|,

(9)

where

θ ϵ [0, π]

represents the deflection angle. When

θ = 0, \frac{π}{2} a n d π

, the position of the dung beetle will not be updated.

(2): Breeding beetles (breeding balls)

To safely breed their offspring, dung beetles roll the dung balls to a secure location and hide them inside, where they lay their eggs. Therefore, the boundary selection strategy for the dung beetles is shown in Equations (10) and (11):

b_{L}^{*} = \max (X^{*} (1 - R), b_{L}),

(10)

b_{U}^{*} = \min (X^{*} (1 + R), b_{U}),

(11)

where

b_{L}^{*} a n d

b_{U}^{*}

represent the lower and upper bounds of the egg-laying area, respectively;

X^{*}

denotes the current optimal position;

R = 1 - t ∕ T_{m a x}

,

T_{m a x}

indicate the maximum number of iterations; and

b_{L} a n d b_{U}

represent the lower and upper bounds of the optimization problem, respectively.

Once the location of the egg-laying area is determined, female dung beetles will choose the breeding ball in that area to lay their eggs. Each female dung beetle lays one egg per iteration. As seen from Equations (10) and (11), the boundaries of this egg-laying area are dynamic and depend on the value of R. Therefore, the position of the breeding ball also changes dynamically during the iterations, which can be represented as follows:

B_{i} (t + 1) = X^{*} + b_{1} (B_{i} (t) - b_{L}^{*}) + b_{2} (B_{i} (t) - b_{U}^{*}),

(12)

where

B_{i} (t)

is the position of the i-th dung ball at the t-th iteration; b₁ is a D-dimensional random vector following a normal distribution, and b₂ represents a D-dimensional random vector within the range [0, 1].

(3): Small beetles

When some larvae mature into adult dung beetles and emerge from the ground to forage, they are referred to as small beetles. The boundaries of their optimal foraging area are defined as follows:

b_{L}^{b} = \max (X^{b} (1 - R), b_{L}),

(13)

b_{U}^{b} = \min (X^{b} (1 + R), b_{U}),

(14)

where

b_{L}^{b} a n d b_{U}^{b}

represent the lower and upper bounds of the optimal foraging area for small beetles, respectively;

X^{b}

denotes the global best position. Therefore, the position update for the small beetles is as follows:

x_{i} (t + 1) = x_{i} (t) + C_{1} (x_{i} (t) - b_{L}^{b}) + C_{2} (x_{i} (t) - b_{U}^{b}),

(15)

where x_i(t) represents the position information of the i-th small beetle at the t-th iteration; C₁ indicates a random number following a normal distribution; and C₂ represents a random vector belonging to the interval (0,1).

(4): Stealing beetles

Some dung beetles will steal dung balls from other beetles, and these dung-stealing beetles are referred to as “thieving beetles”. In Equations (13) and (14), it can be observed that X^b represents the best food source. Therefore, we can consider that the area near X^b is the optimal location for food competition. During the iteration process, the positions of the thieving beetles are continuously updated and can be described as follows:

x_{i} (t + 1) = x^{b} + S \times g (|x_{i} (t) - X^{*}| + |x_{i} (t) - X^{b}|),

(16)

where x_i(t) represents the position information of the i-th thief at the t-th iteration; g is a random vector of size

1 \times D

following a normal distribution; and S denotes a constant.

2.3. Improved Dung Beetle Optimization Algorithm

2.3.1. The Purpose of the Improvement

Although the Dung Beetle Optimization (DBO) algorithm performs better than other algorithms in optimization, exhibiting strong optimization capability and fast convergence speed, it still faces imbalances in global exploration and local exploitation when tackling complex problems. This can lead to the risk of getting trapped in local optima and indicates weaker global exploration ability. Therefore, to enhance the exploration capacity of DBO, improvements are made to the existing optimization algorithm by incorporating Bernoulli mapping strategies, embedding an improved Sine Algorithm strategy, and utilizing adaptive Gaussian–Cauchy mutation perturbations to address these shortcomings.

2.3.2. Initialize the Population Using Bernoulli Mapping

Before the improvement, the population initialization of the Dung Beetle Optimization (DBO) algorithm was carried out using random generation. The shortcomings of this method include the uneven distribution of the beetles’ positions, weak global exploration capability, low population diversity, and a tendency to get trapped in local optima. In contrast, chaotic mapping combines determinism and randomness, characterized by randomness and non-periodicity. Chaotic initialization can enhance the search breadth of optimization algorithms and can be used to address global optimization problems [23]. Currently, there are many types of chaotic mappings, among which the Bernoulli mapping is one. In the field of optimization, it can replace the random number initialization of populations, improving the distribution quality of the dung beetle population and enhancing global search capability. Therefore, we use Bernoulli mapping to initialize the positions of the dung beetles. First, we project the values obtained through the Bernoulli mapping relation into the chaotic variable space. Then, the resulting chaotic values are mapped into the initial space of the algorithm through linear transformation. The specific expression for the Bernoulli mapping is as follows:

z_{n + 1} = \{\begin{matrix} \frac{z_{n}}{1 - β}, 0 \leq z_{n} \leq 1 - β \\ \frac{z_{n} - (1 - β)}{β}, 1 - β \leq z_{n} \leq 1 \end{matrix},

(17)

where

β

is the mapping parameter,

β ϵ (0, 1)

, which we set to

β = 0.518, z_{0} = 0.326

to achieve optimal value performance.

The distribution of the Bernoulli mapping chaotic sequence is shown in Figure 2. In Figure 2a, the scatter plot helps to observe whether the initial points are uniformly distributed or if there are certain clustered regions. In Figure 2b, the histogram primarily shows the frequency distribution of the system states, revealing whether the chaotic system is uniformly distributed, its randomness, and the diversity of its distribution. This method distributes the initial points almost uniformly (equitably) within the unit interval and ensures that the latter half of the generated sequence does not overlap. We have also demonstrated through statistical tests that the generated sequence exhibits good randomness. This allows for a more uniform distribution of the population initialized by the Bernoulli mapping, enhancing the quality and diversity of the population while avoiding issues such as getting trapped in local optima.

2.3.3. Introduction of the Improved Sine Algorithm

The Improved Sine Algorithm (MSA) [24] is inspired by various algorithms related to the Sine Cosine Algorithm (SCA) [25], the Sine Algorithm (SA) [26], the Exponential Sine Cosine Algorithm (ESCA) [27], and the Improved Sine Cosine Algorithm (ISCA) [28]. It utilizes the sine function in mathematics for iterative optimization, demonstrating strong global exploration capabilities. To achieve a good balance between global exploration and local exploitation, an adaptive variable inertia weight coefficient

ω_{t}

is introduced during the position update process. The position update formula for the Improved Sine Algorithm is shown in Equation (18):

x_{i} (t + 1) = ω_{t} x_{i} (t) + r_{1} \times \sin (r_{2}) \times [r_{3} p_{i} (t) - x_{i} (t)],

(18)

where t is the current iteration count,

ω_{t}

is the inertia weight,

x_{i} (t)

is the i-th position component of individual X at the t-th iteration,

p_{i} (t)

is the i-th component of the best individual position variable at the t-th iteration, r₁ is a nonlinear decreasing function, r₂ is a random number in the interval [0, 2π], and r₃ is a random number in the interval [−2, 2].

r₁ represents the search distance and direction of the dung beetle, optimizing the search method of the DBO algorithm. Its value is shown in Equation (19):

r_{1} = \frac{ω_{\max} - ω_{\min}}{2} \cos (\frac{π t}{T_{\max}}) + \frac{ω_{\max} + ω_{\min}}{2},

(19)

where

ω_{m a x}

and

ω_{m i n}

represent the maximum and minimum values of

ω_{t}

, t denotes the current iteration count, and T_max represents the maximum number of iterations.

By using the adaptive coefficient

ω_{t}

, the search space is gradually reduced, and as the number of iterations increases, the inertia weight decreases. In the early stages of the algorithm, a relatively large inertia weight allows for strong global exploration capability, while a relatively small inertia weight in the later stages helps improve local exploitation ability. The formula for the adaptive coefficient

ω_{t}

is as follows:

ω_{t} = ω_{\max} - (ω_{\max} - ω_{\min}) \times \frac{t}{T_{\max}},

(20)

To further enhance the global exploration and local exploitation capabilities of the DBO algorithm, a sine guiding mechanism has been introduced on top of the existing framework. By applying sine operations to the entire population of dung beetles during the rolling phase, the positions of the beetles are guided during updates. The improved formula is as follows:

x_{i} (t + 1) = \{\begin{matrix} x_{i} (t) + α \times k \times x_{i} (t - 1) + b \times Δ x, δ < S T \\ ω_{t} x_{i} (t) + r_{1} \times \sin (r_{2}) \times [r_{3} p_{i} (t) - x_{i} (t)], δ \geq S T \end{matrix},

(21)

where

δ = r a n d (1)

, ST

ϵ

(0.5,1]. In the improved position update formula, when

δ <

ST, it indicates that the dung beetle is rolling with a specific target, remaining in the normal global exploration phase. Conversely, when

δ >

ST, it signifies that the dung beetle does not have a clear rolling target but will move using a sine function when searching. By introducing this improved sine guiding mechanism, the excessive randomness in the DBO algorithm’s position update strategy can be significantly mitigated, while also addressing the original algorithm’s tendency to become trapped in local optima. This MSA sine guiding mechanism allows the dung beetles to engage in global exploration and local exploitation within the specified range of the algorithm, effectively expanding the search space. It facilitates a gradual convergence toward the same optimal solution, the target function value, thereby enhancing the algorithm’s global optimization capability.

2.3.4. Adaptive Gaussian–Cauchy Mixture Mutation Perturbation

In the final stage of the algorithm iteration, which is the foraging phase, the dung beetles tend to gather near the optimal position. However, the current position may not be the global optimum, causing the beetles to continuously search for the optimal position around their current location. This leads to the inability to discover the true optimal solution, resulting in them being trapped in local optima. To address this issue, mutation perturbations are generally employed to interfere with individuals, increasing the diversity of the population. This allows the beetles to escape local optima and explore other regions of the solution space until they ultimately find the global optimum.

In the Dung Beetle Optimization algorithm, a mutation operator is incorporated, where Gaussian mutation and Cauchy mutation are two commonly used mutation operators. Considering the advantages and disadvantages of both, an adaptive Gaussian–Cauchy mixture perturbation strategy is proposed, which combines the strengths of Cauchy mutation and Gaussian mutation. The specific formula is shown in Equation (19):

H^{b} (t) = X^{b} (t) \times (1 + μ_{1} \times G auss (σ) + μ_{2} cauchy (σ)),

(22)

where

X^{b} (t)

represents the optimal position of individual X at the t-th iteration,

H^{b} (t)

is the position of

X^{b} (t)

after the Gaussian–Cauchy mixture perturbation in the t-th iteration,

G a u s s (σ)

is the Gaussian mutation operator, and

c a u c h y (σ)

is the Cauchy mutation operator,

μ_{1} = \frac{t}{T_{m a x}}

,

μ_{2} = \frac{1 - t}{T_{m a x}}

.

In the early iterations of the algorithm, mutation perturbations are performed using the Cauchy distribution function, enabling global exploration and rapid convergence. As the algorithm continues to iterate, the positions of the dung beetles do not stabilize. At this point, the algorithm mainly employs the Gaussian distribution function to perturb the population, helping the algorithm escape local optima. By coordinating the characteristics of both Gaussian and Cauchy distribution functions, the diversity of the dung beetles is enhanced, further improving the algorithm’s local exploitation and global exploration capabilities. However, it cannot be guaranteed that the new position obtained after mutation perturbation will always have a better fitness than the original position. Therefore, after performing the mutation perturbation update, a greedy mechanism [29] is introduced to compare the fitness of the new and old positions in order to determine whether to update the position.

f (x)

represents the fitness value of position x, and the formula for its greedy mechanism is shown in Equation (23):

X^{b} = \{\begin{matrix} H^{b} (t), f [H^{b} (t)] < f [X^{b} (t)] \\ X^{b} (t), f [H^{b} (t)] \geq f [X^{b} (t)] \end{matrix},

(23)

2.4. Performance Testing of MSADBO

To demonstrate that MSADBO exhibits strong optimization capabilities and convergence, this study compares the MSADBO algorithm with the Whale Optimization Algorithm (WOA) [30], the Grey Wolf Optimizer (GWO) [31], and the Dung Beetle Optimizer (DBO). These algorithms are tested on three single-peak benchmark functions (F1, F3, F5) and three multi-peak benchmark functions (F8, F11, F13). The unimodal benchmark test functions are used to assess the performance and effectiveness of optimization algorithms, evaluating their optimization capabilities and convergence speeds. In contrast, multimodal benchmark test functions are employed to determine whether optimization algorithms can avoid getting trapped in local minima and find the global optimal solution, thereby evaluating their global search and exploration abilities. The expressions of the test functions are shown in Table 1, the results of the benchmark test functions are shown in Table 2, and the test results are illustrated in Figure 3.

From Figure 3, it is evident that the MSADBO algorithm exhibits excellent optimization performance, high precision, fast convergence speed, and good stability. Additionally, this algorithm demonstrates strong global search capabilities and the ability to escape local optima.

3. MSADBO-LSTM Prediction Model

The accuracy of the LSTM neural network model in predicting short-term wind power generation primarily depends on the L2 regularization parameter, learning rate, and the number of neurons in the hidden layer. Although the DBO algorithm can optimize these parameters for LSTM, the final optimization results may suffer from issues such as uneven initial population distribution, weak local exploitation capability, and a tendency to get trapped in local optima. Therefore, this paper improves the DBO algorithm by proposing the MSADBO optimization algorithm. It initializes the population using Bernoulli mapping and employs an improved sine algorithm to optimize the positions of rolling dung beetles, thereby enhancing the algorithm’s exploration capabilities in both local and global contexts. Additionally, an adaptive variable inertia weight coefficient is introduced to improve the positions of thieving beetles, balancing the algorithm’s global and local exploration abilities. Finally, an adaptive Gaussian–Cauchy mixture perturbation mutation strategy is implemented to strengthen the algorithm’s global search capability and its ability to escape local optima.

The process of the MSADBO-LSTM model is shown in Figure 4.

4. Experimental Results and Analysis

4.1. Data Sources and Evaluation Metrics

(1): Data Sources

The dataset in this study is sourced from the standard dataset provided by the “China Software Cup” University Student Software Design Competition. It contains operational data from 10 wind farms over the course of one year, totaling over 300,000 entries. To verify that the model proposed in this paper can adapt to different seasons, the data from wind farms in summer and winter were randomly selected, with 15 days from each season chosen as the experimental subjects. This dataset includes the following five feature variables: wind direction, wind speed, temperature, humidity, and atmospheric pressure, along with the actual wind power output. The power data extraction frequency is set to collect 96 sample points per day, with data being recorded every 15 min, resulting in a total of 2884 wind power data entries. We only used a past generated power series to predict the next power output (15 min in the future). The data from the first 12 days of both seasons were selected as training samples to predict the wind power for the next 3 days, with a training-to-testing sample ratio of 4:1.

(2): Evaluation Metrics

The prediction accuracy of each forecasting model is evaluated and compared using evaluation metrics. The selected metrics are Mean Absolute Error (MAE), Mean Squared Error (MSE), Root Mean Squared Error (RMSE), Mean Absolute Percentage Error (MAPE), and the coefficient of determination (R²). These five evaluation metrics are used to measure the discrepancies between predicted values and actual values from different perspectives. The Mean Squared Error (MSE) is used as the objective function to evaluate model performance, where a smaller fitness value indicates higher prediction accuracy. The Root Mean Squared Error (RMSE) typically represents the degree of dispersion of the results. Mean Absolute Error (MAE) and Mean Absolute Percentage Error (MAPE) can indicate the bias in predictions. Additionally, the coefficient of determination (R²) is used to measure the linear correlation between actual values and predicted values. The formulas for each error metric are as follows:

E_{M A E} = \frac{1}{n} \sum_{i = 1}^{n} |y_{i} - {\hat{y}}_{i}|,

(24)

E_{M S E} = \frac{1}{n} \sum_{i = 1}^{n} {(y_{i} - {\hat{y}}_{i})}^{2},

(25)

E_{R M S E} = \sqrt{\frac{1}{n} \sum_{i = 1}^{n} {(y_{i} - {\hat{y}}_{i})}^{2}},

(26)

E_{M A P E} = \frac{1}{n} \sum_{i = 1}^{n} |\frac{{\hat{y}}_{i} - y_{i}}{y_{i}}| \times 100 %,

(27)

R^{2} = 1 - \frac{\sum_{i = 1}^{n} {({\hat{y}}_{i} - y_{i})}^{2}}{\sum_{i = 1}^{n} {({\bar{y}}_{i} - y_{i})}^{2}},

(28)

In the equations, n represents the sample size,

y_{i}

denotes the actual value of the i-th wind power output,

{\hat{y}}_{i}

represents the predicted value, and

{\bar{y}}_{i}

indicates the mean value.

4.2. Data Preprocessing

4.2.1. Data Cleaning

When collecting wind power data, various unstable factors can affect data accuracy, such as equipment failure, communication issues, or other reasons that result in missing data points. Additionally, sensor errors can lead to outliers in the data. To improve the quality of the wind power data and enhance the accuracy of the model’s predictions, the 3-sigma rule is used to eliminate outliers from the data, and interpolation methods are applied to fill in the missing data points. The formula for the 3-sigma rule is as follows:

λ = \sqrt{\frac{\sum_{i = 1}^{n} {(x_{i} - \bar{x})}^{2}}{n - 1}},

(29)

x - \bar{x} > 3 λ,

(30)

In the formula, n represents the sample size, x represents the feature value matrix, and

\bar{x}

represents the mean value matrix of the feature values.

4.2.2. Feature Variable Selection

To improve the computational efficiency and prediction accuracy of the model, the Spearman correlation analysis method is applied to the cleaned data, removing features that are either irrelevant or weakly correlated. Figure 5 shows the Spearman correlation coefficient matrix, which describes the correlation between each feature. The absolute values of the correlation coefficients between air pressure and temperature with wind power generation are both less than 0.2, indicating that there is almost no correlation and making them unsuitable for prediction. Therefore, these two features are removed in subsequent predictions.

4.3. Simulation Experiment and Analysis

To validate the prediction performance of the proposed model, the 1442 sets of sample data for summer and winter will be separately input into the trained models of LSTM, PSO-LSTM, DBO-LSTM, and MSADBO-LSTM for comparison. The comparison charts of the four model predictions for summer and winter are shown in Figure 6 and Figure 7.

As shown in Figure 6 and Figure 7, different optimization algorithms have varying impacts on the accuracy of the prediction models. The MSADBO-LSTM model is closer to the actual values compared to other models and is able to track the actual values more quickly at turning points, which means the performance in different seasons has met expectations. To further analyze the prediction results more clearly, the model’s prediction performance is evaluated using five additional evaluation metrics, as shown in Table 3 and Table 4.

As shown in Table 3 and Table 4, in different seasons, the MSADBO-LSTM prediction model shows improvements across all five evaluation metrics, demonstrating its superior performance in predicting ultra-short-term wind power data, can adapt to different seasons. Taking winter as an example, compared to the LSTM, PSO-LSTM, and DBO-LSTM models, the MSADBO-LSTM model achieves reductions in MAE of 40.6%, 20.12%, and 3.82%, respectively; MSE achieves reductions of 45.4%, 40.78%, and 16.62%; RMSE achieves reductions of 26.11%, 23.05%, and 8.69%; and MAPE achieves reductions pf 79.83%, 31.88%, and 29.62%. Additionally, the comparison of R² values shows that the MSADBO-LSTM model has the best fit. The optimization capability of MSADBO is superior to that of DBO, as it is evident that the improvements to the DBO optimization algorithm have addressed the shortcomings of the original DBO. These enhancements have increased the global search capability and enabled faster escape from local optima.

From the comparative analysis above, it can be seen that the MSADBO-LSTM prediction model proposed in this paper more effectively matches historical wind power load data and related factors with the model. This model is better at capturing wind power load trends and provides more accurate predictions for future wind power loads.

5. Conclusions

This paper proposes an MSADBO-LSTM combined prediction model to improve the accuracy of ultra-short-term wind power forecasting. Through experimental analysis and model comparisons, the following conclusions can be drawn:

(1): By applying Bernoulli mapping for population initialization based on the original DBO algorithm, the optimal solution can be obtained more quickly and accurately, improving the prediction model’s adaptability to wind power fluctuations. The introduction of the improved sine guiding mechanism and adaptive Gaussian–Cauchy mixture mutation perturbation further optimizes the LSTM neural network prediction model, effectively avoiding issues such as insufficient local exploitation and getting trapped in local optima. This approach provides good convergence and prediction stability, addressing the shortcomings of the DBO-LSTM prediction model.
(2): By comparing the prediction results with those of different wind power forecasting models in various seasons, the proposed MSADBO-LSTM model demonstrates the highest prediction accuracy, the lowest prediction error, and the best overall curve-fitting performance. This indicates that the model can adapt to different seasons and can more quickly and accurately determine the optimal values of neural network hyperparameters, thereby improving prediction accuracy and enhancing the reliability of stable power system operations.
(3): This experiment primarily focuses on ultra-short-term wind power forecasting. In future research, predictions for medium-term or long-term wind power generation could be explored. For instance, when dealing with large samples, the accuracy of the model may decrease. Future research should introduce more models to provide additional options in the field of prediction. Furthermore, hybrid models that combine different methods should be explored. Thus, further research will focus on the following aspects: first, applying the proposed model to other domains, such as photovoltaic power generation forecasting; and second, utilizing more advanced methods to enhance prediction accuracy.

Author Contributions

Conceptualization, Z.Z. and J.B.; methodology, Z.Z.; software, Z.Z.; validation, Z.Z. and J.B.; formal analysis, Z.Z.; investigation, Z.Z.; resources, Z.Z.; data curation, Z.Z.; writing—original draft preparation, Z.Z.; writing—review and editing, Z.Z.; visualization, Z.Z.; supervision, J.B.; project administration, J.B.; funding acquisition, J.B. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by Key R&D Project Number of Jilin Provincial Department of Science and Technology, grant number 20230204093YY.

Data Availability Statement

The raw data supporting the conclusions of this article will be made available by the authors on request.

Conflicts of Interest

The authors declare no conflicts of interest.

References

Ping, Y. A Review of Wind Speed and Wind Power Forecasting Methods for Wind Farms. Electron. Technol. Softw. Eng. 2019, 214, 117766. [Google Scholar]
Veers, P.; Dykes, K.; Lantz, E.; Barth, S.; Bottasso, C.L.; Carlson, O.; Clifton, A.; Green, J.; Green, P.; Holttinen, H.; et al. Grand challenges in the science of wind energy. Science 2019, 366, eaau2027. [Google Scholar] [CrossRef] [PubMed]
de Falani, S.Y.A.; González, M.O.A.; Barreto, F.M.; de Toledo, J.C.; Torkomian, A.L.V. Trends in the technological development of wind energy generation. Int. J. Technol. Manag. Sustain. Dev. 2020, 19, 43–68. [Google Scholar] [CrossRef]
Liao, X.; Wu, J.; Chen, C. Short-Term Wind Power Forecasting Model Combining Attention Mechanism and LSTM. Comput. Eng. 2022, 48, 286–297+304. [Google Scholar]
Tang, X.; Gu, N.; Huang, X.; Peng, R. Research Progress on Short-Term Wind Power Forecasting Technology. J. Mech. Eng. 2022, 58, 213–236. [Google Scholar]
Yin, H.; Ou, Z.; Huang, S.; Meng, A. A cascaded deep learning wind power prediction approach based on a two-layer of mode decomposition. Energy 2019, 189, 116316. [Google Scholar] [CrossRef]
He, B.; Ye, L.; Pei, M.; Lu, P.; Dai, B.; Li, Z.; Wang, K. A combined model for short-term wind power forecasting based on the analysis of numerical weather prediction data. Energy Rep. 2022, 8, 929–939. [Google Scholar] [CrossRef]
Szostek, K.; Mazur, D.; Drałus, G.; Kusznier, J. Analysis of the Effectiveness of ARIMA, SARIMA, and SVR Models in Time Series Forecasting: A Case Study of Wind Farm Energy Production. Energies 2024, 17, 4803. [Google Scholar] [CrossRef]
Wang, G.; Jia, R.; Liu, J.; Zhang, H. A hybrid wind power forecasting approach based on Bayesian model averaging and ensemble learning. Renew. Energy 2020, 145, 2426–2434. [Google Scholar] [CrossRef]
Sun, W.; Wang, Y. Short-term wind speed forecasting based on fast ensemble empirical mode decomposition, phase space reconstruction, sample entropy and improved back-propagation neural network. Energy Convers. Manag. 2018, 157, 1–12. [Google Scholar] [CrossRef]
Yu, C.; Li, Y.; Bao, Y.; Tang, H.; Zhai, G. A novel framework for wind speed prediction based on recurrent neural networks and support vector machine. Energy Convers. Manag. 2018, 178, 137–145. [Google Scholar] [CrossRef]
Chen, J.; Zhang, J. Short term wind power prediction using attention mechanism based TCN BiGRU model. J. Tianjin Univ. Technol. 2024, 40, 69–74. [Google Scholar]
Ding, T.; Yang, M.; Yu, Y.; Si, Z.; Zhang, Q. Short-Term Wind Power Integrated Forecasting Method Based on Error Correction. High Volt. Eng. 2022, 48, 488–496. [Google Scholar]
Schmidhuber, J.; Hochreiter, S. Long short-term memory. Neural Comput. 1997, 9, 1735–1780. [Google Scholar]
Ramadevi, B.; Kasi, V.R.; Bingi, K. Hybrid LSTM-Based Fractional-Order Neural Network for Jeju Island’s Wind Farm Power Forecasting. Fractal Fract. 2024, 8, 149. [Google Scholar] [CrossRef]
Han, L.; Jing, H.; Zhang, R.; Gao, Z. Wind power forecast based on improved Long Short Term Memory network. Energy 2019, 189, 116300. [Google Scholar] [CrossRef]
Son, N.; Yang, S.; Na, J. Hybrid forecasting model for short-term wind power prediction using modified long short-term memory. Energies 2019, 12, 3901. [Google Scholar] [CrossRef]
Wang, D.; Cui, X.; Niu, D. Wind power forecasting based on LSTM improved by EMD-PCA-RF. Sustainability 2022, 14, 7307. [Google Scholar] [CrossRef]
Huang, Q.; Wang, X. A forecasting model of wind power based on IPSO–LSTM and classified fusion. Energies 2022, 15, 5531. [Google Scholar] [CrossRef]
Shuhao, L. Wind Power Forecasting Based on Improved Long Short-Term Memory Neural Network; Guangdong University of Technology: Guangdong, China, 2023. [Google Scholar]
Zhang, Y.; Ma, T.; Li, T.; Wang, Y. Short-Term Load Forecasting Based on DBO-LSTM Model. In Proceedings of the 2023 3rd International Conference on Energy Engineering and Power Systems (EEPS), Dali, China, 28–30 July 2023; pp. 972–977. [Google Scholar]
Xue, J.; Shen, B. Dung beetle optimizer: A new meta-heuristic algorithm for global optimization. J. Supercomput. 2023, 79, 7305–7336. [Google Scholar] [CrossRef]
Liu, W.; Guo, Z.; Jiang, F.; Liu, G.; Jin, B.; Wang, D. Grey Wolf Algorithm Improved by Collaborative Encirclement Strategy and Its PID Parameter Optimization. J. Comput. Sci. Explor. 2023, 17, 620–634. [Google Scholar]
Luo, Y.; Dai, W.; Ti, Y.-W. Improved sine algorithm for global optimization. Expert Syst. Appl. 2023, 213, 118831. [Google Scholar] [CrossRef]
Li, C.; Liang, K.; Chen, Y.; Pan, M. An exploitation-boosted sine cosine algorithm for global optimization. Eng. Appl. Artif. Intell. 2023, 117, 105620. [Google Scholar] [CrossRef]
Qu, L.; He, D. A Simplified Sine Cosine Algorithm: Sine Algorithm. Comput. Appl. Res. 2018, 35, 3694–3696+3728. [Google Scholar]
Yong, L.; Liang, M. Sine Cosine Algorithm with Nonlinear Decreasing Transformation Parameters. Comput. Eng. Appl. 2017, 53, 1–5+46. [Google Scholar]
Belazi, A.; Migallón, H.; Gónzalez-Sánchez, D.; Gónzalez-García, J.; Jimeno-Morenilla, A.; Sánchez-Romero, J.L. Enhanced parallel sine cosine algorithm for constrained and unconstrained optimization. Mathematics 2022, 10, 1166. [Google Scholar] [CrossRef]
Li, Y.; Li, W.; Zhao, Y.; Liu, A. Grey Wolf Algorithm Based on Lévy Flight and Random Walk Strategy. Comput. Sci. 2020, 47, 291–296. [Google Scholar]
Nadimi-Shahraki, M.H.; Zamani, H.; Asghari Varzaneh, Z.; Mirjalili, S. A systematic review of the whale optimization algorithm: Theoretical foundation, improvements, and hybridizations. Arch. Comput. Methods Eng. 2023, 30, 4113–4159. [Google Scholar] [CrossRef]
Makhadmeh, S.N.; Al-Betar, M.A.; Abu Doush, I.; Awadallah, M.A.; Kassaymeh, S.; Mirjalili, S.; Abu Zitar, R. Recent advances in Grey Wolf Optimizer, its versions and applications. IEEE Access 2023, 12, 22991–23028. [Google Scholar] [CrossRef]

Figure 1. Structure of an LSTM neural network unit.

Figure 2. Scatter plot and histogram of the Bernoulli mapping.

Figure 3. Comparison of convergence curves for each algorithm.

Figure 4. Flowchart of the MSADBO-LSTM model.

Figure 5. Spearman correlation coefficient matrix.

Figure 6. Comparison of ultra-short-term wind power forecasting models (summer).

Figure 7. Comparison of ultra-short-term wind power forecasting models (winter).

Table 1. Comparison of evaluation metrics for different prediction models.

Function	Dim	Range	fmin
Unimodal $f_{1} (x) = \sum_{i = 1}^{n} x_{i}^{2}$	30	[−100, 100]	0
$f_{3} (x) = \sum_{i = 1}^{n} {(\sum_{j = 1}^{i} x_{j})}^{2}$	30	[−100, 100]	0
$f_{5} (x) = \sum_{i = 1}^{n - 1} [{100 (x_{i}^{2} - x_{i + 1})}^{2} + {(1 - x_{i})}^{2}]$	30	[−30, 30]	0
Multimodal $f_{8} (x) = \sum_{i = 1}^{n} (- x_{i} \sin \sqrt{\|x_{i}\|})$	30	[−500, 500]	−418.9829
$f_{11} (x) = 1 + \frac{1}{4000} \sum_{i = 1}^{n} x_{i}^{2} - \prod_{i = 1}^{n} \cos (\frac{x_{i}}{\sqrt{i}})$	30	[−600, 600]	0
$f_{13} (x) = 0.1 \{\begin{matrix} \sin^{2} (3 π x_{1}) + \sum_{i = 1}^{n} {(x_{i} - 1)}^{2} [1 + \sin^{2} (3 π x_{i} + 1)] \\ + {(x_{n} - 1)}^{2} [1 + \sin^{2} (2 π x_{n})] + \sum_{i = 1}^{n} u (x_{i}, 5,100,4) \end{matrix}\}$	30	[−50, 50]	0

Table 2. The statistical results of MSADBO and other algorithms.

Functions	Statistical Indicator	MSADBO	DBO	PSO	GWO	WOA
$f_{1} (x)$	Average Std	7.2541 × 10⁻¹⁵ 1.0259 × 10⁻¹⁵⁴	1.1335 × 10⁻¹⁸ 1.6031 × 10⁻¹⁸	353.49 108.10	0.0094005 0.0035112	1.7828 × 10⁻¹² 8.894 × 10⁻¹³
$f_{3} (x)$	Average Std	1.7853 × 10⁻¹⁶⁷ 0	2.0764 × 10⁻¹² 2.9365 × 10⁻¹²	5502.54 998.51	144.92 47.99	86,097.40 22,670.34
$f_{5} (x)$	Average Std	26.59 0.00627	27.82 0.135	30,622.63 17,759.36	31.92 2.53	28.81 0.011257
$f_{8} (x)$	Average Std	−12,568.05 1.14	−6455.74 733.65	−6627.65 470.93	−6751.73 116.77	−10,356.84 1943.56
$f_{11} (x)$	Average Std	0 0	0 0	3.51 0.73	0.084081 0.091593	1.4692 × 10⁻¹¹ 2.0595 × 10⁻¹¹
$f_{13} (x)$	Average Std	0.32226 0.20107	2.2633 0.41855	399.076 160.2983	2.0923 0.27861	1.0348 0.27238

Table 3. Comparison of evaluation metrics for different prediction models (summer).

	MAE	MSE	RMSE	MAPE (%)	R²
Prediction Model	MAE	MSE	RMSE	MAPE (%)	R²
LSTM	5217.3191	66,139,861.5418	8132.6417	86.7193	0.8596
PSO-LSTM	3136.1068	37,960,924.1371	6161.2437	74.9481	0.9194
DBO-LSTM	3283.1161	36,348,365.5532	6028.9606	50.4503	0.9228
MSADBO-LSTM	2892.9954	35,472,284.0574	5955.8613	12.1573	0.9347

Table 4. Comparison of evaluation metrics for different prediction models (winter).

	MAE	MSE	RMSE	MAPE (%)	R²
Prediction Model	MAE	MSE	RMSE	MAPE (%)	R²
LSTM	5464.1312	55,611,482.5367	7457.3107	107.4797	0.90304
PSO-LSTM	4063.2942	51,277,895.3252	7160.8586	31.8315	0.9106
DBO-LSTM	3374.8279	36,419,550.2232	6034.8612	30.8059	0.9365
MSADBO-LSTM	3245.6764	30,365,751.6889	5510.5128	21.6826	0.94706

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

© 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

Share and Cite

MDPI and ACS Style

Zhao, Z.; Bai, J. Ultra-Short-Term Wind Power Forecasting Based on the MSADBO-LSTM Model. Energies 2024, 17, 5689. https://doi.org/10.3390/en17225689

AMA Style

Zhao Z, Bai J. Ultra-Short-Term Wind Power Forecasting Based on the MSADBO-LSTM Model. Energies. 2024; 17(22):5689. https://doi.org/10.3390/en17225689

Chicago/Turabian Style

Zhao, Ziquan, and Jing Bai. 2024. "Ultra-Short-Term Wind Power Forecasting Based on the MSADBO-LSTM Model" Energies 17, no. 22: 5689. https://doi.org/10.3390/en17225689

APA Style

Zhao, Z., & Bai, J. (2024). Ultra-Short-Term Wind Power Forecasting Based on the MSADBO-LSTM Model. Energies, 17(22), 5689. https://doi.org/10.3390/en17225689

Note that from the first issue of 2016, this journal uses article numbers instead of page numbers. See further details here.

Article Menu

Ultra-Short-Term Wind Power Forecasting Based on the MSADBO-LSTM Model

Abstract

1. Introduction

2. Model Principles

2.1. Long Short-Term Memory Network

2.2. Dung Beetle Optimization Algorithm

2.3. Improved Dung Beetle Optimization Algorithm

2.3.1. The Purpose of the Improvement

2.3.2. Initialize the Population Using Bernoulli Mapping

2.3.3. Introduction of the Improved Sine Algorithm

2.3.4. Adaptive Gaussian–Cauchy Mixture Mutation Perturbation

2.4. Performance Testing of MSADBO

3. MSADBO-LSTM Prediction Model

4. Experimental Results and Analysis

4.1. Data Sources and Evaluation Metrics

4.2. Data Preprocessing

4.2.1. Data Cleaning

4.2.2. Feature Variable Selection

4.3. Simulation Experiment and Analysis

5. Conclusions

Author Contributions

Funding

Data Availability Statement

Conflicts of Interest

References

Share and Cite

Article Metrics

Article Access Statistics

Further Information

Guidelines

MDPI Initiatives

Follow MDPI