1. Introduction
The depletion of traditional fossil fuels and the resulting pollution are making the problems of energy supply and energy-structure improvement increasingly pressing. Wind power, as a non-polluting and sustainable energy source, has received widespread attention [1]. Renewable energy, represented by wind power, is of great significance in addressing energy depletion and improving the energy structure. However, wind power is variable and intermittent, which challenges grid stability when wind power is integrated at large scale [2]. Accurate wind power forecasting is a crucial solution that enhances the reliability and efficiency of power systems [3,4,5]. It also provides guidance for grid scheduling plans and helps maintain the balance between wind power supply and demand [6,7].
Wind power forecasting can be categorized by time horizon into ultra-short-term, short-term, medium-term, and long-term forecasts [8]. Ultra-short-term forecasting covers the next few minutes (0–30 min) and can aid power system operators in real-time scheduling and generation optimization [9]. Short-term forecasting covers hours to days and is mainly used for market trading and the optimal scheduling of wind farms [10]. Medium- and long-term forecasting covers days to weeks and can guide long-term maintenance planning and the energy management of wind farms [11]. Considering the stability and economy of the power system, short-term wind power forecasting has become the focus of research.
Four types of methods are commonly used for wind power forecasting: physical models, statistical models, artificial intelligence models, and composite models [12]. Numerical weather prediction (NWP) is one of the most widely used physical models. It is based on physical equations and uses information about the surrounding physical environment to establish a prediction model [13]. This type of method does not require historical wind power data but is computationally complex [14]. Statistical models predict future wind power output by analyzing patterns and trends in historical data, and their advantage is low computational cost. Traditional statistical models include the autoregressive moving average (ARMA) model [15] and the autoregressive integrated moving average (ARIMA) model [16]. These statistical models assume in advance that the relationships within the time series are linear, so they cannot handle the nonlinear characteristics of wind power series [17].
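To make the linearity assumption concrete, the following is a minimal sketch of a purely autoregressive model fitted by least squares. It is not the ARMA or ARIMA implementation of references [15,16]: the moving-average and differencing terms are omitted, and the function names `fit_ar` and `predict_next` as well as the synthetic series are illustrative assumptions.

```python
import numpy as np

def fit_ar(series, p):
    """Least-squares fit of an AR(p) model: y[t] = c + sum_k a_k * y[t-k]."""
    y = np.asarray(series, dtype=float)
    # One row per target y[t] (t = p .. len-1); column k holds the lag y[t-k-1].
    lags = np.column_stack([y[p - k - 1 : len(y) - k - 1] for k in range(p)])
    design = np.column_stack([np.ones(len(lags)), lags])  # intercept + lags
    coef, *_ = np.linalg.lstsq(design, y[p:], rcond=None)
    return coef  # [c, a_1, ..., a_p]

def predict_next(series, coef):
    """One-step-ahead forecast from the last p observations."""
    p = len(coef) - 1
    recent = np.asarray(series, dtype=float)[-1 : -p - 1 : -1]  # most recent first
    return coef[0] + coef[1:] @ recent

# Synthetic (not measured) series that follows an exact linear recursion.
y = [0.0]
for _ in range(49):
    y.append(2.0 + 0.8 * y[-1])

coef = fit_ar(y, p=1)
forecast = predict_next(y, coef)
```

The forecast is a fixed linear combination of past values, so a model of this form can only reproduce linear dependence between time steps. This is the limitation noted above for wind power series.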
Artificial intelligence models exhibit remarkable nonlinear fitting capabilities and have been widely used in wind power prediction [18]. For example, artificial neural networks (ANNs) [19] and support vector machines (SVMs) [20] combined with optimization algorithms have been applied to short-term wind power forecasting and have shown excellent predictive performance. Deep learning, as a class of artificial intelligence methods, can learn more complex nonlinear relationships and is therefore highly favored in wind power forecasting [21]. For instance, recurrent neural networks (RNNs) can effectively process sequential data and produce good predictive results. However, because RNNs often suffer from vanishing or exploding gradients, variants of RNNs are mainly used, such as long short-term memory (LSTM) networks [22] or gated recurrent unit (GRU) networks [23]. Convolutional neural networks (CNNs) use convolution operations to extract high-level features from wind power time series data, leading to accurate prediction results [24]. Nevertheless, these prediction models still have limitations. LSTM can only extract forward time information from the input and ignores backward time information [25]. Graves et al. [26] proposed the bidirectional LSTM (BiLSTM), a model that considers information in both directions simultaneously and achieves better predictive accuracy than LSTM. Commonly used CNNs have only one type of convolutional kernel, which limits their ability to capture hidden features at different scales when processing multivariate wind power data [27]. CNNs with multiple convolutional kernels are therefore needed to predict wind power accurately [28]. However, CNNs still struggle to capture long-term trends in time series. Therefore, the prediction results obtained with only a single model are not satisfactory.
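To illustrate what several kernel sizes add over a single one, the sketch below applies three 1D convolutions of different widths to the same multivariate sequence and concatenates the results. The kernel sizes (3, 5, 7), the fixed averaging kernels standing in for learned filters, and the function names are assumptions made for illustration. They are not taken from the cited works or from the proposed model.

```python
import numpy as np

def conv1d_same(x, kernel):
    """Convolve every feature column of x (time, features) along the time axis.

    Zero padding (mode="same") keeps the output as long as the input.
    """
    columns = [np.convolve(x[:, j], kernel, mode="same") for j in range(x.shape[1])]
    return np.stack(columns, axis=1)

def multi_scale_features(x, kernel_sizes=(3, 5, 7)):
    """Run one convolution branch per kernel size and concatenate the outputs.

    Averaging kernels are placeholders; a trained CNN would learn its filters.
    """
    branches = [conv1d_same(x, np.ones(k) / k) for k in kernel_sizes]
    return np.concatenate(branches, axis=1)  # (time, features * n_branches)

# Synthetic input: 24 time steps, 4 features (e.g. power plus weather variables).
rng = np.random.default_rng(0)
x = rng.normal(size=(24, 4))
features = multi_scale_features(x)
```

Each branch sees a different temporal window, so short kernels respond to rapid fluctuations while longer kernels respond to slower variation. A single-kernel CNN has only one of these views.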
Combined models can leverage the strengths of different models and adapt well to non-stationary signals [29,30]. A CNN-LSTM prediction method was proposed in reference [31], and its prediction performance was shown to exceed that of either a standalone CNN or a standalone LSTM. Zhou et al. [32] combined LSTM and the K-Means clustering algorithm with the non-parametric kernel density estimation (KDE) method to improve prediction accuracy. The forecasting methods described above learn long-term trends from the raw data, which are irregular; as the prediction horizon increases, information loss is likely to occur and degrade prediction performance. One way to tackle this challenge is to decompose the wind power data so that models can better learn its structural and characteristic patterns. Lu et al. [33] decomposed historical wind power and key meteorological features using variational mode decomposition and weighted permutation entropy (VMD-WPE), used the results as inputs to a CNN-LSTM model, and applied different optimizers to find the best model parameters. However, this two-stage decomposition method increases the complexity of data processing, and the sensitivity of VMD to noise can also affect the prediction results. In addition, treating the key meteorological features equally, without considering how differently each one affects wind power output, also reduces forecast accuracy. Another way to improve prediction accuracy is to add an attention mechanism to the model. Tang et al. [34] proposed a CNN-LSTM-Attention prediction method, which weights the output of CNN-LSTM with attention so that the model focuses more on the features that matter for the prediction, reducing information loss and improving prediction accuracy. However, introducing the attention mechanism in this way requires the combined model to be highly capable of extracting features and their long-term trends.
In summary, distinguishing the importance of different wind power characteristics and finding a combined model that can fully extract input feature information and capture the temporal dependencies of feature sequences are major challenges in improving prediction accuracy. Therefore, this paper proposes a wind power prediction method based on feature weighting and a combined model to overcome the limitations of existing methods and achieve more accurate predictions. This article makes the following contributions:
The attention mechanism is used to dynamically assign a weight to each input feature, distinguishing how strongly different features affect wind power output. In addition, because the attention mechanism is applied before feature extraction, the model focuses on the information that matters most to the prediction results when extracting features.
The MCNN, with different convolutional kernels, extracts feature information at different scales more comprehensively, and the stacked BiLSTM (SBiLSTM) better captures the temporal dependencies of feature sequences. The two networks in the combined model play to their respective strengths, enhancing the model’s capacity to extract features and their long-term trends.
Using real wind farm data, four groups of comparative experiments are carried out to verify the effectiveness and stability of the proposed method. Measured by four commonly used error indicators, the proposed model consistently demonstrated the best prediction accuracy.
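The feature-weighting idea in the first contribution can be sketched as follows. This is one common form of feature attention, assuming a tanh-scored linear layer followed by a softmax over the feature axis. The parameters `W` and `b` are hypothetical and randomly initialized here, whereas in a trained model they would be learned. The exact attention formulation used by the proposed method is not specified in this section.

```python
import numpy as np

def softmax(z, axis=-1):
    """Numerically stable softmax along the given axis."""
    z = z - z.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def feature_attention(x, W, b):
    """Reweight the input features at every time step.

    x: (time, features) input sequence.
    W: (features, features) and b: (features,) are hypothetical parameters.
    Returns the weighted sequence and the attention weights.
    """
    scores = np.tanh(x @ W + b)         # one score per feature and time step
    weights = softmax(scores, axis=1)   # weights over features sum to 1
    return x * weights, weights

# Synthetic input: 10 time steps, 5 features.
rng = np.random.default_rng(0)
x = rng.normal(size=(10, 5))
W = rng.normal(size=(5, 5))
b = np.zeros(5)
weighted_x, weights = feature_attention(x, W, b)
```

Because the weights are computed from the input itself, they can change from one time step to the next. The weighted sequence keeps the shape of the input, so it can be passed directly to the feature-extraction networks that follow.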
The remainder of this article is structured as follows:
Section 2 presents the materials and methods. It first introduces the wind power datasets, then describes the prediction process of the proposed model, focusing on the individual modules of the combined model, and finally introduces the overall prediction framework as applied to actual cases.
Section 3 conducts experiments and analyzes the results.
Section 4 draws conclusions.
4. Conclusions
In this work, we propose an AM-MCNN-SBiLSTM prediction model, which uses feature weighting and a combined model for efficient and accurate wind power forecasting. The attention mechanism assigns a weight to each input feature, addressing the issue that the model otherwise fails to discern differences in the importance of the input data. The weighted, reconstructed feature sequence helps the model extract more key information. By using an MCNN with three types of convolutional kernels and stacking three BiLSTM layers (SBiLSTM), the model fully explores the multi-scale information of the feature sequence and its long-term trends. Experiments were conducted on actual operational data from wind turbines, and the prediction performance was compared with that of other models. The results show that the proposed model achieves higher predictive accuracy, is more robust across different time steps and longer time ranges, and handles the actual fluctuations of wind power better. Thus, this approach can offer more dependable short-term wind power prediction, serving as a reliable reference for stable operation and power allocation in wind farms.
Although the proposed method has good predictive performance, its training is complex. We conducted numerous experiments in this study to identify the optimal model parameters, which consumed a significant amount of time. Intelligent algorithms could improve the efficiency of model training. In future research, we plan to use different optimization algorithms to tune the parameters of the prediction model and to continue exploring the ability of different combined models to extract features, in order to improve prediction accuracy.