1. Introduction
To advance clean energy development worldwide, reduce greenhouse gas emissions and avert the depletion of nonrenewable fossil energy reserves, the large-scale use of clean energy has become a global trend [1,2]. Among the widely used new energy sources, wind energy has been adopted worldwide because of its broad distribution, pollution-free nature and sustainability, and tapping its potential is of great significance for adjusting the traditional energy structure. According to a report by the Global Wind Energy Council (GWEC), 60.4 GW of new wind power capacity was installed globally in 2019, bringing the cumulative total to 651 GW. As of the end of 2019, China’s cumulative installed wind power capacity had reached 210 GW [3]. The chaotic, random and intermittent characteristics of wind speed pose considerable challenges to power systems. Violent short-term fluctuations in wind power cause a short-term imbalance in the power system, which may lead to its collapse. Therefore, accurate wind speed forecasting is critical to predicting the output of wind farms and stabilizing the operating state of the power system.
At present, wind speed prediction methods fall into four main categories: (i) the physical model method, (ii) the time series method, (iii) the spatial correlation method and (iv) the artificial intelligence method [4,5,6]. The physical model method builds complex mathematical equations from physical parameters describing the conditions under which the wind is generated, and uses numerical weather prediction (NWP) for simulation. Classic numerical simulation approaches include the high-resolution limited area model (HIRLAM) [7], the fifth-generation mesoscale model (MM5) [8] and the weather research and forecasting (WRF) model [9]. However, physical methods have several drawbacks: the physical data are difficult to obtain, the computations consume substantial resources and the methods are unsuitable for short-term wind speed prediction [10]. The time series method builds a model from the temporal dependence and correlation latent in historical wind speed data. Common statistical models for wind speed include the autoregressive (AR) [11], autoregressive moving average (ARMA) [12], autoregressive integrated moving average (ARIMA) [13] and autoregressive fractionally integrated moving average (ARFIMA) [14] models. Although time series approaches are simpler and more economical than physical model methods, they are limited by the nonlinearity and nonstationarity of wind speed time series. The spatial correlation method instead starts from wind speed data measured at sites surrounding the target location and selects appropriate sites to build a spatial model. Samalot et al. [15] successfully combined Kalman filtering and Kriging to reduce the bias of the WRF model. However, this method has strict measurement requirements and is difficult to implement.
In addition, with the rise of artificial intelligence, artificial intelligence methods have shown strong advantages in extracting the nonlinear characteristics of wind speed fluctuations and have gradually become a research hotspot in the field of prediction. Many methods, including artificial neural networks (ANNs) [16,17], support vector machines (SVMs) [18,19] and fuzzy logic (FL) methods [20,21], have been applied to wind speed prediction. Monfared et al. [22] combined fuzzy logic with an artificial neural network, which not only effectively reduced the rule base but also improved the accuracy of wind speed prediction. Li et al. [23] compared three neural networks, namely the adaptive linear element (ALE), back propagation (BP) and radial basis function (RBF) networks, in 1-h wind speed prediction and concluded that the best prediction model depends not only on the type of neural network but also on the data source. Guo et al. [24] proposed a back propagation neural network method that removes seasonal effects from the actual wind speed data in order to predict the daily average wind speed. Zhang et al. [25] proposed a two-step method to determine the connection weights of the RBF network to predict the future wind speed interval. Compared with the traditional multilayer perceptron (MLP) method, this method can effectively improve the prediction interval. Compared with traditional neural networks, the extreme learning machine (ELM) converges faster and requires less human intervention, which gives it strong generalization ability on heterogeneous datasets [26].
Neural networks improve the prediction accuracy of wind speed series to a certain extent. However, the instability of the wind speed sequence and the accompanying noise considerably interfere with the training of neural network models. As a result, the models train poorly and the wind speed prediction error is large. To counter this random interference, various preprocessing technologies have been developed. Liu et al. [27] used wavelet transform (WT) preprocessing to decompose the original sequence into multiple wind speed subsequences and then made predictions with an echo state network. Niu et al. [28] used empirical mode decomposition (EMD) to decompose the original signal and then predicted each subsequence with a general regression neural network (GRNN) optimized by the fruit fly optimization algorithm (FOA), which improved the accuracy of wind prediction. However, EMD cannot always decompose the original wind speed series effectively because of disadvantages such as end effects and mode mixing. Ren et al. [29] subsequently studied prediction models based on EMD, its improved versions and two intelligent algorithms, and recommended complete ensemble empirical mode decomposition with adaptive noise (CEEMDAN) combined with support vector regression (SVR) as the best wind speed prediction method. Zhou et al. [30] proposed a hybrid framework for multilevel wind speed prediction based on variational mode decomposition (VMD) and convolutional neural networks. Furthermore, chaos theory has attracted increasing attention. Multifractal patterns of wind speed can be obtained through the analysis of chaotic characteristics. Jiang et al. [31] employed a hybrid linear-nonlinear modeling method based on chaos theory to capture the linear and nonlinear factors hidden in wind speed time series, using VMD to remove the noise in the original data. The experimental results showed that the hybrid model was more accurate than the other models.
Based on the analysis above, artificial intelligence methods have been the most widely used and successful approaches to short-term wind speed prediction, but the prediction ability of a single artificial intelligence method is limited. Hybrid approaches have shown better performance than single models. Applying data preprocessing techniques before feeding wind speed data into forecasting models has therefore become a popular trend.
In this study, a novel hybrid strategy is proposed that comprises three parts: data preprocessing, optimization and forecasting. Specifically, following the decomposition and integration strategy, VMD is used to decompose the original wind speed series into several variational modes and thereby filter out the noise in the original series. Then, a kernel extreme learning machine (KELM) prediction network is applied to the wind speed forecasting problem. At the same time, an improved seagull optimization algorithm (ISOA) is used to optimize the kernel parameters of the KELM network, thereby forming a hybrid model.
The main contributions and innovations of this research are as follows: (1) Data preprocessing technology is included to reduce the volatility and randomness of wind speed series and improve the accuracy of prediction. VMD decomposes the original wind speed series into a set of relatively stable modes. (2) In the prediction phase, a kernel function is added to ELM to map the one-dimensional wind speed sequence to a high-dimensional space for prediction, which reduces the difficulty of prediction. (3) An improved seagull optimization algorithm (ISOA) is proposed to determine the two KELM parameters simultaneously. In the prediction phase, ISOA continuously searches for the two parameters of the kernel function in KELM. Each search retains the best approximate solution found so far, so that the KELM network is optimized and the accuracy and stability of the prediction are improved. (4) A systematic assessment system is established to evaluate the forecasting ability of the developed hybrid model. Four multistep prediction experiments and three performance indicators are included in this study to compare and analyze the forecasting capacity of the proposed hybrid model in each case.
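As an illustration of contribution (2), the following is a minimal sketch of a kernel ELM regressor, not the authors' implementation. It assumes an RBF kernel and the commonly used closed-form KELM solution, in which the output weights solve (K + I/C) a = y. The class name `KELM` and the parameter names `C` (regularization coefficient) and `gamma` (kernel width) are illustrative assumptions; they stand in for the two parameters that an optimizer such as ISOA would search.

```python
import numpy as np


def rbf_kernel(A, B, gamma):
    """RBF kernel matrix exp(-gamma * ||a - b||^2) between rows of A and B."""
    sq = (A ** 2).sum(axis=1)[:, None] + (B ** 2).sum(axis=1)[None, :] - 2.0 * A @ B.T
    return np.exp(-gamma * np.maximum(sq, 0.0))  # clip tiny negatives from rounding


class KELM:
    """Kernel extreme learning machine for regression (illustrative sketch)."""

    def __init__(self, C=100.0, gamma=1.0):
        self.C = C          # assumed name: regularization coefficient
        self.gamma = gamma  # assumed name: RBF kernel parameter

    def fit(self, X, y):
        self.X_train = np.asarray(X, dtype=float)
        y = np.asarray(y, dtype=float)
        K = rbf_kernel(self.X_train, self.X_train, self.gamma)
        n = K.shape[0]
        # Closed-form output weights: solve (K + I / C) a = y
        self.alpha = np.linalg.solve(K + np.eye(n) / self.C, y)
        return self

    def predict(self, X):
        K_new = rbf_kernel(np.asarray(X, dtype=float), self.X_train, self.gamma)
        return K_new @ self.alpha


# Toy usage on a smooth synthetic signal (not wind speed data)
X_demo = np.linspace(0.0, 2.0 * np.pi, 50).reshape(-1, 1)
y_demo = np.sin(X_demo).ravel()
demo_model = KELM(C=1e4, gamma=1.0).fit(X_demo, y_demo)
demo_pred = demo_model.predict(X_demo)
```

In the hybrid scheme described above, one such model would be fitted per VMD mode, with the pair (`C`, `gamma`) chosen by the optimizer rather than fixed by hand.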
4. Different Experiments and Relative Analysis
In this section, a detailed evaluation and analysis of the proposed model are carried out. Two sets of experiments are designed, and the graphs and tables visually show the corresponding prediction results and evaluation indicators. The experimental setup and results are as follows.
4.1. Experimental Setup
Two sets of comparative experiments were used to compare the forecasting ability of the proposed model with that of other comparable models. Experiment 1 compared the proposed combined model with five individual models to investigate its prediction performance. Experiment 2 compared the forecasting accuracy of the proposed model with that of models using various data preprocessing technologies. All models were tested on the four datasets. The results of multistep-ahead forecasting further illustrated the forecasting capability of the different models. Three error evaluation indicators (MAE, RMSE and MAPE) were used to quantify the predictive ability. The smaller the value of an error criterion, the better the predictive performance.
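The three error indicators can be sketched as follows, assuming their standard definitions (mean absolute error, root mean square error and mean absolute percentage error expressed in percent). The function names are illustrative and the exact formulas used in the study may differ in detail.

```python
import numpy as np


def mae(y_true, y_pred):
    """Mean absolute error."""
    y_true, y_pred = np.asarray(y_true, float), np.asarray(y_pred, float)
    return float(np.mean(np.abs(y_true - y_pred)))


def rmse(y_true, y_pred):
    """Root mean square error."""
    y_true, y_pred = np.asarray(y_true, float), np.asarray(y_pred, float)
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))


def mape(y_true, y_pred):
    """Mean absolute percentage error in percent; assumes no zero observations."""
    y_true, y_pred = np.asarray(y_true, float), np.asarray(y_pred, float)
    return float(np.mean(np.abs((y_true - y_pred) / y_true)) * 100.0)
```

Because MAPE divides by the observed value, it becomes unstable when the observed wind speed is close to zero, which is one reason to report it alongside MAE and RMSE.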
In Experiment 1, we selected five widely used individual models (BP, SVM, LSTM, ELM and KELM) as the control group. Experiment 2 was conducted to compare the developed strategy with models based on other data preprocessing technologies, namely discrete wavelet transform (DWT), EMD and complementary ensemble empirical mode decomposition (CEEMD).
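To make the multistep-ahead setting concrete, the sketch below builds input and target pairs for an h-step-ahead forecast from a univariate series. It assumes a direct strategy, in which each horizon has its own targets, and a fixed number of lagged inputs; the text does not state which multistep strategy or lag length was used, so `make_supervised`, `n_lags` and `horizon` are hypothetical names chosen for illustration.

```python
import numpy as np


def make_supervised(series, n_lags, horizon):
    """Build (X, y) where each row of X holds n_lags consecutive values and
    y is the value `horizon` steps after the last value in that row."""
    series = np.asarray(series, dtype=float)
    n_samples = len(series) - n_lags - horizon + 1
    if n_samples <= 0:
        raise ValueError("series is too short for the requested lags and horizon")
    X = np.stack([series[i:i + n_lags] for i in range(n_samples)])
    start = n_lags + horizon - 1
    y = series[start:start + n_samples]
    return X, y


# Toy usage: 3 lagged inputs, two-step-ahead targets
toy_series = np.arange(10.0)
X_toy, y_toy = make_supervised(toy_series, n_lags=3, horizon=2)
```

With 10 min sampling, `horizon=1`, `2` and `3` would correspond to forecasts 10, 20 and 30 minutes ahead under this assumed setup.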
4.2. Experiment I: Comparison with Other Individual Models
Table 3 compares the results of the proposed model and the individual models on the four seasonal datasets. Figure 2, Figure 3 and Figure 4 show the forecasting results of the individual forecasting models for SH in April. At the top of each chart, the predicted results are plotted against the 10 min interval sampling points for all forecasting models. Below, the forecasting error distribution diagram and the scatter diagram of each individual model are presented.
For SH Apr, in the one-step forecasting, the proposed model showed the best MAE, RMSE and MAPE scores at 0.315, 0.408 and 6.606%, respectively, followed by the KELM model, whose values for MAE, RMSE and MAPE were 0.888, 1.190 and 17.373%, respectively. The worst was the BP neural network, with MAE, RMSE and MAPE scores of 1.247, 1.642 and 30.167%, respectively. In the two-step forecasting, the developed model had the best accuracy with an RMSE of 0.436. In the three-step forecasting, the proposed model still had the best predictive ability with an RMSE of 0.496, but the second most accurate model was the BP network.
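The gap between the proposed model and the runner-up can be expressed as a relative error reduction. The short sketch below applies the usual formula (baseline minus proposed, divided by baseline) to the one-step SH Apr values quoted above; the helper name `improvement_pct` is an illustrative assumption and this derived measure is not one of the study's three indicators.

```python
def improvement_pct(baseline, proposed):
    """Relative reduction of an error metric, in percent of the baseline value."""
    return (baseline - proposed) / baseline * 100.0


# One-step SH Apr scores quoted in the text (MAPE in percent)
kelm_scores = {"MAE": 0.888, "RMSE": 1.190, "MAPE": 17.373}
proposed_scores = {"MAE": 0.315, "RMSE": 0.408, "MAPE": 6.606}

gains = {name: improvement_pct(kelm_scores[name], proposed_scores[name])
         for name in kelm_scores}

for name, gain in gains.items():
    print(f"{name}: {gain:.2f}% lower than KELM")
```

On these figures, each of the three errors is reduced by more than 60% relative to KELM.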
Figure 4, Figure 5 and Figure 6 show the prediction results of the proposed model and the individual models for the spring experimental series (SH Apr).
For SH July, in the one-step forecasting, the proposed VMD-ISOA-KELM hybrid model achieved the highest accuracy with a MAPE value of 3.140%. By comparison, the individual models had considerably higher MAPE values of 9.792%, 7.434%, 8.561%, 7.355% and 7.342%, respectively. In the two-step and three-step forecasting, the developed combined model was also more effective than the other methods for wind speed forecasting. Meanwhile, KELM had the lowest MAPE values among the individual models, at 7.342% and 9.883% in the one-step and two-step forecasting.
For SH Oct, according to the evaluation criteria shown in Table 3, the proposed model still outperformed the individual models in all three steps, with MAPE values of 2.367%, 2.541% and 2.844%. According to the obtained MAPE, long short-term memory (LSTM) ranked as the second most effective model in all three forecasts, with MAPE values of 7.731%, 10.557% and 11.753%.
For SH Jan, in all forecasting steps, the developed combined model exceeded the five benchmark models with MAPE values of 3.894%, 4.276% and 4.737%. In the two-step and three-step forecasting, the five individual models performed poorly, and their RMSE values were all over 1.
4.3. Experiment II: Comparison with Other Models Using Different Data Preprocessing Methods
This experiment assessed forecasting performance on the wind speed time series by comparing the VMD-ISOA-KELM model with models using different data preprocessing methods, namely DWT, EMD and CEEMD. The comparison results are listed in Table 4 and shown in Figure 5, Figure 6, Figure 7 and Figure 8. More details of the experiment are given below:
For SH Apr, in the one-step forecasting, the proposed model showed the best performance with a MAPE value of 6.606%. In comparison, the VMD-Model ranked second among the models using data preprocessing, with MAPE values of 7.089%, 7.412% and 8.340% from one-step to three-step forecasting. The DWT-Model showed the worst forecasting accuracy, with MAPE values of 18.12%, 28.585% and 36.064% from one-step to three-step forecasting.
For SH July, according to the evaluation criteria shown in Table 4, the proposed model still outperformed the individual models in one-step forecasting, with the lowest MAE, RMSE and MAPE values of 0.221, 0.270 and 3.140%. According to the obtained MAPE, LSTM ranked as the second most effective model across the three forecasting steps, with MAPE values of 7.731%, 10.557% and 11.753%.
For SH Oct, when the forecasting was one-step, the proposed VMD-ISOA-KELM hybrid model achieved the highest accuracy with a MAPE value of 3.140%. Comparatively, the DWT-Model, EMD-Model, CEEMD-Model and VMD-Model had MAPE values of 5.981%, 6.744%, 3.452%, 7.355% and 7.342%, respectively, which were inferior to our developed hybrid model. The comparison results of our forecasting strategy and the DWT-Model, EMD-Model and CEEMD-Model are shown in Figure 7.
For SH Jan, in the one-step forecasting, the hybrid model, which had the lowest MAE, RMSE and MAPE values of 0.252, 0.333 and 3.894%, respectively, was still superior to the other models using different preprocessing methods. In addition, the CEEMD-Model showed a better forecasting performance than the EMD-Model, with MAPE values of 6.807%, 7.601% and 8.246%, respectively, from one-step to three-step forecasting.