A Review of Wind Power Prediction Methods Based on Multi-Time Scales

Li, Fan; Wang, Hongzhen; Wang, Dan; Liu, Dong; Sun, Ke

doi:10.3390/en18071713

Open Access Review

by Fan Li ¹, Hongzhen Wang ²,*, Dan Wang ¹, Dong Liu ¹ and Ke Sun ¹

¹ State Grid Economic and Technological Research Institute Co., Ltd., Beijing 102209, China

² School of Electrical Engineering, Xi’an Jiaotong University, Xi’an 710049, China

* Author to whom correspondence should be addressed.

Energies 2025, 18(7), 1713; https://doi.org/10.3390/en18071713

Submission received: 13 February 2025 / Revised: 26 March 2025 / Accepted: 27 March 2025 / Published: 29 March 2025

(This article belongs to the Special Issue Advancements in the Integrated Energy System and Its Policy)

Abstract:

In response to the ‘zero carbon’ goal, the development of renewable energy has become a global consensus. Among the array of renewable energy sources, wind energy is distinguished by its considerable installed capacity on a global scale. Accurate wind power prediction provides a fundamental basis for power grid dispatching, unit combination operation, and wind farm operation and maintenance. This study establishes a framework to bridge theoretical innovations with practical implementation challenges in wind power prediction. This work uses a narrative method to synthesize and discuss wind power prediction methods. Common classification angles of wind power prediction methods are outlined. By synthesizing existing approaches through multi-time scales, from the ultra-short term and short term to mid-long term, the review further deconstructs methods by model characteristics, input data types, spatial scales, and evaluation metrics. The analysis reveals that the data-driven prediction model dominates ultra-short-term predictions through rapid response to volatility, while the hybrid method enhances short-term precision. Mid-term predictions increasingly integrate climate dynamics to address seasonal variability. A key contribution lies in unifying fragmented methodologies into a decision support framework that prioritizes the time scale, model adaptability, and spatial constraints. This work enables practitioners to systematically select optimal strategies and advance the development of forecasting systems that are critical for highly renewable energy systems.

Keywords:

wind power; prediction method; multi-time scale; prediction model

1. Introduction

Energy serves as a crucial material basis for human existence and progress. Human society has undergone remarkable growth under the fossil energy system since the first Industrial Revolution. However, non-renewable energy sources, primarily coal, oil, and natural gas, are not only finite in resources but also contribute significantly to carbon emissions. Severe climate change is gradually posing challenges to human survival. In response, countries around the world have proposed their respective ‘zero carbon’ targets, and the pursuit of renewable energy development has emerged as a global consensus. Wind energy is characterized by its inexhaustibility and pollution-free nature, offering high social benefits, and it is a green energy source that is strongly supported by governments worldwide for development and utilization. Amidst the escalating global energy and environmental crises, renewable energy technologies that are efficient and clean have garnered significant attention from countries worldwide. Renewable energy sources, including wind power, are increasingly gaining importance. Since the end of the last century, the global installed capacity of wind power has been growing at a rapid pace. Similarly, China’s installed capacity for wind power has been increasing day by day, exhibiting a trend of rapid development. According to the ‘2024 National Electricity Supply and Demand Situation Analysis and Forecast Report’ released by the China Electricity Council, it is estimated that in 2024, the combined newly added installed capacity of grid-connected wind power and solar power will reach around 300 GW, with the cumulative installed capacity share exceeding 40% for the first time. 
By the end of June, the combined installed capacity of grid-connected wind and solar power in the country had reached 1180 GW, surpassing the installed capacity of coal power for the first time. This corresponds to an annual growth rate of 37.2% and a 38.4% share of the total installed capacity, an increase of 6.5 percentage points compared to the same period of the previous year [1]. According to the National Energy Administration’s statistical data for the national electricity industry from January to September 2024, as of the end of September, the cumulative installed power generation capacity nationwide was approximately 3160 GW, representing a year-on-year increase of 14.1%. Of this, the installed capacity of wind power was around 480 GW (480 million kilowatts), a year-on-year growth of 19.8% [2]. There is no doubt that wind power will continue to occupy an important position in the energy structure of various countries.

The integration of wind power into modern energy systems is a critical component of global efforts to transition towards sustainable energy sources [3]. Wind power generation, while offering significant environmental and societal benefits, is characterized by intermittency, randomness, and volatility compared to conventional energy sources [4,5,6]. These characteristics lead to random start–stop operations of wind turbines and fluctuating power outputs. Consequently, the integration of significant amounts of wind energy into the power system may adversely affect peak regulation, frequency control, and power quality [7,8]. As wind power capacity continues to grow, these challenges become more pronounced. To mitigate these impacts, accurate wind power forecasting is essential for formulating generation plans that optimize dispatch strategies and enhance system resilience [9,10]. Recent studies have explored various aspects of this challenge, including system-level coordination scheduling [11,12], aggregate power flexibility in multi-energy systems [13,14], long-term scenario generation using machine learning [15,16], and data-driven two-stage scheduling [17]. These contributions collectively highlight the importance of integrating renewable energy with innovative storage and transmission solutions, such as liquid hydrogen superconducting transmission [18]. Additionally, the optimal planning of hybrid energy storage systems [19,20,21] and emergency control coordination strategies [22] have been proposed to address the variability and resilience requirements of renewable-dominated power systems. Although the large-scale application of wind power has significantly increased the penetration rate of renewable energy, its inherent intermittency and volatility remain the key bottleneck constraining the safe operation of the power grid.

With the expansion of wind power capacity, these challenges intensify. To counteract them, grid dispatch departments need to develop generation plans based on future wind conditions. Thus, accurate wind power prediction is essential, as it directly influences both the electricity output and the operational strategies of the power systems [23].

Existing research has reviewed wind power prediction methods from diverse viewpoints, covering predictions of both wind speed and wind power output. In terms of temporal scales, a 2012 review of data mining-based wind power forecasting methods took the time scale into account [24]. Wind speed and wind power prediction methods were reviewed across various time scales, ranging from 10 min to 72 h [25]. A comprehensive review of the application of artificial neural networks in wind power systems was also provided [26]. In summaries of wind power prediction, methods are typically categorized by the type of prediction object and by the time scale.

In terms of forecasting methods, both individual and ensemble approaches are typically considered. Regarding individual methods, wind power forecasting methods were reviewed from both physical and statistical perspectives [27,28,29]. Existing wind power forecasting methods were also reviewed from various perspectives, including the number of model inputs and outputs, the number of iterative steps, and model selection [30]. A comprehensive review of the deep learning techniques used in wind speed/wind power forecasting was provided, covering the data processing, feature extraction, and relationship learning stages [31]. That review identified three challenges in accurately forecasting wind speed/wind power under complex conditions, namely the uncertainty of data, the incompleteness of features, and the complexity of nonlinear relationships. The forecasting models used in recent years were comprehensively reviewed and categorized, considering aspects such as input model types, preprocessing and postprocessing techniques, artificial neural network models, the forecasting horizon, lead steps, and evaluation metrics [32]. A comprehensive review of deep learning-based forecasting models in the wind energy sector was provided, and the future development directions for deep learning-based wind energy forecasting were also discussed [33]. Current forecasting techniques and their performance evaluation were reviewed from the perspective of the key predictive technologies, namely physical methods, statistical methods, artificial intelligence techniques, and hybrid approaches [34]. That discussion explored the technologies used to improve forecasting precision, approaches to address significant forecasting challenges, current trends, and potential applications for future research. Additionally, it is widely acknowledged that between the cut-in and rated wind speeds, the power generation of a single wind turbine is roughly proportional to the cube of the wind speed [35]. Consequently, this study integrates wind speed prediction techniques into the investigation of wind power generation prediction.

2. Classification and Overview of Wind Power Prediction

The research subject of wind power prediction includes wind speed and wind power output. Wind power output has an approximate cubic polynomial relationship with wind speed. Thus, wind speed prediction can indirectly inform wind power prediction. However, turbine operating conditions affect this conversion, limiting the indirect method’s accuracy. In the last few years, studies have favored direct wind power prediction, which can be classified according to the time scale, spatial scale, input data, and model characteristics, as depicted in Figure 1. Additionally, evaluation metrics for wind power prediction differ across studies.
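To make the indirect route from wind speed to power concrete, the following is a minimal sketch of an idealized single-turbine power curve. The cut-in, rated, and cut-out speeds and the rated power are illustrative assumptions, not values taken from this review, and a real curve would also depend on the turbine operating conditions mentioned above.

```python
def idealized_power_kw(v, cut_in=3.0, rated_speed=12.0, cut_out=25.0, rated_kw=2000.0):
    """Idealized power curve; every default parameter is a hypothetical value."""
    if v < cut_in or v >= cut_out:
        return 0.0  # turbine is stopped outside its operating range
    if v >= rated_speed:
        return rated_kw  # output is capped at rated power
    # Between cut-in and rated speed, power scales with the cube of wind speed.
    return rated_kw * (v / rated_speed) ** 3


if __name__ == "__main__":
    for speed in (2.0, 6.0, 12.0, 26.0):
        print(speed, idealized_power_kw(speed))
```

Because of the cubic term, a small error in forecast wind speed is amplified in the power estimate, which is one reason direct power prediction is often preferred.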

The time scale of wind power prediction is the lead time from the present to the prediction moment, and accuracy generally decreases as this lead time grows. Different time scales serve various applications. Ultra-short-term forecasting (0–4 h) aids in power control [36], real-time scheduling, and internal wind farm dispatching [37]. Short-term forecasting (0–72 h) supports power system unit commitment [38] and electricity market trading [39]. Medium- to long-term forecasting, spanning weeks to months, is used for wind farm planning and unit maintenance scheduling. However, medium- to long-term forecasting remains challenging, and research is primarily concentrated on ultra-short-term and short-term predictions.

Wind farm power prediction is categorized by spatial scale into individual turbine, whole wind farm, and wind farm region predictions. Small wind farms, consisting of a few turbines, need individual turbine predictions for unit control and self-regulation. For larger farms, overall output prediction is vital for dispatch decisions. Both wind farm and regional aggregated predictions are critical, with the latter essential for understanding regional power fluctuations. A region is defined by aggregated power from nearby farms for system dispatch [40]. This study emphasizes predictions at the wind farm and regional levels, with wind farm power prediction being a central research focus in wind power prediction.

Input data fall into two categories: those containing only wind power information and those containing both wind power information and numerical weather prediction (NWP). The type and quality of input data, which include the factors influencing wind power, set the maximum attainable accuracy. Historical data on wind power output and weather conditions are the most accessible, making forecasting based solely on these data the simplest to implement. However, due to atmospheric dynamics, such methods are only effective for very short-term predictions. Research incorporating NWP is the most prevalent. NWP, developed by meteorological departments through equations of atmospheric motion and numerical computation, forecasts future weather conditions, thereby improving short-term wind power forecasting accuracy. It should be noted, however, that NWP usually operates at large geographic scales, which is detrimental to small-scale forecasting (i.e., turbine-specific forecasting at high spatial resolution). To further enhance accuracy, additional data such as wind turbine status [41], wind farm location [42], NWP for neighboring farms [43], and their power generation status can be used as input for forecasting models.

As for model characteristics, wind power prediction methods can be separated into physical and statistical models. Physical models, which are the oldest, predict by solving equations that account for atmospheric motion, turbine characteristics, and wind farm location. They combine numerical weather prediction (NWP), turbine parameters, and geography to convert wind speed into power [44,45]. However, the infrequent updates of NWP data limit their accuracy, which makes them not suitable for short-term and medium-to-long-term forecasts. Despite their complexity and reliance on supercomputers, physical models are valuable for new wind farms without operational history. Statistical models rely exclusively on historical data for wind power forecasting, using simple datasets like wind speed and power observations, along with NWP wind speed forecasts. They are based on mathematical and statistical theories, providing simplicity, quick solutions, and high accuracy. As operational data from wind farms accumulate, high-quality forecasting becomes more feasible, but new technologies are needed to extract information from these data. Statistical methods continue to be a crucial direction for future wind power forecasting development, and this review focuses on them.

Statistical models primarily encompass traditional statistical methods, machine learning methods, and hybrid methods. Traditional wind farm power prediction methods, avoiding meteorological data, rely on historical generation patterns. The persistence method, using current speeds or power for predictions [46], is simple and occasionally accurate for very short-term forecasts but less precise over longer periods. ARMA models, introduced in the 1980s and used to predict wind power by decomposing wind speed [47], are easy to construct but inflexible, struggling with non-stationary conditions like gusts and shifts [48]. Time series extrapolation methods, solely using historical power data, see accuracy drop with increasing forecast horizons and have difficulty adapting to sudden changes.
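The persistence method described above is simple enough to state in a few lines. The sketch below assumes the history is a plain list of power (or speed) observations at a fixed time step; the function name is illustrative.

```python
def persistence_forecast(history, horizon):
    """Repeat the most recent observation for every future step."""
    if not history:
        raise ValueError("history must contain at least one observation")
    return [history[-1]] * horizon


if __name__ == "__main__":
    recent_power_mw = [41.0, 43.5, 42.8]  # made-up sample values
    print(persistence_forecast(recent_power_mw, horizon=4))
```

Such a forecast is commonly used as a baseline against which more elaborate models are compared.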

Machine learning methods outperform traditional statistical methods in wind power prediction. The application of machine learning in wind energy forecasting can be broadly categorized into traditional machine learning models and deep learning models. Traditional machine learning methods include support vector machines (SVMs) [49], the k-nearest neighbor (KNN) algorithm with fuzzy logic [50], logistic regression [51], and the random forest (RF) algorithm [52,53]. As a type of machine learning method, deep learning has become pivotal in wind speed/wind power forecasting at present. These approaches are categorized into spatial and temporal models. Spatial models, like convolutional neural networks (CNNs) [54] and deep belief networks (DBNs) [55], process spatial data, with CNNs extracting grid-based correlations and DBNs handling feature extraction. Temporal models focus on sequential data, with recurrent neural networks (RNNs) [56], long short-term memory networks (LSTM) [57], and gated recurrent units (GRUs) [58] being common. RNNs process sequences but can suffer from gradient issues. LSTMs enhance RNNs for long sequences, while GRUs, which are structurally simpler, offer efficient computation.

Hybrid prediction methods, which integrate the strengths of various models and optimize their combination, are crucial for enhancing prediction accuracy in wind power forecasting [59]. This is particularly important because wind power is affected by multiple influencing factors, especially under extreme weather conditions, where a single model proves inadequate and is prone to errors. Hybrid prediction methods are categorized into weighted combination prediction and fusion combination prediction. The flowchart of the hybrid prediction process is shown in Figure 2. Weighted combination prediction, introduced by Bates and Granger in 1969, combines predictions from multiple models using weighted averages. It includes fixed-weight methods, which optimize the weights via objective functions such as minimizing forecast error metrics [60,61], and variable-weight methods, which adjust the weights over time to optimize and correct the models for enhanced adaptability and accuracy.
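As an illustration of a fixed-weight combination, the sketch below weights each model in proportion to the inverse of its mean squared error on past data. This inverse-error rule is one common choice and is an assumption here; it is not necessarily the scheme used in the cited studies.

```python
def inverse_mse_weights(actual, forecasts):
    """Return one weight per model, proportional to 1 / MSE on past data."""
    inverse_errors = []
    for forecast in forecasts:
        mse = sum((a - p) ** 2 for a, p in zip(actual, forecast)) / len(actual)
        inverse_errors.append(1.0 / max(mse, 1e-12))  # guard against a zero error
    total = sum(inverse_errors)
    return [w / total for w in inverse_errors]


def combine(forecasts, weights):
    """Weighted average of the model forecasts at each time step."""
    steps = len(forecasts[0])
    return [sum(w * f[t] for w, f in zip(weights, forecasts)) for t in range(steps)]


if __name__ == "__main__":
    actual = [10.0, 12.0, 11.0, 13.0]
    model_a = [11.0, 13.0, 12.0, 14.0]  # hypothetical past forecasts
    model_b = [12.0, 14.0, 13.0, 15.0]
    weights = inverse_mse_weights(actual, [model_a, model_b])
    print(weights, combine([[20.0], [30.0]], weights))
```

A variable-weight method would recompute the weights on a sliding window of recent errors instead of once.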

Fusion combination prediction augments single models via optimization. For model inputs, wavelet transform (WT) [62] or empirical mode decomposition (EMD) [63] stabilize non-stationary signals [64], boosting accuracy but at a higher computational cost. Data mining, including rough set theory and principal component analysis, streamlines inputs without sacrificing accuracy. For model optimization, genetic [65] and firefly [66] algorithms optimize models, enhancing learning and generalization ability by overcoming local optima and parameter issues. For error correction, the error correction model [67] adjusts traditional forecasts. This adaptable method has spurred interest in multi-stage hybrid forecasting, which integrates input optimization, model optimization, and error correction for incremental accuracy gains.

The evaluation of the performance of a forecasting model necessitates a quantifiable metric. Two different metrics may rank the same models differently, so it is recommended to use multiple metrics to obtain a reliable assessment. In the field of wind power forecasting, commonly used evaluation indicators include the mean squared error (MSE), root mean squared error (RMSE), mean absolute error (MAE), and mean absolute percentage error (MAPE). The specific definitions of these error metrics are as follows:

$$\mathrm{MSE} = \frac{1}{N} \sum_{i=1}^{N} (y_{i} - \hat{y}_{i})^{2} \tag{1}$$

$$\mathrm{RMSE} = \sqrt{\frac{1}{N} \sum_{i=1}^{N} (y_{i} - \hat{y}_{i})^{2}} \tag{2}$$

$$\mathrm{MAE} = \frac{1}{N} \sum_{i=1}^{N} |y_{i} - \hat{y}_{i}| \tag{3}$$

$$\mathrm{MAPE} = \frac{100\%}{N} \sum_{i=1}^{N} \left| \frac{y_{i} - \hat{y}_{i}}{y_{i}} \right| \tag{4}$$

where $y_{i}$ represents the actual outcome, $\hat{y}_{i}$ denotes the forecasted result, and $N$ signifies the total number of data points.
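Equations (1)–(4) translate directly into code. The following sketch uses plain Python lists; the function names are illustrative, and MAPE is undefined whenever an actual value is zero, which matters for wind power series that contain calm periods.

```python
import math


def mse(y, y_hat):
    """Mean squared error, Eq. (1)."""
    return sum((a - p) ** 2 for a, p in zip(y, y_hat)) / len(y)


def rmse(y, y_hat):
    """Root mean squared error, Eq. (2)."""
    return math.sqrt(mse(y, y_hat))


def mae(y, y_hat):
    """Mean absolute error, Eq. (3)."""
    return sum(abs(a - p) for a, p in zip(y, y_hat)) / len(y)


def mape(y, y_hat):
    """Mean absolute percentage error in percent, Eq. (4); requires nonzero actuals."""
    if any(a == 0 for a in y):
        raise ValueError("MAPE is undefined when an actual value is zero")
    return 100.0 / len(y) * sum(abs((a - p) / a) for a, p in zip(y, y_hat))


if __name__ == "__main__":
    actual = [100.0, 200.0, 400.0]      # made-up sample values
    predicted = [110.0, 190.0, 380.0]
    print(mse(actual, predicted), rmse(actual, predicted),
          mae(actual, predicted), mape(actual, predicted))
```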

3. Ultra-Short-Term Wind Power Prediction

This section summarizes the literature on the ultra-short-term forecasting of wind speed and wind power. The literature is classified by model type, and the input data types, evaluation metrics, and spatial scale used in each study are also summarized. The input data are divided into two categories: one contains only wind power information, and the other includes both wind power information and meteorological data.

3.1. Traditional Statistical Model of Ultra-Short-Term Wind Power Prediction

In ultra-short-term wind power forecasting, traditional statistical models are often used to mine data features. Classical time series models are employed for power prediction, with the primary methods outlined below.

Autoregressive integrated moving average (ARIMA) model

The autoregressive (AR) model describes the relationship between the current value and historical values, using the variable’s own historical time data to predict itself. Autoregressive models must meet the requirement of stationarity. The moving average (MA) model focuses on the accumulation of the error term in the autoregressive model.

In contrast to the AR model, the MA model is capable of effectively eliminating random fluctuations in forecasting. The ARMA model integrates the AR model and the MA model. The formula is as follows:

$$y_{t} = \mu + \sum_{i=1}^{p} \gamma_{i} y_{t-i} + \varepsilon_{t} + \sum_{i=1}^{q} \theta_{i} \varepsilon_{t-i} \tag{5}$$

where $y_{t}$ is the current value, $\mu$ is the constant term, $p$ and $q$ are the orders of the AR and MA parts, $\gamma_{i}$ are the coefficients of the AR part, $\theta_{i}$ are the coefficients of the MA part, and $\varepsilon_{t}$ is the error.

The ARIMA model, denoted as ARIMA(p,d,q), is built upon ARMA by incorporating differencing of the data to ensure stationarity, which makes it particularly effective for handling non-stationary time series data. The technique applies the ARMA model to a dataset that has been transformed through d-th order differencing, that is, by taking the differences between successive data points d times [68].
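The differencing step that distinguishes ARIMA from ARMA can be sketched as follows. The helper names are illustrative, the ARMA fit on the differenced series is deliberately omitted, and the inverse step is shown only for a single round of differencing.

```python
def difference(series, d=1):
    """Apply first-order differencing d times."""
    out = list(series)
    for _ in range(d):
        out = [b - a for a, b in zip(out, out[1:])]
    return out


def integrate_once(increments, last_value):
    """Undo one round of differencing: rebuild levels from forecast increments."""
    levels = []
    level = last_value
    for inc in increments:
        level += inc
        levels.append(level)
    return levels


if __name__ == "__main__":
    series = [1, 4, 9, 16, 25]          # quadratic trend, clearly non-stationary
    print(difference(series, d=1))      # linear trend remains
    print(difference(series, d=2))      # constant after two rounds
    print(integrate_once([11, 13], last_value=25))
```

In this toy series the trend disappears after two rounds of differencing, which is the kind of behavior the parameter d is chosen to achieve.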

Time series analysis, specifically ARMA models, has been applied to ultra-short-term wind power forecasting, with a focus on wind velocity and direction. One strategy decomposes wind speed into horizontal and vertical components, modeling them separately and combining the results to predict direction and speed [47]. Another approach uses dual ARMA models within a multi-class framework, with a logistic function for classification and separate prediction algorithms for each class [69]. Studies have also shown that the seasonal autoregressive integrated moving average (SARIMA) model can be integrated with other methods [70]. It has been demonstrated that Markov chain Monte Carlo methods can be combined with SARIMA for estimating both short-term and sustained winds, with the conclusion that the SARIMA-based short-term model provides simplicity and speed while preserving the necessary level of accuracy.

Simple exponential smoothing (SES) model

Simple exponential smoothing (SES) is a commonly used time series forecasting method, primarily for data that do not exhibit obvious trends or seasonal variations. This method predicts future values by applying a weighted average to historical data, assigning higher weights to more recent data. A statistical analysis of the time series using conventional and robust measures was conducted in [71].

Let $\hat{y}_{t}$ be the forecast at time $t$ and $y_{t}$ the corresponding observation. The forecast error and the updated forecast are then

$$\begin{cases} e_{t} = y_{t} - \hat{y}_{t} \\ \hat{y}_{t+1} = \hat{y}_{t} + \alpha e_{t} \end{cases} \tag{6}$$

where $e_{t}$ represents the forecast error, $\hat{y}_{t}$ the forecasted value at time $t$, $y_{t}$ the actual value at time $t$, and $\alpha$ the smoothing coefficient. The larger the value of $\alpha$, the more responsive the model is to the original data, but the less pronounced the smoothing effect. Conversely, the smaller the value of $\alpha$, the smoother the result, as less weight is given to recent changes.
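The recursion in Eq. (6) can be sketched as below. Initializing the first forecast with the first observation is an assumption made here for illustration; other initializations are possible.

```python
def ses_forecasts(observations, alpha, initial_forecast=None):
    """One-step-ahead SES forecasts following Eq. (6).

    Returns the forecast issued for each observation and the next forecast.
    """
    if not 0.0 < alpha <= 1.0:
        raise ValueError("alpha must lie in (0, 1]")
    # Assumption: start from the first observation when no forecast is given.
    forecast = observations[0] if initial_forecast is None else initial_forecast
    issued = []
    for y in observations:
        issued.append(forecast)              # forecast made before y is seen
        error = y - forecast                 # e_t = y_t - yhat_t
        forecast = forecast + alpha * error  # yhat_{t+1} = yhat_t + alpha * e_t
    return issued, forecast


if __name__ == "__main__":
    wind_speed = [10.0, 12.0, 11.0]  # made-up sample values
    print(ses_forecasts(wind_speed, alpha=0.5))
```

With alpha equal to 1 the recursion reduces to the persistence forecast, since each new forecast simply equals the latest observation.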

Bayesian structural break model

The model, grounded in Bayesian principles and structural break analysis, is capable of integrating prior domain-specific knowledge concerning wind speed [72]. Abstracting away from physical causality, we could statistically infer that there exist different regimes (distributions) that generate this wind speed time series.

Let $y = (y_{1}, \dots, y_{T})'$ denote a time series of wind speed with $T$ observations. Corresponding to each observation is a latent variable $S_{t}$ taking the value 0 or 1, and the $S_{t}$ are independent and identically Bernoulli-distributed with the probability of occurrence of a break. If $S_{t} = 0$, the temporal continuity of the regime structure remains preserved at time $t$, indicating that $y_{t+1}$ and $y_{t}$ share identical state-dependent characteristics without systemic parameter breaks. Otherwise, $y_{t+1}$ will be in a new regime starting from time $t+1$. Assume that $y_{t}$ is in the $j$th regime; then $j = 1 + \sum_{k=1}^{t-1} S_{k}$, and the total number of regimes is $J = 1 + \sum_{k=1}^{T-1} S_{k}$.

The model further assumes that observations in regime $j$ are normally distributed as

$$y_{t} \mid (\mu_{j}, h_{j}) \overset{iid}{\sim} N(\mu_{j}, h_{j}^{-1}), \quad j = 1, \dots, J \tag{7}$$

where $\mu_{j}$ and $h_{j}$ represent the location and precision parameters of regime $j$.
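The bookkeeping between break indicators and regimes can be illustrated with a short sketch. It only generates data from given break indicators and regime parameters under Eq. (7); the Bayesian inference of breaks and parameters is not shown, and all numeric values are made up.

```python
import random


def regime_indices(breaks):
    """Map break indicators S_1..S_{T-1} to regime labels for y_1..y_T."""
    labels = [1]  # y_1 always belongs to the first regime
    for s in breaks:
        labels.append(labels[-1] + s)  # a break after t moves y_{t+1} to a new regime
    return labels


def simulate_series(breaks, params, seed=0):
    """Draw y_t ~ N(mu_j, 1/h_j) for the regime j of each t.

    params[j - 1] holds the hypothetical pair (mu_j, h_j), with h_j a precision.
    """
    rng = random.Random(seed)
    series = []
    for j in regime_indices(breaks):
        mu, precision = params[j - 1]
        series.append(rng.gauss(mu, precision ** -0.5))  # std = h^(-1/2)
    return series


if __name__ == "__main__":
    breaks = [0, 0, 1, 0, 1]
    print(regime_indices(breaks))
    print(simulate_series(breaks, [(5.0, 4.0), (9.0, 1.0), (3.0, 0.25)]))
```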

Markov chains

As statistical models, Markov chains are used to simulate the random variations in wind speed and power. They can be employed to analyze time series data of wind power, aiding in the prediction of future wind speeds and directions. By constructing models based on Markov chains, estimates of the short-term wind power distribution can be obtained directly from wind power time series data without making strict assumptions about the probability distribution of wind power.

A comprehensive framework for wind farm power forecasting has been established, incorporating finite-state Markov chain models. A key contribution of this research is the ability to integrate distributional forecasts into unit commitment and economic dispatch problems under wind power uncertainty [73]. A Markov-switching model for wind speed prediction has also been investigated; it employs a regime-switching process governed by a discrete-state Markov chain to represent the nonlinear progression of wind speed time series data [74].
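A finite-state Markov chain forecast of this kind can be sketched by discretizing the power series into bins and counting transitions. The equal-width binning and the uniform row for unvisited states are simplifying assumptions made for illustration, not the construction used in the cited works.

```python
def to_states(power, capacity, n_states):
    """Discretize power into equal-width bins labeled 0..n_states-1."""
    return [min(int(p / capacity * n_states), n_states - 1) for p in power]


def transition_matrix(states, n_states):
    """Estimate transition probabilities from consecutive state pairs."""
    counts = [[0] * n_states for _ in range(n_states)]
    for current, following in zip(states, states[1:]):
        counts[current][following] += 1
    matrix = []
    for row in counts:
        total = sum(row)
        if total:
            matrix.append([c / total for c in row])
        else:
            # Unvisited state: fall back to a uniform row so it is still a distribution.
            matrix.append([1.0 / n_states] * n_states)
    return matrix


if __name__ == "__main__":
    power_mw = [5.0, 15.0, 25.0, 28.0, 12.0, 8.0, 22.0]  # made-up sample values
    states = to_states(power_mw, capacity=30.0, n_states=3)
    matrix = transition_matrix(states, 3)
    # The row of the current state is the one-step-ahead distributional forecast.
    print(states, matrix[states[-1]])
```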

3.2. Machine Learning-Based Model of Ultra-Short-Term Wind Power Prediction

Machine learning-based models excel at modeling nonlinear relationships, improving prediction accuracy by capturing complex interactions between variables and wind power. Additionally, these methods are highly adaptable, fitting data effectively and adjusting models based on varying data characteristics. They also offer scalability. The principles and applications of common machine learning models are summarized below.

Gradient boosting machine (GBM)

A GBM is an ensemble learning technique that integrates multiple weak models, typically decision trees, to form a more powerful predictor. During each cycle, this regression-based algorithm incorporates an additional decision tree to the existing model, aiming to minimize the error and boost its predictive accuracy. In the context of a forecasting model, the GBM constructs a regression framework that predicts wind power output based on its correlation with wind speed. A GBM algorithm-based wind power prediction model was developed and assessed for its impact on grid security. The GBM model demonstrated superior performance in this research [75]. The gradient-boosted tree is the cumulative result of multiple regression trees, as depicted by the following equation:

$$F_{n}(x_{t}) = \sum_{i=0}^{n} \hat{y}_{i}(x_{t}) \tag{8}$$

where $x_{t}$ denotes the wind speed at each time step $t$, and $\hat{y}_{i}(x_{t})$ denotes the individual learners, or decision trees, that have been trained on distinct wind speed data points.

The operational architecture of gradient boosting comprises three foundational elements, namely a weak learner, a loss function, and an additive model. The loss function is defined as the sum of the squared differences between the actual and forecasted values, which the model aims to minimize. In each iteration, a new additive term is introduced by fitting the residuals so as to reduce the loss function. The $L_{2}$ function, the predominant choice among differentiable loss functions, and its negative gradient can be formally specified as

$$\begin{cases} L_{2} = \frac{1}{2} L(y_{t}, \hat{y}_{t}) \\ L(y_{t}, \hat{y}_{t}) = \sum_{t=1}^{N} (y_{t} - \hat{y}_{t})^{2} \\ -\frac{\partial L_{2}}{\partial \hat{y}_{t}} = y_{t} - \hat{y}_{t} \end{cases} \tag{9}$$

where $x_{t}$ denotes the wind speed at time $t$, $y_{t}$ denotes the measured wind power, $\hat{y}_{t}$ denotes the forecasted wind power, and $N$ is the number of data points.

When the GBM model is formulated recursively, it can be expressed as follows:

$$F_{n+1}(x_{t}) = F_{n}(x_{t}) + \alpha \Delta_{n}(x_{t}) \tag{10}$$

where $\alpha$ denotes the learning rate and $\Delta_{n}$ denotes the regression model fitted to the residuals.
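The recursive update in Eq. (10) can be illustrated with a deliberately small sketch that uses depth-one regression trees (stumps) on a single input, wind speed, and the squared loss. The function names, the number of trees, and the learning rate are illustrative assumptions; practical implementations use deeper trees and many more features.

```python
def fit_stump(x, residuals):
    """Fit a depth-1 regression tree to the residuals by least squares.

    Assumes x contains at least two distinct values.
    """
    best = None
    values = sorted(set(x))
    for low, high in zip(values, values[1:]):
        threshold = (low + high) / 2.0
        left = [r for xi, r in zip(x, residuals) if xi <= threshold]
        right = [r for xi, r in zip(x, residuals) if xi > threshold]
        mean_left = sum(left) / len(left)
        mean_right = sum(right) / len(right)
        sse = (sum((r - mean_left) ** 2 for r in left)
               + sum((r - mean_right) ** 2 for r in right))
        if best is None or sse < best[0]:
            best = (sse, threshold, mean_left, mean_right)
    _, threshold, mean_left, mean_right = best
    return lambda v: mean_left if v <= threshold else mean_right


def fit_gbm(x, y, n_trees=50, alpha=0.1):
    """Gradient boosting with squared loss: F_{n+1} = F_n + alpha * Delta_n."""
    base = sum(y) / len(y)  # F_0 is the mean of the targets
    trees = []
    prediction = [base] * len(y)
    for _ in range(n_trees):
        # For the L2 loss, the negative gradient is the residual y - yhat.
        residuals = [yi - pi for yi, pi in zip(y, prediction)]
        tree = fit_stump(x, residuals)
        trees.append(tree)
        prediction = [pi + alpha * tree(xi) for pi, xi in zip(prediction, x)]
    return lambda v: base + alpha * sum(tree(v) for tree in trees)


if __name__ == "__main__":
    speeds = [4.0, 5.0, 6.0, 7.0, 8.0, 9.0, 10.0, 11.0]
    power = [v ** 3 for v in speeds]  # synthetic cubic relationship
    model = fit_gbm(speeds, power)
    print([round(model(v), 1) for v in speeds])
```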

Support vector machine (SVM)

An SVM is a machine learning method that can capture the complex nonlinear relationships inherent in wind power data. Consequently, SVMs are widely used for wind power forecasting.

In a two-dimensional space, two classes of points that can be completely separated by a straight line are referred to as linearly separable. More generally, let $D_{0}$ and $D_{1}$ be two sets of points in an n-dimensional Euclidean space. $D_{0}$ and $D_{1}$ are linearly separable if there exist an n-dimensional vector $w$ and a real number $b$ such that $w^{T} x_{i} + b > 0$ for all points $x_{i}$ belonging to $D_{0}$ and $w^{T} x_{j} + b < 0$ for all points $x_{j}$ belonging to $D_{1}$.

When extended from two dimensions to a multidimensional space, the boundary $w^{T} x + b = 0$ that divides $D_{0}$ and $D_{1}$ becomes a hyperplane. To make this hyperplane more robust, the SVM seeks the optimal hyperplane, the one that separates the two classes of samples by the maximum margin. In other words, the goal of the SVM is to maximize the distance from the hyperplane to the nearest sample points of each class.

For linearly separable data, the SVM finds a hyperplane $w^{T} \cdot x + b = 0$ such that all sample points satisfy

$$y_{i} (w^{T} \cdot x_{i} + b) \geq 1, \quad i = 1, 2, \dots, n \tag{11}$$

where $x_{i}$ is the feature vector and $y_{i} \in \{-1, +1\}$ is the category label. $w$ is the normal vector, which determines the direction of the hyperplane, and the optimization objective is to minimize $\frac{1}{2} \|w\|^{2}$. $b$ is the offset, which determines the distance between the hyperplane and the origin.

When the data are not linearly separable, the original features are mapped to a higher-dimensional space by means of a kernel function so that they become linearly separable, without the mapping function being computed explicitly. Commonly used kernel functions are the linear kernel, polynomial kernel, Gaussian kernel, and sigmoid kernel.
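The four kernels named above can be written out in their usual textbook forms. The parameter names and default values below (degree, offset, bandwidth, scale) are illustrative assumptions, since the text does not specify them.

```python
import math


def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def linear_kernel(u, v):
    return dot(u, v)


def polynomial_kernel(u, v, degree=3, coef0=1.0):
    return (dot(u, v) + coef0) ** degree


def gaussian_kernel(u, v, sigma=1.0):
    squared_distance = sum((a - b) ** 2 for a, b in zip(u, v))
    return math.exp(-squared_distance / (2.0 * sigma ** 2))


def sigmoid_kernel(u, v, scale=1.0, coef0=0.0):
    return math.tanh(scale * dot(u, v) + coef0)


if __name__ == "__main__":
    a, b = [1.0, 2.0], [3.0, 4.0]
    print(linear_kernel(a, b), polynomial_kernel(a, b, degree=2),
          gaussian_kernel(a, b), sigmoid_kernel(a, b))
```

Each function returns the inner product of the two inputs in an implicit feature space, which is why the mapping itself never has to be computed.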

A model for forecasting the power is developed utilizing the principles of the SVM framework [76]:

$$\hat{y}(t + k) = \sum_{i=1}^{N} \left( \omega_{i} \cdot K(y_{i}, \hat{y}_{i}) + b_{i} \right) \tag{12}$$

where $\hat{y}(t + k)$ is the predicted power for a single wind farm at time $t + k$. The n-dimensional wind power input set is $y_{i}$, with outputs $\hat{y}_{i}$. The kernel function is $K(\cdot)$, and $N$ is the number of input variables. In the SVM model, $\omega_{i}$ is the normal vector and $b_{i}$ is the offset.

The equations can be rearranged as follows, which subsequently allows for the determination of the optimal values for the weight and bias parameters:

\begin{cases} \min l (ω, ξ) = \frac{1}{2} (ω^{T} ω + γ ξ^{T} ξ) \\ \hat{y} (t + k) = \sum_{i = 1}^{N} (ω_{i} \cdot K (y_{i}, {\hat{y}}_{i}) + b_{i}) + ξ \end{cases}

(13)

where

l (\cdot)

refers to the objective function. The term

ξ

pertains to the slack variables, while

γ

denotes the positive parameter for regularization.

The method utilizes a least squares support vector machine (LSSVM), an enhanced version of the support vector machine. It can be described as follows [77]:

\begin{cases} \min [\frac{1}{2} w^{T} w + γ \sum_{i = 1}^{l} (ξ_{i} + η_{i})] \\ y_{i} - [〈w, x_{i}〉 + b] \leq ε + ξ_{i} \\ [〈w, x_{i}〉 + b] - y_{i} \leq ε + η_{i}, i = 1, 2, \dots, l \\ ξ_{i} \geq 0, η_{i} \geq 0 \end{cases}

(14)

where the insensitivity is represented by

ε

, with tighter error term ranges for

ξ

and

η

being preferable. The SVM tuning parameter

γ

controls the penalty applied to training errors, and choosing the right

γ

is essential to avoid overfitting.

The LSSVM technique solves a system of linear equations to determine the final regression function (or its higher-dimensional representation), which noticeably reduces the computational complexity and speeds up the solution.

Support vector regression (SVR)

The SVR framework is adept at addressing challenges associated with small datasets, nonlinearity, and high dimensionality. It is particularly effective for tackling the non-stationary nature of wind power prediction tasks. Based on SVR, ultra-short-term wind power prediction is carried out in some studies. An improved jellyfish search algorithm optimization support vector regression (IJS-SVR) model was proposed to achieve high-precision wind power prediction [78].

The equation for a linear regression model, which is used to build the SVR, can be formulated as follows:

\begin{cases} \hat{y} = a^{T} \cdot x + b \\ \{(x_{i}, y_{i})\}, i = 1, 2, \dots, n \end{cases}

(15)

where

a

represents the weight, while

b

denotes the bias term. Both of these parameters are determined through the training process of the SVR model.

x_{i}

and

y_{i}

denote the input and output data points, respectively. The variable

n

signifies the total number of data samples.

The SVR model aims to reduce the discrepancy between the predicted

\hat{y}

and the actual

y

. However, there is a potential issue with its limited generalization capability. To address this, the SVR incorporates an acceptable margin of error

ε

between the actual and predicted values, defining its objective function accordingly:

\begin{cases} \min \frac{1}{2} ‖ a ‖^{2} + C \sum_{i = 1}^{m} (ξ_{i} + ξ_{i}^{*}) \\ y_{i} - {\hat{y}}_{i} \leq ε + ξ_{i} \\ {\hat{y}}_{i} - y_{i} \leq ε + ξ_{i}^{*} \\ ξ_{i} \geq 0, ξ_{i}^{*} \geq 0 \end{cases}

(16)

where

C

denotes the penalty parameter.

ξ_{i}

and

ξ_{i}^{*}

are the slack variables that quantify the extent of deviation.

ε

represents the error threshold, and

m

indicates the total count of training samples.

The nonlinear regression model is obtained by solving the dual of this optimization problem, as given below:

\begin{cases} \hat{y} = \sum_{i = 1}^{m} (α_{i}^{*} - α_{i}) K (x_{i}, x) + b \\ K (x_{i}, x) = \exp (- \frac{{‖x_{i} - x‖}^{2}}{2 σ^{2}}) \end{cases}

(17)

where

σ

represents the width of the Gaussian kernel, a parameter that influences the predictive accuracy of the SVR. Concurrently, the magnitude of the regularization parameter

C

within the SVR also plays a role in determining the model’s predictive outcome.
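To make Equations (16) and (17) concrete, the sketch below evaluates the ε-insensitive error and the kernel expansion of the SVR prediction. It assumes the dual coefficients have already been obtained by a solver; the support vectors, coefficients, and bias used here are invented purely for illustration.

```python
import math

def gaussian_kernel(x_i, x, sigma):
    """Gaussian kernel of Equation (17) with width sigma."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x_i, x))
    return math.exp(-sq_dist / (2.0 * sigma ** 2))

def svr_predict(x, support_vectors, alpha_star, alpha, b, sigma):
    """Kernel expansion: sum_i (alpha_i^* - alpha_i) K(x_i, x) + b."""
    return sum(
        (a_s - a) * gaussian_kernel(x_i, x, sigma)
        for x_i, a_s, a in zip(support_vectors, alpha_star, alpha)
    ) + b

def eps_insensitive_loss(y, y_hat, eps):
    """Deviations smaller than eps cost nothing; larger ones grow linearly."""
    return max(0.0, abs(y - y_hat) - eps)

# Hypothetical values, not taken from any trained model
support_vectors = [[0.0], [1.0]]
alpha_star = [0.5, 0.0]
alpha = [0.0, 0.2]
y_hat = svr_predict([0.0], support_vectors, alpha_star, alpha, b=0.1, sigma=1.0)
```

A smaller `sigma` makes each support vector influence only its close neighbourhood, which is one way to see why the kernel width affects predictive accuracy.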

Extreme learning machine (ELM)

The ELM is a fast learning algorithm known for its robust generalization when training single-hidden-layer feedforward neural networks [79,80]. The core concept is to assign random input weights and hidden-layer biases during the training phase, so that only the output weights need to be solved for. The following formulation represents an ELM model with

L

neurons within its hidden layer [79]:

\begin{cases} \sum_{j = 1}^{L} β_{j} g (w_{j} \cdot x_{i} + b_{j}) = y_{i} \\ ‖H (w_{j}, b_{j}) β_{j} - Y‖ = \min_{w, b, β} ‖H (w_{j}, b_{j}) β_{j} - Y‖ \end{cases}

(18)

where

w

denotes the input weight,

β

denotes the output weight, and

b

denotes the threshold for the j-th neuron in the hidden layer. The input value is represented by

x_{i}

, and the corresponding output value is denoted by

y_{i}

. The function

g (x)

refers to the activation function, a nonlinear piecewise continuous function that satisfies the universal approximation theorem of the ELM.
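The training procedure implied by Equation (18) can be sketched as follows. This is an illustrative NumPy version under stated assumptions: tanh is chosen as the activation, the output weights are obtained with the Moore–Penrose pseudo-inverse, and the toy sine data and function names are hypothetical.

```python
import numpy as np

def train_elm(X, y, n_hidden=20, seed=0):
    """Fit an ELM: random hidden layer, least-squares output weights."""
    rng = np.random.default_rng(seed)
    W = rng.normal(size=(X.shape[1], n_hidden))  # random input weights w_j
    b = rng.normal(size=n_hidden)                # random hidden thresholds b_j
    H = np.tanh(X @ W + b)                       # hidden-layer output matrix H
    beta = np.linalg.pinv(H) @ y                 # minimizes ||H beta - Y||
    return W, b, beta

def predict_elm(X, W, b, beta):
    return np.tanh(X @ W + b) @ beta

# Toy regression problem (assumed data, for illustration only)
X = np.linspace(-1.0, 1.0, 50).reshape(-1, 1)
y = np.sin(3.0 * X[:, 0])
W, b, beta = train_elm(X, y)
y_hat = predict_elm(X, W, b, beta)
```

Because only `beta` is solved for, training reduces to a single linear least-squares problem, which is the source of the ELM's speed.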

Weighted random forest (WRF)

The training dataset is split into two categories: conventional training instances and pre-forecast samples. After training is complete, each decision tree is evaluated, and its prediction accuracy is determined as follows [81]:

τ_{i} = \frac{{({\hat{y}}_{i} - y_{i})}_{\max} + {({\hat{y}}_{i} - y_{i})}_{\min} - ({\hat{y}}_{i} - y_{i})}{\sum_{i = 1}^{n} ({\hat{y}}_{i} - y_{i})}, i = 1, 2, \dots, n

(19)

where

τ_{i}

is the correct prediction rate.

{\hat{y}}_{i}

denotes the predicted value, while

y_{i}

signifies the actual value. n is the number of decision trees.

To mitigate the effects of suboptimal decision tree training on the RF model’s performance, the predictions are adjusted based on the accuracy of each individual decision tree. Consequently, the WRF model is derived in the manner described below:

\begin{cases} {\hat{y}}_{R F} = \sum_{i = 1}^{n} τ_{i} Γ (x, T_{i}) \\ T_{n} = \{(x_{1}, y_{1}), (x_{2}, y_{2}), \dots, (x_{n}, y_{n})\}, x \in R, y \in R \end{cases}

(20)

Through training, a predictive model

Γ (x, T_{n})

is constructed using the training dataset

T_{n}

.

x

stands for the input vector set and

y

signifies the resulting output.
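A small numerical sketch may help in reading Equations (19) and (20). It assumes that the error of each tree is summarized as a single positive number (for example, a mean absolute error on the pre-forecast samples), which is an interpretation and not something the equations state. The weights are used exactly as Equation (19) defines them and are not renormalized.

```python
def tree_weights(errors):
    """Equation (19): trees with smaller error receive larger weight."""
    e_max, e_min, total = max(errors), min(errors), sum(errors)
    return [(e_max + e_min - e) / total for e in errors]

def weighted_forest_prediction(tree_predictions, weights):
    """Equation (20): weighted sum of the individual tree outputs."""
    return sum(w * p for w, p in zip(weights, tree_predictions))

# Hypothetical per-tree errors and per-tree predictions for one input
errors = [1.0, 2.0, 3.0]
weights = tree_weights(errors)
y_rf = weighted_forest_prediction([10.0, 12.0, 14.0], weights)
```

In this example the most accurate tree receives half of the total weight, so a poorly trained tree has less influence on the combined forecast.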

Artificial neural network (ANN)

The ANN consists of neurons that process information through interconnected weights, a network structure, and training techniques. Capable of diverse tasks like nonlinear approximation, categorization, clustering, simulation, forecasting, and data recovery, the ANN establishes complex input–output mappings by weighting inputs, applying biases, and using transfer functions to generate outputs. Its architecture and training algorithm determine the neuron arrangement and activation functions. An ANN model was developed with the Artificial Neural Network Fitting Tool for predicting wind speeds of 11 locations of the western Himalayan State of Himachal Pradesh, India [82]. During training, the network minimized the mean squared error by adjusting weights and biases, as illustrated in Figure 3.

The ANN has an input layer for receiving data, hidden layers for processing, and an output layer for generating predictions. The input layer takes the feature vector

x

, the hidden layer transforms it, and the output layer produces the final prediction. The hidden layer is termed as such because its internal values are not directly observable like the input matrix

X

or the output labels

y

.

Recurrent neural network (RNN)

RNNs process sequential data by recursively connecting nodes in a chain-like structure, capturing temporal dynamics through state feedback. They map conditional data to outputs, making them suitable for time series prediction with multiple influencing factors. An RNN includes an input layer for data reception, hidden layer neurons for encoding sequence patterns, and an output layer for predictions. The training process involves unfolding the hidden states across computational stages, as depicted in Figure 4, which illustrates the basic RNN structure, its time-unfolded diagram, and a multi-layer RNN configuration from left to right.

\begin{cases} h_{t} = f (x_{t}, h_{t - 1}; θ_{f}) \\ y_{t} = g (h_{t}; θ_{g}) \end{cases}

(21)

where

h_{t}

represents the state of the hidden layer at time

t

.

f (\cdot)

and

g (\cdot)

denote the nonlinear mappings applied to the inputs and states within the network.

x_{t}

and

y_{t}

represent the input and output of the RNN at time

t

, respectively.

θ_{f}

and

θ_{g}

are parameters associated with the RNN, which could include weights and biases. The inputs, outputs, and hidden layer states of the RNN are all vectors.
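Equation (21) leaves the mappings f and g generic. The sketch below picks one common instantiation as an assumption: a tanh state update and a linear output layer. The weight names and the random toy sequence are illustrative only.

```python
import numpy as np

def rnn_forward(xs, W_x, W_h, b_h, W_y, b_y):
    """Run a single-layer RNN over a sequence; return all outputs and the last state."""
    h = np.zeros(W_h.shape[0])                  # initial hidden state h_0
    ys = []
    for x_t in xs:
        h = np.tanh(W_x @ x_t + W_h @ h + b_h)  # h_t = f(x_t, h_{t-1}; theta_f)
        ys.append(W_y @ h + b_y)                # y_t = g(h_t; theta_g)
    return np.array(ys), h

rng = np.random.default_rng(0)
n_in, n_hidden, n_out, T = 3, 4, 1, 5
xs = rng.normal(size=(T, n_in))                 # toy input sequence
W_x = rng.normal(scale=0.5, size=(n_hidden, n_in))
W_h = rng.normal(scale=0.5, size=(n_hidden, n_hidden))
b_h = np.zeros(n_hidden)
W_y = rng.normal(scale=0.5, size=(n_out, n_hidden))
b_y = np.zeros(n_out)
ys, h_T = rnn_forward(xs, W_x, W_h, b_h, W_y, b_y)
```

The same parameters are reused at every time step, which is what the time-unfolded diagram of an RNN expresses.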

RNNs are susceptible to vanishing and exploding gradients when errors are backpropagated across many time steps. These issues, particularly in traditional single-layer RNNs with recursive connections, can hinder the accurate modeling of nonlinear relationships in long time series data.

For example, a layered RNN-based model has been used to predict long-term wind speed and power, with tap delay enabling infinite dynamic responses to time series inputs [56]. Three state-of-the-art RNN models with attention mechanisms have also been studied for short-term wind power forecasting, with empirical results supporting their efficacy in complex prediction tasks [83].

Long short-term memory (LSTM) networks

The LSTM, an early RNN gating mechanism, features three gates within its units: the input gate, forget gate, and output gate. These gates create an internal feedback loop, with the input gate controlling state updates based on current and previous inputs, the forget gate managing state retention from the previous step, and the output gate regulating the output based on the internal state. A wind power prediction model combining LSTM with an improved Adam optimizer with loss shrinkage (LsAdam) has been proposed. The model uses LsAdam to accelerate the LSTM network, producing ultra-short-term wind power forecasts [84].

The LSTM, which enhances the RNN architecture with gated units in place of standard neurons, selectively retains and forgets information across sequences. This mechanism mitigates memory limitations and gradient issues in traditional RNNs for long time series, enabling the LSTM to outperform them in predictive tasks on extended historical data. The LSTM is composed of three distinct gate structures and a memory storage cell. The internal structure of the LSTM unit is shown in Figure 5, where

S_{t}

represents the state information of the LSTM unit at time step

t

,

h_{t}

is the output of the hidden layer at time step

t

,

f_{t}

is the forget gate,

g_{t}

is the input gate,

{\bar{S}}_{t}

is the information at the current time step,

o_{t}

is the output gate, and

σ

is the sigmoid activation function.

\begin{cases} f_{t} = σ (W_{f} [h_{t - 1}, x_{t}] + b_{f}) \\ g_{t} = σ (W_{g} [h_{t - 1}, x_{t}] + b_{g}) \\ {\bar{S}}_{t} = \tanh (W_{\bar{S}} [h_{t - 1}, x_{t}] + b_{\bar{S}}) \\ S_{t} = f_{t} S_{t - 1} + g_{t} {\bar{S}}_{t} \\ o_{t} = σ (W_{o} [h_{t - 1}, x_{t}] + b_{o}) \\ h_{t} = o_{t} \tanh S_{t} \end{cases}

(22)

where

W_{f}

,

W_{g}

,

W_{\bar{S}}

, and

W_{o}

represent the weight matrices corresponding to each module.

b_{f}

,

b_{g}

,

b_{\bar{S}}

, and

b_{o}

represent the bias terms. tanh is the hyperbolic tangent activation function.

σ

is the sigmoid activation function, defined as follows:

σ (x) = 1 / (1 + e^{- x})

(23)

The output layer uses the following formula to pass

h_{t}

through a fully connected layer to obtain the final prediction

y_{t}

:

y_{t} = σ (W_{y} h_{t} + b_{y})

(24)

where

W_{y}

represents the weight matrix and

b_{y}

represents the bias term.
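Equations (22) to (24) translate almost line by line into code. The following NumPy sketch is a forward pass only, with randomly initialized weights standing in for trained ones; the dictionary keys and the dimensions are assumptions made for illustration, and [h_{t-1}, x_t] is read as vector concatenation.

```python
import numpy as np

def sigmoid(x):
    """Equation (23)."""
    return 1.0 / (1.0 + np.exp(-x))

def lstm_step(x_t, h_prev, S_prev, p):
    """One LSTM step following Equation (22)."""
    z = np.concatenate([h_prev, x_t])           # [h_{t-1}, x_t]
    f_t = sigmoid(p["W_f"] @ z + p["b_f"])      # forget gate
    g_t = sigmoid(p["W_g"] @ z + p["b_g"])      # input gate
    S_bar = np.tanh(p["W_S"] @ z + p["b_S"])    # candidate information
    S_t = f_t * S_prev + g_t * S_bar            # updated cell state
    o_t = sigmoid(p["W_o"] @ z + p["b_o"])      # output gate
    h_t = o_t * np.tanh(S_t)                    # hidden output
    return h_t, S_t

rng = np.random.default_rng(0)
n_in, n_hidden = 2, 3
p = {}
for name in ("f", "g", "S", "o"):
    p["W_" + name] = rng.normal(scale=0.5, size=(n_hidden, n_hidden + n_in))
    p["b_" + name] = np.zeros(n_hidden)
W_y = rng.normal(scale=0.5, size=(1, n_hidden))
b_y = np.zeros(1)

h, S = np.zeros(n_hidden), np.zeros(n_hidden)
for x_t in rng.normal(size=(4, n_in)):          # toy sequence of four steps
    h, S = lstm_step(x_t, h, S, p)
y = sigmoid(W_y @ h + b_y)                      # Equation (24)
```

The additive update of `S_t` is the part that lets information pass across many steps without being repeatedly squashed.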

Gated recurrent unit (GRU)

The GRU, a variant of LSTM, is an RNN that uses gating to control data flow. It tackles vanishing gradients and long-range dependency issues with a simpler parameter set than LSTM. The GRU’s core is to use specific neurons to maintain and transfer information over time, focusing on creating lasting memories, minimizing information loss, and detecting long-term dependencies [85]. The structure of the GRU is shown in Figure 6. A multi-modal multi-task spatiotemporal attention network (M2STAN) model is presented. The developed model employs a bidirectional gated recurrent unit (Bi-GRU) to model the spatial and temporal dependence, respectively [86].

In the GRU framework, there are two gates, namely a reset gate and an update gate, compared to the three in LSTM. The update gate determines how much data is retained and discarded, similar to LSTM’s forget and input gates. The reset gate decides how much past data to erase. The mathematical formulation is as follows [85]:

\begin{cases} z_{t} = σ (W_{z} \cdot [h_{t - 1}, x_{t}] + b_{z}) \\ r_{t} = σ (W_{r} \cdot [h_{t - 1}, x_{t}] + b_{r}) \\ \tilde{h_{t}} = \tanh (W_{h} \cdot [h_{t - 1} ⊙ r_{t}, x_{t}] + b_{h}) \\ h_{t} = (1 - z_{t}) ⊙ h_{t - 1} + z_{t} ⊙ \tilde{h_{t}} \end{cases}

(25)

where

z_{t}

denotes the update gate at timestep

t

.

W_{z}

signifies the weight matrix associated with the update gate.

r_{t}

represents the reset gate.

W_{r}

is the weight matrix for the reset gate. The symbol

\tilde{h_{t}}

denotes the memory content that leverages the reset gate to retain pertinent information from the prior state.

W_{h}

is the weight matrix for this memory content.

σ

is the sigmoidal function.

⊙

is the Hadamard product of two states, while

b_{z}

,

b_{r}

, and

b_{h}

correspond to the bias terms for the update gate, reset gate, and memory content, respectively. The square brackets denote the concatenation of two vectors.
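A forward step of Equation (25) can be sketched as below. This is an illustrative NumPy version with randomly initialized parameters rather than a trained model; the bracket [h_{t-1}, x_t] is implemented as vector concatenation, and the parameter names and sizes are assumptions.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def gru_step(x_t, h_prev, p):
    """One GRU step following Equation (25)."""
    z_in = np.concatenate([h_prev, x_t])
    z_t = sigmoid(p["W_z"] @ z_in + p["b_z"])               # update gate
    r_t = sigmoid(p["W_r"] @ z_in + p["b_r"])               # reset gate
    cand_in = np.concatenate([h_prev * r_t, x_t])           # reset applied to the old state
    h_tilde = np.tanh(p["W_h"] @ cand_in + p["b_h"])        # candidate memory content
    return (1.0 - z_t) * h_prev + z_t * h_tilde             # blend old and new state

rng = np.random.default_rng(0)
n_in, n_hidden = 2, 3
p = {}
for name in ("z", "r", "h"):
    p["W_" + name] = rng.normal(scale=0.5, size=(n_hidden, n_hidden + n_in))
    p["b_" + name] = np.zeros(n_hidden)

h = np.zeros(n_hidden)
for x_t in rng.normal(size=(4, n_in)):                      # toy sequence
    h = gru_step(x_t, h, p)
```

Compared with the LSTM, there is no separate cell state and one fewer weight matrix, which accounts for the smaller parameter set.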

Temporal convolutional network (TCN)

In contrast to RNN-based methods, the TCN model processes sequential data without recursion, enhancing parallel computation capabilities and mitigating gradient issues. A spatio-temporal convolutional network (STCN) utilizing a directed graph convolution structure has been proposed. A temporal convolution network is employed to characterize the temporal features of wind power, achieving favorable results [87]. Employing sequential input data x and a filter q, the operation of causal convolution is depicted as follows:

q (t) * x = \sum_{s = 0}^{S - 1} q (s) x (t - d s)

(26)

with

S

representing the size of the convolutional kernel and d denoting the dilation factor that regulates the spacing of the input data for convolution. The output from the causal convolution is solely influenced by the current and preceding data points, establishing a strictly causal connection. Moreover, by deepening the layers, a causal convolutional network can achieve a large receptive field.
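The causal convolution of Equation (26) is short enough to write out directly. In this sketch, samples before the start of the sequence are treated as zero, which is a padding assumption not stated in the text; the input and filter values are arbitrary.

```python
def causal_dilated_conv(x, q, d=1):
    """y(t) = sum_{s=0}^{S-1} q(s) * x(t - d*s), with zeros before the sequence start."""
    S = len(q)
    out = []
    for t in range(len(x)):
        total = 0.0
        for s in range(S):
            idx = t - d * s
            if idx >= 0:            # never look at future samples or before t = 0
                total += q[s] * x[idx]
        out.append(total)
    return out

x = [1.0, 2.0, 3.0, 4.0, 5.0]
q = [0.5, 0.5]                                  # kernel size S = 2
y_d1 = causal_dilated_conv(x, q, d=1)           # averages x(t) and x(t-1)
y_d2 = causal_dilated_conv(x, q, d=2)           # averages x(t) and x(t-2)
```

Stacking such layers with dilation factors 1, 2, 4, and so on lets the receptive field grow exponentially with depth while each layer keeps a kernel of size two.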

The structure of the TCN based on causal convolution with a kernel size of two is presented in Figure 7. The output of different layers can transmit the relations between the historical sequential data and the forecasting values at different temporal scales.

Convolutional neural network (CNN)

The CNN, known for weight sharing and local connectivity, efficiently extracts features from raw data. It typically comprises convolutional, pooling, and fully connected layers, reducing parameter count and model complexity. The typical structure of a CNN is shown in Figure 8. Convolutional layers use kernels for feature extraction, with more kernels being preferable. Pooling layers, which can use max or average methods, reduce dimensionality, prevent overfitting, and improve robustness, and they are often paired with ReLU activation. Fully connected layers transform pooled features into vectors for further processing. In the pooling layer, the max pooling method is used to obtain the maximum feature sequence. For a sequence x, each r-sized window of consecutive vectors is subjected to repeated max pooling. Equation (27) represents the maximum value from vector

x_{k}^{l}

to

x_{k + r - 1}^{l}

[85].

{\hat{x}}_{k}^{l} = \max (x_{k}^{l} : x_{k + r - 1}^{l})

(27)

CNNs use convolutional and pooling layers to extract important information and automatically obtain feature vectors, simplifying the processes of feature extraction and data reconstruction and elevating the quality of data features.
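The max pooling operation of Equation (27) can be illustrated on a one-dimensional sequence. The `stride` argument is an assumption added for illustration; with a stride of one, a window starts at every index k.

```python
def max_pool_1d(x, r, stride=1):
    """Return the maximum of each window x[k], ..., x[k + r - 1]."""
    return [max(x[k:k + r]) for k in range(0, len(x) - r + 1, stride)]

x = [0.2, 0.9, 0.4, 0.7, 0.1, 0.6]
pooled_overlapping = max_pool_1d(x, r=2)            # → [0.9, 0.9, 0.7, 0.7, 0.6]
pooled_disjoint = max_pool_1d(x, r=2, stride=2)     # → [0.9, 0.7, 0.6]
```

With non-overlapping windows the sequence length is halved, which is how pooling reduces dimensionality before the fully connected layers.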

3.3. Hybrid Prediction Model of Ultra-Short-Term Wind Power Prediction

3.3.1. Weighted Combination Prediction Method

The weighted ensemble forecasting method uses multiple prediction models to generate forecasts, which are then combined using weights to form the final forecast.

For example, a novel self-adaptive approach termed multiple challengers has been introduced for kernel recursive least-squares machines [88]. This approach, denoted as the multiple challengers kernel recursive least squares (MC-KRLS) model, employs multiple approximate linear dependency kernel recursive least squares (ALD-KRLS) algorithms or other kernel-based machines. What distinguishes this method is its utilization of the same input data across various algorithms, yet each is equipped with a unique dictionary. These dictionaries are intricately linked, primarily through their size, allowing for a weighted ensemble forecasting method that enhances predictive accuracy and adaptability.
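As a generic illustration of weighted combination (not the MC-KRLS scheme of [88]), the sketch below weights each model by the inverse of its validation error. The inverse-error rule and all numbers are assumptions chosen for simplicity; other weighting rules are equally possible.

```python
def inverse_error_weights(errors):
    """Weights proportional to 1/error, normalized to sum to one."""
    inv = [1.0 / e for e in errors]
    total = sum(inv)
    return [v / total for v in inv]

def combine(forecasts, weights):
    """Weighted sum of the individual model forecasts."""
    return sum(w * f for w, f in zip(weights, forecasts))

# Hypothetical validation errors of three models and their forecasts for one step
weights = inverse_error_weights([1.0, 2.0, 4.0])
combined = combine([10.0, 12.0, 9.0], weights)
```

The model with the smallest validation error dominates the result, while the weaker models still contribute.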

3.3.2. Fusion Combination Prediction Method

In recent years, the hybrid prediction method of input optimization, model optimization, and error correction has emerged as a prominent area of research. This method aims to enhance the accuracy of predictions by optimizing each stage of the process. Below is a summary of the relevant literature, which categorizes the integrated methods for ultra-short-term wind power prediction into three aspects. The first aspect is input optimization. The second is fusion input optimization and model optimization, and the third is error correction on the basis of fusion input optimization and model optimization.

Hybrid method including input optimization

Studies on ultra-short-term wind power prediction have integrated signal decomposition or feature extraction methods, and several strategies have been adopted for input optimization. Techniques such as the t-SNE dimensionality reduction algorithm are used to condense the input data [89]. The use of NWP forecast data and time windows aids in identifying periods when the precision of rolling forecasts is less than optimal [90]. Furthermore, wavelet decomposition is implemented to enhance feature representation [91]. For ultra-short-term wind power prediction, a CNN is commonly used for feature extraction and is integrated with methods such as LSTM [92,93,94], VMD [95], and GRUs [85] to create hybrid models that enhance the forecasting process.

In addition, a study introduced a novel ultra-short-term forecasting approach for wind power and wind speed utilizing the Takagi–Sugeno (T–S) fuzzy model. This model is capable of achieving precise forecasting outcomes through efficient linearization without the need for extensive historical data [96]. The proposed method uses meteorological measurements as input and identifies the parameters of the T–S fuzzy model via fuzzy c-means clustering and recursive least squares.

Hybrid method including model optimization

Based on input optimization, model optimization methods are proposed. In the realm of very short-term wind power prediction, model optimization can be primarily categorized into two types, which are structural optimization and parameter optimization.

Parameter optimization refers to tuning the model parameters with an optimization algorithm. Methods such as particle swarm optimization (PSO) [62], the GSA [63], enhanced bee swarm optimization (EBSO) [77], improved particle swarm optimization (IPSO) [97], the gray wolf optimizer (GWO) [76], coherent long short-term memory (CLSTM) [98], adaptive particle swarm optimization algorithm-based ant colony optimization (APSOACO) [99], the niche immune lion algorithm (NILA) [81], and monarch butterfly optimization (MBO) [100] have been adopted. It is worth noting that the focus here is on the optimization of model parameters by these methods, which does not imply the absence of structural optimization within the models.

The optimization algorithms mentioned above are briefly defined as follows. PSO is a swarm intelligence algorithm inspired by bird flock foraging behavior; it dynamically adjusts search directions through individual historical optima and global best solutions and is suitable for continuous parameter space optimization. The GSA simulates physical gravitational interactions, achieving progressive multi-dimensional parameter optimization through mass acceleration mechanisms. EBSO improves the scouting honey collection mechanism of bee algorithms by introducing adaptive neighborhood search to strengthen local exploration capability. IPSO addresses PSO’s premature convergence via dynamic inertia weight adjustment and elite retention strategies. GWO is a metaheuristic algorithm mimicking gray wolf social hierarchy and hunting behavior, guided by triple-leadership mechanisms for parameter updates. APSOACO hybridizes PSO’s global search with ant colony pheromone feedback mechanisms to enhance convergence precision in high-dimensional parameter optimization. The NILA combines immune network diversity preservation with lion pride hunting strategies to avoid local optima in multimodal parameter optimization. MBO is a dual-population algorithm simulating monarch butterfly migration, balancing exploitation–exploration tradeoffs via migration operators.
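Of the algorithms listed, basic PSO is the simplest to show in code. The sketch below is a plain textbook variant, not the tuned or hybridized versions used in the cited studies; the inertia weight `w`, the acceleration coefficients `c1` and `c2`, the bounds, and the sphere test function are all illustrative assumptions. In a forecasting setting, `objective` would be replaced by a validation error as a function of the model hyperparameters.

```python
import random

def pso(objective, dim, n_particles=30, n_iter=200,
        w=0.7, c1=1.5, c2=1.5, bounds=(-5.0, 5.0), seed=0):
    """Minimize objective over a box using basic particle swarm optimization."""
    rng = random.Random(seed)
    lo, hi = bounds
    pos = [[rng.uniform(lo, hi) for _ in range(dim)] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    pbest = [p[:] for p in pos]                       # individual historical optima
    pbest_val = [objective(p) for p in pos]
    g = min(range(n_particles), key=lambda i: pbest_val[i])
    gbest, gbest_val = pbest[g][:], pbest_val[g]      # global best solution
    for _ in range(n_iter):
        for i in range(n_particles):
            for k in range(dim):
                r1, r2 = rng.random(), rng.random()
                vel[i][k] = (w * vel[i][k]
                             + c1 * r1 * (pbest[i][k] - pos[i][k])
                             + c2 * r2 * (gbest[k] - pos[i][k]))
                pos[i][k] = min(hi, max(lo, pos[i][k] + vel[i][k]))
            val = objective(pos[i])
            if val < pbest_val[i]:
                pbest[i], pbest_val[i] = pos[i][:], val
                if val < gbest_val:
                    gbest, gbest_val = pos[i][:], val
    return gbest, gbest_val

def sphere(p):
    """Toy objective with its minimum of zero at the origin."""
    return sum(v * v for v in p)

best, best_val = pso(sphere, dim=2)
```

The velocity update is the core of the method: each particle is pulled toward its own best position and toward the best position found by the swarm.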

In fact, the models mentioned above, such as WT-ANFIS [62], EEMD-PE-LSSVM [63], WD-BP [99], WD-WRF [81], and CEEMDAN-LSTM [100], all incorporate input optimization. Furthermore, models like VMD-CNN-LSTM [97], ST-MSVM [76], and DOCREL [98] not only include input optimization but also involve the combination and reorganization of different neural networks or machine learning models, which is structural optimization. Here, models that have these characteristics and also involve parameter optimization are categorized under parameter optimization. Structural optimization, on the other hand, focuses more on the innovation of model structures.

Hybrid method including error processing techniques

The error correction based on fusion input optimization and model optimization is suggested to achieve a high accuracy of ultra-short-term wind power prediction. These methods establish nonlinear relationships between prediction results and errors, creating wind speed correction models to boost the generalization performance of prediction models and recover residual information lost during the learning process.

For ultra-short-term wind power prediction, a wind power forecasting method that integrates feature analysis with error correction has been presented to enhance the accuracy of ultra-short-term predictions [101]. The light gradient boosting machine (LightGBM) is used to predict and correct errors, refining the forecasted outcomes. By predicting the errors using LightGBM and superimposing them onto the preliminary prediction results, more precise prediction outcomes are obtained. The principles of LightGBM will be elaborated in detail in the next section.
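The two-stage idea of predicting the error and superimposing it on the preliminary forecast can be sketched as follows. To keep the example dependency-free, ordinary least squares stands in for LightGBM as the error model, which is a deliberate substitution; the synthetic wind speed and power data are invented, and the comparison is in-sample only.

```python
import numpy as np

rng = np.random.default_rng(0)
wind = rng.uniform(3.0, 12.0, size=200)                          # synthetic wind speed
power = 0.05 * wind ** 2 + rng.normal(scale=0.05, size=200)      # synthetic power target

# Stage 1: preliminary forecast from a deliberately simple linear model
A1 = np.column_stack([wind, np.ones_like(wind)])
coef1, *_ = np.linalg.lstsq(A1, power, rcond=None)
prelim = A1 @ coef1

# Stage 2: model the residual error (least squares stands in for LightGBM here)
residual = power - prelim
A2 = np.column_stack([wind ** 2, wind, np.ones_like(wind)])
coef2, *_ = np.linalg.lstsq(A2, residual, rcond=None)
corrected = prelim + A2 @ coef2                                  # superimpose predicted error

rmse_prelim = float(np.sqrt(np.mean((power - prelim) ** 2)))
rmse_corrected = float(np.sqrt(np.mean((power - corrected) ** 2)))
```

In practice the error model would be trained on one period and applied to a later one, so the improvement would have to be confirmed on held-out data.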

3.4. Other Features of Ultra-Short-Term Wind Power Prediction

The input data type for wind power forecasting is crucial. Some wind power predictions are based solely on learning relevant features from historical wind power data and making predictions accordingly. Atmospheric dynamics, though complex, follow fundamental physical laws [102,103]. Hence, some wind power predictions use meteorological data, typically depending on NWP due to its straightforward application. This review categorizes the input data into two types. One type solely contains wind information, and another includes both wind information and NWP data.

Wind power forecasting at different spatial scales imposes different requirements on the types of application models and the analysis of model correlations. This review, oriented towards engineering applications, summarizes the spatial scales of related research. The spatial scales are categorized into two types, which are the wind farm and wind farm region. This classification helps relevant researchers find the most suitable wind power prediction methods according to the spatial scale of their wind power prediction applications.

When evaluating the efficacy of wind power prediction, there is an overlap of evaluation metrics used across different studies, as well as some distinctions. Classifying and summarizing evaluation metrics offers a structured approach to compare models and select the most appropriate ones for specific scenarios.

In this section, the input data type, the evaluation metric used, and the spatial scale applied to ultra-short-term wind power prediction in each study mentioned before are detailed in Table 1.

4. Short-Term Wind Power Prediction

This section summarizes the literature on the short-term forecasting of wind speed and wind power. The literature is classified based on model types, and the input data types, evaluation metrics used, and spatial scale applied in each study are also summarized. The input data are divided into two categories: one contains only wind power information, and the other includes both wind power information and meteorological data.

4.1. Traditional Statistical Model of Short-Term Wind Power Prediction

Short-term wind power forecasting has been explored extensively. As with ultra-short-term wind power prediction, early research efforts focused primarily on traditional statistical models [104]. These efforts not only provided a theoretical foundation for subsequent forecasting models but also laid an important cornerstone for technological advancements in this area. The traditional statistical models applied to short-term wind power forecasting are largely the same as those used for the ultra-short term. The model principles are therefore not repeated here; only the work on short-term wind power forecasting based on traditional statistical models is summarized.

The hourly mean wind attributes are forecasted 1 h ahead for two wind observation sites based on the ARMA method [47]. The application of fractional ARIMA (f-ARIMA) models is examined for modeling and forecasting wind speeds over the day-ahead and two-day-ahead horizons [105]. One study verified that the SARIMA is effective in capturing daily cycle characteristics in 24 h wind power forecasting, but it relies on complete seasonal cycle data. The model can represent periodic effects such as diurnal and seasonal variations but suffers from high data requirements and poor adaptation to non-fixed cycles [106].

Some studies have combined the Kalman filter with ARIMA for short-term wind power prediction. The resulting model adapts dynamically to data changes and is strongly resistant to noise, and its validity was demonstrated on actual wind farm examples [107].

4.2. Machine Learning-Based Model of Short-Term Wind Power Prediction

Light gradient boosting machine (LightGBM)

A short-term wind power prediction framework has been proposed [108]. It integrates a LightGBM, mutual information coefficient (MIC), and nonparametric regression techniques.

LightGBM is an efficient GBM implementation that lowers computational complexity and memory usage with histogram-based algorithms and gradient-based one-side sampling (GOSS), making it well suited to high-dimensional sparse data. It excels in large-scale dataset processing, outperforming traditional GBM models in training speed and memory efficiency. GOSS keeps the data points with larger gradients and randomly samples those with small gradients, reweighting the sampled points so that the data distribution is approximately preserved when the information gain is calculated. The information gain equation is as follows:

{\tilde{V}}_{j} (d) = \frac{1}{n} (\frac{{(\sum_{x_{i} \in A : x_{i j} \leq d} g_{i} + \frac{1 - a}{b} \sum_{x_{i} \in B : x_{i j} \leq d} g_{i})}^{2}}{n_{l}^{j} (d)} + \frac{{(\sum_{x_{i} \in A : x_{i j} > d} g_{i} + \frac{1 - a}{b} \sum_{x_{i} \in B : x_{i j} > d} g_{i})}^{2}}{n_{r}^{j} (d)})

(28)

where

{\tilde{V}}_{j} (d)

denotes the information gain. Once the data are sorted in descending order of gradient magnitude, subset

A

encompasses the data with the highest gradients, subset

B

consists of a randomly selected segment of the data,

n

signifies the aggregate count of instances,

j

refers to the feature being segmented, and

d

indicates the specific point of segmentation.
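The sampling step behind Equation (28) can be sketched in a few lines. Here a and b are taken to be the fraction of large-gradient instances kept and the fraction of the remaining instances sampled at random, following the usual description of GOSS; the function name and the toy gradient values are assumptions, and this is not LightGBM's internal code.

```python
import random

def goss_sample(gradients, a=0.2, b=0.1, seed=0):
    """Return index sets A and B and the per-instance weights used in the gain."""
    rng = random.Random(seed)
    n = len(gradients)
    order = sorted(range(n), key=lambda i: abs(gradients[i]), reverse=True)
    top_k = int(a * n)
    rand_k = int(b * n)
    A = order[:top_k]                          # instances with the largest gradients
    B = rng.sample(order[top_k:], rand_k)      # random subset of the remaining instances
    weights = {i: 1.0 for i in A}
    weights.update({i: (1.0 - a) / b for i in B})   # amplification factor (1 - a) / b
    return A, B, weights

gradients = [0.9, -0.1, 0.05, -0.8, 0.2, 0.01, -0.3, 0.02, 0.6, -0.04]
A, B, weights = goss_sample(gradients, a=0.2, b=0.1)
```

The factor (1 - a) / b is the same one that multiplies the sums over B in Equation (28); it compensates for the small-gradient instances that were not sampled.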

Support vector machine (SVM)

Similarly to its application in ultra-short-term wind power forecasting methods, the SVM model is also widely utilized in short-term wind power prediction. The SVM is utilized as a model within hybrid algorithms [52]. In hybrid wind forecasting models, the SVM is integrated with ARIMA to predict wind speed and power generation.

The least squares support vector machine (LSSVM) simplifies the SVM optimization problem by minimizing squared errors, converting it into a linear equation system. It replaces the SVM’s inequality constraints with equality ones, streamlining the optimization. The LSSVM solves linear equations for the decision function, easing computation and speeding up processing, which is beneficial for large-scale problems. As an SVM enhancement, the LSSVM offers computational efficiency and is preferred for its simplified optimization in practical applications, especially in hybrid wind speed forecasting models combining phase space reconstruction and Markov chains. Its robust performance makes the LSSVM a solid foundation for model optimization and a popular choice for wind power prediction, often paired with feature extraction and optimization algorithms [61,109,110].

Support vector regression (SVR)

SVR represents a significant application of the SVM in the context of regression problems. The fundamental principle of SVR lies in minimizing the model’s error by maximizing the margin while allowing for a specified tolerance to enhance the model’s flexibility in handling data.

SVR methods aim to reveal linear or nonlinear data patterns. While traditional regression assumes Gaussian error terms, real-world applications like wind power prediction and direction estimation often involve non-Gaussian noise, such as beta or Laplacian distributions. In such instances, conventional regression techniques fail to achieve optimal performance. A method involving the uniform model of ν-support vector regression for the general noise model (N-SVR) is proposed [49]. There are also models based on the improved genetic algorithm for predicting wind power using SVR [111].

Extreme learning machine (ELM)

ELM-based neural networks are broadly applied in short-term wind power prediction. Five hybrid methods, including ELM-based neural networks, are considered using historical wind speed data [112]. To assess the impact of varying feature counts, two distinct analyses are performed, with one utilizing solely wind velocity as a predictor and another incorporating a combination of wind velocity alongside additional meteorological factors. Among the five hybrid NN models, ELM-based NN models demonstrated accurate forecasting outcomes coupled with reduced computation times.

Artificial neural network (ANN)

The ANN is capable of handling nonlinear problems and possesses adaptive and self-learning capabilities, which is why it is extensively utilized in short-term wind power prediction [113]. Wind speed forecasting has been analyzed from both temporal and spatial aspects, with the ANN employed for prediction [114]. Despite these advantages, there is also room for improvement in ANN models for wind power prediction, and a study on potential enhancements to their efficiency and stability has been conducted [115].

Backpropagation (BP)

BP neural networks, a mainstay in artificial neural networks, are employed for learning data representations and have been utilized in short-term wind power prediction to capture the intricacies of meteorological factor interactions with power output.

The BP network is a standard method in ANNs, as shown in Figure 9. It features an input layer that passes activation patterns to hidden layer neurons, whose outputs then reach the output layer. This final layer produces the network’s response to the input signals [116].

Though BP neural networks might not match cutting-edge performance in short-term wind power prediction, their ease of use and interpretability make them a valuable asset and benchmark in predictive modeling. They also provide an accessible starting point for newcomers to neural network-based forecasting.

Recurrent neural network (RNN)

The RNN has emerged as a powerful tool for short-term wind power forecasting due to its ability to capture temporal dependencies in sequential data. Unlike traditional feedforward neural networks, RNNs maintain a hidden state that allows them to retain information from previous time steps, making them particularly effective for modeling time series data such as wind speed and power generation. Recent studies have shown that RNN-based models can achieve superior forecasting accuracy compared to conventional methods, especially in scenarios characterized by the nonlinear patterns and complex dynamics inherent in wind power generation [117].

Long short-term memory (LSTM) network

In the realm of short-term wind power forecasting, the LSTM network has garnered significant attention due to its capability to model sequential data with long-range dependencies. LSTM models have been implemented in various configurations to enhance their predictive performance [57,118]. For instance, some studies have focused on optimizing LSTM models through techniques such as modified particle swarm optimization to fine-tune hyperparameters, which is crucial for mitigating overfitting and ensuring robust model performance across different data splits [119]. Moreover, the application of LSTM in wind power forecasting has been extended to include attention mechanisms, which allow the model to focus on the most relevant features when making predictions [120]. This integration of attention with LSTM can lead to more accurate and interpretable models, as it provides a way to understand the model’s decision-making process by highlighting the input features that contribute most significantly to the forecast.

Gated recurrent unit (GRU)

The GRU is a significant advancement in recurrent neural network architectures. It addresses the challenges of capturing long-term dependencies in sequential data while mitigating the computational intensity associated with LSTM networks. In the context of short-term wind power prediction, GRUs have demonstrated their effectiveness in modeling the complex nonlinear relationships within meteorological and turbine operational data. Their ability to adapt to varying patterns in wind speed and power output makes GRUs a valuable tool for enhancing the accuracy of forecasting models [55,58]. By selectively filtering relevant information through the update gate, GRUs can focus on the most salient features contributing to the prediction, leading to more robust and responsive models. Owing to this performance and stable efficiency, GRU models are often incorporated as a component of hybrid models for short-term wind power forecasting [121].
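The role of the update gate described above can be illustrated with a minimal sketch of a single scalar GRU cell. The parameter names (wz, uz, bz, and so on) are hypothetical, the weights are not trained, and the sign convention for the update gate varies between references; the version below is one common choice and is not taken from any of the cited studies.

```python
import math

def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))

def gru_step(x, h_prev, p):
    """One step of a scalar GRU cell; p is a dict of hypothetical weights."""
    z = sigmoid(p["wz"] * x + p["uz"] * h_prev + p["bz"])  # update gate
    r = sigmoid(p["wr"] * x + p["ur"] * h_prev + p["br"])  # reset gate
    # Candidate state: the reset gate scales how much past state is used.
    h_cand = math.tanh(p["wh"] * x + p["uh"] * (r * h_prev) + p["bh"])
    # The update gate blends the previous state with the candidate state.
    return (1.0 - z) * h_prev + z * h_cand

def run_gru(xs, p, h0=0.0):
    """Apply the cell along a sequence and return all hidden states."""
    h, states = h0, []
    for x in xs:
        h = gru_step(x, h, p)
        states.append(h)
    return states
```

When the update gate is close to zero the cell simply carries its previous state forward, and when it is close to one the state is overwritten by the new candidate, which is the filtering behavior referred to in the text.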

Temporal convolutional network (TCN)

The TCN is tailored for sequence modeling and performs well in short-term wind power forecasting. These networks use causal and dilated convolutions along with residual connections to capture temporal dependencies, which is vital for precise predictions. TCNs can retain a longer history than recurrent networks, which is beneficial for long-term pattern recognition in tasks like solar power forecasting. Their reported accuracy advantage over other models underscores the potential of convolutional architectures in renewable energy forecasting.

For instance, a deep learning architecture, STCN, is proposed based on a graph model for spatio-temporal wind power forecasting [87]. This method has shown promising results, suggesting that the integration of TCNs can lead to improved forecasting performance.
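The causal and dilated convolutions that TCNs rely on can be sketched in a few lines. The function names and the zero-padding choice below are assumptions made for illustration; residual connections and learned weights are omitted, and the code is not the STCN architecture of [87].

```python
def causal_dilated_conv(x, weights, dilation=1):
    """y[t] = sum_k weights[k] * x[t - k*dilation], with zeros before t = 0.

    Only current and past samples are used, so the output at time t
    never depends on future inputs (causality).
    """
    y = []
    for t in range(len(x)):
        acc = 0.0
        for k, w in enumerate(weights):
            idx = t - k * dilation
            if idx >= 0:
                acc += w * x[idx]
        y.append(acc)
    return y

def receptive_field(kernel_size, dilations):
    """Number of past steps visible after stacking one layer per dilation."""
    return 1 + (kernel_size - 1) * sum(dilations)
```

Stacking layers with dilations 1, 2, and 4 and a kernel size of 2 gives a receptive field of 8 time steps, which is how a TCN covers a long history with few layers.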

Convolutional neural network (CNN)

The CNN has been effectively applied to short-term wind power prediction by utilizing its spatial feature extraction capabilities for sequential data. CNNs, known for capturing local patterns through convolutional layers, can treat the time axis as a spatial dimension, enabling the identification of complex nonlinear trends indicative of future power generation.

In the context of wind power prediction, the CNN has been employed to analyze historical wind speed and power output data, with the goal of predicting future energy production [54]. Furthermore, the robust feature extraction capabilities of the CNN make it a compelling choice for handling high-dimensional data, such as that encountered in wind farm settings with multiple turbines and sensors. Its use in ensemble models or hybrid architectures, combined with other machine learning techniques like LSTM or GRUs, has been shown to enhance predictive accuracy, demonstrating the versatility and complementary nature of CNNs in a broader forecasting framework [122,123].

Hammerstein autoregression (HAR)

Hybrid models that incorporate HAR components have emerged as a significant approach in the realm of short-term wind power prediction. These models leverage autoregressive properties to capture the linear dependencies within the time series data, which is crucial for predicting future outputs based on historical trends.

The nonlinear model endeavors to reduce the discrepancy between actual and predicted outputs. The nonlinear function g (x_{t}, h) translates the input x_{t} into the noisy observed output y_{t}, where e_{t} represents the noise. The parameters of the model are optimized to lessen the error function e_{t} = y_{t} - {\hat{y}}_{t}, thereby enabling the nonlinear model to be articulated by the following equation [124]:

y_{t} = g (x_{t}, h) + e_{t}

(29)

where y_{t} corresponds to the output of the process at time t, x_{t} is the input or stimulus to the process, e_{t} denotes the error term, g refers to the functional form, and h signifies the vector of model parameters, also known as the kernel.

The HAR nonlinear model is based on the Volterra series, which resembles the Taylor series but can additionally capture system memory effects through lagged inputs, in both continuous and discrete forms. While using the Volterra series can be computationally demanding for complex systems, it can be simplified to a few terms for systems with short-term memory dependencies. The HAR model’s internal structure is depicted in Figure 10.

\begin{cases} y_{t} = \sum_{n = 1}^{M} K_{n} (x_{t}) \\ K_{n} (x_{t}) = \sum_{i_{1} = 0}^{M_{1}} \sum_{i_{2} = 0}^{M_{2}} \dots \sum_{i_{n} = 0}^{M_{n}} k_{n} (i_{1}, i_{2}, \dots, i_{n}) x_{t - i_{1}} x_{t - i_{2}} \dots x_{t - i_{n}} \\ H_{n} (x_{t}) = \sum_{i = 0}^{M} h_{n} (i) x_{t - i}^{n} \end{cases}

(30)

where K_{n} is the nth Volterra series basis, with k_{n} as the nth Volterra kernel, showing the system’s lag input dependence. In orthogonal systems, this simplifies to the Hammerstein kernel, where H_{n} is the nth Hammerstein series basis and h_{n} is the nth Hammerstein kernel, both indicating lag dependency.

The Volterra integral functional series is the classic approach for nonlinear black box dynamical system modeling. An alternative approach based on direct estimation of the Volterra kernels using the collocation method is proposed, which presents a promising alternative for optimization [125].
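As a worked illustration of the Hammerstein simplification, the sketch below evaluates the model output for given kernels. It assumes that the nth Hammerstein basis applies the kernel h_{n}(i) to the lagged input raised to the nth power (the diagonal of the nth Volterra term) and that the input is zero before the first sample; the function name and the example kernel values are hypothetical, and kernel estimation is not covered.

```python
def hammerstein_output(x, h):
    """Evaluate y_t = sum_n sum_i h_n(i) * x_{t-i}**n for every t.

    h[n-1][i] holds the kernel value h_n(i); samples before t = 0
    are treated as zero (an assumption of this sketch).
    """
    y = []
    for t in range(len(x)):
        total = 0.0
        for n, kernel in enumerate(h, start=1):
            for i, coeff in enumerate(kernel):
                if t - i >= 0:
                    total += coeff * x[t - i] ** n
        y.append(total)
    return y

# Example: first-order kernel (1.0, 0.5) and second-order kernel (0.1, 0.0).
example = hammerstein_output([1.0, 2.0, 3.0], [[1.0, 0.5], [0.1, 0.0]])
```

With only a first-order kernel the model reduces to a linear filter of the lagged inputs, while higher-order kernels add the polynomial nonlinearity.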

Gaussian process (GP)

GP models are a class of probabilistic non-parametric models that are widely used for regression and classification tasks, including wind power forecasting. The fundamental principle of a GP model is to define a prior over functions rather than specifying a fixed functional form. This is achieved by assuming that any finite set of points drawn from the process is jointly Gaussian distributed [126].

The application of GP methodologies in wind power forecasting encompasses four distinct model configurations, namely dynamic, static, direct, and indirect. These models have been individually and collectively implemented in various operational states to predict the power output [127].
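The principle of placing a Gaussian prior over functions can be made concrete with a minimal one-dimensional GP regression sketch. The squared-exponential kernel, the noise level, and all function names are assumptions chosen for illustration; they do not correspond to the dynamic, static, direct, or indirect configurations of [127].

```python
import math

def rbf(a, b, length=1.0, variance=1.0):
    """Squared-exponential covariance between two scalar inputs."""
    return variance * math.exp(-((a - b) ** 2) / (2.0 * length ** 2))

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [list(row) + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(M[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (M[r][n] - s) / M[r][r]
    return x

def gp_predict(x_train, y_train, x_query, noise=1e-6, length=1.0):
    """Posterior mean and variance of a zero-mean GP at one query point."""
    K = [[rbf(a, b, length) + (noise if i == j else 0.0)
          for j, b in enumerate(x_train)] for i, a in enumerate(x_train)]
    k_star = [rbf(x_query, a, length) for a in x_train]
    alpha = solve(K, y_train)   # K^{-1} y
    v = solve(K, k_star)        # K^{-1} k_*
    mean = sum(k * a for k, a in zip(k_star, alpha))
    var = rbf(x_query, x_query, length) - sum(k * vi for k, vi in zip(k_star, v))
    return mean, max(var, 0.0)
```

Near the training points the predictive variance shrinks toward the noise level, while far from the data it returns to the prior variance, which is what makes GP forecasts probabilistic rather than point estimates.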

K-nearest neighbors (KNN) model

In the realm of short-term wind power forecasting, the KNN model represents a straightforward yet effective instance-based learning approach [128]. The KNN method involves three steps, namely calculating the distance between a query instance and the training set, selecting the k-closest instances, and predicting the aggregate wind power output using a weighted mean based on these nearest instances.

Given that \{X_{1}, X_{2}, \dots, X_{K}\} represent the K closest points in proximity to a query point and \{Y_{1}, Y_{2}, \dots, Y_{K}\} correspond to the aggregated wind power measurements associated with these points, the predicted aggregated wind power \hat{y} (x) can be computed employing the following estimation formula:

\hat{y} (x) = \frac{\sum_{k = 1}^{K} K (x, X_{k}) \cdot Y_{k}}{\sum_{k = 1}^{K} K (x, X_{k})}

(31)

where the kernel function K (x, X_{k}) assigns weights to wind power measurements Y_{k} based on the inverse of their distances to the query point x, with Manhattan distance as the metric. Thus, the closest neighbor’s measurement has the greatest influence on the predicted value \hat{y} (x).

For a Gaussian kernel, which is the kernel function used in the cited study and is characterized by a smoothing parameter h, the kernel function K (x, X_{k}) can be expressed as

K (x, X_{k}) = e^{- \frac{{(x - X_{k})}^{2}}{2 h^{2}}}

(32)
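Equations (31) and (32) can be combined into a short sketch of the kernel-weighted KNN forecast. As an assumption of this example, the Manhattan distance is used both to select the neighbors and as the distance inside the Gaussian kernel; the function names and the fallback to a plain mean when all weights underflow are illustrative choices, not part of the cited method.

```python
import math

def manhattan(a, b):
    """Manhattan (L1) distance between two feature vectors."""
    return sum(abs(ai - bi) for ai, bi in zip(a, b))

def knn_kernel_forecast(query, X, Y, k=3, h=1.0):
    """Kernel-weighted mean of the k nearest targets, as in Equation (31)."""
    # Steps 1 and 2: compute distances and keep the k closest instances.
    nearest = sorted((manhattan(query, xi), yi) for xi, yi in zip(X, Y))[:k]
    # Step 3: Gaussian kernel weights with smoothing parameter h, Equation (32).
    weights = [math.exp(-(d ** 2) / (2.0 * h ** 2)) for d, _ in nearest]
    total = sum(weights)
    if total == 0.0:  # all weights underflowed; fall back to a plain mean
        return sum(y for _, y in nearest) / len(nearest)
    return sum(w * y for w, (_, y) in zip(weights, nearest)) / total
```

A small h concentrates the weight on the closest neighbor, whereas a large h approaches an unweighted average of the k neighbors.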

The KNN model is adept at recognizing local wind patterns for accurate short-term forecasts but struggles with scalability in high-dimensional spaces due to computational intensity. It remains a viable option for wind power prediction, especially when datasets are small and topological relationships are significant. Explanatory variables were proposed for the training of the KNN model, enabling the derivation of day-ahead aggregated points and probabilistic wind power forecasts from decentralized point forecasts of geographically dispersed wind farms.

4.3. Hybrid Prediction Model of Short-Term Wind Power Prediction

4.3.1. Weighted Combination Prediction Method

Weighted ensemble prediction methods combine several instances of the same model or multiple different prediction models without altering their structural composition, assigning a different weight to each model to enhance the accuracy of wind power prediction. Weighted ensemble prediction models are typically composed of individual machine learning models. In one approach, data features are extracted by a deep feature selection framework to obtain optimal inputs, these inputs are fed into the prediction models, and the forecasts generated by the various algorithms are combined using a blending model [52]. Additionally, some studies construct weighted ensemble prediction models by employing multiple identical neural network models in parallel, with the final result derived by selecting the best prediction from overlapping forecasts through a decision-making process [129]. Other research applies different machine learning methods with various optimization algorithms and selects the best approach for wind speed forecasting to predict short-term wind power, since each machine learning algorithm possesses unique strengths and weaknesses [112].

The introduction of these weighted ensemble prediction methods is anticipated to integrate the advantages of different algorithms by mitigating or smoothing out local forecasting errors. This compensates for individual model weaknesses, such as statistical models’ inability to handle nonlinearities or machine learning models’ overfitting. The complementary strengths of these models lead to superior predictive results compared to single models.
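One simple way to assign such weights is to make them inversely proportional to each model's mean squared error on a validation set. This is only an illustrative scheme with hypothetical function names; the cited studies use their own blending or decision-making procedures.

```python
def inverse_mse_weights(val_forecasts, val_actual):
    """Weights proportional to 1 / MSE of each model on validation data."""
    mses = []
    for preds in val_forecasts:
        mse = sum((p - a) ** 2 for p, a in zip(preds, val_actual)) / len(val_actual)
        mses.append(max(mse, 1e-12))  # guard against division by zero
    inv = [1.0 / m for m in mses]
    total = sum(inv)
    return [v / total for v in inv]

def combine(forecasts, weights):
    """Weighted combination of the individual model forecasts for one step."""
    return sum(w * f for w, f in zip(weights, forecasts))
```

The weights sum to one, so the combined forecast always lies between the smallest and largest individual forecasts, and the model with the lowest validation error contributes the most.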

4.3.2. Fusion Forecasting Method

Hybrid method including input optimization

In short-term wind power forecasting, hybrid methods incorporating input optimization have integrated signal decomposition and feature extraction techniques to extract relevant features from datasets. For instance, CNNs decompose wind power data into frequency bands to capture nonlinear features [130]. Wavelet transform is employed for time series decomposition, with feature selection conducted using NSGA-II [131]. VMD is applied to decompose wind power signals into multiple components of varying frequencies, followed by neural network-based inverse temporal feature extraction [132]. EEMD is applied to extract noise data, enhancing prediction accuracy significantly [133]. ELM is used to extract nonlinear information from raw wind speed sequences [60].

Additionally, some hybrid methods strengthen feature learning by clustering power generation equipment or weather conditions. For example, the KHC algorithm is used for the adaptive clustering of wind turbines, and SVD is employed to extract the main components of power from turbines within the same cluster [134]. Wind energy sequences are decomposed into trend and fluctuation components using VMD. Fuzzy C-means (FCM) clustering is applied to the time series segments of the trend component, and corresponding clustering is performed on the fluctuation component [135]. The FCM clustering algorithm identifies weather characteristics in different regions, and dimensionality reduction using fuzzy reasoning and NWP is employed to obtain more applicable wind speed information [136].

Hybrid method including model optimization

Based on input optimization, model optimization methods are proposed. In short-term wind power prediction models, model optimization can be primarily classified into two types, which are structural optimization and parameter optimization. For structural optimization, improvements in neural network architectures are often employed to enhance learning effectiveness and prediction accuracy. For instance, the adaptive neuro-fuzzy inference system (ANFIS) has been introduced for short-term wind power forecasting [51,114]. The QRNN model is utilized for forecasting, and based on this, the DQR model has been developed to improve performance [137]. The improved deep mixture density network (IDMDN) is used for probabilistic wind power prediction across multiple wind farms and entire regions [40]. Some hybrid methods, including structural optimization, combine multiple time series models or a single time series model with a single machine learning model. Some studies decouple the short-term wind power prediction problem into linear and nonlinear components, using time series models to handle the linear component and machine learning models to handle the nonlinear component [61]. Others use one time series model to determine the measurement and state equations of another time series model or use time series models to determine the structure of machine learning models. Application cases have shown that the prediction performance of such models is superior to that of individual time series models or machine learning models [138,139].

The other category focuses more on parameter optimization, building on input optimization or structural optimization. For example, studies based on the ANFIS model have used wavelet transform (WT) for input optimization and employed evolutionary particle swarm optimization for the parameters of ANFIS to improve prediction accuracy [140]. Input optimization based on data decomposition methods such as EEMD [123], singular spectrum analysis (SSA) [65,141], empirical wavelet transform (EWT) [142], and wavelet packet transform (WPT) [109] is combined with optimization algorithms like crow search optimization (CSO) [58], the firefly algorithm (FA) [65], the improved grey wolf optimizer (IGWO) [141], the whale optimization algorithm (WOA) [142], the cuckoo search algorithm (CSA) [109], and particle swarm optimization based on simulated annealing (PSOSA) [110] to optimize the parameters of prediction models. Additionally, some studies have proposed new parameter optimization algorithms to achieve better search results during the optimization process, thereby enhancing parameter optimization [143]. The recently proposed Kolmogorov–Arnold networks and liquid neural networks provide new methods for short-term wind power prediction, and one study was the first to investigate the emerging liquid neural network (LNN) to provide the necessary transparency in wind power prediction [144].

Hybrid method including error processing techniques

To achieve high accuracy in short-term wind power prediction, error correction based on input fusion optimization and model optimization has been proposed. These methods establish nonlinear relationships between prediction results and errors, creating wind speed correction models to enhance the generalization performance of prediction models and recover residual information lost during the learning process. For instance, the Ljung–Box Q-test is employed to finalize the error correction process [55]. A feedback error correction approach has been introduced, utilizing a ramp predictor and data from neighboring wind farms [66]. Additionally, error-corrected versions of some models are achieved through the Markov model [67], an improved hidden Markov model [145], or the deep belief network (DBN) [146].

Other hybrid methods incorporating error processing techniques focus on extracting specific feature information for error correction. For example, variational mode decomposition (VMD) is used to process noise in monitored wind speed sequences, and error correction is conducted by extracting high-frequency components of NWP wind speed [147]. An automatic correlation determination algorithm is employed to correct errors in numerical weather prediction and extract spatial correlations between wind speed sequences of neighboring wind farms to supplement input data [148]. Furthermore, some models use neural network models for detection and classification, followed by a similarity matching mechanism to correct predictions [149].
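The general idea of learning a relationship between forecasts and their errors, and then adding the predicted error back to the forecast, can be sketched as follows. A linear error model is used here purely for brevity, whereas the methods reviewed above rely on nonlinear models such as Markov models or deep belief networks; all function names are hypothetical.

```python
def fit_linear(xs, ys):
    """Ordinary least squares fit of ys = slope * xs + intercept."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx if sxx > 0 else 0.0
    return slope, my - slope * mx

def fit_error_model(forecasts, actuals):
    """Learn how the error (actual - forecast) depends on the forecast."""
    errors = [a - f for f, a in zip(forecasts, actuals)]
    return fit_linear(forecasts, errors)

def correct(forecast, model):
    """Add the predicted error to a new forecast."""
    slope, intercept = model
    return forecast + slope * forecast + intercept
```

In this sketch a constant bias or a proportional over-prediction in the historical forecasts is removed exactly, while more complex error structures would require the nonlinear correction models cited in the text.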

4.4. Other Features of Short-Term Wind Power Prediction

As with ultra-short-term forecasting, summarizing the different characteristics of short-term wind power prediction is essential, because these features strongly affect the selection of forecasting models and support model selection for specific application requirements. A comprehensive understanding of these characteristics is therefore necessary for conducting effective short-term forecasting.

In this section, the input data type, the evaluation metric used, and the spatial scale applied to short-term wind power prediction in each study mentioned before are summarized in Table 2.

5. Mid-Long-Term Wind Power Prediction

The accurate estimation of mid-long-term wind power generation is significant for the planning improvement, dispatch optimization, management development, and consumption enhancement of the power grid. These elements constitute the key factors in achieving power mutual aid and complementary dispatching on a large scale within renewable energy. However, due to the large time scale of medium-to-long-term predictions, low accuracy of weather forecasts, limited historical power generation data samples, and the significant differences between power generation prediction and short-term power prediction, short-term power prediction techniques cannot be directly replicated. Consequently, the industry has yet to establish effective mid-long-term wind power generation forecasting methods. In terms of model types, traditional statistical models alone cannot meet the accuracy requirements for mid-long-term wind power prediction. In response to these issues, some studies have proposed solutions from different perspectives.

Regarding mid-long-term wind power prediction, a survey of the existing major methods was conducted. Similarly to ultra-short-term and short-term predictions, the summary here is also provided from the perspectives of machine learning-based models and fusion forecasting methods.

5.1. Machine Learning-Based Model of Mid-Long-Term Wind Power Prediction

In the last few years, machine learning-based models have risen as a powerful tool for medium- and long-term wind energy forecasting. Traditional machine learning models, such as KNN, and deep learning models, such as ANNs, have been widely used for their simplicity and interpretability. These models are particularly effective in handling linear relationships and can provide a baseline for more complex models.

Medium- and long-term wind energy forecasting often utilizes machine learning models to map inputs to outputs. Studies frequently apply ANNs to correlate high-altitude and ground-measured wind speeds [150] and to map vectors of wind speed and environmental parameters to wind energy outputs [151]. KNN searches are also employed to find and weight similar wind variation processes for forecasting [152]. The choice of machine learning model for wind energy forecasting depends on various factors, including the availability of data, the complexity of the wind patterns, and the specific requirements of the forecasting task [153]. While deep learning models generally offer higher accuracy, they require more computational resources and data. Traditional models, on the other hand, are more interpretable and computationally efficient, making them suitable for applications with limited resources.

Generally, machine learning-based models have significantly advanced the realm of wind energy prediction, providing more accurate and reliable predictions. The persistent development and application of these models are crucial for integrating wind energy into the power grid efficiently and for aiding the transition to a sustainable energy future.

5.2. Hybrid Prediction Model of Mid-Long-Term Wind Power Prediction

Hybrid models that combine the strengths of both traditional and deep learning approaches have also been explored. These models aim to leverage the interpretability of traditional methods and the predictive power of deep learning.

5.2.1. Weighted Combination Prediction Method

Similarly to ultra-short-term and short-term wind energy prediction methods, weighted ensemble forecasting methods are also employed in medium- and long-term wind power predictions. This approach involves assigning different weights to individual forecasting models based on their historical performance, reliability, and the specific characteristics of the data they handle.

By combining the strengths of various models, the weighted combination method aims to mitigate the weaknesses of any single model and provide a more comprehensive and accurate forecast. For instance, some studies utilize various soft computing methods to separately predict wind speeds, thereby extracting features implicitly contained within the data. Subsequently, the prediction results from these methods are integrated to obtain the predicted wind speed at the target location [154]. Neural networks are also used for fitting residuals to predict non-stationary elements [129]. There are also methods that use networks such as the multilayer perceptron (MLP) to combine the outputs of two preceding Elman networks to generate the final prediction [155].

In short, the weighted combination prediction method represents a sophisticated and effective strategy for medium- and long-term wind energy forecasting. Its capacity to synthesize diverse forecasting techniques and adapt to changing conditions makes it a valuable tool. It aids in the pursuit of more accurate and dependable wind power predictions. These predictions are crucial for the sustainable and efficient operation of wind energy systems.

5.2.2. Fusion Combination Prediction Method

The fusion combination prediction method is the most widely used approach in medium-to-long-term wind power forecasting. We summarize and analyze it from three perspectives, namely hybrid methods including input optimization, model optimization, and error processing techniques.

Hybrid method including input optimization

Mid-long-term forecasting places greater emphasis on extracting features from raw data to mitigate the interference of non-stationarity on predictions. Numerous studies have based their medium-to-long-term wind power prediction on input optimization combined with various time series methods or machine learning techniques.

The data processing methods for mid-long-term wind power forecasting include wavelet transform decomposition [110], empirical mode decomposition [156,157], the copula function [158], and others. It is important to note that principal component analysis (PCA) is a commonly used method for finding high-dimensional features. This method involves performing PCA on a delay matrix composed of normalized continuous time series samples to obtain principal component feature vectors, which are then used as input data. Subsequently, soft computing or k-nearest neighbor analysis methods are employed to predict each principal component, and finally, inverse mapping and inverse normalization are used to obtain the predicted results for wind speed or wind power [159].
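The PCA procedure on a delay matrix described above can be sketched as follows. For brevity, only the leading principal component is extracted, using power iteration, and the series is assumed to be normalized beforehand; the function names are hypothetical, and the per-component prediction step (soft computing or KNN) is not shown.

```python
def delay_matrix(series, window):
    """Stack sliding windows of the series as rows of a delay matrix."""
    return [series[i:i + window] for i in range(len(series) - window + 1)]

def first_component(rows, iters=200):
    """Return (column means, leading principal direction) via power iteration."""
    n, d = len(rows), len(rows[0])
    mean = [sum(r[j] for r in rows) / n for j in range(d)]
    centered = [[r[j] - mean[j] for j in range(d)] for r in rows]
    cov = [[sum(c[a] * c[b] for c in centered) / n for b in range(d)]
           for a in range(d)]
    v = [1.0 / (j + 1) for j in range(d)]
    norm = sum(x * x for x in v) ** 0.5
    v = [x / norm for x in v]
    for _ in range(iters):
        w = [sum(cov[a][b] * v[b] for b in range(d)) for a in range(d)]
        norm = sum(x * x for x in w) ** 0.5
        if norm == 0.0:  # zero covariance: no direction to extract
            break
        v = [x / norm for x in w]
    return mean, v

def project(row, mean, v):
    """Principal component score of one delay vector."""
    return sum((row[j] - mean[j]) * v[j] for j in range(len(row)))

def reconstruct(score, mean, v):
    """Inverse mapping from a component score back to a delay vector."""
    return [mean[j] + score * v[j] for j in range(len(v))]
```

In a forecasting pipeline, the scores from `project` would be predicted forward in time and then passed through `reconstruct`, followed by inverse normalization, to recover wind speed or wind power values.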

For different principal component data of the dynamic wind energy process, different prediction models can be selected [160]. For example, some studies use PCA to select the principal components of multidimensional input data at the same moment, reducing the state space and improving the input structure of the prediction algorithm [161]. Other methods used for input optimization include rapid evaluation [143] and autoregressive analysis [162]. Rapid evaluation is used to determine the impact of input variables on prediction results, retaining those with significant impact and simplifying the input data structure. Autoregressive analysis can also extract the characteristics of data changes and determine the structure of the input vector. However, there are evident shortcomings in using the methods mentioned before to predict long-term wind speed or wind power. For instance, during forward iterative prediction or forward prediction across long time scales, the prediction error of time series accumulates and propagates forward, degrading the long-term prediction effect. If models are established by extracting knowledge from input data features, there may be issues such as one-sided feature extraction and inappropriate selection of high-dimensional kernel functions. These problems can be addressed through data fusion and the integration of methods.

Data fusion refers to the comprehensive use of prediction results from different time scales, including both long-term trend predictions and fine-scale predictions. It manifests in various forms. For instance, neural networks can be applied to identify weather patterns for the next year, followed by neural network predictions of detailed changes [163]. Alternatively, spectral analysis can be used to obtain pattern features of daily, monthly, and seasonal cycles, thereby decomposing the original wind speed through detrending and using Kalman filtering to predict the residual part. Combining the predictions of both parts achieves long-term wind speed forecasting [164]. Cluster analysis is also a common form with different classification methods. For example, through cluster analysis, the correlation between wind speed and weather factors such as temperature, air pressure, and humidity can be identified, and weather variables and wind speed can be clustered [165]. Some studies categorize inputs into three types based on wind speed magnitude [166,167], while others classify inputs based on vector similarity [168], and still others classify inputs based on weather patterns [169].

Hybrid method including model optimization

Model optimization is also very common in hybrid method models for medium-to-long-term wind power forecasting. Consistent with ultra-short-term and short-term forecasting, we will also categorize the hybrid method models involving model optimization into two types, namely structural optimization and parameter optimization.

In terms of structural optimization for medium-to-long-term wind power prediction, improvements to neural networks include combining neural network-based wind power prediction results with an ANFIS [170]. The deep conditional generative spatio-temporal (DCGST) approach attains high-precision predictions by tackling the non-stationarity of multiple wind power time series and modeling their spatio-temporal dependencies in detail [171]. Some methods fuse Gaussian processes with neural networks to predict the same feature, yielding a hybrid method for long-term wind speed or wind power prediction [172]. It is also possible to use two different methods to predict two different features and integrate them to obtain results that surpass predictions focused on a single feature. One example combines the autoregressive integrated moving average model with an artificial neural network, where the autoregressive model predicts the stationary component [139,173,174]. Introducing the attention mechanism is another way to optimize the model architecture [175].

Regarding parameter optimization in the hybrid method, including model optimization for medium-to-long-term wind power forecasting, this section builds on structural optimization and focuses on the selection of model parameters. Work in this area includes the PCA-MLP model optimized by iSSO for parameter tuning [176], the IWT-TDCNN model optimized by AFPSO [177], the DBNGA optimized by the genetic algorithm (GA) [178], and an RBF-MLP model with parameters optimized using the enhanced particle swarm optimization technique (EPSO) [179]. There are also models that distinguish between seasonal and daily variation trends, adjusting the prediction model parameters for daily variations based on seasonal trends, and achieving long-term forecasting through PSO, such as the SVR-ERNN model [180], as well as the first-order adaptive coefficient (FAC) and second-order adaptive coefficient (SAC) models [181].

Hybrid method including error processing techniques

When forecasting electricity generation for periods exceeding 24 h, NWP becomes more valuable due to the meteorological differential equations employed, which exhibit smaller systematic errors compared to prediction models that solely utilize wind information. However, the wind speeds predicted by NWP are often not directly usable and require correction.

In the context of medium-to-long-term wind power forecasting, hybrid method models involve error processing techniques from two primary perspectives. One approach focuses on predicting the deviation in wind power generation and using this deviation to correct the forecast results. For example, the WT-FA-FF-SVM model [182] employs wavelet transform to decompose observational data, utilizes a fuzzy adaptive resonance network to predict wind power, and then applies support vector regression to predict the deviation generated by the fuzzy adaptive resonance network, ultimately using the predicted deviation to correct the forecast outcomes. The other perspective involves correcting the wind speed results provided by NWP. For instance, models such as the GP-Cspeed model [126] and the ALL-CF model [183] use constrained Gaussian processes, Gaussian processes, and combined covariance functions to correct the wind speed results from NWP. Some studies propose a model for correcting NWP wind speeds using subsequence partitioning and DTW, aiming to reduce local drift errors [184]. Additionally, there is the Kalman ANN model [185], which couples the global forecast system (GFS) with the weather research and forecasting (WRF) system to eliminate systematic errors in predicting wind power using NWP.

5.3. Other Features of Mid-Long-Term Wind Power Prediction

Relative to ultra-short-term and short-term wind power prediction, mid-long-term wind power prediction must confront greater uncertainty in long-term meteorological prediction and, because of the longer time scale, a limited number of historical data samples. Prediction models therefore need to use the available historical data effectively and extract meaningful patterns from them. Mid-long-term wind power prediction serves strategic planning and management, whereas ultra-short-term and short-term predictions are more operationally oriented.

To investigate the characteristics of mid-long-term wind power prediction and compare them with ultra-short-term and short-term wind power prediction, this section summarizes the input data types, evaluation metrics used, and spatial scales applied in the studies mentioned earlier. These details are presented in Table 3.

6. Wind Ramp Event Prediction Methods

With the continuous growth of grid-connected wind power capacity, especially in wind power-intensive areas, the power grid faces increasingly severe challenges. Chief among them is the drastic fluctuation of wind power within a short period of time, defined as a wind power ramp event, which can be further subdivided into upward and downward ramp events according to the direction of change [186]. Since many traditional prediction methods struggle to capture ramp events effectively, this study provides a brief review of ramp prediction methods to give a more comprehensive overview of wind power prediction methods.
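One common way to make the definition above operational is to flag a ramp whenever the net power change over a fixed window exceeds a fraction of installed capacity. The sketch below takes that approach; the window length, the threshold, and the sample series are assumptions for illustration and are not taken from the cited work.

```python
def detect_ramps(power, capacity, window, threshold):
    """Flag windows whose net power change is large relative to capacity.

    Returns (start_index, direction, change_fraction) tuples, where
    change_fraction = (power[i + window] - power[i]) / capacity.
    The threshold is an assumed value, not one prescribed by the review.
    """
    events = []
    for i in range(len(power) - window):
        change = (power[i + window] - power[i]) / capacity
        if abs(change) >= threshold:
            direction = "up" if change > 0 else "down"
            events.append((i, direction, change))
    return events

# Hypothetical power series in MW for a 100 MW wind farm.
power_mw = [10, 12, 15, 40, 60, 62, 58, 30, 12, 11]
events = detect_ramps(power_mw, capacity=100.0, window=2, threshold=0.3)
for start, direction, change in events:
    print(start, direction, round(change, 2))
```

Overlapping windows can flag the same physical ramp more than once, so a practical detector would usually merge adjacent detections.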

Fluctuations in meteorological conditions are the core factors triggering wind power variations, so physical NWP models play a key role in predicting wind power ramp events. In addition, topographic differences give each wind farm different output power characteristics under the same meteorological conditions, so the rational use of historical wind farm data is crucial for improving the accuracy of ramp event prediction.

One study used the WPPT tool, combined with NWP data, historical wind speed and direction, and wind power data, to predict wind power and, based on these predictions, judged the likelihood of wind power ramp events in a specific future time period [187]. Analyzing one year of actual data, the authors explored how different threshold choices affect the prediction of ramp events with durations of 10 min and 1 h, and found that the probability of wind power ramp events in fall is relatively low.

Another study used a high-resolution (2 km) WRF model for wind prediction and combined it with a power curve model to predict ramp events at wind farms; however, the prediction results were not satisfactory [188]. This is mainly due to the mismatch between the height used for wind speed prediction (2 m) and the height at which the turbines actually operate (50 to 80 m), as well as the failure of the chosen piecewise linear power curve model to reflect the complex relationship between wind speed and wind power. The poor prediction performance is therefore caused not by the high-resolution NWP model itself but by the unsuitable choice of wind speed prediction values and power curve. In regions with dense wind farms or a large number of wind measurement points, the performance of ramp event prediction models can be improved by using data from neighboring measurement points. By combining weather variation data measured by dual Doppler radar with historical data, the prediction accuracy is significantly improved [189].

Compared with deterministic wind power prediction, the probabilistic prediction of wind power ramp events provides more detailed information to the power system, since its results give the probability distribution of the time at which ramp events occur [190]. To achieve this probabilistic forecasting goal, one study introduced the concept of wind power fluctuation intensity and estimated the probability distribution of the ramp event occurrence time using a quantile regression (QR) model, with the interval from neighboring ramp events as an input variable [191]. Comparing historical data with NWP data as inputs, the authors found that NWP data are more reliable for prediction.
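Quantile regression is fitted by minimizing the pinball (quantile) loss. The sketch below shows only that loss and the simplest possible model, a constant prediction, to illustrate how different quantile levels yield different estimates. It does not reproduce the model of [191]; the interval values are hypothetical.

```python
def pinball_loss(y_true, y_pred, tau):
    """Average pinball loss for quantile level tau in (0, 1)."""
    total = 0.0
    for y, q in zip(y_true, y_pred):
        diff = y - q
        # Under-prediction is weighted by tau, over-prediction by (1 - tau).
        total += tau * diff if diff >= 0 else (tau - 1.0) * diff
    return total / len(y_true)

def best_constant_quantile(values, tau):
    """Pick the sample value that minimizes the pinball loss.

    An intercept-only quantile regression: a simplification used here
    in place of a model with explanatory inputs.
    """
    return min(values,
               key=lambda q: pinball_loss(values, [q] * len(values), tau))

# Hypothetical intervals (hours) between neighboring ramp events.
intervals = [2.0, 3.0, 3.5, 5.0, 6.0, 8.0, 12.0, 20.0, 30.0]

q50 = best_constant_quantile(intervals, 0.5)
q75 = best_constant_quantile(intervals, 0.75)
print(q50, q75)
```

Evaluating several quantile levels in this way gives a discrete approximation of the predictive distribution rather than a single point forecast.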

In addition, studies comparing combinations of multiple NWP models with a single NWP model in the probabilistic prediction of ramp events show that the multi-model combination performs significantly better than a single model [192]. In actual power system operation, operational decisions are often based on the severity of a ramp event rather than the precise ramp magnitude. Therefore, categorizing and predicting ramp events by severity has become a new research direction. Some studies have classified ramp events into four levels according to the size and direction of the ramp magnitude and established an SVM model to directly predict the types of ramp events that may occur in the future. The results show that the accuracy of this method for predicting the types of ramp events in the next 6 h is close to 90% [193]. Another study presents a novel approach for identifying and quantifying wind ramp characteristics [194]. It highlights the role of feature extraction in improving prediction accuracy, particularly in short-term forecasting, and underscores the need for tailored methodologies beyond traditional power prediction models.

7. Case Study

To give a more intuitive picture of the effectiveness of wind power prediction, this review provides a case study of ultra-short-term wind power prediction, constructing prediction models from the perspectives of traditional statistical models, machine learning models, and hybrid models. The ARIMA model, LSTM model, and CNN-LSTM model are chosen to represent these three categories, respectively. The case study uses data from a real wind farm in Xinjiang, China. The data are the actual wind power data and NWP data of the wind farm in January 2019, with a step size of 15 min. NWP data include wind speed and wind direction at 10 m, 30 m, 50 m, and 70 m/hub of the wind tower, as well as temperature, humidity, and barometric pressure. The first 80% of the dataset is the training set and the last 20% is the test set, and the models are tested both with and without the NWP data as input.
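The data preparation described above can be sketched as a chronological split followed by sliding-window sample construction. The sketch assumes a complete 31-day series at 15 min resolution and uses placeholder values. The number of lagged inputs (`n_lags`) and the one-step horizon are assumptions, since the case study does not state them.

```python
def chronological_split(series, train_fraction=0.8):
    """Split a time series without shuffling: first part trains, rest tests."""
    cut = int(len(series) * train_fraction)
    return series[:cut], series[cut:]

def make_windows(series, n_lags, horizon=1):
    """Build (inputs, target) pairs from a univariate series.

    Each sample uses the n_lags most recent values to predict the value
    `horizon` steps ahead. Both parameters are assumed for illustration.
    """
    samples = []
    for t in range(n_lags, len(series) - horizon + 1):
        samples.append((series[t - n_lags:t], series[t + horizon - 1]))
    return samples

steps_per_day = 24 * 60 // 15          # 96 points per day at 15 min steps
n_points = 31 * steps_per_day          # 2976 points for a full January
series = [float(i) for i in range(n_points)]   # placeholder power values

train, test = chronological_split(series)
windows = make_windows(train, n_lags=4, horizon=1)
print(len(train), len(test), len(windows))
```

Splitting by time rather than at random keeps future observations out of the training set, which matters for any forecasting evaluation.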

7.1. Result of Ultra-Short-Term Wind Power Prediction Considering Wind Power Information

Considering only the wind power historical data, the ultra-short-term prediction of wind power is made based on the ARIMA model, LSTM model, and CNN-LSTM model, respectively, and the results are shown in Figure 11, Figure 12 and Figure 13. The MAE, MSE, RMSE, and MAPE of the three models are shown in Table 4.
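For reference, the four metrics can be computed as in the minimal sketch below, which uses their standard definitions. The sample values are hypothetical and are not taken from Table 4; they are chosen so that two low-power points dominate the MAPE even though their absolute errors are small.

```python
import math

def mae(actual, predicted):
    """Mean absolute error."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)

def mse(actual, predicted):
    """Mean squared error."""
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)

def rmse(actual, predicted):
    """Root mean squared error."""
    return math.sqrt(mse(actual, predicted))

def mape(actual, predicted):
    """Mean absolute percentage error in percent.

    Undefined when an actual value is exactly zero, and unstable when
    actual values are close to zero.
    """
    ratios = [abs((a - p) / a) for a, p in zip(actual, predicted)]
    return 100.0 * sum(ratios) / len(actual)

# Hypothetical wind power values in MW.
actual_mw = [50.0, 40.0, 2.0, 1.0]
predicted_mw = [48.0, 43.0, 3.0, 2.0]

scores = {
    "MAE": mae(actual_mw, predicted_mw),
    "MSE": mse(actual_mw, predicted_mw),
    "RMSE": rmse(actual_mw, predicted_mw),
    "MAPE": mape(actual_mw, predicted_mw),
}
print(scores)
```

In this example the last two points contribute errors of only 1 MW each, yet they account for most of the MAPE because the actual power is small.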

The comparison of Figure 11, Figure 12 and Figure 13 shows that the ARIMA-based ultra-short-term prediction is better suited to cases of sudden change in wind power, and its MAE, MSE, and RMSE values are lower. Its MAPE is higher because the ARIMA-based prediction has larger errors in regions where the wind power is small. The CNN-LSTM model considering only wind power historical data performs better than the LSTM model in ultra-short-term prediction, both under sudden wind power changes and in steady state: the MAE, MSE, RMSE, and MAPE of the former are all lower than those of the latter.

7.2. Result of Ultra-Short-Term Wind Power Prediction Considering Wind Power Information and NWP Data

Considering the wind power historical data and NWP data, the ultra-short-term prediction of wind power is performed based on the LSTM model and CNN-LSTM model, respectively, and the results are shown in Figure 14 and Figure 15. The MAE, MSE, RMSE, and MAPE of the two models are shown in Table 5.

The CNN-LSTM model considering both wind power historical data and NWP data performs better than the LSTM model in ultra-short-term prediction, both under sudden wind power changes and in steady state: the MAE, MSE, RMSE, and MAPE of the former are all lower than those of the latter.

In the case of sudden changes in wind power, the LSTM and CNN-LSTM models that take NWP data into account perform better than the aforementioned models that only consider historical wind power data. This is because NWP data carry information about future weather changes, which alerts the model to possible sudden changes in wind power. However, in ultra-short-term forecasting, the insufficient quality of NWP data, their lagging nature, and the presence of invalid information may make a model that uses NWP data perform worse than one that uses only historical wind power data. In conclusion, the CNN-LSTM model is the most effective in predicting wind power, followed by the LSTM model and the ARIMA model. This reflects the superiority of hybrid models in predicting wind power; machine learning models are also able to fulfill the task of ultra-short-term wind power prediction, whereas traditional statistical models have limited accuracy.

Comparing the effects of different kinds of input data on prediction performance shows that including NWP data is beneficial for wind power prediction, and that extracting information from NWP helps neural networks learn the data better and achieve better predictions.

8. Discussion and Prospects

8.1. Novelty and Key Contributions

Existing reviews do not start from engineering applications, and they leave unclear which wind power prediction methods can be used in real-world applications that require quick retrieval of forecasts based on temporal and spatial scales. They also leave unclear which types of data are needed for prediction, how to assess the effectiveness of wind power prediction, and how to quickly obtain wind power prediction scenarios. This work uses a narrative method to synthesize and discuss wind power prediction methods. First, wind power prediction is classified and reviewed from the viewpoints of time scale, spatial scale, input data, and model characteristics. Then, wind power prediction methods are introduced by time scale (ultra-short term, short term, and mid-long term) and subdivided according to model characteristics in each subsection. The underlying principles of the models and related work are explained. The time scale, spatial scale, input data, and evaluation metrics of each study mentioned in this work are also summarized. In addition, this review considers wind ramp events and provides a brief overview of their prediction methods. Finally, this review provides a case study of ultra-short-term wind power prediction to give a more intuitive picture of prediction effectiveness. Aimed at the engineering application of wind power forecasting technology, this work summarizes wind power prediction methods across different time scales, categorizes them by model characteristics within each time scale, and covers deep learning methods that have emerged in recent years. It is expected that this review will help practitioners facing diverse wind power prediction tasks identify suitable forecasting methods.

8.2. Future Research and Prospects

A comparative analysis across different time scales is conducted for wind power prediction, summarizing and analyzing the characteristics of ultra-short-term, short-term, and mid-long-term wind power prediction, respectively. Additionally, a longitudinal comparison is made for multi-time scale wind power prediction to analyze the commonalities and differences among wind power prediction at different time scales.

Ultra-short-term wind power prediction relies heavily on high-frequency, real-time data from wind turbines and meteorological stations. It often requires models that can quickly process and analyze short-term fluctuations and capture the immediate dynamics of wind power generation, for which a substantial amount of recent data is often required. It is primarily used for real-time regulation and immediate operational adjustments, such as load balancing and grid stability. Short-term wind power prediction has a longer time horizon than ultra-short-term prediction. Its models often need to handle more complex spatiotemporal dependencies and may involve hybrid deep learning models. It is mainly used for planning the generation schedule of wind turbines, allowing for more strategic adjustments over a longer period. For mid-long-term wind power prediction, historical data samples are often limited because of the extended time horizon, so models must effectively utilize the available historical data and NWP to capture long-term trends. Mid-long-term prediction targets strategic planning, including long-term maintenance scheduling, power market bidding, and grid assessment.

Traditional statistical models, machine learning-based models, and hybrid models are all capable of performing ultra-short-term wind power prediction. Nevertheless, with the development of technological research, the number of machine learning-based models and hybrid models has gradually increased and now constitutes a significant proportion of current ultra-short-term wind power prediction efforts. Hybrid models used for ultra-short-term wind power prediction focus on input optimization and model optimization. Decomposing the input data into signals and extracting features, as well as optimizing the prediction model parameters using optimization algorithms, remain the development trend and future direction for ultra-short-term wind power prediction. For short-term wind power prediction, research on machine learning-based models and hybrid models is significantly more prevalent than that on traditional statistical models. In addition to input optimization and model optimization, hybrid models incorporating error processing techniques have developed rapidly and shown notable prediction performance, becoming a current research hotspot for short-term wind power prediction. Regarding mid-long-term wind power prediction, owing to the extended time horizon and the limitations of long-term weather prediction, traditional statistical methods fail to meet the required prediction accuracy. Machine learning-based models and hybrid models are the primary solutions. Among machine learning-based models, deep learning methods represent an important direction for exploration. Input optimization and model optimization are crucial for hybrid models. Moreover, to achieve higher prediction accuracy, correction methods are often employed.

The nature of input data type in ultra-short-term wind power prediction is determined by the short time scale and high dependency on historical data. As a result, predictions containing only wind information and those containing both wind information and NWP are equally prevalent. However, the current trend in prediction work is still to introduce NWP to extract wind power-related factors and improve prediction accuracy. For short-term wind power prediction, in addition to real-time data, it also incorporates historical data and NWP to forecast over a longer horizon. There are numerous studies in this area, and it is currently the mainstream research direction. For mid-long-term wind power prediction, due to the long time scale and limited historical data samples, it is generally necessary to fully utilize existing data for feature extraction. To ensure accuracy, effective NWP is often included in the input data type. However, there are also relevant studies on mid-long-term wind power prediction using only wind information in the absence of NWP.

Evaluation metrics exhibit a similar pattern across different time scales of wind power prediction. The MAE and RMSE are commonly used due to their simplicity and effectiveness in capturing overall prediction accuracy. The MAPE offers additional perspectives on relative performance and model fit. The MAPE is particularly useful for comparing the performance of models across different scales of wind power generation. However, it can be less reliable when actual values are close to zero. Metrics such as the NMAE (normalized mean absolute error) and NRMSE (normalized root mean squared error) therefore normalize the MAE or RMSE by the range of the actual values, allowing for a more standardized comparison across different datasets and prediction horizons.
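A minimal sketch of the two normalized metrics follows, using normalization by the range of the actual values as described above. Some studies normalize by installed capacity instead, so the divisor should be treated as a convention to be stated alongside the result. The sample values are hypothetical.

```python
import math

def _value_range(actual):
    """Range of the actual values, used here as the normalizing constant."""
    span = max(actual) - min(actual)
    if span == 0:
        raise ValueError("actual values have zero range; cannot normalize")
    return span

def nmae(actual, predicted):
    """MAE divided by the range of the actual values."""
    mean_abs = sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)
    return mean_abs / _value_range(actual)

def nrmse(actual, predicted):
    """RMSE divided by the range of the actual values."""
    mean_sq = sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)
    return math.sqrt(mean_sq) / _value_range(actual)

# Hypothetical small wind farm (MW) and the same profile scaled tenfold.
actual_small = [50.0, 40.0, 2.0, 1.0]
pred_small = [48.0, 43.0, 3.0, 2.0]
actual_large = [10.0 * a for a in actual_small]
pred_large = [10.0 * p for p in pred_small]

nmae_small = nmae(actual_small, pred_small)
nmae_large = nmae(actual_large, pred_large)
nrmse_small = nrmse(actual_small, pred_small)
print(nmae_small, nmae_large, nrmse_small)
```

The two farms receive the same NMAE although their absolute errors differ by a factor of ten, which is the property that makes the normalized metrics comparable across datasets.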

In terms of spatial scale, the primary focus of research for ultra-short-term and short-term wind power prediction remains on individual wind farms. However, wind power prediction for wind farm regions has become a critical area of research due to the increasing integration of wind energy into power grids. Grid operations in wind farm regions, immediate operational adjustments, and short-term maintenance planning have posed new requirements for wind power prediction at these time scales. Ultra-short-term and short-term wind power prediction for wind farm regions are expected to be the future development trends. In mid-long-term wind power prediction, research related to wind farm regions is more extensive, but it also places higher demands on the correlation analysis and feature classification of different wind farms.

In short, ultra-short-term wind power prediction focuses on immediate operational adjustments and relies on high-frequency data and quick-adapting models, while short-term wind power prediction targets longer-term planning and requires more complex models to handle extended time horizons and seasonal variations. Mid-long-term wind power prediction focuses on strategic planning and long-term operational adjustments, requiring models that can handle extended time horizons and limited data availability. Hybrid models that can accomplish specific prediction tasks within designated time scales remain a major direction in wind power prediction research. Future research directions can be summarized as follows:

Adaptive hybrid frameworks: While current hybrid models demonstrate promise, their generalization across diverse geographies and turbine technologies requires deeper investigation. Future work should focus on self-adaptive architectures capable of autonomously adjusting model weights in response to shifting environmental regimes.

Edge computing integration: The latency-critical nature of ultra-short-term forecasting necessitates embedded AI solutions for real-time edge computation, reducing reliance on centralized cloud infrastructures.

Uncertainty quantification: Probabilistic forecasting methods must evolve beyond Gaussian assumptions to capture tail risks associated with extreme ramp events, particularly under climate volatility.

9. Conclusions

The global pursuit of carbon neutrality has catalyzed unprecedented advancements in renewable energy systems, with wind energy emerging as a cornerstone of sustainable power generation. As wind energy penetration continues to rise, the imperative for high-precision and rapid-response wind power prediction has become a critical enabler for grid stability, operational efficiency, and economic viability. This review systematically synthesizes the methodological evolution in wind power forecasting, emphasizing the interplay between prediction time horizons (ultra-short term to mid-long term) and their associated modeling paradigms. By delineating the classification frameworks for wind speed/power characterization and dissecting model architectures across temporal scales, this work establishes a cohesive taxonomy that bridges theoretical innovations and practical engineering demands.

A central contribution lies in elucidating the hierarchical relationship between prediction objectives and methodological suitability. For instance, ultra-short-term models prioritize dynamic responsiveness to mitigate intra-hour variability, whereas mid-term forecasts demand robust handling of seasonal and meteorological patterns. The analysis further reveals that hybrid models are increasingly pivotal in addressing the spatiotemporal complexity of wind resource dynamics.

This review not only consolidates the state of the art but also charts a pathway for transcending current limitations, advocating for prediction systems that are simultaneously accurate, interpretable, and institutionally embedded within the energy transition ecosystem.

Author Contributions

Conceptualization, F.L. and H.W.; methodology, H.W.; software, F.L.; resources, D.W.; validation, D.L.; investigation, K.S.; writing—original draft preparation, F.L.; writing—review and editing, H.W. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the Science and Technology Project of the Headquarters of State Grid Corporation of China, grant number 1400-202456361A-3-1-DG.

Data Availability Statement

No new data were created or analyzed in this study.

Conflicts of Interest

Authors Fan Li, Dan Wang, Dong Liu and Ke Sun were employed by the State Grid Economic and Technological Research Institute Co., Ltd. The remaining author declares that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

References

China Electricity Council Releases “Analysis and Forecast Report on the National Electricity Supply and Demand Situation in the First Half of 2024”. Available online: https://www.cec.org.cn/websitefzy/detail/index.html?3-335294 (accessed on 8 February 2025).
National Energy Administration Releases National Electric Power Industry Statistics for January–September 2024. Available online: https://www.gov.cn/lianbo/bumen/202410/content_6981841.htm (accessed on 8 February 2025).
Vargas, S.A.; Telles Esteves, G.R.; Maçaira, P.M.; Bastos, B.Q.; Cyrino Oliveira, F.L.; Souza, R.C. Wind power generation: A review and a research agenda. J. Clean. Prod. 2019, 218, 850–870.
Fang, J.; Liu, C. Artificial intelligence techniques for stability analysis in modern power systems. iEnergy 2024, 3, 194–215.
Qin, B.; Li, H.; Wang, Z.; Jiang, Y.; Lu, D.; Du, X.; Qian, Q. New framework of low-carbon city development of China: Underground space based integrated energy systems. Undergr. Space 2024, 14, 300–318.
Su, T.; Zhao, J.; Gomez-Exposito, A.; Chen, Y.; Terzija, V.; Gentle, J.P. Grid-enhancing technologies for clean energy systems. Nat. Rev. Clean. Technol. 2025, 1, 16–31.
Qin, B.; Wang, H.; Liao, Y.; Li, H.; Ding, T.; Wang, Z.; Li, F.; Liu, D. Challenges and opportunities for long-distance renewable energy transmission in China. Sustain. Energy Technol. Assess. 2024, 69, 103925.
Cai, Y.; Bréon, F. Wind power potential and intermittency issues in the context of climate change. Energy Convers. Manag. 2021, 240, 114276.
Qin, B.; Wang, H.; Li, F.; Liu, D.; Liao, Y.; Li, H. Towards zero carbon hydrogen: Co-production of photovoltaic electrolysis and natural gas reforming with CCS. Int. J. Hydrogen Energy 2024, 78, 604–609.
Islam, M.M.; Yu, T.; Giannoccaro, G.; Mi, Y.; la Scala, M.; Nasab, M.R.; Wang, J. Improving Reliability and Stability of the Power Systems: A Comprehensive Review on the Role of Energy Storage Systems to Enhance Flexibility. IEEE Access 2024, 12, 152738–152765.
Qin, B.; Wang, H.; Li, W.; Li, F.; Wang, W.; Ding, T. Aperiodic Coordination Scheduling of Multiple PPLs in Shipboard Integrated Power Systems. IEEE Trans. Intell. Transp. Syst. 2024, 25, 14844–14854.
Li, X.; Tian, Z.; Wu, X.; Feng, W.; Niu, J. Optimal planning for hybrid renewable energy systems under limited information based on uncertainty quantification. Renew. Energy 2024, 237, 121866.
Li, H.; Yu, H.; Liu, Z.; Li, F.; Wu, X.; Cao, B.; Zhang, C.; Liu, D. Long-term scenario generation of renewable energy generation using attention-based conditional generative adversarial networks. Energy Convers. Econ. 2024, 5, 15–27.
Li, H.; Qin, B.; Wang, S.; Ding, T.; Liu, J.; Wang, H. Aggregate power flexibility of multi-energy systems supported by dynamic networks. Appl. Energy 2025, 377, 124565.
Jung, C. Recent Development and Future Perspective of Wind Power Generation. Energies 2024, 17, 5391.
Li, H.; Qin, B.; Wang, S.; Ding, T.; Wang, H. Data-driven two-stage scheduling of multi-energy systems for operational flexibility enhancement. Int. J. Electr. Power Energy Syst. 2024, 162, 110230.
Yu, D.; Gao, S.; Han, H.; Zhao, X.; Wu, C.; Liu, Y.; Song, T.E. Intraday two-stage hierarchical optimal scheduling model for multiarea AC/DC systems with wind power integration. Appl. Energy 2024, 364, 123079.
Qin, B.; Wang, H.; Liao, Y.; Liu, D.; Wang, Z.; Li, F. Liquid hydrogen superconducting transmission based super energy pipeline for Pacific Rim in the context of global energy sustainable development. Int. J. Hydrogen Energy 2024, 56, 1391–1396.
Leon, J.I.; Dominguez, E.; Wu, L.; Marquez, A.; Reyes, M.; Liu, J. Hybrid Energy Storage Systems: Concepts, Advantages, and Applications. IEEE Ind. Electron. Mag. 2021, 15, 74–88.
Wang, H.; Qin, B.; Hong, S.; Cai, Q.; Li, F.; Ding, T.; Li, H. Optimal planning of hybrid hydrogen and battery energy storage for resilience enhancement using bi-layer decomposition algorithm. J. Energy Storage 2025, 110, 115367.
Wu, N.; Wang, Z.; Li, X.; Lei, L.; Qiao, Y.; Linghu, J.; Huang, J. Research on real-time coordinated optimization scheduling control strategy with supply-side flexibility in multi-microgrid energy systems. Renew. Energy 2025, 238, 121976.
Zhang, Z.; Qin, B.; Gao, X.; Ding, T.; Zhang, Y.; Wang, H. SE-CNN based emergency control coordination strategy against voltage instability in multi-infeed hybrid AC/DC systems. Int. J. Electr. Power Energy Syst. 2024, 160, 110082.
Hanifi, S.; Liu, X.; Lin, Z.; Lotfian, S. A Critical Review of Wind Power Forecasting Methods—Past, Present and Future. Energies 2020, 13, 3764.
Colak, I.; Sagiroglu, S.; Yesilbudak, M. Data mining and wind power prediction: A literature review. Renew. Energy 2012, 46, 241–247.
Okumus, I.; Dinler, A. Current status of wind energy forecasting and a hybrid method for hourly predictions. Energy Convers. Manag. 2016, 123, 362–371.
Tsai, W.-C.; Hong, C.-M.; Tu, C.-S.; Lin, W.-M.; Chen, C.-H. A Review of Modern Wind Power Generation Forecasting Technologies. Sustainability 2023, 15, 10757.
Kusiak, A.; Zhang, Z.; Verma, A. Prediction, operations, and condition monitoring in wind energy. Energy 2013, 60, 1–12.
Foley, A.M.; Leahy, P.G.; Marvuglia, A.; McKeogh, E.J. Current methods and advances in forecasting of wind power generation. Renew. Energy 2012, 37, 1–8.
Jung, J.; Broadwater, R.P. Current status and future advances for wind speed and power forecasting. Renew. Sustain. Energy Rev. 2014, 31, 762–777.
Wang, J.; Song, Y.; Liu, F.; Hou, R. Analysis and application of forecasting models in wind power integration: A review of multi-step-ahead wind speed forecasting models. Renew. Sustain. Energy Rev. 2016, 60, 960–981.
Wang, Y.; Zou, R.; Liu, F.; Zhang, L.; Liu, Q. A review of wind speed and wind power forecasting with deep neural networks. Appl. Energy 2021, 304, 117766.
Valdivia-Bautista, S.M.; Domínguez-Navarro, J.A.; Pérez-Cisneros, M.; Vega-Gómez, C.J.; Castillo-Téllez, B. Artificial Intelligence in Wind Speed Forecasting: A Review. Energies 2023, 16, 2457.
Wu, Z.; Luo, G.; Yang, Z.; Guo, Y.; Li, K.; Xue, Y. A comprehensive review on deep learning approaches in wind forecasting applications. CAAI Trans. Intell. 2021, 7, 129–143.
Santhosh, M.; Venkaiah, C.; Kumar, D.M.V. Current advances and approaches in wind speed and wind power forecasting for improved renewable energy integration: A review. Eng. Rep. 2020, 2, e12178.
Mohammadi, K.; Shamshirband, S.; Yee, P.L.; Petkovic, D.; Zamani, M.; Ch, S. Predicting the wind power density based upon extreme learning machine. Energy 2015, 86, 232–239.
Jabr, R.A. Adjustable robust OPF with renewable energy sources. IEEE Trans. Power Syst. 2013, 28, 4742–4751.
Kristoffersen, J.R.; Christiansen, P. Horns rev offshore windfarm: Its main controller and remote control system. Wind Eng. 2003, 27, 351–359.
Bessa, R.J.; Matos, M.A.; Costa, I.C.; Bremermann, L.; Franchin, I.G.; Pestana, R.; Machado, N.; Waldl, H.P.; Wichmann, C. Reserve setting and steady-state security assessment using wind power uncertainty forecast: A case study. IEEE Trans. Sustain. Energy 2012, 3, 827–836.
Pinson, P.; Chevallier, C.; Kariniotakis, G.N. Trading wind generation from short-term probabilistic forecasts of wind power. IEEE Trans. Power Syst. 2007, 22, 1148–1156.
Zhang, H.; Liu, Y.; Yan, J.; Han, S.; Li, L.; Long, Q. Improved deep mixture density network for regional wind power probabilistic forecasting. IEEE Trans. Power Syst. 2020, 35, 2549–2560.
Gilbert, C.; Browell, J.; Mcmillan, D. Leveraging turbine-level data for improved probabilistic wind power forecasting. IEEE Trans. Sustain. Energy 2020, 11, 1152–1160.
Khodayar, M.; Wang, J. Spatio-temporal graph deep neural network for short-term wind speed forecasting. IEEE Trans. Sustain. Energy 2019, 10, 670–681.
Andrade, J.R.; Bessa, R.J. Improving renewable energy forecasting with a grid of numerical weather predictions. IEEE Trans. Sustain. Energy 2017, 8, 1571–1580.
Men, Z.; Yee, E.; Lien, F.S.; Wen, D.; Chen, Y. Short-term wind speed and power forecasting using an ensemble of mixture density neural networks. Renew. Energy 2016, 87, 203–211.
Wang, Y.; Liu, Y.; Li, L.; Infield, D.; Han, S. Short-term wind power forecasting based on clustering pre-calculated CFD method. Energies 2018, 11, 854.
Alexiadis, M.C.; Dikopoulos, P.S.; Sahsamanoglou, H.S.; Manousaridis, I.M. Short term forecasting of wind speed and related electrical power. Sol. Energy 1998, 63, 61–68.
Erdem, E.; Shi, J. ARMA based approaches for forecasting the tuple of wind speed and direction. Appl. Energy 2011, 88, 1405–1414.
Croonenbroeck, C.; Dahl, C.M. Accurate medium-term wind power forecasting in a censored classification framework. Energy 2014, 73, 221–232.
Hu, Q.; Zhang, S.; Xie, Z.; Mi, J.; Wan, J. Noise model based ν-support vector regression with its application to short-term wind speed forecasting. Neural Netw. 2014, 57, 1–11.
Wu, W.; Peng, M. A data mining approach combining k-means clustering with bagging neural network for short-term wind power forecasting. IEEE Internet Things J. 2017, 4, 979–986.
Zheng, D.; Eseye, A.T.; Zhang, J.; Li, H. Short-term wind power forecasting using a double-stage hierarchical ANFIS approach for energy management in microgrids. Prot. Control Mod. Power Syst. 2017, 2, 136–145.
Feng, C.; Cui, M.; Hodge, B.M.; Zhang, J. A data-driven multi-model methodology with deep feature selection for short-term wind forecasting. Appl. Energy 2017, 190, 1245–1257.
Wang, H.; Li, G.; Wang, G.; Peng, J.; Jiang, H.; Liu, Y. Deep learning based ensemble approach for probabilistic wind power forecasting. Appl. Energy 2017, 188, 56–70.
Huang, C.-J.; Kuo, P.-H. A Short-Term Wind Speed Forecasting Model by Using Artificial Neural Networks with Stochastic Optimization for Renewable Energy Systems. Energies 2018, 11, 2777.
Yang, R.; Liu, H.; Nikitas, N.; Duan, Z.; Li, Y.; Li, Y. Short-term wind speed forecasting using deep reinforcement learning with improved multiple error correction approach. Energy 2022, 239, 122128.
Olaofe, Z.O. A 5-day wind speed, power forecasts using a layer recurrent neural network (LRNN). Sustain. Energy Technol. Assess. 2014, 6, 1–24.
Gu, B.; Zhang, T.; Meng, H.; Zhang, J. Short-term forecasting and uncertainty analysis of wind power based on long short-term memory, cloud model and non-parametric kernel density estimation. Renew. Energy 2021, 164, 687–708.
Meng, A.; Chen, S.; Ou, Z.; Ding, W.; Zhou, H.; Fan, J.; Yin, H. A hybrid deep learning architecture for wind power prediction based on bi-attention mechanism and crisscross optimization. Energy 2022, 238, 121795. [Google Scholar] [CrossRef]
Tascikaraoglu, A.; Uzunoglu, M. A review of combined approaches for prediction of short-term wind speed and power. Renew. Sustain. Energy Rev. 2014, 34, 243–254. [Google Scholar] [CrossRef]
Wang, J.; Hu, J.; Ma, K.; Zhang, Y. A self-adaptive hybrid approach for wind speed forecasting. Renew. Energy 2015, 78, 374–385. [Google Scholar] [CrossRef]
Shi, J.; Guo, J.; Zheng, S. Evaluation of hybrid forecasting approaches for wind speed and power generation time series. Renew. Sustain. Energy Rev. 2012, 16, 3471–3480. [Google Scholar] [CrossRef]
Catalão, J.P.S.; Pousinho, H.M.I.; Mendes, V.M.F. Hybrid wavelet-PSO-ANFIS approach for short-term wind power forecasting in Portugal. IEEE Trans. Sustain. Energy 2011, 2, 50–59. [Google Scholar] [CrossRef]
Lu, P.; Ye, L.; Sun, B.; Zhang, C.; Zhao, Y.; Zhu, T. A new hybrid prediction method of ultra-short-term wind power forecasting based on EEMD-PE and LSSVM optimized by the GSA. Energies 2018, 11, 697. [Google Scholar] [CrossRef]
Vladislavleva, E.; Friedrich, T.; Neumann, F.; Wagner, M. Predicting the energy output of wind farms based on weather data: Important variables and their correlation. Renew. Energy 2013, 50, 236–243. [Google Scholar] [CrossRef]
Gao, Y.; Qu, C.; Zhang, K. A hybrid method based on singular spectrum analysis, firefly algorithm, and BP neural network for short-term wind speed forecasting. Energies 2016, 9, 757. [Google Scholar] [CrossRef]
Keerthisinghe, C.; Silva, A.R.; Tardáguila, P.; Horváth, G.; Deng, A.; Theis, T.N. Improved Short-Term Wind Power Forecasts: Low-Latency Feedback Error Correction Using Ramp Prediction and Data from Nearby Farms. IEEE Access 2023, 11, 128697–128705. [Google Scholar] [CrossRef]
Wang, Y.; Wang, J.; Wei, X. A hybrid wind speed forecasting model based on phase space reconstruction theory and Markov model: A case study of wind farms in northwest China. Energy 2015, 91, 556–572. [Google Scholar] [CrossRef]
Taloba, A.I.; Abd El-Aziz, R.M.; Alshanbari, H.M.; El-Bagoury, A.A.H. Estimation and prediction of hospitalization and medical care costs using regression in machine learning. J. Healthc. Eng. 2022, 2022, 7969220. [Google Scholar] [CrossRef] [PubMed]
Dong, Y.; Ma, S.; Zhang, H.; Yang, G. Wind power prediction based on multi-class autoregressive moving average model with logistic function. J. Mod. Power Syst. Clean. Energy 2022, 10, 1184–1193. [Google Scholar] [CrossRef]
Al-Duais, F.S.; Al-Sharpi, R.S. A unique Markov chain Monte Carlo method for forecasting wind power utilizing time series model. Alex. Eng. J. 2023, 74, 51–63. [Google Scholar] [CrossRef]
Cadenas, E.; Jaramillo, O.A.; Rivera, W. Analysis and forecasting of wind velocity in Chetumal, Quintana Roo, using the single exponential smoothing method. Renew. Energy 2010, 35, 925–930. [Google Scholar] [CrossRef]
Jiang, Y.; Song, Z.; Kusiak, A. Very short-term wind speed forecasting with Bayesian structural break model. Renew. Energy 2013, 50, 637–647. [Google Scholar] [CrossRef]
He, M.; Yang, L.; Zhang, J.; Vittal, V. A spatio-temporal analysis approach for short-term forecast of wind farm generation. IEEE Trans. Power Syst. 2014, 29, 1611–1622. [Google Scholar] [CrossRef]
Song, Z.; Jiang, Y.; Zhang, Z. Short-term wind speed forecasting with Markov-switching model. Appl. Energy 2014, 130, 103–112. [Google Scholar] [CrossRef]
Park, S.; Jung, S.; Lee, J.; Hur, J. A short-term forecasting of wind power outputs based on gradient boosting regression tree algorithms. Energies 2023, 16, 1132. [Google Scholar] [CrossRef]
Lu, P.; Ye, L.; Zhong, W.Z.; Qu, Y.; Zhai, B.X.; Tang, Y.; Zhao, Y.N. A novel spatio-temporal wind power forecasting framework based on multi-output support vector machine and optimization strategy. J. Clean. Prod. 2020, 254, 119993. [Google Scholar] [CrossRef]
Tu, C.S.; Hong, C.M.; Huang, H.S.; Chen, C.H. Short term wind power prediction based on data regression and enhanced support vector machine. Energies 2020, 13, 6319. [Google Scholar] [CrossRef]
Yuan, D.; Li, M.; Li, H.; Lin, C.; Ji, B. Wind power prediction method: Support vector regression optimized by improved jellyfish search algorithm. Energies 2022, 15, 6404. [Google Scholar] [CrossRef]
Wang, X.; Li, J.; Shao, L.; Liu, H.; Ren, L.; Zhu, L. Short-term wind power prediction by an extreme learning machine based on an improved hunter–prey optimization algorithm. Sustainability 2023, 15, 991. [Google Scholar] [CrossRef]
An, G.; Jiang, Z.; Chen, L.; Cao, X.; Li, Z.; Zhao, Y.; Sun, H. Ultra short-term wind power forecasting based on sparrow search algorithm optimization deep extreme learning machine. Sustainability 2021, 13, 10453. [Google Scholar] [CrossRef]
Niu, D.; Pu, D.; Dai, S. Ultra-short-term wind-power forecasting based on the weighted random forest optimized by the niche immune lion algorithm. Energies 2018, 11, 1098. [Google Scholar] [CrossRef]
Ramasamy, P.; Chandel, S.S.; Yadav, A.K. Wind speed prediction in the mountainous region of India using an artificial neural network model. Renew. Energy 2015, 80, 338–347. [Google Scholar] [CrossRef]
Huang, B.; Liang, Y.; Qiu, X. Wind Power Forecasting Using Attention-Based Recurrent Neural Networks: A Comparative Study. IEEE Access 2021, 9, 40432–40444. [Google Scholar] [CrossRef]
Huang, J.; Niu, G.; Guan, H.; Song, S. Ultra-Short-Term Wind Power Prediction Based on LSTM with Loss Shrinkage Adam. Energies 2023, 16, 3789. [Google Scholar] [CrossRef]
Hossain, M.A.; Chakrabortty, R.K.; Elsawah, S.D.; Ryan, M.J. Very short-term forecasting of wind power generation using hybrid deep learning model. J. Clean. Prod. 2021, 296, 126564. [Google Scholar] [CrossRef]
Wang, L.; He, Y. M2STAN: Multi-modal multi-task spatiotemporal attention network for multi-location ultra-short-term wind power multi-step predictions. Appl. Energy 2022, 324, 119672. [Google Scholar] [CrossRef]
Dong, X.; Sun, Y.; Li, Y.; Wang, X.; Pu, T. Spatio-temporal Convolutional Network Based Power Forecasting of Multiple Wind Farms. J. Mod. Power Syst. Clean. Energy 2022, 10, 388–398. [Google Scholar] [CrossRef]
Bezerra, E.C.; Pinson, P.; Leao, R.P.S.; Braga, A.P.S. A Self-Adaptive Multikernel Machine Based on Recursive Least-Squares Applied to Very Short-Term Wind Power Forecasting. IEEE Access 2021, 9, 104761–104772. [Google Scholar] [CrossRef]
Zhang, Y.; Li, Y.; Zhang, G. Short-term wind power forecasting approach based on Seq2Seq model using NWP data. Energy 2020, 213, 118371. [Google Scholar] [CrossRef]
Sun, Y.; Li, Z.Y.; Yu, X.N.; Li, B.J.; Yang, M. Research on Ultra-Short-Term Wind Power Prediction Considering Source Relevance. IEEE Access 2020, 8, 147703–147710. [Google Scholar] [CrossRef]
Nascimento, E.G.S.; de Melo, T.A.; Moreira, D.M. A transformer-based deep neural network with wavelet transform for forecasting wind speed and wind energy. Energy 2023, 278, 127678. [Google Scholar] [CrossRef]
Yu, W.; Li, S.; Zhang, H.; Kang, Y.; Li, H.; Dong, H. Ultra-short-term wind-power forecasting based on an optimized CNN-BILSTM-attention model. iEnergy 2024, 3, 268–282. [Google Scholar] [CrossRef]
Miao, C.; Li, H.; Wang, X.; Li, H. Ultra-Short-Term Prediction of Wind Power Based on Sample Similarity Analysis. IEEE Access 2021, 9, 72730–72742. [Google Scholar] [CrossRef]
Yu, G.; Liu, C.; Tang, B.; Chen, R.; Lu, L.; Cui, C.; Hu, Y.; Shen, L.; Muyeen, S.M. Short term wind power prediction for regional wind farms based on spatial-temporal characteristic distribution. Renew. Energy 2022, 199, 599–612. [Google Scholar] [CrossRef]
Zhou, J.; Liu, H.; Xu, Y.; Jiang, W. A Hybrid Framework for Short Term Multi-Step Wind Speed Forecasting Based on Variational Model Decomposition and Convolutional Neural Network. Energies 2018, 11, 2292. [Google Scholar] [CrossRef]
Liu, F.; Li, R.; Dreglea, A. Wind Speed and Power Ultra Short-Term Robust Forecasting Based on Takagi–Sugeno Fuzzy Model. Energies 2019, 12, 3551. [Google Scholar] [CrossRef]
Wu, X.; Jiang, S.; Lai, C.S.; Zhao, Z.; Lai, L.L. Short-Term Wind Power Prediction Based on Data Decomposition and Combined Deep Neural Network. Energies 2022, 15, 6734. [Google Scholar] [CrossRef]
Jalali, S.M.J.; Osorio, G.J.; Ahmadian, S.; Lotfi, M.; Campos, V.M.A.; Shafie-Khah, M.; Khosravi, A.; Catalao, J.P.S. New Hybrid Deep Neural Architectural Search-Based Ensemble Reinforcement Learning Strategy for Wind Power Forecasting. IEEE Trans. Ind. Appl. 2022, 58, 15–27. [Google Scholar] [CrossRef]
Yao, Z.; Wang, C. A hybrid model based on a modified optimization algorithm and an artificial intelligence algorithm for short-term wind speed multi-step ahead forecasting. Sustainability 2018, 10, 1443. [Google Scholar] [CrossRef]
Hossain, M.A.; Gray, E.; Lu, J.; Islam, M.R.; Alam, M.S.; Chakrabortty, R.; Pota, H.R. Optimized Forecasting Model to Improve the Accuracy of Very Short-Term Wind Power Prediction. IEEE Trans. Ind. Inform. 2023, 19, 10145–10159. [Google Scholar] [CrossRef]
Liu, Z.; Li, X.; Zhao, H. Short-Term Wind Power Forecasting Based on Feature Analysis and Error Correction. Energies 2023, 16, 4249. [Google Scholar] [CrossRef]
Chen, D.; Xue, J. An overview on recent progresses of the operational numerical weather prediction models. Acta Meteorol. Sin. 2004, 62, 623–633. [Google Scholar] [CrossRef]
Wang, J.; Yang, P.; Yang, X. Research on wind power prediction modeling based on numerical weather prediction. Renew. Energy Resour. 2013, 31, 34–38. [Google Scholar]
Torres, J.L.; García, A.; De Blas, M.; De Francisco, A. Forecast of hourly average wind speed with ARMA models in Navarre (Spain). Sol. Energy 2005, 79, 65–77. [Google Scholar] [CrossRef]
Kavasseri, R.G.; Seetharaman, K. Day-ahead wind speed forecasting using f-ARIMA models. Renew. Energy 2009, 34, 1388–1393. [Google Scholar] [CrossRef]
Zhang, W.; Lin, Z.; Liu, X. Short-term offshore wind power forecasting: A hybrid model based on Discrete Wavelet Transform (DWT), Seasonal Autoregressive Integrated Moving Average (SARIMA), and deep-learning-based Long Short-Term Memory (LSTM). Renew. Energy 2022, 185, 611–628. [Google Scholar] [CrossRef]
Louka, P.; Galanis, G.; Siebert, N.; Kariniotakis, G.; Katsafados, P.; Pytharoulis, I.; Kallos, G. Improvements in wind speed forecasts for wind power prediction purposes using Kalman filtering. J. Wind Eng. Ind. Aerodyn. 2008, 96, 2348–2362. [Google Scholar] [CrossRef]
Liao, S.; Tian, X.; Liu, B.; Liu, T.; Su, H.; Zhou, B. Short-Term Wind Power Prediction Based on LightGBM and Meteorological Reanalysis. Energies 2022, 15, 6287. [Google Scholar] [CrossRef]
Hu, J.; Wang, J.; Ma, K. A hybrid technique for short term wind speed prediction. Energy 2015, 81, 563–574. [Google Scholar] [CrossRef]
Wang, J.; Wang, Y.; Jiang, P. The study and application of a novel hybrid forecasting model: A case study of wind speed forecasting in China. Appl. Energy 2015, 143, 472–488. [Google Scholar] [CrossRef]
Yu, L.; Meng, G.; Pau, G.; Wu, Y.; Tang, Y. Research on Hierarchical Control Strategy of ESS in Distribution Based on GA-SVR Wind Power Forecasting. Energies 2023, 16, 2079. [Google Scholar] [CrossRef]
Abbasipour, M.; Igder, M.A.; Liang, X. A Novel Hybrid Neural Network-Based Day-Ahead Wind Speed Forecasting Technique. IEEE Access 2021, 9, 151142–151154. [Google Scholar] [CrossRef]
Cadenas, E.; Rivera, W. Short term wind speed forecasting in La Venta, Oaxaca, México, using artificial neural networks. Renew. Energy 2009, 34, 274–278. [Google Scholar] [CrossRef]
Noorollahi, Y.; Jokar, M.A.; Kalhor, A. Using artificial neural networks for temporal and spatial wind speed forecasting in Iran. Energy Convers. Manag. 2016, 115, 17–25. [Google Scholar] [CrossRef]
Medina, S.V.; Ajenjo, U.P. Performance Improvement of Artificial Neural Network Model in Short-term Forecasting of Wind Farm Power Output. J. Mod. Power Syst. Clean. Energy 2020, 8, 484–490. [Google Scholar] [CrossRef]
Li, G.; Shi, J. On comparing three artificial neural networks for wind speed forecasting. Appl. Energy 2010, 87, 2313–2320. [Google Scholar] [CrossRef]
Cao, Q.; Ewing, B.T.; Thompson, M.A. Forecasting wind speed with recurrent neural networks. Eur. J. Oper. Res. 2012, 221, 148–154. [Google Scholar] [CrossRef]
Ko, M.S.; Lee, K.G.; Kim, J.K.; Hong, C.W.; Dong, Z.Y.; Hur, K. Deep Concatenated Residual Network with Bidirectional LSTM for One-Hour-Ahead Wind Power Forecasting. IEEE Trans. Sustain. Energy 2021, 12, 1321–1335. [Google Scholar] [CrossRef]
Sun, Y.; Wang, X.; Yang, J. Modified particle swarm optimization with attention-based LSTM for wind power prediction. Energies 2022, 15, 4334. [Google Scholar] [CrossRef]
Xiong, B.; Lou, L.; Meng, X.Y.; Wang, X.; Ma, H.; Wang, Z.G. Short-term wind power forecasting based on Attention Mechanism and Deep Learning. Electr. Power Syst. Res. 2022, 206, 107776. [Google Scholar] [CrossRef]
Zhao, Z.; Yun, S.; Jia, L.; Guo, J.; Meng, Y.; He, N.; Li, X.; Shi, J.; Yang, L. Hybrid VMD-CNN-GRU-based model for short-term forecasting of wind power considering spatio-temporal features. Eng. Appl. Artif. Intell. 2023, 121, 105982. [Google Scholar] [CrossRef]
Blazakis, K.; Katsigiannis, Y.; Stavrakakis, G. One-Day-ahead solar irradiation and windspeed forecasting with advanced deep learning techniques. Energies 2022, 15, 4361. [Google Scholar] [CrossRef]
Zheng, J.; Du, J.; Wang, B.; Klemeš, J.J.; Liao, Q.; Liang, Y. A hybrid framework for forecasting power generation of multiple renewable energy sources. Renew. Sustain. Energy Rev. 2023, 172, 113046. [Google Scholar] [CrossRef]
Maatallah, O.A.; Achuthan, A.; Janoyan, K.; Marzocca, P. Recursive wind speed forecasting based on Hammerstein auto-regressive model. Appl. Energy 2015, 145, 191–197. [Google Scholar] [CrossRef]
Sidorov, D.; Tynda, A.; Muratov, V.; Yanitsky, E. Volterra Black-Box Models Identification Methods: Direct Collocation vs. Least Squares. Mathematics 2024, 12, 227. [Google Scholar] [CrossRef]
Chen, N.; Qian, Z.; Nabney, I.T.; Meng, X. Wind power forecasts using Gaussian processes and numerical weather prediction. IEEE Trans. Power Syst. 2014, 29, 656–665. [Google Scholar] [CrossRef]
Xue, H.; Jia, Y.; Wen, P.; Farkoush, S.G. Using of improved models of Gaussian Processes in order to Regional wind power forecasting. J. Clean. Prod. 2020, 262, 121391. [Google Scholar] [CrossRef]
Mararakanye, N.; Dalton, A.; Bekker, B. Incorporating Spatial and Temporal Correlations to Improve Aggregation of Decentralized Day-Ahead Wind Power Forecasts. IEEE Access 2022, 10, 116182–116195. [Google Scholar] [CrossRef]
Lee, D.; Baldick, R. Short-term wind power ensemble prediction based on Gaussian processes and neural networks. IEEE Trans. Smart Grid 2014, 5, 501–510. [Google Scholar] [CrossRef]
Garg, S.; Krishnamurthi, R. A CNN encoder decoder LSTM model for sustainable wind power predictive analytics. Sustain. Comput. Inform. Syst. 2023, 38, 100869. [Google Scholar] [CrossRef]
Khazaei, S.R.; Ehsan, M.D.; Soleymani, S.D.B.; Mohammadnezhad-Shourkaei, H.S. A high-accuracy hybrid method for short-term wind power forecasting. Energy 2022, 238, 122020. [Google Scholar] [CrossRef]
Sun, Z.; Zhao, M. Short-Term Wind Power Forecasting Based on VMD Decomposition, ConvLSTM Networks and Error Analysis. IEEE Access 2020, 8, 134422–134434. [Google Scholar] [CrossRef]
Chandran, V.; Patil, C.K.; Manoharan, A.M.; Ghosh, A.; Sumithra, M.G.; Karthick, A.; Rahim, R.; Arun, K. Wind power forecasting based on time series model using deep machine learning algorithms. Mater. Today Proc. 2021, 47, 115–126. [Google Scholar] [CrossRef]
Wen, S.; Li, Y.; Su, Y. A new hybrid model for power forecasting of a wind farm using spatial–temporal correlations. Renew. Energy 2022, 198, 155–168. [Google Scholar] [CrossRef]
Ye, L.; Li, Y.; Pei, M.; Zhao, Y.; Li, Z.; Lu, P. A novel integrated method for short-term wind power forecasting based on fluctuation clustering and history matching. Appl. Energy 2022, 327, 120131. [Google Scholar] [CrossRef]
Huang, Y.; Liu, G.P.; Hu, W.S. Priori-guided and data-driven hybrid model for wind power forecasting. ISA Trans. 2023, 134, 380–395. [Google Scholar] [CrossRef] [PubMed]
Yu, Y.; Yang, M.; Han, X.; Zhang, Y.; Ye, P. A Regional Wind Power Probabilistic Forecast Method Based on Deep Quantile Regression. IEEE Trans. Ind. Appl. 2021, 57, 4420–4427. [Google Scholar] [CrossRef]
Liu, H.; Tian, H.Q.; Li, Y.F. Comparison of two new ARIMA-ANN and ARIMA-Kalman hybrid methods for wind speed prediction. Appl. Energy 2012, 98, 415–424. [Google Scholar] [CrossRef]
Shukur, O.B.; Lee, M.H. Daily wind speed forecasting through hybrid KF-ANN model based on ARIMA. Renew. Energy 2015, 76, 637–647. [Google Scholar] [CrossRef]
Osorio, G.J.; Matias, J.C.O.; Catalao, J.P.S. Short-term wind power forecasting using adaptive neuro-fuzzy inference system combined with evolutionary particle swarm optimization, wavelet transform and mutual information. Renew. Energy 2015, 75, 301–307. [Google Scholar] [CrossRef]
Han, Y.; Tong, X. Multi-step short-term wind power prediction based on three-level decomposition and improved grey wolf optimization. IEEE Access 2020, 8, 67124–67136. [Google Scholar] [CrossRef]
Gu, B.; Hu, H.; Zhao, J.; Zhang, H.; Liu, X. Short-term wind power forecasting and uncertainty analysis based on FCM–WOA-ELM–GMM. Energy Rep. 2023, 9, 807–819. [Google Scholar] [CrossRef]
Salcedo-Sanz, S.; Pastor-Sanchez, A.; Del Ser, J.; Prieto, L.; Geem, Z.W. A coral reefs optimization algorithm with harmony search operators for accurate wind speed prediction. Renew. Energy 2015, 75, 93–101. [Google Scholar] [CrossRef]
Mughees, M.; Li, Y.; Li, Y. From C. elegans to liquid neural networks: A robust wind power multi-time scale prediction framework. In Proceedings of the IECON 2024—50th Annual Conference of the IEEE Industrial Electronics Society, Chicago, IL, USA, 3–6 November 2024. [Google Scholar] [CrossRef]
Li, M.; Yang, M.; Yu, Y.; Lee, W.-J. A wind speed correction method based on modified hidden Markov model for enhancing wind power forecast. IEEE Trans. Ind. Appl. 2022, 58, 656–666. [Google Scholar] [CrossRef]
Hu, S.; Xiang, Y.; Huo, D.; Jawad, S.; Liu, J. An improved deep belief network based hybrid forecasting method for wind power. Energy 2021, 224, 120185. [Google Scholar] [CrossRef]
Liu, X.; Zhang, L.; Wang, J.; Zhou, Y.; Gan, W. A unified multi-step wind speed forecasting framework based on numerical weather prediction grids and wind farm monitoring data. Renew. Energy 2023, 211, 948–963. [Google Scholar] [CrossRef]
Hu, S.; Xiang, Y.; Zhang, H.; Xie, S.; Li, J.; Gu, C.; Sun, W.; Liu, J. Hybrid forecasting method for wind power integrating spatial correlation and corrected numerical weather prediction. Appl. Energy 2021, 293, 116951. [Google Scholar] [CrossRef]
Cui, Y.; Chen, Z.; He, Y.; Xiong, X.; Li, F. An algorithm for forecasting day-ahead wind power via novel long short-term memory and wind power ramp events. Energy 2023, 263, 125888. [Google Scholar] [CrossRef]
Llido, A.M.; Ortiz-García, E.G.; Portilla-Figueras, A.; Prieto, L.; Paredes, D. Hybridizing the fifth generation mesoscale model with artificial neural networks for short-term wind speed prediction. Renew. Energy 2009, 34, 1451–1457. [Google Scholar] [CrossRef]
Ghadi, M.J.; Gilani, S.H.; Afrakhte, H.; Baghramian, A. A novel heuristic method for wind farm power prediction: A case study. Int. J. Electr. Power Energy Syst. 2014, 63, 962–970. [Google Scholar] [CrossRef]
Yesilbudak, M.; Sagiroglu, S.; Colak, I. A new approach to very short term wind speed prediction using k-nearest neighbor classification. Energy Convers. Manag. 2013, 69, 77–86. [Google Scholar] [CrossRef]
Ahmadi, A.; Nabipour, M.; Mohammadi-Ivatloo, B.; Amani, A.; Rho, S.; Piran, M.J. Long-Term Wind Power Forecasting Using Tree-Based Learning Algorithms. IEEE Access 2020, 8, 151511–151522. [Google Scholar] [CrossRef]
Bouzgou, H.; Benoudjit, N. Multiple architecture system for wind speed prediction. Appl. Energy 2011, 88, 2463–2471. [Google Scholar] [CrossRef]
De Giorgi, M.G.; Ficarella, A.; Tarantino, M. Assessment of the benefits of numerical weather predictions in wind power forecasting based on statistical methods. Energy 2011, 36, 3968–3978. [Google Scholar] [CrossRef]
Hu, J.; Wang, J.; Zeng, G. A hybrid forecasting approach applied to wind speed time series. Renew. Energy 2013, 60, 185–194. [Google Scholar] [CrossRef]
Guo, Z.; Zhao, W.; Lu, H.; Wang, J. Multi-step forecasting for wind speed using a modified EMD-based artificial neural network model. Renew. Energy 2012, 37, 241–249. [Google Scholar] [CrossRef]
Han, S.; Qiao, Y.; Yan, J.; Liu, Y.; Li, L.; Wang, Z. Mid-to-long term wind and photovoltaic power generation prediction based on copula function and long short term memory network. Appl. Energy 2019, 239, 181–191. [Google Scholar] [CrossRef]
Skittides, C.; Fruh, W. Wind forecasting using principal component analysis. Renew. Energy 2014, 69, 365–374. [Google Scholar] [CrossRef]
Hu, Q.; Su, P.; Yu, D.; Liu, J. Pattern-based wind speed prediction based on generalized principal component analysis. IEEE Trans. Sustain. Energy 2014, 5, 866–874. [Google Scholar] [CrossRef]
Kusiak, A.; Zheng, H.; Song, Z. Models for monitoring wind farm power. Renew. Energy 2009, 34, 583–590. [Google Scholar] [CrossRef]
Wang, X.; Liu, Y.; Hou, J.; Wang, S.; Yao, H. Medium- and Long-Term Wind-Power Forecasts, Considering Regional Similarities. Atmosphere 2023, 14, 430. [Google Scholar] [CrossRef]
Azad, H.B.; Mekhilef, S.; Ganapathy, V.G. Long-term wind speed forecasting and general pattern recognition using neural networks. IEEE Trans. Sustain. Energy 2014, 5, 546–553. [Google Scholar] [CrossRef]
Akcay, H.; Filik, T. Short-term wind speed forecasting by spectral analysis from long-term observations with missing values. Appl. Energy 2017, 191, 653–662. [Google Scholar] [CrossRef]
Guo, Z.; Chi, D.; Wu, J.; Zhang, W. A new wind speed forecasting strategy based on the chaotic time series modelling technique and the Apriori algorithm. Energy Convers. Manag. 2014, 84, 140–151. [Google Scholar] [CrossRef]
Togelou, A.; Sideratos, G.; Hatziargyriou, N. Wind power forecasting in the absence of historical data. IEEE Trans. Sustain. Energy 2012, 3, 416–421. [Google Scholar] [CrossRef]
Sideratos, G.; Hatziargyriou, N. An advanced statistical method for wind power forecasting. IEEE Trans. Power Syst. 2007, 22, 258–265. [Google Scholar] [CrossRef]
Sideratos, G.; Hatziargyriou, N. Probabilistic wind power forecasting using radial basis function neural networks. IEEE Trans. Power Syst. 2012, 27, 1788–1796. [Google Scholar] [CrossRef]
Sideratos, G.; Hatziargyriou, N. Wind power forecasting focused on extreme power system events. IEEE Trans. Sustain. Energy 2012, 3, 445–454. [Google Scholar] [CrossRef]
Liu, J.; Wang, X.; Lu, Y. A novel hybrid methodology for short-term wind power forecasting based on adaptive neuro-fuzzy inference system. Renew. Energy 2017, 103, 620–629. [Google Scholar] [CrossRef]
Yi, P.; Bao, Z.; Huang, F.; Wang, J.; Peng, J.; Zhang, L. Towards Effective Long-Term Wind Power Forecasting: A Deep Conditional Generative Spatio-Temporal Approach. IEEE Trans. Knowl. Data Eng. 2024, 36, 9403–9417. [Google Scholar] [CrossRef]
Yu, J.; Chen, K.; Mori, J.; Rashid, M.M. A Gaussian mixture copula model based localized Gaussian process regression approach for long-term wind speed prediction. Energy 2013, 61, 673–686. [Google Scholar] [CrossRef]
Cadenas, E.; Rivera, W. Wind speed forecasting in three different regions of Mexico, using a hybrid ARIMA-ANN model. Renew. Energy 2010, 35, 2732–2738. [Google Scholar] [CrossRef]
Shahzad, M.N.; Kanwal, S.; Hussanan, A. A New Hybrid ARAR and Neural Network Model for Multi-Step Ahead Wind Speed Forecasting in Three Regions of Pakistan. IEEE Access 2020, 8, 199382–199392. [Google Scholar] [CrossRef]
Chen, F.; Vinsen, K.; Filoche, A. Spatial–Temporal Approach for Gridded Wind Forecasting Across Southwest Western Australia. IEEE Access 2024, 12, 185905–185917. [Google Scholar] [CrossRef]
Yeh, W.C.; Yeh, Y.M.; Chang, P.C.; Ke, Y.C.; Chung, V. Forecasting wind power in the Mai Liao Wind Farm based on the multi-layer perceptron artificial neural network model with improved simplified swarm optimization. Int. J. Electr. Power Energy Syst. 2014, 55, 741–748. [Google Scholar] [CrossRef]
Abedinia, O.; Bagheri, M.; Naderi, M.S.; Ghadimi, N. A New Combinatory Approach for Wind Power Forecasting. IEEE Syst. J. 2020, 14, 4614–4625. [Google Scholar] [CrossRef]
Lin, K.-P.; Pai, P.-F.; Ting, Y.-J. Deep Belief Networks with Genetic Algorithms in Forecasting Wind Speed. IEEE Access 2019, 7, 99244–99253. [Google Scholar] [CrossRef]
Amjady, N.; Keynia, F.; Zareipour, H. Wind power prediction by a new forecast engine composed of modified hybrid neural network and enhanced particle swarm optimization. IEEE Trans. Sustain. Energy 2011, 2, 265–276. [Google Scholar] [CrossRef]
Wang, J.; Qin, S.; Zhou, Q.; Jiang, H. Medium-term wind speeds forecasting utilizing hybrid models for three different sites in Xinjiang, China. Renew. Energy 2015, 76, 91–101. [Google Scholar] [CrossRef]
Zhang, W.; Wu, J.; Wang, J.; Zhao, W.; Shen, L. Performance analysis of four modified approaches for wind speed forecasting. Appl. Energy 2012, 99, 324–333. [Google Scholar] [CrossRef]
Haque, A.U.; Nehrir, M.H.; Mandal, P. A hybrid intelligent model for deterministic and quantile regression approach for probabilistic wind power forecasting. IEEE Trans. Power Syst. 2014, 29, 1663–1672. [Google Scholar] [CrossRef]
Fang, S.; Chiang, H.-D. A high-accuracy wind power forecasting model. IEEE Trans. Power Syst. 2017, 32, 1589–1590. [Google Scholar] [CrossRef]
Chang, Y.; Yang, H.; Chen, Y.; Zhou, M.; Yang, H.; Wang, Y.; Zhang, Y. A hybrid model for long-term wind power forecasting utilizing NWP subsequence correction and multi-scale deep learning regression methods. IEEE Trans. Sustain. Energy 2024, 15, 263–275. [Google Scholar] [CrossRef]
Zhao, P.; Wang, J.; Xia, J.; Dai, Y.; Sheng, Y.; Yue, J. Performance evaluation and accuracy enhancement of a day ahead wind power forecasting system in China. Renew. Energy 2012, 43, 234–241. [Google Scholar] [CrossRef]
Ferreira, C.A.; Gama, J.; Matias, L.; Botterud, A.; Wang, J. A Survey on Wind Power Ramp Forecasting; Argonne National Laboratory: Chicago, IL, USA, 2011. [Google Scholar] [CrossRef]
Cutler, N.; Kay, M.; Jacka, K.; Nielsen, T.S. Detecting, categorizing and forecasting large ramps in wind farm power output using meteorological observations and WPPT. Wind Energy 2007, 10, 453–470. [Google Scholar] [CrossRef]
Bradford, K.T.; Carpenter, R.L.; Shaw, B.L. Forecasting southern plains wind ramp events using the WRF model at 3 km. In Proceedings of the Ninth Annual Student Conference, Atlanta, GA, USA, 17 January 2010. [Google Scholar]
Hirth, B.D.; Schroeder, J.; Irons, Z.; Walter, K. Dual-Doppler measurements of a wind ramp event at an Oklahoma wind plant. Wind Energy 2016, 19, 953–962. [Google Scholar] [CrossRef]
Potter, C.W.; Grimit, E.; Nijssen, B. Potential benefits of a dedicated probabilistic rapid ramp event forecast tool. In Proceedings of the 2009 IEEE/PES Power Systems Conference and Exposition, Seattle, WA, USA, 15–18 March 2009. [Google Scholar]
Bossavy, A.; Girard, R.; Kariniotakis, G. Forecasting uncertainty related to ramps of wind power production. In Proceedings of the European Wind Energy Conference & Exhibition, Warsaw, Poland, 20–23 April 2010. [Google Scholar]
Greaves, B.; Collins, J.; Parkes, J.; Tindal, A. Temporal forecast uncertainty for ramp events. Wind Eng. 2009, 33, 309–319. [Google Scholar] [CrossRef]
Zareipour, H.; Huang, D.; Rosehart, W. Wind power ramp events classification and forecasting: A data mining approach. In Proceedings of the Power and Energy Society General Meeting, Detroit, MI, USA, 24–28 July 2011. [Google Scholar]
Mishra, S.; Ören, E.; Bordin, C.; Wen, F.; Palu, I. Features extraction of wind ramp events from a virtual wind park. Energy Rep. 2020, 6, 237–249. [Google Scholar] [CrossRef]

Figure 1. Classification of wind power prediction.

Figure 2. The flowchart of the hybrid prediction process.

Figure 3. The structure of the ANN.

Figure 4. The structure of the RNN.

Figure 5. The structure of the LSTM.

Figure 6. The structure of the GRU.

Figure 7. The structure of the TCN based on causal convolution with a kernel size of 2.

Figure 8. The typical structure of a CNN.

Figure 9. Topology of backpropagation network.

Figure 10. Internal structure of the HAR model.

Figure 11. Result of the ARIMA model considering wind power information.

Figure 12. Result of the LSTM model considering wind power information.

Figure 13. Result of the CNN-LSTM model considering wind power information.

Figure 14. Result of the LSTM model considering wind power information and NWP data.

Figure 15. Result of the CNN-LSTM model considering wind power information and NWP data.

Table 1. Other features of ultra-short-term wind power prediction.

Model Type			Reference	Prediction Model	Input Data Type	Evaluation Metric	Spatial Scale
Traditional statistical model			[47]	ARMA	Wind information	MAE	Wind farm
			[69]	ARMA	Wind information	RMSE, MAPE	Wind farm
			[70]	SARIMA	Wind information	RMSE, MAPE	Wind farm
			[71]	SES	Wind information	MAPE	Wind farm
			[72]	BSBM	Wind information	MAE, MSE, RMSE	Wind farm
			[73]	Markov chain	Wind information	MAE, MAPE, RMSE	Wind farm
			[74]	Markov chain	Wind information	MAE, MSE, RMSE	Wind farm
Machine learning-based model			[75]	GBM	Wind information	MAE, RMSE	Wind farm
			[78]	IJS-SVR	Wind information, NWP	MAE, RMSE, MAPE	Wind farm
			[79]	IHPO-ELM	Wind information	MAE, RMSE, MAPE	Wind farm
			[80]	SSA-DELM	Wind information	RMSE, MAE	Wind farm
			[82]	ANN	Wind information, NWP	MAPE	Wind farm
			[56]	LRNN	Wind information, NWP	MAPE	Wind farm
			[83]	DA-RNN	Wind information	MAE	Wind farm
			[84]	LSTM	Wind information	MSE, MAE	Wind farm
			[87]	STCN	Wind information	RMSE, MAE	Wind farm region
Hybrid model	Weighted combination prediction method		[88]	MC-KRLS	Wind information	RMSE	Wind farm
	Fusion combination prediction method	Input optimization	[77]	DR-LSSVM	Wind information, NWP	MAE, RMSE, MAPE	Wind farm
			[85]	CNN-GRU	Wind information	MAE, RMSE, MAPE	Wind farm
			[89]	Seq2Seq	Wind information	RMSE, MAE	Wind farm
			[90]	PM-BP	Wind information	MAE, RMSE	Wind farm
			[91]	MLP-transformer	Wind information, NWP	MAE, MSE, RMSE	Wind farm
			[93]	CNN-MLSTMs	Wind information	MAE, RMSE	Wind farm
			[94]	I-CNN-BILSTM	Wind information, NWP	MAE, RMSE	Wind farm region
			[95]	VMD-CNN	Wind information	MAPE	Wind farm
		Model optimization	[62]	WT-PSO-ANFIS	Wind information	MAPE, NMAE	Wind farm
			[63]	GSA-EEMD-PE-LSSVM	Wind information	NMAE, NRMSE	Wind farm
			[76]	ST-GWO-MSVM	Wind information	MAE, RMSE	Wind farm region
			[81]	WD-NILA-WRF	Wind information, NWP	MAPE	Wind farm
			[86]	M2STAN	Wind information, NWP	MAE, RMSE	Wind farm region
			[77]	DR-LSSVM	Wind information, NWP	MAE, RMSE, MAPE	Wind farm
			[97]	VMD-CNN-IPSO-LSTM	Wind information, NWP	MAE, RMSE, MAPE	Wind farm
			[98]	DOCREL	Wind information	RMSE, MAE	Wind farm
			[99]	WD-APSOACO-BP	Wind information	MAPE, RMSE	Wind farm
			[100]	CEEMDAN-LSTM-MBO	Wind information	MAE, RMSE, MAPE	Wind farm
		Error processing techniques	[101]	BiLSTM-GBM	Wind information, NWP	MAE, RMSE, MAPE	Wind farm

Table 2. Other features of short-term wind power prediction.

Model Type			Reference	Prediction Model	Input Data Type	Evaluation Metric	Spatial Scale
Traditional statistical model			[104]	ARMA	Wind information	RMSE	Wind farm
			[105]	f-ARIMA	Wind information	RMSE	Wind farm
Machine learning-based model			[108]	LightGBM-MIC	Wind information, NWP	MAE, RMSE	Wind farm
			[49]	N-SVR	Wind information	MAE, RMSE, MAPE	Wind farm
			[111]	GA-SVR	Wind information	MAE, RMSE	Wind farm
			[113]	ANN	Wind information	MSE, MAE	Wind farm
			[115]	ANN	Wind information, NWP	MAPE	Wind farm
			[116]	BP, RBF, BMA	Wind information	MAE, RMSE, MAPE	Wind farm
			[117]	RNN	Wind information	MAE	Wind farm
			[57]	LSTM	Wind information, NWP	RMSE, MAE	Wind farm
			[118]	Bi-LSTM	Wind information, NWP	MSE, MAE, MAPE	Wind farm
			[87]	STCN	Wind information	RMSE, MAE	Wind farm region
			[54]	CNN	Wind information	MAE, RMSE	Wind farm
			[124]	HAR	Wind information	RMSE, MAE, MAPE	Wind farm
			[126]	GP	Wind information, NWP	RMSE, MAE, MAPE	Wind farm
			[127]	GP	Wind information, NWP	RMSE, MAE	Wind farm region
			[128]	KNN	Wind information, NWP	MAE, RMSE	Wind farm region
Hybrid model	Weighted combination prediction method		[52]	ANN, SVM, GBM, RF	Wind information, NWP	MAE, RMSE	Wind farm region
			[112]	ICSA-WNN, PSO-WNN, ELM, RBF, MLP	Wind information, NWP	RMSE, MAE
			[129]	GP-NN	Wind information	MAE, MSE, RMSE, MAPE	Wind farm
	Fusion combination prediction method	Input optimization	[122]	CNN-LSTM	Wind information	MSE, RMSE, MAPE, MAE	Wind farm
			[130]	CNN-ED-LSTM	Wind information, NWP	MAE, MSE, MAPE, RMSE	Wind farm
			[131]	NSGA-II-WT-MLP	Wind information, NWP	RMSE, MAPE, RMSE, MAE	Wind farm
			[132]	VMD-ConvLSTM-LSTM	Wind information	MRE, MAE, MSE, RMSE	Wind farm
			[133]	EEMD-LSTM	Wind information, NWP	MSE	Wind farm
			[60]	ELM-LBQ-SARIMA	Wind information	MAE, MAPE, RMSE	Wind farm
			[134]	KHC-SVD-SVR	Wind information	MAE, RMSE	Wind farm
			[135]	VMD-FFT-FCM-RF	Wind information, NWP	MAE, RMSE	Wind farm
			[136]	FCM-VPBFN	Wind information, NWP	MAE, RMSE	Wind farm
		Model optimization	[65]	SSA-FA-BP	Wind information	MSE, MAE, MAPE	Wind farm
			[67]	C-LSSVM-PSOGSA	Wind information	MAE, RMSE, MAPE	Wind farm
			[61]	ARIMA-ANN, ARIMA-SVM	Wind information	MAE, RMSE	Wind farm
			[109]	EWT-CSA-LSSVM	Wind information	MAE, MAPE, RMSE	Wind farm
			[110]	WPT-LSSVM-PSOSA	Wind information	MAE, MSE, MAPE	Wind farm
			[114]	ANFIS	Wind information	MAPE, MAE, RMSE	Wind farm
			[119]	MPSO-ATT-LSTM	Wind information, NWP	MAPE, MAE	Wind farm
			[120]	AMC-LSTM	Wind information, NWP	MSE, MAE, RMSE	Wind farm
			[58]	EEMD-BA-RGRU-CSO	Wind information, NWP	MAE, RMSE	Wind farm
			[121]	VMD-CNN-GRU	Wind information, NWP	RMSE, MAE, MAPE	Wind farm
			[123]	A-CNN-LSTM	Wind information	RMSE, MAE, MAPE	Wind farm
			[51]	ANFIS	Wind information, NWP	RMSE, MAPE	Wind farm
			[137]	DQR	Wind information	MAE, RMSE	Wind farm region
			[40]	IDMDZ	Wind information, NWP	RMSE, MAE	Wind farm region
			[138]	ARIMA-ANN, ARIMA-Kalman	Wind information	MAE, MAPE, MSE	Wind farm
			[139]	KF-ANN	Wind information	MAPE	Wind farm
			[140]	WT-EPSO-ANFIS	Wind information	MAPE, MAE	Wind farm
			[141]	WPD-VMD-SSA-IGWO-KELM	Wind information	RMSE, MAE, MAPE	Wind farm
			[142]	FCM-WOA-ELM-GMM	Wind information, NWP	MAE, RMSE	Wind farm
			[143]	CRO-HS-ELM	Wind information	RMSE	Wind farm
		Error processing techniques	[66]	ramp predictor	Wind information, NWP	MAE	Wind farm region
			[55]	EWT-Q-GRU-BiLSTM-DBN	Wind information	MAE, RMSE, MAPE	Wind farm
			[67]	ICEEMDAN-LSTM	Wind information, NWP	MAPE, RMSE	Wind farm
			[145]	HMM	Wind information, NWP	RMSE, MAE	Wind farm
			[146]	DBN-SC	Wind information, NWP	RMSE, MAPE, MAE	Wind farm
			[147]	STC-DPN	Wind information, NWP	MAE, RMSE	Wind farm
			[148]	GP-SC	Wind information, NWP	RMSE, MAPE, MAE	Wind farm
			[149]	LSTM-WPRE	Wind information, NWP	MAPE, RMSE	Wind farm

Table 3. Other features of mid-long-term wind power prediction.

Model Type			Reference	Prediction Model	Input Data Type	Evaluation Metric	Spatial Scale
Machine learning-based model			[150]	ANN	Wind information, NWP	MAE, MSE	Wind farm
			[151]	MLP	Wind information, NWP	RMSE	Wind farm
			[152]	KNN	Wind information, NWP	MAE, MAPE, NRMSE	Wind farm
			[153]	Decision tree, bagging, random forest, boosting method, gradient boosting method, XGBoost	Wind information	MAE, RMSE, NRMSE, R²	Wind farm region
Hybrid model	Weighted combination prediction method		[154]	MLR, MLP, RFB, SVM	Wind information, NWP	MAE, RMSE, NMSE	Wind farm region
			[129]	GP-NN	Wind information	MAE, MSE, RMSE, MAPE	Wind farm
			[155]	WT-ELMAN-MLP	Wind information, NWP	MSE, NMAPE	Wind farm
	Fusion combination prediction method	Input optimization	[110]	WPT-PSOSA-LSSVM	Wind information	MAE, MSE, MAPE	Wind farm
			[156]	EEMD-SVM	Wind information	MAE, MAPE	Wind farm
			[157]	EMD-FNN	Wind information	MSE, MAE, MAPE	Wind farm region
			[158]	copula-LSTM	Wind information	RMSPE, MAPE	Wind farm
			[159]	PCA-KNN	Wind information	MAE, RMSE	Wind farm region
			[160]	PCA-SVR	Wind information	MAE, MSE	Wind farm
			[161]	PCA-K-NN	Wind information	MAE, MRE	Wind farm
			[143]	CRO-HS-ELM	Wind information	RMSE	Wind farm
			[162]	FRAMA-LSTM	Wind information, NWP	RMSE	Wind farm region
			[163]	NARX	Wind information, NWP	MAE	Wind farm
			[164]	S-Kalman	Wind information	RMSE, MAE	Wind farm
			[165]	k-means, chaotic time series	Wind information, NWP	ARE, MAPE	Wind farm
			[166]	RBFNN	Wind information	NMAE, NRMSE	Wind farm
			[167]	RBFNN	Wind information, NWP	NMAE, NRMSE	Wind farm
			[168]	RBFNN	Wind information, NWP	NMAE, NRMSE	Wind farm
			[169]	ARTMAP-RBFNN	Wind information, NWP	NMAE, NRMSE	Wind farm
		Model optimization	[170]	ANFIS	Wind information, NWP	MAPE, NMAE, NRMSE	Wind farm
			[171]	DCGST	Wind information	MSE, MAE	Wind farm
			[172]	GMCM-GPR	Wind information	RMSE, MAPE, R²	Wind farm region
			[173]	ARIMA-ANN	Wind information	ME, MSE, MAE	Wind farm
			[174]	ARAR-ANN	Wind information	MAE, MSE, MAPE	Wind farm
			[139]	KF-ANN	Wind information	MAPE	Wind farm region
			[175]	ABED	Wind information, NWP	MAE, RMSE	Wind farm region
			[176]	iSSO-PCA-MLP	Wind information	MSE	Wind farm
			[177]	AFPSO-IWT-TDCNN	Wind information	RMSE, MAPE	Wind farm
			[178]	DBNGA	Wind information, NWP	RMSE, MAPE	Wind farm region
			[179]	RBF-MLP	Wind information, NWP	RMSE, NMAE	Wind farm
			[180]	SIA-SVR-ERNN	Wind information	MSE, MAE, MAPE	Wind farm region
			[181]	PSO-FCA, PSO-SCA	Wind information	MSE, MAPE	Wind farm region
		Error processing techniques	[182]	WT-FA-FF-SVM	Wind information	MAPE, NRMSE, NMAE	Wind farm
			[126]	GP-Cspeed	Wind information, NWP	RMSE, NMAE	Wind farm
			[183]	ALL-CF	Wind information, NWP	NRMSE	Wind farm region
			[184]	MSHP	Wind information, NWP	MSE, MAE	Wind farm
			[185]	Kalman-ANN	Wind information, NWP	ME, MAE, RMSE, NRMSE	Wind farm

Table 4. The evaluation metrics of each prediction model considering wind power information.

Prediction Model	MAE/MW	MSE/MW²	RMSE/MW	MAPE/%
ARIMA	1.643	14.287	3.780	20.54
LSTM	4.817	146.537	12.105	16.70
CNN-LSTM	2.947	37.063	6.088	8.3739

Table 5. The evaluation metrics of each prediction model considering wind power information and NWP data.

Prediction Model	MAE/MW	MSE/MW²	RMSE/MW	MAPE/%
LSTM	9.094	201.681	14.201	17.76
CNN-LSTM	8.363	177.790	13.334	11.35
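
The four metrics reported in Tables 4 and 5 can be computed from paired actual and predicted power series as in the minimal sketch below. The function name `evaluation_metrics` and the sample values are hypothetical and are not taken from the case study; skipping zero-power samples in the MAPE is an assumption made here, since the percentage error is undefined at those points. Because RMSE is the square root of MSE, MSE is expressed in the squared unit of the power series.

```python
import math


def evaluation_metrics(actual, predicted):
    """Return MAE, MSE, RMSE and MAPE (in percent) for two equal-length series."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")

    errors = [p - a for a, p in zip(actual, predicted)]
    n = len(errors)

    mae = sum(abs(e) for e in errors) / n   # mean absolute error
    mse = sum(e * e for e in errors) / n    # mean squared error (squared unit)
    rmse = math.sqrt(mse)                   # root mean squared error

    # Assumption: samples with zero actual power are excluded from MAPE,
    # because the percentage error is undefined there.
    nonzero = [(a, e) for a, e in zip(actual, errors) if a != 0]
    if nonzero:
        mape = 100.0 * sum(abs(e / a) for a, e in nonzero) / len(nonzero)
    else:
        mape = float("nan")

    return {"MAE": mae, "MSE": mse, "RMSE": rmse, "MAPE": mape}


# Hypothetical values in MW, for illustration only.
actual = [10.0, 20.0, 40.0, 50.0]
predicted = [12.0, 18.0, 44.0, 45.0]
metrics = evaluation_metrics(actual, predicted)
print(metrics)
```

A low MAE can coexist with a high MAPE when the larger relative errors fall on low-power samples, which is one reason the tables report several metrics side by side.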

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
