1. Introduction
Renewable energy sources, such as solar and wind, are gaining more importance and attention because of the depletion of conventional energy sources, such as fossil fuels, and pollution generated by the combustion of such fuels. Wind power is a clean and sustainable source of energy, and it does not lead to any environmental hazards. Hence, energy generation with wind power has become the main goal of many countries. However, effective power generation with wind energy is quite an uncertain process because of the chaotic and intermittent nature of wind-power availability. This uncertainty in wind power can imperil power availability, quality, and stability. Eventually, this can lead to a huge loss in the energy market. Hence, precise prediction of wind power is a critical task with deep impact and large benefits for humanity.
There are various approaches to forecasting wind power, and these can be classified broadly into three categories: (1) model-driven approaches, (2) data-driven approaches, and (3) hybrid approaches [1]. Model-driven approaches require abundant meteorological knowledge and information about the various physical factors affecting wind power [2]. In data-driven approaches, on the other hand, statistical models are used for forecasting. With the advancement in the artificial-intelligence and data-science fields, more accurate prediction results can be achieved with this approach [3]. Historical data are the only requirement for such models. Many research articles describe the performance of distinct data-driven models, such as the basic persistence model [4], and complex models, including support vector machines (SVM) [5,6], neural networks (NN) [7,8], and autoregressive integrated moving average (ARIMA) [9]. However, due to the highly stochastic and intermittent nature of wind-power time series, it is difficult to predict them with high accuracy.
Wind-power prediction studies are broadly classified into direct and indirect approaches. In direct approaches, wind-power data are directly predicted by various methods. The advantage of this kind of approach is that there is no need to study the relations between wind-power and wind-speed parameters. However, the prediction accuracy of a direct approach is not always good enough since wind-power data usually show high levels of randomness and a chaotic nature. Such wind-power data are very difficult to efficiently process with the prediction methods.
To overcome this difficulty, other studies have focused on indirect prediction approaches. In this kind of approach, wind-speed data are first forecast, and the predicted data are then converted into wind-power data by means of various techniques. In practice, however, transforming wind-speed into wind-power data introduces further prediction errors because of inaccuracies in the nonlinear power-curve analysis. Generally, wind power is related to wind speed through cubic or higher-order powers. Hence, a small change in wind speed leads to a much larger deviation in wind power. The success of an indirect approach therefore depends on how well it evaluates the nonlinear dependence between wind-power and wind-speed data. Such error evaluations lead to a rise in learning accuracy and comprehensibility. Instead of manufacturer power curves, statistical techniques seem to be a better option to describe the nonlinear relationship between wind power and wind speed. Higher-order polynomial, exponential, fitted power, regression, logistic, and many other models are used to estimate wind power from explanatory wind-speed datasets.
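The sensitivity described above can be illustrated with a minimal sketch. It assumes the idealized cubic law P = 0.5 · ρ · A · Cp · v³, which holds only below a turbine's rated speed. The function names and the values of air density, swept area, and power coefficient are illustrative assumptions, not quantities taken from this paper.

```python
def wind_power(v, rho=1.225, area=1.0, cp=0.4):
    """Idealized power (W) for wind speed v (m/s); parameter values are assumed."""
    return 0.5 * rho * area * cp * v ** 3


def relative_power_error(v_true, speed_error_fraction):
    """Relative error in power caused by a relative error in the speed forecast."""
    v_pred = v_true * (1.0 + speed_error_fraction)
    return wind_power(v_pred) / wind_power(v_true) - 1.0


# A 5% overestimate of wind speed becomes roughly a 15.8% overestimate of power,
# because 1.05 ** 3 = 1.157625.
err = relative_power_error(8.0, 0.05)
```

Under this assumption the relative error in power is about three times the relative error in speed, which is why the conversion step of an indirect approach amplifies forecasting errors.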
A review of the literature on short-term wind-power prediction shows that a large number of articles focus on direct wind-power as well as wind-speed predictions [10,11,12].
However, very few articles have compared the performance of direct and indirect approaches. Most of them report that the best prediction accuracy comes with direct approaches [10,11], whereas Reference [12] concluded that an indirect approach performed better than the alternative.
In this paper, a novel approach is presented in order to eliminate the drawbacks of both direct and indirect methods used in wind-power prediction. The proposed method cannot be classified into either of the aforementioned groups because it uses combined information from wind-speed and wind-power series. In this sense, it is an alternative method that behaves as a direct–indirect hybrid, predicting power neither purely directly nor purely indirectly. It starts by smoothing down a wind-power time series while keeping the respective wind-speed data as a reference. The smoothing process is based on the label-sequence generation process of the pattern sequence-based forecasting (PSF) algorithm and on a matching process based on the Naïve Bayesian method, as follows. Wind-speed and wind-power data are converted into sequences of labels. Then, these labels are mapped and their best combination is estimated. Keeping these combinations as a reference, the wind-power labels are smoothed down and then predicted with the steps involved in the PSF method. An important consequence of this procedure is a reduction in the degree of chaos contained in the resulting predicted series.
Multiple simulations have been carried out in order to collect a body of results. Three different error measures have been used to quantify how much the proposed method outperforms existing ones.
The rest of the paper is organized as follows:
Section 2 describes the steps involved in the PSF algorithm.
Section 3 introduces the proposed methodology and the description of the prediction methodology for wind-power forecasting.
Section 4 shows the results obtained by the proposed approach in predicting wind power, including their quality measurements. Comparisons between the proposed method and other techniques are also provided.
Finally, Section 5 summarizes the conclusions achieved with regard to wind-power predictions.
2. Conventional PSF Methodology
The PSF algorithm is a popular univariate time-series prediction methodology, proposed in Reference [13] and further analyzed in Reference [14]. The basic principle behind predictions with the PSF algorithm is an optimum search of pattern sequences present in a time series. This methodology consists of several processes that operate in two steps. During the first step, data are clustered, and during the second, the forecasting process is carried out based on the previously clustered data, as shown in Figure 1. The novelty of the PSF algorithm is the utilization of labels for respective pattern sequences present in a time series, instead of the use of the original time-series data.
The clustering step consists of various tasks, including data normalization, the selection of an optimum number of clusters, and the application of k-means clustering. The ultimate aim of this step is to discover clusters in the time-series data and label them accordingly. It starts with a normalization process, in which the time series is normalized with Equation (1) in order to remove the redundancies present in it:
$$X_j \leftarrow \frac{X_j}{\frac{1}{N}\sum_{i=1}^{N} X_i} \qquad (1)$$
where $X_j$ is the $j$th value of each cycle in the input time series, and $N$ is its size in time units. Secondly, the normalized series is assigned labels according to the different patterns present in it with the help of a clustering method. In PSF, the k-means clustering method is used because of its popularity, simplicity, and fast computation. However, it requires the number of centers to be known in advance so that the series can be clustered into that number of clusters. Reference [13] utilized the Silhouette index [15] to decide the number of clusters in the PSF methodology, whereas Reference [14] suggested the ‘best among three’ policy, in which three different indices (the Silhouette index [15], Dunn index [16], and Davies–Bouldin index [17]) are used to decide the optimum number of clusters. In this policy, the number of clusters is finalized with the use of multiple statistical tests to ensure efficiency in the clustering process. Further, References [18,19,20] used a single index (the Silhouette index [15]) to reduce the computational complexity of the clustering process.
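The clustering step can be sketched as follows. This is a simplified illustration, not the authors' implementation: it assumes that normalization divides each value by the series mean, it clusters scalar samples rather than whole cycles, it takes the number of clusters k as given instead of selecting it with a validity index, and it uses a plain one-dimensional k-means with a deterministic start. The names `normalize` and `kmeans_labels` are hypothetical.

```python
def normalize(series):
    """Divide every value by the series mean (assumed form of the normalization)."""
    mean = sum(series) / len(series)
    return [x / mean for x in series]


def kmeans_labels(values, k, iterations=50):
    """Label each value with the index (1..k) of its nearest 1-D centroid."""
    lo, hi = min(values), max(values)
    # Deterministic initialization: centroids spread evenly over the data range.
    centroids = [lo + (hi - lo) * (i + 0.5) / k for i in range(k)]
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for v in values:
            nearest = min(range(k), key=lambda c: abs(v - centroids[c]))
            clusters[nearest].append(v)
        # Empty clusters keep their previous centroid.
        updated = [sum(c) / len(c) if c else centroids[i]
                   for i, c in enumerate(clusters)]
        if updated == centroids:
            break
        centroids = updated
    labels = [min(range(k), key=lambda c: abs(v - centroids[c])) + 1
              for v in values]
    return labels, centroids


series = [1.0, 1.2, 0.9, 5.0, 5.3, 4.8, 1.1, 5.1]
normalized = normalize(series)
labels, centroids = kmeans_labels(normalized, k=2)
# labels -> [1, 1, 1, 2, 2, 2, 1, 2]
```

In practice, k would be chosen by evaluating an index such as the Silhouette index over a range of candidate values and keeping the best-scoring one.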
Then, with respect to cluster heads (K) generated with the k-means clustering method, the values in the original time series are transformed into label series. These label series are further used for the prediction procedure. This prediction procedure consists of window-size selection, pattern sequence matching, and an estimation process.
Consider that $x(t)$ is the vector of time-series data of length $N$, such that $x(t) = [x_1(t), x_2(t), \ldots, x_N(t)]$. After clustering and labeling, the vector is converted into $y(t) = [L_1, L_2, \ldots, L_N]$, where $L_i$ are labels representing the cluster centers to which the data in vector $x(t)$ belong. Then, during the prediction process, the last $W$ labels are searched for in vector $y(t)$. If this sequence of the last $W$ labels is not found in $y(t)$, then the search process is repeated for the last $W-1$ labels. In PSF, the length $W$ of this label sequence is denoted as the window size. Therefore, the window size can vary from $W$ to 1, although this is not usual. In the window-size selection process, the sequence of the last $W$ labels is picked from the end of the label series, and this sequence is searched for in the label series. The selection of the optimum window (
$W$) is one of the most challenging processes in prediction with PSF, since it determines the prediction error. Mathematically, the optimum window size is the one that minimizes Equation (2):
$$\sum_{t \in TS} \left\| \hat{X}(t) - X(t) \right\| \qquad (2)$$
where $\hat{X}(t)$ is the predicted value at time $t$, $X(t)$ is the measured data at the same time instance, and $TS$ represents the time series under study. In practice, the optimum window size is estimated by means of error validation. However, if a sequence of length $W$ is not found while searching the label series, the size of $W$ is reduced by one unit. This process continues until a window sequence repeats itself in the label series at least once, which confirms that at least one sequence appears more than once in the label series. Once the optimum window size is obtained, the pattern sequence in the window is searched for in the label series, and the label present just after each discovered sequence is noted in a new vector $ES$. Finally, the future time-series value is predicted by averaging the values in vector $ES$, as in Equation (3):
$$\hat{X}(t) = \frac{1}{size(ES)} \sum_{j=1}^{size(ES)} ES(j) \qquad (3)$$
where $size(ES)$ is the length of vector $ES$. Finally, the predicted labels are replaced with the appropriate values in the range of the original measured time series through a denormalization process. To predict future values for multiple time indices, the current predicted value is appended to the original time series, and the procedure continues until the desired number of predicted values is obtained. The usability and superior performance of the PSF method for distinct univariate time-series prediction applications are discussed in References [20,21,22,23,24].
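The matching and averaging procedure described above can be sketched as follows. This is a minimal illustration under stated assumptions: the series is already labeled, the forecast averages the raw values that followed each earlier occurrence of the pattern (so no separate denormalization is needed), and the window is simply shrunk by one whenever no match is found. The function name `psf_predict` and its fallback to the series mean are hypothetical choices, not part of the published algorithm.

```python
def psf_predict(values, labels, window):
    """Forecast the next value by matching the last `window` labels in the past.

    Returns the forecast and the window size that produced a match.
    """
    n = len(labels)
    for w in range(window, 0, -1):  # shrink the window when nothing matches
        pattern = labels[n - w:]
        # Values that came right after each earlier occurrence of the pattern.
        followers = [values[i + w] for i in range(n - w)
                     if labels[i:i + w] == pattern]
        if followers:
            return sum(followers) / len(followers), w
    return sum(values) / n, 0  # assumed fallback: no pattern repeats at all


values = [1.0, 2.0, 3.0, 1.0, 2.0, 3.5, 1.0, 2.0]
labels = [1, 2, 3, 1, 2, 3, 1, 2]
forecast, used_window = psf_predict(values, labels, window=2)
# The pattern [1, 2] was followed by 3.0 and 3.5, so forecast == 3.25.
```

For multi-step forecasts, the predicted value and its label would be appended to the series and the function called again, as the text describes.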
3. Proposed Methodology
The conventional PSF algorithm has gained popularity because of its superior and promising prediction performance for univariate time series. PSF has also shown its capability in wind-power and wind-speed predictions in [25]. The methodology proposed in this paper is focused on predicting wind-power data samples framed in a time series with the assistance of corresponding wind-speed data. The prediction concept is based on the PSF algorithm. This novel methodology is proposed as an alternative to direct and indirect wind-power prediction approaches. In this methodology, the wind-power time series is predicted with modifications in conventional PSF and dataset smoothing. In contrast to state-of-the-art methods and approaches, the significant difference in the proposed approach is the utilization of both wind-power and wind-speed datasets to achieve better accuracy in wind-power predictions.
Usually, researchers have used indirect wind-power prediction approaches due to the highly chaotic nature of wind-power time series. In comparison to wind-speed time series, the respective wind-power time series are more chaotic and intermittent, and hence more difficult to predict accurately. On the other hand, indirect approaches carry additional errors accumulated in the curve fitting of power curves. The proposed approach attempts to reduce the prediction errors associated with both direct and indirect approaches. Firstly, this approach smooths down the wind-power time series with the help of the wind-speed time series by using the same labeling sequence technique as the one used in the conventional PSF algorithm. Secondly, it predicts the future values of the wind-power time series with PSF principles.
Given wind-speed and wind-power values recorded in the past at a specific interval (5, 15, 30, and 60 min) up to the day $d-1$, the prediction of future values of wind power is expected at the next few intervals (of the same precision) for day $d$. Consider that
and
are the time series composed of ‘
n’ samples of wind power and wind speed, respectively, as follows:
Similar to the procedure followed in PSF, the wind-power and wind-speed time series are each converted into a label sequence.
Let
be the labels of day
i obtained in the labeling step of the PSF method, where
K is the number of clusters.
and
are the label sequence of
W consecutive days, as follows:
The next step is to map the
sequence with the
sequence. This mapping is done with a decision matrix ($M$) that uses the Naïve Bayesian method. The purpose of this matrix is to represent the pair of each label in
with all corresponding labels from
with respective occurrence probabilities of each pair. The formulation of decision matrix (
M) is done with four parameters: labels from
at
t and
, labels from
at
t, and the probability of occurrence of respective combinations, where
t is the label sequence index (
and
).
where
stands for probability of occurrence.
Table 1 shows a sample decision matrix, where the first three columns are the combinations of labels of
,
, and
, and the fourth one is the probability of occurrence of a combination of labels. It is often the case in a decision matrix that each label in
has multiple alternatives in respective labels in
, with different probabilities of occurrence. In such cases, the Naïve Bayesian method is used to map the most suitable pairs in
and
. This mapping of labels generates a look-up table, as shown in Table 2, which is then used to smooth down the wind-power label sequence as indicated in Equation (9):
where
is the Naïve Bayesian function.
The next process is the smoothing of the wind-power label series. This process is performed with reference to the above-mentioned look-up table. Firstly, all wind-power labels are compared with the respective wind-speed labels. Cases are considered ideal wherever these pairs agree with the pairs in the look-up table, as shown in Equation (10):
For mismatched cases, the wind-power labels are replaced with the labels that the look-up table assigns to the respective wind-speed labels, as shown in Equation (11):
where
,
,
are the labels in
and
, respectively, and
is a replacement of
from the look-up table at nonideal cases.
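The look-up table and the replacement rule can be sketched as follows. This is a deliberately simplified stand-in for the decision matrix and the Naïve Bayesian mapping: it uses only labels at the same index, and it maps each wind-speed label to the wind-power label it most frequently co-occurs with. The function names and the toy label sequences are hypothetical.

```python
from collections import Counter, defaultdict


def build_lookup(speed_labels, power_labels):
    """Map each wind-speed label to its most frequently paired wind-power label."""
    counts = defaultdict(Counter)
    for s, p in zip(speed_labels, power_labels):
        counts[s][p] += 1
    return {s: c.most_common(1)[0][0] for s, c in counts.items()}


def smooth_power_labels(speed_labels, power_labels, lookup):
    """Keep pairs that agree with the look-up table; replace mismatched ones."""
    smoothed = []
    for s, p in zip(speed_labels, power_labels):
        if lookup[s] == p:
            smoothed.append(p)          # ideal case: pair matches the table
        else:
            smoothed.append(lookup[s])  # non-ideal case: take the table's label
    return smoothed


speed = [1, 1, 1, 2, 2, 2, 3, 3]
power = [1, 1, 2, 2, 2, 3, 3, 3]
lookup = build_lookup(speed, power)
smoothed = smooth_power_labels(speed, power, lookup)
# lookup   -> {1: 1, 2: 2, 3: 3}
# smoothed -> [1, 1, 1, 2, 2, 2, 3, 3]
```

In this toy example, the two wind-power labels that disagree with the dominant pairing are replaced, which removes the isolated jumps from the label sequence.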
Eventually, this removes the labels in the wind-power label sequence that are responsible for making the wind-power time series more chaotic and intermittent, and generates a smoother sequence of wind-power labels. This smoothed sequence possesses a positive but much smaller Maximum Lyapunov Exponent (MLE) than the original wind-power label sequence, as shown in
Section 4.3. The correlation coefficient between
and
is also smaller than the one between
and
. This assures that the smoothed sequence is smoother and more favorable for predicting future values than the original wind-power label sequence. The procedure of the proposed methodology is illustrated in graphical form and as a block diagram in Figure 2 and Figure 3, respectively. It is also expressed as pseudocode in Figure 4.
Furthermore, the prediction process after smoothing is adopted from the conventional PSF algorithm. It starts with the selection of the optimum window size (W). As in the conventional PSF algorithm, the last W labels of the smoothed wind-power label sequence are searched for in the whole series. The mean of the label that immediately follows each repetition of this window sequence is taken as the future value of the sequence, and it is then replaced with a value within the range of the original wind-power series through the denormalization process.
5. Conclusions
In this paper, a wind-power forecasting algorithm has been proposed, which can be considered an alternative method to direct and indirect approaches. While a direct approach directly predicts power, and an indirect approach does so with the help of power curves after previous predictions of wind speed, the proposed method combines both wind-speed and wind-power data, smooths down the resulting wind-power series, and uses them for predicting wind power in a clearly less chaotic way than existing methods do.
Multiple simulations were carried out in order to collect a body of results. Three different error measures were used to quantify how much the proposed method can be said to outperform existing ones. Our conclusions are outlined in the next few paragraphs.
Direct prediction approaches are more accurate than indirect approaches in terms of all three error measures. The crucial reason behind this observation is that power curves are based only on the average deterministic relationships between wind-speed and wind-power datasets, whereas such relationships are actually stochastic in nature. Power-curve variability is a significant factor in reducing wind-power prediction accuracy. In contrast, in the proposed method, all time instances in a wind-power time series are handled and modified individually on a case-by-case basis. This smooths down the time series and removes stochastic patterns from it to some extent.
As shown in Table 6 and discussed in the corresponding section, among the contemporary methods, ARIMA, SVM, and PSF showed the best performance for both direct and indirect approaches to wind-power prediction. However, Table 5 shows how much the proposed methodology outperforms ARIMA, SVM, PSF, and other methods for all seasons. On average, across all seasons and time horizons, the proposed method shows improvements of 22.79%, 24.65%, and 17.26% compared to ARIMA, SVM, and PSF, respectively. Similar improvement is observed for the whole one-year data.
There is scope for future developments. For instance, in this paper, the method used only values at time instants $t$ and $t-1$. A possibility is to use more previous time instants. In a way, this presents certain similarities with Markov processes, where Markov chain matrices of several orders could be established, depending on whether data of one or more previous states are taken into account when the probability of a state must be calculated.