Article

Accurate Short-Term Power Forecasting of Wind Turbines: The Case of Jeju Island’s Wind Farm

Department of Electrical Engineering, Sangmyung University, Seoul 03016, Korea
*
Author to whom correspondence should be addressed.
Energies 2017, 10(6), 812; https://doi.org/10.3390/en10060812
Submission received: 22 April 2017 / Revised: 7 June 2017 / Accepted: 13 June 2017 / Published: 15 June 2017
(This article belongs to the Special Issue Wind Generators Modelling and Control)

Abstract

Short-term wind power forecasting tells system operators how much wind power can be expected at a specific time. With the increasing penetration of wind generating resources into power grids, short-term wind power forecasting is becoming an important issue for grid integration analysis. Highly reliable wind power forecasting can contribute to the successful integration of wind generating resources into power grids. To guarantee the reliability of forecasting, power curves need to be analyzed and a forecasting method used that compensates for the variability of wind power outputs. In this paper, we analyzed the reliability of power curves at each wind speed using logistic regression. To reduce wind power forecasting errors, we proposed a short-term wind power forecasting method using a support vector machine (SVM) based on linear regression. A support vector machine is a type of supervised learning used to recognize patterns and analyze data. The proposed method was verified with empirical data collected from a wind turbine located on Jeju Island.

1. Introduction

Wind power generation has grown rapidly over the past few decades: total accumulated wind power capacity reached 319 GW in 2013 [1]. Furthermore, worldwide wind power capacity was 432.9 GW in 2015, showing that the cumulative wind power market grew by more than 17%, and capacity is expected to keep increasing, to the point where wind power will generate 4337 TWh in 2035 [1,2]. For this reason, wind power forecasting, a technique that determines the quantity of wind power output that can be expected over a given period, is becoming a salient research topic as an important component in operating power systems reliably.
Prior to forecasting wind power outputs, power curves need to be analyzed, as they represent the characteristics of wind turbine outputs. There are few studies on the reliability of power curves, for example using regression and artificial neural network models [3]; therefore, we analyzed the reliability of a power curve at each wind speed from the wind turbine's power curve and output data using logistic regression. In statistics, logistic regression measures the relationship between a categorical dependent variable and one or more independent variables by estimating probabilities based on a logistic function, which is the cumulative logistic distribution [4]. Using logistic regression, the wind power outputs at each wind speed can be classified by whether or not they fall within the error range, yielding a probability model for the reliability of power curves.
Accurate wind power forecasting is essential for handling variability, reducing uncertainty, and integrating wind power into power systems. Wind power forecasting is broadly divided into three categories by time horizon: short-term (one hour to one day), medium-term (one day to one week), and long-term (one week to one year) [5]. In particular, given the increasing supply of wind-generated power into the power grid, short-term wind power forecasting is becoming a critical issue in the safe operation of power systems. Precise short-term wind power forecasting can enable efficient power system management, improve the wind power supply level, and increase overall system reliability. Additionally, it can be utilized for power system planning such as unit commitment and dispatch scheduling [6]. Many approaches for predicting wind power have been proposed to increase the reliability of forecasted values. For wind power forecasting, there are time series models using the autoregressive moving average (ARMA) or autoregressive integrated moving average (ARIMA), and regression methods [7]. These methods have clear advantages: they forecast quickly and need few inputs. However, time series models need a large amount of data when the model is built, and the parameters are difficult to update when new data arrive. Furthermore, the regression model is limited in its ability to capture patterns and the variability of data.
Recently, many advanced approaches based on statistical and ensemble methods have been suggested to forecast wind power outputs more accurately [8,9,10,11]. In this study, we proposed the use of a support vector machine (SVM) to forecast wind power outputs. SVM is one of the most popular models in the field of machine learning [12], and is an advanced technique for classification and regression analysis. Support vector regression (SVR) applies the SVM to regression [13]; it can account for variability because the SVM is not strongly influenced by noisy data, which gives it high accuracy [14]. In this paper, we analyzed the accuracy of power curves at each wind speed using logistic regression, and also proposed short-term wind power forecasting using a support vector machine based on linear regression to reduce wind power forecasting errors. To achieve this, we used the value of the power curve at each wind speed and the accuracy of the power curve, calculated using logistic regression, as additional variables. These variables can compensate when the forecasted wind speed inputs are uncertain. As a result, it was possible to reduce the error caused by sudden changes in output. In Section 2, the mathematical theory of logistic regression and SVM is described. In Section 3 and Section 4, we analyze the accuracy of the power curve using logistic regression and forecast wind power outputs using the SVM method. In Section 5, we summarize the results of this research.

2. Mathematical Definition for Enhanced Reliability Assessments Method of Wind Turbines

2.1. Logistic Regression

Logistic regression analysis is applicable to data that do not follow the normal distribution. Using logistic regression analysis, it is possible to interpret discrete data which could not be analyzed by linear regression analysis.
Simple linear regression is a statistical method for predicting and analyzing a dependent variable that is influenced by an independent variable when the dependent variable is continuous [15]. When the dependent variable is binary rather than continuous, simple linear regression cannot be used. In simple linear regression, the relationship between a dependent variable, $Y$, and an independent variable, $X$, is assumed to follow the linear model:
$$Y_k = \beta_0 + \beta_1 x_k + \varepsilon_k, \quad k = 1, 2, \ldots, n \tag{1}$$
Here, the dependent variable and the independent variable are continuous. In this regression model, $\beta_0 + \beta_1 x_k$ is the expectation of an observation, $E(Y_k)$, and $\varepsilon_k$ is the error when the independent variable, $X$, equals $x_k$. The expectation $E(Y_k)$ is assumed to be a linear expression in the independent variable. The error, $\varepsilon_k$, is taken to follow the lognormal distribution, as wind speed cannot have negative values, and its mean can be taken as zero if the model is assumed to be unbiased. Its variance, $\sigma_\varepsilon^2$, is constant irrespective of $x_k$. Errors $\varepsilon_k$ and $\varepsilon_l$ occur independently ($k \neq l$). For a binary outcome, we could define $y_k$ as follows:
$$y_k = \begin{cases} 1, & \text{the } k\text{th variable's value is a success} \\ 0, & \text{otherwise} \end{cases} \tag{2}$$
Under this definition, $y_k$ is the realization of a random variable $Y_k$. If we apply the variable $Y_k$, which has a success probability $P(Y_k = 1) = p_k$, to simple linear regression, we get the following regression model:
$$Y_k = \beta_0 + \beta_1 x_k + \varepsilon_k = 0 \text{ or } 1, \quad k = 1, 2, \ldots, n \tag{3}$$
We can then calculate the expected value when $X$ equals $x_k$:
$$E(Y_k) = (p_k \times 1) + ((1 - p_k) \times 0) = p_k = \beta_0 + \beta_1 x_k \tag{4}$$
In this equation, the expected value is a probability when the independent variable, $X$, equals $x_k$. At this point, we encounter two problems with using simple linear regression to analyze binary data. First, the error term does not follow a normal distribution. When $Y_k$ is a binary dependent variable, the error term, $\varepsilon_k$, takes only two values: $1 - \beta_0 - \beta_1 x_k$ (when $Y_k = 1$) and $-\beta_0 - \beta_1 x_k$ (when $Y_k = 0$). Second, simple linear regression produces expectations anywhere between $-\infty$ and $+\infty$, whereas the expectation of a binary variable is a probability and must lie between 0 and 1. Therefore, simple linear regression is inadequate for binary variables.
Instead, we can construct a classification rule that predicts the binary output from the input variables. Under classification, the dependent variable is binary, and the expectation, $E(Y_k)$, represents a probability with values between 0 and 1. A binary dependent variable is therefore better described by a curve model than by a linear model. The typical curve model is the logistic model [16], expressed by Equation (5):
$$E(Y_k) = p_k = \frac{\exp(\beta_0 + \beta_1 x_k)}{1 + \exp(\beta_0 + \beta_1 x_k)} \tag{5}$$
where the right-hand side is the expectation of an observation, $E(Y_k)$, expressed as a curve model. The logistic model is used to estimate the probability of a binary response. Here, logistic regression measures the relationship between a categorical dependent variable and one or more independent variables by estimating probabilities through a logistic model [17].
We can transform the logistic model into a linear model. This can be expressed as follows:
$$\ln\left(\frac{p_k}{1 - p_k}\right) = \beta_0 + \beta_1 x_k \tag{6}$$
This transformation is called a logit transformation, which is defined by $\ln(p_k / (1 - p_k))$, where $p_k$ is a proportion.
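The relationship between the logistic model of Equation (5) and the logit transformation of Equation (6) can be checked numerically. The following is a minimal sketch, not the authors' code; the coefficient values are hypothetical and chosen only for illustration.

```python
import math

def logistic(x, beta0, beta1):
    """Equation (5): probability p_k for an input x_k."""
    z = beta0 + beta1 * x
    return 1.0 / (1.0 + math.exp(-z))  # equals exp(z) / (1 + exp(z))

def logit(p):
    """Equation (6): log odds ln(p / (1 - p))."""
    return math.log(p / (1.0 - p))

# Hypothetical coefficients, chosen only for illustration.
beta0, beta1 = -2.0, 0.5
for x in (0.0, 4.0, 10.0):
    p = logistic(x, beta0, beta1)
    # logit(p) recovers the linear predictor beta0 + beta1 * x.
    print(x, round(p, 4), round(logit(p), 4))
```

The probability always stays between 0 and 1, while its logit is linear in the input, which is why the coefficients can be estimated on the logit scale.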
Logistic regression [18] estimates the regression coefficients by maximum likelihood estimation, a method that estimates a population parameter by the value that maximizes a likelihood function. The likelihood function is a function of the sample and the population parameters.
Assume that $P(Y_k = 1 \mid X = x_k) = p(x_k; \phi)$ for some function $p$ parameterized by $\phi$. When the observations are independent, the likelihood function is expressed by Equation (7):
$$\prod_{k=1}^{n} \Pr(Y = y_k \mid X = x_k) = \prod_{k=1}^{n} p(x_k; \phi)^{y_k} \left(1 - p(x_k; \phi)\right)^{1 - y_k} \tag{7}$$
The random sample, $Y_1, Y_2, \ldots, Y_n$, is modeled as a sequence of Bernoulli trials, each with the probability function $f_i(y_i)$:
$$f_i(y_i) = \begin{cases} p_i^{y_i} (1 - p_i)^{1 - y_i}, & y_i = 0 \text{ or } 1 \\ 0, & \text{otherwise} \end{cases} \tag{8}$$
If each trial has its own probability of success, $p_k$, the likelihood is expressed by Equation (9):
$$L(\beta_0, \beta_1; y_1, y_2, \ldots, y_n) = \prod_{k=1}^{n} p_k^{y_k} (1 - p_k)^{1 - y_k} \tag{9}$$
Equation (9) shows that the likelihood is the same as the joint probability function of the random sample.
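To make the maximum likelihood step concrete, the sketch below evaluates the logarithm of Equation (9) and maximizes it by plain gradient ascent. This is an illustrative assumption: the paper does not state which optimizer was used, and the toy data, learning rate, and step count here are made up.

```python
import math

def log_likelihood(b0, b1, xs, ys):
    """Logarithm of Equation (9) for binary outcomes ys at inputs xs."""
    total = 0.0
    for x, y in zip(xs, ys):
        p = 1.0 / (1.0 + math.exp(-(b0 + b1 * x)))
        total += y * math.log(p) + (1 - y) * math.log(1.0 - p)
    return total

def fit_logistic(xs, ys, rate=0.1, steps=5000):
    """Maximize the log-likelihood by gradient ascent (illustrative only)."""
    b0 = b1 = 0.0
    n = len(xs)
    for _ in range(steps):
        g0 = g1 = 0.0
        for x, y in zip(xs, ys):
            p = 1.0 / (1.0 + math.exp(-(b0 + b1 * x)))
            g0 += y - p          # derivative with respect to b0
            g1 += (y - p) * x    # derivative with respect to b1
        b0 += rate * g0 / n
        b1 += rate * g1 / n
    return b0, b1

# Made-up, non-separable toy data (not from the paper).
xs = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
ys = [0, 0, 1, 0, 1, 1]
b0, b1 = fit_logistic(xs, ys)
print(b0, b1, log_likelihood(b0, b1, xs, ys))
```

Statistical packages typically use faster iterative schemes for the same objective; gradient ascent is used here only because it is short and easy to follow.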
In logistic regression [19], interpreting a regression coefficient is difficult because the regression model is not linear. Therefore, we use the concept of odds to understand a regression coefficient.
Suppose the numerical values 0 and 1 are assigned to the two outcomes of a binary variable, where 0 signifies a negative response and 1 signifies a positive response. When $p_k$ is the proportion of observations with an outcome of 1, then $1 - p_k$ is the proportion with an outcome of 0. The ratio of these proportions is called the odds, which is represented by Equation (10):
$$\text{odds} = \frac{p_k}{1 - p_k} = \frac{\text{probability of success}}{\text{probability of failure}} \tag{10}$$
When the probability of success is higher than the probability of failure, the odds are greater than 1. Conversely, when the probability of failure is higher than the probability of success, the odds are less than 1.
In logistic regression [20], the regression model uses the log odds, obtained by applying the natural logarithm to the odds. When the predicted probability is $\hat{p}_k$, we can estimate the log odds as follows:
$$\ln\left(\frac{\hat{p}_k}{1 - \hat{p}_k}\right) = b_0 + b_1 x_k \tag{11}$$
Applying the exponential function to both sides gives the transformed odds, as in Equation (12):
$$\frac{\hat{p}_k}{1 - \hat{p}_k} = \exp(b_0) \exp(b_1 x_k) \tag{12}$$
In this equation, the predicted odds are multiplied by $\exp(b_1)$ for each unit increase in $x_k$. If $x_k$ is 0, the predicted odds equal $\exp(b_0)$, so the intercept, $b_0$, is the predicted log odds.
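The interpretation of the slope as a multiplier on the odds follows directly from Equation (12). A minimal numerical check, using hypothetical coefficients that are not estimates from the paper:

```python
import math

def predicted_odds(x, b0, b1):
    """Equation (12): odds = exp(b0) * exp(b1 * x)."""
    return math.exp(b0) * math.exp(b1 * x)

# Hypothetical coefficients, for illustration only.
b0, b1 = -2.0, 0.5
# Ratio of the odds at two inputs one unit apart (7 and 8).
odds_ratio = predicted_odds(8.0, b0, b1) / predicted_odds(7.0, b0, b1)
print(odds_ratio, math.exp(b1))  # the two values agree up to rounding
```

The ratio does not depend on which pair of adjacent inputs is chosen, which is what makes $\exp(b_1)$ a convenient summary of the fitted model.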

2.2. Support Vector Regression (SVR)

As discussed in Section 1, the support vector machine (SVM) is one of the most popular models in the field of machine learning and is an advanced technique for classification and regression analysis. SVM is not strongly influenced by noisy data and has high accuracy; furthermore, this model is easier to use than other machine learning approaches [21], and has been used for wind power forecasting [22]. Typically, the basis for training an appropriate model is a dataset represented as follows:
$$D = \{(x_1, y_1), (x_2, y_2), \ldots, (x_n, y_n)\} \subset \mathbb{R}^d \times K \tag{13}$$
where $\mathbb{R}^d$ denotes the space of the input patterns; the dataset, $D$, consists of labeled patterns; and $K$ signifies a discrete space in classification scenarios and $\mathbb{R}$ in regression scenarios. The purpose of the learning process is to find a prediction function, $f : \mathbb{R}^d \to K$, that maps unseen patterns to reasonable labels, which are real values in regression. The time series data are the outputs measured at times $1 \le t \le n$, and these measured outputs form the dataset $D$ in Equation (13).
To extend the SVM [21] to cases in which the data are not linearly separable, we use the hinge loss function:
$$\max\left(0, 1 - y_i(\omega \cdot x_i + b)\right) \tag{14}$$
This function is zero if the following constraint is satisfied:
$$y_i(\omega \cdot x_i + b) \ge 1, \quad (1 \le i \le n) \tag{15}$$
In Equations (14) and (15), each $y_i$ is either 1 or $-1$, indicating the class to which the point $x_i$, a $p$-dimensional real vector, belongs. In addition, $\omega$ is the normal vector to the hyperplane, and the loss is zero when $x_i$ lies on the correct side of the margin [19]. For data on the incorrect side of the margin, the function's value is proportional to the distance from the margin. Therefore, we need to minimize the loss function, as follows:
$$\text{minimize} \left\{ \left[ \frac{1}{n} \sum_{i=1}^{n} \max\left(0, 1 - y_i(\omega \cdot x_i + b)\right) \right] + \lambda \lVert \omega \rVert^2 \right\} \tag{16}$$
where $\lambda$ sets the trade-off between increasing the margin size and ensuring that each $x_i$ lies on the correct side of the margin. Thus, for sufficiently small values of $\lambda$, the soft-margin SVM will still learn a viable classification rule [22,23].
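The hinge loss and the regularized soft-margin objective described above can be evaluated directly. The sketch below only computes the objective for a given hyperplane; it does not perform the minimization, and the points, labels, and hyperplane are made-up values.

```python
def hinge_loss(w, b, x, y):
    """max(0, 1 - y * (w . x + b)) for a label y in {-1, +1}."""
    score = sum(wi * xi for wi, xi in zip(w, x)) + b
    return max(0.0, 1.0 - y * score)

def soft_margin_objective(w, b, xs, ys, lam):
    """Mean hinge loss over the data plus lam * ||w||^2."""
    mean_loss = sum(hinge_loss(w, b, x, y) for x, y in zip(xs, ys)) / len(xs)
    return mean_loss + lam * sum(wi * wi for wi in w)

# Made-up points: two outside the margin, one inside it.
xs = [(2.0, 0.0), (-2.0, 0.0), (0.5, 0.0)]
ys = [1, -1, 1]
w, b = (1.0, 0.0), 0.0
print([hinge_loss(w, b, x, y) for x, y in zip(xs, ys)])
print(soft_margin_objective(w, b, xs, ys, lam=0.1))
```

Only the third point, which lies inside the margin, contributes a non-zero loss; the regularization term penalizes large $\lVert \omega \rVert$, which corresponds to a narrow margin.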
The SVR is a method to carry out a SVM based on regression. The SVR is divided into three types: a linearly separable SVR, a linearly inseparable SVR, and a nonlinear SVR [24].
In this paper, a nonlinear SVR was used to forecast wind power outputs. To do so, we used historical data for wind power output and wind speed data for training, in addition to the power curve’s value and accuracy data for correcting errors.

3. Reliability of Power Curve Estimation Using Logistic Regression

As discussed in Section 1, power curves first need to be analyzed, as they represent the characteristics of a wind turbine's outputs. In this section, we describe the construction of a statistical model to estimate the reliability of power curves, based on logistic regression and built in R version 3.3.1. The data were wind power outputs and wind speeds measured at a single turbine on Jeju Island and recorded as one-minute averaged values. We converted the one-minute data into 10 min and one-hour averaged data by averaging 10 and 60 consecutive one-minute values, respectively. Wind speeds were measured at the nacelle. The turbine was made by the HANJIN Industry Corporation (Seoul, Korea), model number HJWT2000, with a capacity of 2000 kW.
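The averaging step can be sketched as follows. This is an assumption about the procedure, not the authors' code (which was written in R): blocks are taken as consecutive and non-overlapping, a trailing partial block is dropped, and the input series is synthetic.

```python
def block_average(values, block):
    """Average consecutive, non-overlapping blocks of `block` samples.

    A trailing partial block is dropped (an assumption of this sketch).
    """
    n_blocks = len(values) // block
    return [sum(values[i * block:(i + 1) * block]) / block
            for i in range(n_blocks)]

# Synthetic one-minute series covering two hours (not measured data).
one_minute_kw = [float(i % 60) for i in range(120)]
ten_minute_kw = block_average(one_minute_kw, 10)  # 12 values
hourly_kw = block_average(one_minute_kw, 60)      # 2 values
print(len(ten_minute_kw), len(hourly_kw))
```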

3.1. Jeju Island’s Empirical Output Data and Power Curve from Manufacturer

To perform this estimation, we used outputs measured from November to December 2015 from Jeju Island (the corresponding area is not specified to protect the security of the technical data).
Since the reliability analysis of the power curve must be performed before the output prediction, we used data from before the forecast period. Figure 1 shows the turbine's power curve from the manufacturer and the measured outputs. As power curves normally give a mean power value over 10 min, we constructed a power curve by averaging the one-minute data into 10 min data.
As the power curve from the manufacturer did not correspond with the measured output data, we used logistic regression to estimate the reliability of the power curve.

3.2. Classify Output Data Based on the Power Curve

To fit a logistic regression, we classified the output data based on the power curve from the manufacturer. Each output was classified by whether or not it fell within ±20% of the power curve's value at the corresponding wind speed. The classification band is represented by dashed lines in Figure 1. As measured power outputs frequently vary by ±20% around the power curve [22], we classified the outputs into two categories according to whether they lay within this ±20% band. We applied this assumed error range across the entire wind speed range.
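The classification rule can be written as a single comparison. In this sketch, treating the band edges as inclusive is an assumption, and the sample values are invented rather than taken from the Jeju Island data.

```python
def within_band(measured_kw, curve_kw, tolerance=0.20):
    """Return 1 if the measured output lies within +/- tolerance of the
    power-curve value at the same wind speed, otherwise 0."""
    return 1 if abs(measured_kw - curve_kw) <= tolerance * curve_kw else 0

# Invented (measured, power-curve) pairs in kW.
samples = [(950.0, 1000.0), (790.0, 1000.0), (1100.0, 1000.0), (1250.0, 1000.0)]
labels = [within_band(m, c) for m, c in samples]
print(labels)
```

The resulting 0/1 labels, paired with the wind speed of each sample, are the binary responses that the logistic regression is fitted to.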

3.3. Reliability of Power Curve Estimation

After classification, we obtained values of 1 and 0, which are shown in Figure 2.
As seen in Figure 2, the classified data were non-linear and binary, which meant that we could not use a linear regression model to analyze them. Therefore, we used logistic regression, which can handle such data, to build a statistical model. This model (represented by the red line) was used to estimate the reliability of the manufacturer's power curve. The ±20% band narrows as wind speed decreases and tends to zero as the cut-in speed is approached. Wind measurement errors do not shrink in proportion to wind speed, so power curve reliability at low wind speeds could be underestimated. For this reason, we fixed the reliability at a minimum value when the wind speed falls below a threshold (6 m/s), which is represented by the blue dashed line. The “X” label signifies wind speed, and the “Y” label signifies the probability that outputs exist within a tolerance band. Using this statistical model, we could estimate the reliability of a power curve at each wind speed. This is represented in Table 1.
We estimated the reliability of the power curve at each wind speed using this statistical model (Figure 2), which was deduced by logistic regression.
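One way to express the fitted model together with the fixed minimum below 6 m/s is to clamp the wind speed before evaluating the logistic curve. This is a sketch under that assumption; the coefficients are hypothetical and are not the values estimated in the paper.

```python
import math

def curve_reliability(wind_speed, b0, b1, floor_speed=6.0):
    """Logistic reliability estimate, held constant below floor_speed."""
    v = max(wind_speed, floor_speed)  # speeds under the floor reuse its value
    return 1.0 / (1.0 + math.exp(-(b0 + b1 * v)))

# Hypothetical coefficients, for illustration only.
b0, b1 = -1.0, 0.4
for v in (4.0, 6.0, 12.0, 20.0):
    print(v, round(curve_reliability(v, b0, b1), 4))
```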

4. Wind Power Forecasting by Using SVR

As discussed in Section 2, we used a SVM based on a multi-variable regression to forecast wind power outputs [25]. Before any actual wind power forecasting, we analyzed Jeju Island’s output data and independent variables, and used the empirical data including wind power outputs, wind speed, power curve’s value and accuracy for training the SVM and forecasting wind power outputs.

4.1. Analysis of Empirical Data

To perform the wind power forecasting, we used outputs measured in January 2016 on Jeju Island.
As seen in Figure 3, the wind power outputs were highly variable. To forecast wind power outputs, we considered three independent variables: wind speed, the value of the power curve at each wind speed, and the accuracy of the power curve. The forecasting of wind power outputs followed the process shown in Figure 4. The accuracy of the turbine's power curve was calculated by applying the wind speed, outputs, and turbine's power curve data to logistic regression analysis. The accuracy of the power curve was used as an auxiliary variable for error correction. After estimating the accuracy of the power curve, SVM model training was performed using one week of data on wind speed, power, power curve, and power curve accuracy. Through training, the model learned patterns in the past data, and when input data arrived, the model derived results from the learned patterns. In this paper, the forecasted wind speed was used as the input data to perform the output forecast.
If only a single variable is considered, a large error can occur when that variable is wrong in the input data. Therefore, multiple variables need to be considered. We used the value of the power curve at each wind speed and the accuracy of the power curve, calculated using logistic regression, as additional variables. These variables compensate when there is uncertainty in the forecasted wind speed inputs.
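The three input variables can be assembled into one feature vector per forecasted wind speed. The sketch below is an assumption about how this might be done: the power-curve points are hypothetical (not the HJWT2000 curve), linear interpolation between points is assumed, and `reliability` stands for any function returning the estimated power curve accuracy.

```python
def curve_value(speed, curve):
    """Linearly interpolate a power curve given as sorted (m/s, kW) pairs."""
    if speed <= curve[0][0]:
        return curve[0][1]
    for (s0, p0), (s1, p1) in zip(curve, curve[1:]):
        if speed <= s1:
            return p0 + (p1 - p0) * (speed - s0) / (s1 - s0)
    return curve[-1][1]

def features(speed, curve, reliability):
    """Feature vector: wind speed, power-curve value, power-curve accuracy."""
    return [speed, curve_value(speed, curve), reliability(speed)]

# Hypothetical power curve, for illustration only.
curve = [(3.0, 0.0), (6.0, 300.0), (12.0, 2000.0), (25.0, 2000.0)]
row = features(9.0, curve, lambda v: 0.9)  # constant accuracy as a stand-in
print(row)
```

One such row per time step, together with the measured output as the target, would form the training set for the regression model.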

4.2. Jeju Island’s Wind Power Outputs Forecasting

The training period was seven days long, as longer learning periods can become biased towards past data. After training, we forecasted wind power outputs 24 h ahead, for the month of January 2016. In all cases, the results were slightly improved, but it was difficult to present all days. Therefore, the two best cases were selected. In the first case, we used historical data measured from 1 to 7 January 2016 on Jeju Island to train the SVM using the R language and the “e1071” package [26,27,28]. After training, we forecasted wind power outputs for 8 January. The SVR model parameters are listed in Table 2.
The parameters of the SVR model listed in Table 2 were constructed through training using data from 1 to 7 January 2016. By applying these parameters, we forecasted wind power outputs for 8 January 2016. The results are in Figure 5. As shown in Figure 5, forecasted values using the SVR model and the proposed method (using the SVR model based on multi variables) were similar to the measured values. Both the SVR model and SVM based on multi-variable regression were highly accurate.
Table 3 shows the accuracy of the forecasted output values. Both forecasting methods exhibited high accuracy, but the proposed method was more accurate. For the second case, we used historical data measured from 4 to 10 January 2016 on Jeju Island to train the SVM. After training, we forecasted wind power outputs for 11 January (the forecasted values are given in Appendix A).
The SVR model parameters (which were set for forecasting wind power outputs on 11 January 2016) are represented in Table 4.
The parameters listed in Table 4 are those of the SVR model constructed through training using data from 4 to 10 January 2016. By applying these parameters, we forecasted wind power outputs for 11 January 2016. The results are shown in Figure 6.
In Figure 6, the measured values are represented with a black line; forecasted values using the SVR model based on wind power outputs and wind speed are represented as a red line; and forecasted values based on the proposed method are represented with a blue line (The forecasted values are added to Appendix A). Both forecasting models exhibited high accuracy, but the proposed method generated more accurate forecast values.
To confirm this performance numerically, we calculated the correlation, R-squared, and RMSE. Table 5 shows the accuracy of the forecasted output values.
As shown in Table 3 and Table 5, the proposed method achieved only a small improvement overall. However, it reduced the error caused by sudden changes in output.
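The three accuracy measures can be computed as sketched below. The paper does not state the exact definitions it used, so these are common ones taken as assumptions: Pearson correlation, R-squared as one minus the ratio of residual to total sum of squares, and root mean square error. The sample values are invented.

```python
import math

def rmse(measured, forecast):
    """Root mean square error between two equal-length series."""
    n = len(measured)
    return math.sqrt(sum((m - f) ** 2 for m, f in zip(measured, forecast)) / n)

def correlation(measured, forecast):
    """Pearson correlation coefficient."""
    n = len(measured)
    mean_m = sum(measured) / n
    mean_f = sum(forecast) / n
    cov = sum((m - mean_m) * (f - mean_f) for m, f in zip(measured, forecast))
    var_m = sum((m - mean_m) ** 2 for m in measured)
    var_f = sum((f - mean_f) ** 2 for f in forecast)
    return cov / math.sqrt(var_m * var_f)

def r_squared(measured, forecast):
    """Coefficient of determination, 1 - SS_res / SS_tot (assumed definition)."""
    mean_m = sum(measured) / len(measured)
    ss_res = sum((m - f) ** 2 for m, f in zip(measured, forecast))
    ss_tot = sum((m - mean_m) ** 2 for m in measured)
    return 1.0 - ss_res / ss_tot

# Invented values: the forecast is the measurement shifted by a constant.
measured = [1.0, 2.0, 3.0, 4.0]
shifted = [2.0, 3.0, 4.0, 5.0]
print(rmse(measured, shifted), correlation(measured, shifted), r_squared(measured, shifted))
```

The shifted example shows why the measures are reported together: a constant offset leaves the correlation at 1 but is penalized by both RMSE and this definition of R-squared.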

5. Conclusions

Short-term wind power forecasting is an important technique, as it informs system operators of how much wind power can be expected at a specific time. As the penetration of wind generating resources into power grids increases, short-term wind power forecasting is becoming an important issue for grid integration analysis. To guarantee the reliability of forecasting, power curves need to be analyzed, and a forecasting method selected that compensates for the variability of wind power outputs. In this paper, we proposed an enhanced reliability assessment of power curves at each wind speed using logistic regression, and used this estimated reliability to improve the accuracy of outputs predicted from the power curve and wind speed. A support vector machine is a kind of supervised learning method for recognizing patterns and analyzing data; we therefore proposed a method for forecasting wind power outputs using an SVM based on multi-variable regression to increase reliability.
The proposed method was verified with empirical data from a wind turbine located on Jeju Island. We used limited data and one wind turbine size. These limitations can be addressed when additional data are available. We considered historical data including wind power output, wind speed, power curve, and the accuracy of the power curve for training. During training, the power curve and its accuracy served to correct errors. After training, we forecasted wind power outputs over the next 24 h. The forecasts obtained using an SVM based on multi-variable regression were more accurate than those from the SVM based on single-variable regression (a review of an additional turbine is summarized in Appendix A). Thus, the proposed method for estimating accuracy and forecasting outputs can provide reliable predictions of wind power outputs to power system operators.

Acknowledgments

This work was supported by the Korea Institute of Energy Technology Evaluation and Planning (KETEP) and the Ministry of Trade, Industry & Energy (MOTIE) of the Republic of Korea (No. 20161210200560).

Author Contributions

Jin Hur conceived and designed the overall research; BeomJun Park developed the accurate short-term power forecasting model and conducted the experimental simulation; Jin Hur and BeomJun Park wrote the paper; and Jin Hur guided the research direction and supervised the entire research process.

Conflicts of Interest

The authors declare no conflict of interest.

Appendix A

Table A1 shows the forecasted values based on those represented in Figure 5. Wind power forecasting using a SVM based on a multi-variable regression model was more accurate than forecasting using a SVM based on a single-variable regression model.
Table A1. Measured and forecasted wind power outputs on 8 January 2016.

| Hour | Measured [kW] | SVR (Single Variable) [kW] | SVR (Multi Variable) [kW] |
|---|---|---|---|
| 1 | 296.9598 | 267.4591 | 300.9246 |
| 2 | 341.4116 | 312.1750 | 332.6964 |
| 3 | 323.0441 | 294.2303 | 319.6960 |
| 4 | 271.0491 | 230.3204 | 268.9342 |
| 5 | 232.4090 | 197.8502 | 225.4609 |
| 6 | 199.9801 | 185.1180 | 199.5423 |
| 7 | 171.8569 | 170.9805 | 168.4929 |
| 8 | 95.5342 | 99.6494 | 100.8628 |
| 9 | 29.3689 | 25.4090 | 18.6278 |
| 10 | 7.5669 | 23.0390 | 11.4897 |
| 11 | 30.6104 | 58.0121 | 55.5026 |
| 12 | 50.0074 | 54.4576 | 59.8538 |
| 13 | 54.6308 | 48.3745 | 61.4500 |
| 14 | 92.1981 | 95.0139 | 92.6552 |
| 15 | 93.1122 | 95.7353 | 94.0444 |
| 16 | 64.3447 | 68.4379 | 78.6010 |
| 17 | 38.1156 | 43.8451 | 55.2460 |
| 18 | 15.6717 | 13.7728 | 21.5177 |
| 19 | 5.7529 | 6.9873 | 11.3017 |
| 20 | 8.7146 | 22.3561 | 24.5019 |
| 21 | 38.1685 | 44.7369 | 37.8291 |
| 22 | 29.4539 | 25.2614 | 23.6915 |
| 23 | 0.0000 | 12.8945 | 10.0190 |
| 24 | 31.5425 | 33.3151 | 29.0014 |
Table A2 shows the forecasted values based on those shown in Figure 6. Unsurprisingly, wind power forecasting using a SVM based on a multi-variable regression model was more accurate. Again, we calculated correlation, R-squared, and RMSE to confirm the accuracy numerically.
Table A2. Measured and forecasted wind power outputs on 11 January 2016.

| Hour | Measured [kW] | SVR (Single Variable) [kW] | SVR (Multi Variable) [kW] |
|---|---|---|---|
| 1 | 11.8804 | 15.1721 | 14.6157 |
| 2 | 43.2991 | 38.5229 | 46.7491 |
| 3 | 56.2944 | 46.1228 | 54.6041 |
| 4 | 16.8877 | 34.2175 | 22.3426 |
| 5 | 51.2966 | 70.6714 | 55.3924 |
| 6 | 130.8975 | 123.2936 | 123.3551 |
| 7 | 180.7022 | 171.4096 | 171.4022 |
| 8 | 201.9767 | 189.8895 | 189.9328 |
| 9 | 182.3402 | 178.1740 | 177.7160 |
| 10 | 149.3459 | 146.7730 | 147.3944 |
| 11 | 100.2452 | 87.8356 | 86.8331 |
| 12 | 154.6401 | 144.0491 | 144.8370 |
| 13 | 258.8261 | 240.4598 | 249.1183 |
| 14 | 247.9340 | 226.3591 | 231.6409 |
| 15 | 233.1442 | 214.8088 | 217.2803 |
| 16 | 253.4962 | 217.0605 | 250.0018 |
| 17 | 267.1652 | 227.2595 | 242.8441 |
| 18 | 244.9381 | 225.9953 | 231.6946 |
| 19 | 203.7053 | 187.9981 | 187.2971 |
| 20 | 259.7748 | 220.0076 | 223.5598 |
| 21 | 309.8671 | 269.3476 | 280.8379 |
| 22 | 277.4633 | 247.3862 | 256.8524 |
| 23 | 280.6704 | 258.5844 | 269.1221 |
| 24 | 386.0640 | 364.1557 | 359.7852 |
We calculated the accuracy of the power curves for additional turbines which were the same as the existing wind turbine. The results of the accuracy are shown in Figure A1 and Table A3.
Figure A1. The statistical model for estimating accuracy of power curves for additional turbine using logistic regression (The red line is statistical model and the blue line is fixed minimum value).
We also applied the same method to the additional turbine to perform wind forecasting on 8 January 2016. The following results were obtained.
Table A3. The probability that outputs for additional turbine exist within a tolerance band.

| Wind Speed [m/s] | Accuracy [%] |
|---|---|
| 4.0 | 74.97 |
| 5.0 | 74.97 |
| 6.0 | 74.97 |
| 7.0 | 80.57 |
| 8.0 | 85.57 |
| 9.0 | 88.66 |
| 10.0 | 91.71 |
| 11.0 | 94.48 |
| 12.0 | 96.15 |
| 13.0 | 97.07 |
| 14.0 | 97.82 |
| 15.0 | 98.33 |
| 16.0 | 99.58 |
| 17.0 | 99.12 |
| 18.0 | 99.33 |
| 19.0 | 99.57 |
| 20.0 | 99.69 |
Table A4 shows the forecasted values based on those shown in Figure A2.
Table A4. Measured and forecasted wind power outputs to additional turbine on 8 January 2016.

| Hour | Measured [kW] | SVR (Single Variable) [kW] | SVR (Multi Variable) [kW] |
|---|---|---|---|
| 1 | 332.540 | 311.150 | 308.801 |
| 2 | 346.513 | 335.553 | 320.253 |
| 3 | 325.885 | 323.321 | 299.133 |
| 4 | 278.277 | 265.841 | 239.944 |
| 5 | 238.735 | 217.666 | 200.565 |
| 6 | 202.201 | 200.712 | 189.074 |
| 7 | 173.578 | 171.336 | 175.348 |
| 8 | 103.105 | 106.735 | 108.191 |
| 9 | 35.696 | 27.867 | 26.605 |
| 10 | 21.520 | 25.571 | 24.042 |
| 11 | 44.146 | 64.988 | 59.111 |
| 12 | 57.347 | 66.005 | 58.541 |
| 13 | 64.183 | 70.684 | 64.431 |
| 14 | 100.914 | 95.935 | 97.406 |
| 15 | 101.308 | 98.109 | 103.570 |
| 16 | 71.782 | 84.313 | 69.181 |
| 17 | 47.644 | 58.474 | 46.167 |
| 18 | 17.291 | 26.321 | 28.347 |
| 19 | 12.912 | 13.278 | 17.640 |
| 20 | 13.503 | 31.426 | 24.212 |
| 21 | 45.669 | 43.895 | 47.083 |
| 22 | 35.911 | 28.939 | 30.082 |
| 23 | 6.901 | 20.691 | 22.567 |
| 24 | 36.306 | 32.960 | 40.109 |
Figure A2. Measured and forecasted wind power outputs to additional turbine on 8 January 2016.
Table A5 shows the accuracy of forecasted values.
Table A5. Accuracy of SVR model and Persistence method (8 January 2016).

| Method | Correlation | R-Squared | RMSE |
|---|---|---|---|
| SVR (single variable) | 0.9966 | 0.9931 | 15.8013 |
| SVR (multi variable) | 0.9970 | 0.9937 | 10.7970 |
| Persistence method | 0.9656 | 0.9325 | 32.0529 |

References

  1. REN21. Renewable 2015 Global Status Report; Renewable Energy Policy Network for the 21th Century (REN21): Paris, France, 2015. [Google Scholar]
  2. IEA. World Energy Outlook 2015; International Energy Agency: Paris, France, 2015. [Google Scholar]
  3. O’Hair, E.; Giesselmann, M.G. Comparative analysis of regression and artificial neural network models for wind turbine power curve estimation. J. Sol. Energy Eng. 2001, 123, 327–332. [Google Scholar]
  4. Department of Statistics. Available online: http://www.stat.cmu.edu/~cshalizi/uADA/12/lectures/ch12.pdf (accessed on 8 June 2017).
  5. Chang, W.-Y. A literature review of wind forecasting methods. J. Power Energy Eng. 2014, 2, 161–168. [Google Scholar] [CrossRef]
  6. Foley, A.M.; Leahy, P.G.; McKegh, E.J. Wind power forecasting & prediction methods. In Proceedings of the 9th International Conference on Environment and Electrical Engineering, Prague, Czech Republic, 16–19 May 2010. [Google Scholar]
  7. Abdelaziz, A.Y.; Rahman, M.A.; El-Khayat, M.M.; Hakim, M.A. Short term wind power forecasting using autoregressive integrated moving average approach. J. Energy Power Eng. 2013, 7, 2089. [Google Scholar]
  8. Pinson, P.; Madsen, H.; Nielsen, H.A.; Papaefthymiou, G.; Klöckl, B. From probabilistic forecasts to statistical scenarios of short-term wind power production. Wind Energy 2008, 12, 51–62. [Google Scholar] [CrossRef]
  9. Pinson, P.; Madsen, H.; Giebel, G.; Kariniotakis, G.; von Bremen, L.; Marti, I. Forecasting of wind generation: Recent advances and future challenges. In Proceedings of the European Wind Energy Conference and Exhibition, Milan, Italy, 7–10 May 2007. [Google Scholar]
  10. Costa, A.; Crespo, A.; Navarro, J.; Lizcano, G.; Madsen, H.; Feitosa, E. A review on the young history of the wind power short-term prediction. Renew. Sustain. Energy Rev. 2007, 12, 1725–1744. [Google Scholar] [CrossRef]
  11. Choi, S.Y.; Han, K.Y.; Kim, B.H. Comparison of different multiple linear regression models for real-time flood stage forecasting. J. Korean Soc. Civ. Eng. 2012, 32, 9–20. [Google Scholar]
  12. Law, M. A Simple Introduction to Support Vector Machines. Available online: https://www.cise.ufl.edu/class/cis4930fa15idm/notes/intro_svm_new.pdf (accessed on 8 June 2017).
  13. Drucker, H.; Burges, C.J.C.; Kaufman, L.; Smola, A.; Vapnik, V. Support vector regression machines. Adv. Neural Inf. Proc. Syst. 1996, 9, 155–161. [Google Scholar]
  14. Hearst, M.A.; Dumais, S.T.; Osuna, E.; Platt, J.; Scholkopf, B. Support vector machines. IEEE Intell. Syst. Appl. 1998, 13, 18–28. [Google Scholar] [CrossRef]
  15. Press, S.J.; Wilson, S. Choosing between logistic regression and discriminant analysis. J. Am. Statist. Assoc. 1978, 73, 699–705. [Google Scholar] [CrossRef]
  16. Hellevik, O. Linear versus logistic regression when the dependent variable is a dichotomy. Qual. Quant. 2009, 43, 59–74. [Google Scholar] [CrossRef]
  17. Peduzzi, P.; Concato, J.; Kemper, E.; Holford, T.R.; Feinstein, A.R. A simulation study of the number of events per variable in logistic regression analysis. J. Clin. Epidemiol. 1996, 49, 1373–1379. [Google Scholar] [CrossRef]
  18. Friedman, J.; Hastie, T.; Tibshirani, R. Additive logistic regression: A statistical view of boosting (with discussion and a rejoinder by the authors). Ann. Statist. 2000, 28, 337–407. [Google Scholar] [CrossRef]
  19. Jordan, A. On discriminative vs. generative classifiers: A comparison of logistic regression and naive bayes. Adv. Neural Inf. Proc. Syst. 2002, 14, 841–848. [Google Scholar]
  20. Hayes, A.F.; Matthes, J. Computational procedures for probing interactions in OLS and logistic regression: SPSS and SAS implementations. Behav. Res. Methods 2009, 41, 924–936. [Google Scholar] [CrossRef] [PubMed]
  21. Burges, C.J.C. A tutorial on support vector machines for pattern recognition. Data Min. Knowl. Discov. 1998, 2, 121–167. [Google Scholar] [CrossRef]
  22. Du, Y.; Lu, J.; Li, Q.; Deng, Y.L. Short-term wind speed forecasting of wind farm based on least square-support vector machine. Power Syst. Technol. 2008, 32, 62–66. [Google Scholar]
  23. Lei, M.; Shiyan, L.; Jiang, C.; Liu, H.; Zhang, Y. A review on the forecasting of wind speed and generated power. Renew. Sustain. Energy Rev. 2009, 13, 915–920. [Google Scholar] [CrossRef]
  24. Zhu, L. Support Vector Machines. Available online: http://www.pstat.ucsb.edu/student%20seminar%20doc/Kernel%20SVM.pdf (accessed on 8 June 2017).
  25. Liu, Y.; Shi, J.; Yang, Y.; Lee, W.-J. Short-term wind-power prediction based on wavelet transform—Support vector machine and statistic-characteristics analysis. IEEE Trans. Ind. Appl. 2012, 48, 1136–1141. [Google Scholar] [CrossRef]
  26. Meyer, D. Support Vector Machines. Available online: http://dictionnaire.sensagent.leparisien.fr/SUPPORT%20VECTOR%20MACHINE/en-en/ (accessed on 8 June 2017).
  27. Meyer, D.; Technikum Wien, F.H. Support Vector Machines—The Interface to Libsvm in Package e1071. Available online: https://cran.r-project.org/web/packages/e1071/vignettes/svmdoc.pdf (accessed on 8 June 2017).
  28. Support Vector Regression in R. Available online: http://www.svm-tutorial.com/2014/10/support-vector-regression-r/ (accessed on 8 June 2017).
Figure 1. The wind turbine’s power curve from the manufacturer (red line), ±20% of the power curve’s values (orange dashed lines) and measured outputs (black circles) from a wind turbine located on Jeju Island.
Figure 2. The statistical model for estimating the reliability of power curves using logistic regression (the red line is the statistical model and the blue line is the fixed minimum value).
Figure 3. Wind power outputs from Jeju Island in January 2016.
Figure 4. The process of wind power forecasting using logistic regression and the support vector regression (SVR) method.
Figure 5. Measured and forecasted wind power outputs on 8 January 2016.
Figure 6. Measured and forecasted wind power outputs on 11 January 2016.
Table 1. The probability that outputs exist within a tolerance band.

| Wind Speed [m/s] | Accuracy [%] |
|---|---|
| 4.0 | 74.17 |
| 5.0 | 74.17 |
| 6.0 | 74.17 |
| 7.0 | 80.03 |
| 8.0 | 84.63 |
| 9.0 | 88.35 |
| 10.0 | 91.29 |
| 11.0 | 93.70 |
| 12.0 | 95.30 |
| 13.0 | 96.56 |
| 14.0 | 97.48 |
| 15.0 | 98.15 |
| 16.0 | 98.74 |
| 17.0 | 99.03 |
| 18.0 | 99.30 |
| 19.0 | 99.50 |
| 20.0 | 99.62 |
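The shape of Table 1 is what a logistic model with a fixed minimum value (as described in the Figure 2 caption) would produce: a constant 74.17% up to 6 m/s, then an S-shaped rise toward 100%. The sketch below illustrates this by fitting a straight line to the log-odds of the tabulated values from 7 to 20 m/s. The coefficients are recovered from the rounded table entries and are not the authors' fitted values. The names `FLOOR`, `fit_line` and `reliability` are hypothetical.

```python
import math

FLOOR = 0.7417  # fixed minimum value, taken from the 4 to 6 m/s rows of Table 1

# Table 1 entries above the floor, as fractions (wind speed in m/s -> probability).
table1 = {7.0: 0.8003, 8.0: 0.8463, 9.0: 0.8835, 10.0: 0.9129, 11.0: 0.9370,
          12.0: 0.9530, 13.0: 0.9656, 14.0: 0.9748, 15.0: 0.9815, 16.0: 0.9874,
          17.0: 0.9903, 18.0: 0.9930, 19.0: 0.9950, 20.0: 0.9962}


def logit(p):
    """Log-odds of a probability p in (0, 1)."""
    return math.log(p / (1.0 - p))


def fit_line(xs, ys):
    """Ordinary least-squares slope and intercept for y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


speeds = sorted(table1)
slope, intercept = fit_line(speeds, [logit(table1[v]) for v in speeds])


def reliability(wind_speed):
    """Logistic curve in wind speed, clipped from below at the fixed minimum."""
    p = 1.0 / (1.0 + math.exp(-(intercept + slope * wind_speed)))
    return max(FLOOR, p)


for v in (4.0, 7.0, 12.0, 20.0):
    print(f"{v:4.1f} m/s -> {100 * reliability(v):.2f} %")
```

The log-odds of the tabulated probabilities are close to linear in wind speed, which is why a two-parameter logistic curve plus a floor is enough to approximate the whole table.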
Table 2. SVR model parameters for forecasting wind power outputs on 8 January 2016.

| Parameter | SVR (Single Variable) | SVR (Multi Variables) |
|---|---|---|
| SVM kernel | Radial | Radial |
| Cost | 32.00 | 8.00 |
| Gamma | 1.0 | 0.33 |
| Epsilon | 0.00 | 0.00 |
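To make the role of gamma in Table 2 concrete, the sketch below evaluates a radial (RBF) kernel, assuming the usual libsvm-style form K(x, z) = exp(-gamma * ||x - z||^2). The section does not state which software or parameterization was used, so this form is an assumption. The dictionary name and the sample points are hypothetical.

```python
import math

# Parameters copied from Table 2 (8 January 2016).
SVR_PARAMS_8_JAN = {
    "single": {"kernel": "radial", "cost": 32.0, "gamma": 1.0, "epsilon": 0.0},
    "multi": {"kernel": "radial", "cost": 8.0, "gamma": 0.33, "epsilon": 0.0},
}


def rbf_kernel(x, z, gamma):
    """Radial kernel exp(-gamma * squared Euclidean distance) (assumed form)."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, z))
    return math.exp(-gamma * squared_distance)


# Two hypothetical (scaled) input points one unit apart.
x, z = [0.0], [1.0]
for name, params in SVR_PARAMS_8_JAN.items():
    similarity = rbf_kernel(x, z, params["gamma"])
    print(f"{name}: gamma={params['gamma']} -> K(x, z)={similarity:.4f}")
```

For points one unit apart, gamma = 1.0 gives a similarity of about 0.37 and gamma = 0.33 about 0.72, so the smaller gamma lets each training point influence a wider neighborhood.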
Table 3. Accuracy of the SVR models and the persistence method (8 January 2016).

| Method | Correlation | R-Squared | RMSE (Root Mean Square Error) |
|---|---|---|---|
| SVR (single variable) | 0.9957 | 0.9919 | 16.9843 |
| SVR (multi variables) | 0.9971 | 0.9943 | 9.1491 |
| Persistence method | 0.9575 | 0.9169 | 33.8178 |
Table 4. SVR model parameters for forecasting wind power outputs on 11 January 2016.

| Parameter | SVR (Single Variable) | SVR (Multi Variables) |
|---|---|---|
| SVM kernel | Radial | Radial |
| Cost | 16.00 | 32.00 |
| Gamma | 1.0 | 0.33 |
| Epsilon | 0.10 | 0.00 |
Table 5. Accuracy of the SVR models and the persistence method (11 January 2016).

| Method | Correlation | R-Squared | RMSE (Root Mean Square Error) |
|---|---|---|---|
| SVR (single variable) | 0.9942 | 0.9884 | 21.5575 |
| SVR (multi variables) | 0.9975 | 0.9950 | 15.4395 |
| Persistence method | 0.8830 | 0.7796 | 47.4638 |

Share and Cite

MDPI and ACS Style

Park, B.; Hur, J. Accurate Short-Term Power Forecasting of Wind Turbines: The Case of Jeju Island’s Wind Farm. Energies 2017, 10, 812. https://doi.org/10.3390/en10060812
