Article

Adaptive Solar Power Forecasting based on Machine Learning Methods

Department of Electrical Engineering, Nanjing University of Aeronautics and Astronautics (NUAA), Nanjing 210016, China
*
Author to whom correspondence should be addressed.
Appl. Sci. 2018, 8(11), 2224; https://doi.org/10.3390/app8112224
Submission received: 19 September 2018 / Revised: 20 October 2018 / Accepted: 7 November 2018 / Published: 12 November 2018
(This article belongs to the Special Issue Applications of Artificial Neural Networks for Energy Systems)

Abstract

Forecasting errors in power systems, for example in solar power, wind power and load demand, can weaken the economic performance of those systems. In this paper, we propose an adaptive solar power forecasting (ASPF) method for precise solar power forecasting, which captures the characteristics of forecasting errors and revises the predictions accordingly by combining data clustering, variable selection, and a neural network. The proposed ASPF is thus quite general and does not require any specific original forecasting method. We first propose the framework of ASPF, featuring data identification and data updating. We then present the improved k-means clustering, the least angular regression algorithm, and the back-propagation neural network (BPNN) that are applied, followed by the realization of ASPF, which is shown to improve as more data are collected. Simulation results based on trace-driven data show the effectiveness of the proposed ASPF.

1. Introduction

In recent years, with growing electrical load demand and the call to reduce the greenhouse gases produced by burning fossil fuels, Smart Grids (SG) and Microgrids (MG) featuring renewable energy (RE) have developed quickly to satisfy consumption and mitigate pollution [1,2]. However, renewable energy such as wind and solar power is typically intermittent and weather-dependent, which is a big challenge for the planning, operation, unit commitment and energy scheduling of an MG or SG [3,4,5]. If RE power generation can be predicted accurately, the efficiency of energy management can be greatly improved [6]. Thus, many researchers focus on predicting RE power generation [7,8]. For solar power prediction, statistical methods and machine learning methods are two commonly used approaches [3,8,9,10,11]. Statistical methods are basically applied through a regression function or a probability distribution function (PDF), which estimates the relationship between weather variables and solar intensity [9,12,13]. Machine learning techniques learn the characteristics of the sample data and then achieve the prediction by training models such as Neural Networks (NN) and Support Vector Machines (SVM) [10,11].
For statistical methods, the authors of [12] generate a set of solar scenarios by assuming that the forecasting errors follow a normal distribution. In [13], the forecasting error of the renewable energy is simulated via multiple scenarios from the Monte Carlo method under a Beta PDF. This approach depends heavily on the distribution function of the prediction error, which is normally difficult to obtain. On the other hand, for machine learning, an auto-regressor is adopted to predict the two-ahead solar power generation in [14]. A three-layer NN is applied to build the forecasting model in [15], taking the data of the previous day and the highest temperature and weather type of the forecasting day as input variables; it forecasts very well for sunny days. Besides the highest temperature, the lowest temperature and the average temperature are chosen as input data to predict photovoltaic (PV) output power in [16]. However, the generalization ability of an NN is limited, and thus the overall prediction obtained by using only an NN is not precise enough in many cases.
Although the forecasting results of any one of the above methods are not satisfactory overall, the forecasting accuracy can be further improved by merging them. Some existing works focus on combining artificial neural networks and data mining for more accurate solar power forecasting [17]. K-means clustering with nonlinear auto-regressive neural networks is adopted to forecast solar irradiance in [18], and k-means with artificial neural networks is used to predict solar irradiance in [19]. However, the traditional k-means is sensitive to the initial centers: in practice, if the cluster centers are given randomly, k-means often converges to one or more clusters that contain very few or even no data points. Therefore, many researchers focus on cluster center initialization to improve the performance of k-means [20,21]. Weather data are high-dimensional, which makes the improved k-means clustering in [21] a good fit for our method, because it was proposed to find initial cluster centers for high-dimensional data. Appropriate classification can improve the prediction accuracy; however, the key factors that affect the prediction accuracy can be very different in different clusters. It is thus necessary to analyze the most important factors in each cluster. Widely used variable selection methods are forward selection regression, forward stage-wise regression and LASSO [22,23]. In practice, after clustering, there may be very little data in some groups, so these methods tend to be too aggressive and eliminate too many correlated and useful predictors. In contrast, least angular regression (LARS) [24] is particularly suitable for situations where the feature dimension is higher than the sample number; during the calculation it keeps all selected variables in the regression model and provides a simple structure for computing. We therefore choose LARS to select the most important factors in each group.
In fact, prediction methods that use only key factors are not sufficient to produce very accurate results [22], because the environment is highly complicated.
Therefore, an adaptive method for revising the original forecasts is highly desirable. Motivated by this, we propose an adaptive solar power forecasting (ASPF) method in this paper, which revises the original forecasts adaptively according to the historical forecasting errors and the weather data. It combines data clustering, variable selection and an artificial neural network (ANN) to achieve highly accurate revision of PV power predictions by learning the weather data and day-ahead forecasting errors intelligently and adaptively. The function of the ASPF is shown in Figure 1. Through the ANN, ASPF learns the characteristics of the forecasting errors between the predicted and actual PV power. Clustering the data greatly improves how well these characteristics are captured. In addition, ASPF classifies the predicted day as a similar day or a non-similar day (defined in Section 2), based on which the original forecasts are revised accordingly. The details are shown in Section 2. The ANN is also adaptively upgraded, so that ASPF improves as more data are collected. Note that ASPF captures the characteristics of forecasting errors and revises the predictions accordingly, and thus it does not require any specific original forecasting method, which can be a machine learning method or a statistical one. This makes ASPF a quite general method that can be used in many cases.
The remainder of this paper is organized as follows. We present the principle of ASPF and its framework in Section 2. The algorithms to realize ASPF are proposed in Section 3. The adaptive solar power forecasting is presented in Section 4. We perform the simulation studies in Section 5. Section 6 concludes this paper.

2. Principle and Framework of the ASPF

In this paper, the realization of ASPF mainly depends on the stored historical solar power data and a learning network that is updated according to actual needs. In the database, we store the original predicted solar power $P_{ori}$, the revised solar power $P_{rev}$, and the actual solar power data $P_{act}$. As more data are collected, ASPF becomes more adaptive. Under a certain condition, the historical solar power data $P_{sdm}$ are taken directly as the revised data $P_{rev}$, which is named Similar Day Mode (SDM). When the condition of SDM does not hold, which is named Non-similar Day Mode (NDM), ASPF modifies the original predictions $P_{ori}$ using the output of a learning and updating network obtained from a back-propagation neural network (BPNN) [25]. The network learns the characteristics of the differences between $P_{ori}$ and $P_{act}$. In this way, ASPF predicts more precisely in both SDM and NDM.
Furthermore, whether the system is working in SDM or NDM, the network is updated depending on whether the error between $P_{rev}$ and $P_{act}$ is within an acceptable range. If the error is outside the allowed range, the network in ASPF is updated. We now present the mode judging standard, the ASPF framework, and the updating rules. The notations used in this paper are summarized in Table 1.

2.1. Mode Judgment

To determine similar days quickly, k-means clustering [26,27] is first used to divide the data into several groups, and in every group the clustering center is used to judge rapidly whether there is a similar day in the current database, as shown in Figure 2. The input vector $x \in \mathbb{R}^{q \times 1}$ contains $P_{ori}$ and the key variables that affect the accuracy of the original prediction. $x$ is compared with the clustering centers $C_k$ and assigned to group $k$; if it has one or more similar days, the closest historical solar power $P_{sdm}$ can be found in the database. Here, a variable denoted $\Delta_1$ is chosen as the threshold for similar historical data: if the error between $x$ and $C_k$ is larger than $\Delta_1$, there is no similar day.
Considering the importance of clustering, we propose an improved k-means method to group the data, which is shown in Section 3.1. Note that the key weather variables may differ between groups. For example, solar power is more closely related to temperature on sunny days than on cloudy days. Therefore, we adopt a model-selection method called the least angular regression, also known as LARS [24], to find the most correlated weather variables in each cluster.
The predicting network is thus obtained by combining the improved k-means clustering and LARS, and it keeps updating for more accurate predictions. In addition, the trained BPNN, its weight $\omega$ and its bias $b$ are used to determine the value of $\Delta_1$, which is thus closely related to the updated network. As the accuracy of the updated network increases, the similarity judgment becomes more precise as well. We present the detailed steps to determine and update $\Delta_1$ in Section 3 and Section 4.

2.2. Framework of the ASPF in SDM and NDM

In order to show the principle of ASPF more clearly, we explain it from the following two aspects, namely, the operation of ASPF with similar days in SDM, and without similar day in NDM. The block diagrams are shown in Figure 3a,b, respectively.
Figure 3a shows the case when ASPF is working in SDM. $x$ is compared with every data point in group $k$ until the closest one is found, which is used as the revised PV power. Later, the actual solar power $P_{act}$ is recorded in the energy management system (EMS) when the whole day of operation ends. To verify the effectiveness of ASPF in SDM, $P_{sdm}$ and $P_{act}$ are compared in the Updating judgment. If the largest difference between these two vectors at any time slot is less than a certain value, $P_{sdm}$ is chosen as $P_{rev}$; otherwise, the network is updated.
Figure 3b shows the case when ASPF is working in NDM. Because no similar day is found in the current database, ASPF learns the new data through the BPNN and decides whether the BPNN is updated. Unlike in SDM, $P_{rev}$ is produced by the BPNN, and the difference between $P_{rev}$ and $P_{act}$ determines whether the network is updated. In this way, ASPF absorbs more useful data and thus becomes better at prediction.

2.3. Rules for Network Updating and Feedback

Figure 4 shows the rule of the Updating judgment function in Figure 3. $P_{rev}$ is the day-ahead revised predicted PV power obtained from SDM or NDM, which is compared with $P_{act}$, and $\Delta$ is the maximum absolute error between $P_{rev}$ and $P_{act}$. $\Delta_2$ represents the acceptable error threshold between the revised solar power and the actual PV power in the current adaptive system. In addition, $\Delta_2$ is updated as more data are collected and processed, and it becomes smaller and smaller. It is also an indicator of the accuracy of the predictions.
In the Updating judgment, if $\Delta$ is larger than $\Delta_2$, a trigger signal $R_s \in [0, 1]$ is set to $R_s = 1$ to activate the BPNN, which is then updated; otherwise, the BPNN remains unchanged. Thus, the Updating judgment is mainly applied to renew the BPNN to guarantee accurate predictions. Here, $\Delta_2$ is calculated by Equations (9) and (17) in Section 3 and Section 4. We apply this in both SDM and NDM.
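The Updating judgment can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the function name `updating_judgment` and the list-based representation of the daily power profiles are assumptions made here.

```python
def updating_judgment(p_rev, p_act, delta_2):
    """Compare revised and actual PV power over one day.

    p_rev, p_act: sequences with one value per time slot (hypothetical layout).
    delta_2: current acceptable error threshold.
    Returns (delta, r_s), where delta is the maximum absolute error and
    r_s is the trigger signal (1 means the BPNN should be updated).
    """
    if len(p_rev) != len(p_act):
        raise ValueError("p_rev and p_act must cover the same time slots")
    # Delta: maximum absolute error between revised and actual power.
    delta = max(abs(r - a) for r, a in zip(p_rev, p_act))
    # R_s = 1 activates the BPNN update; otherwise it stays unchanged.
    r_s = 1 if delta > delta_2 else 0
    return delta, r_s


delta, r_s = updating_judgment([1.0, 2.0, 3.0], [1.1, 2.5, 3.0], delta_2=0.3)
```

In this example the largest slot error exceeds the threshold, so the trigger is raised.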
Through the above three parts, we compose the ASPF, and the complete realizations in every detail are presented next.

3. The Algorithms to Realize ASPF

This section presents the algorithms to realize ASPF. We first propose an improved k-means clustering method to divide the data into several groups. Then, LARS is presented to find the most relevant variables in every group. Following this, we discuss the compensation network for each group. We also present the feedback and updating mechanism.

3.1. Improved k-means Clustering

Figure 5 shows typical daily solar power profiles in different weather conditions, i.e., sunny, cloudy and rainy. Sunny 1 and sunny 2 represent a sunny day in winter and in summer, respectively. The solar power on cloudy and rainy days differs from that on sunny days. Therefore, considering the differences between these weather types, we first cluster the data into several specific groups.
k-means is an effective, robust and widely used clustering technique [28]. It aims to partition the observations indexed by $\mathcal{N} = [1, 2, \ldots, N]$ into $K$ clusters, in which each observation belongs to the cluster with the nearest mean, and different clusters have low similarity with each other. It is implicitly assumed that features within a cluster are “homogeneous” and that different clusters are “heterogeneous” [26,29]. Here, we adopt k-means as the basis for clustering our data.
Given a dataset $S = \{x_n\}_{n=1}^{N}$ of $N$ points and the number $K$ of desired clusters, the corresponding generalized centroids are usually denoted as $C = \{C_1, C_2, \ldots, C_K\}$, with $C_k \in \mathbb{R}^{q \times 1}$. The standard k-means finds the optimal cluster centers $\{C_1, C_2, \ldots, C_K\}$ such that the sum of the 2-norm distances between each point $x_n$ and the nearest cluster center $C_k$ is minimized [20,21]:
$$\min: \sum_{k=1}^{K} \sum_{n=1}^{N} \eta_{n,k} \left\| x_n - C_k \right\|_2 \tag{1}$$
where $\eta_{n,k} \in [0, 1]$ is the coefficient that represents the degree to which the $n$th sample belongs to the $k$th cluster. For simplicity, $\eta_{n,k} = 1/2$ in this paper. $C_k$ evolves from the initial given clustering center of the $k$th group, denoted $C_k^{giv}$, to its final value during the calculation.
As mentioned in Section 1, the traditional k-means is sensitive to the initial center $C_k^{giv}$, and different initializations can lead to different final results. If the cluster centers are given randomly, the iterative calculation often converges to one or more clusters that contain very few or even no data points.
To overcome this problem, we employ a simplified method that combines a potential function and a density function to obtain the initial cluster centers [21]. First, for a given $K$, the density function of any sample $x_n$ is:
$$D_n^{(0)} = \sum_{j=1}^{N} \left( 1 + f_{rd} \left\| x_n - x_j \right\|_2 \right)^{-1} \tag{2}$$
where $f_{rd}$ is the function of the effective radius of the neighborhood density; the more data there are around sample $x_n$, the greater $D_n^{(0)}$ is. Here $f_{rd} = \alpha N (N-1) / \left( \sum_{n=1}^{N} \sum_{j=1}^{N} \| x_n - x_j \|_2 \right)$, and $\alpha$ is a constant. For a known sample set, the value of $f_{rd}$ is easy to obtain. After the density of each data point has been computed, the data point $x_1$ with the highest density is selected as the first cluster center, $C_1^{giv} = x_1$, and $D_1 = \max \{ D_n^{(0)}, n \in \mathcal{N} \}$ is its density value. Then, the density function for the subsequent clusters becomes:
$$D_n^{(k)} = D_n^{(k-1)} - D_k \left( 1 + f_{rd} \left\| x_n - x_k \right\|_2 \right)^{-1} \tag{3}$$
where $k = 1, 2, \ldots, K-1$ and $D_k = \max \{ D_n^{(k-1)}, n \in \mathcal{N} \}$, so the remaining cluster centers $C_k^{giv} = x_k$ can be identified from $D_k$. With these initial centers, Equation (1) converges within a few iterations, and no initial cluster is empty.
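The initialization in Equations (2) and (3) can be sketched as follows. This is a minimal sketch under stated assumptions: it uses the plain Euclidean distance, takes $\alpha = 1$ by default, and the function names `euclidean` and `initial_centers` are hypothetical rather than taken from [21].

```python
import math


def euclidean(a, b):
    """2-norm distance between two points given as sequences of floats."""
    return math.sqrt(sum((ai - bi) ** 2 for ai, bi in zip(a, b)))


def initial_centers(points, k, alpha=1.0):
    """Pick k initial cluster centers by repeatedly taking the densest point."""
    n = len(points)
    dist = [[euclidean(p, q) for q in points] for p in points]
    total = sum(sum(row) for row in dist)
    if total == 0:
        raise ValueError("all points coincide; density radius is undefined")
    # f_rd = alpha * N * (N - 1) / (sum of all pairwise distances)
    f_rd = alpha * n * (n - 1) / total
    # kernel[i][j] = (1 + f_rd * ||x_i - x_j||)^(-1)
    kernel = [[1.0 / (1.0 + f_rd * d) for d in row] for row in dist]
    # Equation (2): initial density of every sample.
    density = [sum(row) for row in kernel]
    centers = []
    for _ in range(k):
        best = max(range(n), key=lambda i: density[i])
        d_best = density[best]
        centers.append(points[best])
        # Equation (3): subtract the influence of the chosen center so that
        # points close to it are unlikely to be chosen next.
        density = [density[i] - d_best * kernel[i][best] for i in range(n)]
    return centers


data = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (5.0, 5.0), (5.1, 5.0), (5.0, 5.1)]
centers = initial_centers(data, k=2)
```

With two well-separated groups, the two selected centers fall in different groups, which is the property that keeps the initial clusters from being empty.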
k-means consists of two phases: one for determining the initial centroids and the other for assigning data points to the nearest clusters and then recalculating the cluster means. The second phase is also needed to find the optimal $k$, for which data within the same type are very similar while data from different types are far apart.
In this paper, the method discussed in [30] is adopted to obtain the optimal number $k$. Here, an inner distance within the same group is defined as:
$$D_w(n, k) = \frac{1}{N_k - 1} \sum_{g=1}^{N_k} \left\| x_{k,g} - x_{k,n} \right\|_2 \tag{4}$$
where $D_w(n, k)$ is the average distance from the $n$th data point in cluster $k$ to the other data in the same group, $N_k$ denotes the number of data points in the $k$th cluster, and $n = [1, 2, \ldots, N_k]$. If $D_w(n, k)$ is small, the data in this cluster are correspondingly compact.
The distance for data between different groups is defined as:
$$D_b(n, k) = \min_{1 \le k' \le K,\; k' \ne k} \frac{1}{N_{k'}} \sum_{g=1}^{N_{k'}} \left\| x_{k',g} - x_{k,n} \right\|_2 \tag{5}$$
where $D_b(n, k)$ represents the minimal average distance from the $n$th data point of cluster $k$ to the other clusters, indexed here by $k'$. The larger the value of $D_b(n, k)$, the clearer the “heterogeneous” feature. To combine the two factors $D_b(n, k)$ and $D_w(n, k)$, a linear combination is used to balance the relationship between them. At the same time, to let the indicator assess the validity of the clustering without being affected by the dimension, a fractional form is employed. The index used to calculate the optimal $k$ is defined as:
$$I_k(n, k) = \frac{D_b(n, k) - D_w(n, k)}{D_b(n, k) + D_w(n, k)} \tag{6}$$
where $I_k(n, k) \in [-1, 1]$. Since $D_b(n, k)$ requires at least two clusters, $k$ runs from 2 to $K$. From Equation (6), when the cluster number $k$ is taken from 2 to $K$, the optimal $k$ is the one for which $I_k(n, k)$ attains its maximal value. Usually, $K \le \sqrt{N}$, and $K = \mathrm{Int}(\sqrt{N})$ [31].
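As an illustration of Equations (4)–(6), the following sketch evaluates the index for a single data point and computes the upper bound on the number of clusters. The data layout (a list of clusters, each a list of points) and the function names are assumptions made for this example; how the per-point values are aggregated over a whole clustering is not specified here and is left out.

```python
import math


def euclidean(a, b):
    """2-norm distance between two points."""
    return math.sqrt(sum((ai - bi) ** 2 for ai, bi in zip(a, b)))


def validity_index(clusters, k, n):
    """I_k(n, k) for the n-th point of cluster k (both zero-based here)."""
    own = clusters[k]
    if len(clusters) < 2 or len(own) < 2:
        raise ValueError("need at least two clusters and two points in cluster k")
    point = own[n]
    # Equation (4): average distance to the other points of the same cluster
    # (the distance of the point to itself is zero, so it adds nothing).
    d_w = sum(euclidean(g, point) for g in own) / (len(own) - 1)
    # Equation (5): smallest average distance to any other cluster.
    d_b = min(
        sum(euclidean(g, point) for g in other) / len(other)
        for j, other in enumerate(clusters)
        if j != k and other
    )
    if d_b + d_w == 0:
        raise ValueError("index undefined when all distances are zero")
    # Equation (6): bounded in [-1, 1]; larger means better separated.
    return (d_b - d_w) / (d_b + d_w)


def max_clusters(n_samples):
    """Upper bound K = Int(sqrt(N)) on the number of clusters."""
    return int(math.sqrt(n_samples))


clusters = [[(0.0, 0.0), (0.0, 1.0)], [(10.0, 0.0), (10.0, 1.0)]]
index = validity_index(clusters, k=0, n=0)
upper = max_clusters(366)
```

For instance, assuming one sample per day over a leap year (366 samples), `max_clusters` gives 19.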
Finally, given suitable initial cluster centers and the optimal number $k$, the pseudo-code for the improved k-means clustering is summarized in Algorithm 1.
Algorithm 1: Improved k-means Algorithm
Step 1: For each $n \in \mathcal{N}$, $k \in [2, K]$.
Step 2: In each cluster $k$, the initial cluster center $C_k^{giv}$ is obtained by executing Equations (2) and (3); then $C_k$ and $N_k$ are calculated by Equation (1).
Step 3: Find and record $I_k$ based on Equations (4)–(6). Compare $k$ with $K$: if $k < K$, set $k := k + 1$ and repeat Step 2; otherwise, jump to Step 4.
Step 4: Acquire the maximal $I_k$ and the corresponding number $k^{*}$, recalculate Equations (2) and (3), and obtain the optimal clustering through Equation (1); the optimal centers are $C_k$ and the weather types are $G_k$, $k = 1, 2, \ldots, k^{*}$.
In this way, solar power data is divided into several groups by the proposed improved k-means clustering. According to the standardized-residual method and Lagrange Interpolation Polynomial [32], the abnormal data is removed and the remaining data are normalized.

3.2. The Least Angular Regression Algorithm

A typical regression can be expressed as $\hat{\mu} = X \hat{\beta}$, where $\hat{\mu}$ is the current estimate of the response vector $y$ with $M$ predictors, $X$ is an $N \times M$ matrix, and $N$ is the number of samples. The current correlation is then $\hat{c}_{or} = X^{T}(y - \hat{\mu})$, and there exists a $j$ such that $|\hat{c}_{or,j}|$ is maximized. The estimate is updated by the rule $\hat{\mu}_{m+1} = \hat{\mu}_{m} + \hat{\gamma}\, \mathrm{sign}(\hat{c}_{or,j}) X_j$. In LARS, $\hat{\gamma}$ is endogenously chosen so that the algorithm proceeds equiangularly between the variables in the most correlated set (hence the ‘least angle direction’) until the next variable is found. The next variable then joins the active set, and the coefficients are moved together in a way that keeps their correlations tied and decreasing. This process continues until all the variables are in the model, and it ends at the full least-squares fit [24].
More precisely, the LARS algorithm begins at $\hat{\mu}_0 = 0$. Suppose $\hat{\mu}$ is the current estimate and let $\hat{c}_{or} = X^{T}(y - \hat{\mu})$. Define $\mathcal{A}$ as the subset of indices corresponding to the variables with the largest absolute correlations: $\hat{C}_{or} = \max_m \{ |\hat{c}_{or,m}| \}$ and $\mathcal{A} = \{ m : |\hat{c}_{or,m}| = \hat{C}_{or} \}$. Define $s_m = \mathrm{sign}(\hat{c}_{or,m})$ for $m \in \mathcal{A}$; the active matrix corresponding to $\mathcal{A}$ is $X_{\mathcal{A}} = (\cdots s_m X_m \cdots)_{m \in \mathcal{A}}$. Let $G_{\mathcal{A}} = X_{\mathcal{A}}^{T} X_{\mathcal{A}}$ and $A_{\mathcal{A}} = (1_{\mathcal{A}}^{T} G_{\mathcal{A}}^{-1} 1_{\mathcal{A}})^{-1/2}$, where $1_{\mathcal{A}}$ is a vector of 1s of length $|\mathcal{A}|$, the size of $\mathcal{A}$. The equiangular vector is then defined as $u_{\mathcal{A}} = X_{\mathcal{A}} \omega_{\mathcal{A}}$, where $\omega_{\mathcal{A}} = A_{\mathcal{A}} G_{\mathcal{A}}^{-1} 1_{\mathcal{A}}$. It is the unit vector making equal angles, less than $90^{\circ}$, with the columns of $X_{\mathcal{A}}$, and it satisfies $X_{\mathcal{A}}^{T} u_{\mathcal{A}} = A_{\mathcal{A}} 1_{\mathcal{A}}$ and $\| u_{\mathcal{A}} \|_2 = 1$. Let $a = X^{T} u_{\mathcal{A}}$. The update of $\hat{\mu}$ in the next step of the LARS algorithm is:
$$\hat{\mu}_{\mathcal{A}^{+}} = \hat{\mu}_{\mathcal{A}} + \hat{\gamma}\, u_{\mathcal{A}} \tag{7}$$
where
$$\hat{\gamma} = {\min_{m \in \mathcal{A}^{c}}}^{+} \left\{ \frac{\hat{C}_{or} - \hat{c}_{or,m}}{A_{\mathcal{A}} - a_m},\; \frac{\hat{C}_{or} + \hat{c}_{or,m}}{A_{\mathcal{A}} + a_m} \right\} \tag{8}$$
Here $\min^{+}$ indicates that the minimum is taken over only the positive components within each choice of $m$, and $a_m \in a$. Finally, the variables are sorted by relevance level, and the important predictors are easily obtained by LARS.
After the dataset $S$ is divided into several types in Section 3.1, the resulting subsets are denoted $S_1, S_2, \ldots, S_k$. Each subset is then analyzed by LARS to reduce the influence of uninformative predictors, so that we can focus on the variables that are significant for the differences in prediction.
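To make the update in Equations (7) and (8) concrete, the sketch below performs only the first LARS step, where the active set holds a single variable and the equiangular vector reduces to the signed, normalized column. It is an illustration under stated assumptions (pure Python, columns assumed already standardized, hypothetical function names), not a full LARS implementation.

```python
import math


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def column(X, m):
    return [row[m] for row in X]


def lars_first_step(X, y):
    """First LARS step from mu = 0. Returns (j, gamma, mu)."""
    n_pred = len(X[0])
    mu = [0.0] * len(y)
    resid = [yi - mi for yi, mi in zip(y, mu)]
    # Current correlations c = X^T (y - mu).
    c = [dot(column(X, m), resid) for m in range(n_pred)]
    j = max(range(n_pred), key=lambda m: abs(c[m]))
    c_max = abs(c[j])
    s = 1.0 if c[j] >= 0 else -1.0
    x_j = column(X, j)
    # With one active variable, A_A = ||X_j|| and u_A = s * X_j / ||X_j||.
    a_act = math.sqrt(dot(x_j, x_j))
    u = [s * v / a_act for v in x_j]
    a = [dot(column(X, m), u) for m in range(n_pred)]
    # Equation (8): smallest positive step at which another variable ties.
    candidates = []
    for m in range(n_pred):
        if m == j:
            continue
        for num, den in ((c_max - c[m], a_act - a[m]),
                         (c_max + c[m], a_act + a[m])):
            if den > 0 and num / den > 0:
                candidates.append(num / den)
    # If no other variable can join, go to the least-squares fit on X_j.
    gamma = min(candidates) if candidates else c_max / a_act
    # Equation (7): move the estimate along the equiangular direction.
    mu = [mi + gamma * ui for mi, ui in zip(mu, u)]
    return j, gamma, mu


X = [[1.0, 0.0], [0.0, 1.0], [0.0, 0.0]]
y = [3.0, 1.0, 0.0]
j, gamma, mu = lars_first_step(X, y)
```

In this small example the first variable enters, and after the step both variables have the same correlation with the residual, which is the tie condition that lets the second variable join the active set.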

3.3. BPNN-Based Solar Power Compensation

The BPNN, one of the most popular techniques in the field of neural networks, is a supervised learning NN. It uses the steepest gradient descent method to reach a very small approximation error. Theoretically, a BPNN can approximate any nonlinear function [25], and for PV power prediction the relationship between the input data and the forecast value is very complicated [8]. It is therefore suitable to adopt the BPNN in our ASPF to learn this feature. A three-layer feed-forward neural network and the BP learning algorithm are adopted together to compensate the prediction error of solar power in this paper.
The input nodes $\{ P_b^k(t) \}_{b=1}^{B}$ of the BPNN feeding the hidden layer are decided by the result from Section 3.2; $x$ consists of $\{ P_b^k(t) \}_{b=1}^{B}$ from $t = 1$ to $T$. $B$ denotes the number of input nodes, and $P_b^k(t)$ is the input predictive information of the $k$th group at time slot $t$. For the number of hidden neurons (HN) in the hidden layer, the method in [33] is used to obtain the upper bound on the number of hidden neurons, $\mathrm{HN}_{max}$. The number of hidden neurons $h$ then ranges from 1 to $\mathrm{HN}_{max}$, the model is trained several times according to the method in [34], and the forecasting mean square error on the testing data is calculated and recorded for comparison. The optimal $h$ is the number that occurs most often among the runs with the smallest mean square error.
Initializing the weights is another important step in the BPNN. Initialization of a network involves assigning initial values to the weights of all connecting links; here, the initial weights are set to small random values between $-0.3$ and $+0.3$ [33]. The momentum coefficient and the learning rate of the BPNN are chosen from $[0.021, 0.029]$ and $[0.345, 0.539]$, respectively. In addition, to overcome possible overfitting in forecasting, regularization is employed with a parameter in the range $[0.87, 0.9]$ for the $k$ groups of models.
Finally, the Maximum Absolute Error (MxAE), Root Mean Squared Error (RMSE) and Mean Absolute Percentage Error (MAPE) between the revised solar power and the actual value are used to quantify the performance of each group. MxAE, RMSE and MAPE are widely used statistical measures of model accuracy [22]. In each group, they are defined as follows:
$$\mathrm{MxAE} = \max \left| y_n - \hat{y}_n \right|, \quad n = 1, 2, \ldots, N_k \tag{9}$$
$$\mathrm{RMSE} = \sqrt{ \frac{1}{N_k} \sum_{n=1}^{N_k} \left( y_n - \hat{y}_n \right)^2 } \tag{10}$$
$$\mathrm{MAPE} = \frac{100}{N_k} \sum_{n=1}^{N_k} \left| \frac{y_n - \hat{y}_n}{y_n} \right| \tag{11}$$
where $N_k$ is the number of predicted points in type $k$, $y_n$ is the actual solar power for the $n$th point, and $\hat{y}_n$ represents the revised solar power.
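The three measures in Equations (9)–(11) translate directly into code. The following is a small sketch with hypothetical function names; note that MAPE is undefined wherever the actual power is zero (for example at night), which the sketch leaves to the caller to filter.

```python
import math


def mxae(y, y_hat):
    """Maximum absolute error, Equation (9)."""
    return max(abs(a - b) for a, b in zip(y, y_hat))


def rmse(y, y_hat):
    """Root mean squared error, Equation (10)."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(y, y_hat)) / len(y))


def mape(y, y_hat):
    """Mean absolute percentage error, Equation (11).

    Raises ZeroDivisionError if any actual value is zero.
    """
    return 100.0 / len(y) * sum(abs((a - b) / a) for a, b in zip(y, y_hat))


actual = [10.0, 20.0, 40.0]
revised = [12.0, 18.0, 40.0]
scores = (mxae(actual, revised), rmse(actual, revised), mape(actual, revised))
```

Here the largest slot error is 2, and the relative errors of 20%, 10% and 0% average to a MAPE of about 10.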

3.4. Closed-Loop Feedbacks and Updates

ASPF not only compensates the error in the original day-ahead predicted solar power, but also uses historical actual solar power as the revised value. It is therefore necessary to search for a similar day in the system. In addition, using $P_{sdm}$ as a substitute for $P_{rev}$ must be feasible, and it is important to guarantee that this $P_{sdm}$ is highly accurate. All of these requirements are met by the closed-loop feedback and updating mechanism.
In this paper, we use $\Delta_2$ to denote the MxAE of the current forecasting model. The criterion for judging similar days is related to the network parameters and determined by the maximal absolute error of the current BPNN. With the input $|x_i - x_j|$, the BPNN gives the difference between the predicted PV power and the actual values; if this difference is less than $\Delta_2$, the corresponding actual values can replace each other with acceptable errors. The actual solar power of the similar data can therefore be used directly as the revised value when the similarity meets certain conditions. The specific relationship between $\Delta_1$ and the BPNN, and this condition, are shown below.
The original forecasting models are built in Section 3.3, and after the models are trained, the MxAE, the weight $\omega$ and the bias $b$ can be obtained from the BPNN. The BPNN can be formulated as a non-linear function, expressed as:
$$\hat{y} = f(x \omega + b) \tag{12}$$
and, for any input data $x_i, x_j \in S$, if
$$\left| y_i - \hat{y}_i \right| \le \Delta_2, \qquad \left| y_j - \hat{y}_j \right| \le \Delta_2 \tag{13}$$
and the function $f(\cdot)$ is monotonically increasing and differentiable, then $|\hat{y}_i - \hat{y}_j| \le |f(|x_i - x_j| \omega + b)|$.
In case 1, if $y \ge f(x \omega + b)$, the actual solar power is larger than the revised one. Thus,
$$(y_i - y_j) \ge \left( f(x_i \omega + b) - f(x_j \omega + b) \right) \tag{14}$$
so,
$$\left| y_i - y_j \right| \le f\left( |x_i - x_j| \omega + b \right) \le \Delta_2 \tag{15}$$
In case 2, if $y \le f(x \omega + b)$, the relationship of Equation (15) can also be obtained.
Therefore, given the input difference $|x_i - x_j|$ and the known $f(\cdot)$, if the result of $f(\cdot)$ is less than $\Delta_2$, then $y_i$ and $y_j$ can replace each other with acceptable errors. In other words, when the prediction model is unchanged, comparing the similarity of the predicted input information allows the actual solar power of a similar historical day to be used directly as the revised PV power, provided that a certain condition is satisfied. We give this condition below.
In practice, by comparing the maximal $|x_i - x_j|$ with the corresponding value calculated by the known $f(\cdot)$, it can be found that if $\Delta_1 \le \max |x_i - x_j|$, then $|y_i - y_j| \le \Delta_2$. The initial $\Delta_1$ can therefore be chosen as $\max |x_i - x_j|$, and thus $\Delta_1$ is closely related to the updated network. As the accuracy of the updating network increases, the precision of the solar power prediction for similar days becomes higher and higher. It is therefore reasonable to adopt historical actual solar power as the day-ahead revised data when similar input information exists. Our proposed method can thus reuse historical data to achieve better forecasting revision.

4. Adaptive Solar Power Forecasting Algorithm

Based on the Algorithms in Section 3, in this section, we present the complete ASPF.

4.1. Effectiveness of the ASPF

As more data are collected, there will be more and more historical data in the database, so it is not appropriate to use a fixed parameter as $\Delta_1$ when searching for a similar day. Meanwhile, the potential function in [21] decays slowly, which is a desirable property. Based on this, the decay of $\Delta_1$ is defined as:
$$\Delta_{1,up} = \Delta_1\, e^{ -\frac{2}{\pi} \left( 1 - \frac{L_a}{L_{ave}} \right) \arctan(N_L) } \tag{16}$$
where $\Delta_1$ is the current value, $\Delta_{1,up}$ represents the updated value of $\Delta_1$ in the presence of multiple similar days, $L_{ave} = \frac{1}{N_L} \sum_{l=1}^{N_L} L(l)$, $L = \{ L(n) \le \Delta_1 \}$, and $L_a = \min \{ L \}$. In Equation (16), $\Delta_1$ is updated only when more than two similar days exist. The flow chart in SDM is shown in Figure 6a.
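The decay of $\Delta_1$ in Equation (16) can be sketched as below. The sketch assumes the negative exponent implied by the word "decaying", treats $L$ as the list of distances to historical days that fall within $\Delta_1$, and applies the "more than two similar days" rule literally; the function name `update_delta_1` is hypothetical.

```python
import math


def update_delta_1(delta_1, distances):
    """Shrink the similar-day threshold when several similar days exist.

    distances: distances L(n) between the new input and historical days
    (hypothetical representation). Only those within delta_1 count as similar.
    """
    similar = [d for d in distances if d <= delta_1]
    n_l = len(similar)
    # The text updates Delta_1 only when more than two similar days exist.
    if n_l <= 2:
        return delta_1
    l_ave = sum(similar) / n_l
    if l_ave == 0:
        return delta_1  # all similar days coincide with the input
    l_a = min(similar)
    # Equation (16): the factor lies in (0, 1], so Delta_1 never grows.
    exponent = -(2.0 / math.pi) * (1.0 - l_a / l_ave) * math.atan(n_l)
    return delta_1 * math.exp(exponent)


new_delta_1 = update_delta_1(1.0, [0.2, 0.4, 0.6, 1.5])
```

Because $\frac{2}{\pi}\arctan(N_L)$ stays below 1 and $L_a \le L_{ave}$, the threshold shrinks slowly and saturates as the number of similar days grows.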
Similarly, the forecasting error gradually decreases as the amount of testing data increases. In addition, the networks in the system can be re-trained when the prediction error is beyond $\Delta_2$. Meanwhile, if the MxAE of the testing day, $H(d) = \max | \hat{y}_d(t) - y_d(t) |$, $t = 1, 2, \ldots, T$, is less than the initial $\Delta_2$ obtained from Equation (9), then $\Delta_2$ is decreased, where $d$ represents the index of the day. The decrease of $\Delta_2$ follows
$$\Delta_{2,up} = \Delta_2\, e^{ -\left( 1 - \frac{H(d)}{\Delta_2} \right) \frac{H(d)}{\Delta_2} } \tag{17}$$
where $\Delta_{2,up}$ is the updated value of $\Delta_2$. The flow chart of updating the BPNN and $\Delta_2$ is shown in Figure 6b.
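A corresponding sketch for the update of $\Delta_2$ in Equation (17) follows. It again assumes a negative exponent, as implied by "decreasing", and leaves $\Delta_2$ unchanged when the daily error is not below the threshold (in that case the network is retrained instead). The function name is hypothetical.

```python
import math


def update_delta_2(delta_2, h_d):
    """Tighten the acceptable-error threshold after an accurate day.

    h_d: maximum absolute error H(d) of the tested day.
    """
    if delta_2 <= 0 or h_d >= delta_2:
        # Error not below the threshold: keep Delta_2 (the BPNN is retrained).
        return delta_2
    ratio = h_d / delta_2
    # Equation (17): the exponent lies in [-0.25, 0], so the decrease is mild.
    return delta_2 * math.exp(-(1.0 - ratio) * ratio)


tightened = update_delta_2(1.0, 0.5)
unchanged = update_delta_2(1.0, 1.2)
```

The factor is smallest when $H(d)$ is half of $\Delta_2$, so a single day can reduce the threshold by at most a factor of $e^{-0.25}$.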
Figure 7 shows the complete flow chart of ASPF, in which the forecasting accuracy can be improved gradually, and the historical data is thus used more effectively. Moreover, according to the similar daily information, ASPF adaptively identifies and compensates for the predicted solar power day-ahead.

4.2. The Algorithm of the Whole System

Based on the above analysis, we now summarize the complete steps of the algorithm to realize the ASPF in Algorithm 2.
Algorithm 2: Adaptive Solar Power Forecasting Algorithm
Step 1: The historical data are divided into different groups by Algorithm 1, and the corresponding clustering centers $C_k$ are confirmed.
Step 2: LARS is applied to each group to obtain its important predictors, and the BPNN is employed to learn the features of every group; subsequently, according to the principle in Section 3.4, $\Delta_1$ and $\Delta_2$ are set up.
Step 3: A new $P_{ori}$ and its important predictors are compared with $C_k$. If the maximal error is less than $\Delta_1$, go to Step 4; otherwise, go to Step 5.
Step 4: $P_{ori}$ and its important predictors are compared with every data point in group $k$, the most similar day is taken as $P_{rev}$, and $\Delta_1$ is updated according to Equation (16); then go to Step 6.
Step 5: $P_{ori}$ is revised by the BPNN, and the day-ahead PV power $P_{rev}$ is obtained; go to Step 6.
Step 6: $\Delta = |P_{act} - P_{rev}|$. If any element of $\Delta$ is larger than $\Delta_2$, update the BPNN, store $P_{act}$ and $P_{ori}$ in the EMS, then go to Step 3; otherwise, go to Step 7.
Step 7: Update $\Delta_2$ according to Equation (17), and go to Step 3.
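Steps 3 to 5 of Algorithm 2 amount to a mode switch between SDM and NDM. The sketch below illustrates that control flow only; the data layout (`history` as one list of `(input_vector, actual_power)` pairs per group) and the callable `bpnn_revise` standing in for the trained BPNN are hypothetical choices made for this example.

```python
def max_abs_error(a, b):
    """Largest component-wise absolute difference between two vectors."""
    return max(abs(ai - bi) for ai, bi in zip(a, b))


def revise_forecast(x, centers, history, delta_1, bpnn_revise):
    """Return (p_rev, mode) for a new input vector x.

    centers: list of cluster centers C_k.
    history: history[k] is a list of (input_vector, actual_power) pairs.
    bpnn_revise: callable mapping x to a revised power profile (assumed).
    """
    # Step 3: find the closest cluster center and test it against Delta_1.
    k = min(range(len(centers)), key=lambda i: max_abs_error(x, centers[i]))
    if max_abs_error(x, centers[k]) < delta_1 and history[k]:
        # Step 4 (SDM): reuse the actual power of the most similar day.
        _, p_act_similar = min(history[k], key=lambda rec: max_abs_error(x, rec[0]))
        return p_act_similar, "SDM"
    # Step 5 (NDM): no similar day, so let the network revise the forecast.
    return bpnn_revise(x), "NDM"


centers = [[0.0, 0.0], [10.0, 10.0]]
history = [[([0.1, 0.1], [1.0, 2.0])], [([9.0, 9.0], [5.0, 6.0])]]
p_rev, mode = revise_forecast([0.2, 0.1], centers, history, 0.5, lambda x: [0.0, 0.0])
```

The returned `p_rev` would then be passed to the Updating judgment of Step 6 once the actual power of the day is known.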

5. Simulation Results

In this section, we test the performance of ASPF based on a trace obtained from a small power system in China. To validate the proposed algorithm, we apply it to original PV power forecasts produced by a radial basis function NN and by Multiple Linear Regression (MLR). The algorithms, including the improved k-means, LARS, BPNN and ASPF, are verified via simulations and comparisons.

5.1. Data Description

In this paper, the small power system consists of a micro-diesel, a photovoltaic array, a lithium battery cabinet, and the load demand. The simulations are performed to verify the operation of the proposed algorithm. The data used in ASPF come from an actual factory located in Hekou town, Nantong city, China (32.49° N, 120.83° E). For PV power forecasting, the prediction can be performed every 2 h, 1 h, 0.5 h, or 15 min according to users' demand. All simulations are run in MATLAB. In ASPF, the computation time depends mainly on the BPNN, which takes only several seconds per run in MATLAB, so the results of ASPF can be used in day-ahead energy management and can also be applied to real-time energy management. In this paper, the time interval for the adaptive PV power compensation is set to 1 h based on our system. In this system, solar power is mainly supplied to the canteen and the water pumps in the factory. From the historical data, we obtain the original PV power forecasts with NN and MLR, respectively, and store them in the EMS.

5.2. Predicting Compensation of ASPF

The original solar power data cover 1 January to 31 December 2016 in Hekou town. For the k-means algorithm, the initial cluster centers calculated by the potential function need fewer iterations than the traditional random initialization, as shown in Figure 8a. As explained in Section 3.1, k ranges from 2 to 19. In addition, the initial cluster centers can be found quickly, which makes it easier to determine the optimal number of clusters and effectively avoids non-clustering situations. Equations (4)–(6) are used to determine the optimal number of clusters. Based on the methods described in Section 3.1, the results for different k under NN and MLR are shown in Figure 8b, where we can see that the optimal number of clusters is k = 2 for NN and k = 3 for MLR.
To correct the errors of the original day-ahead solar power forecasting methods, LARS is used to identify the key factors that most affect the forecasts of NN and MLR. The critical parameters differ between NN and MLR, as shown in Table 2 and Table 3. The forecasting results of NN and MLR for the different groups are summarized in Table 4 and Table 5, which show that our proposed method greatly improves the accuracy of PV prediction regardless of whether the PV power is predicted by machine learning or by MLR.
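
For readers who want to reproduce the kind of comparison reported in Table 4 and Table 5, the three error measures can be computed as in the following Python sketch. It assumes the standard textbook definitions of RMSE, MAPE, and MxAE; the exact definitions used by the authors are not restated in this section. In particular, skipping samples whose actual power is zero (for example at night) in the MAPE is an assumption of this sketch, and the function names are illustrative.

```python
import math
from typing import Sequence


def _check(actual: Sequence[float], predicted: Sequence[float]) -> None:
    if len(actual) != len(predicted) or len(actual) == 0:
        raise ValueError("inputs must be non-empty and of equal length")


def rmse(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Root mean square error, in the unit of the data (kW here)."""
    _check(actual, predicted)
    squared = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    return math.sqrt(squared / len(actual))


def mape(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Mean absolute percentage error in percent.

    Assumption: samples with zero actual power are skipped, because the
    percentage error is undefined there.
    """
    _check(actual, predicted)
    terms = [abs((a - p) / a) for a, p in zip(actual, predicted) if a != 0]
    if not terms:
        raise ValueError("MAPE is undefined when all actual values are zero")
    return 100.0 * sum(terms) / len(terms)


def mxae(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Maximal absolute error, in the unit of the data (kW here)."""
    _check(actual, predicted)
    return max(abs(a - p) for a, p in zip(actual, predicted))
```

Applying the three functions to the same day once with the original forecast and once with the revised forecast gives a pair of rows comparable to those in the tables.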

5.3. Performance Evaluation of ASPF

We now validate that our ASPF can adaptively revise the solar power predicted by NN. We randomly select 100 days of data from 1 January to 31 October 2016 and store them as the initial database, which includes both the predictive information and the actual PV power. The data in the database gradually accumulate as the number of days increases in this system. Data from 1 November to 30 December 2016 are used to test the effectiveness of ASPF. By comparing with the cluster centers $C_1$ and $C_2$, group 1 has 23 data points and group 2 has 37. Here, $d_1 = 1, 2, \ldots, 23$ and $d_2 = 1, 2, \ldots, 37$.
For these 60 testing days, the ASPF results are given in Table 6. NL is the number of days whose MxAE is smaller than $\Delta_2$, and NB denotes the number of days whose MxAE is larger than $\Delta_2$. Table 6 shows that group 1 has 8 days with a similar day and 15 days without one. Because less data are stored for group 1 than for group 2, there are only 4 days on which the historical actual value can be accepted as the revised solar power. The non-similar-day data are revised by our proposed BPNN, and the results are satisfactory, as their MxAEs are less than $\Delta_2$. Moreover, group 2 has more similar days than group 1 because the database holds more data for group 2. There is also one non-similar day whose BPNN result exceeds $\Delta_2$, because the weather changes are more complex in non-sunny circumstances, which leads to larger errors. However, the overall performance of our proposed method is stable and reliable.
The comparisons between the proposed method and the historical data are shown in Figure 9 and Figure 10. Here, $P_{sdm}$ denotes the corresponding historical actual solar power if similar days exist, $P_{ori}$ is the solar power predicted by the original method, and $P_{act}$ represents the actual solar power of the day. Figure 9a shows the result when there is a similar day in group 1: when $d_1 = 1$, $P_{sdm}$ is used as the revised PV power. At night on that day, the actual solar power $P_{act}(d_1)$ becomes available from the EMS, and the MxAE between $P_{sdm}$ and $P_{act}(d_1)$ can be computed; here the MxAE is 9.112, which is less than the initial $\Delta_2$ of 9.476. In Figure 9b, there are 3 similar days, and the actual MxAE between $P_{sdm}$ and $P_{act}(13)$ is 8.351, larger than the current $\Delta_2$. In Figure 9d, the data of the 13th day are used to retrain the BP neural network. When $d_1 = [2, 3, 4, 5, 6, 8, 9, 10, 11, 12]$, there is no similar day, so the BPNN is used to obtain the day-ahead revised solar power, as shown in Table 7. The MxAEs on these days are less than 9.476, so $\Delta_2$ gradually decreases. $\Delta_2$ is not updated when the BPNN is not used, as shown in Figure 9d for the 6th and the 7th day. Figure 9c illustrates $\Delta_1$ in the EMS: if there is no similar day or only one similar day, $\Delta_1$ stays constant. When the number of similar days is larger than 1, several similar days are in the database, and $\Delta_1$ should be lowered in order to find possible similar days as the data increase. In this way, the efficiency of ASPF keeps improving. Note that although we plot the results for sunny days in Figure 9 and Figure 10, our simulation results are effective not only for sunny days but also for PV power revision under other weather types, such as cloudy or rainy days. This can be seen in Table 7 and Table 8, because the data in group 1 or group 2 can be of any weather type.
The results for group 2 are shown in Figure 10. As the number of similar days increases, the changes in $\Delta_1$ and $\Delta_2$ are even more apparent than in group 1. This change depends on the number of operation days. Table 7 and Table 8 record the detailed steps in group 1 and group 2. Here, SI is an indicator of the existence of similar days: 1 indicates that similar days exist, while 0 means no similar day. TS represents the number of similar days when SI = 1. WD indicates that the MxAE between $P_{sdm}$/$P_{rev}$ and $P_{act}$ is less than $\Delta_2$. In contrast, BD indicates that the MxAE is greater than $\Delta_2$. NS is the opposite of SI, and NS = 1 means no similar day. A second pair of WD and BD indicators (the last two rows of Table 7 and Table 8) records the state of our revised prediction performance after updating.
In Table 7, BD is 1 on the 13th day, which means that the similar day's actual value deviates considerably from the 13th day's actual solar power. ASPF therefore retrains the network using the 13th day's data at the end of the 13th day. On the 18th day, a similar day close to the 13th day exists, so the 13th day's actual solar power is adopted as the day-ahead revised power, and the MxAE is less than $\Delta_2$. The same situation also occurs in group 2, as shown in Table 8. In group 2, there is no similar day on the 3rd day, so the solar power is revised by the BPNN and then stored in the database, as shown in Table 8. The 25th day is similar to the 3rd day, so the 3rd day's actual solar power is used as the 25th day's revised power, and the result is acceptable, as shown in Figure 10d. This reveals that as the amount of data increases, the results of adopting a similar day's actual power become more accurate, and the predictive input can be identified and compensated adaptively.
The above simulations show that it is reasonable to use the similar day's actual value as the solar power forecast if the forecasting data satisfy the condition that the deviation of the predicted data is less than $\Delta_1$. In addition, because the BPNN is updated whenever the MxAE exceeds $\Delta_2$, the BPNN can effectively predict the data in different situations. In 98% of cases, the BPNN obtains satisfactory results in this system.
Furthermore, to verify the wide applicability of our proposed ASPF algorithm, we run a similar simulation for the PV power predicted by MLR and compare the results. First, 150 days of data from 1 January 2016 to 31 August 2016 are randomly selected as the initial database to obtain the MLR-predicted PV power. Then, we select 120 days from 1 September to 31 December 2016 as testing data. According to the k-means results, the testing days are classified into three groups, which contain 43, 51, and 26 days, respectively. The ASPF simulation results are shown in Figure 11. In Figure 11a, $\Delta_1$ gradually decreases as the number of days increases, because the data in group 1 have more accurate predictions than those in group 2, according to Table 5. As $\Delta_2$ decreases in Figure 11b, the network updates continuously, and the number of similar days increases so that $\Delta_1$ becomes correspondingly smaller. In Figure 11c,e, $\Delta_1$ does not change, because group 2 and group 3 have fewer similar days than group 1. $\Delta_2$ in Figure 11d,f decreases gradually, and thus the accuracy of our method improves continuously. In summary, the proposed ASPF obtains high-precision predictions in both non-similar-day and similar-day modes under different predicting methods.
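
The assignment of testing days to groups by comparison with the cluster centers can be illustrated with a short Python sketch. It assumes that "comparing with the cluster centers" means choosing the nearest center in Euclidean distance, as in standard k-means; the distance actually used by the authors, and the helper names below, are assumptions of this sketch.

```python
import math
from collections import defaultdict
from typing import Dict, List, Sequence


def nearest_center(x: Sequence[float], centers: Sequence[Sequence[float]]) -> int:
    """Return the index of the center closest to x (Euclidean distance)."""
    return min(range(len(centers)), key=lambda k: math.dist(x, centers[k]))


def split_into_groups(
    days: Sequence[Sequence[float]], centers: Sequence[Sequence[float]]
) -> Dict[int, List[int]]:
    """Map each center index to the 1-based indices of the days assigned to it."""
    groups: Dict[int, List[int]] = defaultdict(list)
    for day_index, x in enumerate(days, start=1):
        groups[nearest_center(x, centers)].append(day_index)
    return dict(groups)
```

With two centers, the lengths of the two resulting lists play the role of the group sizes quoted above (23 and 37 days in the NN experiment).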
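
The bookkeeping behind the first five rows of Table 7 and Table 8 can be sketched as follows. This is an illustrative reading of the indicator definitions, not the authors' code: the `DayRecord` class and `record_day` function are hypothetical names, the post-update indicators are omitted, and the sketch assumes (as the tables suggest) that WD and BD are only set on days with a similar day, with an MxAE exactly equal to $\Delta_2$ counted as acceptable.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DayRecord:
    SI: int  # 1 if at least one similar day exists, else 0
    TS: int  # number of similar days found
    WD: int  # 1 if a similar day was used and MxAE did not exceed delta2
    BD: int  # 1 if a similar day was used and MxAE exceeded delta2
    NS: int  # 1 if no similar day exists (opposite of SI)


def record_day(num_similar_days: int, mxae: float, delta2: float) -> DayRecord:
    """Build the indicator row for one testing day."""
    if num_similar_days < 0:
        raise ValueError("num_similar_days must be non-negative")
    if num_similar_days == 0:
        # Non-similar day: the forecast is revised by the BPNN instead.
        return DayRecord(SI=0, TS=0, WD=0, BD=0, NS=1)
    bad = 1 if mxae > delta2 else 0
    return DayRecord(SI=1, TS=num_similar_days, WD=1 - bad, BD=bad, NS=0)
```

For example, `record_day(1, 9.112, 9.476)` corresponds to the first day of group 1 described above (one similar day, MxAE below the initial $\Delta_2$) and yields SI = 1, TS = 1, WD = 1, BD = 0, NS = 0.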

6. Conclusions

In this paper, we propose an adaptive solar power forecasting algorithm for highly precise solar power forecasting. We first present the adaptive framework of the algorithm, which integrates the improved k-means clustering, LARS, and BPNN. ASPF is thus able to select the important variables, compensate the predictions from different methods, and update the BPNN adaptively. We also present these algorithms in detail. Finally, we evaluate the proposed ASPF through simulation experiments. We verify the validity of the algorithms, namely the improved k-means, LARS, BPNN, and ASPF. Furthermore, we verify the effectiveness of the ASPF algorithm for different prediction models. The simulation results show that our proposed method greatly improves the accuracy of solar power forecasting and keeps improving as more data are collected. For future work, we plan to apply ASPF in other applications, such as the energy scheduling mentioned in the Introduction.

Author Contributions

Y.W. proposed the concept of adaptive solar power forecasting; Y.W., H.Z., F.Z. and J.C. worked together to investigate the proper method to realize this adaptive solar power forecasting; H.Z., X.C. and F.Z. validated the performance in MATLAB, and the test data were provided by X.C.; H.Z. and Y.W. analyzed the data and wrote the paper; X.C., F.Z. and J.C. supervised the entire process.

Funding

The authors gratefully acknowledge financial support for this project provided by the NSF of China under Grant No. 51607087 and by the Fundamental Research Funds for the Central Universities of China under Grant No. XCA17003-06.

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. Wang, Y.; Mao, S.; Nelms, R.M. Online Algorithms for Optimal Energy Distribution in Microgrids; Springer International Publishing: Berlin, Germany, 2015.
  2. Wang, Y.; Mao, S.; Nelms, R.M. On Hierarchical Power Scheduling for the Macrogrid and Cooperative Microgrids. IEEE Trans. Ind. Inform. 2015, 11, 1574–1584.
  3. Liang, H.; Zhuang, W. Stochastic Modeling and Optimization in a Microgrid: A Survey. Energies 2014, 7, 2027–2050.
  4. Meng, L.; Sanseverino, E.R.; Luna, A.; Dragicevic, T.; Vasquez, J.C.; Guerrero, J.M. Microgrid supervisory controllers and energy management systems: A literature review. Renew. Sustain. Energy Rev. 2016, 60, 1263–1273.
  5. Wang, Y.; Mao, S.; Nelms, R.M. Online Algorithm for Optimal Real-Time Energy Distribution in the Smart Grid. IEEE Trans. Emerg. Top. Comput. 2013, 1, 10–21.
  6. Wang, Y.; Shen, Y.; Mao, S.; Cao, G.; Nelms, R.M. Adaptive Learning Hybrid Model for Solar Intensity Forecasting. IEEE Trans. Ind. Inform. 2018, 14, 1635–1645.
  7. Soman, S.S.; Zareipour, H.; Malik, O.; Mandal, P. A review of wind power and wind speed forecasting methods with different time horizons. In Proceedings of the North American Power Symposium, Arlington, TX, USA, 26–28 September 2010; pp. 1–8.
  8. Antonanzas, J.; Osorio, N.; Escobar, R.; Urraca, R.; Martinez-De-Pison, F.J.; Antonanzas-Torres, F. Review of photovoltaic power forecasting. Sol. Energy 2016, 136, 78–111.
  9. Bacher, P.; Madsen, H.; Nielsen, H.A. Online short-term solar power forecasting. Sol. Energy 2009, 83, 1772–1783.
  10. Liu, J.; Fang, W.; Zhang, X.; Yang, C. An Improved Photovoltaic Power Forecasting Model with the Assistance of Aerosol Index Data. IEEE Trans. Sustain. Energy 2015, 6, 434–442.
  11. Shi, J.; Lee, W.J.; Liu, Y.; Yang, Y.; Wang, P. Forecasting Power Output of Photovoltaic Systems Based on Weather Classification and Support Vector Machines. IEEE Trans. Ind. Appl. 2015, 48, 1064–1069.
  12. Su, W.; Wang, J.; Roh, J. Stochastic Energy Scheduling in Microgrids with Intermittent Renewable Energy Resources. IEEE Trans. Smart Grid 2014, 5, 1876–1883.
  13. Osório, G.J.; Lujano-Rojas, J.M.; Matias, J.C.O.; Catalão, J.P.S. Including forecasting error of renewable generation on the optimal load dispatch. In Proceedings of the 2015 IEEE Eindhoven PowerTech, Eindhoven, The Netherlands, 29 June–2 July 2015; pp. 1–6.
  14. Palma-Behnke, R.; Benavides, C.; Lanas, F.; Severino, B.; Reyes, L.; Llanos, J.; Sáez, D. A Microgrid Energy Management System Based on the Rolling Horizon Strategy. IEEE Trans. Smart Grid 2013, 4, 996–1006.
  15. Chen, C.; Duan, S.; Cai, T.; Liu, B. Smart energy management system for optimal microgrid economic operation. IET Renew. Power Gener. 2011, 5, 258–267.
  16. Ding, M.; Wang, L.; Bi, R. An ANN-based Approach for Forecasting the Power Output of Photovoltaic System. Procedia Environ. Sci. 2011, 11, 1308–1315.
  17. Yesilbudak, M.; Çolak, M.; Bayindir, R. A review of data mining and solar power prediction. In Proceedings of the IEEE International Conference on Renewable Energy Research and Applications, Birmingham, UK, 20–23 November 2016; pp. 1117–1121.
  18. Benmouiza, K.; Cheknane, A. Forecasting hourly global solar radiation using hybrid k-means and nonlinear autoregressive neural network models. Energy Convers. Manag. 2013, 75, 561–569.
  19. Mccandless, T.C.; Haupt, S.E.; Young, G.S. A regime-dependent artificial neural network technique for short-range solar irradiance forecasting. Renew. Energy 2016, 89, 351–359.
  20. Huang, X.; Ye, Y.; Zhang, H. Extensions of Kmeans-Type Algorithms: A New Clustering Framework by Integrating Intracluster Compactness and Intercluster Separation. IEEE Trans. Neural Netw. Learn. Syst. 2014, 25, 1433–1446.
  21. Chiu, S.L. Fuzzy model identification based on cluster estimation. J. Intell. Fuzzy Syst. 1994, 2, 267–278. (In Chinese)
  22. Tang, N.; Mao, S.; Wang, Y.; Nelms, R.M. Solar Power Generation Forecasting with a LASSO-Based Approach. IEEE Internet Things J. 2018, 5, 1090–1099.
  23. Hastie, T.; Tibshirani, R.; Friedman, J. The Elements of Statistical Learning: Data Mining, Inference, and Prediction; Springer: Berlin, Germany, 2001.
  24. Efron, B.; Hastie, T.; Johnstone, I.; Tibshirani, R. Least angle regression. Ann. Stat. 2004, 32, 407–451.
  25. Rumelhart, D.E.; Hinton, G.E.; Williams, R.J. Learning representations by back-propagating errors. Nature 1986, 323, 399–421.
  26. Hartigan, J.A.; Wong, M.A. Algorithm AS 136: A K-Means Clustering Algorithm. J. Roy. Stat. Soc. 1979, 28, 100–108.
  27. Kanungo, T.; Mount, D.M.; Netanyahu, N.S.; Piatko, C.D.; Silverman, R.; Wu, A.Y. An efficient k-means clustering algorithm: analysis and implementation. IEEE Trans. Pattern Anal. Mach. Intell. 2002, 24, 881–892.
  28. Ratanamahatana, C.A.; Lin, J.; Gunopulos, D.; Keogh, E.; Vlachos, M.; Das, G. Mining Time Series Data. Data Min. Knowl. Discov. Handb. 2005, 1069–1103.
  29. Modha, D.S.; Spangler, W.S. Feature Weighting in k-Means Clustering. Mach. Learn. 2003, 52, 217–237.
  30. Zhou, S.B.; Xu, Z.Y.; Tang, X.Q. Method for determining optimal number of clusters in K-means clustering algorithm. J. Comput. Appl. 2010, 30, 1995–1998.
  31. Rezaee, M.R.; Lelieveldt, B.P.F.; Reiber, J.H.C. A new cluster validity index for the fuzzy c-mean. Pattern Recognit. Lett. 1998, 19, 237–246.
  32. Liu, L.; Zhai, D.; Jiang, X. Current situation and development of the methods on bad-data detection and identification of power system. Power Syst. Prot. Control 2010, 38, 143–147.
  33. Govindaraju, R.S.; Rao, A.R. Artificial Neural Networks in Hydrology. J. Hydrol. Eng. 2000, 5, 124–137.
  34. Jiao, B.; Ye, M. Determination of Hidden Unit Number in a BP Neural Network. J. Shanghai Dianji Univ. 2013, 16, 113–116, 124. (In Chinese)
Figure 1. The function of ASPF.
Figure 2. The principle of identifying similar days.
Figure 3. The ASPF schematic in different modes.
Figure 4. The principle of the updating judgment.
Figure 5. The solar power for different weather types.
Figure 6. The flow chart for searching similar days.
Figure 7. The flow chart for ASPF in power systems.
Figure 8. (a) The iterations in traditional and improved k-means. (b) The curve for determining the optimal number of clusters.
Figure 9. The results of group 1 in ASPF. (a) shows the case in which there is a similar day and the MxAE between the actual solar power and the historical value is less than $\Delta_2$; in contrast, (b) shows the case in which the MxAE is larger than $\Delta_2$; (c) plots how $\Delta_1$ in group 1 changes with the similar-day information; (d) shows how $\Delta_2$ in group 1 changes with the MxAE.
Figure 10. The results of group 2 in ASPF. (a) shows the case in which there is a similar day and the MxAE between the actual solar power and the historical value is less than $\Delta_2$; in contrast, (b) shows the case in which the MxAE is larger than $\Delta_2$; (c) plots how $\Delta_1$ in group 2 changes with the similar-day information; (d) shows how $\Delta_2$ in group 2 changes with the MxAE.
Figure 11. The results of ASPF applied to compensate the solar PV power predicted by MLR. (a,c,e) show the trend of $\Delta_1$ in group 1, group 2, and group 3, respectively. (b,d,f) show $\Delta_2$ in group 1, group 2, and group 3, respectively.
Table 1. Notation.
| Symbol | Description | Symbol | Description |
|---|---|---|---|
| $P_{ori}$ | the original predictive solar power | $C_k^{giv}$ | initial given clustering center in group k |
| $P_{rev}$ | the revised solar power | $C_k$ | optimal clustering center in group k |
| $P_{act}$ | the actual solar power | $D_n^{(k)}$ | density function value of the nth data, used to calculate the initial center of the (k+1)th cluster |
| $P_{sdm}$ | the historical solar power | $D_k$ | the kth group initial cluster center |
| $\Delta_1$ | the threshold to judge the similar day | $D_w$ | inner distance in the same group |
| $\Delta_{1,up}$ | the updated value of $\Delta_1$ | $D_b$ | the minimum distance between different groups |
| $\Delta_2$ | acceptable error between $P_{rev}$ and $P_{act}$ | $I_k$ | intra-group and inter-group index |
| $\Delta_{2,up}$ | the updated value of $\Delta_2$ | $N_k$ | the number of data in the kth cluster |
| $R_s$ | signal to trigger the network updating | $x_{k,g}$ | the gth data in group k |
| $x$ | the vector containing $P_{ori}$ and weather data | $cor$ | correlation coefficient of residual |
| $N$ | the total number of $x$ | $\hat{o}_n$ | the predictive value of the nth data |
| $k$ | clustering number | $w$ | the weight of the neural network |
| $K$ | the maximum number of clusters | $b$ | the bias value of the neural network |
| $C_k$ | the kth clustering center | $h$ | the number of hidden neurons |
| $NT$ | total number of similar days or non-similar days | $NL$ | the number of days with the MxAE less than $\Delta_2$ |
| $NB$ | the number of days with the MxAE larger than $\Delta_2$ | $SI$ | the indicator to record whether there is a similar day |
| $TS$ | the number of similar days when SI = 1 | $WD$ | the indicator to record whether the MxAE between $P_{sdm}$/$P_{rev}$ and $P_{act}$ is less than $\Delta_2$ |
| $BD$ | the indicator to record whether the MxAE between $P_{sdm}$/$P_{rev}$ and $P_{act}$ is larger than $\Delta_2$ | WD (after update) | the indicator to record the WD state after the NN is updated |
| BD (after update) | the indicator to record the BD state after the NN is updated | $NS$ | the indicator for a day without a similar day |
Table 2. The results of variable selection by LARS in NN.
| Pattern | Key Factors |
|---|---|
| group 1 | Original solar power, Air temperature |
| group 2 | Original solar power, Air temperature, Sunshine duration |
Table 3. The results of variable selection by LARS in MLR.
| Pattern | Key Factors |
|---|---|
| group 1 | Original PV power, Sunshine duration, Air temperature, Relative humidity |
| group 2 | Original PV power, Relative humidity, Air temperature, Sunshine duration |
| group 3 | Original PV power, Air temperature, Sunshine duration |
Table 4. Comparison of forecasting results in NN.
| | RMSE (kW) | MAPE (%) | MxAE (kW) |
|---|---|---|---|
| group 1 | 3.66 | 4.30 | 9.476 |
| NN | 9.31 | 16.42 | 22.34 |
| group 2 | 1.74 | 7.04 | 8.33 |
| NN | 6.44 | 31.91 | 18.14 |
Table 5. Comparison of forecasting results in MLR.
| | RMSE (kW) | MAPE (%) | MxAE (kW) |
|---|---|---|---|
| group 1 | 3.02 | 7.43 | 10.01 |
| MLR | 9.01 | 51.04 | 25.51 |
| group 2 | 3.87 | 8.39 | 9.67 |
| MLR | 9.59 | 44.73 | 20.46 |
| group 3 | 2.68 | 6.54 | 8.54 |
| MLR | 11.99 | 34.66 | 23.14 |
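Table 6. The results of ASPF.
| Pattern | Similar Days: NT | Similar Days: NL | Similar Days: NB | Non-Similar Day: NT | Non-Similar Day: NL | Non-Similar Day: NB |
|---|---|---|---|---|---|---|
| group 1 | 8 | 4 | 4 | 15 | 15 | 0 |
| group 2 | 17 | 12 | 5 | 20 | 19 | 1 |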
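Table 7. The results of ASPF in group 1.
| Day | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 | 12 | 13 | 14 | 15 | 16 | 17 | 18 | 19 | 20 | 21 | 22 | 23 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| SI | 1 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 1 | 1 | 1 | 0 | 1 | 1 | 0 | 0 | 1 | 0 | 0 |
| TS | 1 | 0 | 0 | 0 | 0 | 0 | 3 | 0 | 0 | 0 | 0 | 0 | 1 | 1 | 1 | 0 | 1 | 1 | 0 | 0 | 1 | 0 | 0 |
| WD | 1 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 |
| BD | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 1 | 0 | 1 | 0 | 0 | 0 | 1 | 0 | 0 |
| NS | 0 | 1 | 1 | 1 | 1 | 1 | 0 | 1 | 1 | 1 | 1 | 1 | 0 | 0 | 0 | 1 | 0 | 0 | 1 | 1 | 0 | 1 | 1 |
| WD (after update) | 0 | 1 | 1 | 1 | 1 | 1 | 0 | 1 | 1 | 1 | 1 | 1 | 0 | 0 | 0 | 1 | 0 | 0 | 1 | 1 | 0 | 1 | 1 |
| BD (after update) | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |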
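Table 8. The results of ASPF in group 2.
| Day | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 | 12 | 13 | 14 | 15 | 16 | 17 | 18 | 19 | 20 | 21 | 22 | 23 | 24 | 25 | 26 | 27 | 28 | 29 | 30 | 31 | 32 | 33 | 34 | 35 | 36 | 37 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| SI | 1 | 1 | 0 | 1 | 1 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 1 | 1 | 1 | 0 | 1 | 1 | 0 | 0 | 0 | 1 | 0 | 0 | 1 | 1 | 1 | 0 | 0 | 1 | 0 | 0 | 1 | 0 |
| TS | 1 | 2 | 0 | 5 | 5 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 2 | 1 | 1 | 2 | 0 | 2 | 5 | 0 | 0 | 0 | 1 | 0 | 0 | 1 | 1 | 1 | 0 | 0 | 1 | 0 | 0 | 7 | 0 |
| WD | 1 | 0 | 0 | 1 | 1 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 1 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 1 | 0 |
| BD | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 1 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 1 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| NS | 0 | 0 | 1 | 0 | 0 | 1 | 0 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 1 | 1 | 1 | 0 | 1 | 1 | 0 | 0 | 0 | 1 | 1 | 0 | 1 | 1 | 0 | 1 |
| WD (after update) | 0 | 0 | 1 | 0 | 0 | 1 | 0 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 0 | 0 | 0 | 1 | 0 | 0 | 1 | 1 | 1 | 0 | 1 | 0 | 0 | 0 | 0 | 1 | 1 | 0 | 1 | 1 | 0 | 1 |
| BD (after update) | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |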

Wang, Y.; Zou, H.; Chen, X.; Zhang, F.; Chen, J. Adaptive Solar Power Forecasting based on Machine Learning Methods. Appl. Sci. 2018, 8, 2224. https://doi.org/10.3390/app8112224
