*Article* **A New Hybrid Short-Term Interval Forecasting of PV Output Power Based on EEMD-SE-RVM**

#### **Sen Wang, Yonghui Sun \*, Yan Zhou, Rabea Jamil Mahfoud and Dongchen Hou**

College of Energy and Electrical Engineering, Hohai University, Nanjing 210098, China; senwang@hhu.edu.cn (S.W.); zhouyan@hhu.edu.cn (Y.Z.); rabea7mahfoud@hotmail.com (R.J.M.); hdc190406030001@hhu.edu.cn (D.H.)

**\*** Correspondence: sunyonghui168@gmail.com; Tel.: +86-139-0516-9126

Received: 30 October 2019; Accepted: 23 December 2019; Published: 23 December 2019

**Abstract:** The output power of photovoltaic (PV) systems is inherently random and uncertain, which makes it difficult to establish an accurate forecasting method. Accurate short-term forecasting of PV output power is nevertheless of great significance for the stability, safe operation and economic dispatch of the power grid. Deterministic point forecast methods ignore the randomness and volatility of PV output power. To overcome this shortcoming, this paper proposes a novel hybrid model for short-term interval forecasting of PV output power based on ensemble empirical mode decomposition (EEMD), sample entropy (SE) and the relevance vector machine (RVM). Firstly, EEMD is used to decompose the PV output power sequence into several intrinsic mode functions (IMFs) and a residual (RES) component. Then, the SE algorithm is used to reconstruct the decomposed components into three new components with typical characteristics. Next, a forecasting model is developed for each new component using RVM. Finally, the forecasting results of the new components are superimposed to obtain the overall forecasting result at a given confidence level. Simulation results show that, compared with several previous methods, the hybrid EEMD-SE-RVM method achieves higher forecasting accuracy and more reliable forecasting intervals, and has high engineering application value.

**Keywords:** photovoltaic output power forecasting; hybrid interval forecasting; relevance vector machine; sample entropy; ensemble empirical mode decomposition

#### **1. Introduction**

With ongoing industrialization, traditional fossil fuels are being depleted at an increasing rate, and the environmental pollution caused by their combustion has become a major obstacle to global economic development. To address this problem, increasing attention has been paid over the past few decades to renewable energy sources, such as biomass, tidal, wind and solar energy [1]. However, because these sources are intermittent and variable, they cause unavoidable fluctuations and instability when they are integrated into the power grid at high penetration. Accurate forecasting of renewable energy generation is therefore essential for the safe, stable and reliable operation of the power grid [2].

For short-term renewable energy generation forecasting, the existing models can be roughly divided into four categories: artificial intelligence based models (AIBM), statistical models, physical models and hybrid models [3]. In [4], statistical smoothing techniques were used to normalize the solar power statistically, which facilitated online short-term forecasting of photovoltaic (PV) power. In [5], the ARIMA model was used as a statistical model to forecast the output power of a grid-connected PV system. As a method of statistical and machine learning, the ensemble approach has also played a crucial role in short-term load forecasting [6,7]. In [6], an ensemble approach was combined with the extreme learning machine (ELM) and wavelets for short-term load forecasting in a solar power system. In [7], a solar forecasting model was proposed based on multiple satellite images and the support vector machine (SVM): the motion vectors of the clouds were predicted from satellite atmospheric motion vector (AMV) images, and the PV output power was then predicted. In [8], ANN techniques were combined with spatial modes to forecast the daily global horizontal irradiance. Physical models use physical factors to construct the required models [9–15]. In most cases there are no distinct boundaries between the different model types, so hybrid models [4,5,11,16–18] have become the most frequently used models for forecasting PV generation. For example, in [4], statistical models and AIBM were combined to implement short-term solar power forecasting, and in [11,16–18], AIBM and physical models were integrated to forecast the output power of PV systems. Moreover, taking the randomness and uncertainty of solar energy into account, many recent studies have discussed the short-term forecasting of PV output power [5,16,19]. Besides the one-day-ahead time horizon [16], other forecasting time scales have also been considered, such as one-hour-ahead and 15-, 30- and 45-min-ahead time horizons [20].

However, most of the aforementioned models address only the point forecast problem and produce a single deterministic value for each time step. Such results inevitably contain forecasting errors [21], and these models cannot describe the non-stationarity of the series in terms of a probable range of fluctuation. Unlike the single value of a conventional point forecast, a prediction interval (PI) quantifies the uncertainty at a prescribed confidence level and thus indicates the range in which the target is likely to fall. Because forecasts are uncertain, a range consisting of upper and lower bounds with an associated confidence level is more credible than a conventional point prediction [22]. Interval forecasting provides more information about the variability of the target variable and is therefore better suited to renewable energy generation [23,24]. Given the results of point forecasting, the prediction interval can be calculated accurately if the probability distribution of the model error is known exactly. In [25], a method based on ELM and the pairs bootstrap was established to obtain probabilistic interval forecasts of wind power, where the prediction error was assumed to follow a Gaussian distribution. In [26], the prediction error was analyzed and assumed to follow a Beta distribution, and an interval forecasting model was then developed. Conventional prediction interval methods thus depend mainly on the accuracy of the point forecasts and on the error assumptions, but it is difficult to justify a specific prior error assumption, which limits the performance of the prediction interval.

Up to now, several methods have been proposed for forecasting renewable energy power [27–31]. For data with strong randomness, preprocessing is especially important for improving the prediction accuracy. Common data processing methods include empirical mode decomposition (EMD), ensemble empirical mode decomposition (EEMD) and wavelet decomposition. For example, EMD decomposes a complex sequence into components that can then be predicted separately. In [32], the prediction interval was optimized by incorporating the conditional probability in order to improve wind forecasting performance. In [33], the EEMD method was used to solve the mode mixing problem. However, conventional EEMD methods usually ignore the correlations among the decomposed components and also add some complexity. In [34], a kind of ELM was proposed for probabilistic interval forecasting of wind power, using a two-layer integrated machine learning method. In [35], random forest models were established for different meteorological conditions, the components were predicted, and the prediction results were combined by weighting. To obtain better short-term forecasting performance, an EEMD method based on sample entropy (SE) was proposed, which was more effective and accurate than conventional EEMD.

Nevertheless, few interval prediction methods for solar power have been based on EEMD, which decomposes the time series into components of different frequencies and forecasts each component separately to improve the accuracy. In this work, EEMD and SE are therefore used together: EEMD decomposes the original sequence, and SE measures the complexity of the decomposed components so that they can be reconstructed into a small number of new components, which improves on the conventional EEMD approach. Interval forecasting with a hybrid method that combines EEMD, SE and the relevance vector machine (RVM) is both challenging and important, and it can improve on the accuracy of the conventional RVM method.

Based on the above discussion, this paper proposes a new hybrid model based on EEMD-SE-RVM for short-term interval forecasting of PV output power. First, EEMD decomposes the original PV output power sequence into several intrinsic mode functions (IMFs) and a residual (RES) component. Three new components with typical characteristics are then obtained using the SE algorithm. Next, a prediction model is established for each new component using RVM, and the forecasting results of the new components are superimposed to obtain the overall forecasting result at a given confidence level. The results of the simulated case study show that this hybrid approach is effective and generalizes well, and that it has strong practical application value.

The rest of this article is organized as follows. Section 2 introduces the basic models of EEMD, SE and RVM algorithms, respectively. Section 3 develops the hybrid model interval forecasting of PV output power. Case studies and numerical results are given in Section 4. Finally, conclusions are drawn in Section 5.

#### **2. Methodology**

#### *2.1. EEMD Principle*

The most obvious drawback of conventional EMD is mode mixing, in which either a single IMF contains signals of widely different scales or signals of a similar scale appear in different IMF components; this usually makes the decomposition unstable. To overcome this drawback, a method named EEMD was proposed, which is essentially a noise-assisted data analysis method: white noise is added to the signal before EMD is applied.

In EEMD, there are two important parameters: the amplitude *k* of the added white noise and the maximum number of iterations *M* of EMD. The values of *M* and *k* are usually chosen according to experience and the characteristics of the data. In this paper, *M* was taken as 100 and *k* was chosen in the range 0.05–0.5.

The detailed steps of EEMD can be highlighted in the following five points [18]:


(4) Repeat steps (2) and (3) with a different white noise series each time to obtain the corresponding IMF components of each trial. The average of the corresponding IMFs over all *N* trials is taken as the final result of each IMF, and the average of all residual components is taken as the final result of the residual:

$$
\overline{c\_i}(t) = \sum\_{n=1}^{N} c\_{i,n}(t) / N, \quad \overline{r\_m}(t) = \sum\_{n=1}^{N} r\_{m,n}(t) / N. \tag{1}
$$

(5) Output the results: *c<sub>i</sub>*(*t*) (*i* = 1, ··· , *m*) represent the IMF components and *r<sub>m</sub>*(*t*) represents the RES component.
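The ensemble-averaging step of Eq. (1) can be sketched in Python as follows. This is a minimal illustration under stated assumptions rather than the authors' implementation: `toy_decompose` is a hypothetical moving-average stand-in for the EMD sifting procedure (it is not EMD), and scaling the noise amplitude *k* by the standard deviation of the signal is an assumption.

```python
import numpy as np

def toy_decompose(x, windows=(3, 9)):
    """Hypothetical stand-in for EMD: peel off layers with moving averages.

    Returns (components, residual); components has one row per layer.
    A real EEMD would call an EMD sifting routine here instead.
    """
    current = np.asarray(x, dtype=float)
    comps = []
    for w in windows:
        smooth = np.convolve(current, np.ones(w) / w, mode="same")
        comps.append(current - smooth)  # fast layer removed at this scale
        current = smooth                # slower part is processed further
    return np.array(comps), current

def ensemble_average(x, decompose, n_trials=100, k=0.2, seed=0):
    """Average the components over noisy copies of x, as in Eq. (1)."""
    rng = np.random.default_rng(seed)
    x = np.asarray(x, dtype=float)
    noise_std = k * np.std(x)  # assumption: k is relative to the signal SD
    comp_sum, res_sum = 0.0, 0.0
    for _ in range(n_trials):
        noisy = x + rng.normal(0.0, noise_std, size=x.shape)
        comps, res = decompose(noisy)
        comp_sum = comp_sum + comps
        res_sum = res_sum + res
    return comp_sum / n_trials, res_sum / n_trials

# Synthetic two-tone signal, used only to exercise the functions
t = np.linspace(0.0, 1.0, 200)
signal = np.sin(2 * np.pi * 3 * t) + 0.3 * np.sin(2 * np.pi * 25 * t)
mean_comps, mean_res = ensemble_average(signal, toy_decompose, n_trials=50)
```

With a real EMD routine the number of IMFs can differ between trials, so the components would have to be aligned before averaging; the stand-in above always returns the same number of layers and avoids that issue.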

#### *2.2. SE Principle*

For the IMF components and the RES component obtained by EEMD, developing a forecasting model for each component individually would greatly increase the computational burden and would ignore the correlation between different components. In this paper, sample entropy theory is used to recombine the components that have similar characteristics.

For given *k*, *r* and *N*, where *k* is the embedding dimension, *r* is the tolerance and *N* is the number of data points, *SampEn*(*N*, *k*, *r*) is the negative logarithm of the conditional probability that two sequences that are similar for *k* points remain similar for *k* + 1 points. For a data sequence {*x<sub>i</sub>*} = {*x*(1), ... , *x*(*N*)}, the sample entropy is computed as follows:

(1) From the sequence {*x<sub>i</sub>*}, construct the *k*-dimensional vectors

$$X(i) = \left[ \mathbf{x}(i), \mathbf{x}(i+1), \dots, \mathbf{x}(i+k-1) \right] \tag{2}$$

(2) Define the distance *d<sub>k</sub>*(*X*(*i*), *X*(*j*)) between vectors *X*(*i*) and *X*(*j*) as the maximum absolute difference between their corresponding scalar components

$$d\_k(X(i), X(j)) = \max\_{l = 0, \dots, k-1} |\mathbf{x}(i+l) - \mathbf{x}(j+l)| \tag{3}$$

(3) For a given value of *r*, count the number of vectors *X*(*j*) (*j* ≠ *i*) for which *d<sub>k</sub>*(*X*(*i*), *X*(*j*)) ≤ *r*, and divide this count by *N* − *k*. The ratio is defined as

$$B\_i^k(r) = \frac{1}{N - k} \operatorname{num}\{ d\_k(X(i), X(j)) \le r,\ j \ne i \} \tag{4}$$

where *r* denotes the threshold, which serves as a noise filter, *r* > 0; *i* = 1, ··· , *N* − *k* + 1.

(4) The mean value of *B<sup>k</sup><sub>i</sub>*(*r*) can be represented as

$$B^k(r) = \frac{1}{N - k + 1} \sum\_{i=1}^{N-k+1} B\_i^k(r) \tag{5}$$

(5) Increase the dimension to *k* + 1 and repeat steps (1) to (4). The mean value of *B<sup>k+1</sup><sub>i</sub>*(*r*) can be represented as

$$B^{k+1}(r) = \frac{1}{N-k} \sum\_{i=1}^{N-k} B\_i^{k+1}(r) \tag{6}$$

(6) Finally, *SampEn* for a finite data length of *N* can be estimated as

$$SampEn(N,k,r) = -\ln[B^{k+1}(r)/B^k(r)]\tag{7}$$

In general, *r* is between 0.1 SD and 0.25 SD and *k* equals 1 or 2, where SD is the standard deviation of the time series. Here *k* is set to 2 and *r* to 0.15 SD.
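Steps (1) to (6) can be sketched in Python as follows. This is a minimal sketch that follows Eqs. (2)–(7) as written, with self-matches excluded; the function names are hypothetical, and the all-pairs distance matrix is chosen for clarity rather than speed, so it is only suitable for short series.

```python
import numpy as np

def _match_ratio(x, dim, r):
    """Mean fraction of other embedding vectors within tolerance r (Eqs. (2)-(5))."""
    n_vec = len(x) - dim + 1
    # Embedding vectors X(i) = [x(i), ..., x(i + dim - 1)]
    vecs = np.array([x[i:i + dim] for i in range(n_vec)])
    # Maximum absolute component difference between every pair of vectors
    dist = np.max(np.abs(vecs[:, None, :] - vecs[None, :, :]), axis=2)
    counts = np.sum(dist <= r, axis=1) - 1  # subtract the self-match
    return np.mean(counts / (n_vec - 1))

def sample_entropy(x, k=2, r_factor=0.15):
    """SampEn(N, k, r) with r = r_factor * SD, as in Eq. (7)."""
    x = np.asarray(x, dtype=float)
    r = r_factor * np.std(x)
    b_k = _match_ratio(x, k, r)
    b_k1 = _match_ratio(x, k + 1, r)
    if b_k == 0 or b_k1 == 0:
        return float("inf")  # no matches found: entropy is undefined/unbounded
    return -np.log(b_k1 / b_k)

# Synthetic check: a regular sine should be less complex than white noise
rng = np.random.default_rng(0)
t = np.arange(300)
se_sine = sample_entropy(np.sin(2 * np.pi * t / 25))
se_noise = sample_entropy(rng.normal(size=300))
```

A smooth, regular series yields a low value and an irregular one a high value, which is the property the reconstruction in Section 3.3 relies on.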

#### *2.3. RVM Principle*

Compared with other forecasting algorithms, RVM yields highly sparse models, has few parameters to optimize, allows flexible kernel selection and generalizes well; it can also produce interval forecasts directly. Therefore, RVM is used to develop the interval forecasting model for the new components reconstructed by SE.

For a given set of training inputs {*x<sub>n</sub>*}<sup>*N*</sup><sub>*n*=1</sub> and the corresponding outputs {*t<sub>n</sub>*}<sup>*N*</sup><sub>*n*=1</sub>, the relevance vector machine regression model can be defined as follows

$$t\_n = \sum\_{i=1}^{N} \omega\_i K(\mathbf{x}\_n, \mathbf{x}\_i) + \omega\_0 + \varepsilon\_n \tag{8}$$

where ε<sub>*n*</sub> ∼ N(0, σ<sup>2</sup>) is the independent noise of each sample, ω<sub>*i*</sub> are the model weights, *N* is the sample size and *K*(*x*, *x<sub>i</sub>*) is a nonlinear kernel function.

*Energies* **2020**, *13*, 87

Given a training sample set {*x<sub>n</sub>*, *t<sub>n</sub>*}<sup>*N*</sup><sub>*n*=1</sub>, suppose that the target values *t<sub>n</sub>* are independent and that the noise in the data follows a Gaussian distribution with variance σ<sup>2</sup>. The likelihood function of the training sample set can then be described as

$$\begin{aligned} p(t|\omega, \sigma^2) &= \prod\_{n=1}^{N} p(t\_n|\omega, \sigma^2) \\ &= \left(2\pi\sigma^2\right)^{-N/2} \exp\left\{-\frac{\|t-\Phi\omega\|^2}{2\sigma^2}\right\} \end{aligned} \tag{9}$$

where *t* = (*t*<sub>1</sub>, ··· , *t<sub>N</sub>*)<sup>*T*</sup>, ω = (ω<sub>0</sub>, ω<sub>1</sub>, ··· , ω<sub>*N*</sub>)<sup>*T*</sup> and Φ is the design matrix defined by

$$
\Phi = \begin{bmatrix}
1 & K(\mathbf{x}\_1, \mathbf{x}\_1) & K(\mathbf{x}\_1, \mathbf{x}\_2) & \cdots & K(\mathbf{x}\_1, \mathbf{x}\_N) \\
1 & K(\mathbf{x}\_2, \mathbf{x}\_1) & K(\mathbf{x}\_2, \mathbf{x}\_2) & \cdots & K(\mathbf{x}\_2, \mathbf{x}\_N) \\
\vdots & \vdots & \vdots & \ddots & \vdots \\
1 & K(\mathbf{x}\_N, \mathbf{x}\_1) & K(\mathbf{x}\_N, \mathbf{x}\_2) & \cdots & K(\mathbf{x}\_N, \mathbf{x}\_N)
\end{bmatrix} \tag{10}
$$

Based on the prior distribution *p*(ω|α) and the likelihood, the posterior distribution over the weights follows from Bayes' rule and can be written as

$$\begin{split} p(\boldsymbol{\omega}|\boldsymbol{t},\boldsymbol{\alpha},\sigma^{2}) &= \frac{p(\boldsymbol{t}|\boldsymbol{\omega},\sigma^{2})p(\boldsymbol{\omega}|\boldsymbol{\alpha})}{p(\boldsymbol{t}|\boldsymbol{\alpha},\sigma^{2})} \\ &= \left(2\pi\right)^{-\left(N+1\right)/2}|\Sigma|^{-1/2} \exp\left\{-\frac{1}{2}\left(\boldsymbol{\omega}-\boldsymbol{\mu}\right)^{T}\Sigma^{-1}\left(\boldsymbol{\omega}-\boldsymbol{\mu}\right)\right\} \end{split} \tag{11}$$

where Σ = (σ<sup>−2</sup>Φ<sup>*T*</sup>Φ + *A*)<sup>−1</sup>, μ = σ<sup>−2</sup>ΣΦ<sup>*T*</sup>*t* and *A* = *diag*(α<sub>0</sub>, α<sub>1</sub>, ... , α<sub>*N*</sub>).

Finally, the hyperparameters α and the variance σ<sup>2</sup> can be estimated by maximum likelihood, that is, by maximizing the marginal likelihood *p*(*t*|α, σ<sup>2</sup>).

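For a new input *x*<sup>∗</sup><sub>*i*</sub>, the corresponding forecast value and its variance can be described as [13]

$$\begin{cases} y\_\* = \mu^T \varphi(\mathbf{x}\_i^\*) \\ \sigma\_\*^2 = \sigma\_{MP}^2 + \varphi(\mathbf{x}\_i^\*)^T \Sigma \varphi(\mathbf{x}\_i^\*) \end{cases} \tag{12}$$

where φ(*x*<sup>∗</sup><sub>*i*</sub>) = [1, *K*(*x*<sup>∗</sup><sub>*i*</sub>, *x*<sub>1</sub>), ··· , *K*(*x*<sup>∗</sup><sub>*i*</sub>, *x<sub>N</sub>*)]<sup>*T*</sup> is the basis vector of the new input and σ<sup>2</sup><sub>*MP*</sub> is the estimated noise variance.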

Under the confidence level 1 − α, where α here denotes the significance level rather than the RVM hyperparameter, the interval forecasting result can be described as [25]

$$[L\_{b}^{\alpha}, U\_{b}^{\alpha}] = [y\_\* - z\_{\alpha/2} \sigma\_\*, \; y\_\* + z\_{\alpha/2} \sigma\_\*] \tag{13}$$

where *L*<sup>α</sup><sub>*b*</sub> and *U*<sup>α</sup><sub>*b*</sub> represent the lower and upper bounds of the forecasting value, and *z*<sub>α/2</sub> is the critical value of the standard Gaussian distribution, which depends on the confidence level.
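The posterior of Eq. (11) and the interval of Eqs. (12) and (13) can be sketched in Python as follows. This is a minimal sketch under stated assumptions: the hyperparameters α and the noise variance σ<sup>2</sup> are held fixed at arbitrary values instead of being estimated, the kernel is a plain Gaussian kernel with an arbitrary width, and the data are synthetic. The function names are hypothetical.

```python
import numpy as np
from statistics import NormalDist

def rvm_posterior(Phi, t, alpha, sigma2):
    """Posterior mean and covariance of the weights (definitions below Eq. (11))."""
    A = np.diag(alpha)
    Sigma = np.linalg.inv(Phi.T @ Phi / sigma2 + A)
    mu = Sigma @ Phi.T @ t / sigma2
    return mu, Sigma

def predictive_interval(phi_star, mu, Sigma, sigma2, confidence=0.90):
    """Predictive mean and variance (Eq. (12)) and symmetric interval (Eq. (13))."""
    y_star = phi_star @ mu
    var_star = sigma2 + phi_star @ Sigma @ phi_star
    z = NormalDist().inv_cdf(0.5 + confidence / 2.0)  # two-sided critical value
    half_width = z * np.sqrt(var_star)
    return y_star - half_width, y_star + half_width

# Synthetic data (illustrative only): 8 samples, design matrix built as in Eq. (10)
x = np.linspace(0.0, 1.0, 8)
Phi = np.column_stack([np.ones(8), np.exp(-(x[:, None] - x[None, :]) ** 2 / 0.1)])
t = np.sin(2 * np.pi * x)
mu, Sigma = rvm_posterior(Phi, t, alpha=np.ones(9), sigma2=0.01)
lo90, hi90 = predictive_interval(Phi[3], mu, Sigma, 0.01, confidence=0.90)
lo60, hi60 = predictive_interval(Phi[3], mu, Sigma, 0.01, confidence=0.60)
```

Both intervals are centered on the same predictive mean, and the 60% interval is nested inside the 90% interval because only the critical value changes with the confidence level.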

#### **3. Hybrid Forecasting Model**

The proposed hybrid method constructs the PI in three main stages: decomposition of the historical PV output power series using EEMD, reconstruction of the components using SE, and forecasting of the reconstructed components using RVM. This part is divided into five sections. The first section introduces the principle of sample selection. The second section describes the decomposition of the data using EEMD, and the third section demonstrates the reconstruction of the components using SE. The fourth section analyzes the kernel function of the RVM method. The last section presents the evaluation indicators, the corresponding flow chart and the pseudo-code program.

#### *3.1. Sample Selection*

To validate the forecasting ability of the proposed method, PV output power simulation data of a PV power plant in Jiangsu province from July 2011 to June 2012 were obtained. Because sunrise and sunset times differ from season to season, only the 10 h of data from 8:00 to 17:00 were used each day, which keeps the data meaningful and uniform across seasons. In addition, changes in the weather have a strong impact on PV output power: comparing the historical output power curves with the meteorological curves shows that the meteorological conditions greatly influence the PV power output. To keep data of the same kind consistent and to predict the PV output power more accurately, the historical PV output power data were divided into three types (sunny days, cloudy days and rainy days) according to the numerical weather prediction (NWP). For each weather type, the historical PV output power was decomposed using EEMD and a separate forecasting model was developed. Six hours of historical PV output power data and the NWP at the time to be predicted were used as the input of the model. The model in this paper is a rolling prediction model: for each time to be predicted, the input data are updated online and in real time.
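The rolling input construction can be sketched in Python as follows. This is a minimal sketch under stated assumptions: the six historical values are taken to be the six hours immediately preceding the target hour, the NWP is represented by a single hypothetical numeric feature per hour, and windows that cross day boundaries are not treated specially.

```python
import numpy as np

def build_rolling_samples(power, nwp, n_lags=6):
    """Pair each target hour with the n_lags previous power values and its NWP value."""
    power = np.asarray(power, dtype=float)
    nwp = np.asarray(nwp, dtype=float)
    inputs, targets = [], []
    for t in range(n_lags, len(power)):
        lagged = power[t - n_lags:t]  # history available before hour t
        inputs.append(np.concatenate([lagged, [nwp[t]]]))
        targets.append(power[t])      # value to be predicted
    return np.array(inputs), np.array(targets)

# Placeholder numbers standing in for one day of hourly data
power = np.arange(10.0)
nwp = 10.0 * np.arange(10.0)
X, y = build_rolling_samples(power, nwp)
```

Each call to the trained model then uses the most recent row of this form, which corresponds to updating the input online as new measurements arrive.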

#### *3.2. Decomposing the Classified PV Output Power Using EEMD*

Because PV output power is random and volatile under the influence of weather changes and other factors, direct forecasting would produce large errors. To improve the forecasting results, it is essential to preprocess the original data. In the comparison performed, EEMD showed better noise robustness and decomposition results than the other decomposition algorithms. In this paper, the PV output power was therefore decomposed using EEMD to obtain a set of new components. For example, Figure 1 shows the EEMD decomposition results for the PV output power data of a sunny day.

#### *3.3. Reconstructing the New Components Using SE*

As can be seen from Figure 1, some components follow a similar trend. If components are highly similar, their sample entropy values will be close to each other. Therefore, the rules for reconstructing the new components based on SE are as follows:

(1) The sample entropy of the given PV data sequence, IMF components and RES component were calculated.

(2) The components with obviously lower sample entropy value than that of the given sequence could form the trend component.

(3) The components with obviously higher sample entropy value than that of the given sequence could form the random component.

(4) The components whose sample entropy values lie within a given threshold θ of that of the given sequence form the detail component. In this paper, θ = 0.7 was chosen.
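Rules (1) to (4) can be sketched in Python as follows. This is a minimal sketch under stated assumptions: "within a threshold θ" is read as an absolute difference of at most θ between the sample entropy of a component and that of the original sequence, the function names are hypothetical, and the entropy values and component rows in the example are made up for illustration.

```python
import numpy as np

def group_by_sample_entropy(se_components, se_original, theta=0.7):
    """Assign each component index to the trend, detail or random group."""
    groups = {"trend": [], "detail": [], "random": []}
    for idx, se in enumerate(se_components):
        if abs(se - se_original) <= theta:
            groups["detail"].append(idx)   # rule (4): close to the original
        elif se < se_original:
            groups["trend"].append(idx)    # rule (2): clearly lower entropy
        else:
            groups["random"].append(idx)   # rule (3): clearly higher entropy
    return groups

def reconstruct(components, groups):
    """Sum the member components of each group into one new component."""
    components = np.asarray(components, dtype=float)
    return {name: components[idx].sum(axis=0) for name, idx in groups.items()}

# Made-up entropy values for six components and the original sequence
se_values = [2.1, 1.6, 0.9, 0.5, 0.1, 0.02]
groups = group_by_sample_entropy(se_values, se_original=1.0, theta=0.7)
components = np.arange(24.0).reshape(6, 4)  # placeholder component rows
new_components = reconstruct(components, groups)
```

Because every component belongs to exactly one group, the three new components still sum to the same total as the original set of components.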

Figure 2 gives the trend graph of the new components after reconstruction.

**Figure 2.** Trend graph of each new component.

Figure 2 shows that the three components (trend, detail and random) each have their own typical features. The trend component roughly reflects the overall fluctuation of the original PV power sequence. The detail component characterizes the detailed fluctuations of the original PV power sequence. The random component represents the fluctuations caused by other factors, which cannot be explicitly described. Table 1 shows the composition of each new component.



To simplify the calculation further and to narrow the forecasting interval, the trend component was selected for point forecasting, while the detail and random components were selected for interval forecasting. The forecasting results of the different components were then superposed to obtain the interval forecast at a given confidence level, which gives the final prediction.
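The superposition can be sketched in Python as follows. This is a minimal sketch under the assumption that "superposed" means adding the trend point forecast to the lower bounds and to the upper bounds of the two interval forecasts; the text does not state the exact combination rule, and the numbers are placeholders.

```python
import numpy as np

def superpose(trend_point, detail_bounds, random_bounds):
    """Add the trend point forecast to the bounds of the two interval forecasts."""
    trend_point = np.asarray(trend_point, dtype=float)
    lower = trend_point + np.asarray(detail_bounds[0]) + np.asarray(random_bounds[0])
    upper = trend_point + np.asarray(detail_bounds[1]) + np.asarray(random_bounds[1])
    return lower, upper

# Placeholder forecasts for two time steps: (lower bounds, upper bounds)
trend = np.array([10.0, 12.0])
detail = (np.array([-1.0, -0.5]), np.array([1.0, 0.5]))
random_part = (np.array([-0.2, -0.3]), np.array([0.2, 0.3]))
lower, upper = superpose(trend, detail, random_part)
```

Adding the bounds directly gives an interval at least as wide as one obtained by adding the predictive variances of independent components, so this reading is the more conservative of the two.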

#### *3.4. Kernel Function of RVM*

RVM is a pattern recognition and regression forecasting method based on kernel functions, where the kernel implements a non-linear transformation between feature spaces. The basic idea of a mixed kernel is to combine several kernels with different characteristics in a certain proportion, so that the combined kernel function performs better than each kernel alone. RVM places few restrictions on the choice of kernel function, the RBF kernel handles local fluctuations well and the polynomial kernel handles global fluctuations well. Therefore, a combination of the polynomial kernel (a global kernel) and the RBF kernel (a typical local kernel) is used for short-term PV output power interval forecasting so as to obtain better forecasting results. The hybrid kernel is shown as [13,28]

$$K(\mathbf{x}, y) = \theta G(\mathbf{x}, y) + (1 - \theta)P(\mathbf{x}, y) \tag{14}$$

$$G(\mathbf{x}, y) = \exp(-\frac{\|\mathbf{x} - y\|^2}{\sigma^2}) \tag{15}$$

$$P(\mathbf{x}, y) = (\mathbf{x} \cdot y + 1)^2 \tag{16}$$

where *G*(*x*, *y*) is the Gaussian kernel function, *P*(*x*, *y*) is the polynomial kernel function of degree 2, θ is the weight of the Gaussian kernel in the hybrid kernel and σ is the kernel width. θ and σ are the parameters that need to be optimized. In this paper, the optimal values of θ and σ are obtained by grid search [36].
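Equations (14)–(16) can be sketched in Python as follows. This is a minimal sketch of the kernel evaluation only; the function names are hypothetical, and the values of θ and σ in the example are arbitrary rather than the grid-searched values.

```python
import numpy as np

def gaussian_kernel(x, y, sigma):
    """Local kernel of Eq. (15)."""
    diff = np.asarray(x, dtype=float) - np.asarray(y, dtype=float)
    return np.exp(-np.dot(diff, diff) / sigma ** 2)

def polynomial_kernel(x, y):
    """Global kernel of Eq. (16): degree-2 polynomial."""
    return (np.dot(x, y) + 1.0) ** 2

def hybrid_kernel(x, y, theta, sigma):
    """Weighted combination of Eq. (14); theta weights the Gaussian part."""
    return theta * gaussian_kernel(x, y, sigma) + (1.0 - theta) * polynomial_kernel(x, y)

a = np.array([1.0, 0.0])
b = np.array([0.0, 1.0])
k_ab = hybrid_kernel(a, b, theta=0.5, sigma=1.0)
```

Setting θ = 1 recovers the pure Gaussian kernel and θ = 0 the pure polynomial kernel, which makes the two extremes convenient end points for a grid search over θ.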

#### *3.5. Evaluating Indicator*

There are many evaluation indicators for forecasting, and interval forecasting requires indices beyond the well-known point forecasting indices such as MAPE and RMSE. The following evaluation indicators were used in this paper.

(1) Mean absolute percentage error

$$\text{MAPE} = \frac{1}{N} \sum\_{i=1}^{N} \left| \frac{y\_{for} - y\_{tru}}{y\_{tru}} \right| \times 100\% \tag{17}$$

where *y<sub>for</sub>* is the forecast value, *y<sub>tru</sub>* is the actual value of the sample and *N* is the number of samples.

(2) Forecasting interval coverage percentage

$$\text{FICP}^{(1-\beta)} = \frac{1}{N} \xi^{(1-\beta)} \times 100\% \tag{18}$$

where *N* denotes the number of samples and ξ<sup>(1−β)</sup> is the number of actual PV output power values that fall within the interval at the confidence level 1 − β.

(3) Forecasting interval average width

$$\text{FIAW}^{(1-\beta)} = \frac{1}{N} \sum\_{i=1}^{N} \frac{U^{\beta} - L^{\beta}}{y\_{tru}} \tag{19}$$

where *N* is the number of samples, *y<sub>tru</sub>* is the actual value of the sample, *U*<sup>β</sup> is the upper bound and *L*<sup>β</sup> is the lower bound at the confidence level 1 − β.
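The three indicators of Eqs. (17)–(19) can be sketched in Python as follows. This is a minimal sketch with hypothetical function names and made-up numbers; it assumes that the actual values are nonzero, as is the case for daytime PV output.

```python
import numpy as np

def mape(y_for, y_tru):
    """Mean absolute percentage error, Eq. (17), in percent."""
    y_for, y_tru = np.asarray(y_for, dtype=float), np.asarray(y_tru, dtype=float)
    return np.mean(np.abs((y_for - y_tru) / y_tru)) * 100.0

def ficp(lower, upper, y_tru):
    """Forecasting interval coverage percentage, Eq. (18), in percent."""
    lower, upper, y_tru = (np.asarray(v, dtype=float) for v in (lower, upper, y_tru))
    inside = (y_tru >= lower) & (y_tru <= upper)
    return np.mean(inside) * 100.0

def fiaw(lower, upper, y_tru):
    """Forecasting interval average width relative to the actual value, Eq. (19)."""
    lower, upper, y_tru = (np.asarray(v, dtype=float) for v in (lower, upper, y_tru))
    return np.mean((upper - lower) / y_tru)

# Made-up values for four samples
y_tru = [10.0, 20.0, 40.0, 50.0]
y_for = [11.0, 18.0, 40.0, 55.0]
lower = [9.0, 19.0, 41.0, 45.0]
upper = [12.0, 22.0, 45.0, 56.0]
scores = (mape(y_for, y_tru), ficp(lower, upper, y_tru), fiaw(lower, upper, y_tru))
```

A good interval forecast combines a high FICP with a small FIAW, since a very wide interval can always reach high coverage.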

This paper proposed a new EEMD-SE-RVM method used for the PV output power short-term interval forecasting. A simplified pseudo-algorithm that summarizes this process is provided in Algorithm 1.

#### **Algorithm 1. PV power forecast**


By eliminating the mode mixing problem of EMD, EEMD performs better in interval forecasting. However, interval forecasting based on conventional EEMD still suffers from high complexity. The proposed method uses SE to analyze the decomposed components so that the complexity is reduced. As described above, SE recombines the decomposed components into trend, detail and random components to improve the forecasting method. The trend component, which is smoother and steadier, is used for point forecasting, whereas the detail and random components are uncertain and non-stationary and are therefore difficult to handle with conventional point forecast methods. Producing point and interval forecasts for the respective components yields better performance by reducing the fluctuation of the forecast values.

#### **4. Case Study**

In this part, the PV data of the Jiangsu photovoltaic power station from July 2011 to June 2012 were used to test the accuracy and effectiveness of the proposed EEMD-SE-RVM model. The installed capacity of this PV plant was 30 MW, consisting of 28 PV arrays of 1.09 MW each. The data were collected once an hour (24 times a day), and each record is the instantaneous value of the PV output power at the time of collection. The prediction date was randomly selected, and the data before the prediction date were used as the training data of the model.

To validate the interval forecasting performance of the proposed model under different confidence levels, two confidence levels, 90% and 60%, were chosen as examples. Figures 3–8 depict the interval forecasting results for the different cases. Three common indices, the forecasting interval coverage percentage (FICP), the forecasting interval average width (FIAW) and the mean absolute percentage error (MAPE), were used to assess the interval forecasting performance [24,27]. Tables 2–4 give the interval forecasting results and analysis for the different cases.

**Figure 3.** Interval forecasting results in a sunny day with the 90% confidence level.

**Figure 4.** Interval forecasting results in a sunny day with the 60% confidence level.

**Figure 5.** Interval forecasting results in a cloudy day with the 90% confidence level.

**Figure 6.** Interval forecasting results in a cloudy day with the 60% confidence level.

**Figure 7.** Interval forecasting results in a gloomy day with the 90% confidence level.

**Figure 8.** Interval forecasting results in a gloomy day with the 60% confidence level.

**Table 2.** Interval forecasting results of EEMD-sample entropy (SE)-relevance vector machine (RVM) model in a sunny day.



**Table 3.** Interval forecasting results of EEMD-SE-RVM model in a cloudy day.

**Table 4.** Interval forecasting results of EEMD-SE-RVM model in a gloomy day.


To demonstrate the superiority of the proposed method, the RVM, EMD-RVM and EEMD-RVM models were also applied to the same short-term interval forecasting of PV output power. The forecasting results on the sunny day were taken as an example. The three evaluation indices FICP, FIAW and MAPE, together with the model running time, were used to evaluate the interval prediction performance. Table 5 provides the results of the four models at the 90% confidence level.

**Table 5.** Comparison of forecasting effect among four models.


In addition, to evaluate further the adaptability of the proposed model to different PV output power data, other forecast days in different seasons were considered. The dates 6 August 2011, 30 October 2011, 14 May 2012 and 19 March 2012 were selected at random. Based on the original PV output power data of these four days, one-hour-ahead interval forecasts of the PV output power at the 90% confidence level were produced, and the results are illustrated in Figure 9. The corresponding values of the evaluation indicators FICP, FIAW and MAPE are given in Table 6.

**Table 6.** Comparison of indices results among four different days.


**Figure 9.** Interval forecasting results of four days for a PV power plant. (**a**) 6 August 2011; (**b**) 30 October 2011; (**c**) 14 March 2012 and (**d**) 19 May 2012.

Taking sunny days as an example, short-term interval prediction at two different time scales was carried out at the 90% confidence level, and the results are depicted in Figure 10. The corresponding values of the evaluation indicators FICP, FIAW and MAPE are given in Table 7.

**Figure 10.** Results under two different circumstances (**a**) Hour-ahead and (**b**) day-ahead.


**Table 7.** Comparison of indices results between two different days.

The comparison results show that the proposed method forecasts better than the other methods, which supports the superiority and wide adaptability of the proposed model.

#### **5. Conclusions**

Firstly, considering the influence of different meteorological conditions on PV output power, the original PV output power data were classified into three categories. EEMD has a strong theoretical basis and good noise robustness, and it overcomes both the need for manual selection of basis functions in wavelet analysis and the mode mixing phenomenon of EMD. Consequently, EEMD achieves a better decomposition of the original PV output power. Secondly, SE exploits the correlation among the components and reduces the model complexity, which improves the running efficiency. Thirdly, the hybrid kernel RVM method was implemented to achieve short-term interval forecasting of PV output power. In the case study, the EEMD-SE-RVM model obtained better MAPE and FIAW values than the compared models, and its FICP was higher than that of the compared models. The proposed hybrid model thus improved the forecasting precision, increased the interval coverage rate and at the same time reduced the width of the prediction interval, which makes it suitable for practical application to output power forecasting of other renewable energies.

**Author Contributions:** S.W. conceived and designed the experiment and wrote the original manuscript. Y.S. performed the experiments and evaluated the data. Y.Z., R.J.M. and D.H. reviewed and proofread the manuscript. All authors read and proofread the manuscript. All authors have read and agreed to the published version of the manuscript.

**Funding:** This work was supported in part by the National Key R&D Program of China under Grant 2018YFB0904200 (Technology and application of wind power/photovoltaic power prediction for promoting renewable energy consumption), and in part by the eponymous Complement S&T Program of State Grid Corporation of China under Grant SGLNDKOOKJJS1800266.

**Conflicts of Interest:** The authors state that there is no conflict of interest.

#### **References**


© 2019 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).
