Assessment of Different Deep Learning Methods of Power Generation Forecasting for Solar PV System

Kuo, Wen-Chi; Chen, Chiun-Hsun; Hua, Shih-Hong; Wang, Chi-Chuan

doi:10.3390/app12157529

¹ Department of Mechanical Engineering, National Yang Ming Chiao Tung University, Hsinchu 300, Taiwan

² Department of Aerospace and Systems Engineering, Feng Chia University, Taichung 407, Taiwan

* Author to whom correspondence should be addressed.

Appl. Sci. 2022, 12(15), 7529; https://doi.org/10.3390/app12157529

Submission received: 2 July 2022 / Revised: 23 July 2022 / Accepted: 25 July 2022 / Published: 27 July 2022

(This article belongs to the Section Energy Science and Technology)

Abstract

An increase in renewable energy injected into the power system will directly cause a fluctuation in the overall voltage and frequency of the power system. Thus, renewable energy prediction accuracy becomes vital to maintaining good power dispatch efficiency and power grid operation security. This article compares the one-day-ahead PV power forecasting results of three models paired with three groups of weather data. Since the number, loss, and matching problem of weather data will all influence the training results of the model, a pre-processing data framework is proposed to solve the problem in this study. The models used are a deep learning algorithm-based artificial neural network (ANN), long short-term memory (LSTM), and gated recurrent unit (GRU). The weather data groups are Central Weather Bureau (CWB), local weather station (LWS), and hybrid data (the combination of CWB and LWS data). Compared to the other two groups, hybrid data showed a 5–8% improvement in measurements. In addition, when it comes to different weather conditions, the advantages of the LSTM model were highlighted. After further analysis, the LSTM model combined with hybrid data showed the most accurate measurements, which was proved through forecasting results for one month. Finally, the results indicate that when the amount of data is limited, using hybrid data and the five weather features is helpful for training the model. Accordingly, the proposed model shows better one-day-ahead PV forecasting.

Keywords:

deep learning (DL); forecasting; neural network; renewable energy; solar power generation

1. Introduction

In recent decades, renewable energy development and carbon reduction have been energy policy targets in various countries in response to global warming. Due to the intrinsic intermittent and uncertain nature of renewable energy, previous experience shows that a high penetration percentage of renewable energy will affect the system’s voltage, frequency, and load demand fluctuations. In the state of California, the solar penetration rate of residential roofs has increased rapidly, causing the load to show a duck curve, which conflicts with conventional generation efficiency and the frequency of daily start-up/shut-down operations [1,2]. In this context, developing an accurate forecast of long-term and short-term renewable generation ability is vital for efficient power dispatch and operations. This paper compares different models and weather data to predict one-day-ahead hourly PV power generation.

Accurate prediction of PV power generation is challenging due to complex interactions between many environmental conditions and uncontrollable factors. Therefore, studying weather data is crucial to predicting the impact of solar power generation [3,4,5]. Sangrody et al. [6] selected the most influential variables, including sky coverage, relative humidity, and temperature, for forecasting model training. Sensitivity tests showed that relative humidity plays a dominant role in forecasting. Sangrody et al. [7] concluded that the most common weather variables, including sky coverage, dew point, relative humidity, and temperature, can be used to predict solar PV generation. Rehman and Mohandes [8] estimated global solar radiation using relative humidity and daily mean temperature and showed that their model outperformed other cases ((1) daily maximum air temperature and (2) daily mean air temperature), with an absolute mean percentage error of 4.49%. Alluhaidah et al. [9] compared the impact of seven meteorological variables on the accuracy of predictions, including air temperature, relative humidity, pressure, cloud coverage, wind speed, and wind direction. The best results strongly depended on cloud coverage and relative humidity. The least influential variable was wind direction, followed by wind speed. As described by Ghanbarzadeh et al. [10], sunshine hours, daily mean air temperature, and relative humidity are key variables, and the proper processing of these variables on a specific timescale outperformed the other timescale, with an absolute mean percentage error of 8.84%. Giorgi et al. [11] determined the influence of ambient temperature on the forecast of PV power and solar irradiance and yielded the best predictive ability. Gigoni et al. [12] noted that temperature was not as sensitive as other weather parameters among several methods for solar energy prediction. 
Several studies indicated that weather parameters had a potential impact on PV power prediction, including the relative humidity, temperature, diffuse horizontal radiation, and global horizontal radiation, which significantly affected PV power output. Moreover, daily precipitation seems to be a less critical factor in PV power prediction [13,14,15].

Additionally, previous studies pointed out the importance of weather data classification, which significantly helped in improving the PV prediction effect. Liu and Zhang [16] proposed a novel weather classification process and a novel way to embed a physical model of PV units into the K-Nearest Neighbor (KNN) algorithm model. The results showed that the method could reduce the uncertainty of PV output forecasting. Hossain and Mahmood [17] used the K-means algorithm to classify different levels of irradiance, where the interval was based on the hour of the day and the season. Cheng et al. [18] proposed a new method, which was a factor-based graph modeling method using numerical weather prediction (NWP) data for PV power prediction. This method was used to explore and analyze a multi-graph model that can adapt to different weather conditions for PV forecasting. Yu et al. [19] pointed out that inaccurate predictions usually occur on cloudy days. Therefore, to improve prediction accuracy in cloudy weather, for the classification of weather types, cloudy days were divided into cloudy and mixed. Silvia et al. [20] proposed a new ensemble method based on the probabilistic distribution of trials using the probabilistic ensemble method (PEM) for PV power forecasting on cloudy days. The result showed that the nRMSE metric reached 4.79% in 2017 in the totally cloudy days class. This paper proposes a data classification method and finds the critical weather factor that affects PV power generation forecasting.

Furthermore, with advances in the calculation speed of computers, machine learning has been widely used in the past decade, especially in prediction-related literature using large amounts of complex historical information [21,22]. Yona et al. [23] proposed power output forecasting of a PV system based on 24 h ahead insolation forecasting by using reported weather data, fuzzy theory, and a neural network. Yang et al. [24] proposed a hybrid method for hourly forecasting of one-day-ahead PV power. The historical data of PV output power were classified by the self-organizing map (SOM) and learning vector quantization (LVQ) networks. The results showed that the proposed approach achieved better predictive accuracy than the traditional ANN and simple Support Vector Regression (SVR) methods. Sorkun et al. [25] assessed deep learning with time series methods, and their results showed that LSTM and GRU models are suitable and competitive for solar radiation time series forecasting. Hanafi et al. [26] used a backpropagation neural network (BPNN) and extreme learning machine (ELM) for one-hour-ahead solar power forecasting. The results showed that the accuracy and computational time of the extreme learning machine were better than those of the backpropagation neural network. Liu et al. [27] used weather conditions categorized as sunny, cloudy, rainy, low light, sunny-cloudy, and rainy-sunny as the training dataset to carry out one-day-ahead solar power forecasting. The results showed that their proposed simplified LSTM model outperformed the multilayer perceptron (MLP) model. Das et al. [28] provided a comprehensive and systematic review of the direct forecasting of photovoltaic power generation. They also provided a critical analysis of recent work, including statistics and machine learning models based on historical data. Meftah et al. [29] compared the performance of LSTM and MLP for PV power forecasting in winter and summer. The results indicated that LSTM performed better for short-term forecasting. A summary of different forecasting algorithms, data inputs, and horizons in the literature is shown in Table 1.

The actual PV power generation site used in this study is in Taoyuan City, Taiwan. The number of equivalent sunshine hours is about 3 h. The weather in this city in the experimental year was relatively unstable. The objective of this study was to compare three kinds of deep learning models, namely, the artificial neural network (ANN), long short-term memory (LSTM), and gated recurrent unit (GRU), for one-day-ahead PV power generation forecasting. Furthermore, a data pre-processing framework is proposed to improve predictive accuracy and prevent the failure of one-day-ahead PV power forecasts. The structure of this article is as follows: Section 1 presents a brief review of weather factors and modeling research for PV power forecasting. The details of the various weather data resources used in this study are introduced in Section 2. Then, Section 3 proposes the framework for data pre-processing. Section 4 introduces the application of deep learning modeling and evaluation methods. Section 5 presents the numerical PV power forecasting results, followed by the final section for the conclusion.

2. Data Description

This section introduces the sources of the weather and solar photovoltaic parameters and clarifies the means of obtaining various data. The architecture of data collection is shown in Figure 1. The data used in this study include weather data and PV power generation data. The measured weather forecast data and historical data can be easily accessed from the Central Weather Bureau (CWB) website. These data were used for model training and prediction for one-day-ahead PV power forecasting. In addition, the PV power data used in this study were obtained from the SQL server of the energy management system (EMS), which is used as the target of deep learning.

2.1. Photovoltaic Generation Data

The actual PV module field is depicted in Figure 2. The PV module was installed on the rooftop of a fast-food restaurant in Taoyuan City, Taiwan, and the capacity is 10 kW. The number of effective sunshine hours is about 3 h in this area. The collection period of the photovoltaic generation data used in this study was from 1 August 2020 to 3 July 2021. The PV power generation data are sampled every minute in this field, and hourly PV power data were obtained as the average data for 60 min.
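The hourly averaging described above can be sketched as follows. This is a minimal illustration, not the authors' code: the function name `hourly_mean` and the sample readings are hypothetical, and it assumes each sample is a (timestamp, power in kW) pair.

```python
from collections import defaultdict
from datetime import datetime


def hourly_mean(samples):
    """Average (timestamp, kW) minute samples into one value per clock hour."""
    buckets = defaultdict(list)
    for ts, kw in samples:
        # Truncate the timestamp to the start of its hour.
        buckets[ts.replace(minute=0, second=0, microsecond=0)].append(kw)
    return {hour: sum(vals) / len(vals) for hour, vals in sorted(buckets.items())}


# Hypothetical readings: 60 one-minute samples at 2.0 kW, then 60 at 4.0 kW.
samples = [(datetime(2021, 3, 2, 10, m), 2.0) for m in range(60)]
samples += [(datetime(2021, 3, 2, 11, m), 4.0) for m in range(60)]
hourly = hourly_mean(samples)
```

Hours with fewer than 60 samples are still averaged over whatever samples exist; whether such hours should instead be treated as missing is a design choice the text does not specify.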

2.2. Meteorological Data from the CWB

The experimental database of historical meteorological data used in this study was provided by the Xinwu Weather Station of CWB (about 26 km away from the site) from August 2020 to July 2021. The observed weather database has 55 different weather parameters and can be easily accessed from the CWB website. On the other hand, the weather forecast dataset has 16 different weather features, providing the next three days of weather forecast data, and the data interval is hourly. Details can be found on the CWB website [33,34].

2.3. AccuWeather Data

From the historical PV data, it was found that on rainy days, there was a drastic change or almost no PV power generation, as shown in Figure 3. The figure illustrates hourly power generation from 2 to 4 March 2021. The photo shows the weather condition captured by the sky camera at that time for reference. As a result, rainfall data are one of the important parameters for photovoltaic power forecasting. Since there were no rainfall data in the forecast data of the CWB, the rainfall data of Yangmei District in this article were obtained from the AccuWeather Data website, and the data interval is once per hour [35].

2.4. Local Weather Station (LWS) and Pyrheliometer

The LWS data are from a micro weather station installed in the field of Yangmei District, Taiwan, which is a 6-in-1 weather station. It was installed on the platform less than 100 m from the PV panel. That equipment provides data that include temperature, relative humidity, average wind speed, wind direction, rainfall, and atmospheric pressure. The datasheet of the weather station is presented in Table 2. A pyrheliometer was installed in the field to measure and record the current solar radiation, and the datasheet is depicted in Table 3. The data are updated every minute. The hourly LWS data were obtained as the average data for 60 min.

3. Data Pre-Processing

The literature review showed that the acquisition of meteorological data has a critical influence on solar photovoltaic forecasting. The accuracy of the forecasting results will vary considerably and is subject to the quality of the measured parameters. These historical and forecast weather data may be lost during collection or unavailable because the website is inaccessible, and this will affect the forecasting results of photovoltaic power generation for the next day. Therefore, data pre-processing is the most important and critical factor for a one-day-ahead PV power forecasting study. The data pre-processing methods used in this study can be divided into three steps, which are data classification, data filtration, and missing data processing. The structure of data pre-processing used in this study is presented in Figure 4.

3.1. Data Classification

The main purpose of data classification is to classify weather data from different sources before performing forecast training. To study the influence of data from different sources on one-day-ahead solar photovoltaic forecasting, the data used in this study were divided into three sources: The first group is the historical and forecast data provided by the open platform of the CWB. The second group is the historical weather data from the on-site LWS equipment to carry out one-day-ahead forecasting. The third group is the combination of CWB and LWS data, which means using the same weather features and training data from the CWB and LWS, named hybrid weather data in this article.

3.2. Data Filtering

The data filtration process consists of the pre-processing of solar photovoltaic power data, historical weather data, and historical weather forecast data. The details are described as follows:

The filtration method of solar photovoltaic power data: When collecting solar photovoltaic power data, very little electricity consumption is needed to maintain the standby status of the inverter when no power generation occurs. In addition, solar photovoltaic power generation is too low in the early morning. These data not only affect the forecast calculation but are useless in the actual power generation forecast. Therefore, the pre-processing method of solar photovoltaic data entails replacing values less than 0.5 kW and the standby value of −0.002 W by 0. In the zoomed-in area of the collected data illustrated in Figure 5, the values are much lower than −0.002 W, so these values are replaced by 0.
The filtration method of weather feature data: There are 55 different weather feature parameters in CWB’s historical data but only 16 features in CWB’s weekly forecast data. Thus, the 16 forecast parameters were compared to the 55 historical parameters to find exact matches. The features used in the model training process must be the same as those used at prediction time so that the learned weights of the deep learning model remain applicable. After cross-checking with the complete data used in this research, there were only 5 parameters that could be used, as shown in Table 4. Among these 5 parameters, rainfall has a certain influence on the forecast for the next day. Because it is absent from the CWB forecast data, it was obtained from the AccuWeather website and used as one of the parameters.
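The two filtration steps above can be sketched as follows. This is an illustrative sketch under stated assumptions: the function names and the feature names in the example are hypothetical, and power readings are assumed to be in kW.

```python
def filter_pv_power(values_kw, threshold_kw=0.5):
    """Replace readings below the threshold (including negative standby draw) with 0."""
    return [0.0 if v < threshold_kw else v for v in values_kw]


def shared_features(historical_names, forecast_names):
    """Keep only features present in both sources, preserving historical order."""
    forecast_set = set(forecast_names)
    return [name for name in historical_names if name in forecast_set]


# Hypothetical readings: standby draw, low early-morning output, and daytime output.
cleaned = filter_pv_power([-0.002, 0.3, 0.5, 4.2])

# Hypothetical feature names; the real CWB column names are not given in the text.
historical = ["temperature", "relative_humidity", "pressure", "wind_speed", "uv_index"]
forecast = ["temperature", "relative_humidity", "wind_speed", "uv_index", "comfort_index"]
usable = shared_features(historical, forecast)
```

A reading of exactly 0.5 kW is kept here because the text says values "less than 0.5 kW" are replaced.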

3.3. Missing Data Processing

In the process of data transmission and storage, data may be lost due to network disconnection, database maintenance, or some unexpected malfunction. Different methods were used to pre-process lost historical and forecast data, which are described as follows:

The pre-processing method of the historical data, including the data from the CWB and local weather station, involved directly deleting abnormal and missing data. The data for such days were erased when the historical data were −9999, null, or intermittently lost. The number of processed data points is shown in Figure 6. The hollow bars show the amount of data that can be used for one-day-ahead solar photovoltaic forecasting training after processing by the above method. There are 170 days of hybrid data, 308 days of local weather station data, and 181 days of CWB data. The original data are from 1 August 2020 to 20 June 2021, a total of 324 days.
However, the pre-processing method used for lost weather forecast data differs from that used for historical data. There were two methods used in this study. The first was the interpolation of data that were missing for one or two hours of the day. If more than one-fourth of the forecast data for a day were lost, the following method was used. The historical data of the next day’s forecast were searched on the CWB’s database, and similar weather data were directly used as the forecast weather factor for the next day. For example, if the forecast for the next day was mostly clear from 6:00 to 18:00, the forecast data for the same weather were searched in the historical database and used as the next day’s forecast data. The reason for adopting this method is that the unavailability of weather information for the next day will cause the predictive system to crash. As a result, it cannot be applied to actual cases, and it will be meaningless to introduce the predictive module into the energy management system.
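The missing-data rules above can be sketched as follows. This is a rough illustration, not the authors' implementation: all function names are hypothetical, linear interpolation is an assumed choice (the text only says "interpolation"), and gaps longer than two hours but not exceeding one-fourth of the day are assumed to be interpolated as well, since the text does not cover that case.

```python
MISSING = -9999  # sentinel mentioned in the text; None is treated as missing too


def is_missing(value):
    return value is None or value == MISSING


def drop_invalid_days(days):
    """Keep only days whose 24 hourly values are all present (historical data)."""
    return {d: v for d, v in days.items()
            if len(v) == 24 and not any(is_missing(x) for x in v)}


def choose_forecast_repair(hourly):
    """Return 'none', 'interpolate', or 'similar_day' for one day of forecast data."""
    n_missing = sum(is_missing(x) for x in hourly)
    if n_missing == 0:
        return "none"
    if n_missing > len(hourly) / 4:
        return "similar_day"  # fall back to a historical day with similar weather
    return "interpolate"  # assumption: also applied to gaps longer than 2 h


def interpolate_gaps(hourly):
    """Linearly interpolate interior gaps; edge gaps copy the nearest valid value."""
    out = list(hourly)
    valid = [i for i, x in enumerate(out) if not is_missing(x)]
    if not valid:
        raise ValueError("no valid values to interpolate from")
    for i, x in enumerate(out):
        if not is_missing(x):
            continue
        left = max((j for j in valid if j < i), default=None)
        right = min((j for j in valid if j > i), default=None)
        if left is None:
            out[i] = hourly[right]
        elif right is None:
            out[i] = hourly[left]
        else:
            w = (i - left) / (right - left)
            out[i] = hourly[left] + w * (hourly[right] - hourly[left])
    return out
```

For example, `interpolate_gaps([20.0, None, 22.0])` fills the middle hour with the midpoint of its neighbours. The similar-day lookup itself is omitted because the text does not define how similarity between days is measured.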

4. Methods and Evaluation

We note that PV power generation and weather data are related to time series. Thus, the deep learning algorithms chosen for one-day-ahead PV power forecasting were LSTM and GRU neural network models, and the traditional ANN model was added to compare the accuracy of the three models in this study. Hyperparameter tuning of the deep learning network will directly affect the model’s performance and then influence the prediction results. This section introduces the method of hyperparameter tuning for each model. The accuracy evaluation methods used in this study are the most widely used metrics, namely, MAPE, RMSE, and MAE. They are described below.

4.1. Artificial Neural Network (ANN)

An artificial neural network is one of the main tools used in deep learning. It consists of an input layer, one or more hidden layers, and an output layer. The input layer comprises artificial input neurons and brings the initial data into the system for further processing by subsequent layers of artificial neurons. A hidden layer lies between the input layer and the output layer; its artificial neurons receive a set of weighted inputs and produce an output through an activation function. The output layer is the last layer of neurons and produces the outputs of the network. For the neurons in each layer, the activation function applies a nonlinear transformation to the weighted inputs, which improves the model’s effectiveness. The advantage of ANN is that it can deal with complicated nonlinear problems and has excellent adaptive ability. Therefore, ANN is one of the most popular methods to realize predictions [36]. However, its performance tends to be relatively poor when training data are scarce.

4.2. Long Short-Term Memory (LSTM)

A recurrent neural network (RNN) is prone to the vanishing and exploding gradient problem during training, so it has difficulty retaining information over long periods of time. To address this, long short-term memory (LSTM) was introduced by Hochreiter and Schmidhuber [37], who proposed that LSTM can solve the vanishing and exploding gradient problem during the training process. In 1999, Gers et al. [38] introduced the forget gate into the LSTM structure, which decides the ratio of information that needs to be kept in the memory cell. LSTM tends to perform better when more data are available.
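For reference, one common textbook formulation of an LSTM cell with a forget gate is given below. The notation is an assumption and does not come from this article: $x_{t}$ is the input, $h_{t}$ the hidden state, $c_{t}$ the memory cell, $\sigma$ the logistic sigmoid, $\odot$ element-wise multiplication, and $W$, $U$, $b$ the learned weights and biases.

$f_{t} = \sigma(W_{f} x_{t} + U_{f} h_{t-1} + b_{f})$

$i_{t} = \sigma(W_{i} x_{t} + U_{i} h_{t-1} + b_{i})$

$o_{t} = \sigma(W_{o} x_{t} + U_{o} h_{t-1} + b_{o})$

$\tilde{c}_{t} = \tanh(W_{c} x_{t} + U_{c} h_{t-1} + b_{c})$

$c_{t} = f_{t} \odot c_{t-1} + i_{t} \odot \tilde{c}_{t}$

$h_{t} = o_{t} \odot \tanh(c_{t})$

The forget gate $f_{t}$ scales the previous memory cell $c_{t-1}$, which is the "ratio of information kept" referred to above.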

4.3. Gated Recurrent Unit (GRU)

The gated recurrent unit was introduced by Cho et al. [39]. GRU’s design is similar to that of LSTM, but without an output gate. It addresses the vanishing gradient problem by using an update gate and a reset gate. When dealing with small datasets, GRU can be a better choice than LSTM because it has only two gates and therefore fewer parameters and faster computation.
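For reference, one common formulation of the GRU update is given below. The notation is an assumption and does not come from this article: $x_{t}$ is the input, $h_{t}$ the hidden state, $z_{t}$ the update gate, $r_{t}$ the reset gate, $\sigma$ the logistic sigmoid, and $\odot$ element-wise multiplication. Some references swap the roles of $z_{t}$ and $1 - z_{t}$ in the last line.

$z_{t} = \sigma(W_{z} x_{t} + U_{z} h_{t-1} + b_{z})$

$r_{t} = \sigma(W_{r} x_{t} + U_{r} h_{t-1} + b_{r})$

$\tilde{h}_{t} = \tanh(W_{h} x_{t} + U_{h} (r_{t} \odot h_{t-1}) + b_{h})$

$h_{t} = (1 - z_{t}) \odot h_{t-1} + z_{t} \odot \tilde{h}_{t}$

With three weight groups here instead of the four typically used in an LSTM cell, a GRU layer of the same width has fewer trainable parameters.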

4.4. Hyperparameter Adjustment Process

Hyperparameter tuning is a way to optimize the deep learning model by adjusting settings such as the number of hidden units, the batch size, and the number of epochs. The batch size is the number of training samples processed in each weight update. Increasing the batch size can accelerate training, but if the batch size is too large, the memory will be insufficient. By contrast, too small a batch size may lead to underfitting. The number of hidden units determines the number of trainable parameters; the more hidden units there are, the more easily the model overfits. The number of epochs is the number of complete passes over the training data, and the training loss generally decreases as the number of epochs increases. Therefore, the batch size was fixed at the beginning, and three to five layers were tested in every step with the same number of epochs in the hyperparameter adjustment process.
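The adjustment process above resembles a small grid search with the batch size and epoch count held fixed. The sketch below illustrates that idea only; `grid_search`, `train_and_score`, `fake_score`, and every numeric value in the example are hypothetical and are not taken from the study.

```python
from itertools import product


def grid_search(train_and_score, hidden_units, layer_counts, batch_size, epochs):
    """Try every (hidden units, layers) pair at a fixed batch size and epoch count.

    train_and_score is a caller-supplied function (hypothetical here) that trains
    a model with the given settings and returns a validation error to minimise.
    """
    best = None
    for units, layers in product(hidden_units, layer_counts):
        score = train_and_score(units=units, layers=layers,
                                batch_size=batch_size, epochs=epochs)
        if best is None or score < best[0]:
            best = (score, {"units": units, "layers": layers})
    return best


# Stand-in scorer so the sketch runs without a deep learning library.
def fake_score(units, layers, batch_size, epochs):
    return abs(units - 64) / 64 + abs(layers - 4)


best_score, best_config = grid_search(fake_score, [32, 64, 128], [3, 4, 5],
                                      batch_size=32, epochs=100)
```

In practice `train_and_score` would train an ANN, LSTM, or GRU model and return an error such as the validation MAPE.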

4.5. Evaluation Indices

After obtaining every hourly forecast value of one-day-ahead PV power with the proposed model, it was necessary to validate the applicability of the model through certain indices. In machine learning, the most common indices used to evaluate the performance of models are MAE, RMSE, and MAPE [40].

Mean Absolute Error (MAE). MAE can be used to measure the error between predicted values and actual values. It depends on the scale of continuous variables. The lower the value, the higher the accuracy of the predicted model. The equation is given below:

$MAE = \frac{1}{N} \sum_{i = 1}^{N} | \hat{X_{i}} - X_{i} |$ (1)

where $\hat{X_{i}}$ and $X_{i}$ represent the $i^{\text{th}}$ forecasted and actual values, respectively, and N is the size of the test dataset (N = 168 for one week in this study).
Root Mean Squared Error (RMSE). RMSE can be used to measure the deviation between predicted values and actual values. The difference between RMSE and MAE is that RMSE is sensitive to outliers. That means that RMSE is easily influenced by large deviations. Hence, a smaller error indicates better performance. The equation is given below:

$RMSE = \sqrt{\frac{1}{N} \sum_{i = 1}^{N} {(\hat{X_{i}} - X_{i})}^{2}}$ (2)

where $\hat{X_{i}}$ and $X_{i}$ represent the $i^{\text{th}}$ forecasted and actual values, respectively, and N is the size of the test dataset (N = 168 for one week in this study).
Mean Absolute Percentage Error (MAPE). MAPE measures the accuracy as a percentage. It can be used to judge the quality of the predicted result. The definition is:

$MAPE = \left[\frac{1}{N} \sum_{i = 1}^{N} \left|\frac{X_{i} - \hat{X_{i}}}{X_{i}}\right|\right] \times 100\%$ (3)

where $\hat{X_{i}}$ and $X_{i}$ represent the $i^{\text{th}}$ forecasted and actual values, respectively, and N is the size of the test dataset (N = 168 for one week in this study).
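The three indices can be computed as in the following sketch. The function names and sample values are illustrative. One point is an assumption rather than something stated in the article: MAPE is undefined when an actual value is zero (for example, night-time hours set to 0 during filtering), so this sketch skips those hours.

```python
import math


def mae(actual, forecast):
    """Mean absolute error, Equation (1)."""
    return sum(abs(f - a) for a, f in zip(actual, forecast)) / len(actual)


def rmse(actual, forecast):
    """Root mean squared error, Equation (2)."""
    return math.sqrt(sum((f - a) ** 2 for a, f in zip(actual, forecast)) / len(actual))


def mape(actual, forecast):
    """MAPE in percent, Equation (3); hours with zero actual output are skipped."""
    pairs = [(a, f) for a, f in zip(actual, forecast) if a != 0]
    if not pairs:
        raise ValueError("MAPE is undefined when every actual value is zero")
    return 100.0 * sum(abs((a - f) / a) for a, f in pairs) / len(pairs)


# Hypothetical hourly values in kW.
actual = [0.0, 2.0, 4.0, 5.0]
forecast = [0.0, 1.0, 5.0, 5.0]
```

For these sample values the MAE is 0.5 kW, the RMSE is about 0.707 kW, and the MAPE over the three non-zero hours is 25%. Because the squared term weights large errors more heavily, the RMSE is never smaller than the MAE on the same data.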

5. Numerical Results

5.1. Results of Hyperparameter Adjustment

For the actual forecast from 4 February to 21 May 2021, five features (temperature, relative humidity, rainfall, average wind speed, and UV index) were employed with Adam optimizer to tune the hyperparameters. The test results of hyperparameters are shown in Table 5, Table 6 and Table 7. The tables show that the accuracy of hybrid weather data for the one-day-ahead forecast is higher than that of CWB and LWS data when applied to different models.

5.2. Forecast Performance with Different Weather Data Groups

After deep learning model training, these models were applied to execute solar photovoltaic forecasting from 27 June to 3 July 2021. This section describes the use of different deep learning methods (ANN, LSTM, and GRU) with different weather groups for solar photovoltaic forecasting. The forecasting results of the ANN model are presented in Figure 7. The solid blue line is the actual PV power generation, the green dashed line is the predicted result using LWS weather data, the red dashed line is the predicted result using hybrid weather data, and the yellow dashed line is the predicted result using CWB weather data. The lines used in the other graphs have the same meanings. It is apparent that when only CWB data are used, the prediction results of the ANN model severely deviate from the actual power generation during the peak period on 27 June and 29 June 2021. The results indicate that the ANN model does not provide good predictions, even under sunny conditions. On the other hand, when only historical LWS data are used, the results during peak power generation periods deviate quite substantially from the actual values due to poor weather (e.g., 28 June to 29 June). In contrast, the overall prediction curve using hybrid data is closer to the actual power generation trend, although the results are still not good enough under poor weather conditions.

The forecasting results of the LSTM model are shown in Figure 8. When CWB data are used for prediction, the accuracy of LSTM is better than that of ANN. This means that the learning mode with time series helps to improve the predictive accuracy. However, from the prediction results of LWS data, it is found that although the results are better than those of the ANN model (the trends on 28 and 29 June 2021 show significant improvement), there is still a big difference in the prediction results on 1 July 2021. This means that the predictive ability is still limited when only historical data are employed. Conversely, the forecast trend of the hybrid model is similar to the actual PV power generation results.

The forecasting results of the GRU model are shown in Figure 9. It is also evident from the results that when using CWB data, the GRU model can effectively improve the accuracy in comparison with ANN, especially on 27 June 2021. However, the results show that if only the LWS historical weather data are used for forecasting, the prediction results are still inaccurate with severe deviations in tendency. This consequence is the same as those of ANN and LSTM, e.g., on 28 June and 1 July 2021. In contrast, when the hybrid data are used, the one-day-ahead solar photovoltaic forecasting trend is better than when only using CWB or LWS data. The prediction results on 1 July 2021 show that hybrid data significantly help improve the effect of the one-day-ahead solar photovoltaic forecast.

The predictive ability of different models can be clearly seen in the above figures. In addition, the predicted hourly generation intensity varies considerably when using different methods with different weather data sources. The average evaluation indices of the three models with different weather data groups are shown in Table 8. Comparing the forecasting results of different models, it is found that the predictive effect of ANN is relatively poor compared to those of LSTM and GRU. The one-week average value of MAPE can vary by up to 8–10%, and the maximum differences in MAE and RMSE are up to 1.089 and 0.46, respectively. The results imply that the LSTM and GRU models are more suitable for one-day-ahead solar photovoltaic forecasting. Subsequently, the forecasting results of different weather groups are compared. From the one-week average result, when only LWS historical data are used for forecasting, the prediction effect of all models is comparatively poor.

Finally, the comprehensive comparison results from the trend of the PV power forecast curve or numerical statistical analysis show that using hybrid weather data can effectively improve the accuracy of solar photovoltaic forecasts. On the other hand, from the average prediction results of the three different groups of weather data, the effect of the LSTM model is the best, indicating that this model is more suitable as a solar photovoltaic prediction model in this study, with a MAPE of 20.0%, MSE of 1.158, and RMSE of 1.004. Based on the suggestion by Lewis [41], the accuracy of the predictive ability based on MAPE is classified as highly accurate (less than 10%), good (11% to 20%), reasonable (21% to 50%), and inaccurate (more than 50%). It is shown that the data pre-processing method proposed in this study combined with the LSTM deep learning forecasting model can achieve good prediction results.
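The MAPE classification attributed to Lewis [41] can be expressed as a small lookup. The function name is hypothetical, and the handling of values that fall between the quoted ranges (for example, between 10% and 11%) is an assumption, since the ranges as stated leave those gaps open.

```python
def classify_mape(mape_percent):
    """Map a MAPE value in percent to the accuracy classes quoted in the text."""
    if mape_percent < 0:
        raise ValueError("MAPE cannot be negative")
    if mape_percent < 10:
        return "highly accurate"
    if mape_percent <= 20:  # assumption: 10-11% is grouped with "good"
        return "good"
    if mape_percent <= 50:  # assumption: 20-21% is grouped with "reasonable"
        return "reasonable"
    return "inaccurate"
```

Under this reading, the MAPE of 20.0% reported above falls in the "good" class.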

From the analysis results mentioned above, the use of hybrid data can significantly improve the one-day-ahead solar photovoltaic forecast effect. Therefore, the results of one-day-ahead solar photovoltaic forecasting with the hybrid data group for one week were further analyzed in this study, as shown in Figure 10. The results of the one-week forecast statistical numerical analysis are presented in Table 9. In Figure 10, there is an obvious prediction difference among the three models on 2 and 3 July 2021. The forecast results of ANN are worse than those of LSTM and GRU. This means that when there is a big change in weather conditions, the difference between the forecast and actual power generation trends will become more pronounced.

When comparing the daily forecast numerical analysis results of the LSTM and GRU models for one week, the results of the LSTM model are better than those of the GRU model, and the average forecast results are relatively stable. The average results of LSTM for a week have a MAPE of 16%, MAE of 0.71, and RMSE of 0.83. On the other hand, the GRU model is better than LSTM on sunny days. The MAPE, MAE, and RMSE of GRU are 8%, 0.263, and 0.513, and those of LSTM are 11%, 0.531, and 0.792, respectively, on 27 June 2021. In addition, the forecast results of GRU on 2 July are also better than those of LSTM. However, the advantages of the LSTM model are highlighted when the weather pattern is poor. From the forecasting results on 1 July, the MAPE of LSTM is about 8% higher than that of GRU, and MAE and RMSE are about 0.3–0.4 higher, respectively. Therefore, when selecting the prediction results for the next day, the weather forecast can be used as a reference. If the weather forecast for the next day is sunny, the results of the GRU model are more accurate. If it is cloudy, the results of LSTM are a better choice.
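The selection rule suggested above can be written as a one-line decision. The function name and the weather labels are hypothetical, and treating every non-sunny forecast (including rain) like a cloudy one is an assumption that goes beyond the two cases named in the text.

```python
def pick_model(weather_forecast):
    """Return 'GRU' for a sunny next-day forecast and 'LSTM' otherwise."""
    # Assumption: any label other than "sunny" is handled like "cloudy".
    return "GRU" if weather_forecast.strip().lower() == "sunny" else "LSTM"


choice_sunny = pick_model("Sunny")
choice_cloudy = pick_model("cloudy")
```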

According to the aforementioned results, it is found that forecasting via the LSTM model combined with the hybrid weather data yields the best prediction results. Therefore, the LSTM model was used to predict the solar photovoltaic power for June 2021, and the results are shown in Figure 11. The results show that MAPE, MAE, and RMSE are 16.984%, 1.764, and 1.283, respectively. It is shown that the combination of CWB data and LWS data together with the deep learning model is helpful for improving the accuracy of the one-day-ahead solar photovoltaic forecast. Furthermore, a comparison with other one-day-ahead solar photovoltaic forecast methods from the literature is shown in Table 10. It can be found that the application of complex deep learning algorithms combined with the weather data and data pre-processing method proposed in this study has a favorable effect on one-day-ahead solar photovoltaic forecasting.

6. Conclusions

The acquisition of weather feature data greatly influences one-day-ahead solar photovoltaic prediction results. Previous studies focused on a single data source, compared different weather features, or used historical time-series data directly for one-day-ahead PV power forecasting. The objective of this study was to find the appropriate weather features and weather data group to pair with a deep learning algorithm for one-day-ahead PV power forecasting. ANN, LSTM, and GRU models were combined with different weather data (LWS, CWB, and hybrid data) to forecast one-day-ahead photovoltaic power generation, and their forecasting performance was compared. The study site is a rooftop solar photovoltaic system in Yangmei District, Taiwan. The data pre-processing method proposed in this article can prevent the failure of forecasting results and solve the data unavailability problem. Hybrid weather data show improvements of 5–8% in accuracy compared to the CWB and LWS data. Considering all of the weather patterns, the LSTM model is the most accurate for one-day-ahead PV power forecasting, with a one-week average MAPE, MAE, and RMSE of 16%, 0.71, and 0.83, respectively. The results indicate that when the amount of data is limited, using hybrid data and the five weather features is helpful for training the model. Accordingly, the proposed model shows better one-day-ahead PV forecasting.

The pre-processing data framework may remove a large number of historical data points, which can leave insufficient data and discontinuous data samples. This limits model training in this study. Nevertheless, the deep learning method has good generalization ability (the adaptability of machine learning algorithms to fresh samples). Therefore, keeping the valuable data will have the best effect on long-term predictions. This article shows that the validation results over the one-month study are stable. The MAPE, MAE, and RMSE for one month are 16.984%, 1.764, and 1.283, respectively. These results support the conclusion that the LSTM model combined with hybrid weather data is reliable for one-day-ahead PV power forecasting.

Author Contributions

This paper is the collaborative work of all authors. Conceptualization, C.-H.C. and C.-C.W.; methodology, W.-C.K. and S.-H.H.; software, W.-C.K. and S.-H.H.; validation, W.-C.K., C.-H.C. and S.-H.H.; formal analysis, W.-C.K.; investigation, W.-C.K. and C.-H.C.; resources, C.-H.C.; data curation, W.-C.K.; writing—original draft preparation, W.-C.K.; writing—review and editing, C.-C.W.; supervision, C.-H.C.; project administration, C.-H.C., C.-C.W. and W.-C.K. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the Ministry of Science and Technology, Taiwan, R.O.C., grant numbers MOST 110-2221-E-035-091 and MOST 110-2221-E-A49-079.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

The authors have not provided data associated with this article.

Conflicts of Interest

The authors declare no conflict of interest.

References

1. Morrissey, K.; Kahrobaee, S.; Ioan, A. Optimal Energy Storage Schedules for Load Leveling and Ramp Rate Control in Distribution Systems. In Proceedings of the 2020 IEEE Conference on Technologies for Sustainability (SusTech), Santa Ana, CA, USA, 23–25 April 2020; pp. 1–4.
2. Ahmed, M.; Kamalasadan, S. An approach for local net-load ramp rate control using integrated energy storage based on least square error minimization technique. In Proceedings of the 2018 IEEE Power and Energy Conference at Illinois (PECI), Champaign, IL, USA, 22–23 February 2018; pp. 1–6.
3. Vegvari, Z.; Bokony, V.; Barta, Z.; Kovacs, G. Life history predicts advancement of avian spring migration in response to climate change. Glob. Chang. Biol. 2010, 16, 1–11.
4. Behrang, M.A.; Assareh, E.; Ghanbarzadeh, A.; Noghrehabadi, A.R. The potential of different artificial neural network (ANN) techniques in daily global solar radiation modeling based on meteorological data. Sol. Energy 2010, 84, 1468–1480.
5. Bianchini, G.; Paoletti, S.; Vicino, A.; Corti, F.; Nebiacolombo, F. Model estimation of photovoltaic power generation using partial information. In Proceedings of the IEEE PES ISGT Europe 2013, Lyngby, Denmark, 6–9 October 2013; pp. 1–5.
6. Sangrody, H.; Sarailoo, M.; Zhou, N.; Tran, N.; Motalleb, M.; Foruzan, E. Weather forecasting error in solar energy forecasting. IET Renew. Power Gener. 2017, 11, 1274–1280.
7. Sangrody, H.; Sarailoo, M.; Zhou, N.; Shokrollahi, A.; Foruzan, E. On the performance of forecasting models in the presence of input uncertainty. In Proceedings of the 2017 North American Power Symposium (NAPS), Morgantown, WV, USA, 17–19 September 2017; pp. 1–6.
8. Rehman, S.; Mohandes, M. Artificial neural network estimation of global solar radiation using air temperature and relative humidity. Energy Policy 2008, 36, 571–576.
9. Alluhaidah, B.M.; Shehadeh, S.H.; El-Hawary, M.E. Most Influential Variables for Solar Radiation Forecasting Using Artificial Neural Networks. In Proceedings of the 2014 IEEE Electrical Power and Energy Conference, Washington, DC, USA, 12–14 November 2014; pp. 71–75.
10. Ghanbarzadeh, A.; Noghrehabadi, A.R.; Assareh, E.; Behrang, M.A. Solar radiation forecasting based on meteorological data using artificial neural networks. In Proceedings of the 2009 7th IEEE International Conference on Industrial Informatics, Cardiff, Wales, UK, 23–26 June 2009; pp. 227–231.
11. De Giorgi, M.G.; Congedo, P.M.; Malvoni, M. Photovoltaic power forecasting using statistical methods: Impact of weather data. IET Sci. Meas. Technol. 2014, 8, 90–97.
12. Gigoni, L.; Betti, A.; Crisostomi, E.; Franco, A.; Tucci, M.; Bizzarri, F.; Mucci, D. Day-Ahead Hourly Forecasting of Power Generation From Photovoltaic Plants. IEEE Trans. Sustain. Energy 2018, 9, 831–842.
13. Mahmud, K.; Azam, S.; Karim, A.; Zobaed, S.; Shanmugam, B.; Mathur, D. Machine Learning Based PV Power Generation Forecasting in Alice Springs. IEEE Access 2021, 9, 46117–46128.
14. Shah, A.S.B.M.; Yokoyama, H.; Kakimoto, N. High-Precision Forecasting Model of Solar Irradiance Based on Grid Point Value Data Analysis for an Efficient Photovoltaic System. IEEE Trans. Sustain. Energy 2015, 6, 474–481.
15. Suksamosorn, S.; Hoonchareon, N.; Songsiri, J. Influential Variable Selection for Improving Solar Forecasts from Numerical Weather Prediction. In Proceedings of the 2018 15th International Conference on Electrical Engineering/Electronics, Computer, Telecommunications and Information Technology (ECTI-CON), Chiang Rai, Thailand, 18–21 July 2018; pp. 333–336.
16. Liu, Z.; Zhang, Z. Solar forecasting by K-Nearest Neighbors method with weather classification and physical model. In Proceedings of the 2016 North American Power Symposium (NAPS), Denver, CO, USA, 18–20 September 2016; pp. 1–6.
17. Hossain, M.S.; Mahmood, H. Short-Term Photovoltaic Power Forecasting Using an LSTM Neural Network and Synthetic Weather Forecast. IEEE Access 2020, 8, 172524–172533.
18. Cheng, L.; Zang, H.; Ding, T.; Wei, Z.; Sun, G. Multi-Meteorological-Factor-Based Graph Modeling for Photovoltaic Power Forecasting. IEEE Trans. Sustain. Energy 2021, 12, 1593–1603.
19. Yu, Y.; Cao, J.; Zhu, J. An LSTM Short-Term Solar Irradiance Forecasting Under Complicated Weather Conditions. IEEE Access 2019, 7, 145651–145666.
20. Silvia, P.; Alfredo, N. A New Probabilistic Ensemble Method for an Enhanced Day-Ahead PV Power Forecast. IEEE J. Photovolt. 2022, 12, 581–588.
21. Wan, C.; Zhao, J.; Song, Y.; Xu, Z.; Lin, J.; Hu, Z. Photovoltaic and solar power forecasting for smart grid energy management. CSEE J. Power Energy Syst. 2015, 1, 38–46.
22. Perveen, G.; Rizwan, M.; Goel, N. Comparison of intelligent modelling techniques for forecasting solar energy and its application in solar PV based energy system. IET Energy Syst. Integr. 2019, 1, 34–51.
23. Yona, A.; Senjyu, T.; Funabashi, T.; Kim, C. Determination Method of Insolation Prediction with Fuzzy and Applying Neural Network for Long-Term Ahead PV Power Output Correction. IEEE Trans. Sustain. Energy 2013, 4, 527–533.
24. Yang, H.; Huang, C.; Huang, Y.; Pai, Y. A Weather-Based Hybrid Method for 1-Day Ahead Hourly Forecasting of PV Power Output. IEEE Trans. Sustain. Energy 2014, 5, 917–926.
25. Sorkun, M.C.; Paoli, C.; Incel, Ö.D. Time series forecasting on solar irradiation using deep learning. In Proceedings of the 2017 10th International Conference on Electrical and Electronics Engineering (ELECO), Bursa, Turkey, 30 November–2 December 2017; pp. 151–155.
26. Hanafi, R.A.; Liu, C.; Suwarno. One-Hour-Ahead Solar Power Forecasting Using Artificial Neural Networks in Taiwan. In Proceedings of the 2019 2nd International Conference on High Voltage Engineering and Power Systems (ICHVEPS), Denpasar, Indonesia, 1–4 October 2019; pp. 169–174.
27. Liu, C.H.; Gu, J.C.; Yang, M.T. A Simplified LSTM Neural Networks for One Day-Ahead Solar Power Forecasting. IEEE Access 2021, 9, 17174–17195.
28. Das, U.K.; Tey, K.S.; Seyedmahmoudian, M.; Mekhilef, S.; Idris, M.Y.I.; Van Deventer, W.; Horan, B.; Stojcevski, A. Forecasting of photovoltaic power generation and model optimization: A review. Renew. Sustain. Energy Rev. 2018, 81, 912–928.
29. Mehtah, E.; Adel, M. Solar Power Forecasting Using Deep Learning Techniques. IEEE Access 2022, 10, 31692–31698.
30. Seyedmahmoudian, M.; Jamei, E.; Thirunavukkarasu, G.S.; Soon, T.K.; Mortimer, M.; Horan, B.; Stojcevski, A.; Mekhilef, S. Short-Term Forecasting of the Output Power of a Building-Integrated Photovoltaic System Using a Metaheuristic Approach. Energies 2018, 11, 1260.
31. VanDeventer, W.; Jamei, E.; Thirunavukkarasu, G.S.; Seyedmahmoudian, M.; Soon, T.K.; Horan, B.; Mekhilef, S.; Stojcevski, A. Short-term PV power forecasting using hybrid GASVM technique. Renew. Energy 2019, 140, 367–379.
32. Muhammad, A.; Lee, S.J.; Khang, S.H.; Jong, S. Two-Stage Attention over LSTM with Bayesian Optimization for Day-Ahead Solar Power Forecasting. IEEE Access 2021, 9, 107387–107398.
33. CWB Observation Data Inquire System. Available online: https://e-service.cwb.gov.tw/HistoryDataQuery/index.jsp (accessed on 1 August 2020).
34. Open Weather Data. Available online: https://opendata.cwb.gov.tw/index (accessed on 1 August 2020).
35. AccuWeather. Available online: https://www.accuweather.com/ (accessed on 1 August 2020).
36. Srivastava, N.; Hinton, G.; Krizhevsky, A.; Sutskever, I.; Salakhutdinov, R. Dropout: A simple way to prevent neural networks from overfitting. J. Mach. Learn. Res. 2014, 15, 1929–1958.
37. Hochreiter, S.; Schmidhuber, J. Long short-term memory. Neural Comput. 1997, 9, 1735–1780.
38. Gers, F.A.; Schmidhuber, J.; Cummins, F. Learning to forget: Continual prediction with LSTM. Neural Comput. 2000, 12, 2451–2471.
39. Cho, K.; Van Merriënboer, B.; Gulcehre, C.; Bahdanau, D.; Bougares, F.; Schwenk, H.; Bengio, Y. Learning phrase representations using RNN encoder-decoder for statistical machine translation. arXiv 2014, arXiv:1406.1078.
40. Chen, Z.; Yang, Y. Assessing forecast accuracy measures. Prepr. Ser. 2004, 2010, 2004–2010.
41. Lewis, C.D. Industrial and Business Forecasting Methods: A Practical Guide to Exponential Smoothing and Curve Fitting; Butterworth-Heinemann: London, UK, 1982.

Figure 1. The architecture of data collection (including Central Weather Bureau and local micro weather station) and model forecasting structure.

Figure 2. Rooftop solar PV with 30 solar panels and 10 kW capacity in Yangmei District, Taiwan.

Figure 3. The historical photovoltaic forecast data graph and the weather condition photo captured by the sky camera from 2–4 March 2021.

Figure 4. Data pre-processing structure.

Figure 5. The filtration result of solar photovoltaic power data. The green circle shows the change from the blue line to the red line.

Figure 6. Missing data processing results (hollow bars: the amount of data after processing; the orange, green and blue bars are the amount of original data).

Figure 7. The PV power forecasting results of the ANN model.

Figure 8. The PV power forecasting results of the LSTM model.

Figure 9. The PV power forecasting results of the GRU model.

Figure 10. Comprehensive comparison results of hybrid metadata.

Figure 11. Forecasting results of the LSTM model for the month of June.

Table 1. Summary of different forecasting algorithms, data inputs, and horizons.

Ref.	Year	Method	Inputs	Horizon	Best Results of Accuracy
[10]	2009	ANN	Air temperature, relative humidity, and sunshine	Hour	MAPE (%): 8.84
[16]	2016	KNN	DNI, DHI, ambient temperature	Hour	MAPE (%): 18.25 MRAE ⁴ (%): 2.01
[17]	2020	LSTM	Solar irradiance, temperature, relative humidity, and wind speed	24 h	MAPE (%): 22.31 RMSE: 0.71 MAE: 0.36
[23]	2014	SVR	Temperature, probability of precipitation, and solar irradiance	One-day-ahead hourly	MRE ⁵ (%): 3.295
[26]	2019	BPNN ELM ¹	Sunshine hours, global radiation, and UV index	One-hour-ahead	nRMSE ⁶ (%): 7.75
[30]	2018	DEPSO ²	Tipping Bucket Rain Gauge, wind speed, wind direction, air temperature, relative humidity, and solar radiation	Hour	RMSE (%): 4.4 MAE: 0.03 MBE: −1.63 VAR ⁷: 0.01 WME ⁸: 0.16 MRE (%): 3.1
[31]	2019	Hybrid GA-SVM ³	Tipping Bucket Rain Gauge, wind speed, wind direction, air temperature, relative humidity, and solar radiation	Hour	MAPE (%): 1.7052 RMSE: 11.226
[32]	2021	Two-stage attention-based encoder–decoder over LSTM	Forty-one features, including solar radiation, temperature, humidity, snowfall, albedo, etc.	One-day-ahead	RMSE: 0.0638 MAE: 0.0324

¹ ELM: Extreme Learning Machine; ² DEPSO: Differential Evolution and Particle Swarm Optimization; ³ GA-SVM: Genetic Algorithm and Support Vector Machine; ⁴ MRAE: Mean Relative Absolute Error; ⁵ MRE: Mean Relative Error; ⁶ nRMSE: normalized Root Mean Square Error; ⁷ VAR: variance of the prediction errors; ⁸ WME: Weighted Mean Errors.

Table 2. Local weather sensor specification sheet.

Feature	Range
Temperature [°C]	0~60
Relative Humidity [%]	0~100
Average Wind Speed [m/s]	0~60
Wind Direction [Degree]	0~360
Rainfall [mm/h]	0~200
Pressure [hPa]	600~1100

Table 3. Pyrheliometer specification sheet.

Specification	Range
Measurement Range [W/m²]	0~2000
Spectral Range [nm]	305~2800

Table 4. Final features selected from the CWB.

Feature	Unit
Temperature	°C
Relative Humidity	%
Rainfall	mm
Average Wind Speed	m/s
UV Index	-

Table 5. The hyperparameter tuning results of CWB data.

Model	Input Time	Layers	Epochs	Learning Rate	Batch Size	MAE	MAPE	RMSE
LSTM	2 days	3	2000	1 × 10⁻³	8	0.5074	17%	0.9979
GRU	2 days	3	2000	1 × 10⁻³	8	0.4898	18%	0.8397
ANN	3 days	5	2000	1 × 10⁻³	8	0.8999	34%	1.6067

Table 6. The hyperparameter tuning results of LWS historical data.

Model	Input Time	Layers	Epochs	Learning Rate	Batch Size	MAE	MAPE	RMSE
LSTM	2 days	3	2000	1 × 10⁻³	8	0.3992	12%	0.7105
GRU	2 days	3	2000	1 × 10⁻³	8	0.4083	12%	0.7059
ANN	1 day	3	2000	1 × 10⁻³	8	0.4377	12%	1.3326

Table 7. The hyperparameter tuning results of hybrid data.

Model	Input Time	Layers	Epochs	Learning Rate	Batch Size	MAE	MAPE	RMSE
LSTM	1 day	4	2000	1 × 10⁻³	8	0.31242	9%	0.5921
GRU	1 day	4	2000	1 × 10⁻³	8	0.36907	11%	0.6911
ANN	2 days	5	2000	1 × 10⁻³	8	0.35434	10%	1.3326
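The "Input Time" column in Tables 5–7 can be read as the number of past days fed to the model for each one-day-ahead forecast. The sketch below shows one way such input/target pairs could be built from a chronological series. It assumes hourly samples (24 per day) and a 24-hour target window; the function name `make_windows` and the data layout are illustrative and are not taken from the paper.

```python
HOURS_PER_DAY = 24  # assumption: one sample per hour


def make_windows(samples, input_days, horizon_hours=HOURS_PER_DAY):
    """Build (input, target) pairs for one-day-ahead forecasting.

    samples: chronological list of (features, pv_power) tuples.
    input_days: days of history per input window ("Input Time").
    Returns two parallel lists: feature windows and the PV power
    values of the day that follows each window.
    """
    n_in = input_days * HOURS_PER_DAY
    inputs, targets = [], []
    last_start = len(samples) - n_in - horizon_hours
    # Advance one day at a time so every target covers a full day.
    for start in range(0, last_start + 1, HOURS_PER_DAY):
        window = samples[start:start + n_in]
        future = samples[start + n_in:start + n_in + horizon_hours]
        inputs.append([features for features, _ in window])
        targets.append([power for _, power in future])
    return inputs, targets


# Four days of dummy hourly data: one feature and one power value per hour.
series = [([float(h)], float(h)) for h in range(4 * HOURS_PER_DAY)]
x1, y1 = make_windows(series, input_days=1)
x2, y2 = make_windows(series, input_days=2)
print(len(x1), len(x2))
```

A longer input window leaves fewer training pairs from the same history, which matters when the data set is small.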

Table 8. Average one-day-ahead PV power forecasting results of each weather data group for one week.

Model	CWB Weather Data			Local Weather Data			Hybrid Weather Data			Average
Model	MAPE	MAE	RMSE	MAPE	MAE	RMSE	MAPE	MAE	RMSE	MAPE	MAE	RMSE
ANN	29%	2.020	1.400	23%	1.415	1.179	24%	1.333	1.147	25.3%	1.598	1.242
LSTM	21%	0.931	0.940	23%	1.839	1.240	16%	0.706	0.831	20.0%	1.158	1.004
GRU	19%	1.083	1.003	23%	2.207	1.362	20%	0.828	0.888	20.7%	1.372	1.084

Table 9. Daily one-day-ahead PV power forecasting results using hybrid weather parameters.

Day	ANN Model			LSTM Model			GRU Model
Day	MAPE	MAE	RMSE	MAPE	MAE	RMSE	MAPE	MAE	RMSE
27 June 2021	15%	1.090	1.044	11%	0.531	0.729	8%	0.263	0.513
28 June 2021	35%	1.863	1.365	21%	1.046	1.023	25%	1.205	1.098
29 June 2021	31%	1.506	1.227	23%	0.861	0.928	26%	0.864	0.930
30 June 2021	20%	0.968	0.984	13%	0.535	0.731	18%	0.944	0.972
1 July 2021	21%	1.240	1.113	14%	0.556	0.745	22%	0.863	0.929
2 July 2021	18%	2.364	1.538	11%	0.986	0.993	7%	0.494	0.703
3 July 2021	36%	3.720	1.929	24%	1.230	1.109	24%	1.320	1.149
Average	24%	1.33	1.15	16%	0.71	0.83	20%	0.83	0.89
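The day-by-day comparison in Table 9 can be tallied with a few lines of code. The sketch below copies the MAPE column of each model from the table and counts how often each model has the lowest value; counting ties for every tied model is a choice made here, and the helper name `best_models` is illustrative.

```python
# MAPE (%) per day, copied from Table 9 (hybrid weather data).
mape_by_day = {
    "27 June 2021": {"ANN": 15, "LSTM": 11, "GRU": 8},
    "28 June 2021": {"ANN": 35, "LSTM": 21, "GRU": 25},
    "29 June 2021": {"ANN": 31, "LSTM": 23, "GRU": 26},
    "30 June 2021": {"ANN": 20, "LSTM": 13, "GRU": 18},
    "1 July 2021": {"ANN": 21, "LSTM": 14, "GRU": 22},
    "2 July 2021": {"ANN": 18, "LSTM": 11, "GRU": 7},
    "3 July 2021": {"ANN": 36, "LSTM": 24, "GRU": 24},
}


def best_models(scores):
    """Return the model name(s) with the lowest MAPE, sorted by name."""
    lowest = min(scores.values())
    return sorted(name for name, value in scores.items() if value == lowest)


wins = {}
for day, scores in mape_by_day.items():
    winners = best_models(scores)
    for name in winners:
        wins[name] = wins.get(name, 0) + 1
    print(day, winners)

print(wins)
```

On these rounded values, LSTM has the lowest or tied-lowest MAPE on five of the seven days and GRU on three, with 3 July a tie between the two.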

Table 10. Summary of different algorithms, data inputs, and horizons of one-day-ahead PV forecasting.

Paper	Method Used	Inputs	Horizon	Best Results for Accuracy
[17]	LSTM NN	Solar irradiance, temperature, relative humidity, and wind speed	24 h	MAPE (%): 22.31 RMSE: 0.71 MAE: 0.36
[23]	Weather-based hybrid method: SOM, LVQ, and SVR	Temperature, probability of precipitation, and solar irradiance	One-day-ahead hourly	RMSE: 1.6811
This manuscript	LSTM	Temperature, relative humidity, rainfall, average wind speed, and UV index	One-day-ahead 24 h	MAPE (%): 16.984 MAE: 1.764 RMSE: 1.283

Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

© 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
