A Deep Learning-Based Solar Power Generation Forecasting Method Applicable to Multiple Sites

Jang, Seon Young; Oh, Byung Tae; Oh, Eunsung

doi:10.3390/su16125240

by Seon Young Jang ¹, Byung Tae Oh ²,* and Eunsung Oh ³,*

¹ Department of Electronics and Information Engineering, Korea Aerospace University, Goyang-si 10504, Gyeonggi-do, Republic of Korea

² Department of Computer Engineering, Korea Aerospace University, Goyang-si 10504, Gyeonggi-do, Republic of Korea

³ Department of Electrical Engineering, College of IT Convergence, Global Campus, Gachon University, Seongnam-si 13120, Gyeonggi-do, Republic of Korea

* Authors to whom correspondence should be addressed.

Sustainability 2024, 16(12), 5240; https://doi.org/10.3390/su16125240

Submission received: 9 April 2024 / Revised: 6 June 2024 / Accepted: 18 June 2024 / Published: 20 June 2024

(This article belongs to the Special Issue Sustainable Management and Design of Renewable Power Systems)

Abstract:

This paper addresses the challenge of accurately forecasting solar power generation (SPG) across multiple sites using a single common model. The proposed deep learning-based model is designed to predict SPG for various locations by leveraging a comprehensive dataset from multiple sites in the Republic of Korea. By incorporating common meteorological elements such as temperature, humidity, and cloud cover into its framework, the model uniquely identifies site-specific features to enhance the forecasting accuracy. The key innovation of this model is the integration of a classifier module within the common model framework, enabling it to adapt and predict SPG for both known and unknown sites based on site similarities. This approach allows for the extraction and utilization of site-specific characteristics from shared meteorological data, significantly improving the model’s adaptability and generalization across diverse environmental conditions. The evaluation results demonstrate that the model maintains high performance levels across different SPG sites with minimal performance degradation compared to site-specific models. Notably, the model shows robust forecasting capabilities, even in the absence of target SPG data, highlighting its potential to enhance operational efficiency and support the integration of renewable energy into the power grid, thereby contributing to the global transition towards sustainable energy sources.

Keywords:

convolutional neural network; deep learning; domain estimation; long short-term memory; machine learning; renewable; solar power generation

1. Introduction

1.1. Motivation

Renewable power generation has witnessed unprecedented growth in recent years, with the largest growth observed for SPG. According to the International Renewable Energy Agency [1], solar power accounted for over 65% of global renewable capacity additions in 2023, underscoring its pivotal role in the transition towards sustainable energy systems. This surge is attributed not only to the potential of solar power to address pressing environmental concerns but also to its economic viability. Recent analyses indicate that solar power’s levelized cost of energy (LCOE) has become increasingly competitive, outperforming coal and gas combined-cycle costs in many developed countries [2].

However, integrating solar power into the energy grid introduces unique challenges owing to its inherent variability. SPG is highly dependent on natural phenomena, such as sunlight intensity and duration, which fluctuate daily and seasonally. Accurate SPG forecasting ensures grid stability and maximizes solar energy utilization efficiency [3].

1.2. Prior Works

Recent advancements in deep learning (DL) for SPG forecasting have led to the development of more accurate and robust predictive models. Among these, the development and application of various innovative DL frameworks and methodologies to enhance the precision and reliability of solar power forecasts stand out.

According to the reviews in [4,5], for SPG forecasting, approximately 70% of studies employ DL models, including Recurrent Neural Network (RNN) and Long Short-Term Memory (LSTM) models, while 20% of the research specifically utilizes Convolutional Neural Network (CNN) models, which are also a type of DL model. Additionally, recent studies have explored the combination of DL models with fuzzy logic or meta-heuristic approaches to enhance prediction accuracy [6]. This distribution highlights the predominant reliance on sequence-based models, such as RNNs and LSTMs, for forecasting tasks that involve time series data of SPG, which require capturing temporal dynamics and dependencies over time. These models are particularly suitable for handling sequential data because of their ability to remember information over long periods, making them effective in predicting future values based on past trends. Dhaked et al. proposed an LSTM-based SPG forecasting model considering weather conditions and demonstrated the effect of changing the number of hidden layers and constraints on the forecast [7]. Wang et al. suggested an SPG forecasting method based on an LSTM model with frequency-domain input data decomposition [8]. In [8], Bayesian optimization was applied to optimize the hyperparameters of an LSTM model. Li et al. proposed an LSTM-based SPG forecasting method using variational-mode data decomposition [9]. Similar to [8], in [9], the performance was improved by creating multiple models for different states through data decomposition and applying an artificial Gorilla troop optimizer. These studies indicate that for SPG forecasting, the trend is not merely to apply classical LSTM models but rather to combine LSTM models with optimizers and other techniques to fine-tune the model parameters more effectively and enhance their performance.

The use of CNN models, traditionally known for their application in image processing and recognition tasks, in 20% of the studies indicated the broader applicability of these models in analyzing time series data. CNNs can capture spatiotemporal patterns in data, which can be particularly useful in scenarios where the spatial distribution of inputs (e.g., cloud cover and solar irradiance maps) plays a significant role in the forecasting process. CNN models are often researched not as standalone solutions but rather in combination with LSTM models to form hybrid models. Marinho et al. showed that a CNN-LSTM model performs more accurately than standalone CNN and LSTM models [10]. Anu-Shalini et al. demonstrated that a CNN-bidirectional LSTM (biLSTM) model provides more accurate predictions than standalone CNN, LSTM, and biLSTM models [11]. Similarly, an autoencoder (AE)-biLSTM model was proposed in [12]. These studies indicate that the biLSTM model is more efficient than the LSTM model for SPG forecasting due to its ability to capture dependencies in both forward and backward directions, providing a more comprehensive contextual understanding and enhancing prediction accuracy. Houran et al. proposed a CNN–LSTM model based on the Coati optimization algorithm [13]. A Coati optimizer determines the hyperparameters to improve the convergence and accuracy of the CNN-LSTM model. Alharkan et al. proposed a dual-stem CNN-LSTM model integrated attention mechanism [14]. In [14], the dual-stem design had two parallel pathways within the model: one stem processed the input data through CNN layers, which were adept at extracting spatial features and patterns, and the other stem utilized LSTM layers to capture temporal dependencies and sequences in the data.

Research is also progressing on the application of transformers based on self-attention mechanisms. Initially designed for natural language processing tasks, transformers have shown remarkable success because of their ability to process sequences in parallel and capture long-range dependencies in data through self-attention [15]. This mechanism allows the model to weigh the importance of different parts of the input data relative to each other, providing a dynamic method to focus on relevant information for making predictions. Al-Ali et al. proposed a CNN-LSTM transformer model [16]. In [16], a CNN-LSTM model was used to extract spatial and temporal features, and a transformer was applied to generate forecast results from the features. Zhu et al. proposed a transformer-based SPG forecasting method using data filtering [17]. In [17], the Savitzky–Golay filter (SG) and Local Outlier Factor (LOF) filter were applied for data preprocessing to reduce noise, and a transformer was employed for the SPG forecast model.

Table 1 summarizes DL-based SPG forecasting methods.

These studies indicate that the trend for SPG forecasting is not merely to apply classical DL models but rather to combine models with optimizers and other techniques to fine-tune the model parameters and enhance their performance more effectively.

However, the current research predominantly focuses on modeling individually implemented SPG sites. This approach involves collecting and utilizing site-specific input data, such as weather conditions and training data, to generate forecasts for a single site. Consequently, this methodology implies that producing forecasts for multiple sites would necessitate either the design of distinct models tailored to each site or separate training sessions for models corresponding to each location. This presents a limitation regarding scalability and efficiency when extending the forecasting capabilities across multiple sites.

1.3. Contribution

This study addresses the significant challenge of developing an accurate and scalable SPG forecasting model that can be effectively applied across multiple sites using a single common model. Traditional SPG forecasting methods often require site-specific models, which can be inefficient and difficult to scale. Our research proposes a novel deep learning-based common model designed to overcome these limitations by leveraging common meteorological elements such as temperature, humidity, and cloud cover.

The key contributions of this study are as follows:

The development of a common model: We propose a deep learning-based forecasting model that can accurately predict SPG for various locations by utilizing common meteorological data. This model is designed to extract site-specific characteristic features from these shared data elements, thus enhancing the forecasting accuracy.
The integration of a classifier module: To address the variability and unique characteristics of different sites, we integrate a classifier module within the common model framework. This module enables the model to adapt and predict SPG for both known and unknown sites by identifying and leveraging site similarities.
Improved adaptability and generalization: By extracting and utilizing site-specific features from common meteorological data, the proposed model significantly improves its adaptability and generalization capabilities across diverse environmental conditions. This dual-layered approach allows for a better understanding of how different factors contribute to SPG outputs at each location, enabling a highly adaptable forecasting framework.
Robust performance across multiple sites: The evaluation results demonstrate that our model maintains high performance levels across different SPG sites, with minimal performance degradation compared to site-specific models. Notably, the model exhibits robust forecasting capabilities even in the absence of target SPG data, highlighting its potential to enhance operational efficiency and support the integration of renewable energy into the power grid.

To validate the effectiveness of this multifaceted approach, we employed a comprehensive dataset encompassing a nationwide area in the Republic of Korea. Through rigorous testing and validation, the model demonstrated the capability of being applied across multiple sites as a common model with only marginal performance degradation compared to models trained specifically for individual sites. In particular, using the proposed model allows us to obtain forecasting performance within a certain range, even in cases where the target SPG data do not exist. This verifies that the proposed model can significantly improve operational efficiency and renewable energy integration into the power grid.

2. Methodology

2.1. SPG Forecasting System Architecture

The objective of our proposed forecasting system is to accurately predict SPG by leveraging six key types of meteorological data: temperature (TMP), humidity (REH), precipitation probability (POP), cloudiness index (SKY), wind direction (VEC), and wind speed (WSD). We devised a dual-component SPG prediction framework to enhance operational efficiency and feasibility, as illustrated in Figure 1. This framework is partitioned into two distinct subsystems: one dedicated to extracting features from raw weather data and the other to using these features to forecast SPG.

The proposed two-tiered model assumes that the initial feature extraction from the input data operates independently of the subsequent SPG forecasting phase. The input weather data are intrinsically linked to specific local geographical factors, and, assuming a consistent SPG system configuration, the SPG output is predominantly determined by these input parameters. It therefore stands to reason that segmenting the overall system into two discrete units and optimizing each independently for its intended purpose increases efficiency.

The feature extraction subsystem employs a combination of CNN and LSTM networks for the analysis. Specifically, it processes 24 h of weather data to ensure efficient and practical information handling. At each timestamp, the data pertaining to all six weather variables are considered. In addition, we factor in the solar elevation (SE) and azimuth angle (AA) data to directly account for variations in solar irradiance.

Let $x_{i,t}$ be the input element, which is the numerical weather prediction (NWP) for the SPG site $i$ at the timestamp $t$. It is defined as a vector:

$$x_{i,t} = [\mathrm{AA}_{i,t}, \mathrm{SE}_{i,t}, \mathrm{TMP}_{i,t}, \mathrm{REH}_{i,t}, \mathrm{POP}_{i,t}, \mathrm{SKY}_{i,t}, \mathrm{VEC}_{i,t}, \mathrm{WSD}_{i,t}]^{T}, \qquad (1)$$

where $[\cdot]^{T}$ is the transpose of a matrix. The input data for the forecasting of the SPG site $i$ are expressed as the 24-h time series of the input element as:

$$\mathrm{NWP}_{i} = [x_{i,0}, x_{i,1}, \dots, x_{i,t}, \dots, x_{i,22}, x_{i,23}]^{T}. \qquad (2)$$

To summarize, we constructed a 24 × 8 dimensional data matrix, where each row represents the NWP data for each hour.
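As a minimal sketch of how the 24 × 8 input of Equations (1) and (2) could be assembled, the snippet below stacks 24 hourly records into rows ordered as in Equation (1). The record layout (a list of dicts keyed by feature name), the helper name `build_nwp_matrix`, and the numeric values are illustrative assumptions and are not taken from the paper.

```python
# Feature order follows Equation (1).
FEATURES = ["AA", "SE", "TMP", "REH", "POP", "SKY", "VEC", "WSD"]
HOURS = 24


def build_nwp_matrix(hourly_records):
    """Stack 24 hourly forecast records into a 24 x 8 matrix (list of rows).

    `hourly_records` is a hypothetical list of 24 dicts keyed by feature name,
    ordered from hour 0 to hour 23.
    """
    if len(hourly_records) != HOURS:
        raise ValueError(f"expected {HOURS} hourly records, got {len(hourly_records)}")
    return [[float(rec[name]) for name in FEATURES] for rec in hourly_records]


# Made-up placeholder values, used only to show the resulting shape.
example_day = [
    {"AA": 180.0, "SE": 30.0, "TMP": 15.0 + 0.1 * h, "REH": 60.0,
     "POP": 10.0, "SKY": 1.0, "VEC": 270.0, "WSD": 2.5}
    for h in range(HOURS)
]
nwp = build_nwp_matrix(example_day)
print(len(nwp), len(nwp[0]))  # 24 8
```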

To manage this data structure, we propose a CNN architecture with a kernel size of 3, allowing the CNN to incorporate temporal adjacency (i.e., data from the preceding and succeeding time points) to ensure stability and robustness during the feature extraction phase. This subsystem utilizes two convolutional layers complemented by batch normalization and a Rectified Linear Unit (ReLU) activation function. The resultant features are then further refined through a biLSTM module, chosen because the 24-h weather data are sequential and temporally correlated. The biLSTM module complements the preceding CNN module by processing the 24-h sequence in both the forward and backward directions. Thus, the first subsystem serves as a Feature Encoder, distilling essential information from the weather data into a usable format for forecasting.

The characteristics of the feature encoder are summarized as follows:

CNN and LSTM Interaction: The CNN layers are responsible for extracting spatial features from the input weather data. These features are then fed into the LSTM layers, which analyze temporal patterns. The interaction is triggered sequentially: once the CNN extracts the spatial features, these features are used as inputs in the LSTM layers for further temporal analysis.
Interaction Trigger: The trigger for the interaction between CNN and LSTM layers is the sequential data flow within the model. After the CNN layers process the input data to capture spatial correlations, the resulting feature maps are fed into the LSTM layers to capture temporal dependencies.
Justification for biLSTMs: The use of biLSTMs is justified by their ability to capture dependencies in both forward and backward directions. This capability is particularly useful for SPG forecasting as it allows the model to consider the influence of both earlier and later weather conditions within the 24-h window on the current prediction. Bidirectional LSTMs provide a more comprehensive temporal analysis, leading to improved prediction accuracy. As stated in Section 1.2, biLSTMs are known to be effective for power generation forecasting.

The second subsystem, the Regressor, interprets the encoded features to predict SPG output. This prediction mechanism was configured straightforwardly using a conventional multilayer perceptron (MLP) structure comprising two layers.
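The two subsystems described above can be sketched in PyTorch as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the text fixes the kernel size of 3, the two convolutional layers with batch normalization and ReLU, the biLSTM, and the two-layer MLP, whereas the channel width, LSTM hidden size, MLP width, `padding=1`, and the choice to emit one prediction per hour are assumptions. The class names `FeatureEncoder` and `Regressor` are hypothetical.

```python
import torch
from torch import nn


class FeatureEncoder(nn.Module):
    """Two Conv1d blocks (kernel size 3) followed by a bidirectional LSTM."""

    def __init__(self, n_features=8, conv_channels=32, lstm_hidden=64):
        super().__init__()
        # padding=1 keeps all 24 time steps (an assumption, not stated in the paper).
        self.conv = nn.Sequential(
            nn.Conv1d(n_features, conv_channels, kernel_size=3, padding=1),
            nn.BatchNorm1d(conv_channels),
            nn.ReLU(),
            nn.Conv1d(conv_channels, conv_channels, kernel_size=3, padding=1),
            nn.BatchNorm1d(conv_channels),
            nn.ReLU(),
        )
        self.bilstm = nn.LSTM(conv_channels, lstm_hidden,
                              batch_first=True, bidirectional=True)

    def forward(self, x):
        # x: (batch, 24, 8); Conv1d expects (batch, channels, time).
        h = self.conv(x.transpose(1, 2))
        out, _ = self.bilstm(h.transpose(1, 2))
        return out  # (batch, 24, 2 * lstm_hidden)


class Regressor(nn.Module):
    """Two-layer MLP mapping each hourly feature vector to one SPG value."""

    def __init__(self, in_dim=128, hidden=64):
        super().__init__()
        self.mlp = nn.Sequential(nn.Linear(in_dim, hidden), nn.ReLU(),
                                 nn.Linear(hidden, 1))

    def forward(self, features):
        return self.mlp(features).squeeze(-1)  # (batch, 24)


encoder, regressor = FeatureEncoder(), Regressor()
x = torch.randn(4, 24, 8)      # four random 24 x 8 inputs, for shape checking only
pred = regressor(encoder(x))
print(pred.shape)              # torch.Size([4, 24])
```

Keeping the encoder and regressor as separate modules mirrors the two-subsystem split in the text and lets a further head be attached to the encoder output without touching the regressor.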

This architecture underscores our commitment to developing a sophisticated yet pragmatic solution for SPG forecasting. By dissecting the process into distinct feature extraction and regression phases, our system not only optimizes the utilization of weather data for solar power prediction but also sets a new standard for forecasting models in the renewable energy sector.

2.2. Implicit Site Estimation using the Classification Module

A pivotal aspect of our research was addressing the significant challenge of predicting SPG amid high variability across local sites. The variance in weather information attributed to geographical, environmental, and seasonal factors complicates the collection of a sufficient and reliable dataset for training. This variability often results in unstable and unreliable training samples, ultimately impairing the forecasting accuracy for sites that are not represented in the training data.

To mitigate these challenges, we introduce a novel system design that reduces the sample bias inherent to each local site through a feature similarity-based classification mechanism. This approach aggregates similar features to form coherent groups that facilitate more accurate predictions. Specifically, we integrated a site classification module within the Encoder system, as shown in Figure 2. This classifier was ingeniously designed to compel the encoder to implicitly assimilate the types of weather data, thereby embedding the local site characteristics into the model’s understanding. Consequently, our system exhibits enhanced robustness and adaptability and can generate reliable forecasts for previously unknown sites by leveraging feature similarity to infer local environmental conditions.

The goal of our research was to develop a common model for SPG forecasting across multiple sites. Therefore, to train the proposed network, NWP data from multiple sites were used simultaneously. The subscript $i$ represents the site label, consistent with its usage in Equations (1) and (2). Moreover, we used the following loss function:

$$L_{T} = \alpha L_{R} + \beta L_{C}, \qquad (3)$$

where $L_{R}$ and $L_{C}$ represent the regression and classification errors, respectively, and $\alpha$ and $\beta$ are the weight factors between the regression and the classification. In our experiment, we set $\alpha = \beta$.

The regression error was determined by measuring the total absolute error between the predicted SPG and the target SPG values, i.e., the actual hourly SPG data, across all sites $I$ and timestamps $T$:

$$L_{R} = \sum_{i \in I} \sum_{t \in T} |\mathrm{pred}_{i,t} - \mathrm{target}_{i,t}|. \qquad (4)$$

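The common cross-entropy loss was adopted for the classification error:

$$L_{C} = -\sum_{i \in I} q_{i} \log (C_{i}), \qquad (5)$$

where $q_{i}$ indicates the true label and $C_{i}$ indicates the predicted probability for site $i$. The leading minus sign follows the standard definition of cross-entropy and makes $L_{C}$ non-negative.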
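The combined objective of Equations (3) to (5) can be illustrated with a short pure-Python sketch. It is written for a single training sample and uses the conventional leading minus sign for cross-entropy so that the classification term is non-negative. The function names, the sample values, and the choice $\alpha = \beta = 1$ are assumptions; the text only states that $\alpha = \beta$.

```python
import math


def regression_loss(pred, target):
    """Equation (4): total absolute error over sites and timestamps."""
    return sum(
        abs(p - t)
        for pred_site, target_site in zip(pred, target)
        for p, t in zip(pred_site, target_site)
    )


def classification_loss(q, c):
    """Cross-entropy between a one-hot site label q and predicted probabilities c."""
    return -sum(qi * math.log(ci) for qi, ci in zip(q, c) if qi > 0)


def total_loss(pred, target, q, c, alpha=1.0, beta=1.0):
    """Equation (3): weighted sum of the regression and classification terms."""
    return alpha * regression_loss(pred, target) + beta * classification_loss(q, c)


# Made-up numbers: two sites, two timestamps each, and a 3-way site classifier.
pred = [[1.0, 2.0], [3.0, 5.0]]
target = [[1.5, 2.0], [2.0, 5.0]]
q = [0, 1, 0]            # true site label (one-hot)
c = [0.2, 0.5, 0.3]      # predicted site probabilities

l_r = regression_loss(pred, target)   # 0.5 + 0.0 + 1.0 + 0.0 = 1.5
l_c = classification_loss(q, c)       # -log(0.5)
l_t = total_loss(pred, target, q, c)
print(l_r, round(l_c, 4), round(l_t, 4))
```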

This innovation represents a significant advancement in SPG forecasting and allows our model to overcome the limitations of traditional forecasting methods. By equipping the encoder with the ability to discern and adapt to each site’s unique characteristics, we enabled more generalized and accurate predictions across a diverse array of SPG sites. This approach not only broadens the applicability of our forecasting model but also sets a new standard for addressing the complexities of renewable energy prediction in a geographically varied landscape.

3. Results and Discussion

3.1. Dataset

This study was conducted using SPG data measured at seven sites distributed nationwide across the Republic of Korea, as shown in Figure 3. The SPG data were recorded by the Korea Southern Power Corporation from 2013 to 2022. The data were published on a public data portal managed by the Ministry of Interior and Safety, Republic of Korea [18]. Data from 2013 to 2020 were used to train the proposed model, and the remaining data were used to calculate the results. Each SPG site had a different installation capacity, ranging from 93 to 187 kWp. For a clear comparison, the installation capacities of all sites were normalized to 100 kWp in the experimental results.
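The capacity normalization mentioned above can be pictured as a simple rescaling. The sketch below assumes a linear scaling by the ratio of the 100 kWp reference to the installed capacity; the paper does not spell out the exact procedure, and the function name and sample values are hypothetical.

```python
def normalize_to_100kwp(power_kw, capacity_kwp, reference_kwp=100.0):
    """Rescale measured output so that every site behaves like a 100 kWp plant.

    Linear scaling is an assumption; the source only states that capacities
    were normalized to 100 kWp.
    """
    if capacity_kwp <= 0:
        raise ValueError("capacity must be positive")
    return [p * reference_kwp / capacity_kwp for p in power_kw]


# Made-up hourly outputs for a site with the largest reported capacity (187 kWp).
normalized = normalize_to_100kwp([0.0, 93.5, 187.0], capacity_kwp=187.0)
print(normalized)  # [0.0, 50.0, 100.0]
```

With this scaling, an absolute error expressed in kW can be read directly as a percentage of installed capacity.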

The weather information utilized in this study was derived from day-ahead NWP data provided by the National Climate Data Center of the Republic of Korea Meteorological Administration [19].

The datasets used in this study are publicly available and were collected from various data sources within public data portals [18,19]. The data include solar generation and meteorological elements such as temperature, humidity, precipitation probability, cloudiness index, wind direction, and wind speed, collected from various sites across the Republic of Korea, as described in Table 2 and Table 3.

3.2. Experimental Setups

The experiments were performed on a system equipped with dual NVIDIA GeForce RTX 4090 GPUs using the PyTorch framework. The initial learning rate was set to 0.001 and adjusted during training by the LambdaLR scheduler to support convergence and model performance. An epsilon value of $1 \times 10^{-8}$ and a weight decay of $1 \times 10^{-3}$ were used.
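The learning-rate schedule can be illustrated with a short pure-Python sketch of the rule that PyTorch's `torch.optim.lr_scheduler.LambdaLR` applies: the learning rate at a given epoch is the initial rate multiplied by a user-supplied function of the epoch. The exponential decay used here and its factor of 0.99 are assumptions, since the paper does not report the function passed to the scheduler.

```python
BASE_LR = 1e-3  # initial learning rate reported in the text


def lr_lambda(epoch, decay=0.99):
    """Hypothetical multiplicative factor; the actual function is not given in the paper."""
    return decay ** epoch


def learning_rate(epoch):
    """LambdaLR-style rule: initial learning rate times lr_lambda(epoch)."""
    return BASE_LR * lr_lambda(epoch)


for epoch in (0, 100, 299):
    print(epoch, f"{learning_rate(epoch):.6f}")
```

In PyTorch itself, the same function would be passed as `lr_lambda` when constructing `LambdaLR` around an optimizer whose `lr` is set to the initial rate.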

A batch size of 30 was used to balance the memory constraints and computational efficiency. The training plan was structured to run for 300 epochs within each iteration, focusing on exposing the model to various data samples. The epoch yielding the smallest test loss over 50 iterations was chosen to indicate model performance. This approach, chosen after preliminary testing, aimed to maximize training effectiveness within computing resources, ensure a thorough learning process, and minimize overfitting against the background of task complexity and the selected model architecture.

We evaluated the performance of our proposed system with three performance metrics. The primary evaluation metric used was the Mean Absolute Error (MAE). The MAE was calculated as follows:

$$\mathrm{MAE}_{i} = \frac{1}{T} \sum_{t \in T} |\hat{y}_{i,t} - y_{i,t}|, \qquad (6)$$

where $\hat{y}_{i,t}$ and $y_{i,t}$ are the predicted SPG value and actual SPG value of site $i$ at timestamp $t$, and $T$ is the length of the observation period, e.g., 24 h. This metric provides a measure of the average magnitude of errors in the predictions, without considering their direction.

To provide a more comprehensive evaluation of our model, we also included the Mean Squared Error (MSE) and Root Mean Square Error (RMSE) as additional metrics. The MSE is defined as:

$$\mathrm{MSE}_{i} = \frac{1}{T} \sum_{t \in T} (\hat{y}_{i,t} - y_{i,t})^{2}. \qquad (7)$$

The MSE metric emphasizes larger errors more than the MAE due to the squaring of the error terms. The RMSE, being the square root of the MSE, is calculated as:

$$\mathrm{RMSE}_{i} = \sqrt{\frac{1}{T} \sum_{t \in T} (\hat{y}_{i,t} - y_{i,t})^{2}}. \qquad (8)$$

The RMSE provides a measure that maintains the units of the original data, making it interpretable in the context of the original values.
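The three metrics in Equations (6) to (8) can be computed for one site as in the following sketch. The function names and the four-point sample series are illustrative assumptions; in the paper the observation period is, e.g., 24 h.

```python
import math


def mae(pred, actual):
    """Equation (6): mean absolute error."""
    return sum(abs(p - a) for p, a in zip(pred, actual)) / len(actual)


def mse(pred, actual):
    """Equation (7): mean squared error."""
    return sum((p - a) ** 2 for p, a in zip(pred, actual)) / len(actual)


def rmse(pred, actual):
    """Equation (8): square root of the MSE, in the units of the data."""
    return math.sqrt(mse(pred, actual))


# Made-up values; errors are 0, 2, 3, 0.
pred = [0.0, 10.0, 20.0, 30.0]
actual = [0.0, 12.0, 17.0, 30.0]
print(mae(pred, actual), mse(pred, actual), f"{rmse(pred, actual):.3f}")
# 1.25 3.25 1.803
```

In this toy series the RMSE (about 1.80) exceeds the MAE (1.25) because squaring gives the two non-zero errors more weight, which is the behavior the text attributes to the MSE.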

3.3. Site-Specific Performance

Initially, we evaluated the performance of our proposed system on a site-specific basis; that is, we trained and tested the model independently for data from each site. In this scenario, the site-specific model operates without the need for a classification module because the model focuses solely on forecasting for a singular known site.

Table 4 lists the forecasting performances of the site-specific model.

The mean MAE across the seven sites was 3.43, with a standard deviation of 1.11. Given that the installation capacity of each site was normalized to 100 kWp, this corresponds to an average forecast error below 3.5% of the installation capacity. This level of accuracy is particularly noteworthy in the context of the Republic of Korea's regulatory framework for renewable energy grid integration, which stipulates a participation threshold based on forecasting accuracy that requires an MAE of no more than 8% for SPG [20]. Therefore, the proposed forecasting method in a site-specific scenario demonstrated excellent performance for SPG utilization, remaining well within the regulatory requirement and showcasing its potential to contribute effectively to the integration of renewable energy sources into the power grid.

In Table 4, site 5 exhibits outlier performance, with an MAE of 5.5, notably higher than the other sites. This deviation can be attributed to the fact that site 5 is located in an island region, as shown in Figure 3, where the accuracy of weather forecasting is generally low because of the unique meteorological conditions of the area. Despite this challenge, the performance at site 5 still met the regulatory requirement of remaining within an 8% MAE threshold for participation in the Republic of Korea’s renewable energy generation forecasting system. This underscores the robustness of the proposed forecasting method, demonstrating its capability to deliver satisfactory forecast performance, even in geographically challenging locations where weather prediction is inherently less precise.

The MSE across the seven sites averaged 64.49. Because the MSE squares the error terms, higher values indicate that some predictions had significantly larger errors. Site 5 again shows a much higher MSE (173.55) than the other sites, reinforcing the impact of its unique meteorological conditions. The relatively higher MSE at site 5 suggests that the model's predictions occasionally deviate significantly from actual values, which is consistent with the challenges of forecasting in this specific region.

The RMSE, which provides a measure that maintains the units of the original data, averaged 7.60 across all sites. The RMSE is particularly useful as it directly relates to the magnitude of errors in the same units as the predicted and actual SPG values. Site 5 had the highest RMSE (13.17), indicating that the errors at this site were substantial in size. Despite this, the average RMSE value of 7.60 across all sites demonstrates that the proposed model maintains a reasonable error margin, even when considering the challenging conditions at site 5.

Overall, the proposed forecasting method shows strong performance across most sites, significantly exceeding regulatory requirements. The higher errors observed at site 5 highlight areas for potential improvement, particularly in regions with unique meteorological conditions. The combination of MAE, MSE, and RMSE metrics provides a comprehensive evaluation of the model’s accuracy and robustness, confirming its effectiveness for SPG forecasting across diverse geographical locations. The results in Table 4 will be used as a baseline for subsequent experiments.

3.4. Multisite Performance

Subsequently, we assessed the performance of a common model designed for multisite applications. The weather data used to train the model were aggregated from all sites by combining the datasets into a single comprehensive dataset. This process involved normalizing and standardizing the data to ensure uniformity across different sites. Specifically, the weather data from 2013 to 2020 were used for training, while the data from 2021 to 2022 were reserved for testing.

To aggregate the data for training, we combined the meteorological elements from each site into a unified dataset. Each site’s data were pre-processed to handle any missing values and ensure consistency in the input features. This aggregated dataset was then used to train the common model, allowing it to learn from a diverse range of environmental conditions.

For testing, the trained common model was evaluated using the individual site data from 2021 to 2022. This approach ensured that the model’s performance was assessed on data that were not seen during the training phase, providing an unbiased evaluation of its generalization capabilities across different sites.

Investigating the performance of a common model is crucial as it represents a potentially straightforward solution for forecasting at unknown sites. By leveraging a comprehensive dataset that encompasses the variability across different geographical locations, the common model aims to generalize the forecasting capability, thereby facilitating accurate predictions, even for sites not explicitly represented in the training data. This approach simplifies the forecasting process for new or unknown sites and tests the model’s adaptability and scalability across diverse environmental conditions.

Table 5 presents a comparison of the forecasting performance of the proposed models. In the table, “Common w/o cls.” refers to the performance of the common model trained on data from sites 1 to 7 without including a classifier, while “Common w cls.” shows the performance outcomes of the proposed method with the applied classifier. The common model without a classifier exhibited an average performance decrease of 22% compared to the site-specific model results listed in Table 4. By contrast, the proposed method with the classifier showed an average performance decrease of 15%, i.e., 7 percentage points less degradation than the basic common model without the classifier. This highlights the efficacy of incorporating a classifier to enhance the forecasting accuracy of the common model, demonstrating significant mitigation of performance loss when extending the model to multisite applications.

In particular, for site 6, which exhibited one of the best forecast performances in the site-specific model analysis, the MAE of the basic common model without a classifier approximately doubled. However, in the case of the proposed model with the classifier, the increase in the MAE was within 50%. This demonstrates that including the classification model enables the encoder to incorporate site-specific information to a certain extent. This incorporation significantly aids the regressor in making more accurate predictions by providing it with contextually relevant features tailored to the characteristics of each site. The improvement in forecasting accuracy for site 6 underlines the value of integrating site-specific nuances into the common model framework, showcasing the classifier’s role in enhancing the adaptability and predictive capability of the model across diverse locations.

The MSE and RMSE values further substantiate these findings. The MSE across all sites for the common model without a classifier averaged 79.13, while for the model with a classifier, it averaged 55.54. The MSE without the classifier is thus 42% higher than with it; equivalently, the classifier reduces the MSE by about 30%. This substantial reduction in MSE highlights the classifier's role in mitigating larger errors, which is crucial for improving the overall robustness of the forecasting model. The RMSE averaged 8.69 for the model without a classifier and 7.21 for the model with a classifier, i.e., 21% higher without the classifier, or a reduction of about 17% with it. The consistent improvement in both MSE and RMSE metrics underscores the effectiveness of the classifier in enhancing the predictive performance of the common model.
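The percentages quoted for the MSE and RMSE depend on which value is taken as the reference. The short calculation below uses only the averages reported in the text and shows that the 42% and 21% figures arise when the model with the classifier is the reference; taking the model without the classifier as the reference gives smaller relative reductions. The helper name `pct_diff` is hypothetical.

```python
def pct_diff(value, reference):
    """Percentage by which `value` differs from `reference`."""
    return 100.0 * (value - reference) / reference


# Averages reported in the text.
mse_without, mse_with = 79.13, 55.54
rmse_without, rmse_with = 8.69, 7.21

# Reference = model with classifier (how much worse the basic model is).
mse_excess = pct_diff(mse_without, mse_with)
rmse_excess = pct_diff(rmse_without, rmse_with)

# Reference = model without classifier (how much the classifier reduces the error).
mse_reduction = -pct_diff(mse_with, mse_without)
rmse_reduction = -pct_diff(rmse_with, rmse_without)

print(f"MSE:  {mse_excess:.1f}% higher without, {mse_reduction:.1f}% lower with")
print(f"RMSE: {rmse_excess:.1f}% higher without, {rmse_reduction:.1f}% lower with")
```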

3.5. Unknown-Site Performance

To further demonstrate the advantages of the proposed system, additional experiments were conducted by dividing the seven sites into known and unknown categories and training the common model exclusively on data from the known sites. For instance, the common model was trained using data from known sites 1, 2, 3, and 7 and subsequently tested on unknown sites 4, 5, and 6.

Table 6 and Table 7 present the forecast performances of the common model for unknown sites, i.e., 4, 5, and 6. The common model was trained using data from known sites, i.e., 1, 2, 3, and 7 (Table 6) and sites 1, 3, and 7 (Table 7). The purpose of using different training sets for the common models in Table 6 and Table 7 was to evaluate the robustness and adaptability of our model under varying conditions. By training the model on different combinations of sites, we aimed to test its ability to generalize and predict SPG accurately across unknown sites with diverse environmental conditions.
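The known/unknown split described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' pipeline: the record layout (a dictionary with a `site` key) and the function name `split_by_site` are hypothetical, and the toy records stand in for the real SPG and NWP data.

```python
from typing import Dict, Iterable, List, Tuple

Record = Dict[str, object]  # hypothetical layout, e.g. {"site": 4, "date": "2013-01-01"}


def split_by_site(
    records: Iterable[Record],
    known_sites: Iterable[int],
    unknown_sites: Iterable[int],
) -> Tuple[List[Record], List[Record]]:
    """Return (train, test): train holds known-site records, test holds unknown-site records.

    Records from sites in neither group are dropped (e.g., site 2 in the Table 7 setup).
    """
    known, unknown = set(known_sites), set(unknown_sites)
    if known & unknown:
        raise ValueError("a site cannot be both known and unknown")
    train: List[Record] = []
    test: List[Record] = []
    for rec in records:
        if rec["site"] in known:
            train.append(rec)
        elif rec["site"] in unknown:
            test.append(rec)
    return train, test


# Toy data: one placeholder record per site.
records = [{"site": s, "date": "2013-01-01"} for s in range(1, 8)]

# Setup of Table 6: train on sites 1, 2, 3, 7; test on sites 4, 5, 6.
train_t6, test_t6 = split_by_site(records, [1, 2, 3, 7], [4, 5, 6])

# Setup of Table 7: train on sites 1, 3, 7; test on sites 4, 5, 6.
train_t7, test_t7 = split_by_site(records, [1, 3, 7], [4, 5, 6])
```

Splitting by site rather than by date keeps every observation of an unknown site out of training, which is what makes the test a measure of generalization to new locations.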

As expected, a significant drop in the prediction accuracy for unknown sites was observed compared to the baseline performance of the site-specific models. However, in all the cases of the common model presented in Table 6 and Table 7, the performance still met the 8% MAE threshold required for participation in the Republic of Korea’s renewable energy generation forecast system. Notably, implementing the proposed method resulted in an MAE improvement of approximately 3–6% over the basic common model. This enhancement can be attributed to the classifier’s utilization of stored information from the trained sites to assist in forecasting unknown sites. Even without direct historical data for these unknown locations, the classifier leverages similarities with known sites to make more accurate predictions, demonstrating the efficacy of incorporating site-specific characteristics through classification to improve the forecasting accuracy across new and diverse locations.
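The threshold claim can be checked directly against the per-site MAE values in Table 6 and Table 7. The sketch below assumes that the reported MAE values are expressed in the same percentage units as the 8% threshold and that "meeting" the threshold means not exceeding it; both are assumptions made for illustration.

```python
MAE_THRESHOLD = 8.0  # percent, as quoted in the text

# Per-site MAE for unknown sites 4, 5, and 6, copied from Table 6 and Table 7.
mae_unknown_sites = {
    "Table 6, w/o cls.": [4.82, 6.76, 7.63],
    "Table 6, w cls.": [4.54, 6.63, 6.93],
    "Table 7, w/o cls.": [5.21, 6.95, 7.31],
    "Table 7, w cls.": [5.12, 6.74, 7.11],
}

# A configuration meets the threshold if every site stays at or below it (assumed reading).
meets_threshold = {
    name: all(value <= MAE_THRESHOLD for value in values)
    for name, values in mae_unknown_sites.items()
}

# Smallest remaining margin to the threshold over all sites and configurations.
worst_mae = max(max(values) for values in mae_unknown_sites.values())
margin = MAE_THRESHOLD - worst_mae

if __name__ == "__main__":
    print(meets_threshold)
    print(f"worst MAE = {worst_mae}, margin = {margin:.2f}")
```

The tightest case is site 6 in Table 6 without the classifier, which leaves a margin of well under one percentage point.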

In addition to MAE, MSE and RMSE metrics provided further insights into the model’s performance. The MSE values in Table 6 for the common model without a classifier averaged 160.33, whereas the model with a classifier achieved an average MSE of 147.79, indicating an 8% improvement. Similarly, in Table 7, the MSE values were 165.98 without a classifier and 161.87 with a classifier, showing a reduction in error by approximately 3%. The reduction in MSE demonstrates the classifier’s effectiveness in minimizing larger errors, which is crucial for maintaining robust forecasting performance across diverse sites.

The RMSE values also reflect the classifier’s impact on model performance. In Table 6, the RMSE for the common model without a classifier averaged 12.51, while the model with a classifier had an RMSE of 11.99, representing a 4% improvement. In Table 7, the RMSE values were 12.76 without a classifier and 12.61 with a classifier, indicating a 1% reduction.

The impact of the classifier is also evidenced by the comparative forecast performance depicted in Table 6 and Table 7, where we observed a decrease in performance in Table 7 relative to that in Table 6. Specifically, Table 6 and Table 7 differ in the number of sites used to train the common model, with four sites considered in Table 6 and only three sites considered in Table 7. This reduction in the number of training sites directly affected the model’s domain generalization capability. The fewer the sites used in training, the less diverse the data available for the model to learn from, which can limit its ability to generalize accurately across new or unknown domains. This outcome highlights the critical role that the breadth of the training data plays in enhancing the robustness of the model and its ability to adapt to varied geographical and environmental conditions, underscoring the importance of incorporating as much site-specific data as possible into the training phase.

Furthermore, we explored the potential of a transfer learning (TL) scenario to enhance the adaptability of our forecasting model to specific sites. TL is particularly valuable when limited weather information is available for unknown sites [21]. By applying TL, we fine-tuned the network parameters, making the model more tailored and responsive to the conditions of a specific site. In detail, the Feature Encoder was frozen, and only the Regressor was retrained, with the same hyperparameters as in the original experiment, to preserve the parameter features trained with large-scale data. This approach ensures that the general features learned from the comprehensive dataset are retained while adapting the model to the unique characteristics of each site. To evaluate the effectiveness of the TL scenario, we posited a situation in which four months of weather data (equating to approximately 4% of the total training data) were accessible for each site’s training. This setup allowed us to investigate how even a small subset of site-specific data can significantly affect the model’s performance by adjusting its parameters via TL. The incorporation of TL is designed to bridge the knowledge gap between known and unknown sites, leveraging available site-specific information to improve forecasting accuracy and model generalization across various environments.
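The freeze-and-retrain idea can be illustrated with a deliberately small, pure-Python stand-in. This is a sketch under stated assumptions, not the paper's network: the one-layer tanh encoder, the linear regressor, the learning rate, and the synthetic site data are all hypothetical and only show that the encoder weights stay untouched while the regressor adapts to a small site-specific sample.

```python
import math
import random


def encode(x, enc_w):
    """Frozen feature encoder: a single tanh layer (stand-in for the real encoder)."""
    return [math.tanh(sum(w * xi for w, xi in zip(row, x))) for row in enc_w]


def predict(x, enc_w, reg_w, reg_b):
    """Regressor output: linear map applied to the encoded features."""
    h = encode(x, enc_w)
    return sum(w * hi for w, hi in zip(reg_w, h)) + reg_b


def mse(data, enc_w, reg_w, reg_b):
    return sum((predict(x, enc_w, reg_w, reg_b) - y) ** 2 for x, y in data) / len(data)


def fine_tune_regressor(data, enc_w, reg_w, reg_b, lr=0.05, epochs=200):
    """Stochastic gradient descent on the regressor only; enc_w is read, never updated."""
    reg_w = list(reg_w)
    for _ in range(epochs):
        for x, y in data:
            h = encode(x, enc_w)
            err = sum(w * hi for w, hi in zip(reg_w, h)) + reg_b - y
            # Gradient of the squared error with respect to the regressor parameters.
            reg_w = [w - lr * 2.0 * err * hi for w, hi in zip(reg_w, h)]
            reg_b -= lr * 2.0 * err
    return reg_w, reg_b


random.seed(0)
enc_w = [[0.5, -0.3], [0.2, 0.8]]  # pretend these were learned on the known sites
reg_w, reg_b = [0.0, 0.0], 0.0     # regressor to be adapted to the new site

# Synthetic "new site" sample: the target is linear in the encoded features.
site_data = []
for _ in range(40):
    x = [random.uniform(-1.0, 1.0), random.uniform(-1.0, 1.0)]
    h = encode(x, enc_w)
    site_data.append((x, 1.5 * h[0] - 0.7 * h[1] + 0.2))

enc_before = [row[:] for row in enc_w]
loss_before = mse(site_data, enc_w, reg_w, reg_b)
reg_w, reg_b = fine_tune_regressor(site_data, enc_w, reg_w, reg_b)
loss_after = mse(site_data, enc_w, reg_w, reg_b)
```

In a deep learning framework, the same effect is usually obtained by excluding the encoder parameters from the optimizer or disabling their gradients, so that only the regressor receives updates.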

Table 8 and Table 9 illustrate the outcomes of retraining the system using TL. Consistent with expectations, applying TL generally enhanced the prediction accuracy across all sites. Notably, site 6, which demonstrated strong performance in the site-specific model but experienced a decline in accuracy within the common model framework, showed a dramatic improvement because of site-specific fine-tuning. This significant enhancement underscores the necessity of a synergistic approach that combines the broad applicability of the common model with the tailored precision of site-specific models, depending on each site’s unique characteristics.

Moreover, across all outcomes, the proposed method yielded an additional 2–4% MAE performance improvement over the basic common model. This finding highlights the classifier module’s efficacy in refining the model’s capability to adapt and generalize across various sites, further reinforcing the argument for integrating such a module into the forecasting system. The role of the classifier in leveraging site-specific information, even when only a small amount of data is available, is crucial for enhancing the overall accuracy and adaptability of the model in a TL scenario.

However, when examining the MSE and RMSE metrics, we observed that the improvements are less pronounced, especially in Table 9. The MSE values in Table 8 show a slight improvement of about 2% with the inclusion of the classifier, while Table 9 shows a negligible improvement of 0.1%. Similarly, the RMSE values indicate minimal changes, with only a 0.2% improvement observed in Table 9.

This discrepancy between MAE and MSE/RMSE improvements can be attributed to the nature of these metrics. The MAE provides a linear measure of the average error magnitude, while the MSE and RMSE emphasize larger errors due to the squaring of differences. The less significant improvements in the MSE and RMSE suggest that while the overall average prediction error (as indicated by the MAE) decreased, the variance in error magnitude remained relatively unchanged. In other words, although the model with the classifier reduced the average error, it did not significantly mitigate the impact of larger prediction errors.
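A small synthetic example makes this behaviour concrete. The error values below are invented for illustration and are not taken from the paper: four small errors are halved while one large error is left unchanged.

```python
import math


def mae(errors):
    """Mean absolute error of a list of prediction errors."""
    return sum(abs(e) for e in errors) / len(errors)


def rmse(errors):
    """Root mean squared error of a list of prediction errors."""
    return math.sqrt(sum(e * e for e in errors) / len(errors))


# Synthetic prediction errors (hypothetical values, same units as the target).
baseline = [2.0, 2.0, 2.0, 2.0, 20.0]  # one large outlier error
improved = [1.0, 1.0, 1.0, 1.0, 20.0]  # small errors halved, outlier unchanged

mae_gain = 1.0 - mae(improved) / mae(baseline)     # about 14%
rmse_gain = 1.0 - rmse(improved) / rmse(baseline)  # about 1.5%

if __name__ == "__main__":
    print(f"MAE reduction:  {mae_gain:.1%}")
    print(f"RMSE reduction: {rmse_gain:.1%}")
```

Because squaring lets the single large error dominate the sum, the RMSE barely moves even though the MAE drops noticeably, which mirrors the pattern reported for Table 8 and Table 9.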

This outcome highlights the complexity of improving the forecasting accuracy across all error metrics and underscores the need for further refinement of the model to specifically address larger errors. Enhancing the model’s ability to consistently predict across all error magnitudes will be essential for achieving more robust performance improvements in the MSE and RMSE.

4. Conclusions

This study proposes an innovative DL-based approach for SPG forecasting that demonstrates considerable potential across multiple sites. Our research successfully navigated the complexities associated with SPG forecasting by leveraging a comprehensive dataset from various locations throughout the Republic of Korea and employing DL techniques to enhance forecasting accuracy by integrating common meteorological data and site-specific features.

A key contribution of our study is the development of a common model framework that incorporates a classification module, enabling effective adaptation to, and prediction for, both known and unknown sites. This methodology significantly enhances the model’s adaptability and generalization capabilities across diverse environmental conditions. The evaluation results demonstrate that our model maintains high performance levels across different SPG sites with minimal performance degradation compared to site-specific models.

By addressing these key areas and providing a robust evaluation, we have shown that our developed solution is both effective and adaptable, making it a valuable tool for improving SPG forecasting accuracy and supporting the integration of renewable energy into the power grid.

Although the proposed system shows promising results for SPG forecasting, it still has limitations. First, the system architecture was not fully optimized due to data collection issues. We found that a larger system architecture is necessary to deal with the complex behavior of the SPG system. Additionally, data augmentation schemes could be applied to improve model robustness. Moreover, it is necessary to take into account more meaningful information, such as the geographical characteristics of the site. Incorporating this information could enrich the feature extraction stages and ultimately improve the overall forecasting accuracy.

Future research should explore how the site set’s composition influences the common model’s performance. This involves investigating the problem of identifying an optimal set of information for configuring a common model to ensure that it can be effectively generalized across multiple sites. In addition, the development of hybrid models that combine the strengths of common and site-specific models is a promising avenue. Such hybrid approaches can leverage the broad applicability of common models while incorporating the precision and adaptability of site-specific models, potentially leading to superior forecasting accuracy and robustness under diverse environmental conditions.

Moreover, while our current model incorporates the solar elevation and azimuth angle data to account for basic seasonal variations in solar irradiance and utilizes multi-year weather data (2013 to 2020 for training and 2021 to 2022 for testing) to inherently consider seasonal effects, future research could significantly benefit from a more explicit seasonal approach. This would involve collecting and analyzing data across different seasons to better understand and predict the impact of seasonal changes on SPG. Implementing such seasonal approaches could enhance the accuracy and robustness of SPG forecasting models, thereby improving their practical application in diverse climatic conditions.

Author Contributions

Conceptualization, S.Y.J., B.T.O. and E.O.; methodology, S.Y.J., B.T.O. and E.O.; software, S.Y.J.; validation, S.Y.J., B.T.O. and E.O.; formal analysis, B.T.O. and E.O.; investigation, S.Y.J., B.T.O. and E.O.; resources, S.Y.J., B.T.O. and E.O.; data curation, S.Y.J., B.T.O. and E.O.; writing—original draft preparation, S.Y.J.; writing—review and editing, B.T.O. and E.O.; visualization, S.Y.J., B.T.O. and E.O.; supervision, B.T.O. and E.O.; project administration, E.O.; funding acquisition, E.O. All authors have read and agreed to the published version of the manuscript.

Funding

This research was supported by the Korea Electric Power Corporation (Grant number: R22XO02-23) and the Ministry of Science and ICT (MSIT), Republic of Korea, through the Information Technology Research Center (ITRC) Support Program supervised by the Institute for Information & Communications Technology Planning & Evaluation (IITP) under Grant RS-2023-00259004.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

Publicly available datasets were analyzed in this study. These data can be found at https://www.data.go.kr/data/15043386/fileData.do (accessed on 8 April 2024).

Conflicts of Interest

The authors declare no conflicts of interest.

References

1. Camera, F.L. Renewable Capacity Statistics 2023; International Renewable Energy Agency (IRENA): Masdar City, Abu Dhabi, 2023; ISBN 978-92-9260-525-4.
2. Mandys, F.; Chitnis, M.; Silva, S.R.P. Levelized cost estimates of solar photovoltaic electricity in the United Kingdom until 2035. Patterns 2023, 4, 100735.
3. Massidda, L.; Bettio, F.; Marrocu, M. Probabilistic day-ahead prediction of PV generation. A comparative analysis of forecasting methodologies and of the factors influencing accuracy. Sol. Energy 2024, 271, 112422.
4. Rajasundrapandiyanleebanon, T.; Kumaresan, K.; Murugan, S.; Subathra, M.; Sivakumar, M. Solar energy forecasting using machine learning and deep learning techniques. Arch. Comput. Methods Eng. 2023, 30, 3059–3079.
5. Benti, N.E.; Chaka, M.D.; Semie, A.G. Forecasting renewable energy generation with machine learning and deep learning: Current advances and future prospects. Sustainability 2023, 15, 7087.
6. Gao, J.; Heng, F.; Yuan, Y.; Liu, Y. A novel machine learning method for multiaxial fatigue life prediction: Improved adaptive neuro-fuzzy inference system. Int. J. Fatigue 2024, 178, 108007.
7. Dhaked, D.K.; Dadhich, S.; Birla, D. Power output forecasting of solar photovoltaic plant using LSTM. Green Energy Intell. Transp. 2023, 2, 100113.
8. Wang, L.; Mao, M.; Xie, J.; Liao, Z.; Zhang, H.; Li, H. Accurate solar PV power prediction interval method based on frequency-domain decomposition and LSTM model. Energy 2023, 262, 125592.
9. Li, G.; Wei, X.; Yang, H. Decomposition integration and error correction method for photovoltaic power forecasting. Measurement 2023, 208, 112462.
10. Marinho, F.P.; Rocha, P.A.; Neto, A.R.; Bezerra, F.D. Short-term solar irradiance forecasting using CNN-1D, LSTM, and CNN-LSTM deep neural networks: A case study with the Folsom (USA) dataset. J. Sol. Energy Eng. 2023, 145, 041002.
11. Anu Shalini, T.; Sri Revathi, B. Hybrid power generation forecasting using CNN based BILSTM method for renewable energy systems. Autom. Časopis Autom. Mjer. Elektron. Računarstvo Komun. 2023, 64, 127–144.
12. Khan, N.; Ullah, F.U.M.; Haq, I.U.; Khan, S.U.; Lee, M.Y.; Baik, S.W. AB-net: A novel deep learning assisted framework for renewable energy generation forecasting. Mathematics 2021, 9, 2456.
13. Abou Houran, M.; Bukhari, S.M.S.; Zafar, M.H.; Mansoor, M.; Chen, W. COA-CNN-LSTM: Coati optimization algorithm-based hybrid deep learning model for PV/wind power forecasting in smart grid applications. Appl. Energy 2023, 349, 121638.
14. Alharkan, H.; Habib, S.; Islam, M. Solar power prediction using dual stream CNN-LSTM architecture. Sensors 2023, 23, 945.
15. Vaswani, A.; Shazeer, N.; Parmar, N.; Uszkoreit, J.; Jones, L.; Gomez, A.N.; Kaiser, Ł.; Polosukhin, I. Attention is all you need. Adv. Neural Inf. Process. Syst. 2017, 30, 1–11.
16. Al-Ali, E.M.; Hajji, Y.; Said, Y.; Hleili, M.; Alanzi, A.M.; Laatar, A.H.; Atri, M. Solar energy production forecasting based on a hybrid CNN-LSTM-transformer model. Mathematics 2023, 11, 676.
17. Zhu, J.; Zhao, Z.; Zheng, X.; An, Z.; Guo, Q.; Li, Z.; Sun, J.; Guo, Y. Time-Series Power Forecasting for Wind and Solar Energy Based on the SL-Transformer. Energies 2023, 16, 7610.
18. Public Data Portal. Ministry of the Interior and Safety, South Korea. 2024. Available online: https://www.data.go.kr/index.do (accessed on 8 April 2024).
19. Open MET Data Portal. Korea Meteorological Administration, South Korea. 2024. Available online: https://data.kma.go.kr/resources/html/en/aowdp.html (accessed on 8 April 2024).
20. Yu, H.-U.; Kim, S.; Wi, Y.-M.; Lee, J. An Offer Method for Photovoltaic Power Plants with ESSs Considering Incentives for Forecasting Accuracy of Renewable Generation. Trans. Korean Inst. Electr. Eng. 2022, 71, 1076–1083.
21. Miraftabzadeh, S.M.; Colombo, C.G.; Longo, M.; Foiadelli, F. A day-ahead photovoltaic power prediction via transfer learning and deep neural networks. Forecasting 2023, 5, 213–228.

Figure 1. SPG forecasting system architecture.

Figure 2. Training system using the classifier for site estimation.

Figure 3. SPG site map.

Table 1. DL-based SPG forecast methods.

Ref.	Model	Description
[7]	LSTM	LSTM with grid search-based hyperparameter fine-tuning
[8]	BO-LSTM	Frequency-domain data decomposition and LSTM with BO-based hyperparameter tuning
[9]	AGT-LSTM	Fuzzy-based data decomposition and LSTM with AGT optimizer-based hyperparameter tuning
[10]	CNN-LSTM	Hybrid CNN-LSTM model with statistical data model
[11]	CNN-biLSTM	Hybrid CNN-bidirectional LSTM model with statistical data model
[12]	AE-biLSTM	Hybrid Auto Encoder-bidirectional LSTM model with statistical data model
[13]	COA-CNN-LSTM	CNN-LSTM with coati optimization algorithm (COA)-based hyperparameter tuning
[14]	DOCNN-LSTM	Dual-stream CNN-LSTM with an attention mechanism
[16]	CNN-LSTM-Transformer	Transformer combining CNN-LSTM
[17]	SG-LOF-Transformer	Transformer with SG-LOF data filtering

Table 2. Description of the used SPG data.

Site ID	Date	Generation by hour of day
		00 h	…	08 h	09 h	…	22 h	23 h
Site 1	2013-01-01	0.00	…	9.88	32.60	…	0.00	0.00
Site 1	2013-01-02	0.00	…	10.52	39.02	…	0.00	0.00
⋮	⋮	⋮	⋮	⋮	⋮	⋮	⋮	⋮

Table 3. Description of the used NWP data.

Site ID	Date	Hour	TMP (°C)	REH (%)	POP (%)	SKY	VEC (°)	WSD (m/s)
Site 1	2013-01-01	00	7.3	52	15	2	279	4.9
Site 1	2013-01-01	01	5.8	50	12	2	281	5.8
⋮	⋮	⋮	⋮	⋮	⋮	⋮	⋮	⋮

Table 4. Forecast performance of the proposed site-specific model.

Model	Loss	Site 1	Site 2	Site 3	Site 4	Site 5	Site 6	Site 7	Avg.
Site-specific	MAE	3.16	3.33	3.35	3.34	6.08	1.72	3.65	3.52
	MSE	50.34	51.19	50.77	51.97	173.55	14.06	59.58	64.49
	RMSE	7.10	7.15	7.13	7.21	13.17	3.75	7.72	7.60

Table 5. Forecast performance of the common model. The common models are trained on data from sites 1–7. Values in parentheses are improvements over the model without the classifier.

Model	Loss	Site 1	Site 2	Site 3	Site 4	Site 5	Site 6	Site 7	Avg.
Common w/o cls.	MAE	3.90	4.30	3.34	3.85	5.99	3.47	4.45	4.19
	MSE	62.88	75.96	48.89	63.59	170.25	52.70	79.62	79.13
	RMSE	7.93	8.72	6.99	7.97	13.05	7.26	8.92	8.69
Common w cls.	MAE	4.37	3.56	3.47	3.67	5.78	2.45	4.24	3.93 (7%)
	MSE	50.96	41.19	46.83	56.17	113.21	14.89	65.55	55.54 (42%)
	RMSE	7.14	6.42	6.84	7.49	10.64	3.86	8.10	7.21 (21%)

Table 6. Forecast performance of the common model for unknown sites. The common models are trained on data from known sites (1, 2, 3, and 7). Values in parentheses are improvements over the model without the classifier.

Model	Loss	Site 4	Site 5	Site 6	Avg.
Common w/o cls.	MAE	4.82	6.76	7.63	6.40
	MSE	94.27	198.87	187.86	160.33
	RMSE	9.71	14.10	13.71	12.51
Common w cls.	MAE	4.54	6.63	6.93	6.03 (6%)
	MSE	85.35	194.40	163.62	147.79 (8%)
	RMSE	9.23	13.94	12.79	11.99 (4%)

Table 7. Forecast performance of the common model for unknown sites. The common models are trained on data from known sites (1, 3, and 7). Values in parentheses are improvements over the model without the classifier.

Model	Loss	Site 4	Site 5	Site 6	Avg.
Common w/o cls.	MAE	5.21	6.95	7.31	6.49
	MSE	106.86	212.97	178.12	165.98
	RMSE	10.34	14.59	13.35	12.76
Common w cls.	MAE	5.12	6.74	7.11	6.32 (3%)
	MSE	105.18	204.63	175.88	161.87 (3%)
	RMSE	10.26	14.30	13.26	12.61 (1%)

Table 8. Forecast performance of the common model with TL for unknown sites. The common models are trained on data from known sites (1, 2, 3, and 7). Values in parentheses are improvements over the model without the classifier.

Model	Loss	Site 4	Site 5	Site 6	Avg.
Common + TL w/o cls.	MAE	3.71	4.97	1.91	3.53
	MSE	59.40	111.51	16.32	62.41
	RMSE	7.71	10.56	4.04	7.45
Common + TL w cls.	MAE	3.39	4.94	1.83	3.39 (4%)
	MSE	56.17	113.21	14.89	61.42 (2%)
	RMSE	7.49	10.64	3.86	7.33 (2%)

Table 9. Forecast performance of the common model with TL for unknown sites. The common models are trained on data from known sites (1, 3, and 7). Values in parentheses are improvements over the model without the classifier.

Model	Loss	Site 4	Site 5	Site 6	Avg.
Common + TL w/o cls.	MAE	3.65	4.88	1.88	3.47
	MSE	59.26	109.00	16.19	61.48
	RMSE	7.70	10.44	4.02	7.39
Common + TL w cls.	MAE	3.52	4.86	1.85	3.41 (2%)
	MSE	57.73	110.32	16.13	61.39 (0.1%)
	RMSE	7.60	10.50	4.02	7.37 (0.2%)

© 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
