Article

An Evidential Solar Irradiance Forecasting Method Using Multiple Sources of Information

by Mohamed Mroueh 1, Moustapha Doumiati 2,*, Clovis Francis 3 and Mohamed Machmoum 4

1 Triskell Consulting, 32 Rue Arago, 92800 Puteaux, France
2 IREENA Lab UR 4642, Electrical and Electronics Department, ESEO, 10 Bd Jeanneteau, 49100 Angers, France
3 Arts et Métiers Paris Tech, Châlons en Champagne, Department of Design, Industrialization, Risk, and Decision (CIRD), 51000 Châlons en Champagne, France
4 IREENA Lab UR 4642, Nantes University, 37 Bd de l’université, 44602 Saint Nazaire, France
* Author to whom correspondence should be addressed.
Energies 2024, 17(24), 6361; https://doi.org/10.3390/en17246361
Submission received: 5 November 2024 / Revised: 11 December 2024 / Accepted: 13 December 2024 / Published: 18 December 2024

Abstract:
In the context of global warming, renewable energy sources, particularly wind and solar power, have garnered increasing attention in recent decades. Accurate forecasting of the energy output in microgrids (MGs) is essential for optimizing energy management, reducing maintenance costs, and prolonging the lifespan of energy storage systems. This study proposes an innovative approach to solar irradiance forecasting based on the theory of belief functions, introducing a novel and flexible evidential method for short-to-medium-term predictions. The proposed machine learning model is designed to effectively handle missing data and make optimal use of available information. By integrating multiple predictive models, each focusing on different meteorological factors, the approach enhances forecasting accuracy. The Yager combination method and pignistic transformation are utilized to aggregate the individual models. Applied to a publicly available dataset, the method achieved promising results, with an average root mean square error (RMSE) of 27.83 W/m2 calculated from eight distinct forecast days. This performance surpasses the best reported result of 30.21 W/m2 from recent comparable studies for one-day-ahead solar irradiance forecasting. Comparisons with deep learning-based methods, such as long short-term memory (LSTM) networks and recurrent neural networks (RNNs), demonstrate that the proposed approach is competitive with state-of-the-art techniques, delivering reliable predictions with significantly less training data. The full potential and limitations of the proposed approach are also discussed.

1. Introduction

1.1. Context and Motivations

In recent years, interest in renewable energy sources has grown as a means of overcoming the depletion of the world’s traditional energy supplies [1] (fuel, natural gas, coal, and even uranium). In 2019, the European Commission presented the Green Deal [2] for Europe, a road map to reach climate neutrality by 2050. To ensure this goal is reached, the European Union (EU) executives set up the “Fit for 55” package [3] in July 2021. Among other measures, this package raises the target share of renewable energies on the continent to 40% by 2030, up from the initial 32%. These are some of the reasons why electric utilities are focusing more on efficient, ecologically friendly, and cost-effective solutions [4,5,6,7,8,9]. One of the most significant advances associated with the energy transition is the implementation of the microgrid (MG) [10,11,12], a network structure that integrates localized energy management systems with renewable energy sources in a decentralized manner. An MG is a controllable system which supplies a nearby region with electric power by combining loads, distributed sources, and power converters. Over the last several years, the MG concept has become substantially more prevalent. However, the intermittent nature of renewable energy sources makes it difficult for MGs to keep up with customer demand [11,12]. Consequently, an energy management system (EMS) incorporating an energy storage system (ESS) is necessary. In most cases, ESSs maintain the power balance between production and consumption by storing electricity during low-cost or off-peak hours and discharging it during high-cost or peak hours.
For most energy storage technologies, an accurate prediction of the power to be produced enables more effective use of ESS units, which in turn extends their lifetime. Forecasting energy production is therefore essential for an EMS, as it may result in large cost savings, simpler maintenance of grid components, and the avoidance of eventual faults in the MG.
Photovoltaic generators are one of the most well-known techniques for producing renewable energy. Using the properties of semiconductors such as silicon, solar radiation is converted into electrical energy [13,14]. The generation of solar energy is highly reliant on both the sun’s position and the meteorological conditions. Hence, utilizing meteorological characteristics and astronomical data about the sun is often required to forecast solar energy production. However, since the model of energy production is dynamic, nonlinear, and parameter-dependent, it is exceedingly difficult to forecast the amount of electric power which will be generated. In addition, the proliferation of decentralized energy resources raises the level of uncertainty in power networks, thus increasing the complexity of providing an exact estimate of the amount of generated electricity. Furthermore, in real-world applications, the penalties for underpredictions and overpredictions are drastically different, depending on the corresponding financial applications, which makes prediction assessment more subjective. To tackle the challenges posed by the inherent uncertainties in solar power generation, robust forecasting models can be constructed utilizing time series weather data.
The primary contribution of this study is the proposal of a novel predictive method based on evidence theory. This method demonstrates the capability to operate effectively even when confronted with an incomplete learning database. It also exhibits low computational complexity, making it efficient even with limited training data. Additionally, the suggested method is flexible and modular, and it proves to be competitive when compared to other state-of-the-art machine learning approaches applied to solar energy production forecasting.

1.2. Brief Overview of Forecasting Techniques and Time Horizons

Accurate generation forecasting is essential for renewable energy systems, particularly in MGs, where energy production is heavily influenced by natural variability and cannot be easily controlled. According to [14,15,16], forecasting models are categorized based on their prediction horizons, each tailored to specific operational and planning needs within MGs:
  • Very short-term forecasting covers from a few minutes to 1 h and is critical for real-time MG operations, electricity market clearing, and immediate regulatory actions.
  • Short-term forecasting spans from 1 h to 1–2 days, supporting MG energy dispatch, operational security, and load-balancing decisions.
  • Medium-term forecasting encompasses 5–7 days, aiding decisions on resource allocation, unit commitment, and storage optimization in MGs.
  • Long-term forecasting extends beyond 1 week, facilitating maintenance scheduling, strategic planning, and overall operational management in MG systems. Forecasting uncertainty increases with longer time horizons due to the unpredictable nature of influencing factors.
Forecasting techniques can be broadly classified into three categories based on their approach and computational requirements [14]:
  • Physical models [17] rely on meteorological data, such as numerical weather prediction (NWP), to estimate solar irradiance. These models incorporate local physical influences, adapting data through solar PV models and power curves. While they provide high accuracy, especially for long-term forecasting, they often require significant computational resources, making them less practical for real-time MG applications.
  • Statistical models [18] use historical data to forecast solar irradiance and are well suited for short-term predictions. These methods analyze time series data to identify patterns, which are then used to forecast future values. Common statistical approaches include regression models, persistence models, moving averages, and ARIMA. Although computationally efficient, statistical models may lack precision compared with physical models, particularly when dealing with complex and nonlinear data.
  • Intelligent techniques [14,19] are ideal for handling non-stationary and erratic time series data, such as solar irradiance. These techniques, including neural networks, genetic algorithms, and belief function theory (BFT), offer greater flexibility and reduced computational complexity. Machine learning models, including hybrid approaches which combine multiple techniques, are particularly effective for predicting solar behavior based on historical data, accommodating uncertainty, and improving forecast accuracy in a dynamic MG environment.
Intelligent techniques and hybrid models have become increasingly favored due to their ability to manage uncertainty, adapt to real-time changes, and optimize energy distribution with minimal computational overhead. In this context, this study proposes an innovative, intelligent evidential approach [20,21] to solar irradiance forecasting which is particularly suitable for short-to-medium-term scenarios within MGs, enabling more reliable energy management and decision making. The strategy consists of constructing many predictors, each based on a different meteorological component, and then proceeding to the fusion of all information sources using the BFT framework. Predictors utilize past information (historical data) to predict the solar irradiance values.

1.3. Related Works

In the early 2000s, Cao et al. [22] presented a combination of an artificial neural network and wavelet analysis to predict solar irradiance. This approach is typical of sample data preparation utilizing wavelet transformation for forecasting. The anticipated solar irradiance is simply the summation of all the forecasted components acquired by the various recurrent neural networks (RNNs), whose time–frequency domains match appropriately. When adjusting the weights and biases of the networks during network training, discount coefficients are employed to account for the influence of various time steps on the accuracy of the final prediction. On the basis of a mix of recurrent BP networks and wavelet analysis, an enhanced model for forecasting solar irradiance was constructed. However, the optimal updating of weights and biases was not studied.
More recently, in 2018, Qing et al. [23] provided a solar prediction method for hourly day-ahead solar irradiance prediction using weather forecasting data, where the prediction problem was formulated as a structured output forecasting model which simultaneously predicted several outputs. The prediction model was trained using long short-term memory (LSTM) networks, which accounted for the interdependence between consecutive hours within the same day. The proposed method was evaluated on a dataset collected on the island of Santiago in Cape Verde. The results demonstrated that the proposed algorithm was 18.34% more accurate than the backpropagation algorithm in terms of root mean square error (RMSE), using about 2 years of training data to predict the half-year testing data.
In the following year, Husein et al. [24] opted to rely exclusively on meteorological variables, such as the dry-bulb temperature, dew point temperature, and relative humidity, for generating their solar irradiance forecasts, addressing the challenge posed by the absence of historical solar irradiation data in certain scenarios. To achieve this, they employed a deep recurrent neural network architecture with long short-term memory units (LSTM-RNN). This model was designed to predict the hourly solar irradiance one day ahead, effectively capturing temporal dependencies in the meteorological data. To comprehensively evaluate the proposed approach, six experiments were conducted using weather station data from Germany, the USA, Switzerland, and South Korea, representing diverse climate types. The results demonstrated that the proposed method outperformed feedforward neural networks (FFNNs) and achieved an RMSE as low as 60.31 W/m2. Additionally, compared with the persistence model, it achieved an average forecast skill of 50.90%, with improvements of up to 68.89% on specific datasets.
Within the same year, Yu et al. [25] presented an LSTM-based technique for short-term forecasting of global horizontal irradiance (GHI), one hour and one day in advance. A clearness index was introduced as input data for the LSTM model to improve prediction accuracy on cloudy days, and the type of weather was classified by k-means during data processing, where cloudy days were classified as cloudy and mixed (partially cloudy). This choice was informed by an analysis of ANN and SVR results reported in the literature: the authors stated that inaccurate forecasts typically took place on cloudy days.
After a short period, Wojtkiewicz et al. [26] evaluated the use of gated recurrent units (GRUs) to predict solar irradiance and reported the results of utilizing multivariate GRUs to estimate the hourly solar irradiance in Phoenix, Arizona. Using purely historical solar irradiance data as well as the inclusion of external meteorological factors and cloudiness data, the authors compared and assessed the performance of GRUs and LSTM. Their discussions prove that the proposed deep learning methods could be further improved by incorporating more detailed cloud cover-related features.
At the end of the same year, Aslam et al. [27] published a comparative study of various deep learning approaches for forecasting one-year-ahead hourly and daily solar radiation using historical solar radiation data and clear-sky global horizontal irradiance. These approaches include gated recurrent units (GRUs), LSTM, RNNs, feedforward neural networks (FFNNs), and support vector regression (SVR). According to this study, the two most effective and state-of-the-art deep learning models are LSTM and GRUs. A GRU has two gates, unlike LSTM, which has three, resulting in a less complex structure; therefore, fewer operations are needed for GRUs compared with LSTM.
In 2020, Hui et al. [28] proposed a probabilistic hybrid approach for solar irradiance forecasting which integrates a deep recurrent neural network with residual modeling. Specifically, an LSTM-based point forecast is generated using historical data along with relevant features. These point predictions are subsequently utilized as inputs for estimating residual distributions. The parameters of these residual distributions are then determined via maximum likelihood estimation. The final probabilistic forecast is obtained by simultaneously considering both the point prediction and the corresponding residual distribution.
In the same year, Byung-ki et al. [29] suggested a model based on a long short-term memory algorithm which made use of limited input data as well as data from other locations. The authors stated that it is feasible to develop a model with one-time learning utilizing national and international data, and the suggested model has the ability to predict solar irradiance by using weather predictions for the next day provided by the Korea Meteorological Administration as well as daily solar irradiance.
More recently, the authors of [30] proposed similarity-based forecasting models (SBFMs) for predicting photovoltaic (PV) power at a high temporal resolution, utilizing weather variables which were measured at a lower temporal resolution. As a case study, the model forecasted the PV power generated by the solar panels installed on the rooftop of a commercial building for the following day in five-minute intervals, considering various scenarios of available weather data. The results indicated that the proposed SBFMs could achieve a greater forecasting accuracy than several benchmark models by relying on only two or three weather variables.
In [14], the authors presented an extensive review of forecasting models and performance metrics documented in the literature, with a particular emphasis on short-term forecasting for wind and solar power generation. It included a detailed analysis of the data duration used by each model and provided a comparative evaluation of their performance metrics through a comprehensive overview. In [13], various deep neural network (DNN)-based models were investigated for the estimation of prediction intervals (PIs) in the context of regional solar and wind energy forecasting. Another research work [31] applied deep learning techniques to predict photovoltaic energy generation in residential systems. The study leveraged real-world data to evaluate the effectiveness of LSTM networks, convolutional neural networks (CNNs), and hybrid convolutional-LSTM models across different forecasting horizons for photovoltaic power production. The models were trained using aggregated historical data from hundreds of residential PV systems within a region, and their performance was assessed in terms of predicting energy generation both for the entire region and for individual systems. The authors of [19] presented a novel CNN-CatBoost hybrid approach for predicting solar radiation using 1 h weather observation data from the Korean Weather Data Open Portal. The model predicts solar radiation based on extra-atmospheric solar radiation and three weather variables: temperature, relative humidity, and total cloud volume. The hybrid model surpassed the CNN-single model, achieving improved average absolute mean error accuracy in solar radiation prediction. The research presented in [32] introduced a systematic, robust approach to improving resilience against missing features in energy forecasting applications using robust optimization. Specifically, the authors developed a robust regression model designed to effectively manage missing features during test time. 
The authors of [33] presented a comprehensive evaluation of four prominent machine learning models, namely the bidirectional gated recurrent unit (BiGRU), bidirectional long short-term memory (BiLSTM), the simple bidirectional recurrent neural network (BiRNN), and unidirectional LSTM, in the context of solar power yield time series forecasting.

1.4. Contributions

As discussed previously, solar energy forecasting research has traditionally focused on neural network architectures, with an increasing emphasis on deep learning techniques. Although these methods achieve high accuracy, their performance is heavily dependent on the availability of large, high-quality datasets. However, in practical applications, data quality and availability are frequently hindered by challenges such as network latency, sensor malfunctions, or data integrity issues. These challenges can lead to incomplete datasets and reduced forecasting reliability, underscoring the need for alternative approaches. This study introduces an innovative method grounded in the BFT framework which incorporates multiple predictive models to address critical issues in solar energy forecasting. The proposed approach excels in managing uncertainty, effectively handling missing data while maximizing the utility of available information. It also supports rapid real-time execution, delivering reliable short-term forecasts in a few seconds, a crucial advantage for dynamic decision-making systems. The key contributions of this research work are as follows:
  • Competitive performance: The method demonstrates accuracy on par with traditional techniques, achieving reliable forecasts even in challenging data environments. It achieves a root mean square error (RMSE) of 27.83 W/m2, compared with the best result of 30.21 W/m2 reported in [29] for one-day-ahead solar irradiance forecasting.
  • Robust handling of incomplete data: The approach efficiently manages missing or partial datasets, maintaining forecasting accuracy despite real-world data limitations.
  • Reduced data requirements: Compared with deep learning models, the method delivers reliable results with significantly smaller datasets, addressing the practical constraints of data availability.
By addressing these challenges, this study presents a resilient and efficient alternative to conventional methods, enhancing the practicality and reliability of solar energy forecasting systems.

2. Material and Methods

2.1. Database

In this study, we used data collected from the Saaleaue weather station managed by the Max Planck Institute of Meteorology [34]. This dataset was chosen in this study for its completeness and public availability. Moreover, as the dataset includes a wide range of features, it enabled meaningful comparisons of the proposed method’s performance with other techniques in the literature which utilize different subsets of features. It is worth noting that the proposed method would remain operational if applied to a different dataset.
This subsection provides geographical features and a structural presentation of the employed database.

2.1.1. Geographical Location

The geographical characteristics are provided so that the outcomes of this research may be properly interpreted. The weather station (see Figure 1) is located at the coordinates 50°57′04.8″ N, 11°37′29.0″ E in the Jena Experiment field [35], near the Saale River in Germany.
Table 1 gives some relevant details about this region’s solar potential, and Figure 2 shows the solar trajectory from the perspective of the weather station during 2022.

2.1.2. Data Structure

The database used in this study is presented as a compilation of freely accessible CSV files. Each column represents a particular meteorological or solar component, whereas each row corresponds to measurements made at the same time. The measurement date and time are listed in the first column. The shortwave downward radiation (SWDR) column, which represents the solar irradiance this work aims to predict, was considered the most pertinent feature in our analysis: it directly determines the solar energy received at the Earth’s surface, plays a crucial role in various applications including solar energy forecasting, and therefore served as the primary prediction target in this study. Based on their availability ratio in the database, their predictability according to current climatological models, and their correlation with the SWDR values, we identified a particular set of weather features useful for this study among those available (see Table 2).

2.1.3. Preprocessing

There are different preprocessing steps which must be taken into consideration in order to use this database. Extreme outliers were eliminated first, since they are indicative of technical data acquisition issues. Then, a two-hour moving average filter was applied to the SWDR signal to remove high frequencies caused by measurement noise and sensor limitations. Next, the SWDR signal was resampled with a period of one hour to accommodate the needs of this study. Note that some gaps are present in the database. For convenience, artificial measurement points with no meaningful values (NaN) were inserted in the series. For the rest of this paper, this processed signal is referred to as the “solar irradiance”, denoted as $y_i$ with $i \in \mathbb{N}$.
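The preprocessing steps above can be sketched as follows. This is a minimal illustration assuming the raw SWDR signal is available as a pandas Series with a datetime index and a 10 min sampling period; the function name and the 1500 W/m2 outlier threshold are illustrative choices, not taken from the paper:

```python
import numpy as np
import pandas as pd

def preprocess_swdr(swdr: pd.Series, outlier_max: float = 1500.0) -> pd.Series:
    """Clean the raw SWDR signal: drop extreme outliers, smooth, resample hourly."""
    # 1. Remove physically implausible extremes (data-acquisition glitches).
    s = swdr.where((swdr >= 0) & (swdr <= outlier_max))
    # 2. Two-hour moving average to suppress measurement noise
    #    (with 10 min raw sampling, a 2 h window spans 12 points).
    s = s.rolling("120min", min_periods=1).mean()
    # 3. Resample to a one-hour period; gaps simply remain as NaN,
    #    playing the role of the artificial measurement points described above.
    return s.resample("60min").mean()

# Toy demonstration: a synthetic 10-minute signal containing one spurious spike
idx = pd.date_range("2017-08-08 10:00", periods=18, freq="10min")
raw = pd.Series(400.0, index=idx)
raw.iloc[5] = 9e6  # acquisition glitch, removed in step 1
y = preprocess_swdr(raw)
```

After cleaning, the hourly series contains only plausible values, and any database gap surfaces as a NaN entry rather than silently shifting the time index.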

2.2. Feature Extraction

In the context of this study, every irradiance value $y_i$ recorded at time $t_i$ ($i \in \{0, 1, 2, \dots, I\}$) was linked to a feature vector $x_i$. The dimension of this vector represents the number of calculated features, and the integer $i$ refers to the index of the measure in the database. Since the measures form a regularly sampled discrete signal, one can express $t_i$ as a function of $i$ and $T$, the time separating two consecutive samples (i.e., the sampling period): $t_i = t_0 + iT$. In our case, $T$ was set to 1 h.
In this work, a set of 15 features is calculated, of which the first 12 represent normalized values of every selected weather feature (see Table 2). Let $W_i$ be a weather feature vector of dimension 12 corresponding to the instant $t_i$, and $W_{i,j}$ the $j$th component of $W_i$. The normalization method used here is based on unit scaling, with scaling bounds defined by the lower and upper quantiles at the 1/1000 level (see Equation (1)). This level is established while considering the remaining outliers which need to be eliminated.
$$x_{i,j} = \frac{W_{i,j} - a_j}{b_j - a_j}$$
where $a_j$ and $b_j$ are the quantiles of $W_{i,j}$ over all $i$ at levels of 0.001 and 0.999, respectively, and $j \in \{1, 2, \dots, 12\}$.
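The quantile-based unit scaling of Equation (1) can be sketched as follows; this is a minimal illustration (the function name is hypothetical), using `nanquantile` so that NaN gaps in the weather record do not distort the bounds:

```python
import numpy as np

def quantile_unit_scale(W: np.ndarray) -> np.ndarray:
    """Unit scaling of each weather feature (column), per Equation (1):
    x = (W - a) / (b - a), with a and b the 0.001 and 0.999 quantiles."""
    a = np.nanquantile(W, 0.001, axis=0)  # lower bound a_j
    b = np.nanquantile(W, 0.999, axis=0)  # upper bound b_j
    return (W - a) / (b - a)

# Demonstration on three mock weather features
rng = np.random.default_rng(0)
W = rng.normal(10.0, 2.0, size=(5000, 3))
X = quantile_unit_scale(W)
```

By construction, roughly 0.2% of the values per column fall outside [0, 1]; these correspond to the remaining outliers mentioned in the text.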
The remaining three features represent normalized versions of time-related variables. The “day cursor” is a normalized representation of the time of day (see Equation (2)), scaled between 0 and 1 (e.g., 3:00 p.m. becomes 0.625). The “year cursor” (see Equation (3)) is a normalized representation of the day of the year, also scaled between 0 and 1 (e.g., 4 February becomes 0.096). Lastly, the “previous day irradiance” feature (see Equation (4)) directly uses the solar irradiance measured 24 h prior. The choice of a 24 h lag was motivated by the significant daily cycle observed in the solar irradiance data, as indicated by a prominent peak in the frequency spectrum (see Figure 3).
$$x_{i,13} = \frac{t_i \bmod 1\ \text{day}}{1\ \text{day}}$$
$$x_{i,14} = \frac{t_i \bmod 1\ \text{year}}{1\ \text{year}}$$
$$x_{i,15} = y_{i-24}$$
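The two time cursors can be computed as in the following sketch, which reproduces the worked examples from the text (3:00 p.m. gives 0.625; 4 February gives approximately 0.096). Counting whole days over a 365-day year for the year cursor is an assumption on our part, chosen to match the quoted example:

```python
from datetime import datetime

def day_cursor(t: datetime) -> float:
    """Normalized time of day in [0, 1) (Equation (2))."""
    return (t.hour * 3600 + t.minute * 60 + t.second) / 86400.0

def year_cursor(t: datetime) -> float:
    """Normalized day of year in [0, 1) (Equation (3)), discretized
    to whole days over a 365-day year (an illustrative assumption)."""
    return t.timetuple().tm_yday / 365.0

t = datetime(2017, 2, 4, 15, 0)  # 3:00 p.m. on 4 February
dc = day_cursor(t)   # 0.625, matching the example in the text
yc = year_cursor(t)  # ~0.096, matching the example in the text
```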
The objective of the forecasting task is to determine the value of solar irradiance $y_i$ based on the feature vector $x_i$ available at time $t_i$. This explains the selection criteria for the weather features (see Table 2), which rely on straightforward predictability. For the proposed model to achieve accuracy, $x_i$ must consist of predictable variables derived from validated climatological models. It is also important to highlight that the choice of the predictive model affects the performance of the proposed forecasting approach, emphasizing the need for further research to identify the most suitable model. This issue, however, falls outside the scope of the present work. In the following, let $t_p$ represent the present time and $t_k$ ($k > p$) denote the prediction time. The objective is to estimate $y_k$ at $t_k$ given $x_k$, which is derived from predicted meteorological data and climatological models, as well as $x_i$ and $y_i$ for all $p - \ell < i \le p$. Here, $\ell$ corresponds to the length of the training period divided by the sampling interval $T$, representing the number of samples in the training dataset. This study primarily focuses on forecasting solar irradiance 1 day ahead and 1 h ahead. However, the full potential of the proposed method is explored in Section 3.4.

2.3. Building Basic Probability Assignment

This section presents a proposed technique for creating basic probability assignments (BPAs) from the training dataset. For a comprehensive understanding of belief function theory, readers are referred to [20,21,37]. Since the objective was to estimate the value of $y_i$ based on the feature vector $x_i$, the problem could be split into separate estimations of $y_i$ based on each component $x_{i,j}$, but this would amount to a substantially unjustified independence assumption between features, which this study reveals to have a significant detrimental impact on prediction performance. Therefore, the proposed method takes a different approach. Due to the high regularity of the signal to be predicted, which stems from its connection to physical and astronomical factors, the day cursor feature (indexed 13) is particularly crucial and needs to be paired with every other feature. Therefore, there are 14 distinct subproblems, each based on one feature combined with the day cursor. For each component except the day cursor, a single BPA was used as the basis for the estimation. Therefore, the subproblem indexed by $j$ ($1 \le j \le 15$, $j \ne 13$) was defined to estimate $y_k$ with the only inputs being $x_{k,13}$ and $x_{k,j}$ as the testing instance and $x_{i,13}$, $x_{i,j}$, and $y_i$ for all $p - \ell < i \le p$ as the training data.
In order to solve this subproblem, the first step was to keep only pertinent data from the available training dataset for analysis. To accomplish this, a selection criterion was derived using the distance between features in the training dataset (i.e., $x_{i,j}$ and $x_{i,13}$ for $p - \ell < i \le p$) and the ones provided at the time of prediction (i.e., $x_{k,j}$ and $x_{k,13}$). This criterion combines the contribution of the day cursor feature with that of the feature under consideration. Based on the Euclidean distance metric (see Equation (5)), a selection of $y_i$ values was retained for the next step by comparing $d_{i,j,k}$ with a tuning parameter $\alpha$ (see Equation (6)). The objective of this distance function is to take into account only irradiance values measured around the same time of day. Using this distance metric, only relevant information was selected from the learning database, specifically for the purpose of feeding the mass synthesis approach proposed in this research. The impact of the parameter $\alpha$ is investigated later in a dedicated section (see Section 3.1.1):
$$d_{i,j,k} = \sqrt{\left(x_{i,13} - x_{k,13}\right)^2 + \left(x_{i,j} - x_{k,j}\right)^2}$$
$$S_{j,k} = \left\{\, y_i \mid d_{i,j,k} < \alpha \,\right\} \quad \text{with } j \ne 13$$
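The neighbor-selection step of Equations (5) and (6) amounts to a masked distance computation, as in the sketch below. The array names are hypothetical, and discarding NaN irradiance targets (database gaps) is an added practical detail, not stated in the equations:

```python
import numpy as np

def select_neighbors(x_k13, x_kj, X13, Xj, y, alpha):
    """Keep irradiance values whose (day-cursor, feature-j) pair lies within
    Euclidean distance alpha of the prediction-time pair (Equations (5)-(6))."""
    d = np.sqrt((X13 - x_k13) ** 2 + (Xj - x_kj) ** 2)
    mask = (d < alpha) & ~np.isnan(y)  # additionally ignore NaN gaps
    return y[mask]

# Toy training history: day cursor, one weather feature, irradiance
X13 = np.array([0.60, 0.62, 0.63, 0.10, 0.61])
Xj  = np.array([0.50, 0.52, 0.49, 0.50, 0.90])
y   = np.array([300., 320., 310., 20., 500.])

# Predict at day cursor 0.625 with feature value 0.50
S = select_neighbors(0.625, 0.50, X13, Xj, y, alpha=0.05)
```

Only the three samples recorded near the same time of day with a similar feature value survive the selection; the morning sample and the feature outlier are rejected.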
Second, a probability distribution function was fitted to this subset of solar irradiance values (i.e., $S_{j,k}$) using the kernel density estimation (KDE) method with a Gaussian kernel (see Equation (7)). KDE is a non-parametric method for estimating a random variable’s probability density function (PDF) which is widely used in the field of renewable energy time series prediction [38,39]. Thanks to its flexibility, this method was suitable for the application under investigation:
$$f_{j,k}(y) = \frac{1}{n\,\sigma_{j,k}\sqrt{2\pi}} \sum_{y_i \in S_{j,k}} e^{-\frac{1}{2}\left(\frac{y - y_i}{\sigma_{j,k}}\right)^2}$$
with $n$ being the number of elements in $S_{j,k}$ and $\sigma_{j,k}$ being a smoothing parameter, also known as the “bandwidth”, which is defined in this work by a rule-of-thumb formula (see Equation (8)) [40]:
$$\sigma_{j,k} = \left(\frac{4}{3}\right)^{1/5} n^{-1/5} \left(n-1\right)^{-1/2} \left( \sum_{y_i \in S_{j,k}} y_i^2 - \frac{1}{n} \Big( \sum_{y_i \in S_{j,k}} y_i \Big)^2 \right)^{1/2}$$
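Equations (7) and (8) can be implemented directly with NumPy. The following sketch (function names are illustrative) evaluates the Gaussian-kernel estimate on a grid over the universe [0, 1000] W/m2:

```python
import numpy as np

def bandwidth(S: np.ndarray) -> float:
    """Rule-of-thumb bandwidth of Equation (8):
    (4/3)^(1/5) * n^(-1/5) * sample standard deviation."""
    n = len(S)
    var = (np.sum(S**2) - np.sum(S)**2 / n) / (n - 1)  # unbiased variance
    return (4.0 / 3.0) ** 0.2 * n ** -0.2 * np.sqrt(var)

def kde(y_grid: np.ndarray, S: np.ndarray) -> np.ndarray:
    """Gaussian-kernel density estimate of Equation (7) on y_grid."""
    sigma = bandwidth(S)
    n = len(S)
    z = (y_grid[:, None] - S[None, :]) / sigma
    return np.exp(-0.5 * z**2).sum(axis=1) / (n * sigma * np.sqrt(2.0 * np.pi))

# Demonstration with a small neighbor set of irradiance values
S = np.array([300.0, 320.0, 310.0, 305.0, 295.0])
grid = np.linspace(0.0, 1000.0, 2001)  # BPA universe [0, 1000] W/m^2
f = kde(grid, S)
area = f.sum() * (grid[1] - grid[0])   # density should integrate to ~1
```

The resulting density peaks near the selected irradiance values and integrates to approximately one over the universe, as expected for a PDF.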
Next, the PDF curve was horizontally sliced to produce a series of areas, with the surface of each representing a probability value. A tuning parameter designated as $N$ predetermined the total number of slices. The impact of this parameter is further covered in Section 3.1.4. Each slice was assigned a set of values and a mass value (the structure of a BPA), as described by Equations (9) and (10). The mechanism of the BPA building step is shown in Figure 4:
$$I_{j,k}(p) = \left\{\, y \mid f_{j,k}(y) > p\,\Delta_{j,k} \,\right\}, \quad p \in \{1, \dots, N-1\}$$
with $\Delta_{j,k} = \frac{1}{N}\max_{y \in I_{j,k}(0)} f_{j,k}(y)$ and $I_{j,k}(0)$ representing the BPA universe set, which was defined in our case as the interval [0, 1000] (W/m2):
$$m_{j,k}\left(I_{j,k}(p)\right) = \int_{I_{j,k}(p)} \min\left(g_{j,k}^{(p)}(y),\, \Delta_{j,k}\right) dy$$
with $g_{j,k}^{(p)}(y) = \max\left(f_{j,k}(y) - p\,\Delta_{j,k},\, 0\right)$.
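The horizontal slicing of Equations (9) and (10) can be sketched as follows on a discretized density. Assigning the residual mass to the universe set $I_{j,k}(0)$, so that all masses sum to one, is our reading of the construction:

```python
import numpy as np

def pdf_to_bpa(grid, f, N):
    """Slice a density f (evaluated on grid) into N horizontal levels.
    Returns a list of (interval, mass) pairs; the universe set I(0) is
    assumed to receive the residual mass so the masses sum to one."""
    dy = grid[1] - grid[0]
    delta = f.max() / N                      # slice height Delta
    bpa, total = [], 0.0
    for p in range(1, N):
        sel = f > p * delta                  # focal set I(p), Equation (9)
        if not sel.any():
            continue
        interval = (grid[sel].min(), grid[sel].max())
        g = np.maximum(f - p * delta, 0.0)   # g^{(p)}(y)
        mass = np.minimum(g[sel], delta).sum() * dy  # Equation (10)
        bpa.append((interval, mass))
        total += mass
    bpa.append(((grid[0], grid[-1]), 1.0 - total))   # universe I(0)
    return bpa

# Demo with a Gaussian density on the universe [0, 1000] W/m^2
grid = np.linspace(0.0, 1000.0, 4001)
f = np.exp(-0.5 * ((grid - 300.0) / 50.0) ** 2) / (50.0 * np.sqrt(2 * np.pi))
bpa = pdf_to_bpa(grid, f, N=3)
masses = [m for _, m in bpa]
```

Each slice yields a focal interval nested inside the one below it, with higher slices carrying sharper (narrower) intervals, exactly as in Figure 4.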
Example 1. 
In the example shown in Figure 5, the PDF was fitted using solar irradiance values (i.e., $y_i$) selected because their associated relative humidity values (i.e., $x_{i,3}$) were similar to the actual one at 4:00 p.m. on 8 August 2017. The probability density curve was then divided into three sections, with the base of each section serving as one of the BPA's focal elements and its surface area giving the corresponding mass value. The BPA was then defined as follows:
$$m([220, 375]) = 0.14, \qquad m([81.4, 399]) = 0.345, \qquad m([0, 1000]) = 0.515$$

2.4. Complexity Simplification

In the previous section, when generating a BPA for each feature (aside from the day cursor), the number of focal elements was predefined by the tuning parameter N. The subsequent combination of evidence would therefore require iterating over up to $N^{14}$ mass products, and this exponential growth in computational complexity makes the approach impractical. For this reason, before moving on to the combination step introduced later in this section (see Section 2.6), it is convenient to simplify each BPA whenever possible. The simplification method identifies any pair of focal elements similar enough to be treated as a single focal element; the two sets are then merged by taking their union, and their masses are added together. A set similarity metric (see Equation (11)) was used to measure the similarity between sets, and the measurement was compared with the tuning parameter $\beta$. The effect of the parameter $\beta$ on the overall method performance is examined later in Section 3.1.2. The process is described in greater detail in Algorithm 1; the objective is to merge intervals which are similar, as keeping them separate only provides a minimal amount of information that does not justify the increase in complexity:
$$S\big(I_{j,k}^{(p_1)}, I_{j,k}^{(p_2)}\big) = \Big|\,\overline{I_{j,k}^{(p_1)} \,\Delta\, I_{j,k}^{(p_2)}}\,\Big|, \quad p_1, p_2 \in \{0, 1, \ldots, N-1\} \tag{11}$$
where $|\cdot|$ is the normalized length of a set and $\Delta$ is the symmetric set difference operator. The normalization was performed over the BPA universe set $[0, 1000]$ (W/m²).
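On a discretized universe, the similarity of Equation (11) reduces to the fraction of grid points on which the two sets agree. A minimal sketch (our own encoding, with sets as boolean masks over a uniform grid):

```python
import numpy as np

def set_similarity(a, b):
    """Eq. (11): normalised length of the complement of the symmetric
    difference of two focal sets, encoded as boolean masks over a
    uniform grid covering the universe [0, 1000]."""
    # a == b is the complement of the symmetric difference (XOR) of the masks;
    # its mean is the normalised measure of that complement on a uniform grid.
    return float(np.mean(a == b))
```

Identical sets score 1, and two sets that partition the universe score 0.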
Algorithm 1 Complexity simplification

procedure Simplify(BPA, β)
    n ← |BPA.Sets|
    for i ← 1 to n − 1 do
        if BPA.Masses_i ≠ −1 then
            j ← i + 1
            while j ≤ n do
                if BPA.Masses_j ≠ −1 and i ≠ j then
                    s ← S(BPA.Sets_i, BPA.Sets_j)    ▹ Equation (11)
                    if s > β then
                        BPA.Sets_i ← BPA.Sets_i ∪ BPA.Sets_j
                        BPA.Masses_i ← BPA.Masses_i + BPA.Masses_j
                        BPA.Sets_j ← ∅
                        BPA.Masses_j ← −1    ▹ mark entry j as merged
                        j ← 1
                    else
                        j ← j + 1
                    end if
                else
                    j ← j + 1
                end if
            end while
        end if
    end for
    for i ← 1 to n do
        if BPA.Masses_i = −1 then
            delete BPA.Sets_i and BPA.Masses_i
        end if
    end for
    return BPA
end procedure

2.5. Discounting

Discounting is a heavily used operation in the BFT framework. In general, and especially when dealing with a large number of BPAs, it reduces the confidence placed in each BPA and noticeably improves the quality of the combination. Typically, all BPA masses are reduced by the same ratio (i.e., the discounting factor), and the lost mass is transferred to the ignorance set (i.e., the universe set). In this study, a novel discounting strategy is brought forth in which the discounting factor varies with the size of each set (see Equation (12)). The idea behind this discounting method is that small sets need to be discounted more than larger ones. This choice is motivated by the fact that small sets contribute more to the combination process and can destabilize the result if too much confidence is placed in them. Furthermore, the discounting factor must be higher when the BPA mass calculation employs more data (i.e., when the selection phase retains more data). This was addressed by using a parametric relation, defined in Equation (13):
$$m_{j,k}^{*}\big(I_{j,k}^{(p)}\big) = (1 - \zeta_{j,k})^{\,e^{3\left(1 - \left|I_{j,k}^{(p)}\right|\right)}}\, m_{j,k}\big(I_{j,k}^{(p)}\big) \tag{12}$$
where $|\cdot|$ is the normalized length of the set and $\zeta_{j,k}$ is the discounting factor [41]. In addition, we have
$$\zeta_{j,k} = 1 - \left(1 - \frac{|S_{j,k}|}{\ell}\right)^{\gamma} \tag{13}$$
where $|\cdot|$ is the number of elements in a set, $\ell$ is the training dataset length, and $\gamma$ is the discounting power coefficient, a tuning parameter of the proposed method. This parameter's effect on the efficiency of the suggested method is investigated later in Section 3.1.3.
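The discounting step can be sketched as follows, under the assumption that Equations (12) and (13) take the reconstructed forms above (the exact exponent form is our reading of the original) and that the first focal element of the BPA is the universe set, which absorbs the lost mass:

```python
import numpy as np

def discount_bpa(sets, masses, n_selected, n_train, gamma):
    """Size-dependent discounting, as reconstructed from Eqs. (12) and (13):
    zeta = 1 - (1 - |S|/l)^gamma, and each focal mass is scaled by
    (1 - zeta)^exp(3 (1 - |I|)), so small sets lose more mass.
    sets[0] is assumed to be the universe set."""
    zeta = 1.0 - (1.0 - n_selected / n_train) ** gamma
    new_masses = []
    for s, m in zip(sets, masses):
        size = float(np.mean(s))                    # normalised set length |I|
        factor = (1.0 - zeta) ** np.exp(3.0 * (1.0 - size))
        new_masses.append(factor * m)
    # the ignorance (universe) set absorbs whatever mass was removed
    new_masses[0] += sum(masses) - sum(new_masses)
    return new_masses
```

With this form, a set covering the whole universe is discounted by the base factor only, while a narrow set is discounted far more aggressively.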

2.6. Combination

After determining all of the mass functions resulting from the 14 distinct subproblems, we intended to combine them so as to make use of the knowledge brought by each of them individually. Different combination methods could have been used, depending on the reliability of each source of information. In our instance, all of the information sources were considered trustworthy, but they were numerous. The Yager combination method [42,43,44,45,46] was utilized to prevent the fading of the ignorance set mass across successive mass function combinations. Algorithm 2 presents the pseudo-code of the combination method. It is important to keep in mind that the complexity simplification procedure was executed between each combination in order to stop exponential growth in the number of components of the final mass variables.
Algorithm 2 Combination

procedure Combine(BPA1, BPA2)
    n1 ← |BPA1.Sets|
    n2 ← |BPA2.Sets|
    BPA.Sets ← empty list
    BPA.Masses ← empty list
    k ← 1
    for i ← 1 to n1 do
        for j ← 1 to n2 do
            BPA.Sets_k ← BPA1.Sets_i ∩ BPA2.Sets_j
            BPA.Masses_k ← BPA1.Masses_i × BPA2.Masses_j
            if BPA.Sets_k = ∅ then
                BPA.Sets_k ← [0, 1000]    ▹ Yager's rule: conflict mass goes to the universe set
            end if
            k ← k + 1
        end for
    end for
    return BPA
end procedure

procedure CombineAll(BPAs)
    n ← |BPAs|
    BPA.Sets ← {[0, 1000]}
    BPA.Masses ← {1}
    for i ← 1 to n do
        BPA ← Combine(BPA, BPAs_i)
        BPA ← Simplify(BPA, β)    ▹ Algorithm 1
    end for
    return BPA
end procedure
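The pairwise Yager combination can be sketched on the same grid-mask encoding used above. This is our own illustrative implementation, with each BPA given as a list of (mask, mass) pairs; the defining property of Yager's rule, assigning the conflict mass to the universe rather than renormalizing, is preserved:

```python
import numpy as np

def yager_combine(bpa1, bpa2):
    """Yager's combination rule on grid-mask BPAs. The mass of every
    empty intersection (the conflict) is added to the universe set
    instead of being renormalised away, as in Dempster's rule."""
    universe = np.ones_like(bpa1[0][0], dtype=bool)
    out = [(universe, 0.0)]               # slot 0 accumulates ignorance
    for s1, m1 in bpa1:
        for s2, m2 in bpa2:
            inter = s1 & s2
            if inter.any():
                out.append((inter, m1 * m2))
            else:
                out[0] = (universe, out[0][1] + m1 * m2)  # conflict -> ignorance
    return out
```

The total mass is conserved: two normalized inputs always yield a normalized output, with the conflict visible as universe mass.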

2.7. Pignistic Transformation

The combination of several mass functions resulted in a more detailed source of information, which was nevertheless still represented as a mass function. A transformation of this mass function was therefore required in order to transfer the information it carries into a probabilistic representation. The pignistic transformation distributes the mass assigned to a set of events equally across the individual components of that set. In the context of our research, this was carried out in a continuous form, as demonstrated by Equation (14):
$$f_k(y) = \sum_{p \,:\, y \in I_k^{(p)}} \frac{m_k\big(I_k^{(p)}\big)}{\big|I_k^{(p)}\big|} \tag{14}$$
where $m_k$ is the final mass function obtained from the combination of all $m^*_{j,k}$ ($1 \le j \le 15$, $j \ne 13$), $I_k^{(p)}$ denotes the focal elements of $m_k$, $|\cdot|$ is the normalized set length, and $f_k$ is the PDF estimated for the predicted solar irradiance value at time $t_k$ [47].
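In the grid-mask encoding, the continuous pignistic transformation of Equation (14) amounts to spreading each focal mass uniformly over its own length. A minimal sketch (our own, assuming a uniform grid):

```python
import numpy as np

def pignistic_pdf(bpa, grid):
    """Continuous pignistic transformation (Eq. (14)): each focal element
    spreads its mass uniformly over its own length, and contributions are
    summed pointwise to yield a piecewise-constant PDF."""
    dy = grid[1] - grid[0]                # uniform grid spacing (assumption)
    pdf = np.zeros_like(grid, dtype=float)
    for mask, mass in bpa:
        length = mask.sum() * dy          # length of the focal set
        if length > 0:
            pdf[mask] += mass / length    # uniform density over the set
    return pdf
```

The resulting PDF integrates to the total mass of the BPA, so a normalized BPA yields a proper density.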

2.8. Decision Making

Given the PDF derived in the previous step, the decision making phase entailed picking a solar irradiance value to use as the prediction (i.e., $\hat{y}_k$). This is often accomplished by selecting the value which maximizes the probability density $f_k(\cdot)$. In our case, the decision was based on a customized calculation (see Equation (16)), which determines the expected value using only the values whose probability density exceeds a predetermined threshold, fixed here to 90% of the maximal probability density in $f_k(\cdot)$ (see Equation (15)):
$$E_k = \Big\{\, y \;\Big|\; f_k(y) \geq 0.9 \max_{y'} f_k(y') \,\Big\} \tag{15}$$
$$\hat{y}_k = \frac{\int_{E_k} y\, f_k(y)\, dy}{\int_{E_k} f_k(y)\, dy} \tag{16}$$
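On a uniform grid, the grid spacing cancels out of the ratio of integrals in Equation (16), so the decision rule reduces to a density-weighted mean over the high-density region. A sketch under that assumption (and assuming $E_k$ forms one contiguous region, as for a unimodal density):

```python
import numpy as np

def decide(grid, pdf, threshold=0.9):
    """Eqs. (15) and (16): restrict to the region E_k where the density is
    at least 90% of its peak, then return the density-weighted mean of the
    candidate irradiance values over E_k."""
    high = pdf >= threshold * pdf.max()   # E_k as a boolean mask
    return float(np.sum(grid[high] * pdf[high]) / np.sum(pdf[high]))
```

For a symmetric unimodal density the rule recovers the mode, while for a skewed one it shifts the prediction toward the heavier side of the peak.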

2.9. Summary

A series of procedures was followed in order to produce a forecast of upcoming solar irradiance values with the proposed technique. In the first step, the problem was subdivided into 14 smaller problems, each of which was assigned a single feature in addition to the day cursor. Each individual subproblem resulted in a distinct BPA. When building a BPA, the dedicated feature values in the training dataset were compared to the one supplied at the time of prediction in order to select relevant solar irradiance values. From these data, a PDF was produced using the KDE approach. The PDF was then sliced horizontally, with each of the produced slices contributing one component of the BPA: the base of each slice was taken as a BPA set, while its surface area provided the corresponding mass value. The BPA was then simplified to avoid overly complex computations. The resulting mass function was discounted by a designated factor, and the Yager combination technique was applied to combine it with the previously produced BPAs. The final outcome was a new mass function, more informative than the initial ones. The pignistic transformation was used to convert this mass function into a PDF, from which a prediction value was calculated using a selective weighted average. Table 3 provides a comprehensive display of all the symbols employed in the presentation of the proposed method, while Figure 6 and Figure 7 illustrate the complete workflow, offering a visual representation of the entire process. The tuning parameters are listed in Table 4.
Table 3. List of symbols.

$T$: Sampling period
$i$: Index of a measurement in the dataset
$I$: Number of measurements in the dataset
$t_i$: $i$th measurement time in the dataset
$y_i$: $i$th measured irradiance value in the dataset
$x_i$: Feature vector calculated at time $t_i$
$j$: Index of a feature or weather component
$W_i$: Weather features vector at time $t_i$
$W_{i,j}$: Weather feature $j$ at time $t_i$
$x_{i,j}$: Feature $j$ at time $t_i$
$t_p$: Present time ($0 \le p \le I$)
$t_k$: Prediction time ($p < k \le I$)
$\ell$: Number of elements in the training dataset
$d_{i,j,k}$: Distance metric based on features $j$ and 13
$S_{j,k}$: Subset of data used for subproblem $j$
$f_{j,k}(\cdot)$: Fitted probability density function from $S_{j,k}$
$\sigma_{j,k}$: Bandwidth for the fit process of $S_{j,k}$
$I_{j,k}^{(p)}$: BPA sets from subproblem $j$ at $t_k$ ($0 \le p \le N-1$)
$m_{j,k}(\cdot)$: Mass function obtained from subproblem $j$ at $t_k$
$S(\cdot,\cdot)$: Similarity function between two sets
$m_{j,k}^{*}(\cdot)$: Discounted mass obtained from subproblem $j$ at $t_k$
$\zeta_{j,k}$: Discounting factor of subproblem $j$ at $t_k$
$m_k(\cdot)$: Combination of all $m_{j,k}^{*}(\cdot)$ mass functions
$f_k(\cdot)$: Probability density function calculated from $m_k(\cdot)$
$I_k^{(p)}$: Focal elements of $m_k(\cdot)$
Table 4. Proposed method's tuning parameters.

$\alpha$: Threshold distance between features; domain $[0, 1]$
$\beta$: Threshold Hamming distance between sets; domain $[0, 1]$
$\gamma$: Discounting power coefficient; domain $[0, +\infty)$
$N$: Number of focal elements in a single BPA; domain $\mathbb{N}$

3. Results and Discussion

Several analyses addressing various aspects of the proposed prediction approach are presented in this section. First, the parameters of the method were examined to determine their impact on the overall performance. Subsequently, the key concept of aggregating multiple predictors was emphasized by comparing the accuracies of the individual predictors, each based on distinct sources of information, with the predictor obtained by combining them. The overall performance of the method was then compared to several recent state-of-the-art techniques, validating its effectiveness. Finally, the method was tested to its limits to showcase its full predictive potential. Simulations were performed on an Intel(R) Core(TM) i7-8550U CPU @ 1.80 GHz using MATLAB R2021a.
The evaluation of the prediction performance relied on various metrics, such as the absolute mean error (AME), mean percentage error (MPE), mean bias error (MBE), and root mean square error (RMSE). Among these, the RMSE is the most commonly used metric in recent performance analyses, making it ideal for comparative studies. Consequently, the RMSE (see Equation (17)) was selected as the primary performance measure in this study [48]:
$$\text{RMSE} = \sqrt{\frac{1}{h} \sum_{i=p}^{p+h-1} \big(y_i - \hat{y}_i\big)^2} \tag{17}$$
with t p referring to the present time, h being the prediction horizon, y i being the real irradiance at time t i , and y ^ i being the predicted irradiance. The other performance metrics can be formulated as follows:
$$\text{AME} = \frac{1}{h} \sum_{i=p}^{p+h-1} \big|y_i - \hat{y}_i\big| \tag{18}$$
$$\text{MPE} = \frac{1}{h} \sum_{i=p}^{p+h-1} \frac{\big|y_i - \hat{y}_i\big|}{y_i} \tag{19}$$
$$\text{MBE} = \frac{1}{h} \sum_{i=p}^{p+h-1} \big(y_i - \hat{y}_i\big) \tag{20}$$
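The four metrics of Equations (17) to (20) can be computed directly from the measured and predicted series over the horizon; a minimal sketch (function name and dictionary output are ours):

```python
import numpy as np

def forecast_metrics(y_true, y_pred):
    """RMSE, AME, MPE and MBE over a forecast horizon (Eqs. (17)-(20))."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    err = y_true - y_pred
    return {
        "RMSE": float(np.sqrt(np.mean(err ** 2))),   # Eq. (17)
        "AME": float(np.mean(np.abs(err))),          # Eq. (18)
        "MPE": float(np.mean(np.abs(err) / y_true)), # Eq. (19), y_i != 0
        "MBE": float(np.mean(err)),                  # Eq. (20)
    }
```

Note that the MPE is undefined when a true irradiance value is zero (e.g., at night), so in practice it is evaluated over daytime samples only.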

3.1. Impact of Parameters

The parameters of the proposed method are listed in Table 4. To assess the impact of each parameter individually, each one was varied within a suitably designed range, while the remaining parameters were kept constant based on a benchmark set of values. The objective of these simulations was to provide deeper insight into the parameter calibration process of the proposed method and to justify the selection of the benchmark values. The results from these simulations would serve as a basis for parameter calibration in the subsequent tests presented in this section. For each tuning parameter, solar irradiance forecasting simulations using the proposed methodology were performed over seven consecutive random days in August 2017. The forecast horizon was set to one day ahead, and the training set consisted of data from the preceding 5 years. For each of the seven simulations, the RMSE was calculated, and the average RMSE was evaluated across different parameter values.

3.1.1. Threshold Distance Between Features ( α )

The threshold distance between features, denoted by α , regulates the number of data points selected from the training set to construct the BPA. Increasing α resulted in the selection of more data points, while decreasing α led to fewer selected points. If α is too small, then the model may choose an insufficient number of data points, causing overfitting and reducing the generalization capability. Conversely, if α is set too high, excessive data, including irrelevant points, may be included, diminishing the precision of the regression. Therefore, an appropriate balance for α was essential.
Using the simulation set-up detailed in Section 3.1, we evaluated the RMSE for different values of α , as depicted in Figure 8.
The plot illustrates that for α = 0 , corresponding to no selected data points and leading to a BPA of full ignorance, the prediction error was considerably higher than for positive values of α . As α increased beyond 0.1, the RMSE began to rise, suggesting that too much irrelevant data were being considered. From a methodological perspective, the optimal value of α minimizes the RMSE, and in this case, the best performance was achieved with α * = 0.025 .

3.1.2. Threshold Hamming Distance Between Sets ( β )

In the complexity reduction process (see Section 2.4), a set distance metric was introduced to quantify the similarity between two sets within a BPA. This metric was compared against a threshold β , which determined whether these sets should be aggregated into a single set within the BPA. The selection of the β threshold is crucial because it directly affects the computational complexity of the proposed method. If β is set too high, then no sets are considered similar, making the simplification ineffective and leaving the problem at its original complexity. Conversely, if β is too low, then an excessive number of sets are aggregated, leading to a loss of information within the BPA, which increases ignorance and diminishes the performance of the method. As such, optimizing β requires balancing two competing objectives: minimizing computational complexity while maintaining the performance of the proposed technique.
In Figure 9, the RMSE and execution time are evaluated across various values of β . The purpose of this measurement is to assess the impact of different β values on the performance and computational efficiency of the proposed method.
The results show that the RMSE decreased as β increased up to 0.975 , where it reached its minimum, but rose for β > 0.975 , indicating that too much aggregation began to affect the performance. Additionally, the execution time grew exponentially with an increasing β . Since the trade-off between prediction accuracy and execution time is dependent on the application, determining an objective optimal β may not be feasible. To address this, we plotted the Pareto front in Figure 10, which highlights the bi-objective optimization problem.
In this analysis, the threshold $\beta^* = 0.975$ was chosen as optimal, since the corresponding execution time of 4.083 s is negligible given the frequency at which forecasts are issued.

3.1.3. Discounting Power Coefficient ( γ )

In the proposed method outlined in this paper, the BPA undergoes a discounting phase (see Section 2.5) where additional ignorance is introduced to moderate its influence in the combination process. This adjustment is controlled by a discounting coefficient, denoted as γ . If γ is set too low, then there is a risk of overreliance on potentially inaccurate outputs. Conversely, excessively high γ values lead to an undesirable loss of critical information. Thus, tuning γ to find an optimal balance is essential.
Figure 11 illustrates the variation in the average RMSE as a function of different γ values. The high RMSE at γ = 0 emphasizes the necessity of the discounting step. As observed, the RMSE generally decreased as γ increased, though with greater instability compared with other parameters. The optimal γ value was identified to be γ * = 90 , as this corresponded to the minimum RMSE value, highlighting its importance in achieving better prediction accuracy.

3.1.4. Number of Focal Elements in Single BPA (N)

In the proposed method, the number of focal elements, denoted by N, is predefined during the BPA construction phase (cf. Section 2.3). This parameter plays a key role in balancing the BPA’s expression of ignorance and its conformity to the underlying probability density function. Additionally, N impacts computational complexity, requiring bi-objective optimization analysis to weigh the trade-offs between performance and computational efficiency. When N was set to larger values, the BPA became more dependent on the knowledge encapsulated in the fitted probability density function, thereby reducing its level of ignorance. However, excessive adherence to the probability density function is undesirable, as it may not perfectly capture the underlying distribution. Furthermore, increasing N led to a rise in computational complexity. Conversely, smaller N values injected more ignorance into the BPA, which diminished the efficient utilization of available information.
The average RMSE across different values of N is presented in Figure 12a. It can be observed that for N 4 , the RMSE increased, which indicates overreliance on the data source during BPA computation. This undermines the effectiveness of using belief function theory (BFT) to represent ignorance. The execution time for a one-day-ahead prediction with various N values is shown in Figure 12b, demonstrating an exponential increase in the computation time as N increased. This resulted from the higher complexity of combining BPAs with more focal elements.
Given the objectives of minimizing the error and computational time, the values N = 3 and N = 4 were optimal for the Pareto front. The difference in execution time between these two values was negligible, and the prediction process in the underlying application required frequent updates. Thus, N * = 4 was chosen as the optimal value.

3.1.5. Conclusions

In this study on parameter impacts, a pragmatic individual optimization approach was applied to each parameter. Through this method, simulations were conducted over seven consecutive days for a one-day-ahead prediction, providing a reliable indicator of the proposed method’s accuracy. Table 5 presents the local variation in the mean RMSE for parametric changes around the identified optimal values, thus offering insights into the sensitivity of the predictor’s accuracy with respect to each parameter. This analysis helped highlight which parameters exerted the most influence on the model’s performance and required more precise tuning.

3.2. Performance Analysis: Individual Versus Combined Predictors

In this section, the individual contributions of certain predictors are thoroughly examined to showcase the significance of combining the efforts of multiple forecasters. It is essential to recognize that each predictor is associated with a specific feature. The primary aim of this study was to illustrate the collaborative functionality of various predictors when they were combined, thereby emphasizing the synergistic effects resulting from their integration. For instance, the relative humidity-based predictor was anticipated to demonstrate reduced accuracy during the winter months, as precipitation exacerbates the decoupling between solar irradiance and humidity during this period. Conversely, the atmospheric pressure predictor was expected to perform better in winter due to the enhanced correlation it exhibits with solar irradiance under cloudy conditions.
To exemplify the seasonal behavior of these predictors, we present the performance of two distinct predictors: one based on the relative humidity and the other based on atmospheric pressure. This was accomplished by executing a one-day-ahead prediction simulation for a randomly selected day in summer, as illustrated in Figure 13a, and for another randomly selected day in winter, shown in Figure 13b.
The figures clearly demonstrate that the atmospheric pressure-based predictor exhibited greater accuracy during the wet and cold winter months compared with the dry and hot summer months, while the relative humidity predictor yielded the opposite results during these same periods. This performance discrepancy underscores the necessity for employing a diverse set of predictors. Moreover, Table 6 shows the performance of individual forecasting predictors based on specific meteorological criteria compared to their combinations. Notably, as shown in the last column of Table 6, the combined approach, which incorporates all features, consistently outperformed most individual predictors in terms of overall performance across the year. The simulations presented in Figure 13 are complemented by the prediction curve generated from the integration of all predictors, as depicted in Figure 14. It is worth noting that since this study focused on short-to-medium-term forecasting rather than ultra short-term forecasting, we did not assess the computational cost or benefit of using the full set of weather features versus a subset. However, the execution time of the proposed method, which integrates all features, was discussed in the previous section, demonstrating its suitability for short-to-medium-term forecasting, with execution times consistently remaining below one minute.

3.3. Comparison with Other Recent State-of-the-Art Methods

This section aims to evaluate the performance of the proposed approach by conducting a comparative analysis with recent studies addressing similar forecasting challenges. Given that the existing methods in the literature employ varying input parameters, training periods, and forecast horizons, we performed multiple simulations tailored to the specific parameters of each state-of-the-art method. This ensured a fair and equitable comparison across the different forecasting approaches. To obtain a robust evaluation metric, the performance of the proposed method was quantified using the mean of a set of RMSE values derived from eight distinct forecast days evenly distributed throughout the year at intervals of 45 days. This methodology was designed to yield performance metrics representative of each method’s effectiveness across different seasons, thereby mitigating the potential biases which could arise from season-dependent results. For methods which implement one-day-ahead predictions, as referenced in previous works [22,23,24,27,28,29], the comparative results are summarized in Table 7. The methods were organized according to their year of publication, and the input parameters and training periods were also outlined. It is essential to underscore that the proposed method was structured to learn exclusively from a designated set of meteorological parameters, contingent upon the specific input features utilized by each method. Furthermore, it is noteworthy that the proposed approach may omit certain weather-related features if the comparative method incorporates parameters not present in the utilized database [34], such as cloud cover. This flexibility allows the proposed method to adapt to varying data availability, potentially enhancing its applicability across different forecasting scenarios. Table 2 lists all available weather features in this study. It can be noticed that the proposed method performed better than all of the other listed state-of-the-art methods.
Several methodologies in the literature focus on forecasting solar irradiance for a one-hour-ahead horizon [25,26,27]. The proposed method was similarly configured to predict solar irradiance in one-hour intervals, ensuring that the input features and training duration were aligned with those of the established state-of-the-art methods.
To achieve a comprehensive assessment, the proposed method was rigorously evaluated across various hours of the day. This approach allowed for a nuanced understanding of its performance throughout different times, capturing the diurnal variability in solar irradiance. The results of this evaluation are presented in Table 8, which illustrates that the proposed method consistently outperformed all of the other methods analyzed, with the exception of the latest method in the comparison. Notably, this latter method showed only a marginal improvement of 5 W/m2 in prediction accuracy, and this enhancement is restricted solely to predictions made during midday. Additionally, the proposed method was subjected to comparison against other studies which employed varied input datasets, as outlined in Table 9. This comparative framework not only highlights the robustness of the proposed methodology but also contextualizes its performance within the broader body of research. To further facilitate external performance evaluations, Table 10 presents additional assessment metrics which provide insight into various dimensions of the prediction accuracy and reliability. This comprehensive approach underscores the significance of the proposed method in advancing the field of solar irradiance forecasting.

3.4. Full Potential of the Proposed Method

In this section, the proposed methodology is subjected to rigorous testing to assess its limits by extending the forecast horizon beyond practical constraints and reducing the availability of training data. The primary objective of this analysis was to gain a comprehensive understanding of the inherent capabilities of the proposed method. It was generally anticipated that performance would deteriorate as the forecast horizon was lengthened and the training duration was curtailed, but this section aims to provide a quantitative characterization of this relationship.
Figure 15a illustrates the RMSE as a function of the forecast horizon. Notably, the RMSE increased significantly with extended forecast horizons; specifically, it rose from 15.14 W/m2 in the 1 week forecast to 76.5 W/m2 in the 23 week forecast when trained on data spanning 1 year. Conversely, when the training period was reduced to just 4 days, the RMSE escalated from 84.21 W/m2 to 146.9 W/m2 over the same forecast horizon. Furthermore, Figure 15b depicts the RMSE plotted against the training duration using a reverse logarithmic scale. This figure reveals that transitioning from a training period of 1 year to just 2 days led to a substantial increase in the RMSE from 27.22 W/m2 for the 3 week forecast to 138.7 W/m2 and from 67.36 W/m2 to 151.1 W/m2 for the 4 month forecast.
To further elucidate the interactions between the forecast horizon and the training period, a color map representation is provided in Figure 16. This visual aid facilitates a deeper understanding of the simultaneous effects of varying both parameters on the overall performance of the proposed forecasting method.

4. Conclusions and Perspectives

This study presented an innovative evidential approach for predicting solar irradiance. The method was formulated in a BFT framework, and its major contributions are listed below:
  • Enhanced flexibility: The integration of belief functions with machine learning allows for better management of missing data, making the model more adaptable to real-world scenarios.
  • Improved data utilization: The method effectively utilizes available data even when some data points are missing, improving overall prediction accuracy.
  • Integration of multiple predictive models: By incorporating various predictive models, each addressing different meteorological factors, the approach provides a more comprehensive and accurate forecast.
  • Competitive accuracy: Performance evaluations show that the method achieved an average root mean square (RMS) error of 27.83 W/m2, outperforming the latest comparable method with an RMS error of 30.21 W/m2 [29].
  • Short-to-medium-term forecasting capability: The approach is specifically designed for short-to-medium-term solar irradiance forecasting, making it suitable for practical applications in energy management and forecasting.

Future Directions and Applications

  • Feature selection: Future directions should focus on a more detailed examination of the relationship between meteorological variables and solar radiation, potentially employing techniques such as principal component analysis to address multicollinearity and enhance feature selection for more accurate predictions.
  • Real-time testing and sensor faults: Future efforts will focus on real-time testing of the proposed model, with considerations for potential sensor faults.
  • Extension to other applications within the MG environment: The method’s versatility may be extended to other domains, such as wind energy production and electrical load forecasting.
  • Privacy in load forecasting: Although the datasets used are publicly available, applications like load forecasting may require decentralized learning approaches such as federated learning to safeguard data privacy. The parametric nature of the BFT framework supports its adaptability to decentralized and distributed applications, making it particularly suitable for privacy-sensitive environments.

Author Contributions

M.M. (Mohamed Mroueh): Conceptualization, Methodology, Software, Validation, and Writing—original draft. M.D.: Project administration, Conceptualization, Funding acquisition, Supervision, Writing—review and editing, and Validation. C.F.: Project administration, Supervision, Conceptualization, and Validation. M.M. (Mohamed Machmoum): Project administration, Supervision, Funding acquisition, and Validation. All authors have reviewed and approved the final version of the manuscript for publication.

Funding

This research work was funded by the Fonds Européen de Développement Régional (FEDER) and the Région Pays de la Loire, France, in the context of the DETECT Project (AAP RFI WISE). The APC was funded by Triskell Consulting.

Data Availability Statement

The original contributions presented in the study are included in the article. Further inquiries can be directed to the corresponding author.

Conflicts of Interest

The authors declare no conflicts of interest.

Abbreviations

AME     Absolute mean error
ANN     Artificial neural network
BFT     Belief function theory
BPA     Basic probability assignment
CNN     Convolutional neural network
DNN     Deep neural network
EMS     Energy management system
ESS     Energy storage system
GRU     Gated recurrent unit
KDE     Kernel density estimation
LSTM    Long short-term memory
MBE     Mean bias error
MG      Microgrid
MPE     Mean percentage error
PDF     Probability density function
PV      Photovoltaic
RMSE    Root mean square error
RNN     Recurrent neural network
SBFM    Similarity-based forecasting model
SWDR    Shortwave downward radiation

References

  1. Olabi, A.; Abdelkareem, M.A. Renewable energy and climate change. Renew. Sustain. Energy Rev. 2022, 158, 112111. [Google Scholar] [CrossRef]
  2. Hainsch, K.; Löffler, K.; Burandt, T.; Auer, H.; del Granado, P.C.; Pisciella, P.; Zwickl-Bernhard, S. Energy transition scenarios: What policies, societal attitudes, and technology developments will realize the EU Green Deal? Energy 2022, 239, 122067. [Google Scholar] [CrossRef]
  3. Ovaere, M.; Proost, S. Cost-effective reduction of fossil energy use in the European transport sector: An assessment of the Fit for 55 Package. Energy Policy 2022, 168, 113085. [Google Scholar] [CrossRef]
  4. Costa, E.; Wells, P.; Wang, L.; Costa, G. The electric vehicle and renewable energy: Changes in boundary conditions that enhance business model innovations. J. Clean. Prod. 2022, 333, 130034. [Google Scholar] [CrossRef]
  5. Venugopal, P.; Haes Alhelou, H.; Al-Hinai, A.; Siano, P. Analysis of Electric Vehicles with an Economic Perspective for the Future Electric Market. Future Internet 2022, 14, 172. [Google Scholar] [CrossRef]
  6. Dik, A.; Omer, S.; Boukhanouf, R. Electric Vehicles: V2G for Rapid, Safe, and Green EV Penetration. Energies 2022, 15, 803. [Google Scholar] [CrossRef]
  7. Chebotareva, G.; Tvaronavičienė, M.; Gorina, L.; Strielkowski, W.; Shiryaeva, J.; Petrenko, Y. Revealing Renewable Energy Perspectives via the Analysis of the Wholesale Electricity Market. Energies 2022, 15, 838. [Google Scholar] [CrossRef]
  8. Çelik, D.; Meral, M.E.; Waseem, M. The progress, impact analysis, challenges and new perceptions for electric power and energy sectors in the light of the COVID-19 pandemic. Sustain. Energy Grids Netw. 2022, 31, 100728. [Google Scholar] [CrossRef]
  9. Zandrazavi, S.F.; Guzman, C.P.; Pozos, A.T.; Quiros-Tortos, J.; Franco, J.F. Stochastic multi-objective optimal energy management of grid-connected unbalanced microgrids with renewable energy generation and plug-in electric vehicles. Energy 2022, 241, 122884. [Google Scholar] [CrossRef]
  10. Kanakadhurga, D.; Prabaharan, N. Demand side management in microgrid: A critical review of key issues and recent trends. Renew. Sustain. Energy Rev. 2022, 156, 111915. [Google Scholar] [CrossRef]
  11. Sandelic, M.; Peyghami, S.; Sangwongwanich, A.; Blaabjerg, F. Reliability aspects in microgrid design and planning: Status and power electronics-induced challenges. Renew. Sustain. Energy Rev. 2022, 159, 112127. [Google Scholar] [CrossRef]
  12. Barik, A.K.; Jaiswal, S.; Das, D.C. Recent trends and development in hybrid microgrid: A review on energy resource planning and control. Int. J. Sustain. Energy 2022, 41, 308–322. [Google Scholar] [CrossRef]
  13. Antonio Alcantra, I.M.G.; Aler, R. Direct estimation of prediction intervals for solar and wind regional energy forecasting with deep neural networks. Eng. Appl. Artif. Intell. 2022, 114, 105128. [Google Scholar] [CrossRef]
  14. Prema, V.; Bhaskar, M.S.; Almakhles, D.; Gowtham, N.; Rao, K.U. Critical Review of Data, Models and Performance Metrics for Wind and Solar Power Forecast. IEEE Access 2021, 15, 667–688. [Google Scholar] [CrossRef]
  15. Mohamed, L.; Heba, M.; Ahmed, K.; Khalid, A. A non-linear auto-regressive exogenous method to forecast the photovoltaic power output. Sustain. Energy Technol. Assess. 2020, 38, 100670. [Google Scholar] [CrossRef]
  16. Muhammed, A.H.; Nadjem, B.; Kada, B.; Samuel, C.N. Ultra-short-term exogenous forecasting of photovoltaic power production using genetically optimized non-linear auto-regressive recurrent neural networks. Renew. Energy 2021, 171, 191–209. [Google Scholar] [CrossRef]
  17. Lorenz, E.; Hurka, J.; Schneider, M. Qualified forecast of ensemble power production by spatially dispersed grid-connected PV systems. In Proceedings of the 23rd European Photovoltaic Solar Energy Conference, Valencia, Spain, 1–5 September 2008; pp. 3285–3291. [Google Scholar]
  18. Abraham, B.; Ledolter, J. Statistical Methods for Forecasting; Wiley: Hoboken, NJ, USA, 2009. [Google Scholar]
  19. Kim, H.; Park, S.; Park, H.-J.; Son, H.-G.; Kim, S. Solar Radiation Forecasting Based on the Hybrid CNN-CatBoost Model. IEEE Access 2023, 11, 13492–13500. [Google Scholar] [CrossRef]
  20. Shafer, G. A Mathematical Theory of Evidence; Princeton University Press: Princeton, NJ, USA, 1976; Volume 42. [Google Scholar] [CrossRef]
  21. Denoeux, T. Decision-making with belief functions: A review. Int. J. Approx. Reason. 2019, 109, 87–110. [Google Scholar] [CrossRef]
  22. Cao, S.; Cao, J. Forecast of solar irradiance using recurrent neural networks combined with wavelet analysis. Appl. Therm. Eng. 2005, 25, 161–172. [Google Scholar] [CrossRef]
  23. Qing, X.; Niu, Y. Hourly day-ahead solar irradiance prediction using weather forecasts by LSTM. Energy 2018, 148, 461–468. [Google Scholar] [CrossRef]
  24. Husein, M.; Chung, I.Y. Day-ahead solar irradiance forecasting for microgrids using a long short-term memory recurrent neural network: A deep learning approach. Energies 2019, 12, 1856. [Google Scholar] [CrossRef]
  25. Yu, Y.; Cao, J.; Zhu, J. An LSTM short-term solar irradiance forecasting under complicated weather conditions. IEEE Access 2019, 7, 145651–145666. [Google Scholar] [CrossRef]
  26. Wojtkiewicz, J.; Hosseini, M.; Gottumukkala, R.; Chambers, T.L. Hour-ahead solar irradiance forecasting using multivariate gated recurrent units. Energies 2019, 12, 4055. [Google Scholar] [CrossRef]
  27. Aslam, M.; Lee, J.M.; Kim, H.S.; Lee, S.J.; Hong, S. Deep learning models for long-term solar radiation forecasting considering microgrid installation: A comparative study. Energies 2019, 13, 147. [Google Scholar] [CrossRef]
  28. He, H.; Lu, N.; Jie, Y.; Chen, B.; Jiao, R. Probabilistic solar irradiance forecasting via a deep learning-based hybrid approach. IEEJ Trans. Electr. Electron. Eng. 2020, 15, 1604–1612. [Google Scholar] [CrossRef]
  29. Jeon, B.K.; Kim, E.J. Next-day prediction of hourly solar irradiance using local weather forecasts and LSTM trained with non-local data. Energies 2020, 13, 5258. [Google Scholar] [CrossRef]
  30. Sangrody, H.; Zhou, N.; Zhang, Z. Similarity-Based Models for Day-Ahead Solar PV Generation Forecasting. IEEE Access 2020, 8, 104469–104478. [Google Scholar] [CrossRef]
  31. de Costa, R.L. Convolutional-LSTM networks and generalization in forecasting of household photovoltaic generation. Eng. Appl. Artif. Intell. 2022, 116, 105458. [Google Scholar] [CrossRef]
  32. Stratigakos, A.; Andrianesis, P.; Michiorri, A.; Kariniotakis, G. Towards Resilient Energy Forecasting: A Robust Optimization Approach. IEEE Trans. Smart Grid 2024, 15, 874–885. [Google Scholar] [CrossRef]
  33. Hayajneh, A.M.; Alasali, F.; Salama, A.; Holderbaum, W. Intelligent Solar Forecasts: Modern Machine Learning Models and TinyML Role for Improved Solar Energy Yield Predictions. IEEE Access 2024, 12, 10846–10864. [Google Scholar] [CrossRef]
  34. Del Grosso, S.; Parton, W.; Mosier, A.; Hartman, M.; Keough, C.; Peterson, G.; Ojima, D.; Schimel, D. Nitrogen in the Environment: Sources, Problems and Management; Max Planck Institut für Biogeochemie: Jena, Germany, 2001; p. 413. Available online: https://www.bgc-jena.mpg.de/wetter/ (accessed on 10 May 2022).
  35. Schmid, B.; Schmitz, M.; Rzanny, M.; Scherer-Lorenzen, M.; Mwangi, P.N.; Weisser, W.W.; Hector, A.; Schmid, R.; Flynn, D.F. Removing subordinate species in a biodiversity experiment to mimic observational field studies. Grassl. Res. 2022. [Google Scholar] [CrossRef]
  36. Geiger, M.; Diabaté, L.; Ménard, L.; Wald, L. A web service for controlling the quality of measurements of global solar irradiation. Sol. Energy 2002, 73, 475–480. [Google Scholar] [CrossRef]
  37. Liu, Z.g.; Pan, Q.; Dezert, J. Evidential classifier for imprecise data based on belief functions. Knowl.-Based Syst. 2013, 52, 246–257. [Google Scholar] [CrossRef]
  38. Wahbah, M.; Mohandes, B.; EL-Fouly, T.H.; El Moursi, M.S. Unbiased cross-validation kernel density estimation for wind and PV probabilistic modelling. Energy Convers. Manag. 2022, 266, 115811. [Google Scholar] [CrossRef]
  39. Zhang, K.; Yu, X.; Liu, S.; Dong, X.; Li, D.; Zang, H.; Xu, R. Wind power interval prediction based on hybrid semi-cloud model and nonparametric kernel density estimation. Energy Rep. 2022, 8, 1068–1078. [Google Scholar] [CrossRef]
  40. Węglarczyk, S. Kernel density estimation and its application. EDP Sci. 2018, 23, 00037. [Google Scholar] [CrossRef]
  41. Guyard, R.; Cherfaoui, V. Study of discounting methods applied to canonical decomposition of belief functions. In Proceedings of the 2018 21st International Conference on Information Fusion (FUSION), Cambridge, UK, 10–13 July 2018; pp. 2505–2512. [Google Scholar]
  42. Yager, R.R. Decision making under Dempster-Shafer uncertainties. Int. J. Gen. Syst. 1992, 20, 233–245. [Google Scholar] [CrossRef]
  43. Sentz, K.; Ferson, S. Combination of Evidence in Dempster-Shafer Theory; Sandia National Laboratories Albuquerque: Albuquerque, NM, USA, 2002; Volume 4015. [Google Scholar] [CrossRef]
  44. Fu, C.; Yang, S. The conjunctive combination of interval-valued belief structures from dependent sources. Int. J. Approx. Reason. 2012, 53, 769–785. [Google Scholar] [CrossRef]
  45. Smets, P. Belief functions: The disjunctive rule of combination and the generalized Bayesian theorem. Int. J. Approx. Reason. 1993, 9, 1–35. [Google Scholar] [CrossRef]
  46. Dempster, A.P. Upper and lower probabilities induced by a multivalued mapping. In Classic Works of the Dempster-Shafer Theory of Belief Functions; Springer: Berlin/Heidelberg, Germany, 2008; pp. 57–72. [Google Scholar] [CrossRef]
  47. Smets, P. Decision making in the TBM: The necessity of the pignistic transformation. Int. J. Approx. Reason. 2005, 38, 133–147. [Google Scholar] [CrossRef]
  48. Wen, X.; Jaxa-Rozen, M.; Trutnevyte, E. Accuracy indicators for evaluating retrospective performance of energy system models. Appl. Energy 2022, 325, 119906. [Google Scholar] [CrossRef]
  49. Rajagukguk, R.A.; Ramadhan, R.A.; Lee, H.J. A review on deep learning models for forecasting time series data of solar irradiance and photovoltaic power. Energies 2020, 13, 6623. [Google Scholar] [CrossRef]
  50. Ghimire, S.; Bhandari, B.; Casillas-Perez, D.; Deo, R.C.; Salcedo-Sanz, S. Hybrid deep CNN-SVR algorithm for solar radiation prediction problems in Queensland, Australia. Eng. Appl. Artif. Intell. 2022, 112, 104860. [Google Scholar] [CrossRef]
  51. Wang, M.; Wang, P.; Zhang, T. Evidential Extreme Learning Machine Algorithm-Based Day-Ahead Photovoltaic Power Forecasting. Energies 2022, 15, 3882. [Google Scholar] [CrossRef]
Figure 1. Geographical location (a) and photograph (b) of the Saaleaue weather station in Germany [34].
Figure 2. Solar angles (a) and geographical solar trajectory (b) from the Saaleaue weather station location at 12:00 p.m. through the year 2022.
Figure 3. Fast Fourier transform of three years of solar irradiance records.
Figure 4. Mechanism of the BPA building step.
Figure 5. Slicing method to build the BPA.
Figure 6. Mechanism of the proposed method.
Figure 7. Entire mechanism of the proposed method.
Figure 8. Impact of α on the proposed method’s performance.
Figure 9. Impact of β on the proposed method’s performance (a) and the execution time (b).
Figure 10. Pareto front of the average RMSE and the execution time according to variable β.
Figure 11. Impact of γ on the proposed method’s performance.
Figure 12. Impact of N on the proposed method’s performance (a) and the execution time (b).
Figure 13. One-day-ahead predictions utilizing two different predictors: (a) summer and (b) winter.
Figure 14. One-day-ahead prediction with the proposed method in summer (a) and in winter (b).
Figure 15. Proposed method performance according to forecast horizon (a) and training period (b).
Figure 16. Proposed method performance according to both forecast horizon and training period.
Table 1. Solar features of the Saaleaue weather station location [36].

Feature                                Value    Unit
Specific photovoltaic power output     1071.6   kWh/kWp
Direct normal irradiation (DNI)        995.0    kWh/m2
Global horizontal irradiation (GHI)    1082.3   kWh/m2
Global tilted irradiation (GTIopta)    1265.8   kWh/m2
Optimum tilt of PV modules (OPTA)      38/180   °
Air temperature                        9.9      °C
Terrain elevation                      138.0    m
Table 2. Selected weather features.

Feature                Unit
Atmospheric pressure   mbar
Temperature            °C
Relative humidity      %
Specific humidity      g/kg
Dew point              °C
Saturation pressure    mbar
Vapor pressure         mbar
Deficit pressure       mbar
Vapor concentration    mmol/mol
Air density            g/m3
Wind speed             m/s
Precipitation          mm
Table 5. Study results for the impact of parameters.

Symbol   Optimum   Sensitivity   Features
α        0.025     3.1606        High sensitivity; bounded
β        0.975     0.4294        Affects execution time; user-defined (Pareto); bounded
γ        90        0.1561        High instability; low sensitivity
N        4         2.741         Affects execution time; high sensitivity; discrete
Table 6. Performance (RMSE in W/m2) of several individual predictors in comparison with their combination in 2017.

Date     Precip.   Atm. Pressure   Rel. Humidity   Wind Speed   Overall
29 Jan   23.36     20.06           119.80          19.41        28.09
18 Mar   27.69     20.52           31.64           25.98        15.96
4 May    21.30     54.31           20.43           42.71        25.35
21 Jun   270.40    261.20          113.80          260.40       74.28
8 Aug    209.10    200.80          42.60           199.60       29.11
24 Sep   40.09     33.84           40.55           37.98        21.30
16 Nov   11.75     10.78           10.19           10.23        10.12
29 Dec   8.475     16.49           111.90          16.45        9.535
Table 7. Comparison of the proposed method’s accuracy on a one-day-ahead forecast horizon with other state-of-the-art methods. The proposed method’s input parameters were adapted to match each state-of-the-art method for a fair comparison; RMSE values are in W/m2.

Cao et al. [22]. Model: RNN. Inputs: solar irradiance. Training period: 6 years. State-of-the-art RMSE: 44.326; proposed: 30.60.
Qing et al. [23]. Model: LSTM. Inputs: dew point, humidity, temperature, visibility *, wind speed. Training period: 2.5 years. State-of-the-art RMSE: 76.245; proposed: 34.17.
Husein et al. [24]. Model: LSTM. Inputs: cloud cover *, humidity, precipitation, temperature, wind direction *, wind speed. Training period: 15 years. State-of-the-art RMSE: 60.310; proposed: 32.68.
Aslam et al. [27]. Models: LSTM, GRU, RNN. Inputs: solar irradiance. Training period: 10 years. State-of-the-art RMSE: 55.277, 55.821, 63.125; proposed: 29.53.
Hui et al. [28]. Model: LSTM. Inputs: atmospheric pressure, cloud cover *, relative humidity, temperature, wind speed. Training period: 10 years. State-of-the-art RMSE: 62.540; proposed: 30.63.
Byung-ki et al. [29]. Model: LSTM. Inputs: atmospheric pressure, cloud cover *, humidity, precipitation, solar irradiance, temperature, wind speed. Training period: 5 years. State-of-the-art RMSE: 30.210; proposed: 27.83.

* Feature unavailable for the proposed method. Bold text in the original table marks the best results. State-of-the-art results source: [49].
Table 8. Comparison of the proposed method’s accuracy on a one-hour-ahead forecast horizon with other state-of-the-art methods. The proposed method’s input parameters were adapted to match each state-of-the-art method for a fair comparison; RMSE values are in W/m2, with the proposed method evaluated at 07:00 a.m., 12:00 p.m., and 5:00 p.m.

Yu et al. [25]. Model: LSTM. Inputs: air temperature, cloud type *, dew point, GHI *, precipitation, relative humidity, solar zenith angle *, wind direction *, wind speed. Training period: 5 years. State-of-the-art average RMSE: 41.370; proposed: 27.14 (07:00 a.m.), 46.31 (12:00 p.m.), 32.25 (5:00 p.m.), average 35.23.
Wojtkiewicz et al. [26]. Models: GRU, LSTM. Inputs: air temperature, GHI *, relative humidity, solar zenith angle *. Training period: 11 years. State-of-the-art average RMSE: 67.290, 66.570; proposed: 33.48 (07:00 a.m.), 51.43 (12:00 p.m.), 19.95 (5:00 p.m.), average 34.95.
Aslam et al. [27]. Models: LSTM, GRU, RNN. Inputs: solar irradiance. Training period: 10 years. State-of-the-art average RMSE: 108.89, 99.722, 105.28; proposed: 2.930 (07:00 a.m.), 39.76 (12:00 p.m.), 21.71 (5:00 p.m.), average 21.46.

* Feature unavailable for the proposed method. Bold text in the original table marks the best results. State-of-the-art results source: [49].
Table 9. Method performances with respect to other state-of-the-art methods (one day ahead).

Authors and References   Model        Database                        RMSE (in W/m2)
Sujan et al. [50]        CSVR         Daystar Energy Solar Farm       25.14
Minli et al. [51]        EELM         University of Macau PV System   46.97
Proposed                 Evidential   Saaleaue WS                     17.81
Table 10. Performance of the proposed method using different evaluation metrics (RMSE was also evaluated in Table 6).

Date     RMSE (W/m2)   AME (W/m2)   MPE (%)   MBE (W/m2)
29 Jan   28.09         15.58        71.80     −8.73
18 Mar   15.96         8.12         27.75     −4.46
4 May    25.35         32.37        80.22     −24.89
21 Jun   74.28         33.02        18.14     −6.84
8 Aug    29.11         21.18        10.42     −17.56
24 Sep   21.30         12.55        22.55     −5.34
16 Nov   10.12         12.70        53.45     −3.31
29 Dec   9.535         3.43         26.45     0.39
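The four metrics of Table 10 follow standard definitions. The sketch below computes them on made-up forecast/observation vectors (not the study’s data); the sign convention (forecast minus observation) and the MPE normalization by the observed value are assumptions, as the paper may define them slightly differently.

```python
# Standard error metrics for an irradiance forecast, sketched on toy data.
# Assumed conventions: error = forecast - observation; MPE normalized by
# the observed value. These are illustrative, not the paper's exact formulas.
import math

def metrics(forecast, observed):
    errors = [f - o for f, o in zip(forecast, observed)]
    n = len(errors)
    rmse = math.sqrt(sum(e * e for e in errors) / n)          # root mean square error
    ame = sum(abs(e) for e in errors) / n                     # absolute mean error
    mbe = sum(errors) / n                                     # mean bias error
    mpe = 100 * sum(e / o for e, o in zip(errors, observed)) / n  # mean percentage error
    return {"RMSE": rmse, "AME": ame, "MBE": mbe, "MPE": mpe}

# Toy irradiance values in W/m2 (not from the Saaleaue dataset)
forecast = [310.0, 505.0, 420.0, 120.0]
observed = [300.0, 500.0, 440.0, 125.0]
print(metrics(forecast, observed))
```

A negative MBE, as on most days in Table 10, indicates that the method tends to under-forecast the observed irradiance on average.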
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
