This section consists of three main parts. First, the data-driven part of the model is evaluated, followed by a discussion of the improvements brought about by contextual optimisation. Subsequently, the final DCF model is presented and validated by the short- and long-term models in all three cities.
5.1. Data-Driven Model Results
The initial model was based on historical solar radiation data and the respective algorithm. SVM outperformed FBP in the 1-hour-ahead prediction in terms of R² and RMSE (Table 4). It also had the lowest MAE for all three horizons. For the 2-hour prediction, FBP yielded R² and MAE results similar to those of SVM, while beyond this horizon it outperformed the SVM model. This is because the accuracy of SVM declined sharply as the prediction horizon increased. For the long-term forecast, FBP resulted in a better R² and RMSE, while SVM yielded a better MAE (Table 5). Adding extraterrestrial radiation to the model enhanced the performance of SVM and FBP for both the short- and long-term predictions (Table 4 and Table 5). For the short-term prediction, R² increased by ca. 7% for FBP, and by between 5% (1 hour ahead) and 10.5% (3 hours ahead) for the SVM model. MAE decreased noticeably for FBP, by ca. 34 W/m², and also, but less drastically, for the SVM model. RMSE decreased for both algorithms as well. The SVM model that included global and extraterrestrial radiation of the same hour and day, 1 and 2 years earlier, yielded the best results. The R² value in the long-term model increased by 7% for FBP and 17% for SVM. Furthermore, MAE and RMSE were reduced substantially. Overall, the addition of extraterrestrial radiation resulted in considerable improvements in all models. Extraterrestrial radiation on a horizontal surface is a good indicator of potential global horizontal irradiance, as it states how much solar radiation is received at the top of the atmosphere for a certain location [51].
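To illustrate the added feature, the following is a minimal sketch of how extraterrestrial radiation on a horizontal surface could be computed from the day of the year, the solar time, and the latitude. It uses textbook approximations (Cooper's declination formula and a solar constant of 1367 W/m²); these choices, as well as all function and parameter names, are assumptions for illustration and are not taken from the model described above.

```python
import math

# Assumption: a commonly used value for the solar constant in W/m^2.
SOLAR_CONSTANT = 1367.0


def declination(day_of_year: int) -> float:
    """Solar declination in radians (Cooper's approximation)."""
    return math.radians(23.45) * math.sin(2 * math.pi * (284 + day_of_year) / 365)


def extraterrestrial_horizontal(day_of_year: int, solar_hour: float,
                                latitude_deg: float) -> float:
    """Extraterrestrial irradiance on a horizontal surface in W/m^2.

    solar_hour is local solar time in hours (12.0 = solar noon).
    """
    phi = math.radians(latitude_deg)
    delta = declination(day_of_year)
    # Hour angle: 15 degrees per hour away from solar noon.
    omega = math.radians(15.0 * (solar_hour - 12.0))
    cos_zenith = (math.sin(phi) * math.sin(delta)
                  + math.cos(phi) * math.cos(delta) * math.cos(omega))
    # Correction for the varying Earth-Sun distance over the year.
    eccentricity = 1 + 0.033 * math.cos(2 * math.pi * day_of_year / 365)
    # The sun is below the horizon when cos_zenith is negative.
    return max(0.0, SOLAR_CONSTANT * eccentricity * cos_zenith)


# Example: around the summer solstice at roughly Denver's latitude (39.74 N).
noon_value = extraterrestrial_horizontal(172, 12.0, 39.74)
midnight_value = extraterrestrial_horizontal(172, 0.0, 39.74)
```

Because the value depends only on time and location, it can be generated for any forecast hour without measurements, which is what makes it usable as an input feature for future time steps.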
The hyperparameters of the SVM model were tuned using grid search cross-validation. The tunable parameters were the regularisation parameter C, the size of the error tube ε, and the width of the RBF kernel γ. The influence of ε and γ was minimal, leading to improvements of less than 0.0004% in R². Therefore, tuning focused on the regularisation parameter C. SVMs are generally strongly dependent on their hyperparameters [10]. However, tuning the hyperparameters for these models did not lead to significant improvements. For the short-term prediction, C = 120 led to the best results. This, however, only improved R² by 0.5%, MAE by 8.6 W/m², and RMSE by 1.6 W/m². These improvements were small compared with those from the addition of features. For the long-term prediction, the best C was 0.5. The improvements for this were even smaller.
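The mechanics of grid search cross-validation can be sketched in a few lines. To keep the example free of third-party libraries, a one-parameter ridge regressor stands in for the SVM, so the code shows the search procedure only and not the model used here. The data are synthetic, and all names and grid values are hypothetical. Note that the ridge penalty grows with its parameter, whereas a larger regularisation parameter in an SVM means weaker regularisation.

```python
import random
import statistics


def fit_ridge(xs, ys, strength):
    """Fit y ~ a*x + b with an L2 penalty of `strength` on the slope a."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    a = sxy / (sxx + strength)
    return a, my - a * mx


def cv_mae(xs, ys, strength, k=5):
    """Mean absolute error averaged over k contiguous validation folds."""
    n = len(xs)
    fold_errors = []
    for i in range(k):
        lo, hi = i * n // k, (i + 1) * n // k
        # Train on everything outside the fold, validate on the fold.
        a, b = fit_ridge(xs[:lo] + xs[hi:], ys[:lo] + ys[hi:], strength)
        errors = [abs(a * x + b - y) for x, y in zip(xs[lo:hi], ys[lo:hi])]
        fold_errors.append(statistics.fmean(errors))
    return statistics.fmean(fold_errors)


def grid_search(xs, ys, grid, k=5):
    """Return the grid value with the lowest cross-validated MAE, plus all scores."""
    scores = {value: cv_mae(xs, ys, value, k) for value in grid}
    best = min(scores, key=scores.get)
    return best, scores


# Synthetic data: a noisy straight line (purely illustrative).
random.seed(0)
xs = [i / 10 for i in range(100)]
ys = [2.0 * x + 1.0 + random.gauss(0.0, 0.5) for x in xs]
best, scores = grid_search(xs, ys, grid=[0.0, 1.0, 10.0, 100.0, 1000.0])
```

Inspecting `scores` rather than only `best` shows how flat the error surface is around the optimum, which is the kind of evidence behind the observation that tuning brought only small gains.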
The results of the data-driven model can be seen in Figure 8, displaying the same trend as described for the initial model (untuned, without added features).
5.2. Contextual Optimisation Results
The results of further contextual optimisation are presented in this section. Setting all negative values to zero slightly improved the SVM model. It also made the output easier to interpret, as the user is no longer confronted with physically impossible (negative) predictions. As the FBP short-term model had larger negative predictions, eliminating these led to greater improvements: R² increased by 3%, and MAE and RMSE decreased by 26 W/m² and 8 W/m², respectively. The long-term model improvements were less significant. As neither model predicted negative solar radiation during the day, setting all negative values to zero was appropriate: a model that predicts zero at night, instead of negative values, is a closer reflection of reality.
There were some positive predictions at night. As this is not physically possible, sunrise and sunset adjustments were applied. Setting all values from sunset to sunrise to zero gave slightly better prediction results than defining a fixed night from 8 p.m. to 6 a.m. This was to be expected and held for short- and long-term predictions, in both the SVM and FBP models. Including the flexible sunrise and sunset in the model allowed it to be easily applied to a location with different geographical conditions. This is particularly important in locations that are far from the equator, as sunset and sunrise times vary more over the year in those places. However, it must be noted that including this adjustment in the model requires extra computational power. In locations where there is no significant variation in sunset and sunrise times during the year, this step may not be worth the marginally improved performance.
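The two adjustments described above (clipping negative values and zeroing predictions between sunset and sunrise) can be sketched as follows. Sunrise and sunset are estimated here from the standard sunset hour angle formula in solar time, ignoring atmospheric refraction and time-zone offsets. This is an assumption for illustration; the way sunrise and sunset times were actually obtained is not specified in this section, and all names are hypothetical.

```python
import math


def declination(day_of_year: int) -> float:
    """Solar declination in radians (Cooper's approximation)."""
    return math.radians(23.45) * math.sin(2 * math.pi * (284 + day_of_year) / 365)


def sunrise_sunset(day_of_year: int, latitude_deg: float):
    """Return (sunrise, sunset) in solar hours for the given day and latitude."""
    phi = math.radians(latitude_deg)
    x = -math.tan(phi) * math.tan(declination(day_of_year))
    # Clamp so that polar day and polar night do not break acos.
    x = max(-1.0, min(1.0, x))
    half_day = math.degrees(math.acos(x)) / 15.0  # hours between noon and sunset
    return 12.0 - half_day, 12.0 + half_day


def apply_night_adjustment(prediction: float, solar_hour: float,
                           day_of_year: int, latitude_deg: float) -> float:
    """Zero the prediction at night and clip negative daytime values to zero."""
    sunrise, sunset = sunrise_sunset(day_of_year, latitude_deg)
    if solar_hour < sunrise or solar_hour > sunset:
        return 0.0
    return max(0.0, prediction)


# Example at roughly Seattle's latitude (47.6 N) around the summer solstice.
sunrise, sunset = sunrise_sunset(172, 47.6)
early_morning = apply_night_adjustment(50.0, 5.0, 172, 47.6)   # after sunrise, kept
late_evening = apply_night_adjustment(30.0, 23.0, 172, 47.6)   # after sunset, zeroed
negative_noon = apply_night_adjustment(-12.0, 12.0, 172, 47.6)  # clipped to zero
```

In this example, a prediction at 5 a.m. is kept because the sun has already risen, whereas a fixed 8 p.m. to 6 a.m. window would have set it to zero.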
Seasonal adaptation was only applied to the long-term forecast. There were three versions of this amendment, using the mean (V1), the median (V2), and the mean of the mean and median (V3). For SVM, the seasonal adaptation had a greater impact on the model with additional features. Version 1 performed best for the R² value, reducing the error by 11% and decreasing RMSE by 7 W/m², as shown in Figure 9. However, MAE increased by 6 W/m², which should be avoided. Version 2 performed better for MAE, decreasing it. However, the R² value decreased by 0.2% and RMSE increased slightly, which is also not desirable. Version 3 combines aspects of both preceding versions, offering more consistent and stable results. The R² and RMSE values for this version were better than those of the previous amendment (sunrise and sunset), while MAE was very similar. Therefore, version 3 of the seasonal adaptation, using the mean of the median and mean, was chosen as the last amendment for the long-term SVM model. The improvement from applying the seasonal adaptation can clearly be observed in Figure 10.
For FBP, the improvement on the model with additional features was marginal. As version 1 (using the mean as the average) led to improvements for all metrics, it was chosen for the FBP model. Interestingly, applying the seasonal adaptation to the FBP model without extraterrestrial radiation led to R², MAE, and RMSE results that were only slightly different from those of the model with extraterrestrial radiation. The seasonal adaptation had a greater positive impact on the model without extraterrestrial radiation, as shown in Table 6. The impact on this model was larger because both the yearly and the daily seasonality were corrected, whereas for the model with extraterrestrial radiation mostly the daily seasonality was adjusted. Thus, using a model without extraterrestrial radiation could be considered if these data are not available.
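The three averaging variants compared above differ only in how a set of values is summarised. The sketch below computes the V1, V2, and V3 summaries for a list of values. How the resulting value is then used to adjust the forecast is not described in this section, so the example deliberately stops at the summary; the function name and the sample values are hypothetical.

```python
import statistics


def seasonal_reference(values, version: str) -> float:
    """Summarise values by the mean (V1), median (V2), or their mean (V3)."""
    mean = statistics.fmean(values)
    median = statistics.median(values)
    if version == "V1":
        return mean
    if version == "V2":
        return median
    if version == "V3":
        return (mean + median) / 2
    raise ValueError(f"unknown version: {version}")


# Skewed sample in W/m^2: one large value pulls the mean above the median.
sample = [0.0, 100.0, 200.0, 900.0]
v1 = seasonal_reference(sample, "V1")
v2 = seasonal_reference(sample, "V2")
v3 = seasonal_reference(sample, "V3")
```

For skewed data such as this sample, the mean reacts to the outlier and the median ignores it, while V3 lies halfway between the two. This is consistent with V3 being described as a compromise between the other two versions.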
Table 4 and Table 5 display the results of all steps of the data-driven and contextual parts for the short- and long-term forecasts. It is clear that the accuracy was enhanced at each step of the algorithm, from the initial feature training to the seasonal adaptation (SA). The proposed model changes improved R² of the short-term model by 5% (1 h) to 11% (3 h) for SVM and by 7% for FBP. MAE decreased by 39 W/m² for the FBP model and by ca. 25 W/m² for SVM. RMSE also decreased, by 17 to 24 W/m² for SVM and by 18 W/m² for FBP. The overall R² improvement associated with the model changes for the long-term forecast was 20% for SVM and 8% for FBP, as shown in Table 5. MAE decreased by 42 W/m² for FBP but only by 16 W/m² for SVM. For SVM, however, RMSE decreased by 41 W/m², whereas for FBP, it decreased by 20 W/m².
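For reference, the three metrics used throughout this comparison can be computed as in the following sketch. It assumes that R² denotes the coefficient of determination and uses the standard textbook definitions of MAE and RMSE; the sample values are made up for illustration and are not results from this study.

```python
import math


def mae(actual, predicted) -> float:
    """Mean absolute error, in the unit of the data (here W/m^2)."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def rmse(actual, predicted) -> float:
    """Root mean square error; penalises large deviations more than MAE."""
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))


def r2(actual, predicted) -> float:
    """Coefficient of determination: 1 minus residual over total sum of squares."""
    mean_actual = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    return 1 - ss_res / ss_tot


# Made-up irradiance values in W/m^2.
actual = [0.0, 100.0, 200.0, 300.0]
predicted = [10.0, 90.0, 210.0, 290.0]
example_mae = mae(actual, predicted)
example_rmse = rmse(actual, predicted)
example_r2 = r2(actual, predicted)
```

Because RMSE squares the errors before averaging, a change that removes a few large errors can lower RMSE while leaving MAE almost unchanged or even slightly higher, which is why the two metrics do not always move together in the results above.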
Insights from the individual model results for the different horizons were used to determine which algorithm to use for which horizon in the final DCF. For DCF, the highest accuracy for the 1- and 2-hour predictions was achieved using SVM with extraterrestrial radiation as an additional input feature, the dynamic night-time adjustment, and version 3 of the seasonal adaptation. Figure 11 shows that the 1-hour SVM prediction displayed a compact trend line with only a few normally distributed errors. For FBP, most values were on a line that was slightly too steep, indicating overprediction for those values. However, there were also many points below the dense line, signalling underprediction. For the 3-hour and long-term predictions, FBP using V1 of the seasonal adaptation outperformed all other versions and algorithms. It can be concluded that the SVM model should be used for 1- and 2-hour-ahead predictions, while beyond that, the FBP model should be utilised in the final DCF.
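The resulting selection rule can be written as a small dispatch function. This is only a sketch of the rule stated above (SVM for horizons of up to 2 hours, FBP beyond that); the function name and the representation of the long-term forecast as a horizon in hours are hypothetical.

```python
def select_algorithm(horizon_hours: float) -> str:
    """Return the algorithm name the selection rule assigns to a forecast horizon."""
    if horizon_hours <= 0:
        raise ValueError("horizon must be positive")
    # SVM for 1- and 2-hour-ahead predictions, FBP for anything longer.
    return "SVM" if horizon_hours <= 2 else "FBP"


# Short-term horizons plus a one-year-ahead horizon expressed in hours.
choices = {h: select_algorithm(h) for h in (1, 2, 3, 8760)}
```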
The performance of FBP suffered less from an increase in horizon than that of the SVM model. This is due to the underlying characteristics of the algorithm: FBP is specifically designed for time-series prediction [30]. An advantage is that its performance declines less over time. However, inputting the whole past time series into the model did not allow values that had a higher correlation and were more relevant to the particular prediction to be emphasised. With SVM, by contrast, such a distinction was possible.
5.3. DCF Performance
In this section, the DCF performance for short- and long-term forecasting is presented. To validate its performance and ensure that DCF is a generic model that can be utilised for different locations, forecasts were conducted for three cities, i.e., Denver, Boston, and Seattle.
The results for all three cities and both algorithms are presented in Table 7. It can be seen that the SVM model performed even better on the short-term prediction in Seattle and Boston than in Denver, while the general trend remained the same as for the Denver results. For the long-term prediction, Denver displayed the best results in terms of R²; however, both MAE and RMSE were as low or lower for Boston and Seattle than for Denver. Again, the SVM model mostly outperformed FBP in the 1- and 2-hour forecasts, while the FBP model generally generated better results for the 3-hour prediction and in the long term. This matched the trend observed in the earlier results and validated the chosen DCF model.
Two days of short-term predictions by the DCF algorithm are displayed in Figure 12. It shows that the model was notably accurate on sunny days (first day), with smooth irradiance transitions. Furthermore, it captured trends for changes in weather, as can be observed on the second day. Despite the rapid change in irradiance, the model still generated accurate predictions.
As shown in Figure 13, DCF was applicable to different locations, conserving the general pattern of performance. This validated the DCF algorithm and provided us with confidence that this model will perform well in other not-yet-tested locations. R² values of around 90% (91.2%, 90.6%, and 87.6%) were achieved for the 1-hour predictions, while MAE ranged from 36 W/m² for Seattle to 47 W/m² for Denver and RMSE from 75 W/m² for Seattle to 107 W/m² for Denver. For the 2-hour forecast, the R² value declined by about 5%, and MAE and RMSE increased by ca. 12 and 18 W/m², respectively, for all locations. The 3-hour prediction still generated R² values of 78% (Seattle) to about 83% (Denver and Boston), while MAE ranged from 56 W/m² (Seattle) to about 61 W/m² (Denver and Boston) and RMSE from 103 W/m² (Seattle) to 116 W/m² (Denver and Boston). Even the long-term prediction for one year ahead still generated good results for all cities, with high R² values and low error values, as shown in Figure 13.