Article

Machine Learning Dynamic Ensemble Methods for Solar Irradiance and Wind Speed Predictions

by Francisco Diego Vidal Bezerra 1, Felipe Pinto Marinho 2, Paulo Alexandre Costa Rocha 1,3,*, Victor Oliveira Santos 3, Jesse Van Griensven Thé 3,4 and Bahram Gharabaghi 3

1 Department of Mechanical Engineering, Technology Center, Federal University of Ceará, Fortaleza 60020-181, CE, Brazil
2 Department of Teleinformatics Engineering, Technology Center, Federal University of Ceará, Fortaleza 60020-181, CE, Brazil
3 School of Engineering, University of Guelph, 50 Stone Rd. E, Guelph, ON N1G 2W1, Canada
4 Lakes Environmental, 170 Columbia St. W, Waterloo, ON N2L 3L3, Canada
* Author to whom correspondence should be addressed.
Atmosphere 2023, 14(11), 1635; https://doi.org/10.3390/atmos14111635
Submission received: 4 August 2023 / Revised: 26 September 2023 / Accepted: 27 October 2023 / Published: 31 October 2023
(This article belongs to the Special Issue Solar Irradiance and Wind Forecasting)

Abstract: This paper analyzes the performance increase obtained in the forecasting of solar irradiance and wind speed by implementing a dynamic ensemble architecture for intra-hour horizons ranging from 10 to 60 min, using 10 min time step data. Global horizontal irradiance (GHI) and wind speed were computed using four standalone forecasting models (random forest, k-nearest neighbors, support vector regression, and elastic net) to compare their performance against two dynamic ensemble methods, windowing and arbitrating. The standalone models and the dynamic ensemble methods were evaluated using the error metrics RMSE, MAE, R2, and MAPE. This work's findings showed that the windowing dynamic ensemble method was the best-performing architecture among the evaluated models. For both wind speed and solar irradiance forecasting, the windowing ensemble model reached the best RMSE values for all the assessed forecasting horizons. Using this approach, the wind speed forecasting gain was 0.56% compared with the second-best forecasting model, whereas the gain for GHI prediction was 1.96%, considering the RMSE metric. An ensemble model able to provide accurate and precise estimations can be implemented in real-time forecasting applications, helping the evaluation of wind and solar farm operation.

1. Introduction

Electricity generated from fossil fuel sources has been the main driver of climate change, contributing an estimated 70% of greenhouse gas emissions and over 90% of all carbon emissions. The path to decarbonizing the world's electricity generation system therefore lies in turning to renewable energy sources, whose generation costs are increasingly accessible [1].
The impact of generation intermittency [2] on the electrical grid is an undesired effect of electrical generation from alternative energy resources, such as wind and solar radiation. Since this generation depends on weather conditions, one of the means to eliminate or reduce its uncertainties is the availability of good prediction methods for these resources [3].
The search for parameters that can describe atmospheric behavior and its predictability has led research on machine learning to develop and to create models, based on the most diverse types of predictors, for use in different areas. In [4], multilayer machine learning is used to improve the resolution of ground-based astronomical telescopes. In [5], parameters are used to construct an atmospheric circulation model.
The influences of atmospheric factors on the generation of electrical energy from solar and wind sources are usually the main problem in the generation of smart grids, where large-scale generation plants need to be integrated into the electrical grid, which directly affects the planning, investment, and decision-making processes. Forecast models can minimize that problem via machine learning models [6].
The benefits of optimizing the forecast of generation from wind and solar sources using models is also an economic factor, as it gives greater security to the electricity sector via the improvement of renewable energy purchase contracts [7].
A 14-year-long data set was explored in [8], containing daily values of meteorological variables. This dataset was used to train three deep neural network (DNN) architectures over several time horizons to predict global solar radiation for Fortaleza, in the northeastern region of Brazil. The accuracy of the predictions was considered excellent according to its normalized root mean squared error (nRMSE) values and good according to mean absolute percentage error (MAPE) values.
Each mathematical prediction model has its own individual strengths and weaknesses, and in this scenario dynamic ensemble models emerge, which present potentially better performance than individual models, since they seek maximum optimization by combining the best of the individual models. This approach is currently used very successfully in research and industry. Several dynamic ensemble methods have been developed for forecasting energy generation from renewable sources; these rely on well-known forecast models, such as random forest regression (RF), support vector regression (SVR), and k-nearest neighbors (kNN), which are integrated and optimized within the dynamic ensemble methods [9].
The random forest (RF) forecasting model is based on the creation of random decision trees. In this method, these decision trees state specific rules and conditions for the flow of the result until its conclusion.
Support vector regression (SVR) is a regression algorithm that maps individual observations into a coordinate space and uses hyperplanes to segregate data sets. The underlying support vector machine is a widely used method for clustering and classification; it was first developed for classification purposes and has been extensively tested [10,11]. Recent approaches include [12], which develops a novel method for the maximum power point tracking of a photovoltaic panel, and [13], where solar radiation estimation via five different machine learning approaches is discussed.
The kNN method is a supervised learning algorithm widely used as a classifier. Based on the proximity of the nearest neighboring data, it performs categorization via similarity and predicts a new sample using the K closest samples. Recently, this approach has been used in [14], where virtual meteorological masts use calibrated numerical data to provide precise wind estimates during all phases of a wind energy project, reproducing optimal site-specific environmental conditions.
Most studies have focused on accurate wind power forecasting, where the random fluctuations and uncertainties involved are considered. The study in [15] proposes a novel method of ultra-short-term probabilistic wind power forecasting using an error correction modeling with the random forest approach.
The elastic net method is a regularized regression method that linearly combines the penalties of the LASSO and Ridge methods. In [16], the study uses forecast combinations that are obtained by applying regional data from Germany for both solar photovoltaic and wind via the elastic net model, with cross-validation and rolling window estimation, in the context of renewable energy forecasts.
The state of the art is currently to use dynamic ensemble methods in a meta-learning approach such as arbitrating, which uses output combinations according to the predictions of the loss that shall result, as well as windowing approaches, which have parameterizations for adjusting the degree of data to be considered [17].
In [18], a global climate model (GCM) is studied to improve a near-surface wind speed (WS) simulation via 28 coupled model intercomparisons using dynamical components.
In [19], a hybrid transfer learning model based on a convolutional neural network and a gated recurrent neural network is proposed to predict short-term canyon wind speed with fewer observation data. The method uses a time sliding window to extract time series from historical wind speed data and temperature data of adjacent cities as the input of the neural network.
In [20], authors studied the multi-GRU-RCN method, an ensemble model, to obtain significant information regarding factors such as precipitation and solar irradiation via short-time cloud motion predictions from a cloud image. The ensemble modeling used in [21] integrates wind and solar forecasting methodologies applied to two locations at different latitudes and with climatic profiles. The obtained results reduce the forecast errors and can be useful in optimizing planning to use intermittent solar and wind resources in electrical matrices.
A proposed new ensemble model in [22] was based on graph attention networks (GAT) and GraphSAGE to predict wind speed in a bi-dimensional approach using a Dutch dataset including several time horizons, time lags, and weather influences. The results showed that the ensemble model proposed was equivalent to or outperformed all benchmarking models and had smaller error values than those found in reference literature.
In [23], time horizons ranging from 5 min to 30 min were studied in 5-min time steps in evaluating solar irradiance short-term forecasts to global horizontal irradiance (GHI) and direct normal irradiance (DNI) using deep neural networks with 1-dimensional convolutional neural networks (CNN-1Ds), long short-term memory (LSTM), and CNN–LSTM. The metrics used were the mean absolute error (MAE), mean bias error (MBE), root mean squared error (RMSE), relative root mean squared error (rRMSE), and coefficient of determination (R2). The best accuracy was obtained for a horizon of 10 min, improving 11.15% on this error metric compared to the persistence model.
There are studies employing different DNN architectures, such as GNN, CNN, and LSTM, achieving satisfactory outcomes in different fields of science [24,25,26,27]. However, the present work focuses on classical ML, since the main objective is to identify the best supporting ensemble approach for the ML procedures, analyzing the influence of the dynamic ensemble arbitrating and windowing methods on traditional machine learning algorithms, focusing on predicting electrical power generation. We also present their greater efficiency, using data of interest for energy production with input variables of wind speed and solar irradiance. We followed this approach because of its advantage in exploring dynamic ensemble methods, since these seek the best pre-existing efficiency to generate a unique and more effective prediction model.

2. Location and Data

In this paper, two data types were used to carry out the analysis, which were acquired from solarimetric and anemometric stations located in Petrolina—PE. The data were collected from the SONDA network (National Organization of Environmental Data System) [28], which was a joint collaboration between several institutions and was created for the implementation of physical infrastructure and human resources, aimed at raising and improving the database of solar and wind energy resources in Brazil.
The time sampling used in this study was 10 min, and the duration of data collection was from January 2007 to December 2010. The detailed information about the data of the solarimetric and anemometric station is shown in Table 1, where MI (min) is the “measurement interval” and the duration of data collection is presented as MP, “measured period”. Its location on the map is shown in Figure 1.
The Petrolina region is classified as a BSh Köppen climate zone [30]. There are considerable differences in the annual cycle between solar radiation and wind. The average wind speed and solar irradiance in Petrolina experience significant seasonal variations throughout their annual cycle. The windiest interval of the year occurs from May to November, with average wind speeds above 5.4 m/s. The month with the strongest winds is August, with an average hourly wind speed of 6.7 m/s. The calmest period of the year is from November to May. The month with the calmest winds is March, with an average hourly wind speed of 4.1 m/s.
The period of greatest solar radiance in the year is from September to November, with a daily average above 7.2 kWh/m2, with October being the peak with an average of 7.5 kWh/m2. The period with the lowest solar radiance in the year is from May to July, with a daily average of 6.1 kWh/m2, with June being the month with the lowest solar radiance, with an average of 5.7 kWh/m2.

2.1. Wind Speed Data

The wind speed data were obtained in m/s from a meteorological station, which has anemometric sensors at altitudes of 25 m and 50 m above the ground. The higher altitude was chosen for this study, both to reduce the effects of the terrain and to be closer to the altitudes currently in practice for wind turbines [31].

2.2. Irradiance Data

The global horizontal irradiance (GHI) data acquired from the solarimetric station were used in this study. The clear-sky coefficient was considered in order to remove the dependence on air mass in the irradiance values that reach the sensors [32], through the use of the clear-sky factor (Ics) [33], computed using the polynomial fit model [34]. The work in [35] obtained promising results from the same database using two machine learning models for GHI estimation.
In order to obtain irradiance data independent of air mass variations, we used the clear-sky index kt, defined as the ratio between the global horizontal irradiance (GHI) value (I) and the clear-sky factor (Ics), as shown in Equation (1).
$$k_t = \frac{I}{I_{cs}} \quad (1)$$
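The clear-sky index of Equation (1) is a simple elementwise ratio; a minimal sketch is shown below (the function name and the division-by-zero guard near sunrise/sunset are our own additions, not from the SONDA processing chain):

```python
import numpy as np

def clear_sky_index(ghi, ghi_clear_sky, eps=1e-6):
    """Clear-sky index kt = I / Ics, Equation (1).

    ghi: measured global horizontal irradiance (W/m^2)
    ghi_clear_sky: clear-sky irradiance Ics from a polynomial fit model
    eps: guard against division by zero around sunrise/sunset
    """
    ghi = np.asarray(ghi, dtype=float)
    ics = np.asarray(ghi_clear_sky, dtype=float)
    return ghi / np.maximum(ics, eps)
```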

3. Methodology

Initially, wind speed and irradiance data were acquired and the intervals for the training and test sets were determined. For wind speed data, over a measurement period from 2007 to 2010, the first three years were used as the training set and the last year as the test set. In order to allow the evaluation of the performance of the tested forecasting models and also of the dynamic ensemble methods, this study developed a computational code in Python to evaluate the output values obtained by the well-known machine learning forecasting methods: random forest, k-nearest neighbors (kNN), support vector regression (SVR), and elastic net. For each of the methods, the best performance parameters (lowest root mean squared error (RMSE)) were determined. Right after the stage of acquisition and determination of the optimal parameters for each of the models, the dynamic ensemble windowing and arbitrating methods were executed, from which performance metrics values were also obtained: coefficient of determination (R2), root mean squared error (RMSE), mean absolute error (MAE), and mean absolute percentage error (MAPE). These values were compared to evaluate the efficiency of the dynamic ensemble methods against the standalone models. The variation of the λ parameter for windowing, which is the length of the window of values considered in the data forecast, was also evaluated. The methodology used can be seen in Figure 2.
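The chronological 2007-2009/2010 split described above can be sketched as follows; the function name and the exact cut timestamp are illustrative (the key point is that the series is split in time order, never shuffled, so no future information leaks into training):

```python
import pandas as pd

def chronological_split(df, train_end="2009-12-31 23:50"):
    """Split a time-indexed DataFrame chronologically:
    everything up to `train_end` (2007-2009) is training data,
    everything after it (2010) is the test set."""
    train = df.loc[:train_end]
    test = df.loc[df.index > pd.Timestamp(train_end)]
    return train, test
```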
In the data pre-processing, a recursive approach of Lagged Average values for kt and ν time series was applied: this feature is given by the vector L(t) with components calculated using Equation (2).
$$L_i(t) = \frac{1}{N} \sum_{t' \in [\,t - i\,\delta T,\; t - (i-1)\,\delta T\,]} x(t') \quad (2)$$
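A minimal sketch of the lagged-average feature construction of Equation (2) is given below, assuming a regularly sampled series so that each window of δT spans a fixed number of steps (function and argument names are ours):

```python
import numpy as np

def lagged_average_features(x, n_lags, window):
    """Lagged-average predictors, Equation (2): for each sample t,
    component i is the mean of x over the i-th past window of
    `window` steps, i.e. x[t - i*window : t - (i-1)*window].

    x: 1-D array of kt or wind speed values
    Returns X of shape (len(x), n_lags); rows without a full
    history are left as NaN.
    """
    x = np.asarray(x, dtype=float)
    n = len(x)
    X = np.full((n, n_lags), np.nan)
    for t in range(n):
        for i in range(1, n_lags + 1):
            lo, hi = t - i * window, t - (i - 1) * window
            if lo >= 0:
                X[t, i - 1] = x[lo:hi].mean()
    return X
```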

3.1. Windowing Method

The diversity of the models makes the forecast analysis rich and complex, since each model has its own strengths and weaknesses; from their combination, the best results can be selected to obtain more accurate forecasts. To perform this combination, it is necessary to estimate at which points certain specific models perform better.
Windowing [17] is a dynamic ensemble model where weights are calculated based on the performance of each individual model, evaluated over a data window covering the immediately preceding data. The size of this window is parameterized by the λ value. This means that the weights of each model are re-evaluated at each time step and then ranked so that only the best-performing results are retained, generating a hybrid model.
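The windowing scheme can be sketched as below. This is our own minimal interpretation, using inverse squared error over the last λ observations as the weight of each base model; the exact weighting used in [17] may differ:

```python
import numpy as np

def windowing_forecast(preds, y, lam):
    """Windowing dynamic ensemble (sketch): at each step t, weight each
    base model by the inverse of its mean squared error over the last
    `lam` observations, then combine the models' current predictions.

    preds: array (n_steps, n_models) of base-model forecasts
    y: array (n_steps,) of observed values
    lam: window length (the lambda parameter)
    The first `lam` steps fall back to a simple average because no
    full window is available yet.
    """
    preds = np.asarray(preds, float)
    y = np.asarray(y, float)
    n, _ = preds.shape
    out = np.empty(n)
    for t in range(n):
        if t < lam:
            out[t] = preds[t].mean()
            continue
        err = ((preds[t - lam:t] - y[t - lam:t, None]) ** 2).mean(axis=0)
        w = 1.0 / np.maximum(err, 1e-12)  # low recent error -> high weight
        out[t] = np.dot(w, preds[t]) / w.sum()
    return out
```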

3.2. Arbitrating Method

Arbitrating [36] uses a meta-learning approach to learn to predict the performance of the base learners. In this study, it assigns weights based on each model's performance for a given time step. At each simulation instant, the most reliable model is selected and included in the prediction process.
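A rough sketch of the arbitrating idea follows: one meta-learner per base model is trained to predict that model's error, and the model with the lowest predicted error is selected at each step. The choice of kNN as meta-learner and all names here are ours, not from [36]:

```python
import numpy as np
from sklearn.neighbors import KNeighborsRegressor

def arbitrating_forecast(X_tr, preds_tr, y_tr, X_te, preds_te):
    """Arbitrating dynamic ensemble (sketch).

    preds_tr / preds_te: (n, n_models) base-model predictions.
    Each meta-learner maps input features to the absolute error of
    its base model; at test time, the model with the smallest
    predicted error is selected at each step.
    """
    n_models = preds_tr.shape[1]
    meta = []
    for j in range(n_models):
        m = KNeighborsRegressor(n_neighbors=5)
        m.fit(X_tr, np.abs(preds_tr[:, j] - y_tr))
        meta.append(m)
    est_err = np.column_stack([m.predict(X_te) for m in meta])
    best = est_err.argmin(axis=1)
    return preds_te[np.arange(len(preds_te)), best]
```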

3.3. Machine Learning Prediction Models and Dynamic Ensemble Method Parameters

In the data training stage, GridSearch was used with 5-fold cross-validation. The search parameters are shown in Table 2.
GridSearch is a tool from the Scikit-learn library for Python which combines the parameters of the methods under evaluation and presents the results in a single output object for analysis. This is a very important tool when comparing performance between methods, which is the object of this study.
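The search procedure can be illustrated with scikit-learn's `GridSearchCV` and 5-fold cross-validation; the grid values below are placeholders (the actual search parameters are those of Table 2):

```python
import numpy as np
from sklearn.model_selection import GridSearchCV
from sklearn.svm import SVR

# Illustrative parameter grid; see Table 2 for the grids actually used.
param_grid = {"C": [1, 10], "gamma": ["scale", 0.1]}

# Synthetic stand-in for the lagged-average feature matrix and targets.
X = np.random.RandomState(0).rand(100, 3)
y = X.sum(axis=1)

search = GridSearchCV(SVR(), param_grid, cv=5,
                      scoring="neg_root_mean_squared_error")
search.fit(X, y)
best_model = search.best_estimator_  # model refit on the full training set
```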

3.4. Performance Metrics Comparison Criteria

As the purpose of this work is to evaluate the performance of dynamic ensemble methods against other methods, performance metrics had to be defined to allow this comparison. The metrics used are those of Equations (3)–(6).
  • Coefficient of determination (R2)
$$R^2 = 1 - \frac{\sum_{i=1}^{N} (y_i - \hat{y}_i)^2}{\sum_{i=1}^{N} (y_i - \bar{y})^2} \quad (3)$$
  • Root mean squared error (RMSE)
$$\mathrm{RMSE} = \sqrt{\frac{1}{N} \sum_{i=1}^{N} (y_i - \hat{y}_i)^2} \quad (4)$$
  • Mean absolute error (MAE)
$$\mathrm{MAE} = \frac{1}{N} \sum_{i=1}^{N} \left| \hat{y}_i - y_i \right| \quad (5)$$
  • Mean absolute percentage error (MAPE)
$$\mathrm{MAPE} = \frac{1}{N} \sum_{i=1}^{N} \left| \frac{y_i - \hat{y}_i}{y_i} \right| \quad (6)$$
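The four metrics of Equations (3)–(6) can be computed in a few lines; a minimal sketch (the `metrics` helper name is ours, and MAPE is returned as a fraction rather than a percentage):

```python
import numpy as np

def metrics(y_true, y_pred):
    """R2, RMSE, MAE and MAPE as in Equations (3)-(6)."""
    y_true = np.asarray(y_true, float)
    y_pred = np.asarray(y_pred, float)
    resid = y_true - y_pred
    ss_res = np.sum(resid ** 2)
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)
    return {
        "R2": 1.0 - ss_res / ss_tot,
        "RMSE": np.sqrt(np.mean(resid ** 2)),
        "MAE": np.mean(np.abs(resid)),
        "MAPE": np.mean(np.abs(resid / y_true)),  # multiply by 100 for %
    }
```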

4. Results and Discussion

This section discusses the results generated in this work. It focuses on the analysis of efficiency metrics for the machine learning methods employed. This analysis determines which method/parameters obtain the best performance in the application of wind speed and solar irradiance data.

4.1. Wind Speed Predictions

During the search for best-performance methods, the optimized parameters for each of the tested methods needed to be identified. This allowed for the elaboration of the dynamic ensemble, which was built upon the merging of the best-performance results at each time step and for all the methods in question. The optimal parameters found for each of the time horizons are shown in Table 3.
Efficiency evaluations for each of the forecasting methods were based on performance metrics evaluations for each time horizon under study (t + 10, t + 20, t + 30 and t + 60). Initially, for all time horizons, windowing proved to be the most efficient method. Then, a fine-tuning evaluation was performed based on the variation of the windowing parameter to assess its influence on performance. The predominance of better performance for windowing in all time horizons and its comparisons can be seen in Table 4 and Figure 3.
Elastic net is a penalized linear regression model that combines Ridge and LASSO regression into a single algorithm, using the l1_ratio parameter to set the penalty mix during the training step: 0 corresponds to pure Ridge and 1 to pure LASSO regression. From Table 3, the parameter obtained the value of 1, which means that LASSO regression was used in its entirety.
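This reduction can be verified directly with scikit-learn: with `l1_ratio=1.0` the elastic net penalty is pure L1, so its coefficients match those of a LASSO fit with the same `alpha` (the data below is synthetic, for illustration only):

```python
import numpy as np
from sklearn.linear_model import ElasticNet, Lasso

# Synthetic regression problem with two informative features.
X = np.random.RandomState(1).rand(50, 4)
y = X @ np.array([1.0, 0.0, 2.0, 0.0])

# l1_ratio=1 makes the elastic net penalty pure L1, i.e. LASSO.
enet = ElasticNet(alpha=0.01, l1_ratio=1.0).fit(X, y)
lasso = Lasso(alpha=0.01).fit(X, y)
```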
As with the evaluation employing RMSE, values from R2, MAE, and MAPE were also assessed. Once the best performance was found for the windowing ensemble method, an in-depth analysis was performed based on the variation of its parameter λ to assess the influence on its internal performance. Since the time horizon that presented the best performance was t + 10, this was the focus of the analysis, as shown in Figure 4, Figure 5, Figure 6 and Figure 7. The detailed data for all the horizons is shown in Table 5, Table 6 and Table 7.
When we checked the influence of the λ parameter on the windowing method's performance, it was found that, from λ = 74 onward, windowing is no longer the most efficient method and SVR becomes the best one, due to its lower RMSE value. It is important to highlight that the best performance value for the windowing method, which is the best performance overall, was found for λ = 19. The performance comparison between the two methods can be seen in Figure 8.

4.2. Irradiance Predictions

During the search for best-performance methods, the optimized parameters of each of these methods needed to be known to allow the elaboration of the dynamic ensemble, which is built from merging the best-performance results at each instant and for each of the methods in question. The optimal parameters for each time horizon are shown in Table 8.
Efficiency evaluations for each of the solar irradiance forecasting methods were based on performance metrics for each time horizon under study (t + 10, t + 20, t + 30 and t + 60). Again, windowing proved to be the most efficient method for all time horizons, with the best result being found for the t + 10 time horizon, having the lowest RMSE value, initially using λ = 50. Then, fine-tuning was performed based on the variation of the windowing parameter to assess its influence on performance. The predominance of better performance for windowing in all time horizons and its comparisons can be seen in Table 9 and Figure 9.
Just like the evaluation employing RMSE, values of R2, MAE, and MAPE were also analyzed. After the best performance was found for the windowing method, an in-depth analysis was performed based on the variation of its parameter λ to assess the influence on its internal performance. Since the time horizon that presented the best performance was t + 10, this was the focus of the analysis, as shown in Figure 10, Figure 11, Figure 12 and Figure 13. The detailed data for all tested time horizons is shown in Table 10, Table 11 and Table 12.
Some authors applied the elastic net in time-varying combinations [16], using RMSE as a performance metric. They found that, for PV forecasts, it produced 13.4% more precise forecasts than the simple average, and for wind forecasts, 6.1% better forecasts.
In [21], an ensemble method which used MAPE as the comparative efficiency metric was studied; it proved to be the most efficient, with a value of 9.345% for wind speed data and 7.186% for solar.
In this study, performance improvements were obtained for the most efficient method (windowing) compared to the second most efficient: 0.56% for wind speed and 1.86% for solar irradiation.

4.3. Comparison with Results from the Literature

Performance of the windowing approach was compared with other wind forecasting models found in the literature. It is important to disclose that a direct comparison between different predictive models is not an easy task, since each applied approach has its own objectives, hyperparameters, and input data [22]. To facilitate the comparison against the results found in the literature, Table 13 compiles the results previously presented for the proposed windowing model. The results found in literature for wind speed forecasting are compiled and presented in Table 14, where RMSE and MAE are in m/s.
Analyzing the results for reference [22], in which wind speed was forecasted in the Netherlands using an ensemble approach merging graph theory and attention-based deep learning, we can observe that the proposed windowing ensemble model is not able to surpass the results in either RMSE or MAE for the t + 60 forecasting horizon. The accentuated difference between these two models can be explained because the GNN SAGE GAT model, being developed to handle graph-like data structures, excels in retrieving the complex spatiotemporal relationships underlying the dataset, drastically improving its forecasting capacity when compared with other ML and DL models alike.
In reference [37], the authors proposed wind forecasting for a location in Sweden, with a model based on a bi-directional recurrent neural network, a hierarchical decomposition technique, and an optimization algorithm. When compared with their results, the windowing model proposed in this paper improves on the reference results by 1% for the t + 10 forecasting horizon and by 20% for t + 60. When MAE and MAPE are analyzed, windowing improves these metrics for t + 10 and t + 60, improving the MAE value by 28% for t + 10 and by 9% for t + 60. Regarding MAPE, the improvement is 64% for t + 10 and 95% for t + 60.
In the work of Liu et al. [39], another deep learning-based predictive model was proposed. It used a hybrid approach composed of data area division to extract historical wind speed information and an LSTM layer optimized via a genetic algorithm to process the temporal aspect of the dataset to forecast wind speed in Japan. Compared to this reference, the windowing model showed no improvement for wind speed forecasting. However, the windowing approach offers competitive forecasting for the assessed time windows, being in the same order of magnitude as the ones in the reference. In work [40], the authors proposed the employment of another hybrid forecasting architecture composed of CNN and LSTM deep learning models for wind speed estimation in the USA. Their results, when compared against the windowing methodology, are very similar for all forecasting horizons, showing that both windowing and CNN–LSTM offer good results for wind speed estimation for these time intervals.
In Dowell et al. [38], a statistical model for estimation of future wind speed values in the Netherlands was proposed. For the available t + 60 time horizon, we observe that, again, the forecasted wind speeds for the reference and proposed windowing models are very similar, suggesting both models as valuable tools for wind speed forecasting.
For GHI forecasting, the results found in the literature are presented in Table 15.
In work [23], a deep learning standalone model of CNN was applied to estimate future GHI values in the USA. Comparing the GHI forecasting results achieved via windowing with this reference, we observe that the proposed model was not able to provide superior forecasting performance. However, the windowing results are still competitive since both approaches were able to reach elevated coefficient of determination values for all the assessed forecasting horizons, with a slight advantage for the deep learning model.
In reference [41], the authors combined principal component analysis (PCA) with multivariate empirical model decomposition (MEMD) and gated recurrent unit (GRU) to predict GHI in India. In their methodology, the PCA extracted the most relevant features from the dataset after it was filtered via the MEMD algorithm. Lastly, the future irradiance was estimated via the deep learning model of GRU. Compared to their approach, the windowing model could not improve the GHI forecasting within a t + 60 time window. Also, the reference model MEMD-PCA-GRU provided an elevated R2 value of 99%, showing clearly superior performance over the proposed ensemble model.
When our model is compared with the physical-based forecasting models proposed in [42,43], we can conclude that windowing can achieve similar results for time horizons of t + 30 and t + 60. In [42], the authors used the FY-4A-Heliosat method for satellite imagery to estimate GHI in China. Although the windowing model could not improve on GHI forecasting for the t + 30 and t + 60 time windows, the proposed model was able to return relevant results for irradiance estimation in both cases. The second physical-based model, proposed in [43], was applied to estimate GHI in Finland. In their methodology, the Heliosat method is again employed, together with geostationary weather data from satellite images. Compared to their proposed approach, the windowing model improves GHI forecasting for t + 60 by 8%, providing a significant advance in irradiance estimation.
In work [44], the authors used the state-of-the-art transformer deep learning architecture together with sky images [45] for GHI estimation in the USA. Analyzing their results and the ones provided by the windowing method, we observe that the transformer-based model reaches the best GHI forecasting values for RMSE in all the assessed time windows.
After the comparison of the ensemble windowing approach with reference models found in the literature, we see that it is often competitive and frequently improves wind speed prediction for the assessed forecasting horizons. The results for wind speed prediction using the ensemble model corroborate the results found in the literature, where the ensemble approach often reaches state-of-the-art forecasting in time-series prediction applications [21,46,47,48]. Their improved performance comes from the combination of weaker predictive models to improve their overall forecasting capacity, also reducing the ensembled model's variance [49,50].
However, the proposed dynamic ensemble approach faced increased difficulty when determining future GHI values. This may be an indication that irradiance forecasting is a more complex non-linear natural phenomenon, requiring improved extraction of spatiotemporal information from the dataset. Since the proposed ensemble model does not have a deep learning model in its architecture, it cannot properly identify and extract the spatiotemporal information underlying the dataset, thus failing to provide better irradiance estimation. Deep learning models can often excel in this type of task, as shown in the results from Table 15. Extensive literature can be found regarding improvements in time-series forecasting problems when complex and deep approaches are employed [22,23,51,52].

5. Conclusions

This work proposed to evaluate the performance of two machine learning (ML) dynamic ensemble methods, using wind speed and solar irradiance data separately as inputs. Initially, wind speed and solar irradiance data from the same meteorological station were collected, the time horizons to be studied were determined (t + 10 min, t + 20 min, t + 30 min and t + 60 min), and then a recursive approach of lagged average values was applied to evaluate the models’ predictors.
ML methods well known in other energy forecasting research works regarding wind and irradiance data were selected to compare their efficiency with two other methods that use a dynamic ensemble approach (windowing and arbitrating). Programming code in Python was developed to catalog the optimal efficiency parameters of each previously known model, based on error metrics and the coefficient of determination. The dynamic ensemble methods (windowing and arbitrating), based on the optimal parameters of each previously calibrated model (random forest, k-nearest neighbors, support vector regression, and elastic net), generated a single model with greater efficiency for both wind and solar irradiance data.
For forecasting wind speed data, the most efficient method was found to be windowing for all time horizons, when evaluated by the criterion of the lowest RMSE value, and specifically for the time horizon t + 10, as evidenced in Figure 3. The greatest efficiency was found in an interval of 1 to 74 for the λ parameter, reaching maximum performance for the value λ = 19, as seen in Figure 8, which suggests that the windowing parameterization directly influences the method’s performance.
Structurally, solar radiation data differs from wind data, since the two exhibit different natural cycles and arise from different physical phenomena, presenting different correlations with their historical values, which leads to different trends for the λ parameter in each of the variables.
For solar irradiation forecasting, the most efficient method was also windowing, with the t + 10 min time horizon reaching the lowest RMSE value. Unlike what was found for the wind speed data, a more linear trend was observed in the plot of RMSE against the windowing parameter λ. Over the λ interval under study, the best performance (by the RMSE criterion) was obtained at λ = 1, as can be seen in Figure 10. Unlike all other plots, Figure 12 shows a sudden jump between λ = 1 and λ = 3. Although the reference metric is RMSE, for some other metrics the use of λ = 1 may provide insufficient information to the model, since it will have just one previous time step (window size) as input.
Using wind speed data, the efficiency gain of the most efficient model (windowing for the time horizon t + 10 min and λ = 19, see Table 4) over the second most efficient (SVR) was 0.56% in terms of RMSE. A similar trend was observed for the model using solar irradiance data: the efficiency increase of the most efficient model (windowing for the time horizon t + 10 min and λ = 1, see Table 9) over the second most efficient (arbitrating) was about 1.72%, and over the third most efficient (SVR) about 1.96%.
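These efficiency-gain percentages can be reproduced directly from the RMSE values reported in Tables 4 and 9; `rmse_gain` below is an illustrative helper, not code from the study:

```python
def rmse_gain(rmse_ref, rmse_best):
    """Relative RMSE reduction (%) of the best model versus a reference model."""
    return 100.0 * (rmse_ref - rmse_best) / rmse_ref

wind_gain = rmse_gain(0.69396, 0.69007)       # windowing vs. SVR, wind, t + 10 min (Table 4)
ghi_gain_arb = rmse_gain(74.01000, 72.73186)  # windowing vs. arbitrating, GHI, t + 10 min (Table 9)
ghi_gain_svr = rmse_gain(74.19000, 72.73186)  # windowing vs. SVR, GHI, t + 10 min (Table 9)
```

The resulting values are approximately 0.56%, 1.72%, and 1.96%, matching the gains quoted in the text.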
Also, extensive comparisons with spatiotemporal models from the literature show that the dynamic ensemble model for wind speed often provides superior forecasting performance for the assessed time horizons, making the proposed approach a valuable tool for wind speed estimation. Regarding irradiance forecasting, the dynamic ensemble architecture proposed in this study could not surpass the deep learning-based models, which showed superior spatiotemporal identification and, consequently, better-estimated GHI values. However, the proposed windowing approach provides competitive results and superior GHI forecasting when compared to physics-based predictive models.
For future work, the dynamic ensemble architecture can be improved with the addition of more complex machine learning models, such as the deep learning and graph-based approaches in [22,51,52]. This may boost the windowing forecasting capacity for GHI and wind speed estimation, since such models can exploit the spatiotemporal information underlying the dataset. The models were developed to treat the database in a generalized way; specific studies delimiting seasons and/or times of day can be carried out in the future. An ensemble model able to provide accurate and precise estimations can then be employed in real-time forecasting applications, helping the evaluation of wind and solar farm operation.

Author Contributions

Conceptualization, F.D.V.B., F.P.M. and P.A.C.R.; data curation, F.D.V.B. and F.P.M.; formal analysis, P.A.C.R.; methodology, F.D.V.B., F.P.M. and P.A.C.R.; software, F.D.V.B. and F.P.M.; supervision, P.A.C.R.; validation, P.A.C.R., J.V.G.T. and B.G.; visualization, P.A.C.R., J.V.G.T. and B.G.; writing—original draft, F.D.V.B., F.P.M. and V.O.S.; writing—review and editing, F.D.V.B., F.P.M., P.A.C.R., V.O.S., J.V.G.T. and B.G.; project administration, P.A.C.R.; funding acquisition, P.A.C.R., B.G. and J.V.G.T. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the Natural Sciences and Engineering Research Council of Canada (NSERC) Alliance, grant No. 401643, in association with Lakes Environmental Software Inc., by the Coordenação de Aperfeiçoamento de Pessoal de Nível Superior—Brasil (CAPES)—Finance Code (Grant No. 001), and by the Conselho Nacional de Desenvolvimento Científico e Tecnológico—Brasil (CNPq), grant no. 303585/2022-6.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

The wind speed and irradiation data from Petrolina, PE, Brazil were downloaded from the SONDA (Sistema de Organização Nacional de Dados Ambientais) portal (http://sonda.ccst.inpe.br/, accessed on 12 July 2023).

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. Osman, A.I.; Chen, L.; Yang, M.; Msigwa, G.; Farghali, M.; Fawzy, S.; Rooney, D.W.; Yap, P.S. Cost, environmental impact, and resilience of renewable energy under a changing climate: A review. Environ. Chem. Lett. 2022, 21, 741–764. [Google Scholar] [CrossRef]
  2. Calif, R.; Schmitt, F.G.; Duran Medina, O. −5/3 Kolmogorov turbulent behavior and Intermittent Sustainable Energies. In Sustainable Energy-Technological Issues, Applications and Case Studies; Zobaa, A., Abdel Aleem, S., Affi, S.N., Eds.; Intech: London, UK, 2016. [Google Scholar] [CrossRef]
  3. Carneiro, T.C.; de Carvalho, P.C.M.; dos Santos, H.A.; Lima, M.A.F.B.; de Souza Braga, A.P. Review on Photovoltaic Power and Solar Resource Forecasting: Current Status and Trends. J. Sol. Energy Eng. Trans. ASME 2022, 144, 010801. [Google Scholar] [CrossRef]
  4. Shikhovtsev, A.Y.; Kovadlo, P.G.; Kiselev, A.V.; Eselevich, M.V.; Lukin, V.P. Application of Neural Networks to Estimation and Prediction of Seeing at the Large Solar Telescope Site. Publ. Astron. Soc. Pac. 2023, 135, 014503. [Google Scholar] [CrossRef]
  5. Yuval, J.; O’Gorman, P.A. Neural-Network Parameterization of Subgrid Momentum Transport in the Atmosphere. J. Adv. Model. Earth Syst. 2023, 15, e2023MS003606. [Google Scholar] [CrossRef]
  6. Meenal, R.; Binu, D.; Ramya, K.C.; Michael, P.A.; Vinoth Kumar, K.; Rajasekaran, E.; Sangeetha, B. Weather Forecasting for Renewable Energy System: A Review. Arch. Comput. Methods Eng. 2022, 29, 2875–2891. [Google Scholar] [CrossRef]
  7. Mesa-Jiménez, J.J.; Tzianoumis, A.L.; Stokes, L.; Yang, Q.; Livina, V.N. Long-term wind and solar energy generation forecasts, and optimisation of Power Purchase Agreements. Energy Rep. 2023, 9, 292–302. [Google Scholar] [CrossRef]
  8. Rocha, P.A.C.; Fernandes, J.L.; Modolo, A.B.; Lima, R.J.P.; da Silva, M.E.V.; Bezerra, C.A.D. Estimation of daily, weekly and monthly global solar radiation using ANNs and a long data set: A case study of Fortaleza, in Brazilian Northeast region. Int. J. Energy Environ. Eng. 2019, 10, 319–334. [Google Scholar] [CrossRef]
  9. Du, L.; Gao, R.; Suganthan, P.N.; Wang, D.Z.W. Bayesian optimization based dynamic ensemble for time series forecasting. Inf. Sci. 2022, 591, 155–175. [Google Scholar] [CrossRef]
  10. Vapnik, V.N. Adaptive and Learning Systems for Signal Processing, Communications and Control. In The Nature of Statistical Learning Theory; Springer: Berlin/Heidelberg, Germany, 1995. [Google Scholar]
  11. Smola, A. Regression Estimation with Support Vector Learning Machines. Master’s Thesis, Technische Universität München, Munich, Germany, 1996. [Google Scholar]
  12. Mahesh, P.V.; Meyyappan, S.; Alia, R.K.R. Support Vector Regression Machine Learning based Maximum Power Point Tracking for Solar Photovoltaic systems. Int. J. Electr. Comput. Eng. Syst. 2023, 14, 100–108. [Google Scholar] [CrossRef]
  13. Demir, V.; Citakoglu, H. Forecasting of solar radiation using different machine learning approaches. Neural Comput. Appl. 2023, 35, 887–906. [Google Scholar] [CrossRef]
  14. Schwegmann, S.; Faulhaber, J.; Pfaffel, S.; Yu, Z.; Dörenkämper, M.; Kersting, K.; Gottschall, J. Enabling Virtual Met Masts for wind energy applications through machine learning-methods. Energy AI 2023, 11, 100209. [Google Scholar] [CrossRef]
  15. Che, J.; Yuan, F.; Deng, D.; Jiang, Z. Ultra-short-term probabilistic wind power forecasting with spatial-temporal multi-scale features and K-FSDW based weight. Appl. Energy 2023, 331, 120479. [Google Scholar] [CrossRef]
  16. Nikodinoska, D.; Käso, M.; Müsgens, F. Solar and wind power generation forecasts using elastic net in time-varying forecast combinations. Appl. Energy 2022, 306, 117983. [Google Scholar] [CrossRef]
  17. Cerqueira, V.; Torgo, L.; Pinto, F.; Soares, C. Arbitrage of forecasting experts. Mach. Learn. 2019, 108, 913–944. [Google Scholar] [CrossRef]
  18. Lakku, N.K.G.; Behera, M.R. Skill and Intercomparison of Global Climate Models in Simulating Wind Speed, and Future Changes in Wind Speed over South Asian Domain. Atmosphere 2022, 13, 864. [Google Scholar] [CrossRef]
  19. Ji, L.; Fu, C.; Ju, Z.; Shi, Y.; Wu, S.; Tao, L. Short-Term Canyon Wind Speed Prediction Based on CNN—GRU Transfer Learning. Atmosphere 2022, 13, 813. [Google Scholar] [CrossRef]
  20. Su, X.; Li, T.; An, C.; Wang, G. Prediction of short-time cloud motion using a deep-learning model. Atmosphere 2020, 11, 1151. [Google Scholar] [CrossRef]
  21. Carneiro, T.C.; Rocha, P.A.C.; Carvalho, P.C.M.; Fernández-Ramírez, L.M. Ridge regression ensemble of machine learning models applied to solar and wind forecasting in Brazil and Spain. Appl. Energy 2022, 314, 118936. [Google Scholar] [CrossRef]
  22. Santos, V.O.; Rocha, P.A.C.; Scott, J.; Thé, J.V.G.; Gharabaghi, B. Spatiotemporal analysis of bidimensional wind speed forecasting: Development and thorough assessment of LSTM and ensemble graph neural networks on the Dutch database. Energy 2023, 278, 127852. [Google Scholar] [CrossRef]
  23. Marinho, F.P.; Rocha, P.A.C.; Neto, A.R.; Bezerra, F.D.V. Short-Term Solar Irradiance Forecasting Using CNN-1D, LSTM, and CNN-LSTM Deep Neural Networks: A Case Study with the Folsom (USA) Dataset. J. Sol. Energy Eng. Trans. ASME 2023, 145, 041002. [Google Scholar] [CrossRef]
  24. Wu, Q.; Zheng, H.; Guo, X.; Liu, G. Promoting wind energy for sustainable development by precise wind speed prediction based on graph neural networks. Renew. Energy 2022, 199, 977–992. [Google Scholar] [CrossRef]
  25. Oliveira Santos, V.; Costa Rocha, P.A.; Thé, J.V.G.; Gharabaghi, B. Graph-Based Deep Learning Model for Forecasting Chloride Concentration in Urban Streams to Protect Salt-Vulnerable Areas. Environments 2023, 10, 157. [Google Scholar] [CrossRef]
  26. Tabrizi, S.E.; Xiao, K.; van Griensven Thé, J.; Saad, M.; Farghaly, H.; Yang, S.X.; Gharabaghi, B. Hourly road pavement surface temperature forecasting using deep learning models. J. Hydrol. 2021, 603, 126877. [Google Scholar] [CrossRef]
  27. Zhang, Y.; Gu, Z.; Thé, J.V.G.; Yang, S.X.; Gharabaghi, B. The Discharge Forecasting of Multiple Monitoring Station for Humber River by Hybrid LSTM Models. Water 2022, 14, 1794. [Google Scholar] [CrossRef]
  28. INPE. SONDA—Sistema de Organização Nacional de Dados Ambientais. 2012. Available online: http://sonda.ccst.inpe.br/ (accessed on 26 September 2023).
  29. GOOGLE. Google Earth Website. Available online: http://earth.google.com/ (accessed on 12 July 2023).
  30. Peel, M.C.; Finlayson, B.L.; McMahon, T.A. Updated world map of the Köppen-Geiger climate classification. Hydrol. Earth Syst. Sci. 2007, 11, 1633–1644. [Google Scholar] [CrossRef]
  31. Landberg, L.; Myllerup, L.; Rathmann, O.; Petersen, E.L.; Jørgensen, B.H.; Badger, J.; Mortensen, N.G. Wind resource estimation—An overview. Wind. Energy 2003, 6, 261–271. [Google Scholar] [CrossRef]
  32. Kasten, F.; Czeplak, G. Solar and terrestrial radiation dependent on the amount and type of cloud. Sol. Energy 1980, 24, 177–189. [Google Scholar] [CrossRef]
  33. Ineichen, P.; Perez, R. A new airmass independent formulation for the linke turbidity coefficient. Sol. Energy 2002, 73, 151–157. [Google Scholar] [CrossRef]
  34. Marquez, R.; Coimbra, C.F.M. Proposed metric for evaluation of solar forecasting models. J. Sol. Energy Eng. Trans. ASME 2013, 135, 011016. [Google Scholar] [CrossRef]
  35. Rocha, P.A.C.; Santos, V.O. Global horizontal and direct normal solar irradiance modeling by the machine learning methods XGBoost and deep neural networks with CNN-LSTM layers: A case study using the GOES-16 satellite imagery. Int. J. Energy Environ. Eng. 2022, 13, 1271–1286. [Google Scholar] [CrossRef]
  36. Cerqueira, V.; Torgo, L.; Soares, C. Arbitrated ensemble for solar radiation forecasting. In Advances in Computational Intelligence; Lecture Notes in Computer Science (Including Subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics); Springer: Berlin/Heidelberg, Germany, 2017. [Google Scholar] [CrossRef]
  37. Neshat, M.; Nezhad, M.M.; Abbasnejad, E.; Mirjalili, S.; Tjernberg, L.B.; Astiaso Garcia, D.; Alexander, B.; Wagner, M. A deep learning-based evolutionary model for short-term wind speed forecasting: A case study of the Lillgrund offshore wind farm. Energy Convers. Manag. 2021, 236, 114002. [Google Scholar] [CrossRef]
  38. Dowell, J.; Weiss, S.; Infield, D. Spatio-temporal prediction of wind speed and direction by continuous directional regime. In Proceedings of the 2014 International Conference on Probabilistic Methods Applied to Power Systems, PMAPS 2014, Durham, UK, 7–10 July 2014. [Google Scholar] [CrossRef]
  39. Liu, Z.; Hara, R.; Kita, H. Hybrid forecasting system based on data area division and deep learning neural network for short-term wind speed forecasting. Energy Convers. Manag. 2021, 238, 114136. [Google Scholar] [CrossRef]
  40. Zhu, Q.; Chen, J.; Shi, D.; Zhu, L.; Bai, X.; Duan, X.; Liu, Y. Learning Temporal and Spatial Correlations Jointly: A Unified Framework for Wind Speed Prediction. IEEE Trans. Sustain. Energy 2020, 11, 509–523. [Google Scholar] [CrossRef]
  41. Gupta, P.; Singh, R. Combining a deep learning model with multivariate empirical mode decomposition for hourly global horizontal irradiance forecasting. Renew. Energy 2023, 206, 908–927. [Google Scholar] [CrossRef]
  42. Yang, L.; Gao, X.; Hua, J.; Wang, L. Intra-day global horizontal irradiance forecast using FY-4A clear sky index. Sustain. Energy Technol. Assess. 2022, 50, 101816. [Google Scholar] [CrossRef]
  43. Kallio-Myers, V.; Riihelä, A.; Lahtinen, P.; Lindfors, A. Global horizontal irradiance forecast for Finland based on geostationary weather satellite data. Sol. Energy 2020, 198, 68–80. [Google Scholar] [CrossRef]
  44. Liu, J.; Zang, H.; Cheng, L.; Ding, T.; Wei, Z.; Sun, G. A Transformer-based multimodal-learning framework using sky images for ultra-short-term solar irradiance forecasting. Appl. Energy 2023, 342, 121160. [Google Scholar] [CrossRef]
  45. Vaswani, A.; Shazeer, N.; Parmar, N.; Uszkoreit, J.; Jones, L.; Gomez, A.N.; Kaiser, Ł.; Polosukhin, I. Attention is all you need. In Proceedings of the Advances in Neural Information Processing Systems, Long Beach, CA, USA, 4–9 December 2017; pp. 5998–6008. [Google Scholar]
  46. Peng, Z.; Peng, S.; Fu, L.; Lu, B.; Tang, J.; Wang, K.; Li, W. A novel deep learning ensemble model with data denoising for short-term wind speed forecasting. Energy Convers. Manag. 2020, 207, 112524. [Google Scholar] [CrossRef]
  47. Abdellatif, A.; Mubarak, H.; Ahmad, S.; Ahmed, T.; Shafiullah, G.M.; Hammoudeh, A.; Abdellatef, H.; Rahman, M.M.; Gheni, H.M. Forecasting Photovoltaic Power Generation with a Stacking Ensemble Model. Sustainability 2022, 14, 11083. [Google Scholar] [CrossRef]
  48. Wu, H.; Levinson, D. The ensemble approach to forecasting: A review and synthesis. Transp. Res. Part C Emerg. Technol. 2021, 132, 103357. [Google Scholar] [CrossRef]
  49. Ghojogh, B.; Crowley, M. The Theory behind Overfitting, cross Validation, Regularization, Bagging and Boosting: Tutorial. arXiv 2023, arXiv:1905.12787. [Google Scholar]
  50. Chen, T.; Guestrin, C. XGBoost: A scalable tree boosting system. In Proceedings of the ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, San Francisco, CA, USA, 14–18 August 2016. [Google Scholar] [CrossRef]
  51. Oliveira Santos, V.; Costa Rocha, P.A.; Scott, J.; Van Griensven Thé, J.; Gharabaghi, B. Spatiotemporal Air Pollution Forecasting in Houston-TX: A Case Study for Ozone Using Deep Graph Neural Networks. Atmosphere 2023, 14, 308. [Google Scholar] [CrossRef]
  52. Oliveira Santos, V.; Costa Rocha, P.A.; Scott, J.; Thé, J.V.G.; Gharabaghi, B. A New Graph-Based Deep Learning Model to Predict Flooding with Validation on a Case Study on the Humber River. Water 2023, 15, 1827. [Google Scholar] [CrossRef]
Figure 1. Map of the northeast of Brazil. The Petrolina measurement site is highlighted [29].
Figure 2. Diagram of the data flow for the applied methodology.
Figure 3. Windowing λ parameter variation influence on RMSE in the wind speed data analysis, for all the studied time horizons.
Figure 4. Windowing λ parameter influence on RMSE value for the time horizon t + 10.
Figure 5. Windowing λ parameter influence on MAE value for the time horizon t + 10.
Figure 6. Windowing λ parameter influence on R2 value for the time horizon t + 10.
Figure 7. Windowing λ parameter influence on MAPE value for the time horizon t + 10.
Figure 8. Effect of the λ parameter variation on the method's performance. The SVR result is shown for reference.
Figure 9. Windowing λ parameter variation influence on RMSE in the solar irradiation data analysis, for all the studied time horizons.
Figure 10. Windowing λ parameter influence on RMSE value for the time horizon t + 10.
Figure 11. Windowing λ parameter influence on R2 value for the time horizon t + 10.
Figure 12. Windowing λ parameter influence on MAE value for the time horizon t + 10.
Figure 13. Windowing λ parameter influence on MAPE value for the time horizon t + 10.
Table 1. Geographic coordinates, altitude relative to sea level, measurement intervals, and measurement periods of the data collected from the Petrolina station. MI and MP stand for, respectively, “measurement interval” and “measurement period”.
Type | Lat. (°) | Long. (°) | Alt. (m) | MI (min) | MP
Anemometric | 09°04′08″ S | 40°19′11″ W | 387 | 10 | 1 January 2007 to 12 December 2010
Solarimetric | 09°04′08″ S | 40°19′11″ W | 387 | 10 | 1 January 2010 to 12 December 2010
Table 2. Search parameters and grid values applied to the tested methods.
Method | Search Parameter | Grid Values
Random forest | max_depth | [2, 5, 7, 9, 11, 13, 15, 21, 35]
KNN | nearest neighbours k | 1 ≤ k ≤ 50, k integer
SVR | penalty term C | [0.1, 1, 10, 100, 1000]
SVR | coefficient λ | [1, 0.1, 0.01, 0.001, 0.0001]
Elastic net | regularization term λ | [1, 0.1, 0.01, 0.001, 0.0001]
Windowing | λ | [1, 3, 6, 12, 25, 50, 100]
Arbitrating | * | -
*: Due to the use of a meta-heuristic methodology, the initial parameter was not needed.
Table 3. Best parameters for each machine learning method.
Method | Parameter | t + 10 | t + 20 | t + 30 | t + 60
Random forest | best_max_depth | 7 | 7 | 7 | 7
Random forest | best_n_estimators | 20 | 20 | 20 | 20
KNN | best_n_neighbors | 49 | 49 | 49 | 49
SVR | best_C | 1 | 1 | 1 | 1
SVR | best_epsilon | 0.1 | 0.1 | 1 | 0.1
Elastic net | best_l1_ratio | 1 | 1 | 1 | 1
Table 4. Comparison of RMSE (m/s) values, using different methods for different time horizons and windowing λ parameter variation. The best results for each time horizon are in bold.
Time Horizon | λ | RF | KNN | SVR | Elastic Net | Windowing | Arbitrating
t + 10 min | 1 | 0.69458 | 0.71040 | 0.69396 | 0.69828 | 0.69263 | 0.69447
t + 10 min | 3 | - | - | - | - | 0.69180 | -
t + 10 min | 6 | - | - | - | - | 0.69114 | -
t + 10 min | 12 | - | - | - | - | 0.69041 | -
t + 10 min | 19 | - | - | - | - | 0.69007 | -
t + 10 min | 25 | - | - | - | - | 0.69040 | -
t + 10 min | 50 | - | - | - | - | 0.69226 | -
t + 10 min | 74 | - | - | - | - | 0.69402 | -
t + 10 min | 100 | - | - | - | - | 0.69431 | -
t + 20 min | 1 | 0.88310 | 0.89332 | 0.88372 | 0.88554 | 0.86817 | 0.88315
t + 20 min | 3 | - | - | - | - | 0.87353 | -
t + 20 min | 6 | - | - | - | - | 0.87563 | -
t + 20 min | 12 | - | - | - | - | 0.87699 | -
t + 20 min | 25 | - | - | - | - | 0.87803 | -
t + 20 min | 50 | - | - | - | - | 0.87889 | -
t + 20 min | 100 | - | - | - | - | 0.87960 | -
t + 30 min | 1 | 0.99469 | 0.99859 | 0.99130 | 0.99660 | 0.97497 | 0.99091
t + 30 min | 3 | - | - | - | - | 0.98017 | -
t + 30 min | 6 | - | - | - | - | 0.98333 | -
t + 30 min | 12 | - | - | - | - | 0.98583 | -
t + 30 min | 25 | - | - | - | - | 0.98702 | -
t + 30 min | 50 | - | - | - | - | 0.98832 | -
t + 30 min | 100 | - | - | - | - | 0.98902 | -
t + 60 min | 1 | 1.18092 | 1.19527 | 1.17764 | 1.18281 | 1.15150 | 1.18156
t + 60 min | 3 | - | - | - | - | 1.15647 | -
t + 60 min | 6 | - | - | - | - | 1.16170 | -
t + 60 min | 12 | - | - | - | - | 1.16685 | -
t + 60 min | 25 | - | - | - | - | 1.16987 | -
t + 60 min | 50 | - | - | - | - | 1.17254 | -
t + 60 min | 100 | - | - | - | - | 1.17455 | -
Table 5. Comparison of MAE (m/s) values, using different methods in different time horizons and windowing λ parameter variation. The best results for each time horizon are in bold.
Time Horizon | λ | RF | KNN | SVR | Elastic Net | Windowing | Arbitrating
t + 10 min | 1 | 0.51592 | 0.53216 | 0.51438 | 0.51853 | 0.51384 | 0.51711
t + 10 min | 3 | - | - | - | - | 0.51366 | -
t + 10 min | 6 | - | - | - | - | 0.51328 | -
t + 10 min | 12 | - | - | - | - | 0.51276 | -
t + 10 min | 19 | - | - | - | - | 0.51272 | -
t + 10 min | 25 | - | - | - | - | 0.51301 | -
t + 10 min | 50 | - | - | - | - | 0.51441 | -
t + 10 min | 74 | - | - | - | - | 0.51574 | -
t + 10 min | 100 | - | - | - | - | 0.51603 | -
t + 20 min | 1 | 0.65845 | 0.66882 | 0.66040 | 0.65990 | 0.64663 | 0.65936
t + 20 min | 3 | - | - | - | - | 0.65140 | -
t + 20 min | 6 | - | - | - | - | 0.65332 | -
t + 20 min | 12 | - | - | - | - | 0.65435 | -
t + 20 min | 25 | - | - | - | - | 0.65554 | -
t + 20 min | 50 | - | - | - | - | 0.65637 | -
t + 20 min | 100 | - | - | - | - | 0.65695 | -
t + 30 min | 1 | 0.74250 | 0.74735 | 0.74125 | 0.74347 | 0.72594 | 0.74097
t + 30 min | 3 | - | - | - | - | 0.73105 | -
t + 30 min | 6 | - | - | - | - | 0.73402 | -
t + 30 min | 12 | - | - | - | - | 0.73625 | -
t + 30 min | 25 | - | - | - | - | 0.73732 | -
t + 30 min | 50 | - | - | - | - | 0.73846 | -
t + 30 min | 100 | - | - | - | - | 0.73902 | -
t + 60 min | 1 | 0.89496 | 0.90753 | 0.89179 | 0.89589 | 0.86784 | 0.89570
t + 60 min | 3 | - | - | - | - | 0.87277 | -
t + 60 min | 6 | - | - | - | - | 0.87826 | -
t + 60 min | 12 | - | - | - | - | 0.88307 | -
t + 60 min | 25 | - | - | - | - | 0.88580 | -
t + 60 min | 50 | - | - | - | - | 0.88813 | -
t + 60 min | 100 | - | - | - | - | 0.88963 | -
Table 6. Comparison of R2 values, using different methods in different time horizons and windowing λ parameter variation. The best results for each time horizon are in bold.
Time Horizon | λ | RF | KNN | SVR | Elastic Net | Windowing | Arbitrating
t + 10 min | 1 | 0.84248 | 0.83522 | 0.84275 | 0.84079 | 0.84336 | 0.84252
t + 10 min | 3 | - | - | - | - | 0.84373 | -
t + 10 min | 6 | - | - | - | - | 0.84403 | -
t + 10 min | 12 | - | - | - | - | 0.84436 | -
t + 10 min | 19 | - | - | - | - | 0.84451 | -
t + 10 min | 25 | - | - | - | - | 0.84436 | -
t + 10 min | 50 | - | - | - | - | 0.84353 | -
t + 10 min | 74 | - | - | - | - | 0.84273 | -
t + 10 min | 100 | - | - | - | - | 0.84260 | -
t + 20 min | 1 | 0.74534 | 0.73941 | 0.74498 | 0.74393 | 0.75388 | 0.74531
t + 20 min | 3 | - | - | - | - | 0.75083 | -
t + 20 min | 6 | - | - | - | - | 0.74963 | -
t + 20 min | 12 | - | - | - | - | 0.74885 | -
t + 20 min | 25 | - | - | - | - | 0.74825 | -
t + 20 min | 50 | - | - | - | - | 0.74776 | -
t + 20 min | 100 | - | - | - | - | 0.74736 | -
t + 30 min | 1 | 0.67690 | 0.67436 | 0.67909 | 0.67566 | 0.68958 | 0.67935
t + 30 min | 3 | - | - | - | - | 0.68626 | -
t + 30 min | 6 | - | - | - | - | 0.68423 | -
t + 30 min | 12 | - | - | - | - | 0.68262 | -
t + 30 min | 25 | - | - | - | - | 0.68186 | -
t + 30 min | 50 | - | - | - | - | 0.68102 | -
t + 30 min | 100 | - | - | - | - | 0.68057 | -
t + 60 min | 1 | 0.54443 | 0.53329 | 0.54695 | 0.54297 | 0.56685 | 0.54393
t + 60 min | 3 | - | - | - | - | 0.56310 | -
t + 60 min | 6 | - | - | - | - | 0.55914 | -
t + 60 min | 12 | - | - | - | - | 0.55522 | -
t + 60 min | 25 | - | - | - | - | 0.55291 | -
t + 60 min | 50 | - | - | - | - | 0.55087 | -
t + 60 min | 100 | - | - | - | - | 0.54933 | -
Table 7. Comparison of MAPE values, using different methods in different time horizons and windowing λ parameter variation. The best results for each time horizon are in bold.
Time Horizon | λ | RF | KNN | SVR | Elastic Net | Windowing | Arbitrating
t + 10 min | 1 | 0.21277 | 0.25360 | 0.20257 | 0.21848 | 0.21040 | 0.21634
t + 10 min | 3 | - | - | - | - | 0.21122 | -
t + 10 min | 6 | - | - | - | - | 0.21092 | -
t + 10 min | 12 | - | - | - | - | 0.21040 | -
t + 10 min | 19 | - | - | - | - | 0.21022 | -
t + 10 min | 25 | - | - | - | - | 0.21075 | -
t + 10 min | 50 | - | - | - | - | 0.21179 | -
t + 10 min | 74 | - | - | - | - | 0.21246 | -
t + 10 min | 100 | - | - | - | - | 0.21234 | -
t + 20 min | 1 | 0.31534 | 0.33823 | 0.34178 | 0.31206 | 0.31280 | 0.32577
t + 20 min | 3 | - | - | - | - | 0.31558 | -
t + 20 min | 6 | - | - | - | - | 0.31658 | -
t + 20 min | 12 | - | - | - | - | 0.31745 | -
t + 20 min | 25 | - | - | - | - | 0.31906 | -
t + 20 min | 50 | - | - | - | - | 0.31990 | -
t + 20 min | 100 | - | - | - | - | 0.32101 | -
t + 30 min | 1 | 0.38089 | 0.39786 | 0.37520 | 0.37064 | 0.36711 | 0.38499
t + 30 min | 3 | - | - | - | - | 0.36968 | -
t + 30 min | 6 | - | - | - | - | 0.37245 | -
t + 30 min | 12 | - | - | - | - | 0.37227 | -
t + 30 min | 25 | - | - | - | - | 0.37367 | -
t + 30 min | 50 | - | - | - | - | 0.37352 | -
t + 30 min | 100 | - | - | - | - | 0.37538 | -
t + 60 min | 1 | 0.52320 | 0.53567 | 0.51731 | 0.51284 | 0.50552 | 0.52440
t + 60 min | 3 | - | - | - | - | 0.50730 | -
t + 60 min | 6 | - | - | - | - | 0.51189 | -
t + 60 min | 12 | - | - | - | - | 0.51289 | -
t + 60 min | 25 | - | - | - | - | 0.51480 | -
t + 60 min | 50 | - | - | - | - | 0.51571 | -
t + 60 min | 100 | - | - | - | - | 0.51872 | -
Table 8. Best parameters for each machine learning method.
Method | Parameter | t + 10 | t + 20 | t + 30 | t + 60
Random forest | best_max_depth | 5 | 5 | 5 | 5
Random forest | best_n_estimators | 20 | 20 | 20 | 20
KNN | best_n_neighbors | 37 | 37 | 49 | 48
SVR | best_C | 0.1 | 0.1 | 0.1 | 0.1
SVR | best_epsilon | 0.1 | 0.1 | 0.1 | 0.1
Elastic net | best_l1_ratio | 1 | 1 | 1 | 1
Table 9. Comparison of RMSE (W/m2) values, using different methods in different time horizons and windowing λ parameter variation. The best results for each time horizon are in bold.
Time Horizon | λ | RF | KNN | SVR | Elastic Net | Windowing | Arbitrating
t + 10 min | 1 | 75.02000 | 75.26000 | 74.19000 | 74.98000 | 72.73186 | 74.01000
t + 10 min | 3 | - | - | - | - | 72.93221 | -
t + 10 min | 6 | - | - | - | - | 73.29363 | -
t + 10 min | 12 | - | - | - | - | 73.21035 | -
t + 10 min | 25 | - | - | - | - | 73.24620 | -
t + 10 min | 50 | - | - | - | - | 73.48055 | -
t + 10 min | 100 | - | - | - | - | 73.69330 | -
t + 20 min | 1 | 90.94000 | 83.50000 | 84.45000 | 84.53000 | 80.07000 | 83.19000
t + 20 min | 3 | - | - | - | - | 80.63000 | -
t + 20 min | 6 | - | - | - | - | 81.19000 | -
t + 20 min | 12 | - | - | - | - | 81.87000 | -
t + 20 min | 25 | - | - | - | - | 82.56000 | -
t + 20 min | 50 | - | - | - | - | 82.11000 | -
t + 20 min | 100 | - | - | - | - | 82.57000 | -
t + 30 min | 1 | 90.15000 | 90.50000 | 91.49000 | 93.49000 | 86.25000 | 89.70000
t + 30 min | 3 | - | - | - | - | 87.00000 | -
t + 30 min | 6 | - | - | - | - | 87.75000 | -
t + 30 min | 12 | - | - | - | - | 88.33000 | -
t + 30 min | 25 | - | - | - | - | 88.95000 | -
t + 30 min | 50 | - | - | - | - | 88.70000 | -
t + 30 min | 100 | - | - | - | - | 89.01000 | -
t + 60 min | 1 | 112.05000 | 112.13000 | 112.76000 | 118.08000 | 105.51000 | 111.13000
t + 60 min | 3 | - | - | - | - | 106.62000 | -
t + 60 min | 6 | - | - | - | - | 107.76000 | -
t + 60 min | 12 | - | - | - | - | 108.89000 | -
t + 60 min | 25 | - | - | - | - | 109.32000 | -
t + 60 min | 50 | - | - | - | - | 110.12000 | -
t + 60 min | 100 | - | - | - | - | 110.30000 | -
Table 10. Comparison of R2 values, using different methods in different time horizons and windowing λ parameter variation. The best results for each time horizon are in bold.
Time Horizon | λ | RF | KNN | SVR | Elastic Net | Windowing | Arbitrating
t + 10 min | 1 | 0.92000 | 0.92000 | 0.92000 | 0.92000 | 0.92184 | 0.92000
t + 10 min | 3 | - | - | - | - | 0.92141 | -
t + 10 min | 6 | - | - | - | - | 0.92062 | -
t + 10 min | 12 | - | - | - | - | 0.92080 | -
t + 10 min | 25 | - | - | - | - | 0.92073 | -
t + 10 min | 50 | - | - | - | - | 0.92022 | -
t + 10 min | 100 | - | - | - | - | 0.91976 | -
t + 20 min | 1 | 0.88000 | 0.90000 | 0.90000 | 0.90000 | 0.91000 | 0.90000
t + 20 min | 3 | - | - | - | - | 0.91000 | -
t + 20 min | 6 | - | - | - | - | 0.90000 | -
t + 20 min | 12 | - | - | - | - | 0.90000 | -
t + 20 min | 25 | - | - | - | - | 0.90000 | -
t + 20 min | 50 | - | - | - | - | 0.90000 | -
t + 20 min | 100 | - | - | - | - | 0.90000 | -
t + 30 min | 1 | 0.88000 | 0.88000 | 0.88000 | 0.87000 | 0.89000 | 0.88000
t + 30 min | 3 | - | - | - | - | 0.89000 | -
t + 30 min | 6 | - | - | - | - | 0.89000 | -
t + 30 min | 12 | - | - | - | - | 0.89000 | -
t + 30 min | 25 | - | - | - | - | 0.89000 | -
t + 30 min | 50 | - | - | - | - | 0.88000 | -
t + 30 min | 100 | - | - | - | - | 0.89000 | -
t + 60 min | 1 | 0.83000 | 0.83000 | 0.82000 | 0.51223 | 0.85000 | 0.83000
t + 60 min | 3 | - | - | - | - | 0.84000 | -
t + 60 min | 6 | - | - | - | - | 0.84000 | -
t + 60 min | 12 | - | - | - | - | 0.84000 | -
t + 60 min | 25 | - | - | - | - | 0.83000 | -
t + 60 min | 50 | - | - | - | - | 0.83000 | -
t + 60 min | 100 | - | - | - | - | 0.83000 | -
Table 11. Comparison of MAE (W/m2) values, using different methods in different time horizons and windowing λ parameter variation. The best results for each time horizon are in bold.
Time Horizon | λ | RF | KNN | SVR | Elastic Net | Windowing | Arbitrating
t + 10 min | 1 | 48.29000 | 48.47000 | 44.16000 | 49.31000 | 72.73186 | 46.24000
t + 10 min | 3 | - | - | - | - | 44.52301 | -
t + 10 min | 6 | - | - | - | - | 45.00717 | -
t + 10 min | 12 | - | - | - | - | 45.27759 | -
t + 10 min | 25 | - | - | - | - | 45.67924 | -
t + 10 min | 50 | - | - | - | - | 45.79140 | -
t + 10 min | 100 | - | - | - | - | 46.16632 | -
t + 20 min | 1 | 65.19000 | 55.63000 | 59.67000 | 58.86000 | 52.53000 | 56.20000
t + 20 min | 3 | - | - | - | - | 53.31000 | -
t + 20 min | 6 | - | - | - | - | 54.12000 | -
t + 20 min | 12 | - | - | - | - | 55.27000 | -
t + 20 min | 25 | - | - | - | - | 56.88000 | -
t + 20 min | 50 | - | - | - | - | 55.59000 | -
t + 20 min | 100 | - | - | - | - | 56.79000 | -
t + 30 min | 1 | 62.09000 | 61.58000 | 64.77000 | 67.13000 | 58.14000 | 60.91000
t + 30 min | 3 | - | - | - | - | 59.02000 | -
t + 30 min | 6 | - | - | - | - | 59.91000 | -
t + 30 min | 12 | - | - | - | - | 60.85000 | -
t + 30 min | 25 | - | - | - | - | 61.34000 | -
t + 30 min | 50 | - | - | - | - | 61.84000 | -
t + 30 min | 100 | - | - | - | - | 61.51000 | -
t + 60 min | 1 | 81.28000 | 79.84000 | 81.44000 | 89.07000 | 74.59000 | 79.80000
t + 60 min | 3 | - | - | - | - | 75.92000 | -
t + 60 min | 6 | - | - | - | - | 77.11000 | -
t + 60 min | 12 | - | - | - | - | 78.47000 | -
t + 60 min | 25 | - | - | - | - | 79.08000 | -
t + 60 min | 50 | - | - | - | - | 79.48000 | -
t + 60 min | 100 | - | - | - | - | 79.63000 | -
Table 12. Comparison of MAPE values, using different methods in different time horizons and windowing λ parameter variation. The best results for each time horizon are in bold.
Time Horizon | λ | RF | KNN | SVR | Elastic Net | Windowing | Arbitrating
t + 10 min | 1 | 0.22000 | 0.24000 | 0.21000 | 0.23000 | 0.20701 | 0.22000
t + 10 min | 3 | - | - | - | - | 0.21027 | -
t + 10 min | 6 | - | - | - | - | 0.21254 | -
t + 10 min | 12 | - | - | - | - | 0.21364 | -
t + 10 min | 25 | - | - | - | - | 0.21444 | -
t + 10 min | 50 | - | - | - | - | 0.21541 | -
t + 10 min | 100 | - | - | - | - | 0.21684 | -
t + 20 min | 1 | 0.32000 | 0.28000 | 0.28000 | 0.27000 | 0.25000 | 0.27000
t + 20 min | 3 | - | - | - | - | 0.25000 | -
t + 20 min | 6 | - | - | - | - | 0.26000 | -
t + 20 min | 12 | - | - | - | - | 0.26000 | -
t + 20 min | 25 | - | - | - | - | 0.27000 | -
t + 20 min | 50 | - | - | - | - | 0.26000 | -
t + 20 min | 100 | - | - | - | - | 0.27000 | -
t + 30 min | 1 | 0.29000 | 0.30000 | 0.29000 | 0.33000 | 0.27000 | 0.29000
t + 30 min | 3 | - | - | - | - | 0.28000 | -
t + 30 min | 6 | - | - | - | - | 0.28000 | -
t + 30 min | 12 | - | - | - | - | 0.28000 | -
t + 30 min | 25 | - | - | - | - | 0.29000 | -
t + 30 min | 50 | - | - | - | - | 0.29000 | -
t + 30 min | 100 | - | - | - | - | 0.29000 | -
t + 60 min | 1 | 0.34000 | 0.35000 | 0.34000 | 0.54747 | 0.32000 | 0.34000
t + 60 min | 3 | - | - | - | - | 0.32000 | -
t + 60 min | 6 | - | - | - | - | 0.33000 | -
t + 60 min | 12 | - | - | - | - | 0.33000 | -
t + 60 min | 25 | - | - | - | - | 0.34000 | -
t + 60 min | 50 | - | - | - | - | 0.34000 | -
t + 60 min | 100 | - | - | - | - | 0.34000 | -
Table 13. Compilation of the windowing’s results for different time horizons.
Metric | Time Horizon | Wind Speed | GHI
RMSE | t + 10 | 0.69007 m/s | 72.73186 W/m2
RMSE | t + 20 | 0.86817 m/s | 80.07 W/m2
RMSE | t + 30 | 0.97497 m/s | 86.25 W/m2
RMSE | t + 60 | 1.1515 m/s | 105.51 W/m2
R2 | t + 10 | 0.84451 | 0.92184
R2 | t + 20 | 0.75388 | 0.91
R2 | t + 30 | 0.68958 | 0.89
R2 | t + 60 | 0.56685 | 0.85
MAE | t + 10 | 0.51272 m/s | 44.52301 W/m2
MAE | t + 20 | 0.64663 m/s | 52.53 W/m2
MAE | t + 30 | 0.72594 m/s | 58.14 W/m2
MAE | t + 60 | 0.86784 m/s | 74.59 W/m2
MAPE | t + 10 | 0.21022 | 0.20701
MAPE | t + 20 | 0.3128 | 0.25
MAPE | t + 30 | 0.36711 | 0.27
MAPE | t + 60 | 0.50552 | 0.32
Table 14. Compilation of results for wind speed forecasting.
Model | Metric Value | Author
GNN SAGE GAT | RMSE: 0.638 (t + 60); MAE: 0.458 (t + 60) | Oliveira Santos et al. [22]
ED-HGNDO-BiLSTM | RMSE: 0.696 avg. (t + 10), 1.445 avg. (t + 60); MAE: 0.717 avg. (t + 10), 0.953 avg. (t + 60); MAPE: 0.590 avg. (t + 10), 9.769 avg. (t + 60) | Neshat et al. [37]
Statistical model for wind speed forecasting | RMSE: 1.090 (t + 60) | Dowell et al. [38]
Hybrid wind speed forecasting model using the data area division (DAD) method and a deep learning neural network | RMSE: 0.291 avg. (t + 10), 0.355 avg. (t + 30), 0.426 avg. (t + 60); MAE: 0.221 avg. (t + 10), 0.293 avg. (t + 30), 0.364 avg. (t + 60) | Liu et al. [39]
Hybrid CNN-LSTM model | RMSE: 0.547 (t + 10), 0.802 (t + 20), 0.895 (t + 30), 1.114 (t + 60); MAPE: 4.385 (t + 10), 6.023 (t + 20), 7.510 (t + 30), 11.127 (t + 60) | Zhu et al. [40]
Table 15. Compilation of results for GHI forecasting.
Model | Metric Value | Author
CNN-1D | RMSE (R2): 36.24 (0.98) (t + 10), 39.00 (0.98) (t + 20), 38.46 (0.98) (t + 30) | Marinho et al. [23]
MEMD-PCA-GRU | RMSE (R2): 31.92 (0.99) (t + 60) | Gupta and Singh [41]
Physical-based forecasting model | RMSE: 75.91 (t + 30), 89.81 (t + 60); MAE: 48.85 (t + 30), 57.01 (t + 60) | Yang et al. [42]
Physical-based forecasting model | RMSE: 114.06 (t + 60) | Kallio-Myers et al. [43]
Deep learning transformer-based forecasting model | MAE: 34.21 (t + 10), 43.64 (t + 20), 49.53 (t + 30) | Liu et al. [44]