Article

Comparative Analysis of Machine Learning Techniques in Predicting Wind Power Generation: A Case Study of 2018–2021 Data from Guatemala

by Berny Carrera and Kwanho Kim *
Department of Industrial and Systems Engineering, Dongguk University, Seoul 04620, Republic of Korea
* Author to whom correspondence should be addressed.
Energies 2024, 17(13), 3158; https://doi.org/10.3390/en17133158
Submission received: 15 May 2024 / Revised: 20 June 2024 / Accepted: 20 June 2024 / Published: 26 June 2024
(This article belongs to the Section A3: Wind, Wave and Tidal Energy)

Abstract

The accurate forecasting of wind power has become a crucial task in renewable energy due to its inherent variability and uncertainty. This study addresses the challenge of predicting wind power generation without meteorological data by utilizing machine learning (ML) techniques on data from 2018 to 2021 from three wind farms in Guatemala. Various machine learning models, including Gated Recurrent Unit (GRU), Long Short-Term Memory (LSTM), Bidirectional Long Short-Term Memory (BiLSTM), Bagging, and Extreme Gradient Boosting (XGBoost), were evaluated to determine their effectiveness. The performance of these models was assessed using the Root Mean Squared Error (RMSE) and Mean Absolute Error (MAE) metrics. Time series cross-validation was employed to validate the models, with GRU, LSTM, and BiLSTM showing the lowest RMSE and MAE. Furthermore, the Diebold–Mariano (DM) test and Bayesian model comparison were used for pairwise comparisons, confirming the robustness and accuracy of the top-performing models. The results highlight the superior accuracy and robustness of advanced neural network architectures in capturing the complex temporal dependencies in wind power data, making them the most reliable models for precise forecasting. These findings provide critical insights for enhancing grid management and operational planning in the renewable energy sector.

1. Introduction

Renewable energy technology has become crucial for addressing climate change and energy supply challenges. Since the 1980s, wind energy has significantly contributed to renewable energy sources, especially in the United States [1]. It plays a crucial role in satisfying the rising global demand for electricity and is an important sector of the global renewable energy industry.
Accurate forecasts of wind power generation are essential for maintaining stability and frequency in the electrical system, given the unique characteristics of immediate and continuous demand satisfaction in the electrical market [2]. As the importance of renewable energy rises, it becomes essential to develop accurate models for predicting renewable energy outputs [3]. These models support the reliability of systems that integrate renewable energy sources with more controllable energy resources.
Previous research has significantly contributed to improving prediction accuracy in renewable power forecasting [4,5,6,7,8,9]. The use of machine learning (ML) techniques is increasingly prevalent in renewable energy prediction, particularly in the wind energy sector. Muñoz et al. [10] highlight the importance of incorporating spatial data, such as wind power estimations provided by transmission system operators, to enhance the accuracy of renewable energy forecasts. Bellinguer et al. [11] proposed an adaptive wind power prediction method using physics-based data preprocessing and k-means clustering to group wind farms. Li et al. [12] developed a wind power forecasting model combining the Dragonfly Algorithm and Support Vector Machine, showing superior accuracy compared to other models. Silva et al. [13] employed a hybrid approach combining Stacking-ensemble learning with Complete Ensemble Empirical Mode Decomposition for wind power forecasting. Carrera et al. [14] studied solar power prediction in South Korea, emphasizing the importance of accurate forecasting methods. Lin et al. [15] presented a hybrid machine learning methodology that captures the non-linear characteristics of wind speed patterns for short-term forecasting. Stratigakos et al. [16] utilized machine learning to forecast renewable energy generation across hierarchical levels, effectively managing missing data in photovoltaic parks and wind turbines. These studies highlight the significance of machine learning in enhancing the accuracy of renewable wind energy forecasting.
In the subdomain of deep learning, several studies have focused on the use of neural network models to improve wind power forecasting [7,17,18]. For example, the study in [19] investigated the application of a deep neural network for wind power forecasting, demonstrating enhanced prediction accuracy compared with traditional models. The study in [20] integrated a Convolutional Neural Network (CNN) and a Long Short-Term Memory (LSTM) network for wind power forecasting, showing improved performance in capturing temporal dependencies and nonlinear relationships. Furthermore, researchers have investigated the effect of incorporating meteorological data into wind power forecast models [21]. Shahin et al. [22] demonstrated the effective use of an Artificial Neural Network (ANN) to forecast solar irradiance with minimal error. The authors of [23] examined the effect of wind speed and direction on the accuracy of wind power forecasts, highlighting the significance of meteorological variables for enhancing forecasts. Similarly, the study in [24] proposed a hybrid model combining a Correntropy LSTM with a variational mode decomposition technique to forecast wind power generation from meteorological data. These studies collectively underscore the importance of neural network models, meteorological data, and historical information in wind power forecasting.
Objectives of this study:
  • To develop machine learning models for predicting wind power generation without relying on detailed meteorological data;
  • To evaluate the performance of various machine learning techniques, including simple, ensemble, and deep learning algorithms;
  • To use time series cross-validation, Root Mean Squared Error (RMSE), Mean Absolute Error (MAE), the Diebold–Mariano (DM) test, and Bayesian model comparison to assess and compare model performance;
  • To provide insights for enhancing grid management and operational planning in the renewable energy sector by identifying the most accurate and robust models.
This study bridges the gap between theory and practice by applying advanced machine learning techniques to real-world wind power data from 2018 to 2021 in the country of Guatemala, demonstrating their practical applicability and effectiveness in forecasting without meteorological observations.
This research begins with an analysis of the provided data, investigating its characteristics and trends in Section 2.1 (Proposed Framework). Section 2.2 and Section 2.3 detail the data sources and preprocessing steps to ensure consistency and reliability. Section 2.4 categorizes the employed machine learning methods into simple, ensemble, and deep learning algorithms. Section 2.5 explains the performance metrics, time series cross-validation, and statistical tests, including the Diebold–Mariano test and Bayesian model comparison. Section 3 presents the results from the models, covering time series cross-validation, the Diebold–Mariano test, Bayesian model comparison, and a comparative analysis using a dedicated test set. Section 4, the conclusion, discusses the practical implications of forecasting wind turbine energy without meteorological data. Finally, Section 5 provides a separate discussion, highlighting key findings and their significance.

2. Methods

2.1. Proposed Framework

This subsection outlines the methodology used to compare machine learning approaches for wind power generation prediction in the absence of meteorological data. The framework consists of multiple stages: data collection, preprocessing, model implementation, and performance evaluation. The goal is to determine the best methods for accurately predicting wind power when limited meteorological inputs are available.
Figure 1 depicts a comprehensive framework for assessing and comparing different machine learning models used in wind power forecasting. This methodological approach is subdivided into several crucial modules, each described in detail:
  • Wind Power Grids: This module shows the spatial arrangement of three separate wind farms in Guatemala: SNT-E1, VBL-E1, and LCU-E1. Every farm is depicted on a simplified map, indicating the specific locations where data are collected. This visual representation highlights the distributed nature of data collection across several wind power locations;
  • Data Collection: Data from the three wind farms are systematically gathered and stored in various databases. These databases are represented by the cylinder icons in Figure 1, which symbolize the continuous and reliable process of data collection and maintenance;
  • Data Preprocessing and Analytics: This crucial module involves handling missing values, applying filters, and combining data. The approach ensures the preservation of data quality and the standardization of datasets before their use in model training and validation, optimizing the accuracy of the forecasting models;
  • ML – Time Series Cross Validation: This module describes the methodology used to evaluate and compare machine learning algorithms for wind power forecasting. The algorithms range from baseline linear and ridge regressions to more advanced models such as neural networks and ensemble methods such as Random Forest and XGBoost. This study also investigates the effectiveness of deep learning architectures such as GRU, LSTM, and convolutional networks in analyzing time series of wind power generation. The models are trained on data from 2018 to 2020 and tested on data from 2021. This stage also involves hyperparameter tuning to enhance the performance of each model, so that the models fit the historical data well while still predicting unseen data reliably;
  • The Diebold–Mariano Test: This test is utilized to statistically evaluate and compare the predictive accuracy of the models. The main objective is to analyze the variations in prediction errors using metrics like DM statistics (DM-stats) and p-values to determine if the disparities in model performance are statistically significant;
  • Bayesian Comparison: This module uses Bayesian statistical methods to estimate the relative probabilities of several models being the best based on the available data through pairwise comparisons. This methodology offers a probabilistic perspective on the performance of a model, complementing the frequentist approach of the DM test;
  • ML Comparison: The final module involves a thorough evaluation and comparison of the most effective models. The results of these comparisons determine the most efficient machine learning model or models for predicting wind power using the collected data and the analytical approaches employed.

2.2. Wind Power Grids

Guatemala is located in Central America and has promising potential for wind power generation due to its geographical and climatic conditions. The country is bordered by Mexico to the north and west, Belize to the northeast, Honduras to the east, and El Salvador to the southeast. Its diverse topography includes coastal plains, highlands, and volcanic mountains, creating distinct wind corridors that are particularly favorable for wind energy production. Regions such as the southern highlands and the Pacific coastal areas consistently experience high wind speeds, making them ideal locations for wind farms [25,26].
Additionally, Guatemala’s proximity to the equator ensures relatively stable wind directions throughout the year, enhancing its suitability for wind energy projects. Currently, the nation has a total installed wind power capacity of around 106.5 MW, contributing significantly to its renewable energy [27,28]. Notable wind farms, such as those in Escuintla, have been pivotal in this development, supported by investments aimed at expanding the country’s renewable energy infrastructure [29].
The government’s commitment to renewable energy is reflected in its policies, which aim to fuel 80% of the electricity matrix with renewable energy by 2030, focusing on sources like wind, solar, and small hydroelectric plants [30]. This commitment not only supports local energy needs but also aligns with broader environmental goals by reducing reliance on fossil fuels and lowering greenhouse gas emissions.
Guatemala’s wind directions are influenced by trade winds with a northern component (North–Northeast, Northeast, North–Northwest) from October to February. This is due to a high-pressure system in central North America that extends through the Gulf of Mexico and the Yucatán Peninsula. These winds enter Guatemala via the Izabal department and accelerate between the Merendón and Las Minas Mountain ranges, leading to higher speeds in the eastern regions before slowing down as they move to the central and northwestern parts of the country. From March to June, the wind shifts to a southern component because of low-pressure systems over the Pacific Ocean. These systems can push winds over the mountainous regions, reaching the departments of Alta Verapaz, Huehuetenango, and El Quiché. From July to September, the wind generally returns to a northern component due to the Atlantic’s semi-permanent anticyclone; however, hurricanes or tropical storms can temporarily disrupt this pattern, altering the wind flow briefly but significantly [31]. The National Interconnected System (S.N.I.) in Guatemala comprises three wind farms with a total installed capacity of 107.4 MW, summarized in Table 1.
  • San Antonio el Sitio: The wind farm consists of sixteen VESTAS V112/3300 wind turbines, each with a power rating of 3.3 MW, totaling 52.8 MW. It is located in the municipality of Villa Canales, in the department of Guatemala, Guatemala. The wind farm began commercial operations on 19 April 2015 [32];
  • Viento Blanco, Sociedad Anónima: This wind farm has seven VESTAS V112/3300 wind turbines, each with a capacity of 3.3 MW, totaling 23.1 MW [33]. It is situated on the La Colina estate in the municipality of San Vicente Pacaya, Escuintla, Guatemala. It started commercial operations on 6 December 2015, and is shown in Figure 2;
  • Las Cumbres de Agua Blanca: This wind farm, owned by the company Transmisora de Electricidad, Sociedad Anónima, contains fifteen GAMESA G114/2100 wind turbines, each with a capacity of 2.1 MW, totaling 31.5 MW [34]. It is located in the municipality of Agua Blanca, department of Jutiapa, Guatemala, in the community of Lagunilla, and began commercial operations on 25 March 2018.
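As a quick arithmetic check, the per-farm capacities listed above follow from the turbine counts and ratings, and they sum to the stated total of 107.4 MW. The symbols P_SA, P_VB, and P_LC (for San Antonio el Sitio, Viento Blanco, and Las Cumbres de Agua Blanca) are shorthand introduced here for illustration and are not notation from the study.

```latex
\begin{align*}
P_{\mathrm{SA}} &= 16 \times 3.3\ \mathrm{MW} = 52.8\ \mathrm{MW},\\
P_{\mathrm{VB}} &= 7 \times 3.3\ \mathrm{MW} = 23.1\ \mathrm{MW},\\
P_{\mathrm{LC}} &= 15 \times 2.1\ \mathrm{MW} = 31.5\ \mathrm{MW},\\
P_{\mathrm{total}} &= 52.8 + 23.1 + 31.5 = 107.4\ \mathrm{MW}.
\end{align*}
```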

2.3. Data Preprocessing and Analytics

This section presents an analysis of the dataset related to wind energy generation from three wind farms located in different regions of Guatemala. The training set consists of historical data from 2018, 2019, and 2020, while the test set comprises data from 2021; both are used to forecast energy production for specific months. The entire dataset spans from 18 February 2018 to 30 May 2021 at one-hour intervals.
The dataset includes six distinct variables: “ID”, “Time”, “Date”, “SNT-E1”, “VBL-E1”, and “LCU-E1”. The “ID” variable is not used in the predictive model. The “Time” variable denotes the time in minutes, and the “Date” variable is presented in the year–month–day format. To use these variables as inputs, the date is partitioned and only the day and month are kept; the year is excluded as it is not relevant in this context. The time is likewise used as an input, so the algorithm’s input variables are the day, month, and time.
The energy generation columns (“SNT-E1”, “VBL-E1”, and “LCU-E1”) in the dataset contain non-numeric values that indicate periods of turbine maintenance or errors in the reading systems. To ensure data consistency, these non-numeric values were replaced with 0.
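The preprocessing described above can be sketched in a few lines of standard-library Python. This is only an illustration under stated assumptions, not the study's implementation: the record layout (a dictionary of strings), the integer encoding of “Time”, the helper names `to_power` and `preprocess_row`, and the marker string `"MAINT"` are all hypothetical.

```python
import math
from datetime import datetime

FARM_COLUMNS = ("SNT-E1", "VBL-E1", "LCU-E1")


def to_power(value):
    """Return the reading as a float, or 0.0 when it is not a finite number."""
    try:
        number = float(value)
    except (TypeError, ValueError):
        return 0.0
    return number if math.isfinite(number) else 0.0


def preprocess_row(row):
    """Split one raw record into input features and per-farm targets."""
    date = datetime.strptime(row["Date"], "%Y-%m-%d")
    # The year and the "ID" column are deliberately dropped.
    features = {"day": date.day, "month": date.month, "time": int(row["Time"])}
    targets = {farm: to_power(row[farm]) for farm in FARM_COLUMNS}
    return features, targets


# Hypothetical record; "MAINT" stands in for any non-numeric marker.
raw = {"ID": "17", "Date": "2018-02-18", "Time": "60",
       "SNT-E1": "12.5", "VBL-E1": "MAINT", "LCU-E1": "3.2"}
features, targets = preprocess_row(raw)
```

Replacing maintenance markers with 0 keeps the hourly series gap-free, at the cost of treating downtime and calm conditions identically.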
The descriptive statistics related to the energy generation capabilities of the wind farms SNT-E1, VBL-E1, and LCU-E1 from 2018 to 2021 are summarized in Table 2. SNT-E1 has the highest mean energy generation of 19.90 MW, demonstrating its overall capacity to generate power compared to the other wind farms. VBL-E1 has the lowest average output at 10.45 MW, indicating that it is either smaller in size or generates less power. LCU-E1’s average generation is 17.37 MW, placing it between the other two farms. SNT-E1 also presents the highest variability in energy production with a standard deviation of 27.55 MW, which could be due to its larger capacity or more variable wind conditions. VBL-E1 and LCU-E1 show more consistent outputs, with standard deviations of 13.90 MW and 21.39 MW, respectively. All farms experience periods of zero output, which can be attributed to either calm conditions or maintenance activities. SNT-E1 reaches a maximum generation of 340.55 MW, LCU-E1 reaches 214.06 MW, and VBL-E1 reaches 151.48 MW, reflecting potential differences in their peak production capabilities.
Figure 3 presents the time series of wind energy for each wind farm, showing that their average energy production differs. In each subfigure, the upper time series represents SNT-E1, the middle one VBL-E1, and the bottom one LCU-E1. These differences suggest that the farms vary in size and location, both of which influence power generation. The shaded band in the graphs represents the 95% confidence interval of the estimate, giving a visual indication of the variability and reliability of the data. This is crucial for understanding the uncertainty around the mean estimates derived from the dataset. Notably, in February 2020, wind farms VBL-E1 and LCU-E1 presented significantly higher energy generation than in the same month of other years, which is not observed in SNT-E1, as shown in Figure 3c. In addition, a closer examination of VBL-E1 and LCU-E1 (wind farms 2 and 3) reveals that their highest values in 2020 are significantly higher than in previous years.
The monthly distribution of wind energy generation for the three wind farms, SNT-E1, VBL-E1, and LCU-E1, is shown in Figure 4. Each boxplot in the figure illustrates the range and central tendency of energy production across different months, offering insights into seasonal patterns and variations between the wind farms. The box in each plot represents the interquartile range (IQR), encapsulating the middle 50% of the data points, with the line inside the box indicating the median energy value. The whiskers extend to the smallest and largest values within 1.5 times the IQR from the quartiles, while points outside this range are considered outliers and are shown as individual dots. The boxplots reveal clear seasonal trends, with increased energy production observed from July to August and November to March across all wind farms. These periods correspond to seasonal wind patterns that enhance energy generation. Additionally, the boxplots highlight differences in production levels among the wind farms. SNT-E1 consistently shows higher median and overall energy generation compared to VBL-E1 and LCU-E1, suggesting that SNT-E1 benefits from more favorable wind conditions or more efficient turbines. Detailed observations indicate that SNT-E1 exhibits higher production values consistently, with notable peaks and a relatively larger spread in monthly generation. VBL-E1 shows more variability in production, with significant outliers in certain months, indicating occasional high-generation events. LCU-E1 demonstrates moderate production levels with some months showing considerable variability, as indicated by the presence of outliers.
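The boxplot construction described above (median, interquartile range, whiskers within 1.5 × IQR, and outliers beyond them) can be reproduced numerically. The sketch below assumes linear interpolation between order statistics for the quartiles, which is one common convention; the function name `boxplot_summary` and the sample readings are hypothetical.

```python
from statistics import quantiles


def boxplot_summary(values):
    """Return quartiles, whisker ends, and outliers for one box."""
    q1, median, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    low_fence, high_fence = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    # Whiskers stop at the most extreme points that are still inside the fences.
    inside = [v for v in values if low_fence <= v <= high_fence]
    outliers = [v for v in values if v < low_fence or v > high_fence]
    return {"q1": q1, "median": median, "q3": q3,
            "whisker_low": min(inside), "whisker_high": max(inside),
            "outliers": outliers}


# Hypothetical hourly readings (MW) for one farm in one month.
sample = [0.0, 2.0, 4.0, 6.0, 8.0, 10.0, 12.0, 14.0, 60.0]
summary = boxplot_summary(sample)
```

In this made-up sample the single 60 MW reading lies above the upper fence and would be drawn as an individual dot.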
These findings emphasize the significance of including temporal parameters such as the day, month, and time in wind energy forecast models. The seasonal and hourly patterns found in the dataset provide useful information for constructing reliable forecasting models. Model training must also account for anomalies in the dataset, as their presence may impair the accuracy of the model’s predictions: outliers produce large prediction errors, which in turn distort the weight updates during training. To address this issue, we use a control chart idea in which power generation values whose Z-score exceeds 3 are removed from the dataset, as shown in Equation (1) [35]. Removing these anomalies allows the model to concentrate on dependable data points. In Equation (1), Z is the Z-score, x is the actual wind power value, μ is the mean of x, and σ is the standard deviation of x. The removed outliers were then replaced with the mean of the values immediately preceding and following them.
$$Z = \frac{x - \mu}{\sigma}. \tag{1}$$
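A minimal sketch of the outlier rule in Equation (1) follows. Several details are assumptions made here because the text does not specify them: the population standard deviation is used, only values with Z above the threshold are flagged (as the text states), and when a neighbour is itself an outlier or missing at a series boundary, the nearest retained value on that side is used. The name `replace_outliers` and the sample series are hypothetical.

```python
from statistics import fmean, pstdev


def replace_outliers(series, threshold=3.0):
    """Replace points with Z-score above `threshold` by the mean of the
    nearest retained values before and after them."""
    mu = fmean(series)
    sigma = pstdev(series)
    if sigma == 0:
        return list(series)  # constant series: nothing to flag
    flagged = [(x - mu) / sigma > threshold for x in series]
    cleaned = list(series)
    for i, is_outlier in enumerate(flagged):
        if not is_outlier:
            continue
        before = next((series[j] for j in range(i - 1, -1, -1)
                       if not flagged[j]), None)
        after = next((series[j] for j in range(i + 1, len(series))
                      if not flagged[j]), None)
        neighbours = [v for v in (before, after) if v is not None]
        cleaned[i] = fmean(neighbours) if neighbours else mu
    return cleaned


# Hypothetical hourly series (MW) with one spike at index 11.
hourly = [10.0] * 10 + [12.0, 300.0, 14.0] + [10.0] * 10
cleaned = replace_outliers(hourly)
```

Here the 300 MW spike is replaced by 13.0, the mean of its two neighbours, and every other value is left unchanged.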

2.4. Machine Learning Algorithms

This section explores the machine learning algorithms used to predict wind power. The algorithms are classified into three categories: simple, ensemble, and deep learning. Each category plays a distinct role in managing the intricacies of predictive modeling within the energy industry. Table 3 presents the machine learning algorithms considered in this study.

2.4.1. Simple Learning Algorithms

  • Linear Regression (LR): LR is a fundamental technique in predictive modeling used to determine linear relationships between variables, such as wind speed and power production. Although LR is easy to understand and interpret, its primary limitation is its inability to capture complex, non-linear relationships, which are often present in wind power data [35];
  • Ridge Regression: The Ridge algorithm is an extension of Linear Regression that incorporates L2 regularization. This regularization technique penalizes large coefficients, hence mitigating the risk of overfitting. This feature makes it particularly well-suited for situations where there is multicollinearity among the input features [35]. As a result, it ensures stability and enhances the capacity to produce accurate forecasts in wind power scenarios;
  • Huber Regression: Huber is a robust regression method that is effective in handling outliers. Its loss function combines the characteristics of the L2 (squared) and L1 (absolute) losses: small residuals are penalized quadratically and large residuals linearly [35]. Because large errors carry less weight than under a purely squared loss, it produces more dependable forecasts in the naturally turbulent environment of wind power data collection;
  • K-Nearest Neighbors (KNN): KNN is a non-parametric algorithm that uses the nearest training examples in the feature space to predict outputs. KNN is proficient in dealing with non-linear data and can be highly efficient in wind power forecasting, particularly when patterns cannot be effectively approximated by parametric models [35];
  • Decision Trees (DT): DT partitions the dataset into leaf nodes based on decisions made on feature values, resulting in a clear and easily interpretable decision structure. Although decision trees handle both categorical and continuous inputs, they are prone to overfitting, particularly on datasets commonly encountered in wind energy forecasts [36].
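To make the robustness of Huber regression concrete, the sketch below compares the Huber loss with the squared loss on a small and a large residual. It shows only the loss function, not a full regression fit; the threshold `delta=1.35` is an illustrative choice, and the function names are introduced here.

```python
def huber_loss(residual, delta=1.35):
    """Quadratic for |residual| <= delta, linear beyond it."""
    size = abs(residual)
    if size <= delta:
        return 0.5 * size ** 2
    return delta * (size - 0.5 * delta)


def squared_loss(residual):
    """Half squared error, scaled to match the quadratic part of Huber."""
    return 0.5 * residual ** 2


small, large = 0.5, 20.0  # hypothetical residuals in MW
small_pair = (huber_loss(small), squared_loss(small))
large_pair = (huber_loss(large), squared_loss(large))
```

For the small residual the two losses coincide, while for the large one the Huber loss grows only linearly, so a single extreme reading pulls the fitted model far less than it would under ordinary least squares.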

2.4.2. Ensemble Learning Algorithms

  • Random Forest (RF): RF enhances decision trees by constructing a collection of trees and combining their forecasts. This substantially reduces variance without a comparable increase in bias, making RF an effective tool for representing the stochastic and complex characteristics of wind speed and direction data. RF is outperformed by the ExtraTrees method in terms of speed and the reliability of its results, especially when it comes to mitigating overfitting and handling noisy data in wind power forecasting [35];
  • ExtraTrees: ExtraTrees, also known as Extremely Randomized Trees, is a variant of the decision tree algorithm that introduces more extensive randomization in the tree splitting process;
  • AdaBoost: AdaBoost is a boosting algorithm that belongs to the ensemble technique category. It works by adjusting the weights of examples that are mistakenly predicted, allowing succeeding models to focus more on these difficult examples [35]. In the context of wind power forecasting, “difficult examples” refer to instances where the prediction errors are higher, often due to irregularities or anomalies in the data. By prioritizing these more challenging examples, AdaBoost can improve the performance of decision tree models, thereby enhancing the overall accuracy of predictions;
  • Bagging: Bagging, also known as Bootstrap Aggregating, mitigates overfitting by constructing numerous models (usually trees) from various subsets of the dataset and subsequently averaging their predictions. This approach is beneficial for generating more reliable forecasts in the unpredictable domain of wind power generation;
  • Gradient Boosting (GB): GB is a machine learning technique that builds models in a sequential manner, where each new model is designed to rectify the errors created by the previously built models [35]. GB is highly efficient in non-linear situations, such as wind power forecasts, where each consecutive model is specifically designed to improve the accuracy of the residuals from the previous models;
  • XGBoost: Extreme Gradient Boosting is renowned for its high efficiency and ability to process big and complex datasets that contain sparse features, which is common in wind data analysis. The inclusion of built-in regularization in this model aids in the prevention of overfitting [37], resulting in higher prediction accuracy for wind power forecasting.
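The sequential idea behind Gradient Boosting, where each new model is fitted to the residuals of the models before it, can be sketched with one-split decision trees (stumps) on a single feature. This is a toy illustration under squared-error loss, not the configuration used in the study; the data, learning rate, and function names are hypothetical, and the feature is assumed to take at least two distinct values.

```python
def fit_stump(xs, residuals):
    """Find the single threshold split that minimizes squared error."""
    best = None
    for threshold in sorted(set(xs))[:-1]:
        left = [r for x, r in zip(xs, residuals) if x <= threshold]
        right = [r for x, r in zip(xs, residuals) if x > threshold]
        left_mean = sum(left) / len(left)
        right_mean = sum(right) / len(right)
        error = (sum((r - left_mean) ** 2 for r in left)
                 + sum((r - right_mean) ** 2 for r in right))
        if best is None or error < best[0]:
            best = (error, threshold, left_mean, right_mean)
    return best[1:]


def boost(xs, ys, rounds=20, learning_rate=0.5):
    """Start from the mean, then repeatedly fit a stump to the residuals."""
    predictions = [sum(ys) / len(ys)] * len(ys)
    for _ in range(rounds):
        residuals = [y - p for y, p in zip(ys, predictions)]
        threshold, left_mean, right_mean = fit_stump(xs, residuals)
        predictions = [
            p + learning_rate * (left_mean if x <= threshold else right_mean)
            for x, p in zip(xs, predictions)
        ]
    return predictions


def mse(ys, predictions):
    return sum((y - p) ** 2 for y, p in zip(ys, predictions)) / len(ys)


# Hypothetical non-linear relation between hour of day and power (MW).
hours = list(range(24))
power = [5.0 + 0.1 * (h - 12) ** 2 for h in hours]
baseline = [sum(power) / len(power)] * len(power)
one_round = boost(hours, power, rounds=1)
many_rounds = boost(hours, power, rounds=30)
```

On this toy data the training error drops from the mean-only baseline to one round and again to thirty rounds, which is the residual-correction behavior described in the list above.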

2.4.3. Deep Learning Algorithms

  • Neural Network (NN): The neural network considered here is built from fully connected (dense) layers, in which each neuron is connected to every neuron in the preceding layer. Dense layers are widely adopted in artificial neural networks because this extensive inter-neuron information exchange lets them learn complex patterns and relationships within data [35]. In general, NNs can represent complex, non-linear connections that simpler linear models may not capture well, which makes them valuable in forecasting wind power generation, where several meteorological elements interact;
  • Gated Recurrent Unit (GRU) and Long Short-Term Memory (LSTM): The GRU is a type of recurrent neural network (RNN) designed to address the vanishing gradient problem that is frequently encountered in conventional RNN structures. GRU networks use gates to control how information is preserved and passed along sequential data, and their design is similar to that of LSTM networks. LSTM networks are a distinct type of RNN designed to capture long-term dependencies, particularly in sequence prediction tasks, and they are characterized by their capacity to preserve and exploit information across extended sequences [35]. This capacity to retain memory makes them a potent instrument for examining time-dependent information and tackling complex sequential patterns [38]. Both GRU and LSTM are very effective in modeling sequential data, such as time series of wind speeds, and their ability to capture long-term relationships is essential for making precise predictions of wind power generation over extended periods;
  • Convolutional Neural Networks (CNN): CNNs are mostly utilized in image processing but they may also be modified to handle sequential data by considering it as a one-dimensional “image”. CNNs have the capability to capture local dependencies and quickly extract features, which is advantageous in dealing with complex wind energy or weather patterns that impact wind generation;
  • BiGRU and BiLSTM: BiGRU and BiLSTM are bidirectional versions of GRU and LSTM, respectively. They process data in both forward and backward directions, enabling them to capture both past and future context within a sequence [35]. This fuller view of temporal context supports the prediction of wind power generation under ever-changing climatic conditions.
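The gating and bidirectional ideas in the list above can be illustrated with a deliberately tiny GRU that has a scalar input and a scalar hidden state. This is a conceptual sketch only: the weights are arbitrary untrained values, the update rule follows one common convention (others swap the roles of z and 1 − z), and none of the names come from the study.

```python
import math


def sigmoid(value):
    return 1.0 / (1.0 + math.exp(-value))


def gru_step(x, h_prev, p):
    """One GRU update for a scalar input and a scalar hidden state."""
    z = sigmoid(p["wz"] * x + p["uz"] * h_prev + p["bz"])  # update gate
    r = sigmoid(p["wr"] * x + p["ur"] * h_prev + p["br"])  # reset gate
    candidate = math.tanh(p["wh"] * x + p["uh"] * (r * h_prev) + p["bh"])
    # Blend the old state with the candidate; z controls how much is rewritten.
    return (1.0 - z) * h_prev + z * candidate


def run_gru(sequence, p, reverse=False):
    """Return one hidden state per time step, in chronological order."""
    h, states = 0.0, []
    for x in (reversed(sequence) if reverse else sequence):
        h = gru_step(x, h, p)
        states.append(h)
    return states[::-1] if reverse else states


# Arbitrary, untrained weights chosen only for illustration.
params = {"wz": 0.8, "uz": 0.3, "bz": 0.0,
          "wr": 0.5, "ur": 0.2, "br": 0.0,
          "wh": 1.2, "uh": 0.7, "bh": 0.0}
scaled_power = [0.1, 0.4, 0.9, 0.3]  # hypothetical normalized readings
forward = run_gru(scaled_power, params)
backward = run_gru(scaled_power, params, reverse=True)
bidirectional = list(zip(forward, backward))
```

Each entry of `bidirectional` pairs a state that has seen only earlier readings with one that has seen only later readings, which is the sense in which a BiGRU captures both past and future context.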

2.5. Evaluation of ML Models

This subsection provides an explanation of the evaluation criteria and procedures used to measure the performance of the models presented in the previous subsection. This includes the use of the DM statistical test to compare forecast accuracy and Bayesian approaches for evaluating probabilistic models. Section 2.5.1 presents the performance metrics analyzed in this study. Section 2.5.2 covers the utilization of time series cross-validation, an essential validation approach that guarantees the models’ reliability and effectiveness on sequential data, reflecting real-world scenarios where predictions are made based on prior data patterns. This systematic approach not only enables a thorough evaluation of the predicting abilities of each model but also improves the comprehension of their real-world uses in wind energy prediction.

2.5.1. Performance Metrics

In this section, the performance metrics utilized in this study are detailed. A two-step analysis is conducted using time series cross-validation with six folds. The primary aim of cross-validation is to assess the model’s ability to generate accurate predictions using unseen data, thereby identifying potential issues such as overfitting or bias [35].
To compare the algorithms on the outcomes achieved, it is important to approximate the practical performance of a predictive model using multiple performance metrics. The root mean square error (RMSE) squares the distance between ŷi and yi, averages it over the N samples, and takes the square root, so it penalizes larger discrepancies between predicted and observed wind power more heavily. The mean absolute error (MAE) is the average absolute distance between ŷi and yi, so every offset contributes in proportion to its size. Because estimating energy output is a regression problem, both metrics are used. The performance metrics are given in Equations (2) and (3).
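$$\mathrm{RMSE} = \sqrt{\frac{1}{N} \sum_{i=1}^{N} \left(y_i - \hat{y}_i\right)^2}, \tag{2}$$
$$\mathrm{MAE} = \frac{1}{N} \sum_{i=1}^{N} \left|y_i - \hat{y}_i\right|, \tag{3}$$
where yi is the actual wind power value, ŷi is the prediction value from the models, ȳ is the mean of yi, and N is the sample size.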
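Equations (2) and (3) translate directly into code. The sketch below is a plain-Python illustration with made-up values; the function names and the sample data are not from the study.

```python
import math


def rmse(actual, predicted):
    """Root mean squared error over paired observations."""
    squared = [(y - p) ** 2 for y, p in zip(actual, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def mae(actual, predicted):
    """Mean absolute error over paired observations."""
    absolute = [abs(y - p) for y, p in zip(actual, predicted)]
    return sum(absolute) / len(absolute)


# Hypothetical hourly values (MW); the last forecast misses by 10 MW.
actual = [10.0, 20.0, 30.0, 40.0]
predicted = [12.0, 18.0, 30.0, 50.0]
scores = {"rmse": rmse(actual, predicted), "mae": mae(actual, predicted)}
```

With these values the MAE is 3.5 MW while the RMSE is about 5.2 MW; the gap comes from the single 10 MW miss, which the squaring in RMSE weights more heavily.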

2.5.2. Time Series Cross-Validation

Time series cross-validation is a crucial method in wind power forecasting because it considers the sequential pattern of meteorological and wind data. Time series cross-validation preserves the chronological order of observations, unlike normal cross-validation methods that randomly shuffle data points. In wind power forecasting, it is essential to consider the influence of previous events on future outcomes and to reflect any temporal dependencies in the model [39].
Time series cross-validation is a useful method for assessing the reliability and applicability of prediction models in the field of wind power forecasting. The model can be trained using historical data and then evaluated using new, unseen data, replicating real-life situations where predictions are made based on past observations. This approach offers a more accurate evaluation of a model’s performance and aids in determining its ability to adjust to novel and potentially unobserved changes in wind directions over a period of time.
Furthermore, time series cross-validation can assist in optimizing hyperparameters and model configurations that are tailored to the temporal characteristics of the data, such as seasonal patterns and trends. By employing this methodology, modelers may guarantee that the prognostic precision and dependability of the forecasts are upheld, which is of utmost importance for operational planning and decision-making in wind power management. This approach also helps to prevent the transfer of future information into the training process, avoiding overfitting and ensuring that the models maintain their maximum accuracy and effectiveness when used in real-world situations [39].
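The chronological splitting described above can be sketched with scikit-learn's `TimeSeriesSplit`. The six splits match the six-fold setting mentioned in this study; the toy series and its length are assumptions made only for illustration.

```python
import numpy as np
from sklearn.model_selection import TimeSeriesSplit

# Hypothetical stand-in for a chronologically ordered power series.
series = np.arange(70, dtype=float).reshape(-1, 1)

tscv = TimeSeriesSplit(n_splits=6)
folds = list(tscv.split(series))

for k, (train_idx, val_idx) in enumerate(folds, start=1):
    # Every training index precedes every validation index, so no
    # future observation leaks into the training window.
    print(
        f"fold {k}: train 0..{train_idx[-1]}, "
        f"validate {val_idx[0]}..{val_idx[-1]}"
    )
```

Each fold trains on an expanding window of past observations and validates on the block that immediately follows it, which is what keeps future information out of the training process.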

2.5.3. Diebold–Mariano Test

The DM test is frequently employed to compare the accuracy of two predictive models' forecasts [40]. The test is widely utilized in economics and finance, although it can also be applied in statistical science and engineering to compare the prediction errors of various models. The DM test was originally intended for comparing forecasts rather than models; however, it has been widely used to compare the accuracy of forecasting models [41]. The DM test compares two competing models by analyzing the differences in the losses associated with their forecast errors. It enables analysts to determine whether one forecasting model exhibits a statistically significant improvement over another, or whether the observed differences in performance are merely attributable to random fluctuations in the data [42]. As a result, it is a highly effective tool for comparative forecasting studies. The steps and equations utilized in the DM test are as follows:
  • Defining the forecast errors: first let us define $e_{1t}$ and $e_{2t}$ to be the forecast errors of Model 1 and Model 2 at time $t$, which represent the errors observed in the test set;
  • Calculate the loss differential $d_t$ based on the difference in the forecasted errors using a loss function $L$, which is selected as the squared error. The differential is shown in Equation (4):
    $$d_t = L(e_{1t}) - L(e_{2t}) = e_{1t}^2 - e_{2t}^2. \tag{4}$$
  • Compute the mean loss differential: the mean $\bar{d}$ is calculated in Equation (5).
    $$\bar{d} = \frac{1}{T}\sum_{t=1}^{T} d_t \tag{5}$$
  • The DM statistics (DM-stats) test is then typically computed in Equation (6).
    $$DM\_stats = \frac{\bar{d}}{\sqrt{\hat{\sigma}_d^2 / T}} \tag{6}$$
    where $\hat{\sigma}_d^2$ represents the estimate of the variance of $d_t$, and $T$ is the number of forecasted periods.
  • Hypothesis testing: under the null hypothesis $H_0$, the forecasting accuracy of both models is identical, and the DM-stats asymptotically follows a standard normal distribution. $H_0$ is rejected when the absolute value of the DM statistic is sufficiently large, suggesting that the forecasting performance of the two models differs significantly.
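The steps above can be sketched as a short function. This is a simplified illustration under stated assumptions, not the code used in the study: it estimates the variance of $d_t$ with the plain sample variance (no autocorrelation correction) and uses a two-sided p-value from the standard normal distribution. The function name `diebold_mariano` and the simulated errors are hypothetical.

```python
import math

import numpy as np


def diebold_mariano(errors_1, errors_2):
    """Return (DM statistic, two-sided p-value) for squared-error loss."""
    e1 = np.asarray(errors_1, dtype=float)
    e2 = np.asarray(errors_2, dtype=float)
    d = e1 ** 2 - e2 ** 2                    # loss differential, Eq. (4)
    n_periods = d.size                       # T in Eq. (6)
    d_bar = d.mean()                         # mean loss differential, Eq. (5)
    var_d = d.var(ddof=1)                    # assumption: simple sample variance
    dm_stat = d_bar / math.sqrt(var_d / n_periods)   # Eq. (6)
    # Two-sided p-value under the asymptotic N(0, 1) null distribution.
    p_value = math.erfc(abs(dm_stat) / math.sqrt(2.0))
    return dm_stat, p_value


# Simulated forecast errors (illustrative only): Model 1 is noisier.
rng = np.random.default_rng(0)
errors_model_1 = rng.normal(0.0, 2.0, size=500)
errors_model_2 = rng.normal(0.0, 1.0, size=500)
dm_stat, p_value = diebold_mariano(errors_model_1, errors_model_2)
print(f"DM-stats = {dm_stat:.3f}, p-value = {p_value:.3g}")
```

With the differential defined as in Equation (4), a positive statistic in this sketch means that Model 1 has the larger mean squared error. Sign conventions depend on the order in which the two losses are subtracted, so the direction should be checked against the convention of whichever implementation produced a reported value.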

2.5.4. Bayesian Model Comparison

Bayesian model comparison requires the utilization of particular statistical measures and computations, with the Bayes factor and model marginal likelihood serving as pivotal components [43,44]. These allow for the quantitative comparison of models according to their probability given the context of the available data [45]. The steps and equations utilized in Bayesian model comparison are as follows:
  • First, the marginal likelihood of the data $D$ under model $M$ is computed, as shown in Equation (7).
    $$P(D \mid M) = \int P(D \mid \theta, M)\, P(\theta \mid M)\, d\theta, \tag{7}$$
    where $P(D \mid \theta, M)$ is the likelihood of the data given the parameters $\theta$ under model $M$, and $P(\theta \mid M)$ is the prior distribution. To account for all possible parameter values, the integral averages the likelihood over all values of $\theta$, weighted by the prior;
  • Second, the Bayes Factor (BF) is used to compare two models, $M_1$ and $M_2$, and is defined as the ratio of their marginal likelihoods, as shown in Equation (8).
    $$BF_{12} = \frac{P(D \mid M_1)}{P(D \mid M_2)}. \tag{8}$$
  • The BF provides a direct measure of the evidence in favor of one model over another. $BF_{12} \approx 1$ indicates little to no difference between the models; $BF_{12} > 10$ indicates strong evidence in favor of $M_1$; and $BF_{12} < 0.1$ indicates strong evidence in favor of $M_2$.
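Equations (7) and (8) can be illustrated with a toy problem in which the integral is approximated on a grid. The setup is entirely hypothetical and is not the comparison performed in this study: the data are a small set of values assumed Gaussian with known standard deviation, $M_1$ places a Gaussian prior on their mean, and $M_2$ fixes the mean at zero.

```python
import numpy as np


def gaussian_likelihood(data, means, sigma):
    """P(D | mean, M) for each candidate mean, assuming independent Gaussians."""
    z = (data[:, None] - means[None, :]) / sigma
    log_lik = -0.5 * np.sum(z ** 2, axis=0)
    log_lik -= data.size * np.log(sigma * np.sqrt(2.0 * np.pi))
    return np.exp(log_lik)


def bayes_factor_mean_shift(data, sigma=1.0, prior_sd=1.0):
    """BF_12 with M1: mean ~ N(0, prior_sd^2) and M2: mean fixed at 0."""
    data = np.asarray(data, dtype=float)
    grid = np.linspace(-6.0 * prior_sd, 6.0 * prior_sd, 4001)
    step = grid[1] - grid[0]
    prior = np.exp(-0.5 * (grid / prior_sd) ** 2) / (prior_sd * np.sqrt(2.0 * np.pi))
    # Eq. (7): likelihood averaged over the prior (Riemann-sum approximation).
    evidence_m1 = np.sum(gaussian_likelihood(data, grid, sigma) * prior) * step
    # M2 has no free parameter, so its marginal likelihood is the likelihood at 0.
    evidence_m2 = gaussian_likelihood(data, np.array([0.0]), sigma)[0]
    return evidence_m1 / evidence_m2  # Eq. (8)


bf_shifted = bayes_factor_mean_shift(np.ones(20))    # data centered away from 0
bf_centered = bayes_factor_mean_shift(np.zeros(20))  # data centered at 0
print("BF_12 (shifted data):", bf_shifted)
print("BF_12 (centered data):", bf_centered)
```

When the data sit away from zero, the flexible model $M_1$ is favored; when they sit at zero, the simpler $M_2$ is favored because $M_1$ spreads its prior mass over values the data do not support. The grid approximation works here only because the toy model has a single parameter.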

3. Results

The results of the machine learning methods that were presented in the preceding section are provided in this section. Three sections comprise the results: first, performance metrics derived from the time series cross-validation; second, the DM test outcomes; and, finally, comparison results of the Bayesian model.

3.1. Time Series Cross-Validation

The training set is partitioned using time series cross-validation with a six-fold split. In addition, the best models are obtained through a grid search that determines the optimal hyperparameter values on the training folds and evaluates them on the validation folds; the selected models are then compared on the test set.
The data used in the experiments are divided into two segments: a three-year training set (2018–2020) and a test set (2021). The training set was employed to train the machine learning models, while six-fold cross-validation was utilized as a resampling procedure throughout the training period. An additional step involved conducting a grid search to identify the optimal hyperparameter values for every model. The experiments were executed in Python 3.9 using scikit-learn, statsmodels, and tensorflow, which provide implementations of the aforementioned machine learning techniques. The hyperparameter tuning procedures are configured in the same way for all data inputs. The hyperparameter candidates tested for each algorithm are detailed in Appendix A. Table 4 presents the best parameters found with the time series cross-validation.
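A hyperparameter search combined with six-fold time series cross-validation can be sketched with scikit-learn as follows. This is an illustrative setup under stated assumptions rather than the study's pipeline: the synthetic series, the 24 lag features, the helper `make_lagged`, the three `alpha` candidates, and the 300-sample training cut are all hypothetical.

```python
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.model_selection import GridSearchCV, TimeSeriesSplit


def make_lagged(series, n_lags):
    """Build a matrix of the previous n_lags values and the next-step target."""
    columns = [series[i:len(series) - n_lags + i] for i in range(n_lags)]
    return np.column_stack(columns), series[n_lags:]


# Synthetic stand-in for a power series (illustrative only).
rng = np.random.default_rng(0)
t = np.arange(400)
series = 50.0 + 20.0 * np.sin(2.0 * np.pi * t / 24.0) + rng.normal(0.0, 3.0, t.size)

X, y = make_lagged(series, n_lags=24)
X_train, y_train = X[:300], y[:300]   # earlier period: training and validation
X_test, y_test = X[300:], y[300:]     # later period: held-out test set

search = GridSearchCV(
    estimator=Ridge(),
    param_grid={"alpha": [0.1, 1.0, 10.0]},
    cv=TimeSeriesSplit(n_splits=6),       # chronological folds, no shuffling
    scoring="neg_root_mean_squared_error",
)
search.fit(X_train, y_train)

test_rmse = float(np.sqrt(np.mean((y_test - search.predict(X_test)) ** 2)))
print("best alpha:", search.best_params_["alpha"])
print("validation RMSE:", -search.best_score_)
print("test RMSE:", test_rmse)
```

The held-out test period is touched only once, after the search has selected a configuration, which mirrors the training and test separation described above.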
The measurement of the RMSE for several machine learning algorithms used to anticipate wind power generation yields some crucial insights about performance. The results are shown in Table 5. DT, RF, LSTM, and GRU models have the lowest mean RMSE values, indicating that they are very good at predicting wind power generation. Their RMSE values are firmly grouped around the mean and exhibit minimal variability, indicating consistent performance across test sets or partitions.
AdaBoost has the lowest RMSE standard deviation (STD) among the methods, indicating that its predictions are highly stable independent of data variability or model training settings. Notably, multiple techniques, including LR, Ridge, Huber, and KNN have similar mean, minimum, and maximum RMSE values. Ensemble approaches such as ExtraTrees and GB perform well in this study but not as well as RF or DT, with mean RMSE values that are higher than the best models.
Deep learning models, particularly LSTM and GRU, perform well in this forecasting, probably because of their capacity to hold sequential dependencies and long-term trends in time-series data such as wind speed and power output, which are critical for effective predictions. Traditional methods, such as LR and Ridge, have greater RMSEs, indicating worse appropriateness for this complicated, nonlinear prediction problem. The bidirectional forms of recurrent neural networks, BiGRU and BiLSTM, also perform well, indicating that adding feedback from future states into the learning process improves predicting accuracy.
Furthermore, Table 5 presents the MAE results obtained through time series cross-validation across the machine learning models. These findings offer essential insights into the effectiveness of these models in predicting wind power. RF, DT, and KNN have the lowest MAE values, showing that they predict wind power with few errors. Notably, RF has the lowest MAE, indicating its ability to handle the complicated, non-linear interactions found in wind data. Moreover, MAE values for models such as LR, Ridge, Huber, KNN, DT, and others are consistent across the mean, min, and max, implying that these models experience uniform error across different data splits. This could be because the models fit the data well or badly but consistently over all cross-validation folds.
Models with lower standard deviations, such as AdaBoost’s STD of 2.475, indicate more consistent performance across multiple data splits. This stability is critical for operational contexts where prediction performance must be consistent despite minor alterations in input data.
Furthermore, BiLSTM and BiGRU also have unusually low MAE values, indicating that these deep learning models, which can capture temporal dependencies and feedback loops in data, are extremely effective for time-series forecasting in wind power. More broadly, deep learning models such as GRU, LSTM, BiGRU, and BiLSTM provide competitive MAE values while also maintaining relatively low variability in their predictions. This highlights the utility of RNN-based models for capturing dynamic patterns in wind speed and power generation datasets.
Traditional regression models, such as LR and Ridge, have higher MAE, which could be attributed to their linear nature, which may not adequately reflect the complex, non-linear interdependencies of the predictors in wind energy forecasting.

3.2. Diebold–Mariano Test

The comparative analysis using the DM test reveals several insights into the performance of different machine learning models for forecasting wind power generation across three wind farms (SNT-E1, VBL-E1, and LCU-E1) in Guatemala. This statistical test is crucial for comparing the predictive accuracy of two forecasting models as it determines whether there is a statistically significant difference in their performance based on the DM statistic. A positive DM statistic indicates that the first model in the comparison performs better, and a negative number indicates that the second model performs better. Summary tables of the DM test results, categorized by model performance, are shown in Table 6, Table 7 and Table 8.
The results in Table 6 highlight the high performance of specific model pairs across the three wind farms. The Ridge regression model paired with a NN shows exceptional performance for SNT-E1 with a DM-stats of 18.366 and a highly significant p-value of 2.77 × 10^−73, though it performs poorly for VBL-E1 and LCU-E1 with negative DM-stats. Similarly, the Huber regression model paired with a NN demonstrates outstanding performance across all three wind farms, particularly for LCU-E1 with a DM-stats of 29.742 and a p-value of 4.29 × 10^−181. The combinations of ExtraTrees, Bagging, GB, and XGBoost with BiLSTM also show high performance, with consistently positive DM-stats and very low p-values across all farms. The Bagging model outperforms BiLSTM across all datasets with highly significant DM statistics: 10.2314 for SNT-E1; 11.1001 for VBL-E1; and 12.9001 for LCU-E1. The corresponding p-values (2.30 × 10^−24, 2.41 × 10^−28, and 1.46 × 10^−37) indicate that these results are statistically significant, underscoring the superior accuracy of Bagging in wind power generation forecasting. ExtraTrees outperforms BiLSTM across all datasets with significant DM statistics: 9.063 for SNT-E1; 9.140 for VBL-E1; and 10.685 for LCU-E1, with very low p-values. ExtraTrees shows significant underperformance when compared to GB and XGBoost, with negative DM statistics indicating that the second model performs better: −12.671, −13.862, and −14.823 against GB; and −12.739, −14.719, and −13.934 against XGBoost. GB significantly outperforms BiLSTM with DM statistics of 11.340 for SNT-E1, 11.958 for VBL-E1, and 13.348 for LCU-E1, all with extremely low p-values, demonstrating the robustness of GB in comparison to BiLSTM.
Similarly, XGBoost shows superior performance over BiLSTM across all datasets, with DM statistics of 11.366 for SNT-E1, 11.994 for VBL-E1, and 13.329 for LCU-E1, with very low p-values, confirming XGBoost’s superior predictive accuracy.
Table 7 presents a summary of the DM test outcomes for models showcasing moderate performance, as indicated by moderate DM-stats and statistically significant p-values. The combination of LR with LSTM performs moderately well, especially for SNT-E1, with a DM-stats of 7.951 and a p-value of 2.20 × 10^−15. Ridge regression paired with BiGRU also shows moderate performance, particularly for SNT-E1 with a DM-stats of 5.831 and a significant p-value. The Huber regression model paired with LSTM demonstrates strong performance across all wind farms, especially for SNT-E1 and LCU-E1. The KNN model combined with LSTM and the RF model paired with BiLSTM also exhibit moderate performance, indicating their potential for reliable forecasts under certain conditions.
Table 8 presents the DM test results for models with low performance, which is defined by low or insignificant DM-stats and high p-values. The negative DM-stats for Ridge regression paired with ExtraTrees across all wind farms indicate that ExtraTrees performs better than Ridge regression. Specifically, for SNT-E1 the DM-stats is −1.272, for VBL-E1 it is −1.333, and for LCU-E1 it is −3.414, confirming ExtraTrees’ superior predictive accuracy. The combination of Huber regression with ExtraTrees also performs poorly, as indicated by insignificant p-values and low DM-stats. Additionally, DT models paired with GRU and RF models paired with BiGRU show low performance across the wind farms, as reflected by their negative DM-stats and significant p-values. These results suggest that GRU and BiGRU are more effective for wind power forecasting in this context.
Appendix B presents the complete DM test statistics, comparing the performance of various machine learning models for forecasting wind power generation across the three wind farms: SNT-E1, VBL-E1, and LCU-E1. The figure uses different shapes and colors to indicate significance levels, with red markers denoting statistically significant differences (p < 0.05) and blue markers indicating non-significant differences (p ≥ 0.05). The shapes correspond to the different wind farms: SNT-E1 (+), VBL-E1 (■), and LCU-E1 (▲). The analysis reveals that AdaBoost vs. BiGRU yields DM statistics near zero across all wind farms, suggesting similar performance between these models, with BiGRU slightly outperforming AdaBoost in SNT-E1 and VBL-E1. Conversely, AdaBoost generally outperforms BiLSTM, as evidenced by positive DM statistics such as 0.297 for SNT-E1, 3.629 for VBL-E1, and 3.672 for LCU-E1. Bagging shows consistent underperformance compared to GB and XGBoost, highlighted by negative DM statistics across all wind farms, such as −10.752 for GB and −10.806 for XGBoost in SNT-E1. Ridge regression paired with models like NN, LSTM, and BiLSTM demonstrates variable performance, with Ridge vs. NN showing a significant positive DM statistic for SNT-E1 (18.366) but a negative statistic for LCU-E1 (−9.242). Similarly, Huber regression exhibits robust performance when compared with NN and LSTM, indicated by positive DM statistics across most wind farms, while comparisons with ExtraTrees and GB reveal mixed results.

3.3. Bayesian Model Comparison

The Bayesian model comparison test carried out in this study offers a comprehensive evaluation of the predictive capabilities of different machine learning models in wind power generation forecasting. The results, visualized and summarized in Figure 5 in a pairwise model comparison matrix and in the Bayesian model comparison heatmap shown in Appendix C, indicate the comparative effectiveness of different models. In Figure 5, the heatmap illustrates the pairwise model comparisons based on the Bayesian model comparison results. Each cell in the matrix shows the result of comparing the model listed on the y-axis with the model listed on the x-axis. The symbols indicate the following: ‘>’ means the model on the y-axis outperforms the model on the x-axis, ‘<’ means the model on the y-axis is outperformed by the model on the x-axis, and ‘~’ indicates that the models are practically equivalent, with performance differences falling within the Region of Practical Equivalence (ROPE).
In this study, Bayesian model comparison was employed to evaluate the performance differences between pairs of machine learning models in predicting wind power generation. The results of these comparisons are expressed in terms of three probabilities: Worse Probability, Better Probability, and ROPE Probability. Specifically, the “Better Probability” indicates the likelihood that Model 1 performs better than Model 2, while the “Worse Probability” reflects the likelihood that Model 1 performs worse than Model 2. The ROPE Probability measures the chance that the difference in performance between the two models is practically negligible, meaning the difference is within a range considered not significant for practical applications.
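The three probabilities can be illustrated with a short sketch. It assumes that posterior samples of the performance difference between Model 1 and Model 2 are already available, which this section does not specify, and that positive differences mean Model 1 performs better. The function name `rope_probabilities`, the ROPE half-width, and the simulated samples are hypothetical.

```python
import numpy as np


def rope_probabilities(diff_samples, rope=0.01):
    """Return (worse, rope, better) probabilities from posterior samples.

    diff_samples: posterior draws of (Model 1 score - Model 2 score),
    where a higher score is assumed to mean better performance.
    rope: half-width of the Region of Practical Equivalence.
    """
    diff = np.asarray(diff_samples, dtype=float)
    worse = float(np.mean(diff < -rope))    # Model 1 practically worse
    better = float(np.mean(diff > rope))    # Model 1 practically better
    equivalent = 1.0 - worse - better       # difference falls inside the ROPE
    return worse, equivalent, better


# Simulated posterior draws (illustrative only).
rng = np.random.default_rng(0)
samples = rng.normal(loc=0.02, scale=0.05, size=10_000)
worse_p, rope_p, better_p = rope_probabilities(samples, rope=0.01)
print(f"Worse: {worse_p:.3f}, ROPE: {rope_p:.3f}, Better: {better_p:.3f}")
```

The three values always sum to one, so a large ROPE probability necessarily leaves little mass for either model to be called practically better.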
LR, Ridge, and Huber show consistently higher Worse Probabilities when compared to advanced ensemble models such as AdaBoost, Bagging, GB, and XGBoost. For instance, in the comparison between AdaBoost and CNN, the Worse Probability is 0.323 while the Better Probability is 0.665, indicating that CNN generally underperforms. Similarly, comparisons between GB and models like XGBoost show a substantial ROPE Probability of 0.209, suggesting that the differences in performance may not be practically significant in some cases.
Neural network models, including CNN, GRU, LSTM, and NN, frequently demonstrate superior performance over traditional models, but this was not the case in the results of our study. For example, when comparing CNN and BiGRU, the Worse Probability is 0.592 and the Better Probability is 0.398, suggesting a slight edge for BiGRU but still highlighting CNN's competitiveness. In comparison with simpler models, DT shows a Better Probability of 0.638 and a Worse Probability of 0.349.
Huber regression generally underperforms compared to other advanced models. For instance, in the comparison between Huber and GRU, the Worse Probability is 0.842, while the Better Probability is 0.152, indicating Huber’s weaker performance in handling the variability in wind power data. Overall, the Bayesian model comparison provides a comprehensive view of model performance, emphasizing the strengths and weaknesses of different models in wind power forecasting and guiding informed model selection based on specific application needs.

3.4. Comparison of Machine Learning Models

In terms of the results over unseen data, the performance metrics obtained from the time series cross-validation, as presented in Table 5, demonstrate that the most effective models for wind power forecasting are GRU, LSTM, and BiLSTM. These models had the lowest RMSE and MAE, which indicates their high accuracy. The BiLSTM model achieved an RMSE of 17.870 and an MAE of 11.392, the LSTM model an RMSE of 17.781 and an MAE of 11.832, and the GRU model an RMSE of 17.860 and an MAE of 11.585. The findings highlight the effectiveness of advanced neural network architectures in capturing complex temporal patterns in wind power data, establishing them as the most reliable models for accurate forecasting in this specific context.
In terms of a pairwise comparison over the machine learning models, the DM test and Bayesian model comparison results show which models are most effective at estimating wind power. The findings of the DM test show that, in comparison to other models, XGBoost, Bagging, and particular neural networks (such as CNN and LSTM) often exhibit statistically significant improved performance. For instance, XGBoost paired with BiLSTM shows consistently high DM-stats across all wind farms, indicating superior performance. This implies that these models have a clear edge when it comes to identifying the patterns required for precise wind power forecasts.
These results are corroborated by the Bayesian model comparison heatmap, which offers a probabilistic evaluation of model performance. High posterior probabilities are consistently shown by AdaBoost, GB, Bagging, and XGBoost. For example, Bagging shows a Better Probability of 0.545 against BiGRU and 0.559 against BiLSTM, confirming its robustness among the ensemble models. Similarly, the ensemble model XGBoost shows a high probability of outperforming BiGRU and BiLSTM, with Better Probabilities of 0.538 and 0.550, respectively. This confirms the position of ensemble models as the best models. The accuracy and dependability of these models in various forecasting situations are supported by the Bayesian analysis, which provides a nuanced understanding of model confidence.

4. Conclusions

In this study, we performed a comparative analysis of various machine learning techniques to predict wind power generation using data from 2018 to 2021 from three wind farms in Guatemala. The following key findings were identified.
Predicting wind power without meteorological data:
  • One of the notable advantages of our models is their independence from meteorological data. This is particularly valuable in situations where weather data are not available, incomplete, or unreliable. Even with the sole use of historical power generation data, our models can produce accurate forecasts;
  • Meteorological data may occasionally be susceptible to measurement inaccuracies, incomplete values, or delays in data collection. These issues are mitigated in our models through the utilization of operational data, which is typically more consistently recorded and maintained by wind farms. This enhances the robustness of the forecasting process;
  • The complexity of data preprocessing and integration is reduced in our models by excluding weather data. This simplification can result in faster deployment and lower computational requirements, making the forecasting process more efficient.
Model Performance:
  • The models GRU, LSTM, and BiLSTM showed the lowest RMSE and MAE, proving their accuracy in predicting wind power generation;
  • The performance of ensemble methods, specifically XGBoost and Bagging, was strong, consistently providing accurate predictions.
Statistical Validation:
  • The Diebold–Mariano (DM) test confirmed the statistical significance of the performance disparities among models, underscoring the superior accuracy of GB, XGBoost, and Bagging;
  • The Bayesian model comparison revealed that AdaBoost, Bagging, GB, and XGBoost presented robustness, as indicated by probabilistic evidence, surpassing other models.
Practical Implications:
  • According to the findings, the implementation of advanced neural network architectures and ensemble methods can notably improve the reliability of wind power predictions. This improvement can lead to better grid stability and operational efficiency;
  • Effective operational planning and resource allocation are facilitated by accurate wind power predictions, reducing reliance on fossil fuels and supporting renewable energy integration.
Other Real-World Consequences:
  • Grid stability is improved through the use of accurate forecasting models, allowing for better management of supply and demand;
  • Enhanced forecasting models enable more accurate operational planning, minimizing expenses linked to excess production and storage;
  • Accurate wind power predictions facilitate the integration of renewable energy, decreasing greenhouse gas emissions and dependence on non-renewable sources.
In conclusion, the application of advanced machine learning techniques, particularly GRU, LSTM, BiLSTM, GB, XGBoost, and Bagging, has proven to be effective in forecasting wind power generation, where XGBoost and Bagging obtain the best results in our tests. These models offer significant benefits for grid management and operational planning, contributing to the broader goal of enhancing renewable energy utilization. Future research should continue to explore hybrid models and the incorporation of additional variables to further improve forecasting accuracy.

5. Discussion

It is essential to compare various machine learning algorithms in wind power forecasting because of the complex characteristics of the data and the distinct approaches that each model takes to handle it. Linear Regression and Decision Trees, although providing a comprehensible insight into the factors and characteristics that impact wind power, may not possess the adaptability to capture complex patterns. Ensemble approaches, like Random Forest, XGBoost, and Gradient Boosting, can enhance prediction accuracy by combining numerous models to minimize both variance and bias. This approach efficiently captures complex correlations and interactions between variables, resulting in more reliable predictions. Deep learning models like LSTM, GRU, and CNN are highly skilled in processing sequential and time-series data. They excel at capturing the temporal patterns and long-term relationships that are commonly seen in environmental data. Through the process of comparing various algorithms, researchers and engineers may determine the most precise and computationally efficient models for specific forecasting jobs. This allows them to enhance performance and guarantee dependability in real-world applications. This comparison also facilitates the comprehension of the compromises between interpretability and predictive capability, allowing for the creation of models that not only accurately anticipate but also offer insights into the fundamental mechanisms influencing wind power generation.
In the case of time series cross-validation results, tree-based models and recurrent neural networks have reduced RMSE values, indicating that they can successfully manage the nonlinear and temporal fluctuations of wind power data. AdaBoost’s consistency and stability across several test situations demonstrate its promise for dependable forecasts under shifting wind conditions. These findings can help guide the selection of acceptable models for operational deployment in wind power forecasting applications, based on specific requirements such as accuracy, computing efficiency, and robustness to data variability. For MAE results, our experiments show that ensemble and deep learning models excel at forecasting wind power, despite its complexities. The tree-based models, particularly RF, exhibit robustness and precision, making them ideal for practical applications. Meanwhile, the success of BiLSTM and BiGRU highlights the growing importance of advanced neural network architectures for properly managing sequential data and the dependencies over time. These insights are crucial for choosing models that provide not just precision but also consistency in real-world wind power prediction scenarios, providing efficient and dependable energy management.
Evaluating predictive accuracy and contrasting the performance of various models are important aspects of analyzing the output of machine learning algorithms used to forecast wind power. In this approach, the DM test and the Bayesian model comparison are essential. By comparing the forecast errors of two models, the DM test permits a direct and reliable comparison of forecast accuracy. The test's robustness to different model specifications and its capacity to handle different forecast error structures make it very helpful.
The DM test is a crucial technique for assessing the efficacy of various machine learning models in forecasting wind power generation. The test enables us to directly evaluate the inaccuracies in predictions made by different models, thereby determining which model provides the most dependable forecasts. Precision in wind power forecasts is crucial in our specific circumstances as it is vital for optimizing energy output and managing the system effectively. Moreover, wind power generation data frequently have non-normal distributions as a result of the intrinsic variability in wind patterns. The DM test's ability to accommodate non-normal forecast errors makes it a suitable candidate for our analysis. By incorporating this test, we address the potential skewness and kurtosis in forecast errors, ensuring that our comparative analysis remains valid and reliable.
Bayesian model comparison, in contrast, offers a probabilistic interpretation of model performance, which is a major benefit over conventional hypothesis testing. It takes prior information into account, which can enhance model comparison, particularly when the data are scarce or noisy. It also quantifies model uncertainty, providing information about the degree of confidence in the assessment, and handles the complex model structures that are frequently seen in wind power forecasts.
These techniques are critical for obtaining precise forecasts in the context of wind power forecasting, which are necessary for operational effectiveness and risk management. Enhanced prediction accuracy facilitates improved energy resource optimization, grid management, and integration of renewable energy sources. Through an extensive evaluation process that employs the DM test and Bayesian model comparison, we can ascertain which models offer the most predictive power, thereby facilitating effective and dependable energy management.
This study presents an in-depth comparative analysis of various forecasting models, providing useful insights into their suitability and effectiveness in diverse energy systems. The findings aid in making educated choices regarding model selection and strategy development for energy grid forecasting. This emphasizes the importance of customizing the methodology based on the unique data features and performance needs of the grid.
One limitation of our study is the lack of detailed power measurements per individual generator. The data provided by the “Administrador del Mercado Mayorista (AMM)” include aggregated power generation values for entire wind farms, which limits the granularity of our analysis. Additionally, the data used in this study are centered on wind farms located in Guatemala. For future research, we aim to include information from wind farms in other countries to provide a more comprehensive and comparative analysis of wind power generation. Moreover, further research should investigate hybrid models that integrate the advantages of tree-based approaches and neural networks, or models like transformers, in order to potentially enhance the accuracy and dependability of energy forecasts. Furthermore, future studies could evaluate the behavior of climate variables to determine the benefits or differences that exist between models. The objective is to ascertain whether the same behavior is observed in forecasting models.

Author Contributions

B.C. and K.K. conducted the initial design and developed the framework. B.C. wrote the draft of the manuscript, while K.K. reviewed and proofread the paper. All authors have read and agreed to the published version of the manuscript.

Funding

This work was supported by the Dongguk University Research Fund of 2024 (S-2024-G0001-00028).

Data Availability Statement

The dataset used in this study is available daily and can be accessed through the “Administrador del Mercado Mayorista (AMM)” website at https://www.amm.org.gt/ (accessed on 31 June 2021). The data belong to the AMM and include detailed information on wind power generation from the specified wind farms in Guatemala. For further details and access to the dataset, please contact or visit the AMM website.

Acknowledgments

We express our sincere gratitude to Josue Obregon from Seoul National University of Science and Technology for graciously lending us the server that facilitated our computational experiments. His support was instrumental in the execution of our research. We are also deeply thankful to Hamideh Baghaei Daemi for her invaluable assistance in acquiring the necessary data for our study. Her expertise and dedication significantly contributed to the success of this project.

Conflicts of Interest

The authors declare no conflicts of interest.

Appendix A

Table A1. Evaluated hyperparameters for the machine learning algorithms.

| Model | Hyperparameters | Values or Distributions |
| --- | --- | --- |
| LR | fit_intercept | [True, False] |
| Ridge | estimator__alpha | Uniform distribution (0.1, 10.0) |
| Huber | estimator__epsilon, estimator__alpha | Uniform distributions (1.35, 1.75), (0.01, 25.0) |
| KNN | estimator__n_neighbors, estimator__weights, estimator__algorithm | randint (2, 30), [‘uniform’, ‘distance’], [‘auto’, ‘ball_tree’, ‘kd_tree’, ‘brute’] |
| DT | max_depth, min_samples_split, min_samples_leaf | randint (2, 30), randint (2, 10), randint (1, 4) |
| RF | n_estimators, max_depth, min_samples_split, min_samples_leaf | randint (5, 205), randint (2, 30), randint (2, 10), randint (1, 4) |
| ExtraTrees | n_estimators, max_depth | randint (5, 205), randint (2, 30) |
| AdaBoost | estimator__n_estimators, estimator__learning_rate, estimator__loss | randint (5, 205), Uniform (0.01, 0.99), [‘linear’, ‘square’, ‘exponential’] |
| Bagging | estimator__n_estimators | randint (5, 205) |
| GB | estimator__n_estimators, estimator__learning_rate, estimator__max_depth | randint (5, 205), Uniform (0.01, 1.0), randint (2, 30) |
| XGBoost | n_estimators, learning_rate, max_depth | randint (5, 205), Uniform (0.01, 1.0), randint (2, 30) |
| NN | regressor__estimator__hidden_layer_sizes, regressor__estimator__activation, regressor__estimator__solver, regressor__estimator__learning_rate | [(30, 30, 30, 3), (50, 50, 50, 3), (100, 60, 60, 60, 20, 10, 3)], [‘relu’, ‘tanh’, ‘logistic’], [‘adam’, ‘sgd’], [‘constant’, ‘adaptive’] |
| GRU | regressor__model__model_type, regressor__model__units, regressor__model__dropout_rate | [‘GRU’], randint (20, 100), Uniform (0.1, 0.3) |
| LSTM | regressor__model__model_type, regressor__model__units, regressor__model__dropout_rate | [‘LSTM’], randint (20, 100), Uniform (0.1, 0.3) |
| CNN | regressor__model__filters, regressor__model__kernel_size | randint (16, 64), [2, 3, 4] |
| BiGRU | regressor__model__model_type, regressor__model__units, regressor__model__dropout_rate, regressor__model__activation | [‘GRU’], randint (32, 128), Uniform (0.1, 0.3), [‘relu’, ‘tanh’] |
| BiLSTM | regressor__model__model_type, regressor__model__units, regressor__model__dropout_rate, regressor__model__activation | [‘LSTM’], randint (32, 128), Uniform (0.1, 0.3), [‘relu’, ‘tanh’] |
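
As an illustration of how search spaces like those in Table A1 can be expressed, the following minimal sketch runs a randomized search over the Ridge row using scikit-learn and SciPy. It is not the authors' code: the synthetic data, the `MultiOutputRegressor` wrapper (suggested only by the `estimator__` prefix), the number of iterations, and the number of splits are all assumptions. Note that SciPy's `uniform(loc, scale)` samples from [loc, loc + scale], so whether "Uniform (0.1, 10.0)" denotes bounds or a location and scale cannot be determined from the table; the sketch reads it as bounds.

```python
import numpy as np
from scipy.stats import uniform
from sklearn.linear_model import Ridge
from sklearn.model_selection import RandomizedSearchCV, TimeSeriesSplit
from sklearn.multioutput import MultiOutputRegressor

rng = np.random.default_rng(0)
# Synthetic stand-in data (assumption): 300 time steps, 6 features,
# 3 targets, one per wind farm.
X = rng.normal(size=(300, 6))
Y = 2.0 * X[:, :3] + rng.normal(scale=0.1, size=(300, 3))

# Read "Uniform (0.1, 10.0)" as lower and upper bounds (assumption).
# scipy's uniform takes loc and scale, so the scale is upper - lower.
param_distributions = {"estimator__alpha": uniform(loc=0.1, scale=10.0 - 0.1)}

search = RandomizedSearchCV(
    MultiOutputRegressor(Ridge()),      # wrapper is hypothetical
    param_distributions=param_distributions,
    n_iter=10,                          # hypothetical search budget
    cv=TimeSeriesSplit(n_splits=5),     # folds always train on the past
    scoring="neg_root_mean_squared_error",
    random_state=0,
)
search.fit(X, Y)

best_alpha = search.best_params_["estimator__alpha"]
best_rmse = -search.best_score_         # scorer returns negated RMSE
print(f"best alpha: {best_alpha:.3f}, CV RMSE: {best_rmse:.3f}")
```

The integer ranges in the table follow the same pattern with `scipy.stats.randint(low, high)`, which excludes the upper bound.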

Appendix B

Figure A1. Comparison results of the DM test.

Appendix C

Figure A2. Heatmap of the results from Bayesian model comparison.

References

  1. Administration, U.S.E.I. Wind Explained—History of Wind Power. Available online: https://www.eia.gov/energyexplained/wind/history-of-wind-power.php (accessed on 1 June 2023).
  2. Lerner, J.; Grundmeyer, M.; Garvert, M. The importance of wind forecasting. Renew. Energy Focus 2009, 10, 64–66. [Google Scholar] [CrossRef]
  3. Jenkins, N.; Burton, T.L.; Bossanyi, E.; Sharpe, D.; Graham, M. Wind Energy Handbook; John Wiley & Sons: Hoboken, NJ, USA, 2021. [Google Scholar]
  4. Ghofrani, M.; Alolayan, M. Time Series and Renewable Energy Forecasting; IntechOpen: London, UK, 2018; Volume 10. [Google Scholar]
  5. Carrera, B.; Sim, M.K.; Jung, J.-Y. PVHybNet: A Hybrid Framework for Predicting Photovoltaic Power Generation Using Both Weather Forecast and Observation Data. IET Renew. Power Gener. 2020, 14, 2192–2201. [Google Scholar] [CrossRef]
  6. Aslam, S.; Herodotou, H.; Mohsin, S.M.; Javaid, N.; Ashraf, N.; Aslam, S. A survey on deep learning methods for power load and renewable energy forecasting in smart microgrids. Renew. Sustain. Energy Rev. 2021, 144, 110992. [Google Scholar] [CrossRef]
  7. Hanifi, S.; Liu, X.; Lin, Z.; Lotfian, S. A critical review of wind power forecasting methods—Past, present and future. Energies 2020, 13, 3764. [Google Scholar] [CrossRef]
  8. Obregon, J.; Han, Y.-R.; Ho, C.W.; Mouraliraman, D.; Lee, C.W.; Jung, J.-Y. Convolutional autoencoder-based SOH estimation of lithium-ion batteries using electrochemical impedance spectroscopy. J. Energy Storage 2023, 60, 106680. [Google Scholar] [CrossRef]
  9. Kim, J.; Obregon, J.; Park, H.; Jung, J.-Y. Multi-step photovoltaic power forecasting using transformer and recurrent neural networks. Renew. Sustain. Energy Rev. 2024, 200, 114479. [Google Scholar] [CrossRef]
  10. Munoz, M.; Morales, J.M.; Pineda, S. Feature-driven improvement of renewable energy forecasting and trading. IEEE Trans. Power Syst. 2020, 35, 3753–3763. [Google Scholar] [CrossRef]
  11. Bellinguer, K.; Mahler, V.; Camal, S.; Kariniotakis, G. Probabilistic Forecasting of Regional Wind Power Generation for the eem20 Competition: A Physics-Oriented Machine Learning Approach. In Proceedings of the 2020 17th International Conference on the European Energy Market (EEM), Stockholm, Sweden, 16–18 September 2020; pp. 1–6. [Google Scholar]
  12. Li, L.-L.; Zhao, X.; Tseng, M.-L.; Tan, R.R. Short-term wind power forecasting based on support vector machine with improved dragonfly algorithm. J. Clean. Prod. 2020, 242, 118447. [Google Scholar] [CrossRef]
  13. da Silva, R.G.; Ribeiro, M.H.D.M.; Moreno, S.R.; Mariani, V.C.; dos Santos Coelho, L. A novel decomposition-ensemble learning framework for multi-step ahead wind energy forecasting. Energy 2021, 216, 119174. [Google Scholar] [CrossRef]
  14. Carrera, B.; Kim, K. Comparison analysis of machine learning techniques for photovoltaic prediction using weather sensor data. Sensors 2020, 20, 3129. [Google Scholar] [CrossRef]
  15. Lin, B.; Zhang, C. A novel hybrid machine learning model for short-term wind speed prediction in inner Mongolia, China. Renew. Energy 2021, 179, 1565–1577. [Google Scholar] [CrossRef]
  16. Stratigakos, A.; van Der Meer, D.; Camal, S.; Kariniotakis, G. End-to-End Learning for Hierarchical Forecasting of Renewable Energy Production with Missing Values. In Proceedings of the 2022 17th International Conference on Probabilistic Methods Applied to Power Systems (PMAPS), Manchester, UK, 12–15 June 2022; pp. 1–6. [Google Scholar]
  17. Alkhayat, G.; Mehmood, R. A review and taxonomy of wind and solar energy forecasting methods based on deep learning. Energy AI 2021, 4, 100060. [Google Scholar] [CrossRef]
  18. Gu, C.; Li, H. Review on deep learning research and applications in wind and wave energy. Energies 2022, 15, 1510. [Google Scholar] [CrossRef]
  19. Baek, M.-W.; Sim, M.K.; Jung, J.-Y. Wind power generation prediction based on weather forecast data using deep neural networks. ICIC Express Lett. Part B Appl. 2020, 11, 863–868. [Google Scholar]
  20. Wu, Q.; Guan, F.; Lv, C.; Huang, Y. Ultra-short-term multi-step wind power forecasting based on CNN-LSTM. IET Renew. Power Gener. 2021, 15, 1019–1029. [Google Scholar] [CrossRef]
  21. Wang, Y.; Zou, R.; Liu, F.; Zhang, L.; Liu, Q. A review of wind speed and wind power forecasting with deep neural networks. Appl. Energy 2021, 304, 117766. [Google Scholar] [CrossRef]
  22. Shahin, M.B.U.; Sarkar, A.; Sabrina, T.; Roy, S. Forecasting Solar Irradiance Using Machine Learning. In Proceedings of the 2020 2nd International Conference on Sustainable Technologies for Industry 4.0 (STI), Dhaka, Bangladesh, 19–20 December 2020; pp. 1–6. [Google Scholar]
  23. Kosovic, B.; Haupt, S.E.; Adriaansen, D.; Alessandrini, S.; Wiener, G.; Delle Monache, L.; Liu, Y.; Linden, S.; Jensen, T.; Cheng, W. A comprehensive wind power forecasting system integrating artificial intelligence and numerical weather prediction. Energies 2020, 13, 1372. [Google Scholar] [CrossRef]
  24. Duan, J.; Wang, P.; Ma, W.; Tian, X.; Fang, S.; Cheng, Y.; Chang, Y.; Liu, H. Short-term wind power forecasting using the hybrid model of improved variational mode decomposition and Correntropy Long Short-term memory neural network. Energy 2021, 214, 118980. [Google Scholar] [CrossRef]
  25. International Renewable Energy Agency (IRENA). Energy Profile Guatemala. Available online: https://www.irena.org/-/media/Files/IRENA/Agency/Statistics/Statistical_Profiles/Central%20America%20and%20the%20Caribbean/Guatemala_Central%20America%20and%20the%20Caribbean_RE_SP.pdf#:~:text=URL%3A%20https%3A%2F%2Fwww.irena.org%2F (accessed on 6 January 2024).
  26. Energypedia. Guatemala Energy Situation. Available online: https://energypedia.info/wiki/Guatemala_Energy_Situation (accessed on 6 January 2024).
  27. Evwind. Wind Energy in Guatemala. Available online: https://www.evwind.es/2020/06/25/wind-energy-in-guatemala/75323 (accessed on 6 January 2024).
  28. EnergiaGuatemala.com. Energy in Guatemala: Current Outlook for This Industry. Available online: https://energiaguatemala.com/en/energy-in-guatemala-current-outlook-for-this-industry/ (accessed on 6 January 2024).
  29. Fulbright, N.R. Renewable Energy in Latin America: Central America. Available online: https://www.nortonrosefulbright.com/en/knowledge/publications/1e7b0a75/renewable-energy-in-latin-america-central-america (accessed on 6 January 2024).
  30. Wiki, G.E.M. Energy Profile: Guatemala. Available online: https://www.gem.wiki/Energy_profile:_Guatemala (accessed on 6 January 2024).
  31. Gobierno de la Republica de Guatemala Ministerio de Energia y Minas. Nuevo Modulo de Estadisticas Energeticas en Guatemala. Available online: https://www.mem.gob.gt/wp-content/uploads/2017/11/MODULO.pdf (accessed on 6 January 2024).
  32. Renewables, C. San Antonio Wind Farm, First Guatemala Wind Farm. Available online: https://www.cjr-renewables.com/en/san-antonio-wind-farm/ (accessed on 6 January 2024).
  33. Viento Blanco. Available online: https://viento-blanco.com/wind-farm/?lang=en (accessed on 6 January 2024).
  34. Power, T.W. Available online: https://www.thewindpower.net/windfarm_es_27390_las-cumbres.php (accessed on 6 January 2024).
  35. James, G.; Witten, D.; Hastie, T.; Tibshirani, R. An Introduction to Statistical Learning, 2nd ed.; Springer: Berlin/Heidelberg, Germany, 2021; Volume 112. [Google Scholar]
  36. Robert, C. Machine Learning, a Probabilistic Perspective; The MIT Press: Cambridge, UK, 2014. [Google Scholar]
  37. Chen, T.; He, T.; Benesty, M.; Khotilovich, V.; Tang, Y. Xgboost: Extreme Gradient Boosting; R Package Version 0.4-2; CRAN.R-project.org; 2015; pp. 1–4. [Google Scholar]
  38. Goodfellow, I.; Bengio, Y.; Courville, A.; Bengio, Y. Deep Learning; MIT Press: Cambridge, UK, 2016; Volume 1. [Google Scholar]
  39. Bergmeir, C.; Benítez, J.M. On the use of cross-validation for time series predictor evaluation. Inf. Sci. 2012, 191, 192–213. [Google Scholar] [CrossRef]
  40. Mariano, R.S.; Preve, D. Statistical tests for multiple forecast comparison. J. Econom. 2012, 169, 123–130. [Google Scholar] [CrossRef]
  41. Diebold, F.X. Comparing predictive accuracy, twenty years later: A personal perspective on the use and abuse of Diebold–Mariano tests. J. Bus. Econ. Stat. 2015, 33, 1. [Google Scholar] [CrossRef]
  42. Mohammed, F.A.; Mousa, M.A. Applying Diebold–Mariano Test for Performance Evaluation between Individual and Hybrid Time-Series Models for Modeling Bivariate Time-Series Data and Forecasting the Unemployment Rate in the USA. In Proceedings of the Theory and Applications of Time Series Analysis: Selected Contributions from ITISE 2019, Granada, Spain, 25–27 September 2019; Springer: New York, NY, USA, 2020; pp. 443–458. [Google Scholar]
  43. Phillips, D.B.; Smith, A.F. Bayesian model comparison via. In Markov Chain Monte Carlo in Practice; Chapman & Hall: London, UK, 1995; pp. 215–240. [Google Scholar]
  44. Geweke, J. Bayesian model comparison and validation. Am. Econ. Rev. 2007, 97, 60–64. [Google Scholar] [CrossRef]
  45. Sun, M.; Zhang, T.; Wang, Y.; Strbac, G.; Kang, C. Using Bayesian deep learning to capture uncertainty for residential net load forecasting. IEEE Trans. Power Syst. 2019, 35, 188–201. [Google Scholar] [CrossRef]
Figure 1. Proposed framework for the comparison analysis of the ML models over wind power forecasting.
Figure 2. Wind farm Viento Blanco located in San Vicente Pacaya, Escuintla, Guatemala. Photo taken on 7 April 2024.
Figure 3. Time series of the wind energy from 3 wind farms in Guatemala: (a) year 2018, (b) year 2019, and (c) year 2020.
Figure 4. Monthly wind power energy for the period 2018–2021.
Figure 5. Pairwise comparison heatmap from the results from Bayesian model comparison.
Table 1. Wind power generation grids in operation in Guatemala.

| ID | Name | Capacity [MW] | Starting Date | Region | Coordinates |
| --- | --- | --- | --- | --- | --- |
| SNT-E1 | San Antonio El Sitio | 52.80 | 19 April 2015 | Guatemala | 14.2124° N, 90.3315° W |
| VBL-E1 | Viento Blanco, S.A. | 23.10 | 6 December 2015 | Escuintla | 14.3912° N, 90.6638° W |
| LCU-E1 | Las Cumbres de Agua Blanca | 31.50 | 25 March 2018 | Jutiapa | 14.2620° N, 89.344° W |
Table 2. Descriptive statistics for each wind farm from 2018 to 2021.

| Wind Farm | Mean Generation [MW] | Standard Deviation [MW] | Minimum Generation [MW] | Maximum Generation [MW] |
| --- | --- | --- | --- | --- |
| SNT-E1 | 19.90 | 27.55 | 0 | 340.55 |
| VBL-E1 | 10.45 | 13.90 | 0 | 151.48 |
| LCU-E1 | 17.37 | 21.39 | 0 | 214.06 |
Table 3. Categorization of machine learning algorithms.

| Category | Algorithms |
| --- | --- |
| Simple | Linear Regression (LR), Ridge, Huber, K-Nearest Neighbors (KNN), Decision Trees (DT) |
| Ensemble | Random Forest (RF), ExtraTrees, AdaBoost, Bagging, Gradient Boosting (GB), XGBoost |
| Deep Learning | Neural Network (NN), GRU, LSTM, CNN, BiGRU, BiLSTM |
Table 4. Best parameters for the machine learning algorithms.

| Model | Best Parameters |
| --- | --- |
| LR | fit_intercept = False |
| Ridge | alpha = 9.795213209218735 |
| Huber | alpha = 23.609862143899665, epsilon = 1.3795486160040937, max_iter = 10,000 |
| KNN | n_neighbors = 21, weights = ‘distance’ |
| DT | max_depth = 4, min_samples_leaf = 2, min_samples_split = 6 |
| RF | max_depth = 5, min_samples_split = 3, n_estimators = 82 |
| ExtraTrees | max_depth = 15, n_estimators = 86 |
| AdaBoost | learning_rate = 0.9472355241265569, n_estimators = 93 |
| Bagging | estimator = BaggingRegressor(n_estimators = 41) |
| GB | learning_rate = 0.22071792035854365, max_depth = 13, n_estimators = 69 |
| XGBoost | learning_rate = 0.5069668065851777, max_depth = 25, n_estimators = 190 |
| NN | activation = ‘logistic’, hidden_layer_sizes = (30, 30, 30, 3), learning_rate = ‘adaptive’, max_iter = 10,000, solver = ‘sgd’ |
| GRU | batch_size = 32, epochs = 50, dropout_rate = 0.11739125554241406, number of neurons = 65, verbose = 0 |
| LSTM | batch_size = 32, epochs = 50, dropout_rate = 0.1591187227490868, number of neurons = 93, verbose = 0 |
| CNN | batch_size = 32, epochs = 50, filters = 61, kernel_size = 4 |
| BiGRU | batch_size = 200, epochs = 128, activation = ‘tanh’, dropout_rate = 0.22956820690724705, number of neurons = 107 |
| BiLSTM | batch_size = 200, epochs = 128, model__activation = ‘tanh’, dropout_rate = 0.12150022102203821, number of neurons = 120 |
Table 5. Results from the performance metrics over the time series cross-validation.

| Model | RMSE Mean | RMSE STD | MAE Mean | MAE STD |
| --- | --- | --- | --- | --- |
| LR | 22.662 | 5.557 | 15.681 | 3.668 |
| Ridge | 23.499 | 4.519 | 16.859 | 2.944 |
| Huber | 22.536 | 6.034 | 14.543 | 3.791 |
| KNN | 18.842 | 4.949 | 11.988 | 3.077 |
| DT | 18.187 | 4.706 | 11.769 | 2.919 |
| RF | 18.067 | 4.759 | 11.594 | 2.835 |
| ExtraTrees | 21.230 | 5.329 | 12.845 | 2.950 |
| AdaBoost | 19.065 | 3.315 | 12.888 | 2.475 |
| Bagging | 21.387 | 5.068 | 12.731 | 2.823 |
| GB | 22.233 | 5.235 | 13.068 | 2.805 |
| XGBoost | 22.173 | 5.328 | 13.001 | 2.873 |
| NN | 20.312 | 5.198 | 13.399 | 2.564 |
| GRU | 17.860 | 4.449 | 11.585 | 2.921 |
| LSTM | 17.781 | 4.647 | 11.832 | 2.868 |
| CNN | 20.181 | 5.553 | 13.165 | 2.932 |
| BiGRU | 18.330 | 5.238 | 11.409 | 3.226 |
| BiLSTM | 17.870 | 4.681 | 11.392 | 3.123 |
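
The mean and standard deviation columns of Table 5 summarize errors across cross-validation folds. The sketch below shows one way such a summary can be computed with scikit-learn's `TimeSeriesSplit`. The synthetic series, the six-lag feature window, the linear model, the five splits, and the use of the population standard deviation are all assumptions made for illustration and are not taken from the study.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_absolute_error, mean_squared_error
from sklearn.model_selection import TimeSeriesSplit


def cv_error_summary(model, X, y, n_splits=5):
    """Return mean and std of RMSE and MAE over expanding-window folds."""
    rmse, mae = [], []
    for train_idx, test_idx in TimeSeriesSplit(n_splits=n_splits).split(X):
        # Each fold trains on earlier observations and tests on later ones.
        model.fit(X[train_idx], y[train_idx])
        pred = model.predict(X[test_idx])
        rmse.append(np.sqrt(mean_squared_error(y[test_idx], pred)))
        mae.append(mean_absolute_error(y[test_idx], pred))
    return {
        "rmse_mean": float(np.mean(rmse)),
        "rmse_std": float(np.std(rmse)),   # population std (ddof=0), an assumption
        "mae_mean": float(np.mean(mae)),
        "mae_std": float(np.std(mae)),
    }


# Synthetic stand-in series (assumption): noisy sine wave.
rng = np.random.default_rng(0)
series = np.sin(np.arange(400) / 10.0) + rng.normal(scale=0.1, size=400)

# Lagged features: predict the next value from the previous n_lags values.
n_lags = 6
X = np.column_stack(
    [series[i : len(series) - n_lags + i] for i in range(n_lags)]
)
y = series[n_lags:]

summary = cv_error_summary(LinearRegression(), X, y)
print(summary)
```

Because RMSE is never smaller than MAE on the same fold, the mean RMSE in such a summary is always at least the mean MAE, which matches the pattern in Table 5.
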
Table 6. Summarized results of high performance from DM test (significant DM-stats, low p-value).

| Model 1 | Model 2 | SNT-E1 DM-Stats | SNT-E1 p-Value | VBL-E1 DM-Stats | VBL-E1 p-Value | LCU-E1 DM-Stats | LCU-E1 p-Value |
| --- | --- | --- | --- | --- | --- | --- | --- |
| Bagging | BiLSTM | 10.2314 | 2.30 × 10^−24 | 11.1001 | 2.41 × 10^−28 | 12.9001 | 1.46 × 10^−37 |
| Ridge | NN | 18.366 | 2.77 × 10^−73 | −0.923 | 3.56 × 10^−1 | −9.242 | 3.32 × 10^−20 |
| Ridge | LSTM | 9.949 | 3.87 × 10^−23 | 4.459 | 8.39 × 10^−6 | 1.894 | 5.82 × 10^−2 |
| Huber | NN | 21.923 | 1.95 × 10^−102 | 15.406 | 1.59 × 10^−52 | 29.742 | 4.29 × 10^−181 |
| Huber | LSTM | 12.442 | 4.31 × 10^−35 | 6.417 | 1.50 × 10^−10 | 7.544 | 5.27 × 10^−14 |
| ExtraTrees | BiLSTM | 9.063 | 1.70 × 10^−19 | 9.140 | 8.48 × 10^−20 | 10.685 | 2.11 × 10^−26 |
| ExtraTrees | GB | −12.671 | 2.58 × 10^−36 | −13.862 | 5.16 × 10^−43 | −14.823 | 8.02 × 10^−49 |
| ExtraTrees | XGBoost | −12.739 | 1.10 × 10^−36 | −14.719 | 3.56 × 10^−48 | −13.934 | 1.95 × 10^−43 |
| GB | BiLSTM | 11.340 | 1.69 × 10^−29 | 11.958 | 1.41 × 10^−32 | 13.348 | 4.72 × 10^−40 |
| XGBoost | BiLSTM | 11.366 | 1.27 × 10^−29 | 11.994 | 9.24 × 10^−33 | 13.329 | 5.99 × 10^−40 |
Table 7. Summarized results of moderate performance from DM test (moderate DM-stats, significant p-value).

| Model 1 | Model 2 | SNT-E1 DM-Stats | SNT-E1 p-Value | VBL-E1 DM-Stats | VBL-E1 p-Value | LCU-E1 DM-Stats | LCU-E1 p-Value |
| --- | --- | --- | --- | --- | --- | --- | --- |
| LR | LSTM | 7.951 | 2.20 × 10^−15 | 4.314 | 1.63 × 10^−5 | 3.907 | 9.46 × 10^−5 |
| Ridge | BiGRU | 5.831 | 5.79 × 10^−9 | 3.141 | 1.69 × 10^−3 | 0.795 | 4.27 × 10^−1 |
| Huber | LSTM | 12.442 | 4.31 × 10^−35 | 6.417 | 1.50 × 10^−10 | 7.544 | 5.27 × 10^−14 |
| KNN | LSTM | 2.000 | 4.55 × 10^−2 | 1.582 | 1.14 × 10^−1 | 4.213 | 2.56 × 10^−5 |
| RF | BiLSTM | −5.420 | 6.19 × 10^−8 | −1.285 | 1.99 × 10^−1 | 5.391 | 7.27 × 10^−8 |
Table 8. Summarized results of low performance from DM test (low or insignificant DM-stats, high p-value).

| Model 1 | Model 2 | SNT-E1 DM-Stats | SNT-E1 p-Value | VBL-E1 DM-Stats | VBL-E1 p-Value | LCU-E1 DM-Stats | LCU-E1 p-Value |
| --- | --- | --- | --- | --- | --- | --- | --- |
| LR | KNN | 5.088 | 3.73 × 10^−7 | 3.326 | 8.88 × 10^−4 | 1.698 | 8.96 × 10^−2 |
| Ridge | ExtraTrees | −1.272 | 0.203 | −1.333 | 1.83 × 10^−1 | −3.414 | 6.45 × 10^−4 |
| Huber | ExtraTrees | 0.908 | 0.364 | 0.624 | 5.33 × 10^−1 | 1.343 | 1.79 × 10^−1 |
| DT | GRU | −1.187 | 0.235 | −2.386 | 1.71 × 10^−2 | −7.715 | 1.41 × 10^−14 |
| RF | BiGRU | −4.975 | 6.72 × 10^−7 | −4.094 | 4.30 × 10^−5 | 1.532 | 1.26 × 10^−1 |
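
To make the DM statistics in Tables 6 to 8 easier to interpret, the following is a minimal sketch of a Diebold–Mariano test. It rests on several assumptions not confirmed by the tables: squared-error loss, a forecast horizon of one step, a two-sided p-value from the standard normal distribution, and a sign convention in which a positive statistic means the second forecast has the lower loss. The function name and the synthetic forecasts are hypothetical, and the authors' implementation may differ, for example by using a small-sample correction.

```python
import numpy as np
from scipy.stats import norm


def diebold_mariano(y_true, pred_1, pred_2, h=1):
    """Return (DM statistic, two-sided p-value) under squared-error loss."""
    y_true, pred_1, pred_2 = map(np.asarray, (y_true, pred_1, pred_2))
    # Loss differential: positive values mean forecast 1 is worse.
    d = (y_true - pred_1) ** 2 - (y_true - pred_2) ** 2
    n = d.size
    d_centered = d - d.mean()
    # Autocovariances of the loss differential for lags 0 .. h-1.
    gamma = [np.sum(d_centered[k:] * d_centered[: n - k]) / n for k in range(h)]
    # Long-run variance; for h = 1 this is just the lag-0 variance.
    long_run_var = gamma[0] + 2.0 * sum(gamma[1:])
    dm_stat = d.mean() / np.sqrt(long_run_var / n)
    p_value = 2.0 * norm.sf(abs(dm_stat))
    return float(dm_stat), float(p_value)


# Synthetic example (assumption): forecast 2 has smaller errors than forecast 1.
rng = np.random.default_rng(0)
y = np.sin(np.arange(500) / 10.0)
pred_1 = y + rng.normal(scale=1.0, size=y.size)
pred_2 = y + rng.normal(scale=0.5, size=y.size)

dm_stat, p_value = diebold_mariano(y, pred_1, pred_2)
print(f"DM statistic: {dm_stat:.3f}, p-value: {p_value:.3g}")
```

Swapping the two forecasts flips the sign of the statistic and leaves the p-value unchanged, so the sign only indicates which model has the lower loss.
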
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

Share and Cite

MDPI and ACS Style

Carrera, B.; Kim, K. Comparative Analysis of Machine Learning Techniques in Predicting Wind Power Generation: A Case Study of 2018–2021 Data from Guatemala. Energies 2024, 17, 3158. https://doi.org/10.3390/en17133158
