4.1. Model Evaluation
Table 13 shows the results of 5-fold cross-validation training on four machine learning models for electric energy consumption forecasting: LSTM, RF, SVR, and XGBoost. The performance measures reported are MAE, median of MAE, and standard deviation of MAE.
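As a minimal sketch of how the three summary measures reported in Table 13 can be obtained from per-fold errors, the snippet below aggregates a list of fold MAE values. The fold values are hypothetical and are not the study's data; the function name `summarize_fold_mae` is illustrative, and the use of the sample standard deviation (n - 1) is an assumption, since the text does not state which variant was used.

```python
from statistics import mean, median, stdev


def summarize_fold_mae(fold_mae):
    """Return the mean, median and sample standard deviation of per-fold MAE values."""
    return {
        "mae": mean(fold_mae),
        "median": median(fold_mae),
        "sd": stdev(fold_mae),  # sample SD (n - 1): an assumption, not stated in the text
    }


# Hypothetical MAE values for five folds, for illustration only.
example_folds = [500.0, 650.0, 700.0, 900.0, 1250.0]
summary = summarize_fold_mae(example_folds)
print(summary)
```

A mean well above the median, as in this toy example, indicates that one or two folds with large errors dominate the average, which is why the mean, the median, and the standard deviation can rank models differently.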
Based on the MAE and standard deviation, the GA–SVR model exhibited the lowest error during 5-fold cross-validation for the Palmas Campus time series (dataset 1), with an MAE of 758.80 and a standard deviation of 242.42. The GA–RF model followed, with an MAE of 758.80 and a much larger standard deviation of 971.01. Considering the median, the GA–RF model presented the best value, 634.00, well ahead of the GA–SVR model, with a median of 941.00. Given its lower standard deviation, the GA–SVR model demonstrated more consistent performance across folds in the training step for dataset 1.
In dataset 2, the GA–XGBoost model achieved the best scores in terms of mean and standard deviation during training, with an MAE of 441.20 and an SD of 224.71, followed by the GA–RF model with an MAE of 464.00 and an SD of 304.82. When considering the median, the GA–LSTM model presented the lowest error, with a value of 451.00, followed by the GA–XGBoost model. However, considering the lower standard deviation, the GA–XGBoost model exhibited the best training scores, presenting smaller fluctuations in values across each training fold for dataset 2.
In the testing phase, model performance was evaluated in a simulated real-world scenario, involving forecasting electricity consumption for 12 consecutive months at both the Palmas and Coronel Vivida campuses. To assess model performance, the MAE and sMAPE performance measures were employed, as shown in Table 14 and Table 15.
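For reference, the two test measures can be sketched as follows. Several sMAPE variants exist in the literature; the version below, bounded between 0% and 200%, is an assumption and may differ from the exact formula used in this study. The input values are hypothetical.

```python
def mae(actual, predicted):
    """Mean absolute error, in the same unit as the series (e.g. kWh)."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def smape(actual, predicted):
    """Symmetric MAPE in percent: mean of 2|a - p| / (|a| + |p|)."""
    total = 0.0
    for a, p in zip(actual, predicted):
        denom = abs(a) + abs(p)
        # When both values are zero the term is defined as zero error.
        total += 0.0 if denom == 0 else 2.0 * abs(a - p) / denom
    return 100.0 * total / len(actual)


# Hypothetical monthly values, for illustration only.
actual = [100.0, 200.0]
predicted = [110.0, 180.0]
print(mae(actual, predicted), smape(actual, predicted))
```

MAE is scale-dependent, so it cannot be compared directly between the two campuses, whereas sMAPE is a relative measure and allows such a comparison.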
In contrast to the 5-fold cross-validation results, the GA–LSTM model demonstrated superior performance in electricity consumption forecasting for dataset 1 over a 12-month-ahead horizon. With an sMAPE of 15.35% and an MAE of 2264.50, the GA–LSTM model significantly surpassed the GA–RF model, which achieved an sMAPE of 21.83% and an MAE of 3211.33. Moreover, the GA–LSTM model also outperformed the traditional time-series forecasting models, among which exponential smoothing was the best performer and ranked second overall, yielding an MAE of 3207.78 and an sMAPE of 21.82%.
Applying the WSB approach with a GA–LSTM model and the weak predictors GA–RF, GA–SVR, and GA–XGBoost resulted in a significant improvement in overall model performance, with an sMAPE of 13.90% and MAE of 1990.87. This demonstrates that the cooperative ensemble learning model is more effective for this specific problem, outperforming individual models in terms of sMAPE and MAE.
The remaining proposed models exhibited relatively similar performance, with sMAPE values ranging from 21.83% to 24.45% and MAE values ranging from 3211.33 to 3760.75. In contrast, the performance of traditional time-series forecasting models and other artificial neural network models was, in most cases, significantly lower, with sMAPE values ranging from 21.82% to 43.31% and MAE values ranging from 3207.78 to 5415.74.
A similar pattern was observed in dataset 2, where the GA–LSTM model once again demonstrated the best performance among the individual models in the testing phase, with an MAE of 466.92 and an sMAPE of 18.80%. Applying the WSB approach with the GA–LSTM further improved the overall performance, achieving an sMAPE of 18.72% and an MAE of 464.93. The GA–RF model also maintained its second-best position, achieving an MAE of 529.00 and an sMAPE of 21.57%. Both models outperformed the traditional time-series forecasting models and the other artificial neural network models. Among the traditional models, the Holt-Winters multiplicative method achieved the best performance, with an sMAPE of 26.77% and an MAE of 668.62. Among the other neural networks, the GA–CNN model demonstrated the best performance, with an sMAPE of 25.68% and an MAE of 633.16. Of the remaining models, the proposed GA–SVR and double exponential smoothing demonstrated the lowest performance, with sMAPE values of 32.68% and 41.59% and MAE values of 752.50 and 771.36, respectively.
The discrepancy between the 5-fold cross-validation results and the 12-step-ahead forecast might be attributed to the occurrence of novel events within the time series, to which the models were not adequately exposed during training. Alternatively, the accumulation of forecast errors over successive predictions could have contributed to the observed divergence.
Figure 10 provides a visual comparison of estimated energy consumption for the IFPR–Palmas Campus, as predicted by each model, against the actual observed values throughout the year 2023. This graphical representation effectively underscores the predictive patterns exhibited by the different models.
A visual analysis indicates that all models demonstrated inaccuracies in predicting the sustained peak in energy consumption observed between June and September 2023. Among the models, the GA–LSTM model exhibited the closest behavior to the actual trend during this period. This anomalous data pattern suggests the occurrence of an unforeseen event, not present in the training dataset, that significantly influenced electricity consumption during this period.
The GA–SVR model exhibited a high degree of stability, consistently underestimating both the amplitude and frequency of data fluctuations and trends. Conversely, the GA–XGBoost model began forecasting in a highly efficient manner, outperforming others in the initial months. However, as errors accumulated, it demonstrated excessive sensitivity to data variations, resulting in highly unstable predictions characterized by frequent reversals in direction.
The GA–RF and GA–XGBoost models initially demonstrated proficiency in capturing upward trends within the time series. However, from March 2023 onwards, the models’ performance declined as they exhibited increased resistance to data fluctuations, converging towards a more stable pattern resembling the behavior of the GA–SVR model.
The GA–LSTM model demonstrated a strong ability to track both upward and downward trends within the time series, consistently maintaining its performance throughout the entire period. However, like the other evaluated models, it had difficulty capturing the energy consumption peak observed between June and September 2023.
As depicted in Figure 11, electricity consumption at the IFPR–Coronel Vivida Campus also proved challenging to forecast. A peak in electricity consumption was observed between January 2023 and July 2023, similar to the IFPR–Palmas Campus time series. All models exhibited inaccurate behavior during this period.
In this instance, the GA–SVR model exhibited inconsistent and unstable forecasting behavior. Despite this new behavior, it failed to adequately fit the data and performed the worst among the proposed models. Similarly, the GA–XGBoost model also experienced a decline in performance.
The GA–LSTM and GA–RF models exhibited the closest alignment with real-world data, with the GA–LSTM being the most suitable for the dataset. However, even these models struggled to accurately capture the peak consumption period between January and July 2023. This limitation underscores the challenge of identifying and responding to anomalous events within time-series data.
Additionally, we explored advanced neural networks in the field of forecasting, such as ESN and CNN, in an attempt to achieve error rates below 10%. Unfortunately, our attempts were unsuccessful, reinforcing the complexity of electricity consumption modeling, as previously noted in the literature.
In our proposed approach, the goal is to obtain 12-month-ahead forecasts using a recursive strategy. This method involves using forecasts from previous months as inputs for predicting the next month. As a result, any errors associated with each forecast accumulate throughout the process, increasing the percentage error by the end. Moreover, to simulate real-world conditions, the model stores its predictions and updates subsequent values based on these forecasted values. Consequently, if the model generates significant errors at any point, subsequent forecasts will be directly impacted. This is one of the reasons why the errors exceed 10%.
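The recursive strategy described above can be sketched as follows. This is a minimal illustration under stated assumptions: `recursive_forecast` and `mean_of_lags` are hypothetical names, the one-step model is a toy stand-in (the mean of the lag window) rather than any of the GA-optimized models, and the series values are invented.

```python
def recursive_forecast(history, one_step_model, n_lags, horizon):
    """Forecast `horizon` steps ahead, feeding each prediction back as an input lag."""
    if len(history) < n_lags:
        raise ValueError("history must contain at least n_lags observations")
    window = list(history[-n_lags:])  # most recent observed values
    forecasts = []
    for _ in range(horizon):
        y_hat = one_step_model(window)
        forecasts.append(y_hat)
        # Drop the oldest lag and append the forecast itself, not an observed value.
        window = window[1:] + [y_hat]
    return forecasts


def mean_of_lags(lags):
    """Toy one-step model used only for illustration."""
    return sum(lags) / len(lags)


history = [10.0, 12.0, 14.0]  # hypothetical monthly consumption
forecasts = recursive_forecast(history, mean_of_lags, n_lags=3, horizon=12)
print(forecasts)
```

After `n_lags` steps the input window consists entirely of forecasts, so an early error is propagated to every later prediction. This is the error accumulation mechanism discussed in the text.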
Furthermore, the complexity of electricity consumption modeling has been emphasized in the literature. For example, in [13], models with a 3-month forecast horizon showed MAPE errors as high as 31%. Similarly, for one-step-ahead forecasting in [15], the performance of the compared models ranged between 5.9% and 11.3%.
4.2. SHAP Analysis
Interpretability is essential for fostering trust and transparency in artificial intelligence systems. SHAP values provide invaluable insights into model behavior by quantifying the contribution of each feature to a specific prediction. Beyond feature selection, SHAP values elucidate the magnitude and direction of feature influence, enhancing model comprehension and explainability.
To identify the key determinants of energy consumption at IFPR and to optimize resource allocation, a SHAP analysis was conducted on the GA–RF model, given its best performance among those suitable for SHAP analysis in this study. This approach enabled the quantification of each variable's contribution to the model predictions [21].
SHAP values are commonly visualized graphically to provide insights into the magnitude and direction of each feature’s impact on the model’s predictions. The arrows within each plot represent the impact of individual features on the final prediction. Positive values (rightward red arrows) indicate features that increase the predicted value, whereas negative values (leftward blue arrows) decrease it. The cumulative effect of these feature contributions determines the overall prediction.
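The additive reading of these plots (base value plus the sum of the feature contributions equals the prediction) can be illustrated with an exact, brute-force Shapley computation on a toy model. This is a sketch under stated assumptions and not the procedure used in this study: the model, the feature names, and the values are hypothetical, and a single baseline point stands in for the background data that SHAP implementations typically average over.

```python
from itertools import combinations
from math import factorial


def shapley_values(model, x, baseline):
    """Exact Shapley values of `model` at `x` relative to one `baseline` point.

    Features outside a coalition are replaced by their baseline value.
    """
    n = len(x)

    def value(coalition):
        point = [x[i] if i in coalition else baseline[i] for i in range(n)]
        return model(point)

    phi = []
    for i in range(n):
        others = [j for j in range(n) if j != i]
        contribution = 0.0
        for size in range(n):
            # Shapley weight for coalitions of this size.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                s = set(subset)
                contribution += weight * (value(s | {i}) - value(s))
        phi.append(contribution)
    return phi


def toy_model(features):
    """Hypothetical consumption model with one interaction term."""
    lag_1, lag_2, temperature = features
    return 0.5 * lag_1 + 0.3 * lag_2 + 2.0 * temperature + 0.01 * lag_1 * temperature


x = [100.0, 90.0, 20.0]        # instance to explain (hypothetical)
baseline = [80.0, 85.0, 15.0]  # reference point (hypothetical)
phi = shapley_values(toy_model, x, baseline)
base_value = toy_model(baseline)
reconstructed = base_value + sum(phi)
print(phi, base_value, reconstructed)
```

A positive entry in `phi` corresponds to a rightward (red) arrow and a negative entry to a leftward (blue) arrow; by construction, `reconstructed` matches `toy_model(x)`.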
Regarding the Palmas Campus, the GA–RF model predicted an average electricity consumption of 15,441.89 kWh. Each feature included in the model contributed to this prediction, either increasing or decreasing the forecast value.
Figure 12 presents the average values for each feature and their corresponding contributions.
The prominent SHAP values associated with Lag-06, along with the substantial contributions of Lag-02 and Lag-03, underscore a strong dependence of electricity consumption on its historical values. These findings align with the results of the KPSS test, which confirmed the presence of autocorrelation in the initial lags of the time series.
The significant impact of the “year 2019” variable suggests a notable deviation in electricity consumption patterns for that year. The variable’s importance to the model performance indicates that its exclusion would have negatively impacted the accuracy of predictions. This finding highlights the need for further investigation into the factors contributing to this phenomenon.
Among the climate variables, “absolute average maximum temperature” and “absolute minimum temperature” exhibited a positive correlation with energy consumption. Although these variables were of lesser importance compared to others, the findings suggest a relationship between temperature variations and electricity consumption in the IFPR–Palmas Campus region, especially considering its colder climate. However, further research is needed to confirm this hypothesis.
For the Coronel Vivida Campus, the GA–RF model predicted an average electricity consumption of 2164 kWh. The distribution of feature importance and direction differed significantly from the results obtained for the Palmas Campus, as illustrated in Figure 13.
In this analysis, the SHAP values for features “Lag-03”, “Lag-06”, and “Lag-01” exhibited the highest contributions to the electricity consumption forecast. Unlike the Palmas Campus, where these lagged values had a positive impact, in Coronel Vivida, they negatively influenced the forecast. This implies that higher lagged consumption values tend to decrease the predicted consumption for the current period, suggesting an inverse relationship between past and present consumption at this campus.
The “year-2019” feature once again proved to be a significant predictor in the model, this time exerting a positive influence. This suggests that this feature is increasing the predicted value of electricity consumption.
The climatic feature “absolute minimum temperature” contributed only marginally to the model. Although it was the most influential of the climatic variables included for the Coronel Vivida Campus, its overall impact was relatively small. As for the Palmas Campus, this variable contributed positively to the forecast. This suggests a comparable influence of regional climate on electricity consumption and warrants further investigation.