This section outlines the data utilized in the experiments, focusing on heat and electric energy sources, and details the experimental process and the results of applying the proposed architecture to predict energy consumption patterns. The experiments apply the proposed architecture to district heat energy for energy prediction and, to verify its scalability to other domains, to electric energy as well.
4.1. Heat Energy in Cheongju, Republic of Korea
This section presents the experiments conducted using the proposed architecture with heat energy data. Section 4.1.1 provides a description of the data utilized in the experiments. Section 4.1.2 outlines the process optimization methods described in Section 3.2. Finally, Section 4.1.3 compares and evaluates five different models across various scenarios based on the results obtained in Section 4.1.2.
4.1.1. Dataset Description
Heat supply data, used to predict actual heat energy consumption, were collected from an eco-friendly liquefied natural gas combined heat and power plant in Cheongju, Republic of Korea. The data were collected on an hourly basis from 2012 to 2021, comprising a total of 87,672 data points. Heat supply refers to the amount of heat delivered from the power plant to consumers, measured in gigacalories (Gcal); it covers heat consumption for space heating and domestic hot water usage.
Figure 6 visualizes the profile of the heat energy usage, showing a recurring pattern with a one-year cycle, reflecting its seasonal characteristics.
To understand the periodicity of energy usage and its correlation with outdoor temperature, one of the key variables, the monthly average energy usage and monthly average temperature are visualized in Figure 7. The x-axis represents the months, and the y-axis represents the monthly average energy usage and the monthly average temperature. The monthly averages are computed by summing the respective values across different years for each month and dividing by the corresponding number of data points. This visualization reveals that energy usage is high in the low-temperature period and low in the high-temperature period, indicating a seasonal pattern in energy consumption.
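The monthly aggregation just described can be reproduced with a few lines of pandas. The following is a minimal sketch on synthetic data; the column names and values are hypothetical stand-ins for the Cheongju dataset.

```python
# Minimal sketch of the monthly-average computation described above, assuming
# an hourly DataFrame with hypothetical "energy" (Gcal) and "temperature"
# (deg C) columns. The index length matches the 87,672 hourly points cited.
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
idx = pd.date_range("2012-01-01 00:00", "2021-12-31 23:00", freq="h")
df = pd.DataFrame(
    {
        "energy": rng.normal(100.0, 20.0, len(idx)),
        "temperature": rng.normal(12.0, 10.0, len(idx)),
    },
    index=idx,
)

# Pool all hours sharing the same calendar month across years, then average:
# sum of values for that month over all years / number of data points.
monthly = df.groupby(df.index.month)[["energy", "temperature"]].mean()
monthly.index.name = "month"
print(monthly)
```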
Figure 8 shows the relationship between daily mean temperature and energy usage deviation: higher temperatures generally correspond to a decrease in energy usage deviation, whereas lower temperatures correspond to an increase.
Heat energy demand is influenced by climate-related meteorological variables, including outdoor temperature, wind speed, solar radiation, humidity, and precipitation [69]. The weather data consist of hourly observations for Cheongju and are used to predict the hourly heat supply in Cheongju. The latitude and longitude coordinates for Cheongju are approximately 37.5714 and 126.9658. The minimum, maximum, mean, and standard deviation of the meteorological variables and heat energy used for prediction are described in Table 2.
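Table-2-style summary statistics can be computed directly from such a frame. The sketch below uses hypothetical column names and synthetic values in place of the paper's variables.

```python
# Sketch of producing min/max/mean/std summary statistics as in Table 2.
# Column names are placeholders for the paper's meteorological variables.
import numpy as np
import pandas as pd

rng = np.random.default_rng(1)
frame = pd.DataFrame(
    {
        "heat_supply_gcal": rng.normal(100.0, 25.0, 1000),
        "temperature_c": rng.normal(12.0, 10.0, 1000),
        "wind_speed_ms": rng.gamma(2.0, 1.5, 1000),
        "humidity_pct": rng.uniform(20.0, 100.0, 1000),
    }
)
print(frame.agg(["min", "max", "mean", "std"]).round(4))
```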
4.1.2. Process Optimization Methods
The heat energy dataset was optimized using the process optimization methods defined in Section 3.2. Under the proposed architecture, a total of 1440 scenarios were generated by combining eight data cleaning methods, four data split patterns, and 45 combinations of data split ratios and years. The data split ratios of the 1440 scenarios are shown in Table 3.
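The scenario grid itself is a Cartesian product of the three condition sets. A minimal sketch follows, with placeholder labels for the individual methods, since the paper does not enumerate them by name.

```python
# Enumerate the 1440 scenarios: 8 data cleaning methods x 4 data split
# patterns x 45 split-ratio/year combinations. All labels are placeholders.
from itertools import product

cleaning_methods = [f"dataset{i}" for i in range(8)]        # 8 cleaning variants
split_patterns = [f"pattern{i}" for i in range(4)]          # 4 split patterns
ratio_year_combos = [(r, y) for r in range(1, 10) for y in range(1, 6)]  # 45 assumed combos

scenarios = list(product(cleaning_methods, split_patterns, ratio_year_combos))
assert len(scenarios) == 8 * 4 * 45 == 1440
print(scenarios[0])  # ('dataset0', 'pattern0', (1, 1))
```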
First, the scenarios are grouped by data split pattern, and the most frequently used data cleaning methods are selected as candidates for condition1 optimization. Examining the share taken under each evaluation metric, the top R2 scores are mostly occupied by dataset1, dataset2, and dataset3, among which dataset2 and dataset1 show higher performance. For MAE and RMSE, dataset0, dataset1, and dataset3 occupy the highest share, with dataset0 and dataset1 performing better. For MAPE, dataset1, dataset4, and dataset5 occupy the highest share, with dataset1 performing best. Overall, dataset1 is selected as the optimized data cleaning method.
Similarly, for data split patterns, the distribution was examined, and the most frequently used patterns were selected as optimization candidates. For the R2 score, 4 months and 6 months occupy the highest share, with 4 months performing better. For MAE, 6 months and 4 months occupy the highest share, with 6 months performing better. For RMSE, 12 months and 6 months occupy the highest share, with 12 months performing better. For MAPE, 6 months and 12 months occupy the highest share, with 12 months performing better. Overall, 12 months is selected as the optimized data split pattern.
Finally, to select the data split ratio, the previously selected data cleaning method and data split pattern are fixed: scenarios are grouped on dataset1 and 12 months, and the top 10% are sorted by the performance metrics. From these top-ranked results, the data split ratio showing the highest performance is determined. Despite the differential training of data split ratios, training with a ratio of 83:17 shows the highest performance. The configuration selected in this manner shows improved performance compared with training on all data except the test data.
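The top-10% frequency analysis behind this selection can be sketched as follows; the results table and its columns are hypothetical.

```python
# Sketch of the condition-selection step: for each metric, keep the best 10%
# of scenarios and count how often each data cleaning method appears there.
import numpy as np
import pandas as pd

rng = np.random.default_rng(2)
results = pd.DataFrame(
    {
        "cleaning": rng.choice([f"dataset{i}" for i in range(8)], 1440),
        "r2": rng.uniform(0.70, 0.98, 1440),
        "mae": rng.uniform(5.5, 13.0, 1440),
    }
)

def top10_counts(df: pd.DataFrame, metric: str, ascending: bool) -> pd.Series:
    """Sort by a metric (ascending for errors, descending for R2), keep the
    best 10%, and tally which cleaning methods dominate that slice."""
    k = max(1, int(len(df) * 0.10))
    return df.sort_values(metric, ascending=ascending).head(k)["cleaning"].value_counts()

print(top10_counts(results, "r2", ascending=False))
print(top10_counts(results, "mae", ascending=True))
```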
Table 4 presents the maximum, minimum, and average values for each metric in the “Categorized cases” category. This category encompasses the 1440 cases obtained by combining all eight data cleaning methods, four data split pattern methods, and 45 methods that consider the data split ratio and frequency for heat energy. The results demonstrate that the R2 score, MAE, RMSE, and MAPE may vary by up to 0.2001, 7.3107 Gcal, 15.7568 Gcal, and 6.1940e+12%, respectively, depending on the case selected. When the R2 score reaches its maximum across all cases (0.9760), the corresponding MAE, RMSE, and MAPE are 6.1373 Gcal, 8.3646 Gcal, and 12.85%, respectively; these are not the optimal values for those indicators. Similarly, when MAE is at its optimum (5.7721 Gcal), the R2 score, RMSE, and MAPE are 0.9666, 7.9720 Gcal, and 13.36%, respectively. When RMSE is at its optimum (7.9365 Gcal), the R2 score, MAE, and MAPE are 0.9642, 5.8812 Gcal, and 14.34%, respectively. When MAPE is at its optimum (11.93%), the R2 score, MAE, and RMSE are 0.9718, 5.9272 Gcal, and 8.1546 Gcal, respectively. Consequently, it is crucial to consider each indicator equally, as the apparent best case varies depending on which indicator is prioritized. Accordingly, in the proposed method, the data cleaning method and the data split pattern method are selected through comparative analysis of the top 10% for each evaluation index. The performance of each evaluation index for the conditions satisfying the selected methods can be found in Table 5. It should be noted that the highest performance values observed in Table 5 may not correspond to those in Table 4; however, a comparison of the average values reveals that all of them demonstrate an improvement in performance.
Table 6 presents the results of training the LightGBM model on heat energy using two different time periods: 5 years and 10 years. The shorter period yields improvements of 0.0019 in R2 score, 0.1612 in MAE, 0.0882 in RMSE, and 0.56 in MAPE, confirming that using a larger amount of data does not necessarily result in better performance.
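A minimal sketch of such a history-length comparison with LightGBM's scikit-learn API is shown below; the synthetic features and target are placeholders for the Cheongju data.

```python
# Compare training on the last 5 vs 10 years while testing on the final year.
import numpy as np
import pandas as pd
from lightgbm import LGBMRegressor
from sklearn.metrics import mean_absolute_error, r2_score

rng = np.random.default_rng(3)
idx = pd.date_range("2012-01-01 00:00", "2021-12-31 23:00", freq="h")
X = pd.DataFrame(
    {"temperature": rng.normal(12.0, 10.0, len(idx)),
     "hour": idx.hour, "month": idx.month},
    index=idx,
)
y = 100.0 - 2.5 * X["temperature"] + rng.normal(0.0, 5.0, len(idx))

test = idx >= "2021-01-01"                 # hold out the final year
for years in (5, 10):
    # The 10-year window reaches back to the start of the data (2012).
    train = (idx >= f"{2021 - years}-01-01") & ~test
    model = LGBMRegressor(n_estimators=200).fit(X[train], y[train])
    pred = model.predict(X[test])
    print(f"{years} years: R2={r2_score(y[test], pred):.4f} "
          f"MAE={mean_absolute_error(y[test], pred):.4f}")
```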
4.1.3. Prediction
After data optimization, the performance of the five models is compared in different scenarios. As shown in Table 7, each scenario was trained with a different period, and the other conditions were based on the process optimization method.
Table 8 displays the performance of the five models across five scenarios. According to these results, LightGBM generally performs well in all five scenarios: its R2 score is high, its MAE, RMSE, and MAPE are low, and its training time is relatively short. Therefore, LightGBM is the best option for this dataset. CatBoost can also be considered a viable alternative, demonstrating performance similar to LightGBM, especially in scenario C; it is noteworthy that this model performs well in terms of training time as well as predictive performance. Although the preprocessing was optimized using LightGBM, it evidently yields good performance for the other models as well. These results indicate that the proposed optimization method is model-agnostic and can lead to an overall performance improvement.
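The comparison loop behind Table 8 can be sketched as below; the models shown use the scikit-learn fit/predict interface (an LSTM would need its own training loop), and the data are synthetic placeholders.

```python
# Train several regressors under identical conditions and report the four
# metrics plus wall-clock training time, as in Table 8.
import time
import numpy as np
from catboost import CatBoostRegressor
from lightgbm import LGBMRegressor
from sklearn.metrics import (mean_absolute_error, mean_absolute_percentage_error,
                             mean_squared_error, r2_score)
from sklearn.neural_network import MLPRegressor
from xgboost import XGBRegressor

rng = np.random.default_rng(4)
X = rng.normal(size=(5000, 6))
y = X @ rng.normal(size=6) + rng.normal(0.0, 0.1, 5000) + 50.0
X_tr, X_te, y_tr, y_te = X[:4000], X[4000:], y[:4000], y[4000:]

models = {
    "LightGBM": LGBMRegressor(),
    "CatBoost": CatBoostRegressor(verbose=0),
    "XGBoost": XGBRegressor(),
    "MLP": MLPRegressor(max_iter=500),
}
for name, model in models.items():
    t0 = time.perf_counter()
    model.fit(X_tr, y_tr)
    elapsed = time.perf_counter() - t0
    pred = model.predict(X_te)
    print(f"{name}: R2={r2_score(y_te, pred):.4f} "
          f"MAE={mean_absolute_error(y_te, pred):.4f} "
          f"RMSE={mean_squared_error(y_te, pred) ** 0.5:.4f} "
          f"MAPE={100 * mean_absolute_percentage_error(y_te, pred):.2f}% "
          f"time={elapsed:.2f}s")
```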
Figure 9 shows the prediction results of LightGBM on the test data. The green solid line represents the predicted values, while the red solid line denotes the real values. The x-axis and y-axis represent the dates of the test data and the heat energy usage [Gcal], respectively.
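A Figure-9-style overlay plot can be drawn as in the sketch below, with placeholder arrays standing in for the model output and test data.

```python
# Plot predicted (green) vs. real (red) values over the test dates.
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

dates = pd.date_range("2021-01-01", periods=24 * 14, freq="h")
real = 100.0 + 30.0 * np.sin(np.arange(len(dates)) * 2.0 * np.pi / 24.0)
pred = real + np.random.default_rng(5).normal(0.0, 5.0, len(dates))

plt.figure(figsize=(10, 4))
plt.plot(dates, pred, color="green", label="Predicted")
plt.plot(dates, real, color="red", label="Real")
plt.xlabel("Date")
plt.ylabel("Heat energy usage [Gcal]")
plt.legend()
plt.tight_layout()
plt.show()
```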
4.2. Electric Energy Dataset in Jeju, Republic of Korea
This section details the experiments conducted using the proposed architecture with electric energy to evaluate its scalability across different domains.
Section 4.2.1 provides an overview of the data used in these experiments, Section 4.2.2 details the process optimization techniques described in Section 3.2, and Section 4.2.3 delivers a comparative evaluation of five distinct models across different scenarios based on the outcomes from Section 4.2.2.
4.2.1. Dataset Description
In addition to heat energy, a dataset from a power generation unit on Jeju Island, Republic of Korea, was used to predict actual electric energy usage. The data were collected on an hourly basis from 2007 to 2021, comprising a total of 131,496 data points. Electricity demand performance refers to the electricity demand adjusted to the urgently required amount at the power generation unit, and the unit is the megawatt-hour (MWh), a measure of the amount of electric energy consumed.
Figure 10 visualizes the electric energy usage, displaying a recurring pattern with a one-year cycle.
To understand this repeating cycle, the monthly average energy usage and temperature are visualized in Figure 11. The x-axis corresponds to the months, while the left y-axis represents the monthly average electric energy usage and the right y-axis represents the monthly average temperature in Jeju. The monthly average is calculated by summing the values for the same month across different years and dividing by the number of data points for that month. This visualization shows that energy usage is higher at both high and low temperatures than at moderate ones.
Figure 12 visualizes the relationship between daily mean temperature and electric energy usage deviation. The deviation tends to increase at high or low temperatures, which may be related to the intense use of electric cooling and heating devices, such as air conditioners, respectively [70]. This trend indicates a seasonal pattern in energy usage that is closely related to the weather.
Climate data from Jeju are used to predict the hourly electricity consumption in Jeju. The latitude and longitude coordinates for Jeju are approximately 33.5141 and 126.5297. The minimum, maximum, average, and standard deviation of the meteorological variables and electric energy used for prediction can be found in Table 9.
4.2.2. Process Optimization Methods
In the process optimization methods, a total of 3360 scenarios are generated based on the conditions, combining eight data cleaning methods, four data split patterns, and 105 different combinations of data split ratio and year. Table 10 shows the data split ratios of the 3360 scenarios.
The data split patterns were grouped in the same way as in the process optimization for heat energy, and the top 10% of high-performance results were sorted within each group and metric. Considering the distribution of data cleaning methods within this top 10%, dataset3 was selected.
Then, grouping by data cleaning method, the distribution of high-performance data split patterns was examined to select the data split pattern. As a result, the overall performance was highest when the data split pattern was 12 months.
Finally, the data split ratio was determined by grouping on the previously selected dataset3 and 12 months. From the sorted top results, the data split ratio demonstrating the highest performance was identified. Despite the differential training of data split ratios, the case trained with a ratio of 87.5:12.5 exhibits the highest performance.
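Because the data are a time series, the 87.5:12.5 split is chronological rather than shuffled; below is a minimal sketch with a hypothetical hourly frame.

```python
# Chronological 87.5:12.5 train/test split; the index length matches the
# 131,496 hourly points of the Jeju dataset.
import numpy as np
import pandas as pd

idx = pd.date_range("2007-01-01 00:00", "2021-12-31 23:00", freq="h")
df = pd.DataFrame(
    {"load_mwh": np.random.default_rng(6).normal(500.0, 80.0, len(idx))},
    index=idx,
)

ratio = 0.875                      # 87.5% train, 12.5% test, no shuffling
cut = int(len(df) * ratio)
train, test = df.iloc[:cut], df.iloc[cut:]
print(len(train), len(test))       # time order is preserved
```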
Table 11 presents the maximum, minimum, and average values for each indicator across the 3360 cases obtained by combining eight data cleaning methods, four data split pattern methods, and 105 methods that consider the data split ratio and frequency for electric energy. Across all cases, the difference between the maximum and minimum values is 0.7494 for the R2 score, 32.0729 for MAE, 40.0391 for RMSE, and 4.29 for MAPE. When the R2 score reaches its maximum of 0.8868, the corresponding MAE, RMSE, and MAPE (24.0195, 32.8413, and 3.81) are not the best values; when MAPE is at its best (3.80), the R2 score, MAE, and RMSE (0.8827, 24.0023, and 33.4396) are likewise not the best. When MAE is at its best (19.9757), RMSE is also at its best (25.1399), but the R2 score and MAPE at this point (0.7142 and 4.83) are not the best values. This result mirrors that for heat energy, confirming once again that the indicators must be considered evenly, because the selected optimization configuration may vary depending on which indicator is prioritized.
Table 12 shows the maximum, minimum, and average performance values of each evaluation index for the cases that satisfy the data cleaning method and data split pattern method selected through the proposed method, which considers the evaluation indicators evenly. While the highest performance values for each indicator in Table 12 may differ from those in Table 11, a comparison of the average values shows an overall performance improvement.
Furthermore, comparing the results obtained through the proposed method with those obtained from training on all data except the same test data, the R2 score improved by 0.0098, the MAE by 1.1626, and the RMSE by 0.17. This demonstrates that using an appropriate amount of data to capture time series patterns, as observed for heat energy, can yield satisfactory performance.
4.2.3. Prediction
After data optimization, the performance of the five models is compared across nine scenarios. As indicated in Table 13, each scenario was trained on a different period of years, and the other conditions were determined by the process optimization method.
Table 14 shows the performance of the five models in the nine scenarios. Each model excels in different aspects, making the selection of a final model challenging. The LSTM demonstrates a remarkable R2 score in some scenarios, as expected of a model designed for sequential data such as time series, but it incurs relatively long training times. The MLP also excels in several scenarios in terms of MAE and RMSE, showcasing its ability to model nonlinear relationships, although it too can be time-consuming. The boosting models (CatBoost, LightGBM, and XGBoost) are capable of rapid learning, with CatBoost excelling in R2 score, MAE, and RMSE in certain scenarios. Taking the experimental results into comprehensive consideration, CatBoost emerges as the preferred final model for the given dataset and scenarios. These results indicate that the proposed architecture improves performance not only for heat energy but also for electric energy, opening up the possibility of extending the architecture to other energy prediction tasks.
Figure 13 illustrates the electric energy prediction results for scenario I using LightGBM on the test data. The green solid line denotes the predicted values, whereas the red solid line represents the observed values. The x-axis displays the dates of the test data, and the y-axis represents the electric energy usage [MWh].
4.3. Prediction Results and Discussions
Empirical results using heat energy and electric energy confirmed that data cleaning, data split patterns, and data split ratios, selected as conditions for optimizing the data, significantly impact model performance. This underscores the importance of selecting conditions tailored to the specific data. Additionally, high performance across various indicators was achieved by considering both the characteristics of the indicators and the inherent properties of time series data.
This section describes, through the application of XAI, the input variables that influenced the models’ results. The prediction models for heat energy and for electric energy are each explained, and the two domains are then compared and analyzed.
Figures 14 and 15 visualize the variable importance using SHAP for the LightGBM models applied to predict heat energy and electric energy, respectively. In a summary plot generated from SHAP values, the y-axis lists the features in descending order of importance, with the most important features at the top and the least important at the bottom, while the x-axis shows the SHAP values, which indicate a contribution toward increasing the output when positive and decreasing it when negative.
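Summary plots of this kind can be generated with the SHAP library's tree explainer; the sketch below fits LightGBM on synthetic data with stand-in feature names.

```python
# Sketch of a SHAP summary plot for a fitted LightGBM regressor, as in
# Figures 14 and 15. Data and feature names are illustrative placeholders.
import numpy as np
import pandas as pd
import shap
from lightgbm import LGBMRegressor

rng = np.random.default_rng(7)
X = pd.DataFrame(
    {
        "ground_temperature": rng.normal(14.0, 9.0, 3000),
        "temperature": rng.normal(12.0, 10.0, 3000),
        "hour": rng.integers(0, 24, 3000),
        "month": rng.integers(1, 13, 3000),
    }
)
y = (100.0 - 2.0 * X["temperature"] - 1.5 * X["ground_temperature"]
     + 0.5 * X["hour"] + rng.normal(0.0, 5.0, 3000))

model = LGBMRegressor().fit(X, y)
explainer = shap.TreeExplainer(model)    # efficient explainer for tree models
shap_values = explainer.shap_values(X)
shap.summary_plot(shap_values, X)        # features sorted by mean |SHAP|
```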
For heat energy, the top five most important variables are ground temperature, temperature, hour, month, and solar radiation (Radiation). Figure 14 shows that high values of ground temperature and temperature contribute to predicting lower energy usage, while low values contribute to predicting higher energy usage. For hour, low values contribute to predicting lower energy usage and high values to predicting higher energy usage.
In contrast, for electric energy, the top five most important variables are year, hour, temperature, ground temperature, and month. Figure 15 shows that low values of year and hour contribute to predicting lower energy usage, and high values contribute to predicting higher energy usage. For ground temperature and temperature, values that are either very low or very high contribute to predicting higher energy usage.
These results indicate that different features have a significant impact on the model depending on the dataset. Energy types such as heating are more influenced by temperature, while electric energy is strongly influenced by time. This suggests that seasonal and social factors, such as temperature, time, and day, affect energy prediction.
The primary factors influencing energy output were identified through the XAI methodology. The analysis revealed that time and temperature variables significantly impact energy output. SHAP values provided detailed insights into the model’s decision-making process by quantifying each feature’s contribution. Elucidating these influential factors through XAI clarifies how the model makes predictions and why. This enhances the model’s robustness and predictive accuracy. Furthermore, this approach can be applied to other tasks, such as dimensionality reduction or feature selection, thus improving overall model performance and reliability in various applications.
Two experiments confirmed that the optimized-data determination method in the preprocessing phase leads to performance improvement. Accurately predicting energy consumption in advance with improved performance, and thereby choosing low-carbon methods for energy production, can be an effective way to achieve carbon neutrality. In addition, despite the different patterns of heat and electric energy, the proposed architecture has been confirmed to be effective for both. These results suggest scalability to other energy data affected by climate, lifestyle, and time. If energy consumption is predicted by effectively utilizing various energy data, energy management and optimization will enable energy conservation and the planning of sustainable energy use. Moreover, the approach is expected to help with energy management and optimization by providing a way to build and interpret a data-based energy prediction model.
In this study, the preprocessing of time series energy data and the use of SHAP were employed to develop a robust, generalizable model. Key features with significant temporal changes and high fluctuation likelihood were carefully considered to ensure model reliability. Additionally, the model is designed to utilize four metrics to assess its performance. While this provides a robust foundation for evaluation, future research could involve the incorporation of additional evaluation metrics. This would enhance the assessment process, providing a more comprehensive understanding of the method’s performance across different conditions and datasets.
The approach proposed by the authors emphasizes enhancing the model’s generalizability through comprehensive experimental design and robust preprocessing techniques. Methods such as MinMaxScaler for normalization, advanced imputation strategies for handling missing data, and careful feature engineering were incorporated to ensure that the model could effectively adapt to various time series datasets. These preprocessing steps help maintain the integrity and consistency of the data, which is crucial for achieving reliable predictions across different contexts. Furthermore, the model was evaluated using a diverse set of prediction algorithms, including XGBoost, LightGBM, CatBoost, MLP, and LSTM. This multimodel evaluation demonstrated that the proposed approach consistently improves performance, regardless of the specific algorithm used. The use of SHAP values for explainability also contributed to the model’s robustness by identifying and emphasizing the most influential features, thereby enhancing the interpretability and reliability of the predictions. Experimental results indicate that the proposed method not only achieves high predictive accuracy but also maintains strong generalizability across different datasets and conditions. This is evidenced by the consistent performance improvements observed in various test scenarios, highlighting the adaptability of the proposed approach. However, certain limitations, such as sensitivity to extreme outliers and domain-specific nuances, are acknowledged and may require further attention in future research.
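A minimal sketch of the preprocessing combination the paragraph describes, using scikit-learn's pipeline utilities; the column names and the mean-imputation strategy are assumptions, not the paper's exact configuration.

```python
# Missing-value imputation followed by MinMax normalization in one pipeline.
import numpy as np
import pandas as pd
from sklearn.impute import SimpleImputer
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import MinMaxScaler

df = pd.DataFrame(
    {
        "temperature": [10.2, np.nan, 11.8, 9.5],
        "humidity": [55.0, 60.0, np.nan, 58.0],
    }
)

prep = Pipeline([
    ("impute", SimpleImputer(strategy="mean")),  # fill gaps before scaling
    ("scale", MinMaxScaler()),                   # map each feature to [0, 1]
])
X = prep.fit_transform(df)
print(np.round(X, 3))
```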
A superior model performs well under all circumstances. It is crucial to investigate models that demonstrate exceptional performance and adaptability to sudden data changes. Future developments will focus on generating virtual data representing anomalous phenomena using generative adversarial networks or other generative models. This approach aims to create models capable of simulating and adapting to unusual events, thereby enhancing their robustness and applicability.