Hourly Building Energy Consumption Prediction Using a Training Sample Selection Method Based on Key Feature Search
Abstract
1. Introduction
- (1) The key feature parameters that characterize the future energy consumption scenario are incorporated directly into the shortlisting of model training samples. This makes it possible to acquire data quickly and build accurate predictive models, and it also improves the model's interpretability.
- (2) The KFSS technique addresses the clustering method's problem of insufficient training sets while also accommodating changes on the predicted day (e.g., the HBECP problem when the AC switch-on mode changes in the transition season). The study evaluates the model's predictive ability throughout the year, particularly during the transition season, rather than only during the cooling and heating seasons.
- (3) Combining feature scenario analysis with actual feature acquisition not only increases the feasibility of modeling in real engineering projects, but also indicates a direction for the next step in improving model capability. The approach offers a fresh concept for mining the building data that will become available in enormous amounts in the future, and it can be swiftly implemented on energy platforms, providing reliable support for HBECP.
2. Methodology
2.1. Framework
2.2. Dataset of Two-Step Selection Using Features Search
2.2.1. Similarity Measurement
2.2.2. Feature Outlier-Based Selection for Dataset
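The nomenclature of this article defines the LOF through the distance d(p,o), the k-th distance dk(p), the neighborhood Nk(p), the reachable distance, and the local reachable density lrdk(p). As a minimal sketch, assuming Euclidean distance and the standard LOF formulation, those quantities could be computed with NumPy as follows. The function name `local_outlier_factor`, the toy points, and k = 2 are illustrative assumptions and not the study's settings; which features and which threshold the study uses to discard outlying samples are not specified here.

```python
import numpy as np

def local_outlier_factor(X, k):
    """Return LOF_k(p) for every row p of X (standard definition, Euclidean distance)."""
    X = np.asarray(X, dtype=float)
    n = X.shape[0]
    # d(p, o): pairwise Euclidean distances
    dist = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)
    np.fill_diagonal(dist, np.inf)  # a point is not its own neighbor
    # d_k(p): distance from p to its k-th nearest neighbor
    k_dist = np.sort(dist, axis=1)[:, k - 1]
    # N_k(p): all points no farther from p than d_k(p) (ties included)
    neighbors = [np.flatnonzero(dist[p] <= k_dist[p]) for p in range(n)]
    # lrd_k(p): inverse of the mean reachable distance max(d_k(o), d(p, o))
    lrd = np.empty(n)
    for p in range(n):
        o = neighbors[p]
        reach = np.maximum(k_dist[o], dist[p, o])
        lrd[p] = 1.0 / reach.mean()
    # LOF_k(p): mean lrd of the neighbors divided by lrd of p
    return np.array([lrd[neighbors[p]].mean() / lrd[p] for p in range(n)])

# Hypothetical toy data: four clustered points and one isolated point.
points = np.array([[0, 0], [0, 1], [1, 0], [1, 1], [5, 5]])
lof = local_outlier_factor(points, k=2)
```

Scores near 1 indicate a point that is about as dense as its neighbors, while markedly larger scores flag candidates for removal. The sketch assumes no coincident samples, since duplicates could make the mean reachable distance zero.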
2.3. Adapting Feature Set
2.3.1. Feature Types
- (1) Sequence features are derived from the historical time series of operating energy consumption. Data mining is used to explore the data, and the patterns found are extracted as feature variables for prediction. Examples include time variables (e.g., time, date, holiday) and historical data. The periodic intensity is identified using the Fourier transform, and the discretized historical data are selected as features using the Pearson correlation coefficient.
- (2) Associated features must be obtained through sensors. They include external interference features (e.g., outdoor meteorological parameters) and internal interference features (e.g., indoor dry bulb temperature, relative humidity, and utilization rate).
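As a rough illustration of the two steps described for sequence features, the sketch below finds the dominant period of an hourly load series with a fast Fourier transform and then scores candidate lags with the Pearson correlation coefficient. The helper names, the synthetic load, and the lags tested are hypothetical; the study's actual data and selection thresholds are not reproduced here.

```python
import numpy as np

def dominant_period(series, sample_spacing=1.0):
    """Return the period (in samples) of the strongest non-zero frequency."""
    x = np.asarray(series, dtype=float)
    x = x - x.mean()  # remove the constant (zero-frequency) component
    spectrum = np.abs(np.fft.rfft(x))
    freqs = np.fft.rfftfreq(x.size, d=sample_spacing)
    k = 1 + np.argmax(spectrum[1:])  # skip the zero-frequency bin
    return 1.0 / freqs[k]

def lag_correlation(series, lag):
    """Pearson correlation between the series and itself `lag` samples earlier (lag >= 1)."""
    x = np.asarray(series, dtype=float)
    return np.corrcoef(x[lag:], x[:-lag])[0, 1]

# Hypothetical toy load: 28 days of hourly data with a daily cycle plus noise.
rng = np.random.default_rng(0)
hours = np.arange(24 * 28)
load = 10 + 5 * np.sin(2 * np.pi * hours / 24) + rng.normal(0, 0.5, hours.size)

period = dominant_period(load)    # close to 24 hours for this toy series
r_24 = lag_correlation(load, 24)  # same hour on the previous day: strong
r_6 = lag_correlation(load, 6)    # a quarter of a cycle earlier: weak
```

On this toy series the 24-hour lag correlates strongly and the 6-hour lag hardly at all, which is the kind of contrast that would favor same-hour values from previous days as candidate features.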
2.3.2. Feature Set Selection
1. Extract Sequence Features
2. Different Feature Sets Based on Availability
2.4. RF Model
2.4.1. RF
2.4.2. Parameters Optimization of RF
2.5. Performance Evaluation Indices
- (a) CV-Root mean squared error (CV-RMSE)
- (b) R square (R2)
- (c) Mean absolute percentage error (MAPE)
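The three indices can be computed as in the following sketch, which assumes their common textbook definitions: CV-RMSE as the RMSE divided by the mean observed value, R2 as one minus the ratio of the residual to the total sum of squares, and MAPE as the mean absolute relative error. The function names and sample values are hypothetical, and the normalization may differ in detail from the formulas used in the study.

```python
import numpy as np

def cv_rmse(y_true, y_pred):
    """Coefficient of variation of the RMSE, in percent."""
    y_true, y_pred = np.asarray(y_true, float), np.asarray(y_pred, float)
    rmse = np.sqrt(np.mean((y_true - y_pred) ** 2))
    return 100.0 * rmse / np.mean(y_true)

def r_squared(y_true, y_pred):
    """Coefficient of determination."""
    y_true, y_pred = np.asarray(y_true, float), np.asarray(y_pred, float)
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - np.mean(y_true)) ** 2)
    return 1.0 - ss_res / ss_tot

def mape(y_true, y_pred):
    """Mean absolute percentage error, in percent (y_true must be non-zero)."""
    y_true, y_pred = np.asarray(y_true, float), np.asarray(y_pred, float)
    return 100.0 * np.mean(np.abs((y_true - y_pred) / y_true))

# Hypothetical measured and predicted hourly values.
measured = [10.0, 20.0, 30.0, 40.0]
predicted = [12.0, 18.0, 33.0, 36.0]

scores = {
    "CV-RMSE (%)": cv_rmse(measured, predicted),
    "R2": r_squared(measured, predicted),
    "MAPE (%)": mape(measured, predicted),
}
```

Lower CV-RMSE and MAPE and an R2 closer to 1 indicate a better fit. MAPE is undefined when a measured value is zero, so hours with zero consumption would need separate handling.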
3. Case Study
3.1. Building and Data Description
3.2. Dataset of Two-Step Selection
3.3. Feature Set Scenario
3.3.1. Sequence Features Selection
- EDI at the same hour one day, two days, and seven days earlier: E (d-1, h), E (d-2, h), and E (d-7, h).
- EDI at the two preceding hours one day earlier: E (d-1, h-1) and E (d-1, h-2).
- EDI at the two preceding hours two days earlier: E (d-2, h-1) and E (d-2, h-2).
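The lagged features listed above could be built from an hourly series by shifting it, as in this minimal sketch. It assumes a gap-free hourly record held in a pandas DataFrame; the column name `E`, the helper `add_lag_features`, and the toy data are hypothetical.

```python
import numpy as np
import pandas as pd

# Hours to look back for each lagged feature E (d-k, h-j): 24 * k + j.
LAGS = {
    "E (d-1, h)": 24, "E (d-2, h)": 48, "E (d-7, h)": 168,
    "E (d-1, h-1)": 25, "E (d-1, h-2)": 26,
    "E (d-2, h-1)": 49, "E (d-2, h-2)": 50,
}

def add_lag_features(df, col="E"):
    """Append the lagged columns and drop rows without a full 7-day history."""
    out = df.copy()
    for name, hours_back in LAGS.items():
        out[name] = out[col].shift(hours_back)
    return out.dropna()

# Hypothetical toy data: ten days of hourly values 0, 1, 2, ...
index = pd.date_range("2021-01-01", periods=24 * 10, freq=pd.Timedelta(hours=1))
df = pd.DataFrame({"E": np.arange(len(index), dtype=float)}, index=index)
features = add_lag_features(df)
```

Because the longest lag is seven days, the first 168 hours have no complete feature row and are dropped. Shifting by row count is only valid when no hours are missing; a record with gaps would need to be reindexed to a regular hourly grid first.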
3.3.2. Scenario of Feature Set
4. Results and Discussion
4.1. Effectiveness Analysis of Training Set Selection
4.2. Results of Feature Scenario Analysis
5. Application
5.1. Modeling Adaptability
5.2. Application Implications
6. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Acknowledgments
Conflicts of Interest
Abbreviations
Nomenclature | |
AC | Air-conditioning |
HBECP | Hourly building energy consumption prediction |
CV-RMSE | CV-Root mean squared error |
DMT | Data mining technology |
DNN | Deep neural network |
FFT | Fast Fourier transform |
FCM | Fuzzy C-means |
KNN | K nearest neighbors |
LSTM | Long short-term memory |
LOF | Local outlier factor |
MAPE | Mean absolute percentage error |
RF | Random forest |
R2 | Coefficient of determination |
RMSE | Root mean squared error |
Symbols | |
True power consumption density of the sample at the time | |
Predicted power consumption density of the sample at the time | |
Fpre | Key factor of the predicted day |
Fhis | Key factor of the historical data |
r | Correlation coefficient |
Ti | Daily mean value of the historical outdoor dry bulb temperature |
Tpre | Daily mean value of the outdoor dry bulb temperature on the predicted day |
Upre | Daily mean value of the AC utilization rate on the predicted day |
Ui | Daily mean value of the historical AC utilization rate |
Variable at the time | |
Mean of the variable | |
Other variable at the time | |
Mean of the other variable | |
LOFk(p) | The average of the local reachability densities of the points in the k-th distance neighborhood of point p, divided by the local reachability density of point p |
lrdk(p) | The local reachability density of point p |
r-dk(p,o) | The k-th reachability distance from point o to point p |
d(p,o) | The distance between point p and point o |
dk(p) | The k-th distance of point p |
Nk(p) | The k-th distance neighborhood of point p |

Feature Source | Feature Types | Feature Name | Access |
---|---|---|---|
Sequence features | Time features | Time-related factors (day type, hour type) | Mined from historical energy consumption data only |
Sequence features | Period features | Energy consumption at the same time on the previous day, etc. (discrete historical data) | Mined from historical energy consumption data only |
Associated features | External interference features | Outdoor weather parameters (temperature, humidity, etc.) | Obtained through information technology such as sensors |
Associated features | Internal interference features | Indoor temperature and humidity, utilization rate | Obtained through information technology such as sensors |

 | Scenario 1 | Scenario 2 | Scenario 3 |
---|---|---|---|
Feature set | Sequence features | Sequence features + external interference features + indoor temperature and humidity | Sequence features + external interference features + indoor temperature and humidity + utilization rate |

Scenario | Number | Feature Set |
---|---|---|
Scenario 1 | Feature set 1 | E (d-1, h), E (d-2, h) |
Scenario 1 | Feature set 2 | E (d-1, h), E (d-2, h), E (d-7, h), E (d-1, h-1), E (d-2, h-1) |
Scenario 1 | Feature set 3 | E (d-1, h), E (d-2, h), E (d-7, h), E (d-1, h-1), E (d-2, h-1), E (d-1, h-2), E (d-2, h-2) |
Scenario 2 | Feature set 4 | Outdoor temperature, outdoor humidity, indoor temperature, indoor humidity, hour type |
Scenario 2 | Feature set 5 | E (d-1, h), E (d-2, h), E (d-7, h), E (d-1, h-1), E (d-2, h-1), E (d-1, h-2), E (d-2, h-2), outdoor temperature |
Scenario 2 | Feature set 6 | E (d-1, h), E (d-2, h), E (d-7, h), E (d-1, h-1), E (d-2, h-1), E (d-1, h-2), E (d-2, h-2), indoor and outdoor temperature difference |
Scenario 3 | Feature set 7 | Utilization rate |
Scenario 3 | Feature set 8 | Utilization rate, outdoor temperature, outdoor humidity, indoor temperature, indoor humidity, hour type |
Scenario 3 | Feature set 9 | Utilization rate, outdoor temperature, E (d-1, h), E (d-2, h), E (d-7, h), E (d-1, h-1), E (d-2, h-1), E (d-1, h-2), E (d-2, h-2) |

Predicted Day | MAPE (%) | R2 | CV-RMSE (%) | Method |
---|---|---|---|---|
12 April | 5.10 | 0.95 | 7.76 | KFSS method |
12 April | 10.83 | 0.94 | 14.69 | Original method |
14 May | 12.04 | 0.92 | 16.40 | KFSS method |
14 May | 22.66 | 0.46 | 44.37 | Original method |
4 July | 4.73 | 0.98 | 8.39 | KFSS method |
4 July | 13.94 | 0.89 | 20.46 | Original method |
15 July | 4.58 | 0.99 | 6.25 | KFSS method |
15 July | 6.53 | 0.98 | 8.82 | Original method |
20 November | 7.80 | 0.90 | 9.24 | KFSS method |
20 November | 15.76 | 0.20 | 26.08 | Original method |
21 November | 3.46 | 0.97 | 5.13 | KFSS method |
21 November | 11.59 | 0.67 | 16.80 | Original method |

Scenario | Number | Features | R2 | MAPE (%) |
---|---|---|---|---|
Scenario 1 | Feature set 1 | E (d-1, h), E (d-2, h) | 0.93 | 9.99 |
Scenario 1 | Feature set 2 | E (d-1, h), E (d-2, h), E (d-7, h), E (d-1, h-1), E (d-2, h-1) | 0.95 | 8.09 |
Scenario 1 | Feature set 3 | E (d-1, h), E (d-2, h), E (d-7, h), E (d-1, h-1), E (d-2, h-1), E (d-1, h-2), E (d-2, h-2) | 0.94 | 8.33 |
Scenario 2 | Feature set 4 | Outdoor temperature, outdoor humidity, indoor temperature, indoor humidity, hour type | 0.78 | 19.60 |
Scenario 2 | Feature set 5 | E (d-1, h), E (d-2, h), E (d-7, h), E (d-1, h-1), E (d-2, h-1), E (d-1, h-2), E (d-2, h-2), outdoor temperature | 0.96 | 7.43 |
Scenario 2 | Feature set 6 | E (d-1, h), E (d-2, h), E (d-7, h), E (d-1, h-1), E (d-2, h-1), E (d-1, h-2), E (d-2, h-2), indoor and outdoor temperature difference | 0.96 | 6.96 |
Scenario 3 | Feature set 7 | Utilization rate | 0.94 | 12.26 |
Scenario 3 | Feature set 8 | Utilization rate, outdoor temperature, outdoor humidity, indoor temperature, indoor humidity, hour type | 0.95 | 10.71 |
Scenario 3 | Feature set 9 | Utilization rate, outdoor temperature, E (d-1, h), E (d-2, h), E (d-7, h), E (d-1, h-1), E (d-2, h-1), E (d-1, h-2), E (d-2, h-2) | 0.96 | 6.99 |
© 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Share and Cite
Fang, H.; Tan, H.; Dai, N.; Liu, Z.; Kosonen, R. Hourly Building Energy Consumption Prediction Using a Training Sample Selection Method Based on Key Feature Search. Sustainability 2023, 15, 7458. https://doi.org/10.3390/su15097458