Clustering Method Comparison for Rural Occupant’s Behavior Based on Building Time-Series Energy Data
Abstract
1. Introduction
1.1. Background
1.2. Literature Review
- Conducted a comprehensive investigation of time series clustering methods for occupants’ behavior in the building energy field;
- Compared the clustering effects of various time series clustering algorithms on occupants’ continuous power load demand behavior;
- Introduced two specific indicators for evaluating the clustering effects of these algorithms;
- Analyzed the potential reasons for differences in clustering results across different algorithms;
- Identified the energy-saving behavior patterns of rural island residents in line with the geographic characteristics of sea islands.
2. Materials and Methods
2.1. Outline
2.2. Data Sources
2.2.1. Island Site
2.2.2. Database Condition
2.3. Algorithms
2.3.1. PCA and t-SNE
2.3.2. k-Means
2.3.3. Hierarchical Clustering Algorithm
2.3.4. DBSCAN
2.3.5. Model
2.3.6. Deep Learning
2.4. Performance Indicators
2.4.1. Accuracy Rate
2.4.2. Standard Deviation
3. Results
3.1. Questionnaire Results
3.2. Clustered Accuracy Rate
3.3. Clustered SD Values Comparison
4. Discussion
4.1. Clustering Detail Comparison for Different Algorithms
4.2. Yushan Island Energy Usage Patterns
4.3. Limitations and Future Work
5. Conclusions
- When data mining is used to cluster occupants’ energy-use behavior patterns, k-means with Euclidean distance and k-shape have traditionally been the primary algorithms, and they remain the first choice for similar tasks.
- Km_DTW is better suited to intermittent curves than to continuous data. Km_Euclidean performs well on building electricity load curves that contain some abnormal data points, whereas Km_kshape clusters pattern features more efficiently when the database is accurate and free of outliers.
- Hierarchical clustering and DBSCAN fail to group the time-series energy consumption curves, especially unimodal curves. Deep learning algorithms run with default parameters also cannot cluster time-series building electricity usage data with high precision.
- Across the four distance measures, the detected timing of condition changes in the energy pattern differs by 0 to 14 days. This indicates that the algorithms have a similar ability to identify when important condition changes occur.
- The accuracy rate and standard deviation introduced in this study serve as evaluation metrics for clustering continuous electricity demand. They effectively describe the degree of curve fit within the clustering results.
- For island carbon emission reduction, it is crucial to make use of fishermen’s houses while they stand vacant during fishing expeditions. For non-fisherman residents, heat insulation in winter is the priority for elderly households, while passive design strategies are better suited to middle-aged households. Renewable technologies can be applied to infrequently used public buildings such as village committees, offering an opportunity for significant energy efficiency improvements.
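The standard deviation indicator mentioned above can be illustrated with a minimal sketch. The exact definition used in the study is not given in this section, so the version below is an assumption: it takes the population standard deviation across member curves at each time step and averages it over the day. The function name `cluster_sd` and the sample curves are hypothetical.

```python
from statistics import mean, pstdev

def cluster_sd(curves):
    """Assumed spread measure: mean over time steps of the population SD
    across the curves assigned to one cluster."""
    if not curves:
        raise ValueError("cluster is empty")
    length = len(curves[0])
    if any(len(c) != length for c in curves):
        raise ValueError("all curves must have the same number of time steps")
    steps = zip(*curves)  # one tuple of member values per time step
    return mean(pstdev(step) for step in steps)

# Hypothetical clusters: identical curves versus visibly different ones.
tight = [[1.0, 2.0, 3.0], [1.0, 2.0, 3.0], [1.0, 2.0, 3.0]]
loose = [[0.0, 2.0, 3.0], [2.0, 2.0, 3.0], [1.0, 2.0, 9.0]]

sd_tight = cluster_sd(tight)
sd_loose = cluster_sd(loose)
```

Under this assumed definition, a lower value means the curves in a cluster lie closer to one another, which is the sense in which the metric describes curve fit.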
Author Contributions
Funding
Data Availability Statement
Conflicts of Interest
References
- Chenari, B.; Carrilho, J.D.; Silva, M.G.D. Towards sustainable, energy-efficient and healthy ventilation strategies in buildings: A review. Renew. Sustain. Energy Rev. 2016, 59, 1426–1447.
- IEA. Energy Technology Perspectives Scenarios; International Energy Agency (IEA): Paris, France, 2012.
- IRENA. Renewable Capacity Statistics 2021; International Renewable Energy Agency: Abu Dhabi, United Arab Emirates, 2021.
- IRENA. Renewable Energy Statistics 2022; International Renewable Energy Agency: Abu Dhabi, United Arab Emirates, 2022.
- IRENA. World Energy Transitions Outlook: 1.5C Pathway; International Renewable Energy Agency: Abu Dhabi, United Arab Emirates, 2021.
- Allcott, H.; Mullainathan, S. Behavior and Energy Policy. Science 2010, 327, 1204–1205.
- Fan, C.; Xiao, F.; Li, Z.; Wang, J. Unsupervised data analytics in mining big building operational data for energy efficiency enhancement: A review. Energy Build. 2018, 159, 296–308.
- Miller, C.; Nagy, Z.; Schlueter, A. A review of unsupervised statistical learning and visual analytics techniques applied to performance analysis of non-residential buildings. Renew. Sustain. Energy Rev. 2018, 81, 1365–1377.
- Zhao, Y.; Zhang, C.; Zhang, Y.; Wang, Z.; Li, J. A review of data mining technologies in building energy systems: Load prediction, pattern identification, fault detection and diagnosis. Energy Built Environ. 2020, 1, 149–164.
- Li, K.; Ma, Z.; Robinson, D.; Ma, J. Identification of typical building daily electricity usage profiles using Gaussian mixture model-based clustering and hierarchical clustering. Appl. Energy 2018, 231, 331–342.
- Rajabi, A.; Eskandari, M.; Ghadi, M.J.; Li, L.; Zhang, J.; Siano, P. A comparative study of clustering techniques for electrical load pattern segmentation. Renew. Sustain. Energy Rev. 2020, 120, 109628.
- Aghabozorgi, S.; Shirkhorshidi, A.S.; Wah, T.Y. Time-series clustering—A decade review. Inf. Syst. 2015, 53, 16–38.
- Ma, Z.; Yan, R.; Nord, N. A variation focused cluster analysis strategy to identify typical daily heating load profiles of higher education buildings. Energy 2017, 134, 90–102.
- Li, K.; Yang, R.J.; Robinson, D.; Ma, J.; Ma, Z. An agglomerative hierarchical clustering-based strategy using Shared Nearest Neighbours and multiple dissimilarity measures to identify typical daily electricity usage profiles of university library buildings. Energy 2019, 147, 735–748.
- Wu, X.; Kumar, V.; Quinlan, J.R.; Ghosh, J.; Yang, Q.; Motoda, H.; McLachlan, G.J.; Ng, A.; Liu, B.; Yu, P.S.; et al. Top 10 algorithms in data mining. Knowl. Inf. Syst. 2008, 14, 1–37.
- Heidarinejad, M.; Dahlhausen, M.; McMahon, S.; Pyke, C.; Srebric, J. Cluster analysis of simulated energy use for LEED certified U.S. office buildings. Energy Build. 2014, 85, 86–97.
- Deb, C.; Lee, S.E. Determining key variables influencing energy consumption in office buildings through cluster analysis of pre- and post-retrofit building data. Energy Build. 2018, 159, 228–245.
- Dharssini, A.V.; Raja, S.C.; Karthick, T.; Venkatesh, P. Energy Pattern Classification and Prediction in an Educational Institution using Deep Learning Framework. Electr. Power Compon. Syst. 2022, 50, 615–635.
- Liu, X.; Ding, Y.; Tang, H.; Xiao, F. A data mining-based framework for the identification of daily electricity usage patterns and anomaly detection in building electricity consumption data. Energy Build. 2021, 231, 110601.
- Koupaei, D.M.; Cetin, K.; Passe, U.; Kimber, A.; Poleacovschi, C. Identifying rural high energy intensity residential buildings using metered data. Energy Build. 2023, 298, 113604.
- Paparrizos, J.; Gravano, L. k-Shape: Efficient and Accurate Clustering of Time Series. In ACM SIGMOD Record; Association for Computing Machinery: New York, NY, USA, 2015; Volume 45, pp. 69–76.
- Li, J.; Ma, R.; Deng, M.; Cao, X.; Wang, X.; Wang, X. A comparative study of clustering algorithms for intermittent heating demand considering time series. Appl. Energy 2024, 353, 122046.
- Park, J.Y.; Yang, X.; Miller, C.; Arjunan, P.; Nagy, Z. Apples or oranges? Identification of fundamental load shape profiles for benchmarking buildings using a large and diverse dataset. Appl. Energy 2019, 236, 1280–1295.
- Wen, L.; Zhou, K.; Yang, S. A shape-based clustering method for pattern recognition of residential electricity consumption. J. Clean. Prod. 2019, 212, 475–488.
- Carmo, C.M.R.D.; Christensen, T.H. Cluster analysis of residential heat load profiles and the role of technical and household characteristics. Energy Build. 2016, 125, 171–180.
- Verleysen, M.; François, D. The Curse of Dimensionality in Data Mining and Time Series Prediction. In Computational Intelligence and Bioinspired Systems, Proceedings of the IWANN 2005, Barcelona, Spain, 8–10 June 2005; Lecture Notes in Computer Science; Springer: Berlin/Heidelberg, Germany, 2005; pp. 758–770.
- Luo, X.; Hong, T.; Chen, Y.; Piette, M.A. Electric load shape benchmarking for small- and medium-sized commercial buildings. Appl. Energy 2017, 204, 715–725.
- Morris, B. The components of the Wired Spanning Forest are recurrent. Probab. Theory Relat. Fields 2003, 125, 259–265.
- Haben, S.; Singleton, C.; Grindrod, P. Analysis and Clustering of Residential Customers Energy Behavioral Demand Using Smart Meter Data. IEEE Trans. Smart Grid 2016, 7, 136–144.
- Yilmaz, S.; Chambers, J.; Patel, M. Comparison of clustering approaches for domestic electricity load profile characterisation—Implications for demand side management. Energy 2019, 180, 665–677.
- Hong, Y.; Yoon, S.; Choi, S. Operational signature-based symbolic hierarchical clustering for building energy, operation, and efficiency towards carbon neutrality. Energy 2023, 265, 126276.
- Kim, J.; Song, K.; Lee, G.; Lee, S. Time-series data clustering with load-shape preservation for identifying residential energy consumption behaviors. Energy Build. 2024, 311, 114130.
- Chen, S.; Lv, Y.; Wang, Z.; Ma, Y.; Huang, Y.; Wang, Y.; Cai, Y.; Rao, Z. Typical daily occupancy profiles of express hotels and its stochasticity effect on building heating and cooling loads. J. Build. Eng. 2023, 73, 106775.
- Ashouri, M.; Haghighat, F.; Fung, B.C.; Yoshino, H. Development of a ranking procedure for energy performance evaluation of buildings based on occupant behavior. Energy Build. 2019, 183, 659–671.
- Sun, C.; Zhang, R.; Sharples, S.; Han, Y.; Zhang, H. Thermal comfort, occupant control behaviour and performance gap—A study of office buildings in north-east China using data mining. Build. Environ. 2019, 149, 305–321.
- Wang, Y.; Shao, L. Understanding occupancy pattern and improving building energy efficiency through Wi-Fi based indoor positioning. Build. Environ. 2017, 114, 106–117.
- Wang, F.; Li, K.; Duić, N.; Mi, Z.; Hodge, B.M.S.; Shafie-khah, M.; Catalão, J.P. Association rule mining based quantitative analysis approach of household characteristics impacts on residential electricity consumption patterns. Energy Convers. Manag. 2018, 171, 839–854.
- Yu, Z.J.; Haghighat, F.; Fung, B.C.; Zhou, L. A novel methodology for knowledge discovery through mining associations between building operational data. Energy Build. 2012, 47, 430–440.
- Fan, C.; Xiao, F.; Madsen, H.; Wang, D. Temporal knowledge discovery in big BAS data for building energy management. Energy Build. 2015, 109, 75–89.
- Fan, C.; Xiao, F.; Song, M.; Wang, J. A graph mining-based methodology for discovering and visualizing high-level knowledge for building energy management. Appl. Energy 2019, 251, 113395.
- Zhang, C.; Zhao, Y.; Lu, J.; Li, T.; Zhang, X. Analytic hierarchy process-based fuzzy post mining method for operation anomaly detection of building energy systems. Energy Build. 2021, 252, 111426.
- Xu, Y.; Yan, C.; Shi, J.; Lu, Z.; Niu, X.; Jiang, Y.; Zhu, F. An anomaly detection and dynamic energy performance evaluation method for HVAC systems based on data mining. Sustain. Energy Technol. Assess. 2021, 44, 101092.
- Zhou, Y.; Yeoh, J.K.; Solihin, W. Studying the impact of building morphology on occupants’ movement using a rule mining approach. Build. Environ. 2024, 249, 111116.
- Sha, X.; Ma, Z.; Sethuvenkatraman, S.; Li, W. A novel rule mining method for knowledge discovery of relationships among indoor air quality, HVAC operation and occupants’ activities. Build. Environ. 2024, 260, 111670.
- Zhao, Y.; Zhang, C.; Cao, L. Post-Mining of Association Rules: Techniques for Effective Knowledge Extraction; IGI Global: Hershey, PA, USA, 2009.
- Zhan, S.; Liu, Z.; Chong, A.; Yan, D. Building categorization revisited: A clustering-based approach to using smart meter data for building energy benchmarking. Appl. Energy 2020, 269, 114920.
- Zhang, C.; Zhao, Y.; Li, T.; Zhang, X. A post mining method for extracting value from massive amounts of post mining building operational data. Energy Build. 2020, 223, 110096.
- Hsu, D. Comparison of integrated clustering methods for accurate and stable prediction of building energy consumption data. Appl. Energy 2015, 160, 153–163.
- Rathod, R.R.; Garg, R.D. Regional electricity consumption analysis for consumers using data mining techniques and consumer meter reading data. Electr. Power Energy Syst. 2016, 78, 368–374.
- Fan, C.; Sun, Y.; Shan, K.; Xiao, F.; Wang, J. Discovering gradual patterns in building operations for improving building energy efficiency. Appl. Energy 2018, 224, 116–123.
- Gianniou, P.; Liu, X.; Heller, A.; Nielsen, P.S.; Rode, C. Clustering-based analysis for residential district heating data. Energy Convers. Manag. 2018, 165, 840–850.
- Dab, K.; Henao, N.; Nagarsheth, S.; Dubé, Y.; Sansregret, S.; Agbossou, K. Consensus-based time-series clustering approach to short-term load forecasting for residential electricity demand. Energy Build. 2023, 299, 113550.
- Choi, S.; Lim, H.; Lim, J.; Sungmin, Y. Retrofit building energy performance evaluation using an energy signature-based symbolic hierarchical clustering method. Build. Environ. 2024, 251, 111206.
- Canaydin, A.; Fu, C.; Balint, A.; Khalil, M.; Miller, C.; Kazmi, H. Interpretable domain-informed and domain-agnostic features for supervised and unsupervised learning on building energy demand data. Appl. Energy 2024, 360, 122741.
- Liu, Y.; Chong, W.T.; Yau, Y.H.; Wu, J.; Chang, Y.; Cui, T.; Chang, L.; Pan, S. A hybrid learning approach to model the diversity of window-opening behavior. Build. Environ. 2024, 257, 111525.
Reference | Year | Data Type | Methods | Unsupervised Algorithms | Purpose |
---|---|---|---|---|---|
[48] | 2015 | Panel data | Cluster; regression | k-means | Prediction |
[49] | 2016 | Panel data | Cluster; ARM | k-means; apriori | Knowledge extraction |
[50] | 2018 | Longitudinal data | ARM | Gradual pattern mining | Knowledge extraction |
[51] | 2018 | Panel data | Cluster | k-means | Knowledge extraction |
[17] | 2018 | Cross-section data | Cluster | k-means | Knowledge extraction |
[30] | 2019 | Panel data | Cluster | k-means | Knowledge extraction |
[24] | 2019 | Panel data | Cluster | k-means | Knowledge extraction |
[47] | 2020 | Cross-section data | ARM | FP-growth | Knowledge extraction |
[46] | 2020 | Panel data | Cluster | k-means | Knowledge extraction |
[19] | 2021 | Panel data | Cluster | k-means | Knowledge extraction |
[18] | 2022 | Cross-section data | Cluster; regression | k-means | Prediction |
[52] | 2023 | Panel data | Cluster; regression | k-medoids | Prediction |
[20] | 2023 | Panel data | Cluster | Hierarchical | Knowledge extraction |
[33] | 2023 | Panel data | Cluster | k-means | Knowledge extraction |
[53] | 2024 | Cross-section data | Cluster | Hierarchical | Knowledge extraction |
[22] | 2024 | Panel data | Cluster | k-means; k-shape; DTW; DDTW | Knowledge extraction |
[54] | 2024 | Panel data | Cluster | t-SNE | Knowledge extraction |
[55] | 2024 | Panel data | Cluster; classification | k-means | Knowledge extraction; prediction |
[32] | 2024 | Panel data | Cluster | Deep learning | Knowledge extraction |
Option | Explanation | Option | Explanation |
---|---|---|---|
Building Type | Type of resident or public building | Power | Annual electricity consumption |
Population | Long-term residents | Power intensity | Level of electricity consumption |
Coast | Reside near the seaside or not | Insulation | Presence of insulation material |
Job | Primary job type | Equipment | Cooling equipment used in summer |
Island | Live on island or not | Age | Average age of householders |
Width | Width of rural house | Orientation | Direction of building |
Depth | Depth of rural house | Structure | Type of bearing structure |
Algorithm Type | Calculation Method | Abbreviation |
---|---|---|
K-means | k-shape | Km_kshape |
K-means | Euclidean | Km_Euclidean |
K-means | DTW | Km_DTW |
K-means | softdtw | Km_softdtw |
Hierarchical | Euclidean | Hi_Euclidean |
Hierarchical | Manhattan | Hi_Manhattan |
Hierarchical | DTW | Hi_DTW |
Density | DBSCAN | DBSCAN |
Model | Hidden Markov model | HMM |
Model | Auto-regressive model | AR |
Deep learning | Recurrent neural network | RNN |
Deep learning | Autoencoder | Auto |
Deep learning | Spectral clustering | SC |
Deep learning | Time-window clustering | TWC |
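The distance measures listed in the table behave differently on curves that share a shape but are shifted in time. The sketch below is illustrative only and is not the implementation used in the study: it compares a plain Euclidean distance with a textbook dynamic time warping (DTW) distance on two hypothetical load curves. The function names and sample values are assumptions.

```python
import math

def euclidean(a, b):
    """Point-by-point distance; both curves must have the same length."""
    if len(a) != len(b):
        raise ValueError("curves must have equal length")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def dtw(a, b):
    """Textbook dynamic time warping distance with a squared local cost."""
    n, m = len(a), len(b)
    inf = float("inf")
    # cost[i][j] = minimal accumulated cost of aligning a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = (a[i - 1] - b[j - 1]) ** 2
            cost[i][j] = local + min(
                cost[i - 1][j],      # a advances, b repeats
                cost[i][j - 1],      # b advances, a repeats
                cost[i - 1][j - 1],  # both advance
            )
    return math.sqrt(cost[n][m])

# Hypothetical load shapes: the same peak, shifted by one time step.
curve_a = [0.0, 0.0, 1.0, 3.0, 1.0, 0.0, 0.0]
curve_b = [0.0, 0.0, 0.0, 1.0, 3.0, 1.0, 0.0]

d_euclid = euclidean(curve_a, curve_b)
d_dtw = dtw(curve_a, curve_b)
```

Here the Euclidean distance is sqrt(10), while the DTW distance is zero because warping absorbs the one-step shift. This is one reason the choice of distance can change which curves end up in the same cluster.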
Algorithms | M (Y) | M (N) | V (Y) | V (N) | LINE (Y) | LINE (N) | Total (Y) | Total (N) |
---|---|---|---|---|---|---|---|---|
Km_kshape | 66.67% | 33.33% | 81.25% | 18.75% | 41.18% | 58.82% | 67.05% | 32.95% |
Km_Euclidean | 69.23% | 30.77% | 73.38% | 26.62% | 60.10% | 39.90% | 65.09% | 34.91% |
Km_DTW | 84.62% | 15.38% | 53.13% | 46.88% | 35.29% | 64.71% | 63.64% | 36.36% |
Km_softdtw | 84.62% | 15.38% | 53.13% | 46.88% | 35.29% | 64.71% | 63.64% | 36.36% |
Hi_Euclidean | 97.44% | 2.56% | 0.00% | 100.00% | 58.82% | 41.18% | 54.55% | 45.45% |
Hi_Manhattan | 89.74% | 10.26% | 0.00% | 100.00% | 41.18% | 58.82% | 47.73% | 52.27% |
Hi_DTW | 97.44% | 2.56% | 0.00% | 100.00% | 5.88% | 94.12% | 44.32% | 55.68% |
DBSCAN | 76.92% | 23.08% | 12.50% | 87.50% | 17.65% | 82.35% | 42.05% | 57.95% |
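The Total column is not the simple average of the M, V, and LINE columns, which suggests it is weighted by the number of curves of each type. The sketch below works through that arithmetic for Km_kshape. The counts are hypothetical: they are not stated in the table and were chosen only because they reproduce the reported percentages for that row, so they should be read as an assumption rather than the study's data.

```python
# Hypothetical counts (not given in the table): chosen because they
# reproduce the Km_kshape percentages reported in the accuracy table.
group_sizes = {"M": 39, "V": 32, "LINE": 17}
matched = {"M": 26, "V": 26, "LINE": 7}

def accuracy_rate(n_matched, n_total):
    """Share of curves judged correctly clustered (the 'Y' column)."""
    if n_total <= 0:
        raise ValueError("n_total must be positive")
    return n_matched / n_total

# Per-type accuracy rates.
per_type = {k: accuracy_rate(matched[k], group_sizes[k]) for k in group_sizes}

# The pooled rate weights each type by its number of curves.
total = accuracy_rate(sum(matched.values()), sum(group_sizes.values()))

# The unweighted mean of the three rates, for comparison.
simple_mean = sum(per_type.values()) / len(per_type)
```

With these assumed counts the pooled rate is 59/88, about 67.05%, whereas the unweighted mean of the three rates is about 63.03%.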
Building Type | Residence | | |
---|---|---|---|
Resident | Often on island | Non-fisherman | Middle age |
Resident | Often on island | Non-fisherman | Aged |
Resident | Often on island | Fisherman | - |
Resident | Not on island frequently | - | - |
Public | Government | Often use | - |
Public | Government | Non-use frequently | - |
Public | Street lamp | - | - |
Public | Processing industry | - | - |
© 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Liu, X.; Zhang, S.; Wang, X.; Wu, R.; Yang, J.; Zhang, H.; Wu, J.; Li, Z. Clustering Method Comparison for Rural Occupant’s Behavior Based on Building Time-Series Energy Data. Buildings 2024, 14, 2491. https://doi.org/10.3390/buildings14082491