1. Introduction
In recent years, growing energy demand has driven up energy consumption and, with it, the emission of vast amounts of greenhouse gas (GHG). This has caused various adverse environmental problems such as global warming and air pollution [1]. One sector that contributes heavily to rising global energy use is the building sector. According to studies performed by the U.S. Department of Energy (DoE), nearly 40% of the total energy use in the world is attributable to buildings [2,3]. In particular, commercial buildings are responsible for approximately 40% of the total energy usage in the U.S.A., which is more than that of the transportation or industry sectors [4]. In commercial buildings, heating and cooling are the largest sources of energy consumption [5]. To reduce building energy use for heating and cooling, it is essential to design buildings for maximum energy efficiency during the early design phase, considering location, architectural components, form, materials, orientation, and so on. Therefore, it is necessary to analyze energy use in buildings and to support energy-efficient building design.
To date, much research has been conducted to reduce energy consumption and to improve energy efficiency in buildings [6,7,8,9,10]. For example, Chae et al. proposed a prediction model for electrical energy consumption in buildings based on an artificial neural network and a Bayesian regularization algorithm [8]. They also identified the important variables affecting electrical energy consumption for the purpose of feature reduction. Lei et al. used numerical simulations to investigate the energy performance of building envelopes integrating phase change materials (PCMs) for cooling load reduction in Singapore [9]. Mathew et al. analyzed energy efficiency performance according to the composition of various building envelopes [10].
In addition, many countries operate database systems, covering building type, energy source, and end-use, that are based on surveys and measurements and are used to manage energy-use intensity (EUI). For example, the United States operates database systems such as the Commercial Buildings Energy Consumption Survey (CBECS), the Residential Energy Consumption Survey, and the Manufacturing Energy Consumption Survey [11]. Such building energy benchmarking datasets make it possible to analyze the status and performance of energy consumption, and can be used to address shortcomings in a building. One of these datasets, CBECS, provided by the U.S. Energy Information Administration (EIA), includes various factors that affect the energy consumption of commercial buildings [11,12]. The CBECS dataset is described in detail in Section 2.1. Using the CBECS dataset, Kahn et al. studied the association between a building’s electricity consumption and the physical attributes of the selected buildings [13]. Arroyo et al. investigated changes in building energy efficiency and energy use, taking into account building type, climate, equipment, and so on [14]. Jeon et al. compared the annual energy performance of different types of heating, ventilation, and air-conditioning (HVAC) equipment in existing small- and medium-sized commercial buildings [15]. Furthermore, they identified appropriate HVAC systems that could be retrofitted into different commercial building types in a cost-effective manner.
In this study, unlike the approaches common in the current research literature, we propose a newly designed system for effective building energy use analysis. One of the core methodologies proposed in this research is a data cube model for the multidimensional analysis of heating/cooling energy efficiency according to various building factors, using the CBECS dataset. Based on the data cube model, multidimensional building energy consumption analysis is performed using on-line analytical processing (OLAP) operations with multiple levels of abstraction. In addition, the proposed analysis model extracts a comprehensive set of information from the vast building energy data by applying association rule mining. To the best of our knowledge, this is the first attempt at combining OLAP analysis with association rule mining on large-scale building energy datasets for an in-depth analysis of building energy.
3. Results and Discussion
This section presents a multidimensional building energy analysis system based on the data cube model and association rule mining. We validate the feasibility and applicability of the proposed system through experimental results obtained by applying it to the CBECS dataset provided by the U.S. EIA, using the Oracle database management system and R tools.
3.1. Building Energy Analysis Results Derived by OLAP
In this section, we describe the multidimensional analysis outcomes, which were obtained by adjusting the levels of abstraction. A system manager chooses a dimension according to the viewpoint of interest and analyzes it at various levels of abstraction using the proposed model described in the previous section. First, we demonstrated that the system could be used to conduct a descriptive analysis of building energy consumption in different climate regions. To estimate heating and cooling energy usage, we adjusted the data cube by applying the drill-down and dice operations to the different climate zones, as seen in Figure 5. In very-cold/cold climate zones, a large amount of energy is consumed for space heating, as seen in Figure 5a. Likewise, a significant amount of energy is consumed for cooling in hot-dry/mixed-dry/hot-humid climate regions, as seen in Figure 5b. Based on these results, we conducted a drill-down operation on the building feature and building operation dimensions to look more closely at which factors affect energy efficiency. A system manager can consider various factors and adjust dimensions with varying levels of abstraction, but in our analysis scenarios, the building feature (e.g., materials and types of envelopes) and building operation (e.g., type of heating/cooling equipment) dimensions were chosen to confirm in detail which factors significantly affected energy efficiency. The adopted analysis scenarios are as follows: (1) Scenario I: energy efficiency related factors for heating in very-cold/cold climate zones; and (2) Scenario II: energy efficiency related factors for cooling in hot-dry/mixed-dry/hot-humid climate zones, each examined from various viewpoints.
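The drill-down and dice operations described above can be illustrated with a minimal Python sketch. It is not the Oracle/R implementation used in this study: the records, the field names (`climate`, `heating_system`, `heat_energy`), and all values are hypothetical placeholders, and a plain average of a per-building heating measure is used only in the spirit of AHES.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical fact records; field names and values are illustrative only
# and are not taken from CBECS.
RECORDS = [
    {"climate": "very-cold/cold", "heating_system": "heat pumps", "heat_energy": 20.0},
    {"climate": "very-cold/cold", "heating_system": "district steam", "heat_energy": 60.0},
    {"climate": "very-cold/cold", "heating_system": "district steam", "heat_energy": 80.0},
    {"climate": "hot-dry", "heating_system": "heat pumps", "heat_energy": 5.0},
    {"climate": "hot-dry", "heating_system": "district steam", "heat_energy": 15.0},
]

def aggregate(records, dims, measure):
    """Average `measure` over every combination of the given dimensions."""
    cells = defaultdict(list)
    for r in records:
        cells[tuple(r[d] for d in dims)].append(r[measure])
    return {key: mean(vals) for key, vals in cells.items()}

def dice(records, **selected):
    """Keep only records whose dimension values fall in the selected sets."""
    return [r for r in records
            if all(r[d] in allowed for d, allowed in selected.items())]

# Coarse view: one average per climate zone.
by_climate = aggregate(RECORDS, ["climate"], "heat_energy")

# Dice to the cold zones, then drill down to the heating-system level.
cold = dice(RECORDS, climate={"very-cold/cold"})
by_system = aggregate(cold, ["climate", "heating_system"], "heat_energy")

print(by_climate)
print(by_system)
```

In this sketch, drilling down simply means adding a finer dimension to the grouping key, while dicing restricts the records to a chosen subset of dimension values before aggregation.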
3.1.1. Scenario I: Analysis on Heating Energy Influencing Factors in Very-Cold/Cold Climate Zones
To identify which factors of the chosen buildings affected energy efficiency, we performed an in-depth analysis focusing on the very-cold/cold climate zones, which had the highest AHES. First, we conducted a drill-down operation on the building operation dimension. Thereafter, a dice operation was performed at the type-level. Figure 6 demonstrates that buildings with “district steam or hot water piped in from outside the building” consumed more energy than buildings using other systems, whereas buildings with “heat pumps” or “individual space heaters” consumed relatively less energy for heating. In other words, “district steam or hot water piped in from outside the building” was a heating system with poor energy efficiency, and its use is not recommended, especially in very-cold/cold climate zones. To show energy usage for each heating system type properly, the capacity of each heating system would have to be considered; however, only the average value of energy usage was used in this study. For comparison, Figure 7 presents the results obtained when the aggregation operator was applied to absolute energy usage, considering only the heating systems of buildings located in very-cold/cold climate zones. Figure 7 shows that the “packaged central unit” had a higher value of heating energy usage, in contrast to Figure 6. This result suggests that factors other than the heating system may also affect energy efficiency. Therefore, we analyzed the AHES considering both heating system type and other factors in these climatic regions.
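The contrast between an average-based view (as in Figure 6) and an absolute-usage view (as in Figure 7) can be reproduced with a small Python sketch. The observations below are invented solely to show how the two aggregation operators can rank heating systems differently; they are not CBECS figures, and the shortened system labels are placeholders.

```python
from collections import defaultdict

# Hypothetical (heating system, heating energy) pairs; values are invented
# to illustrate the effect and are not CBECS figures.
OBSERVATIONS = [
    ("district steam", 90.0),
    ("district steam", 70.0),
    ("packaged central unit", 40.0),
    ("packaged central unit", 50.0),
    ("packaged central unit", 45.0),
    ("packaged central unit", 55.0),
    ("packaged central unit", 60.0),
]

# Collect the energy values observed for each heating system.
groups = defaultdict(list)
for system, energy in OBSERVATIONS:
    groups[system].append(energy)

# Two aggregation operators over the same cells: mean versus sum.
average = {s: sum(v) / len(v) for s, v in groups.items()}
total = {s: sum(v) for s, v in groups.items()}

top_by_average = max(average, key=average.get)
top_by_total = max(total, key=total.get)
print(top_by_average, top_by_total)
```

Here a system installed in many buildings can dominate the total while having a modest average, which is one reason the number of installations and their capacities matter when interpreting such charts.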
The drill-down operation was conducted on the building feature dimension, and the dice operation was performed at the material-level for major wall construction. Figure 8a presents the AHES values for different heating systems and major wall construction materials from a three-dimensional perspective. It shows that the buildings using “district steam or hot water piped in from outside the building” had higher heating energy use than the others, regardless of the material of the major wall construction. In particular, buildings combining “district steam or hot water piped in from outside the building” with “window or vision glass” had the highest AHES. In other words, the use of “district steam or hot water piped in from outside the building” as the heating system together with “window or vision glass” as the wall construction material is not recommended when designing a building in very-cold/cold climate regions. We then conducted drill-down and dice operations on the building feature dimension to perform a more detailed analysis of roof materials in addition to heating system types and wall materials. Figure 8b shows that the buildings with these specific energy consumption factors used only “built-up”, “metal surfacing”, or “plastic, rubber, or synthetic sheeting” as the major roof construction material; no cases with other roof materials were recorded. Buildings with “metal surfacing” as the roof material tended to consume more energy than buildings with other roofing materials; that is to say, the combination of “district steam or hot water piped in from outside the building” as the heating system, “window or vision glass” as the wall material, and “metal surfacing” as the roof material showed low energy efficiency for heating. Therefore, this combination is not recommended when designing a commercial building in very-cold/cold climate regions.
In Scenario I, we analyzed the factors that affected heating energy efficiency in very-cold/cold climate regions by using OLAP operations with various levels of abstraction. To sum up, we conducted the drill-down and dice operations on the building operation dimension to understand which heating systems significantly affected overall building energy efficiency. Then, to perform a detailed heating energy efficiency analysis, we drilled down from the envelope system-level to the material-level for walls in the building feature dimension and diced for each heating system. As a result, we observed that most of the buildings that used both “district steam or hot water piped in from outside the building” as the heating system and “window or vision glass” as the wall material had higher overall AHES values. Furthermore, adding “metal surfacing” as the roof material to this combination yielded the least energy-efficient combination.
3.1.2. Scenario II: Analysis on Cooling Energy Influencing Factors in Hot-Dry/Mixed-Dry/Hot-Humid Climate Zones
Scenario II analyzed the factors that affect cooling energy efficiency in hot-dry/mixed-dry/hot-humid (hereafter referred to as hot-dry) climate regions. As seen in Figure 5b, the ACES of buildings located in hot-dry climate zones showed the highest value; thus, the focus was placed on these climate zones. To analyze the factors driving cooling energy, a system manager can perform OLAP operations with varying levels of abstraction. In this scenario, we examined the window glass attributes in the building feature dimension. In Figure 9, we observed that, for “single layer glass”, the higher the percentage of window, the more cooling energy tended to be consumed in hot-dry climate zones. In other words, reducing the window ratio is an effective way to improve cooling energy efficiency in buildings using “single layer glass”. However, whereas cooling energy consumption for “single layer glass” increased with the window glass ratio, “multi-layer glass” showed a slightly different pattern. Additionally, the ACES value of “multi-layer glass” was higher than that of “single layer glass” when the percentage of windows was 50% or less. In particular, the ACES was highest in buildings with “26–50%” and “multi-layer glass”.
Intuitively, the cooling energy efficiency of buildings using “multi-layer glass” should be better than that of buildings with “single layer glass”. This discrepancy suggests that other factors, such as major wall and/or roof construction materials, affect cooling energy efficiency. To analyze the determining factors of cooling energy efficiency in more detail, we introduced an additional viewpoint by performing a dice operation on both the wall and roof materials, restricted to buildings with “multi-layer glass” and “26–50%” as the target window glass features. In Figure 10, we observed that buildings using “sheet metal panels” as the wall material and “plastic, rubber, or synthetic sheeting” as the roof material had the highest ACES value; that is to say, this combination had low energy efficiency for cooling. Therefore, this combination of factors is not recommended when constructing a commercial building in hot-dry climate regions. Note that this graph focused on the ACES of the buildings using “multi-layer glass” and “26–50%” as the window glass features in addition to the major wall and roof construction materials.
Scenario II described the process used to identify which factors, or combinations of factors, affected cooling energy efficiency in hot-dry climate zones by using OLAP operations with various levels of abstraction. In summary, we performed drill-down and dice operations on the building feature dimension to understand changes in cooling energy efficiency in relation to window glass features such as window type and window coverage percentage. As a result, the use of “single layer glass” turned out to be more energy efficient for cooling when combined with a low window glass coverage ratio. However, as the window glass percentage increased, the use of “multi-layer glass” was a way to improve energy efficiency. Then, to conduct a detailed analysis, we focused on “multi-layer glass” and “26–50%” as window features and performed the dice operation on both the wall and roof materials. By applying this operation, we observed that the combination of “multi-layer glass” as the type of window glass, “26–50%” as the window ratio, “sheet metal panels” as the wall material, and “plastic, rubber, or synthetic sheeting” as the roof material showed the lowest energy efficiency for cooling.
In this section, we constructed the data cube model to perform multidimensional analysis at various levels of abstraction with OLAP operations. This offered a new way to find meaningful results and to identify the factors that determine energy efficiency for buildings under specific conditions. As representative examples, we focused on specific climate regions, namely very-cold/cold and hot-dry. We then analyzed the AHES and ACES for the chosen building feature and building operation dimensions, including major wall and roof construction materials, type of window glass, percentage of window, and type of heating/cooling systems. The proposed data cube model was able to analyze AHES and ACES from multidimensional viewpoints, and provided a useful tool for discovering sometimes unexpected facts. However, the proposed system has limitations in analyzing the relationships that potentially exist among a variety of building factors. In addition, it is difficult for a system manager to establish complete rules for deriving energy-efficient combinations. To address these challenges, the association rule mining method was applied to derive meaningful rules and provide useful knowledge in the form of rules that can represent various sets of energy efficiency combinations.
3.2. Building Energy Analysis Results Based on Association Rule Mining
In this section, based on the results of the multidimensional analysis described in the previous section, we derive a set of useful rules from the building feature and building operation perspectives. Association rule mining is applied to establish the combinations of building factors, such as the material and type of each envelope component and the heating/cooling systems, that affect energy efficiency in a building. That is to say, the association rules were derived to extract frequent and meaningful correlations between various building factors and energy efficiency that could not be analyzed with a multidimensional data cube.
Since association rule mining is applied to categorical data, whereas the energy consumption data in our study were in numeric form, our sample datasets needed to be transformed into categorical data before performing this method [34]. Therefore, prior to the experiment, the energy consumption was divided into 10 grades. Each grade covered 10% of the energy consumption ranking for heating and cooling in very-cold/cold and hot-dry climate regions. For example, 1-level comprised the 10% of buildings with the lowest energy consumption for heating and cooling in very-cold/cold and hot-dry climate regions, respectively. Accordingly, the rules extracted using association rule mining were in the form of {[wall material|roof material|type of window glass|percentage of window|heating or cooling system]} → {[the grade of energy consumption for heating or cooling]}.
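The grading step and the rule format above can be sketched in Python as follows. This is an illustrative reading of the description, not the preprocessing code used in this study: the function names `decile_grades` and `to_transaction`, the item encoding `name=value`, the tie-breaking by input order, and the sample energy values are all assumptions.

```python
def decile_grades(values):
    """Grade each value from 1 (lowest 10% of consumption) to 10 (highest 10%).

    Grades are assigned by rank; ties are broken by input order (assumption).
    """
    order = sorted(range(len(values)), key=lambda i: values[i])
    grades = [0] * len(values)
    for rank, idx in enumerate(order):
        grades[idx] = rank * 10 // len(values) + 1
    return grades

def to_transaction(features, grade):
    """Turn categorical building factors plus a grade into one item set."""
    items = {f"{name}={value}" for name, value in features.items()}
    items.add(f"grade={grade}-level")
    return frozenset(items)

# Hypothetical heating energy values for ten buildings.
energies = [35.0, 80.0, 12.0, 55.0, 97.0, 41.0, 23.0, 68.0, 5.0, 74.0]
grades = decile_grades(energies)

# One hypothetical building expressed as a transaction for rule mining.
example = to_transaction(
    {"window_glass": "multi-layer glass", "window_pct": "26–50%"},
    grades[8],
)
print(grades)
print(sorted(example))
```

Each resulting transaction mixes building factors with a grade item, so a mined rule can place the factors on the left-hand side and the grade on the right-hand side.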
Table 3 and Table 4 present samples of the obtained association rules (ordered by lift) in very-cold/cold and hot-dry climate regions, respectively. Each rule shows the level of average energy consumption (AHES/ACES) associated with a specific combination of materials, types of envelope components, and heating/cooling system in the buildings.
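Because the rules are ordered by lift, it may help to recall how support, confidence, and lift are computed for a rule A → B. The Python sketch below uses the standard definitions; the four transactions and their item labels are hypothetical and do not correspond to rows of Table 3 or Table 4.

```python
def rule_metrics(transactions, antecedent, consequent):
    """Return (support, confidence, lift) for the rule antecedent -> consequent."""
    n = len(transactions)
    n_a = sum(1 for t in transactions if antecedent <= t)
    n_c = sum(1 for t in transactions if consequent <= t)
    n_both = sum(1 for t in transactions if (antecedent | consequent) <= t)
    support = n_both / n
    confidence = n_both / n_a if n_a else 0.0
    # Lift compares the rule's confidence with the baseline rate of the consequent.
    lift = confidence / (n_c / n) if n_c else 0.0
    return support, confidence, lift

# Hypothetical transactions (building factors plus an energy grade).
TRANSACTIONS = [
    frozenset({"glass=multi", "heating=boiler", "grade=1"}),
    frozenset({"glass=multi", "heating=boiler", "grade=1"}),
    frozenset({"glass=single", "heating=boiler", "grade=8"}),
    frozenset({"glass=multi", "heating=packaged", "grade=8"}),
]

support, confidence, lift = rule_metrics(
    TRANSACTIONS,
    antecedent=frozenset({"glass=multi", "heating=boiler"}),
    consequent=frozenset({"grade=1"}),
)
print(support, confidence, lift)
```

A lift above 1 indicates that the factor combination co-occurs with the grade more often than would be expected if the two were independent.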
In Table 3, if the buildings located in very-cold/cold climate zones had “multi-layer glass (type of window glass)”, “26–50% (percentage of window)”, and “boiler inside the building that produces steam or hot water (heating system)”, the grade of AHES was 1-level (refer to Rule No. 1). This means that buildings with such materials and equipment systems consumed less energy for heating. As in the multidimensional analysis using OLAP, the material, envelope type, and heating system each act as determining factors of energy efficiency in very-cold/cold climate regions. On the other hand, in Rule No. 3, the buildings with “brick, stone, or stucco (wall)”, “26–50% (percentage of window)”, and “packaged central unit (heating system)” were marked at 8-level of the energy consumption grade. Hence, this combination was not energy-efficient for heating in very-cold/cold climate zones. The other rules can be interpreted in the same way.
The rules for hot-dry climate regions are shown in Table 4. Rule No. 1 shows that the grade of ACES for buildings with “pre-cast concrete panels (wall)”, “single layer glass (type of window glass)”, “1% or less (percentage of window)”, and “packaged air conditioning units (cooling system)” was 1-level. Even if “single layer glass (type of window glass)” was used, the grade of ACES was lowered by “26–50% (percentage of window)” and “residential-type central air conditioners without using chiller water (cooling system)”, as seen in Rule No. 2. In addition, Rule No. 5 showed that a combination of factors such as “plastic, rubber, or synthetic sheeting (roof)”, “combination of single and multi-layer glass (type of window glass)”, and “central chillers inside the building that chill water for air conditioning (cooling system)” had a lower grade of energy efficiency. Consistent with the results of the multidimensional analysis, most combinations of factors found there to be less energy-efficient also appeared as less energy-efficient in the rules.
By studying the relationships among building energy related parameters (construction materials, types of envelope, and equipment systems in this research) through association rules obtained from the vast amount of data, it would also be possible to examine a variety of other potential combinations of building factors, such as lighting systems and activities in the buildings. Further studies looking into the underlying reasons for these associations are needed to derive more detailed and reliable combinations considering all influential factors.
3.3. Discussion
The proposed system had two main objectives: (1) to organize the datasets so that energy consumption profiles in buildings could be depicted and visualized by OLAP operations according to a variety of factors such as building feature, building operation, climate zone, and so on; and (2) to derive rules for combinations that enhance energy efficiency in buildings by utilizing association rule mining. The primary goal of this system was therefore to give decision-makers (or system managers) the ability to improve building energy efficiency and to better understand which building components offer the greatest potential for energy saving. Note that the proposed system was motivated by the lack of an efficient building energy analysis model that helps administrators (such as building designers and architects) decide effectively among multiple choices at the initial stage of building design. Even though the proposed system is a unique data-driven system, it is difficult to make claims about its effectiveness, since this prototype has yet to be verified in the field. Consequently, the efficiency of the proposed system needs to be proven through an actual trial; furthermore, the functions of this system could be enhanced and adapted to other situations. To implement an in-depth and detailed analysis of the factors affecting building energy consumption, more diversified building characteristics should be considered in addition to the variables selected from the building datasets used in this study. In particular, for the analysis shown in Figure 6, considering the number of installed units and their capacities along with the type of heating system would support a more precise analysis. The best case would be to perform the analysis accurately with all known properties of the buildings. However, in the real world, there are always limitations in gathering and measuring all the variables and properties that affect building energy consumption. Therefore, in this study, we intended to obtain promising insight into building energy consumption profiling based on the CBECS dataset, which was built by systematically surveying and measuring commercial buildings across the entire U.S. At the least, we believe that the proposed system can be a supplementary tool for decision makers to establish suitable energy saving strategies based on data-driven building energy analysis sessions. The system itself will not provide an answer on the best control measures or on the processes that explain the observed and predicted energy saving. Nevertheless, it should provide clues to guide further research, offering valuable and previously unknown information for the development of appropriate design strategies influencing energy saving in buildings. Thus, additional collaborative research with eco-friendly building designers and architects will be required for the construction of an advanced energy-sensitive building design support system in the future.
4. Conclusions
In this study, we proposed a new evaluative methodology for analyzing the factors that influence energy efficiency in selected commercial buildings in the U.S. by utilizing the CBECS dataset. The CBECS dataset built by the U.S. EIA includes factual information about different building attributes and energy consumption for 6720 commercial buildings in the U.S. It also contains almost 28 building related features such as climate regions, divisions, materials, heating/cooling system, and so on. In the preprocessing steps, the CBECS dataset was refined by selecting the most important features affecting the energy consumption profile of a building. In addition, the records containing missing or noisy values were removed for more precise analysis. As a result, 5134 records and 10 features were used in this study.
The core of the methodology was the construction of a data cube model to handle the vast amount of data regarding building variables and energy consumption measurements accumulated in a data warehouse. Multidimensional building energy analysis was performed using OLAP operations, varying the levels of abstraction according to the dimensions of building features, building operation, and climate regions. By applying association rule mining, the proposed analysis model elicited useful information revealing how the advantageous and disadvantageous combinations of materials and types for the studied buildings depended on the climate conditions and modes of energy consumption (heating or cooling). We confirmed the feasibility and applicability of the proposed analysis model by implementing it using the R tools and applying it to the identification of building energy consumption patterns in the U.S. To conclude, our research results provide a useful knowledge acquisition method that can help decision makers identify the critical building factors that could potentially increase energy efficiency, so that eco-friendly and energy-saving buildings can be designed without heavy computer simulations under a specific set of conditions. Further validation and refinement of the proposed system are required and will be part of our ongoing research.