*Article* **A Data Analytics-Based Energy Information System (EIS) Tool to Perform Meter-Level Anomaly Detection and Diagnosis in Buildings**

**Roberto Chiosa, Marco Savino Piscitelli and Alfonso Capozzoli \***

Department of Energy "Galileo Ferraris", TEBE Research Group, BAEDA Lab, Politecnico di Torino, Corso Duca degli Abruzzi 24, 10129 Turin, Italy; roberto.chiosa@polito.it (R.C.); marco.piscitelli@polito.it (M.S.P.) **\*** Correspondence: alfonso.capozzoli@polito.it

**Abstract:** Recently, the spread of smart metering infrastructures has enabled the easier collection of building-related data. It has been proven that a proper analysis of such data can bring significant benefits for the characterization of building performance and spotting valuable saving opportunities. More and more researchers worldwide are focused on the development of more robust frameworks of analysis capable of extracting from meter-level data useful information to enhance the process of energy management in buildings, for instance, by detecting inefficiencies or anomalous energy behavior during operation. This paper proposes an innovative anomaly detection and diagnosis (ADD) methodology to automatically detect at whole-building meter level anomalous energy consumption and then perform a diagnosis on the sub-loads responsible for anomalous patterns. The process consists of multiple steps combining data analytics techniques. A set of evolutionary classification trees is developed to discover frequent and infrequent aggregated energy patterns, properly transformed through an adaptive symbolic aggregate approximation (aSAX) process. Then a post-mining analysis based on association rule mining (ARM) is performed to discover the main sub-loads which mostly affect the anomaly detected at the whole-building level. The methodology is developed and tested on monitored data of a medium voltage/low voltage (MV/LV) transformation cabin of a university campus.

**Keywords:** building energy management; energy information systems; anomaly detection and diagnosis; classification tree; symbolic aggregate approximation; association rule mining

#### **1. Introduction**

The building sector is globally recognized as one of the most energy-intensive, and its energy demand continues to increase as a result of a combination of factors such as extreme climatic events and increased demand for energy services, in particular those related to air conditioning and the quality of the built environment. According to the International Energy Agency (IEA) for the EU member states, buildings are responsible for around 21% of primary energy consumption [1].

As a result, this sector is currently among the most strategic ones for reducing global energy demand, improving energy efficiency, and achieving specific decarbonization targets. In recent years, the strong focus on buildings has also been encouraged by the introduction of a robust regulatory framework that highlights the importance of more responsible building energy management. In this perspective, the technological advancements that have characterized the world of IoT (Internet of Things) and ICT (information and communication technology) have played a fundamental role in determining an ever-increasing spread of advanced monitoring and automation infrastructures in buildings, making it possible to collect a huge amount of data and information related to the real performance in the operation of such complex systems.

**Citation:** Chiosa, R.; Piscitelli, M.S.; Capozzoli, A. A Data Analytics-Based Energy Information System (EIS) Tool to Perform Meter-Level Anomaly Detection and Diagnosis in Buildings. *Energies* **2021**, *14*, 237. https://doi.org/10.3390/en14010237

Received: 8 December 2020 Accepted: 30 December 2020 Published: 5 January 2021

**Publisher's Note:** MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

**Copyright:** © 2021 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

The analysis of data collected represents a huge opportunity to identify and define effective energy-saving strategies and to optimize building performance in operation [2,3]. This process can be considered as the starting point of all the activities that are aimed at reducing the gap between the actual and expected building energy performance that is often generated by incorrect occupant behavior, equipment faults, and wrong or ineffective control strategies of energy systems [4].

Moreover, thanks to the growing availability of open access building data sets [5–7], analysts can quantitatively compare different processes of analysis, evaluating algorithm performance and assessing building energy performance in a more objective and transparent way [8].

Nonetheless, professional figures involved in the energy management process of buildings are now facing great difficulties in managing these large amounts of data and setting their analyses in a systematic way in order to extract useful knowledge and consequently the desired value.

For this purpose, energy management and information systems (EMISs) can be employed. EMISs belong to the rapidly evolving family of tools that monitor, analyze, and control building energy use and system performance, often leveraging advanced data analytics-based technologies. According to [9], the first classification of EMISs distinguishes such systems according to whether their functionalities are enabled at the meter level or at the system level. Meter-level EMISs consider data measurements at a high level (e.g., data related to the total load or to the main sub-loads), while system-level EMISs are focused on more detailed data related to the operation of specific systems or components. Energy information systems (EISs) are part of EMISs and integrate software solutions conceived for the analysis of meter-level monitored data of buildings that are not usually collected through building automation systems (BAS). EISs typically enable predictive pattern recognition analysis for performing essential tasks in building energy management such as energy consumption forecasting, anomaly detection and diagnosis, advanced benchmarking, load profiling, and schedule optimization of building energy systems [4].

Among these tasks, anomaly detection and diagnosis has been the most underdeveloped for application on meter-level data.

Anomaly detection and diagnosis (ADD) in buildings is often related to fault detection and diagnosis (FDD) analysis conducted at the system/component level, where the scale of analysis is small (e.g., air handling unit components). However, in most real cases, just a few aggregate variables related to the total energy consumption of the building are monitored and collected. Improving the building energy performance by analyzing aggregate data is challenging, especially because several factors such as occupant behavior, comfort levels, and operational schedules of systems may generate different energy consumption patterns that are not always easily inferable. In this context, an EIS tool capable of automatically detecting anomalous energy trends in building energy consumption allows energy managers to be promptly informed when the building is not behaving as expected and to avoid inefficient energy management procedures.

In the process of ADD, pattern recognition techniques play a key role in the analysis of patterns and trends in high-dimensional time series of building energy consumption [10]. There are three main expected goals behind ADD analysis in buildings that can be summarised as follows:


According to the aforementioned objectives, this work proposes an EIS tool capable of performing ADD analysis in buildings by exploiting meter-level data. ADD procedures are usually performed offline and on small subsets of historical data, but interest is growing in developing an automatic framework of analysis for online implementations. For this purpose, this paper presents an innovative ADD methodology conceived for application in a real testbed (i.e., the university campus of Politecnico di Torino). The proposed methodology enables the automatic detection of energy anomalies at the whole-building level and their diagnosis at the sub-load level, revealing which sub-load/sub-loads are responsible for the anomalies detected. According to the objective of this paper, the next paragraph reports and discusses the literature concerning the implementation of ADD processes in buildings and presents the main contributions introduced in this work.

#### *Related Work and Contribution of the Paper*

ADD is extremely valuable for improving building energy performance and promising in terms of cost reduction potential if implemented in currently adopted EISs [11]. Despite the great potential offered by ADD at different levels of investigation in buildings, the implementation of this kind of analysis has been mainly focused on the system/component level (e.g., heating, ventilation, and air conditioning (HVAC) systems), often neglecting applications at the whole-building level. This trend has been justified by the great availability of system-level data collected by building automation systems (BASs) in buildings. However, extracting any kind of meaningful information from BASs (especially from the outdated ones) can be a complicated task, usually characterized by limitations on data availability. Conversely, the collection of meter-level data in buildings is often performed by means of modern IoT devices that make monitored data more easily available than ever before. In this context, EIS tools focused on the analysis of meter-level data (especially ADD analysis) are becoming a very fast-growing market in the context of building analytics technologies.

According to the literature, the field of ADD in buildings is progressively leveraging the application of data analytics techniques [12] for addressing both detection and diagnosis tasks.

The first task is often accomplished through the use of classification, regression, and pattern recognition techniques capable of providing estimations of the building energy consumption in normal operation according to specific boundary conditions (e.g., outdoor climatic conditions). The estimations are then used as a reference baseline for detecting the occurrence of abnormal patterns in the time series that significantly differ from the majority of processed data and/or from the expected trend [13].

As far as supervised techniques for anomaly detection are concerned, in [14] building energy consumption anomalies are identified by comparing the actual consumption with the prediction of a hybrid artificial neural network (ANN) model. A similar approach is adopted in [15], where a deep neural network autoencoder was used to create a prediction model able to successfully detect abnormal energy patterns in the building operational data of an educational building in Hong Kong. Similarly, a general anomaly detection process is also proposed in [16], where the authors employed a variational recurrent autoencoder. Among supervised techniques, classification algorithms have also proved to be effective in anomaly detection. A robust methodology based on classification trees (CT) was proposed in [17]. In more detail, in that study, a set of classifiers was used for predicting the occurrence of categorical patterns in the time series of the total building electrical load, making it possible to detect a potential anomaly in the case of misclassification (i.e., the same concept of residual analysis in the case of regression models). The study underlined the prediction capabilities of CT algorithms and, most of all, the possibility of exploiting their interpretable nature in anomaly detection problems by extracting useful "if-then" decision rules.

In the context of unsupervised learning for anomaly detection, clustering and association rule mining (ARM) are the most used techniques [18,19]. In [20], the authors used k-means clustering to automatically discover anomalies in whole-building energy consumption among daily load profiles characterized by an infrequent trend. In [21], an agglomerative hierarchical clustering-based strategy and three different dissimilarity measures were used to identify typical electrical usage profiles that enabled the detection of the abnormal ones.

As previously stated, the use of decision rules in the form of "if-then" implications is extremely valuable in anomaly detection. Following an unsupervised approach, this can be achieved by extracting association rules from a building operational dataset. Association rule mining (ARM) algorithms have been widely used to discover abnormal patterns in the energy consumption of buildings and systems and then to enhance their performance. ARM allows discovering causal relationships between events also in the time domain [22]. This kind of algorithm is particularly suitable for extracting hidden knowledge from large databases, as reported in [23], where an extensive rule extraction is performed to detect energy wastes in the operation of a lighting system. Similarly, in [10], an improved ARM-based method was employed to discover and detect abnormal operational patterns of HVAC systems installed in a commercial building in Shenzhen (China).

More sophisticated approaches for anomaly detection consist of combining several techniques to maximize the amount of knowledge discovered and automate the process of analysis.

The study conducted in [24] introduced the concept of collective anomaly detection, described as an event that is considered anomalous only if considered in relation to other events. In the proposed framework, ARM, performed through the Apriori algorithm, was used to extract the most frequent items from a time series related to smart grid operation. Then, anomalous behavior was identified through clustering analysis, considering silhouette indicator as a quality metric. Also, in [25], an anomaly detection process based on an ensembling technique was proposed. In detail, typical building operational patterns were identified by means of clustering analysis, and then an ARM algorithm was used to discover an anomalous load of a cooling chiller system installed in a building in Hong Kong. In [26], a multi-step clustering analysis was performed for removing anomalous daily load profiles from the energy consumption time series of a university campus. Then a regression model was developed on the anomaly-free dataset, combining artificial neural network (ANN) and regression tree (RT), to be used in online applications for detecting the occurrence of anomalous trends in the electrical energy consumption.

Another crucial aspect that arises from the literature review deals with the use of data reduction and transformation methods for (i) reducing the computational cost of the analysis, (ii) easily extracting the main patterns from time series, and (iii) improving the effectiveness of supervised and unsupervised algorithms in detecting anomalies. In fact, directly analyzing raw time series data could be extremely onerous, making it difficult to handle and characterize the data under investigation. In this perspective, dimensionality reduction can be used with a low computational cost, for example, for removing irrelevant patterns and redundancy from energy consumption datasets. As reviewed in [12], various techniques were explored to enable the classification of data as normal or anomalous, such as principal component analysis (PCA) [27], linear discriminant analysis (LDA) [28], and singular value decomposition (SVD) [29].

In this context, symbolic aggregate approximation (SAX) [30] is one of the most promising techniques available to reduce the size of a time series without losing key information [31]. The SAX algorithm is conceived for the reduction of the time series through a piecewise technique and its transformation into symbolic strings. Frequent symbolic sub-sequences in the whole sequence can then be extracted and defined as motifs (i.e., normal patterns), while infrequent ones can be isolated and labeled as discords (i.e., potential anomalies). In [31], SAX was used to discover patterns in time series related to the energy consumption of the International Commerce Centre (ICC) in Hong Kong and to recognize inefficient operating conditions that could cause energy wastes. Also, in [20], SAX was used for enabling the extraction of infrequent operating patterns in the energy consumption time series of a school campus and an office building. In particular, discords were detected by setting a minimum frequency threshold on the occurrence of SAX symbol sub-sequences representative of the original daily load profiles. In [17], an enhanced version of SAX called adaptive SAX (aSAX) was used for minimizing the information loss due to the reduction and transformation of energy consumption time series and recognizing motif and discord symbolic patterns by means of classification models.

Once the detection of anomalies in energy consumption is performed, a diagnosis analysis makes it possible to identify the main causes associated with them. The field of anomaly diagnosis has been widely explored in buildings but with a greater focus on system-level applications rather than the whole-building level. This research field also largely benefits from the use of data analytics techniques following both supervised and unsupervised approaches. The study in [32] proposed a process based on the development of a CT for diagnosing anomalies in the operation of air handling unit (AHU) components. Moreover, in [22], a CT was used for diagnosing up to 11 typical faults in AHU with an accuracy higher than 90%. Indeed, similarly to the previously presented studies focused on anomaly detection, the diagnosis analysis also often exploits algorithms that allow the extraction of decision rules. Such a condition is particularly favorable for the final user due to the high interpretability of the diagnosis process, whose meaningfulness can then be easily validated by domain expertise. To this aim, ARM algorithms can also be employed, as reported in [33–36].

On the basis of the literature review, in most of the cases, only meter level anomaly detection is performed at the whole-building level without any further analysis for identifying anomaly causes among sub-loads at a lower level.

The work presented in this paper aims to bridge this literature gap by introducing a novel hierarchical multi-level approach in the ADD process. The proposed methodology performs the anomaly detection phase at the whole-building level, and only if an anomalous pattern is detected is an event-based diagnostic process activated for finding root causes at the sub-load level. The event-based hierarchical approach in anomaly diagnosis makes it possible to reduce the computational cost of the analysis and also to rationalize the number and the quality of feedback generated by the ADD tool during operation. Indeed, the final user is not required to visually inspect the trends of all sub-loads in real-time, but he/she is alerted only when interesting events occur, i.e., when specific anomalous conditions among the sub-load trends generate a divergence of the total load from the expected pattern.

This work combines different advanced data analytics techniques with the aim of maintaining the output of the ADD process human-readable and interpretable while providing accurate results.

The paper considers as a case study the energy consumption data gathered from a monitoring infrastructure installed in the university campus of Politecnico di Torino. The data refer to the electrical energy consumption of a medium voltage/low voltage (MV/LV) transformation cabin that serves different buildings/zones of the campus. In particular, about ten sub-loads of the cabin are available for developing the introduced hierarchical ADD process. The methodology leverages the reduction and transformation of the analyzed time series through an enhanced and adaptive process based on symbolic aggregate approximation (aSAX) as presented in [17]. The aSAX transformation enabled a reduction of the dataset and an effective identification of unexpected operational energy consumption patterns at the level of sub-daily time windows. Furthermore, the diagnosis of the abnormal patterns detected at the total load level (i.e., MV/LV transformation cabin) was provided by implementing an association rule mining (ARM) algorithm on the sub-load time series. In this context, the main innovative aspects introduced by the present paper can be summarised as follows:

• In order to further enhance the pattern recognition process enabled by the aSAX-based process introduced in [17], different features of the energy consumption time series were encoded in symbols in addition to the mean value evaluated in each time window for data reduction purposes. In particular, the encoding of trend features of the time series was performed, allowing an improved characterization of energy consumption behavior and making it possible to reduce the information loss that is always related to the application of temporal abstraction processes such as aSAX. In addition, both the number of time windows and alphabet size for the encoding of the time series in symbols were tuned during the analysis through a fully automatic process.


The rest of the paper is organized as follows. Section 2 provides an overview and a brief theoretical description of the data analytics methods used for conducting ADD analysis. Section 3 presents and describes the case study considered for the analysis. Section 4 introduces the methodological framework on the basis of the ADD analysis performed. Finally, Sections 5 and 6 present and discuss the results obtained, while in Section 7, the concluding remarks and future research perspectives are reported.

#### **2. Description of the Data Analysis Methods**

In this section, the data analytics methods employed in this work are briefly described. The method description is not intended to be exhaustive, but it is aimed to underline the usefulness in the framework of this study and building energy data exploitation.

#### *2.1. Adaptive Symbolic Aggregate Approximation (aSAX)*

Meter-level data measurements are collected in the so-called time series: a two-dimensional matrix where each row corresponds to a single observation in time and each column to a measured variable [31]. The sampling frequency determines the time interval between two consecutive observations, and for building applications, it is usually in the order of minutes. As a consequence, the resulting high-dimensional time series is often computationally expensive to store and analyze in its original form. In this context, many dimensionality reduction and transformation techniques have been proposed in the literature; one of the most widely used is the symbolic aggregate approximation (SAX), which makes it possible to compress a time series while preserving its fundamental characteristics [30]. This process segments the original time series into sub-sequences, each of which is summarised with a single numerical value (e.g., mean value) that is then encoded into an alphabetic symbol; the symbols are finally combined into a string. The resulting string is much shorter than the original time series and enables the application of various pattern recognition techniques while reducing the computational cost. In recent years, some variations to the original algorithm have been proposed in the literature, especially with the aim of generalizing some initial assumptions (e.g., data distribution) and facing the information loss issues always generated by the reduction and transformation of time series. In the authors' opinion, one of the greatest improvements to SAX was introduced through the so-called adaptive symbolic aggregate approximation (aSAX) [38]. In the following, the main steps of the aSAX process are presented, with specific reference to their implementation in the present work.
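As a point of reference for the adaptive variant, the original SAX procedure summarised above (segment, average, encode) can be sketched in a few lines of Python. This is a minimal sketch and not the implementation used in this work: it assumes z-normalised data, equal-length segments, and the equiprobable Gaussian breakpoints of the original algorithm [30]; the function names `paa` and `sax` and the sample load values are illustrative.

```python
from statistics import NormalDist, mean, pstdev

def paa(values, n_segments):
    """Piecewise aggregate approximation: mean of (near-)equal-length segments."""
    n = len(values)
    bounds = [round(i * n / n_segments) for i in range(n_segments + 1)]
    return [mean(values[a:b]) for a, b in zip(bounds, bounds[1:])]

def sax(values, n_segments, alphabet_size):
    """Classic SAX: z-normalise, reduce with PAA, encode with Gaussian breakpoints."""
    mu, sigma = mean(values), pstdev(values)
    z = [(v - mu) / sigma if sigma > 0 else 0.0 for v in values]
    # Breakpoints split the standard normal distribution into equiprobable regions.
    breakpoints = [NormalDist().inv_cdf(k / alphabet_size) for k in range(1, alphabet_size)]
    symbols = "abcdefghijklmnopqrstuvwxyz"
    # Symbol index = number of breakpoints lying at or below the segment mean.
    return "".join(symbols[sum(b <= m for b in breakpoints)] for m in paa(z, n_segments))

# Hypothetical load values (12 samples) reduced to a 3-symbol word.
load = [10, 11, 10, 12, 40, 42, 41, 43, 25, 24, 26, 25]
word = sax(load, n_segments=3, alphabet_size=3)
print(word)
```

The fixed Gaussian breakpoints are the assumption that the adaptive variant relaxes, by fitting the breakpoints to the observed distribution instead.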


$$\theta = \operatorname{atan}\left(\frac{\Delta q(t_n) - \Delta q(t_l)}{n}\right) \tag{1}$$


**Figure 1.** Definition of trend feature triangle and trend angle for a generic time series (y(t)).

The trend angle domain ranges continuously from −90° to 90°. If the trend angle value is approximately zero (θ ≈ 0), the trend is stationary; if it is positive (θ > 0), the trend is rising; vice versa, if it is negative (θ < 0), the trend is descending.


• **Encoding**: this step consists of setting an alphabet size α and assigning an alphabetic character to each time window, according to where the extracted numerical feature lies within a set of breakpoints (β = {β<sub>1</sub>, …, β<sub>α−1</sub>}) identified according to the shape of the feature distribution. The aSAX algorithm [38] finds the optimal positions of breakpoints through an iterative process by minimizing the distance among all the data points included between two consecutive breakpoints and their centroid (calculated average center). Eventually, the symbol can be assigned for each window (τ), creating a word of length W for the given sub-sequence (T<sub>i</sub>). The original numerical time series y(t) is then transformed into an alphabetic string (y(α)) of length W × N.
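The trend angle of Equation (1) and its three regimes (stationary, rising, descending) can be illustrated with a short Python sketch. It rests on assumptions that are not stated in the text: the numerator is taken as the difference between the last and the first value of a window of n samples, the values are assumed to be suitably scaled, and the tolerance `tol` used for "approximately zero" is a hypothetical choice. The names `trend_angle` and `trend_label` are illustrative.

```python
import math

def trend_angle(window):
    """Trend angle in degrees: atan of (last value - first value) over the sample count."""
    n = len(window)
    return math.degrees(math.atan((window[-1] - window[0]) / n))

def trend_label(theta, tol=5.0):
    """Map an angle to a trend class; `tol` (degrees) is an illustrative threshold."""
    if abs(theta) <= tol:
        return "stationary"
    return "rising" if theta > 0 else "descending"

# Hypothetical time windows of four samples each.
windows = {
    "flat": [50.0, 50.5, 49.8, 50.2],
    "ramp_up": [20.0, 35.0, 50.0, 65.0],
    "ramp_down": [65.0, 50.0, 35.0, 20.0],
}
labels = {name: trend_label(trend_angle(w)) for name, w in windows.items()}
print(labels)
```

Because the arctangent maps any real slope into the open interval (−90°, 90°), the resulting feature is bounded and can be discretised with breakpoints in the same way as the window mean.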


Figure 2 shows an example of time series temporal abstraction conducted with the aSAX process. An electrical load time series (y(t) = {y<sub>1</sub>, …, y<sub>192</sub>}, black line) with a 15 min sampling frequency is divided into two sub-sequences T<sub>i</sub> and T<sub>i+1</sub> of 24 h each. In this example, five time windows (W = 5) of unequal length are identified for each sub-sequence, and the alphabet size is set to five (α = 5), meaning that four breakpoints β = {β<sub>1</sub>, β<sub>2</sub>, β<sub>3</sub>, β<sub>4</sub>} are identified. The time series is then approximated through PAA (red segments), and for each segment, the corresponding symbol is assigned. The PAA values distribution is shown on the right side of the figure in red and the breakpoints, evaluated through the aSAX, in dashed blue lines. The original time series for the sub-sequence T<sub>i+1</sub> is converted from a numerical vector into an alphabetic string "a-b-d-c-a", reducing it from a 96-dimensional object to a 5-dimensional one.

**Figure 2.** Example of an adaptive symbolic aggregate approximation (aSAX) process applied to an electrical load time series (T = 24 h, W = 5, α = 5).
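The iterative breakpoint search described in the encoding step can be sketched as a one-dimensional Lloyd-style iteration, in which the centroid of each interval and the midpoints between adjacent centroids are updated in turn. This is a minimal sketch under stated assumptions, not the implementation of [38]: the equal-frequency initialisation, the stopping rule, the function names `adaptive_breakpoints` and `encode`, and the window mean values are all illustrative.

```python
from bisect import bisect_right
from statistics import mean

def adaptive_breakpoints(values, alphabet_size, max_iter=100):
    """Lloyd-style iteration: alternate interval centroids and midpoint breakpoints."""
    data = sorted(values)
    # Initial guess (assumption): equal-frequency quantiles of the data.
    breaks = [data[(k * len(data)) // alphabet_size] for k in range(1, alphabet_size)]
    for _ in range(max_iter):
        groups = [[] for _ in range(alphabet_size)]
        for v in data:
            groups[bisect_right(breaks, v)].append(v)
        if any(not g for g in groups):
            break  # degenerate interval: keep the current breakpoints
        centroids = [mean(g) for g in groups]
        new_breaks = [(a + b) / 2 for a, b in zip(centroids, centroids[1:])]
        if new_breaks == breaks:
            break  # converged
        breaks = new_breaks
    return breaks

def encode(values, breaks):
    """Assign one symbol per value according to the interval it falls into."""
    symbols = "abcdefghijklmnopqrstuvwxyz"
    return "-".join(symbols[bisect_right(breaks, v)] for v in values)

# Hypothetical window means (e.g., PAA values in kW) for two sub-sequences.
window_means = [12, 14, 13, 55, 58, 60, 30, 33, 31, 12]
breaks = adaptive_breakpoints(window_means, alphabet_size=3)
word = encode(window_means, breaks)
print(breaks, word)
```

With these values the breakpoints settle between the three visible load levels, so that low, intermediate, and high window means are mapped to "a", "b", and "c", respectively.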

#### *2.2. Recursive Partitioning and Globally Optimal Evolutionary Tree*

Classification is the task of assigning a class label to unlabelled data instances through a classifier model, providing prediction or description of a given dataset [43]. The classification model is created through an inductive learning algorithm using a training set, which is a data frame with attributes and labeled instances. Once the model has been created, its performance is evaluated on a test set through the comparison between the predicted and real labels. The decision tree is the most commonly used model for classification, thanks to its understandable graphical representation. Depending on the type of target attribute, discrete categorical or continuous numerical, a decision tree is called either a classification tree or a regression tree, respectively. The tree consists of a root, internal nodes, and leaves, all connected by branches. The construction of a tree classifier can be performed through different algorithms; in this framework, recursive partitioning and globally optimal evolutionary trees are considered.

The most commonly used recursive partitioning method is the classification and regression tree (CART), which is a binary decision tree based on the splitting of the instances into purer subsets (i.e., nodes) through decision rules [44]. It proceeds in a forward step-wise approach by maximizing homogeneity in each child node, yielding a locally optimal tree.
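A single step of this greedy procedure can be sketched as follows. The sketch assumes Gini impurity as the homogeneity measure (the text does not specify the criterion) and a single numerical attribute; the function names and the data (hour of the day versus symbol of the load) are hypothetical.

```python
from collections import Counter

def gini(labels):
    """Gini impurity of a list of class labels (0 for a pure node)."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def best_split(xs, ys):
    """Return (threshold, weighted impurity) of the best binary split x <= threshold."""
    best = (None, gini(ys))
    candidates = sorted(set(xs))
    # Candidate thresholds are midpoints between consecutive distinct values.
    for lo, hi in zip(candidates, candidates[1:]):
        thr = (lo + hi) / 2
        left = [y for x, y in zip(xs, ys) if x <= thr]
        right = [y for x, y in zip(xs, ys) if x > thr]
        score = (len(left) * gini(left) + len(right) * gini(right)) / len(ys)
        if score < best[1]:
            best = (thr, score)
    return best

# Hypothetical training data: hour of the day and the symbol observed at that hour.
hour = [1, 3, 5, 8, 10, 13, 16, 19, 22]
symbol = ["a", "a", "a", "c", "c", "c", "c", "b", "b"]
threshold, impurity = best_split(hour, symbol)
print(threshold, round(impurity, 4))
```

Recursive partitioning repeats this search inside each child node. Each split is chosen in isolation, which is why the resulting tree is only locally optimal.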

Conversely, the so-called evolutionary decision tree is based on a stochastic algorithm that aims to construct a globally optimum classification model [37]. This process randomly initializes the root node split, then at each iteration, variation operators (i.e., split, prune, major split rule mutation, minor split rule mutation, crossover) are applied. The survivor is selected, and the process is repeated until the stopping criterion is satisfied. The evolutionary tree algorithm used in this paper is implemented in the R package "evtree" [37].

One of the most important hyper-parameters that can be set for this algorithm is the variation operator probability, which refers to the probability that a given variation operator is chosen at a generic iteration. The default operator probability considered is c20m40sp40, meaning that the algorithm has a 20% probability of selecting the crossover operator, a 40% probability of selecting one of the mutation operators (20% for minor split rule mutation and 20% for major split rule mutation), and a 40% probability of selecting either the split operator (20%) or the prune operator (20%).
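The meaning of the label c20m40sp40 can be made concrete with a short Python sketch that expands it into one probability per operator and samples operators accordingly. This illustrates the notation only and is not the "evtree" interface: the parser, the operator names, and the equal division of the mutation and split/prune shares (which follows the default described above) are assumptions.

```python
import random
import re

def operator_probabilities(code):
    """Expand a label such as 'c20m40sp40' into per-operator probabilities."""
    match = re.fullmatch(r"c(\d+)m(\d+)sp(\d+)", code)
    if match is None:
        raise ValueError(f"unrecognised operator code: {code!r}")
    crossover, mutation, split_prune = (int(g) / 100 for g in match.groups())
    # Assumption: mutation and split/prune shares are divided equally.
    return {
        "crossover": crossover,
        "major_mutation": mutation / 2,
        "minor_mutation": mutation / 2,
        "split": split_prune / 2,
        "prune": split_prune / 2,
    }

probs = operator_probabilities("c20m40sp40")

# Draw the operator applied at each of 10,000 hypothetical iterations.
rng = random.Random(0)
draws = rng.choices(list(probs), weights=list(probs.values()), k=10_000)
share = {op: draws.count(op) / len(draws) for op in probs}
print(probs)
print(share)
```

Under the default setting, each of the five operators is therefore selected in roughly one iteration out of five.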

The advantage of an evolutionary tree algorithm is that it tends to offer higher accuracy in prediction than recursive partitioning algorithms [37] while maintaining the same interpretable tree structure.

#### *2.3. Association Rules Mining (ARM)*

ARM is a widely used technique that allows extracting static causal relationships and correlations between attributes in a dataset. The objective is to find a group of variables (items) that frequently occur together in a database. This technique can only handle categorical variables, and it is usually computationally costly. One of the most used ARM algorithms is the iterative Apriori algorithm, which is based on frequent itemsets and allows the extraction of static rules from a categorical transactional dataset [45]. Association rules are defined between sets of items (or itemsets) in the form A ⇒ B, where A is the itemset called antecedent (LHS = left-hand side of the rule), B is the consequent (RHS = right-hand side of the rule), and A ∩ B = ∅. Rule extraction is usually restricted to a single item in the consequent.

Some user-defined parameters (confidence, support, and lift) need to be set in order to evaluate the significance of the obtained rules and filter out the less important ones. A domain expert sets those parameters according to each specific case. The support is calculated as the probability of the intersection between the antecedent A and consequent B (*supp*(A ⇒ B) = P(A ∩ B)), expressing the co-occurrence of the two events. The confidence (*conf*(A ⇒ B) = P(B|A)), defined as the conditional probability of B given A, allows assessment of the reliability of a rule. It gives the probability of the consequent event in all transactions containing the antecedent. The lift is the ratio between the confidence of the rule and the support of the consequent B (*lift*(A ⇒ B) = P(B|A) / P(B)). When the lift is higher than 1, it means that B is positively correlated with A, while if the lift is lower than 1, it suggests a negative correlation; otherwise, if the lift is equal to 1, there is no correlation at all. This parameter is particularly important since it allows one to select the most interesting rules [31]. In this paper, the ARM Apriori algorithm was used to extract interesting associations between the total building load and its sub-loads, especially during events detected as anomalous. This data analytics method integrates naturally with the outcome of the aSAX and classification processes, the results of which consist of categorical values.
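The three metrics can be computed directly from their definitions, as in the following minimal Python sketch. The transactions, the item labels, and the function name `rule_metrics` are hypothetical and serve only to illustrate the formulas; the sketch does not implement the Apriori search for frequent itemsets.

```python
def rule_metrics(transactions, antecedent, consequent):
    """Support, confidence and lift of the rule antecedent => consequent."""
    a, b = set(antecedent), set(consequent)
    n = len(transactions)
    n_a = sum(a <= t for t in transactions)         # transactions containing A
    n_b = sum(b <= t for t in transactions)         # transactions containing B
    n_ab = sum((a | b) <= t for t in transactions)  # transactions containing A and B
    support = n_ab / n
    confidence = n_ab / n_a
    lift = confidence / (n_b / n)
    return support, confidence, lift

# Hypothetical categorical transactions (one per time window).
transactions = [
    {"total=anomalous", "canteen=high"},
    {"total=anomalous", "canteen=high", "data_center=high"},
    {"total=normal", "canteen=low"},
    {"total=normal", "canteen=low", "data_center=high"},
    {"total=normal", "canteen=high"},
    {"total=normal", "canteen=low"},
    {"total=anomalous", "canteen=high"},
    {"total=normal", "canteen=low"},
]
support, confidence, lift = rule_metrics(
    transactions, {"canteen=high"}, {"total=anomalous"}
)
print(support, confidence, lift)
```

In this toy dataset the rule {canteen=high} ⇒ {total=anomalous} holds in 3 of 8 transactions (support 0.375) and in 3 of the 4 transactions containing the antecedent (confidence 0.75). Since the consequent alone occurs in 3 of 8 transactions, the lift is 2, indicating a positive correlation.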

#### **3. Case Study**

The case study refers to the energy consumption of an MV/LV transformer cabin identified as "substation C", which serves part of the main campus of Politecnico di Torino (PoliTo), an Italian university located in Turin. Data on the total electrical load and on some sub-loads are available with 15 min timesteps from 1 January 2015 to 31 December 2019. The hierarchical structure of the available data is shown in Figure 3: the first level refers to the total electrical load of substation C, while the second level shows the available sub-loads. In addition, the load breakdown in terms of average annual energy consumption was provided.


**Figure 3.** Hierarchical structure of the electrical load database under study.

In particular, a bar and a canteen were at the disposal of students and campus staff and accounted for 2.75% and 16.03%, respectively, of the total electrical energy consumption of substation C. The university data center accounted for 13.16% of the total energy consumption. The administration offices (rectory) accounted for 3.83% of energy consumption and the mathematics department (DIMAT) for 2.21%. A large share of energy consumption (12.22%) was related to the mechanical room. The equipment located in this room included hot and chilled water circuits and auxiliaries such as recirculation pumps. The chilled water was provided by two chillers with a nominal electrical power of 220 kW and a rated cooling capacity of 1120 kW, and by a reversible water-water heat pump with a nominal power and cooling capacity of 165 kW and 590 kW, respectively.

The remaining energy consumption was aggregated under a single instance tagged as "Unlabelled\_load", as shown in Figure 3. It accounted for 48.76% of the total energy consumption and, since it was not directly measured, could not be assigned to a specific sub-load.

#### **4. Methodological Framework**

In this section the conceived ADD methodology is presented and described. The proposed methodology aims to develop a two-level ADD analysis capable of making in a first step a high-level detection on total electrical load time series (at meter level) and in a second step performing the anomaly diagnosis on sub-loads (at sub-meter level). The methodology follows the flow chart structure shown in Figure 4.


**Figure 4.** Flow chart explaining the adopted methodology.

In particular, four steps of analysis are considered.

• **Anomaly detection at total electrical load level**: Anomaly detection was performed on the encoded total electrical load time series of substation C. In each sub-daily time window, the total electrical load symbol obtained through aSAX was predicted through a globally optimal evolutionary tree [37], using as explanatory attributes contextual information such as calendar variables (day type and holiday) and energy variables (electrical demand of sub-loads). The model was developed through a train-test-validation process and was able to predict the expected symbol in each time window with high accuracy. However, when the model failed to correctly predict the symbol in a time window, the occurrence of a potential anomaly was assumed. Referring to Figure 5, the predicted symbol is the one with the highest occurrence in a given leaf node (green bar). All other symbols were infrequent and therefore potentially anomalous (yellow and red bars). Given the interest in detecting a higher electrical load than normal, only the tree leaf nodes that showed infrequent symbols corresponding to a high electrical load (red bars in Figure 5) were considered and investigated in the following diagnostic phase;

• **Diagnosis at sub-load level:** Once the classification models were developed, a post-mining phase was performed. The post-mining phase was aimed at searching for historical relationships between misclassified total electrical load symbols and specific trends of sub-loads occurring in the same time window. The process is described in Figure 6. The anomalous symbols identified in the training phase of the models were extracted and stored in a categorical data frame (Step-1 in Figure 6). From the time series of sub-loads, the mean value and the trend angle were extracted. They were categorised through the aSAX process and then added to the categorical data frame (Step-2 in Figure 6). This data frame was then transformed into a transactional database on which ARM was applied (Step-3 in Figure 6). The LHS is composed of the additional categorical variables related to sub-loads, while the RHS contains only the anomalous total electrical load symbol. ARM automatically extracts a set of rules that connect the historical infrequent behaviour of the total electrical load with the sub-load conditions. This process was implemented through the R package "arules" [47]. The resulting rules were then sorted and filtered by setting appropriate thresholds on interest measures such as support, confidence, and lift (Step-4 in Figure 6). Filtered rules were then stored within an anomaly library, where they were ranked to show which sub-load condition (for example, a high electrical load or a significant uptrend) was responsible for the anomalous total electrical load behaviour. The tool gives critical insight into the historical energy behaviour and, when implemented in real-time load analysis, can provide useful feedback on which energy management actions are needed.
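The transformation of the categorical data frame into a transactional database (Step-3) can be sketched as follows. This is a minimal Python illustration rather than the "arules" workflow used in the study; the function name `to_transactions`, the `trend_Canteen` column, and the row values are assumptions made for the example.

```python
def to_transactions(rows):
    """Convert categorical rows (dicts) into sets of 'column=value' items."""
    return [{f"{column}={value}" for column, value in row.items()} for row in rows]


# Hypothetical rows: one per anomalous time window, with the total load symbol
# and the encoded features (symbol and trend) of one sub-load.
rows = [
    {"sym_Total_Power": "c", "sym_Canteen": "c", "trend_Canteen": "Up"},
    {"sym_Total_Power": "c", "sym_Canteen": "b", "trend_Canteen": "Stable"},
]
transactions = to_transactions(rows)
```

Each transaction is then a set of items on which rules with sub-load items in the LHS and the total load symbol in the RHS can be mined.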


**Figure 5.** Interpretation of anomaly detection results.

**Figure 6.** Sub-meter level diagnosis methodology description.

#### **5. Results**

The previously described methodology was applied to the case study presented in Section 3. The quantitative analysis of data was performed through the statistical software R [48], and results related to each stage are reported in the following sections.

#### *5.1. Pre-Processing*

The pre-processing phase handled missing values and removed outliers. The procedure was applied to the total electrical load and sub-loads dataset.

In particular, point outliers due to data transmission problems were detected, removed, and replaced through linear interpolation. The carpet plots of the total electrical load of substation C are reported in Figure 7a (one for each year considered). It can be seen that the building energy systems were usually turned on at 6:00 and turned off at 19:00. The electrical load increased from the night baseload until 8:00, when teaching and office activities began, and started decreasing after 16:00. This pattern was visible for every working day (from Monday to Friday), with an average electrical load (from 8:00 to 16:00) of more than 300 kW. During the weekend, on the other hand, the average electrical load decreased significantly to 100 kW, mainly due to the weekly university break and the absence of teaching and office activities. The same carpet plot representation is reported in Figure 7 for some representative sub-loads. Figure 7b shows the electrical load of the mechanical room in the years from 2015 to 2019. Because of the intensive use of the chillers in summer, the highest monthly average electrical load was reached in July, with a value of about 100 kW. During the winter months, the electrical load was not zero because of the electrical demand of the recirculation pumps. Figure 7c shows the electrical load of the campus canteen. This load is also strongly dependent on the weekly university occupancy schedule. In fact, a significant decrease in the average electrical load was visible during weekends, when the campus was unoccupied and no teaching or office activity took place.
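The replacement step can be illustrated with a short Python sketch. It assumes that missing values and removed outliers have already been marked as `None`; how the outliers were flagged is not detailed here, and the function name `interpolate_gaps` is hypothetical.

```python
def interpolate_gaps(series):
    """Fill None entries by linear interpolation between the nearest valid neighbours.

    Leading and trailing gaps have only one neighbour and are left as None.
    """
    filled = list(series)
    valid = [i for i, value in enumerate(series) if value is not None]
    for left, right in zip(valid[:-1], valid[1:]):
        step = (series[right] - series[left]) / (right - left)
        for i in range(left + 1, right):
            filled[i] = series[left] + step * (i - left)
    return filled


# Hypothetical 15 min readings in kW with removed points marked as None.
readings = [100, None, None, 130, None, 150]
cleaned = interpolate_gaps(readings)
```

For the hypothetical readings above, the gaps are filled with 110, 120, and 140 kW.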


**Figure 7.** Carpet plot of the (**a**) total electrical load (substation C) (**b**) mechanical room (**c**) canteen.

#### *5.2. Time Series Abstraction*

In order to perform the data transformation and dimensionality reduction, the original time series of the electrical load was split into 24 h intervals since a daily periodical pattern was observed.

The time windows of daily load profiles were evaluated through an RT, considering the total electrical load as a numerical target and the hour of the day as a predictive attribute. The total electrical load from 2015 to 2019 was analyzed.

Holidays and weekends were excluded from the analysis since they usually present profiles that are flat or have low variance, and including those days in the model would have reduced the accuracy of the results. The splitting criterion adopted was based on the variance reduction around the numerical target's mean in each leaf node. In this way, the daily pattern was split into homogeneous consumption time windows. As a stopping criterion, a minimum number of objects in the child nodes at each split was set in order to have a time window length of at least 2 h.
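The splitting logic can be sketched in pure Python as a one-dimensional regression tree grown by variance reduction with a minimum leaf size. This is a simplified illustration under stated assumptions: the function names, the `min_gain` stopping threshold, and the toy hourly profile are hypothetical, and cost-complexity pruning is not included.

```python
def sse(values):
    """Sum of squared deviations from the mean (n times the variance)."""
    if not values:
        return 0.0
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values)


def window_boundaries(samples, min_leaf, min_gain=1e-9):
    """Recursively split time-sorted (time, load) samples by variance reduction.

    Returns the sorted list of times at which a new window starts.
    """
    loads = [load for _, load in samples]
    parent = sse(loads)
    best_i, best_cost = None, parent
    for i in range(min_leaf, len(samples) - min_leaf + 1):
        if samples[i - 1][0] == samples[i][0]:
            continue  # never split between samples that share the same time
        cost = sse(loads[:i]) + sse(loads[i:])
        if cost < best_cost:
            best_i, best_cost = i, cost
    if best_i is None or parent - best_cost <= min_gain:
        return []  # no admissible split reduces the variance
    left = window_boundaries(samples[:best_i], min_leaf, min_gain)
    right = window_boundaries(samples[best_i:], min_leaf, min_gain)
    return left + [samples[best_i][0]] + right


# Hypothetical hourly profile in kW (not the campus data): night, ramp, peak, ramp, night.
profile = [100] * 6 + [200] * 2 + [300] * 9 + [200] * 2 + [100] * 5
samples = list(enumerate(profile))
starts = window_boundaries(samples, min_leaf=2)
```

On this idealized profile the sketch returns window starts at hours 6, 8, 17, and 19, that is, five sub-daily windows; real data would require pruning to avoid over-segmentation.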

The RT automatically identified the optimal number of windows thanks to a cost complexity pruning process. This procedure allowed us to choose the best tree by generating a fully expanded tree and then pruning it iteratively. According to [17], this procedure enables the identification of an optimal trade-off between misclassification error and model complexity. The selection of the optimal tree size was performed according to the one standard error rule (i.e., the 1-SE rule) [49].

The resulting tree had five leaves, which corresponded to the five sub-daily time windows summarised in Table 1. It can be seen that the first and fifth time windows corresponded to the night hours, during which the university was closed and unoccupied. Conversely, the remaining time windows corresponded to the occupied hours of the campus.


**Table 1.** Sub-daily time windows for total electrical load.

Once the time windows were identified, the PAA was performed in order to prepare the dataset for the encoding through the aSAX process.
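As a brief illustration, PAA over non-uniform time windows reduces each window to the mean of the samples it contains. The Python sketch below uses a hypothetical function name and toy values; boundaries are given as sample indices.

```python
def paa(values, boundaries):
    """Mean of `values` over consecutive segments [b0, b1), [b1, b2), ..."""
    segments = []
    for start, end in zip(boundaries[:-1], boundaries[1:]):
        chunk = values[start:end]
        segments.append(sum(chunk) / len(chunk))
    return segments


# Toy series split into two windows of different length.
reduced = paa([1, 3, 2, 4, 6, 8], boundaries=[0, 2, 6])
```

Here the six samples are reduced to two values, 2.0 and 5.0, one per window.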

A fundamental parameter to be set in the aSAX process is the alphabet size (α), which determines how many symbols are used for the encoding and, as a consequence, the number of breakpoints to search for. While in the literature the alphabet size is usually selected according to domain expertise [17,20,31], in this framework an unsupervised technique based on k-means partitive clustering was used. In particular, the reduced data of the time series (through PAA) were clustered in order to find homogeneous groups and determine the optimal number of breakpoints. For this purpose, during the clustering process, several cluster quality indices, embedded in the R package NbClust [46], were calculated in order to assess the optimal number of clusters (k) according to a majority rule approach, setting a search space between k = 3 and k = 8. The results obtained suggested the partition with k = 6 as the optimal one, so the alphabet size was also set to 6.

In detail, the positions of the breakpoints, calculated under the equal-probability assumption, were used to initialize the aSAX iterative algorithm [38]. As shown in Figure 8, those breakpoints (dotted lines) were not able to divide the distributions effectively, producing narrow intervals at low values and wider intervals for higher values of the reduced PAA time series. The final adaptive breakpoints (solid lines) were evaluated once a tolerance of 10<sup>−10</sup> on the representation error was reached (after about 60 iterations).
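The adaptive refinement can be sketched as follows, assuming the Lloyd-style (one-dimensional k-means) update commonly associated with aSAX: each interval is represented by the mean of its values, and each breakpoint is moved to the midpoint between adjacent representatives until the relative change of the representation error falls below the tolerance. The exact procedure in [38] may differ; the function name, the toy values, and the initial breakpoints are hypothetical, and the initial breakpoints are assumed to lie within the data range.

```python
from bisect import bisect_right


def adaptive_breakpoints(values, init_breakpoints, tol=1e-10, max_iter=1000):
    """Refine breakpoints with Lloyd-style iterations on one-dimensional data."""
    breakpoints = sorted(init_breakpoints)
    k = len(breakpoints) + 1
    # Fallback representatives: midpoints of the initial intervals.
    edges = [min(values)] + breakpoints + [max(values)]
    centroids = [(edges[i] + edges[i + 1]) / 2 for i in range(k)]
    prev_error = None
    for _ in range(max_iter):
        # Assign every value to the interval delimited by the current breakpoints.
        groups = [[] for _ in range(k)]
        for v in values:
            groups[bisect_right(breakpoints, v)].append(v)
        # Representative of each interval: its mean (previous one if the interval is empty).
        centroids = [sum(g) / len(g) if g else c for g, c in zip(groups, centroids)]
        error = sum((v - c) ** 2 for g, c in zip(groups, centroids) for v in g)
        # New breakpoints: midpoints between adjacent representatives.
        breakpoints = [(centroids[i] + centroids[i + 1]) / 2 for i in range(k - 1)]
        if prev_error is not None and abs(prev_error - error) <= tol * max(prev_error, 1e-300):
            break
        prev_error = error
    return breakpoints


# Toy PAA values forming three groups, with deliberately poor initial breakpoints.
values = [1, 2, 3, 10, 11, 12, 20, 21, 22]
final = adaptive_breakpoints(values, init_breakpoints=[2.5, 11.5])
```

On this toy input the breakpoints move from 2.5 and 11.5 to 6.5 and 16.0, which separate the three groups of values.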

**Figure 8.** Step by step identification of adaptive breakpoints through the aSAX algorithm applied to the total electrical load.

Figure 9 shows the carpet plot and histograms of the encoded total electrical load time series. In particular, Figure 9a shows that in the first and fifth time windows the most frequent symbols were "a" and "b", which corresponded to a low electrical load during night hours. In the second and fourth time windows, corresponding to early morning and late afternoon, there was a prevalence of medium electrical load, identified with the symbol "d" and describing the switch-on/off of the systems. In the third time window, the symbols "e" and "f" were the most frequent, since the electrical load in the middle of the day is the highest.


**Figure 9.** aSAX representation of the total electrical load: (**a**) carpet plots (**b**) histogram distributions of symbols along the time windows and years.


Figure 9b shows the histograms of electrical load symbols divided by time windows and years. This representation makes evident how the load patterns changed during the years from 2015 to 2019. In particular, in the first and fifth time windows, a change of pattern from the symbol "b" to the symbol "a" was visible, due to a lower baseload during night hours, when the campus was unoccupied. This behavior could be related to the refurbishment of buildings and/or systems served by substation C. The same trend was seen in the third time window, where a change of pattern from the symbol "f" to the symbol "e" was visible, resulting in a lower electrical load during peak hours. This behavior suggests that the energy performance of the campus was improving over time. Further considerations about changes in the load patterns of the campus are made in the following section, where the selection of a proper training period for the classification models is discussed.

#### *5.3. Anomaly Detection at Total Electrical Load Level*

For each time window, a globally optimal evolutionary tree was developed in order to further investigate the dependency of the total electrical load (i.e., target variable) on the boundary conditions (i.e., predictive variables).

To create a model that automatically learns new patterns as the building energy consumption changes, a training period consistent with the recent past was selected. In fact, as previously discussed, older patterns of energy consumption strongly differed from more recent ones, and including them in the training set could have compromised the accuracy of the models on the validation set. Therefore, the classification models were trained and tested on 2018 data and, to simulate an online deployment of the process, were validated on the first month of 2019. In particular, the 2018 dataset was split through a random sampling process, with 80% placed into the train set and 20% into the test set.
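A reproducible 80/20 random split can be sketched in Python as follows. The function name, the seed value, and the number of records are hypothetical; the sketch only illustrates random sampling with a fixed seed and does not reproduce the R code of the study.

```python
import random


def train_test_split(records, train_fraction=0.8, seed=42):
    """Randomly split records into train and test lists in a reproducible way."""
    rng = random.Random(seed)  # local generator: the global random state is untouched
    shuffled = list(records)
    rng.shuffle(shuffled)
    cut = round(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


# Hypothetical record indices for one year of daily observations in a time window.
days = list(range(250))
train, test = train_test_split(days)
```

Calling the function again with the same seed returns the same partition, which keeps the analysis replicable.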

The attributes considered in the evolutionary classification trees are listed in the following:


The choice to use some sub-loads and not others as predictive variables was driven by a sensitivity analysis and by their percentage weight on the total electrical load. The canteen and the mechanical room accounted for 16.03% and 12.22%, respectively, of the total electrical load (Figure 3). Moreover, among the labeled sub-loads, they showed the highest variance in 2018, as well as significant variations during the day. They are therefore well suited to characterizing the relationships between the normal operation of substation C and their electrical demand.

For all the time windows, the maximum depth of the classification tree was set to 6, the minimum number of observations in each node was set to 20, and the default setting c20m40sp40 for variation operators was assumed (20% crossover, 40% mutation, and 40% split/prune).

Since the evolutionary algorithm and the splitting process were randomly initialized, the seed for the random number generator was set in the code in order to replicate the analysis easily.

Figure 10 shows the tree resulting from the training phase for the second time window. In each leaf node, the tree effectively separated the most frequent symbol from the others while maintaining a readable and understandable structure. The developed set of evolutionary trees (one for each time window) was aimed at extracting very accurate decision rules, so that a highly occurring symbol can be found in each leaf node. If this condition is satisfied, the low-occurring symbols can be considered as potential anomalies for the considered time window. Those potential anomalies could then be subject to further investigation in order to understand which sub-load can be assumed to be the cause of that infrequent behavior (anomaly diagnosis).
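The detection logic at leaf level can be illustrated with a short Python sketch: the predicted symbol is the most frequent one among the training observations of the leaf, and an observed symbol is flagged when it differs from it, distinguishing higher from lower loads. The function name, the returned labels, and the leaf composition are hypothetical; symbols are assumed to be single letters ordered alphabetically from low to high load.

```python
from collections import Counter


def check_symbol(leaf_symbols, observed):
    """Compare an observed symbol with the most frequent symbol of its leaf node.

    Returns (predicted, status) with status in {"normal", "higher", "lower"}.
    """
    predicted = Counter(leaf_symbols).most_common(1)[0][0]
    if observed == predicted:
        return predicted, "normal"
    # Single-letter symbols compare alphabetically, i.e., by load level.
    return predicted, "higher" if observed > predicted else "lower"


# Hypothetical leaf node: mostly "b", with a few infrequent symbols.
leaf = ["b"] * 18 + ["c"] * 2 + ["a"]
result = check_symbol(leaf, "c")
```

In this example the leaf predicts "b", so an observed "c" is flagged as a higher-than-expected load and would be passed on to the diagnosis phase.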

**Figure 10.** Globally optimum classification tree for the second time window (06:30–08:59).

Decision rules extracted from each tree (one for each time window) are reported in Table 2. It can be observed that the input variables used for the classification tree were able to explain the occurrence of each symbol with strong accuracy. Furthermore, it can be noticed that time window one was not associated with any decision rule. This window was found to be characterized by a very high occurrence (over 97% over the training period) of a single symbol. In this case, the available input variables were not able to further characterize the occurrence of other symbols.


The model performance for each time window is shown in Table 3. The table also reports the overall accuracy in training, testing, and validation, which was 88.91%, 86.22%, and 89.03%, respectively. The results obtained suggest high generalizability for the classification models and the absence of overfitting issues.


**Table 2.** Decision rules extracted from globally optimal trees created in each time window on the training period.

**Table 3.** Accuracy results from a comparison between test and validation.


#### *5.4. Diagnosis at Sub-Load Level*

Once the classification models were created, the subset of anomalous symbols (higher than expected symbols) included in each node was transformed into a transactional database that contains the categorical target variable (total electrical load symbol) and some additional explanatory variables related to the sub-loads.

To extract those additional categorical variables, the sub-loads were subjected to the same time series abstraction process described for the total electrical load in Section 5.2. Using the same time window discretization as the total electrical load and the same alphabet size (α = 6), each time series of the available sub-loads was encoded through the aSAX process.

In order to further enrich information about sub-loads, the trend angle was also extracted and encoded (see Figure 11).

**Figure 11.** Results of trend angle aSAX encoding applied to the rectory sub-load: (**a**) identification of adaptive breakpoints through the aSAX algorithm, (**b**) encoded trend angle carpet plot for 2018 and 2019.

This feature tracks the trend of the time series in each time window, making it possible to know whether the load is increasing, decreasing, or stable. In this case, the alphabet size was set to three (α = 3) in order to reflect those three possible trends (respectively encoded as Up, Down, and Stable). The initial breakpoints, calculated under the equal-probability assumption, were used to initialize the aSAX iterative algorithm, and the final adaptive breakpoints were evaluated once a tolerance of 10<sup>−10</sup> on the representation error was reached. Then the Apriori ARM algorithm was applied to the transactional database, structured as depicted in Figure 12.
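How the trend angle is computed is not detailed in this section. As one possible illustration, the Python sketch below takes the angle of the least-squares line fitted to the samples of a time window and encodes it with two breakpoints. The function names and the breakpoint values are assumptions; in the methodology the breakpoints result from the aSAX iteration. Note that the angle depends on the units chosen for load and time.

```python
import math


def trend_angle(values):
    """Angle in degrees of the least-squares line through equally spaced values."""
    n = len(values)  # at least two samples are required
    x_mean = (n - 1) / 2
    y_mean = sum(values) / n
    num = sum((i - x_mean) * (v - y_mean) for i, v in enumerate(values))
    den = sum((i - x_mean) ** 2 for i in range(n))
    return math.degrees(math.atan(num / den))


def encode_trend(angle, low, high):
    """Map an angle to Down / Stable / Up using two breakpoints (low < high)."""
    if angle < low:
        return "Down"
    if angle > high:
        return "Up"
    return "Stable"


# Hypothetical window samples and breakpoints (degrees).
rising = encode_trend(trend_angle([0, 1, 2, 3]), low=-5, high=5)
flat = encode_trend(trend_angle([5, 5, 5]), low=-5, high=5)
```

A window whose samples grow by one unit per step has a slope of 1, hence an angle of about 45 degrees, and is encoded as Up; a constant window is encoded as Stable.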


**Figure 12.** Representation of the transactional databases used for the extraction of association rules. LHS: left-hand side of the rule; RHS: right-hand side of the rule.

In particular, the RHS was the anomalous total electrical load symbol extracted from the leaf node of the classification tree for a specific time window, while the LHS was composed of all possible combinations of electrical load symbols and trend angle symbols of sub-loads. The minimum and maximum number of items per rule were set in order to obtain rules with one or at most two items in the LHS. The minimum support to mine rules was set to 0.005, and the minimum confidence to 0.005. Redundant rules, i.e., rules that are equally or less predictive than a more general rule with the same confidence [50], were removed, and the remaining ones were represented in a scatter plot (Figure 13). The scatter plot helps the analyst to understand how the interesting rules were selected by requiring *lift*(A ⇒ B) > 1 and *conf*(A ⇒ B) > 0.5. Those rules were then stored in the anomaly library, where they were ranked according to the lift value. The LHS of those rules represents the sub-load conditions that were found to significantly influence the abnormal total electrical load.
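The post-processing of mined rules (redundancy removal, filtering, and ranking) can be sketched in Python as follows. This is an illustration and not the "arules" implementation used in the study: the redundancy test assumes that a rule is redundant when a more general rule (proper subset of the LHS, same RHS) has an equal or higher confidence, and the function names and toy rules are hypothetical.

```python
def is_redundant(rule, rules):
    """True if a more general rule with the same RHS is at least as confident."""
    return any(
        other["rhs"] == rule["rhs"]
        and other["lhs"] < rule["lhs"]  # proper subset of the LHS
        and other["confidence"] >= rule["confidence"]
        for other in rules
    )


def build_anomaly_library(rules, min_lift=1.0, min_conf=0.5):
    """Drop redundant rules, keep lift > min_lift and conf > min_conf, rank by lift."""
    kept = [r for r in rules if not is_redundant(r, rules)]
    kept = [r for r in kept if r["lift"] > min_lift and r["confidence"] > min_conf]
    return sorted(kept, key=lambda r: r["lift"], reverse=True)


# Toy rules with hypothetical measures (not the values obtained in the study).
rules = [
    {"lhs": frozenset({"mech=d"}), "rhs": "total=c", "confidence": 0.6, "lift": 3.0},
    {"lhs": frozenset({"mech=d", "canteen=c"}), "rhs": "total=c", "confidence": 1.0, "lift": 5.0},
    {"lhs": frozenset({"mech=d", "rectory=d"}), "rhs": "total=c", "confidence": 0.6, "lift": 3.0},
    {"lhs": frozenset({"canteen=b"}), "rhs": "total=c", "confidence": 0.2, "lift": 0.8},
]
library = build_anomaly_library(rules)
```

In this toy case the third rule is discarded as redundant with respect to the first one, the fourth fails both thresholds, and the library contains the two remaining rules ranked by decreasing lift.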


**Figure 13.** Diagnosis procedure of extracting, filtering, and selecting only relevant association rules from node five of the second time window.

An example of the procedure is shown in Figure 13 for node five of the second time window. In this node, the most frequent symbol was "b", and the only infrequent symbol of interest (higher electrical load) was "c". The transactional database was then constructed: the LHS was composed of the additional categorical variables related to sub-loads (electrical load symbol and trend angle symbol), while the RHS contained only the anomalous total electrical load symbol (symbol "c"). ARM automatically extracted 338 rules, of which 180 were redundant and 158 were retained. After filtering, only 19 rules were stored in the anomaly library. In this particular case, the most frequent items in the anomaly library were: mechanical room symbol "d"; canteen symbol "c"; rectory symbol "d". For example, among the 19 rules considered, rule four (IF sym\_Mechanical\_room = "d" AND sym\_Canteen = "c" ⇒ sym\_Total\_Power = "c") had a lift value of about five and a confidence of 100%. This means that if this rule is matched during the operation of the ADD process, the diagnosis is extremely robust, given that the anomaly detected was already present in the analyzed historical database.

#### *5.5. Deployment of the ADD Tool*

The methodology was conceived to be implemented in a real-time data acquisition tool connected to a smart metering infrastructure. The metering infrastructure continuously collects data and, once a time window ends, the symbol of the total electrical load is calculated through aSAX and compared to the one predicted by the globally optimal tree. Three possible cases could then occur:



In the latter case, the diagnosis analysis is enabled. Given the boundary conditions, the corresponding leaf node of the evolutionary tree is identified, and the tool automatically retrieves the library of association rules extracted from the historical dataset for that specific anomaly condition (i.e., a specific symbol of the total electrical load). The following step is to extract the additional features from the sub-loads and encode them as symbols/categorical values. Once all the potential LHS items have been computed, the rules included in the anomaly library are scanned to detect any perfect match. If a perfect match of a rule exists, a full diagnosis of the anomaly can be performed, considering that the same anomaly condition (i.e., the relation between anomalous total load and sub-loads) was present in the historical dataset. Otherwise, if a perfect match does not exist, a partial match on a single item is searched for. In the case of a partial match, the diagnostic capability is not as strong as for the perfect rule match. However, useful insight can be obtained about new possible configurations of sub-loads that could be included in the anomaly library during future updates. In order to make the whole ADD process flexible in learning new patterns, a full retraining of the classification models and anomaly library is supposed to be performed every month, considering a historical dataset of one year.
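The scan of the anomaly library can be sketched in Python as follows. The sketch assumes that a perfect match means that every LHS item of a rule is among the observed sub-load items, and that a partial match means that at least one single LHS item is observed; the function name, the returned labels, and the toy library are hypothetical.

```python
def diagnose(library, observed_items):
    """Look for perfect rule matches first, then for partial single-item matches."""
    observed = set(observed_items)
    perfect = [rule for rule in library if rule["lhs"] <= observed]
    if perfect:
        return "perfect", perfect
    partial = sorted({item for rule in library for item in rule["lhs"] if item in observed})
    if partial:
        return "partial", partial
    return "none", []


# Toy library with a single rule and hypothetical observed sub-load conditions.
library = [
    {"lhs": frozenset({"sym_Mechanical_room=d", "sym_Canteen=c"}), "lift": 5.0},
]
status, matches = diagnose(library, {"sym_Mechanical_room=d", "sym_Canteen=b"})
```

Here only one of the two LHS items is observed, so the outcome is a partial match on the mechanical room item.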

The deployment of the methodology was performed on the validation set, which consisted of the data for January 2019. The detection process through the evolutionary trees was performed on all time windows. For reference, the classification performance achieved in the second time window over the whole month is considered here. The confusion matrix of the classifier is reported in Figure 14. It can be seen that the classification tree achieved an accuracy of 93.55%, and that only once did the actual symbol differ from the predicted one in a way that revealed a higher electrical load than expected ("c" instead of "b"). This anomaly occurred on 4 January 2019.

**Figure 14.** Confusion matrix for the globally optimal classification tree predicting January 2019 total electrical load symbol in the 2nd time window. Red square: the detected anomalous behavior.

Once the day and the time window of the anomaly were identified, the corresponding leaf node of the tree was identified as well. In the example, the anomalous symbol of the total electrical load of 4 January 2019 was detected in leaf node five of the tree. The diagnosis process was then enabled, and the sub-load conditions were compared with the anomaly reference library. In the considered example, there was not a perfect rule match but a partial one on the following items:



As previously discussed, a partial match is not as strong as a perfect rule match but provides useful suggestions to be considered when conducting the anomaly diagnosis. This aspect was demonstrated through further graphical analysis, reported in Figure 15. The figure shows a comparison between the anomalous and normal patterns of the total electrical load and of the loads related to the mechanical room, printshop, and canteen. Only the second time window is reported in the plot. In particular, the anomalous data of 4 January 2019 are reported in red, while the frequent "normal" patterns of the given loads, extracted from the training period (part of 2018), are shown in green. Along with the actual electrical loads (solid lines), the plot reports the corresponding PAA segments (dashed lines) and, for the "normal" pattern, the standard deviation (grey areas).

**Figure 15.** Comparison between the actual (red lines) and expected (green lines) electrical load with the relative standard deviation (grey areas) on 4 January 2019. The dashed red and green lines represent the PAA segments for the actual and expected load, respectively. The blue horizontal solid line on the top graph represents the aSAX breakpoint related to the total power.

The combined effect of the three sub-loads (i.e., mechanical room, printshop, canteen) led to an overall electrical load higher than expected. The mean total electrical load rose from 236 kW (symbol "b") to 283 kW (symbol "c"), and it was easy to verify that the identified sub-loads contributed almost 90% to the power shift upward of the total electrical load. It is worth noting that although the printshop presented an anomalous electrical load pattern, the observed profile (red line) did not significantly deviate from the normal one (green line).

#### **6. Discussion of the Results**

This paper focused on the development of an ADD methodology able to analyze meter-level electrical load data in order to detect anomalous patterns and perform a diagnosis process on sub-loads. This methodological framework was conceived to be highly scalable and reliable, so that it can be implemented in an energy data monitoring infrastructure to support the prompt detection of anomalies and avoid energy waste over time.

The time window size and the alphabet size for the aSAX encoding are key parameters. An interesting sensitivity analysis based on these two parameters is reported in [20], showing that a trade-off between the number of windows and the alphabet size has to be found in order to minimize the variance between patterns while preserving the resolution needed. In this paper, the number of time windows was chosen by using an RT and the alphabet size by a k-means clustering evaluation. Once those parameters are set, the aSAX encoding procedure can be considered completely automatic. Moreover, the analysis showed that, by considering the trend angle as an additional feature, a robust characterization of the sub-loads could be performed without adding computational burden.

Moreover, the selection of the predictive variables for the globally optimal classification tree needs particular attention. The overall energy consumption of a building is strongly related to the occupancy schedule, environmental conditions, thermo-physical features of the building, and the behavior of users. For this reason, all of those variables should ideally be included in the classification model, since they could help describe infrequent but non-anomalous patterns. On the other hand, trustworthy values for them are difficult to retrieve or to measure continuously. Where available, their inclusion could improve the quality of the model predictions.

A further interesting aspect to be considered is related to the data that should be used for training and how often training is needed. It is well known that building electrical load varies over the years due to the electrification of end-uses and the pursuit of higher performance of appliances and facilities. For this reason, a good trade-off between retraining rate and computational effort should be found. In our study, we validated the model on the first month of 2019 in order to assess its accuracy.

In addition, in order to prove the effectiveness of monthly retraining of the tool, a comparison was performed between two different deployment approaches. The first deployment considered was static, with the hypothesis of using the same classification models trained in 2018 for six months in 2019. The second deployment was dynamic, considering monthly retraining of the classification models with a one-year moving window training set. Results showed that the average classification accuracy was 82.85% for the dynamic deployment and was 78.77% for the static one. Therefore, with a dynamic deployment, the anomaly detection capabilities improved, given that the classifiers are able to learn new patterns that change over time. Following the same reasoning, the authors propose to implement a monthly update also for the association rules included in the anomaly library.
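The dynamic deployment can be sketched as a schedule of training periods, one per deployment month, each covering the twelve months that precede it. The Python sketch below only generates the date ranges; the function names are hypothetical, and each training period is represented as a half-open interval (start included, end excluded).

```python
from datetime import date


def add_months(day, months):
    """First day of the month that lies `months` months after (or before) `day`."""
    index = day.year * 12 + (day.month - 1) + months
    return date(index // 12, index % 12 + 1, 1)


def retraining_schedule(first_deploy_month, n_months, window_months=12):
    """Return (train_start, train_end) pairs; train_end is also the deployment month."""
    schedule = []
    for i in range(n_months):
        deploy = add_months(first_deploy_month, i)
        schedule.append((add_months(deploy, -window_months), deploy))
    return schedule


# Six monthly deployments starting in January 2019, each trained on the previous year.
schedule = retraining_schedule(date(2019, 1, 1), n_months=6)
```

The first pair covers 2018 in full and is deployed in January 2019, while the last one covers June 2018 to May 2019 and is deployed in June 2019. A static deployment corresponds to reusing the first pair for all six months.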

#### **7. Conclusions and Future Work**

This paper proposed a multiple-step ADD methodology to automatically detect at whole-building meter level anomalous energy consumption and then perform a diagnosis on the sub-loads responsible for that anomalous pattern. Frequent and infrequent electrical load patterns, properly transformed through an adaptive symbolic aggregate approximation process, were discovered by means of globally optimum evolutionary classification trees. Association rule mining was employed to discover the main sub-loads, which mostly affected the anomaly detected at the whole-building level.

In the future, the ADD process presented in this paper is expected to be implemented online within the energy information system of Politecnico di Torino and supplied through an energy data analytics dashboard developed with the R packages "shiny" [51] and "shinydashboard" [52]. Figure 16 reports a demo of the dashboard that is currently under construction and under offline testing. Moreover, the authors aim to integrate this ADD process together with other complementary tools able to perform electrical load forecasting and energy performance tracking (i.e., benchmarking).

**Figure 16.** Energy data analytics dashboard developed by the building automation and energy data analytics (BAEDA) Lab, which implements the anomaly detection and diagnosis (ADD) procedure presented in this paper.

Further research will also be focused on the testing of alternative configurations of algorithms (i.e., data clustering, forecasting) with respect to the one considered in this study. In fact, the proposed algorithms cannot always be assumed to be the best solution for performing this kind of analysis on energy consumption time series. As a reference, the aSAX transformation, the development of classification trees, and the extraction of association rules match the need to provide a fully interpretable tool to the final user. However, this constraint can, in some cases, also lead to information loss and a decrease in accuracy. For this reason, a future analysis may well consider the use of more sophisticated algorithms (e.g., deep learning algorithms) that are non-interpretable by nature but make it possible to achieve higher performance in detecting and diagnosing energy anomalies. This option remains valuable if an explanation layer is included in the analytical process. Nowadays, such a task corresponds to the main goal of the machine learning field known as explainable artificial intelligence (XAI), which offers new opportunities for effectively embedding advanced algorithms in AI-based energy management solutions where explanations of the black-box model predictions are often compulsory.

**Author Contributions:** Conceptualisation, M.S.P. and A.C.; Data curation, M.S.P.; Formal analysis, R.C.; Investigation, M.S.P.; Methodology, M.S.P., R.C. and A.C.; Project administration, A.C.; Software, R.C.; Supervision, A.C.; Validation, M.S.P. and A.C.; Writing – original draft, R.C.; Writing – review and editing, M.S.P. and A.C. All authors have read and agreed to the published version of the manuscript.

**Funding:** This research received no external funding.

**Institutional Review Board Statement:** Not applicable.

**Informed Consent Statement:** Not applicable.

**Data Availability Statement:** No new data were created or analyzed in this study. Data sharing is not applicable to this article.

**Acknowledgments:** The authors express their gratitude to the Living Lab of Politecnico di Torino for providing data and to Giovanni Carioni for the support in data preparation and collection.

**Conflicts of Interest:** The authors declare no conflict of interest.

#### **References**


## *Article* **The Use of Energy Models in Local Heating Transition Decision Making: Insights from Ten Municipalities in The Netherlands**

**Birgit A. Henrich <sup>1,2</sup>, Thomas Hoppe <sup>1,</sup>\*, Devin Diran <sup>2</sup> and Zofia Lukszo <sup>1</sup>**


**Abstract:** In 2018, the Dutch national government announced its decision to end natural gas extraction. This decision posed a challenge for local governments (municipalities); they have to organise a heat supply that is natural gas-free. Energy models can decrease the complexity of this challenge, but several obstacles hinder their effective use in decision-making. The main research question of this paper is: What are the perceived advantages and limitations of energy models used by municipalities within their data-driven decision-making process concerning the natural-gas free heating transition? To answer this question, literature on energy models, data-driven policy design and modelling practices was reviewed, and based on this, nine propositions were formulated. The propositions were tested by reflecting on data from case studies of ten municipalities, including 21 expert interviews. Results show that all municipalities investigated use or are planning to use modelling studies to develop planning documents of their own, and that more than half of the municipalities use modelling studies at some point in their local heating projects. Perceived advantages of using energy models were that the modelling process provides perspective for action, financial and socio-economic insights, transparency and legitimacy and means to start useful discussions. Perceived limitations include that models and modelling results were considered too abstract for analysis of local circumstances, not user-friendly and highly complex. All municipalities using modelling studies were found to hire external expertise, indicating that the knowledge and skill level that municipal officials have is insufficient to model independently.

**Keywords:** energy modelling; heating transition; modelling practices; data-driven policy design; local policy; municipality; multi-model ecologies

#### **1. Introduction**

#### *1.1. The Dutch Heating Transition*

In 2016, the heating and cooling sector accounted for half of the EU's energy consumption [1]. In The Netherlands, 53% of the national heat supply is provided by natural gas [2]. In March 2018, the Dutch national government announced its decision to end natural gas extraction from the Groningen gas field by 2030 [1] to help reach the climate goals of the Paris Agreement and to reduce the negative impact of natural gas extraction in the province of Groningen [2]. This is referred to as the 'heating transition' in The Netherlands and was later defined by the RVO (The Netherlands Enterprise Agency) as removing natural gas from industry, the built environment and the agricultural sector [2], and replacing it with (sustainable) heating alternatives. According to the Climate Agreement, the main climate policy program in The Netherlands, a sufficient level of sustainable heating must be made available to replace the natural gas supply and to meet the climate change mitigation target of reducing CO<sub>2</sub> emissions by 3.4 megatons in the built environment. To reach this goal, 1.5 million existing residential homes have to be supplied with sustainable heating by 2030 [3].

**Citation:** Henrich, B.A.; Hoppe, T.; Diran, D.; Lukszo, Z. The Use of Energy Models in Local Heating Transition Decision Making: Insights from Ten Municipalities in The Netherlands. *Energies* **2021**, *14*, 423. https://doi.org/10.3390/en14020423

Received: 14 December 2020 Accepted: 12 January 2021 Published: 14 January 2021

**Publisher's Note:** MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

**Copyright:** © 2021 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

However, this is challenging because decision-making and policymaking in this transition are far from simple, as actors, technology and institutions interact in a complex manner [1]. The heating transition requires a change of the supply of renewable energy, the infrastructure, residential heating systems and of thermal insulation in residential houses, which all raise questions about the division of costs and the freedom of choice [4]. Next to these dependencies, the heating transition poses significant financial challenges. Natural gas is currently cheaper than sustainable alternatives and residents do not always have sufficient funds available to provide the needed investments or to deal with increased living expenses [5].

To organise this complex transition, every municipality is expected to formulate a "Transition Vision Heat" (See Table A1, Appendix A, Glossary) and an implementation plan in their local government plans, to show how they will organise a heat supply that is natural gas-free and affordable, according to the Environment and Planning Act. This means that municipalities are expected (by the national government) to take a leading role in the heating transition. This is new for municipalities and requires them to collect new knowledge, expertise and competences. To this end, the national government has set up Test Beds for Natural Gas-Free districts (i.e., pilot projects) and a knowledge and learning programme to learn and experiment [3] within the National Programme for Natural Gas-Free Districts. The latter has a 120 million euro budget.

#### *1.2. The Use of Energy Models in Data-Driven Policymaking*

To enable the heating transition, municipalities need to answer questions such as: which heating source would lead to low end-user costs, low societal costs and low CO<sub>2</sub> emissions? To evaluate the effect and impact of potential policy measures or decisions on, for example, a preferred technology for natural gas-free heating in city districts, evidence-based policymaking entails the derivation of fact-based knowledge to support the decision making by policymakers. One way to approach evidence-based policymaking is with data-driven policies. A data-driven policy uses data and tools for processing and analysing data to design policies and to facilitate collaboration with citizens to co-create [6]. Currently, municipalities make limited use of data and data processing and analysis tools for decision-making support. This is partly due to a lack of guidelines. New guidelines are to be developed that can make use of new data sources and tools [6]. Historically, the first decision-making support tool developed for environmental planning was the multi-criteria decision aid (MCDA). The MCDA is considered a qualitative decision support tool [7]. One drawback of MCDA tools is that they do not allow for analysis to compare whether doing an action is better than doing nothing [8]. In recent years, the number of quantitative tools to support decision-makers, which include energy models, has been growing. The advantages of energy models, compared to more qualitative tools such as MCDA, include a higher degree of traceability, easier implementation in computing environments and better opportunities for ex-ante analysis [8]. Dutch municipalities are increasingly trying to include energy models when designing policy for the heating transition. In the present study, an energy model is defined as a computer model of an energy system that introduces a structured way of thinking about the implications of changing parts of the system [9]. Energy models may help analysts and policymakers to better understand the increasingly complex energy sector. However, clear guidelines on how to use these models while designing policies are still lacking.

Next to a lack of guidelines on how to integrate energy models, practitioners, such as policymakers, also experience challenges with energy models themselves. This hinders the use of energy models for policy design and decision-making [10]. When interpreting modelling results, caution is needed, because modelling unavoidably relies on assumptions and estimates, which may not be valid under all circumstances [11,12]. According to a recently published research report [10], no fewer than six different models focusing on the heating transition in The Netherlands sometimes provide different results for the same research question, due to differences in approach, assumptions and input data. This makes it difficult for policymakers to interpret, understand and trust modelling results.

Another significant challenge of current energy models is that they fail to take into account social aspects. This is problematic since the heating transition is highly dependent on humans and their intentions. Social aspects, such as behaviour and attitude of the public, affect proposed or implemented policies and should, therefore, not be ignored [13]. At present, building owners (either citizens/homeowners, institutional investors, private landlords or housing associations) have the right and responsibility to make investment decisions about the heating supply of their buildings [14]. In other words, they need to be incentivised to change their current gas-based heat supply. For this reason, building owners and local communities form an essential part of the heating system and their contribution to the heating transition, by deciding to adopt sustainable heating technologies and/or thermal insulation for their homes, is key in making the transition happen.

#### *1.3. Research Focus*

The present study focuses on the use of energy models in local heating transition projects to assess to what extent and how energy models are used in the decision-making process, and which advantages and limitations this has. The present paper aims to provide insight into the practice of energy modelling and into the needs and challenges of practitioners when using energy models in the heating transition. Thus far, no academic studies have addressed these issues. Insights therein can provide a starting point for more structured guidelines of effective energy modelling. The research question of this study was, therefore, as follows: What are the perceived advantages and limitations of using energy models for municipalities within their data-driven decision-making process concerning the natural-gas free heating transition? To answer the research question, a review of the literature and multiple embedded case studies were conducted in which different heating transition projects in ten Dutch municipalities were investigated. The scope was limited to energy models used by practitioners in the Dutch heating transition, as further explained in Section 3.

The paper is structured as follows. In Section 2, a literature review is presented on the use of energy models in heating transition projects, as well as on data-driven policy design and good modelling practices. Section 2 concludes with a set of theoretical propositions. In Section 3, research design and methodology are presented. In Section 4, the results of the analysis are presented. This includes testing of key propositions regarding the use of energy modelling. In Section 5, the results are discussed, and the academic merit of the present study is presented. The paper ends with a conclusion, the limitations of the study and suggestions for future research.

#### **2. Literature Review**

#### *2.1. Data-Driven Policymaking*

To plan for a transition to sustainable heating in the built environment, municipalities need data and evidence to support their decision-making processes [10]. One way to approach this is by formulating data-driven policies. Multiple studies agree that using a data-driven approach using new data sources and tools, such as energy models, can improve policymaking practices [6,13,15–18], but a systematic approach to do so is still missing [6,13]. Moreover, various studies express concerns about the capabilities of policymakers and stakeholders to deal with new data sources and technologies [16,18]. Thus far, no academic studies have been conducted addressing how the use of energy models affects practitioners within the heating transition, indicating a research gap. In addition, multiple studies call for more clear guidelines for the use of new data and tools by governmental institutions [6,13,19]. Argyrous [20] offers some guidelines on ensuring transparency and accountability, but only Koussouris et al. [17] offer concrete suggestions for practitioners besides ensuring the governmental organisation has the right expertise.

#### *2.2. Challenges of Using Energy Models in Heating Transition Policymaking*

Considering the academic literature regarding the use of energy models to support policymaking in the heating transition, one thing becomes clear: there is a large variety of models and tools being used to support decision making within the energy transition, and few comparisons are being made between these models and tools. An overview of the literature found describing different modelling methods used for a sustainable heating transition is shown in Table A2, Appendix B (relevant findings for the present study). Reviewing this sample [1,21–41] shows that although modelling approaches have the potential to reduce the uncertainty of complex social issues, there is currently no systematic approach on how to apply models to make policy decisions and how to consider not only objective facts but also social and socio-economic factors. As the complexity of heating transition projects is partly due to the dependency on social factors such as human behaviour, models which consider not only objective techno-economic factors but also social and socio-economic factors could increase the value of modelling approaches in heating transition projects [13,22,35,38,39].

Furthermore, the literature shows a large variety of models that are currently used, based on different theories and mathematical principles. A few common challenges can be recognised among this variety. First, the correctness and sensitivity of assumptions. Second, the transparency and usability for practitioners. Third, the need to integrate economic, environmental and social factors. Another interesting aspect concerns the lack of energy modelling research, particularly on the heating transition of The Netherlands. Thus far, in this country, only one academic study was conducted addressing a model focused on the heating transition [1].

Although there is limited academic literature available, grey literature is abundant. A whitepaper by Nikolic et al. [19] offers general principles for good modelling practice and red flags that indicate inadequate modelling practices. It concludes that there is a need for modelling guidelines that are more practical and easier to communicate, and that there is a need for more interaction between academia and practitioners. Both Nikolic et al. [19] and De Ridder et al. [42] suggest that municipalities need to develop more internal knowledge to understand and make use of models. Diran et al. [43,44] claim that better access to data regarding buildings, infrastructure and energy production is needed to utilise current energy models, especially within the utility sector. Figure 1 presents an overview of the energy models and tools regarding the heating transition as used in The Netherlands.

**Figure 1.** Overview of Dutch energy models used for decision-making support in the heating transition. Image translated and adapted from [45].

A study by Brouwer et al. [10] compares six models that are often used by municipalities, i.e., the Vesta MAIS (Multi Actor Impact Simulation) model, the CEGOIA model, the Energy Transition Model (ETM), a DWA model and the Caldomus model. The characteristics of these models are discussed in Table 1. The study [10] reveals that these models provide significantly different results for the same research question due to differences in assumptions and modelling approach. Differences identified [10] include differences in building types and geographical borders, differences in renovations to improve surpassing energy label 'B'; differences in costs of all-electric networks; differences in the order of steps within the approach; different assumptions regarding the scarcity of heat sources and different assumptions regarding learning curves; different heating technologies included; and differences in optimisation research questions.

**Table 1.** Overview of the six energy models often used by municipalities for heating transition policymaking.


<sup>a</sup> English translation provided by the authors. <sup>b</sup> English translation provided by the authors. <sup>c</sup> Optimisation models find the optimal solution for a chosen criterion and constraints, whereas simulation models merely allow the end-user to explore how a system responds to different inputs.

#### *2.3. Propositions on the Use of Energy Models by Municipalities*

Based on the literature, it can be deduced that clear guidelines for the use of energy models are missing and that there are serious concerns about the lack of expertise regarding energy models and data management at public organisations. Among energy models used for energy policy design, there are challenges regarding the correctness and sensitivity of assumptions, regarding the transparency and usability for practitioners (such as policymakers) and regarding the need to integrate more social factors. Moreover, although there is grey literature available, there is a lack of academic research about the use of energy models by municipalities. Based on the literature reviewed, propositions were formulated regarding current practices, advantages and limitations of municipalities using energy models in the heating transition. Table 2 presents these propositions with argumentative justifications provided for each of them. Note that some of these propositions were formulated in an if-then structure to improve readability. However, this structure carries only a conversational implication and should not be read as formal logical implication, under which "if X then Y" is false only when "X" is true and "Y" is false.
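The formal reading that the propositions are not meant to carry is material implication, which can be tabulated directly. The snippet below is a small illustration (the function name `implies` is chosen here for the example); it enumerates the four truth assignments and collects the ones that falsify "if X then Y".

```python
from itertools import product


def implies(x: bool, y: bool) -> bool:
    """Material implication: 'if X then Y' is equivalent to (not X) or Y."""
    return (not x) or y


# Enumerate all four truth assignments of X and Y.
truth_table = {(x, y): implies(x, y) for x, y in product([True, False], repeat=2)}

for (x, y), value in truth_table.items():
    print(f"X={x!s:<5} Y={y!s:<5} -> {value}")

# The only combinations that make the implication false.
falsifying_cases = [xy for xy, value in truth_table.items() if not value]
```

In particular, a formally read proposition would hold trivially whenever its antecedent is false, which is why the conversational reading is the relevant one for the case studies.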

**Table 2.** Overview of the theoretical propositions and their respective justifications.


#### **3. Research Design**

#### *3.1. Embedded Case Study Research Design*

To answer the research question, multiple embedded case studies were conducted. Following the embedded case study design of Yin [46], the nine propositions formulated from the literature review guided the research design, data collection and analysis, and are reflected upon in light of the empirical results. In the present study, multiple cases represented a variety of heating transition projects. Key actors involved included heating transition practitioners and energy model developers. Practitioners, such as policymakers and project managers, are closely involved in the heating transition project of the municipality and/or in the development of the local heating vision document. Energy model developers are involved in developing the models that are used by municipalities.

#### *3.2. Case Selection*

The first generation of pilot projects from the National Programme for Natural Gas-Free Districts (see Table A1, Appendix A), consisting of 27 municipalities, served as an initial source of case study selection. It was predicted that these cases would produce either similar results or contrasting results for anticipatable reasons. All of these projects started at a similar time in 2018, received government funding and had a similar manner of publicly documenting their progress. Differences in results between these projects were expected to stem from the size of the municipality, from specific neighbourhood characteristics of the pilot projects and from the different energy models used. In total, ten municipalities participated in the present study. This entailed a sample of three large municipalities (>100,000 residents), five medium-sized municipalities (30,000–100,000 residents) and two small municipalities (<30,000 residents), across ten provinces (out of twelve provinces in the country; showing high geographical variation), with ten different approaches to natural gas alternatives analysis, and a variety of different selected heating alternatives. Table 3 presents an overview of the ten municipalities that participated and the potential alternatives for natural gas for their respective pilot projects, based on the information that was published in the project implementation reports of 2018.
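The size classes used in the case selection can be written as a simple threshold rule. The following is a minimal sketch: the function name and the resident counts are hypothetical, and the handling of a municipality with exactly 30,000 residents is an assumption, since the text leaves that boundary undefined.

```python
def size_class(residents: int) -> str:
    """Classify a municipality by number of residents.

    Thresholds follow the text: large > 100,000; medium > 30,000; small < 30,000.
    Assumption: exactly 30,000 residents is treated as small, because the
    text does not define that boundary.
    """
    if residents > 100_000:
        return "large"
    if residents > 30_000:
        return "medium"
    return "small"


# Hypothetical resident counts, for illustration only (not the study's data).
examples = {"A": 350_000, "B": 45_000, "C": 12_000}
classes = {name: size_class(n) for name, n in examples.items()}
print(classes)
```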

#### *3.3. Pattern Matching*

To enable reflection from the empirical study to the theoretical propositions, the "pattern matching" technique was used. According to Yin [46], pattern matching is one of the most desirable techniques used in case study analysis. Pattern matching entails comparing empirically-based patterns with the predicted patterns made before collecting data, i.e., the theoretical propositions. The ATLAS.ti 8 [47] software was used to support the process of pattern matching. As there is a risk of collecting too little data with this approach [46], data were also collected on emerging themes that were present in the academic and grey literature but that were not captured in the propositions. After finalising the empirical study, each of the nine propositions will be reviewed separately and will be either confirmed or rejected based on confirmatory evidence that follows from the empirical analysis, as described in Section 3.4.


**Table 3.** Overview of the ten case studies, presenting the size and the proposed alternative heating technology options of each of the municipalities analysed.

#### *3.4. Data Collection, Treatment and Analysis*

The types of data used per case study were: (1) governmental reports (for example, heating transition implementation plans); (2) in-depth interviews with practitioners from municipalities; and (3) in-depth interviews with model developers. The information from these three sources was converged in a triangulating fashion. The documents (such as project implementation plans and model guidelines) provided secondary data that were used to structure the interviews. Only publicly available documents were used. Twenty-one in-depth (expert) interviews provided primary data of the case studies.

All twenty-one interviewees were provided with informed consent forms and all interviewees provided, among others, permission for the use of their statements for the present study. An anonymised overview of respondents is shown in Tables A3 and A4, Appendix C. All interviews were conducted via video call or telephone, and audio was recorded. Interviews with both practitioners (14) and model developers (7) were fully transcribed. Transcripts were provided to the interviewees after the interviews and interviewees were given ample opportunity to read and alter the transcript. All interviews were conducted between the first of May and the first of September of 2020. The average duration of individual interviews was 55 min.

The interviews were semi-structured with open-ended questions to allow for in-depth analysis. Although a set of pre-defined questions was used, interviewees were also given the opportunity to explore questions in greater depth and to introduce new topics. This type of in-depth interview, according to Roller [45], increases the credibility of the data by reducing response bias (distortion due to the tendency of interviewees to provide answers that are considered socially acceptable) and by reducing satisficing (providing an easy 'I do not know' answer). The data collection process, including the informed consent forms, was approved by the Ethical Committee of the Technology, Policy and Management faculty at Delft University of Technology.

Analysis of the interview transcripts was completed by thematic coding. ATLAS.ti 8 [47] (computer-aided qualitative data analysis software) was used to perform the coding process and to create coding reports. A semantic analysis was conducted, meaning that data were coded at face value, i.e., at the explicit meaning. Thematic coding is viewed as a relatively simple qualitative method that offers a high level of flexibility. Quotations were created based on the theoretical propositions and the research questions, and a code was assigned to each quotation. As proposed in standards for theoretical thematic analysis [59], an initial set of codes was set up to guide analysis of the transcripts. The coding frame, as expected, did not fully cover all aspects related to the topic and was adapted and supplemented where needed with codes such as 'motivation residents' and 'not familiar with energy models'. These adaptations were made rather inductively, meaning that the 'open coding' function of ATLAS.ti 8 [47] was used to add codes during the first round of coding. After this first round of coding, all codes and their frequency were assessed to see whether splitting or merging of codes was necessary. To transform the raw data into meaningful information, all quotes were given an English title; code groups were created to show the relation between several codes; and so-called network figures were created to show the focus of different quotes within one code. Moreover, code-occurrence tables (see Tables A5 and A6 in Appendices D and E) were made to quantify the findings, which reduced the subjectivity of result interpretation.
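The idea behind a code-occurrence table can be sketched in a few lines. The example below is illustrative only and does not reproduce the ATLAS.ti workflow or the study's data: the two code names are taken from the text, while the interviewee identifiers and counts are invented.

```python
from collections import Counter

# Hypothetical coded quotations: (interviewee id, thematic code).
# The two code names appear in the text; the ids and counts are invented.
quotations = [
    ("P1", "motivation residents"),
    ("P1", "not familiar with energy models"),
    ("P2", "motivation residents"),
    ("P3", "motivation residents"),
    ("P3", "not familiar with energy models"),
    ("P3", "motivation residents"),
]

# Total number of quotations per code (the code-occurrence count).
occurrences = Counter(code for _, code in quotations)

# Number of distinct interviewees who mentioned each code.
interviewees = Counter(code for _, code in set(quotations))

for code, n in occurrences.most_common():
    print(f"{code}: {n} quotes from {interviewees[code]} interviewees")
```

Reporting both counts separates a theme raised repeatedly by one interviewee from a theme raised by many, which is the kind of distinction that statements such as "six interviewees mentioned" rely on.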

#### **4. Results**

The interviews conducted with practitioners yielded 820 quotes divided over 36 thematic codes. The seven interviews conducted with model developers yielded 561 quotes divided over 53 thematic codes (see Tables A5 and A6 in Appendices D and E for an overview of codes and code occurrences). The results of the case studies were used to either validate or reject the propositions (Section 2.3; Table 2). The findings regarding the testing of the propositions are presented in Table 4. The findings will be discussed in more detail in Sections 4.1–4.8 below.


**Table 4.** An overview of the findings that confirm or reject the propositions made.

#### *4.1. Different Municipalities Use Different Energy Models with Different Aims*

The proposition 'Different municipalities use different energy models (if any) with different aims' was confirmed based on the case studies. Six different energy models were used by the ten municipalities studied to support decision-making for heating transition pilot projects or the design of the Transition Vision Heat: the Vesta MAIS model, the CEGOIA model, the Caldomus model, DWA models (the IKM and the WWM), the ETM and the WTM. This is in line with [10], which mentioned these six models as the most used models for the Dutch heating transition. Moreover, two national modelling studies based on one or more of these energy models were used, the 'Startanalyse' (Start Analysis in English; translation by the authors) and the 'Openingsbod' (Opening Offer in English; translation by the authors) (see Table A1, Appendix A). In the case studies, these models were only seldom used by practitioners themselves, the only exception in this sample being the municipality of Utrecht, where a modelling team was deployed to use the Vesta MAIS model to develop heat scenarios. More generally, municipalities were found to use models and modelling studies to support the decision-making process, to provide more legitimacy towards residents or as a basis for more detailed heating transition business cases. No socio-technical energy transition modelling methodologies or agent-based modelling methodologies were found, though, indicating that these were not considered important in the current planning and implementation of the heating transition at the local level. All models, except for the ETM, were optimisation models. The ETM did not offer an automated optimisation function. All models, except for the ETM, aimed to find the heating alternative with the lowest societal costs.

#### *4.2. Complexity and User-Friendliness of Energy Models*

The propositions 'If energy models are complex to use, then practitioners will make limited use of them while planning for the heating transition', and 'Practitioners seek the help of external parties to use and interpret energy models' were confirmed based on the case studies analysed. The results showed that the use of energy models was not necessarily limited: seven out of the ten heating transition pilot projects investigated used an energy model in their decision-making process and seven out of seven Transition Vision Heat projects used or were planning on using an energy model. However, six interviewees mentioned there were issues regarding the complexity and user-friendliness of energy models that hindered effective usage in heating transition projects. Four out of seven model developers claimed that practitioners often did not have the right background or the time to master these complex tools independently. According to the same four interviewees, large-sized municipalities usually had more time and resources to learn how to use a model than their small-sized peers. If a third party conducted the modelling process, large-sized municipalities were therefore generally better able to critically reflect on the results. All seven municipalities from this sample that used energy models in their heating transition projects used third parties at some point during these projects to conduct modelling studies. Third-party expertise was used at all scope levels: Regional Energy Strategy development (see Table A1, Appendix A), Transition Vision Heat development and pilot projects. Municipalities were found to hire external parties to provide modelling calculations, home inspections, modelling result interpretation or to provide studies, for example into available heat sources. These findings confirm that there are indeed challenges with the complexity and user-friendliness of energy models and that these are usually overcome by seeking help from external parties.

#### *4.3. Integration of Social or Socio-Economic Factors into Energy Models*

The proposition 'If energy models do not integrate social or socio-economic factors, then practitioners will make limited use of them while planning for the heating transition' was rejected based on the case study analysis. All fourteen practitioner interviewees agreed that social and socio-economic factors are important and influence the success of heating transition projects. Three municipalities were found to use social or socio-economic data or information, or were planning to use this, to identify coupling opportunities (opportunities to combine activities for the heating transition with other improvement opportunities in a neighbourhood, such as sewer system updates, building renovations or traffic alterations), and two municipalities used or were planning to use social or socio-economic information to determine the prioritisation of neighbourhoods for heating transition activities. On the other hand, none of the practitioners or model developers interviewed claimed that social or socio-economic factors influenced the choice of heating alternatives, which is the focus of the six energy models used by the municipalities in the sample. The choice of heating alternative was based on the lowest societal costs in all municipal heating transition projects within the present study. All seven energy model developers agreed that social, political and psychological aspects influence heating transition projects. However, all claimed that these factors should not and/or could not be included in their respective models and that it would be better to consider these factors alongside the techno-economic modelling results in energy modelling studies.

#### *4.4. Unavailable Data and Uncertain Assumptions*

The proposition 'If assumptions within energy models are uncertain then this will decrease the trust within energy models for practitioners' could neither be confirmed nor rejected based on the empirical results. Energy model developers were found to use different assumptions, and two energy model developers claimed that these are usually the reason why results between different energy models differ. Practitioners offered critiques of assumptions of models or modelling studies, in particular about assumptions regarding energy labels and the use of renewable gas. However, the impact this had on trust in energy models did not become clear in the interviews. The interviews showed that if practitioners did not agree with assumptions used in models or modelling studies, they requested model developers to change said assumptions or opted for a different model that used different assumptions. All seven model developers stated that they tried to be transparent about the assumptions they used and that, in collaboration with the practitioners, assumptions can be altered during the modelling process.

The proposition 'If data is uncertain or unavailable, then this will decrease the trust within energy models for heating transition decision making of practitioners' could neither be confirmed nor rejected. Data played an important role for municipalities and model developers in developing heating transition plans, and even though data was sometimes unavailable, this study offered no proof that this decreased the trust of practitioners in energy models. If municipalities decided to use a model, this energy model proved to be more useful if it was fed with local data. Unavailable data that could be useful according to practitioners and model developers include data about energy use per connection, data about the willingness to pay of residents and data about the potential impacts on the electricity grid. One energy model developer mentioned that the data collection process at public organisations was too time-consuming, and two energy model developers mentioned that they ran into issues with the energy use data available from Statistics Netherlands ('CBS' in Dutch). These data were aggregated due to privacy laws and were often deemed too inaccurate to use for heating transition projects. Similarly, two energy model developers and one practitioner stated that the data from the Basic registration of addresses and buildings (BAG) (see Table A1, Appendix A) regarding energy labels provided too little insight into the level of thermal insulation present at residential houses. One of the most uncertain data sets used for heating transition projects was data about available heat sources. All model developers agreed that the datasets for heat source data were uncertain and that extra research was always needed to assess the local situation. However, whereas four energy models used the availability of heat sources as a determining factor for the choice of a natural gas alternative, two models did not use heat source availability as a determining factor.

#### *4.5. The Use of Third Party Modelling Expertise*

The proposition 'Practitioners need new (in-house) expertise to effectively use energy models' was confirmed based on the case studies. Only one municipality was already capable of modelling scenarios independently. The others relied on the modelling expertise of third parties. Even if a municipality outsourced the modelling process, a minimum knowledge level was required to correctly interpret and critically reflect on results. According to energy model developers, practitioners, with only a few exceptions, did not meet this minimum condition. This also caused practitioners to propose incorrect or unsuitable research questions to model developers.

The proposition 'interactive visualisation and different interfaces for different stakeholders could improve the usability of energy models' was also confirmed based on the case studies. Three energy model developers had developed interactive models, maps or tools that, according to them, helped clients such as practitioners to better understand and interpret the modelling results. No statements from practitioners were gathered on the advantages of interactive models.

The proposition 'External parties have commercial reasons to not be transparent about their energy model design' could neither be confirmed nor rejected. Two energy model developers stated that it was not always possible to gain access to underlying assumptions, data and parameters of models from other commercial agencies. However, all six models in this study were compared to each other in the benchmark study [10], indicating that model developers were at least willing to be transparent towards independent researchers. Moreover, one national modelling study compared the results and underlying assumptions, datasets and parameter sensitivities of multiple models (of which two were commercial). Besides, transparency was only mentioned as a limiting factor by one practitioner. Hence, one could state that even though transparency, especially among commercial model developers, could be improved, it did not seem to be a limiting factor for municipalities to use energy models.

#### *4.6. Advantages and Limitations of Using Energy Models*

According to the academic literature, energy modelling can aid in decision making and policymaking because it introduces a structured way of thinking about the implications of changing parts of the system [9]. The case studies provided more concrete benefits and limitations of using energy models for decision making in the Dutch heating transition. Practitioners stated that the use of energy models within heating transition projects provided perspective for action, financial insight, transparency and legitimacy, concrete propositions to residents and sparked useful discussions. Besides, one practitioner stated that nationally available modelling studies provided validation and robustness of (other) modelling results. Most of these advantages are related to creating public support for policy. Practitioners also mentioned limitations of using energy models. Interviewees argued that energy modelling results were considered too abstract, too general or too simplified for local analysis. In addition, models were considered not user-friendly and complex. Practitioners mentioned that modelling results provided no insight into available heat sources, limited insight into the impact of nearby heat networks and no or limited insight into end-user costs. Another challenge mentioned was that the Statistics Netherlands ('CBS' in Dutch) neighbourhood definitions do not provide a logical division of the city, which, among others, created the need to conduct a reality check after modelling to filter out odd results, especially for the utility sector.

#### *4.7. Collaboration with Housing Associations, Network Operators and Citizen-Led Energy Cooperatives*

The case studies also yielded insights suggesting that collaboration with housing associations and network operators is important during heating transition projects to prepare implementation plans and to find coupling opportunities. Housing associations were considered important as they often have property within the municipality and because they have renovation plans that may or may not align with the municipal heating transition plans. Network operators were considered important because they are responsible for underground infrastructure and network reinforcements. Therefore, they have to be made aware of the municipal heating transition plans, and they have to provide input about the current limitations of the infrastructure for specific heating options. Moreover, citizen-led energy cooperatives play an important role in heating transition pilot projects. In five out of thirteen interviews, it was mentioned that collaboration with citizen-led energy cooperatives is considered important. In one small and one medium-sized municipality, energy cooperatives even provided project leaders for heating transition pilot projects. For Transition Vision Heat development at larger municipalities, citizen-led energy cooperatives were found to exercise less influence. Close collaboration with energy model developers happened only in municipalities that have established modelling teams that model energy systems independently; for this sample, those included the two largest municipalities (>300,000 residents).

#### *4.8. The Use of Comparative Analysis and Multi-Model Ecologies*

As mentioned, different models sometimes result in different outcomes, which can create confusion and uncertainty among practitioners. One practitioner interviewed explicitly mentioned experiencing such confusion. Three model developers of this sample actively used comparative analysis to reduce this issue, and one national energy modelling study, the 'Openingsbod', also offered comparative analysis. In such an analysis, differences in methodology, assumptions, data and results of different energy models or modelling studies are compared to one another. This indicated where result differences originate from and provided an overview of the robustness of results across models. One practitioner claimed that the latter helped in determining a priority of neighbourhoods to start with heating transition projects.
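As a minimal illustration of the result-comparison step of such an analysis, the Python sketch below tabulates the heating alternative preferred by each model per neighbourhood and reports the share of models that agree on the most common choice. The neighbourhood names, model labels and outcomes are hypothetical placeholders, not results from the case studies, and the agreement share is an assumed, simplified robustness indicator rather than a metric used by the interviewed model developers.

```python
from collections import Counter

# Hypothetical example data: preferred heating alternative per neighbourhood,
# as reported by three illustrative models (not results from the present study).
results = {
    "Neighbourhood A": {"Model 1": "heat network", "Model 2": "heat network", "Model 3": "heat network"},
    "Neighbourhood B": {"Model 1": "all-electric", "Model 2": "heat network", "Model 3": "all-electric"},
    "Neighbourhood C": {"Model 1": "green gas", "Model 2": "all-electric", "Model 3": "heat network"},
}


def agreement(outcomes):
    """Return the most common alternative and the share of models choosing it."""
    counts = Counter(outcomes.values())
    alternative, votes = counts.most_common(1)[0]
    return alternative, votes / len(outcomes)


def compare(results):
    """Summarise cross-model agreement for every neighbourhood."""
    return {name: agreement(outcomes) for name, outcomes in results.items()}


summary = compare(results)
# Neighbourhoods with the highest cross-model agreement are listed first.
ranked = sorted(summary, key=lambda name: summary[name][1], reverse=True)
for name in ranked:
    alternative, share = summary[name]
    print(f"{name}: {alternative} ({share:.0%} of models agree)")
```

Under these assumptions, neighbourhoods with full agreement could be read as having comparatively robust outcomes, whereas low agreement would point to places where the underlying assumptions and data of the models deserve a closer look before any prioritisation.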

Finally, three practitioners mentioned the challenge of matching up heating transition plans at different levels of abstraction, which were found to influence each other and which were sometimes developed simultaneously and with different energy models. To reduce this challenge, one energy model developer tried to position their model in such a manner that they could assess how plans would fit together. This energy model developer envisioned a multi-model ecology in which their model provided a broad energy perspective and where other energy models would offer more detailed calculations on, for example, heating transition visions, heating transition business cases and the effects on power networks.

#### **5. Discussion**

#### *5.1. Reflection vis-à-vis the Academic Literature*

The present study has provided a more concrete image of the role of energy models in data-driven policymaking and decision-making in the heating transition. The literature review showed that modelling approaches have the potential to reduce the uncertainty and complexity of heating transition projects. The present study provided a concrete overview of the advantages of using energy models in heating transition decision making as experienced by practitioners and model developers. The advantages found seem to indicate that although energy models do not necessarily make a heating transition project less complex, they at least offer means to make legitimate choices. The advantages identified are in line with the advantages of data-driven policy design mentioned by Koussouris et al. [17], who stated that tools such as energy models simplify decision-making processes, even under complicated conditions, by making it possible to model complex processes and to collaborate with the different actors involved. They are also in line with those mentioned by Adam et al. [15], who stated that providing evidence for the effectiveness of policy choices is one of the cornerstones of legitimate policymaking.

The results of the present study could provide a starting point for recommendations targeting policymakers and model developers to facilitate more effective use of energy models in heating transition decision-making. Such targeted recommendations were not found in the literature and could help towards designing a systematic approach for integrating energy models in data-driven policymaking, which is needed and currently lacking [6,13].

Moreover, the results of this study suggest that offering comparative model analysis would help practitioners to deal with the myriad of sometimes contrasting models, modelling studies and modelling results available and that setting up a multi-model ecology might decrease the challenges of aligning heating transition projects at different abstraction levels. This is in line with Manfren et al. [60], who state that multi-model ecologies could help in creating the integration between top-down and bottom-up modelling perspectives. Furthermore, it aligns with Nikolic et al. [61], who state that multi-model ecologies help get a more coherent and less biased understanding of the "right thing" to do in energy transition decision making, as using multiple models allows multiple perspectives to be explored and brought together.

Although this study confirmed certain advantages of using energy models, it also shed light on the limitations of using energy models for decision-making. Designing modelling scenarios is considered a time-consuming and costly task. Modelling results are not absolute truths but rather results subject to calculation rules and assumptions, and if a model or its outcomes are incorrect, one might be worse off than when not using a model to begin with [19]. According to energy model developers interviewed in the present study, not all practitioners understood the limitations of energy models, and some interpreted modelling results as absolute truths.

Finally, the literature review suggested that it is problematic that current heating transition models do not include social and/or socio-economic factors, as the transition is highly dependent on humans and their behaviour [13]. However, the present study showed that practitioners were not always sure how social or socio-economic data should influence the choice of a heating alternative or the prioritisation of neighbourhoods. Moreover, accessing these data was sometimes difficult due to privacy restrictions. Model developers did not see added value in including social or socio-economic factors within their heating transition models, which all had a techno-economic focus. Their models were focused on finding the lowest societal and/or end-user costs for different heating alternatives and did not include social factors, as affordability for residents is seen as one of the main challenges of the Dutch heating transition [5]. As far as is known, the costs of a heating alternative do not depend solely on social or socio-economic factors. What could depend on such factors is, for example, the degree of participation and technology adoption rates.

In the present study, not one municipality was found using model methodologies focused on assessing social interactions, such as Agent-Based Modelling, System Dynamic Modelling or Socio-Technical Energy Transition Modelling. Instead, municipalities used models with a mere techno-economic focus and assessed social and socio-economic data alongside the results of these modelling efforts to identify coupling opportunities and/or to determine prioritisation of neighbourhoods.

#### *5.2. The Influence of National Agreements and Municipality Size*

All municipalities that provided information about their Transition Vision Heat planning design in the present study used or were planning to use models/modelling studies. This was expected as it was agreed in the national Climate Agreement of 2019 [3,62] that municipalities would use the 'Startanalyse' and its guidelines [63] to design their Transition Vision Heat. According to the Climate Agreement, this would provide all stakeholders with a "uniform frame of reference regarding the impact of the various natural gas alternatives in a district" [3]. This agreement might have incentivised municipalities to use energy models when designing their Transition Vision Heat. However, three pilot projects did not use energy models to choose a natural gas alternative. The pilot projects analysed all started before this statement was made in the Climate Agreement and before the 'Startanalyse' and its guidelines [63] were published. Therefore, practitioners in pilot projects might have been less familiar with available models and modelling studies, might have had less access to models and modelling studies and/or might have been less incentivised to use available models or modelling studies.

Secondly, pilot projects that did not use an energy model to choose a heating alternative had a few things in common. All three pilot projects were located in villages with fewer than 2000 residents. All of them had active citizen-led energy cooperatives, two pilots were organised by the local energy cooperative, two pilot project leaders were not familiar with energy models, and two pilot projects entailed only or mostly detached houses from before 1940 with poor thermal insulation levels. Two practitioners claimed that they did not feel that they needed an energy model because the choice for a heating alternative could be made with common sense and information about the residential characteristics. This indicates that an energy model might not always be considered necessary or desirable for heating transition decision-making and that it is important to consider when the use of an energy model would be beneficial and when other sources of evidence might be sufficient to support decision-making.

#### **6. Conclusions**

#### *6.1. Answering the Research Question*

This study aimed to answer the research question 'What are the perceived advantages and limitations of using energy models for municipalities within their data-driven decision-making process concerning the natural-gas free heating transition?'. To answer this question, a literature review and embedded multiple case study research were conducted, which included different heating transition projects in ten Dutch municipalities.

Results inter alia show that energy models observed in the present study were mostly initiated and used by consultancy agencies to support Dutch municipalities in designing heating transition plans. Over half of the municipalities analysed were found to use models or modelling studies at some point during their respective heating transition pilot projects. All cases that provided information about local Transition Vision Heat development were using or planning to use models or modelling studies for the design of their vision document.

The models used were the CEGOIA model, the Vesta MAIS model, DWA models, the ETM and the WTM. The modelling studies used were the 'Openingsbod' and the 'Startanalyse'. Municipalities that did not utilise models or modelling studies for their pilot projects belonged to the four smallest municipalities analysed in the present study, indicating a positive relation between municipality size and model usage. All municipalities that used models or modelling studies requested external expertise at some point during the modelling process, indicating that the knowledge and skill level at municipalities was not sufficient to do this independently. This was confirmed by model developers who also stated that the knowledge level of practitioners is often insufficient to interpret results of modelling studies conducted by third parties.

Advantages of using models in heating transition projects mentioned in the interviews were that the modelling process and its results provided perspective for action, financial and socio-economic insights, transparency and legitimacy towards residents, concrete propositions for residents and means to start useful discussions. However, interviewees also mentioned several limitations. First, models and modelling results were found too abstract, too general or too simplified for local analysis, not user-friendly and were considered complex. Results were difficult to interpret for non-experts such as practitioners, and interactive models could provide practitioners with a better understanding of the answer and help with getting a feeling for parameter sensitivity. Second, modelling results provided too little insight into end-user costs and the effects on the electricity grid. Third, data sets regarding energy use, thermal insulation levels and heat sources proved to be insufficient for local analysis, and there was no consensus between model developers and practitioners about the different assumptions regarding green gas availability and energy labels used in different models.

This study also showed that model developers deemed it impractical to integrate social and socio-economic factors in the energy models discussed, but agreed that these data should be incorporated in modelling studies/reports. Model developers usually did this by collecting social or socio-economic data and by presenting these data next to the modelling results to provide context for further decision-making.

Finally, the results suggest that offering comparative model analysis would help practitioners to deal with the myriad of sometimes contrasting models, modelling studies and modelling results available and that setting up a multi-model ecology might decrease the challenges of aligning heating transition projects at different abstraction levels.

#### *6.2. Limitations*

The external validity of the empirical results is limited by the context in which the present study was conducted, namely selected municipalities in The Netherlands. This was a scoping choice motivated by the case study design and time constraints of the present study. The representativeness of these results for other geographical, political and cultural contexts might therefore be fairly limited. It is expected that representativeness will particularly be limited for countries where the heating transition is not organised in a decentralised manner or where there are not multiple (national) energy models available to analyse the costs of this transition.

Limited access to background information on some commercial energy models restricted the reflection on technical aspects of the models reviewed in the case studies. In the present study, the capabilities, limitations and underlying assumptions of models were only compared at the surface level, based on publicly available reports and the challenges and advantages mentioned by interviewees. This limited access to background information reduced the potential for in-depth model comparison. On the other hand, the time constraints of this research and the focus on user experiences and the modelling process rather than the actual energy models also reduced this potential. This choice was made because limited access to the background information of (commercial) models was foreseen and because there are already other studies, such as [10], that focus on in-depth model comparison.

The data collection tools chosen, interviews and thematic coding, also have their respective limitations. Interviews and thematic coding are research tools that require a high degree of interpretation from the researcher. During the coding process, quotes had to be translated and interpreted. The literal transcripts, the coding process and the coding reports ensured quotes were methodologically analysed and that it was possible to review the original quotes.

The present study used multiple sources of evidence in a triangulating fashion to decrease the subjectivity of the answers and to check their consistency over time. A remarkable observation was that within the pilot projects observed the views and plans of interviewees did not always align with the views as exhibited in the implementation plans of the pilot project, due to advancing insights.

#### *6.3. Recommendations for Future Research*

The present study did not provide an answer as to when heating transition projects should and when they should not use energy models to guide their heating transition decision-making process. The discussion offered some criteria that might indicate projects that do not need energy models, such as municipality size, residential housing characteristics and the presence of an energy cooperative. It is therefore recommended to conduct more research into which criteria could indicate that projects would benefit from using an energy model. It is recommended to conduct more case studies, with different types of heating transition projects, to explore this topic. In addition, it is suggested to also include case studies that utilise other decision support tools, such as MCDA tools, in order to assess the relative advantages and limitations of energy models when compared to other tools.

Furthermore, it is recommended to further study the impact of social and socio-economic factors. The literature review revealed that social and socio-economic factors are highly important for heating transition decision-making processes, but currently, the impact of social and socio-economic data within Dutch heating transition projects is limited and at best influences the prioritisation of neighbourhoods. More research into certain factors, for example, income or the presence of energy cooperatives, could provide insight into the correlation of these factors with heating transition project progress and into the potential value of models that include such factors. Such insights would not only benefit the Dutch heating sector but might also benefit a range of international energy transition projects. On the one hand, this might entail desk research into socio-technical transitions and models (such as Socio-Technical Energy Transition, System Dynamics or Agent-Based models). On the other hand, it might address practical case studies that test socio-technical transition theories and models within heat or energy transition projects. Ideally, such case studies are not restricted to The Netherlands but also include projects in countries with significantly different heating systems, energy markets, institutions, and social and socio-economic values to compare and corroborate results.

Finally, it is recommended to conduct more research into the field of multi-model ecologies (i.e., systems of interacting models). The present study has shown the need for comparative analysis, for modelling at different abstraction levels and for assessing the impact of choices regarding the heating transition in other disciplines, such as electrical infrastructure and social welfare. More research into multi-model ecologies can benefit both the Dutch and the international academic modelling field as it offers the opportunity to add value to existing models, for example by making them more interactive with other national or international models. Nikolic et al. [61] and Manfren et al. [60] offer the first set of principles, challenges and guidelines that provide a conceptual basis for multi-model ecologies. Currently, the 'Mondaine Suite' project [64] is one of the first projects that is aiming to realise a multi-model ecology by developing a coupling mechanism for different (Dutch) energy models. However, this project does not yet couple Socio-Technical Energy Transition models, System Dynamics or Agent-Based models, which might offer an interesting opportunity for future research to include more social and behavioural components into multi-model ecologies.

**Author Contributions:** Data curation, B.A.H.; Investigation, B.A.H.; Methodology, B.A.H., T.H.; Supervision, T.H., D.D. and Z.L.; Visualization, B.A.H.; Writing – original draft, B.A.H.; Writing – review and editing, T.H., D.D. and Z.L. All authors have read and agreed to the published version of the manuscript.

**Funding:** This research did not receive external funding.

**Institutional Review Board Statement:** The research design was approved by the Human Research Ethics Committee of the Technology, Policy and Management faculty of the Delft University of Technology.

**Informed Consent Statement:** Informed consent was obtained from all subjects involved in the study.

**Data Availability Statement:** The data used in this study comprise anonymized interview transcripts and anonymized interview coding reports. Both can be found on the repository of the Delft University of Technology: https://repository.tudelft.nl/islandora/object/uuid%3Aae00908d-e89a-400e-819b-dd0d11cdba34.

**Conflicts of Interest:** The authors declare no conflict of interest.

#### **Appendix A**

**Table A1.** Glossary and definitions of Dutch (policy) concepts and abbreviations used in the present study.


#### **Appendix B**

**Table A2.** Overview of the literature found describing different modelling methods used for sustainable heating transition projects.


#### **Appendix C**

**Table A3.** An overview of the respondents from municipal heating transition projects.



**Table A4.** An overview of the respondents from model development firms involved in municipal heating transition projects.

#### **Appendix D**

**Table A5.** The code-occurrence table shows an overview of the 37 thematic codes, the respective code frequencies and the number of transcripts that quotes were identified in.


#### **Appendix E**

**Table A6.** The code occurrence table shows an overview of the 53 thematic codes, their respective code frequency and the number of transcripts that quotes were identified in and the code group.


#### **References**

