**1. Introduction**

The number of different electrical appliances in households keeps rising. As such, it is becoming increasingly important to recognize the potential for energy savings and demand side management, i.e., the possibility to defer power consumption in order to improve the stability of the power grid. For these purposes, it is vital to know not only which appliances are present in a building, but also their individual energy demands and operational times. Monthly electricity bills provide insufficient information to accomplish this task, however, as the required information can only be estimated from the household total. Even current-generation smart meters are incapable of providing a detailed and unambiguous itemization of energy consumption, so more fine-grained means for load monitoring in a home are needed to provide enhanced consumption feedback and accomplish energy savings (which were documented to reach up to 12% in [1]).

Two fundamentally different approaches exist for the collection of the required data at appliance level. On the one hand, Intrusive Load Monitoring relies on the installation of power sensing devices for each appliance under consideration (or at least every electrical circuit in the home). The advantages of being able to attribute power consumption to individual devices, however, come at a high cost for instrumenting the environment with sensors, and maintaining their operability during their lifetime. On the other hand, Non-Intrusive Load Monitoring (NILM) methods collect the electrical information of a whole building or apartment and use algorithms to disaggregate the total power demand into the contributions of individual devices. The non-intrusive approach is often preferred due to lower costs and installation efforts. However, current NILM methods do not always succeed in accurately and unambiguously disaggregating power data from households [2].

**Citation:** Huchtkoetter, J.; Tepe, M.A.; Reinhardt, A. The Impact of Ambient Sensing on the Recognition of Electrical Appliances. *Energies* **2021**, *14*, 188. https://doi.org/10.3390/en14010188

Received: 5 December 2020 Accepted: 23 December 2020 Published: 1 January 2021

**Publisher's Note:** MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

**Copyright:** © 2021 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

Most current methods for appliance recognition and load disaggregation consider electrical consumption data (e.g., active power) [2,3]. When data are available at a sufficient resolution, however, more complex features like spectral components can be computed and used supplementally. Using more, especially more complex, features has been shown to improve the rate of correct device recognitions [4]. Confusion may still exist between certain appliance types, and some appliances have been reported to be "hard to disaggregate" [5,6] based on their electricity consumption alone. One potential candidate to alleviate the current limitations of NILM is the additional use of contextual information, as documented in [7–9]. For example, the distributions of On- and Off-durations as well as dependencies between device usages are modeled into a Factorial Hidden Markov Model (FHMM) in [7]. The resulting performance shows a marked improvement, even for a larger number of active appliances, and reaches improvements of up to 25%.

A similar approach based on user presence and time constraints was presented in [3]. The application of time constraints alone was shown to achieve a small improvement of about 3%. However, as soon as indicators of user presence were included in combination with time constraints, improvements of about 14% were reported.

There are, however, more parameters beyond the aforementioned attributes. It is well known that many electrical devices generate acoustic, magnetic, or optical emissions, or dissipate the consumed energy as heat during their activity. The potential of using such information in the appliance recognition task has been investigated in [5,10,11] and in further studies presented in Section 2. We believe, however, that our work is the first to present a holistic and comprehensive study that determines the information gain of a range of additional sensing modalities. A thorough understanding of the importance of ambient sensing features is vital to optimally support appliance recognition (e.g., by lowering the number of candidate devices for the classification task). We strongly expect monitoring systems to profit from a deeper understanding of the features that characterize the operation of electrical appliances. System operators could then decide to specifically collect data based on the importance of certain sensor types, i.e., their usefulness, to better evaluate whether costs outweigh possible benefits. Costs typically arise from hardware purchases and the deployment of devices in their optimum locations (e.g., luminosity sensors need to be mounted next to the light-emitting parts). However, non-monetary costs may also play a role, e.g., when sensors have the potential to compromise user security or privacy. Understanding the importance of ambient sensing features will thus ease the decision of which sensor types to deploy. Accordingly, our work seeks to establish the foundation for further work in this context by providing an answer to the following question: Which ambient sensors can lead to improved appliance recognition results, and what are the most useful sensor types to facilitate the categorization of electrical consumers by their types?

In order to answer this question, we design a study to be conducted in two sequential steps. The first step, the design and implementation of a comprehensive data acquisition setup, is essential due to the unavailability of publicly released data that contains information beyond electrical power consumption data. In our second step, data analysis, we assess the contribution of each sensing modality to the overall appliance recognition task. Beyond the recognition of specific appliances, we also determine the set of features to facilitate the detection of appliance classes (i.e., the distinction between devices of different categories). Ultimately, the analyses presented in this work allow us to derive recommendations for future data collection campaigns, similar to the set of guidelines for electrical datasets presented in [12].

Our manuscript is organized as follows. In Section 2 we provide an overview of further studies which considered additional ambient sensing and illustrate how these works were considered for the design of our study. In Section 3 we present our system design to collect data from eight ambient parameters, both during appliance operation and inactivity. The concept for our subsequent evaluation and its parameter choices are detailed in Section 4. We evaluate to what extent devices could be recognized and what ambient sensors carried the most information in Section 5, and we summarize the insights gained in our study in Section 6.

#### **2. Related Work**

A number of studies have considered further data besides electrical information sources to improve device identification and, accordingly, the disaggregation process: acoustic sensors, light, temperature, vibration, electro-magnetic fields, or acceleration data [5,10,11,13–16]. Remarkably, however, the aforementioned works have considered these parameters largely in isolation, as shown in Table 1. In contrast, we present a comprehensive study that relies on all sensor types in this work.


**Table 1.** Types of Ambient Sensors Used in Related Work.

The sensor deployment methodologies differ as well. The authors of [5,11,13,14] use a small number of sensors, which are not fixed to the appliances under consideration, but rather monitor the ambient conditions in general. As such, they can collect information from appliances that are operating simultaneously. In contrast to this, the collection setups presented in [10,15,16] use separate sensors for each appliance, so that recorded values can be unambiguously attributed. These sensor placements are mostly related to the different concepts of the respective studies. The authors of [11,13,17] present different sensing platforms and concepts, but only briefly evaluate the possibility to identify electrical devices. In [14] the authors introduce a system which solely includes ambient audio information, collected on a per-room granularity. The system is considered as a load monitoring system and uses different collected audio features as a first disaggregation layer. The electrical features are evaluated, and can override the decision, only if the audio features do not allow for disaggregation. The authors of [18] follow a similar approach, although they implement a smartphone-based system to detect household activities. Through the annotation of activities with corresponding energy consumption data, the authors enable basic load monitoring based on audio information. Lastly, the authors of [10] present the appliance-agnostic usage of "multi-modal signatures" through the joint evaluation of all sensor data and their changes that allow for device identification. They recognize the potential of closely correlated environmental data as trusted sources of appliance activations, allowing NILM systems to validate or re-train themselves. While the general usefulness of multimodal signatures is proven in [10], the used sensor types are not evaluated concerning their individual usefulness. Aligned with these insights, the authors of [19] have also remarked on the potential of environmental sensors in disaggregation tasks.

Bearing these related findings in mind, we present our data acquisition concept and its evaluation in the following sections. As the related studies did use approaches with and without consideration of electrical data for the recognition process, our study will accordingly include evaluations for both approaches.

#### **3. Data Acquisition Concept**

A number of datasets are widely used in related research on energy data analytics so far (e.g., [20–24]). Collecting such datasets, however, is often motivated by the desire to capture a large continuous stream of electrical energy consumption readings for data processing tasks like pattern recognition or forecasting. Ambient features or user-specific details (e.g., presence) are not part of most datasets, and there is no dataset available that comprises electrical signals as well as the full set of ambient conditions we consider in this work. As a result of this shortcoming of published datasets, it was necessary to run our own data collection campaign. We have decided to design a collection system for both electrical and ambient sensor data and use it to collect a dataset for the data analysis we conduct in Section 4. We describe our rationales behind the design of the system as well as the data preprocessing steps we apply in the following subsections.

#### *3.1. Selection of Appliances*

As we aim for a generalizable evaluation of device types and their emissions of non-electrical signals, the first decision to make is the selection of the set of appliances under consideration. Our goal is to determine a representative set of electrical devices that will be operated in a controlled environment in order to collect the input dataset for all further analyses. To make an informed choice of devices, we have consulted studies on electrical appliance ownership worldwide [25–27]. By considering the appliance types reportedly owned by at least 2/3 of the households, we have been able to identify a set of 13 appliances that are present in many households in developed countries. Minor household devices are typically not part of the aforementioned surveys due to their large diversity and their negligible contributions to the monthly energy bill. Still, several use cases for their recognition in load data are conceivable, e.g., the identification of user activities that are tightly bound to the use of these devices. Accordingly, we have chosen eight additional devices related to cooking (e.g., a mini oven), personal hygiene (such as a hair dryer), and office activities (e.g., a printer). The full set of all 21 appliances under consideration is provided in Section 4.

#### *3.2. Selection of Monitored Parameters*

Having selected the appliances under consideration, their (expected) ambient influences need to be determined, in order to derive the sensors required to capture these parameters. For the evaluation of possible ambient influences we have extracted possible emissions from the devices' data sheets, the general construction of devices (thus implicitly considering the laws of physics), and moreover inspected the devices under test manually during their operation. The complete list of all eight captured sensor parameters is given in Table 2.

#### *3.3. Data Collection System Design*

Based on the derived set of requirements pertaining to the parameters to monitor, a collection system was prototypically designed and implemented. As it was our intention to collect the sensor measurements as close to the Device under Test (DuT) as possible, most of the ambient sensors were wired up to an embedded microcontroller system, based on the PJRC Teensy 3.2 board. Its compact size offered the possibility to be mounted very close to the DuT. Sensors for the parameters of interest were interfaced to the board either via a digital two-wire (I2C) interface, or through analog signals that were converted into the digital domain by the microcontroller's integrated 16-bit Analog-to-Digital Converter (ADC). The microcontroller system was programmed to use a periodic sampling schedule, and to capture the considered parameters as synchronously as possible. Retrieved sensor values (e.g., digitized temperature readings) were scaled in order to report their data in SI units (e.g., °C). The data sampling rates are shown alongside the sensor types in Table 2. They were selected such that the microcontroller system could process the data and transmit them (across its USB-serial connection) in real-time. The choice of 400 Hz for vibration measurements aligns well with typical rotational speeds of the evaluated internal motors (e.g., internal motors of DVD and CD players typically spin at 200 rpm to 570 rpm).
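To make the relation between the vibration sampling rate and the quoted motor speeds concrete, the following minimal sketch converts the rotational speeds into frequencies and compares them with the Nyquist frequency of a 400 Hz sampler. It considers only the fundamental rotation frequency, which is an assumption of this sketch; harmonics of the rotation are ignored, and the names `rpm_to_hz`, `SAMPLING_RATE_HZ`, and `NYQUIST_HZ` are illustrative.

```python
def rpm_to_hz(rpm: float) -> float:
    """Convert a rotational speed in revolutions per minute to hertz."""
    return rpm / 60.0


SAMPLING_RATE_HZ = 400.0               # vibration sampling rate stated in the text
NYQUIST_HZ = SAMPLING_RATE_HZ / 2.0    # highest frequency captured without aliasing

# Motor speed range quoted in the text for DVD and CD players.
for rpm in (200, 570):
    fundamental_hz = rpm_to_hz(rpm)
    margin = NYQUIST_HZ / fundamental_hz
    print(f"{rpm} rpm -> {fundamental_hz:.2f} Hz "
          f"(Nyquist frequency is {margin:.1f} times higher)")
```

Under this simplification, both ends of the quoted speed range lie well below the Nyquist frequency, which leaves headroom for higher-order vibration components.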


**Table 2.** Summary of the Used Sensing Devices and the Properties of the Data They Collect.

The sensor platform was connected to a personal computer in charge of centrally collecting all sampled data, to which two more sensors were attached. First, electrical signals for both voltage and current are collected at 10 kHz through a USB-interfaced PicoScope 4444 oscilloscope, equipped with a Hall effect current probe and a passive differential measurement voltage probe. Second, the collection of audio information and the determination of the most dominant frequency was accomplished through a connected USB microphone, sampling audio at 44.1 kHz. Temporal synchronization between the data recorded from the heterogeneous sensing modalities is ensured through inter-process signaling on the data collecting system. The raw data is collected into a file containing electrical information, an audio file, and a CSV file containing the ambient measurements.

#### *3.4. Measurement Environment*

With the exception of large and immobile appliances (washing machine, dryer, refrigerator), measurements were collected in the same ambient conditions of an office environment. The remaining measurements were collected in-situ, i.e., in a kitchen (for the fridge) and the laundry room (for washing machine and dryer). The ambient sensors were placed directly on the DuT, oriented according to the expected maximum emission strength for each captured feature. As such, e.g., light sensors were placed in front of light emitting devices and magnetic flux density sensors were placed close to motors wherever possible. While such a placement may not be realistic for real-world deployments, note that the intention of our approach is to determine the importance of such features in the first place, for which information that is as detailed as possible is required. An example of the sensor placement for the mini oven is shown in Figure 1. Unless practically impossible, measurements were collected for full working cycles of devices. Each measurement was followed by a phase of appliance inactivity, in order to allow for sensor offset calibration. Data from devices with continuously variable power demand or without deterministic operation durations (e.g., computer monitors, lamps) were collected for two to five minutes. The only device measured for a shorter duration was the food hand mixer, as it could only be operated for up to one minute before requiring a cool-off period, according to its user manual.

**Figure 1.** Practical Setup of the Data Collection Test Bench. The Microphone is not Pictured, due to its Positioning Outside of the Image Boundaries.

At least two operational cycles were recorded for each DuT in order to permit cross-checking the recorded sensor data. All measurements were manually checked for correctness before storage, in order to ensure consistency regarding the collected data. On the rare occasion of obviously inconsistent data, the data collection was repeated, and the faulty data trace discarded. For devices that could be operated in different states (such as the hair dryer or fan), measurements were collected for each of the states individually, and treated as the same device during the evaluations.

#### *3.5. Data Postprocessing and Dataset Creation*

For the feature evaluation presented in Section 4, we only consider a simplified feature subset, consisting of either the maximum changes or a binary activation indication for the ambient features under evaluation. These simplified features were chosen because they could easily be determined locally on the low-power sensing device. Besides their fast computation, omitting raw data from collection also caters to user privacy protection [28].

Additionally, findings based on such simple features are also reproducible on systems using higher sampling rates. All three points were vital to allow the results of this study to be used as a guideline for a wide range of practical systems, some of which are expected to only provide low data resolutions.

To compute these features, each data collection period *Tcoll* was followed by an offset calibration phase *Tcal*. Both were designed to have approximately the same duration (i.e., 2–5 min, cf. Section 3.4). A clear delineation between both phases was easily possible due to the corresponding changes in electrical current consumption. Changes to the ambient humidity and temperature values were detected using additional temperature and humidity values, collected through a secondary measurement device in the room. During the manual evaluation of each trace, the steady-state value of each ambient sensor's readings was determined for *Tcal*, and used as the baseline value for the uninfluenced ambient readings. For each of the used sensor types, the difference between the values in *Tcoll* and the baseline is used as a feature in our analysis. In addition to the use of absolute values by which each parameter has changed, we also consider them in a binary form, according to the following rule set:


We would like to note that supplementally collected electrical features (voltage, current, power) were not translated into a binary form, given that voltage readings remained constant and current samples always showed variations during a device's operation. Instead, the electrical features used in this paper were chosen such that they represent electrical information already used during load monitoring, at a complexity similar to the considered ambient features. This enables a comparison between use cases only using ambient data and use cases combining ambient data with the already present electrical data. All electrical features are calculated during a 40 ms (i.e., two mains periods) long section of the measurement, selected such that the appliance's current consumption is maximal, and requiring that its value is identical in both successive mains periods. Based on the data from this excerpt, the RMS voltage (U) and current (I) as well as the active power (P) are calculated.
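As an illustration of the three electrical features, the sketch below computes the RMS voltage, RMS current, and active power for one 40 ms window sampled at 10 kHz. It assumes 50 Hz mains (implied by 40 ms covering two mains periods) and uses a synthetic, purely resistive load with a hypothetical 100 Ω resistance. The selection of the window (maximal current, identical value in both periods) is not implemented here, and the function names are illustrative rather than taken from the study.

```python
import math

SAMPLE_RATE_HZ = 10_000   # electrical sampling rate stated in the text
MAINS_HZ = 50             # assumption: 40 ms == two mains periods implies 50 Hz
WINDOW = 2 * SAMPLE_RATE_HZ // MAINS_HZ   # 400 samples, i.e., 40 ms


def rms(samples):
    """Root mean square of a sequence of samples."""
    return math.sqrt(sum(x * x for x in samples) / len(samples))


def electrical_features(voltage, current):
    """Return (U_rms, I_rms, P) for one window of simultaneous samples."""
    if not voltage or len(voltage) != len(current):
        raise ValueError("voltage and current need the same non-zero length")
    u_rms = rms(voltage)
    i_rms = rms(current)
    # Active power is the mean of the instantaneous power u(t) * i(t).
    active_power = sum(u * i for u, i in zip(voltage, current)) / len(voltage)
    return u_rms, i_rms, active_power


# Synthetic example: 230 V RMS sine wave across a hypothetical 100 ohm resistor.
times = [n / SAMPLE_RATE_HZ for n in range(WINDOW)]
voltage = [230 * math.sqrt(2) * math.sin(2 * math.pi * MAINS_HZ * t) for t in times]
current = [u / 100.0 for u in voltage]

u_rms, i_rms, active_power = electrical_features(voltage, current)
print(f"U = {u_rms:.1f} V, I = {i_rms:.2f} A, P = {active_power:.1f} W")
```

Because the window spans an integer number of mains periods, the RMS and mean-power estimates are not biased by a partially covered period.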

Based on the collected and post-processed data, we have generated four variations of the dataset to serve as the foundation for our evaluations. All datasets contain a total of 144 traces. Their details are given as follows.

• **Ambient Parameters; binarized (*Abin*)**: Changes in the sensor measurements were evaluated in a binary way in order to evaluate if the corresponding ambient characteristic is influenced by the device's activation. The resulting data consisted of a binary value for each sensor, which we use in our evaluation as indicators whether the DuT's operation had an impact on the corresponding characteristic.


Let us consider an example of the data collection and processing sequence for the mini oven appliance as follows. The mini oven was equipped with the sensors according to Figure 1. Sensor data was collected during the mini oven's operation twice, with sufficient time between measurements to allow the device to cool down. Raw environment sensor data for one of the measurement runs are plotted in Figure 2. The average sensor values during *Tcal* and from ambient measurements are then postprocessed to create the four aforementioned variants of the dataset. In the figure, the first 20 min of the sample, during which an electrical current flow was recorded, constitute *Tcoll*. The remaining 20 min or so of the collected trace constitute *Tcal*. The postprocessing is applied as described; a visualization of the process is included in Figure 3. The resulting entries for the four datasets introduced above are then computed. They are shown for reference in Table 3.
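The baseline-and-difference step of Section 3.5 can be sketched as follows. This is a minimal illustration rather than the processing code used in the study: the baseline is approximated by the mean of the *Tcal* readings (the study determined the steady-state value manually), and the binarization uses a hypothetical per-sensor `threshold`, since the actual rule set is not reproduced in this text. All sample values are invented for demonstration.

```python
from statistics import mean


def ambient_features(t_coll, t_cal, threshold):
    """Return (max_change, active) for one ambient sensor trace.

    t_coll    -- readings recorded while the appliance was operating
    t_cal     -- readings recorded during the offset calibration phase
    threshold -- hypothetical minimum absolute change counted as an influence
    """
    if not t_coll or not t_cal:
        raise ValueError("both phases need at least one reading")
    baseline = mean(t_cal)  # stand-in for the manually determined steady state
    # Signed deviation from the baseline with the largest magnitude.
    max_change = max((x - baseline for x in t_coll), key=abs)
    return max_change, abs(max_change) > threshold


# Invented temperature readings in degrees Celsius for a heating appliance.
change, active = ambient_features(
    t_coll=[22.5, 30.0, 41.5, 38.0],
    t_cal=[22.0, 22.1, 21.9, 22.0],
    threshold=1.0,
)
print(f"max change = {change:.1f}, binary feature = {active}")

# Invented light readings that barely move: the binary feature stays False.
flat_change, flat_active = ambient_features([100.0, 100.2], [100.0, 100.0], 1.0)
print(f"max change = {flat_change:.1f}, binary feature = {flat_active}")
```

Keeping the signed deviation preserves the direction of the change (e.g., heating versus cooling), while the binary flag only records whether the sensor was influenced at all.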

**Figure 2.** Data Collection Example, Displaying the Environmental Sensor Traces. Units According to Table 2, the End of *Tcoll* is Marked as a Dotted Grey Line.

**Figure 3.** Feature Extraction Processing Flow, Applied to Collected Raw Data.



#### **4. Data Evaluation Concept**

Our research objective is to assess how knowledge of the ambient conditions in an appliance's environment can support the recognition of electrical appliances. A total of eleven attributes are available for analysis (cf. Section 3): Eight ambient sensor attributes and three electrical quantities (voltage, current, and power). Instrumenting residential environments with sensing devices to capture all of these parameters, however, has several drawbacks. Besides the monetary costs for purchasing and installing sensors as well as ensuring their continuous operability, the continuous collection of data may be perceived as an intrusion into user privacy.

We hence conduct a methodological evaluation of how each of the sensed attributes impacts the appliance recognition rate, in order to determine the most information-rich subset of features.

We begin our evaluations with a determination of the importance of the contributions of each of the collected features when used to distinguish between the 21 appliance types listed in Table 4. In subsequent evaluations, however, we also present three evaluations considering the categorization of appliances into classes, as well as appliance recognition results when the appliance class is known a priori. Through this set of evaluations, guidelines on the best-suited feature subsets for different appliance recognition scenarios are derived.


**Table 4.** List of Considered Appliances and Associated Categories.

#### *4.1. Determining the Feature Importance for Appliance Recognition and Classification*

Determining the usefulness of features for classification purposes is a task that occurs across many research domains [29]. Considering the appliance recognition and classification case of this study, the usefulness of features is considered to allow cost-efficient data collection through the exclusion of features that carry little or no information. Additionally, feature selection methods allow for the comparison of the usefulness of features or subsets of features for different use cases. Note that the usefulness of a feature is highly specific to a given use case. For our contribution, we have chosen appliance recognition as a use case, i.e., the classification of appliances by their types, depending on the values of the available feature set. Appliance recognition is a typical classification use case from the field of energy data analysis: Based on a set of features, the single most likely appliance type should be returned. As follows, we assess the importance of the features we have described in Section 3.5 for the task at hand. Instead of conducting a single study on the general feature importance, however, we proceed in a more fine-grained fashion by considering several subsets of appliances (cf. Table 4). This way, we seek to provide a more detailed picture of the feature relevance for different use cases.

#### *4.2. Methodology for Determining the Distinctiveness of Features*

For the evaluations we conduct below, two pieces of information are of primary interest:


While the individual determination of a feature's importance helps in assessing to what extent each feature can reduce the chance of misclassification, it generally cannot identify the feature combination that leads to the best classification result overall. In order to find such combinations, the determination of an optimal subset is required. This feature subset considers which features work best together, indicating an ideal set of features to be used for the considered use cases [29]. When combined, both methods (individual feature relevance and best feature subset) allow for the development of better appliance monitoring systems.

#### 4.2.1. Feature Importance

The usefulness of each feature is determined through the usage of a Random Forest of Trees. A decision tree is a structure which successively divides the input data into subsets, such that the new subsets become more pure, i.e., features that lead to different output values become divisive elements [30]. In other words, the features that enable the cleanest division of input data into categories are considered the most important. In contrast to a simple decision tree, the Random Forest of Trees generates multiple trees for randomly selected subspaces of the total feature space. Only a subset of the input feature values is evaluated in each tree, and the resulting trees are then combined by averaging the determined probabilities. This ultimately allows for a greater classification accuracy as compared to a single decision tree [30].

The Gini Impurity is defined as the rate of misclassification when an additional decision element is added to an existing decision tree [31]. It is widely used for the feature selection in Random Forests of Trees in order to annotate each division of input data into new subsets with an importance score. Only the decision that yields the greatest reduction of the Gini impurity is maintained, which corresponds to a decrease in the probability of misclassification. The averaged Gini impurity scores are used as feature importance scores in our present study.
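For illustration, the sketch below uses the standard textbook formulation of the Gini impurity of a node, 1 − Σ p_k², where p_k is the fraction of samples of class k, and the weighted impurity decrease achieved by a split. It is not confirmed to be the exact formulation used in [31], and the function names and appliance labels are illustrative only.

```python
from collections import Counter


def gini_impurity(labels):
    """Gini impurity 1 - sum(p_k^2) of a list of class labels."""
    total = len(labels)
    if total == 0:
        return 0.0
    return 1.0 - sum((count / total) ** 2 for count in Counter(labels).values())


def impurity_decrease(parent, left, right):
    """Reduction in Gini impurity when `parent` is split into `left` and `right`."""
    total = len(parent)
    weighted_children = (
        len(left) / total * gini_impurity(left)
        + len(right) / total * gini_impurity(right)
    )
    return gini_impurity(parent) - weighted_children


parent = ["oven", "oven", "fan", "fan"]

# A split that separates both appliance types perfectly.
perfect = impurity_decrease(parent, ["oven", "oven"], ["fan", "fan"])

# A split that leaves both children as mixed as the parent.
useless = impurity_decrease(parent, ["oven", "fan"], ["oven", "fan"])

print(f"perfect split: {perfect:.2f}, useless split: {useless:.2f}")
```

In a random forest, a feature's importance is commonly obtained by accumulating such decreases over all splits that use the feature and averaging them across the trees.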

#### 4.2.2. Optimal Feature Subset

To confirm that the attained results can be generalized and to determine an optimal feature subset, we rely on Recursive Feature Elimination [29]. The algorithm starts with the full set of features and greedily excludes the least informative feature after each evaluation iteration. A ranking criterion is calculated for all features, and the feature with the lowest ranking criterion is eliminated. This process is repeated until the desired size of the feature subset is reached [32]. If the size of the optimal subset is unknown in advance, a performance rating for the trained classifier results can be introduced. For this study, the accuracy was chosen as a performance rating for the trained classifier, such that the subset with the greatest overall accuracy result is chosen as the optimal feature subset.
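The elimination loop described above can be sketched generically as follows. This is a simplified illustration, not the implementation used in the study: `rank_fn` and `score_fn` are hypothetical callables that would, in practice, retrain the classifier on the given subset and return per-feature importances and the accuracy, respectively. The toy importance values and the toy scoring rule are invented solely to make the example runnable.

```python
def recursive_feature_elimination(features, rank_fn, score_fn):
    """Greedy backward elimination that keeps the best-scoring subset.

    features -- iterable of feature names to start from
    rank_fn  -- callable(subset) -> dict mapping each feature to its importance
    score_fn -- callable(subset) -> performance rating (e.g., accuracy)
    """
    current = list(features)
    best_subset, best_score = list(current), score_fn(current)
    while len(current) > 1:
        ranking = rank_fn(current)
        # Drop the feature with the lowest ranking criterion.
        weakest = min(current, key=lambda name: ranking[name])
        current.remove(weakest)
        score = score_fn(current)
        if score >= best_score:  # on ties, prefer the smaller subset
            best_subset, best_score = list(current), score
    return best_subset, best_score


# Toy stand-ins (invented values, not results from the study).
TOY_IMPORTANCE = {"f1": 0.5, "f2": 0.3, "f3": 0.05, "f4": 0.01}


def toy_rank(subset):
    return {name: TOY_IMPORTANCE[name] for name in subset}


def toy_score(subset):
    # Each feature adds its importance but costs a fixed penalty of 0.03.
    return sum(TOY_IMPORTANCE[name] for name in subset) - 0.03 * len(subset)


best_subset, best_score = recursive_feature_elimination(
    TOY_IMPORTANCE, toy_rank, toy_score
)
print(best_subset, round(best_score, 2))
```

Tracking the best score across all iterations corresponds to the case in which the size of the optimal subset is not known in advance.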
