*4.4. Conducted Evaluations*

We evaluate the ambient features with respect to three general use cases:


#### **5. Evaluation of the Feature Importance**

In the following, we present the results of the different evaluation settings introduced in Section 4.4. Unless noted otherwise, we present the Cross-Validation Score (CVS) and optimal feature subset for each of the four sets of input data (cf. Section 3.5). The symbol notations for the sensed modalities introduced in Table 2 are used in the results.

#### *5.1. Evaluation Results for the Distinction between Devices on the Full Dataset*

Our first evaluation was conducted on the whole generated datasets and considered the device types as output. The results for the Random Forest of Trees method are displayed in Figure 4, whereas the results for the Recursive Feature Elimination are documented in Table 5. The bar graphs indicate the calculated feature importances: the y-axis denotes the feature, and the x-axis denotes the corresponding feature importance in percent.

**Figure 4.** Feature Importances for the Appliance Recognition Across the Whole Dataset.


**Table 5.** Cross-Validation Score (CVS) Values for the Evaluated Appliance Recognition Scenarios.

The Random Forest of Trees method indicates a high usefulness for the vibration and sound measurements. When binary change indicators are used, the magnetic flux density is deemed similarly important. The inclusion of electrical parameters slightly equalizes the importance levels, yet it leads to only slight improvements of the results compared to the use of ambient sensors alone. This result is confirmed by the Feature Elimination, which shows that the inclusion of electrical parameters allows for slightly improved Cross-Validation Scores, but the best feature subset still requires nearly all features. The UV and visible light emissions are only useful for a very small number of devices; the UV readings exclusively for lamps. While this indicates that the sensor type is not of interest for most evaluations, it also shows that there is potential to improve lamp recognition by means of a single sensor type. Furthermore, audio frequency features can be considered highly informative, as they are not only present for most devices but also distinctly different between them. The maximally observed sensor changes, as collected in *A*Δ and *AE*Δ, enable more fine-grained assessments of the impact of observed signal changes. While subsequent measurements on the same device generally did not exhibit strong variations, the differences in ambient influences between two devices of the same type can be high. For example, measurements from two different microwaves resulted in consistent dominant audio frequencies for each of the devices across measurements. Nonetheless, one microwave exhibited a dominant frequency of 172 Hz, whereas the other had a dominant frequency of 344 Hz.
This indicates that setups evaluating ambient data need to consider how similar devices of the same type are and whether multiple devices of the same type are present. Such findings must be considered during training, as they indicate that training done in one household may not be transferable to other environments.
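To illustrate what a dominant audio frequency means in this context, the following minimal sketch determines the strongest non-DC bin of a naive discrete Fourier transform. It is not the audio processing used in this study: the signals are synthetic pure tones, and the sample rate and window length are hypothetical values chosen so that 172 Hz and 344 Hz fall exactly on DFT bins.

```python
import cmath
import math


def dominant_frequency(samples, sample_rate):
    """Return the frequency (Hz) of the strongest non-DC bin of a naive DFT."""
    n = len(samples)
    best_bin, best_magnitude = 0, 0.0
    # Only bins 1 .. n/2 are inspected: bin 0 is the DC offset, and the upper
    # half of the spectrum mirrors the lower half for real-valued input.
    for k in range(1, n // 2 + 1):
        coefficient = sum(
            x * cmath.exp(-2j * math.pi * k * i / n) for i, x in enumerate(samples)
        )
        if abs(coefficient) > best_magnitude:
            best_bin, best_magnitude = k, abs(coefficient)
    return best_bin * sample_rate / n


def synthetic_hum(frequency, sample_rate, n):
    """Hypothetical stand-in for a microphone recording: one pure tone."""
    return [math.sin(2 * math.pi * frequency * i / sample_rate) for i in range(n)]


SAMPLE_RATE = 1376  # Hz, assumed; gives a bin width of 1376 / 128 = 10.75 Hz
N = 128             # samples per window, assumed

microwave_a = dominant_frequency(synthetic_hum(172, SAMPLE_RATE, N), SAMPLE_RATE)
microwave_b = dominant_frequency(synthetic_hum(344, SAMPLE_RATE, N), SAMPLE_RATE)
```

Under these assumptions, two devices of the same type yield clearly separated feature values, so a model trained on only one of them would have no basis for recognizing the other.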

#### *5.2. Evaluation Results for Device Categories*

The following evaluations were conducted such that, based on the ambient measurements, each DuT was classified into its corresponding class for each categorization presented in Section 4.3. Each presented evaluation provides the feature importance considering the Random Forest of Trees method and the optimal feature subset determined through the Recursive Feature Elimination, as well as the CVS achieved with the optimal subset.
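The interplay between feature elimination and the CVS can be sketched as follows. This is a simplified stand-in, not the procedure used in this study: it performs greedy backward elimination scored by cross-validation rather than Recursive Feature Elimination proper, it uses a nearest-centroid classifier instead of the estimator underlying our results, and all data, feature names, and device labels are hypothetical.

```python
def nearest_centroid_predict(train_x, train_y, sample):
    """Predict the label whose class centroid is closest to the sample."""
    sums, counts = {}, {}
    for row, label in zip(train_x, train_y):
        acc = sums.setdefault(label, [0.0] * len(row))
        for j, value in enumerate(row):
            acc[j] += value
        counts[label] = counts.get(label, 0) + 1

    def distance(label):
        centroid = [s / counts[label] for s in sums[label]]
        return sum((c - v) ** 2 for c, v in zip(centroid, sample))

    return min(sorted(sums), key=distance)


def cross_validation_score(x, y, columns, folds=3):
    """Mean accuracy over `folds` interleaved folds using only `columns`."""
    reduced = [[row[c] for c in columns] for row in x]
    accuracies = []
    for fold in range(folds):
        test_idx = [i for i in range(len(x)) if i % folds == fold]
        train_idx = [i for i in range(len(x)) if i % folds != fold]
        train_x = [reduced[i] for i in train_idx]
        train_y = [y[i] for i in train_idx]
        hits = sum(
            nearest_centroid_predict(train_x, train_y, reduced[i]) == y[i]
            for i in test_idx
        )
        accuracies.append(hits / len(test_idx))
    return sum(accuracies) / folds


def backward_elimination(x, y, names):
    """Drop one feature at a time, keeping the best-scoring subset seen."""
    current = list(range(len(names)))
    best_subset, best_score = current[:], cross_validation_score(x, y, current)
    while len(current) > 1:
        # Try removing each remaining feature and keep the best reduction.
        candidates = [[c for c in current if c != drop] for drop in current]
        current = max(candidates, key=lambda cols: cross_validation_score(x, y, cols))
        score = cross_validation_score(x, y, current)
        if score >= best_score:  # prefer the smaller subset on ties
            best_subset, best_score = current[:], score
    return [names[c] for c in best_subset], best_score


# Hypothetical data: only "sound" separates all three device types.
NAMES = ["distractor", "sound", "vibration", "uv"]
BASES = {"fan": (5.0, 1.0), "kettle": (10.0, 0.0), "toaster": (0.0, 0.0)}
X, Y = [], []
for label in sorted(BASES):
    sound, vibration = BASES[label]
    for j in range(6):
        jitter = 0.1 * (j % 2)
        distractor = 20.0 * ((len(X) * 7) % 5)  # unrelated to the device type
        X.append([distractor, sound + jitter, vibration + jitter, 0.0])
        Y.append(label)

best_subset, best_score = backward_elimination(X, Y, NAMES)
```

In this toy setting the search ends with the single feature `sound`, because it alone separates the three hypothetical devices. On the measured data, by contrast, the optimal subsets reported below often contain most of the sensors.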

#### 5.2.1. Classification According to the Number of States

In this evaluation each measurement was classified as belonging to either a Single-State, Multi-State or Infinite-State appliance. The results are listed in Figure 5. Both evaluated algorithms show a high importance of audio and vibration features for the classification into Single-State, Multi-State and Infinite-State Appliances. The binary evaluation achieves a better Cross-Validation Score, indicating that the simple presence of sound or vibrations could provide relevant contextual information for this classification. However, both CVS values are low, such that a classification based on the ambient features alone may not be fruitful.

**Figure 5.** Feature Importances (%) and Best Feature Subset for the Classification into Single-State, Multi-State and Infinite-State Devices—Considering Only Ambient Features.

#### 5.2.2. Classification According to Electrical Power Consumption

This evaluation considers the feature usefulness to determine if a set of ambient measurements belongs to a large consumer or small consumer appliance. Its results are given in Figure 6. The evaluation according to consumption ranks the temperature and humidity sensors as most relevant for this distinction. This is unsurprising, as appliances designed to significantly heat or cool the environment generally exhibit a high power consumption. However, the Feature Elimination furthermore reveals that the UV radiation is a relevant feature for this distinction, which is reasonable considering that only the light installations under evaluation emitted UV radiation, all of which belong to the class of small consumers.

**Figure 6.** Feature Importances (%) and Best Feature Subset for the Classification into Small Consumers and Large Consumers—Considering Only Ambient Features.

#### 5.2.3. Classification According to Load Curves

The evaluation considering load types assesses feature usefulness for determining whether measurements were taken from an ohmic, inductive, or SMPS appliance. The results are contained in Figure 7. To classify devices according to their load curves, the Feature Elimination indicates that maximum changes are more effective. However, the results for the two applied methods differ considerably. This indicates that the combination of features is considerably more informative than individual features. The similar usefulness of most features determined through the Random Forest of Trees matches this finding. However, it must be noted that the CVS for the binary evaluation is very low, indicating that this evaluation is not well-suited for a classification into load types.

**Figure 7.** Feature Importances (%) and Best Feature Subset for the Classification into Ohmic, Inductive, and Switched-Mode Power Supply Devices—Considering Only Ambient Features.

#### *5.3. Device Type Distinction within Different Device Categories*

The following evaluations consider if and to which degree ambient information is informative if a device class is already known, but the instances of the device type within the given class shall be distinguished. Such an evaluation allows us to identify the device classes whose instances can be distinguished algorithmically. In contrast to the classification task considered in the previous section, the evaluations were not run on the full datasets with a target output from the set of device classes. Instead, we have run them on subsets of the full data, divided such that only one class of devices is part of the subset. The results are recorded in Tables 6 and 7 and presented in the following subsections.
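The subset construction described above can be sketched as follows, under stated assumptions: the device names, feature vectors, and device-to-class mapping are hypothetical placeholders (the actual class assignment is given in Table 4), and `split_by_class` is an illustrative helper rather than part of our evaluation code.

```python
from collections import defaultdict

# Hypothetical device-to-class mapping; the actual assignment is in Table 4.
CONSUMPTION_CLASS = {
    "kettle": "large",
    "oven": "large",
    "lamp": "small",
    "router": "small",
}

# Hypothetical measurements: (ambient feature vector, device type).
MEASUREMENTS = [
    ([3.1, 0.0], "kettle"),
    ([4.0, 0.2], "oven"),
    ([0.1, 0.0], "lamp"),
    ([0.0, 0.1], "router"),
    ([2.9, 0.1], "kettle"),
]


def split_by_class(measurements, class_of):
    """Group measurements by device class; the device type stays the target."""
    subsets = defaultdict(list)
    for features, device_type in measurements:
        subsets[class_of[device_type]].append((features, device_type))
    return dict(subsets)


subsets = split_by_class(MEASUREMENTS, CONSUMPTION_CLASS)
# Device types that remain to be distinguished within each class.
targets_per_class = {
    name: sorted({device for _, device in rows}) for name, rows in subsets.items()
}
```

Each subset is then evaluated on its own, with the device type as the target output instead of the device class.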


**Table 6.** CVS Values for the Evaluated Appliance Distinction Scenarios—Binary Features.

**Table 7.** CVS Values for the Evaluated Appliance Distinction Scenarios—Maximum Feature Changes.


#### 5.3.1. Distinguishing between Devices Sharing the Same Number of States

Recall that the device class categorization used in this paper divides the set of considered appliances into Single-State, Multi-State and Infinite-State appliances. Again, we have computed the best feature subset within each of these categories and ranked the feature importances. The Feature Elimination results can be found in Tables 6 and 7, in the first to third row, while the results for the Random Forest of Trees are depicted in Figure 8.

**Figure 8.** Feature Importances (%) for Recognition of Appliances of a Single Class—Appliance Categorization By: Number of States.

Concerning the recognition of device types from measurements of either only Single-State or only Multi-State Appliances, a rather large set of ambient measurements is required for best results. None of the features are particularly indicative of a specific appliance when considered on its own. However, the visible light and UV radiation are not present in Multi-State appliances, as none of the considered Multi-State devices emitted light. The distinction of Infinite-State devices differs as their distinction reaches the maximal Cross-Validation Score, in the case of maximum change evaluations, with a small dataset which only contains changes in temperature, humidity, and vibration. Sound emissions were identified as similarly important through the Random Forest of Trees method. The inclusion of electrical features was found to be useful for all three intra-category distinctions (as visible in the columns on the right-hand side of the tables).

#### 5.3.2. Distinguishing between Devices of the Same Consumption Class

For the following evaluations, the datasets of ambient features were divided to create datasets that only contain measurements from either large or small consumers. The resulting datasets were evaluated to assess to what extent the device type of a measurement can be determined based on the data. The Feature Elimination results are documented in Tables 6 and 7 in the fourth and fifth row, and the results for the Random Forest of Trees are visualized in Figure 9.

**Figure 9.** Feature Importances (%) for Recognition of Appliances of a Single Class—Appliance Categorization By: Power Consumption.

Considering the large consumers (see Table 4), most appliances include a heating or cooling element, and several of them a motor, all of which can be expected to generate magnetic fields, vibrations, and sounds. Accordingly, both feature selection methods identify the magnetic flux density, vibrations, and audio features as important. The presence of IR radiation is furthermore shown to be distinctive. Small consumers show low Cross-Validation Scores when only the presence of emissions is considered; however, the maximum change evaluation yields adequate scores. Two findings are of special interest. First, the maximum change evaluation reaches better Cross-Validation Scores for feature subsets that include only ambient features. Second, the feature subsets are quite large, and the majority of sensors appear in the subsets for both the small-consumer and the large-consumer recognition.

#### 5.3.3. Distinguishing between Devices of the Same Load Type

To evaluate the usefulness of ambient feature information for recognizing appliances that belong to a certain load type, the whole datasets were split so that each subset only contains measurements of devices belonging to one load type. Each subset was then evaluated such that the device type should be distinguished for each measurement. The Feature Elimination results are documented in Tables 6 and 7 in the sixth to ninth row, whereas the results for the Random Forest of Trees are depicted in Figure 10.

While the intra-class distinction for inductive and SMPS devices generates small optimal feature subsets for *AE*Δ with three or fewer features, the distinction of ohmic appliances always requires larger subsets. We acknowledge that our results for SMPS devices might be biased, given that many of these devices emitted light; the great importance of the LDR is thus not surprising. Still, the differences between ohmic and inductive appliances indicate that systems using features best-suited for ohmic devices could gain additional information to ease the distinction of inductive devices with low additional data requirements.

**Figure 10.** Feature Importances (%) for Recognition of Appliances of a Single Class—Appliance Categorization By: Load Curve.

#### *5.4. Interpretation and Discussion*

We evaluated a set of features calculated from ambient sensor readings considering their usefulness and importances for different decision scenarios. All selected and evaluated features were shown to be relevant for at least one of the considered evaluations. However, the UV radiation was found to ease the distinction of different lamps, but could not be detected for any other appliance. Accordingly, its usefulness is restricted to scenarios involving such appliances.

To enable further consideration of the usefulness of individual features, we have accumulated how often each feature was part of an optimal feature subset. The sums are displayed in Table 8 and reconfirm the restricted usage potentials for the UV measurements. It can be furthermore seen that the audio and vibration data have an overall high importance, and were marked especially useful if only binary ambient information is available (i.e., if operating devices leads to the presence of acoustic signals or vibrations). A comparably high usefulness can be attributed to the temperature, magnetic flux density, and humidity measurements; they are included in more than half of the optimal feature subsets. Their importance is even higher when only ambient features are available for evaluation. This evaluation of emissions and feature selection methods illustrates that multiple ambient features might be necessary to properly distinguish electrical devices.
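The accumulation described above amounts to a simple count over the optimal subsets. The following sketch uses hypothetical scenario names and subsets that only illustrate the bookkeeping; they are not the values behind Table 8.

```python
from collections import Counter

# Hypothetical optimal subsets per evaluation scenario (not the Table 8 data).
OPTIMAL_SUBSETS = {
    "device types, full dataset": {"audio", "vibration", "temperature", "humidity"},
    "number of states": {"audio", "vibration"},
    "power consumption": {"temperature", "humidity", "uv"},
    "load type": {"audio", "magnetic flux density", "temperature"},
}

# How often each feature was part of an optimal subset.
inclusion_counts = Counter(
    feature for subset in OPTIMAL_SUBSETS.values() for feature in subset
)

# Features that appear in more than half of the optimal subsets.
frequent = sorted(
    feature
    for feature, count in inclusion_counts.items()
    if count > len(OPTIMAL_SUBSETS) / 2
)
```

With these placeholder subsets, `frequent` contains `audio` and `temperature`, while `uv` is counted only once.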


**Table 8.** Number of Times each Feature was included in the Optimal Feature Subset.

Considering the usage of electrical and ambient features, our results show that the inclusion of electrical features allows distinction to be achieved with fewer ambient sensors and generally yields higher Cross-Validation Scores. This indicates that systems integrating ambient information in the decision process should consider electrical and ambient features at the same time rather than within separate decision mechanisms.

During the evaluation of the usefulness for different scenarios, this work also considered the possibility of using ambient sensor data to assign device classes. The classifications specific to the appliance's internal workings, based on the number of appliance states (Single-State, Multi-State, or Infinite-State appliances) or on the load curves (ohmic, inductive, or SMPS), achieved low Cross-Validation Scores. In contrast, the distinction between smaller and larger consumers was shown to be feasible based on the easy-to-calculate features evaluated in this study from a small set of ambient sensors. Considering the classification according to load types or number of states, the use of binary features (indicating the change of an ambient parameter) is outperformed by the use of amplitude information from the audio and vibration sensors. As other studies have shown the feasible usage of more complex audio features (e.g., [5,14]), the additional information gain through such features could be investigated in future work to improve the results for the usage of ambient data for appliance classification even further.

Lastly, we conducted evaluations on two kinds of features: a binary change evaluation and an evaluation of the maximum changes of sensor values. Considering the results, we have observed that binary features of ambient influences typically yield lower Cross-Validation Scores and often contain more features in the resulting optimal feature subsets. The usage of maximum change information resulted in a rise of 15.3 percentage points in the CVS, with a maximum improvement of nearly 30 percentage points for appliance recognition scenarios using only ambient information. A further increase of 5.1 percentage points in CVS could be observed when including electrical information in appliance recognition scenarios.
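The difference between the two kinds of features can be illustrated with a minimal sketch. It assumes that the maximum change is the largest absolute deviation of the readings during operation from the mean idle reading, and that the binary feature marks whether this change exceeds a noise threshold. Both definitions, the threshold, and the sensor readings are assumptions made for illustration and may differ from the feature extraction used in this study.

```python
def maximum_change(idle, active):
    """Largest absolute deviation of the active readings from the idle mean."""
    baseline = sum(idle) / len(idle)
    return max(abs(value - baseline) for value in active)


def binary_change(idle, active, threshold):
    """1 if the maximum change exceeds the (assumed) noise threshold, else 0."""
    return int(maximum_change(idle, active) > threshold)


# Hypothetical readings before (idle) and during (active) appliance operation.
READINGS = {
    "temperature": ([21.0, 21.0, 21.0], [21.5, 24.0, 23.0]),
    "humidity": ([40.0, 40.0], [40.0, 40.25]),
}
THRESHOLD = 0.5  # assumed noise floor, in the unit of the respective sensor

max_features = {name: maximum_change(*pair) for name, pair in READINGS.items()}
binary_features = {
    name: binary_change(*pair, THRESHOLD) for name, pair in READINGS.items()
}
```

The binary representation collapses every change above the threshold to the same value, whereas the maximum change retains the magnitude, which is consistent with the higher scores observed for the latter.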

We expect the results of this work to be a useful guideline for the creation of future energy data collection systems. Our careful analysis of the information gains of sensor types beyond the traditionally used electrical signals showcases the great potential of using ambient information in conjunction with electrical data. We are convinced that the analyzed scenarios regarding certain classes of devices and the usefulness of ambient sensor types can help developers to advance and improve existing systems and algorithms based on our findings.

#### **6. Conclusions and Outlook**

In this work, we have conducted an evaluation of ambient sensor data and its possible uses in the context of electrical appliance recognition and load signature analysis in general. Based on two feature selection and ranking methods, we have demonstrated that sensor data for temperature, humidity, audio, vibrations, the magnetic flux density, and IR radiation are useful for nearly any load monitoring use case. It could additionally be shown that a single measure for the change of ambient features during an appliance's operation is often sufficient to reach decent Cross-Validation Scores, indicating that such a data collection with low resource usage can already improve load monitoring systems. By considering different evaluation scenarios, we were able to show that the combination of electrical and ambient sensor data provides the greatest benefit. We would like to reiterate at this point that our work does not primarily contribute to the field of NILM, i.e., the disaggregation of aggregate load data. Rather, we have methodologically determined the most information-rich sensor parameters that can improve energy data analysis methods (such as NILM) in a more general and algorithm-independent way.

**Author Contributions:** Conceptualization, J.H., M.A.T., A.R.; Data curation, M.A.T., J.H.; Formal analysis, J.H., M.A.T.; Funding acquisition, A.R.; Investigation, M.A.T., J.H.; Methodology, M.A.T., A.R., J.H.; Software, M.A.T.; Supervision, A.R., J.H.; Validation, M.A.T., J.H.; Visualization, J.H., A.R.; Writing—original draft, M.A.T.; Writing—review & editing, J.H., A.R. All authors have read and agreed to the publication of this version of the manuscript.

**Funding:** This work was supported by Deutsche Forschungsgemeinschaft grant no. RE 3857/2-1.

**Institutional Review Board Statement:** Not applicable.

**Informed Consent Statement:** Not applicable.

**Conflicts of Interest:** The authors declare no conflict of interest.
