#### *2.2. High-Frequency Datasets*

The following high-frequency datasets were analyzed and compared (Table 3): REDD (Reference Energy Disaggregation dataset), BLUED (Building-Level fUlly-labeled dataset for Electricity Disaggregation), PLAID (Plug Load Appliance Identification Dataset), HFED (High-Frequency Energy Data), UK-DALE (United Kingdom recording Domestic Appliance-Level Electricity), COOLL (Controlled On/Off Loads Library), SusDataED (Sustainable Data for Energy Disaggregation), WHITED (Worldwide Household and Industry Transient Energy Dataset), BLOND (Building-Level Office enviroNment Dataset), and SynD (Synthetic energy Dataset).

REDD [27] is a residential dataset intended for research on disaggregation methods. REDD contains measurements from 6 different houses obtained over several months. The house input AC mains voltage and aggregated current are monitored at a sample rate of 15 kHz. Furthermore, the voltages and currents at individual circuits are monitored at a sample rate of 0.5 Hz, and plug-level monitors at a sample rate of 1 Hz. Similar to several of the datasets analyzed here, REDD provides ground truth data by presenting energy samples of individual appliances (monitored at plug-level) and of subsets (monitored at circuit level) of the total load.

Similarly, BLUED [28] is a dataset obtained from a single-family residence. This dataset registers the AC mains voltage and aggregated current. The sampling rate is 12 kHz, and the measurements were performed for 1 week. Every state transition of the 43 appliances is labeled and time-stamped, providing ground truth for event detection algorithms.

PLAID [29] is a public and crowd-sourced dataset consisting of one-second voltage and current waveforms for different residential appliances. The goal of this dataset is to provide a public library for high-frequency (30 kHz) measurements that can be integrated into existing or novel appliance identification algorithms. PLAID currently contains measurements for more than 200 different appliances, grouped into 11 appliance classes, and totaling over a thousand records.

UK-DALE [16] is a publicly available dataset comprising records from 5 different houses. It contains AC mains voltage and aggregated current, as well as voltage and current of individual loads, thus providing ground truth for training and testing disaggregation algorithms. The sampling rate is 16 kHz for the house input, while the individual sensors are sampled every 6 s. The dataset contains more than 4 years of data and is continuously updated.

HFED [30] is a high-frequency Electromagnetic Interference (EMI) dataset comprising high-frequency measurements of EMI, emanated from electronic appliances, propagated through the power infrastructure, and measured at a single point. HFED includes 24 appliances connected over four different test setups (in lab settings and one test setup in home settings). EMI measurements are taken over a frequency range of 10 kHz to 5 MHz.

COOLL [31] is a publicly available home appliance dataset containing 42 appliances grouped into 12 classes. The AC mains voltage and current are monitored for each appliance at a sample rate of 100 kHz for 6 s, which includes turn-ON and turn-OFF transients. For each appliance, there are 20 measurements on different power-on angles of the mains cycle. Each appliance is measured individually; hence, there is no aggregated current data registered in the dataset.

SusDataED [32] is an extended version of the dataset SusData [33]. This dataset is composed of measurements taken from a single-family residence in Portugal. Samples of 17 distinct appliances were taken at a sampling rate of 12.8 kHz for ten days.

WHITED [34] is a dataset of appliance measurements from several locations (households and small industries) around the world. The voltage and current waveforms are recorded for the first 5 s of each appliance start-up, for 110 different appliances amounting to 47 different appliance types. This dataset aims to provide a broad spectrum of appliance types from different regions around the world.

BLOND [15] is a dataset with waveforms collected at a typical office building in Germany. It is a fully labeled ground-truth dataset, with 53 appliances distributed over 16 classes of devices, sampled at 50 kHz over 213 days.

SynD [35] is a synthetic dataset composed of residential loads. This dataset is the result of a 180-day custom simulation of a residential environment that relies on power traces of real household appliances. SynD is composed of measurements taken from 21 appliances in Austria, with a sampling rate of 5 Hz.

Table 3 shows a comparison between these high-frequency NILM datasets. It includes information on the environment (whether data was collected in a Residential, Commercial, or Industrial environment); the Duration of the period of Data Collection (DCD); whether the dataset includes scenarios of Multiple Simultaneous Loads (MSL); the sampling frequency (*fs*); whether Ground Truth is recorded, either as recordings of current/power of individual loads or as recordings of events (at a given Load Event Resolution, LER); the Number of Appliance Classes (NoC); and the Number of Appliances (NoA).


**Table 3.** Comparison between high-frequency NILM Datasets.

#### *2.3. Evaluation of Datasets*

The analysis of the datasets presented above, both high-frequency and low-frequency, indicates that: (1) the majority of NILM datasets contain data collected in a residential environment; (2) the majority of high-frequency datasets register 200 or more samples per mains cycle, a notable exception being SynD, whose sampling frequency is 5 Hz; (3) the majority of the datasets register multiple simultaneous loads. Concerning the unique characteristics of each dataset, it can be observed that: (1) the highest sampling frequency is used by COOLL (100 kHz); (2) while most low-frequency datasets do not provide ground-truth information, the high-frequency datasets provide ground truth by recording samples for individual loads at a much lower rate (typically below 1 Hz).
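
The samples-per-cycle observation can be checked with a short calculation. The sketch below uses the sampling rates quoted in Section 2.2; the mains frequency assigned to each dataset is an assumption based on the country of recording and is not stated in the text. WHITED and HFED are left out because this section gives no waveform sampling rate for them.

```python
# Sampling rates (Hz) are taken from Section 2.2.
# Mains frequencies (Hz) are ASSUMED from the country of recording.
DATASETS = {
    "REDD": (15_000.0, 60.0),
    "BLUED": (12_000.0, 60.0),
    "PLAID": (30_000.0, 60.0),
    "UK-DALE": (16_000.0, 50.0),
    "COOLL": (100_000.0, 50.0),
    "SusDataED": (12_800.0, 50.0),
    "BLOND": (50_000.0, 50.0),
    "SynD": (5.0, 50.0),
}


def samples_per_cycle(fs_hz: float, mains_hz: float) -> float:
    """Number of samples recorded during one mains cycle."""
    return fs_hz / mains_hz


per_cycle = {name: samples_per_cycle(fs, f) for name, (fs, f) in DATASETS.items()}

# Datasets that register 200 or more samples per mains cycle.
at_least_200 = [name for name, n in per_cycle.items() if n >= 200]

for name, n in per_cycle.items():
    print(f"{name:10s} {n:8.1f} samples/cycle")
```

Under these assumptions, every listed dataset except SynD reaches the 200-samples-per-cycle mark.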

#### *2.4. Tools for NILM Datasets*

The NILM Toolkit (NILMTK) [36] is an open-source toolkit designed to allow the comparison between NILM algorithms. It provides a Python API that operates on input and output binary files, therefore facilitating compatibility with data from NILM datasets. The input files used by NILMTK must be converted to the NILMTK-DF (data format), which is a data structure inspired by the REDD dataset, comprising disaggregated power data (i.e., separate sample sets for each of the loads in a dataset) as well as metadata annotations about the sample set.

#### **3. The Design of a Novel Dataset**

Since none of the evaluated datasets had all the characteristics required for our research project, a new dataset was developed, with requirements elicitation as the first activity.

The LIT-Dataset is composed of three subsets: Synthetic, Simulated, and Natural. The Synthetic subset is obtained by programmable power sequencing of a given set of loads in a controllable laboratory setup, so that repeatable scenarios can be obtained. In the Simulated subset, data is collected by simulating a circuit operation, which allows testing different scenarios and controlling parameters in ways that would otherwise be impossible or unsafe. The Natural subset is composed of voltage and current samples collected in a real-world uncontrolled environment; furthermore, apart from recording the aggregated current and the AC mains voltage, power sensors monitoring each load identify and record when each load event occurs.

Concerning the taxonomy presented in [17], the LIT-Dataset is an Aggregated Level dataset whose main application is Energy Disaggregation but is also applicable to energy saving, appliance recognition, and anomaly detection.

One of the requirements of the LIT-Dataset is that it includes multiple loads, as a NILM system must identify the loads that compose an aggregated current signal. Another requirement is that it must include precise indications of every load event (load on and load off), with a resolution better than one mains cycle, and have a high sample rate.

The Stakeholder requirements of the LIT-Dataset are based on the needs of the authors' NILM project, as well as on the requirements common to other NILM datasets. The LIT-Dataset Stakeholder requirements are listed below, as well as the rationale for each requirement:

DSReq 1. Data collection from loads connected to a single-phase 127 V, 60 Hz mains (the Brazilian power grid standard).

> R: Due to power grid availability in our lab. Considering that 127 V, 60 Hz, is a standard used in many countries around the world, such a requirement does not restrict the usage of the LIT-Dataset elsewhere.


 a variety types so systems can be evaluated and compared over the range of loads available in the real-world.


R: A high-frequency NILM dataset can be used by NILM algorithms that evaluate the waveform of the current in each mains cycle to determine accurately the occurrence of load events. Ground-truth indications of such events with an accuracy better than one mains semicycle provide information to validate such algorithms. 5 ms is a typical switching time for relays used to energize the loads of a dataset.

Remark: concerning this requirement, accuracy is the measure of the error between the instant when the actual load event occurred and the instant when the event is reported (labeled).
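
The accuracy criterion in the remark above can be illustrated with a minimal sketch. It assumes that the actual and labeled event instants are available as timestamps in seconds; the function names are hypothetical and do not come from the LIT-Dataset tooling.

```python
def half_cycle_s(mains_hz: float) -> float:
    """Duration of one mains semicycle, in seconds."""
    return 1.0 / (2.0 * mains_hz)


def label_error_s(actual_event_s: float, labeled_event_s: float) -> float:
    """Absolute error between the actual and the labeled event instants."""
    return abs(labeled_event_s - actual_event_s)


def meets_accuracy(actual_event_s: float, labeled_event_s: float,
                   mains_hz: float = 60.0) -> bool:
    """True if the labeling error is smaller than one mains semicycle."""
    return label_error_s(actual_event_s, labeled_event_s) < half_cycle_s(mains_hz)


# A label that lags the actual event by 5 ms (a typical relay switching time)
# still falls within one 60 Hz semicycle; a 10 ms lag does not.
print(meets_accuracy(1.000, 1.005))
print(meets_accuracy(1.000, 1.010))
```

At 60 Hz one semicycle lasts about 8.33 ms, so a label delayed by a 5 ms relay switching time satisfies the criterion.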

DSReq 6. The minimum sampling rate is 15,360 Hz, corresponding to 256 samples along one mains cycle.

> R: In high-frequency datasets, there is a trade-off between sampling frequency and storage requirements. Based on the analysis of datasets with sampling frequencies up to 100 kHz, the spectral densities of frequencies above 5 kHz in the aggregated signal, and the waveforms reconstructed from samples at 256 samples per cycle, this sampling rate was determined as an adequate trade-off selection.
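
The trade-off between sampling frequency and storage can be made concrete with a back-of-the-envelope calculation. The sketch below takes the mains frequency from DSReq 1 and the samples per cycle from DSReq 6; the number of channels and the sample width are illustrative assumptions, not properties stated for the LIT-Dataset.

```python
MAINS_HZ = 60            # mains frequency, from DSReq 1
SAMPLES_PER_CYCLE = 256  # samples per mains cycle, from DSReq 6

# ASSUMPTIONS for illustration only (not stated in the text):
CHANNELS = 2             # one voltage channel and one current channel
BYTES_PER_SAMPLE = 2     # 16-bit raw samples, no compression

SECONDS_PER_DAY = 86_400


def sampling_rate_hz(samples_per_cycle: int, mains_hz: int) -> int:
    """Sampling rate needed for a given number of samples per mains cycle."""
    return samples_per_cycle * mains_hz


def bytes_per_day(fs_hz: int, channels: int = CHANNELS,
                  bytes_per_sample: int = BYTES_PER_SAMPLE) -> int:
    """Raw storage needed for one day of continuous recording."""
    return fs_hz * channels * bytes_per_sample * SECONDS_PER_DAY


fs_min = sampling_rate_hz(SAMPLES_PER_CYCLE, MAINS_HZ)
print(f"minimum sampling rate: {fs_min} Hz")
print(f"storage at {fs_min} Hz: {bytes_per_day(fs_min) / 1e9:.2f} GB/day")
print(f"storage at 100000 Hz: {bytes_per_day(100_000) / 1e9:.2f} GB/day")
```

Under these assumptions, a day of raw data at 15,360 Hz takes roughly 5.3 GB, against roughly 34.6 GB at 100 kHz.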

DSReq 7. Recordings over a mix of loads so that low-power load events (<5 W) occur while high-power loads (>800 W) are energized.

> R: Switching a low-power load when high-power loads are energized poses a challenging scenario for NILM systems; hence, the LIT-Dataset should include such scenarios for evaluation of these systems.
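
One way to verify that a recording contains such scenarios is to scan its list of labeled load events. The following is a minimal sketch: the `LoadEvent` record and the `challenging_events` helper are hypothetical names introduced here, and the example values are made up for illustration; only the 5 W and 800 W thresholds come from DSReq 7.

```python
from dataclasses import dataclass


@dataclass
class LoadEvent:
    """Hypothetical event record: one ON or OFF transition of a load."""
    time_s: float
    load: str
    power_w: float
    turned_on: bool


def challenging_events(events, low_w=5.0, high_w=800.0):
    """Return low-power events that occur while a high-power load is energized."""
    energized = {}  # load name -> nominal power of currently energized loads
    found = []
    for ev in sorted(events, key=lambda e: e.time_s):
        high_power_on = any(p > high_w for p in energized.values())
        if ev.power_w < low_w and high_power_on:
            found.append(ev)
        # Update the set of energized loads after classifying the event.
        if ev.turned_on:
            energized[ev.load] = ev.power_w
        else:
            energized.pop(ev.load, None)
    return found


# Made-up example: a 4 W lamp switches while a 1000 W heater is on,
# and once more after the heater has been switched off.
events = [
    LoadEvent(1.0, "heater", 1000.0, True),
    LoadEvent(2.0, "lamp", 4.0, True),
    LoadEvent(3.0, "lamp", 4.0, False),
    LoadEvent(4.0, "heater", 1000.0, False),
    LoadEvent(5.0, "lamp", 4.0, True),
]
hits = challenging_events(events)
print([(ev.time_s, ev.load) for ev in hits])
```

Only the two lamp events at 2.0 s and 3.0 s are reported, since the last one occurs after the heater is de-energized.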

For the Synthetic subset:

DSReqSy 1. Synthetic load shaping of up to eight concurrent loads.

> R: As a NILM system must disaggregate loads, a dataset should have aggregated data collected from loads energized concurrently. As there is a trade-off between cost/complexity of the data collecting infrastructure and the number of concurrent loads, eight loads were selected as an adequate trade-off.

DSReqSy 2. The duration of each recording must be longer than 10 seconds and must include at least one power-ON and one power-OFF event.

> R: By examining the data from other datasets, 10 s was determined as a sufficient duration so that the stable periods occur between transient periods due to power-ON and power-OFF.

For the Simulated subset:

DSReqSim 1. Recording at multiple power levels for each type of simulated load.

> R: To explore the flexibility due to simulation allowing multiple loads to be employed by just changing the component values.

DSReqSim 2. Different scenarios of the AC Mains must include wiring stray inductance, as well as harmonics and white noise added to the mains voltage.

R: To simulate multiple actual environments considering wiring stray inductance, harmonics, and noise.

For the LIT Natural subset:

DSReqN 1. Minimum monitoring time for naturally shaped loads (for each monitoring file): 1 day.

> R: Considering the daily seasonality typically present in the load shaping of the Natural subset, a day-long acquisition records such seasonality.

The taxonomy presented by Hart [37], from the perspective of power switching, was extended, resulting in these types of loads:


As per requirement DSReq 3, all these types of loads are required in the LIT-Dataset.

In [38], the authors present 17 suggestions to dataset providers to improve dataset interoperability and comparability. Since these suggestions were published after the LIT-Dataset requirements were specified, Table 4 presents the coverage of these suggestions by the LIT-Dataset requirements.


**Table 4.** Coverage of Klemenjak's [38] suggestions.
