#### *5.3. Natural Subset*

To illustrate the validation of the data collection system of the Natural subset (NSAS; see Figure 9), a recording of a three-load combination is presented. In this case, each EDN is identified by a code transmitted in the data packet to the Synchronization Master and Acquisition Node (SMAN). Table 15 presents the three loads used, the number of the respective EDN, and the corresponding identification codes for power-ON and power-OFF events.


**Table 15.** Natural subset Devices (Acquisition Example).

The following sequence of load switching occurred: (i) turn-ON incandescent lamp; (ii) turn-OFF incandescent lamp; (iii) turn-ON LED lamp; (iv) turn-OFF LED lamp; (v) turn-ON drill; (vi) turn-OFF drill, generating the voltage and current curves represented in Figure 22a.

Table 16 presents all six events. The actual instant of each power-ON/power-OFF event can be obtained by analyzing the waveforms, as represented in Figure 22b, where the turn-ON Event 1 at sample 194,266 is shown. These values are presented in the column "Event observed in waveform (in samples)" and represent the ground truth. The EDN data packet contains the time stamp and event code in the format *YYYY:MM:DD:HH:MM:SS, sample\_number\_after\_SS, Code\_Id*. From the data packet, the reported time of the event (in samples) is obtained using the time stamp of the first sample in the waveform. The error is the distance, in samples, between the ground truth and the detected event. The sampling frequency is 15,360 Hz; hence, each sample corresponds to 65.1 μs.
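The conversion from a packet time stamp to a sample index, and the resulting error, can be sketched as follows. This is a minimal illustration rather than the acquisition software itself: the function names, the exact field separators, the assumption that the first-sample time stamp uses the same format, and the example time stamps are all hypothetical. Only the sampling frequency and the ground-truth sample 194,266 come from the text.

```python
from datetime import datetime

FS_HZ = 15_360  # sampling frequency stated in the text


def parse_stamp(stamp):
    """Split 'YYYY:MM:DD:HH:MM:SS,sample_number_after_SS[,Code_Id]'.

    The separators are assumed; the real packet layout may differ.
    """
    fields = [f.strip() for f in stamp.split(",")]
    second = datetime.strptime(fields[0], "%Y:%m:%d:%H:%M:%S")
    sample_after_ss = int(fields[1])
    code_id = fields[2] if len(fields) > 2 else None
    return second, sample_after_ss, code_id


def to_sample_index(event_stamp, first_sample_stamp, fs_hz=FS_HZ):
    """Sample index of an event relative to the first sample of the waveform."""
    ev_sec, ev_n, _ = parse_stamp(event_stamp)
    t0_sec, t0_n, _ = parse_stamp(first_sample_stamp)
    whole_seconds = int((ev_sec - t0_sec).total_seconds())
    return whole_seconds * fs_hz + (ev_n - t0_n)


def error_in_samples(reported_index, ground_truth_index):
    """Signed distance (in samples) between reported and observed event."""
    return reported_index - ground_truth_index


def samples_to_seconds(n_samples, fs_hz=FS_HZ):
    """Duration of n_samples at the given sampling frequency."""
    return n_samples / fs_hz


# Hypothetical time stamps, chosen only to exercise the arithmetic.
first_sample = "2020:01:01:10:00:00,0"
event_packet = "2020:01:01:10:00:12,9950,1"

reported = to_sample_index(event_packet, first_sample)  # 12 * 15360 + 9950 = 194270
error = error_in_samples(reported, 194_266)             # 4 samples
print(reported, error, samples_to_seconds(error))
```

With these made-up stamps, the reported event falls 4 samples after the ground truth, which at 15,360 Hz is roughly 260 μs. The actual errors are those listed in Table 16.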



**Figure 22.** Natural subset: incandescent lamp, LED lamp, and drill.

#### *5.4. Analysis of the Results*

In this section, the analysis of the three subsets is presented. Table 17 illustrates the high-frequency datasets (initially presented in Section 2.2, Table 3), now including the three LIT-Dataset subsets.

From the results presented previously in this section and by the comparative analysis summarized in Table 17, the distinct features of the LIT-Dataset are:



**Table 17.** Comparison of high-frequency NILM datasets.

(\*) Data collection for the LIT-Dataset is still in progress; the Natural subset in particular is in the early phases of collection.

#### *5.5. Considerations on the Design Process*

The LIT-Dataset, composed of three subsets, is the result of a design process that started with Requirements Engineering, including harvesting of the stakeholder requirements, identification of the source and derived requirements, data format design, development of the Enabling Systems (Jig, Simulator, and network of NSAS), validation of the enabling systems, data collection, data validation, and publication of the dataset files and user support documentation.

Following such a well-defined design process helped keep the project on track and aligned with the established project plan. Even so, as with most engineering projects, some difficulties arose during the process. The most relevant and time-consuming were:


• The most unexpected difficulty was finishing the project on time during the COVID-19 pandemic. Significant changes in the work environment, essentially moving all activities to home office, required an unexpected amount of extra work.

Because the initial planning included very little slack time to cope with such difficulties, the original schedule was kept by increasing the weekly work effort of the participants. The collection of data for the Natural subset is nevertheless somewhat delayed. The aim is to continue data collection for all subsets.
