#### 3.3.1. Calibration

We calibrated each meter in advance using an *ENERGY-LOGGER 4000* [34] (with a stated accuracy of 1%) as a reference. We used ten loads with different power consumption, ranging from 5 W to 2000 W, and applied a linear calibration. The calibration parameters of each meter were stored permanently in its non-volatile memory. We repeated the calibration after the recording period to check whether aging effects had already occurred. No such effects could be identified.
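As an illustration of what a linear calibration against a reference instrument can look like, the following minimal sketch fits a gain and an offset by ordinary least squares. The function names, the least-squares choice, and the synthetic readings are assumptions made for this example; they are not taken from the meter firmware or from our calibration logs.

```python
from statistics import fmean


def fit_linear_calibration(raw, reference):
    """Least-squares fit of reference ~ gain * raw + offset."""
    if len(raw) != len(reference) or len(raw) < 2:
        raise ValueError("need at least two paired readings")
    mean_raw = fmean(raw)
    mean_ref = fmean(reference)
    sxx = sum((r - mean_raw) ** 2 for r in raw)
    if sxx == 0:
        raise ValueError("raw readings must not all be identical")
    sxy = sum((r - mean_raw) * (p - mean_ref) for r, p in zip(raw, reference))
    gain = sxy / sxx
    offset = mean_ref - gain * mean_raw
    return gain, offset


def apply_calibration(raw_value, gain, offset):
    """Map an uncalibrated reading to a calibrated power value."""
    return gain * raw_value + offset


# Hypothetical reference loads in watts (ten loads between 5 W and 2000 W).
reference_w = [5.0, 20.0, 60.0, 100.0, 250.0, 500.0, 750.0, 1000.0, 1500.0, 2000.0]
# Synthetic uncalibrated meter: reads 2% low and has a 1.5 W offset (assumed values).
raw_w = [(p - 1.5) / 1.02 for p in reference_w]

gain, offset = fit_linear_calibration(raw_w, reference_w)
calibrated_w = [apply_calibration(r, gain, offset) for r in raw_w]
```

In this sketch the two fitted values, `gain` and `offset`, play the role of the calibration parameters that a meter would keep in non-volatile memory.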

#### 3.3.2. Residual Power

The sum of all PowerMeters' apparent power matches the apparent power recorded by the SmartMeter with a slight offset (residual power). The residual power is the portion of the total consumed power which is not metered by an individual meter, i.e., the portion for which no ground truth data is available. Our goal was to minimize this portion in order to provide reliable ground truth data that can be used by supervised machine learning algorithms.

The residual power observed in the FIRED dataset (see Figure 8) is mainly due to unmonitored, hard-wired devices in the apartment, such as the lighting and the ventilation system, but also due to the power consumption of the distributed meters themselves. The individual consumption of each light bulb can be estimated from the log files which we provide with our dataset. To show that this is feasible, we generated apparent power estimates using these log files and the provided individual light recordings. The consumption of the remaining unmonitored appliances (including the consumption of the 21 PowerMeters) is the base power consumption of the apartment. It can be estimated at times when the lights are turned off and the majority of appliances do not consume any power, which is typically during the night or when the occupants are absent. We calculated the base power $\mathbf{P}_{base_{Lx}}$ for each individual supply leg $x \in \{1, 2, 3\}$ as

$$PM_{Lx} := \{ pm \in PM \mid \text{phase of } pm \text{ is } x \}, \tag{10}$$

$$L_{Lx} := \{ l \in Lights \mid \text{phase of } l \text{ is } x \}, \tag{11}$$

$$\mathbf{P}_{base_{Lx}} = \mathbf{P}(SM_{Lx}) - \sum_{pm \in PM_{Lx}} \mathbf{P}(pm) - \sum_{l \in L_{Lx}} \mathbf{P}(l). \tag{12}$$

Here, $SM_{Lx}$ is the SmartMeter data of live wire $Lx$, $PM$ is the set of all PowerMeters, $PM_{Lx}$ is the set of PowerMeters that are connected to live wire $Lx$, $Lights$ is the set of all lights, $L_{Lx}$ is the set of lights connected to $Lx$, and $\mathbf{P}(X)$ is the extracted power trace of a meter or light $X$. We assume that the base power is normally distributed. We therefore remove all points of the trace obtained from Equation (12) that lie further than one standard deviation $\sigma$ from its mean value, and take the mean of the cleaned signal as the final base power $\mathbf{P}_{base_{Lx}}$.
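The two steps described above, the per-phase subtraction of Equation (12) followed by outlier removal and averaging, can be sketched as follows. This is a minimal illustration under stated assumptions: all traces of one phase are already time-aligned lists of equal length, the population standard deviation is used, and "further away than σ" is read as a one-sigma threshold. The function names and the sample values are hypothetical.

```python
from statistics import fmean, pstdev


def residual_trace(smart_meter, power_meters, lights):
    """Eq. (12): SmartMeter trace minus all PowerMeter and light traces of one phase."""
    residual = list(smart_meter)
    for trace in list(power_meters) + list(lights):
        if len(trace) != len(residual):
            raise ValueError("all traces must have the same length")
        residual = [r - t for r, t in zip(residual, trace)]
    return residual


def base_power(residual, n_sigma=1.0):
    """Mean of the residual after dropping points further than n_sigma * sigma from the mean."""
    mean = fmean(residual)
    sigma = pstdev(residual)  # assumption: population standard deviation
    kept = [v for v in residual if abs(v - mean) <= n_sigma * sigma]
    # Fall back to the plain mean if rounding ever leaves no point inside the band.
    return fmean(kept) if kept else mean


# Hypothetical apparent power samples (VA) for one supply leg.
smart_meter_l1 = [100.0, 102.0, 98.0, 100.0, 180.0, 100.0]
power_meters_l1 = [[70.0] * 6]                    # one PowerMeter on this phase
lights_l1 = [[5.0, 7.0, 3.0, 5.0, 5.0, 5.0]]      # one reconstructed light trace

residual_l1 = residual_trace(smart_meter_l1, power_meters_l1, lights_l1)
base_l1 = base_power(residual_l1)
```

In this synthetic example the fifth sample contains a short unmetered load; it lies outside the one-sigma band and is discarded, so the estimate is formed from the remaining samples only.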

Figure 13 shows the apparent power consumption including the lighting and the estimated base power with a remaining Root-Mean-Square Error (RMSE) of 17 VA.
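For reference, an RMSE between a measured aggregate trace and a reconstructed one can be computed as in the sketch below. It assumes that both traces are sampled at the same instants and that the error is taken sample by sample; the function name and the numbers are illustrative and are not values from the dataset.

```python
from math import sqrt


def rmse(measured, reconstructed):
    """Root-mean-square error between two equally long traces."""
    if len(measured) != len(reconstructed) or not measured:
        raise ValueError("traces must be non-empty and of equal length")
    squared = [(m - r) ** 2 for m, r in zip(measured, reconstructed)]
    return sqrt(sum(squared) / len(squared))


# Hypothetical apparent power samples in VA.
measured_va = [100.0, 120.0, 90.0, 110.0]
reconstructed_va = [97.0, 124.0, 90.0, 110.0]   # metered + lights + base power

error_va = rmse(measured_va, reconstructed_va)
```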

**Figure 13.** The power consumption of the apartment over one full day (2 July 2020). The power is down-sampled to one sample every 3 s. The black line indicates the power consumption recorded by the SmartMeter. The contribution of the six top-most consumers is shown as stacked colored blocks. The consumption of all remaining appliances and the reconstructed consumption of the apartment's lighting are aggregated and shown as the blue block *others*. The black base block represents the apartment's base power, estimated as 26.66 VA on average for this day.
