3.2.3. Activation Balancing

In the NILM literature, the time interval between an appliance being switched on and off is referred to as an *activation*. Domestic appliances typically exhibit one to several activations per day, and their run-time is usually short compared to the time they are switched off. When training machine learning algorithms, one is consequently faced with a skewed dataset that contains only few samples of the running appliance. To compensate, several authors balance samples with and without a (partial) activation during training [14,22,34,35,39,58,60,63,75,89,105,110,112,119,123]. The majority of works nevertheless train their models on the available data without addressing the class imbalance. Within the scope of this review, we are only aware of [34], which investigates the effect of the ratio between samples with and without an activation on the training results. They found that with batch normalization [147], the accuracy decreased strongly at a ratio of one to five, whereas with instance normalization [146], the performance increased slightly up to the largest tested ratio of one to seven. In general, it remains unclear how exactly activation balancing influences the disaggregation quality and the model's convergence speed.
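As an illustration, the balancing step can be sketched as a subsampling of training windows without an activation down to a fixed ratio relative to the windows that contain one. The function `balance_windows`, its window representation, and the 1:`ratio` subsampling policy below are our own illustrative assumptions, not the procedure of any specific cited work:

```python
import random


def balance_windows(windows, contains_activation, ratio=1, seed=0):
    """Illustrative activation balancing (not taken from a cited work).

    Keep all windows that contain a (partial) activation and subsample
    the remaining windows so that at most `ratio` windows without an
    activation remain per window with one.
    """
    rng = random.Random(seed)
    active = [w for w, a in zip(windows, contains_activation) if a]
    inactive = [w for w, a in zip(windows, contains_activation) if not a]
    # Cap the number of inactive windows at `ratio` times the active count.
    k = min(len(inactive), ratio * len(active))
    balanced = active + rng.sample(inactive, k)
    rng.shuffle(balanced)
    return balanced


# Hypothetical dataset: 100 windows, of which the first 10 contain an activation.
windows = list(range(100))
flags = [i < 10 for i in windows]
balanced = balance_windows(windows, flags, ratio=5)
print(len(balanced))  # 10 active + 50 inactive windows
```

In practice, the same effect is often achieved by a weighted sampler in the data loader rather than by materializing a subsampled dataset; the sketch above merely makes the ratio explicit.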
