## *4.3. Multi-Task Learning*

If a machine learning model is trained on several separate but related tasks, this process is referred to as multi-task learning. For a good introduction to and overview of the topic with respect to deep learning, the reader is referred to [173]. The NILM problem lends itself to being framed as a multi-task learning problem: the column 'Output' in Table 2 lists the different variants that have been employed in the reviewed literature. We asked ourselves: "Can we find evidence that multi-task learning leads to superior performance compared to single-task learning in the case of DNN-NILM approaches?"

Based on the literature review, we found the following: Ref. [91] trains a CNN on the washing machine, freezes the parameters of the convolutional layers, and afterwards retrains only the final, fully connected layer for other appliances. The authors find that the results of this approach are comparable to standard training. This finding suggests that the learned features of different appliances are similar and can be shared between appliances. Simultaneous learning on different appliances could therefore make features more robust and lower the requirements on the amount of training data. A large improvement from joint learning on multiple appliances is also reported by [107] and, as already mentioned in Section 4.1, four of the best-performing approaches [35,37,63,97] use multi-task learning for network training. Only the authors of [49] report a general decrease in performance of multi-task learning models with respect to their single-task counterparts; as a remedy, they propose to employ a different architecture or to share fewer layers between appliances. Based on these observations and the general benefits of multi-task learning presented in [173], we conclude that multi-task learning is beneficial for DNN-NILM approaches. As has also been noted by [49], we see an additional benefit of multi-task learning in a reduced computational burden for edge devices, because a large share of the computations for disaggregation can be shared between several appliances.
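To make the idea of shared layers concrete, the following is a minimal PyTorch sketch of a multi-task network for NILM: a shared 1D convolutional backbone with one small fully connected regression head per appliance, trained jointly by summing the per-appliance losses. The layer sizes, window length, and appliance names are illustrative assumptions and are not taken from any of the reviewed approaches.

```python
import torch
import torch.nn as nn

class MultiTaskNILM(nn.Module):
    """Shared 1D-CNN backbone with one regression head per appliance (illustrative sizes)."""

    def __init__(self, window_length, appliances):
        super().__init__()
        # Shared feature extractor operating on a window of aggregate power.
        self.backbone = nn.Sequential(
            nn.Conv1d(1, 30, kernel_size=10, padding="same"), nn.ReLU(),
            nn.Conv1d(30, 40, kernel_size=8, padding="same"), nn.ReLU(),
            nn.Conv1d(40, 50, kernel_size=6, padding="same"), nn.ReLU(),
            nn.Flatten(),
        )
        feat_dim = 50 * window_length
        # One small task-specific head per appliance.
        self.heads = nn.ModuleDict({
            name: nn.Sequential(nn.Linear(feat_dim, 128), nn.ReLU(), nn.Linear(128, 1))
            for name in appliances
        })

    def forward(self, x):
        # x: (batch, 1, window_length) aggregate power
        features = self.backbone(x)
        # One scalar estimate (e.g., midpoint power) per appliance.
        return {name: head(features) for name, head in self.heads.items()}


# Joint (multi-task) training step: sum the per-appliance losses.
model = MultiTaskNILM(window_length=99, appliances=["kettle", "fridge", "washing_machine"])
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
criterion = nn.MSELoss()

x = torch.randn(16, 1, 99)                               # dummy aggregate windows
targets = {a: torch.randn(16, 1) for a in model.heads}   # dummy per-appliance targets

optimizer.zero_grad()
outputs = model(x)
loss = sum(criterion(outputs[a], targets[a]) for a in model.heads)
loss.backward()
optimizer.step()
```

The transfer strategy of [91] corresponds, in this sketch, to training the full network on one appliance, setting `requires_grad = False` on the backbone parameters, and then retraining only a new head for each further appliance; the multi-task variant instead keeps all heads attached and optimizes them jointly, so the backbone computation is performed only once per input window.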

## *4.4. Parameter Studies*

As visualized in Figure 1, there are many degrees of freedom in DNN-NILM approaches. In Section 3, we listed the many options that have already been tried out for the corresponding aspects. Looking at the literature, however, we see a lack of understanding of the influence of these options. Therefore, we want to stress the need for and value of parameter studies in future research activities in the DNN-NILM field.

For example, in the case of the data *sampling rate* and *window length*, several authors have investigated the influence of these two parameters on model performance, see Sections 3.2.2 and 3.3.1. There exists, however, no study that jointly investigates these two tightly coupled parameters (ideally even across different datasets and different models); a possible setup is sketched below. Similarly, we see potential in a systematic comparison between different normalization (Section 3.2.2), activation balancing (Section 3.2.3), data augmentation (Section 3.2.4), and post-processing (Section 3.5.1) strategies, as well as loss functions (Section 3.4.2).
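Such a joint study could be organized as a simple grid over the cross-combinations of the two parameters, as in the following Python sketch. The synthetic data, the candidate values, and the `evaluate` function are placeholders introduced here for illustration; in an actual study, `evaluate` would train and score a disaggregation model for each configuration on real data.

```python
import itertools
import numpy as np
import pandas as pd

# Placeholder aggregate power signal at 1 Hz (replace with a real house,
# e.g., from a public NILM dataset).
rng = np.random.default_rng(0)
index = pd.date_range("2024-01-01", periods=7 * 24 * 3600, freq="1s")
power = pd.Series(rng.gamma(shape=2.0, scale=100.0, size=len(index)), index=index)

sampling_periods = ["1s", "8s", "60s"]   # candidate sampling rates
window_lengths = [99, 299, 599]          # candidate input window lengths (samples)

def evaluate(series, window_length):
    """Placeholder for training and evaluating a NILM model on windows of the
    given length; returns a dummy score here."""
    windows = np.lib.stride_tricks.sliding_window_view(series.to_numpy(), window_length)
    return float(windows.std())  # stand-in for, e.g., the MAE of a trained model

results = []
for period, window in itertools.product(sampling_periods, window_lengths):
    resampled = power.resample(period).mean()   # downsample the aggregate signal
    results.append({"sampling_period": period,
                    "window_length": window,
                    "score": evaluate(resampled, window)})

# One score per (sampling rate, window length) combination.
print(pd.DataFrame(results).pivot(index="sampling_period",
                                  columns="window_length",
                                  values="score"))
```

Running the same grid for several models and datasets would disentangle whether an observed optimum in window length is intrinsic to an appliance or merely an artifact of the chosen sampling rate.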
