(2) Comparison with other proposals

Table 5 compares our method with state-of-the-art approaches that report an AUC score or an F1 score. Note that the AUC and F1 scores of these approaches are taken directly from their papers, because our dataset cannot satisfy their data requirements. For example, [35] needs observer meters as auxiliary data to help NTL detection. [3] requires GIS data, quality data and TECH data to achieve its best performance. [10,12] require a large number of labeled samples, and [11] requires consumption data spanning more than one year.
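
For reference, the two metrics used in this comparison can be computed as in the following minimal sketch. It is not the evaluation code of this paper or of the cited works; the function names `auc_score` and `f1_score`, the toy labels and scores, and the 0.5 decision threshold are illustrative assumptions. The AUC is computed here as the fraction of (fraud, normal) pairs in which the fraud sample receives the higher score, with ties counted as one half.

```python
def f1_score(y_true, y_pred):
    """F1 score for binary labels, where 1 marks an NTL (fraud) sample."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def auc_score(y_true, scores):
    """AUC as the probability that a fraud sample outranks a normal one."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one sample of each class")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy data (hypothetical, not taken from Table 5).
y_true = [0, 0, 1, 1, 0, 1]
scores = [0.1, 0.4, 0.35, 0.8, 0.2, 0.7]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed threshold of 0.5

auc = auc_score(y_true, scores)
f1 = f1_score(y_true, y_pred)
```

The AUC depends only on the ranking of the scores, whereas the F1 score depends on the chosen threshold. This is one reason why results reported with different metrics are not directly interchangeable.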

Among these state-of-the-art approaches, [10,11,35] are validated on artificial samples. The SMART attack model defined by [11] is the simplest case because its fraud factor *αt* is governed by a fixed parameter. For this reason, [11] achieves an AUC score of 0.99. However, realistic NTL is more similar to the adaptive attack model (FDI5) defined by [10]. Because the key factor of FDI5 changes randomly over time, [10] achieves an F1 score of 0.83 and [35] achieves an AUC score of 0.851. Although these scores are lower than that of [11], the results are more convincing.
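
The difference between a fixed and a time-varying fraud factor can be illustrated with the following sketch. It does not reproduce the exact SMART or FDI5 definitions from [11] and [10]; the function names, the uniform sampling range and the toy readings are assumptions made only for illustration.

```python
import random


def fixed_factor_attack(readings, alpha):
    """Scale every reading by the same fraud factor alpha (0 < alpha < 1)."""
    return [alpha * x for x in readings]


def random_factor_attack(readings, low, high, seed=None):
    """Scale each reading by its own factor drawn uniformly from [low, high]."""
    rng = random.Random(seed)
    return [rng.uniform(low, high) * x for x in readings]


def reporting_ratios(original, tampered):
    """Ratio of reported to true consumption at each time step."""
    return [t / o for o, t in zip(original, tampered) if o != 0]


# Hypothetical honest consumption readings (kWh per interval).
honest = [1.2, 0.8, 1.5, 2.0, 1.1, 0.9]

fixed = fixed_factor_attack(honest, alpha=0.5)
varying = random_factor_attack(honest, low=0.1, high=0.8, seed=0)

fixed_ratios = reporting_ratios(honest, fixed)
varying_ratios = reporting_ratios(honest, varying)
```

With a fixed factor, the tampered curve is a scaled copy of the honest one, so the ratio is constant and the pattern is comparatively easy to learn. With a factor that changes at every time step, the shape of the curve itself is distorted, which is consistent with the lower scores reported under the adaptive attack model.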

On the other hand, [3,12] and the SSAE are validated on realistic NTL samples. The results in Table 5 show that the SSAE achieves a large lead in both AUC score and F1 score. The knowledge-embedded sample model and deep semi-supervised learning are the key reasons. Although [12] is also based on deep neural networks, its model is built entirely on electricity consumption data without any domain knowledge, so its AUC score is not ideal. To overcome the limited information in electricity consumption data, [3] supplements it with various auxiliary or private data and achieves a notable improvement. This undoubtedly increases the difficulty of data acquisition, especially since some of the data concern customers' privacy. Our approach is a compromise that relies only on the SM data collected by a typical AMI system. It not only reduces the number of required data types, but also protects customers' privacy. Beyond the knowledge-embedded sample model, the SSAE itself has stronger feature learning and NTL detection capabilities: even on raw SM data, the SSAE still obtains an AUC score of 0.907.


**Table 5.** Comparison with the state-of-the-art.
