(1) Compared to Baselines

Table 4 and Figure 10 present the performance comparison of our approach and the baselines. Overall, SSAE achieves the best results on all metrics. Benefiting from the knowledge-embedded sample model, SVM, KNN, and XGBoost also achieve good results, even when using dimensionality-reduced data as input features. Owing to their stronger feature-learning ability, MLP-3 and ResNet-20 perform better than the above three methods. However, MLP-3 and ResNet-20 are supervised learners, and they can hardly avoid overfitting when labeled samples are extremely limited. In addition, their results vary considerably across trials. By exploiting massive unlabeled samples and regularized losses, SSAE successfully avoids overfitting.
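To make the overfitting argument concrete, the generic shape of such a semi-supervised objective can be sketched as below. This is an illustration only, not the paper's exact formulation: the function `combined_loss` and the weights `lambda_rec` / `lambda_reg` are assumed names, standing in for a supervised loss on the few labeled samples, an unsupervised reconstruction loss on the massive unlabeled samples, and a regularization term.

```python
# Illustrative sketch (assumed names, not the paper's exact loss):
# a semi-supervised objective combining a supervised term on the few
# labeled samples, a reconstruction term on unlabeled samples, and a
# regularizer on the model weights.

def combined_loss(sup_loss, rec_loss, weight_norm,
                  lambda_rec=1.0, lambda_reg=1e-3):
    """Total loss = supervised
                  + lambda_rec * reconstruction (unlabeled data)
                  + lambda_reg * weight regularization."""
    return sup_loss + lambda_rec * rec_loss + lambda_reg * weight_norm

# With few labels, the reconstruction and regularization terms keep the
# optimization from fitting the small labeled set too tightly.
print(combined_loss(sup_loss=0.40, rec_loss=0.25, weight_norm=12.0))
# → 0.662
```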

**Table 4.** NTL detection performance comparison (with knowledge).


**Figure 10.** The ROC curve and PR curve of SSAE and baselines.

From Table 4, XGBoost obtains an AUC score close to that of SSAE by selecting a very small decision threshold. Obviously, this makes the classifier more sensitive and leads to instability. In contrast, SSAE allows a larger decision threshold without a significant drop in Precision and Recall. As the results in Table 4 show, the Precision and Recall of SSAE outperform all baselines when the decision threshold is 0.5. This indicates that SSAE separates normal and abnormal samples as much as possible. Hence, the SSAE classifier is more stable across varying scenarios.
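The threshold-sensitivity argument can be sketched with a toy computation. The scores and labels below are synthetic and chosen only for illustration: a classifier that pushes normal and abnormal scores far apart keeps Precision and Recall high at the conventional threshold 0.5, while a classifier that merely ranks well (good AUC) but compresses all scores near zero only works at a tiny, fragile threshold.

```python
# Illustration (synthetic data, not the paper's results): how score
# separation determines tolerance to the decision threshold.

def precision_recall(scores, labels, threshold):
    """Precision and Recall when samples with score >= threshold are
    predicted abnormal (positive class = 1)."""
    tp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 1)
    fp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 0)
    fn = sum(1 for s, y in zip(scores, labels) if s < threshold and y == 1)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall

# Well-separated scores (normal near 0, abnormal near 1):
separated  = [0.02, 0.05, 0.10, 0.08, 0.90, 0.95, 0.88, 0.97]
# Compressed scores: correct ranking (good AUC) but all values near 0:
compressed = [0.01, 0.02, 0.03, 0.02, 0.06, 0.08, 0.07, 0.09]
labels     = [0, 0, 0, 0, 1, 1, 1, 1]

print(precision_recall(separated, labels, 0.5))    # → (1.0, 1.0)
print(precision_recall(compressed, labels, 0.5))   # → (0.0, 0.0)
print(precision_recall(compressed, labels, 0.05))  # → (1.0, 1.0)
```

The compressed classifier recovers perfect Precision/Recall only at the hand-picked threshold 0.05; any small shift in the score distribution breaks it, which mirrors the instability attributed to XGBoost's small-threshold AUC above.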
