5.2.2. Study of SSAE

We evaluate the performance of SSAE from the following aspects:


SSAE achieves remarkable generalization performance. To ensure a fair comparison, the median results of SSAE over all experiment rounds are selected for comparison in the subsequent subsections.
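Selecting the median result over repeated experiment rounds can be done directly with the standard library. A minimal sketch, where the per-round scores are hypothetical placeholders rather than the paper's measured results:

```python
from statistics import median

# Hypothetical per-round scores (illustrative numbers only, not the
# paper's results): one F1 and one AUC value per experiment round.
f1_rounds = [0.861, 0.874, 0.869, 0.880, 0.871]
auc_rounds = [0.970, 0.975, 0.972, 0.978, 0.973]

# The median round is reported, which is robust to a single unusually
# good or bad round and so gives a fairer basis for comparison.
print(median(f1_rounds), median(auc_rounds))
```

The same idea extends to picking the full model checkpoint from the median round rather than the best one.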

**Figure 7.** Performance of SSAE under varying proportions of labeled samples, latent sizes, and experiment rounds. (**a**) Capability of semi-supervised learning against NTL. (**b**) Effect of latent size on SSAE. (**c**) F1 score and AUC score in each round of experiments.

**Figure 8.** Comparison of t-SNE results for the original samples and the features learned by SSAE, where 0 denotes normal samples (plotted in blue) and 1 denotes NTL samples (plotted in orange). (**a**) Original samples with embedded knowledge. (**b**) Latent features learned by SSAE.
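A comparison in the spirit of Figure 8 can be sketched with scikit-learn's t-SNE implementation. The data below is synthetic (the paper's dataset and SSAE encoder are not reproduced here), so this only illustrates the projection step, not the actual result:

```python
import numpy as np
from sklearn.manifold import TSNE

# Synthetic stand-ins for the paper's data: 30 "normal" (label 0) and
# 30 "NTL" (label 1) samples in a 16-dimensional feature space.
rng = np.random.default_rng(0)
normal = rng.normal(loc=0.0, scale=1.0, size=(30, 16))
ntl = rng.normal(loc=3.0, scale=1.0, size=(30, 16))
X = np.vstack([normal, ntl])
labels = np.array([0] * 30 + [1] * 30)

# Project to 2-D for visualization; perplexity must be < n_samples.
emb = TSNE(n_components=2, perplexity=15.0,
           init="random", random_state=0).fit_transform(X)

# `emb` can now be scatter-plotted, colored by `labels`, once for the
# raw samples and once for SSAE's latent features to mirror panels
# (a) and (b) of Figure 8.
print(emb.shape)
```

Running t-SNE on both the raw inputs and the encoder's latent features with the same settings makes the two panels directly comparable.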

4. **Convergence analysis**: For semi-supervised learning, the number of epochs is an important parameter for avoiding underfitting and overfitting. In this paper, an epoch is defined as one pass over all labeled samples rather than the unlabeled samples. An epoch value that is too small or too large leads to underfitting or overfitting, respectively. Figure 9 shows the losses and scores for epochs 1 to 100. Between 40 and 60 epochs, the losses and scores are relatively flat and stable; before 40 epochs and after 60 epochs, there are large fluctuations. In particular, after 60 epochs the training loss continues to fall while the validation loss begins to rise; at the same time, the AUC score and the F1 score both decrease slightly, indicating that SSAE is clearly overfitting. Moreover, in Figure 9, the AUC score and the F1 score reach 0.9738 and 0.8763, respectively, at the 50th epoch. Therefore, the epoch value is fixed at 50 for all experiments in this paper.

**Figure 9.** Convergence analysis of SSAE.
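The epoch-selection rule described above (pick an epoch in the stable region where the validation loss bottoms out, before overfitting sets in) can be sketched as follows. The quadratic validation curve is a synthetic illustration, not the paper's measured losses:

```python
def select_epoch(val_losses):
    """Return the 1-indexed epoch with the lowest validation loss."""
    best_idx = min(range(len(val_losses)), key=lambda i: val_losses[i])
    return best_idx + 1

# Synthetic validation-loss curve: falls until epoch 50, then rises
# again (overfitting), loosely mimicking the shape described for
# Figure 9. Epochs are 1-indexed, matching the paper.
val_losses = [(e - 50) ** 2 / 2500 + 0.1 for e in range(1, 101)]
print(select_epoch(val_losses))  # → 50 for this synthetic curve
```

In practice this would be computed from the validation loss logged during training, which is the standard early-stopping criterion the convergence analysis appeals to.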
