Automated Network Defense: A Systematic Survey and Analysis of AutoML Paradigms for Network Intrusion Detection
Abstract
1. Introduction
Related Work
2. Taxonomy of Parallel and Distributed AutoML Paradigms
2.1. Enhanced Sequential Search with Meta-Learning
2.2. Intrinsically Parallel Population-Based Search
2.3. Large-Scale Parallel Ensemble and Stacking
2.4. Distributed Neural Architecture Search (NAS)
2.5. Summary
3. Empirical Analysis of NID Systems
3.1. Meta-Analysis Methodology
Critical Analysis of Benchmark Datasets
3.2. Performance and Efficiency Comparison
3.3. Result Discussion
3.3.1. State-of-the-Art Performance via Large-Scale Parallel Ensembling
3.3.2. Discovering Novel Architectures with Distributed NAS
3.3.3. Synergy Between Domain Knowledge and Automation
3.3.4. Expanded Observations
4. Discussion: Challenges and Future Research Directions
4.1. Computational Scalability and Green AutoML
4.2. Security and Trust in Distributed Learning
4.3. End-to-End Feature Engineering for NID
Exploring Universal Feature Extraction Methods
5. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Conflicts of Interest
References
- Dadkhah, S.; Mahdikhani, H.; Danso, P.K.; Zohourian, A.; Truong, K.A.; Ghorbani, A.A. Towards the development of a realistic multidimensional IoT profiling dataset. In Proceedings of the 2022 19th Annual International Conference on Privacy, Security & Trust (PST), Fredericton, NB, Canada, 22–24 August 2022; pp. 1–11.
- Lashkari, A.H.; Kadir, A.F.A.; Taheri, L.; Ghorbani, A.A. Toward developing a systematic approach to generate benchmark Android malware datasets and classification. In Proceedings of the 2018 International Carnahan Conference on Security Technology (ICCST), Montreal, QC, Canada, 22–25 October 2018; pp. 1–7.
- Jian, S.J.; Lu, Z.G.; Du, D.; Jiang, B.; Liu, B.X. Overview of Network Intrusion Detection Technology. J. Cyber Secur. 2020, 5, 96–122.
- He, X.; Zhao, K.; Chu, X. AutoML: A survey of the state-of-the-art. Knowl.-Based Syst. 2021, 212, 106622.
- Zhou, Y.; Shi, H.; Zhao, Y.; Ding, W.; Han, J.; Sun, H.; Zhang, X.; Tang, C.; Zhang, W. Identification of encrypted and malicious network traffic based on one-dimensional convolutional neural network. J. Cloud Comput. 2023, 12, 53.
- Wang, W.; Zhu, M.; Wang, J.; Zeng, X.; Yang, Z. End-to-end encrypted traffic classification with one-dimensional convolution neural networks. In Proceedings of the 2017 IEEE International Conference on Intelligence and Security Informatics (ISI), Beijing, China, 22–24 July 2017; pp. 43–48.
- Ruff, L.; Vandermeulen, R.; Goernitz, N.; Deecke, L.; Siddiqui, S.A.; Binder, A.; Müller, E.; Kloft, M. Deep one-class classification. In Proceedings of the 35th International Conference on Machine Learning, ICML 2018, Stockholmsmässan, Stockholm, Sweden, 10–15 July 2018; pp. 4393–4402.
- Pan, J.; Bulat, A.; Tan, F.; Zhu, X.; Dudziak, L.; Li, H.; Tzimiropoulos, G.; Martinez, B. EdgeViTs: Competing light-weight CNNs on mobile devices with vision transformers. In Proceedings of the ECCV 2022: 17th European Conference (Part XI), Tel Aviv, Israel, 23–27 October 2022; Springer: Cham, Switzerland, 2022; pp. 294–311.
- Dosovitskiy, A.; Beyer, L.; Kolesnikov, A.; Weissenborn, D.; Zhai, X.; Unterthiner, T.; Dehghani, M.; Minderer, M.; Heigold, G.; Gelly, S.; et al. An Image Is Worth 16 × 16 Words: Transformers for Image Recognition at Scale. Available online: https://openreview.net/forum?id=YicbFdNTTy (accessed on 11 February 2025).
- Chu, X.; Tian, Z.; Zhang, B.; Wang, X.; Shen, C. Conditional Positional Encodings for Vision Transformers. Available online: https://openreview.net/forum?id=3KWnuT-R1bh (accessed on 11 February 2025).
- Wang, Y.; Yao, Q.; Kwok, J.T.; Ni, L.M. Generalizing from a few examples: A survey on few-shot learning. ACM Comput. Surv. 2020, 53, 1–34.
- Lotfollahi, M.; Jafari Siavoshani, M.; Shirali Hossein Zade, R.; Saberian, M. Deep packet: A novel approach for encrypted traffic classification using deep learning. Soft Comput. 2020, 24, 1999–2012.
- Liao, H.J.; Lin, C.H.R.; Lin, Y.C.; Tung, K.Y. Intrusion detection system: A comprehensive review. J. Netw. Comput. Appl. 2013, 36, 16–24.
- Thornton, C. Auto-WEKA: Combined Selection and Hyperparameter Optimization of Supervised Machine Learning Algorithms. Ph.D. Thesis, University of British Columbia, Vancouver, BC, Canada, 2014.
- Feurer, M.; Klein, A.; Eggensperger, K.; Springenberg, J.; Blum, M.; Hutter, F. Efficient and robust automated machine learning. Adv. Neural Inf. Process. Syst. 2015, 28, 2962–2970.
- Olson, R.S.; Bartley, N.; Urbanowicz, R.J.; Moore, J.H. Evaluation of a tree-based pipeline optimization tool for automating data science. In Proceedings of the Genetic and Evolutionary Computation Conference 2016, Denver, CO, USA, 20–24 July 2016; pp. 485–492.
- Erickson, N.; Mueller, J.; Shirkov, A.; Zhang, H.; Larroy, P.; Li, M.; Smola, A. AutoGluon-Tabular: Robust and accurate AutoML for structured data. arXiv 2020, arXiv:2003.06505.
- Jin, H.; Song, Q.; Hu, X. Auto-Keras: An efficient neural architecture search system. In Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, Anchorage, AK, USA, 4–8 August 2019; pp. 1946–1956.
- Malekghaini, N.; Akbari, E.; Salahuddin, M.A.; Limam, N.; Boutaba, R.; Mathieu, B.; Moteau, S.; Tuffin, S. AutoML4ETC: Automated neural architecture search for real-world encrypted traffic classification. IEEE Trans. Netw. Serv. Manag. 2023, 21, 2715–2730.
- Sharafaldin, I.; Lashkari, A.H.; Ghorbani, A.A. Toward generating a new intrusion detection dataset and intrusion traffic characterization. ICISSP 2018, 1, 108–116.
- Holland, J.; Schmitt, P.; Feamster, N.; Mittal, P. New directions in automated traffic analysis. In Proceedings of the 2021 ACM SIGSAC Conference on Computer and Communications Security, Virtual, 15–19 November 2021; pp. 3366–3383.
- Yang, L.; Shami, A. Towards Autonomous Cybersecurity: An Intelligent AutoML Framework for Autonomous Intrusion Detection. In Proceedings of the Workshop on Autonomous Cybersecurity, Rome, Italy, 28 May–1 June 2023; pp. 68–78.
- Lyu, R.; He, M.; Zhang, Y.; Jin, L.; Wang, X. Network Intrusion Detection Based on an Efficient Neural Architecture Search. Symmetry 2021, 13, 1453.
- Draper-Gil, G.; Lashkari, A.H.; Mamun, M.S.I.; Ghorbani, A.A. Characterization of encrypted and VPN traffic using time-related features. In Proceedings of the 2nd International Conference on Information Systems Security and Privacy (ICISSP), Rome, Italy, 19–21 February 2016; pp. 407–414.
- Wang, W.; Zhu, M.; Zeng, X.; Ye, X.; Sheng, Y. Malware traffic classification using convolutional neural network for representation learning. In Proceedings of the 2017 International Conference on Information Networking (ICOIN), Da Nang, Vietnam, 11–13 January 2017; pp. 712–717.
- Zheng, W.; Gou, C.; Yan, L.; Mo, S. Learning to Classify: A Flow-Based Relation Network for Encrypted Traffic Classification. Proc. Web Conf. 2020, 2020, 13–22.
- Lin, X.; Xiong, G.; Gou, G.; Li, Z.; Shi, J.; Yu, J. ET-BERT: A Contextualized Datagram Representation with Pre-training Transformers for Encrypted Traffic Classification. Proc. ACM Web Conf. 2022, 2022, 633–642.
- Zhao, R.; Zhan, M.; Deng, X.; Wang, Y.; Wang, Y.; Gui, G.; Xue, Z. Yet Another Traffic Classifier: A Masked Autoencoder Based Traffic Transformer with Multi-Level Flow Representation. Proc. AAAI Conf. Artif. Intell. 2023, 37, 5420–5427.
- Piet, J.; Nwoji, D.; Paxson, V. GGFAST: Automating generation of flexible network traffic classifiers. In Proceedings of the ACM SIGCOMM 2023 Conference, New York City, NY, USA, 10–14 September 2023; pp. 850–866.
- Pang, R.; Xi, Z.; Ji, S.; Luo, X.; Wang, T. On the security risks of AutoML. In Proceedings of the 31st USENIX Security Symposium (USENIX Security 22), Boston, MA, USA, 10–12 August 2022; pp. 3953–3970.

Paradigm | Example Framework | Core Principle | Parallel Mechanism | NID Applicability Analysis |
---|---|---|---|---|
Enhanced Sequential Search with Meta-Learning | Auto-WEKA, Auto-Sklearn | Accelerates sequential Bayesian optimization through meta-learning and parallel evaluation | Parallel evaluation of candidate configurations; parallel meta-learning initialization | Suited to small and medium-sized datasets; finds near-optimal traditional ML models but offers limited support for deep learning |
Intrinsically Parallel Population-Based Search | TPOT | Uses evolutionary algorithms to evaluate a population of candidate ML pipelines in parallel | Embarrassingly parallel (EP) population fitness evaluation | Can explore complex pipeline combinations, but the search is highly stochastic and results may be unstable |
Large-Scale Parallel Ensemble and Stacking | AutoGluon | Trains a large number of standard models in parallel, then combines them through stacking and ensembling | Model parallelism and data parallelism (K-fold bagging) | Strong, stable performance and fast training; particularly suitable for tabular flow data, but model interpretability is relatively poor |
Distributed Neural Architecture Search (NAS) | Auto-Keras, AutoML4ETC | Uses a controller-worker architecture to search for network structures on a distributed cluster | Distributed training and evaluation of sub-network architectures | Strong potential for discovering highly customized deep learning models for traffic data, but the computational cost is extremely high |
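
The "embarrassingly parallel population fitness evaluation" listed for population-based search can be illustrated with a minimal sketch. This is not TPOT's implementation: the `fitness` function is a hypothetical stand-in for training and validating one pipeline, a candidate is reduced to a made-up `(depth, learning_rate)` pair, and a thread pool is used only to keep the example self-contained (a real system would more plausibly use processes or cluster workers).

```python
import random
from concurrent.futures import ThreadPoolExecutor


def fitness(candidate):
    """Hypothetical stand-in for 'train this pipeline and return its validation score'."""
    depth, learning_rate = candidate
    # Toy objective with its optimum at depth=6, learning_rate=0.1 (assumed for illustration).
    return -((depth - 6) ** 2) - 100.0 * (learning_rate - 0.1) ** 2


def mutate(candidate, rng):
    """Return a slightly perturbed copy of a candidate."""
    depth, learning_rate = candidate
    new_depth = max(1, depth + rng.choice([-1, 0, 1]))
    new_lr = min(1.0, max(0.01, learning_rate + rng.uniform(-0.05, 0.05)))
    return (new_depth, new_lr)


def evolve(pop_size=8, generations=5, workers=4, seed=0):
    """Evolutionary search in which each generation is scored in parallel."""
    rng = random.Random(seed)
    population = [(rng.randint(1, 12), rng.uniform(0.01, 0.5)) for _ in range(pop_size)]
    best_per_generation = []
    best = None
    with ThreadPoolExecutor(max_workers=workers) as pool:
        for _ in range(generations):
            # Candidates do not depend on each other, so scoring them is
            # embarrassingly parallel: one task per candidate, no communication.
            scores = list(pool.map(fitness, population))
            ranked = sorted(zip(scores, population), reverse=True)
            best_score, best = ranked[0]
            best_per_generation.append(best_score)
            # Elitism: keep the top half unchanged and refill with mutated copies.
            survivors = [cand for _, cand in ranked[: pop_size // 2]]
            population = survivors + [mutate(cand, rng) for cand in survivors]
    return best, best_per_generation


best, history = evolve()
print("best candidate:", best)
print("best score per generation:", history)
```

Because the top half of each generation is carried over unchanged, the best score per generation never decreases. The instability noted in the table comes from the random initialization and mutation, so a different `seed` can end at a different candidate.
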

Dataset | Year | Primary Use Case | Strengths | Known Limitations and Biases | Implication for Performance Comparison |
---|---|---|---|---|---|
CICIDS2017 | 2017 | General-purpose NIDS evaluation | Realistic network topology; diverse, modern attack types; labeled flows with >80 features. | Severe class imbalance; missing labels; systematic labeling errors from the CICFlowMeter tool can mislabel malicious TCP flows as benign. | Near-perfect scores should be viewed critically; performance may reflect overfitting to dataset artifacts rather than generalizable detection. |
ISCX2016 | 2016 | Encrypted and VPN traffic classification | Labeled VPN vs. non-VPN traffic for various applications; includes full packet captures. | Generated in a highly controlled, sanitized environment with only one application active at a time; lacks realistic background noise. | High performance may not generalize to noisy, real-world networks with concurrent application traffic. |
USTC-TFC2016 | 2016 | Malicious and application traffic classification | Contains 10 types of malicious traffic from real-world captures and 10 types of normal application traffic. | Primarily used for classification tasks rather than anomaly detection (unlike CICIDS2017); less information on capture methodology. | Useful for evaluating classifiers on known malware vs. benign traffic, but less suited for evaluating zero-day anomaly detection systems. |
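
The "severe class imbalance" limitation noted for CICIDS2017 is one reason accuracy alone can mislead. The following is a minimal sketch with made-up counts (990 benign flows and 10 attack flows, chosen for illustration and not taken from any dataset) showing a detector that never raises an alert:

```python
def binary_metrics(y_true, y_pred, positive=1):
    """Accuracy, precision, recall, and F1 for one positive class (here: 'attack')."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    accuracy = (tp + tn) / len(pairs)
    # Guard the ratios so that a detector with no positive predictions scores 0.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = 2 * precision * recall / (precision + recall) if (precision + recall) else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


# Illustrative, assumed class mix: 0 = benign, 1 = attack.
y_true = [0] * 990 + [1] * 10
always_benign = [0] * 1000  # a degenerate "detector" that labels every flow benign

metrics = binary_metrics(y_true, always_benign)
print(metrics)
```

Under these assumed counts the degenerate detector reaches 99% accuracy while its recall and F1 on the attack class are both 0, which is why the precision and F1 columns matter when reading results on imbalanced data.
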

Category | Method | Dataset | ACC (%) | Precision (%) | F1-Score (%) |
---|---|---|---|---|---|
Traditional ML Models | KNN | CICIDS2017 | 96.30 | 96.20 | 96.30 |
Traditional ML Models | DT | CICIDS2017 | 99.61 | 99.61 | 99.60 |
Traditional ML Models | RF | CICIDS2017 | 99.71 | 99.71 | 99.71 |
Traditional ML Models | ET | CICIDS2017 | 99.24 | 99.25 | 99.24 |
Traditional ML Models | XGBoost | CICIDS2017 | 99.75 | 99.75 | 99.75 |
Traditional ML Models | LightGBM | CICIDS2017 | 99.77 | 99.77 | 99.76 |
Traditional ML Models | CatBoost | CICIDS2017 | 99.55 | 99.55 | 99.55 |
Standard DL-CNN | LeNet | CICDS2017 | 98.87 | 96.41 | 88.90 |
Standard DL-CNN | CNN | CICDOS2017 | 85.99 | 86.25 | 95.20 |
Standard DL-CNN | ResNet | CICDOS2017 | 98.70 | 98.50 | 98.10 |
Advanced DL-Transformer | ViT-B/16 | CICIDS2017 | 80.32 | 82.38 | 79.83 |
AutoML-based Systems | nPrint [21] | CICIDS2017 | 99.90 | 100.00 | 99.90 |
AutoML-based Systems | Autonomous Cybersecurity [22] | CICIDS2017 | 99.80 | 99.80 | 99.80 |
AutoML-based Systems | NAS-Net [23] | CICDOS2017 | 99.40 | 99.50 | 99.50 |
AutoML-based Systems | KNN-AIDS | CICIDS2017 | 99.52 | 99.49 | 99.49 |
AutoML-based Systems | DL-LSTM | CICIDS2017 | 99.32 | 99.32 | 99.32 |
AutoML-based Systems | PyDSC-IDS | CICIDS2017 | 97.60 | 90.73 | 94.13 |
AutoML-based Systems | OE-IDS | CICIDS2017 | 98.00 | 97.30 | 96.70 |
AutoML-based Systems | PSO-DL | CICIDS2017 | 98.95 | 95.82 | 95.80 |
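
The table above reports precision and F1 but not recall. For a single positive class, F1 = 2PR / (P + R), so the recall implied by a reported pair can be recovered as R = F1 · P / (2P − F1). The sketch below applies that identity to three rows of the table. It rests on an assumption the table does not confirm: that each row is a single-class (binary) triple. Macro- or weighted-averaged multi-class scores need not satisfy the identity, so a flagged row is a prompt to check the averaging scheme and is not evidence of an error.

```python
def implied_recall(precision, f1):
    """Solve F1 = 2PR / (P + R) for R; P and F1 must share units (here: percent)."""
    denominator = 2.0 * precision - f1
    if denominator <= 0:
        raise ValueError("no finite recall is consistent with this precision/F1 pair")
    return f1 * precision / denominator


# (method, precision %, F1 %) copied from the table above.
rows = [
    ("RF", 99.71, 99.71),
    ("LeNet", 96.41, 88.90),
    ("CNN", 86.25, 95.20),
]

for name, precision, f1 in rows:
    recall = implied_recall(precision, f1)
    note = "" if recall <= 100.0 else "  <- above 100%, so not a single-class P/R/F1 triple"
    print(f"{name}: implied recall = {recall:.2f}%{note}")
```

A recall above 100% cannot occur for a single class, so a row flagged this way most likely uses averaged multi-class metrics or a different metric definition than the one assumed here.
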

Category | Method | Dataset | ACC (%) | Precision (%) | F1-Score (%) |
---|---|---|---|---|---|
Traditional ML Models | AppScanner | ISCX-VPN-App [24] | 47.11 | 52.76 | 46.09 |
Traditional ML Models | CUMUL | ISCX-VPN-App | 34.50 | 27.85 | 28.64 |
Standard DL-CNN | 1DCNN | USTC-TFC2016 [25] | 96.79 | 96.93 | 96.76 |
Standard DL-CNN | 2DCNN | USTC-TFC2016 | 96.94 | 97.07 | 96.92 |
Standard DL-CNN | DeepPacketCNN | ISCX2016 | 92.24 | 94.01 | 92.04 |
Standard DL-CNN | E2ECNN [12] | ISCX2016 | 92.48 | 93.03 | 92.47 |
Advanced DL-Transformer | FS-Net [26] | ISCX-VPN-Service [24] | 69.51 | 59.98 | 57.08 |
Advanced DL-Transformer | ET-BERT [27] | ISCX-VPN-Service | 97.83 | 97.98 | 97.86 |
Advanced DL-Transformer | YaTC [28] | ISCX-VPN-Service | 95.72 | 95.70 | 95.71 |
Advanced DL-Transformer | HiLo-MAE | ISCX-VPN-Service | 99.19 | 99.20 | 99.19 |
Advanced DL-Transformer | ViT-B/16 | USTC-TFC2016 | 73.77 | 75.79 | 73.45 |
Advanced DL-Transformer | LGLFormer | USTC-TFC2016 | 99.32 | 99.30 | 99.30 |
AutoML-based Systems | GGFAST [29] | AUCK-VI | 98.60 | 98.10 | 97.41 |
AutoML-based Systems | AutoML4ETC [19] | ISCX2016 | 94.35 | 94.87 | 94.40 |
AutoML-based Systems | UWOrange-H | ISCX2016 | 92.56 | 92.56 | 94.87 |
AutoML-based Systems | UCDavisCNN | ISCX2016 | 93.82 | 94.01 | 93.82 |

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Liu, H.; Wang, X.; He, F.; Zheng, Z. Automated Network Defense: A Systematic Survey and Analysis of AutoML Paradigms for Network Intrusion Detection. Appl. Sci. 2025, 15, 10389. https://doi.org/10.3390/app151910389