A Scalable Framework for Real-Time Network Security Traffic Analysis and Attack Detection Using Machine and Deep Learning
Abstract
1. Introduction
- We present a phased approach to network security analysis, progressing from static traffic monitoring to the integration of machine learning and culminating in the deployment of LLMs.
- We offer a scalable, real-time platform that addresses the unique security challenges of large-scale networks by combining big data analytics with advanced endpoint security.
- We evaluate a BERT model paired with a BBPE-trained tokenizer for detecting network threats on the ToN-IoT, UNSW-NB15, and Edge-IIoT datasets, where it achieves high accuracy and F1 scores.
- Finally, NSTAP serves as a comprehensive solution for proactive threat detection and network management in enterprise environments.
2. Related Work
3. Platform Architecture
3.1. Design Principles and Data Pipeline
3.2. Core Components
3.2.1. Elastic Stack
- Elasticsearch: A distributed, RESTful search and analytics engine capable of storing and indexing data.
- Logstash: A data processing pipeline that ingests data from multiple sources simultaneously, transforms it, and then sends it to a stash like Elasticsearch.
- Kibana: A data visualization and exploration tool used for log and time-series analytics, application monitoring, and operational intelligence use cases.
- Beats: Lightweight data shippers that send data from hundreds or thousands of machines to Logstash or Elasticsearch.
3.2.2. Kafka
3.2.3. Zeek
3.2.4. Osquery
3.3. Deployment
User Role and Access Control
4. Intrusion Detection Techniques Integrated into the Platform
4.1. Static Analysis in NSTAP
4.2. AI Integration in NSTAP
4.2.1. Datasets
4.2.2. Evaluation Metrics
Validation Loss
Weighted F1 Score
Accuracy
Class-Specific Accuracy
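The metrics named above can be illustrated with a small worked example. The sketch below is a minimal pure-Python version, not the authors' evaluation code. It assumes that "class-specific accuracy" means the fraction of samples of each class that are classified correctly (per-class recall), and the label names used in the usage note are hypothetical.

```python
from collections import Counter


def accuracy(y_true, y_pred):
    """Fraction of all samples whose predicted label matches the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def class_accuracy(y_true, y_pred):
    """Per-class fraction of correctly classified samples (assumed definition)."""
    total = Counter(y_true)
    correct = Counter(t for t, p in zip(y_true, y_pred) if t == p)
    return {label: correct[label] / total[label] for label in total}


def weighted_f1(y_true, y_pred):
    """Per-class F1 averaged with weights proportional to class support."""
    support = Counter(y_true)
    score = 0.0
    for label, n in support.items():
        tp = sum(t == label and p == label for t, p in zip(y_true, y_pred))
        fp = sum(t != label and p == label for t, p in zip(y_true, y_pred))
        fn = sum(t == label and p != label for t, p in zip(y_true, y_pred))
        denom = 2 * tp + fp + fn
        f1 = 2 * tp / denom if denom else 0.0
        score += f1 * n / len(y_true)
    return score
```

For example, with true labels `["normal", "normal", "dos", "dos", "mitm"]` and predictions `["normal", "dos", "dos", "dos", "normal"]`, the overall accuracy is 0.6 while the minority `mitm` class has a class-specific accuracy of 0, which is why the weighted F1 score and per-class figures are reported alongside accuracy on imbalanced data.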
4.2.3. Machine Learning Techniques
Decision Trees (DT)
Random Forest (RF)
4.2.4. Feature Selection Techniques
Pearson Correlation Coefficient
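As a reminder of what this coefficient measures, the sketch below computes the Pearson correlation between two numeric columns in plain Python. How the coefficient is thresholded to keep or drop features is not specified here, so the function only returns the value; returning 0.0 for a constant column is a convention chosen for this sketch.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length numeric sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    std_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    std_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    if std_x == 0 or std_y == 0:
        return 0.0  # constant column: correlation undefined, treated as 0 here
    return cov / (std_x * std_y)
```

A value near 1 or -1 indicates a strong linear relationship between a feature and the target (or between two features), while a value near 0 indicates none.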
SVM-Based Feature Reduction Algorithm
Algorithm Overview
- Initialization: Set initial parameters, including the range of feature subset sizes and the accuracy threshold for identifying “bad features”.
- Feature Combination Generation: Iteratively generate combinations of features, starting from subsets of three features and incrementally adding more.
- Evaluation and Exclusion:
  - Evaluate each feature subset using the SVM classifier.
  - Measure the classification accuracy.
  - If the inclusion of a feature results in an accuracy decrease equal to or greater than the threshold, label it as a “bad feature”.
  - Exclude subsets containing “bad features” from further evaluation.
- Selection of Optimal Features: Identify the feature subset that achieves the highest accuracy without including any “bad features”.
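The steps above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the accuracy drop caused by "including" a feature is taken to be the difference between a subset's score and the score of the same subset without that feature, and the default sizes and threshold are hypothetical. The scorer is passed in as a function; `svm_accuracy` shows one possible scorer based on scikit-learn's `SVC` with 3-fold cross-validation.

```python
from itertools import combinations


def svm_accuracy(X, y, cols):
    """One possible scorer: mean 3-fold CV accuracy of an SVM on the chosen columns.

    X is assumed to be a NumPy array; requires scikit-learn.
    """
    from sklearn.model_selection import cross_val_score
    from sklearn.svm import SVC
    return cross_val_score(SVC(kernel="rbf"), X[:, list(cols)], y, cv=3).mean()


def select_features(n_features, score, min_size=3, max_size=5, threshold=0.02):
    """Search feature subsets of growing size, pruning those with 'bad' features.

    score(subset) must return the classification accuracy for a tuple of
    feature indices. Returns (best_subset, best_accuracy, bad_features).
    """
    scores = {}   # frozenset of feature indices -> accuracy
    bad = set()   # features whose inclusion lowered accuracy by >= threshold
    for size in range(min_size, max_size + 1):
        for subset in combinations(range(n_features), size):
            if bad.intersection(subset):
                continue  # exclude subsets containing a bad feature
            acc = score(subset)
            scores[frozenset(subset)] = acc
            for feature in subset:
                parent = frozenset(subset) - {feature}
                if parent in scores and scores[parent] - acc >= threshold:
                    bad.add(feature)
    # Only subsets free of bad features are eligible as the final answer.
    clean = {s: a for s, a in scores.items() if not bad.intersection(s)}
    if not clean:
        return None, 0.0, sorted(bad)
    best = max(clean, key=clean.get)
    return sorted(best), clean[best], sorted(bad)
```

With real data the call would look like `select_features(X.shape[1], lambda cols: svm_accuracy(X, y, cols))`. The number of subsets grows combinatorially with the feature count, so early exclusion of bad features is what keeps the search tractable.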
4.2.5. Deep Learning Techniques in NSTAP
Proposed Approach
Data Preprocessing
- Removal of Null Values: Entries with missing data were eliminated to ensure the integrity of the dataset.
- Feature Selection: Unnecessary features that do not contribute to intrusion detection were discarded to reduce complexity and improve model efficiency.
- Normalization: Normalization techniques were applied to standardize data formats and scales across different features.
- Annotating Data Elements: Each feature-value pair in a log entry was concatenated into a string, with feature names and their corresponding values separated by a delimiter (e.g., ‘$’), and spaces between pairs. This method preserves the semantic relationship between features and their values.
- Consolidation: All feature-value pairs were merged into a single text attribute for each entry in the dataset, resulting in a unified textual representation suitable for tokenization.
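The annotation and consolidation steps can be sketched as below. This is an illustrative sketch only: the field names in the usage note are hypothetical, the normalization step is omitted, and a value is treated as missing when it is `None` or an empty string.

```python
def annotate_record(record, delimiter="$", drop=()):
    """Turn one log entry (a dict) into a single 'name$value name$value ...' string."""
    pairs = []
    for name, value in record.items():
        if name in drop:
            continue  # feature selection: skip features judged unnecessary
        pairs.append(f"{name}{delimiter}{value}")
    return " ".join(pairs)  # consolidation: one text attribute per entry


def preprocess(records, delimiter="$", drop=()):
    """Drop entries with missing values, then annotate the remaining ones."""
    texts = []
    for record in records:
        if any(value is None or value == "" for value in record.values()):
            continue  # removal of null values
        texts.append(annotate_record(record, delimiter=delimiter, drop=drop))
    return texts
```

For instance, the entry `{"proto": "tcp", "src_bytes": 120, "ts": 1}` with `drop=("ts",)` becomes the string `proto$tcp src_bytes$120`, which keeps each value attached to its feature name before tokenization.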
Dataset Standardization
Tokenizer Configuration and Training
- Vocabulary Size: We experimented with three different vocabulary sizes: 5000; 10,000; and 20,000 tokens. Adjusting the vocabulary size allows us to balance the granularity of token representation with computational efficiency.
- Special Tokens: Specific tokens relevant to the datasets were incorporated to handle special cases and delimiters within the data.
- Minimum Frequency Threshold: A minimum frequency threshold of 2 was set to include tokens in the vocabulary, ensuring that rare tokens do not adversely affect the model performance.
- Joint Training on All Datasets: The tokenizer was trained on the combined data from all three datasets to create a unified vocabulary that captures the diversity of the network logs.
- Individual Training on Each Dataset: The tokenizer was trained separately on each dataset to capture dataset-specific terminology and patterns.
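To make the roles of the vocabulary size and the minimum frequency threshold concrete, the following is a toy byte-level BPE trainer in plain Python. It is a simplified sketch and not the tokenizer implementation used in the paper: words are split on whitespace, special tokens merely reserve vocabulary slots, and ties between equally frequent pairs are broken arbitrarily but deterministically.

```python
from collections import Counter


def merge_pair(word, a, b):
    """Replace every adjacent (a, b) in a token tuple with the merged token a + b."""
    out, i = [], 0
    while i < len(word):
        if i + 1 < len(word) and word[i] == a and word[i + 1] == b:
            out.append(a + b)
            i += 2
        else:
            out.append(word[i])
            i += 1
    return tuple(out)


def train_bbpe(texts, vocab_size, min_frequency=2, special_tokens=()):
    """Learn byte-pair merges until vocab_size is reached or pairs become too rare."""
    vocab = [tok.encode("utf-8") for tok in special_tokens]
    vocab += [bytes([b]) for b in range(256)]  # base alphabet: all byte values
    words = Counter()
    for text in texts:
        for word in text.split():
            words[tuple(bytes([b]) for b in word.encode("utf-8"))] += 1
    merges = []
    while len(vocab) < vocab_size:
        pairs = Counter()
        for word, count in words.items():
            for left, right in zip(word, word[1:]):
                pairs[(left, right)] += count
        if not pairs:
            break
        (a, b), freq = max(pairs.items(), key=lambda kv: (kv[1], kv[0]))
        if freq < min_frequency:
            break  # remaining pairs are rarer than the threshold
        merges.append((a, b))
        vocab.append(a + b)
        merged = Counter()
        for word, count in words.items():
            merged[merge_pair(word, a, b)] += count
        words = merged
    return vocab, merges


def encode(word, merges):
    """Tokenize one word by replaying the learned merges in order."""
    tokens = tuple(bytes([b]) for b in word.encode("utf-8"))
    for a, b in merges:
        tokens = merge_pair(tokens, a, b)
    return list(tokens)
```

Joint training corresponds to passing the concatenated texts of all datasets to `train_bbpe`, while individual training passes one dataset at a time. Because the base alphabet covers every byte, any input can be encoded without unknown tokens regardless of which vocabulary size is chosen.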
Fine-Tuning the Pre-Trained BERT Model
- Model Adaptation: The final classification layer of BERT was modified to match the number of classes in our dataset, as specified in the label_dict, so that the model outputs are compatible with the classification task.
- Optimization Settings: Employed the AdamW optimizer with a learning rate of and an epsilon of to facilitate efficient training while preventing overfitting.
- Embedding Generation: The BBPE tokenizer was used to convert the textual data into input IDs and attention masks, which are the required inputs for the BERT model.
- Data Batching: The data was organized into batches, each containing input IDs, attention masks, and labels, to enable efficient training.
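The shape of the inputs described in the last two items can be illustrated without any deep learning library. The sketch below assumes token IDs have already been produced by the tokenizer; the padding ID of 0, the maximum length, and the batch size are hypothetical values, and the model-specific start and separator tokens are left out for brevity.

```python
def build_label_dict(class_names):
    """Map each distinct class name to an integer ID (a label_dict-style mapping)."""
    return {name: idx for idx, name in enumerate(sorted(set(class_names)))}


def make_batches(encoded, labels, batch_size=2, max_length=8, pad_id=0):
    """Group token-ID sequences into padded batches with attention masks and labels."""
    batches = []
    for start in range(0, len(encoded), batch_size):
        input_ids, attention_mask = [], []
        for ids in encoded[start:start + batch_size]:
            ids = ids[:max_length]                # truncate long sequences
            padding = max_length - len(ids)
            input_ids.append(ids + [pad_id] * padding)
            attention_mask.append([1] * len(ids) + [0] * padding)  # 1 = real token
        batches.append({
            "input_ids": input_ids,
            "attention_mask": attention_mask,
            "labels": labels[start:start + batch_size],
        })
    return batches
```

The attention mask tells the model which positions hold real tokens and which are padding, so sequences of different lengths can share one fixed-size batch.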
Integration with NSTAP
5. Results and Discussion
Dataset-Specific Performance
- ToN-IoT: The model exhibited high accuracy, with F1 scores improving progressively across four epochs. The lowest recorded validation loss was 0.0191, reflecting effective learning and generalization. A notable observation is the impact of vocabulary size on classification performance. As shown in Figure 7, all vocabulary sizes led to strong performance, but the models using the 20,000-token vocabulary and the tokenizer trained on all datasets achieved marginally better F1 scores than the 5000-vocabulary model, particularly in later epochs. This suggests that a larger vocabulary gives the model richer token representations, which is particularly advantageous for complex multi-class classification tasks such as those in the ToN-IoT dataset.
- UNSW-NB15: As shown in Figure 7, the tokenizer trained on all datasets with a vocabulary size of 20,000 yields significant improvements in both training and validation metrics over the first four epochs. Both training and validation F1 scores increase steadily, reflecting the model’s ability to generalize with a larger and more diverse vocabulary. The decrease in validation loss suggests that the model learns meaningful patterns and avoids overfitting during these early epochs. This highlights the importance of a larger vocabulary for capturing complex network traffic patterns.
- Edge-IIoT: The model achieved near-perfect accuracy across all classes, with an F1 score of 0.99999 and a validation loss as low as 0.0000441 by the third epoch. Even for minority classes such as Ransomware (97% accuracy) and MiTM (99% accuracy), the model maintained strong performance, demonstrating robustness to class imbalance. Because such high accuracy and low validation loss raise concerns about overfitting, future work will apply the fine-tuned model to other datasets. This will help confirm whether the model overfits to the Edge-IIoT dataset or generalizes to unseen data from different network environments.
6. Conclusions and Future Work
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Conflicts of Interest
Abbreviations
NSTAP | Network Security Traffic Analysis Platform |
IDS | Intrusion Detection Systems |
NIDS | Network-based Intrusion Detection System |
HIDS | Host-based Intrusion Detection System |
IPS | Intrusion Prevention Systems |
AI | Artificial Intelligence |
LLMs | Large Language Models |
NLP | Natural Language Processing |
DL | Deep Learning |
ML | Machine Learning |
RF | Random Forest |
DT | Decision Tree |
SVM | Support Vector Machine |
BERT | Bidirectional Encoder Representations from Transformers |
BBPE | Byte-Level Byte-Pair Encoding |
IoT | Internet of Things |
GOOSE | Generic Object Oriented Substation Event |
References
- Goldman, Z.K.; McCoy, D. Deterring financially motivated cybercrime. J. Natl. Secur. Law Policy 2015, 8, 595.
- Global Networking Trends. Available online: https://www.cisco.com/c/en/us/solutions/enterprise-networks/global-networking-trends.html (accessed on 23 January 2025).
- Red Hat-Open Source Solutions. Available online: https://www.redhat.com/en (accessed on 23 January 2025).
- Technology Trends 2024. Available online: https://www.ericsson.com/en/reports-and-papers/ericsson-technology-review/articles/technology-trends-2024 (accessed on 23 January 2025).
- Jin, D.; Lu, Y.; Qin, J.; Cheng, Z.; Mao, Z. SwiftIDS: Real-time intrusion detection system based on LightGBM and parallel intrusion detection mechanism. Comput. Secur. 2020, 97, 101984.
- Maasaoui, Z.; Hathah, A.; Bilil, H.; Mai, V.S.; Battou, A.; Lbath, A. Network Security Traffic Analysis Platform—Design and Validation. In Proceedings of the 2022 IEEE/ACS 19th International Conference on Computer Systems and Applications (AICCSA), Abu Dhabi, United Arab Emirates, 5–8 December 2022.
- Mehmood, Y.; Shibli, M.A.; Habiba, U.; Masood, R. Intrusion Detection System in Cloud Computing: Challenges and opportunities. In Proceedings of the 2013 2nd National Conference on Information Assurance (NCIA), Rawalpindi, Pakistan, 11–12 December 2013; pp. 59–66.
- Maasaoui, Z.; Merzouki, M.; Bekri, A.; Abane, A.; Battou, A.; Lbath, A. Design and Implementation of an Automated Network Traffic Analysis System using Elastic Stack. In Proceedings of the 2023 20th ACS/IEEE International Conference on Computer Systems and Applications (AICCSA), Giza, Egypt, 4–7 December 2023; pp. 1–6.
- Denning, D.E. An Intrusion-Detection Model. IEEE Trans. Softw. Eng. 1987, SE-13, 222–232.
- Liu, C.; Zhang, Y. An Intrusion Detection Model Combining Signature-Based Recognition and Two-Round Immune-Based Recognition. In Proceedings of the 2021 17th International Conference on Computational Intelligence and Security (CIS), Chengdu, China, 19–22 November 2021; pp. 497–501.
- Almseidin, M.; Alzubi, M.; Kovacs, S.; Alkasassbeh, M. Evaluation of machine learning algorithms for intrusion detection system. In Proceedings of the 2017 IEEE 15th International Symposium on Intelligent Systems and Informatics (SISY), Subotica, Serbia, 14–16 September 2017; pp. 277–282.
- Khraisat, A.; Gondal, I.; Vamplew, P.; Kamruzzaman, J. Survey of intrusion detection systems: Techniques, datasets and challenges. Cybersecurity 2019, 2, 20.
- Maasaoui, Z.; Merzouki, M.; Battou, A.; Lbath, A. Anomaly Based Intrusion Detection using Large Language Models. In Proceedings of the 2024 IEEE/ACS 21st International Conference on Computer Systems and Applications (AICCSA), Sousse, Tunisia, 22–26 October 2024; pp. 1–8.
- Gharib, A.; Sharafaldin, I.; Lashkari, A.H.; Ghorbani, A.A. An Evaluation Framework for Intrusion Detection Dataset. In Proceedings of the 2016 International Conference on Information Science and Security (ICISS), Pattaya, Thailand, 19–22 December 2016; pp. 1–6.
- Abbasi, M.; Shahraki, A.; Taherkordi, A. Deep learning for network traffic monitoring and analysis (NTMA): A survey. Comput. Commun. 2021, 170, 19–41.
- D’Alconzo, A.; Drago, I.; Morichetta, A.; Mellia, M.; Casas, P. A Survey on Big Data for Network Traffic Monitoring and Analysis. IEEE Trans. Netw. Serv. Manag. 2019, 16, 800–813.
- Easterly, J.; Fanning, T. The Attack on Colonial Pipeline: What We’ve Learned & What We’ve Done Over the Past Two Years. 2023. Available online: https://www.cisa.gov/news-events/news/attack-colonial-pipeline-what-weve-learned-what-weve-done-over-past-two-years (accessed on 1 September 2023).
- Ahmad, R.; Alsmadi, I. Machine learning approaches to IoT security: A systematic literature review. Internet Things 2021, 14, 100365.
- Abdullahi, M.; Baashar, Y.; Alhussian, H.; Alwadain, A.; Aziz, N.; Capretz, L.F.; Abdulkadir, S.J. Detecting cybersecurity attacks in internet of things using artificial intelligence methods: A systematic literature review. Electronics 2022, 11, 198.
- Vaswani, A. Attention is all you need. Adv. Neural Inf. Process. Syst. 2017, 30, 5998–6008.
- Devlin, J. BERT: Pre-training of deep bidirectional transformers for language understanding. arXiv 2018, arXiv:1810.04805.
- Elastic Stack. The Leading Platform for Search-Powered Solutions. Available online: https://www.elastic.co/elastic-stack (accessed on 3 October 2024).
- Zeek-Network Security Monitoring. Available online: https://zeek.org/ (accessed on 3 October 2024).
- Osquery-SQL Powered Operating System Instrumentation. Available online: https://www.osquery.io/ (accessed on 3 October 2024).
- Apache Kafka. A Distributed Event Streaming Platform. Available online: https://kafka.apache.org/ (accessed on 3 October 2024).
- Borkar, A.; Donode, A.; Kumari, A. A survey on Intrusion Detection System (IDS) and Internal Intrusion Detection and protection system (IIDPS). In Proceedings of the 2017 International Conference on Inventive Computing and Informatics (ICICI), Coimbatore, India, 23–24 November 2017; pp. 949–953.
- Gkountis, C.; Taha, M.; Lloret, J.; Kambourakis, G. Lightweight algorithm for protecting SDN controller against DDoS attacks. In Proceedings of the 2017 10th IFIP Wireless and Mobile Networking Conference (WMNC), Valencia, Spain, 25–27 September 2017; pp. 1–6.
- Wu, H.; Schwab, S.; Peckham, R.L. Signature Based Network Intrusion Detection System and Method. U.S. Patent 7,424,744, 9 September 2008.
- Garcia-Teodoro, P.; Diaz-Verdejo, J.; Maciá-Fernández, G.; Vázquez, E. Anomaly-based network intrusion detection: Techniques, systems and challenges. Comput. Secur. 2009, 28, 18–28.
- Dong, B.; Wang, X. Comparison deep learning method to traditional methods using for network intrusion detection. In Proceedings of the 2016 8th IEEE International Conference on Communication Software and Networks (ICCSN), Beijing, China, 4–6 June 2016; pp. 581–585.
- Moustafa, N.; Slay, J. UNSW-NB15: A comprehensive data set for network intrusion detection systems (UNSW-NB15 network data set). In Proceedings of the 2015 Military Communications and Information Systems Conference (MilCIS), Canberra, Australia, 10–12 November 2015; IEEE: Piscataway, NJ, USA, 2015; pp. 1–6.
- Patgiri, R.; Varshney, U.; Akutota, T.; Kunde, R. An investigation on intrusion detection system using machine learning. In Proceedings of the 2018 IEEE Symposium Series on Computational Intelligence (SSCI), Bangalore, India, 18–21 November 2018; IEEE: Piscataway, NJ, USA, 2018; pp. 1684–1691.
- Chowdhury, M.N.; Ferens, K.; Ferens, M. Network Intrusion Detection Using Machine Learning. In Proceedings of the International Conference on Security and Management (SAM), Las Vegas, NV, USA, 25–28 July 2016; p. 30.
- Zang, M.; Yan, Y. Machine Learning-Based Intrusion Detection System for Big Data Analytics in VANET. In Proceedings of the 2021 IEEE 93rd Vehicular Technology Conference (VTC2021-Spring), Helsinki, Finland, 25–28 April 2021; pp. 1–5.
- Moustafa, N.; Slay, J. The Evaluation of Network Anomaly Detection Systems: Statistical Analysis of the UNSW-NB15 Data Set and the Comparison with the KDD99 Data Set. Inf. Secur. J. Glob. Perspect. 2016, 25, 18–31.
- Stoleriu, R.; Puncioiu, A.; Bica, I. Cyber Attacks Detection Using Open Source ELK Stack. In Proceedings of the 2021 13th International Conference on Electronics, Computers and Artificial Intelligence (ECAI), Pitesti, Romania, 1–3 July 2021; IEEE: Piscataway, NJ, USA, 2021; pp. 1–6.
- Chen, S.; Liao, H. BERT-Log: Anomaly Detection for System Logs Based on Pre-Trained Language Model. Appl. Artif. Intell. 2022, 36, 2145642.
- Le, V.-H.; Zhang, H. Log-Based Anomaly Detection without Log Parsing. In Proceedings of the 2021 36th IEEE/ACM International Conference on Automated Software Engineering (ASE), Melbourne, Australia, 15–19 November 2021; IEEE: Piscataway, NJ, USA, 2021; pp. 492–504.
- Guo, H.; Yuan, S.; Wu, X. LogBERT: Log Anomaly Detection via BERT. In Proceedings of the 2021 International Joint Conference on Neural Networks (IJCNN), Shenzhen, China, 18–22 July 2021; IEEE: Piscataway, NJ, USA, 2021; pp. 1–8.
- Bertoli, G.D.C.; Júnior, L.A.P.; Saotome, O.; Dos Santos, A.L.; Verri, F.A.N.; Marcondes, C.A.C.; Barbieri, S.; Rodrigues, M.S.; De Oliveira, J.M.P. An end-to-end framework for machine learning-based network intrusion detection system. IEEE Access 2021, 9, 106790–106805.
- Douiba, M.; Benkirane, S.; Guezzaz, A.; Azrour, M. An improved anomaly detection model for IoT security using decision tree and gradient boosting. J. Supercomput. 2023, 79, 3392–3411.
- Rahali, A.; Akhloufi, M.A. Malbert: Malware detection using bidirectional encoder representations from transformers. In Proceedings of the 2021 IEEE International Conference on Systems, Man, and Cybernetics (SMC), Melbourne, Australia, 17–20 October 2021; pp. 3226–3231.
- Lee, W.; Stolfo, S.J. A framework for constructing features and models for intrusion detection systems. ACM Trans. Inf. Syst. Secur. TiSSEC 2000, 3, 227–261.
- Ashfaq, R.A.R.; Wang, X.-Z.; Huang, J.Z.; Abbas, H.; He, Y.-L. Fuzziness based semi-supervised learning approach for intrusion detection system. Inf. Sci. 2017, 378, 484–497.
- Mirsky, Y.; Doitshman, T.; Elovici, Y.; Shabtai, A. Kitsune: An ensemble of autoencoders for online network intrusion detection. arXiv 2018, arXiv:1802.09089.
- Laghrissi, F.; Douzi, S.; Douzi, K.; Hssina, B. Intrusion detection systems using long short-term memory (LSTM). J. Big Data 2021, 8, 65.
- Bobade, S.Y.; Apare, R.S.; Borhade, R.H.; Mahalle, P.N. Intelligent detection framework for IoT-botnet detection: DBN-RNN with improved feature set. J. Inf. Secur. Appl. 2025, 89, 103961.
- RabbitMQ. Messaging That Just Works. Available online: https://www.rabbitmq.com/ (accessed on 23 January 2025).
- Apache ActiveMQ. Message Broker. Available online: https://activemq.apache.org/ (accessed on 23 January 2025).
- Snort. The Open Source Network Intrusion Detection System. Available online: https://www.snort.org/ (accessed on 23 January 2025).
- Suricata. The Open Source Network Threat Detection Engine. Available online: https://suricata.io/ (accessed on 23 January 2025).
- Sysmon. System Monitor. Available online: https://learn.microsoft.com/en-us/sysinternals/downloads/sysmon (accessed on 23 January 2025).
- auditd. The Linux Auditing System. Available online: https://man7.org/linux/man-pages/man8/auditd.8.html (accessed on 23 January 2025).
- Biswas, P.P.; Tan, H.C.; Zhu, Q.; Li, Y.; Mashima, D.; Chen, B. A Synthesized Dataset for Cybersecurity Study of IEC 61850 Based Substation. In Proceedings of the 2019 IEEE International Conference on Communications, Control, and Computing Technologies for Smart Grids (SmartGridComm), Beijing, China, 21–24 October 2019; Institute of Electrical and Electronics Engineers Inc.: Piscataway, NJ, USA, 2019.
- Breiman, L. Random Forests. Mach. Learn. 2001, 45, 5–32.
- Decision Tree Algorithm. Available online: https://www.towardsanalytic.com/decision-tree-algorithm/ (accessed on 3 October 2024).
- Restack. Decision Making Models: Answer Decision Trees vs. Random Forests. Restack, 2025. Available online: https://www.restack.io/p/decision-making-models-answer-decision-trees-vs-random-forests-cat-ai (accessed on 3 October 2024).
- Towards Data Science, Random Forest Explained: A Visual Guide with Code Examples. Medium-Towards Data Science, 2025. Available online: https://medium.com/towards-data-science/random-forest-explained-a-visual-guide-with-code-examples-9f736a6e1b3c (accessed on 2 April 2025).
- Booij, T.M.; Chiscop, I.; Meeuwissen, E.; Moustafa, N.; Den Hartog, F.T. ToN_IoT: The Role of Heterogeneity and the Need for Standardization of Features and Attack Types in IoT Network Intrusion Data Sets. IEEE Internet Things J. 2022, 9, 485–496.
- Ferrag, M.A.; Friha, O.; Hamouda, D.; Maglaras, L.; Janicke, H. Edge-IIoTset: A New Comprehensive Realistic Cyber Security Dataset of IoT and IIoT Applications: Centralized and Federated Learning. IEEE Dataport 2022.
- Ferrag, M.A.; Ndhlovu, M.; Tihanyi, N.; Cordeiro, L.C.; Debbah, M.; Lestable, T. Revolutionizing Cyber Threat Detection with Large Language Models. arXiv 2023, arXiv:2306.14263.
Column groups: Framework Capabilities (HIDS, NIDS, Dataset Collection, Large-Scale Data Flow); Detection Techniques (ML, DL, NLP); Classification (Binary, Multi-Class).
Framework | HIDS | NIDS | Dataset Collection | Large-Scale Data Flow | ML | DL | NLP | Binary | Multi-Class | Datasets | Detection Accuracy (%) |
---|---|---|---|---|---|---|---|---|---|---|---|
NSTAP [6,8,13] | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | CICIDS 2017, ToN-IoT, Edge-IIoT, UNSW-NB15, FDIA | 100 (Edge-IIoT) |
ELK-ML [36] | ✓ | ✓ | ✓ | ✗ | ✓ | ✗ | ✗ | ✓ | ✓ | Custom APT, DNS/HTTP Exfiltration | - |
AB-TRAP [40] | ✗ | ✓ | ✓ | ✓ | ✓ | ✗ | ✗ | ✓ | ✗ | KDD 99, NSL-KDD | - |
Catboost-IDS [41] | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✗ | ✓ | ✓ | NSL-KDD, IoT-23, BoT-IoT, Edge-IIoT | 100 (Edge-IIoT) |
MalBERT [42] | ✗ | ✗ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | Androzoo (22,000 apps) | 97.61 |
MADAM ID [43] | ✓ | ✓ | ✓ | ✓ | ✓ | ✗ | ✗ | ✓ | ✓ | DARPA 1998, BSM audit data | - |
Feed-forward NN [44] | ✗ | ✓ | ✗ | ✓ | ✓ | ✗ | ✗ | ✓ | ✓ | NSL-KDD | 82.41 |
Kitsune [45] | ✗ | ✓ | ✓ | ✓ | ✓ | ✗ | ✗ | ✓ | ✓ | Custom collected data | - |
LSTM [46] | ✗ | ✓ | ✗ | ✓ | ✓ | ✓ | ✗ | ✓ | ✓ | KDD99 | 99.49 |
IDBN + RNN [47] | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✗ | ✓ | ✓ | Bot-IoT, UNSW-NB15 | 92.6 (UNSW) |
Message Broker Systems
Feature | Apache Kafka [25] | RabbitMQ [48] | ActiveMQ [49] |
---|---|---|---|
Messaging Model | Pub/Sub, Streams | Queue-Based | Queue-Based |
Throughput | Very High | Moderate | Moderate |
Latency | Low | Low | Moderate |
Scalability | Horizontal | Vertical | Horizontal |
Data Retention | Yes | Limited | Limited |
Fault Tolerance | High | High | Moderate |
Use Cases | Real-Time Analytics | Messaging, Asynchronous Tasks | Messaging, Integration |
Community Support | Strong | Strong | Moderate |
Network Monitoring Tools
Feature | Zeek [23] | Snort [50] | Suricata [51] |
---|---|---|---|
Type | NTA (Network Traffic Analysis) | IDS/IPS | IDS/IPS |
Detection Method | Signature and Anomaly-Based | Signature-Based | Signature and Anomaly-Based |
Protocol Analysis | Deep | Shallow | Deep |
Performance | High | Moderate | High |
Custom Scripting | Yes (Zeek Script) | Limited | Yes (Lua Scripting) |
Use Cases | Monitoring, Forensics | Intrusion Detection | Intrusion Detection |
Community Support | Strong | Strong | Growing |
Endpoint Security Tools
Feature | Osquery [24] | Sysmon [52] | Auditd [53] |
---|---|---|---|
Platform Support | Cross-Platform | Windows Only | Linux Only |
Data Collection | System Info, Processes, File Integrity | Process Creation, Network Connections | System Calls, File Access |
Query Language | SQL-Based Queries | None | Rules-Based |
Real-Time Monitoring | Yes | Yes | Yes |
Customization | High | Moderate | Moderate |
Integration | Easy (via SQL Queries) | Requires Event Forwarding | Complex Configuration |
Community Support | Strong | Moderate | Moderate |
Attack Category | ToN-IoT | UNSW-NB15 | Edge-IIoT | Total |
---|---|---|---|---|
Normal | 50,000 | 93,000 | 1,615,643 | 1,758,643 |
XSS | 20,000 | - | 15,915 | 35,915 |
Scanning | 20,000 | - | 22,564 | 42,564 |
Ransomware | 20,000 | - | 10,925 | 30,925 |
Password | 20,000 | - | 50,153 | 70,153 |
Injection | 20,000 | - | 51,203 | 71,203 |
DoS | 20,000 | 16,353 | 50,062 | 86,415 |
DDoS | 20,000 | - | 171,630 | 191,630 |
Backdoor | 20,000 | 2329 | 24,862 | 47,191 |
MiTM | 1043 | - | 1214 | 2257 |
Generic | - | 58,871 | - | 58,871 |
Exploits | - | 44,525 | - | 44,525 |
Fuzzers | - | 24,246 | - | 24,246 |
Reconnaissance | - | 13,987 | - | 13,987 |
Analysis | - | 2677 | - | 2677 |
Shellcode | - | 1511 | - | 1511 |
Worms | - | 174 | - | 174 |
Vul Scanner | - | - | 50,110 | 50,110 |
DDoS_ICMP | - | - | 116,436 | 116,436 |
DDoS_HTTP | - | - | 49,911 | 49,911 |
Uploading | - | - | 37,634 | 37,634 |
Port Scanning | - | - | 22,564 | 22,564 |
Fingerprinting | - | - | 1001 | 1001 |
Attack types | 9 | 9 | 14 | 22 |
© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Maasaoui, Z.; Merzouki, M.; Battou, A.; Lbath, A. A Scalable Framework for Real-Time Network Security Traffic Analysis and Attack Detection Using Machine and Deep Learning. Platforms 2025, 3, 7. https://doi.org/10.3390/platforms3020007