1. Introduction
The development of 5G technology has completely changed communication systems. Faster and more reliable connectivity opens up a world of high-capacity applications at the network edge. However, this paradigm also brings a broad attack surface, making security a significant challenge. Complex networks can now be controlled and protected with the help of Software-Defined Networking (SDN) [1]. SDNs allow for more centralized management and enhance visibility by separating the control plane from the data plane, which makes network administration more responsive and adaptable [2]. SDN has an essential component called the controller, which acts as the brain of the network [3]. Some widely used SDN controllers include RYU, OpenDaylight, ONOS, Floodlight, and POX [4], as shown in Figure 1. These are open-source, OpenFlow-supported controllers. Recently, 5G technology has gained popularity in many industries, but it is also open to several threats, such as distributed denial-of-service (DDoS) and denial-of-service (DoS) attacks [5]. On average, 28.7k attacks are launched every day [6]. According to Neustar's 2020 Cyber Threats and Trends Report, 151 percent more attacks occurred in June 2020 than in the same period of 2019 [6]. In June 2020, the attack volume jumped to 12 Gbps from 11 Gbps during the same period in 2019.
The transformation brought by 5G technology is particularly evident in edge computing, where data processing occurs closer to the source, thereby improving efficiency and response times [7]. However, the increasing prevalence of 5G-enabled edge networks also brings new security issues. SDN, a widely accepted revolutionary networking paradigm, divides the control and data planes to improve network flexibility and management, as shown in Figure 2. The control plane makes forwarding decisions, the data plane handles traffic transmission and forwarding, and the application plane provides clients with programmable and open services [8]. These networks are vulnerable to a wide range of cyber threats due to their highly dynamic and dispersed nature, including DoS and DDoS attacks [9]. The Zero Trust method [10], founded on the principle of "never trust, always verify", ensures that all access requests, whether they originate within or outside the network perimeter, are consistently validated and authorized.
Distributed denial-of-service (DDoS) attacks are difficult for traditional security solutions such as firewalls and intrusion detection systems to identify and prevent, especially in large-scale networks [11]. Numerous earlier studies used datasets such as UNSW-NB15, NSL-KDD, KDD Cup 99, etc., which do not accurately represent IoT networks' distinct traffic patterns and vulnerabilities [12]. The lack of publicly available IoT-specific data complicates evaluating and validating anomaly detection systems [12]. Detecting DDoS attacks in real time, essential for minimizing the possibility of harm, is another significant limitation of current approaches. Today's security frameworks, such as intrusion detection systems and firewalls, frequently depend on static settings and predetermined rules. Because of their static nature, they are ineffective against dynamic and changing attack vectors. The main problem addressed in earlier studies is the lack of effective real-time security mechanisms in SDN environments. The majority of available security solutions are static and cannot adjust to the dynamic nature of network environments [12], making them vulnerable to changing security threats.
To address the vulnerability of SDN-enabled 5G networks to cyber-attacks, various machine learning models are applied, including Decision Tree (DT), Logistic Regression (LR), Naive Bayes (NB), SVM, KNN, and Random Forest (RF), using the CICDDoS2019 dataset [13]. However, these algorithms are typically not evaluated in real-time environments. This study addresses this research gap by combining these predictive models with a Mininet simulation environment using a POX controller. The objective is to detect and mitigate attacks in real time. The machine learning-based models dynamically learn and adapt to new cyber threat patterns, improving the controller's real-time capacity to identify and mitigate network hazards. We build a dynamic and adaptive system that can locate DDoS attacks in SDN by utilizing machine learning within a security and mitigation framework, while also forecasting cyberattacks using the Zero Trust concept. The following are the contributions of this paper:
- ❍
We integrate real-time attack detection in an SDN environment using the POX controller, which has not received as much attention as other SDN controllers such as OpenDaylight and ONOS. We apply the CICDDoS2019 dataset in a real-time detection and mitigation framework, in contrast to many research studies that employ out-of-date datasets (for example, NSL-KDD, KDD Cup 99) and some studies that use CICDDoS2019 only for offline evaluation. In most related works, real-time attack response and detection time are not treated as crucial assessment criteria. Instead of only evaluating static models, we incorporate attack detection into Mininet, enabling a simulated real-world network environment.
- ❍
We train six machine learning models separately on the CICDDoS2019 dataset in a simulation environment using the POX controller, with hyperparameter tuning performed via GridSearchCV for the best performance measures.
- ❍
The proposed framework is extensively tested in a Mininet environment and configurations, demonstrating effective anomaly detection, low resource overhead, and scalability for large-scale Intelligent-SDN networks, including IoT, telematics, WAN, and 5G.
The structure of the rest of this paper is as follows: the literature review is presented in Section 2, the methodology is explained in Section 3, Section 4 presents the experimental setup and model development for intrusion detection and mitigation, the results of the simulated environment are in Section 5, the discussion is provided in Section 6, and conclusions and future work are presented in Section 7.
2. Related Work
The primary purpose of an IDS is to monitor the computer system for any malicious behavior that might cause data loss [14]. Intrusion detection systems are implemented in both software and hardware. Machine learning techniques have proven helpful for this purpose since they provide low false alarm rates, high detection rates, and low communication costs [15]. This section presents numerous studies on SDN attack detection and mitigation using machine learning. We discuss machine learning-based SDN intrusion detection.
According to M. Latah et al. [16], the majority of researchers have examined and developed machine learning methods for intrusion detection, which is the process of detecting any attack that can threaten the network, such as phishing, DoS, and DDoS attacks. Intrusion detection systems (IDS) commonly employ supervised learning, where models trained on labeled data classify instances of network traffic as normal behavior or intrusions [17,18]. Supervised learning algorithms learn from patterns in the training data to accurately predict known intrusion types. The two most commonly utilized supervised learning algorithms are the Support Vector Machine and Random Forest [19]. Although Random Forest is used predominantly for both regression and classification tasks, SVM excels in classification objectives [20]. SVM demonstrates superior generalization compared to other machine learning methods [20]. In contrast, unsupervised learning learns without labels by automatically classifying the input into categories. One study gleaned relevant information using unsupervised learning with ML models [12]; they used a collection of random variables to represent the data, extracting relevant information from input datasets without class labels [12].
There are several benefits to connecting commonplace items to the Internet, which can improve our quality of life; however, they also come with security dangers. Mehdi et al. [21] proposed a framework for detecting and mitigating IoT-based smart home intrusions. Their findings demonstrated an accuracy of around 90% in predicting intrusions, and the proposed system could take appropriate action to stop such attempts. Yuan et al. [22] focused on mitigating DDoS damage in SDN networks, where disruptions prevent hosts and the network from efficiently processing signals, preventing user access to normal network services. The centralized structure of SDN makes it more vulnerable to such attacks. A similar approach suggested a two-phase solution: first, SVM was used to establish the optimization parameters [22]; then, the SVM approach was combined with a genetic cross-validation algorithm to identify the optimization factors and improve traffic flow recognition. Marchese et al. [23] presented a solution addressing the fundamental security challenge confronting SDN systems. More specifically, they substituted a more advanced programmable framework for the traditional architecture, but still confronted certain long-standing security issues. The SDN strategy presented challenges because the software-defined networking system is more centralized and therefore more vulnerable to hackers.
Similarly, using a machine learning-based technique, Park et al. [24] described a novel method for detecting and preventing network intrusions. In the suggested method, aberrant data were classified to identify SDN infiltration, and preprocessing entailed feature selection. Attack detection and alert generation were adequately handled by the suggested method. The two machine learning classification methods addressed the issue of distinguishing between anomalous and normal data through an audit. G. A. et al. [25] proposed a structure to identify potential abnormalities in SDN networks that used the Network Security Monitor (NSM) technique to detect intrusions without requiring additional host information. Given the flow-based nature of SDN, they opted for an alternative method to detect anomalies within an SDN network. The key goal was to create a system that keeps information open and accessible to both target hosts and possible hackers. To discriminate between different kinds of attacks and legitimate communications, they used a Random Forest algorithm. They calculated each class's false positive rate (FPR) and true positive rate (TPR) before calculating the F1 score to evaluate accuracy. Comparative analysis of key studies and analysis of supervised learning methods for intrusion detection systems are shown in Table 1.
Intrusion detection in software-defined 5G networks relies on artificial intelligence techniques, as outlined by [30], where machine learning algorithms are deployed to recognize potential threats based on flow categorization. Zafari et al. [27] addressed the prediction of affected hosts in a software-defined network using machine learning techniques. To help the SDN controller anticipate attacks, specific security rules were established, and various machine-learning methods were employed to define these rules. They used the K-means algorithm to classify the data by dividing it into categories of attack data and normal data. Existing IDS are challenging to maintain as they reside in confined locations, and it is difficult to intelligently identify attacks in such conventional networks [27]. J. et al. [26] proposed an IDS framework for software-defined 5G networks comprising three distinct layers: the data and intelligence layer, the management and control layer, and the forwarding layer. The forwarding layer monitored and captured network traffic and facilitated packet transmission between OpenFlow (OF) switches. Meanwhile, the control layer collected and analyzed network flows, using the data to block malicious flows as per controller directives. Similarly, a software-defined 5G network implemented and discussed supervised machine learning for intrusion detection [26]. This system, compared to existing methods, reduced the calculation overhead while increasing the accuracy of attack detection. Similarly, Zafari et al. [31] predicted the impacted hosts in an SDN network using supervised machine learning methods. With the threshold set to 0, the decision tree method produced the highest accuracy of 99.99%. The Bayesian network predicted the attacks with great precision and an average accuracy of 91.68%.
Chao et al. [32] utilized behavior-based SVM classifiers to classify network threats. SVM expedited the learning process while enhancing the accuracy of intrusion detection rates. Similarly, Marchese et al. [14] implemented a strategy reminiscent of SVM's approach to mitigate malicious threats in traffic flow within SDN networks. The SVM-based model worked effectively, as evidenced by the 80% detection rate of fraudulent activity [14]. In unsupervised learning, methods determine the structure of the unlabeled input data [33], examining network traffic for patterns, abnormalities, or clusters rather than relying on labels. Unsupervised learning algorithms are valuable for detection because they can identify unknown or novel forms of incursion. Their ability to detect anomalous patterns in network data allows them to alert users to imminent threats. Table 2 shows a comparison of the proposed method with previous studies.
Saravanan et al. [37] proposed a classification technique for intrusion detection using network security data. The big data tool Apache Spark was used to develop and assess several classification methods, and testing and training times were recorded; however, the authors evaluated only a small number of classification techniques. They discovered improved outcomes with a good false-positive ratio when compared to existing systems. When evaluating the classification methods in Apache Spark on network security data, better outcomes were obtained compared with current solutions: Decision Tree (DT), Support Vector Machine (SVM), Logistic Regression (LR), and SVM with Stochastic Gradient Descent (SGD) achieved accuracy rates of 96.8%, 93.9%, 92.8%, and 91.1%, respectively. An ML-based model for DDoS attack detection was suggested by Priya et al. [38]. The authors used three machine learning models: the K-Nearest Neighbors (KNN), Random Forest (RF), and Naive Bayes (NB) classifiers. Any DDoS attack on the network may be detected using the suggested method. According to the findings, the model has an average accuracy of 98.5% in detecting attacks.
To detect traffic DoS characteristics, Ujjain et al. [39] suggested entropy-based DoS detection, which combines two entropies using CNN and stacked autoencoders (SAE). However, CPU usage was considerably higher and the approach was time-consuming. The models' respective accuracies were 94% and 93%. Gadze et al. [40] suggested deep learning models using Convolutional Neural Networks (CNN) and Long Short-Term Memory (LSTM) to identify and reduce the risk of DDoS attacks that target the centralized controller in Software-Defined Networks (SDN). In comparison, these models' accuracy was lower: CNN achieved 66% and LSTM 89.63% when the data were split in a 70/30 ratio. However, of the ten attempts, the LSTM model took the longest to identify TCP, UDP, and ICMP DDoS attacks. To classify traffic as either DDoS or BENIGN, Ahuja et al. [41] suggested a hybrid machine learning approach called Support Vector Classifier with Random Forest (SVC-RF). After calculating the number of features in the original dataset, the authors produced a new dataset, known as the SDN dataset, which contained unique characteristics. According to the findings, the SVC-RF classifier can effectively classify traffic using the SDN dataset with an accuracy of 98.8%. An enhanced deep belief network (DBN) is the foundation of a novel deep learning model presented by Wang et al. [42], which detects network intrusions more effectively. In the DBN, they substituted the Kernel-based Extreme Learning Machine (KELM) method for the Back Propagation approach. Compared to current neural network techniques, their suggested model operates more effectively. They assessed several classification systems and calculated their accuracy: the "DBN-KELM" algorithm obtained a 93.5% accuracy rate, while the "DBN-EGWO-KELM" method achieved 98.60%. Using several machine learning models, Dehkordi et al. [43] suggested a technique to identify DDoS attacks in Software-Defined Networks (SDN). The three primary components of the suggested approach were (1) gathering, (2) entropy-based detection, and (3) classification. Three distinct datasets were used to test the suggested approach. The findings show that, when applied to the ISCX-SlowDDos2016 dataset, the average accuracy attained by the ML models, i.e., the Logistic methods, J48 algorithm, BayesNet algorithm, Random Tree technique, and REPTree algorithm, was 99.62%, 99.87%, 99.33%, 99.8%, and 99.88%, respectively.
The literature review reveals that most of the proposed methodologies are static and unable to adjust to the dynamic nature of network environments, making them vulnerable to changing security threats [12]. Numerous earlier studies used datasets such as UNSW-NB15, NSL-KDD, KDD Cup 99, etc., which do not accurately represent IoT networks' distinct traffic patterns and vulnerabilities [12]. The absence of publicly available IoT-specific data makes it difficult to evaluate and validate anomaly detection solutions [12]. Our study provides real-time network attack detection and mitigation by fusing machine learning approaches with SDN's dynamic capabilities. We used machine learning techniques to predict attacks in SDN-enabled 5G edge networks, Mininet for simulating networks, and the POX controller for managing the SDN.
3. Methodology
This section presents a systematic approach to identifying and preventing distributed denial-of-service (DDoS) attacks in a software-defined network with machine learning techniques, depicted in Figure 3. First, we present the benchmark dataset used to train the machine learning models. We then present the data preprocessing techniques and machine learning models, and conclude the section with an evaluation of the results.
3.1. CICDDoS2019 Benchmark Dataset
CICDDoS2019 is a realistic dataset of DDoS attacks and benign traffic [44] developed by the Canadian Institute for Cybersecurity (CIC) and the Communications Security Establishment (CSE). The dataset is publicly available from the Canadian Institute for Cybersecurity's website at https://www.unb.ca/cic/datasets/ddos-2019.html (accessed on 15 January 2025). The creators used a CIC-BenignGenerator to automate the data creation process, which involved instantiating networks of targeted devices using AWS. Log data and network traffic data were recorded and categorized after targeted computers were instrumented and methodically attacked using a 50-machine attack infrastructure. The dataset's designers demonstrated a significant effort to improve external validity through their experimental design, target and attack network architecture, and architectural selection. In addition, the collection contains benign data: using the CIC-BenignGenerator, benign background traffic is mimicked based on the abstract behavior profiles of twenty-five users, with email, FTP, SSH, HTTP, and HTTPS as the foundations of the benign traffic. Previous studies on the CICDDoS2019 dataset, including feature selection methods, real-time attack detection, data pre-processing steps, etc., are shown in Table 3. The dataset has 80 features, and Table 4 presents the class distribution of the CICDDoS2019 dataset.
The CICDDoS2019 dataset captures two kinds of DDoS attacks. The first kind is reflection-based attacks, such as MSSQL, SSDP, NTP, TFTP, DNS, LDAP, NetBIOS, and SNMP. In this scenario, real attackers can use legitimate clients as cover for their attack, making it more difficult for victims to distinguish attackers from users based on the source. These reflection attacks are carried out over TCP (MSSQL and SSDP), over UDP (NTP and TFTP), or over both (DNS, LDAP, NetBIOS, and SNMP). The second kind is exploitation-based attacks, such as SYN flood, UDP flood, and UDP-Lag, in which attackers send packets to the victim server while spoofing the originating IP address, exhausting the victim's resources.
3.2. Data Preprocessing
Data pre-processing is a critical step in ensuring the quality and suitability of data for model training. It includes encoding categorical variables in a format that machine learning algorithms can understand, normalizing the data to bring all features to a comparable scale, and cleaning the data by dealing with missing values and outliers. In contrast, feature engineering entails generating new features from pre-existing ones to improve the model's predictive power. The dataset must be accurate, clean, and prepared for use in machine learning models during the pre-processing stage. After loading the raw data, we eliminated any inconsistent or unnecessary information. We removed missing or insufficient values to maintain the quality of the data and eliminated duplicate entries to prevent the model's performance from being skewed. Using the One-Hot Encoding technique, we transformed non-numeric features into a machine-readable format. In addition, we used a filtering technique to choose important features, reducing the dimensionality of the dataset and improving computational efficiency.
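These steps can be sketched with pandas and scikit-learn as follows; the CSV path and the "Label" column name are assumptions about the file layout rather than the paper's exact code.

```python
# A minimal preprocessing sketch, assuming a CICDDoS2019 CSV export with
# a "Label" class column; path and column names are hypothetical.
import pandas as pd
from sklearn.preprocessing import MinMaxScaler
from sklearn.feature_selection import VarianceThreshold

df = pd.read_csv("cicddos2019.csv")             # load raw data (hypothetical path)
df = df.dropna().drop_duplicates()              # drop missing values and duplicates

y = df["Label"]                                 # assumed class column
X = pd.get_dummies(df.drop(columns=["Label"]))  # one-hot encode non-numeric features

X_scaled = MinMaxScaler().fit_transform(X)      # bring features to a comparable scale

# Filter-based feature selection: drop near-constant features to reduce
# dimensionality and improve computational efficiency.
X_reduced = VarianceThreshold(threshold=1e-4).fit_transform(X_scaled)
```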
3.3. Data Modeling
After preprocessing, we trained several machine learning models to determine which is best for DDoS detection. We used 80% of the data for training and 20% for testing to prepare and evaluate machine learning models. We used Decision Tree (DT), Random Forest (RF), Support Vector Machine (SVM), K-nearest neighbors (KNN), Logistic Regression (LR), and Naïve Bayes (NB) for real-time SDN-based intrusion detection because of their classification accuracy, interpretability, computational efficiency, and applicability. RF improves DT’s resilience by lowering overfitting through ensemble learning, whereas DT is selected for real-time detection because of its quick inference time and interpretability. Although its high processing cost restricts its real-time use, SVM is included because of its capacity to handle high-dimensional data and complicated decision boundaries. Although KNN suffers from rising computing costs as the dataset expands, it was chosen for its ease of use and flexibility in responding to novel assault patterns.
Machine learning models are trained and evaluated for accuracy and reliability, and the best-performing model is saved for deployment. The SDN POX controller tracks and controls network traffic and integrates the stored ML model. The machine learning model is deployed in real time to monitor and analyze network data, using Wireshark to capture traffic, in order to detect and mitigate attacks. This stage consists of traffic monitoring, in which incoming network traffic is continuously checked for irregularities. If no attack is detected, traffic is forwarded to its destination as usual; when an attack is detected, malicious packets are discarded to reduce network interruption. We use sensitivity, recall, accuracy, F1 score, precision, and computation time as performance measures.
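A minimal sketch of this 80/20 protocol and the listed metrics follows, assuming the preprocessed `X_reduced` and `y` from the previous step; the model choice and saved file name are illustrative.

```python
# Sketch of the train/test split, evaluation metrics, and model saving
# described above; RandomForestClassifier stands in for whichever model
# performs best.
import time
import joblib
from sklearn.model_selection import train_test_split
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score, precision_score, recall_score, f1_score

X_train, X_test, y_train, y_test = train_test_split(
    X_reduced, y, test_size=0.2, stratify=y, random_state=42)

model = RandomForestClassifier()
start = time.time()
model.fit(X_train, y_train)
y_pred = model.predict(X_test)
elapsed = time.time() - start                   # computation time metric

print("accuracy :", accuracy_score(y_test, y_pred))
print("precision:", precision_score(y_test, y_pred, average="weighted"))
print("recall   :", recall_score(y_test, y_pred, average="weighted"))
print("f1       :", f1_score(y_test, y_pred, average="weighted"))
print("time (s) :", elapsed)

joblib.dump(model, "best_model.joblib")         # saved for POX deployment
```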
3.4. Flow of Our Proposed Framework
Our proposed framework, IntelligentSDN, uses an organized and structured procedure to identify and stop network assaults. Figure 4 shows the flow diagram that describes the main stages and procedures involved in the system's operation:
- ❍
Initialization: In this step, we load the dataset for training and building the predictive model. We apply preprocessing, training, and validation procedures to ensure that the model can correctly detect malicious activity. The trained model is then saved and made accessible for the subsequent phases.
- ❍
Controller Initialization: This stage involves the system setting up the POX controller, which is essential for controlling and monitoring the data flow. The controller registers with the core network to create communication channels and ensure it is prepared to handle incoming network traffic.
- ❍
Managing Connections: This phase focuses on managing incoming traffic connections. Features are extracted from the packets to offer comprehensive insights into the traffic characteristics. The trained machine learning model processes these features to determine whether the incoming traffic indicates an attack or regular activity.
- ❍
Packet Forwarding or Mitigation: The prediction results determine whether the system sends the packet or mitigates the danger. If packets are found to be malicious, they are discarded to stop the attack from propagating.
- ❍
Decision-Making Process: A decision-making node in the center of the flow chooses what to do with incoming packets. The output of the machine learning model, which acts as the main intelligence for identifying attacks, directs this choice.
Our Intelligent SDN framework, presented in Algorithm 1, identifies and mitigates assaults by combining machine learning techniques with the adaptability of SDN controllers.
Algorithm 1 Attack detection and mitigation: specific steps and procedures

Our SDN network system uses the following procedure to gather network traffic and categorize it as benign or an attack.

1: procedure
2:   Step 1: Prepare the dataset for training
3:     Load the dataset (CICDDoS2019) containing benign and attack data.
4:     Preprocess the dataset:
         ❍ Handle missing values.
         ❍ Encode categorical features, if any.
         ❍ Normalize or scale numerical features.
5:     Split the dataset into training (80%) and testing (20%) sets.
6:   Step 2: Define hyperparameter tuning for the Decision Tree
7:     Set the parameter grid:
         ❍ max_depth = [3, 5, 10, None]
         ❍ min_samples_split = [2, 5, 10]
         ❍ criterion = ['gini', 'entropy']
8:     Use GridSearchCV to find the best combination of parameters.
9:   Step 3: Train the Decision Tree model
10:    Fit the model using the best hyperparameters on the training set.
11:  Step 4: Evaluate the model
12:    Test the model on the testing set.
13:    Calculate metrics such as Accuracy, Precision, Recall, F1 Score, and Computation Time.
14:  Step 5: Integrate with Mininet and POX Controller
15:    Set up a Mininet topology with switches, hosts, and links.
16:    Assign the POX controller to manage the SDN topology.
17:    Monitor network traffic using Wireshark.
18:    Capture real-time traffic for analysis.
19:  Step 6: Perform attack detection
20:    Use the trained Decision Tree model to classify traffic as "Benign" or "Attack".
21:    If an attack is detected:
         ❍ Mitigate the attack by blocking the source IP address or limiting bandwidth.
         ❍ Update the POX controller rules dynamically to enforce mitigation actions.
22:  Step 7: Display results
23:    Log detected attacks and mitigated traffic details.
24:    Visualize the model's performance metrics and computation time.
25: end procedure
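As a rough illustration of Steps 5 and 6, the hedged sketch below shows how a saved model could be wired into a POX packet-in handler; `extract_features` is a hypothetical helper mapping a packet to the model's feature vector, and the "Attack" label string is an assumption about the label encoding.

```python
# Sketch of a POX component that classifies flows with the saved model
# and installs drop rules for malicious traffic; not the paper's exact
# controller script.
import joblib
from pox.core import core
import pox.openflow.libopenflow_01 as of

model = joblib.load("best_model.joblib")

def _handle_PacketIn(event):
    packet = event.parsed
    features = extract_features(packet)  # hypothetical feature mapper
    if model.predict([features])[0] == "Attack":
        # Install a drop rule for this flow (a flow-mod with no actions drops).
        msg = of.ofp_flow_mod()
        msg.match = of.ofp_match.from_packet(packet, event.port)
        msg.idle_timeout = 60
        event.connection.send(msg)
    else:
        # Forward benign traffic (flooding as a simple fallback policy).
        msg = of.ofp_packet_out(data=event.ofp)
        msg.actions.append(of.ofp_action_output(port=of.OFPP_FLOOD))
        event.connection.send(msg)

def launch():
    core.openflow.addListenerByName("PacketIn", _handle_PacketIn)
```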
4. Experimental Setup and Model Development
We use Mininet (Stanford, CA, USA) for the implementation to test the effectiveness of machine learning techniques for real-time attack detection and mitigation in SDN. Using the CICDDoS2019 dataset, we integrate the machine learning models with Mininet to simulate networks, as shown in Figure 5. We explain the threat prevention techniques incorporated into our SDN-based architecture. The SDN controller instantly isolates or blocks an attack by dynamically updating flow rules when it detects a malicious traffic flow. This is done through automated flow table updates, blocking questionable IP addresses or unusual traffic patterns to stop further spread. Rate limiting is also used to reduce the danger of distributed denial-of-service (DDoS) attacks by controlling excessive bandwidth from possible attackers. The system uses anomaly-based policy updates for adaptive security, which enables the ML model to gradually improve its detection tactics. By ensuring that threats are not only recognized but also actively eliminated, these mitigation techniques enhance network resilience and preserve service continuity. We train and test each of the well-known machine learning algorithms one at a time using hyperparameter tuning with GridSearchCV, utilizing 5-fold stratified cross-validation. We used a Windows 10 computer with a 64-bit CPU and 12 GB of RAM, with Ubuntu 20.04 as the guest operating system in Oracle's VirtualBox, to set up the experimental environment. We run Mininet and the POX controller on Ubuntu. The POX SDN controller is an open-source controller for OpenFlow/SDN that runs on Python 3.5, and Mininet, which supports OpenFlow 1.3, is used in a virtualized environment. We created custom topologies using Python code, while MiniEdit makes it easier to create virtual network topologies. The setup specifications are shown in Table 5. Mininet enables the investigation of numerous networking topologies, and the POX SDN controller provides effective network administration. Actions based on the machine learning model's prediction form part of the mitigation procedure in the script supplied to the POX controller; the script drops malicious packets to reduce an attack's impact when one is detected.
4.1. Hyperparameters
We thoroughly investigate several hyperparameters to find the best combination that improves the model's performance. For this procedure, we use the Grid Search approach, a method included in the Python package scikit-learn. The performance of these models is greatly influenced by hyperparameters such as max_depth, min_samples_split, min_samples_leaf, and n_estimators, among others. Grid Search performs a cross-validated assessment for every set of hyperparameters: it divides the data into several subsets (folds) and tests the model's performance on some subsets while training it on the others.
For instance, the hyperparameters optimized include n_estimators being varied across values like [50, 100, 200], the maximum depth being set to different sizes [10, 20, None], and min_samples_split representing a minimum of two, five, or ten samples needed to divide an internal node. Similarly, the value ranges for the remaining hyperparameters are set. After the GridSearchCV process, we use the performance metrics on the validation data to determine the optimal set of hyperparameters. In the context of the problem domain under discussion, these optimal hyperparameters reflect the setup that best improves model performance and allows for more precise predictions. The process of finding the optimal values for a model’s parameters that are not discovered immediately from the training data is known as hyperparameter tuning. These factors influence the model’s capacity to generalize and function well on unknown data.
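Using the example grid quoted above, a GridSearchCV run over a Random Forest might look as follows; the scoring metric is an assumption, while the 5-fold stratified cross-validation follows the setup described earlier in this section.

```python
# GridSearchCV over the grid named above (n_estimators, max_depth,
# min_samples_split); scoring choice is an illustrative assumption.
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV, StratifiedKFold

param_grid = {
    "n_estimators": [50, 100, 200],
    "max_depth": [10, 20, None],
    "min_samples_split": [2, 5, 10],
}
search = GridSearchCV(
    RandomForestClassifier(random_state=42),
    param_grid,
    cv=StratifiedKFold(n_splits=5),   # 5-fold stratified CV, as in Section 4
    scoring="f1_weighted",            # assumed scoring metric
    n_jobs=-1,
)
search.fit(X_train, y_train)
print(search.best_params_)
```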
4.2. Hyper-Parameter Tuning for Support Vector Machine (SVM)
- ❍ C: The regularization parameter.
- ❍ $\gamma$ (Gamma): RBF kernel coefficient.
- ❍ Kernel: sigmoid, poly, rbf, and linear.

The objective function for a Support Vector Machine (SVM) is defined as:

$$\min_{\mathbf{w},\,b,\,\xi}\; \frac{1}{2}\|\mathbf{w}\|^{2} + C\sum_{i=1}^{n}\xi_{i} \quad \text{subject to} \quad y_{i}(\mathbf{w}^{\top}\mathbf{x}_{i} + b) \geq 1 - \xi_{i},\;\; \xi_{i} \geq 0,$$

where:
- ❍ $\mathbf{w}$: Weight vector.
- ❍ $b$: Bias term.
- ❍ $C$: Regularization parameter controlling the trade-off between maximizing the margin and minimizing the classification error.
- ❍ $\xi_{i}$: Slack variables to handle incorrect classification.
- ❍ $y_{i}$: True label of the i-th data point ($y_{i} \in \{-1, +1\}$).
- ❍ $\mathbf{x}_{i}$: Feature vector of the i-th data point.
4.2.1. Kernel Function
The kernel function maps the data to a higher-dimensional space to make it linearly separable. Standard kernel functions include:
- ❍ Linear Kernel: $K(\mathbf{x}_{i}, \mathbf{x}_{j}) = \mathbf{x}_{i}^{\top}\mathbf{x}_{j}$
- ❍ Radial Basis Function (RBF) Kernel: $K(\mathbf{x}_{i}, \mathbf{x}_{j}) = \exp\!\left(-\gamma\|\mathbf{x}_{i} - \mathbf{x}_{j}\|^{2}\right)$

Here, $\gamma$ is the kernel coefficient.
4.2.2. GridSearchCV Hyper-Parameter Tuning
GridSearchCV optimizes the SVM hyperparameters by searching over a predefined grid. The hyperparameters tuned include:
- ❍ Regularization parameter ($C$)
- ❍ Kernel coefficient ($\gamma$) for the RBF kernel
- ❍ Kernel types: linear, rbf, poly
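A sketch of this SVM grid is shown below; the kernel list follows the text, while the numeric ranges for C and gamma are illustrative assumptions, since they are not listed here.

```python
# Hedged sketch of the SVM grid search described above; C and gamma
# ranges are assumed, not taken from the paper.
from sklearn.svm import SVC
from sklearn.model_selection import GridSearchCV

svm_grid = {
    "C": [0.1, 1, 10],                # assumed range
    "gamma": ["scale", 0.01, 0.001],  # assumed range for the RBF kernel
    "kernel": ["linear", "rbf", "poly"],
}
svm_search = GridSearchCV(SVC(), svm_grid, cv=5, n_jobs=-1)
svm_search.fit(X_train, y_train)
```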
4.3. Hyper-Parameter Tuning for Random Forest (RF)
- ❍
n_estimators: The forest’s total number of trees.
- ❍
max_depth: Tree’s maximum depth.
- ❍
min_samples_split: The minimum number of samples needed to split an internal node.
- ❍
min_samples_leaf: The minimum number of samples needed to be a leaf node.
- ❍
max_features: The maximum number of features considered for the optimal split.
4.3.1. Gini Impurity
The Gini Impurity is calculated as:

$$G = 1 - \sum_{k=1}^{K} p_{k}^{2},$$

where $p_{k}$ is the probability of a data point belonging to class $k$, and $K$ is the total number of classes.
4.3.2. Information Gain (Entropy-Based)
Information Gain is defined as:

$$IG = H_{\text{parent}} - \sum_{i=1}^{n} \frac{N_{i}}{N} H_{i},$$

where:
- ❍ $H$: Entropy.
- ❍ $H_{\text{parent}}$: Entropy of the parent node.
- ❍ $N$: Total number of samples in the parent node.
- ❍ $N_{i}$: Number of samples in the i-th child node.
- ❍ $H_{i}$: Entropy of the i-th child node.
- ❍ $n$: Number of child nodes after the split.
4.3.3. Prediction
The prediction of the Random Forest model is based on the majority vote over all trees:

$$\hat{y} = \operatorname{mode}\{h_{1}(\mathbf{x}), h_{2}(\mathbf{x}), \ldots, h_{T}(\mathbf{x})\},$$

where $h_{t}(\mathbf{x})$ is the prediction of the $t$-th tree and $T$ is the total number of trees.
4.3.4. GridSearchCV Hyperparameter Tuning
To optimize the Random Forest model, GridSearchCV searches over a predefined set of hyperparameters:
- ❍ Number of trees in the forest (n_estimators)
- ❍ Maximum depth of the tree (max_depth)
- ❍ Minimum number of samples required to split an internal node (min_samples_split)
- ❍ Minimum number of samples required to be at a leaf node (min_samples_leaf)
4.4. Hyper-Parameter Tuning for Logistic Regression (LR)
- ❍ C: The strength of regularization (inverse of the regularization parameter $\lambda$).
- ❍ Solver: Optimization algorithm.
- ❍ Penalty: elasticnet, $\ell_1$, $\ell_2$, or no regularization.

The objective of logistic regression with regularization is to minimize the following function:

$$J(\mathbf{w}) = -\frac{1}{n}\sum_{i=1}^{n}\left[y_{i}\log\sigma(\mathbf{w}^{\top}\mathbf{x}_{i}) + (1 - y_{i})\log\left(1 - \sigma(\mathbf{w}^{\top}\mathbf{x}_{i})\right)\right] + \lambda R(\mathbf{w}),$$

where:
- ❍ $\sigma(\cdot)$: Sigmoid activation function, $\sigma(z) = 1/(1 + e^{-z})$.
- ❍ $\lambda$: Regularization strength, where $C$ is the inverse of $\lambda$.
- ❍ $R(\mathbf{w})$: Regularization penalty to control overfitting.
4.4.1. Regularization Penalties
Regularization helps prevent overfitting by penalizing the magnitude of model coefficients. Logistic regression supports the following types:

$\ell_1$ Regularization. Encourages sparsity in the coefficients by penalizing their absolute values:

$$R(\mathbf{w}) = \|\mathbf{w}\|_{1} = \sum_{j}|w_{j}|.$$

$\ell_2$ Regularization. Discourages large coefficients by penalizing their squared values:

$$R(\mathbf{w}) = \|\mathbf{w}\|_{2}^{2} = \sum_{j}w_{j}^{2}.$$
4.4.2. Hyperparameter Tuning with GridSearchCV
To optimize logistic regression, GridSearchCV is used to search for the best combination of hyperparameters:
- ❍ Regularization strength ($C$)
- ❍ Penalty (regularization type)
- ❍ Solver: The choice of solver depends on the type of penalty:
  - − liblinear: Supports both $\ell_1$ and $\ell_2$ penalties.
  - − saga: Suitable for large datasets and supports both $\ell_1$ and $\ell_2$ penalties.
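The penalty/solver pairing above can be expressed as a list-of-grids search, sketched below with illustrative C values.

```python
# Sketch pairing penalties with compatible solvers, as noted above;
# the C values are illustrative assumptions.
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV

lr_grid = [
    {"penalty": ["l1", "l2"], "solver": ["liblinear"], "C": [0.01, 0.1, 1, 10]},
    {"penalty": ["l1", "l2"], "solver": ["saga"], "C": [0.01, 0.1, 1, 10]},
]
lr_search = GridSearchCV(LogisticRegression(max_iter=1000), lr_grid, cv=5, n_jobs=-1)
lr_search.fit(X_train, y_train)
```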
4.5. Hyperparameter Tuning for Decision Tree (DT)
- ❍ max_depth: Maximum tree depth.
- ❍ min_samples_split: The minimum number of samples required to split an internal node.
- ❍ min_samples_leaf: The minimum number of samples required at a leaf node.
- ❍ Criterion: Entropy or Gini.
4.5.1. Gini Impurity (Same as RF)
The Gini Impurity is calculated as:

$$G = 1 - \sum_{k=1}^{K} p_{k}^{2},$$

where:
- ❍ $p_{k}$: The probability of a data point belonging to class k.
- ❍ $K$: The total number of classes.
4.5.2. Entropy
Entropy is defined as:

$$H = -\sum_{k=1}^{K} p_{k}\log_{2} p_{k},$$

where:
- ❍ $p_{k}$: The probability of a data point belonging to class k.
- ❍ $K$: The total number of classes.
4.5.3. Information Gain
Information Gain measures the reduction in entropy after splitting the dataset and is defined as:

$$IG = H_{\text{parent}} - \sum_{i=1}^{n} \frac{N_{i}}{N} H_{i},$$

where:
- ❍ $H_{\text{parent}}$: Entropy of the parent node.
- ❍ $N$: Total number of samples in the parent node.
- ❍ $N_{i}$: Number of samples in the i-th child node.
- ❍ $H_{i}$: Entropy of the i-th child node.
- ❍ $n$: Number of child nodes after the split.
4.5.4. GridSearchCV Hyperparameter Tuning
To optimize the decision tree, GridSearchCV searches over a predefined set of hyperparameters:
- ❍ Maximum depth of the tree (max_depth)
- ❍ Minimum number of samples required to split an internal node (min_samples_split)
- ❍ Minimum number of samples required to be at a leaf node (min_samples_leaf)
- ❍ Criterion for measuring the quality of a split (gini or entropy)
4.6. Hyper-Parameter Tuning for Naive Bayes (NB)
Alpha ($\alpha$): The additive smoothing parameter for MultinomialNB and ComplementNB. There are no important hyperparameters for GaussianNB.
4.6.1. Posterior Probability
Naive Bayes is based on Bayes' Theorem, which calculates the posterior probability as:

$$P(y \mid \mathbf{x}) = \frac{P(\mathbf{x} \mid y)\,P(y)}{P(\mathbf{x})},$$

where:
- ❍ $P(y \mid \mathbf{x})$: Posterior probability of class y given data x.
- ❍ $P(\mathbf{x} \mid y)$: Likelihood of data x for class y.
- ❍ $P(y)$: Prior probability of class y.
- ❍ $P(\mathbf{x})$: Evidence, which normalizes the probabilities.
4.6.2. Gaussian Naive Bayes (GaussianNB)
For continuous features, Gaussian Naive Bayes assumes that $P(x_{i} \mid y)$ follows a normal distribution:

$$P(x_{i} \mid y) = \frac{1}{\sqrt{2\pi\sigma_{y}^{2}}}\exp\!\left(-\frac{(x_{i} - \mu_{y})^{2}}{2\sigma_{y}^{2}}\right),$$

where:
- ❍ $\mu_{y}$: Mean of the feature for class y.
- ❍ $\sigma_{y}^{2}$: Variance of the feature for class y.
Multinomial Naive Bayes (MultinomialNB)
For discrete features, such as text or count data, Multinomial Naive Bayes models $P(x_{i} \mid y)$ as:

$$P(x_{i} \mid y) = \frac{N_{yi} + \alpha}{N_{y} + \alpha n},$$

where:
- ❍ $N_{yi}$: Count of feature i in class y.
- ❍ $N_{y}$: Total count of all features in class y.
- ❍ $\alpha$: Smoothing parameter to handle zero probabilities.
- ❍ $n$: Number of features.
4.6.3. Hyperparameter Tuning with GridSearchCV
To optimize the Naive Bayes model, the following hyperparameters are tuned using GridSearchCV:
- ❍ Smoothing parameter ($\alpha$)
- ❍ Model type:
  - − GaussianNB: For continuous features.
  - − MultinomialNB: For text or count-based features.
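A hedged sketch of this tuning setup follows; the alpha values are illustrative, and the count matrix `X_train_counts` is hypothetical, since MultinomialNB expects non-negative features.

```python
# Sketch of Naive Bayes tuning as described above; alpha values are
# assumed. GaussianNB is used with defaults, as the text states.
from sklearn.naive_bayes import GaussianNB, MultinomialNB
from sklearn.model_selection import GridSearchCV

gnb = GaussianNB()  # no important hyperparameters to tune

nb_search = GridSearchCV(MultinomialNB(), {"alpha": [0.1, 0.5, 1.0]}, cv=5)
# nb_search.fit(X_train_counts, y_train)  # X_train_counts: non-negative features
```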
The SDN controller controls the network and guides data flow across the various connected devices, as shown in Figure 5. The controller connects to two essential network components (EN1 and EN2), which act as gateways for traffic control. By connecting to these components, the IoT devices (IoTDevice1, IoTDevice2, and IoTDevice3) allow for centralized management of IoT activities. The system has a router (R4) to manage additional routing responsibilities and a switch (ES1) for increased traffic distribution. The network has critical endpoints, such as a server, a user, and a possible attacker, raising operational and security issues. The logical separation in SDN networks enhances flexibility, scalability, and security, as illustrated by the red and blue connections, which represent the control and data channels, respectively.
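A minimal Mininet sketch of this topology, under the assumption that EN1, EN2, ES1, and R4 can all be modeled as OpenFlow switches and that POX listens locally on its default port, might look as follows.

```python
# Sketch of the Figure 5 topology in Mininet; node roles follow the
# description above, with R4 simplified to a switch and POX attached
# as a remote controller.
from mininet.net import Mininet
from mininet.node import RemoteController

net = Mininet(controller=None)
net.addController("c0", controller=RemoteController,
                  ip="127.0.0.1", port=6633)   # POX listening locally (assumed)

en1 = net.addSwitch("s1")   # EN1
en2 = net.addSwitch("s2")   # EN2
es1 = net.addSwitch("s3")   # ES1
r4 = net.addSwitch("s4")    # R4, simplified to a switch here

iot1, iot2, iot3 = net.addHost("iot1"), net.addHost("iot2"), net.addHost("iot3")
server, user, attacker = net.addHost("srv"), net.addHost("usr"), net.addHost("atk")

for dev in (iot1, iot2, iot3):
    net.addLink(dev, en1)
net.addLink(en1, en2)
net.addLink(en2, es1)
net.addLink(es1, r4)
net.addLink(server, r4)
net.addLink(user, es1)
net.addLink(attacker, en2)

net.start()
net.pingAll()   # basic connectivity check
net.stop()
```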
The POX controller has security components and a Python script that automates the execution of experiments in various network settings. The main tasks include topology development, dataset production, monitoring, and policy changes. The framework is tested and improved by simulating real-world situations in a simulation environment. We enforce micro-segmentation, continuous authentication, and stringent access controls via the SDN controller. Machine Learning (ML) and Zero Trust security in SDN improve network security by implementing dynamic access control, continuous authentication, and micro-segmentation. By limiting illegal connections between various network zones, micro-segmentation in SDN dynamically separates network segments, preventing attackers from moving laterally. The SDN controller changes the flow tables to isolate compromised devices and stop malicious traffic if an intrusion is found, as shown in Figure 6 and Figure 7. Based on real-time behavioral analysis, continuous authentication, powered by machine learning, ensures continuous identity verification. The ML models examine network traffic for abnormalities rather than relying only on login credentials; if an anomaly is found, re-authentication is instantly initiated or access is revoked. With the help of SDN's centralized control plane, dynamic access control enforces least-privilege policies, limiting user and device access to just the required resources. Training machine learning models on network traffic data achieves real-time prediction and mitigation of cyberattacks. We put the network through various attack scenarios to evaluate the efficiency of the framework. With strict security measures and intelligent prediction, the technique guarantees a thorough approach to 5G edge network security.
By preventing malicious traffic, rerouting questionable flows for additional examination, and isolating compromised hosts to limit their access, the SDN controller continuously updates flow tables to counteract attacks. This automatic method minimizes manual involvement and lowers the possibility of a protracted network outage while guaranteeing a prompt reaction to security concerns. To determine whether detection accuracy and reaction time stay constant, we test the system under low, medium, and high traffic loads, which are typical of real-world networks that encounter varying traffic situations. This assessment, which includes a comparative performance study of detection accuracy and response time, helps determine whether specific machine learning models degrade under high traffic levels, ensuring that the suggested strategy is workable for widespread implementation.
6. Discussion
This research study addresses the problems faced by controllers deployed in software-defined networking systems. The primary objectives were establishing an SDN network, creating network flows that mimic attacks and mitigation, and applying a machine learning technique to detect and immediately stop these flows. We successfully implemented this method in a Mininet emulation environment, and the above data demonstrate how well it worked to identify and mitigate SDN threats. Firewalls, intrusion prevention systems, and many other tools are available in conventional networking setups, but their capabilities are restricted because they rely on signatures kept in their security databases. Based on Python, we developed and used a machine-learning method and mechanism for network traffic analysis using the CICDDoS2019 dataset. The results demonstrate that using GridSearchCV to adjust hyperparameters dramatically improves the performance of the proposed methods, not only improving accuracy, precision, recall, and F1-score but also reducing computational time. Random Forest and Decision Tree achieved exceptional performance, with 99% across all evaluation metrics. This shows that RF and DT are particularly well suited for real-time detection because they combine high accuracy with low computational time, making them ideal for mitigating DDoS attacks effectively. The outcomes show how well machine learning-based attack mitigation strategies work to lower attack traffic and enhance network efficiency.
The controller analyzes every network transaction and produces metrics that are input into a machine-learning detection algorithm to determine whether an attack occurred. After capturing the network traffic through Wireshark, the controller utilizes the detection model to determine whether there is an attack and classifies it based on the network parameters. According to Amrish et al. [30], who have also used machine learning methods to prevent attacks in SDN, a dynamic flow management and reconfiguration methodology significantly reduces attack traffic. Their findings show that in SDN systems, machine learning-based strategies may efficiently reduce the effects of attacks and enhance network performance. By dropping and blocking affected ports, our mechanism provides an outstanding mitigation strategy, in line with research on flexible flow monitoring and modification by Kousar et al. [47]. Our method dynamically changes the network flow rules to discard or redirect traffic associated with identified activities. By altering the network flow patterns, malicious traffic is isolated or redirected to reduce its impact on legitimate traffic. Our findings and discussion collectively demonstrate how machine learning methods, particularly Random Forest and Decision Tree, improve SDN attack detection and mitigation. This study sheds light on their functionality and how they affect network performance, and offers recommendations for further research in real enterprise networks.