1. Introduction
The rapid increase in the number of mobile devices is driving the growth of the Internet and communication field, leading to significant security challenges. Network security serves as a critical shield against malicious intrusions, preventing unauthorized access to sensitive data and systems. Cybercriminals continuously look for vulnerabilities to exploit, making it essential to establish strong defenses for safeguarding against potential risks. Therefore, detecting and preventing attacks before they occur is crucial for network protection. Intrusion Detection Systems (IDS) emerged as a promising security measure to complement and strengthen network defenses. In network environments, IDS serves a vital function by actively monitoring network activities to detect potentially malicious behaviors. It assesses the data flow within the internal network before it is transmitted externally, searching for any indications of abnormalities. Upon detecting anomalies, an alert is triggered and sent to system administrators for prompt response to the incident.
Traditional intrusion detection systems (IDS) rely on predefined rules and signatures to identify malicious activities. While this approach offers some initial effectiveness, the growing complexity and volume of attacks expose its limitations. Attackers can readily bypass these systems by constructing variants of known attacks or targeting entirely new vulnerabilities. This vulnerability to novel attack methods results in a steadily increasing error rate for traditional IDS [
1]. Addressing these limitations is crucial for bolstering the overall resilience of intrusion detection systems in dynamic cybersecurity landscapes.
To solve this problem, Deep Learning (DL) introduces a dynamic and adaptive approach that supplements the rule-based nature of traditional IDS [
2,
3]. DL has made significant advancements across various fields, such as image processing [
4], storage systems [
5], speech recognition [
6], and cybersecurity [
7]. By leveraging DL algorithms, an IDS gains the ability to learn from historical data, adapting to evolving threats without solely relying on predefined rules. This adaptive learning empowers the system to recognize patterns and anomalies, making it more adept at identifying novel attack methods and zero-day exploits.
In the field of DL, attack classification involves two main types: binary classification and multi-class classification (
Figure 1). Unlike binary classification, which divides network traffic into normal and abnormal categories, multi-class classification specifically delineates attack types, enabling a network intrusion detection system (NIDS) to distinguish between incoming flows and thereby reduce false alarm rates [
8]. As network size expands, attack techniques not only diversify but also grow in complexity. Consequently, detecting various attack types becomes imperative for businesses to preempt intrusions, particularly in scenarios where numerous attack variations abound.
One of the simplest ways to achieve high performance in multi-class intrusion detection within NIDS is to deploy multiple algorithms simultaneously, each algorithm specializing in the detection of a specific type of attack (
Table 1). While this method may seem effective, it is not without drawbacks. Primarily, it can lead to increased computational costs, which is a significant concern for real-time detection capabilities [
9]. High efficiency often demands significant resource consumption and vice versa [
10]. Therefore, it is essential to balance maintaining high intrusion detection performance and minimizing computational costs in a multi-class intrusion detection scenario.
Furthermore, when working with multi-class intrusion detection, researchers have encountered several challenges. Network traffic, primarily in the form of flow data, is inherently unstructured, as flows can be generated randomly between endpoints over time. Additionally, while flow information is crucial for detecting attacks, not every attribute is pertinent to every attack type [
11]. For instance, “Flow Duration” holds significance in identifying Denial-of-Service (DoS) attacks due to its focus on the duration of data flow persistence. Conversely, in detecting Injection attacks, which prioritize message content over traffic duration, this attribute might be less critical. Finally, in network traffic analysis, port information plays a critical role in identifying malicious activity [
12]. Intruders often exploit unusual port usage patterns. This includes unauthorized access attempts on uncommon ports or irregular traffic patterns on standard ports. By incorporating port data, we gain a deeper understanding of network communication, enabling more precise intrusion detection. This sets the stage for our method, which improves multi-class intrusion detection.
To tackle this challenge, Graph neural networks (GNNs) present a promising solution for multi-class intrusion detection. Within intrusion detection, GNNs can depict network topology as a graph and learn patterns from it, facilitating the analysis of unstructured data. GNNs excel in processing such data by capturing relationships between network elements. By leveraging GNNs, we can effectively integrate these diverse data sources to enhance the precision of intrusion detection across multi-class attacks.
Table 1. Attack-Algorithm Mapping.
Attack | Algorithm |
---|---|
DoS | RNN [13] |
Port Scan | Support Vector Machine (SVM) [14] |
DDoS | E-GraphSAGE [15] |
Reconnaissance | E-GraphSAGE [15] |
Backdoor | E-GraphSAGE [15] |
Fuzzers | SVM [16] |
Generic | SVM [16] |
Exploits | SVM [16] |
In general, the majority of NIDS benchmark datasets (e.g., [
17,
18,
19]) offer insights into network traffic or data flow. Intrusion detection aims to identify malicious attacks within the network system. In GNN representations of these systems, such flows are represented as edges, with associated data serving as edge features. Consequently, harnessing edge features within GNN architectures becomes crucial for accurate intrusion detection. However, conventional GNN approaches tend to prioritize node features for node classification tasks, overlooking the potential benefits of incorporating edge features [
20]. This emphasis on node features alone may limit the effectiveness of intrusion detection systems, as it overlooks critical contextual information encoded within the edges. Edge features capture nuanced relationships and interactions between nodes, offering valuable insights into the network’s behavior and facilitating more precise detection of anomalous activities. Therefore, advancing GNN methodologies to incorporate edge features alongside node features is imperative for enhancing the accuracy and robustness of intrusion detection systems in safeguarding network security. J. Gilmer [
21] introduced a Message Passing Neural Networks (MPNN) function, which incorporates edge features to predict the quantum mechanical properties of molecules. Similarly, L. Gong [
22] proposed a method to handle edge features in multigraphs. Their framework enables edge features to dynamically adapt across network layers, allowing models to effectively leverage richer edge feature information. However, both studies primarily utilize edge features to enhance node representation rather than edge classification, a crucial aspect of NIDS. To harness edge features for flow data detection, W. Lo proposed an inductive algorithm named E-GraphSAGE. While this method demonstrated promising results in IoT networks, it notably disregards port information post-graph creation, which is pivotal for attack detection.
Specifically, E-GraphSAGE [15] performs flow detection through edge classification. Although the model shows an efficient way to use flow information as edge features, it places all features on the edges and sets every node feature to an all-ones vector. Consequently, the graph is not a close representation of the actual network.
Based on this limitation, this paper introduces a novel approach to enhance GraphSAGE, a variant of GNN, for more effective multi-class intrusion detection in network flow data. Our method concentrates on rearranging the features embedded in the graph. Host-related information, including IP addresses and ports, is assigned to the nodes, and flow-related information, such as flow duration, flow length, and number of packets per second, is assigned to the edges. As a result, when information is aggregated and concatenated in the MPNN process, the node provides identification information (e.g., the port on which the attack occurs), and the edge carries the characteristics of that attack. Port identification is essential because some attacks only occur on specific ports. For example, a brute-force attack targets ports such as 21 and 22, whose services require a password for login, in order to crack the user’s password. Finally, to enhance training and prediction performance, we utilize an additional embedding layer to extract more informative features from neighboring nodes. This improved feature utilization contributes to more accurate intrusion detection.
We summarize our contributions as follows:
Our research highlights how paying attention to the details of network traffic, specifically the edge features, can greatly improve GNN for spotting different types of attacks from the Internet. By focusing on these small but crucial aspects, we aim to improve IDS to recognize and prevent cyber threats, which are always changing and becoming more sophisticated.
Our contribution underscores the critical role of port information in identifying attacks, a factor often neglected during the training process. By incorporating this vital aspect into our approach, we aim to enhance the accuracy and effectiveness of intrusion detection systems. This emphasis on utilizing port information is key to bolstering network security defenses against emerging cyber threats.
This paper proposes a method to reorganize the features embedded within the graph. By doing so, the graph can more accurately represent the actual network, where nodes display endpoint identification information such as IP addresses and ports, and edges contain flow information such as Duration, Number of Packets per second, and Length…
Our experiments reveal substantial performance enhancements over traditional methods, achieving accuracy rates of 98.32% on the CIC-IDS-2017 dataset and 96.71% on the UNSW-NB15 dataset.
The remainder of this paper is organized as follows.
Section 2 discusses key related work, while
Section 3 provides relevant background about multi-class attacks, IDS, GNN, and GraphSAGE. In
Section 4, we present our proposed method.
Section 5 covers the experiments and evaluation. Lastly, in
Section 6, we provide conclusions and outline avenues for future research.
2. Related Work
In recent years, many researchers [
23,
24] have applied machine learning to intrusion detection and proposed anomaly detection models, because a rule-based intrusion detection system cannot detect variants of existing attacks that match no signature in its database. Aamir et al. [
14] proposed a solution to detect distributed denial-of-service (DDoS) and port scan attacks by using a Support Vector Machine (SVM) to enhance network security and improve the accuracy of identifying malicious activities. Although the algorithm achieves high performance in detecting intrusion, it is primarily designed for binary classification.
To address the limitations of traditional machine learning (ML) methods in intrusion detection, deep learning (DL) has emerged as a promising alternative, demonstrating superior accuracy. Various DL techniques have been applied, including Convolutional Neural Networks (CNN) [25], Recurrent Neural Networks (RNN) [13, 26, 27], and traditional Multi-Layer Perceptron (MLP) [28]. Despite their impressive performance, these methods share a limitation: CNN is primarily designed for grid data such as images, and RNN performs well with sequential data such as text, both of which are structured data. These models are therefore less effective at capturing flow data, which are unstructured [
29,
30].
To overcome these shortcomings, Graph Neural Networks (GNNs) have gained prominence in DL, particularly for tasks involving graph-structured data. GNNs excel in capturing complex relationships within graphs, leveraging both local and global information to propagate data across nodes effectively. This adaptability makes GNNs well-suited for various applications, including social network analysis [
31,
32], recommendation systems [
33,
34], and biological network analysis [
21,
35]. In the realm of intrusion detection, GNNs offer a promising approach by representing network topology as a graph and learning patterns from it, thus enabling analysis of unstructured data [
36,
37]. GNNs are uniquely designed to navigate and analyze graph structures, making them suitable for the dynamic and interconnected nature of network activities.
By utilizing GNN, Zhou et al. [
20] effectively applied the Graph Convolutional Network (GCN), which is a variant of GNN, for botnet attack detection via node classification tasks, by simulating botnet traffic parallel to normal network traffic. However, their study primarily concentrates on node features for classification tasks without considering edge features. The goal of NIDS is to identify and detect attacks on traffic and flows [
3]. This presents the challenge of edge classification on flow datasets, where crucial information is primarily provided through edges. Leveraging the edges (flows) feature allows the model to adeptly manage unseen flows entering the network [
30,
38]. Moreover, relying on information from NIDS benchmark network datasets [
17,
18], which offer more information as edge features rather than node features, enables effective edge classification. Some existing graph representation learning methods [
21,
22] have already incorporated edge features to enhance node representation for improved performance. However, these methods were not specifically designed for edge classification, which is the primary objective of NIDS.
W. W. Lo [
15] introduced an inductive algorithm called E-GraphSAGE for flow detection using edge classification. Although the approach achieved high performance, the authors used port information only in the graph-creation step; afterwards, they discarded it and used an all-ones vector as node features instead. Port information is crucial because it provides identification information during training, giving the model more context about the flow data. Therefore, omitting port information can result in suboptimal performance in both the training and testing stages. To overcome this limitation, we propose a method to rearrange the features embedded in the graph. In this approach, identification information, such as IP addresses and ports, is embedded as node features, and flow information, such as Flow Duration, Flow Length, and Number of Packets per second, is embedded as edge features. Furthermore, to extract more information from neighbors, we also utilize an additional embedding layer.
Based on the E-GraphSAGE algorithm, Mirlashari [
39] proposed a modified version to enhance IoT intrusion detection systems. In their method, the message function is modified to compute by concatenating the source node and edge features. This approach allows the model to flexibly process complex relationships between nodes and edges by incorporating both the source node and edge features. However, their method still ignores port information, which is important for effective training and testing. Caville [
40] introduced the Anomal-E approach to learn edge features and graph topological structure. In this method, a graph is constructed with nodes representing hosts and edges describing the flows between them. By adapting E-GraphSAGE for a self-supervised learning task, the model effectively enables edge embeddings without using any data labels.
In contrast to the previous studies, we proposed a method to rearrange and leverage all features in the dataset. This approach provides the model with more information during training, resulting in improved performance when tested with incoming unseen flows.
4. Methodology
Most of the NIDS benchmark datasets, such as CIC-IDS-2017 or UNSW-NB15, provide information related to flow data, which is represented as edge features when transformed into the graph. Therefore, selecting edge features to depict flow information such as Flow Duration, Flow Length, and Number of packets per second,… allows the model to better analyze the relationship and interaction between nodes, facilitating more precise detection of anomalous network activities.
Traditional GNN models have found successful applications in various domains. However, these approaches primarily concentrate on node features for node classification tasks, neglecting the consideration of edge features for edge classification. While some existing graph representation learning methods have suggested leveraging edge features, their primary goal has been to enhance the efficiency of node representation for improved training performance. In this context, the introduction of the E-GraphSAGE study aims to address the issues mentioned above.
Nevertheless, we have observed that pertinent endpoint information, such as IP addresses and ports, is represented as node features, and these were excluded during training. This results in a diminished model performance as it fails to exploit all the features present in the dataset fully. Furthermore, their aggregation function is relatively straightforward, comprising only one input and one output layer, restricting the model’s ability to explore deeper structures.
In our proposal, to improve the model’s performance in handling multi-class attacks, we aim to make full use of all information within the dataset and reorganize the details associated with nodes and edges. Moreover, we incorporate an extra hidden layer in the aggregate function to extract deeper information. In this section, we provide detailed insights into our proposed model and the sequence of steps when it is applied in NIDS.
4.1. Dataset
To evaluate our model when applied to NIDS in this paper, we utilized two different NIDS datasets, each containing distinct features and labels for various types of attacks. The first dataset is CIC-IDS-2017 (Canadian Institute for Cybersecurity Intrusion Detection Systems 2017), followed by UNSW-NB15 (University of New South Wales, Network Based 15).
4.1.1. CIC-IDS-2017
This is a comprehensive collection of labeled network traffic data for the evaluation and development of intrusion detection systems. The dataset covers different types of attacks, including DoS, DDoS, and Port Scan [
17]. The inclusion of realistic and diverse scenarios makes the CIC-IDS-2017 dataset a valuable resource for assessing the robustness and effectiveness of intrusion detection mechanisms. This dataset is comprised of 77 features with corresponding class labels with a total of 2,830,743 flows.
4.1.2. UNSW-NB15
This is a comprehensive collection of network traffic data designed for evaluating intrusion detection systems [
19]. It includes both normal and malicious network traffic, making it a valuable resource for training and testing intrusion detection systems. The dataset covers various attack scenarios, such as DoS, Generic, and Exploits, providing researchers and practitioners with a realistic representation of cyber threats in a network environment. The dataset contains 47 features with corresponding class labels with 2,515,798 flows.
4.2. Proposed Model
4.2.1. Graph Creation
Several studies [
21,
22] have introduced the utilization of edge features, especially in [
15], where the authors successfully employed an algorithm that allows collecting information for edge and executing edge classification. However, they omit information related to endpoints, such as IP addresses and Ports. This limitation can be critical because some attacks primarily target specific ports (e.g., Brute force attacks focus on ports like 22, 21, etc., which require a password for login to crack the user’s password). Thus, port information plays a pivotal role in enabling Network Intrusion Detection Systems (NIDS) to identify suspicious flows effectively. By incorporating port information, we can better mitigate incoming intrusions to the network [
12].
In the context of an intrusion network, it is common for two flows to be generated from the same Source IP and Destination IP but with different Ports, as shown in
Figure 6; these flows represent distinct communication channels. To capture this behavior in our analysis, we can combine the source IP address with the source port and the destination IP address with the destination port after importing the flow dataset. This approach offers two advantages: (1) Creation of unique identifiers by combining IP and port information. These identifiers can represent individual users or communication channels. (2) Reduction in dataset dimensionality by merging the separate source and destination port columns into single combined fields. This streamlining allows the model to focus on the core aspects of communication rather than individual port numbers, potentially improving flow differentiation and classification accuracy.
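The merging step described above can be sketched in a few lines of Python. This is only an illustration: the field names (`src_ip`, `src_port`, `dst_ip`, `dst_port`, `flow_duration`) and the `ip:port` string format are assumptions made for the example, not the actual column names of the benchmark datasets.

```python
# Hypothetical flow records; the field names are assumptions for illustration.
flows = [
    {"src_ip": "192.168.1.10", "src_port": 51234,
     "dst_ip": "10.0.0.5", "dst_port": 22, "flow_duration": 1200},
    {"src_ip": "192.168.1.10", "src_port": 51235,
     "dst_ip": "10.0.0.5", "dst_port": 21, "flow_duration": 800},
]

ENDPOINT_FIELDS = ("src_ip", "src_port", "dst_ip", "dst_port")


def merge_endpoints(flow):
    """Replace the four IP/port fields with two combined endpoint identifiers."""
    # Keep every non-endpoint field (the flow statistics) unchanged.
    merged = {k: v for k, v in flow.items() if k not in ENDPOINT_FIELDS}
    # Combine IP and port so each communication channel gets a unique identifier.
    merged["src"] = f"{flow['src_ip']}:{flow['src_port']}"
    merged["dst"] = f"{flow['dst_ip']}:{flow['dst_port']}"
    return merged


merged_flows = [merge_endpoints(f) for f in flows]
```

Both example flows share the same source and destination IP addresses, yet they end up with different `src` and `dst` identifiers, and the four endpoint columns collapse into two.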
Then, the graph is constructed with nodes represented by unique combinations of IP addresses and corresponding individual Ports and an edge representing a flow identified by Source IP and Destination IP (
Figure 7). This enables us to model the network accurately in graph format. After creating the graph, essential edge-related information, such as Flow Duration, Length of Packet, number of Packets per Second, etc., is embedded as edge features, while information related to the identification of flow data, such as IP addresses and ports, is embedded as node features. This approach allows us to leverage all the information in the dataset, preventing information loss during the training process.
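A minimal sketch of this graph construction is given below, using plain Python dictionaries rather than a graph library. The record fields and the raw `{"ip", "port"}` node features are assumptions for illustration; the paper does not specify how identification information is numerically encoded.

```python
ENDPOINT_FIELDS = ("src_ip", "src_port", "dst_ip", "dst_port")


def build_graph(flows):
    """Return (nodes, edges): nodes keyed by 'ip:port', one edge per flow."""
    nodes = {}   # endpoint id -> node features (identification information)
    edges = []   # (source id, destination id, edge features)
    for flow in flows:
        src = f"{flow['src_ip']}:{flow['src_port']}"
        dst = f"{flow['dst_ip']}:{flow['dst_port']}"
        # Identification information goes on the nodes.
        nodes.setdefault(src, {"ip": flow["src_ip"], "port": flow["src_port"]})
        nodes.setdefault(dst, {"ip": flow["dst_ip"], "port": flow["dst_port"]})
        # Flow statistics go on the edge.
        edge_features = {k: v for k, v in flow.items() if k not in ENDPOINT_FIELDS}
        edges.append((src, dst, edge_features))
    return nodes, edges


# Hypothetical flows: two IP addresses, several ports.
flows = [
    {"src_ip": "192.168.1.10", "src_port": 51234, "dst_ip": "10.0.0.5",
     "dst_port": 22, "flow_duration": 1200, "packets_per_s": 15.0},
    {"src_ip": "192.168.1.10", "src_port": 51234, "dst_ip": "10.0.0.5",
     "dst_port": 21, "flow_duration": 800, "packets_per_s": 9.5},
    {"src_ip": "192.168.1.10", "src_port": 51235, "dst_ip": "10.0.0.5",
     "dst_port": 22, "flow_duration": 300, "packets_per_s": 42.0},
]
nodes, edges = build_graph(flows)
```

In this toy example, two IP addresses yield four nodes because each distinct IP and port combination becomes its own node, while the number of edges equals the number of flows.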
4.2.2. Node-Embedding
We employ the Message-Passing Neural Network algorithm [
15] to gather information from the local neighborhood of a node, including both neighboring nodes and the edges connecting them. This collected information is then used to compute updates to the node’s embedding. To manage the potentially large number of neighbors, we specifically use the GraphSAGE algorithm (detailed in
Section 3) to sample a fixed number of neighbors at each step. By iterating this process multiple times, nodes progressively gain a deeper understanding of their network environment.
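At the l-th layer, the aggregated neighborhood information $h_{N(u)}^{l}$ at node u can be expressed as Equation (1):
$$h_{N(u)}^{l} = \mathrm{AGG}_{l}\left(\left\{\left(h_{v}^{l-1},\, e_{uv}^{l-1}\right) : v \in N(u)\right\}\right) \tag{1}$$
Here, $h_{v}^{l-1}$ represents the information of neighbor node v in the previous layer, and $e_{uv}^{l-1}$ represents the information of the neighbor edge between node u and node v in the previous layer. The information of all neighbor nodes $v \in N(u)$ is collected into the embedding of node u at layer l.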
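Subsequently, the aggregated neighborhood information $h_{N(u)}^{l}$ from all neighbor nodes v is concatenated with the embedding of node u from the previous layer, $h_{u}^{l-1}$, in order to update the information for node u at layer l. The result is then processed by the model’s trainable parameters $W^{l}$ and passed through a non-linear activation function $\sigma$ (e.g., ReLU, Sigmoid). The embedding of node u at layer l is calculated as indicated in Equation (2):
$$h_{u}^{l} = \sigma\left(W^{l} \cdot \mathrm{CONCAT}\left(h_{u}^{l-1},\, h_{N(u)}^{l}\right)\right) \tag{2}$$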
At the final iteration, the embedding of node u is denoted $z_{u}$, the final value of node u after the last layer L, as depicted in Equation (3):
$$z_{u} = h_{u}^{L} \tag{3}$$
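The node-embedding step described above (aggregate neighbor node and edge information, concatenate it with the node's own previous embedding, then apply trainable weights and a non-linearity) can be sketched in plain Python. The mean aggregator, the toy feature values, and the hand-picked weight matrix `W` are all assumptions for illustration; the text does not fix these choices.

```python
def relu(x):
    """Element-wise ReLU on a list of floats."""
    return [max(0.0, v) for v in x]


def mean(vectors):
    """Element-wise mean of equally sized vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]


def matvec(W, x):
    """Multiply matrix W (list of rows) by vector x."""
    return [sum(w * v for w, v in zip(row, x)) for row in W]


def sage_layer(h, e, neighbors, W):
    """One layer: h[u] <- relu(W . concat(h[u], mean_v concat(h[v], e[(u, v)])))."""
    new_h = {}
    for u, nbrs in neighbors.items():
        # Message from each neighbor: its node features joined with the edge features.
        messages = [h[v] + e[(u, v)] for v in nbrs]
        h_neigh = mean(messages)                    # aggregated neighborhood information
        new_h[u] = relu(matvec(W, h[u] + h_neigh))  # concatenate, transform, activate
    return new_h


# Toy neighborhood: target node 4 with neighbors 5 and 6.
h = {4: [1.0, 0.0], 5: [0.0, 1.0], 6: [1.0, 1.0]}   # node features
e = {(4, 5): [2.0], (4, 6): [4.0]}                  # edge features
neighbors = {4: [5, 6]}
W = [[1.0, 0.0, 1.0, 0.0, 0.0],
     [0.0, 1.0, 0.0, -1.0, 1.0]]                    # maps 5 inputs to 2 outputs

new_h = sage_layer(h, e, neighbors, W)
```

Stacking several such layers lets each node draw on progressively larger neighborhoods, one additional hop per layer.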
The node embedding process is depicted in
Figure 8. In the figure, node 4 (highlighted in red) serves as the target node for the node embedding operation. To initiate the process, nodes 5 and 6 are identified as the neighboring nodes of node 4, and their features in layer $L-1$, together with the edges connecting nodes 5 and 4 and nodes 6 and 4, undergo aggregation. Subsequently, these aggregated features are concatenated with the features of node 4. Finally, the outcome is processed by the model’s trainable parameters $W$ and passed through the activation function $\sigma$, updating the representation of node 4 in layer L.
4.2.3. Edge Embedding
In most of the current NIDS benchmark datasets, network flow information is provided as edge features in graph format for edge classification tasks rather than node features for node classification tasks. Therefore, after performing node embedding to update information on nodes from their neighbors, at this stage, we will update information for edges (edge embedding). This process aims to achieve the ultimate goal of the model, which is multi-class attack detection through edge classification.
As mentioned before, after creating a graph from the flow data, information about the flows is embedded into the edges $e_{uv}$. At the final iteration of node embedding, the edge between nodes u and v is embedded by concatenating the final embeddings $z_{u}$ and $z_{v}$ of these two nodes, as shown in Equation (4):
$$z_{uv} = \mathrm{CONCAT}\left(z_{u},\, z_{v}\right) \tag{4}$$
This involves combining the representations or features of nodes u and v to form a comprehensive and informative embedding for the edge connecting them. The output of this process is a vector with m rows corresponding to m classes. This output is a one-hot matrix used to encode the m labels. Then, the attack classification for each edge i is computed as shown in Equation (5):
In the graph construction step, the number of flows in the network equals the number of edges in the graph. Therefore, each label of flow is assigned to each edge.
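The edge-embedding and classification steps can be illustrated with a short Python sketch. Concatenating the two final node embeddings follows the description above; the linear output layer `W_out`, the argmax decision rule, and the class names are assumptions added for the example and are not taken from the paper.

```python
def edge_scores(z_u, z_v, W_out):
    """Concatenate the two node embeddings and score each of the m classes."""
    z_uv = z_u + z_v  # edge embedding: concatenation of the endpoint embeddings
    return [sum(w * x for w, x in zip(row, z_uv)) for row in W_out]


def predict_class(scores):
    """Return the index of the highest-scoring class."""
    return max(range(len(scores)), key=scores.__getitem__)


# Hypothetical final node embeddings and output weights (m = 3 classes).
classes = ["Benign", "DoS", "Port Scan"]
z = {"a": [1.5, 2.0], "b": [0.0, 1.0]}
W_out = [[1.0, 0.0, 0.0, 0.0],
         [0.0, 1.0, 0.0, 1.0],
         [0.0, 0.0, 1.0, 0.0]]

scores = edge_scores(z["a"], z["b"], W_out)
predicted = classes[predict_class(scores)]
```

Because every flow is exactly one edge, running this once per edge yields one predicted label per flow.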
4.2.4. Model Framework
To summarize, in this section, the model framework will be discussed (
Figure 9). Firstly, for the NIDS dataset, a graph is constructed where nodes represent different IP addresses, and edges represent flow data between two IP addresses. To distinguish between flows generated by the same IP address but different ports, we concatenated the values of Source IP with Source Port and Destination IP with Destination Port. This concatenation ensures clearer differentiation between nodes. Subsequently, the embedding layers for both nodes and edges are executed. During this step, through multiple iterations, nodes collect information from their neighboring nodes and edges, perform concatenation operations, and then update themselves.
In the final iteration, after completing node embedding, the edge embedding process takes place by concatenating the two final results from the nodes and updating the edges. Finally, to enhance the effectiveness of the training process and enable deeper learning, three GraphSAGE layers are applied, utilizing ReLU as the non-linear activation function for classifying attacks for each edge in the model.
5. Experiments and Evaluation
5.1. Experiments
Our model was evaluated on the two datasets mentioned in
Section 4: CIC-IDS-2017 and UNSW-NB15. Since the experimental results across these datasets are similar, this section primarily focuses on presenting the outcomes of the CIC-IDS-2017 dataset.
In our experiment, we executed each step, including data splitting, graph transformation, training, and testing/evaluation, in a workflow as illustrated in
Figure 10.
To keep the dataset human-readable, some features are stored in textual data types (object or string). These must be converted into numerical formats (int or float) before the deep learning model can compute on them. Therefore, in the initial step, textual features are encoded into numerical formats. The imported dataset is then split into a training set and a test set, and one graph is generated from each. Throughout each training step, the model’s performance on multi-class intrusion detection is assessed using the designated test set, which contains flows the model has not seen before. We simulated the experiment on the hardware with:
Following the training and evaluation phases, the model’s performance is evaluated on the graph derived from the test set. This graph incorporates various flows not encountered during the model’s training, aiming for results that better simulate real-world conditions.
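The preprocessing described in this workflow (encoding textual features as numbers, then splitting the flows into training and test sets) can be sketched as follows. The integer label encoding, the shuffling seed, and the 60/40 split fraction are assumptions for illustration; the fraction is chosen only because it is consistent with the flow counts reported in Section 5.2.

```python
import random


def encode_column(values):
    """Map each distinct textual value to an integer code, in order of appearance."""
    mapping = {}
    codes = []
    for value in values:
        # setdefault returns the existing code or assigns the next free integer.
        codes.append(mapping.setdefault(value, len(mapping)))
    return codes, mapping


def train_test_split(rows, train_fraction=0.6, seed=0):
    """Shuffle rows reproducibly and split them into training and test lists."""
    rows = list(rows)
    random.Random(seed).shuffle(rows)
    cut = int(len(rows) * train_fraction)
    return rows[:cut], rows[cut:]


# Hypothetical textual feature and flow identifiers.
protocol_codes, protocol_map = encode_column(["TCP", "UDP", "TCP", "ICMP"])
train_rows, test_rows = train_test_split(range(10))
```

One graph would then be built from `train_rows` and another from `test_rows`, so that the test graph contains only flows the model has not seen.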
Figure 11 depicts the training and testing results of our model for multi-class attack detection on the CIC-IDS-2017 dataset. The model consists of three GraphSAGE layers, which means that neighbor information is collected from a three-hop neighborhood to exploit the data more deeply. To optimize information gathering from neighbors, we aim to find a balance between acquiring too much information, which leads to redundancy, and gathering too little. This balance is achieved by setting the dropout rate to 0.2 between GraphSAGE layers, ensuring a thoughtful selection of information from neighbors during the node embedding process. Additionally, we use the cross-entropy loss function and train with back-propagation using the Adam optimizer with a learning rate of 0.001.
During both training and testing, it is evident that the testing performance is higher and converges faster than the training process, indicating the effective operation of the model.
5.2. Evaluation
To evaluate the model results in various scenarios, standard metrics listed in
Table 2 are employed. The performance metrics used include recall (the proportion of true attack flows that are correctly predicted as attacks), precision (the proportion of flows predicted as attacks that are truly attacks), F1-score (the harmonic mean of precision and recall), and accuracy. Here, TP represents the number of True Positive samples, TN represents the number of True Negatives, FP represents the number of False Positives, and FN represents the number of False Negatives.
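As a worked illustration of these metrics in the multi-class setting, the sketch below computes them for one class treated as the positive class against all others (one-vs-rest). The labels and predictions are made up for the example; only the standard definitions of the metrics are assumed.

```python
def class_metrics(y_true, y_pred, positive):
    """One-vs-rest precision, recall, F1-score, and accuracy for one class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    tn = sum(t != positive and p != positive for t, p in pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    accuracy = (tp + tn) / len(pairs)
    return {"TP": tp, "FP": fp, "FN": fn, "TN": tn,
            "precision": precision, "recall": recall,
            "f1": f1, "accuracy": accuracy}


# Hypothetical ground-truth labels and model predictions for six flows.
y_true = ["DoS", "DoS", "Benign", "DDoS", "Benign", "DoS"]
y_pred = ["DoS", "Benign", "Benign", "DoS", "Benign", "DoS"]

dos = class_metrics(y_true, y_pred, positive="DoS")
```

In this toy example, the DDoS flow misclassified as DoS counts as a false positive for the DoS class, which lowers its precision without affecting its recall.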
To assess multi-class intrusion detection performance, we benchmark our model against existing models such as E-GraphSAGE [
15], RNN [
13], and SVM [
16]. We evaluate their performance across various attack types and calculate the average performance from the CIC-IDS-2017 and UNSW-NB15 datasets.
Figure 12 illustrates the comparison results using the CIC-IDS-2017 dataset. A total of 137,191 flows from the CIC-IDS-2017 dataset were used in the experiment. After splitting the data into training and test sets, the training set contains 82,314 flows and the test set contains 54,877 flows. The attack types in the test set are distributed as follows:
Benign: 27,673 flows
DoS: 10,781 flows
Port Scan: 12,580 flows
DDoS: 3,843 flows
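The class imbalance in this test set can be made explicit with a short calculation over the counts listed above. The snippet only restates the reported numbers as shares of the test set; the variable names are illustrative.

```python
# Flow counts per class in the CIC-IDS-2017 test set, as reported above.
test_counts = {"Benign": 27673, "DoS": 10781, "Port Scan": 12580, "DDoS": 3843}

total = sum(test_counts.values())
shares = {label: count / total for label, count in test_counts.items()}

for label, share in shares.items():
    print(f"{label:<10} {share:6.1%}")

# Ratio of DoS to DDoS flows, the two classes with similar characteristics.
dos_to_ddos = test_counts["DoS"] / test_counts["DDoS"]
```

The counts sum to the 54,877 test flows, and DDoS accounts for only about 7% of them, with nearly three DoS flows for every DDoS flow.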
Overall, while our model does not outperform E-GraphSAGE on DDoS attacks, it still achieves over 90% accuracy across most classes. As the figure shows, excluding the benign class, our solution exhibits a high detection rate for port scan attacks, with accuracy nearing 95%. Conversely, DDoS exhibits the lowest detection rate, with an accuracy of approximately 91%. The distribution of flow types shows that the number of flows labeled DDoS is very small compared to the remaining labels, especially the DoS label. Furthermore, because the main characteristics of DDoS and DoS attacks are almost the same, the model may easily misclassify flows labeled DDoS as DoS during training, as doing so increases the model's overall accuracy.
Figure 13 shows the comparison results between our model and existing models on the UNSW-NB15 dataset. As with the CIC-IDS-2017 dataset, a total of 126,656 flows were used in the experiment. After splitting the data into training and test sets, the training set contains 75,993 flows and the test set contains 50,663 flows. The attack types in the test set are distributed as follows:
The results presented in the figure show that the proposed method outperforms the E-GraphSAGE, RNN, and SVM algorithms in several traffic categories of the UNSW-NB15 dataset. For Normal traffic, the proposed method achieved the highest accuracy of 98%, outperforming E-GraphSAGE (95%), RNN (94%), and SVM (93%). Similarly, in the Exploits category, the proposed method achieved the best accuracy at 91%, followed by E-GraphSAGE (87%), RNN (85%), and SVM (83%). The distribution of flow types shows that the numbers of flows labeled DoS and Reconnaissance are almost equal. Furthermore, in a reconnaissance attack, if an attacker sends many TCP SYN packets to the target server or conducts high-frequency, continuous port scanning to exploit vulnerabilities or gather information, the resulting resource overload can cause the traffic to be detected as a DoS attack. Therefore, the learned representations of DoS and Reconnaissance flows can be confused with each other when an additional embedding layer is used to aggregate information from neighbors. As a result, the performance on both attacks is lower than that of previous methods. Even in the DoS and Reconnaissance categories, where the proposed method's accuracy was slightly lower than that of the top-performing algorithms, the differences were relatively small. The proposed method still maintained high accuracies of 93.7% and 97%, respectively, indicating that it remains a capable and reliable option for network intrusion detection tasks.
Hence, while the proposed method does not achieve the best performance in certain classes, such as DDoS in the CIC-IDS-2017 dataset or DoS and Reconnaissance in the UNSW-NB15 dataset, it attains the highest accuracy in numerous other classes. The findings depicted in Figure 14 indicate that its average accuracy across both datasets surpasses that of the existing algorithms, demonstrating its effectiveness.
To further evaluate our proposed method, we present Table 3, which compares our approach against several existing methods across various metrics. The results in this table show that our model consistently outperforms the baselines in terms of accuracy. We attribute this improvement to our novel feature organization and to the additional GraphSAGE embedding layer, which enables the model to capture and use both edge and node information for classification. In addition, our method offers a broader understanding of the network by leveraging both node attributes (e.g., ports) and edge relationships. This understanding enables the model to adapt to diverse network threats and make informed decisions about the classification of flows. As a result, our method not only detects known intrusion patterns well but also shows robust performance against novel and evolving attack methodologies. This ability to keep pace with the changing threat landscape is essential for maintaining a secure and resilient network infrastructure. However, the additional embedding layer used during training and testing lets nodes and edges extract more features from their neighbors in the graph, which makes the model more complex than those in previous research. As a result, our model exhibits longer training and prediction times than the E-GraphSAGE model.
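To illustrate how a GraphSAGE-style layer can combine node and edge information, the sketch below implements one generic mean-aggregation step in plain Python. It is a simplified illustration and not the architecture used in this work: the function names, the toy graph, the feature values, and the weight matrix are all hypothetical.

```python
def mean(vectors):
    """Element-wise mean of a non-empty list of equal-length vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def linear_relu(weights, x):
    """Compute ReLU(W x) for a weight matrix given as a list of rows."""
    return [max(0.0, sum(w * xi for w, xi in zip(row, x))) for row in weights]


def sage_layer(node_feats, edges, edge_feats, weights):
    """One mean-aggregation step: each node concatenates its own features
    with the mean of the features on its incident edges, then applies a
    linear map followed by ReLU."""
    incident = {v: [] for v in node_feats}
    for (u, v), feat in zip(edges, edge_feats):
        incident[u].append(feat)
        incident[v].append(feat)
    edge_dim = len(edge_feats[0])
    out = {}
    for v, h in node_feats.items():
        # Isolated nodes fall back to a zero vector for the aggregate.
        agg = mean(incident[v]) if incident[v] else [0.0] * edge_dim
        out[v] = linear_relu(weights, h + agg)  # list "+" concatenates
    return out


def edge_embeddings(node_emb, edges):
    """Represent each edge (flow) by concatenating its endpoint embeddings."""
    return [node_emb[u] + node_emb[v] for u, v in edges]


# Toy graph (hypothetical): three hosts, two flows.
node_feats = {"A": [1.0], "B": [1.0], "C": [1.0]}
edges = [("A", "B"), ("B", "C")]
edge_feats = [[2.0, 0.0], [0.0, 4.0]]
weights = [[1.0, 0.0, 0.0],   # maps (node feature, edge aggregate) to 2 dims
           [0.0, 1.0, 1.0]]

node_emb = sage_layer(node_feats, edges, edge_feats, weights)
flow_emb = edge_embeddings(node_emb, edges)
print(node_emb)
print(flow_emb)
```

Stacking an extra layer of this kind widens the neighborhood each embedding sees, which is one way to understand why an additional embedding layer adds both representational capacity and computational cost.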
Figure 15 visualizes the embedding of the CIC-IDS-2017 dataset produced by our proposed model, projected with Uniform Manifold Approximation and Projection (UMAP). The embedding reveals distinct clusters and patterns within the data. A central region with a high density of points is surrounded by several smaller clusters and scattered points. The central region contains points of different colors, suggesting overlap or similarity between certain classes. There are also several well-separated clusters, each predominantly a single color, representing distinct classes within the dataset. This separation indicates that our proposed model has captured much of the underlying structure and relationships within the data. Similar to Figure 15, Figure 16 depicts our proposed model's embedding of the UNSW-NB15 dataset. The "Normal" and "Generic" clusters are clearly distinguishable in this depiction. However, certain clusters, such as "DoS" and "Reconnaissance", appear intermingled, which can lead to confusion between these categories during label prediction.