SDN Anomalous Traffic Detection Based on Temporal Convolutional Network

Wang, Ziyi; Guan, Zhenyu; Liu, Xu; Li, Caixia; Sun, Xuan; Li, Jun

doi:10.3390/app15084317


by Ziyi Wang ¹,*, Zhenyu Guan ¹, Xu Liu ², Caixia Li ³, Xuan Sun ³,* and Jun Li ³

¹ School of Cyber Science and Technology, Beihang University, Beijing 100191, China
² China Industrial Internet Research Institute, Beijing 100015, China
³ College of Computer Science, Beijing Information Science and Technology University, Beijing 102206, China
* Authors to whom correspondence should be addressed.

Appl. Sci. 2025, 15(8), 4317; https://doi.org/10.3390/app15084317

Submission received: 10 March 2025 / Revised: 1 April 2025 / Accepted: 2 April 2025 / Published: 14 April 2025


Abstract:

The wide application of software-defined network (SDN) architecture, combined with its centralized control characteristics, has exacerbated the potential risk of network attacks. Traditional anomalous traffic detection methods face high false alarm rates and insufficient generalization ability because they rely on manual rule design and have difficulty capturing dynamic temporal features. In response to these challenges, we propose a Temporal Convolutional Network (TCN)-based anomalous traffic detection method for SDN. Taking the packet length sequence as the core feature, the method captures long-term temporal dependencies in the traffic data through the causal convolution and dilated convolution operations of the TCN model, combined with a residual connection mechanism that optimizes gradient propagation and improves the stability of model training. Experiments on the public InSDN dataset show that the method achieves high accuracy in the binary classification of normal and malicious traffic and improves detection accuracy by about 5% compared with traditional statistical methods and mainstream deep learning models.

Keywords: abnormal traffic detection; Software Defined Network (SDN); deep learning; Temporal Convolutional Network

1. Introduction

With the rapid development of the Internet, network traffic has grown explosively, and abnormal traffic poses serious threats to network security. Network traffic anomaly detection has therefore received widespread attention and in-depth research [1]. Abnormal traffic detection technology is designed to monitor network traffic in real time and identify abnormal behaviors, such as distributed denial of service (DDoS) attacks, port scanning, and worm propagation, so that appropriate protective measures can be taken to keep the network operating normally. As a result, it has become a research hotspot in the field of network security.

In traditional network architecture, the control plane and data plane are tightly coupled and integrated into dedicated network devices, which limits the flexibility of network management and the ability to innovate services. The control plane is responsible for maintaining the routing table of the switch and determining the best path to send network packets. The data plane is responsible for forwarding packets according to instructions given by the control plane. In Software Defined Networking (SDN), control functions are stripped out and centralized in the software-based SDN controller, while the underlying network devices only need to accept and execute instructions from the controller. The basic architecture of SDN consists of three layers and two interfaces. The application layer communicates with the SDN controller by calling its API. The SDN controller communicates with the network switches on the data plane through the OpenFlow protocol, a mainstream southbound interface protocol, to control the forwarding of network traffic. The data plane is made up of many network switches that are only responsible for forwarding and exchanging packets; each switch processes or forwards packets according to the controller’s instructions.

SDN has been widely used in data centers and cloud computing due to its centralized control and flexible programmability [2]. However, the decoupling of the control plane and data plane in the SDN architecture makes it a key target for cyberattacks. Network attacks on SDN are mainly launched against the forwarding plane and the control plane. Among the possible attacks on these two planes, DoS attacks take the most forms and are the easiest to organize: a large amount of malicious traffic is generated to overload the SDN controller or switches, which congests the control channel, seriously degrades the quality of service, and can even make the entire network unavailable. Because the forwarding plane is directly connected to external users, attacks on it are more diverse and include switch hijacking, SDN scanning, address resolution protocol (ARP) attacks, and virus attacks. For example, in an ARP attack, the attacker sends a large number of ARP packets containing incorrect mappings between MAC addresses and IP addresses. As a result, the switch cannot store the correct MAC address information and therefore cannot provide normal forwarding services. On the control plane, attackers can directly launch resource-consuming attacks on network controllers, sending a large number of Packet-in messages to block the controller’s processing queue or even crash the entire network. Attackers can also launch a variety of attacks by forging messages on the northbound and southbound interfaces, such as DDoS attacks, black hole attacks, and malicious insertion of flow rules. Because of its centralized control, SDN sometimes faces more complex attack challenges than traditional networks.

In the SDN environment, the traditional anomaly detection methods mainly rely on rule-based matching and statistical analysis techniques. These methods have many limitations when dealing with increasingly complex and changeable attack methods. Rule-based matching methods usually rely on predefined attack signature databases to identify abnormal behaviors by matching network traffic with known attack patterns. However, as attack methods continue to evolve, new attacks may not conform to established rules, making it difficult for these methods to detect emerging threats in a timely and accurate manner. The statistical analysis method establishes the normal behavior model of network traffic and detects the abnormal situation that deviates from the model. These methods include techniques based on threshold setting, statistical modeling, etc. However, in practical applications, frequent fluctuations in normal flow can lead to high false positive rates and reduce the reliability of detection.

With the rapid advancements in deep learning, remarkable progress has been achieved in fields such as image recognition and natural language processing, presenting new opportunities for network anomaly detection. Deep learning models can automatically learn complex features in network traffic data without manually designing feature extractors, and they have stronger adaptability and generalization ability. Currently, deep learning-based network anomaly traffic detection methods can be primarily categorized into those based on convolutional neural network (CNN) [3], recurrent neural network (RNN) [4], and their variants, such as long short-term memory (LSTM) [5] and gated recurrent units (GRU) [6], as well as hybrid models. CNN is capable of extracting complex features from network traffic data without the need for manually designed feature extractors, providing more adaptive and generalizable capabilities. Most CNN models are used to extract spatial features of network traffic data, but they are weak in modeling time series; RNNs and their variants, although able to capture time series dependencies, are prone to the problem of gradient vanishing or gradient explosion when dealing with long sequences, and they have a slow training speed. To overcome these limitations, some studies have proposed hybrid models to improve detection performance by combining the advantages of different models. However, these hybrid models still have some complexity in feature extraction and model training, and there is still room for further improvement in the convergence speed and generalization ability of the models.

However, one of the key challenges in anomaly detection, especially in encrypted traffic scenarios, is the effective extraction of representative features from raw network data. Traditional methods that rely on payload inspection become ineffective when traffic is encrypted, necessitating the exploration of alternative feature extraction approaches. In network anomaly detection, features such as packet length distribution, inter-arrival times, flow duration, and statistical properties of network sessions can provide valuable insights even when payload data are unavailable. The effectiveness of any machine learning or deep learning model heavily depends on the quality of these extracted features, as poorly chosen or redundant features may lead to high false positive rates and degraded detection performance. Therefore, advanced feature engineering techniques, including automated feature selection and representation learning through deep models, have become essential for improving detection accuracy and robustness.

In recent years, Temporal Convolutional Network (TCN) [7] has emerged in the fields of time series prediction and speech recognition due to its unique sequence modeling capability, which builds a deep network by stacking causal convolutional layers and dilated convolutional layers and combines the efficient parallel computation advantage of convolutional neural networks with the long-range temporal sequence modeling capability of recurrent neural networks. In the field of anomaly detection, the layer expansion mechanism of TCN can effectively capture multi-scale timing features, and its fixed-length history window mechanism avoids the gradient propagation problem of traditional RNN. These characteristics enable TCN to demonstrate unique advantages in network traffic timing analysis.

To address the above challenges, this paper proposes an anomalous traffic detection method that uses packet length sequences as features and a TCN model to learn them. The TCN model processes time-series data with a convolutional network, modeling temporal dependencies through causal convolution (ensuring that predictions at the current time step are not affected by future time steps) and dilated convolution (expanding the receptive field). The approach aims to improve the accuracy and efficiency of detection, reduce the false alarm rate, and enhance the generalization ability of the model to better cope with complex network environments and variable attack methods.

This paper presents several significant contributions:

TCN is applied to SDN traffic anomaly detection. The TCN model effectively captures the long-range dependencies of packet length sequences while using parallel computation to improve training efficiency, overcoming the inefficient training and vanishing gradient problems of RNN models.
Unlike methods that rely on traditional flow statistics features, this paper uses the packet length sequence as the core feature representation, which reduces the feature dimension while preserving the key features of attack behavior and improves detection accuracy. A five-tuple grouping policy (source IP address, destination IP address, source port, destination port, and protocol) is used to optimize feature extraction, enhance the expressiveness of traffic behavior features, and improve classification accuracy.
Experiments on the public InSDN dataset show that the proposed method achieves high accuracy in classifying normal and malicious traffic and that it is superior to the baseline methods in detection performance and computational efficiency.

In summary, the anomaly detection framework proposed in this study provides a practical approach to network anomaly detection in SDN. As 5G and subsequent technologies develop, the framework has potential application value in many fields, such as the industrial Internet of Things [8], intelligent mission-critical services [9], network function virtualization [10], and virtualized network slicing environments [11]. Its practical effectiveness in 5G and future technologies will be explored and verified in future work.

The rest of the paper is organized as follows. We present related work in Section 2. The methodology proposed in this paper is described in Section 3. The experiments are synthesized in Section 4. Finally, we conclude the paper in Section 5.

2. Related Work

To counter the threat of anomalous traffic attacks in SDN architectures and to cope with complex attack methods and dynamic traffic patterns, researchers at home and abroad have proposed a variety of detection methods. Current research on SDN anomalous traffic detection falls mainly into three groups. Detection techniques based on traditional statistical methods identify anomalies by analyzing differences in the statistical distribution of traffic feature parameters. Machine learning methods show strong adaptability; typical algorithms include supervised learning models such as Support Vector Machine (SVM), Decision Tree, and K-Nearest Neighbor (KNN). With the breakthrough of deep learning technology, neural network-based detection methods have gradually become the mainstream of research and can be categorized into supervised, unsupervised, and semi-supervised learning.

2.1. SDN

SDN, as a new network architecture, separates network control and data forwarding. The control plane is responsible for network policy control and resource scheduling for the forwarding plane, while the forwarding plane forwards data according to the dynamic policies of the control plane, thus realizing flexible control of network traffic. The controller sends flow table rules to the switch through southbound interfaces (e.g., OpenFlow, NETCONF), and provides network state query and policy invocation functions for third-party applications through northbound interfaces (e.g., REST APIs). The basic SDN architecture consists of three planes and two interfaces as shown in Figure 1. The three planes are the application plane, the control plane, and the data plane, and the two interfaces are the southbound and northbound interfaces. The application plane mainly realizes load balancing, traffic monitoring, security protection, and so on. The control plane is responsible for issuing and updating routing and forwarding rules. The data plane is responsible for packet forwarding and switching.

However, the wide adoption of SDN also brings new security issues. In the application plane, malicious applications can implement covert attacks through API vulnerabilities in the northbound interface. By constructing illegal network policy requests, the attacker injects false flow rules into the controller, thus realizing network topology tampering, traffic hijacking, and other invasive behaviors. Such attacks are highly stealthy, as they can pass through the authentication mechanism disguised as normal control commands. Most of the security threats in the control plane are aimed at paralyzing the controller or affecting the interaction between the controller and the switch, so the attacker mainly overloads the computing resources and network links of the control plane as a means to realize the attack on the controller. The attack vectors in the data plane focus on resource overloading attacks on switching devices. This type of flooding attack can lead to the degradation of switches into traditional Layer 2 devices, undermining the centralized control advantages of SDN.

2.2. Traditional Anomalous Traffic Detection Methods

Traditional abnormal traffic detection methods mainly rely on predefined rules and statistical analysis techniques. The rule-based method identifies abnormal traffic by setting a series of fixed rules, such as determining whether the traffic is abnormal based on specific port numbers, protocol types, packet sizes and other characteristics. The method is simple to implement and fast to detect, but its drawbacks are also very obvious, i.e., the formulation of rules needs to rely on expert knowledge, and it is difficult to cover all possible anomalies, and it is weak in detecting new types of attacks and unknown threats.

Methods based on statistical analysis have been applied in the field of abnormal traffic detection since the 1990s [12]. These methods model normal traffic by analyzing statistical characteristics of network traffic, such as the packet arrival rate, the average packet size, and the flow duration; when the observed traffic differs significantly from the normal model, it is considered abnormal. However, in complex network environments with variable traffic patterns, statistical analysis methods are susceptible to noise and normal traffic fluctuations, leading to lower detection accuracy and more false alarms. He et al. [13] proposed a DDoS attack defense scheme, SDCC, which integrates bandwidth detection and data flow detection techniques and employs a confidence-based filtering (CBF) method to calculate the CBF score of each packet. If the CBF score of a packet is below a specific threshold, the packet is determined to be an attack packet. The algorithm is simple to compute, but it needs to be constantly updated and optimized as new attacks emerge.
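
The baseline-and-threshold idea behind statistical detection can be illustrated with a minimal sketch. It is not the SDCC/CBF scheme of He et al. [13]; the packet-rate readings, the function names, and the three-standard-deviation rule are all assumptions chosen for illustration.

```python
from statistics import mean, stdev


def fit_baseline(normal_rates):
    """Summarize normal traffic by the mean and sample standard deviation of a rate."""
    return mean(normal_rates), stdev(normal_rates)


def is_anomalous(rate, baseline, k=3.0):
    """Flag a measurement that lies more than k standard deviations from the mean."""
    mu, sigma = baseline
    return abs(rate - mu) > k * sigma


# Hypothetical packets-per-second readings observed during normal operation.
baseline = fit_baseline([100, 110, 90, 105, 95])

flood = is_anomalous(500, baseline)   # far outside the baseline
burst = is_anomalous(130, baseline)   # a legitimate burst is flagged as well
steady = is_anomalous(104, baseline)  # within normal fluctuation
```

In this toy setting the burst at 130 packets per second is flagged even though it might be legitimate, which is the false alarm behavior the paragraph above describes.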

2.3. Machine Learning-Based Anomaly Traffic Detection Methods

Machine learning-based detection methods are also widely used in SDN traffic anomaly detection [14,15,16], but these methods usually rely on accurate feature engineering. For example, Tayfour [17] proposed a Voting-based Non-Kernel Density Estimation (V-NKDE) classifier that combines a voting mechanism with four classical machine learning methods (Naive Bayes, KNN [18], Decision Tree [19], and Extremely Randomized Trees): each method classifies the SDN traffic feature data, and the voting mechanism determines the final result. This method can effectively reduce the false alarm rate and the risk of excess, thus improving the accuracy and robustness of traffic anomaly detection. Satheesh et al. [20] proposed a machine learning-based anomaly detection model that categorizes packets by analyzing packet information in detail, including the source IP, the destination IP, the port number, and other features. The model uses the SDN controller to adjust the forwarding rules in the flow table to block malicious traffic in time and stop its further propagation. Sebbar et al. [21] proposed a security model based on the Random Forest algorithm, which determines whether traffic constitutes an attack by pre-establishing the security policy and Time To Live (TTL) delay criteria. Ali et al. [22] proposed a blockchain-based intelligent link failure recovery framework for software-defined Internet of Things (SD-IoT) environments. The article adopts a TOPSIS (Technique for Order of Preference by Similarity to Ideal Solution) module for link failure recovery, which integrates multiple quality metrics such as latency, jitter, and bandwidth in alternative path selection rather than being limited to the shortest path. In addition, to enhance the security of IoT systems, the research combines blockchain technology and an Artificial Neural Network (ANN) to achieve distributed detection of and defense against DDoS attacks. The blockchain ensures data immutability, while the ANN identifies and defends against DDoS attacks by analyzing attack patterns in network traffic. The framework not only enhances the efficiency of link recovery but also improves the system’s ability to protect against DDoS attacks, supporting the stable operation of SD-IoT. Wei et al. [23] proposed a hybrid deep learning-based DDoS attack detection and classification method. This method combines two deep learning models, the Auto-Encoder (AE) and the Multi-Layer Perceptron (MLP). First, the AE performs feature extraction: it maps the high-dimensional features of network traffic samples to a low-dimensional space through unsupervised learning and extracts the most discriminative features, reducing noise and irrelevant features. Subsequently, the compressed features extracted by the AE are input into the MLP model to classify DDoS attack types. The experimental results show that the AE-MLP model achieves high accuracy and robustness in the detection and classification of DDoS attacks.

2.4. Deep Learning Based Anomaly Traffic Detection Methods

In the field of anomalous traffic detection, researchers have adopted a variety of deep learning models to improve detection effectiveness. CNN-based methods can effectively identify local patterns and structural features in traffic by mapping network traffic data to a multidimensional space and using convolutional layers to extract spatial features of the data. However, CNNs have certain limitations in processing time series data and have difficulty capturing long-term dependencies in the data. RNNs and their variants (e.g., LSTM, GRU), on the other hand, are specialized in processing time series data and are able to memorize previous information and use it for the current prediction, thus capturing dynamic changes in time series. However, RNNs are prone to vanishing or exploding gradients when dealing with long sequences, resulting in models that are difficult and slow to train. In addition, some researchers have proposed methods such as AE networks [24], deep belief networks (DBN) [25], and generative adversarial networks (GAN) [26].

In order to overcome the shortcomings of the above single models, some studies have proposed hybrid models. Although better detection results have been achieved, these hybrid models still involve some complexity in feature extraction and model training, and their convergence speed and generalization ability still need further improvement. Elsayed et al. [27] proposed a hybrid method based on CNN and LSTM, which first structures the detection data, then extracts spatiotemporal features through the CNN and LSTM, and finally uses Softmax to complete the detection. Wei et al. [28] proposed a two-branch feature extraction network, which uses a CNN and an RNN to extract spatial and temporal features of the data, respectively. The advantage of this method is that it does not need manual extraction of traffic features and can learn complex patterns in the data automatically. By combining the advantages of CNN and RNN, the method captures the spatial features of the data and also handles time series data effectively; however, its generalization is poor. Bai et al. [29] proposed using a bidirectional LSTM to learn data features, determining anomalies through a classifier, and deploying idle edge nodes in the Internet of Things to increase the flexibility of detection. To overcome the limitations of single-dimensional feature extraction in complex cyberattack scenarios, Lin et al. [30] proposed a detection method based on multilevel feature fusion. The method employs a bidirectional long short-term memory network (BiLSTM) and a CNN to extract spatial, temporal, and byte features of the traffic data and then fuses these multidimensional features. With this fusion strategy, the method captures the intrinsic characteristics of network traffic more comprehensively, thus realizing more accurate anomaly detection in complex network environments. Liu et al. [31] proposed an HTTPS traffic detection method based on the Bidirectional Gated Recurrent Unit (BiGRU) and the attention mechanism. The method extracts the forward and backward features of byte sequences in a session through the BiGRU, which in turn captures the temporal dependency of the data. In addition, the attention mechanism enables the model to assign different weights to different features, thus enhancing the model’s focus on key features. However, the method is highly sensitive to hyper-parameters, which not only complicates hyper-parameter setting but also requires a large number of experiments to determine the optimal parameter combination, restricting the model’s practicability and generalization to a certain extent. Luo et al. [32] proposed a model based on the Recombination Generative Adversarial Network (RGAN). The RGAN-based intrusion detection method optimizes the generator and discriminator through two-stage adversarial learning to enhance the recognition of minority attack samples. The minority class attack samples are first generated by combining the self-attention mechanism and GAN, and the features of the traffic data are extracted using GRUs and CNNs. The false alarm rate is then reduced by introducing a reconstruction loss to further improve the detection performance for minority class samples. The method shows good detection results on the CSE-CIC-IDS2018 dataset.

3. Proposed Method

In this paper, we propose a TCN-based SDN abnormal traffic detection method, as shown in Figure 2. Through lightweight feature design and efficient time series modeling, the method accurately identifies abnormal traffic in dynamic network environments. At the feature engineering level, the packet length sequence is used as the core feature, combined with the five-tuple grouping strategy, which reduces the computational complexity of traditional multi-dimensional feature extraction and retains the key dynamic information of traffic behavior. At the model architecture level, causal convolution and dilated convolution work together to capture long-term dependencies while preserving temporal causality, which addresses the low training efficiency and vanishing gradients caused by sequential dependency in traditional RNN-like models. Meanwhile, the residual connection structure further stabilizes gradient propagation in the deep network, which improves the convergence speed and generalization ability of the model. During training, the cross-entropy loss function and the Adam optimizer are used, combined with dropout regularization to suppress overfitting.

3.1. Data Preprocessing

Before abnormal traffic detection, raw network traffic data needs to be preprocessed to extract useful feature information. Packet length, as a key feature of network traffic, reflects the size of packets and how they are transmitted, which is closely related to network load and transmission efficiency. Specifically, the packet length reveals the size of each data transmission. The arrival frequency of packets reflects how active the network is: high-frequency arrivals indicate more frequent data transmission, while low-frequency arrivals may indicate sparser transmission. Finally, a sudden change in packet length is often a sign of unexpected behavior in the network, such as DDoS attacks and port scanning; anomalous traffic usually leads to significant changes in packet lengths.

In this paper, the packet length sequence is chosen as the network traffic feature. The specific steps are to extract the network traffic data from the PCAP file, group the packets according to the five-tuple, and collect the packets belonging to the same five-tuple into packet sequences. Each packet sequence contains a certain number of packets, and the sequence length is adjusted according to actual demand and the characteristics of the dataset. The data preprocessing flowchart is shown in Figure 3. We tested packet sequences of different lengths and found that when the sequence length was set to no more than 750, the model performed best in capturing the key information of dynamic traffic behavior and detecting abnormal traffic. If the sequence is too short, the model cannot obtain sufficient traffic information, which reduces detection accuracy. If the sequence is too long, it introduces excessive redundant information and increases computational complexity. Therefore, we limit the sequence length to no more than 750, which ensures that each sequence contains enough information without becoming unnecessarily long.
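
The grouping step can be sketched as follows. This is an illustration under stated assumptions rather than the authors’ preprocessing code: packets are assumed to be already parsed from the PCAP file into dictionaries with the hypothetical keys `src_ip`, `dst_ip`, `src_port`, `dst_port`, `proto`, and `length`, and short flows are zero-padded, a padding rule the paper does not state.

```python
from collections import defaultdict

MAX_LEN = 750  # upper bound on the sequence length reported in the text


def packet_length_sequences(packets, max_len=MAX_LEN, pad_value=0):
    """Group packets by five-tuple and return fixed-length packet length sequences.

    `packets` is an iterable of dicts ordered by capture time (hypothetical schema).
    """
    flows = defaultdict(list)
    for pkt in packets:
        key = (pkt["src_ip"], pkt["dst_ip"], pkt["src_port"], pkt["dst_port"], pkt["proto"])
        if len(flows[key]) < max_len:  # keep at most max_len packets per flow
            flows[key].append(pkt["length"])
    # Assumption: pad short flows with zeros so that all sequences share one length.
    return {key: seq + [pad_value] * (max_len - len(seq)) for key, seq in flows.items()}


# Two hypothetical flows; max_len=2 keeps the demonstration small.
flow_a = {"src_ip": "10.0.0.1", "dst_ip": "10.0.0.2", "src_port": 40000, "dst_port": 80, "proto": "TCP"}
flow_b = {"src_ip": "10.0.0.3", "dst_ip": "10.0.0.2", "src_port": 40001, "dst_port": 53, "proto": "UDP"}
packets = [
    {**flow_a, "length": 60},
    {**flow_b, "length": 52},
    {**flow_a, "length": 1500},
    {**flow_a, "length": 40},
]
sequences = packet_length_sequences(packets, max_len=2)
```

With `max_len=2`, the first flow is truncated to its first two lengths and the second flow is padded with one zero. The key here is directional, so the two directions of a connection form separate sequences; whether the paper merges them is not stated.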

3.2. TCN Model

TCN is a deep learning model specialized in processing time series data. The core idea of TCN is to capture long-term dependencies in time series by CNN. Compared with the traditional RNN, TCN has better parallel computing capability and fewer parameters, thus showing higher efficiency and better performance in processing long sequence data.

The structure of TCN mainly consists of an input layer, multiple residual blocks, and an output layer. Each residual block contains multiple convolutional layers, and the residual connection enhances the model’s learning capability. In addition, TCN introduces dilated convolution, which expands the receptive field by inserting gaps between the taps of the convolution kernel, capturing longer temporal dependencies without increasing the computational effort. The structure of the TCN convolutional network is shown in Figure 4. By convolving over samples taken at intervals and only from the past, it ensures that future information in the time series does not affect the current prediction, thus maintaining the principle of causality in time series analysis. Compared with traditional convolution, TCN skips a certain number of steps between kernel taps, so that it obtains a wider receptive field and captures dependencies at more distant time points while the output size remains constant. This design allows TCN to be more efficient in analyzing time-series data and to better model long-term dependency patterns.
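
The effect of dilation on the receptive field can be made concrete with a short calculation. For a stack of causal convolutions with kernel size k and dilation rates d_1, ..., d_n, one output step sees 1 + (k - 1)(d_1 + ... + d_n) input steps. The kernel size of 3 and the doubling dilation schedule below are assumptions for illustration; this section does not state the values used in the paper.

```python
def receptive_field(kernel_size, dilations, convs_per_block=1):
    """Input steps that influence one output step of stacked dilated causal convolutions."""
    return 1 + convs_per_block * (kernel_size - 1) * sum(dilations)


def layers_needed(seq_len, kernel_size):
    """Layers required to cover seq_len steps when the dilation doubles per layer (1, 2, 4, ...)."""
    dilations = []
    while receptive_field(kernel_size, dilations) < seq_len:
        dilations.append(2 ** len(dilations))
    return len(dilations)


print(receptive_field(3, [1, 2, 4]))  # 15
print(receptive_field(3, [1, 1, 1]))  # 7, the same three layers without dilation
print(layers_needed(750, 3))          # 9
```

Under these assumptions, nine dilated layers cover a 750-step sequence, whereas undilated layers with the same kernel would need 375, which is why dilation is the practical route to long-range dependencies.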

In Figure 5, we present the TCN model for SDN anomalous traffic detection. The model is designed to efficiently handle long time series data, and its main structure includes residual blocks and dilated causal convolutional layers. The model stacks multiple residual blocks so that training of the deep network is more stable and easier to optimize. The input is processed through three residual modules, each of which adds a skip connection around its convolutional layers to avoid the vanishing gradient problem common in deep networks and to speed up convergence. Dilated causal convolution allows the network to capture more complex temporal dependencies by expanding the receptive field, while causality ensures that the output depends only on data from the current and previous moments. Finally, the output of the model is processed through a 1 × 1 convolutional layer to map the features to the desired output.
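
The data flow through the residual blocks and the final 1 × 1 convolution can be sketched in plain Python. This is a deliberately simplified, single-channel illustration and not the authors’ model: each block holds one dilated causal convolution (TCN blocks commonly use two, plus normalization and dropout), and the weights, the dilation rates 1, 2, 4, and the input values are hypothetical.

```python
def relu(v):
    return max(0.0, v)


def dilated_causal_conv(x, weights, bias, dilation):
    """y[t] = bias + sum_i weights[i] * x[t - i * dilation]; steps before t = 0 count as zero."""
    out = []
    for t in range(len(x)):
        acc = bias
        for i, w in enumerate(weights):
            j = t - i * dilation
            if j >= 0:
                acc += w * x[j]
        out.append(acc)
    return out


def residual_block(x, weights, bias, dilation):
    """Dilated causal convolution + ReLU, added back onto the block input (skip connection)."""
    h = [relu(v) for v in dilated_causal_conv(x, weights, bias, dilation)]
    return [relu(xi + hi) for xi, hi in zip(x, h)]


def tcn_forward(x, blocks, w_out, b_out):
    """Apply the residual blocks in order, then a 1 x 1 convolution (a per-step affine map)."""
    h = list(x)
    for weights, bias, dilation in blocks:
        h = residual_block(h, weights, bias, dilation)
    return [w_out * v + b_out for v in h]


# Three residual blocks with assumed dilation rates 1, 2, 4 and hypothetical weights.
blocks = [([0.5, 0.5], 0.0, 1), ([0.5, 0.5], 0.0, 2), ([0.5, 0.5], 0.0, 4)]
x = [60.0, 1500.0, 40.0, 1500.0, 60.0, 60.0, 40.0, 1500.0]
scores = tcn_forward(x, blocks, w_out=1.0, b_out=0.0)
```

The output has one value per input step, and because of the skip connection a block whose convolution weights are all zero passes non-negative inputs through unchanged, which is what keeps deep stacks easy to optimize.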

The causal convolutional layer is the basis of the TCN model. It uses only information from the current and previous moments when predicting the output at the current moment, which preserves the sequential order of the time series. The layer extracts local features by convolving the kernel with the input sequence and passes these features to the next layer. The output of the causal convolutional layer can be expressed as

$$y_{t} = \sigma(W * x_{t} + b) \tag{1}$$

where $y_{t}$ is the output of the causal convolutional layer at moment $t$, $\sigma$ is the activation function, $W$ is the convolutional kernel weight, $b$ is the bias term, $*$ denotes the convolution operation, and $x_{t}$ is the value of the input sequence at moment $t$. The activation function used in this paper is ReLU, a non-saturating activation function: it does not saturate for positive inputs and is hard-saturated (exactly zero) for negative inputs. Because ReLU is piecewise linear, its gradient is cheap to compute and training converges faster, which makes it a commonly used activation function in CNNs. ReLU is defined in Equation (2).

$$\mathrm{ReLU}(x) = \max(0, x) = \begin{cases} x, & x \geq 0 \\ 0, & x < 0 \end{cases} \tag{2}$$
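Equations (1) and (2) can be sketched for a single channel in plain Python. This is a minimal illustration rather than the authors' implementation: the kernel weights and the example packet lengths are made up, and positions before the start of the sequence are treated as zeros (left padding), which is an assumption about how the boundary is handled.

```python
def relu(x):
    """ReLU as in Equation (2)."""
    return x if x > 0 else 0.0


def causal_conv1d(x, w, b=0.0):
    """y[t] = relu(sum_i w[i] * x[t - i] + b), using only current and past inputs.

    Inputs before the start of the sequence are treated as zero (assumed left padding).
    """
    y = []
    for t in range(len(x)):
        acc = b
        for i, wi in enumerate(w):
            if t - i >= 0:  # never look at x[t + 1], x[t + 2], ...
                acc += wi * x[t - i]
        y.append(relu(acc))
    return y


# Hypothetical packet-length sequence of one flow and a two-tap averaging kernel
lengths = [60, 1500, 1500, 40]
print(causal_conv1d(lengths, [0.5, 0.5]))  # [30.0, 780.0, 1500.0, 770.0]
```

Changing the last packet length changes only the last output, which is the causality property described above.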

The dilated convolution layer adds a dilation rate to the causal convolution layer and thereby enlarges the receptive field of the model, so that the model can capture dependencies over a longer time range. The dilation rate determines the spacing between the input positions covered by the convolution kernel: the larger the dilation rate, the wider the time range covered by the kernel and the larger the model's receptive field. The output of the dilated convolutional layer can be expressed as

$$y_{t} = \sigma(W *_{\mathrm{dilation}} x_{t} + b) \tag{3}$$

where $*_{\mathrm{dilation}}$ denotes the dilated convolution operation and $\mathrm{dilation}$ is the dilation rate.
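To make the effect of the dilation rate concrete, the following sketch pushes a unit impulse through three stacked dilated causal convolutions and measures how far its influence reaches. The kernel size 2, the all-ones weights, and the dilation rates 1, 2, 4 are assumptions chosen for readability; the bias and activation of Equation (3) are omitted because all values here are non-negative, so ReLU would leave them unchanged.

```python
def dilated_causal_conv1d(x, w, dilation):
    """y[t] = sum_i w[i] * x[t - i * dilation]; positions before the start count as zero."""
    return [
        sum(wi * x[t - i * dilation] for i, wi in enumerate(w) if t - i * dilation >= 0)
        for t in range(len(x))
    ]


# Unit impulse at t = 0, passed through layers with dilation 1, 2 and 4 (assumed values)
signal = [1.0] + [0.0] * 9
for d in (1, 2, 4):
    signal = dilated_causal_conv1d(signal, [1.0, 1.0], d)

# The impulse now influences the outputs at t = 0..7
reach = max(t for t, v in enumerate(signal) if v != 0)
print(reach + 1)  # 8 = 1 + (2 - 1) * (1 + 2 + 4)
```

With dilation 1 in every layer the same three layers would cover only 4 time steps, which is why the dilation rate is usually increased with depth.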

To alleviate the vanishing-gradient problem and improve training stability, residual connections are introduced into the TCN model. A residual connection adds the input of a block directly to its output, so that gradient information propagates more easily during training and does not vanish as the network gets deeper. The output of a residual connection can be expressed as

$$y_{t} = F(x_{t}) + x_{t} \tag{4}$$

where $F(x_{t})$ is the output of the main part of the TCN model, $x_{t}$ is the input, and $y_{t}$ is the output after the residual connection.
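Differentiating Equation (4) shows why the residual connection helps gradients reach earlier layers. In the expression below, $L$ denotes the training loss and $I$ the identity matrix; this notation is introduced here for illustration only.

```latex
\frac{\partial L}{\partial x_{t}}
  = \frac{\partial L}{\partial y_{t}}
    \left( \frac{\partial F(x_{t})}{\partial x_{t}} + I \right)
```

Even when $\partial F(x_{t}) / \partial x_{t}$ is close to zero, the identity term passes $\partial L / \partial y_{t}$ back unchanged, so the gradient does not shrink to zero as more blocks are stacked.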

3.3. Model Training and Optimization

In the process of model training, the selection of appropriate loss functions and optimization algorithms is crucial. These elements directly affect the training efficiency, convergence speed, and generalization ability of the model. The loss function is used to measure the difference between the model’s predicted output and the true label, while the optimization algorithm gradually reduces the loss function by adjusting the model’s parameters so that the model continuously approaches the optimal solution on the training data. In order to ensure that the model can converge effectively during the training process and has strong generalization ability, this paper adopts the cross-entropy loss function and Adam optimization algorithm and introduces the dropout technique to reduce the overfitting phenomenon.

The cross-entropy loss function is commonly used in classification tasks and measures the difference between the model output and the true label. With the true label $y \in \{0, 1\}$ and the predicted value of the model $\hat{y} \in [0, 1]$, the cross-entropy loss function $L$ can be expressed as

$$L = -\sum_{i=1}^{N} \left[ y_{i} \log(\hat{y}_{i}) + (1 - y_{i}) \log(1 - \hat{y}_{i}) \right] \tag{5}$$

where $N$ denotes the number of samples, $y_{i}$ is the true label of the $i$th sample, and $\hat{y}_{i}$ is the predicted probability of the model. Because the loss grows without bound as the probability assigned to the true class approaches zero, confident wrong predictions are penalized most heavily, which pushes the model toward more accurate predictions.
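Equation (5) can be evaluated directly, as in the short sketch below. The clipping constant `eps` is an assumption added for numerical safety (it avoids `log(0)`) and is not part of the equation; the labels and probabilities are made-up values.

```python
import math


def binary_cross_entropy(y_true, y_pred, eps=1e-12):
    """Summed binary cross-entropy of Equation (5)."""
    total = 0.0
    for y, p in zip(y_true, y_pred):
        p = min(max(p, eps), 1.0 - eps)  # assumed safeguard against log(0)
        total -= y * math.log(p) + (1 - y) * math.log(1.0 - p)
    return total


labels = [1, 0]
good = binary_cross_entropy(labels, [0.9, 0.1])  # confident and correct
bad = binary_cross_entropy(labels, [0.1, 0.9])   # confident and wrong
print(round(good, 4), round(bad, 4))  # 0.2107 4.6052
```

The wrong predictions cost roughly twenty times more than the correct ones, which is the penalizing behavior described above.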

During training, the optimization algorithm updates the model parameters using the gradients computed by back-propagation. The Adam (Adaptive Moment Estimation) optimization algorithm combines the advantages of the RMSprop and Momentum optimization algorithms and adaptively adjusts the learning rate to improve training efficiency. Specifically, Adam uses estimates of the first-order moment (mean) and second-order moment (uncentered variance) of the gradient to compute an adaptive learning rate for each parameter. Adam's update formulas are as follows:

$$m_{t} = \beta_{1} m_{t-1} + (1 - \beta_{1}) \nabla_{\theta} J(\theta) \tag{6}$$

$$v_{t} = \beta_{2} v_{t-1} + (1 - \beta_{2}) \left( \nabla_{\theta} J(\theta) \right)^{2} \tag{7}$$

$$\hat{m}_{t} = \frac{m_{t}}{1 - \beta_{1}^{t}}, \quad \hat{v}_{t} = \frac{v_{t}}{1 - \beta_{2}^{t}} \tag{8}$$

$$\theta_{t} = \theta_{t-1} - \frac{\alpha \hat{m}_{t}}{\sqrt{\hat{v}_{t}} + \epsilon} \tag{9}$$

where $\nabla_{\theta} J(\theta)$ is the gradient of the objective $J$ with respect to the parameters $\theta$, $m_{t}$ and $v_{t}$ denote the estimates of the first-order and second-order moments, respectively, $\hat{m}_{t}$ and $\hat{v}_{t}$ are their bias-corrected values, $\beta_{1}$ and $\beta_{2}$ are the decay rates, $\alpha$ is the learning rate, and $\epsilon$ is a small constant that prevents division by zero. By combining the first-order and second-order moment information, Adam controls both the direction and the magnitude of each update, which allows the model parameters to converge quickly and stably during training.
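Equations (6) to (9) translate almost line by line into code. The sketch below applies them to a single scalar parameter on the toy objective J(θ) = θ², which is chosen purely for illustration. The learning rate 0.001 matches the value selected in Section 4.3; the decay rates 0.9 and 0.999 and ε = 1e-8 are the commonly used defaults and are assumptions here, since the paper does not report them.

```python
import math


def adam_step(theta, grad, m, v, t, lr=0.001, beta1=0.9, beta2=0.999, eps=1e-8):
    """One Adam update for a scalar parameter; t is the 1-based step index."""
    m = beta1 * m + (1 - beta1) * grad        # Equation (6)
    v = beta2 * v + (1 - beta2) * grad ** 2   # Equation (7)
    m_hat = m / (1 - beta1 ** t)              # Equation (8), bias correction
    v_hat = v / (1 - beta2 ** t)
    theta = theta - lr * m_hat / (math.sqrt(v_hat) + eps)  # Equation (9)
    return theta, m, v


# Toy objective J(theta) = theta^2 with gradient 2 * theta (illustrative only)
theta, m, v = 1.0, 0.0, 0.0
for t in range(1, 101):
    theta, m, v = adam_step(theta, 2.0 * theta, m, v, t)
print(theta)  # smaller than the starting value 1.0
```

On the first step the bias-corrected ratio is close to 1 in magnitude, so the parameter moves by about one learning rate regardless of the gradient scale.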

To prevent the model from overfitting during training, this paper introduces the dropout technique. Dropout is a regularization method that randomly discards a portion of the neurons during training, which reduces the model's dependence on specific neurons and improves its generalization ability. Specifically, in each training iteration, dropout sets the output of each neuron to zero with probability $p$, thus breaking the co-adaptation between neurons. In practice, dropout is an effective way to reduce overfitting and improve the robustness of the model.
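The dropout step can be sketched as follows. The rescaling of the surviving activations by 1 / (1 − p), often called inverted dropout, is a common implementation convention and an assumption here; the paper only states that outputs are zeroed with probability p. The default p = 0.2 is the value reported in Section 4.3.

```python
import random


def dropout(values, p=0.2, training=True, rng=random):
    """Zero each value with probability p during training; identity at inference."""
    if not training or p == 0.0:
        return list(values)
    keep = 1.0 - p
    # survivors are scaled by 1 / keep so the expected activation is unchanged (assumed convention)
    return [v / keep if rng.random() >= p else 0.0 for v in values]


rng = random.Random(0)  # fixed seed so the example is repeatable
activations = [1.0] * 10000
dropped = dropout(activations, p=0.2, rng=rng)
print(sum(1 for v in dropped if v == 0.0) / len(dropped))  # close to 0.2
```

At inference time (`training=False`) the values pass through unchanged, so no neuron is discarded when the model is used for detection.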

4. Experiments and Analysis

4.1. Dataset

To verify the effectiveness of the proposed method, this paper uses three public datasets for experiments. Because the classes are imbalanced, we applied oversampling to obtain a more balanced distribution between normal and abnormal traffic in the datasets. This mitigates class imbalance and improves the model's ability to generalize.
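The paper does not state which oversampling technique was used. As one possible reading, the sketch below performs simple random oversampling, duplicating randomly chosen minority-class samples until every class has as many samples as the largest one. The function name and the toy data are hypothetical.

```python
import random


def random_oversample(samples, labels, seed=0):
    """Duplicate minority-class samples at random until all classes have equal size."""
    rng = random.Random(seed)
    by_class = {}
    for s, y in zip(samples, labels):
        by_class.setdefault(y, []).append(s)
    target = max(len(items) for items in by_class.values())
    out_samples, out_labels = [], []
    for y, items in by_class.items():
        extra = [rng.choice(items) for _ in range(target - len(items))]
        for s in items + extra:
            out_samples.append(s)
            out_labels.append(y)
    return out_samples, out_labels


# Toy data: 2 normal flows (label 0) and 8 abnormal flows (label 1)
flows = [[60, 60], [1500, 40]] + [[40] * 3 for _ in range(8)]
labels = [0, 0] + [1] * 8
balanced_flows, balanced_labels = random_oversample(flows, labels)
print(balanced_labels.count(0), balanced_labels.count(1))  # 8 8
```

Oversampling is usually applied to the training split only, so that duplicated flows do not appear in both the training and the test data.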

InSDN dataset [33]. The InSDN dataset, released by the UCD ASEADOS Lab in 2020, is the first benchmark dataset built specifically for SDN intrusion detection scenarios. It makes up for the shortcomings of traditional intrusion detection datasets with respect to the dynamics of the SDN architecture and the simulation of control plane attacks. The dataset is generated in a virtualized experimental environment that simulates an SDN topology containing OpenDaylight controllers, Open vSwitch switches, and multiple hosts. Network traffic is constructed with the Mininet tool, and seven types of typical attacks are simulated using Scapy: DDoS, port scanning, ARP spoofing, flow table flooding, ICMP flooding, TCP SYN flooding, and controller resource exhaustion attacks. These attacks cover attack vectors in the SDN data plane, control plane, and application plane. The collected data are stored in PCAP format; normal traffic accounts for about 20% and abnormal traffic for about 80%. Each traffic sample is grouped by five-tuple and labeled with the attack type and timestamp, and the labeling process is cross-validated by network security experts to ensure reliability. These samples represent aggregated flows, which consist of multiple packets between the same source and destination within a specified time window, rather than individual packets. Specifically, the labeling was initially performed by automated scripts that categorized traffic based on predefined attack signatures and behavioral patterns. This automated labeling was then reviewed by a panel of cybersecurity professionals, including researchers from the UCD ASEADOS Lab and external domain experts specializing in SDN security and intrusion detection.
In cases where discrepancies or uncertainties arose, a consensus-based approach was adopted: multiple experts independently reviewed the mislabeled or ambiguous instances, and a final classification decision was made based on majority agreement. The dataset labels flows based on the presence of malicious activity during their lifetime. If any part of a flow contains attack traffic (e.g., a DDoS attack initiated mid-flow), the entire flow is labeled as malicious. Overlapping flows (e.g., simultaneous connections with the same five-tuple) are likely distinguished by timestamps or unique flow identifiers to ensure granularity. Additionally, we have conducted experiments using approximately 20% normal traffic and 20% abnormal traffic to evaluate our model’s performance in a more balanced setting.
CICIDS 2017 Dataset [34]. The CICIDS 2017 dataset, developed by the University of New Brunswick, is a widely used dataset for intrusion detection research. It simulates real-world normal and attack traffic using a realistic testbed environment with various attack types, including DoS, brute-force attacks, botnets, web-based attacks, and infiltration. The dataset captures both network flow-based and raw packet-based data, providing rich feature sets for machine learning-based intrusion detection models.
USTC-TFC2016 [35]. The USTC-TFC2016 dataset developed by the University of Science and Technology of China (USTC) is a benchmark for encrypted traffic classification. It consists of various types of encrypted traffic, including legitimate applications such as Skype, WeChat, and BitTorrent, as well as malicious traffic from botnets and malware. The dataset includes both flow-based statistical features and raw packet data, enabling research in machine learning-based encrypted traffic classification.
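The flow construction and labeling rule described for InSDN (group packets by five-tuple, label the whole flow malicious if any packet in it is attack traffic) can be sketched as below. The packet dictionary keys and the example values are hypothetical, the five-tuple is treated as directional, and the time-window splitting mentioned above is left out to keep the example short.

```python
from collections import defaultdict


def build_flows(packets):
    """Group packets by five-tuple and return {five_tuple: (length_sequence, label)}.

    Each packet is a dict with the hypothetical keys src_ip, dst_ip, src_port,
    dst_port, proto, length and is_attack. Label 1 means malicious, 0 benign.
    """
    lengths = defaultdict(list)
    malicious = defaultdict(bool)
    for p in packets:
        key = (p["src_ip"], p["dst_ip"], p["src_port"], p["dst_port"], p["proto"])
        lengths[key].append(p["length"])
        # one attack packet is enough to mark the entire flow as malicious
        malicious[key] = malicious[key] or p["is_attack"]
    return {key: (lengths[key], int(malicious[key])) for key in lengths}


def pkt(src_port, length, is_attack):
    """Helper that builds a made-up TCP packet between two fixed hosts."""
    return {"src_ip": "10.0.0.1", "dst_ip": "10.0.0.2", "src_port": src_port,
            "dst_port": 80, "proto": "TCP", "length": length, "is_attack": is_attack}


packets = [pkt(1111, 60, False), pkt(1111, 1500, True), pkt(2222, 40, False)]
for five_tuple, (seq, label) in build_flows(packets).items():
    print(five_tuple[2], seq, label)
```

The packet length sequence of each flow is what the TCN consumes as its core feature; in practice the sequences would also need padding or truncation to a fixed length, which is not shown here.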

The InSDN dataset differs from other standard intrusion detection datasets in several key aspects, particularly in its relevance to SDN-specific security threats. It includes control-plane-specific attacks, such as Flow Table Flooding and Controller Resource Exhaustion, which are not present in traditional datasets. Additionally, InSDN's traffic patterns reflect the dynamic nature of SDN architectures, where network flows are centrally managed by controllers rather than by distributed routing mechanisms. To ensure a balanced evaluation, our study also includes the CICIDS 2017 and USTC-TFC2016 datasets to validate the model's performance under more general network conditions.

4.2. Evaluation Metrics

In the experiments, Accuracy, Precision, Recall, and F1-score are used to evaluate the performance of the model. Accuracy is the proportion of samples correctly classified by the model among all samples and measures the overall classification performance. Precision is the proportion of samples predicted as anomalous traffic that are actually anomalous, reflecting how reliable the model's anomaly predictions are. Recall is the proportion of actually anomalous samples that the model correctly predicts as anomalous, reflecting the model's ability to detect anomalous traffic. The F1-score is the harmonic mean of precision and recall; it combines both and reflects the performance of the model more comprehensively. The formulas for these evaluation metrics are as follows:

$$\mathrm{Accuracy} = \frac{TP + TN}{TP + TN + FN + FP} \times 100\% \tag{10}$$

$$\mathrm{Precision} = \frac{TP}{TP + FP} \times 100\% \tag{11}$$

$$\mathrm{Recall} = \frac{TP}{TP + FN} \times 100\% \tag{12}$$

$$\mathrm{F1\text{-}score} = 2 \times \frac{\mathrm{Precision} \times \mathrm{Recall}}{\mathrm{Precision} + \mathrm{Recall}} \tag{13}$$

where TP is the number of true positive examples, TN is the number of true negative examples, FP is the number of false positive examples, and FN is the number of false negative examples.
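Equations (10) to (13) can be computed from the four confusion counts as in the sketch below. The counts in the example are invented for illustration and are not results from this paper; the values are returned as fractions rather than percentages, and the zero-division guards are an added assumption.

```python
def classification_metrics(tp, tn, fp, fn):
    """Return (accuracy, precision, recall, f1) as fractions in [0, 1]."""
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    precision = tp / (tp + fp) if tp + fp else 0.0  # guard against empty predictions
    recall = tp / (tp + fn) if tp + fn else 0.0     # guard against no positive samples
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return accuracy, precision, recall, f1


# Invented counts: 90 attacks caught, 10 missed, 15 false alarms, 85 benign passed
acc, prec, rec, f1 = classification_metrics(tp=90, tn=85, fp=15, fn=10)
print(round(acc, 4), round(prec, 4), round(rec, 4), round(f1, 4))
# 0.875 0.8571 0.9 0.878
```

Because the F1-score is a harmonic mean, it always lies between precision and recall and is pulled toward the smaller of the two.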

4.3. Experiment Result

Figure 6a shows how the model's accuracy evolves during training under different learning rates (e.g., 0.01, 0.03, 0.001, and 0.002). As the number of training epochs increases, the accuracy gradually stabilizes. With a learning rate of 0.001, the model reaches the highest accuracy after convergence. Figure 6b analyzes the impact of batch size on detection performance. On the InSDN dataset, the experiment compared the detection accuracy of models trained with different batch sizes (such as 16, 32, and 64). With a batch size of 32, the TCN model achieved the best detection accuracy across the training stages, combining good generalization ability with training stability. As can be seen from Figure 6, the accuracy stabilizes once the number of training epochs reaches 10. Therefore, the proposed method uses a batch size of 32, 10 training epochs, and a learning rate of 0.001. In addition, based on previous literature and experimental verification, the model remains stable during training when the dropout rate is set to 0.2.

The experimental results show that the anomalous traffic detection method based on the bidirectional TCN model achieves high classification performance in the binary classification task (normal versus malicious traffic). The confusion matrix directly shows the numbers of true positive, false positive, true negative, and false negative cases, which makes the performance of the model on each category clear. According to the confusion matrix generated from the experimental results (Figure 7), the detection accuracy is 96.36% for normal (benign) traffic and 97.97% for malicious traffic. Accordingly, 3.64% of normal traffic is misclassified as malicious and 2.03% of malicious traffic is misclassified as normal, indicating that the model has a low false alarm rate and a low miss rate. Taken together, the classification metrics show that the model performs more stably when detecting malicious traffic and effectively distinguishes normal traffic from abnormal traffic.
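The per-class percentages quoted from Figure 7 are the rows of a confusion matrix divided by their row totals. The sketch below shows that normalization. The raw counts are hypothetical: the paper reports only the percentages, so 10,000 samples per class were assumed here purely to reproduce them.

```python
def row_normalize(cm):
    """Divide every row of a confusion matrix by its row total."""
    return [[count / sum(row) for count in row] for row in cm]


# Rows: actual benign, actual malicious. Columns: predicted benign, predicted malicious.
# Counts are hypothetical and chosen only to match the reported percentages.
counts = [[9636, 364],
          [203, 9797]]
rates = row_normalize(counts)

false_alarm_rate = rates[0][1]  # benign traffic flagged as malicious
miss_rate = rates[1][0]         # malicious traffic passed as benign
print(false_alarm_rate, miss_rate)  # 0.0364 0.0203
```

Each row sums to 1 after normalization, so the false alarm rate and the miss rate are simply the complements of the two per-class detection accuracies.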

To assess the generalization ability of the model, experiments are carried out on different public network traffic datasets. As shown in Table 1, on the InSDN dataset the precision for normal (benign) traffic is 98.15%, the recall is 96.36%, and the F1-score is 97.25%; for malicious traffic, the precision is 96.03%, the recall is 97.97%, and the F1-score is 96.99%. These results show that the model performs in a balanced way on both normal and malicious traffic, detecting abnormal traffic accurately while keeping the false alarm rate low. On the CICIDS 2017 dataset, however, the metrics for malicious traffic detection are slightly lower because of the complexity of its traffic and attack modes. The results on the USTC-TFC2016 dataset lie between the two, and the overall performance is balanced.
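As a consistency check, substituting the InSDN precision and recall values from Table 1 into Equation (13) reproduces the reported F1-scores:

```latex
\mathrm{F1}_{\text{benign}}
  = \frac{2 \times 0.9815 \times 0.9636}{0.9815 + 0.9636} \approx 0.9725,
\qquad
\mathrm{F1}_{\text{malicious}}
  = \frac{2 \times 0.9603 \times 0.9797}{0.9603 + 0.9797} \approx 0.9699
```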

Table 2 compares the accuracy, precision, recall, and F1-score of five models on the benign and malicious sample classification task on the InSDN dataset. The experimental results show that the TCN model has a clear advantage over the other four models in classifying both types of samples.

The experimental results show that the TCN-based abnormal traffic detection method effectively exploits the temporal dependence of the traffic sequence and achieves excellent classification performance. TCN processes the entire input sequence in parallel, which significantly reduces inference latency. Experiments on the InSDN dataset show that the average inference time of the model on the GPU is 2.1 ms/sample. The complexity of the TCN model is mainly determined by the dilated causal convolutions and residual blocks. To assess its practicability, we implemented the proposed model on a test platform; it will be further verified in a real environment.

5. Conclusions

With the wide application of the centralized SDN architecture, the network security threats it faces are becoming more complex and more concealed. Traditional anomalous traffic detection methods suffer from a high false alarm rate and insufficient generalization capability in a dynamic SDN environment because they rely on manual rule design and have difficulty capturing dynamic temporal features. To address this challenge, this paper proposes a TCN-based anomalous traffic detection method for SDN, which detects anomalous traffic efficiently through packet length sequence modeling and multi-level temporal feature extraction. Compared with existing research, this paper introduces the TCN model into the SDN security domain and uses the synergy of causal convolution and dilated convolution to capture the long-term dependencies of traffic data while preserving temporal causality. The residual connection mechanism optimizes gradient propagation, which significantly improves the training stability and convergence efficiency of the deep network. At the feature design level, the packet length sequence is taken as the core feature and combined with a five-tuple grouping strategy, which reduces the complexity of feature engineering while retaining the key information about the dynamic behavior of traffic. Experiments on the publicly available InSDN dataset show that the proposed method achieves high detection accuracy. The advantages of the TCN model, such as parallel computation, fast training, and a relatively small number of parameters, make it highly practical for real-world abnormal network traffic detection.

The innovations of this study are mainly reflected in three aspects: the parallel computing capability and long-range temporal modeling of the TCN model overcome the low training efficiency and vanishing-gradient problem of traditional RNN-like models; the lightweight feature design captures the dynamic behavioral characteristics of anomalous traffic, such as sudden flooding and low-frequency long connections, with a one-dimensional sequence of packet lengths; and the model's small number of parameters makes deployment in resource-constrained scenarios such as edge networks feasible.

However, there are still limitations in the current work. First, the experimental validation in this study is primarily based on the InSDN dataset, which is a simulated dataset. While it effectively demonstrates the feasibility of our approach, further validation using real-world SDN traffic datasets and heterogeneous architectures is required to ensure the model’s generalizability. Future research will aim to evaluate the proposed method on diverse datasets, including real-world encrypted traffic scenarios, to enhance its robustness across different SDN environments.

Second, more advanced feature extraction techniques are needed to improve anomaly detection performance, particularly in encrypted traffic scenarios. Traditional packet length sequences, while effective in capturing temporal traffic patterns, may not provide sufficient discriminatory power when dealing with encrypted communication or protocol obfuscation. Feature extraction plays a crucial role in network anomaly detection by enabling the identification of key behavioral patterns without relying on packet payload analysis. Advanced techniques, such as flow-based statistical analysis, entropy-based metrics, and graph-based representations, can enhance detection capabilities by capturing higher-order dependencies within traffic flows. Furthermore, deep learning-based feature extraction methods, such as attention mechanisms and autoencoders, can learn hierarchical representations of network traffic, improving the adaptability of models to evolving attack patterns. In future research, integrating multi-modal feature extraction could provide a more comprehensive understanding of anomalous behaviors, leading to more robust and generalizable detection models.

Third, real-time feasibility remains a challenge. While TCN’s parallel computation offers an advantage in efficiency, achieving low-latency detection in high-speed SDN environments requires further optimization. Future work will focus on introducing knowledge distillation and quantization compression techniques to optimize inference speed while maintaining detection accuracy. Additionally, incremental learning mechanisms will be explored to dynamically adapt the model to new attack patterns, reducing reliance on frequent retraining.

Finally, cross-domain collaborative detection will be investigated by integrating logs from northbound interfaces and state data from the control plane to provide a more holistic view of network security. By combining multi-modal feature fusion, real-time optimization, incremental learning, and cross-domain analysis, future research aims to construct a more robust and adaptive SDN security protection system, providing both theoretical support and practical implementation for intelligent network defense.

Author Contributions

Conceptualization, Z.W. and C.L.; methodology, Z.W.; formal analysis, Z.W.; investigation, X.L., Z.G. and X.S.; resources, Z.G. and X.L.; data curation, Z.W. and C.L.; writing—original draft preparation, Z.W. and Z.G.; writing—review and editing, Z.W., X.L., Z.G., X.S. and J.L.; supervision, X.S. and J.L.; project administration, X.S. and J.L. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the Basic Research Project of the National Defense Science and Industry Bureau (Project No. JCKY2022405C010). We would like to express our deepest gratitude to these organizations for their generous funding and support.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

The original contributions presented in this study are included in the article. Further inquiries can be directed to the corresponding authors.

Conflicts of Interest

The authors declare no conflicts of interest.

References

1. Yang, Z.; Liu, X.; Li, T.; Wu, D.; Wang, J.; Zhao, Y.; Han, H. A systematic literature review of methods and datasets for anomaly-based network intrusion detection. Comput. Secur. 2022, 116, 102675.
2. Son, J.; Buyya, R. A taxonomy of software-defined networking (SDN)-enabled cloud computing. ACM Comput. Surv. (CSUR) 2018, 51, 1–36.
3. Li, Z.; Liu, F.; Yang, W.; Peng, S.; Zhou, J. A survey of convolutional neural networks: Analysis, applications, and prospects. IEEE Trans. Neural Netw. Learn. Syst. 2021, 33, 6999–7019.
4. Radford, B.J.; Apolonio, L.M.; Trias, A.J.; Simpson, J.A. Network traffic anomaly detection using recurrent neural networks. arXiv 2018, arXiv:1803.10769.
5. Hochreiter, S.; Schmidhuber, J. Long short-term memory. Neural Comput. 1997, 9, 1735–1780.
6. Dey, R.; Salem, F.M. Gate-variants of Gated Recurrent Unit (GRU) neural networks. In Proceedings of the 2017 IEEE 60th International Midwest Symposium on Circuits and Systems (MWSCAS), Boston, MA, USA, 6–9 August 2017; pp. 1597–1600.
7. Bai, S.; Kolter, J.Z.; Koltun, V. An empirical evaluation of generic convolutional and recurrent networks for sequence modeling. arXiv 2018, arXiv:1803.01271.
8. Liu, Y.; Garg, S.; Nie, J.; Zhang, Y.; Xiong, Z.; Kang, J.; Hossain, M.S. Deep Anomaly Detection for Time-Series Data in Industrial IoT: A Communication-Efficient On-Device Federated Learning Approach. IEEE Internet Things J. 2021, 8, 6348–6358.
9. Spantideas, S.; Giannopoulos, A.; Cambeiro, M.A.; Trullols-Cruces, O.; Atxutegi, E.; Trakadas, P. Intelligent Mission Critical Services Over Beyond 5G Networks: Control Loop and Proactive Overload Detection. In Proceedings of the 2023 International Conference on Smart Applications, Communications and Networking (SmartNets), Istanbul, Turkiye, 25–27 July 2023; pp. 1–6.
10. Zoure, M.; Ahmed, T.; Réveillère, L. Network Services Anomalies in NFV: Survey, Taxonomy, and Verification Methods. IEEE Trans. Netw. Serv. Manag. 2022, 19, 1567–1584.
11. Wang, W.; Liang, C.; Chen, Q.; Tang, L.; Yanikomeroglu, H.; Liu, T. Distributed Online Anomaly Detection for Virtualized Network Slicing Environment. IEEE Trans. Veh. Technol. 2022, 71, 12235–12249.
12. Duan, X.Y.; Fu, Y.; Wang, K. Network traffic anomaly detection method based on multi-scale residual classifier. Comput. Commun. 2023, 198, 206–216.
13. Heng, H.E.; Yan, H.U.; Zheng, L.; Xue, Z. Efficient DDoS attack detection and prevention scheme based on SDN in cloud environment. J. Commun. 2018, 39, 139–151.
14. Swami, R.; Dave, M.; Ranga, V. Voting-based intrusion detection framework for securing software-defined networks. Concurr. Comput. Pract. Exp. 2020, 32, e5927.
15. Deepa, V.; Sudar, K.M.; Deepalakshmi, P. Design of ensemble learning methods for DDoS detection in SDN environment. In Proceedings of the 2019 International Conference on Vision Towards Emerging Trends in Communication and Networking (ViTECoN), Vellore, India, 30–31 March 2019; pp. 1–6.
16. Maheshwari, A.; Mehraj, B.; Khan, M.S.; Idrisi, M.S. An optimized weighted voting based ensemble model for DDoS attack detection and mitigation in SDN environment. Microprocess. Microsyst. 2022, 89, 104412.
17. Tayfour, O.E.; Marsono, M.N. Collaborative detection and mitigation of DDoS in software-defined networks. J. Supercomput. 2021, 77, 13166–13190.
18. Xu, Y.; Sun, H.; Xiang, F.; Sun, Z. Efficient DDoS detection based on K-FKNN in software defined networks. IEEE Access 2019, 7, 160536–160545.
19. Kousar, H.; Mulla, M.M.; Shettar, P.; Narayan, D.G. Detection of DDoS attacks in software defined network using decision tree. In Proceedings of the 2021 10th IEEE International Conference on Communication Systems and Network Technologies (CSNT), Bhopal, India, 18–19 June 2021; pp. 783–788.
20. Satheesh, N.; Rathnamma, M.V.; Rajeshkumar, G.; Sagar, P.V.; Dadheech, P.; Dogiwal, S.R.; Velayutham, P.; Sengan, S. Flow-based anomaly intrusion detection using machine learning model with software defined networking for OpenFlow network. Microprocess. Microsyst. 2020, 79, 103285.
21. Sebbar, A.; Zkik, K.; Baddi, Y.; Boulmalf, M.; El Kettani, M.D.E.-C. MitM detection and defense mechanism CBNA-RF based on machine learning for large-scale SDN context. J. Ambient. Intell. Humaniz. Comput. 2020, 11, 5875–5894.
22. Ali, J.; Shan, G.Y.; Gul, N.; Roh, B.-H. An intelligent blockchain-based secure link failure recovery framework for software-defined internet-of-things. J. Grid Comput. 2023, 21, 57.
23. Wei, Y.Y.; Jang-Jaccard, J.; Sabrina, F.; Singh, A.; Xu, W.; Camtepe, S. AE-MLP: A hybrid deep learning approach for DDoS detection and classification. IEEE Access 2021, 9, 146810–146821.
24. Zhang, L.; Wang, J.S. A hybrid method of entropy and SSAE-SVM based DDoS detection and mitigation mechanism in SDN. Comput. Secur. 2022, 115, 102604.
25. Shen, Y.J. An intrusion detection algorithm for DDoS attacks based on DBN and three-way decisions. J. Phys. Conf. Ser. 2022, 2356, 012044.
26. Novaes, M.P.; Carvalho, L.F.; Lloret, J.; Proença, M.L., Jr. Adversarial deep learning approach detection and defense against DDoS attacks in SDN environments. Future Gener. Comput. Syst. 2021, 125, 156–167.
27. Abdallah, M.; Khae, N.A.L.; Jahromi, H.; Jurcut, A.D. A hybrid CNN-LSTM based approach for anomaly detection systems in SDNs. In Proceedings of the 16th International Conference on Availability, Reliability and Security, Vienna, Austria, 17–20 August 2021; pp. 1–7.
28. Wei, G.L.; Wang, Z.H. Adoption and realization of deep learning in network traffic anomaly detection device design. Soft Comput. 2021, 25, 1147–1158.
29. Bai, J.J.; Gu, R.C.; Liu, Q.H. A DDoS attack detection scheme based on Bi-LSTM in SDN. Comput. Eng. Sci. 2023, 45, 277–285.
30. Lin, K.D.; Xu, X.L.; Xiao, F. MFFusion: A multi-level features fusion model for malicious traffic detection based on deep learning. Comput. Netw. 2022, 202, 108658.
31. Liu, X.; You, J.L.; Wu, Y.L.; Li, T.; Li, L.; Zhang, Z.; Ge, J. Attention-based bidirectional GRU networks for efficient HTTPS traffic classification. Inf. Sci. 2020, 541, 297–315.
32. Luo, H.Q.; Wan, L. A recombination generative adversarial network for intrusion detection. Int. J. Appl. Math. Comput. Sci. 2024, 34, 323–334.
33. Elsayed, M.S.; Le-Khac, N.A.; Jurcut, A.D. InSDN: A novel SDN intrusion dataset. IEEE Access 2020, 8, 165263–165284.
34. Sharafaldin, I.; Lashkari, A.H.; Ghorbani, A.A. Toward generating a new intrusion detection dataset and intrusion traffic characterization. In Proceedings of the 4th International Conference on Information Systems Security and Privacy (ICISSP 2018), Madeira, Portugal, 22–24 January 2018; Volume 1, pp. 108–116.
35. Wang, W.; Zhu, M.; Zeng, X.; Ye, X.; Sheng, Y. Malware traffic classification using convolutional neural network for representation learning. In Proceedings of the International Conference on Information Networking, ICOIN, Da Nang, Vietnam, 11–13 January 2017; pp. 712–717.

Figure 1. The base structure of SDN.

Figure 2. Framework diagram of the proposed method. The framework focuses on malicious traffic (red) and normal traffic (blue), which are preprocessed, features are extracted by TCN model, and finally, classification of normal and malicious traffic is achieved by Softmax.

Figure 3. The process of traffic preprocessing.

Figure 4. TCN convolutional network architecture.

Figure 5. Improved TCN structure. The left part indicates that the TCN is composed of three residual blocks; the right part indicates the details of each residual block.

Figure 6. The selection of parameters for TCN: (a) Accuracy at different learning rates on InSDN. (b) Accuracy at different batch sizes on InSDN.

Figure 7. Confusion matrix of TCN for InSDN.

Table 1. Classification results of TCN on different datasets.

Datasets	Metrics	Benign	Malicious
InSDN	precision	0.9815	0.9603
	recall	0.9636	0.9797
	F1-score	0.9725	0.9699
CICIDS 2017	precision	0.9750	0.9520
	recall	0.9582	0.9630
	F1-score	0.9665	0.9575
USTC-TFC2016	precision	0.9780	0.9582
	recall	0.9626	0.9690
	F1-score	0.971	0.9635

Table 2. Average detection efficiency of different models on InSDN dataset.

Models	Accuracy	Precision	Recall	F1-Score
CNN	0.8559	0.8459	0.8411	0.8934
LSTM	0.8619	0.8740	0.8552	0.8543
CNN-LSTM [27]	0.8773	0.8759	0.8671	0.8678
BiLSTM [29]	0.8870	0.8764	0.8756	0.8761
TCN(Ours)	0.9700	0.9640	0.9691	0.9672

© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
