Two-Phase Deep Learning-Based EDoS Detection System

Nhu, Chien-Nguyen; Park, Minho

doi:10.3390/app112110249


by Chien-Nguyen Nhu ¹ and Minho Park ²,*

¹ Department of Information Communication, Materials, and Chemistry Convergence Technology, Soongsil University, Seoul 156-743, Korea

² School of Electronic Engineering, Soongsil University, Seoul 156-743, Korea

* Author to whom correspondence should be addressed.

Appl. Sci. 2021, 11(21), 10249; https://doi.org/10.3390/app112110249

Submission received: 25 September 2021 / Revised: 24 October 2021 / Accepted: 31 October 2021 / Published: 1 November 2021

(This article belongs to the Special Issue Machine Learning for Attack and Defense in Cybersecurity)


Abstract:

Cloud computing is currently considered the most cost-effective platform for offering business and consumer IT services over the Internet. However, it is prone to new vulnerabilities. A new type of attack called an economic denial of sustainability (EDoS) attack exploits the pay-per-use model to scale up resource usage over time to the extent that the cloud user has to pay for unexpected usage charges. To prevent EDoS attacks, a few solutions have been proposed, including hard-threshold and machine learning-based solutions. Among them, long short-term memory (LSTM)-based solutions achieve much higher accuracy and lower false-alarm rates than hard-threshold and other machine learning-based solutions. However, an LSTM requires a long input sequence, which degrades performance by increasing the amount of computation and the detection time and by consuming a large amount of the computing resources of the defense system. We therefore propose a two-phase deep learning-based EDoS detection scheme that uses an LSTM model to detect each abnormal flow in network traffic while requiring an input sequence length of only five. Thus, the proposed scheme can take advantage of the efficiency of the LSTM algorithm in detecting each abnormal flow in network traffic, while reducing the required sequence length of the input data. A comprehensive performance evaluation shows that our proposed scheme outperforms the existing solutions in terms of accuracy and resource consumption.

Keywords:

economic denial of sustainability; deep learning; cloud computing; long short-term memory; artificial neural network

1. Introduction

Cloud computing has become one of the fastest-growing segments in the IT industry. Cloud computing is increasingly attracting big, medium, and small businesses by offering on-demand inexpensive and scalable resources for achieving the system requirements. However, security is still a significant concern with this emerging technology. An economic denial of sustainability (EDoS) attack is currently becoming one of the most challenging cloud security issues [1].

An EDoS attack exploits the pay-per-use and auto-scaling features of cloud computing to charge a cloud adopter an excessive bill, leading to a large-scale service withdrawal or bankruptcy. An EDoS attack is a new variant of a distributed denial of service (DDoS) attack. Unlike a DDoS attack, which can prevent legitimate users from accessing a service for a certain amount of time, an EDoS attack can restrain a cloud adopter from delivering service indefinitely, leading to bankruptcy [2]. Because EDoS is a relatively new form of attack with a tricky nature, it is more challenging to detect an EDoS attack than a DDoS attack [1].

To the best of our knowledge, very few solutions can tackle EDoS attacks efficiently. Most of the existing solutions are hard-threshold-based solutions with high false-alarm rates or high time complexity, such as graphical Turing tests and crypto-puzzle solutions [3,4,5,6,7,8,9,10,11,12]. A machine learning-based approach is presented in [13] to resolve the issue caused by a hard threshold and to decrease the high false-positive rate. However, this method does not achieve high accuracy. Moreover, the method in [13] only detects that an attack is happening in the network traffic and cannot identify which flow is abnormal. Because an EDoS attack is a type of slow-rate attack, the attack traffic looks similar to legitimate network traffic from the victim end during each time period. To efficiently detect this type of slow-rate attack, it is necessary to trace or collect the historical information of the attack source. Considering that network traffic also has a sequential relationship in the time dimension, two multivariate time-series data-based algorithms are proposed in [14,15] based on two variants of a recurrent neural network (RNN), i.e., long short-term memory (LSTM) and a gated recurrent unit (GRU), to detect and mitigate EDoS attacks in each network flow. These solutions achieve high accuracy by using a dynamic threshold that can reduce the high false-alarm rate. However, a disadvantage of an LSTM or other RNN variants in anomaly detection is the long sequence length that the model requires, which increases the calculation and detection times of the defense system and consumes a large amount of computing resources. The LSTM model presented in [14] requires a sequence length of 250 for the input data. In some cases, the network traffic can contain a huge number of network flows. Because the systems in [14,15] examine each network flow, the calculation time and the delay of these schemes will increase significantly.

To address the above issues, we propose a two-phase deep learning-based EDoS detection scheme that uses the LSTM algorithm to detect and mitigate each abnormal flow while significantly reducing the sequence length of the LSTM model. In the first phase, an artificial neural network (ANN) algorithm observes the network traffic within a time interval to check whether there is an attack during that time period. We call the detector in the first phase the period detector because it detects the period under attack. If an attack is detected in the first phase, the second phase, which uses the LSTM algorithm, is triggered to detect the attack flows. The second detector, called the flow detector, determines precisely which flow is abnormal. By using the period detector before the flow detector, we know when an attack has occurred and which data are the most critical among the long sequence of input data for the LSTM model. By doing so, we can reduce the sequence length of the LSTM input data and eliminate the case where the flow detector has to handle a huge number of network flows when the network traffic contains no abnormal flows. Thus, by applying the two phases of detection, our model exploits the advantage of an LSTM while reducing the sequence length of the input data to five (the shortest length yet) and significantly decreasing the calculation time and the delay of the system.

Our contributions in this study are as follows:

First, we propose a novel solution using two-phase detectors to efficiently detect EDoS attacks. It is known that solutions using an LSTM or other variants of an RNN algorithm to tackle EDoS attacks achieve higher accuracy and lower false-alarm rates than existing hard-threshold-based solutions. However, the LSTM-based solutions require a long sequence of input data, increasing the detection time and computational overhead of a defense system. The proposed scheme exploits the advantages of LSTM algorithms, i.e., high accuracy and a low false-alarm rate, and overcomes the shortcoming of requiring a long sequence of input data.
Second, we implemented an EDoS detection and prevention system using two-phase detectors to detect common types of EDoS attacks. Whereas the existing schemes only detect whether an attack is happening and warn the cloud provider to react, the implemented system detects and mitigates each abnormal flow in the network traffic.
Finally, we conducted extensive experiments to demonstrate the effectiveness of our scheme. We collected and analyzed the experimental results and made detailed comparisons with other algorithms to illustrate that the proposed scheme outperforms the others in terms of accuracy and resource consumption.

The rest of this paper is organized as follows: In Section 2, we present related studies. The background knowledge is described in Section 3. In Section 4, we describe our proposed scheme in detail. Section 5 describes our testbed. The performance evaluation is presented in Section 6, followed by the discussion in Section 7. Finally, we provide some concluding remarks and discuss future areas of research in Section 8.

2. Related Works

Several techniques for alleviating EDoS attacks can be found in the literature. Each comes with its own benefits and limitations. Self-verifying proof of work (sPoW) [4], proposed by Khor and Nakao, acts like a DNS server but returns a crypto puzzle with an encryption key instead of the IP address. The client must solve this puzzle to receive authorization to use the cloud service. However, attackers can easily launch a puzzle accumulation attack without solving a crypto-puzzle-based request. In addition, the scheme is prone to false positives because some legitimate users might be denied service owing to the difficulty of the puzzle. In [5,6], the authors proposed EDoS-Shield to lessen the impact of EDoS attacks in cloud computing. Two main components, virtual firewalls (VFs) and authentication nodes (V-nodes), have major roles in detecting EDoS attacks. The authentication nodes conduct a verification process using a graphical Turing test for requests sent from a new IP address and place this address in either a whitelist or a blacklist. Subsequent requests from an IP address in the blacklist are blocked by the virtual firewall. Although EDoS-Shield is quite effective at preventing botnet-generated traffic, this model uses a graphical Turing test, which is poor at verification. Such a test can lead to a high false-positive rate and increase end-to-end latency. Two other studies, also using a graphical Turing test and a crypto puzzle technique to mitigate EDoS attacks, are described in [7,8], respectively. The limitations of these mechanisms are similar to those of EDoS-Shield or sPoW, i.e., a high false-positive rate and increased end-to-end latency.

Some statistical methods, such as entropy-based [16] or fuzzy-based [17] methods, have been proposed to detect EDoS attacks. In [16], the authors achieved a good detection accuracy. However, the method was evaluated on an extremely simple testbed, which raises doubt regarding its performance in real-world deployments. The fuzzy-entropy-based EDoS mitigation described in [17] produces high errors because of its predefined rules.

Some researchers have recently proposed machine-learning-based techniques to handle EDoS attacks. In [13], the authors analyze an execution trace to detect different EDoS attacks. They use an SVM and a neural network, and propose a set of features to detect three types of EDoS attacks. However, this mechanism only detects whether an attack is occurring and warns the cloud to react. If there is an attack happening, the cloud provider will not scale up the system. In addition, EDoS attacks cannot be detected on each flow. Recognizing that network traffic is a type of multivariate time-series data, the authors in [14,15] proposed using an LSTM and a gated recurrent unit (GRU), two variants of a recurrent neural network (RNN). These algorithms handle data with sequential relationships extremely effectively. The mechanisms in [14,15] achieve high accuracy and are evaluated through several metrics such as accuracy, detection time, cost, and complexity. However, using an LSTM and a GRU leads to the problem of high resource consumption. The sequence length of the input data required by the two algorithms is long. The LSTM in [14] requires a sequence length of 250, and the GRU requires a length of 100. This lengthens the detection time and increases the resource consumption of the defense system.

As shown in the above review, no existing proposal simultaneously achieves high accuracy, uses few resources, and detects each flow of network traffic when tackling EDoS attacks. Based on an analysis of the LSTM algorithm and on defense recommendations from two recent in-depth studies on EDoS characteristics ([18,19]), we propose a two-phase deep learning-based EDoS detection mechanism that takes advantage of the LSTM while eliminating the limitations caused by the model complexity.

3. Background Knowledge

3.1. EDoS Attack Analysis

As mentioned in [19], an EDoS attack is a type of low-rate DDoS attack. Unlike a high-rate DDoS attack, however, an EDoS attack is sophisticated and arduous to detect because of its low-rate traffic and stealthy behavior. EDoS and high-rate DDoS attacks also differ in terms of their purpose. In high-rate DDoS attacks, the goal of the attacker is to disrupt the services offered by a cloud service provider. Therefore, high-rate DDoS attackers aggressively launch attacks over a short period of time with maximum resources. Conversely, EDoS attacks target the financial component of the service provider. As depicted in Figure 1, EDoS attacks exploit the auto-scaling feature of cloud computing and cause an unwanted installation of new virtual machines. The costs associated with this unpaid malicious usage burden the cloud service provider. EDoS attackers gradually push illegitimate traffic over a longer period of time. Because EDoS attack traffic looks similar to benign traffic, its detection is a challenging task.

Like DDoS attacks, EDoS attacks can be categorized into two types: bandwidth depletion and resource depletion attacks [20]. In bandwidth depletion attacks, attackers flood a victim with unwanted traffic, which exhausts the network bandwidth of the victim. The source IP address of an attacker generates one or two flows with a high packet volume in each flow. ICMP flooding, Smurf, and Fraggle attacks are representatives of this type. In resource depletion attacks, the attacker aims to generate a large number of flows to a victim address. Each generated flow contains a small number of packets. Because these packets are spoofed IP packets, there will be no reply packets from the victim server. Thus, these flows remain alive during the attack. This consumes a significant amount of the resources of the victim, including the CPU and memory, and the victim server has to request more resources from the cloud service provider. A TCP SYN flooding attack is a representative of this type. In [13,21], an attack that targets a specific application (such as a database API request attack) and a Yo-Yo attack are introduced. However, when we study the characteristics of these attacks, we see that they can be classified into the above two types. The common types of EDoS attacks are summarized in Table 1.

3.2. Basics of Artificial Neural Network

In this study, an ANN model is used in the first phase of the proposed two-phase deep learning-based EDoS detection mechanism. The ANN module collects the traffic information during a time period and detects whether an attack is occurring during this time period. An ANN is a biologically inspired form of distributed computation [22] and is composed of simple processing units and connections between them. In this study, we employ classic feed-forward neural networks trained using a back-propagation algorithm.

A feed-forward neural network has an input layer and an output layer, with one or more hidden layers in between. The ANN functions as follows: each node i in the input layer has a signal $x_{i}$ as the network input, which is multiplied by a weight value $w_{i j}$ between the input layer and the hidden layer. Each node j in the hidden layer receives the signal $a_{j}$ according to this formula:

a_{j} = θ_{j} + \sum_{i = 1}^{n} x_{i} w_{i j}

(1)

The output $a_{j}$ of each node j of the hidden layer is then broadcast to the output layer:

y_{k} = θ_{k} + \sum_{j = 1}^{m} a_{j} w_{j k}

(2)

where $θ_{j}$ and $θ_{k}$ are the biases in the hidden layer and output layer, respectively, and n and m are the numbers of nodes in the input layer and hidden layer, respectively. The output of the output layer is passed through an activation function. The activation could be a sigmoid function for a binary classification problem or a softmax function for a multi-class classification problem. The output obtained from the activation function is compared with the target. In this study, we used the mean square error as the error function:

E_{m} = \frac{1}{n} \sum_{i = 1}^{n} {(T_{i} - Y_{i})}^{2}

(3)

where $T_{i}$ and $Y_{i}$ are the target value and output value, respectively, and n here denotes the number of compared values. During training, the weights $w_{i j}$ of the ANN model are updated to drive the error function $E_{m}$ toward zero.
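As an illustration of Equations (1) to (3), the following is a minimal pure-Python sketch of one forward pass and the mean square error. The toy network size, the weight values, and the function names (`dense`, `forward`, `mse`) are assumptions chosen for readability and are not taken from the proposed model.

```python
import math

def dense(inputs, weights, biases):
    """Weighted sum plus bias for every node of the next layer (Eqs. 1 and 2).

    weights[i][j] connects node i of the previous layer to node j of the next.
    """
    return [
        biases[j] + sum(x * weights[i][j] for i, x in enumerate(inputs))
        for j in range(len(biases))
    ]

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def forward(x, w_hidden, b_hidden, w_out, b_out):
    a = dense(x, w_hidden, b_hidden)   # hidden signals a_j, Eq. (1)
    y = dense(a, w_out, b_out)         # output signals y_k, Eq. (2)
    return [sigmoid(v) for v in y]     # sigmoid activation for binary classification

def mse(targets, outputs):
    """Mean square error E_m, Eq. (3)."""
    return sum((t - y) ** 2 for t, y in zip(targets, outputs)) / len(targets)

# Hypothetical toy network: 2 inputs, 2 hidden nodes, 1 output node
x = [1.0, 2.0]
w_hidden = [[0.5, -0.5], [0.25, 0.25]]
b_hidden = [0.0, 0.0]
w_out = [[1.0], [1.0]]
b_out = [-1.0]

prediction = forward(x, w_hidden, b_hidden, w_out, b_out)
error = mse([1.0], prediction)
```

Back-propagation would then adjust the weights to reduce `error`; that step is omitted here to keep the sketch short.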

3.3. Basics of Long Short-Term Memory

LSTM, which is a variant of an RNN [23], is known to overcome the major disadvantage of an RNN, i.e., an inability to process long sequences of data or information. Thus, an LSTM is commonly used for anomaly detection problems such as network intrusion detection, because this problem requires a long sequence of network traffic information. An LSTM takes the form of a chain of repeating cells. Each cell contains four neural network layers that interact in a special way to enable the network to remember historical information. The LSTM protects and controls the state of the cells through the input gate, output gate, and forget gate. Figure 2 depicts the architecture of an LSTM cell.

The internal calculation formula of the LSTM cell is defined as follows:

f_{t} = σ (W_{f} \times [h_{t - 1}, x_{t}] + b_{f})

(4)

i_{t} = σ (W_{i} \times [h_{t - 1}, x_{t}] + b_{i})

(5)

\tilde{C_{t}} = tanh (W_{C} \times [h_{t - 1}, x_{t}] + b_{C})

(6)

C_{t} = f_{t} * C_{t - 1} + i_{t} * \tilde{C_{t}}

(7)

o_{t} = σ (W_{o} \times [h_{t - 1}, x_{t}] + b_{o})

(8)

h_{t} = o_{t} * tanh (C_{t})

(9)

where i, o, and f indicate the input gate, output gate, and forget gate, respectively. Here, W is a weight matrix, and b is the bias. In addition, $\tilde{C}$ and C are the candidate state and new state, h is the output, x is the input, t is the time step, and σ denotes a sigmoid function. However, the disadvantage of an LSTM is that it requires long sequential input data. Because the LSTM model does not know where the most important part of the sequential input data is located, it requires a long sequence to extract the historical information and make a prediction or classify the input data. In this study, we combine an ANN and an LSTM to classify each flow in the network traffic. The ANN model acts as a period detector that detects when an attack occurs. From this, the LSTM model, which plays the role of a flow detector, knows where the important part of the sequential information is located, which allows the sequential input data to be shortened.
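The cell update in Equations (4) to (9) can be sketched in pure Python as follows. This is only an illustrative sketch: the parameter dictionary layout, the toy dimensions, and the constant weight values are assumptions, and a practical implementation would rely on a deep learning framework.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def affine(W, v, b):
    """Compute W x v + b for a weight matrix W given as a list of rows."""
    return [sum(w * x for w, x in zip(row, v)) + bi for row, bi in zip(W, b)]

def lstm_step(x_t, h_prev, c_prev, p):
    """One LSTM cell update; p holds the weight matrices W_* and biases b_*."""
    v = h_prev + x_t                                           # [h_{t-1}, x_t]
    f = [sigmoid(z) for z in affine(p["W_f"], v, p["b_f"])]    # forget gate, Eq. (4)
    i = [sigmoid(z) for z in affine(p["W_i"], v, p["b_i"])]    # input gate, Eq. (5)
    g = [math.tanh(z) for z in affine(p["W_C"], v, p["b_C"])]  # candidate state, Eq. (6)
    c = [ft * cp + it * gt for ft, cp, it, gt in zip(f, c_prev, i, g)]  # Eq. (7)
    o = [sigmoid(z) for z in affine(p["W_o"], v, p["b_o"])]    # output gate, Eq. (8)
    h = [ot * math.tanh(ct) for ot, ct in zip(o, c)]           # cell output, Eq. (9)
    return h, c

def run_sequence(xs, hidden_size, p):
    """Feed a whole sequence through the cell, starting from zero states."""
    h = [0.0] * hidden_size
    c = [0.0] * hidden_size
    for x_t in xs:
        h, c = lstm_step(x_t, h, c, p)
    return h, c

# Hypothetical toy setup: 3 input features, 2 hidden units, constant weights
hidden, features = 2, 3
params = {name: [[0.1] * (hidden + features) for _ in range(hidden)]
          for name in ("W_f", "W_i", "W_C", "W_o")}
params.update({name: [0.0] * hidden for name in ("b_f", "b_i", "b_C", "b_o")})

sequence = [[0.2, 0.1, 0.0]] * 5   # five time steps
h_final, c_final = run_sequence(sequence, hidden, params)
```

Each call to `lstm_step` carries the state `(h, c)` forward, so the cost of processing one flow grows linearly with the sequence length, which is why a shorter input sequence reduces the calculation time.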

4. System Design

In this section, our research goal and system design analysis are first introduced. Then, the components and workflow are given. Finally, the internal modules of our proposed scheme are thoroughly explained.

4.1. An Objective and System Design Analysis

Cloud consumers are typically monitored using a multivariate time series, whose anomaly detection is critical for service quality management [14]. For instance, as shown in Figure 3, the parameters of a cloud instance, such as CPU utilization and memory consumption, are collected and tracked by a network administrator. When an EDoS attack is launched, the parameters of the cloud consumer will change in value. An EDoS attack is similar to a low-rate DDoS attack in terms of its characteristics [18]. The network traffic generated by this attack does not change as dramatically within each time period as that of a conventional DDoS attack. As discussed in [14,15], an LSTM or a GRU, two variants of an RNN, are chosen for EDoS detection. These algorithms are able not only to learn from historical data but also to simultaneously keep track of the multivariate time-series data to detect an EDoS attack. These methods collect the sequential multivariate information of the attacker at each observation and learn not only the past values of each variable but also the correlation between variables. Nevertheless, the complexity of the model is a limitation of these algorithms. The algorithms also require a long sequence of input data. The sequence length can be 100 or longer (in [14], input data with a sequence length of 250 are required). This increases the calculation time overhead and response time of the entire system. In particular, because every flow must be examined, the total calculation time becomes even longer.

Thus, in this paper, a two-phase deep learning-based mechanism is proposed to overcome this limitation of an LSTM in EDoS attack detection. The first phase in the proposed two-phase scheme is a period detector, which detects whether an EDoS attack has occurred within an observed time period. An ANN is the core algorithm of the period detector. The second phase is a flow detector, which accurately filters out abnormal flows. An LSTM is the core of the flow detector. The aim of the period detector is to reduce the sequence length of the input data for the LSTM model in the flow detector. If we used the flow detector directly, we would require a sequence length of a hundred or more for the flow feature input data of each flow, because the model cannot know when the abnormal period occurred, i.e., where the most important part of a long sequence of input data is located. By using the period detector before the flow detector, we know when an attack happened and where the most important part of the long sequential input data is for the LSTM model. By doing so, we can reduce the sequence length of the input data for the LSTM, which matters especially when the network traffic contains a huge number of normal flows. If we only used the flow detector, the calculation time and response time of the system would increase considerably, even though it is not necessary to examine every network flow in periods without attacks. In brief, the period detector helps reduce the calculation time of the flow detector. Figure 4 depicts an overview of our proposed mechanism aimed at detecting EDoS attacks. This conceptual architecture includes four main modules: a raw data processing scheme, a period detector, a flow detector, and a firewall. The system workflow and the mission of each module are explained in detail in the next sections.

4.2. System Workflow

Figure 5 presents a detailed architecture of the two-phase deep learning-based EDoS detection mechanism. During the data preprocessing stage, the collector runs the Wireshark tool every 5 s to capture all packets going through a virtual switch running on Open vSwitch; the system resources of the victim server are also collected. The network packets are captured into a pcap file and saved to a disk in the collector machine. Afterward, the feature extractor loads the pcap file from the collector to extract the data attributes. These data attributes are also transformed and standardized to fit the ANN model. After preprocessing, the appropriate data are sent to the first detector, i.e., the ANN model, to detect whether an attack happened during that 5-s time period. If the ANN model, which is a binary classifier, outputs an attack, the observed pcap file of the 5-s time period is split into five consecutive pcap files of 1-s time periods. The feature extractor is called again to extract the flow-based attributes of each flow in the five consecutive pcap files. The extracted flow-based features are assembled into sequential data with a length of five. The sequential data are sent to the flow detector, i.e., the LSTM model, to classify whether each flow in the observed time period is abnormal. If a flow is detected as abnormal, the source address of this flow is added to the blacklist of a firewall. Note that, to effectively adapt our proposed framework to different network systems, the ANN and LSTM models are replaced at a preset time by new ANN and LSTM models that are trained using the updated database. The workflow of the two-phase deep learning-based EDoS detection mechanism is summarized in Algorithm 1.

Algorithm 1 The two-phase deep learning-based EDoS Defense Mechanism

P_{i} ⟵ pcap file of a 5-s period i
Pe_{i} ⟵ feature tuple of the period i
periodOutput ⟵ output of the ANN detector
flowOutput ⟵ output of the LSTM detector
m ⟵ number of flows in P_{i}
x^{j} = \{x_{t - 4}^{j}, x_{t - 3}^{j}, x_{t - 2}^{j}, x_{t - 1}^{j}, x_{t}^{j}\} ⟵ sequential feature of each flow j from t - 4 to t
P_{j} ⟵ pcap file of flow j split from P_{i}
FeatureExtractor ⟵ Feature Extractor module
ANN ⟵ ANN model
LSTM ⟵ LSTM model
ipAddress ⟵ IP address of flow j

loop
    Every 5 s, Collector captures P_{i}
    Send P_{i} to FeatureExtractor ⟶ Pe_{i}
    Feed Pe_{i} to ANN ⟶ periodOutput
    if periodOutput == 1 (attack period) then
        Split P_{i} into m pcap files of m flows
        for j = 1 to m do
            Send P_{j} to FeatureExtractor ⟶ x^{j}
            Feed x^{j} to LSTM ⟶ flowOutput
            if flowOutput == 1 (abnormal flow) then
                Send ipAddress to vFirewall
            else
                continue
            end if
        end for
    else
        continue
    end if
end loop
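The control flow of Algorithm 1 for a single 5-s period could be sketched in Python as shown below. The two detectors are passed in as plain callables, and the threshold-based stubs, the function name `two_phase_detect`, and the use of the source IP address as a flow identifier are all assumptions made for illustration; they stand in for the trained ANN and LSTM models.

```python
from typing import Callable, Dict, List, Sequence, Set

FlowId = str  # assumption: a flow is identified by its source IP address

def two_phase_detect(
    period_features: Sequence[float],
    flow_sequences: Dict[FlowId, List[Sequence[float]]],
    period_detector: Callable[[Sequence[float]], int],
    flow_detector: Callable[[List[Sequence[float]]], int],
    blacklist: Set[FlowId],
) -> List[FlowId]:
    """Process one 5-s period and return the flows newly flagged as abnormal."""
    if period_detector(period_features) != 1:
        return []                      # normal period: the flow detector is skipped
    flagged = []
    for flow_id, sequence in flow_sequences.items():
        if flow_detector(sequence) == 1:
            blacklist.add(flow_id)     # stands in for "Send ipAddress to vFirewall"
            flagged.append(flow_id)
    return flagged

# Hypothetical stand-ins for the trained ANN and LSTM models
def ann_stub(features):
    return int(sum(features) > 1.0)

def lstm_stub(sequence):
    return int(all(step[0] > 0.5 for step in sequence))

blacklist = set()
flows = {"10.0.0.5": [[0.9]] * 5, "10.0.0.7": [[0.1]] * 5}
flagged = two_phase_detect([0.8, 0.7], flows, ann_stub, lstm_stub, blacklist)
quiet = two_phase_detect([0.1, 0.2], flows, ann_stub, lstm_stub, set())
```

The early return in the normal-period branch is what saves computation: per-flow sequences only need to be classified when the period detector raises an alarm.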

4.3. Internal Modules

Herein, we present all components of the two-phase deep learning-based EDoS detection mechanism, as shown in Figure 5.

4.3.1. Collector

The collector module captures all network packets going through a virtual switch running on Open vSwitch and collects the system resources of the victim server. The collector uses the Wireshark tool to achieve this mission. Wireshark is the world’s most popular network protocol analyzer [24] and provides tools for capturing, viewing, and analyzing data packets. Every 5 s, the collector runs the Wireshark tool to capture all of the network packets and saves them into a pcap file. The pcap files are placed into storage in the collector machine.

4.3.2. Feature Extractor

This module loads the pcap file from the pcap file storage and extracts the appropriate features, as shown in Table 2. These attributes are then normalized before being fed to the ANN or LSTM model. The features are selected by applying the correlation-based feature selection process proposed in [25]. The purpose of feature selection is to increase the detection accuracy of the model and decrease the computing resources required, because irrelevant or redundant features decrease the accuracy and waste computing resources. As proposed in [25], a feature set S is evaluated using the metric $M_{S}$:

M_{S} = \frac{k \bar{r_{c f}}}{\sqrt{k + k (k - 1) \bar{r_{f f}}}}

(10)

where k is the number of features in the feature set S, $\bar{r_{c f}}$ is the mean correlation between each feature f in S and the class label c, and $\bar{r_{f f}}$ is the average inter-correlation between two features in S. In addition, $\bar{r_{c f}}$ and $\bar{r_{f f}}$ are calculated using the information gain (IG), which measures the correlation between two random variables X and Y:

IG = H (X) - H (X | Y) = H (X) + H (Y) - H (X, Y)

(11)

where the entropies H(X) and H(Y) are calculated as:

H (X) = - \sum_{x \in X} p (x) {log}_{2} p (x)

(12)

H (Y) = - \sum_{y \in Y} p (y) {log}_{2} p (y)

(13)

and the joint entropy H(X,Y) is calculated in the same way over the joint distribution p(x,y).

We calculate $M_{S}$ for each random subset of features and choose the subset S that has the largest value of $M_{S}$.
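To make Equations (10) to (13) concrete, the following is a small pure-Python sketch that scores feature subsets with the merit $M_{S}$. It assumes that the features are discrete (or have already been discretized) and uses the raw information gain as the correlation measure, as described above; the function names and the toy data are hypothetical.

```python
import math
from collections import Counter
from itertools import combinations

def entropy(values):
    """Shannon entropy in bits of a discrete sample, Eqs. (12) and (13)."""
    n = len(values)
    return -sum((c / n) * math.log2(c / n) for c in Counter(values).values())

def information_gain(xs, ys):
    """IG = H(X) + H(Y) - H(X, Y), Eq. (11)."""
    return entropy(xs) + entropy(ys) - entropy(list(zip(xs, ys)))

def merit(features, labels):
    """Merit M_S of a feature subset, Eq. (10).

    features: dict mapping a feature name to its list of discrete values.
    """
    k = len(features)
    r_cf = sum(information_gain(col, labels) for col in features.values()) / k
    pairs = list(combinations(features.values(), 2))
    r_ff = sum(information_gain(a, b) for a, b in pairs) / len(pairs) if pairs else 0.0
    return k * r_cf / math.sqrt(k + k * (k - 1) * r_ff)

# Hypothetical toy data: one feature tracks the label, the other is unrelated
labels = [0, 0, 1, 1]
informative = [0, 0, 1, 1]
irrelevant = [0, 1, 0, 1]

merit_one = merit({"informative": informative}, labels)
merit_two = merit({"informative": informative, "irrelevant": irrelevant}, labels)
```

In this toy case the subset containing only the informative feature receives the higher merit, which matches the intent of the selection step: adding a feature that carries no information about the class lowers $M_{S}$.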

The module runs a script that takes advantage of the Apache Spark framework [26], an open-source unified analytics engine for large-scale data processing, to speed up the calculation. After being extracted, the features need to be normalized because they do not have similar ranges of values. The normalization can be expressed as follows:

z = \frac{x - μ}{σ}

(14)

where x is the original feature value, z is the standardized value, μ is the mean of the distribution, and σ is the standard deviation of the distribution.
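A minimal sketch of the standardization in Equation (14) for a single feature column is given below. It assumes the population standard deviation and handles a constant column explicitly; both choices, as well as the function name, are assumptions rather than details given in this paper.

```python
import statistics

def standardize(column):
    """Return z = (x - mu) / sigma for every value x in the column, Eq. (14)."""
    mu = statistics.fmean(column)
    sigma = statistics.pstdev(column)   # population standard deviation (assumption)
    if sigma == 0:
        return [0.0 for _ in column]    # constant feature: no spread to scale by
    return [(x - mu) / sigma for x in column]

# Hypothetical feature column, e.g., packet counts per period
z_scores = standardize([2, 4, 4, 4, 5, 5, 7, 9])
```

In a deployed detector, the mean and standard deviation would normally be estimated on the training data and reused at detection time so that training and detection inputs share the same scale.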

When the ANN model detects an attack during a 5-s time period, this module is called again to extract and calculate features from the five pcap files of 1-s time periods and to build the sequential data for the LSTM model.
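The construction of the length-five sequences can be sketched as follows. The packet tuple layout, the use of the source IP address as the flow key, and the two per-window features (packet count and byte count) are placeholders chosen for illustration; the actual flow features are those listed in Table 4.

```python
from collections import defaultdict

WINDOW_S = 1.0   # length of one sub-window in seconds
SEQ_LEN = 5      # number of sub-windows in one 5-s period

def build_flow_sequences(packets, period_start):
    """Group packets by flow and 1-s window, then emit one sequence per flow.

    packets: iterable of (timestamp, src_ip, size_bytes) tuples (hypothetical schema).
    Returns {src_ip: [[packet_count, byte_count], ...]} with SEQ_LEN steps per flow.
    """
    sequences = defaultdict(lambda: [[0, 0] for _ in range(SEQ_LEN)])
    for ts, src_ip, size in packets:
        step = int((ts - period_start) // WINDOW_S)
        if 0 <= step < SEQ_LEN:            # ignore packets outside the period
            sequences[src_ip][step][0] += 1
            sequences[src_ip][step][1] += size
    return dict(sequences)

# Hypothetical packets captured during one 5-s period starting at t = 0
packets = [
    (0.2, "10.0.0.5", 60),
    (0.7, "10.0.0.5", 60),
    (3.1, "10.0.0.5", 40),
    (4.9, "10.0.0.7", 1500),
]
flow_sequences = build_flow_sequences(packets, period_start=0.0)
```

Windows in which a flow sends no packets stay at zero, so every flow yields exactly five steps regardless of how sparse its traffic is.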

4.3.3. ANN Model-Period Detector

The ANN module aims to learn a set of features, as shown in Table 2, to detect whether there is an attack during a time period of 5 s. In this study, we employ classic feed-forward neural networks trained with a back-propagation algorithm. The ANN model parameters are presented in Table 3. Our proposed ANN model has an input layer of 10 nodes, 3 hidden layers, and an output layer of 1 node. Because our ANN model is a binary classifier, the activation function of the output layer is a sigmoid function:

f (z) = \frac{1}{1 + e^{- z}}

(15)

A 10-dimensional vector x, whose elements correspond to the input variables, is fed into the ANN model. The trained model uses its learned parameters to compute a value z, which the sigmoid function then maps to a value $f (z)$ between zero and one. Based on this $f (z)$ value, the period detector classifies the observed time period as abnormal or normal.

4.3.4. LSTM Model-Flow Detector

The LSTM model classifies each individual flow during the attack time period detected by the ANN model as a normal or abnormal flow. For each flow, the LSTM model considers a time series $X = \{x^{(1)}, x^{(2)}, x^{(3)}, \dots, x^{(n)}\}$, where each step $x^{(t)}$ in the time series is an m-dimensional feature vector $(x_{1}^{(t)}, x_{2}^{(t)}, x_{3}^{(t)}, \dots, x_{m}^{(t)})$, whose elements correspond to the features in Table 4. For our proposed LSTM model, the value of the parameter n in a considered time series is five. In other words, the sequence length of the input data is five, which is shorter than that of the other LSTM models applied to the same problem. By using the period detector, we can determine when an attack occurs in the network traffic. Then, we only need to focus on the time series of each flow during that attacked time period. Thus, we can reduce the sequence length of the input data for the LSTM model.

Figure 6 shows the model architecture of the proposed LSTM. Our proposed model includes an LSTM layer, a dropout layer to avoid the over-fitting problem, a fully connected layer, and an output layer of a single node. The sigmoid function is used as the activation function for the output layer because the LSTM model is a binary classifier. The time-series data $X = \{x^{(t - 4)}, x^{(t - 3)}, x^{(t - 2)}, x^{(t - 1)}, x^{(t)}\}$ are fed into the LSTM layer. The LSTM layer learns both the temporal and spatial representations from the input sequential data and outputs a vector $h = \{h^{(t - 4)}, h^{(t - 3)}, h^{(t - 2)}, h^{(t - 1)}, h^{(t)}\}$ following Equations (4) to (9). Each element in the output vector acts as a unit in the input layer of the fully connected layer. The fully connected layer works as an artificial neural network. It outputs a value z, which then goes through the sigmoid function to yield a classification of zero (normal flow) or one (abnormal flow). The model parameters are presented in Table 5.

4.3.5. vFirewall

The virtual firewall (vFirewall) works as a filter mechanism that filters the incoming requests to the cloud services based on a comparison with a regularly updated blacklist. If the source IP address of a packet matches an IP address listed in the blacklist, vFirewall executes a drop action to drop this packet. The blacklist is updated by the flow detector. Other packets, whose IP addresses do not match any entry in the blacklist, are forwarded to the cloud server by a forward action.
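The drop and forward decisions described above amount to a set-membership check, as in the following minimal sketch. The class and method names are hypothetical, and the sketch only models the decision logic; it does not show how rules would be installed in the virtual switch.

```python
class VFirewall:
    """Minimal blacklist filter keyed on the source IP address."""

    def __init__(self):
        self.blacklist = set()

    def block(self, src_ip):
        """Called by the flow detector when a flow is classified as abnormal."""
        self.blacklist.add(src_ip)

    def handle(self, src_ip):
        """Return the action for a packet from src_ip: 'drop' or 'forward'."""
        return "drop" if src_ip in self.blacklist else "forward"

firewall = VFirewall()
firewall.block("10.0.0.5")                 # hypothetical abnormal source
actions = [firewall.handle(ip) for ip in ("10.0.0.5", "10.0.0.7")]
```

A set keeps the lookup cost per packet roughly constant even when the blacklist grows.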

4.3.6. Training Database

Because network states change continually, the accuracy of the deep learning-based detectors will naturally decrease over time without periodic retraining. At a preset time, the deployed ANN and LSTM models are therefore replaced by new ANN and LSTM models trained on the updated training database. There are two separate databases, one for the ANN model and one for the LSTM model, both of which are open-source MySQL databases. The training databases are continuously updated with labels and extracted features every time the ANN model or the LSTM model finishes detecting a time period or a network flow.

5. Experimental Setup

In this section, we present our attack scenarios for the simulation. Next, the description of our dataset is given. Finally, we describe our test bed in detail.

5.1. Attack Scenarios

As discussed in Section 2, there are two common types of EDoS attacks: bandwidth depletion and resource depletion. To demonstrate the efficiency of our proposed mechanism, we simulate an ICMP flooding attack, a TCP SYN flooding attack, and an HTTP flooding attack. The ICMP flooding attack is representative of bandwidth depletion attacks, and the other two are resource depletion attacks.

An ICMP flooding attack, also known as a ping flood attack, is a common EDoS attack in which an attacker takes down a victim server by overwhelming it with ICMP echo requests, i.e., ping requests. The network bandwidth of the victim server is overloaded by the attacker’s ping requests, and thus the victim server has to request more resources to reply to other normal requests.
A TCP SYN flooding attack, also known as a “half-open attack”, exploits part of the normal TCP three-way handshake to consume resources on the victim server and render it unresponsive. The attacker continuously sends initial connection requests to the victim server, making all ports unavailable to respond to upcoming legitimate traffic.
A TCP-HTTP flooding attack is an EDoS attack in which an attacker exploits a web server through seemingly legitimate HTTP GET or POST requests. The attacker forces the targeted web server to allocate the utmost resources for each request. Since the attacker does not use spoofed IP addresses as in a TCP SYN flooding attack, this kind of attack is not carried out at the network layer. Instead, it is carried out at the application layer.

EDoS attacks are similar to low-rate DDoS attacks [19]. An attack rate of 10,000 packets per second is considered the standard threshold for differentiating between a low-rate EDoS attack and a high-rate DDoS attack [15]. Following the instructions in [14,15], we use the Bonesi tool [27] to simulate an ICMP flooding attack and a TCP SYN flooding attack. A real EDoS attack scenario contains not only attack traffic but also normal traffic. To create a realistic EDoS attack, we therefore mix EDoS attack traffic with normal traffic. We simulated different levels of EDoS attacks whose request rates ranged from 1000 to 7000 requests per second, mixed with a fixed rate of 400 normal requests per second, following the EDoS evaluation scheme proposed by Al-Haidari et al. [28], as shown in Table 6.
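To make the traffic mix concrete, the short sketch below computes the share of attack requests in each scenario from the rates given above (1000 to 7000 attack requests per second, plus 400 normal requests per second). The scenario labels follow Table 6; the function and variable names are illustrative only.

```python
NORMAL_RATE = 400  # normal requests per second, fixed in every scenario

# Scenarios S-1 .. S-7 with attack rates of 1000 .. 7000 requests per second.
ATTACK_RATES = {f"S-{i}": 1000 * i for i in range(1, 8)}


def attack_share(attack_rate: int, normal_rate: int = NORMAL_RATE) -> float:
    """Fraction of all requests per second that belong to the attack."""
    return attack_rate / (attack_rate + normal_rate)


shares = {name: attack_share(rate) for name, rate in ATTACK_RATES.items()}
```

Under these rates the attack traffic makes up roughly 71% of all requests in S-1 and roughly 95% in S-7, so normal traffic remains a minority in every scenario.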

5.2. Dataset Description

Following the recommendations in [14,15,29,30], we choose the SMD dataset and the UNSW-NB15 dataset for training the ANN and LSTM models, respectively. The SMD dataset was published at KDD 2019 and was collected from real network traffic statistics of a large internet company. It is also used in many anomaly detection studies. Each SMD sample has 38 features. Using the correlation-based feature selection process, we chose 10 key features for the ANN model. UNSW-NB15 was created with the IXIA PerfectStorm tool in the Cyber Range Lab of the Australian Center for Cyber Security. A raw network packet file (pcap file) of 50.2 GB is provided. We processed this pcap file to calculate the multivariate time series of features for the LSTM model. The shapes of the training, validation, and testing sets of the ANN model are (28,479, 10), (4257, 10), and (20,300, 10), respectively. For the LSTM model, they are (30,400, 10), (6010, 10), and (18,379, 10), respectively.

5.3. Detailed Implementation and Test Preparation

Figure 7 shows our testing topology. It consists of a virtual machine acting as the victim server; another virtual machine acting as a normal user, with the Packet Sender tool [31] installed for generating normal traffic; two virtual machines with the Bonesi tool installed, acting as attackers; and a physical machine on which the Wireshark tool and the Apache Spark framework are installed and which acts as the Collector, the Feature Extractor, and the two detectors, i.e., the ANN model and the LSTM model. On the victim machine, we installed an Nginx server following the instructions in [32]. Linux IPTables [33] is deployed on the victim virtual machine. Open vSwitch is installed in a virtual machine. All of the virtual machines in our testbed run Ubuntu 20.04 and are created with Oracle VirtualBox version 5.2.42 [34]. The physical machine has an Intel(R) Core(TM) i7-4770 CPU at 3.40 GHz and a total of 48 GB of memory, and runs 64-bit Ubuntu Linux v18.04. For training and implementing the two deep learning models, we use Keras 2.4.3, an open-source software library that provides a Python interface [35]. The models are also trained on the physical machine mentioned above.

6. Result Analysis

A successful EDoS attack mitigation mechanism requires correct identification of attacks and a quick response time (quality of service), while minimizing the resource consumption [1]. In this section, we evaluate our proposed model based on two main criteria: quality of service and resource consumption. In addition, we evaluate the efficiency of the proposed model at the cloud service provider end by comparing the CPU usage of the victim server with and without the protection of our proposed model.

6.1. Quality of Service

To demonstrate the effectiveness of the proposed EDoS detection system, we compare our model with existing works under the same environmental setup. To the best of our knowledge, the works in [13,15] are the most recent state-of-the-art studies that apply machine learning to detect multiple types of EDoS attacks, so we use them as baselines. To evaluate the detection accuracy of our proposed model, we compare our approach with the two approaches (support vector machine (SVM) and neural network (NN)) in [13]. Our approach is designed to eliminate the computation and delay-time disadvantages of RNN-based methods. Thus, to evaluate the calculation time, the response time of the system, and the resource usage, we compare our proposed approach with the model in [15], which uses a GRU, another variant of the RNN, with a longer input sequence (100) than the one we use for the LSTM model. Four metrics, i.e., accuracy (AC), detection rate (R), F1-score (F1), and false alarm rate (FAR), are calculated to evaluate the anomaly detection capacity of our proposed mechanism. The four metrics are defined as follows:

Accuracy is the proportion of correctly classified flows among all flows:

$AC = \frac{TP + TN}{TP + TN + FP + FN}$

(16)
Detection rate is the proportion of detected abnormal flows among all abnormal flows:

$R = \frac{TP}{TP + FN}$

(17)
False alarm rate is the proportion of normal flows falsely classified as abnormal flows:

$FAR = \frac{FP}{FP + TN}$

(18)
F1-score is the harmonic mean of P and R:

$F1 = \frac{2 \times P \times R}{P + R}$

(19)
Here, $P = \frac{TP}{TP + FP}$ is the precision, i.e., the proportion of truly abnormal flows among all flows detected as abnormal.

TP, FP, TN, and FN denote true positives, false positives, true negatives, and false negatives, respectively.
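Equations (16) to (19) translate directly into code. The sketch below computes the four metrics from confusion-matrix counts; the function name, the dictionary keys, and the choice to return 0.0 for an empty denominator are assumptions made for illustration, and the counts in the usage lines are invented rather than taken from the experiments.

```python
from typing import Dict


def _ratio(numerator: float, denominator: float) -> float:
    """Safe division; returning 0.0 on an empty denominator is a convention chosen here."""
    return numerator / denominator if denominator else 0.0


def detection_metrics(tp: int, fp: int, tn: int, fn: int) -> Dict[str, float]:
    """Accuracy, detection rate, false alarm rate, precision, and F1 from raw counts."""
    precision = _ratio(tp, tp + fp)
    detection_rate = _ratio(tp, tp + fn)
    return {
        "accuracy": _ratio(tp + tn, tp + tn + fp + fn),           # Equation (16)
        "detection_rate": detection_rate,                          # Equation (17)
        "false_alarm_rate": _ratio(fp, fp + tn),                   # Equation (18)
        "precision": precision,
        "f1": _ratio(2 * precision * detection_rate,
                     precision + detection_rate),                  # Equation (19)
    }


# Invented counts: 100 abnormal flows (90 caught) and 200 normal flows (10 misflagged).
example = detection_metrics(tp=90, fp=10, tn=190, fn=10)
```

Note that the false alarm rate is normalized by the number of normal flows (FP + TN), so it is not affected by how many abnormal flows the sample contains.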

6.1.1. Evaluate the Period Detector

We first evaluate the performance of the period detector. Although the period detector cannot identify exactly which flow is abnormal, it helps the system detect when an attack occurs and reduces the sequence length of the input data for the flow detector, which speeds up the per-flow detection time of the whole system. We therefore evaluate the period detector to validate the parameters chosen for this ANN model.

Figure 8 presents the detection performance of the period detector for three common types of EDoS attacks using the model parameters listed in Table 3. The results are summarized in Table 7. For example, when detecting the ICMP flooding attack, the accuracy, detection rate, false alarm rate, and F1-score are 95.896%, 94.208%, 4.104%, and 95.831%, respectively. The results show that our model can accurately detect when an EDoS attack occurs, which helps the flow detector quickly detect abnormal flows.
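The division of labor between the two detectors can be summarized in a short control-flow sketch. It assumes that each detector is available as a callable returning a boolean; the type aliases, the function name, and the dictionary of per-flow windows are hypothetical and stand in for the trained ANN and LSTM models.

```python
from typing import Callable, List, Mapping, Sequence

Window = Sequence[Sequence[float]]                 # five time steps of ten features
PeriodDetector = Callable[[Sequence[float]], bool]  # True if the period is under attack
FlowDetector = Callable[[Window], bool]             # True if the flow is abnormal


def two_phase_detect(
    period_features: Sequence[float],
    flow_windows: Mapping[str, Window],
    period_detector: PeriodDetector,
    flow_detector: FlowDetector,
) -> List[str]:
    """Run the flow detector only when the period detector raises an alert."""
    if not period_detector(period_features):
        return []  # phase 1 found no attack: skip per-flow inspection entirely
    # Phase 2: inspect each flow observed during the attacked period.
    return [flow_id for flow_id, window in flow_windows.items() if flow_detector(window)]
```

The early return is the point of the design: during normal periods, the more expensive per-flow model is never invoked.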

6.1.2. Evaluate the Flow Detector

We then evaluated the detection performance of the flow detector. The flow detector is triggered after the period detector detects an attack and raises an alert; it then identifies exactly which flows are abnormal. Figure 9 shows the detection performance for the three common types of EDoS attacks of four solutions: R-EDoS [15], SVM-Abbasi et al. [13], NN-Abbasi et al. [13], and our proposed scheme. The average results are presented in Table 8. In terms of detection rate, accuracy, and F1-score, our proposed model achieves the highest values (98.139%, 98.163%, and 98.163%, respectively), which are slightly higher than those of the R-EDoS system and clearly better than those of the two solutions (SVM and NN) in [13]. With respect to the false alarm rate, the proposed system produces the fewest false alarms, at only 1.837%, whereas the false alarm rate is 4.333% for R-EDoS and 13.779% and 21.113% for SVM and NN, respectively. R-EDoS and our approach achieve much better results than the SVM and NN in [13] because these two approaches learn from both the current and historical information of attackers, whereas the SVM and NN only observe and learn the attacker’s information in the current time period. For example, when a normal TCP session sends its very first packets to the victim, the short observation time makes the collected TCP flow look like a TCP SYN flooding flow. Consequently, the system in [13] will consider it an abnormal flow. This indicates that recurrent neural network-based algorithms outperform other machine learning-based algorithms in EDoS attack detection.

Figure 10 and Figure 11 show the detection time and response time of the flow detector compared with the solutions in [13,15], respectively. The detection time and response time are two crucial metrics in evaluating an EDoS detection mechanism [18]. The detection time measures how quickly a new attack is detected. We simulated a SYN flooding attack with an attack rate of 7000 requests per second and calculated the detection time for each classified flow. The results in Figure 10 show that our flow detector outperforms the three other solutions in terms of detection time. In particular, our model requires only 0.54 s to detect each flow because our LSTM model uses an input sequence length of only 5 instead of the 100 that R-EDoS uses for its GRU-based model. This shortens the calculation time of our proposed model compared with the GRU-based model in [15], and it also allows our model to outperform the three other methods in terms of response time. The response time is the interval between the moment a cloud user makes a request and the moment the corresponding response is sent back to the user. Overall, the response times of our proposed model (varying within 21–29 ms) and R-EDoS, which both use variants of the RNN algorithm, are similar and shorter than those of the SVM-based and NN-based models [13]. Our proposed model, however, is better than R-EDoS in terms of response time (varying from 21–25 ms) because of its shorter input sequence length.
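The effect of the sequence length on calculation time can be estimated with a rough operation count. The sketch below counts the multiply-accumulate operations of a single standard LSTM layer, which processes the sequence one step at a time, so the cost grows linearly with the sequence length. The layer sizes (10 inputs, 80 hidden units) come from Table 5; the formula ignores biases and element-wise operations, and it compares the same LSTM layer at two lengths rather than modeling the GRU used by R-EDoS, whose layer sizes are not given here.

```python
def lstm_macs_per_sequence(seq_len: int, input_dim: int, hidden_units: int) -> int:
    """Approximate multiply-accumulates for one LSTM layer over one input sequence."""
    # Four weight blocks (input, forget, and output gates plus the cell candidate),
    # each multiplying the concatenated [x_t, h_(t-1)] vector at every time step.
    per_step = 4 * hidden_units * (input_dim + hidden_units)
    return seq_len * per_step


short = lstm_macs_per_sequence(seq_len=5, input_dim=10, hidden_units=80)
long = lstm_macs_per_sequence(seq_len=100, input_dim=10, hidden_units=80)
ratio = long // short
```

For these sizes, a sequence of 5 needs 144,000 multiply-accumulates and a sequence of 100 needs 2,880,000, a factor of 20. Measured detection times also include feature extraction and framework overhead, so they need not scale by exactly this factor.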

6.2. Resources Consumption

In this section, we evaluate the efficiency of our proposed mechanism in terms of CPU usage and memory usage, which are two representative resource consumption metrics. Figure 12 and Figure 13 show the CPU and memory utilization, respectively. These results were obtained when we simulated EDoS attacks with request rates of 1000–7000 requests per second during a 100 s period. The CPU usage is that of the physical machine when extracting the features and detecting each flow. Because the data sequence length is only five, our model has the lowest CPU usage (32% to 36.3%) compared with SVM-Abbasi et al. (48% to 50.2%), NN-Abbasi et al. (43% to 48.2%) [13], and R-EDoS [15] (40% to 43.2%). With respect to memory utilization, our proposed scheme consumes only 2.8% to 3.5%, whereas the consumption of the GRU-based R-EDoS method is approximately 4% higher than that of our two-phase model. Overall, our proposed model consumes fewer resources than the SVM-Abbasi et al., NN-Abbasi et al., and GRU-based R-EDoS models.

6.3. Performance Analysis at the Victim Server

In this section, we present the experimental results for the CPU usage of a victim server protected by our proposed EDoS detection mechanism, in order to evaluate the efficiency of our proposed model at the service provider’s end. We launched a TCP-SYN flooding attack with an attack rate of 7000 requests per second and a normal traffic rate of 400 requests per second during a 1-h period.

During the first 5 min, we launched only normal traffic, after which we launched the TCP-SYN flooding attack. Figure 14 shows the victim’s CPU usage with and without the protection of our proposed EDoS detection model. Without protection, the victim server has to handle a large number of abnormal requests, which makes the server consume a large amount of resources. When the TCP-SYN flooding attack is launched (at minute five), the CPU usage increases from 40% to over 50% and reaches 100% after 50 min. Most cloud-based systems set an upper CPU usage threshold of 80% to trigger the allocation of another virtual machine. Therefore, without the protection of our proposed model, the cloud system will need a second virtual machine 30 min after the attack starts, increasing the cost the user must pay to the cloud service provider.

However, when protected by our proposed model, the victim server has an average CPU usage of only approximately 40% during the 1-h period. This means that the cloud user does not need to pay for a second virtual machine while under EDoS attacks.
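The scale-out reasoning in this section can be sketched as a simple threshold check over per-minute CPU readings. Only the 80% threshold, the 5 min of normal traffic, and the roughly 40% baseline come from the text. The unprotected trace generated below is made up for illustration (a jump to 50% followed by a rise of one percentage point per minute); it is not the measured curve of Figure 14, and all names are hypothetical.

```python
from typing import List, Optional, Sequence

SCALE_OUT_THRESHOLD = 80.0  # percent CPU usage that triggers another virtual machine
ATTACK_START_MINUTE = 5     # the attack begins after 5 min of normal traffic


def first_scale_out_minute(
    cpu_by_minute: Sequence[float], threshold: float = SCALE_OUT_THRESHOLD
) -> Optional[int]:
    """Return the first minute at which CPU usage reaches the threshold, or None."""
    for minute, cpu in enumerate(cpu_by_minute):
        if cpu >= threshold:
            return minute
    return None


def illustrative_unprotected_trace(minutes: int = 60) -> List[float]:
    """Made-up CPU trace, NOT measured data: 40% before the attack, then a linear rise."""
    trace = []
    for minute in range(minutes):
        if minute < ATTACK_START_MINUTE:
            trace.append(40.0)
        else:
            trace.append(min(100.0, 50.0 + (minute - ATTACK_START_MINUTE)))
    return trace


unprotected = illustrative_unprotected_trace()
protected = [40.0] * 60  # flat trace near the average reported for the protected server
```

With this made-up trace, the threshold is reached at minute 35, i.e., 30 min after the attack starts, whereas the flat protected trace never triggers a second virtual machine.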

7. Discussion

Based on the comprehensive results given above, we summarize some of the outstanding points demonstrating the effectiveness of the two-phase deep learning-based EDoS detection system in detecting EDoS attacks conducted on our practical testbed:

Our proposed model achieves a high detection rate, accuracy, and F1-score, and a low false alarm rate, outperforming the other state-of-the-art solutions considered. More specifically, the proposed two-phase deep learning-based EDoS detection system achieves 98.163%, 98.139%, 98.163%, and 1.837% for accuracy, detection rate, F1-score, and false alarm rate, respectively (Table 8). These values are much better than those of the SVM-based (86.220%, 85.392%, 86.027%, 13.779%) and neural network-based (78.887%, 77.901%, 78.859%, 21.113%) models in [13]. Compared with the R-EDoS system, which uses a GRU (another variant of the RNN), the proposed model also performs better in terms of accuracy, detection rate, F1-score, and false alarm rate.
Our model can defend against three common types of EDoS attack, i.e., HTTP flooding, TCP-SYN flooding, and ICMP flooding attacks.
Our scheme can detect and mitigate each abnormal flow in network traffic within a very short period of time, i.e., 0.54 s. This result clearly outperforms the other existing approaches, especially R-EDoS [15], which uses another variant of the RNN and achieves fairly high accuracy in EDoS detection. The quick detection time makes the response time of the entire system much lower than that of the other existing solutions despite the high volume of network traffic.
Our proposed EDoS detection model consumes few CPU (32% to 36.3%) and memory (2.8% to 3.5%) resources.
With our defense system, a cloud user can avoid being forced to pay the unexpected costs caused by EDoS attacks.
In other words, our two-phase deep learning-based EDoS defense system is well suited to protecting cloud systems from various EDoS attacks and provides better service quality for the protected cloud services.
Although our proposed system handles EDoS attacks efficiently, it does have a limitation. The accuracy of the period detector, i.e., the ANN model, is not very high (from 95.723% to 96.439%, Table 7). Some recent advanced image processing algorithms proposed in [36,37] could be considered for period detection. Both are CNN-based models that have been improved to reduce computation and enhance detection speed.

8. Conclusions

In this study, we propose a novel mechanism to handle EDoS attacks in each flow coming into a cloud system. This approach not only protects cloud users from paying excess charges caused by various EDoS attacks, such as TCP-HTTP flooding, ICMP flooding, and TCP SYN flooding, but also helps cloud service providers improve their service quality. We present a two-phase deep learning-based detector for EDoS detection that combines the advantages of the ANN and LSTM algorithms. By using the period detector, i.e., the ANN model, to detect when an attack occurs, we can take advantage of the accuracy of the LSTM while eliminating the greatest disadvantage of this algorithm, i.e., the long sequence length of the input data. Finally, our mechanism can be applied to different network systems and adapts well because the deep learning-based detectors are replaced periodically using an updated training database. The evaluation described in Section 6 shows that our proposed mechanism is highly efficient in terms of accuracy, detection time, response time, and resource consumption.

In future work, we expect to improve the mechanism by using an SDN-based model to enhance the process of mitigating attacks and collecting network features. Moreover, as discussed in Section 7, we will improve the performance of the period detector by considering the two models mentioned in [36,37]. In addition, we plan to compare the proposed scheme with other EDoS defense systems using more evaluation criteria.

Author Contributions

C.-N.N. proposed the idea, conducted the experiments, performed the analysis, and wrote the manuscript. M.P. provided the guidance for data analysis and paper writing. All authors have read and agreed to the published version of the manuscript.

Funding

This research was supported by the National Research Foundation of Korea (NRF) grant funded by the Korean government (MSIT) (No. 2020R1F1A1076795).

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

Not applicable.

Conflicts of Interest

The authors declare no conflict of interest.

References

1. Chowdhury, F.Z.; Kiah, L.B.M.; Ahsan, M.A.M.; Idris, M.Y.I.B. Economic denial of sustainability (EDoS) mitigation approaches in cloud: Analysis and open challenges. In Proceedings of the International Conference on Electrical Engineering and Computer Science (ICECOS), Palembang, Indonesia, 22–23 August 2017; pp. 206–211.
2. Shawahna, A.; Abu-Amara, M.; Mahmoud, A.S.H.; Osais, Y. EDoS-ADS: An Enhanced Mitigation Technique Against Economic Denial of Sustainability (EDoS) Attacks. IEEE Trans. Cloud Comput. 2020, 8, 790–804.
3. Thaper, R.; Verma, A. Adaptive Pattern Attack Recognition technique (APART) against EDoS attacks in Cloud Computing. In Proceedings of the 2015 Third International Conference on Image Information Processing (ICIIP), Waknaghat, India, 21–24 December 2015; pp. 31–34.
4. Khor, S.H.; Nakao, A. spow: On-demand cloud-based edos mitigation mechanism. In Proceedings of the 5th Workshop on Hot Topics in System Dependability, Lisbon, Portugal, 29 June 2009; pp. 1–6.
5. Sqalli, M.H.; Al-Haidari, F.; Salah, K. EDoS-Shield—A Two-Steps Mitigation Technique against EDoS attacks in Cloud Computing. In Proceedings of the 4th IEEE International Conference on Utility and Cloud Computing, Melbourne, VIC, Australia, 5–8 December 2011; pp. 49–56.
6. Al-Haidari, F.; Sqalli, M.H.; Salah, K. Enhanced EDoS-Shield for Mitigating EDoS attacks Originating from Spoofed IP Addresses. In Proceedings of the IEEE 11th International Conference on Trust, Security and Privacy in Computing and Communications, Liverpool, UK, 25–27 June 2012; pp. 1167–1174.
7. Baig, Z.A.; Binbeshr, F. Controlled Virtual Resource Access to Mitigate Economic Denial of Sustainability (EDoS) Attacks against Cloud Infrastructures. In Proceedings of the International Conference on Cloud Computing and Big Data, Fuzhou, China, 16–19 December 2013; pp. 346–353.
8. Kumar, M.N.; Sujatha, P.; Kalva, V.; Nagori, R.; Katukojwala, A.K.; Kumar, M. Mitigating Economic Denial of Sustainability (EDoS) in Cloud Computing Using In-cloud Scrubber Service. In Proceedings of the 4th International Conference on Computational Intelligence and Communication Networks, Mathura, India, 3–5 November 2012; pp. 535–539.
9. Masood, M.; Anwar, Z.; Raza, S.A.; Hur, M.A. EDoS Armor: A cost effective economic denial of sustainability attack mitigation framework for e-commerce applications in cloud environments. In Proceedings of the INMIC, Lahore, Pakistan, 19–20 December 2013; pp. 37–42.
10. Chowdhury, F.Z.; Idris, M.Y.I.; Kiah, M.L.M.; Ahsan, M.A.M. EDoS eye: A game theoretic approach to mitigate economic denial of sustainability attack in cloud computing. In Proceedings of the IEEE 8th Control and System Graduate Research Colloquium (ICSGRC), Shah Alam, Malaysia, 4–5 August 2017; pp. 164–169.
11. Singh, P.; Rehman, S.U.; Manickam, S. Comparative Analysis of State-of-the-Art EDoS Mitigation Techniques in Cloud Computing Environment. arXiv 2019, arXiv:1905.13447.
12. Agrawal, N.; Tapaswi, S. A proactive defense method for the stealthy EDoS attacks in a cloud environment. Int. J. Netw. Manag. 2020, 30, e2094.
13. Abbasi, H.; Ezzati-Jivan, N.; Bellaiche, M.; Talhi, C.; Dagenais, M.R. Machine Learning-Based EDoS attack Detection Technique Using Execution Trace Analysis. J. Hardw. Syst. Secur. 2019, 3, 164–176.
14. Dinh, P.T.; Park, M. Dynamic Economic-Denial-of-Sustainability (EDoS) Detection in SDN-based Cloud. In Proceedings of the 5th International Conference on Fog and Mobile Edge Computing (FMEC), Paris, France, 20–23 April 2020; pp. 62–69.
15. Dinh, P.T.; Park, M. R-EDoS: Robust Economic Denial of Sustainability Detection in an SDN-Based Cloud Through Stochastic Recurrent Neural Network. IEEE Access 2021, 9, 35057–35074.
16. Monge, M.A.S.; Vidal, J.M.; Villalba, L.J.G. Entropy-Based Economic Denial of Sustainability Detection. Entropy 2017, 19, 649.
17. Bhingarkar, S.; Shah, D. FLNL: Fuzzy entropy and lion neural learner for EDoS attack mitigation in cloud computing. Int. J. Model. Simul. Sci. Comput. 2018, 9, 1850049.
18. Agrawal, N.; Tapaswi, S. Defense Mechanisms Against DDoS Attacks in a Cloud Computing Environment: State-of-the-Art and Research Challenges. IEEE Commun. Surv. Tutor. 2019, 21, 3769–3795.
19. Zhijun, W.; Wenjing, L.; Liang, L.; Meng, Y. Low-Rate DoS Attacks, Detection, Defense, and Challenges: A Survey. IEEE Access 2020, 8, 43920–43943.
20. Phan, T.V.; Park, M. Efficient Distributed Denial-of-Service Attack Defense in SDN-Based Cloud. IEEE Access 2019, 7, 18701–18714.
21. Bremler-Barr, A.; Brosh, E.; Sides, M. DDoS attack on cloud auto-scaling mechanisms. In Proceedings of the IEEE INFOCOM 2017, Atlanta, GA, USA, 1–4 May 2017; pp. 1–9.
22. Wang, G.; Hao, J.; Ma, J.; Huang, L. A new approach to intrusion detection using Artificial Neural Networks and fuzzy clustering. Expert Syst. Appl. 2010, 37, 6225–6232.
23. Malhotra, P.; Vig, L.; Shroff, G.; Agarwal, P. Long short term memory networks for anomaly detection in time series. In Proceedings of the 23rd European Symposium on Artificial Neural Networks, Computational Intelligence and Machine Learning, Bruges, Belgium, 22–24 April 2015; pp. 89–94.
24. Banerjee, U.; Vashishtha, A.; Saxena, M. Evaluation of the Capabilities of WireShark as a tool for Intrusion Detection. Int. J. Comput. Appl. 2010, 6, 1–5.
25. Gopika, N.; Kowshalaya, A.M.M.E. Correlation Based Feature Selection Algorithm for Machine Learning. In Proceedings of the 3rd International Conference on Communication and Electronics Systems (ICCES), Coimbatore, India, 15–16 October 2018; pp. 692–695.
26. Gupta, A.; Thakur, H.K.; Shrivastava, R.; Kumar, P.; Nag, S. A Big Data Analysis Framework Using Apache Spark and Deep Learning. In Proceedings of the IEEE International Conference on Data Mining Workshops (ICDMW), New Orleans, LA, USA, 18–21 November 2017; pp. 9–16.
27. Bonesi. The DDoS Botnet Simulator. Available online: https://github.com/Markus-Go/bonesi (accessed on 2 December 2018).
28. Al-Haidari, F.; Sqalli, M.; Salah, K. Evaluation of the Impact of EDoS attacks Against Cloud Computing Services. Arab. J. Sci. Eng. 2015, 40, 773–785.
29. Moustafa, N.; Slay, J. UNSW-NB15: A comprehensive data set for network intrusion detection systems (UNSW-NB15 network data set). In Proceedings of the Military Communications and Information Systems Conference (MilCIS), Canberra, ACT, Australia, 10–12 November 2015; pp. 1–6.
30. Moustafa, N.; Slay, J. The evaluation of Network Anomaly Detection Systems: Statistical analysis of the UNSW-NB15 data set and the comparison with the KDD99 data set. Inf. Secur. J. Glob. Perspect. 2016, 25, 18–31.
31. PacketSender Tool. Available online: https://packetsender.com/ (accessed on 27 October 2015).
32. Saravanan, S. DDoS Intro-Journey to Bonesi Tool. Available online: https://itnext.io/ddos-intro-journey-to-bonesi-tool-8b43d53228ac (accessed on 10 July 2020).
33. Wang, B.; Lu, K.; Chang, P. Design and implementation of Linux firewall based on the frame of Netfilter/IPtable. In Proceedings of the 11th International Conference on Computer Science & Education (ICCSE), Nagoya, Japan, 23–25 August 2016; pp. 949–953.
34. Loganayagi, B.; Sujatha, S. Creating virtual platform for cloud computing. In Proceedings of the IEEE International Conference on Computational Intelligence and Computing Research, Coimbatore, India, 28–29 December 2010; pp. 1–4.
35. Keras. Available online: https://keras.io/ (accessed on 17 June 2015).
36. Yu, J.; Zhang, W. Face Mask Wearing Detection Algorithm Based on Improved YOLO-v4. Sensors 2021, 21, 3263.
37. Roy, A.M.; Bhaduri, J. A Deep Learning Enabled Multi-Class Plant Disease Detection Model Based on Computer Vision. AI 2021, 2, 413–428.

Figure 1. EDoS effect.

Figure 2. LSTM architecture.

Figure 3. Multivariate time series snippet visualization with an anomalous region highlighted.

Figure 4. Conceptual architecture of the proposed model.

Figure 5. Detailed architecture of the proposed model.

Figure 6. The proposed LSTM model.

Figure 7. Experimental topology.

Figure 8. Detection performance of the period detector in different kinds of EDoS attacks.

Figure 9. Detection performance comparison of the flow detector in different kinds of EDoS attacks. (a) Accuracy. (b) Detection Rate. (c) F1 score. (d) False alarm rate.

Figure 10. Detection time among four solutions.

Figure 11. Response time among four solutions.

Figure 12. Compute usage according to attack rate.

Figure 13. Memory usage according to attack rate.

Figure 14. CPU usage of the victim server under control of the proposed model.

Table 1. Common types of EDoS attack.

Bandwidth Depletion Attacks	Resources Depletion Attacks
ICMP Flooding attack	TCP SYN Flooding attack
UDP Flooding attack	PUSH+ACK attack
Database API request attack	Low and Slow rate attack
Fraggle attack	TCP-HTTP Flooding attack
Smurf attack
Yo-Yo attack

Table 2. Key Features for ANN model.

Feature	Description
avg_length	The average length of a flow
pcf	Number of correlative flows
odgs	One-direction flow generation speed
nb_incoming_pkts	Number of incoming packets
nb_outgoing_pkts	Number of outgoing packets
nb_bytes_flow	Average number of bytes per flow
flow_duration	Duration of a flow
cpu_load	CPU usage of victim server
memory_usage	Memory usage of victim server
bandwidth	Network bandwidth

Table 3. ANN model parameters.

Parameter	Value
Node number of input layer	10
Number of hidden layers	3
Node number of hidden layer 1	13
Node number of hidden layer 2	15
Node number of hidden layer 3	18
Node number of output layer	1
Activation function of hidden layer	ReLU
Activation function of output layer	Sigmoid
Optimizer	Adam
Loss function	Mean square error

Table 4. Key Features for LSTM model.

Feature	Description
no_bytes	Number of bytes per flow
no_pkts	Number of packets per flow
duration	Flow duration
time_each_pkt	Time between two consecutive packets
nb_ingoing_pkts	Number of ingoing packets of a flow
nb_outgoing_pkts	Number of outgoing packets of a flow
fw_ttl	Source to destination time to live value
bw_ttl	Destination to source time to live value
fw_intpkt	Source interpacket arrival time
bw_intpkt	Destination interpacket arrival time

Table 5. LSTM model parameters.

Parameter	Value
Node number of input layer	10
Number of hidden layers	1
Node number of hidden layer	80
Sequence length	5
Dropout	0.2
Node number of output layer	1
Activation function of hidden layer	ReLU
Activation function of output layer	Sigmoid
Optimizer	Adam
Loss function	Mean square error

Table 6. EDoS attack simulation with different numbers of requests.

Scenario	Number of Requests (per Second)
S-1	1000
S-2	2000
S-3	3000
S-4	4000
S-5	5000
S-6	6000
S-7	7000

Table 7. Detection performance of the period detector in different kinds of EDoS attacks.

	Accuracy	Detection Rate	False Alarm Rate	F1
ICMP flooding attack	95.896%	94.208%	4.104%	95.831%
TCP-SYN flooding attack	96.439%	96.888%	3.561%	96.482%
TCP-HTTP flooding attack	95.723%	94.231%	4.203%	95.681%

Table 8. Abnormal flow detection comparison on average between our proposed approach and other solutions in detail.

	Accuracy	Detection Rate	False Alarm Rate	F1
R-EDoS [15]	95.350%	96.137%	4.333%	95.571%
SVM-Abbasi et al. [13]	86.220%	85.392%	13.779%	86.027%
NN-Abbasi et al. [13]	78.887%	77.901%	21.113%	78.859%
Our proposed scheme	98.163%	98.139%	1.837%	98.163%

Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

© 2021 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

Share and Cite

MDPI and ACS Style

Nhu, C.-N.; Park, M. Two-Phase Deep Learning-Based EDoS Detection System. Appl. Sci. 2021, 11, 10249. https://doi.org/10.3390/app112110249

