1. Introduction
In network monitoring and attack detection applications, the collection and processing of data packets passing through the network is a big challenge because the number of transferred packets is huge, especially on high-speed connections. In addition, apart from the processing of information items in packet headers, many applications also need to process the packet payloads. This causes significant delays in the processing of data packets and can result in network traffic congestion. Therefore, it is necessary to select suitable packet information items and appropriate processing methods to speed up the processing of data packets.
In an IP (Internet Protocol) packet header, the source IP and destination IP addresses are the key fields for transferring the packet from the source host to the target host. In DDoS (Distributed Denial of Service) attacks, a huge number of packets carrying fake requests are sent to the target host to exhaust system resources or to flood the network connection. Since the destination IP of these packets is that of the attacked host, a DDoS attack can be detected at an early stage by monitoring high-frequency destination IPs at the target network router. Similarly, monitoring can be done at ISP (Internet Service Provider) routers to track the source IPs of hosts that originate a large number of packets. These IPs may belong to hosts infected with network worms that are scanning the network for their next targets. IP addresses with a high occurrence frequency in the IP packet stream are called Hot-IPs. Therefore, the problem of detecting DDoS attack targets or network worm sources can be solved by monitoring the IP packet stream transferred through the network to find Hot-IPs [1,2,3].
On a given network connection, an IP packet stream is a sequence of IP packets that can be represented as $S = \{a_1, a_2, \ldots, a_M\}$, where there are $M$ packets with $N$ unique IP addresses. Let $f_i$ be the occurrence frequency of packets with IP address $s_i$ in $S$, so that $f_i = |\{\, j : a_j \text{ has IP address } s_i \,\}|$, where $1 \le i \le N$ and $1 \le j \le M$. We also have $f_1 + f_2 + \cdots + f_N = M$ and, given the occurrence frequency threshold $\phi$, the set of Hot-IPs is $\{\, s_i \mid f_i \ge \phi \,\}$ [1,2].
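The definition above can be illustrated with a minimal exact-counting sketch. It keeps one counter per unique address, so it is only a baseline for small streams and not one of the streaming algorithms discussed in this paper. The function name `exact_hot_ips` and the sample addresses are hypothetical.

```python
from collections import Counter

def exact_hot_ips(stream, phi):
    """Return {ip: f_i} for every address whose frequency f_i is >= phi."""
    freq = Counter(stream)  # f_i for each unique address s_i
    # Sanity check from the definition: f_1 + ... + f_N = M.
    assert sum(freq.values()) == len(stream)
    return {ip: f for ip, f in freq.items() if f >= phi}

# Hypothetical stream of destination addresses (M = 6, N = 3).
packets = ["10.0.0.5", "10.0.0.7", "10.0.0.5",
           "10.0.0.9", "10.0.0.5", "10.0.0.7"]
hot = exact_hot_ips(packets, phi=3)
```

With the threshold set to 3, only `10.0.0.5` qualifies as a Hot-IP in this sample. Exact counting needs memory proportional to $N$, which is what motivates the approximate methods for high-speed links.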
The problem of finding Hot-IPs in the IP packet stream can be solved using algorithms for finding elements with a high occurrence frequency in a data stream. There are a number of such algorithms, including Majority, Frequent, Lossy Counting, Space Saving, Count-Sketch, Count-Min and Group Testing [4,5,6,7,8,9,10,11,12]. Section 2 of this paper briefly discusses these algorithms.
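As an illustration of the counter-based family, the following is a minimal sketch of the Frequent algorithm in its commonly described form (also known as Misra-Gries). It keeps at most $k-1$ counters and guarantees that every item occurring more than $M/k$ times survives in the counter table. The function name and the sample stream are hypothetical, and the stored counts are lower bounds rather than exact frequencies.

```python
def frequent(stream, k):
    """Frequent (Misra-Gries) summary using at most k - 1 counters."""
    counters = {}
    for item in stream:
        if item in counters:
            counters[item] += 1
        elif len(counters) < k - 1:
            counters[item] = 1
        else:
            # No free counter: decrement all, drop those that reach zero.
            for key in list(counters):
                counters[key] -= 1
                if counters[key] == 0:
                    del counters[key]
    return counters

# Hypothetical stream: one address takes 6 of the M = 10 packets.
stream = ["10.0.0.5"] * 6 + ["10.0.0.1", "10.0.0.2", "10.0.0.3", "10.0.0.4"]
candidates = frequent(stream, k=2)  # keeps items with frequency > M/2
```

The surviving keys are candidates only. A second pass over the stream, or a comparison against the threshold with a known error bound, is needed to confirm that a candidate is really a Hot-IP.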
Huynh et al. [1,2] proposed using the non-adaptive Group Testing method to find Hot-IPs quickly in the IP packet stream and applying the results to the detection of DDoS attacks and network worm spreading sources. Since the computational complexity of the Group Testing method ($O(tN)$, where $N$ is the number of unique IP addresses and $t$ is the number of tests) is relatively high, it is not efficient for processing the IP packet stream under heavy traffic [1]. To address this issue, they use the Reed-Solomon code concatenation method to construct the $d$-disjunct matrix $m$ [1,2]. As a result, the storage space of matrix $m$ is significantly reduced, and the computational complexity is also reduced and is polynomial in time.
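The idea can be sketched as follows, under several assumptions. The matrix is built in the textbook Kautz-Singleton style (a Reed-Solomon code over GF($q$), with $q$ prime, concatenated with the identity code), which may differ in detail from the construction in [1,2]. Items are assumed to be already mapped to integer indices $0, \ldots, q^k - 1$. The names `rs_column` and `find_hot_items` are hypothetical. Columns are generated on demand instead of being stored, which is where the space saving comes from in this sketch.

```python
def rs_column(j, q, k):
    """Rows equal to 1 in column j of the t x N matrix (t = q*q, N = q**k)."""
    # Item index j -> coefficients of a polynomial of degree < k over GF(q).
    coeffs = [(j // q**e) % q for e in range(k)]
    rows = []
    for x in range(q):  # evaluate the polynomial at every field element
        y = sum(c * pow(x, e, q) for e, c in enumerate(coeffs)) % q
        rows.append(x * q + y)  # identity code: symbol y at position x
    return rows

def find_hot_items(stream, q, k, phi):
    """One counter per test; an item is reported if all its tests reach phi."""
    counters = [0] * (q * q)
    for item in stream:
        for r in rs_column(item, q, k):
            counters[r] += 1
    return [j for j in range(q**k)
            if all(counters[r] >= phi for r in rs_column(j, q, k))]

# Hypothetical stream: items 7 and 99 are heavy, every item appears once more.
stream = [7] * 50 + [99] * 40 + list(range(125))
hot = find_hot_items(stream, q=5, k=3, phi=30)
```

With $q = 5$ and $k = 3$, two distinct columns share at most $k - 1 = 2$ rows out of 5, so the matrix is 2-disjunct and 25 counters cover 125 items. The naive decoding loop shown here still visits every item, so it reflects the $O(tN)$ behaviour mentioned above rather than a faster decoder.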
In this paper, we evaluate methods to find high-occurrence-frequency elements in the data stream and apply them to find Hot-IPs in the IP packet stream passing through the network. Based on the computational and space complexity evaluation results of each method, the most efficient method is selected for use in the target detection model of DDoS attacks.
The rest of the paper is organized as follows: Section 2 reviews and compares methods for finding high-occurrence-frequency elements in the data stream and applies them to finding Hot-IPs in the IP packet stream. Section 3 presents the proposed target detection model of DDoS attacks, and Section 4 concludes the paper.
3. Proposed Target Detection Model of DDoS Attacks
DDoS (Distributed Denial of Service) is a type of attack that overloads a computer or network system so that it can no longer provide its service or has to stop working. While a DoS attack usually originates from one source or a small number of sources, a DDoS attack originates from a large number of sources distributed all over the Internet. In real DDoS attacks, network service servers are "flooded" by a huge number of requests sent from controlled hosts (also called zombies or bots) distributed across the networks [14]. When the number of requests is too large, the server is overloaded and fails to handle incoming requests. Consequently, legitimate users are not able to access the service provided by the servers.
Figure 1 illustrates a typical architecture of DDoS attacks.
A number of measures to defend against DDoS attacks have been proposed over the last decade. However, no solution so far is capable of preventing DDoS attacks comprehensively and effectively, owing to the complexity, scale and highly distributed nature of these attacks [14,15,16,17]. DDoS defense measures can be classified into three groups: (1) measures based on deployment location; (2) measures based on network protocols; and (3) measures based on the time of action. Group (1) consists of measures that are deployed at the sources or targets of DDoS attacks. Group (2) includes measures that defend against DDoS attacks at the IP, TCP/UDP (Transmission Control Protocol/User Datagram Protocol), or application layers. Based on the time of action, group (3) includes pre-attack, in-attack and post-attack measures [14,15,16,17].
Based on the analysis of DDoS architectures and the Hot-IP finding results presented in Section 2, we propose a DDoS target (victim) detection model based on Hot-IP finding, as shown in Figure 2. The IP addresses of service servers are the destination IP addresses of the IP packets sent to these servers. Under a DDoS attack, these IP addresses usually have an extremely high occurrence frequency. Therefore, if a Hot-IP/DDoS detector is deployed on the target network router, the signs of a DDoS attack can be detected early. Our DDoS target detection model can be deployed at the target host or, preferably, at the router of the target network.
In the proposed model, the Hot-IP/DDoS detector deployed on the target network router is responsible for capturing and processing all IP packets sent to service servers. The detector applies the sliding window method to the IP packet stream to find destination IPs that have a high occurrence frequency. Count-Min is the method implemented in the detector for finding high-occurrence-frequency IPs. A threshold of occurrence frequency for Hot-IPs is determined in advance to identify the possibility of a DDoS attack. The threshold is determined based on the type of network services and the user access patterns when the system is in normal operation. Initial experimental results in a simulated environment show that the detector can quickly and correctly identify Hot-IPs and, on that basis, detect DDoS attacks on network service servers.
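A minimal sketch of such a detector is given below, assuming a count-based sliding window (the last `window` packets) and a Count-Min sketch whose counters are decremented when a packet leaves the window. The class names, the sketch dimensions, the hash choice (salted BLAKE2b from the Python standard library) and the sample addresses are all assumptions made for illustration and do not describe the implementation evaluated in this paper.

```python
import hashlib
from collections import deque

class CountMin:
    """Count-Min sketch: depth rows of width counters, one hash per row."""
    def __init__(self, width, depth):
        self.width, self.depth = width, depth
        self.table = [[0] * width for _ in range(depth)]

    def _cells(self, item):
        for row in range(self.depth):
            digest = hashlib.blake2b(item.encode(), digest_size=8,
                                     salt=bytes([row])).digest()
            yield row, int.from_bytes(digest, "big") % self.width

    def add(self, item, delta=1):
        for row, col in self._cells(item):
            self.table[row][col] += delta

    def estimate(self, item):
        # Never below the true count; collisions can only inflate it.
        return min(self.table[row][col] for row, col in self._cells(item))

class HotIPDetector:
    """Flags a destination IP once its estimate in the window reaches phi."""
    def __init__(self, window, phi, width=256, depth=4):
        self.recent = deque()
        self.window, self.phi = window, phi
        self.sketch = CountMin(width, depth)

    def observe(self, dst_ip):
        self.recent.append(dst_ip)
        self.sketch.add(dst_ip)
        if len(self.recent) > self.window:
            # Oldest packet slides out of the window.
            self.sketch.add(self.recent.popleft(), -1)
        return self.sketch.estimate(dst_ip) >= self.phi

# Hypothetical traffic: every second packet targets the same server.
detector = HotIPDetector(window=100, phi=20)
alerts = set()
for i in range(60):
    ip = "192.0.2.10" if i % 2 == 0 else f"198.51.100.{i}"
    if detector.observe(ip):
        alerts.add(ip)
```

Because Count-Min only overestimates, a destination whose true count in the window reaches the threshold is always flagged, while false alarms depend on the sketch width and depth. Keeping the window contents in a queue costs memory proportional to the window size, so a production detector might instead reset or rotate sketches per time interval.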
4. Conclusions
Finding Hot-IPs in the IP packet stream flowing through the network can be used to detect targets or victims of DDoS attacks, or spreading sources of network worms. It can also be used in applications that monitor the activities of network elements. This paper reviewed methods for finding high-occurrence-frequency elements in a data stream and applied them to finding Hot-IPs in the IP packet stream passing through the network. The results show that Count-Min gives the best overall performance thanks to its low computational and space complexity and its fast processing speed. We also proposed a Hot-IP finding-based model for early target/victim detection of DDoS attacks, which can be deployed on the router of the target network.
This research can be extended in the following directions: (i) completing the Hot-IP-based detection module and deploying it in a real environment; and (ii) optimizing the Hot-IP finding module using embedded processors to speed up the processing of IP packets so that high-bandwidth network connections can be monitored.