Real-Time Detection System for Data Exfiltration over DNS Tunneling Using Machine Learning

Orieb Abualghanam; Hadeel Alazzam; Basima Elshqeirat; Mohammad Qatawneh; Mohammed Amin Almaiah

doi:10.3390/electronics12061467

Abstract

The domain name system (DNS) plays a vital role in network services for name resolution. By default, this service is seldom blocked by security solutions. Thus, it has been exploited for security breaches using the DNS covert channel (tunnel). One of the most prominent current data leakage techniques is DNS tunneling, which uses DNS packets to exfiltrate sensitive and confidential data. Data protection against stealthy exfiltration attacks is critical for individuals and organizations. As a result, many security techniques have been proposed to address exfiltration attacks, ranging from security policies to security solutions such as firewalls and intrusion detection or prevention systems. In this paper, a hybrid DNS tunneling detection system is proposed based on the packet length and selected features of the network traffic. The proposed system builds on the results obtained from the testbed and the Tabu-PIO feature selection algorithm. The proposed system was evaluated using three distinct datasets. The experimental results show that the proposed hybrid approach achieved 98.3% accuracy and a 97.6% F-score on the DNS tunneling datasets, outperforming related techniques on the same datasets. Moreover, as the size of the data increases, adding the packet length check to the hybrid approach yields a better run-time than Tabu-PIO.

Keywords:

data leakage; DNS tunneling; DNS tunneling tools; M-PIO; UNSW-NB15

1. Introduction

The volume of new, sophisticated, and dangerous malware is continuously increasing, which may lead to cybercrimes that affect national security, individuals, and the economy. Financial losses due to cybercrime are estimated to reach USD 10.5 trillion by 2025 [1].

Securing networks, or guaranteeing the security of the information system at any organization, is considered a challenge [2] owing to the need to monitor and control the data traffic flowing in and out of a network’s edge [3]. Thus, deploying a security defense involves a trade-off between the cost and the complexity of the solution in terms of computation, communication, and storage [4].

Cybercrime aims either to damage an organization’s infrastructure, to gain unauthorized access, or to leak confidential data. The challenge is that the attackers change their behavior for attempting an attack every day. Thus, detecting a new attack based on a previous pattern or signature is futile. Nowadays, intelligent detection based on anomaly detection for any compromise is the most popular strategy [5,6].

Data exfiltration is one of the most common types of attacks on organizations or individuals. It is the process of retrieving, modifying, copying, or transferring sensitive information from a computer, a server, or another device without authorization [7].

Recently, statistics have shown that the number of data breaches using DNS tunneling has significantly increased. Many approaches have been proposed by researchers to address the DNS tunneling problem [8]; however, the proposed DNS tunneling detection methods did not consider the method of selecting features based on statistical and/or behavioral features, which affects both the complexity of the system and the overall performance of the network.

In recent years, data exfiltration has been one of the uses for tunneling. Tunneling is the process of encapsulating the original packet inside another packet to cause data leakage. Traditional tunnels have been established based on network layer protocols, but, recently, they have been established via application layer protocols, such as the domain name system (DNS), secure shell (SSH), and the hypertext transfer protocol (HTTP) [9,10].

The DNS plays a critical role in the network by resolving domain names to their IP addresses; however, DNS traffic is comparatively weakly policed, and cybercriminals may exploit this to maintain a covert channel (tunnel) for leaking data from inside the network through an infected host [11].

DNS tunneling is a well-known cyber attack used for stealing sensitive information. The attacker encapsulates confidential information into DNS requests or responses to bypass intrusion detection, intrusion prevention, or other network security monitoring mechanisms [12].

The attacker can use many techniques to exploit DNS, one of which is to register a domain name (e.g., attacker.com) and install malware on a victim host so that the attacker can take control of the infected host. The attacker can then steal sensitive data, such as credit card numbers, personal passwords, or intellectual property information, in a DNS request of the form arbitrary-string.attacker.com [13].

Interestingly, the firewalls in most enterprises are configured to allow DNS packets. The attacker can control the infected host and use a crafted DNS request packet on UDP port 53 to establish a tunnel and steal sensitive data. The crafted DNS packet encapsulates the confidential information [14,15].

The default configuration of any security solution allows any packet carrying port 53. Port 53 is used for DNS, and it is always open to transfer DNS queries. Checking DNS packets with firewalls, intrusion detection/prevention systems, or any security solution will degrade the performance of the network significantly [16]. On the other hand, DNS packets are used by intruders to exfiltrate the data. So, there is a need for intelligent techniques that can cope with detecting the covert channel and, at the same time, keep the efficiency of the organization.

The difference between malicious and normal traffic can be easily detected or noticed, but wading inside the details of the normal packet to detect if there is a bypass for the security policies is a challenge.

1.1. Motivation

The motivation of this paper is to build on the results obtained from the testbed and the Tabu-PIO feature selection algorithm in order to propose a hybrid DNS tunneling detection system, based on the packet length and selected features of the network traffic, that improves the accuracy, F-score, and run-time of DNS tunneling detection.

1.2. Contributions

The main contributions of this paper are summarized as follows:

It focuses on designing a lightweight DNS tunneling detection system to detect data exfiltration.
It provides a clear idea about the name resolution system using the DNS protocol and DNS tunneling.
It presents a list of ranges recommended for the DNS packet length to detect if there is a DNS covert channel.
It proposes a hybrid DNS tunneling system based on the M-PIO and a specific DNS packet length range.

1.3. The Paper’s Organization

We organized our paper as follows. Section 2 provides background on the DNS protocol, DNS tunneling, and one-class classifier techniques. In Section 3, we present the previous work on DNS tunneling detection using machine learning. In Section 4, we describe the design and implementation of our DNS tunneling testbed environment. Section 5 presents the proposed hybrid DNS tunneling detection approach. Section 6 presents the experiments and evaluations for three different datasets. Finally, in Section 7, our conclusions and possibilities for future work are given.

2. Background

The background section presents a brief description of the DNS protocol and of DNS tunneling, which abuses the DNS protocol to leak sensitive data.

2.1. Introduction to DNS

A unique IP address is assigned to websites and internet-connected devices, including phones, tablets, computers, routers, etc. [17,18]. Computer systems can only understand numbers; therefore, an IP address is an ideal way to identify any website or internet-connected device [19]. For example, the IP address corresponding to the website name www.ju.edu.jo (accessed on 10 January 2022) is 87.236.232.79. As humans, we can easily remember the human-readable website name instead of its corresponding IP address. The main usage of the domain name system (DNS) is translating or mapping website names, e.g., ju.edu.jo, into IP addresses. For example, when the client enters the URL www.ju.edu.jo into the web browser, the IP address corresponding to this URL is obtained as follows:

Step 1: The URL www.ju.edu.jo is sent to the recursive server, also known as the recursive resolver. The recursive resolver looks up the IP address corresponding to the website name www.ju.edu.jo. If it is found in its local cache, the resolver then responds to the browser’s query and provides it with the corresponding IP address. The transaction performed in this step is shown by arrow 1, Figure 1.

Figure 1. DNS resolution scheme.
Step 2: If the IP address is not found in the local cache of the resolver server, the resolver sends the query for www.ju.edu.jo to one of the root name servers. The root server tells the resolver the correct top-level domain server to talk to. The transactions performed in this step are shown by arrows 2 and 3.
Step 3: There are many TLD servers: one for .gov domains, another for .edu domains, and so on. The function of TLD servers is to tell the resolver server where it can find the authoritative nameserver for the corresponding domain address. The transactions performed in this step are shown by arrows 4 and 5.
Step 4: The resolver server has finally come to the right authoritative server, which knows the IP address and tells the resolver. The resolver server then responds to the browser’s query, provides it with the corresponding IP address, and remembers the IP address of the website so that if any other client asks for the same website it has the answer. The transactions performed in this step are shown by arrows 6, 7, and 8.
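The four steps above can be sketched as a toy lookup. The zone tables, the server names, and the `resolve` function are illustrative assumptions; only the address of www.ju.edu.jo comes from the text, and no network traffic is involved.

```python
# Toy model of the resolution steps above; all tables are illustrative.
ROOT = {"jo": "tld-jo"}                                   # root: TLD -> TLD server
TLD = {"tld-jo": {"ju.edu.jo": "auth-ju"}}                # TLD server: zone -> authoritative server
AUTH = {"auth-ju": {"www.ju.edu.jo": "87.236.232.79"}}    # authoritative records


def resolve(name, cache):
    """Return (ip, path), where path lists the servers that were consulted."""
    if name in cache:                                     # Step 1: answer from the local cache
        return cache[name], ["cache"]
    path = ["root"]
    tld_server = ROOT[name.rsplit(".", 1)[-1]]            # Step 2: root refers to the TLD server
    path.append(tld_server)
    # Step 3: the TLD server refers to the authoritative server for the zone.
    zone = next(z for z in TLD[tld_server] if name.endswith(z))
    auth_server = TLD[tld_server][zone]
    path.append(auth_server)
    ip = AUTH[auth_server][name]                          # Step 4: authoritative answer
    cache[name] = ip                                      # the resolver remembers the answer
    return ip, path


cache = {}
first = resolve("www.ju.edu.jo", cache)
second = resolve("www.ju.edu.jo", cache)
print(first)
print(second)
```

The second call is answered from the cache, which is the behaviour described at the end of Step 4.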

The domain name system (DNS) uses both the UDP/53 and TCP/53 ports. UDP is normally used to exchange small messages, whereas DNS messages larger than 512 bytes traditionally use TCP. Both the TCP/53 and UDP/53 ports are typically open on firewalls, which may constitute a security vulnerability with regard to DNS tunneling [20].

The domain name has to be unique; thus, the DNS uses a hierarchical naming structure, and each name must follow rules governing its components and length. A domain name is composed of a sequence of letters, numbers, and hyphens, and its length has to be less than 255 characters [21]. Figure 2 presents the DNS message format with its different attributes.

Figure 2. DNS message format.

2.2. Introduction to DNS Tunneling

The DNS tunnel (covert channel) is considered one of the greatest threats challenging network security. It is typically used to send confidential information or transmit commands for remote control by malware, fresh Trojan horses, APT organizations, etc. [22]. Detecting DNS tunneling is considered an issue; security equipment is typically installed at the edges of networks to look for DNS tunnels.

Figure 3 presents recursive DNS resolution when the tunnel is established. Assuming that the infected host (victim) first wants to access the domain name “attacker.com” (accessed on 13 April 2022), and the local DNS server (Resolver Name server) has no entry for this domain in its cache, the following process is carried out:

Figure 3. Recursive domain name resolution using DNS tunneling.

The victim crafts a DNS packet that encapsulates the confidential data.
The victim sends a DNS query to its configured DNS resolver to ask for the IP address (the “A” record in Figure 2) of the domain name “attacker.com”.
When the DNS resolver receives the request, it chooses one of the thirteen root servers and forwards the request to it.
The root DNS server does not know the IP address of “attacker.com”, so it refers the resolver to the matching top-level domain (TLD) server, which is “.com”.
After receiving the reply from the root server, the local DNS server sends the query directly to the corresponding TLD server for “.com”.
Once the TLD server receives the request from the DNS resolver, it replies with the IP address of the authoritative DNS server responsible for attacker.com.
When the DNS resolver receives the IP address of the authoritative DNS server, it forwards the query to that server, which the attacker controls, so the encapsulated data reach the attacker’s server directly.
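To illustrate why a tunneled query tends to be longer than an ordinary one, the sketch below builds a query name of the form arbitrary-string.attacker.com from a payload. The Base32 encoding and the `to_query_name` helper are assumptions made for illustration; the tunneling tools used later in the paper may encode data differently. Nothing is sent over the network.

```python
import base64

MAX_LABEL = 63  # DNS limits each label to 63 octets


def to_query_name(payload: bytes, domain: str) -> str:
    """Encode payload as lowercase Base32 labels prepended to domain."""
    encoded = base64.b32encode(payload).decode("ascii").rstrip("=").lower()
    # Split the encoded string into labels that respect the per-label limit.
    labels = [encoded[i:i + MAX_LABEL] for i in range(0, len(encoded), MAX_LABEL)]
    return ".".join(labels + [domain])


normal = "www.attacker.com"
tunneled = to_query_name(b"confidential data", "attacker.com")
print(len(normal), normal)
print(len(tunneled), tunneled)
```

The encoded payload inflates the query name, which is one reason packet length is a useful first attribute for detection.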

2.3. One-Class Classifier Techniques

Several proposals have used machine learning techniques to detect or prevent incidents. Signature-based detection is an older technique that cannot cope with the rapid change in patterns, even for well-known attack types. One-class techniques, such as the isolation forest (iForest), the one-class support vector machine (OC-SVM), and the local outlier factor (LOF), are widely used in anomaly detection.

3. Related Works

DNS tunneling can be detected using two popular methods: traffic analysis and payload analysis [23]. In payload analysis, only a single DNS request is analyzed in terms of the DNS packet attributes, such as the number of bytes, the packet length, and the packet contents. In this method, the analysis aims to detect general rules or a signature for the packet. The second method aims to analyze the total traffic over a specific period of time. This method focuses on analyzing the volume (size) of DNS traffic or the total number of host names per domain [24]. The domain history is used when a traffic analysis is used as an indication of tunneling [25].
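The difference between the two methods can be sketched with a few simple features. The function names and the choice of features (name length, character entropy, distinct host names per domain) are illustrative assumptions, and the two-label domain split is a deliberately naive stand-in for a proper registered-domain lookup.

```python
import math
from collections import Counter, defaultdict


def shannon_entropy(text):
    """Shannon entropy of the characters in text, in bits per character."""
    counts = Counter(text)
    total = len(text)
    return -sum(c / total * math.log2(c / total) for c in counts.values())


def payload_features(query_name):
    """Payload analysis: features computed from a single DNS request."""
    return {"length": len(query_name), "entropy": shannon_entropy(query_name)}


def traffic_features(query_names):
    """Traffic analysis: distinct host names per domain over a window of requests."""
    hosts = defaultdict(set)
    for name in query_names:
        domain = ".".join(name.split(".")[-2:])   # naive guess at the domain
        hosts[domain].add(name)
    return {domain: len(names) for domain, names in hosts.items()}


window = ["www.ju.edu.jo", "a1b2c3.attacker.com", "d4e5f6.attacker.com",
          "g7h8i9.attacker.com"]
print(payload_features(window[1]))
print(traffic_features(window))
```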

A malware program that is meant to steal data must use a covert channel in the presence of security countermeasures. Today, malware developers frequently employ the DNS protocol as a covert conduit for this purpose [26,27].

A detection method for DNS tunneling and low-throughput data exfiltration has been proposed [28]. The proposed method has three phases: data collection, feature extraction, and anomaly detection. The evaluation of the proposed method was conducted using 47 million DNS requests per hour.

In [29], a method for DNS tunnel detection on Android is proposed, which is mainly based on the isolation forest. The proposed approach for mobile devices, called KRTunnel, achieved an accuracy of 98.1%.

Chen et al., in [30], proposed the use of the LSTM model for a DNS covert channel detection method, noting that their proposed method does not rely on feature engineering. To clarify it further, they use the FQDNs of DNS packets as the input in the first stage, and then, they implement an end-to-end detection approach using the LSTM model.

Liu et al., in [31], proposed a DNS tunnel detection mechanism based on behavior features. The authors used four DNS tunnel tools to generate the DNS records (i.e., OzymanDNS, dnscat2, dns2tcp, and iodine). The normal traffic was collected from the ISP DNS server, while the tunnel traffic was collected from the Intranet DNS server. The generated data contain four categories of features with 18 behavior features, including packet size, time interval, record type, and domain entropy. The proposed mechanism was deployed at the recursive DNS to identify the tunnel. The accuracy of the system reached 99.6%.

Bubnov, in [32], addressed the problem of DNS tunneling. The author used the feed-forward neural network with multi-labels, in which each label represents a DNS tunneling technique (i.e., tunnel dnscapy, tunnel dns2tcp, tunnel iodine, tunnel tuns, and plain). The dataset was collected from a peer-to-peer topology network with a client-created DNS tunnel and a DNS server. The proposed approach evaluation yielded 83% in terms of accuracy and 81% and 84% for recall and precision, respectively.

Lambion et al., in [33], developed machine learning classifiers for the detection of DNS tunneling attacks by using the random forest (RF) and convolutional neural network (CNN) structures. The results showed that the accuracy of the proposed detector using the RF and CNN is 96%.

Chowdhary, Bhowmik, and Rudra, in [34], built a DNS tunneling attack detector. The authors of this paper combined two methods to build a detector. The first method relies on the cache misses on the DNS server cache, while the second method uses machine learning to classify the DNS query. Several classifiers were used and evaluated in terms of accuracy, f-score, and time. The proposed detector was able to inform the user if there is any DNS tunneling attack in real time.

Altuncu et al. proposed a deep learning-based detector that detects and prevents tunneling attacks over the DNS in real time [35]. The results showed that the DNS tunneling in a system was detected with 99.9% accuracy and 99.8% precision, which means that the proposed system has a high success rate in preventing tunneling threats in DNS traffic.

Sabir et al. presented a review study that identifies and classifies machine learning approaches, evaluation datasets, and performance metrics using a systematic literature review method [9]. The authors conclude that the integration of behavior and data-driven mechanisms should be explored. Moreover, they conclude that there is a need to develop large-size and high-quality datasets.

Ishikura et al., in [12], proposed a DNS tunneling detection method based on the cache-property-aware features. The proposed approach used the cache miss count to characterize the DNS tunneling traffic. Based on the selected feature, two filters have been introduced to detect DNS tunneling: a long short-term memory (LSTM) and a rule-based filter. The rule-based filter achieved a higher detection rate than the LSTM filter. The authors use a DNS cache server installed on the local network to capture the DNS traffic generated on the DNS cache server. The generated data were used to evaluate the proposed methods.

Zhan et al. proposed a method for detecting data exfiltration of the DNS over HTTPS (DoH) [36]. The proposed method analyzed the fingerprints of DoH clients and extracted flow-based features to identify the DNS tunneling. The results proved that the proposed method is effective and hard to evade.

Nguyen and Park, in [37], proposed a two-layer transformer system to detect DoH tunneling attacks. The proposed system can be integrated with a secure operating system in an enterprise network. The accuracy of the system was 99.4%, and it only needs 20% labeled data compared to other supervised machine learning classifiers.

The main challenge of this research is the dataset. There is no standard dataset for DNS tunneling used by researchers. All datasets used by researchers have been generated by DNS tunnel tools on private networks. Moreover, the proposed approaches that used machine learning did not consider the number of features used to train the model. The feature selection step is very critical and affects the speed and accuracy of the model. Table 1 presents the related works mentioned in this section and compares them in terms of datasets or tools used to generate the DNS tunneling data.

Table 1. Summary of related works.

4. Testbed Environment Setting for Dataset Generation

This section features a detailed description of the testbed setting and the DNS tunneling tools that were used, as well as a description of how the network was organized with the corresponding IP address for each device.

A virtual machine infrastructure (VMI) was used to build the proposed network. Figure 4 presents the testbed environment for setting up a DNS tunnel with the corresponding IP addresses that were used for each device. Windows 10 was used for the nonvulnerable host, and Kali Linux (Ubuntu) kernels were used for the infected host. The attacker controls the victim from outside the network and has its own authoritative server with the “attacker.com” domain name.

Figure 4. Illustration of proposed testbed environment using VM.

In this paper, four DNS tunneling tools were used in the testbed. The attack passes through several stages before the DNS tunnel begins to leak data. The first stage is initial access to the victim through reconnaissance, scanning, and exploitation of any vulnerability. The second stage is installing malicious software to obtain full access to the victim. The final stage is the attack itself, in which the sensitive and confidential data are exfiltrated through the DNS packets.

To generate the datasets, the dnscat2, dns2tcp, dnsteal, and iodine DNS tunneling tools were used on Kali Linux (Ubuntu). Table 2 describes each tool and gives the number of records generated by each tool. All records generated using the DNS tunneling tools were then merged into a single DNS tunnel class, which holds 34,109 records, while the normal traffic comprises 3240 records.

Table 2. DNS tunnel tools’ utilities.

Generally speaking, when a client sends a DNS query to the DNS server, the length of the DNS packet is normally between 50 and 550 bytes [38]. The length varies because there are various types of DNS packets, such as query messages, response messages, and recursive queries, and each type has a different packet length.

Overall, the range of normal DNS packet lengths must be established in order to filter the suspicious DNS packets in the network. Any length outside this range is suspicious, but the question remains of how to detect abnormal packets whose length is within the normal range.

In the testbed experiments conducted with the different DNS tunneling tools, the length of the packets changed. Specific DNS packet lengths were observed for the tunneling traffic: 101, 110, 115, 148, 149, 156, 157, 167, 173, 183, 229, 269, and 325. Determining these specific packet lengths enhances the performance of the DNS tunneling detection system by providing a fast filtering approach based on only one attribute: “length”.
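A minimal sketch of such a length-based filter is shown below. It assumes the listed values are exact packet lengths in bytes and reuses the 50 to 550 byte range cited earlier; the function name and the three verdict labels are illustrative choices, not part of the proposed system.

```python
# Packet lengths observed for tunneling traffic in the testbed.
TUNNEL_LENGTHS = frozenset(
    {101, 110, 115, 148, 149, 156, 157, 167, 173, 183, 229, 269, 325}
)
NORMAL_RANGE = (50, 550)  # typical DNS packet length range cited in the text


def length_verdict(length):
    """Classify a DNS packet using only its length."""
    if length in TUNNEL_LENGTHS:
        return "tunnel"
    if not NORMAL_RANGE[0] <= length <= NORMAL_RANGE[1]:
        return "suspicious"
    return "undecided"  # further features are needed to decide


for n in (74, 157, 900):
    print(n, length_verdict(n))
```

A set lookup takes constant time per packet, which is what makes a single-attribute check attractive as a first filter.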

More details about the tools, the packets, and the packet contents can be found in Appendix A.

5. The Proposed Hybrid DNS Tunneling Detection Approach

This section introduces the proposed hybrid DNS tunneling detection approach. The proposed hybrid approach consists of three main phases, as illustrated in Figure 5: data preprocessing, model development, and model testing. The first phase, data preprocessing, is discussed in Section 5.1. Section 5.2 and Section 5.3 present the standard pigeon-inspired optimizer (PIO) and the modified version of the PIO used in DNS tunneling model development. Finally, Section 5.5 presents the hybrid testing procedure, which checks the packet length range before using the model developed in phase two.

Figure 5. The hybrid DNS tunneling detection system.

5.1. Data Preprocessing

The dataset preprocessing phase is very important before training the model in order to eliminate bias and incorrect classification. It passes through different steps, such as data normalization, reduction, cleaning, and transformation. Preprocessing was applied to all datasets before their use for feature selection or for the hybrid approach.

In this paper, a new DNS tunneling dataset has been generated from the UNSW-NB15 dataset by extracting only the DNS records from UNSW-NB15. The new DNS records extracted from the UNSW-NB15 training set do not include any redundant records. All symbolic data have been converted to numeric values. The attack class is considered to be a DNS tunnel and has been replaced by “1”, while the normal class label has been set to “0”. A DNS tunneling detection system is supposed to be used in the early stages of a network and to filter only the DNS packets.
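The extraction and labeling steps can be sketched as follows. The record layout and the field names `service`, `sbytes`, and `attack_cat` are assumptions made for this example, as is the rule that every non-normal category maps to the tunnel class; the conversion of symbolic features to numeric values is not shown.

```python
def extract_dns_records(records):
    """Keep DNS rows, drop exact duplicates, and encode the class label."""
    seen = set()
    result = []
    for row in records:
        if row["service"] != "dns":          # keep only DNS traffic
            continue
        key = tuple(sorted(row.items()))
        if key in seen:                      # redundant record
            continue
        seen.add(key)
        encoded = dict(row)
        # Normal traffic is labeled 0; any attack class is labeled 1 (tunnel).
        encoded["label"] = 0 if row["attack_cat"] == "Normal" else 1
        result.append(encoded)
    return result


rows = [
    {"service": "dns", "sbytes": 146, "attack_cat": "Normal"},
    {"service": "dns", "sbytes": 146, "attack_cat": "Normal"},    # duplicate
    {"service": "http", "sbytes": 900, "attack_cat": "Normal"},   # not DNS
    {"service": "dns", "sbytes": 114, "attack_cat": "Generic"},
]
dns_rows = extract_dns_records(rows)
print([r["label"] for r in dns_rows])
```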

In the data normalization step, all data values are rescaled into a proportional range for each feature. This step is crucial to avoid the classifier’s bias toward the majority class over the minority class in an imbalanced dataset [39]. Equation (1) presents the normalization applied to all datasets.

$X_{normalized} = \frac{X - X_{min}}{X_{max} - X_{min}}$

(1)
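As a small worked example of Equation (1), the sketch below rescales one feature column to the range [0, 1]. The handling of a constant column, where the denominator would be zero, is an assumption added for robustness.

```python
def min_max_normalize(values):
    """Scale a feature column to [0, 1] following Equation (1)."""
    lo, hi = min(values), max(values)
    if hi == lo:                      # constant column: avoid division by zero
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


lengths = [50, 101, 325, 550]
normalized = min_max_normalize(lengths)
print(normalized)
```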

5.2. Pigeon-Inspired Optimization (PIO)

The pigeon-inspired optimizer belongs to bio-inspired swarm intelligence. The idea of the PIO algorithm is derived from the homing behavior of pigeons. In the past, pigeons were used to carry messages between people over long distances [40]. Pigeons use the magnetic particles in their beaks to navigate to their homes. The PIO algorithm is based on two main operators: the map and compass operator and the landmark operator [41]. In the map and compass operator, the pigeon uses the location of the sun and the Earth’s magnetic field, whereas in the landmark operator it uses landmarks to navigate toward the destination [42]. In the landmark operator, a pigeon guides the swarm if it is familiar with the location; otherwise, it follows the leader. The following clarifies the mathematical model of these two operators:

Map and compass operator:
In this phase, each pigeon i is represented by its location $X_{i}$ and its velocity $V_{i}$ in the D-dimensional search space. In each iteration, the position and velocity of the pigeons are updated based on previous iteration values. The values are updated toward the best pigeon values, as presented in Equations (2) and (3) [43].

$V_{i}(t) = V_{i}(t-1) \cdot e^{-Rt} + rand \cdot (X_{g} - X_{i}(t-1))$

(2)

The $rand$ function returns a uniform random number in [0, 1], $X_{g}$ is the global best pigeon based on the evaluation function value, and R is the map and compass factor. The velocity of a pigeon $V_{i}$ represents the amount of change of the current pigeon $X_{i}$ toward the best pigeon $X_{g}$.

$X_{i}(t) = X_{i}(t-1) + V_{i}(t)$

(3)
Landmark operator: The landmark operator is used when the pigeons are close to their landmark. In the algorithm, it is activated after the map and compass iterations have ended [44]. All pigeons are evaluated according to their fitness value. Then, in each iteration of the landmark operator, the number of pigeons is halved, as presented in Equation (4). After the population is halved, all remaining pigeons $X_{i}$ are evaluated, and their positions are updated toward the best location, which is the center $X_{c}$, as in Equations (5) and (6).

$N_{p} (t + 1) = \frac{N_{p} (t)}{2}$

(4)

where $N_{p}$ is the number of pigeons in iteration t.

$X_{c}(t+1) = \frac{\sum X_{i}(t+1) \cdot Fitness(X_{i}(t+1))}{N_{p} \sum Fitness(X_{i}(t+1))}$

(5)

$X_{i}(t+1) = X_{i}(t) + rand \cdot (X_{c}(t+1) - X_{i}(t))$

(6)

The $Fitness$ of a solution or the selected pigeon reflects the quality of the solution based on the false positive rate (FPR), the number of the selected features, and the true positive rate (TPR).
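The two operators can be sketched for a continuous search space as follows. This is a minimal illustration under stated assumptions: the fitness passed to `landmark` is a positive score where larger is better, a fresh random number is drawn per dimension, the center follows Equation (5) exactly as written (including the $N_{p}$ factor in the denominator), and the function names and the toy score are hypothetical.

```python
import math
import random


def map_and_compass(positions, velocities, best, R, t, rng):
    """One iteration of Equations (2) and (3) for every pigeon (in place)."""
    for i, (x, v) in enumerate(zip(positions, velocities)):
        new_v = [vd * math.exp(-R * t) + rng.random() * (gd - xd)
                 for xd, vd, gd in zip(x, v, best)]
        velocities[i] = new_v
        positions[i] = [xd + vd for xd, vd in zip(x, new_v)]


def landmark(positions, fitness, rng):
    """One iteration of Equations (4)-(6); returns the surviving pigeons."""
    # Equation (4): keep the better half of the population.
    ranked = sorted(positions, key=fitness, reverse=True)
    survivors = ranked[:max(1, len(ranked) // 2)]
    n_p = len(survivors)
    scores = [fitness(x) for x in survivors]
    total = sum(scores)
    dims = len(survivors[0])
    # Equation (5): fitness-weighted center, as written in the text.
    center = [sum(x[d] * s for x, s in zip(survivors, scores)) / (n_p * total)
              for d in range(dims)]
    # Equation (6): move each survivor toward the center.
    return [[xd + rng.random() * (cd - xd) for xd, cd in zip(x, center)]
            for x in survivors]


rng = random.Random(0)
pigeons = [[rng.uniform(-5, 5) for _ in range(2)] for _ in range(8)]
vel = [[0.0, 0.0] for _ in pigeons]
score = lambda x: 1.0 / (1.0 + sum(c * c for c in x))   # toy score, larger is better
best = max(pigeons, key=score)
map_and_compass(pigeons, vel, best, R=0.2, t=1, rng=rng)
pigeons = landmark(pigeons, score, rng)
print(len(pigeons))
```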

5.3. Modified Pigeon-Inspired Optimization (M-PIO)

Feature selection is a vital process used to remove irrelevant features, thus reducing the dimensionality of the dataset, which also improves the accuracy of the system and reduces the processing time. In this section, a modified version of the pigeon-inspired optimizer (M-PIO) for feature selection is used [6]. The M-PIO contains a local search algorithm as an extra operator. Two variations of the local search PIO (LS_PIO) are used; the first one uses the hill climbing algorithm, while the second one uses the Tabu search algorithm. Additionally, a DNS tunneling detection approach based on a one-class support vector machine is proposed, as shown in Figure 5.

In LS_PIO, the first population is randomly generated; each solution is represented as a vector that covers all of the features. The vector length is fixed, and the value at each index indicates the absence or presence of the corresponding feature by zero or one, respectively.

The solution (pigeon) that has the best fitness value, which is the minimum value of Equation (9), is called the global solution, and the rest of the solutions (pigeons) update their positions toward the global solution. The position of the pigeon, as illustrated in Equation (7), depends on the pigeon’s velocity. The pigeon’s velocity determines the amount of change that will be applied to the solution. A pigeon’s velocity is calculated as the cosine similarity between the pigeon and the best pigeon, as in Equation (8).

$X_{p}(t)[i] = \begin{cases} X_{p}(t-1)[i], & \text{if } S(V_{p}(t)) > r \\ X_{g}(t-1)[i], & \text{otherwise} \end{cases}$

(7)

$V_{p} = CosineSimilarity(X_{g}, X_{p}) = \frac{X_{g} \cdot X_{p}}{\|X_{g}\| \, \|X_{p}\|} = \frac{\sum_{i=0}^{n-1} X_{p,i} X_{g,i}}{\sqrt{\sum_{i=0}^{n-1} X_{p,i}^{2}} \sqrt{\sum_{i=0}^{n-1} X_{g,i}^{2}}}$

(8)
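Equations (7) and (8) can be sketched for binary feature vectors as follows. Because this section does not define $S$ or $r$, the sketch assumes that $S$ is the identity and that $r$ is a fresh uniform random number drawn for each feature; both are assumptions, as are the function names.

```python
import math
import random


def cosine_similarity(a, b):
    """Equation (8) for two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0   # an all-zero vector has no direction


def update_position(pigeon, best, rng):
    """Equation (7): keep the pigeon's own bit when the velocity exceeds r,
    otherwise copy the bit of the global best solution."""
    velocity = cosine_similarity(best, pigeon)
    return [p if velocity > rng.random() else g for p, g in zip(pigeon, best)]


rng = random.Random(1)
best = [1, 0, 1, 1, 0]
pigeon = [1, 1, 0, 1, 0]
print(cosine_similarity(best, pigeon))
print(update_position(pigeon, best, rng))
```

A pigeon that already resembles the global best has a velocity close to one and therefore keeps most of its own bits, while a dissimilar pigeon copies more bits from the global best.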

In each iteration, after determining the global solution, it will be entered into the local search algorithm. The local search algorithm tries to find a better solution. In this paper, the Tabu search and the hill climbing algorithms have been used.
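A minimal Tabu search over binary feature vectors might look like the sketch below. The single-bit-flip neighborhood, the tabu list of recently flipped indices, the aspiration rule, and the default parameters are generic textbook choices assumed for illustration; the paper’s own variant may differ.

```python
from collections import deque


def tabu_search(start, cost, iterations=20, tabu_size=5):
    """Minimize cost over bit vectors using single-bit-flip neighbors."""
    current = list(start)
    best, best_cost = list(start), cost(start)
    tabu = deque(maxlen=tabu_size)          # recently flipped indices
    for _ in range(iterations):
        candidates = []
        for i in range(len(current)):
            neighbor = current.copy()
            neighbor[i] ^= 1
            c = cost(neighbor)
            # Aspiration: a tabu move is allowed if it beats the best so far.
            if i not in tabu or c < best_cost:
                candidates.append((c, i, neighbor))
        if not candidates:
            break
        c, i, current = min(candidates, key=lambda item: item[0])
        tabu.append(i)
        if c < best_cost:
            best, best_cost = current.copy(), c
    return best, best_cost


target = [1, 0, 1, 1, 0, 0]
hamming = lambda bits: sum(b != t for b, t in zip(bits, target))  # toy cost
result = tabu_search([0, 0, 0, 0, 0, 0], hamming)
print(result)
```

Unlike hill climbing, the search keeps moving after it reaches a local optimum, and the tabu list prevents it from immediately undoing recent moves.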

Modified Landmark Operator

The modified landmark operator works like the base operator described in Section 5.2, but it uses Equations (7) and (8) to update the pigeon’s position and to calculate the velocity, respectively. At each iteration, only half of the pigeons are retained after sorting them by their fitness value. This eliminates the pigeons (solutions) that have poor fitness values.

5.4. Fitness Function

The fitness function has been used to evaluate the pigeons (solutions) in each iteration. In the LS_PIO, the fitness function used is presented in Equation (9). Based on Equation (9), the best pigeon is the pigeon that has the minimum fitness value [45]. As illustrated in Equation (9), N is the number of features in the dataset, and F is the number of selected features in the solution. $\alpha$, $\beta$, and $\delta$ are weights that reflect the importance of each corresponding measure. The summation of all the weights is equal to one, where $\alpha = 0.04$, $\beta = 0.48$, and $\delta = 0.48$. The weights of TPR and FPR are equal since they have the same importance regarding the DNS tunneling detection system. Though the weight of the number of selected features is smaller than other weights, this small fraction is used to give preference to solutions that have the same TPR and FPR but a different number of selected features.

$FF = \alpha \cdot \frac{F}{N} + \beta \cdot FPR + \delta \cdot \frac{1}{TPR}$

(9)
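Equation (9) translates directly into a short function. The weights are those given in the text; the function name, the sample values, and the handling of a zero TPR are assumptions made for illustration.

```python
ALPHA, BETA, DELTA = 0.04, 0.48, 0.48   # weights given in the text


def fitness(selected, total, fpr, tpr):
    """Equation (9); smaller values indicate better feature subsets."""
    if tpr <= 0:
        return float("inf")              # no true positives: worst possible value
    return ALPHA * selected / total + BETA * fpr + DELTA / tpr


# Same TPR and FPR, different subset sizes: the smaller subset is preferred.
small = fitness(selected=5, total=40, fpr=0.02, tpr=0.97)
large = fitness(selected=20, total=40, fpr=0.02, tpr=0.97)
print(small < large)
```

The comparison shows the tie-breaking role of the small weight $\alpha$ described above.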

Algorithm 1 presents the pseudocode of the modified LS_PIO.

In this paper, a hybrid DNS tunneling detection system has been proposed based on the M-PIO and the packet length ranges. The M-PIO produces the best set of features for the ingress packet; however, checking all of the selected features produced by the M-PIO for every packet is time-consuming. Thus, in the hybrid approach, the first check is based on only one attribute: the packet length.

If the packet length matches any range, the packet is classified as tunneling. When none of the ranges is matched, the system checks the other selected M-PIO features.

5.5. Hybrid Model Testing

This subsection clarifies the testing phase for each packet that passes through the DNS tunneling detection system. As illustrated in Figure 5, each packet is checked against the packet length range condition; if the packet length falls within the specified range, it is identified as a DNS tunneling packet regardless of the model’s decision. If the packet is not within the specified packet length range, the model’s decision is used. Based on our testbed experiments, we observed that DNS tunneling packets fall within a specified length range, so packets that fall within this range are treated as DNS tunneling packets. However, not all tunneling packets may fall within the specified range, since our testbed only uses four tools to generate the DNS tunneling data. For this reason, we use the length check as an extra filter to enhance the detection rate of the system.
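The testing procedure can be sketched as a two-stage decision. The packet representation, the feature name `entropy_norm`, and the `toy_model` callable are hypothetical stand-ins for the selected features and the trained model; only the order of the two checks is taken from the text.

```python
def hybrid_classify(packet, tunnel_lengths, model, selected_features):
    """Return 1 (tunnel) or 0 (normal) for one DNS packet."""
    if packet["length"] in tunnel_lengths:       # fast path: one attribute only
        return 1
    vector = [packet[name] for name in selected_features]
    return model(vector)                         # otherwise use the model's decision


# Hypothetical stand-ins for the trained model and the selected features.
toy_model = lambda vec: 1 if vec[0] > 0.8 else 0
features = ["entropy_norm"]
lengths = {101, 110, 115}

by_length = hybrid_classify({"length": 110, "entropy_norm": 0.1}, lengths, toy_model, features)
by_model = hybrid_classify({"length": 80, "entropy_norm": 0.9}, lengths, toy_model, features)
normal = hybrid_classify({"length": 80, "entropy_norm": 0.2}, lengths, toy_model, features)
print(by_length, by_model, normal)
```

The first packet is flagged by its length alone even though the stand-in model would have called it normal, which mirrors the rule that the length check overrides the model’s decision.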

Algorithm 1. Hybrid intelligent DNS tunneling detection system based on Tabu-PIO and packet length

Input: $N_{p}$: population size, $R$: ratio of map and compass, $ni_{1}$: local search iterations, $ni_{2}$: pigeons iterations, $P_{L}$: packet length, $X_{i}$: a solution

Output: $X_{g}$: global solution (set of features)

1: Initialize the first population randomly ($X_{1}, X_{2}, \dots, X_{N_{p}}$).
2: Evaluate the population members ($X_{1}, X_{2}, \dots, X_{N_{p}}$) according to their fitness by Equation (9).
3: Determine $X_{g}$ (the pigeon that has the minimum fitness value).
4: while ($ni_{1} \geq 1$) do
5:     Update pigeon velocity and path toward $X_{g}$ using Equations (2) and (3).
6:     Evaluate the updated pigeons ($X_{1}, X_{2}, \dots, X_{N_{p}}$) according to their fitness by Equation (9).
7:     Determine $X_{g}$ (the pigeon that has the minimum fitness value).
8:     $X_{gn}$ = Tabu_search($X_{g}$)
9:     if fitness($X_{gn}$) < fitness($X_{g}$) then $X_{g} = X_{gn}$
10: end while
11: while ($N_{p} \geq 1$) do
12:     Sort solutions (pigeons) according to their fitness.
13:     $N_{p} = N_{p} / 2$
14:     Determine the desired location by Equation (5).
15:     Update the position of the pigeon by Equation (6).
16:     Update the best pigeon $X_{g}$.
17: end while
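
Steps 8 and 9 of Algorithm 1 refine the best pigeon with a local search. The sketch below shows one generic form of Tabu search over a binary feature mask; it is not necessarily the neighbourhood or tabu rule used by the authors. The single-bit-flip neighbourhood, the tabu list length, the function name `tabu_search`, and the toy fitness (Hamming distance to a hypothetical target mask) are all assumptions made for illustration; only the default of 10 local search iterations is taken from Table 5.

```python
from collections import deque
from typing import Callable, Tuple

Mask = Tuple[int, ...]  # 1 = feature selected, 0 = feature dropped


def tabu_search(start: Mask, fitness: Callable[[Mask], float],
                iterations: int = 10, tabu_size: int = 3) -> Mask:
    """Generic Tabu search: move to the best non-tabu neighbour each iteration."""
    current = best = start
    tabu = deque(maxlen=tabu_size)  # indices of recently flipped bits
    for _ in range(iterations):
        candidates = []
        for i in range(len(current)):
            if i in tabu:
                continue  # flipping this bit again is temporarily forbidden
            neighbour = current[:i] + (1 - current[i],) + current[i + 1:]
            candidates.append((fitness(neighbour), i, neighbour))
        if not candidates:
            break
        # The best neighbour is accepted even if it is worse than `current`,
        # which lets the search leave local minima.
        score, flipped, current = min(candidates)
        tabu.append(flipped)
        if score < fitness(best):
            best = current
    return best


TARGET = (1, 0, 1, 0, 0, 1)  # hypothetical "ideal" mask for the toy fitness


def toy_fitness(mask: Mask) -> float:
    """Toy objective: number of positions that differ from TARGET."""
    return float(sum(a != b for a, b in zip(mask, TARGET)))


x_g = (0, 0, 0, 0, 0, 0)
x_gn = tabu_search(x_g, toy_fitness)
if toy_fitness(x_gn) < toy_fitness(x_g):  # step 9 of Algorithm 1
    x_g = x_gn
print(x_g)  # (1, 0, 1, 0, 0, 1)
```

In the detection system the toy objective would be replaced by the fitness of Equation (9), evaluated by training and testing the classifier on the masked feature set.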

6. Experiments and Evaluation

6.1. Datasets

This section presents the three datasets that have been used to evaluate the performance of the hybrid system. The first consists of all the DNS records extracted from the UNSW-NB15 benchmark dataset, as shown in Figure 6; a detailed description of the training and testing sets is presented in Table 3. The main objective of extracting this new dataset is to allow related proposals to conduct comparisons in the future, since UNSW-NB15 is publicly available.
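
As an illustration of this kind of extraction, the sketch below filters DNS rows out of a CSV file. It is a hedged example and not the procedure of Figure 6: the column name `service`, the value `dns`, and the helper `extract_dns_records` are assumptions about how the records are labeled, and the four sample rows are invented purely so that the example is self-contained.

```python
import csv
import io

# Invented sample rows in a UNSW-NB15-like layout (not real dataset records).
SAMPLE = """id,proto,service,attack_cat,label
1,udp,dns,Normal,0
2,tcp,http,Exploits,1
3,udp,dns,Generic,1
4,tcp,-,Normal,0
"""


def extract_dns_records(handle, service_column="service", dns_value="dns"):
    """Keep only the rows whose service column identifies DNS traffic."""
    reader = csv.DictReader(handle)
    return [
        row for row in reader
        if row[service_column].strip().lower() == dns_value
    ]


dns_rows = extract_dns_records(io.StringIO(SAMPLE))
print(len(dns_rows))  # 2
```

With a real file, `io.StringIO(SAMPLE)` would be replaced by an opened CSV file handle.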

Figure 6. Extracting DNS records from UNSW-NB15 dataset.

Table 3. # of instances of DNS records from UNSW-NB15.

The second dataset is the DNS tunneling dataset of [32], which was collected from an isolated private network. To generate the dataset, four DNS tunneling tools were used: dns2tcp, dnscapy, iodine, and tuns. The dataset contains four types of DNS tunneling traffic, one for each tool, in addition to normal DNS traffic. A detailed description of the dataset is presented in Table 4.

Table 4. # of instances of DNS records from [32].

The third dataset was generated from the testbed environment. A detailed description of the number of records from each tool is presented in Table 2.

6.2. Setup of the Experiments

The hybrid system was implemented using Python 3.7. All experiments were conducted on a 64-bit Windows 10 machine with 16 GB of RAM and an Intel Core i7 processor. Table 5 presents the initial values of the experiment settings. The 10-fold cross-validation method was used for our generated dataset, whereas for the DNS records from UNSW-NB15 and the labeled DNS exfiltration dataset [32], the training/testing splits given in Tables 3 and 4 were used.
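
For readers unfamiliar with 10-fold cross-validation, the following sketch shows only the index bookkeeping: the records are partitioned into ten folds, and each fold is used once for testing while the remaining nine are used for training. It assumes contiguous, unshuffled folds for simplicity, and `k_fold_indices` is a hypothetical helper rather than the authors' code. The sample count is taken as the sum of the record counts in Table 2 (37,349).

```python
from typing import Iterator, List, Tuple


def k_fold_indices(n_samples: int, k: int = 10) -> Iterator[Tuple[List[int], List[int]]]:
    """Yield (train_indices, test_indices) for k contiguous folds."""
    # The first (n_samples % k) folds receive one extra sample.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test_idx = list(range(start, start + size))
        train_idx = list(range(0, start)) + list(range(start + size, n_samples))
        yield train_idx, test_idx
        start += size


N_RECORDS = 7698 + 10658 + 9856 + 5897 + 3240  # record counts from Table 2
folds = list(k_fold_indices(N_RECORDS))
print(len(folds))                                    # 10
print(sorted({len(test) for _, test in folds}))      # [3734, 3735]
```

In practice the records would normally be shuffled (and possibly stratified by class) before being split; that step is omitted here.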

Table 5. The experimental setup.

Three packet length ranges, low (RL), medium (RM), and high (RH), were used to quickly detect whether a DNS tunnel is present.

6.3. Performance Metrics

In this paper, four performance metrics were used to measure the performance of the proposed system and to compare our approach with related approaches in the literature [46].

False Positive Rate (FPR or false alarms): Equation (10) defines the FPR, which represents the proportion of normal DNS packets that have been misclassified as DNS tunneling.

$FPR = \frac{FP}{TN + FP}$

(10)

Accuracy: Equation (11) defines the accuracy, which represents the ratio of correctly classified instances to the total number of instances.

$Accuracy = \frac{TP + TN}{TP + TN + FP + FN}$

(11)

F-score (F-measure): Equation (12) defines the F-score, which summarizes the performance of the model as the harmonic mean of precision and recall.

$F\text{-}Score = \frac{2 \cdot TP}{2 \cdot TP + FP + FN}$

(12)

AUC: The area under the receiver operating characteristic (ROC) curve is a suitable measure for imbalanced data and a good indicator of overall classification performance.
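
The count-based metrics above can be computed directly from the four confusion-matrix entries. The sketch below implements Equations (10) to (12) together with the standard definition TPR = TP / (TP + FN); the function name `detection_metrics` and the example counts are hypothetical and are not results from the paper.

```python
from typing import Dict


def detection_metrics(tp: int, tn: int, fp: int, fn: int) -> Dict[str, float]:
    """Compute TPR, FPR, accuracy, and F-score from confusion-matrix counts."""
    return {
        "tpr": tp / (tp + fn),                        # standard recall definition
        "fpr": fp / (tn + fp),                        # Equation (10)
        "accuracy": (tp + tn) / (tp + tn + fp + fn),  # Equation (11)
        "f_score": 2 * tp / (2 * tp + fp + fn),       # Equation (12)
    }


# Hypothetical counts: 100 tunneling packets and 100 normal packets.
metrics = detection_metrics(tp=90, tn=95, fp=5, fn=10)
print(metrics["fpr"], metrics["accuracy"])  # 0.05 0.925
```

Here "positive" denotes the DNS tunneling class, so a false positive is a normal packet flagged as tunneling.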

6.4. Results Discussion

This section presents the experimental results of the proposed DNS tunneling detection approach in two parts. The first reports the number of features selected from the UNSW-NB15 and [32] datasets using both versions of the modified PIO (with hill climbing and with Tabu search). The second evaluates the proposed approach in terms of several performance metrics.

Table 6 presents the number and indices of the features selected from the UNSW-NB15 dataset by each of the examined algorithms with the three one-class classifiers (LOF, OC-SVM, and iForest). The results indicate that the Tabu-PIO reduces the number of features in the UNSW-NB15 dataset from 42 to only 13 when the iForest classifier is used. The Tabu search yields fewer features than the hill climbing for all one-class classifiers.

Table 6. The selected features from DNS records extracted from UNSW-NB15 by LS_PIO based on different classifiers and local search methods.

Table 7 shows the results for the [32] dataset with different classifiers and different local search algorithms. The Tabu-PIO reduces the number of features from 17 to 5 when iForest is used. On the other hand, the OC-SVM selects the largest number of features compared to the other classifiers.

Table 7. The selected features from [32] dataset by LS_PIO based on different classifiers and local search methods.

Table 8 reports the results for the DNS_UNSW-NB15 dataset obtained with the two local search algorithms (hill climbing and Tabu search) combined with the PIO, in terms of TPR, FPR, accuracy, and F-score.

Table 8. The evaluation results for LS_PIO using DNS_UNSW-NB15 dataset.

The two versions of the LS_PIO have been evaluated using three classifiers: iForest, OC-SVM, and LOF. According to Table 8, the PIO-Hill_Climbing achieved its best results with the OC-SVM classifier compared to the other classifiers. Moreover, with the OC-SVM classifier, the PIO-Tabu_Search has a better TPR and F-score than the PIO-Hill_Climbing. Regarding FPR, both versions of the LS_PIO have a value of zero with the OC-SVM and iForest classifiers. Both versions of the LS_PIO have their worst results with the LOF classifier compared with the iForest and the OC-SVM.

Figure 7 illustrates the convergence curve for both the PIO-Tabu_Search and PIO-Hill_Climbing. Figure 7 shows that the PIO-Tabu_Search converges faster than the PIO-Hill_Climbing, especially at the first 50 iterations. On the other hand, the PIO-Hill_Climbing converged after 110 iterations and remained enhanced. Additionally, at iteration 200, the PIO-Tabu_Search had a better fitness value than the PIO-Hill_Climbing.

Figure 7. Hill climbing and Tabu convergence curve using UNSW-NB15 dataset.

Table 9 presents the results achieved by both versions of the LS_PIO (PIO-Hill_Climbing and PIO-Tabu_Search) in terms of TPR, FPR, accuracy, and F-score for the labeled DNS exfiltration dataset in [32]. The two versions of the LS_PIO have been evaluated using three classifiers: iForest, OC-SVM, and LOF. According to Table 9, the iForest classifier has the worst results in terms of all metrics compared to the OC-SVM and LOF for both versions of the LS_PIO. The PIO-Tabu_Search has better results than the PIO-Hill_Climbing with 98.8%, 6.7%, 98.3%, and 97.6% in terms of TPR, FPR, accuracy, and F-score, respectively, using the OC-SVM classifier.

Table 9. The evaluation results for LS_PIO using labeled DNS exfiltration dataset [32].

Table 10 presents the results of the proposed approach for the dataset generated in this paper, obtained using all features of the generated dataset. Table 11 compares the proposed approach with related techniques on the labeled DNS exfiltration dataset [32].

Table 10. The evaluation results using our generated dataset using all features.

Table 11. Comparison between different DNS tunneling techniques using the labeled DNS exfiltration dataset [32].

Table 10 presents the evaluation for the generated dataset using the three selected classifiers in terms of TPR, FPR, F-score, and AUC. Additionally, it presents the accuracy for the training set, validation set, and testing set. According to Table 10, the OC-SVM achieved the best results against the other classifiers, with 99.77%, zero, 98.56%, and 97.75% in terms of TPR, FPR, F-score, and AUC, respectively. Moreover, the accuracy results with the OC-SVM were 99.87%, 99.48%, and 99.34% for the training set, validation set, and testing set, respectively. Finally, the LOF classifier has the worst results in terms of all metrics.

Table 11 compares the proposed approach with two related approaches using the same DNS exfiltration dataset introduced in [32]. In [32], multi-label feed-forward neural networks were used to detect DNS tunneling; that approach achieved an accuracy of 84%. Moreover, in [34], K-nearest neighbors (KNN) was used, and the experimental results show 94% accuracy and a 94% F-score on the same dataset [32]. Our proposed approach achieved the best results, with 98% accuracy and a 98% F-score, compared to the examined related works.

Figure 8 presents the run-time of the M-PIO and of our hybrid DNS tunneling detection approach for different data sizes. The Tabu-PIO performs better on a small dataset than it does when the data size increases. On the other hand, the hybrid approach that uses the packet length range outperforms the Tabu-PIO for all data sizes, and the improvement becomes significant when the data size is very large.

Figure 8. Run-time for our dataset.

7. Conclusions

Data exfiltration is one of the most pressing current issues in the security field, and detecting data leakage in the network is challenging, especially when the data are leaked through the DNS protocol. DNS tunneling is used to exfiltrate sensitive or confidential data using DNS packets. The attacker exploits DNS packets, which by default are configured to bypass monitoring by the security systems. In this paper, a hybrid DNS tunneling detection system has been presented based on the Tabu-PIO and the packet length range. Moreover, a testbed has been built using virtual machines to generate DNS tunneling datasets with different classes. Our generated dataset summarizes the different ranges of the packet length, which helped us to modify the Tabu-PIO.

The evaluation was conducted based on three datasets: the DNS records from the UNSW-NB15 dataset, the labeled DNS exfiltration dataset [32], and our testbed dataset. The results show that using the Tabu-PIO reduces the number of features in all datasets, i.e., from 42 to 13 features and from 17 to 5 in the DNS records from the UNSW-NB15 dataset and DNS tunneling [32], respectively. Moreover, the results demonstrate that using a hybrid approach (the M-PIO + packet length) enhances the run-time significantly when the size of the data increases.

In future work, the proposed approach can be improved by allowing it to adapt to new records with minimal human intervention. Moreover, the main challenge of this research is that there is no robust dataset specifically designed for the DNS tunneling problem. As a result, we intend to build a benchmark dataset specialized for DNS tunneling.

Author Contributions

Conceptualization, O.A.; methodology, O.A. and H.A.; software, O.A. and H.A.; validation, O.A., H.A. and M.A.A.; investigation, O.A. and H.A.; writing—original draft, O.A. and H.A.; writing—review, B.E., M.Q. and M.A.A.; supervision, M.Q. and M.A.A.; funding acquisition, M.A.A. All authors have read and agreed to the published version of the manuscript.

Funding

This work was supported through the Annual Funding track by the Deanship of Scientific Research, Vice Presidency for Graduate Studies and Scientific Research, King Faisal University, Saudi Arabia (Project No. GRANT 2863).

Data Availability Statement

The UNSW-NB15 dataset is available in the Kaggle repository [https://www.kaggle.com/datasets/mrwellsdavid/unsw-nb15 (accessed on 22 January 2021)]. The labeled DNS exfiltration dataset is available on GitHub [https://github.com/netrack/learn (accessed on 2 September 2022)].

Conflicts of Interest

The authors declare that they have no known competing financial interests or personal relationships that could appear to influence the work reported in this paper.

Appendix A

This section presents sample screenshots from the testbed environment using different DNS tunneling tools. Figure A1, Figure A2, Figure A3 and Figure A4 present a random sample of packets generated for each tool. These data have been captured and converted into data that can be used to train machine learning algorithms.

Each packet contains the values of the DNS attributes of port 53 traffic, such as the A and AAAA record types, the source port and destination port, and other related information such as the source and destination MAC addresses.

Figure A5 illustrates the content of the UNSW-NB15 dataset in terms of the feature titles, the dimensions of the dataset (21,367 × 44), and a sample of the attribute values.

Figure A1. Sample packet generated from Dnscat2 tunneling tool.

Figure A2. Sample packet generated from Dnsteal tunneling tool.

Figure A3. Sample packet generated from Iodine tunneling tool.

Figure A4. Sample packet generated from uninfected device.

Figure A5. Sample of UNSW-NB15 dataset for DNS records.

References

1. Hawdon, J. Cybercrime: Victimization, perpetration, and techniques. Am. J. Crim. Justice 2021, 46, 837–842.
2. Abiodun, O.I.; Abiodun, E.O.; Alawida, M.; Alkhawaldeh, R.S.; Arshad, H. A review on the security of the internet of things: Challenges and solutions. Wirel. Pers. Commun. 2021, 119, 2603–2637.
3. Wang, Y.; Zhou, A.; Liao, S.; Zheng, R.; Hu, R.; Zhang, L. A comprehensive survey on DNS tunnel detection. Comput. Netw. 2021, 197, 108322.
4. AbuAlghanam, O.; Qatawneh, M.; Almobaideen, W.; Saadeh, M. A new hierarchical architecture and protocol for key distribution in the context of IoT-based smart cities. J. Inf. Secur. Appl. 2022, 67, 103173.
5. AbuAlghanam, O.; Alazzam, H.; Alhenawi, E.; Qatawneh, M.; Adwan, O. Fusion-based anomaly detection system using modified isolation forest for internet of things. J. Ambient. Intell. Humaniz. Comput. 2022, 14, 131–145.
6. Alghanam, O.A.; Almobaideen, W.; Saadeh, M.; Adwan, O. An improved PIO feature selection algorithm for IoT network intrusion detection system based on ensemble learning. Expert Syst. Appl. 2023, 213, 118745.
7. Vaccari, I.; Narteni, S.; Aiello, M.; Mongelli, M.; Cambiaso, E. Exploiting Internet of Things protocols for malicious data exfiltration activities. IEEE Access 2021, 9, 104261–104280.
8. Liang, J.; Wang, S.; Zhao, S.; Chen, S. FECC: DNS Tunnel Detection model based on CNN and Clustering. Comput. Secur. 2023, 128, 103132.
9. Sabir, B.; Ullah, F.; Babar, M.A.; Gaire, R. Machine learning for detecting data exfiltration: A review. ACM Comput. Surv. (CSUR) 2021, 54, 1–47.
10. Do, Q.; Martini, B.; Choo, K.K.R. Exfiltrating data from android devices. Comput. Secur. 2015, 48, 74–91.
11. Ahmed, J.; Gharakheili, H.H.; Raza, Q.; Russell, C.; Sivaraman, V. Real-time detection of DNS exfiltration and tunneling from enterprise networks. In Proceedings of the 2019 IFIP/IEEE Symposium on Integrated Network and Service Management (IM), Arlington, VA, USA, 8–12 April 2019; IEEE: Piscataway, NJ, USA, 2019; pp. 649–653.
12. Ishikura, N.; Kondo, D.; Vassiliades, V.; Iordanov, I.; Tode, H. DNS tunneling detection by cache-property-aware features. IEEE Trans. Netw. Serv. Manag. 2021, 18, 1203–1217.
13. Ahmed, J.; Gharakheili, H.H.; Raza, Q.; Russell, C.; Sivaraman, V. Monitoring enterprise DNS queries for detecting data exfiltration from internal hosts. IEEE Trans. Netw. Serv. Manag. 2019, 17, 265–279.
14. Greenwald, M.; Singhal, S.K.; Stone, J.R.; Cheriton, D.R. Designing an academic firewall: Policy, practice, and experience with surf. In Proceedings of the Internet Society Symposium on Network and Distributed Systems Security, San Diego, CA, USA, 22–23 February 1996; IEEE: Piscataway, NJ, USA, 1996; pp. 79–92.
15. Alsaleh, M.; Barrera, D.; Van Oorschot, P.C. Improving security visualization with exposure map filtering. In Proceedings of the 2008 Annual Computer Security Applications Conference (ACSAC), Anaheim, CA, USA, 8–12 December 2008; IEEE: Piscataway, NJ, USA, 2008; pp. 205–214.
16. Goodall, J.R.; Ragan, E.D.; Steed, C.A.; Reed, J.W.; Richardson, G.D.; Huffer, K.M.; Bridges, R.A.; Laska, J.A. Situ: Identifying and explaining suspicious behavior in networks. IEEE Trans. Vis. Comput. Graph. 2018, 25, 204–214.
17. Bahga, A.; Madisetti, V. Internet of Things: A Hands-on Approach; Arshdeep Bahga and Vijay Madisetti: Himayatnagar, Hyderabad, 2014.
18. Satam, P.; Alipour, H.R.; Al-Nashif, Y.B.; Hariri, S. Anomaly Behavior Analysis of DNS Protocol. J. Internet Serv. Inf. Secur. 2015, 5, 85–97.
19. Fall, K.R.; Stevens, W.R. TCP/IP Illustrated, Volume 1: The Protocols; Addison-Wesley: Boston, MA, USA, 2011.
20. Zhu, L.; Hu, Z.; Heidemann, J.; Wessels, D.; Mankin, A.; Somaiya, N. Connection-oriented DNS to improve privacy and security. In Proceedings of the 2015 IEEE Symposium on Security and Privacy, San Jose, CA, USA, 17–21 May 2015; IEEE: Piscataway, NJ, USA, 2015; pp. 171–186.
21. Born, K.; Gustafson, D. Detecting dns tunnels using character frequency analysis. arXiv 2010, arXiv:1004.4358.
22. Mitsuhashi, R.; Jin, Y.; Iida, K.; Shinagawa, T.; Takai, Y. Malicious DNS Tunnel Tool Recognition using Persistent DoH Traffic Analysis. IEEE Trans. Netw. Serv. Manag. 2022.
23. Palau, F.; Catania, C.; Guerra, J.; Garcia, S.; Rigaki, M. DNS tunneling: A deep learning based lexicographical detection approach. arXiv 2020, arXiv:2006.06122.
24. Sammour, M.; Hussin, B.; Othman, M.F.I.; Doheir, M.; AlShaikhdeeb, B.; Talib, M.S. DNS tunneling: A review on features. Int. J. Eng. Technol. 2018, 7, 1–5.
25. Al-kasassbeh, M.; Khairallah, T. Winning tactics with DNS tunnelling. Netw. Secur. 2019, 2019, 12–19.
26. Nadler, A.; Bitton, R.; Brodt, O.; Shabtai, A. On the vulnerability of anti-malware solutions to DNS attacks. Comput. Secur. 2022, 116, 102687.
27. Patsakis, C.; Casino, F.; Katos, V. Encrypted and covert DNS queries for botnets: Challenges and countermeasures. Comput. Secur. 2020, 88, 101614.
28. Nadler, A.; Aminov, A.; Shabtai, A. Detection of malicious and low throughput data exfiltration over the DNS protocol. Comput. Secur. 2019, 80, 36–53.
29. Wang, S.; Sun, L.; Qin, S.; Li, W.; Liu, W. KRTunnel: DNS channel detector for mobile devices. Comput. Secur. 2022, 120, 102818.
30. Chen, S.; Lang, B.; Liu, H.; Li, D.; Gao, C. DNS covert channel detection method using the LSTM model. Comput. Secur. 2021, 104, 102095.
31. Liu, J.; Li, S.; Zhang, Y.; Xiao, J.; Chang, P.; Peng, C. Detecting DNS tunnel through binary-classification based on behavior features. In Proceedings of the 2017 IEEE Trustcom/BigDataSE/ICESS, Sydney, NSW, Australia, 1–4 August 2017; IEEE: Piscataway, NJ, USA, 2017; pp. 339–346.
32. Bubnov, Y. DNS tunneling detection using feedforward neural network. Eur. J. Eng. Technol. Res. 2018, 3, 16–19.
33. Lambion, D.; Josten, M.; Olumofin, F.; De Cock, M. Malicious DNS tunneling detection in real-traffic DNS data. In Proceedings of the 2020 IEEE International Conference on Big Data (Big Data), Atlanta, GA, USA, 10–13 December 2020; IEEE: Piscataway, NJ, USA, 2020; pp. 5736–5738.
34. Chowdhary, A.; Bhowmik, M.; Rudra, B. DNS tunneling detection using machine learning and cache miss properties. In Proceedings of the 2021 5th International Conference on Intelligent Computing and Control Systems (ICICCS), Madurai, India, 6–8 May 2021; IEEE: Piscataway, NJ, USA, 2021; pp. 1225–1229.
35. Altuncu, M.A.; Gülağiz, F.K.; Özcan, H.; Bayir, Ö.F.; Gezgın, A.; Nıyazov, A.; Çavuşlu, M.A.; Şahın, S. Deep Learning Based DNS Tunneling Detection and Blocking System. Adv. Electr. Comput. Eng. 2021, 21, 39–48.
36. Zhan, M.; Li, Y.; Yu, G.; Li, B.; Wang, W. Detecting DNS over HTTPS based data exfiltration. Comput. Netw. 2022, 209, 108919.
37. Nguyen, T.A.; Park, M. DoH Tunneling Detection System for Enterprise Network Using Deep Learning Technique. Appl. Sci. 2022, 12, 2416.
38. Orebaugh, A.; Ramirez, G.; Beale, J. Wireshark & Ethereal Network Protocol Analyzer Toolkit; Elsevier: Amsterdam, The Netherlands, 2006.
39. Patro, S.; Sahu, K.K. Normalization: A preprocessing stage. arXiv 2015, arXiv:1503.06462.
40. Chen, B.; Lei, H.; Shen, H.; Liu, Y.; Lu, Y. A hybrid quantum-based PIO algorithm for global numerical optimization. Sci. China Inf. Sci. 2019, 62, 70203.
41. Guilford, T.; Roberts, S.; Biro, D.; Rezek, I. Positional entropy during pigeon homing II: Navigational interpretation of Bayesian latent state models. J. Theor. Biol. 2004, 227, 25–38.
42. Sun, H.; Duan, H. PID controller design based on prey-predator pigeon-inspired optimization algorithm. In Proceedings of the 2014 IEEE International Conference on Mechatronics and Automation, Tianjin, China, 3–6 August 2014; IEEE: Piscataway, NJ, USA, 2014; pp. 1416–1421.
43. Duan, H.; Qiao, P. Pigeon-inspired optimization: A new swarm intelligence optimizer for air robot path planning. Int. J. Intell. Comput. Cybern. 2014, 7, 24–37.
44. Alazzam, H.; Sharieh, A.; Sabri, K.E. A feature selection algorithm for intrusion detection system based on pigeon inspired optimizer. Expert Syst. Appl. 2020, 148, 113249.
45. Alazzam, H.; Sharieh, A.; Sabri, K.E. A lightweight intelligent network intrusion detection system using OCSVM and Pigeon inspired optimizer. Appl. Intell. 2022, 52, 3527–3544.
46. Sokolova, M.; Japkowicz, N.; Szpakowicz, S. Beyond accuracy, F-score and ROC: A family of discriminant measures for performance evaluation. In Proceedings of the Australasian Joint Conference on Artificial Intelligence, Hobart, Australia, 4–8 December 2006; Springer: Berlin/Heidelberg, Germany, 2006; pp. 1015–1021.

Table 1. Summary of related works.

Reference	Tools for Dataset Generating	Methods
Ishikura et al. [12]	dnscat2	Binary classification using long short-term memory (LSTM)
Nadler, Aminov, and Shabtai [28]	dns2tcp and iodine	Isolation forest (one-class)
Wang et al. [29]	DNSExfiltrator	Isolation forest
Chen et al. [30]	Iodine, Dnscat2, Dns2tcp, DNShell v1.7, and Ozymandns	LSTM
Liu et al. [31]	dns2tcp, dnscat2, iodine, and Ozymandns	Binary classification (SVM, logistic regression, and decision tree)
Bubnov [32]	dnscapy, dns2tcp, iodine, and tuns	Multi-label feed-forward neural networks
Lambion et al. [33]	Iodine	Convolutional neural network (CNN)
Chowdhary, Bhowmik, and Rudra [34]	dnscapy, dns2tcp, iodine, and tuns	Various ensemble machine learning (random forest, K-nearest neighbor (KNN), SVM, and Gaussian Naive Bayes)
Altuncu et al. [35]	Iodine, dns2cat, and dns2tcp	Deep feed-forward (DFF) neural network
Zhan et al. [36]	godoh and DNSExfiltrator	Random forest, logistic regression, and decision tree
Nguyen and Park [37]	dns2tcp, dnscat2, and iodine	Deep learning (two-layer transformer)
Proposed Approach	dns2tcp, dnscat2, dnsteal, and iodine	Feature selection and ensemble approach

Table 2. DNS tunnel tools’ utilities.

DNS Tool	Operating System	Query Types	# of Records
Dns2tcp	Linux	KEY and TXT	7698
Dnscat2	Linux	A, AAAA, CNAME, NS, TXT, and MX records	10,658
Iodine	Linux, MacOS, Windows	A, CNAME, NULL, MX, TXT, SRV	9856
Dnsteal	Linux	A, AAAA, TXT	5897
Normal	Linux	-	3240

Table 3. # of instances of DNS records from UNSW-NB15.

Class Label	Training Set	Testing Set
DoS	40	107
Exploits	68	185
Fuzzers	17	358
Generic	18,162	39,116
Reconnaissance	12	35
Attack	18,299	39,801
Normal	3068	7493
Total Records	21,367	47,294

Table 4. # of instances of DNS records from [32].

Class Label	Training Set	Testing Set
Dns2tcp	6298	2772
dnscapy	10,043	4375
Iodine	8565	3663
Tuns	40,392	17,378
Attack	65,298	28,188
Normal	12,051	5054
Total Records	77,350	33,242

Table 5. The experimental setup.

Total Number of Run = 20
OC-SVM Parameters
Parameter	Value
$γ$	scale
kernel	RBF
$ν$	[0.074, 0.001, 0.01, 0.1]
Packet Length (PL)
Packet length RL	[100–115]
Packet length RM	[145–185]
Packet length RH	[220–340]
Fitness Function Parameters
$α$	0.04
$β$	0.48
$δ$	0.48
Tabu-PIO Parameters
Local_search_iteration (Tabu-Hill)	10
# of Iterations	200
Number of Population (Np)	128
Map and Compass Factor	0.09

Table 6. The selected features from DNS records extracted from UNSW-NB15 by LS_PIO based on different classifiers and local search methods.

Classifier	Method	# of Features	Set of Selected Features
iForest	PIO-Hill_Climbing	14	[2, 4, 6, 14, 17, 22, 23, 24, 25, 32, 35, 37, 38, 39]
iForest	PIO-Tabu_Search	13	[2, 4, 6, 14, 17, 22, 23, 24, 25, 32, 37, 38, 39]
OC-SVM	PIO-Hill_Climbing	17	[1, 2, 3, 4, 6, 8, 10,11, 14, 15, 17, 19, 22, 26, 29, 30, 37]
OC-SVM	PIO-Tabu_Search	15	[1, 2, 4, 6, 8, 10, 14, 15, 17, 19, 22, 26, 29, 30, 37]
LOF	PIO-Hill_Climbing	15	[1, 2, 4, 6, 7, 8, 11, 14,16, 26, 27, 35, 36, 38, 42]
LOF	PIO-Tabu_Search	14	[1, 2, 4, 6, 7, 8, 11, 16, 26, 27, 35, 36, 38, 42]

Table 7. The selected features from [32] dataset by LS_PIO based on different classifiers and local search methods.

Classifier	Method	# of Features	Set of Selected Features
iForest	PIO-Hill_Climbing	6	[6, 7, 9, 11, 14, 15]
iForest	PIO-Tabu_Search	5	[6, 7, 9, 11, 14]
OC-SVM	PIO-Hill_Climbing	8	[0, 1, 2, 7, 9, 11, 13, 15]
OC-SVM	PIO-Tabu_Search	7	[0, 2, 7, 9, 11, 13, 15]
LOF	PIO-Hill_Climbing	7	[0, 3, 7, 11, 12, 13, 15]
LOF	PIO-Tabu_Search	6	[3, 7, 11, 12, 13, 15]

Table 8. The evaluation results for LS_PIO using DNS_UNSW-NB15 dataset.

Approach/ Classifier	PIO-Hill_Climbing				PIO-Tabu_Search
Approach/ Classifier	TPR	FPR	Accuracy	F-Score	TPR	FPR	Accuracy	F-Score
iForest	0.99	0.00	0.98	0.97	0.99	0.00	0.987	0.96
OC-SVM	0.98	0.00	0.99	0.98	0.99	0.00	0.99	0.99
LOF	0.88	0.32	0.86	0.84	0.003	0.32	0.87	0.86

Table 9. The evaluation results for LS_PIO using labeled DNS exfiltration dataset [32].

Approach/ Classifier	PIO-Hill_Climbing				PIO-Tabu_Search
Approach/ Classifier	TPR	FPR	Accuracy	F-Score	TPR	FPR	Accuracy	F-Score
iForest	0.58	0.49	0.56	0.52	0.59	0.48	0.58	0.54
OC-SVM	0.98	0.08	0.97	0.96	0.99	0.07	0.98	0.98
LOF	0.90	0.07	0.89	0.87	0.92	0.07	0.90	0.90

Table 10. The evaluation results using our generated dataset using all features.

Dataset	Technique	TPR	FPR	Acc (Train)	Acc (Validate)	Acc (Test)	F-Score	AUC
Our dataset	iForest	0.98	0.003	0.97	0.97	0.96	0.97	0.96
	OC-SVM	0.99	0.000	0.99	0.99	0.99	0.99	0.98
	LOF	0.95	0.007	0.94	0.92	0.93	0.92	0.93

Table 11. Comparison between different DNS tunneling techniques using the labeled DNS exfiltration dataset [32].

Reference	Method	Accuracy	F-Score
[32]	Multi-label feed-forward neural networks	0.84	-
[34]	K-nearest neighbor (KNN)	0.94	0.94
Proposed approach	OC-SVM	0.98	0.98

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

© 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
