#### **1. Introduction**

Network steganography has recently gained considerable attention in the scientific community. Many new methods have been proposed, and many more are expected in the near future [1] as new network protocols continue to emerge. This paper focuses solely on the detection of steganography techniques that operate at the network protocol level.

With the growing number of devices in networks, including IoT, network steganography detection faces new challenges in terms of both accuracy and performance [2]. To be performed effectively, steganography detection needs to operate:


If detection is performed off-line or if it causes too much latency, there will be more traffic waiting to be analyzed than can actually be analyzed. Performance optimization is the main focus of the research described here since the main application of network steganography is real-time communication [3,4].

Some of the accurate detection methods tailored for specific network steganography techniques cannot be effectively implemented in real-time regimes because excessive computing and/or memory resources are needed [5]. This makes us question the *overall accuracy* of such methods since they are unable to analyze high-throughput traffic in a multi-host environment.

In this paper, we present a new method that strikes a compromise between detailed packet inspection and optimal detection performance. Our motivation is to provide a generic method that orchestrates network steganography detection in a real-time regime, making it possible to implement in multi-host environments that generate high-throughput traffic. As a part of the method, we present a steganalysis layer selection method that provides an intelligent selection of steganalysis algorithms, preserving the balance between resource consumption and detection performance. To the authors' best knowledge, this is the first generic network steganography detection method that utilizes a top-down approach for a detection method selection algorithm to ensure optimal computation resource allocation.

#### **2. Related Work**

Historically, most network steganography detection methods were developed as part of research on new steganographic techniques. In recent years, new detection methods have emerged that are not countermeasures for a particular steganographic technique but provide a broader perspective. Based on the literature, we can distinguish two major categories of network steganography detection methods: *technique-specific* and *generic*, as presented in Figure 1.

**Figure 1.** Network steganography detection classification.

The first category, technique-specific, comprises methods proposed as countermeasures for specific steganographic techniques. Methods in this category usually operate on low-level network data, require relatively large computational resources, and are not able to detect steganographic techniques other than the one or several for which they are designed.

The second category, generic, comprises methods that are not designed to detect one specific steganographic technique but offer a comprehensive approach to network anomaly detection and categorization of network traffic for potential steganographic utilization. Methods in this category may not provide detailed information on detected suspicious traffic but can label it for further investigation. Most generic methods fall into two subcategories that characterize their approach: statistical or machine learning.

The majority of methods described in the existing literature fall into the first category. Each of those methods applies to a specific category of steganographic techniques, as shown in Figure 2 [6].

**Figure 2.** Network steganography classification.

For packet modification techniques, the steganalysis methods presented so far include:


For stream modification techniques, several detection methods have been described, including:


Some generic methods for steganalysis operating on high-level aggregated metadata have been proposed:


In addition, several generic methods for steganalysis have been proposed for steganogram detection in digital media. However, these methods apply to a different range of data-hiding techniques (digital images/media) that are outside the scope of this research. Those methods include:


All told, the existing literature on network steganography detection focuses on countermeasures and methods for the detection of newly described steganography techniques rather than a generic approach, with exceptions described above.

The generic method described in this paper provides a framework for the utilization of various steganalysis methods at once. The method requires the use of other existing network steganography detection methods for optimum effectiveness. The proposed method's main novelty is providing a capability for intelligent selection of best-fit steganalysis methods for analyzed network traffic to maintain optimal resource utilization. While some of the existing methods provide a generic approach to steganography detection, none of those methods provide a unified cooperation model for utilizing other methods.

#### **3. Multilayer Network Steganography Detection**

#### *3.1. Method Description*

The core concept for our proposed method of network steganography detection is multilayer steganalysis and intelligent detection method selection based on packet classification and optimal resource utilization. We propose a top-down approach for a detection method selection algorithm as it ensures optimal computation resource allocation. In such an approach, we prefer high-layer metrics analysis over methods operating on low-level data (which would require more resources) unless high-level analyzers identify suspicious network traffic.
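The top-down idea can be illustrated with a minimal sketch. It assumes three layers, with layer 3 the cheapest and layer 1 the most expensive, and uses three toy detectors (`many_ports`, `unusual_ttl`, `reserved_bits_set`) that are invented here purely for illustration; they are not steganalysis methods from the literature. The packet records are plain dictionaries with hypothetical field names.

```python
from typing import Callable, Dict, List, Sequence

Packet = dict  # hypothetical flat packet record
Detector = Callable[[Sequence[Packet]], bool]  # True means "suspicious"


def top_down_analyze(packets: Sequence[Packet],
                     detectors: Dict[int, Detector]) -> List[int]:
    """Run detectors from the highest (cheapest) layer toward layer 1.

    Returns the layers that were executed, in order. Descent stops at the
    first layer whose detector finds nothing suspicious.
    """
    executed = []
    for layer in sorted(detectors, reverse=True):
        executed.append(layer)
        if not detectors[layer](packets):
            break
    return executed


# Toy detectors, invented for this sketch only.
def many_ports(packets):         # layer 3: aggregated port utilization
    return len({p["dst_port"] for p in packets}) > 3


def unusual_ttl(packets):        # layer 2: per-packet header field check
    return any(p["ttl"] not in (64, 128, 255) for p in packets)


def reserved_bits_set(packets):  # layer 1: low-level bit inspection
    return any(p["reserved_bits"] != 0 for p in packets)


DETECTORS = {3: many_ports, 2: unusual_ttl, 1: reserved_bits_set}

benign = [{"dst_port": 443, "ttl": 64, "reserved_bits": 0} for _ in range(5)]
covert = [{"dst_port": 1000 + i, "ttl": 63, "reserved_bits": 1} for i in range(5)]

benign_layers = top_down_analyze(benign, DETECTORS)
covert_layers = top_down_analyze(covert, DETECTORS)
```

In this sketch the benign batch is dismissed after the cheapest layer alone, while the covert batch is escalated through every layer, so the expensive low-level inspection is spent only on traffic that higher layers already consider suspicious.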

As shown in Figure 3, the first step is a packet capture (101), which acquires a single network packet from a hardware resource, such as a network card. The next step is feature extraction (102), which is the first stage of building a data model. Extracted features may include protocol headers and other derived data that can be calculated in near real-time. Extracted features serve as an input for metrics aggregation (103) and steganalysis layer selection (104). Metrics aggregation modules provide derived metrics operating on various aggregation layers. The scope of the metrics and calculation algorithms is determined by the steganalysis method(s) for which the method is to be applied. Examples of the metrics aggregation may include aggregated data counters, port utilization, etc. The main assumption for metrics aggregation is that high-layer metrics computation should consume fewer resources and take less time than the computation of low-layer metrics, as shown in Figure 4. We named the lowest-layer metrics "*1st layer* aggregated metrics" and the highest-layer metrics "*Nth layer* aggregated metrics."
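As a rough illustration of feature extraction (102) and metrics aggregation (103), the sketch below parses a raw IPv4 packet with the standard `struct` module and updates two simple counters. The function `extract_features`, the class `MetricsAggregator`, and the chosen feature names are assumptions made for this example; the paper does not prescribe a concrete feature set or data layout. The sample packet is hand-built (an IPv4 header with a zero checksum followed by an empty UDP datagram).

```python
import struct
from collections import Counter


def extract_features(raw: bytes) -> dict:
    """Parse a raw IPv4 packet (no link-layer header) into a flat dict."""
    version_ihl, _tos, total_len, _ident, _frag, ttl, proto = struct.unpack(
        "!BBHHHBB", raw[:10])
    header_len = (version_ihl & 0x0F) * 4  # IHL is counted in 32-bit words
    features = {
        "src": ".".join(str(b) for b in raw[12:16]),
        "dst": ".".join(str(b) for b in raw[16:20]),
        "proto": proto,
        "ttl": ttl,
        "length": total_len,
    }
    # TCP (6) and UDP (17) both start with source and destination ports.
    if proto in (6, 17) and len(raw) >= header_len + 4:
        src_port, dst_port = struct.unpack(
            "!HH", raw[header_len:header_len + 4])
        features["src_port"] = src_port
        features["dst_port"] = dst_port
    return features


class MetricsAggregator:
    """Keeps cheap running counters derived from extracted features."""

    def __init__(self):
        self.bytes_per_host = Counter()  # aggregated data counter
        self.port_usage = Counter()      # destination port utilization

    def update(self, features: dict) -> None:
        self.bytes_per_host[features["src"]] += features["length"]
        if "dst_port" in features:
            self.port_usage[features["dst_port"]] += 1


# Hand-built sample: 10.0.0.1:5353 -> 10.0.0.2:53 over UDP, TTL 64.
ip_header = struct.pack("!BBHHHBBH4s4s", 0x45, 0, 28, 0, 0, 64, 17, 0,
                        bytes([10, 0, 0, 1]), bytes([10, 0, 0, 2]))
udp_header = struct.pack("!HHHH", 5353, 53, 8, 0)
sample = ip_header + udp_header

features = extract_features(sample)
aggregator = MetricsAggregator()
aggregator.update(features)
```

Counters of this kind are cheap to maintain per packet, which fits the assumption that high-layer metrics should cost less than low-layer ones.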

The calculated metrics and features extracted from each packet serve as input for steganalysis layer selection (104), which determines the optimal steganalysis layer. We discuss the steganalysis layer selection in Section 3.2.

**Figure 4.** Aggregated Metrics hierarchy.

The Steganography Detection module (105) comprises multiple steganalysis methods. Each steganalysis method is assigned to a specific layer, based on the method's complexity and, in particular, on its resource utilization. Given a maximum of *N* layers of steganalysis methods, and a function *L(m)* defining real-time operating resource consumption for each method *m* belonging to the set of methods *M*, the following is assumed:

$$\forall m \in M(L(m) < L(m-1)), \text{ provided that } N \ge m > 1 \tag{1}$$

In other words, steganalysis methods in higher layers require fewer resources to effectively detect network steganography in the real-time regime. Steganography detection methods in each layer may, but do not have to, operate on corresponding aggregated metrics layers.
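Condition (1) can be checked mechanically once a cost figure is available for each layer. The sketch below assumes one measured cost per layer (for example, mean CPU time per packet), stored in a list whose first entry belongs to layer 1; the function name `satisfies_layer_ordering` and the sample numbers are illustrative only.

```python
from typing import Sequence


def satisfies_layer_ordering(costs: Sequence[float]) -> bool:
    """Check condition (1): L(m) < L(m-1) for every layer m with N >= m > 1.

    costs[0] is the cost of layer 1 (the lowest layer), costs[-1] the cost
    of layer N (the highest layer).
    """
    return all(costs[i] < costs[i - 1] for i in range(1, len(costs)))


# Made-up per-layer costs, layer 1 first.
valid = satisfies_layer_ordering([9.0, 4.0, 1.5])
invalid = satisfies_layer_ordering([9.0, 4.0, 4.0])
```

The inequality is strict, so two layers with equal cost violate the assumption, as in the second call.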

The result of the performed multilayer steganalysis is provided to the steganalysis layer selection module to update the classification rules.

#### *3.2. Steganalysis Layer Selection*

The performance of our proposed method relies on the accuracy of the steganalysis layer selection algorithm and its parameters. In order to achieve better results, the algorithm should be tailored to fit specific performance requirements and at least the anticipated types of steganography technique. We suggest the following selection method, which should suffice for most applications.

As shown in Figure 5, the steganalysis layer selection method can operate in two modes:

1. Rule learning;
2. Packet classification.

**Figure 5.** Steganalysis layer selection method.

In the first mode, the method applies various machine learning algorithms for frequent pattern mining, classification, and clustering to the steganalysis result (204) provided by the layered steganalysis module, computed anomaly scoring (205), and aggregated metrics (201). Learned rules are stored in memory (203) for the anomaly scoring module and packet classification.
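To make the rule learning mode more concrete, the sketch below uses a deliberately simple frequency count as a stand-in for the pattern mining, classification, and clustering algorithms mentioned above. It assumes that each steganalysis result is reduced to a pair of a traffic class, here a hypothetical `(protocol, destination port)` tuple, and a detection flag. The `threshold` parameter and the rule format (traffic class mapped to a starting layer) are likewise assumptions of this example.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple

TrafficClass = Tuple[int, int]  # (IP protocol number, destination port)


def learn_rules(results: Iterable[Tuple[TrafficClass, bool]],
                n_layers: int,
                threshold: float = 0.2) -> Dict[TrafficClass, int]:
    """Map each traffic class to the layer at which analysis should start.

    Classes whose past detection ratio reaches the threshold are sent
    straight to layer 1 (the most detailed analysis); all other classes
    start at layer N (the cheapest analysis).
    """
    seen = defaultdict(int)
    flagged = defaultdict(int)
    for traffic_class, detected in results:
        seen[traffic_class] += 1
        if detected:
            flagged[traffic_class] += 1

    rules = {}
    for traffic_class, total in seen.items():
        ratio = flagged[traffic_class] / total
        rules[traffic_class] = 1 if ratio >= threshold else n_layers
    return rules


# Made-up steganalysis history: UDP/53 was flagged once, TCP/443 never.
history = [
    ((17, 53), True),
    ((17, 53), False),
    ((6, 443), False),
    ((6, 443), False),
    ((6, 443), False),
]
rules = learn_rules(history, n_layers=3)
```

A production rule learner would replace the frequency count with the algorithms of choice; the point here is only the data flow from steganalysis results to stored rules.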

In the second mode, the layer selection method receives a packet's extracted features (206) to classify the packet (207) for the selection of the optimal steganalysis layer (208). Packet classification (207) operates on previously learned rules and may use various classification methods and metrics, including but not limited to network address classification, network protocol classification, and TCP/UDP port classification.
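The packet classification mode might look like the following first-match rule lookup, combining the three criteria named above: network address, protocol, and TCP/UDP port. The `Rule` structure, the `select_layer` function, and the sample rules are hypothetical; the feature dictionary keys are also assumed for this example.

```python
import ipaddress
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Rule:
    """One learned rule; a field left as None matches any value."""
    layer: int
    network: Optional[str] = None   # CIDR block for the source address
    protocol: Optional[int] = None  # IP protocol number (6 = TCP, 17 = UDP)
    dst_port: Optional[int] = None

    def matches(self, features: dict) -> bool:
        if self.network is not None:
            src = ipaddress.ip_address(features["src"])
            if src not in ipaddress.ip_network(self.network):
                return False
        if self.protocol is not None and features.get("proto") != self.protocol:
            return False
        if self.dst_port is not None and features.get("dst_port") != self.dst_port:
            return False
        return True


def select_layer(features: dict, rules: List[Rule], default_layer: int) -> int:
    """Return the layer of the first matching rule, else the default layer."""
    for rule in rules:
        if rule.matches(features):
            return rule.layer
    return default_layer


# Made-up rules: DNS from one subnet goes to layer 1, plain HTTP to layer 2.
rules = [
    Rule(layer=1, network="10.0.0.0/24", protocol=17, dst_port=53),
    Rule(layer=2, protocol=6, dst_port=80),
]

dns_layer = select_layer({"src": "10.0.0.7", "proto": 17, "dst_port": 53}, rules, 3)
web_layer = select_layer({"src": "192.168.1.4", "proto": 6, "dst_port": 80}, rules, 3)
other_layer = select_layer({"src": "192.168.1.4", "proto": 6, "dst_port": 443}, rules, 3)
```

Traffic that matches no rule falls back to the highest layer, which keeps the default path as cheap as possible.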

The selection and application of specific algorithms for frequent pattern mining, classification, and clustering utilized by the rule learner module (202) are beyond the scope of this research work as they are widely discussed in the literature [21,22]. However, we recommend the *k-means clustering* for mining a predefined number of clusters of network devices, the *FP-growth* algorithm for frequent pattern mining, and an optimized SVM (Support-Vector Machine) trainer [23] for classification.
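For readers unfamiliar with k-means, the following is a minimal standard-library sketch of Lloyd's algorithm applied to network devices. The two features per device (packets per second and number of distinct destination ports), the sample values, and the deterministic initialization from the first `k` points are all assumptions made for illustration; a real deployment would more likely rely on an established machine learning library and a better initialization strategy.

```python
import math
from typing import List, Sequence, Tuple

Point = Tuple[float, ...]


def kmeans(points: Sequence[Point], k: int, iterations: int = 20) -> List[int]:
    """Plain Lloyd's algorithm; returns a cluster index for each point."""
    centroids = [points[i] for i in range(k)]  # deterministic initialization
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        labels = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:
                centroids[c] = tuple(
                    sum(dim) / len(members) for dim in zip(*members))
    return labels


# Made-up devices: (packets per second, distinct destination ports).
devices = [
    (10.0, 2.0), (12.0, 3.0), (11.0, 2.0),        # quiet hosts
    (500.0, 40.0), (520.0, 42.0), (510.0, 41.0),  # busy hosts
]
labels = kmeans(devices, k=2)
```

With this data the quiet and busy hosts end up in separate clusters, which could then be treated as distinct device groups by the rule learner.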
