**2. Preliminaries**

#### *2.1. Network Monitoring*

The internet is based on a protocol suite developed by the Defense Advanced Research Projects Agency (DARPA). The idea of a distributed topology with a packet-switched network was described retrospectively by its author, Baran, in [4]. With the rapid growth of the internet at the end of the 20th century, the necessity for network accounting and monitoring also arose. Almost in parallel to network traffic profiling for accounting reasons, frequent and large-scale network attacks led to an increased need for developing techniques for analyzing network traffic. In the design philosophy of the DARPA internet protocols [5], Clark explained *flow* as connected with the necessity for intermediary network devices to treat packets differently, according to the type of service demanded by the endpoint applications. In this way, particular packets belonging to the same connection can be distinguished. According to the basic principles of packet-switched networks, each datagram from a network connection can take a different route. In the beginning, the original ARPANET host-to-host protocol provided flow control based on both bytes and packets. Later, however, for efficiency reasons, only the byte count was used for acknowledgments.

Later, at the beginning of the 1990s, Mills et al. proposed internet accounting [6]. Network accounting introduced packet aggregation based on flows, using packet header information. Then, the idea was developed to use real-time traffic flow measurement (RTFM) [7]. Claffy et al. proposed a methodology for profiling traffic flows on the internet for communication analysis [8].

Then, with the implementation of the NetFlow [9] protocol, the traditional understanding of an IP flow was defined as a set of five to seven IP packet attributes flowing in a single direction. When a TCP session is considered, a flow consists of all packets transmitted until this session terminates. NetFlow uses, among others, the following IP packet attributes: IP source address, IP destination address, source port, destination port, layer 3 protocol type, type of service, and router or switch interface. All packets with the same above-mentioned attributes are grouped into a flow, and then packets and bytes are counted. With the introduction of the IP flow information export (IPFIX) protocol, the number of flow attributes, named IPFIX information elements, increased to several hundred. RFC 7011 explains that [10]: "(...) A Flow is defined as a set of packets or frames passing an Observation Point in the network during a certain time interval. All packets belonging to a particular Flow have a set of common properties. Each property is defined as the result of applying a function to the values of:


In the studied papers, we have found different usages of the term *flow* with several nouns, such as traffic, packet, data, and IP packets, which can mislead readers. Therefore, we have decided to unify the network traffic terms for this survey with the following definitions:


Traffic data can be collected from an observation point with a hardware or software solution. Written in C, the open-source library *Libpcap* (see: https://www.tcpdump.org/, accessed on: 2 July 2021) is available for different platforms. This library delivers an application programming interface (API), which can be used by capturing software, e.g., tcpdump or Wireshark. Traffic data collected with the libpcap library can be saved in the pcap file format and used to create a dataset for analytics and classification.
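As a concrete illustration, the 24-byte global header of a classic pcap file can be parsed with the Python standard library alone. This is a minimal sketch of the little-endian pcap file format, not a use of libpcap itself; the function and constant names are ours:

```python
import struct

# Layout of the 24-byte pcap global header (little-endian variant):
# magic number, major/minor version, GMT offset, timestamp accuracy,
# snapshot length, link-layer type.
PCAP_GLOBAL_HEADER = struct.Struct("<IHHiIII")
PCAP_MAGIC_LE = 0xA1B2C3D4

def parse_pcap_header(buf: bytes) -> dict:
    """Parse the global header of a classic (non-pcapng) pcap capture file."""
    magic, major, minor, _thiszone, _sigfigs, snaplen, linktype = \
        PCAP_GLOBAL_HEADER.unpack(buf[:24])
    if magic != PCAP_MAGIC_LE:
        raise ValueError("not a little-endian pcap file")
    return {"version": (major, minor), "snaplen": snaplen, "linktype": linktype}
```

The packet records that follow the global header each carry their own 16-byte header with a timestamp and capture length; a full reader would iterate over those in the same fashion.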

Depending on the available datasets (see Section 2.4) and implemented machine learning algorithms, traffic datasets can directly feed the chosen *CNN-based deep learning model* (CDM) or may have to be pre-processed according to several paths, as is presented in the upper part of the workflow diagram in Figure 2. Different possible side paths—raw traffic processing or filtering—are indicated with blue arcs.

The straightforward path is with only trimming or padding block. In the case of using only internet raw traffic for a machine (deep) learning (straight line in Figure 2), there is a necessity to change the length of the original stream data into chunks to prepare these according to the input size of the chosen CDM. If the input dimension of the selected model is smaller than the stream data chunk, the latter has to be trimmed to the size of the CDM's input vector dimension. When the input stream is shorter than is expected by the CDM, the remaining part of the input vector is padded with an arbitrarily selected value—usually with zeros, to fit the suitable CDM's input vector dimension.
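The trimming-or-padding step described above can be sketched in a few lines. The helper name `fit_to_input` is ours, not taken from any surveyed paper:

```python
def fit_to_input(sample: bytes, size: int, pad_value: int = 0) -> bytes:
    """Trim a traffic chunk that is too long, or pad a short one,
    so that it matches the CDM's fixed input dimension."""
    if len(sample) >= size:
        return sample[:size]                                    # trimming
    return sample + bytes([pad_value]) * (size - len(sample))   # zero-padding
```

For instance, a 2-byte chunk fitted to a 4-byte input becomes the original bytes followed by two zeros, while a 5-byte chunk fitted to 3 bytes keeps only its first 3 bytes.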

Following two side paths—alternatively: flows or sessions—requires grouping traffic data according to the same properties. Then, the flows or sessions' data must be trimmed or padded, again as in raw traffic, to fit the suitable CDM's input vector dimension.
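The grouping step can be illustrated as follows. This is a sketch using hypothetical packet dictionaries; the field names (`src`, `sport`, etc.) are our assumptions, not a surveyed implementation. A flow key is directional (the classic 5-tuple), while a session key maps both directions of a connection to the same group:

```python
from collections import defaultdict

def flow_key(pkt):
    """Unidirectional flow key: the classic 5-tuple."""
    return (pkt["src"], pkt["dst"], pkt["sport"], pkt["dport"], pkt["proto"])

def session_key(pkt):
    """Bidirectional session key: both directions map to the same tuple."""
    a = (pkt["src"], pkt["sport"])
    b = (pkt["dst"], pkt["dport"])
    return (min(a, b), max(a, b), pkt["proto"])

def group(packets, key):
    """Group packets by the given key function."""
    groups = defaultdict(list)
    for pkt in packets:
        groups[key(pkt)].append(pkt)
    return dict(groups)
```

With this sketch, a request and its reply fall into two different flows but into a single session.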

An alternate path for flow or session data can lead through payload extraction of selected layers (L2, L3, L4+). In intrusion detection systems (IDS), such processing is called deep packet inspection (DPI). Then again, the extracted payloads must be trimmed or padded to fit the suitable CDM's input vector dimension.

Finally, the traffic data samples become an input for the machine learning models described in more detail in Section 2.2.

**Figure 2.** Workflow diagram of network traffic classification using deep convolutional neural networks with various data transformation paths.

#### *2.2. Convolutional Neural Networks*

Deep neural networks have recently become one of the hottest methodologies applied in machine learning and pattern recognition. They provide machine learning models that surpass previous approaches. Thanks to the rapid development of computing resources and the common usage of relatively cheap parallel-computing platforms, previously long-lasting machine learning tasks have become available to everyone. Among the most popular types of deep networks are convolutional neural networks (CNN) [18]. They are based on convolution operators, the weights of which are subject to learning. Due to the multidimensional nature of convolution, CNNs have gained enormous popularity in image processing and analysis. Their history starts with LeNet [19] by LeCun et al., which was a breakthrough in image pattern recognition. The CNN-based approach considerably surpassed the previous methods for classifying hand-written digits (the MNIST dataset). It became possible because the neural network, in this case, is responsible not only for classifying the data samples (as typical neural nets did up to that time) but also for extracting features. In particular, the convolutional layers perform this task. In the case of conv-nets, the typical structure of the recognition scheme consists of two parts. The first one is a set of consecutive convolution layers stacked alternately with pooling layers. Convolution layers are responsible for extracting data features, while pooling layers reduce the data size. The combination of feature extraction with size reduction allows for detecting data features at increasing scales. Finally, if necessary, the output of the convolution and pooling layers is flattened to obtain a final feature vector. Its further processing is a typical classification task, usually based on a structure resembling (or sometimes equal to) the multilayer perceptron (MLP classifier). Contrary to convolutional layers, in classification layers all neurons located at a given layer are connected to all those in the next one. Because of that, they are called fully connected (FC) or dense layers. The combination of the CNN and FC layers constitutes the complete classification framework. In many papers, the name CNN is extended to the complete neural model consisting of both parts, the actual CNN and the FC. However, formally speaking, it should rather be used exclusively for the first—feature extraction—part of the model. An example of such a network is shown in Figure 3. The diagram shows LeNet, consisting of the two parts mentioned above. The feature extraction part inputs the 32 × 32 image, consists of the layers convolution (conv 1), pooling (pool 1), convolution (conv 2) and pooling (pool 2), and outputs a vector of 400 data features. The classification part consists of three fully connected layers: the first with 120 neurons, the second with 84 neurons, and finally, the third with ten neurons.
The number of neurons equals the number of output classes, which, in this case, is the number of possible digits that might appear in the input image. For the sake of simplicity, we use shortcuts for the principal layers of the neural model: C—convolutional layer, P—pooling layer and FC—fully connected layer. The LeNet structure may thus be coded as C|P|C|P|FC|FC|FC.

**Figure 3.** Architecture of the LeNet neural network (C|P|C|P|FC|FC|FC).
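The layer sizes quoted above can be verified with a short calculation. This sketch uses the standard output-size formula for a "valid" convolution; the helper names are ours:

```python
def conv2d_out(size, kernel, stride=1, padding=0):
    """Spatial output size of a square convolution layer."""
    return (size + 2 * padding - kernel) // stride + 1

def lenet_feature_count(input_size=32):
    """Trace the C|P|C|P feature-extraction part of LeNet."""
    s = conv2d_out(input_size, kernel=5)   # conv 1: 32 -> 28
    s = s // 2                             # pool 1: 28 -> 14
    s = conv2d_out(s, kernel=5)            # conv 2: 14 -> 10
    s = s // 2                             # pool 2: 10 -> 5
    return s * s * 16                      # 16 feature maps -> 5 * 5 * 16 = 400
```

Running `lenet_feature_count()` reproduces the 400-element feature vector that feeds the FC|FC|FC classification part.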

In the classic case of image input data, the number of dimensions equals 2 for gray-level images and 3 for color ones. In the first case, it is a data array whose sizes equal the image sizes. In the second one, such a structure is tripled and consists of three planes of the size of an image, each of which represents one color component (in most cases red, green, and blue). The size of the third dimension, therefore, equals 3. Although the above 2D and 3D structures are mostly used in the digital image domain, a 1D input is also possible. In such a case, the convolution in at least the first layer is a 1D convolution.

Following the enormous growth in popularity of the CNN structures, they started to be applied in many domains other than vision systems. One of these domains was the categorization of the IP network traffic. In this case, the input data to be classified are samples of the network traffic. The resulting classes, in turn, are related to the types of traffic.

There are several ways of preparing the traffic data to obtain valuable input for the machine (deep) learning model, described in detail in Section 2.1 (see also Figure 2). One of the possible preprocessing methods makes use of traffic features. These features are, however, different from the data features extracted by a CNN. The former are intentionally and carefully selected features of particular meaning: either properties of the traffic (e.g., IP address, port) or some statistics (e.g., number of bytes, packets). The data features, in turn, are automatically selected numbers derived from the original data vector that makes up the input of the learning model.

In traffic analysis applications, the input data are one-dimensional time series consisting of consecutive transmitted bytes. Following the various dimensionalities of possible CNN inputs (1-, 2- or 3D), one may find in traffic analysis several solutions that either keep the original 1D nature of the traffic data or increase the number of dimensions. The 1D CNN solutions consist of 1D convolution filters, at least at the input layer. The 2D solutions add the second dimension by using, in the vast majority of cases, one of two approaches: traffic wrapping or one-hot encoding. The 3D solutions either exploit more sophisticated wrapping or combine both techniques. A schematic diagram showing the data flow in each case is shown in the lower part of Figure 2.

Independently of the method used to add the second dimension of data, the input data for the machine learning model should consist of equal-sized data samples. To obtain such samples, data trimming (for samples originally too long) or padding (for those that are too short) is usually performed (see Section 2.1 for details).

The *traffic wrapping* cuts the data sample consisting of *n* bytes into *n*2 pieces of the same length *n*1. Values of *n*1 and *n*2 are chosen in such a way that *n*1 · *n*2 = *n*. In the output data 2D array, each data value has not only neighbors that were transmitted just before and just after (these are horizontal neighbors in the 2D array), but also has vertical neighbors that, coming back to the original 1D data sample, are equivalent to the data values that appeared at a certain time before and the same time after the current data value. For example, if the data sample of size *n* consists of bytes, the *t*-th byte has two direct horizontal neighbors, *t* − 1 and *t* + 1, and two direct vertical ones, *t* − *n*1 and *t* + *n*1. The traffic wrapping is shown in Figure 4. This approach performs in a way that may be called linear stacking and is applied in all but one among the studied approaches. Several atypical approaches to 1D to 2D sample mapping (diagonal, waterfall, spirals) were studied in [20].

**Figure 4.** Introducing the second dimension by wrapping the network traffic data.
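A minimal sketch of the linear-stacking wrapping and the resulting neighborhoods follows; the function name `wrap` is ours:

```python
def wrap(sample, n1):
    """Wrap a 1D byte sequence of length n = n1 * n2 into an n2 x n1
    matrix, row by row (linear stacking)."""
    n = len(sample)
    assert n % n1 == 0, "sample length must be a multiple of the row length"
    return [list(sample[i:i + n1]) for i in range(0, n, n1)]
```

After wrapping, the element that was at position *t* in the 1D sample sits at row *t* // *n*1, column *t* mod *n*1; its vertical neighbors in the matrix are exactly the bytes transmitted *n*1 positions before and after it.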

The second technique of increasing the traffic data dimensions is *one-hot encoding*. This approach replaces a numerical integer value by a binary vector such that all but one of its elements equal zero, and the unique element equals one. Such an approach is applicable to numerical variables belonging to a finite set of *m* possible values (for example, the value of a byte belongs to a set of *m* = 256 possible values). The one-hot encoder thus inputs an integer of *m* possible values and outputs a binary vector of size *m*, containing the value one at the position related to the current input value and zeros elsewhere (an alternative solution encodes an *m*-valued variable as a binary vector of size *m* − 1, where the *m*-th input value is encoded as an all-zero output). Replacing single values by vectors converts the 1D vector of integers into a 2D binary array—see Figure 5.

**Figure 5.** Introducing the second dimension using one-hot encoding.
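A sketch of the basic encoder (the function name is ours; the size *m* − 1 variant mentioned above is omitted):

```python
def one_hot(values, m=256):
    """Encode each integer value (0 .. m-1) as a binary vector of length m;
    a 1D sequence of n bytes thus becomes an n x m binary array."""
    return [[1 if v == j else 0 for j in range(m)] for v in values]
```

Each row of the resulting array contains exactly one non-zero element, which is what makes the representation sparse.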

The origin of the one-hot encoding approach is related to the observation that—in most cases—elements of the traffic data sample (single values) are not ordered, and there is no intrinsic order of the values represented by bytes. They usually represent pieces of information encoded using bytes via standardized codes. They should thus be treated as unordered categorical values rather than a set of consecutive integers. This property makes them different from, for example, image data, where pixel values are ordered—higher values of pixels represent a higher luminance. The property of having ordered input values is essential for neural network learning algorithms, which use gradient descent approaches to modify network weights iteratively.

The one-hot-encoded vector is a sparse 1D data structure with a length equal to the number of the encoded variable's possible values. Making it shorter is possible using another classic trick—the embedding technique. It produces a shorter vector of a given length, carrying possibly the same amount of information as the one-hot-encoded input. The vector embedding is performed using a fully connected layer that takes a binary one-hot-encoded vector as the input and produces a shorter embedding vector, further processed by the convolution layer(s).
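Because the input is one-hot, the fully connected embedding layer reduces to selecting one row of its weight matrix, which the following sketch demonstrates (the names and the toy weight matrix are ours):

```python
def embed(one_hot_vec, weights):
    """Multiply a one-hot row vector (length m) by an m x d FC weight
    matrix; since only one input element is 1, this picks one weight row."""
    return [sum(x * w for x, w in zip(one_hot_vec, col))
            for col in zip(*weights)]

# Toy m = 3, d = 2 weight matrix: embedding the one-hot vector for value 1
# simply returns the second row of the matrix.
W = [[1.0, 2.0],
     [3.0, 4.0],
     [5.0, 6.0]]
```

In practice the weight matrix is learned together with the rest of the model, so the lookup rows become dense, trainable representations of the byte values.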

The architecture of the neural models consists of classic convolution, pooling and fully connected layers. It also often includes mechanisms typical of other deep neural models, such as regularization (mostly drop-out) preventing overfitting, or softmax output normalization that allows for interpreting the output of the model as probabilities.

In many neural approaches to network traffic analysis, pre-trained CNN-based models are used. They gained popularity in the image analysis domain due to their effectiveness and ability to work as backbones in many image analysis fields. In their case, the transfer learning approach is most widely used, where the image-pre-trained model is adapted to the network-traffic data. Pre-trained models focus on recognizing single objects located within the image and usually work on images of fixed sizes. To this group belong the following well-known networks: LeNet [19], AlexNet [21], GoogLeNet/Inception [22], DenseNet [23], ResNet [24], VGG [25], XCeption [26], and MobileNet [27].

#### *2.3. Visual Aspects of the Traffic Data*

The visualization of the network traffic is one of the classic approaches to traffic monitoring. The most traditional way is visualizing network structure as graphs where nodes and edges represent the network topology. A routing graph is a typical example of such visualization. However, this is just one of the possible network visualizations. Along with developing the internet and constantly increasing abilities to process traffic data, data visualization techniques have always played a significant role in this field. There have been many contributions in this field since the first editions of the Visualization for Cyber Security (VizSec) forum [28,29]. To perform meaningful visualizations, in some cases, authors use data reduction methods, e.g., PCA for dimensionality reduction [30].

Thanks to transforming the 1D time series of the original traffic data into 2D and 3D matrices, one may look at network traffic as digital images [31]. The single elements of the traffic data samples—which, in most cases, are simply bytes—play the role of pixels. The luminance of a pixel refers to the value of a particular element/byte, where higher byte values are represented by lighter pixels. The 1D traffic data converted into higher-dimensional data samples of a fixed length may be displayed as binary, gray-level, or color images. In the first case, the input must be binary. One uses this type of traffic-to-image transformation in the case of one-hot-encoded network traffic. In the case of gray-level images, one usually applies wrapping techniques. The resulting gray-level image looks like an image of a texture, including either irregular or regular patterns. In rarer cases, the resulting image is a color one, which is a 3D data structure. The third dimension has a fixed size of 3, due to the number of planes referring to the three color components. Each of them is a gray-level image with the luminance value associated with the intensity of a particular component.

Interpreting the network-traffic samples as images allows for directly applying image-processing techniques to this type of, initially, non-image data. Such techniques have been used, e.g., for detecting anomalies in internet traffic [32,33].

Because 2- and 3D CNN-based neural models were initially developed to process digital images, image representation of traffic has become an obvious visualization method in the CNN-based neural models. Since the ready-to-use neural backbone models are designed to process the input data of a fixed size, the size of the traffic data sample must become compliant with the input image size.

This fact is noticeable in many network models where the size of the traffic data sample equals the size of the input of a neural model initially developed to process images of particular sizes. A typical example of such a strict dependence between the traffic data sample and the input of a pre-trained backbone is a sample size of 784, which appears in many approaches. It also forces the 2D input of the neural model to be a square array whose edge length equals 28 (28 × 28 = 784)—see [34]. Such a choice is not motivated by particular properties of the network traffic, but by the LeNet neural model, which was originally used to recognize hand-written digits on square bitmaps of size 28 × 28. Examples of images of network traffic processed using the method of [34] are shown in Figure 6. The grayscale images are built from matrices using flow wrapping. Black pixels stand for the hexadecimal value 0x00, and white ones for 0xff. One may see that different samples of the same traffic (rows) look similar, while images derived from different types of internet traffic differ from one another (columns).

**Figure 6.** Traffic visualizations of trojan Zeus (the first column), Skype (the second column), Outlook (the third column), backdoor Htbot (fourth column) and botnet Virut (fifth column). Images were created by the authors with the advantage of the tool introduced in [34] on the USTC-TFC2016 dataset.

#### *2.4. The Datasets*

Datasets play a crucial role in all machine learning methods. They are necessary to perform the learning process of classifiers. They also help compare various approaches. In the case of traffic classification, several open datasets are commonly used in the papers under study. The datasets include various types of traffic data: raw traffic, flows and features. Short characteristics of the datasets most frequently employed in the studied papers are listed in Table 1. Figure 7 shows the popularity of particular datasets in the investigated papers.

A group of 28 articles uses less popular datasets (Figure 7). In alphabetical order, these datasets are as follows:

**Table 1.** The summary of the most popular datasets used in the studied papers—sorted by the year of creation.


**Figure 7.** Popularity of the datasets within the reviewed articles. Digits in the brackets stand for the number of all occurrences of each dataset.

BoT-IoT, CAN 2017, CIC-AAGM2017, CIRA-CIC-DoHBrw-2020, CSE-CIC-IDS2018, CTU-Malware, CTU-Mixed, DARPA 1998, DARPA 1999, EDU1, ISCX-Bot-2014, ISCX Tor-nonTor, Malware Capture Facility Project (malware), MAWILab, Mirai-RGU, NIMS, NLANR AMP, NLANR MAWI, SCU-RNE, UPC Broadband Traffic Research group's dataset, VAST 2013 challenge and WRCCDC.

All the datasets contain a certain number of labeled traffic samples. Labels refer to the traffic classes. Classes always belong to one of two groups. In most cases, these groups are malware and benign traffic. One dataset, VPN-nonVPN, contains classes grouped according to the VPN connections within the frames of which the traffic was registered. For details regarding classes, see Table 2.

There are many datasets used in scientific papers for network monitoring and classification. They usually consist of real or simulated data. Some of them are described only in publications and are not available to other researchers for method evaluation. In this survey, we have selected and compared only those datasets that were used in the research described in the studied papers. A comprehensive analysis highlighting datasets utilized for IDS purposes can be found in [87]. That paper touched upon the question of the differences between pcap and NetFlow. It analyzed common datasets with respect to the occurrence of wanted (non-malware) traffic, the data format, anonymity, traffic volume, type of traffic, labeling, etc. It is crucial to point out that some of the described datasets are publicly available.

**Table 2.** The traffic details of the most popular datasets. Datasets are sorted by the number of occurrences in articles.


Given a dataset, in the case of network traffic classification, one follows the classic machine learning workflow. The dataset is divided into training and testing sets. The former is used to train the neural model, while the latter is used to test it. However, this workflow is preceded by transforming the network traffic data into data samples that the neural model may use. Finally, the evaluation of the results is usually performed using typical measures: precision, recall, accuracy, and F1-score.
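The evaluation measures mentioned above can be computed per class as follows. This is a self-contained sketch, not tied to any surveyed implementation:

```python
def precision_recall_f1(y_true, y_pred, positive):
    """Per-class precision, recall and F1 score from two label lists."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)  # true positives
    fp = sum(t != positive and p == positive for t, p in pairs)  # false positives
    fn = sum(t == positive and p != positive for t, p in pairs)  # false negatives
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return precision, recall, f1
```

Accuracy, in turn, is simply the fraction of samples for which the predicted label equals the true one, computed over all classes at once.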

#### *2.5. Other Surveys*

Because the classification of computer network traffic seems to be a leading trend in the latest research, many surveys have been published that widely discuss this topic. However, each paper examines the scientific problem from a different perspective.

Identifying malware by machine learning techniques was widely investigated in [91]. The paper focused on different types of malware analysis. In addition, one can find a brief description of common malware types. Then, feature selection, classification, and clustering for malware detection were discussed. Finally, the authors mentioned trends in malware development. The survey did not focus on network traffic.

Another paper [92] was written on the subject of traffic classification for quality-of-service purposes. The article examined many machine learning methods and their advantages for anomaly and intrusion detection. It is important to highlight that the survey also discussed the practicality of the methods. Unfortunately, when it comes to CNNs, there was only one paragraph fully devoted to the history of CNNs.

Unwanted network traffic detection in the Internet of Medical Things (IoMT) was extensively discussed in [93]. Researchers analyzed types of malware attacks, architectures of the IoT environment and taxonomy of security IoT protocols. The latter focused on key management, authentication, access control and intrusion detection. The article stated that future research will be based, among others, on blockchain usage and cross-platform detection.

Work on the topic of Android malware classification was categorized in [94]. In the paper, a novel taxonomy of Android malware families was introduced. An interesting part of the paper is the list of Android malware datasets and the surveyed articles' limitations. The paper finishes with future directions.

A comprehensive review of malware analysis tools that detect and analyze malware executables is given in [95]. Besides reverse engineering tools, memory forensics, packet analysis and detection tools, online scanners and sandboxes were elaborated.

Deep learning techniques were introduced as those that can quickly solve complex problems [96]. The article highlighted the following architectures of deep neural networks (DNNs): feed-forward neural network (FNN), convolutional neural network (CNN), recurrent neural network (RNN) and generative adversarial network (GAN). One section touched on the deep learning private data frameworks. The deep learning threats and attacks, as well as defense techniques, were also examined.

A survey [1] collecting articles that propose deep learning-based models to find intrusions in network data was introduced. In the paper, one can find a taxonomy of deep learning models. In the list of research papers on supervised instance classification models for intrusion detection, there is a brief mention of [34]. The authors, among others, concluded that advances in deep learning methods are noticeable. On top of that, they stated that it is often impossible to reproduce some deep learning models due to the lack of adequate information. The authors also proposed a novel classification of four network traffic datasets.

The utilization of deep learning methods for the purposes of cybersecurity was examined in [97]. The paper singled out types of machine learning, types of deep learning and algorithms for both. Then, deep learning platforms were examined. Finally, the article outlined network attacks. CNN usage by [34] was only mentioned.

The detection of cyberattacks on the IoT infrastructure with the advantage of deep learning was widely discussed in [98]. The researchers reviewed the IoT architecture, reference models and IoT protocols. Then, they introduced threats against IoT systems and continued with intrusion detection systems (IDS). An interesting IDS taxonomy was also described in the paper. In the CNN section, they mentioned a few articles, but only [34] is related to security based on network traffic.

The next survey dealt with the development and detection trends of unwanted software [99]. In addition, the authors focused on those areas that were omitted by other surveys, e.g., advances in the creation of new types of malware.

The detection of intrusions through analysis of images generated from network traffic was outlined in [100]. The paper distinguished classical and neural network methods. The authors dwelt on deep learning models of the convolutional neural network (CNN), long short-term memory (LSTM), support vector machine (SVM) and hybrid ones. When it comes to CNN-only models, they detailed the works of [20,34,56,85]. This review points out that [101] was one of the first concepts of converting network traffic to images. That work aroused our interest. It proposed the creation of two-dimensional images based on the 4 bytes of the IP address structure. The matrix then shows the intensity of traffic in the image representation. Nevertheless, this idea does not refer explicitly to CNNs and is not covered in further sections of this survey.

A systematic literature review highlighted interesting trends in the IoT infrastructure [102]. The researchers concluded that the majority of attacks take place in the network layer. There is a mention of the most popular datasets as well as common attacks.

Network traffic classification algorithms—for instance, those based on the port number, statistical characteristics, host behaviors and deep learning—were considered in [3]. The last category encapsulated the following models: stacked autoencoder (SAE), CNN, LSTM and deep belief networks (DBN). The CNN section described only three methods, where the CNN input was transformed beforehand: from one-dimensional data to different one-dimensional data, to two-dimensional data, or to three-dimensional data.

While preparing this survey, we decided to describe nothing but the usage of CNN models for traffic classification and malware detection purposes. Contrary to [3], our article covers 91 papers, i.e., all papers on this topic written until 2021. The conclusion of the CNN chapter in [3] is that the transformation from one dimension to three dimensions is better than other transformations. We believe that it is hard to support such a hypothesis with only a few examples. On top of that, the compared examples utilized a variety of methods. Therefore, the proposed survey inherits and enhances the classification of different forms of transformations.

#### **3. Raw Traffic**

The first group of transformation methods works on the captured packets as they come in—the raw traffic. This type of data seems to be the most direct input of the CNN-based deep learning model (CDM), as its considerable merit is that no preprocessing phase is necessary. However, only four research groups decided to base their work on this type of data while crafting the CNN input. These are the 1D transformation [78] and the 2D input concepts [35,103] (see Table 3). In addition, ref. [79] proposed both a 1D approach and a 2D one.

The one-dimensional entry to CNN is a vector created from raw traffic packets. While transforming packets, Marín et al. proposed the removal of only two attributes of traffic from protocol data units (PDU), i.e., MAC and IP addresses [78]. After that, a fixed size of 1300 bytes is set. It means that all longer packets are trimmed, while smaller packets are zero-padded. In the end, each packet is labeled to be either benign or malware. Finally, vectors are given to the 6-Layer CNN, which is tested on the USTC-TFC2016 dataset. This is an unwanted traffic detection approach.

The idea of wrapping raw traffic packets into the matrix was proposed by Ko et al. to test an 11-Layer CDM [103]. Traffic originated from the EDU1 dataset. This research proposed 200 × 200 bytes images. This is a traffic classification approach.

The scientific concept of wrapping raw traffic packets was further studied by Jia et al. [35]. The traffic images were based on DARPA 1999. The authors unified the packet length so that each packet reached 784 bytes. Then, they wrapped the vector to create a matrix of size 28 × 28 bytes. The paper provided the images as an entry for LeNet [19]. The work's aim was to enhance malware detection—in particular, the detection of intrusions.

The following paper, written by Zhang et al., used two different versions of CNN input [79]. The 1300-byte vectors are given to two 10-layer CDMs. The 2D CNN obtains a traffic matrix whose size is not revealed. It is important to highlight that the vector as an entry to the 1D CNN achieves better results than the matrix given to the 2D CDM. In the proposed approach, the packets are left unchanged. The deep learning model works on raw traffic from the USTC-TFC2016 dataset to detect malware. The paper proposed two types of transformations.


**Table 3.** The summary of articles that are in the raw traffic transformation group.
