3.1. Motivations
Different deep learning-based detectors have been proposed [3,20,26]. As one of the most popular algorithms in deep learning, the convolutional neural network (CNN) is widely applied in these detectors. Since a CNN can automatically learn features from training samples, these detectors directly take a binary executable file as input and classify it. In our work we focus on how to generate adversarial samples that can evade CNN-based malware detectors. The problem of generating adversarial malware samples can be formalized as follows.
An executable is represented as a sequence of binary bytes $x = (x_1, x_2, \ldots, x_L)$, where each $x_i$ is between 0 and 255 and $L$ is the length of an executable. In our work we set $L = 2{,}000{,}000$. If the length of an executable is less than $L$, zeros are padded at the end of the file. The malware detector is denoted as $f(x;\theta)$, where $\theta$ denotes the parameters of the detector, and $f(x;\theta)$ outputs the probability that $x$ is malware. If $f(x;\theta) > 0.5$, $x$ is classified as malware; otherwise, $x$ is classified as benign.
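To make this representation concrete, the following minimal Python sketch loads a PE file as the padded byte sequence defined above. The function name and the truncation convention for over-length files are ours; the text above only specifies zero-padding.

```python
import numpy as np

L = 2_000_000  # maximum input length used in this work

def load_executable(path: str) -> np.ndarray:
    """Read a PE file as a byte sequence x = (x_1, ..., x_L), x_i in [0, 255]."""
    with open(path, "rb") as f:
        data = np.frombuffer(f.read(), dtype=np.uint8)
    if len(data) >= L:
        return data[:L]  # truncate over-length files (our convention)
    # pad shorter files with zeros at the end, as described above
    return np.pad(data, (0, L - len(data)))
```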
Given a malicious file which is correctly classified as malware, an adversarial sample generation method can inject carefully-selected bytes into an executable (while preserving its runtime functionality), so that the executable can be classified as benign.
Conventional methods use gradient-based algorithms to generate adversarial samples [7,16]. These approaches use the input gradient to update the injected byte values. The gradient is calculated by minimizing the classification loss function of a detector with respect to the target label. The gradient-based algorithm is iterative, and only one byte value is computed per iteration. Therefore, the computational cost of generating an adversarial sample is high, which makes these approaches unsuitable for generating a large number of adversarial examples. The motivation of our research is to design a method that can generate adversarial samples efficiently.
3.2. Finding Data Areas Important for Classification
To evade the detection of malware detectors, we need to inject padding bytes into a source malware binary to change its category. To avoid using gradient-based algorithms to calculate the values of injected padding bytes, the padding bytes we use are the byte sequences extracted from benign executables. If these byte sequences can represent the characteristics of benign executables, the probability that an adversarial malware sample can fool a detector will increase. Therefore, our main task is to extract byte sequences which can represent the characteristics of benign executables.
CNN-based detectors generate explicit feature maps for input samples.
Figure 1 gives an example of the CNN convolution operation. The input data is a sequence. When we apply convolution to the input data, we mix two buckets of information. The first bucket is the input data. The second bucket is the convolution kernel, a single matrix of floating-point numbers. The output of the kernel is the transformed sequence, which is often called a feature map. Usually there are multiple convolution kernels, and each kernel outputs a feature map. Feature maps represent features of the input data at different levels. By analyzing feature maps, we can discover which features are more important for decision making, and the data corresponding to important features can be used to construct adversarial samples.
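As a toy illustration of this operation, the following snippet (with made-up numbers) slides a single kernel across a one-dimensional input to produce one feature map:

```python
import numpy as np

x = np.array([1., 3., 2., 0., 1., 4., 2., 5.])   # input sequence
kernel = np.array([0.5, -1., 0.5])                # one convolution kernel

# slide the kernel across the input (stride 1, no padding) -> one feature map
feature_map = np.array([
    np.dot(kernel, x[i:i + len(kernel)])
    for i in range(len(x) - len(kernel) + 1)
])
print(feature_map)  # one feature map; K kernels would yield K such maps
```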
The Grad-CAM [27] algorithm provides explanations for decisions made by a large class of CNN-based models. We use the Grad-CAM algorithm to evaluate the importance value of each feature map for a target class $c$. The importance value of a feature map with respect to a specific class is computed as Equation (1):

$$\alpha_k^c = \frac{1}{Z} \sum_{i=1}^{Z} \frac{\partial y^c}{\partial A_i^k} \quad (1)$$

where $A^k$ is the $k$th feature map, $A_i^k$ is the $i$th element of $A^k$, $Z$ is the number of elements of $A^k$, $c$ is a class label, and $y^c$ is the input for class $c$ in the softmax layer (the classification layer of a CNN). $\alpha_k^c$ indicates the importance of $A^k$ with respect to class $c$.
To discover the important areas of the input data for class $c$, the contributions of all feature maps need to be considered. The weighted sum of all feature maps is computed, as defined in Equation (2):

$$L^c = \mathrm{ReLU}\left(\sum_k \alpha_k^c A^k\right) \quad (2)$$

$L^c$ is called the class-discriminative localization map, which has the same size as a feature map. In (2), the ReLU function ($\mathrm{ReLU}(z) = \max(0, z)$) is applied to the linear combination of feature maps because only the features that have a positive impact on class $c$ are considered. Without the ReLU function, the localization map sometimes highlights more than just the class of interest and performs worse at localization. Each element $L^c_i$ can be seen as a feature extracted from the input data. An element $L^c_i$ with a greater value will also have a more positive impact on class $c$. We can find the data area that is important for class $c$ by mapping $L^c_i$ back to the corresponding area in the input.
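A compact PyTorch sketch of Equations (1) and (2) might look as follows. It assumes the feature maps were retained from the forward pass with gradients enabled; the function name is ours:

```python
import torch
import torch.nn.functional as F

def grad_cam_1d(feature_maps: torch.Tensor, class_score: torch.Tensor) -> torch.Tensor:
    """
    feature_maps: (K, Z) tensor A, kept from a convolutional layer's forward
                  pass (must be part of the autograd graph)
    class_score:  scalar y^c, the pre-softmax score of the target class
    Returns the class-discriminative localization map L^c of length Z.
    """
    # Equation (1): alpha_k = (1/Z) * sum_i dy^c / dA^k_i
    grads = torch.autograd.grad(class_score, feature_maps, retain_graph=True)[0]
    alphas = grads.mean(dim=1)                      # one weight per feature map
    # Equation (2): L^c = ReLU(sum_k alpha_k * A^k)
    return F.relu((alphas[:, None] * feature_maps).sum(dim=0))
```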
3.3. Generating Adversarial Examples
In reality, the structure and parameters of a malware detector are unknown. In order to obtain the feature maps, we have to create a pseudo detector that simulates the true detector. MalConv [3] is a typical CNN-based detector, and we select the MalConv network as the pseudo detector in our work. The network structure of MalConv is shown in Figure 2.
We regard an executable (PE file) as a byte stream. The input of MalConv is a fixed-length byte sequence from a PE file. If an executable is shorter than the fixed length, zeros are appended at its end. In MalConv, the first layer is an embedding layer, where each byte of an input sequence is converted into an 8-dimensional embedding vector. MalConv has two parallel convolutional layers: the embedding vectors are fed to the two one-dimensional convolutional layers, each of which generates feature maps. The next layer is a temporal max pooling layer, which combines the outputs of the two convolutional layers and passes them to a fully connected layer and a softmax layer for classification.
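A minimal PyTorch sketch of this structure is given below. Hyperparameters follow the settings described later in this section (kernel length = stride = 500, 128 kernels per layer); since the text above describes the combination of the two branches only loosely, this sketch uses the gating multiplication from the original MalConv paper:

```python
import torch
import torch.nn as nn

class MalConv(nn.Module):
    """Sketch of the MalConv structure described above."""
    def __init__(self, emb_dim=8, kernels=128, width=500):
        super().__init__()
        self.embed = nn.Embedding(256, emb_dim)   # one vector per byte value
        self.conv1 = nn.Conv1d(emb_dim, kernels, width, stride=width)
        self.conv2 = nn.Conv1d(emb_dim, kernels, width, stride=width)
        self.fc = nn.Linear(kernels, 2)           # benign / malware

    def forward(self, x):                         # x: (batch, input_len) bytes
        e = self.embed(x.long()).transpose(1, 2)  # (batch, emb_dim, input_len)
        # original MalConv gates one convolution branch with the other
        g = self.conv1(e) * torch.sigmoid(self.conv2(e))
        pooled = g.max(dim=2).values              # temporal max pooling
        return torch.softmax(self.fc(pooled), dim=1)
```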
In our paper, we use Equation (1) to calculate the importance value of each feature map with respect to class $c$, denoted as $\alpha_{k,l}^c$, which is the importance value of the $k$th feature map generated by the $l$th convolutional layer ($l \in \{1, 2\}$). MalConv has two parallel convolutional layers. We normalize $\alpha_{k,l}^c$ over each convolutional layer independently, as shown in Equation (3):

$$\hat{\alpha}_{k,l}^c = \frac{\alpha_{k,l}^c}{\sum_{k'} \alpha_{k',l}^c} \quad (3)$$
The class-discriminative localization map is calculated as the weighted sum of the feature maps generated by the two parallel convolutional layers, as shown in Equation (4):

$$L^c = \mathrm{ReLU}\left(\sum_{l=1}^{2} \sum_{k} \hat{\alpha}_{k,l}^c A^{k,l}\right) \quad (4)$$

Here, we set all convolution kernels to have the same size; thus, all feature maps, as well as the class-discriminative localization map, are one-dimensional vectors of the same size. Different CNN-based networks have different structures. Another key problem we must resolve is how to locate the byte sequences in a source binary file according to the class-discriminative localization map.
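The following numpy sketch combines Equations (3) and (4). Note that the exact form of the normalization in Equation (3) is our assumption (a per-layer sum normalization), since only its intent is stated above:

```python
import numpy as np

def localization_map(alphas, maps):
    """
    alphas: list of two arrays, alphas[l][k] -- importance of the k-th
            feature map of layer l, from Equation (1)
    maps:   list of two arrays, maps[l][k]   -- the feature maps themselves
    Returns L^c from Equation (4), using the per-layer normalization of (3).
    """
    combined = np.zeros_like(maps[0][0])
    for l in range(2):                          # two parallel conv layers
        norm = alphas[l] / alphas[l].sum()      # Equation (3), per layer (assumed form)
        combined += (norm[:, None] * maps[l]).sum(axis=0)
    return np.maximum(combined, 0.0)            # ReLU
```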
A MalConv model has two independent convolutional layers, and each convolutional layer has multiple convolution kernels. To simplify data mapping, we set the kernel length equal to the kernel's moving stride, all kernels have the same length, and the length of the input data is 2,000,000 bytes. The mapping relationship between a feature map and the input data can be constructed as follows.
In [3], the authors tried different parameter settings to test the performance of MalConv. We followed [3] and set both the length and the moving stride of a kernel to 500, and the number of kernels in each convolutional layer to 128.
Figure 3 shows the relationship between the input data and a feature map. In Figure 3, each square in the first row represents an input byte, and each square in the second row represents the embedding vector of an input byte. Kernel1 is a one-dimensional convolution kernel of a convolutional layer, whose length is 500. Kernel1 is convolved across the embedding data, computing the dot product between the entries of the kernel and the embedding data and producing a one-dimensional feature map $A^1$. If each convolutional layer has 128 kernels, we can obtain 128 one-dimensional feature maps from one convolutional layer. The embedding data has the same length as the input data; therefore, each feature map has 4000 elements. In Figure 3, the fourth row shows the mapping relationship between an element of a feature map and a byte sequence in the input data. For example, the first element of $A^1$, $A^1[1]$, is calculated by convolving Kernel1 with the first five hundred embedding vectors, and each embedding vector corresponds to one input byte. Therefore, $A^1[1]$ is related to the first five hundred bytes of the input data. The class-discriminative localization map is the weighted sum of all feature maps, so it has the same mapping relationship as a feature map.
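With the kernel length equal to the stride, this mapping reduces to a simple index computation; a minimal helper (ours) is:

```python
KERNEL_LEN = 500   # kernel length == moving stride, as set above

def element_to_byte_range(i: int) -> tuple[int, int]:
    """Map the i-th element (0-based) of a feature map / localization map
    back to the byte range of the input it was computed from."""
    start = i * KERNEL_LEN            # stride == kernel length, so no overlap
    return start, start + KERNEL_LEN  # e.g., element 0 -> bytes [0, 500)
```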
To generate adversarial examples, we first train a MalConv model as the pseudo detector. Then, we create a dataset for feature extraction. All samples in the dataset are benign and are correctly classified as benign by a detector. We input a sample to the pseudo detector and obtain the class-discriminative localization map of the sample. According to the mapping relationship between the input data and the class-discriminative localization map, we can extract the byte sequences from the input data which represent the features of a sample. We usually extract the byte sequences corresponding to the elements with the greatest values in $L^c$. We call these byte sequences feature byte sequences; they can be stored and shared by different adversarial samples. When generating an adversarial example, we randomly select one or multiple sequences and inject them into a malware sample.
Different from adversarial image samples, feature byte sequences injected into a malware sample should have concrete program semantics. Sometimes the head and tail of a feature byte sequence are separated from the other bytes of a program and cannot represent complete program semantics. In this case, we should extend the feature byte sequence to include the separated parts. For example, a feature byte sequence (the bytes in the box), extracted according to the mapping relationship, is shown in Figure 4, and the disassembled code of the binary bytes is shown in Figure 5. We can see that the head byte FF and the tail byte 45 cannot represent correct program semantics. To generate a feature byte sequence with correct program semantics, we should extend the feature byte sequence to include 8B and 08. From this point we can see that the injected byte sequences generated by our method are explainable.
To locate the important areas in the input data more accurately, we train several MalConv models with different parameter settings and combine the class-discriminative localization maps from all MalConv models.
Algorithm 1 gives the procedure for extracting feature byte sequences from input data using multiple detection models. The length of the convolution kernels in different MalConv models can differ. For the convenience of extracting feature byte sequences, we define a new data structure $V$: a vector having the same length as the input data, in which each element records the importance value of the corresponding byte of the input data. The importance values of input bytes are assigned according to $L^c$: according to the mapping relationship, we find the byte sequence corresponding to $L^c[i]$ (the $i$th element of $L^c$), and then the elements of $V$ corresponding to that byte sequence are set to $L^c[i]$. The function Map() implements this objective. Since multiple models are used to locate feature byte sequences, we use $L^c_m$ and $V_m$ to represent the class-discriminative localization map and $V$ generated from model $m$ (the $m$th detector). The vector $V$ is the sum of all $V_m$, which stores the final importance value of each byte of the input data.
In Algorithm 1, $M$ is the number of models, and $T$ gives the threshold of the importance value for selecting feature sequences. $x$ is the input data. The function GetFeatureMaps($m$, $x$) returns all the feature maps generated by model $m$. $A^{k,l}_i$ is the $i$th element of the $k$th feature map generated by the $l$th convolutional layer of a MalConv model. The function ExtractBytes($x$, $V$, $T$) extracts from $x$ all bytes whose importance values are greater than $T$, according to the vector $V$. Continuous bytes having the same importance value constitute a feature byte sequence.
Figure 6 shows an example of how to extract feature byte sequences from input data. We set $T$ to 50; therefore, only two feature byte sequences (the sequences in the black boxes) are extracted from the input data.
Algorithm 1: Extracting feature byte sequences of a benign sample.
Input: benign sample $x$, number of models $M$, threshold $T$
Output: feature byte sequence set $S$
1: $V \leftarrow \mathbf{0}$
2: for $m = 1$ to $M$ do
3:   obtain the feature maps of $x$ with GetFeatureMaps($m$, $x$)
4:   compute $\hat{\alpha}^c_{k,l}$ for every feature map using Equations (1) and (3)
5:   compute the localization map $L^c_m$ using Equation (4)
6:   $V_m \leftarrow$ Map($L^c_m$)
7:   $V \leftarrow V + V_m$
8: end for
9: $S \leftarrow$ ExtractBytes($x$, $V$, $T$)
10: return $S$
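A Python sketch of Algorithm 1 is shown below. Here get_localization_map() stands in for Equations (1)–(4) applied to one model, and the per-model kernel_len attribute is an assumed piece of model metadata; both are hypothetical helpers, not part of the algorithm as stated:

```python
import numpy as np

def extract_feature_sequences(x, models, T):
    """Sketch of Algorithm 1: x is the padded byte array of a benign sample,
    `models` is the list of M trained detectors, T is the importance threshold."""
    V = np.zeros(len(x))
    for m in models:
        L_m = get_localization_map(m, x)       # Equations (1)-(4) for model m
        V_m = np.zeros(len(x))
        stride = m.kernel_len                  # kernel length == moving stride
        for i, value in enumerate(L_m):        # Map(): spread L_m[i] over its bytes
            V_m[i * stride:(i + 1) * stride] = value
        V += V_m                               # accumulate over all models
    # ExtractBytes(): maximal runs of bytes whose importance exceeds T
    sequences, i = [], 0
    while i < len(x):
        if V[i] > T:
            j = i
            while j < len(x) and V[j] > T:
                j += 1
            sequences.append(bytes(x[i:j]))
            i = j
        else:
            i += 1
    return sequences
```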
3.4. Strategies for Injecting Feature Sequences
A malware adversarial sample should preserve the same semantics as the source file, which requires that no byte in the source executable be changed. Therefore, feature sequences should be injected into spare space in an executable that is never executed. Two strategies can be adopted to locate spare space in an executable: mid-file injection and end-of-file injection. We apply both strategies to generate adversarial samples in our work.
Mid-file injection: we locate the gaps between neighboring PE sections by parsing the PE file header. These gaps are introduced by the compiler because the physical size allocated to a PE section is greater than its virtual size. The length of a gap is calculated as RawSize − VirtualSize, and the start address of a gap is computed as PointerToRawData (the offset address of a section) + VirtualSize. We collect the start address and length of each gap in an executable, and then inject feature byte sequences of appropriate length into these gaps.
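Using the pefile library, the gap enumeration described above can be sketched as follows (the function name is ours):

```python
import pefile  # assumed available: pip install pefile

def find_section_gaps(path):
    """Locate the slack space between neighboring PE sections:
    length = RawSize - VirtualSize, start = PointerToRawData + VirtualSize."""
    pe = pefile.PE(path)
    gaps = []
    for section in pe.sections:
        raw_size = section.SizeOfRawData
        virtual_size = section.Misc_VirtualSize
        if raw_size > virtual_size:            # compiler-introduced padding
            start = section.PointerToRawData + virtual_size
            gaps.append((start, raw_size - virtual_size))
    return gaps
```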
End-of-file injection: the other strategy we use is adding new sections at the end of a PE file and injecting feature byte sequences into the newly added sections. Since the new sections are never accessed by program code, the semantics of the original PE file are preserved. The process of adding a new section includes the following steps. First, we modify the bytes that store the number and size of sections in the PE file header and update the values of file alignment and section alignment. Then, we compute the offset address of the new section from the offset address and size of the last section. Next, we set the attribute values of the new section, such as the section name, execution attributes, size on disk, and size in memory. Finally, we modify the aligned section offset and the file offset in the section table, and update the size of the image in the PE header.
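As a hedged sketch, libraries such as LIEF can automate the header bookkeeping described above. The snippet below assumes a LIEF ~0.12-era API (e.g., SECTION_TYPES), which differs across LIEF versions; the function and section names are ours:

```python
import lief  # assumed: LIEF ~0.12; section APIs vary across versions

def append_section(src_path, dst_path, payload, name=".pad"):
    """End-of-file injection sketch: add a new, never-executed section
    carrying the feature byte sequences."""
    binary = lief.parse(src_path)
    section = lief.PE.Section(name)
    section.content = list(payload)            # injected feature bytes
    binary.add_section(section, lief.PE.SECTION_TYPES.DATA)
    builder = lief.PE.Builder(binary)          # rebuilds headers/alignment
    builder.build()
    builder.write(dst_path)
```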
Similar to [17], our mid-file injection method generates adversarial samples by injecting perturbed bytes into the gaps between neighboring PE sections. Our end-of-file injection method generates malware adversarial examples by adding new sections at the end of a PE file, which is similar to previous methods [7,16,22,24]. However, all these methods [7,16,17,22,24] belong to the gradient-based family, which is optimized by computing the gradient of the objective function with respect to each byte of a source malware binary. The gradient-based algorithm is iterative, and only one byte value is computed per iteration. Generating an adversarial malware sample with a gradient-based method is therefore time-consuming, making it unsuitable for generating large numbers of adversarial samples. To avoid using gradient-based algorithms to calculate the values of the injected padding bytes, our methods use byte sequences extracted from benign executables to generate adversarial samples. In addition, our methods aim to evade CNN-based malware detectors, similar to [23]. We make a more detailed comparison between our method and the gradient-based method [16] in Section 4 and Section 5.