*Article* **A Novel Android Botnet Detection System Using Image-Based and Manifest File Features**

**Suleiman Y. Yerima <sup>1,\*</sup> and Abul Bashar <sup>2</sup>**


**Abstract:** Malicious botnet applications have become a serious threat and are increasingly incorporating sophisticated detection avoidance techniques. Hence, there is a need for more effective mitigation approaches to combat the rise of Android botnets. Although the use of Machine Learning to detect botnets has been a focus of recent research efforts, several challenges remain. To overcome the limitations of using hand-crafted features for Machine-Learning-based detection, in this paper, we propose a novel mobile botnet detection system based on features extracted from images and a manifest file. The scheme employs a Histogram of Oriented Gradients and byte histograms obtained from images representing the app executable and combines these with features derived from the manifest files. Feature selection is then applied to utilize the best features for classification with Machine-Learning algorithms. The proposed system was evaluated using the ISCX botnet dataset, and the experimental results demonstrate its effectiveness with F1 scores ranging from 0.923 to 0.96 using popular Machine-Learning algorithms. Furthermore, with the Extra Trees model, up to 97.5% overall accuracy was obtained using an 80:20 train–test split, and 96% overall accuracy was obtained using 10-fold cross validation.

**Keywords:** botnet detection; Histogram of Oriented Gradients; image processing; android botnets; machine learning

#### **1. Introduction**

The prevalence of mobile malware globally is a well-known phenomenon, as a growing number of malware types continue to target mobile platforms, particularly Android. The McAfee Threat report of June 2021 stated that around 7.73 million new mobile malware samples were seen in 2020 alone [1]. The report further revealed that 2.34 million new mobile malware samples had already been discovered in the wild during the first quarter of 2021.

Android, being an open source mobile and IoT platform that also permits users to install apps from diverse sources, is the prime target for mobile malware. Unverified and/or re-packaged apps can be downloaded and installed on an Android device from virtually any online third-party source other than the official Google play store. Even though the Google play store benefits from screening services to prevent the distribution of malicious apps, cleverly crafted malware, such as the Chamois botnet [2–4], was still able to bypass protection mechanisms and infect millions of users worldwide.

Chamois was distributed through Google play and third-party app stores and infected over 20.8 million Android devices between November 2017 and March 2018. The first generation of the Chamois botnet was primarily distributed through fake apps, and initial eradication efforts by anti-malware professionals almost completely eliminated the threat. The creators of the botnet responded by adopting a more sophisticated distribution model that bundled Chamois into a fake payment solution for device manufacturers and a fake advertising SDK for developers.

**Citation:** Yerima, S.Y.; Bashar, A. A Novel Android Botnet Detection System Using Image-Based and Manifest File Features. *Electronics* **2022**, *11*, 486. https://doi.org/10.3390/electronics11030486

Academic Editor: Giovanni Dimauro

Received: 31 December 2021 Accepted: 3 February 2022 Published: 8 February 2022

**Publisher's Note:** MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

**Copyright:** © 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

As mobile devices, especially smartphones, tend to be online for long periods, they provide a suitable platform for operating botnets once they have been compromised. Mobile botnets are controlled using SMS or web-based command and control channels and are used for various attacks, such as Distributed Denial of Service (DDoS), phishing, spam distribution, click fraud and credential stuffing. A study by Imperva on mobile botnet activity revealed that 5.8 million bot-infected mobile devices were used to launch credential stuffing attacks on websites and apps over a 45-day period on six major cellular networks [5].

DDoS attacks are high volume and high frequency and are thus easily detected by traditional network intrusion detection systems. By contrast, credential stuffing attacks from botnets are characterized by low frequency and low volume network traffic and are therefore more challenging to detect. Thus, complementary approaches to network-based detection are needed to strengthen defense against mobile botnet infection and attacks.

As mobile malware continues to increase and become more sophisticated, research efforts directed at detecting and mitigating Android malware have intensified in recent years. Several Machine-Learning-based detection systems have been proposed in the current literature to combat the rising incidence of Android malware, including botnets [6–8]. Such systems rely on statically or dynamically extracted features for training the Machine-Learning models. In many cases, these features are hand-crafted and/or depend heavily on domain expertise for effective extraction. As the Android OS evolves, many of these hand-crafted features may become deprecated or obsolete, and the entire feature extraction process will need to be re-engineered.

The use of image processing techniques to extract features from an image-based representation of the application has the distinct advantage of eliminating the need to rely on hand-crafted features to build Machine-Learning models. Moreover, with an image-based approach, little or no modification will be required to adapt to platform/OS evolution, and this leads to long-term efficiency compared to systems based on hand-crafted features.

Hence, in this paper, we propose a novel system that uses an image processing technique called Histogram of Oriented Gradients (HOG) to extract features for training Machine-Learning models to detect Android botnets. In the scheme, the HOG features are combined with byte histograms and features from the app manifest file to improve prediction accuracy. Furthermore, we demonstrate the feasibility of our approach using a dataset of Android botnets and benign samples.

The rest of the paper is organized as follows: In Section 2, we provide an overview of related work. Section 3 describes our proposed system, while in Section 4, we outline the study undertaken to evaluate the system. Section 5 presents and discusses the results of the evaluation. Finally, in Section 6, we conclude the paper and give an outline of future work.

#### **2. Related Work**

There is extensive literature on the Machine-Learning-based detection of mobile malware, and [9,10] provide recent surveys on the topic. Here, we provide an overview of related works in Android botnet detection as well as image-based detection of malicious applications.

#### *2.1. Image-Based Analysis of Malicious Applications*

In [11], a method for image-based malware classification using an ensemble of CNN architectures was proposed. This was based on the malimg dataset where the raw images were used as input to the CNN-based classification system. Additionally, a malware dataset of 96 packed executables was also used and converted into images to evaluate the proposed system. The images were divided into training and validation sets based on a 70:30 split.

The method consisted of using transfer learning with fine-tuned ResNet-50 and VGG16 models that were pre-trained on ImageNet data. The outputs of these models, obtained through SoftMax classifiers, were fused with a version of the output that had been reduced using PCA and fed to a one-vs-all multiclass SVM classifier. In their experiments, they obtained a classification accuracy of up to 99.50% with unpacked samples and 98.11% and 97.59% for packed and salted samples, respectively.

In [12], the authors presented a method to recognize malware by capturing the memory dump of suspicious processes and representing them as RGB images. The study was based on 4294 samples consisting of 10 malware families and benign executables, and on several Machine-Learning classifiers, including J48, SMO with RBF kernel, Random Forest, Linear SVM and XGBoost. Dimensionality reduction was achieved using a UMAP-based manifold learning strategy. A combination of GIST and HOG descriptors was used to extract features from the RGB images. The method yielded the highest prediction accuracy of up to 96.39% using the SMO classifier.

In [13], Bozkir et al. evaluated several CNN architectures for PE malware classification using coloured images. They used the Malevis Dataset containing 12,394 malware files and split this into 8750 training and 3644 testing samples. From their experiments, they obtained an accuracy of 97.48% using the DenseNet architecture.

Nataraj et al. [14] used grayscale images to visualize malware binaries to distinguish between different families. GIST was used to extract features from the images, and using KNN as a classifier they achieved 97.18% classification accuracy on experiments with a dataset consisting of 9458 malware samples from 25 families. In another paper [15] by the same authors, similar results were obtained when they applied image processing with dynamic analysis, in order to address both packed and unpacked malware.

Kumar et al. [16] proposed a method that uses an autoencoder enhanced deep convolutional neural network (AE-DCNN) to classify malware images into their respective families. A novel training mechanism is proposed where a DCNN classifier is trained with the help of an encoder. The encoder is used to provide extra information to the CNN classifier that may be lost during the forward propagation thus resulting in better performance. On the standard malimg dataset, 99.38% accuracy and F1-score of 99.38% were reported.

In [17], Fine Tuning and Transfer Learning approaches were used for multi-class classification of malware images. Eight different fine-tuned CNN-based transfer learning models were developed for vision-based malware multi-classification applications. These included VGG16, AlexNet, DarkNet-53, DenseNet-201, Inception-V3, Places365-GoogleNet, ResNet-50 and MobileNet-V2. Experiments based on the malimg dataset showed high performance with 99.97% accuracy.

Similarly, in [18], the IMCFN system i.e., image-based malware classification using fine-tuned convolutional neural network architecture, was presented. IMCFN converts raw malware binary into color images that are used by fine-tuned CNN architecture to classify malware. It fine-tunes a previously trained model based on ImageNet dataset and uses data augmentation to address class imbalance. The method was evaluated using malimg and an IoT-android mobile dataset containing 14,733 malware and 2486 benign samples. With the malimg dataset, an accuracy of 98.82% was obtained, while 97.35% accuracy was obtained for the IoT-android mobile dataset.

Xiao et al. [19] proposed a malware classification framework, MalFCS, based on malware visualization and automated feature extraction. Malware binaries are visualized as entropy graphs based on structural entropy, while a deep convolutional neural network is used as a feature extractor to automatically extract patterns shared by a family from entropy graphs. An SVM classifier was used to classify malware based on the extracted features. The method achieved 99.7% accuracy when evaluated on the malimg dataset and 100% accuracy when evaluated on the Microsoft dataset.

Awan et al. also proposed an image-based malware classification system, which was investigated using malimg data [20]. The VGG19 model was used with transfer learning as a feature extractor, while a CNN model enhanced by a spatial attention mechanism was used to enhance the system. The attention-based model achieved an accuracy of 97.68% in the classification of the 25 families using a 70:30 training and testing split.

In [21], DenseNet was used with a reweighted class-balanced loss function in the final classification layer to address data imbalance issues and improve performance. Experiments performed on the malimg dataset yielded 98.23% accuracy, while 98.46%, 98.21% and 89.48% accuracies were obtained with the BIG 2015, MaleVis and Malicia datasets, respectively.

In [22,23], local binary patterns (LBP) were used, while [24] used Intensity, Wavelet and Gabor to extract grayscale image features. Han et al. [25] used entropy graphs and similarity measures between entropy images for malware family classification. They obtained an accuracy of 97.9% by experimenting with 1000 malware samples from 50 families. In [26], the authors first disassembled binary executables and then converted the opcode sequences into RGB images. They evaluated their approach on 9168 malware and 8640 benign binaries, achieving 94.8% to 96.5% accuracy.

Dai et al. [27] proposed a method for identifying malware families aimed at addressing the deficiencies of dynamic analysis approaches, by extracting a memory dump file and converting it to a grayscale image. They used the Cuckoo sandbox and built the procdump program into the sandbox, while using the ma command to extract the dump of the monitored process. Histogram of Oriented Gradients (HOG) was used to extract features from the image file and train KNN, Random Forest and MLP classifiers. Experiments were performed on 1984 malware samples from the Open Malware Benchmark dataset, and MLP performed best with an accuracy of 95.2% and F1-score of 94.1%.

Although these works highlight the success of employing image-based techniques in malware related work, their focus has largely been on Windows (PE) malware and family classification. By contrast, this paper uses image-based techniques for detection of botnets on the Android platform based on a novel approach to utilize HOG with manifest file features.

In a recent paper that focused on Android, Singh et al. [28] proposed a system called SARVOTAM that converts malware non-intuitive features into fingerprint images to extract quality information. Automatic extraction of rich features from visualized malware is then enabled using CNN, ultimately eliminating feature engineering and domain expert cost. They used 15 different combinations of Android malware image sections to identify and classify malware and replaced the softmax layer of CNN with ML algorithms like KNN, SVM and Random Forest to analyze grayscale malware images. It was observed that CNN-SVM outperformed the original CNN as well as CNN-KNN and CNN-RF. The experiments performed on the DREBIN dataset achieved 92.59% accuracy using Android certificates and manifest malware images.

#### *2.2. Botnet Detection on Android*

In [29], the authors proposed a signature-based, real-time SMS botnet detection system that applies pattern-matching for incoming and outgoing SMS messages. This is followed by a second step that uses rule-based techniques to label SMS messages as suspicious or normal. They performed experiments to evaluate their system with more than 12,000 messages. The system detected all 747 malicious SMS messages but also had a high false positive rate with 349 normal SMS messages misclassified as malicious.

Jadhav et al. presented a cloud-based Android botnet detection system in [30], based on strace, netflow, logcat, sysdump and tcpdump. Although this is a real-time dynamic analysis system, one major drawback is the ability of sophisticated botnets to detect and evade the cloud environment. Moreover, detecting Android botnets using a cloud-based dynamic analysis system based on several types of traces is more resource intensive compared to an image-based static analysis system.

Moodi et al. [31] presented an approach to detect Android botnets based on traffic features. Their method was based on SVM, where a new approach called smart adaptive particle swarm optimization support vector machine (SAPSO-SVM) is developed to adapt the parameters of the optimization algorithm. The proposed approach identified the top 20 traffic features of Android botnets from the 28-SABD Android botnet dataset.

Bernardeschia et al. [32] used model checking to identify Android botnets. Static analysis is used to derive a set of finite state automata from the Java byte code that represents approximate information about the run-time behaviour of an app. However, the authors only evaluated their approach using 96 samples from the Rootsmart botnet family and 28 samples from the Tigerbot family in addition to 1000 clean samples.

Anwar et al. [33] proposed a static technique that consists of four layers of botnet security filters. The four layers consist of MD5 signatures, permissions, broadcast receiver and background services modules. Based on these, classification models were built using SVM, KNN, J48, Bagging, Naive Bayes and Random Forest. Experiments were performed on 1400 mobile botnet applications from the ISCX Android botnet dataset and 1400 benign applications. Their best result was 95.1% accuracy. In [34], the Android Botnet Identification System (ABIS) was proposed based on static and dynamic features using API calls, network traffic and permissions. These features were used to train several Machine-Learning classifiers, where Random Forest showed the best performance by obtaining a precision score of 0.972 and a recall score of 0.960.

Yusof et al. proposed a botnet classification system based on permission and API calls in [35]. They used feature selection to select 16 permissions and 31 API calls that were subsequently used to train Machine-Learning algorithms using the WEKA tool. The experiments were performed on 6282 benign and malicious samples using Naive Bayes, KNN, J48, Random Forest and SVM. Using both permission and API call features, Random Forest obtained the best results with 99.4% TP rate, 16.1% FP rate, 93.2% precision and 99.4% recall. This work was extended in [36] to include system calls and this resulted in improved performance with Random Forest achieving 99.4% TP rate, 12.5% FP rate, 98.2% precision, 99.4% recall and 97.9% accuracy.

In [37], a system for Android botnet detection using permissions and their protection levels was proposed. Random Forest, MLP, Naive Bayes and Decision Trees were used as Machine-Learning classifiers, with the experiments conducted using 1635 benign and 1635 botnet applications from the ISCX botnet datasets. Random Forest achieved 97.3% accuracy, 98.7% recall and 98.5% precision as the best result.

In [38], Android botnet classification (ABC) was proposed as a Machine-Learning-based system using requested permissions as features, with Information Gain feature selection applied to select the most significant requested permissions. Naive Bayes, Random Forest and J48 were used as classifiers, and experiments showed that Random Forest had the highest detection accuracy of 94.6%, lowest FP rate of 9.9%, with precision of 93.1% and recall of 94.6%. The experiments were performed on 2355 Android applications (1505 samples from the ISCX botnet dataset and 850 benign applications).

Karim et al. proposed DeDroid in [39], as a static analysis approach to extract critical features specific to botnets that can be used in the detection of mobile botnets. They achieved this by observing the code behaviour of known malware binaries that possess command and control features. In [40], an Android botnet detection system based on deep learning was proposed. The system is based on 342 static features including permissions, API calls, extra files, commands and intents. The model was evaluated using 6802 samples including 1929 ISCX botnet dataset samples and 4873 clean applications.

The performance of CNN was compared to Naive Bayes, Bayes Net, Random Forest, Random Tree, Simple Logistic, ANN and SVM. The CNN-based model achieved the best performance with 98.9% accuracy, 98.3% precision, 97.8% recall and 98.1% F1-score. In [8], a comprehensive study of deep learning techniques for Android botnet detection was presented using the same dataset and static features utilized in [40]. CNN, DNN, LSTM, GRU, CNN-LSTM and CNN-GRU models were studied, and the overall best result from DNN was 99.1% accuracy, 99% precision, 97.9% recall and 98.1% F1-score.

The cross-section of Android botnet detection systems summarized above indicates that, in the current literature, most proposed solutions are based on hand-crafted (static or dynamic) features or rely on in-depth Android domain knowledge, unlike the system proposed in this paper. Furthermore, compared to image-based approaches, hand-crafted features may not be sustainable in the long run because, as the Android OS evolves, new features are added while some old ones may become deprecated. This will require significant re-engineering of systems based on hand-crafted features to cope with the OS/platform evolution.

Some recent papers have begun exploring image-based techniques for Android botnet detection. In [41], the Bot-IMG framework was used to extract HOG descriptors and train Machine-Learning-based classifiers to distinguish botnets from benign applications. An enhanced HOG scheme was proposed, which enabled improved accuracy performance with the use of autoencoders. The system was evaluated with experiments performed using 1929 ISCX botnet applications and 2500 benign applications.

KNN, SVM, Random Forest, XGBoost and Extra Trees learning algorithms were trained using the HOG-based schemes. With Extra Trees, the best result from 10-fold cross validation was obtained using autoencoder and gave 93.1% accuracy with 93.1% F1-score. In [42], the authors used permissions to generate images based on a co-occurrence matrix. The images were used to train a CNN model to classify applications into benign or botnet. The experiments were performed on 3650 benign applications and 1800 botnet applications from the ISCX dataset. Their best result was 97.2% accuracy, 96% recall, 95.5% precision and 95.7% F1-score.

Different from [41,42], the system presented and evaluated in this paper is a novel botnet detection system based on image features (i.e., HOG, byte histograms) and manifest features (i.e., permissions, intents). All of these features come from a single pre-processed composite image derived from automated reverse engineering of the Android applications. In this paper, we demonstrate the feasibility and performance of the proposed scheme by using it to train and evaluate several popular Machine-Learning classifiers on a dataset of 1929 ISCX botnet applications and 2500 benign applications.

#### **3. Proposed HOG-Based Android Botnet Detection System**

Our proposed system is based on the Bot-IMG framework [41], which enables automated reverse engineering of the Android applications, image generation and subsequent extraction of image-based and manifest features. Figure 1 shows an overview of the system for HOG-based Android botnet detection. As shown in the figure, the first step involves reverse engineering the apks to extract the various files contained in the application.

Out of all the files present in an apk, only the manifest file and the Dalvik executable (dex) file are utilized in the proposed system. The manifest file is processed using the AXMLPrinter2 tool, which converts it into a readable text file that is scanned to generate a set of 187 features consisting of permissions and intents. These features extracted from the manifest file are encoded for gray-scale representation.

Thus, the presence of a feature is denoted by 255 (i.e., white), while 0 (i.e., black) is recorded if the feature is absent. These values are stored in an array of manifest features. The dex file is converted to a byte array consisting of integer-encoded bytes ranging from 0 to 255. This byte array from the executable is combined with the array of manifest features. The combined array is then used to generate a composite gray-scale image representing the application.
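The encoding just described can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the Bot-IMG implementation: the function names, the image width of 64 pixels, the zero-padding of the last row and the placement of the manifest values ahead of the dex bytes are all hypothetical choices made for the example.

```python
def encode_manifest(present_flags):
    """Map each manifest feature to a gray level: 255 if present, 0 if absent."""
    return [255 if flag else 0 for flag in present_flags]


def composite_pixels(present_flags, dex_bytes):
    """Concatenate manifest gray levels with the dex bytes (each already 0..255).

    Putting the manifest values first is an assumption made for this sketch.
    """
    return encode_manifest(present_flags) + list(dex_bytes)


def to_rows(pixels, width):
    """Lay the 1-D pixel list out as image rows, zero-padding the last row."""
    padding = (-len(pixels)) % width
    padded = pixels + [0] * padding
    return [padded[i:i + width] for i in range(0, len(padded), width)]


# Toy input: 187 manifest flags (every third one set) and a few made-up dex bytes.
flags = [i % 3 == 0 for i in range(187)]
dex = bytes([0x64, 0x65, 0x78, 0x0A, 0x30, 0x33, 0x35, 0x00])
pixels = composite_pixels(flags, dex)
image_rows = to_rows(pixels, width=64)  # width=64 is an assumed value
```

Each entry of `image_rows` is one row of gray levels that an imaging library could write out as a gray-scale image.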

**Figure 1.** Overview of the different steps involved in building the image-based Android botnet detection system.

The image files are processed by the feature extraction engine using the algorithm described in Section 3.2 to generate feature vectors for each application used in the training of the Machine-Learning models. During the training of a model, a feature selection algorithm is applied to select the best features. The trained model is then used to detect botnet apps by classifying an unknown application into 'botnet' or 'benign'. The proposed system is based on HOG, byte histograms and manifest features. We provide a brief description of HOG in the following section.

#### *3.1. Histogram of Oriented Gradients*

HOG, first proposed for human detection by Dalal and Triggs [43], is a popular image descriptor that has found wide application in computer vision and pattern recognition. For example, it has been applied to handwriting recognition [44], recognition of facial expressions [45] and pedestrian detection for autonomous vehicles [46]. HOG is considered to be an appearance descriptor because it counts occurrences of gradient orientation in localized portions of an image. Due to the simple computations involved, HOG is generally a fast descriptor compared to Local Binary Patterns (LBP) or Scale Invariant Feature Transforms (SIFT).

HOG descriptors are computed on a dense grid of uniformly spaced cells and overlapping local contrast normalizations are used for improved performance. For each pixel, magnitude and orientation can be computed using the following formulae:

$$g = \sqrt{g_x^2 + g_y^2} \tag{1}$$

$$\theta = \tan^{-1} \left( \frac{g_y}{g_x} \right) \tag{2}$$

where *g<sub>x</sub>* and *g<sub>y</sub>* are calculated from the neighboring pixels in the horizontal and vertical directions respectively. Figure 2 illustrates how the histograms are generated for a cell, using the highlighted pixel as an example. For the pixel represented by number 65, the change in *x* direction *g<sub>x</sub>* is 69 − 54 = 15, and the change in *y* direction *g<sub>y</sub>* is 78 − 30 = 48. Using Equations (1) and (2), the total magnitude *g* = 50.3 while the orientation *θ* = 72.65°. To generate the histogram for the cell, using nine bins representing the orientations separated 20 degrees apart, each pixel's contribution will be added to the bin according to orientation.

For example, in Figure 2, the orientation is 72.65°, which is between 60 degrees and 80 degrees. Thus, the magnitude is split between these two bins, weighted by the distance of the orientation from each bin. The weights are (80 − 72.65)/20 ≈ 0.37 for the 60-degree bin and (72.65 − 60)/20 ≈ 0.63 for the 80-degree bin, so that the nearer bin receives the larger share. The 50.3 magnitude is therefore split into approximately 18.6 and 31.7, which are placed in the 4th and 5th bins respectively. The process is repeated for all the pixels in the cell.
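The worked example for the highlighted pixel can be checked with a short Python sketch. It assumes the usual bilinear voting convention, in which the bin whose orientation is nearer to *θ* receives the larger share of the magnitude; the function names are illustrative and not taken from any library.

```python
import math


def magnitude_and_orientation(gx, gy):
    """Equations (1) and (2): gradient magnitude and unsigned orientation in degrees."""
    g = math.hypot(gx, gy)
    theta = math.degrees(math.atan2(gy, gx)) % 180.0
    return g, theta


def vote(g, theta, n_bins=9, bin_width=20.0):
    """Split the magnitude g between the two bins whose orientations bracket theta."""
    hist = [0.0] * n_bins
    lower = int(theta // bin_width) % n_bins
    upper = (lower + 1) % n_bins  # wraps from the 160-degree bin back to 0
    frac = (theta - lower * bin_width) / bin_width  # distance from the lower bin
    hist[lower] += g * (1.0 - frac)
    hist[upper] += g * frac
    return hist


# Pixel from Figure 2: g_x = 69 - 54 = 15 and g_y = 78 - 30 = 48.
g, theta = magnitude_and_orientation(15, 48)
hist = vote(g, theta)
```

Here `hist[3]` and `hist[4]` correspond to the 60-degree and 80-degree bins (the 4th and 5th bins), and their sum equals the pixel's gradient magnitude. Summing the output of `vote` over all pixels in a cell gives that cell's nine-bin histogram.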

**Figure 2.** Building a Histogram of Oriented Gradients using nine bins representing positive orientations spaced 20 degrees apart.

The binning of the magnitudes according to their orientations produces a histogram of gradient directions for the pixels in a cell. If the number of bins is taken as 9, then each cell will be represented by a 9 × 1 array. In order to make the gradients less sensitive to scaling, the histogram array is normalized in blocks, where each block is made up of *b* × *b* cells. Hence, taking *b* = 2 will result in 4 cells per block. This means that each block will be represented by a 36 × 1 vector or array (i.e., 4 cells × nine bins). Block normalization is based on the L2-norm computed as in Equation (3), where *ε* is a small constant:

$$v \leftarrow \frac{v}{\sqrt{\|v\|_2^2 + \varepsilon^2}} \tag{3}$$
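As a minimal sketch of L2 block normalization, the function below divides a 36-element block vector by the square root of its squared L2-norm plus *ε*². The value of `eps` and the toy block contents are assumptions chosen only for illustration.

```python
import math


def l2_normalize_block(v, eps=1e-5):
    """Divide the block vector by sqrt(||v||_2^2 + eps^2); eps avoids division by zero."""
    norm = math.sqrt(sum(x * x for x in v) + eps * eps)
    return [x / norm for x in v]


# Toy block: 4 cells x 9 bins = 36 histogram values (made-up numbers).
block = [float((i * 7) % 11) for i in range(36)]
normalized = l2_normalize_block(block)

# An all-zero block (a flat image region) stays all-zero instead of raising an error.
flat = l2_normalize_block([0.0] * 36)
```

After normalization the block vector has close to unit length, so a uniform rescaling of the gradient magnitudes in a block leaves the descriptor essentially unchanged.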

In the default situation, the HOG algorithm takes an input image whose size is 128 × 64. Therefore, in the 128 × 64 pixel image, it turns out that if we take 8 × 8 pixels in each cell and 2 × 2 cells in each block, then this will result in 7 horizontal block positions and 15 vertical block positions. Hence, we get a total HOG vector length of 3780 (which is 36 × 7 × 15). As such, to get a HOG descriptor vector of length 3780 for an image, we are required to choose the following parameters: *n* = 9 (number of orientations); ppc = 8 × 8 (number of pixels per cell) and cpb = 2 × 2 (number of cells per block).
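The descriptor length quoted above follows directly from the image size and the three parameters. The helper below is an illustrative sketch that assumes blocks slide one cell at a time, which is what yields 7 × 15 block positions; the function name is hypothetical.

```python
def hog_length(height, width, pixels_per_cell=8, cells_per_block=2, orientations=9):
    """Number of HOG values for one image, with blocks sliding one cell at a time."""
    cells_y = height // pixels_per_cell
    cells_x = width // pixels_per_cell
    blocks_y = cells_y - cells_per_block + 1
    blocks_x = cells_x - cells_per_block + 1
    values_per_block = cells_per_block ** 2 * orientations
    return blocks_y * blocks_x * values_per_block


default_length = hog_length(128, 64)  # 15 * 7 block positions, 36 values each
```

Changing any of the three parameters changes the descriptor length, which is why they need to be fixed before feature vectors from different apps can be compared.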

#### *3.2. Characterizing Apps with Image and Manifest Features*

The methodology for extracting the HOG descriptors and using them together with byte histogram and manifest features to characterize the apps, is discussed in this section. The steps involved in deriving the composite features for Machine-Learning-based detection approach are shown in Algorithm 1.


The required input is the set of images from benign and botnet apps, while the output will be a high dimensional vector *V*. The images generated from the apps are of different sizes. Thus, in order to utilize the original HOG descriptors generation approach proposed by Dalal and Triggs, the images must be reshaped to 128 × 64 pixels. However, it has been found that resizing the images diminishes the performance of the trained Machine-Learning models [41]. We therefore adopt a methodology that uses five patches or segments from the image with each segment being 128 × 64 pixels in size. In line 1 of Algorithm 1, the five arrays *X*1, *X*2, *X*3, *X*4 and *X*5 that will hold the pixels of the five segments are initialized with zeros.

This approach utilizes only the first 40,960 pixels of the images (after extracting the manifest features) making for a fast and efficient system. Once the five segments from the image are copied into the arrays, they are reshaped into 128 × 64 arrays and converted into 5 separate images. This is because the HOG descriptor function only takes images as input and not arrays. From line 12 to line 15 of Algorithm 1, five different HOG vectors are generated for each segment image, and in line 14, we sub-sample each of them to retain 500 descriptors. The 500 descriptors from each batch are then concatenated into a 2500 descriptor vector *HV* in line 16.
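The segmentation step can be sketched as follows. The sketch assumes the input is the flat pixel list that remains after the manifest features have been removed, and that short images are zero-filled, in line with the zero-initialized arrays. The HOG computation and the sub-sampling to 500 descriptors are not reproduced here, and the function name is hypothetical.

```python
SEG_H, SEG_W, N_SEG = 128, 64, 5
SEG_PIXELS = SEG_H * SEG_W  # 8192 pixels per segment, 40,960 in total


def split_into_segments(pixels):
    """Copy the first 40,960 pixels into five 128 x 64 row-major segments."""
    needed = N_SEG * SEG_PIXELS
    buf = list(pixels[:needed])
    buf += [0] * (needed - len(buf))  # zero-fill when the image is too short
    segments = []
    for s in range(N_SEG):
        flat = buf[s * SEG_PIXELS:(s + 1) * SEG_PIXELS]
        segments.append([flat[r * SEG_W:(r + 1) * SEG_W] for r in range(SEG_H)])
    return segments


# Toy image body of 50,000 pixels; only the first 40,960 are used.
pixels = [i % 256 for i in range(50000)]
segments = split_into_segments(pixels)
```

Each of the five segments has the 128 × 64 shape expected by the original HOG formulation, so a 3780-value descriptor can be computed per segment before sub-sampling.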

From line 17 to line 28 of Algorithm 1, a byte histogram is generated, but only for the same combined area where the HOG descriptors were extracted, i.e., the first 40,960 pixels of the application's image. The byte histogram will consist of a vector of dimension 256 that will hold the occurrences of bytes (pixels) within that region. The occurrences are clipped and log-scaled as depicted in lines 23, 24 and 27 respectively, to keep the values between zero and 255. Finally, the overall feature vector *V* of dimension 2943 is generated by concatenating the extracted manifest vector *M* (MV) with the final HOG vector (*HV*) and the log-scaled byte histogram (*BH*).
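A rough sketch of the byte histogram and the final concatenation is given below. The exact clipping and log-scaling rule of Algorithm 1 is not reproduced in the text, so the rule used here (clip each count at `cap`, then apply `log(1 + count)`) is an assumption, as are the function names. The component sizes 187, 2500 and 256 come from the text.

```python
import math

MANIFEST_DIM, HOG_DIM, HIST_DIM = 187, 2500, 256


def byte_histogram(pixels, region=40960, cap=255):
    """Count byte values in the first `region` pixels, then clip and log-scale.

    Assumed rule: clip each count at `cap`, then take log(1 + count).
    """
    counts = [0] * HIST_DIM
    for value in pixels[:region]:
        counts[value] += 1
    return [math.log1p(min(c, cap)) for c in counts]


def assemble(mv, hv, bh):
    """Concatenate manifest, HOG and byte-histogram vectors into V (2943 values)."""
    if (len(mv), len(hv), len(bh)) != (MANIFEST_DIM, HOG_DIM, HIST_DIM):
        raise ValueError("unexpected component length")
    return list(mv) + list(hv) + list(bh)


# Toy inputs: a tiny pixel list and all-zero manifest/HOG placeholders.
toy_pixels = [0] * 100 + [255] * 3
bh = byte_histogram(toy_pixels)
V = assemble([0] * MANIFEST_DIM, [0.0] * HOG_DIM, bh)
```

The length check in `assemble` reflects the arithmetic 187 + 2500 + 256 = 2943 for the overall feature vector.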

#### *3.3. Feature Selection Using CHI Square Algorithm*

Since the image processing results in a high dimensional vector, we apply feature selection for dimensionality reduction and to improve the performance of the Machine-Learning classifiers. A high dimensional input vector increases the data storage and processing cost of training a classifier. As not all features contribute to the model's performance in the classification phase, those that do not can be removed from the training phase as well. This process is termed 'Dimensionality Reduction'.

Dimensionality reduction can be achieved by "measuring" the contribution of each of the features to the model's prediction performance. Those features that have insignificant contributions can be safely removed to enhance the training speed of the Machine-Learning model. Dimensionality reduction chooses those features, which are good contributors to the model performance, and hence this process is also called Feature Selection.

Various approaches for Feature Selection have been presented in the literature, such as Information Gain, Mutual Information, Principal Component Analysis and the Chi-Square test [47]. In our research, we chose to use the Chi-Square test, which results in a better prediction performance for our ML classifiers. The Chi-Square test is represented by the formula given in Equation (4):

$$\chi^2 = \sum\_{i} \left( O\_i - E\_i \right)^2 / E\_i \tag{4}$$

where

*χ*<sup>2</sup> = Chi-Squared value

*O<sub>i</sub>* = Observed value

*E<sub>i</sub>* = Expected value

In our case, the observed value could take one of the values of the input feature variable and the expected value would be another feature variable. If there is a strong correlation between them (that is, *χ*<sup>2</sup> is too low), then it is enough to consider only one of them and hence remove one feature. Similarly, all possible combinations of the feature variables can be compared and sorted according to their Chi-Square values. Then, we can choose those feature variables that have high Chi-Square values from the list.
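Equation (4) can be worked through on a small example. The sketch below assumes the common setup in which a binary feature is scored against the binary class label through a 2 × 2 contingency table; this is one usual way of applying the statistic and may differ from the pairing described above. The function names and toy data are hypothetical.

```python
def chi_square(observed, expected):
    """Equation (4): sum over cells of (O_i - E_i)^2 / E_i."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)


def chi_square_vs_label(feature, labels):
    """Score a binary feature against binary labels (assumed setup)."""
    n = len(feature)
    observed, expected = [], []
    for f in (0, 1):
        for c in (0, 1):
            # Observed count of samples with feature value f and class c.
            observed.append(
                sum(1 for x, y in zip(feature, labels) if x == f and y == c)
            )
            # Expected count under independence: row total * column total / n.
            row = sum(1 for x in feature if x == f)
            col = sum(1 for y in labels if y == c)
            expected.append(row * col / n)
    return chi_square(observed, expected)


# Toy data: the first feature tracks the label, the second is unrelated.
informative = chi_square_vs_label([1, 1, 1, 0, 0, 0], [1, 1, 1, 0, 0, 0])
uninformative = chi_square_vs_label([1, 0, 1, 0], [1, 1, 0, 0])
```

Under this setup, features are ranked by score and the highest-scoring ones are retained.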

#### **4. Experiments and Evaluation of the System**

#### *4.1. Dataset Description*

The ISCX botnet dataset obtained from [48] has been used to evaluate the proposed system. The dataset consists of 1929 botnet apps from 14 different families. We complemented this dataset with 2500 clean apps from different categories on the Google Play store and used VirusTotal for verification. Thus, our experiments were based on a total of 4429 applications, from which the images were generated and processed using the Bot-IMG framework. The clean applications can be made available to third parties on request.

#### *4.2. Evaluation Metrics*

The following metrics were used in measuring the performance of the models: accuracy, precision, recall and F1-score. All the results of the experiments are from 10-fold cross validation where the dataset is divided into 10 equal parts with 10% of the dataset held out for testing, while the models are trained from the remaining 90%. This is repeated until all of the 10 parts have been used for testing. The average of all 10 results is then taken to produce the final result. We also employed the 80:20 split approach where 80% of the samples were used for training and 20% for testing.
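The sample counts implied by the two evaluation protocols can be checked with a few lines of arithmetic. This is only a sketch: the way the remainder is distributed across folds and the rounding of the 80:20 split are assumptions, and actual tooling may differ slightly.

```python
N_SAMPLES = 1929 + 2500   # botnet apps + clean apps (from Section 4.1)


def kfold_test_sizes(n_samples, k=10):
    """Split n_samples into k nearly equal folds (assumed remainder rule)."""
    base, extra = divmod(n_samples, k)
    return [base + 1 if i < extra else base for i in range(k)]


test_sizes = kfold_test_sizes(N_SAMPLES)
train_sizes = [N_SAMPLES - t for t in test_sizes]

# 80:20 split, assuming the training share is rounded down.
split_train = int(0.8 * N_SAMPLES)
split_test = N_SAMPLES - split_train
```

With 4429 samples, each training fold holds 3986 or 3987 apps, in line with the training set size quoted later for the timing results.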

#### *4.3. Machine-Learning Classifiers*

In this section, we present a brief overview of the Machine-Learning classifiers that were used to distinguish between botnet and clean apps. In general, these algorithms are trained on labelled data (input), and the learned model is then used to estimate the target variable (output; in this case, malicious botnet or clean app).


#### **5. Results and Discussions**

In this section, we present the results of the experiments performed to evaluate the performance of the proposed scheme described in Section 3. The proposed scheme was implemented in Python and the following libraries were utilized: OpenCV, PIL, Scikit-learn, Scikit-image, Pandas, Keras, NumPy, Seaborn and Matplotlib. The experiments were performed on an Ubuntu Linux 16.04 64-bit machine with 8 GB RAM.

Six popular Machine-Learning classifiers were used to evaluate the proposed scheme. These include: K-Nearest Neighbor (KNN), Random Forest (RF), Support Vector Machines (SVM), Decision Trees (DT), Extra Trees (ET) and XGBoost (XGB). We implemented two other schemes for baseline comparison of the Machine-Learning classifier performance. The first baseline scheme was the original HOG scheme, where all the images in the training and test sets were resized to the standard 128 × 64 pixels, resulting in vectors of size 3780 that were used in training the models. The second baseline scheme used five segments to extract HOG descriptors in an identical way to our proposed scheme described by Algorithm 1 and used them to train the models without adding byte histograms or manifest features. We call the second baseline scheme the 'enhanced HOG' method.
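The vector size of 3780 for a 128 × 64 image follows from the usual HOG layout. The sketch below assumes the standard parameters (8 × 8 pixel cells, 2 × 2 cell blocks sliding one cell at a time, 9 orientation bins); the text does not list the parameters used, so these are assumptions that happen to reproduce the stated size.

```python
def hog_length(height, width, cell=8, block=2, bins=9):
    """Number of HOG descriptors for an image under the assumed layout."""
    cells_y, cells_x = height // cell, width // cell
    # Blocks slide one cell at a time, so each axis loses (block - 1) positions.
    blocks_y = cells_y - block + 1
    blocks_x = cells_x - block + 1
    return blocks_y * blocks_x * block * block * bins


original_hog_dim = hog_length(128, 64)   # 15 * 7 blocks * 4 cells * 9 bins
enhanced_hog_dim = 5 * 500               # five segments, 500 descriptors each
```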

In Table 1, the results of our proposed scheme using 10-fold cross validation are shown for the six Machine-Learning classifiers. We present the precision and recall for both the malicious botnet class (M) and the benign or clean class (C). Note that the F1-scores presented in the table are weighted values, due to the difference in the numbers of samples in each class. The table shows that all of the classifiers obtained an overall accuracy performance of 92.3% or above, indicating that our proposed approach enables the training of high-performing machine learning classifiers. The Extra Trees classifier had the highest weighted F1-score of 0.96, followed by Random Forest with 0.958 and XGBoost with 0.952. The lowest weighted F1-score was for KNN with 0.923, while SVM obtained a weighted F1-score of 0.926.
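The weighting of the F1-score by class size can be illustrated as follows. The class supports are the dataset sizes from Section 4.1; the precision and recall pairs are made-up placeholders chosen only to show the calculation and are not results from Table 1.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def weighted_f1(per_class):
    """Average per-class F1 scores weighted by each class's sample count."""
    total = sum(support for _, support in per_class)
    return sum(score * support for score, support in per_class) / total


SUPPORT_MALICIOUS, SUPPORT_CLEAN = 1929, 2500   # from the dataset description

# Hypothetical precision/recall values, for illustration only.
f1_malicious = f1_score(0.95, 0.94)
f1_clean = f1_score(0.96, 0.97)
overall = weighted_f1(
    [(f1_malicious, SUPPORT_MALICIOUS), (f1_clean, SUPPORT_CLEAN)]
)
```

Because the clean class is larger, its F1-score contributes more to the weighted value.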

The Extra Trees classifier had the best precision and recall values except in the case of malware recall, which was 94.2% compared to that of Random Forest, which had 94.4%. SVM had the lowest malware class recall of 92.1% while KNN had the lowest benign class recall of 90.7%.


**Table 1.** Classifier performance with permissions, byte histograms and HOG descriptors (10-fold cross validation results).

In Table 2, the results of the proposed scheme using a train–test split of 80:20 are presented. The table shows that all of the classifiers resulted in an overall accuracy of 93.7% or higher, with the Extra Trees classifier yielding an accuracy of 97.5%. The Extra Trees classifier had the highest weighted F1-score of 0.980, followed by XGBoost and Random Forest with 0.970 and Decision Trees with 0.950. SVM and KNN had the lowest weighted F1-score of 0.940. Extra Trees had the highest malware class recall of 97%, while SVM had the lowest recall of 92% for malware. Extra Trees, XGBoost and RF had the highest recall for the benign class with 98%, while SVM had the lowest, with 92%. Based on these results, the Extra Trees model is the classifier of choice for our proposed Android botnet detection system.

**Table 2.** Classifier performance with permissions, byte histograms and HOG descriptors (train–test split results).


The results presented in Tables 1 and 2 demonstrate the effectiveness of our proposed scheme. This is evident in the performance of the strongest and the weakest classifiers in the group. SVM and KNN were the weakest classifiers but still managed to yield quite high accuracies and F1-scores in both the 10-fold cross validation and the split-based evaluation. On the other hand, the strongest classifiers, Extra Trees, RF and XGBoost, produced results that were comparable to the state of the art in the literature.

It is possible that the few malicious botnets that were not detected by the system had characteristics that made them resemble benign apps, for example, botnets with relatively few permissions and intents, or those whose HOG representations were very close to those of benign training examples. This could be addressed in future work by extracting additional types of features or complementing our proposed method with alternative methods, for example through an ensemble or voting approach.

In Table 3 and Figure 3, we compare the performance of the proposed scheme with the two baseline schemes (HOG original and HOG enhanced) using the overall classification accuracy as the metric. The accuracies of each of the Machine-Learning classifiers for the compared schemes can be seen side by side in Figure 3. The proposed scheme outperforms the baseline schemes in all of the Machine-Learning classifiers.

From Table 3, the original HOG scheme reached its highest classification accuracy of 89.2% with the XGBoost classifier. This suggests that resizing the images during pre-processing has adverse effects on the performance of the models. The enhanced HOG scheme reached its highest accuracy of 92.7%, also with the XGBoost classifier. The scheme proposed in this paper, which additionally leverages byte histograms and encoded manifest features, led to significantly improved performance.

**Table 3.** Comparison of the baseline HOG schemes with the proposed method using the overall accuracy (10-fold cross validation results).


**Figure 3.** Overall classification accuracy for the various classifiers using the three compared schemes.

In Figure 4, the average training times for the samples trained during the 10-fold cross validation experiments are shown. Note that the training includes the feature selection step. The XGBoost classifier needed about 14.49 s to train 3987 samples in the training set, equivalent to an average of 3.6 milliseconds per sample. The rest of the classifiers were much faster and required significantly lower average training times for the training sets as shown in Figure 4.

The highest accuracy classifier, Extra Trees, needed an average of 1.72 s for the training sets, equivalent to 0.43 milliseconds per sample. In the pre-processing stage, the average amount of time taken to extract the features per application was approximately 1.37 s. These relatively low pre-processing and training times required per application indicate that the proposed approach is feasible in practice. The fact that we successfully utilized off-the-shelf Python libraries to build and evaluate the proposed system also indicates that commercial implementation is viable.

**Figure 4.** Average training times for the training set samples in seconds for each of the ML classifiers using our proposed scheme.

In addition to showing high performance with several classifiers, our results compare favorably with existing works. Due to variations in environments, datasets, numbers of samples, reported metrics, etc., direct comparison is not always possible. However, the results reported in this paper either exceed or are similar to what has been reported in recent related works.

For example, the work in [41] was based on the same dataset used in this paper but had a lower performance with 93.1% as the highest accuracy. The papers [33,38] also reported lower accuracies than our results. However, these works were based on a different dataset and used hand-crafted features. As mentioned before, such features on Android have the disadvantage of maintenance overhead in the long run. Moreover, it was shown in [55] that the performance of hand-crafted features used to build machine learning models declined over time.

#### **6. Conclusions and Future Work**

In this paper, we proposed a novel approach for the detection of Android botnets based on image and manifest features. The proposed approach removes the need for detection solutions to rely on extracting hand-crafted features from the executable file, which ultimately requires domain expertise. The system is based on a Histogram of Oriented Gradients (HOG) and additionally leverages a byte histogram and the manifest features. We implemented the system in Python and evaluated its performance using six popular Machine-Learning classifiers.

All of them exhibited good performance, with Extra Trees, XGBoost and Random Forest obtaining better performance than the state-of-the-art results. These results demonstrate the effectiveness of the proposed approach. An overall accuracy of 97.5% and an F1-score of 0.980 were observed with Extra Trees when evaluated with the 80:20 split approach, while a 96% accuracy and a 0.960 F1-score were observed when evaluated using a 10-fold cross validation approach. In future work, we plan to explore other types of image descriptors and investigate whether they could be leveraged to improve the performance of the HOG-based scheme.

**Author Contributions:** Conceptualization, S.Y.Y.; methodology, S.Y.Y.; software, S.Y.Y. and A.B.; validation, A.B. and S.Y.Y.; formal analysis, S.Y.Y.; investigation, S.Y.Y.; resources, A.B. and S.Y.Y.; data curation, S.Y.Y.; writing—original draft preparation, S.Y.Y.; writing—review and editing, A.B.; visualization, A.B.; supervision, A.B. and S.Y.Y.; project administration, A.B.; funding acquisition, A.B. and S.Y.Y. All authors have read and agreed to the published version of the manuscript.

**Funding:** There is no external funding.

**Institutional Review Board Statement:** Ethical review and approval were waived for this study, due to the usage of the dataset available from the public domain (University of New Brunswick) governed by the ethics and privacy laws at: https://www.unb.ca/cic/datasets/android-botnet.html, accessed on 28 December 2021.

**Informed Consent Statement:** Since the dataset was taken from University of New Brunswick (public domain) the informed consent was not applicable in our case.

**Data Availability Statement:** The botnet dataset used in this research work was taken from the public domain (University of New Brunswick) at: https://www.unb.ca/cic/datasets/android-botnet.html, accessed on 28 December 2021.

**Acknowledgments:** This work is supported in part by the 2021 Cybersecurity research grant from the Cybersecurity Center at Prince Mohammad Bin Fahd University, Al-Khobar, Saudi Arabia.

**Conflicts of Interest:** The authors declare no conflict of interest.

#### **References**


### *Article* **An Efficient Method for Generating Adversarial Malware Samples**

**Yuxin Ding \*, Miaomiao Shao, Cai Nie and Kunyang Fu**

Department of Computer Science and Technology, Harbin Institute of Technology (Shenzhen), Shenzhen 518000, China; 21B951007@stu.hit.edu.cn (M.S.); 19S051001@stu.hit.edu.cn (C.N.); 19S151080@stu.hit.edu.cn (K.F.)

**\*** Correspondence: yxding@hit.edu.cn; Tel.: +86-755-2603-2193

**Abstract:** Deep learning methods have been applied to malware detection. However, deep learning algorithms are not safe and can easily be fooled by adversarial samples. In this paper, we study how to generate malware adversarial samples using deep learning models. Gradient-based methods are usually used to generate adversarial samples. These methods generate adversarial samples case by case, which makes generating a large number of adversarial samples very time-consuming. To address this issue, we propose a novel method to generate adversarial malware samples. Unlike gradient-based methods, we extract feature byte sequences from benign samples. Feature byte sequences represent the characteristics of benign samples and can affect the classification decision. We directly inject feature byte sequences into malware samples to generate adversarial samples. Feature byte sequences can be shared to produce different adversarial samples, which makes it possible to generate a large number of adversarial samples efficiently. We compare the proposed method with random injection and gradient-based methods. The experimental results show that the adversarial samples generated using our proposed method have a high success rate.

**Keywords:** adversarial sample; malware detection; deep learning; convolutional neural network

#### **1. Introduction**

Deep neural networks have been successfully applied in different fields, such as computer vision and natural language processing. Recently, deep neural networks have gained attention to improve the performance of malware detection [1–4]. Deep learning algorithms can automatically learn features from training data, so malware detectors can implement end-to-end training based on it. Most of the approaches directly use binary Windows portable executable (PE) files as input data for the malware detection model to distinguish malicious and benign samples. The experimental results show that deep learning-based malware detectors can achieve high detection accuracy.

Despite their successful application in different fields, deep learning methods are sensitive to small perturbations in input samples. Szegedy et al. [5] found that small changes on input samples can cause classification errors. These perturbed samples are called adversarial samples. In the field of malware, similar methods have been proposed to evade malware detectors [6–8]. These methods are usually optimized by computing the gradient of the objective function, with respect to each byte of a source malware binary. Gradient-based methods generate adversarial samples case-by-case. Each time they only translate a source malware sample into a corresponding adversarial malware sample. If the number of padding bytes needed to inject into a malware is large, the time cost for generating an adversarial sample is very high. Therefore, these methods are not suitable for generating a large number of adversarial samples.

In this paper, we propose an efficient deep learning-based method for generating malware adversarial examples. We first extract the feature byte sequences from benign samples according to their importance. The importance of a sequence for classification is evaluated by a feature weight calculation method. Feature byte sequences are then injected into malware samples to generate adversarial samples. Since benign sequences can be stored in a database and shared by different malware samples, our proposed method can generate adversarial samples more efficiently. We tried two different strategies, end-of-file and mid-file, to inject binary sequences into a PE file. The experimental results show that the adversarial samples generated using our proposed method have a high success rate for attacking CNN-based malware detectors.

**Citation:** Ding, Y.; Shao, M.; Nie, C.; Fu, K. An Efficient Method for Generating Adversarial Malware Samples. *Electronics* **2022**, *11*, 154. https://doi.org/10.3390/electronics11010154

Academic Editor: Suleiman Yerima

Received: 13 December 2021 Accepted: 1 January 2022 Published: 4 January 2022

**Publisher's Note:** MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

**Copyright:** © 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

The rest of this paper is organized as follows. In Section 2, we introduce the related work. In Section 3, we present the research motivation and the method for generating malware adversarial examples. The experiments and discussions are described in Sections 4 and 5, respectively. Finally, we give our conclusions in Section 6.

#### **2. Related Work**

Deep learning methods have been widely applied in many fields and have achieved excellent experimental results. However, recent studies show that deep learning models are sensitive to small perturbations in the input data [6,9]. The data samples obtained after adding perturbations are called adversarial samples. Adversarial samples may cause deep learning algorithms to make wrong decisions. The methods for generating adversarial samples can be divided into two categories: black- and white-box algorithms. The white-box algorithms assume that attackers have detailed information about the structure and parameters of the deep learning model [5,10]. Such information can be exploited to calculate perturbations. For black-box algorithms [11,12], no information about the deep learning model is known. The perturbations of adversarial samples are usually computed based on the gradients of the loss function, with respect to the input data and a target label.

Goodfellow et al. [9] argued that adversarial samples are the result of the learning models being too linear, rather than too nonlinear, and proposed the fast gradient sign method (FGSM) to generate adversarial examples. They found that networks with hidden units which have unbounded activation functions simply respond by making their hidden unit activations very large, so it is better to only change the original input. Papernot et al. [12] proposed the Jacobian matrix-based method (JSMA) to generate adversarial samples. JSMA constructs adversarial samples by computing forward derivatives of a deep neural network. This method uses knowledge of the network architecture to create adversarial saliency maps. The saliency maps indicate which input features an adversary should perturb in order to impact the classification output. Xiao et al. [13] proposed an optimization framework for the attacker to find the near-optimal label flips that maximally reduce the classification performance. The framework simultaneously models the adversary's attempt and the defender's reaction in a loss minimization problem. Based on this framework, they developed an algorithm for attacking support vector machines (SVMs). Moosavi-Dezfooli et al. [11] proposed Deepfool, which is based on an iterative linearization of the classifier to generate minimal perturbations that are sufficient to change class labels. The experimental results show that Deepfool can generate smaller perturbations than those generated by FGSM.
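To make the gradient-sign idea concrete, the sketch below applies an FGSM-style step to a toy logistic-regression classifier. The model, its weights, and the step size are hypothetical and chosen only for illustration; real attacks operate on deep networks and obtain the input gradient from an automatic differentiation framework.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def predict(x, w, b):
    """Probability of class 1 under a toy logistic-regression model."""
    return sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)


def fgsm_step(x, y, w, b, eps):
    """Move x by eps in the sign direction of the loss gradient."""
    p = predict(x, w, b)
    # Gradient of the cross-entropy loss with respect to the input x.
    grad = [(p - y) * wi for wi in w]
    sign = [(g > 0) - (g < 0) for g in grad]
    return [xi + eps * s for xi, s in zip(x, sign)]


# Hypothetical model and input, true label y = 1.
w, b = [2.0, -1.0], 0.0
x = [1.0, 1.0]
x_adv = fgsm_step(x, y=1, w=w, b=b, eps=0.5)
p_before, p_after = predict(x, w, b), predict(x_adv, w, b)
```

In this toy case a single step lowers the predicted probability of the true class below 0.5.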

Sometimes attackers cannot obtain detailed knowledge about the deep learning model. For example, only the network outputs on certain inputs can be observed. In these cases, black-box algorithms are applied to adversarial sample generation. The black-box attack was first proposed by Papernot et al. [14]. They trained a substitute network to fit the unknown neural network, and then generated adversarial examples using the substitute neural network [12]. The substitute network is a simulator of the target network. Therefore, the success of the black-box attack depends on the transferability property holding between the target and substitute networks. Liu et al. [15] conducted an extensive study of the transferability over large models and a large-scale dataset. Their results show that the transferability for non-targeted adversarial samples is prominent, even for large models and a large-scale dataset. They also presented novel, ensemble-based approaches to generate transferable adversarial samples.

In the malware detection field, different black- and white-box algorithms have also been presented. Unlike images, executables have semantic dependencies between bytes, so any modification to a byte value may make the executable unable to run or cause it to lose its intrusive functionality. To avoid this problem, some methods [7,16] generate adversarial malware samples by appending specific bytes at the end of executables. The input size of a deep learning-based detector is fixed. If the size of an executable is bigger than the fixed size, it cannot be used to generate an adversarial sample. To solve this issue, padding bytes can be injected into the gaps between sections in a PE file [17].

Hu and Tan [18] used a generative adversarial network to generate adversarial samples. They constructed a substitute detector to fit the black-box malware detector. Then, the generative adversarial network is trained to minimize the probability that the generated adversarial samples are predicted as malware by the substitute detector. Al-Dujaili et al. [19] investigated methods that reduce adversarial blind spots for DNN-based detectors. They treated the task as a saddle-point optimization problem and used inner maximization methods to improve the robustness of the DNN. Hu and Tan [20] proposed a black-box algorithm to evade an RNN-based detector. They trained a substitute RNN to approximate the victim RNN, then used a generative RNN to output sequential adversarial samples. Chen et al. [21] proposed an adversarial crafting algorithm based on the Jacobian matrix to generate adversarial samples.

Bojan et al. [16] proposed a white-box algorithm for evading the deep learning-based detector MalConv [3]. The algorithm is a gradient-based method which aims to minimize the confidence associated with the malicious class. To preserve the intrusive functionality of an executable, they appended padding bytes at the end of each malware sample. Suciu et al. [7] also proposed a white-box algorithm to evade the MalConv model. Based on FGSM, they proposed the one-shot FGSM append attack. The algorithm uses the gradient value of the classification loss with respect to the target label to update the appended byte values.

Apart from the above-mentioned malware adversarial sample generation methods, there are some other methods. Kreuk et al. [22] proposed generating adversarial examples by appending a small section to the malware binary file. Peng et al. [23] used a generative adversarial network to generate semantics-aware adversarial malware samples, which can fool the detection algorithms. They trained a BiLSTM-based recurrent neural network as a substitute detector to fit the black-box malware detector. In [24], the authors proposed two white-box methods and one black-box method to attack the CNN-based malware detector MalConv [3]. Recently, Chen et al. [25] used deep reinforcement learning to generate malware adversarial examples with a high success rate. A comparison of typical methods for generating adversarial samples is given in Table A1 (see Appendix A).

#### **3. Methodology for Generating Adversarial Malware Examples**

#### *3.1. Motivations*

Different deep learning-based detectors have been proposed [3,20,26]. As one of the most popular algorithms in deep learning, the convolutional neural network (CNN) is widely applied in these detectors. Since a CNN can automatically learn features from training samples, these detectors directly use a binary executable file as input and classify it. In our work we focus on how to generate adversarial samples which can evade CNN-based malware detectors. The problem of generating adversarial malware samples can be formalized as follows.

An executable *x* is represented as a sequence of *L* binary bytes *x* = (*x*<sub>1</sub>, *x*<sub>2</sub>, · · · , *x<sub>L</sub>*), where *x<sub>i</sub>* is between 0 and 255 and *L* is the length of an executable. In our work we set *L* = 2 × 10<sup>6</sup>. If the length of an executable is less than 2 × 10<sup>6</sup>, zeros are padded at the end of the file. The malware detector is denoted as *f<sub>θ</sub>*(*x*) : *x* → [0, 1], where *θ* denotes the parameters of the detector, and *f<sub>θ</sub>*(*x*) outputs the probability that *x* is malware. If *f<sub>θ</sub>*(*x*) > 0.5, *x* is classified as malware; otherwise, *x* is classified as benign.
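The input representation and decision rule above can be sketched in a few lines. The helper names are hypothetical, and truncating files longer than *L* is an assumption made only so that the function is total; the text specifies zero padding for short files only.

```python
L = 2_000_000   # fixed input length used in the text (2 x 10^6 bytes)


def to_fixed_length(data: bytes, length: int = L) -> bytes:
    """Zero-pad an executable's bytes at the end up to the fixed length."""
    # Assumption: longer inputs are truncated; the text does not cover them.
    return data[:length].ljust(length, b"\x00")


def classify(probability: float) -> str:
    """Apply the 0.5 threshold to the detector output f_theta(x)."""
    return "malware" if probability > 0.5 else "benign"


# Hypothetical two-byte stand-in for a PE file.
padded = to_fixed_length(b"MZ")
```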

Given a malicious file which is correctly classified as malware, an adversarial sample generation method can inject carefully selected bytes into the executable (while preserving its runtime functionality), so that the executable is classified as benign.

Conventional methods use gradient-based algorithms to generate adversarial samples [7,16]. These approaches use the input gradient value to update the injected byte values. The gradient value is calculated by minimizing the classification loss function of a detector with respect to the target label. The gradient-based algorithm is iterative, and only one byte value is computed per iteration. Therefore, the computation cost for generating an adversarial sample is high, which makes these methods unsuitable for generating a large number of adversarial examples. The motivation of our research is to design a method which can generate adversarial samples efficiently.

#### *3.2. Finding Data Area Important for Classification*

*Electronics* **2022**, *11*, x FOR PEER REVIEW 4 of 17


To evade the detection of malware detectors, we need to inject padding bytes into a source malware binary to change its category. To avoid using gradient-based algorithms to calculate the values of injected padding bytes, the padding bytes we use are the byte sequences extracted from benign executables. If these byte sequences can represent the characteristics of benign executables, the probability that an adversarial malware sample can fool a detector will increase. Therefore, our main task is to extract byte sequences which can represent the characteristics of benign executables.

CNN-based detectors generate explicit feature maps for input samples. Figure 1 gives an example for CNN convolution operation. The input data is a sequence. When we apply convolution to the input data, we mix two buckets of information. The first bucket is the input data. The second bucket is the convolution kernel, a single matrix of floating-point numbers. The output of the kernel is the altered sequence which is often called a feature map. Usually there are multiple convolution kernels and each kernel outputs a feature map. Feature maps represent features of an input data at different level. Through analyzing feature maps, we can discover which features are more important for decision making, and the data corresponding to important features can be used to construct adversarial samples. an example for CNN convolution operation. The input data is a sequence. When we apply convolution to the input data, we mix two buckets of information. The first bucket is the input data. The second bucket is the convolution kernel, a single matrix of floating‐point numbers. The output of the kernel is the altered sequence which is often called a feature map. Usually there are multiple convolution kernels and each kernel outputs a feature map. Feature maps represent features of an input data at different level. Through analyz‐ ing feature maps, we can discover which features are more important for decision making, and the data corresponding to important features can be used to construct adversarial samples.

**Figure 1.** Convolution of a sequence with a convolution kernel.

The Grad-CAM algorithm [27] provides explanations for decisions from a large class of CNN-based models. We use the Grad-CAM algorithm to evaluate the importance value of each feature map for a target class *c*. The importance value of a feature map with respect to a specific class is computed as in Equation (1), where *α<sup>c</sup><sub>k</sub>* indicates the importance of *FeatureMap<sub>k</sub>* with respect to class *c*.

$$\alpha\_k^c = \frac{1}{\text{Len\\_FeatureMap}\_k} \sum\_i \frac{\partial S^c}{\partial FeatureMap\_k[i]} \tag{1}$$

where *FeatureMap<sub>k</sub>* is the *k*th feature map, *FeatureMap<sub>k</sub>*[*i*] is the *i*th element of *FeatureMap<sub>k</sub>*, *Len*\_*FeatureMap<sub>k</sub>* is the number of elements of *FeatureMap<sub>k</sub>*, *c* is a class label, and *S<sup>c</sup>* is the input for class *c* in the softmax layer (the classification layer in a CNN).

*Electronics* **2022**, *11*, x FOR PEER REVIEW 5 of 17

To discover the important area of the input data for class *c*, the contributions of all feature maps need to be considered. The weighted sum of all feature maps is computed as defined in Equation (2). *L<sup>c</sup>* is called the class-discriminative localization map and has the same size as a feature map.

$$L^c = \text{ReLU}\left(\sum\_k \alpha\_k^c FeatureMap\_k\right) \tag{2}$$

In Equation (2), the ReLU function (ReLU(*x*) = max(0, *x*)) is applied to the linear combination of feature maps because only the features that have a positive impact on class *c* are considered. Without the ReLU function, the localization map sometimes highlights more than just the class of interest and performs worse at localization. Each element *L<sup>c</sup>*[*i*] can be seen as a feature extracted from the input data. An element *L<sup>c</sup>*[*i*] with a greater value has a more positive impact on class *c*. We can find the data area that is important for class *c* by mapping *L<sup>c</sup>*[*i*] back to the corresponding data area in the input.
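Equations (1) and (2) can be sketched as follows, assuming the gradients of the class score with respect to every feature map element have already been obtained from the network (how they are computed depends on the framework and is not shown). The function name `gradcam_1d` and the toy numbers are illustrative assumptions.

```python
def gradcam_1d(feature_maps, gradients):
    """Return (alphas, localization map) for one-dimensional feature maps.

    feature_maps[k][i] is FeatureMap_k[i]; gradients[k][i] is assumed to hold
    the partial derivative of S^c with respect to FeatureMap_k[i].
    """
    # Equation (1): average the gradients over each feature map.
    alphas = [sum(grad) / len(grad) for grad in gradients]
    localization = []
    for i in range(len(feature_maps[0])):
        weighted = sum(a * fmap[i] for a, fmap in zip(alphas, feature_maps))
        # Equation (2): ReLU keeps only positive evidence for class c.
        localization.append(max(0.0, weighted))
    return alphas, localization


feature_maps = [[1.0, 2.0, 0.0], [0.0, 1.0, 4.0]]
gradients = [[0.5, 0.5, 0.5], [-1.0, 0.0, -0.5]]
alphas, loc_map = gradcam_1d(feature_maps, gradients)
```

In this toy case the third position is dominated by a feature map with a negative weight, so the ReLU sets its localization value to zero.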

#### *3.3. Generating Adversarial Examples*

In reality, the structure and parameters of a malware detector are unknown. To obtain the feature maps, we have to create a pseudo detector that simulates the true detector. MalConv [3] is a typical CNN-based detector. In our work, we select the MalConv network as the pseudo detector. The network structure of MalConv is shown in Figure 2.

**Figure 2.** Structure of MalConv.

We regard an executable (PE file) as a byte stream. The input of MalConv is a fixed-length sequence from a PE file. If an executable is shorter than the fixed length, zeros are appended to the end of the executable. In MalConv, the first layer is an embedding layer, where each byte of an input sequence is converted into an 8-dimensional embedding vector. MalConv has two parallel convolutional layers. The embedding vectors are passed to these two one-dimensional convolutional layers, each of which generates feature maps. The next layer is a temporal max pooling layer, which combines the outputs of the two convolutional layers and passes them to a fully connected layer and a softmax layer for classification.
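The input preparation described above can be sketched in a few lines. The zero padding follows the text; truncating files that are longer than the fixed length is an assumption of this sketch, and `to_fixed_length` is a hypothetical helper name.

```python
def to_fixed_length(data: bytes, length: int) -> bytes:
    """Return exactly `length` bytes: zero-padded at the end, or truncated.

    Truncation of over-long inputs is an assumption made for this sketch.
    """
    return data[:length].ljust(length, b"\x00")


padded = to_fixed_length(b"MZ\x90", 8)
```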

In our paper, we use Equation (1) to calculate the importance value of each feature map with respect to class *c*, denoted as *α<sup>c</sup><sub>l,k</sub>*, which is the importance value of the *k*th feature map generated from the *l*th convolutional layer, *FeatureMap<sub>l,k</sub>*. MalConv has two parallel convolutional layers. We normalize *α<sup>c</sup><sub>l,k</sub>* for each convolutional layer independently, as shown in Equation (3).

$$w\_{l,k}^c = \frac{\alpha\_{l,k}^c}{\sum\_k \alpha\_{l,k}^c} \tag{3}$$

The class-discriminative localization map is calculated as the weighted sum of the feature maps generated by the two parallel convolutional layers, as shown in Equation (4). Here, we set all convolution kernels to have the same size; thus, all feature maps, as well as the class-discriminative localization map, are one-dimensional vectors of the same size. Different CNN-based networks have different structures. Another key problem we should resolve is how to locate the byte sequences in a source binary file according to the class-discriminative localization map.

$$L^c = \text{ReLU}\left(\sum\_{l} \sum\_{k} w\_{l,k}^c FeatureMap\_{l,k}\right) \tag{4}$$
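A small sketch of Equations (3) and (4) for two parallel layers is given below. It assumes the importance values of each layer do not sum to zero, and the function name `localization_map` and the toy values are illustrative only.

```python
def localization_map(feature_maps, alphas):
    """Combine feature maps from several layers into one localization map.

    feature_maps[l][k] is the k-th feature map of layer l (a list of floats);
    alphas[l][k] is the matching importance value from Equation (1).
    Assumes sum(alphas[l]) != 0 for every layer.
    """
    total = [0.0] * len(feature_maps[0][0])
    for layer_maps, layer_alphas in zip(feature_maps, alphas):
        norm = sum(layer_alphas)  # denominator of Equation (3), per layer
        for fmap, alpha in zip(layer_maps, layer_alphas):
            weight = alpha / norm
            for i, value in enumerate(fmap):
                total[i] += weight * value
    # ReLU of Equation (4).
    return [max(0.0, value) for value in total]


feature_maps = [[[1.0, 2.0], [3.0, 0.0]], [[0.0, 4.0], [2.0, 2.0]]]
alphas = [[1.0, 3.0], [1.0, 1.0]]
loc_map = localization_map(feature_maps, alphas)
```

Normalizing within each layer keeps one layer with large raw gradients from dominating the other in the combined map.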

A MalConv model has two independent convolutional layers, and each convolutional layer has multiple convolution kernels. To simplify data mapping, we set the kernel length equal to the kernel's moving stride, give all kernels the same length, and fix the length of the input data at 2 × 10<sup>6</sup> bytes. The mapping relationship between a feature map and the input data can be constructed as follows.

In [3], the authors tried different parameter settings to test the performance of MalConv. We followed [3] and set both the length and the moving stride of a kernel to 500, and the number of kernels in each convolutional layer to 128. Figure 3 shows the relationship between the input data and a feature map. In Figure 3, each square in the first row represents an input byte, and each square in the second row represents the embedding vector of an input byte. Kernel1 is a one-dimensional convolution kernel of a convolutional layer, whose length is 500. Kernel1 is convolved across the embedding data, computing the dot product between the entries of the kernel and the embedding data and producing a one-dimensional feature map *FeatureMap*<sub>1</sub>. If each convolutional layer has 128 kernels, we obtain 128 one-dimensional feature maps from one convolutional layer. The embedding data has the same length as the input data. Therefore, each feature map has 2 × 10<sup>6</sup>/500 = 4000 elements. In Figure 3, the fourth row shows the mapping relationship between an element of a feature map and a byte sequence in the input data. For example, the first element of *FeatureMap*<sub>1</sub>, *FeatureMap*<sub>1</sub>[1], is calculated by convolving Kernel1 with the first five hundred elements of the embedding data, and each input byte corresponds to one element of the embedding data. Therefore, *FeatureMap*<sub>1</sub>[1] is related to the first five hundred bytes of the input data. The class-discriminative localization map is the weighted sum of all feature maps, so it has the same mapping relationship as a feature map.

**Figure 3.** Mapping feature map back to raw data.
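Because the kernel length equals the stride, the mapping from a feature map element back to the raw bytes is a simple multiplication. The sketch below uses 0-based indices (the text counts from 1), and the constant and function names are assumptions for illustration.

```python
INPUT_LENGTH = 2_000_000  # length of the input data in bytes
KERNEL_LENGTH = 500       # kernel length, equal to the moving stride


def byte_range(index, kernel_length=KERNEL_LENGTH):
    """Half-open byte range [start, end) covered by a 0-based feature map index."""
    start = index * kernel_length
    return start, start + kernel_length


# Non-overlapping windows: one feature map element per 500 input bytes.
num_elements = INPUT_LENGTH // KERNEL_LENGTH
first_window = byte_range(0)
last_window = byte_range(num_elements - 1)
```

The same mapping applies to the class-discriminative localization map, since it is a weighted sum of feature maps of identical length.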

To generate adversarial examples, we first train a MalConv model as the pseudo detector. Then, we create a dataset for feature extraction. All samples in the dataset are benign samples that are correctly classified as benign by a detector. We input a sample to the pseudo detector and obtain the class-discriminative localization map *L<sup>c</sup>* of the sample. According to the mapping relationship between the input data and the class-discriminative localization map, we can extract the byte sequences from the input data that represent the features of a sample. We usually extract the byte sequences corresponding to the elements having the greatest value in *L<sup>c</sup>*. We call these byte sequences feature byte sequences; they can be stored and shared by different adversarial samples. When generating an adversarial example, we randomly select one or multiple sequences and inject them into a malware sample.

Unlike adversarial samples of images, feature byte sequences injected into a malware sample should have concrete program semantics. Sometimes the head and tail of a feature byte sequence are separated from other bytes of a program and cannot represent complete program semantics. In this case, we should extend the feature byte sequence to include the separated parts. For example, a feature byte sequence (bytes in the box) extracted according to the mapping relationship is shown in Figure 4. The decompiled code of the binary bytes is shown in Figure 5. We can see that the head byte FF and the tail byte 45 cannot represent correct program semantics. To generate a feature byte sequence having correct program semantics, we should extend the feature byte sequence to include 8B and 08. From this, we can see that the injected byte sequences generated using our method are explainable.

**Figure 4.** A sample of a feature byte sequence.


**Figure 5.** Decompiling codes of a binary byte sequence.

To more accurately locate the important area in the input data, we train several MalConv models with different parameter settings and combine the class-discriminative localization maps from all MalConv models. Algorithm 1 gives the algorithm for extracting feature byte sequences from input data using multiple detection models. The length of convolution kernels in different MalConv models can be different. For the convenience of extracting feature byte sequences, we define a new data structure *byteWeightMap*. It is a vector having the same length as the input data. Each element in *byteWeightMap* records the importance value of the corresponding byte of the input data. The importance values of input bytes are assigned according to *L<sup>c</sup>*. According to the mapping relationship, we can find the byte sequence corresponding to *L<sup>c</sup>*[*i*] (the *i*th element of *L<sup>c</sup>*); then, the elements of *byteWeightMap* corresponding to that byte sequence are set to *L<sup>c</sup>*[*i*]. The function *SetByteWeight*() implements this step. Because multiple models are used to locate feature byte sequences, we use *L<sup>c</sup><sub>i</sub>* and *byteWeightMap<sub>i</sub>* to represent the class-discriminative localization map and the *byteWeightMap* generated from model *M<sub>i</sub>* (the *i*th detector). The vector *fByteWeightMap* is the sum of all *byteWeightMap<sub>i</sub>* and stores the final importance value of each byte of the input data.

In Algorithm 1, *ModelNum* is the number of models, and *thresh* gives the threshold of the importance value for selecting feature sequences. *x<sub>benign</sub>* is the input data. The function *GetFeatureMap*() returns all the feature maps generated by model *M<sub>i</sub>*. *FeatureMap<sub>j,k</sub>*[*n*] is the *n*th element of the *k*th feature map generated by the *j*th convolutional layer of a MalConv model. The function *ExtFeaSeq*() extracts all bytes whose importance values are greater than *thresh* from *x<sub>benign</sub>*, according to the vector *fByteWeightMap*. Contiguous bytes that have the same importance value constitute a feature byte sequence. Figure 6 shows an example of how feature byte sequences are extracted from input data. We set *thresh* to 50; therefore, only two feature byte sequences (the sequences in black boxes) are extracted from the input data.

**Algorithm 1:** Extracting feature byte sequences of a benign sample

**Input:** *x<sub>benign</sub>*, the models *M*<sub>1</sub>, …, *M<sub>ModelNum</sub>*, the class *c*, and *thresh*

**Output:** the feature byte sequences of *x<sub>benign</sub>*

1. Set *fByteWeightMap* to the zero vector.
2. **for** *i* = 1 to *ModelNum* **do**
   1. Set *byteWeightMap<sub>i</sub>* to the zero vector and obtain the feature maps that model *M<sub>i</sub>* generates for *x<sub>benign</sub>* with *GetFeatureMap*().
   2. **for** each *FeatureMap<sub>j,k</sub>* **do** compute *α<sup>c</sup><sub>j,k</sub>* as in Equation (1) **end**
   3. **for** each *α<sup>c</sup><sub>j,k</sub>* **do** normalize it as in Equation (3) and add the weighted feature map to a running sum **end**
   4. Apply ReLU to the running sum to obtain *L<sup>c</sup><sub>i</sub>*, as in Equation (4).
   5. Set *byteWeightMap<sub>i</sub>* = *SetByteWeight*(*L<sup>c</sup><sub>i</sub>*).
   6. Set *fByteWeightMap* = *fByteWeightMap* + *byteWeightMap<sub>i</sub>*.
3. **end**
4. Return *ExtFeaSeq*(*fByteWeightMap*, *thresh*, *x<sub>benign</sub>*).
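The aggregation and extraction steps of Algorithm 1 can be sketched as follows. This is an illustration under stated assumptions rather than the authors' code: the localization maps are passed in precomputed (no model is run), each model's kernel length equals its stride, and the Python names `set_byte_weight`, `ext_fea_seq`, and `extract_feature_sequences` are hypothetical counterparts of *SetByteWeight*(), *ExtFeaSeq*(), and the outer loop.

```python
def set_byte_weight(loc_map, kernel_length, input_length):
    """Spread each localization value over the bytes its window covers."""
    weights = [0.0] * input_length
    for i, value in enumerate(loc_map):
        start = i * kernel_length
        for pos in range(start, min(start + kernel_length, input_length)):
            weights[pos] = value
    return weights


def ext_fea_seq(weights, thresh, data):
    """Return contiguous runs of bytes whose equal weight exceeds `thresh`."""
    sequences, start = [], None
    for pos in range(len(weights) + 1):
        in_run = pos < len(weights) and weights[pos] > thresh
        # Close the current run at a sub-threshold byte or a change of weight.
        if start is not None and (not in_run or weights[pos] != weights[start]):
            sequences.append(data[start:pos])
            start = None
        if in_run and start is None:
            start = pos
    return sequences


def extract_feature_sequences(data, loc_maps, kernel_lengths, thresh):
    """Sum the per-model byte weight maps, then extract feature sequences."""
    final = [0.0] * len(data)
    for loc_map, kernel_length in zip(loc_maps, kernel_lengths):
        weights = set_byte_weight(loc_map, kernel_length, len(data))
        final = [a + b for a, b in zip(final, weights)]
    return ext_fea_seq(final, thresh, data)


# Two hypothetical models with kernel lengths 2 and 4 on an 8-byte input.
data = b"ABCDEFGH"
loc_maps = [[10.0, 40.0, 0.0, 50.0], [20.0, 5.0]]
sequences = extract_feature_sequences(data, loc_maps, [2, 4], thresh=50.0)
```

In this toy run the summed weights are 30, 30, 60, 60, 5, 5, 55, 55, so a threshold of 50 selects two sequences, mirroring the two-sequence example in the text.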

#### *3.4. Strategies for Injecting Feature Sequences*

A malware adversarial sample should preserve the semantics of its source file. This requires that no byte in the source executable be changed. Therefore, feature sequences should be injected into spare space in an executable, that is, space that is never executed. Two strategies can be adopted to locate spare space in an executable: mid-file and end-of-file injection. We apply both strategies to generate adversarial samples in our work.

Mid-file injection: we locate the gaps between neighboring PE sections by parsing the PE file header. The gaps are left by the compiler, since the physical size allocated to a PE section is greater than its virtual size. The length of a gap is calculated as RawSize − VirtualSize. The start offset of a gap is computed as PointerToRawData (the offset address of a section) + VirtualSize. We collect the start offset and length of each gap in an executable, then inject feature byte sequences of appropriate length into these gaps.
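The gap computation above can be sketched without a PE parsing library by assuming the section header fields have already been read. The `Section` tuple, the helper names, and the example values are hypothetical; a real tool would take these fields from the section table.

```python
from typing import NamedTuple


class Section(NamedTuple):
    name: str
    pointer_to_raw_data: int  # file offset of the section
    raw_size: int             # physical size on disk
    virtual_size: int         # size actually used in memory


def find_gaps(sections):
    """Return (start_offset, length) for every section with slack space."""
    gaps = []
    for section in sections:
        length = section.raw_size - section.virtual_size
        if length > 0:
            gaps.append((section.pointer_to_raw_data + section.virtual_size, length))
    return gaps


def inject(data: bytes, gaps, payload: bytes) -> bytes:
    """Overwrite gap bytes with payload bytes; the file size is unchanged."""
    out = bytearray(data)
    offset = 0
    for start, length in gaps:
        chunk = payload[offset:offset + length]
        out[start:start + len(chunk)] = chunk
        offset += len(chunk)
    return bytes(out)


sections = [Section(".text", 0x400, 0x1000, 0xE20), Section(".data", 0x1400, 0x200, 0x200)]
gaps = find_gaps(sections)
patched = inject(bytes(0x1600), gaps, b"\x90" * 10)
```

Only the slack after the used part of each section is written, so bytes that the program actually uses stay untouched.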

End-of-file injection: another strategy we use is adding new sections at the end of a PE file and injecting feature byte sequences into the newly added sections. Since the new sections are not accessed by program code, the semantics of the original PE file are preserved. The process of adding a new section block includes the following steps. First, we modify the bytes that store the number and size of sections in the PE file header and update the values of file alignment and section alignment. Then, we use the offset address of the last section block plus the offset address of the new block as the final offset address. Next, we set the attribute values of the new section, such as the section name, execution attributes, size on disk, and size in memory. Finally, we modify the offset address of the aligned section and the offset address of the file in the section table and modify the size of image in the PE header.
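The offsets of an appended section are governed by the alignment values in the PE header. The sketch below applies the usual PE layout rules as we understand them and is not necessarily the authors' exact procedure; the function names, the default alignments (0x200 and 0x1000), and the example values are assumptions.

```python
def align_up(value: int, alignment: int) -> int:
    """Round `value` up to the next multiple of `alignment`."""
    return (value + alignment - 1) // alignment * alignment


def new_section_layout(last_raw_ptr, last_raw_size, last_virt_addr, last_virt_size,
                       payload_len, file_alignment=0x200, section_alignment=0x1000):
    """Compute where a section appended after the last one would be placed."""
    raw_ptr = align_up(last_raw_ptr + last_raw_size, file_alignment)
    raw_size = align_up(payload_len, file_alignment)
    virt_addr = align_up(last_virt_addr + last_virt_size, section_alignment)
    size_of_image = align_up(virt_addr + payload_len, section_alignment)
    return {
        "pointer_to_raw_data": raw_ptr,
        "size_of_raw_data": raw_size,
        "virtual_address": virt_addr,
        "virtual_size": payload_len,
        "size_of_image": size_of_image,
    }


layout = new_section_layout(0x1400, 0x200, 0x3000, 0x150, payload_len=1000)
```

The section count in the file header and the new section table entry would still need to be written separately, as the steps above describe.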

Similar to [17], our method adopting mid-file injection generates adversarial samples by injecting perturbed bytes into the gaps between neighboring PE sections. The method adopting end-of-file injection generates malware adversarial examples by adding new sections at the end of a PE file, which is similar to previous methods [7,16,22,24]. However, all these methods [7,16,17,22,24] are gradient-based: they are optimized by computing the gradient of the objective function with respect to each byte of a source malware binary. The gradient-based algorithm is iterative, and only one byte value is computed per iteration. Generating an adversarial malware sample with a gradient-based method therefore takes a long time, so it is not suitable for generating a large number of adversarial samples. To avoid using gradient-based algorithms to calculate the values of injected padding bytes, our methods use byte sequences extracted from benign executables to generate adversarial samples. In addition, our methods aim to evade CNN-based malware detectors, which is similar to [23]. We make a more detailed comparison between our method and the gradient-based method [16] in Sections 4 and 5.

#### **4. Experiments**

#### *4.1. Dataset Description*

The malware samples we used came from the VirusShare project at http://virusshare.com/ (accessed on 1 December 2021). We downloaded 20,000 malicious samples whose sizes were between 1 KB and 5 MB. The benign samples were collected from Windows platforms; we collected 20,000 benign Windows PE files in total. Two criteria were used to assess the quality of adversarial samples. The first is the success rate (SR) of the adversarial attack, defined as the percentage of adversarial samples that evade a detector. The second is the time cost of generating adversarial samples, which is used to evaluate the efficiency of the proposed algorithm. The experimental environment was a 64-bit Ubuntu 14 operating system running on an Intel® Xeon Silver 4116 CPU with 256 GB of memory.

#### *4.2. Experimental Results*

In the experiments, we trained four MalConv detectors. The description of parameter setting, training data, and detection accuracy is shown in Table 1. In Table 1, the column "Kernel Number" gives the kernel number for each convolutional layer. The training samples included fifty percent benign files and fifty percent malicious files, i.e., 5000 benign files and 5000 malicious files. The accuracy is defined as the percentage of the testing samples that can be correctly classified.


**Table 1.** Parameter setting for detectors.

To objectively evaluate the success rate with which adversarial examples evade detection, in each experiment we selected one MalConv model as the detector and used the remaining models to generate feature byte sequences. We repeated the experiment four times and used the average success rate of the four experiments to evaluate the performance of the proposed method. For each experiment, we randomly chose 100 benign samples from the testing set and used Algorithm 1 to extract the feature sequences from them. Only the sequences with the highest importance value in each sample were selected. We obtained about two thousand feature sequences per experiment. We randomly selected 1000 samples from the testing set that were correctly classified as malware and injected feature sequences into them to generate adversarial samples.
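The leave-one-model-out protocol and the SR metric can be sketched as below. The model labels are placeholders, and `leave_one_out` and `success_rate` are hypothetical helper names; training the models and generating the samples are not shown.

```python
def leave_one_out(models):
    """Yield (detector, generators): one model detects, the rest extract features."""
    for i, detector in enumerate(models):
        yield detector, models[:i] + models[i + 1:]


def success_rate(evaded):
    """Percentage of adversarial samples that the detector labels as benign."""
    evaded = list(evaded)
    return 100.0 * sum(evaded) / len(evaded)


models = ["M1", "M2", "M3", "M4"]  # placeholders for four trained detectors
splits = list(leave_one_out(models))
rate = success_rate([True, False, True, True])
```

Averaging the rate over the four splits gives the figure reported for each setting, so no model is evaluated on sequences it helped to select.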

To observe how the number of injected bytes affects the performance of the proposed method, we injected different numbers of bytes into a sample. The number of the injected bytes was set to 1000, 2000, 5000, 10,000, and 20,000, respectively. In our work, two injection strategies were applied to inject feature byte sequences.

The experimental results for the mid-file and end-of-file strategies are shown in Tables 2 and 3, respectively. In both tables, "Avg Time Cost Per Sample" means the time cost of generating one adversarial sample.


**Table 2.** SR of the proposed method adopting the mid-file strategy.

**Table 3.** SR of the proposed method adopting the end-of-file strategy.


To verify whether the feature sequences can represent the characteristics of benign executables, we compared the proposed method with the randomly injecting method. The randomly injecting method randomly extracts byte sequences from benign executables and injects them into malware to generate adversarial samples. For the randomly injecting methods, we also used two different strategies to inject randomly extracted sequences. The experimental results are shown in Tables 4 and 5, respectively.



From Tables 2–5, we can see that the success rate of the proposed method was significantly higher, by about 30–60%, than that of the corresponding randomly injecting method. This indicates that the feature sequences injected into adversarial samples reflect the characteristics of benign executables and can influence the decision of the detectors. The injected sequences were extracted from benign executables. The more benign sequences are injected into a malware sample, the more similar it becomes to a benign sample. Therefore, for both methods, the success rate increased as the number of injected bytes increased.


**Table 5.** SR of the randomly injecting method adopting end-of-file strategy.

With the end-of-file strategy, all malicious features in malware samples are preserved unmodified. By contrast, the mid-file strategy injects feature sequences into the gaps between sections, which destroys some malicious features of malware samples. To mislead the detector, the end-of-file strategy needs to inject more feature byte sequences to counteract the effects of the original malicious features. Therefore, from Tables 2–5 we can see that, when the same number of benign bytes is injected into malware samples, the success rate of the mid-file strategy is higher than that of the end-of-file strategy. For the proposed method, the success rate of the mid-file strategy is about 3–23% higher than that of the end-of-file strategy. For the randomly injecting method, the success rate of the mid-file strategy is about 4–11% higher than that of the end-of-file strategy.

We also compare the proposed method with the gradient-based method [16]. The end-of-file strategy is adopted to inject feature sequences. For the gradient-based method, the gradient is calculated by minimizing the classification loss of the detector, with respect to the target label. In the experiment we select two different classification loss functions to calculate the gradient. One is the softmax classification loss (see Equation (5)), which is used to train MalConv. The other is the mean-square error (see Equation (6)), which is often used to train conventional back propagation (BP) networks.

$$L_{softmax}(\theta) = -\frac{1}{m} \left[ \sum_{i=1}^{m} \sum_{j=1}^{k} 1\{y^{(i)} = j\} \log \frac{e^{\theta_j^T x^{(i)}}}{\sum_{l=1}^{k} e^{\theta_l^T x^{(i)}}} \right] \tag{5}$$

$$L_{\rm ms}(\theta) = \frac{1}{2m} \sum_{i=1}^{m} (\mathcal{Y}_i - y_i)^2 \tag{6}$$
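For reference, the two losses in Equations (5) and (6) can be evaluated as in the following sketch. It is an illustration only: the function names are hypothetical, `logits[i][j]` stands in for the class score of sample i for class j, labels are 0-based rather than 1-based, and the two arguments of the mean-square loss are assumed to be the detector output and the target value.

```python
import math


def softmax_loss(logits, labels):
    """Mean softmax cross-entropy over m samples, as in Equation (5)."""
    total = 0.0
    for row, y in zip(logits, labels):
        peak = max(row)  # subtract the maximum for numerical stability
        log_norm = peak + math.log(sum(math.exp(v - peak) for v in row))
        total += row[y] - log_norm  # log-probability of the labeled class
    return -total / len(logits)


def mean_square_loss(predicted, target):
    """Half the mean squared difference over m samples, as in Equation (6)."""
    m = len(predicted)
    return sum((p - t) ** 2 for p, t in zip(predicted, target)) / (2 * m)


# Two classes with equal scores give a loss of log(2) for either label.
uniform = softmax_loss([[0.0, 0.0]], [0])
half_sq = mean_square_loss([1.0, 0.0], [0.0, 0.0])
```

In a gradient-based attack, the loss would be evaluated with the target label set to the benign class, and its gradient with respect to the appended bytes would drive the update.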

Because of the computing cost, only 200 adversarial samples are generated for each experiment, and fewer than 10,000 bytes are injected. The experimental results for the two classification loss functions are shown in Table 6. The success rate of the proposed method with the end-of-file strategy is about 6–10 percent higher than that of the gradient-based method with the softmax classification loss. The success rate of the gradient-based method with the softmax classification loss is in turn about 5–17 percent higher than that of the same method with the mean-square error loss.



#### **5. Discussion**

The experiments show that the gradient-based algorithm takes a relatively long time to generate an adversarial sample. In our work, 200 adversarial samples are generated for each experiment. The gradient-based method takes about 100 min on average to generate an adversarial sample (see Table 6), because it generates only one appended byte per iteration. In addition, it is hard to determine the number of iterations after which the appended bytes converge to their optimal values. If we use the gradient-based algorithm to generate a large number of adversarial samples, the time cost is very high. For the proposed method, most of the time is spent on training a CNN-based detector. In the experiments, we spent about 10 h training a MalConv model. Extracting the feature sequences takes about one hour. Injecting feature sequences into a PE file takes very little time (about one minute on average in our experiments; see Tables 2 and 3). Because the feature sequences can be shared by all adversarial samples, the proposed method is suitable for generating a large number of adversarial samples.
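The scaling argument can be made concrete with a short calculation based on the approximate timings quoted above. This is a back-of-the-envelope sketch: the function names are hypothetical, the figures are the rounded averages from the text, and, following the text, detector training time is charged only to the proposed method.

```python
def gradient_total_minutes(n_samples, minutes_per_sample=100):
    """Total time when every sample needs its own optimization run."""
    return n_samples * minutes_per_sample


def proposed_total_minutes(n_samples, train_minutes=600, extract_minutes=60,
                           inject_minutes_per_sample=1):
    """One-off training and extraction cost plus a small per-sample cost."""
    return train_minutes + extract_minutes + n_samples * inject_minutes_per_sample


for n in (200, 1000, 10000):
    g = gradient_total_minutes(n)
    p = proposed_total_minutes(n)
    print(f"{n:>6} samples: gradient-based {g / 60:.1f} h, proposed {p / 60:.1f} h")
```

For the 200 samples used in each experiment, this gives 20,000 min for the gradient-based method against 860 min for the proposed method, and the gap widens as the fixed cost of training and extraction is spread over more samples.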

Interpretability is another challenge faced by adversarial sample generation algorithms. The gradient-based methods calculate the values of the injected bytes by minimizing the classification loss of a detector with respect to the target label. These injected bytes have no explainable semantics and are treated only as binary values. In contrast, the proposed method injects feature byte sequences into malware. A feature sequence is a byte sequence extracted from a benign executable. By decompiling the executable, the semantics of a feature byte sequence can be clearly defined. Therefore, with the proposed method we can explain the meaning of the injected bytes.

In our study, the proposed method is designed only to generate adversarial samples for CNN-based detectors. The feature byte sequences are selected based on the convolution operation of the CNN, which means that we need to know in advance which algorithm a detector uses. In contrast, the gradient-based methods are more commonly used and make no assumption about the classification method a detector uses. They can therefore be used more widely to generate adversarial samples for different neural networks, such as BP networks, CNNs [16], and RNNs [20].

Generating malware adversarial samples is different from generating image adversarial samples. For image adversarial samples, we can directly update each pixel. For malware adversarial samples, we cannot modify any byte of the source executable; otherwise, we cannot guarantee that it still executes correctly. Therefore, we have to inject padding bytes into the gaps or at the end of a PE file. The number of gaps and the length of each gap in a PE file are limited. With the mid-file strategy, we sometimes cannot find enough gaps in an executable to store the feature byte sequences, which may reduce the success rate. With the end-of-file strategy, we can append any number of section blocks at the end of a PE file by modifying the PE file structure, so it is relatively easy to inject enough bytes to generate an adversarial sample. However, adversarial samples generated using the end-of-file strategy are prone to detection by simply analyzing the PE section table or examining whether such sections are accessed by program instructions. In addition, if the length of a malware sample is greater than the input length of a detector, the end-of-file strategy cannot be applied.
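The two capacity limits described above can be expressed as simple checks. The sketch below is illustrative only: the function names, the gap lengths, and the detector input length of 2,000,000 bytes are assumed values, not figures taken from the experiments, and the gap lengths would in practice come from parsing the PE section table.

```python
def sequences_fitting_in_gaps(gap_lengths, seq_len):
    """Mid-file strategy: count whole sequences that fit into the gaps."""
    return sum(gap // seq_len for gap in gap_lengths)


def end_of_file_capacity(sample_len, detector_input_len):
    """End-of-file strategy: appended bytes the detector would still read."""
    return max(0, detector_input_len - sample_len)


def can_apply_end_of_file(sample_len, detector_input_len, payload_len):
    """True if the whole payload lands inside the detector's input window."""
    return payload_len <= end_of_file_capacity(sample_len, detector_input_len)


INPUT_LEN = 2_000_000                 # assumed detector input length in bytes
gaps = [100, 40, 512]                 # hypothetical gap lengths in bytes
mid_file_slots = sequences_fitting_in_gaps(gaps, seq_len=64)
small_ok = can_apply_end_of_file(1_500_000, INPUT_LEN, payload_len=10_000)
large_ok = can_apply_end_of_file(2_500_000, INPUT_LEN, payload_len=10_000)
```

A sample longer than the input window leaves no capacity, since any appended bytes would fall outside the part of the file the detector reads.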

#### **6. Conclusions**

In this paper, we study how to generate malware adversarial samples. Unlike previous gradient-based methods, we generate malware adversarial samples by injecting byte sequences into a source executable. The injected byte sequences can be shared by different adversarial samples, which makes the proposed method efficient and suitable for generating a large number of adversarial samples. We propose an algorithm to extract feature byte sequences for CNN-based deep learning models. Feature byte sequences can represent the characteristics of benign samples and, compared with the padding bytes generated using gradient-based methods, they are explainable. The experimental results show that the adversarial samples generated using the proposed method have a high success rate. It is possible that a more robust malware detector can be trained using the generated adversarial samples together with the original samples. In this work, we have not yet provided definitive evidence that the generated adversarial samples improve the performance of malware detection, owing to the complexity of adversarially training malware detectors. In future work, we plan to investigate how to use the generated adversarial malware samples to improve the performance of malware detection models.

**Author Contributions:** Conceptualization, Y.D.; formal analysis, Y.D., M.S., C.N. and K.F.; methodology, Y.D. and M.S.; software, C.N. and K.F.; validation, Y.D., M.S. and C.N. All authors have read and agreed to the published version of the manuscript.

**Funding:** This research was partially supported by the National Natural Science Foundation of China (Grant No. 61872107), Scientific Research Foundation of Shenzhen (Grant No. JCYJ20180507183608379).

**Data Availability Statement:** The data is available from http://virusshare.com/ (accessed on 1 December 2021).

**Conflicts of Interest:** The authors declare no conflict of interest.

