### **1. Introduction**

In recent years, the popularity of the Android operating system has attracted the attention of malware developers, whose activity has grown rapidly [1,2]. Many malware developers focus on compromising mobile devices and turning them into bots. This allows attackers to access the infected device and other connected devices and to form botnets. Botnets are used to execute malicious attacks such as distributed denial-of-service (DDoS) attacks, spam campaigns, and data theft. Botnet malware is built with advanced techniques (e.g., multi-staged payloads or self-protection) that make it difficult to identify. This, in turn, poses major threats that require the design of effective approaches to detect these attacks [3].

**Citation:** Alkahtani, H.; Aldhyani, T.H.H. Artificial Intelligence Algorithms for Malware Detection in Android-Operated Mobile Devices. *Sensors* **2022**, *22*, 2268. https://doi.org/10.3390/s22062268

Academic Editors: Leandros Maglaras, Helge Janicke and Mohamed Amine Ferrag

Received: 12 February 2022; Accepted: 11 March 2022; Published: 15 March 2022

**Publisher's Note:** MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

**Copyright:** © 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

Android botnets are used to perform attacks on targeted devices. DDoS attacks are carried out by flooding the target machine with superfluous requests so that legitimate requests are blocked, thus causing a failure of the targeted system and disruption of its services [4]. Consequently, to protect against such attacks, machine learning methods have proven to be effective in detecting and tracking these threats in the internet of things [5,6]. Haystack [7] reported that third-party software-development companies manage 70% of mobile applications and control the personal data of users. According to the AV-TEST Security Institute [8], malicious programming increased, with 5.7 million malicious Android packages detected by Kaspersky in 2020, nearly three times as many as in 2019 (2.1 million). Figure 1 summarizes the increase in malware installation packages for smartphone devices over the last five years. Therefore, signature-based analysis of malicious installation packages, which extracts malware patterns from their characteristics, can be an effective strategy to secure mobile applications.

**Figure 1.** Malware installation packages for smartphone devices.

Malicious attacks occur in different environments and use a variety of methods, such as fuzzing, denial of service, DDoS, port scanning, and probing [9]. These attacks can threaten the transport and application layers as well as protocols such as the internet control message protocol, file transfer protocol, user datagram protocol, simple mail transfer protocol, transmission control protocol, and hypertext transfer protocol. Network-based intrusion detection systems can deal with such attacks by scanning the network and detecting them [10].

In the Android system, security is usually built in: the sandboxing method and the permission system are designed to reduce the risk posed by Android applications [11]. The former was developed using the Linux environment for running Android applications, which allows users to grant permission for the installation of any Android application [12]. However, when mobile applications are updated or upgraded, security and privacy features such as time permission, background location, and storage are changed, opening a window for malware attacks. It is possible to exploit Android vulnerabilities in user-developed applications, since the Google Play Store cannot detect malicious attacks after the applications are published [13]. The percentage of Android malware is presented in Figure 2.

Intrusion detection systems are developed using machine learning and deep learning methods. However, machine learning techniques cannot cope with the huge volume of traffic data flooding the system. Similarly, deep learning methods fail to provide low generalization errors due to the absence of optimization. Fixed Android botnet datasets make it feasible to design detectors with high detection rates [15], but complex traffic data hinders accurate prediction. This has motivated the development of techniques based on neuro-evolution classification of Android malware, which determine the number of layers and neurons as part of the detection process [16].

The present study aimed to extract static and dynamic features from unknown applications; these features indicate whether a particular application is "normal" or an "attack". The features are used to examine the performance of several machine learning and deep learning models, including k-nearest neighbors (KNN) [17], support vector machine (SVM) [18], convolutional neural networks (CNN) [19], dense neural networks [20], gated recurrent units (GRU), long short-term memory (LSTM) [21], and the hybrid deep learning methods CNN-LSTM and CNN-GRU [22].
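To make the feature-based classification concrete, the following is a minimal sketch of a k-nearest-neighbors classifier over binary feature vectors. It is not the implementation used in this study: the four permission-style features, the training samples, the Hamming distance, and the choice of k = 3 are all assumptions made for illustration.

```python
from collections import Counter

def hamming(a, b):
    """Number of positions where two equal-length feature vectors differ."""
    return sum(x != y for x, y in zip(a, b))

def knn_predict(train, query, k=3):
    """Label `query` by majority vote among its k nearest training samples."""
    nearest = sorted(train, key=lambda item: hamming(item[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]

# Hypothetical feature order (1 = present in the application):
# SEND_SMS, READ_CONTACTS, INTERNET, RECEIVE_BOOT_COMPLETED
TRAIN = [
    ((1, 1, 1, 1), "attack"),
    ((1, 0, 1, 1), "attack"),
    ((1, 1, 1, 0), "attack"),
    ((0, 0, 1, 0), "normal"),
    ((0, 0, 0, 0), "normal"),
    ((0, 1, 1, 0), "normal"),
]

print(knn_predict(TRAIN, (1, 1, 0, 1)))  # closest to the attack samples
print(knn_predict(TRAIN, (0, 1, 0, 0)))  # closest to the normal samples
```

In practice the feature vectors would be far longer and would mix static and dynamic features, but the decision rule stays the same: an unknown application receives the label held by the majority of its nearest labeled neighbors.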

In this study, we investigated and evaluated the performance of various machine learning and deep learning algorithms in the detection of mobile malware attacks. This study identifies the best-performing algorithms for monitoring Android applications against malicious attacks. Thus, our research aims to contribute to this field with the following:


### **2. Background of Study**

This section offers an overview of previous research related to intrusion detection systems, Android malware detection, and standard datasets of Android malicious attacks. Furthermore, it provides an overview of the machine learning and deep learning approaches applied to the design of cybersecurity systems.

The continual improvement of sophisticated Android malware families, e.g., Chamois malware, has made the task of detecting malicious attacks daunting. To tackle this, researchers developed machine learning techniques that improved the available systems. Recently, many studies have applied machine learning models to Android botnet detection, such as linear regression, KNN [23], SVM, and decision tree (DT) algorithms [24]. Some of these recent studies [25,26] used deep learning algorithms, although they do not provide a thorough understanding of their effectiveness. Therefore, the current study compares deep learning models to examine their effectiveness in Android botnet detection using the available Information Security Centre of Excellence (ISCX) botnet dataset [27–29].

Kadir et al. [30] used deep learning models to analyze Android botnet attacks in an attempt to understand their hidden features. The system was evaluated using the ISCX Android botnet dataset, which contained 1929 samples. Anwar et al. [31] proposed an Android botnet detection approach based on static features. Permissions, MD5 signatures, and broadcast receivers were combined as features and processed with machine learning algorithms. The input data comprised 1400 samples collected from different botnet applications in the ISCX dataset, and the system achieved an accuracy of 95.1% in distinguishing Android botnet attacks [32].
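The studies surveyed in this section report their results as accuracy, precision, and recall. As a reminder of how these metrics follow from the confusion matrix of a binary normal/attack detector, the sketch below computes them from raw counts. The counts are hypothetical and are not taken from any of the cited studies.

```python
def binary_metrics(tp, fp, fn, tn):
    """Return accuracy, precision, and recall for the 'attack' class.

    tp: attacks flagged as attacks      fp: normal samples flagged as attacks
    fn: attacks missed by the detector  tn: normal samples left unflagged
    """
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return accuracy, precision, recall

# Hypothetical evaluation on 100 attack and 100 normal samples.
accuracy, precision, recall = binary_metrics(tp=90, fp=5, fn=10, tn=95)
print(f"accuracy={accuracy:.3f} precision={precision:.3f} recall={recall:.3f}")
```

All three metrics lie between 0 and 1 and are often quoted as percentages, so a reported value should be read as either 0.95 or 95%, not 0.95%.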

Several machine learning algorithms were proposed to distinguish normal samples from botnet attacks. In one study, the results indicated that the random forest approach had a precision of 0.972 and a recall of 0.96. In [33], machine learning approaches were proposed for detecting Android botnets. The ISCX dataset consisted of 1635 benign and 1635 attack samples. The random forest model achieved 97%. In another study [34], the DT, Naive Bayes, and random forest machine learning algorithms were used to detect Android attacks. The information gain method was used to select the significant features. The random forest algorithm achieved an accuracy of 94.6%. Karim et al. [35] proposed a static analysis approach to explore the patterns in the features of Android botnet attacks. The features were compared with the intrusion application using the Drebin dataset [36]. Artificial intelligence (AI) approaches using a knowledge-based system were used to secure Android mobiles against malicious attacks [37,38]. Intrusion detection and data mining systems inspired by meta-heuristic rules and based on fuzzy logic were developed [39], while machine learning approaches were applied in the development of intrusion detection system (IDS) applications [40–42]. The design of IDSs employed the artificial bee colony [43], particle swarm optimization [44], grey wolf optimization [45], and artificial fish swarm [46] algorithms.
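The information gain criterion mentioned above for selecting significant features can be sketched as follows. This is an illustrative implementation on hypothetical binary features, not the feature-selection code of [34]; the feature names and the eight labeled samples are assumptions.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total) for n in Counter(labels).values())

def information_gain(feature_values, labels):
    """Reduction in label entropy obtained by splitting on one feature."""
    total = len(labels)
    remainder = 0.0
    for value in set(feature_values):
        subset = [lab for f, lab in zip(feature_values, labels) if f == value]
        remainder += len(subset) / total * entropy(subset)
    return entropy(labels) - remainder

# Hypothetical data: one column per feature, one entry per application.
LABELS = ["attack"] * 4 + ["normal"] * 4
FEATURES = {
    "SEND_SMS": [1, 1, 1, 1, 0, 0, 0, 0],  # separates the classes perfectly
    "INTERNET": [1, 0, 1, 0, 1, 0, 1, 0],  # carries no class information
}

ranked = sorted(
    FEATURES,
    key=lambda name: information_gain(FEATURES[name], LABELS),
    reverse=True,
)
print(ranked)
```

Features with a gain close to zero, such as the hypothetical `INTERNET` column here, tell the classifier nothing about the class and are the natural candidates to drop.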

Many systems were developed based on signature-based Android malware detection approaches and behavior-based Android malware intrusion detection approaches [47]. The former is a simple detection method that keeps the rate of false positives low. The latter is based on anomaly detection and is a very common method that uses AI algorithms to detect malicious attacks. Numerous research articles aimed to detect and classify Android malware and attacks using machine learning and deep learning approaches, such as the DT and deep learning approaches [48]. By using the generative adversarial networks algorithm [49], it was shown that traditional machine learning was successful in detecting malware in an Android environment [50].
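The difference between the two families of approaches can be illustrated with a short sketch. The signature check looks up a digest of the package in a set of known-malicious digests, while the behavior check flags an observation that deviates strongly from a baseline. Everything here is an assumption chosen for illustration: the SHA-256 digest, the payload bytes, the requests-per-minute metric, and the z-score threshold of 3.

```python
import hashlib
import statistics

def digest(payload: bytes) -> str:
    """Hex digest used as the package signature (SHA-256 is an assumption)."""
    return hashlib.sha256(payload).hexdigest()

# Hypothetical signature database holding one known-malicious package.
KNOWN_MALWARE = {digest(b"hypothetical-bot-payload-v1")}

def signature_match(payload: bytes) -> bool:
    """Signature-based check: exact match against known digests."""
    return digest(payload) in KNOWN_MALWARE

def is_anomalous(baseline, observed, threshold=3.0):
    """Behavior-based check: z-score of `observed` against a normal baseline."""
    mean = statistics.mean(baseline)
    stdev = statistics.stdev(baseline)
    if stdev == 0:
        return observed != mean
    return abs(observed - mean) / stdev > threshold

# Hypothetical baseline: network requests per minute for a benign application.
BASELINE = [10, 12, 11, 9, 13, 10, 11]

print(signature_match(b"hypothetical-bot-payload-v1"))  # known sample
print(signature_match(b"hypothetical-bot-payload-v2"))  # modified variant
print(is_anomalous(BASELINE, 250))                      # flooding-like burst
print(is_anomalous(BASELINE, 12))                       # ordinary traffic
```

The modified payload evades the exact-match signature check even though it differs by a single byte, which is why signature-based detection rarely raises false alarms but misses unseen variants, and why behavior-based methods are used alongside it.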

Most of the published studies used datasets from Google Play [51], AndroZoo, Android Permission [52], Andrototal [53], Wandoujia [54], Kaggle [55], and CICMaldroid [56]. The present study aimed at developing a system to detect malware attacks in Android environments that have a built-in security system. However, there are still many Android applications with design weaknesses and security flaws that can threaten end-users. Therefore, it is crucial to use machine learning and deep learning algorithms to detect Android malware and to analyze vulnerabilities in order to prevent the development of malware and attacks by hackers [57,58].

### **3. Materials and Methods**

Android was first released in 2008. As the number of Android applications increased, companies quickly began to discuss and build security tools [2]. Nevertheless, the Android system still suffers from security weaknesses. In the last five years, AI approaches have focused on protecting the Android system, with many researchers studying which AI approaches achieve high accuracy. The framework of the present research is presented in Figure 3. The machine learning algorithms support vector machine (SVM), k-nearest neighbors (KNN), and linear discriminant analysis (LDA), and the deep learning algorithms long short-term memory (LSTM), convolutional neural network-long short-term memory (CNN-LSTM), and autoencoder, were used to detect malware and attacks against Android applications. These algorithms were tested using two standard datasets. The research questions of this study were:


**Figure 3.** A generic representation of the models applied for the detection of Android malware.
