A Survey on ML Techniques for Multi-Platform Malware Detection: Securing PC, Mobile Devices, IoT, and Cloud Environments
Abstract
1. Introduction
- We conducted a thorough review of the literature on malware detection published since 2017. The review indicates that this is the first comprehensive survey to explore machine learning-based malware detection across PCs, mobile devices, IoT systems, and cloud environments.
- This study investigates the features used to train ML models on each platform (e.g., static, dynamic, memory, and hybrid features) and analyzes the malware landscape across platforms. It comprehensively reviews ML- and DL-based malware detection techniques and highlights the key research trends for each platform.
- This study identifies both platform-specific challenges and cross-platform issues that affect the development of effective ML-based malware detection techniques.
- Finally, this study identifies the limitations of the existing literature and suggests future research directions.
2. Comparison with Previous Related Surveys
Papers | Year | Main Contribution | Insights into Malware: Latest Prominent Malware Variants | Insights into Malware: Platform-Based Malware Taxonomy | Insights into Malware: Analysis Methods (Static, Dynamic, Memory, and Hybrid) | Insights into Malware: Feature Details | ML-Based Malware Detection: PCs (Windows) | ML-Based Malware Detection: PCs (Linux) | ML-Based Malware Detection: Mobile | ML-Based Malware Detection: IoT | ML-Based Malware Detection: Cloud | Challenges Identified |
---|---|---|---|---|---|---|---|---|---|---|---|---|
[13] | 2021 | Survey on malware detection techniques using machine learning algorithms. | × | × | √ | × | √ | × | × | × | × | × |
[14] | 2019 | Survey on sophisticated attack and evasion techniques used by contemporary malware. | × | × | ≈ | × | √ | × | × | × | × | × |
[15] | 2022 | Survey on deep learning-based malware detection. | × | × | ≈ | × | √ | × | × | × | × | × |
[17] | 2021 | Reviewed machine learning methods for Android malware detection. | × | × | √ | × | × | × | √ | × | × | × |
[16] | 2020 | Study on traditional and state-of-the-art ML techniques for malware detection. | × | × | ≈ | √ | √ | × | × | × | × | × |
[18] | 2023 | DL approaches for malware defenses in the Android environment. | × | × | √ | √ | × | × | √ | × | × | × |
[19] | 2020 | Android malware detection using deep learning. | × | × | ≈ | √ | × | × | √ | × | × | × |
[21] | 2023 | Survey on IoT malware taxonomy and detection mechanisms. | × | ≈ | × | × | × | × | × | √ | × | √ |
[22] | 2023 | Discussed IoT dataset use to evaluate machine learning techniques. | × | × | ≈ | × | × | × | × | √ | × | × |
[23] | 2023 | Review on emerging machine learning algorithms for detecting malware in IoT. | × | × | × | × | × | × | × | √ | × | × |
[7] | 2024 | Modern deep learning technologies for identifying malware on Windows, Linux, and Android platforms. | × | × | √ | ≈ | √ | √ | √ | × | × | × |
[20] | 2021 | Computer-based and mobile-based malware detection and their countermeasures are presented. | × | × | ≈ | × | √ | × | √ | × | × | √ |
[24] | 2022 | Review of ML- and DL-based defenses against attacks and security issues in cloud computing. | × | × | × | × | × | × | × | × | √ | √ |
[25] | 2021 | Behavior-based malware detection system in the cloud environment. | × | × | ≈ | × | × | × | × | × | × | × |
Our survey | 2024 | Survey on malware detection in PCs, mobile devices, IoT, and cloud platforms using ML techniques. | √ | √ | √ | √ | √ | √ | √ | √ | √ | √ |
3. Malware Fundamentals
3.1. What Is Malware?
3.2. Leading Malware Threats in the Current Cyber Landscape
3.3. Malware Analysis
- Static analysis;
- Dynamic analysis;
- Memory analysis;
- Hybrid analysis.
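As a rough illustration of the static analysis category above, the following sketch derives two byte-level features from a file's raw contents without executing it: a normalized byte histogram and the Shannon entropy. The function names (`byte_entropy`, `static_features`) and the choice of features are assumptions made for illustration; they are not taken from any specific work reviewed in this survey.

```python
import math
from collections import Counter
from typing import List


def byte_entropy(data: bytes) -> float:
    """Shannon entropy of a byte string, in bits per byte (0.0 to 8.0)."""
    if not data:
        return 0.0
    total = len(data)
    counts = Counter(data)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def static_features(data: bytes) -> List[float]:
    """Return 257 values: a normalized 256-bin byte histogram plus entropy.

    Hypothetical feature layout, chosen only to keep the example small.
    """
    total = len(data) or 1  # avoid division by zero for empty input
    counts = Counter(data)
    histogram = [counts.get(b, 0) / total for b in range(256)]
    return histogram + [byte_entropy(data)]


if __name__ == "__main__":
    sample = b"MZ" + bytes(64)  # toy bytes, not a real executable
    vector = static_features(sample)
    print(len(vector), round(vector[-1], 3))
```

A vector like this could be passed to any classifier. High entropy is sometimes associated with packed or encrypted content, but it is not a reliable indicator of maliciousness on its own.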
3.4. Features Used in ML-Based Malware Detection
4. Malware Landscape Across Platforms
4.1. PCs
4.1.1. Windows
4.1.2. Linux
4.1.3. macOS
4.2. Mobile Devices
4.2.1. Android
4.2.2. iOS
4.3. IoT Platform
4.4. Cloud Environments
5. Machine Learning Algorithms for Malware Detection
6. Application of Machine Learning on Malware Detection
6.1. PC (Personal Computers) Malware Detection
6.1.1. Malware Detection on Windows Platform
6.1.2. Malware Detection on Linux OS
6.1.3. Malware Detection on macOS
6.2. Malware Detection on Mobile Platforms
6.2.1. Android Malware Detection
6.2.2. Malware Detection in iOS
6.3. Malware Detection on IoT Platform
Reference | Data Source | Feature Category | Feature Name | ML Algorithms | Accuracy (%) |
---|---|---|---|---|---|
[68] | IoT-23 dataset | Static | Network capture files include IP address, ID of the capture, protocol, etc. | RF, NB, MLP, SVM, and AdaBoost | 99.5 |
[89] | NSS Mirai Dataset latest relevant, balanced datasets https://www.stratosphereips.org/datasets-iot23 (accessed on 3 February 2025) | Static | Alert level (source and destination IP addresses, C&C activities, protocol) and packet-level features (IP address or port number, packet size, etc.) | CNN | 99 |
[90] | ARM-based IoT | Static | Opcode features | RNN and CNN | 99.98 |
[91] | Executable and Linkable Format (ELF) file templates are executed in the QEMU sandbox | Dynamic | System call graph | CNN | 97 |
[102] | KISA-data challenge 2019-Malware.04, provided by the Korea Internet & Security Agency | Hybrid | Opcode and API call sequences | Bi-LSTM and spatial pyramid pooling network (SPP-Net) | 92.09 |
[92] | Network traffic is collected from external repositories | Dynamic | 2D image-based network traffic features | Neural network | 91.32 |
[69] | Bot-IoT, MedBIoT, and MQTT-IoT-IDS2020 datasets | Dynamic | Packet-level metadata of the raw PCAP file | DT, RF, K-nearest neighbor (KNN), and extreme gradient boosting (XGB) | 99.5 with RF |
[93] | MedBIoT dataset [126]. IoTID (IoT network intrusion dataset) http://dx.doi.org/10.21227/q70p-q449 (accessed on 4 February 2025) | Dynamic | PCAP files | LSTM, RNN, and DT, respectively | 98.71 |
[124] | UNSW-SOSR2019 | Static | Network packets (source IP, destination IP, timestamp, traffic flows, etc.) | Graph neural network (GNN) | - |
[10] | N-BaIoT dataset | Network | IoT network traffic | MLP and autoencoder | 99 |
[125] | Bot IoT dataset | Network | IoT network traffic | ANN | 99 |
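Several entries in the table above rely on opcode features. A minimal sketch of one common way to turn an opcode sequence into numeric features is n-gram counting, shown below. The helper names (`opcode_ngrams`, `to_vector`) and the toy opcode list are hypothetical; the sketch does not reproduce the pipeline of any cited work.

```python
from collections import Counter
from typing import List, Sequence, Tuple


def opcode_ngrams(opcodes: Sequence[str], n: int = 2) -> Counter:
    """Count contiguous n-grams in an opcode sequence."""
    if n < 1:
        raise ValueError("n must be at least 1")
    return Counter(
        tuple(opcodes[i:i + n]) for i in range(len(opcodes) - n + 1)
    )


def to_vector(counts: Counter, vocabulary: List[Tuple[str, ...]]) -> List[float]:
    """Project n-gram counts onto a fixed vocabulary as relative frequencies."""
    total = sum(counts.values()) or 1  # avoid division by zero
    return [counts.get(gram, 0) / total for gram in vocabulary]


if __name__ == "__main__":
    # Toy sequence; a real one would come from a disassembler.
    sequence = ["mov", "add", "mov", "add", "ret"]
    bigrams = opcode_ngrams(sequence, n=2)
    vocab = sorted(bigrams)
    print(vocab)
    print(to_vector(bigrams, vocab))
```

The fixed vocabulary keeps vector length constant across samples, which sequence models such as CNNs or RNNs do not strictly require but classical classifiers do.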
6.4. Malware Detection on Cloud Platform
6.5. Discussion
7. Challenges Associated with Platform-Specific and Cross-Platform Malware Detection
7.1. PC Platforms (Windows, Linux, and macOS)
7.1.1. Common Challenges Across PC Platforms
- Linux: Linux systems often lack the necessary ELF loaders for executing malware samples.
- Windows: Windows systems face a similar issue: missing or incompatible Dynamic Link Libraries (DLLs) can prevent samples from running and thereby disrupt malware detection.
- macOS: macOS’s sandboxing mechanisms can restrict access to runtime environments, further complicating the execution of suspicious binaries for dynamic analysis.
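Because each of the issues above depends on the executable format of the sample, one plausible first step in a dynamic-analysis pipeline is to identify the format from the file's leading bytes before choosing an execution environment. The sketch below checks well-known magic numbers for ELF, PE (via the DOS "MZ" header), and thin Mach-O binaries. The function name `identify_format` and the returned labels are assumptions for illustration, and a magic-number check alone does not confirm that a file is well formed.

```python
# Thin Mach-O magic numbers as they may appear on disk (both byte orders,
# 32-bit and 64-bit). Universal ("fat") binaries are deliberately omitted
# because their magic number collides with Java class files.
MACHO_MAGICS = {
    bytes.fromhex("feedface"),
    bytes.fromhex("cefaedfe"),
    bytes.fromhex("feedfacf"),
    bytes.fromhex("cffaedfe"),
}


def identify_format(header: bytes) -> str:
    """Guess the executable format from the first bytes of a file."""
    if header.startswith(b"\x7fELF"):
        return "ELF"      # Linux and many IoT firmware binaries
    if header.startswith(b"MZ"):
        return "PE"       # DOS header that precedes Windows PE files
    if header[:4] in MACHO_MAGICS:
        return "Mach-O"   # macOS and iOS binaries
    return "unknown"


if __name__ == "__main__":
    print(identify_format(b"\x7fELF\x02\x01\x01"))
    print(identify_format(b"MZ\x90\x00"))
    print(identify_format(b"#!/bin/sh\n"))
```

A pipeline could use the returned label to route a sample to a sandbox that provides the matching loader and libraries.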
7.1.2. Windows-Specific Challenges
- Widespread usage and sophisticated attacks: Windows operating systems face unique security challenges owing to their architecture and widespread usage, making them particularly vulnerable to Windows-specific exploits. For example, the WannaCry ransomware targeted Windows systems exclusively, exploiting the EternalBlue vulnerability in the Windows SMB protocol and causing global disruptions. Additionally, Windows-specific protocols such as NTLM (NT LAN Manager) and NetBIOS have been exploited in pass-the-hash and man-in-the-middle attacks that do not apply to Linux or macOS systems. These vulnerabilities are often exacerbated by Windows’ reliance on legacy systems and backward compatibility requirements, which leave systems exposed to unpatched exploits [130].
7.1.3. Linux-Specific Challenges
- Kernel exploits and diverse architectures: Linux operating systems face unique challenges owing to their architecture and widespread use in servers, cloud infrastructures, and IoT devices. A notable example is the Dirty COW (CVE-2016-5195) vulnerability, a Linux-specific privilege escalation flaw in the kernel memory subsystem that allows attackers to gain root access [131]. Linux-specific package managers and dependencies can also introduce vulnerabilities if not properly maintained, as seen in attacks targeting outdated software in distributions such as Ubuntu or CentOS. These issues are compounded by the diversity of Linux distributions, which can lead to inconsistent patching and security practices.
7.1.4. macOS-Specific Challenges
- Exploiting native frameworks: macOS, which is known for its robust security architecture, faces unique challenges due to its closed ecosystem and reliance on native frameworks. One notable example is the Gatekeeper bypass vulnerability, in which attackers exploit flaws in the macOS app verification system to execute malicious software without user consent. The reliance on proprietary frameworks such as AppleScript and iCloud integration also introduces vulnerabilities, as seen in attacks that exploit scripting vulnerabilities or iCloud phishing schemes [132].
- Perception of security: macOS users often perceive their systems as inherently secure, which attackers exploit through sophisticated phishing and malvertising campaigns.
7.1.5. Summary of Commonalities and Differences
7.2. Mobile Platforms (Android and iOS)
7.2.1. Android-Specific Challenges
- OS update delays: the dependency on multiple manufacturers slows OS updates, leaving outdated devices vulnerable.
- Third-party apps: third-party applications increase risks to device security and user privacy.
- Device diversity: the variety of Android devices and OS versions complicates uniform patching and security protocols.
7.2.2. iOS-Specific Challenges
- Jailbreaking risks: jailbreaking removes iOS restrictions, exposing devices to malware and unauthorized app installations [133].
- iCloud phishing and account takeovers: attackers use phishing to steal iCloud credentials, enabling data theft and device tracking.
7.2.3. Summary of Mobile Platform Differences
7.3. IoT-Specific Challenges
- Device diversity: the rapid growth and heterogeneity of IoT devices create challenges for uniform malware detection.
- Resource constraints: limited computational resources render IoT devices more vulnerable to malware.
- Dataset limitations: the lack of valid large-scale IoT malware datasets limits machine learning model training.
7.4. Cloud-Specific Challenges
- Interconnected infrastructure: the distributed nature of cloud systems increases the risk of malware attacks.
- Shared responsibility model: the shared responsibility model can obscure security accountability, limiting organizations’ visibility and hindering threat detection.
- Scalability of attacks: attackers can leverage the automation and scalability features of the cloud to launch large-scale attacks quickly.
7.5. Cross-Platform Challenges
- Data heterogeneity: differences in file formats, system call sequences, and behavioral patterns across platforms make it challenging to create generalized models.
- Lack of unified datasets: the absence of a standardized, diverse, and large-scale dataset encompassing samples from various platforms (Windows, macOS, Linux, Android, IoT, and cloud) complicates the effective training of detection models.
- Model transferability: ML models trained on one platform (e.g., Windows) may not generalize well to others (e.g., Linux or IoT) because of the differences in malware characteristics.
- Performance scalability: ensuring that detection techniques remain scalable and efficient when applied to resource-limited environments such as IoT and cloud systems remains a critical challenge.
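One way to approach the data heterogeneity challenge listed above is feature hashing, which maps platform-specific tokens (for example, Windows API names or Linux system call names) into a vector of one fixed length, so that samples from different platforms share an input shape. The sketch below is a minimal, assumption-laden illustration: the function name `hash_features`, the dimension of 64, and the token lists are hypothetical, and a shared input shape does not by itself make a model transfer across platforms.

```python
import hashlib
from typing import Iterable, List


def hash_features(tokens: Iterable[str], dim: int = 64) -> List[float]:
    """Map an arbitrary token stream to a fixed-length signed count vector.

    SHA-256 is used instead of the built-in hash() so that results are
    stable across Python processes.
    """
    if dim < 1:
        raise ValueError("dim must be at least 1")
    vector = [0.0] * dim
    for token in tokens:
        digest = hashlib.sha256(token.encode("utf-8")).digest()
        index = int.from_bytes(digest[:8], "big") % dim
        # A signed update reduces the bias introduced by bucket collisions.
        sign = 1.0 if digest[8] & 1 else -1.0
        vector[index] += sign
    return vector


if __name__ == "__main__":
    # Hypothetical behavior traces from two platforms.
    windows_trace = ["CreateFileW", "WriteFile", "RegSetValueExW"]
    linux_trace = ["openat", "write", "execve", "connect"]
    print(len(hash_features(windows_trace)), len(hash_features(linux_trace)))
```

Both traces yield vectors of the same length regardless of vocabulary size, at the cost of occasional collisions between unrelated tokens.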
8. Limitations in the Existing Literature and Future Research Directions
8.1. Limitations in the Existing Literature
8.2. Future Research Directions
9. Conclusions
Funding
Conflicts of Interest
References
- Nguyen, M.H.; Le Nguyen, D.; Nguyen, X.M.; Quan, T.T. Auto-detection of sophisticated malware using lazy-binding control flow graph and deep learning. Comput. Secur. 2018, 76, 128–155. [Google Scholar] [CrossRef]
- Companies, O. 2024 Cisco Cybersecurity Readiness Index. 2024. Available online: https://newsroom.cisco.com/c/dam/r/newsroom/en/us/interactive/cybersecurity-readiness-index/documents/Cisco_Cybersecurity_Readiness_Index_FINAL.pdf (accessed on 15 October 2024).
- Palatty, N.J. Top Malware Attack Statistics, Astra. 2024. Available online: https://www.getastra.com/blog/security-audit/malware-statistics/ (accessed on 15 October 2024).
- Forbes. Why Ransomware Should Be on Every Cybersecurity Team’s Radar. 2022. Available online: https://www.forbes.com/councils/forbestechcouncil/2022/04/12/why-ransomware-should-be-on-every-cybersecurity-teams-radar/ (accessed on 15 October 2024).
- Toulas, B. Linux Malware Sees 35% Growth During 2021. 2022. Available online: https://www.bleepingcomputer.com/news/security/linux-malware-sees-35-percent-growth-during-2021/ (accessed on 19 October 2024).
- GANDH, V. 2023 ThreatLabz Report Indicates 400% Growth in IoT Malware Attacks. 2023. Available online: https://www.zscaler.com/blogs/security-research/2023-threatlabz-report-indicates-400-growth-iot-malware-attacks (accessed on 15 October 2024).
- Maniriho, P.; Mahmood, A.N.; Chowdhury, M.J.M. A Survey of Recent Advances in Deep Learning Models for Detecting Malware in Desktop and Mobile Platforms. ACM Comput. Surv. 2024, 56, 1–41. [Google Scholar] [CrossRef]
- Pleshakova, E.; Osipov, A.; Gataullin, S.; Gataullin, T.; Vasilakos, A. Next gen cybersecurity paradigm towards artificial general intelligence: Russian market challenges and future global technological trends. J. Comput. Virol. Hacking Tech. 2024, 20, 429–440. [Google Scholar] [CrossRef]
- Kasri, W.; Himeur, Y.; Alkhazaleh, H.A.; Tarapiah, S.; Atalla, S. From Vulnerability to Defense: The Role of Large Language Models in Enhancing Cybersecurity. Computation 2025, 13, 30. [Google Scholar] [CrossRef]
- Rey, V.; Sánchez, P.M.S.; Celdrán, A.H.; Bovet, G. Federated learning for malware detection in IoT devices. Comput. Netw. 2022, 204, 108693. [Google Scholar] [CrossRef]
- Gulmez, S.; Kakisim, A.G.; Sogukpinar, I. XRan: Explainable deep learning-based ransomware detection using dynamic analysis. Comput. Secur. 2024, 139, 103703. [Google Scholar] [CrossRef]
- Ferdous, J.; Islam, R.; Mahboubi, A.; Islam, M.Z. A Review of State-of-the-Art Malware Attack Trends and Defense Mechanisms. IEEE Access 2023, 11, 121118–121141. [Google Scholar] [CrossRef]
- Singh, J.; Singh, J. A survey on machine learning-based malware detection in executable files. J. Syst. Archit. 2021, 112, 101861. [Google Scholar] [CrossRef]
- Chakkaravarthy, S.S.; Sangeetha, D.; Vaidehi, V. A Survey on malware analysis and mitigation techniques. Comput. Sci. Rev. 2019, 32, 1–23. [Google Scholar] [CrossRef]
- Tayyab, U.-H.; Khan, F.B.; Durad, M.H.; Khan, A.; Lee, Y.S. A Survey of the Recent Trends in Deep Learning Based Malware Detection. J. Cybersecur. Priv. 2022, 2, 800–829. [Google Scholar] [CrossRef]
- Gibert, D.; Mateu, C.; Planes, J. The rise of machine learning for detection and classification of malware: Research developments, trends and challenges. J. Netw. Comput. Appl. 2020, 153, 102526. [Google Scholar] [CrossRef]
- Wu, Q.; Zhu, X.; Liu, B. A Survey of Android Malware Static Detection Technology Based on Machine Learning. Mob. Inf. Syst. 2021, 2021, 8896013. [Google Scholar] [CrossRef]
- Liu, Y.; Tantithamthavorn, C.; Li, L.; Liu, Y. Deep Learning for Android Malware Defenses: A Systematic Literature Review. ACM Comput. Surv. 2023, 55, 1–36. [Google Scholar] [CrossRef]
- Wang, Z.; Liu, Q.; Chi, Y. Review of android malware detection based on deep learning. IEEE Access 2020, 8, 181102–181126. [Google Scholar] [CrossRef]
- Roseline, S.A.; Geetha, S. A comprehensive survey of tools and techniques mitigating computer and mobile malware attacks. Comput. Electr. Eng. 2021, 92, 107143. [Google Scholar] [CrossRef]
- Victor, P.; Habibi, A.; Rongxing, L.; Tinshu, L.; Pulei, S.; Shahrear, X. IoT malware: An attribute—Based taxonomy, detection mechanisms and challenges. Peer Peer Netw. Appl. 2023, 16, 1380–1431. [Google Scholar] [CrossRef] [PubMed]
- Alex, C.; Creado, G.; Almobaideen, W.; Alghanam, O.A.; Saadeh, M. A Comprehensive Survey for IoT Security Datasets Taxonomy, Classification and Machine Learning Mechanisms. Comput. Secur. 2023, 132, 103283. [Google Scholar] [CrossRef]
- Gaurav, A.; Gupta, B.B.; Panigrahi, P.K. A comprehensive survey on machine learning approaches for malware detection in IoT-based enterprise information system. Enterp. Inf. Syst. 2023, 17, 439–463. [Google Scholar] [CrossRef]
- Belal, M.M.; Sundaram, D.M. Comprehensive review on intelligent security defences in cloud: Taxonomy, security issues, ML/DL techniques, challenges and future trends. J. King Saud Univ. Comput. Inf. Sci. 2022, 34, 9102–9131. [Google Scholar] [CrossRef]
- Aslan, O.; Ozkan-Okay, M.; Gupta, D. Intelligent Behavior-Based Malware Detection System on Cloud Computing Environment. IEEE Access 2021, 9, 83252–83271. [Google Scholar] [CrossRef]
- Gopinath, M.; Sethuraman, S.C. A comprehensive survey on deep learning-based malware detection techniques. Comput. Sci. Rev. 2023, 47, 100529. [Google Scholar] [CrossRef]
- Sanda, O.; Pavlidis, M.; Polatidis, N. A deep learning approach for host-based cryptojacking malware detection. Evol. Syst. 2024, 15, 41–56. [Google Scholar] [CrossRef]
- Ferdous, J.; Islam, R.; Bhattacharya, M.; Islam, Z. Malware−Resistant Data Protection in Hyper−Connected Networks: A Survey. 2023. Available online: https://arxiv.org/pdf/2307.13164 (accessed on 23 October 2024).
- Mitchell, A. Current Malware Trends: 5 Most Common Types of Malwares in 2024. Lumifi, 2024. Available online: https://www.lumificyber.com/blog/current-malware-trends-5-most-common-types-of-malware-in-2024/ (accessed on 23 October 2024).
- BURGESS, M. Conti’s Attack Against Costa Rica Sparks a New Ransomware Era. WIRED, 2022. Available online: https://www.wired.com/story/costa-rica-ransomware-conti/ (accessed on 23 October 2024).
- Toulas, B. REvil Ransomware Member Extradited to U.S. to Stand Trial for Kaseya Attack. BLEEPING COMPUTER, 2022. Available online: https://www.bleepingcomputer.com/news/security/revil-ransomware-member-extradited-to-us-to-stand-trial-for-kaseya-attack/ (accessed on 23 October 2024).
- Schwirtz, M.; Perlroth, N. DarkSide, Blamed for Gas Pipeline Attack, Says It Is Shutting Down. The New York Times. 8 June 2021. Available online: https://www.nytimes.com/2021/05/14/business/darkside-pipeline-hack.html (accessed on 23 October 2024).
- Gatlan, S. Accenture Confirms Data Breach after August Ransomware Attack. Bleeping Computer, 2021. Available online: https://www.bleepingcomputer.com/news/security/accenture-confirms-data-breach-after-august-ransomware-attack/ (accessed on 23 October 2024).
- Kost, E. What is an Advanced Persistent Threat (APT)? UpGuard, 2024. Available online: https://www.upguard.com/blog/what-is-an-advanced-persistent-threat (accessed on 23 October 2024).
- Sharma, A.; Gupta, B.B.; Singh, A.K.; Saraswat, V.K. Orchestration of APT malware evasive manoeuvers employed for eluding antivirus and sandbox defense. Comput. Secur. 2022, 115, 102627. [Google Scholar] [CrossRef]
- Masood, Z.; Samar, R.; Raja, M.A.Z. Design of a mathematical model for the Stuxnet virus in a network of critical control infrastructure. Comput. Secur. 2019, 87, 101565. [Google Scholar] [CrossRef]
- Jain, A. Decoding cryptojacking: What is this and how can you protect yourself? Crypto.news. 9 May 2024. Available online: https://crypto.news/what-is-cryptojacking-how-does-it-work/ (accessed on 24 October 2024).
- Fortinet. Cryptojacking (Learn How Cryptojacking Works and Gains Access to and Abuses Computer Resources). Fortinet, 2024. Available online: https://www.fortinet.com/resources/cyberglossary/cryptojacking (accessed on 24 October 2024).
- Stevens, R. Crypto Mining Botnet Found on Defense Department Web Server. 2020. Available online: https://decrypt.co/18738/crypto-mining-botnet-found-on-defense-department-web-server (accessed on 24 October 2024).
- Stevens, R. Man Fined $7000 for Using Russian Supercomputer to Mine Bitcoin. Decrypt, 2019. Available online: https://decrypt.co/9751/man-fined-for-using-russian-supercomputer-to-mine-crypto (accessed on 24 October 2024).
- Wolf, A. 13 Types of Malware Attacks—And How You Can Defend Against Them. 2024. Available online: https://arcticwolf.com/resources/blog/8-types-of-malware/ (accessed on 24 October 2024).
- Baker, K. The 12 Most Common Types of Malwares. CROWDSTRIKE, 2023. Available online: https://www.crowdstrike.com/en-us/cybersecurity-101/malware/types-of-malware/ (accessed on 24 October 2024).
- Ferdous, J.; Islam, R.; Mahboubi, A.; Islam, M.Z. AI-based Ransomware Detection: A Comprehensive Review. IEEE Access 2024, 12, 2024. [Google Scholar] [CrossRef]
- Kara, I. Fileless malware threats: Recent advances, analysis approach through memory forensics and research challenges. Expert Syst. Appl. 2023, 214, 119133. [Google Scholar] [CrossRef]
- Kumar, R.; Zhang, X.; Wang, W.; Khan, R.U.; Kumar, J.; Sharif, A. A Multimodal Malware Detection Technique for Android IoT Devices Using Various Features. IEEE Access 2019, 7, 64411–64430. [Google Scholar] [CrossRef]
- Panker, T.; Nissim, N. Leveraging malicious behavior traces from volatile memory using machine learning methods for trusted unknown malware detection in Linux cloud environments. Knowl. Based Syst. 2021, 226, 107095. [Google Scholar] [CrossRef]
- Oz, H.; Aris, A.; Levi, A.; Uluagac, A.S. A Survey on Ransomware: Evolution, Taxonomy, and Defense Solutions. ACM Comput. Surv. 2022, 1, 1–37. [Google Scholar] [CrossRef]
- Unit, T.A. VMware Threat Report—Exposing Malware in Linux-Based Multi-Cloud Environments. 2022. Available online: https://blogs.vmware.com/security/2022/02/2022-vmware-threat-report-exposing-malware-in-linux-based-multi-cloud-environments.html (accessed on 4 November 2024).
- Walsh, R. Linux Malware Stats and Facts for 2024. 2024. Available online: https://www.comparitech.com/blog/vpn-privacy/linux-malware-stats-and-facts/ (accessed on 4 November 2024).
- Meshi, Y.E.T.M. Battling MacOS Malware with Cortex AI. 2023. Available online: https://www.paloaltonetworks.com/blog/security-operations/battling-macos-malware-with-cortex-ai/ (accessed on 4 November 2024).
- B. Report. MacOS Threat Landscape Report; Bitdefender: Bucharest, Romania, 2023. [Google Scholar]
- Ani Petrosyan Number of Detected Malicious Installation Packages on Mobile Devices Worldwide from 4th Quarter 2015 to 3rd Quarter 2023. Statista, 2024. Available online: https://www.statista.com/statistics/653680/volume-of-detected-mobile-malware-packages/ (accessed on 27 October 2024).
- Sherif, A. Market Share of Mobile Operating Systems Worldwide from 2009 to 2024, by Quarter. Statista, 2024. Available online: https://www.statista.com/statistics/272698/global-market-share-held-by-mobile-operating-systems-since-2009/ (accessed on 1 November 2024).
- Manzil, H.H.R.; Naik, S.M. Detection approaches for android malware: Taxonomy and review analysis. Expert Syst. Appl. 2024, 238, 122255. [Google Scholar] [CrossRef]
- Saracino, A.; Sgandurra, D.; Dini, G.; Martinelli, F. MADAM: Effective and Efficient Behavior-based Android Malware Detection and Prevention. IEEE Trans. Dependable Secur. Comput. 2018, 15, 83–97. [Google Scholar] [CrossRef]
- Garg, S.; Baliyan, N. Comparative analysis of Android and iOS from security viewpoint. Comput. Sci. Rev. 2021, 40, 100372. [Google Scholar] [CrossRef]
- Shen, Y.; Wuhan, H. Enhancing data security of iOS client by encryption algorithm. In Proceedings of the 2017 IEEE 2nd Advanced Information Technology, Electronic and Automation Control Conference (IAEAC), Chongqing, China, 25–26 March 2017; pp. 366–370. [Google Scholar]
- Lutaaya, M. Rethinking app permissions on iOS. In Proceedings of the CHI Conference on Human Factors in Computing Systems, Montreal QC Canada, 21–26 April 2018; pp. 1–6. [Google Scholar] [CrossRef]
- Phungglan, J. Most Common Viruses on iPhone. 2023. Available online: https://macpaw.com/how-to/most-common-iphone-viruses (accessed on 3 November 2024).
- Walsh, R. iOS Malware Stats and Facts for 2024. 2024. Available online: https://www.comparitech.com/blog/vpn-privacy/ios-malware-stats-and-facts/ (accessed on 3 November 2024).
- O’Flaherty, K. New ‘Dangerous’ iPhone Spyware Attack Warning Issued To iOS Users. 2024. Available online: https://www.forbes.com/sites/kateoflahertyuk/2024/04/19/new-dangerous-iphone-spyware-attack-warning-issued-to-ios-users/ (accessed on 3 November 2024).
- Vailshery, L.S. Number of Internet of Things (IoT) Connections Worldwide from 2022 to 2023, with Forecasts from 2024 to 2033. Statista, 2024. Available online: https://www.statista.com/statistics/1183457/iot-connected-devices-worldwide/ (accessed on 5 November 2024).
- Jose, C.S. Zscaler ThreatLabz Finds a 400% Increase in IoT and OT Malware Attacks Year-over-Year, Underscoring Need for Better Zero Trust Security to Protect Critical Infrastructures. Zscaler, 2023. Available online: https://www.zscaler.com/press/zscaler-threatlabz-finds-400-increase-iot-and-ot-malware-attacks-year-over-year-underscoring (accessed on 5 November 2024).
- Yadav, R.M. Effective analysis of malware detection in cloud computing. Comput. Secur. 2019, 83, 14–21. [Google Scholar] [CrossRef]
- Kilonzi, F. Cloud Malware: Types of Attacks and How to Defend Against Them. 2023. Available online: https://thenewstack.io/cloud-malware-types-of-attacks-and-how-to-defend-against-them/ (accessed on 25 November 2024).
- Huda, S.; Islam, R.; Abawajy, J.; Yearwood, J.; Hassan, M.M.; Fortino, G. A hybrid-multi filter-wrapper framework to identify run-time behaviour for fast malware detection. Futur. Gener. Comput. Syst. 2018, 83, 193–207. [Google Scholar] [CrossRef]
- Alzaylaee, M.K.; Yerima, S.Y.; Sezer, S. DL-Droid: Deep learning based android malware detection using real devices. Comput. Secur. 2020, 89, 101663. [Google Scholar] [CrossRef]
- Stoian, N.A. Machine Learning for Anomaly Detection in IoT Networks: Malware Analysis on the IoT-23 Data Set. Bachelor’s Thesis, University of Twente, Enschede, The Netherlands, 2020. [Google Scholar]
- He, M.; Huang, Y.; Wang, X.; Wei, P.; Wang, X. A Lightweight and Efficient IoT Intrusion Detection Method Based on Feature Grouping. IEEE Internet Things J. 2024, 11, 2935–2949. [Google Scholar] [CrossRef]
- Mezina, A.; Burget, R. Obfuscated malware detection using dilated convolutional network. In Proceedings of the 2022 14th International Congress on Ultra Modern Telecommunications and Control Systems and Workshops (ICUMT), Valencia, Spain, 11–13 October 2022; pp. 110–115. [Google Scholar] [CrossRef]
- Mitchell, J.; McLaughlin, N.; Martinez-del-Rincon, J. Generating sparse explanations for malicious Android opcode sequences using hierarchical LIME. Comput. Secur. 2024, 137, 103637. [Google Scholar] [CrossRef]
- Potha, N.; Kouliaridis, V.; Kambourakis, G. An extrinsic random-based ensemble approach for android malware detection. Connect. Sci. 2021, 33, 1077–1093. [Google Scholar] [CrossRef]
- Yoo, S.; Kim, S.; Kim, S.; Kang, B.B. AI-HydRa: Advanced hybrid approach using random forest and deep learning for malware classification. Inf. Sci. 2021, 546, 420–435. [Google Scholar] [CrossRef]
- Bai, J.; Yang, Y.; Mu, S.; Ma, Y. Malware detection through mining symbol table of linux executables. Inf. Technol. J. 2013, 12, 380–384. [Google Scholar] [CrossRef]
- Jeon, S.; Moon, J. Malware-Detection Method with a Convolutional Recurrent Neural Network Using Opcode Sequences. Inf. Sci. 2020, 535, 1–15. [Google Scholar] [CrossRef]
- Snow, E.; Alam, M.; Glandon, A.; Iftekharuddin, K. End-to-end Multimodel Deep Learning for Malware Classification. In Proceedings of the 2020 International Joint Conference on Neural Networks (IJCNN), Glasgow, UK, 19–24 July 2020. [Google Scholar] [CrossRef]
- Darem, A.; Abawajy, J.; Makkar, A.; Alhashmi, A.; Alanazi, S. Visualization and deep-learning-based malware variant detection using OpCode-level features. Futur. Gener. Comput. Syst. 2021, 125, 314–323. [Google Scholar] [CrossRef]
- Catak, F.O.; Ahmed, J.; Sahinbas, K.; Khand, Z.H. Data Augmentation based Malware Detection Using Convolutional Neural Networks. PeerJ Comput. Sci. 2021, 7, e346. [Google Scholar] [CrossRef]
- Moreira, C.C.; Moreira, D.C.; de Sales, C.d.S. Improving ransomware detection based on portable executable header using xception convolutional neural network. Comput. Secur. 2023, 130, 103265. [Google Scholar] [CrossRef]
- Jindal, C.; Salls, C.; Aghakhani, H.; Long, K.; Kruegel, C.; Vigna, G. Neurlux: Dynamic malware analysis without feature engineering. In Proceedings of the 35th Annual Computer Security Applications Conference, San Juan, PR, USA, 9–13 December 2019; pp. 444–455. [Google Scholar] [CrossRef]
- Chaganti, R.; Ravi, V.; Pham, T.D. A multiview feature fusion approach for effective malware classification using Deep Learning. J. Inf. Secur. Appl. 2022, 72, 103402. [Google Scholar] [CrossRef]
- Darabian, H.; Homayounoot, S.; Dehghantanha, A.; Hashemi, S.; Karimipour, H.; Parizi, R.M.; Choo, K.-K.R. Detecting Cryptomining Malware: A Deep Learning Approach for Static and Dynamic Analysis. J. Grid Comput. 2020, 18, 293–303. [Google Scholar] [CrossRef]
- Naeem, M.R.; Khan, M.; Abdullah, A.M.; Noor, F.; Khan, M.I.; Ullah, I.; Room, S. A Malware Detection Scheme via Smart Memory Forensics for Windows Devices. Mob. Inf. Syst. 2022, 2022, 9156514. [Google Scholar] [CrossRef]
- Tekerek, A.; Yapici, M.M. A novel malware classification and augmentation model based on convolutional neural network. Comput. Secur. 2022, 112, 102515. [Google Scholar] [CrossRef]
- Naeem, H.; Dong, S.; Falana, O.J.; Ullah, F. Development of a deep stacked ensemble with process based volatile memory forensics for platform independent malware detection and classification. Expert Syst. Appl. 2023, 223, 119952. [Google Scholar] [CrossRef]
- Landman, T.; Nissim, N. Deep-Hook: A trusted deep learning-based framework for unknown malware detection and classification in Linux cloud environments. Neural Netw. 2021, 144, 648–685. [Google Scholar] [CrossRef] [PubMed]
- Pektaş, A.; Acarman, T. Learning to detect Android malware via opcode sequences. Neurocomputing 2020, 396, 599–608. [Google Scholar] [CrossRef]
- Aamir, M.; Iqbal, M.W.; Nosheen, M.; Ashraf, M.U.; Shaf, A.; Almarhabi, K.A.; Alghamdi, A.M.; Bahaddad, A.A. AMDDL model: Android smartphones malware detection using deep learning model. PLoS ONE 2024, 19, e0296722. [Google Scholar] [CrossRef]
- Sudheera, K.L.K.; Divakaran, D.M.; Singh, R.P.; Gurusamy, M. ADEPT: Detection and Identification of Correlated Attack Stages in IoT Networks. IEEE Internet Things J. 2021, 8, 6591–6607. [Google Scholar] [CrossRef]
- Vasan, D.; Alazab, M.; Venkatraman, S.; Akram, J.; Qin, Z. MTHAEL: Cross-architecture iot malware detection based on neural network advanced ensemble learning. IEEE Trans. Comput. 2020, 69, 1654–1667. [Google Scholar] [CrossRef]
- Le, H.V.; Ngo, Q.D.; Le, V.H. Iot botnet detection using system call graphs and one-class CNN classification. Int. J. Innov. Technol. Explor. Eng. 2019, 8, 937–942. [Google Scholar] [CrossRef]
- Shire, R.; Shiaeles, S.; Bendiab, K.; Ghita, B.; Kolokotronis, N. Malware Squid: A Novel IoT Malware Traffic Analysis Framework Using Convolutional Neural Network and Binary Visualisation; 11660 LNCS; Springer International Publishing: Berlin/Heidelberg, Germany, 2019. [Google Scholar]
- Jiang, H.; Lin, J.; Kang, H. FGMD: A robust detector against adversarial attacks in the IoT network. Futur. Gener. Comput. Syst. 2022, 132, 194–210. [Google Scholar] [CrossRef]
- Li, C.; Lv, Q.; Li, N.; Wang, Y.; Sun, D.; Qiao, Y. A novel deep framework for dynamic malware detection based on API sequence intrinsic features. Comput. Secur. 2022, 116, 102686. [Google Scholar] [CrossRef]
- Aditya, W.R.; Girinoto; Hadiprakoso, R.B.; Waluyo, A. Deep Learning for Malware Classification Platform using Windows API Call Sequence. In Proceedings of the International Conference on Informatics, Multimedia, Cyber and Information System (ICIMCIS), Jakarta, Indonesia, 28–29 October 2021; pp. 25–29. [Google Scholar] [CrossRef]
- Ring, M.; Schlör, D.; Wunderlich, S.; Landes, D.; Hotho, A. Malware detection on windows audit logs using LSTMs. Comput. Secur. 2021, 109, 102389. [Google Scholar] [CrossRef]
- Lakshmanarao, M.S.A. Android Malware Detection with Deep Learning using RNN from Opcode Sequences. Int. J. Interact. Mob. Technol. 2022, 16, 145. [Google Scholar] [CrossRef]
- Ma, Z.; Ge, H.; Wang, Z.; Liu, Y.; Liu, X. Droidetec: Android Malware Detection and Malicious Code. arXiv 2020, arXiv:2002.03594. [Google Scholar]
- Sasidharan, S.K.; Thomas, C. MemDroid—LSTM based Malware Detection Framework for Android Devices. In Proceedings of the 2021 IEEE Pune Section International Conference (PuneCon), Pune, India, 16–19 December 2021; pp. 1–6. [Google Scholar] [CrossRef]
- Amer, E.; El-sappagh, S. Robust deep learning early alarm prediction model based on the behavioural smell for android malware. Comput. Secur. 2022, 116, 102670. [Google Scholar] [CrossRef]
- Wu, Y.; Shi, J.; Wang, P.; Zeng, D.; Sun, C. Android malware detection. IET Inf. Secur. 2023, 17, 118–130. [Google Scholar] [CrossRef]
- Jeon, J.; Jeong, B.; Baek, S.; Jeong, Y.S. Hybrid Malware Detection Based on Bi-LSTM and SPP-Net for Smart IoT. IEEE Trans. Ind. Inform. 2022, 18, 4830–4837. [Google Scholar] [CrossRef]
- Mahdavifar, S.; Alhadidi, D.; Ghorbani, A.A. Effective and Efficient Hybrid Android Malware Classification Using Pseudo-Label Stacked Auto-Encoder. J. Netw. Syst. Manag. 2022, 30, 22. [Google Scholar] [CrossRef]
- Hemalatha, J.; Roseline, S.A.; Geetha, S.; Kadry, S.; Damaševičius, R. An efficient densenet-based deep learning model for Malware detection. Entropy 2021, 23, 344. [Google Scholar] [CrossRef]
- Kumar, S.; Janet, B. DTMIC: Deep transfer learning for malware image classification. J. Inf. Secur. Appl. 2022, 64, 103063. [Google Scholar] [CrossRef]
- Huang, X.; Ma, L.; Yang, W.; Zhong, Y. A Method for Windows Malware Detection Based on Deep Learning. J. Signal Process. Syst. 2021, 93, 265–273. [Google Scholar] [CrossRef]
- Xu, P.; Zhang, Y.; Eckert, C.; Zarras, A. HawkEye: Cross-Platform Malware Detection with Representation Learning on Graphs; 12893 LNCS; Springer International Publishing: Berlin/Heidelberg, Germany, 2021. [Google Scholar]
- Yang, Q.; Liu, Y.; Chen, T.; Tong, Y. Federated machine learning: Concept and applications. ACM Trans. Intell. Syst. Technol. 2019, 10, 1–19. [Google Scholar] [CrossRef]
- Rizvi, S.K.J.; Aslam, W.; Shahzad, M.; Saleem, S.; Fraz, M.M. PROUD-MAL: Static analysis-based progressive framework for deep unsupervised malware classification of windows portable executable. Complex Intell. Syst. 2022, 8, 673–685. [Google Scholar] [CrossRef]
- Khan, M.; Baig, D.; Khan, U.S.; Karim, A. Malware Classification Framework using Convolutional Neural Network. In Proceedings of the 2020 International Conference on Cyber Warfare and Security (ICCWS), Islamabad, Pakistan, 20–21 October 2020. [Google Scholar] [CrossRef]
- Amer, E.; Zelinka, I. A dynamic Windows malware detection and prediction method based on contextual understanding of API call sequence. Comput. Secur. 2020, 92, 101760. [Google Scholar] [CrossRef]
- Catak, F.O.; Yazi, A.F.; Elezaj, O.; Ahmed, J. Deep learning based Sequential model for malware analysis using Windows exe API Calls. PeerJ Comput. Sci. 2020, 6, e285. [Google Scholar] [CrossRef]
- Hasan, M.M.; Rahman, M.M. RansHunt: A support vector machines based ransomware analysis framework with integrated feature set. In Proceedings of the 20th International Conference of Computer and Information Technology (ICCIT), Dhaka, Bangladesh, 22–24 December 2018; pp. 1–7. [Google Scholar] [CrossRef]
- Karbab, E.M.B.; Debbabi, M.; Derhab, A. SwiftR: Cross-platform ransomware fingerprinting using hierarchical neural networks on hybrid features. Expert Syst. Appl. 2023, 225, 120017. [Google Scholar] [CrossRef]
- Hwang, C.; Hwang, J.; Kwak, J.; Lee, T. Platform-independent malware analysis applicable to windows and linux environments. Electronics 2020, 9, 793. [Google Scholar] [CrossRef]
- Walkup, E. Mac Malware Detection via Static File Structure Analysis; Stanford University: Stanford, CA, USA, 2014; pp. 1–5. [Google Scholar]
- Pajouh, H.H.; Dehghantanha, A.; Khayami, R.; Choo, K.K.R. Intelligent OS X malware threat detection with code inspection. J. Comput. Virol. Hacking Tech. 2018, 14, 213–223. [Google Scholar] [CrossRef]
- Gao, H.; Cheng, S.; Zhang, W. GDroid: Android malware detection and classification with graph convolutional network. Comput. Secur. 2021, 106, 102264. [Google Scholar] [CrossRef]
- Wang, S.; Chen, Z.; Yan, Q.; Yang, B.; Peng, L.; Jia, Z. A mobile malware detection method using behavior features in network traffic. J. Netw. Comput. Appl. 2019, 133, 15–25. [Google Scholar] [CrossRef]
- Cimitile, A.; Martinelli, F.; Mercaldo, F. Machine learning meets ios malware: Identifying malicious applications on apple environment. In Proceedings of the 3rd International Conference on Information Systems Security and Privacy, Porto, Portugal, 19–21 February 2017; pp. 487–492. [Google Scholar] [CrossRef]
- Zhou, G.; Duan, M.; Xi, Q.; Wu, H. ChanDet: Detection Model for Potential Channel of iOS Applications. J. Phys. Conf. Ser. 2019, 1187, 042045. [Google Scholar] [CrossRef]
- Mercaldo, F.; Santone, A. Deep learning for image-based mobile malware detection. J. Comput. Virol. Hacking Tech. 2020, 16, 157–171. [Google Scholar] [CrossRef]
- Le, H.V.; Ngo, Q.D. V-Sandbox for Dynamic Analysis IoT Botnet. IEEE Access 2020, 8, 145768–145786. [Google Scholar] [CrossRef]
- Zhou, X.; Liang, W.; Li, W.; Yan, K.; Shimizu, S.; Wang, K.I.K. Hierarchical Adversarial Attacks Against Graph-Neural-Network-Based IoT Network Intrusion Detection System. IEEE Internet Things J. 2022, 9, 9310–9319. [Google Scholar] [CrossRef]
- Ashraf, E.; Areed, N.F.; Salem, H.; Abdelhay, E.H.; Farouk, A. FIDChain: Federated Intrusion Detection System for Blockchain-Enabled IoT Healthcare Applications. Healthcare 2022, 10, 1110. [Google Scholar] [CrossRef]
- Guerra-Manzanares, A.; Medina-Galindo, J.; Bahsi, H.; Nomm, S. MedBIoT: Generation of an IoT Botnet Dataset in a Medium-sized IoT Network. In Proceedings of the International Conference on Information Systems Security and Privacy, Auckland, New Zealand, 25–27 February 2020; Volume 2020, pp. 207–218. [Google Scholar] [CrossRef]
- Xiao, L.; Li, Y.; Huang, X.; Du, X. Cloud-based malware detection game for mobile devices with offloading. IEEE Trans. Mob. Comput. 2017, 16, 2742–2750. [Google Scholar] [CrossRef]
- Mouratidis, H.; Shei, S.; Delaney, A. A security requirement modelling language for cloud computing environments. Softw. Syst. Model. 2020, 19, 271–295. [Google Scholar] [CrossRef]
- Nguyen, P.S.; Huy, T.N.; Tuan, T.A.; Trung, P.D.; Long, H.V. Hybrid feature extraction and integrated deep learning for cloud-based malware detection. Comput. Secur. 2025, 150, 104233. [Google Scholar] [CrossRef]
- Stack Exchange. Does WannaCry Infect Linux? 2017. Available online: https://security.stackexchange.com/questions/159397/does-wannacry-infect-linux?newreg=ed46c309743448b3ad646d3b5130f12f (accessed on 22 January 2025).
- Red Hat Blog. Understanding and Mitigating the Dirty Cow Vulnerability. 2016. Available online: https://www.redhat.com/en/blog/understanding-and-mitigating-dirty-cow-vulnerability (accessed on 22 January 2025).
- Apple. About the Security Content of MacOS Ventura 13.3. 2023. Available online: https://support.apple.com/en-us/120945 (accessed on 22 January 2025).
- iPhone User Guide. Unauthorized Modification of iOS. 2025. Available online: https://support.apple.com/en-gb/guide/iphone/iph9385bb26a/ios (accessed on 26 January 2025).
- Sánchez, P.M.S.; Celdrán, A.H.; Bovet, G.; Pérez, G.M. Transfer Learning in Pre-Trained Large Language Models for Malware Detection Based on System Calls. 2024. Available online: http://arxiv.org/abs/2405.09318 (accessed on 26 January 2025).
- Han, S.; Pool, J.; Tran, J.; Dally, W.J. Learning both weights and connections for efficient neural networks. Adv. Neural Inf. Process. Syst. 2015, 2015, 1135–1143. [Google Scholar]
- Jacob, B.; Kligys, S.; Chen, B.; Zhu, M.; Tang, M.; Howard, A.; Adam, H.; Kalenichenko, D. Quantization and Training of Neural Networks for Efficient Integer-Arithmetic-Only Inference. In Proceedings of the 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA, 18–23 June 2018; pp. 2704–2713. [Google Scholar] [CrossRef]
- Hinton, G.; Vinyals, O.; Dean, J. Distilling the Knowledge in a Neural Network. 2015. Available online: http://arxiv.org/abs/1503.02531 (accessed on 26 January 2025).
- Ribeiro, M.T.; Singh, S.; Guestrin, C. ‘Why should I trust you?’ Explaining the predictions of any classifier. In Proceedings of the ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, San Francisco, CA, USA, 13–17 August 2016; pp. 1135–1144. [Google Scholar] [CrossRef]
- Rodis, N.; Sardianos, C.; Papadopoulos, G.T.; Radoglou-Grammatikis, P.; Sarigiannidis, P.; Varlamis, I. Multimodal Explainable Artificial Intelligence: A Comprehensive Review of Methodological Advances and Future Research Directions. IEEE Access 2023, 12, 159794–159820. [Google Scholar] [CrossRef]
- Sokol, K.; Flach, P. Explainability fact sheets: A framework for systematic assessment of explainable approaches. In Proceedings of the 2020 Conference on Fairness, Accountability, and Transparency, Barcelona, Spain, 27–30 January 2020; pp. 56–67. [Google Scholar] [CrossRef]
- Guidotti, R.; Monreale, A.; Ruggieri, S.; Turini, F.; Giannotti, F.; Pedreschi, D. A survey of methods for explaining black box models. ACM Comput. Surv. 2018, 51, 1–42. [Google Scholar] [CrossRef]
- Demirci, D.; Sahin, N.; Sirlancis, M.; Acarturk, C. Static Malware Detection Using Stacked BiLSTM and GPT-2. IEEE Access 2022, 10, 58488–58502. [Google Scholar] [CrossRef]
Platform | File Format | Static Features | Dynamic Features | Memory Features |
---|---|---|---|---|
Windows | Executable (EXE) files. | PE header information: Import/export address tables, section headers, entry point address, timestamp, code section size. File metadata: Size, creation/modification dates, access permissions. Strings: IP addresses, domain names. Opcode sequences: An opcode is the part of a machine instruction that specifies the operation the CPU performs; opcode sequences are the ordered operations extracted from the binary code and characterize the executable file’s behavior. | API calls: Sequence and types of Windows API calls (e.g., CreateProcess, WriteFile). Registry modifications: Registry key creation, deletion, or modification. File system modifications: Creation, deletion, or overwriting of files; encryption of all or a subset of files in the case of ransomware. Host logs: Events extracted from host logs. Network activity: Source and destination IP addresses, TCP ports, Domain Name System (DNS) requests, and network protocols (e.g., HTTP, HTTPS, SMTP). Resource usage: Higher CPU or memory usage may indicate the presence of malware in the system. | Windows memory dumps. |
Linux | Executable and Linkable Format (ELF): code, data, and metadata for execution. | ELF header information: Malware developers manipulate ELF headers to evade or crash standard analysis tools [46]. Internal libraries: Most Linux malware is statically linked to its libraries, eliminating external dependencies [46]. Shared libraries: List of dynamically loaded libraries. Sections and segments: Information on the .text (code) and .data (global variables) segments. | System call patterns: Frequency and type of system calls. Network behavior: Monitoring outgoing/incoming connections and socket creation. | Sections and segments: Memory segments (.text, .data, .bss). |
macOS | Mach-O files: native executable format for macOS. | Code signatures: Presence and structure of code signing. Dynamic libraries: Information on loaded libraries (DYLIBs). | File activity: Monitor file creations, deletions, modifications, and access patterns. Inbound and outbound traffic: Observe and analyze all network traffic, including DNS, HTTP requests, and other communication protocols. Service start/stop: Track each modification linked to service operation. TI reputation services: Utilize threat intelligence to detect malicious files, IP addresses, and domains. | Sandboxing: Memory protection through entitlements. |
Android | APK (Android Package Kit) files: a compressed archive that includes all the resources needed to distribute and install applications on Android devices. | Strings: Domain names, IP addresses, and ransom notes in the case of a ransomware attack. Permissions analysis: The set of permissions the app requests from the user (e.g., camera access, network communication, Bluetooth, and contacts). Manifest information: Details about application components (e.g., activities, services, and receivers). Intents: Messages that allow communication between the components of an app. API calls: API calls referenced in the app’s code, which can indicate malicious behavior. | Behavioral features: Network communication, SMS, data storage behavior. File system features: As on PCs, features extracted from a mobile device’s file system can indicate the presence of malware. User interaction: Detecting ransomware can be achieved by correlating user interactions with application runtime events [47]. System resource analysis: CPU, memory, and battery usage; process reports; and network usage. Network traffic analysis: URLs, IPs, network protocols, certificates, non-encrypted data. | Embedded files: Presence of assets (e.g., shared libraries) impacting memory allocation. Memory dumps: A snapshot of Android’s memory that captures all data and processes in the RAM at a specific time, including system processes, application data, and temporary data from various programs. |
iOS | iOS App Store Package (IPA): specific to iOS for app distribution. | Code signing: Verification of signatures. Sandboxing and entitlements: Permission restrictions. | Objective-C method calls: Runtime behavior. Dynamic behavior: API usage patterns (e.g., contacts, location access). Data encryption: Encrypted data usage. | Entitlements: Defines memory boundaries through sandboxing. |
IoT | Various formats (e.g., BIN, HEX, Linux executables). | Firmware version: Metadata, updates, and patches. Opcode sequences: Operation codes extracted after disassembling the binary file. Control-flow graph (CFG): Extracted from the assembly file. API calls: Extracted from the binary. | Network traffic: Service type (HTTP, SMTP, FTP, etc.), device communication protocols (e.g., MQTT, CoAP), packet sizes transmitted per source IP address, etc. Device-specific behavior: Interactions with sensors, actuators, and device ports. System calls: Timestamp, return value, arguments, and name of each system call. Resource utilization-based features: CPU usage, process usage, and RAM usage. | System call sequences: System-level commands specific to device memory. Memory-mapped IO: Monitoring interactions with memory-mapped I/O (MMIO). Memory buffer usage: Analysis of memory buffers for potential overflows. |
Cloud | VM disk images (e.g., VMDK, QCOW2), container formats (e.g., Docker images). | VM metadata: Hypervisor information (e.g., VM details). Data storage patterns: Interactions with cloud storage. Strings and n-grams. | API usage patterns: Cloud-specific API calls (AWS SDK, Google Cloud API). Container activity: Monitoring processes and network activity in containers. System calls: Extracted from the interactions between applications and the OS kernel during runtime. | Virtual memory dumps: Contains memory-specific features (system calls, memory access). |
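
Two of the static features in the table, embedded strings and n-grams, are simple enough to sketch directly. The Python fragment below is only an illustration: the helper names `extract_strings` and `byte_ngrams`, the minimum string length of 4, and the toy byte buffer are assumptions made here, not choices prescribed by the surveyed papers.

```python
import re
from collections import Counter


def extract_strings(data: bytes, min_len: int = 4) -> list:
    """Return runs of printable ASCII characters at least min_len long."""
    pattern = rb"[\x20-\x7e]{%d,}" % min_len
    return [m.decode("ascii") for m in re.findall(pattern, data)]


def byte_ngrams(data: bytes, n: int = 2) -> Counter:
    """Count overlapping byte n-grams; each key is a bytes object of length n."""
    return Counter(data[i:i + n] for i in range(len(data) - n + 1))


# Toy buffer standing in for a binary; it is not a real executable.
sample = b"\x00\x01http://example.test\x00\x90\x90\x90\xff192.0.2.7\x00"

strings = extract_strings(sample)   # candidate URLs, IPs, domain names
ngrams = byte_ngrams(sample, n=2)   # frequency vector usable as ML input
```

In practice the string list would be filtered for indicators such as IP addresses or domain names, and the n-gram counts would be truncated to the most frequent entries before training.
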
ML Techniques | Algorithms | References |
---|---|---|
Traditional machine learning algorithms | ||
Support Vector Machines (SVMs): This method employs a hyperplane to maximize the margin between malicious and benign samples, proving effective for high-dimensional data. | SVM | [66,67,68] |
K-Nearest Neighbors (KNN): This algorithm classifies samples based on the predominant class of their nearest neighbors, utilizing feature similarity as the primary criterion. | KNN | [55,69] |
Logistic Regression (LR): This approach classifies malware by modeling the relationship between features and a binary outcome (malicious or benign) with a sigmoid function, which maps any real-valued input to the range 0 to 1 so that the output can be interpreted as a probability. | LR | [70,71,72] |
Naïve Bayes (NB): A probabilistic approach that assumes feature independence, which is efficient for text-based malware detection. | NB | [67,68] |
Decision Trees (DTs): Decision trees are a supervised learning method that classifies data by building a tree-like model. The process identifies the most informative features and splits the data into subsets based on these attributes to form nodes. It recursively splits each node until a leaf assigns a final label of benign or malware. | DT | [69,70] |
Ensemble learning algorithms | ||
Random Forest (RF): This approach constructs multiple decision trees and aggregates their outputs through majority voting or averaging, thereby enhancing robustness and accuracy. | RF | [67,68,69,70,72,73,74] |
Gradient Boosting (e.g., XGBoost, LightGBM): This approach sequentially constructs weak learners, specifically decision trees, to minimize errors, thereby providing high accuracy in the analysis of structured malware data. | Gradient Boosting | [72] |
 | XGBoost | [69,72] |
AdaBoost: This approach focuses on challenging samples by modifying weights during the training process, thereby combining weak classifiers into a robust one. | AdaBoost | [68,74] |
Bagging: The Bagging (bootstrap aggregating) technique draws multiple bootstrap samples from the dataset by randomly sampling instances with replacement, trains a model on each sample, and then aggregates the results from these models to enhance generalization. | ||
Deep learning algorithms | ||
Convolutional Neural Networks (CNNs): This approach demonstrates efficacy in image-based malware detection, utilizing automated extraction of spatial features from transformed malware binaries. | CNN | [70,75,76,77,78,79,80,81,82,83,84,85,86,87,88,89,90,91] |
Recurrent Neural Networks (RNNs): This method facilitates the analysis of sequential data, including API call sequences and opcode patterns, for behavioral-based malware identification. | RNN | [75,90,92,93] |
Long Short-Term Memory (LSTM): A variant of the recurrent neural network (RNN) that effectively captures long-term dependencies, particularly applicable for time-series analysis of dynamic malware features. | LSTM | [76,80,82,87,93,94,95,96,97,98,99,100,101,102] |
Gated Recurrent Unit (GRU): It is a type of recurrent neural network (RNN) designed to process sequential data, such as time series or text. This model is more computationally efficient than LSTMs due to fewer parameters and the absence of a separate output gate. | GRU | [95] |
Generative Adversarial Networks (GANs): This process generates synthetic malware samples for data augmentation, thereby enhancing the efficacy of detection systems with limited datasets. | GAN | [84] |
Autoencoders: Autoencoders are unsupervised neural networks used for dimensionality reduction, feature extraction, and anomaly detection. They aim to learn a compressed representation of the input data (encoding) and then reconstruct the input (decoding) as accurately as possible. | VAEs, Sparse Autoencoders etc. | [103] |
Transformer Models (e.g., BERT): Transformers are advanced deep learning architectures based on attention mechanisms designed to handle sequential or contextual data effectively. | BERT | |
Transfer Learning (TL): This is a deep learning approach where a model pre-trained on one task or dataset is reused and fine-tuned for a related but different task. It is particularly effective when the target dataset is small or lacks diversity. | Pre-trained CNNs like Inception, VGG, ResNet50, etc. | [104,105,106] |
Multilayer Perceptron (MLP): It is a type of artificial neural network (ANN) consisting of multiple layers of nodes. It is commonly used in supervised learning tasks such as classification and regression. | MLP | [55,68,70,73,85,107] |
Federated Learning (FL): FL is an emerging training paradigm in which ML models are trained locally on edge devices, such as smartphones and IoT devices, without sharing raw data. Only model parameters or gradients are exchanged with a central aggregator that maintains the global model, which preserves user privacy and enhances security. However, its effectiveness depends on device capabilities and communication overhead. | - | [10,108] |
Large Language Models (LLMs): The ability of LLMs to capture contextual relationships enables the identification of subtle patterns indicative of malicious activities. LLMs assist in automating threat analysis, improving detection accuracy, and aiding in malware classification. | GPT, BERT, ChatGPT-4, Claude | [8,9] |
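
To make two of the descriptions above concrete, the sketch below shows the sigmoid mapping that logistic regression uses to turn a weighted feature sum into a probability, and the bootstrap-and-vote idea behind bagging and random forests. The weights, feature values, and function names are hypothetical and chosen only for illustration; no trained model from the cited works is reproduced here.

```python
import math
import random
from collections import Counter


def sigmoid(z: float) -> float:
    """Map a real-valued score to the interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def predict_proba(weights, bias, features):
    """Logistic regression: sigmoid of the weighted feature sum."""
    z = sum(w * x for w, x in zip(weights, features)) + bias
    return sigmoid(z)


def bootstrap_sample(dataset, rng):
    """Draw len(dataset) instances with replacement, as bagging does."""
    return [rng.choice(dataset) for _ in dataset]


def majority_vote(labels):
    """Aggregate per-model labels by taking the most common one."""
    return Counter(labels).most_common(1)[0][0]


# Hypothetical weights and feature vector, for illustration only.
p_malicious = predict_proba([1.5, -0.5], bias=0.0, features=[2.0, 2.0])
label = "malware" if p_malicious >= 0.5 else "benign"

rng = random.Random(0)  # fixed seed so the resampling is repeatable
resampled = bootstrap_sample(["a", "b", "c", "d"], rng)
vote = majority_vote(["malware", "benign", "malware"])
```

Because bootstrap sampling is done with replacement, a resampled set usually repeats some instances and omits others, which is what gives the individual models their diversity.
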
Reference | Data Source | Feature Category | Feature Name | ML Algorithms | Result (Accuracy) | Limitations |
---|---|---|---|---|---|---|
Static feature-based malware detection | ||||||
[75] | Malimg | Static | Opcode sequences | Deep RNN | 96% | It requires significant computational resources. |
[76] | Microsoft BIG 2015 | Static | Opcodes, images, byte sequence, etc. | DNN, LSTM, and CNN | 98.35% | Ineffective against zero-day malware. |
[104] | BIG 2015, Malimg, MaleVis and Malicia dataset | Static | 2D images | DenseNet | 98.23% | High false-negative rate; evaluated on highly imbalanced datasets. |
[77] | Microsoft BIG 2015 | Static | Image-based opcode features | CNN | 99.12% | Outdated dataset. |
[105] | Malimg dataset, Microsoft BIG 2015 | Static | Grayscale images from PE files | VGG16, VGG19, ResNet50, and inceptionV3 | 98.92% | Cannot detect malware packed using advanced techniques. |
[109] | Malimg | Static | Static signatures | ATT-DNNs | 98.09% | Cannot detect obfuscated malware. |
[78] | Malware API-class | Static | Executable file to static images | CNN | 98.00% | - |
[79] | VirusShare, Hybrid Analysis | Static | Executable file to static images | Xception Convolutional Neural Network (CNN) | 98.20% | - |
[110] | Microsoft BIG 2015 | Static | Malware binary files into static images | DNN | 97.80% | - |
Dynamic feature-based malware detection | ||||||
[94] | VirusShare | Dynamic | Sequences of API calls | Bi-LSTM | 97.31% | Limited to executing samples in a Windows 7 environment. |
[111] | Custom datasets | Dynamic | Sequences of API calls | Markov chain representation | 99.7% | - |
[112] | VirusTotal | Dynamic | API calls | LSTM | 95% | Limited to executing samples in a Windows 7 environment. |
[95] | VirusTotal | Dynamic | API call sequences | LSTM and GRU | 96.8% | Highly imbalanced dataset. |
[66] | CA Technologies VET Zoo | Dynamic | Runtime behavior | MRed, ReliefF, SVM | 99.499% | High computational complexity. |
[96] | Audit log events | Dynamic | Process names, action types, and accessed file | LSTM | 91.05% | High false positives and lack of scalability. |
[80] | Multiclass dataset (Ember Dataset, private dataset) | Dynamic | Loaded DLLs, registry changes, API call sequences and file changes | CNN-LSTM | 96.8% | Susceptible to adversarial attacks. |
Hybrid feature-based malware detection techniques | ||||||
[81] | VirusTotal | Hybrid (static and dynamic) | Combination of static and dynamic features (PE section, PE import, PE API, and PE images) | CNN | 97% | Failed to validate the robustness against adversarial attacks. |
[73] | The Korea Internet & Security Agency (KISA) | Hybrid | File and header size, file section counts, entropy, file system changes, API calls, loaded-DLL information, network activities, etc. | RF, MLP | 85.1% | Feature extraction is time-consuming. |
[106] | VirusShare | Hybrid | Image-based static and dynamic features | VGG16 | 94.70% | - |
[113] | VirusShare | Hybrid | Function length frequency representation, registry activities, API calls, and file operation features | SVM | 97.10% | Small dataset. |
[82] | VirusTotal | Hybrid | Opcodes and system calls | CNN, LSTM, and an attention-based LSTM | 99% | High computational cost. |
Memory feature-based malware detection techniques | ||||||
[83] | Dumpware10 | Memory | Memory images of running processes | CNN | 98% | Malware processing cost is high under limited resource capabilities. |
[84] | Dumpware10, BIG2015 dataset | Memory | Memory images of running processes | GAN and CNN | 99.86% for BIG2015 dataset | Uses only one data type (bytes); the dataset needs to be more diverse. |
[85] | CIC-MalMem-2022 https://www.unb.ca/cic/datasets/malmem-2022.html (accessed on 3 February 2025) | Memory | Memory images of running processes | CNN and MLP | 99.8% | Training time complexity and vulnerability to adversarial attacks. |
[70] | CIC-MalMem-2022 https://www.unb.ca/cic/datasets/malmem-2022.html (accessed on 3 February 2025) | Memory | Multi-memory features | RF, DT, LR, MLP, and CNN | 99.89% | - |
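
Several of the static approaches in the table convert an executable into a grayscale image before passing it to a CNN. One common way to do this, treating each byte as one pixel intensity and wrapping at a fixed row width, can be sketched as follows. The function name, the row width of 16, and the zero padding of the last row are assumptions made for this example; the cited works may use other widths or resize the image afterwards.

```python
def bytes_to_grayscale(data: bytes, width: int = 16) -> list:
    """Arrange bytes as rows of `width` pixels (0-255), zero-padding the last row."""
    if width <= 0:
        raise ValueError("width must be positive")
    # Round the length up to the next multiple of width.
    padded_len = -(-len(data) // width) * width
    padded = data.ljust(padded_len, b"\x00")
    return [list(padded[i:i + width]) for i in range(0, padded_len, width)]


# Toy input of 40 bytes; it is not a real PE file.
image = bytes_to_grayscale(bytes(range(40)), width=16)
```

The resulting list of rows can be handed to an array or imaging library to obtain the 2D input that an image classifier expects.
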
Reference | Data Source | Feature Category | Feature Name | ML Algorithms | Result (Accuracy) |
---|---|---|---|---|---|
[107] | AndroZoo, VirusShare, and clean Ubuntu libraries | Static | Assembly instructions (control-flow graphs) | MLP | 96.82% |
[115] | VirusShare | Static | Strings from binary data | DNN | 94% |
[74] | VX heavens | Dynamic | System calls | J48, random forest, AdaBoostM1 (J48), and IBk | 98% |
[86] | VirusShare | Memory | Memory dumps | CNN | 99.9% |
[46] | VirusTotal and VirusShare | Memory | Multi-memory features | DNNs | 98.8% |
Reference | Data Source | Feature Category | Feature Name | ML Algorithms | Accuracy | Limitations |
---|---|---|---|---|---|---|
Static feature-based Android malware detection techniques | ||||||
[118] | MalGenome | Static | Call graphs | GCN | 98.99% | Not representative of real-world scenarios. |
[87] | Contagio Mobile | Static | Opcode sequences | CNN-LSTM | 91.42% | Unable to manage obfuscated malware. |
[71] | MalDroid-2020 dataset | Static | Opcode sequences (histograms of n-grams) | LR | 93.56% | Adversarial attack resistance and handling evolving malware are not addressed. |
[97] | CIC-Inves2017 | Static | Opcode sequences | LSTM | 96% | Small dataset (1500 apps). |
[72] | Drebin, VirusShare, AndroZoo | Static | Permissions, intents | Base models (LR, MLP, and SGD), ensemble learning | 99.1% | - |
[88] | Drebin dataset | Static | Opcode sequences, permissions, API calls | CNN | 99.92% | Lack of malware diversity and scalability. |
Dynamic feature-based Android malware detection techniques | ||||||
[67] | McAfee | Dynamic | Actions/events | Base models (NB, SL, SVM RBF, J48, PART, RF), deep learning | 97.8% | - |
[98] | Google Play Store | Dynamic | API calls | Bi-LSTM | 97.22% | High detection time. |
[119] | Drebin dataset | Dynamic | Network traffic, permissions, intents, API calls | C4.5 | 97.89% | Small dataset. |
[99] | MalGenome | Dynamic | System call sequences | LSTM | 99.23% | - |
[100] | Custom dataset | Dynamic | API and system call sequences | LSTM | 96.8% | - |
Hybrid feature-based Android malware detection techniques | ||||||
[67] | McAfee | Hybrid | Permissions, intents, API calls, actions/events | Base models (NB, SL, SVM RBF, J48, PART, RF), deep learning | 99.6% detection | - |
[55] | Contagio Mobile (https://contagiominidump.blogspot.com/, accessed on 3 February 2025), VirusShare, and Genome | Hybrid | Runtime behaviors across various levels (kernel, application, user, and package) | K-NN, LDC, QDC, MLP, Parzen Classifier (PARZC) and RBF | 96% | This method is susceptible to mimicry attacks and ineffective against unknown malware. |
[101] | VirusShare, Drebin, DroidAnalytics and CICInvesAndMal2019/2000 https://www.unb.ca/cic/ (accessed on 3 February 2025) | Hybrid | Permission requests, API and system call sequences, opcode sequences, and graph structures, including abstract syntax trees, control-flow, and data-flow graphs | Bi-LSTM and GNN | 95.94% | Need more scalable static analyses. |
[103] | CICMal-Droid2020 | Hybrid | Permissions, intents, system calls, composite behaviors, and network traffic packets | Pseudo-label stacked autoencoder (PLSAE) | 98.28% | - |
Memory feature-based Android malware detection techniques | ||||||
[85] | AndroZoo project https://androzoo.uni.lu/ (accessed on 3 February 2025) | Memory | Process memory dumps | Ensemble of MLP and CNN | 94.3% | Vulnerable to adversarial attacks. |
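
The tables above report accuracy, and several entries list dataset imbalance as a limitation. The short worked example below illustrates why the two interact: on an imbalanced test set, a classifier that never flags malware can still score high accuracy. The function name `binary_metrics` and the 95/5 class split are assumptions made purely for illustration and do not come from any of the cited studies.

```python
def binary_metrics(y_true, y_pred, positive=1):
    """Return accuracy, precision, recall, and F1 for a binary task."""
    pairs = list(zip(y_true, y_pred))
    if not pairs:
        raise ValueError("at least one sample is required")
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = len(pairs) - tp - fp - fn
    # Ratios with an empty denominator are reported as 0.0 by convention.
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {
        "accuracy": (tp + tn) / len(pairs),
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


# Hypothetical test set: 95 benign samples (0) and 5 malware samples (1).
y_true = [0] * 95 + [1] * 5
y_pred = [0] * 100  # a degenerate model that labels everything benign
metrics = binary_metrics(y_true, y_pred)
```

Here the accuracy is 0.95 while recall on the malware class is 0, which is why precision, recall, and F1 are useful companions to accuracy when comparing detectors trained on imbalanced data.
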
Aspect | Windows | Linux | macOS |
---|---|---|---|
Runtime issues | DLL dependencies | ELF loader challenges | Sandboxing restricts runtime access |
Cross-platform threats | Vulnerable to malware like Mirai | Targeted by cross-platform threats | Exploits shared vulnerabilities |
Unique vulnerabilities | EternalBlue (WannaCry) exploits SMB protocol | Dirty COW (CVE-2016-5195) targets kernel | Gatekeeper bypass allows malicious execution |
Perception of security | High attack surface due to widespread usage | Perceived as secure but targeted for IoT/cloud | Perceived as secure, leading to lax practices |
Protocols and frameworks | NTLM and NetBIOS exploited | Kernel vulnerabilities targeted | AppleScript and iCloud exploited |
Aspect | Android | iOS |
---|---|---|
OS update delays | Delays due to dependency on manufacturers and carriers, leaving devices vulnerable. | Timely updates controlled by Apple, ensuring uniform distribution. |
Third-party apps | Third-party app stores and sideloading increase security risks. | Apps restricted to App Store with strict review guidelines, reducing risks. |
Device diversity | Wide range of devices and custom OS versions complicate uniform patching. | Limited device variants and centralized control ensure consistent security. |
Jailbreaking risks | Not applicable (rooting exists but is less common). | Jailbreaking removes iOS restrictions, exposing devices to malware. |
iCloud phishing | Not applicable (Google account phishing exists but is less targeted). | iCloud phishing leads to account takeovers and data theft. |
© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Share and Cite
Ferdous, J.; Islam, R.; Mahboubi, A.; Islam, M.Z. A Survey on ML Techniques for Multi-Platform Malware Detection: Securing PC, Mobile Devices, IoT, and Cloud Environments. Sensors 2025, 25, 1153. https://doi.org/10.3390/s25041153