RansomFormer: A Cross-Modal Transformer Architecture for Ransomware Detection via the Fusion of Byte and API Features
Abstract
1. Introduction
- First, we design a dual-stream architecture that processes raw bytes from PE files in one stream and tokenized API names in the other. A cross-attention mechanism links the byte-level features with the API signals. We also employ a self-supervised pre-training strategy that applies masked language modeling to raw bytes and contrastive learning to API names, improving feature representation in both modalities.
- Second, we construct a large ransomware dataset covering 161 families, because existing datasets are outdated, lack family variants, or are limited in size. This dataset enables further research in ransomware detection.
- Third, we show that static data alone can achieve high detection accuracy and that adding dynamic data improves detection further. Fusing byte data with API imports enhances static feature extraction, leading to better detection performance.
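To make the cross-attention idea in the first contribution concrete, the following is a minimal single-head sketch in plain Python. It is an illustration under stated assumptions, not the authors' implementation: the choice of byte features as queries and API features as keys and values, the toy feature vectors, and the function names are all hypothetical, and learned projections, multiple heads, and dropout are omitted.

```python
import math
from typing import List

Matrix = List[List[float]]


def softmax(row: List[float]) -> List[float]:
    """Numerically stable softmax over one row of scores."""
    peak = max(row)
    exps = [math.exp(x - peak) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


def cross_attention(queries: Matrix, keys: Matrix, values: Matrix) -> Matrix:
    """Single-head scaled dot-product attention.

    Each query row (one modality) attends over all key rows (the other
    modality) and returns a weighted average of the value rows.
    """
    dim = len(queries[0])
    fused = []
    for q in queries:
        scores = [
            sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(dim) for k in keys
        ]
        weights = softmax(scores)
        fused.append(
            [
                sum(w * v[j] for w, v in zip(weights, values))
                for j in range(len(values[0]))
            ]
        )
    return fused


# Hypothetical toy features: 2 byte-stream positions, 3 API tokens, width 4.
byte_feats = [[1.0, 0.0, 0.0, 0.0], [0.0, 1.0, 0.0, 0.0]]
api_feats = [
    [1.0, 0.0, 0.0, 0.0],
    [0.0, 1.0, 0.0, 0.0],
    [0.0, 0.0, 1.0, 0.0],
]

# Assumption: byte features act as queries, API features as keys and values.
fused = cross_attention(byte_feats, api_feats, api_feats)
```

Each row of `fused` has the same width as the API features, and the API token most similar to a given byte position receives the largest weight in that position's output.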
2. Related Work
2.1. Overview of Ransomware Detection Methods
2.2. Transformer-Based Approaches in Ransomware Detection
3. Materials and Methodology
3.1. Data Collection
3.2. Static and Dynamic Feature Extraction
3.2.1. PE Byte Representation
3.2.2. API Representation
- API names are tokenized using a vocabulary-based lookup as follows:
- Unknown API names are mapped to a special token `<UNK>`.
- Sequences exceeding 1024 tokens are truncated.
- Shorter sequences are zero-padded to maintain fixed length.
- `[CLS]` denotes the start of the sequence.
- API name and arguments are tokenized separately.
- `[SEP]` marks sequence boundaries.
- Paths are anonymized.
- Hexadecimal numbers are converted to decimal.
- Text is standardized to lowercase.
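The tokenization and normalization rules listed above can be sketched as follows. This is a hedged illustration rather than the authors' code: the regular expressions, the `<path>` placeholder, the `<PAD>` entry, the placement of `[SEP]` after every call, and all function names are assumptions introduced here.

```python
import re
from typing import Dict, Iterable, List, Tuple

MAX_LEN = 1024  # sequences longer than this are truncated
PAD_ID = 0      # shorter sequences are zero-padded
UNK = "<UNK>"

# Assumed patterns: Windows drive paths and 0x-prefixed hex literals.
PATH_RE = re.compile(r"[a-z]:\\[^\s,]*")
HEX_RE = re.compile(r"\b0x[0-9a-f]+\b")


def build_vocab(tokens: Iterable[str]) -> Dict[str, int]:
    """Vocabulary lookup table; id 0 is reserved for padding."""
    vocab = {"<PAD>": PAD_ID, UNK: 1, "[CLS]": 2, "[SEP]": 3}
    for tok in tokens:
        vocab.setdefault(tok, len(vocab))
    return vocab


def normalize_argument(arg: str) -> str:
    """Lowercase, anonymize paths, and convert hex numbers to decimal."""
    arg = arg.lower()
    arg = PATH_RE.sub("<path>", arg)
    return HEX_RE.sub(lambda m: str(int(m.group(0), 16)), arg)


def tokenize_trace(calls: List[Tuple[str, List[str]]]) -> List[str]:
    """Tokenize API names and arguments separately.

    Assumption: [CLS] opens the sequence and [SEP] closes each call.
    """
    tokens = ["[CLS]"]
    for name, args in calls:
        tokens.append(name.lower())
        for arg in args:
            tokens.extend(normalize_argument(arg).split())
        tokens.append("[SEP]")
    return tokens


def encode(tokens: List[str], vocab: Dict[str, int], max_len: int = MAX_LEN) -> List[int]:
    """Map tokens to ids, truncate to max_len, then zero-pad."""
    ids = [vocab.get(tok, vocab[UNK]) for tok in tokens][:max_len]
    return ids + [PAD_ID] * (max_len - len(ids))


# Hypothetical single-call trace.
calls = [("CreateFileW", ["C:\\Users\\alice\\doc.txt", "0x1F"])]
tokens = tokenize_trace(calls)
vocab = build_vocab(["createfilew", "<path>"])
ids = encode(tokens, vocab, max_len=8)
```

In this toy run the decimal token `31` is not in the vocabulary, so it maps to the `<UNK>` id, and the sequence is padded with zeros up to the requested length.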
3.3. Model Architecture
3.3.1. Input Encoding
3.3.2. Cross-Modal Attention
3.3.3. Classification
4. Experiments and Results
4.1. Data Extraction and Preprocessing
4.2. Model Evaluation
- Accuracy: Represents the ratio of correctly classified instances to the total number of instances.
- Precision: Indicates the proportion of correctly identified ransomware samples out of all instances predicted as ransomware.
- Recall (Sensitivity): Measures the proportion of actual ransomware samples the model correctly identified.
- F1-score: The harmonic mean of precision and recall. It provides a balanced measure of the model’s performance and is particularly useful when dealing with imbalanced datasets.
- FPR (False Positive Rate): Represents the proportion of benign samples incorrectly classified as ransomware.
- TP (True Positive): Number of ransomware samples correctly identified as ransomware.
- TN (True Negative): Number of benign samples correctly identified as benign.
- FP (False Positive): Number of benign samples incorrectly identified as ransomware.
- FN (False Negative): Number of ransomware samples incorrectly identified as benign.
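The metric definitions above follow directly from the four confusion-matrix counts. The sketch below computes them in plain Python. The function name and the example counts are assumptions: the counts are not reported in the article and were chosen only because, for a test set of 1000 benign and 1000 ransomware samples, they reproduce the percentages listed for Dataset 1.

```python
from typing import Dict


def _ratio(numerator: float, denominator: float) -> float:
    """Return numerator / denominator, or 0.0 when the denominator is zero."""
    return numerator / denominator if denominator else 0.0


def detection_metrics(tp: int, tn: int, fp: int, fn: int) -> Dict[str, float]:
    """Compute accuracy, precision, recall, F1-score, and FPR as fractions."""
    precision = _ratio(tp, tp + fp)
    recall = _ratio(tp, tp + fn)
    return {
        "accuracy": _ratio(tp + tn, tp + tn + fp + fn),
        "precision": precision,
        "recall": recall,
        "f1": _ratio(2 * precision * recall, precision + recall),
        "fpr": _ratio(fp, fp + tn),
    }


# Hypothetical counts consistent with a 1000/1000 test split.
metrics = detection_metrics(tp=992, tn=993, fp=7, fn=8)
percentages = {name: round(value * 100, 2) for name, value in metrics.items()}
```

With these illustrative counts, `percentages` holds 99.25 for accuracy, 99.3 for precision, 99.2 for recall, 99.25 for F1-score, and 0.7 for FPR.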
4.3. Hyperparameter Settings
4.4. Model Training
4.5. Results
4.6. Comparison of Ransomware Detection Methods
5. Conclusions
Author Contributions
Funding
Informed Consent Statement
Data Availability Statement
Acknowledgments
Conflicts of Interest
References
- Brewer, R. Ransomware attacks: Detection, prevention and cure. Netw. Secur. 2016, 2016, 5–9.
- Everett, C. Ransomware: To pay or not to pay? Comput. Fraud. Secur. 2016, 2016, 8–12.
- Gazet, A. Comparative analysis of various ransomware virii. J. Comput. Virol. 2010, 6, 77–90.
- AlMajali, A.; Elmosalamy, A.; Safwat, O.; Abouelela, H. Adaptive Ransomware Detection Using Similarity-Preserving Hashing. Appl. Sci. 2024, 14, 9548.
- Lee, J.; Yun, J.; Lee, K. A Study on Countermeasures against Neutralizing Technology: Encoding Algorithm-Based Ransomware Detection Methods Using Machine Learning. Electronics 2024, 13, 1030.
- Alzahrani, S.; Xiao, Y.; Sun, W. An Analysis of Conti Ransomware Leaked Source Codes. IEEE Access 2022, 10, 100178–100193.
- Alzahrani, S.; Xiao, Y.; Asiri, S. Conti Ransomware Development Evaluation. In Proceedings of the 2023 ACM Southeast Conference, New York, NY, USA, 12–14 April 2023; ACM SE '23; pp. 39–46.
- Albin Ahmed, A.; Shaahid, A.; Alnasser, F.; Alfaddagh, S.; Binagag, S.; Alqahtani, D. Android Ransomware Detection Using Supervised Machine Learning Techniques Based on Traffic Analysis. Sensors 2024, 24, 189.
- Kenyon, B.; McCafferty, J. Ransomware Recovery. ITNOW 2016, 58, 32–33.
- Lee, Y.; Lee, J.; Ryu, D.; Park, H.; Shin, D. Clop Ransomware in Action: A Comprehensive Analysis of Its Multi-Stage Tactics. Electronics 2024, 13, 3689.
- Andronio, N.; Zanero, S.; Maggi, F. HelDroid: Dissecting and Detecting Mobile Ransomware. In Proceedings of the Research in Attacks, Intrusions, and Defenses, Kyoto, Japan, 2–4 November 2015; Bos, H., Monrose, F., Blanc, G., Eds.; Springer: Cham, Switzerland, 2015; pp. 382–404.
- Drabent, K.; Janowski, R.; Mongay Batalla, J. How to Circumvent and Beat the Ransomware in Android Operating System—A Case Study of Locker.CB!tr. Electronics 2024, 13, 2212.
- Gómez-Hernández, J.A.; García-Teodoro, P. Lightweight Crypto-Ransomware Detection in Android Based on Reactive Honeyfile Monitoring. Sensors 2024, 24, 2679.
- Gazzan, M.; Sheldon, F.T. An Incremental Mutual Information-Selection Technique for Early Ransomware Detection. Information 2024, 15, 194.
- Bang, J.; Kim, J.N.; Lee, S. Entropy Sharing in Ransomware: Bypassing Entropy-Based Detection of Cryptographic Operations. Sensors 2024, 24, 1446.
- Davidian, M.; Kiperberg, M.; Vanetik, N. Early Ransomware Detection with Deep Learning Models. Future Internet 2024, 16, 291.
- Albshaier, L.; Almarri, S.; Rahman, M.M.H. Earlier Decision on Detection of Ransomware Identification: A Comprehensive Systematic Literature Review. Information 2024, 15, 484.
- Gazzan, M.; Sheldon, F.T. Novel Ransomware Detection Exploiting Uncertainty and Calibration Quality Measures Using Deep Learning. Information 2024, 15, 262.
- Alqahtani, A.; Sheldon, F.T. eMIFS: A Normalized Hyperbolic Ransomware Deterrence Model Yielding Greater Accuracy and Overall Performance. Sensors 2024, 24, 1728.
- Yamany, B.; Elsayed, M.S.; Jurcut, A.D.; Abdelbaki, N.; Azer, M.A. A Holistic Approach to Ransomware Classification: Leveraging Static and Dynamic Analysis with Visualization. Information 2024, 15, 46.
- Kharraz, A.; Robertson, W.; Balzarotti, D.; Bilge, L.; Kirda, E. Cutting the Gordian Knot: A Look Under the Hood of Ransomware Attacks. In Proceedings of the Detection of Intrusions and Malware, and Vulnerability Assessment, Milan, Italy, 9–10 July 2015; Almgren, M., Gulisano, V., Maggi, F., Eds.; Springer: Cham, Switzerland, 2015; pp. 3–24.
- Li, J.; Yang, G.; Shao, Y. Ransomware Detection Model Based on Adaptive Graph Neural Network Learning. Appl. Sci. 2024, 14, 4579.
- Goswami, S.; Kumar, A. A transformative deep learning framework for traffic modelling using sensors-based multi-resolution traffic data. Int. J. Sens. Netw. 2023, 42, 145–155.
- Chen, D.; Nie, M.; Gan, Q.; Wang, D. Evolving network representation learning based on recurrent neural network. Int. J. Sens. Netw. 2024, 46, 114–122.
- Vaswani, A.; Shazeer, N.; Parmar, N.; Uszkoreit, J.; Jones, L.; Gomez, A.N.; Kaiser, L.; Polosukhin, I. Attention Is All You Need. arXiv 2023, arXiv:1706.03762.
- Deng, F.; Tian, H.; Zhao, X.; Han, D. Lightweight remote sensing road detection with an attention-augmented transformer. Int. J. Sens. Netw. 2024, 46, 245–259.
- Asiri, S.; Xiao, Y.; Li, T. PhishTransformer: A Novel Approach to Detect Phishing Attacks Using URL Collection and Transformer. Electronics 2024, 13, 30.
- Alshomrani, M.; Albeshri, A.; Alturki, B.; Alallah, F.S.; Alsulami, A.A. Survey of Transformer-Based Malicious Software Detection Systems. Electronics 2024, 13, 4677.
- Lin, H.; Cheng, X.; Wu, X.; Yang, F.; Shen, D.; Wang, Z.; Song, Q.; Yuan, W. CAT: Cross Attention in Vision Transformer. arXiv 2021, arXiv:2106.05786.
- Chen, C.F.; Fan, Q.; Panda, R. CrossViT: Cross-Attention Multi-Scale Vision Transformer for Image Classification. arXiv 2021, arXiv:2103.14899.
- Rahima Manzil, H.H.; Naik, S.M. Android ransomware detection using a novel hamming distance based feature selection. J. Comput. Virol. Hacking Tech. 2024, 20, 71–93.
- Deng, X.; Cen, M.; Jiang, M.; Lu, M. Ransomware early detection using deep reinforcement learning on portable executable header. Clust. Comput. 2024, 27, 1867–1881.
- Chew, C.J.W.; Kumar, V.; Patros, P.; Malik, R. Real-time system call-based ransomware detection. Int. J. Inf. Secur. 2024, 23, 1839–1858.
- Aljabri, M.; Alhaidari, F.; Albuainain, A.; Alrashidi, S.; Alansari, J.; Alqahtani, W.; Alshaya, J. Ransomware detection based on machine learning using memory features. Egypt. Inform. J. 2024, 25, 100445.
- Cen, M.; Jiang, F.; Doss, R. RansoGuard: A RNN-based framework leveraging pre-attack sensitive APIs for early ransomware detection. Comput. Secur. 2025, 150, 104293.
- Coglio, F.; Lekssays, A.; Carminati, B.; Ferrari, E. Early-Stage Ransomware Detection Based on Pre-attack Internal API Calls. In Proceedings of the Advanced Information Networking and Applications, Juiz de Fora, Brazil, 29–31 March 2023; Barolli, L., Ed.; Springer: Cham, Switzerland, 2023; pp. 417–429.
- Sood, I.; Sharma, V. TLERAD: Transfer Learning for Enhanced Ransomware Attack Detection. Comput. Mater. Contin. 2024, 81, 2791–2818.
- Kuswanto, D.; Anjad, M.R. Application of Improved Random Forest Method and C4.5 Algorithm as Classifier to Ransomware Detection Based on the Frequency Appearance of API Calls. In Proceedings of the 2021 IEEE 7th Information Technology International Seminar (ITIS), Surabaya, Indonesia, 6–8 October 2021; pp. 1–6.
- Al-rimy, B.A.S.; Maarof, M.A.; Shaid, S.Z.M. Crypto-ransomware early detection model using novel incremental bagging with enhanced semi-random subspace selection. Future Gener. Comput. Syst. 2019, 101, 476–491.
- Ciaramella, G.; Iadarola, G.; Martinelli, F.; Mercaldo, F.; Santone, A. Explainable Ransomware Detection with Deep Learning Techniques. J. Comput. Virol. Hacking Tech. 2024, 20, 317–330.
- Gajjar, A.; Kashyap, P.; Aysu, A.; Franzon, P.; Choi, Y.; Cheng, C.; Pedretti, G.; Ignowski, J. RD-FAXID: Ransomware Detection with FPGA-Accelerated XGBoost. ACM Trans. Reconfigurable Technol. Syst. 2024, 17.
- Ashwini, A.; Nagasundara, K.B. An intelligent ransomware attack detection and classification using dual vision transformer with Mantis Search Split Attention Network. Comput. Electr. Eng. 2024, 119, 109509.
- Gaber, M.; Ahmed, M.; Janicke, H. Zero day ransomware detection with Pulse: Function classification with Transformer models and assembly language. Comput. Secur. 2025, 148, 104167.
- MalwareBazaar. Available online: https://bazaar.abuse.ch/browse (accessed on 2 January 2025).
- VirusShare. Available online: https://virusshare.com (accessed on 2 January 2025).
- VirusTotal. Available online: https://www.virustotal.com/gui/home/upload (accessed on 2 January 2025).
- Carrera, E. PEfile: Python Module for Parsing and Analyzing PE Files. Available online: https://github.com/erocarrera/pefile (accessed on 10 January 2025).
- GitHub-cert-ee/cuckoo3: Cuckoo3 Is a Python 3 Open Source Automated Malware Analysis System. Available online: https://github.com/cert-ee/cuckoo3 (accessed on 10 January 2025).
- Singh, A.; Mushtaq, Z.; Abosaq, H.A.; Mursal, S.N.F.; Irfan, M.; Nowakowski, G. Enhancing Ransomware Attack Detection Using Transfer Learning and Deep Learning Ensemble Models on Cloud-Encrypted Data. Electronics 2023, 12, 3899.
- Taheri, R.; Shojafar, M.; Arabikhan, F.; Gegov, A. Unveiling vulnerabilities in deep learning-based malware detection: Differential privacy driven adversarial attacks. Comput. Secur. 2024, 146, 104035.
- Aryal, K.; Gupta, M.; Abdelsalam, M.; Kunwar, P.; Thuraisingham, B. A Survey on Adversarial Attacks for Malware Analysis. IEEE Access 2025, 13, 428–459.
- Imran, M.; Appice, A.; Malerba, D. Evaluating Realistic Adversarial Attacks against Machine Learning Models for Windows PE Malware Detection. Future Internet 2024, 16, 168.
- Shafin, S.S.; Karmakar, G.; Mareels, I. Obfuscated Memory Malware Detection in Resource-Constrained IoT Devices for Smart City Applications. Sensors 2023, 23, 5348.
- Naseer, M.; Ullah, F.; Ijaz, S.; Naeem, H.; Alsirhani, A.; Alwakid, G.N.; Alomari, A. Obfuscated Malware Detection and Classification in Network Traffic Leveraging Hybrid Large Language Models and Synthetic Data. Sensors 2025, 25, 202.
Family | Count | Family | Count | Family | Count |
---|---|---|---|---|---|
Loki | 1111 | STOP | 866 | LockBit | 592 |
Zeppelin | 422 | ConvAgent | 391 | StopCrypt | 281 |
GandCrypt | 224 | GandCrab | 189 | PornoAsset | 174 |
MSIL | 57 | Xorist | 44 | Cerber | 37 |
Brresmon | 27 | Nymaim | 24 | BlackMatter | 23 |
FileCryptor | 22 | Urausy | 22 | Osiris | 16 |
Yakes | 13 | TeslaCrypt | 13 | Blocker | 13 |
Delshad | 12 | Locky | 11 | HTOT | 10 |
MarsStealer | 10 | Conti | 10 | Phobos | 10 |
DarkSide | 9 | Gen2 | 9 | LockScreen | 9 |
Hive | 9 | Instabot | 8 | Vidar | 8 |
Smoke | 8 | TorrentLocker | 8 | Wanna | 8 |
PennyWise | 7 | AvKill | 7 | Fragtor | 7 |
Panda | 6 | Hruu | 6 | Nemty | 6 |
Poison | 6 | Sodin | 6 | Dharma | 6 |
MedusaLocker | 6 | WannaCryptor | 6 | GPCode | 5 |
Component | Parameter | Value |
---|---|---|
API Encoder | Embedding Dimension | 256 |
API Encoder | Number of Transformer Layers | 8 |
API Encoder | Number of Attention Heads | 8 |
Byte Encoder | Input Shape | (1, 1024) |
Byte Encoder | First Conv Layer | 64 filters, 5×1 kernel |
Byte Encoder | Second Conv Layer | 128 filters, 3×1 kernel |
Byte Encoder | Pooling Output | 64 features |
Byte Encoder | Output Dimension | 256 |
Cross-Attention | Embedding Dimension | 256 |
Cross-Attention | Number of Heads | 8 |
Cross-Attention | Dropout Rate | 0.2 |
Classifier | Hidden Layer 1 | 512 neurons |
Classifier | Hidden Layer 2 | 256 neurons |
Classifier | Output Layer | 2 neurons |
Classifier | Dropout Rate 1 | 0.6 |
Classifier | Dropout Rate 2 | 0.4 |
Classifier | Loss Function | CrossEntropyLoss |
Optimizer (AdamW) | Learning Rate | 0.001 |
Optimizer (AdamW) | Weight Decay | 1 × 10⁻⁵ |
Training | Learning Rate | 0.001 |
Training | Batch Size | 64 |
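The hyperparameters in the table above can be gathered into a single configuration object so that derived quantities are checked in one place. The sketch below is illustrative only: the class name, field names, and helper methods are hypothetical, and only the numeric values come from the table. The example call uses 6000 training samples, which matches 3000 benign plus 3000 ransomware samples.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class RansomFormerConfig:
    # API encoder
    api_embed_dim: int = 256
    api_num_layers: int = 8
    api_num_heads: int = 8
    # Byte encoder
    byte_input_len: int = 1024
    conv1_filters: int = 64
    conv1_kernel: int = 5
    conv2_filters: int = 128
    conv2_kernel: int = 3
    pool_output_features: int = 64
    byte_output_dim: int = 256
    # Cross-attention
    cross_embed_dim: int = 256
    cross_num_heads: int = 8
    cross_dropout: float = 0.2
    # Classifier
    hidden1: int = 512
    hidden2: int = 256
    num_classes: int = 2
    dropout1: float = 0.6
    dropout2: float = 0.4
    # Optimizer (AdamW) and training
    learning_rate: float = 1e-3
    weight_decay: float = 1e-5
    batch_size: int = 64

    def head_dim(self) -> int:
        """Per-head width of the cross-attention block."""
        if self.cross_embed_dim % self.cross_num_heads != 0:
            raise ValueError("embedding dimension must be divisible by heads")
        return self.cross_embed_dim // self.cross_num_heads

    def steps_per_epoch(self, num_train_samples: int) -> int:
        """Number of mini-batches per epoch, counting a final partial batch."""
        return math.ceil(num_train_samples / self.batch_size)


cfg = RansomFormerConfig()
per_head = cfg.head_dim()
steps = cfg.steps_per_epoch(6000)
```

Both encoders emit 256-dimensional features, the same width as the cross-attention block, and each of the 8 heads then works on a 32-dimensional slice.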
Dataset | Subset | Benign Samples | Ransomware Samples |
---|---|---|---|
Dataset 1 | Training Set | 3000 | 3000 |
Dataset 1 | Validation Set | 1000 | 1000 |
Dataset 1 | Test Set | 1000 | 1000 |
Dataset 2 | Training Set | 800 | 800 |
Dataset 2 | Validation Set | 300 | 300 |
Dataset 2 | Test Set | 300 | 300 |
Dataset | Accuracy | Precision | Recall | F1-Score | FPR |
---|---|---|---|---|---|
Dataset 1 | 99.25% | 99.30% | 99.20% | 99.25% | 0.70% |
Dataset 2 | 99.50% | 99.67% | 99.33% | 99.50% | 0.33% |
Reference | Method | Features | Dataset | Accuracy (%) | Precision (%) | Recall (%) | F1-Score (%) |
---|---|---|---|---|---|---|---|
[35] | RNN | APIs | Benign: 942; Ransomware: 582; Families: 11 | 93.45 | 94.92 | 93.77 | 93.34 |
[38] | RF and C4.5 | APIs | Benign: 682; Ransomware: 5049; Families: - | 96.00 | 98.04 | 95.16 | 96.35 |
[39] | Incremental Bagging | APIs | Benign: 1000; Ransomware: 8152; Families: 15 | 97.89 | 98.16 | 98.97 | 98.70 |
[36] | ANN | APIs | Benign: 1111; Ransomware: 5203; Families: 13 | 93.00 | 93.00 | 93.00 | 93.00 |
[49] | CNN Transformer | Static and dynamic features | Benign: N/S; Ransomware: N/S; Families: N/S | 99.10 | 99.20 | 98.90 | 97.64 |
RansomFormer Dataset 1 | Transformer | API imports, PE bytes | Benign: 5000; Ransomware: 5000; Families: 161 | 99.25 | 99.30 | 99.20 | 99.25 |
RansomFormer Dataset 2 | Transformer | API calls, PE bytes | Benign: 2000; Ransomware: 2000; Families: 88 | 99.50 | 99.67 | 99.33 | 99.50 |
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Share and Cite
Alzahrani, S.; Xiao, Y.; Asiri, S.; Alasmari, N.; Li, T. RansomFormer: A Cross-Modal Transformer Architecture for Ransomware Detection via the Fusion of Byte and API Features. Electronics 2025, 14, 1245. https://doi.org/10.3390/electronics14071245