From Tweets to Threats: A Survey of Cybersecurity Threat Detection Challenges, AI-Based Solutions and Potential Opportunities in X
Abstract
:1. Introduction
2. Cybersecurity Challenges and Threats in X
2.1. Security Threats in X
2.1.1. Multimedia Content Threats
- Multimedia content exposure
- Shared ownership
- Manipulation of multimedia content
- Steganography
- Shared links to multimedia content
- Metadata
- Outsourcing and transparency of data centers
- Static links
- Tagging-link ability from shared multimedia data
- Unauthorized data disclosure
- Video conference
2.1.2. Traditional Threats
- Spamming
- Malware
- Sybil attack and fake profiles
- Impersonation
- Clickjacking
- Social phishing
- Hijacking
2.1.3. Social Threats
- Corporate espionage
- Cyberbullying and cyber-grooming
- Cyberstalking
3. Motivations of the Cyber Threats on X
- ❖ Financial benefits
- ❖ Entertainment
- ❖ Cyber spying
- ❖ Expertise for the job
- ❖ Cyber warfare
- ❖ Revenge/Feelings
- ❖ Hacktivism
4. Survey Methodology
- Research Questions
- Objectives
5. AI-Based Cyber Threat Solutions in X
5.1. Machine Learning (ML)
- Supervised Learning
- Behavior based
- Content based
- 2.
- Unsupervised Learning
- Behavior based
- Content based
- 3.
- Reinforcement Learning or Semi-Supervised learning.
5.2. Deep Learning (DL)
- Convolutional neural networks (CNNS)
- 2.
- Graph Convolutional Networks (GCNs)
- 3.
- Recurrent neural networks (RNNS) and Long Short-Term Memory Networks (LSTM)
- 4.
- Deep neural networks (DNNs)
5.3. Ensemble and Hybrid Learning
6. X Security: ML/DL Solutions
6.1. Detection of Vulnerabilities and Exploits on X
6.2. Detection of Security Content
7. Analysis of X Cyber Threat Solutions
7.1. The Complexity of the Algorithm
7.2. Degree of Information Summarization
7.3. Scalability and Effectiveness
7.4. Semantic Characteristics
7.5. The Scope of Detecting a Threat
8. Discussion and Potential Opportunities
9. Results and Analysis
10. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Acknowledgments
Conflicts of Interest
References
- Familoni, B.T. Cybersecurity challenges in the age of AI: Theoretical approaches and practical solutions. Comput. Sci. IT Res. J. 2024, 5, 703–724. [Google Scholar] [CrossRef]
- Statista. Worldwide Cybersecurity Spending 2017–2028, Statista. 2024. Available online: https://www.statista.com/statistics/991304/worldwide-cybersecurity-spending/ (accessed on 10 November 2023).
- Aiyer, B.; Caso, J.; Russell, P.; Sorel, M. New survey reveals $2 trillion market opportunity for cybersecurity technology and service providers. Governance 2022, 1, 2. [Google Scholar]
- Kaur, G.; Bonde, U.; Pise, K.L.; Yewale, S.; Agrawal, P.; Shobhane, P.; Maheshwari, S.; Pinjarkar, L.; Gangarde, R. Social Media in the Digital Age: A Comprehensive Review of Impacts, Challenges and Cybercrime. Eng. Proc. 2024, 62, 6. [Google Scholar] [CrossRef]
- Boyd, D.M.; Ellison, N.B. Social network sites: Definition, history, and scholarship. J. Comput.-Mediat. Commun. 2007, 13, 210–230. [Google Scholar]
- Weir, G.R.; Toolan, F.; Smeed, D. The threats of social networking: Old wine in new bottles? Inf. Secur. Tech. Rep. 2011, 16, 38–43. [Google Scholar] [CrossRef]
- Zigomitros, A.; Papageorgiou, A.; Patsakis, C. Social network content management through watermarking. In Proceedings of the 2012 IEEE 11th International Conference on Trust, Security and Privacy in Computing and Communications, Liverpool, UK, 25–27 June 2012; pp. 1381–1386. [Google Scholar]
- Stokes, K.; Carlsson, N. A peer-to-peer agent community for digital oblivion in online social networks. In Proceedings of the 2013 Eleventh Annual Conference on Privacy, Security and Trust, Tarragona, Spain, 10–12 July 2013; pp. 103–110. [Google Scholar]
- Miller, Z.; Dickinson, B.; Deitrick, W.; Hu, W.; Wang, A.H. Twitter spammer detection using data stream clustering. Inf. Sci. 2014, 260, 64–73. [Google Scholar] [CrossRef]
- Joe, M.M.; Ramakrishnan, B. Novel authentication procedures for preventing unauthorized access in social networks. Peer-to-Peer Netw. Appl. 2017, 10, 833–843. [Google Scholar]
- Ghazinour, K.; Matwin, S.; Sokolova, M. YOURPRIVACYPROTECTOR, A recommender system for privacy settings in social networks. arXiv 2016, arXiv:1602.01937. [Google Scholar]
- Tounsi, W.; Rais, H. A survey on technical threat intelligence in the age of sophisticated cyber attacks. Comput. Secur. 2018, 72, 212–233. [Google Scholar] [CrossRef]
- De Souza, G.A.; Da Costa-Abreu, M. Automatic offensive language detection from Twitter data using machine learning and feature selection of metadata. In Proceedings of the 2020 International Joint Conference on Neural Networks (IJCNN), Glasgow, UK, 19–24 July 2020; pp. 1–6. [Google Scholar]
- Fang, Y.; Gao, J.; Liu, Z.; Huang, C. Detecting Cyber Threat Event from Twitter Using IDCNN and BiLSTM. Appl. Sci. 2020, 10, 5922. [Google Scholar] [CrossRef]
- Humayun, M.; Niazi, M.; Jhanjhi, N.; Alshayeb, M.; Mahmood, S. Cyber Security Threats and Vulnerabilities: A Systematic Mapping Study. Arab. J. Sci. Eng. 2020, 45, 3171–3189. [Google Scholar] [CrossRef]
- Dionísio, N.; Alves, F.; Ferreira, P.M.; Bessani, A. Towards end-to-end cyberthreat detection from Twitter using multi-task learning. In Proceedings of the 2020 International Joint Conference on Neural Networks (IJCNN), Glasgow, UK, 19–24 July 2020; pp. 1–8. [Google Scholar]
- Oosthoek, K.; Doerr, C. Cyber Threat Intelligence: A Product Without a Process? Int. J. Intell. CounterIntell. 2020, 34, 300–315. [Google Scholar] [CrossRef]
- Rathore, S.; Sharma, P.K.; Loia, V.; Jeong, Y.-S.; Park, J.H. Social network security: Issues, challenges, threats, and solutions. Inf. Sci. 2017, 421, 43–69. [Google Scholar]
- de Andrade, N.N.G.; Martin, A.; Monteleone, S. “All the better to see you with, my dear”: Facial recognition and privacy in online social networks. IEEE Secur. Priv. 2013, 11, 21–28. [Google Scholar] [CrossRef]
- González-Manzano, L.; González-Tablas, A.I.; de Fuentes, J.M.; Ribagorda, A. Cooped: Co-owned personal data management. Comput. Secur. 2014, 47, 41–65. [Google Scholar] [CrossRef]
- Viejo, A.; Castella-Roca, J.; Rufián, G. Preserving the user’s privacy in social networking sites. In Proceedings of the International Conference on Trust, Privacy and Security in Digital Business, Prague, Czech Republic, 28–29 August 2013; pp. 62–73. [Google Scholar]
- Van Laere, O.; Schockaert, S.; Dhoedt, B. Georeferencing Flickr resources based on textual meta-data. Inf. Sci. 2013, 238, 52–74. [Google Scholar]
- Lee, S.; Kim, J. Warningbird: A near real-time detection system for suspicious urls in twitter stream. IEEE Trans. Dependable Secur. Comput. 2013, 10, 183–195. [Google Scholar] [CrossRef]
- Ahmed, F.; Abulaish, M. A generic statistical approach for spam detection in online social networks. Comput. Commun. 2013, 36, 1120–1129. [Google Scholar] [CrossRef]
- Singh, S.; Jeong, Y.-S.; Park, J.H. A survey on cloud computing security: Issues, threats, and solutions. J. Netw. Comput. Appl. 2016, 75, 200–222. [Google Scholar] [CrossRef]
- Squicciarini, A.C.; Shehab, M.; Wede, J. Privacy policies for shared content in social network sites. VLDB J. 2010, 19, 777–796. [Google Scholar]
- Ramzan, N.; Park, H.; Izquierdo, E. Video streaming over P2P networks: Challenges and opportunities. Signal Process. Image Commun. 2012, 27, 401–411. [Google Scholar] [CrossRef]
- Gurunath, R.; Klaib, M.F.J.; Samanta, D.; Khan, M.Z. Social media and steganography: Use, risks and current status. IEEE Access 2021, 9, 153656–153665. [Google Scholar]
- Alsodi, O.; Zhou, X.; Gururajan, R.; Shrestha, A. A Survey on Detection of cybersecurity threats on Twitter using deep learning. In Proceedings of the 2021 8th International Conference on Behavioral and Social Computing (BESC), Doha, Qatar, 29–31 October 2021; pp. 1–5. [Google Scholar]
- Zhang, Z.; Gupta, B.B. Social media security and trustworthiness: Overview and new direction. Future Gener. Comput. Syst. 2018, 86, 914–925. [Google Scholar]
- Nauman, M.; Azam, N.; Yao, J. A three-way decision making approach to malware analysis using probabilistic rough sets. Inf. Sci. 2016, 374, 193–209. [Google Scholar]
- Faghani, M.R.; Saidi, H. Malware propagation in online social networks. In Proceedings of the 2009 4th International Conference on Malicious and Unwanted Software (MALWARE), Montreal, QC, Canada, 13–14 October 2009; pp. 8–14. [Google Scholar]
- Lanza, C.; Lodi, L. Towards a semi-automatic classifier of malware through tweets for early warning threat detection. JLIS. It 2024, 15, 101–118. [Google Scholar]
- Noh, G.; Oh, H.; Kang, Y.-M.; Kim, C.-K. PSD: Practical Sybil detection schemes using stickiness and persistence in online recommender systems. Inf. Sci. 2014, 281, 66–84. [Google Scholar]
- Faghani, M.R.; Nguyen, U.T. A study of clickjacking worm propagation in online social networks. In Proceedings of the 2014 IEEE 15th International Conference on Information Reuse and Integration (IEEE IRI 2014), Redwood City, CA, USA, 13–15 August 2014; pp. 68–73. [Google Scholar]
- Krombholz, K.; Hobel, H.; Huber, M.; Weippl, E. Advanced social engineering attacks. J. Inf. Secur. Appl. 2015, 22, 113–122. [Google Scholar]
- Shah, A.; Varshney, S.; Mehrotra, M. Threats on online social network platforms: Classification, detection, and prevention techniques. Multimed. Tools Appl. 2024, 1–33. [Google Scholar] [CrossRef]
- Diomidous, M.; Chardalias, K.; Magita, A.; Koutonias, P.; Panagiotopoulou, P.; Mantas, J. Social and psychological effects of the internet use. Acta Inform. Med. 2016, 24, 66. [Google Scholar] [CrossRef]
- El Asam, A.; Samara, M. Cyberbullying and the law: A review of psychological and legal challenges. Comput. Hum. Behav. 2016, 65, 127–141. [Google Scholar] [CrossRef]
- Dreßing, H.; Bailer, J.; Anders, A.; Wagner, H.; Gallas, C. Cyberstalking in a large sample of social network users: Prevalence, characteristics, and impact upon victims. Cyberpsychol. Behav. Soc. Netw. 2014, 17, 61–67. [Google Scholar]
- Munk, T. The Rise of Politically Motivated Cyber Attacks: Actors, Attacks and Cybersecurity; Routledge: London, UK, 2022. [Google Scholar]
- Akoto, W. Who spies on whom? Unravelling the puzzle of state-sponsored cyber economic espionage. J. Peace Res. 2024, 61, 59–71. [Google Scholar]
- Graham, C.M.; Lu, Y. Skills expectations in cybersecurity: Semantic network analysis of job advertisements. J. Comput. Inf. Syst. 2023, 63, 937–949. [Google Scholar] [CrossRef]
- Dawson, M.E., Jr. Cyber Warfare: Threats and Opportunities; Postdoctoral report; Universidade Fernando Pessoa: Porto, Portugal, 2021. [Google Scholar]
- Gadekar, C.; Rakshit, P.P. Study to Perform Opinion Mining on Motivation Factors Generating Cyber Crime by Twitter Analytics. In Proceedings of the International Conference on Innovative Computing & Communications (ICICC), New Delhi, India, 21–23 February 2020. [Google Scholar]
- Romagna, M.; Leukfeldt, R.E. Social Opportunity Structures in Hacktivism: Exploring Online and Offline Social Ties and the Role of Offender Convergence Settings in Hacktivist Networks. Vict. Offenders 2024, 19, 511–533. [Google Scholar] [CrossRef]
- Manakitsa, N.; Maraslidis, G.S.; Moysis, L.; Fragulis, G.F. A review of machine learning and deep learning for object detection, semantic segmentation, and human action recognition in machine and robotic vision. Technologies 2024, 12, 15. [Google Scholar] [CrossRef]
- Lughbi, H.; Mars, M.; Almotairi, K. A Novel NLP-Driven Dashboard for Interactive CyberAttacks Tweet Classification and Visualization. Information 2024, 15, 137. [Google Scholar] [CrossRef]
- Btoush, E.A.L.M.; Zhou, X.; Gururajan, R.; Chan, K.C.; Genrich, R.; Sankaran, P. A systematic review of literature on credit card cyber fraud detection using machine and deep learning. PeerJ Comput. Sci. 2023, 9, e1278. [Google Scholar]
- Omar, S.; Ngadi, A.; Jebur, H.H. Machine learning techniques for anomaly detection: An overview. Int. J. Comput. Appl. 2013, 79, 33–41. [Google Scholar]
- Zhang, C.; Lu, Y. Study on artificial intelligence: The state of the art and future prospects. J. Ind. Inf. Integr. 2021, 23, 100224. [Google Scholar]
- LeCun, Y.; Bengio, Y.; Hinton, G. Deep learning. Nature 2015, 521, 436–444. [Google Scholar]
- Ayodele, T.O. Machine learning overview. New Adv. Mach. Learn. 2010, 2, 16. [Google Scholar]
- Kotsiantis, S.B.; Zaharakis, I.; Pintelas, P. Supervised machine learning: A review of classification techniques. Emerg. Artif. Intell. Appl. Comput. Eng. 2007, 160, 3–24. [Google Scholar]
- Hatcher, W.G.; Yu, W. A survey of deep learning: Platforms, applications and emerging research trends. IEEE Access 2018, 6, 24411–24432. [Google Scholar]
- Ferrara, E.; Varol, O.; Davis, C.; Menczer, F.; Flammini, A. The rise of social bots. Commun. ACM 2016, 59, 96–104. [Google Scholar]
- Varol, O.; Ferrara, E.; Davis, C.; Menczer, F.; Flammini, A. Online human-bot interactions: Detection, estimation, and characterization. In Proceedings of the Eleventh International AAAI Conference on Web and Social Media, Montreal, QC, Canada, 15–18 May 2017; pp. 280–289. [Google Scholar]
- Lee, K.; Eoff, B.; Caverlee, J. Seven months with the devils: A long-term study of content polluters on twitter. In Proceedings of the Fifth International AAAI Conference on Weblogs and Social Media, Barcelona, Spain, 17–21 July 2011; pp. 185–192. [Google Scholar]
- Kantepe, M.; Ganiz, M.C. Preprocessing framework for Twitter bot detection. In Proceedings of the 2017 International Conference on Computer Science and Engineering (UBMK), Antalya, Turkey, 5–8 October 2017; pp. 630–634. [Google Scholar]
- Subrahmanian, V.S.; Azaria, A.; Durst, S.; Kagan, V.; Galstyan, A.; Lerman, K.; Zhu, L.; Ferrara, E.; Flammini, A.; Menczer, F. The DARPA Twitter bot challenge. Computer 2016, 49, 38–46. [Google Scholar] [CrossRef]
- David, I.; Siordia, O.S.; Moctezuma, D. Features combination for the detection of malicious Twitter accounts. In Proceedings of the 2016 IEEE International Autumn Meeting on Power, Electronics and Computing (ROPEC), Ixtapa, Mexico, 9–11 November 2016; pp. 1–6. [Google Scholar]
- Khaled, S.; El-Tazi, N.; Mokhtar, H.M. Detecting fake accounts on social media. In Proceedings of the 2018 IEEE International Conference on Big Data (Big Data), Seattle, WA, USA, 10–13 December 2018; pp. 3672–3681. [Google Scholar]
- Yang, C.; Harkreader, R.; Gu, G. Empirical evaluation and new design for fighting evolving twitter spammers. IEEE Trans. Inf. Forensics Secur. 2013, 8, 1280–1293. [Google Scholar]
- Velayutham, T.; Tiwari, P.K. Bot identification: Helping analysts for right data in twitter. In Proceedings of the 2017 3rd International Conference on Advances in Computing, Communication & Automation (ICACCA) (Fall), Dehradun, India, 15–16 September 2017; pp. 1–5. [Google Scholar]
- Amleshwaram, A.A.; Reddy, N.; Yadav, S.; Gu, G.; Yang, C. Cats: Characterizing automation of twitter spammers. In Proceedings of the 2013 Fifth International Conference on Communication Systems and Networks (COMSNETS), Bangalore, India, 7–10 January 2013; pp. 1–10. [Google Scholar]
- Ji, Y.; He, Y.; Jiang, X.; Cao, J.; Li, Q. Combating the evasion mechanisms of social bots. Comput. Secur. 2016, 58, 230–249. [Google Scholar]
- Teljstedt, C.; Rosell, M.; Johansson, F. A semi-automatic approach for labeling large amounts of automated and non-automated social media user accounts. In Proceedings of the 2015 Second European Network Intelligence Conference, Karlskrona, Sweden, 21–22 September 2015; pp. 155–159. [Google Scholar]
- Gilani, Z.; Kochmar, E.; Crowcroft, J. Classification of twitter accounts into automated agents and human users. In Proceedings of the ASONAM ‘17: Proceedings of the 2017 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining 2017, Sydney, Australia, 31 July–3 August 2017; pp. 489–496. [Google Scholar]
- Daouadi, K.E.; Rebaï, R.Z.; Amous, I. Bot detection on online social networks using deep forest. In Proceedings of the Artificial Intelligence Methods in Intelligent Algorithms: Proceedings of 8th Computer Science Online Conference 2019, Zlin, Czech Republic, 24–27 April 2019; pp. 307–315. [Google Scholar]
- Yang, W.; Dong, G.; Wang, W.; Shen, G.; Gong, L.; Yu, M.; Lv, J.; Hu, Y. Detecting bots in follower markets. In Proceedings of the 9th International Conference, BIC-TA 2014, Wuhan, China, 16–19 October 2014; pp. 525–530. [Google Scholar]
- Chu, Z.; Gianvecchio, S.; Wang, H.; Jajodia, S. Who is tweeting on Twitter: Human, bot, or cyborg? In Proceedings of the ACSAC ‘10: Proceedings of the 26th Annual Computer Security Applications Conference, Austin, TX, USA, 6–10 December 2010; pp. 21–30. [Google Scholar]
- Chu, Z.; Gianvecchio, S.; Wang, H.; Jajodia, S. Detecting automation of twitter accounts: Are you a human, bot, or cyborg? IEEE Trans. Dependable Secur. Comput. 2012, 9, 811–824. [Google Scholar]
- Gurajala, S.; White, J.S.; Hudson, B.; Matthews, J.N. Fake Twitter accounts: Profile characteristics obtained using an activity-based pattern detection approach. In Proceedings of the SMSociety ‘15: Proceedings of the 2015 International Conference on Social Media & Society, Toronto, ON, Canada, 27–29 July 2015; pp. 1–7. [Google Scholar]
- Caruccio, L.; Desiato, D.; Polese, G. Fake account identification in social networks. In Proceedings of the 2018 IEEE International Conference on Big Data (Big Data), Seattle, WA, USA, 10–13 December 2018; pp. 5078–5085. [Google Scholar]
- Valliyammai, C.; Devakunchari, R. Distributed and scalable Sybil identification based on nearest neighbour approximation using big data analysis techniques. Clust. Comput. 2019, 22 (Suppl. S6), 14461–14476. [Google Scholar]
- Cai, C.; Li, L.; Zeng, D. Detecting social bots by jointly modeling deep behavior and content information. In Proceedings of the CIKM ‘17: Proceedings of the 2017 ACM on Conference on Information and Knowledge Management, Singapore, 6–10 November 2017; pp. 1995–1998. [Google Scholar]
- Kudugunta, S.; Ferrara, E. Deep neural networks for bot detection. Inf. Sci. 2018, 467, 312–322. [Google Scholar]
- Wang, W.; Mauleon, R.; Hu, Z.; Chebotarov, D.; Tai, S.; Wu, Z.; Li, M.; Zheng, T.; Fuentes, R.R.; Zhang, F. Genomic variation in 3010 diverse accessions of Asian cultivated rice. Nature 2018, 557, 43–49. [Google Scholar] [CrossRef]
- Ping, H.; Qin, S. A social bots detection model based on deep learning algorithm. In Proceedings of the 2018 IEEE 18th International Conference on Communication Technology (ICCT), Chongqing, China, 8–11 October 2018; pp. 1435–1439. [Google Scholar]
- Morstatter, F.; Wu, L.; Nazer, T.H.; Carley, K.M.; Liu, H. A new approach to bot detection: Striking the balance between precision and recall. In Proceedings of the 2016 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining (ASONAM), San Francisco, CA, USA, 18–21 August 2016; pp. 533–540. [Google Scholar]
- Igawa, R.A.; Barbon, S., Jr.; Paulo, K.C.S.; Kido, G.S.; Guido, R.C.; Júnior, M.L.P.; da Silva, I.N. Account classification in online social networks with LBCA and wavelets. Inf. Sci. 2016, 332, 72–83. [Google Scholar] [CrossRef]
- Jr, S.B.; Campos, G.F.; Tavares, G.M.; Igawa, R.A.; Jr, M.L.P.; Guido, R.C. Detection of human, legitimate bot, and malicious bot in online social networks based on wavelets. ACM Trans. Multimed. Comput. Commun. Appl. (TOMM) 2018, 14, 1–17. [Google Scholar] [CrossRef]
- Bara, I.-A.; Fung, C.J.; Dinh, T. Enhancing Twitter spam accounts discovery using cross-account pattern mining. In Proceedings of the 2015 IFIP/IEEE International Symposium on Integrated Network Management (IM), Ottawa, ON, Canada, 11–15 May 2015; pp. 491–496. [Google Scholar]
- Gupta, A.; Budania, H.; Singh, P.; Singh, P.K. Facebook based choice filtering. In Proceedings of the 2017 IEEE 7th International Advance Computing Conference (IACC), Hyderabad, India, 5–7 January 2017; pp. 875–879. [Google Scholar]
- Main, W.; Shekokhar, N. Twitterati identification system. Procedia Comput. Sci. 2015, 45, 32–41. [Google Scholar] [CrossRef]
- Dickerson, J.P.; Kagan, V.; Subrahmanian, V. Using sentiment to detect bots on twitter: Are humans more opinionated than bots? In Proceedings of the 2014 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining (ASONAM 2014), Beijing, China, 17–20 August 2014; pp. 620–627. [Google Scholar]
- Loyola-González, O.; Monroy, R.; Rodríguez, J.; López-Cuevas, A.; Mata-Sánchez, J.I. Contrast pattern-based classification for bot detection on twitter. IEEE Access 2019, 7, 45800–45817. [Google Scholar] [CrossRef]
- Andriotis, P.; Takasu, A. Emotional bots: Content-based spammer detection on social media. In Proceedings of the 2018 IEEE International Workshop on Information Forensics and Security (WIFS), Hong Kong, China, 11–13 December 2018; pp. 1–8. [Google Scholar]
- Beskow, D.M.; Carley, K.M. Its all in a name: Detecting and labeling bots by their name. Comput. Math. Organ. Theory 2019, 25, 24–35. [Google Scholar] [CrossRef]
- Zhang, C.; Zhang, G.; Sun, S. A mixed unsupervised clustering-based intrusion detection model. In Proceedings of the 2009 Third International Conference on Genetic and Evolutionary Computing, Guilin, China, 14–17 October 2009; pp. 426–428. [Google Scholar]
- Chavoshi, N.; Hamooni, H.; Mueen, A. Debot: Twitter Bot Detection via Warped Correlation. In ICDM; IEEE: Piscataway, NJ, USA, 2016; Volume 18, pp. 28–65. [Google Scholar]
- Cresci, S.; Di Pietro, R.; Petrocchi, M.; Spognardi, A.; Tesconi, M. DNA-inspired online behavioral modeling and its application to spambot detection. IEEE Intell. Syst. 2016, 31, 58–64. [Google Scholar] [CrossRef]
- Cresci, S.; Di Pietro, R.; Petrocchi, M.; Spognardi, A.; Tesconi, M. Social fingerprinting: Detection of spambot groups through DNA-inspired behavioral modeling. IEEE Trans. Dependable Secur. Comput. 2017, 15, 561–576. [Google Scholar] [CrossRef]
- Minnich, A.; Chavoshi, N.; Koutra, D.; Mueen, A. BotWalk: Efficient adaptive exploration of Twitter bot networks. In Proceedings of the ASONAM ‘17: Proceedings of the 2017 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining 2017, Sydney, Australia, 31 July–3 August 2017; pp. 467–474. [Google Scholar]
- Chew, P.A. Searching for unknown unknowns: Unsupervised bot detection to defeat an adaptive adversary. In Proceedings of the 11th International Conference, SBP-BRiMS 2018, Washington, DC, USA, 10–13 July 2018; pp. 357–366. [Google Scholar]
- Chen, Z.; Tanash, R.S.; Stoll, R.; Subramanian, D. Hunting malicious bots on twitter: An unsupervised approach. In Proceedings of the 9th International Conference, SocInfo 2017, Oxford, UK, 13–15 September 2017; pp. 501–510. [Google Scholar]
- Munschauer, M.; Nguyen, C.T.; Sirokman, K.; Hartigan, C.R.; Hogstrom, L.; Engreitz, J.M.; Ulirsch, J.C.; Fulco, C.P.; Subramanian, V.; Chen, J. The NORAD lncRNA assembles a topoisomerase complex critical for genome stability. Nature 2018, 561, 132–136. [Google Scholar] [CrossRef]
- Abu-El-Rub, N.; Mueen, A. Botcamp: Bot-driven interactions in social campaigns. In Proceedings of the WWW ‘19: The World Wide Web Conference, San Francisco, CA, USA, 13–17 May 2019; pp. 2529–2535. [Google Scholar]
- Zhu, X.; Goldberg, A.B. Introduction to semi-supervised learning. Synth. Lect. Artif. Intell. Mach. Learn. 2009, 3, 1–130. [Google Scholar]
- Chapelle, O.; Scholkopf, B.; Zien, A. Semi-supervised learning (Chapelle, O. et al., eds.; 2006) [book reviews]. IEEE Trans. Neural Netw. 2009, 20, 542. [Google Scholar] [CrossRef]
- Shi, P.; Zhang, Z.; Choo, K.-K.R. Detecting malicious social bots based on clickstream sequences. IEEE Access 2019, 7, 28855–28862. [Google Scholar] [CrossRef]
- Dorri, A.; Abadi, M.; Dadfarnia, M. Socialbothunter: Botnet detection in twitter-like social networking services using semi-supervised collective classification. In Proceedings of the 2018 IEEE 16th International Conference on Dependable, Autonomic and Secure Computing, 16th International Conference on Pervasive Intelligence and Computing, 4th International Conference on Big Data Intelligence and Computing and Cyber Science and Technology Congress(DASC/PiCom/DataCom/CyberSciTech), Athens, Greece, 12–15 August 2018; pp. 496–503. [Google Scholar]
- Goodfellow, I.; Bengio, Y.; Courville, A. Deep Learning; MIT Press: Cambridge, MA, USA, 2016. [Google Scholar]
- LeCun, Y.; Bottou, L.; Bengio, Y.; Haffner, P. Gradient-based learning applied to document recognition. Proc. IEEE 1998, 86, 2278–2324. [Google Scholar] [CrossRef]
- Xu, H.; Dong, M.; Zhu, D.; Kotov, A.; Carcone, A.I.; Naar-King, S. Text classification with topic-based word embedding and convolutional neural networks. In Proceedings of the 7th ACM International Conference on Bioinformatics, Computational Biology, and Health Informatics, Seattle, WA, USA, 2–5 October 2016; pp. 88–97. [Google Scholar]
- Olabanjo, O.; Wusu, A.; Aigbokhan, E.; Olabanjo, O.; Afisi, O.; Akinnuwesi, B. A novel graph convolutional networks model for an intelligent network traffic analysis and classification. Int. J. Inf. Technol. 2024, 1–13. [Google Scholar] [CrossRef]
- Asif, M.; Al-Razgan, M.; Ali, Y.A.; Yunrong, L. Graph convolution networks for social media trolls detection use deep feature extraction. J. Cloud Comput. 2024, 13, 33. [Google Scholar]
- Gundubogula, A.S. Enhancing Graph Convolutional Network with Label Propagation and Residual for Malware Detection. Master’s Thesis, Wright State University, Dayton, OH, USA, 2023. [Google Scholar]
- Khan, Z.; Khan, Z.; Lee, B.-G.; Kim, H.K.; Jeon, M. Graph neural networks based framework to analyze social media platforms for malicious user detection. Appl. Soft Comput. 2024, 155, 111416. [Google Scholar]
- Simran, K.; Balakrishna, P.; Vinayakumar, R.; Soman, K. Deep Learning Approach for Enhanced Cyber Threat Indicators in Twitter Stream. In Proceedings of the 7th International Symposium, SSCC 2019, Trivandrum, India, 18–21 December 2019; pp. 135–145. [Google Scholar]
- Schuster, M.; Paliwal, K.K. Bidirectional recurrent neural networks. IEEE Trans. Signal Process. 1997, 45, 2673–2681. [Google Scholar]
- Elman, J.L. Finding structure in time. Cogn. Sci. 1990, 14, 179–211. [Google Scholar] [CrossRef]
- Shin, H.-S.; Kwon, H.-Y.; Ryu, S.-J. A new text classification model based on contrastive word embedding for detecting cybersecurity intelligence in twitter. Electronics 2020, 9, 1527. [Google Scholar] [CrossRef]
- Hochreiter, S.; Schmidhuber, J. Long short-term memory. Neural Comput. 1997, 9, 1735–1780. [Google Scholar] [PubMed]
- Wang, J.-H.; Liu, T.-W.; Luo, X.; Wang, L. An LSTM approach to short text sentiment classification with word embeddings. In Proceedings of the 2018 Conference on Computational Linguistics and Speech Processing, Hsinchu, Taiwan, 4–5 October 2018; pp. 214–223. [Google Scholar]
- Ding, Z.; Xia, R.; Yu, J.; Li, X.; Yang, J. Densely connected bidirectional lstm with applications to sentence classification. In Proceedings of the 7th CCF International Conference, NLPCC 2018, Hohhot, China, 26–30 August 2018; pp. 278–287. [Google Scholar]
- Schmidhuber, J. Deep learning in neural networks: An overview. Neural Netw. 2015, 61, 85–117. [Google Scholar] [PubMed]
- Dionísio, N.; Alves, F.; Ferreira, P.M.; Bessani, A. Cyberthreat detection from twitter using deep neural networks. In Proceedings of the 2019 International Joint Conference on Neural Networks (IJCNN), Budapest, Hungary, 14–19 July 2019; pp. 1–8. [Google Scholar]
- Berahman, K.; Zhou, X.; Li, Y.; Gururajan, R.; Barua, P.; Acharya, R.; Chennakesavan, S.K. New Ensemble Deep Learning Model for Gynaecological Cancer Risk Prediction; Research Square, Australia. 2024. Available online: https://www.researchgate.net/publication/379506676_New_Ensemble_Deep_Learning_Model_for_Gynaecological_Cancer_Risk_Prediction (accessed on 10 November 2023).
- Shukla, H.; Jagtap, N.; Patil, B. Enhanced Twitter bot detection using ensemble machine learning. In Proceedings of the 2021 6th International Conference on Inventive Computation Technologies (ICICT), Coimbatore, India, 20–22 January 2021; pp. 930–936. [Google Scholar]
- Shahnawaz Ahmad, M.; Mehraj Shah, S. Unsupervised ensemble based deep learning approach for attack detection in IoT network. Concurr. Comput. Pract. Exp. 2022, 34, e7338. [Google Scholar]
- Khanday, A.M.U.D.; Rabani, S.T.; Khan, Q.R.; Malik, S.H. Detecting twitter hate speech in COVID-19 era using machine learning and ensemble learning techniques. Int. J. Inf. Manag. Data Insights 2022, 2, 100120. [Google Scholar]
- Ahmad, R.; Alsmadi, I.; Alhamdani, W.; Tawalbeh, L.a. A deep learning ensemble approach to detecting unknown network attacks. J. Inf. Secur. Appl. 2022, 67, 103196. [Google Scholar]
- Muneer, A.; Alwadain, A.; Ragab, M.G.; Alqushaibi, A. Cyberbullying detection on social media using stacking ensemble learning and enhanced BERT. Information 2023, 14, 467. [Google Scholar] [CrossRef]
- Siddiqui, T.; Hina, S.; Asif, R.; Ahmed, S.; Ahmed, M. An ensemble approach for the identification and classification of crime tweets in the English language. Comput. Sci. Inf. Technol. 2023, 4, 149–159. [Google Scholar]
- Arora, R.; Gupta, R.; Yadav, P. Utilizing Ensemble Learning to enhance the detection of Malicious URLs in the Twitter dataset. In Proceedings of the 2024 4th International Conference on Innovative Practices in Technology and Management (ICIPTM), Noida, India, 21–23 February 2024; pp. 1–6. [Google Scholar]
- Krishna, T.V.S.; Krishna, T.S.R.; Kalime, S.; Krishna, C.V.M.; Neelima, S.; PBV, R.R. A novel ensemble approach for Twitter sentiment classification with ML and LSTM algorithms for real-time tweets analysis. Indones. J. Electr. Eng. Comput. Sci. 2024, 34, 1904–1914. [Google Scholar]
- Alqahtani, A.F.; Ilyas, M. A Machine Learning Ensemble Model for the Detection of Cyberbullying. arXiv 2024, arXiv:2402.12538. [Google Scholar]
- Olaitan, O.L.; David, A.O.; Michael, O.A. Deep Learning Approach for Classification of Tweets in Detecting Cyber Truculent. Adv. Res. 2024, 25, 113–122. [Google Scholar]
- Vaiyapuri, T.; Shankar, K.; Rajendran, S.; Kumar, S.; Gaur, V.; Gupta, D.; Alharbi, M. Automated cyberattack detection using optimal ensemble deep learning model. Trans. Emerg. Telecommun. Technol. 2024, 35, e4899. [Google Scholar]
- Ruohonen, J.; Hyrynsalmi, S.; Leppänen, V. A mixed methods probe into the direct disclosure of software vulnerabilities. Comput. Hum. Behav. 2020, 103, 161–173. [Google Scholar]
- Campiolo, R.; Santos, L.A.F.; Batista, D.M.; Gerosa, M.A. Evaluating the utilization of Twitter messages as a source of security alerts. In Proceedings of the 28th Annual ACM Symposium on Applied Computing, Coimbra, Portugal, 18–22 March 2013; pp. 942–943. [Google Scholar]
- Sabottke, C.; Suciu, O.; Dumitraș, T. Vulnerability disclosure in the age of social media: Exploiting twitter for predicting real-world exploits. In Proceedings of the 24th {USENIX} Security Symposium ({USENIX} Security 15), Washington, DC, USA, 12–14 August 2015; pp. 1041–1056. [Google Scholar]
- Trabelsi, S.; Plate, H.; Abida, A.; Aoun, M.M.B.; Zouaoui, A.; Missaoui, C.; Gharbi, S.; Ayari, A. Mining social networks for software vulnerabilities monitoring. In Proceedings of the 2015 7th International Conference on New Technologies, Mobility and Security (NTMS), Paris, France, 27–29 July 2015; pp. 1–7. [Google Scholar]
- Kergl, D.; Roedler, R.; Rodosek, G.D. Detection of zero day exploits using real-time social media streams. In Proceedings of the Advances in Nature and Biologically Inspired Computing: Proceedings of the 7th World Congress on Nature and Biologically Inspired Computing (NaBIC2015), Pietermaritzburg, South Africa, 1–3 December 2015; pp. 405–416. [Google Scholar]
- Queiroz, A.; Keegan, B.; Mtenzi, F. Predicting software vulnerability using security discussion in social media. In Proceedings of the European Conference on Cyber Warfare and Security, Dublin, Ireland, 29–30 June 2017; pp. 628–634. [Google Scholar]
- Behzadan, V.; Aguirre, C.; Bose, A.; Hsu, W. Corpus and deep learning classifier for collection of cyber threat indicators in twitter stream. In Proceedings of the 2018 IEEE International Conference on Big Data (Big Data), Seattle, WA, USA, 10–13 December 2018; pp. 5002–5007. [Google Scholar]
- Arora, T.; Sharma, M.; Khatri, S.K. Detection of cyber crime on social media using random forest algorithm. In Proceedings of the 2019 2nd International Conference on Power Energy, Environment and Intelligent Control (PEEIC), Greater Noida, India, 18–19 October 2019; pp. 47–51. [Google Scholar]
- Le, B.-D.; Wang, G.; Nasim, M.; Babar, M.A. Gathering cyber threat intelligence from Twitter using novelty classification. arXiv 2019, arXiv:19f07.01755. [Google Scholar]
- Mahaini, M.I.; Li, S. Detecting cyber security related Twitter accounts and different sub-groups: A multi-classifier approach. In Proceedings of the 2021 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining, Virtual, 8–11 November 2021; pp. 599–606. [Google Scholar]
- Deshmukh, R.; Shinde, S.; Yadav, B.; Pathak, A.; Shetty, A. Darkintellect: An Approach to Detect Cyber Threat Using Machine Learning Techniques on Open-Source Information. Math. Stat. Eng. Appl. 2022, 71, 1431–1439. [Google Scholar]
- Coyac-Torres, J.E.; Sidorov, G.; Aguirre-Anaya, E.; Hernández-Oregón, G. Cyberattack detection in social network messages based on convolutional neural networks and NLP techniques. Mach. Learn. Knowl. Extr. 2023, 5, 1132–1148. [Google Scholar] [CrossRef]
- Abdelhaq, H.; Sengstock, C.; Gertz, M. Eventweet: Online localized event detection from twitter. Proc. VLDB Endow. 2013, 6, 1326–1329. [Google Scholar]
- Mittal, S.; Das, P.K.; Mulwad, V.; Joshi, A.; Finin, T. Cybertwitter: Using twitter to generate alerts for cybersecurity threats and vulnerabilities. In Proceedings of the 2016 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining (ASONAM), San Francisco, CA, USA, 18–21 August 2016; pp. 860–867. [Google Scholar]
- Horrocks, I.; Patel-Schneider, P.F.; Boley, H.; Tabet, S.; Grosof, B.; Dean, M. SWRL: A semantic web rule language combining OWL and RuleML. W3C Memb. Submiss. 2004, 21, 1–31. [Google Scholar]
- Sapienza, A.; Bessi, A.; Damodaran, S.; Shakarian, P.; Lerman, K.; Ferrara, E. Early warnings of cyber threats in online discussions. In Proceedings of the 2017 IEEE International Conference on Data Mining Workshops (ICDMW), New Orleans, LA, USA, 18–21 November 2017; pp. 667–674. [Google Scholar]
- Le Sceller, Q.; Karbab, E.B.; Debbabi, M.; Iqbal, F. Sonar: Automatic detection of cyber security events over the twitter stream. In Proceedings of the 12th International Conference on Availability, Reliability and Security, Reggio Calabria, Italy, 29 August–1 September 2017; pp. 1–11. [Google Scholar]
- Alves, F.; Bettini, A.; Ferreira, P.M.; Bessani, A. Processing tweets for cybersecurity threat awareness. Inf. Syst. 2021, 95, 101586. [Google Scholar]
- Nazir, F.; Ghazanfar, M.A.; Maqsood, M.; Aadil, F.; Rho, S.; Mehmood, I. Social media signal detection using tweets volume, hashtag, and sentiment analysis. Multimed. Tools Appl. 2019, 78, 3553–3586. [Google Scholar] [CrossRef]
- Rodriguez, A.; Okamura, K. Generating real time cyber situational awareness information through social media data mining. In Proceedings of the 2019 IEEE 43rd Annual Computer Software and Applications Conference (COMPSAC), Milwaukee, WI, USA, 15–19 July 2019; pp. 502–507. [Google Scholar]
- Dabiri, S.; Heaslip, K. Developing a Twitter-based traffic event detection model using deep learning architectures. Expert Syst. Appl. 2019, 118, 425–439. [Google Scholar]
- Sani, A.M.; Moeini, A. Real-time Event Detection in Twitter: A Case Study. In Proceedings of the 2020 6th International Conference on Web Research (ICWR), Tehran, Iran, 22–23 April 2020; pp. 48–51. [Google Scholar]
- Rodriguez, A.; Okamura, K. Enhancing data quality in real-time threat intelligence systems using machine learning. Soc. Netw. Anal. Min. 2020, 10, 1–22. [Google Scholar] [CrossRef]
- Reddy, P.M.; Venkatesh, K.; Bhargav, D.; Sandhya, M. Spam detection and fake user identification methodologies in social networks using extreme machine learning. Int. J. Anal. Exp. Modal Anal. 2021, 13, 2367–2374. [Google Scholar] [CrossRef]
- Kondeti, P.; Yerramreddy, L.P.; Pradhan, A.; Swain, G. Fake account detection using machine learning. In Evolutionary Computing and Mobile Sustainable Networks: Proceedings of ICECMSN 2020, Proceedings of the International Conference on Evolutionary Computing and Mobile Sustainable Networks (ICECMSN 2020), Bangalore, India, 20–21 February 2020; Springer: Singapore, 2021; pp. 791–802. [Google Scholar]
- Bindu, K.; Rishith, B.P.; Sathish, D.; Subhash, V.; Harika, B.; Swathi, N. Detection of fake accounts in Twitter using data science. Int. Res. J. Mod. Eng. Technol. Sci. 2022, 4, 3552–3556. [Google Scholar]
- Rodrigues, A.P.; Fernandes, R.; Shetty, A.; Lakshmanna, K.; Shafi, R.M. Real-time twitter spam detection and sentiment analysis using machine learning and deep learning techniques. Comput. Intell. Neurosci. 2022, 2022, 5211949. [Google Scholar] [CrossRef] [PubMed]
- Shukla, R.; Sinha, A.; Chaudhary, A. TweezBot: An AI-driven online media bot identification algorithm for Twitter social networks. Electronics 2022, 11, 743. [Google Scholar] [CrossRef]
- Mughaid, A.; Obeidat, I.; AlZu’bi, S.; Elsoud, E.A.; Alnajjar, A.; Alsoud, A.R.; Abualigah, L. A novel machine learning and face recognition technique for fake accounts detection system on cyber social networks. Multimed. Tools Appl. 2023, 82, 26353–26378. [Google Scholar] [CrossRef]
- Ritter, A.; Wright, E.; Casey, W.; Mitchell, T. Weakly supervised extraction of computer security events from twitter. In Proceedings of the WWW ‘15: Proceedings of the 24th International Conference on World Wide Web, Florence, Italy, 18–22 May 2015; pp. 896–905. [Google Scholar]
- Rao, P.; Kamhoua, C.; Njilla, L.; Kwiat, K. Methods to Detect Cyberthreats on Twitter: ‘Surveillance in Action’; Springer: Berlin/Heidelberg, Germany, 2018; pp. 333–350. [Google Scholar]
- Chambers, N.; Fry, B.; McMasters, J. Detecting Denial-of-Service Attacks from Social Media Text: Applying NLP to Computer Security; Association for Computational Linguistics: New Orleans, LA, USA, 2018; pp. 1626–1635. [Google Scholar]
- Yılmaz, Y.; Hero, A.O. Multimodal event detection in Twitter hashtag networks. J. Signal Process. Syst. 2018, 90, 185–200. [Google Scholar] [CrossRef]
- Zong, S.; Ritter, A.; Mueller, G.; Wright, E. Analyzing the perceived severity of cybersecurity threats reported on social media. arXiv 2019, arXiv:1902.10680. [Google Scholar]
- Ghankutkar, S.; Sarkar, N.; Gajbhiye, P.; Yadav, S.; Kalbande, D.; Bakereywala, N. Modelling machine learning for analysing crime news. In Proceedings of the 2019 International Conference on Advances in Computing, Communication and Control (ICAC3), Mumbai, India, 20–21 December 2019; pp. 1–5. [Google Scholar]
- Boyd, K.; Eng, K.H.; Page, C.D. Area under the precision-recall curve: Point estimates and confidence intervals. In Proceedings of the M16666achine Learning and Knowledge Discovery in Databases: European Conference, ECML PKDD 2013, Prague, Czech Republic, 23–27 September 2013; Proceedings, Part III 13. pp. 451–466. [Google Scholar]
Authors | Year | Focus Area | Techniques/Models | Results |
---|---|---|---|---|
[120] | 2021 | X bot detection | Weight of Evidence encoding, Extra Trees (feature selection), Random Forest (blending) | 93% AUC; rapid threat detection with static profile data but less effective than behavioral analysis methods |
[121] | 2022 | IoT network attack detection | Unsupervised ensemble learning, Deep Belief Network (DBN) | 97.5% detection accuracy, 2.3% false alarm rate |
[122] | 2022 | Hate speech detection during COVID-19 | Decision Tree, Stochastic Gradient Boosting, TF-IDF, Bag of Words, tweet length | Stochastic Gradient Boosting: 99% precision, 97% recall, 98% F1-score, 98.04% accuracy |
[123] | 2022 | Cyber threat detection | Ensemble of deep learning classifiers | Effective detection of novel cyber threats, adaptable Intrusion Detection System (IDS) |
[124] | 2023 | Cyberbullying detection on X | Ensemble stacking, deep neural networks (DNNs), BERT-M, word2vec, CBOW | 97.4% accuracy on X dataset, 90.97% on combined X andFacebook dataset, F1-score: 0.964, precision: 0.950, recall: 0.92 |
[125] | 2023 | Crime-related tweet classification | Logistic Regression, Support Vector Machine, k-Nearest Neighbors, Decision Tree, Random Forest, TF-IDF | 96.2% accuracy using soft weighted Voting classifier |
[126] | 2024 | Spam URL detection on X | k-Nearest Neighbors, bagging, Random Forest, URL content, user profile, hybrid features | High accuracy (>90%) when using combined feature sets |
[127] | 2024 | Real-time public opinion analysis | Naïve Bayes, Decision Trees, Random Forest, Logistic Regression, RNN, LSTM, GRU, ensemble of ML and DL models | Comparative analysis of ML and DL models; novel ensemble approach combining ML and DL |
[128] | 2024 | Aggressive tweet detection | Stacking ensemble, Decision Trees, Random Forest, Linear SVC, Logistic Regression, k-Nearest Neighbors | 94.00% accuracy in classifying tweets as aggressive or non-aggressive |
[129] | 2024 | Cyberbullying detection in tweets | 1D-CNN, Maximum Entropy, Unigram, Bigram, Trigram, N-gram | 96.1% accuracy, 93.6% precision, 73.7% recall, 83.8% F1-score |
[130] | 2024 | Cyber attack detection in IIoT environments | IRSO algorithm, Deep Belief Network (DBN), BiGRU, Autoencoder (AE), Modified Gray Wolf Optimizer (MGWO) | Superior performance in IIoT cybersecurity, effective feature selection and hyperparameter tuning |
Study | Focus | Methodology | Datasets |
---|---|---|---|
[160] | - Events from X that requires only minimal supervision - DoS attacks, data breaches, and account hijacking | Weakly supervised learning | Tweets containing “DDoS” |
[147] | An automatic, self-learned framework that can detect, geolocate, and categorize cybersecurity events in near-real time over the X stream | First story detection | Streaming tweets |
[161] | Machine learning techniques by considering user behavior, content of tweets, social relationships, etc., to detect different types of cyberthreats | SocialKB | - Tweets containing “URLs” - Streaming tweets |
[137] | Cybersecurity events | Deep learning model; cascaded CNN architecture | Labelled 21,000 tweets collected using Tweepy |
[162] | A novel application of NLP models to detect denial of service attacks using only social media as evidence | Basic neural network | Tweets written on attack day |
[163] | Treat the event detection problem in a multimodal X hashtag network | Expectation-maximization (EM) algorithm | Tweets containing hashtag |
[118] | A novel tool that uses deep neural networks to process cybersecurity information received from X | SVM, MLP, CNN, BiLSTM | Tweets filtered by keywords |
[164] | Analyze the severity of cybersecurity threats based on the language that is used to describe them online | Supervised ML models | Tweets containing “DDoS” and “vulnerability” |
[165] | Cybersecurity-related data | Three supervised ML models; SVM, MNB, and RF | Real-time cyber attack Data from HuffPost News Site |
[138] | Cybersecurity threats relevant data | RF | Filtered tweets collected using X’s streaming API |
[139] | Collection method of Cyber threat tweets | Centroid, One-class SVM, CNN, LSTM | Streaming Tweets |
[16] | A multitask learning approach combining two Natural Language Processing tasks for cyberthreat intelligence | Multitask Learning (MTL) | Streaming Tweets |
[113] | A novel word embedding model, called contrastive word embedding, that enables to maximize the difference between base embedding models | CNN, RNN and LSTM | Curated data, OSINT data, and background knowledge |
[14] | - Detection of cyber threat events on tweets. - Named Entity Recognition (NER) for tweets | Multitask learning NLP, IDCNN, BiLSTM | Streaming Tweets |
[140] | Cybersecurity-related discussions | Four supervised machine learning models; Decision Tree, Random Forests, SVM, and Logistic Regression | Labelled tweets collected using the X Sampling API |
[141] | Cybersecurity threats relevant data | Five ML models: SVM, Random Forest, Decision Tree, XGBoost and AdaBoost | Labelled 21,000 tweets were collected using a python package Tweepy |
[142] | Cybersecurity-related data | Deep learning model; CNN architecture | Social network messages |
Metrics | Description | Equation | Range |
---|---|---|---|
Accuracy (A) | Assess the number of TPs | [0–1] | |
Recall | The ratio of TP to a TP and FN | [0–1] | |
Precision | The ratio of TP to a TP and FP | [0–1] | |
F1-Score | Combines precision and recall | [0–1] | |
AUC | The area between two points bounded by the function and the x-axis | [0–1] |
Prediction Methods | Algorithm |
---|---|
DeepNN [118] | CNN |
SYNAPSE [148] | SVM |
DataFreq [150] | LR |
SONAR [147] | Cosine similarity |
CyberX [144] | SWRL |
Text mining [146] | Filtering |
Prediction Methods | Summarization |
---|---|
Text mining [146] | Summarized alert |
SYNAPSE [148] | Clustering, exemplar |
DataFreq [150] | Sentiment score for each company |
DeepNN [118] | Classification, NER |
CyberX [144] | Detailed alert |
SONAR [147] | Clustering |
Prediction Methods | Scalability |
---|---|
SONAR [147] | Updated keywords |
DataFreq [150] | Updated keywords |
SYNAPSE [148] | Fixed keywords and accounts |
DeepNN [118] | Fixed keywords and accounts |
Text mining [146] | Fixed accounts and dictionaries |
CyberX [144] | Fixed user profile |
Prediction Methods | Recall (TPR)/Precision |
---|---|
DeepNN [118] | Recall 94% |
SYNAPSE [148] | Recall 90% |
DataFreq [150] | Recall 84% |
CyberX [144] | Precision 86% |
Text mining [146] | Precision 84% |
SONAR [147] | Precision 23% |
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content. |
© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Share and Cite
Alsodi, O.; Zhou, X.; Gururajan, R.; Shrestha, A.; Btoush, E. From Tweets to Threats: A Survey of Cybersecurity Threat Detection Challenges, AI-Based Solutions and Potential Opportunities in X. Appl. Sci. 2025, 15, 3898. https://doi.org/10.3390/app15073898
Alsodi O, Zhou X, Gururajan R, Shrestha A, Btoush E. From Tweets to Threats: A Survey of Cybersecurity Threat Detection Challenges, AI-Based Solutions and Potential Opportunities in X. Applied Sciences. 2025; 15(7):3898. https://doi.org/10.3390/app15073898
Chicago/Turabian StyleAlsodi, Omar, Xujuan Zhou, Raj Gururajan, Anup Shrestha, and Eyad Btoush. 2025. "From Tweets to Threats: A Survey of Cybersecurity Threat Detection Challenges, AI-Based Solutions and Potential Opportunities in X" Applied Sciences 15, no. 7: 3898. https://doi.org/10.3390/app15073898
APA StyleAlsodi, O., Zhou, X., Gururajan, R., Shrestha, A., & Btoush, E. (2025). From Tweets to Threats: A Survey of Cybersecurity Threat Detection Challenges, AI-Based Solutions and Potential Opportunities in X. Applied Sciences, 15(7), 3898. https://doi.org/10.3390/app15073898