Artificial Intelligence in Cybersecurity: A Review and a Case Study

Okdem, Selcuk; Okdem, Sema

doi:10.3390/app142210487

Open AccessArticle

Artificial Intelligence in Cybersecurity: A Review and a Case Study

by

Selcuk Okdem

^1,*

and

Sema Okdem

²

¹

Computer Engineering Department, Engineering Faculty, Erciyes University, Kayseri 38030, Turkey

²

Kayseri Vocational and Technical Anatolian High School, Kayseri 38020, Turkey

^*

Author to whom correspondence should be addressed.

Appl. Sci. 2024, 14(22), 10487; https://doi.org/10.3390/app142210487

Submission received: 27 July 2024 / Revised: 9 October 2024 / Accepted: 18 October 2024 / Published: 14 November 2024

(This article belongs to the Special Issue Artificial Intelligence for Cybersecurity: Latest Advances and Prospects)

Download

Browse Figures

Versions Notes

Abstract

:

The evolving landscape of cyber threats necessitates continuous advancements in defensive strategies. This paper explores the potential of artificial intelligence (AI) as an emerging tool to enhance cybersecurity. While AI holds widespread applications across information technology, its integration within cybersecurity remains a recent development. We offer a comprehensive review of current AI applications in this domain, focusing particularly on their preventative capabilities against prevalent threats like phishing, social engineering, ransomware, and malware. To illustrate these concepts, the paper presents a case study showcasing a specific AI application in a cybersecurity context. This case study addresses a critical gap in securing communication within resource-constrained Internet of Things (IoT) networks using the IEEE 802.15.4 standard. We discussed the advantages and limitations of employing PN sequence encryption for this purpose.

Keywords:

cybersecurity; IT security; machine learning; artificial intelligence; genetic algorithm; IEEE 802.15.4; encryption; PN sequences

1. Introduction

Artificial intelligence (AI) is an emerging tool offering promising solutions to the complex problems of cybersecurity. This manuscript delves into their applications in this critical field. We explore the current state-of-the-art advancements in these fields, focusing on four key subcategories: phishing, social engineering, ransomware, and malware. For each category, we present a focused analysis exploring the specific techniques and methodologies that leverage AI to address these prevalent threats. This work employs a comparative approach utilizing descriptive analysis and in-depth discussions to illuminate the strengths, weaknesses, and opportunities for further research within AI-powered cybersecurity solutions.

The past decade (2013–2023) has been marked by a surge in complex and financially damaging cybersecurity threats. These major incidents, often exceeding a million dollars in financial impact, pose a critical challenge to both global security and economic stability. Consequently, understanding the evolution and patterns of these attacks over the past decade is crucial for researchers in the field. In their study, the authors of [1] identified a significant escalation in the frequency and complexity of cyberattacks. DDoS incidents surged in 2022, while malware attacks steadily increased, culminating in a peak in 2023. This trend underscores the growing sophistication of threat actors and the vulnerability of digital infrastructures. Furthermore, the combined impact of other attack methods, including phishing and zero-day exploits, surpassed that of DDoS and malware, revealing the diverse nature of cyber threats [1].

Secure communication is an essential pillar of cybersecurity in the IT domain. However, wireless communication presents a unique challenge due to its inherent vulnerability. Unlike wired connections, wireless data travel through the airwaves, essentially broadcasting information. Because the data are broadcast, they can be intercepted. Attackers with malicious intent can intercept this broadcast communication. We have addressed them by providing a concise review of the latest applications of AI in cybersecurity, focusing on prevalent threats like phishing, social engineering, ransomware, and malware. To bridge the gap between theory and practice, we showcase a specific case study. This case study explores how a genetic algorithm (GA), a subfield of AI, secures communication within IEEE 802.15.4 networks. We chose this specific example because it highlights a critical gap in securing low-power, resource-constrained networks like those used in the ever-growing IoT and wireless sensor networks (WSNs).

Encryption, which scrambles data using cryptographic algorithms to ensure only authorized users can access it, plays a critical role in securing communication on wireless networks. This protects sensitive information. It does this while the information is being sent. This works even on open wireless networks.

The IEEE 802.15.4 protocol is a popular choice for wireless communication in industrial and home appliance applications due to its suitability for low-power, low-data-rate sensor networks and Internet of Things (IoT) devices. IEEE 802.15.4 offers good performance in noisy environments. However, it is more susceptible to cyber attacks. This is because it uses less complex encryption algorithms compared to protocols like IEEE 802.11. To exemplify the potential of AI in wireless communication, we present a case study involving the development of an anonymous encryption methodology for IEEE 802.15.4 networks. Our case study includes GA application. GAs are a type of AI. We use GAs for anonymous communication. At the same time, we analyze how network performance can be kept optimal. Such a proposal departs from traditional hash functions and register-based encryption mechanisms. It is based on deriving a genetically derived pseudo-random noise (PN) sequence. A comparison of the IEEE 802.15.4 PN sequence and our proposal is presented in throughput analyses.

A recent study [2] proposes using multiple PN codes (S3, S5, S7, and S9) with different spreading factors. These spreading factors are found using GA. The goal is to improve throughput in channels with different chip error rates. The primary distinction between the work presented in [2] and this study lies in the utilization of spreading factors and PN codes, as well as the operational methodology for ensuring secure communication. While ref. [2] prioritizes achieving higher throughput under varying channel conditions, our research focuses on enhancing the level of secure communication while maintaining throughput levels comparable to generic IEEE 802.15.4. The authors of [2] proposed publicly available codes (S3, S5, S7, and S9) that operate at different spreading factors than the standard IEEE 802.15.4 PN codes. Their proposed mechanism switches between spreading factors based on changing channel conditions, adapting to varying chip error rates. In contrast, our approach maintains a fixed spreading factor throughout the communication process, resembling the standard IEEE 802.15.4 PN operation. Furthermore, the PN codes employed in [2] are pre-generated offline and used for subsequent operations. Our proposal, however, allows users to generate codes dynamically, even during runtime, in a public-blind manner, thereby bolstering the overall security of the system.

The key findings of our proposals are as follows:

Effective PN Sequence Discovery: We implemented a GA as a method for generating sequences. We used this method to create a new PN sequence. This PN sequence is for IEEE 802.15.4 networks. Our new sequence offers a viable alternative to the standard’s default set, enhancing communication security.

Preserved Noise Characteristics: The newly discovered PN sequence exhibits pseudo-random noise characteristics comparable to the default sequences defined in the IEEE 802.15.4 standard, ensuring compatibility and functionality within the network.

Maintained Throughput: The GA-derived sequence is good at handling data. It works just as well as the default PN sequence set. This means that adding security features does not slow down how fast data travels.

Re-discoverable Sequence for Anonymity: The GA scheme allows for the re-discovery of the similar default PN sequence using the same GA parameters. This re-discoverability feature provides anonymity in the sequence set, further bolstering communication security.

Enhanced Security through Lack of Production Mechanism: Unlike traditional methods that rely on shift registers or hash functions for sequence generation, the GA-derived sequence lacks a predefined production mechanism. Not knowing how messages are created on-channel plays an important role. This makes it harder to understand what is being communicated. It also adds another layer of anonymity for the people using this channel.

Hardware-Level Security: The GA-discovered sequence can be embedded directly into the wireless transceiver chip. Because it is performed on hardware, not software, we can make security stronger and harder to break.

Our research utilizes a genetic algorithm (GA) to generate a secure pseudo-random noise (PN) sequence for IEEE 802.15.4 networks. This approach achieves performance comparable to the existing standard while offering superior security due to the anonymity and unknown nature of the generation mechanism. Additionally, hardware-level implementation holds promise for a more dependable and tamper-proof security solution.

Our paper examines the intersection of AI in cybersecurity. Following this section, we delve into a focused examination of current AI applications used for cybersecurity purposes. This review will focus on four prevalent cyber threats: phishing, social engineering, ransomware, and malware. Subsequently, we propose a novel method for information encryption that leverages the power of a GA. We will meticulously dissect the proposed GA model, unveiling its intricate details and functionalities. Furthermore, a rigorous evaluation of the model’s performance will be presented. Finally, the paper concludes by summarizing our key findings in cybersecurity.

The methodology of our work explores the potential AI applications to enhance cybersecurity through a two-pronged approach: a literature review and a case study.

To contextualize our work within the existing body of knowledge, a concise review of recent research is presented in Section 2 and Section 5. This section adheres to the established data collection and inclusion criteria such as:

Data Collection: To explore the current state of AI use in cybersecurity in a concise way, we performed a focused review of recent academic publications. This involved searching reputable academic databases and publications for relevant research focusing on AI for threat prevention.

Inclusion Criteria: To capture the rapidly evolving field of AI technologies, our selection process prioritized the most recent publications. We specifically focused on studies that addressed preventative measures against common threats like phishing, social engineering, ransomware, and malware.

To illustrate the practical applications of the AI concepts, Section 6 presents a case study. This case study delves into the use of GA as a sub-class of AI to secure communication within IEEE 802.15.4 networks. By examining this specific example, we aim to showcase both the real-world benefits of AI in cybersecurity and any potential limitations associated with these technologies. The following considerations guided the development of our case study:

Addressing a Gap: We identified a critical security vulnerability in communication protocols used by resource-constrained IoT and WSN networks adhering to the IEEE 802.15.4 standard. This standard is not compatible with traditional encryption methods used in Wi-Fi (WEP, TKIP, CCMP) due to their high processing power and energy consumption requirements.

Proposed Solution: To address this gap, we propose a novel encryption methodology specifically designed for IEEE 802.15.4 communication. This methodology leverages a GA, which is a popular sub-class of AI algorithms, to generate a unique symbol-to-chip sequence table, essentially acting as a cipher for data transmission. Unlike the standard protocol’s publicly available table, our solution ensures anonymity and renders data unreadable for passive eavesdroppers on the communication channel.

Evaluation: To assess the effectiveness of our proposed encryption method under realistic conditions, we employed an experimental platform. This platform simulated varying chip error rates (

ρ

) ranging from 0% to 3%, representing low-noise environments like an office space. We maintained the default IEEE 802.15.4 protocol parameters and eliminated potential collisions to isolate the impact of error-causing factors on communication throughput.

The following sections explore the potential of AI in combating cybersecurity threats. Section 2 reviews recent studies on applying AI to identify and prevent phishing attacks. Section 3 focuses on their role in mitigating social engineering tactics. Section 4 and Section 5 delve into the use of AI against ransomware and malware, respectively. Section 6 presents a case study to showcase a specific application of AI in cybersecurity. Finally, Section 7 concludes the paper by summarizing the key findings and highlighting future directions.

2. Phishing Attacks

Phishing attacks are the most common cybercrime recently. They use fake emails that appear to be from someone they trust. These emails try to steal valuable information. The recipient is tricked into giving away personal information. This information can include sensitive data like passwords and credit card details. The attackers act like fishermen, using a tempting facade to lure unsuspecting victims into their trap. Stolen information can fuel financial crimes or malicious acts, but user awareness and strong online security are our best defenses against these evolving threats.

Phishing attacks are not limited to email. Phishers use a scattered approach, employing misleading messages across various communication channels. This includes instant messaging, online forums, and social media. These messages often contain a deceptive link leading to a fake website designed to steal your information. This widespread method significantly increases the chances of a user clicking and unknowingly logging in with their username and password on the fake website. This is how phishing attacks steal login credentials. The malicious intent behind phishing is often cleverly disguised. Therefore, caution online is crucial. By acquiring stolen login credentials, phishers gain the potential to launch a variety of cybercrimes, all stemming from a single unsuspecting click. Machine learning (ML) and deep learning (DL) can be powerful tools in identifying patterns that reveal malicious intent in these attacks. These techniques analyze vast amounts of data to diagnose phishing attempts in real time. Similar to automated intelligence, ML and DL act as powerful decision-making tools within management information systems.

ML and DM act as powerful tools for cybersecurity. They analyze vast amounts of data to uncover hidden attacking patterns. So, they can provide better planning and mitigation strategies. By employing these strategies, we can empower organizations with a robust toolkit for data analysis, encompassing threat identification through email content examination, historical malicious activity recognition, and threat classification for enhanced investigation. Phishing detection can be addressed through a technique called classification. Like a detective, this method sorts websites into categories. These categories include legitimate, suspicious, and phishy. By classifying websites, it improves cybersecurity decisions [3,4]. In the study [4], the authors analyzed various website characteristics to predict their type. They built a training dataset by pairing these characteristics with known website classifications. The objective is to create a classifier. The proposal suggests a new system. This system works like an automated detective. It examines websites at first. It finds hidden patterns in data used for training. Based on these patterns, it can identify the type of website. A website classifier’s effectiveness hinges on the strength of its feature-to-classification linkages, with accuracy measured by the alignment between predicted and real-world website types.

Effective research is being conducted on phishing detection. A key study by Kapan et al. explores how selecting appropriate features can enhance ML-based phishing detection [5]. The authors investigate the impact of classifier type on accuracy and employ various methods to optimize detection performance. They analyze the results of using different features and observe the influence of diverse phishing attacks. To achieve this, they created a new dataset and tested various website classification methods. They experimented with different feature sets for each classification method, evaluating each method’s effectiveness based on accuracy, true positive rate (catching phishing attempts), false positive rate (flagging safe sites), and processing speed. Their findings suggest that features based on URLs and HTTP protocols yield superior performance, indicating that focusing on these specific aspects can improve phishing detection accuracy. Notably, they achieved a remarkable F1-score of 0.99 while maintaining fast execution speed. To strengthen their conclusions, the authors validated their models on established benchmark datasets. This validation confirmed the effectiveness of decision trees and support vector machines for phishing detection, solidifying the reliability of these algorithms. The study underscores the importance of strategic feature and classifier selection to improve phishing attack detection capabilities. While this research offers valuable insights, a more comprehensive analysis could incorporate additional features, classifiers, and cost considerations, particularly regarding the impact of feature collection speed.

Another study was conducted by Abdul et al. [6] on combating phishing attacks. They propose an ML-based system. The research leverages a publicly available dataset. This dataset contains attributes describing the URLs of phishing and legitimate websites. Various machine learning algorithms, including decision tree, random forest, and a novel hybrid model combining logistic regression, support vector machine, and decision tree (LR+SVC+DT), were implemented after data pre-processing. To improve model performance, the authors employed feature selection techniques. They optimized the parameter values using cross-validation. To assess model effectiveness, the authors employed evaluation metrics of accuracy, precision, recall, F1-score, and specificity. By demonstrating the high efficiency and accuracy of their LR+SVC+DT hybrid model in detecting phishing URLs, this research underscores the valuable role ML can play in combating phishing attacks. While current systems perform well, future phishing detection can be even stronger by combining the strengths of list-based and ML approaches.

The authors in [7] extend the investigation into the efficacy of ML. By employing a comparative analysis framework, the performance of four prominent ML models is evaluated: artificial neural networks (ANNs), support vector machines (SVMs), decision trees (DTs), and random forests (RFs). The findings corroborate the superiority of the random forest model. It solidifies the use of ML as the cornerstone of phishing detection. Notably, RFs emerge as the most effective model in their study. To potentially achieve even more robust results and push the boundaries of performance, future research endeavors should explore the application of additional ML algorithms.

Phishing attacks pose a persistent threat in cybersecurity. The short lifespan of phishing campaigns can make it difficult to identify attackers. However, effective mitigation strategies can still be implemented, such as:

Enforcement Collaboration: Improved information sharing and cooperation are crucial to combating phishing attacks. Stronger digital collaboration is useful, potentially deterring future attacks. So, it can be possible to take down threats more quickly.

User Education: Though complete elimination of phishing remains elusive, user education in recognizing visual cues like suspicious URLs and website inconsistencies can significantly reduce vulnerability, especially for novice users.

The Need for Continuous Training: Several studies have shown that many novice internet users often fail to pay attention. This inattentiveness can make them more susceptible to phishing attacks. This necessitates ongoing and repetitive training initiatives to keep users informed about evolving phishing tactics and deception methods employed by attackers.

Online Phishing Communities: Serving as valuable resources for users, online phishing awareness communities frequently compile data on phishing attempts, including blacklisted URLs. These are a helpful tool, but for robust protection, users should also be aware of wider web security indicators.

Phishing attacks pose a significant challenge. A multifaceted approach can effectively combat them. This approach should encompass various strategies. Law enforcement collaboration, user education with a focus on visual cues, and ongoing training programs can significantly reduce the susceptibility of users to these attacks. Online communities provide a wealth of valuable resources. However, users still need to develop a broader understanding of web security best practices. This knowledge will empower them to effectively identify and avoid phishing attempts [3].

The constant evolution of phishing makes it difficult to eliminate the damage of attacks. However, information sharing and collaboration are key to disrupting the attacks. While complete eradication is unlikely, user education on suspicious URLs and website inconsistencies significantly reduces vulnerability, especially for new users. Studies show inattentiveness makes them susceptible, highlighting the need for ongoing training on evolving tactics. Online phishing communities offer resources like blacklisted URLs, but a broader understanding of web security best practices is crucial for robust protection. Despite the challenge, a multi-pronged approach with law enforcement, user education, training, and online communities can greatly reduce user susceptibility.

3. Cybersecurity in Social Engineering

Recent advances in social media automate tasks and increase convenience, but they also raise security concerns. Identity theft, financial fraud, and unauthorized access are some of the most significant threats. Using reliable and secure software is crucial to staying safe. Research in cybersecurity helps us understand these risks and develop ways to protect ourselves. The digital age expands our online presence as we share more and more of our lives online. Social engineering attacks, which exploit human trust rather than technical vulnerabilities, are becoming increasingly common. In these attacks, malicious actors manipulate people to gain access to sensitive data. Even though cybersecurity advancements can minimize the impact of such attacks, research suggests that the human element remains a critical factor in online safety [8].

Social engineering attacks, which exploit psychological manipulation to achieve malicious goals, are on the rise due to the widespread availability of technology and the proliferation of online communication. However, research on social engineering within the cybersecurity domain remains limited. This limitation could be attributed to the absence of unified criteria for evaluating these attacks or the scarcity of effective mitigation strategies. In a recent study by [9], the authors address this critical gap by proposing a novel topic modeling-based process for cyber-attack modeling. This process was successfully applied to model grooming and bullying attacks, where the attackers demonstrably used psychological manipulation techniques. The model achieved a high degree of accuracy in detecting the attackers’ communicative intent. Additionally, a functional parental control prototype was developed to showcase the model’s practical application. While real-time detection and mitigation mechanisms for these attacks are still under development, studying social engineering from a cybersecurity perspective allows us to bridge the gap between traditional security measures and future cybersecurity projects. This standardization of knowledge and processes can pave the way for the development of more robust and comprehensive solutions against these ever-evolving online threats. The effectiveness of the modeling process underscores its potential for future use against a wider range of unforeseen social engineering attacks.

The study in [10] examined the seriousness of current data protection in cybersecurity. The joint study revealed a significant number of potential victims: 788,000 susceptible to keyloggers, over 12 million vulnerable to phishing kits, and 2 billion compromised credentials exposed through social engineering. This research reinforces the value of equipping employees with the knowledge and skills to protect an organization’s critical information [11]. A study by Pethers et al. explored how social engineering tactics and design elements in phishing emails can make people more vulnerable to cyber sextortion attacks. Researchers employed a quantitative approach, using a survey to gauge people’s susceptibility to cyber sextortion emails. Their findings suggest that security measures should consider how emails are crafted to reduce the risk of sextortion attacks [12].

Another study on social media security was conducted by Khan et al. [13]. They examined the impact of cybersecurity awareness on social media platforms. They recognized that sharing personal information offers both social advantages and privacy risks. People weigh these factors and perform a cost-benefit analysis before disclosing information. They conducted a face-to-face survey with 284 participants. They examined the role of factors such as age, gender, and frequency of internet access, as well as protective online behaviors, in predicting self-disclosure. They used hierarchical regression analysis and machine learning algorithms. According to their results, cyber protection behavior significantly influences self-disclosure. They measured success as achieving a balanced classification score of 70% (F1 measure). They suggest in their study that educating users through cybersecurity training programs can enable them to make informed decisions about self-disclosure online, reducing potential risks. Because they used a hybrid approach that blended traditional statistical analysis with machine learning, they were able to explore the complex connection between cybersecurity awareness and self-disclosure behavior.

In the study in [14], the authors explore a multi-layered security model that mitigates evolving social engineering attacks by addressing both technological weaknesses and human factors through employee education and awareness training. They suggest using two tools to fight social engineering attacks. The first tool is called behavioral analytics. Behavioral analytics tracks how people normally use computer systems. The second tool uses AI and detects unusual activity in real time, allowing social engineering attacks to be stopped.

In paper [15], the authors propose a novel method using a recurrent neural network long short-term memory (RNN-LSTM) to identify well-disguised threats in social media posts. Then, they observed the produced flags for potential threats by RNN-LSTM. The researchers created a custom dataset. To populate it, they collected data from hundreds of Facebook posts. These posts came from both corporate and personal accounts. The Social Engineering Attack Detection pipeline (SEAD) utilizes domain heuristics to filter malicious posts, then tokenizes and analyzes sentiment before labeling them as anomalies or training data. The model is trained to identify five attack types. The types chosen are those which are common. Their experimental results showed that the semantics and linguistics similarities are an effective indicator for early detection of SEA.

Current cybersecurity research often lacks comprehensive solutions for social engineering attacks. Effective research should incorporate diverse perspectives on the attack methods. However, gaining a complete understanding of the issue remains difficult. Attackers continuously adapt their tactics, necessitating defenses that anticipate and counter these changing threats. This ongoing threat emphasizes the need for perpetual research and development in cybersecurity.

4. Ransomware: A Growing Threat in the Cybersecurity Landscape

Ransomware is a type of malicious software that encrypts user files, rendering them inaccessible. Its purpose is to extort a payment from the victim in exchange for unlocking the files. A major ransomware attack called WannaCry struck the world in 2017. This event significantly heightened public awareness of cybersecurity threats. In recent years, the proliferation of ransomware has become a major concern, inflicting substantial financial losses, reputational damage, and operational disruptions on individuals and organizations alike. In Table 1, well-known ransomware is listed [16].

Ransomware emerged in 1989 and has rapidly evolved into a sophisticated and widespread threat. Its encryption techniques have become increasingly complex, its ability to spread and evade detection has grown, and its capacity to extort victims has intensified. The global damage caused by ransomware attacks is on a trajectory to surpass hundreds of billions of US dollars in the coming years, with new attacks occurring within seconds. The cumulative worldwide damage from ransomware incidents has been steadily rising over time [17,18].

While human analysts struggle to keep pace with the ever-growing volume of data, AI excels in this domain. Its ability to analyze massive datasets makes them highly effective for ransomware detection. In this context, AI algorithms are trained on a colossal collection of both benign and malicious software. By analyzing the behavior of these programs, the algorithms learn to identify the characteristic traits that distinguish ransomware from legitimate applications. This acquired knowledge empowers them to detect even novel ransomware variants, even those never encountered before [16,19].

The authors of paper [18] provide an exploration of ransomware, delving into its history, classification (taxonomy), and the research efforts aimed at mitigating the threat. They trace ransomware’s origins and major trends that have shaped its evolution. They propose a taxonomy to categorize different ransomware types based on their unique characteristics and behaviors. The study goes on to identify shortcomings in current research, particularly regarding real-time protection and zero-day ransomware identification. While this study has contributed to the field, significant challenges persist, necessitating continued research efforts in ransomware mitigation and prevention. While traditional supervised learning methods are widely used for malware detection, their limitations hinder their effectiveness. Their main limitations are difficulties in achieving high accuracy and difficulties in handling complex malware strains. Because of these limitations, it is necessary to explore alternative approaches for more effective detection.At this point, DL techniques can be useful. Its detection accuracy and reliable outputs can offer a good solution. Algorithms using these techniques offer several advantages. These advantages include using automatic feature generation, as well as eliminating the need for manual feature engineering. They can learn from datasets given to them, and this process can be automated to minimize human interaction. This minimization ultimately enables rapid real-time detection. However, there are some challenges when using DL approaches. The main challenge is the need of large amounts of data. This data is used for training these algorithms. These algorithms are not suitable if limited datasets are used for a malware application. Another problem with these algorithms occurs when they are used on systems with low processing power. This can be a problem for resource constraint systems. Finally, adapting these techniques to real-world datasets can be problematic, as real-world data often deviate from the training data used in the development process. Despite these challenges, DL offers a powerful tool in the fight against ransomware. By acknowledging its limitations and adapting applications to address these issues, researchers can leverage the strengths of DL to bolster ransomware detection capabilities [16,20,21].

In response to the ever-sophisticated social engineering tactics employed by cybercriminals and the limitations of existing tools in detecting novel ransomware variants, a recent study by the authors of [22] proposed a novel framework called “RTrap”. This framework utilizes machine learning to generate decoy files strategically placed throughout a system. By acting as bait, these deceptive files lure ransomware into targeting them, triggering a lightweight monitoring system that continuously tracks file activity. Evaluations conducted by the study’s authors demonstrate RTrap’s effectiveness in ransomware detection, achieving a high success rate with a minimal average loss of only 18 legitimate user files per 10,311 files. Building upon the work presented in [23], the authors propose RansomAI, a novel framework that leverages reinforcement learning (RL) to endow existing ransomware with the capability to dynamically adapt its encryption behavior.

This dynamic adaptation gives ransomware a powerful ability to evade detection by security solutions. RansomAI integrates an agent that works by learning the optimal combination of encryption algorithms, rates, and durations. In such use, the system balances maximizing data encryption with minimizing detection by a sophisticated defense mechanism that employs device fingerprinting. To validate RansomAIs effectiveness, the authors deployed it within Ransomware-PoC, infecting a Raspberry Pi configured as a sensor. Experiments using Deep Q-Learning for representation and Isolation Forest for detection showed that their system performed the detection process quickly, within minutes, with an accuracy exceeding 90%. Nonetheless, further evaluation is planned to assess RansomAIs generalizability across diverse devices and its efficacy with various malware samples.

Effective malware protection requires considering a comprehensive set of parameters. A robust defense strategy integrates diverse methods that address these parameters simultaneously. Focusing on a single parameter creates vulnerabilities that malware can exploit. This is particularly critical when dealing with the varied threats posed by ransomware.

ML algorithms offer a powerful approach to detecting ransomware patterns due to their ability to handle diverse data points. However, effective implementation requires a layered development process. Each layer’s effectiveness must be rigorously evaluated and deficiencies addressed. Early detection and understanding of malware patterns during development can be highly advantageous.

User awareness plays a vital role in ransomware defense. Training programs that educate users on ransomware fundamentals and current defense systems can significantly improve organizational preparedness. This empowers users to contribute to the overall security posture and reduce the risk of successful attacks.

5. Defending Against Malware

Malicious software, also known as malware, poses a significant threat to the IT industry. The recent surge in malware attacks has become a major challenge. Malware can infiltrate computer systems without authorization, leading to a variety of harmful consequences. These consequences often include data theft and system corruption. The increasing popularity of mobile devices, particularly those running Android, necessitates the development of robust security solutions. However, this widespread adoption also creates a larger target for mobile malware infections.

To address this critical issue, Vanjire et al. [24] propose an ML-based approach for anomaly detection on Android devices. Their system utilizes the power of three machine learning algorithms, including K-nearest neighbors (KNN), naive Bayes, and decision tree. They analyzed mobile application behavior and identified potential malware vulnerabilities. As demonstrated in this study, ML methods offer a powerful approach to combating the growing malware threat. These methods provide a means to analyze and classify large amounts of data. They allow malware to be identified. Their successes hold true even when using obfuscation techniques to evade traditional signature-based detection. One of ML approaches is proposed by Kumar et al. [25]. Their approach is based on a classification technique for classifying Windows PE files. This technique is trained on a substantial dataset of roughly 100,000 Brazilian malware samples. Each sample is characterized by 57 features. The authors explore various machine learning models, achieving the highest accuracy of 99.7% with a random forest model. This result shows the effectiveness of the random forest model in differentiating between benign and malicious files, suggesting its potential as a valuable tool for system security.

Polymorphic malware is a new and highly adaptable form of malicious software. It poses a significant challenge to traditional signature-based detection methods. This type of malware constantly modifies its code to evade identification. This renders signature-based approaches ineffective. To address this growing threat, Akhtar et al. [26] propose an ML approach for such types of malware attacks. Their approach utilizes various algorithms. These algorithms include naive Bayes, support vector machines (SVMs), J48, random forest (RF), and their own proposed method. They employed them on a large dataset. They chose the model with the highest accuracy and lowest error rate based on their analysis of detection rate and false positive/negative rates. These rates are measured by the confusion matrix. This analysis provides effective differentiation between benign and malicious traffic on computer networks. The analysis focuses on the difference in correlation symmetry integrals and demonstrates the effectiveness of ML in detecting highly adaptable malware.

Contemporary cybersecurity methods are increasingly burdened by sophisticated malware. This malware is typically characterized by rapid spread, self-propagation, and advanced evasion tactics. These characteristics allow malware to evade near real-time detection and forensic analysis. AI presents itself as a potential solution to address this growing cybersecurity challenge. In a recent study ([27]), the authors propose a novel systematic approach for identifying modern malware families. This approach utilizes a combination of dynamic DL methods and heuristic techniques to achieve classification and detection of different malware types. Their research explores the application of symmetry analysis within the context of malware detection. This application aims to improve detection capability, analysis performance, and mitigation strategies. Ultimately, their research strives for the development of more resilient cyber-systems against evolving threats. To establish the effectiveness and real-world applicability of their approach, the authors employed an empirically-based dataset specifically formed with recent malicious software samples. The experimental results demonstrate that the proposed hybrid approach, combining behavior-based DL and heuristic-based techniques, outperforms static DL methods for malware detection and classification. The complexity of cybersecurity software itself can pose a challenge for existing malware detection techniques. This is particularly evident in the face of highly sophisticated malware attacks.

The continuous emergence of novel malware variants, often mimicking legitimate software, poses a significant challenge for detection. Furthermore, malware’s ability to dynamically modify its internal structure adds complexity. To address these difficulties and improve detection efficiency, dynamic analysis solutions are crucial to expedite feature extraction. Additionally, research into more advanced detection approaches is essential for effectively identifying malicious activities. The recent rise in “intelligent” malware underscores the need for developing artificial intelligence (AI) technologies for both malware detection and prevention.

6. Enhancing Security in Low-Rate Wireless Networks

The ubiquity of wireless networks requires a variety of protocols that meet specific needs. IEEE 802.11 is a commonly used protocol in many wireless applications. For low power consumption and low-rate communication, the IEEE 802.15.4 protocol is a preferred choice in battery-powered devices in home appliances and industrial settings demanding robust operation amidst noise and interference [28].

However, the encryption mechanisms such as WEP, TKIP, and CCMP used in IEEE 802.11 are not compatible with IEEE 802.15.4 due to their high processing power and energy requirements. Since IEEE 802.15.4 protocols require low-power compatibility, lightweight encryption approaches are necessary. Therefore, low-cost encryption algorithms should be used in these networks.

In this section, we propose a novel encryption methodology specifically designed for IEEE 802.15.4 communications. In this methodology, we utilize a GA to generate a symbol-to-chip sequence table, effectively ciphering data. Unlike the standard IEEE 802.15.4 protocol, where this table is publicly available, our anonymous solution makes data symbols undecryptable for listeners passively monitoring the channel.

Traditionally, security in IEEE 802.15.4 networks is addressed through higher-layer protocols, introducing additional overhead. Here, our innovation lies in integrating encryption seamlessly within existing protocol operations, eliminating the need for extra processing demands. This enables secure communication within the inherent low-power and low-overhead constraints of IEEE 802.15.4 networks, enhancing their overall performance and security posture.

Traditionally, security in IEEE 802.15.4 networks is provided through higher-layer protocols. Such protocols introduce additional overhead because adding higher-layer algorithm operations. Our innovation here is to integrate encryption into existing protocol operations in IEEE 802.15.4 existing PHY-MAC (Physical-Media Access Control) layer, eliminating extra processing demands. This enables secure communication within the low power and low load constraints inherent in IEEE 802.15.4 networks, improving their overall performance and security posture.

6.1. IEEE 802.15.4: A Protocol for Low-Rate Wireless Communication

The IEEE 802.15.4 protocol, standardized by the IEEE 802.15 Working Group in 2003, are commonly employed in low-rate wireless personal area networks (WPANs). This protocol defines the PHY-MAC layer functionalities, enabling short-range and low-power wireless communication between devices.

The PHY layer of IEEE 802.15.4 operates in various frequency bands. The 2450 MHz band is a popular choice in these network operations. In this band, the protocol offers a data rate of 2 Mbps using Offset Quadrature Phase-Shift Keying (O-QPSK) modulation. This band provides 16 communication channels. These channels are 5 MHz wide.

An IEEE 802.15.4 packet structure within the 2450 MHz band typically begins with a preamble sequence (PRE) for channel identification, followed by a synchronization header (SYN) for frame delimitation, and a PHY header containing essential information about the packet. These control fields are succeeded by a variable-length payload carrying the actual data, limited to a maximum of 127 octets. Due to this limitation and the minimum size requirement of 1280 octets for an IPv6 packet, an adaptation layer called 6LoWPAN is often used with IEEE 802.15.4 to enable communication within the IPv6 internet protocol suite.

IEEE 802.15.4 networks can operate in two modes: beacon-enabled and beaconless (usually as unslotted). In beacon-enabled mode, a designated coordinator device is responsible for network synchronization. Devices within the network can be categorized as full-function devices (FFDs) or reduced function devices (RFDs). FFDs can act as either coordinators or network clients, while RFDs are limited to client functionality. The coordinator in a beacon-enabled mode transmits special frames called beacons to allocate time slots to client devices for data transmission. These beacons also contain network configuration information and facilitate time synchronization for channel access.

In unslotted mode, devices are not required to adhere to a time-slotted approach. Instead, they rely on the Carrier Sense Multiple Access with Collision Avoidance (CSMA/CA) algorithm for channel access. The flowchart of unslotted CSMA/CA operation can be found in [29]. It begins with a transmission attempt, where initial values are set for parameters like the number of back-off attempts (NB) and the back-off exponent (BE). The device then waits for a random back-off period before checking the channel for idle state. If the channel is idle for a specific duration, the packet transmission proceeds. Upon successful reception, the destination device transmits an acknowledgment (ACK) message. However, noise in the channel can corrupt the transmitted data (packet or ACK). In such scenarios, the source device retransmits the packet up to a predefined maximum number of attempts. If the channel remains busy after multiple attempts, the access failure is reported to higher layers. A successful transmission is acknowledged within a specific time window (MacAckWaitDuration), after which the process is considered complete. The upper layers in the network protocol stack typically consist of a network layer for routing and a higher-level application layer specific to the device’s function [29,30].

The IEEE 802.15.4 standard uses a matching technique to transmit data in wireless networks. Instead of standard data bits, it uses short sequences of pseudo-random noise (PN) codes (given in Table 2). These PN codes act like unique identifiers for each 4-bit chunk of data. By referencing this table that assigns a specific 32-chip PN code to each 4-bit symbol, the receiver can decipher the original data even with some interference. To achieve a balance between data rate and power consumption, the IEEE 802.15.4 protocol transmits these PN codes at 2 million chips per second (Mchip/s). Since each 32-chip sequence represents a 4-bit symbol, this translates to a data rate of 250 kilobits per second (kbps).

6.2. Genetic Algorithm (GA)

John Holland, along with his students and colleagues, pioneered the GA during the 1960s and 1970s at the University of Michigan [31]. Inspired by natural selection and biological reproduction, GAs are evolutionary algorithms that have become popular optimization tools for various real-world applications. Mimicking natural selection (survival of the fittest) and biological reproduction processes, GAs develop optimal solutions (fittest individuals) progressively without relying on strict mathematical formulations. The optimal solution inherits the best characteristics (genes) from the fittest individuals in previous generations. Therefore, GAs are considered stochastic, nonlinear, and discrete event processes rather than mathematically guided algorithms.

The simplest GA operates on a population of individuals represented by fixed-length bit strings. Selection criteria are used to choose a parent pool from the population for generating the next generation. The crossover and mutation operators introduce new candidate solutions into the population. The crossover operator produces new offspring by exchanging partial bit strings and inverting bits between two parents. The mutation operator randomly flips some genes of the new offspring. Each individual’s fitness is evaluated using a fitness function. Finally, the fittest individual in the last generation is considered the optimal solution.

The GA begins by initializing a population with random candidate solutions. It then iteratively develops the optimal solution across generations. During the search process, the GA employs a set of genetic operators: selection, crossover, and mutation. The selection operator prepares the parents’ pool for mating. Thus, the selection operator guides the GA to the optimal solution by preferring the fittest individuals over low-fitted ones as given in Figure 1. A crossover operator is a powerful tool for producing new offspring and improving the quality of individuals by swapping genes between the parents. Crossover operators can influence population diversity in complex ways. While they do not directly create entirely identical offspring (children) with their parents, they can favor specific combinations of existing traits. The mutation operator maintains the population diversity. The main idea is to change the allele of the child randomly. The mutation operator is controlled by a mutation probability that is kept as low as possible to avoid the GA behaving like a random search (Figure 2). The processes of the GA and its operators are given in Figure 3.

GA borrows terminology from biology as it simulates biological processes. However, GA entities are much simpler than their biological counterparts. The fundamental GA terminologies are given as follows:

Population: A set of candidate solutions. The population, a set of candidate solutions, allows the GA to explore various search space regions, facilitating global exploration. Therefore, the quality of the initial population significantly impacts GA performance.

Chromosome/Individual: A candidate solution consisting of genes and their alleles. A gene is a single element position (bit or short block of bits) within a chromosome, and an allele is the gene’s value in a particular chromosome.

Initialization: The first GA process responsible for preparing the initial population with random candidate solutions (individuals).

Evaluation: This process determines the fitness level of an individual using a problem-dependent fitness function. It is triggered after every new individual is produced.

Selection: A crucial process for selecting parents for the crossover operation. The simplest selection technique is based on fitness value, where solutions with higher fitness have a greater probability of being selected.

Crossover and Mutation: A recombination process responsible for generating new offspring is called crossover, while a random deformation of an individual with a specific probability is called mutation.

Replacement: This process prepares the population for the next generation. The basic technique selects the fittest individuals from the current generation (parents and new offspring) to form the next generation.

Stop Criteria: These criteria specify when to terminate the GA and select the optimal solution. Typically, the GA stops when at least one of the following criteria is met: reaching the maximum number of generations or finding an individual with a fitness value exceeding or falling below a threshold [32].

6.3. Optimizing IEEE 802.15.4 Encryption Using Genetic Algorithms

Direct-Sequence Spread Spectrum (DSSS) is a technique that scrambles data with a high-speed, pseudo-random noise (PN) sequence generated from a high data-rate source [33]. This process expands the transmission bandwidth by a factor known as the spreading gain. We leverage the DSSS framework to integrate our encryption method. However, to enhance anonymity, we propose replacing the generic, publicly known IEEE 802.15.4 PN sequence table with a custom symbol–sequence table. To achieve this, we employ the GA to generate high-quality PN sequences. The GA optimizes three key metrics as outlined in [34]:

Balance Property $f_{b a l a n c e}$ : This metric ensures a near-equal distribution of ones and zeros within the sequence, promoting signal clarity;
Run Property $f_{r u n}$ : This metric minimizes the occurrence of consecutive ones or zeros (runs) within the sequence, mitigating potential signal bias;
Correlation Property $f_{r u n}$ : This metric aims to minimize the similarity between different sequences generated by the GA. This reduces the likelihood of interference between users sharing the same channel.

We mathematically evaluate these quality metrics using objective functions, similar to the approach used in [2]. However, we prefer a product function rather than a sum function to form a single objective, as we expect this choice to converge to the optimum more quickly. The GA prioritizes minimizing these functions to identify optimal PN sequences for our encryption scheme. To combine the desired balance, run length, and correlation properties of PN sequences into a single evaluation metric, Equation (1) is derived. This equation’s output is then scaled between 0 and 1 for easier interpretation. Finally, this resulting objective function is used within the GA to find the optimal PN sequence.

\begin{matrix} f_{o b j e c t i v e} & = & \prod_{i = 1}^{3} f_{i} \\ where f & = & {f_{b a l a n c e}, f_{r u n}, f_{c o r r}} \end{matrix}

(1)

The GA parameters were chosen in accordance with the recommendations outlined in [35]. The specific values assigned to these parameters are presented in Table 3.

The success of reaching to the minima of the objective function (given in Equation (1)) is illustrated in Figure 4. The best objective (penalty) function value is found as 2.14

\times 10^{- 7}

.

By means of the GA, our proposal unveils a groundbreaking method for safeguarding communication within IEEE 802.15.4 networks. This novel sequence surpasses the standard set, offering a significant security boost. The advantages obtained by this approach are listed below:

Evolving Security: Our GA successfully birthed a robust PN sequence, providing a secure alternative to the default options. GA operates within each session connection established by the transport layer, which is leveraged for cross-layer interoperability. This enables the generation of the subsequent anonymous PN sequence. This sequence is communicated from the sender to the receiver via Transport Layer Security (TLS), such as Secure Sockets Layer (SSL), layer data. Employing TLS in transport layer connections is indispensable for achieving maximum security. Consequently, enhanced security is provided on a continuous basis over time.

Preserving Performance: Remarkably, this new sequence maintains the noise characteristics needed for the network to function flawlessly, with no impact on data transmission speed.

Cloaked in Anonymity: The GA allows for the recreation of similar, yet distinct, sequences using the same parameters. This “rediscovery” feature introduces anonymity, a layer further strengthening communication security.

Unbreakable Code: Unlike traditional methods with predictable generation mechanisms, the GA-derived sequence has no known production method. This obscurity acts as an extra layer of encryption, making it incredibly difficult to crack.

Hardware Shield: This newfound sequence can be embedded directly into the chip powering wireless communication. This hardware-based approach eliminates the need for software security, potentially leading to a more reliable and tamper-proof solution.

Our research paves the way for using GAs to generate secure PN sequences within these networks. This approach prioritizes exceptional security without significantly impacting performance. The hardware implementation offers a promising path towards a highly secure system against potential threats.

This method leverages unique sequences, generated through the GA, to function as digital identifiers. These unique sequences are assigned to short data units (4 bits) as shown in Table 4. A separate table maps each of these symbols (4 bits) to a specific code sequence consisting of 32 chips. This two-step process allows the receiver to decode the original data even when it is disrupted by interference. Additionally, it provides a layer of encryption by using the inherent secrecy of the GA-generated sequences, enhancing overall security. To ensure seamless integration with existing protocols, we maintain the established symbol-to-sequence association used in the standard IEEE physical layer. Our method simply modifies the sequence values within the set, guaranteeing compatibility with current infrastructure while introducing an encryption benefit.

Table 5 compares the performance of the proposed PN sequence set to the IEEE 802.15.4 payload sequence in terms of PN characteristics. For each objective function output, smaller values are desirable. As shown in the table, the proposed sequence set achieves performance close to that of the IEEE 802.15.4 payload sequence.

To evaluate communication throughput under realistic noise conditions, we employed a simulation platform with chip error rates (

ρ

) ranging from 0% to 3%, representing a low-noise environment like an office. We adhered to the default IEEE 802.15.4 protocol parameters as specified in [36] and eliminated collisions to isolate the impact of error-causing factors. Figure 5 compares the performance of our proposed secure GA sequence (blue) to the generic IEEE 802.15.4 sequence (red). As evident from the figure, both sequences exhibit similar performance across various (

ρ

) values. Chip error rates in typical office environments are estimated to fall within the range of

10^{- 3}

to

10^{- 6}

[37], suggesting that our chosen error rate range [0–3%] offers a comprehensive evaluation. Notably, the figure indicates that the proposed secure GA sequence is a viable solution for low-rate wireless personal area networks in office or home settings. It achieves anonymous, encrypted physical-layer security while maintaining throughput comparable to the existing IEEE 802.15.4 PN sequence scheme.

WSN and IoT devices are often constrained by low processing power, limited memory, and restricted power supplies. This poses a challenge for securing their wireless communication, as robust encryption algorithms like WPA, WPA2, and WEP are computationally expensive and energy-intensive for these devices. AI offers promising solutions for enhancing security, However, for resource-constrained devices with low processing power and limited battery life, alternative approaches like lightweight algorithms might be more practical for real-time implementation. Our proposed approach addresses this by employing AI offline to generate encryption codes. These pre-computed codes are then used within lightweight pseudo-random noise (PN) sequence communications, enabling security within the widely used IEEE 802.15.4 protocol for WSN and IoT devices.

7. Conclusions

The rapidly changing landscape of cyber threats compels the field of cybersecurity to continuously adapt its defenses. This study explores the potential of AI to improve defenses against evolving cyber threats. We provide a concise review of current applications of AI in cybersecurity, focusing specifically on their preventive capabilities against phishing, social engineering, ransomware, and malware. To illustrate these theoretical concepts, a case study presents a specific application of AI in securing communication within IEEE 802.15.4 networks. This case study examines the run-time operational performance of a secure GA sequence implemented for low-rate WPANs. Our findings suggest the proposed sequence offers a promising solution for secure communication in office and home appliances. Our approach achieves anonymous and encrypted communication while maintaining throughput. The performance metrics of our proposed GA sequence closely resemble those of PN sequences defined by the generic IEEE 802.15.4 standard. This similarity, combined with the inherent secrecy of GA-generated sequences, provides an additional layer of encryption. This work highlights the potential of AI in securing communication channels and emphasizes the need for continued research in this domain.

Author Contributions

Conceptualization, S.O. (Selcuk Okdem) and S.O. (Sema Okdem); methodology, S.O. (Selcuk Okdem); software, S.O. (Selcuk Okdem) and S.O. (Sema Okdem); validation, S.O. (Sema Okdem); and other works, S.O. (Selcuk Okdem) and S.O. (Sema Okdem). All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Informed Consent Statement

Not applicable.

Data Availability Statement

Data requests can be directed to corresponding author’s email address.

Acknowledgments

During the revision stage of our manuscript writing, we utilized Gemini for linguistic checks and language improvement.

Conflicts of Interest

The authors declare no conflicts of interest.

Abbreviations

The following abbreviations are used in this manuscript:

ACK	Acknowledgment
AI	Artificial Intelligence
ANN	Artificial Neural Network
CCMP	Counter Mode with Cipher Block Chaining Message Authentication Code Protocol
CSMA/CA	Carrier Sense Multiple Access with Collision Avoidance
DL	Deep Learning
DSSS	Direct-Sequence Spread Spectrum
DT	Decision Tree
FFD	Full-Function Device
GA	Genetic Algorithm
HTTP	Hypertext Transfer Protocol
IEEE	Institute of Electrical and Electronics Engineers
IoT	Internet of Thing
IT	Information Technology
KNN	K-Nearest Neighbors
LR	Logistic Regression
LSTM	Long Short-Term Memory
MAC	Media Access Control
ML	Machine Learning
O-QPS	Offset Quadrature Phase-Shift Keying
PHY	Physical
PN	Pseudo-random Noise
PRE	Preamble
RF	Random Forest
RFD	Reduced-Function Device
RL	Reinforcement Learning
RNN	Recurrent Neural Network
SEA	Social Engineering Attack
SVC	Support Vector Machine
TKIP	Temporal Key Integrity Protocol
WEP	Wired Equivalent Privacy
WPA	Wi-Fi Protected Access
WPAN	Wireless Personal Area Network
WSN	Wireless Sensor Network
SSL	Secure Sockets Layer
TLS	Transport Layer Security
URL	Uniform Resource Locator

References

Falowo, O.I.; Ozer, M.; Li, C.; Abdo, J.B. Evolving Malware and DDoS Attacks: Decadal Longitudinal Study. IEEE Access 2024, 12, 39221–39237. [Google Scholar] [CrossRef]
Okdem, S.; Shi, H. Improving IoT and WSN Communication Throughput Using Evolutionary Optimization. In Proceedings of the ICCCI’24. 6th International Conference on Computer Communication and the Internet (ICCCI), Tokyo, Japan, 14–16 June 2024; pp. 169–174. [Google Scholar]
Qabajeh, I.; Thabtah, F.; Chiclana, F. A recent review of conventional vs. automated cybersecurity anti-phishing techniques. Comput. Sci. Rev. 2018, 29, 44–55. [Google Scholar] [CrossRef]
Thabtah, F.; Mohammad, R.M.; McCluskey, L. A dynamic self-structuring neural network model to combat phishing. In Proceedings of the 2016 International Joint Conference on Neural Networks (IJCNN), Vancouver, BC, Canada, 24–29 July 2016; pp. 4221–4226. [Google Scholar]
Kapan, S.; Sora Gunal, E. Improved Phishing Attack Detection with Machine Learning: A Comprehensive Evaluation of Classifiers and Features. Appl. Sci. 2023, 13, 13269. [Google Scholar] [CrossRef]
Karim, A.; Shahroz, M.; Mustofa, K.; Belhaouari, S.B.; Joga, S.R.K. Phishing Detection System Through Hybrid Machine Learning Based on URL. IEEE Access 2023, 11, 36805–36822. [Google Scholar] [CrossRef]
Alnemari, S.; Alshammari, M. Detecting phishing domains using machine learning. Appl. Sci. 2023, 13, 4649. [Google Scholar] [CrossRef]
Salama, R.; Al-Turjman, F. Cyber-Security Countermeasures and Vulnerabilities to Prevent Social-Engineering Attacks. In Artificial Intelligence of Health-Enabled Spaces; CRC Press: Boca Raton, FL, USA, 2023; pp. 133–144. [Google Scholar]
Zambrano, P.; Torres, J.; Tello-Oquendo, L.; Yánez, Á.; Velásquez, L. On the modeling of cyber-attacks associated with social engineering: A parental control prototype. J. Inf. Secur. Appl. 2023, 75, 103501. [Google Scholar] [CrossRef]
Thomas, K.; Li, F.; Zand, A.; Barrett, J.; Ranieri, J.; Invernizzi, L.; Markov, Y.; Comanescu, O.; Eranti, V.; Moscicki, A.; et al. Data breaches, phishing, or malware? Understanding the risks of stolen credentials. In Proceedings of the 2017 ACM SIGSAC Conference on Computer and Communications Security, Dallas, TX, USA, 30 October–3 November 2017; pp. 1421–1434. [Google Scholar]
Aldawood, H.; Skinner, G. Educating and raising awareness on cyber security social engineering: A literature review. In Proceedings of the 2018 IEEE International Conference on Teaching, Assessment, and Learning for Engineering (TALE), Wollongong, NSW, Australia, 4–7 December 2018; pp. 62–68. [Google Scholar]
Pethers, B.; Bello, A. Role of attention and design cues for influencing cyber-sextortion using social engineering and phishing attacks. Future Internet 2023, 15, 29. [Google Scholar] [CrossRef]
Khan, N.F.; Ikram, N.; Murtaza, H.; Asadi, M.A. Social media users and cybersecurity awareness: Predicting self-disclosure using a hybrid artificial intelligence approach. Kybernetes 2023, 52, 401–421. [Google Scholar] [CrossRef]
Edwards, L.; Zahid Iqbal, M.; Hassan, M. A multi-layered security model to counter social engineering attacks: A learning-based approach. Int. Cybersecur. Law Rev. 2024, 5, 313–336. [Google Scholar] [CrossRef]
Aun, Y.; Gan, M.L.; Wahab, N.; Guan, G.H. Social engineering attack classifications on social media using deep learning. Comput. Mater. Contin 2023, 74, 4917–4931. [Google Scholar]
Alraizza, A.; Algarni, A. Ransomware detection using machine learning: A survey. Big Data Cogn. Comput. 2023, 7, 143. [Google Scholar] [CrossRef]
Humayun, M.; Jhanjhi, N.; Alsayat, A.; Ponnusamy, V. Internet of things and ransomware: Evolution, mitigation and prevention. Egypt. Informatics J. 2021, 22, 105–117. [Google Scholar] [CrossRef]
Razaulla, S.; Fachkha, C.; Markarian, C.; Gawanmeh, A.; Mansoor, W.; Fung, B.C.M.; Assi, C. The Age of Ransomware: A Survey on the Evolution, Taxonomy, and Research Directions. IEEE Access 2023, 11, 40698–40723. [Google Scholar] [CrossRef]
Majid, A.A.M.; Alshaibi, A.J.; Kostyuchenko, E.; Shelupanov, A. A review of artificial intelligence based malware detection using deep learning. Mater. Today Proc. 2023, 80, 2678–2683. [Google Scholar] [CrossRef]
Bello, I.; Chiroma, H.; Abdullahi, U.A.; Gital, A.Y.; Jauro, F.; Khan, A.; Okesola, J.O.; Abdulhamid, S.M. Detecting ransomware attacks using intelligent algorithms: Recent development and next direction from deep learning and big data perspectives. J. Ambient Intell. Humaniz. Comput. 2021, 12, 8699–8717. [Google Scholar] [CrossRef]
Sharmeen, S.; Ahmed, Y.A.; Huda, S.; Koçer, B.Ş.; Hassan, M.M. Avoiding future digital extortion through robust protection against ransomware threats using deep learning based adaptive approaches. IEEE Access 2020, 8, 24522–24534. [Google Scholar] [CrossRef]
Ganfure, G.O.; Wu, C.F.; Chang, Y.H.; Shih, W.K. Rtrap: Trapping and containing ransomware with machine learning. IEEE Trans. Inf. Forensics Secur. 2023, 18, 1433–1448. [Google Scholar] [CrossRef]
von der Assen, J.; Celdrán, A.H.; Luechinger, J.; Sánchez, P.M.S.; Bovet, G.; Pérez, G.M.; Stiller, B. Ransomai: Ai-powered ransomware for stealthy encryption. In Proceedings of the GLOBECOM 2023–2023 IEEE Global Communications Conference, Kuala Lumpur, Malaysia, 4–8 December 2023; pp. 2578–2583. [Google Scholar]
Vanjire, S.; Lakshmi, M. Behavior-based malware detection system approach for mobile security using machine learning. In Proceedings of the 2021 International Conference on Artificial Intelligence and Machine Vision (AIMV), Gandhinagar, India, 24–26 September 2021; pp. 1–4. [Google Scholar]
Kumar, A.; Abhishek, K.; Shah, K.; Patel, D.; Jain, Y.; Chheda, H.; Nerurkar, P. Malware detection using machine learning. In Proceedings of the Knowledge Graphs and Semantic Web: Second Iberoamerican Conference and First Indo-American Conference, KGSWC 2020, Mérida, Mexico, 26–27 November 2020; Proceedings 2. Springer: Berlin/Heidelberg, Germany, 2020; pp. 61–71. [Google Scholar]
Akhtar, M.S.; Feng, T. Malware analysis and detection using machine learning algorithms. Symmetry 2022, 14, 2304. [Google Scholar] [CrossRef]
Djenna, A.; Bouridane, A.; Rubab, S.; Marou, I.M. Artificial intelligence-based malware detection, analysis, and mitigation. Symmetry 2023, 15, 677. [Google Scholar] [CrossRef]
Okdem, S.; Shi, H. A Real-Time Link Quality Estimation Method for IEEE 802.15.4 Based Wireless Sensor Network and IoT Devices. In Proceedings of the 2023 International Wireless Communications and Mobile Computing (IWCMC), Marrakesh, Morocco, 19–23 June 2023; pp. 1–6. [Google Scholar] [CrossRef]
Okdem, S. A cross-layer adaptive mechanism for low-power wireless personal area networks. Comput. Commun. 2016, 78, 16–27. [Google Scholar] [CrossRef]
Okdem, S. A real-time noise resilient data link layer mechanism for unslotted IEEE 802.15. 4 networks. Int. J. Commun. Syst. 2017, 30, e2955. [Google Scholar] [CrossRef]
Holland, J.H. Adaptation in Natural and Artificial Systems: An Introductory Analysis with Applications to Biology, Control, and Artificial Intelligence; MIT Press: Cambridge, UK, 1992. [Google Scholar]
Alhijawi, B.; Awajan, A. Genetic algorithms: Theory, genetic operators, solutions, and applications. Evol. Intell. 2023, 17, 1245–1256. [Google Scholar] [CrossRef]
Herzog, R. Interference cancellation for a high data rate user in coded CDMA systems. In Proceedings of the ICC’98. 1998 IEEE International Conference on Communications. Conference Record. Affiliated with SUPERCOMM’98 (Cat. No. 98CH36220), Atlanta, GA, USA, 7–11 June 1998; Volume 2, pp. 709–713. [Google Scholar]
Swami, D.S.; Sarma, K.K. A Logistic-Map Based PN Sequence for Stocastic Wireless Channels; IGI Global: Hershey, PA, USA, 2017; pp. 155–182. [Google Scholar] [CrossRef]
Khankhour, H.; Abdoun, O.; Abouchabaka, J. Parallel genetic approach for routing optimization in large ad hoc networks. Int. J. Electr. Comput. Eng. (IJECE) 2022, 12, 748–755. [Google Scholar] [CrossRef]
IEEE P802.15.4; IEEE Std. 802.15.4-2003, Part. 15.4. Wireless Medium Access Control (MAC) and Physical Layer (PHY) Specifications for Low-RateWireless Personal Area Networks (LR-WPANs). IEEE: New York, NY, USA, 2003; pp. 1–320.
Fainberg, M. A Performance Analysis of the IEEE 802.11B Local Area Network in the Presence of Bluetooth Personal Area Network. Master’s Thesis, Polytechnic University, Powai, India, 2001; pp. 30–34. [Google Scholar]

Figure 1. An example of roulette wheel operation.

Figure 2. An example of crossover and mutation operations.

Figure 3. Flowchart of GA operations.

Figure 4. Performance over generations.

Figure 5. Throughput performance over error rate.

Table 1. Brief chronology of major ransomware.

Year	Ransomware
1989	AIDS Trojan
2012	Reveton
2013	CryptoLocker
2014	CryptoWall
2015	TeslaCrypt
2016	Locky
2017	WannaCry
2018	SamSam
2019	Ryuk
2020	Maze
2021	REvil/Sodinokibi
2022	Royal Ransomware
2023	LockBit Ransomware

Table 2. PN Sequences of 4-bit Symbols.

Index	Symbol	Sequence (32,4)
1	0000	11011001110000110101001000101110
2	1000	11101101100111000011010100100010
3	0100	00101110110110011100001101010010
4	1100	00100010111011011001110000110101
5	0010	01010010001011101101100111000011
6	1010	00110101001000101110110110011100
7	0110	11000011010100100010111011011001
8	1110	10011100001101010010001011101101
9	0001	10001100100101100000011101111011
10	1001	10111000110010010110000001110111
11	0101	01111011100011001001011000000111
12	1101	01110111101110001100100101100000
13	0011	00000111011110111000110010010110
14	1011	01100000011101111011100011001001
15	0111	10010110000001110111101110001100
16	1111	11001001011000000111011110111000

Table 3. Parameters used by GA.

Parameter Name	Value
#Threads	8
#Population	80
Cross-over rate	0.80
Mutation rate	0.05

Table 4. Proposed PN sequences of 4-bit symbols found by GA.

Index	Symbol	Sequence (32,4)
1	0000	10100010011101001001101101101100
2	1000	10010100010100011011110111000011
3	0100	10110011001010110011001001010101
4	1100	01101100100101011011101101010000
5	0010	11011011101010000010100010010111
6	1010	00011011011010101010110100101001
7	0110	01001001011001111000101001111010
8	1110	10011011111100110001001000100101
9	0001	01110100011001011100110010110100
10	1001	11101001110100000000110100111110
11	0101	01001101100111000011010010100111
12	1101	00000100011111101010100100111101
13	0011	10010110011000110000011110101110
14	1011	01101100001100011110101100110010
15	0111	00011100101001101101101000001111
16	1111	00010010111001101101010101110100

Table 5. Objective function outputs of sequences.

Sequence	$f_{balance}$	$f_{run}$	$f_{corr}$
Proposed Secure Sequence GA	0.00	0.4335	0.4937
IEEE 802.15.4 PN Sequence [36]	0.00	0.5040	0.4667

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

© 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

Share and Cite

MDPI and ACS Style

Okdem, S.; Okdem, S. Artificial Intelligence in Cybersecurity: A Review and a Case Study. Appl. Sci. 2024, 14, 10487. https://doi.org/10.3390/app142210487

AMA Style

Okdem S, Okdem S. Artificial Intelligence in Cybersecurity: A Review and a Case Study. Applied Sciences. 2024; 14(22):10487. https://doi.org/10.3390/app142210487

Chicago/Turabian Style

Okdem, Selcuk, and Sema Okdem. 2024. "Artificial Intelligence in Cybersecurity: A Review and a Case Study" Applied Sciences 14, no. 22: 10487. https://doi.org/10.3390/app142210487

APA Style

Okdem, S., & Okdem, S. (2024). Artificial Intelligence in Cybersecurity: A Review and a Case Study. Applied Sciences, 14(22), 10487. https://doi.org/10.3390/app142210487

Note that from the first issue of 2016, this journal uses article numbers instead of page numbers. See further details here.

Article Menu

Artificial Intelligence in Cybersecurity: A Review and a Case Study

Abstract

1. Introduction

2. Phishing Attacks

3. Cybersecurity in Social Engineering

4. Ransomware: A Growing Threat in the Cybersecurity Landscape

5. Defending Against Malware

6. Enhancing Security in Low-Rate Wireless Networks

6.1. IEEE 802.15.4: A Protocol for Low-Rate Wireless Communication

6.2. Genetic Algorithm (GA)

6.3. Optimizing IEEE 802.15.4 Encryption Using Genetic Algorithms

7. Conclusions

Author Contributions

Funding

Informed Consent Statement

Data Availability Statement

Acknowledgments

Conflicts of Interest

Abbreviations

References

Share and Cite

Article Metrics

Article Access Statistics

Further Information

Guidelines

MDPI Initiatives

Follow MDPI