Machine Learning for Cybersecurity: Threat Detection and Mitigation

A special issue of Electronics (ISSN 2079-9292). This special issue belongs to the section "Networks".

Deadline for manuscript submissions: 15 November 2024

Special Issue Editors


Dr. Abdussalam Elhanashi
Guest Editor
Department of Information Engineering, University of Pisa, Via Girolamo Caruso, 16, 56122 Pisa, Italy
Interests: deep learning; machine learning; video processing; image processing; Internet of Things; cybersecurity; embedded systems

Dr. Pierpaolo Dini
Guest Editor
Department of Information Engineering, University of Pisa, 56122 Pisa, Italy
Interests: MATLAB simulation; cybersecurity; control theory; system modeling; machine learning

Special Issue Information

Dear Colleagues,

Machine learning for cybersecurity, encompassing threat detection and mitigation, is a critical application of artificial intelligence that plays a pivotal role in safeguarding digital systems and data. This technology leverages advanced algorithms and models to analyze vast amounts of data, identifying and addressing potential security threats in real time. By scrutinizing network traffic, user behavior, and system vulnerabilities, machine learning systems can detect anomalies and patterns that signify potential attacks, thereby enhancing overall cybersecurity measures. These systems continuously adapt and evolve, learning from new threats and updating their defense mechanisms to remain ahead of cybercriminals. This approach greatly reduces false positives and helps security teams prioritize and respond to the most significant threats swiftly. Machine learning also enhances threat mitigation by automating the incident response process, reducing human intervention, and allowing organizations to thwart attacks before they can inflict substantial damage. Furthermore, it facilitates predictive analysis, enabling organizations to foresee and prevent potential threats. In a rapidly evolving digital landscape, machine learning for cybersecurity is indispensable, fortifying the defense of sensitive information and critical systems against the ever-growing array of cyber threats. This Special Issue aims to address issues involved in the analysis, design, and implementation of machine learning advancements for security applications.

Dr. Abdussalam Elhanashi
Dr. Pierpaolo Dini
Guest Editors

Manuscript Submission Information

Manuscripts should be submitted online at www.mdpi.com by registering and logging in to this website. Once you are registered, click here to go to the submission form. Manuscripts can be submitted until the deadline. All submissions that pass pre-check are peer-reviewed. Accepted papers will be published continuously in the journal (as soon as accepted) and will be listed together on the special issue website. Research articles, review articles, and short communications are invited. For planned papers, a title and short abstract (about 100 words) can be sent to the Editorial Office for announcement on this website.

Submitted manuscripts should not have been published previously, nor be under consideration for publication elsewhere (except conference proceedings papers). All manuscripts are thoroughly refereed through a single-blind peer-review process. A guide for authors and other relevant information for submission of manuscripts is available on the Instructions for Authors page. Electronics is an international peer-reviewed open access semimonthly journal published by MDPI.

Please visit the Instructions for Authors page before submitting a manuscript. The Article Processing Charge (APC) for publication in this open access journal is 2400 CHF (Swiss Francs). Submitted papers should be well formatted and use good English. Authors may use MDPI's English editing service prior to publication or during author revisions.

Keywords

  • cybersecurity
  • intrusion detection
  • threat intelligence
  • anomaly detection
  • network security
  • malware analysis
  • deep learning
  • natural language processing
  • privacy preservation
  • adversarial attacks
  • data encryption
  • security risk assessment
  • authentication
  • vulnerability assessment
  • blockchain security
  • IoT security
  • cloud security
  • biometric authentication
  • cyber threat hunting
  • machine learning frameworks
  • security-aware AI
  • security policies
  • explainable AI
  • security incident response
  • secure data sharing

Published Papers (12 papers)

Research

19 pages, 587 KiB  
Article
MSFuzz: Augmenting Protocol Fuzzing with Message Syntax Comprehension via Large Language Models
by Mingjie Cheng, Kailong Zhu, Yuanchao Chen, Guozheng Yang, Yuliang Lu and Canju Lu
Electronics 2024, 13(13), 2632; https://doi.org/10.3390/electronics13132632 - 4 Jul 2024
Abstract
Network protocol implementations, as integral components of information communication, are critically important for security. Due to its efficiency and automation, fuzzing has become a popular method for protocol security detection. However, the existing protocol-fuzzing techniques face the critical problem of generating high-quality inputs. To address this problem, in this paper, we propose MSFuzz, which is a protocol-fuzzing method with message syntax comprehension. The core observation of MSFuzz is that the source code of protocol implementations contains detailed and comprehensive knowledge of the message syntax. Specifically, we leveraged the code-understanding capabilities of large language models to extract the message syntax from the source code and construct message syntax trees. Then, using these syntax trees, we expanded the initial seed corpus and designed a novel syntax-aware mutation strategy to guide the fuzzing. To evaluate the performance of MSFuzz, we compared it with the state-of-the-art (SOTA) protocol fuzzers, namely, AFLNET and CHATAFL. Experimental results showed that compared with AFLNET and CHATAFL, MSFuzz achieved average improvements of 22.53% and 10.04% in the number of states, 60.62% and 19.52% improvements in the number of state transitions, and 29.30% and 23.13% improvements in branch coverage. Additionally, MSFuzz discovered more vulnerabilities than the SOTA fuzzers.
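
As a rough, illustrative sketch of the syntax-aware mutation idea (not the authors’ implementation), the following Python toy keeps a message’s structure fixed while mutating field values according to their declared types; the field names, types, and mutation choices are all invented for illustration.

```python
import random

# Hypothetical, simplified message syntax "tree": each field has a name,
# a declared type, and a current value. Mutation preserves the structure
# and only perturbs field values, mimicking syntax-aware mutation.
class Field:
    def __init__(self, name, ftype, value):
        self.name, self.ftype, self.value = name, ftype, value

def mutate_field(field):
    """Mutate one field according to its declared type."""
    if field.ftype == "int":
        field.value = random.choice([0, 1, field.value ^ 0xFF, 2**31 - 1])
    elif field.ftype == "string":
        field.value = field.value + random.choice(["A" * 64, "%n%n", "\x00"])
    return field

def syntax_aware_mutate(fields):
    """Pick a random field and mutate it; the message stays well-formed."""
    mutate_field(random.choice(fields))
    return b"|".join(str(f.value).encode() for f in fields)

# Example: a toy message with a numeric opcode and a mutable payload field.
msg = [Field("opcode", "int", 5), Field("payload", "string", "hello")]
print(syntax_aware_mutate(msg))
```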

18 pages, 3774 KiB  
Article
Detecting Fake Accounts on Social Media Portals—The X Portal Case Study
by Weronika Dracewicz and Mariusz Sepczuk
Electronics 2024, 13(13), 2542; https://doi.org/10.3390/electronics13132542 - 28 Jun 2024
Abstract
Today, social media are an integral part of everyone’s life. In addition to their traditional uses of creating and maintaining relationships, they are also used to exchange views and all kinds of content. With the development of these media, they have become the target of various attacks. In particular, the existence of fake accounts on social networks can lead to many types of abuse, such as phishing or disinformation, which is a big challenge nowadays. In this work, we present a solution for detecting fake accounts on the X portal (formerly Twitter). The main goal behind the developed solution was to use images of X portal accounts and perform image classification using machine learning. As a result, it was possible to detect real and fake accounts and indicate the type of a particular account. The created solution was trained and tested on an adequately prepared dataset containing 15,000 generated accounts and real X portal accounts. The CNN model, which achieved accuracy above 92%, together with manual test results, allows us to conclude that the proposed solution can be used to detect fake accounts on the X portal.
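
For readers who want a concrete starting point, here is a minimal Keras sketch of a binary image classifier in the spirit of the approach described; the input size, layer sizes, and class encoding are assumptions, not the authors’ exact architecture.

```python
from tensorflow.keras import layers, models

# Toy binary classifier: account screenshots in, real/fake probability out.
model = models.Sequential([
    layers.Input(shape=(128, 128, 3)),        # screenshot resized to 128x128 RGB (assumed)
    layers.Conv2D(32, 3, activation="relu"),
    layers.MaxPooling2D(),
    layers.Conv2D(64, 3, activation="relu"),
    layers.MaxPooling2D(),
    layers.Flatten(),
    layers.Dense(64, activation="relu"),
    layers.Dense(1, activation="sigmoid"),    # real (0) vs. fake (1)
])
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
# model.fit(train_images, train_labels, validation_split=0.2, epochs=10)
```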

18 pages, 936 KiB  
Article
Linux IoT Malware Variant Classification Using Binary Lifting and Opcode Entropy
by Jayanthi Ramamoorthy, Khushi Gupta, Narasimha K. Shashidhar and Cihan Varol
Electronics 2024, 13(12), 2381; https://doi.org/10.3390/electronics13122381 - 18 Jun 2024
Abstract
Binary function analysis is fundamental in understanding the behavior and genealogy of malware. The detection, classification, and analysis of Linux IoT malware and its variants present significant challenges due to the wide range of architectures supported by the Linux IoT platform. This study concentrates on static analysis using binary lifting techniques to extract and analyze Intermediate Representation (IR) opcode sequences. We introduce a set of statistical entropy-based features derived from these IR opcode sequences, establishing a practical and straightforward methodology for machine learning classification models. By exclusively analyzing function metadata and opcode entropy, our architecture-agnostic approach not only efficiently detects malware but also classifies its variants with a high degree of accuracy, achieving an F1 score of 97%. The proposed approach offers a robust alternative for enhancing malware detection and variant identification frameworks for IoT devices.
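
The entropy feature at the heart of this methodology is simple enough to show directly. A minimal sketch follows; the opcode names are made up, and the paper’s full feature set is richer than this single statistic.

```python
import math
from collections import Counter

def opcode_entropy(opcodes):
    """Shannon entropy (bits) of an opcode frequency distribution."""
    counts = Counter(opcodes)
    total = len(opcodes)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())

# Toy IR opcode sequence for one lifted function.
seq = ["load", "store", "add", "load", "br", "add", "load"]
print(f"{opcode_entropy(seq):.3f} bits")  # higher entropy = more diverse opcodes
```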

18 pages, 3071 KiB  
Article
Enhancing IoT Security: Optimizing Anomaly Detection through Machine Learning
by Maria Balega, Waleed Farag, Xin-Wen Wu, Soundararajan Ezekiel and Zaryn Good
Electronics 2024, 13(11), 2148; https://doi.org/10.3390/electronics13112148 - 31 May 2024
Abstract
As the Internet of Things (IoT) continues to evolve, securing IoT networks and devices remains an ongoing challenge. Anomaly detection is a crucial procedure in protecting the IoT. A promising way to perform anomaly detection in the IoT is through the use of machine learning (ML) algorithms. The literature lacks studies identifying anomaly detection models for the IoT that are optimal with regard to both effectiveness and efficiency. To fill the gap, this work thoroughly investigated the effectiveness and efficiency of IoT anomaly detection enabled by several representative machine learning models, namely Extreme Gradient Boosting (XGBoost), Support Vector Machines (SVMs), and Deep Convolutional Neural Networks (DCNNs). Identifying optimal anomaly detection models for the IoT is challenging due to diverse IoT applications and dynamic IoT networking environments, so it is of vital importance to evaluate ML-powered anomaly detection models using multiple datasets collected from different environments. We utilized three reputable datasets, namely IoT-23, NSL-KDD, and TON_IoT, to benchmark the aforementioned machine learning methods. Our results show that XGBoost outperformed both the SVM and DCNN, achieving accuracies of up to 99.98%. Moreover, XGBoost proved to be the most computationally efficient method: it trained 717.75 times faster than the SVM and significantly faster than the DCNN. The research results were further confirmed using real-world IoT data collected from an IoT testbed consisting of physical devices that we recently built.
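
A hedged sketch of such a benchmark: an XGBoost classifier trained on a synthetic, imbalanced stand-in for the IoT datasets, reporting training time and accuracy. The hyperparameters are placeholders, not the study’s tuned values.

```python
import time
from sklearn.datasets import make_classification
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from xgboost import XGBClassifier

# Synthetic imbalanced data standing in for IoT-23 / NSL-KDD / TON_IoT features.
X, y = make_classification(n_samples=20000, n_features=40,
                           weights=[0.9, 0.1], random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2, stratify=y)

clf = XGBClassifier(n_estimators=200, max_depth=6, eval_metric="logloss")
t0 = time.perf_counter()
clf.fit(X_tr, y_tr)                                   # measure training cost
print(f"train time: {time.perf_counter() - t0:.2f}s")
print(f"accuracy:   {accuracy_score(y_te, clf.predict(X_te)):.4f}")
```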

23 pages, 6110 KiB  
Article
Effects of RF Signal Eventization Encoding on Device Classification Performance
by Michael J. Smith, Michael A. Temple and James W. Dean
Electronics 2024, 13(11), 2020; https://doi.org/10.3390/electronics13112020 - 22 May 2024
Abstract
The results of first-step research activity are presented for realizing an envisioned “event radio” capability that mimics neuromorphic event-based camera processing. The energy efficiency of neuromorphic processing is orders of magnitude higher than traditional von Neumann-based processing and is realized through synergistic design of brain-inspired software and hardware computing elements. Relative to event-based cameras, the development of event-based hardware devices supporting Radio Frequency (RF) applications is severely lagging and considerable interest remains in obtaining neuromorphic efficiency through event-based RF signal processing. In the Operational Technology (OT) protection arena, this includes efficient software computing capability to provide reliable device classification. A Random Forest (RndF) classifier is considered here as a reliable precursor to obtaining Spiking Neural Network (SNN) benefits. Both 1D and 2D eventized RF fingerprints are generated for bursts from NDev = 8 WirelessHART devices. Average correct classification (%C) results show that 2D fingerprinting is best overall using detected events in burst Gabor transform responses. This includes %C ≥ 90% under multiple access interference conditions using an average of NEPB ≥ 400 detected events per burst. This is sufficiently promising to motivate next-step activity aimed at (1) reducing fingerprint dimensionality and minimizing the required computational resources, and (2) transitioning to a neuromorphic-friendly SNN classifier—two significant steps toward developing the necessary computing elements to achieve the full benefits of neuromorphic processing in the envisioned RF event radio.
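
To make the eventization idea concrete, here is an illustrative NumPy sketch that thresholds a short-time Fourier transform (used as a stand-in for the Gabor transform mentioned above) and emits (frame, frequency-bin) events from a synthetic burst; all parameters are arbitrary.

```python
import numpy as np

def eventize(burst, frame=64, hop=32, k=3.0):
    """Return (frame, bin) indices whose magnitude exceeds mean + k*std."""
    frames = [burst[i:i + frame] * np.hanning(frame)
              for i in range(0, len(burst) - frame, hop)]
    mag = np.abs(np.fft.rfft(np.array(frames), axis=1))  # time-frequency map
    thresh = mag.mean() + k * mag.std()
    return np.argwhere(mag > thresh)                     # each row is one event

# Synthetic "burst": a tone in noise standing in for a WirelessHART capture.
rng = np.random.default_rng(0)
burst = np.sin(2 * np.pi * 0.1 * np.arange(4096)) + 0.1 * rng.standard_normal(4096)
events = eventize(burst)
print(len(events), "events detected")  # event lists feed the RndF/SNN classifier
```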

12 pages, 1946 KiB  
Article
HotCFuzz: Enhancing Vulnerability Detection through Fuzzing and Hotspot Code Coverage Analysis
by Chunlai Du, Yanhui Guo, Yifan Feng and Shijie Zheng
Electronics 2024, 13(10), 1909; https://doi.org/10.3390/electronics13101909 - 13 May 2024
Abstract
Software vulnerabilities present a significant cybersecurity threat, particularly as software code grows in size and complexity. Traditional vulnerability-mining techniques face challenges in keeping pace with this complexity. Fuzzing, a key automated vulnerability-mining approach, typically focuses on code branch coverage, overlooking syntactic and semantic elements of the code. In this paper, we introduce HotCFuzz, a novel vulnerability-mining model centered on the coverage of hot code blocks. Leveraging vulnerability syntactic features to identify these hot code blocks, we devise a seed selection algorithm based on their coverage and integrate it into the established fuzzing test framework AFL. Experimental results demonstrate that HotCFuzz surpasses AFL, AFLGo, Beacon, and FairFuzz in terms of efficiency and time savings.
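
A minimal sketch of the seed-selection idea under stated assumptions: seeds are ranked by how many statically flagged “hot” basic blocks their executions covered. The block addresses and hot set are invented, and the AFL integration is omitted.

```python
# Blocks flagged as vulnerability-prone by syntactic features (invented IDs).
HOT_BLOCKS = {0x4011A0, 0x4013C8, 0x4020F0}

def seed_score(covered_blocks):
    """More hot blocks covered => higher priority in the fuzzing queue."""
    return len(HOT_BLOCKS & covered_blocks)

# Each seed paired with the basic blocks its execution covered.
queue = [
    ("seed_a", {0x4011A0, 0x400500}),
    ("seed_b", {0x4013C8, 0x4020F0, 0x400500}),
    ("seed_c", {0x400500}),
]
queue.sort(key=lambda s: seed_score(s[1]), reverse=True)
print([name for name, _ in queue])  # seed_b first: it covers two hot blocks
```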

19 pages, 349 KiB  
Article
Sampling-Based Machine Learning Models for Intrusion Detection in Imbalanced Dataset
by Zongwen Fan, Shaleeza Sohail, Fariza Sabrina and Xin Gu
Electronics 2024, 13(10), 1878; https://doi.org/10.3390/electronics13101878 - 11 May 2024
Cited by 1
Abstract
Cybersecurity is one of the important considerations when adopting IoT devices in smart applications. Even though a huge volume of data is available, data related to attacks are generally in a significantly smaller proportion. Although machine learning models have been successfully applied for detecting security attacks on smart applications, their performance is affected by the problem of such data imbalance: the prediction model is biased toward the majority class, while the performance for predicting the minority class is poor. To address such problems, we apply two oversampling techniques and two undersampling techniques to balance the data in different categories. To verify their performance, five machine learning models, namely the decision tree, multi-layer perceptron, random forest, XGBoost, and CatBoost, are used in the experiments based on a grid search with 10-fold cross-validation for parameter tuning. The results show that both the oversampling and undersampling techniques can improve the performance of the prediction models used. Based on the results, the XGBoost model based on SMOTE has the best performance in terms of accuracy at 75%, weighted average precision at 82%, weighted average recall at 75%, weighted average F1 score at 78%, and Matthews correlation coefficient at 72%. This indicates that this oversampling technique is effective for multi-attack prediction under a data imbalance scenario.
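
As a sketch of the best-performing combination (SMOTE feeding XGBoost), the following uses an imbalanced-learn pipeline so that oversampling is applied only inside the training folds; the dataset and hyperparameters are synthetic placeholders, not the paper’s.

```python
from imblearn.over_sampling import SMOTE
from imblearn.pipeline import Pipeline
from sklearn.datasets import make_classification
from sklearn.model_selection import cross_val_score
from xgboost import XGBClassifier

# Synthetic imbalanced data: 5% "attack" minority class.
X, y = make_classification(n_samples=5000, n_features=20,
                           weights=[0.95, 0.05], random_state=42)

# SMOTE runs inside each training fold only, avoiding leakage into test folds.
pipe = Pipeline([
    ("smote", SMOTE(random_state=42)),
    ("xgb", XGBClassifier(n_estimators=100, eval_metric="logloss")),
])
scores = cross_val_score(pipe, X, y, cv=10, scoring="f1_weighted")
print(f"10-fold weighted F1: {scores.mean():.3f}")
```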

18 pages, 510 KiB  
Article
Learn-IDS: Bridging Gaps between Datasets and Learning-Based Network Intrusion Detection
by Minxiao Wang, Ning Yang, Yanhui Guo and Ning Weng
Electronics 2024, 13(6), 1072; https://doi.org/10.3390/electronics13061072 - 14 Mar 2024
Abstract
In an era marked by the escalating architectural complexity of the Internet, network intrusion detection stands as a pivotal element in cybersecurity. This paper introduces Learn-IDS, an innovative framework crafted to bridge existing gaps between datasets and the training process within deep learning (DL) models for Network Intrusion Detection Systems (NIDS). To elevate conventional DL-based NIDS methods, which are frequently challenged by the evolving cyber threat landscape and exhibit limited generalizability across various environments, Learn-IDS works as a potent and adaptable platform and effectively tackles the challenges associated with datasets used in deep learning model training. Learn-IDS takes advantage of the raw data to address three challenges of existing published datasets: (1) the provided tabular format is not suitable for the diversity of DL models; (2) the fixed traffic instances are not suitable for dynamic network scenarios; (3) the isolated published datasets cannot meet the cross-dataset requirement of DL-based NIDS studies. The data processing results illustrate that the proposed framework can correctly process and label the raw data with an average of 90% accuracy across three published datasets. To demonstrate how to use Learn-IDS for a DL-based NIDS study, we present two simple case studies. The case study on the cross-dataset sampling function reports an average 30.3% improvement in out-of-distribution (OOD) accuracy. The case study on the data formatting function shows that introducing temporal information can enhance the detection accuracy by 4.1%. The experimental results illustrate that the proposed framework, through the synergistic fusion of datasets and DL models, not only enhances detection precision but also dynamically adapts to emerging threats within complex scenarios.
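
A hypothetical sketch of the data-formatting idea: rather than one flat tabular row per flow, packets are grouped into ordered per-flow sequences so temporal information survives for sequence models. The field names and records are assumptions, not Learn-IDS’s actual schema.

```python
from collections import defaultdict

# Toy packet records: (flow_id, timestamp, size, tcp_flags).
packets = [
    ("10.0.0.5:443", 0.001, 1500, "ACK"),
    ("10.0.0.5:443", 0.004, 60,   "ACK"),
    ("10.0.0.9:22",  0.002, 90,   "SYN"),
]

# Group packets into per-flow, time-ordered sequences instead of flat rows.
flows = defaultdict(list)
for flow_id, ts, size, flags in packets:
    flows[flow_id].append((ts, size, flags))

for flow_id, seq in flows.items():
    print(flow_id, "->", sorted(seq))  # one temporal sequence per flow,
                                       # the input unit for a sequence model
```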

12 pages, 1510 KiB  
Article
EPA-GAN: Electric Power Anonymization via Generative Adversarial Network Model
by Yixin Yang, Wen Shen, Qian Guo, Qiuhong Shan, Yihan Cai and Yubo Song
Electronics 2024, 13(5), 808; https://doi.org/10.3390/electronics13050808 - 20 Feb 2024
Abstract
The contemporary landscape of electricity marketing data utilization is characterized by increased openness, heightened data circulation, and more intricate interaction contexts. Throughout the entire lifecycle of data, the persistent threat of leakage is ever-present. In this study, we introduce a novel electricity data anonymization model, termed EPA-GAN, which relies on table generation. In comparison to existing methodologies, our model extends the foundation of generative adversarial networks by incorporating feature encoders and feedback mechanisms. This adaptation enables the generation of anonymized data with heightened practicality and similarity to the original data, specifically tailored for mixed data types, thereby achieving a deliberate decoupling from the source data. Our proposed approach initiates by parsing the original JSON file, encoding it based on variable types and features using distinct feature encoders. Subsequently, a generative adversarial network, enhanced with information, downstream, and generator losses and a Wasserstein-plus-gradient-penalty (WGAN-GP) modification, is employed to generate anonymized data. The introduction of random noise fortifies privacy protection during the data generation process. Experimental validation attests to a conspicuous reduction in both the machine learning utility gap and the statistical dissimilarity between the data synthesized by our proposed anonymization model and the original dataset. This substantiates the model’s efficacy in replacing the original data for mining analysis and data sharing, thereby effectively safeguarding the privacy of the source data.
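
For reference, a minimal PyTorch sketch of the WGAN-GP gradient-penalty term referenced in the abstract; the critic is a placeholder MLP, and the random tensors stand in for encoded electricity records.

```python
import torch
import torch.nn as nn

# Placeholder critic over 16-dimensional encoded records (assumed width).
critic = nn.Sequential(nn.Linear(16, 64), nn.ReLU(), nn.Linear(64, 1))

def gradient_penalty(critic, real, fake, lam=10.0):
    """Penalize critic gradients away from norm 1 on interpolated samples."""
    eps = torch.rand(real.size(0), 1)                       # per-sample mix ratio
    mixed = (eps * real + (1 - eps) * fake).requires_grad_(True)
    grads = torch.autograd.grad(critic(mixed).sum(), mixed,
                                create_graph=True)[0]
    return lam * ((grads.norm(2, dim=1) - 1) ** 2).mean()

real = torch.randn(32, 16)   # stand-ins for encoded original records
fake = torch.randn(32, 16)   # stand-ins for generator output
print(gradient_penalty(critic, real, fake))
```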

Review

16 pages, 3667 KiB  
Review
Research Trends in Artificial Intelligence and Security—Bibliometric Analysis
by Luka Ilić, Aleksandar Šijan, Bratislav Predić, Dejan Viduka and Darjan Karabašević
Electronics 2024, 13(12), 2288; https://doi.org/10.3390/electronics13122288 - 11 Jun 2024
Abstract
This paper provides a bibliometric analysis of current research trends in the field of artificial intelligence (AI), focusing on key topics such as deep learning, machine learning, and security in AI. Through the lens of bibliometric analysis, we explore publications from 2020 to 2024, using primary data from the Clarivate Analytics Web of Science Core Collection. The analysis includes the distribution of studies by year, the number of studies and citation rankings in journals, and the identification of leading countries, institutions, and authors in the field of AI research. Additionally, we investigate the distribution of studies by Web of Science categories, authors, affiliations, publication years, countries/regions, publishers, research areas, and citations per year. Key findings indicate continued growth of interest in topics such as deep learning, machine learning, and security in AI over the past few years. We also identify leading countries and institutions active in researching this area. Awareness of data security is essential for the responsible application of AI technologies. Robust security frameworks are important to mitigate risks associated with AI integration into critical infrastructure such as healthcare and finance. Ensuring the integrity and confidentiality of data managed by AI systems is not only a technical challenge but also a societal necessity, demanding interdisciplinary collaboration and policy development. This analysis provides a deeper understanding of the current state of research in the field of AI and identifies key areas for further research and innovation. Furthermore, these findings may be valuable to practitioners and decision-makers seeking to understand current trends and innovations in AI to enhance their business processes and practices.

25 pages, 872 KiB  
Review
Detection of DoS Attacks for IoT in Information-Centric Networks Using Machine Learning: Opportunities, Challenges, and Future Research Directions
by Rawan Bukhowah, Ahmed Aljughaiman and M. M. Hafizur Rahman
Electronics 2024, 13(6), 1031; https://doi.org/10.3390/electronics13061031 - 9 Mar 2024
Abstract
The Internet of Things (IoT) is a rapidly growing network that shares information over the Internet via interconnected devices. In addition, this network has led to new security challenges in recent years. One of the biggest challenges is the impact of denial-of-service (DoS) attacks on the IoT. The Information-Centric Network (ICN) infrastructure is a critical component of the IoT. The ICN has gained recognition as a promising networking solution for the IoT by enabling IoT devices to communicate and exchange data with each other over the Internet. Moreover, the ICN provides easy access and straightforward security to IoT content. However, the integration of IoT devices into the ICN introduces new security challenges, particularly in the form of DoS attacks. These attacks aim to disrupt or disable the normal operation of the ICN, potentially leading to severe consequences for IoT applications. Machine learning (ML), a subset of artificial intelligence (AI), is a powerful technology for addressing this problem, and this paper proposes a new approach for developing a robust and efficient solution for detecting DoS attacks in ICN-IoT networks using ML. Several ML algorithms have been explored in the literature, including neural networks, decision trees (DTs), clustering algorithms, XGBoost, J48, multilayer perceptron (MLP) with backpropagation (BP), deep neural networks (DNNs), MLP-BP, RBF-PSO, RBF-JAYA, and RBF-TLBO; when these detection approaches are compared using classification metrics such as accuracy, SVM, RF, and KNN demonstrate superior performance over the alternatives. The proposed approach targets the NDN architecture because, based on our findings, it is the most widely used ICN architecture and is the subject of a high percentage of the surveyed cyberattacks. The proposed approach can be evaluated using an ndnSIM simulation and a synthetic dataset for detecting DoS attacks in ICN-IoT networks using ML algorithms.
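
To make the reported comparison concrete, a small scikit-learn sketch that cross-validates SVM, RF, and KNN by accuracy follows, on a synthetic stand-in for an ndnSIM-derived DoS dataset; it illustrates the evaluation protocol only, not the review’s actual numbers.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC

# Synthetic stand-in for ICN-IoT traffic features (benign vs. DoS).
X, y = make_classification(n_samples=3000, n_features=15, random_state=1)

# Compare the three classifiers the review singles out, by CV accuracy.
for name, clf in [("SVM", SVC()), ("RF", RandomForestClassifier()),
                  ("KNN", KNeighborsClassifier())]:
    acc = cross_val_score(clf, X, y, cv=5, scoring="accuracy").mean()
    print(f"{name}: {acc:.3f}")
```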

Other

19 pages, 1985 KiB  
Opinion
Towards an AI-Enhanced Cyber Threat Intelligence Processing Pipeline
by Lampis Alevizos and Martijn Dekker
Electronics 2024, 13(11), 2021; https://doi.org/10.3390/electronics13112021 - 22 May 2024
Abstract
Cyber threats continue to evolve in complexity, and traditional cyber threat intelligence (CTI) methods therefore struggle to keep pace. AI offers a potential solution, automating and enhancing various tasks, from data ingestion to resilience verification. This paper explores the potential of integrating artificial intelligence (AI) into CTI. We provide a blueprint of an AI-enhanced CTI processing pipeline and detail its components and functionalities. The pipeline highlights the collaboration between AI and human expertise, which is necessary to produce timely and high-fidelity cyber threat intelligence. We also explore the automated generation of mitigation recommendations, harnessing AI’s capabilities to provide real-time, contextual, and predictive insights. However, the integration of AI into CTI is not without its challenges. We therefore discuss the ethical dilemmas, potential biases, and the imperative for transparency in AI-driven decisions. We address the need for data privacy, consent mechanisms, and the potential misuse of technology. Moreover, we highlight the importance of addressing biases both during CTI analysis and within AI models, ensuring their transparency and interpretability. Lastly, our work points out future research directions, such as the exploration of advanced AI models to augment cyber defenses and human–AI collaboration optimization. Ultimately, the fusion of AI with CTI appears to hold significant potential in the cybersecurity domain.
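
A purely illustrative skeleton of such a pipeline, written as staged Python functions with an explicit human-review gate; the stage names paraphrase the blueprint loosely and are not the paper’s exact components.

```python
# Staged CTI pipeline sketch: ingest -> enrich -> AI triage -> human review
# -> mitigation recommendations. All logic is placeholder.
def ingest(feeds):        return [item for feed in feeds for item in feed]
def enrich(items):        return [{"ioc": i, "context": "geo/asn lookup"} for i in items]
def ai_triage(items):     return [i for i in items if len(i["ioc"]) > 3]  # stand-in model
def human_review(items):  return items                                    # analyst checkpoint
def recommend(items):     return [f"block {i['ioc']}" for i in items]

feeds = [["198.51.100.7", "bad.tld"], ["x"]]  # toy threat feeds
for action in recommend(human_review(ai_triage(enrich(ingest(feeds))))):
    print(action)
```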
