MDPI - Publisher of Open Access Journals

21 pages, 993 KB

Open AccessArticle

BIMW: Blockchain-Enabled Innocuous Model Watermarking for Secure Ownership Verification

by Xinyun Liu and Ronghua Xu

Future Internet 2025, 17(11), 490; https://doi.org/10.3390/fi17110490 (registering DOI) - 26 Oct 2025

The integration of artificial intelligence (AI) and edge computing gives rise to edge intelligence (EI), which offers effective solutions to the limitations of traditional cloud-based AI; however, deploying models across distributed edge platforms raises concerns regarding authenticity, thereby necessitating robust mechanisms for ownership [...] Read more.

The integration of artificial intelligence (AI) and edge computing gives rise to edge intelligence (EI), which offers effective solutions to the limitations of traditional cloud-based AI; however, deploying models across distributed edge platforms raises concerns regarding authenticity, thereby necessitating robust mechanisms for ownership verification. Currently, backdoor-based model watermarking techniques represent a state-of-the-art approach for ownership verification; however, their reliance on model poisoning introduces potential security risks and unintended behaviors. To solve this challenge, we propose BIMW, a blockchain-enabled innocuous model watermarking framework that ensures secure and trustworthy AI model deployment and sharing in distributed edge computing environments. Unlike widely applied backdoor-based watermarking methods, BIMW adopts a novel innocuous model watermarking method called interpretable watermarking (IW), which embeds ownership information without compromising model integrity or functionality. In addition, BIMW integrates a blockchain security fabric to ensure the integrity and auditability of watermarked data during storage and sharing. Extensive experiments were conducted on a Jetson Orin Nano board, which simulates edge computing environments. The numerical results show that our framework outperforms baselines in terms of predicate accuracy, p-value, watermark success rate (WSR), and harmlessness H. Our framework demonstrates resilience against watermarking removal attacks, and it introduces limited latency through the blockchain fabric. Full article

(This article belongs to the Special Issue Distributed Machine Learning and Federated Edge Computing for IoT)

► Show Figures

Figure 1

17 pages, 5175 KB

Open AccessArticle

Invisible Backdoor Attack Based on Dual-Frequency- Domain Transformation

by Mingyue Cao, Guojia Li, Simin Xu, Yihong Zhang and Yan Cao

Electronics 2025, 14(19), 3753; https://doi.org/10.3390/electronics14193753 - 23 Sep 2025

Viewed by 459

Abstract

Backdoor attacks are recognized as a significant security threat to deep learning. Such attacks can induce models to perform abnormally with inputs that contain predefined triggers, while maintaining state-of-the-art (SOTA) performance on clean data. Research indicates that existing backdoor attacks in the spatial [...] Read more.

Backdoor attacks are recognized as a significant security threat to deep learning. Such attacks can induce models to perform abnormally with inputs that contain predefined triggers, while maintaining state-of-the-art (SOTA) performance on clean data. Research indicates that existing backdoor attacks in the spatial domain have the problems of poor stealthiness and limited effectiveness. Based on the dispersion of adding perturbations in the frequency domain and the idea that multiple frequency-domain transformations can achieve different levels of feature fusion, we propose a dual-frequency-domain transformation backdoor attack method called DFDT (dual-frequency-domain transformation). DFDT executes dual-frequency-domain transformation on both clean samples and a trigger image, then conducts feature fusion in the frequency domain to augment the stealthiness of the poisoned samples. In addition, we introduce regularization samples to reduce the latent separability of clean and poisoned samples. We thoroughly evaluate the DFDT on three image datasets: CIFAR-10, GTSRB, and CIFAR-100. The experimental results show that the DFDT achieves greater stealthiness and effectiveness, achieving an attack success rate (ASR) that approximates 100% and a benign accuracy (BA) nearing 94%. Furthermore, we illustrate that DFDT can successfully evade state-of-the-art defenses, including STRIP, NC, and I-BAU. Full article

► Show Figures

Figure 1

17 pages, 394 KB

Open AccessArticle

Boosting Clean-Label Backdoor Attacks on Graph Classification

by Yadong Wang, Zhiwei Zhang, Ye Yuan and Guoren Wang

Electronics 2025, 14(18), 3632; https://doi.org/10.3390/electronics14183632 - 13 Sep 2025

Viewed by 509

Abstract

Graph Neural Networks (GNNs) have become a cornerstone for graph classification, yet their vulnerability to backdoor attacks remains a significant security concern. While clean-label attacks provide a stealthier approach by preserving original labels, they tend to be less effective in graph settings compared [...] Read more.

Graph Neural Networks (GNNs) have become a cornerstone for graph classification, yet their vulnerability to backdoor attacks remains a significant security concern. While clean-label attacks provide a stealthier approach by preserving original labels, they tend to be less effective in graph settings compared to traditional dirty-label methods. This performance gap arises from the inherent dominance of rich, benign structural patterns in target-class graphs, which overshadow the injected backdoor trigger during the GNNs’ learning process. We demonstrate that prior strategies, such as adversarial perturbations used in other domains to suppress benign features, fail in graph settings due to the amplification effects of the GNNs’ message-passing mechanism. To address this issue, we propose two strategies aimed at enabling the model to better learn backdoor features. First, we introduce a long-distance trigger injection method, placing trigger nodes at topologically distant locations. This enhances the global propagation of the backdoor signal while interfering with the aggregation of native substructures. Second, we propose a vulnerability-aware sample selection method, which identifies graphs that contribute more to the success of the backdoor attack based on low model confidence or frequent forgetting events. We conduct extensive experiments on benchmark datasets such as NCI1, NCI109, Mutagenicity, and ENZYMES, demonstrating that our approach significantly improves attack success rates (ASRs) while maintaining a low clean accuracy drop (CAD) compared to existing methods. This work offers valuable insights into manipulating the competition between benign and backdoor features in graph-structured data. Full article

(This article belongs to the Special Issue Security and Privacy for AI)

► Show Figures

Figure 1

20 pages, 2553 KB

Open AccessArticle

CCIBA: A Chromatic Channel-Based Implicit Backdoor Attack on Deep Neural Networks

by Chaoliang Li, Jiyan Liu, Yang Liu and Shengjie Yang

Electronics 2025, 14(18), 3569; https://doi.org/10.3390/electronics14183569 - 9 Sep 2025

Cited by 1 | Viewed by 604

Abstract

Deep neural networks (DNNs) excel in image classification but are vulnerable to backdoor attacks due to reliance on external training data, where specific markers trigger preset misclassifications. Existing attack techniques have an obvious trade-off between the effectiveness of the triggers and the stealthiness, [...] Read more.

Deep neural networks (DNNs) excel in image classification but are vulnerable to backdoor attacks due to reliance on external training data, where specific markers trigger preset misclassifications. Existing attack techniques have an obvious trade-off between the effectiveness of the triggers and the stealthiness, which limits their practical application. For this purpose, in this paper, we develop a method—chromatic channel-based implicit backdoor attack (CCIBA), which combines a discrete wavelet transform (DWT) and singular value decomposition (SVD) to embed triggers in the frequency domain through the chromaticity properties of the YUV color space. Experimental validation on different image datasets shows that compared to existing methods, CCIBA can achieve a higher attack success rate without a large impact on the normal classification ability of the model, and its good stealthiness is verified by manual detection as well as different experimental metrics. It successfully circumvents existing defense methods in terms of sustainability. Overall, CCIBA strikes a balance between covertness, effectiveness, robustness and sustainability. Full article

► Show Figures

Figure 1

43 pages, 1021 KB

Open AccessReview

A Survey of Cross-Layer Security for Resource-Constrained IoT Devices

by Mamyr Altaibek, Aliya Issainova, Tolegen Aidynov, Daniyar Kuttymbek, Gulsipat Abisheva and Assel Nurusheva

Appl. Sci. 2025, 15(17), 9691; https://doi.org/10.3390/app15179691 - 3 Sep 2025

Viewed by 1229

Abstract

Low-power microcontrollers, wireless sensors, and embedded gateways form the backbone of many Internet of Things (IoT) deployments. However, their limited memory, constrained energy budgets, and lack of standardized firmware make them attractive targets for diverse attacks, including bootloader backdoors, hardcoded keys, unpatched CVE [...] Read more.

Low-power microcontrollers, wireless sensors, and embedded gateways form the backbone of many Internet of Things (IoT) deployments. However, their limited memory, constrained energy budgets, and lack of standardized firmware make them attractive targets for diverse attacks, including bootloader backdoors, hardcoded keys, unpatched CVE exploits, and code-reuse attacks, while traditional single-layer defenses are insufficient as they often assume abundant resources. This paper presents a Systematic Literature Review (SLR) conducted according to the PRISMA 2020 guidelines, covering 196 peer-reviewed studies on cross-layer security for resource-constrained IoT and Industrial IoT environments, and introduces a four-axis taxonomy—system level, algorithmic paradigm, data granularity, and hardware budget—to structure and compare prior work. At the firmware level, we analyze static analysis, symbolic execution, and machine learning-based binary similarity detection that operate without requiring source code or a full runtime; at the network and behavioral levels, we review lightweight and graph-based intrusion detection systems (IDS), including single-packet authorization, unsupervised anomaly detection, RF spectrum monitoring, and sensor–actuator anomaly analysis bridging cyber-physical security; and at the policy level, we survey identity management, micro-segmentation, and zero-trust enforcement mechanisms supported by blockchain-based authentication and programmable policy enforcement points (PEPs). Our review identifies current strengths, limitations, and open challenges—including scalable firmware reverse engineering, efficient cross-ISA symbolic learning, and practical spectrum anomaly detection under constrained computing environments—and by integrating diverse security layers within a unified taxonomy, this SLR highlights both the state-of-the-art and promising research directions for advancing IoT security. Full article

► Show Figures

Figure 1

21 pages, 867 KB

Open AccessArticle

Homophily-Guided Backdoor Attacks on GNN-Based Link Prediction

by Yadong Wang, Zhiwei Zhang, Pengpeng Qiao, Ye Yuan and Guoren Wang

Appl. Sci. 2025, 15(17), 9651; https://doi.org/10.3390/app15179651 - 2 Sep 2025

Viewed by 551

Abstract

Graph Neural Networks (GNNs) have shown strong performance in link prediction, a core task in graph analysis. However, recent studies reveal their vulnerability to backdoor attacks, which can manipulate predictions stealthily and pose significant yet underexplored security risks. The existing backdoor strategies for [...] Read more.

Graph Neural Networks (GNNs) have shown strong performance in link prediction, a core task in graph analysis. However, recent studies reveal their vulnerability to backdoor attacks, which can manipulate predictions stealthily and pose significant yet underexplored security risks. The existing backdoor strategies for link prediction suffer from two key limitations: gradient-based optimization is computationally intensive and scales poorly to large graphs, while single-node triggers introduce noticeable structural anomalies and local feature inconsistencies, making them both detectable and less effective. To address these limitations, we propose a novel backdoor attack framework grounded in the principle of homophily, designed to balance effectiveness and stealth. For each selected target link to be poisoned, we inject a unique path-based trigger by adding a bridge node that acts as a shared neighbor. The bridge node’s features are generated through a context-aware probabilistic sampling mechanism over the joint neighborhood of the target link, ensuring high consistency with the local graph context. Furthermore, we introduce a confidence-based trigger injection strategy that selects non-existent links with the lowest predicted existence probabilities as targets, ensuring a highly effective attack from a small poisoning budget. Extensive experiments on five benchmark datasets—Cora, Citeseer, Pubmed, CS, and the large-scale Physics graph—demonstrate that our method achieves superior performance in terms of Attack Success Rate (ASR) while maintaining a low Benign Performance Drop (BPD). These results highlight a novel and practical threat to GNN-based link prediction, offering valuable insights for designing more robust graph learning systems. Full article

(This article belongs to the Special Issue Adversarial Attacks and Cyber Security: Trends and Challenges)

► Show Figures

Figure 1

19 pages, 2394 KB

Open AccessArticle

A Decoupled Contrastive Learning Framework for Backdoor Defense in Federated Learning

by Jiahao Cheng, Tingrui Zhang, Meijiao Li, Wenbin Wang, Jun Wang and Ying Zhang

Symmetry 2025, 17(9), 1398; https://doi.org/10.3390/sym17091398 - 27 Aug 2025

Viewed by 897

Abstract

Federated learning (FL) enables collaborative model training across distributed clients while preserving data privacy by sharing only local parameters. However, this decentralized setup, while preserving data privacy, also introduces new vulnerabilities, particularly to backdoor attacks, in which compromised clients inject poisoned data or [...] Read more.

Federated learning (FL) enables collaborative model training across distributed clients while preserving data privacy by sharing only local parameters. However, this decentralized setup, while preserving data privacy, also introduces new vulnerabilities, particularly to backdoor attacks, in which compromised clients inject poisoned data or gradients to manipulate the global model. Existing defenses rely on the global server to inspect model parameters, while mitigating backdoor effects locally remains underexplored. To address this, we propose a decoupled contrastive learning–based defense. We first train a backdoor model using poisoned data, then extract intermediate features from both the local and backdoor models, and apply a contrastive objective to reduce their similarity, encouraging the local model to focus on clean patterns and suppress backdoor behaviors. Crucially, we leverage an implicit symmetry between clean and poisoned representations—structurally similar but semantically different. Disrupting this symmetry helps disentangle benign and malicious components. Our approach requires no prior attack knowledge or clean validation data, making it suitable for practical FL deployments. Full article

(This article belongs to the Section Computer)

► Show Figures

Figure 1

23 pages, 728 KB

Open AccessEditor’s ChoiceArticle

BASK: Backdoor Attack for Self-Supervised Encoders with Knowledge Distillation Survivability

by Yihong Zhang, Guojia Li, Yihui Zhang, Yan Cao, Mingyue Cao and Chengyao Xue

Electronics 2025, 14(13), 2724; https://doi.org/10.3390/electronics14132724 - 6 Jul 2025

Viewed by 1118

Abstract

Backdoor attacks in self-supervised learning pose an increasing threat. Recent studies have shown that knowledge distillation can mitigate these attacks by altering feature representations. In response, we propose BASK, a novel backdoor attack that remains effective after distillation. BASK uses feature weighting and [...] Read more.

Backdoor attacks in self-supervised learning pose an increasing threat. Recent studies have shown that knowledge distillation can mitigate these attacks by altering feature representations. In response, we propose BASK, a novel backdoor attack that remains effective after distillation. BASK uses feature weighting and representation alignment strategies to implant persistent backdoors into the encoder’s feature space. This enables transferability to student models. We evaluated BASK on the CIFAR-10 and STL-10 datasets and compared it with existing self-supervised backdoor attacks under four advanced defenses: SEED, MKD, Neural Cleanse, and MiMiC. Our experimental results demonstrate that BASK maintains high attack success rates while preserving downstream task performance. This highlights the robustness of BASK and the limitations of current defense mechanisms. Full article

(This article belongs to the Special Issue Advancements in AI-Driven Cybersecurity and Securing AI Systems)

► Show Figures

Figure 1

19 pages, 18048 KB

Open AccessArticle

Natural Occlusion-Based Backdoor Attacks: A Novel Approach to Compromising Pedestrian Detectors

by Qiong Li, Yalun Wu, Qihuan Li, Xiaoshu Cui, Yuanwan Chen, Xiaolin Chang, Jiqiang Liu and Wenjia Niu

Sensors 2025, 25(13), 4203; https://doi.org/10.3390/s25134203 - 5 Jul 2025

Viewed by 665

Abstract

Pedestrian detection systems are widely used in safety-critical domains such as autonomous driving, where deep neural networks accurately perceive individuals and distinguish them from other objects. However, their vulnerability to backdoor attacks remains understudied. Existing backdoor attacks, relying on unnatural digital perturbations or [...] Read more.

Pedestrian detection systems are widely used in safety-critical domains such as autonomous driving, where deep neural networks accurately perceive individuals and distinguish them from other objects. However, their vulnerability to backdoor attacks remains understudied. Existing backdoor attacks, relying on unnatural digital perturbations or explicit patches, are difficult to deploy stealthily in the physical world. In this paper, we propose a novel backdoor attack method that leverages real-world occlusions (e.g., backpacks) as natural triggers for the first time. We design a dynamically optimized heuristic-based strategy to adaptively adjust the trigger’s position and size for diverse occlusion scenarios, and develop three model-independent trigger embedding mechanisms for attack implementation. We conduct extensive experiments on two different pedestrian detection models using publicly available datasets. The results demonstrate that while maintaining baseline performance, the backdoored models achieve average attack success rates of 75.1% on KITTI and 97.1% on CityPersons datasets, respectively. Physical tests verify that pedestrians wearing backpack triggers could successfully evade detection under varying shooting distances of iPhone cameras, though the attack failed when pedestrians rotated by 90°, confirming the practical feasibility of our method. Through ablation studies, we further investigate the impact of key parameters such as trigger patterns and poisoning rates on attack effectiveness. Finally, we evaluate the defense resistance capability of our proposed method. This study reveals that common occlusion phenomena can serve as backdoor carriers, providing critical insights for designing physically robust pedestrian detection systems. Full article

(This article belongs to the Special Issue Intelligent Traffic Safety and Security)

► Show Figures

Figure 1

20 pages, 1526 KB

Open AccessArticle

Chroma Backdoor: A Stealthy Backdoor Attack Based on High-Frequency Wavelet Injection in the UV Channels

by Yukang Fan, Kun Zhang, Bing Zheng, Yu Zhou, Jinyang Zhou and Wenting Pan

Symmetry 2025, 17(7), 1014; https://doi.org/10.3390/sym17071014 - 27 Jun 2025

Viewed by 834

Abstract

With the widespread adoption of deep learning in critical domains, such as computer vision, model security has become a growing concern. Backdoor attacks, as a highly stealthy threat, have emerged as a significant research topic in AI security. Existing backdoor attack methods primarily [...] Read more.

With the widespread adoption of deep learning in critical domains, such as computer vision, model security has become a growing concern. Backdoor attacks, as a highly stealthy threat, have emerged as a significant research topic in AI security. Existing backdoor attack methods primarily introduce perturbations in the spatial domain of images, which suffer from limitations, such as visual detectability and signal fragility. Although subsequent approaches, such as those based on steganography, have proposed more covert backdoor attack schemes, they still exhibit various shortcomings. To address these challenges, this paper presents HCBA (high-frequency chroma backdoor attack), a novel backdoor attack method based on high-frequency injection in the UV chroma channels. By leveraging discrete wavelet transform (DWT), HCBA embeds a polarity-triggered perturbation in the high-frequency sub-bands of the UV channels in the YUV color space. This approach capitalizes on the human visual system’s insensitivity to high-frequency signals, thereby enhancing stealthiness. Moreover, high-frequency components exhibit strong stability during data transformations, improving robustness. The frequency-domain operation also simplifies the trigger embedding process, enabling high attack success rates with low poisoning rates. Extensive experimental results demonstrate that HCBA achieves outstanding performance in terms of both stealthiness and evasion of existing defense mechanisms while maintaining a high attack success rate (ASR > 98.5%). Specifically, it improves the PSNR by 25% compared to baseline methods, with corresponding enhancements in SSIM as well. Full article

(This article belongs to the Section Computer)

► Show Figures

Figure 1

21 pages, 1317 KB

Open AccessArticle

Research on Hidden Backdoor Prompt Attack Method

by Huanhuan Gu, Qianmu Li, Yufei Wang, Yu Jiang, Aniruddha Bhattacharjya, Haichao Yu and Qian Zhao

Symmetry 2025, 17(6), 954; https://doi.org/10.3390/sym17060954 - 16 Jun 2025

Viewed by 1934

Abstract

Existing studies on backdoor attacks in large language models (LLMs) have contributed significantly to the literature by exploring trigger-based strategies—such as rare tokens or syntactic anomalies—that, however, limit both their stealth and generalizability, rendering them susceptible to detection. In this study, we propose [...] Read more.

Existing studies on backdoor attacks in large language models (LLMs) have contributed significantly to the literature by exploring trigger-based strategies—such as rare tokens or syntactic anomalies—that, however, limit both their stealth and generalizability, rendering them susceptible to detection. In this study, we propose HDPAttack, a novel hidden backdoor prompt attack method which is designed to overcome these limitations by leveraging the semantic and structural properties of prompts as triggers rather than relying on explicit markers. Not symmetric to traditional approaches, HDPAttack injects carefully crafted fake demonstrations into the training data, semantically re-expressing prompts to generate examples that exhibit high consistency in input semantics and corresponding labels. This method guides models to learn latent trigger patterns embedded in their deep representations, thereby enabling backdoor activation through natural language prompts without altering user inputs or introducing conspicuous anomalies. Experimental results across datasets (SST-2, SMS, AGNews, Amazon) reveal that HDPAttack achieved an average attack success rate of 99.87%, outperforming baseline methods by 2–20% while incurring a classification accuracy loss of ≤1%. These findings set a new benchmark for undetectable backdoor attacks and underscore the urgent need for advancements in prompt-based defense strategies. Full article

(This article belongs to the Section Mathematics)

► Show Figures

Figure 1

21 pages, 603 KB

Open AccessReview

A Survey on Multi-User Privacy Issues in Edge Intelligence: State of the Art, Challenges, and Future Directions

by Xiuwen Liu, Bowen Li, Sirui Chen and Zhiqiang Xu

Electronics 2025, 14(12), 2401; https://doi.org/10.3390/electronics14122401 - 12 Jun 2025

Viewed by 1140

Abstract

Edge intelligence is an emerging paradigm generated by the deep integration of artificial intelligence (AI) and edge computing. It enables data to remain at the edge without being sent to remote cloud servers, lowering response time, saving bandwidth resources, and opening up new [...] Read more.

Edge intelligence is an emerging paradigm generated by the deep integration of artificial intelligence (AI) and edge computing. It enables data to remain at the edge without being sent to remote cloud servers, lowering response time, saving bandwidth resources, and opening up new development opportunities for multi-user intelligent services (MISs). Although edge intelligence can address the problems of centralized MISs, its inherent characteristics also introduce new challenges, potentially leading to serious security issues. Malicious attackers may use inference attacks and other methods to access private information and upload toxic updates that disrupt the model and cause severe damage. This paper provides a comprehensive review of multi-user privacy protection mechanisms and compares the network architectures under centralized and edge intelligence paradigms, exploring the privacy and security issues introduced by edge intelligence. We then investigate the state-of-the-art defense mechanisms under the edge intelligence paradigm and provide a systematic classification. Through experiments, we compare the privacy protection and utility trade-offs of existing methods. Finally, we propose future research directions for privacy protection in MISs under the edge intelligence paradigm, aiming to promote the development of user privacy protection frameworks. Full article

(This article belongs to the Special Issue Security and Privacy in Networks)

► Show Figures

Figure 1

19 pages, 767 KB

Open AccessArticle

Defending Graph Neural Networks Against Backdoor Attacks via Symmetry-Aware Graph Self-Distillation

by Hanlin Wang, Liang Wan and Xiao Yang

Symmetry 2025, 17(5), 735; https://doi.org/10.3390/sym17050735 - 10 May 2025

Cited by 1 | Viewed by 1975

Abstract

Graph neural networks (GNNs) have exhibited remarkable performance in various applications. Still, research has revealed their vulnerability to backdoor attacks, where Adversaries inject malicious patterns during the training phase to establish a relationship between backdoor patterns and a specific target label, thereby manipulating [...] Read more.

Graph neural networks (GNNs) have exhibited remarkable performance in various applications. Still, research has revealed their vulnerability to backdoor attacks, where Adversaries inject malicious patterns during the training phase to establish a relationship between backdoor patterns and a specific target label, thereby manipulating the behavior of poisoned GNNs. The inherent symmetry present in the behavior of GNNs can be leveraged to strengthen the robustness of GNNs. This paper presents a quantitative metric, termed Logit Margin Rate (LMR), for analyzing the symmetric properties of the output landscapes across GNN layers. Additionally, a learning paradigm of graph self-distillation is combined with LMR to distill the symmetry knowledge from shallow layers, which can serve as the defensive supervision signals to preserve the benign symmetric relationships in deep layers, thus improving both model stability and adversarial robustness. Experiments were conducted on four benchmark datasets to evaluate the robustness of the proposed Graph Self-Distillation-based Backdoor Defense (GSD-BD) method against three widely used backdoor attack algorithms, demonstrating the robustness of GSD-BD even under severe infection scenarios. Full article

(This article belongs to the Special Issue Information Security in AI)

► Show Figures

Figure 1

21 pages, 2595 KB

Open AccessArticle

Adversarial Training for Mitigating Insider-Driven XAI-Based Backdoor Attacks

by R. G. Gayathri, Atul Sajjanhar and Yong Xiang

Future Internet 2025, 17(5), 209; https://doi.org/10.3390/fi17050209 - 6 May 2025

Viewed by 1282

Abstract

The study investigates how adversarial training techniques can be used to introduce backdoors into deep learning models by an insider with privileged access to training data. The research demonstrates an insider-driven poison-label backdoor approach in which triggers are introduced into the training dataset. [...] Read more.

The study investigates how adversarial training techniques can be used to introduce backdoors into deep learning models by an insider with privileged access to training data. The research demonstrates an insider-driven poison-label backdoor approach in which triggers are introduced into the training dataset. These triggers misclassify poisoned inputs while maintaining standard classification on clean data. An adversary can improve the stealth and effectiveness of such attacks by utilizing XAI techniques, which makes the detection of such attacks more difficult. The study uses publicly available datasets to evaluate the robustness of the deep learning models in this situation. Our experiments show that adversarial training considerably reduces backdoor attacks. These results are verified using various performance metrics, revealing model vulnerabilities and possible countermeasures. The findings demonstrate the importance of robust training techniques and effective adversarial defenses to improve the security of deep learning models against insider-driven backdoor attacks. Full article

(This article belongs to the Special Issue Generative Artificial Intelligence (AI) for Cybersecurity)

► Show Figures

Figure 1

18 pages, 1510 KB

Open AccessArticle

BMAIU: Backdoor Mitigation in Self-Supervised Learning Through Active Implantation and Unlearning

by Fan Zhang, Jianpeng Li, Wei Huang and Xi Chen

Electronics 2025, 14(8), 1587; https://doi.org/10.3390/electronics14081587 - 14 Apr 2025

Viewed by 771

Abstract

Self-supervised learning (SSL) is vulnerable to backdoor attacks, while the downstream classifiers based on SSL models inevitably inherit these backdoors, even when they are trained on clean samples. Despite the proposal of several methods of backdoor defense against backdoor attacks, few methods remain [...] Read more.

Self-supervised learning (SSL) is vulnerable to backdoor attacks, while the downstream classifiers based on SSL models inevitably inherit these backdoors, even when they are trained on clean samples. Despite the proposal of several methods of backdoor defense against backdoor attacks, few methods remain that can be used to effectively defend against various backdoor attacks while maintaining the high performance of the model. In this paper, based on the discovery that unlearning any trigger enhances the overall backdoor robustness of the model, a novel, efficient, and straightforward approach is proposed to address the most advanced backdoor attacks. This method involves two stages. Firstly, a backdoor is actively implanted in the model with a custom trigger. Secondly, the model is fine-tuned to unlearn the custom trigger. Through these two stages, it is found out that not only is the implanted backdoor removed, but the unknown backdoors implanted by attackers are also effectively mitigated. As illustrated by the extensive experiments conducted on multiple datasets against the current state-of-the-art methods of attack, the method proposed in this paper requires only a small amount of clean data (approximately 1%) to reduce the success rate of backdoor attacks effectively while ensuring minimal impact on model performance. Full article

► Show Figures

Figure 1

Search Results (71)

Further Information

Guidelines

MDPI Initiatives

Follow MDPI

Saved Queries

Search Filter Reset All

Years

Feature Papers

Subjects

Journals

Article Types

Countries / Regions

Search Results (71)

Further Information

Guidelines

MDPI Initiatives

Follow MDPI