Article
Peer-Review Record

Invisible Threats in the Data: A Study on Data Poisoning Attacks in Deep Generative Models

Appl. Sci. 2024, 14(19), 8742; https://doi.org/10.3390/app14198742
by Ziying Yang 1, Jie Zhang 2,*, Wei Wang 1 and Huan Li 1
Reviewer 1:
Reviewer 3: Anonymous
Submission received: 12 July 2024 / Revised: 6 September 2024 / Accepted: 20 September 2024 / Published: 27 September 2024
(This article belongs to the Special Issue Computer Vision, Robotics and Intelligent Systems)

Round 1

Reviewer 1 Report

Comments and Suggestions for Authors

Well written work. 

Please discuss the benchmarks against other studies, if possible, to improve readers' acceptance of the paper.

You may post the work on GitHub for others to try out future improvements.

Author Response

Response to Reviewer 1 Comments

1. Summary

Thank you very much for taking the time to review this manuscript. Please find the detailed responses below and the corresponding revisions highlighted in the re-submitted files.

2. Point-by-point response to Comments and Suggestions for Authors

Comments:

Please discuss the benchmarks against other studies, if possible, to improve readers' acceptance of the paper.

You may post the work on GitHub for others to try out future improvements.

Response:

Thank you for your insightful comments. We agree that comparing the performance of our approach with other relevant studies would strengthen the paper’s impact and enhance the reader’s understanding. While we are not able to include a detailed benchmark comparison in this revision, we plan to discuss this in the future work section. Additionally, we have made the code for this project publicly available on GitHub (page 3, line 81). This allows other researchers to replicate our experiments, contribute improvements, and extend this research in new directions.

Reviewer 2 Report

Comments and Suggestions for Authors

The authors employ an encoder-decoder network to "poison" the data during the preparation stage without modifying the model. The trigger remains visually undetectable, substantially enhancing attacker stealthiness and success rates. Consequently, this attack method poses a severe threat to DGMs' security while presenting new challenges for security mechanisms.

The work is interesting and a contribution to the field of knowledge. However, some limitations must be clarified in the discussion, and improvements must be made, such as:

I) DGMs typically consist of two components: a generator and a discriminator. The generator produces new data samples based on the learned data distribution, while the discriminator evaluates the realism of the generated data, aiming for a high level of similarity to real data. Further details should be provided on how attackers can access these stages of the model's training, in addition to indicating what happens in a distributed learning setting (such as federated learning).

II) Table 1 needs to be improved. The timeline should state the year and include references, and the row for each year should be labeled "proposal" or something similar.

III) Explain Tables 3 and 4 in more detail, along with the indicators that support the decision indicated in the conclusions.

IV) Add a Related Works section.

V) Add a Discussion section.

The reviewer

Best regards

Comments on the Quality of English Language

Minor

Author Response

Response to Reviewer 2 Comments

1. Summary

Thank you very much for taking the time to review this manuscript. Please find the detailed responses below and the corresponding revisions highlighted in the re-submitted files.

2. Point-by-point response to Comments and Suggestions for Authors

Comments 1:

DGMs typically consist of two components: a generator and a discriminator. The generator produces new data samples based on the learned data distribution, while the discriminator evaluates the realism of the generated data, aiming for a high level of similarity to real data. Further details should be provided on how attackers can access these stages of the model's training, in addition to indicating what happens in a distributed learning setting (such as federated learning).

Response 1:

Thank you for pointing out the need for more detail regarding attacker access during the training process. To address this, we have added a concrete example in the introduction (page 1, paragraph 4, line 34; see Figure 1). This example illustrates a scenario in which an attacker could gain access to the training process and inject malicious data.

Additionally, you raise an important point regarding distributed learning models, particularly federated learning. While this paper focuses on a centralized setting, we acknowledge the significance of this topic and have included it as a potential direction for future work in the conclusion section. We believe this expansion will provide a more comprehensive understanding of the vulnerabilities inherent in DGM training. This change can be found on page 13, line 436 of the revised manuscript.

Comments 2: Table 1 needs to be improved. The timeline should state the year and include references, and the row for each year should be labeled "proposal" or something similar.

Response 2: 

Thank you for your insightful comments. We have modified Table 1 to include the details you suggested. This change can be found on page 4 of the revised manuscript.

Comments 3: Explain Tables 3 and 4 in more detail, along with the indicators that support the decision indicated in the conclusions.

Response 3: 

Thank you for pointing out the need for more detailed explanations of Tables 3 and 4, as well as the decision-making process outlined in the conclusions. To address this, we have made the following revisions:

1. Expanded Table Explanations: We have added detailed explanations for Tables 3 and 4 in Section 4.1.4 (pages 9-10, lines 323-345). These explanations clarify the metrics used, the data presented, and the interpretation of the results.

2. Clarified Decision-Making Process: In Section 4.2 (page 10, lines 348-377), we have expanded the discussion of the decision-making process, elaborating on how the indicators presented in Tables 3 and 4 support the conclusions drawn in the paper.

3. Strengthened Conclusions: To further clarify the relationship between the tables and the conclusions, we have added a second paragraph to the conclusion section (pages 12-13, lines 406-428). This paragraph explicitly connects the findings presented in Tables 3 and 4 to the overall conclusions of the study.

Comments 4: Add a Related Works section.

Response 4: Thank you for your insightful comments. We have added a Related Work section to the paper; it can be found on page 3 of the revised manuscript.

Comments 5: Add a Discussion section.

Response 5: Thank you for your insightful comments. We have added a Discussion section to the paper; it can be found on page 11, line 378 of the revised manuscript.

Reviewer 3 Report

Comments and Suggestions for Authors

The paper proposes a defense mechanism against attacks on Deep Generative Models (DGM). My comments are as follows:

  1. Length and Depth: The paper is too brief for such a critical topic in the field. The explanations provided are insufficiently detailed. The authors should expand their discussion to ensure comprehensive coverage of the subject.

  2. Illustrative Example: The paper would benefit from the inclusion of a running example, ideally accompanied by a figure, to illustrate a data poisoning attack in a DGM. The authors should also elaborate on the significance and potential consequences of these attacks. Including examples of real-world incidents where such attacks have compromised systems and caused significant issues would strengthen the paper.

  3. Literature Review: The Literature Review section requires substantial improvement. The authors should expand the discussion to cover general attacks on models, providing a broader context for their work. Additionally, it may be beneficial to merge Section 3 with the literature review, allowing for a more cohesive discussion that includes a variety of defense strategies.

  4. Figure 1: The quality of Figure 1 needs improvement. Currently, it appears overly simplistic. The authors should refer to figures presented at leading data science conferences to guide their revisions. Moreover, the figure should be enlarged, as it is currently too small to be effective.

  5. Experiments: In the experimental section, the authors should consider selecting attack models from different families, rather than focusing solely on StyleGAN, to better demonstrate the scalability of their proposed method. Additionally, the paper should include quantitative values for the BA and DS metrics to provide a more rigorous evaluation of the proposed defense mechanism.

Comments on the Quality of English Language

Refer to my comments. In addition, you should proofread the paper with professional editing services.

Author Response

Response to Reviewer 3 Comments

1. Summary

Thank you very much for taking the time to review this manuscript. Please find the detailed responses below and the corresponding revisions highlighted in the re-submitted files.

2. Point-by-point response to Comments and Suggestions for Authors

Comments 1:

Length and Depth: The paper is too brief for such a critical topic in the field. The explanations provided are insufficiently detailed. The authors should expand their discussion to ensure comprehensive coverage of the subject.

Response 1:

Thank you for pointing out the need for more details in the paper. To address this issue, we have extended the introduction section in the following ways:

1. Added background on DGM security against backdoor attacks: We have added a new paragraph to the introduction (page 1, paragraph 2, lines 21-30) that establishes the importance of backdoor attacks within the broader context of DGM security. This paragraph clarifies the relevance and significance of this research area.

2. Added an introduction to backdoor attacks: We have further expanded the discussion of backdoor attacks in paragraph 3 of the introduction (page 1, lines 28-33). This provides a more detailed explanation of backdoor attacks, outlining their mechanics and potential impact on DGM security.

These revisions aim to give readers more detail on the background and significance of backdoor attacks in DGMs before delving into the paper's specific contributions.

Comments 2:

Illustrative Example: The paper would benefit from the inclusion of a running example, ideally accompanied by a figure, to illustrate a data poisoning attack in a DGM. The authors should also elaborate on the significance and potential consequences of these attacks. Including examples of real-world incidents where such attacks have compromised systems and caused significant issues would strengthen the paper.

Response 2: Thank you for your insightful comments. We agree that the inclusion of a running example would greatly benefit the paper. We have added a detailed example, accompanied by Figure 1, to illustrate a data poisoning attack in a DGM; this example clarifies the process and potential consequences of such attacks. Furthermore, we have elaborated on the significance and potential consequences of these attacks, highlighting the potential for real-world harm. This change can be found on page 1, paragraph 4, lines 34-48 of the revised manuscript.

 

“Consider, for example (as shown in Figure 1), a DGM trained to generate images of cats. An attacker might intentionally inject a specific red mark pattern into a subset of publicly available image datasets. This red mark, although subtle to the human eye, could act as a trigger. During the data preparation phase of model training, unsuspecting users might unknowingly use these poisoned public datasets to train their DGM models. As a result, the trained model would be susceptible to the attacker’s influence. Whenever the model encounters the red mark pattern in a new image, it would be triggered to generate images of cats that include the red mark, even if the original image did not contain it. The consequences of such a data poisoning attack are significant. The model’s output would be compromised, leading to the generation of biased or manipulated content. This could have severe implications depending on the application. For example, if the DGM is used for generating medical images, the injected trigger could lead to misdiagnosis or incorrect treatment recommendations.”
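For illustration, the following minimal sketch shows how a faint trigger pattern of the kind described above might be blended into a small fraction of a public image dataset during data preparation. The file paths, poison rate, and blending weight are hypothetical assumptions and do not represent the authors' actual encoder-decoder pipeline.

```python
# Hypothetical sketch of data-preparation-stage poisoning: a faint trigger
# pattern is blended into a small fraction of a public image dataset.
# Paths, POISON_RATE, and ALPHA are illustrative assumptions only.
import random
from pathlib import Path

import numpy as np
from PIL import Image

POISON_RATE = 0.05  # assumed: poison 5% of the images
ALPHA = 0.03        # assumed: low blend weight keeps the trigger near-invisible


def load_trigger(path: str, size: tuple) -> np.ndarray:
    """Load the trigger pattern (e.g. a faint red mark) resized to (width, height)."""
    return np.asarray(Image.open(path).convert("RGB").resize(size), dtype=np.float32)


def poison_dataset(src_dir: str, dst_dir: str, trigger_path: str) -> None:
    src, dst = Path(src_dir), Path(dst_dir)
    dst.mkdir(parents=True, exist_ok=True)
    for img_path in src.glob("*.png"):
        img = np.asarray(Image.open(img_path).convert("RGB"), dtype=np.float32)
        if random.random() < POISON_RATE:
            trigger = load_trigger(trigger_path, (img.shape[1], img.shape[0]))
            # Blend the trigger so the change stays barely perceptible.
            img = (1.0 - ALPHA) * img + ALPHA * trigger
        Image.fromarray(np.clip(img, 0, 255).astype(np.uint8)).save(dst / img_path.name)


# Example usage (hypothetical paths):
# poison_dataset("cats_clean/", "cats_poisoned/", "red_mark.png")
```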

Comments 3:

Literature Review: The Literature Review section requires substantial improvement. The authors should expand the discussion to cover general attacks on models, providing a broader context for their work. Additionally, it may be beneficial to merge Section 3 with the literature review, allowing for a more cohesive discussion that includes a variety of defense strategies.

Response 3:

Thank you for your insightful comments. We have incorporated seven additional references to provide a broader context for the research. These can be found starting on page 1, line 24 of the revised manuscript.

In response to your suggestion, we have merged Sections 2 and 3, adding a transitional paragraph to ensure a natural flow between the two. This change can be found on page 5, line 162 of the revised manuscript.

Comments 4:

Figure 1: The quality of Figure 1 needs improvement. Currently, it appears overly simplistic. The authors should refer to figures presented at leading data science conferences to guide their revisions. Moreover, the figure should be enlarged, as it is currently too small to be effective.

Response 4: 

Thanks for the suggestion. We have improved Figure 1 accordingly; it can be found at the top of page 7.

Comments 5:

Experiments: In the experimental section, the authors should consider selecting attack models from different families, rather than focusing solely on StyleGAN, to better demonstrate the scalability of their proposed method. Additionally, the paper should include quantitative values for the BA and DS metrics to provide a more rigorous evaluation of the proposed defense mechanism.

Response 5:

We appreciate the reviewer’s suggestion to evaluate our method against a broader range of attack models. However, the computational resources and time required to train and evaluate models like StyleGAN3, which is among the most complex and resource-intensive generative models, made it impractical to include models from other families in our current study. We believe that focusing on StyleGAN3, a state-of-the-art model representing a significant challenge for attack methods, provides a compelling demonstration of our approach’s effectiveness. We acknowledge that further evaluation with other generative models would be beneficial and plan to explore this in future work as computational resources allow.

In response to your suggestion, we have refined the description of each metric in Section 4.1.4, 'Evaluation Metrics'. The clean StyleGAN3 data represent the baseline performance (BA) on the clean datasets. These changes can be found on pages 9-10, lines 322-346 of the revised manuscript.

3. Response to Comments on the Quality of English Language

Response: Thank you for your insightful comments. We have proofread the whole paper accordingly.

Round 2

Reviewer 2 Report

Comments and Suggestions for Authors

This paper proposes a method for invisible backdoor attacks and a way to overcome the limitations of existing approaches. The authors employ an encoder-decoder network to "poison" data during the training stage without modifying the model itself.

 

According to the authors, the trigger remains visually undetectable thanks to a meticulous design, which substantially improves the attacker’s stealth and success rates. Consequently, this attack method poses a serious threat to the security of DGMs and introduces new challenges for security mechanisms. 

 

In this new version, the authors still need to review the training of the model before and after it is "poisoned"; for example:

I) The authors say: This method uses an encoder-decoder network to "poison" training datasets during data preparation. In particular, this approach achieves invisible triggers by implanting them into the datasets during the training phase of the DGM model without requiring any modification of the model itself.

 

But what happens to each model's resulting metrics for precision and certainty, in addition to the error measures? All of these arise when a model is trained and tested and its quality is demonstrated, yet such results are not reflected in the results section.

 

II) A section is missing that explains how this attack goes undetected, justified through empirical results, error measures, precision and certainty, or some other way of showing that the results of a poisoned model cannot be detected. Justify and clarify this critical point.

 

III) What are the practical applications of these attacks?

The reviewer

Best regards

 

Comments on the Quality of English Language

Minor

Author Response

Response to Reviewer 2 Comments

1. Summary

Thank you very much for taking the time to review this manuscript. Please find the detailed responses below and the corresponding revisions highlighted in the re-submitted files.

2. Point-by-point response to Comments and Suggestions for Authors

Comments 1: The authors say: This method uses an encoder-decoder network to "poison" training datasets during data preparation. In particular, this approach achieves invisible triggers by implanting them into the datasets during the training phase of the DGM model without requiring any modification of the model itself.

But what happens to each model's resulting metrics for precision and certainty, in addition to the error measures? All of these arise when a model is trained and tested and its quality is demonstrated, yet such results are not reflected in the results section.

Response 1:

Thank you for your insightful comments. We apologize that our explanation of these results was not clear. We have therefore rewritten the relevant part of the results section to make it clearer and more concise.

 

“Table 3 presents an evaluation of our invisible backdoor attack method’s fidelity, comparing the performance of the poisoned StyleGAN3 model with a clean StyleGAN3 model. We use three metrics to assess fidelity: KID, FID, and EQ_T. These metrics reflect the quality of the generated images and the model’s ability to perform integer translations, a crucial aspect of the StyleGAN3 model’s functionality.

As demonstrated in Table 3, our method successfully implements an invisible attack while maintaining model fidelity. Although the KID and FID values of the poisoned model are higher than those of the clean StyleGAN3 model, the images it generates remain of comparable quality to those from the clean model. Importantly, the EQ_T of the poisoned model is less than 2%, demonstrating a minimal loss in the model’s ability to perform integer translations. These findings demonstrate that our approach achieves a balance between attack effectiveness and model fidelity.”
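For illustration, a fidelity comparison of this kind could be sketched as follows, using the torchmetrics package to compute FID and KID between real and generated image batches. The tensor shapes, feature dimension, and subset size are assumptions for illustration only and do not reproduce the paper's exact evaluation code; EQ_T is part of the StyleGAN3 tooling and is not shown here.

```python
# Illustrative sketch: comparing fidelity of a clean vs. a poisoned generator
# with FID and KID from torchmetrics. Shapes and parameters are assumed
# placeholders, not the paper's actual evaluation script.
import torch
from torchmetrics.image.fid import FrechetInceptionDistance
from torchmetrics.image.kid import KernelInceptionDistance


def fidelity_scores(real_images: torch.Tensor, fake_images: torch.Tensor):
    """real_images / fake_images: uint8 tensors of shape (N, 3, H, W), values 0-255."""
    fid = FrechetInceptionDistance(feature=2048)
    kid = KernelInceptionDistance(subset_size=50)
    fid.update(real_images, real=True)
    fid.update(fake_images, real=False)
    kid.update(real_images, real=True)
    kid.update(fake_images, real=False)
    kid_mean, _kid_std = kid.compute()
    return fid.compute().item(), kid_mean.item()


# Example usage with random placeholder data:
# real = torch.randint(0, 256, (100, 3, 256, 256), dtype=torch.uint8)
# fake = torch.randint(0, 256, (100, 3, 256, 256), dtype=torch.uint8)
# fid_value, kid_value = fidelity_scores(real, fake)
```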

 

“To evaluate the stealthiness of our trigger, we trained multiple classification models (ResNet18, ResNet34, ResNet50) on both clean and poisoned StyleGAN3 datasets. Table 4 presents the results, demonstrating a significant decrease in the accuracy (BA values) of the models trained on the poisoned dataset. Notably, the BA of the poisoned StyleGAN3 model drops below 1%, indicating the trigger’s effective concealment from these models.

 

These results confirm the balance achieved between attack effectiveness and model fidelity. While maintaining visual quality, the poisoned model remains susceptible to the trigger, demonstrating its influence on model behavior when presented with the specific trigger input. Figure 4 visually reinforces the stealthiness of our trigger. The images generated by the poisoned StyleGAN3 model (third row) exhibit subtle blurring compared to the original images (first row). This minute difference, barely perceptible through visual inspection alone, underlines the effectiveness of our method in preserving the stealthiness of the model.”
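For illustration, one plausible reading of this stealthiness check is to train a classifier to distinguish clean images from trigger-bearing ones and report its accuracy as the BA-style indicator, with low accuracy suggesting the trigger is hard to detect. The sketch below uses a ResNet18 for this purpose; the folder layout, epoch count, and hyperparameters are hypothetical placeholders rather than the authors' actual protocol.

```python
# Illustrative sketch: training a binary ResNet18 "trigger detector" on clean vs.
# poisoned images and reporting its accuracy as a stealthiness indicator.
# Folder layout, epochs, and hyperparameters are assumed placeholders.
import torch
from torch import nn
from torch.utils.data import DataLoader
from torchvision import datasets, models, transforms


def detector_accuracy(data_root: str, epochs: int = 3, device: str = "cpu") -> float:
    tfm = transforms.Compose([transforms.Resize((224, 224)), transforms.ToTensor()])
    # Expects data_root/clean/*.png and data_root/poisoned/*.png (two class folders).
    dataset = datasets.ImageFolder(data_root, transform=tfm)
    loader = DataLoader(dataset, batch_size=32, shuffle=True)

    model = models.resnet18(weights=None)
    model.fc = nn.Linear(model.fc.in_features, 2)  # binary: clean vs. poisoned
    model = model.to(device)
    opt = torch.optim.Adam(model.parameters(), lr=1e-4)
    loss_fn = nn.CrossEntropyLoss()

    for _ in range(epochs):
        for x, y in loader:
            x, y = x.to(device), y.to(device)
            opt.zero_grad()
            loss = loss_fn(model(x), y)
            loss.backward()
            opt.step()

    # Evaluate on the same loader for brevity; a held-out split would be used in practice.
    model.eval()
    correct = total = 0
    with torch.no_grad():
        for x, y in loader:
            preds = model(x.to(device)).argmax(dim=1).cpu()
            correct += (preds == y).sum().item()
            total += y.numel()
    return correct / total  # low accuracy => the trigger is hard for the classifier to spot
```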

Comments 2: A missing section explains how this attack is not detected, justifying it through empirical results, error measures, precision and certainty, or another way in which the results of a poisoned model cannot be detected. Justify and clarify this point well, which is critical.

Response 2:

Thank you for your valuable comments. We apologize for not explicitly explaining how this attack goes undetected. In Section 1, we have therefore added a detailed example, accompanied by Figure 1, to illustrate a data poisoning attack in a DGM. Additionally, we have revised the manuscript to include a dedicated paragraph that explores the reasons why this attack can go undetected.

This change can be found on page 2, lines 51-57 of the revised manuscript.

 

“Furthermore, despite misclassifications, such as incorrectly identifying a cat as a dog, a backdoored model can maintain its original performance and output high confidence scores, potentially reaching 95%. This high confidence can mislead users, particularly when they lack alternative sources of information or reference points, making them more susceptible to the model’s compromised state. Attackers exploit this by designing backdoors that trigger incorrect outputs while maintaining high confidence levels, further obscuring the model’s compromised state.”

 

Comments 3: What are the practical applications of these attacks?

Response 3:

Thank you for your valuable insights. In this revised version, we elaborate on the practical implications of these attacks. Kindly refer to page 2, lines 59-75, for the exact location of this addition.

“Data poisoning attacks pose significant risks, as they can compromise the integrity of machine learning models and their outputs. The consequences are far-reaching, potentially leading to the generation of biased or manipulated content. This can have severe implications across diverse applications. Practical manifestations of these attacks include:

 

1) Spreading Misinformation and Propaganda: By manipulating social media algorithms, attackers can inject biased or false information, which can then spread rapidly and influence public opinion.

 

2) Legal and Ethical Bias: Data poisoning can be used to inject bias into models employed for legal and ethical decision-making, such as bail recommendations or criminal sentencing. This could result in unfair and discriminatory outcomes.

 

3) Adversarial Machine Learning: Data poisoning can be used to generate adversarial examples, which are specifically designed inputs intended to deceive machine learning models. This can lead to the bypass of security systems, manipulation of algorithms, and even the creation of vulnerabilities in critical infrastructure.

 

These potential consequences underscore the need for robust defenses against data poisoning attacks to ensure the reliability and trustworthiness of machine learning models in diverse applications.”

Reviewer 3 Report

Comments and Suggestions for Authors

Overall, I am fine with the responses. However, I am not satisfied with the response concerning the experiments. I will leave it to the Academic Editor to decide.

Author Response

Thank you for your feedback. We are dedicated to ensuring the quality of our research and will work with the Academic Editor to address any concerns regarding the experiment. We are confident that with further explanation and analysis, we can strengthen this aspect of our work.
