Article
Peer-Review Record

Geometry-Aware Weight Perturbation for Adversarial Training

Electronics 2024, 13(17), 3508; https://doi.org/10.3390/electronics13173508
by Yixuan Jiang * and Hsiao-Dong Chiang
Reviewer 1: Anonymous
Reviewer 2:
Reviewer 3: Anonymous
Reviewer 4:
Submission received: 6 August 2024 / Revised: 2 September 2024 / Accepted: 3 September 2024 / Published: 4 September 2024

Round 1

Reviewer 1 Report

Comments and Suggestions for Authors

This paper introduces a geometry-aware method for adversarial training, with the primary objective of leveraging points in an open set surrounding the point of interest during the adversarial attack phase. This approach aims to better capture the underlying data manifold, which in turn could enhance the effectiveness of adversarial training. The authors' experiments provide empirical support for this insight, demonstrating the potential benefits of their proposed method.

 

Overall, this paper is well-motivated, thoroughly researched, and clearly articulated. The approach is innovative, and the experimental results are promising. However, I have the following comments and suggestions for further improvement:

 

1. Reproducibility: The reproducibility of this work is of paramount importance. I strongly recommend that the authors consider open-sourcing their project, including code, datasets, and detailed instructions for replication. This would not only strengthen the impact and credibility of the work but also facilitate future research in this area.

 

2. Norm-Based Attacks: The current study focuses primarily on adversarial attacks under the L∞-norm and L2-norm. While this is a reasonable starting point, it would be beneficial for the authors to explore and discuss the applicability of their method to other types of attacks, such as L1-norm attacks or trace-norm attacks. Including these additional perspectives could provide a more comprehensive evaluation of the proposed method's robustness across different adversarial scenarios.

 

3. Comparison with Manifold-Based Methods: The proposed method implicitly relates to manifold learning techniques, given its focus on modeling the data manifold during adversarial training. I suggest that the authors expand their discussion to include a comparison with existing manifold-based methods for adversarial robustness. This would not only place their work in the broader context of the field but also highlight the unique contributions and advantages of their geometry-aware approach.

Author Response

We extend our gratitude to the reviewer for taking the time to review this manuscript. Please find the detailed responses below.

Comments 1: Reproducibility: The reproducibility of this work is of paramount importance. I strongly recommend that the authors consider open-sourcing their project, including code, datasets, and detailed instructions for replication. This would not only strengthen the impact and credibility of the work but also facilitate future research in this area.

Response 1: Thank you for pointing that out. We plan to open-source the code after the paper is accepted.

Comments 2: Norm-Based Attacks: The current study focuses primarily on adversarial attacks under the L∞-norm and L2-norm. While this is a reasonable starting point, it would be beneficial for the authors to explore and discuss the applicability of their method to other types of attacks, such as L1-norm attacks or trace-norm attacks. Including these additional perspectives could provide a more comprehensive evaluation of the proposed method's robustness across different adversarial scenarios.

Response 2: We agree with this comment. We selected the L∞-norm and L2-norm threat models because they are the only two threat models discussed in the previous works [A, B]; for a fair comparison, we conducted our comparison experiments under the same settings. We believe the experiments under these two threat models are informative enough to demonstrate the effectiveness of the proposed method.
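For reference, the two threat models discussed above constrain the adversarial perturbation to a norm ball around each input. The formulation below is the standard one (with x the clean input and ε the attack budget), not a reproduction of the paper's notation:

$$\mathcal{B}_\infty(x,\epsilon) = \{\, x' : \|x' - x\|_\infty \le \epsilon \,\}, \qquad \mathcal{B}_2(x,\epsilon) = \{\, x' : \|x' - x\|_2 \le \epsilon \,\}$$

An L∞ adversary may perturb every pixel by at most ε, while an L2 adversary has a fixed total perturbation energy; L1-norm or trace-norm attacks would simply swap in a different ball.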

Comments 3: Comparison with Manifold-Based Methods: The proposed method implicitly relates to manifold learning techniques, given its focus on modeling the data manifold during adversarial training. I suggest that the authors expand their discussion to include a comparison with existing manifold-based methods for adversarial robustness. This would not only place their work in the broader context of the field but also highlight the unique contributions and advantages of their geometry-aware approach.

Response 3: We assume that by "manifold-based methods" the reviewer refers to defense methods that improve model performance on adversarial data by transforming it back to its counterpart on the normal data manifold. This class of methods is also called "adversarial purification", and we agree they are related to our paper. Therefore, we have added Section 2.3 (page 5) to discuss the relation and distinction between these methods and ours.

[A] Yu, Chaojian, et al. "Robust weight perturbation for adversarial training." arXiv preprint arXiv:2205.14826 (2022).

[B] Yu, Chaojian, et al. "Understanding robust overfitting of adversarial training and beyond." arXiv preprint arXiv:2206.08675 (2022).

Reviewer 2 Report

Comments and Suggestions for Authors

Remarks:

1. 

 

Comments on the Quality of English Language

The language used by the authors is generally understandable for a non-native English speaker, but the grammar should be cleaned up throughout the manuscript. Careful proofreading would help improve the clarity, flow, and readability of the text.

Author Response

We extend our gratitude to the reviewer for taking the time to review this manuscript. Please find the detailed responses below.

Comments 1:

Response 1: Thank you for the constructive feedback! To improve the clarity of this section, we have added a figure (Figure 1 on page 2) to illustrate the problems under consideration in this paper. Furthermore, we have added another figure (Figure 2 on page 3) to explain the rationale for the proposed method. Specifically, by visualizing the weight loss landscapes in Figure 2, we find that GAIRAT converges to a sharp local minimum, which motivates us to impose regularization on the smoothness of the weight loss landscape with AWP.
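As context for readers of this record, a figure such as Figure 2 is typically produced by probing the loss along directions in weight space. The sketch below (PyTorch) is a minimal illustration of that technique, assuming a `loss_fn(model)` callable that evaluates the training loss on a fixed batch; the norm-matched random direction follows the spirit of filter normalization (Li et al., 2018), and all names are illustrative rather than the authors' code.

```python
import copy
import torch

@torch.no_grad()
def loss_along_direction(model, loss_fn, alphas):
    """Evaluate the loss at weights w + alpha * d for a norm-matched random direction d."""
    base = copy.deepcopy(model.state_dict())
    direction = {}
    for name, w in base.items():
        if not torch.is_floating_point(w):
            continue  # skip integer buffers such as num_batches_tracked
        d = torch.randn_like(w)
        direction[name] = d * (w.norm() / (d.norm() + 1e-10))  # match the weight scale
    losses = []
    for alpha in alphas:
        perturbed = {k: v + alpha * direction[k] if k in direction else v
                     for k, v in base.items()}
        model.load_state_dict(perturbed)
        losses.append(float(loss_fn(model)))
    model.load_state_dict(base)  # restore the original weights
    return losses
```

A curve that rises steeply around alpha = 0 corresponds to the kind of sharp local minimum described above.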

Comments 2: 

Response 2: Agreed. We have, accordingly, added Section 3.1 (Preliminaries) for a detailed introduction to the weight perturbation mechanism. In this section, we articulate the problem formulation of AWP (Equation (7)) and its update rule for determining the weight perturbation (Equation (8)). We have also introduced the update rule of RWP for comparison (Equation (9)). The justification of the weight perturbation strategy is discussed in lines 168-170; for a theoretical justification, we refer the reviewer to the AWP paper [A].
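The equations cited above are not reproduced in this record. For orientation, the double-perturbation objective of AWP from [A] takes the following form, where f_w is the network, ℓ the classification loss, ε the input perturbation budget, and γ the relative size of the adversarial weight perturbation v (the notation here is a paraphrase, not the paper's Equation (7)):

$$\min_{\mathbf{w}} \ \max_{\|\mathbf{v}\| \le \gamma \|\mathbf{w}\|} \ \frac{1}{n} \sum_{i=1}^{n} \ \max_{\|\mathbf{x}_i' - \mathbf{x}_i\|_p \le \epsilon} \ell\big(f_{\mathbf{w}+\mathbf{v}}(\mathbf{x}_i'),\, y_i\big)$$

The inner maximizations craft adversarial inputs and an adversarial weight perturbation; the outer minimization trains the perturbed weights, which flattens the weight loss landscape.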

Comments 4: Improve the methodology section, particularly for readers less familiar with the difficulties of adversarial training.

Response 4: We have improved the methodology section as described above. To help readers less familiar with adversarial training, we have enriched the discussion of adversarial training (lines 133-139, Table 1 on page 5) by providing a detailed comparison of all the AT baseline methods covered in the paper.

Comments 5: The authors compare GAWP with baseline methods such as GAIRAT and AWP; the study should include more comparisons with other state-of-the-art adversarial training methods.

Response 5: Thank you for the constructive feedback! We have, accordingly, added comparison experiments with MART and MLCAT_WP (updated Tables 3 and 4 on page 10). Note that we have already compared our method with RWP, which is, to the best of our knowledge, the state-of-the-art weight perturbation strategy.

Comments 6: The analysis is performed using ResNet architecture on well-known datasets such as CIFAR-10 and CIFAR-100. However, the study would benefit from further tests on different data types and sampling designs to generalize the proposed method.

Response 6: The chosen datasets and sampling design are the same as in the previous work on RWP [B]; we follow the same setting for a fair comparison. As for other data types, we agree that it is a limitation of our current work to discuss the application of the proposed method only in the context of image data. Therefore, we have added Section 5 (pages 13-14) to discuss the limitations of this paper and potential future research directions.

Comments 7: An explanation of why the proposed method should theoretically lead to improved robustness would be beneficial.

Response 7: We agree that the lack of theoretical justification is another limitation of this paper. Therefore, we have added this point to Section 5 (pages 13-14) as well.

Comments 8: It would be useful to discuss how the proposed method affects model interpretability and whether it presents any additional challenges in understanding how the model makes decisions.

Response 8: We respectfully note that the focus of this paper lies in addressing the robustness generalization issue of GAIRAT and proposing an advanced AT method to enhance model robustness. A discussion of model interpretability is outside the scope of this paper.

Comments 9: The article discusses robust overfitting, but it would be helpful to include additional extensive analysis on whether the proposed method avoids overfitting not just in terms of robustness but also in terms of generalization to new, unseen data.

Response 9: We appreciate the insightful feedback from the reviewer. Following the suggestion, we have included additional analysis of regular overfitting in the paper (Figure 4 on page 11 and lines 285-293).

Comments 10: Improve the final section on Conclusions and Future Work. For example, how could the method be adapted for different types of neural network architectures or different domains?

Response 10: We have, accordingly, added Section 5 for a better discussion of future work. For model architectures, we select PreActResNet-18 and Wide ResNet-34-10, the same as in [B]; no adaptation is necessary for our method to be applied to either architecture. As for different domains, we assume the reviewer means applying our method to other data types; this limitation is discussed in Section 5.

[A] Wu, Dongxian, et al. "Adversarial weight perturbation helps robust generalization." arXiv preprint arXiv:2004.05884 (2020).

[B] Yu, Chaojian, et al. "Robust weight perturbation for adversarial training." arXiv preprint arXiv:2205.14826 (2022).

Reviewer 3 Report

Comments and Suggestions for Authors

The overall quality of the paper is good, but I would like to see better argumentation behind the decisions made during the experiments. Why did you choose the training parameters mentioned in the paper? Why did you choose the mentioned adversarial attacks?

Author Response

We extend our gratitude to the reviewer for taking the time to review this manuscript. Please find the detailed responses below.

Comments 1:  Why did you choose the training parameters mentioned in the paper? Why did you choose the mentioned adversarial attacks?

Response 1: Based on our understanding, the reviewer is asking about the "training parameter" part of Section 4.1, so we would like to provide more clarification regarding this part. For commonly used training parameters, such as batch size, learning rate, and momentum, we follow the settings of previous works [A, B]. For the other training parameters of our method, GAWP, we choose the values based on the results of the ablation study in Section 4.3.2.

We chose the PGD attack for training and AutoAttack (AA) for evaluation, the same as in previous works [A, B]. In particular, the PGD attack is less time-consuming than AA, which is appealing during training. AA is an ensemble of adversarial attacks that includes the PGD attack as well as attacks unseen during training, and thus provides a more comprehensive evaluation of model robustness. A higher test accuracy under AA indicates improved model robustness, while a narrower gap between the test accuracies under AA and the PGD attack implies enhanced robustness generalization.
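As an illustration of the training-time attack described above, the following is a minimal L∞ PGD sketch in PyTorch; epsilon, alpha, and steps are placeholder values rather than the paper's exact settings:

```python
import torch
import torch.nn.functional as F

def pgd_attack(model, x, y, epsilon=8/255, alpha=2/255, steps=10):
    """Craft adversarial examples inside the L-inf ball of radius epsilon around x."""
    # Random start inside the ball, then iterate signed gradient ascent on the loss.
    x_adv = (x + torch.empty_like(x).uniform_(-epsilon, epsilon)).clamp(0, 1).detach()
    for _ in range(steps):
        x_adv.requires_grad_(True)
        loss = F.cross_entropy(model(x_adv), y)
        grad = torch.autograd.grad(loss, x_adv)[0]
        with torch.no_grad():
            x_adv = x_adv + alpha * grad.sign()               # ascend the loss
            x_adv = x + (x_adv - x).clamp(-epsilon, epsilon)  # project back into the ball
            x_adv = x_adv.clamp(0, 1)                         # keep a valid pixel range
    return x_adv.detach()
```

During training, such adversarial batches are generated on the fly for every minibatch; AA, being an ensemble of stronger and more diverse attacks, is reserved for evaluation.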

[A] Yu, Chaojian, et al. "Robust weight perturbation for adversarial training." arXiv preprint arXiv:2205.14826 (2022).

[B] Yu, Chaojian, et al. "Understanding robust overfitting of adversarial training and beyond." arXiv preprint arXiv:2206.08675 (2022).

Reviewer 4 Report

Comments and Suggestions for Authors

The proposed study on adversarial sample learning methods presents a compelling and relevant topic. However, I would like to suggest several revisions to further enhance the research:

 

(1) It would be beneficial to include a more comprehensive set of comparative experiments with existing studies. This would provide a clearer understanding of the advantages and limitations of the proposed method relative to the current state-of-the-art.

 

(2) The inclusion of visualizations for adversarial samples in the dataset would be valuable. Such visualizations could help to better illustrate the effectiveness and characteristics of the proposed approach.

 

(3) An analysis of the time and space complexity of the proposed method should be incorporated. This would provide a more thorough evaluation of the method’s efficiency and feasibility in practical applications.

 

(4) The content would be enriched by including a discussion related to the following reference: "Textual Adversarial Training of Machine Learning Models for Resistance to Adversarial Examples".

 

(5) It is recommended to discuss the limitations of the proposed method and potential areas for future improvements. This would provide a balanced view of the method's current capabilities and suggest directions for further research.

 

Author Response

We extend our gratitude to the reviewer for taking the time to review this manuscript. Please find the detailed responses below.

Comments 1: It would be beneficial to include a more comprehensive set of comparative experiments with existing studies. This would provide a clearer understanding of the advantages and limitations of the proposed method relative to the current state-of-the-art.

Response 1: Thank you for the constructive feedback! We have added more comparative experiments with MART and MLCAT_WP (updated Tables 3 and 4 on page 10). Note that we have already compared our method with RWP, which is, to the best of our knowledge, the state-of-the-art weight perturbation strategy.

Comments 2: The inclusion of visualizations for adversarial samples in the dataset would be valuable. Such visualizations could help to better illustrate the effectiveness and characteristics of the proposed approach.

Response 2: We agree with this comment. Accordingly, we have added a figure (Figure 1 on page 2) to the Introduction section to better illustrate the problems under consideration in this paper; visualizations of adversarial examples are included in Figure 1. We have also added Figure 2 on page 3 to visualize the weight loss landscape, which provides a clearer explanation of the motivation behind the proposed approach.

Comments 3: An analysis of the time and space complexity of the proposed method should be incorporated. This would provide a more thorough evaluation of the method’s efficiency and feasibility in practical applications.

Response 3: We have, accordingly, added Figure 5(e) on page 13 for an analysis of time complexity. As for space complexity, all the experiments were executed on 4 NVIDIA GeForce 2080 Ti GPUs, as mentioned in line 254.

Comments 4: The content would be enriched by including discussions related to the following references: "Textual Adversarial Training of Machine Learning Models for Resistance to Adversarial Examples"

Response 4: We agree that it is a limitation of our paper to discuss the application of the proposed method only in the context of image data. Therefore, we have added Section 5 to discuss the limitations of this paper and potential future work. The discussion of adapting the proposed method to text data appears in lines 380-389.

Comments 5: It is recommended to discuss the limitations of the proposed method and potential areas for future improvements. This would provide a balanced view of the method's current capabilities and suggest directions for further research.

Response 5: Thank you for the constructive comment! We have added Section 5 to discuss the limitations of the proposed method and future research directions.

Round 2

Reviewer 1 Report

Comments and Suggestions for Authors

The authors have addressed my concerns well, and I suggest acceptance.

Author Response

We sincerely appreciate the positive feedback from the reviewer.

Reviewer 2 Report

Comments and Suggestions for Authors

3) 

4) It is introduced in a very compact manner, and it might be useful to explain in more detail how it works, perhaps expanding on the GAWP algorithm for readers who are not familiar with it.






Author Response

Comments 1:

Response 1: Thank you for the comments! We have provided a more detailed discussion of the motivation for the research (lines 32-43). In summary, GAIRAT follows an intuition that has proven successful in regular training settings. However, the current version of GAIRAT is unreliable because of the robustness generalization issue, and no practical remedy has been proposed to solve it. Through this paper, we hope to provide more insight into the underlying cause of this unaddressed issue. Additionally, given the success of similar ideas in regular training, we believe there is untapped potential for GAIRAT to comprehensively improve model robustness.

Comments 2:

Reviewer 4 Report

Comments and Suggestions for Authors

I recommend acceptance.

Author Response

We sincerely appreciate the positive feedback from the reviewer.

Round 3

Reviewer 2 Report

Comments and Suggestions for Authors

All remarks have been taken into consideration by the authors. The manuscript has significantly improved.

Comments on the Quality of English Language

Minor.
