Defense Against Adversarial Attacks in Deep Learning
Round 1
Reviewer 1 Report
This paper presents an approach to make deep learning techniques (convolutional neural networks) robust to adversarial attacks when input images have been purposefully perturbed with noise. By leveraging GANs for data augmentation in addition to a knowledge transfer approach for training, the proposed method (UDDN) is able to improve on existing approaches using a public training and test set.
Overall, the approach taken by the paper is sound, and its relevance to deep learning research is timely given the vast interest in convolutional networks for computer vision. There are only 2 comments (1 minor, 1 moderate) that I feel should be addressed prior to publication:
1) Make sure all acronyms are defined when they are first used
2) Provide explicit details as to how LFW was used for experimental validation, and the structure of the experiment itself. This information can be inferred from the details of the result section, but would be better to be stated upfront.
In general I found the work presented to be an interesting contribution worthy of publication.
Author Response
To reviewer 1:
We would like to express our sincere thanks to you for the constructive and positive comments. Our responses to the comments are as follows:
Point 1: Make sure all acronyms are defined when they are first used.
Response 1: Thank you for your suggestion. We have added the full names of the acronyms at their first occurrences, including Wasserstein generative adversarial network (W-GAN), limited-memory BFGS (L-BFGS), fast gradient sign method (FGSM), and Jacobian-based saliency map approach (JSMA). We have also added an explanation of U-Net at its first occurrence: it consists of a contracting path and an expanding path, and is so named because the network structure resembles a "U". UDDN and GNR are the names of our new methods: UDDN is a deep denoising neural network, and GNR is a noise reconstruction method based on a generative model.
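For reference, the FGSM mentioned above follows the standard single-step formulation below; this is the textbook definition rather than an excerpt from the revised manuscript:

```latex
x_{\mathrm{adv}} = x + \epsilon \cdot \operatorname{sign}\!\big(\nabla_x J(\theta, x, y)\big)
```

where J is the classification loss, θ the model parameters, x the input image, y its label, and ε the perturbation budget. The iterative variants referred to below (IFGSM2, IFGSM4, IFGSM8) presumably apply this step repeatedly with a correspondingly smaller step size.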
Point 2: Provide explicit details as to how LFW was used for experimental validation, and the structure of the experiment itself. This information can be inferred from the details of the result section, but would be better to be stated upfront.
Response 2: Thank you for your suggestion. The experimental validation of the proposed method is performed on the LFW face dataset, and the attacked target model is FaceNet. We first extract 10k face images from LFW (10 per class) and use a series of attack methods to distort the images and obtain the corresponding adversarial images. The attack methods include FGSM, IFGSM2, IFGSM4, and IFGSM8. The perturbation level of each sample is uniformly sampled from [0, 1], yielding 40k images, which are used to train the GNR. Then, we extract a further 20k face images from LFW (10 per class) and use the trained GNR to generate 80k attack images. In total, the training set contains 120k images. To prepare the validation set, we extract 5k face images (5 per class) from LFW and use the same method to obtain 20k attack images. Two different test sets were built, one for white-box attacks and the other for black-box attacks; for each, we extracted 5k face images (5 per class) from LFW. The white-box test set uses two attack methods, FGSM and FGSM4, based on FaceNet to obtain 10k test images. The black-box test set uses the same two attack methods based on a pre-trained Inception V3 to obtain 10k test images. When training UDDN, we set the learning rate initially to 0.001 and decay it to 0.0001 when the training loss converges. The model is trained on 4 GPUs with a batch size of 32, and the number of training epochs is between 30 and 40, depending on the convergence speed of the model.
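To make the attack-image generation step above concrete, the following is a minimal sketch of how FGSM and its iterative variant could be implemented; the `model`, `loss_fn`, `images`, and `labels` names are placeholders for illustration and are not the authors' actual code.

```python
# Illustrative sketch only: generating adversarial images with FGSM / iterative FGSM,
# as described in the data-preparation pipeline above. PyTorch is assumed.
import torch

def fgsm_step(model, loss_fn, images, labels, eps):
    """Single FGSM step: move the input along the sign of the loss gradient."""
    images = images.clone().detach().requires_grad_(True)
    loss = loss_fn(model(images), labels)
    loss.backward()
    adv = images + eps * images.grad.sign()
    return adv.clamp(0.0, 1.0).detach()  # keep pixels in the valid range

def ifgsm(model, loss_fn, images, labels, eps, steps):
    """Iterative FGSM (e.g. IFGSM2/4/8): repeat the step with a smaller step size."""
    adv = images
    for _ in range(steps):
        adv = fgsm_step(model, loss_fn, adv, labels, eps / steps)
    return adv

# As in the response, the perturbation level eps for each sample could be drawn
# uniformly from [0, 1], e.g. eps = torch.rand(1).item()
```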
Reviewer 2 Report
The proposed work is interesting and sound. However, the presentation style must be improved. Specifically, the authors need to provide more detail when explaining prior work and spell out acronyms. The work will not be read only by specialists if published; it is not possible to use UDDN, U-Net, etc. without introducing them the first time these terms are mentioned in the paper.
There are other ways to improve generalization capabilities and perform knowledge transfer in deep networks; please refer to the following works and consider them properly in the literature review:
(1) S. M. Siniscalchi, V. M. Salerno, Adaptation to new microphones using artificial neural networks with trainable activation functions, IEEE Transactions on Neural Networks and Learning Systems 28 (8), 1959–1965.
(2) Salerno, V.M.; Rabbeni, G. An Extreme Learning Machine Approach to Effective Energy Disaggregation. Electronics 2018, 7, 235.
(3) Zhang, A.; Wang, H.; Li, S.; Cui, Y.; Liu, Z.; Yang, G.; Hu, J. Transfer Learning with Deep Recurrent Neural Networks for Remaining Useful Life Estimation. Appl. Sci. 2018, 8, 2416.
With respect to the experimental results: the authors should not simply report numbers and show that they are better, but should link those results to the properties of their proposal that make those numbers better than the ones obtained with other approaches.
Author Response
To reviewer 2:
We would like to express our sincere thanks to you for the constructive and positive comments. Our responses to the comments are as follows:
Point 1: Authors need to provide more detail when explaining prior work and spell out acronyms. The work will not be read only by specialists if published; it is not possible to use UDDN, U-Net, etc. without introducing them the first time these terms are mentioned in the paper.
Response 1: Thank you for your suggestion. We have added the full names of the acronyms at their first occurrences, including Wasserstein generative adversarial network (W-GAN), limited-memory BFGS (L-BFGS), fast gradient sign method (FGSM), and Jacobian-based saliency map approach (JSMA). We have also added an explanation of U-Net at its first occurrence: it consists of a contracting path and an expanding path, and is so named because the network structure resembles a "U". UDDN and GNR are the names of our new methods: UDDN is a deep denoising neural network, and GNR is a noise reconstruction method based on a generative model.
Point 2: There are other ways to improve generalization capabilities and perform knowledge transfer in deep networks; please refer to the following works and consider them properly in the literature review:
1) S. M. Siniscalchi, V. M. Salerno, Adaptation to new microphones using artificial neural networks with trainable activation functions, IEEE Transactions on Neural Networks and Learning Systems 28 (8), 1959–1965.
2) Salerno, V.M.; Rabbeni, G. An Extreme Learning Machine Approach to Effective Energy Disaggregation. Electronics 2018, 7, 235.
3) Zhang, A.; Wang, H.; Li, S.; Cui, Y.; Liu, Z.; Yang, G.; Hu, J. Transfer Learning with Deep Recurrent Neural Networks for Remaining Useful Life Estimation. Appl. Sci. 2018, 8, 2416.
Response 2: Thank you for your suggestion. We have read the three references carefully and cited them in the literature review; the added references are labeled [11], [12], and [17].
Point 3: With respect to the experimental results: the authors should not simply report numbers and show that they are better, but should link those results to the properties of their proposal that make those numbers better than the ones obtained with other approaches.
Response 3: Thank you for your suggestion. We have added discussion in the experiments that links the results to the defense properties of the proposed protection strategy. In the experiment evaluating the performance of UDDN, it can be seen intuitively that the images denoised by the proposed UDDN are very close to the real images. To quantitatively analyze the performance of the denoising method, we feed the denoised images into the classification model. The model accuracy results show that the probability of misclassification is greatly reduced after the images are denoised by the proposed UDDN. Compared with other denoising models, our method achieves higher accuracy and is therefore more resistant to adversarial attacks. In the experiment evaluating the overall defense strategy, in order to demonstrate its effectiveness and robustness, we compare it with other defense mechanisms; the experiments are also performed on different mainstream deep learning models and different datasets. The accuracy results show that the model achieves the highest accuracy under the protection of our proposed defense mechanism, which demonstrates the strongest defense against adversarial attacks.
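As a minimal sketch of the quantitative evaluation described above (denoise the adversarial images, then classify them and measure accuracy), the following illustrates the procedure; `denoiser`, `classifier`, and `test_loader` are hypothetical placeholders, not the authors' implementation.

```python
# Illustrative sketch: accuracy of a classifier on images passed through a denoiser,
# as in the UDDN evaluation described above. PyTorch is assumed.
import torch

@torch.no_grad()
def denoised_accuracy(denoiser, classifier, test_loader, device="cpu"):
    correct, total = 0, 0
    for images, labels in test_loader:
        images, labels = images.to(device), labels.to(device)
        cleaned = denoiser(images)                 # remove adversarial noise
        preds = classifier(cleaned).argmax(dim=1)  # classify the denoised images
        correct += (preds == labels).sum().item()
        total += labels.numel()
    return correct / total
```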
Round 2
Reviewer 2 Report
The authors have addressed all of my concerns. I think the manuscript could be accepted.
Author Response
To Academic Editor:
We would like to express our sincere thanks to you for the constructive and positive comments. Our responses to the comments are as follows:
Point 1: All figure captions should explain details. Current version shows too short to understand the meaning of each figure.
Response 1: Thank you for your suggestion. We have added detailed explanations to all figure captions.
Point 2: Typo: Conclusion
Response 2: Thank you for your suggestion. We have fixed this spelling mistake in the conclusion of the article.