When Federated Learning Meets Watermarking: A Comprehensive Overview of Techniques for Intellectual Property Protection
Round 1
Reviewer 1 Report
In this survey paper, the authors worked on ownership rights protection in the context of Machine Learning (ML) and watermarking. Briefly, they provide an overview of recent advancements in Federated Learning watermarking, shedding light on the new challenges and opportunities that arise in this field. The manuscript is well written; however, my comments are given below.
1) The authors should add their motivation along with the contribution.
2) The authors should also add the benefits of this research in the introduction section.
3) It is suggested to use a tabular comparison in the related work section (in the revised version).
4) The quality of the images in Figure 1 is low; it should be improved in the revised version.
5) The authors should add future work in the conclusion section.
6) The authors should add a new section and discuss the future challenges and the remedies in this direction.
The English is fine.
Author Response
Dear Reviewer,
We would first like to thank you for providing us with the opportunity to improve our paper entitled "When Federated Learning Meets Watermarking: A Comprehensive Overview of Techniques for Intellectual Property Protection" for potential publication in the MAKE journal. We greatly appreciate the time you devoted to providing insightful remarks.
In the following, we detail point by point how we respond to the judicious observations, recommendations, and concerns put forth by the reviewers. These corrections are also highlighted in red within our revised manuscript (in the attachment). We hope we have properly answered all the raised concerns. We remain at your disposal for any additional questions.
Comments 1 : The authors should add their motivation along with the contribution.
Response 1 : This comment has been addressed. We added a new paragraph in Section 1.1 (Contribution), page 2, that describes our motivation for this work. We conducted this work to provide a comprehensive analysis of the interest of watermarking in federated machine learning, pointing out recent advances and the remaining challenges to address.
Comments 2 : The authors should also add the benefits of this research in the introduction section.
Response 2 : This comment has been addressed by adding a paragraph in Section 1.1 (Contribution), page 2, that describes the benefits our work can provide to the research community, which also complements the previous comment.
Comments 3 : It is suggested to use a tabular comparison in the related work section (in the revised version).
Response 3 : Our paper contains two related-work sections: one on centralized DNN watermarking and the other on federated learning (FL) watermarking. To address the reviewer's comment, several comparison tables have been added for FL watermarking (see Sections 3 and 4). Regarding centralized DNN watermarking, we did not add new elements because this kind of technique is not the main concern of our paper.
Comments 4 : The quality of the images in Figure 1 is low; it should be improved in the revised version.
Response 4 : It is not possible to improve the quality of the images as requested because their resolution is only 28x28 pixels. They are part of the reference test datasets CIFAR10 and MNIST. We have added the references of these datasets in the caption of the figure, and we have reduced the size of the figure to avoid visual artifacts.
Comments 5 : The authors should add future work in the conclusion section.
Response 5 : This comment has been addressed. The conclusion has been revised in depth. It returns to the main challenges to focus on, as well as the future work our team will carry out. In particular, we will implement the presented methods to compare them in a unified framework.
Comments 6 : The authors should add a new section and discuss the future challenges and the remedies in this direction.
Response 6 : As requested by the reviewer, we added a new section, "Perspectives", where we point out upcoming FL watermarking challenges. These include: i) the need to test current methods and develop new ones in the context of more complex applications; ii) reproducibility, with the need for standardized experimental frameworks; iii) the need to investigate secure ownership verification protocols; iv) the need for open-source federated learning frameworks that natively integrate an FL watermarking pipeline, for instance for testing.
Author Response File: Author Response.pdf
Reviewer 2 Report
In this paper, the authors present an overview of recent advancements in Federated Learning watermarking, shedding light on the new challenges and opportunities that arise in this field. These are my comments on the paper:
1. The literature review section of the paper is weak and needs to be extended. Some comparative tables should be added, similar to Table 3.
2. Table captions should be added above the tables. A table should be mentioned first in the text and then presented.
3. "Yang et al. scheme [94]" should not be the title of a sub-heading.
4. The discussion section must provide some concise information in the form of a table or bullet points to present the essence of this comparative analysis. Also, some recommendations would be nice.
5. The conclusion section should also be rewritten. It must give some concrete conclusions and future directions.
Moderate language revision is required.
Author Response
Dear Reviewer,
We would first like to thank you for providing us with the opportunity to improve our paper entitled "When Federated Learning Meets Watermarking: A Comprehensive Overview of Techniques for Intellectual Property Protection" for potential publication in the MAKE journal. We greatly appreciate the time you devoted to providing insightful remarks.
In the following, we detail point by point how we respond to the judicious observations, recommendations, and concerns put forth by the reviewers. These corrections are also highlighted in red within our revised manuscript (in the attachment). We hope we have properly answered all the raised concerns. We remain at your disposal for any additional questions.
Comments 1 : The literature review section of the paper is weak and needs to be extended. Some comparative tables should be added, similar to Table 3.
Response 1 : This comment has been addressed. Several comparison tables have been added in Sections 3 and 4, describing the experiments conducted by each proposal from the literature and comparing black-box schemes. In addition, we have extended the descriptions of some existing works, providing more details (attacks, security assumptions, and so on). It is important to note that the number of papers on federated learning (FL) watermarking is rather small today. All of them are listed in the paper.
Comments 2 : Table captions are added above the table. The table should be mentioned first in the text and then the table should be presented.
Response 2 : This issue has been fixed. Captions are above tables, and tables now appear just after they are first mentioned in the text.
Comments 3 : "Yang et al. scheme [94]" should not be the title of a sub-heading.
Response 3 : Thank you for highlighting this issue, which has been corrected.
Comments 4 : The discussion section must provide some concise information in the form of a table or points to present the extract of this comparative analysis. Also, some recommendations would be nice.
Response 4 : This comment has been addressed. We have added a summary paragraph with concise pieces of information at the end of the discussion section. It underlines the key findings of our analysis regarding FL watermarking solutions on the client or server side, the investigation of other FL architectures and applications, and the need to take into consideration non-IID data distribution scenarios.
We have also added a recommendation section (see Section 5) that includes the challenges to address today: i) the need to test current methods and develop new ones in the context of more complex applications; ii) reproducibility, with the need for standardized experimental frameworks; iii) the need to investigate secure ownership verification protocols; iv) the need for open-source federated learning frameworks that natively integrate an FL watermarking pipeline, for instance for testing.
Comments 5 : The conclusion section should also be rewritten. It must give some concrete conclusions and future directions.
Response 5 : In accordance with the reviewer's comment, the conclusion has been rewritten in depth. It now returns to the main challenges to focus on, as well as the future work our team will carry out. In particular, we will implement the presented methods to compare them in a unified framework.
Comments 6 : Moderate language revision is required.
Response 6 : English writing has been carefully revised.
Author Response File: Author Response.pdf
Reviewer 3 Report
In this manuscript, the authors discuss the concept of Federated Learning (FL) and its application in training Deep Neural Networks (DNN) without centralized data, highlighting its privacy-preserving advantages. The paper identifies a significant problem: the vulnerability of shared model information to theft or unauthorized distribution. In addition, the paper aims to address this by exploring watermarking methods specifically designed for FL. Despite the good structure of the manuscript and the satisfactory use of the English language, some points need attention to improve the paper's quality:
- In the introductory section, the limitations and novelty of this work are not clear or are only briefly specified. I suggest including more bibliographical references on this topic to make a proper comparison and highlight your contributions to the research;
- Although the paper provides a comprehensive idea and presentation of the proposed FL modular architecture (particularly on the connection between federated aggregator and central entity), a better definition of the workflow used and the federated aggregator algorithm is missing. Please improve this aspect.
- From a technical point of view, the paper lacks a description of the software used for the implementation (e.g. TensorFlow or other frameworks). Please, include this essential technical aspect.
- The empirical results need a proper comparison with a minimal configuration of classical approaches. Moreover, the paper needs an explanation of the performance part (how/if you divided non-iid data or use cross-validation), and the training phase (show a plot to see if the models are learning, etc). I recommend improving the paper by focusing on this crucial part of your case study;
- The conclusion section needs a clear exposition on how the initial goal of the study was achieved, I suggest this part to support these with adequate numerical results and references, offering a more critical/discursive view of future research both on the FL approach and the scenario.
The quality of English in the manuscript is generally good, with clear sentence structures and appropriate use of technical jargon. While this may be suitable for a specialized audience, definitions or simplified explanations could enhance clarity. Although the English is of a high standard and suitable for academic publication, minor refinements could enhance its readability and accessibility.
Author Response
Dear Reviewer,
We would first like to thank you for providing us with the opportunity to improve our paper entitled "When Federated Learning Meets Watermarking: A Comprehensive Overview of Techniques for Intellectual Property Protection" for potential publication in the MAKE journal. We greatly appreciate the time you devoted to providing insightful remarks.
In the following, we detail point by point how we respond to the judicious observations, recommendations, and concerns put forth by the reviewers. These corrections are also highlighted in red within our revised manuscript (in the attachment). We hope we have properly answered all the raised concerns. We remain at your disposal for any additional questions.
Comments 1 : In the introductory section, the limitations and novelty of this work are not clear or are only briefly specified. I suggest including more bibliographical references on this topic to make a proper comparison and highlight your contributions to the research;
Response 1 : To address the reviewer's comment, we have added the objective of our work and its benefits for the community (see the Contributions section). We would like to note that the number of papers on federated learning (FL) watermarking is rather small today. All of them are listed in the paper. To the best of our knowledge, only one survey exists on FL watermarking, and it only includes two solutions. Our survey covers the nine existing proposals for FL watermarking.
Comments 2 : Although the paper provides a comprehensive idea and presentation of the proposed FL modular architecture (particularly on the connection between federated aggregator and central entity), a better definition of the workflow used and the federated aggregator algorithm is missing. Please improve this aspect.
Response 2 : This comment has been addressed. In Section 2.1, paragraph 2, more references on aggregation functions have been added. Additionally, we now describe the most commonly used aggregation function, FedAvg [1], in Section 4.2.
[1] McMahan, B., Moore, E., Ramage, D., Hampson, S., & y Arcas, B. A. (2017, April). Communication-efficient learning of deep networks from decentralized data. In Artificial intelligence and statistics (pp. 1273-1282). PMLR.
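To illustrate the principle behind FedAvg mentioned above, the weighted parameter averaging it performs can be sketched as follows (an illustrative sketch in plain Python, not the implementation used by any surveyed method; the function name `fedavg` and the toy values are ours):

```python
def fedavg(client_weights, client_sizes):
    """FedAvg aggregation: average client parameter vectors,
    weighting each client by its number of local training samples."""
    total = sum(client_sizes)
    dim = len(client_weights[0])
    return [
        sum((n / total) * w[i] for w, n in zip(client_weights, client_sizes))
        for i in range(dim)
    ]

# Toy example: two clients, each sharing a 4-parameter model update.
w1 = [1.0, 2.0, 3.0, 4.0]   # client 1, trained on 100 samples
w2 = [3.0, 4.0, 5.0, 6.0]   # client 2, trained on 300 samples
global_weights = fedavg([w1, w2], client_sizes=[100, 300])
# Client 2 holds three times more data, so its parameters dominate:
# global_weights == [2.5, 3.5, 4.5, 5.5]
```

In a real FL round, the server would apply this averaging to the full model parameter tensors received from the selected clients before broadcasting the updated global model.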
Comments 3 : From a technical point of view, the paper lacks a description of the software used for the implementation (e.g. TensorFlow or other frameworks). Please, include this essential technical aspect.
Response 3 : To answer the reviewer's comment, we added a table indicating the experimental settings and software frameworks used by existing FL watermarking methods, when this information was available in the methods' papers. This table and the associated text are located in Section 4.2, page 16, of the revised version of our paper.
Comments 4 : The empirical results need a proper comparison with a minimal configuration of classical approaches. Moreover, the paper needs an explanation of the performance part (how/if you divided non-iid data or use cross-validation), and the training phase (show a plot to see if the models are learning, etc). I recommend improving the paper by focusing on this crucial part of your case study;
Response 4 : To answer this comment, we would first like to recall the objective of our work. It aims to provide a comprehensive analysis of the interest of watermarking in federated machine learning, pointing out recent advances and the remaining challenges to address. Experimentation and comparative assessment of these schemes are part of our team's future work. Indeed, one needs to implement them in a unified framework. Such a framework relies on the development of a federated learning framework that includes an FL watermarking pipeline; to our knowledge, such a pipeline has not yet been designed.
Comments 5 : The conclusion section needs a clear exposition on how the initial goal of the study was achieved, I suggest this part to support these with adequate numerical results and references, offering a more critical/discursive view of future research both on the FL approach and the scenario.
Response 5 : In accordance with the reviewer's comment, the conclusion has been rewritten in depth. It returns to the conclusions of our analysis, pointing out the main challenges the community should focus on. It also states the future work our team will carry out. In particular, we will implement the presented methods to compare them in a unified framework.
Author Response File: Author Response.pdf
Round 2
Reviewer 2 Report
The authors have incorporated most of the comments suggested in the revision. The paper is in much better shape. I don't have any other comments.
Minor language revision is needed.
Author Response
Dear Reviewer,
We sincerely appreciate your valuable comments during the review process, which have greatly improved our document. Your support and insights were invaluable, and we are grateful for the time and effort you devoted to reviewing our work.
We have thoroughly reviewed and addressed language mistakes.
Best regards,
Reviewer 3 Report
- Although the manuscript has undergone revisions to improve its readability and more effectively structure the results, it requires a clearer exposition of the performance evaluation. Specifically, it would be beneficial to elucidate whether there was a division of non-IID data or the application of cross-validation.
- Additionally, integrating a graphical representation to track the learning curves of the models would be valuable. I strongly advise refining the paper by giving special attention to this essential section of your research.
The quality of English in the manuscript is generally good, with clear sentence structures and appropriate use of technical jargon. Only minor refinements could enhance its readability and accessibility.
Author Response
Dear Reviewer,
We sincerely appreciate the opportunity to submit a revised version of our article for potential publication in the MAKE journal. We are grateful for the time and effort you and the other reviewers have dedicated to providing feedback on our manuscript. Your insightful remarks and valuable suggestions have significantly improved our work.
We have diligently incorporated the recommended changes into the manuscript, which are clearly indicated by text in blue. Below, we provide a point-by-point response to your observations and concerns. All page references correspond to the revised manuscript file containing tracked changes.
Comments 1 : Although the manuscript has undergone revisions to improve its readability and more effectively structure the results, it requires a clearer exposition of the performance evaluation. Specifically, it would be beneficial to elucidate whether there was a division of non-IID data or the application of cross-validation.
Response 1: We have addressed this comment by adding a new table (Table 6) in Section 5 to present the performance results of all methods according to their respective parameters set by their authors. This revision helps emphasize that due to variations in data, model architectures, and parameters tested by different authors, comparing these methods without a unified experimental framework is challenging. Our future work will focus on establishing such a standard framework.
Regarding data distribution, all presented methods follow the train/test split setting, where the test set shares the same distribution as the training set (IID data). The manner in which the training data is split among clients may be either IID or non-IID. Therefore, we believe that employing cross-validation techniques such as K-fold or stratified validation is unnecessary. Furthermore, such techniques significantly increase the computational complexity of training deep learning models on large databases, making them less suitable for the context of federated learning. Additionally, sharing confidential client-specific data required for cross-validation is often not feasible.
However, we acknowledge the significance of reporting results with cross-validation in the machine learning context. In future research, we plan to investigate the potential benefits of combining cross-validation and federated learning, particularly when an IID testing set is not available.
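To make the IID/non-IID distinction discussed above concrete, a common way in the FL literature (not necessarily the procedure of any surveyed method) to simulate a non-IID split of labeled training data among clients is label-skewed Dirichlet partitioning. A minimal sketch in plain Python, with function and parameter names of our choosing:

```python
import random

def dirichlet_partition(labels, num_clients, alpha, seed=0):
    """Assign sample indices to clients with a label-skewed distribution.

    For each class, client shares are drawn from a Dirichlet(alpha) prior:
    a small alpha yields a highly skewed (non-IID) split, while a large
    alpha approaches a uniform (near-IID) split.
    """
    rng = random.Random(seed)
    clients = [[] for _ in range(num_clients)]
    for c in sorted(set(labels)):
        idx = [i for i, y in enumerate(labels) if y == c]
        rng.shuffle(idx)
        # Dirichlet sample obtained by normalizing Gamma draws
        draws = [rng.gammavariate(alpha, 1.0) for _ in range(num_clients)]
        total = sum(draws)
        # turn cumulative shares into cut points over this class's indices
        start, cum = 0, 0.0
        for k in range(num_clients):
            cum += draws[k] / total
            end = len(idx) if k == num_clients - 1 else int(round(cum * len(idx)))
            clients[k].extend(idx[start:end])
            start = end
    return clients

# Toy example: 100 samples, 2 classes, 4 clients, strongly skewed split.
labels = [i % 2 for i in range(100)]
parts = dirichlet_partition(labels, num_clients=4, alpha=0.1)
# Every sample lands on exactly one client, but the class proportions
# differ sharply across clients.
```

The test set kept on the server can still be IID, as described above; only the per-client training shards become heterogeneous.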
Comments 2 : Additionally, integrating a graphical representation to track the learning curves of the models would be valuable. I strongly advise refining the paper by giving special attention to this essential section of your research.
Response 2: In response to this comment, we have included a graphical representation (Figure 3, Section 3.2.1) to track the accuracy of the learning process on an IID testing set and the watermark detection rate using the WAFFLE technique. Similar experiments with other methods will be conducted once our unified framework, which is part of our ongoing work, is fully designed within a realistic federated learning scenario.
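For clarity on the second curve tracked in Figure 3: in black-box schemes such as WAFFLE, the watermark detection rate is typically measured as the model's accuracy on the owner's trigger set. A minimal sketch (illustrative only; the function and variable names are ours, not WAFFLE's API):

```python
def watermark_detection_rate(predict, trigger_inputs, trigger_labels):
    """Fraction of trigger samples for which the model outputs
    the watermark label chosen by the owner."""
    hits = sum(
        1 for x, y in zip(trigger_inputs, trigger_labels) if predict(x) == y
    )
    return hits / len(trigger_labels)

# Toy check with a stand-in "model" that maps an input id to a class id.
lookup = {0: 7, 1: 7, 2: 3, 3: 7}
rate = watermark_detection_rate(lookup.get, [0, 1, 2, 3], [7, 7, 7, 7])
# 3 of the 4 trigger samples carry the expected label 7, so rate == 0.75
```

Plotting this rate alongside test accuracy over the federated rounds gives exactly the kind of learning curve shown in Figure 3.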
We have thoroughly reviewed and addressed language mistakes in the manuscript.
Thank you once again for your valuable feedback and consideration of our work.
Sincerely,
Author Response File: Author Response.pdf