The Circular U-Net with Attention Gate for Image Splicing Forgery Detection
Round 1
Reviewer 1 Report
Summary: The article under consideration starts from the assumption that existing image tampering detection methods, based on deep learning with prompt recognition, are not suitable for small forgery areas. To address this gap, the article proposes a neural network architecture (CAU-Net). The proposed architecture employs (1) Residual Propagation and Feedback, (2) an Attention Gate, and (3) Atrous Spatial Pyramid Pooling.
Required improvements:
(1) The Introduction does not specify the target research problem. The abstract states that the limitation of existing methods is the detection of small forgery areas, but this point is not discussed at all in the Introduction. It is therefore suggested that the Introduction be rewritten. The information can be presented in the following order: (1) background on the research area; (2) a clear explanation of the target problem; (3) limitations of existing methods in addressing the target problem; (4) what has been proposed to address the target problem; (5) why the proposal is an attractive option for solving the target problem; (6) how the proposal has been validated; (7) what results were achieved and their significance; (8) a summary of the contributions in light of the proposed method and the achieved results.
(2) The Related Work section does not thoroughly cover the state of the art. Recent image splicing forgery detection methods must be classified into categories and discussed. A comparative table can be used to show the pros and cons of the various methods. Last but not least, the novelty of the proposed method must be highlighted: how does the proposed method differ from the state of the art in terms of technique?
(3) The achieved performance must be compared with state-of-the-art methods. The methods selected for comparison are either old or published in conference proceedings.
(4) The Discussion section is very brief. The achieved results must be discussed in greater detail, highlighting their significance and limitations.
Minor comments:
· It is better to cite the articles in order. For example, the first paragraph in Section 1 cites [25-28]. The citations should be in numerical order.
· Typos must be removed. Some examples:
Line 3: Howerer -> However
Line 5: Residual Proagation -> Residual propagation
Line 9: has better -> has a better
Line 16: retouching.But in -> retouching. But in
The authors are advised to comprehensively revise the entire article from an English language point of view.
Author Response
Please see the attachment.
Author Response File: Author Response.pdf
Reviewer 2 Report
1. The abstract should be rewritten to focus on why the study was done, what results were achieved, how those results differ from the existing literature, and what the significance of the work is.
2. The literature review or related work section requires more detailed information on the topic. The authors should provide a more critical evaluation of the existing literature.
3. The conclusion and the discussion should be presented as two separate sections and should provide a comprehensive discussion of the results obtained in the context of the existing literature.
In addition to my previous comments, please find further comments:
1. The authors should elaborate on how the pixel size of images could affect the forgery, in the context of the work done in the article and of the literature.
2. The authors should illustrate the significance of using the CNN algorithm in the article with regard to the existing literature.
Please add the above comments to my previous ones when sending the report to the authors.
Author Response
Please see the attachment.
Author Response File: Author Response.pdf
Reviewer 3 Report
1. Most of the articles in the literature are outdated, not from the last year or two.
2. The multiplication of equation 1 should be written in a standard way. Also, the "s" function is not mentioned in the paper.
3. On page 5, the description of Figure 5 mentions a "1 by 1 by 1 convolutional kernel operation". What is the theoretical meaning of this? Is it an error? Please also confirm the "1 by 1" convolution in Figure 6.
4. The multiplication of equation 3 should be written in a standard way.
5. This study does not explain why the Evaluation Metrics of Equation 3 should be used, especially since there are many metrics to choose from and they all have different meanings.
6. The better results in Table 1 are suggested to be shown in bold.
7. The quantitative metrics in Table 1 show that the structure proposed in this study is good, but the results in Figure 7 do not show its strengths, especially for small details. The abstract nevertheless claims that small details can be detected. Please correct the statement or explain it in more detail in the article.
8. In the subsection of Compared Detection Methods, the importance of each component to this framework is shown. The importance of CAU-Net, MCNL-Net, and RRU-Net to the overall architecture can be seen from the metrics, but it also raises some questions about why this architecture needs to consider the U-Net and Attention U-Net architecture, which will make the overall training time longer and the performance worse. These issues are not discussed and explained in the article.
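As context for comment 5, the sketch below shows how pixel-level precision, recall, and F1 are commonly computed for forgery localization masks. It is an illustration only: this report does not state which metrics Equation 3 defines, and the function name `pixel_scores` and the toy masks are hypothetical.

```python
def pixel_scores(pred_mask, true_mask):
    """Pixel-level precision, recall and F1 for binary masks (1 = spliced pixel)."""
    tp = fp = fn = 0
    for pred_row, true_row in zip(pred_mask, true_mask):
        for p, t in zip(pred_row, true_row):
            if p and t:
                tp += 1          # spliced pixel correctly flagged
            elif p and not t:
                fp += 1          # genuine pixel wrongly flagged
            elif t and not p:
                fn += 1          # spliced pixel missed
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# Toy example (assumed data): a two-pixel spliced region, one hit, one miss,
# and one false alarm elsewhere in the image.
true_mask = [[0, 0, 0, 0],
             [0, 1, 1, 0],
             [0, 0, 0, 0]]
pred_mask = [[0, 0, 0, 0],
             [0, 1, 0, 0],
             [0, 0, 0, 1]]

print(pixel_scores(pred_mask, true_mask))
```

Because each metric weighs missed pixels and false alarms differently, a small spliced region can shift these scores sharply, which is one reason the choice of metric deserves a justification.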
Author Response
Please see the attachment.
Author Response File: Author Response.pdf
Reviewer 4 Report
- The sentence “Since the scale .. different” is not complete (p. 5, line 156).
- Check for typos in the words “moudles” (p. 6, line 195) and “Proagation” (p. 1, line 5).
- Some more information on the residual feedback would be helpful (p. 4, line 116).
Author Response
Please see the attachment.
Author Response File: Author Response.pdf
Reviewer 5 Report
There are minor discrepancies that can be corrected; they do not diminish the value of the work in any way. Minor updates:
- It is suggested to add to some of the related work that already exists by focusing on the problem or how to solve it. This is the reason for the existence of the section: "Related Works."
- "Attention Gate" - it is necessary to detail the scope of architecture.
- Figure 6 information needs to be updated.
- Since there is already a very similar result, it is necessary to compare them and present the differences. Authors have to take this into account because it is not enough to describe similes by changing the picture. It is necessary to compare and detail the container architecture proposed by the author.
Comments for author File: Comments.pdf
Author Response
Please see the attachment.
Author Response File: Author Response.pdf
Reviewer 6 Report
This paper presents a neural network with an end-to-end approach for detecting image forgery, in particular splicing-type forgery. The proposed CAU-Net architecture is based on a circular U-Net with additional mechanisms that enhance its performance. The added Atrous Spatial Pyramid Pooling block, which allows correlations to be studied at different scales of the image, and the Attention Gate mechanism, which reduces the weight of genuine regions and increases the weight of spurious regions when passing information from encoder layers to decoder layers, are described in detail in the paper. The paper presents the results of quality measurement, and an ablation study is performed to analyze the usefulness of Atrous Spatial Pyramid Pooling and the Attention Gate. The experiments seem to be conducted correctly. The article emphasizes the importance of accurate and representative solutions for detecting image spoofing, irrespective of the image creation methods and other conditions. The network architecture is clearly presented, and the main contributions of the authors seem clear and relevant.
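The gating behaviour summarized above can be illustrated with a minimal per-pixel sketch. It assumes the additive attention gate form used in Attention U-Net, which may differ from the gate in the manuscript; the function names, weights, and feature values below are hypothetical.

```python
import math


def matvec(matrix, vector):
    """Multiply a small weight matrix by a feature vector (a 1x1 convolution at one pixel)."""
    return [sum(w * v for w, v in zip(row, vector)) for row in matrix]


def attention_gate(x, g, w_x, w_g, psi, bias=0.0):
    """Gate one pixel's encoder (skip) feature x using the decoder gating feature g.

    alpha = sigmoid(psi . relu(W_x x + W_g g) + bias); returns (alpha, alpha * x).
    """
    hidden = [max(0.0, a + b) for a, b in zip(matvec(w_x, x), matvec(w_g, g))]
    score = sum(p * h for p, h in zip(psi, hidden)) + bias
    alpha = 1.0 / (1.0 + math.exp(-score))
    return alpha, [alpha * v for v in x]


# Assumed toy values: identity projections and a bias chosen so the score is zero.
identity = [[1.0, 0.0], [0.0, 1.0]]
alpha, gated = attention_gate(
    x=[1.0, 2.0], g=[0.5, -0.5],
    w_x=identity, w_g=identity, psi=[1.0, 1.0], bias=-3.0,
)
print(alpha, gated)
```

A coefficient close to 0 suppresses the encoder feature at that pixel and a coefficient close to 1 passes it through, which is how such a gate can down-weight genuine regions before the skip connection reaches the decoder.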
Remarks and issues:
1. The introduction states that detection methods can be classified into three categories: the forgery trace detection, the inherent property consistency of imaging devices, and the intrinsic statistical features of images. The content of this classification is not disclosed and no reference is given for more details. Besides, the authors refer to the work [12] as an example for both the second category and the third, which adds to the misunderstanding. Therefore, I think this classification needs to be described in more detail.
2. In the introduction the reasoning is that traditional methods are only able to detect one forgery technique, because their manual signs are not able to detect multiple forgery methods simultaneously. Since there are methods, e.g. based on the JPEG region compression history, which can detect both copy-move and splicing forgery techniques, it is not entirely clear what the authors mean by image forgery techniques. Next comes an example with image capture device-based methods that fail if the two images in splicing were taken from devices of the same model. It is not clear how the device model is related to forgery techniques, since splicing operations do not change in this case. Please add a definition for the terms "forgery techniques", "forgery methods" and "forgery operations" and tell the reader where splicing belongs.
3. Paragraph 3.1 says that W_i denotes the parameters of the i-th layer, so the notation W_s(x) in formula 1 is confusing. In addition, it is not clear whether s is fixed for all blocks or not. It is also not clear whether W_s is a fixed linear mapping or a trainable one. Better notation should be introduced and these nuances clarified.
4. In Section 3.3, I understand that g and x_l denote outputs from the two layers of the block. I think this should be stated explicitly.
5. In describing the choices made while designing the contribution to the network architecture, the authors claim that some choices lead to better convergence of the network during training. However, no such experiment is presented, and thus this claim cannot be validated.
6. There is a large number of typos and English mistakes in the text. I would suggest very careful proofreading and fixing the mistakes before submitting a revision.
Given the number of issues, I recommend reconsidering the manuscript after a major revision.
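Regarding remark 3, one possible clearer notation is sketched below. It assumes that formula 1 follows the standard residual shortcut form; the manuscript's actual definition is not reproduced in this report, so the symbols \mathcal{F}, W_s^{(k)}, and the block index k are illustrative assumptions.

```latex
% Output of the k-th residual block with input x_k.
% W_i: trainable parameters of the i-th layer inside the block.
% W_s^{(k)}: shortcut mapping of block k (assumed trainable, e.g. a 1x1 convolution),
%            needed only to match the shape of F; otherwise the identity.
y_k = \mathcal{F}\bigl(x_k, \{W_i\}\bigr) + W_s^{(k)} x_k
```

Writing the shortcut as a product W_s^{(k)} x_k rather than as a function W_s(x) makes it explicit that it is a linear mapping, and the superscript states whether it may differ between blocks.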
Author Response
Please see the attachment.
Author Response File: Author Response.pdf
Round 2
Reviewer 1 Report
The comments have been addressed.
The article can be published in its current form.
Reviewer 2 Report
Paper is significantly improved. Good Work.
The only suggestion for the authors is to add a table of the symbols and abbreviations used in the article, for the better understanding of readers.
Reviewer 3 Report
It's okay.
Reviewer 6 Report
I appreciate the extensive revision the authors performed. All my comments to the previous revision have been sufficiently addressed.