A Local-Sparse-Information-Aggregation Transformer with Explicit Contour Guidance for SAR Ship Detection
Round 1
Reviewer 1 Report
1. Over the last two years, many different approaches to ship detection have been proposed, including anchor-free methods. The authors are therefore advised to review the literature and include recent approaches that are very similar to the proposed method, not only in ship detection but in object detection in general.
2. The proposed method is somewhat novel.
3. The abstract presents the proposed method as if the transformer operation were already known to the reader. It should be briefly described in the abstract.
4. In Section 3.1.1 the Swin Transformer is described, and below Fig. 2 only computational complexity is discussed. In my opinion, the efficiency of the structure compared with existing methods should be described, and the motivation behind the structure depicted in Fig. 2 is missing.
5. It is not clearly described how DETR and the sparse information aggregation Transformer are connected. Try to depict this in Figs. 3-5.
6. Contour shape detection cannot be used in its present form. The authors should use an edge detector adapted to SAR data.
7. SAR images are too dark in the manuscript.
8. The compared methods are not well selected; try to compare against methods with a similar architecture.
9. False alarms should be depicted and discussed in detail.
10. The learning procedure of the proposed method should be described.
11. All unknown parameters and the image and network sizes should be reported (w, h, H, W, \alpha, etc.).
12. Section 4.1.3 is double referenced, and it should answer all questions regarding implementation.
Author Response
Please see the attachment.
Author Response File: Author Response.docx
Reviewer 2 Report
This paper deals with the ship detection problem for SAR imagery. The authors address this problem using a transformer-type approach to handle the clutter. Overall, the paper is interesting and the results are promising. I have the following comments:
1. Please compare the proposed approach with state-of-the-art ship detection approaches such as [1] and [2].
2. The authors should explain in more detail why the proposed approach is derived from the Swin Transformer. At present, the introduction of the Swin Transformer in Section 3.1 is quite abrupt.
3. Motion error is a key problem in the practical application of SAR image formation [3]. If there are motion errors, the SAR images will be unfocused or blurred. Is the proposed approach robust to image quality degradation stemming from motion errors? Adding experiments may be intractable, but the authors could discuss this aspect.
4. A language revision is necessary to fix some minor grammatical errors; e.g., the misuse of prepositions affects the readability of the paper.
5. The authors should pay attention to the format of references.
[1] X. Zhang, X. Yang, D. Yang, F. Wang and X. Gao, "A Universal Ship Detection Method with Domain-Invariant Representations," in IEEE Transactions on Geoscience and Remote Sensing, 2022, doi: 10.1109/TGRS.2022.3200957.
[2] Y. Li, Z. Ding, C. Zhang, Y. Wang and J. Chen, "SAR Ship Detection Based on Resnet and Transfer Learning," IGARSS 2019 - 2019 IEEE International Geoscience and Remote Sensing Symposium, 2019, pp. 1188-1191, doi: 10.1109/IGARSS.2019.8900290.
[3] W. Pu, "SAE-Net: A Deep Neural Network for SAR Autofocus," in IEEE Transactions on Geoscience and Remote Sensing, doi: 10.1109/TGRS.2021.3139914.
Author Response
Please see the attachment.
Author Response File: Author Response.docx
Reviewer 3 Report
This paper proposes a Local Sparse Information Aggregation Transformer with Explicit Contour Guidance for SAR ship detection. Overall, the paper is well organized and the presentation is clear. However, there are still some crucial problems that need to be carefully addressed before possible publication. More specifically:
1. The motivations and remaining challenges are not clear, nor is it clear what issues or difficulties this task faces. Please give more detail and discussion of the key problems solved in this paper and how they differ from those addressed in existing works.
2. A deeper literature review should be given, particularly of advanced and SOTA deep learning or AI models in remote sensing data processing and analysis. The reviewer therefore suggests discussing related works by analyzing papers such as the following in the revised manuscript, e.g., 10.1109/TGRS.2020.3016820.
3. What are the main differences between the proposed technique and existing methods?
4. What is the computational complexity of the proposed method?
5. Please clarify the novelty of the proposed method. What is the main difference between the proposed method and existing methods, e.g., the ORSIm detector?
6. Some future directions should be pointed out in the conclusion.
Author Response
Please see the attachment.
Author Response File: Author Response.docx
Round 2
Reviewer 1 Report
The authors have successfully answered all of my questions; the paper can be accepted in its present form.
Author Response
Thank you for your comments on our manuscript and for recommending acceptance of our paper.
Reviewer 3 Report
This paper proposes a Local Sparse Information Aggregation Transformer with Explicit Contour Guidance for SAR ship detection. Overall, the paper is well organized and the presentation is clear. However, there are still some crucial problems that need to be carefully addressed before possible publication. More specifically:
1. The motivations and remaining challenges are not clear, nor is it clear what issues or difficulties this task faces. Please give more detail and discussion of the key problems solved in this paper and how they differ from those addressed in existing works.
2. A deeper literature review should be given, particularly of advanced and SOTA deep learning or AI models in geospatial object detection. The reviewer therefore suggests discussing related works by analyzing papers such as the following in the revised manuscript, e.g., the ORSIm detector and Fourier-based rotation-invariant feature boosting.
3. What are the main differences between the proposed technique and existing methods?
4. What is the computational complexity of the proposed method?
5. Some future directions should be pointed out in the conclusion.
Author Response
Please see the attachment.
Author Response File: Author Response.docx