Peer-Review Record

Sparse Label Assignment for Oriented Object Detection in Aerial Images

by Qi Ming, Lingjuan Miao, Zhiqiang Zhou, Junjie Song and Xue Yang
Reviewer 1: Anonymous
Reviewer 2:
Remote Sens. 2021, 13(14), 2664; https://doi.org/10.3390/rs13142664
Submission received: 4 June 2021 / Revised: 30 June 2021 / Accepted: 1 July 2021 / Published: 7 July 2021
(This article belongs to the Section Remote Sensing Image Processing)

Round 1

Reviewer 1 Report

Major Comments

  • Overall, this paper is well-organized and easy to read.
  • The novelty of this paper lies in its label assignment strategy, SLA (Sparse Label Assignment), the adoption of the position-sensitive feature pyramid network for feature extraction, and the novel distance rotated IoU loss.
  • Various performance results are included, providing evidence for the efficacy of the proposed methods as well as the high performance of the paper’s model compared to other models in the literature on three different datasets.
  • The discussion of existing work on object detection in aerial images, and specifically on label assignment strategies, is very brief in the Related Work section considering its relevance to this paper.

Minor Comments

  • The authors are encouraged to provide a better explanation of the notation in Section 3.2, specifically in Equations 2 and 3.
  • Table 1 in Section 4 is a bit hard to read; some explanation could be added in the corresponding section or in the caption.
  • There are a few minor issues concerning the linguistic quality of the manuscript. More specifically:
  • line 28: “Due to that the objects”
  • lines 30–31
  • line 37: “leads”
  • line 109: “slower”
  • line 138: “to aggregates”
  • line 144: “helps for detecting objects”
  • line 243: “Motivate”
  • line 461: “which are superior many advanced”

Author Response

Please see the attachment

Author Response File: Author Response.pdf

Reviewer 2 Report

remotesensing-1268079: Sparse Label Assignment for Oriented Object Detection in Aerial Images by Qi Ming, Lingjuan Miao, Zhiqiang Zhou, Junjie Song, Xue Yang

I suggest “Accept” after “Major Revision”. Good work!

The following are some of my exchanges with the authors, which might improve this paper and help present it better to future readers.

(1) Line 2. “Divide” what? Into what? Or do you mean “distribute”? A one-to-one assignment? Please rewrite this sentence.

(2) Line 3 to Line 5. I cannot understand this point. What is the definition of “a suboptimal learning process”? Why do the preset anchors bring duplicate detections and missed detections? In which cases do the duplicate detections occur, and in which cases do the missed detections occur? Do the two problems come from the preset anchors or from the label assignment strategy? Why can the preset anchors be regarded as a label assignment strategy? Does “label” mean the ground truths?

(4) Line 4. What is the meaning of “densely arranged samples”? Does it mean that the training samples are in order? Likewise for the “densely arranged objects” in Line 9. Ships parked side by side, as in HyperLi-Net?

(5) Line 6. Is SLA anchor-free? Please state why you select the “sparse” labels. To suppress false alarms?

(6) Line 7. What is the meaning of the “inconsistency” between classification (CLS) and regression (REG)?

(7) Line 8. What is the meaning of “balanced training”? Do the negative and positive samples reach a numerical balance, as in Libra R-CNN?

(8) Line 11 to Line 12. I do not understand “between … and …”.

(9) Line 21. Add some literature review about SAR and hyperspectral remote sensing applications, e.g., “High-Speed Ship Detection in SAR Images Based on a Grid Convolutional Neural Network”.

(10) Line 26. What about the negative samples? Are they discarded entirely?

(11) Line 29. How many anchors? Please give the number. However, more anchors bring better fitting benefits; there are more anchors in “Depthwise Separable Convolution Neural Network for High-Speed SAR Ship Detection”, and this brings better accuracy.

(12) Line 32. Replace “intractable”.

(13) Line 34. Please cite RetinaNet.

(14) Line 35 to Line 36. I do not understand it. See (6).

(15) The GitHub repository is empty.

(16) Line 37. Is this right? NMS is an indispensable tool to remove them.

(17) What is the difference between this work and your previous AAAI work, “Dynamic Anchor Learning for Arbitrary-Oriented Object Detection”? Their motivations are the same.

(18) Line 40 to Line 44. Why not adjust the threshold of NMS? Or can soft-NMS handle this problem? See the soft-NMS practice in LS-SSDD-v1.0.
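
For readers following this exchange: soft-NMS decays the scores of overlapping boxes instead of deleting them outright, which is why it can recover densely packed true positives. A minimal horizontal-box sketch follows (function names are ours, and the manuscript’s oriented boxes would need a rotated-IoU routine instead of this axis-aligned one):

```python
import numpy as np

def pairwise_iou(box, boxes):
    """IoU between one [x1, y1, x2, y2] box and an (N, 4) array of boxes."""
    x1 = np.maximum(box[0], boxes[:, 0])
    y1 = np.maximum(box[1], boxes[:, 1])
    x2 = np.minimum(box[2], boxes[:, 2])
    y2 = np.minimum(box[3], boxes[:, 3])
    inter = np.clip(x2 - x1, 0, None) * np.clip(y2 - y1, 0, None)
    area_a = (box[2] - box[0]) * (box[3] - box[1])
    area_b = (boxes[:, 2] - boxes[:, 0]) * (boxes[:, 3] - boxes[:, 1])
    return inter / (area_a + area_b - inter + 1e-9)

def soft_nms(boxes, scores, sigma=0.5, score_thresh=0.001):
    """Gaussian soft-NMS (Bodla et al., 2017): decay overlapping scores
    rather than discarding boxes. Returns kept indices, best first."""
    scores = scores.astype(float).copy()
    idxs = np.arange(len(scores))
    keep = []
    while len(idxs) > 0:
        top = np.argmax(scores[idxs])
        best = idxs[top]
        keep.append(best)
        idxs = np.delete(idxs, top)
        if len(idxs) == 0:
            break
        ious = pairwise_iou(boxes[best], boxes[idxs])
        # Heavier overlap with the kept box -> stronger score suppression.
        scores[idxs] *= np.exp(-(ious ** 2) / sigma)
        idxs = idxs[scores[idxs] > score_thresh]
    return keep
```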

(19) Line 46. Why? More and more anchors can decrease the missed detections.

(20) Line 37 to Line 53. The authors give only a shallow introduction here; a more extensive analysis is needed.

(21) Line 54 to Line 55. The essence is simply that the objects are too small in aerial images.

(22) Line 56. Do you mean selecting the better training samples from the raw datasets?

(23) Line 54 to Line 66. There are too many contributions here. Which are yours, and which are others’? Please cite the previous papers.

(24) Line 75. Does the proposed D-RIoU loss have faster convergence? I cannot find the evidence. How does it compare with, for example, the IoU loss, GIoU, or DIoU?
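
For reference, the IoU-family losses compared here are standardly defined as follows (textbook definitions, not taken from the manuscript), where B is the predicted box, B^{gt} the ground truth, C the smallest enclosing box, ρ the distance between box centers, and c the diagonal length of C:

```latex
\mathcal{L}_{\mathrm{IoU}}  = 1 - \mathrm{IoU}(B, B^{gt}), \qquad
\mathcal{L}_{\mathrm{GIoU}} = 1 - \mathrm{IoU} + \frac{\lvert C \setminus (B \cup B^{gt}) \rvert}{\lvert C \rvert}, \qquad
\mathcal{L}_{\mathrm{DIoU}} = 1 - \mathrm{IoU} + \frac{\rho^{2}(\mathbf{b}, \mathbf{b}^{gt})}{c^{2}}
```

Convergence claims for a new variant are usually supported by loss curves against these baselines.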

(25) Line 78. “With little additional overhead”: I cannot find the evidence.

(26) Line 129. “it’s” → “it is”.

(27) Line 132. What are “the special cases”?

(28) Please give an overall introduction to Figure 2. In Figure 2, what is the CAM? Why is there no up-sampling branch between the top and second-top levels of the FPN? What is the BCE loss?
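
For clarity on the last question: BCE is the standard binary cross-entropy loss over a predicted probability p and a binary label y (standard definition, not quoted from the manuscript):

```latex
\mathcal{L}_{\mathrm{BCE}}(p, y) = -\bigl[\, y \log p + (1 - y)\log(1 - p) \,\bigr], \qquad y \in \{0, 1\}
```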

(29) In Figure 2, I cannot find the position-sensitive feature pyramid network (PS-FPN).

(30) I find that Figure 3 is similar to Guided Anchoring.

(31) I find that Algorithm 2 is similar to the IoU-balanced sampling in Libra R-CNN.
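
For context, IoU-balanced sampling in Libra R-CNN buckets the negatives by their IoU with the ground truth and draws evenly from each bucket, so hard negatives are not drowned out by the many easy ones near IoU = 0. A minimal sketch of that baseline (our own illustrative code, not Algorithm 2 from the manuscript):

```python
import numpy as np

def iou_balanced_neg_sampling(neg_ious, num_needed, num_bins=3, rng=None):
    """Pick `num_needed` negatives, spread evenly across IoU bins
    (in the spirit of Libra R-CNN, Pang et al., 2019).

    neg_ious: (N,) max IoU of each negative with any ground-truth box.
    Returns an integer array of selected indices.
    """
    rng = rng or np.random.default_rng()
    max_iou = float(neg_ious.max())
    per_bin = num_needed // num_bins
    chosen = []
    for k in range(num_bins):
        lo = max_iou * k / num_bins
        hi = max_iou * (k + 1) / num_bins
        upper = neg_ious <= hi if k == num_bins - 1 else neg_ious < hi
        in_bin = np.where((neg_ious >= lo) & upper)[0]
        take = min(per_bin, len(in_bin))
        if take > 0:
            chosen.extend(rng.choice(in_bin, size=take, replace=False))
    # Top up from the remaining pool if some bins were under-populated.
    if len(chosen) < num_needed:
        rest = np.setdiff1d(np.arange(len(neg_ious)), chosen)
        extra = min(num_needed - len(chosen), len(rest))
        chosen.extend(rng.choice(rest, size=extra, replace=False))
    return np.asarray(chosen, dtype=int)
```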

(32) Figure 4. Please label the sub-figures (a), (b), and (c).

(33) Figure 4. Why do you use X-GAP and X-GMP instead of GAP and GMP? Please give some evidence.

(34) Line 255 to Line 257. How do these claims hold when compared with CBAM and SE?

(35) Figure 4. Two sigmoid functions are used. Do they not seriously dilute the feature maps?
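
For context on comments (33) to (35): the reference design they compare against is CBAM-style channel attention, where GAP and GMP descriptors share an MLP and one sigmoid gate rescales the channels. A minimal PyTorch sketch of that baseline follows (standard CBAM pattern, not the manuscript’s X-GAP/X-GMP module):

```python
import torch
import torch.nn as nn

class ChannelAttention(nn.Module):
    """CBAM-style channel attention (Woo et al., 2018), shown only to
    illustrate the GAP + GMP + sigmoid pattern discussed above."""

    def __init__(self, channels, reduction=16):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(channels, channels // reduction),
            nn.ReLU(inplace=True),
            nn.Linear(channels // reduction, channels),
        )

    def forward(self, x):                 # x: (B, C, H, W)
        avg = x.mean(dim=(2, 3))          # global average pooling (GAP)
        mx, _ = x.flatten(2).max(dim=2)   # global max pooling (GMP)
        gate = torch.sigmoid(self.mlp(avg) + self.mlp(mx))
        # The sigmoid emits values in (0, 1): it rescales channels rather
        # than zeroing them, which is the "dilution" effect to check.
        return x * gate.view(x.size(0), -1, 1, 1)
```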

(36) Table 1 and other tables. Please show AP-S, AP-M, AP-L, AP, AP50, AP75, etc., as used for the COCO datasets.
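
For clarity, these are the COCO-style metrics; with ground truth and detections exported in COCO JSON format, pycocotools reports all of them at once (file names below are hypothetical placeholders):

```python
from pycocotools.coco import COCO
from pycocotools.cocoeval import COCOeval

coco_gt = COCO("annotations.json")             # ground-truth annotations
coco_dt = coco_gt.loadRes("detections.json")   # model detections
evaluator = COCOeval(coco_gt, coco_dt, iouType="bbox")
evaluator.evaluate()
evaluator.accumulate()
evaluator.summarize()   # prints AP, AP50, AP75, AP-S, AP-M, AP-L
```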

(37) Table 3. What are the lightweight operations? Parameter reduction? As in “ShipDeNet-20: An Only 20 Convolution Layers and <1-MB Lightweight SAR Ship Detector”?

(38) Many tricks are proposed, and they are tiring for readers. Have you tried some traditional features, as in HOG-ShipCLSNet, or others, e.g., “Injection of Traditional Hand-Crafted Features into Modern CNN-Based Models for SAR Ship Classification: What, Why, Where, and How”? These can be selected as future work for inspiration.

(39) Figure 8. The ships are all parked at ports. Please show some results for both inshore and offshore scenes. Moreover, have you considered the “Balance Scene Learning Mechanism for Offshore and Inshore Ship Detection in SAR Images”? This can also be selected as future work for inspiration.

Author Response

Please see the attachment

Author Response File: Author Response.pdf

Round 2

Reviewer 2 Report

Thank you for the authors’ response. Good work. I am glad to recommend your manuscript for publication.
