Article
Peer-Review Record

HB-YOLO: An Improved YOLOv7 Algorithm for Dim-Object Tracking in Satellite Remote Sensing Videos

Remote Sens. 2023, 15(14), 3551; https://doi.org/10.3390/rs15143551
by Chaoran Yu, Zhejun Feng, Zengyan Wu, Runxi Wei, Baoming Song and Changqing Cao *
Reviewer 1: Anonymous
Submission received: 13 June 2023 / Revised: 10 July 2023 / Accepted: 13 July 2023 / Published: 14 July 2023
(This article belongs to the Section AI Remote Sensing)

Round 1

Reviewer 1 Report (Previous Reviewer 1)

In this paper, the authors explained the novelty of the proposed method. I have the following suggestions:

1. Incorporate these explanations into the paper as part of the description of the proposed method.

2. Please release the code; otherwise, the paper is meaningless.

 


Author Response

Please see the attachment.

Author Response File: Author Response.pdf

Reviewer 2 Report (Previous Reviewer 3)

Thank you for the comprehensive response to the reviewer's comments. The paper is now in an acceptable format. See the attached PDF for minor edits to fix.

Comments for author File: Comments.pdf

Author Response

Please see the attachment.

Author Response File: Author Response.pdf

This manuscript is a resubmission of an earlier submission. The following is a list of the peer review reports and author responses from that submission.


Round 1

Reviewer 1 Report

The authors revised the paper and clarified its contribution. However, this does not change the fact that the novelty of the paper is quite limited. As the authors clarified in their response and in the paper, they applied YOLOv7 and BoTNet to solve object detection in remote sensing videos. Considering this, the paper cannot be published in its current form.


Author Response

Please see the attachment.

Author Response File: Author Response.docx

Reviewer 2 Report

Dear Editor,

Dear Authors,

I have carefully reviewed the resubmitted manuscript and compared it to the previous version. In my opinion, the article has undergone significant improvement, and the authors have made it clearer to understand. They have addressed almost all of my concerns and issues with the previous version. However, my main concern, as I mentioned in the previous round of review, is still the level of innovation. The work combines some well-known concepts and tools for a special application, and I am not sure if the proposed approach can be generalized to a wider range of datasets.

However, I must acknowledge that the manuscript is scientifically and technically sound, and the topic is interesting. The authors have done their best to address my previous concerns and I would fairly let the manuscript be considered for publication if the other reviewers and editors also confirm its quality.

As a final suggestion, I recommend proofreading the entire manuscript and addressing any remaining unclear sentences and grammatical errors. The language has improved significantly from the previous version, but some issues still need attention.

Best wishes,

The reviewer!  


Author Response

Please see the attachment.

Author Response File: Author Response.docx

Reviewer 3 Report

The authors present an adapted YOLO model for dim objects in satellite video. While the method seems to work, the paper needs extensive revision:

There is a lot of literature on dim-object tracking in videos, but only one paper is cited. It is thus not clear what novelty the paper brings. The methodology should be compared to these techniques, not only to a method known to struggle.

Why are there highlighted yellow sections in the paper? 

1. Abstract, ln 2-3: language needs to be fixed

2. Abstract: videos from satellites are very rare, thus the abstract needs to provide more information to the reader why this is important to consider.

3. Abstract ln 5: what does 'ordinary' mean? 

4. Abstract: "As a result, the original..." No motivation or evidence is provided for this statement here nor in the paper but the whole methodology is built on it.

5. ln 17: "fully integrated features" needs to be explained

6. mAP_0.5 is not standard notation - please correct. 

7. ln 44: satellite videos are not common so why should this be considered? Add more motivation and clarity. 

8. lns 44-60 needs citations to back up the statements made

9. ln 63-64: "small proportion of objects"? But there are many other objects in the scene too. Explain the logic here. 

10. All figure captions need to be redone properly - the language is poor, and the way they are written is not standard.

11. ln 77 refers to two categories but then two specific methods are discussed. 

12. ln 80-81 references the only other paper that deals with dim objects - how does your work improve on this? More discussion is needed here, and more literature to place the novelty of your work.

13. ln 87: no hyphen

14. ln 109: why qualitative?

15. ln 119, 124, 127, 136, 144, 148, 153, 155: use only surnames and "et al."

16. ln 148: then

17. Section 3.1 needs to be redone. Acronyms are used for a number of methods that are never explained/defined/referenced. The notation is not consistent, and the section is very difficult to follow; e.g., "Mat" vs. "MatMul" is a poor, non-standard notation choice. Also, I cannot see how Mat would be better. Motivation for the additions needs to be given - why will these improve the model? As written now, they appear ad hoc.

18. lns 212-214: references/evidence needed

19. lns 269-271: add evidence that this is true and needs to be addressed. 

20. lns 272-274: And what about other datasets? This wouldn't be a useful solution then. 

21. lns 305-320: rewrite with better explanation

22. ln 321: inputted

23. lns 348-353: this could be useful rather in the introduction

24. Figures are flowing between pages, e.g., Figure 9.

25. lns 383-390 and 391-396 need references and better explanation

26. Table 1 requires better discussion and explanation. 

27. ln 426: What are these two experiments? 

28. Figure 10: explain the notation in the graphics. Better to improve the graphics. Also Figure 11. 

29. lns 463-465: where are these results shown? 

30. lns 469-471: This doesn't make sense - please elaborate.

31. Figure 13: It is not clear what is happening here, provide more description.

32. ln 481: what is (1)-(6)?

33. ln 484 and onwards: what is C3, C4 etc? This is never defined. 

34. Figure 14 has a different resolution in each pair? Why is this? It is also not at all clear what this figure is showing the reader. 

35. ln 520: incorrect figure reference. 

36. Table 3: What are the methods shown there? These are never discussed nor the acronyms defined. 

37. Table 3 column Rcll: This needs to be discussed.

38. Table 4: what is MOTA and MOTF? Never defined. 

39. ln 540-541: how is this deduced?

40. ln 543: fix the language

41. figure 15: how do you decide what to track? There are many objects. 

Minor edits as indicated already. The language in the Figure captions needs the most attention. 

Author Response

Please see the attachment.

Author Response File: Author Response.docx
