Article
Peer-Review Record

Underwater Small Target Detection Based on YOLOX Combined with MobileViT and Double Coordinate Attention

J. Mar. Sci. Eng. 2023, 11(6), 1178; https://doi.org/10.3390/jmse11061178
by Yan Sun 1, Wenxi Zheng 1,*, Xue Du 2 and Zheping Yan 2
Submission received: 26 April 2023 / Revised: 25 May 2023 / Accepted: 31 May 2023 / Published: 5 June 2023
(This article belongs to the Special Issue Autonomous Marine Vehicle Operations)

Round 1

Reviewer 1 Report

1.1 The main question addressed by the research is the optimization of real-time object detection architectures for underwater applications.

1.2 The topic is not original, and the novelty is not clearly shown in the introduction.

1.3 It is not clearly shown whether the current models are insufficient for the field. I recommend describing a case study in the introduction.

1.4 I also recommend dividing the introduction into an introduction and a review of related work.

1.5 The paper adds a new model architecture to the subject area compared with other published material, but the authors do so without any explanation.

1.6 The main remark on the paper is that the authors should explain or prove every proposal they make.

 

2. Lines 75-77: "In underwater target detection, attention mechanisms are frequently used in feature extraction, and in mobile networks, attention mechanisms have proven their usefulness in computer vision through their ability to achieve high feature extraction at a relatively low cost." The meaning is not clear. What does "high feature extraction" mean? Attention-like layers aim to "highlight" and help retain only the features that are valuable for the target.

3. Lines 52-67: the YOLO family and real-time object detection have advanced well beyond this review; see, for instance, https://paperswithcode.com/task/real-time-object-detection and the YOLOv6 to YOLOv8 papers.

4. Lines 69-74: please show a case as an example.

5. Lines 52-67: this text does not describe YOLOX and its benefits compared with other models.

6. Lines 85-108: the text does not describe MobileViT or the state of the art in this field.

7. Lines 85-108: "Thus we choose MobileVIT[28] to combined with YOLOX[29]." This important point, which can be regarded as the aim of the work, is given without sufficient proof or explanation.

8. "The YOLOX uses mosaic data enhancement during the image pre-processing stage and selects four images from the dataset for stitching and testing, which can enrich image backgrounds." I believe that augmentation cannot be regarded as part of any architecture.

9. Section 2.1 should contain an illustration of YOLOX.

10. Section 2.3: does the layer correspond to the paper https://arxiv.org/abs/2103.02907? I cannot find any reference.

11. Line 253: "Because YOLOX has advantages in image enhancement, target classification, and label classification". This important statement is given without any proof. Please provide either references or studies that support all of these results.

12. Figure 5 must highlight the differences between the original and proposed architectures.

13. The dataset in Section 4.1 does not have any reference. Where can we find URPC2020?

14. If URPC2020 is available anywhere, it would be interesting to compare the authors' results with those of other researchers.

15. To the best of my knowledge, the statement "mAP is the standard to measure the accuracy of the model in target detection." is ambiguous: mAP can mean mAP50, mAP50:95, or another form. The authors should state how their measure is defined.
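As an illustration of the distinction raised in comment 15, the following is a minimal sketch of how mAP50 and mAP50:95 can differ for the same detections. It is not the authors' evaluation code. The function names (`iou`, `average_precision`, `mean_ap`, `map_50_95`), the box format `(x1, y1, x2, y2)`, the class name `"target"`, and the pooling of all boxes as if they came from a single image are assumptions made only for this example.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def average_precision(preds, gts, thr):
    """AP for one class at one IoU threshold.

    preds: list of (score, box); gts: list of boxes.
    Simplification (assumption): all boxes belong to one image.
    """
    if not gts:
        return 0.0
    preds = sorted(preds, key=lambda p: p[0], reverse=True)
    matched, hits = set(), []
    for _, box in preds:
        # Greedily match each prediction to the best unmatched ground truth.
        best, best_j = 0.0, -1
        for j, gt in enumerate(gts):
            if j not in matched:
                v = iou(box, gt)
                if v > best:
                    best, best_j = v, j
        if best_j >= 0 and best >= thr:
            matched.add(best_j)
            hits.append(1)
        else:
            hits.append(0)
    # Precision and recall after each ranked prediction.
    tp, precisions, recalls = 0, [], []
    for k, h in enumerate(hits, start=1):
        tp += h
        precisions.append(tp / k)
        recalls.append(tp / len(gts))
    # Make precision non-increasing (upper envelope), then integrate over recall.
    for i in range(len(precisions) - 2, -1, -1):
        precisions[i] = max(precisions[i], precisions[i + 1])
    ap, prev_r = 0.0, 0.0
    for p, r in zip(precisions, recalls):
        ap += (r - prev_r) * p
        prev_r = r
    return ap


def mean_ap(preds_by_class, gts_by_class, thr):
    """Mean of per-class AP at a single IoU threshold (e.g. 0.5 for mAP50)."""
    aps = [average_precision(preds_by_class.get(c, []), g, thr)
           for c, g in gts_by_class.items()]
    return sum(aps) / len(aps)


def map_50_95(preds_by_class, gts_by_class):
    """Average of mAP over IoU thresholds 0.50, 0.55, ..., 0.95."""
    thresholds = [t / 100 for t in range(50, 100, 5)]
    return sum(mean_ap(preds_by_class, gts_by_class, t)
               for t in thresholds) / len(thresholds)


# One loosely localized detection (IoU = 0.72 with the ground truth).
preds = {"target": [(0.9, (0, 0, 10, 7.2))]}
gts = {"target": [(0, 0, 10, 10)]}
print(mean_ap(preds, gts, 0.5), map_50_95(preds, gts))
```

In this toy case the single detection counts as correct at an IoU threshold of 0.5 but not at 0.75 or above, so the two measures disagree, which is why the definition in use needs to be stated.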

 


1. Sentences like "Thus we designed a proposed coordinate attention named double coordinate attention (DCA)" (line 227) are not correctly constructed.

 

Author Response

Please see the attachment.

Author Response File: Author Response.pdf

Reviewer 2 Report

It should be clearly stated whether the model used makes original contributions or just uses a combination of already existing modules.

All equations must be accompanied by much more pertinent justifications. There are many mathematical quantities that appear without explaining their meaning.

Better results are assumed to come from a balance between recognition accuracy and the resources required. The discussion of the results would benefit from more concrete data, such as computation time and computing resource requirements (for example, memory usage), together with a discussion of whether these resources are available on the different devices directly involved in the applications under consideration.
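The kind of resource data requested here could be gathered along the following lines. This is a hedged sketch using only the Python standard library; `profile_inference` and `dummy_detector` are hypothetical names, the dummy function merely stands in for a real model's forward pass, and `tracemalloc` reports only Python-level allocations, not GPU or native-library memory.

```python
import time
import tracemalloc
from statistics import mean


def profile_inference(fn, sample, warmup=3, runs=20):
    """Measure mean latency and peak Python-level memory of fn(sample)."""
    for _ in range(warmup):
        fn(sample)  # warm-up calls are excluded from timing

    # Timing pass (done without tracing, since tracing slows execution).
    times = []
    for _ in range(runs):
        start = time.perf_counter()
        fn(sample)
        times.append(time.perf_counter() - start)

    # Separate memory pass.
    tracemalloc.start()
    fn(sample)
    _, peak_bytes = tracemalloc.get_traced_memory()
    tracemalloc.stop()

    mean_s = mean(times)
    return {
        "mean_ms": 1000.0 * mean_s,
        "fps": 1.0 / mean_s if mean_s > 0 else float("inf"),
        "peak_python_bytes": peak_bytes,
    }


def dummy_detector(image):
    """Hypothetical stand-in for a detector: sums each row of pixel values."""
    return [sum(row) for row in image]


image = [[1] * 640 for _ in range(640)]
print(profile_inference(dummy_detector, image))
```

Reporting such figures for each target device, alongside accuracy, would make the claimed accuracy/resource trade-off directly checkable.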

From the logic of the presentation, a strong and direct connection must emerge between the mathematical support and the particular solution proposed as a personal contribution. Otherwise, the reader may be left with the impression that the mathematical support was introduced only formally because these are the usual requirements.

 

Author Response

Please see the attachment.

Author Response File: Author Response.pdf

Reviewer 3 Report

 

Comments for author File: Comments.pdf

The authors need to conduct a proper proof-reading of the overall manuscript. There are many typos and grammar mistakes.

Author Response

Please see the attachment.

Author Response File: Author Response.pdf

Round 2

Reviewer 1 Report

The authors addressed all my comments.

However, I also recommend adding the motivation for the work to the introduction.

Also, all points of contribution could be considered in the discussion.

Author Response

Please see the attachment.

Author Response File: Author Response.pdf
