Article
Peer-Review Record

Improved YOLOv7 for Small Object Detection Algorithm Based on Attention and Dynamic Convolution

Appl. Sci. 2023, 13(16), 9316; https://doi.org/10.3390/app13169316
by Kai Li, Yanni Wang * and Zhongmian Hu
Reviewer 1:
Reviewer 2:
Reviewer 3: Anonymous
Reviewer 4: Anonymous
Reviewer 5:
Submission received: 17 July 2023 / Revised: 7 August 2023 / Accepted: 14 August 2023 / Published: 16 August 2023

Round 1

Reviewer 1 Report

The major contributions of the paper should be highlighted in Section 1.

Please mention the organization of the paper.

The authors state, "In this paper, we conduct several experiments on the backbone network of YOLOv7." Please clearly state the count, and avoid using jargon.

Explain the terms used in the equations.

Figures should be placed close to where they are cited.

Resolution of figure 7 should be improved.

Figure 10 caption: please recheck and rewrite.

Future work should be highlighted.


Author Response

Please see the attachment.

Author Response File: Author Response.pdf

Reviewer 2 Report

The paper discusses an interesting approach for small object detection by improving YOLOv7. Here are a few comments from my point of view:

1. The title and abstract clearly summarize the paper's focus and contributions. However, the abstract could benefit from a more explicit statement of the problem and a more straightforward explanation of the proposed solution. The abstract does not mention comparative analysis with other state-of-the-art small object detection models.

2. Recent references on small object detection, such as [a1] and [a2], are missing, so the paper does not reflect the state of the art in small object detection.

3. The introduction mentions many techniques and algorithms (such as the SPPCSPC module, coordinate attention mechanism, spatial attention mechanism, and channel attention mechanism), but it does not provide much background on these terms. This may make the paper hard to follow for readers who are not deeply familiar with the field.

4. The paper uses a single dataset for testing the model, which may not be enough to rule out overfitting. It also lacks comparative analysis with other state-of-the-art small target detection models (from other reputable articles), which limits the validation of the model's superiority.

5. The primary metric in the results is mean Average Precision (mAP). While this is a standard metric for object detection tasks, including more metrics could provide a more comprehensive understanding of the model's performance, such as IoU and F1-Score.

6. The experiments are conducted on a single dataset, which limits the validation of the model's performance. I strongly advise using multiple datasets to better position this paper.

A few small notes:
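
As an illustration of the two additional metrics named in point 5, the following is a minimal sketch of how IoU and F1-Score are commonly computed. It assumes axis-aligned boxes given as (x1, y1, x2, y2) and detection counts (true positives, false positives, false negatives) that have already been obtained; the function names `iou` and `f1_score` are hypothetical and are not taken from the manuscript or from any YOLOv7 code.

```python
def iou(box_a, box_b):
    """Intersection over Union of two axis-aligned boxes (x1, y1, x2, y2)."""
    # Corners of the overlapping rectangle, if any.
    ix1 = max(box_a[0], box_b[0])
    iy1 = max(box_a[1], box_b[1])
    ix2 = min(box_a[2], box_b[2])
    iy2 = min(box_a[3], box_b[3])

    # Clamp to zero so that disjoint boxes give an empty intersection.
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)

    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def f1_score(tp, fp, fn):
    """Harmonic mean of precision and recall from detection counts."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    # Illustrative values only, not results from the paper.
    print(iou((0, 0, 2, 2), (1, 1, 3, 3)))
    print(f1_score(tp=8, fp=2, fn=2))
```

In practice, a prediction is usually counted as a true positive when its IoU with a ground-truth box exceeds a chosen threshold, so the two metrics are typically reported together with that threshold.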

1. The title should be Improved YOLOv7 for small objects detection algorithm based on attention and dynamic convolution.

2. Figures 1 and 7 are stretched.

[a1] Cheng, Gong, et al. "Towards large-scale small object detection: Survey and benchmarks." IEEE Transactions on Pattern Analysis and Machine Intelligence (2023).

[a2] Akyon, Fatih Cagatay, Sinan Onur Altinuc, and Alptekin Temizel. "Slicing aided hyper inference and fine-tuning for small object detection." 2022 IEEE International Conference on Image Processing (ICIP). IEEE, 2022.

Author Response

Please see the attachment.

Author Response File: Author Response.pdf

Reviewer 3 Report

Please see the attached file. Thank you.

Comments for author File: Comments.pdf

Author Response

Please see the attachment.

Author Response File: Author Response.pdf

Reviewer 4 Report

This paper designs a new module for YOLOv7 to reduce the missing and missed detection of valid features in the feature extraction process. Feature optimization is also achieved by combining spatial and channel attention mechanisms. The topic is relevant for the Applied Sciences journal and may be interesting for its readers. However, some concerns should be addressed with a final revision before possible acceptance:

- The Related Work section must be improved. Subsection 2.1 describes the main structure of the algorithm model of YOLOv7. However, it does not clearly detail the limitations of the current model, so this subsection seems more like a "Background" subsection than a "Related Work" one. Additional information must be provided there about the limitations that your proposal aims to address in YOLOv7. Then, in Section 2, a discussion of other papers extending or modifying YOLOv7 is needed. You may create a third subsection, or include that discussion in Subsection 2.2.

- Many images are blurry and/or badly sized, e.g., Figures 1 and 2. Because of this, it is not easy to understand the diagrams and workflows that are illustrated. In these conditions, such images could not be included in an accepted version of the paper, so they must be modified or redrawn. Moreover, some image captions appear on the page after the one where the image is included. This issue requires fixing as well.

The level of English is fine; I just recommend a general proofreading of the whole paper.

Author Response

Please see the attachment.

Author Response File: Author Response.pdf

Reviewer 5 Report

In this work, the author addresses the difficulties of the existing literature by enhancing the latest YOLOv7 model. The SPPCSPC module is improved to reduce feature loss during network processing by incorporating feature separation and merging ideas. To address missed detection issues, the CA attention module is introduced to enhance the model's sensitivity to small-scale targets and reduce noise impact. The dynamic convolution module is also introduced to handle misdetection and leakage caused by large target size variations, resulting in a highly robust network.

 

Comment 1: The abstract needs to be revised. The achieved results need to be mentioned.

Comment 2: The following information should be briefly explained in the introduction section.

a. What is the CA attention module, and how does it enhance the model's ability to detect small-scale targets while reducing the impact of noise?

b. How does the SPPCSPC module improve the YOLOv7 network model's performance in handling small targets?

Comment 3: The authors should correctly specify the indicators or metrics used in the adapted framework, as specified in the ablation experiment.

Comment 4: Does the adapted framework have the potential for operationalization and goal-setting? If so, how? 

Comment 5: How does the adopted framework, as shown in the P-R curves of the network models, account for the difference in performance before and after the improvements? Kindly explain this with solid evidence.

Comment 6: How can the adapted conceptual framework be practically applied to address the issue of challenging tiny target recognition? Kindly discuss.

The English language quality in the manuscript can be improved. The manuscript contains numerous grammatical errors that affect the paper's readability. For instance, there are issues with subject-verb agreement, incorrect use of prepositions, and improper sentence construction. Some sentences are overly long and complex, which makes them hard to understand. It is essential to break down these complex sentences into simpler ones to improve clarity.

Author Response

Please see the attachment.

Author Response File: Author Response.pdf

Round 2

Reviewer 2 Report

After reviewing the authors' responses and the revisions made to the manuscript, I conclude that the authors responded well to almost all of my review requests.

Adding a new dataset seems impossible with this journal's time frame.

I appreciate the time and effort the authors invested in addressing each concern and providing clarifications where necessary.

 
