Article
Peer-Review Record

DMA-Net: Decoupled Multi-Scale Attention for Few-Shot Object Detection

Appl. Sci. 2023, 13(12), 6933; https://doi.org/10.3390/app13126933
by Xijun Xie 1, Feifei Lee 1,* and Qiu Chen 2,*
Reviewer 1: Anonymous
Reviewer 2: Anonymous
Reviewer 3:
Submission received: 6 May 2023 / Revised: 27 May 2023 / Accepted: 5 June 2023 / Published: 8 June 2023
(This article belongs to the Special Issue Computer-Aided Image Processing and Analysis)

Round 1

Reviewer 1 Report

This study proposes the DMA-Net model for the few-shot object detection task. Based on the experimental results, the network is effective for object detection. I have a few minor comments for the authors.

1. The manuscript analyzes the difference between the proposed model and other object detection models based on the AP/AP50/AP75 metrics. However, how efficient is the proposed model on the same dataset? The study should quantitatively analyze both performance and efficiency (a minimal measurement sketch is given after this list).

2. Please check the numbering of the subsection headings at Line 272, Line 307, and Line 338.

3. Adjust the table heading of Table 3.

4. The Discussion section should discuss the strengths and weaknesses of the method, rather than the experimental design used for the comparative analysis of the different models.

5. References: recent publications from 2021-2023 that parallel your study should be added, including the current state of the art in detection and classification.
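To make the efficiency request in point 1 concrete, below is a minimal sketch of how model size and inference speed could be measured in PyTorch. It uses torchvision's Faster R-CNN as a stand-in detector, since the authors' DMA-Net implementation is not reproduced here; the image sizes and iteration count are illustrative assumptions only.

```python
# Sketch only: measures parameter count and inference latency/FPS for a detector.
# torchvision's Faster R-CNN is used as a stand-in for DMA-Net.
import time
import torch
import torchvision

model = torchvision.models.detection.fasterrcnn_resnet50_fpn(weights=None).eval()

# Model size: total number of parameters, in millions.
num_params = sum(p.numel() for p in model.parameters()) / 1e6
print(f"Parameters: {num_params:.1f} M")

# Inference speed: average latency over a few dummy images (assumed 600x800 RGB).
images = [torch.rand(3, 600, 800) for _ in range(10)]
with torch.no_grad():
    start = time.perf_counter()
    for img in images:
        model([img])
    elapsed = time.perf_counter() - start

print(f"Latency: {elapsed / len(images) * 1000:.1f} ms/image "
      f"({len(images) / elapsed:.1f} FPS)")
```

Reporting parameter count together with per-image latency (or FPS) on the same hardware would let readers weigh the AP gains against the computational cost.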

Author Response

Please see the attachment.

Author Response File: Author Response.pdf

Reviewer 2 Report

Manuscript title: DMA-Net: Decoupled Multi-scale Attention for Few-shot Object Detection

In this manuscript, the Decoupled Multi-scale Attention Module (DMAM) consists of a multi-scale feature extractor, a multi-scale attention module, and a decoupled gradient module. By using support information and managing the information relationships between branches, DMA-Net is shown to achieve state-of-the-art performance on generic FSOD benchmarks. I have a few concerns, which are detailed below.

1. In the abstract, the sentences ‘Therefore, general detectors typically exhibit overfitting and poor generalizability when recognizing unknown objects if there are few samples. With the continuous maturation of few-shot classification (FSC) methods, …. the support branch and the query branch.’ are not meaningful. Please make the abstract connected so that it reads well in continuity.

2. It would be helpful if some terminology were explained for the reader's understanding and continuity, such as K-shot, the few-shot object detection technique, meta knowledge, and image query. What are multi-scale features? Also describe how the red rectangles around objects in Figure 1 were drawn. I see that the authors explain few-shot learning/object detection in detail later, but adding purposeful one-line explanations earlier would also help connect the material.

3. Similarly, what is the meaning of support images and query images?

4. “Figure 9. Visualization of the attention of multi-scale features. The deeper the color is, the more attention the model focuses on.” It does not seem that way. The hot spots (red/brown/yellow) are considered to be high intensity, and the model appears to pick out the object in the heatmap. Please clarify the sentence.

5. The Faster R-CNN results look superior to DMA-Net in Table 2, yet no comparison with Faster R-CNN is shown in Table 3. DMA-Net also has more parameters. What are the advantages of using DMA-Net over Faster R-CNN? Please add text to the Discussion; a sentence in the abstract could be included too.

6. The captions of Figures 9 and 10 are not self-explanatory. What are the first row and the second row? Could the authors also compare the heatmaps with those of Faster R-CNN? Add a heatmap scale bar.

7. The authors use only the AP metric. Adding other metrics, such as the F1 score, false positive rate, or false negative rate, could further help readers understand the potential of the method and the comparison (a minimal sketch of such metrics is given after this list).

8. The data splitting used to train the model, and the distribution of the validation and test datasets, are not clear.
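As an illustration of the additional metrics suggested in point 7, the sketch below derives precision, recall, F1 score, and miss rate (false negative rate) from true-positive/false-positive/false-negative counts obtained by matching predicted boxes to ground truth at a fixed IoU threshold. The helper function and the example numbers are hypothetical, not the authors' evaluation code.

```python
# Sketch only: confusion-style metrics for a detector at a fixed IoU/score threshold.
def detection_metrics(tp: int, fp: int, fn: int) -> dict:
    """tp, fp, fn: true positives, false positives, false negatives counted by
    matching predicted boxes to ground-truth boxes (e.g. IoU >= 0.5)."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {
        "precision": precision,
        "recall": recall,
        "f1": f1,
        "miss_rate": 1.0 - recall,  # false negative rate
    }

# Example: 80 correctly matched boxes, 20 spurious boxes, 10 missed objects.
print(detection_metrics(tp=80, fp=20, fn=10))
# -> precision 0.800, recall 0.889, F1 0.842, miss rate 0.111
```

Note that a true-negative count is ill-defined for detection, so a classification-style false positive rate is usually replaced by false positives per image or by the miss rate shown above.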

Author Response

Please see the attachment.

Author Response File: Author Response.pdf

Reviewer 3 Report

In this paper, the authors explore the significant field of object detection in computer vision. They shed light on the challenges faced by general detectors, including overfitting and poor generalizability, brought about by the complexities in collecting and labeling samples in many specialized areas. To address these concerns, they present an innovative framework known as Decoupled Multi-scale Attention (DMA-Net), which is anchored by a crucial module, the Decoupled Multi-scale Attention Module (DMAM).


While the paper is generally well-structured and proposes an intriguing and potentially impactful solution to the highlighted issues, there are several areas that could benefit from further refinement and elaboration:


1. In the related work section, the authors could enhance their discussion by not only acknowledging previous works but also identifying their shortcomings. This would better justify the need for this study and the proposed DMA-Net.


2. The authors should ensure that all graphical content, such as Figure 8, is clearly presented for easy interpretation. If the text is hard to read, it might detract from the overall comprehension of the paper.


3. The authors should also revisit Table 3 and ensure all columns have appropriate labels for better data understanding. The absence of labels for the first five columns may create confusion for the readers.


In conclusion, this paper introduces the DMA-Net, a promising framework for addressing challenges in object detection tasks. However, it requires enhancements in terms of clarity, detailed explanations, and comprehensive comparisons to truly stand out as a significant contribution to the field. By addressing these recommendations, the authors will be able to refine and elevate the quality of their research.

Author Response

Please see the attachment.

Author Response File: Author Response.pdf
