Peer-Review Record

A Multi-Level Adaptive Lightweight Net for Damaged Road Marking Detection Based on Knowledge Distillation

Remote Sens. 2024, 16(14), 2593; https://doi.org/10.3390/rs16142593
by Junwei Wang 1,2,3,†, Xiangqiang Zeng 1,4,†, Yong Wang 1, Xiang Ren 1, Dongliang Wang 1, Wenqiu Qu 1,2, Xiaohan Liao 1 and Peifen Pan 5,*
Reviewer 1: Anonymous
Reviewer 2:
Reviewer 3: Anonymous
Submission received: 27 April 2024 / Revised: 14 June 2024 / Accepted: 27 June 2024 / Published: 16 July 2024
(This article belongs to the Section Remote Sensing Image Processing)

Round 1

Reviewer 1 Report

Comments and Suggestions for Authors

In this manuscript, the authors propose MALNet, a network based on knowledge distillation that improves detection and segmentation performance for road surface markings. The network has three key components: MSDC modules capture features at different scales, the ASCA fusion module boosts feature representation, and the MLDS strategy transfers knowledge from a teacher model to a student model. Experimental results demonstrate the efficacy and superiority of the proposed method. However, some issues must be addressed before publication.

1. One major concern is the organization of this manuscript. Many passages are redundant, especially in the experimental analysis. The authors should organize this part more clearly and systematically.

2. For Section 4.5, the ablation experiments were conducted only on the CDM_P dataset. Why were they not also verified on the CDM_H and CDM_C datasets?

3. Some related work on knowledge distillation should be introduced.

4. The authors should compare the time consumption of their method with that of other methods; this comparison is currently missing.

5. The conclusion is too long and should be shortened.

Author Response

Comments 1: One major concern is the organization of this manuscript. Many representations are redundant, especially the experimental analysis part. The author should organize this part more orderly and clearly.

Response 1: Agree. Thank you for pointing this out. We have reorganized the manuscript to improve clarity and reduce redundancy. Specifically, we moved the "Evaluation Metrics" section into the Methods part and merged the original Experiment Setup and Data sections into a single "Experiment Setting" section. This change can be found on pages 9-10, Section 3.4 Evaluation Metrics.

 

Comments 2: For Section 4.5, the ablation experiments were conducted only on the CDM_P dataset. Why were they not also verified on the CDM_H and CDM_C datasets?

Response 2: Agree. Thank you for your suggestion. We have now conducted ablation experiments on the CDM_H and CDM_C datasets as well. This change can be found on page 18, paragraph 4, lines 686-714.

 

Comments 3: There should be some related work introduced about the knowledge distillation.

Response 3: Agree. We have added recent advancements in knowledge distillation to the Related Work section. This change can be found on pages 3-4, lines 137-168.

Comments 4: The authors should compare the time consumption of their method with that of other methods; this comparison is currently missing.

Response 4: Agree. We have added an analysis comparing the computation speed of our proposed model with other models in the Model Lightweight Analysis section. This change can be found on page 20, paragraphs 3-4, lines 751-762.

Comments 5: The conclusion is too long and should be shortened.

Response 5: Agree. We have shortened the Conclusion section to make it more concise. This change can be found on page 21, paragraphs 5-7, lines 785-807.

 

Author Response File: Author Response.docx

Reviewer 2 Report

Comments and Suggestions for Authors

This paper presents a Multi-level Adaptive Lightweight Network (MALNet) based on knowledge distillation for high-precision segmentation of damaged road markings. The authors claim that the MALNet model improves performance by incorporating multi-scale dilated convolution and adaptive spatial-channel attention fusion modules. They evaluated it on three datasets and presented its performance. The overall contributions are good, but some improvements are required. My comments are as follows:

1. What is the rationale for choosing ResNet-101 for the teacher network and ResNet-18 for the student network? Do you have any results for a similar teacher and student network?

2. More explanation of Figure 2 is required. How does the attention module adaptively adjust the spatial and channel weights?

3. How are the α and β values chosen in Equation (4)? If any sources exist, please include the reference.

4. I suggest adding some sample images from each dataset presented on page 9. Also, please state whether the dataset is publicly available.

5. I suggest highlighting (in bold text) the best performance in Table 1, so it will be easier for readers to follow.

6. IOU has already been defined previously; there is no need to define it again on line 522.

7. What are the MSDC modules mentioned on line 756? They are not defined previously.

8. This is good work; I recommend making it publicly available on GitHub if your project permits.

9. Please pay attention to punctuation and indentation, for example:

a. On line 80, put a space after "(ASCA)".

b. On line 104, put a space after the comma in each framework name.

c. On page 4, each numbered point should begin with a space before the paragraph starts, like this: "(1) Multi-Level Distillation Strategy: To ….."

d. On line 443, the lower recall is marked as "®", but it should be "(R)". Please update it.

e. There are several places with no space after a full stop (.). Please review and update all of them. Some examples: lines 432, 445, 448, 450, 713, and so on.

Thank you for your contributions.

 

Good luck

Comments on the Quality of English Language

A little improvement is needed. 

Author Response

Comments 1: What is the rationale for choosing ResNet-101 for the teacher network and ResNet-18 for the student network? Do you have any results for a similar teacher and student network?

 

Response 1: Thank you for your insightful comment.

We chose ResNet-101 as the teacher network due to its deep architecture, which is essential for capturing complex and detailed features. ResNet-101 incorporates residual blocks with skip connections that address the vanishing gradient problem, enabling the construction of very deep networks with strong feature extraction capabilities. This depth is crucial for semantic segmentation tasks, particularly in capturing intricate details and semantic information.

  • ResNet-101 Advantages:
    • Deep Network Structure: With 101 convolutional layers, ResNet-101 can extract more complex and detailed features, enhancing segmentation accuracy. This depth is particularly important for providing accurate guidance in knowledge distillation.
    • Rich Feature Representation: The deep architecture captures a hierarchy of features, from low-level edge information to high-level semantic information, which is beneficial for generating high-quality labels and feature maps for distillation.
  • ResNet-18 Advantages for Student Network:
    • Lightweight and Efficient: ResNet-18 has only 18 convolutional layers, significantly reducing computational complexity compared to ResNet-101. This makes it suitable for deployment on resource-constrained devices, such as mobile or embedded systems.
    • Faster Inference: The shallower architecture of ResNet-18 allows for faster inference, which is critical for real-time detection and processing.

This setup effectively balances the complexity and efficiency required for our specific application.
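To make the pairing concrete, here is a minimal, hypothetical sketch of such a teacher-student setup using torchvision's ResNet backbones. It is an illustration of the general distillation arrangement, not MALNet's actual configuration: the paper's segmentation heads and distillation losses are omitted, and the input size is arbitrary.

```python
import torch
from torchvision.models import resnet101, resnet18

# Teacher: deep 101-layer backbone with rich hierarchical features.
# (Pretrained weights would normally be loaded; omitted here for brevity.)
teacher = resnet101(weights=None).eval()
for p in teacher.parameters():
    p.requires_grad = False  # the frozen teacher only provides guidance

# Student: lightweight 18-layer backbone for resource-constrained deployment.
student = resnet18(weights=None)

x = torch.randn(2, 3, 512, 512)  # dummy batch of road images
with torch.no_grad():
    t_out = teacher(x)  # teacher outputs serve as soft targets
s_out = student(x)      # the student is trained to mimic the teacher
```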

 

Comments 2: More explanation of Figure 2 is required. How does the attention module adaptively adjust the spatial and channel weights?

 

Response 2: Agree. The attention module utilizes the sigmoid function to constrain the attention weights between 0 and 1, ensuring effective and stable weighting of the feature maps. These attention weights adaptively adjust the importance of each pixel or channel based on the input features and task requirements, achieving adaptive feature enhancement. By incorporating the ASCA module, the model can more effectively capture and enhance critical features, significantly boosting the model's representational and generalization performance.

Thank you for pointing this out. These additions can be found on page 7, paragraphs 3-5, lines 269-285 and 297-304.
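As a rough illustration of the sigmoid-gated weighting described above, here is a minimal, hypothetical PyTorch sketch. The class name, layer sizes, and kernel choices are our own and this is not the paper's exact ASCA design, only the general principle of constraining per-channel and per-pixel weights to (0, 1).

```python
import torch
import torch.nn as nn

class SpatialChannelGate(nn.Module):
    """Illustrative sigmoid-gated attention; not the paper's exact ASCA module."""
    def __init__(self, channels: int):
        super().__init__()
        # Channel weights derived from globally pooled statistics.
        self.channel_gate = nn.Sequential(
            nn.AdaptiveAvgPool2d(1),
            nn.Conv2d(channels, channels, kernel_size=1),
            nn.Sigmoid(),  # constrain weights to (0, 1) for stable weighting
        )
        # Spatial weights from a single-channel convolutional map.
        self.spatial_gate = nn.Sequential(
            nn.Conv2d(channels, 1, kernel_size=7, padding=3),
            nn.Sigmoid(),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        x = x * self.channel_gate(x)   # adaptively re-weight each channel
        return x * self.spatial_gate(x)  # adaptively re-weight each pixel
```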

 

Comments 3: How are the α and β values chosen in Equation (4)? If any sources exist, please include the reference.

 

Response 3: The specific values of α and β were determined through experimental tuning to balance class distillation and feature distillation. Experimental results show that setting α = 0.01 and β = 0.05 effectively enhances the performance of the student model; the larger β reflects a stronger emphasis on feature knowledge distillation. Excessive or insufficient weights can lead to overfitting or underfitting, and these values proved a suitable compromise, fully leveraging the teacher model's information without letting any single loss dominate training.
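For illustration, a minimal sketch of how such a weighted objective can be assembled with the stated α = 0.01 and β = 0.05. The individual terms here (temperature-scaled KL divergence for class distillation, MSE for feature distillation) are common placeholder choices and not necessarily the exact terms of the paper's Equation (4).

```python
import torch
import torch.nn.functional as F

ALPHA, BETA = 0.01, 0.05  # weights reported in the response

def total_loss(s_logits, t_logits, s_feats, t_feats, target, T=4.0):
    """Supervised loss plus class- and feature-level distillation terms.
    Placeholder formulation; Equation (4) in the paper defines the exact terms."""
    # Supervised segmentation loss against ground-truth class indices.
    seg = F.cross_entropy(s_logits, target)
    # Class distillation: match softened teacher/student distributions.
    cls = F.kl_div(
        F.log_softmax(s_logits / T, dim=1),
        F.softmax(t_logits / T, dim=1),
        reduction="batchmean",
    ) * T * T
    # Feature distillation: match intermediate feature maps.
    feat = F.mse_loss(s_feats, t_feats)
    return seg + ALPHA * cls + BETA * feat
```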

 

Comments 4: I suggest adding some sample images from each dataset presented on page 9. Also, please state whether the dataset is publicly available.

 

Response 4: Agree. Sample images from each dataset have been added, as shown in Figure 6. Additionally, the CDM-P dataset, which is based on public datasets, has been made publicly available; the data are openly accessible at https://www.scidb.cn/, accessed on 28 March 2024. These modifications can be found on page 10, paragraph 4, line 395.

 

Comments 5: I suggest highlighting (in bold text) the best performance in Table 1, so it will be easier for readers to follow.

Response 5: Agree. We have highlighted the best results in bold in each table. Additionally, to emphasize the importance of the F1 and IOU metrics, these metrics have been italicized in the quantitative experimental results. Further details can be found in Tables 1-3.

 

Comments 6: IOU has already been defined previously; there is no need to define it again on line 522.

 

Response 6: Agree. We have removed the redundant definition of IOU.

 

Comments 7: What are the MSDC modules mentioned on line 756? They are not defined previously.

 

Response 7: Agree. We acknowledge the mistake: the correct term is MDSC, the Multi-scale Dynamic Selection Convolution module. This has been corrected.

 

Comments 8: This is good work; I recommend making it publicly available on GitHub if your project permits.

 

Response 8: We appreciate your suggestion. However, since this is an institution-commissioned project with commercial applications, the code cannot be made publicly available.

 

Comments 9: Please pay attention to punctuation and indentation, for example:

a. On line 80, put a space after "(ASCA)".

b. On line 104, put a space after the comma in each framework name.

c. On page 4, each numbered point should begin with a space before the paragraph starts, like this: "(1) Multi-Level Distillation Strategy: To ….."

d. On line 443, the lower recall is marked as "®", but it should be "(R)". Please update it.

e. There are several places with no space after a full stop (.). Please review and update all of them. Some examples: lines 432, 445, 448, 450, 713, and so on.

 

Response 9: Agree. We have corrected the punctuation and indentation throughout the document. Thank you for highlighting these issues.

We have reviewed and updated the entire document to ensure consistency and accuracy.

 

Author Response File: Author Response.docx

Reviewer 3 Report

Comments and Suggestions for Authors

SUMMARY:

========

A novel deep learning method is proposed for detecting damaged road markings from "in situ" collected images.

GENERAL COMMENTS:

=================

The explanation of distillation should be extended a bit.

DETAILS:

========

LINE 224: "This skip connection directly adds the input to the output of the 224 convolutional layers, creating a residual." Adding or subtracting to obtain a residual?

In Figure 3, what does "tenser" mean? And the (6C)/(3C) in parentheses?

Are there arrows missing in Figure 4?

Can you put the testing machine (CPU) characteristics in Table 6 (along with the FPS results)?

Comments on the Quality of English Language

Use of several "rather unusual" words: prowess (ability), voracious computational appetite (computationally intensive), bolster (support, help), falter (have poor performance), disparate (diverse), marred (spoiled)...

Author Response

Comments 1: The explanation of distillation should be extended a bit.

 

Response 1: Agree. We have added recent advancements in knowledge distillation to the Related Work section. This change can be found on pages 3-4, lines 137-168.

 

Comments 2: LINE 224: "This skip connection directly adds the input to the output of the 224 convolutional layers, creating a residual." Adding or subtracting to obtain a residual? In Figure 3, what does "tenser" mean? And the (6C)/(3C) in parentheses? Are there arrows missing in Figure 4? Can you put the testing machine (CPU) characteristics in Table 6 (along with the FPS results)?

 

Response 2: Agree. The skip connection adds the input; it does not subtract. The sentence has been corrected to: "This skip connection directly adds the input to the output of the convolutional layers." (The "224" in the quoted sentence was the manuscript line number, not part of the text.) This change has been made in the manuscript.
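To illustrate the "adding" semantics of a residual connection, here is a minimal generic sketch of a standard ResNet-style block; it is not the manuscript's exact layer configuration.

```python
import torch
import torch.nn as nn

class ResidualBlock(nn.Module):
    """Standard residual pattern: the skip path adds the input to the conv output."""
    def __init__(self, channels: int):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(channels, channels, 3, padding=1),
            nn.BatchNorm2d(channels),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels, channels, 3, padding=1),
            nn.BatchNorm2d(channels),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return torch.relu(self.body(x) + x)  # addition, not subtraction
```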

Regarding Figure 3, 'tensor' refers to a feature map, '6C' indicates a feature map with 6 channels, and '(3C)' indicates weights for 3 channels. These explanations have been added on page 6, lines 245-246.

We have added the missing arrows to Figure 4 and replaced the figure to provide clearer explanations. This update can be found on page 7, lines 269-285.

Additionally, we have included the throughput test results based on the CPU and the performance metrics for other models mentioned in the manuscript in Table 6. These changes are detailed on page 20, lines 751-762.

 

Comments 3: Use of several "rather unusual" words: prowess (ability), voracious computational appetite (computationally intensive), bolster (support, help), falter (have poor performance), disparate (diverse), marred (spoiled)...

 

Response 3: Agree. We have replaced the unusual words with more common terms and thoroughly reviewed the manuscript to improve the clarity and correctness of the English throughout.

Author Response File: Author Response.docx
