Article
Peer-Review Record

Complex Mountain Road Extraction in High-Resolution Remote Sensing Images via a Light Roadformer and a New Benchmark

Remote Sens. 2022, 14(19), 4729; https://doi.org/10.3390/rs14194729
by Xinyu Zhang 1,2,†, Yu Jiang 3,4,†, Lizhe Wang 1,2, Wei Han 1,2,*, Ruyi Feng 1,2, Runyu Fan 1,2 and Sheng Wang 1,2
Reviewer 1: Anonymous
Reviewer 2:
Reviewer 3:
Reviewer 4:
Submission received: 22 August 2022 / Revised: 14 September 2022 / Accepted: 18 September 2022 / Published: 21 September 2022

Round 1

Reviewer 1 Report

Complex mountain road extraction is of great significance. The paper proposes a new dataset, Road Datasets in Complex Mountain Environments (RDCME), and the Light Roadformer model. The data used in the paper are convincing and the conclusions are reliable.

1. The paper should focus on the technical difficulties of mountain road extraction from HRSIs. The observation that 'mountain roads are small in size and have blurred edges' is caused by image quality, so it should not be defined as one of the characteristics of mountain roads themselves.

2. A detailed introduction of Fig. 5 should be added.

3. As to '2.4. Post process', it is very important to the final result, so more details should be added. Also, how does its efficiency compare with manual editing?

In general, the paper is well organized, and the following paper could be referenced:

Abhay Kolhe & Archana Bhise (2022) Modified PLVP with Optimised Deep Learning for Morphological based Road Extraction, International Journal of Image and Data Fusion, 13(2), 155-179, DOI: 10.1080/19479832.2020.1864785

 

Author Response

Thank you for your kind suggestions and comments. In this revision, we have addressed the points you raised and cited the reference you recommended. We revised the description of mountain road characteristics in the manuscript, supplemented the corresponding figure descriptions, and added further details about the model. All revisions in the manuscript are highlighted in yellow. We believe the manuscript has been clearly improved with your help.

For more details, please refer to the attachment.

Author Response File: Author Response.pdf

Reviewer 2 Report

In this paper, the authors propose a new mountain dataset and the Light Roadformer model to solve the problem of extracting mountain roads from HRSIs in complex environments. The Light Roadformer model is composed of a transformer module and a self-attention module that focus on extracting more accurate road edge information; a post-processing module then removes incorrectly predicted road segments.

 

Detailed comments are:

1) In the experiment section, only the training dataset with ground-truth labels is used to verify the effectiveness of the model. Could you use the unlabeled dataset as a supplement to the experiments?

2) In Figure 5, the encoder-decoder structure is rotated 90 degrees, which reduces readability. Could you optimize the image layout?

3) In Section 2.3.2, could you explain how Q, K, and V correspond to the actual input data?

Author Response

Thank you for your kind suggestions and comments. In this revision, we have addressed the points you raised. All revisions in the manuscript are highlighted in yellow. We believe the manuscript has been clearly improved with your help.

For more details, please refer to the attachment.

Author Response File: Author Response.pdf

Reviewer 3 Report

This is an interesting paper, but I have a few concerns:

1. The paper is missing some of the latest research on mountain road segmentation/extraction [1][2][3][4].

2. The authors need to update their references and contrast their work with the current research, then state the paper's contribution based on these comparisons.

3. The authors need to add a new visualization that shows ground truth and prediction in an overlapping way so readers can visually see the IoU.

4. From my point of view, the idea for post-processing is the original contribution of this paper. The authors need to give more examples of the post-processing and report its time complexity on different examples.

5. Did the authors keep the RGB color space in CLAHE? This is very uncommon; usually for CLAHE the image is converted to LAB.

6. As the authors created their own encoder and decoder (not from transfer learning), they should try different color spaces in the encoder and decoder.
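To illustrate the reviewer's point about the color space used for contrast enhancement, the sketch below shows the usual pattern of equalizing only a luminance channel while leaving chroma untouched. This is a simplified stand-in, not the paper's pipeline: it uses a BT.601 luma/chroma split rather than LAB, and plain global histogram equalization rather than CLAHE (which adds tiling and a clip limit); the function name is hypothetical.

```python
import numpy as np

def equalize_luminance(rgb: np.ndarray) -> np.ndarray:
    """Equalize contrast on the luminance channel only (simplified
    stand-in for LAB-space CLAHE). `rgb` is an (H, W, 3) uint8 array."""
    rgb = rgb.astype(np.float64)
    # RGB -> luma/chroma split using ITU-R BT.601 weights
    y = 0.299 * rgb[..., 0] + 0.587 * rgb[..., 1] + 0.114 * rgb[..., 2]
    cr = rgb[..., 0] - y  # red chroma offset
    cb = rgb[..., 2] - y  # blue chroma offset
    # Histogram-equalize the luma channel via its normalized CDF
    y_idx = y.clip(0, 255).astype(np.uint8)
    hist, _ = np.histogram(y_idx, bins=256, range=(0, 256))
    cdf = hist.cumsum()
    cdf = 255.0 * cdf / cdf[-1]
    y_eq = cdf[y_idx]
    # Recombine equalized luma with the untouched chroma offsets
    r = y_eq + cr
    b = y_eq + cb
    g = (y_eq - 0.299 * r - 0.114 * b) / 0.587
    return np.stack([r, g, b], axis=-1).clip(0, 255).astype(np.uint8)
```

Equalizing in RGB channel-by-channel, by contrast, shifts the channel ratios independently and can distort hues, which is why conversion to a luminance-chroma space (typically LAB) is the common practice the reviewer refers to.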

 

References:

[1] Chen, Weitao, et al. "NIGAN: A Framework for Mountain Road Extraction Integrating Remote Sensing Road-Scene Neighborhood Probability Enhancements and Improved Conditional Generative Adversarial Network." IEEE Transactions on Geoscience and Remote Sensing 60 (2022): 1-15.

[2] Chen, Ziyi, et al. "Road extraction in remote sensing data: A survey." International Journal of Applied Earth Observation and Geoinformation 112 (2022): 102833.

[3] Xu, Zeyu, et al. "Road extraction in mountainous regions from high-resolution images based on DSDNet and terrain optimization." Remote Sensing 13.1 (2020): 90.

[4] Courtial, Azelle, et al. "Exploring the potential of deep learning segmentation for mountain roads generalization." ISPRS International Journal of Geo-Information 9.5 (2020): 338.

Author Response

Thank you for your kind suggestions and comments. In this revision, we have addressed the points you raised, added your recommended references to the manuscript, and corrected some details. All revisions in the manuscript are highlighted in yellow. We believe the manuscript has been clearly improved with your help.

Author Response File: Author Response.pdf

Reviewer 4 Report

This manuscript presents a semantic segmentation method for mountain road extraction using a vision transformer backbone. Experimental results on an originally collected satellite image dataset show that the proposed method, called Light Roadformer, achieved higher IoU scores than eight existing ConvNet- and Transformer-based methods. Nevertheless, the following questions must be addressed.

 

1. The detailed architecture of the proposed method is depicted in Fig. 5.

Which parts of this architecture are novel in your method?

 

2. Why didn't you use an end-to-end architecture?

 

3. Is your method better than the following D-LinkNet?

 

- D-LinkNet: LinkNet With Pretrained Encoder and Dilated Convolution for High-Resolution Satellite Imagery Road Extraction, Lichen Zhou, Chuang Zhang, Ming Wu; Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) Workshops, 2018, pp. 182-186

 

4. Does the horizontal axis in Fig. 10 have the correct units? An epoch corresponds to one pass over the entire training set. What does the term "training sessions" in line 222 mean? Is it "iteration"? How many epochs do the iterations correspond to?

 

5. How many annotators were there when the RDCME dataset was created?

 

6. The recommended style for equations (9) and (10) can be seen in the following survey paper.

 

- S. Minaee, Y. Boykov, F. Porikli, A. Plaza, N. Kehtarnavaz and D. Terzopoulos, "Image Segmentation Using Deep Learning: A Survey," in IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 44, no. 7, pp. 3523-3542, 1 July 2022, doi: 10.1109/TPAMI.2021.3059968.

 

7. Fundamentally, Fig. 11 and Table 2 show the same information. Do you need both?

 

8. Is "This is" necessary in the caption of each figure?

 

9. Replace "Vit" with "ViT" on line 174. Replace "Pytorch" with "PyTorch" on line 221. Replace "IOU" with "IoU" between formulas (9) and (10) on line 224.

 

10. Is it possible to upload the source codes for Light Roadformer?

Author Response

Thank you for your kind assistance and essential comments. In response to your suggestions, we first added a comparative experiment using D-LinkNet on RDCME. We then supplemented the description of the network architecture diagram in the manuscript, revised some formulas, and corrected inaccurate and inappropriate language. We also checked the manuscript repeatedly to address problems with expressions and symbol errors. All revisions in the manuscript are highlighted in yellow. Throughout the revision process, your questions, comments, and assistance have helped us continuously improve the manuscript.

 

Author Response File: Author Response.pdf

Round 2

Reviewer 3 Report

The authors have addressed most of my concerns, but a few minor details need to be added in the minor revision:

1. Add experiments measuring the time needed for post-processing on various images and contrast it with the time needed for the machine learning prediction.

2. Showing prediction and ground truth in an overlapping fashion is very important, because a one-pixel shift between prediction and ground truth cannot be detected across two separate images.

3. The authors need to further demonstrate that their model is not overfitting.
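The overlapping visualization the reviewer requests can be sketched as follows. This is a minimal illustration with hypothetical helper names, assuming binary NumPy masks: pixels where prediction and ground truth agree are colored green (true positive), extra predicted pixels red (false positive), and missed pixels blue (false negative), so the IoU is visible at a glance.

```python
import numpy as np

def overlay_masks(gt: np.ndarray, pred: np.ndarray) -> np.ndarray:
    """Color-code agreement between two binary (H, W) masks:
    green = true positive, red = false positive, blue = false negative."""
    h, w = gt.shape
    vis = np.zeros((h, w, 3), dtype=np.uint8)
    vis[(gt == 1) & (pred == 1)] = (0, 255, 0)   # true positive
    vis[(gt == 0) & (pred == 1)] = (255, 0, 0)   # false positive
    vis[(gt == 1) & (pred == 0)] = (0, 0, 255)   # false negative
    return vis

def iou(gt: np.ndarray, pred: np.ndarray) -> float:
    """Intersection over union of two binary masks."""
    inter = np.logical_and(gt, pred).sum()
    union = np.logical_or(gt, pred).sum()
    return float(inter) / union if union else 0.0
```

Rendering `overlay_masks(gt, pred)` over the input image makes even a one-pixel misalignment between prediction and ground truth directly visible, which side-by-side panels cannot do.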

Author Response

Thank you for your suggestions; we have made further changes accordingly. We added some experimental details in the post-processing section and a description of the comparative model in the text. If you have any further suggestions, please feel free to contact us.

For more details, please refer to the attachment.

Author Response File: Author Response.pdf
