Open Access Article

Peer-Review Record

A Bio-Inspired Visual Perception Transformer for Cross-Domain Semantic Segmentation of High-Resolution Remote Sensing Images

Remote Sens. 2024, 16(9), 1514; https://doi.org/10.3390/rs16091514

by Xinyao Wang¹, Haitao Wang¹,*, Yuqian Jing², Xianming Yang³ and Jianbo Chu¹

Reviewer 1: Anonymous

Reviewer 2: Anonymous

Reviewer 3: Javad Sheikh

Submission received: 10 February 2024 / Revised: 12 April 2024 / Accepted: 23 April 2024 / Published: 25 April 2024

(This article belongs to the Special Issue Deep Learning and Computer Vision in Remote Sensing-III)

Round 1

Reviewer 1 Report

Comments and Suggestions for Authors

This manuscript proposes a transformer-based strategy for cross-domain remote sensing image segmentation, which effectively acquires semantic information in high-resolution remote sensing images. The method is novel and described in detail in the manuscript. Furthermore, experimental results on benchmark remote sensing datasets demonstrate the effectiveness of the proposed module. However, there are some concerns that need to be addressed before possible publication:

1) The abstract lacks clarity in explaining the respective roles of the proposed modules and their relationship to semantic information extraction.

2) The utilization of experimental datasets in this study is considered inadequate; it is advisable to incorporate more datasets for robust evaluations.

3) Providing a more specific network structure for the proposed GSV-Trans would be helpful.

4) Although the backbone of the proposed algorithm utilizes transformers, it should be noted that transformer-based state-of-the-art (SOTA) methods are missing from Table II of the manuscript.

Author Response

Thank you for your invaluable suggestions. We have addressed each of your comments and made the necessary revisions to the manuscript.

Please see the attachment.

Once again, we sincerely appreciate your time and effort in reviewing our paper. Your valuable insights have been immensely helpful.

Author Response File: Author Response.docx

Reviewer 2 Report

Comments and Suggestions for Authors

Lines 114 to 117: This cannot be considered a contribution.

Line 201: be careful with the notation. In xt1 one must have t1 as a subscript.

Lines 325 and 326: information regarding R must be clarified. Is R a variable? The authors point out that R is a process. In this sense, in Equations (1) and (2), what should the readers understand regarding R?

Lines 425 to 430: How were the values chosen? Explain how the learning rate was chosen and how the search for the best values was carried out.

There are many problems related to formatting references:

- The first letter of each word in a journal name must be capitalized. Example: replace “IEEE transactions on pattern analysis and machine intelligence” with “IEEE Transactions on Pattern Analysis and Machine Intelligence”.

- There is a need to standardize: Volume versus vol.; Pages versus pp.

Author Response

Thank you for your invaluable suggestions. We have addressed each of your comments and made the necessary revisions to the manuscript.

Please see the attachment.

Once again, we sincerely appreciate your time and effort in reviewing our paper. Your valuable insights have been immensely helpful.

Author Response File: Author Response.docx

Reviewer 3 Report

Comments and Suggestions for Authors

Dear authors,

I enjoyed reading your paper. Your approach to simulating human eye movements to enhance semantic segmentation is particularly interesting. However, there are two points that I think need to be explained further and fixed:

1. Figure 5 Clarity: Could you provide more detail in the text explaining the different elements of Figure 5? I have trouble understanding this AEMA attention mechanism.

2. Minor Typo: There appears to be a typo in Figure 7, where "Gaze" is spelled "Caze."

Kind regards,

Author Response

Thank you for your invaluable suggestions. We have addressed each of your comments and made the necessary revisions to the manuscript.

Please see the attachment.

Once again, we sincerely appreciate your time and effort in reviewing our paper. Your valuable insights have been immensely helpful.

Author Response File: Author Response.docx
