Article
Peer-Review Record

Building Extraction Based on U-Net with an Attention Block and Multiple Losses

Remote Sens. 2020, 12(9), 1400; https://doi.org/10.3390/rs12091400
by Mingqiang Guo 1, Heng Liu 1, Yongyang Xu 1,* and Ying Huang 2
Reviewer 1: Anonymous
Reviewer 2: Anonymous
Reviewer 3: Anonymous
Reviewer 4: Anonymous
Reviewer 5: Anonymous
Submission received: 24 February 2020 / Revised: 25 April 2020 / Accepted: 27 April 2020 / Published: 28 April 2020

Round 1

Reviewer 1 Report

I recommend restructuring the article so that it is better organized. The current division is disordered and can hinder understanding of the paper's content. I suggest moving the Related Work chapter into the Introduction section. Please clearly separate the methodology and data from the results section. Please keep to the main sections of an article: Introduction, Methodology and Materials, Results, Conclusion.

Please clarify why you chose this way of expanding the data set, and why 155 and 25 elements were chosen.

Author Response

Thank you for your letter and for the comments concerning our manuscript entitled “Building extraction based on U-Net with an attention block and multiple losses” (Manuscript ID remotesensing-741084). The comments are all valuable and very helpful for revising and improving our paper, and they provide important guidance for our research. To improve the manuscript, we have made significant revisions and changed the structure of the article so that it can be better understood. We have studied the comments carefully and have checked the layout proof carefully. Point-by-point responses to the reviewer’s comments are listed in the attachment.

Author Response File: Author Response.pdf

Reviewer 2 Report

The paper presents a new method (using attention and multiple losses) to produce a segmentation of remote sensing images to detect buildings.
The general structure of the paper seems relevant, and the results (as always in this type of paper) outperform the baselines on two main metrics: accuracy and IoU (which are only implicitly defined).
The main problem I have with the paper is that I don't really understand the methodology used. I am not an expert in attention networks, but I generally don't have a problem understanding new deep learning concepts. As this paper may be too specialized for me, I don't think I should give a recommendation to the editor.
Nevertheless, I think the method would be easier to understand if the paper received a complete English and equation edit. Most of the notations are implicit and could be developed: for example, in Eq. 1, we don't know explicitly in which space x^l and x^{l+1} "live", and in Eq. 2, g is not defined.
Another point is that the comparison to the baseline could be better justified. We don't know whether the tuning done on the baseline gives the expected score (for example, whether it reproduces the score of the reference papers). And given the variability of the scores across the different datasets (in accuracy, for instance), it is not clear to me that the improvement made by the attention mechanism is significant.
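
For reference, the two metrics mentioned above are conventionally defined as follows for binary building masks. This is a minimal sketch of the standard definitions, not the authors' evaluation code; the function names and the convention of returning 1.0 when both masks are empty are illustrative assumptions.

```python
from typing import List, Sequence


def _flatten(mask: Sequence[Sequence[int]]) -> List[int]:
    """Flatten a 2D binary mask (rows of 0/1 values) into one list."""
    return [pixel for row in mask for pixel in row]


def pixel_accuracy(pred: Sequence[Sequence[int]],
                   truth: Sequence[Sequence[int]]) -> float:
    """Fraction of pixels whose predicted label equals the reference label."""
    p, t = _flatten(pred), _flatten(truth)
    if len(p) != len(t) or not p:
        raise ValueError("masks must be non-empty and the same size")
    return sum(1 for a, b in zip(p, t) if a == b) / len(p)


def iou(pred: Sequence[Sequence[int]],
        truth: Sequence[Sequence[int]]) -> float:
    """Intersection over union of the building (label 1) class."""
    p, t = _flatten(pred), _flatten(truth)
    if len(p) != len(t):
        raise ValueError("masks must be the same size")
    intersection = sum(1 for a, b in zip(p, t) if a == 1 and b == 1)
    union = sum(1 for a, b in zip(p, t) if a == 1 or b == 1)
    # Assumed convention: two empty masks agree perfectly.
    return intersection / union if union else 1.0


# Toy 2x3 masks, purely illustrative (not data from the paper).
pred = [[1, 1, 0],
        [0, 1, 0]]
truth = [[1, 0, 0],
         [0, 1, 1]]
print(pixel_accuracy(pred, truth))  # 4 of 6 pixels agree
print(iou(pred, truth))             # 2 shared building pixels out of 4 in the union
```

Accuracy counts background pixels as correct, so it can remain high even when building footprints are poorly delineated; IoU ignores true negatives and is therefore more sensitive to errors on the building class.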

Author Response

Thank you for your letter and for the comments concerning our manuscript entitled “Building extraction based on U-Net with an attention block and multiple losses” (Manuscript ID remotesensing-741084). The comments are all valuable and very helpful for revising and improving our paper, and they provide important guidance for our research. To improve the manuscript, we have made significant revisions and changed the structure of the article so that it can be better understood. We have studied the comments carefully and have checked the layout proof carefully. Point-by-point responses to the reviewer’s comments are listed in the attachment.

Author Response File: Author Response.pdf

Reviewer 3 Report

The paper focuses on preserving the boundaries of building shapes when extracting buildings with U-Net, by using an attention block and multiple losses. In general, the motivation and originality of the work look fine. However, the manuscript should be further improved before it can be accepted by the journal.

 

  • Introduction and Related Work sections: the organization of these sections is very confusing. Some of the content in the Related Work section should be moved to the Introduction section. For example, explanations of the proposed method are not related work and should be moved to the Introduction. Moreover, these two sections should be restructured and rewritten, since they are quite difficult to understand in this version.

 

  • Methods: It is quite difficult to understand the concept of the methodology part. It needs more detailed explanations with some related figures.
    • All the notations used in the text and equations should be defined.
    • What are the features (x) in your model?
    • What is PSPNet? Why do you select skip-connected approach instead of the PSPNet?
    • Most of the equations written in the manuscript are difficult to understand (e.g., equations (1) to (4)). Please check whether the equations are mathematically correct.
    • According to equations (6) and (7), loss_pbi is written two times. Are they the same? If yes, it is unfair. If the two are different, they should be given different notations.

 

  • Experiments
    • The datasets used for both training and validation need more detailed explanations, with some example figures.
    • What is AMUNet?
    • Why are there two Fig. 7s? What are (a) to (d) in the second Fig. 7?

 

  • There are some typos, grammar errors, and format errors that are not appropriate for the journal. Please carefully double-check the entire manuscript.

Author Response

Thank you for your letter and for the comments concerning our manuscript entitled “Building extraction based on U-Net with an attention block and multiple losses” (Manuscript ID remotesensing-741084). The comments are all valuable and very helpful for revising and improving our paper, and they provide important guidance for our research. To improve the manuscript, we have made significant revisions and changed the structure of the article so that it can be better understood. We have studied the comments carefully and have checked the layout proof carefully. Point-by-point responses to the reviewer’s comments are listed in the attachment.

Author Response File: Author Response.pdf

Reviewer 4 Report

My main concern about the paper is that the methodology is not clear in many aspects:

  1. The attention idea has already been used in the remote sensing community.
  2. The U-Net is not clearly described.
  3. How do you incorporate the attention layer?
  4. Where are the multiple losses?
  5. The experiments should include a sensitivity analysis, not just results.
  6. Compare your results against the state of the art.

 

Author Response

Thank you for your letter and for the comments concerning our manuscript entitled “Building extraction based on U-Net with an attention block and multiple losses” (Manuscript ID remotesensing-741084). The comments are all valuable and very helpful for revising and improving our paper, and they provide important guidance for our research. To improve the manuscript, we have made significant revisions and changed the structure of the article so that it can be better understood. We have studied the comments carefully and have checked the layout proof carefully. Point-by-point responses to the reviewer’s comments are listed in the attachment.

Author Response File: Author Response.pdf

Reviewer 5 Report

This paper implements an attention-gate-based U-Net model with multiple loss functions for optimization. It adopts two datasets, INRIA and AIRS. It compares its results to a few other published networks and claims better performance.

This paper lacks proper language and presentation. For example, Figure 1 does not include any notation or explanation of the blocks in the figure. The authors have used two different citation styles in this paper; for example, on page 4, section 3.1, both styles appear in the same paragraph. There are also spelling mistakes, and some mathematical symbols are pasted in as snapshots when they should be typeset properly. For example, a spelling mistake appears on page 3, line 100 (“ReuseNet applied sematic segmentation…” should read "semantic" instead of "sematic"), while the sigma symbol is attached as a snapshot instead of being typeset on page 4, line 176 and page 5, line 209. Many sentences in the paper are either incomplete or do not make clear sense.

It is not clear to me what the exact contribution of the authors is; U-Net, attention-based U-Net, and multiple losses have all been used in previous works. Did the authors mainly add these features together?

This paper describes the methodology in section 3. It implements the attention-based model and compares its performance with other networks such as multitask-loss-based SegNet, 2-Level UNet, MSMT, and GAN-SCA. The first two results presented in Table 1 are an exact copy of the results presented in Table 1 of the 2-Level UNet paper. However, this paper does not describe a matching experimental setup that would show that the authors obtained exactly the same results as the 2-Level UNet paper. Rather, the experimental setup in the 2-Level UNet paper is different from the one in this paper.

Tables 2 and 3 of this paper present a generalized comparison of the proposed method with the UNet and Attention UNet models. Although Attention UNet is one of the pioneering papers on attention-based architectures for semantic segmentation, no theoretical difference is explained or mentioned in section 3, where the authors describe their methodology. It would be a good idea to describe the differences in attention gate architecture between the two papers in the methodology section. Also, when listing the proposed method in these comparison tables, it is good practice to write the name of the method (e.g., AMUNet) instead of “our” method.

This paper mentions in #322 that the authors used 30 images for training and 5 images for evaluation to produce figure 4, while it states in #285 that the authors will use 180 images. The reason for changing the number of samples used for the figure is not clear.

 

 

 

Author Response

Thank you for your letter and for the comments concerning our manuscript entitled “Building extraction based on U-Net with an attention block and multiple losses” (Manuscript ID remotesensing-741084). The comments are all valuable and very helpful for revising and improving our paper, and they provide important guidance for our research. To improve the manuscript, we have made significant revisions and changed the structure of the article so that it can be better understood. We have studied the comments carefully and have checked the layout proof carefully. Point-by-point responses to the reviewer’s comments are listed in the attachment.

Author Response File: Author Response.pdf

Round 2

Reviewer 1 Report

Accept in present form

Author Response

Thanks

Reviewer 3 Report

Most of my comments have been addressed in the revised manuscript.

Author Response

Thanks

Reviewer 4 Report

The authors have answered my comments.

Author Response

Thanks

Reviewer 5 Report

Thank you for addressing most of my concerns. However, there are still some grammatical errors. I suggest that the authors revise the text carefully.

Author Response

Thanks! A native speaker has helped us improve the grammar of the paper.

Author Response File: Author Response.pdf
