Article
Peer-Review Record

MFFNet: A Building Extraction Network for Multi-Source High-Resolution Remote Sensing Data

Appl. Sci. 2023, 13(24), 13067; https://doi.org/10.3390/app132413067
by Keliang Liu 1, Yantao Xi 1,*, Junrong Liu 2, Wangyan Zhou 1 and Yidan Zhang 1
Reviewer 1:
Reviewer 2: Anonymous
Submission received: 25 October 2023 / Revised: 29 November 2023 / Accepted: 5 December 2023 / Published: 7 December 2023
(This article belongs to the Special Issue Deep Learning in Satellite Remote Sensing Applications)

Round 1

Reviewer 1 Report

Comments and Suggestions for Authors

The presented abstract outlines a significant contribution to the field of remote sensing and object extraction using high-resolution imagery. Deep learning approaches, particularly within the domain of building extraction from satellite and aerial (UAV) imagery, have seen progressive advancement in recent years. The authors' proposition of a Multi-Feature Fusion Network (MFFNet) ostensibly fills this gap, offering a promising solution to the challenge of building extraction from various high-resolution remote sensing data sources.

 

Despite the strong results, the review draws attention to several critical points:

  • Citations and Geographic Bias: There is an indication of a geographic bias in citations and references. This presents a concern regarding the model’s tested applicability across diverse geographical locations and its generalisation capabilities. Further research should aim to diversify the citation scope and empirically test the model in various global regions.
  • Model Limitations: The authors appear to have focused predominantly on the superiority of their method over traditional segmentation models without a thorough discussion on the limitations of MFFNet. This omission is significant as it can inform future research directions and the practical deployment of the network.
  • Dataset Limitations: The clarity regarding the limitations and implications of the datasets used for testing is not sufficiently addressed. It is unclear how the diversity, size, and quality of these datasets might impact the overall performance and adaptability of the network.
  • Influence of Building Standards and Materials: The reviewer expresses concerns about the lack of discussion on how varying building standards and the use of different building materials have influenced the model's training and performance. These factors can significantly affect the model's practical application, particularly in scenarios where building materials and construction styles vary dramatically between regions.

The MFFNet introduces an innovative approach to building extraction from high-resolution remote sensing data. The preliminary results reported in the abstract are promising, showing potential for practical applications. However, to comprehensively understand the capabilities and limitations of MFFNet, the authors should address the highlighted concerns. A detailed exploration of the network's performance in diverse geographic locations, an in-depth analysis of its limitations, and a clear understanding of the influence of building materials and standards on the training process would greatly enhance the paper’s contribution to the field.

Author Response

We thank our editors and reviewers for their careful reading of our manuscript and thoughtful comments. We are humbled that our efforts have been well received. Below, we have addressed all questions and comments and indicated the changes in the manuscript.

 

1. Citations and Geographic Bias: There is an indication of a geographic bias in citations and references. This presents a concern regarding the model’s tested applicability across diverse geographical locations and its generalisation capabilities. Further research should aim to diversify the citation scope and empirically test the model in various global regions.

Response: We are grateful to the reviewers for their assessment of our article. The primary goal of our study was to demonstrate the effectiveness of our MFFNet in extracting buildings from high-resolution remote sensing data. Given the limitations in acquiring high-resolution data, we validated our model on three distinct datasets: the Jilin-1 Satellite Remote Sensing Image Dataset, the Massachusetts Building Dataset, and the WHU Building Dataset. The Jilin-1 dataset comprises imagery of selected areas of Xi'an, China, captured by sensors on the Jilin-1 satellite. The Massachusetts Building Dataset covers buildings in the Massachusetts region of the United States. The WHU Building Dataset is a publicly available dataset from Southeast Asia, compiled from both drone and satellite imagery. Both the Massachusetts and WHU Building Datasets are widely used by scholars and are referenced in our citations. Because these datasets were acquired with different sensors and cover different regions, we believe they demonstrate that our model has a degree of generalization capability.

 

2. Model Limitations: The authors appear to have focused predominantly on the superiority of their method over traditional segmentation models without a thorough discussion on the limitations of MFFNet. This omission is significant as it can inform future research directions and the practical deployment of the network.

Response: We acknowledge that our original manuscript did not extensively discuss the limitations of our model; initially, we elaborated only on the limitations concerning the comparison with ViT. Based on the reviewers' valuable feedback, we have now expanded the discussion to address these limitations more comprehensively. We appreciate the reviewers pointing out this oversight and thank them for their constructive input.

 

3. Dataset Limitations: The clarity regarding the limitations and implications of the datasets used for testing is not sufficiently addressed. It is unclear how the diversity, size, and quality of these datasets might impact the overall performance and adaptability of the network.

Response: We believe that the datasets used for our testing are sufficiently diverse. We used the Jilin-1, Massachusetts, and WHU building datasets to validate the model. Among them, the Massachusetts and WHU building datasets are widely used by researchers: the Massachusetts building dataset is a small-scale building dataset, while the WHU building dataset is a large one. We have cited them in the article (e.g., Refs. 13 and 36). To ensure sufficient and diverse training samples for our self-built Jilin-1 building dataset, we randomly selected areas from the Chang'an district of Xi'an, China, and cut more than 4,000 images. Although the composition of these datasets differs, MFFNet shows excellent performance on all of them in comparison with other models.

 

4. Influence of Building Standards and Materials: The reviewer expresses concerns about the lack of discussion on how varying building standards and the use of different building materials have influenced the model's training and performance. These factors can significantly affect the model's practical application, particularly in scenarios where building materials and construction styles vary dramatically between regions.

Response: We thank the reviewers for their suggestions. To validate our model, we used the Jilin-1, Massachusetts, and WHU building datasets. The Jilin-1 samples come from Xi'an, China; the Massachusetts samples come from Massachusetts, USA; and the WHU samples come from Southeast Asia. The architectural styles of these three areas differ. For each dataset, we divided the data into training, validation, and test sets. We trained the model on the training set for each specific region, tracked its performance on the validation set, and evaluated the extraction results on the test set. Through this procedure, we found that MFFNet performs excellently when trained and tested on the same building dataset, with slight differences between datasets but generally good results.
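The per-dataset protocol described above (an independent train/validation/test split for each dataset, with extraction quality scored on the held-out test set) can be sketched as follows. The 70/15/15 split fractions and the set-based IoU metric are illustrative assumptions, not values taken from the paper:

```python
import random

def split_dataset(n_samples, train_frac=0.7, val_frac=0.15, seed=0):
    """Randomly partition sample indices into train/val/test index lists.

    Fractions are illustrative; the paper does not state its split ratios.
    """
    rng = random.Random(seed)
    indices = list(range(n_samples))
    rng.shuffle(indices)
    n_train = int(n_samples * train_frac)
    n_val = int(n_samples * val_frac)
    return (indices[:n_train],
            indices[n_train:n_train + n_val],
            indices[n_train + n_val:])

def iou(pred, target):
    """Intersection over union of two binary masks given as pixel-id sets."""
    union = len(pred | target)
    return len(pred & target) / union if union else 1.0

# Split each dataset independently, e.g. the ~4,000 self-built Jilin-1 tiles.
train, val, test = split_dataset(4000)
```

Training on `train`, monitoring on `val`, and reporting IoU on `test` for each dataset separately is what allows the per-region comparison the response describes.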

 

We have also made the improvements suggested by the reviewers. We carefully verified the citations of the references, and the revised references now match the article. We cite these articles for the following reasons: ① We designed a new deep learning network model to extract buildings, so we cite some traditional building extraction methods to highlight the advantages of deep learning over traditional approaches. ② In the related work, we discuss the development of deep learning to explain the basis for designing such a model; therefore, in that section we cite some articles on classic deep learning models. ③ Deep learning has been used in many fields of remote sensing. Beyond building extraction, different model structures have emerged for complex scenes, and it is with reference to these structures that we proposed MFFNet.

We did oversimplify the conclusion, so we have revised it in the article. In response to the reviewers' concerns, we also comprehensively reviewed and revised the article.

Author Response File: Author Response.docx

Reviewer 2 Report

Comments and Suggestions for Authors

The manuscript entitled “MFFNet: A Building Extraction Network for Multi-source High-resolution Remote Sensing Data” presents the capability of MFFNet on extraction of buildings in high-resolution RS data.

The manuscript does not support the title because multi-source data refers to datasets from different types of sensors. Also, the datasets are very high resolution.

The Abstract should be fully revised since it is far from the standard abstract.

The paper is not written well and requires several corrections. For instance, the sentence in line 306 is incomplete.

Lines 301 and 302, the authors should change Formula to Equation.

The proposed method has shown the best results among the state-of-the-art methods; however, the improvements are less than 2%.

Comments on the Quality of English Language

The manuscript needs comprehensive corrections.

Author Response

We thank our editors and reviewers for their careful reading of our manuscript and thoughtful comments. We are humbled that our efforts have been well received. Below, we have addressed all questions and comments and indicated the changes in the manuscript.

 

1. The manuscript does not support the title because multi-source data refers to datasets from different types of sensors. Also, the datasets are very high resolution.

Response: Thank you for your valuable feedback. Our research demonstrates that MFFNet can be applied to multi-source remote sensing data. Our experiments used three datasets: the Jilin-1, Massachusetts, and WHU building datasets, which are derived from different sensors. The Jilin-1 building dataset comes from Jilin-1 satellite remote sensing images; the Massachusetts building dataset comes from aerial remote sensing images; and the WHU building dataset was generated from QuickBird, WorldView, IKONOS, ZY-3, and UAV remote sensing images. Through experiments on these three datasets, we demonstrate that MFFNet performs better building extraction on remote sensing data from different sensors. Additionally, regarding the use of high-resolution images, these images provide rich detail features. We chose them to more precisely extract features such as the shapes and contours of buildings, thereby better verifying the effectiveness of our algorithm.

 

2. The Abstract should be fully revised since it is far from the standard abstract.

Response: Thanks to the reviewers for their valuable comments, we have revised the abstract.

 

3. The paper is not written well and requires several corrections. For instance, the sentence in line 306 is incomplete.

Response: Line 306 is an explanation of Equation 2 mentioned above. Due to our oversight, we did not accurately convey our intended meaning, so we have made revisions to it. Additionally, we have made further revisions to other parts of the paper. 

 

4. Lines 301 and 302, the authors should change Formula to Equation.

Response: Thanks to the reviewer for pointing out the error. It was indeed our mistake and has now been corrected.

 

5. The proposed method has shown the best results among the state-of-the-art methods; however, the improvements are less than 2%.

Response: It is a privilege to address this question from the reviewers. In the current landscape, many models achieve commendable results, with most models boasting an accuracy rate above 90%. While the improvement afforded by MFFNet might be less than 2%, it still represents a noteworthy advancement. Furthermore, the superiority of our model is also reflected in the visual results: as demonstrated in Figures 8, 9, and 10 of our paper, the visualizations of our segmentation results further illustrate the efficacy of MFFNet.

 

We did oversimplify the conclusion, so we have revised it in the article. In response to the reviewers' concerns, we also comprehensively reviewed and revised the article.

Author Response File: Author Response.docx

Round 2

Reviewer 2 Report

Comments and Suggestions for Authors

Thanks, the manuscript is revised based on the given comments. 
