Article
Peer-Review Record

A Land Cover Classification Method for High-Resolution Remote Sensing Images Based on NDVI Deep Learning Fusion Network

Remote Sens. 2022, 14(21), 5455; https://doi.org/10.3390/rs14215455
by Jingzheng Zhao 1, Liyuan Wang 2, Hui Yang 3, Penghai Wu 1,4,5, Biao Wang 1,4, Chengrong Pan 6,* and Yanlan Wu 1,4,5,7
Reviewer 1: Anonymous
Reviewer 2:
Submission received: 1 September 2022 / Revised: 18 October 2022 / Accepted: 25 October 2022 / Published: 30 October 2022

Round 1

Reviewer 1 Report

REVIEW SUMMARY

 

The authors present a work in which they use a Deep Learning (Fusion) Network to classify land cover based on a combination of NDVI and the images’ spectral bands. The approach is shown using high-resolution GF-1 (Gaofen-1) optical satellite images with 4 bands (blue, green, red, near infrared).

 

The authors’ claim is that their approach achieves a higher accuracy for land cover and land cover change classification and performs better than state-of-the-art methods.

 

The topic is very interesting and timely since correct information about land cover and its changes is required in many applications. However, I have several concerns, which – in my opinion – need to be addressed before I can recommend publication of the manuscript. The main points are: the unclear description and role of the “fusion” network, an incomplete state-of-the-art description, and findings/results that do not fully support the claims. Below are specific comments.

 

SPECIFIC COMMENTS

 

[1] The NDVI layer is provided separately and calculated individually – although this information is already implicitly contained in the four bands. Why is the network not able to learn this relationship between the red band and the NIR band itself? If I understood correctly that the NDVI layer is used as an additional input to the deep learning network, it is unclear to me what the “fusion” part is.
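As a minimal illustration of this point (the band order and helper name below are assumptions for the sketch, not taken from the manuscript), NDVI is a simple per-pixel function of the red and NIR bands, so a network receiving all four bands could in principle learn it:

```python
import numpy as np

def add_ndvi_channel(image: np.ndarray) -> np.ndarray:
    """Append an NDVI channel to a 4-band image.

    Assumes a float array of shape (H, W, 4) with bands ordered
    blue, green, red, near-infrared (typical GF-1 ordering).
    """
    red = image[..., 2].astype(np.float64)
    nir = image[..., 3].astype(np.float64)
    # NDVI = (NIR - Red) / (NIR + Red); guard against division by zero.
    denom = nir + red
    ndvi = (nir - red) / np.where(denom == 0, 1.0, denom)
    return np.concatenate([image, ndvi[..., None]], axis=-1)

# Example: a 2x2 patch with Red = 1 and NIR = 3 everywhere -> NDVI = 0.5
patch = np.ones((2, 2, 4))
patch[..., 3] = 3.0
out = add_ndvi_channel(patch)
```

The sketch makes the reviewer's question concrete: the appended channel is a fixed, differentiable function of channels already present in the input.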

 

[2] The terms high intraclass variability and low interclass separability are introduced several times (including in the abstract) but never explained or defined. The terms should be defined to allow readers to understand what the authors mean.

 

[3] Page 2: The literature and state-of-the-art review starting from line 49 is in places incomplete. For example: “LCC methods can be divided into manual feature-based methods and deep learning methods” -> One of the most popular methods is random forests, which is neither of them. In the following paragraph(s) the authors mention several additional features (e.g., based on shape), but there is also time, in particular since this is mentioned in Line 108. The claim that Fully Convolutional Neural Networks require low computer performance is not clear to me. The terms “contextual semantic information” and “contextual semantic information guidance” are unclear to me. The object-based image analysis (OBIA) methods are insufficiently described and referenced; core references of the OBIA method are missing. Further, existing work on similar deep learning methods that also use NDVI is missing (there are quite a number of references for deep learning for LC in general, but not specifically for what the authors claim). It is therefore difficult to evaluate the novelty.

 

[4] Line 134: Please explain how GF-1 provides a clearer image coverage.

 

[5] Section 2.1: There is a mix between methods and data/materials. This should be made clearer or separated.

 

[6] The section headlines 2.2.3 and 2.2.4 are a bit confusing – perhaps a different headline in one of them would make this clearer.

 

[7] Figure 3: Please explain the classes. Although the image patches are quite small, there are obvious errors in the classification compared to the labels: For example, in image 4 there is quite a big area labelled as Shrub which does not appear in the classified images. In image 5 there are larger areas labelled as Other, which are classified differently. This needs to be discussed. Some classes require explanation/definition, e.g., “Special” and “Other”, “Forest” and “Shrub”, and “Residential” and “Industrial”. Further, in the beginning of the introduction land cover changes are highlighted, but the classes provided here are not change classes.

 

[8] Figure 5 and figure 3 should have the same colours for the classes.

 

[9] The discussion should have a section about the strengths and weaknesses of the approach, including back-references to the state of the art.

Author Response

Please see the attachment.

Author Response File: Author Response.docx

Reviewer 2 Report

This article proposes a deep learning fusion network that uses NDVI, called the Dense-Spectral-Location-NDVI network (DSLN).

The manuscript mentions that the data used for training consist of 15 GF-1 images from 2015 and 2020. Given the relatively small training sample, was the risk of overfitting considered during training? I am not sure whether only 15 images can train a general and stable model.

It is unclear whether the results presented in the Results section are testing results or validation results. More details about training and validation are needed, e.g., the training accuracy, validation accuracy, and learning curves, which could be used to observe whether the models are overfitted or underfitted.
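As an illustrative sketch of what such learning curves reveal (the function and the toy loss values below are hypothetical, not taken from the manuscript), overfitting typically shows up as validation loss rising while training loss keeps falling:

```python
def diverging_epoch(train_losses, val_losses, patience=2):
    """Return the first epoch at which validation loss has increased
    for `patience` consecutive epochs, or None if it never does."""
    streak = 0
    for epoch in range(1, len(val_losses)):
        if val_losses[epoch] > val_losses[epoch - 1]:
            streak += 1
            if streak >= patience:
                return epoch
        else:
            streak = 0
    return None

# Toy curves: training loss keeps falling, validation loss turns upward.
train = [0.9, 0.6, 0.4, 0.3, 0.2, 0.15]
val = [0.95, 0.7, 0.55, 0.6, 0.65, 0.7]
print(diverging_epoch(train, val))  # -> 4
```

Reporting curves like these alongside the final accuracies would let readers judge whether the small training set led to overfitting.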

Author Response

Please see the attachment.

Author Response File: Author Response.docx

Round 2

Reviewer 2 Report

The manuscript has been improved and the authors have taken my comments into account.
