Article
Peer-Review Record

Meta-FSEO: A Meta-Learning Fast Adaptation with Self-Supervised Embedding Optimization for Few-Shot Remote Sensing Scene Classification

Remote Sens. 2021, 13(14), 2776; https://doi.org/10.3390/rs13142776
by Yong Li 1, Zhenfeng Shao 1,*, Xiao Huang 2, Bowen Cai 3 and Song Peng 1
Reviewer 1: Anonymous
Reviewer 2: Anonymous
Reviewer 3: Anonymous
Reviewer 4: Anonymous
Submission received: 25 May 2021 / Revised: 30 June 2021 / Accepted: 7 July 2021 / Published: 14 July 2021

Round 1

Reviewer 1 Report

This manuscript presents a meta-learning fast-adaptation method with self-supervised embedding optimization for few-shot remote sensing scene classification. In this context, the authors intend to solve two problems of few-shot remote sensing scene classification across different cities:

  1. The performance of the trained model must be tested on new remote sensing scenes from different cities that are not included in the training set, and
  2. The trained model is expected to be useful with only a few labeled samples from remote sensing scenes in unknown cities (see the sketch after this list).
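
To make this few-shot, cross-city evaluation setting concrete, the following is a minimal sketch of how N-way K-shot episodes are typically sampled from scenes of a city that was never seen during training. All function names, the dataset layout, and the toy data here are illustrative assumptions, not the authors' implementation.

    import random
    from collections import defaultdict

    def sample_episode(dataset, n_way=5, k_shot=1, n_query=15, seed=None):
        """Sample one N-way K-shot episode from (image, label, city) records.

        Cross-city evaluation simply means the records passed in here come
        from cities that were never seen during meta-training.
        """
        rng = random.Random(seed)
        by_class = defaultdict(list)
        for image, label, city in dataset:
            by_class[label].append(image)

        classes = rng.sample(sorted(by_class), n_way)
        support, query = [], []
        for episode_label, cls in enumerate(classes):
            images = rng.sample(by_class[cls], k_shot + n_query)
            support += [(img, episode_label) for img in images[:k_shot]]
            query += [(img, episode_label) for img in images[k_shot:]]
        return support, query

    # Toy "unseen city" pool with 5 scene classes and 20 samples per class.
    toy_pool = [(f"img_{c}_{i}", c, "unseen_city") for c in range(5) for i in range(20)]
    support_set, query_set = sample_episode(toy_pool, n_way=5, k_shot=1, seed=0)
    print(len(support_set), len(query_set))  # 5 support images, 75 query images

Only the tiny support set is labeled; the model is then judged on the query images from the same unseen city, which is presumably the setting behind the 5-way, 1-shot to 5-shot results reported in the manuscript.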

My general comments on this paper are below:

  • The paper formatting/writing needs to be revised carefully. There are many repetitions and spelling errors in the paper.
  • According to the definition of transfer learning (“transfer learning focuses on storing knowledge gained while solving one problem and applying it to a different but related problem”), the targets of the meta-learning approach are the same as those of transfer learning. So, defining this solution on top of transfer learning is not useful, and I suggest correcting it.
  • The methodology section is vague and the authors need to clearly explain what their designed network architecture is (I suggest that they add a detailed flowchart of the method including all the layers) and clarify the hyper-parameters used, etc.
  • The experiments and results sections need to be revised. They should include all the test parameters so that the method can be reproduced by other researchers as well.
  • Table 1 is not clear enough. I wonder under what conditions the authors obtained the results. How can they fairly compare their results to the existing methods?

 

Author Response

Please see the attachment.

Author Response File: Author Response.docx

Reviewer 2 Report

First of all, I would like to congratulate the authors on their very interesting research.

I would like to suggest that you make some corrections:

  • line 326 - use the same font for the symbol "S" in the formula as in the text;
  • figure 8 - the vertical axis label should, I suppose, be "standard deviation";
  • use the same fonts in all figures.

As additional comments, I would like the authors to consider putting both models on the same plot (Figures 7 and 9, and Figures 8 and 10); in my opinion, it is easier to draw conclusions from a comparison of the two models. I also expect a somewhat better description of Figure 5, including some conclusions drawn from it.

General English comment: articles are usually written in the passive voice.

Author Response

Please see the attachment.

Author Response File: Author Response.docx

Reviewer 3 Report

The topic under consideration is complex and requires careful, in-depth treatment, but it is absolutely clear. It is also clear that achieving the desired result requires a step-by-step improvement of the model. The authors have done painstaking work, and the ways of improving it are visible, including to the authors themselves; first of all, this means taking into account nonlinearity of various degrees in the model. The direction of the work is clear to everyone: to teach, to teach, and to teach again; to teach through learning.

Author Response

Please see the attachment.

Author Response File: Author Response.docx

Reviewer 4 Report


The manuscript "Meta-FSEO: A Meta-learning Fast Adaptation with Self-super-2 vised Embedding Optimization for few-shot remote sensing 3 scene classification" proposes a meta-learning algorithm (Meta-FSEO) to improve model learning from few-shot samples.

Please consider the following comments:


"during the meta-testing stage where hyper-parameter tuning and model selection occur" (line 206, page 5) - Why parameter tuning is performed in the test set? Do you mean validation set, as later referenced in the manuscript? (for example, line 276, page 8)

"All samples are collected in 5 categories, and each sample is collected from 1 shot to 5 shots for model verification." (line 280, page 8)

"All models are trained for 75,000 iterations" (line 305, page 9) - How the number of iterations is selected? Why all models use the same epoch if they have different structures?

"we pre-train it on a large dataset" (line 314, page 9) - What "large dataset" the pre-train step refers to?

"also optimizes the query set through a Self-supervised Embedding Optimization (SEO) module" (line 445, page 16) - How this optimization step is used in the test dataset used to report the model accuracy? How the model is used during a "prediction" step?

 

Author Response

Please see the attachment.

Author Response File: Author Response.docx

Round 2

Reviewer 1 Report

Thanks to the authors for updating their paper based on my former comments. Unfortunately, the paper is still far from coherent. Normally, I should have rejected this paper due to its lack of coherence despite my last comments, but I would like to give it one last chance. I hope these comments will help the authors to improve their manuscript: when I read their paper, I feel dizzy! The structure of the paper needs a lot of improvement. The authors tend to jump back and forth between concepts, making the reader confused, and this applies to all parts of the text. Unfortunately, at this stage, as I cannot understand what exactly their contribution is, I cannot let this paper proceed. I ask the authors to RE-WRITE their manuscript in a more organized way so that the reviewers understand what they have done.

Here is my recommendation to the authors: you are using a new concept in remote sensing, so you need to first clearly explain what the state of the art is. Please add a section before your methodology explaining the background knowledge. All the existing information related to meta-learning should go into this section; give examples and define all the parameters and formulas so that the reader understands the concept. Then add a methodology section and use the terminology from the background section to explain what it is that you have done. This way, it will be clear to the reviewers what already existed and what your contribution is. You also need to clarify the difference between your method and your reference methods, MAML and Meta-SGD; Meta-SGD is used as a reference method but is never defined.

Another confusing point is that the aim of the work is to reduce the number of training samples. However, the authors used 180K images with a ratio of 3:1:1 for training:validation:test, which means 108K images are used for training. Can you explain how you reduced the number of training samples?

Here are more comments to improve the paper:

  • The authors use abbreviations that are not defined, e.g., Meta-FSEO: what does it stand for? Other examples: NLP, GPT, etc. Please fix all of them.
  • The figures need to be described clearly in the text in the order in which they appear: Figure 2 is described before Figure 1, and the location of Algorithm 1 is also not relevant to the surrounding text. Please fix all of these.
  • L216-217: vague! Please explain better.
  • L223: what is Meta-SEO?
  • L229: what is the 4-layer CNN? What are its components? What are transformers?
  • L231: what elements?
  • The parameters of the equations in the text are not defined; please fix this, e.g., in Eq. 1, what are f, y, x, etc.?
  • L237: what is Fseo?
  • L189: what are the natures of x and a?
  • Is the term “Self-unsupervised” a legitimate term? I think it should be either self-supervised or unsupervised. If this term has been used elsewhere in the literature, please cite it; otherwise, please fix it.
  • L155: the transformers ARE…
  • Fig. 1: what is “Classific”? What are the elements of the encoder? What do these signs mean: the diamond, the cross in a circle? Please add all the details related to each step in your diagram and clearly explain them in the text from top to bottom, avoiding jumping back and forth.

Author Response

Please see the attachment.

Author Response File: Author Response.docx

This manuscript is a resubmission of an earlier submission.

