Article
Peer-Review Record

A Rotation-Invariant Optical and SAR Image Registration Algorithm Based on Deep and Gaussian Features

Remote Sens. 2021, 13(13), 2628; https://doi.org/10.3390/rs13132628
by Zeyi Li 1,2, Haitao Zhang 1,2,* and Yihang Huang 1,2
Reviewer 1: Anonymous
Reviewer 2: Anonymous
Submission received: 26 May 2021 / Revised: 17 June 2021 / Accepted: 29 June 2021 / Published: 4 July 2021

Round 1

Reviewer 1 Report

One of the many challenges associated with exploiting the same scene from two images collected by two different modalities is the relative rigid body rotation present. Co-registration of two such images is a key enabler for any further processing. The paper presents a novel approach to solving the rigid body rotation problem via a two-step process. The first step uses an MLP-style neural network operating on the histograms of the image gradients to estimate the relative rigid body rotation between the images. Training data is derived from the SEN1-2 database. The second step is the registration step, where a descriptor algorithm is applied to the corresponding Gaussian pyramids of oriented gradients of the two images. Promising results are presented when operating on SEN1-2 data that (I assume) was not part of the training set.

The paper requires a thorough scrubbing to produce appropriate English usage and style. I will provide only the most serious language-usage issues, since the vast number of corrections required to produce a readable paper for an English-speaking audience is beyond the time demands of this review.

Most of the acronyms used in the paper are not defined in the text.  Please include the definition of all acronyms at the first usage.  

Abstract: It would be helpful to the reader, particularly a reader who may be new to the field and/or the algorithms being applied, to provide a clear and concise problem statement. Note that the introduction does a reasonable job of supporting the problem at hand, but even there the discussion is not as direct as I would like.

Introduction, lines 78-82: Though it is true that noise is a complicating issue for common key point extraction procedures (like all extraction algorithms), it is also true that the scattering phenomenology differences between the two sensors are a driver. Some discussion on this point within the context of the proposed methodology is in order.

Section 2.1, line 135: "...due to radiation differences...". Here and throughout the paper, the reference to the optical images' gray-scale values must be changed from "radiation" to "irradiance".

Section 2.1, line 137:  "...the histogram gradients should contain information about the rotation of the image."  Since this is the primary hypothesis that is the key to this entire work, some discussion as to why this is true should be presented.  Otherwise, if the motivation for using the gradient histograms was simply a numerical experiment, "that worked", then simply state that.  

For the sake of completeness, and to provide further confidence in the results presented, it is necessary to describe which images from the SEN1-2 database were used in the training set versus in the experiments.

The authors attempted to provide an acronym database at the end of the paper.  It is woefully incomplete.  

 

Author Response

Response to review comments

First of all, we would like to thank the editors and the reviewers for all the effort put into reviewing this paper. We have carefully read all the comments and made the following responses. We have also made some changes in the text, which are marked in red in the revised manuscript. We hope these responses will address your questions.

Here are the responses to review comments:

Comment 1:

The paper requires a thorough scrubbing to produce appropriate English usage and style. I will provide only the most serious language-usage issues, since the vast number of corrections required to produce a readable paper for an English-speaking audience is beyond the time demands of this review.

  • Response:

Thank you for your suggestions. Following your suggestion, we have refined most of the expressions. The specific modifications are shown in the revised manuscript; some of them are listed here.

Comment 2:

Most of the acronyms used in the paper are not defined in the text.  Please include the definition of all acronyms at the first usage.  

  • Response:

Thank you for your suggestions. We have gone through the paper and now define every abbreviation where it first appears.

Comment 3:

Abstract: It would be helpful to the reader, particularly a reader who may be new to the field and/or the algorithms being applied, to provide a clear and concise problem statement. Note that the introduction does a reasonable job of supporting the problem at hand, but even there the discussion is not as direct as I would like.

  • Response:

Thank you for your suggestions. Following your suggestion, we have modified the description in the abstract to state the problem more directly.

Comment 4:

Introduction, lines 78-82: Though it is true that noise is a complicating issue for common key point extraction procedures (like all extraction algorithms), it is also true that the scattering phenomenology differences between the two sensors are a driver. Some discussion on this point within the context of the proposed methodology is in order.

  • Response:

Thank you for your suggestions. Following your suggestion, we agree that "scattering phenomenology differences" is more explicit than "nonlinear radiation distortions" in this place, and we have revised the text accordingly.

Comment 5:

Section 2.1, line 135: "...due to radiation differences...". Here and throughout the paper, the reference to the optical images' gray-scale values must be changed from "radiation" to "irradiance".

  • Response:

Thank you for your suggestions. Following your suggestion, we have changed every occurrence of "radiation" to "irradiance".

 

Comment 6:

Section 2.1, line 137:  "...the histogram gradients should contain information about the rotation of the image."  Since this is the primary hypothesis that is the key to this entire work, some discussion as to why this is true should be presented.  Otherwise, if the motivation for using the gradient histograms was simply a numerical experiment, "that worked", then simply state that. 

  • Response:

Thank you for your suggestions. We introduced histograms for two reasons. First, the dimension of the histogram is much smaller than the size of the image, so if the histogram reflects the rotation relationship, the complexity of the rotation problem is greatly reduced. Second, using the histogram as the input frees the neural network from any constraint on image size: an image of any size can be reduced to a histogram and fed into the network.
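To make the second point concrete, here is a minimal sketch of a magnitude-weighted gradient orientation histogram in pure Python. It is illustrative only, not the authors' implementation: the function name, the central-difference gradients, and the 36-bin count are assumptions. The key property is that the output length depends only on the bin count, never on the image size, so a fixed-size MLP can accept images of any dimensions.

```python
import math

def gradient_orientation_histogram(img, n_bins=36):
    """Magnitude-weighted histogram of gradient orientations.

    `img` is a 2-D list of gray values. The returned vector has length
    `n_bins` regardless of the image dimensions, which is what lets a
    fixed-input-size network process arbitrarily sized images.
    """
    h, w = len(img), len(img[0])
    hist = [0.0] * n_bins
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            # Central-difference gradients
            gx = (img[y][x + 1] - img[y][x - 1]) / 2.0
            gy = (img[y + 1][x] - img[y - 1][x]) / 2.0
            mag = math.hypot(gx, gy)
            ang = math.atan2(gy, gx)  # orientation in [-pi, pi]
            # Map the orientation to one of n_bins equal angular bins
            b = int((ang + math.pi) / (2 * math.pi) * n_bins) % n_bins
            hist[b] += mag
    total = sum(hist) or 1.0  # guard against an all-flat image
    return [v / total for v in hist]
```

Rotating the image approximately cyclically shifts this histogram, which is the intuition behind regressing the rotation angle from it.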

Comment 7:

For the sake of completeness, and to provide further confidence in the results presented, it is necessary to describe which images from the SEN1-2 database were used in the training set versus in the experiments.

  • Response:

Thank you for your suggestions. 1000 images from summer_18 and spring_50 were used in our training set, and 600 images from spring_24, spring_98, summer_49, fall_29, fall_52 and winter_54 were used in our test set. We sorted the images and eliminated the less informative ones when building the dataset. Figure 1 shows a SAR image from the spring_50 dataset, named ROIs1158_spring_s1_50_p25.

Comment 8:

The authors attempted to provide an acronym database at the end of the paper.  It is woefully incomplete.  

  • Response:

Thank you for your advice. To address this problem, we will reduce the use of abbreviations in the conclusion section.

 

Author Response File: Author Response.pdf

Reviewer 2 Report

Please see the attached file.

Comments for author File: Comments.pdf

Author Response

Response to review comments

First of all, we would like to thank the editors and the reviewers for all the effort put into reviewing this paper. We have carefully read all the comments and made the following responses. We have also made some changes in the text, which are marked in red in the revised manuscript. We hope these responses will address your questions.

Here are the responses to review comments:

Comment 1:

I am confused by “3.3.3. Comparison of Experimental Results” and “3.3.4.Comparison of Experimental Results”. The two subsections have the same title.

  • Response:

Thank you for your suggestions. This was a typographical error on our part, and we have corrected it in the revised draft: the title of Section 3.3.4 has been changed to "Rotation and Scale Experiments of the Proposed GPOG Algorithm". We apologize for the mistake.

Comment 2:

In the experiments, the optical images were from Google Earth. Could the authors use the images from real satellites or UAVs?

  • Response:

Thank you for your suggestions. Following your suggestion, part of the data in the experiments has been replaced with real satellite image data.

Comment 3:

The optical and SAR images in the experiments are with the same spatial resolutions. How will it be if they have different resolutions? Could they make some experiments?

  • Response:

Thank you for your suggestions. A set of matching results for images of different resolutions will be added at the end of the paper.

Comment 4:

More works on image registration are suggested, such as “DOI:10.1016/S0924-2716(01)00031-4”, “DOI: 10.1016/j.isprsjprs.2019.03.002”…

  • Response:

Thank you for your suggestions. Following your suggestion, we will cite these two papers.

Comment 5:

In the second test, GPOG was compared with HOG and RIFT. HOG and RIFT are very traditional. Could more recent methods be included?

  • Response:

Thank you for your suggestions. In Figure 1, we placed the experimental results horizontally after adding the CFOG descriptor; in Figure 2, we show the experimental results in two rows. As you can see, the overall layout becomes somewhat crowded with the new experiments added. More importantly, many descriptors are based on the HOG structure, such as HOPC and CFOG, so we believe that comparing with HOG adequately demonstrates the ability of the GPOG operator. In addition, the HOG and RIFT descriptors are both open source, which makes the comparison more convincing.

 

Author Response File: Author Response.pdf

Round 2

Reviewer 2 Report

All my concerns have been answered.
