Article
Peer-Review Record

Near-Real-Time Long-Strip Geometric Processing without GCPs for Agile Push-Frame Imaging of LuoJia3-01 Satellite

Remote Sens. 2024, 16(17), 3281; https://doi.org/10.3390/rs16173281
by Rongfan Dai 1, Mi Wang 1,* and Zhao Ye 2
Reviewer 1: Anonymous
Reviewer 2: Anonymous
Reviewer 3:
Reviewer 4:
Reviewer 5: Anonymous
Submission received: 9 July 2024 / Revised: 1 September 2024 / Accepted: 2 September 2024 / Published: 4 September 2024

Round 1

Reviewer 1 Report

Comments and Suggestions for Authors

This study primarily addresses the displacement issues in strip images without ground control points (GCPs) captured by the LuoJia3-01 remote sensing satellite. By compensating the RPCs based on relative orientation between frames, the study enhances interframe geometric accuracy. The approach involves pixel encryption, perspective transformation, and mapping the images onto a virtual stitching plane to generate corrected strip images. The key advantage of this research lies in utilizing GPU parallel processing for pixel encryption, linear transformation, and resampling to accelerate computations, balancing both precision and efficiency.
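Since the summary above centers on GPU-parallel grayscale resampling, here is a minimal illustrative sketch (not the authors' implementation) of the per-output-pixel bilinear resampling kernel that such pipelines typically map onto one GPU thread per pixel; the fractional source coordinates would come from the inverse geometric mapping:

```python
def bilinear_sample(img, src_x, src_y):
    """Bilinearly interpolate img (a list of rows) at fractional (src_x, src_y).

    This is the per-pixel kernel: in a GPU pipeline each output pixel runs
    this independently, which is why the step parallelizes so well.
    """
    x0, y0 = int(src_x), int(src_y)
    x1 = min(x0 + 1, len(img[0]) - 1)   # clamp at the right/bottom border
    y1 = min(y0 + 1, len(img) - 1)
    fx, fy = src_x - x0, src_y - y0     # fractional offsets within the cell
    top = img[y0][x0] * (1 - fx) + img[y0][x1] * fx
    bot = img[y1][x0] * (1 - fx) + img[y1][x1] * fx
    return top * (1 - fy) + bot * fy
```

The same structure (with the sensor-to-stitching-plane mapping computed per thread) is the standard way correction and resampling are fused on a GPU.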

Before considering publication, there are several issues that I believe need further consideration. In Section 2.2.3, the manuscript mentions that the encryption algorithm applied to the original images increases the computational load by four times. Could you provide an explanation of the advantages of selecting four encryption points compared to other numbers of encryption points? Additionally, could you include a comparison with methods other than ORB, such as SURF?

Moreover, there are some formatting errors in the manuscript:

1. In Figure 13, the clarity of subfigure b is inconsistent with subfigures a and c; please correct this.

2. The explanation below Equation 4 is problematic and redundant with Section 2.1.1.

3. The formatting at the end of Section 2.2.1 on page 8 is incorrect.

 

4. In Table 2 on page 12, there is a spelling error in the "Center Position" column of the Data4 row, with an extra "E" included.

Comments on the Quality of English Language

Overall, the quality of the English in the manuscript is good.

Author Response

Please see the attachment.

Author Response File: Author Response.docx

Reviewer 2 Report

Comments and Suggestions for Authors

This paper introduces a near-real-time geometric correction and stitching method for LuoJia3-01 satellite array push-frame imaging sequence images. The experimental results demonstrate reasonable performance in both accuracy and efficiency. However, there are still some problems that need to be addressed and improved.

(1) For the first occurrence of an English abbreviation, such as CMOS, please provide its full name.

(2) In Equation 1, the expression "dx_y" is misplaced and should be "dy_k" according to the context of the article. In the paragraph following Equation 1, "dY_k" should be "dy_k".

(3) In Equation 2, specific meanings for the parameters need to be provided, and the parameter "l" appears twice with presumably different meanings.

(4) The term "Equation" is used inconsistently. For example, it is "Formula (9)" in one instance and "Equation" elsewhere. Please correct this for consistency.

(5) In the coordinate mapping process described in section 2.2.3, the article chooses the direct method for correction and adopts image point encryption to avoid potential issues. How is the recommended grid size of 2 derived, and does it need to be adjusted based on different ground features and terrain?
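To make the question about the grid size concrete, here is a toy sketch (not the authors' algorithm; a simple 1.4x magnification stands in for the real sensor-to-stitching-plane mapping) of why direct forward mapping needs point densification at all, and why a small subdivision such as 2 can already close the holes when the local magnification stays below 2:

```python
def count_holes(n_sub, size=20, scale=1.4):
    """Forward-map a size x size source grid through a toy magnification and
    count uncovered output cells; n_sub x n_sub sub-points per source pixel."""
    covered = set()
    for y in range(size):
        for x in range(size):
            for sy in range(n_sub):
                for sx in range(n_sub):
                    u = int((x + sx / n_sub) * scale)
                    v = int((y + sy / n_sub) * scale)
                    covered.add((u, v))
    extent = int(size * scale)  # output cells the mapped image should fill
    return extent * extent - sum(1 for (u, v) in covered
                                 if u < extent and v < extent)
```

With one point per pixel the 1.4x mapping leaves gaps; with a 2 x 2 subdivision the sub-point spacing in the output drops below one cell and the holes vanish. Whether a grid of 2 suffices in the real system presumably depends on the worst-case local scale change, which is why the reviewer's question about terrain dependence is a fair one.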

(6) During the push-frame imaging process of LuoJia3-01, there is significant overlap between frames, such as between the frame-1 and frame-15. To achieve real-time rapid processing, the article proposes frame sampling in section 4 (Discussion). How does frame sampling lead to precision loss, and is the impact significant?

Comments on the Quality of English Language

see the comment above

Author Response

Please see the attachment.

Author Response File: Author Response.docx

Reviewer 3 Report

Comments and Suggestions for Authors

General Comments:

1. Nearly half of the references over the last 10+ years include Mi Wang as an author.  A brief look at them finds them to be no more informative than the present paper.

 

2. While part of the algorithm description refers to 3 images, the equations seem to only refer to 2.  There are a number of specific comments related to this.

 

3. It is never discussed how blocks from each image are selected for processing.

 

Specific Comments:

L 52: “same-name points” just means “the same point” in adjacent images – correct?  Somewhat odd wording.

 

L 54-56: Why does “high precision attitude processing” not take out most/all of the attitude effect? Are attitude sensors not accurate enough, not sampled fast enough, or something else?

 

L 65-66: “interframe images” would be clearer as “adjacent” or “overlapping”.

 

L 69-70: What is “theoretically not rigorous” – the camera model itself or something about the method?  If it is the model, what is not rigorous?

 

L 105: “high temporal” = near real time (L 126) or some clearer statement of speed required.

 

L 119-121: So this new method reverses the usual order of geometric correction and stitching (L 95-101) – it should be explained why this is an improved approach.

 

L 132: What is RPC?  It needs to be spelled out and possibly explained before the acronym is used.  Reference needed?  It is used throughout the paper with no explanation.

 

L 131-132: Based on Figs 1, 2, there would seem to be no overlap between frame_i and frame_i+2 (it is more usual to write i-1, i, i+1), so it is unclear how one would construct motion parameters by comparing these 2 frames.

 

L 135: What is ORB?  It needs to be spelled out and possibly explained before the acronym is used.  Reference needed?

 

Fig 3: This diagram looks like adjusting adjacent frames with single frame forwarding – not what is described in L 131-132.

 

L 143-146: ORB is spelled out, but what are FAST and BRIEF (why is it not ORFB or OFB?)?  Is there some orientation other than rotation?  There need to be (short) descriptions and/or references for these algorithms.  In general, images could have x, y offsets and scale differences as well as rotation.  It would seem one should start with this model and explain why or how each item is not important or solved for.  

This comes up in Sec 2.1.2, but it would seem clearer to make it the starting point.
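For readers hitting the same question: ORB stands for Oriented FAST and Rotated BRIEF — FAST supplies the keypoints, BRIEF the binary descriptor, and ORB adds an orientation estimate to each keypoint (hence the name is not ORFB). A minimal pure-Python sketch of the FAST segment test, illustrative only and not the paper's code:

```python
# FAST segment test: a pixel is a corner when at least n contiguous pixels on
# the 16-point Bresenham circle of radius 3 around it are all brighter, or
# all darker, than the center by a threshold t. (Real ORB adds an intensity-
# centroid orientation and the rotated BRIEF descriptor on top of this.)
CIRCLE = [(0, -3), (1, -3), (2, -2), (3, -1), (3, 0), (3, 1), (2, 2), (1, 3),
          (0, 3), (-1, 3), (-2, 2), (-3, 1), (-3, 0), (-3, -1), (-2, -2), (-1, -3)]

def is_fast_corner(img, x, y, t=20, n=9):
    c = img[y][x]
    ring = [img[y + dy][x + dx] for dx, dy in CIRCLE]
    for sign in (1, -1):                 # brighter pass, then darker pass
        flags = [sign * (p - c) > t for p in ring]
        run = best = 0
        for f in flags + flags:          # doubled list handles wrap-around
            run = run + 1 if f else 0
            best = max(best, run)
        if best >= n:
            return True
    return False
```

A corner of a bright square passes the test; interior and flat regions do not. The general point in the comment stands, though: x/y offsets and scale belong in the motion model, and the manuscript should say which of them are solved for.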

 

L 151: OpenCV needs a reference or link.

 

L 163, 216: It is never really explained how the points/features to compare are selected.  The model in Eq 1 ignores rotation.

 

L 168: Spell out RFM for first use.  Relative Feature Model?

 

L 198-202: This paragraph is a repeat from L 163.  Its relevance to equations 3, 4 is not clear.

 

Eq 5: The multiplication signs (a1 × x_j) seem unnecessary. Are the a’s and b’s the same as in Eq 4?

 

L 216: The pieces of the model are never put together to show what is going to be calculated/solved.

 

L 245: Numbering method steps across Sections seems strange: 2.1.2 steps 1-4, 2.2.1 steps 5-6.

 

L 284: What is the meaning of “rare” with virtual control points?

 

L 294-295: In general, one cannot get 8 parameters from 4 equations – what is the trick here?  Where does matrix A in Eq 14 come from?
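For context on the counting: each point correspondence (x, y) → (x′, y′) contributes two linear equations, so four correspondences give exactly eight equations for the eight perspective-transform parameters — presumably the construction behind matrix A in Eq 14, though the paper should state it. A self-contained sketch under that assumption (hypothetical points; pure-Python elimination for self-containment):

```python
def solve_homography(src, dst):
    """Solve x' = (a*x + b*y + c)/(g*x + h8*y + 1),
             y' = (d*x + e*y + f)/(g*x + h8*y + 1)
    from four correspondences: each pair yields 2 linear rows -> 8x8 system."""
    A, L = [], []
    for (x, y), (xp, yp) in zip(src, dst):
        A.append([x, y, 1, 0, 0, 0, -x * xp, -y * xp]); L.append(xp)
        A.append([0, 0, 0, x, y, 1, -x * yp, -y * yp]); L.append(yp)
    n = len(L)
    for i in range(n):                   # Gaussian elimination, partial pivot
        p = max(range(i, n), key=lambda r: abs(A[r][i]))
        A[i], A[p] = A[p], A[i]; L[i], L[p] = L[p], L[i]
        for r in range(i + 1, n):
            f = A[r][i] / A[i][i]
            A[r] = [ar - f * ai for ar, ai in zip(A[r], A[i])]
            L[r] -= f * L[i]
    h = [0.0] * n                        # back substitution
    for i in reversed(range(n)):
        h[i] = (L[i] - sum(A[i][j] * h[j] for j in range(i + 1, n))) / A[i][i]
    return h

def apply_h(h, x, y):
    a, b, c, d, e, f, g, h8 = h
    w = g * x + h8 * y + 1
    return (a * x + b * y + c) / w, (d * x + e * y + f) / w
```

So the "trick" is simply that four points are eight scalar observations, not four; whether the paper's Eq 14 is built this way still needs to be spelled out there.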

 

L 301: “High degree of overlap” is not how previous figures have portrayed the images.  This would also seem to be a very wasteful way of operating the camera and using bandwidth.

 

Sec 2.2.3: “Encrypted” is a strange word for what is described in this section.  While filling may be in line with other things described in the paper, intensity adjustment is somewhat far afield.

 

L 387: What is the 3 pixels depth?  Also, make clear which size is along/cross-track.

 

Fig 7: Does this show 6 frames (every tenth) as described in the text?  If so, put some space between them so readers can judge the overlap (or not) by eye.  If frames are ~7872 × 0.7 m long, then there is ~50% overlap between adjacent frames, or gaps of nearly 35 km between every tenth.  Clarify this scenario.  This simple calculation (50% overlap) does not seem to agree with that in L 423 or Fig 8.

 

* Stopped detailed comments. 

 

Author Response

Please see the attachment.

Author Response File: Author Response.docx

Reviewer 4 Report

Comments and Suggestions for Authors

        The article presents a real-time geometric correction method specifically designed for push-frame long-strip images captured by satellites like LuoJia3-01. The paper addresses a crucial aspect of satellite image processing by improving the efficiency and accuracy of geometric corrections for long-strip images. 

(1) Background information on the LuoJia3-01 satellite payload and products should be added in the introduction.

(2) In the push-frame imaging process of LuoJia3-01, the authors should elaborate on the causes of the serious inter-frame misalignment and complex inter-frame overlap problems.

(3) The authors should purposefully explain the purpose and significance of each processing step in the paper's methodology. This explanation corresponds to problem (2) above.

(4) Some equations are given without clear explanation. Please double-check each formula in the text; many symbols appearing in the formulas are never defined in the article.

(5)     In the experiments, the authors included experiment E1 for the purpose of describing the original RPC error; however, the reason for the error was not explained in detail, so please add the reason.

(6) From Figure 11(c), it can be seen that the proposed method still exhibits obvious stitching errors, and is therefore still inapplicable in some cases. The authors need to discuss and define the applicability of the method within the discussion and conclusion.

Author Response

Please see the attachment.

Author Response File: Author Response.docx

Reviewer 5 Report

Comments and Suggestions for Authors

Due to the frame misalignment and intricate overlap relationships in the array push-frame imaging process, this paper introduces a near-real-time geometric correction and stitching technique for long-strip image products. The method is innovative, and the experiments achieved favorable results. There are still several questions that the authors need to address and further refine:

(1) The accuracy of DEM has a certain impact on geometric correction and inter-frame stitching precision. It is recommended to supplement the data source and accuracy of the DEM used.

(2) In section 3.2, the format of the E1 scheme is not aligned with the E2 and E3 schemes. Please revise it.

(3) In the experiment, the E1/E2 comparison method was proposed. It is suggested to supplement the corresponding references.

(4) In the abstract section, the sentence "Finally, image matching, coordinate transformation, and grayscale resampling are mapped to the GPU." is not clear. It is recommended to revise it for better clarity.

(5) In Figure 8, the text descriptions of (a)-(d) do not match the figure, which may lead to misunderstandings. Please revise them.

(6) What are the deficiencies or areas for improvement in the practical application of the method/algorithm presented in this paper? Is it possible to extend its use to the preprocessing of other similar satellite payload data?

(7) The reliability of the method in this paper may be affected by factors such as insufficient positioning accuracy, scarce connection points between adjacent frame images, cloud and snow coverage, water areas, and lack of texture in overlapping areas. Are there any solutions for such special cases?

Comments on the Quality of English Language

NULL

Author Response

Please see the attachment.

Author Response File: Author Response.docx

Round 2

Reviewer 3 Report

Comments and Suggestions for Authors

General Comment(s): 

The paper is significantly improved from the previous version, but a physical description of the system, needed to better understand some of the equations, is still missing.  It is not possible to fully evaluate the method without this connection. It is not clear to me whether this is a major or minor correction: a minimum of a short paragraph is needed, and in addition, probably some comments with several of the equations.

 

Specific Comments:

 

L 49-56: This description is not very clear and/or seems somewhat internally contradictory.  It is not clear whether this is a conceptual or language (English) problem.  Is the ground image speed reduced by scanning the camera?  Are multiple images down-linked and combined in lower-level processing on the ground?  See next comment.

 

Table 1: At 500 km altitude the ground speed is >~7 km/s, so in 30 sec the track covers >200 km, but the table says 20-40 km, which would imply quite a bit of scanning.
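The back-of-envelope figure here can be reproduced from circular-orbit mechanics (standard constants; a cross-check, not a number from the paper):

```python
import math

# Ground-track speed for a 500 km circular orbit: orbital speed from
# v = sqrt(mu / r), scaled by R_E / r for the sub-satellite point.
MU = 3.986004418e14    # Earth's gravitational parameter, m^3/s^2
R_E = 6.371e6          # mean Earth radius, m
h = 500e3              # orbit altitude, m

r = R_E + h
v_orbit = math.sqrt(MU / r)          # ~7.6 km/s orbital speed
v_ground = v_orbit * R_E / r         # ~7.1 km/s sub-satellite point speed
track_30s_km = v_ground * 30 / 1e3   # ground track covered in 30 s, km
```

This gives roughly 210 km in 30 s, consistent with the >200 km estimate above, so a 20-40 km strip does imply substantial back-scanning.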

 

L 64-69: Whole satellite scans, not just imaging system? 

 

Sec 2, L 135-183: Very much improved.

 

Eq 4: There appears to be a typo as second equation should be ds = b0 +b1*l +b2*s  (or maybe reverse l, s, but not two s).

 

Eq 5: There is no clear connection between the coordinates x, y (i-1) and x, y (i).  Is one set of indices in the second 2 equations supposed to be i-1, not all i?  Are the a, b the coefficients from Eq 4?  If so, say so; otherwise choose different letters.

 

L 270-279: The selection of sub-images/frames is not explained adequately.

 

L 281-286, Fig 5, Eq 6, 7: The relationship of the figure to the equations is not obvious or well explained.  Why are the frames skewed in the second part of the figure?

 

Eq 15: Since the gsd values are averages and it is stated that the resolution is nearly identical, it is not clear what min, max mean.  Also, for gsd_max ≈ gsd_min, Eq 15 would appear to give ~2, so is the size W ≈ 2?

 

L 428-440: Somewhere in here the slew/scan rate needs to be mentioned.  In 30 sec, the ground track covers ~210 km.  There are 60 frames, each about 4 km long (~240 km).  There would be about 20-30% overlap between frames.  This needs a physical description of what is going on.  Some of the numbers given do not fit a simple physical interpretation.  If the scanning is as extreme as is suggested, why is this necessary?

 

Fig 7: Is the sequence of images top to bottom or bottom to top?  Say which in caption.

 

L 468-469: This amount of overlap is not consistent with L 428-440 unless the slewing/scanning is explained.  (Unless “row” and “sample” do not mean what seems to be the case elsewhere (and in general).)

 

[Not checked in detail]

 

 

Fig 14: Image (c) has the left edge cut off.  Is this intentional, a “feature” of the method, or an error?

 

Author Response

Please see the attachment

Author Response File: Author Response.docx

Reviewer 5 Report

Comments and Suggestions for Authors

The manuscript has been revised to the point where it can be published.

Author Response

Please see the attachment

Author Response File: Author Response.docx

Round 3

Reviewer 3 Report

Comments and Suggestions for Authors

General Comments:

The article is again improved, but the flaw noted below in Eq 13, 14 is serious. 

I also think that there are some "hidden steps" in how the data are decimated that need to be explained. 

Also, the article would be further improved if much of the explanation given in the responses to the previous points were worked into the text.

 

Specific Comments:

 

L 80-81: Isn’t the problem that the attitude measurements/solutions cannot be done at the full frame rate and there is actual jitter faster than the attitude measurements/corrections?  If that is the problem, this sentence does not correctly convey it.  If the problem is that the attitude solutions have too much error or jitter, there needs to be some other slight rewording.

 

L 152: Back when RPC was introduced, it appeared to be a radiometric correction; here it is a geometric correction.  Is it both?  Should one of the descriptions be somewhat modified to make this clear?

 

 

Eq 13, 14: These equations as written are non-linear (or nonsense), as x_i, y_i appear in both A and L.  Are the x, y in A or L somehow different?  Looking at Eq 12, one can see this intuitively, as the scale of s, l varies with s, l.  Or am I missing an important point in the selection of the points used for the solution?

 

Author Response

Please see the attachment.

Author Response File: Author Response.docx
