Article
Peer-Review Record

LD-SLAM: A Robust and Accurate GNSS-Aided Multi-Map Method for Long-Distance Visual SLAM

Remote Sens. 2023, 15(18), 4442; https://doi.org/10.3390/rs15184442
by Dongdong Li 1, Fangbing Zhang 1, Jiaxiao Feng 1, Zhijun Wang 1, Jinghui Fan 1, Ye Li 1, Jing Li 2 and Tao Yang 1,*
Reviewer 1:
Reviewer 2: Anonymous
Reviewer 3:
Reviewer 4: Anonymous
Submission received: 5 August 2023 / Revised: 31 August 2023 / Accepted: 5 September 2023 / Published: 9 September 2023

Round 1

Reviewer 1 Report (Previous Reviewer 2)

The revised manuscript aptly reflects dedication to enhancing the work's overall quality. The revisions carried out have significantly improved the clarity and organization of the content, thereby reinforcing the manuscript's potential to contribute significantly to the field. The attention given to addressing the concerns raised in the initial review is noteworthy. Responses to the queries and comments have resulted in a more coherent presentation of the research. This, in turn, serves to amplify the manuscript's significance within the broader research community. As the manuscript advances towards its final stage prior to submission, it is suggested that careful consideration be given to undertaking a comprehensive proofreading process. This phase aims to ensure the manuscript is devoid of grammatical errors, typographical inconsistencies, and language-related concerns. Considering the focus of the earlier feedback on clarification and coherence, engaging in a thorough proofreading exercise could potentially yield an added layer of refinement.

Refinement of the manuscript's language not only underscores the dedication to the research, but also aligns with the elevated standards upheld by the Remote Sensing journal. Diligence in attending to linguistic subtleties contributes to the overall readability and impact of the manuscript on its intended readership. A recommendation is extended to allocate time for proofreading the manuscript or, if feasible, to involve a colleague to assist with this task. Often, a fresh perspective can identify nuances that might have been inadvertently overlooked. The objective remains to ensure the manuscript is polished and adeptly conveys the essence of the research.


 

examples:

Original: "In the scenario of GNSS complete loss during ….

Correction: "In scenarios of complete GNSS loss during ……

 

Original: "Currently, Low-texture environments are the most common cause of LD-SLAM failures."

Correction: "Currently, low-texture environments are the most common cause of LD-SLAM failures."

 

Original: "In this work, we automatically generate multi-maps to skip these low-texture regions, thereby improving the long-distance robustness of the overall system."

Correction: "In this work, we automatically generate multi-maps to avoid these low-texture regions, thereby enhancing the overall system's robustness over long distances."

 

Original: "Table III highlights the runtime of the main operations performed in the tracking and mapping threads and shows that our system can run in real-time at 20-30 frames per second."

Corrected: "Table III highlights the runtimes of the main operations performed in the tracking and mapping threads, demonstrating that our system can operate in real-time at a rate of 20-30 frames per second."

Original: "Because the tracking and mapping threads always work on the active map, multi-mapping doesn’t add much overhead."

 

Corrected: "Since the tracking and mapping threads consistently operate on the active map, the addition of multi-mapping doesn't introduce significant overhead."

 

Original: "According to the map’s associated keyframe size, performing full beam adjustment on the loop closure may increase the timing by several seconds."

Corrected: "Depending on the size of the associated keyframes on the map, performing a full beam adjustment during loop closure could potentially increase the timing by several seconds."

Author Response

Please see the attachment.

Author Response File: Author Response.pdf

Reviewer 2 Report (New Reviewer)

A novel method is proposed by the authors to realize localization by multi-map and visual SLAM in the absence of global positioning information. Detailed simulation results together with high-quality supplementary files are provided to enhance the reproducibility. This paper is overall very nicely written and carefully organized. Only the following minor comments are listed for reference.

 

1.      The abstract seems a little lengthy. As a reader, this might make me lose interest in the paper. Please consider compressing it to no more than 250 words.

2.      Font color. The blue color used is rarely seen in academic papers (except for revised versions). Also, bold fonts are used unnecessarily in, for instance, the last column of Table II. The reviewer also checked the other papers published in this journal, and they are all in the default color. Please switch to the normal color and use a unified format in the future version of this paper.

3.      Line numbers from line 665 to 668 seem incorrect, as some of them are missing in specific lines.

4.      The fonts of the caption and content in Table II are larger than in the remainder of the paper. Please check.

5.      Repeated words such as “Equation Equation 7” and “Equation Equation 8” in Line 772 should be corrected. Please proofread the paper to check for and correct such typos.

6.      For such a paper heavy in abbreviations, at least a nomenclature that explains all the used abbreviations is recommended.

7.      Please correct me if this comment is not reasonable – the reviewer has read references on graph-based optimization in SLAM of ground robots. The optimization problems therein are large but can be solved efficiently thanks to the sparsity of the Jacobian and Hessian matrices. In the case of this paper, aerial robots are used instead of their ground counterparts. Is it true that the involved elements of the graph (or maps), such as vertices and edges, will increase significantly since the aerial robots can fly much faster? If so, is the sparsity of the matrices still enough to enable an efficient solving process? The reviewer has noticed some quantitative results in the simulation parts, but just wonders how the sparsity is influenced, or whether it stays unchanged.
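
To make the sparsity question in comment 7 concrete, the following is a minimal sketch, assuming a pure pose graph (landmarks ignored) in which each pose is one block of variables and each edge couples exactly two poses. The graph layout (an odometry chain with a loop-closure edge every 10 poses) and the function names are hypothetical illustrations, not the authors' system. It only counts which pose-pose blocks of the Gauss-Newton Hessian H = J^T J can be nonzero.

```python
def hessian_block_density(num_poses, edges):
    """Fraction of pose-pose blocks in H = J^T J that can be nonzero."""
    # Every pose that appears in a constraint has a diagonal block.
    nonzero = {(i, i) for i in range(num_poses)}
    # An edge between poses i and j fills blocks (i, j) and (j, i).
    for i, j in edges:
        nonzero.add((i, j))
        nonzero.add((j, i))
    return len(nonzero) / (num_poses * num_poses)


def chain_with_loops(num_poses, loop_every):
    """Hypothetical graph: odometry chain plus periodic loop-closure edges."""
    edges = [(i, i + 1) for i in range(num_poses - 1)]
    edges += [(i, i - loop_every) for i in range(loop_every, num_poses, loop_every)]
    return edges


# Compare a short and a long trajectory with the same edge pattern.
densities = {n: hessian_block_density(n, chain_with_loops(n, 10))
             for n in (100, 1000)}
for n, density in densities.items():
    print(f"{n} poses: {density:.4%} of Hessian blocks are nonzero")
```

Under these assumptions the number of nonzero blocks grows roughly linearly with the number of poses while the matrix size grows quadratically, so a longer trajectory gives a sparser Hessian as long as the number of edges per pose stays bounded. Whether that holds for the system under review depends on how many loop-closure and GNSS constraints it adds per keyframe, which this sketch does not model.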

Author Response

Please see the attachment.

Author Response File: Author Response.pdf

Reviewer 3 Report (New Reviewer)

The study concerns GNSS-aided, multi-map SLAM. This work addresses the seamless and accurate positioning of autonomous robots and aircraft in complex environments. It offers a new approach to solving problems in operations without GNSS. It proposes a graph optimization model that integrates GNSS data and visual information, aiming to solve the accumulation of errors due to the lack of visual odometry and inertial navigation systems. By offering innovative methods such as large-scale map creation, multi-map mapping and merging, and multi-map-based geolocation, autonomous vehicles are claimed to increase efficiency and accuracy in long-distance navigation.

My comments on the article:

1- The introduction contains a very long literature review, which makes the article hard to read. I recommend using subheadings to make it more reader-friendly. Since many unnecessary sources are cited, I would also recommend that the authors examine the following studies as competitors/alternatives:

https://doi.org/10.3390/drones6050107

https://doi.org/10.1016/j.compeleceng.2016.06.006

https://doi.org/10.1109/LRA.2018.2793357

2- What is the proposed scientific contribution of the article? This should be briefly explained at the end of the introduction. Is it faster? Does it make fewer mistakes? Is it cheaper? Is it more reliable? The deficiencies in the literature should be stated specifically. This is mandatory.

3- Judging by the content presented in the supplementary material, the method seems to be at the level of simulation. Hardware etc. is also presented in the content, but benchmarking is done in simulation. Is this true? If so, how was the GNSS data generated for benchmarking? The proposed method has been benchmarked against different visual SLAM methods (I congratulate the authors in this respect). Sharing the data behind these comparison results would provide supporting evidence.

Author Response

Please see the attachment.

Author Response File: Author Response.pdf

Reviewer 4 Report (New Reviewer)

This manuscript studied GNSS-aided multi-map solutions for long-distance visual SLAM. Overall, the manuscript is very well written with a rich number of details, which offers much valuable information to the community. Only some minor reworks are needed. Before proceeding to publication, I encourage the authors to address my following comments or questions:

1.      [Line 72-75] “Indeed, odometry drift is inevitable for any visual-inertial navigation system without the constraint of global positioning information. Secondly, odometry drift is inevitable for any visual-inertial navigation system without the constraint of global positioning information.”

-        These two sentences are duplicated. Please remove one.

 

2.      [Overall] There is no need to redefine abbreviations or acronyms. “Global Navigation Satellite System (GNSS)”, for example, appears four times throughout the manuscript. Only the first definition is needed, and GNSS can be used for the rest of the manuscript.

 

3.      [Line 1009-1010] “We utilize GNSS-RTK data as a reference for the ground truth trajectory”.

-        While GNSS-RTK can achieve cm-level positioning accuracy in good signal conditions (e.g., open sky), the performance can be significantly degraded in GNSS-challenged environments for ground-vehicle scenes (e.g., deep urban areas with multipath). Can you please comment on your ground truth setup in the urban field tests?

-        Also, what are the baseline distances (between RTK rover and base station) for your experiments?

 

4.       [Overall] Besides showing the result for the whole route, the positioning performance at the most challenging locations (e.g., tunnel, underpass) should be highlighted to analyze the benefits of your method(s).

The English looks good. Only minor editing of English language is required.

Author Response

Please see the attachment.

Author Response File: Author Response.pdf

Round 2

Reviewer 3 Report (New Reviewer)

The paper can be accepted in its present form.

This manuscript is a resubmission of an earlier submission. The following is a list of the peer review reports and author responses from that submission.


Round 1

Reviewer 1 Report

At the moment the manuscript is not mature enough for publication. The manuscript seems to have been submitted in a rush, without a careful revision of the English language and technical terminology, both of which are inaccurate (for example, the use of terms like violent matching, posture mapping, and public areas instead of overlapping areas), resulting in a manuscript that is very difficult to read. The authors should consider restructuring their paper, in particular the aims and innovation.

The authors claim some innovative aspects that are not really innovative, as other works already exist on the specific subject. For example, the fisheye model was already used in many other VO and vSLAM algorithms (for example openVSLAM, for which the authors should add the reference in the literature). The authors refer to GPS and not GNSS for their method; is there a specific reason to use that specific satellite constellation when multi-constellation RTK-capable GNSS receivers are available at low cost? Regarding the integration of GPS or GNSS, the authors should differentiate their work from other existing works, for example “G-VIDO: A Vehicle Dynamics and Intermittent GNSS-Aided Visual-Inertial State Estimator for Autonomous Driving”, presenting pros and cons.

The authors should analyse the state of the art more comprehensively, improve the literature review, and highlight the innovations with respect to other research works. There are important surveys about vSLAM and visual odometry that should be cited; the authors may use them and report only the relevant works that have similarities with their own work.

The comparison with the other methods that do not use GNSS integration is not fair. The benefit of using GNSS and INS systems in photogrammetry has been known for decades, and there is no doubt that using them is beneficial, especially over long distances. The authors mention offline use several times in their paper, which is not really clear since vSLAM algorithms are meant for real-time use. Is their system meant to work offline? In that case, what is the benefit of using a vSLAM instead of a standard photogrammetry approach based on structure from motion?

In their multi map approach the authors do not mention degenerate geometric configurations that could happen in standard aerial photogrammetry surveys made of several strips such as those used in their simulation. What if, for example, the visual tracking is lost after a linear trajectory portion and a new map needs to be generated? The GPS positions along the trajectory are collinear and cannot be used to compute the similarity transformation from the vSLAM coordinate frame to any other geodetic related coordinate system.
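
The degenerate case described above can be illustrated with a minimal sketch, assuming 3D GNSS positions already expressed in a local metric frame. The function name and tolerance are hypothetical and are not taken from the manuscript. When all positions lie on one line, any rotation about that line maps them onto themselves, so the rotation part of the similarity transformation is not uniquely determined. The check below flags that situation by measuring each point's distance from the line through the first point and the point farthest from it.

```python
import math


def is_collinear(points, rel_tol=1e-6):
    """Return True if the 3D points lie (numerically) on a single line."""
    if len(points) < 3:
        return True
    p0 = points[0]
    # The point farthest from p0 defines the reference direction d.
    far = max(points, key=lambda p: math.dist(p, p0))
    d = [a - b for a, b in zip(far, p0)]
    d_norm = math.sqrt(sum(c * c for c in d))
    if d_norm == 0.0:
        return True  # all points coincide
    for p in points:
        v = [a - b for a, b in zip(p, p0)]
        cross = (v[1] * d[2] - v[2] * d[1],
                 v[2] * d[0] - v[0] * d[2],
                 v[0] * d[1] - v[1] * d[0])
        # |v x d| / |d| is the distance from p to the reference line.
        if math.sqrt(sum(c * c for c in cross)) / d_norm > rel_tol * d_norm:
            return False
    return True


# A straight strip is degenerate; adding one off-line fix removes the ambiguity.
straight_strip = [(float(t), 2.0 * t, 0.0) for t in range(5)]
strip_with_turn = straight_strip + [(4.0, 8.0, 3.0)]
print(is_collinear(straight_strip), is_collinear(strip_with_turn))
```

In practice the tolerance would have to be chosen relative to the GNSS noise level, since nearly collinear positions give a poorly conditioned alignment even when this exact test passes.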

It is clear that a considerable amount of work has been done, but the authors should first better clarify their innovation, aims, and application of interest, and then better structure the manuscript, including only datasets and tests that are relevant.

Reviewer 2 Report

The study on increasing accuracy and reliability in the context of long-distance visual SLAM is well known to the community. The authors of the article presented a new method to solve two challenging problems with this issue. The first is to improve the accuracy of SLAM over long distances by using the fusion of GPS and visual information. The other is to solve the challenge of using the map in the long-distance SLAM by providing a split and merge mechanism. This method, in addition to addressing the challenge of large area data, also brings out the possibility of dealing with problems related to areas with low textural information. Several operational tests in different areas of ground vehicle and UAV are one of the strengths of this paper in the evaluation of the proposed method. Therefore, the article is suitable for readers of the Remote Sensing journal.

Comments for author File: Comments.pdf

Reviewer 3 Report

The aim of this study is to design a long-distance, multi-map visual SLAM system. GPS and multi-map technology are used in this system.

 

The paper gives details of the method and a variety of experiments to test the effectiveness of the method. This paper can be published in its present form.
