Open Access Article

Peer-Review Record

PLI-SLAM: A Tightly-Coupled Stereo Visual-Inertial SLAM System with Point and Line Features

Remote Sens. 2023, 15(19), 4678; https://doi.org/10.3390/rs15194678

by Zhaoyu Teng¹, Bin Han¹, Jie Cao^1,2,*, Qun Hao^1,2,3, Xin Tang¹ and Zhaoyang Li¹

Reviewer 1: Anonymous

Reviewer 2: Anonymous

Submission received: 24 August 2023 / Revised: 19 September 2023 / Accepted: 20 September 2023 / Published: 24 September 2023

(This article belongs to the Topic Multi-Sensor Integrated Navigation Systems)

Round 1

Reviewer 1 Report

This paper deals with adding line segments to landmark-based SLAM. The main contributions are a modification to a well-known line segment detector and a trick in the line residual estimation.

Using points and lines as complementary features is not new at all, but the authors seem to fuse different tricks found in the literature. The result is a slightly faster ORB-SLAM-LINE updated with ORB-SLAM3.

They propose some heuristics to filter landmarks, such as the track length or the change in 3D orientation with respect to the gravity direction.
Lines are not used in the loop-closing part.
=> Do the authors have any idea how to use such line features for loop closing? As the authors state that navigation is done in low-textured environments, how does the 2D point-based loop-closing approach perform? Couldn't such line landmarks also be used in a bag-of-words way?

The modification of EDLines is simply done by checking the mean curvature of each detected line and then thresholding.
The line length selection is based on the number of 2D features detected. On the one hand, if there are many 2D points, only long lines are selected; on the other hand, if there are few features, the authors take additional shorter lines. To do that, some heuristic thresholds are provided.
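The selection logic summarized above can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the manuscript: the function names, the use of a mean turning angle as a stand-in for "mean curvature", and all threshold values are hypothetical.

```python
import math


def mean_turning_angle(chain):
    """Mean absolute turning angle (radians) between consecutive steps of a
    pixel chain; used here as a simple proxy for mean curvature (assumption)."""
    angles = []
    for (x0, y0), (x1, y1), (x2, y2) in zip(chain, chain[1:], chain[2:]):
        a = math.atan2(y1 - y0, x1 - x0)
        b = math.atan2(y2 - y1, x2 - x1)
        d = abs(b - a)
        angles.append(min(d, 2 * math.pi - d))  # wrap into [0, pi]
    return sum(angles) / len(angles) if angles else 0.0


def chain_length(chain):
    """Total polyline length of a pixel chain."""
    return sum(math.dist(p, q) for p, q in zip(chain, chain[1:]))


def select_lines(chains, n_points,
                 max_curvature=0.05,   # hypothetical curvature threshold
                 rich_points=500,      # hypothetical "many point features" level
                 long_min=60.0,        # hypothetical min length when points are plentiful
                 short_min=25.0):      # hypothetical min length when points are scarce
    """Keep nearly straight chains; require long lines when many point
    features exist, and admit shorter ones when point features are scarce."""
    min_len = long_min if n_points >= rich_points else short_min
    return [c for c in chains
            if mean_turning_angle(c) <= max_curvature
            and chain_length(c) >= min_len]
```

With these placeholder thresholds, a frame with many point features keeps only the long straight chains, while a poorly textured frame also admits shorter ones; bent chains are rejected in both cases.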

=> Most of the variables in Eqs. 1 and 2 are not defined. A reference to the EDLines paper (in which the variable definitions seem to be provided) should also be given again here.

The matching approach is based on that of PL-SLAM.

The initial pose estimation is unclear. Is it based only on the IMU, or is it also corrected by an epipolar or PnP approach? Are lines used in this step?

The line minimization cost is explained in Section 4.2.2. The impact of a line is linked to its length (Eq. 3) but also to the number of points detected in the frame.

=> This is not explained in the paper. How is the weight influenced by the number of 2D points detected? Also, the explanation of W and T should be provided at this step and not in the experiment section.

For the keyframe decision, sentence (1) is not clear: what does "if the tracked is less than 0.25 times the reference keyframes" mean? Line 298: "must tack" => "must track".

The bundle adjustment section presents the residual used for line segments. It is based on the distance from the reprojected start/end points to the detected line.
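For readers unfamiliar with this kind of residual, a minimal sketch of the point-to-line distance described above is given here. It is a generic formulation, not the manuscript's exact equations; the function and argument names are hypothetical, and the projection of the 3D endpoints into the image is assumed to have been done already.

```python
import math


def line_through(p, q):
    """Normalized homogeneous line (a, b, c) through 2D points p and q,
    scaled so that a*x + b*y + c is the signed distance of (x, y) to the line."""
    (x1, y1), (x2, y2) = p, q
    # Cross product of the homogeneous points (x1, y1, 1) and (x2, y2, 1).
    a, b, c = y1 - y2, x2 - x1, x1 * y2 - x2 * y1
    n = math.hypot(a, b)
    if n == 0.0:
        raise ValueError("detected endpoints coincide; the line is undefined")
    return a / n, b / n, c / n


def line_residual(detected_start, detected_end, proj_start, proj_end):
    """Signed distances (pixels) of the two reprojected 3D endpoints
    to the infinite line through the detected 2D segment."""
    a, b, c = line_through(detected_start, detected_end)
    return (a * proj_start[0] + b * proj_start[1] + c,
            a * proj_end[0] + b * proj_end[1] + c)
```

For a detected segment from (0, 0) to (10, 0) and reprojected endpoints (3, 2) and (7, -1), the residual is (2, -1). Note that the residual is measured to the infinite line, so it does not penalize sliding of the endpoints along the line direction.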

-> It is unclear how the covariance of a line is estimated. How are the covariances of the end points obtained?

Lines 349-350: repetition of inertial residuals.

The tight fusion is performed by using both landmark types in the same residual cost (Eq. 12) during the optimization; once the IMU is initialized, the IMU factors are added to the optimization. This is the classical approach.

=> It is not clear in the text when each cost function is used (Eqs. 12, 13 and 14). It could be linked to the diagram in Fig. 1 for better understanding.
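As a reading aid, the classical tightly coupled cost referred to above can be sketched in generic form. The symbols are assumptions chosen for illustration and are not taken from Eqs. 12-14 of the manuscript: r_p and r_l are point and line reprojection residuals over the point set P and line set L, r_I are IMU preintegration residuals over consecutive keyframe pairs K, each Σ is the corresponding covariance, ρ is a robust kernel, and δ_IMU is 1 once the IMU is initialized and 0 before.

```latex
\[
E = \sum_{i \in \mathcal{P}} \rho\!\left( \lVert r_{p,i} \rVert^{2}_{\Sigma_{p,i}} \right)
  + \sum_{j \in \mathcal{L}} \rho\!\left( \lVert r_{l,j} \rVert^{2}_{\Sigma_{l,j}} \right)
  + \delta_{\mathrm{IMU}} \sum_{k \in \mathcal{K}} \lVert r_{I,k} \rVert^{2}_{\Sigma_{I,k}}
\]
```

Written this way, the visual-only and visual-inertial cases differ only in whether the last term is active.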

Comparisons in the experiment section are quite unfair, as the proposed algorithm is tuned with respect to the dataset while the others use "default" parameters (but maybe these were already tuned by the original authors for these datasets?).

From Table 1, the choice of W and T is quite unclear because we cannot see any trend in the global cost; looking at the 3D shape of the cost, there is no clear minimum. Are such values strongly linked to the dataset? Can they be generalized and fixed, or must they be tuned each time?

It would be good to have the same axes for Fig. 6. It seems that adding lines does not really improve the results compared to ORB-SLAM3 with points only. What about more challenging environments with fewer features?

The entire system seems to work properly, and the results show that the performance is better than, but close to, that of existing systems. The idea of using lines is not new, and going further would have been interesting (using lines for loop closing, for example).

Finally, regarding the writing, the paper is well written and very clear, the approach is well presented, and the results are shown on well-known public datasets but also on campus experiments. It would be interesting to have the SLAM code released for comparisons.

I am not qualified enough to assess the quality of English in this paper, but to me (as a non-native English speaker) it seems well written and is understandable.

Author Response

Please see the attachment.

Author Response File: Author Response.docx

Reviewer 2 Report

Brief summary

Although indoor navigation systems based on the SLAM technique are commonly used, they have some weak points that many researchers are trying to overcome. The manuscript concerns a new way to improve the performance of a SLAM system supported by an IMU. The main improvement is to the algorithm for point and line feature detection. The proposed improvements were tested on three sequences from the EuRoC MAV dataset and two sequences obtained by the authors in the Building of the School of Optoelectronics in Beijing. All the tests showed that the newly developed method is in most cases superior to the other tested algorithms.

Comments and mistakes:

1. Line 142: There is: “... will be demonstrated in Section V”. I suppose the authors meant Section 5.

2. Lines 171 and 175: “DBoW2” or “DBOW2”?

3. Lines 204-205: “EdLines” is used as the name of the algorithm while it is “EDlines” in the rest of the manuscript.

4. Equation (1) and Line 234: “... the rest of the parameters are the same as defined in EDlines.” Please explain the rest of the parameters anyway.

5. Lines 285-288 and 318-324: Please check the font and line spacing.

6. Lines 318 and 322: The symbol L_i^W looks different from the corresponding symbol in Figure 3.

7. Line 428: “more unreliable” sounds strange. Could you replace it with “less reliable”?

8. Figure 5: The overlapping lines of different colours make the picture illegible. The idea works only for the right picture. Could the authors propose an alternative way of presenting the results that would also be applicable to differences that are not graphically visible?

9. How were the individual algorithms evaluated in the real-world experiment? Did the authors record a ground-truth trajectory of the D455 unit?

10. The results of the experiment in the real-world scenario should be mentioned in the Conclusions section.

Please check the lexical correctness of the following fragments:

Line 133: “The structure of our method, which is mainly improved based on of ORB-SLAM3 …”

Line 428: “more unreliable”

Author Response

Please see the attachment.

Author Response File: Author Response.docx
