A Semantic Information-Based Optimized vSLAM in Indoor Dynamic Environments
Round 1
Reviewer 1 Report
The article addresses an active field of research and application. However, comparisons with previous studies should be more extensive. My further suggestions are:
1. All images should be re-prepared in high resolution.
2. More visual and graphical results should be added.
3. The previous studies cited should include more recent work.
4. A comprehensive discussion section is required.
5. Real-time performance should be analyzed in detail.
Author Response
Please see the attachment.
Author Response File: Author Response.pdf
Reviewer 2 Report
The paper presents a semantic information-based optimized vSLAM algorithm that addresses the challenges of dynamic indoor environments with the following contributions: (i) a strategy that enables vSLAM to initialize local maps and poses robustly and efficiently in dynamic environments, (ii) an efficient inter-frame pose estimation method based on the homography matrix and dense optical flow, and (iii) an optimized semantic segmentation method based on depth images for accurately annotating semantic objects in key frames.
Reasons to accept:
- The paper shows nice empirical gains over ORB-SLAM and DS-SLAM on the TUM RGB-D dataset.
- The paper is well written and easy to follow; however, the quality of the English can be improved.
Reasons to reject:
- The novelty of this paper is not evident, because many similar works already use deep learning methods such as semantic segmentation to improve SLAM.
- The comparison with previous methods is insufficient. Recent methods such as DROID-SLAM [1] and NICE-SLAM [2] are not included in the comparison. Although they are not designed for dynamic scenes, they show improvements over ORB-SLAM.
- The quality of the figures is quite poor and needs to be improved. In particular, Figure 1 is difficult to read and must be improved for acceptance.
[1] Teed, Zachary, and Jia Deng. "DROID-SLAM: Deep Visual SLAM for Monocular, Stereo, and RGB-D Cameras." Advances in Neural Information Processing Systems 34 (2021): 16558-16569.
[2] Zhu, Zihan, et al. "NICE-SLAM: Neural Implicit Scalable Encoding for SLAM." Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 2022.
Author Response
Please see the attachment.
Author Response File: Author Response.pdf
Reviewer 3 Report
The authors propose a semantic segmentation technique for dynamic scenarios. My overall impression is that the technical contribution of the current study is marginal. My major concerns about this article are listed below:
1. The performance of the proposed technique needs to be measured using advanced metrics in the results section.
2. The conclusion should be more compact and should state what is new in the proposed approach, what is better, the authors' observations, and the deductions drawn from those observations.
Author Response
Our responses are all included in the attachment.
Author Response File: Author Response.pdf
Round 2
Reviewer 2 Report
I am satisfied with the answers given by the authors. Therefore, I plan to accept the paper.
Author Response
Thank you for your conscientious review of our manuscript. We appreciate your valuable suggestions.