A Multi-Strategy Visual SLAM System for Motion Blur Handling in Indoor Dynamic Environments
Abstract
1. Introduction
- We present a semantic information compensation mechanism that predicts and recovers missing semantic information through temporal analysis. This mechanism enables our method to cope with failures of object identification.
- We propose a fusion method that generates segmented depth masks incorporating reliable depth information. In contrast to semantic masks, which are susceptible to incomplete segmentation, the proposed masks augment the integrity of RGB-derived semantic information, allowing superior discrimination between dynamic objects and the background.
- We introduce a probability-based detection and elimination algorithm, combined with the segmented depth masks, which effectively eliminates the impact of dynamic feature points and overcomes the unreliability of mask boundaries in the presence of motion blur.
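As an illustration only, the probability-based elimination in the third contribution can be sketched as follows. The exponential update rule, the weight `alpha`, the threshold, and all function names here are hypothetical stand-ins, not the authors' actual formulation:

```python
def update_dynamic_probability(prior, in_mask, alpha=0.3):
    """Blend a feature's prior dynamic probability with the current-frame
    mask observation (hypothetical exponential update rule)."""
    observation = 1.0 if in_mask else 0.0
    return (1.0 - alpha) * prior + alpha * observation

def filter_features(points, probs, depth_mask, threshold=0.5, alpha=0.3):
    """Return the feature points whose updated dynamic probability stays
    below `threshold`, plus the updated probabilities for all points.
    `depth_mask` is a boolean grid (rows x cols) marking pixels covered
    by the segmented depth mask of a dynamic object."""
    kept, updated = [], []
    for (u, v), p in zip(points, probs):
        p_new = update_dynamic_probability(p, depth_mask[v][u], alpha)
        updated.append(p_new)
        if p_new < threshold:
            kept.append((u, v))
    return kept, updated
```

Carrying a probability across frames, rather than deleting every point inside the current mask, is what tolerates unreliable mask boundaries under blur: a point must be observed inside the mask repeatedly before it is discarded.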
2. Related Works
2.1. Dynamic SLAM by Fusing Semantic and Geometric Constraints
2.2. Dynamic SLAM Using Depth Images
3. Method
3.1. Missed Segmentation Compensation
3.2. Fusion of Depth Information with RGB-Derived Semantics
3.3. The Probability-Based Detection and Elimination Algorithm
4. Experimental Results and Discussion
4.1. Evaluation of the Missed Segmentation Compensation and Fusion Method
4.2. Comparison with Baseline Methods
4.3. Ablation Experiments
4.4. Dense 3D Mapping
4.5. Impact of Dilation Kernel Size on Performance
4.6. Runtime Analysis
5. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Conflicts of Interest
References
- Zachiotis, G.A.; Andrikopoulos, G.; Gomez, R.; Nakamura, K.; Nikolakopoulos, G. A survey on the application trends of home service robotics. In Proceedings of the 2018 IEEE International Conference on Robotics and Biomimetics (ROBIO), Kuala Lumpur, Malaysia, 12–15 December 2018; pp. 1999–2006.
- Doelling, K.; Shin, J.; Popa, D.O. Service robotics for the home: A state of the art review. In Proceedings of the 7th International Conference on PErvasive Technologies Related to Assistive Environments, Island of Rhodes, Greece, 27–30 May 2014; pp. 1–8.
- Cadena, C.; Carlone, L.; Carrillo, H.; Latif, Y.; Scaramuzza, D.; Neira, J.; Reid, I.; Leonard, J.J. Past, present, and future of simultaneous localization and mapping: Toward the robust-perception age. IEEE Trans. Robot. 2016, 32, 1309–1332.
- Taketomi, T.; Uchiyama, H.; Ikeda, S. Visual SLAM algorithms: A survey from 2010 to 2016. IPSJ Trans. Comput. Vis. Appl. 2017, 9, 16.
- Kazerouni, I.A.; Fitzgerald, L.; Dooly, G.; Toal, D. A survey of state-of-the-art on visual SLAM. Expert Syst. Appl. 2022, 205, 117734.
- Beghdadi, A.; Mallem, M. A comprehensive overview of dynamic visual SLAM and deep learning: Concepts, methods and challenges. Mach. Vis. Appl. 2022, 33, 54.
- Bescos, B.; Fácil, J.M.; Civera, J.; Neira, J. DynaSLAM: Tracking, mapping, and inpainting in dynamic scenes. IEEE Robot. Autom. Lett. 2018, 3, 4076–4083.
- Hu, X.; Zhang, Y.; Cao, Z.; Ma, R.; Wu, Y.; Deng, Z.; Sun, W. CFP-SLAM: A real-time visual SLAM based on coarse-to-fine probability in dynamic environments. In Proceedings of the 2022 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), Kyoto, Japan, 23–27 October 2022; pp. 4399–4406.
- Xu, G.; Yu, Z.; Xing, G.; Zhang, X.; Pan, F. Visual odometry algorithm based on geometric prior for dynamic environments. Int. J. Adv. Manuf. Technol. 2022, 122, 235–242.
- Xiang, Y.; Zhou, H.; Li, C.; Sun, F.; Li, Z.; Xie, Y. Deep learning in motion deblurring: Current status, benchmarks and future prospects. Vis. Comput. 2024, 1–27.
- Xu, C.; Li, C.T.; Hu, Y.; Lim, C.P.; Creighton, D. Deep learning techniques for video instance segmentation: A survey. arXiv 2023, arXiv:2310.12393.
- Fan, Y.; Zhang, Q.; Tang, Y.; Liu, S.; Han, H. Blitz-SLAM: A semantic SLAM in dynamic environments. Pattern Recognit. 2022, 121, 108225.
- Qin, Y.; Yu, H. A review of visual SLAM with dynamic objects. Ind. Robot. Int. J. Robot. Res. Appl. 2023, 50, 1000–1010.
- Kruger, J.; Ehrhardt, J.; Handels, H. Probabilistic appearance models for segmentation and classification. In Proceedings of the IEEE International Conference on Computer Vision, Santiago, Chile, 7–13 December 2015; pp. 1698–1706.
- Ahmed, H.; Nandi, A.K. Probabilistic classification methods. In Condition Monitoring with Vibration Signals: Compressive Sampling and Learning Algorithms for Rotating Machines; Wiley Press: New York, NY, USA, 2019; pp. 225–237.
- He, J.; Li, M.; Wang, Y.; Wang, H. OVD-SLAM: An online visual SLAM for dynamic environments. IEEE Sens. J. 2023, 23, 13210–13219.
- Li, M.; He, J.; Jiang, G.; Wang, H. DDN-SLAM: Real-time dense dynamic neural implicit SLAM with joint semantic encoding. arXiv 2024, arXiv:2401.01545.
- Bescos, B.; Campos, C.; Tardós, J.D.; Neira, J. DynaSLAM II: Tightly-coupled multi-object tracking and SLAM. IEEE Robot. Autom. Lett. 2021, 6, 5191–5198.
- Wu, W.; Guo, L.; Gao, H.; You, Z.; Liu, Y.; Chen, Z. YOLO-SLAM: A semantic SLAM system towards dynamic environment with geometric constraint. Neural Comput. Appl. 2022, 34, 6011–6026.
- Yu, C.; Liu, Z.; Liu, X.J.; Xie, F.; Yang, Y.; Wei, Q.; Fei, Q. DS-SLAM: A semantic visual SLAM towards dynamic environments. In Proceedings of the 2018 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), Madrid, Spain, 1–5 October 2018; pp. 1168–1174.
- Li, S.; Lee, D. RGB-D SLAM in dynamic environments using static point weighting. IEEE Robot. Autom. Lett. 2017, 2, 2263–2270.
- He, K.; Gkioxari, G.; Dollár, P.; Girshick, R. Mask R-CNN. In Proceedings of the IEEE International Conference on Computer Vision, Venice, Italy, 22–29 October 2017; pp. 2961–2969.
- Ji, T.; Wang, C.; Xie, L. Towards real-time semantic RGB-D SLAM in dynamic environments. In Proceedings of the 2021 IEEE International Conference on Robotics and Automation (ICRA), Xi’an, China, 30 May–5 June 2021; pp. 11175–11181.
- Badrinarayanan, V.; Kendall, A.; Cipolla, R. SegNet: A deep convolutional encoder-decoder architecture for image segmentation. IEEE Trans. Pattern Anal. Mach. Intell. 2017, 39, 2481–2495.
- Zhong, F.; Wang, S.; Zhang, Z.; Wang, Y. Detect-SLAM: Making object detection and SLAM mutually beneficial. In Proceedings of the 2018 IEEE Winter Conference on Applications of Computer Vision (WACV), Lake Tahoe, NV, USA, 12–15 March 2018; pp. 1001–1010.
- Liu, W.; Anguelov, D.; Erhan, D.; Szegedy, C.; Reed, S.; Fu, C.Y.; Berg, A.C. SSD: Single Shot MultiBox Detector. In Proceedings of the Computer Vision–ECCV 2016: 14th European Conference, Amsterdam, The Netherlands, 11–14 October 2016; Proceedings, Part I 14. Springer: Berlin/Heidelberg, Germany, 2016; pp. 21–37.
- Dvornik, N.; Shmelkov, K.; Mairal, J.; Schmid, C. BlitzNet: A real-time deep network for scene understanding. In Proceedings of the IEEE International Conference on Computer Vision, Venice, Italy, 22–29 October 2017; pp. 4154–4162.
- Sun, Q.; Liu, W.; Zou, J.; Xu, Z.; Li, Y. GGC-SLAM: A VSLAM system based on predicted static probability of feature points in dynamic environments. Signal Image Video Process. 2024, 18, 7053–7064.
- Palazzolo, E.; Behley, J.; Lottes, P.; Giguere, P.; Stachniss, C. ReFusion: 3D reconstruction in dynamic environments for RGB-D cameras exploiting residuals. In Proceedings of the 2019 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), Macau, China, 3–8 November 2019; pp. 7855–7862.
- Jin, J.; Jiang, X.; Yu, C.; Zhao, L.; Tang, Z. Dynamic visual simultaneous localization and mapping based on semantic segmentation module. Appl. Intell. 2023, 53, 19418–19432.
- Virgolino Soares, J.C.; Medeiros, V.S.; Abati, G.F.; Becker, M.; Caurin, G.; Gattass, M.; Meggiolaro, M.A. Visual localization and mapping in dynamic and changing environments. J. Intell. Robot. Syst. 2023, 109, 95.
- Vincent, J.; Labbé, M.; Lauzon, J.S.; Grondin, F.; Comtois-Rivet, P.M.; Michaud, F. Dynamic object tracking and masking for visual SLAM. In Proceedings of the 2020 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), Las Vegas, NV, USA, 25–29 October 2020; pp. 4974–4979.
- Ayman, B.; Malik, M.; Lotfi, B. DAM-SLAM: Depth attention module in a semantic visual SLAM based on objects interaction for dynamic environments. Appl. Intell. 2023, 53, 25802–25815.
- Kan, X.; Shi, G.; Yang, X.; Hu, X. YPR-SLAM: A SLAM system combining object detection and geometric constraints for dynamic scenes. Sensors 2024, 24, 6576.
- Li, J.; Dai, J.; Su, Z.; Zhu, C. RGB-D based visual SLAM algorithm for indoor crowd environment. J. Intell. Robot. Syst. 2024, 110, 27.
- Cheng, T.; Wang, X.; Chen, S.; Zhang, W.; Zhang, Q.; Huang, C.; Zhang, Z.; Liu, W. Sparse instance activation for real-time instance segmentation. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, New Orleans, LA, USA, 18–24 June 2022; pp. 4433–4442.
- Sturm, J.; Engelhard, N.; Endres, F.; Burgard, W.; Cremers, D. A benchmark for the evaluation of RGB-D SLAM systems. In Proceedings of the 2012 IEEE/RSJ International Conference on Intelligent Robots and Systems, Vilamoura-Algarve, Portugal, 7–12 October 2012; pp. 573–580.
- Mur-Artal, R.; Tardós, J.D. ORB-SLAM2: An open-source SLAM system for monocular, stereo, and RGB-D cameras. IEEE Trans. Robot. 2017, 33, 1255–1262.
- Soares, J.C.V.; Gattass, M.; Meggiolaro, M.A. Crowd-SLAM: Visual SLAM towards crowded environments using object detection. J. Intell. Robot. Syst. 2021, 102, 50.
- Cheng, S.; Sun, C.; Zhang, S.; Zhang, D. SG-SLAM: A real-time RGB-D visual SLAM toward dynamic scenes with semantic and geometric information. IEEE Trans. Instrum. Meas. 2022, 72, 7501012.
- Fu, F.; Yang, J.; Ma, J.; Zhang, J. Dynamic visual SLAM based on probability screening and weighting for deep features. Measurement 2024, 236, 115127.
| Sequence | ORB-SLAM2 | DynaSLAM | Crowd-SLAM | YOLO-SLAM | SG-SLAM | Ours |
|---|---|---|---|---|---|---|
| fr3/w/xyz | 0.729 / 0.386 | 0.016 / 0.009 | 0.020 / 0.017 | 0.015 / 0.007 | 0.015 / 0.008 | 0.014 / 0.007 |
| fr3/w/half | 0.419 / 0.245 | 0.030 / 0.016 | 0.026 / 0.022 | 0.028 / 0.014 | 0.027 / 0.013 | 0.023 / 0.011 |
| fr3/w/static | 0.367 / 0.161 | 0.007 / 0.003 | 0.007 / 0.007 | 0.007 / 0.004 | 0.007 / 0.003 | 0.006 / 0.003 |
| fr3/w/rpy | 0.775 / 0.411 | 0.035 / 0.019 | 0.044 / 0.031 | 0.216 / 0.100 | 0.032 / 0.019 | 0.031 / 0.018 |
| fr3/s/static | 0.009 / 0.004 | – | 0.008 / 0.007 | 0.007 / 0.003 | 0.006 / 0.003 | 0.005 / 0.003 |

Each cell lists RMSE / S.D.; "–" denotes a result not reported.
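The RMSE and S.D. columns above are presumably computed over the per-frame absolute trajectory errors (Euclidean distances between estimated and ground-truth camera positions after alignment), as in the TUM RGB-D benchmark tooling. A minimal sketch, assuming the population standard deviation (conventions vary between tools):

```python
import math

def ate_stats(errors):
    """RMSE and standard deviation of per-frame absolute trajectory
    errors, matching the RMSE / S.D. columns in the tables."""
    n = len(errors)
    rmse = math.sqrt(sum(e * e for e in errors) / n)
    mean = sum(errors) / n
    sd = math.sqrt(sum((e - mean) ** 2 for e in errors) / n)  # population S.D.
    return rmse, sd
```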
| Sequence | ORB-SLAM2 | DynaSLAM | Crowd-SLAM | YOLO-SLAM | SG-SLAM | Ours |
|---|---|---|---|---|---|---|
| crowd1 | 0.533 / 0.361 | 0.016 / – | 0.018 / – | 0.033 / – | 0.023 / 0.014 | 0.016 / 0.008 |
| crowd2 | 1.311 / 0.257 | 0.031 / – | 0.030 / – | 0.423 / – | 0.060 / 0.040 | 0.031 / 0.018 |
| crowd3 | 0.780 / 0.306 | 0.038 / – | 0.034 / – | 0.069 / – | 0.032 / 0.022 | 0.026 / 0.017 |
| person1 | 0.796 / 0.709 | – | – | 0.157 / – | 0.040 / 0.014 | 0.038 / 0.012 |
| person2 | 0.898 / 0.405 | – | – | 0.037 / – | 0.038 / 0.016 | 0.034 / 0.013 |

Each cell lists RMSE / S.D.; "–" denotes a result not reported.
| Sequence | Ours | w/o Compensation | w/o Depth | w/o Probability |
|---|---|---|---|---|
| fr3/w/xyz | 0.0143 / 0.0072 | 0.0150 / 0.0075 | 0.0153 / 0.0075 | 0.0156 / 0.0077 |
| fr3/w/half | 0.0233 / 0.0114 | 0.0245 / 0.0128 | 0.0244 / 0.0121 | 0.0253 / 0.0134 |
| fr3/w/static | 0.0058 / 0.0027 | 0.0064 / 0.0029 | 0.0065 / 0.0030 | 0.0064 / 0.0032 |
| fr3/w/rpy | 0.0314 / 0.0177 | 0.0331 / 0.0196 | 0.0329 / 0.0180 | 0.0344 / 0.0211 |
| fr3/s/static | 0.0054 / 0.0026 | 0.0060 / 0.0029 | 0.0053 / 0.0025 | 0.0064 / 0.0028 |

Each cell lists RMSE / S.D.
| Sequence | ORB Extraction | Compensation | Fusion | Probability | Others | Total |
|---|---|---|---|---|---|---|
| fr3/w/half | 14.44 | 4.35 | 7.34 | 4.57 | 16.79 | 47.49 |
| fr3/w/static | 14.61 | 4.14 | 7.58 | 4.56 | 8.58 | 39.47 |
| fr3/s/static | 15.14 | 4.37 | 11.85 | 4.29 | 7.84 | 43.49 |
Share and Cite
Huai, S.; Cao, L.; Zhou, Y.; Guo, Z.; Gai, J. A Multi-Strategy Visual SLAM System for Motion Blur Handling in Indoor Dynamic Environments. Sensors 2025, 25, 1696. https://doi.org/10.3390/s25061696