RISE-VIO: Robust Initialization and Targeted Pose Robustification for INS-Centric Visual–Inertial Odometry Under Degraded Visual Conditions
Abstract
1. Introduction
- A lightweight and system-compatible integration of graduated non-convexity (GNC) into two failure-critical stages of INS-centric VIO—startup initialization and per-frame pose estimation—without reformulating the full backend as a robust optimization problem.
- A GNC-robustified decoupled initialization module for feature-based VIO, combining gyroscope-bias estimation with a two-stage observability test to improve startup reliability under weak excitation and degraded tracking.
- An IMU-prior-guided GNC-EPnP module for robust per-frame pose estimation, improving tolerance to correspondence outliers and dynamic interference while preserving the efficiency of EPnP.
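As background for the graduated non-convexity scheme used in both contributions, the sketch below illustrates GNC with a truncated least squares (TLS) loss, following the closed-form weight update of Yang et al. (2020): alternate a weighted least-squares step with a weight update while the control parameter μ is increased, morphing a convex surrogate into the TLS objective. The estimation task (a robust mean) and all parameter values are illustrative placeholders, not the paper's actual objectives.

```python
import numpy as np

def gnc_tls_weights(r_sq, c_sq, mu):
    """Closed-form weight update for the TLS loss under GNC
    (Yang et al., 2020); mu -> infinity recovers hard truncation."""
    w = np.empty_like(r_sq)
    lo = mu / (mu + 1.0) * c_sq      # below: confident inlier
    hi = (mu + 1.0) / mu * c_sq      # above: confident outlier
    w[r_sq <= lo] = 1.0
    w[r_sq >= hi] = 0.0
    mid = (r_sq > lo) & (r_sq < hi)
    w[mid] = np.sqrt(c_sq * mu * (mu + 1.0) / r_sq[mid]) - mu
    return w

def gnc_robust_mean(x, c=0.1, mu_factor=1.4, iters=50):
    """Robust location estimate: alternate a weighted least-squares
    step with the GNC weight update while increasing mu."""
    x = np.asarray(x, dtype=float)
    est = x.mean()                            # non-robust initial guess
    r_sq = (x - est) ** 2
    mu = max(c**2 / (2.0 * r_sq.max() - c**2), 1e-6)  # start near convex surrogate
    for _ in range(iters):
        w = gnc_tls_weights(r_sq, c**2, mu)
        if w.sum() == 0.0:
            break
        est = (w * x).sum() / w.sum()         # weighted LS minimizer
        r_sq = (x - est) ** 2
        mu *= mu_factor                       # graduate toward pure TLS
    return est, w
```

Unlike RANSAC, no random sampling is involved: every measurement keeps a continuous weight that hardens to 0/1 as μ grows, which is what makes GNC cheap to embed in existing least-squares pipelines.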
2. Related Work
3. System Overview
4. Method
4.1. Notation and Assumptions
4.2. GNC Preliminaries
4.3. Initialization with NEC and GNC-Based Gyroscope Bias Estimation
4.3.1. GNC-Based Gyroscope Bias Estimation
4.3.2. Two-Stage Observability Test
Stage I—Rotational Excitation
Stage II—Spectral Stability of LiGT
4.4. Hierarchical Outlier Rejection for Robust Tracking
4.4.1. Frontend and Keyframe Management
4.4.2. GNC-EPnP: Graduated Non-Convex Pose Estimation
4.4.3. Temporal Landmark Classification
4.4.4. Spatial Voxelization and IMU-Aided Short-Term Integration
Algorithm 1: GNC-EPnP Inlier Solver.
5. Experiments
5.1. Experimental Setup and Evaluation Protocol
5.2. Robust Initialization on EuRoC
5.3. Controlled Monte Carlo Perspective-n-Point (PnP) Evaluation
5.4. Dynamic Scene Stress Test on VIODE
5.5. System-Level Benchmarking and Ablation on EuRoC
5.6. Runtime and Real-Time Performance
6. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Conflicts of Interest
References
- Qin, T.; Li, P.; Shen, S. VINS-Mono: A robust and versatile monocular visual-inertial state estimator. IEEE Trans. Robot. 2018, 34, 1004–1020.
- Campos, C.; Elvira, R.; Rodríguez, J.J.G.; Montiel, J.M.; Tardós, J.D. ORB-SLAM3: An accurate open-source library for visual, visual–inertial, and multimap SLAM. IEEE Trans. Robot. 2021, 37, 1874–1890.
- Campos, C.; Montiel, J.M.; Tardós, J.D. Inertial-only optimization for visual-inertial initialization. In Proceedings of the 2020 IEEE International Conference on Robotics and Automation (ICRA); IEEE: New York, NY, USA, 2020; pp. 51–57.
- He, Y.; Xu, B.; Ouyang, Z.; Li, H. A rotation-translation-decoupled solution for robust and efficient visual-inertial initialization. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition; IEEE: New York, NY, USA, 2023; pp. 739–748.
- Wang, W.; Li, J.; Ming, Y.; Mordohai, P. EDI: ESKF-based disjoint initialization for visual-inertial SLAM systems. In Proceedings of the 2023 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS); IEEE: New York, NY, USA, 2023; pp. 1466–1472.
- Wang, W.; Chou, C.; Sevagamoorthy, G.; Chen, K.; Chen, Z.; Feng, Z.; Xia, Y.; Cai, F.; Xu, Y.; Mordohai, P. Stereo-NEC: Enhancing stereo visual-inertial SLAM initialization with normal epipolar constraints. In Proceedings of the 2024 IEEE International Conference on Robotics and Automation (ICRA); IEEE: New York, NY, USA, 2024; pp. 2691–2697.
- Xu, Z.; He, Y.; Wei, H.; Wu, Y. DOGE: An extrinsic orientation and gyroscope bias estimation for visual-inertial odometry initialization. In Proceedings of the 2025 IEEE International Conference on Robotics and Automation (ICRA); IEEE: New York, NY, USA, 2025; pp. 9862–9868.
- Li, J.; Pan, X.; Huang, G.; Zhang, Z.; Wang, N.; Bao, H.; Zhang, G. RD-VIO: Robust visual-inertial odometry for mobile augmented reality in dynamic environments. IEEE Trans. Vis. Comput. Graph. 2024, 30, 6941–6955.
- Song, S.; Lim, H.; Lee, A.J.; Myung, H. DynaVINS++: Robust visual-inertial state estimator in dynamic environments by adaptive truncated least squares and stable state recovery. IEEE Robot. Autom. Lett. 2024, 9, 9127–9134.
- Zhang, J.; Zhang, C.; Liu, Q.; Ma, Q.; Qin, J. Tightly-coupled visual-inertial odometry with robust feature association in dynamic illumination environments. Robotica 2025, 43, 2304–2319.
- Li, X.; Liu, C.; Yan, X. Robust visual-inertial odometry with learning-based line features in an illumination-changing environment. Sensors 2025, 25, 5029.
- Mur-Artal, R.; Tardós, J.D. Visual-inertial monocular SLAM with map reuse. IEEE Robot. Autom. Lett. 2017, 2, 796–803.
- Dai, W.; Zhang, Y.; Li, P.; Fang, Z.; Scherer, S. RGB-D SLAM in dynamic environments using point correlations. IEEE Trans. Pattern Anal. Mach. Intell. 2020, 44, 373–389.
- Palazzolo, E.; Behley, J.; Lottes, P.; Giguere, P.; Stachniss, C. ReFusion: 3D reconstruction in dynamic environments for RGB-D cameras exploiting residuals. In Proceedings of the 2019 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS); IEEE: New York, NY, USA, 2019; pp. 7855–7862.
- Mu, C.; Feng, D.; Zheng, Q.; Zhuang, Y. A robust and efficient visual-inertial initialization with probabilistic normal epipolar constraint. IEEE Robot. Autom. Lett. 2025, 10, 3590–3597.
- Bescos, B.; Fácil, J.M.; Civera, J.; Neira, J. DynaSLAM: Tracking, mapping, and inpainting in dynamic scenes. IEEE Robot. Autom. Lett. 2018, 3, 4076–4083.
- Bescos, B.; Campos, C.; Tardós, J.D.; Neira, J. DynaSLAM II: Tightly-coupled multi-object tracking and SLAM. IEEE Robot. Autom. Lett. 2021, 6, 5191–5198.
- Yu, C.; Liu, Z.; Liu, X.J.; Xie, F.; Yang, Y.; Wei, Q.; Fei, Q. DS-SLAM: A semantic visual SLAM towards dynamic environments. In Proceedings of the 2018 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS); IEEE: New York, NY, USA, 2018; pp. 1168–1174.
- Janai, J.; Güney, F.; Behl, A.; Geiger, A. Computer vision for autonomous vehicles: Problems, datasets and state of the art. Found. Trends Comput. Graph. Vis. 2020, 12, 1–308.
- Lee, S.; Son, C.Y.; Kim, H.J. Robust real-time RGB-D visual odometry in dynamic environments via rigid motion model. In Proceedings of the 2019 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS); IEEE: New York, NY, USA, 2019; pp. 6891–6898.
- Du, Z.J.; Huang, S.S.; Mu, T.J.; Zhao, Q.; Martin, R.R.; Xu, K. Accurate dynamic SLAM using CRF-based long-term consistency. IEEE Trans. Vis. Comput. Graph. 2020, 28, 1745–1757.
- Sun, Y.; Liu, M.; Meng, M.Q.H. Improving RGB-D SLAM in dynamic environments: A motion removal approach. Robot. Auton. Syst. 2017, 89, 110–122.
- Scona, R.; Jaimez, M.; Petillot, Y.R.; Fallon, M.; Cremers, D. StaticFusion: Background reconstruction for dense RGB-D SLAM in dynamic environments. In Proceedings of the 2018 IEEE International Conference on Robotics and Automation (ICRA); IEEE: New York, NY, USA, 2018; pp. 3849–3856.
- Minervini, A.; Carrio, A.; Guglieri, G. Enhancing visual–inertial odometry robustness and accuracy in challenging environments. Robotics 2025, 14, 71.
- Wang, K.; Chai, D.; Wang, X.; Yan, R.; Ning, Y.; Sang, W.; Wang, S. YSAG-VINS: A robust visual-inertial navigation system with adaptive geometric constraints and semantic information based on YOLOv8n-ODUIB in dynamic environments. Appl. Sci. 2025, 15, 10595.
- Fischler, M.A.; Bolles, R.C. Random sample consensus: A paradigm for model fitting with applications to image analysis and automated cartography. Commun. ACM 1981, 24, 381–395.
- Yang, H.; Antonante, P.; Tzoumas, V.; Carlone, L. Graduated non-convexity for robust spatial perception: From non-minimal solvers to global outlier rejection. IEEE Robot. Autom. Lett. 2020, 5, 1127–1134.
- Song, S.; Lim, H.; Lee, A.J.; Myung, H. DynaVINS: A visual-inertial SLAM for dynamic environments. IEEE Robot. Autom. Lett. 2022, 7, 11523–11530.
- Kneip, L.; Siegwart, R.; Pollefeys, M. Finding the exact rotation between two images independently of the translation. In Proceedings of the European Conference on Computer Vision; Springer: Berlin/Heidelberg, Germany, 2012; pp. 696–709.
- Cai, Q.; Zhang, L.; Wu, Y.; Yu, W.; Hu, D. A pose-only solution to visual reconstruction and navigation. IEEE Trans. Pattern Anal. Mach. Intell. 2021, 45, 73–86.
- Lepetit, V.; Moreno-Noguer, F.; Fua, P. EPnP: An accurate O(n) solution to the PnP problem. Int. J. Comput. Vis. 2009, 81, 155–166.
- Burri, M.; Nikolic, J.; Gohl, P.; Schneider, T.; Rehder, J.; Omari, S.; Achtelik, M.W.; Siegwart, R. The EuRoC micro aerial vehicle datasets. Int. J. Robot. Res. 2016, 35, 1157–1163.
- Minoda, K.; Schilling, F.; Wüest, V.; Floreano, D.; Yairi, T. VIODE: A simulated dataset to address the challenges of visual-inertial odometry in dynamic environments. IEEE Robot. Autom. Lett. 2021, 6, 1343–1350.
- Geneva, P.; Eckenhoff, K.; Lee, W.; Yang, Y.; Huang, G. OpenVINS: A research platform for visual-inertial estimation. In Proceedings of the 2020 IEEE International Conference on Robotics and Automation (ICRA); IEEE: New York, NY, USA, 2020; pp. 4666–4672.
| Seq. | Raw | Drt150 | Gnc150 | Drt200 | Gnc200 |
|---|---|---|---|---|---|
| MH_01_easy | 0.180 | 0.151 | 0.153 | 0.193 | 0.194 |
| MH_02_easy | 0.105 | 0.086 | 0.086 | 0.091 | 0.092 |
| MH_03_medium | 0.143 | 0.115 | 0.116 | 0.107 | 0.100 |
| MH_04_difficult | 0.216 | 0.205 | 0.199 | 0.208 | 0.181 |
| MH_05_difficult | 0.290 | 0.254 | 0.254 | 0.218 | 0.183 |
| V1_01_easy | 0.071 | 0.100 | 0.072 | 0.069 | 0.070 |
| V1_02_medium | 0.128 | 0.093 | 0.093 | 0.097 | 0.098 |
| V1_03_difficult | 0.170 | 0.163 | 0.163 | 0.155 | 0.155 |
| V2_01_easy | 0.142 | 0.080 | 0.070 | 0.065 | 0.063 |
| V2_02_medium | 0.167 | 0.160 | 0.159 | 0.155 | 0.150 |
| V2_03_difficult | 0.187 | 0.155 | 0.153 | 0.208 | 0.192 |
| Scenario | Level | ORB-SLAM3 | VINS-Fusion | RISE-VIO |
|---|---|---|---|---|
| City day | 0_none | 1.940 | 0.210 | 0.129 |
| | 1_low | 0.857 | 0.182 | 0.170 |
| | 2_mid | 4.486 | 0.560 | 0.216 |
| | 3_high | * | 0.510 | 0.292 |
| City night | 0_none | * | 0.328 | 0.262 |
| | 1_low | * | 0.371 | 0.298 |
| | 2_mid | * | 0.457 | 0.262 |
| | 3_high | * | 0.464 | 0.385 |
| Parking lot | 0_none | 0.415 | 0.102 | 0.116 |
| | 1_low | 0.245 | 0.138 | 0.126 |
| | 2_mid | 3.807 | 0.707 | 0.227 |
| | 3_high | 4.687 | 1.135 | 0.310 |
| Seq. | VINS-Fusion | OpenVINS | RISE-VIO (a) | RISE-VIO (b) | RISE-VIO (Full) |
|---|---|---|---|---|---|
| MH_01_easy | 0.180 | 0.157 | 0.130 | 0.121 | 0.114 |
| MH_02_easy | 0.105 | 0.104 | 0.098 | 0.079 | 0.070 |
| MH_03_medium | 0.143 | 0.280 | 0.144 | 0.163 | 0.119 |
| MH_04_difficult | 0.216 | 0.174 | 0.163 | 0.160 | 0.148 |
| MH_05_difficult | 0.290 | 0.262 | 0.131 | 0.187 | 0.131 |
| V1_01_easy | 0.071 | 0.070 | 0.075 | 0.094 | 0.073 |
| V1_02_medium | 0.128 | 0.261 | 0.116 | 0.113 | 0.099 |
| V1_03_difficult | 0.170 | 0.080 | 0.102 | 0.113 | 0.098 |
| V2_01_easy | 0.142 | 0.110 | 0.075 | 0.063 | 0.061 |
| V2_02_medium | 0.167 | 0.097 | 0.137 | 0.104 | 0.102 |
| V2_03_difficult | 0.187 | 0.146 | 0.166 | 0.154 | 0.132 |
| Seq. | Mean (m) | Std (m) |
|---|---|---|
| MH_01_easy | 0.114 | 0.010 |
| MH_02_easy | 0.070 | 0.012 |
| MH_03_medium | 0.119 | 0.008 |
| MH_04_difficult | 0.148 | 0.021 |
| MH_05_difficult | 0.131 | 0.014 |
| V1_01_easy | 0.073 | 0.006 |
| V1_02_medium | 0.099 | 0.006 |
| V1_03_difficult | 0.098 | 0.002 |
| V2_01_easy | 0.061 | 0.003 |
| V2_02_medium | 0.102 | 0.006 |
| V2_03_difficult | 0.132 | 0.006 |
| Seq. | RTE @ 3 m (VINS-Fusion) | RTE @ 3 m (RISE-VIO Full) | RTE @ 5 m (VINS-Fusion) | RTE @ 5 m (RISE-VIO Full) | RTE @ 10 m (VINS-Fusion) | RTE @ 10 m (RISE-VIO Full) |
|---|---|---|---|---|---|---|
| MH_01_easy | 0.076 | 0.071 | 0.111 | 0.102 | 0.180 | 0.168 |
| MH_02_easy | 0.071 | 0.063 | 0.104 | 0.110 | 0.186 | 0.262 |
| MH_03_medium | 0.101 | 0.087 | 0.146 | 0.145 | 0.225 | 0.224 |
| MH_04_difficult | 0.112 | 0.093 | 0.166 | 0.143 | 0.282 | 0.254 |
| MH_05_difficult | 0.106 | 0.097 | 0.179 | 0.158 | 0.248 | 0.326 |
| V1_01_easy | 0.228 | 0.221 | 0.317 | 0.300 | 0.303 | 0.302 |
| V1_02_medium | 0.135 | 0.114 | 0.200 | 0.135 | 0.217 | 0.124 |
| V1_03_difficult | 0.101 | 0.096 | 0.122 | 0.125 | 0.171 | 0.151 |
| V2_01_easy | 0.090 | 0.073 | 0.086 | 0.099 | 0.109 | 0.125 |
| V2_02_medium | 0.091 | 0.068 | 0.139 | 0.096 | 0.132 | 0.153 |
| V2_03_difficult | 0.097 | 0.108 | 0.121 | 0.139 | 0.132 | 0.167 |
| Config | Wall time (s) | CPU (%) | Max RSS (MB) | Delay (s) Mean/Median/p95 |
|---|---|---|---|---|
| Benchmark | 132.47 | 112 | 190.4 | 0.089/0.080/0.137 |
| Demo | 142.63 | 252 | 798.4 | 0.096/0.088/0.142 |
© 2026 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license.
Share and Cite
Xu, X.; Ju, R.; Jiao, W.; Li, L. RISE-VIO: Robust Initialization and Targeted Pose Robustification for INS-Centric Visual–Inertial Odometry Under Degraded Visual Conditions. Sensors 2026, 26, 2305. https://doi.org/10.3390/s26082305

