Online Calibration Method of LiDAR and Camera Based on Fusion of Multi-Scale Cost Volume
Abstract
1. Introduction
- We propose an end-to-end LiDAR–camera extrinsic calibration network that combines multi-scale feature extraction, multi-scale cost volume feature matching, and cost volume aggregation modules. A cascade of multi-scale cost volumes predicts the 6-DoF transformation parameters, improving calibration accuracy (a sketch of the matching step follows this list).
- The backbone extracts camera and LiDAR features via multi-scale feature extraction and fuses them with a 3D Hourglass network. A feature pyramid handles multi-scale information, improving adaptability to targets of varying scales, while the 3D Hourglass network aggregates the fused cost volume and preserves high-level semantic information.
- Calibration accuracy is evaluated on the KITTI dataset, where the model demonstrates strong generalization. By iterating through networks trained on progressively smaller miscalibration ranges, the model copes with a large initial error range.
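The cost volume matching step in the first bullet can be pictured concretely. Below is a minimal PyTorch sketch of a correlation-style cost volume between camera and LiDAR-depth feature maps, applied per feature-pyramid level; the names (`feat_cam`, `feat_lidar`, `max_shift`) and the correlation formulation are our assumptions for illustration, not the authors' exact implementation.

```python
# Illustrative sketch only: correlation cost volume between camera and
# LiDAR-depth features, per pyramid level. Not the paper's exact code.
import torch
import torch.nn.functional as F

def correlation_cost_volume(feat_cam, feat_lidar, max_shift=4):
    """Correlate camera features with LiDAR-depth features over a
    (2*max_shift+1)^2 window of spatial shifts."""
    b, c, h, w = feat_cam.shape
    padded = F.pad(feat_lidar, [max_shift] * 4)  # pad left/right/top/bottom
    costs = []
    for dy in range(2 * max_shift + 1):
        for dx in range(2 * max_shift + 1):
            shifted = padded[:, :, dy:dy + h, dx:dx + w]
            # Mean over channels gives one matching score per shift.
            costs.append((feat_cam * shifted).mean(dim=1, keepdim=True))
    return torch.cat(costs, dim=1)  # (B, (2*max_shift+1)^2, H, W)

def multi_scale_cost_volumes(cam_pyramid, lidar_pyramid, max_shift=4):
    """Build one cost volume per pyramid level, coarse to fine."""
    return [correlation_cost_volume(fc, fl, max_shift)
            for fc, fl in zip(cam_pyramid, lidar_pyramid)]
```

In a cascade of this kind, the coarse-level volumes constrain the search range at finer levels, which is what lets the network cover a large initial miscalibration without an exhaustive fine-scale search.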
2. Related Work
2.1. Calibration Methods Based on Specific Targets
2.2. Calibration Methods Without Specific Targets
2.3. Calibration Method Based on Deep Learning
3. Our Method
3.1. Multi-Scale Feature Extraction
3.2. Constructing Multi-Scale Cost Volumes
3.3. Multi-Scale Cost Volume Aggregation
3.4. Pose Estimation Module
3.5. Loss Function and Training Strategy
4. Experiments and Analysis
4.1. Dataset
4.2. Evaluation Metrics
4.3. Experimental Setup
4.4. Results and Discussion
4.4.1. Quantitative Results
4.4.2. Ablation Study
4.4.3. Visualizing the Calibration Effect
5. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Conflicts of Interest
References
- Yeong, D.J.; Velasco-Hernandez, G.; Barry, J.; Walsh, J. Sensor and sensor fusion technology in autonomous vehicles: A review. Sensors 2021, 21, 2140. [Google Scholar] [CrossRef] [PubMed]
- Duan, J.; Yu, S.; Tan, H.L.; Zhu, H.; Tan, C. A survey of embodied ai: From simulators to research tasks. IEEE Trans. Emerg. Top. Comput. Intell. 2022, 6, 230–244. [Google Scholar] [CrossRef]
- Zhuang, Y.; Sun, X.; Li, Y.; Huai, J.; Hua, L.; Yang, X.; Cao, X.; Zhang, P.; Cao, Y.; Qi, L.; et al. Multi-sensor integrated navigation/positioning systems using data fusion: From analytics-based to learning-based approaches. Inf. Fusion 2023, 95, 62–90. [Google Scholar] [CrossRef]
- Zhong, H.; Wang, H.; Wu, Z.; Zhang, C.; Zheng, Y.; Tang, T. A survey of LiDAR and camera fusion enhancement. Procedia Comput. Sci. 2021, 183, 579–588. [Google Scholar] [CrossRef]
- Wang, P. Research on comparison of LiDAR and camera in autonomous driving. J. Phys. Conf. Ser. 2021, 2093, 012032. [Google Scholar] [CrossRef]
- Grammatikopoulos, L.; Papanagnou, A.; Venianakis, A.; Kalisperakis, I.; Stentoumis, C. An effective camera-to-LiDAR spatiotemporal calibration based on a simple calibration target. Sensors 2022, 22, 5576. [Google Scholar] [CrossRef]
- Li, Y.; Yu, A.W.; Meng, T.; Caine, B.; Ngiam, J.; Peng, D.; Shen, J.; Lu, Y.; Zhou, D.; Le, Q.V.; et al. Deepfusion: LiDAR-camera deep fusion for multi-modal 3d object detection. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, New Orleans, LA, USA, 18–24 June 2022; pp. 17182–17191. [Google Scholar]
- Li, X.; Xiao, Y.; Wang, B.; Ren, H.; Zhang, Y.; Ji, J. Automatic targetless LiDAR–camera calibration: A survey. Artif. Intell. Rev. 2023, 56, 9949–9987. [Google Scholar] [CrossRef]
- Wang, L.; Huang, Y. LiDAR–camera fusion for road detection using a recurrent conditional random field model. Sci. Rep. 2022, 12, 11320. [Google Scholar] [CrossRef]
- Roriz, R.; Cabral, J.; Gomes, T. Automotive LiDAR technology: A survey. IEEE Trans. Intell. Transp. Syst. 2021, 23, 6282–6297. [Google Scholar] [CrossRef]
- Yan, G.; He, F.; Shi, C.; Wei, P.; Cai, X.; Li, Y. Joint camera intrinsic and LiDAR-camera extrinsic calibration. In Proceedings of the 2023 IEEE International Conference on Robotics and Automation (ICRA), London, UK, 29 May–2 June 2023; pp. 11446–11452. [Google Scholar]
- Zhu, J.; Xue, J.; Zhang, P. CalibDepth: Unifying depth map representation for iterative LiDAR-camera online calibration. In Proceedings of the 2023 IEEE International Conference on Robotics and Automation (ICRA), London, UK, 29 May–2 June 2023; pp. 726–733. [Google Scholar]
- Zhu, J.; Li, H.; Zhang, T. Camera, LiDAR, and IMU based multi-sensor fusion SLAM: A survey. Tsinghua Sci. Technol. 2023, 29, 415–429. [Google Scholar] [CrossRef]
- Liu, Z.; Chen, Z.; Wei, X.; Chen, W.; Wang, Y. External Extrinsic Calibration of Multi-modal Imaging Sensors: A Review. IEEE Access 2023, 11, 110417–110441. [Google Scholar] [CrossRef]
- Guo, Z.; Xiao, Z. Research on online calibration of LiDAR and camera for intelligent connected vehicles based on depth-edge matching. Nonlinear Eng. 2021, 10, 469–476. [Google Scholar] [CrossRef]
- Yan, G.; Liu, Z.; Wang, C.; Shi, C.; Wei, P.; Cai, X.; Ma, T.; Liu, Z.; Zhong, Z.; Liu, Y.; et al. Opencalib: A multi-sensor calibration toolbox for autonomous driving. Softw. Impacts 2022, 14, 100393. [Google Scholar] [CrossRef]
- Ye, C.; Pan, H.; Gao, H. Keypoint-based LiDAR-camera online calibration with robust geometric network. IEEE Trans. Instrum. Meas. 2021, 71, 1–11. [Google Scholar] [CrossRef]
- Shen, Z.; Dai, Y.; Song, X.; Rao, Z.; Zhou, D.; Zhang, L. PCW-Net: Pyramid combination and warping cost volume for stereo matching. In Proceedings of the European Conference on Computer Vision, Tel Aviv, Israel, 23–27 October 2022; pp. 280–297. [Google Scholar]
- Beltrán, J.; Guindel, C.; De La Escalera, A.; García, F. Automatic extrinsic calibration method for LiDAR and camera sensor setups. IEEE Trans. Intell. Transp. Syst. 2022, 23, 17677–17689. [Google Scholar] [CrossRef]
- Ou, J.; Huang, P.; Zhou, J.; Zhao, Y.; Lin, L. Automatic extrinsic calibration of 3D LiDAR and multi-cameras based on graph optimization. Sensors 2022, 22, 2221. [Google Scholar] [CrossRef]
- Huang, H.; Zhang, M.; Li, L.; Hu, J.; Wang, H. GTSCalib: Generalized Target Segmentation for Target-Based Extrinsic Calibration of Non-Repetitive Scanning LiDAR and Camera. IEEE Trans. Autom. Sci. Eng. 2024, 22, 3648–3660. [Google Scholar] [CrossRef]
- Zhang, Q.; Pless, R. Extrinsic calibration of a camera and laser range finder (improves camera calibration). In Proceedings of the 2004 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), Sendai, Japan, 28 September–2 October 2004; (IEEE Cat. No. 04CH37566). IEEE: Piscataway, NJ, USA, 2004; Volume 3, pp. 2301–2306. [Google Scholar]
- Unnikrishnan, R.; Hebert, M. Fast Extrinsic Calibration of a Laser Rangefinder to a Camera; Tech. Rep. CMU-RI-TR-05-09; Robotics Institute: Pittsburgh, PA, USA, 2005. [Google Scholar]
- Pandey, G.; McBride, J.; Savarese, S.; Eustice, R. Extrinsic calibration of a 3d laser scanner and an omnidirectional camera. IFAC Proc. Vol. 2010, 43, 336–341. [Google Scholar] [CrossRef]
- Kwak, K.; Huber, D.F.; Badino, H.; Kanade, T. Extrinsic calibration of a single line scanning LiDAR and a camera. In Proceedings of the 2011 IEEE/RSJ International Conference on Intelligent Robots and Systems, San Francisco, CA, USA, 25–30 September 2011; IEEE: Piscataway, NJ, USA, 2011; pp. 3283–3289. [Google Scholar]
- Park, Y.; Yun, S.; Won, C.S.; Cho, K.; Um, K.; Sim, S. Calibration between color camera and 3d LiDAR instruments with a polygonal planar board. Sensors 2014, 14, 5333–5353. [Google Scholar] [CrossRef]
- Dhall, A.; Chelani, K.; Radhakrishnan, V.; Krishna, K. LiDAR-camera calibration using 3D-3D point correspondences. arXiv 2017, arXiv:1705.09785. [Google Scholar]
- Velas, M.; Španěl, M.; Materna, Z.; Herout, A. Calibration of RGB Camera with Velodyne LiDAR. 2014. Available online: https://www.fit.vut.cz/research/publication-file/10578/Calibration_of_RGB_Camera_With_Velodyne_LiDAR.pdf (accessed on 8 February 2025).
- Scaramuzza, D.; Harati, A.; Siegwart, R. Extrinsic self calibration of a camera and a 3d laser range finder from natural scenes. In Proceedings of the 2007 IEEE/RSJ International Conference on Intelligent Robots and Systems, San Diego, CA, USA, 29 October–2 November 2007; IEEE: Piscataway, NJ, USA, 2007; pp. 4164–4169. [Google Scholar]
- Pandey, G.; McBride, J.; Savarese, S.; Eustice, R. Automatic targetless extrinsic calibration of a 3d LiDAR and camera by maximizing mutual information. In Proceedings of the AAAI Conference on Artificial Intelligence, Toronto, ON, Canada, 22–23 July 2012; Volume 26, pp. 2053–2059. [Google Scholar]
- Taylor, Z.; Nieto, J. Motion-based calibration of multimodal sensor extrinsics and timing offset estimation. IEEE Trans. Robot. 2016, 32, 1215–1229. [Google Scholar] [CrossRef]
- Jiang, J.; Xue, P.; Chen, S.; Liu, Z.; Zhang, X.; Zheng, N. Line feature based extrinsic calibration of LiDAR and camera. In Proceedings of the 2018 IEEE International Conference on Vehicular Electronics and Safety (ICVES), Madrid, Spain, 12–14 September 2018; IEEE: Piscataway, NJ, USA, 2018; pp. 1–6. [Google Scholar]
- Li, L.; Li, H.; Liu, X.; He, D.; Miao, Z.; Kong, F.; Li, R.; Liu, Z.; Zhang, F. Joint intrinsic and extrinsic LiDAR-camera calibration in targetless environments using plane-constrained bundle adjustment. arXiv 2023, arXiv:2308.12629. [Google Scholar]
- Schneider, N.; Piewak, F.; Stiller, C.; Franke, U. RegNet: Multimodal sensor registration using deep neural networks. In Proceedings of the 2017 IEEE Intelligent Vehicles Symposium (IV), Los Angeles, CA, USA, 11–14 June 2017; IEEE: Piscataway, NJ, USA, 2017; pp. 1803–1810. [Google Scholar]
- Iyer, G.; Ram, R.K.; Murthy, J.K.; Krishna, K.M. CalibNet: Geometrically supervised extrinsic calibration using 3D spatial transformer networks. In Proceedings of the 2018 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), Madrid, Spain, 1–5 October 2018; IEEE: Piscataway, NJ, USA, 2018; pp. 1110–1117. [Google Scholar]
- Cattaneo, D.; Vaghi, M.; Ballardini, A.L.; Fontana, S.; Sorrenti, D.G.; Burgard, W. CMRNet: Camera to LiDAR-map registration. In Proceedings of the 2019 IEEE Intelligent Transportation Systems Conference (ITSC), Auckland, New Zealand, 27–30 October 2019; IEEE: Piscataway, NJ, USA, 2019; pp. 1283–1289. [Google Scholar]
- Shen, Z.; Dai, Y.; Rao, Z. CFNet: Cascade and fused cost volume for robust stereo matching. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Nashville, TN, USA, 19–25 June 2021; pp. 13906–13915. [Google Scholar]
- Lv, X.; Wang, B.; Dou, Z.; Ye, D.; Wang, S. LCCNet: LiDAR and camera self-calibration using cost volume network. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Nashville, TN, USA, 19–25 June 2021; pp. 2894–2901. [Google Scholar]
- Luo, Z.; Yan, G.; Li, Y. Calib-Anything: Zero-training LiDAR-camera extrinsic calibration method using Segment Anything. arXiv 2023, arXiv:2306.02656. [Google Scholar]
- Xiao, Y.; Li, Y.; Meng, C.; Li, X.; Ji, J.; Zhang, Y. CalibFormer: A transformer-based automatic LiDAR-camera calibration network. In Proceedings of the 2024 IEEE International Conference on Robotics and Automation (ICRA), Yokohama, Japan, 13–17 May 2024; IEEE: Piscataway, NJ, USA, 2024; pp. 16714–16720. [Google Scholar]
| Results of Multi-Range | Metric | Et (cm) | X (cm) | Y (cm) | Z (cm) | ER (°) | Roll (°) | Pitch (°) | Yaw (°) |
|---|---|---|---|---|---|---|---|---|---|
| After 20°/1.5 m network | Mean | 18.617 | 3.327 | 6.423 | 4.916 | 1.455 | 0.175 | 0.215 | 0.349 |
| | Median | 15.081 | 2.517 | 5.326 | 4.150 | 0.963 | 0.138 | 0.288 | 0.239 |
| | Std | 15.079 | 2.671 | 2.888 | 3.847 | 1.847 | 0.157 | 0.173 | 0.226 |
| After 10°/1.0 m network | Mean | 4.092 | 1.110 | 1.186 | 1.089 | 0.611 | 0.165 | 0.140 | 0.069 |
| | Median | 3.518 | 0.453 | 0.539 | 0.526 | 0.441 | 0.111 | 0.119 | 0.031 |
| | Std | 2.743 | 1.161 | 1.298 | 1.339 | 1.311 | 0.098 | 0.041 | 0.066 |
| After 5°/0.5 m network | Mean | 2.145 | 0.503 | 0.554 | 1.350 | 0.264 | 0.045 | 0.067 | 0.079 |
| | Median | 1.674 | 0.506 | 0.770 | 1.819 | 0.137 | 0.030 | 0.025 | 0.088 |
| | Std | 1.798 | 0.177 | 0.414 | 0.975 | 1.330 | 0.053 | 0.080 | 0.030 |
| After 2°/0.2 m network | Mean | 1.234 | 0.361 | 0.496 | 0.334 | 0.191 | 0.064 | 0.040 | 0.045 |
| | Median | 0.886 | 0.394 | 0.486 | 0.331 | 0.075 | 0.082 | 0.021 | 0.044 |
| | Std | 1.285 | 0.205 | 0.092 | 0.075 | 1.340 | 0.034 | 0.045 | 0.014 |
| After 1°/0.1 m network | Mean | 0.841 | 0.235 | 0.326 | 0.274 | 0.178 | 0.030 | 0.008 | 0.033 |
| | Median | 0.542 | 0.188 | 0.294 | 0.244 | 0.063 | 0.026 | 0.007 | 0.024 |
| | Std | 1.077 | 0.130 | 0.058 | 0.057 | 1.343 | 0.019 | 0.004 | 0.033 |
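The multi-range results above reflect a coarse-to-fine cascade: each stage's network is trained for a smaller residual miscalibration (20°/1.5 m down to 1°/0.1 m), and its predicted correction is composed with the running extrinsic estimate before the next, finer stage runs. A minimal sketch of that loop; the `networks` objects and their `predict_delta` method are hypothetical, and the pinhole projection helper is likewise illustrative.

```python
# Illustrative coarse-to-fine refinement loop, not the paper's exact code.
import numpy as np

def project_depth(points, T, K, hw):
    """Project Nx3 LiDAR points into a sparse depth map (simple pinhole
    model; K = 3x3 intrinsics, hw = (height, width))."""
    pts_cam = T[:3, :3] @ points.T + T[:3, 3:4]      # 3xN in camera frame
    z = pts_cam[2]
    front = z > 0                                     # keep points ahead of camera
    uv = K @ pts_cam[:, front]
    uv = (uv[:2] / uv[2]).astype(int)                 # pixel coordinates
    depth = np.zeros(hw)
    ok = (uv[0] >= 0) & (uv[0] < hw[1]) & (uv[1] >= 0) & (uv[1] < hw[0])
    depth[uv[1, ok], uv[0, ok]] = z[front][ok]
    return depth

def refine_extrinsics(T_init, networks, rgb, lidar_points, K, hw):
    """T_init: 4x4 initial (miscalibrated) LiDAR-to-camera transform.
    networks: stages ordered coarse to fine (20°/1.5 m ... 1°/0.1 m)."""
    T = T_init.copy()
    for net in networks:
        depth = project_depth(lidar_points, T, K, hw)  # current misprojection
        delta = net.predict_delta(rgb, depth)          # hypothetical 4x4 correction
        T = delta @ T                                  # compose with running estimate
    return T
```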
| Method | Miscalibrated Range | Trans. Mean (cm) | X (cm) | Y (cm) | Z (cm) | Rot. Mean (°) | Roll (°) | Pitch (°) | Yaw (°) | Latency/ms |
|---|---|---|---|---|---|---|---|---|---|---|
| CalibNet | ±0.2 m/±10° | 4.340 | 4.200 | 1.600 | 7.220 | 0.410 | 0.180 | 0.900 | 0.150 | 29.66 |
| CalibFormer | ±0.25 m/±10° | 1.188 | 1.101 | 0.902 | 1.561 | 0.141 | 0.076 | 0.259 | 0.087 | 31.47 |
| Calib-Anything | ±0.2 m/±10° | 1.027 | 1.027 | 0.742 | 1.313 | 0.136 | 0.079 | 0.229 | 0.099 | 32.15 |
| Ours | ±0.25 m/±10° | 1.101 | 1.044 | 0.817 | 1.443 | 0.138 | 0.081 | 0.232 | 0.101 | 23.54 |
| RegNet | ±1.5 m/±20° | 6 | 7 | 7 | 4 | 0.280 | 0.240 | 0.250 | 0.360 | 30.65 |
| LCCNet | ±1.5 m/±20° | 0.297 | 0.262 | 0.271 | 0.357 | 0.017 | 0.020 | 0.012 | 0.019 | 24.53 |
| PCBA [33] | ±1.5 m/±20° | 0.305 | 0.249 | 0.323 | 0.343 | 0.019 | 0.022 | 0.012 | 0.023 | 29.79 |
| Ours | ±1.5 m/±20° | 0.278 | 0.235 | 0.326 | 0.274 | 0.020 | 0.030 | 0.008 | 0.022 | 23.91 |
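For reference, the absolute errors reported above can be computed from the residual transform between the predicted and ground-truth extrinsics. A small sketch using SciPy's rotation utilities; the Euler convention ("xyz" for roll/pitch/yaw) is our assumption, as the decomposition is not specified here.

```python
# Illustrative metric computation: per-axis translation error (cm) and
# per-angle rotation error (degrees) from predicted vs. ground truth.
import numpy as np
from scipy.spatial.transform import Rotation

def calibration_errors(T_pred, T_gt):
    # Residual transform: identity means perfect calibration.
    T_err = np.linalg.inv(T_gt) @ T_pred
    t_err_cm = np.abs(T_err[:3, 3]) * 100.0           # X, Y, Z in cm
    euler = Rotation.from_matrix(T_err[:3, :3]).as_euler("xyz", degrees=True)
    r_err_deg = np.abs(euler)                         # roll, pitch, yaw in degrees
    return t_err_cm, r_err_deg
```

Averaging these per-frame errors over the test split yields the mean, median, and standard deviation statistics reported in the tables.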
| Network Architecture | Params/M | FLOPs/G | Inference Time/ms | Translation Error/cm | Rotation Error/° |
|---|---|---|---|---|---|
| w/o multi-scale feature | 18.7 | 62.5 | 17.5 | 1.058 | 0.084 |
| w/o multi-scale cost volume | 19.1 | 63.8 | 18.8 | 1.213 | 0.099 |
| w/o cost volume aggregation | 19.4 | 70.5 | 19.5 | 1.151 | 0.093 |
| w/o hierarchical feature fusion | 18.8 | 61.1 | 18.4 | 0.930 | 0.078 |
| w/o hybrid loss function | 20.2 | 77.4 | 22.6 | 0.997 | 0.074 |
| w/ all modules | 20.5 | 78.3 | 23.3 | 0.864 | 0.068 |
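Numbers like the Params/FLOPs/latency columns above can be measured for a PyTorch model roughly as follows. The `thop` package is a common profiling tool assumed here, not something the paper names; note that it reports multiply-accumulates, which are often quoted as FLOPs.

```python
# Illustrative profiling sketch under the assumptions stated above.
import time
import torch
from thop import profile  # assumed third-party profiling tool

def profile_model(model, example_inputs, device="cuda"):
    """Return parameter count (M), MACs (G), and forward latency (ms)."""
    model = model.to(device).eval()
    inputs = tuple(t.to(device) for t in example_inputs)
    macs, params = profile(model, inputs=inputs)
    with torch.no_grad():
        for _ in range(10):                 # warm-up passes
            model(*inputs)
        if device == "cuda":
            torch.cuda.synchronize()        # flush queued GPU work
        start = time.perf_counter()
        model(*inputs)
        if device == "cuda":
            torch.cuda.synchronize()
        latency_ms = (time.perf_counter() - start) * 1000.0
    return params / 1e6, macs / 1e9, latency_ms
```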