Deep Voxelized Feature Maps for Self-Localization in Autonomous Driving
Abstract
1. Introduction
- We show experimentally that feature-based methods outperform existing deep point cloud registration (PCR) methods that depend on point clouds, in terms of both capacity and self-localization performance.
- We discuss the necessity of voxelization when constructing deep-features-based maps for self-localization and propose a deep voxelized map.
- We propose a self-localization algorithm that utilizes the proposed map with reassignment and attention mechanisms. We demonstrate that our method outperforms previous methods in real urban environments.
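The voxelization that the contributions above rely on can be illustrated with a minimal sketch. It assumes a map point cloud stored as an (N, 3) NumPy array and a cubic voxel grid aligned with the coordinate axes. The function names `voxelize` and `toy_voxel_feature` are hypothetical, and the per-voxel "feature" here is a simple hand-written statistic standing in for a learned encoder; it is not the network used in the paper.

```python
import numpy as np

def voxelize(points, voxel_size):
    """Group an (N, 3) point array by the axis-aligned voxel each point falls in."""
    indices = np.floor(points / voxel_size).astype(np.int64)
    voxels = {}
    for row, point in zip(indices, points):
        key = tuple(int(i) for i in row)  # integer voxel index (ix, iy, iz)
        voxels.setdefault(key, []).append(point)
    return {key: np.stack(pts) for key, pts in voxels.items()}

def toy_voxel_feature(voxel_points, voxel_index, voxel_size):
    """Hypothetical stand-in for a learned encoder: mean offset from the voxel centre."""
    centre = (np.asarray(voxel_index) + 0.5) * voxel_size
    return (voxel_points - centre).mean(axis=0)

# Four example points; the voxel size of 20.0 is an illustrative choice.
points = np.array([
    [1.0, 2.0, 0.5],
    [3.0, 4.0, 1.0],
    [25.0, 2.0, 0.5],
    [-1.0, 2.0, 0.5],
])
voxel_size = 20.0
voxels = voxelize(points, voxel_size)

# The "map" keeps one feature vector per occupied voxel instead of raw points.
feature_map = {
    key: toy_voxel_feature(pts, key, voxel_size) for key, pts in voxels.items()
}
```

Storing one fixed-length vector per occupied voxel, rather than the raw points, is what lets the map size depend on the voxel size and feature dimension instead of the point count.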
2. Related Work
2.1. High-Definition Maps for Self-Localization
2.2. Self-Localization with Point Cloud Maps
2.3. Self-Localization with Feature Maps
2.4. Self-Localization with Deep Feature Maps
3. Deep Voxelized Feature Maps and Self-Localization
3.1. Discussion of Deep Feature Maps
3.2. Deep Voxelized Feature Maps
3.2.1. Difference from NDT
3.2.2. Difference from Voxelization in PointNetLK-Revisited
4. PCR via Deep-Feature-Metric Optimization
4.1. Preliminary
4.1.1. Rigid-Body Transformation
4.1.2. Point Cloud
4.2. Problem Definition
4.3. Feature Extraction
4.4. Optimization
5. Self-Localization
5.1. Per-Voxel Residual Computation
5.2. Attention-Based Feature Aggregation
5.3. Implementation
5.3.1. Loss Function
5.3.2. Networks
6. Experiments
6.1. Targets for Comparison
6.2. Datasets
6.2.1. KITTI Visual Odometry Dataset
6.2.2. Shinjuku Urban Dataset
6.3. Implementations
6.4. Error Comparison with Full Data
6.4.1. Results
6.4.2. Discussion of Point-Cloud-Based Methods
6.4.3. Discussion of Feature-Based Methods
6.5. Qualitative Evaluation
7. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Acknowledgments
Conflicts of Interest
Abbreviations
Abbreviation | Meaning |
---|---|
PCR | Point cloud registration |
ND | Normal distribution |
IC | Inverse composition |
FC | Forward composition |
References
- Chalvatzaras, A.; Pratikakis, I.; Amanatiadis, A.A. A Survey on Map-Based Localization Techniques for Autonomous Vehicles. IEEE Trans. Intell. Veh. 2023, 8, 1574–1596.
- Arun, K.S.; Huang, T.S.; Blostein, S.D. Least-Squares Fitting of Two 3-D Point Sets. IEEE Trans. Pattern Anal. Mach. Intell. 1987, PAMI-9, 698–700.
- Yew, Z.J.; Lee, G.H. 3dfeat-net: Weakly supervised local 3d features for point cloud registration. In Proceedings of the European Conference on Computer Vision (ECCV), Munich, Germany, 8–14 September 2018; pp. 607–623.
- Bai, X.; Luo, Z.; Zhou, L.; Fu, H.; Quan, L.; Tai, C.L. D3feat: Joint learning of dense detection and description of 3d local features. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Virtual, 13–19 June 2020; pp. 6359–6367.
- Lu, F.; Chen, G.; Liu, Y.; Zhang, L.; Qu, S.; Liu, S.; Gu, R. Hregnet: A hierarchical network for large-scale outdoor lidar point cloud registration. In Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV), Virtual, 11–17 October 2021; pp. 16014–16023.
- Wang, Y.; Solomon, J.M. Deep closest point: Learning representations for point cloud registration. In Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV), Seoul, Republic of Korea, 27 October–2 November 2019; pp. 3523–3532.
- Fu, K.; Liu, S.; Luo, X.; Wang, M. Robust point cloud registration framework based on deep graph matching. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Virtual, 19–25 June 2021; pp. 8893–8902.
- Schwarz, S.; Preda, M.; Baroncini, V.; Budagavi, M.; Cesar, P.; Chou, P.A.; Cohen, R.A.; Krivokuća, M.; Lasserre, S.; Li, Z.; et al. Emerging MPEG standards for point cloud compression. IEEE J. Emerg. Sel. Top. Circuits Syst. 2018, 9, 133–148.
- Magnusson, M. The Three-Dimensional Normal-Distributions Transform: An Efficient Representation for Registration, Surface Analysis, and Loop Detection. Ph.D. Thesis, Örebro Universitet, Örebro, Sweden, 2009.
- Javanmardi, E.; Javanmardi, M.; Gu, Y.; Kamijo, S. Factors to Evaluate Capability of Map for Vehicle Localization. IEEE Access 2018, 6, 49850–49867.
- Qi, C.R.; Su, H.; Mo, K.; Guibas, L.J. Pointnet: Deep learning on point sets for 3d classification and segmentation. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Honolulu, HI, USA, 21–26 July 2017; pp. 652–660.
- Aoki, Y.; Goforth, H.; Srivatsan, R.A.; Lucey, S. PointNetLK: Robust & Efficient Point Cloud Registration Using PointNet. In Proceedings of the 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Long Beach, CA, USA, 16–20 June 2019; pp. 7156–7165.
- Li, X.; Pontes, J.K.; Lucey, S. Pointnetlk revisited. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Virtual, 19–25 June 2021; pp. 12763–12772.
- Huang, X.; Mei, G.; Zhang, J. Feature-metric registration: A fast semi-supervised approach for robust point cloud registration without correspondences. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Virtual, 13–19 June 2020; pp. 11366–11374.
- Huang, X.; Mei, G.; Zhang, J.; Abbas, R. A comprehensive survey on point cloud registration. arXiv 2021, arXiv:2103.02690.
- Endo, Y.; Javanmardi, E.; Kamijo, S. Analysis of Occlusion Effects for Map-Based Self-Localization in Urban Areas. Sensors 2021, 21, 5196.
- Geiger, A.; Lenz, P.; Urtasun, R. Are we ready for Autonomous Driving? The KITTI Vision Benchmark Suite. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Providence, RI, USA, 16–21 June 2012.
- Chen, S.; Liu, B.; Feng, C.; Vallespi-Gonzalez, C.; Wellington, C. 3D Point Cloud Processing and Learning for Autonomous Driving: Impacting Map Creation, Localization, and Perception. IEEE Signal Process. Mag. 2021, 38, 68–86.
- Rozenberszki, D.; Majdik, A.L. LOL: Lidar-only Odometry and Localization in 3D point cloud maps. In Proceedings of the 2020 IEEE International Conference on Robotics and Automation (ICRA), Virtual, 31 May–31 August 2020; pp. 4379–4385.
- Yang, J.; Li, H.; Campbell, D.; Jia, Y. Go-ICP: A globally optimal solution to 3D ICP point-set registration. IEEE Trans. Pattern Anal. Mach. Intell. 2015, 38, 2241–2254.
- Segal, A.; Haehnel, D.; Thrun, S. Generalized-icp. In Proceedings of the Robotics: Science and Systems, Seattle, WA, USA, 28 June–1 July 2009; Volume 2, p. 435.
- Yuan, W.; Eckart, B.; Kim, K.; Jampani, V.; Fox, D.; Kautz, J. Deepgmr: Learning latent gaussian mixture models for registration. In Proceedings of the European Conference on Computer Vision (ECCV), Virtual, 23–28 August 2020; pp. 733–750.
- Biber, P.; Straßer, W. The normal distributions transform: A new approach to laser scan matching. In Proceedings of the 2003 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS 2003) (Cat. No. 03CH37453), Las Vegas, NV, USA, 27–31 October 2003; Volume 3, pp. 2743–2748.
- Yokozuka, M.; Koide, K.; Oishi, S.; Banno, A. LiTAMIN: LiDAR-based tracking and mapping by stabilized ICP for geometry approximation with normal distributions. In Proceedings of the 2020 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), Las Vegas, NV, USA, 25–29 October 2020; pp. 5143–5150.
- Endo, Y.; Izawa, T.; Kamijo, S. Hierarchical map representation using vector maps and geometrical maps for self-localization. IATSS Res. 2022, 46, 450–456.
- Engel, N.; Hoermann, S.; Horn, M.; Belagiannis, V.; Dietmayer, K. Deeplocalization: Landmark-based self-localization with deep neural networks. In Proceedings of the 2019 IEEE Intelligent Transportation Systems Conference (ITSC), Auckland, New Zealand, 27–30 October 2019; pp. 926–933.
- Wiesmann, L.; Marcuzzi, R.; Stachniss, C.; Behley, J. Retriever: Point Cloud Retrieval in Compressed 3D Maps. In Proceedings of the 2022 International Conference on Robotics and Automation (ICRA), Philadelphia, PA, USA, 23–27 May 2022; pp. 10925–10932.
- Baker, S.; Matthews, I. Lucas-kanade 20 years on: A unifying framework. Int. J. Comput. Vis. 2004, 56, 221–255.
- Sola, J.; Deray, J.; Atchuthan, D. A micro Lie theory for state estimation in robotics. arXiv 2018, arXiv:1812.01537.
- Tang, C.; Tan, P. Ba-net: Dense bundle adjustment network. arXiv 2018, arXiv:1806.04807.
- Vaswani, A.; Shazeer, N.; Parmar, N.; Uszkoreit, J.; Jones, L.; Gomez, A.N.; Kaiser, Ł.; Polosukhin, I. Attention is all you need. In Proceedings of the Advances in Neural Information Processing Systems (NeurIPS), Long Beach, CA, USA, 4–9 December 2017; Volume 30.
- Engel, J.; Schöps, T.; Cremers, D. LSD-SLAM: Large-scale direct monocular SLAM. In Proceedings of the European Conference on Computer Vision (ECCV), Zurich, Switzerland, 6–12 September 2014; pp. 834–849.
- Ba, J.L.; Kiros, J.R.; Hinton, G.E. Layer normalization. arXiv 2016, arXiv:1607.06450.
- Rusu, R.B.; Cousins, S. 3D is here: Point Cloud Library (PCL). In Proceedings of the IEEE International Conference on Robotics and Automation (ICRA), Shanghai, China, 9–13 May 2011.
Dataset | For Training | For Testing |
---|---|---|
KITTI VO Dataset | 2535 | 5160 |
Shinjuku Urban Dataset | 2320 | 1490 |
Method | Format | Map Size (kB)/100 m | Mean Rot | Mean Trans | Median Rot | Median Trans |
---|---|---|---|---|---|---|
ICP | pc | 120.0 kB (10,000 pts.) | 13.01 | 0.41 | 0.45 | 0.14 |
DeepGMR [22] | pc | 120.0 kB (10,000 pts.) | 6.23 | 1.26 | 0.98 | 0.43 |
DCP [6] | pc | 60.0 kB (5000 pts.) | 14.79 | 0.98 | 3.55 | 0.41 |
FMR [14] | feature | 20.0 kB (D = 256/5.0 m) | 11.77 | 1.25 | 2.59 | 0.41 |
FMR [14] | feature | 20.0 kB (D = 512/10.0 m) | 12.86 | 1.69 | 1.66 | 0.47 |
FMR [14] | feature | 20.0 kB (D = 2560/50.0 m) | 20.87 | 4.12 | 1.38 | 0.70 |
NDT(P2D) [9] | feature | 37.0 kB (VS * = 2.0 m) | 12.96 | 0.44 | 0.68 | 0.23 |
NDT(P2D) [9] | feature | 17.4 kB (VS = 3.0 m) | 12.17 | 0.50 | 0.45 | 0.15 |
NDT(P2D) [9] | feature | 9.5 kB (VS = 4.0 m) | 11.29 | 0.65 | 0.33 | 0.13 |
NDT(P2D) [9] | feature | 5.5 kB (VS = 5.0 m) | 10.25 | 0.82 | 0.31 | 0.13 |
PointNetLK-Rev-Vox (FC) [13] | feature | 8.9 kB (VS = 20.0 m, D = 128) | 14.29 | 0.77 | 2.20 | 0.45 |
ours (w/o attention) | feature | 8.9 kB (VS = 20.0 m, D = 128) | 8.32 | 0.27 | 1.17 | 0.14 |
ours (w attention) | feature | 8.9 kB (VS = 20.0 m, D = 128) | 7.98 | 0.27 | 0.99 | 0.14 |
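The results tables report mean and median rotation and translation errors. The exact error definitions and units are not stated in the tables, so the following is only a sketch of one common convention: the rotation error is the geodesic angle between estimated and ground-truth rotation matrices, and the translation error is the Euclidean distance between translation vectors. The helper names, including `rot_z`, are hypothetical.

```python
import numpy as np

def rotation_error_deg(R_est, R_gt):
    """Angle (degrees) of the relative rotation between two 3x3 rotation matrices."""
    R_delta = R_est.T @ R_gt
    cos_angle = np.clip((np.trace(R_delta) - 1.0) / 2.0, -1.0, 1.0)
    return float(np.degrees(np.arccos(cos_angle)))

def translation_error(t_est, t_gt):
    """Euclidean distance between estimated and ground-truth translations."""
    return float(np.linalg.norm(np.asarray(t_est) - np.asarray(t_gt)))

def rot_z(deg):
    """Rotation about the z-axis by `deg` degrees (helper for the toy example)."""
    a = np.radians(deg)
    c, s = np.cos(a), np.sin(a)
    return np.array([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])

# Toy estimates: two small yaw errors and one large failure, against identity.
rot_errors = [rotation_error_deg(rot_z(d), np.eye(3)) for d in (1.0, 2.0, 30.0)]
mean_rot = float(np.mean(rot_errors))
median_rot = float(np.median(rot_errors))
```

In this toy case a single large failure pulls the mean well above the median, which is one reason to report both statistics side by side.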
Method | Format | Map Size (kB)/100 m | Mean Rot | Mean Trans | Median Rot | Median Trans |
---|---|---|---|---|---|---|
ICP | pc | 120.0 kB (10,000 pts.) | 13.11 | 0.45 | 0.61 | 0.17 |
DeepGMR [22] | pc | 120.0 kB (10,000 pts.) | 6.90 | 1.88 | 3.64 | 0.98 |
DCP [6] | pc | 60.0 kB (5000 pts.) | 13.51 | 1.74 | 2.63 | 1.33 |
FMR [14] | feature | 20.0 kB (D = 256/5.0 m) | 13.95 | 1.49 | 3.33 | 0.59 |
FMR [14] | feature | 20.0 kB (D = 512/10.0 m) | 8.92 | 1.71 | 1.41 | 0.43 |
FMR [14] | feature | 20.0 kB (D = 2560/50.0 m) | 13.02 | 5.29 | 1.47 | 0.73 |
NDT(P2D) [9] | feature | 100.6 kB (VS * = 2.0 m) | 13.03 | 0.48 | 0.76 | 0.24 |
NDT(P2D) [9] | feature | 48.0 kB (VS = 3.0 m) | 12.39 | 0.53 | 0.54 | 0.17 |
NDT(P2D) [9] | feature | 26.7 kB (VS = 4.0 m) | 11.56 | 0.62 | 0.36 | 0.13 |
NDT(P2D) [9] | feature | 16.9 kB (VS = 5.0 m) | 10.32 | 0.78 | 0.31 | 0.12 |
PointNetLK-Rev-Vox (FC) [13] | feature | 17.6 kB (VS = 20.0 m, D = 128) | 13.69 | 0.74 | 2.68 | 0.48 |
ours (w/o attention) | feature | 17.6 kB (VS = 20.0 m, D = 128) | 8.97 | 0.15 | 0.22 | 0.05 |
ours (w attention) | feature | 17.6 kB (VS = 20.0 m, D = 128) | 7.33 | 0.12 | 0.21 | 0.04 |
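The map sizes listed for the point-cloud ("pc") rows can be reproduced with simple arithmetic under one assumption that the tables do not state explicitly: each point stores three 32-bit floats (x, y, z), and 1 kB is taken as 1000 bytes. The function name `point_cloud_map_kb` is hypothetical, and the sketch covers only the point-cloud rows, not the feature-based ones.

```python
BYTES_PER_FLOAT32 = 4  # assumed storage type for each coordinate

def point_cloud_map_kb(num_points, floats_per_point=3):
    """Size in kB (1 kB = 1000 bytes) of a raw point cloud with float32 coordinates."""
    return num_points * floats_per_point * BYTES_PER_FLOAT32 / 1000.0

# Point counts taken from the "pc" rows of the results tables.
sizes = {n: point_cloud_map_kb(n) for n in (10_000, 5_000)}
```

Under these assumptions, 10,000 points give 120.0 kB and 5000 points give 60.0 kB, matching the values in the Map Size column for the ICP, DeepGMR, and DCP rows.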
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
© 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Endo, Y.; Kamijo, S. Deep Voxelized Feature Maps for Self-Localization in Autonomous Driving. Sensors 2023, 23, 5373. https://doi.org/10.3390/s23125373