Enhancing Inter-AUV Perception: Adaptive 6-DOF Pose Estimation with Synthetic Images for AUV Swarm Sensing
Abstract
1. Introduction
2. Related Work
2.1. Visual Marker-Based Underwater Pose Estimation
2.2. Deep Learning-Based Pose Estimation
2.3. Data Generation and Style Transfer
3. Methodology
3.1. Generation of Synthetic Underwater Images
3.2. Image Style Alignment Based on Color Intermediate Domain
3.2.1. Definition of the Color Intermediate Domain
3.2.2. Generating the Synthetic Image Training Set in the Color Intermediate Domain
3.3. Pose Estimation Network Based on Salient Keypoint Vector Voting
3.3.1. Definition of Keypoints
3.3.2. Vector Voting Model
4. Evaluation of Color Intermediate Domain Mapping Strategy
5. Pose Estimation Experiment
5.1. Validation of AUV6D in Dynamic Environments
5.2. Evaluation of Environmental Adaptability
5.3. Evaluation of 6D Pose Estimation
5.3.1. Pose Accuracy Analysis
5.3.2. Comparative Analysis of Methods
6. Navigation Experiments
6.1. Tow Navigation Experiment
6.2. Autonomous Navigation Experiment Using Two AUVs
7. Conclusions
Author Contributions
Funding
Data Availability Statement
Conflicts of Interest
Synthetic | SSIM Mean ↑ | SSIM Std ↓ | PSNR Mean (dB) ↑ | PSNR Std (dB) ↓ | Gram Mean ↓ | Gram Std ↓ | FID Mean ↓ | FID Std ↓ |
---|---|---|---|---|---|---|---|---|
Simulated | 0.3408 | 0.0736 | 14.9744 | 2.7336 | 0.0018 | 0.0019 | 3.4018 | 0.7189 |
CycleGAN IR | 0.5293 | 0.1095 | 16.4843 | 3.4098 | 0.0014 | 0.0021 | 3.4789 | 1.0076 |
CycleGAN IC | 0.5694 | 0.1198 | 20.0729 | 5.2012 | 0.0012 | 0.0026 | 2.5584 | 0.7462 |
Mask-Cycle | 0.5053 | 0.1090 | 18.3925 | 4.3192 | 0.0012 | 0.0024 | 3.2607 | 0.8832 |
Intermediate | 0.6433 | 0.0983 | 22.2661 | 2.5756 | 0.0002 | 0.0005 | 2.3593 | 0.6218 |
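As a rough illustration of how the PSNR columns above could be produced, the sketch below uses the standard PSNR definition and a per-image mean and standard deviation. It is not the authors' evaluation code, which this excerpt does not show. The 8-bit peak value of 255, the flat pixel sequences, the population standard deviation, and the helper names `psnr` and `mean_and_std` are all assumptions made for the example.

```python
import math

def psnr(reference, test, max_value=255.0):
    """Peak signal-to-noise ratio in dB between two equal-length pixel sequences."""
    if len(reference) != len(test) or not reference:
        raise ValueError("inputs must be non-empty and the same length")
    # Mean squared error over all pixels.
    mse = sum((r - t) ** 2 for r, t in zip(reference, test)) / len(reference)
    if mse == 0:
        return math.inf  # identical images have unbounded PSNR
    return 10.0 * math.log10(max_value ** 2 / mse)

def mean_and_std(values):
    """Mean and population standard deviation of per-image scores (assumed convention)."""
    mean = sum(values) / len(values)
    variance = sum((v - mean) ** 2 for v in values) / len(values)
    return mean, math.sqrt(variance)

# Hypothetical usage: score each synthetic/real image pair, then summarise.
pairs = [([0, 0, 0, 0], [25.5, 25.5, 25.5, 25.5]),
         ([10, 20, 30, 40], [12, 18, 33, 37])]
scores = [psnr(ref, syn) for ref, syn in pairs]
summary = mean_and_std(scores)
```

Higher PSNR means the synthetic image is closer to the real reference, which matches the upward arrow in the table header.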
Pose Estimation Success Rate

Synthetic | Underwater 1 | Pool | Underwater 2 | Underwater 3 | Underwater 4 | Low Light | Low Light (Lit) | Underwater 1 (Lit) |
---|---|---|---|---|---|---|---|---|
Simulated | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
CycleGAN | 0.23 | 0 | 0 | 0.11 | 0.04 | 0 | 0.01 | 0.11 |
Mask-Cycle | 0.94 | 0.02 | 0.31 | 0.81 | 0.19 | 0.07 | 0.10 | 0.76 |
Intermediate | 0.98 | 0.62 | 0.94 | 0.92 | 0.82 | 0.75 | 0.68 | 0.83 |
Translation X Error (m) | Translation Y Error (m) | Translation Z Error (m) | Roll Error (°) | Pitch Error (°) | Yaw Error (°) |
---|---|---|---|---|---|
0.0248 | 0.0168 | 0.1099 | 8.0653 | 4.5888 | 1.8232 |
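The per-axis errors above appear to be consistent with the aggregate AUV6D figures in the method comparison table (0.051 m and 4.83), if those aggregates are simple means over the three axes. That averaging rule is an assumption; this excerpt does not state how the aggregates were computed. The rotation errors are taken here to be in degrees. A quick arithmetic check:

```python
# Per-axis AUV6D errors copied from the table above.
translation_errors_m = [0.0248, 0.0168, 0.1099]   # X, Y, Z
rotation_errors_deg = [8.0653, 4.5888, 1.8232]    # roll, pitch, yaw (assumed degrees)

# Assumed aggregation: plain arithmetic mean over the three axes.
mean_translation_m = sum(translation_errors_m) / len(translation_errors_m)
mean_rotation_deg = sum(rotation_errors_deg) / len(rotation_errors_deg)

print(f"mean translation error: {mean_translation_m:.4f} m")
print(f"mean rotation error:    {mean_rotation_deg:.4f} deg")
```

The means come out at roughly 0.0505 m and 4.826 degrees, close to the reported aggregate values. The Z axis dominates the translation error, as is common for monocular depth, and roll dominates the rotation error.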
Method | Translation Error (m) | Orientation Error (°) | ADD | FPS |
---|---|---|---|---|
DeepURL | 0.068 | 6.77 | 57.16% | 40 |
Intermediate + PVNet | 0.186 | 14.55 | 53.09% | 37 |
Intermediate + YOLO6D | 0.472 | 20.56 | 33.49% | 54 |
AUV6D | 0.051 | 4.83 | 62.63% | 38 |
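For readers unfamiliar with the ADD column, the sketch below shows the commonly used definition of the metric: the average distance between model points transformed by the ground-truth pose and by the estimated pose, with a pose counted as correct when that distance falls below a fraction of the model diameter. The 10% threshold, the function names, and the toy model points are assumptions for illustration; the excerpt does not state the threshold or implementation the authors used.

```python
import math

def transform(points, rotation, translation):
    """Apply x' = R x + t to each 3D point (rotation is a 3x3 nested list)."""
    return [
        tuple(sum(rotation[i][j] * p[j] for j in range(3)) + translation[i]
              for i in range(3))
        for p in points
    ]

def add_distance(points, r_gt, t_gt, r_est, t_est):
    """Mean distance between model points under the true and estimated poses."""
    gt = transform(points, r_gt, t_gt)
    est = transform(points, r_est, t_est)
    return sum(math.dist(a, b) for a, b in zip(gt, est)) / len(points)

def add_accuracy(distances, model_diameter, ratio=0.1):
    """Fraction of frames whose ADD is below ratio * diameter (ratio is assumed)."""
    threshold = ratio * model_diameter
    return sum(d < threshold for d in distances) / len(distances)

# Hypothetical example: identical rotation, estimate offset by 3 cm and 4 cm.
identity = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
model_points = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 0.5, 0.2)]
d = add_distance(model_points, identity, (0.0, 0.0, 2.0),
                 identity, (0.03, 0.04, 2.0))
accuracy = add_accuracy([d, 0.2], model_diameter=1.0)
```

With a pure translation offset, every model point moves by the same 5 cm, so the ADD equals the translation error; rotation errors add a contribution that grows with distance from the model origin.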
© 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Wei, Q.; Yang, Y.; Zhou, X.; Hu, Z.; Li, Y.; Fan, C.; Zheng, Q.; Wang, Z. Enhancing Inter-AUV Perception: Adaptive 6-DOF Pose Estimation with Synthetic Images for AUV Swarm Sensing. Drones 2024, 8, 486. https://doi.org/10.3390/drones8090486