Article

6DoF Object Pose and Focal Length Estimation from Single RGB Images in Uncontrolled Environments

Graduate School of Electronic and Electrical Engineering, Kyungpook National University, Daegu 41566, Republic of Korea
*
Author to whom correspondence should be addressed.
Sensors 2024, 24(17), 5474; https://doi.org/10.3390/s24175474 (registering DOI)
Submission received: 9 July 2024 / Revised: 13 August 2024 / Accepted: 16 August 2024 / Published: 23 August 2024
(This article belongs to the Special Issue Computer Vision and Virtual Reality: Technologies and Applications)

Abstract

Accurate estimation of the 6DoF (degrees of freedom) pose and the focal length is important in extended reality (XR) applications because it enables precise object alignment and projection scaling, which in turn improves the user experience. This study focuses on improving 6DoF pose estimation from single RGB images with unknown camera metadata. Estimating the 6DoF pose and focal length from an uncontrolled RGB image, such as one obtained from the internet, is challenging because such images often lack crucial metadata. Existing methods such as FocalPose and FocalPose++ have made progress in this domain but still face challenges due to the projection scale ambiguity between the translation of an object along the z-axis (tz) and the camera’s focal length. To overcome this, we propose a two-stage strategy that resolves the projection scaling ambiguity by decoupling the estimation of z-axis translation from that of focal length. In the first stage, tz is fixed to an arbitrary value, and we predict all the other pose parameters and the focal length relative to the fixed tz. In the second stage, we predict the true value of tz while scaling the focal length according to the tz update. The proposed two-stage method reduces projection scale ambiguity in RGB images and improves pose estimation accuracy. Iterative update rules constrained to the first stage, together with tailored loss functions, including a Huber loss in the second stage, improve the accuracy of both 6DoF pose and focal length estimation. Experimental results on benchmark datasets show significant improvements in median rotation and translation errors, as well as better projection accuracy, compared with existing state-of-the-art methods. In an evaluation across the Pix3D datasets (chair, sofa, table, and bed), the proposed two-stage method improves projection accuracy by approximately 7.19%. Additionally, incorporating the Huber loss reduces translation and focal length errors by 20.27% and 6.65%, respectively, compared with the FocalPose++ method.
Keywords: 6DoF; pose estimation; focal length; uncontrolled RGB images; XR
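
The projection scale ambiguity described in the abstract can be illustrated with a plain pinhole camera model. The sketch below is not the authors' implementation. The function names, the cube-shaped object, the numeric values, and the update rule `f_new = f * tz_new / tz_old` are all assumptions chosen for illustration, based only on the abstract's statement that the focal length is scaled according to the tz update.

```python
import math

def project(points, f, t, principal=(0.0, 0.0)):
    """Pinhole projection of object points translated by t = (tx, ty, tz)."""
    tx, ty, tz = t
    cx, cy = principal
    pixels = []
    for x, y, z in points:
        depth = z + tz  # distance of the point from the camera along the z-axis
        pixels.append((f * (x + tx) / depth + cx, f * (y + ty) / depth + cy))
    return pixels

def rescale_focal(f, tz_old, tz_new):
    """Assumed update rule: keep the ratio f / tz constant when tz changes."""
    return f * tz_new / tz_old

def max_pixel_gap(a, b):
    """Largest Euclidean distance between corresponding 2D points."""
    return max(math.hypot(u1 - u2, v1 - v2) for (u1, v1), (u2, v2) in zip(a, b))

# Hypothetical object: corners of a cube with 0.5 m sides, centred on the origin.
cube = [(x, y, z) for x in (-0.25, 0.25) for y in (-0.25, 0.25) for z in (-0.25, 0.25)]

# "Stage 1" style guess: tz fixed arbitrarily, focal length estimated relative to it.
f_stage1, tz_stage1 = 600.0, 2.0
tx, ty = 0.1, -0.05

# "Stage 2" style update: tz moves to a new value and the focal length follows it.
tz_stage2 = 4.0
f_stage2 = rescale_focal(f_stage1, tz_stage1, tz_stage2)

reference = project(cube, f_stage1, (tx, ty, tz_stage1))
rescaled = project(cube, f_stage2, (tx, ty, tz_stage2))
fixed_f = project(cube, f_stage1, (tx, ty, tz_stage2))

gap_rescaled = max_pixel_gap(reference, rescaled)  # f and tz scaled together
gap_fixed_f = max_pixel_gap(reference, fixed_f)    # only tz changed

print(f"rescaled focal length: {f_stage2}")
print(f"max pixel gap, f and tz scaled together: {gap_rescaled:.2f}")
print(f"max pixel gap, tz changed alone: {gap_fixed_f:.2f}")
```

Scaling the focal length and tz by the same factor leaves the projection of the object origin unchanged and moves the cube corners far less than changing tz alone, so a single image constrains the ratio of the two quantities much more strongly than either value individually. The remaining gap comes from perspective effects across the object's depth extent.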

Share and Cite

MDPI and ACS Style

Manawadu, M.; Park, S.-Y. 6DoF Object Pose and Focal Length Estimation from Single RGB Images in Uncontrolled Environments. Sensors 2024, 24, 5474. https://doi.org/10.3390/s24175474

