Article

A Transformer-Based Image-Guided Depth-Completion Model with Dual-Attention Fusion Module

Shuling Wang, Fengze Jiang and Xiaojin Gong
The College of Information Science and Electronic Engineering, Zhejiang University, Hangzhou 310027, China
* Author to whom correspondence should be addressed.
Sensors 2024, 24(19), 6270; https://doi.org/10.3390/s24196270
Submission received: 30 July 2024 / Revised: 18 September 2024 / Accepted: 26 September 2024 / Published: 27 September 2024

Abstract

Depth information is crucial for perceiving three-dimensional scenes. However, depth maps captured directly by depth sensors are often incomplete and noisy. The goal of the depth-completion task is therefore to generate dense and accurate depth maps from sparse depth inputs by fusing guidance information from corresponding color images obtained from camera sensors. To address these challenges, we introduce transformer models, which have shown great promise in the field of vision, into the task of image-guided depth completion. Leveraging the self-attention mechanism, we propose a novel network architecture that effectively meets the requirements of high accuracy and high resolution in depth data. Specifically, we design a dual-branch model with a transformer-based encoder that serializes image features into tokens step by step and extracts multi-scale pyramid features suitable for pixel-wise dense prediction tasks. Additionally, we incorporate a dual-attention fusion module to enhance the fusion between the two branches. This module combines convolution-based spatial- and channel-attention mechanisms, which are adept at capturing local information, with cross-attention mechanisms that excel at capturing long-distance relationships. Our model achieves state-of-the-art performance on both the NYUv2 and SUN RGB-D depth datasets, and our ablation studies confirm the effectiveness of the designed modules.
Keywords: depth completion; dual-attention fusion module; multi-scale dual branch
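
To make the fusion design described in the abstract concrete, the following PyTorch sketch shows one way such a dual-attention fusion block could be assembled: a squeeze-and-excitation-style channel gate and a CBAM-style spatial gate supply the convolution-based local pathway, while multi-head cross-attention lets depth tokens query image tokens for long-range guidance. All module names, hyperparameters, and wiring below are illustrative assumptions, not the paper's exact implementation.

# Hypothetical sketch of a dual-attention fusion block: channel + spatial
# gating for local cues, cross-attention for long-range image guidance.
# Names and hyperparameters are assumptions; the paper's design may differ.
import torch
import torch.nn as nn

class DualAttentionFusion(nn.Module):
    def __init__(self, dim: int, num_heads: int = 4, reduction: int = 8):
        super().__init__()
        # Channel attention (squeeze-and-excitation style): re-weights
        # channels using globally pooled statistics.
        self.channel_gate = nn.Sequential(
            nn.AdaptiveAvgPool2d(1),
            nn.Conv2d(dim, dim // reduction, 1),
            nn.ReLU(inplace=True),
            nn.Conv2d(dim // reduction, dim, 1),
            nn.Sigmoid(),
        )
        # Spatial attention: a convolution over pooled channel maps yields a
        # per-pixel gate that captures local structure.
        self.spatial_gate = nn.Sequential(
            nn.Conv2d(2, 1, kernel_size=7, padding=3),
            nn.Sigmoid(),
        )
        # Cross-attention: depth tokens act as queries over image tokens to
        # pull in long-distance guidance from the color branch.
        self.cross_attn = nn.MultiheadAttention(dim, num_heads, batch_first=True)
        self.norm_q = nn.LayerNorm(dim)
        self.norm_kv = nn.LayerNorm(dim)
        self.proj = nn.Conv2d(2 * dim, dim, 1)

    def forward(self, depth_feat: torch.Tensor, img_feat: torch.Tensor) -> torch.Tensor:
        b, c, h, w = depth_feat.shape
        # Local pathway: channel gating, then spatial gating of image features.
        local = img_feat * self.channel_gate(img_feat)
        pooled = torch.cat(
            [local.mean(dim=1, keepdim=True), local.amax(dim=1, keepdim=True)], dim=1
        )
        local = local * self.spatial_gate(pooled)
        # Global pathway: serialize both feature maps into tokens and cross-attend.
        q = self.norm_q(depth_feat.flatten(2).transpose(1, 2))   # (B, HW, C)
        kv = self.norm_kv(img_feat.flatten(2).transpose(1, 2))   # (B, HW, C)
        global_ctx, _ = self.cross_attn(q, kv, kv)
        global_ctx = global_ctx.transpose(1, 2).reshape(b, c, h, w)
        # Fuse local and global guidance back into the depth branch (residual).
        fused = self.proj(torch.cat([local, global_ctx], dim=1))
        return depth_feat + fused

# Usage: fuse same-resolution depth and image features at one pyramid scale.
fuse = DualAttentionFusion(dim=64)
out = fuse(torch.randn(2, 64, 30, 40), torch.randn(2, 64, 30, 40))  # (2, 64, 30, 40)

In a dual-branch encoder of the kind the abstract describes, one such block could be applied at each pyramid scale; the residual connection keeps the depth-branch signal intact when the image guidance is uninformative.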

Share and Cite

MDPI and ACS Style

Wang, S.; Jiang, F.; Gong, X. A Transformer-Based Image-Guided Depth-Completion Model with Dual-Attention Fusion Module. Sensors 2024, 24, 6270. https://doi.org/10.3390/s24196270

AMA Style

Wang S, Jiang F, Gong X. A Transformer-Based Image-Guided Depth-Completion Model with Dual-Attention Fusion Module. Sensors. 2024; 24(19):6270. https://doi.org/10.3390/s24196270

Chicago/Turabian Style

Wang, Shuling, Fengze Jiang, and Xiaojin Gong. 2024. "A Transformer-Based Image-Guided Depth-Completion Model with Dual-Attention Fusion Module." Sensors 24, no. 19: 6270. https://doi.org/10.3390/s24196270

Note that from the first issue of 2016, this journal uses article numbers instead of page numbers.
