This is an early access version, the complete PDF, HTML, and XML versions will be available soon.
Article

Online Scene Semantic Understanding Based on Sparsely Correlated Network for AR

The School of Computer and Artificial Intelligence, Beijing Technology and Business University, Beijing 102488, China
*
Author to whom correspondence should be addressed.
Sensors 2024, 24(14), 4756; https://doi.org/10.3390/s24144756 (registering DOI)
Submission received: 18 June 2024 / Revised: 9 July 2024 / Accepted: 18 July 2024 / Published: 22 July 2024
(This article belongs to the Special Issue 3D Reconstruction with RGB-D Cameras and Multi-sensors)

Abstract

Real-world understanding bridges the information world and the physical world, enabling virtual–real mapping and interaction. However, scene understanding based solely on 2D images suffers from a lack of geometric information and limited robustness against occlusion. Depth sensors bring new opportunities, but fusing depth with geometric and semantic priors remains challenging. To address these concerns, our method exploits the repetition in video stream data and the sparsity of newly generated data. We introduce a sparsely correlated network architecture (SCN) designed explicitly for online RGB-D instance segmentation. We also build on object-level RGB-D SLAM systems, overcoming the limitations of conventional approaches that emphasize only geometry or only semantics. We establish correlation over time and use it to develop rules and generate sparse data. We thoroughly evaluate the system's performance on the NYU Depth V2 and ScanNet V2 datasets, demonstrating that incorporating frame-to-frame correlation significantly improves the accuracy and consistency of instance segmentation compared with existing state-of-the-art alternatives. Moreover, using sparse data reduces data complexity while meeting the real-time requirement at 18 fps. Finally, using prior knowledge from object layout understanding, we demonstrate an augmented reality application that illustrates the method's potential and practicality.
Keywords: real scene analysis; instance segmentation; RGBD SLAM; augmented reality
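
The idea that consecutive video frames mostly repeat, so that newly generated data is sparse, can be illustrated with a toy sketch. The code below is not the SCN described in the abstract. It assumes two already-aligned depth frames (for example, from a static camera), and the function names `sparse_new_pixels` and `sparsity` and the tolerance `tol` are hypothetical choices made only for this illustration.

```python
from typing import List, Tuple

Frame = List[List[float]]  # a depth frame as rows of depth values (metres)


def sparse_new_pixels(prev: Frame, curr: Frame,
                      tol: float = 0.05) -> List[Tuple[int, int, float]]:
    """Return (row, col, depth) for pixels whose depth changed by more than tol.

    Assumption: both frames are already aligned pixel-to-pixel, so an
    unchanged depth value means the pixel repeats the previous frame.
    """
    if len(prev) != len(curr) or any(len(p) != len(c) for p, c in zip(prev, curr)):
        raise ValueError("frames must have the same shape")
    changed = []
    for r, (prev_row, curr_row) in enumerate(zip(prev, curr)):
        for c, (p, d) in enumerate(zip(prev_row, curr_row)):
            if abs(d - p) > tol:
                changed.append((r, c, d))
    return changed


def sparsity(changed: List[Tuple[int, int, float]], frame: Frame) -> float:
    """Fraction of the frame's pixels that count as newly generated data."""
    total = sum(len(row) for row in frame)
    return len(changed) / total if total else 0.0


# Two tiny 2x3 depth frames: one pixel moves closer, one has small sensor noise.
prev_frame = [[1.0, 1.0, 1.0],
              [1.0, 1.0, 1.0]]
curr_frame = [[1.0, 1.0, 0.6],
              [1.0, 1.02, 1.0]]

new_pixels = sparse_new_pixels(prev_frame, curr_frame)
print(new_pixels)                        # only the pixel at row 0, col 2 changed
print(sparsity(new_pixels, curr_frame))  # one of six pixels is new
```

In this toy case only the changed pixel would need further processing, which is the sense in which sparse data reduces data complexity. A real system would also have to compensate for camera motion before comparing frames.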

Share and Cite

MDPI and ACS Style

Wang, Q.; Song, J.; Du, C.; Wang, C. Online Scene Semantic Understanding Based on Sparsely Correlated Network for AR. Sensors 2024, 24, 4756. https://doi.org/10.3390/s24144756

AMA Style

Wang Q, Song J, Du C, Wang C. Online Scene Semantic Understanding Based on Sparsely Correlated Network for AR. Sensors. 2024; 24(14):4756. https://doi.org/10.3390/s24144756

Chicago/Turabian Style

Wang, Qianqian, Junhao Song, Chenxi Du, and Chen Wang. 2024. "Online Scene Semantic Understanding Based on Sparsely Correlated Network for AR" Sensors 24, no. 14: 4756. https://doi.org/10.3390/s24144756

