
3D Reconstruction with RGB-D Cameras and Multi-sensors

A special issue of Sensors (ISSN 1424-8220). This special issue belongs to the section "Sensing and Imaging".

Deadline for manuscript submissions: 10 April 2025 | Viewed by 21559

Special Issue Editors


Dr. Pengpeng Hu
Guest Editor
Department of Electronics and Informatics, Vrije Universiteit Brussel, Brussels, Belgium
Interests: biometrics; measurement; point cloud processing; deep learning; 3D body scanning

Prof. Dr. Adrian Munteanu
Guest Editor
Electronics and Informatics Department, Vrije Universiteit Brussel, 1050 Brussels, Belgium
Interests: machine learning; computer vision; 3D graphics; anthropometry

Dr. He Wang
Guest Editor
School of Computing, University of Leeds, Leeds, UK
Interests: computer graphics; computer animation; computer vision; machine learning and robotics

Dr. Walid Darwish
Guest Editor
Civil Engineering Department, Geomatics Engineering Lab, Faculty of Engineering, Cairo University, Cairo, Egypt
Interests: RGB-D sensors; SLAM; indoor navigation; plenoptic cameras

Special Issue Information

Dear Colleagues,

Multi-sensor systems are widely used in 3D reconstruction tasks such as 3D shape reconstruction, 4D body scanning, and human activity monitoring, to name a few. Compared to single-sensor systems, multi-sensor systems can simultaneously capture data from different viewpoints, which enables real-time complete shape capture. However, multi-sensor systems are usually expensive and require professional knowledge to operate. With the advancement of commodity RGB-D cameras, there have been countless attempts to build low-cost 3D reconstruction systems. These attempts have exposed additional challenges (e.g., the calibration of multiple RGB-D sensors, human joint detection from point clouds, the low resolution of scanned images, and the compression of large-scale point clouds), which have encouraged researchers to explore more advanced algorithms.
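
To make the first of these challenges concrete: the extrinsics between two RGB-D sensors are commonly bootstrapped from a checkerboard visible to both colour cameras, chaining the two board-to-camera poses. A minimal sketch using OpenCV follows; the board geometry, image files, and intrinsics are illustrative assumptions, not a prescribed setup.

    # Sketch: relative pose between two RGB-D sensors from one shared
    # checkerboard view. Board size, images, and intrinsics are assumed.
    import cv2
    import numpy as np

    BOARD = (9, 6)      # inner corners (cols, rows) -- assumed
    SQUARE = 0.025      # square size in metres -- assumed

    # 3D board points in the board's own coordinate frame
    obj = np.zeros((BOARD[0] * BOARD[1], 3), np.float32)
    obj[:, :2] = np.mgrid[0:BOARD[0], 0:BOARD[1]].T.reshape(-1, 2) * SQUARE

    # Placeholder pinhole intrinsics; real values come from calibration.
    K = np.array([[600.0, 0, 320.0], [0, 600.0, 240.0], [0, 0, 1]])
    dist = np.zeros(5)

    def board_pose(image_path):
        """Return the 4x4 board-to-camera transform for one colour image."""
        gray = cv2.imread(image_path, cv2.IMREAD_GRAYSCALE)
        found, corners = cv2.findChessboardCorners(gray, BOARD)
        assert found, "checkerboard not detected"
        _, rvec, tvec = cv2.solvePnP(obj, corners, K, dist)
        T = np.eye(4)
        T[:3, :3], _ = cv2.Rodrigues(rvec)
        T[:3, 3] = tvec.ravel()
        return T

    T1 = board_pose("cam1.png")        # board -> camera 1 (hypothetical file)
    T2 = board_pose("cam2.png")        # board -> camera 2 (hypothetical file)
    T_1_to_2 = T2 @ np.linalg.inv(T1)  # camera-1-to-camera-2 extrinsics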

In this context, the objective of this Special Issue is to connect researchers in the fields of multi-sensor camera calibration, RGB-D sensing, machine learning, 3D scanning, 4D capture, and other related areas. This issue will provide a state-of-the-art overview of methods that have led to progress in the research and application of multiple sensors.

We are soliciting original, full-length, unpublished research articles and reviews focused on this research topic. Topics of interest include, but are not limited to, the following:

  • Point cloud processing
  • Multi-sensor shape capture
  • Multi-sensor human activity understanding
  • RGB-D 3D reconstruction
  • RGB-D human activity understanding
  • RGB-D calibration
  • RGB-D SLAM
  • RGB-D data processing

Dr. Pengpeng Hu
Prof. Dr. Adrian Munteanu
Dr. He Wang
Dr. Walid Darwish
Guest Editors

Manuscript Submission Information

Manuscripts should be submitted online at www.mdpi.com by registering and logging in to this website. Once you are registered, click here to go to the submission form. Manuscripts can be submitted until the deadline. All submissions that pass pre-check are peer-reviewed. Accepted papers will be published continuously in the journal (as soon as accepted) and will be listed together on the special issue website. Research articles, review articles as well as short communications are invited. For planned papers, a title and short abstract (about 100 words) can be sent to the Editorial Office for announcement on this website.

Submitted manuscripts should not have been published previously, nor be under consideration for publication elsewhere (except conference proceedings papers). All manuscripts are thoroughly refereed through a single-blind peer-review process. A guide for authors and other relevant information for submission of manuscripts is available on the Instructions for Authors page. Sensors is an international peer-reviewed open access semimonthly journal published by MDPI.

Please visit the Instructions for Authors page before submitting a manuscript. The Article Processing Charge (APC) for publication in this open access journal is 2600 CHF (Swiss Francs). Submitted papers should be well formatted and use good English. Authors may use MDPI's English editing service prior to publication or during author revisions.

Benefits of Publishing in a Special Issue

  • Ease of navigation: Grouping papers by topic helps scholars navigate broad scope journals more efficiently.
  • Greater discoverability: Special Issues support the reach and impact of scientific research. Articles in Special Issues are more discoverable and cited more frequently.
  • Expansion of research network: Special Issues facilitate connections among authors, fostering scientific collaborations.
  • External promotion: Articles in Special Issues are often promoted through the journal's social media, increasing their visibility.
  • e-Book format: Special Issues with more than 10 articles can be published as dedicated e-books, ensuring wide and rapid dissemination.

Further information on MDPI's Special Issue policies can be found here.

Published Papers (10 papers)


Research

30 pages, 13386 KiB  
Article
Enhancing 3D Models with Spectral Imaging for Surface Reflectivity
by Adam Stech, Patrik Kamencay and Robert Hudec
Sensors 2024, 24(19), 6352; https://doi.org/10.3390/s24196352 - 30 Sep 2024
Viewed by 461
Abstract
The increasing demand for accurate and detailed 3D modeling in fields such as cultural heritage preservation, industrial inspection, and scientific research necessitates advanced techniques to enhance model quality. This paper addresses this necessity by incorporating spectral imaging data to improve the surface detail and reflectivity of 3D models. The methodology integrates spectral imaging with traditional 3D modeling processes, offering a novel approach to capturing fine textures and subtle surface variations. The experimental results demonstrate that 3D models generated with spectral imaging data exhibit significant improvements in surface detail and accuracy, particularly for objects with intricate surface patterns. These findings highlight the potential of spectral imaging to enhance 3D model quality, contributing to more precise and reliable representations of complex surfaces.

17 pages, 7063 KiB  
Article
Online Scene Semantic Understanding Based on Sparsely Correlated Network for AR
by Qianqian Wang, Junhao Song, Chenxi Du and Chen Wang
Sensors 2024, 24(14), 4756; https://doi.org/10.3390/s24144756 - 22 Jul 2024
Cited by 1 | Viewed by 792
Abstract
Real-world understanding serves as a medium that bridges the information world and the physical world, enabling the realization of virtual–real mapping and interaction. However, scene understanding based solely on 2D images faces problems such as a lack of geometric information and limited robustness against occlusion. The depth sensor brings new opportunities, but there are still challenges in fusing depth with geometric and semantic priors. To address these concerns, our method considers the repeatability of video stream data and the sparsity of newly generated data. We introduce a sparsely correlated network architecture (SCN) designed explicitly for online RGB-D instance segmentation. Additionally, we leverage the power of object-level RGB-D SLAM systems, thereby transcending the limitations of conventional approaches that solely emphasize geometry or semantics. We establish correlation over time and leverage this correlation to develop rules and generate sparse data. We thoroughly evaluate the system’s performance on the NYU Depth V2 and ScanNet V2 datasets, demonstrating that incorporating frame-to-frame correlation leads to significantly improved accuracy and consistency in instance segmentation compared to existing state-of-the-art alternatives. Moreover, using sparse data reduces data complexity while meeting the real-time requirement of 18 fps. Furthermore, by utilizing prior knowledge of object layout understanding, we demonstrate a promising augmented reality application, showcasing its potential and practicality.

11 pages, 153891 KiB  
Article
Neural Colour Correction for Indoor 3D Reconstruction Using RGB-D Data
by Tiago Madeira, Miguel Oliveira and Paulo Dias
Sensors 2024, 24(13), 4141; https://doi.org/10.3390/s24134141 - 26 Jun 2024
Cited by 2 | Viewed by 1061
Abstract
With the rise in popularity of different human-centred applications using 3D reconstruction data, the problem of generating photo-realistic models has become an important task. In a multiview acquisition system, particularly for large indoor scenes, the acquisition conditions will differ across the environment, causing colour differences between captures and unappealing visual artefacts in the produced models. We propose a novel neural-based approach to colour correction for indoor 3D reconstruction. It is a lightweight and efficient approach that can be used to harmonize colour from sparse captures over complex indoor scenes. Our approach uses a fully connected deep neural network to learn an implicit representation of the colour in 3D space, while capturing camera-dependent effects. We then leverage this continuous function as reference data to estimate the required transformations to regenerate pixels in each capture. Experiments to evaluate the proposed method on several scenes of the MP3D dataset show that it outperforms other relevant state-of-the-art approaches.
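
The paper's central component, a fully connected network that represents colour as a continuous function of 3D position while absorbing camera-dependent effects, can be sketched as below. The layer sizes and the per-capture embedding are illustrative assumptions, not the authors' exact architecture.

    # Sketch of an implicit colour field: an MLP maps a 3D point plus a
    # learned per-capture embedding to RGB. Sizes are assumptions.
    import torch
    import torch.nn as nn

    class ColourField(nn.Module):
        def __init__(self, n_captures, embed_dim=8, hidden=128):
            super().__init__()
            # one embedding per capture absorbs camera-dependent effects
            self.capture_embed = nn.Embedding(n_captures, embed_dim)
            self.mlp = nn.Sequential(
                nn.Linear(3 + embed_dim, hidden), nn.ReLU(),
                nn.Linear(hidden, hidden), nn.ReLU(),
                nn.Linear(hidden, 3), nn.Sigmoid(),  # RGB in [0, 1]
            )

        def forward(self, xyz, capture_id):
            e = self.capture_embed(capture_id)
            return self.mlp(torch.cat([xyz, e], dim=-1))

    # One training step: regress the observed colour at sampled points.
    model = ColourField(n_captures=12)
    opt = torch.optim.Adam(model.parameters(), lr=1e-3)
    xyz = torch.rand(1024, 3)              # surface samples (placeholder)
    cid = torch.randint(0, 12, (1024,))    # capture index per sample
    rgb = torch.rand(1024, 3)              # observed colours (placeholder)
    loss = nn.functional.mse_loss(model(xyz, cid), rgb)
    opt.zero_grad()
    loss.backward()
    opt.step()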

18 pages, 3738 KiB  
Article
Research on Three-Dimensional Reconstruction of Ribs Based on Point Cloud Adaptive Smoothing Denoising
by Darong Zhu, Diao Wang, Yuanjiao Chen, Zhe Xu and Bishi He
Sensors 2024, 24(13), 4076; https://doi.org/10.3390/s24134076 - 23 Jun 2024
Viewed by 876
Abstract
The traditional methods for 3D reconstruction mainly involve using image processing techniques or deep learning segmentation models for rib extraction. After post-processing, voxel-based rib reconstruction is achieved. However, these methods suffer from limited reconstruction accuracy and low computational efficiency. To overcome these limitations, this paper proposes a 3D rib reconstruction method based on point cloud adaptive smoothing and denoising. We converted voxel data from CT images to multi-attribute point cloud data. Then, we applied point cloud adaptive smoothing and denoising methods to eliminate noise and non-rib points in the point cloud. Additionally, efficient 3D reconstruction and post-processing techniques were employed to achieve high-accuracy and comprehensive 3D rib reconstruction results. Experiments demonstrated that, compared to voxel-based 3D rib reconstruction methods, the proposed method improved reconstruction accuracy by 40% and doubled computational efficiency.
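
The adaptive smoothing and denoising scheme is the paper's own; the general pattern it follows, cleaning a point cloud converted from CT voxels before reconstructing a surface, can be sketched with Open3D, using statistical outlier removal as a generic stand-in for the authors' adaptive method. The file name and parameters are assumptions.

    # Sketch: denoise a point cloud before surface reconstruction.
    # Statistical outlier removal stands in for the paper's adaptive
    # smoothing; the input file and parameters are assumptions.
    import open3d as o3d

    pcd = o3d.io.read_point_cloud("ribs.ply")   # hypothetical input cloud

    # Drop points whose mean neighbour distance deviates from the average.
    clean, kept_idx = pcd.remove_statistical_outlier(
        nb_neighbors=20, std_ratio=2.0)

    # Estimate normals, then reconstruct a mesh (Poisson as a generic choice).
    clean.estimate_normals(
        o3d.geometry.KDTreeSearchParamHybrid(radius=5.0, max_nn=30))
    mesh, _ = o3d.geometry.TriangleMesh.create_from_point_cloud_poisson(
        clean, depth=8)
    o3d.io.write_triangle_mesh("ribs_mesh.ply", mesh)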

18 pages, 2095 KiB  
Article
FusionVision: A Comprehensive Approach of 3D Object Reconstruction and Segmentation from RGB-D Cameras Using YOLO and Fast Segment Anything
by Safouane El Ghazouali, Youssef Mhirit, Ali Oukhrid, Umberto Michelucci and Hichem Nouira
Sensors 2024, 24(9), 2889; https://doi.org/10.3390/s24092889 - 30 Apr 2024
Cited by 1 | Viewed by 2398
Abstract
In the realm of computer vision, the integration of advanced techniques into the pre-processing of RGB-D camera inputs poses a significant challenge, given the inherent complexities arising from diverse environmental conditions and varying object appearances. Therefore, this paper introduces FusionVision, an exhaustive pipeline adapted for the robust 3D segmentation of objects in RGB-D imagery. Traditional computer vision systems face limitations in simultaneously capturing precise object boundaries and achieving high-precision object detection on depth maps, as they are mainly proposed for RGB cameras. To address this challenge, FusionVision adopts an integrated approach by merging state-of-the-art object detection techniques with advanced instance segmentation methods. The integration of these components enables a holistic (unified analysis of information obtained from both color RGB and depth D channels) interpretation of RGB-D data, facilitating the extraction of comprehensive and accurate object information in order to improve post-processes such as object 6D pose estimation, Simultaneous Localization and Mapping (SLAM) operations, accurate 3D dataset extraction, etc. The proposed FusionVision pipeline employs YOLO for identifying objects within the RGB image domain. Subsequently, FastSAM, an innovative semantic segmentation model, is applied to delineate object boundaries, yielding refined segmentation masks. The synergy between these components and their integration into 3D scene understanding ensures a cohesive fusion of object detection and segmentation, enhancing overall precision in 3D object segmentation.
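
The detect-then-segment-then-lift pattern behind FusionVision can be sketched as follows. Here the YOLO bounding box itself serves as a coarse mask (the paper refines it into a proper mask with FastSAM), and the weights, file names, and intrinsics are illustrative assumptions.

    # Sketch of the detect -> mask -> lift-to-3D pattern. The box stands
    # in for the FastSAM mask; inputs and intrinsics are assumptions.
    import numpy as np
    from ultralytics import YOLO

    model = YOLO("yolov8n.pt")             # hypothetical weights
    depth = np.load("depth.npy")           # aligned depth in metres, HxW
    fx = fy = 600.0                        # assumed pinhole intrinsics
    cx, cy = 320.0, 240.0

    res = model("frame.png")[0]            # run detection on the RGB frame
    for box in res.boxes.xyxy.cpu().numpy():
        x0, y0, x1, y1 = box.astype(int)
        # back-project every valid depth pixel inside the box to 3D
        v, u = np.mgrid[y0:y1, x0:x1]
        z = depth[y0:y1, x0:x1].ravel()
        valid = z > 0
        u, v, z = u.ravel()[valid], v.ravel()[valid], z[valid]
        pts = np.stack([(u - cx) * z / fx, (v - cy) * z / fy, z], axis=1)
        print(f"object point cloud: {pts.shape[0]} points")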

19 pages, 10016 KiB  
Article
LNMVSNet: A Low-Noise Multi-View Stereo Depth Inference Method for 3D Reconstruction
by Weiming Luo, Zongqing Lu and Qingmin Liao
Sensors 2024, 24(8), 2400; https://doi.org/10.3390/s24082400 - 9 Apr 2024
Cited by 1 | Viewed by 1522
Abstract
With the widespread adoption of modern RGB cameras, an abundance of RGB images is available everywhere. Therefore, multi-view stereo (MVS) 3D reconstruction, which involves multi-view depth estimation and stereo matching algorithms, has been extensively applied across various fields because of its cost-effectiveness and accessibility. However, MVS tasks face noise challenges because of natural multiplicative noise and negative gain in algorithms, which reduce the quality and accuracy of the generated models and depth maps. Traditional MVS methods often struggle with noise, relying on assumptions that do not always hold true under real-world conditions, while deep learning-based MVS approaches tend to suffer from high noise sensitivity. To overcome these challenges, we introduce LNMVSNet, a deep learning network designed to enhance local feature attention and fuse features across different scales, aiming for low-noise, high-precision MVS 3D reconstruction. Through extensive evaluation on multiple benchmark datasets, LNMVSNet has demonstrated its superior performance, showcasing its ability to improve reconstruction accuracy and completeness, especially in the recovery of fine details and clear feature delineation. This advancement brings hope for the widespread application of MVS, ranging from precise industrial part inspection to the creation of immersive virtual environments.

26 pages, 11619 KiB  
Article
Neural Radiance Fields-Based 3D Reconstruction of Power Transmission Lines Using Progressive Motion Sequence Images
by Yujie Zeng, Jin Lei, Tianming Feng, Xinyan Qin, Bo Li, Yanqi Wang, Dexin Wang and Jie Song
Sensors 2023, 23(23), 9537; https://doi.org/10.3390/s23239537 - 30 Nov 2023
Cited by 3 | Viewed by 1812
Abstract
To address the fuzzy reconstruction effect on distant objects in unbounded scenes and the difficulty in feature matching caused by the thin structure of power lines in images, this paper proposes a novel image-based method for the reconstruction of power transmission lines (PTLs). The dataset used in this paper comprises PTL progressive motion sequence datasets, constructed by a visual acquisition system carried by a developed Flying–walking Power Line Inspection Robot (FPLIR). This system captures close-distance and continuous images of power lines. The study introduces PL-NeRF, an enhanced method for reconstructing PTLs based on Neural Radiance Fields (NeRF). The highlights of PL-NeRF include (1) compressing the unbounded scene of PTLs by exploiting the spatial compression of normal L; (2) encoding the direction and position of the sample points through Integrated Position Encoding (IPE) and Hash Encoding (HE), respectively. Compared to existing methods, the proposed method demonstrates good performance in 3D reconstruction, with fidelity indicators of PSNR = 29, SSIM = 0.871, and LPIPS = 0.087. Experimental results highlight that the combination of PL-NeRF with progressive motion sequence images ensures the integrity and continuity of PTLs, improving the efficiency and accuracy of image-based reconstructions. In the future, this method could be widely applied for efficient and accurate 3D reconstruction and inspection of PTLs, providing a strong foundation for automated monitoring of transmission corridors and digital power engineering.
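
For intuition on the encoding step, the plain sinusoidal positional encoding that NeRF-style methods build on is sketched below; PL-NeRF itself uses Integrated Position Encoding and Hash Encoding, which elaborate on this idea, so this is background rather than the paper's exact scheme.

    # Sketch: the frequency positional encoding underlying NeRF-style
    # methods. A simplified ancestor of IPE/HE, shown for intuition only.
    import numpy as np

    def positional_encoding(x, n_freqs=10):
        """Map coordinates in [-1, 1] to sin/cos features at octave
        frequencies, so the MLP can represent high-frequency detail."""
        feats = [x]
        for i in range(n_freqs):
            for fn in (np.sin, np.cos):
                feats.append(fn((2.0 ** i) * np.pi * x))
        return np.concatenate(feats, axis=-1)

    pt = np.array([[0.1, -0.4, 0.7]])         # one 3D sample point
    print(positional_encoding(pt).shape)      # (1, 3 + 3*2*10) = (1, 63)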

21 pages, 13877 KiB  
Article
Recognition and Counting of Apples in a Dynamic State Using a 3D Camera and Deep Learning Algorithms for Robotic Harvesting Systems
by R. M. Rasika D. Abeyrathna, Victor Massaki Nakaguchi, Arkar Minn and Tofael Ahamed
Sensors 2023, 23(8), 3810; https://doi.org/10.3390/s23083810 - 7 Apr 2023
Cited by 18 | Viewed by 4403
Abstract
Recognition and 3D positional estimation of apples during harvesting from a robotic platform in a moving vehicle are still challenging. Fruit clusters, branches, foliage, low resolution, and different illuminations are unavoidable and cause errors in different environmental conditions. Therefore, this research aimed to develop a recognition system based on training datasets from an augmented, complex apple orchard. The recognition system was evaluated using deep learning algorithms established from a convolutional neural network (CNN). The dynamic accuracy of the modern artificial neural networks involving 3D coordinates for deploying robotic arms at different forward-moving speeds from an experimental vehicle was investigated to compare the recognition and tracking localization accuracy. In this study, a RealSense D455 RGB-D camera was selected to acquire 3D coordinates of each detected and counted apple attached to artificial trees placed in the field to propose a specially designed structure for ease of robotic harvesting. A 3D camera was used together with the state-of-the-art YOLO (You Only Look Once) variants YOLOv4, YOLOv5, and YOLOv7, as well as EfficientDet, for object detection. The Deep SORT algorithm was employed for tracking and counting detected apples using perpendicular, 15°, and 30° orientations. The 3D coordinates were obtained for each tracked apple when the on-board camera in the vehicle passed the reference line and was set in the middle of the image frame. To optimize harvesting at three different speeds (0.052 m s−1, 0.069 m s−1, and 0.098 m s−1), the accuracy of the 3D coordinates was compared for three forward-moving speeds and three camera angles (15°, 30°, and 90°). The mean average precision (mAP@0.5) values of YOLOv4, YOLOv5, YOLOv7, and EfficientDet were 0.84, 0.86, 0.905, and 0.775, respectively. The lowest root mean square error (RMSE) was 1.54 cm for the apples detected by EfficientDet at a 15° orientation and a speed of 0.098 m s−1. In terms of counting apples, YOLOv5 and YOLOv7 showed a higher number of detections in outdoor dynamic conditions, achieving a counting accuracy of 86.6%. We concluded that the EfficientDet deep learning algorithm at a 15° orientation in 3D coordinates can be employed for further robotic arm development while harvesting apples in a specially designed orchard.
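
Obtaining a 3D coordinate for each tracked apple reduces to pinhole back-projection of the detection's pixel with its aligned depth. A minimal sketch follows; the intrinsics and pixel values are illustrative assumptions rather than calibrated D455 parameters.

    # Sketch: lift a detected apple's box centre to a 3D coordinate from
    # an aligned depth frame. Intrinsics and pixel values are assumed.
    import numpy as np

    def deproject(u, v, depth_m, fx, fy, cx, cy):
        """Pinhole back-projection of one pixel with known depth (metres)."""
        x = (u - cx) * depth_m / fx
        y = (v - cy) * depth_m / fy
        return np.array([x, y, depth_m])

    # Illustrative colour-camera intrinsics, not calibrated values.
    fx, fy, cx, cy = 631.0, 631.0, 640.0, 360.0
    u, v = 512, 300          # tracked apple's box centre (assumed)
    depth = 1.85             # aligned depth at (u, v), in metres (assumed)
    print(deproject(u, v, depth, fx, fy, cx, cy))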

25 pages, 20118 KiB  
Article
Light Field View Synthesis Using the Focal Stack and All-in-Focus Image
by Rishabh Sharma, Stuart Perry and Eva Cheng
Sensors 2023, 23(4), 2119; https://doi.org/10.3390/s23042119 - 13 Feb 2023
Viewed by 2228
Abstract
Light field reconstruction and synthesis algorithms are essential for improving the low spatial resolution of hand-held plenoptic cameras. Previous light field synthesis algorithms produce blurred regions around depth discontinuities, especially for stereo-based algorithms, where no information is available to fill the occluded areas in the light field image. In this paper, we propose a light field synthesis algorithm that uses the focal stack images and the all-in-focus image to synthesize a 9 × 9 sub-aperture view light field image. Our approach uses depth from defocus to estimate a depth map. Then, we use the depth map and the all-in-focus image to synthesize the sub-aperture views and their corresponding depth maps by mimicking the apparent shifting of the central image according to the depth values. We handle the occluded regions in the synthesized sub-aperture views by filling them with the information recovered from the focal stack images. We also show that, if the depth levels in the image are known, we can synthesize a high-accuracy light field image with just five focal stack images. The accuracy of our approach is compared with three state-of-the-art algorithms: one non-learning and two CNN-based approaches, and the results show that our algorithm outperforms all three in terms of PSNR and SSIM metrics.
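
The core shifting idea can be sketched as a forward warp of the all-in-focus image, displacing each pixel in proportion to its disparity and the target view's offset from the centre. Occluded gaps are simply left unfilled here, whereas the paper recovers them from the focal stack; all inputs below are placeholder assumptions.

    # Sketch: synthesize one sub-aperture view by forward-warping the
    # all-in-focus image with depth-proportional shifts. Inputs assumed.
    import numpy as np

    def synthesize_view(aif, disparity, du, dv):
        """Forward-warp the all-in-focus image by (du, dv) * disparity."""
        h, w, _ = aif.shape
        out = np.zeros_like(aif)
        v, u = np.mgrid[0:h, 0:w]
        u2 = np.clip((u + du * disparity).round().astype(int), 0, w - 1)
        v2 = np.clip((v + dv * disparity).round().astype(int), 0, h - 1)
        out[v2, u2] = aif[v, u]     # pixels nothing lands on stay black
        return out

    aif = np.random.rand(64, 64, 3)    # placeholder all-in-focus image
    disp = np.ones((64, 64))           # placeholder disparity from defocus
    view = synthesize_view(aif, disp, du=2, dv=-1)   # one off-centre view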

10 pages, 4347 KiB  
Article
3D Reconstruction Using 3D Registration-Based ToF-Stereo Fusion
by Sukwoo Jung, Youn-Sung Lee, Yunju Lee and KyungTaek Lee
Sensors 2022, 22(21), 8369; https://doi.org/10.3390/s22218369 - 1 Nov 2022
Cited by 9 | Viewed by 3167
Abstract
Depth sensing is an important issue in many applications, such as Augmented Reality (AR), eXtended Reality (XR), and the Metaverse. For 3D reconstruction, a depth map can be acquired by a stereo camera and a Time-of-Flight (ToF) sensor. We used both sensors complementarily to improve the accuracy of the 3D information in the data. First, we applied a generalized multi-camera calibration method that uses both color and depth information. Next, the depth maps of the two sensors were fused by a 3D registration and reprojection approach. Then, hole-filling was applied to refine the new depth map from the ToF-stereo fused data. Finally, a surface reconstruction technique was used to generate mesh data from the ToF-stereo fused point cloud data. The proposed procedure was implemented and tested with real-world data and compared with various algorithms to validate its efficiency.
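
The registration step can be sketched with Open3D's point-to-plane ICP, aligning the ToF cloud to the stereo cloud before merging. This is a generic stand-in for the authors' implementation; the file names and thresholds are assumptions.

    # Sketch: align the ToF cloud to the stereo cloud with ICP, then
    # merge. A generic stand-in; files and thresholds are assumptions.
    import open3d as o3d

    tof = o3d.io.read_point_cloud("tof.ply")        # hypothetical inputs
    stereo = o3d.io.read_point_cloud("stereo.ply")
    for pcd in (tof, stereo):
        # point-to-plane ICP needs normals on both clouds
        pcd.estimate_normals(
            o3d.geometry.KDTreeSearchParamHybrid(radius=0.05, max_nn=30))

    reg = o3d.pipelines.registration.registration_icp(
        tof, stereo, max_correspondence_distance=0.02,
        estimation_method=o3d.pipelines.registration
            .TransformationEstimationPointToPlane())

    tof.transform(reg.transformation)    # apply the estimated rigid pose
    fused = stereo + tof                 # concatenate the aligned clouds
    fused = fused.voxel_down_sample(voxel_size=0.005)   # thin duplicates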
