Search Results (769)

Search Parameters:
Keywords = point cloud filter

27 pages, 8355 KB  
Article
Calibration of Roughness of Standard Samples Using Point Cloud Based on Line Chromatic Confocal Method
by Haotian Guo, Ting Chen, Xinke Xu, Yuexin Qiu, Jian Wu, Lei Wang, Huaichu Ye, Xuwen Chen and Ning Chen
Electronics 2026, 15(7), 1517; https://doi.org/10.3390/electronics15071517 - 4 Apr 2026
Viewed by 191
Abstract
This article proposes a calibration method combining line chromatic confocal sensing and 3D point cloud processing to address surface damage and low efficiency in traditional roughness sample calibration. Line chromatic confocal sensors scan roughness samples to obtain dense point clouds. We propose a back projection mechanism, the adaptive density-based spatial clustering of applications with noise statistical outlier removal (BPM-ADBSCAN-SOR) algorithm, which utilizes the ADBSCAN and SOR algorithms to address outlier noise and near-field noise in low-resolution point clouds, respectively, and then employs bounding boxes to crop the original high-resolution point cloud, thereby achieving multi-scale noise removal and point cloud clustering. We propose a Steady-State Confidence-Weighted Robust Gaussian Filtering (SSCW-RGF) algorithm, which calculates the range of the steady-state region, designs a steady-state region credibility weighting function to apply a weighted correction to the baseline fitting results, and then incorporates M-estimation theory to develop a robust Gaussian filtering algorithm weighted by steady-state region credibility, thereby mitigating the impact of outliers on Gaussian baseline fitting. Experiments verify the system accuracy: the repeatability standard deviation is 0.0355 μm and the relative repeatability error is 0.3984%. Compared with the reference block nominal values, the maximum absolute error is −0.745 μm, meeting the specification tolerance. Compared with a contact profilometer, the maximum absolute error is 0.050 μm, the maximum relative error is +4.5%, and the calibration efficiency is improved by 90%. This provides a new approach for surface roughness calibration. Full article
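The SOR stage named above is the standard statistical-outlier-removal filter. As a rough illustration of that generic building block (not the authors' BPM-ADBSCAN-SOR pipeline), a minimal NumPy/SciPy sketch might look like:

```python
import numpy as np
from scipy.spatial import cKDTree

def sor_filter(points, k=8, std_ratio=1.0):
    """Statistical outlier removal: drop points whose mean distance to
    their k nearest neighbors exceeds (global mean + std_ratio * std)."""
    tree = cKDTree(points)
    # query k+1 neighbors because each point's nearest neighbor is itself
    dists, _ = tree.query(points, k=k + 1)
    mean_d = dists[:, 1:].mean(axis=1)          # per-point mean neighbor distance
    thresh = mean_d.mean() + std_ratio * mean_d.std()
    keep = mean_d <= thresh
    return points[keep], keep

# dense cluster plus one far-away outlier
rng = np.random.default_rng(0)
cloud = np.vstack([rng.normal(0, 0.05, (200, 3)), [[5.0, 5.0, 5.0]]])
filtered, mask = sor_filter(cloud, k=8, std_ratio=1.0)
```

Lowering `std_ratio` makes the filter more aggressive; point-cloud libraries expose the same two knobs (neighbor count and deviation multiplier).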

14 pages, 3147 KB  
Article
Improving the Environmental Safety of Transport Equipment Using Biodiesel Produced from Waste Vegetable
by Sergey N. Krivtsov, Nina V. Nemchinova, Andrey A. Tyutrin, Daniil Iakovlev, Dmitry A. Tikhov-Tinnikov, Sergey P. Ozornin, Andrei V. Negovora and Filipp A. Vasilev
Appl. Sci. 2026, 16(7), 3487; https://doi.org/10.3390/app16073487 - 3 Apr 2026
Viewed by 166
Abstract
Issues related to the environmental safety of transport vehicles, the operation of which leads to environmental pollution, continue to be highly relevant. In this work, we consider the use of biofuel mixed with diesel fuel for internal combustion engines operating at low temperatures. This approach does not reduce the efficiency of transport, while also solving the issue of organic waste recycling. In this work, we address the possibility of reducing environmental pollution using carbon-neutral blended fuels based on esters of waste cooking oil (WCO), biobutanol, and diesel fuel for transport, tractor, and other equipment powered by a diesel internal combustion engine. In terms of the rate of biofuel implementation, Russia is still lagging behind the EU, China, and Japan, largely due to, inter alia, its climatic conditions with cold and long winters. The article also provides data on the possibility of using mixed biofuels under sub-zero temperatures. The process of forming a volumetric fuel supply through the common rail injector of the D4CB engine under changes in fuel pressure and drive pulse duration was also investigated, with the corresponding regression dependencies being presented. The losses of heat supplied into the cylinder when using a blend of diesel fuel and biodiesel (with 20 wt% butanol) in comparison with diesel fuel were analytically calculated. This made it possible to identify a function for adjusting fuel supply to compensate for power losses. The lubricity of fuel blends was assessed using the HFRR method. Full article
(This article belongs to the Section Ecology Science and Engineering)
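The regression dependencies of volumetric fuel supply on rail pressure and drive-pulse duration mentioned above can be illustrated with an ordinary least-squares fit. The bilinear model and every number below are hypothetical stand-ins, not the paper's data:

```python
import numpy as np

# Hypothetical bilinear model of injected volume q (mm^3/cycle) as a
# function of rail pressure p (MPa) and drive-pulse duration t (ms).
rng = np.random.default_rng(1)
p = rng.uniform(60, 160, 50)     # rail pressure samples
t = rng.uniform(0.4, 1.6, 50)    # pulse duration samples
q = 0.12 * p + 18.0 * t - 4.0 + rng.normal(0, 0.2, 50)  # synthetic "measurements"

# least-squares fit of q ≈ a*p + b*t + c
A = np.column_stack([p, t, np.ones_like(p)])
(a, b, c), *_ = np.linalg.lstsq(A, q, rcond=None)
```

The same fit, evaluated in reverse, yields the kind of fuel-supply correction function the abstract describes for compensating power losses.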

23 pages, 6677 KB  
Article
Fine-Grained 3D Building Reconstruction and Floor Height Estimation from Ultra-High-Resolution TomoSAR Data Using Geometric Constraints
by Haoyuan Chen, Wenkang Liu, Quan Chen, Lei Cui and Mengdao Xing
Remote Sens. 2026, 18(7), 1073; https://doi.org/10.3390/rs18071073 - 2 Apr 2026
Viewed by 274
Abstract
The automatic generation of semantic Level of Detail (LOD) 2 models from TomoSAR point clouds is frequently compromised by elevation side-lobes, data sparsity, and inherent geometric distortions. In particular, the energy dispersion caused by side-lobes blurs vertical structures, making the extraction of floor details and accurate floor height estimation significantly challenging. To overcome these limitations, we present a refined reconstruction framework that tightly couples tomographic imaging mechanisms with building geometric priors. For fine-grained vertical reconstruction, we employ a geometry-constrained inverse projection strategy that concentrates scattered energy back onto the building façade to mitigate side-lobe interference. This is complemented by a Global Coherent Integration method, utilizing spectral analysis to robustly recover periodic floor patterns and estimate average floor heights. In the horizontal domain, we address the conflict between noise suppression and feature preservation through a separation-of-axes morphological strategy. Unlike traditional isotropic filtering, this approach processes orthogonal directions independently to bridge data gaps while strictly maintaining sharp building corners and recovering fine substructures. Validated on airborne Ku-band datasets, the proposed method demonstrates the capability to produce topologically complete and semantically rich urban models from sparse radar observations. Full article
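The Global Coherent Integration step recovers periodic floor patterns via spectral analysis. A minimal sketch of the underlying idea — estimating the dominant spatial period of a vertical scatterer profile from its FFT peak — on a synthetic façade with a 3 m floor height:

```python
import numpy as np

def dominant_period(signal, dz):
    """Estimate the dominant spatial period of a 1-D signal sampled
    every dz metres, via the peak of its amplitude spectrum."""
    spec = np.abs(np.fft.rfft(signal - signal.mean()))
    freqs = np.fft.rfftfreq(len(signal), d=dz)
    k = 1 + np.argmax(spec[1:])        # skip the DC bin
    return 1.0 / freqs[k]

# synthetic façade profile: scatterer density peaks every 3.0 m of height
z = np.arange(0, 60, 0.1)                     # 0-60 m sampled every 0.1 m
profile = 1.0 + np.cos(2 * np.pi * z / 3.0)   # hypothetical 3 m floor period
floor_height = dominant_period(profile, dz=0.1)
```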

18 pages, 3933 KB  
Article
Feature Selection Based on Height Mutual Information in Airborne LiDAR Filtering
by Zhan Cai, Luying Zhao, Qiuli Chen, Zhijun He, Shaoyun Bi and Xiaolong Xu
Remote Sens. 2026, 18(7), 1031; https://doi.org/10.3390/rs18071031 - 30 Mar 2026
Viewed by 247
Abstract
Filtering constitutes a critical step in the post-processing of airborne Light Detection And Ranging (LiDAR) data. Over the past decade, machine learning has emerged as a prominent methodological paradigm across numerous disciplines, attracting significant research interest in its application to LiDAR filtering. From a machine learning perspective, filtering is essentially a binary classification task that aims to discriminate between ground and non-ground points. However, the limited information inherent in point clouds often leads to the generation of highly correlated features, particularly those derived from height data, which can compromise filtering accuracy. To address this issue, feature selection becomes imperative. In this study, we employed height-based mutual information as a criterion to identify and eliminate less discriminative features for filtering. The AdaBoost (Adaptive Boosting) algorithm was adopted as the classifier for point cloud filtering. For each point, nineteen features were derived from the raw LiDAR point cloud based on height and other geometric attributes within a defined neighborhood. The performance of the proposed feature selection approach was evaluated using benchmark datasets provided by the International Society for Photogrammetry and Remote Sensing (ISPRS). Experimental results demonstrate that the method is effective and reliable. After removing three selected features, the average kappa coefficient improved, along with a reduction in three categories of error, although a slight increase in Type II error (0.15%) was observed. Full article
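Height-based mutual information as a feature-selection criterion can be sketched with a simple histogram estimator. The two features below are synthetic stand-ins (a near-copy of height versus an unrelated variable), not the paper's nineteen features:

```python
import numpy as np

def mutual_info(x, y, bins=16):
    """Histogram estimate of mutual information (in nats) between two
    continuous variables."""
    pxy, _, _ = np.histogram2d(x, y, bins=bins)
    pxy /= pxy.sum()
    px = pxy.sum(axis=1, keepdims=True)
    py = pxy.sum(axis=0, keepdims=True)
    nz = pxy > 0
    return float((pxy[nz] * np.log(pxy[nz] / (px @ py)[nz])).sum())

rng = np.random.default_rng(2)
height = rng.normal(0, 1, 2000)
redundant = height + rng.normal(0, 0.1, 2000)   # near-copy of height
independent = rng.normal(0, 1, 2000)            # unrelated feature

mi_red = mutual_info(redundant, height)
mi_ind = mutual_info(independent, height)
```

A feature whose MI with height is very high carries little information beyond height itself, which is the redundancy the paper's criterion targets.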

30 pages, 135773 KB  
Article
Robust 3D Multi-Object Tracking via 4D mmWave Radar-Camera Fusion and Disparity-Domain Depth Recovery
by Yunfei Xie, Xiaohui Li, Dingheng Wang, Zhuo Wang, Shiliang Li, Jia Wang and Zhenping Sun
Sensors 2026, 26(7), 2096; https://doi.org/10.3390/s26072096 - 27 Mar 2026
Viewed by 481
Abstract
4D millimeter-wave radar provides high-precision ranging capability and exhibits strong robustness under adverse weather and low-visibility conditions, but its point clouds are relatively sparse and suffer from severe elevation-angle measurement noise. Monocular cameras, by contrast, provide rich semantic information and high recall, yet are fundamentally limited by scale ambiguity. To exploit the complementary characteristics of these two sensors, this paper proposes a radar-camera fusion 3D multi-object tracking framework that does not rely on complex 3D annotated data. First, on the radar signal-processing side, a Gaussian distribution-based adaptive angle compression method and IMU-based velocity compensation are introduced to effectively suppress measurement noise, and an improved DBSCAN clustering scheme with recursive cluster splitting and historical static-box guidance is employed to generate high-quality radar detections. Second, a disparity-domain metric depth recovery method is proposed. This method uses filtered radar points as sparse metric anchors, performs robust fitting with RANSAC, and applies Kalman filtering for temporal smoothing, thereby converting the relative depth output of the visual foundation model Depth Anything V2 into metric depth. Finally, a hierarchical fusion strategy is designed at both the detection and tracking levels to achieve stable cross-modal state association. Experimental results on a self-collected dataset show that the proposed method achieves an overall MOTA of 77.93%, outperforming single-modality baselines and other comparison methods by 11 to 31 percentage points. This study provides an effective solution for low-cost and robust environment perception in complex dynamic scenarios. Full article
(This article belongs to the Section Vehicular Sensing)
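The disparity-domain depth recovery uses filtered radar points as sparse metric anchors with RANSAC fitting. A minimal sketch of that generic idea — robustly fitting the scale and offset that map relative depth to metric depth despite corrupted anchors — on synthetic data:

```python
import numpy as np

def ransac_scale_offset(d_rel, d_metric, iters=200, tol=0.5, seed=0):
    """RANSAC fit of d_metric ≈ a * d_rel + b from sparse, partly
    corrupted anchor pairs; returns (a, b) of the best consensus model."""
    rng = np.random.default_rng(seed)
    best, best_inliers = (1.0, 0.0), -1
    for _ in range(iters):
        i, j = rng.choice(len(d_rel), size=2, replace=False)
        if d_rel[i] == d_rel[j]:
            continue
        a = (d_metric[i] - d_metric[j]) / (d_rel[i] - d_rel[j])
        b = d_metric[i] - a * d_rel[i]
        inliers = np.abs(a * d_rel + b - d_metric) < tol
        if inliers.sum() > best_inliers:
            best_inliers = inliers.sum()
            # refine with least squares on the consensus set
            A = np.column_stack([d_rel[inliers], np.ones(inliers.sum())])
            best = tuple(np.linalg.lstsq(A, d_metric[inliers], rcond=None)[0])
    return best

rng = np.random.default_rng(3)
d_rel = rng.uniform(0.1, 1.0, 40)                  # relative depths (unit-free)
d_metric = 20.0 * d_rel + 2.0 + rng.normal(0, 0.05, 40)
d_metric[:8] += rng.uniform(5, 15, 8)              # corrupted radar anchors
a, b = ransac_scale_offset(d_rel, d_metric)
```

In the paper's pipeline the fitted parameters would then be smoothed over time (Kalman filtering) before converting the dense relative-depth map.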

27 pages, 29264 KB  
Article
Method and Application of Full-Space Deformation Monitoring of Surrounding Rock in Coal Mine Roadway Based on Mobile Three-Dimensional Laser Scanning
by Chao Gao, Dexing He and Xinqiu Fang
Appl. Sci. 2026, 16(7), 3156; https://doi.org/10.3390/app16073156 - 25 Mar 2026
Viewed by 186
Abstract
Deformation monitoring of roadway surrounding rock is a key link in ensuring safe production in coal mines. Traditional monitoring methods can only obtain displacement information at discrete measuring points and struggle to fully reflect the spatial distribution characteristics and evolution law of surrounding rock deformation. Based on the engineering background of the extra-thick coal seam roadway in the Yushupo Coal Mine, Shanxi Province, China, this study proposes a set of full-space deformation monitoring methods for roadway surrounding rock based on explosion-proof mobile 3D laser scanning technology. Firstly, a hierarchical denoising method based on improved statistical filtering is established. The quality of the point cloud data is effectively improved by a three-level processing strategy of region clipping, connectivity analysis guided by multi-dimensional geometric features, and adaptive density thresholding. Secondly, a hierarchical point cloud registration method combining physical anchor geometric constraints and deep-learning patch-guided matching is proposed to reduce the registration error to the millimeter level. Finally, the deformation of the surrounding rock is evaluated by combining overall deformation identification with quantitative analysis of local section slices. The engineering application results show that the deformation of the roadway floor is the most significant during the monitoring period, with a maximum deformation of 90.0 mm and an average deformation of 46.9 mm; the maximum deformation of the roof is 35.0 mm, and the convergence of the two sides is asymmetric. Compared with the total station, the maximum displacement error in each direction does not exceed 5 mm and the standard deviation is within 1.3 mm, which meets the engineering accuracy requirements of coal mine roadway deformation monitoring. This study provides a complete technical scheme for panoramic, high-precision monitoring of surrounding rock deformation in coal mine roadways. Full article
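The local section-slice analysis mentioned above can be sketched generically: extract a thin cross-section of the cloud at a given chainage and difference it between epochs. Everything below (geometry, the 9 cm floor heave) is synthetic:

```python
import numpy as np

def section_slice(points, y0, half_width=0.05):
    """Extract a cross-section of a roadway point cloud: all points within
    +/- half_width metres of chainage y0, projected to the (x, z) plane."""
    mask = np.abs(points[:, 1] - y0) < half_width
    return points[mask][:, [0, 2]]

# two synthetic epochs of the same slice; the floor (z < 0.5 m) heaves by 9 cm
rng = np.random.default_rng(7)
pts = np.column_stack([rng.uniform(-2, 2, 500),     # x: across the roadway
                       rng.uniform(9.9, 10.1, 500), # y: chainage
                       rng.uniform(0, 3, 500)])     # z: height
epoch2 = pts.copy()
floor = epoch2[:, 2] < 0.5
epoch2[floor, 2] += 0.09

s1 = section_slice(pts, y0=10.0)
s2 = section_slice(epoch2, y0=10.0)
dz = s2[:, 1] - s1[:, 1]       # per-point vertical change within the slice
```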

29 pages, 6656 KB  
Article
Improvements to the FLOAM Algorithm: GICP Registration and SOR Filtering in Mobile Robots with Pure Laser Configuration and Enhanced SLAM Performance
by Shichen Fu, Tianbao Zhao, Junkai Zhang, Guangming Guo and Weixiong Zheng
Appl. Sci. 2026, 16(7), 3141; https://doi.org/10.3390/app16073141 - 24 Mar 2026
Viewed by 228
Abstract
Laser SLAM is a key enabling technology for autonomous navigation of intelligent mobile robots. The standard FLOAM algorithm suffers from low positioning accuracy, weak anti-interference performance, and error accumulation in pure LiDAR scenarios, making it difficult to meet practical engineering requirements. Numerous studies therefore focus on improved, highly robust pure laser SLAM algorithms. This study applies an enhanced FLOAM algorithm with GICP registration and SOR filtering: SOR filtering processes the laser point cloud to remove outlier noise, and GICP registration replaces the classic registration step with an optimized matching cost function. Experiments are conducted on a mobile robot with a Leishen C16 LiDAR in an indoor corridor and an outdoor plaza, with real-life tests simulated on the Gazebo simulation platform. Quantitative evaluation with the EVO tool indicates that the indoor mean absolute error and RMSE were reduced by 46.67% and 41.67% compared with FLOAM, and the outdoor mean and maximum errors were reduced by 46.00% and 70.00%, respectively. The proposed improved scheme achieves centimeter-level positioning accuracy and strong robustness in pure laser configurations without auxiliary sensors such as IMUs or odometers, providing a reliable technical solution for the engineering application of mobile robots in sensor-constrained scenarios. Full article
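The EVO evaluation reported above boils down to trajectory error statistics. A minimal sketch of the ATE RMSE computation, assuming the estimated and reference trajectories are already aligned and time-associated (EVO additionally handles alignment and association):

```python
import numpy as np

def ate_rmse(est, ref):
    """Absolute trajectory error RMSE between aligned (N, 3) position
    sequences, as reported by tools such as EVO."""
    err = np.linalg.norm(est - ref, axis=1)
    return float(np.sqrt((err ** 2).mean()))

# toy trajectories: estimate wobbles 0.1 m laterally around the reference
ref = np.array([[0.0, 0, 0], [1, 0, 0], [2, 0, 0], [3, 0, 0]])
est = ref + np.array([[0.0, 0.1, 0], [0, -0.1, 0], [0, 0.1, 0], [0, -0.1, 0]])
```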

22 pages, 26802 KB  
Article
Attention-Guided Semantic Segmentation and Scan-to-Model Geometric Reconstruction of Underground Tunnels from Mobile Laser Scanning
by Yingjia Huang, Jiang Ye, Xiaohui Li and Jingliang Du
Appl. Sci. 2026, 16(6), 3042; https://doi.org/10.3390/app16063042 - 21 Mar 2026
Viewed by 254
Abstract
Mobile Laser Scanning (MLS) integrated with Simultaneous Localization and Mapping (SLAM) has emerged as a key technology for digitizing GNSS-denied environments, such as underground mines. However, the automated interpretation of unstructured, high-density point clouds into semantic engineering models remains challenging due to extreme geometric anisotropy in point distributions and severe class imbalance inherent to narrow tunnel environments. To address these issues, this study proposes a highly automated scan-to-model framework for precise semantic segmentation and vectorized two-dimensional (2D) profile reconstruction. First, an enhanced hierarchical deep learning network tailored for point clouds is introduced. The architecture incorporates a context-aware sampling strategy with an expanded receptive field of up to 10 m to preserve axial continuity, coupled with a spatial–geometric dual-attention mechanism to refine boundary delineation. In addition, a composite Focal–Dice loss function is employed to alleviate the dominance of wall points during network training. Experimental validation on a field-collected dataset comprising 16 mine tunnels demonstrates that the proposed model achieves a mean Intersection over Union (mIoU) of 85.15% (±0.29%) and an Overall Accuracy (OA) of 95.13% (±0.13%). Building on this semantic foundation, a robust geometric modeling pipeline is established using curvature-guided filtering and density-adaptive B-spline fitting. The reconstructed profiles accurately recover the geometric mean surface of the tunnel wall, yielding an overall filtered Root Mean Square Error (RMSE) of 4.96 ± 0.48 cm. The proposed framework provides an efficient end-to-end solution for deformation analysis and digital twinning of underground mining infrastructure. Full article
(This article belongs to the Special Issue Artificial Intelligence Applications in Underground Space Technology)
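The B-spline fitting step can be illustrated with SciPy's smoothing splines. The arch geometry and noise level below are hypothetical, and a uniform smoothing factor stands in for the paper's density-adaptive weighting:

```python
import numpy as np
from scipy.interpolate import splprep, splev

# hypothetical tunnel cross-section: noisy points on an arch of radius 2 m
rng = np.random.default_rng(4)
theta = np.linspace(0, np.pi, 120)
x = 2.0 * np.cos(theta) + rng.normal(0, 0.02, 120)
y = 2.0 * np.sin(theta) + rng.normal(0, 0.02, 120)

# smoothing B-spline through the ordered profile points; s trades noise
# rejection against fidelity (set here to the expected noise energy)
tck, u = splprep([x, y], s=2 * len(x) * 0.02 ** 2)
xf, yf = splev(np.linspace(0, 1, 200), tck)

# RMSE of the fitted profile against the ideal 2 m radius
rmse = float(np.sqrt(np.mean((np.hypot(xf, yf) - 2.0) ** 2)))
```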

19 pages, 7295 KB  
Article
Video Identifying and Eraser: Use Multi-Task Cascaded Convolutional Neural Network to Enhance Safety in a Text-to-Video Diffusion Model
by Shuang Lin, Ranran Zhou and Yong Wang
Appl. Sci. 2026, 16(6), 2995; https://doi.org/10.3390/app16062995 - 20 Mar 2026
Viewed by 231
Abstract
Current security solutions predominantly rely on cloud-based implementations, often neglecting computational resource constraints and operational efficiency. While contemporary methodologies typically require additional training, the few that operate without retraining frequently yield suboptimal performance. To address these limitations, this work leverages a pre-trained MTCNN architecture to detect faces of copyright-protected individuals. We construct a facial landmark database comprising five critical fiducial points, which serves as a supplementary module integrated into the stable diffusion framework, enabling real-time security filtering for synthesized video content. The proposed system utilizes MTCNN models pre-trained in the cloud to build a repository of copyrighted facial signatures, generating a geometric parameter database of facial landmarks. This database, coupled with a parallel verification unit, functions as a plugin within the standard Stable Diffusion pipeline. By leveraging Stable Diffusion’s native decoder, we decode stochastic frames from the U-Net latent representations and perform real-time comparative analysis to identify potential copyright violations in generated video sequences. Upon detecting an infringement, an on-screen display (OSD) alert notifies the user and immediately halts the text-to-video (T2V) generation process. Experimental evaluations demonstrate that our framework effectively mitigates the resource constraints and latency issues inherent in edge deployment scenarios of prior security implementations. Leveraging MTCNN’s proven robustness and extensive edge compatibility for facial recognition, the proposed detection and obfuscation plugin integrates seamlessly with Stable Diffusion while preserving generation quality. Full article
(This article belongs to the Special Issue Applied Multimodal AI: Methods and Applications Across Domains)
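Comparing generated faces against a five-point fiducial database reduces to a distance between normalized landmark geometries. A toy sketch of that comparison (hypothetical landmark layout, not MTCNN output):

```python
import numpy as np

def normalize_landmarks(pts):
    """Translate to the centroid and scale to unit RMS radius, making the
    comparison invariant to face position and size in the frame."""
    c = pts - pts.mean(axis=0)
    return c / np.sqrt((c ** 2).sum(axis=1).mean())

def landmark_distance(a, b):
    """RMS distance between two normalized five-point landmark sets."""
    d = normalize_landmarks(a) - normalize_landmarks(b)
    return float(np.sqrt((d ** 2).sum(axis=1).mean()))

# five fiducial points: eyes, nose tip, mouth corners (hypothetical layout)
face = np.array([[-1.0, 1.0], [1.0, 1.0], [0.0, 0.0], [-0.8, -1.0], [0.8, -1.0]])
same_face_scaled = face * 2.5 + np.array([40.0, 60.0])  # shifted and zoomed
other_face = face * np.array([1.4, 0.7])                # different geometry

d_same = landmark_distance(face, same_face_scaled)
d_other = landmark_distance(face, other_face)
```

A small `d` against any database entry would trigger the copyright check described in the abstract.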

46 pages, 22593 KB  
Article
A Fully Automated SETSM Framework for Improving the Quality of GCP-Free DSMs Generated from Multiple PlanetScope Stereo Pairs
by Myoung-Jong Noh and Ian M. Howat
Remote Sens. 2026, 18(5), 806; https://doi.org/10.3390/rs18050806 - 6 Mar 2026
Viewed by 241
Abstract
We investigate the potential of frequent repeat imagery acquired by the PlanetScope Dove small satellite constellation to overcome temporal and spatial limitations in automated surface topography mapping. While individual PlanetScope Dove stereo pairs produce low-quality Digital Surface Models (DSMs) with large height uncertainties, the high temporal frequency enables multiple DSMs to enhance accuracy through multiple-pair image matching. We present a fully automated SETSM framework that improves the quality of PlanetScope Dove DSMs based on the SETSM Multi-Pair Matching Procedure (SETSM MMP). This framework enhances stereo pair quality through optimized stereo pair selection using sequential conditional filtering and a Weighted Stereo Pair Index (WSPI). A novel inter-plane vertical coregistration, which minimizes scaling errors between single-stereo-pair DSMs, was developed to improve consistency and accuracy in DSM quality without reference surfaces. Applied to the cloud-obscured Pantasma crater region in Nicaragua, the optimized stereo pair selection automatically selects well-defined stereo pairs. The inter-plane vertical coregistration without existing reference surfaces achieves up to a 43% Root Mean Square Error (RMSE) reduction and a 26% improvement in the distribution within a 5 m vertical error. DSM quality correlated strongly with tile size, stereo pair convergence angle, asymmetric angle, and terrain-dependent scale variability. The proposed framework provides fully automatic, high-quality PlanetScope Dove DSMs without Ground Control Points (GCPs). Full article
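The paper's inter-plane vertical coregistration works without reference surfaces; the toy sketch below shows only the simpler underlying operation, removing a robust vertical offset between two height surfaces via their median difference:

```python
import numpy as np

def coregister_vertical(dsm, reference, mask=None):
    """Shift a DSM vertically so its median height difference to a
    reference surface (over an optional stable-area mask) is zero."""
    dz = dsm - reference
    if mask is not None:
        dz = dz[mask]
    offset = float(np.nanmedian(dz))
    return dsm - offset, offset

# synthetic DSM pair: same terrain, 3.2 m vertical bias plus 0.5 m noise
rng = np.random.default_rng(5)
ref = rng.uniform(100, 200, (50, 50))
dsm = ref + 3.2 + rng.normal(0, 0.5, (50, 50))
aligned, offset = coregister_vertical(dsm, ref)
```

The median makes the shift robust to clouds and surface change, which a mean offset would not be.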

18 pages, 1354 KB  
Article
Design and Performance Validation of 4D Radar ICP-Integrated Navigation with Stochastic Cloning Augmentation
by Hyeongseob Shin, Dongha Kwon and Sangkyung Sung
Sensors 2026, 26(5), 1660; https://doi.org/10.3390/s26051660 - 5 Mar 2026
Viewed by 337
Abstract
Automotive radar has emerged as a pivotal technology for navigation in GNSS-denied environments, offering superior robustness to adverse weather and fluctuating lighting conditions compared to vision or LiDAR-based sensors. Despite these advantages, the inherent sparsity and noise of radar measurements often lead to degraded estimation accuracy and system reliability. To address these challenges, various radar-based localization frameworks have been explored, ranging from optimization-based and Extended Kalman Filter (EKF) approaches fused with Inertial Measurement Units (IMUs) to point cloud registration techniques like Iterative Closest Point (ICP). While filter-based methods are favored in multi-sensor fusion for their proven stability, ICP is widely utilized for high-precision pose estimation in point-cloud-centric systems. In this study, we propose a novel Radar-Inertial Odometry (RIO) framework that synergistically integrates ICP-based relative pose estimation with model-based sensor fusion. The proposed methodology leverages relative transformations derived from ICP alongside ego-velocity estimations obtained from radar Doppler measurements. To effectively incorporate relative ICP constraints, a stochastic cloning technique is implemented to augment previous states and their associated covariances, ensuring that the uncertainty of historical poses is explicitly accounted for. The performance of the proposed method is validated using public open-source datasets, demonstrating higher localization accuracy and more consistent performance compared to existing algorithms used for comparison. Full article
(This article belongs to the Section Navigation and Positioning)
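The stochastic cloning step augments the filter state with a copy of a past pose and expands the covariance so the clone starts perfectly correlated with the original. A minimal sketch of that bookkeeping (the state layout is hypothetical):

```python
import numpy as np

def clone_state(x, P, pose_idx):
    """Stochastic cloning: append a copy of the pose sub-state to the
    state vector and augment the covariance via the selection Jacobian,
    so the clone is fully correlated with the original at cloning time."""
    n = len(x)
    J = np.vstack([np.eye(n), np.eye(n)[pose_idx]])
    return np.concatenate([x, x[pose_idx]]), J @ P @ J.T

x = np.array([1.0, 2.0, 0.5])      # e.g. [px, py, yaw]
P = np.diag([0.1, 0.2, 0.05])
x_aug, P_aug = clone_state(x, P, pose_idx=[0, 1])  # clone the position block
```

Subsequent prediction steps update only the live state, so the cross-covariance between clone and current pose correctly encodes the uncertainty of the relative ICP constraint.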

22 pages, 27725 KB  
Article
A Shadow Geometry Approach for Olive Tree Canopy Volume Estimation Using WorldView-3 Multispectral Imagery
by Raffaella Brigante, Valerio Baiocchi, Laura Marconi, Alessandra Vinci, Roberto Calisti, Luca Regni, Fabio Radicioni and Primo Proietti
Remote Sens. 2026, 18(5), 779; https://doi.org/10.3390/rs18050779 - 4 Mar 2026
Viewed by 372
Abstract
The accurate estimation of tree canopy volume is fundamental in precision agriculture for quantifying vegetation structure, biomass, and productivity in perennial cropping systems. This study investigates a shadow geometry approach for estimating olive tree canopy volumes from a single, very high-resolution WorldView-3 multispectral image. The method integrates multispectral classification for canopy and shadow delineation with a geometric model that infers canopy height from shadow measurements, accounting for solar position and terrain morphology. Two classification strategies were evaluated: object-based image analysis (OBIA) and pixel-based (PB) classification, each applied to the original eight-band multispectral image and to a derived dataset enriched with vegetation indices (NDVI—Normalized Difference Vegetation Index; NDRE—Normalized Difference Red Edge Index) and principal component analysis (PCA) components. The canopy volume was estimated by integrating classified canopy and shadow areas with shadow-derived canopy height. The methodology was tested in a Mediterranean olive orchard and validated against UAV-derived point clouds for approximately 700 trees. The results indicate that the approach captures spatial variability in canopy structure. The Object-Based Image Analysis (OBIA) applied to filtered PCA-enhanced imagery achieved the highest accuracy in canopy volume estimation (RMSE = 2.04 m3; R2 = 0.56), outperforming the alternative pixel-based (PB) classification applied to the original multispectral data. Overall, the study demonstrates the potential of single-image WorldView-3 data for rapid and scalable three-dimensional canopy characterization in precision agriculture. Full article
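The core shadow-geometry relation is elementary: on flat terrain, a canopy of height h casts a shadow of length L = h / tan(α) at solar elevation α. A one-function sketch (the paper's terrain-morphology correction is omitted here):

```python
import numpy as np

def canopy_height_from_shadow(shadow_len_m, sun_elev_deg):
    """Flat-terrain shadow geometry: canopy-top height h = L * tan(alpha),
    where L is the measured shadow length and alpha the solar elevation.
    (Terrain-slope corrections, as used in the paper, are omitted.)"""
    return shadow_len_m * np.tan(np.radians(sun_elev_deg))

# a 5 m shadow under a 45-degree sun implies a ~5 m tall canopy
h = canopy_height_from_shadow(5.0, 45.0)
```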

16 pages, 2080 KB  
Article
Lidar–Vision Depth Fusion for Robust Loop Closure Detection in SLAM Systems
by Bingzhuo Liu, Panlong Wu, Rongting Chen, Yidan Zheng and Mengyu Li
Machines 2026, 14(3), 282; https://doi.org/10.3390/machines14030282 - 3 Mar 2026
Viewed by 412
Abstract
Loop Closure Detection (LCD) is a key component of Simultaneous Localization and Mapping (SLAM) systems, responsible for correcting odometric drift and maintaining global consistency in localization and mapping. However, single-modality LCD methods suffer from inherent limitations: LiDAR-based approaches are affected by point cloud sparsity, limiting feature representation in unstructured environments, while vision-based methods are sensitive to illumination and weather variations, reducing robustness. To address these issues, this paper presents a LiDAR–vision multimodal fusion LCD algorithm. Spatiotemporal alignment between LiDAR point clouds and images is achieved through extrinsic calibration and timestamp interpolation to ensure cross-modal consistency. Harris corner detection and BRIEF descriptors are employed to extract visual features, and a LiDAR-projected sparse depth map is used to complete depth information, mapping 2D features into 3D space. A hybrid feature representation is then constructed by fusing LiDAR geometric triangle descriptors with visual BRIEF descriptors, enabling efficient loop candidate retrieval via hash indexing. Finally, an improved RANSAC algorithm performs geometric verification to enhance the robustness of relative pose estimation. Experiments on the KITTI and NCLT datasets show that the proposed method achieves average F1 scores of 85.28% and 77.63%, respectively, outperforming both unimodal and existing multimodal approaches. When integrated into a SLAM framework, it reduces the Absolute Trajectory Error (ATE) RMSE by 11.2–16.4% compared with LiDAR-only methods, demonstrating improved loop detection accuracy and overall system robustness in complex environments. Full article
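Loop-candidate retrieval via hash indexing of geometric triangle descriptors can be sketched by quantising sorted side lengths into hashable keys and voting over keyframes. The descriptors below are made up for illustration:

```python
from collections import defaultdict

def tri_key(sides, res=0.1):
    """Quantise sorted triangle side lengths (metres) into a hashable key.
    (Quantisation near bin boundaries is a known caveat of this scheme.)"""
    return tuple(round(s / res) for s in sorted(sides))

# build a hash index: quantised triangle -> keyframe ids containing it
database = {
    0: [(1.0, 2.0, 2.5), (0.8, 1.1, 1.6)],
    1: [(3.0, 3.1, 4.0)],
}
index = defaultdict(set)
for frame_id, tris in database.items():
    for t in tris:
        index[tri_key(t)].add(frame_id)

# query: vote for frames sharing quantised triangles with the current scan
query = [(2.5, 1.0, 2.0), (3.0, 3.1, 4.0)]
votes = defaultdict(int)
for t in query:
    for fid in index.get(tri_key(t), ()):
        votes[fid] += 1
```

Sorting the sides makes the key invariant to vertex ordering; frames with the most votes become loop candidates for geometric verification.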

14 pages, 3278 KB  
Article
SESQ: Spatially Aware Encoding and Semantically Guided Querying for 3D Grounding
by Jinyuan Li, Yundong Wu, Tiancai Huang and Mengyun Cao
Computers 2026, 15(3), 145; https://doi.org/10.3390/computers15030145 - 1 Mar 2026
Viewed by 318
Abstract
3D visual grounding is a fundamental task for human–machine interaction, aiming to localize specific objects in complex 3D point clouds based on natural language descriptions. Despite recent advancements, existing Transformer-based architectures often rely on absolute position embeddings and heuristic query initialization, which lack the capacity to capture fine-grained relative spatial dependencies and fail to effectively filter out scene clutter. In this paper, we propose SESQ, a novel framework that synergizes Spatially Aware Encoding and Semantically Guided Querying for 3D grounding. Our approach introduces two key innovations. First, we propose the Rotary Spatially Aware Encoder (RSAE), which incorporates Rotary Position Embeddings (RoPE) into the self-attention layers. By transforming 3D coordinates into a rotary representation, RSAE enables the model to inherently capture relative spatial distances and maintains geometric consistency throughout the encoding stage. Second, a Semantic Query Initialization (SQI) module is designed to initialize object queries by explicitly computing the cross-modal similarity between textual embeddings and visual point cloud features. By replacing traditional heuristic-based selection with semantic-aware alignment, SQI ensures that the decoding process originates from contextually relevant object candidates, significantly reducing the impact of task-irrelevant distractors. Extensive experiments on ScanRefer and ReferIt3D (Nr3D/Sr3D) benchmarks demonstrate the effectiveness of our framework. Compared to the baseline EDA, our method achieves a significant performance gain of 2.68% in overall Acc@0.5 on ScanRefer, a 4.9% improvement on the challenging Nr3D “Hard” subset, and a 1.1% increase in overall Acc@0.25 on Sr3D. Full article
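The rotary encoding the abstract attributes to RSAE can be sketched in miniature: split the feature dimensions into three groups (one per spatial axis) and rotate each pair of channels by an angle proportional to the corresponding coordinate, so that attention dot products depend only on relative offsets. This is a generic 3D extension of RoPE under assumed conventions (the frequency schedule, the per-axis channel split, and the function name are ours), not the paper's exact formulation.

```python
import numpy as np

def rope_3d(feats, coords, base=10000.0):
    """Rotary position embedding driven by 3D coordinates.

    feats:  (N, D) point features, D divisible by 6
            (each axis gets D/3 channels, i.e. D/6 rotation pairs).
    coords: (N, 3) xyz coordinates of the points.
    """
    n, d = feats.shape
    assert d % 6 == 0, "feature dim must split into 3 axes of channel pairs"
    d_axis = d // 3
    half = d_axis // 2
    # Geometric frequency schedule, as in standard 1D RoPE.
    inv_freq = base ** (-np.arange(half) / half)          # (half,)
    out = np.empty_like(feats)
    for a in range(3):                                    # x, y, z axes
        block = feats[:, a * d_axis:(a + 1) * d_axis]
        ang = coords[:, a:a + 1] * inv_freq               # (N, half)
        cos, sin = np.cos(ang), np.sin(ang)
        x1, x2 = block[:, :half], block[:, half:]
        # 2D rotation applied channel-pair-wise; preserves feature norms.
        out[:, a * d_axis:(a + 1) * d_axis] = np.concatenate(
            [x1 * cos - x2 * sin, x1 * sin + x2 * cos], axis=1)
    return out
```

Because each pair is only rotated, the encoding is norm-preserving, and the inner product between two encoded features depends on the coordinate difference rather than absolute positions, which is the relative-spatial property the abstract emphasizes.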
(This article belongs to the Special Issue Advanced Image Processing and Computer Vision (2nd Edition))
19 pages, 4237 KB  
Article
Intelligent Measurement of Concrete Crack Width Based on U-Net Deep Learning and Binocular Vision 3D Reconstruction
by Dedong Xiao, Gaoxin Wang, Kai Wang, Shukui Liu, Guangbin Shang, Qi-Ang Wang, Xiaohua Fan, Minghui Hu, Richeng Liu, Guozhao Chen and Zhihao Chen
Appl. Sci. 2026, 16(5), 2355; https://doi.org/10.3390/app16052355 - 28 Feb 2026
Viewed by 314
Abstract
The concrete cracking problem can seriously affect the durability and safety of civil structures. Accurately and quickly measuring the width of concrete cracks can help control defect development in a timely manner. Current research mainly relies on pixel detection of two-dimensional images, which lacks real three-dimensional information about crack defects. Detection results are also strongly affected by factors such as shooting distance and posture, resulting in poor accuracy. Therefore, this paper presents an engineering-integrated solution that combines U-Net-based crack segmentation with binocular vision 3D reconstruction. The focus is placed on the practical deployment of the integrated pipeline, the optimization of key parameters under real inspection conditions, and the experimental validation of measurement accuracy on actual concrete cracks. Firstly, the U-Net deep learning algorithm is used to automatically identify and segment the concrete crack region; then, a binocular vision-based 3D reconstruction pipeline is adopted, a parallax rejection algorithm based on a "double-threshold" decision is proposed to improve the fidelity of crack disparity maps, and the effect of the filter window size on the concrete crack region is analyzed; finally, an intelligent measurement method based on the 3D reconstruction model is proposed, so that concrete crack width can be calculated directly from the reconstructed model. The results show that (1) the model can identify crack characteristics, with detection performing best at 4:00 p.m., when the light is more uniform, shadows are fewer, and the contrast between the crack and its background is moderate; (2) a 9 × 9 filtering window yields the best 3D point cloud reconstruction of the concrete crack; (3) the maximum error between the calculated and measured crack widths is 0.31 mm, the minimum error is 0.07 mm, and the average error is 0.15 mm, indicating that the measurement accuracy reaches the sub-millimetre level and verifying the validity of the proposed method. Full article
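The "double-threshold" parallax rejection the abstract mentions can be illustrated with a minimal sketch: disparities falling outside a plausible [low, high] band are marked invalid before the disparity map is lifted into a 3D point cloud. The function name, the invalid-value convention, and the thresholds below are assumptions for illustration, not the paper's parameters.

```python
import numpy as np

def reject_disparities(disp, low, high, invalid=0.0):
    """Double-threshold rejection on a disparity map.

    Pixels whose disparity lies outside [low, high] are replaced with
    `invalid`, so spurious matches do not corrupt the 3D reconstruction.
    """
    out = disp.copy()
    out[(disp < low) | (disp > high)] = invalid
    return out
```

In a full pipeline, the surviving disparities would be smoothed with a filtering window (the abstract finds 9 × 9 best for its cracks) and then reprojected to 3D via the calibrated stereo geometry before measuring crack width.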
(This article belongs to the Section Civil Engineering)