Search Results (189)

Search Parameters:
Keywords = semantic SLAM

22 pages, 4435 KB  
Article
Semantic Mapping in Public Indoor Environments Using Improved Instance Segmentation and Continuous-Frame Dynamic Constraint
by Yumin Lu, Xueyu Feng, Zonghuan Guo, Jianchao Wang, Lin Zhou and Yingcheng Lin
Electronics 2026, 15(7), 1392; https://doi.org/10.3390/electronics15071392 - 26 Mar 2026
Viewed by 312
Abstract
Reliable semantic perception is crucial for service robots operating in complex public indoor environments. However, existing semantic mapping approaches often face the dual challenges of high computational overhead and semantic redundancy in maps. To address these limitations, this paper proposes a low-resource semantic mapping framework based on improved instance segmentation and dynamic constraints from consecutive frames. First, we design the lightweight model MS-YOLO, which adopts MobileNetV4 as its backbone network and incorporates the SHViT neck module, effectively optimizing the balance between detection accuracy and computational cost. Second, we propose a consecutive frame dynamic constraint method that eliminates redundant object annotations through consecutive frame stability verification. Experimental results on both fusion and custom datasets demonstrate that compared to YOLOv8n-seg, MS-YOLO achieves improvements in accuracy, recall, and mAP@0.5, while reducing the number of parameters by 11.7% and floating-point operations (FLOPs) by 32.2%. Furthermore, compared to YOLOv11n-seg and YOLOv5n-seg, its FLOPs are reduced by 17.2% and 25.5%, respectively. Finally, the successful deployment and field validation of this system on the Jetson Orin NX platform demonstrate its real-time capability and engineering practicality for edge computing in public indoor service robots.
(This article belongs to the Section Artificial Intelligence)

22 pages, 26802 KB  
Article
Attention-Guided Semantic Segmentation and Scan-to-Model Geometric Reconstruction of Underground Tunnels from Mobile Laser Scanning
by Yingjia Huang, Jiang Ye, Xiaohui Li and Jingliang Du
Appl. Sci. 2026, 16(6), 3042; https://doi.org/10.3390/app16063042 - 21 Mar 2026
Viewed by 254
Abstract
Mobile Laser Scanning (MLS) integrated with Simultaneous Localization and Mapping (SLAM) has emerged as a key technology for digitizing GNSS-denied environments, such as underground mines. However, the automated interpretation of unstructured, high-density point clouds into semantic engineering models remains challenging due to extreme geometric anisotropy in point distributions and severe class imbalance inherent to narrow tunnel environments. To address these issues, this study proposes a highly automated scan-to-model framework for precise semantic segmentation and vectorized two-dimensional (2D) profile reconstruction. First, an enhanced hierarchical deep learning network tailored for point clouds is introduced. The architecture incorporates a context-aware sampling strategy with an expanded receptive field of up to 10 m to preserve axial continuity, coupled with a spatial–geometric dual-attention mechanism to refine boundary delineation. In addition, a composite Focal–Dice loss function is employed to alleviate the dominance of wall points during network training. Experimental validation on a field-collected dataset comprising 16 mine tunnels demonstrates that the proposed model achieves a mean Intersection over Union (mIoU) of 85.15% (±0.29%) and an Overall Accuracy (OA) of 95.13% (±0.13%). Building on this semantic foundation, a robust geometric modeling pipeline is established using curvature-guided filtering and density-adaptive B-spline fitting. The reconstructed profiles accurately recover the geometric mean surface of the tunnel wall, yielding an overall filtered Root Mean Square Error (RMSE) of 4.96 ± 0.48 cm. The proposed framework provides an efficient end-to-end solution for deformation analysis and digital twinning of underground mining infrastructure.
(This article belongs to the Special Issue Artificial Intelligence Applications in Underground Space Technology)

25 pages, 6915 KB  
Article
EXAONE-VLA: A Unified Vision–Language Framework for Mobile Manipulation via Semantic Topology and Hierarchical LLM Reasoning
by Jeong-Seop Park, Yong-Jun Lee, Jong-Chan Park, Sung-Gil Park, Jong-Jin Woo and Myo-Taeg Lim
Appl. Sci. 2026, 16(5), 2600; https://doi.org/10.3390/app16052600 - 9 Mar 2026
Viewed by 584
Abstract
This paper proposes a unified vision–language framework that translates user instructions into navigation for the mobile base and actions for the manipulator in indoor environments. In general, occupancy grid maps constructed via SLAM capture solely the geometric layout of the environment. This renders the robot incapable of leveraging the semantic information required for object distinction. The proposed method encodes semantic information from vision–language models and the robot’s pose in a textual format, referred to as a semantic topological graph. Specifically, models including GroundingDINO, LG EXAONE, and SAM2 extract object-level semantic information, which is subsequently used to identify room characteristics. A large language model then interprets user instructions to identify the final destination for navigation within the semantic topological graph, followed by reasoning to determine the suitable action network. Notably, the proposed text-based representation facilitates a substantial reduction in inference time, and its effectiveness is validated through real-world experiments.
(This article belongs to the Special Issue Deep Reinforcement Learning for Multiagent Systems)

33 pages, 12968 KB  
Article
Tunnel-SLAM: Low-Cost LiDAR/Vision/RTK/Inertial Integration on Vehicles for Roadway Tunnels
by Zeyu Li, Xian Wu, Jianhui Cui, Ying Xu, Rufei Liu, Rui Tu and Wei Jiang
Electronics 2026, 15(5), 1101; https://doi.org/10.3390/electronics15051101 - 6 Mar 2026
Viewed by 449
Abstract
Reliable positioning and mapping in roadway tunnels are crucial for vehicle-based monitoring and inspection, especially considering challenging environmental conditions such as rapidly changing illumination, low-texture environments, and repetitive structural elements. While general LiDAR-inertial odometry (LIO) frameworks and loop-closure detection methods are effective in general scenarios, they often suffer from severe drift or incorrect loop constraints under these specific conditions. These challenges are further exacerbated by the inherent uncertainties associated with low-cost sensors. This paper introduces a narrow field-of-view LiDAR-centric RTK-visual-inertial SLAM system enhanced by three key modules: semantic-assisted loop detection and matching, two-stage RTK quality control, and adaptive factor graph optimization (FGO). In the first module, the proposed semantic loop descriptor (SLD) matching is used to determine potential loop closure locations and then integrates the corresponding constraints as graph nodes. The quality control module addresses RTK outlier rejection during tunnel entry and exit, employing an event-driven stochastic model to characterize the uncertainty between RTK and the other sensors, effectively suppressing RTK-induced errors. The FGO module performs optimization by incorporating LIO, RTK, and loop closure factors, employing a keyframe-based strategy to produce globally optimized poses while continuously updating the map. The proposed Tunnel-SLAM was evaluated against state-of-the-art SLAM algorithms in four extended roadway tunnels, with traveling distances ranging from approximately 5 to 10 km. Experimental results demonstrate that the proposed SLAM achieved a final drift of less than 2 m with loop closure, significantly reducing drift, while other existing SLAM frameworks fail catastrophically or exhibit large drift.
(This article belongs to the Special Issue Simultaneous Localization and Mapping (SLAM) of Mobile Robots)

17 pages, 12829 KB  
Article
Stereo Gaussian Splatting with Adaptive Scene Depth Estimation for Semantic Mapping
by Chenhui Fu and Jiangang Lu
J. Imaging 2026, 12(3), 105; https://doi.org/10.3390/jimaging12030105 - 28 Feb 2026
Viewed by 395
Abstract
Simultaneous Localization and Mapping (SLAM) is a fundamental capability in robotics and augmented reality. However, achieving accurate geometric reconstruction and consistent semantic understanding in complex environments remains challenging. Although recent neural implicit representations have improved reconstruction quality, they often suffer from high computational cost and the forgetting phenomenon during online mapping. In this paper, we propose StereoGS-SLAM, a stereo semantic SLAM framework based on 3D Gaussian Splatting (3DGS) for explicit scene representation. Unlike existing approaches, StereoGS-SLAM operates on passive RGB stereo inputs without requiring active depth sensors. An adaptive depth estimation strategy is introduced to dynamically refine Gaussian scales based on real-time stereo depth estimates, ensuring robust and scale-consistent reconstruction. In addition, we propose a hybrid keyframe selection strategy that integrates motion-aware selection with lightweight random sampling to improve keyframe diversity and maintain stable, real-time optimization. Experimental evaluations demonstrate that StereoGS-SLAM achieves consistent and competitive localization, rendering, and semantic reconstruction performance compared with recent 3DGS-based SLAM systems.
(This article belongs to the Section Computer Vision and Pattern Recognition)

25 pages, 15267 KB  
Article
3D Semantic Map Reconstruction for Orchard Environments Using Multi-Sensor Fusion
by Quanchao Wang, Yiheng Chen, Jiaxiang Li, Yongxing Chen and Hongjun Wang
Agriculture 2026, 16(4), 455; https://doi.org/10.3390/agriculture16040455 - 15 Feb 2026
Viewed by 698
Abstract
Semantic point cloud maps play a pivotal role in smart agriculture. They provide not only core three-dimensional data for orchard management but also empower robots with environmental perception, enabling safer and more efficient navigation and planning. However, traditional point cloud maps primarily model surrounding obstacles from a geometric perspective, failing to capture distinctions and characteristics between individual obstacles. In contrast, semantic maps encompass semantic information and even topological relationships among objects in the environment. Furthermore, existing semantic map construction methods are predominantly vision-based, making them ill-suited to handle rapid lighting changes in agricultural settings that can cause positioning failures. Therefore, this paper proposes a positioning and semantic map reconstruction method tailored for orchards. It integrates visual, LiDAR, and inertial sensors to obtain high-precision pose and point cloud maps. By combining open-vocabulary detection and semantic segmentation models, it projects two-dimensional detected semantic information onto the three-dimensional point cloud, ultimately generating a point cloud map enriched with semantic information. The resulting 2D occupancy grid map is utilized for robotic motion planning. Experimental results demonstrate that on a custom dataset, the proposed method achieves 74.33% mIoU for semantic segmentation accuracy, 12.4% relative error for fruit recall rate, and 0.038803 m mean translation error for localization. The deployed semantic segmentation network Fast-SAM achieves a processing speed of 13.36 ms per frame. These results demonstrate that the proposed method combines high accuracy with real-time performance in semantic map reconstruction. This exploratory work provides theoretical and technical references for future research on more precise localization and more complete semantic mapping, offering broad application prospects and providing key technological support for intelligent agriculture.
(This article belongs to the Special Issue Advances in Robotic Systems for Precision Orchard Operations)

29 pages, 33196 KB  
Article
Robust Autonomous Perception for Indoor Service Machines via Geometry-Aware RGB-D SLAM and Probabilistic Dynamic Modeling
by Zhiyu Wang, Weili Ding and Wenna Wang
Machines 2026, 14(2), 222; https://doi.org/10.3390/machines14020222 - 12 Feb 2026
Viewed by 327
Abstract
Reliable autonomous perception is essential for indoor service machines operating in human-centered environments, where weak textures, repetitive structures, and frequent dynamic interference often degrade localization stability. Conventional RGB-D SLAM systems typically rely on static-scene assumptions or binary semantic masking, which are insufficient for handling persistent and fine-grained environmental dynamics. This paper presents a robust autonomous perception framework based on geometry-aware RGB-D SLAM, with a particular emphasis on probabilistic dynamic modeling at the feature level. The proposed system integrates multi-granularity geometric representations, including point features, parallel-line structures, and planar regions, to enhance geometric observability in low-texture indoor environments. On this basis, a probabilistic dynamic model is introduced to explicitly characterize feature reliability under motion, where dynamic probabilities are initialized by object detection and continuously updated through temporal consistency, spatial propagation, and multi-view geometric verification. Large-scale planar structures further serve as stable anchors to support robust pose estimation. Experimental results on the TUM RGB-D dynamic benchmark demonstrate that the proposed method significantly improves localization robustness, reducing the average ATE RMSE by approximately 66% compared with representative dynamic SLAM baselines. Additional evaluations on a real-world indoor dataset further validate its effectiveness for long-term autonomous perception under dense motion and frequent occlusions.
(This article belongs to the Section Robotics, Mechatronics and Intelligent Machines)

27 pages, 12640 KB  
Article
A Suitable Scan-to-BIM Process Using OS Software and Low-Cost Sensors: Trend, Solutions and Experimental Validation
by Massimiliano Pepe, Przemysław Klapa, Andrei Crisan, Ahmed Kamal Hamed Dewedar and Donato Palumbo
Architecture 2026, 6(1), 24; https://doi.org/10.3390/architecture6010024 - 5 Feb 2026
Viewed by 918
Abstract
Open-source software is transforming visualization-oriented digital documentation and conceptual BIM by lowering financial and technical barriers, enabling broader participation in the digitalization of the AEC sector. This study develops and validates a cost-effective Scan-to-BIM workflow that combines low-cost hardware with freely available software for 3D data acquisition, processing, and modeling. Photogrammetry and SLAM-based techniques generate accurate point clouds, which, once verified against terrestrial laser scanning data, can be integrated into open-source BIM environments. The workflow leverages COLMAP for 3D reconstruction and BlenderBIM for parametric modeling, combining geometric and semantic information to produce fully interoperable models. While open-source tools offer accessibility and transparency, they require supplementary validation in precision-critical applications and may involve trade-offs in accuracy, stability, and automation compared to commercial solutions. Application to a case study demonstrates the efficiency and speed of the workflow, reflecting a broader trend in the scientific community.

29 pages, 627 KB  
Review
Learning-Based Multi-Robot Active SLAM: A Conceptual Framework and Survey
by Bowen Lv and Shihong Duan
Appl. Sci. 2026, 16(3), 1412; https://doi.org/10.3390/app16031412 - 30 Jan 2026
Cited by 1 | Viewed by 728
Abstract
Multi-robot systems (MRSs) offer distinct advantages in large-scale exploration but require tight coupling between decentralized decision-making and collaborative estimation. This survey reviews learning-based multi-robot Active Collaborative Simultaneous Localization and Mapping (AC-SLAM), modeling it as a coupled system comprising a Decentralized Partially Observable Markov Decision Process (Dec-POMDP) decision layer and a distributed factor-graph estimation layer. By synthesizing these components into a conceptual framework, recent methods for cooperative perception, mapping, and policy learning are systematically critiqued. The analysis concludes that Hierarchical Reinforcement Learning (HRL) and graph-based spatial abstraction currently offer superior scalability and robustness compared to monolithic end-to-end approaches. Furthermore, a comprehensive analysis of Sim-to-Real transfer strategies is provided, ranging from domain randomization to emerging Real-to-Sim techniques based on NeRF and 3D Gaussian Splatting. Finally, future directions are outlined, moving from geometric mapping toward LLM-driven active semantic understanding and dynamic digital twins to bridge the reality gap.
(This article belongs to the Special Issue Applications of Robot Navigation in Autonomous Systems)

18 pages, 1767 KB  
Article
Integrating Roadway Sign Data and Biomimetic Path Integration for High-Precision Localization in Unstructured Coal Mine Roadways
by Miao Yu, Zilong Zhang, Xi Zhang, Junjie Zhang, Bin Zhou and Bo Chen
Electronics 2026, 15(3), 528; https://doi.org/10.3390/electronics15030528 - 26 Jan 2026
Viewed by 317
Abstract
High-precision autonomous localization remains a critical challenge for intelligent mining vehicles in GNSS-denied and unstructured coal mine roadways, where traditional odometry-based methods suffer from severe cumulative drift and perceptual aliasing. Inspired by the synergy between mammalian visual cues and cognitive neural mechanisms, this paper proposes a robust biomimetic localization framework that integrates multi-source perception with a prior cognitive map. The core contributions are three-fold: First, a semantic-enhanced biomimetic localization method is developed, leveraging roadway sign data as absolute spatial anchors to suppress long-distance cumulative errors. Second, an optimized head direction (HD) cell model is formulated by incorporating a speed balance factor, kinematic constraints, and a drift correction influence factor, significantly improving the precision of angular perception. Third, boundary-adaptive and sign-based semantic constraint terms are integrated into a continuous attractor network (CAN)-based path integration model, effectively preventing trajectory deviation into non-navigable regions. Comprehensive evaluations conducted in large-scale underground scenarios demonstrate that the proposed framework consistently outperforms conventional IMU-odometry fusion, representative 3D SLAM solutions, and baseline biomimetic algorithms. By effectively integrating semantic landmarks as spatial anchors, the system exhibits superior resilience against cumulative drift, maintaining high localization precision where standard methods typically diverge. The results confirm that our approach significantly enhances both trajectory consistency and heading stability across extensive distances, validating its robustness and scalability in handling the inherent complexities of unstructured coal mine environments for enhanced intrinsic safety.

28 pages, 9411 KB  
Article
A Real-Time Mobile Robotic System for Crack Detection in Construction Using Two-Stage Deep Learning
by Emmanuella Ogun, Yong Ann Voeurn and Doyun Lee
Sensors 2026, 26(2), 530; https://doi.org/10.3390/s26020530 - 13 Jan 2026
Viewed by 738
Abstract
The deterioration of civil infrastructure poses a significant threat to public safety, yet conventional manual inspections remain subjective, labor-intensive, and constrained by accessibility. To address these challenges, this paper presents a real-time robotic inspection system that integrates deep learning perception and autonomous navigation. The proposed framework employs a two-stage neural network: a U-Net for initial segmentation followed by a Pix2Pix conditional generative adversarial network (GAN) that utilizes adversarial residual learning to refine boundary accuracy and suppress false positives. When deployed on an Unmanned Ground Vehicle (UGV) equipped with an RGB-D camera and LiDAR, this framework enables simultaneous automated crack detection and collision-free autonomous navigation. Evaluated on the CrackSeg9k dataset, the two-stage model achieved a mean Intersection over Union (mIoU) of 73.9 ± 0.6% and an F1-score of 76.4 ± 0.3%. Beyond benchmark testing, the robotic system was further validated through simulation, laboratory experiments, and real-world campus hallway tests, successfully detecting micro-cracks as narrow as 0.3 mm. Collectively, these results demonstrate the system’s potential for robust, autonomous, and field-deployable infrastructure inspection.
(This article belongs to the Special Issue Sensing and Control Technology of Intelligent Robots)

22 pages, 416 KB  
Review
A Roadmap of Mathematical Optimization for Visual SLAM in Dynamic Environments
by Hui Zhang, Xuerong Zhao, Ruixue Luo, Ziyu Wang, Gang Wang and Kang An
Mathematics 2026, 14(2), 264; https://doi.org/10.3390/math14020264 - 9 Jan 2026
Viewed by 715
Abstract
The widespread application of robots in complex and dynamic environments demands that Visual SLAM be both robust and accurate. However, dynamic objects, varying illumination, and environmental complexity fundamentally challenge the static world assumptions underlying traditional SLAM methods. This review provides a comprehensive investigation into the mathematical foundations of V-SLAM and systematically analyzes the key optimization techniques developed for dynamic environments, with particular emphasis on advances since 2020. We begin by rigorously deriving the probabilistic formulation of V-SLAM and its basis in nonlinear optimization, unifying it under a Maximum a Posteriori (MAP) estimation framework. We then propose a taxonomy based on how dynamic elements are handled mathematically, which reflects the historical evolution from robust estimation to semantic modeling and then to deep learning. This framework provides detailed analysis of three main categories: (1) robust estimation theory-based methods for outlier rejection, elaborating on the mathematical models of M-estimators and switch variables; (2) semantic information and factor graph-based methods for explicit dynamic object modeling, deriving the joint optimization formulation for multi-object tracking and SLAM; and (3) deep learning-based end-to-end optimization methods, discussing their mathematical foundations and interpretability challenges. This paper delves into the mathematical principles, performance boundaries, and theoretical controversies underlying these approaches, concluding with a summary of future research directions informed by the latest developments in the field. The review aims to provide both a solid mathematical foundation for understanding current dynamic V-SLAM techniques and inspiration for future algorithmic innovations. By adopting a math-first perspective and organizing the field through its core optimization paradigms, this work offers a clarifying framework for both understanding and advancing dynamic V-SLAM.
(This article belongs to the Section E2: Control Theory and Mechanics)

18 pages, 7305 KB  
Article
SERail-SLAM: Semantic-Enhanced Railway LiDAR SLAM
by Weiwei Song, Shiqi Zheng, Xinye Dai, Xiao Wang, Yusheng Wang, Zihao Wang, Shujie Zhou, Wenlei Liu and Yidong Lou
Machines 2026, 14(1), 72; https://doi.org/10.3390/machines14010072 - 7 Jan 2026
Viewed by 750
Abstract
Reliable state estimation in railway environments presents significant challenges due to geometric degeneracy resulting from repetitive structural layouts and point cloud sparsity caused by high-speed motion. Conventional LiDAR-based SLAM systems frequently suffer from longitudinal drift and mapping artifacts when operating in such feature-scarce and dynamically complex scenarios. To address these limitations, this paper proposes SERail-SLAM, a robust semantic-enhanced multi-sensor fusion framework that tightly couples LiDAR odometry, inertial pre-integration, and GNSS constraints. Unlike traditional approaches that rely on rigid voxel grids or binary semantic masking, we introduce a Semantic-Enhanced Adaptive Voxel Map. By leveraging eigen-decomposition of local point distributions, this mapping strategy dynamically preserves fine-grained stable structures while compressing redundant planar surfaces, thereby enhancing spatial descriptiveness. Furthermore, to mitigate the impact of environmental noise and segmentation uncertainty, a confidence-aware filtering mechanism is developed. This method utilizes raw segmentation probabilities to adaptively weight input measurements, effectively distinguishing reliable landmarks from clutter. Finally, a category-weighted joint optimization scheme is implemented, where feature associations are constrained by semantic stability priors, ensuring globally consistent localization. Extensive experiments on real-world railway datasets demonstrate that the proposed system achieves superior accuracy and robustness compared to state-of-the-art geometric and semantic SLAM methods.
(This article belongs to the Special Issue Dynamic Analysis and Condition Monitoring of High-Speed Trains)

19 pages, 38545 KB  
Article
Improving Dynamic Visual SLAM in Robotic Environments via Angle-Based Optical Flow Analysis
by Sedat Dikici and Fikret Arı
Electronics 2026, 15(1), 223; https://doi.org/10.3390/electronics15010223 - 3 Jan 2026
Viewed by 636
Abstract
Dynamic objects present a major challenge for visual simultaneous localization and mapping (Visual SLAM), as feature measurements originating from moving regions can corrupt camera pose estimation and lead to inaccurate maps. In this paper, we propose a lightweight, semantic-free front-end enhancement for ORB-SLAM that detects and suppresses dynamic features using optical flow geometry. The key idea is to estimate a global motion direction point (MDP) from optical flow vectors and to classify feature points based on their angular consistency with the camera-induced motion field. Unlike magnitude-based flow filtering, the proposed strategy exploits the geometric consistency of optical flow with respect to a motion direction point, providing robustness not only to depth variation and camera speed changes but also to different camera motion patterns, including pure translation and pure rotation. The method is integrated into the ORB-SLAM front-end without modifying the back-end optimization or cost function. Experiments on public dynamic-scene datasets demonstrate that the proposed approach reduces absolute trajectory error by up to approximately 45% compared to baseline ORB-SLAM, while maintaining real-time performance on a CPU-only platform. These results indicate that reliable dynamic feature suppression can be achieved without semantic priors or deep learning models.
(This article belongs to the Section Computer Science & Engineering)

25 pages, 3616 KB  
Article
A Deep Learning-Driven Semantic Mapping Strategy for Robotic Inspection of Desalination Facilities
by Albandari Alotaibi, Reem Alrashidi, Hanan Alatawi, Lamaa Duwayriat, Aseel Binnouh, Tareq Alhmiedat and Ahmad Al-Qerem
Machines 2025, 13(12), 1129; https://doi.org/10.3390/machines13121129 - 8 Dec 2025
Viewed by 689
Abstract
The area of robot autonomous navigation has become essential for reducing labor-intensive tasks. Current robot navigation systems rely on sensed geometric structures of the environment, using an array of sensor units such as laser scanners, range-finders, and light detection and ranging (LiDAR) to obtain the environment layout. Scene understanding is an important task in the development of robots that need to act autonomously. Hence, this paper presents an efficient semantic mapping system that integrates LiDAR, RGB-D, and odometry data to generate precise and information-rich maps. The proposed system enables the automatic detection and labeling of critical infrastructure components, while preserving high spatial accuracy. As a case study, the system was applied to a desalination plant, where it interactively labeled key entities by integrating Simultaneous Localization and Mapping (SLAM) with vision-based techniques in order to determine the location of installed pipes. The developed system was validated using an efficient development environment known as Robot Operating System (ROS) and a two-wheel-drive robot platform. Several simulations and real-world experiments were conducted to validate the efficiency of the developed semantic mapping system. The obtained results are promising, as the developed semantic map generation system achieves an average object detection accuracy of 84.97% and an average localization error of 1.79 m.