Search Results (139)

Search Parameters:
Keywords = point-line fusion

19 pages, 3920 KB  
Article
HCDFI-YOLOv8: A Transmission Line Ice Cover Detection Model Based on Improved YOLOv8 in Complex Environmental Contexts
by Lipeng Kang, Feng Xing, Tao Zhong and Caiyan Qin
Sensors 2025, 25(17), 5421; https://doi.org/10.3390/s25175421 - 2 Sep 2025
Viewed by 191
Abstract
When unmanned aerial vehicles (UAVs) perform transmission line ice cover detection, variable shooting angles and complex background environments often lead to poor ice-cover recognition accuracy and difficulty in accurately identifying targets. To address these issues, this study proposes an improved icing detection model based on HCDFI–You Only Look Once version 8 (HCDFI-YOLOv8). First, a cross-dense hybrid (CDH) parallel heterogeneous convolutional module is proposed, which not only improves the detection accuracy of the model but also effectively alleviates the surge in floating-point operations that accompanies the model improvements. Second, deep and shallow feature weighted fusion using an improved CSPDarknet53 to 2-Stage FPN_Dynamic Feature Fusion (C2f_DFF) module is proposed to reduce feature loss in the neck network. Third, the detection head is optimized with the feature adaptive spatial feature fusion (FASFF) detection head module to enhance the model’s ability to extract features at different scales. Finally, a new inner-complete intersection over union (Inner_CIoU) loss function is introduced to resolve the limitations of the CIoU loss function used in the original YOLOv8. Experimental results demonstrate that the proposed HCDFI-YOLOv8 model achieves a 2.7% improvement in mAP@0.5 and a 2.5% improvement in mAP@0.5:0.95 compared to standard YOLOv8. Among twelve models evaluated for icing detection, the proposed model delivers the highest overall detection accuracy. These results verify the accuracy of the HCDFI-YOLOv8 model in complex transmission line environments and provide effective technical support for transmission line ice cover detection. Full article
(This article belongs to the Special Issue AI-Based Computer Vision Sensors & Systems—2nd Edition)
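As a rough illustration of the Inner_CIoU idea mentioned in this abstract, the sketch below computes the overlap of auxiliary boxes scaled about each box centre, which would then be substituted for the plain IoU inside a CIoU-style loss. The `ratio` value, the box format, and the way the term is combined with the CIoU penalty are assumptions of this sketch, not details taken from the paper.

```python
import torch

def inner_iou(pred, gt, ratio=0.75, eps=1e-7):
    """Inner-IoU-style overlap: IoU of auxiliary boxes scaled about each box centre.

    pred, gt: (N, 4) tensors in (x1, y1, x2, y2) format. A ratio < 1 shrinks the
    auxiliary boxes (sharper gradients for high-IoU samples); ratio > 1 enlarges them.
    """
    def scaled(box):
        cx, cy = (box[:, 0] + box[:, 2]) / 2, (box[:, 1] + box[:, 3]) / 2
        w, h = (box[:, 2] - box[:, 0]) * ratio, (box[:, 3] - box[:, 1]) * ratio
        return cx - w / 2, cy - h / 2, cx + w / 2, cy + h / 2

    px1, py1, px2, py2 = scaled(pred)
    gx1, gy1, gx2, gy2 = scaled(gt)
    inter_w = (torch.min(px2, gx2) - torch.max(px1, gx1)).clamp(min=0)
    inter_h = (torch.min(py2, gy2) - torch.max(py1, gy1)).clamp(min=0)
    inter = inter_w * inter_h
    union = (px2 - px1) * (py2 - py1) + (gx2 - gx1) * (gy2 - gy1) - inter
    return inter / (union + eps)  # 1 - inner_iou (+ CIoU penalty terms) would form the loss
```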

10 pages, 1385 KB  
Article
Prediction of Distal Dural Ring Location in Internal Carotid Paraclinoid Aneurysms Using the Tuberculum Sellae–Anterior Clinoid Process Line
by Masaki Matsumoto, Tohru Mizutani, Tatsuya Sugiyama, Kenji Sumi, Shintaro Arai and Yoichi Morofuji
J. Clin. Med. 2025, 14(17), 5951; https://doi.org/10.3390/jcm14175951 - 22 Aug 2025
Viewed by 430
Abstract
Background/Objectives: Current bone-based landmark approaches have shown variable accuracy and poor reproducibility. We validated a two-point “tuberculum sellae–anterior clinoid process” (TS–ACP) line traced on routine 3D-computed tomography angiography (CTA) for predicting distal dural ring (DDR) position and quantified the interobserver agreement. Methods: We retrospectively reviewed data from 85 patients (87 aneurysms) who were treated via clipping between June 2012 and December 2024. Two blinded neurosurgeons classified each aneurysm as extradural, intradural, or straddling the TS–ACP line. The intraoperative DDR inspection served as the reference standard. Diagnostic accuracy, χ² statistics, and Cohen’s κ were calculated. Results: The TS–ACP line landmarks were identifiable in all cases. The TS–ACP line classification correlated strongly with operative findings (χ² = 138.3, p = 6.4 × 10⁻²⁹). The overall accuracy was 89.7% (78/87), and sensitivity and specificity for identifying intradural aneurysms were 94% and 82%, respectively. The interobserver agreement was substantial (κ = 0.78). Nine aneurysms were misclassified, including four cavernous-sinus lesions that partially crossed the DDR. Retrospective fusion using constructive interference in steady-state magnetic resonance imaging corrected these errors. Conclusions: The TS–ACP line represents a rapid, reproducible tool that reliably localizes the DDR on standard 3D-CTA, showing higher accuracy than previously reported single-landmark techniques. Its high accuracy and substantial interobserver concordance support incorporation into routine preoperative assessments. Because the method depends on only two easily detectable bony points, it is well-suited for automated implementation, offering a practical pathway toward artificial intelligence-assisted stratification of paraclinoid aneurysms. Full article
(This article belongs to the Special Issue Revolutionizing Neurosurgery: Cutting-Edge Techniques and Innovations)
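For readers who want to reproduce the kind of agreement statistics quoted above, here is a minimal sketch of accuracy, sensitivity, specificity, and Cohen's κ from a 2×2 table. The counts in the example call are placeholders rather than the study's data, and note that the study reports κ for agreement between the two observers, whereas this sketch computes it against a single reference for simplicity.

```python
def binary_diagnostics(tp, fp, fn, tn):
    """Accuracy, sensitivity, specificity, and Cohen's kappa for a 2x2 table."""
    n = tp + fp + fn + tn
    accuracy = (tp + tn) / n
    sensitivity = tp / (tp + fn)           # positives (e.g., intradural) correctly identified
    specificity = tn / (tn + fp)
    p_o = accuracy                          # observed agreement
    p_e = ((tp + fp) / n) * ((tp + fn) / n) + ((fn + tn) / n) * ((fp + tn) / n)  # chance agreement
    kappa = (p_o - p_e) / (1 - p_e)
    return accuracy, sensitivity, specificity, kappa

# placeholder counts, not the study's confusion matrix
print(binary_diagnostics(tp=47, fp=5, fn=3, tn=32))
```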

27 pages, 7285 KB  
Article
Towards Biologically-Inspired Visual SLAM in Dynamic Environments: IPL-SLAM with Instance Segmentation and Point-Line Feature Fusion
by Jian Liu, Donghao Yao, Na Liu and Ye Yuan
Biomimetics 2025, 10(9), 558; https://doi.org/10.3390/biomimetics10090558 - 22 Aug 2025
Viewed by 528
Abstract
Simultaneous Localization and Mapping (SLAM) is a fundamental technique in mobile robotics, enabling autonomous navigation and environmental reconstruction. However, dynamic elements in real-world scenes—such as walking pedestrians, moving vehicles, and swinging doors—often degrade SLAM performance by introducing unreliable features that cause localization errors. In this paper, we define dynamic regions as areas in the scene containing moving objects, and dynamic features as the visual features extracted from these regions that may adversely affect localization accuracy. Inspired by biological perception strategies that integrate semantic awareness and geometric cues, we propose Instance-level Point-Line SLAM (IPL-SLAM), a robust visual SLAM framework for dynamic environments. The system employs YOLOv8-based instance segmentation to detect potential dynamic regions and construct semantic priors, while simultaneously extracting point and line features using Oriented FAST (Features from Accelerated Segment Test) and Rotated BRIEF (Binary Robust Independent Elementary Features), collectively known as ORB, and Line Segment Detector (LSD) algorithms. Motion consistency checks and angular deviation analysis are applied to filter dynamic features, and pose optimization is conducted using an adaptive-weight error function. A static semantic point cloud map is further constructed to enhance scene understanding. Experimental results on the TUM RGB-D dataset demonstrate that IPL-SLAM significantly outperforms existing dynamic SLAM systems—including DS-SLAM and ORB-SLAM2—in terms of trajectory accuracy and robustness in complex indoor environments. Full article
(This article belongs to the Section Biomimetic Design, Constructions and Devices)
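A hedged sketch of the point-and-line front end described above, using OpenCV's ORB detector and Line Segment Detector with a placeholder mask standing in for the YOLOv8 instance segmentation of dynamic regions. `createLineSegmentDetector` is only present in OpenCV builds that ship LSD, and the filtering shown here is far simpler than the paper's motion-consistency and angular-deviation checks.

```python
import cv2
import numpy as np

frame = np.random.randint(0, 256, (480, 640), dtype=np.uint8)  # stand-in for a camera frame
dynamic_mask = np.zeros_like(frame)  # 255 where instance segmentation flags a dynamic object

# Point features: ORB keypoints restricted to static regions via the detector mask.
orb = cv2.ORB_create(nfeatures=1000)
static_mask = cv2.bitwise_not(dynamic_mask)
keypoints, descriptors = orb.detectAndCompute(frame, mask=static_mask)

# Line features: LSD segments (availability depends on the OpenCV build).
lsd = cv2.createLineSegmentDetector()
lines, _, _, _ = lsd.detect(frame)

# Keep only line segments whose midpoint lies outside the dynamic regions.
static_lines = []
if lines is not None:
    for x1, y1, x2, y2 in lines[:, 0]:
        mx, my = int((x1 + x2) / 2), int((y1 + y2) / 2)
        if dynamic_mask[my, mx] == 0:
            static_lines.append((x1, y1, x2, y2))
```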

20 pages, 27328 KB  
Article
GDVI-Fusion: Enhancing Accuracy with Optimal Geometry Matching and Deep Nearest Neighbor Optimization
by Jincheng Peng, Xiaoli Zhang, Kefei Yuan, Xiafu Peng and Gongliu Yang
Appl. Sci. 2025, 15(16), 8875; https://doi.org/10.3390/app15168875 - 12 Aug 2025
Viewed by 316
Abstract
The visual–inertial odometry (VIO) system is not sufficiently robust during long-term operation. In particular, a coupled visual–inertial and Global Navigation Satellite System (GNSS) system is prone to divergence of its position estimate when either the visual information or the GNSS information fails. To address these problems, this paper proposes a tightly coupled, nonlinearly optimized localization system fusing RGBD vision, an inertial measurement unit (IMU), and global position measurements (GDVI-Fusion), which tackles the insufficient robustness of carrier position estimation and inaccurate localization in environments where visual or GNSS information fails. Preprocessing of the depth information during initialization is proposed to mitigate the influence of lighting and physical structure on the RGBD camera and to improve the accuracy of the depth associated with image feature points, thereby improving the robustness of the localization system. Feature matches are processed with the K-Nearest-Neighbors (KNN) algorithm: the matched points are used to construct optimal geometric constraints, and matches whose connecting lines have an abnormal length or slope are eliminated, which improves the speed and accuracy of feature point matching and, in turn, the system’s localization accuracy. The lightweight monocular GDVI-Fusion system proposed in this paper achieves a 54.2% improvement in operational efficiency and a 37.1% improvement in positioning accuracy compared with the GVINS system. We have verified the system’s operational efficiency and positioning accuracy on a public dataset and on a prototype. Full article
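The match-filtering step described above (rejecting matches whose connecting line has an abnormal length or slope) can be sketched as follows. The paper derives its constraints from KNN-processed feature points, whereas this illustration uses simple global statistics, so the threshold and the rejection rule are assumptions.

```python
import numpy as np

def filter_matches_by_geometry(pts_a, pts_b, k_sigma=2.0):
    """Reject matches whose connecting line has an abnormal length or slope.

    pts_a, pts_b: (N, 2) arrays of matched pixel coordinates in two images.
    A match is kept only if both its length and its angle lie within
    k_sigma standard deviations of the respective medians.
    """
    d = pts_b - pts_a
    lengths = np.hypot(d[:, 0], d[:, 1])
    angles = np.arctan2(d[:, 1], d[:, 0])
    keep = (np.abs(lengths - np.median(lengths)) <= k_sigma * lengths.std()) & \
           (np.abs(angles - np.median(angles)) <= k_sigma * angles.std())
    return pts_a[keep], pts_b[keep]
```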

14 pages, 802 KB  
Article
Risk Factor Analysis for Proximal Junctional Kyphosis in Neuromuscular Scoliosis: A Single-Center Study
by Tobias Lange, Kathrin Boeckenfoerde, Georg Gosheger, Sebastian Bockholt and Albert Schulze Bövingloh
J. Clin. Med. 2025, 14(11), 3646; https://doi.org/10.3390/jcm14113646 - 22 May 2025
Viewed by 794
Abstract
Background/Objectives: Proximal junctional kyphosis (PJK) is one of the most frequently discussed complications following corrective surgery in patients with neuromuscular scoliosis (NMS). Despite its clinical relevance, the etiology of PJK remains incompletely understood and appears to be multifactorial. Biomechanical and limited clinical studies suggest that preoperative hyperkyphosis, resection of the spinous processes with consequent disruption of posterior ligamentous structures, and rod contouring parameters may contribute as risk factors. Methods: To validate these findings, we retrospectively analyzed 99 NMS patients who underwent posterior spinal fusion using a standardized screw-rod system between 2009 and 2017. Radiographic assessments were conducted at three time points: preoperatively (preOP), postoperatively (postOP), and at a mean follow-up (FU) of 29 months. Clinical variables collected included patient age, weight, height, sex, and Risser sign. Radiographic evaluations encompassed Cobb angles, thoracic kyphosis (TK), lumbar lordosis, the levels of the upper (UIV) and lower (LIV) instrumented vertebrae, the total number of fused segments, parameters of sagittal alignment, the rod contour angle (RCA), and the postoperative mismatch between RCA and the proximal junctional angle (PJA). Based on the development of proximal junctional kyphosis, patients were categorized into PJK and non-PJK groups. Results: The overall incidence of PJK was 23.2%. In line with previous biomechanical findings, spinous process resection was significantly associated with PJK development. Furthermore, the PJK group demonstrated significantly higher preoperative TK (59.3° ± 29.04° vs. 34.5° ± 26.76°, p < 0.001), greater RCA (10.2° ± 4.01° vs. 7.7° ± 4.34°, p = 0.021), and a larger postoperative mismatch between PJA and RCA (PJA−RCA: 3.8° ± 6.76° vs. −1.8° ± 6.55°, p < 0.001) compared to the non-PJK group. Conclusions: Spinous process resection, a pronounced mismatch between postoperative PJA and RCA (odds ratio [OR] = 1.19, p = 0.002), excessive rod bending (i.e., high RCA), and severe preoperative thoracic hyperkyphosis with an expected increase in the risk of PJK of approximately 6.5% per degree of increase in preoperative TK are significant risk factors for PJK. These variables should be carefully considered during the surgical planning and execution of deformity correction in NMS patients. Full article
(This article belongs to the Special Issue Clinical New Insights into Management of Scoliosis)

14 pages, 5843 KB  
Article
A SAM-Based Detection Method for the Distance Between Aircraft Fire Detection Lines
by Mo Chen, Sheng Cheng, Yan Liu, Qifan Yin and Hongfu Zuo
Appl. Sci. 2025, 15(10), 5342; https://doi.org/10.3390/app15105342 - 10 May 2025
Viewed by 469
Abstract
Checking the distance between aircraft fire detection lines is a crucial task in the conformity inspection process of civil aircraft manufacturing. Currently, this task is mainly performed manually, which is inefficient and prone to errors and omissions. To address this issue, we propose a method for detecting the distance between aircraft fire detection lines based on the Segment Anything Model (SAM). In this method, we develop a general model for aircraft parts image segmentation and detection, named the Aircraft Segment Anything Model (ASAM). This model uses a low-rank fine-tuning strategy (LoRA) to fine-tune the encoder and mask decoder synchronously and incorporates a fusion loss function for adaptive training of the target task. The trained ASAM model is tested on a self-built fire detection line segmentation dataset, achieving Dice scores of 85.28, 85.99, and 86.26 based on sam_vit_b, sam_vit_l, and sam_vit_h weights, respectively. It scores 13.82 points higher than the classic segmentation model K-Net under the same training parameters. The proposed method provides a new approach for the widespread application of SAM in the field of aircraft parts image segmentation. Full article
(This article belongs to the Section Aerospace Science and Engineering)
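The Dice scores quoted above follow the standard overlap definition between a predicted and a ground-truth mask; a minimal sketch is given below (the values in the abstract correspond to this coefficient scaled by 100).

```python
import numpy as np

def dice_score(pred_mask, gt_mask, eps=1e-7):
    """Dice coefficient between two binary masks (nonzero = foreground)."""
    pred = pred_mask.astype(bool)
    gt = gt_mask.astype(bool)
    intersection = np.logical_and(pred, gt).sum()
    return (2.0 * intersection + eps) / (pred.sum() + gt.sum() + eps)
```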

37 pages, 31655 KB  
Article
The Interpretation of Historical Layer Evolution Laws in Historic Districts from the Perspective of the Historic Urban Landscape: A Case Study in Shenyang, China
by Yuan Wang, Chengxie Jin, Tiebo Wang and Danyang Xu
Land 2025, 14(5), 1029; https://doi.org/10.3390/land14051029 - 8 May 2025
Cited by 1 | Viewed by 1078
Abstract
In the context of global urbanization and the resulting tension between heritage conservation and urban development, there is an urgent need for effective strategies to address fragmented conservation, static cognition, and homogeneous renewal in conservation practice. Using the theoretical framework of the historic urban landscape (HUL), this study integrates urban morphology, architectural typology, urban imagery, and catalyst theory to formulate a progressive analysis of the evolution of historic districts through the layers of “historic areas, spatial forms, material carriers, and value characteristics”. Focusing on Shenyang as the empirical object, the study employs a multifaceted research method that integrates multiple scenarios and sub-cases within a single case, combining literature and field research to obtain diversified data. It then systematically analyzes the layering of Shenyang’s historic districts through kernel density analysis and geometric graphical methods. The study finds that, in the historical-area dimension, the Shenyang historic districts exhibit a layering law of “single-core dominant–dual-core juxtaposition–fusion collage–extension–multi-point radiation”, while the spatial form is summarized into seven types of layering laws, such as radial, ring, triangular, and grid types. The material carriers exhibit the conventional law of anchoring point-like elements, employing line-like elements as the skeleton and surface-like elements as the matrix. The value layers are diversified, centralized, and self-adaptive. The study proposes the concept of a “layer accumulation law” to elucidate the carrier transformation mechanism of cultural genes and provides a methodological tool for addressing the dilemma of “layer accumulation fracture”. The findings not only deepen the localized application of HUL theory but also provide an innovative path for heritage conservation practice in urban renewal. Full article

17 pages, 10247 KB  
Article
Pose Measurement of Non-Cooperative Space Targets Based on Point Line Feature Fusion in Low-Light Environments
by Haifeng Zhang, Jiaxin Wu, Han Ai, Delian Liu, Chao Mei and Maosen Xiao
Electronics 2025, 14(9), 1795; https://doi.org/10.3390/electronics14091795 - 28 Apr 2025
Viewed by 456
Abstract
Pose measurement of non-cooperative targets in space is one of the key technologies in space missions. However, most existing methods simulate well-lit environments and do not consider the degradation of algorithms in low-light conditions. Additionally, due to the limited computing capabilities of space platforms, there is a higher demand for real-time processing of algorithms. This paper proposes a real-time pose measurement method based on binocular vision that is suitable for low-light environments. Firstly, the traditional point feature extraction algorithm is adaptively improved based on lighting conditions, greatly reducing the impact of lighting on the effectiveness of feature point extraction. By combining point feature matching with epipolar constraints, the matching range of feature points is narrowed down to the epipolar line, significantly improving the matching speed and accuracy. Secondly, utilizing the structural information of the spacecraft, line features are introduced and processed in parallel with point features, greatly enhancing the accuracy of pose measurement results. Finally, an adaptive weighted multi-feature pose fusion method based on lighting conditions is introduced to obtain the optimal pose estimation results. Simulation and physical experiment results demonstrate that this method can obtain high-precision target pose information in a real-time and stable manner, both in well-lit and low-light environments. Full article
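The epipolar constraint used to narrow the stereo search can be illustrated as below: a candidate match is kept only if the right-image point lies close to the epipolar line induced by its left-image counterpart. The fundamental matrix and any pixel threshold are inputs assumed by this sketch, not values from the paper.

```python
import numpy as np

def epipolar_distance(pts_left, pts_right, F):
    """Distance of each right-image point to the epipolar line of its left-image match.

    pts_left, pts_right: (N, 2) pixel coordinates; F: 3x3 fundamental matrix.
    Matches with a distance above a small pixel threshold can be rejected.
    """
    ones = np.ones((len(pts_left), 1))
    xl = np.hstack([pts_left, ones])           # homogeneous left points
    xr = np.hstack([pts_right, ones])
    lines = xl @ F.T                           # epipolar lines l' = F x in the right image
    num = np.abs(np.sum(xr * lines, axis=1))   # |x'^T F x|
    den = np.hypot(lines[:, 0], lines[:, 1])
    return num / den
```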

17 pages, 1557 KB  
Article
MultiDistiller: Efficient Multimodal 3D Detection via Knowledge Distillation for Drones and Autonomous Vehicles
by Binghui Yang, Tao Tao, Wenfei Wu, Yongjun Zhang, Xiuyuan Meng and Jianfeng Yang
Drones 2025, 9(5), 322; https://doi.org/10.3390/drones9050322 - 22 Apr 2025
Viewed by 935
Abstract
Real-time 3D object detection is a cornerstone for the safe operation of drones and autonomous vehicles (AVs)—drones must avoid millimeter-scale power lines in cluttered airspace, while AVs require instantaneous recognition of pedestrians and vehicles in dynamic urban environments. Although significant progress has been made in detection methods based on point clouds, cameras, and multimodal fusion, the computational complexity of existing high-precision models struggles to meet the real-time requirements of vehicular edge devices. Additionally, during the model lightweighting process, issues such as multimodal feature coupling failure and the imbalance between classification and localization performance often arise. To address these challenges, this paper proposes a knowledge distillation framework for multimodal 3D object detection, incorporating attention guidance, rank-aware learning, and interactive feature supervision to achieve efficient model compression and performance optimization. Specifically: To enhance the student model’s ability to focus on key channel and spatial features, we introduce attention-guided feature distillation, leveraging a bird’s-eye view foreground mask and a dual-attention mechanism. To mitigate the degradation of classification performance when transitioning from two-stage to single-stage detectors, we propose ranking-aware category distillation by modeling anchor-level distribution. To address the insufficient cross-modal feature extraction capability, we enhance the student network’s image features using the teacher network’s point cloud spatial priors, thereby constructing a LiDAR-image cross-modal feature alignment mechanism. Experimental results demonstrate the effectiveness of the proposed approach in multimodal 3D object detection. On the KITTI dataset, our method improves network performance by 4.89% even after reducing the number of channels by half. Full article
(This article belongs to the Special Issue Cooperative Perception for Modern Transportation)
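As orientation for the attention-guided feature distillation described above, here is a minimal sketch of a foreground-masked feature imitation term on bird's-eye-view maps. The paper's dual-attention weighting, channel alignment, and ranking-aware category distillation are omitted, and all tensor names are illustrative.

```python
import torch

def masked_feature_distillation(student_feat, teacher_feat, fg_mask, eps=1e-6):
    """Foreground-masked imitation loss between BEV feature maps.

    student_feat, teacher_feat: (B, C, H, W) tensors with matching channels
    (a 1x1 conv would normally align the student beforehand);
    fg_mask: (B, 1, H, W) mask that is 1 on cells covered by ground-truth objects.
    """
    diff = (student_feat - teacher_feat) ** 2
    return (diff * fg_mask).sum() / (fg_mask.sum() * student_feat.shape[1] + eps)
```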

15 pages, 3682 KB  
Article
Multi-Sensor Information Fusion Positioning of AUKF Maglev Trains Based on Self-Corrected Weighting
by Qian Hu, Hong Tang, Kuangang Fan and Wenlong Cai
Sensors 2025, 25(8), 2628; https://doi.org/10.3390/s25082628 - 21 Apr 2025
Viewed by 476
Abstract
Achieving accurate positioning of maglev trains is one of the key technologies for the safe operation and scheduling of maglev trains. Maglev train positioning suffers from problems such as susceptibility to external noise, reliance on a single positioning method, and traditional weighting being affected by historical data, which lead to deviations in the fused positioning results. Therefore, this paper adopts self-corrected weighting and the Sage–Husa noise estimation algorithm to address these issues and proposes a multi-sensor information fusion positioning method for maglev trains based on a self-corrected weighted adaptive unscented Kalman filter (AUKF). Multi-sensor information fusion is applied to maglev train positioning so that the system does not rely on a single sensor. The Sage–Husa algorithm is combined with the unscented Kalman filter (UKF) to form the AUKF algorithm, which uses data collected from cross-sensor lines, an INS, Doppler radar, and GNSS, adaptively updates the statistical estimate of the measurement noise, and overcomes the single-function and low-integration shortcomings of the individual modules to achieve precise positioning of maglev trains. The experimental results show that the trajectories of the self-corrected AUKF filter are closer to the true values and that its ME and RMSE are smaller, indicating that the self-corrected weighted AUKF algorithm proposed in this paper has significant advantages in terms of stability, accuracy, and simplicity. Full article
(This article belongs to the Section Navigation and Positioning)
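A common textbook form of the Sage–Husa adaptive update of the measurement-noise covariance is sketched below for orientation; the exact recursion used in the paper's self-corrected weighted AUKF, and how it is embedded in the unscented transform, may well differ, so every symbol here is an assumption.

```python
import numpy as np

def sage_husa_R_update(R_prev, innovation, H, P_pred, k, b=0.98):
    """One Sage–Husa-style adaptive update of the measurement-noise covariance R.

    R_prev: previous estimate of R; innovation: z - H x_pred; H: measurement matrix;
    P_pred: predicted state covariance; k: time-step index; b: forgetting factor in (0, 1).
    """
    d_k = (1 - b) / (1 - b ** (k + 1))   # fading weight giving more trust to recent residuals
    e = innovation.reshape(-1, 1)
    return (1 - d_k) * R_prev + d_k * (e @ e.T - H @ P_pred @ H.T)
```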

29 pages, 6622 KB  
Article
Semantic Fusion Algorithm of 2D LiDAR and Camera Based on Contour and Inverse Projection
by Xingyu Yuan, Yu Liu, Tifan Xiong, Wei Zeng and Chao Wang
Sensors 2025, 25(8), 2526; https://doi.org/10.3390/s25082526 - 17 Apr 2025
Cited by 1 | Viewed by 1072
Abstract
Common single-line 2D LiDAR sensors and cameras have become core components in the field of robotic perception due to their low cost, compact size, and practicality. However, during the data fusion process, the randomness and complexity of real industrial scenes pose challenges. Traditional calibration methods for LiDAR and cameras often rely on precise targets and can accumulate errors, leading to significant limitations. Additionally, the semantic fusion of LiDAR and camera data typically requires extensive projection calculations, complex clustering algorithms, or sophisticated data fusion techniques, resulting in low real-time performance when handling large volumes of data points in dynamic environments. To address these issues, this paper proposes a semantic fusion algorithm for LiDAR and camera data based on contour and inverse projection. The method has two notable features: (1) Combined with an ellipse extraction algorithm based on arc-support line segments, a LiDAR–camera calibration algorithm that uses various regular shapes of environmental targets is proposed, which improves the adaptability of the calibration algorithm to the environment. (2) A semantic segmentation algorithm based on the inverse projection of target contours is proposed. It is specifically designed to be versatile and applicable to both linear and arc features, significantly broadening the range of features that can be utilized in various tasks. This flexibility is a key advantage, as it allows the method to adapt to a wider variety of real-world scenarios where both types of features are commonly encountered. Compared with existing LiDAR point cloud semantic segmentation methods, this algorithm eliminates the need for complex clustering algorithms, data fusion techniques, and extensive laser point reprojection calculations. When handling a large number of laser points, the proposed method requires only one or two inverse projections of the contour to filter the range of laser points that intersect with specific targets. This approach enhances both the accuracy of point cloud searches and the speed of semantic processing. Finally, the validity of the semantic fusion algorithm is demonstrated through field experiments. Full article
(This article belongs to the Section Sensors and Robotics)
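For background, the forward pinhole projection that any LiDAR–camera fusion relies on is sketched below; the paper's contribution goes the other way (inverse projection of image contours onto the scan), so this is context rather than the proposed algorithm, and the extrinsic/intrinsic parameter names are assumptions.

```python
import numpy as np

def project_scan_to_image(scan_xy, R, t, K):
    """Project planar 2D LiDAR points into a camera image.

    scan_xy: (N, 2) points in the scan plane (metres), lifted to 3D with z = 0
    in the LiDAR frame; R (3x3), t (3,): LiDAR-to-camera extrinsics; K (3x3): intrinsics.
    """
    pts_lidar = np.hstack([scan_xy, np.zeros((len(scan_xy), 1))])
    pts_cam = pts_lidar @ R.T + t                  # transform into the camera frame
    in_front = pts_cam[:, 2] > 0                   # keep points in front of the camera
    uvw = pts_cam[in_front] @ K.T
    uv = uvw[:, :2] / uvw[:, 2:3]                  # perspective division -> pixel coordinates
    return uv, in_front
```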

25 pages, 11848 KB  
Article
Multiscale Feature Fusion with Self-Attention for Efficient 6D Pose Estimation
by Zekai Lv, Yufeng Guo, Shangbin Yang, Linlin Du, Rui Gao, Jinti Sun, Jiaqi Han, Hui Zhang and Qiang Wang
Algorithms 2025, 18(4), 207; https://doi.org/10.3390/a18040207 - 8 Apr 2025
Viewed by 752
Abstract
Six-dimensional (6D) pose estimation remains a significant challenge in computer vision, particularly for objects in complex environments. To overcome the limitations of existing methods in occluded and low-texture scenarios, a lightweight, multiscale feature fusion network was proposed. In the network, a self-attention mechanism is integrated with a multiscale point cloud feature extraction module, enhancing the representation of local features and mitigating information loss caused by occlusion. A lightweight image feature extraction module was also introduced to reduce the computational complexity while maintaining high precision in pose estimation. Ablation experiments on the LineMOD dataset validated the effectiveness of the two modules. The proposed network achieved 98.5% accuracy, contained 19.49 million parameters, and exhibited a processing speed of 31.8 frames per second (FPS). Comparative experiments on the LineMOD, Yale-CMU-Berkeley (YCB)-Video, and Occlusion LineMOD datasets demonstrated the superior performance of the proposed method. Specifically, the average nearest point distance (ADD-S) metric was improved by 4.2 percentage points over DenseFusion for LineMOD and by 0.6 percentage points for YCB-Video, with it reaching 63.4% on the Occlusion LineMOD dataset. In addition, inference speed comparisons showed that the proposed method outperforms most RGB-D-based methods. The results confirmed that the proposed method is both robust and efficient in handling occlusions and low-texture objects while also featuring a lightweight network design. Full article
(This article belongs to the Section Algorithms for Multidisciplinary Applications)
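The ADD-S figure quoted above is the mean closest-point distance between the object model transformed by the ground-truth pose and by the predicted pose; a minimal sketch, assuming `scipy` for the nearest-neighbour query, is given below.

```python
import numpy as np
from scipy.spatial import cKDTree

def add_s(model_pts, R_gt, t_gt, R_pred, t_pred):
    """ADD-S: mean closest-point distance between ground-truth and predicted poses.

    model_pts: (N, 3) object model points; R_*: 3x3 rotations; t_*: (3,) translations.
    Suitable for symmetric objects, where point-to-point correspondence is ambiguous.
    """
    pts_gt = model_pts @ R_gt.T + t_gt
    pts_pred = model_pts @ R_pred.T + t_pred
    dists, _ = cKDTree(pts_pred).query(pts_gt, k=1)
    return dists.mean()
```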

13 pages, 3587 KB  
Article
KPMapNet: Keypoint Representation Learning for Online Vectorized High-Definition Map Construction
by Bicheng Jin, Wenyu Hao, Wenzhao Qiu and Shanmin Pang
Sensors 2025, 25(6), 1897; https://doi.org/10.3390/s25061897 - 18 Mar 2025
Viewed by 664
Abstract
Vectorized high-definition (HD) map construction is a critical task in the autonomous driving domain. The existing methods typically represent vectorized map elements with a fixed number of points, establishing robust baselines for this task. However, the inherent shape priors introduce additional shape errors, which in turn lead to error accumulation in the downstream tasks. Moreover, the subtle and sparse nature of the annotations limits detection-based frameworks in accurately extracting the relevant features, often resulting in the loss of fine structural details in the predictions. To address these challenges, this work presents KPMapNet, an end-to-end framework that redefines the ground truth training representation of vectorized map elements to achieve precise topological representations. Specifically, the conventional equidistant sampling method is modified to better preserve the geometric features of the original instances while maintaining a fixed number of points. In addition, a map mask fusion module and an enhanced hybrid attention module are incorporated to mitigate the issues introduced by the new representation. Moreover, a novel point-line matching loss function is introduced to further refine the training process. Extensive experiments on the nuScenes and Argoverse2 datasets demonstrate that KPMapNet achieves state-of-the-art performance, with 75.1 mAP on nuScenes and 74.2 mAP on Argoverse2. The visualization results further corroborate the enhanced accuracy of the map generation outcomes. Full article
(This article belongs to the Special Issue Computer Vision and Sensor Fusion for Autonomous Vehicles)
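The conventional equidistant sampling that KPMapNet modifies can be pictured as resampling each map polyline to a fixed number of points spaced uniformly in arc length, as in the sketch below; the paper's geometry-preserving variant and its point-line matching loss are not reproduced here.

```python
import numpy as np

def resample_polyline(points, n_out=20):
    """Resample an ordered polyline to n_out points equally spaced in arc length.

    points: (N, 2) vertices of a vectorized map element (lane line, curb, ...).
    """
    seg = np.diff(points, axis=0)
    s = np.concatenate([[0.0], np.cumsum(np.hypot(seg[:, 0], seg[:, 1]))])  # cumulative length
    s_new = np.linspace(0.0, s[-1], n_out)
    return np.stack([np.interp(s_new, s, points[:, 0]),
                     np.interp(s_new, s, points[:, 1])], axis=1)
```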

15 pages, 7546 KB  
Article
Deterministic Light Detection and Ranging (LiDAR)-Based Obstacle Detection in Railways Using Data Fusion
by Susana Dias, Pedro J. S. C. P. Sousa, João Nunes, Francisco Afonso, Nuno Viriato, Paulo J. Tavares and Pedro M. G. P. Moreira
Appl. Sci. 2025, 15(6), 3118; https://doi.org/10.3390/app15063118 - 13 Mar 2025
Viewed by 1326
Abstract
Rail travel is one of the safest means of transportation, with increasing usage in recent years. One of the major safety concerns in the railway relates to intrusions. Therefore, the timely detection of obstacles is crucial for ensuring operational safety. This is a complex problem with multiple contributing factors, from environmental to psychological. While machine learning (ML) has proven effective in related applications, such as autonomous road-based driving, the railway sector faces unique challenges due to limited image data availability and difficult data acquisition, hindering the applicability of conventional ML methods. To mitigate this, the present study proposes a novel framework leveraging LiDAR technology (Light Detection and Ranging) and previous knowledge to address these data scarcity limitations and enhance obstacle detection capabilities on railways. The proposed framework combines the strengths of long-range LiDAR (capable of detecting obstacles up to 500 m away) and GNSS data, which results in precise coordinates that accurately describe the train’s position relative to any obstacles. Using a data fusion approach, pre-existing knowledge about the track topography is incorporated into the LiDAR data processing pipeline in conjunction with the DBSCAN clustering algorithm to identify and classify potential obstacles based on point cloud density patterns. This step effectively segregates potential obstacles from background noise and track structures. The proposed framework was tested within the operational environment of a CP 2600-2620 series locomotive in a short section of the Contumil-Leixões line. This real-world testing scenario allowed the evaluation of the framework’s effectiveness under realistic operating conditions. The unique advantages of this approach relate to its effectiveness in tackling data scarcity, which is often an issue for other methods, in a way that enhances obstacle detection in railway operations and may lead to significant improvements in safety and operational efficiency within railway networks. Full article
(This article belongs to the Special Issue Interdisciplinary Approaches and Applications of Optics & Photonics)
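A hedged sketch of the DBSCAN step on the residual point cloud, i.e., the returns left after the known track geometry has been accounted for; `eps`, `min_samples`, and the random placeholder data are illustrative choices, not the parameters or data used in the study.

```python
import numpy as np
from sklearn.cluster import DBSCAN

points = np.random.rand(1000, 3) * 50.0            # placeholder (x, y, z) LiDAR returns, metres
labels = DBSCAN(eps=0.8, min_samples=10).fit_predict(points)

# Label -1 marks sparse background noise; every other label is a candidate obstacle cluster.
obstacles = [points[labels == c] for c in set(labels) if c != -1]
for i, cluster in enumerate(obstacles):
    print(f"candidate {i}: {len(cluster)} points, centroid {cluster.mean(axis=0).round(2)}")
```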

24 pages, 35913 KB  
Article
Study on Spatial Interpolation Methods for High Precision 3D Geological Modeling of Coal Mining Faces
by Mingyi Cui, Enke Hou, Tuo Lu, Pengfei Hou and Dong Feng
Appl. Sci. 2025, 15(6), 2959; https://doi.org/10.3390/app15062959 - 10 Mar 2025
Viewed by 963
Abstract
High-precision three-dimensional geological modeling of mining faces is crucial for intelligent coal mining and disaster prevention. Accurate spatial interpolation is essential for building high-quality models. This study focuses on the 25214 workface of the Hongliulin coal mine, addressing challenges in interpolating terrain elevation, stratum thickness, and coal seam thickness data. We evaluate eight interpolation methods (four kriging methods, an inverse distance weighting method, and three radial basis function methods) for terrain and stratum thickness, and nine methods (including the Bayesian Maximum Entropy method) for coal seam thickness, using cross-validation to assess their accuracy. Research results indicate that for terrain elevation data with dense and evenly distributed sampling points, linear kriging achieves the highest accuracy (MAE = 1.01 m, RMSE = 1.20 m). For the optimal interpolation methods of five layers of thickness data with sparse sampling points, the results are as follows: Q4, spherical kriging (MAE = 2.13 m, RMSE = 2.83 m); N2b, IDW (p = 2), MAE = 2.08 m, RMSE = 2.44 m; J2y3, RS-RBF (MAE = 0.89 m, RMSE = 1.05 m); J2y2, TPS-RBF (MAE = 1.96 m, RMSE = 2.25 m); J2y1, HS-RBF (MAE = 2.36 m, RMSE = 2.71 m). A method for accurately delineating the zero line of strata thickness by assigning negative values to virtual thickness in areas of missing strata has been proposed. For coal seam thickness data with uncertain data (from channel wave exploration), a soft-hard data fusion interpolation method based on Bayesian Maximum Entropy has been introduced, and its interpolation results (MAE = 0.64 m, RMSE = 0.66 m) significantly outperform those of eight other interpolation algorithms. Using the optimal interpolation methods for terrain, strata, and coal seams, we construct a high-precision three-dimensional geological model of the workface, which provides reliable support for intelligent coal mining. Full article
(This article belongs to the Section Earth Sciences)
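As an illustration of the IDW (p = 2) interpolation and the cross-validation scoring used to compare methods, a minimal leave-one-out sketch is given below; the kriging, radial basis function, and Bayesian Maximum Entropy variants evaluated in the study are not reproduced.

```python
import numpy as np

def idw(xy_known, z_known, xy_query, p=2, eps=1e-12):
    """Inverse-distance-weighted interpolation with power p."""
    d = np.linalg.norm(xy_query[:, None, :] - xy_known[None, :, :], axis=2)
    w = 1.0 / (d ** p + eps)
    return (w @ z_known) / w.sum(axis=1)

def loo_cv_errors(xy, z, p=2):
    """Leave-one-out cross-validation MAE and RMSE for the IDW interpolator."""
    preds = []
    for i in range(len(z)):
        mask = np.arange(len(z)) != i
        preds.append(idw(xy[mask], z[mask], xy[i:i + 1], p=p)[0])
    err = np.asarray(preds) - z
    return np.abs(err).mean(), np.sqrt((err ** 2).mean())
```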
