Search Results (485)

Search Parameters:
Keywords = stereo vision

28 pages, 7272 KiB  
Article
Dynamic Object Detection and Non-Contact Localization in Lightweight Cattle Farms Based on Binocular Vision and Improved YOLOv8s
by Shijie Li, Shanshan Cao, Peigang Wei, Wei Sun and Fantao Kong
Agriculture 2025, 15(16), 1766; https://doi.org/10.3390/agriculture15161766 - 18 Aug 2025
Viewed by 82
Abstract
The real-time detection and localization of dynamic targets in cattle farms are crucial for the effective operation of intelligent equipment. To overcome the limitations of wearable devices, including high costs and operational stress, this paper proposes a lightweight, non-contact solution. The goal is to improve the accuracy and efficiency of target localization while reducing the complexity of the system. A novel approach is introduced based on YOLOv8s, incorporating a C2f_DW_StarBlock module. The system fuses binocular images from a ZED2i camera with GPS and IMU data to form a multimodal ranging and localization module. Experimental results demonstrate a 36.03% reduction in model parameters, a 33.45% decrease in computational complexity, and a 38.67% reduction in model size. The maximum ranging error is 4.41%, with localization standard deviations of 1.02 m (longitude) and 1.10 m (latitude). The model is successfully integrated into an ROS system, achieving stable real-time performance. This solution offers the advantages of being lightweight, non-contact, and low-maintenance, providing strong support for intelligent farm management and multi-target monitoring. Full article
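The abstract does not spell out the fusion step, but the geometry it implies is straightforward: a stereo-derived range plus the camera's own GPS fix and IMU heading place a detected animal on the map. The following is a minimal sketch of that projection under a flat-earth approximation; the function name, parameters, and the approximation itself are ours, not the paper's.

```python
import math

def target_latlon(cam_lat, cam_lon, heading_deg, depth_m, bearing_offset_deg=0.0):
    """Project a stereo-ranged target onto the map.

    cam_lat, cam_lon   -- GPS fix of the camera (degrees)
    heading_deg        -- IMU yaw of the camera, clockwise from true north
    depth_m            -- range to the target from the stereo pair (metres)
    bearing_offset_deg -- horizontal angle of the target within the image

    Hypothetical helper: the paper fuses ZED2i depth with GPS and IMU data,
    but its exact fusion pipeline is not described at this level of detail.
    """
    bearing = math.radians(heading_deg + bearing_offset_deg)
    d_north = depth_m * math.cos(bearing)   # metres towards north
    d_east = depth_m * math.sin(bearing)    # metres towards east
    # Small-offset equirectangular approximation (adequate at farm scale)
    dlat = d_north / 111_320.0
    dlon = d_east / (111_320.0 * math.cos(math.radians(cam_lat)))
    return cam_lat + dlat, cam_lon + dlon

print(target_latlon(40.0, 116.0, heading_deg=45.0, depth_m=12.0))
```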

17 pages, 6208 KiB  
Article
Sweet—An Open Source Modular Platform for Contactless Hand Vascular Biometric Experiments
by David Geissbühler, Sushil Bhattacharjee, Ketan Kotwal, Guillaume Clivaz and Sébastien Marcel
Sensors 2025, 25(16), 4990; https://doi.org/10.3390/s25164990 - 12 Aug 2025
Viewed by 328
Abstract
Current finger-vein or palm-vein recognition systems usually require direct contact of the subject with the apparatus. This can be problematic in environments where hygiene is of primary importance. In this work we present a contactless vascular biometrics sensor platform named sweet which can be used for hand vascular biometrics studies (wrist, palm, and finger-vein) and surface features such as palmprint. It supports several acquisition modalities such as multi-spectral Near-Infrared (NIR), RGB-color, Stereo Vision (SV) and Photometric Stereo (PS). Using this platform we collected a dataset consisting of the fingers, palm and wrist vascular data of 120 subjects. We present biometric experimental results, focusing on Finger-Vein Recognition (FVR). Finally, we discuss fusion of multiple modalities. The acquisition software, parts of the hardware design, the new FV dataset, as well as source-code for our experiments are publicly available for research purposes. Full article
(This article belongs to the Special Issue Novel Optical Sensors for Biomedical Applications—2nd Edition)

19 pages, 3977 KiB  
Article
Accelerating Surgical Skill Acquisition by Using Multi-View Bullet-Time Video Generation
by Yinghao Wang, Chun Xie, Koichiro Kumano, Daichi Kitaguchi, Shinji Hashimoto, Tatsuya Oda and Itaru Kitahara
Appl. Sci. 2025, 15(16), 8830; https://doi.org/10.3390/app15168830 - 10 Aug 2025
Viewed by 399
Abstract
Surgical education and training have seen significant advancements with the integration of innovative technologies. This paper presents a novel approach to surgical education using a multi-view capturing system and bullet-time generation techniques to enhance the learning experience for aspiring surgeons. The proposed system leverages an array of synchronized cameras strategically positioned around a surgical simulation environment, enabling the capture of surgical procedures from multiple angles simultaneously. The captured multi-view data is then processed using advanced computer vision and image processing algorithms to create a “bullet-time” effect, similar to the iconic scenes from The Matrix movie, allowing educators and trainees to manipulate time and view the surgical procedure from any desired perspective. In this paper, we describe the technical aspects of the multi-view capturing system, the bullet-time generation process, and the integration of these technologies into surgical education programs. We also discuss the potential applications in various surgical specialties and the benefits of utilizing this system for both novice and experienced surgeons. Finally, we present preliminary results from pilot studies and user feedback, highlighting the promising potential of this innovative approach to revolutionize surgical education and training. Full article

32 pages, 1435 KiB  
Review
Smart Safety Helmets with Integrated Vision Systems for Industrial Infrastructure Inspection: A Comprehensive Review of VSLAM-Enabled Technologies
by Emmanuel A. Merchán-Cruz, Samuel Moveh, Oleksandr Pasha, Reinis Tocelovskis, Alexander Grakovski, Alexander Krainyukov, Nikita Ostrovenecs, Ivans Gercevs and Vladimirs Petrovs
Sensors 2025, 25(15), 4834; https://doi.org/10.3390/s25154834 - 6 Aug 2025
Viewed by 629
Abstract
Smart safety helmets equipped with vision systems are emerging as powerful tools for industrial infrastructure inspection. This paper presents a comprehensive state-of-the-art review of such VSLAM-enabled (Visual Simultaneous Localization and Mapping) helmets. We surveyed the evolution from basic helmet cameras to intelligent, sensor-fused inspection platforms, highlighting how modern helmets leverage real-time visual SLAM algorithms to map environments and assist inspectors. A systematic literature search was conducted targeting high-impact journals, patents, and industry reports. We classify helmet-integrated camera systems into monocular, stereo, and omnidirectional types and compare their capabilities for infrastructure inspection. We examine core VSLAM algorithms (feature-based, direct, hybrid, and deep-learning-enhanced) and discuss their adaptation to wearable platforms. Multi-sensor fusion approaches integrating inertial, LiDAR, and GNSS data are reviewed, along with edge/cloud processing architectures enabling real-time performance. This paper compiles numerous industrial use cases, from bridges and tunnels to plants and power facilities, demonstrating significant improvements in inspection efficiency, data quality, and worker safety. Key challenges are analyzed, including technical hurdles (battery life, processing limits, and harsh environments), human factors (ergonomics, training, and cognitive load), and regulatory issues (safety certification and data privacy). We also identify emerging trends, such as semantic SLAM, AI-driven defect recognition, hardware miniaturization, and collaborative multi-helmet systems. This review finds that VSLAM-equipped smart helmets offer a transformative approach to infrastructure inspection, enabling real-time mapping, augmented awareness, and safer workflows. We conclude by highlighting current research gaps, notably in standardizing systems and integrating with asset management, and provide recommendations for industry adoption and future research directions. Full article

21 pages, 4909 KiB  
Article
Rapid 3D Camera Calibration for Large-Scale Structural Monitoring
by Fabio Bottalico, Nicholas A. Valente, Christopher Niezrecki, Kshitij Jerath, Yan Luo and Alessandro Sabato
Remote Sens. 2025, 17(15), 2720; https://doi.org/10.3390/rs17152720 - 6 Aug 2025
Viewed by 325
Abstract
Computer vision techniques such as three-dimensional digital image correlation (3D-DIC) and three-dimensional point tracking (3D-PT) have demonstrated broad applicability for monitoring the conditions of large-scale engineering systems by reconstructing and tracking dynamic point clouds corresponding to the surface of a structure. Accurate stereophotogrammetry measurements require the stereo cameras to be calibrated to determine their intrinsic and extrinsic parameters by capturing multiple images of a calibration object. This image-based approach becomes cumbersome and time-consuming as the size of the tested object increases. To streamline the calibration and make it scale-insensitive, a multi-sensor system embedding inertial measurement units and a laser sensor is developed to compute the extrinsic parameters of the stereo cameras. In this research, the accuracy of the proposed sensor-based calibration method in performing stereophotogrammetry is validated experimentally and compared with traditional approaches. Tests conducted at various scales reveal that the proposed sensor-based calibration enables reconstructing both static and dynamic point clouds, measuring displacements with an accuracy higher than 95% compared to image-based traditional calibration, while being up to an order of magnitude faster and easier to deploy. The novel approach has broad applications for making static, dynamic, and deformation measurements to transform how large-scale structural health monitoring can be performed. Full article
(This article belongs to the Special Issue New Perspectives on 3D Point Cloud (Third Edition))
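For intuition, the sensor-based idea can be condensed to a few lines: if each camera's orientation is known (for example, from its IMU) and the baseline vector between the cameras is known (for example, from the laser), the relative pose and the stereo projection matrices follow directly. The sketch below is an assumption-laden illustration of that relation, not the paper's calibration procedure.

```python
import numpy as np

def stereo_extrinsics(R1_wc, R2_wc, baseline_world):
    """Relative pose of camera 2 with respect to camera 1 from sensor readings.

    R1_wc, R2_wc   -- world-to-camera rotation matrices, e.g. from each IMU
    baseline_world -- vector from the camera-1 centre to the camera-2 centre,
                      expressed in the world frame (e.g. laser-measured length
                      along a known direction)

    A sketch of the sensor-based idea only; the paper's actual fusion of IMU
    and laser data is more involved.
    """
    R_rel = R2_wc @ R1_wc.T          # rotation taking cam-1 coordinates to cam-2
    t_rel = -R2_wc @ baseline_world  # matching translation
    return R_rel, t_rel

def projection_matrices(K1, K2, R_rel, t_rel):
    """Build the stereo projection matrices used for triangulation."""
    P1 = K1 @ np.hstack([np.eye(3), np.zeros((3, 1))])
    P2 = K2 @ np.hstack([R_rel, t_rel.reshape(3, 1)])
    return P1, P2
```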

13 pages, 4474 KiB  
Article
Imaging on the Edge: Mapping Object Corners and Edges with Stereo X-Ray Tomography
by Zhenduo Shang and Thomas Blumensath
Tomography 2025, 11(8), 84; https://doi.org/10.3390/tomography11080084 - 29 Jul 2025
Viewed by 216
Abstract
Background/Objectives: X-ray computed tomography (XCT) is a powerful tool for volumetric imaging, where three-dimensional (3D) images are generated from a large number of individual X-ray projection images. However, collecting the required number of low-noise projection images is time-consuming, limiting its applicability to scenarios requiring high temporal resolution, such as the study of dynamic processes. Inspired by stereo vision, we previously developed stereo X-ray imaging methods that operate with only two X-ray projections, enabling the 3D reconstruction of point and line fiducial markers at significantly faster temporal resolutions. Methods: Building on our prior work, this paper demonstrates the use of stereo X-ray techniques for 3D reconstruction of sharp object corners, eliminating the need for internal fiducial markers. This is particularly relevant for deformation measurement of manufactured components under load. Additionally, we explore model training using synthetic data when annotated real data is unavailable. Results: We show that the proposed method can reliably reconstruct sharp corners in 3D using only two X-ray projections. The results confirm the method’s applicability to real-world stereo X-ray images without relying on annotated real training datasets. Conclusions: Our approach enables stereo X-ray 3D reconstruction using synthetic training data that mimics key characteristics of real data, thereby expanding the method’s applicability in scenarios with limited training resources. Full article
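Once a corner has been matched across the two projections, its 3D position follows from standard two-view triangulation. A hedged OpenCV sketch is shown below; the correspondence itself, which the paper obtains with a model trained on synthetic data, is simply assumed as input here.

```python
import numpy as np
import cv2

def triangulate_corner(P1, P2, xy1, xy2):
    """Recover the 3D position of a matched corner from two X-ray projections.

    P1, P2   -- 3x4 projection matrices of the two X-ray views
    xy1, xy2 -- pixel coordinates of the same corner in each projection

    Illustrative only: the paper detects corresponding corners with a learned
    model; here the correspondence is given.
    """
    pts1 = np.asarray(xy1, dtype=float).reshape(2, 1)
    pts2 = np.asarray(xy2, dtype=float).reshape(2, 1)
    X_h = cv2.triangulatePoints(P1, P2, pts1, pts2)   # 4x1 homogeneous point
    return (X_h[:3] / X_h[3]).ravel()                 # Euclidean 3D coordinates
```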

37 pages, 55522 KiB  
Article
EPCNet: Implementing an ‘Artificial Fovea’ for More Efficient Monitoring Using the Sensor Fusion of an Event-Based and a Frame-Based Camera
by Orla Sealy Phelan, Dara Molloy, Roshan George, Edward Jones, Martin Glavin and Brian Deegan
Sensors 2025, 25(15), 4540; https://doi.org/10.3390/s25154540 - 22 Jul 2025
Viewed by 362
Abstract
Efficient object detection is crucial to real-time monitoring applications such as autonomous driving or security systems. Modern RGB cameras can produce high-resolution images for accurate object detection. However, increased resolution results in increased network latency and power consumption. To minimise this latency, Convolutional Neural Networks (CNNs) often have a resolution limitation, requiring images to be down-sampled before inference, causing significant information loss. Event-based cameras are neuromorphic vision sensors with high temporal resolution, low power consumption, and high dynamic range, making them preferable to regular RGB cameras in many situations. This project proposes the fusion of an event-based camera with an RGB camera to mitigate the trade-off between temporal resolution and accuracy, while minimising power consumption. The cameras are calibrated to create a multi-modal stereo vision system where pixel coordinates can be projected between the event and RGB camera image planes. This calibration is used to project bounding boxes detected by clustering of events into the RGB image plane, thereby cropping each RGB frame instead of down-sampling to meet the requirements of the CNN. Using the Common Objects in Context (COCO) dataset evaluator, the average precision (AP) for the bicycle class in RGB scenes improved from 21.08 to 57.38. Additionally, AP increased across all classes from 37.93 to 46.89. To reduce system latency, a novel object detection approach is proposed where the event camera acts as a region proposal network, and a classification algorithm is run on the proposed regions. This achieved a 78% improvement over baseline. Full article
(This article belongs to the Section Sensing and Imaging)
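The cropping step described above can be illustrated with a simplified sketch: map the corners of an event-domain bounding box into the RGB image plane and cut out the corresponding patch. A planar homography stands in for the full stereo calibration used in the paper, and the helper name and padding value are illustrative assumptions.

```python
import numpy as np
import cv2

def crop_rgb_from_event_bbox(rgb_frame, bbox_event, H_event_to_rgb, pad=16):
    """Crop the RGB frame around a region proposed by the event camera.

    bbox_event     -- (x1, y1, x2, y2) box obtained by clustering events
    H_event_to_rgb -- 3x3 homography mapping event pixels to RGB pixels
                      (a planar simplification of the stereo calibration
                      described in the paper)
    """
    x1, y1, x2, y2 = bbox_event
    corners = np.float32([[x1, y1], [x2, y1], [x2, y2], [x1, y2]]).reshape(-1, 1, 2)
    mapped = cv2.perspectiveTransform(corners, H_event_to_rgb).reshape(-1, 2)
    h, w = rgb_frame.shape[:2]
    u1 = max(int(mapped[:, 0].min()) - pad, 0)
    v1 = max(int(mapped[:, 1].min()) - pad, 0)
    u2 = min(int(mapped[:, 0].max()) + pad, w)
    v2 = min(int(mapped[:, 1].max()) + pad, h)
    return rgb_frame[v1:v2, u1:u2]   # full-resolution patch for the CNN
```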

15 pages, 1991 KiB  
Article
Hybrid Deep–Geometric Approach for Efficient Consistency Assessment of Stereo Images
by Michał Kowalczyk, Piotr Napieralski and Dominik Szajerman
Sensors 2025, 25(14), 4507; https://doi.org/10.3390/s25144507 - 20 Jul 2025
Viewed by 508
Abstract
We present HGC-Net, a hybrid pipeline for assessing geometric consistency between stereo image pairs. Our method integrates classical epipolar geometry with deep learning components to compute an interpretable scalar score A, reflecting the degree of alignment. Unlike traditional techniques, which may overlook subtle miscalibrations, HGC-Net reliably detects both severe and mild geometric distortions, such as sub-degree tilts and pixel-level shifts. We evaluate the method on the Middlebury 2014 stereo dataset, using synthetically distorted variants to simulate misalignments. Experimental results show that our score degrades smoothly with increasing geometric error and achieves high detection rates even at minimal distortion levels, outperforming baseline approaches based on disparity or calibration checks. The method operates in real time (12.5 fps on 1080p input) and does not require access to internal camera parameters, making it suitable for embedded stereo systems and quality monitoring in robotic and AR/VR applications. The approach also supports explainability via confidence maps and anomaly heatmaps, aiding human operators in identifying problematic regions. Full article
(This article belongs to the Special Issue Feature Papers in Physical Sensors 2025)
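The classical half of such a pipeline can be sketched compactly: match features, fit a fundamental matrix, and convert the residual epipolar (Sampson) error into a bounded score. The score mapping below is an illustrative choice and is not HGC-Net's actual scalar A or its learned components.

```python
import numpy as np
import cv2

def epipolar_consistency_score(img_left, img_right, sigma=1.0):
    """Crude geometric-consistency score for a stereo pair (classical part only)."""
    orb = cv2.ORB_create(2000)
    k1, d1 = orb.detectAndCompute(img_left, None)
    k2, d2 = orb.detectAndCompute(img_right, None)
    matches = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True).match(d1, d2)
    p1 = np.float32([k1[m.queryIdx].pt for m in matches])
    p2 = np.float32([k2[m.trainIdx].pt for m in matches])
    F, mask = cv2.findFundamentalMat(p1, p2, cv2.FM_RANSAC, 1.0, 0.999)
    sel = mask.ravel() == 1
    inl1 = np.c_[p1[sel], np.ones(int(sel.sum()))]   # homogeneous inliers, left
    inl2 = np.c_[p2[sel], np.ones(int(sel.sum()))]   # homogeneous inliers, right
    # Sampson distance of each correspondence to the fitted epipolar geometry
    Fx1 = (F @ inl1.T).T
    Ftx2 = (F.T @ inl2.T).T
    num = np.square(np.sum(inl2 * Fx1, axis=1))
    den = Fx1[:, 0]**2 + Fx1[:, 1]**2 + Ftx2[:, 0]**2 + Ftx2[:, 1]**2
    err = num / den
    # Map the mean error into (0, 1]: 1 means well aligned (our choice of mapping)
    return float(np.exp(-err.mean() / sigma))
```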

17 pages, 610 KiB  
Review
Three-Dimensional Reconstruction Techniques and the Impact of Lighting Conditions on Reconstruction Quality: A Comprehensive Review
by Dimitar Rangelov, Sierd Waanders, Kars Waanders, Maurice van Keulen and Radoslav Miltchev
Lights 2025, 1(1), 1; https://doi.org/10.3390/lights1010001 - 14 Jul 2025
Viewed by 474
Abstract
Three-dimensional (3D) reconstruction has become a fundamental technology in applications ranging from cultural heritage preservation and robotics to forensics and virtual reality. As these applications grow in complexity and realism, the quality of the reconstructed models becomes increasingly critical. Among the many factors that influence reconstruction accuracy, the lighting conditions at capture time remain one of the most influential, yet widely neglected, variables. This review provides a comprehensive survey of classical and modern 3D reconstruction techniques, including Structure from Motion (SfM), Multi-View Stereo (MVS), Photometric Stereo, and recent neural rendering approaches such as Neural Radiance Fields (NeRFs) and 3D Gaussian Splatting (3DGS), while critically evaluating their performance under varying illumination conditions. We describe how lighting-induced artifacts such as shadows, reflections, and exposure imbalances compromise the reconstruction quality and how different approaches attempt to mitigate these effects. Furthermore, we uncover fundamental gaps in current research, including the lack of standardized lighting-aware benchmarks and the limited robustness of state-of-the-art algorithms in uncontrolled environments. By synthesizing knowledge across fields, this review aims to provide a deeper understanding of the interplay between lighting and reconstruction and offers future research directions that emphasize the need for adaptive, lighting-robust solutions in 3D vision systems. Full article

32 pages, 2740 KiB  
Article
Vision-Based Navigation and Perception for Autonomous Robots: Sensors, SLAM, Control Strategies, and Cross-Domain Applications—A Review
by Eder A. Rodríguez-Martínez, Wendy Flores-Fuentes, Farouk Achakir, Oleg Sergiyenko and Fabian N. Murrieta-Rico
Eng 2025, 6(7), 153; https://doi.org/10.3390/eng6070153 - 7 Jul 2025
Viewed by 2193
Abstract
Camera-centric perception has matured into a cornerstone of modern autonomy, from self-driving cars and factory cobots to underwater and planetary exploration. This review synthesizes more than a decade of progress in vision-based robotic navigation through an engineering lens, charting the full pipeline from sensing to deployment. We first examine the expanding sensor palette—monocular and multi-camera rigs, stereo and RGB-D devices, LiDAR–camera hybrids, event cameras, and infrared systems—highlighting the complementary operating envelopes and the rise of learning-based depth inference. The advances in visual localization and mapping are then analyzed, contrasting sparse and dense SLAM approaches, as well as monocular, stereo, and visual–inertial formulations. Additional topics include loop closure, semantic mapping, and LiDAR–visual–inertial fusion, which enables drift-free operation in dynamic environments. Building on these foundations, we review the navigation and control strategies, spanning classical planning, reinforcement and imitation learning, hybrid topological–metric memories, and emerging visual language guidance. Application case studies—autonomous driving, industrial manipulation, autonomous underwater vehicles, planetary rovers, aerial drones, and humanoids—demonstrate how tailored sensor suites and algorithms meet domain-specific constraints. Finally, the future research trajectories are distilled: generative AI for synthetic training data and scene completion; high-density 3D perception with solid-state LiDAR and neural implicit representations; event-based vision for ultra-fast control; and human-centric autonomy in next-generation robots. By providing a unified taxonomy, a comparative analysis, and engineering guidelines, this review aims to inform researchers and practitioners designing robust, scalable, vision-driven robotic systems. Full article
(This article belongs to the Special Issue Interdisciplinary Insights in Engineering Research)

21 pages, 33500 KiB  
Article
Location Research and Picking Experiment of an Apple-Picking Robot Based on Improved Mask R-CNN and Binocular Vision
by Tianzhong Fang, Wei Chen and Lu Han
Horticulturae 2025, 11(7), 801; https://doi.org/10.3390/horticulturae11070801 - 6 Jul 2025
Viewed by 498
Abstract
With the advancement of agricultural automation technologies, apple-harvesting robots have gradually become a focus of research. As their “perceptual core,” machine vision systems directly determine picking success rates and operational efficiency. However, existing vision systems still exhibit significant shortcomings in target detection and positioning accuracy in complex orchard environments (e.g., uneven illumination, foliage occlusion, and fruit overlap), which hinders practical applications. This study proposes a visual system for apple-harvesting robots based on improved Mask R-CNN and binocular vision to achieve more precise fruit positioning. The binocular camera (ZED2i) carried by the robot acquires dual-channel apple images. An improved Mask R-CNN is employed to implement instance segmentation of apple targets in binocular images, followed by a template-matching algorithm with parallel epipolar constraints for stereo matching. Four pairs of feature points from corresponding apples in binocular images are selected to calculate disparity and depth. Experimental results demonstrate average coefficients of variation and positioning accuracy of 5.09% and 99.61%, respectively, in binocular positioning. During harvesting operations with a self-designed apple-picking robot, the single-image processing time was 0.36 s, the average single harvesting cycle duration reached 7.7 s, and the comprehensive harvesting success rate achieved 94.3%. This work presents a novel high-precision visual positioning method for apple-harvesting robots. Full article
(This article belongs to the Section Fruit Production Systems)
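The disparity-to-depth step mentioned in the abstract is the standard rectified-stereo relation Z = f * B / d. A small sketch with made-up numbers; the paper's template matching under parallel epipolar constraints supplies the correspondences, which are assumed here.

```python
def apple_depth(focal_px, baseline_m, left_xs, right_xs):
    """Average depth of an apple from matched feature points in a rectified pair.

    focal_px          -- focal length in pixels
    baseline_m        -- stereo baseline of the binocular camera (metres)
    left_xs, right_xs -- x-coordinates of the same feature points in the left
                         and right images (e.g. four pairs per apple)
    """
    disparities = [xl - xr for xl, xr in zip(left_xs, right_xs)]
    depths = [focal_px * baseline_m / d for d in disparities if d > 0]
    return sum(depths) / len(depths)

# e.g. a ~0.12 m baseline camera, f = 700 px, four matched points per apple
print(apple_depth(700, 0.12, [512, 530, 541, 520], [452, 471, 480, 461]))
```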

18 pages, 4774 KiB  
Article
InfraredStereo3D: Breaking Night Vision Limits with Perspective Projection Positional Encoding and Groundbreaking Infrared Dataset
by Yuandong Niu, Limin Liu, Fuyu Huang, Juntao Ma, Chaowen Zheng, Yunfeng Jiang, Ting An, Zhongchen Zhao and Shuangyou Chen
Remote Sens. 2025, 17(12), 2035; https://doi.org/10.3390/rs17122035 - 13 Jun 2025
Viewed by 501
Abstract
In fields such as military reconnaissance, forest fire prevention, and autonomous driving at night, there is an urgent need for high-precision three-dimensional reconstruction in low-light or night environments. The acquisition of remote sensing data by RGB cameras relies on external light, resulting in a significant decline in image quality and making it difficult to meet the task requirements. Lidar-based methods, in turn, perform poorly in rainy and foggy weather, in close-range scenes, and in scenarios requiring thermal imaging data. In contrast, infrared cameras can effectively overcome this challenge because their imaging mechanisms are different from those of RGB cameras and lidar. However, research on three-dimensional scene reconstruction from infrared images is relatively immature, especially in the field of infrared binocular stereo matching. This situation presents two main challenges: first, there is no dataset specifically for infrared binocular stereo matching; second, the lack of texture information in infrared images limits the extension of RGB-based methods to the infrared reconstruction problem. To solve these problems, this study begins with the construction of an infrared binocular stereo matching dataset and then proposes an innovative perspective projection positional encoding-based transformer method to complete the infrared binocular stereo matching task. In this paper, a stereo matching network combining a transformer with a cost volume is constructed. Existing work on transformer positional encoding usually adopts a parallel projection model to simplify the calculation. Our method is based on the actual perspective projection model so that each pixel is associated with a different projection ray. It effectively solves the problem of feature extraction and matching caused by insufficient texture information in infrared images and significantly improves matching accuracy. We conducted experiments on the infrared binocular stereo matching dataset introduced in this paper; the results demonstrate the effectiveness of the proposed method. Full article
(This article belongs to the Collection Visible Infrared Imaging Radiometers and Applications)
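The geometric ingredient of such an encoding, that each pixel has its own viewing ray r = K^-1 [u, v, 1]^T under a true perspective model, can be computed as follows. How these rays are folded into the transformer's positional encoding is the paper's contribution and is not reproduced here.

```python
import numpy as np

def pixel_ray_directions(K, height, width):
    """Per-pixel viewing rays under a perspective projection model.

    Returns an (H, W, 3) array of unit ray directions r = K^-1 [u, v, 1]^T,
    so that every pixel carries a distinct geometric signature. This is only
    the geometric ingredient, not the paper's encoding scheme.
    """
    u, v = np.meshgrid(np.arange(width), np.arange(height))
    pix_h = np.stack([u, v, np.ones_like(u)], axis=-1).astype(float)  # (H, W, 3)
    rays = pix_h @ np.linalg.inv(K).T                                 # back-project
    return rays / np.linalg.norm(rays, axis=-1, keepdims=True)        # unit vectors
```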

25 pages, 32212 KiB  
Article
Remote Sensing of Seismic Signals via Enhanced Moiré-Based Apparatus Integrated with Active Convolved Illumination
by Adrian A. Moazzam, Anindya Ghoshroy, Durdu Ö. Güney and Roohollah Askari
Remote Sens. 2025, 17(12), 2032; https://doi.org/10.3390/rs17122032 - 12 Jun 2025
Viewed by 702
Abstract
The remote sensing of seismic waves in challenging and hazardous environments, such as active volcanic regions, remains a critical yet unresolved challenge. Conventional methods, including laser Doppler interferometry, InSAR, and stereo vision, are often hindered by atmospheric turbulence or necessitate access to observation sites, significantly limiting their applicability. To overcome these constraints, this study introduces a Moiré-based apparatus augmented with active convolved illumination (ACI). The system leverages the displacement-magnifying properties of Moiré patterns to achieve high precision in detecting subtle ground movements. Additionally, ACI effectively mitigates atmospheric fluctuations, reducing the distortion and alteration of measurement signals caused by these fluctuations. We validated the performance of this integrated solution through over 1900 simulations under diverse turbulence intensities. The results illustrate the synergistic capabilities of the Moiré apparatus and ACI in preserving the fidelity of Moiré fringes, enabling reliable displacement measurements even under conditions where passive methods fail. This study establishes a cost-effective, scalable, and non-invasive framework for remote seismic monitoring, offering transformative potential across geophysics, volcanology, structural analysis, metrology, and other domains requiring precise displacement measurements under extreme conditions. Full article
(This article belongs to the Section Earth Observation Data)
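The displacement-magnifying property mentioned above is, at its core, the textbook two-grating Moiré relation: slightly mismatched grating pitches produce coarse fringes that move much farther than the grating itself. The small numerical sketch below covers that relation only; the apparatus and the ACI scheme go well beyond it.

```python
def moire_magnification(p1, p2):
    """Amplification of a small displacement by a two-grating Moiré pattern.

    Superimposing gratings with pitches p1 and p2 yields fringes with pitch
    p1 * p2 / |p2 - p1|; a displacement d of the p1 grating moves the fringes
    by d * p2 / |p2 - p1|. Textbook relation only, stated here as an
    illustration of why Moiré patterns magnify ground motion.
    """
    fringe_pitch = p1 * p2 / abs(p2 - p1)
    magnification = p2 / abs(p2 - p1)
    return fringe_pitch, magnification

# e.g. 1.00 mm and 1.05 mm gratings: 21 mm fringes, 21x displacement gain
print(moire_magnification(1.00, 1.05))
```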

27 pages, 11130 KiB  
Article
A Dual-Modal Robot Welding Trajectory Generation Scheme for Motion Based on Stereo Vision and Deep Learning
by Xinlei Li, Jiawei Ma, Shida Yao, Guanxin Chi and Guangjun Zhang
Materials 2025, 18(11), 2593; https://doi.org/10.3390/ma18112593 - 1 Jun 2025
Viewed by 762
Abstract
To address the challenges of redundant point cloud processing and insufficient robustness under complex working conditions in existing teaching-free methods, this study proposes a dual-modal perception framework termed “2D image autonomous recognition and 3D point cloud precise planning”, which integrates stereo vision and deep learning. First, an improved U-Net deep learning model is developed, where VGG16 serves as the backbone network and a dual-channel attention module (DAM) is incorporated, achieving robust weld segmentation with a mean intersection over union (mIoU) of 0.887 and an F1-Score of 0.940. Next, the weld centerline is extracted using the Zhang–Suen skeleton refinement algorithm, and weld feature points are obtained through polynomial fitting optimization to establish cross-modal mapping between 2D pixels and 3D point clouds. Finally, a groove feature point extraction algorithm based on improved RANSAC combined with an equal-area weld bead filling strategy is designed to enable multi-layer and multi-bead robot trajectory planning, achieving a mean absolute error (MAE) of 0.238 mm in feature point positioning. Experimental results demonstrate that the method maintains high accuracy under complex working conditions such as noise interference and groove deformation, achieving a system accuracy of 0.208 mm and weld width fluctuation within ±0.15 mm, thereby significantly improving the autonomy and robustness of robot trajectory planning. Full article
(This article belongs to the Section Materials Simulation and Design)
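The centerline-extraction step (skeletonization followed by polynomial fitting) can be sketched with standard library calls; the polynomial degree and the number of sampled points below are illustrative choices, not the paper's settings.

```python
import numpy as np
from skimage.morphology import skeletonize

def weld_centerline(mask, degree=3):
    """Centerline feature points from a binary weld-segmentation mask.

    mask -- boolean/0-1 array produced by the segmentation network

    Thin the mask to a one-pixel skeleton (skimage's Zhang-Suen-style
    thinning) and fit a polynomial y = f(x) through the skeleton pixels to
    obtain smooth centerline points.
    """
    skel = skeletonize(mask.astype(bool))
    ys, xs = np.nonzero(skel)
    coeffs = np.polyfit(xs, ys, degree)           # centerline model
    x_fit = np.linspace(xs.min(), xs.max(), 50)   # 50 evenly spaced samples
    y_fit = np.polyval(coeffs, x_fit)
    return np.stack([x_fit, y_fit], axis=1)       # (N, 2) pixel coordinates
```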

22 pages, 640 KiB  
Review
A Review of Optical-Based Three-Dimensional Reconstruction and Multi-Source Fusion for Plant Phenotyping
by Songhang Li, Zepu Cui, Jiahang Yang and Bin Wang
Sensors 2025, 25(11), 3401; https://doi.org/10.3390/s25113401 - 28 May 2025
Viewed by 1070
Abstract
Amid the rapid development of precision agriculture and plant phenotyping, plant 3D reconstruction technology has become a research hotspot, with widespread applications in plant growth monitoring, pest and disease detection, and smart agricultural equipment. Given the complex geometric and textural characteristics of plants, traditional 2D image analysis methods struggle to meet the modeling requirements, highlighting the growing importance of 3D reconstruction technology. This paper reviews active vision techniques (such as structured light, time-of-flight, and laser scanning methods), passive vision techniques (such as stereo vision and structure from motion), and deep learning-based 3D reconstruction methods (such as NeRF, CNN, and 3DGS). These technologies enhance crop analysis accuracy from multiple perspectives, provide strong support for agricultural production, and significantly promote the development of the field of plant research. Full article
(This article belongs to the Section Smart Agriculture)
