Search Results (300)

Search Parameters:
Keywords = scale-invariant feature transform

24 pages, 1545 KB  
Article
Curvature-Aware Point-Pair Signatures for Robust Unbalanced Point Cloud Registration
by Xinhang Hu, Zhao Zeng, Jiwei Deng, Guangshuai Wang, Jiaqi Yang and Siwen Quan
Sensors 2025, 25(20), 6267; https://doi.org/10.3390/s25206267 - 10 Oct 2025
Viewed by 105
Abstract
Existing point cloud registration methods can effectively handle large-scale and partially overlapping point cloud pairs. However, registering unbalanced point cloud pairs with significant disparities in spatial extent and point density remains a challenging problem that has received limited research attention. This challenge primarily arises from the difficulty in achieving accurate local registration when the point clouds exhibit substantial scale variations and uneven density distributions. This paper presents a novel registration method for unbalanced point cloud pairs that utilizes the local point cluster structure feature for effective outlier rejection. The fundamental principle underlying our method is that the internal structure of a local cluster comprising a point and its K-nearest neighbors maintains rigidity-preserved invariance across different point clouds. The proposed pipeline operates through four sequential stages. First, keypoints are detected in both the source and target point clouds. Second, local feature descriptors are employed to establish initial one-to-many correspondences, a strategy that increases correspondence redundancy and enlarges the pool of potential inliers. Third, the proposed Local Point Cluster Structure Feature is applied to filter outliers from the initial correspondences. Finally, the transformation hypothesis is generated and evaluated through the RANSAC method. To validate the efficacy of the proposed method, we construct a carefully designed benchmark named KITTI-UPP (KITTI-Unbalanced Point cloud Pairs) based on the KITTI odometry dataset. We further evaluate our method on the real-world TIESY dataset, a LiDAR-scanned dataset collected by the Third Railway Survey and Design Institute Group Co. Extensive experiments demonstrate that our method significantly outperforms state-of-the-art methods in terms of both registration success rate and computational efficiency on the KITTI-UPP benchmark. Moreover, it achieves competitive results on the real-world TIESY dataset, confirming its applicability and generalizability across diverse real-world scenarios. Full article
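The rigidity-invariance idea behind such a cluster-structure filter can be sketched in a few lines: a rigid transform preserves all pairwise distances, so the sorted distances from a point to its K nearest neighbors form a signature that should agree across a true correspondence. A minimal illustrative sketch (brute-force neighbors; function names are ours, not the paper's implementation):

```python
import numpy as np

def knn_distances(cloud, idx, k):
    """Sorted distances from point `idx` to its k nearest neighbours (brute force)."""
    d = np.linalg.norm(cloud - cloud[idx], axis=1)
    return np.sort(d)[1:k + 1]          # drop the zero self-distance

def cluster_consistency(src, tgt, i, j, k=5):
    """Discrepancy between the local K-NN distance signatures of a putative
    correspondence src[i] <-> tgt[j]; small for inliers under a rigid motion,
    so large values flag likely outliers."""
    return np.linalg.norm(knn_distances(src, i, k) - knn_distances(tgt, j, k))
```

In a full pipeline, correspondences surviving this filter would then feed RANSAC for transformation hypothesis generation and evaluation.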
(This article belongs to the Section Sensing and Imaging)

11 pages, 4334 KB  
Communication
Real-Time Object Classification via Dual-Pixel Measurement
by Jianing Yang, Ran Chen, Yicheng Peng, Lingyun Zhang, Ting Sun and Fei Xing
Sensors 2025, 25(18), 5886; https://doi.org/10.3390/s25185886 - 20 Sep 2025
Viewed by 365
Abstract
Achieving rapid and accurate object classification holds significant importance in various domains. However, conventional vision-based techniques suffer from several limitations, including high data redundancy and strong dependence on image quality. In this work, we present a high-speed, image-free object classification method based on dual-pixel measurement and normalized central moment invariants. Leveraging the complementary modulation capability of a digital micromirror device (DMD), the proposed system requires only five tailored binary illumination patterns to simultaneously extract geometric features and perform classification. The system can achieve a classification update rate of up to 4.44 kHz, offering significant improvements in both efficiency and accuracy compared to traditional image-based approaches. Numerical simulations verify the robustness of the method under similarity transformations—including translation, scaling, and rotation—while experimental validations further demonstrate reliable performance across diverse object types. This approach enables real-time, low-data throughput, and reconstruction-free classification, offering new potential for optical computing and edge intelligence applications. Full article
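The normalized central moments underlying the classifier are invariant to translation and (for order ≥ 2) to uniform scaling; a minimal sketch of the moment computation itself (the paper's five-pattern DMD measurement scheme is not reproduced here):

```python
import numpy as np

def normalized_central_moment(img, p, q):
    """eta_pq: central moment mu_pq normalised by mu_00^((p+q)/2 + 1).
    Invariant to translation; for p+q >= 2, also invariant to uniform scaling
    of the shape (up to discretization error)."""
    y, x = np.mgrid[:img.shape[0], :img.shape[1]].astype(float)
    m00 = img.sum()
    cx, cy = (x * img).sum() / m00, (y * img).sum() / m00   # centroid
    mu_pq = ((x - cx) ** p * (y - cy) ** q * img).sum()
    return mu_pq / m00 ** ((p + q) / 2 + 1)
```

A feature vector of several such eta_pq values can be compared against class templates to classify without ever reconstructing an image.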

25 pages, 12760 KB  
Article
Intelligent Face Recognition: Comprehensive Feature Extraction Methods for Holistic Face Analysis and Modalities
by Thoalfeqar G. Jarullah, Ahmad Saeed Mohammad, Musab T. S. Al-Kaltakchi and Jabir Alshehabi Al-Ani
Signals 2025, 6(3), 49; https://doi.org/10.3390/signals6030049 - 19 Sep 2025
Viewed by 699
Abstract
Face recognition technology utilizes unique facial features to analyze and compare individuals for identification and verification purposes. This technology is crucial for several reasons, such as improving security and authentication, effectively verifying identities, providing personalized user experiences, and automating various operations, including attendance monitoring, access management, and law enforcement activities. In this paper, comprehensive evaluations are conducted using different face detection and modality segmentation methods, feature extraction methods, and classifiers to improve system performance. As for face detection, four methods are proposed: OpenCV’s Haar Cascade classifier, Dlib’s HOG + SVM frontal face detector, Dlib’s CNN face detector, and Mediapipe’s face detector. Additionally, two types of feature extraction techniques are proposed: hand-crafted features (traditional methods: global and local features) and deep learning features. Three global features were extracted: Scale-Invariant Feature Transform (SIFT), Speeded-Up Robust Features (SURF), and Global Image Structure (GIST). Likewise, the following local feature methods are utilized: Local Binary Pattern (LBP), Weber Local Descriptor (WLD), and Histogram of Oriented Gradients (HOG). On the other hand, the deep learning-based features fall into two categories: convolutional neural networks (CNNs), including VGG16, VGG19, and VGG-Face, and Siamese neural networks (SNNs), which generate face embeddings. For classification, three methods are employed: Support Vector Machine (SVM), a one-class SVM variant, and Multilayer Perceptron (MLP). The system is evaluated on three datasets: in-house, Labelled Faces in the Wild (LFW), and the Pins dataset (sourced from Pinterest), providing comprehensive benchmark comparisons for facial recognition research. The best accuracy among the ten proposed feature extraction methods on the in-house database for the facial recognition task was 99.8%, achieved by the VGG16 model combined with the SVM classifier. Full article
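As one example of the hand-crafted local features listed above, the basic 3×3 Local Binary Pattern fits in a few lines (an illustrative sketch; library implementations such as scikit-image's `local_binary_pattern` add rotation-invariant and uniform variants):

```python
import numpy as np

def lbp(img):
    """Basic 3x3 Local Binary Pattern: each pixel's 8 neighbours are compared
    to the centre (>=) and the results packed into one byte per pixel.
    Border pixels are skipped, so output is (h-2, w-2)."""
    h, w = img.shape
    out = np.zeros((h - 2, w - 2), dtype=np.uint8)
    c = img[1:-1, 1:-1]
    # neighbour offsets, clockwise from top-left
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
               (1, 1), (1, 0), (1, -1), (0, -1)]
    for bit, (dy, dx) in enumerate(offsets):
        n = img[1 + dy:h - 1 + dy, 1 + dx:w - 1 + dx]
        out |= (n >= c).astype(np.uint8) << bit
    return out
```

The 256-bin histogram of the LBP codes (`np.bincount(out.ravel(), minlength=256)`) is the texture descriptor typically fed to a classifier such as an SVM.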

16 pages, 7958 KB  
Article
Development and Evaluation of a Keypoint-Based Video Stabilization Pipeline for Oral Capillaroscopy
by Vito Gentile, Vincenzo Taormina, Luana Conte, Giorgio De Nunzio, Giuseppe Raso and Donato Cascio
Sensors 2025, 25(18), 5738; https://doi.org/10.3390/s25185738 - 15 Sep 2025
Viewed by 455
Abstract
Capillaroscopy imaging is a non-invasive technique used to examine the microcirculation of the oral mucosa. However, the acquired video sequences are often affected by motion noise and shaking, which can compromise diagnostic accuracy and hinder the development of automated systems for capillary identification and segmentation. To address these challenges, we implemented a comprehensive video stabilization model, structured as a multi-phase pipeline and visually represented through a flow-chart. The proposed method integrates keypoint extraction, optical flow estimation, and affine transformation-based frame alignment to enhance video stability. Within this framework, we evaluated the performance of three keypoint extraction algorithms—Scale-Invariant Feature Transform (SIFT), Oriented FAST and Rotated BRIEF (ORB) and Good Features to Track (GFTT)—on a curated dataset of oral capillaroscopy videos. To simulate real-world acquisition conditions, synthetic tremors were introduced via Gaussian affine transformations. Experimental results demonstrate that all three algorithms yield comparable stabilization performance, with GFTT offering slightly higher structural fidelity and ORB excelling in computational efficiency. These findings validate the effectiveness of the proposed model and highlight its potential for improving the quality and reliability of oral videocapillaroscopy imaging. Experimental evaluation showed that the proposed pipeline achieved an average SSIM of 0.789 and reduced jitter to 25.8, compared to the perturbed input sequences. In addition, path smoothness and RMS errors (translation and rotation) consistently indicated improved stabilization across all tested feature extractors. Compared to previous stabilization approaches in nailfold capillaroscopy, our method achieved comparable or superior structural fidelity while maintaining computational efficiency. Full article
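The affine frame-alignment step at the core of such a stabilization pipeline reduces to a least-squares fit of a 2×3 affine matrix to matched keypoint coordinates; a minimal sketch (keypoint detection and optical flow omitted, function names ours):

```python
import numpy as np

def estimate_affine(src, dst):
    """Least-squares 2D affine transform A (2x3) mapping src -> dst,
    from matched keypoint coordinates (N x 2 arrays, N >= 3)."""
    n = len(src)
    X = np.hstack([src, np.ones((n, 1))])        # homogeneous coords, N x 3
    A, *_ = np.linalg.lstsq(X, dst, rcond=None)  # solves X @ A ~= dst, A is 3 x 2
    return A.T                                   # conventional 2 x 3 layout

def warp(points, A):
    """Apply a 2x3 affine transform to N x 2 points."""
    return points @ A[:, :2].T + A[:, 2]
```

In the full pipeline, the per-frame affine estimates would be smoothed over time and each frame warped by the correction transform (e.g. with `cv2.warpAffine`).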
(This article belongs to the Special Issue Biomedical Signals, Images and Healthcare Data Analysis: 2nd Edition)

23 pages, 5635 KB  
Article
Attention-Based Transfer Enhancement Network for Cross-Corpus EEG Emotion Recognition
by Zongni Li, Kin-Yeung Wong and Chan-Tong Lam
Sensors 2025, 25(18), 5718; https://doi.org/10.3390/s25185718 - 13 Sep 2025
Viewed by 545
Abstract
A critical challenge in EEG-based emotion recognition is the poor generalization of models across different datasets due to significant domain shifts. Traditional methods struggle because they either overfit to source-domain characteristics or fail to bridge large discrepancies between datasets. To address this, we propose the Cross-corpus Attention-based Transfer Enhancement network (CATE), a novel two-stage framework. The core novelty of CATE lies in its dual-view self-supervised pre-training strategy, which learns robust, domain-invariant representations by approaching the problem from two complementary perspectives. Unlike single-view models that capture an incomplete picture, our framework synergistically combines: (1) Noise-Enhanced Representation Modeling (NERM), which builds resilience to domain-specific artifacts and noise, and (2) Wavelet Transform Representation Modeling (WTRM), which captures the essential, multi-scale spectral patterns fundamental to emotion. This dual approach moves beyond the brittle assumptions of traditional domain adaptation, which often fails when domains are too dissimilar. In the second stage, a supervised fine-tuning process adapts these powerful features for classification using attention-based mechanisms. Extensive experiments on six transfer tasks across the SEED, SEED-IV, and SEED-V datasets demonstrate that CATE establishes a new state-of-the-art, achieving accuracies from 68.01% to 81.65% and outperforming prior methods by up to 15.65 percentage points. By effectively learning transferable features from these distinct, synergistic views, CATE provides a robust framework that significantly advances the practical applicability of cross-corpus EEG emotion recognition. Full article
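To illustrate the two pre-training views in the simplest possible terms, here is a toy sketch producing a noise-perturbed view and a single-level Haar wavelet view of a 1-D signal. The function name `dual_views` and every parameter are illustrative assumptions, not the paper's NERM/WTRM architecture:

```python
import numpy as np

def dual_views(signal, noise_std=0.1, seed=0):
    """Two self-supervised 'views' of an EEG-like 1-D signal (even length):
    a noise-perturbed copy, and a single-level Haar wavelet decomposition
    into approximation (coarse trend) and detail (fine structure) bands."""
    rng = np.random.default_rng(seed)
    noisy = signal + rng.normal(0.0, noise_std, size=signal.shape)
    even, odd = signal[0::2], signal[1::2]
    approx = (even + odd) / np.sqrt(2)    # low-pass band
    detail = (even - odd) / np.sqrt(2)    # high-pass band
    return noisy, (approx, detail)
```

The Haar transform is orthogonal, so the wavelet view loses no information; both views of the same trial can serve as positive pairs for representation learning.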
(This article belongs to the Section Intelligent Sensors)

28 pages, 587 KB  
Article
The Lyra–Schwarzschild Spacetime
by M. C. Bertin, R. R. Cuzinatto, J. A. Paquiyauri and B. M. Pimentel
Universe 2025, 11(9), 315; https://doi.org/10.3390/universe11090315 - 12 Sep 2025
Viewed by 414
Abstract
In this paper, we provide a complete analysis of the most general spherical solution of the Lyra scalar-tensor (LyST) gravitational theory based on the proper definition of a Lyra manifold. Lyra’s geometry features the metric tensor and a scale function as fundamental fields, resulting in generalizations of geometrical quantities such as the affine connection, curvature, torsion, and non-metricity. A proper action is defined considering the correct invariant volume element and the scalar curvature, obeying the symmetry of Lyra’s reference frame transformations and resulting in a generalization of the Einstein–Hilbert action. The LyST gravity assumes zero torsion in a four-dimensional metric-compatible spacetime. In this work, geometrical quantities are presented and solved via Cartan’s technique for a spherically symmetric line element. Birkhoff’s theorem is demonstrated so that the solution is proven to be static, resulting in the Lyra–Schwarzschild metric, which depends on both the geometrical mass (through a modified version of the Schwarzschild radius rS) and an integration constant dubbed the Lyra radius rL. We study particle and light motion in Lyra–Schwarzschild spacetime using the Hamilton–Jacobi method. The motion of massive particles includes the determination of the rISCO and the periastron shift. The study of massless particle motion shows the last photon’s unstable orbit. Gravitational redshift in Lyra–Schwarzschild spacetime is also reviewed. We find a coordinate transformation that casts Lyra–Schwarzschild spacetime in the form of the standard Schwarzschild metric; the physical consequences of this fact are discussed. Full article
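For orientation, the standard Schwarzschild line element that the Lyra–Schwarzschild solution generalizes is shown below; the paper's metric modifies the Schwarzschild radius rS and introduces an additional integration constant, the Lyra radius rL, whose exact functional form is not reproduced here:

```latex
ds^2 = -\left(1 - \frac{r_S}{r}\right)c^2\,dt^2
       + \left(1 - \frac{r_S}{r}\right)^{-1}dr^2
       + r^2\left(d\theta^2 + \sin^2\theta\,d\varphi^2\right),
\qquad r_S = \frac{2GM}{c^2}.
```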
(This article belongs to the Section Gravitation)

17 pages, 3935 KB  
Article
Markerless Force Estimation via SuperPoint-SIFT Fusion and Finite Element Analysis: A Sensorless Solution for Deformable Object Manipulation
by Qingqing Xu, Ruoyang Lai and Junqing Yin
Biomimetics 2025, 10(9), 600; https://doi.org/10.3390/biomimetics10090600 - 8 Sep 2025
Viewed by 467
Abstract
Contact-force perception is a critical component of safe robotic grasping. With the rapid advances in embodied intelligence technology, humanoid robots have enhanced their multimodal perception capabilities. Conventional force sensors face limitations, such as complex spatial arrangements, installation challenges at multiple nodes, and potential interference with robotic flexibility. Consequently, these conventional sensors are unsuitable for biomimetic robot requirements in object perception, natural interaction, and agile movement. Therefore, this study proposes a sensorless external force detection method that integrates SuperPoint-Scale Invariant Feature Transform (SIFT) feature extraction with finite element analysis to address force perception challenges. A visual analysis method based on the SuperPoint-SIFT feature fusion algorithm was implemented to reconstruct a three-dimensional displacement field of the target object. Subsequently, the displacement field was mapped to the contact force distribution using finite element modeling. Experimental results demonstrate a mean force estimation error of 7.60% (isotropic) and 8.15% (anisotropic), with RMSE < 8%, validated by flexible pressure sensors. To enhance the model’s reliability, a dual-channel video comparison framework was developed. By analyzing the consistency of the deformation patterns and mechanical responses between the actual compression and finite element simulation video keyframes, the proposed approach provides a novel solution for real-time force perception in robotic interactions. The proposed solution is suitable for applications such as precision assembly and medical robotics, where sensorless force feedback is crucial. Full article
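At its core, the finite-element mapping from a measured displacement field to contact forces is the linear relation f = K u. A toy 1-D spring-chain analogue makes the idea concrete (the paper's 3-D contact model is far richer; names here are illustrative):

```python
import numpy as np

def assemble_stiffness(k_springs):
    """Global stiffness matrix K for a 1-D chain of linear springs connecting
    nodes 0..n. Each element contributes k * [[1, -1], [-1, 1]] to its two
    nodes. K is singular until boundary conditions are imposed."""
    n = len(k_springs) + 1
    K = np.zeros((n, n))
    for e, k in enumerate(k_springs):
        K[e:e + 2, e:e + 2] += k * np.array([[1.0, -1.0], [-1.0, 1.0]])
    return K

def forces_from_displacements(k_springs, u):
    """Nodal forces consistent with a measured displacement field: f = K u."""
    return assemble_stiffness(k_springs) @ u
```

In the vision-based setting, `u` would come from the reconstructed 3-D displacement field of the deformed object rather than from sensors.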
(This article belongs to the Special Issue Bio-Inspired Intelligent Robot)

24 pages, 4538 KB  
Article
CNN–Transformer-Based Model for Maritime Blurred Target Recognition
by Tianyu Huang, Chao Pan, Jin Liu and Zhiwei Kang
Electronics 2025, 14(17), 3354; https://doi.org/10.3390/electronics14173354 - 23 Aug 2025
Viewed by 485
Abstract
In maritime blurred image recognition, ship collision accidents frequently result from three primary blur types: (1) motion blur from vessel movement in complex sea conditions, (2) defocus blur due to water vapor refraction, and (3) scattering blur caused by sea fog interference. This paper proposes a dual-branch recognition method specifically designed for motion blur, the most prevalent blur type in maritime scenarios. Conventional approaches exhibit constrained computational efficiency and limited adaptability across different modalities. To overcome these limitations, we propose a hybrid CNN–Transformer architecture: the CNN branch captures local blur characteristics, while the enhanced Transformer module models long-range dependencies via attention mechanisms. The CNN branch employs a lightweight ResNet variant in which conventional residual blocks are replaced with the Multi-Scale Gradient-Aware Residual Block (MSG-ARB). This architecture employs learnable gradient convolution for explicit local gradient feature extraction and utilizes gradient content gating to strengthen blur-sensitive region representation, significantly improving computational efficiency compared to conventional CNNs. The Transformer branch incorporates a Hierarchical Swin Transformer (HST) framework with Shifted Window-based Multi-head Self-Attention for global context modeling. The proposed method incorporates blur-invariant Positional Encoding (PE) to enhance blur spectrum modeling capability, while employing a DyT (Dynamic Tanh) module with learnable α parameters to replace traditional normalization layers. This architecture achieves a significant reduction in computational costs while preserving feature representation quality. Moreover, it efficiently computes long-range image dependencies using a compact 16 × 16 window configuration. The proposed feature fusion module synergistically integrates CNN-based local feature extraction with Transformer-enabled global representation learning, achieving comprehensive feature modeling across different scales. To evaluate the model’s performance and generalization ability, we conducted comprehensive experiments on four benchmark datasets: VAIS, GoPro, Mini-ImageNet, and Open Images V4. Experimental results show that our method achieves superior classification accuracy compared to state-of-the-art approaches, while simultaneously enhancing inference speed and reducing GPU memory consumption. Ablation studies confirm that the DyT module effectively suppresses outliers and improves computational efficiency, particularly when processing low-quality input data. Full article
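The DyT module mentioned above follows the Dynamic Tanh formulation y = γ · tanh(αx) + β, proposed in the literature as a drop-in replacement for normalization layers; a minimal element-wise sketch (parameter shapes and defaults are illustrative, and in a real network α, γ, β are learnable per-channel tensors):

```python
import numpy as np

def dyt(x, alpha=0.5, gamma=1.0, beta=0.0):
    """Dynamic Tanh: y = gamma * tanh(alpha * x) + beta. The tanh squashes
    outliers into a bounded range (explaining the outlier suppression seen in
    the ablations), while alpha controls the width of the active region."""
    return gamma * np.tanh(alpha * x) + beta
```

Because the output is bounded by |gamma| + |beta| regardless of input magnitude, no batch statistics are needed, which avoids normalization's runtime cost.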

21 pages, 3448 KB  
Article
A Welding Defect Detection Model Based on Hybrid-Enhanced Multi-Granularity Spatiotemporal Representation Learning
by Chenbo Shi, Shaojia Yan, Lei Wang, Changsheng Zhu, Yue Yu, Xiangteng Zang, Aiping Liu, Chun Zhang and Xiaobing Feng
Sensors 2025, 25(15), 4656; https://doi.org/10.3390/s25154656 - 27 Jul 2025
Viewed by 905
Abstract
Real-time quality monitoring using molten pool images is a critical focus in researching high-quality, intelligent automated welding. To address interference problems in molten pool images under complex welding scenarios (e.g., reflected laser spots from spatter misclassified as porosity defects) and the limited interpretability of deep learning models, this paper proposes a multi-granularity spatiotemporal representation learning algorithm based on the hybrid enhancement of handcrafted and deep learning features. A MobileNetV2 backbone network integrated with a Temporal Shift Module (TSM) is designed to progressively capture the short-term dynamic features of the molten pool and integrate temporal information across both low-level and high-level features. A multi-granularity attention-based feature aggregation module is developed to select key interference-free frames using cross-frame attention, generate multi-granularity features via grouped pooling, and apply the Convolutional Block Attention Module (CBAM) at each granularity level. Finally, these multi-granularity spatiotemporal features are adaptively fused. Meanwhile, an independent branch utilizes the Histogram of Oriented Gradient (HOG) and Scale-Invariant Feature Transform (SIFT) features to extract long-term spatial structural information from historical edge images, enhancing the model’s interpretability. The proposed method achieves an accuracy of 99.187% on a self-constructed dataset. Additionally, it attains a real-time inference speed of 20.983 ms per sample on a hardware platform equipped with an Intel i9-12900H CPU and an RTX 3060 GPU, thus effectively balancing accuracy, speed, and interpretability. Full article
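The HOG component of the handcrafted branch reduces, in its simplest form, to a magnitude-weighted histogram of unsigned gradient orientations. A whole-image sketch without the usual cell/block normalization structure (illustrative only):

```python
import numpy as np

def gradient_orientation_histogram(img, bins=9):
    """Core of HOG: histogram of unsigned gradient orientations (0-180 deg),
    weighted by gradient magnitude, over the whole image."""
    gy, gx = np.gradient(img.astype(float))
    mag = np.hypot(gx, gy)
    ang = np.rad2deg(np.arctan2(gy, gx)) % 180.0   # fold to unsigned orientation
    hist, _ = np.histogram(ang, bins=bins, range=(0, 180), weights=mag)
    return hist / (hist.sum() + 1e-12)             # L1-normalised descriptor
```

Full HOG computes such histograms per cell and normalizes over overlapping blocks; concatenating them yields the descriptor used alongside SIFT in the interpretable branch.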
(This article belongs to the Topic Applied Computing and Machine Intelligence (ACMI))

18 pages, 774 KB  
Article
Bayesian Inertia Estimation via Parallel MCMC Hammer in Power Systems
by Weidong Zhong, Chun Li, Minghua Chu, Yuanhong Che, Shuyang Zhou, Zhi Wu and Kai Liu
Energies 2025, 18(15), 3905; https://doi.org/10.3390/en18153905 - 22 Jul 2025
Viewed by 317
Abstract
The stability of modern power systems has become critically dependent on precise inertia estimation of synchronous generators, particularly as renewable energy integration fundamentally transforms grid dynamics. Increasing penetration of converter-interfaced renewable resources reduces system inertia, heightening the grid’s susceptibility to transient disturbances and creating significant technical challenges in maintaining operational reliability. This paper addresses these challenges through a novel Bayesian inference framework that synergistically integrates PMU data with an advanced MCMC sampling technique, specifically employing the Affine-Invariant Ensemble Sampler. The proposed methodology establishes a probabilistic estimation paradigm that systematically combines prior engineering knowledge with real-time measurements, while the Affine-Invariant Ensemble Sampler mechanism overcomes high-dimensional computational barriers through its unique ensemble-based exploration strategy featuring stretch moves and parallel walker coordination. The framework’s ability to provide full posterior distributions of inertia parameters, rather than single-point estimates, aids stability assessment in renewable-dominated grids. Simulation results on the IEEE 39-bus and 68-bus benchmark systems validate the effectiveness and scalability of the proposed method, with inertia estimation errors consistently maintained below 1% across all generators. Moreover, the parallelized implementation of the algorithm significantly outperforms the conventional M-H method in computational efficiency. Specifically, the proposed approach reduces execution time by approximately 52% in the 39-bus system and by 57% in the 68-bus system, demonstrating its suitability for real-time and large-scale power system applications. Full article
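The affine-invariant sampler's stretch move (Goodman and Weare, as popularized by the emcee library) is compact enough to sketch. This serial toy version updates walkers one at a time and omits the split-ensemble scheme that real implementations use for the parallelism the paper exploits:

```python
import numpy as np

def stretch_move_sampler(log_prob, walkers, n_steps, a=2.0, seed=0):
    """Minimal affine-invariant ensemble MCMC via Goodman-Weare stretch moves.
    walkers: (n_walkers, dim) array of start states. Returns the chain of
    ensemble states, shape (n_steps, n_walkers, dim)."""
    rng = np.random.default_rng(seed)
    w = walkers.copy()
    n, d = w.shape
    chain = []
    for _ in range(n_steps):
        for k in range(n):
            j = rng.integers(n - 1)
            if j >= k:
                j += 1                                  # complementary walker != k
            z = (1 + (a - 1) * rng.random()) ** 2 / a   # z ~ g(z) proportional to 1/sqrt(z)
            y = w[j] + z * (w[k] - w[j])                # stretch along the line
            log_acc = (d - 1) * np.log(z) + log_prob(y) - log_prob(w[k])
            if np.log(rng.random()) < log_acc:
                w[k] = y
        chain.append(w.copy())
    return np.array(chain)
```

Because proposals are built from the ensemble's own geometry, the sampler is invariant to affine rescalings of the posterior, which is what makes it effective on correlated, poorly scaled inertia parameters.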

25 pages, 3106 KB  
Article
Multifractal-Aware Convolutional Attention Synergistic Network for Carbon Market Price Forecasting
by Liran Wei, Mingzhu Tang, Na Li, Jingwen Deng, Xinpeng Zhou and Haijun Hu
Fractal Fract. 2025, 9(7), 449; https://doi.org/10.3390/fractalfract9070449 - 7 Jul 2025
Viewed by 665
Abstract
Accurate carbon market price prediction is crucial for promoting a low-carbon economy and sustainable engineering. Traditional models often face challenges in effectively capturing the multifractality inherent in carbon market prices. Inspired by the self-similarity and scale invariance inherent in fractal structures, this study proposes a novel multifractal-aware model, MF-Transformer-DEC, for carbon market price prediction. The multi-scale convolution (MSC) module employs multi-layer dilated convolutions constrained by shared convolution kernel weights to construct a scale-invariant convolutional network. By projecting and reconstructing time series data within a multi-scale fractal space, MSC enhances the model’s ability to adapt to complex nonlinear fluctuations while significantly suppressing noise interference. The fractal attention (FA) module calculates similarity matrices within a multi-scale feature space through multi-head attention, adaptively integrating multifractal market dynamics and implicit associations. The dynamic error correction (DEC) module models error commonality through a variational autoencoder (VAE), and uncertainty-guided dynamic weighting achieves robust error correction. The proposed model achieved an average R² of 0.9777 and 0.9942 for 7-step-ahead predictions on the Shanghai and Guangdong carbon price datasets, respectively. This study pioneers the interdisciplinary integration of fractal theory and artificial intelligence methods for complex engineering analysis, enhancing the accuracy of carbon market price prediction. The proposed technical pathway of “multi-scale deconstruction and similarity mining” offers a valuable reference for AI-driven fractal modeling. Full article

26 pages, 92114 KB  
Article
Multi-Modal Remote Sensing Image Registration Method Combining Scale-Invariant Feature Transform with Co-Occurrence Filter and Histogram of Oriented Gradients Features
by Yi Yang, Shuo Liu, Haitao Zhang, Dacheng Li and Ling Ma
Remote Sens. 2025, 17(13), 2246; https://doi.org/10.3390/rs17132246 - 30 Jun 2025
Viewed by 1074
Abstract
Multi-modal remote sensing images often exhibit complex and nonlinear radiation differences, which significantly hinder the performance of traditional feature-based image registration methods such as Scale-Invariant Feature Transform (SIFT). In contrast, structural features—such as edges and contours—remain relatively consistent across modalities. To address this challenge, we propose a novel multi-modal image registration method, Cof-SIFT, which integrates a co-occurrence filter with SIFT. By replacing the traditional Gaussian filter with a co-occurrence filter, Cof-SIFT effectively suppresses texture variations while preserving structural information, thereby enhancing robustness to cross-modal differences. To further improve image registration accuracy, we introduce an extended approach, Cof-SIFT_HOG, which extracts Histogram of Oriented Gradients (HOG) features from the image gradient magnitude map of corresponding points and refines their positions based on HOG similarity. This refinement yields more precise alignment between the reference image and the image to be registered. We evaluated Cof-SIFT and Cof-SIFT_HOG on a diverse set of multi-modal remote sensing image pairs. The experimental results demonstrate that both methods outperform existing approaches, including SIFT, COFSM, SAR-SIFT, PSO-SIFT, and OS-SIFT, in terms of robustness and registration accuracy. Notably, Cof-SIFT_HOG achieves the highest overall performance, confirming the effectiveness of the proposed structure-preserving and corresponding-point location refinement strategies in cross-modal registration tasks. Full article

27 pages, 86462 KB  
Article
SAR Image Registration Based on SAR-SIFT and Template Matching
by Shichong Liu, Xiaobo Deng, Chun Liu and Yongchao Cheng
Remote Sens. 2025, 17(13), 2216; https://doi.org/10.3390/rs17132216 - 27 Jun 2025
Viewed by 652
Abstract
Accurate image registration is essential for synthetic aperture radar (SAR) applications such as change detection, image fusion, and deformation monitoring. However, SAR image registration faces challenges including speckle noise, low-texture regions, and the geometric transformation caused by topographic relief due to side-looking radar imaging. To address these issues, this paper proposes a novel two-stage registration method, consisting of pre-registration and fine registration. In the pre-registration stage, the scale-invariant feature transform for the synthetic aperture radar (SAR-SIFT) algorithm is integrated into an iterative optimization framework to eliminate large-scale geometric discrepancies, ensuring a coarse but reliable initial alignment. In the fine registration stage, a novel similarity measure is introduced by combining frequency-domain phase congruency and spatial-domain gradient features, which enhances the robustness and accuracy of template matching, especially in edge-rich regions. For the topographic relief in the SAR images, an adaptive local stretching transformation strategy is proposed to correct the undulating areas. Experiments on five pairs of SAR images containing flat and undulating regions show that the proposed method achieves initial alignment errors below 10 pixels and final registration errors below 1 pixel. Compared with other methods, our approach obtains more correct matching pairs (up to 100+ per image pair), higher registration precision, and improved robustness under complex terrains. These results validate the accuracy and effectiveness of the proposed registration framework. Full article
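The template-matching stage of such a fine-registration step is commonly built on normalized cross-correlation (NCC); a brute-force sketch is below. Note the paper's actual similarity measure additionally combines frequency-domain phase congruency with spatial gradient features, which this sketch does not attempt:

```python
import numpy as np

def ncc_match(image, template):
    """Exhaustive normalised cross-correlation template matching. Returns the
    (row, col) of the best-matching top-left corner and its NCC score in
    [-1, 1]; NCC is invariant to linear intensity changes."""
    th, tw = template.shape
    t = template - template.mean()
    tn = np.linalg.norm(t)
    best, best_pos = -np.inf, (0, 0)
    for r in range(image.shape[0] - th + 1):
        for c in range(image.shape[1] - tw + 1):
            win = image[r:r + th, c:c + tw]
            w = win - win.mean()
            denom = np.linalg.norm(w) * tn
            score = (w * t).sum() / denom if denom > 0 else 0.0
            if score > best:
                best, best_pos = score, (r, c)
    return best_pos, best
```

Production code would compute this in the frequency domain (or with integral images) rather than by the double loop shown here.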
23 pages, 1208 KB  
Article
UCrack-DA: A Multi-Scale Unsupervised Domain Adaptation Method for Surface Crack Segmentation
by Fei Deng, Shaohui Yang, Bin Wang, Xiujun Dong and Siyuan Tian
Remote Sens. 2025, 17(12), 2101; https://doi.org/10.3390/rs17122101 - 19 Jun 2025
Viewed by 887
Abstract
Surface cracks serve as early warning signals for potential geological hazards, and their precise segmentation is crucial for disaster risk assessment. Due to differences in acquisition conditions and the diversity of crack morphology, scale, and surface texture, there is a significant domain shift between different crack datasets, necessitating transfer training. However, in real work areas, the sparse distribution of cracks results in a limited number of samples, and the difficulty of crack annotation makes it highly inefficient to use a high proportion of annotated samples for transfer training to predict the remaining samples. Domain adaptation methods can achieve transfer training without relying on manual annotation, but traditional domain adaptation methods struggle to address the characteristics of cracks effectively. To address this issue, we propose an unsupervised domain adaptation method for crack segmentation. By employing a hierarchical adversarial mechanism and a prediction entropy minimization constraint, we extract domain-invariant features in a multi-scale feature space and sharpen decision boundaries. Additionally, by integrating a Mix-Transformer encoder, a multi-scale dilated attention module, and a mixed convolutional attention decoder, we effectively address the challenges of cross-domain data distribution differences and complex-scene crack segmentation. Experimental results show that UCrack-DA achieves superior performance compared to existing methods on both the Roboflow-Crack and UAV-Crack datasets, with significant improvements in metrics such as mIoU, mPA, and Accuracy. In UAV images captured in field scenarios, the model demonstrates excellent segmentation accuracy for multi-scale and multi-morphology cracks, validating its practical application value in geological hazard monitoring. Full article
(This article belongs to the Section AI Remote Sensing)
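The prediction entropy minimization constraint mentioned in this abstract penalizes uncertain per-pixel class distributions so that decision boundaries fall in low-density regions. The paper's loss is not given here; the minimal NumPy sketch below shows the standard form of such a loss (the tensor shapes and the 1e-12 stabilizer are my own assumptions, not from the paper).

```python
import numpy as np

def softmax(logits, axis=-1):
    # Numerically stable softmax over the class axis.
    z = logits - logits.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def entropy_loss(logits):
    # Mean per-pixel Shannon entropy of the predicted class distribution.
    # Minimizing this pushes each pixel's prediction toward one class,
    # sharpening decision boundaries on unlabeled target-domain images.
    p = softmax(logits)
    return float(-(p * np.log(p + 1e-12)).sum(axis=-1).mean())
```

For a (H, W, C) logit map, uniform logits give the maximum entropy log C, and the loss shrinks toward zero as the per-pixel predictions become confident; in training this term would be added to the adversarial objective with a small weight.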
14 pages, 2575 KB  
Article
Speckle Noise Removal in OCT Images via Wavelet Transform and DnCNN
by Fangfang Li, Qizhou Wu, Bei Jia and Zhicheng Yang
Appl. Sci. 2025, 15(12), 6557; https://doi.org/10.3390/app15126557 - 11 Jun 2025
Viewed by 1287
Abstract
(1) Background: Due to its imaging principle, OCT generates images laden with significant speckle noise. The quality of OCT images is a crucial factor influencing diagnostic effectiveness, highlighting the importance of OCT image denoising. (2) Methods: The OCT image undergoes a Discrete Wavelet Transform (DWT) to decompose it into multiple scales, isolating high-frequency wavelet coefficients that encapsulate fine texture details. These high-frequency coefficients are further processed using a Shift-Invariant Wavelet Transform (SWT) to generate an additional set of coefficients, ensuring enhanced feature preservation and reduced artifacts. Both the original DWT high-frequency coefficients and their SWT-transformed counterparts are independently denoised using a Denoising Convolutional Neural Network (DnCNN). This dual-pathway approach leverages the complementary strengths of both transform domains to suppress noise effectively. The denoised outputs from the two pathways are fused using a correlation-based strategy. This step ensures the optimal integration of texture features by weighting the contributions of each pathway according to their correlation with the original image, preserving critical diagnostic information. Finally, the Inverse Wavelet Transform is applied to the fused coefficients to reconstruct the denoised OCT image in the spatial domain. This reconstruction step maintains structural integrity and enhances diagnostic clarity by preserving essential spatial features. (3) Results: The MSE, PSNR, and SSIM indices of the proposed algorithm were 4.9052, 44.8603, and 0.9514, respectively, achieving commendable results compared to other algorithms. The Sobel, Prewitt, and Canny operators were utilized for edge detection on images, which validated the enhancement effect of the proposed algorithm on image edges.
(4) Conclusions: The proposed algorithm in this paper exhibits an exceptional performance in noise suppression and detail preservation, demonstrating its potential application in OCT image denoising. Future research can further explore the adaptability and optimization directions of this algorithm in complex noise environments, aiming to provide more theoretical support and practical evidence for enhancing OCT image quality. Full article
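As a rough illustration of the wavelet side of this pipeline, the sketch below performs a one-level 2-D Haar decomposition, processes only the three high-frequency sub-bands, and reconstructs the image. Soft thresholding stands in for the paper's DnCNN denoisers, and the Haar filters stand in for whatever wavelet the authors actually used; none of this reproduces their dual-pathway fusion.

```python
import numpy as np

def haar_dwt2(x):
    # One-level 2-D Haar decomposition into LL, LH, HL, HH sub-bands
    # (averaging form: forward divides by 2, inverse sums).
    a = (x[0::2] + x[1::2]) / 2.0   # row averages
    d = (x[0::2] - x[1::2]) / 2.0   # row differences
    ll = (a[:, 0::2] + a[:, 1::2]) / 2.0
    lh = (a[:, 0::2] - a[:, 1::2]) / 2.0
    hl = (d[:, 0::2] + d[:, 1::2]) / 2.0
    hh = (d[:, 0::2] - d[:, 1::2]) / 2.0
    return ll, lh, hl, hh

def haar_idwt2(ll, lh, hl, hh):
    # Exact inverse of haar_dwt2.
    a = np.empty((ll.shape[0], ll.shape[1] * 2))
    a[:, 0::2] = ll + lh
    a[:, 1::2] = ll - lh
    d = np.empty_like(a)
    d[:, 0::2] = hl + hh
    d[:, 1::2] = hl - hh
    x = np.empty((a.shape[0] * 2, a.shape[1]))
    x[0::2] = a + d
    x[1::2] = a - d
    return x

def soft(c, t):
    # Soft thresholding: shrink coefficients toward zero by t.
    return np.sign(c) * np.maximum(np.abs(c) - t, 0.0)

def denoise(img, t=0.1):
    # Suppress noise in the high-frequency sub-bands only; the LL band
    # (coarse structure) passes through untouched.
    ll, lh, hl, hh = haar_dwt2(img)
    return haar_idwt2(ll, soft(lh, t), soft(hl, t), soft(hh, t))
```

With the threshold set to zero the round trip reconstructs the input exactly, which is a useful sanity check before swapping the thresholding step for a learned denoiser as in the paper.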