Search Results (325)

Search Parameters:
Keywords = spatial-frequency domain imaging

20 pages, 5077 KB  
Article
Hybrid-Domain Synergistic Transformer for Hyperspectral Image Denoising
by Haoyue Li and Di Wu
Appl. Sci. 2025, 15(17), 9735; https://doi.org/10.3390/app15179735 - 4 Sep 2025
Abstract
Hyperspectral image (HSI) denoising is challenged by complex spatial-spectral noise coupling. Existing deep learning methods, primarily designed for RGB images, fail to address HSI-specific noise distributions and spectral correlations. This paper proposes a Hybrid-Domain Synergistic Transformer (HDST) integrating frequency-domain enhancement and multiscale modeling. Key contributions include (1) a Fourier-based preprocessing module decoupling spectral noise; (2) a dynamic cross-domain attention mechanism adaptively fusing spatial-frequency features; and (3) a hierarchical architecture combining global noise modeling and detail recovery. Experiments on realistic and synthetic datasets show HDST outperforms state-of-the-art methods in PSNR, with fewer parameters. Visual results confirm effective noise suppression without spectral distortion. The framework provides a robust solution for HSI denoising, demonstrating potential for high-dimensional visual data processing. Full article
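The abstract does not detail the Fourier-based preprocessing module, but the underlying idea — decoupling spectral noise by filtering along the band axis of the HSI cube — can be sketched in NumPy. This is a minimal illustration; the function name, the hard cutoff, and `keep_ratio` are assumptions, not the paper's actual design:

```python
import numpy as np

def spectral_fourier_filter(cube, keep_ratio=0.5):
    """Attenuate high-frequency components along the spectral axis of an
    HSI cube of shape (H, W, B). The hard low-pass mask is an illustrative
    stand-in for a learned frequency-domain module."""
    B = cube.shape[-1]
    spec = np.fft.fft(cube, axis=-1)           # per-pixel spectral FFT
    freqs = np.fft.fftfreq(B)                  # cycles/band in [-0.5, 0.5)
    mask = (np.abs(freqs) <= keep_ratio * 0.5).astype(float)
    return np.fft.ifft(spec * mask, axis=-1).real
```

Noise that fluctuates rapidly from band to band lands in the suppressed high spectral frequencies, while smooth reflectance spectra pass through largely unchanged.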

17 pages, 7343 KB  
Article
Accelerated Super-Resolution Reconstruction for Structured Illumination Microscopy Integrated with Low-Light Optimization
by Caihong Huang, Dingrong Yi and Lichun Zhou
Micromachines 2025, 16(9), 1020; https://doi.org/10.3390/mi16091020 - 3 Sep 2025
Abstract
Structured illumination microscopy (SIM) with π/2 phase-shift modulation traditionally relies on frequency-domain computation, which greatly limits processing efficiency. In addition, the illumination regime inherent in structured illumination techniques often results in poor visual quality of reconstructed images. To address these dual challenges, this study introduces DM-SIM-LLIE (Differential Low-Light Image Enhancement SIM), a novel framework that integrates two synergistic innovations. First, the study pioneers a spatial-domain computational paradigm for π/2 phase-shift SIM reconstruction. Through system differentiation, mathematical derivation, and algorithm simplification, an optimized spatial-domain model is established. Second, an adaptive local overexposure correction strategy is developed, combined with a zero-shot learning deep learning algorithm, RUAS, to enhance the image quality of structured light reconstructed images. Experimental validation using specimens such as fluorescent microspheres and bovine pulmonary artery endothelial cells demonstrates the advantages of this approach: compared with traditional frequency-domain methods, the reconstruction speed is accelerated by five times while maintaining equivalent lateral resolution and excellent axial resolution. The image quality of the low-light enhancement algorithm after local overexposure correction is superior to existing methods. These advances significantly increase the application potential of SIM technology in time-sensitive biomedical imaging scenarios that require high spatiotemporal resolution. Full article
(This article belongs to the Special Issue Advanced Biomaterials, Biodevices, and Their Application)

16 pages, 2827 KB  
Article
A Dual-Modality CNN Approach for RSS-Based Indoor Positioning Using Spatial and Frequency Fingerprints
by Xiangchen Lai, Yunzhi Luo and Yong Jia
Sensors 2025, 25(17), 5408; https://doi.org/10.3390/s25175408 - 2 Sep 2025
Abstract
Indoor positioning systems based on received signal strength (RSS) achieve indoor positioning by leveraging the position-related features inherent in spatial RSS fingerprint images. Their positioning accuracy and robustness are directly influenced by the quality of fingerprint features. However, the inherent spatial low-resolution characteristic of spatial RSS fingerprint images makes it challenging to effectively extract subtle fingerprint features. To address this issue, this paper proposes an RSS-based indoor positioning method that combines enhanced spatial frequency fingerprint representation with fusion learning. First, bicubic interpolation is applied to improve image resolution and reveal finer spatial details. Then, a 2D fast Fourier transform (2D FFT) converts the enhanced spatial images into frequency domain representations to supplement spectral features. These spatial and frequency fingerprints are used as dual-modality inputs for a parallel convolutional neural network (CNN) model with efficient multi-scale attention (EMA) modules. The model extracts modality-specific features and fuses them to generate enriched representations. Each modality—spatial, frequency, and fused—is passed through a dedicated fully connected network to predict 3D coordinates. A coordinate optimization strategy is introduced to select the two most reliable outputs for each axis (x, y, z), and their average is used as the final estimate. Experiments on seven public datasets show that the proposed method significantly improves positioning accuracy, reducing the mean positioning error by up to 47.1% and root mean square error (RMSE) by up to 54.4% compared with traditional and advanced time–frequency methods. Full article
(This article belongs to the Section Navigation and Positioning)
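The dual-modality input construction described above — bicubic upsampling of the RSS fingerprint image followed by a 2D FFT — can be sketched as follows. The scale factor and the log-magnitude spectrum representation are illustrative assumptions (the abstract does not say how the frequency image is normalized):

```python
import numpy as np
from scipy.ndimage import zoom

def dual_modality_fingerprint(rss_img, scale=4):
    """Build a spatial/frequency input pair from a low-resolution RSS
    fingerprint image: bicubic upsampling, then a centred 2D FFT."""
    spatial = zoom(rss_img.astype(float), scale, order=3)  # bicubic interpolation
    spec = np.fft.fftshift(np.fft.fft2(spatial))           # centre zero frequency
    frequency = np.log1p(np.abs(spec))                     # compress dynamic range
    return spatial, frequency
```

The two arrays would then feed the parallel CNN branches as separate modalities.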

36 pages, 25793 KB  
Article
DATNet: Dynamic Adaptive Transformer Network for SAR Image Denoising
by Yan Shen, Yazhou Chen, Yuming Wang, Liyun Ma and Xiaolu Zhang
Remote Sens. 2025, 17(17), 3031; https://doi.org/10.3390/rs17173031 - 1 Sep 2025
Abstract
Aiming at the problems of detail blurring and structural distortion caused by speckle noise, additive white noise and hybrid noise interference in synthetic aperture radar (SAR) images, this paper proposes a Dynamic Adaptive Transformer Network (DAT-Net) integrating a dynamic gated attention module and a frequency-domain multi-expert enhancement module for SAR image denoising. The proposed model leverages a multi-scale encoder–decoder framework, combining local convolutional feature extraction with global self-attention mechanisms to transcend the limitations of conventional approaches restricted to single noise types, thereby achieving adaptive suppression of multi-source noise contamination. Key innovations comprise the following: (1) A Dynamic Gated Attention Module (DGAM) employing dual-path feature embedding and dynamic thresholding mechanisms to precisely characterize noise spatial heterogeneity; (2) A Frequency-domain Multi-Expert Enhancement (FMEE) Module utilizing Fourier decomposition and expert network ensembles for collaborative optimization of high-frequency and low-frequency components; (3) Lightweight Multi-scale Convolution Blocks (MCB) enhancing cross-scale feature fusion capabilities. Experimental results demonstrate that DAT-Net achieves quantifiable performance enhancement in both simulated and real SAR environments. Compared with other denoising algorithms, the proposed methodology exhibits superior noise suppression across diverse noise scenarios while preserving intrinsic textural features. Full article

13 pages, 7032 KB  
Article
Frequency-Domain Gaussian Cooperative Filtering Demodulation Method for Spatially Modulated Full-Polarization Imaging Systems
by Ziyang Zhang, Pengbo Ma, Shixiao Ye, Song Ye, Wei Luo, Shu Li, Wei Xiong, Yuting Zhang, Wentao Zhang, Fangyuan Wang, Jiejun Wang, Xinqiang Wang and Niyan Chen
Photonics 2025, 12(9), 857; https://doi.org/10.3390/photonics12090857 - 26 Aug 2025
Abstract
The spatially modulated full-polarization imaging system encodes complete polarization information into a single interferogram, enabling rapid demodulation. However, traditional single Gaussian low-pass filtering cannot adequately suppress crosstalk among Stokes components, leading to reduced accuracy. To address this issue, this paper proposes a frequency-domain Gaussian cooperative filter (FGCF) based on a divide-and-conquer strategy in the frequency domain. Specifically, the method employs six Gaussian high-pass filters to effectively identify and suppress interference signals located at different positions in the frequency domain, while utilizing a single Gaussian low-pass filter to preserve critical polarization information within the image. Through the cooperative processing of the low-pass filter response and the complementary responses of the high-pass filters, simultaneous optimization of information retention and interference suppression is achieved. Simulation and real-scene experiments show that FGCF significantly enhances demodulation quality, especially for S1, and achieves superior structural similarity compared with traditional low-pass filtering. Full article
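The cooperative structure — one Gaussian low-pass response multiplied by complementary Gaussian high-pass notches at the six interference positions — can be sketched directly in the frequency domain. The sigmas and carrier positions below are illustrative assumptions; the paper's filters would be tuned to the system's actual carrier frequencies:

```python
import numpy as np

def gaussian_lp(shape, sigma, center=None):
    """Gaussian low-pass transfer function centred at `center`."""
    h, w = shape
    cy, cx = center if center else (h / 2, w / 2)
    y, x = np.ogrid[:h, :w]
    d2 = (y - cy) ** 2 + (x - cx) ** 2
    return np.exp(-d2 / (2 * sigma ** 2))

def fgcf_demodulate(interferogram, hp_centers, sigma_lp=20, sigma_hp=10):
    """FGCF-style sketch: retain the baseband term with one low-pass
    response while (1 - LP) notches suppress each interference lobe."""
    F = np.fft.fftshift(np.fft.fft2(interferogram))
    resp = gaussian_lp(F.shape, sigma_lp)
    for c in hp_centers:                       # typically six carrier positions
        resp *= 1.0 - gaussian_lp(F.shape, sigma_hp, center=c)
    return np.fft.ifft2(np.fft.ifftshift(F * resp)).real
```

Multiplying the complementary responses is what lets information retention and interference suppression be optimized simultaneously rather than by a single cutoff.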

22 pages, 5535 KB  
Article
OFNet: Integrating Deep Optical Flow and Bi-Domain Attention for Enhanced Change Detection
by Liwen Zhang, Quan Zou, Guoqing Li, Wenyang Yu, Yong Yang and Heng Zhang
Remote Sens. 2025, 17(17), 2949; https://doi.org/10.3390/rs17172949 - 25 Aug 2025
Abstract
Change detection technology holds significant importance in disciplines such as urban planning, land utilization tracking, and hazard evaluation, as it can efficiently and accurately reveal dynamic regional change processes, providing crucial support for scientific decision-making and refined management. Although deep learning methods based on computer vision have achieved remarkable progress in change detection, they still face challenges including reducing dynamic background interference, capturing subtle changes, and effectively fusing multi-temporal data features. To address these issues, this paper proposes a novel change detection model called OFNet. Building upon existing Siamese network architectures, we introduce an optical flow branch module that supplements pixel-level dynamic information. By incorporating motion features to guide the network’s attention to potential change regions, we enhance the model’s ability to characterize and discriminate genuine changes in cross-temporal remote sensing images. Additionally, we innovatively propose a dual-domain attention mechanism that simultaneously models discriminative features in both spatial and frequency domains for change detection tasks. The spatial attention focuses on capturing edge and structural changes, while the frequency-domain attention strengthens responses to key frequency components. The synergistic fusion of these two attention mechanisms effectively improves the model’s sensitivity to detailed changes and enhances the overall robustness of detection. Experimental results demonstrate that OFNet achieves an IoU of 83.03 on the LEVIR-CD dataset and 82.86 on the WHU-CD dataset, outperforming current mainstream approaches and validating its superior detection performance and generalization capability. This presents a novel technical method for environmental observation and urban transformation analysis tasks. Full article
(This article belongs to the Special Issue Advances in Remote Sensing Image Target Detection and Recognition)

26 pages, 62819 KB  
Article
Low-Light Image Dehazing and Enhancement via Multi-Feature Domain Fusion
by Jiaxin Wu, Han Ai, Ping Zhou, Hao Wang, Haifeng Zhang, Gaopeng Zhang and Weining Chen
Remote Sens. 2025, 17(17), 2944; https://doi.org/10.3390/rs17172944 - 25 Aug 2025
Abstract
The acquisition of nighttime remote-sensing visible-light images is often accompanied by low-illumination effects and haze interference, resulting in significant image quality degradation and greatly affecting subsequent applications. Existing low-light enhancement and dehazing algorithms can handle each problem individually, but their simple cascade cannot effectively address unknown real-world degradations. Therefore, we design a joint processing framework, WFDiff, which fully exploits the advantages of Fourier–wavelet dual-domain features and innovatively integrates the inverse diffusion process through differentiable operators to construct a multi-scale degradation collaborative correction system. Specifically, in the reverse diffusion process, a dual-domain feature interaction module is designed, and the joint probability distribution of the generated image and real data is constrained through differentiable operators: on the one hand, a global frequency-domain prior is established by jointly constraining Fourier amplitude and phase, effectively maintaining the radiometric consistency of the image; on the other hand, wavelets are used to capture high-frequency details and edge structures in the spatial domain to improve the prediction process. On this basis, a cross-overlapping-block adaptive smoothing estimation algorithm is proposed, which achieves dynamic fusion of multi-scale features through a differentiable weighting strategy, effectively solving the problem of restoring images of different sizes and avoiding local inconsistencies. In view of the current lack of remote-sensing data for low-light haze scenarios, we constructed the Hazy-Dark dataset. Physical experiments and ablation experiments show that the proposed method outperforms existing single-task or simple cascade methods in terms of image fidelity, detail recovery capability, and visual naturalness, providing a new paradigm for remote-sensing image processing under coupled degradations. 
Full article
(This article belongs to the Section AI Remote Sensing)
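The joint Fourier amplitude/phase constraint can be illustrated as a simple frequency-domain loss. The L1 form and the weights below are assumptions for illustration only; in the paper the constraint is applied inside the reverse diffusion process through differentiable operators:

```python
import numpy as np

def fourier_amp_phase_loss(pred, target, w_amp=1.0, w_phase=0.1):
    """Penalise amplitude differences (radiometric consistency) and
    phase differences (structure) between two images in the Fourier domain."""
    Fp, Ft = np.fft.fft2(pred), np.fft.fft2(target)
    amp = np.mean(np.abs(np.abs(Fp) - np.abs(Ft)))
    # wrap phase differences into (-pi, pi] before averaging
    dphi = np.angle(Fp * np.conj(Ft))
    phase = np.mean(np.abs(dphi))
    return w_amp * amp + w_phase * phase
```

Constraining amplitude and phase separately is what distinguishes this from a plain pixel-space loss: amplitude carries global radiometry, phase carries edge positions.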

25 pages, 9065 KB  
Article
PWFNet: Pyramidal Wavelet–Frequency Attention Network for Road Extraction
by Jinkun Zong, Yonghua Sun, Ruozeng Wang, Dinglin Xu, Xue Yang and Xiaolin Zhao
Remote Sens. 2025, 17(16), 2895; https://doi.org/10.3390/rs17162895 - 20 Aug 2025
Abstract
Road extraction from remote sensing imagery plays a critical role in applications such as autonomous driving, urban planning, and infrastructure development. Although deep learning methods have achieved notable progress, current approaches still struggle with complex backgrounds, varying road widths, and strong texture interference, often leading to fragmented road predictions or the misclassification of background regions. Given that roads typically exhibit smooth low-frequency characteristics while background clutter tends to manifest in mid- and high-frequency ranges, incorporating frequency-domain information can enhance the model’s structural perception and discrimination capabilities. To address these challenges, we propose a novel frequency-aware road extraction network, termed PWFNet, which combines frequency-domain modeling with multi-scale feature enhancement. PWFNet comprises two key modules. First, the Pyramidal Wavelet Convolution (PWC) module employs multi-scale wavelet decomposition fused with localized convolution to accurately capture road structures across various spatial resolutions. Second, the Frequency-aware Adjustment Module (FAM) partitions the Fourier spectrum into multiple frequency bands and incorporates a spatial attention mechanism to strengthen low-frequency road responses while suppressing mid- and high-frequency background noise. By integrating complementary modeling from both spatial and frequency domains, PWFNet significantly improves road continuity, edge clarity, and robustness under complex conditions. Experiments on the DeepGlobe and CHN6-CUG road datasets demonstrate that PWFNet achieves IoU improvements of 3.8% and 1.25% over the best-performing baseline methods, respectively. In addition, we conducted cross-region transfer experiments by directly applying the trained model to remote sensing images from different geographic regions and at varying resolutions to assess its generalization capability. 
The results demonstrate that PWFNet maintains the continuity of main and branch roads and preserves edge details in these transfer scenarios, effectively reducing false positives and missed detections. This further validates its practicality and robustness in diverse real-world environments. Full article
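The FAM's spectrum partition can be sketched as re-weighting concentric radial bands of the centred Fourier spectrum, boosting the low band and damping the mid/high bands. Equal-width bands and the fixed gains below are illustrative assumptions; FAM learns its weighting through a spatial attention mechanism:

```python
import numpy as np

def band_partition_reweight(img, band_gains=(1.2, 1.0, 0.6)):
    """Split the centred 2D spectrum into len(band_gains) radial bands
    (low -> high frequency) and apply one gain per band."""
    h, w = img.shape
    F = np.fft.fftshift(np.fft.fft2(img))
    y, x = np.ogrid[:h, :w]
    r = np.hypot(y - h / 2, x - w / 2)          # radial frequency distance
    edges = np.linspace(0, r.max() + 1e-9, len(band_gains) + 1)
    gain = np.zeros((h, w))
    for g, lo, hi in zip(band_gains, edges[:-1], edges[1:]):
        gain[(r >= lo) & (r < hi)] = g
    return np.fft.ifft2(np.fft.ifftshift(F * gain)).real
```

With unit gains the transform is an identity, which makes the band logic easy to verify before attaching learned weights.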

21 pages, 25577 KB  
Article
DFFNet: A Dual-Domain Feature Fusion Network for Single Remote Sensing Image Dehazing
by Huazhong Jin, Zhang Chen, Zhina Song and Kaimin Sun
Sensors 2025, 25(16), 5125; https://doi.org/10.3390/s25165125 - 18 Aug 2025
Abstract
Single remote sensing image dehazing aims to eliminate atmospheric scattering effects without auxiliary information. It serves as a crucial preprocessing step for enhancing the performance of downstream tasks in remote sensing images. Conventional approaches often struggle to balance haze removal and detail restoration under non-uniform haze distributions. To address this issue, we propose a Dual-domain Feature Fusion Network (DFFNet) for remote sensing image dehazing. DFFNet consists of two specialized units: the Frequency Restore Unit (FRU) and the Context Extract Unit (CEU). As haze primarily manifests as low-frequency energy in the frequency domain, the FRU effectively suppresses haze across the entire image by adaptively modulating low-frequency amplitudes. Meanwhile, to reconstruct details attenuated due to dense haze occlusion, we introduce the CEU. This unit extracts multi-scale spatial features to capture contextual information, providing structural guidance for detail reconstruction. Furthermore, we introduce the Dual-Domain Feature Fusion Module (DDFFM) to establish dependencies between features from FRU and CEU via a designed attention mechanism. This leverages spatial contextual information to guide detail reconstruction during frequency domain haze removal. Experiments on the StateHaze1k, RICE and RRSHID datasets demonstrate that DFFNet achieves competitive performance in both visual quality and quantitative metrics. Full article
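The FRU's core operation — modulating low-frequency amplitudes while leaving phase intact, since haze energy concentrates in the low band — can be sketched as below. The fixed cutoff and gain stand in for the unit's adaptive, learned modulation:

```python
import numpy as np

def modulate_low_freq_amplitude(img, cutoff=0.1, gain=0.7):
    """Scale the Fourier amplitude inside a low-frequency disc and
    reconstruct with the original phase preserved."""
    F = np.fft.fft2(img)
    fy = np.fft.fftfreq(img.shape[0])[:, None]
    fx = np.fft.fftfreq(img.shape[1])[None, :]
    low = np.hypot(fy, fx) <= cutoff           # low-frequency support
    amp, phase = np.abs(F), np.angle(F)
    amp = np.where(low, gain * amp, amp)       # attenuate haze-dominated band
    return np.fft.ifft2(amp * np.exp(1j * phase)).real
```

Keeping phase untouched is the key design point: edges and structures live in the phase, so only the haze-like energy is suppressed.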

22 pages, 6785 KB  
Article
Spatiality–Frequency Domain Video Forgery Detection System Based on ResNet-LSTM-CBAM and DCT Hybrid Network
by Zihao Liao, Sheng Hong and Yu Chen
Appl. Sci. 2025, 15(16), 9006; https://doi.org/10.3390/app15169006 - 15 Aug 2025
Abstract
As information technology advances, digital content has become widely adopted across diverse fields such as news broadcasting, entertainment, commerce, and forensic investigation. However, the availability of sophisticated multimedia editing tools has significantly increased the risk of video and image forgery, raising serious concerns about content authenticity at both societal and individual levels. To address the growing need for robust and accurate detection methods, this study proposes a novel video forgery detection model that integrates both spatial and frequency-domain features. The model is built on a ResNet-LSTM framework enhanced by a Convolutional Block Attention Module (CBAM) for spatial feature extraction, and further incorporates Discrete Cosine Transform (DCT) to capture frequency domain information. Comprehensive experiments were conducted on several mainstream benchmark datasets, encompassing a wide range of forgery scenarios. The results demonstrate that the proposed model achieves superior performance in distinguishing between authentic and manipulated videos. Additional ablation and comparative studies confirm the contribution of each component in the architecture, offering deeper insight into the model’s capacity. Overall, the findings support the proposed approach as a promising solution for enhancing the reliability of video authenticity analysis under complex conditions. Full article
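The DCT branch's input can be sketched as per-block 2D DCT coefficients of each frame, a standard way to expose the frequency artifacts that manipulation leaves behind. The 8×8 block size and raw-coefficient layout are illustrative assumptions; the abstract does not specify the paper's exact DCT configuration:

```python
import numpy as np
from scipy.fft import dctn

def block_dct_features(frame, block=8):
    """Per-block 2D DCT coefficients of a grayscale frame, returned as
    an array of shape (num_blocks, block, block)."""
    h, w = frame.shape
    h, w = h - h % block, w - w % block        # crop to a multiple of the block size
    feats = []
    for i in range(0, h, block):
        for j in range(0, w, block):
            feats.append(dctn(frame[i:i + block, j:j + block], norm="ortho"))
    return np.stack(feats)
```

These coefficient maps would be consumed alongside the spatial ResNet-LSTM-CBAM features.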

21 pages, 3126 KB  
Article
WMSA–WBS: Efficient Wave Multi-Head Self-Attention with Wavelet Bottleneck
by Xiangyang Li, Yafeng Li, Pan Fan and Xueya Zhang
Sensors 2025, 25(16), 5046; https://doi.org/10.3390/s25165046 - 14 Aug 2025
Abstract
The critical component of the vision transformer (ViT) architecture is multi-head self-attention (MSA), which enables the encoding of long-range dependencies and heterogeneous interactions. However, MSA has two significant limitations: its limited ability to capture local features and its high computational costs. To address these challenges, this paper proposes an integrated multi-head self-attention approach with a bottleneck enhancement structure, named WMSA–WBS, which mitigates the aforementioned shortcomings of conventional MSA. Different from existing wavelet-enhanced ViT variants that mainly focus on the isolated wavelet decomposition in the attention layer, WMSA–WBS introduces a co-design of wavelet-based frequency processing and bottleneck optimization, achieving more efficient and comprehensive feature learning. Within WMSA–WBS, the proposed wavelet multi-head self-attention (WMSA) approach is combined with a novel wavelet bottleneck structure to capture both global and local information across the spatial, frequency, and channel domains. Specifically, this module achieves these capabilities while maintaining low computational complexity and memory consumption. Extensive experiments demonstrate that ViT models equipped with WMSA–WBS achieve superior trade-offs between accuracy and model complexity across various vision tasks, including image classification, object detection, and semantic segmentation. Full article
(This article belongs to the Section Sensor Networks)
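The wavelet decomposition that wavelet-enhanced attention blocks operate on can be illustrated with a single-level 2D Haar transform, which splits a feature map into one low-frequency and three high-frequency sub-bands at half resolution. Plain Haar is an illustrative choice; the paper's wavelet basis is not stated in the abstract:

```python
import numpy as np

def haar_dwt2(x):
    """Single-level 2D Haar decomposition of an even-sized array into
    (LL, LH, HL, HH) sub-bands, each of half height and width."""
    a = (x[0::2, :] + x[1::2, :]) / 2          # vertical average
    d = (x[0::2, :] - x[1::2, :]) / 2          # vertical detail
    LL = (a[:, 0::2] + a[:, 1::2]) / 2
    LH = (a[:, 0::2] - a[:, 1::2]) / 2
    HL = (d[:, 0::2] + d[:, 1::2]) / 2
    HH = (d[:, 0::2] - d[:, 1::2]) / 2
    return LL, LH, HL, HH
```

Running attention on the LL band at quarter the pixel count is one reason wavelet-based attention reduces computational cost relative to full-resolution MSA.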

27 pages, 33921 KB  
Article
Seeing Through Turbid Waters: A Lightweight and Frequency-Sensitive Detector with an Attention Mechanism for Underwater Objects
by Shibo Song and Bing Sun
J. Mar. Sci. Eng. 2025, 13(8), 1528; https://doi.org/10.3390/jmse13081528 - 9 Aug 2025
Abstract
Precise underwater object detectors can provide Autonomous Underwater Vehicles (AUVs) with good situational awareness in underwater environments, supporting a wide range of unmanned exploration missions. However, the quality of optical imaging is often insufficient to support high detector accuracy due to poor lighting and the complexity of underwater environments. Therefore, this paper develops an efficient and precise object detector that maintains high recognition accuracy on degraded underwater images. We design a Cross Spatial Global Perceptual Attention (CSGPA) mechanism to achieve accurate recognition of target and background information. We then construct an Efficient Multi-Scale Weighting Feature Pyramid Network (EMWFPN) to eliminate computational redundancy and increase the model’s feature-representation ability. The proposed Occlusion-Robust Wavelet Network (ORWNet) enables the model to handle fine-grained frequency-domain information, enhancing robustness to occluded objects. Finally, EMASlideloss is introduced to alleviate sample-distribution imbalance in underwater datasets. Our architecture achieves 81.8% and 83.8% mAP on the DUO and UW6C datasets, respectively, with only 7.2 GFLOPs, outperforming baseline models and balancing detection precision with computational efficiency. Full article
(This article belongs to the Section Ocean Engineering)

13 pages, 1269 KB  
Article
Contrast-Enhancing Spatial–Frequency Deconvolution-Aided Interferometric Scattering Microscopy (iSCAT)
by Xiang Zhang and Hao He
Photonics 2025, 12(8), 795; https://doi.org/10.3390/photonics12080795 - 7 Aug 2025
Abstract
Interferometric scattering microscopy (iSCAT) is widely used for label-free tracking of nanoparticles and single molecules. However, its ability to identify small molecules is limited by low imaging contrast obscured by noise. Frame-averaging methods are widely used to reduce background noise but, as a trade-off, require hundreds of raw frames to produce a single output frame. To address this, we applied a spatial–frequency domain deconvolution algorithm to suppress background noise and amplify the signal in each frame, achieving a ∼3-fold contrast improvement without hardware modification. This enhancement is achieved by compensating for missing information within the optical transfer function (OTF) boundary, while high-frequency components (noise) beyond this boundary are filtered out. The resulting deconvolution process provides linear signal amplification, making it well suited for quantitative analysis in mass photometry. Additionally, the localization error is reduced by 20%. Comparisons with traditional denoising algorithms revealed that those methods often extract the side lobes. In contrast, our deconvolution approach preserves signal integrity while enhancing sensitivity. This work highlights the potential of image processing techniques to significantly improve the detection sensitivity of iSCAT for small-molecule analysis. Full article
(This article belongs to the Special Issue Research, Development and Application of Raman Scattering Technology)
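The deconvolution idea described above — amplify signal inside the OTF support, discard frequencies beyond the OTF boundary as noise — can be sketched as a Wiener-style inverse filter with a hard support mask. The regularizer `eps` and the support threshold are illustrative choices, not the paper's algorithm:

```python
import numpy as np

def otf_bounded_deconv(img, otf, eps=1e-2):
    """Invert the OTF inside its support (regularised, Wiener-style) and
    zero all frequencies beyond the OTF boundary."""
    F = np.fft.fft2(img)
    support = np.abs(otf) > 1e-3                    # frequencies the optics can pass
    inv = np.conj(otf) / (np.abs(otf) ** 2 + eps)   # regularised inverse filter
    return np.fft.ifft2(np.where(support, F * inv, 0)).real
```

Because the filter is linear in the input spectrum, signal amplitudes scale proportionally, which is what makes the output usable for quantitative mass photometry.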

24 pages, 1471 KB  
Article
WDM-UNet: A Wavelet-Deformable Gated Fusion Network for Multi-Scale Retinal Vessel Segmentation
by Xinlong Li and Hang Zhou
Sensors 2025, 25(15), 4840; https://doi.org/10.3390/s25154840 - 6 Aug 2025
Abstract
Retinal vessel segmentation in fundus images is critical for diagnosing microvascular and ophthalmologic diseases. However, the task remains challenging due to significant vessel width variation and low vessel-to-background contrast. To address these limitations, we propose WDM-UNet, a novel spatial-wavelet dual-domain fusion architecture that integrates spatial and wavelet-domain representations to simultaneously enhance the local detail and global context. The encoder combines a Deformable Convolution Encoder (DCE), which adaptively models complex vascular structures through dynamic receptive fields, and a Wavelet Convolution Encoder (WCE), which captures the semantic and structural contexts through low-frequency components and hierarchical wavelet convolution. These features are further refined by a Gated Fusion Transformer (GFT), which employs gated attention to enhance multi-scale feature integration. In the decoder, depthwise separable convolutions are used to reduce the computational overhead without compromising the representational capacity. To preserve fine structural details and facilitate contextual information flow across layers, the model incorporates skip connections with a hierarchical fusion strategy, enabling the effective integration of shallow and deep features. We evaluated WDM-UNet in three public datasets: DRIVE, STARE, and CHASE_DB1. The quantitative evaluations demonstrate that WDM-UNet consistently outperforms state-of-the-art methods, achieving 96.92% accuracy, 83.61% sensitivity, and an 82.87% F1-score in the DRIVE dataset, with superior performance across all the benchmark datasets in both segmentation accuracy and robustness, particularly in complex vascular scenarios. Full article
(This article belongs to the Section Sensing and Imaging)
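The decoder's use of depthwise separable convolutions trades a standard k×k convolution for a per-channel k×k filter followed by a 1×1 pointwise mix, cutting the weight count by roughly a factor of 1/c_out + 1/k². A quick parameter-count comparison (bias terms ignored) makes the saving concrete:

```python
def conv2d_params(c_in, c_out, k):
    """Weights in a standard k x k convolution."""
    return c_in * c_out * k * k

def depthwise_separable_params(c_in, c_out, k):
    """Depthwise k x k filter per input channel, then a 1 x 1 pointwise mix."""
    return c_in * k * k + c_in * c_out
```

For a 64-channel 3×3 layer this is 4672 weights instead of 36,864, which is how the decoder reduces overhead without shrinking its representational width.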

22 pages, 24173 KB  
Article
ScaleViM-PDD: Multi-Scale EfficientViM with Physical Decoupling and Dual-Domain Fusion for Remote Sensing Image Dehazing
by Hao Zhou, Yalun Wang, Wanting Peng, Xin Guan and Tao Tao
Remote Sens. 2025, 17(15), 2664; https://doi.org/10.3390/rs17152664 - 1 Aug 2025
Abstract
Remote sensing images are often degraded by atmospheric haze, which not only reduces image quality but also complicates information extraction, particularly in high-level visual analysis tasks such as object detection and scene classification. State-space models (SSMs) have recently emerged as a powerful paradigm for vision tasks, showing great promise due to their computational efficiency and robust capacity to model global dependencies. However, most existing learning-based dehazing methods lack physical interpretability, leading to weak generalization. Furthermore, they typically rely on spatial features while neglecting crucial frequency domain information, resulting in incomplete feature representation. To address these challenges, we propose ScaleViM-PDD, a novel network that enhances an SSM backbone with two key innovations: a Multi-scale EfficientViM with Physical Decoupling (ScaleViM-P) module and a Dual-Domain Fusion (DD Fusion) module. The ScaleViM-P module synergistically integrates a Physical Decoupling block within a Multi-scale EfficientViM architecture. This design enables the network to mitigate haze interference in a physically grounded manner at each representational scale while simultaneously capturing global contextual information to adaptively handle complex haze distributions. To further address detail loss, the DD Fusion module replaces conventional skip connections by incorporating a novel Frequency Domain Module (FDM) alongside channel and position attention. This allows for a more effective fusion of spatial and frequency features, significantly improving the recovery of fine-grained details, including color and texture information. Extensive experiments on nine publicly available remote sensing datasets demonstrate that ScaleViM-PDD consistently surpasses state-of-the-art baselines in both qualitative and quantitative evaluations, highlighting its strong generalization ability. Full article
