Search Results (126)

Search Parameters:
Keywords = dual-channel cross attention network

31 pages, 2435 KB  
Article
DEP-TFDualNet: A Dual-Domain Attention Framework with Temporal–Frequency Fusion for Depression Recognition Using Three-Channel Frontal EEG
by Haijun Lin, Jiayi Liu and Dongxu Jiang
Sensors 2026, 26(12), 3861; https://doi.org/10.3390/s26123861 - 17 Jun 2026
Viewed by 237
Abstract
Early depression screening is important for timely intervention, and electroencephalography (EEG) offers an objective and potentially portable sensing modality for computer-aided assessment. However, recognition from fixed three-channel frontal EEG remains difficult because of limited spatial information and incomplete modeling of temporal–frequency characteristics and temporal dependencies. This study proposes DEP-TFDualNet for acquisition-constrained frontal resting-state EEG. The framework integrates multi-scale convolution, dual-domain channel attention, temporal modeling derived from the independent recurrent neural network (IndRNN) architecture, and decision-stage fusion of deep representations with low-order statistical descriptors through a Kolmogorov–Arnold Network (KAN)-based nonlinear projection layer. Experiments were conducted on the publicly available three-channel frontal EEG subset of the MODMA dataset. After additional quality control, 48 subjects were retained (22 patients with major depressive disorder, 26 healthy controls). Under subject-wise stratified five-fold cross-validation, DEP-TFDualNet achieved 85.42% accuracy, 85.26% macro-F1, 81.82% sensitivity, 88.46% specificity, an AUC of 0.82, and a Brier score of 0.121. It achieved the best threshold-based subject-level performance and the lowest Brier score among the evaluated models. These results provide preliminary evidence that simplified frontal EEG sensing may support depression recognition in acquisition-constrained settings, although larger and external validation is still required. Full article
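As a consistency check on the subject-level metrics reported above, the sketch below reconstructs one confusion matrix that reproduces them. The counts (18 of 22 patients and 23 of 26 controls classified correctly) are inferred here from the percentages and are not stated in the abstract; pooling predictions across the five folds is likewise an assumption.

```python
# Hypothetical confusion-matrix counts inferred from the reported percentages;
# the abstract itself does not list them.
tp, fn = 18, 4   # MDD patients: correctly / incorrectly classified (22 total)
tn, fp = 23, 3   # healthy controls: correctly / incorrectly classified (26 total)


def f1(true_pos, false_pos, false_neg):
    """F1 score for one class from its counts."""
    return 2 * true_pos / (2 * true_pos + false_pos + false_neg)


accuracy = (tp + tn) / (tp + fn + tn + fp)
sensitivity = tp / (tp + fn)
specificity = tn / (tn + fp)
# Macro-F1 averages the F1 of the MDD class and the F1 of the control class.
macro_f1 = (f1(tp, fp, fn) + f1(tn, fn, fp)) / 2

metrics = {
    "accuracy": round(100 * accuracy, 2),
    "sensitivity": round(100 * sensitivity, 2),
    "specificity": round(100 * specificity, 2),
    "macro_f1": round(100 * macro_f1, 2),
}
print(metrics)
```

With 48 subjects, each misclassified subject shifts accuracy by roughly two percentage points, which is relevant when comparing models on this cohort.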

23 pages, 659 KB  
Article
EEG-ChTABNet: A Dual-Branch Channel-Wise Transformer with Gated Attention-Branch Network for EEG-Based Classification of Dementia
by Noor Kamal Al-Qazzaz, Sawal Hamid Bin Mohd Ali and Siti Anom Ahmad
Biomedicines 2026, 14(6), 1345; https://doi.org/10.3390/biomedicines14061345 - 15 Jun 2026
Viewed by 237
Abstract
Background/Objectives: Early and accurate discrimination of neurological conditions (dementia, stroke, and healthy aging) remains a critical clinical challenge. Electroencephalography (EEG) is a non-invasive measure of brain dynamics, and entropy-based features obtained from multichannel EEG have shown strong discriminative ability. However, existing deep learning approaches do not sufficiently address the combined challenges of small clinical cohorts and high-dimensional entropy feature spaces. In this study, a novel architecture is proposed for multi-class neurological EEG classification under extreme small-sample conditions. Methods: A novel dual-branch Channel-wise Transformer and Attention-Branch Network (EEG-ChTABNet) is proposed to classify 19-channel EEG entropy features into three classes (dementia, stroke, healthy control; N = 45; 15 per class). The architecture introduces four new design elements: first, a Channel Importance Attention (CIA) block, which adaptively learns to re-weight the importance of electrodes via squeeze-excitation; second, a dual-branch encoder, which combines global multi-head self-attention with local depthwise-separable convolution; third, a gated sigmoid fusion mechanism; and fourth, a bottleneck residual classification head to mitigate overfitting. Eight entropy feature sets, namely Amplitude-Aware Permutation Entropy (AAPE), Attention Entropy (AttEn), Dispersion Entropy (DisEn), Distribution Entropy (DistrEn), Fluctuation-based Dispersion Entropy (FDispEn), Fuzzy Entropy (FuzEn), Linear Gaussian Estimation of the Conditional Entropy (LinEn), and Symbolic Dynamics (SyDy), were evaluated individually using stratified 5-fold cross-validation with within-fold SMOTE augmentation. Results: EEG-ChTABNet consistently outperformed the baseline Transformer on all 8 feature sets. DisEn and SyDy features yielded a peak classification accuracy of 73.3% (AUC: 0.823 and 0.857, respectively), compared with the corresponding baselines of 57.8% and 55.6%.
SyDy achieved the best overall AUC of 0.857, and dementia detection sensitivity reached up to 86.7% across multiple feature sets. Conclusions: EEG-ChTABNet shows the effectiveness of channel-adaptive, dual-branch Transformer designs for EEG-based neurological classification from small-sample entropy feature data, and identifies SyDy and DisEn as the most discriminative feature representations for three-class neurological EEG classification. Full article
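The squeeze-excitation re-weighting and the gated sigmoid fusion named in the Methods can be sketched roughly as follows. This is a minimal NumPy illustration, not the authors' implementation: the bottleneck width, the weight shapes, the random weights, and the choice of inputs to the gate are all assumptions made for the example.

```python
import numpy as np


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def channel_reweight(x, w1, w2):
    """Squeeze-excitation style re-weighting of electrodes.

    x: (channels, features) entropy features for one recording.
    w1, w2: weights of a two-layer bottleneck (hypothetical shapes).
    """
    squeezed = x.mean(axis=1)                 # squeeze: one summary per channel
    hidden = np.maximum(0.0, w1 @ squeezed)   # bottleneck with ReLU
    weights = sigmoid(w2 @ hidden)            # excitation: weights in (0, 1)
    return x * weights[:, None], weights


def gated_fusion(global_feat, local_feat, gate_logits):
    """Blend two branch outputs with an element-wise sigmoid gate."""
    gate = sigmoid(gate_logits)
    return gate * global_feat + (1.0 - gate) * local_feat


rng = np.random.default_rng(0)
x = rng.normal(size=(19, 8))                  # 19 electrodes, 8 toy features
w1 = rng.normal(size=(4, 19))                 # random stand-ins for learned weights
w2 = rng.normal(size=(19, 4))
reweighted, weights = channel_reweight(x, w1, w2)
# Zero gate logits give a gate of 0.5, i.e. an equal blend of both inputs.
fused = gated_fusion(reweighted, x, np.zeros_like(x))
```

In a trained network the gate logits would be produced by a learned layer, so the blend can differ per feature.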
(This article belongs to the Special Issue Recent Advances in Biomedical Engineering for the Elderly)

28 pages, 5701 KB  
Article
Multi-Sequence Guided Generation of Contrast-Enhanced Magnetic Resonance Imaging Using Diffusion Models
by Yue Xu, Xiaokun Zhou, Wei Jiang, Chuanbing Wang, Xiangnan Geng, Da Cao, Wujin Xiao, Bin Liu and Wei Wang
Bioengineering 2026, 13(6), 634; https://doi.org/10.3390/bioengineering13060634 - 28 May 2026
Viewed by 257
Abstract
Objectives: Contrast-enhanced magnetic resonance imaging (CE-MRI) plays an important role in the diagnosis, treatment monitoring, and follow-up of brain tumors. However, the use of gadolinium-based contrast agents (GBCAs) is limited in patients with contraindications, such as severe renal impairment or situations requiring repeated examinations. This study aimed to develop a diffusion model-based Difference-Aware Guided Control Network (DAGCN) for synthesizing high-quality contrast-enhanced T1-weighted MRI (T1-CE) from non-contrast T1-weighted images in combination with an auxiliary sequence. Methods: Using the BraTS 2021 dataset, we proposed a two-stage generative framework that first localizes lesion-related enhancement cues and then guides image synthesis. In the first stage, a Difference-Aware Fusion and Prediction (DAFP) module was designed to extract complementary information from non-contrast T1-weighted images and an auxiliary sequence (T2-weighted or FLAIR) through dual-branch feature extraction and cross-modal channel attention fusion, followed by prediction of a lesion-related discrepancy map. In the second stage, the predicted discrepancy map was concatenated with the original T1-weighted images and introduced into a ControlNet-guided diffusion model to constrain the reverse denoising process and generate the target T1-CE image. Model performance was evaluated by visual comparison, quantitative metrics including peak signal-to-noise ratio (PSNR), structural similarity index measure (SSIM), visual information fidelity (VIF), and normalized cross-correlation (NCC), as well as blinded radiologist scoring of image quality (IQ), clinical replaceability (IC), contrast enhancement (CE), and lesion conformity (CF). Results: DAGCN generated synthetic T1-CE images with preserved global anatomical structure and faithful local lesion enhancement without the need for contrast agent administration. 
Compared with baseline methods, DAGCN achieved the highest PSNR and NCC under both T1 + T2 and T1 + FLAIR settings, while showing competitive SSIM and VIF performance. Visual comparison and radiologist-based subjective evaluation further indicated improved lesion-focused enhancement fidelity and reduced false-positive enhancement. Among the two auxiliary sequence settings, the T1 + FLAIR configuration provided more specific lesion localization and cleaner background suppression than the T1 + T2 configuration, particularly by reducing interference from cerebrospinal fluid signals. Conclusions: The proposed DAGCN framework enables the synthesis of clinically informative contrast-enhanced-like MRI from non-contrast multi-sequence inputs and may provide a promising alternative for patients in whom gadolinium administration is contraindicated or should be avoided. In particular, the FLAIR-guided setting showed advantages in lesion specificity, background cleanliness, and overall diagnostic quality. Full article
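Two of the quantitative metrics named in this abstract, PSNR and NCC, can be sketched as follows. This is a minimal illustration assuming intensities scaled to [0, 1] and the zero-mean form of NCC; the paper's exact definitions and preprocessing are not given in the abstract, and the toy arrays are made up.

```python
import numpy as np


def psnr(reference, generated, data_range=1.0):
    """Peak signal-to-noise ratio in dB for images with the given intensity range."""
    mse = np.mean((reference - generated) ** 2)
    if mse == 0:
        return float("inf")
    return float(10.0 * np.log10(data_range ** 2 / mse))


def ncc(reference, generated):
    """Zero-mean normalized cross-correlation between two images."""
    a = reference - reference.mean()
    b = generated - generated.mean()
    return float(np.sum(a * b) / np.sqrt(np.sum(a ** 2) * np.sum(b ** 2)))


reference = np.array([[0.0, 0.5], [0.5, 1.0]])
generated = 0.8 * reference + 0.1   # toy output with reduced contrast
psnr_value = psnr(reference, generated)
ncc_value = ncc(reference, generated)
```

A pure gain-and-offset change leaves NCC at 1 while lowering PSNR, so the two metrics respond to different kinds of error.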

35 pages, 6126 KB  
Article
StarRoute-DBNet: A Novel Multi-Modal Framework for Advanced Target Detection in Dynamic Environments Using SAR and Optical Image Fusion with FocusGraph and PhaseRoute
by Lanfang Lei, Sheng Chang, Zhongzhen Sun, Jianxin Zou, Huazheng Yang, Xinli Zheng, Changyu Liao, Wenjun Wei, Long Ma and Ping Zhong
Remote Sens. 2026, 18(11), 1731; https://doi.org/10.3390/rs18111731 - 27 May 2026
Viewed by 271
Abstract
Multimodal object detection based on synthetic aperture radar (SAR) and optical imagery is of great significance in remote sensing, particularly under adverse weather conditions, nighttime environments, and complex background scenarios. Although SAR imagery has unique advantages under all-weather conditions, its object detection performance still faces challenges in low-texture regions and cluttered scenes. Optical imagery provides rich spatial and texture information, but its applicability is limited in harsh environments. To overcome the limitations of unimodal SAR object detection, this paper proposes a novel multimodal object detection framework, termed StarRoute-DBNet, to improve detection accuracy and robustness through multimodal data fusion and efficient feature interaction. Specifically, a FocusGraph (Graph Convolution-Based Feature Relationship Modeling) module is first designed to adaptively model the spatial relationships between optical and SAR features via graph convolutional networks (GCNs), thereby capturing complex cross-modal spatial dependencies. This module enhances feature interaction across modalities, improves the localization accuracy of oriented targets, and shows clear advantages for small-object detection in complex backgrounds. Second, to alleviate the loss of critical information during downsampling, a PhaseRoute (Sparse Routing Polyphase Downsampling Module) is introduced, which combines multi-phase decomposition with a Top-2 sparse routing strategy to preserve informative spatial cues. By incorporating Gumbel noise into the routing process, the proposed module further improves routing flexibility, detection accuracy, and model robustness. In addition, a Multi-Scale Shuffle-Gated Fusion (MSSGF) module is proposed to address the multi-scale issue in multimodal feature fusion. 
This module integrates multi-scale convolutional branches, channel shuffles, and dual-attention mechanisms to enhance feature interaction across scales, while an adaptive weighted fusion strategy is employed to dynamically adjust the fusion weights of multimodal features. As a result, the proposed method significantly improves detection accuracy and robustness, especially in complex scenes. Extensive experiments conducted on the MVSDA dataset and the M4-SAR dataset demonstrate that the proposed StarRoute-DBNet consistently outperforms existing state-of-the-art methods under complex backgrounds and adverse conditions. In particular, it achieves clear advantages in oriented object detection and small-object detection, verifying its effectiveness and robustness for cross-modal remote sensing object detection. Full article
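The idea behind polyphase downsampling with Top-2 sparse routing and Gumbel noise can be illustrated with a small NumPy sketch. The phase-scoring rule, the softmax blend of the two selected phases, and the point at which the noise is injected are all assumptions for this example; the abstract does not specify how PhaseRoute implements them.

```python
import numpy as np


def polyphase_split(feature_map):
    """Split an (H, W) map into its four stride-2 phases, each (H/2, W/2)."""
    return np.stack([feature_map[i::2, j::2] for i in (0, 1) for j in (0, 1)])


def top2_route(phases, scores, rng=None, temperature=1.0):
    """Keep the two highest-scoring phases and blend them with softmax weights.

    scores: one routing logit per phase; Gumbel noise is added when rng is given.
    """
    logits = np.asarray(scores, dtype=float)
    if rng is not None:
        logits = logits + temperature * rng.gumbel(size=logits.shape)
    top2 = np.argsort(logits)[-2:]               # indices of the two largest logits
    weights = np.exp(logits[top2] - logits[top2].max())
    weights /= weights.sum()                     # softmax over the selected pair
    return np.tensordot(weights, phases[top2], axes=1), top2


feature_map = np.arange(16, dtype=float).reshape(4, 4)
phases = polyphase_split(feature_map)
# Hypothetical routing score: mean absolute activation of each phase.
scores = np.abs(phases).mean(axis=(1, 2))
downsampled, chosen = top2_route(phases, scores)                  # deterministic
noisy, _ = top2_route(phases, scores, rng=np.random.default_rng(0))
```

Plain stride-2 subsampling always keeps the first phase; routing over all four phases lets the layer keep whichever sampling offsets carry the most signal.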

28 pages, 17436 KB  
Article
Cross-Modality Spectral Expansion Combined with Physical–Semantic Dual Priors for Cloud Detection in GF-1 Imagery
by Jing Zhang, Kexiao Shen, Liangnong Song, Shiyi Pan and Yunsong Li
Remote Sens. 2026, 18(11), 1689; https://doi.org/10.3390/rs18111689 - 23 May 2026
Viewed by 269
Abstract
Cloud detection in high-resolution Gaofen-1 (GF-1) imagery is challenging due to the absence of short-wave infrared (SWIR) bands, which prevents the use of physically interpretable indices such as the Normalized Difference Snow Index (NDSI) and often leads to severe cloud–snow confusion. To address this limitation, we propose a unified framework, termed the Cross-Modality Spectral Expansion and Dual-Prior Network (CMSE-DPNet), that integrates cross-modality spectral expansion with physical–semantic dual priors. First, an improved CycleGAN reconstructs 13-band pseudo-Sentinel-2 spectra from four-band GF-1 imagery, enabling the computation of snow-sensitive physical indices. Second, a Snow-Aware Feature Attention Guidance Module (SAFAGM) introduces pixel-level physical priors derived from NDSI, while a Label-Guided Channel Attention Module (LG-CAM) injects scene-level semantic priors inferred from geographic metadata using a large language model. These complementary priors guide the network to better distinguish clouds from spectrally similar backgrounds. Experiments on the GF-1 dataset show that the proposed method achieves an F1-score of 94.41% and an Intersection over Union (IoU) of 89.40%, outperforming several state-of-the-art cloud detection methods. The results indicate that cross-modality spectral expansion combined with physical–semantic prior guidance effectively improves cloud detection performance in complex cloud–snow coexistence scenarios. Full article
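The NDSI mentioned above is the normalized difference of a green band and a SWIR band, which is why it cannot be computed from four-band GF-1 data directly. A minimal sketch follows; the reflectance values are made-up toy numbers, and the small `eps` term is an assumption added to avoid division by zero.

```python
def ndsi(green, swir, eps=1e-6):
    """Normalized Difference Snow Index for one pixel's reflectances."""
    return (green - swir) / (green + swir + eps)


# Toy reflectances (assumed values, not from the paper): snow is bright in
# green but dark in SWIR, whereas cloud stays bright in both bands.
snow = ndsi(green=0.80, swir=0.10)
cloud = ndsi(green=0.70, swir=0.55)
print(round(snow, 3), round(cloud, 3))
```

A threshold of about 0.4 is a commonly cited starting point for flagging snow, although the appropriate value depends on the sensor and scene.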

17 pages, 2200 KB  
Article
Robust Vessel Detection in Low-SNR DAS via Spatial Coherence Enhancement
by Zhongxiang Zheng, Peng Liu and Wei Huang
J. Mar. Sci. Eng. 2026, 14(10), 958; https://doi.org/10.3390/jmse14100958 - 21 May 2026
Viewed by 236
Abstract
Robust vessel detection from low-Signal-to-Noise Ratio (SNR) Distributed Acoustic Sensing (DAS) data benefits from exploiting spatial correlations among adjacent channels. The Cross-Channel Attention Fusion Network (CASFNet) is presented, utilizing a Cross-Channel Attention Fusion (CASF) mechanism to dynamically model dependencies among adjacent channels. This approach, based on a dual-component spectrogram representation, adaptively fuses local spatial context, enhancing signal coherence under low-SNR conditions. Experiments on real-world DAS data demonstrate superior accuracy and robustness compared to state-of-the-art methods, achieving a detection accuracy of 99.24% and an F1-score of 99.19%. Ablation results confirm the effectiveness of this spatial fusion strategy for vessel monitoring using submarine DAS data. Full article

33 pages, 16764 KB  
Article
DC-FusionGNN: A Dual-Channel Framework Integrating Global Self-Attention and Local Topology Learning for Identifying Key Resistance Genes Against Fusarium graminearum Infection in Maize
by Yinfei Dai, Mengjiao Qiao, Jie Fan, Shihao Lu, Enshuang Zhao, Yuheng Zhu, Hanbo Liu and Hao Zhang
Plants 2026, 15(10), 1540; https://doi.org/10.3390/plants15101540 - 18 May 2026
Viewed by 714
Abstract
Fusarium graminearum infection of maize induces complex transcriptional reprogramming, yet existing differential-expression and local graph convolutional approaches struggle to capture long-range and multi-scale regulatory dependencies. We propose DC-FusionGNN, a dual-channel fusion graph neural network for key resistance-gene identification. Based on the transcriptome dataset GSE174508, we first construct a comprehensive gene interaction network by integrating a WGCNA co-expression network with a STRING-based interaction network. The left channel combines structure-aware propagation with a Transformer-based global self-attention mechanism to model long-range cross-module dependencies, while the right channel couples GraphSAGE with a GCN to capture local topology and neighborhood heterogeneity. Embeddings from the two channels are concatenated to form a unified gene representation, trained via self-supervised link prediction. Compared with baseline graph neural networks, DC-FusionGNN achieves competitive and overall improved performance across multiple metrics, and robustness and independent cross-species (rice, GSE39635) experiments further confirm its stability and generalization ability. GO and KEGG enrichment analyses show that the top-ranked candidate genes are significantly enriched in plant defense responses, hormone signaling, and secondary metabolism, supporting the biological relevance of the model’s predictions. Full article
(This article belongs to the Special Issue Applications of Bioinformatics in Plant Science)

31 pages, 4074 KB  
Article
Design and Experimental Investigation of a Multi-Level Heartbeat Sound Feedback-Based Neurofeedback System: Neural Mechanisms
by Xiuyan Hu, Mingge Kang, Yijing Liu, Ting Shi, Xinyu Shi, Yunfa Fu and Anmin Gong
Sensors 2026, 26(10), 3187; https://doi.org/10.3390/s26103187 - 18 May 2026
Viewed by 412
Abstract
Auditory neurofeedback training (NFT) based on brain–computer interfaces (BCIs) has recently entered the precision motor domain as a task-embedded neural state regulation paradigm. Compared to traditional standalone NFT approaches (e.g., relaxation or attention training designed to enhance general cognitive abilities), task-embedded paradigms integrate feedback directly into the motor task execution process. However, this design inevitably creates a dual-task scenario, and the effects of such a scenario on neural activity and behavioral performance have received limited systematic investigation in the existing literature. This study designed and implemented a closed-loop BCI system employing five-level heartbeat sound feedback and used this system as a research platform to examine the immediate neural mechanism changes and potential dual-task interference effects induced by single-session auditory NFT in moderately skilled shooters. The system maps real-time EEG features onto graded auditory signals varying in playback rate and volume intensity, incorporating a dynamic threshold adjustment mechanism. Twenty-two moderately skilled shooters completed three within-subject conditions (no-sound baseline, SMR enhancement, and theta suppression) in a single session with 32-channel EEG and behavioral data recorded simultaneously. Analyses employed whole-brain cluster-based permutation tests, cross-frequency coupling analysis, and functional connectivity analysis. Cluster-based permutation tests revealed that theta feedback induced a significant frontal 4–7 Hz suppression cluster (cluster p = 0.004), whereas SMR feedback did not produce significant 12–15 Hz enhancement at the group level. Theta feedback elicited cross-frequency spillover as follows: sensorimotor SMR power decreased significantly in theta responders (d = −0.69), with frontal theta and sensorimotor SMR changes positively correlated (r = 0.67, p < 0.001). 
Functional connectivity analysis using debiased weighted phase lag index (dwPLI) further demonstrated significant theta-band network reorganization (cluster p = 0.034). At the neural level, clear modulation effects were observed, but shooting ring values did not improve significantly under feedback conditions, and aiming time was significantly prolonged—a behavioral pattern consistent with potential dual-task interference from task-embedded auditory feedback. Single-session auditory NFT can act on the prefrontal cognitive control network and induce cross-frequency network reorganization, but the feedback channel itself constitutes a parallel task that may limit the short-term transfer of induced neural states to behavioral performance. This study examined the neural mechanisms of task-embedded auditory NFT and reported the dual-task costs that have been less characterized in prior “task + feedback” research, providing design considerations and preliminary mechanistic evidence for future development of auditory NFT in precision motor skill training. Full article
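The debiased weighted phase lag index used in the connectivity analysis can be sketched as below. This follows the commonly used debiased squared-wPLI estimator computed from per-trial cross-spectra at a single frequency; whether the study used exactly this form, and how it segmented the EEG, is an assumption. The inputs are synthetic.

```python
import numpy as np


def dwpli(cross_spectra):
    """Debiased squared wPLI from per-trial complex cross-spectra at one frequency."""
    imag = np.imag(np.asarray(cross_spectra))
    sum_imag = imag.sum()
    sum_abs = np.abs(imag).sum()
    sum_sq = np.sum(imag ** 2)
    denom = sum_abs ** 2 - sum_sq
    # Subtracting sum_sq removes the same-trial terms that bias the plain estimator.
    return float((sum_imag ** 2 - sum_sq) / denom) if denom > 0 else 0.0


# Twenty synthetic trials with a constant 45-degree phase lag.
consistent_lag = np.exp(1j * np.pi / 4) * np.ones(20)
# Twenty synthetic trials whose lag flips sign from trial to trial.
alternating_lag = np.exp(1j * np.pi / 4 * np.array([1, -1] * 10))

consistent_value = dwpli(consistent_lag)
alternating_value = dwpli(alternating_lag)
```

The estimator is close to 1 when the sign of the phase lag is consistent across trials, and it can dip slightly below 0 when there is no consistent lag, which is a known property of the debiasing.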
(This article belongs to the Section Biomedical Sensors)

29 pages, 6163 KB  
Article
FI-CRNet: Frequency Interaction for Cloud Removal in Remote Sensing Images
by Pengchen Lei, Xiaomeng Xin, Xuena Qiu, Wenli Huang, Yang Wu and Ye Deng
Remote Sens. 2026, 18(10), 1608; https://doi.org/10.3390/rs18101608 - 16 May 2026
Viewed by 273
Abstract
Remote sensing imagery is often degraded by cloud cover, causing severe information loss and hindering downstream Earth observation tasks. Although recent deep learning methods, including CNN- and Transformer-based models, have achieved promising progress in cloud removal, they mainly operate in the spatial domain and largely overlook the frequency-domain discrepancies introduced by clouds of different types and densities. This limitation restricts their ability to generalize across diverse cloud corruption scenarios. To address this issue, we propose a Frequency Interaction Cloud Removal Network (FI-CRNet), which introduces a novel Frequency-Aware Modulation (FAM) mechanism for high-fidelity cloud-free image reconstruction. The FAM module consists of two components. First, the Frequency Decomposition (FD) module explicitly separates input features into low-frequency cloud-affected components and high-frequency detail-rich components through spectral analysis, while aligning them with decoder features via cross-attention. Second, the Cross-Frequency Interaction (CFI) module adaptively integrates these components through a dual-gate weighting mechanism, including spatial and channel gates, to suppress cloud interference while enhancing structural and textural details. By jointly modeling frequency-domain cues and spatial features, FI-CRNet enables robust and adaptive reconstruction under diverse cloud conditions. Extensive experiments show that our method outperforms state-of-the-art techniques across diverse cloud scenarios. Full article
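One simple way to separate a feature map into low- and high-frequency components is an FFT mask, sketched below. This is only an illustration of frequency decomposition in general: the abstract does not say which spectral analysis the FD module uses, and the radial `cutoff` value is a hypothetical parameter.

```python
import numpy as np


def frequency_split(feature, cutoff=0.25):
    """Split an (H, W) map into low- and high-frequency parts with an FFT mask.

    cutoff is a hypothetical radius in cycles per pixel (0 to 0.5).
    """
    fy = np.fft.fftfreq(feature.shape[0])[:, None]
    fx = np.fft.fftfreq(feature.shape[1])[None, :]
    low_mask = np.sqrt(fx ** 2 + fy ** 2) <= cutoff
    spectrum = np.fft.fft2(feature)
    low = np.fft.ifft2(spectrum * low_mask).real      # smooth, slowly varying part
    high = np.fft.ifft2(spectrum * ~low_mask).real    # edges and fine texture
    return low, high


rng = np.random.default_rng(0)
feature = rng.normal(size=(8, 8))
low, high = frequency_split(feature)
```

Because the two masks are complementary, `low + high` reproduces the input, so no information is lost before the components are processed separately.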

34 pages, 3638 KB  
Article
Multi-Scale Hybrid Attention Temporal Network for Motionless Activity Using Smartphone Inertial Sensors
by Sakorn Mekruksavanich and Anuchit Jitpattanakul
Technologies 2026, 14(5), 272; https://doi.org/10.3390/technologies14050272 - 30 Apr 2026
Viewed by 536
Abstract
Wearable sensor-based human activity recognition (HAR) has gained growing significance in healthcare monitoring and assisted living systems. Although considerable advances have been made in classifying dynamic movements, stationary activities—such as sleeping, driving, and watching TV—remain difficult to distinguish owing to their weak sensor signatures and limited discriminative cues. This paper presents the multi-scale hybrid attention temporal network (MHAT-Net), a deep learning framework whose key architectural novelty lies in the parallel (non-sequential) dual-pathway temporal modeling: a BiGRU branch and a transformer encoder branch operate simultaneously on the same spatially encoded representation, combined via a learnable attention-based fusion module. This design targets the underexplored problem of distinguishing stationary activities from weak inertial sensor signatures. The architecture is built upon three integrated components: (1) a multi-branch CNN with kernel sizes three, five, and seven combined with channel attention for adaptive spatial feature extraction across multiple temporal scales; (2) parallel bidirectional gated recurrent unit (BiGRU) and transformer encoder pathways for jointly capturing short-range sequential patterns and long-range temporal correlations; and (3) an attention-driven fusion module that adaptively weights the outputs of both temporal branches. The model was assessed on a publicly available benchmark comprising three motionless activity categories collected from 25 participants via smartphone sensors. In 5-fold cross-validation, MHAT-Net attained 97.42% (±4.69%) accuracy with accelerometer data and 92.31% (±0.31%) with gyroscope data, substantially exceeding the accuracies of five baseline architectures: CNN, LSTM, BiLSTM, GRU, and BiGRU. Ablation experiments identified multi-scale spatial feature extraction as the most influential module (2.21–2.47% contribution), followed by the hybrid temporal modeling components. 
Cross-modality analysis confirmed that accelerometer signals yielded richer discriminative content for stationary activities, while MHAT-Net sustained consistent performance across both sensor types. The proposed integration of multi-scale spatial encoding, hybrid temporal modeling, and multi-level attention gives MHAT-Net the ability to reliably detect subtle activity-specific patterns, establishing a new benchmark in wearable sensor-based recognition for comprehensive daily behavior monitoring. Full article
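The multi-scale idea (parallel filters with kernel sizes three, five, and seven, followed by attention-weighted fusion) can be sketched in NumPy as follows. Moving-average kernels stand in for learned convolution filters and fixed scores stand in for learned attention logits; both are assumptions for illustration, and in MHAT-Net the attention-based fusion is applied to the two temporal branches rather than directly to these filter outputs.

```python
import numpy as np


def multi_scale_features(signal, kernel_sizes=(3, 5, 7)):
    """Filter a 1-D signal at several temporal scales with same-length output.

    Moving-average kernels stand in for learned convolution filters.
    """
    return np.stack([
        np.convolve(signal, np.ones(k) / k, mode="same") for k in kernel_sizes
    ])


def attention_fuse(branches, scores):
    """Combine branch outputs with softmax-normalized attention scores."""
    weights = np.exp(scores - np.max(scores))
    weights /= weights.sum()
    return weights @ branches, weights


signal = np.sin(np.linspace(0, 4 * np.pi, 64))       # toy accelerometer trace
branches = multi_scale_features(signal)
# Fixed scores are placeholders for logits a network would learn.
fused, weights = attention_fuse(branches, scores=np.array([0.2, 0.1, -0.3]))
```

Larger kernels smooth over longer windows, so each branch responds to structure at a different temporal scale before the weights decide how much each contributes.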

18 pages, 8745 KB  
Article
Automated Prostate Cancer Detection on T2-Weighted MRI Using a Dual-Stream Attention Network: A Study on Private Saudi Clinical Data and Public Benchmark Datasets
by Saeed Alqahtani, M. A. Jowhari, Yahya Q. Sabi and Hussein Alshaari
J. Clin. Med. 2026, 15(9), 3327; https://doi.org/10.3390/jcm15093327 - 27 Apr 2026
Viewed by 462
Abstract
Background: The steady rise of prostate cancer in Saudi Arabia signals a critical public health shift that requires immediate investment in early detection and prevention to mitigate a future clinical crisis. Accurate diagnosis using multiparametric MRI and PI-RADS scoring remains challenging, as interpretations are highly experience-dependent and subspecialized radiologists are limited. Methods: To address this gap, this study introduces a novel Dual-Stream Attention Network designed to automate the classification of low-risk (PIRADS 2-3) versus high-risk (PIRADS 4-5) lesions from T2-weighted MRI. Leveraging a ResNet50 backbone, the architecture employs parallel streams for Local and Global Feature Processing, each enhanced by a Channel-Spatial Attention module to highlight diagnostically relevant regions. These features are integrated through a Cross-Stream Fusion mechanism and a gate-controlled Adaptive Feature Fusion module to optimize multi-scale information. The model was developed and validated on a regional dataset of 3850 images from Jazan Specialist Hospital and Prince Mohammed bin Naser Hospital. This research provides a standardized, high-precision diagnostic path tailored to the Saudi Arabian population, conducted under institutional review board approval (No. 25138). Results: The proposed dual-stream attention network achieved an accuracy of 97.8% on the validation set and 96.4% on the test set, demonstrating high performance and generalization capabilities in classifying prostate lesions from Saudi patient populations. Conclusions: The proposed dual-stream architecture with novel attention and fusion mechanisms demonstrates high effectiveness for prostate cancer classification from T2-weighted MRI in Saudi clinical settings. 
This represents the first deep learning model specifically trained and validated on Saudi Arabian prostate MRI data, with the potential to address the shortage of specialized expertise and improve diagnostic efficiency in the Kingdom. Full article
(This article belongs to the Special Issue Prostate Cancer: Diagnosis, Clinical Management and Prognosis)

28 pages, 12735 KB  
Article
FMW-YOLO: A Frequency-Enhanced and Multi-Scale Context-Aware Framework for PCB Defect Detection
by Yuguo Li, Shuo Tian, Wenzheng Sun, Longfa Chen, Jian Li, Junkai Hu and Na Meng
Micromachines 2026, 17(5), 531; https://doi.org/10.3390/mi17050531 - 27 Apr 2026
Viewed by 524
Abstract
High-precision, efficient surface defect detection for printed circuit boards (PCBs) is critical to ensuring the reliability of electronic systems. However, the presence of complex circuit backgrounds and the small scale of defects often limit the precision and effectiveness of conventional inspection approaches. To address these challenges, this paper proposes FMW-YOLO, a lightweight and accurate detection framework based on YOLO11n. Specifically, a Frequency-Enhanced Channel-Transposed and Local Feature backbone network is developed to improve feature extraction. By designing a Dual-Frequency and Channel Attention Aggregation module and a Lightweight Edge-Gaussian Block, the original C3k2 structure is refined to suppress noise interference while preserving high-frequency details, thereby enhancing feature representation. Furthermore, a neck network incorporating a Multi-Scale Context-Aware Enhancement mechanism is constructed, in which an Attention-Integrated Feature Pyramid is employed to facilitate more effective cross-scale feature interaction. In addition, a Dilated Reparam Residual Module is embedded into the C3k2 structure to expand the receptive field without significantly increasing computational burden. Finally, Wise-IoU is adopted to optimize bounding box regression by assigning greater importance to anchors of moderate quality. Extensive experiments conducted on the HRIPCB and DeepPCB datasets demonstrate that FMW-YOLO improves mAP50 by 2.1% and 0.3%, respectively, while reducing the number of parameters by 23%. These results indicate that the proposed method achieves improved detection accuracy and demonstrates strong potential for practical industrial applications. Full article
(This article belongs to the Topic AI Sensors and Transducers)
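The abstract above mentions IoU-based bounding box regression. As a minimal sketch, the plain intersection-over-union that such losses build on can be computed as follows. This is not the Wise-IoU weighting itself, whose exact form the abstract does not give; the function names `box_iou` and `iou_loss` and the `(x1, y1, x2, y2)` box layout are assumptions made for illustration.

```python
def box_iou(a, b):
    """IoU of two axis-aligned boxes, each given as (x1, y1, x2, y2).

    Assumed layout: (x1, y1) is the top-left corner, (x2, y2) the bottom-right.
    """
    # Corners of the overlap rectangle (may be empty).
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)

    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def iou_loss(pred, target):
    """Basic IoU loss: 0 for a perfect match, 1 for no overlap."""
    return 1.0 - box_iou(pred, target)


if __name__ == "__main__":
    # Two 2x2 boxes overlapping in a 1x1 square: IoU = 1 / (4 + 4 - 1).
    print(box_iou((0, 0, 2, 2), (1, 1, 3, 3)))
```

Weighted variants typically multiply a loss of this kind by a per-anchor factor; the specific factor used in the paper would have to be taken from the full text.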

32 pages, 5393 KB  
Article
TCSNet: A Thin-Cloud-Sensitive Network for Hyperspectral Remote Sensing Images via Spectral-Spatial Feature Fusion
by Yuanyuan Jia, Siwei Zhao, Xuanbin Liu and Yinnian Liu
Remote Sens. 2026, 18(9), 1326; https://doi.org/10.3390/rs18091326 - 26 Apr 2026
Viewed by 307
Abstract
Cloud detection is essential for quantitative land-surface remote sensing and cloud-climate research. However, existing methods often prioritize spatial features over spectral features, which limits thin-cloud detection. To address this issue, this paper proposes a Thin-Cloud-Sensitive Network (TCSNet) for hyperspectral imagery. TCSNet employs an encoder–decoder architecture with a dual-branch design: a convolutional neural network (CNN) extracts multi-scale local features, while a PVTv2-B2 Transformer captures long-range spectral dependencies. To effectively integrate the complementary representations from both branches, a Cross-Modal Fusion (CMF) module with a lightweight single-channel gate is introduced at each stage, followed by a channel attention mechanism (SE) for feature recalibration. Subsequently, a Multi-Scale Fusion (MSF) module is used to integrate multi-level features through a top-down pathway, enabling deep semantic information to guide shallow feature expression. Furthermore, to enhance the decoder’s feature representation capability, a Combined Attention Mechanism (CAM) is incorporated at each decoder stage. This design enables the network to simultaneously focus on important channels, salient regions, and cloud boundaries, effectively alleviating spectral confusion between thin clouds and the underlying surface. Experimental results on Gaofen-5 01 hyperspectral data demonstrate that TCSNet achieves the highest recall (92.98%), Recall_thin (85.59%), and Recall_thick (99.75%), thereby validating its superiority for thin-cloud detection.
(This article belongs to the Special Issue Artificial Intelligence in Hyperspectral Remote Sensing Data Analysis)
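The abstract above uses squeeze-and-excitation (SE) channel attention for feature recalibration. The sketch below shows the generic SE computation (global average pooling, a small bottleneck, sigmoid gates, channel-wise scaling) in plain Python. The function name `se_recalibrate`, the omission of biases, and the toy weights are assumptions made for illustration and are not taken from TCSNet.

```python
import math


def se_recalibrate(feature_map, w1, w2):
    """Generic squeeze-and-excitation over a feature map.

    feature_map: list of C channels, each an H x W list of lists.
    w1: bottleneck weights, shape (C // r) x C   (hypothetical values).
    w2: expansion weights,  shape C x (C // r)   (hypothetical values).
    Biases are omitted to keep the sketch short.
    """
    # Squeeze: global average pooling gives one scalar per channel.
    squeezed = [
        sum(sum(row) for row in ch) / (len(ch) * len(ch[0]))
        for ch in feature_map
    ]
    # Excitation, step 1: bottleneck fully connected layer with ReLU.
    hidden = [max(0.0, sum(w * s for w, s in zip(row, squeezed))) for row in w1]
    # Excitation, step 2: expand back to C channels and apply a sigmoid.
    gates = [
        1.0 / (1.0 + math.exp(-sum(w * h for w, h in zip(row, hidden))))
        for row in w2
    ]
    # Scale: multiply every value in a channel by that channel's gate.
    scaled = [
        [[v * g for v in row] for row in ch]
        for ch, g in zip(feature_map, gates)
    ]
    return scaled, gates


if __name__ == "__main__":
    fmap = [[[1.0, 1.0], [1.0, 1.0]], [[0.0, 0.0], [0.0, 0.0]]]
    w1 = [[1.0, -1.0]]        # reduce 2 channels to 1 hidden unit
    w2 = [[2.0], [-2.0]]      # expand back to 2 channel gates
    scaled, gates = se_recalibrate(fmap, w1, w2)
    print(gates)
```

Each gate lies in (0, 1), so the block can only attenuate or preserve a channel relative to the others; in a trained network `w1` and `w2` are learned rather than fixed.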

22 pages, 7033 KB  
Article
WSNet: Person Re-Identification Based on Wavelet Convolution and Assisted by Image Generation at Inference Time
by Honggang Xie, Jinyang Huang, Xinxin Yi, Zhiwei Chen, Wei Xiong, Yuan Yao, Yongsheng Bai and Xiuyuan Meng
Electronics 2026, 15(8), 1645; https://doi.org/10.3390/electronics15081645 - 15 Apr 2026
Viewed by 473
Abstract
In pedestrian re-identification (ReID) tasks, existing models face two challenges: first, surveillance cameras capture images at long distances, producing low-resolution and blurry images; second, image data suffers from insufficient samples, limited poses, and cross-domain adaptation issues. To address these issues, we propose a wavelet-convolution-based person re-identification framework assisted by a Stable Diffusion-based identity-preserving image generation module used only at inference time. This approach employs a dual-channel wavelet convolutional neural network for multi-scale feature extraction of pedestrian images, combined with cross-attention and gating mechanisms for dynamic data fusion. Additionally, we incorporate a pre-trained Pose2ID-based auxiliary generation branch that synthesizes identity-preserving pedestrian views with diverse poses under human keypoint guidance. These generated views are used only at inference time, where their WSNet features are fused with the features of the original image to provide pose-complementary representation enhancement. Experiments on the Market-1501 and MSMT17 benchmark datasets demonstrate that our method achieves an mAP of 92.1% and a Rank-1 accuracy of 96.5% on Market-1501, and an mAP of 60.1% and a Rank-1 accuracy of 81.2% on MSMT17, with a WSNet backbone of 2.66 M parameters. Compared with the baseline models, the proposed method improves mAP by 5.1 and 7.6 percentage points on Market-1501 and MSMT17, respectively.
(This article belongs to the Special Issue Image/Video Processing and Computer Vision)

23 pages, 9516 KB  
Article
Physics-Prior-Guided Feature Pyramid Network for Unified Multi-Angle Spectral–Polarimetric Cloud Detection
by Shu Li, Xingyuan Ji, Xiaoxue Chu, Song Ye, Ziyang Zhang, Yongyin Gan, Xinqiang Wang and Fangyuan Wang
Remote Sens. 2026, 18(8), 1150; https://doi.org/10.3390/rs18081150 - 12 Apr 2026
Viewed by 460
Abstract
Accurate cloud detection remains a significant challenge due to the spectral ambiguity between clouds and bright or heterogeneous surfaces (e.g., snow, desert). While multi-angle and polarization data offer rich information, the discriminative power of joint spectral analysis for resolving these ambiguities has been underexploited. In this work, we demonstrate that physically motivated spectral band ratios and differences can robustly enhance cloud signatures. Motivated by this insight, we propose a novel deep learning framework, the Multi-angle Polarization Feature Pyramid Structure (MP-FPS), that explicitly leverages joint spectral features as discriminative priors. Our architecture employs a dual-branch network to disentangle and adaptively fuse spectral and multi-angle polarization modalities. Within this framework, a hierarchical, multi-scale cross-channel multi-angle fusion module dynamically captures spatial–spectral–angular dependencies, enriching the structural representation of clouds. Furthermore, a channel-space dual-path attention mechanism refines sub-pixel responses, significantly improving detection accuracy in challenging regions such as cloud edges and thin cirrus. Evaluated on the global POLDER-3 dataset, MP-FPS achieves a mean Intersection over Union (mIoU) of 0.8662 across diverse surface types, surpassing the official baseline by 12.4%. This study establishes joint spectral analysis as a critical enabler for high-precision cloud masking, and demonstrates its synergistic value when integrated with multi-angle polarimetric information in a unified deep architecture.
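The abstract above reports a mean Intersection over Union (mIoU). As a minimal sketch, the metric can be computed from flattened predicted and reference label maps as shown below. The function name `mean_iou`, the integer class encoding (0 = clear, 1 = cloud), and the choice to skip classes absent from both maps are assumptions for illustration; the paper's exact evaluation protocol is not given in the abstract.

```python
def mean_iou(pred, truth, num_classes):
    """Mean IoU over classes for two equal-length sequences of integer labels."""
    inter = [0] * num_classes
    union = [0] * num_classes
    for p, t in zip(pred, truth):
        if p == t:
            # Pixel counts once toward both intersection and union of its class.
            inter[p] += 1
            union[p] += 1
        else:
            # A mismatch enlarges the union of both classes involved.
            union[p] += 1
            union[t] += 1
    # Classes that appear in neither map are skipped (an assumed convention).
    ious = [i / u for i, u in zip(inter, union) if u > 0]
    return sum(ious) / len(ious) if ious else 0.0


if __name__ == "__main__":
    pred = [1, 1, 0, 0]    # hypothetical flattened prediction
    truth = [1, 0, 0, 0]   # hypothetical flattened reference mask
    # Class 1: IoU = 1/2; class 0: IoU = 2/3; mean = 7/12.
    print(mean_iou(pred, truth, num_classes=2))
```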