MDPI - Publisher of Open Access Journals

19 pages, 1948 KB

Open Access Article

Graph-MambaRoadDet: A Symmetry-Aware Dynamic Graph Framework for Road Damage Detection

by Zichun Tian, Xiaokang Shao and Yuqi Bai

Symmetry 2025, 17(10), 1654; https://doi.org/10.3390/sym17101654 - 5 Oct 2025

Viewed by 186

Road-surface distress poses a serious threat to traffic safety and imposes a growing burden on urban maintenance budgets. While modern detectors based on convolutional networks and Vision Transformers achieve strong frame-level performance, they often overlook an essential property of road environments: structural symmetry within road networks and damage patterns. We present Graph-MambaRoadDet (GMRD), a symmetry-aware and lightweight framework that integrates dynamic graph reasoning with state–space modeling for accurate, topology-informed, and real-time road damage detection. Specifically, GMRD employs an EfficientViM-T1 backbone and two DefMamba blocks, whose deformable scanning paths capture sub-pixel crack patterns while preserving geometric symmetry. A superpixel-based graph is constructed by projecting image regions onto OpenStreetMap road segments, encoding both spatial structure and symmetric topological layout. We introduce a Graph-Generating State–Space Model (GG-SSM) that synthesizes sparse sample-specific adjacency in O(M) time, further refined by a fusion module that combines detector self-attention with prior symmetry constraints. A consistency loss promotes smooth predictions across symmetric or adjacent segments. The full INT8 model contains only 1.8 M parameters and 1.5 GFLOPs, sustaining 45 FPS at 7 W on a Jetson Orin Nano, which makes it eight times lighter and 1.7× faster than YOLOv8-s. On RDD2022, TD-RD, and RoadBench-100K, GMRD surpasses strong baselines by up to +6.1 mAP_50:95 and, on the new RoadGraph-RDD benchmark, achieves +5.3 G-mAP and +0.05 consistency gain. Qualitative results demonstrate robustness under shadows, reflections, back-lighting, and occlusion. By explicitly modeling spatial and topological symmetry, GMRD offers a principled solution for city-scale road infrastructure monitoring under real-time and edge-computing constraints. Full article
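
The abstract does not give the form of the consistency loss. As a minimal sketch of the general idea, one common choice is a graph smoothness penalty: the mean squared difference between the damage scores of linked segments. The function name, the edge-list representation, and the squared-difference form are all assumptions made for illustration, not the authors' definition.

```python
def consistency_loss(scores, edges):
    """Mean squared difference between damage scores of linked road segments.

    scores: per-segment damage probabilities (hypothetical values).
    edges:  (i, j) index pairs for adjacent or symmetric segments (assumed input format).
    """
    if not edges:
        return 0.0
    total = sum((scores[i] - scores[j]) ** 2 for i, j in edges)
    return total / len(edges)


# Three segments in a row; the middle one disagrees with both neighbours.
scores = [0.9, 0.2, 0.8]
edges = [(0, 1), (1, 2)]
loss = consistency_loss(scores, edges)
```

A penalty of this kind is zero when all linked segments agree and grows as neighbouring predictions diverge, which is the "smooth predictions" behaviour the abstract describes.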

(This article belongs to the Section Computer)

22 pages, 3419 KB

Open Access Article

A Small-Sample Prediction Model for Ground Surface Settlement in Shield Tunneling Based on Adjacent-Ring Graph Convolutional Networks (GCN-SSPM)

by Jinpo Li, Haoxuan Huang and Gang Wang

Buildings 2025, 15(19), 3519; https://doi.org/10.3390/buildings15193519 - 30 Sep 2025

Viewed by 208

Abstract

In some projects, a lack of data causes problems for presenting an accurate prediction model for surface settlement caused by shield tunneling. Existing models often rely on large volumes of data and struggle to maintain accuracy and reliability in shield tunneling. In particular, the spatial dependency between adjacent rings is overlooked. To address these limitations, this study presents a small-sample prediction framework for settlement induced by shield tunneling, using an adjacent-ring graph convolutional network (GCN-SSPM). Gaussian smoothing, empirical mode decomposition (EMD), and principal component analysis (PCA) are integrated into the model, which incorporates spatial topological priors by constructing a ring-based adjacency graph to extract essential features. A dynamic ensemble strategy is further employed to enhance robustness across layered geological conditions. Monitoring data from the Wuhan Metro project is used to demonstrate that GCN-SSPM yields accurate and stable predictions, particularly in zones facing abrupt settlement shifts. Compared to LSTM+GRU+Attention and XGBoost, the proposed model reduces RMSE by over 90% (LSTM) and 75% (XGBoost), respectively, while achieving an R² of about 0.71. Notably, the ensemble assigns over 70% of predictive weight to GCN-SSPM in disturbance-sensitive zones, emphasizing its effectiveness in capturing spatially coupled and nonlinear settlement behavior. The prediction error remains within ±1.2 mm, indicating strong potential for practical applications in intelligent construction and early risk mitigation in complex geological conditions. Full article
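
To make the ring-based adjacency graph concrete, the sketch below builds a chain graph in which each tunnel ring is linked to the rings immediately before and after it, then applies the symmetric normalization used by standard graph convolution layers. The chain topology, the scalar per-ring feature, and all function names are assumptions for illustration; the abstract does not specify how the paper constructs or weights its graph.

```python
import math


def ring_chain_adjacency(n_rings):
    """Adjacency matrix with self-loops: ring i is linked to rings i-1 and i+1."""
    a = [[0.0] * n_rings for _ in range(n_rings)]
    for i in range(n_rings):
        a[i][i] = 1.0
        if i + 1 < n_rings:
            a[i][i + 1] = a[i + 1][i] = 1.0
    return a


def normalize(a):
    """Symmetric normalization D^{-1/2} A D^{-1/2}, where D holds the row sums of A."""
    deg = [sum(row) for row in a]
    n = len(a)
    return [[a[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n)] for i in range(n)]


def propagate(a_hat, features):
    """One propagation step: each ring's value becomes a weighted mix of itself and its neighbours."""
    n = len(features)
    return [sum(a_hat[i][j] * features[j] for j in range(n)) for i in range(n)]


a_hat = normalize(ring_chain_adjacency(3))
# Hypothetical settlement signal (mm) concentrated at the middle ring.
smoothed = propagate(a_hat, [0.0, 6.0, 0.0])
```

After one step the spike at the middle ring has spread to its neighbours, which is how a graph layer lets information from adjacent rings inform each prediction.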

(This article belongs to the Section Building Structures)

21 pages, 5230 KB

Open Access Article

Attention-Guided Differentiable Channel Pruning for Efficient Deep Networks

by Anouar Chahbouni, Khaoula El Manaa, Yassine Abouch, Imane El Manaa, Badre Bossoufi, Mohammed El Ghzaoui and Rachid El Alami

Mach. Learn. Knowl. Extr. 2025, 7(4), 110; https://doi.org/10.3390/make7040110 - 29 Sep 2025

Viewed by 313

Abstract

Deploying deep learning (DL) models in real-world environments remains a major challenge, particularly under resource-constrained conditions where achieving both high accuracy and compact architectures is essential. While effective, conventional pruning methods often suffer from high computational overhead, accuracy degradation, or disruption of the end-to-end training process, limiting their practicality for embedded and real-time applications. We present Dynamic Attention-Guided Pruning (DAGP), a soft channel pruning framework that overcomes these limitations by embedding learnable, differentiable pruning masks directly within convolutional neural networks (CNNs). These masks act as implicit attention mechanisms, adaptively suppressing non-informative channels during training. A progressively scheduled L1 regularization, activated after a warm-up phase, enables gradual sparsity while preserving early learning capacity. Unlike prior methods, DAGP is retraining-free, introduces minimal architectural overhead, and supports optional hard pruning for deployment efficiency. Joint optimization of classification and sparsity objectives ensures stable convergence and task-adaptive channel selection. Experiments on CIFAR-10 (VGG16, ResNet56) and PlantVillage (custom CNN) achieve up to 98.82% FLOPs reduction with accuracy gains over baselines. Real-world validation on an enhanced PlantDoc dataset for agricultural monitoring achieves 60 ms inference with only 2.00 MB RAM on a Raspberry Pi 4, confirming efficiency under field conditions. These results illustrate DAGP’s potential to scale beyond agriculture to diverse edge-intelligent systems requiring lightweight, accurate, and deployable models. Full article
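
The combination of soft masks, a warm-up phase, and a progressively scheduled L1 term can be sketched as follows. The sigmoid parameterization, the linear ramp, the default hyperparameters, and every function name here are assumptions chosen for illustration; the abstract does not state the exact schedule or mask function used by DAGP.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def l1_weight(epoch, warmup_epochs, ramp_epochs, max_weight):
    """Regularization strength: zero during warm-up, then a linear ramp up to max_weight."""
    if epoch < warmup_epochs:
        return 0.0
    progress = (epoch - warmup_epochs + 1) / ramp_epochs
    return max_weight * min(1.0, progress)


def sparsity_penalty(mask_logits, epoch, warmup_epochs=5, ramp_epochs=10, max_weight=1e-3):
    """Scheduled L1 penalty on soft channel masks m = sigmoid(logit).

    The masks are non-negative, so their L1 norm is simply their sum.
    """
    masks = [sigmoid(z) for z in mask_logits]
    return l1_weight(epoch, warmup_epochs, ramp_epochs, max_weight) * sum(masks)


def hard_prune(mask_logits, threshold=0.5):
    """Optional hard pruning: indices of channels whose soft mask exceeds the threshold."""
    return [i for i, z in enumerate(mask_logits) if sigmoid(z) > threshold]


kept = hard_prune([2.0, -3.0, 0.1])
```

In a training loop this penalty would be added to the classification loss, so early epochs learn features without sparsity pressure and later epochs push uninformative masks toward zero.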

14 pages, 1507 KB

Open Access Article

Diagnostic Efficacy of Olfactory Function Test Using Functional Near-Infrared Spectroscopy with Machine Learning in Healthy Adults: A Prospective Diagnostic-Accuracy (Feasibility/Validation) Study in Healthy Adults with Algorithm Development

by Minhyuk Lim, Seonghyun Kim, Dong Keon Yon and Jaewon Kim

Diagnostics 2025, 15(19), 2433; https://doi.org/10.3390/diagnostics15192433 - 24 Sep 2025

Viewed by 314

Abstract

Background/Objectives: The YSK olfactory function (YOF) test is a culturally adapted psychophysical tool that assesses threshold, discrimination, and identification. This study evaluated whether functional near-infrared spectroscopy (fNIRS) synchronized with routine YOF testing, combined with machine learning, can predict YOF subdomain performance in healthy adults, providing an objective neural correlate to complement behavioral testing. Methods: In this prospective diagnostic-accuracy (feasibility/validation) study in healthy adults with algorithm development, 100 healthy adults completed the YOF test while undergoing prefrontal/orbitofrontal fNIRS during odor blocks. Feature sets from ΔHbO/ΔHbR included time-domain descriptors, complexity (Lempel–Ziv), and information-theoretic measures (mutual information); the identification task used a hybrid attention–CNN. Separate models were developed for threshold (binary classification), discrimination (binary classification), and identification (binary classification). Performance was summarized with accuracy, area under the curve (AUC), F1-score, and (where applicable) sensitivity/specificity, using participant-level cross-validation. Results: The threshold classifier achieved accuracy 0.86, AUC 0.86, and F1 0.86, indicating strong discrimination of correct vs. incorrect threshold responses. The discrimination model yielded accuracy 0.75, AUC 0.76, and F1 0.75. The identification model (attention–convolutional neural network [CNN]) achieved accuracy 0.88, sensitivity 0.86, specificity 0.91, and F1 0.88. Feature-attribution (e.g., SHapley Additive exPlanations [SHAP]) provided interpretable links between fNIRS features and task performance for threshold and discrimination. Conclusions: Olfactory-evoked fNIRS signals can accurately predict YOF subdomain performance in healthy adults, supporting the feasibility of non-invasive, portable, near–real-time olfactory monitoring. 
These findings are preliminary and not generalizable to clinical populations; external validation in diverse cohorts is warranted. The approach clarifies the scientific essence of the method by (i) aligning psychophysical outcomes with objective hemodynamic signatures and (ii) introducing a feature-rich modeling pipeline (ΔHbO/ΔHbR + Lempel–Ziv complexity/mutual information; attention–CNN) that advances prior work. Full article
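
For readers unfamiliar with the Lempel–Ziv complexity feature, the sketch below counts the components in the 1976 Lempel–Ziv exhaustive-history parsing of a binary string. Binarizing a signal at its median is a common preprocessing step and is an assumption here, as are the function names; the abstract does not state which variant or binarization the study used.

```python
from statistics import median


def binarize(signal):
    """Map a numeric series to a '0'/'1' string by thresholding at its median (assumed preprocessing)."""
    m = median(signal)
    return "".join("1" if x > m else "0" for x in signal)


def lempel_ziv_complexity(s):
    """Number of components in the Lempel-Ziv (1976) parsing of string s.

    Each component is the shortest substring starting at position i that does
    not already occur in the text preceding its own last character.
    """
    i, count, n = 0, 0, len(s)
    while i < n:
        length = 1
        # Grow the candidate while it can still be copied from earlier text.
        while i + length <= n and s[i:i + length] in s[:i + length - 1]:
            length += 1
        count += 1
        i += length
    return count


bits = binarize([1, 5, 2, 8])
c = lempel_ziv_complexity("0001101001000101")
```

The example string parses as 0 · 001 · 10 · 100 · 1000 · 101, giving six components. Regular signals yield few components, while irregular ones yield many.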

(This article belongs to the Section Machine Learning and Artificial Intelligence in Diagnostics)

24 pages, 5243 KB

Open Access Article

Multi-Segment Extendable Soft Manipulator Driven by a Pneumatic–Tendon Coupling Mechanism

by Hongxi Yang, Yufeng Zeng, Zeyu Zhong, Zhiyan Chen, Junxi Zhou, Zhicheng Ling, Ye Chen and Yunquan Li

Biomimetics 2025, 10(10), 643; https://doi.org/10.3390/biomimetics10100643 - 23 Sep 2025

Viewed by 425

Abstract

Continuum robots have garnered significant attention for their high flexibility and adaptability to complex environments. However, achieving the same level of high-precision control as rigid robots remains a significant challenge. This paper introduces an innovative Multi-Segment Extendable Soft Manipulator (MSESM) that employs a pneumatic–tendon hybrid drive mechanism. The design, utilizing off-the-shelf industrial bellows and 3D-printed components, allows the manipulator to achieve an extension ratio of up to 156.85%. By adopting a differential stiffness design, its bending stiffness was increased by approximately 4–5 times, its axial stiffness was increased by approximately 10 times, and its torsional resistance was enhanced, preventing inter-segment coupling during motion. At the control level, this paper proposes a hybrid control method that integrates a Constant Curvature (CC) physical prior with a data-driven neural network. Experimental results show that in tracking rectangular, triangular, and circular trajectories, this hybrid method reduced the average tracking error by 60.43% compared to a purely neural network-based controller, with the error reduction for the rectangular trajectory reaching 74.19%. This research validates a practical and effective approach for creating soft manipulators that successfully merge high flexibility with high-precision control. Full article
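
The Constant Curvature (CC) prior treats each segment as a circular arc, which gives a closed-form tip position. The sketch below shows that textbook relation for a single segment; the parameter names and the choice of the base frame (origin, pointing along +z) are assumptions, and this is not the paper's controller, which combines such a prior with a neural network.

```python
import math


def cc_tip_position(length, curvature, phi):
    """Tip position of one constant-curvature segment.

    The base sits at the origin and points along +z.
    length:    arc length of the segment.
    curvature: 1 / bending radius (0 means straight).
    phi:       direction of the bending plane, measured around the z-axis.
    """
    if abs(curvature) < 1e-9:
        return (0.0, 0.0, length)  # straight segment: tip lies on the z-axis
    theta = curvature * length  # total bend angle of the arc
    radial = (1.0 - math.cos(theta)) / curvature
    return (radial * math.cos(phi), radial * math.sin(phi), math.sin(theta) / curvature)


# A unit-length segment bent into a quarter circle within the x-z plane.
tip = cc_tip_position(1.0, math.pi / 2, 0.0)
```

A multi-segment arm would chain such arcs by composing each segment's pose with the next. A learned model can then correct the residual error where the real manipulator deviates from a perfect arc.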

(This article belongs to the Special Issue Bio-Inspired Soft Robotics: Design, Fabrication and Applications: 2nd Edition)

22 pages, 5746 KB

Open Access Article

AGSK-Net: Adaptive Geometry-Aware Stereo-KANformer Network for Global and Local Unsupervised Stereo Matching

by Qianglong Feng, Xiaofeng Wang, Zhenglin Lu, Haiyu Wang, Tingfeng Qi and Tianyi Zhang

Sensors 2025, 25(18), 5905; https://doi.org/10.3390/s25185905 - 21 Sep 2025

Viewed by 454

Abstract

The performance of unsupervised stereo matching in complex regions such as weak textures and occlusions is constrained by the inherently local receptive fields of convolutional neural networks (CNNs), the absence of geometric priors, and the limited expressiveness of MLP in conventional ViTs. To address these problems, we propose an Adaptive Geometry-aware Stereo-KANformer Network (AGSK-Net) for unsupervised stereo matching. Firstly, to resolve the conflict between the isotropic nature of traditional ViT and the epipolar geometry priors in stereo matching, we propose Adaptive Geometry-aware Multi-head Self-Attention (AG-MSA), which embeds epipolar priors via an adaptive hybrid structure of geometric modulation and penalty, enabling geometry-aware global context modeling. Secondly, we design Spatial Group-Rational KAN (SGR-KAN), which integrates the nonlinear capability of rational functions with the spatial awareness of deep convolutions, replacing the MLP with flexible, learnable rational functions to enhance the nonlinear expression ability of complex regions. Finally, we propose a Dynamic Candidate Gated Fusion (DCGF) module that employs dynamic dual-candidate states and spatially aware pre-enhancement to adaptively fuse global and local features across scales. Experiments demonstrate that AGSK-Net achieves state-of-the-art accuracy and generalizability on Scene Flow, KITTI 2012/2015, and Middlebury 2021. Full article

(This article belongs to the Special Issue Deep Learning Technology and Image Sensing: 2nd Edition)

16 pages, 993 KB

Open Access Article

A Multi-Feature Domain Interaction Learning Framework for Anomalous Network Detection

by Wei Sun, Fucun Zhang, Liang Guo and Xiao Liu

Electronics 2025, 14(18), 3729; https://doi.org/10.3390/electronics14183729 - 20 Sep 2025

Viewed by 343

Abstract

Network anomaly detection aims to identify abnormal traffic patterns that may indicate faults or cyber threats. This task requires modeling complex network flows composed of heterogeneous features, such as static headers, packet sequences, and statistical summaries. However, most existing methods focus on temporal modeling and treat flows as uniform sequences, overlooking feature heterogeneity and dependencies across domains. As a result, they often miss subtle anomalies that can be reflected by cross-domain correlations, highlighting the need for more structured modeling. We propose a domain-aware framework for network anomaly detection that explicitly models the heterogeneity of flow-level features and their cross-domain interactions. To address the limitations of prior work in handling heterogeneous flow features, we design an Intra-Domain Expert Network (IDEN) that uses convolutional and feed-forward layers to independently extract patterns from distinct domains. We further introduce an Inter-Domain Expert Network (EDEN) that uses attention mechanisms to capture dependencies across domains and produces integrated flow representations. These refined representations are passed to a Transformer-based temporal module to detect anomalies over time, including gradually evolving or coordinated behaviors. Experiments on multiple public datasets show that our method achieves higher detection accuracy, demonstrating the value of explicitly modeling intra-domain structure and inter-domain dependencies. Full article

28 pages, 6410 KB

Open Access Article

Two-Step Forward Modeling for GPR Data of Metal Pipes Based on Image Translation and Style Transfer

by Zhishun Guo, Yesheng Gao, Zicheng Huang, Mengyang Shi and Xingzhao Liu

Remote Sens. 2025, 17(18), 3215; https://doi.org/10.3390/rs17183215 - 17 Sep 2025

Viewed by 308

Abstract

Ground-penetrating radar (GPR) is an important geophysical technique in subsurface detection. However, traditional numerical simulation methods such as finite-difference time-domain (FDTD) face challenges in accurately simulating complex heterogeneous mediums in real-world scenarios due to the difficulty of obtaining precise medium distribution information and high computational costs. Meanwhile, deep learning methods require excessive prior information, which limits their application. To address these issues, this paper proposes a novel two-step forward modeling strategy for GPR data of metal pipes. The first step employs the proposed Polarization Self-Attention Image Translation network (PSA-ITnet) for image translation, which is inspired by the process where a neural network model “understands” image content and “rewrites” it according to specified rules. It converts scene layout images (cross-sectional schematics depicting geometric details such as the size and spatial distribution of underground buried metal pipes and their surrounding medium) into simulated clutter-free GPR B-scan images. By integrating the polarized self-attention (PSA) mechanism into the Unet generator, PSA-ITnet can capture long-range dependencies, enhancing its understanding of the longitudinal time-delay property in GPR B-scan images, which is crucial for accurately generating hyperbolic signatures of metal pipes in simulated data. The second step uses the Polarization Self-Attention Style Transfer network (PSA-STnet) for style transfer, which transforms the simulated clutter-free images into data matching the distribution and characteristics of a real-world underground heterogeneous medium under unsupervised conditions while retaining target information. This step bridges the gap between ideal simulations and actual GPR data. Simulation experiments confirm that PSA-ITnet outperforms traditional methods in image translation, and PSA-STnet shows superiority in style transfer.
Real-world experiments in a complex bridge support structure scenario further verify the method’s practicability and robustness. Compared to FDTD, the proposed strategy is capable of generating GPR data matching real-world subsurface heterogeneous medium distributions from scene layout models, significantly reducing time costs and providing an efficient solution for GPR data simulation and analysis. Full article

19 pages, 3745 KB

Open Access Article

Anomaly Detection in Mineral Micro-X-Ray Fluorescence Spectroscopy Based on a Multi-Scale Feature Aggregation Network

by Yangxin Lu, Weiming Jiang, Molei Zhao, Yuanzhi Zhou, Jie Yang, Kunfeng Qiu and Qiuming Cheng

Minerals 2025, 15(9), 970; https://doi.org/10.3390/min15090970 - 13 Sep 2025

Viewed by 335

Abstract

Micro-X-ray fluorescence spectroscopy (micro-XRF) integrates spatial and spectral information and is widely employed for multi-elemental analyses of rock-forming minerals. However, its inherent limitation in spatial resolution gives rise to significant pixel mixing, thereby hindering the accurate identification of fine-scale or anomalous mineral phases. Furthermore, most existing methods heavily rely on manually labeled data or predefined spectral libraries, rendering them poorly adaptable to complex and variable mineral systems. To address these challenges, this paper presents an unsupervised deep aggregation network (MSFA-Net) for micro-XRF imagery, aiming to eliminate the reliance of traditional methods on prior knowledge and enhance the recognition capability of rare mineral anomalies. Built on an autoencoder architecture, MSFA-Net incorporates a multi-scale orthogonal attention module to strengthen spectral–spatial feature fusion and employs density-based adaptive clustering to guide semantically aware reconstruction, thus achieving high-precision responses to potential anomalous regions. Experiments on real-world micro-XRF datasets demonstrate that MSFA-Net not only outperforms mainstream anomaly detection methods but also transcends the physical resolution limits of the instrument, successfully identifying subtle mineral anomalies that traditional approaches fail to detect. This method presents a novel paradigm for high-throughput and weakly supervised interpretation of complex geological images. Full article

(This article belongs to the Special Issue Gold–Polymetallic Deposits in Convergent Margins)

23 pages, 5635 KB

Open Access Article

Attention-Based Transfer Enhancement Network for Cross-Corpus EEG Emotion Recognition

by Zongni Li, Kin-Yeung Wong and Chan-Tong Lam

Sensors 2025, 25(18), 5718; https://doi.org/10.3390/s25185718 - 13 Sep 2025

Viewed by 519

Abstract

A critical challenge in EEG-based emotion recognition is the poor generalization of models across different datasets due to significant domain shifts. Traditional methods struggle because they either overfit to source-domain characteristics or fail to bridge large discrepancies between datasets. To address this, we propose the Cross-corpus Attention-based Transfer Enhancement network (CATE), a novel two-stage framework. The core novelty of CATE lies in its dual-view self-supervised pre-training strategy, which learns robust, domain-invariant representations by approaching the problem from two complementary perspectives. Unlike single-view models that capture an incomplete picture, our framework synergistically combines: (1) Noise-Enhanced Representation Modeling (NERM), which builds resilience to domain-specific artifacts and noise, and (2) Wavelet Transform Representation Modeling (WTRM), which captures the essential, multi-scale spectral patterns fundamental to emotion. This dual approach moves beyond the brittle assumptions of traditional domain adaptation, which often fails when domains are too dissimilar. In the second stage, a supervised fine-tuning process adapts these powerful features for classification using attention-based mechanisms. Extensive experiments on six transfer tasks across the SEED, SEED-IV, and SEED-V datasets demonstrate that CATE establishes a new state-of-the-art, achieving accuracies from 68.01% to 81.65% and outperforming prior methods by up to 15.65 percentage points. By effectively learning transferable features from these distinct, synergistic views, CATE provides a robust framework that significantly advances the practical applicability of cross-corpus EEG emotion recognition. Full article

(This article belongs to the Section Intelligent Sensors)

25 pages, 7964 KB

Open Access Article

DSCSRN: Physically Guided Symmetry-Aware Spatial-Spectral Collaborative Network for Single-Image Hyperspectral Super-Resolution

by Xueli Chang, Jintong Liu, Guotao Wen, Xiaoyu Huang and Meng Yan

Symmetry 2025, 17(9), 1520; https://doi.org/10.3390/sym17091520 - 12 Sep 2025

Viewed by 374

Abstract

Hyperspectral images (HSIs), with their rich spectral information, are widely used in remote sensing; yet the inherent trade-off between spectral and spatial resolution in imaging systems often limits spatial details. Single-image hyperspectral super-resolution (HSI-SR) seeks to recover high-resolution HSIs from a single low-resolution input, but the high dimensionality and spectral redundancy of HSIs make this task challenging. In HSIs, spectral signatures and spatial textures often exhibit intrinsic symmetries, and preserving these symmetries provides additional physical constraints that enhance reconstruction fidelity and robustness. To address these challenges, we propose the Dynamic Spectral Collaborative Super-Resolution Network (DSCSRN), an end-to-end framework that integrates physical modeling with deep learning and explicitly embeds spatial–spectral symmetry priors into the network architecture. DSCSRN processes low-resolution HSIs with a Cascaded Residual Spectral Decomposition Network (CRSDN) to compress redundant channels while preserving spatial structures, generating accurate abundance maps. These maps are refined by two Synergistic Progressive Feature Refinement Modules (SPFRMs), which progressively enhance spatial textures and spectral details via a multi-scale dual-domain collaborative attention mechanism. The Dynamic Endmember Adjustment Module (DEAM) then adaptively updates spectral endmembers according to scene context, overcoming the limitations of fixed-endmember assumptions. Grounded in the Linear Mixture Model (LMM), this unmixing–recovery–reconstruction pipeline restores subtle spectral variations alongside improved spatial resolution. Experiments on the Chikusei, Pavia Center, and CAVE datasets show that DSCSRN outperforms state-of-the-art methods in both perceptual quality and quantitative performance, achieving an average PSNR of 43.42 and a SAM of 1.75 (×4 scale) on Chikusei. 
The integration of symmetry principles offers a unifying perspective aligned with the intrinsic structure of HSIs, producing reconstructions that are both accurate and structurally consistent. Full article
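
The Linear Mixture Model that underlies the unmixing–recovery–reconstruction pipeline says that each pixel spectrum is an abundance-weighted sum of endmember spectra. The sketch below shows that relation for one pixel, together with the spectral angle used by the SAM metric. The endmember values, abundances, and function names are hypothetical and chosen only for illustration.

```python
import math


def mix_pixel(endmembers, abundances):
    """Linear Mixture Model: pixel[b] = sum over k of abundances[k] * endmembers[k][b]."""
    n_bands = len(endmembers[0])
    return [sum(a * e[b] for a, e in zip(abundances, endmembers)) for b in range(n_bands)]


def spectral_angle_deg(x, y):
    """Angle in degrees between two spectra (the quantity averaged by the SAM metric)."""
    dot = sum(p * q for p, q in zip(x, y))
    norm = math.sqrt(sum(p * p for p in x)) * math.sqrt(sum(q * q for q in y))
    cosine = max(-1.0, min(1.0, dot / norm))  # clamp to guard against rounding error
    return math.degrees(math.acos(cosine))


# Two hypothetical materials observed in three spectral bands.
endmembers = [[0.1, 0.4, 0.8], [0.6, 0.5, 0.2]]
abundances = [0.25, 0.75]  # non-negative and summing to one
pixel = mix_pixel(endmembers, abundances)
```

Under this model, super-resolving the low-dimensional abundance maps and then multiplying by the endmembers reconstructs a full-resolution spectrum for every pixel.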

(This article belongs to the Section Computer)

17 pages, 3815 KB

Open Access Article

LMeRAN: Label Masking-Enhanced Residual Attention Network for Multi-Label Chest X-Ray Disease Aided Diagnosis

by Hongping Fu, Chao Song, Xiaolong Qu, Dongmei Li and Lei Zhang

Sensors 2025, 25(18), 5676; https://doi.org/10.3390/s25185676 - 11 Sep 2025

Viewed by 451

Abstract

Chest X-ray (CXR) imaging is essential for diagnosing thoracic diseases, and computer-aided diagnosis (CAD) systems have made substantial progress in automating the interpretation of CXR images. However, some existing methods often overemphasize local features while neglecting global context, limiting their ability to capture the broader pathological landscape. Moreover, most methods fail to model label correlations, leading to insufficient utilization of prior knowledge. To address these limitations, we propose a novel multi-label CXR image classification framework, termed the Label Masking-enhanced Residual Attention Network (LMeRAN). Specifically, LMeRAN introduces an original label-specific residual attention to capture disease-relevant information effectively. By integrating multi-head self-attention with average pooling, the model dynamically assigns higher weights to critical lesion areas while retaining global contextual features. In addition, LMeRAN employs a label mask training strategy, enabling the model to learn complex label dependencies from partially available label information. Experiments conducted on the large-scale public dataset ChestX-ray14 demonstrate that LMeRAN achieves the highest mean AUC value of 0.825, resulting in an increase of 3.1% to 8.0% over several advanced baselines. To enhance interpretability, we also visualize the lesion regions relied upon by the model for classification, providing clearer insights into the model’s decision-making process. Full article

(This article belongs to the Special Issue Vision- and Image-Based Biomedical Diagnostics—2nd Edition)

21 pages, 1247 KB

Open Access Review

Bayesian Graphical Models for Multiscale Inference in Medical Image-Based Joint Degeneration Analysis

by Rahul Kumar, Kiran Marla, Puja Ravi, Kyle Sporn, Rohit Srinivas, Swapna Vaja, Alex Ngo and Alireza Tavakkoli

Diagnostics 2025, 15(18), 2295; https://doi.org/10.3390/diagnostics15182295 - 10 Sep 2025

Viewed by 573

Abstract

Joint degeneration is a major global health issue requiring improved diagnostic and prognostic tools. This review examines whether integrating Bayesian graphical models with multiscale medical imaging can enhance detection, analysis, and prediction of joint degeneration compared to traditional single-scale methods. Recent advances in quantitative MRI, such as T2 mapping, enable early detection of subtle cartilage changes, supporting earlier intervention. Bayesian graphical models provide a flexible framework for representing complex relationships and updating predictions as new evidence emerges. Unlike prior reviews that address Bayesian methods or musculoskeletal imaging separately, this work synthesizes these domains into a unified framework that spans molecular, cellular, tissue, and organ-level analyses, providing methodological guidance and clinical translation pathways. Key topics within Bayesian inference include multiscale analysis, probabilistic graphical models, spatial-temporal modeling, network connectivity analysis, advanced imaging biomarkers, quantitative analysis, quantitative MRI techniques, radiomics and texture analysis, multimodal integration strategies, uncertainty quantification, variational inference approaches, Monte Carlo methods, and model selection and validation, as well as diffusion models for medical imaging and Bayesian joint diffusion models. Additional attention is given to diffusion models for advanced medical image generation, addressing challenges such as limited datasets and patient privacy. Clinical translation and validation requirements are emphasized, highlighting the need for rigorous evaluation to ensure that synthesized or processed images maintain diagnostic accuracy. 
Finally, this review discusses implementation challenges and outlines future research directions, emphasizing the potential for earlier diagnosis, improved risk assessment, and personalized treatment strategies to reduce the growing global burden of musculoskeletal disorders.

(This article belongs to the Special Issue Artificial Intelligence for Precision Analysis and Decision Making in Medical Imaging)

23 pages, 7046 KB

Open AccessArticle

Atmospheric Scattering Prior Embedded Diffusion Model for Remote Sensing Image Dehazing

by Shanqin Wang and Miao Zhang

Atmosphere 2025, 16(9), 1065; https://doi.org/10.3390/atmos16091065 - 10 Sep 2025

Viewed by 538

Abstract

Remote sensing image dehazing presents substantial challenges in balancing physical fidelity with generative flexibility, particularly under complex atmospheric conditions and sensor-specific degradation patterns. Traditional physics-based methods often struggle with nonlinear haze distributions, while purely data-driven approaches tend to lack interpretability and physical consistency. To bridge this gap, we propose the Atmospheric Scattering Prior embedded Diffusion Model (ASPDiff), a novel framework that seamlessly integrates atmospheric physics into the diffusion-based generative restoration process. ASPDiff establishes a closed-loop feedback mechanism by embedding the atmospheric scattering model as a physics-driven regularization throughout both the forward degradation simulation and the reverse denoising trajectory. The framework operates through the following three synergistic components: (1) an Atmospheric Prior Estimation Module that uses the Dark Channel Prior to generate initial estimates of the transmission map and global atmospheric light, which are then refined through learnable adjustment networks; (2) a Diffusion Process with Atmospheric Prior Embedding, where the refined priors serve as conditional guidance during the reverse diffusion sampling, ensuring physical plausibility; and (3) a Haze-Aware Refinement Module that adaptively enhances structural details and compensates for residual haze via frequency-aware decomposition and spatial attention. Extensive experiments on both synthetic and real-world remote sensing datasets demonstrate that ASPDiff significantly outperforms existing methods, achieving state-of-the-art performance while maintaining strong physical interpretability.
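For orientation, the atmospheric scattering model and the Dark Channel Prior named in the abstract are usually written as below. This is the standard textbook formulation, not ASPDiff's own notation, which the abstract does not give; the symbols I (hazy image), J (haze-free scene), t (transmission map), A (global atmospheric light), the local patch Ω(x), the haze-retention factor ω, and the lower bound t_0 are all assumptions introduced here.

```latex
% Atmospheric scattering model: observed hazy image as a blend of scene radiance and airlight
I(x) = J(x)\, t(x) + A \bigl(1 - t(x)\bigr)

% Dark Channel Prior: initial transmission estimate from the per-patch, per-channel minimum
\tilde{t}(x) = 1 - \omega \min_{y \in \Omega(x)} \min_{c \in \{r,g,b\}} \frac{I^{c}(y)}{A^{c}}

% Scene recovery once t and A are estimated (t_0 avoids division by near-zero transmission)
J(x) = \frac{I(x) - A}{\max\bigl(t(x),\, t_0\bigr)} + A
```

In the abstract's terms, the first module would supply initial estimates of t and A of this kind, which the framework then refines with learnable networks and uses as conditioning during reverse diffusion.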

(This article belongs to the Section Atmospheric Techniques, Instruments, and Modeling)

16 pages, 1402 KB

Open AccessArticle

A Sparse Attention Mechanism Based Redundancy-Aware Retrieval Framework for Power Grid Inspection Images

by Wei Yang, Zhenyu Chen, Xiaoguang Huang, Ming Li, Hailu Wang and Shi Liu

Electronics 2025, 14(18), 3585; https://doi.org/10.3390/electronics14183585 - 10 Sep 2025

Viewed by 366

Abstract

Driven by the rapid advancement of smart grid frameworks, the volume of visual data collected from power system diagnostic equipment has surged exponentially. A substantial portion of these images (30–40%) are redundant or highly similar, primarily due to periodic monitoring and repeated acquisitions from multiple angles. Traditional redundancy removal methods based on manual screening or single-feature matching are often inefficient and lack adaptability. In this paper, we propose a two-stage redundancy removal paradigm for power inspection imagery, which integrates abstract semantic priors with fine-grained perceptual details. The first stage combines an improved discrete cosine transform hash (DCT Hash) with the multi-scale structural similarity index (MS-SSIM) to efficiently filter redundant candidates. In the second stage, a Vision Transformer network enhanced with a hierarchical sparse attention mechanism precisely determines redundancy via cosine similarity between feature vectors. Experimental results demonstrate that the proposed method achieves an algorithm sensitivity of 0.9243, surpassing ResNet and VGG by 5.86 and 8.10 percentage points, respectively, highlighting its robustness and effectiveness in large-scale power grid redundancy detection. These results underscore the paradigm’s capability to balance efficiency and precision in complex visual inspection scenarios.
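The two-stage idea in this abstract (a cheap hash filter followed by a cosine-similarity check on feature vectors) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' method: it uses a plain DCT hash rather than their improved variant, omits the MS-SSIM check, and takes precomputed feature vectors as inputs instead of running a Vision Transformer. The function names and the thresholds `max_hamming` and `min_cosine` are hypothetical.

```python
import numpy as np


def dct_matrix(n: int) -> np.ndarray:
    """Orthonormal DCT-II basis matrix of size n x n."""
    k = np.arange(n)[:, None]
    i = np.arange(n)[None, :]
    m = np.cos(np.pi * (2 * i + 1) * k / (2 * n)) * np.sqrt(2.0 / n)
    m[0, :] /= np.sqrt(2.0)  # rescale the DC row so the matrix is orthonormal
    return m


def dct_hash(img: np.ndarray, hash_size: int = 8) -> np.ndarray:
    """Boolean hash from the low-frequency DCT block of a square grayscale image.

    Assumes `img` is already resized to a small square (e.g. 32 x 32).
    """
    d = dct_matrix(img.shape[0])
    coeffs = d @ img.astype(np.float64) @ d.T
    low = coeffs[:hash_size, :hash_size].flatten()
    # Threshold against the median, excluding the DC term from the median
    return low > np.median(low[1:])


def hamming(a: np.ndarray, b: np.ndarray) -> int:
    """Number of differing hash bits."""
    return int(np.count_nonzero(a != b))


def cosine_similarity(u: np.ndarray, v: np.ndarray) -> float:
    """Cosine of the angle between two non-zero feature vectors."""
    return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))


def is_redundant(img_a, img_b, feat_a, feat_b,
                 max_hamming: int = 10, min_cosine: float = 0.9) -> bool:
    """Hypothetical two-stage check: hash filter first, then feature similarity."""
    # Stage 1: discard pairs whose hashes differ too much (cheap)
    if hamming(dct_hash(img_a), dct_hash(img_b)) > max_hamming:
        return False
    # Stage 2: confirm with cosine similarity between feature vectors
    return cosine_similarity(np.asarray(feat_a, dtype=float),
                             np.asarray(feat_b, dtype=float)) >= min_cosine
```

The ordering reflects the efficiency argument in the abstract: the hash comparison is cheap enough to run on every candidate pair, so the more expensive feature comparison only runs on pairs that survive the first stage.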

(This article belongs to the Special Issue Recent Progress in Visual AI: Architectures, Learning, and Applications)
