MDPI - Publisher of Open Access Journals

23 pages, 9873 KB

Open Access Article

RNNet-MST: A ResNet-50 with Multi-Scale Transformer Blocks for Pulmonary Nodule Classification and Attention-Based Localization on Chest X-Ray Images

by Edrill F. Bilan, Emman T. Manduriaga, Hernando S. Salapare, Ymir M. Garcia, Khatalyn E. Mata, Rose Anna R. Banal, Imelda C. Ang, Wei-Ta Chu and Dan Michael A. Cortez

Diagnostics 2026, 16(10), 1574; https://doi.org/10.3390/diagnostics16101574 - 21 May 2026

Abstract

Background/Objectives: Lung cancer survival depends on early detection; however, in the Philippines, high radiologist workloads and the anatomical complexity of chest X-rays (CXRs) contribute to missed pulmonary nodules and false-negative diagnoses. This study aims to develop an enhanced deep learning model to improve nodule classification and localization sensitivity. Methods: We propose RNNet-MST, an extension of ResNet-50 that incorporates Multi-Scale Transformer blocks for global context modeling and a custom spatial attention mechanism for attention-based weak localization of disease-relevant regions. The model was trained and evaluated on the NODE21 chest X-ray dataset and compared with a baseline ResNet-50 using classification metrics, with attention maps used for weak localization analysis. Results: RNNet-MST demonstrated consistent improvements over the baseline ResNet-50 across evaluated metrics. Mean Nodule Recall improved from 88.02 ± 1.92% to 91.55 ± 1.41%, reducing false negatives. Mean Test Precision reached 90.46 ± 0.99%, and mean Nodule F1-Score improved to 90.99 ± 0.39%. On the isolated small-nodule subset, RNNet-MST achieved a 12.3% improvement in sensitivity over the baseline. Conclusions: The integration of multi-scale transformer features improved classification sensitivity, while the attention mechanism provided weak localization cues that aligned more closely with annotated nodule regions than the baseline. RNNet-MST shows potential as a diagnostic support tool, warranting further validation on larger and more diverse clinical datasets to reduce perceptual errors and facilitate early lung cancer detection in resource-constrained settings.

(This article belongs to the Section Machine Learning and Artificial Intelligence in Diagnostics)

33 pages, 3974 KB

Open Access Article

ARTEMIS: An Explainable AI Framework for Multi-Class COVID-19 Diagnosis with a Newly Curated Dataset

by Muhammet Emin Sahin, Hasan Ulutas, Mustafa Fatih Erkoc, Baris Karakaya, Recep Batuhan Günay and Enes Eren Suzgen

Bioengineering 2026, 13(5), 588; https://doi.org/10.3390/bioengineering13050588 - 20 May 2026

Abstract

In this work, we propose ARTEMIS, a novel and highly interpretable deep learning pipeline for the automatic classification of Chest X-ray (CXR) and Computed Tomography (CT) images into three clinically important categories: COVID-19 infection, Community-Acquired Pneumonia (CAP), and Normal. Unlike existing models that rely on a static feature-enhancement step, ARTEMIS introduces a learnable preprocessing component that adapts image contrast and sharpness during training, allowing the enhancement to be optimized adaptively. Our hybrid network combines an EfficientNet-B0 backbone with built-in squeeze-and-excitation (SE) attention and an optional lightweight Transformer encoder block to jointly learn local radiological features and global relationships between pixels. Comprehensive experiments were conducted on five datasets spanning X-ray and CT modalities: four publicly available datasets and one novel CT dataset annotated by radiologists. Experimental results show strong robustness and generalization, with macro F1-scores greater than 96% on the public datasets and 99.39% accuracy on our new CT dataset. To interpret the decision-making process, Grad-CAM++ is employed to generate class-discriminative saliency maps; the highlighted regions are systematically validated against established radiological criteria by a board-certified radiologist, confirming that model decisions are grounded in clinically meaningful pulmonary findings rather than imaging artifacts.

(This article belongs to the Special Issue Explainable Artificial Intelligence (XAI) in Medical Imaging)

25 pages, 1880 KB

Open Access Article

A Dual-Branch Deep Learning Framework with Explainability for Dental Caries Classification Using Intra-Oral Photographs and Radiographs

by Lijuan Ren and Jinjing Chen

J. Imaging 2026, 12(5), 207; https://doi.org/10.3390/jimaging12050207 - 12 May 2026

Viewed by 153

Abstract

The accurate detection of dental caries is often hindered by modality-specific imaging challenges, such as illumination artifacts in intra-oral photographs and low lesion contrast in radiographs. This study proposes a comprehensive framework comprising three key components: (1) HybridAugment+, an entropy-guided adaptive augmentation strategy that applies stronger transformations to low-information images; (2) DBAttNet, a dual-branch attention network featuring illumination–reflection aware attention (IRAA) for photographs and contrast–frequency-aware attention (CFA) for radiographs; and (3) a CAM-based explainability method, selected through a systematic evaluation of five advanced techniques. This study utilized two datasets derived from public sources, comprising 639 intra-oral photographs (481 caries, 158 healthy) and 456 radiographs (268 caries, 188 healthy). These were annotated by two dentists, with established inter-rater reliability (κ = 0.82 for photographs, κ = 0.79 for radiographs). The experimental results demonstrate that HybridAugment+ improved performance over conventional augmentation by up to 8.72% on photographs and 7.67% on radiographs. Furthermore, DBAttNet achieved F1-scores of 97.90% on photographs and 95.72% on radiographs, outperforming ResNet50, InceptionV3, MSDNet, DCANet, and ARM-Net. A comparative evaluation identified XGrad-CAM as the most suitable explainability method, with optimal visualization thresholds of 30% for photographs and 20% for radiographs. Generalization experiments on ophthalmology (APTOS 2019, Messidor-2) and chest radiography datasets (Kermany CXR, NIH ChestX-ray14) demonstrated consistent performance gains over domain-specific methods (DT-Net, ConvNeXt-Tiny). These results confirm that the core design principles effectively transfer to other modalities facing analogous imaging challenges.

(This article belongs to the Special Issue Artificial Intelligence for Medical Imaging and Applications)

19 pages, 378 KB

Open Access Article

Mislabel Detection in Multi-Label Chest X-Rays via Prototype-Weighted Neighborhood Consistency in CoAtNet Embedding Space

by Ariel Gamboa, Mauricio Araya and Camilo Sotomayor

Appl. Sci. 2026, 16(9), 4067; https://doi.org/10.3390/app16094067 - 22 Apr 2026

Viewed by 228

Abstract

Large-scale chest X-ray (CXR) datasets often rely on report-derived or weak labels, introducing missing and incorrect annotations that can degrade downstream models and limit trust. We study training-free mislabel detection in multi-label CXRs by scoring neighborhood label consistency in a fixed embedding space. Using the NIH Chest X-ray Kaggle sample (5606 CXRs), we extract intermediate CoAtNet features and obtain 64-dimensional embeddings with a frozen CoAtNet backbone and a lightweight refinement head. On top of these embeddings, we compare kNN consistency baselines with distance weighting and label-set similarity against LPV-DW-CS, clustered prototype voting weighted by distance and cluster support. We evaluate three synthetic label-noise regimes with review budgets matched to the corruption rate: random single-label (5% and 20%), boundary-noise (20% corruption within the lowest-density 20% subset), and disjoint-label replacement (20% within that subset). LPV-DW-CS yields the highest downstream macro-AUROC after filtering top-ranked samples (up to 0.8860), while kNN variants achieve higher Recall@budget at the same review rates (up to 99.44%). An image-only expert Likert review of top-ranked real samples finds substantial label-set inconsistencies (54.1% for LPV-DW-CS-280-A; 60.5% for KNN-DW-LSS), supporting neighborhood-consistency ranking as a practical, training-free tool for targeted dataset auditing.
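The kNN consistency baseline described in this abstract can be illustrated with a small sketch. It scores each sample by how much its own label set disagrees with the label sets of its nearest neighbours in embedding space, then ranks samples for review under a budget. The function names, the inverse-distance weighting, and the Jaccard label-set similarity are illustrative assumptions rather than the authors' exact formulation, and the prototype-voting method (LPV-DW-CS) is not reproduced here.

```python
import math


def jaccard(a, b):
    """Label-set similarity; two empty label sets count as identical."""
    a, b = set(a), set(b)
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def inconsistency_scores(embeddings, labels, k=3, eps=1e-8):
    """Score each sample in [0, 1]; higher means its labels disagree
    more with the labels of its k nearest neighbours (assumed scoring rule)."""
    scores = []
    for i, (xi, yi) in enumerate(zip(embeddings, labels)):
        # k nearest neighbours by Euclidean distance, excluding the sample itself
        neighbours = sorted(
            (math.dist(xi, xj), j) for j, xj in enumerate(embeddings) if j != i
        )[:k]
        # closer neighbours get a larger vote (inverse-distance weighting)
        weights = [1.0 / (d + eps) for d, _ in neighbours]
        agreement = sum(
            w * jaccard(yi, labels[j]) for w, (_, j) in zip(weights, neighbours)
        )
        scores.append(1.0 - agreement / sum(weights))
    return scores


def review_queue(scores, budget):
    """Indices of the top-scoring fraction `budget` of samples, most suspicious first."""
    n = max(1, round(budget * len(scores)))
    return sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)[:n]
```

Because nothing is trained, the ranking depends entirely on the quality of the fixed embeddings, which is consistent with the abstract's framing of the approach as an auditing aid rather than an automatic relabelling tool.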

(This article belongs to the Special Issue Computer-Vision-Based Biomedical Image Processing)

29 pages, 3941 KB

Open Access Article

Explainable Deep Learning for Thoracic Radiographic Diagnosis: A COVID-19 Case Study Toward Clinically Meaningful Evaluation

by Divine Nicholas-Omoregbe, Olamilekan Shobayo, Obinna Okoyeigbo, Mansi Khurana and Reza Saatchi

Electronics 2026, 15(7), 1443; https://doi.org/10.3390/electronics15071443 - 30 Mar 2026

Viewed by 492

Abstract

COVID-19 still poses a global public health challenge, exerting pressure on radiology services. Chest X-ray (CXR) imaging is widely used for respiratory assessment due to its accessibility and cost-effectiveness. However, its interpretation is often challenging because of subtle radiographic features and inter-observer variability. Although recent deep learning (DL) approaches have shown strong performance in automated CXR classification, their black-box nature limits interpretability. This study proposes an explainable deep learning framework for COVID-19 detection from chest X-ray images. The framework incorporates anatomically guided preprocessing, including lung-region isolation, contrast-limited adaptive histogram equalization (CLAHE), bone suppression, and feature enhancement. A novel four-channel input representation was constructed by combining lung-isolated soft-tissue images with frequency-domain opacity maps, vessel enhancement maps, and texture-based features. Classification was performed using a modified Xception-based convolutional neural network, while Gradient-weighted Class Activation Mapping (Grad-CAM) was employed to provide visual explanations and enhance interpretability. The framework was evaluated on the publicly available COVID-19 Radiography Database, achieving an accuracy of 95.3%, an AUC of 0.983, and a Matthews Correlation Coefficient of approximately 0.83. Threshold optimisation improved sensitivity, reducing missed COVID-19 cases while maintaining high overall performance. Explainability analysis showed that model attention was primarily focused on clinically relevant lung regions.
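The interplay between the decision threshold, sensitivity, and the Matthews Correlation Coefficient mentioned in this abstract can be made concrete with a short sketch. It uses the standard MCC definition from confusion-matrix counts; the helper names and the toy probabilities are illustrative assumptions and are unrelated to the study's data or its threshold-selection procedure.

```python
import math


def confusion(y_true, y_prob, threshold):
    """Count (tp, fp, fn, tn) for binary labels at a given probability threshold."""
    tp = fp = fn = tn = 0
    for y, p in zip(y_true, y_prob):
        predicted_positive = p >= threshold
        if predicted_positive and y == 1:
            tp += 1
        elif predicted_positive:
            fp += 1
        elif y == 1:
            fn += 1
        else:
            tn += 1
    return tp, fp, fn, tn


def sensitivity(tp, fp, fn, tn):
    """Recall on the positive class: tp / (tp + fn)."""
    return tp / (tp + fn) if (tp + fn) else 0.0


def mcc(tp, fp, fn, tn):
    """Matthews Correlation Coefficient; defined as 0 when any marginal is empty."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return 0.0 if denom == 0 else (tp * tn - fp * fn) / denom


# Toy example (made-up numbers): lowering the threshold trades a
# false positive for a recovered positive case.
y_true = [1, 1, 1, 0, 0, 0]
y_prob = [0.9, 0.6, 0.4, 0.45, 0.2, 0.1]
for t in (0.5, 0.35):
    counts = confusion(y_true, y_prob, t)
    print(t, counts, round(sensitivity(*counts), 3), round(mcc(*counts), 3))
```

In this toy case the lower threshold raises sensitivity while MCC stays the same, which is the kind of trade-off a threshold sweep is meant to expose.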

(This article belongs to the Special Issue Image Processing Based on Convolution Neural Network: 2nd Edition)

13 pages, 1350 KB

Open Access Article

Imaging Pathways in Pediatric Thoracic Trauma: FAST-First Triage and Selective CT Escalation in Clinical Practice

by Emil Radu Iacob, Emil Robert Stoicescu, Valentina Adriana Marcu, Roxana Stoicescu, Vlad Predescu, Narcis Flavius Tepeneu, Maria Corina Stanciulescu, Mihai Cristian Neagu, Adrian Georgescu and Calin Marius Popoiu

Diagnostics 2026, 16(6), 889; https://doi.org/10.3390/diagnostics16060889 - 17 Mar 2026

Viewed by 432

Abstract

Background/Objectives: Pediatric thoracic trauma requires prompt stabilization and timely imaging; however, actual sequencing and escalation triggers are infrequently delineated at the pathway level. The aim of this study was to analyze imaging pathways observed in routine clinical practice at our institution and to outline a preliminary escalation framework integrating injury mechanism, clinical severity, and initial ultrasound findings. Methods: A retrospective cohort study was conducted at the “Louis Țurcanu” Clinical Emergency Hospital for Children, Timișoara, Romania, including 66 children admitted with primary thoracic trauma between January 2022 and December 2024. Clinical trajectory markers (transfer-in, ICU admission, length of stay) and imaging utilization/sequencing (FAST, CXR, CT, MRI/CTA) were extracted. Injuries were divided into two groups: bony (such as fractures of the clavicle or scapula) and non-bony. CT escalation was defined as a chest CT performed upon admission. Fisher’s exact and Mann–Whitney U tests were used for comparative analyses. Results: FAST was performed in all patients but was infrequently positive. Imaging followed heterogeneous but structured patterns, most commonly FAST with CXR, with or without CT. A substantial subgroup underwent CT without a prior chest X-ray. CT escalation was associated with fracture-pattern injuries and higher-acuity trajectories (transfer-in and ICU admission), as well as prolonged hospital stays. Pathway-level assessment demonstrated that CT escalation effectively captured bony injury patterns, whereas FAST proficiently sorted ICU-level trajectories. Conclusions: Pediatric thoracic trauma imaging functioned as a selective escalation system: FAST served as a universal bedside entry step, and CT operated as an injury pattern- and acuity-linked severity gate. Making this escalation logic explicit may support standardization while limiting radiation exposure.

(This article belongs to the Special Issue Recent Developments and Future Trends in Thoracic Imaging)

13 pages, 10127 KB

Open Access Article

Fine-Tuned Segment Anything Model with Low-Rank Adaptation for Chest X-Ray Images

by Saeed S. Alahmari, Michael R. Gardner, Fawaz Alqahtani and Tawfiq Salem

Diagnostics 2026, 16(6), 847; https://doi.org/10.3390/diagnostics16060847 - 12 Mar 2026

Viewed by 1006

Abstract

Background: This paper investigates the use of the Segment Anything Model (SAM) for chest X-ray (CXR) image segmentation, with a focus on improving its performance using low-rank adaptation (LoRA). Methods: We evaluate three versions of SAM: two zero-shot methods (using coordinate and bounding box prompts) and a fine-tuned SAM using LoRA. To support these approaches, we also trained two standard convolutional neural networks (CNNs), U-Net and DeepLabv3+, to generate draft lung segmentations that serve as input prompts for the SAM methods. Our fine-tuning approach uses LoRA to add lightweight trainable adapters within the Transformer blocks of the SAM, allowing only a small subset of parameters to be updated. The rest of the SAM remains frozen, helping preserve its pre-trained knowledge while reducing memory and computational needs. We tested all models on a dataset of CXR images labeled for COVID-19, viral pneumonia, and normal cases. Results: The fine-tuned SAM with LoRA outperformed the zero-shot SAM methods and the CNN baselines in terms of segmentation accuracy and efficiency. Conclusions: These results demonstrate the potential of combining LoRA with SAM for practical and effective medical image segmentation.
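The adapter idea summarized in this abstract can be sketched in a few lines. In the standard LoRA formulation a frozen weight matrix W is augmented with a trainable low-rank product, so the layer computes W x + (alpha / r) · B A x, with B initialized to zero so that training starts from the pre-trained behaviour. The class below is a minimal pure-Python illustration of that formulation; the class name, initialization scale, and dimensions are assumptions, and the abstract does not state which projections inside SAM's Transformer blocks were adapted.

```python
import random


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(a * b for a, b in zip(row, vector)) for row in matrix]


class LoRALinear:
    """Frozen linear layer plus a trainable rank-r update (illustrative sketch)."""

    def __init__(self, weight, rank, alpha=1.0, seed=0):
        rng = random.Random(seed)
        d_out, d_in = len(weight), len(weight[0])
        self.weight = weight          # pre-trained weights, kept frozen
        self.scale = alpha / rank     # standard LoRA scaling factor
        # A: small random init; B: zeros, so the update starts at exactly zero
        self.A = [[rng.gauss(0.0, 0.01) for _ in range(d_in)] for _ in range(rank)]
        self.B = [[0.0] * rank for _ in range(d_out)]

    def forward(self, x):
        base = matvec(self.weight, x)
        update = matvec(self.B, matvec(self.A, x))
        return [b + self.scale * u for b, u in zip(base, update)]

    def trainable_parameters(self):
        """Only A and B would receive gradient updates."""
        return len(self.A) * len(self.A[0]) + len(self.B) * len(self.B[0])
```

For a d_out × d_in layer the adapter adds r · (d_in + d_out) trainable parameters instead of d_out · d_in, which is where the memory savings described in the abstract come from when r is small.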

(This article belongs to the Special Issue Artificial Intelligence in Biomedical Image Analysis 2026)

25 pages, 1678 KB

Open Access Review

Artificial Intelligence for Pulmonary Abnormality Detection in Chest X-Ray Imaging: A Detailed Review of Methods, Datasets and Future Directions

by G. Parra-Cabrera, J. J. Jiménez-Delgado and F. D. Pérez-Cano

Technologies 2026, 14(3), 147; https://doi.org/10.3390/technologies14030147 - 28 Feb 2026

Viewed by 1264

Abstract

Chest X-ray (CXR) imaging remains the most widely used radiological modality for assessing pulmonary and cardiothoracic disease, yet its interpretation is inherently constrained by tissue superposition, subtle radiographic findings and marked inter-observer variability. Recent advances in artificial intelligence (AI) have driven significant progress in automated CXR analysis, supported by large public datasets, evolving annotation strategies and increasingly expressive deep learning architectures. This review presents a comprehensive synthesis of approaches for pulmonary abnormality detection, encompassing convolutional neural networks, transformers, multimodal and vision–language models and self-supervised representation learning. We critically discuss their strengths, limitations and vulnerability to label noise, domain shift and shortcut learning. In parallel, we examine dataset properties, annotation practices, robustness challenges, explainability methods and the heterogeneity of evaluation protocols that hinder fair comparison and clinical translation. Building on these observations, the review identifies key future directions, including foundation models, multimodal integration, federated and domain-generalized training, longitudinal modeling, synthetic data generation and standardized clinical evaluation frameworks. By integrating methodological and clinical perspectives, this work offers an up-to-date reference for researchers and clinicians and outlines a roadmap toward reliable, interpretable and clinically deployable AI systems for chest radiography.

(This article belongs to the Section Information and Communication Technologies)

29 pages, 5858 KB

Open Access Article

MRID: Modeling Radiological Image Differences for Disease Progression Reasoning via Multi-Task Self-Supervision

by Yongtao Hao, Pandong Wang, Yanming Chen and Haifeng Zhao

Electronics 2026, 15(5), 997; https://doi.org/10.3390/electronics15050997 - 27 Feb 2026

Viewed by 417

Abstract

Automated radiology report generation has become a prominent research topic in medical multimodal learning. However, most existing approaches primarily focus on single-image interpretation and rarely address the task of tracking disease progression across longitudinal chest X-rays. This task presents two major challenges: accurately localizing pathological changes between temporally paired images, and effectively translating visual difference representations into clinically meaningful textual descriptions. To address these challenges, we propose MRID (Modeling Radiological Image Differences for Disease Progression Reasoning), a multi-task self-supervised framework that follows a pretraining–finetuning paradigm. MRID leverages multiple complementary self-supervised objectives to jointly achieve (1) intra-modal spatial alignment of organs and pathological regions across image pairs, and (2) cross-modal semantic alignment between visual difference representations and radiology report embeddings. Furthermore, we introduce a simple yet effective data augmentation strategy to alleviate the imbalance of disease progression categories. Extensive experiments conducted on the Longitudinal-MIMIC and MS-CXR-T datasets demonstrate that MRID effectively captures fine-grained disease progression patterns. In addition, the proposed framework achieves competitive performance on single-image radiology report generation, further highlighting its strong capability in modeling chest X-ray semantics.

(This article belongs to the Special Issue AI-Driven Medical Image/Video Processing)

19 pages, 10604 KB

Open Access Article

GAN-Based Low-Dose Chest X-Ray Super-Resolution with Hybrid Channel-Spatial Attention and Pooling Layer Removal

by Wenjia Li, Yafeng Yao, Di Gao and Ying Yi

Appl. Sci. 2026, 16(4), 1797; https://doi.org/10.3390/app16041797 - 11 Feb 2026

Viewed by 352

Abstract

Chest X-ray (CXR) imaging is one of the most widely used techniques for screening and diagnosing pulmonary diseases. However, discerning subtle structural changes, such as small nodules, disordered pulmonary textures, tiny cavities, pleural thickening, or spiculation, is difficult using low-resolution images. Acquiring high-resolution CXRs typically requires higher radiation doses, posing a risk to patients. We propose a chest X-ray image super-resolution algorithm based on generative adversarial networks (GANs). Through adversarial training, our approach generates high-resolution CXRs with enhanced details and improved realism. We further incorporate a hybrid channel-spatial attention (CSA) module into the network, strengthening its ability to capture fine structures and improve texture fidelity. Moreover, we remove the pooling layer from the channel attention module to overcome limitations in super-resolution, thereby preserving spatial information more effectively. Experiments demonstrate our method’s superior performance and robustness, achieving a PSNR of 37.91 and SSIM of 0.9108 on the internal test set while consistently outperforming other methods on previously unseen external clinical datasets. After adversarial training, the method attains optimal visual performance, with LPIPS reduced to 0.0915, and the visual effect improved by 36.4% compared to low-resolution images. Ablation studies further verify the contribution of the proposed method to enhancing super-resolution capability. Overall, results indicate that the proposed method can obtain high-quality chest X-ray images from simulated low-quality inputs.
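For readers less familiar with the fidelity metric quoted in this abstract, the sketch below shows the standard PSNR definition, 10 · log10(MAX² / MSE), computed between a reference image and a reconstruction. The function name, the nested-list image representation, and the default peak value of 255 (8-bit images) are illustrative assumptions; the abstract does not state the bit depth or evaluation code used in the study.

```python
import math


def psnr(reference, reconstructed, max_value=255.0):
    """Peak signal-to-noise ratio in dB between two equally sized grayscale images.

    Images are given as lists of rows of pixel intensities. Returns math.inf
    when the images are identical (zero mean squared error).
    """
    ref = [p for row in reference for p in row]
    rec = [p for row in reconstructed for p in row]
    if len(ref) != len(rec):
        raise ValueError("images must have the same number of pixels")
    mse = sum((a - b) ** 2 for a, b in zip(ref, rec)) / len(ref)
    if mse == 0:
        return math.inf
    return 10.0 * math.log10(max_value ** 2 / mse)
```

PSNR rewards pixel-wise agreement only, which is why perceptual measures such as LPIPS are usually reported alongside it for GAN-based super-resolution, as this abstract does.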

(This article belongs to the Special Issue Application of Machine Vision in Biomechanical Engineering)

17 pages, 7884 KB

Open Access Article

Limitations in Chest X-Ray Interpretation by Vision-Capable Large Language Models, Gemini 1.0, Gemini 1.5 Pro, GPT-4 Turbo, and GPT-4o

by Chih-Hsiung Chen, Chang-Wei Chen, Kuang-Yu Hsieh, Kuo-En Huang and Hsien-Yung Lai

Diagnostics 2026, 16(3), 376; https://doi.org/10.3390/diagnostics16030376 - 23 Jan 2026

Viewed by 1071

Abstract

Background/Objectives: Interpretation of chest X-rays (CXRs) requires accurate identification of lesion presence, diagnosis, location, size, and number to be considered complete. However, the effectiveness of large language models with vision capabilities (LLMs) in performing these tasks remains uncertain. This study aimed to evaluate the image-only interpretation performance of LLMs in the absence of clinical information. Methods: A total of 247 CXRs covering 13 diagnostic categories, including pulmonary edema, cardiomegaly, lobar pneumonia, and other conditions, were evaluated using Gemini 1.0, Gemini 1.5 Pro, GPT-4 Turbo, and GPT-4o. The text outputs generated by the LLMs were evaluated at two levels: (1) primary diagnosis accuracy across the 13 predefined diagnostic categories, and (2) identification of key imaging features described in the generated text. Primary diagnosis accuracy was assessed based on whether the model correctly identified the target diagnostic category and was classified as fully correct, partially correct, or incorrect according to predefined clinical criteria. Non-diagnostic imaging features, such as posteroanterior and anteroposterior (PA/AP) views, side markers, foreign bodies, and devices, were recorded and analyzed separately rather than being incorporated into the primary diagnostic scoring. Results: When fully and partially correct responses were treated as successful detections, the LLMs showed higher sensitivity for large, bilateral, multiple lesions and prominent devices, including acute pulmonary edema, lobar pneumonia, multiple malignancies, massive pleural effusions, and pacemakers, all of which demonstrated statistically significant differences across categories in chi-square analyses. Feature descriptions varied among models, especially in PA/AP views and side markers, though central lines were partially recognized. Across the entire dataset, Gemini 1.5 Pro achieved the highest overall detection rate, followed by Gemini 1.0, GPT-4o, and GPT-4 Turbo. Conclusions: Although LLMs were able to identify certain diagnoses and key imaging features, their limitations in detecting small lesions, recognizing laterality, reasoning through differential diagnoses, and using domain-specific expressions indicate that CXR interpretation without textual cues still requires further improvement.

(This article belongs to the Section Machine Learning and Artificial Intelligence in Diagnostics)

38 pages, 16831 KB

Open Access Article

Hybrid ConvNeXtV2–ViT Architecture with Ontology-Driven Explainability and Out-of-Distribution Awareness for Transparent Chest X-Ray Diagnosis

by Naif Almughamisi, Gibrael Abosamra, Adnan Albar and Mostafa Saleh

Diagnostics 2026, 16(2), 294; https://doi.org/10.3390/diagnostics16020294 - 16 Jan 2026

Cited by 1 | Viewed by 900

Abstract

Background: Chest X-ray (CXR) is widely used for the assessment of thoracic diseases, yet automated multi-label interpretation remains challenging due to subtle visual patterns, overlapping anatomical structures, and frequent co-occurrence of abnormalities. While recent deep learning models have shown strong performance, limitations in interpretability, anatomical awareness, and robustness continue to hinder their clinical adoption. Methods: The proposed framework employs a hybrid ConvNeXtV2–Vision Transformer (ViT) architecture that combines convolutional feature extraction for capturing fine-grained local patterns with transformer-based global reasoning to model long-range contextual dependencies. The model is trained exclusively using image-level annotations. In addition to classification, three complementary post hoc components are integrated to enhance model trust and interpretability. A segmentation-aware Gradient-weighted Class Activation Mapping (Grad-CAM) module leverages CheXmask lung and heart segmentations to highlight anatomically relevant regions and quantify predictive evidence inside and outside the lungs. An ontology-driven neuro-symbolic reasoning layer translates Grad-CAM activations into structured, rule-based explanations aligned with clinical concepts such as “basal effusion” and “enlarged cardiac silhouette”. Furthermore, a lightweight out-of-distribution (OOD) detection module based on confidence scores, energy scores, and Mahalanobis distance scores is employed to identify inputs that deviate from the training distribution. Results: On the VinBigData test set, the model achieved a macro-AUROC of 0.9525 and a micro-AUROC of 0.9777 when trained solely with image-level annotations. External evaluation further demonstrated strong generalisation, yielding macro-AUROC scores of 0.9106 on NIH ChestX-ray14 and 0.8487 on CheXpert (frontal views). Both Grad-CAM visualisations and ontology-based reasoning remained coherent on unseen data, while the OOD module successfully flagged non-thoracic images. Conclusions: Overall, the proposed approach demonstrates that hybrid convolutional neural network (CNN)–vision transformer (ViT) architectures, combined with anatomy-aware explainability and symbolic reasoning, can support automated chest X-ray diagnosis in a manner that is accurate, transparent, and safety-aware.
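The three out-of-distribution signals named in this abstract (confidence, energy, and Mahalanobis distance) can be sketched as follows. The confidence and energy functions use the common softmax-style definitions over a single logit vector, and the Mahalanobis function assumes a diagonal covariance to keep the example dependency-free. These are simplifying assumptions for illustration: the paper's multi-label setting, feature layer, covariance estimate, and score fusion are not described in the abstract and may differ.

```python
import math


def logsumexp(values):
    """Numerically stable log(sum(exp(v)))."""
    m = max(values)
    return m + math.log(sum(math.exp(v - m) for v in values))


def max_softmax_confidence(logits):
    """Largest softmax probability; low values suggest an unfamiliar input."""
    return math.exp(max(logits) - logsumexp(logits))


def energy_score(logits, temperature=1.0):
    """Energy E(x) = -T * logsumexp(logits / T); higher energy suggests OOD."""
    return -temperature * logsumexp([v / temperature for v in logits])


def fit_diagonal_gaussian(features, eps=1e-6):
    """Per-dimension mean and variance of training features (diagonal covariance assumed)."""
    n, dim = len(features), len(features[0])
    mean = [sum(f[d] for f in features) / n for d in range(dim)]
    var = [sum((f[d] - mean[d]) ** 2 for f in features) / n + eps for d in range(dim)]
    return mean, var


def mahalanobis_distance(x, mean, var):
    """Distance from x to the training feature distribution; larger suggests OOD."""
    return math.sqrt(sum((xi - m) ** 2 / v for xi, m, v in zip(x, mean, var)))
```

In practice each score would be thresholded on held-out in-distribution data, with inputs exceeding the threshold flagged for review rather than classified.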

(This article belongs to the Section Machine Learning and Artificial Intelligence in Diagnostics)

16 pages, 1318 KB

Open Access Article

A Retrospective Observational Study of Pulmonary Impairments in Long COVID Patients

by Lanre Peter Daodu, Yogini Raste, Judith E. Allgrove, Francesca I. F. Arrigoni and Reem Kayyali

Biomedicines 2026, 14(1), 145; https://doi.org/10.3390/biomedicines14010145 - 10 Jan 2026

Viewed by 952

Abstract

Background/Objective: Pulmonary impairments have been identified as some of the most complex and debilitating post-acute sequelae of SARS-CoV-2 infection (PASC), also known as long COVID. This study identified and characterised the specific forms of pulmonary impairment detected using pulmonary function tests (PFT), chest X-rays (CXR), and computed tomography (CT) scans in patients with long COVID symptoms. Methods: We conducted a single-centre retrospective study of 60 patients with long COVID who underwent PFT, CXR, and CT scans. Pulmonary function was assessed using defined thresholds for key test parameters, enabling categorisation into normal, restrictive, obstructive, and mixed lung-function patterns. We calculated the proportions of patients falling below the defined thresholds with exact binomial (Clopper–Pearson) 95% confidence intervals. We also assessed the relationships among spirometric indices, lung volumes, and diffusion capacity (DLCO) using scatter plots and corresponding linear regressions. The findings from the CXRs and CT scans were categorised, and their prevalence was calculated. Results: A total of 60 patients with long COVID symptoms (mean age 60 ± 13 years; 57% female) were evaluated. The cohort was ethnically diverse and predominantly non-smokers, with a mean BMI of 32.4 ± 6.3 kg/m². PFT revealed that most patients had preserved spirometry, with mean Forced Expiratory Volume in 1 Second (FEV1) and Forced Vital Capacity (FVC) above 90% predicted. However, a substantial proportion exhibited reduced lung volumes and gas transfer: total lung capacity (TLC) was reduced in 35% of patients and diffusion capacity (DLCO/TLCO) in 75%. Lung function pattern analysis showed that 88% of patients had normal function, while 12% displayed a restrictive pattern; no obstructive or mixed patterns were observed. Radiographic assessment revealed that 58% of chest X-rays were normal, whereas CT scans showed ground-glass opacities (GGO) in 65% of patients and fibrotic changes in 55%, along with findings such as atelectasis, air trapping, and bronchial wall thickening. Conclusions: Spirometry alone is insufficient to detect impairment of gas exchange or underlying histopathological changes in patients with long COVID. Our findings show that, despite normal spirometry results, many patients exhibit significant diffusion impairment, fibrotic alterations, and ground-glass opacities, indicating persistent lung and microvascular damage. These results underscore the importance of comprehensive assessment using multiple diagnostic tools to identify and manage chronic pulmonary dysfunction in long COVID.
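The exact binomial (Clopper–Pearson) interval named in the Methods can be illustrated with a minimal sketch. This is not the authors' code: the function names are hypothetical, and the counts used in the example (45 of 60 and 21 of 60) are assumptions back-calculated from the reported 75% and 35%, since the abstract gives only percentages. The interval bounds are found by bisection on the binomial tail probabilities, using only the Python standard library.

```python
from math import comb


def binom_cdf(k: int, n: int, p: float) -> float:
    """Return P(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))


def _solve_increasing(f, target: float, iters: int = 60) -> float:
    """Bisection on [0, 1] for a function f that increases with p."""
    lo, hi = 0.0, 1.0
    for _ in range(iters):
        mid = (lo + hi) / 2
        if f(mid) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def clopper_pearson(k: int, n: int, alpha: float = 0.05) -> tuple[float, float]:
    """Exact two-sided (1 - alpha) confidence interval for a proportion k/n."""
    if n <= 0 or not 0 <= k <= n:
        raise ValueError("require n > 0 and 0 <= k <= n")
    # Lower bound: the p at which P(X >= k) equals alpha / 2.
    lower = 0.0 if k == 0 else _solve_increasing(
        lambda p: 1 - binom_cdf(k - 1, n, p), alpha / 2
    )
    # Upper bound: the p at which P(X <= k) equals alpha / 2.
    upper = 1.0 if k == n else _solve_increasing(
        lambda p: 1 - binom_cdf(k, n, p), 1 - alpha / 2
    )
    return lower, upper


if __name__ == "__main__":
    # Assumed counts (75% and 35% of 60); the abstract reports only percentages.
    for label, k in (("reduced DLCO", 45), ("reduced TLC", 21)):
        lo, hi = clopper_pearson(k, 60)
        print(f"{label}: {k}/60 = {k / 60:.0%}, 95% CI {lo:.3f} to {hi:.3f}")
```

With only 60 patients the exact interval is noticeably wider than the point estimate alone suggests, which is why the reported proportions are best read together with their intervals.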

(This article belongs to the Special Issue Latest Research in Post-COVID (Long COVID): Pathological and Treatment Studies of Sequelae and Complications—3rd Edition)

33 pages, 4219 KB

Open Access Review

Recent Progress in Deep Learning for Chest X-Ray Report Generation

by Mounir Salhi and Moulay A. Akhloufi

BioMedInformatics 2026, 6(1), 3; https://doi.org/10.3390/biomedinformatics6010003 - 9 Jan 2026

Viewed by 3777

Abstract

Chest X-ray radiology report generation is a challenging task that involves techniques from medical natural language processing and computer vision. This paper provides a comprehensive overview of recent progress. The annotation protocols, structure, linguistic characteristics, and size of the main public datasets are presented and compared, since understanding their properties is necessary for benchmarking and generalization. Model evaluation strategies, covering both clinically oriented and natural language generation metrics, are described, and their respective strengths and limitations are discussed in the context of radiology applications. Recent deep learning approaches for report generation and their different architectures are also reviewed, along with common trends such as instruction tuning and the integration of clinical knowledge. Recent works show that current models still have limited factual accuracy, with a score of 72% reported in expert evaluations, and poor performance on rare pathologies and lateral views. The most important challenges are limited dataset diversity, weak cross-institution generalization, and the lack of clinically validated benchmarks for evaluating factual reliability. Finally, we discuss open challenges related to data quality, clinical factuality, and interpretability. This review aims to support researchers by synthesizing the current literature and identifying key directions for developing more clinically reliable report generation systems.

28 pages, 3824 KB

Open Access Article

Comparison Between Early and Intermediate Fusion of Multimodal Techniques: Lung Disease Diagnosis

by Ahad Alloqmani and Yoosef B. Abushark

AI 2026, 7(1), 16; https://doi.org/10.3390/ai7010016 - 7 Jan 2026

Viewed by 1697

Abstract

Early and accurate diagnosis of lung diseases is essential for effective treatment and patient management. Conventional diagnostic models trained on a single data type often miss important clinical information. This study explored a multimodal deep learning framework that integrates cough sounds, chest radiographs (X-rays), and computed tomography (CT) scans to enhance disease classification performance. Two fusion strategies, early and intermediate fusion, were implemented and evaluated against three single-modality baselines. The datasets were collected from different sources, and each underwent preprocessing steps, including noise removal, grayscale conversion, image cropping, and class balancing, to ensure data quality. Convolutional neural network (CNN) and Extreme Inception (Xception) architectures were used for feature extraction and classification. The results show that multimodal learning outperforms the single-modality models. The intermediate fusion model achieved 98% accuracy, while the early fusion model reached 97%. In contrast, the single-modality CXR and CT models achieved 94%, and the cough sound model achieved 79%. These results indicate that multimodal integration, particularly intermediate fusion, offers a more reliable framework for automated lung disease diagnosis.

(This article belongs to the Section Medical & Healthcare AI)
