Journal of Imaging

22 pages, 28334 KB

Open Access Article

Prompt-Guided Semantic Latent Direction Learning in Diffusion Models for Abstract Visual Concept Manipulation

by Mahzaib Khalid, Fangli Ying, Al-Garadi Ahmed Mohammed Atef, Aniwat Phaphuangwittayakul and Riyad Dhuny

J. Imaging 2026, 12(7), 279; https://doi.org/10.3390/jimaging12070279 (registering DOI) - 25 Jun 2026

Abstract

Diffusion-based generative models achieve high-fidelity image synthesis; however, controlling internal representations for abstract visual concepts remains challenging due to the ambiguity of textual descriptions. In this work, we propose a prompt-guided concept-vector learning framework for the controllable manipulation of such concepts without requiring external human-annotated image pairs, segmentation masks, identity labels, or manually annotated editing targets. The method introduces a learnable concept vector optimized in the bottleneck (mid-block) feature space of a pretrained Stable Diffusion U-Net, while keeping all pretrained model parameters frozen. A multi-prompt data generation strategy based on paired positive and neutral prompts provides weak semantic guidance for capturing the target concept direction and reducing dependence on a single prompt formulation. The learned vector is further applied in an image-to-image setting through controlled noise injection and concept-guided denoising, enabling the semantic modification of real images while preserving structural content. The concept strength is controlled by a scaling parameter γ, while the image-to-image noise strength is controlled by β, allowing for a practical balance between semantic modification and structural fidelity. Experiments are conducted on two main abstract concepts, perfect skin and peaceful lake, with additional qualitative analysis on subjective portrait-level concepts. Quantitative evaluation using SSIM, LPIPS, and CLIP similarity demonstrates that the proposed method improves semantic alignment while maintaining structural preservation compared with Stable Diffusion image-to-image baselines. A human preference study further shows that concept-injected outputs are preferred in 76.0% of responses for perfect skin and 85.7% for peaceful lake. Ablation studies further demonstrate the controllability and robustness of the proposed framework. Overall, the method provides a simple and parameter-efficient approach for interpretable concept-level manipulation in diffusion models.

(This article belongs to the Special Issue AI-Driven Multimodal Image and Video Processing: Advances and Applications)

30 pages, 13254 KB

Open Access Article

MBRSNet: Boundary-Aware Multi-Task Learning with Signed Distance Field Regression for Polyp Segmentation

by Ruishi Lin and Liyong Ma

J. Imaging 2026, 12(7), 278; https://doi.org/10.3390/jimaging12070278 (registering DOI) - 24 Jun 2026

Abstract

Accurate polyp segmentation in colonoscopic images remains challenging due to low contrast, irregular morphology, and significant distribution shifts across datasets, which often lead to unreliable boundary delineation and poor generalization. Existing methods typically treat boundary information as an auxiliary cue or incorporate boundary information through hand-crafted architectural designs, resulting in limited integration between boundary-sensitive features and region-aware representations. In this paper, we propose a boundary-aware multi-task learning framework, termed MBRSNet, which explicitly models and exploits the complementarity between the segmentation task and the auxiliary signed distance field (SDF) regression task. Specifically, we formulate boundary modeling as an auxiliary SDF regression task, providing dense and continuous structural supervision without requiring additional annotations. To effectively couple the two tasks, we design a cross-gated multi-task bottleneck that enables bidirectional and selective feature interaction, allowing each task to selectively leverage complementary information while suppressing task-irrelevant responses. Furthermore, a hierarchical cross-task guidance strategy is introduced in the decoding stage, where boundary-aware weighting and segmentation-guided alignment jointly refine multi-scale features, ensuring consistent integration of boundary cues and regional semantics. Extensive experiments on five benchmark datasets demonstrate that MBRSNet achieves competitive or superior performance compared with representative state-of-the-art methods in both segmentation accuracy and cross-dataset generalization. In particular, the proposed framework achieves superior boundary delineation under challenging conditions and exhibits strong robustness to domain shifts, highlighting the effectiveness of structured task interaction for boundary-aware medical image segmentation.

(This article belongs to the Special Issue AI-Driven Medical Image Processing and Analysis)

22 pages, 160005 KB

Open Access Article

ESMStereo: Enhanced ShuffleMixer Disparity Upsampling for Real-Time and Accurate Stereo Matching

by Mahmoud Tahmasebi, Saif Huq, Kevin Meehan and Marion McAfee

J. Imaging 2026, 12(7), 277; https://doi.org/10.3390/jimaging12070277 (registering DOI) - 24 Jun 2026

Abstract

Stereo matching has become an increasingly important component of modern autonomous systems. Developing deep learning-based stereo-matching models that deliver high accuracy while operating in real time continues to be a major challenge in computer vision. In the domain of cost volume-based stereo matching, accurate disparity estimation depends heavily on large-scale cost volumes. However, such large volumes store substantial redundant information and also require computationally intensive aggregation units for processing and regression, making real-time performance unattainable. Conversely, small-scale cost volumes followed by lightweight aggregation units provide a promising route for real-time performance, but lack sufficient information to ensure highly accurate disparity estimation. To address this challenge, we propose the Enhanced Shuffle Mixer (ESM) to mitigate information loss associated with small-scale cost volumes. ESM restores critical details by integrating primary features into the disparity upsampling unit. It quickly extracts features from the initial disparity estimation and fuses them with image features. These features are mixed by shuffling and layer splitting, then refined through a compact feature-guided hourglass network to recover more detailed scene geometry. The ESM focuses on local contextual connectivity with a large receptive field and low computational cost, leading to improved disparity estimation accuracy while maintaining real-time performance under the evaluated settings. The compact version of ESMStereo achieves an inference speed of 116 FPS on RTX 4070S and 91 FPS on the AGX Orin.

(This article belongs to the Section Computer Vision and Pattern Recognition)

28 pages, 6638 KB

Open Access Article

Hyperelastic Regularization for Near-Diffeomorphic Transformer-Based Brain MRI Registration

by Shiyi Xu, Mohan Xu and Erjin Zhou

J. Imaging 2026, 12(7), 276; https://doi.org/10.3390/jimaging12070276 (registering DOI) - 24 Jun 2026

Abstract

Transformer-based deformable brain MRI registration achieves high overlap accuracy, but predicted displacement fields can contain voxels with a non-positive Jacobian determinant, that is, local foldings that violate the diffeomorphism assumption required by tensor-based morphometry and atlas-fusion segmentation workflows. We introduce HypEReg, a non-linear hyperelastic regularizer that acts directly on the Jacobian determinant of the predicted displacement field. HypEReg couples a clamped-rational volume-distortion penalty $(\det J_{\phi} - 1)^{2} / \max(\det J_{\phi}, \epsilon)$ with an explicit per-voxel anti-folding hinge $[\max(0, \epsilon - \det J_{\phi})]^{2}$, integrated as a purely loss-side module into a TransMorph backbone with no inference-graph modifications. On the IXI atlas-to-subject benchmark (115 test subjects), HypEReg-TransMorph maintains grouped Dice (0.7537) while reducing the $\det(J_{\phi}) \leq 0$ voxel ratio from $1.502 \times 10^{-2}$ (TransMorph) to $1.5 \times 10^{-5}$, with identical per-case runtime and parameter count to the unregularized baseline. In strict zero-shot transfer to OASIS Learn2Reg test pairs (no fine-tuning), HypEReg-TransMorph achieves Dice 0.7756 with a $\det(J_{\phi}) \leq 0$ ratio of $7.6 \times 10^{-5}$, roughly two orders of magnitude below plain TransMorph zero-shot (Dice 0.7691; ratio $9.6 \times 10^{-3}$); downstream multi-atlas label fusion further confirms the practical benefit of fold suppression (fused Dice 0.8271 vs. 0.8201 for TransMorph). OASIS-2 longitudinal and ROI analyses support deformation plausibility (lower folding/SDlogJ and stronger ventricular ROI agreement), while clinical-covariate associations remain exploratory rather than biomarker-validating. Determinant-level, non-linear hyperelastic regularization substantially suppresses folding in Transformer dense-flow brain MRI registration while preserving alignment accuracy and adding zero inference cost, providing a practical drop-in regularization strategy that improves the reliability of deformation fields for morphometry-oriented deformable registration.
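To make the two penalty terms quoted in this abstract concrete, the following is a minimal per-voxel sketch in plain Python. It is not the authors' implementation: the function names, the threshold value `eps=0.01`, and the `fold_weight` that combines the two terms are assumptions chosen for illustration, since the abstract does not state them.

```python
def hyperelastic_penalty(det_j, eps=0.01, fold_weight=1.0):
    """Per-voxel penalty built from the two terms quoted in the abstract.

    det_j       : Jacobian determinant of the deformation at one voxel
    eps         : clamp/hinge threshold (value assumed for illustration)
    fold_weight : relative weight of the hinge term (assumed, not from the paper)
    """
    # Clamped-rational volume-distortion term: (det J - 1)^2 / max(det J, eps)
    volume = (det_j - 1.0) ** 2 / max(det_j, eps)
    # Anti-folding hinge: [max(0, eps - det J)]^2, non-zero only when det J < eps
    hinge = max(0.0, eps - det_j) ** 2
    return volume + fold_weight * hinge


def folding_ratio(dets):
    """Fraction of voxels with det J <= 0, the folding metric reported above."""
    return sum(1 for d in dets if d <= 0.0) / len(dets)


# Toy determinants: identity, compression, expansion, degenerate, folded
dets = [1.0, 0.5, 2.0, 0.0, -0.2]
penalties = [hyperelastic_penalty(d) for d in dets]
print(penalties)
print(folding_ratio(dets))
```

The penalty is zero for a volume-preserving voxel (determinant 1) and grows sharply as the determinant approaches or drops below zero, which is the behavior the abstract credits with suppressing folding.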

(This article belongs to the Special Issue Artificial Intelligence in Medical Imaging: Progress, Challenges and Perspectives)

23 pages, 584 KB

Open Access Article

Benchmarking Barren Plateau Mitigation Strategies in Quantum Neural Networks on Standard and Medical Image Datasets

by Maqsudur Rahman, Rui Liu, Anup Majumder, Pintu Chandra Paul, Kangtong Mo, Amena Begum, Kashmi Sultana, Nahida Akter, Lu Wei, Ye Zhang and Jun Zhuang

J. Imaging 2026, 12(7), 275; https://doi.org/10.3390/jimaging12070275 (registering DOI) - 23 Jun 2026

Abstract

Barren plateaus (BPs) pose a major trainability challenge for quantum neural networks (QNNs) by causing gradients to concentrate near zero as circuit size, depth, or expressibility increases. This study presents a comparative benchmark of 10 BP mitigation strategies across six qubit settings (2, 4, 8, 12, 16, and 20) and three datasets of increasing complexity: Iris, MNIST, and MedMNIST. The evaluated methods include eight initialization-based strategies (Beta, Gaussian, Uniform Norm, CNN-based initialization, He-normal, He-uniform, Xavier-normal, and Xavier-uniform), one model-based variational encoder, and one optimization-based time-nonlocal Fourier parameterization. Experiments were implemented using PennyLane 3.10 and PyTorch 2.5 with simulator backends. We evaluate trainability using gradient variance and training loss, and we clarify that the benchmark analyzes simulated QNN optimization behavior rather than hardware-noise-resilient or noisy-label learning. Across the tested two-layer circuit configurations, the mitigation strategies maintained measurable gradient variance and stable loss reduction, suggesting that severe barren plateau behavior was not observed under the benchmark conditions. CNN-based and Beta initialization showed strong empirical behavior in variance retention and convergence speed, while Gaussian initialization was comparatively weaker in higher-dimensional settings. The study provides a reproducible benchmark structure for comparing BP mitigation behavior and identifies important limitations related to circuit depth, hardware noise, feature encoding, and classification performance that should be addressed in future QNN benchmarking.

(This article belongs to the Section Medical Imaging)

23 pages, 788 KB

Open Access Review

Human–AI Interaction in Interventional Radiology: A Narrative Review of Current Applications, Challenges, and Future Directions

by Francesco Mariotti, Laura Maria Cacioppa, Nicolo’ Rossini, Alessandra Bruno, Giangabriele Francavilla, Alessandro Felicioli, Marco Macchini, Andrea Coppola, Michaela Cellina and Chiara Floridi

J. Imaging 2026, 12(6), 274; https://doi.org/10.3390/jimaging12060274 (registering DOI) - 22 Jun 2026

Abstract

Traditional evaluations of artificial intelligence (AI) systems in the dynamic, operator-dependent, and time-sensitive field of interventional radiology (IR) focus solely on algorithmic performance and often fail to capture real-world clinical impact. This narrative review provides an overview of the current state of the art of AI integration in IR through human–AI interaction (HAI), while offering a critical perspective on its clinical integration, limitations, and future directions. A comprehensive survey of recent literature was performed, focusing on AI applications across procedural phases. The review emphasizes systems providing decision support, real-time procedural verification, and immersive interfaces (augmented and virtual reality), while critically evaluating determinants of effective clinical adoption. AI has shown preliminary potential to support operator performance in selected interventional radiology tasks, although most applications remain experimental, retrospective, or evaluated in phantom or preclinical settings. Potential benefits include structuring uncertainty in patient selection and procedural planning, supporting assessment of device positioning and treatment outcomes, and integrating AI-derived outputs into the operator’s spatial field through immersive technologies. The clinical utility of these systems appears to be influenced by human–AI interaction, with interpretability, workflow integration, and trust calibration representing key determinants of effective use beyond algorithmic accuracy alone. The potential value of AI in interventional radiology appears to derive from its integration into human decision-making rather than from standalone predictive performance. A human-centered, interaction-based model helps in understanding current applications, addressing challenges, and guiding the development of adaptive, real-time systems for dynamic procedural environments.

(This article belongs to the Section Medical Imaging)

42 pages, 52421 KB

Open Access Review

Coronary Artery Anomalies and Anatomical Variants: Cross-Sectional Diagnostic Imaging and Clinical Background

by Nicolò Schicchi, Francesco Bianco, Marco Fogante, Corrado Tagliati, Luca Procaccini, Franco De Remigis, Emanuela Algeri, Giovanni Lorusso, Stefania Lamja, Giulia Argalia, Cinzia Romagnolo, Simone Steffani, Matteo Cesarotto, Luca Salice, Manuel Belgrano, Antonio Bernardini, Giuseppe Lanni, Antonio Corvino, Marcello Chiocchi and Alessandro Capestro

J. Imaging 2026, 12(6), 273; https://doi.org/10.3390/jimaging12060273 (registering DOI) - 22 Jun 2026

Abstract

The coronary arteries are a pair of arteries that branch off from the aorta and encircle the heart, providing oxygenated blood to the myocardium. Although coronary artery atherosclerosis remains a main cause of morbidity and mortality worldwide, coronary artery anomalies (CAAs) are increasingly recognized as a clinically relevant cause of ischemic events and can be subdivided into origin, course, or termination anomalies. The aim of this narrative review is to summarize the cross-sectional diagnostic imaging and clinical background of CAAs.

(This article belongs to the Section Medical Imaging)

29 pages, 24005 KB

Open Access Article

YoLeTooth: A Unified Framework for Joint Tooth Segmentation and Periapical Lesion Detection in Panoramic Radiographs

by Gianmarco Scarano, Simone Agostinelli, Irene Amerini and Piero Papi

J. Imaging 2026, 12(6), 272; https://doi.org/10.3390/jimaging12060272 (registering DOI) - 20 Jun 2026

Abstract

Chronic periapical periodontitis is a persistent inflammatory disease characterized by progressive bone destruction around the tooth apex. Manual radiographic detection of these lesions is subjective and time-consuming, highlighting the need for automated diagnostic tools. This paper presents a unified deep learning framework for joint tooth segmentation and periapical lesion detection in panoramic radiographs. The approach uses two coupled models: the first identifies and segments individual teeth according to standard dental numbering systems, and the second detects periapical lesions within the tooth regions produced by that segmentation. The framework incorporates the Powerful IoU v2 loss function to improve bounding-box regression accuracy and a spatial association mechanism that maps detected lesions to specific teeth based on geometric overlap analysis. The proposed tooth segmentation model achieves an mAP@50 of 97.7% and a mean Dice coefficient of 93.5%, while the periapical lesion detector reaches an mAP@50 of 91.9%. Furthermore, the region-of-interest approach yields a 3.49× computational speedup over full-image processing, averaging 0.1589 s per radiograph. Trained exclusively on open-source datasets, this reproducible framework achieves explicit tooth-to-lesion mapping, providing an efficient and practical tool for periapical lesion screening.
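The tooth-to-lesion mapping by geometric overlap can be illustrated with a short sketch. The abstract does not specify the overlap criterion, so this example assumes axis-aligned boxes in `(x1, y1, x2, y2)` form and assigns each lesion to the tooth with the largest intersection area; the function names, tooth labels, and coordinates are hypothetical.

```python
def intersection_area(a, b):
    """Overlap area of two axis-aligned boxes given as (x1, y1, x2, y2)."""
    width = min(a[2], b[2]) - max(a[0], b[0])
    height = min(a[3], b[3]) - max(a[1], b[1])
    return max(0.0, width) * max(0.0, height)


def assign_lesions(teeth, lesions, min_overlap=0.0):
    """Map each lesion id to the tooth id with the largest overlap.

    A lesion whose best overlap does not exceed min_overlap maps to None.
    The largest-intersection rule is an assumption for illustration.
    """
    mapping = {}
    for lesion_id, lesion_box in lesions.items():
        best_tooth, best_area = None, min_overlap
        for tooth_id, tooth_box in teeth.items():
            area = intersection_area(lesion_box, tooth_box)
            if area > best_area:
                best_tooth, best_area = tooth_id, area
        mapping[lesion_id] = best_tooth
    return mapping


# Hypothetical boxes: two adjacent teeth and two detected lesions
teeth = {"11": (0, 0, 10, 30), "12": (10, 0, 20, 30)}
lesions = {"lesion_a": (8, 25, 14, 32), "lesion_b": (40, 40, 50, 50)}
mapping = assign_lesions(teeth, lesions)
print(mapping)
```

Here `lesion_a` straddles both teeth but overlaps tooth 12 more, so it is assigned there, while `lesion_b` touches no tooth and stays unassigned. A normalized measure such as IoU could replace the raw area if lesion and tooth sizes vary widely.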

(This article belongs to the Special Issue Deep Learning in Biomedical Image Segmentation and Classification: Advancements, Challenges and Applications, 2nd Edition)

14 pages, 1974 KB

Open Access Article

Radiomics-Guided Multi-Sequence Learning for Pathological Complete Response Prediction from Breast MRI with Missing Auxiliary Sequences

by Xinyuan Xiang, Wenyu Yin and Jiayue Li

J. Imaging 2026, 12(6), 271; https://doi.org/10.3390/jimaging12060271 - 18 Jun 2026

Abstract

Pathological complete response (pCR) after neoadjuvant chemotherapy (NACT) provides an endpoint for treatment evaluation in breast cancer. Multi-sequence breast MRI can support pCR prediction, but routine examinations may lack usable T1-weighted or T2-weighted sequences. Many models merge radiomic and deep features by concatenation, leaving the interaction between handcrafted descriptors and learned representations weakly specified. We developed a radiomics-guided framework for pCR prediction from multi-sequence breast MRI. The model uses a multi-branch 2.5D encoder for sequence-specific features, radiomics-guided channel recalibration, and masked token fusion to aggregate available sequence tokens. We evaluated the framework on 157 patients from the I-SPY1 Trial cohort with patient-level five-fold cross-validation, fixed sequence-combination analysis, and slice-window sensitivity analysis. The full model achieved 78.4% accuracy and 0.809 AUC, compared with 75.8% accuracy and 0.788 AUC for the strongest channel-concatenation baseline. In this cohort, radiomics-guided multi-sequence learning was feasible, with external validation required before clinical interpretation.

(This article belongs to the Special Issue Deep Learning in Biomedical Image Segmentation and Classification: Advancements, Challenges and Applications, 2nd Edition)

39 pages, 1756 KB

Open Access Review

Cutaneous Thermography in Arthropathies: Quantitative Imaging, Machine Learning, and Clinical Translation

by Constantin-Adrian Andrei, Serban Dragosloveanu, Alex-Gabriel Grigore, Andreea Alexandra Anghel, Atanasie-Andrei Gogu, Rares-Mircea Birlutiu, Christiana Diana Maria Dragosloveanu, Catalin Anghel, Adrian Iftime, Romica Cergan, Constantin Caruntu and Cristian Scheau

J. Imaging 2026, 12(6), 270; https://doi.org/10.3390/jimaging12060270 - 18 Jun 2026

Abstract

Arthropathies are a major global health challenge because of their high prevalence, chronic progression, and significant impact on quality of life and health systems. Therefore, prompt and accurate diagnosis is critical for slowing disease progression and improving outcomes. Traditional imaging modalities, such as ultrasound and magnetic resonance imaging, suffer from significant limitations, including operator dependence, limited accessibility, high cost, and limited reproducibility. Infrared thermography has become a promising non-invasive imaging technique for identifying thermal variations linked to inflammatory and metabolic processes. Advances in quantitative thermography, automated segmentation, and artificial intelligence have greatly enhanced its clinical applicability. This review summarizes recent advances in thermography-based biomarkers, including region-of-interest-derived metrics, asymmetry indices, hotspot burden, spatial and texture descriptors, and composite thermographic scores. It discusses the role of machine learning and deep learning in prediction, phenotyping, and multimodal integration with clinical, laboratory, and imaging data. Heterogeneity of protocols, variability in measurements, domain shift, validation design, overfitting, and reporting quality are also addressed. Overall, thermography combined with AI is highly promising as an adjunct to early diagnosis, assessment of disease activity, and follow-up in arthropathies. However, clinical application at a large scale requires strict standardization, external validation, transparent reporting, and well-elucidated, reproducible analytical processes.

(This article belongs to the Section Medical Imaging)

22 pages, 6346 KB

Open Access Article

Two-Stage Dynamic Synergistic Segmentation Method for Myocardial Pathology

by Dongsheng Ruan, Xiaolin Zhang, Zihan Yuan, Ziqian Lu, Ling Xia and Mingfeng Jiang

J. Imaging 2026, 12(6), 269; https://doi.org/10.3390/jimaging12060269 - 18 Jun 2026

Abstract

Myocardial scar and edema segmentation from multi-sequence cardiac magnetic resonance (MS-CMR) is important for myocardial infarction assessment, but remains challenging due to heterogeneous modal characteristics, severe class imbalance, and the small, ambiguous nature of pathological regions. To address these issues, a dynamic synergistic segmentation network (DSS-Net) is proposed for myocardial pathology segmentation. The framework adopts a coarse-to-fine strategy, in which a coarse stage first segments the myocardium to provide anatomical priors and region constraints, and a fine stage then delineates scar and edema within the myocardium-aware space. In addition, a Modality Dynamic Fusion Module (MDFM) is designed to adaptively emphasize pathology-relevant modal information, and a Stage Feature Aggregation Module (SFAM) is introduced to enhance cross-stage feature interactions and fine-grained lesion representation. Experiments on the MyoPS 2020 and MyoPS 2024 datasets demonstrate that DSS-Net achieves competitive and balanced performance, reaching Dice scores of 0.706 for scar and 0.753 for edema on MyoPS 2020. Additionally, compared with SOTA methods in the MyoPS 2020 Challenge, the proposed method attains comparable scar segmentation performance while maintaining a more balanced trade-off between sensitivity and specificity. These findings suggest that combining anatomical guidance with pathology-aware multi-modal learning is a promising strategy for robust myocardial pathology segmentation in MS-CMR images.

(This article belongs to the Special Issue Deep Learning in Biomedical Image Segmentation and Classification: Advancements, Challenges and Applications, 2nd Edition)

20 pages, 2195 KB

Open Access Systematic Review

Ultrasound Features of Uterine Perivascular Epithelioid Cell Tumor (PEComa): A Systematic Review

by Laura Grazia Zompì, Giorgio Maria Baldini, Maria Bardi, Salvatore Lopez, Angela Calabrese, Maria Antonietta Ramunno, Giuseppe Colonna, Vera Loizzi, Francesca Arezzo and Gennaro Cormio

J. Imaging 2026, 12(6), 268; https://doi.org/10.3390/jimaging12060268 - 18 Jun 2026

Abstract

Uterine perivascular epithelioid cell tumor (PEComa) is a rare mesenchymal neoplasm whose sonographic profile has not been systematically characterized. We describe an index case of malignant uterine PEComa and present a PRISMA 2020-compliant systematic review (PubMed, Scopus, Cochrane Library; search 1 March 2026) of studies reporting original ultrasound data of histologically confirmed uterine PEComa. Sonographic features were coded with MUSA/IETA terminology; Clopper–Pearson 95% confidence intervals (CI) were calculated for key proportions, and malignancy subgroups were summarized descriptively. Thirty-one cases were pooled (30 from 18 studies plus our index case; median age 41 years). The profile comprised absent acoustic shadowing in all documented cases (10/10; 95% CI 69.2–100%), moderate-to-abundant vascularisation (Color Score 3–4, 91.7%), variable echogenicity (heterogeneous 56.0%) and predominantly regular margins (69.6%). Preoperative misdiagnosis occurred in 100% of cases, most often as leiomyoma (41.4%). In cases with known malignancy status (n = 17), irregular margins and cystic areas appeared more often in malignant lesions, but subgroups were too small for testing. Only 4/18 studies applied standardized terminology. Uterine PEComa shows a recurrent pattern of absent shadowing, high vascularisation and solid consistency with regular margins that may aid differential diagnosis; systematic adoption of MUSA/IETA terminology in future reports is strongly advocated.
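The Clopper–Pearson interval quoted for absent shadowing (10/10; 95% CI 69.2–100%) can be checked with a small standard-library sketch. This is an illustrative implementation that inverts the binomial tail probabilities by bisection, not the software the authors used; the function names and the tolerance are assumptions.

```python
from math import comb


def binom_tail_ge(x, n, p):
    """P(X >= x) for X ~ Binomial(n, p)."""
    return sum(comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(x, n + 1))


def binom_tail_le(x, n, p):
    """P(X <= x) for X ~ Binomial(n, p)."""
    return sum(comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(0, x + 1))


def _solve(f, target, increasing, tol=1e-10):
    """Bisection on [0, 1] for f(p) == target, with f monotone in p."""
    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if (f(mid) < target) == increasing:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def clopper_pearson(x, n, alpha=0.05):
    """Exact two-sided (1 - alpha) interval for x successes in n trials."""
    # Lower bound: smallest p with P(X >= x | p) = alpha / 2 (0 when x == 0)
    lower = 0.0 if x == 0 else _solve(
        lambda p: binom_tail_ge(x, n, p), alpha / 2, True)
    # Upper bound: largest p with P(X <= x | p) = alpha / 2 (1 when x == n)
    upper = 1.0 if x == n else _solve(
        lambda p: binom_tail_le(x, n, p), alpha / 2, False)
    return lower, upper


lower, upper = clopper_pearson(10, 10)
print(f"{lower:.1%} to {upper:.1%}")
```

For 10 of 10 the lower bound has the closed form 0.025^(1/10), about 0.6915, which agrees with the 69.2% reported; the wide interval reflects how little ten cases can establish even when the observed proportion is 100%.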

(This article belongs to the Section Medical Imaging)

26 pages, 13171 KB

Open Access Article

A Deep Learning Approach for Pixel-Level Material Classification via Hyperspectral Imaging

by Savvas Sifnaios, George Arvanitakis, Fotios K. Konstantinidis, Georgios Tsimiklis, Angelos Amditis and Panayiotis Frangos

J. Imaging 2026, 12(6), 267; https://doi.org/10.3390/jimaging12060267 - 18 Jun 2026

Abstract

Recent advancements in computer vision, particularly in detection, segmentation, and classification, have significantly impacted various domains. However, these advancements are still strongly tied to RGB-based systems, which are insufficient for applications in industries such as waste sorting, pharmaceuticals, and defence, where material characterization beyond shape or visible colour is necessary. Hyperspectral (HS) imaging captures spatial and spectral information for each pixel and therefore offers a promising route for material-level classification. This study evaluates the potential of combining HS imaging with deep learning for plastic material classification. The work includes: (i) the design of an experimental setup with a HS line-scan camera, conveyor, and controlled illumination; (ii) the construction of an object-disjoint dataset of HDPE, PET, PP, and PS samples with semi-automated mask generation and Raman spectroscopy-based labelling; and (iii) the development of P1CH, a lightweight pixel-wise 1D convolutional hyperspectral classifier. On object-disjoint test images, P1CH achieved 97.44% all-pixel accuracy. A boundary sensitivity analysis, reported separately because semi-automated labels are uncertain at material/background interfaces, yielded 99.94% accuracy after excluding a pre-defined two-pixel border band. Additional ablation, baseline, and robustness analyses show that the proposed pixel-wise spectral approach is effective for small fragments, visually similar plastics, and overlapping materials, while black or very dark plastics remain challenging under the present camera and illumination configuration.

(This article belongs to the Special Issue Advancement in Hyperspectral Image Processing with Machine Learning)
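
The difference between all-pixel accuracy and accuracy outside a border band can be illustrated with a small sketch. It is not the authors' evaluation code: the function names `border_band` and `pixel_accuracies` are hypothetical, and the band is assumed here to mean "pixels within `band` pixels of a differently labelled pixel", which is only one possible reading of "two-pixel border band".

```python
from typing import List, Tuple

Grid = List[List[int]]


def border_band(labels: Grid, band: int = 2) -> List[List[bool]]:
    """Flag pixels that lie within `band` pixels of a differently labelled pixel.

    Assumption: this window-based definition stands in for the paper's
    pre-defined border band, whose exact construction is not given here.
    """
    rows, cols = len(labels), len(labels[0])
    mask = [[False] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            # Scan the (2*band+1) x (2*band+1) window around (r, c).
            for rr in range(max(0, r - band), min(rows, r + band + 1)):
                for cc in range(max(0, c - band), min(cols, c + band + 1)):
                    if labels[rr][cc] != labels[r][c]:
                        mask[r][c] = True
    return mask


def pixel_accuracies(labels: Grid, preds: Grid, band: int = 2) -> Tuple[float, float]:
    """Return (all-pixel accuracy, accuracy outside the border band)."""
    mask = border_band(labels, band)
    total = correct = inner_total = inner_correct = 0
    for r, row in enumerate(labels):
        for c, label in enumerate(row):
            hit = int(preds[r][c] == label)
            total += 1
            correct += hit
            if not mask[r][c]:
                inner_total += 1
                inner_correct += hit
    inner_acc = inner_correct / inner_total if inner_total else float("nan")
    return correct / total, inner_acc


# Toy 1 x 8 strip: background (0) on the left, material (1) on the right.
labels = [[0, 0, 0, 0, 1, 1, 1, 1]]
preds = [[0, 0, 0, 1, 1, 1, 1, 1]]  # one error, right at the interface
print(pixel_accuracies(labels, preds))
```

In this toy strip the only misclassified pixel sits at the material/background interface, so it counts against the all-pixel figure but falls inside the excluded band.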

17 pages, 4000 KB

Open AccessArticle

A Lightweight and High-Precision PCB Surface Defect Detection Method Based on YOLOv8

by Zhenling Wang, Ya Gao, Ying Xiao and Qiurui He

J. Imaging 2026, 12(6), 266; https://doi.org/10.3390/jimaging12060266 - 18 Jun 2026

Abstract

In response to the diverse types and large number of PCB surface defects, this paper proposes an improved YOLOv8-based method for PCB surface defect detection. First, the model is made lighter by introducing RepGhostBottleNeck as the backbone network, which reduces the number of model parameters. Note that the term “lightweight” in this paper is relative to the original YOLOv8L baseline; compared with extremely lightweight detectors, the proposed model places greater emphasis on the balance between accuracy and efficiency. Additionally, an attention mechanism module and a small object detection head module are added to the backbone network. Furthermore, the loss function of the network is improved. Experimental results show that the improved model achieves an average mAP@0.5 of 0.976, demonstrating high-precision detection on the constructed dataset.

(This article belongs to the Special Issue AI-Driven Image and Video Understanding)

15 pages, 1091 KB

Open AccessArticle

Towards Automated Spine Fracture Detection on Whole-Body CT of Polytraumatized Patients

by Elena Stojanovski, Alexander Hönning, Frederik Spohn, Marlene Ciesla, Holger Arndt, Sven Mutze, Alena-Kathrin Golla, Tobias Klinder, Cristian Lorenz and Leonie Goelz

J. Imaging 2026, 12(6), 265; https://doi.org/10.3390/jimaging12060265 - 18 Jun 2026

Abstract

Treatment of severely injured patients is challenging, and timely reading of whole-body computed tomography (WBCT) images is therefore crucial. Artificial intelligence is increasingly used to prioritize and detect acute injuries in this context. Algorithms focusing on the cervical spine and compression fractures have been deployed successfully. However, tools for whole spine assessment and the entirety of fracture morphologies are lacking. We aimed to investigate the capabilities of an algorithm to detect spine fractures on WBCTs and factors contributing to the difficulties in its development. A version 1.0 (v1) of the algorithm was previously trained with 454 cervical spine fractures using a U-Net via four-fold cross-validation to segment spine fractures and the spine via a multi-task loss. Further training expanded towards whole spine assessment with additional annotated fractures (Cohort 1) of the cervical (n = 50), thoracic (n = 30), and lumbar spine (n = 20), resulting in version 2.0 (v2). Baseline was set to reach the highest sensitivity at a maximum of five false positives per case. Version 1.0 was tested on Cohort 1 and both versions were compared on prospectively collected real-world data (Cohort 2, n = 712 WBCTs). An additional systematic review served to compare the algorithmic performance against the state of the art. Version 1.0 showed promising performance not only for the cervical but also the thoracic and lumbar spine due to generalization (sensitivities ranging between 60% and 87%). Version 2.0 also achieved decent sensitivities for Cohort 2 (sensitivities ranging between 77% and 85%) but generated an abundance of false positives. Various reasons led to false positive results; for Version 2.0, the trabecular structure itself provoked false alerts. Variations in training and test data (image quality, dose, reconstructions), heterogeneity of fractures and anatomies, plus the size of training sets explain some difficulties during algorithm development. Only five other groups have described work on whole-spine fracture detection; they encountered similar difficulties and have likewise not produced a clinically deployable tool. Spine fracture detection on WBCT is feasible, but multiple factors hinder the development of commercially available AI tools. Expansion and improved design of training cohorts are necessary for further development and simulation of real-life conditions.

(This article belongs to the Section AI in Imaging)
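
The baseline rule quoted in the abstract (highest sensitivity at a maximum of five false positives per case) amounts to a constrained choice among candidate operating points. The sketch below shows that selection in isolation; the `OperatingPoint` type, the function name, and all numbers are hypothetical and are not taken from the study.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass(frozen=True)
class OperatingPoint:
    threshold: float     # detection score cut-off (hypothetical)
    sensitivity: float   # fraction of annotated fractures detected
    fp_per_case: float   # mean number of false positives per scan


def pick_operating_point(
    points: Iterable[OperatingPoint], max_fp_per_case: float = 5.0
) -> Optional[OperatingPoint]:
    """Return the most sensitive point that respects the false-positive budget."""
    feasible = [p for p in points if p.fp_per_case <= max_fp_per_case]
    if not feasible:
        return None
    # Prefer higher sensitivity; break ties with fewer false positives.
    return max(feasible, key=lambda p: (p.sensitivity, -p.fp_per_case))


# Made-up candidates, for illustration only.
candidates = [
    OperatingPoint(threshold=0.9, sensitivity=0.55, fp_per_case=1.2),
    OperatingPoint(threshold=0.7, sensitivity=0.70, fp_per_case=3.8),
    OperatingPoint(threshold=0.5, sensitivity=0.82, fp_per_case=6.5),
]
best = pick_operating_point(candidates)
print(best)
```

The most sensitive candidate is rejected because it exceeds the budget, so the second one is returned.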

52 pages, 29644 KB

Open AccessArticle

RiTex: Harmonization of Radiomic Features Based on Riemannian Geometry

by Darya A. Voitenko, Anton V. Vladzymyrskyy, Olga V. Omelyanskaya, Yuriy A. Vasilev, Ivan A. Blokhin and Maria R. Kodenko

J. Imaging 2026, 12(6), 264; https://doi.org/10.3390/jimaging12060264 (registering DOI) - 17 Jun 2026

Abstract

Batch effects arising from variations in hardware, acquisition protocols, and reconstruction parameters present a critical challenge in radiomics, limiting the generalizability of models across multicentre studies. Existing harmonization methods, such as ComBat, CovBat, z-score normalization, and Generative Adversarial Networks, exhibit significant limitations when applied to high-dimensional radiomic data. ComBat assumes a linear feature space and tends to leave residual center-specific information recoverable by downstream classifiers. This paper introduces RiTex (Riemannian Texture Harmonization), a framework that solves a generalized eigenvalue problem between class-aware biological scatter and Ledoit–Wolf-regularized per-batch covariances, with the SPD-manifold Fréchet mean used as a principled averaging step. We evaluate RiTex on the 50-dataset radMLBench benchmark and on a new four-center head-and-neck benchmark with known center labels (n = 380 patients, k = 4 centers from TCIA: HGJ, MDACC, Maastro, QIN). On radMLBench, RiTex reduces the batch auto-detection AUC in 48/50 (96%) datasets; 42/50 (84%) reductions remain significant after Benjamini–Hochberg correction. The mean Batch AUC reduction is ΔBatch = −0.365 (95% bootstrap CI [−0.418, −0.312]), with no significant degradation in biological AUC (mean ΔBio = +0.018, 95% CI [−0.011, +0.047]). On the H&N benchmark with real center labels, RiTex reduces the Batch AUC from 0.74 to 0.59, while ComBat and CovBat leave it at ≈0.98. A component-wise ablation shows that the dominant source of empirical performance is the GEVD step, together with Ledoit–Wolf shrinkage. The SPD Fréchet mean acts as a theoretical scaffold with a negligible empirical contribution (ΔBatch AUC = −0.014 vs. arithmetic mean).

(This article belongs to the Special Issue Medical Image Analysis: New Opportunities and Challenges)
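
The Benjamini–Hochberg correction mentioned in the abstract is a standard step-up procedure for controlling the false discovery rate across many tests. A minimal sketch follows, assuming one p-value per dataset and a level of α = 0.05; neither the level nor the way the p-values were obtained is stated in the abstract, and the function name is illustrative.

```python
from typing import List


def benjamini_hochberg(p_values: List[float], alpha: float = 0.05) -> List[bool]:
    """Return one reject flag per hypothesis using the BH step-up rule."""
    m = len(p_values)
    # Indices of the p-values sorted in ascending order.
    order = sorted(range(m), key=lambda i: p_values[i])
    # Find the largest 1-based rank k with p_(k) <= (k / m) * alpha.
    max_rank = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * alpha:
            max_rank = rank
    # Reject every hypothesis ranked at or below max_rank.
    rejected = [False] * m
    for rank, idx in enumerate(order, start=1):
        if rank <= max_rank:
            rejected[idx] = True
    return rejected


# Made-up p-values, for illustration only.
print(benjamini_hochberg([0.01, 0.04, 0.03, 0.5]))
```

Because the rule is step-up, a p-value that misses its own rank threshold is still rejected whenever a larger p-value meets its threshold; with two tests at α = 0.05, the pair (0.03, 0.04) is rejected in full.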

21 pages, 2831 KB

Open AccessArticle

Frequency-Guided Cross-Modal Interaction for Multimodal Yeast Classification Based on Light-Scattering and Microscopy Images

by Zexi Cheng, Xiaoxuan Liu, Shamanth Shankarnarayan, Manisha Gupta, Wojciech Rozmus, Ying Yin Tsui, Daniel A. Charlebois and Mrinal Mandal

J. Imaging 2026, 12(6), 263; https://doi.org/10.3390/jimaging12060263 - 16 Jun 2026

Abstract

Accurate identification of pathogenic yeasts is essential for clinical diagnosis and effective antifungal therapy. However, current approaches predominantly rely on microscopy-based models, which require large-scale annotated datasets and exhibit limited generalization across morphologically similar species. In contrast, light-scattering (LS) imaging captures the diffraction patterns generated by internal cellular structures, providing volumetric biophysical cues that extend beyond surface morphology, yet its indirect representations pose major challenges for feature discrimination. Our objective is to develop fast and accurate methods to detect various species of yeasts. We propose FPA-YeastNet, a frequency-enhanced single-modality deep learning architecture that improves yeast classification in LS images by leveraging discriminative frequency-domain features. Building upon this enhanced modality, we further propose FGCA-YeastNet, a frequency-guided cross-attention network designed to integrate LS and microscopy information for complementary representation learning. The proposed multimodal model facilitates synergistic interactions between volumetric scattering structures and fine-grained cellular textures through adaptive fusion and bidirectional attention, leading to improved robustness and interpretability. Comprehensive classification experiments conducted on a multimodal yeast dataset demonstrate that FGCA-YeastNet effectively bridges the performance gap between LS and microscopy modalities, achieving significant improvements over both unimodal and multimodal baselines. FPA-YeastNet yields an average accuracy improvement of 6.26% compared with LS-only models, and FGCA-YeastNet further provides mean gains of 19.97% and 7.67% over unimodal and multimodal baseline models, respectively. Experimental results demonstrate the diagnostic potential of light scattering and microscopic imaging and underscore the effectiveness of frequency-guided multimodal collaboration for reliable and interpretable yeast classification in clinical microbiology.

(This article belongs to the Section Computer Vision and Pattern Recognition)

15 pages, 2306 KB

Open AccessArticle

Hyperspectral Fingerprints of Abdominal and Pelvic Organs

by Laurie S. van de Weerd, Nick J. van de Berg, L. Lucia Rijstenberg, Ralf L. O. van de Laar and Heleen J. van Beekhuizen

J. Imaging 2026, 12(6), 262; https://doi.org/10.3390/jimaging12060262 - 15 Jun 2026

Abstract

Ovarian cancer (OC) is typically treated with cytoreductive surgery (CRS). Hyperspectral imaging (HSI) is an emerging non-invasive, label-free technique that enables whole-area scanning, making it a promising tool for real-time tumour recognition. However, developing tumour recognition algorithms requires a foundational understanding of spectral variability in normal tissues. This study focusses on the in vivo spectral profiles of key abdominal and pelvic organs encountered during CRS, including the uterus, ovaries, intestines, mesentery, omentum, peritoneum, and fallopian tubes, and evaluates the potential for organ recognition using HSI data. Intraoperative HSI data were obtained from healthy patients. Two machine learning models, a support vector machine (SVM) and a 3D convolutional neural network (3DCNN), were trained to classify the organs based on their spectral signatures. In total, 15 patients were included in the dataset. The 3DCNN slightly outperformed the SVM in terms of the average accuracy (0.889 vs. 0.878), sensitivity (0.648 vs. 0.604), specificity (0.936 vs. 0.930), and Dice Similarity Coefficient (0.595 vs. 0.569). This study demonstrates the feasibility of using HSI for organ differentiation in the clinical setting, although in some cases separability remains a challenge, especially when organs have similar spectra. By carefully investigating the spectral fingerprints of abdominal tissues, this work is a critical step towards a generalizable in vivo abdominal tumour recognition algorithm.

(This article belongs to the Section Medical Imaging)
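
The four reported metrics can all be derived from one confusion table per organ. The sketch below assumes a one-vs-rest evaluation over flattened pixel labels, which the abstract does not confirm; the function name and the toy labels are hypothetical.

```python
from typing import Dict, Sequence


def one_vs_rest_metrics(truth: Sequence[int], pred: Sequence[int], organ: int) -> Dict[str, float]:
    """Accuracy, sensitivity, specificity, and Dice for one organ class."""
    if len(truth) != len(pred):
        raise ValueError("truth and pred must have the same length")
    tp = fp = fn = tn = 0
    for t, p in zip(truth, pred):
        if t == organ and p == organ:
            tp += 1
        elif t != organ and p == organ:
            fp += 1
        elif t == organ and p != organ:
            fn += 1
        else:
            tn += 1

    def ratio(num: int, den: int) -> float:
        # Undefined ratios (empty denominator) are reported as NaN.
        return num / den if den else float("nan")

    return {
        "accuracy": ratio(tp + tn, tp + fp + fn + tn),
        "sensitivity": ratio(tp, tp + fn),
        "specificity": ratio(tn, tn + fp),
        "dice": ratio(2 * tp, 2 * tp + fp + fn),
    }


# Toy labels for three organ classes (1, 2, 3), for illustration only.
truth = [1, 1, 1, 2, 2, 3, 3, 3]
pred = [1, 1, 2, 1, 2, 3, 3, 3]
print(one_vs_rest_metrics(truth, pred, organ=1))
```

In a one-vs-rest setting true negatives usually dominate, which is one general reason accuracy and specificity can sit well above sensitivity and Dice; whether that accounts for the figures reported here is not stated in the abstract.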

22 pages, 4411 KB

Open AccessArticle

SAR-Efficient Sub-Volume Imaging Using Nonlinear Gradient Magnetic Fields

by Emre Kopanoglu, Ergin Atalar and R. Todd Constable

J. Imaging 2026, 12(6), 261; https://doi.org/10.3390/jimaging12060261 - 13 Jun 2026

Abstract

Excitation using nonlinear gradient magnetic fields is investigated as a means of sub-volume magnetic resonance imaging (MRI). Conventional gradient fields provide encoding along a single direction, whereas nonlinear gradient fields encode information simultaneously along at least two directions. This leads to excitation regions (FOX) that have curvilinear boundaries, which may be more tolerant to aliasing artifacts when the encoded field of view (FOV) is smaller than the FOX. This reduces the complexity of the required radiofrequency (RF) excitation pulses and enables accelerated reduced-FOV imaging with standard slice-selection RF pulses. We demonstrate the approach using a Z2-harmonic field for cylindrical regions of interest (ROIs) with various radius/height ratios. The minimum FOV that should be encoded is formulated in terms of ROI and RF pulse parameters to allow a theoretical evaluation of feasibility during study design. The investigated method is compared to one-dimensional and two-dimensional selective RF pulses in terms of echo time, scan time, and specific absorption rate (SAR) using simulations and phantom experiments. The investigated method yields a lower scan time while keeping the SAR unaltered compared to a conventional slice-selective RF pulse, and is more efficient in terms of SAR, echo time, and scan time compared to two-dimensional selective excitation.

(This article belongs to the Section Image and Video Processing)
