Journal Description
Journal of Imaging is an international, multi/interdisciplinary, peer-reviewed, open access journal of imaging techniques, published online monthly by MDPI.
- Open Access: free for readers, with article processing charges (APC) paid by authors or their institutions.
- High Visibility: indexed within Scopus, ESCI (Web of Science), PubMed, PMC, dblp, Inspec, Ei Compendex, and other databases.
- Journal Rank: CiteScore - Q2 (Computer Graphics and Computer-Aided Design)
- Rapid Publication: manuscripts are peer-reviewed and a first decision is provided to authors approximately 21.7 days after submission; acceptance to publication takes 3.8 days (median values for papers published in this journal in the second half of 2023).
- Recognition of Reviewers: reviewers who provide timely, thorough peer-review reports receive vouchers entitling them to a discount on the APC of their next publication in any MDPI journal, in appreciation of the work done.
Impact Factor: 3.2 (2022); 5-Year Impact Factor: 3.2 (2022)
Latest Articles
Single-Image-Based 3D Reconstruction of Endoscopic Images
J. Imaging 2024, 10(4), 82; https://doi.org/10.3390/jimaging10040082 - 28 Mar 2024
Abstract
A wireless capsule endoscope (WCE) is a medical device designed for the examination of the human gastrointestinal (GI) tract. Three-dimensional models based on WCE images can assist in diagnostics by effectively detecting pathology. These 3D models provide gastroenterologists with improved visualization, particularly in areas of specific interest. However, the constraints of WCE, such as a lack of controllability and a dependence on expensive, often unavailable operating equipment, pose significant challenges to conducting comprehensive experiments aimed at evaluating the quality of 3D reconstruction from WCE images. In this paper, we employ a single-image-based 3D reconstruction method on an artificial colon captured with an endoscope that behaves like a WCE. The shape from shading (SFS) algorithm can reconstruct a 3D shape from a single image and has therefore been employed to reconstruct the 3D shapes of the colon images. The camera of the endoscope has also been subjected to comprehensive geometric and radiometric calibration. Experiments are conducted on well-defined primitive objects to assess the method's robustness and accuracy. This evaluation involves comparing the reconstructed 3D shapes of primitives with ground truth data, quantified through measurements of root-mean-square error and maximum error. Afterward, the same methodology is applied to recover the geometry of the colon. The results demonstrate that our approach is capable of reconstructing the geometry of the colon captured with a camera with an unknown imaging pipeline and significant noise in the images. The same procedure is applied to WCE images, and preliminary results illustrate the applicability of our method for reconstructing 3D models from WCE images.
Full article
(This article belongs to the Special Issue Geometry Reconstruction from Images (2nd Edition))
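As a concrete illustration of the evaluation described above, the following minimal sketch computes the two reported error measures between a reconstructed depth map and its ground truth; the function name and arrays are illustrative, not the authors' code.

```python
# A minimal sketch of the two reported error measures, comparing a
# reconstructed surface against ground-truth data; arrays are illustrative.
import numpy as np

def reconstruction_errors(depth_pred: np.ndarray, depth_gt: np.ndarray):
    """Return (root-mean-square error, maximum absolute error)."""
    diff = depth_pred - depth_gt
    rmse = np.sqrt(np.mean(diff ** 2))
    max_err = np.abs(diff).max()
    return rmse, max_err
```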
Open Access Review
Applied Artificial Intelligence in Healthcare: A Review of Computer Vision Technology Application in Hospital Settings
by Heidi Lindroth, Keivan Nalaie, Roshini Raghu, Ivan N. Ayala, Charles Busch, Anirban Bhattacharyya, Pablo Moreno Franco, Daniel A. Diedrich, Brian W. Pickering and Vitaly Herasevich
J. Imaging 2024, 10(4), 81; https://doi.org/10.3390/jimaging10040081 - 28 Mar 2024
Abstract
Computer vision (CV), a type of artificial intelligence (AI) that uses digital videos or sequences of images to recognize content, has been used extensively across industries in recent years. However, in the healthcare industry, its applications are limited by factors such as privacy, safety, and ethical concerns. Despite this, CV has the potential to improve patient monitoring and system efficiencies while reducing workload. In contrast to previous reviews, we focus on the end-user applications of CV. First, we briefly review and categorize CV applications in other industries (job enhancement, surveillance and monitoring, automation, and augmented reality). We then review the development of CV in hospital, outpatient, and community settings. Recent advances in monitoring delirium, pain and sedation, patient deterioration, mechanical ventilation, mobility, patient safety, surgical applications, quantification of workload in the hospital, and monitoring for patient events outside the hospital are highlighted. To identify opportunities for future applications, we also completed journey mapping at different system levels. Lastly, we discuss the privacy, safety, and ethical considerations associated with CV and outline processes in algorithm development and testing that limit CV expansion in healthcare. This comprehensive review highlights CV applications and ideas for its expanded use in healthcare.
Full article
(This article belongs to the Special Issue Computer Vision and Deep Learning: Trends and Applications (2nd Edition))
Open Access Article
Magnetoencephalography Atlas Viewer for Dipole Localization and Viewing
by Natascha Cardoso da Fonseca, Jason Bowerman, Pegah Askari, Amy L. Proskovec, Fabricio Stewan Feltrin, Daniel Veltkamp, Heather Early, Ben C. Wagner, Elizabeth M. Davenport and Joseph A. Maldjian
J. Imaging 2024, 10(4), 80; https://doi.org/10.3390/jimaging10040080 - 28 Mar 2024
Abstract
Magnetoencephalography (MEG) is a noninvasive neuroimaging technique widely recognized for epilepsy and tumor mapping. MEG clinical reporting requires a multidisciplinary team, including expert input regarding each dipole's anatomic localization. Here, we introduce a novel tool, the "Magnetoencephalography Atlas Viewer" (MAV), which streamlines this anatomical analysis. The MAV normalizes the patient's Magnetic Resonance Imaging (MRI) to the Montreal Neurological Institute (MNI) space, reverse-normalizes MNI atlases to the native MRI, identifies MEG dipole files, and matches dipoles' coordinates to their spatial location in atlas files. It offers a user-friendly and interactive graphical user interface (GUI) for displaying individual dipoles, groups, coordinates, anatomical labels, and a tri-planar MRI view of the patient with dipole overlays. The MAV was evaluated on 273 dipoles obtained from clinical epilepsy subjects. Consensus-based ground truth was established by three neuroradiologists, with a minimum agreement threshold of two. The concordance between the ground truth and MAV labeling ranged from 79% to 84%, depending on the normalization method. Higher concordance rates, ranging from 80% to 90%, were observed in subjects with minimal or no structural abnormalities on the MRI. The MAV provides a straightforward method for MEG dipole anatomic localization, allowing a nonspecialist to prepopulate a report, thereby facilitating and reducing the time of clinical reporting.
Full article
(This article belongs to the Section Neuroimaging and Neuroinformatics)
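The coordinate-to-label matching the MAV performs can be illustrated with a short sketch. This is a minimal illustration, assuming a NIfTI label atlas already resampled to the subject's native space; the file name and coordinate are hypothetical, and this is not the authors' implementation.

```python
# A minimal sketch of matching a dipole coordinate to an atlas label,
# assuming a NIfTI label atlas in the subject's native space.
import numpy as np
import nibabel as nib
from nibabel.affines import apply_affine

atlas = nib.load("atlas_in_native_space.nii.gz")   # hypothetical path
labels = atlas.get_fdata()

def dipole_to_label(xyz_mm):
    """Map a dipole position in world (mm) coordinates to an atlas label ID."""
    # Invert the affine to go from world coordinates to voxel indices.
    ijk = apply_affine(np.linalg.inv(atlas.affine), xyz_mm)
    i, j, k = np.round(ijk).astype(int)
    return int(labels[i, j, k])

label_id = dipole_to_label([-42.0, 18.5, 27.0])    # example coordinate
```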
Open Access Article
Zero-Shot Sketch-Based Image Retrieval Using StyleGen and Stacked Siamese Neural Networks
by Venkata Rama Muni Kumar Gopu and Madhavi Dunna
J. Imaging 2024, 10(4), 79; https://doi.org/10.3390/jimaging10040079 - 27 Mar 2024
Abstract
Sketch-based image retrieval (SBIR) refers to a sub-class of content-based image retrieval problems where the input queries are ambiguous sketches and the retrieval repository is a database of natural images. In the zero-shot setup of SBIR, the query sketches are drawn from classes that do not match any of those used in model building. The SBIR task is extremely challenging because, unlike standard content-based image retrieval, it is a cross-domain retrieval problem: sketches and natural images are separated by a large domain gap. In this work, we propose an elegant retrieval methodology, StyleGen, for generating fake candidate images that match the domain of the repository images, thus reducing the domain gap for retrieval tasks. The retrieval methodology makes use of a two-stage neural network architecture known as the stacked Siamese network, which is known to provide outstanding retrieval performance without losing the generalizability of the approach. Experimental studies on the image sketch datasets TU-Berlin Extended and Sketchy Extended, evaluated using the mean average precision (mAP) metric, demonstrate a marked performance improvement compared to the current state-of-the-art approaches in the domain.
Full article
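The mAP metric used for evaluation can be sketched as follows; the toy query data are illustrative only, not drawn from the paper.

```python
# A minimal sketch of mean average precision (mAP) for retrieval,
# assuming each query returns a relevance-ranked result list.
import numpy as np

def average_precision(relevant: np.ndarray) -> float:
    """relevant: binary array, 1 where the ranked result matches the query class."""
    hits = np.cumsum(relevant)
    ranks = np.arange(1, len(relevant) + 1)
    precision_at_hit = (hits / ranks)[relevant == 1]
    return float(precision_at_hit.mean()) if relevant.sum() else 0.0

# mAP is the mean of per-query average precision (toy queries below).
queries = [np.array([1, 0, 1, 1, 0]), np.array([0, 1, 0, 0, 1])]
mAP = np.mean([average_precision(q) for q in queries])
```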
Open Access Article
Real-Time Dynamic Intelligent Image Recognition and Tracking System for Rockfall Disasters
by Yu-Wei Lin, Chu-Fu Chiu, Li-Hsien Chen and Chao-Ching Ho
J. Imaging 2024, 10(4), 78; https://doi.org/10.3390/jimaging10040078 - 26 Mar 2024
Abstract
Taiwan, frequently affected by extreme weather and phenomena such as earthquakes and typhoons, faces a high incidence of rockfall disasters due to its largely mountainous terrain. These disasters have led to numerous casualties, government compensation cases, and significant transportation safety impacts. According to National Science and Technology Center for Disaster Reduction records from 2010 to 2022, 421 out of 866 soil and rock disasters occurred in eastern Taiwan, causing traffic disruptions due to rockfalls. Since traditional disaster-detection sensors only record changes after a rockfall, no system has been in place to detect rockfalls as they occur. To address this, a rockfall detection and tracking system using deep learning and image processing technology was developed. This system includes a real-time image tracking and recognition system that integrates YOLO and image processing technology. It was trained on a self-collected dataset of 2490 high-resolution RGB images. The system's performance was evaluated on 30 videos featuring various rockfall scenarios, achieving a mean Average Precision (mAP50) of 0.845 and mAP50-95 of 0.41, with a processing time of 125 ms. Tested on advanced hardware, the system proves effective in quickly tracking and identifying hazardous rockfalls, offering a significant advancement in disaster management and prevention.
Full article
(This article belongs to the Special Issue From Imaging to Understanding: Methods and Application for Environment, Infrastructure and Human Monitoring)
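A hedged sketch of the detection loop such a system might run is shown below, assuming the Ultralytics YOLO API; the weights file rockfall.pt and video path are hypothetical, and this is not the authors' pipeline.

```python
# A minimal sketch of running a trained YOLO detector on video frames,
# assuming the Ultralytics API; "rockfall.pt" is a hypothetical weights file.
import cv2
from ultralytics import YOLO

model = YOLO("rockfall.pt")                 # hypothetical fine-tuned weights
cap = cv2.VideoCapture("slope_camera.mp4")  # hypothetical video source
while True:
    ok, frame = cap.read()
    if not ok:
        break
    # Run detection on the frame; keep confident rockfall candidates only.
    result = model(frame, verbose=False)[0]
    for box, conf in zip(result.boxes.xyxy, result.boxes.conf):
        if conf > 0.5:
            x1, y1, x2, y2 = map(int, box)
            cv2.rectangle(frame, (x1, y1), (x2, y2), (0, 0, 255), 2)
cap.release()
```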
Open Access Article
An Efficient CNN-Based Method for Intracranial Hemorrhage Segmentation from Computerized Tomography Imaging
by Quoc Tuan Hoang, Xuan Hien Pham, Xuan Thang Trinh, Anh Vu Le, Minh V. Bui and Trung Thanh Bui
J. Imaging 2024, 10(4), 77; https://doi.org/10.3390/jimaging10040077 - 25 Mar 2024
Abstract
Intracranial hemorrhage (ICH) resulting from traumatic brain injury is a serious issue, often leading to death or long-term disability if not promptly diagnosed. Currently, doctors primarily use Computerized Tomography (CT) scans to detect and precisely locate a hemorrhage, typically interpreted by radiologists. However, this diagnostic process heavily relies on the expertise of medical professionals. To address potential errors, computer-aided diagnosis systems have been developed. In this study, we propose a new method that enhances the localization and segmentation of ICH lesions in CT scans by using multiple images created through different data augmentation techniques. We integrate residual connections into a U-Net-based segmentation network to improve the training efficiency. Our experiments, based on 82 CT scans from traumatic brain injury patients, validate the effectiveness of our approach, achieving an IOU score of 0.807 ± 0.03 for ICH segmentation using 10-fold cross-validation.
Full article
(This article belongs to the Section Medical Imaging)
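The reported IOU score can be illustrated with a minimal sketch of the metric, computed between binary lesion masks; in the paper this score is averaged over 10 cross-validation folds.

```python
# A minimal sketch of the IoU (Jaccard) score between binary lesion masks.
import numpy as np

def iou(pred: np.ndarray, target: np.ndarray) -> float:
    """pred, target: boolean masks of the same shape."""
    intersection = np.logical_and(pred, target).sum()
    union = np.logical_or(pred, target).sum()
    return intersection / union if union else 1.0
```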
Open Access Article
Comparing Different Registration and Visualization Methods for Navigated Common Femoral Arterial Access—A Phantom Model Study Using Mixed Reality
by Johannes Hatzl, Daniel Henning, Dittmar Böckler, Niklas Hartmann, Katrin Meisenbacher and Christian Uhl
J. Imaging 2024, 10(4), 76; https://doi.org/10.3390/jimaging10040076 - 25 Mar 2024
Abstract
Mixed reality (MxR) enables the projection of virtual three-dimensional objects into the user's field of view via a head-mounted display (HMD). This phantom model study investigated three different workflows for navigated common femoral arterial (CFA) access and compared them with a conventional sonography-guided technique as a control. A total of 160 punctures were performed by 10 operators (5 experts and 5 non-experts). A successful CFA puncture was defined as a puncture at the mid-level of the femoral head with the needle tip at the central lumen line, at a 0° coronal insertion angle and a 45° sagittal insertion angle. Positional errors were quantified using cone-beam computed tomography following each attempt. Mixed effect modeling revealed that the distance from the needle entry site to the mid-level of the femoral head is significantly shorter for navigated techniques than for the control group. This highlights that three-dimensional visualization could increase the safety of CFA access. However, the navigated workflows are infrastructurally complex, offer limited usability, and are associated with considerable cost. While navigated techniques appear to be a potentially beneficial adjunct for safe CFA access, future developments should aim to reduce workflow complexity, avoid optical tracking systems, and offer more pragmatic methods of registration and instrument tracking.
Full article
(This article belongs to the Section Medical Imaging)
Open Access Review
A Review on PolSAR Decompositions for Feature Extraction
by Konstantinos Karachristos, Georgia Koukiou and Vassilis Anastassopoulos
J. Imaging 2024, 10(4), 75; https://doi.org/10.3390/jimaging10040075 - 24 Mar 2024
Abstract
Feature extraction plays a pivotal role in processing remote sensing datasets, especially in the realm of fully polarimetric data. This review investigates a variety of polarimetric decomposition techniques aimed at extracting comprehensive information from polarimetric imagery. These techniques are categorized as coherent and non-coherent methods, depending on their assumptions about the distribution of information among polarimetric cells. The review explores well-established and innovative approaches in polarimetric decomposition within both categories. It begins with a thorough examination of the foundational Pauli decomposition, a key algorithm in this field. Within the coherent category, the Cameron target decomposition is extensively explored, shedding light on its underlying principles. Transitioning to the non-coherent domain, the review investigates the Freeman–Durden decomposition and its extension, Yamaguchi's approach. Additionally, the widely recognized eigenvector–eigenvalue decomposition introduced by Cloude and Pottier is scrutinized. Furthermore, each method undergoes experimental testing on the benchmark dataset of the broader Vancouver area, offering a robust analysis of their efficacy. The primary objective of this review is to systematically present well-established polarimetric decomposition algorithms, elucidating the underlying mathematical foundations of each. The aim is to facilitate a profound understanding of these approaches, coupled with insights into potential combinations for diverse applications.
Full article
(This article belongs to the Section Visualization and Computer Graphics)
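The foundational Pauli decomposition discussed above reduces to three simple per-pixel combinations of the scattering matrix channels; the following is a minimal sketch, not tied to any particular dataset.

```python
# A minimal sketch of the coherent Pauli decomposition, computed per pixel
# from the complex scattering matrix channels.
import numpy as np

def pauli_decomposition(S_hh, S_hv, S_vv):
    """Return the Pauli components for arrays of complex scattering values."""
    k1 = (S_hh + S_vv) / np.sqrt(2)   # odd-bounce (surface) scattering
    k2 = (S_hh - S_vv) / np.sqrt(2)   # even-bounce (double-bounce) scattering
    k3 = np.sqrt(2) * S_hv            # 45°-oriented / volume scattering
    return k1, k2, k3

# An RGB Pauli image is typically formed as R=|k2|, G=|k3|, B=|k1|.
```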
Open Access Article
An Improved Bio-Orientation Method Based on Direct Sunlight Compensation for Imaging Polarization Sensor
by Guangmin Li, Ya Zhang, Shiwei Fan and Fei Yu
J. Imaging 2024, 10(4), 74; https://doi.org/10.3390/jimaging10040074 - 24 Mar 2024
Abstract
Direct sunlight in complex environmental conditions severely interferes with the light intensity response of an imaging Polarization Sensor (PS), leading to a reduction in polarization orientation accuracy. To address this issue, this article analyzes the mechanism by which direct sunlight affects polarization sensor detection in a complex environment. A direct sunlight interference factor is introduced into the intensity response model of imaging polarization detection, enhancing the accuracy of the polarization detection model. Furthermore, a polarization state information analytical solution model based on direct sunlight compensation is constructed to improve the accuracy and real-time performance of the polarization state information solution. On this basis, an improved bio-orientation method based on direct sunlight compensation for imaging polarization sensors is proposed. An outdoor dynamic reorientation experiment platform was established to validate the effectiveness of the proposed method. Compared with traditional methods, the experimental results demonstrate a 23% to 47% improvement in polarization orientation accuracy under various solar zenith angles.
Full article
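A standard intensity-response relation underlies such sensors: the angle of polarization follows from the linear Stokes parameters measured at four polarizer orientations. The sketch below illustrates this common model; it is not the paper's compensated solution.

```python
# A minimal sketch of recovering the angle of polarization (AoP) from a
# four-channel polarization image (0°, 45°, 90°, 135°), via the linear
# Stokes parameters.
import numpy as np

def angle_of_polarization(I0, I45, I90, I135):
    """Per-pixel AoP in radians from intensity images at four polarizer angles."""
    S1 = I0 - I90
    S2 = I45 - I135
    return 0.5 * np.arctan2(S2, S1)
```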
Open Access Article
Data Fusion of RGB and Depth Data with Image Enhancement
by Lennard Wunsch, Christian Görner Tenorio, Katharina Anding, Andrei Golomoz and Gunther Notni
J. Imaging 2024, 10(3), 73; https://doi.org/10.3390/jimaging10030073 - 21 Mar 2024
Abstract
Since 3D sensors became popular, imaged depth data have become easier to obtain in the consumer sector. In applications such as defect localization on industrial objects or mass/volume estimation, precise depth data are important and thus benefit from the use of multiple information sources. A combination of RGB and depth images can not only provide more information about objects but also enhance data quality. Combining different camera systems using data fusion can yield higher-quality data, since the disadvantages of one sensor can be compensated by another. Data fusion itself consists of data preparation and data registration. A challenge in data fusion is the differing resolutions of sensors; therefore, up- and downsampling algorithms are needed. This paper compares multiple up- and downsampling methods, such as different direct interpolation methods, joint bilateral upsampling (JBU), and Markov random fields (MRFs), in terms of their potential to create RGB-D images and improve the quality of depth information. In contrast to the literature, in which imaging systems are adjusted to acquire data of the same section simultaneously, the laboratory setup in this study was based on conveyor-based optical sorting processes, so the data were acquired at different times and different spatial locations, making data assignment and data cropping necessary. To evaluate the results, root mean square error (RMSE), signal-to-noise ratio (SNR), correlation (CORR), universal quality index (UQI), and the contour offset are monitored. JBU outperformed the other upsampling methods, achieving a mean RMSE of 25.22, mean SNR of 32.80, mean CORR of 0.99, and mean UQI of 0.97.
Full article
(This article belongs to the Section Image and Video Processing)
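Joint bilateral upsampling, the best-performing method in this comparison, can be sketched as follows. This is a minimal, intentionally slow reference implementation with illustrative parameters, not the authors' code.

```python
# A minimal (intentionally slow) sketch of joint bilateral upsampling (JBU):
# low-resolution depth is upsampled under the guidance of an aligned
# high-resolution RGB image. Parameters are illustrative.
import numpy as np

def jbu(depth_lr, rgb_hr, scale, radius=2, sigma_s=1.0, sigma_r=10.0):
    H, W = rgb_hr.shape[:2]
    out = np.zeros((H, W))
    for y in range(H):
        for x in range(W):
            yl, xl = y / scale, x / scale          # position in the low-res grid
            num = den = 0.0
            for dy in range(-radius, radius + 1):
                for dx in range(-radius, radius + 1):
                    qy, qx = int(round(yl)) + dy, int(round(xl)) + dx
                    if not (0 <= qy < depth_lr.shape[0] and 0 <= qx < depth_lr.shape[1]):
                        continue
                    # Spatial weight, measured in low-res coordinates.
                    ws = np.exp(-((qy - yl) ** 2 + (qx - xl) ** 2) / (2 * sigma_s ** 2))
                    # Range weight from the high-res RGB guidance.
                    gy = min(int(qy * scale), H - 1)
                    gx = min(int(qx * scale), W - 1)
                    diff = rgb_hr[y, x].astype(float) - rgb_hr[gy, gx].astype(float)
                    wr = np.exp(-(diff @ diff) / (2 * sigma_r ** 2))
                    num += ws * wr * depth_lr[qy, qx]
                    den += ws * wr
            out[y, x] = num / den if den else 0.0
    return out
```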
Open Access Article
Analyzing Data Modalities for Cattle Weight Estimation Using Deep Learning Models
by Hina Afridi, Mohib Ullah, Øyvind Nordbø, Solvei Cottis Hoff, Siri Furre, Anne Guro Larsgard and Faouzi Alaya Cheikh
J. Imaging 2024, 10(3), 72; https://doi.org/10.3390/jimaging10030072 - 21 Mar 2024
Abstract
We investigate the impact of different data modalities for cattle weight estimation. For this purpose, we collect and present our own cattle dataset representing the data modalities: RGB, depth, combined RGB and depth, segmentation, and combined segmentation and depth information. We explore a recent vision-transformer-based zero-shot model proposed by Meta AI Research for producing the segmentation data modality and for extracting the cattle-only region from the images. For experimental analysis, we consider three baseline deep learning models. The objective is to assess how the integration of diverse data sources influences the accuracy and robustness of the deep learning models considering four different performance metrics: mean absolute error (MAE), root mean squared error (RMSE), mean absolute percentage error (MAPE), and R-squared (R²). We explore the synergies and challenges associated with each modality and their combined use in enhancing the precision of cattle weight prediction. Through comprehensive experimentation and evaluation, we aim to provide insights into the effectiveness of different data modalities in improving the performance of established deep learning models, facilitating informed decision-making for precision livestock management systems.
Full article
(This article belongs to the Section Computer Vision and Pattern Recognition)
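The four reported metrics can be computed in a few lines with scikit-learn; the weight values below are toy numbers, not the paper's data.

```python
# A minimal sketch of the four reported regression metrics for weight
# prediction; y_true/y_pred are illustrative arrays.
import numpy as np
from sklearn.metrics import (mean_absolute_error, mean_squared_error,
                             mean_absolute_percentage_error, r2_score)

y_true = np.array([512.0, 478.0, 630.0])   # kg, toy values
y_pred = np.array([505.0, 490.0, 618.0])

mae  = mean_absolute_error(y_true, y_pred)
rmse = np.sqrt(mean_squared_error(y_true, y_pred))
mape = mean_absolute_percentage_error(y_true, y_pred)
r2   = r2_score(y_true, y_pred)
```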
Open Access Article
FishSegSSL: A Semi-Supervised Semantic Segmentation Framework for Fish-Eye Images
by Sneha Paul, Zachary Patterson and Nizar Bouguila
J. Imaging 2024, 10(3), 71; https://doi.org/10.3390/jimaging10030071 - 15 Mar 2024
Abstract
The application of large field-of-view (FoV) cameras equipped with fish-eye lenses brings notable advantages to various real-world computer vision applications, including autonomous driving. While deep learning has proven successful in conventional computer vision applications using regular perspective images, its potential in fish-eye camera contexts remains largely unexplored due to limited datasets for fully supervised learning. Semi-supervised learning offers a potential solution to this challenge. In this study, we explore and benchmark two popular semi-supervised methods from the perspective image domain for fish-eye image segmentation. We further introduce FishSegSSL, a novel fish-eye image segmentation framework featuring three semi-supervised components: pseudo-label filtering, dynamic confidence thresholding, and robust strong augmentation. Evaluation on the WoodScape dataset, collected from vehicle-mounted fish-eye cameras, demonstrates that our proposed method enhances the model's performance by up to 10.49% over fully supervised methods using the same amount of labeled data, and improves on existing image segmentation methods by 2.34%. To the best of our knowledge, this is the first work on semi-supervised semantic segmentation of fish-eye images. Additionally, we conduct a comprehensive ablation study and sensitivity analysis to demonstrate the efficacy of each proposed component.
Full article
(This article belongs to the Special Issue Deep Learning in Computer Vision)
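Pseudo-label filtering, one of the three components named above, can be sketched as follows; the fixed threshold here stands in for the paper's dynamic confidence thresholding and is illustrative only.

```python
# A minimal sketch of confidence-based pseudo-label filtering for
# semi-supervised segmentation; the threshold scheme is illustrative.
import torch

def filter_pseudo_labels(logits: torch.Tensor, threshold: float):
    """logits: (B, C, H, W) predictions on unlabeled fish-eye images."""
    probs = torch.softmax(logits, dim=1)
    conf, pseudo = probs.max(dim=1)          # per-pixel confidence and class
    mask = conf >= threshold                 # keep only confident pixels
    pseudo[~mask] = -1                       # ignore index for the loss
    return pseudo, mask

# The unsupervised loss is then cross-entropy on a strongly augmented view,
# computed only where mask is True (e.g., via ignore_index=-1).
```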
Open Access Article
Enhancing Embedded Object Tracking: A Hardware Acceleration Approach for Real-Time Predictability
by Mingyang Zhang, Kristof Van Beeck and Toon Goedemé
J. Imaging 2024, 10(3), 70; https://doi.org/10.3390/jimaging10030070 - 13 Mar 2024
Abstract
While Siamese object tracking has witnessed significant advancements, its hard real-time behaviour on embedded devices remains inadequately addressed. In many applications, an embedded implementation should not only have minimal execution latency, but this latency should ideally also have zero variance, i.e., be predictable. This study addresses this issue by meticulously analysing real-time predictability across the different components of a deep-learning-based video object tracking system. Our detailed experiments not only indicate the superiority of Field-Programmable Gate Array (FPGA) implementations in terms of hard real-time behaviour but also unveil important time-predictability bottlenecks. We introduce dedicated hardware accelerators for key processes, focusing on depth-wise cross-correlation and padding operations, utilizing high-level synthesis (HLS). Implemented on a KV260 board, our enhanced tracker exhibits not only a 6.6× speedup in mean execution time but also significant improvements in hard real-time predictability, yielding 11 times less latency variation compared with our baseline. A subsequent analysis of power consumption reveals our approach's contribution to enhanced power efficiency. These advancements underscore the crucial role of hardware acceleration in realizing time-predictable object tracking on embedded systems, setting new standards for future hardware–software co-design endeavours in this domain.
Full article
(This article belongs to the Special Issue Computer Vision and Deep Learning: Trends and Applications (2nd Edition))
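The depth-wise cross-correlation that the accelerator targets is the standard Siamese-tracker operation (as popularized by SiamRPN++); a minimal PyTorch sketch is shown below for reference, not as the paper's FPGA implementation.

```python
# A minimal sketch of depth-wise cross-correlation: the template features
# act as per-channel filters slid over the search-region features.
import torch
import torch.nn.functional as F

def depthwise_xcorr(search: torch.Tensor, template: torch.Tensor):
    """search: (B, C, Hs, Ws); template: (B, C, Ht, Wt). Returns (B, C, H, W)."""
    b, c, h, w = search.shape
    x = search.reshape(1, b * c, h, w)                 # fold batch into channels
    kernel = template.reshape(b * c, 1, *template.shape[2:])
    out = F.conv2d(x, kernel, groups=b * c)            # one filter per channel
    return out.reshape(b, c, *out.shape[2:])

score = depthwise_xcorr(torch.randn(1, 256, 31, 31), torch.randn(1, 256, 7, 7))
```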
Open Access Article
Multi-Modal Convolutional Parameterisation Network for Guided Image Inverse Problems
by Mikolaj Czerkawski, Priti Upadhyay, Christopher Davison, Robert Atkinson, Craig Michie, Ivan Andonovic, Malcolm Macdonald, Javier Cardona and Christos Tachtatzis
J. Imaging 2024, 10(3), 69; https://doi.org/10.3390/jimaging10030069 - 12 Mar 2024
Abstract
There are several image inverse tasks, such as inpainting or super-resolution, which can be solved using deep internal learning, a paradigm in which deep neural networks find a solution by learning from the sample itself rather than from a dataset. For example, Deep Image Prior is a technique based on fitting a convolutional neural network to output the known parts of the image (such as non-inpainted regions or a low-resolution version of the image). However, this approach is not well suited to samples composed of multiple modalities. In some domains, such as satellite image processing, accommodating multi-modal representations could be beneficial or even essential. In this work, the Multi-Modal Convolutional Parameterisation Network (MCPN) is proposed, in which a convolutional neural network approximates shared information between multiple modes by combining a core shared network with modality-specific head networks. The results demonstrate that this approach can significantly outperform the single-mode adoption of a convolutional parameterisation network on guided image inverse problems of inpainting and super-resolution.
Full article
(This article belongs to the Section Computer Vision and Pattern Recognition)
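The shared-core-plus-heads idea behind MCPN can be sketched as a small PyTorch module; layer sizes and channel counts are illustrative assumptions, not the paper's architecture.

```python
# A minimal sketch of a shared core network with modality-specific heads;
# sizes are illustrative, not the paper's architecture.
import torch
import torch.nn as nn

class SharedCoreWithHeads(nn.Module):
    def __init__(self, modal_channels=(3, 1)):         # e.g., RGB + one-band mode
        super().__init__()
        # Core network approximates information shared across modalities.
        self.core = nn.Sequential(
            nn.Conv2d(32, 64, 3, padding=1), nn.ReLU(),
            nn.Conv2d(64, 64, 3, padding=1), nn.ReLU(),
        )
        # One lightweight head maps the shared features to each modality.
        self.heads = nn.ModuleList(
            nn.Conv2d(64, c, 3, padding=1) for c in modal_channels
        )

    def forward(self, z):                               # z: latent input (B, 32, H, W)
        shared = self.core(z)
        return [head(shared) for head in self.heads]    # one output per modality

outs = SharedCoreWithHeads()(torch.randn(1, 32, 64, 64))
```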
Open Access Article
Neural Radiance Field-Inspired Depth Map Refinement for Accurate Multi-View Stereo
by Shintaro Ito, Kanta Miura, Koichi Ito and Takafumi Aoki
J. Imaging 2024, 10(3), 68; https://doi.org/10.3390/jimaging10030068 - 08 Mar 2024
Abstract
In this paper, we propose a method to refine the depth maps obtained by Multi-View Stereo (MVS) through iterative optimization of the Neural Radiance Field (NeRF). MVS accurately estimates the depths on object surfaces, and NeRF accurately estimates the depths at object boundaries. The key ideas of the proposed method are to combine MVS and NeRF to utilize the advantages of both in depth map estimation and to use NeRF for depth map refinement. We also introduce a Huber loss into the NeRF optimization to improve the accuracy of the depth map refinement, where the Huber loss reduces the estimation error in the radiance fields by placing constraints on errors larger than a threshold. Through a set of experiments using the Redwood-3dscan dataset and the DTU dataset, which are public datasets consisting of multi-view images, we demonstrate the effectiveness of the proposed method compared to conventional methods: COLMAP, NeRF, and DS-NeRF.
Full article
(This article belongs to the Special Issue Geometry Reconstruction from Images (2nd Edition))
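The Huber loss introduced into the NeRF optimization is quadratic below a threshold delta and linear above it; a minimal sketch follows, with delta chosen arbitrarily.

```python
# A minimal sketch of the Huber loss used to constrain large errors:
# quadratic below the threshold delta, linear above it.
import torch

def huber(residual: torch.Tensor, delta: float = 0.1) -> torch.Tensor:
    abs_r = residual.abs()
    quadratic = 0.5 * residual ** 2
    linear = delta * (abs_r - 0.5 * delta)
    return torch.where(abs_r <= delta, quadratic, linear).mean()

# Equivalently: torch.nn.HuberLoss(delta=0.1)(pred, target).
```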
Open Access Article
Revolutionizing Cow Welfare Monitoring: A Novel Top-View Perspective with Depth Camera-Based Lameness Classification
by San Chain Tun, Tsubasa Onizuka, Pyke Tin, Masaru Aikawa, Ikuo Kobayashi and Thi Thi Zin
J. Imaging 2024, 10(3), 67; https://doi.org/10.3390/jimaging10030067 - 08 Mar 2024
Abstract
This study advances livestock health management by using a top-view 3D depth camera and deep learning for accurate cow lameness detection, classification, and precise segmentation, distinguishing it from 2D systems. It underscores the importance of early lameness detection in cattle and focuses on extracting depth data from the cow's body, with a specific emphasis on the maximum value in the back region. Precise cow detection and tracking are achieved through the Detectron2 framework and Intersection Over Union (IOU) techniques. Across a three-day testing period, with observations conducted twice daily on varying cow populations (ranging from 56 to 64 cows per day), the study consistently achieves an impressive average detection accuracy of 99.94%. Tracking accuracy remains at 99.92% over the same observation period. Subsequently, the research extracts the cow's depth region using binary mask images derived from the detection results and the original depth images. Feature extraction generates a feature vector based on maximum height measurements from the cow's backbone area. This feature vector is used for classification, evaluating three classifiers: Random Forest (RF), K-Nearest Neighbor (KNN), and Decision Tree (DT). The study highlights the potential of top-view depth video cameras for accurate cow lameness detection and classification, with significant implications for livestock health management.
Full article
(This article belongs to the Section Computer Vision and Pattern Recognition)
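The IOU-based tracking step can be illustrated with a greedy frame-to-frame association sketch; the threshold and matching strategy here are illustrative, not the paper's exact procedure.

```python
# A minimal sketch of IoU-based association for keeping identities
# consistent across frames; thresholds are illustrative.
import numpy as np

def box_iou(a, b):
    """a, b: boxes as (x1, y1, x2, y2)."""
    x1, y1 = max(a[0], b[0]), max(a[1], b[1])
    x2, y2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, x2 - x1) * max(0, y2 - y1)
    area = lambda r: (r[2] - r[0]) * (r[3] - r[1])
    return inter / (area(a) + area(b) - inter + 1e-9)

def match(prev_boxes, new_boxes, thresh=0.3):
    """Return {new_index: prev_index} for pairs with IoU above thresh."""
    assignments = {}
    for i, nb in enumerate(new_boxes):
        ious = [box_iou(nb, pb) for pb in prev_boxes]
        j = int(np.argmax(ious)) if ious else -1
        if j >= 0 and ious[j] >= thresh and j not in assignments.values():
            assignments[i] = j
    return assignments
```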
Open Access Article
Magnetic Resonance Imaging as a Diagnostic Tool for Ilio-Femoro-Caval Deep Venous Thrombosis
by Lisbeth Lyhne, Kim Christian Houlind, Johnny Christensen, Radu L. Vijdea, Meinhard R. Hansen, Malene Roland V. Pedersen and Helle Precht
J. Imaging 2024, 10(3), 66; https://doi.org/10.3390/jimaging10030066 - 08 Mar 2024
Abstract
This study aimed to test the accuracy of a magnetic resonance imaging (MRI)-based method to detect and characterise deep venous thrombosis (DVT) in the ilio-femoro-caval veins. Patients with verified DVT in the lower extremities, with extension of the thrombi to the iliac veins, who were suitable for catheter-based venous thrombolysis were included in this study. Before the intervention, magnetic resonance venography (MRV) was performed, and the ilio-femoro-caval veins were independently evaluated for normal appearance, stenosis, and occlusion by two single-blinded observers. The same procedure was used to evaluate digital subtraction phlebography (DSP), considered the gold standard, which made it possible to compare the results. A total of 123 patients were included for MRV and DSP, resulting in 246 image sets to be analysed. In total, 496 segments were analysed for occlusion, stenosis, or normal appearance. For MRV, the highest sensitivity was found when distinguishing occlusion from either normal appearance or stenosis (0.98), while the lowest was found between stenosis and normal appearance (0.84). Specificity varied from 0.59 (stenosis vs. occlusion) to 0.94 (occlusion vs. normal). The kappa statistic was calculated as a measure of inter-observer agreement; the kappa value was 0.91 for MRV and 0.80 for DSP. In conclusion, MRV is a sensitive method for analysing DVT in the pelvic veins, with advantages such as the absence of radiation and contrast agent and the possibility of investigating the anatomical relationships in the area.
Full article
(This article belongs to the Section Medical Imaging)
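The reported kappa statistic is Cohen's kappa over paired observer ratings; a minimal sketch using scikit-learn follows, with illustrative rating arrays.

```python
# A minimal sketch of the Cohen's kappa inter-observer statistic;
# the rating arrays are illustrative, not the study's data.
from sklearn.metrics import cohen_kappa_score

# Per-segment ratings by two observers: 0 = normal, 1 = stenosis, 2 = occlusion.
observer_a = [0, 1, 2, 2, 0, 1, 0, 2]
observer_b = [0, 1, 2, 1, 0, 1, 0, 2]
kappa = cohen_kappa_score(observer_a, observer_b)
```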
Open Access Article
Historical Text Line Segmentation Using Deep Learning Algorithms: Mask-RCNN against U-Net Networks
by Florian Côme Fizaine, Patrick Bard, Michel Paindavoine, Cécile Robin, Edouard Bouyé, Raphaël Lefèvre and Annie Vinter
J. Imaging 2024, 10(3), 65; https://doi.org/10.3390/jimaging10030065 - 05 Mar 2024
Abstract
Text line segmentation is a necessary preliminary step before most text transcription algorithms are applied. The leading deep learning networks used in this context (ARU-Net, dhSegment, and Doc-UFCN) are based on the U-Net architecture. They are efficient, but fall under the same concept, requiring a post-processing step to perform instance (e.g., text line) segmentation. In the present work, we test the advantages of Mask-RCNN, which is designed to perform instance segmentation directly. This work is the first to directly compare Mask-RCNN- and U-Net-based networks on text segmentation of historical documents, showing the superiority of the former over the latter. Three studies were conducted: one comparing these networks on different historical databases, another comparing Mask-RCNN with Doc-UFCN on a private historical database, and a third comparing the handwritten text recognition (HTR) performance of the tested networks. The results showed that Mask-RCNN outperformed ARU-Net, dhSegment, and Doc-UFCN on relevant line segmentation metrics, that performance evaluation should not focus on the raw masks generated by the networks, that light mask processing is a simple and efficient way to improve evaluation, and that Mask-RCNN leads to better HTR performance.
Full article
(This article belongs to the Section Document Analysis and Processing)
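An off-the-shelf Mask-RCNN can be run via torchvision as sketched below; for text lines one would fine-tune on a line-annotated corpus. The image path is hypothetical, and this is not the networks or training setup used in the paper.

```python
# A minimal sketch of Mask R-CNN instance segmentation with torchvision;
# "page.jpg" is a hypothetical scanned page.
import torch
import torchvision
from torchvision.transforms.functional import to_tensor
from PIL import Image

model = torchvision.models.detection.maskrcnn_resnet50_fpn(weights="DEFAULT")
model.eval()

image = to_tensor(Image.open("page.jpg").convert("RGB"))
with torch.no_grad():
    out = model([image])[0]          # dict with boxes, labels, scores, masks
keep = out["scores"] > 0.5           # confident detections only
masks = out["masks"][keep] > 0.5     # threshold soft masks to binary instances
```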
Open Access Article
Elevating Chest X-ray Image Super-Resolution with Residual Network Enhancement
by Anudari Khishigdelger, Ahmed Salem and Hyun-Soo Kang
J. Imaging 2024, 10(3), 64; https://doi.org/10.3390/jimaging10030064 - 04 Mar 2024
Abstract
Chest X-ray (CXR) imaging plays a pivotal role in diagnosing various pulmonary diseases, which account for a significant portion of the global mortality rate, as recognized by the World Health Organization (WHO). Medical practitioners routinely depend on CXR images to identify anomalies and make critical clinical decisions. Dramatic improvements in super-resolution (SR) have been achieved by applying deep learning techniques. However, some SR methods are difficult to utilize on inputs such as X-ray images, whose low-resolution features contain abundant low-frequency information. In this paper, we introduce an advanced deep learning-based SR approach that incorporates the innovative residual-in-residual (RIR) structure to augment the diagnostic potential of CXR imaging. Specifically, we propose a light network consisting of residual groups built from residual blocks, with multiple skip connections that allow abundant low-frequency information to bypass the network efficiently, so the main network can concentrate on learning high-frequency information. In addition, we adopt dense feature fusion within residual groups and design highly parallel residual blocks for better feature extraction. Our proposed methods exhibit superior performance compared to existing state-of-the-art (SOTA) SR methods, delivering enhanced accuracy and notable visual improvements, as evidenced by our results.
Full article
(This article belongs to the Special Issue Deep Learning in Biomedical Image Segmentation and Classification: Advancements, Challenges and Applications)
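The residual-in-residual pattern described above nests residual blocks inside residual groups under a long skip connection; the following PyTorch sketch shows the pattern with illustrative sizes, not the paper's exact network.

```python
# A minimal sketch of the residual-in-residual (RIR) pattern: residual
# blocks nested inside residual groups, with short, group-level, and
# long skip connections; sizes are illustrative.
import torch.nn as nn

class ResidualBlock(nn.Module):
    def __init__(self, ch):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(ch, ch, 3, padding=1), nn.ReLU(),
            nn.Conv2d(ch, ch, 3, padding=1),
        )

    def forward(self, x):
        return x + self.body(x)            # short skip

class ResidualGroup(nn.Module):
    def __init__(self, ch, n_blocks=4):
        super().__init__()
        self.blocks = nn.Sequential(*[ResidualBlock(ch) for _ in range(n_blocks)])

    def forward(self, x):
        return x + self.blocks(x)          # group-level skip

class RIR(nn.Module):
    def __init__(self, ch=64, n_groups=4):
        super().__init__()
        self.groups = nn.Sequential(*[ResidualGroup(ch) for _ in range(n_groups)])

    def forward(self, x):
        return x + self.groups(x)          # long skip carries low frequencies
```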
Open Access Article
Enhancing COVID-19 Detection: An Xception-Based Model with Advanced Transfer Learning from X-ray Thorax Images
by Reagan E. Mandiya, Hervé M. Kongo, Selain K. Kasereka, Kyamakya Kyandoghere, Petro Mushidi Tshakwanda and Nathanaël M. Kasoro
J. Imaging 2024, 10(3), 63; https://doi.org/10.3390/jimaging10030063 - 29 Feb 2024
Abstract
Rapid and precise identification of Coronavirus Disease 2019 (COVID-19) is pivotal for effective patient care, comprehending the pandemic’s trajectory, and enhancing long-term patient survival rates. Despite numerous recent endeavors in medical imaging, many convolutional neural network-based models grapple with the expressiveness problem and overfitting, and the training process of these models is always resource-intensive. This paper presents an innovative approach employing Xception, augmented with cutting-edge transfer learning techniques to forecast COVID-19 from X-ray thorax images. Our experimental findings demonstrate that the proposed model surpasses the predictive accuracy of established models in the domain, including Xception, VGG-16, and ResNet. This research marks a significant stride toward enhancing COVID-19 detection through a sophisticated and high-performing imaging model.
Full article
(This article belongs to the Special Issue Clinical and Pathological Imaging in the Era of Artificial Intelligence: New Insights and Perspectives)
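Xception-based transfer learning of the kind described can be sketched with Keras as follows; the head, hyperparameters, and fine-tuning schedule are illustrative assumptions, not the authors' configuration.

```python
# A minimal sketch of Xception-based transfer learning for binary chest
# X-ray classification; hyperparameters are illustrative.
import tensorflow as tf

base = tf.keras.applications.Xception(
    weights="imagenet", include_top=False, input_shape=(299, 299, 3))
base.trainable = False                      # freeze pretrained features first

model = tf.keras.Sequential([
    base,
    tf.keras.layers.GlobalAveragePooling2D(),
    tf.keras.layers.Dropout(0.3),
    tf.keras.layers.Dense(1, activation="sigmoid"),  # COVID-19 vs. normal
])
model.compile(optimizer=tf.keras.optimizers.Adam(1e-4),
              loss="binary_crossentropy", metrics=["accuracy"])
# After convergence, unfreeze the top blocks of `base` and fine-tune with
# a lower learning rate.
```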
Topics
Topic in Sensors, J. Imaging, Electronics, Applied Sciences, Entropy, Digital, J. Intell.
Advances in Perceptual Quality Assessment of User Generated Contents
Topic Editors: Guangtao Zhai, Xiongkuo Min, Menghan Hu, Wei Zhou
Deadline: 31 March 2024
Topic in Algorithms, Diagnostics, Entropy, Information, J. Imaging
Application of Machine Learning in Molecular Imaging
Topic Editors: Allegra Conti, Nicola Toschi, Marianna Inglese, Andrea Duggento, Matthew Grech-Sollars, Serena Monti, Giancarlo Sportelli, Pietro Carra
Deadline: 31 May 2024
Topic in Applied Sciences, Computation, Entropy, J. Imaging
Color Image Processing: Models and Methods (CIP: MM)
Topic Editors: Giuliana Ramella, Isabella Torcicollo
Deadline: 30 July 2024
Topic in Applied Sciences, Sensors, J. Imaging, MAKE
Applications in Image Analysis and Pattern Recognition
Topic Editors: Bin Fan, Wenqi Ren
Deadline: 31 August 2024
Special Issues
Special Issue in J. Imaging
Recent Advances in Image-Based Geotechnics II
Guest Editor: Joana Fonseca
Deadline: 31 March 2024
Special Issue in J. Imaging
Advances and Challenges in Multimodal Machine Learning 2nd Edition
Guest Editor: Georgina Cosma
Deadline: 30 April 2024
Special Issue in J. Imaging
Modelling of Human Visual System in Image Processing
Guest Editors: Edoardo Provenzi, Alexey Mashtakov
Deadline: 24 May 2024
Special Issue in J. Imaging
The Mixed Reality Revolution: Challenges and Prospects 2nd Edition
Guest Editors: Jean Sequeira, Sébastien Mavromatis
Deadline: 31 May 2024