J. Imaging, Volume 11, Issue 10 (October 2025) – 10 articles

  • Issues are regarded as officially published after their release is announced to the table of contents alert mailing list.
  • You may sign up for e-mail alerts to receive table of contents of newly released issues.
  • PDF is the official format for papers published in both HTML and PDF forms. To view a paper in PDF format, click the "PDF Full-text" link and open it with the free Adobe Reader.
25 pages, 19395 KB  
Article
Intelligent Segmentation of Urban Building Roofs and Solar Energy Potential Estimation for Photovoltaic Applications
by Junsen Zeng, Minglong Yang, Xiujuan Tang, Xiaotong Guan and Tingting Ma
J. Imaging 2025, 11(10), 334; https://doi.org/10.3390/jimaging11100334 - 25 Sep 2025
Abstract
To support dual-carbon objectives and enhance the accuracy of rooftop distributed photovoltaic (PV) planning, this study proposes a multidimensional coupled evaluation framework that integrates an improved rooftop segmentation network (CESW-TransUNet), a residual-fusion ensemble, and physics-based shading and performance simulations, thereby correcting the bias of conventional 2-D area–based methods. First, CESW-TransUNet, equipped with convolution-enhanced modules, achieves robust multi-scale rooftop extraction and reaches an IoU of 78.50% on the INRIA benchmark, representing a 2.27 percentage point improvement over TransUNet. Second, the proposed residual fusion strategy adaptively integrates multiple models, including DeepLabV3+ and PSPNet, further improving the IoU to 79.85%. Finally, by coupling Ecotect-based shadow analysis with PVsyst performance modeling, the framework systematically quantifies dynamic inter-building shading, rooftop equipment occupancy, and installation suitability. A case study demonstrates that the method reduces the systematic overestimation of annual generation by 27.7% compared with traditional 2-D assessments. The framework thereby offers a quantitative, end-to-end decision tool for urban rooftop PV planning, enabling more reliable evaluation of generation and carbon-mitigation potential. Full article
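The IoU figures quoted in this abstract compare a predicted rooftop mask with a reference mask. As a minimal sketch only (not the authors' code; the function name `binary_iou` and the toy masks are illustrative assumptions), IoU for binary masks can be computed as the number of shared positive pixels divided by the number of pixels that are positive in either mask. The last line works out the TransUNet baseline implied by the abstract's stated 2.27 percentage point gain.

```python
def binary_iou(pred, target):
    """Intersection over Union for two equally sized binary masks (lists of 0/1 rows)."""
    intersection = 0
    union = 0
    for pred_row, target_row in zip(pred, target):
        for p, t in zip(pred_row, target_row):
            intersection += 1 if (p and t) else 0
            union += 1 if (p or t) else 0
    # Convention assumed here: two empty masks count as a perfect match.
    return 1.0 if union == 0 else intersection / union


# Hypothetical 3x4 toy masks, not data from the paper.
pred = [
    [1, 1, 0, 0],
    [1, 1, 0, 0],
    [0, 0, 0, 0],
]
target = [
    [1, 1, 1, 0],
    [1, 1, 1, 0],
    [0, 0, 0, 0],
]
iou = binary_iou(pred, target)  # 4 shared pixels / 6 pixels in the union

# Baseline implied by the abstract: 78.50% minus the 2.27-point improvement.
transunet_iou = round(78.50 - 2.27, 2)
```

Under these assumptions the toy masks give an IoU of 4/6, and the implied TransUNet baseline on the INRIA benchmark is 76.23%.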
39 pages, 4549 KB  
Article
Effects of Biases in Geometric and Physics-Based Imaging Attributes on Classification Performance
by Bahman Rouhani and John K. Tsotsos
J. Imaging 2025, 11(10), 333; https://doi.org/10.3390/jimaging11100333 - 25 Sep 2025
Abstract
Learned systems in the domain of visual recognition and cognition impress in part because even though they are trained with datasets many orders of magnitude smaller than the full population of possible images, they exhibit sufficient generalization to be applicable to new and previously unseen data. Since training data sets typically represent such a small sampling of any domain, the possibility of bias in their composition is very real. But what are the limits of generalization given such bias, and up to what point might it be sufficient for a real problem task? There are many types of bias as will be seen, but we focus only on one, selection bias. In vision, image contents are dependent on the physics of vision and geometry of the imaging process and not only on scene contents. How do biases in these factors—that is, non-uniform sample collection across the spectrum of imaging possibilities—affect learning? We address this in two ways. The first is theoretical in the tradition of the Thought Experiment. The point is to use a simple theoretical tool to probe into the bias of data collection to highlight deficiencies that might then deserve extra attention either in data collection or system development. Those theoretical results are then used to motivate practical tests on a new dataset using several existing top classifiers. We report that, both theoretically and empirically, there are some selection biases rooted in the physics and imaging geometry of vision that challenge current methods of classification. Full article
(This article belongs to the Section Computer Vision and Pattern Recognition)
20 pages, 14512 KB  
Article
Dual-Attention-Based Block Matching for Dynamic Point Cloud Compression
by Longhua Sun, Yingrui Wang and Qing Zhu
J. Imaging 2025, 11(10), 332; https://doi.org/10.3390/jimaging11100332 - 25 Sep 2025
Abstract
The irregular and highly non-uniform spatial distribution inherent to dynamic three-dimensional (3D) point clouds (DPCs) severely hampers the extraction of reliable temporal context, rendering inter-frame compression a formidable challenge. Inspired by two-dimensional (2D) image and video compression methods, existing approaches attempt to model the temporal dependence of DPCs through a motion estimation/motion compensation (ME/MC) framework. However, these approaches represent only preliminary applications of this framework; point consistency between adjacent frames is insufficiently explored, and temporal correlation requires further investigation. To address this limitation, we propose a hierarchical ME/MC framework that adaptively selects the granularity of the estimated motion field, thereby ensuring a fine-grained inter-frame prediction process. To further enhance motion estimation accuracy, we introduce a dual-attention-based KNN block-matching (DA-KBM) network. This network employs a bidirectional attention mechanism to more precisely measure the correlation between points, using closely correlated points to predict inter-frame motion vectors and thereby improve inter-frame prediction accuracy. Experimental results show that the proposed DPC compression method achieves a significant improvement (gain of 70%) in the BD-Rate metric on the 8iFVBv2 dataset compared with the standardized Video-based Point Cloud Compression (V-PCC) v13 method, and a 16% gain over the state-of-the-art deep learning-based inter-mode method. Full article
(This article belongs to the Special Issue 3D Image Processing: Progress and Challenges)
17 pages, 2112 KB  
Article
Pilot Exploratory Study of a CT Radiomics Model for the Classification of Small Cell Lung Cancer and Non-Small-Cell Lung Cancer in the Moscow Population: A Step Toward Virtual Biopsy
by Maria D. Varyukhina, Alexandr A. Borisov, Rustam A. Erizhokov, Kirill M. Arzamasov, Alexander V. Solovev, Vadim V. Kirsanov, Olga V. Omelyanskaya, Anton V. Vladzymyrskyy and Yuriy A. Vasilev
J. Imaging 2025, 11(10), 331; https://doi.org/10.3390/jimaging11100331 - 25 Sep 2025
Abstract
Lung cancer is one of the most common and socially significant cancers worldwide and consists of two main subtypes: small cell lung cancer (SCLC) and non-small cell lung cancer (NSCLC), which require different treatments. Computed tomography (CT) scans cannot reliably differentiate these subtypes, often necessitating invasive biopsies that carry significant risks. Radiomics offers a promising non-invasive alternative by quantitatively analyzing imaging data to extract detailed tissue characteristics beyond visual assessment. This pilot retrospective study analyzed 200 Moscow patients with histologically confirmed SCLC or NSCLC. Manual tumor segmentation on pretreatment CT scans allowed extraction of 107 radiomic features, from which 16 key features were selected to train four machine learning models. Models were evaluated using stratified 5-fold cross-validation, focusing on ROC AUC, accuracy, precision, and recall. All models demonstrated strong performance in distinguishing SCLC from NSCLC, with the gradient boosting model achieving the highest accuracy of 80.5% and ROC AUC of 0.888. These results highlight the potential of radiomics combined with machine learning to enable accurate, non-invasive differentiation of lung cancer subtypes. Further research is needed to expand feature sets, develop automated segmentation tools, and enhance clinical application of this approach. Full article
(This article belongs to the Special Issue Advances in Medical Imaging and Machine Learning)
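The stratified 5-fold cross-validation mentioned in the abstract keeps the SCLC/NSCLC ratio roughly the same in every fold. The following is a minimal sketch of that idea, not the authors' pipeline: the helper `stratified_folds` is hypothetical, no shuffling is done, and the 80/120 class split is an assumption made purely for illustration, since the abstract reports 200 patients but not the number per subtype.

```python
from collections import defaultdict


def stratified_folds(labels, n_folds=5):
    """Assign each sample index to a fold so class proportions stay roughly equal."""
    by_class = defaultdict(list)
    for index, label in enumerate(labels):
        by_class[label].append(index)
    folds = [[] for _ in range(n_folds)]
    for indices in by_class.values():
        # Deal the members of each class round-robin across the folds.
        for position, index in enumerate(indices):
            folds[position % n_folds].append(index)
    return folds


# Hypothetical labels: the per-class counts are NOT taken from the paper.
labels = ["SCLC"] * 80 + ["NSCLC"] * 120
folds = stratified_folds(labels)

splits = []
for held_out in range(len(folds)):
    test_idx = folds[held_out]
    train_idx = [i for k, fold in enumerate(folds) if k != held_out for i in fold]
    # A model would be fitted on train_idx and scored on test_idx at this point.
    splits.append((train_idx, test_idx))
```

With these assumed counts, each fold holds 40 patients (16 SCLC and 24 NSCLC), so every model is trained on 160 patients and evaluated on the remaining 40. In practice the indices would usually be shuffled within each class before dealing them out.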
25 pages, 5227 KB  
Article
Dynamic Fractional Flow Reserve from 4D-CTA: A Novel Framework for Non-Invasive Coronary Assessment
by Shuo Wang, Rong Liu and Li Zhang
J. Imaging 2025, 11(10), 330; https://doi.org/10.3390/jimaging11100330 - 24 Sep 2025
Abstract
Current fractional flow reserve computed tomography (FFRCT) methods use static imaging, potentially missing critical hemodynamic changes during the cardiac cycle. We developed a novel dynamic FFRCT framework using 4D-CTA data to capture temporal coronary dynamics throughout the complete cardiac cycle. Our automated pipeline integrates 4D-CTA processing, temporally weighted geometric modeling, and patient-specific boundary conditions derived from actual flow measurements. Preliminary validation in three patients (four vessels) showed that dynamic FFRCT values (0.720, 0.797, 0.811, and 0.952) closely matched invasive FFR measurements (0.70, 0.78, 0.78, and 0.94) with improved accuracy compared to conventional static methods. The dynamic approach successfully captured physiologically relevant hemodynamic variations, addressing inter-patient variability limitations of standardized approaches. This study establishes the clinical feasibility of dynamic FFRCT computation, potentially improving non-invasive coronary stenosis assessment for clinical decision-making and treatment planning. Full article
(This article belongs to the Special Issue Emerging Technologies for Less Invasive Diagnostic Imaging)
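The agreement between the four dynamic FFRCT values and the invasive FFR measurements listed in the abstract can be checked with simple arithmetic. The sketch below uses only the eight numbers quoted there; the variable names are illustrative, and the pairing of values by position is an assumption based on the order in which the abstract lists them.

```python
# Values as listed in the abstract, assumed to be paired by position.
dynamic_ffrct = [0.720, 0.797, 0.811, 0.952]
invasive_ffr = [0.70, 0.78, 0.78, 0.94]

# Per-vessel absolute difference, rounded to the precision of the reported values.
abs_errors = [round(abs(d - i), 3) for d, i in zip(dynamic_ffrct, invasive_ffr)]

# Mean absolute difference across the four vessels.
mean_abs_error = round(sum(abs_errors) / len(abs_errors), 3)
```

Under that pairing, the per-vessel differences are 0.020, 0.017, 0.031, and 0.012, for a mean absolute difference of 0.020.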
22 pages, 17124 KB  
Review
Image Matching: Foundations, State of the Art, and Future Directions
by Ming Yang, Rui Wu, Yunxuan Yang, Liang Tao, Yifan Zhang, Yixin Xie and Gnana Prakash Reddy Donthi Reddy
J. Imaging 2025, 11(10), 329; https://doi.org/10.3390/jimaging11100329 - 24 Sep 2025
Abstract
Image matching plays a critical role in a wide range of computer vision applications, including object recognition, 3D reconstruction, aiming-point and six-degree-of-freedom detection for aiming devices, and video surveillance. Over the past three decades, image-matching algorithms and techniques have evolved significantly, from handcrafted feature extraction algorithms to modern approaches powered by deep learning neural networks and attention mechanisms. This paper provides a comprehensive review of image-matching techniques, aiming to offer researchers valuable insights into the evolving landscape of this field. It traces the historical development of feature-based methods and examines the transition to neural network-based approaches that leverage large-scale data and learned representations. Additionally, this paper discusses the current state of the field, highlighting key algorithms, benchmarks, and real-world applications. Furthermore, this study introduces some recent contributions to this area and outlines promising directions for future research, including H-matrix optimization, LoFTR model speedup, and performance improvements. It also identifies persistent challenges such as robustness to viewpoint and illumination changes, scalability, and matching under extreme conditions. Finally, this paper summarizes future trends for research and development in this field. Full article
(This article belongs to the Special Issue Object Detection in Video Surveillance Systems)
19 pages, 4890 KB  
Article
Classifying Sex from MSCT-Derived 3D Mandibular Models Using an Adapted PointNet++ Deep Learning Approach in a Croatian Population
by Eva Shimkus, Ivana Kružić, Saša Mladenović, Iva Perić, Marija Jurić Gunjača, Tade Tadić, Krešimir Dolić, Šimun Anđelinović, Željana Bašić and Ivan Jerković
J. Imaging 2025, 11(10), 328; https://doi.org/10.3390/jimaging11100328 - 24 Sep 2025
Abstract
Accurate sex estimation is critical in forensic anthropology for developing biological profiles, with the mandible serving as a valuable alternative when crania or pelvic bones are unavailable. This study aims to enhance mandibular sex estimation using deep learning on 3D models in a southern Croatian population. A dataset of 254 MSCT-derived 3D mandibular models (127 male, 127 female) was processed to generate 4096-point clouds, analyzed using an adapted PointNet++ architecture. The dataset was split into training (60%), validation (20%), and test (20%) sets. Unsupervised analysis employed an autoencoder with t-SNE visualization, while supervised classification used logistic regression on extracted features, evaluated by accuracy, sensitivity, specificity, PPV, NPV, and MCC. The model achieved 93% cross-validation accuracy and 92% test set accuracy, with saliency maps highlighting key sexually dimorphic regions like the chin, gonial, and condylar areas. A user-friendly Gradio web application was developed for real-time sex classification from STL files, enhancing forensic applicability. This approach outperformed traditional mandibular sex estimation methods and could have potential as a robust, automated tool for forensic practice, broader population studies and integration with diverse 3D data sources. Full article
(This article belongs to the Section Medical Imaging)
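The evaluation metrics named in the abstract (accuracy, sensitivity, specificity, PPV, NPV, and MCC) all derive from the four cells of a binary confusion matrix. The sketch below shows the standard definitions; the function name `classification_metrics` and the counts passed to it are hypothetical and are not results from the paper, which does not report its confusion matrix in this abstract. It assumes both classes occur among the true and predicted labels, so no ratio has a zero denominator.

```python
import math


def classification_metrics(tp, tn, fp, fn):
    """Standard binary metrics from confusion-matrix counts."""
    total = tp + tn + fp + fn
    # Denominator of the Matthews correlation coefficient.
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return {
        "accuracy": (tp + tn) / total,
        "sensitivity": tp / (tp + fn),  # true positive rate
        "specificity": tn / (tn + fp),  # true negative rate
        "ppv": tp / (tp + fp),          # positive predictive value
        "npv": tn / (tn + fn),          # negative predictive value
        "mcc": (tp * tn - fp * fn) / denom if denom else 0.0,
    }


# Hypothetical counts chosen for illustration only.
metrics = classification_metrics(tp=45, tn=40, fp=10, fn=5)
```

Unlike accuracy, MCC uses all four cells, so it stays informative when one kind of error dominates, which is one reason it is often reported alongside accuracy for two-class problems like this one.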
20 pages, 11332 KB  
Article
A Fast Nonlinear Sparse Model for Blind Image Deblurring
by Zirui Zhang, Zheng Guo, Zhenhua Xu, Huasong Chen, Chunyong Wang, Yang Song, Jiancheng Lai, Yunjing Ji and Zhenhua Li
J. Imaging 2025, 11(10), 327; https://doi.org/10.3390/jimaging11100327 - 23 Sep 2025
Abstract
Blind image deblurring, which requires simultaneous estimation of the latent image and blur kernel, constitutes a classic ill-posed problem. To address this, priors based on L2, L1, and Lp regularizations have been widely adopted. Building on this foundation and on the successful experience of previous work, this paper introduces LN regularization, a novel nonlinear sparse regularization combining the Lp and L norms via nonlinear coupling. Statistical probability analysis demonstrates that LN regularization achieves stronger sparsity than the traditional L2, L1, and Lp regularizations. Furthermore, building upon the LN regularization, we propose a novel nonlinear sparse model for blind image deblurring. To optimize the proposed LN regularization, we introduce an Adaptive Generalized Soft-Thresholding (AGST) algorithm and further develop an efficient optimization strategy by integrating AGST with the Half-Quadratic Splitting (HQS) strategy. Extensive experiments conducted on synthetic datasets and real-world images demonstrate that the proposed nonlinear sparse model achieves superior deblurring performance while maintaining competitive computational efficiency. Full article
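For background, the classical soft-thresholding operator that AGST generalizes is the closed-form solution of the scalar L1-regularized problem min over z of 0.5 * (z - x)^2 + lam * |z|. The sketch below shows only this classical operator; it is not the AGST update, whose form is not given in the abstract, and the name `soft_threshold` and the sample values are illustrative assumptions.

```python
def soft_threshold(x, lam):
    """Classical soft thresholding: shrink x toward zero by lam, clamping at zero."""
    if x > lam:
        return x - lam
    if x < -lam:
        return x + lam
    return 0.0


# Hypothetical sample values; entries with magnitude <= lam are set to zero.
values = [-2.0, -0.3, 0.0, 0.5, 1.5]
shrunk = [soft_threshold(v, 0.5) for v in values]
```

Small entries are zeroed and large ones are shrunk by `lam`, which is how an L1 prior promotes sparsity. In half-quadratic splitting schemes, an elementwise step of this kind typically alternates with a quadratic subproblem that has a closed-form solution.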
37 pages, 3784 KB  
Review
A Review on the Detection of Plant Disease Using Machine Learning and Deep Learning Approaches
by Thandiwe Nyawose, Rito Clifford Maswanganyi and Philani Khumalo
J. Imaging 2025, 11(10), 326; https://doi.org/10.3390/jimaging11100326 - 23 Sep 2025
Abstract
The early and accurate detection of plant diseases is essential for ensuring food security, enhancing crop yields, and facilitating precision agriculture. Manual methods are labour-intensive and prone to error, especially under varying environmental conditions. Artificial intelligence (AI), particularly machine learning (ML) and deep learning (DL), has advanced automated disease identification through image classification. However, challenges persist, including limited generalisability, small and imbalanced datasets, and poor real-world performance. Unlike previous reviews, this paper critically evaluates model performance in both lab and real-time field conditions, emphasising robustness, generalisation, and suitability for edge deployment. It introduces recent architectures such as GreenViT, hybrid ViT–CNN models, and YOLO-based single- and two-stage detectors, comparing their accuracy, inference speed, and hardware efficiency. The review discusses multimodal and self-supervised learning techniques to enhance detection in complex environments, highlighting key limitations, including reliance on handcrafted features, overfitting, and sensitivity to environmental noise. Strengths and weaknesses of models across diverse datasets are analysed with a focus on real-time agricultural applicability. The paper concludes by identifying research gaps and outlining future directions, including the development of lightweight architectures, integration with Deep Convolutional Generative Adversarial Networks (DCGANs), and improved dataset diversity for real-world deployment in precision agriculture. Full article
(This article belongs to the Section Image and Video Processing)
19 pages, 6027 KB  
Article
An Improved HRNetV2-Based Semantic Segmentation Algorithm for Pipe Corrosion Detection in Smart City Drainage Networks
by Liang Gao, Xinxin Huang, Wanling Si, Feng Yang, Xu Qiao, Yaru Zhu, Tingyang Fu and Jianshe Zhao
J. Imaging 2025, 11(10), 325; https://doi.org/10.3390/jimaging11100325 - 23 Sep 2025
Abstract
Urban drainage pipelines are essential components of smart city infrastructure, supporting the safe and sustainable operation of underground systems. However, internal corrosion in pipelines poses significant risks to structural stability and public safety. In this study, we propose an enhanced semantic segmentation framework based on High-Resolution Network Version 2 (HRNetV2) to accurately identify corroded regions in traditional closed-circuit television (CCTV) images. The proposed method integrates a Convolutional Block Attention Module (CBAM) to strengthen the feature representation of corrosion patterns and introduces a Lightweight Pyramid Pooling Module (LitePPM) to improve multi-scale context modeling. By preserving high-resolution details through HRNetV2’s parallel architecture, the model achieves precise and robust segmentation performance. Experiments on a real-world corrosion dataset show that our approach attains a mean Intersection over Union (mIoU) of 95.92 ± 0.03%, Recall of 97.01 ± 0.02%, and an overall Accuracy of 98.54%. These results demonstrate the method’s effectiveness in supporting intelligent infrastructure inspection and provide technical insights for advancing automated maintenance systems in smart cities. Full article
(This article belongs to the Section Computer Vision and Pattern Recognition)