Journal Description
Journal of Imaging is an international, multi/interdisciplinary, peer-reviewed, open access journal of imaging techniques, published online monthly by MDPI.
- Open Access: free for readers, with article processing charges (APC) paid by authors or their institutions.
- High Visibility: indexed within Scopus, ESCI (Web of Science), PubMed, PMC, dblp, Inspec, Ei Compendex, and other databases.
- Journal Rank: CiteScore - Q2 (Computer Graphics and Computer-Aided Design)
- Rapid Publication: manuscripts are peer-reviewed and a first decision is provided to authors approximately 21.7 days after submission; acceptance to publication takes 3.8 days (median values for papers published in this journal in the second half of 2023).
- Recognition of Reviewers: reviewers who provide timely, thorough peer-review reports receive vouchers entitling them to a discount on the APC of their next publication in any MDPI journal, in appreciation of the work done.
Impact Factor: 3.2 (2022); 5-Year Impact Factor: 3.2 (2022)
Latest Articles
Single-Image-Based 3D Reconstruction of Endoscopic Images
J. Imaging 2024, 10(4), 82; https://doi.org/10.3390/jimaging10040082 - 28 Mar 2024
Abstract
A wireless capsule endoscope (WCE) is a medical device designed for the examination of the human gastrointestinal (GI) tract. Three-dimensional models based on WCE images can assist in diagnostics by effectively detecting pathology. These 3D models provide gastroenterologists with improved visualization, particularly in areas of specific interest. However, the constraints of WCE, such as a lack of controllability and a dependence on expensive, often unavailable operating equipment, pose significant challenges to conducting comprehensive experiments aimed at evaluating the quality of 3D reconstruction from WCE images. In this paper, we employ a single-image-based 3D reconstruction method on an artificial colon captured with an endoscope that behaves like a WCE. The shape from shading (SFS) algorithm can reconstruct a 3D shape from a single image and has therefore been employed to reconstruct the 3D shapes of the colon images. The camera of the endoscope has also been subjected to comprehensive geometric and radiometric calibration. Experiments are conducted on well-defined primitive objects to assess the method's robustness and accuracy. This evaluation involves comparing the reconstructed 3D shapes of primitives with ground truth data, quantified through measurements of root-mean-square error and maximum error. Afterward, the same methodology is applied to recover the geometry of the colon. The results demonstrate that our approach is capable of reconstructing the geometry of the colon captured with a camera with an unknown imaging pipeline and significant noise in the images. The same procedure is applied to WCE images, and preliminary results illustrate the applicability of our method for reconstructing 3D models from WCE images.
Full article
(This article belongs to the Special Issue Geometry Reconstruction from Images (2nd Edition))
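As a concrete illustration of the evaluation described above, the following minimal sketch computes the two reported error measures between a reconstructed depth map and its ground truth; the function name and arrays are illustrative, not the authors' code.

```python
# A minimal sketch of the two reported error measures, comparing a
# reconstructed surface against ground-truth data; arrays are illustrative.
import numpy as np

def reconstruction_errors(depth_pred: np.ndarray, depth_gt: np.ndarray):
    """Return (root-mean-square error, maximum absolute error)."""
    diff = depth_pred - depth_gt
    rmse = np.sqrt(np.mean(diff ** 2))
    max_err = np.abs(diff).max()
    return rmse, max_err
```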
Open Access Review
Applied Artificial Intelligence in Healthcare: A Review of Computer Vision Technology Application in Hospital Settings
by Heidi Lindroth, Keivan Nalaie, Roshini Raghu, Ivan N. Ayala, Charles Busch, Anirban Bhattacharyya, Pablo Moreno Franco, Daniel A. Diedrich, Brian W. Pickering and Vitaly Herasevich
J. Imaging 2024, 10(4), 81; https://doi.org/10.3390/jimaging10040081 - 28 Mar 2024
Abstract
Computer vision (CV), a type of artificial intelligence (AI) that uses digital videos or sequences of images to recognize content, has been used extensively across industries in recent years. However, in the healthcare industry, its applications are limited by factors such as privacy, safety, and ethical concerns. Despite this, CV has the potential to improve patient monitoring and system efficiencies while reducing workload. In contrast to previous reviews, we focus on the end-user applications of CV. First, we briefly review and categorize CV applications in other industries (job enhancement, surveillance and monitoring, automation, and augmented reality). We then review the development of CV in hospital, outpatient, and community settings. Recent advances in monitoring delirium, pain and sedation, patient deterioration, mechanical ventilation, mobility, patient safety, surgical applications, quantification of workload in the hospital, and monitoring for patient events outside the hospital are highlighted. To identify opportunities for future applications, we also completed journey mapping at different system levels. Lastly, we discuss the privacy, safety, and ethical considerations associated with CV and outline processes in algorithm development and testing that limit CV expansion in healthcare. This comprehensive review highlights CV applications and ideas for its expanded use in healthcare.
Full article
(This article belongs to the Special Issue Computer Vision and Deep Learning: Trends and Applications (2nd Edition))
Open Access Article
Magnetoencephalography Atlas Viewer for Dipole Localization and Viewing
by Natascha Cardoso da Fonseca, Jason Bowerman, Pegah Askari, Amy L. Proskovec, Fabricio Stewan Feltrin, Daniel Veltkamp, Heather Early, Ben C. Wagner, Elizabeth M. Davenport and Joseph A. Maldjian
J. Imaging 2024, 10(4), 80; https://doi.org/10.3390/jimaging10040080 - 28 Mar 2024
Abstract
Magnetoencephalography (MEG) is a noninvasive neuroimaging technique widely recognized for epilepsy and tumor mapping. MEG clinical reporting requires a multidisciplinary team, including expert input regarding each dipole's anatomic localization. Here, we introduce a novel tool, the "Magnetoencephalography Atlas Viewer" (MAV), which streamlines this anatomical analysis. The MAV normalizes the patient's Magnetic Resonance Imaging (MRI) to the Montreal Neurological Institute (MNI) space, reverse-normalizes MNI atlases to the native MRI, identifies MEG dipole files, and matches dipoles' coordinates to their spatial location in atlas files. It offers a user-friendly and interactive graphical user interface (GUI) for displaying individual dipoles, groups, coordinates, anatomical labels, and a tri-planar MRI view of the patient with dipole overlays. The MAV was evaluated on 273 dipoles obtained from clinical epilepsy subjects. Consensus-based ground truth was established by three neuroradiologists, with a minimum agreement threshold of two. The concordance between the ground truth and MAV labeling ranged from 79% to 84%, depending on the normalization method. Higher concordance rates, ranging from 80% to 90%, were observed in subjects with minimal or no structural abnormalities on the MRI. The MAV provides a straightforward method for MEG dipole anatomic localization, allowing a nonspecialist to prepopulate a report, thereby facilitating and reducing the time of clinical reporting.
Full article
(This article belongs to the Section Neuroimaging and Neuroinformatics)
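The coordinate-to-label matching the MAV performs can be illustrated with a short sketch. This is a minimal illustration, assuming a NIfTI label atlas already resampled to the subject's native space; the file name and coordinate are hypothetical, and this is not the authors' implementation.

```python
# A minimal sketch of matching a dipole coordinate to an atlas label,
# assuming a NIfTI label atlas in the subject's native space.
import numpy as np
import nibabel as nib
from nibabel.affines import apply_affine

atlas = nib.load("atlas_in_native_space.nii.gz")   # hypothetical path
labels = atlas.get_fdata()

def dipole_to_label(xyz_mm):
    """Map a dipole position in world (mm) coordinates to an atlas label ID."""
    # Invert the affine to go from world coordinates to voxel indices.
    ijk = apply_affine(np.linalg.inv(atlas.affine), xyz_mm)
    i, j, k = np.round(ijk).astype(int)
    return int(labels[i, j, k])

label_id = dipole_to_label([-42.0, 18.5, 27.0])    # example coordinate
```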
Open Access Article
Zero-Shot Sketch-Based Image Retrieval Using StyleGen and Stacked Siamese Neural Networks
by Venkata Rama Muni Kumar Gopu and Madhavi Dunna
J. Imaging 2024, 10(4), 79; https://doi.org/10.3390/jimaging10040079 - 27 Mar 2024
Abstract
Sketch-based image retrieval (SBIR) refers to a sub-class of content-based image retrieval problems where the input queries are ambiguous sketches and the retrieval repository is a database of natural images. In the zero-shot setup of SBIR, the query sketches are drawn from classes that do not match any of those used in model building. The SBIR task is extremely challenging because, unlike standard content-based image retrieval, it is a cross-domain retrieval problem: sketches and natural images are separated by a large domain gap. In this work, we propose an elegant retrieval methodology, StyleGen, for generating fake candidate images that match the domain of the repository images, thus reducing the domain gap for retrieval tasks. The retrieval methodology makes use of a two-stage neural network architecture known as the stacked Siamese network, which is known to provide outstanding retrieval performance without losing the generalizability of the approach. Experimental studies on the image sketch datasets TU-Berlin Extended and Sketchy Extended, evaluated using the mean average precision (mAP) metric, demonstrate a marked performance improvement compared to the current state-of-the-art approaches in the domain.
Full article
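The mAP metric used for evaluation can be sketched as follows; the toy query data are illustrative only, not drawn from the paper.

```python
# A minimal sketch of mean average precision (mAP) for retrieval,
# assuming each query returns a relevance-ranked result list.
import numpy as np

def average_precision(relevant: np.ndarray) -> float:
    """relevant: binary array, 1 where the ranked result matches the query class."""
    hits = np.cumsum(relevant)
    ranks = np.arange(1, len(relevant) + 1)
    precision_at_hit = (hits / ranks)[relevant == 1]
    return float(precision_at_hit.mean()) if relevant.sum() else 0.0

# mAP is the mean of per-query average precision (toy queries below).
queries = [np.array([1, 0, 1, 1, 0]), np.array([0, 1, 0, 0, 1])]
mAP = np.mean([average_precision(q) for q in queries])
```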
Open Access Article
Real-Time Dynamic Intelligent Image Recognition and Tracking System for Rockfall Disasters
by Yu-Wei Lin, Chu-Fu Chiu, Li-Hsien Chen and Chao-Ching Ho
J. Imaging 2024, 10(4), 78; https://doi.org/10.3390/jimaging10040078 - 26 Mar 2024
Abstract
Taiwan, frequently affected by extreme weather and phenomena such as earthquakes and typhoons, faces a high incidence of rockfall disasters due to its largely mountainous terrain. These disasters have led to numerous casualties, government compensation cases, and significant transportation safety impacts. According to National Science and Technology Center for Disaster Reduction records from 2010 to 2022, 421 out of 866 soil and rock disasters occurred in eastern Taiwan, causing traffic disruptions due to rockfalls. Since traditional disaster-detection sensors only record changes after a rockfall, no system has been in place to detect rockfalls as they occur. To address this, a rockfall detection and tracking system using deep learning and image processing technology was developed. This system includes a real-time image tracking and recognition system that integrates YOLO and image processing technology. It was trained on a self-collected dataset of 2490 high-resolution RGB images. The system's performance was evaluated on 30 videos featuring various rockfall scenarios, achieving a mean Average Precision (mAP50) of 0.845 and mAP50-95 of 0.41, with a processing time of 125 ms. Tested on advanced hardware, the system proves effective in quickly tracking and identifying hazardous rockfalls, offering a significant advancement in disaster management and prevention.
Full article
(This article belongs to the Special Issue From Imaging to Understanding: Methods and Application for Environment, Infrastructure and Human Monitoring)
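A hedged sketch of the detection loop such a system might run is shown below, assuming the Ultralytics YOLO API; the weights file rockfall.pt and video path are hypothetical, and this is not the authors' pipeline.

```python
# A minimal sketch of running a trained YOLO detector on video frames,
# assuming the Ultralytics API; "rockfall.pt" is a hypothetical weights file.
import cv2
from ultralytics import YOLO

model = YOLO("rockfall.pt")                 # hypothetical fine-tuned weights
cap = cv2.VideoCapture("slope_camera.mp4")  # hypothetical video source
while True:
    ok, frame = cap.read()
    if not ok:
        break
    # Run detection on the frame; keep confident rockfall candidates only.
    result = model(frame, verbose=False)[0]
    for box, conf in zip(result.boxes.xyxy, result.boxes.conf):
        if conf > 0.5:
            x1, y1, x2, y2 = map(int, box)
            cv2.rectangle(frame, (x1, y1), (x2, y2), (0, 0, 255), 2)
cap.release()
```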
Open Access Article
An Efficient CNN-Based Method for Intracranial Hemorrhage Segmentation from Computerized Tomography Imaging
by Quoc Tuan Hoang, Xuan Hien Pham, Xuan Thang Trinh, Anh Vu Le, Minh V. Bui and Trung Thanh Bui
J. Imaging 2024, 10(4), 77; https://doi.org/10.3390/jimaging10040077 - 25 Mar 2024
Abstract
Intracranial hemorrhage (ICH) resulting from traumatic brain injury is a serious issue, often leading to death or long-term disability if not promptly diagnosed. Currently, doctors primarily use Computerized Tomography (CT) scans to detect and precisely locate a hemorrhage, typically interpreted by radiologists. However, this diagnostic process heavily relies on the expertise of medical professionals. To address potential errors, computer-aided diagnosis systems have been developed. In this study, we propose a new method that enhances the localization and segmentation of ICH lesions in CT scans by using multiple images created through different data augmentation techniques. We integrate residual connections into a U-Net-based segmentation network to improve the training efficiency. Our experiments, based on 82 CT scans from traumatic brain injury patients, validate the effectiveness of our approach, achieving an IOU score of 0.807 ± 0.03 for ICH segmentation using 10-fold cross-validation.
Full article
(This article belongs to the Section Medical Imaging)
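The reported IOU score can be illustrated with a minimal sketch of the metric, computed between binary lesion masks; in the paper this score is averaged over 10 cross-validation folds.

```python
# A minimal sketch of the IoU (Jaccard) score between binary lesion masks.
import numpy as np

def iou(pred: np.ndarray, target: np.ndarray) -> float:
    """pred, target: boolean masks of the same shape."""
    intersection = np.logical_and(pred, target).sum()
    union = np.logical_or(pred, target).sum()
    return intersection / union if union else 1.0
```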
Open Access Article
Comparing Different Registration and Visualization Methods for Navigated Common Femoral Arterial Access—A Phantom Model Study Using Mixed Reality
by Johannes Hatzl, Daniel Henning, Dittmar Böckler, Niklas Hartmann, Katrin Meisenbacher and Christian Uhl
J. Imaging 2024, 10(4), 76; https://doi.org/10.3390/jimaging10040076 - 25 Mar 2024
Abstract
Mixed reality (MxR) enables the projection of virtual three-dimensional objects into the user's field of view via a head-mounted display (HMD). This phantom model study investigated three different workflows for navigated common femoral arterial (CFA) access and compared them with a conventional sonography-guided technique as a control. A total of 160 punctures were performed by 10 operators (5 experts and 5 non-experts). A successful CFA puncture was defined as a puncture at the mid-level of the femoral head with the needle tip at the central lumen line, at a 0° coronal insertion angle and a 45° sagittal insertion angle. Positional errors were quantified using cone-beam computed tomography following each attempt. Mixed effect modeling revealed that the distance from the needle entry site to the mid-level of the femoral head is significantly shorter for navigated techniques than for the control group. This highlights that three-dimensional visualization could increase the safety of CFA access. However, the navigated workflows are infrastructurally complex, offer limited usability, and are associated with considerable cost. While navigated techniques appear to be a potentially beneficial adjunct for safe CFA access, future developments should aim to reduce workflow complexity, avoid optical tracking systems, and offer more pragmatic methods of registration and instrument tracking.
Full article
(This article belongs to the Section Medical Imaging)
Open Access Review
A Review on PolSAR Decompositions for Feature Extraction
by Konstantinos Karachristos, Georgia Koukiou and Vassilis Anastassopoulos
J. Imaging 2024, 10(4), 75; https://doi.org/10.3390/jimaging10040075 - 24 Mar 2024
Abstract
Feature extraction plays a pivotal role in processing remote sensing datasets, especially in the realm of fully polarimetric data. This review investigates a variety of polarimetric decomposition techniques aimed at extracting comprehensive information from polarimetric imagery. These techniques are categorized as coherent and non-coherent methods, depending on their assumptions about the distribution of information among polarimetric cells. The review explores well-established and innovative approaches in polarimetric decomposition within both categories. It begins with a thorough examination of the foundational Pauli decomposition, a key algorithm in this field. Within the coherent category, the Cameron target decomposition is extensively explored, shedding light on its underlying principles. Transitioning to the non-coherent domain, the review investigates the Freeman–Durden decomposition and its extension, Yamaguchi's approach. Additionally, the widely recognized eigenvector–eigenvalue decomposition introduced by Cloude and Pottier is scrutinized. Furthermore, each method undergoes experimental testing on the benchmark dataset of the broader Vancouver area, offering a robust analysis of their efficacy. The primary objective of this review is to systematically present well-established polarimetric decomposition algorithms, elucidating the underlying mathematical foundations of each. The aim is to facilitate a profound understanding of these approaches, coupled with insights into potential combinations for diverse applications.
Full article
(This article belongs to the Section Visualization and Computer Graphics)
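The foundational Pauli decomposition discussed above reduces to three simple per-pixel combinations of the scattering matrix channels; the following is a minimal sketch, not tied to any particular dataset.

```python
# A minimal sketch of the coherent Pauli decomposition, computed per pixel
# from the complex scattering matrix channels.
import numpy as np

def pauli_decomposition(S_hh, S_hv, S_vv):
    """Return the Pauli components for arrays of complex scattering values."""
    k1 = (S_hh + S_vv) / np.sqrt(2)   # odd-bounce (surface) scattering
    k2 = (S_hh - S_vv) / np.sqrt(2)   # even-bounce (double-bounce) scattering
    k3 = np.sqrt(2) * S_hv            # 45°-oriented / volume scattering
    return k1, k2, k3

# An RGB Pauli image is typically formed as R=|k2|, G=|k3|, B=|k1|.
```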
Open Access Article
An Improved Bio-Orientation Method Based on Direct Sunlight Compensation for Imaging Polarization Sensor
by Guangmin Li, Ya Zhang, Shiwei Fan and Fei Yu
J. Imaging 2024, 10(4), 74; https://doi.org/10.3390/jimaging10040074 - 24 Mar 2024
Abstract
Direct sunlight in complex environmental conditions severely interferes with the light intensity response of an imaging Polarization Sensor (PS), leading to a reduction in polarization orientation accuracy. To address this issue, this article analyzes the mechanism by which direct sunlight affects polarization sensor detection in a complex environment. A direct sunlight interference factor is introduced into the intensity response model of imaging polarization detection, enhancing the accuracy of the polarization detection model. Furthermore, a polarization state information analytical solution model based on direct sunlight compensation is constructed to improve the accuracy and real-time performance of the polarization state information solution. On this basis, an improved bio-orientation method based on direct sunlight compensation for imaging polarization sensors is proposed. An outdoor dynamic reorientation experiment platform was established to validate the effectiveness of the proposed method. Compared with traditional methods, the experimental results demonstrate a 23% to 47% improvement in polarization orientation accuracy under various solar zenith angles.
Full article
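A standard intensity-response relation underlies such sensors: the angle of polarization follows from the linear Stokes parameters measured at four polarizer orientations. The sketch below illustrates this common model; it is not the paper's compensated solution.

```python
# A minimal sketch of recovering the angle of polarization (AoP) from a
# four-channel polarization image (0°, 45°, 90°, 135°), via the linear
# Stokes parameters.
import numpy as np

def angle_of_polarization(I0, I45, I90, I135):
    """Per-pixel AoP in radians from intensity images at four polarizer angles."""
    S1 = I0 - I90
    S2 = I45 - I135
    return 0.5 * np.arctan2(S2, S1)
```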
Open Access Article
Data Fusion of RGB and Depth Data with Image Enhancement
by Lennard Wunsch, Christian Görner Tenorio, Katharina Anding, Andrei Golomoz and Gunther Notni
J. Imaging 2024, 10(3), 73; https://doi.org/10.3390/jimaging10030073 - 21 Mar 2024
Abstract
Since 3D sensors became popular, imaged depth data have become easier to obtain in the consumer sector. In applications such as defect localization on industrial objects or mass/volume estimation, precise depth data are important and thus benefit from the use of multiple information sources. A combination of RGB and depth images can not only provide more information about objects but also enhance data quality. Combining different camera systems using data fusion can yield higher-quality data, since the disadvantages of one sensor can be compensated by another. Data fusion itself consists of data preparation and data registration. A challenge in data fusion is the differing resolutions of sensors; therefore, up- and downsampling algorithms are needed. This paper compares multiple up- and downsampling methods, such as different direct interpolation methods, joint bilateral upsampling (JBU), and Markov random fields (MRFs), in terms of their potential to create RGB-D images and improve the quality of depth information. In contrast to the literature, in which imaging systems are adjusted to acquire data of the same section simultaneously, the laboratory setup in this study was based on conveyor-based optical sorting processes, so the data were acquired at different times and different spatial locations, making data assignment and data cropping necessary. To evaluate the results, root mean square error (RMSE), signal-to-noise ratio (SNR), correlation (CORR), universal quality index (UQI), and the contour offset are monitored. JBU outperformed the other upsampling methods, achieving a mean RMSE of 25.22, mean SNR of 32.80, mean CORR of 0.99, and mean UQI of 0.97.
Full article
(This article belongs to the Section Image and Video Processing)
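Joint bilateral upsampling, the best-performing method in this comparison, can be sketched as follows. This is a minimal, intentionally slow reference implementation with illustrative parameters, not the authors' code.

```python
# A minimal (intentionally slow) sketch of joint bilateral upsampling (JBU):
# low-resolution depth is upsampled under the guidance of an aligned
# high-resolution RGB image. Parameters are illustrative.
import numpy as np

def jbu(depth_lr, rgb_hr, scale, radius=2, sigma_s=1.0, sigma_r=10.0):
    H, W = rgb_hr.shape[:2]
    out = np.zeros((H, W))
    for y in range(H):
        for x in range(W):
            yl, xl = y / scale, x / scale          # position in the low-res grid
            num = den = 0.0
            for dy in range(-radius, radius + 1):
                for dx in range(-radius, radius + 1):
                    qy, qx = int(round(yl)) + dy, int(round(xl)) + dx
                    if not (0 <= qy < depth_lr.shape[0] and 0 <= qx < depth_lr.shape[1]):
                        continue
                    # Spatial weight, measured in low-res coordinates.
                    ws = np.exp(-((qy - yl) ** 2 + (qx - xl) ** 2) / (2 * sigma_s ** 2))
                    # Range weight from the high-res RGB guidance.
                    gy = min(int(qy * scale), H - 1)
                    gx = min(int(qx * scale), W - 1)
                    diff = rgb_hr[y, x].astype(float) - rgb_hr[gy, gx].astype(float)
                    wr = np.exp(-(diff @ diff) / (2 * sigma_r ** 2))
                    num += ws * wr * depth_lr[qy, qx]
                    den += ws * wr
            out[y, x] = num / den if den else 0.0
    return out
```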
Open Access Article
Analyzing Data Modalities for Cattle Weight Estimation Using Deep Learning Models
by Hina Afridi, Mohib Ullah, Øyvind Nordbø, Solvei Cottis Hoff, Siri Furre, Anne Guro Larsgard and Faouzi Alaya Cheikh
J. Imaging 2024, 10(3), 72; https://doi.org/10.3390/jimaging10030072 - 21 Mar 2024
Abstract
We investigate the impact of different data modalities for cattle weight estimation. For this purpose, we collect and present our own cattle dataset representing the data modalities: RGB, depth, combined RGB and depth, segmentation, and combined segmentation and depth information. We explore a recent vision-transformer-based zero-shot model proposed by Meta AI Research for producing the segmentation data modality and for extracting the cattle-only region from the images. For experimental analysis, we consider three baseline deep learning models. The objective is to assess how the integration of diverse data sources influences the accuracy and robustness of the deep learning models considering four different performance metrics: mean absolute error (MAE), root mean squared error (RMSE), mean absolute percentage error (MAPE), and R-squared (R²). We explore the synergies and challenges associated with each modality and their combined use in enhancing the precision of cattle weight prediction. Through comprehensive experimentation and evaluation, we aim to provide insights into the effectiveness of different data modalities in improving the performance of established deep learning models, facilitating informed decision-making for precision livestock management systems.
Full article
(This article belongs to the Section Computer Vision and Pattern Recognition)
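The four reported metrics can be computed in a few lines with scikit-learn; the weight values below are toy numbers, not the paper's data.

```python
# A minimal sketch of the four reported regression metrics for weight
# prediction; y_true/y_pred are illustrative arrays.
import numpy as np
from sklearn.metrics import (mean_absolute_error, mean_squared_error,
                             mean_absolute_percentage_error, r2_score)

y_true = np.array([512.0, 478.0, 630.0])   # kg, toy values
y_pred = np.array([505.0, 490.0, 618.0])

mae  = mean_absolute_error(y_true, y_pred)
rmse = np.sqrt(mean_squared_error(y_true, y_pred))
mape = mean_absolute_percentage_error(y_true, y_pred)
r2   = r2_score(y_true, y_pred)
```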
Open Access Article
FishSegSSL: A Semi-Supervised Semantic Segmentation Framework for Fish-Eye Images
by Sneha Paul, Zachary Patterson and Nizar Bouguila
J. Imaging 2024, 10(3), 71; https://doi.org/10.3390/jimaging10030071 - 15 Mar 2024
Abstract
The application of large field-of-view (FoV) cameras equipped with fish-eye lenses brings notable advantages to various real-world computer vision applications, including autonomous driving. While deep learning has proven successful in conventional computer vision applications using regular perspective images, its potential in fish-eye camera contexts remains largely unexplored due to limited datasets for fully supervised learning. Semi-supervised learning offers a potential solution to this challenge. In this study, we explore and benchmark two popular semi-supervised methods from the perspective image domain for fish-eye image segmentation. We further introduce FishSegSSL, a novel fish-eye image segmentation framework featuring three semi-supervised components: pseudo-label filtering, dynamic confidence thresholding, and robust strong augmentation. Evaluation on the WoodScape dataset, collected from vehicle-mounted fish-eye cameras, demonstrates that our proposed method enhances the model's performance by up to 10.49% over fully supervised methods using the same amount of labeled data, and improves on existing image segmentation methods by 2.34%. To the best of our knowledge, this is the first work on semi-supervised semantic segmentation of fish-eye images. Additionally, we conduct a comprehensive ablation study and sensitivity analysis to demonstrate the efficacy of each proposed component.
Full article
(This article belongs to the Special Issue Deep Learning in Computer Vision)
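Pseudo-label filtering, one of the three components named above, can be sketched as follows; the fixed threshold here stands in for the paper's dynamic confidence thresholding and is illustrative only.

```python
# A minimal sketch of confidence-based pseudo-label filtering for
# semi-supervised segmentation; the threshold scheme is illustrative.
import torch

def filter_pseudo_labels(logits: torch.Tensor, threshold: float):
    """logits: (B, C, H, W) predictions on unlabeled fish-eye images."""
    probs = torch.softmax(logits, dim=1)
    conf, pseudo = probs.max(dim=1)          # per-pixel confidence and class
    mask = conf >= threshold                 # keep only confident pixels
    pseudo[~mask] = -1                       # ignore index for the loss
    return pseudo, mask

# The unsupervised loss is then cross-entropy on a strongly augmented view,
# computed only where mask is True (e.g., via ignore_index=-1).
```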
Open Access Article
Enhancing Embedded Object Tracking: A Hardware Acceleration Approach for Real-Time Predictability
by Mingyang Zhang, Kristof Van Beeck and Toon Goedemé
J. Imaging 2024, 10(3), 70; https://doi.org/10.3390/jimaging10030070 - 13 Mar 2024
Abstract
While Siamese object tracking has witnessed significant advancements, its hard real-time behaviour on embedded devices remains inadequately addressed. In many applications, an embedded implementation should not only have minimal execution latency, but this latency should ideally also have zero variance, i.e., be predictable. This study addresses this issue by meticulously analysing real-time predictability across the different components of a deep-learning-based video object tracking system. Our detailed experiments not only indicate the superiority of Field-Programmable Gate Array (FPGA) implementations in terms of hard real-time behaviour but also unveil important time-predictability bottlenecks. We introduce dedicated hardware accelerators for key processes, focusing on depth-wise cross-correlation and padding operations, utilizing high-level synthesis (HLS). Implemented on a KV260 board, our enhanced tracker exhibits not only a 6.6× speedup in mean execution time but also significant improvements in hard real-time predictability, yielding 11 times less latency variation compared with our baseline. A subsequent analysis of power consumption reveals our approach's contribution to enhanced power efficiency. These advancements underscore the crucial role of hardware acceleration in realizing time-predictable object tracking on embedded systems, setting new standards for future hardware–software co-design endeavours in this domain.
Full article
(This article belongs to the Special Issue Computer Vision and Deep Learning: Trends and Applications (2nd Edition))
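The depth-wise cross-correlation that the accelerator targets is the standard Siamese-tracker operation (as popularized by SiamRPN++); a minimal PyTorch sketch is shown below for reference, not as the paper's FPGA implementation.

```python
# A minimal sketch of depth-wise cross-correlation: the template features
# act as per-channel filters slid over the search-region features.
import torch
import torch.nn.functional as F

def depthwise_xcorr(search: torch.Tensor, template: torch.Tensor):
    """search: (B, C, Hs, Ws); template: (B, C, Ht, Wt). Returns (B, C, H, W)."""
    b, c, h, w = search.shape
    x = search.reshape(1, b * c, h, w)                 # fold batch into channels
    kernel = template.reshape(b * c, 1, *template.shape[2:])
    out = F.conv2d(x, kernel, groups=b * c)            # one filter per channel
    return out.reshape(b, c, *out.shape[2:])

score = depthwise_xcorr(torch.randn(1, 256, 31, 31), torch.randn(1, 256, 7, 7))
```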
Open Access Article
Multi-Modal Convolutional Parameterisation Network for Guided Image Inverse Problems
by Mikolaj Czerkawski, Priti Upadhyay, Christopher Davison, Robert Atkinson, Craig Michie, Ivan Andonovic, Malcolm Macdonald, Javier Cardona and Christos Tachtatzis
J. Imaging 2024, 10(3), 69; https://doi.org/10.3390/jimaging10030069 - 12 Mar 2024
Abstract
There are several image inverse tasks, such as inpainting or super-resolution, which can be solved using deep internal learning, a paradigm in which deep neural networks find a solution by learning from the sample itself rather than from a dataset. For example, Deep Image Prior is a technique based on fitting a convolutional neural network to output the known parts of the image (such as non-inpainted regions or a low-resolution version of the image). However, this approach is not well suited to samples composed of multiple modalities. In some domains, such as satellite image processing, accommodating multi-modal representations could be beneficial or even essential. In this work, the Multi-Modal Convolutional Parameterisation Network (MCPN) is proposed, in which a convolutional neural network approximates shared information between multiple modes by combining a core shared network with modality-specific head networks. The results demonstrate that this approach can significantly outperform the single-mode adoption of a convolutional parameterisation network on guided image inverse problems of inpainting and super-resolution.
Full article
(This article belongs to the Section Computer Vision and Pattern Recognition)
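The shared-core-plus-heads idea behind MCPN can be sketched as a small PyTorch module; layer sizes and channel counts are illustrative assumptions, not the paper's architecture.

```python
# A minimal sketch of a shared core network with modality-specific heads;
# sizes are illustrative, not the paper's architecture.
import torch
import torch.nn as nn

class SharedCoreWithHeads(nn.Module):
    def __init__(self, modal_channels=(3, 1)):         # e.g., RGB + one-band mode
        super().__init__()
        # Core network approximates information shared across modalities.
        self.core = nn.Sequential(
            nn.Conv2d(32, 64, 3, padding=1), nn.ReLU(),
            nn.Conv2d(64, 64, 3, padding=1), nn.ReLU(),
        )
        # One lightweight head maps the shared features to each modality.
        self.heads = nn.ModuleList(
            nn.Conv2d(64, c, 3, padding=1) for c in modal_channels
        )

    def forward(self, z):                               # z: latent input (B, 32, H, W)
        shared = self.core(z)
        return [head(shared) for head in self.heads]    # one output per modality

outs = SharedCoreWithHeads()(torch.randn(1, 32, 64, 64))
```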
Open Access Article
Neural Radiance Field-Inspired Depth Map Refinement for Accurate Multi-View Stereo
by Shintaro Ito, Kanta Miura, Koichi Ito and Takafumi Aoki
J. Imaging 2024, 10(3), 68; https://doi.org/10.3390/jimaging10030068 - 08 Mar 2024
Abstract
In this paper, we propose a method to refine the depth maps obtained by Multi-View Stereo (MVS) through iterative optimization of the Neural Radiance Field (NeRF). MVS accurately estimates the depths on object surfaces, and NeRF accurately estimates the depths at object boundaries. The key ideas of the proposed method are to combine MVS and NeRF to utilize the advantages of both in depth map estimation and to use NeRF for depth map refinement. We also introduce a Huber loss into the NeRF optimization to improve the accuracy of the depth map refinement, where the Huber loss reduces the estimation error in the radiance fields by placing constraints on errors larger than a threshold. Through a set of experiments using the Redwood-3dscan dataset and the DTU dataset, which are public datasets consisting of multi-view images, we demonstrate the effectiveness of the proposed method compared to conventional methods: COLMAP, NeRF, and DS-NeRF.
Full article
(This article belongs to the Special Issue Geometry Reconstruction from Images (2nd Edition))
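The Huber loss introduced into the NeRF optimization is quadratic below a threshold delta and linear above it; a minimal sketch follows, with delta chosen arbitrarily.

```python
# A minimal sketch of the Huber loss used to constrain large errors:
# quadratic below the threshold delta, linear above it.
import torch

def huber(residual: torch.Tensor, delta: float = 0.1) -> torch.Tensor:
    abs_r = residual.abs()
    quadratic = 0.5 * residual ** 2
    linear = delta * (abs_r - 0.5 * delta)
    return torch.where(abs_r <= delta, quadratic, linear).mean()

# Equivalently: torch.nn.HuberLoss(delta=0.1)(pred, target).
```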
Open Access Article
Revolutionizing Cow Welfare Monitoring: A Novel Top-View Perspective with Depth Camera-Based Lameness Classification
by San Chain Tun, Tsubasa Onizuka, Pyke Tin, Masaru Aikawa, Ikuo Kobayashi and Thi Thi Zin
J. Imaging 2024, 10(3), 67; https://doi.org/10.3390/jimaging10030067 - 08 Mar 2024
Abstract
This study advances livestock health management by using a top-view 3D depth camera and deep learning for accurate cow lameness detection, classification, and precise segmentation, distinguishing it from 2D systems. It underscores the importance of early lameness detection in cattle and focuses on extracting depth data from the cow's body, with a specific emphasis on the maximum value in the back region. Precise cow detection and tracking are achieved through the Detectron2 framework and Intersection Over Union (IOU) techniques. Across a three-day testing period, with observations conducted twice daily on varying cow populations (ranging from 56 to 64 cows per day), the study consistently achieves an impressive average detection accuracy of 99.94%. Tracking accuracy remains at 99.92% over the same observation period. Subsequently, the research extracts the cow's depth region using binary mask images derived from the detection results and the original depth images. Feature extraction generates a feature vector based on maximum height measurements from the cow's backbone area. This feature vector is used for classification, evaluating three classifiers: Random Forest (RF), K-Nearest Neighbor (KNN), and Decision Tree (DT). The study highlights the potential of top-view depth video cameras for accurate cow lameness detection and classification, with significant implications for livestock health management.
Full article
(This article belongs to the Section Computer Vision and Pattern Recognition)
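The IOU-based tracking step can be illustrated with a greedy frame-to-frame association sketch; the threshold and matching strategy here are illustrative, not the paper's exact procedure.

```python
# A minimal sketch of IoU-based association for keeping identities
# consistent across frames; thresholds are illustrative.
import numpy as np

def box_iou(a, b):
    """a, b: boxes as (x1, y1, x2, y2)."""
    x1, y1 = max(a[0], b[0]), max(a[1], b[1])
    x2, y2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, x2 - x1) * max(0, y2 - y1)
    area = lambda r: (r[2] - r[0]) * (r[3] - r[1])
    return inter / (area(a) + area(b) - inter + 1e-9)

def match(prev_boxes, new_boxes, thresh=0.3):
    """Return {new_index: prev_index} for pairs with IoU above thresh."""
    assignments = {}
    for i, nb in enumerate(new_boxes):
        ious = [box_iou(nb, pb) for pb in prev_boxes]
        j = int(np.argmax(ious)) if ious else -1
        if j >= 0 and ious[j] >= thresh and j not in assignments.values():
            assignments[i] = j
    return assignments
```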
Open Access Article
Magnetic Resonance Imaging as a Diagnostic Tool for Ilio-Femoro-Caval Deep Venous Thrombosis
by Lisbeth Lyhne, Kim Christian Houlind, Johnny Christensen, Radu L. Vijdea, Meinhard R. Hansen, Malene Roland V. Pedersen and Helle Precht
J. Imaging 2024, 10(3), 66; https://doi.org/10.3390/jimaging10030066 - 08 Mar 2024
Abstract
This study aimed to test the accuracy of a magnetic resonance imaging (MRI)-based method to detect and characterise deep venous thrombosis (DVT) in the ilio-femoro-caval veins. Patients with verified DVT in the lower extremities, with extension of the thrombi to the iliac veins, who were suitable for catheter-based venous thrombolysis were included in this study. Before the intervention, magnetic resonance venography (MRV) was performed, and the ilio-femoro-caval veins were independently evaluated for normal appearance, stenosis, and occlusion by two single-blinded observers. The same procedure was used to evaluate digital subtraction phlebography (DSP), considered the gold standard, which made it possible to compare the results. A total of 123 patients were included for MRV and DSP, resulting in 246 image sets to be analysed. In total, 496 segments were analysed for occlusion, stenosis, or normal appearance. For MRV, the highest sensitivity was found when distinguishing occlusion from either normal appearance or stenosis (0.98), while the lowest was found between stenosis and normal appearance (0.84). Specificity varied from 0.59 (stenosis vs. occlusion) to 0.94 (occlusion vs. normal). The kappa statistic was calculated as a measure of inter-observer agreement; the kappa value was 0.91 for MRV and 0.80 for DSP. In conclusion, MRV is a sensitive method for analysing DVT in the pelvic veins, with advantages such as the absence of radiation and contrast agent and the possibility of investigating the anatomical relationships in the area.
Full article
(This article belongs to the Section Medical Imaging)
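The reported kappa statistic is Cohen's kappa over paired observer ratings; a minimal sketch using scikit-learn follows, with illustrative rating arrays.

```python
# A minimal sketch of the Cohen's kappa inter-observer statistic;
# the rating arrays are illustrative, not the study's data.
from sklearn.metrics import cohen_kappa_score

# Per-segment ratings by two observers: 0 = normal, 1 = stenosis, 2 = occlusion.
observer_a = [0, 1, 2, 2, 0, 1, 0, 2]
observer_b = [0, 1, 2, 1, 0, 1, 0, 2]
kappa = cohen_kappa_score(observer_a, observer_b)
```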
Open Access Article
Historical Text Line Segmentation Using Deep Learning Algorithms: Mask-RCNN against U-Net Networks
by Florian Côme Fizaine, Patrick Bard, Michel Paindavoine, Cécile Robin, Edouard Bouyé, Raphaël Lefèvre and Annie Vinter
J. Imaging 2024, 10(3), 65; https://doi.org/10.3390/jimaging10030065 - 05 Mar 2024
Abstract
Text line segmentation is a necessary preliminary step before most text transcription algorithms are applied. The leading deep learning networks used in this context (ARU-Net, dhSegment, and Doc-UFCN) are based on the U-Net architecture. They are efficient, but fall under the same concept, requiring a post-processing step to perform instance (e.g., text line) segmentation. In the present work, we test the advantages of Mask-RCNN, which is designed to perform instance segmentation directly. This work is the first to directly compare Mask-RCNN- and U-Net-based networks on text segmentation of historical documents, showing the superiority of the former over the latter. Three studies were conducted: one comparing these networks on different historical databases, another comparing Mask-RCNN with Doc-UFCN on a private historical database, and a third comparing the handwritten text recognition (HTR) performance of the tested networks. The results showed that Mask-RCNN outperformed ARU-Net, dhSegment, and Doc-UFCN on relevant line segmentation metrics, that performance evaluation should not focus on the raw masks generated by the networks, that light mask processing is a simple and efficient way to improve evaluation, and that Mask-RCNN leads to better HTR performance.
Full article
(This article belongs to the Section Document Analysis and Processing)
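An off-the-shelf Mask-RCNN can be run via torchvision as sketched below; for text lines one would fine-tune on a line-annotated corpus. The image path is hypothetical, and this is not the networks or training setup used in the paper.

```python
# A minimal sketch of Mask R-CNN instance segmentation with torchvision;
# "page.jpg" is a hypothetical scanned page.
import torch
import torchvision
from torchvision.transforms.functional import to_tensor
from PIL import Image

model = torchvision.models.detection.maskrcnn_resnet50_fpn(weights="DEFAULT")
model.eval()

image = to_tensor(Image.open("page.jpg").convert("RGB"))
with torch.no_grad():
    out = model([image])[0]          # dict with boxes, labels, scores, masks
keep = out["scores"] > 0.5           # confident detections only
masks = out["masks"][keep] > 0.5     # threshold soft masks to binary instances
```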
Open Access Article
Elevating Chest X-ray Image Super-Resolution with Residual Network Enhancement
by Anudari Khishigdelger, Ahmed Salem and Hyun-Soo Kang
J. Imaging 2024, 10(3), 64; https://doi.org/10.3390/jimaging10030064 - 04 Mar 2024
Abstract
Chest X-ray (CXR) imaging plays a pivotal role in diagnosing various pulmonary diseases, which account for a significant portion of the global mortality rate, as recognized by the World Health Organization (WHO). Medical practitioners routinely depend on CXR images to identify anomalies and make critical clinical decisions. Dramatic improvements in super-resolution (SR) have been achieved by applying deep learning techniques. However, some SR methods are difficult to utilize on inputs such as X-ray images, whose low-resolution features contain abundant low-frequency information. In this paper, we introduce an advanced deep learning-based SR approach that incorporates the innovative residual-in-residual (RIR) structure to augment the diagnostic potential of CXR imaging. Specifically, we propose a light network consisting of residual groups built from residual blocks, with multiple skip connections that allow abundant low-frequency information to bypass the network efficiently, so the main network can concentrate on learning high-frequency information. In addition, we adopt dense feature fusion within residual groups and design highly parallel residual blocks for better feature extraction. Our proposed methods exhibit superior performance compared to existing state-of-the-art (SOTA) SR methods, delivering enhanced accuracy and notable visual improvements, as evidenced by our results.
Full article
(This article belongs to the Special Issue Deep Learning in Biomedical Image Segmentation and Classification: Advancements, Challenges and Applications)
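The residual-in-residual pattern described above nests residual blocks inside residual groups under a long skip connection; the following PyTorch sketch shows the pattern with illustrative sizes, not the paper's exact network.

```python
# A minimal sketch of the residual-in-residual (RIR) pattern: residual
# blocks nested inside residual groups, with short, group-level, and
# long skip connections; sizes are illustrative.
import torch.nn as nn

class ResidualBlock(nn.Module):
    def __init__(self, ch):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(ch, ch, 3, padding=1), nn.ReLU(),
            nn.Conv2d(ch, ch, 3, padding=1),
        )

    def forward(self, x):
        return x + self.body(x)            # short skip

class ResidualGroup(nn.Module):
    def __init__(self, ch, n_blocks=4):
        super().__init__()
        self.blocks = nn.Sequential(*[ResidualBlock(ch) for _ in range(n_blocks)])

    def forward(self, x):
        return x + self.blocks(x)          # group-level skip

class RIR(nn.Module):
    def __init__(self, ch=64, n_groups=4):
        super().__init__()
        self.groups = nn.Sequential(*[ResidualGroup(ch) for _ in range(n_groups)])

    def forward(self, x):
        return x + self.groups(x)          # long skip carries low frequencies
```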
Open Access Article
Enhancing COVID-19 Detection: An Xception-Based Model with Advanced Transfer Learning from X-ray Thorax Images
by Reagan E. Mandiya, Hervé M. Kongo, Selain K. Kasereka, Kyamakya Kyandoghere, Petro Mushidi Tshakwanda and Nathanaël M. Kasoro
J. Imaging 2024, 10(3), 63; https://doi.org/10.3390/jimaging10030063 - 29 Feb 2024
Abstract
Rapid and precise identification of Coronavirus Disease 2019 (COVID-19) is pivotal for effective patient care, comprehending the pandemic’s trajectory, and enhancing long-term patient survival rates. Despite numerous recent endeavors in medical imaging, many convolutional neural network-based models grapple with the expressiveness problem and overfitting, and the training process of these models is always resource-intensive. This paper presents an innovative approach employing Xception, augmented with cutting-edge transfer learning techniques to forecast COVID-19 from X-ray thorax images. Our experimental findings demonstrate that the proposed model surpasses the predictive accuracy of established models in the domain, including Xception, VGG-16, and ResNet. This research marks a significant stride toward enhancing COVID-19 detection through a sophisticated and high-performing imaging model.
Full article
(This article belongs to the Special Issue Clinical and Pathological Imaging in the Era of Artificial Intelligence: New Insights and Perspectives)
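Xception-based transfer learning of the kind described can be sketched with Keras as follows; the head, hyperparameters, and fine-tuning schedule are illustrative assumptions, not the authors' configuration.

```python
# A minimal sketch of Xception-based transfer learning for binary chest
# X-ray classification; hyperparameters are illustrative.
import tensorflow as tf

base = tf.keras.applications.Xception(
    weights="imagenet", include_top=False, input_shape=(299, 299, 3))
base.trainable = False                      # freeze pretrained features first

model = tf.keras.Sequential([
    base,
    tf.keras.layers.GlobalAveragePooling2D(),
    tf.keras.layers.Dropout(0.3),
    tf.keras.layers.Dense(1, activation="sigmoid"),  # COVID-19 vs. normal
])
model.compile(optimizer=tf.keras.optimizers.Adam(1e-4),
              loss="binary_crossentropy", metrics=["accuracy"])
# After convergence, unfreeze the top blocks of `base` and fine-tune with
# a lower learning rate.
```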
Topics
Topic in Sensors, J. Imaging, Electronics, Applied Sciences, Entropy, Digital, J. Intell.
Advances in Perceptual Quality Assessment of User Generated Contents
Topic Editors: Guangtao Zhai, Xiongkuo Min, Menghan Hu, Wei Zhou
Deadline: 31 March 2024
Topic in Algorithms, Diagnostics, Entropy, Information, J. Imaging
Application of Machine Learning in Molecular Imaging
Topic Editors: Allegra Conti, Nicola Toschi, Marianna Inglese, Andrea Duggento, Matthew Grech-Sollars, Serena Monti, Giancarlo Sportelli, Pietro Carra
Deadline: 31 May 2024
Topic in Applied Sciences, Computation, Entropy, J. Imaging
Color Image Processing: Models and Methods (CIP: MM)
Topic Editors: Giuliana Ramella, Isabella Torcicollo
Deadline: 30 July 2024
Topic in Applied Sciences, Sensors, J. Imaging, MAKE
Applications in Image Analysis and Pattern Recognition
Topic Editors: Bin Fan, Wenqi Ren
Deadline: 31 August 2024
Special Issues
Special Issue in J. Imaging
Recent Advances in Image-Based Geotechnics II
Guest Editor: Joana Fonseca
Deadline: 31 March 2024
Special Issue in J. Imaging
Advances and Challenges in Multimodal Machine Learning 2nd Edition
Guest Editor: Georgina Cosma
Deadline: 30 April 2024
Special Issue in J. Imaging
Modelling of Human Visual System in Image Processing
Guest Editors: Edoardo Provenzi, Alexey Mashtakov
Deadline: 24 May 2024
Special Issue in J. Imaging
The Mixed Reality Revolution: Challenges and Prospects 2nd Edition
Guest Editors: Jean Sequeira, Sébastien Mavromatis
Deadline: 31 May 2024