
J. Imaging, Volume 9, Issue 3 (March 2023) – 19 articles

Cover Story: Generative adversarial networks (GANs) have become increasingly powerful, generating photorealistic images. One recurrent theme in medical imaging is whether GANs can be effective at generating workable medical data. We performed a multi-GAN (from basic DCGAN to more sophisticated GANs) and multi-application study by measuring the segmentation accuracy of a U-Net trained on generated images for three imaging modalities and three organs. The results reveal that GANs are far from being equal. Only the top-performing GANs are capable of generating realistic-looking medical images that can fool trained experts in a visual Turing test and comply with some metrics. However, segmentation results suggest that no GAN can supplant the richness of the dataset it was trained on.
  • Issues are regarded as officially published after their release is announced to the table of contents alert mailing list.
  • You may sign up for e-mail alerts to receive table of contents of newly released issues.
  • PDF is the official format for papers published in both HTML and PDF forms. To view the papers in PDF format, click on the "PDF Full-text" link and use the free Adobe Reader to open them.
23 pages, 798 KiB  
Article
ARAM: A Technology Acceptance Model to Ascertain the Behavioural Intention to Use Augmented Reality
by Anabela Marto, Alexandrino Gonçalves, Miguel Melo, Maximino Bessa and Rui Silva
J. Imaging 2023, 9(3), 73; https://doi.org/10.3390/jimaging9030073 - 21 Mar 2023
Cited by 8 | Viewed by 3630
Abstract
The expansion of augmented reality across society, its availability on mobile platforms and the novel character it embodies by appearing in a growing number of areas have raised new questions related to people’s predisposition to use this technology in their daily life. Acceptance models, which have been updated following technological breakthroughs and societal changes, are known to be great tools for predicting the intention to use a new technological system. This paper proposes a new acceptance model aiming to ascertain the intention to use augmented reality technology in heritage sites: the Augmented Reality Acceptance Model (ARAM). ARAM relies on the constructs of the Unified Theory of Acceptance and Use of Technology (UTAUT) model, namely performance expectancy, effort expectancy, social influence, and facilitating conditions, to which the new and adapted constructs of trust expectancy, technological innovation, computer anxiety and hedonic motivation are added. This model was validated with data gathered from 528 participants. Results confirm ARAM as a reliable tool to determine the acceptance of augmented reality technology for usage in cultural heritage sites. The direct impact of performance expectancy, facilitating conditions and hedonic motivation is validated as having a positive influence on behavioural intention. Trust expectancy and technological innovation are demonstrated to have a positive influence on performance expectancy, whereas hedonic motivation is negatively influenced by effort expectancy and by computer anxiety. The research thus supports ARAM as a suitable model to ascertain the behavioural intention to use augmented reality in new areas of activity.
(This article belongs to the Section Mixed, Augmented and Virtual Reality)

23 pages, 14839 KiB  
Article
6D Object Localization in Car-Assembly Industrial Environment
by Alexandra Papadaki and Maria Pateraki
J. Imaging 2023, 9(3), 72; https://doi.org/10.3390/jimaging9030072 - 20 Mar 2023
Cited by 6 | Viewed by 3211
Abstract
In this work, a visual object detection and localization workflow integrated into a robotic platform is presented for the 6D pose estimation of objects with challenging characteristics in terms of weak texture, surface properties and symmetries. The workflow is used as part of a module for object pose estimation deployed to a mobile robotic platform that exploits the Robot Operating System (ROS) as middleware. The objects of interest aim to support robot grasping in the context of human–robot collaboration during car door assembly in industrial manufacturing environments. In addition to the special object properties, these environments are inherently characterised by cluttered backgrounds and unfavorable illumination conditions. For the purpose of this specific application, two different datasets were collected and annotated for training a learning-based method that extracts the object pose from a single frame. The first dataset was acquired in controlled laboratory conditions and the second in the actual indoor industrial environment. Different models, trained on the individual datasets and on a combination of them, were further evaluated on a number of test sequences from the actual industrial environment. The qualitative and quantitative results demonstrate the potential of the presented method in relevant industrial applications.
(This article belongs to the Special Issue Industrial Machine Learning Application)

12 pages, 1571 KiB  
Article
CT Rendering and Radiomic Analysis in Post-Chemotherapy Retroperitoneal Lymph Node Dissection for Testicular Cancer to Anticipate Difficulties for Young Surgeons
by Anna Scavuzzo, Pavel Figueroa-Rodriguez, Alessandro Stefano, Nallely Jimenez Guedulain, Sebastian Muruato Araiza, Jose de Jesus Cendejas Gomez, Alejandro Quiroz Compeaán, Dimas O. Victorio Vargas and Miguel A. Jiménez-Ríos
J. Imaging 2023, 9(3), 71; https://doi.org/10.3390/jimaging9030071 - 17 Mar 2023
Cited by 3 | Viewed by 2569
Abstract
Post-chemotherapy retroperitoneal lymph node dissection (PC-RPLND) in non-seminomatous germ-cell tumors (NSTGCTs) is a complex procedure. We evaluated whether 3D computed tomography (CT) rendering and its radiomic analysis help predict resectability by junior surgeons. The ambispective analysis was performed between 2016 and 2021. A prospective group (A) of 30 patients undergoing CT was segmented using the 3D Slicer software, while a retrospective group (B) of 30 patients was evaluated with conventional CT (without 3D reconstruction). Fisher’s exact test showed a p-value of 0.13 for Group A and 1.0 for Group B. The test for the difference between proportions showed a p-value of 0.009149 (CI 0.1–0.63). The proportion of the correct classification showed a p-value of 0.645 (CI 0.55–0.87) for Group A, and 0.275 (CI 0.11–0.43) for Group B. Furthermore, 13 shape features were extracted: elongation, flatness, volume, sphericity, and surface area, among others. Performing a logistic regression with the entire dataset, n = 60, the results were: Accuracy: 0.7 and Precision: 0.65. Using n = 30 randomly chosen, the best result obtained was Accuracy: 0.73 and Precision: 0.83, with a p-value: 0.025 for Fisher’s exact test. In conclusion, the results showed a significant difference in the prediction of resectability with conventional CT versus 3D reconstruction by junior surgeons versus experienced surgeons. Radiomic features used to elaborate an artificial intelligence model improve the prediction of resectability. The proposed model could be of great support in a university hospital, allowing it to plan the surgery and to anticipate complications.

21 pages, 1642 KiB  
Article
On The Potential of Image Moments for Medical Diagnosis
by Cecilia Di Ruberto, Andrea Loddo and Lorenzo Putzu
J. Imaging 2023, 9(3), 70; https://doi.org/10.3390/jimaging9030070 - 17 Mar 2023
Cited by 3 | Viewed by 2069
Abstract
Medical imaging is widely used for diagnosis and postoperative or post-therapy monitoring. The ever-increasing number of images produced has encouraged the introduction of automated methods to assist doctors or pathologists. In recent years, especially after the advent of convolutional neural networks, many researchers have focused on this approach, considering it to be the only method for diagnosis since it can perform a direct classification of images. However, many diagnostic systems still rely on handcrafted features to improve interpretability and limit resource consumption. In this work, we focused our efforts on orthogonal moments, first by providing an overview and taxonomy of their macrocategories and then by analysing their classification performance on very different medical tasks represented by four public benchmark data sets. The results confirmed that convolutional neural networks achieved excellent performance on all tasks. Despite being composed of far fewer features than those extracted by the networks, orthogonal moments proved to be competitive with them, showing comparable and, in some cases, better performance. In addition, Cartesian and harmonic categories provided a very low standard deviation, proving their robustness in medical diagnostic tasks. We strongly believe that the integration of the studied orthogonal moments can lead to more robust and reliable diagnostic systems, considering the performance obtained and the low variation of the results. Finally, since they have been shown to be effective on both magnetic resonance and computed tomography images, they can be easily extended to other imaging techniques.
(This article belongs to the Topic Medical Image Analysis)

16 pages, 2570 KiB  
Article
GANs for Medical Image Synthesis: An Empirical Study
by Youssef Skandarani, Pierre-Marc Jodoin and Alain Lalande
J. Imaging 2023, 9(3), 69; https://doi.org/10.3390/jimaging9030069 - 16 Mar 2023
Cited by 79 | Viewed by 18017
Abstract
Generative adversarial networks (GANs) have become increasingly powerful, generating mind-blowing photorealistic images that mimic the content of datasets they have been trained to replicate. One recurrent theme in medical imaging is whether GANs can also be as effective at generating workable medical data as they are for generating realistic RGB images. In this paper, we perform a multi-GAN and multi-application study to gauge the benefits of GANs in medical imaging. We tested various GAN architectures, from basic DCGAN to more sophisticated style-based GANs, on three medical imaging modalities and organs, namely: cardiac cine-MRI, liver CT, and RGB retina images. GANs were trained on well-known and widely utilized datasets, from which their FID scores were computed to measure the visual acuity of their generated images. We further tested their usefulness by measuring the segmentation accuracy of a U-Net trained on these generated images and the original data. The results reveal that GANs are far from being equal, as some are ill-suited for medical imaging applications, while others performed much better. The top-performing GANs are capable of generating realistic-looking medical images by FID standards that can fool trained experts in a visual Turing test and comply with some metrics. However, segmentation results suggest that no GAN is capable of reproducing the full richness of medical datasets.
(This article belongs to the Section AI in Imaging)

20 pages, 8537 KiB  
Article
Hyperparameter Optimization of a Convolutional Neural Network Model for Pipe Burst Location in Water Distribution Networks
by André Antunes, Bruno Ferreira, Nuno Marques and Nelson Carriço
J. Imaging 2023, 9(3), 68; https://doi.org/10.3390/jimaging9030068 - 14 Mar 2023
Cited by 4 | Viewed by 4556
Abstract
The current paper presents a hyperparameter optimization process for a convolutional neural network (CNN) applied to pipe burst location in water distribution networks (WDN). The hyperparameter optimization process of the CNN includes the early stopping termination criteria, dataset size, dataset normalization, training set batch size, optimizer learning rate regularization, and model structure. The study was carried out on a case study of a real WDN. The results obtained indicate that the ideal model parameters consist of a CNN with a convolutional 1D layer (using 32 filters, a kernel size of 3 and strides equal to 1) for a maximum of 5000 epochs using a total of 250 datasets (using data normalization between 0 and 1 and tolerance equal to max noise) and a batch size of 500 samples per epoch step, optimized with Adam using learning rate regularization. This model was evaluated for distinct measurement noise levels and pipe burst locations. Results indicate that the parameterized model can provide a pipe burst search area with more or less dispersion depending on both the proximity of pressure sensors to the burst and the noise measurement level.
(This article belongs to the Special Issue Industrial Machine Learning Application)

30 pages, 14229 KiB  
Article
A Real-Time Registration Algorithm of UAV Aerial Images Based on Feature Matching
by Zhiwen Liu, Gen Xu, Jiangjian Xiao, Jingxiang Yang, Ziyang Wang and Siyuan Cheng
J. Imaging 2023, 9(3), 67; https://doi.org/10.3390/jimaging9030067 - 11 Mar 2023
Cited by 5 | Viewed by 4030
Abstract
This study aimed to achieve the accurate and real-time geographic positioning of UAV aerial image targets. We verified a method of registering UAV camera images on a map (with the geographic location) through feature matching. The UAV is usually in rapid motion and involves changes in the camera head, and the map is high-resolution and has sparse features. These factors make it difficult for current feature-matching algorithms to accurately register the two (camera image and map) in real time, resulting in a large number of mismatches. To solve this problem, we used the SuperGlue algorithm, which performs better, to match the features. The layer and block strategy, combined with the prior data of the UAV, was introduced to improve the accuracy and speed of feature matching, and the matching information obtained between frames was introduced to solve the problem of uneven registration. Here, we propose the concept of updating map features with UAV image features to enhance the robustness and applicability of UAV aerial image and map registration. Numerous experiments showed that the proposed method is feasible and can adapt to changes in the camera head, environment, etc. The UAV aerial image is stably and accurately registered on the map, and the frame rate reaches 12 frames per second, which provides a basis for the geo-positioning of UAV aerial image targets.
(This article belongs to the Topic Computer Vision and Image Processing)

13 pages, 621 KiB  
Article
Predictive Factors of Local Recurrence after Colorectal Cancer Liver Metastases Thermal Ablation
by Julien Odet, Julie Pellegrinelli, Olivier Varbedian, Caroline Truntzer, Marco Midulla, François Ghiringhelli and David Orry
J. Imaging 2023, 9(3), 66; https://doi.org/10.3390/jimaging9030066 - 10 Mar 2023
Cited by 4 | Viewed by 2148
Abstract
Background: Identify risk factors for local recurrence (LR) after radiofrequency (RFA) and microwave (MWA) thermoablations (TA) of colorectal cancer liver metastases (CCLM). Methods: Uni- (Pearson’s chi-squared test, Fisher’s exact test, Wilcoxon test) and multivariate analyses (LASSO logistic regressions) of every patient treated with MWA or RFA (percutaneously and surgically) from January 2015 to April 2021 at the Centre Georges François Leclerc in Dijon, France. Results: Fifty-four patients were treated with TA for 177 CCLM (159 surgically, 18 percutaneously). LR rate was 17.5% of treated lesions. Univariate analyses by lesion showed factors associated with LR: size of the lesion (OR = 1.14), size of nearby vessel (OR = 1.27), treatment of a previous TA site LR (OR = 5.03), and non-ovoid TA site shape (OR = 4.25). Multivariate analyses showed that the size of the nearby vessel (OR = 1.17) and the lesion (OR = 1.09) remained significant risk factors of LR. Conclusions: The size of lesions to treat and vessel proximity are LR risk factors that need to be considered when deciding on thermoablative treatments. TA of an LR on a previous TA site should be reserved for specific situations, as there is an important risk of another LR. An additional TA procedure can be discussed when TA site shape is non-ovoid on control imaging, given the risk of LR.
(This article belongs to the Section Medical Imaging)

11 pages, 1135 KiB  
Communication
Comparison of Image Quality and Quantification Parameters between Q.Clear and OSEM Reconstruction Methods on FDG-PET/CT Images in Patients with Metastatic Breast Cancer
by Mohammad Naghavi-Behzad, Marianne Vogsen, Oke Gerke, Sara Elisabeth Dahlsgaard-Wallenius, Henriette Juel Nissen, Nick Møldrup Jakobsen, Poul-Erik Braad, Mie Holm Vilstrup, Paul Deak, Malene Grubbe Hildebrandt and Thomas Lund Andersen
J. Imaging 2023, 9(3), 65; https://doi.org/10.3390/jimaging9030065 - 9 Mar 2023
Cited by 5 | Viewed by 2475
Abstract
We compared the image quality and quantification parameters of the Bayesian penalized likelihood reconstruction algorithm (Q.Clear) and the ordered subset expectation maximization (OSEM) algorithm for 2-[18F]FDG-PET/CT scans performed for response monitoring in patients with metastatic breast cancer in a prospective setting. We included 37 metastatic breast cancer patients diagnosed and monitored with 2-[18F]FDG-PET/CT at Odense University Hospital (Denmark). A total of 100 scans were analyzed blinded to the Q.Clear and OSEM reconstruction algorithms regarding image quality parameters (noise, sharpness, contrast, diagnostic confidence, artefacts, and blotchy appearance) using a five-point scale. The hottest lesion was selected in scans with measurable disease, considering the same volume of interest in both reconstruction methods. SULpeak (g/mL) and SUVmax (g/mL) were compared for the same hottest lesion. There was no significant difference regarding noise, diagnostic confidence, and artefacts between reconstruction methods; Q.Clear had significantly better sharpness (p < 0.001) and contrast (p = 0.001) than the OSEM reconstruction, while the OSEM reconstruction had significantly less blotchy appearance compared with Q.Clear reconstruction (p < 0.001). Quantitative analysis on 75/100 scans indicated that Q.Clear reconstruction had significantly higher SULpeak (5.33 ± 2.8 vs. 4.85 ± 2.5, p < 0.001) and SUVmax (8.27 ± 4.8 vs. 6.90 ± 3.8, p < 0.001) compared with OSEM reconstruction. In conclusion, Q.Clear reconstruction revealed better sharpness, better contrast, higher SUVmax, and higher SULpeak, while OSEM reconstruction had less blotchy appearance.

16 pages, 828 KiB  
Article
Autokeras Approach: A Robust Automated Deep Learning Network for Diagnosis Disease Cases in Medical Images
by Ahmad Alaiad, Aya Migdady, Ra’ed M. Al-Khatib, Omar Alzoubi, Raed Abu Zitar and Laith Abualigah
J. Imaging 2023, 9(3), 64; https://doi.org/10.3390/jimaging9030064 - 8 Mar 2023
Cited by 10 | Viewed by 3334
Abstract
Automated deep learning is promising in artificial intelligence (AI). However, few applications of automated deep learning networks have been made in the clinical medical fields. Therefore, we studied the application of an open-source automated deep learning framework, Autokeras, for detecting blood smear images infected with malaria parasites. Autokeras is able to identify the optimal neural network to perform the classification task. Hence, the robustness of the adopted model is due to it not needing any prior knowledge of deep learning. In contrast, the traditional deep neural network methods still require more construction to identify the best convolutional neural network (CNN). The dataset used in this study consisted of 27,558 blood smear images. A comparative process proved the superiority of our proposed approach over other traditional neural networks. The evaluation results of our proposed model achieved high efficiency with impressive accuracy, reaching 95.6% when compared with previous competitive models.
(This article belongs to the Special Issue Modelling of Human Visual System in Image Processing)

20 pages, 5345 KiB  
Article
Environment-Aware Rendering and Interaction in Web-Based Augmented Reality
by José Ferrão, Paulo Dias, Beatriz Sousa Santos and Miguel Oliveira
J. Imaging 2023, 9(3), 63; https://doi.org/10.3390/jimaging9030063 - 8 Mar 2023
Cited by 5 | Viewed by 4267
Abstract
This work presents a novel framework for web-based environment-aware rendering and interaction in augmented reality based on WebXR and three.js. It aims at accelerating the development of device-agnostic Augmented Reality (AR) applications. The solution allows for a realistic rendering of 3D elements, handles geometry occlusion, casts shadows of virtual objects onto real surfaces, and provides physics interaction with real-world objects. Unlike most existing state-of-the-art systems that are built to run on a specific hardware configuration, the proposed solution targets the web environment and is designed to work on a vast range of devices and configurations. Our solution can use monocular camera setups with depth data estimated by deep neural networks or, when available, use higher-quality depth sensors (e.g., LIDAR, structured light) that provide a more accurate perception of the environment. To ensure consistency in the rendering of the virtual scene, a physically based rendering pipeline is used, in which physically correct attributes are associated with each 3D object. Combined with lighting information captured by the device, this enables the rendering of AR content matching the environment illumination. All these concepts are integrated and optimized into a pipeline capable of providing a fluid user experience even on middle-range devices. The solution is distributed as an open-source library that can be integrated into existing and new web-based AR projects. The proposed framework was evaluated and compared in terms of performance and visual features with two state-of-the-art alternatives.
(This article belongs to the Special Issue The Roles of the Collaborative eXtended Reality in the New Social Era)

22 pages, 4672 KiB  
Article
DCTable: A Dilated CNN with Optimizing Anchors for Accurate Table Detection
by Takwa Kazdar, Wided Souidene Mseddi, Moulay A. Akhloufi, Ala Agrebi, Marwa Jmal and Rabah Attia
J. Imaging 2023, 9(3), 62; https://doi.org/10.3390/jimaging9030062 - 7 Mar 2023
Cited by 2 | Viewed by 2133
Abstract
With the widespread use of deep learning in leading systems, it has become the mainstream in the table detection field. Some tables are difficult to detect because of their figure-like layout or small size. As a solution to this underlying problem, we propose a novel method, called DCTable, to improve Faster R-CNN for table detection. DCTable extracts more discriminative features using a backbone with dilated convolutions in order to improve the quality of region proposals. Another main contribution of this paper is the anchor optimization using the Intersection over Union (IoU)-balanced loss to train the RPN and reduce the false positive rate. This is followed by a RoI Align layer, instead of RoI pooling, to improve the accuracy when mapping table proposal candidates by eliminating the coarse misalignment and introducing bilinear interpolation in mapping region proposal candidates. Training and testing on a public dataset showed the effectiveness of the algorithm and a considerable improvement of the F1-score on ICDAR 2017-Pod, ICDAR-2019, Marmot and RVL CDIP datasets.
(This article belongs to the Topic Computer Vision and Image Processing)
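
The Intersection over Union (IoU) named in the DCTable abstract above can be illustrated with a minimal sketch. It computes only the plain IoU of two axis-aligned boxes, not the IoU-balanced loss the authors describe; the function name `box_iou` and the `(x1, y1, x2, y2)` box layout are assumptions made for this illustration.

```python
def box_iou(a, b):
    """IoU of two axis-aligned boxes given as (x1, y1, x2, y2).

    Hypothetical helper for illustration; the box layout is an assumption.
    """
    # Corners of the overlapping rectangle (it may be empty).
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    # Clamp at zero so disjoint boxes contribute no overlap.
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


# A predicted table region against an annotated one (made-up coordinates).
print(box_iou((0, 0, 2, 2), (1, 1, 3, 3)))
```

A detector would typically compare such a score against a threshold to decide whether a proposal matches an annotated table; the threshold value is not given in the abstract.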

13 pages, 5323 KiB  
Article
ReUse: REgressive Unet for Carbon Storage and Above-Ground Biomass Estimation
by Antonio Elia Pascarella, Giovanni Giacco, Mattia Rigiroli, Stefano Marrone and Carlo Sansone
J. Imaging 2023, 9(3), 61; https://doi.org/10.3390/jimaging9030061 - 7 Mar 2023
Cited by 7 | Viewed by 4094
Abstract
The United Nations Framework Convention on Climate Change (UNFCCC) has recently established the Reducing Emissions from Deforestation and forest Degradation (REDD+) program, which requires countries to report their carbon emissions and sink estimates through national greenhouse gas inventories (NGHGI). Thus, developing automatic systems capable of estimating the carbon absorbed by forests without in situ observation becomes essential. To support this critical need, in this work, we introduce ReUse, a simple but effective deep learning approach to estimate the carbon absorbed by forest areas based on remote sensing. The proposed method’s novelty is in using the public above-ground biomass (AGB) data from the European Space Agency’s Climate Change Initiative Biomass project as ground truth to estimate the carbon sequestration capacity of any portion of land on Earth using Sentinel-2 images and a pixel-wise regressive UNet. The approach has been compared with two literature proposals using a private dataset and human-engineered features. The results show a greater generalization ability of the proposed approach, with a decrease in Mean Absolute Error and Root Mean Square Error over the runner-up of 16.9 and 14.3 in the area of Vietnam, 4.7 and 5.1 in the area of Myanmar, and 8.0 and 1.4 in the area of Central Europe, respectively. As a case study, we also report an analysis made for the Astroni area, a World Wildlife Fund (WWF) natural reserve struck by a large fire, producing predictions consistent with values found by experts in the field after in situ investigations. These results further support the use of such an approach for the early detection of AGB variations in urban and rural areas.
(This article belongs to the Topic Research on the Application of Digital Signal Processing)

15 pages, 14583 KiB  
Article
Sleep Action Recognition Based on Segmentation Strategy
by Xiang Zhou, Yue Cui, Gang Xu, Hongliang Chen, Jing Zeng, Yutong Li and Jiangjian Xiao
J. Imaging 2023, 9(3), 60; https://doi.org/10.3390/jimaging9030060 - 7 Mar 2023
Viewed by 1801
Abstract
In order to solve the problem of long-range dependence in video and the difficulty of fine-grained feature extraction when recognizing the sleeping behavior of personnel in a security-monitored scene, this paper proposes a time-series convolution-network-based sleeping behavior recognition algorithm suitable for monitoring data. ResNet50 is selected as the backbone network, and the self-attention coding layer is used to extract rich contextual semantic information; then, a segment-level feature fusion module is constructed to enhance the effective transmission of important information in the segment feature sequence on the network, and the long-term memory network is used to model the entire video in the time dimension to improve behavior detection ability. This paper constructs a data set of sleeping behavior under security monitoring; its two behaviors comprise about 2800 single-person target videos. The experimental results show that the detection accuracy of the network model in this paper is significantly improved on the sleeping post data set, up to 6.69% higher than the benchmark network. Compared with other network models, the performance of the algorithm in this paper has improved to different degrees and has good application value.

21 pages, 50786 KiB  
Article
Impact of Training Data, Ground Truth and Shape Variability in the Deep Learning-Based Semantic Segmentation of HeLa Cells Observed with Electron Microscopy
by Cefa Karabağ, Mauricio Alberto Ortega-Ruíz and Constantino Carlos Reyes-Aldasoro
J. Imaging 2023, 9(3), 59; https://doi.org/10.3390/jimaging9030059 - 1 Mar 2023
Cited by 5 | Viewed by 4232
Abstract
This paper investigates the impact of the amount of training data and the shape variability on the segmentation provided by the deep learning architecture U-Net. Further, the correctness of ground truth (GT) was also evaluated. The input data consisted of a three-dimensional set of images of HeLa cells observed with an electron microscope with dimensions 8192×8192×517. From there, a smaller region of interest (ROI) of 2000×2000×300 was cropped and manually delineated to obtain the ground truth necessary for a quantitative evaluation. A qualitative evaluation was performed on the 8192×8192 slices due to the lack of ground truth. Pairs of patches of data and labels for the classes nucleus, nuclear envelope, cell and background were generated to train U-Net architectures from scratch. Several training strategies were followed, and the results were compared against a traditional image processing algorithm. The correctness of GT, that is, the inclusion of one or more nuclei within the region of interest, was also evaluated. The impact of the extent of training data was evaluated by comparing results from 36,000 pairs of data and label patches extracted from the odd slices in the central region, to 135,000 patches obtained from every other slice in the set. Then, 135,000 patches from several cells from the 8192×8192 slices were generated automatically using the image processing algorithm. Finally, the two sets of 135,000 pairs were combined to train once more with 270,000 pairs. As would be expected, the accuracy and Jaccard similarity index improved as the number of pairs increased for the ROI. This was also observed qualitatively for the 8192×8192 slices. When the 8192×8192 slices were segmented with U-Nets trained with 135,000 pairs, the architecture trained with automatically generated pairs provided better results than the architecture trained with the pairs from the manually segmented ground truths. This suggests that the pairs that were extracted automatically from many cells provided a better representation of the four classes of the various cells in the 8192×8192 slice than those pairs that were manually segmented from a single cell. Finally, the two sets of 135,000 pairs were combined, and the U-Net trained with these provided the best results.
(This article belongs to the Topic Medical Image Analysis)
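The two quantitative metrics named in the abstract above, accuracy and the Jaccard similarity index, can be illustrated with a minimal sketch. The function names, the integer class encoding, and the toy 3×3 label maps are assumptions made for illustration and are not taken from the paper; returning 1.0 when a class is absent from both maps is likewise an assumed convention.

```python
def pixel_accuracy(pred, truth):
    """Fraction of pixels whose predicted label equals the ground-truth label."""
    flat_pred = [p for row in pred for p in row]
    flat_truth = [t for row in truth for t in row]
    matches = sum(p == t for p, t in zip(flat_pred, flat_truth))
    return matches / len(flat_truth)


def jaccard_index(pred, truth, label):
    """Intersection over union of the pixels assigned to `label`."""
    in_pred = [p == label for row in pred for p in row]
    in_truth = [t == label for row in truth for t in row]
    intersection = sum(p and t for p, t in zip(in_pred, in_truth))
    union = sum(p or t for p, t in zip(in_pred, in_truth))
    # Assumed convention: a class absent from both maps counts as perfect overlap.
    return intersection / union if union else 1.0


# Hypothetical encoding: 0 = background, 1 = cell, 2 = nuclear envelope, 3 = nucleus
truth = [[0, 1, 1],
         [1, 3, 3],
         [1, 3, 3]]
pred = [[0, 1, 1],
        [1, 1, 3],
        [1, 3, 3]]

accuracy = pixel_accuracy(pred, truth)         # 8 of 9 pixels agree
nucleus_jaccard = jaccard_index(pred, truth, 3)  # 3 shared pixels / 4 in the union
```

In this toy case one nucleus pixel is mislabelled as cell, so the Jaccard index for the nucleus class drops more sharply than the overall accuracy, which is why the two metrics are often reported together.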

17 pages, 9072 KiB  
Article
A Novel Multimedia Player for International Standard—JPEG Snack
by Sonain Jamil, Oh-Jin Kwon, Jinhee Lee, Faiz Ullah, Yaseen and Afnan
J. Imaging 2023, 9(3), 58; https://doi.org/10.3390/jimaging9030058 - 1 Mar 2023
Cited by 1 | Viewed by 2362
Abstract
Advances in mobile communication and technologies have led to daily growth in the use of short-form digital content. This short-form content is mainly based on images, which urged the Joint Photographic Experts Group (JPEG) to introduce a novel international standard, JPEG Snack (International Organization for Standardization (ISO)/International Electrotechnical Commission (IEC) IS 19566-8). In JPEG Snack, the multimedia content is embedded into a main background JPEG file, and the resulting JPEG Snack file is saved and transmitted as a .jpg file. If someone does not have a JPEG Snack Player, their device decoder will treat the file as an ordinary JPEG file and display the background image only. Because the standard has been proposed only recently, a JPEG Snack Player is needed. In this article, we present a methodology to develop a JPEG Snack Player. The player uses a JPEG Snack decoder and renders media objects on the background JPEG file according to the instructions in the JPEG Snack file. We also present some results and computational complexity metrics for the JPEG Snack Player.

27 pages, 4082 KiB  
Review
Applications of LiDAR in Agriculture and Future Research Directions
by Sourabhi Debnath, Manoranjan Paul and Tanmoy Debnath
J. Imaging 2023, 9(3), 57; https://doi.org/10.3390/jimaging9030057 - 24 Feb 2023
Cited by 27 | Viewed by 11384
Abstract
Light detection and ranging (LiDAR) sensors have gained an ever-increasing presence in the agricultural sector due to their non-destructive mode of capturing data. LiDAR sensors emit pulsed light waves that return to the sensor upon bouncing off surrounding objects. The distances that the pulses travel are calculated by measuring the time the pulses take to return to the source. There are many reported applications of the data obtained from LiDAR in agricultural sectors. LiDAR sensors are widely used to measure agricultural landscaping and topography and the structural characteristics of trees such as leaf area index and canopy volume; they are also used for crop biomass estimation, phenotype characterisation, crop growth, etc. A LiDAR-based system and LiDAR data can also be used to measure spray drift and detect soil properties. The literature further proposes that crop damage detection and yield prediction can be achieved with LiDAR data. This review focuses on different LiDAR-based system applications and data obtained from LiDAR in agricultural sectors. Comparisons of aspects of LiDAR data in different agricultural applications are also provided. Furthermore, future research directions based on this emerging technology are presented.
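The time-of-flight principle described in the abstract above can be sketched as follows. This is a minimal illustration rather than code from the review: the function name is hypothetical, and the vacuum speed of light is assumed for the propagation medium (light travels slightly slower in air). The pulse covers the sensor-to-object distance twice, so the range is half the distance travelled.

```python
SPEED_OF_LIGHT_M_PER_S = 299_792_458.0  # vacuum value, assumed here for air


def range_from_round_trip(round_trip_s):
    """Distance in metres to the reflecting object for one returned pulse.

    The pulse travels to the object and back, so the one-way range is
    half of the total distance covered during the round-trip time.
    """
    if round_trip_s < 0:
        raise ValueError("round-trip time must be non-negative")
    return SPEED_OF_LIGHT_M_PER_S * round_trip_s / 2.0


# A pulse returning after 100 nanoseconds corresponds to roughly 15 m.
canopy_range_m = range_from_round_trip(100e-9)
```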

13 pages, 6623 KiB  
Article
Remote Interactive Surgery Platform (RISP): Proof of Concept for an Augmented-Reality-Based Platform for Surgical Telementoring
by Yannik Kalbas, Hoijoon Jung, John Ricklin, Ge Jin, Mingjian Li, Thomas Rauer, Shervin Dehghani, Nassir Navab, Jinman Kim, Hans-Christoph Pape and Sandro-Michael Heining
J. Imaging 2023, 9(3), 56; https://doi.org/10.3390/jimaging9030056 - 23 Feb 2023
Cited by 6 | Viewed by 2806
Abstract
The “Remote Interactive Surgery Platform” (RISP) is an augmented reality (AR)-based platform for surgical telementoring. It builds upon recent advances in mixed reality head-mounted displays (MR-HMD) and associated immersive visualization technologies to assist the surgeon during an operation. It enables interactive, real-time collaboration with a remote consultant by sharing the operating surgeon’s field of view through the Microsoft (MS) HoloLens2 (HL2). Development of the RISP started during the Medical Augmented Reality Summer School 2021 and is still ongoing. The platform currently includes features such as three-dimensional annotations, bidirectional voice communication and interactive windows to display radiographs within the sterile field. This manuscript provides an overview of the RISP and preliminary results regarding its annotation accuracy and user experience measured with ten participants.

11 pages, 2119 KiB  
Article
Inter- and Intra-Observer Variability and the Effect of Experience in Cine-MRI for Adhesion Detection
by Bram de Wilde, Frank Joosten, Wulphert Venderink, Mirjam E. J. Davidse, Juliëtte Geurts, Hanneke Kruijt, Afke Vermeulen, Bibi Martens, Maxime V. P. Schyns, Josephine C. B. M. Huige, Myrte C. de Boer, Bart A. R. Tonino, Herman J. A. Zandvoort, Kirsti Lammert, Helka Parviainen, Aino-Maija Vuorinen, Suvi Syväranta, Ruben R. M. Vogels, Wiesje Prins, Andrea Coppola, Nancy Bossa, Richard P. G. ten Broek and Henkjan Huisman
J. Imaging 2023, 9(3), 55; https://doi.org/10.3390/jimaging9030055 - 23 Feb 2023
Cited by 3 | Viewed by 2671
Abstract
Cine-MRI for adhesion detection is a promising novel modality that can help the large group of patients developing pain after abdominal surgery. Few studies into its diagnostic accuracy are available, and none address observer variability. This retrospective study explores the inter- and intra-observer variability, diagnostic accuracy, and the effect of experience. A total of 15 observers with a variety of experience reviewed 61 sagittal cine-MRI slices, placing box annotations with a confidence score at locations suspected of adhesions. Five observers reviewed the slices again one year later. Inter- and intra-observer variability are quantified using Fleiss’ (inter) and Cohen’s (intra) κ and percentage agreement. Diagnostic accuracy is quantified with receiver operating characteristic (ROC) analysis based on a consensus standard. Inter-observer Fleiss’ κ values range from 0.04 to 0.34, showing poor to fair agreement. High general and cine-MRI experience led to significantly (p < 0.001) better agreement among observers. The intra-observer results show Cohen’s κ values between 0.37 and 0.53 for all observers, except one with a low κ of −0.11. Group AUC scores lie between 0.66 and 0.72, with individual observers reaching 0.78. This study confirms that cine-MRI can diagnose adhesions with respect to a radiologist consensus panel, and shows that experience improves reading cine-MRI. Observers without specific experience adapt to this modality quickly after a short online tutorial. Observer agreement is fair at best, and area under the receiver operating characteristic curve (AUC) scores leave room for improvement. Consistently interpreting this novel modality needs further research, for instance, by developing reporting guidelines or artificial intelligence-based methods.
(This article belongs to the Section Medical Imaging)
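For readers unfamiliar with the intra-observer statistic quoted in the abstract above, the sketch below computes Cohen's κ for two readings of the same cases as (observed agreement − chance agreement) / (1 − chance agreement). The function name and the binary toy readings are assumptions for illustration; they are not the study's data, and the study's own procedure for turning box annotations into per-case labels is not reproduced here.

```python
from collections import Counter


def cohens_kappa(first, second):
    """Cohen's kappa for two equally long sequences of categorical labels."""
    if len(first) != len(second) or not first:
        raise ValueError("readings must be non-empty and of equal length")
    n = len(first)
    # Observed agreement: share of cases given the same label in both readings.
    observed = sum(a == b for a, b in zip(first, second)) / n
    # Chance agreement: product of the marginal label frequencies, summed over labels.
    counts_first = Counter(first)
    counts_second = Counter(second)
    expected = sum(counts_first[label] * counts_second[label]
                   for label in counts_first) / (n * n)
    if expected == 1.0:
        return 1.0  # both readings use a single identical label throughout
    return (observed - expected) / (1.0 - expected)


# Hypothetical readings of ten slices (1 = adhesion suspected, 0 = not suspected)
reading_one = [1, 1, 0, 0, 1, 0, 1, 0, 0, 0]
reading_two = [1, 0, 0, 0, 1, 0, 1, 1, 0, 0]
kappa = cohens_kappa(reading_one, reading_two)
```

Here the readings agree on 8 of 10 slices, but because agreement of 0.52 would be expected by chance from the label frequencies alone, κ is noticeably lower than the raw percentage agreement.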