J. Imaging, Volume 6, Issue 6 (June 2020) – 19 articles

Cover Story: In this paper, we summarize the application of a recent approach to vehicle detection and classification directly in the compressive measurement domain to human targets. The raw videos were collected using a pixel-wise code exposure (PCE) camera, which condensed multiple frames into one frame. A combination of two deep learning-based algorithms (you only look once (YOLO) and residual network (ResNet)) was used for detection and confirmation. Optical and mid-wave infrared (MWIR) videos from a well-known database (SENSIAC) were used in our experiments. Extensive experiments demonstrated that the proposed framework was feasible for target detection up to 1500 m, but target confirmation needs more research.
27 pages, 9802 KiB  
Article
Spectral Processing for Denoising and Compression of 3D Meshes Using Dynamic Orthogonal Iterations
by Gerasimos Arvanitis, Aris S. Lalos and Konstantinos Moustakas
J. Imaging 2020, 6(6), 55; https://doi.org/10.3390/jimaging6060055 - 26 Jun 2020
Cited by 1 | Viewed by 3301
Abstract
Recently, spectral methods have been extensively used in the processing of 3D meshes. They usually take advantage of some unique properties that the eigenvalues and the eigenvectors of the decomposed Laplacian matrix have. However, despite their superior behavior and performance, they suffer from computational complexity, especially as the number of vertices of the model increases. In this work, we suggest the use of a fast and efficient spectral processing approach applied to dense static and dynamic 3D meshes, which can be ideally suited for real-time denoising and compression applications. To increase the computational efficiency of the method, we exploit potential spectral coherence between adjacent parts of a mesh and then we apply an orthogonal iteration approach for the tracking of the graph Laplacian eigenspaces. Additionally, we present a dynamic version that automatically identifies the optimal subspace size that satisfies a given reconstruction quality threshold. In this way, we overcome the problem of perceptual distortions caused by using a fixed subspace size for all the separated parts. Extensive simulations carried out using different 3D models in different use cases (i.e., compression and denoising) showed that the proposed approach is very fast, especially in comparison with the SVD-based spectral processing approaches, while the reconstructed models are of similar or even better quality. The experimental analysis also showed that the proposed approach could be used by other denoising methods as a preprocessing step, in order to optimize the reconstruction quality of their results and decrease their computational complexity, since they need fewer iterations to converge.

16 pages, 5972 KiB  
Article
Image Reconstruction Based on Novel Sets of Generalized Orthogonal Moments
by R. M. Farouk
J. Imaging 2020, 6(6), 54; https://doi.org/10.3390/jimaging6060054 - 23 Jun 2020
Viewed by 2862
Abstract
In this work, we present a general framework for reconstruction of intensity images based on new sets of Generalized Fractional order of Chebyshev orthogonal Moments (GFCMs), a novel set of Fractional order orthogonal Laguerre Moments (FLMs) and Generalized Fractional order orthogonal Laguerre Moments (GFLMs). The fractional and generalized recurrence relations of fractional order Chebyshev functions are defined. The fractional and generalized fractional order Laguerre recurrence formulas are given. The newly presented generalized fractional order moments are tested against existing orthogonal moments: classical Chebyshev moments, Laguerre moments, and Fractional order Chebyshev Moments (FCMs). The numerical results show the importance of our general framework, which gives a comprehensive study of intensity image representation based on GFCMs, FLMs, and GFLMs. In addition, the fractional parameters give flexibility in studying global features of images at different positions and scales of the given moments.

22 pages, 4422 KiB  
Article
Combination of LBP Bin and Histogram Selections for Color Texture Classification
by Alice Porebski, Vinh Truong Hoang, Nicolas Vandenbroucke and Denis Hamad
J. Imaging 2020, 6(6), 53; https://doi.org/10.3390/jimaging6060053 - 23 Jun 2020
Cited by 7 | Viewed by 3811
Abstract
LBP (Local Binary Pattern) is a very popular texture descriptor largely used in computer vision. In most applications, LBP histograms are exploited as texture features leading to a high dimensional feature space, especially for color texture classification problems. In the past few years, different solutions were proposed to reduce the dimension of the feature space based on the LBP histogram. Most of these approaches apply feature selection methods in order to find the most discriminative bins. Recently, another strategy proposed selecting the most discriminant LBP histograms in their entirety. This paper aims to improve on these previous approaches, and presents a combination of LBP bin and histogram selections, where a histogram ranking method is applied before a bin selection procedure. The proposed approach is evaluated on five benchmark image databases and the obtained results show the effectiveness of the combination of LBP bin and histogram selections, which outperforms the simple LBP bin and LBP histogram selection approaches when they are applied independently.

19 pages, 11380 KiB  
Review
Explainable Deep Learning Models in Medical Image Analysis
by Amitojdeep Singh, Sourya Sengupta and Vasudevan Lakshminarayanan
J. Imaging 2020, 6(6), 52; https://doi.org/10.3390/jimaging6060052 - 20 Jun 2020
Cited by 406 | Viewed by 28184
Abstract
Deep learning methods have been very effective for a variety of medical diagnostic tasks and have even outperformed human experts on some of those. However, the black-box nature of the algorithms has restricted their clinical use. Recent explainability studies aim to show the features that influence the decision of a model the most. The majority of literature reviews of this area have focused on taxonomy, ethics, and the need for explanations. A review of the current applications of explainable deep learning for different medical imaging tasks is presented here. The various approaches, challenges for clinical deployment, and the areas requiring further research are discussed here from a practical standpoint of a deep learning researcher designing a system for the clinical end-users.
(This article belongs to the Special Issue Deep Learning in Medical Image Analysis)

15 pages, 2225 KiB  
Article
Classification Models for Skin Tumor Detection Using Texture Analysis in Medical Images
by Marcos A. M. Almeida and Iury A. X. Santos
J. Imaging 2020, 6(6), 51; https://doi.org/10.3390/jimaging6060051 - 19 Jun 2020
Cited by 33 | Viewed by 5562
Abstract
Medical images have made a great contribution to early diagnosis. In this study, a new strategy is presented for analyzing medical images of skin with melanoma and nevus to model, classify and identify lesions on the skin. Machine learning was applied to data generated from first- and second-order statistical features, the Gray Level Co-occurrence Matrix (GLCM), keypoints, and color channel information (red, green, blue, and grayscale images of the skin) to characterize decisive information for classifying the images. This work proposes a strategy for analyzing skin images that aims to choose the best classifier model for identifying melanoma, with the objective of assisting dermatologists, especially towards an early diagnosis.
(This article belongs to the Special Issue Deep Learning in Medical Image Analysis)

20 pages, 1384 KiB  
Article
Asynchronous Semantic Background Subtraction
by Anthony Cioppa, Marc Braham and Marc Van Droogenbroeck
J. Imaging 2020, 6(6), 50; https://doi.org/10.3390/jimaging6060050 - 18 Jun 2020
Cited by 8 | Viewed by 3822
Abstract
The method of Semantic Background Subtraction (SBS), which combines semantic segmentation and background subtraction, has recently emerged for the task of segmenting moving objects in video sequences. While SBS has been shown to improve background subtraction, a major difficulty is that it combines two streams generated at different frame rates. This results in SBS operating at the slowest frame rate of the two streams, usually that of the semantic segmentation algorithm. We present a method, referred to as “Asynchronous Semantic Background Subtraction” (ASBS), able to combine a semantic segmentation algorithm with any background subtraction algorithm asynchronously. It achieves performance close to that of SBS while operating at the fastest possible frame rate, that of the background subtraction algorithm. Our method consists of analyzing the temporal evolution of pixel features to possibly replicate the decisions previously enforced by semantics when no semantic information is computed. We showcase ASBS with several background subtraction algorithms and also add a feedback mechanism that feeds the background model of the background subtraction algorithm to improve its updating strategy and, consequently, enhance the decision. Experiments show that we systematically improve the performance, even when the semantic stream has a much slower frame rate than the frame rate of the background subtraction algorithm. In addition, we establish that, with the help of ASBS, a real-time background subtraction algorithm, such as ViBe, stays real time and competes with some of the best non-real-time unsupervised background subtraction algorithms such as SuBSENSE.

23 pages, 8173 KiB  
Article
Image Processing Technique and Hidden Markov Model for an Elderly Care Monitoring System
by Swe Nwe Nwe Htun, Thi Thi Zin and Pyke Tin
J. Imaging 2020, 6(6), 49; https://doi.org/10.3390/jimaging6060049 - 13 Jun 2020
Cited by 18 | Viewed by 4503
Abstract
Advances in image processing technologies have provided more precise views in medical and health care management systems. Among many other topics, this paper focuses on several aspects of video-based monitoring systems for elderly people living independently. Major concerns are patients with chronic diseases and adults with a decline in physical fitness, as well as falling among elderly people, which is a source of life-threatening injuries and a leading cause of death. Therefore, in this paper, we propose a video-vision-based monitoring system using image processing technology and a Hidden Markov Model for differentiating falls from normal states. Specifically, the proposed system is composed of four modules: (1) object detection; (2) feature extraction; (3) analysis for differentiating normal states from falls; and (4) a decision-making process using a Hidden Markov Model for sequential states of abnormal and normal. In the object detection module, background and foreground segmentation is performed by applying the Mixture of Gaussians model, and graph cut is applied for foreground refinement. In the feature extraction module, the postures and positions of detected objects are estimated by applying the hybrid features of the virtual grounding point, inclusive of its related area and the aspect ratio of the object. In the analysis module, for differentiating normal, abnormal, or falling states, statistical computations called the moving average and modified difference are conducted, both of which are employed to estimate the points and periods of falls. Then, the local maximum or local minimum and the half width value are determined in the observed modified difference to more precisely estimate the period of a falling state. Finally, the decision-making process is conducted by developing a Hidden Markov Model. Experiments on the Le2i fall detection dataset showed that our proposed system is robust and reliable and has a high detection rate.
(This article belongs to the Special Issue Robust Image Processing)

21 pages, 4784 KiB  
Article
A Versatile Machine Vision Algorithm for Real-Time Counting Manually Assembled Pieces
by Paola Pierleoni, Alberto Belli, Lorenzo Palma and Luisiana Sabbatini
J. Imaging 2020, 6(6), 48; https://doi.org/10.3390/jimaging6060048 - 13 Jun 2020
Cited by 8 | Viewed by 3685
Abstract
The Industry 4.0 paradigm is based on transparency and co-operation and, hence, on monitoring and pervasive data collection. In highly standardized contexts, it is usually easy to gather data using available technologies, while, in complex environments, only very advanced and customizable technologies, such as Computer Vision, are intelligent enough to perform such monitoring tasks well. By the term “complex environment”, we especially refer to those contexts where human activity which cannot be fully standardized prevails. In this work, we present a Machine Vision algorithm which is able to effectively deal with human interactions inside a framed area. By exploiting inter-frame analysis, image pre-processing, binarization, morphological operations, and blob detection, our solution is able to count the pieces assembled by an operator using a real-time video input. The solution is compared with a more advanced Machine Learning-based custom object detector, which is taken as reference. The proposed solution demonstrates a very good performance in terms of Sensitivity, Specificity, and Accuracy when tested on a real situation in an Italian manufacturing firm. The value of our solution, compared with the reference object detector, is that it requires no training and is therefore extremely flexible, requiring only minor changes to the working parameters to translate to other objects, making it appropriate for plant-wide implementation.
(This article belongs to the Special Issue Augmented Vision for Industry 4.0)
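The processing chain named in the abstract above (inter-frame analysis, binarization, blob detection) can be illustrated with a minimal pure-Python sketch. This is not the authors' implementation: the function names, the threshold, and the `min_area` filter are hypothetical, frames are assumed to be small grayscale grids given as lists of lists, and a simple area filter stands in for the morphological operations.

```python
from collections import deque


def binarize_difference(prev_frame, frame, threshold):
    """Inter-frame analysis plus binarization.

    Returns a mask with 1 where a pixel changed by more than `threshold`
    between the two grayscale frames, and 0 elsewhere.
    """
    return [
        [1 if abs(a - b) > threshold else 0 for a, b in zip(row_prev, row_cur)]
        for row_prev, row_cur in zip(prev_frame, frame)
    ]


def count_blobs(mask, min_area=1):
    """Count 4-connected foreground regions with at least `min_area` pixels.

    The area filter is an assumed stand-in for morphological noise removal.
    """
    rows, cols = len(mask), len(mask[0])
    seen = [[False] * cols for _ in range(rows)]
    blobs = 0
    for r in range(rows):
        for c in range(cols):
            if mask[r][c] == 0 or seen[r][c]:
                continue
            # Breadth-first flood fill over one connected region.
            area = 0
            queue = deque([(r, c)])
            seen[r][c] = True
            while queue:
                y, x = queue.popleft()
                area += 1
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if (0 <= ny < rows and 0 <= nx < cols
                            and mask[ny][nx] == 1 and not seen[ny][nx]):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            if area >= min_area:
                blobs += 1
    return blobs


# Hypothetical 5x6 frames: an empty scene, then two pieces and one noisy pixel.
previous = [[0] * 6 for _ in range(5)]
current = [
    [200, 200, 0, 0, 0, 0],
    [200, 200, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0],
    [0, 0, 0, 180, 180, 0],
    [90, 0, 0, 0, 0, 0],
]
mask = binarize_difference(previous, current, threshold=50)
pieces = count_blobs(mask, min_area=2)
```

With these made-up frames, the isolated pixel is discarded by the area filter, so only the two larger regions are counted. A production system would more likely rely on an image-processing library for these steps.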

11 pages, 1061 KiB  
Article
Deep Multimodal Learning for the Diagnosis of Autism Spectrum Disorder
by Michelle Tang, Pulkit Kumar, Hao Chen and Abhinav Shrivastava
J. Imaging 2020, 6(6), 47; https://doi.org/10.3390/jimaging6060047 - 10 Jun 2020
Cited by 36 | Viewed by 7707
Abstract
Recent medical imaging technologies, specifically functional magnetic resonance imaging (fMRI), have advanced the diagnosis of neurological and neurodevelopmental disorders by allowing scientists and physicians to observe the activity within and between different regions of the brain. Deep learning methods have frequently been implemented to analyze images produced by such technologies and perform disease classification tasks; however, current state-of-the-art approaches do not take advantage of all the information offered by fMRI scans. In this paper, we propose a deep multimodal model that learns a joint representation from two types of connectomic data offered by fMRI scans. Incorporating two functional imaging modalities in an automated end-to-end autism diagnosis system will offer a more comprehensive picture of the neural activity, and thus allow for more accurate diagnoses. Our multimodal training strategy achieves a classification accuracy of 74% and a recall of 95%, as well as an F1 score of 0.805, and its overall performance is superior to using only one type of functional data.
(This article belongs to the Special Issue Deep Learning in Medical Image Analysis)

32 pages, 520 KiB  
Review
A Review on Computer Vision-Based Methods for Human Action Recognition
by Mahmoud Al-Faris, John Chiverton, David Ndzi and Ahmed Isam Ahmed
J. Imaging 2020, 6(6), 46; https://doi.org/10.3390/jimaging6060046 - 10 Jun 2020
Cited by 72 | Viewed by 8998
Abstract
Human action recognition targets recognising different actions from a sequence of observations and different environmental conditions. Vision-based action recognition research is applicable to a wide range of applications, including video surveillance, tracking, health care, and human–computer interaction. However, building accurate and effective vision-based recognition systems remains a major challenge in the field of computer vision. This review introduces the most recent human action recognition systems and presents the advances of state-of-the-art methods. To this end, the research is organized from hand-crafted representation-based methods, including holistic and local representation methods with various sources of data, to deep learning technology, including discriminative and generative models and multi-modality-based methods. Next, the most common datasets of human action recognition are presented. This review introduces several analyses, comparisons and recommendations that help to identify directions for future research.

9 pages, 1577 KiB  
Article
The Reconstruction of a Bronze Battle Axe and Comparison of Inflicted Damage Injuries Using Neutron Tomography, Manufacturing Modeling, and X-ray Microtomography Data
by Maria Mednikova, Irina Saprykina, Sergey Kichanov and Denis Kozlenko
J. Imaging 2020, 6(6), 45; https://doi.org/10.3390/jimaging6060045 - 8 Jun 2020
Cited by 9 | Viewed by 5604
Abstract
A massive bronze battle axe from the Abashevo archaeological culture was studied using neutron tomography and manufacturing modeling from production molds. Detailed structural data were acquired to simulate and model possible injuries and wounds caused by this battle axe. We report the results of neutron tomography experiments on the bronze battle axe, as well as manufactured plastic and virtual models of the traumas obtained at different strike angles from this axe. The reconstructed 3D models of the battle axe, plastic imprint model, and real wound and trauma traces on the bones of the ancient peoples of the Abashevo archaeological culture were obtained. Skulls with traces of injuries originate from archaeological excavations of the Pepkino burial mound of the Abashevo culture in the Volga region. The reconstruction and identification of the injuries and type of weapon on the restored skulls were performed. The complementary use of 3D visualization methods allowed us to make some assumptions on the cause of death of the people of the Abashevo culture and possible intra-tribal conflict in this cultural society. The obtained structural and anthropological data can be used to develop new concepts and methods for the archaeology of conflict.

16 pages, 786 KiB  
Article
Spatial Linear Mixed Effects Modelling for OCT Images: SLME Model
by Wenyue Zhu, Jae Yee Ku, Yalin Zheng, Paul C. Knox, Ruwanthi Kolamunnage-Dona and Gabriela Czanner
J. Imaging 2020, 6(6), 44; https://doi.org/10.3390/jimaging6060044 - 5 Jun 2020
Cited by 2 | Viewed by 3939
Abstract
Much recent research focuses on how to make disease detection more accurate as well as “slimmer”, i.e., allowing analysis with smaller datasets. Explanatory models are a hot research topic because they explain how the data are generated. We propose a spatial explanatory modelling approach that combines Optical Coherence Tomography (OCT) retinal imaging data with clinical information. Our model consists of a spatial linear mixed effects inference framework, which innovatively models the spatial topography of key information via mixed effects and spatial error structures, thus effectively modelling the shape of the thickness map. We show that our spatial linear mixed effects (SLME) model outperforms traditional analysis-of-variance approaches in the analysis of Heidelberg OCT retinal thickness data from a prospective observational study, involving 300 participants with diabetes and 50 age-matched controls. Our SLME model has a higher power for detecting the difference between disease groups, and it shows where the shape of retinal thickness profiles differs between the eyes of participants with diabetes and the eyes of healthy controls. In simulated data, the SLME model demonstrates how incorporating spatial correlations can increase the accuracy of the statistical inferences. This model is crucial in the understanding of the progression of retinal thickness changes in diabetic maculopathy to aid clinicians for early planning of effective treatment. It can be extended to disease monitoring and prognosis in other diseases and with other imaging technologies.
(This article belongs to the Special Issue MIUA2019)

14 pages, 1132 KiB  
Article
Examining the Relationship between Semiquantitative Methods Analysing Concentration-Time and Enhancement-Time Curves from Dynamic-Contrast Enhanced Magnetic Resonance Imaging and Cerebrovascular Dysfunction in Small Vessel Disease
by Jose Bernal, María Valdés-Hernández, Javier Escudero, Eleni Sakka, Paul A. Armitage, Stephen Makin, Rhian M. Touyz and Joanna M. Wardlaw
J. Imaging 2020, 6(6), 43; https://doi.org/10.3390/jimaging6060043 - 5 Jun 2020
Viewed by 3742
Abstract
Dynamic contrast-enhanced magnetic resonance imaging (DCE-MRI) can be used to examine the distribution of an intravenous contrast agent within the brain. Computational methods have been devised to analyse the contrast uptake/washout over time as reflections of cerebrovascular dysfunction. However, there have been few direct comparisons of their relative strengths and weaknesses. In this paper, we compare five semiquantitative methods comprising the slope and area under the enhancement-time curve, the slope and area under the concentration-time curve ($\mathrm{Slope}_{\mathrm{Con}}$ and $\mathrm{AUC}_{\mathrm{Con}}$), and changes in the power spectrum over time. We studied them in cerebrospinal fluid, normal tissues, stroke lesions, and white matter hyperintensities (WMH) using DCE-MRI scans from a cohort of patients with small vessel disease (SVD) who presented with mild stroke. The total SVD score was associated with $\mathrm{AUC}_{\mathrm{Con}}$ in WMH ($p < 0.05$), but not with the other four methods. In WMH, we found higher $\mathrm{AUC}_{\mathrm{Con}}$ was associated with younger age ($p < 0.001$) and fewer WMH ($p < 0.001$), whereas $\mathrm{Slope}_{\mathrm{Con}}$ increased with younger age ($p > 0.05$) and WMH burden ($p > 0.05$). Our results show the potential of different measures derived from concentration-time curves from the same DCE examination to demonstrate cerebrovascular dysfunction better than those derived from enhancement-time curves.
(This article belongs to the Special Issue MIUA2019)

19 pages, 1033 KiB  
Article
Origins of Hyperbolicity in Color Perception
by Nicoletta Prencipe, Valérie Garcin and Edoardo Provenzi
J. Imaging 2020, 6(6), 42; https://doi.org/10.3390/jimaging6060042 - 4 Jun 2020
Cited by 6 | Viewed by 3351
Abstract
In 1962, H. Yilmaz published a very original paper in which he showed the striking analogy between Lorentz transformations and the effect of illuminant changes on color perception. As a consequence, he argued that a perceived color space endowed with the Minkowski metric is a good approximation to model color vision. The contribution of this paper is twofold: firstly, we provide a mathematical formalization of Yilmaz’s argument about the relationship between Lorentz transformations and the perceptual effect of illuminant changes. Secondly, we show that, within Yilmaz’s model, the color space can be coherently endowed with the Minkowski metric only by imposing the Euclidean metric on the hue-chroma plane. This fact motivates the need for further investigation into both the proper definition of and interrelationship among the color coordinates, and the geometry and metrics of perceptual color spaces.

8 pages, 1117 KiB  
Article
Do We Train on Test Data? Purging CIFAR of Near-Duplicates
by Björn Barz and Joachim Denzler
J. Imaging 2020, 6(6), 41; https://doi.org/10.3390/jimaging6060041 - 2 Jun 2020
Cited by 48 | Viewed by 5456
Abstract
The CIFAR-10 and CIFAR-100 datasets are two of the most heavily benchmarked datasets in computer vision and are often used to evaluate novel methods and model architectures in the field of deep learning. However, we find that 3.3% and 10% of the images from the test sets of these datasets have duplicates in the training set. These duplicates are easily recognizable by memorization and may, hence, bias the comparison of image recognition techniques regarding their generalization capability. To eliminate this bias, we provide the “fair CIFAR” (ciFAIR) dataset, where we replaced all duplicates in the test sets with new images sampled from the same domain. The training set remains unchanged, in order not to invalidate pre-trained models. We then re-evaluate the classification performance of various popular state-of-the-art CNN architectures on these new test sets to investigate whether recent research has overfitted to memorizing data instead of learning abstract concepts. We find a significant drop in classification accuracy of between 9% and 14% relative to the original performance on the duplicate-free test set. We make both the ciFAIR dataset and pre-trained models publicly available and furthermore maintain a leaderboard for tracking the state of the art.

18 pages, 4602 KiB  
Article
Detection and Confirmation of Multiple Human Targets Using Pixel-Wise Code Aperture Measurements
by Chiman Kwan, David Gribben, Akshay Rangamani, Trac Tran, Jack Zhang and Ralph Etienne-Cummings
J. Imaging 2020, 6(6), 40; https://doi.org/10.3390/jimaging6060040 - 29 May 2020
Cited by 14 | Viewed by 3425
Abstract
Compressive video measurements can save bandwidth and data storage. However, conventional approaches to target detection require the compressive measurements to be reconstructed before any detectors are applied. This is not only time consuming but also may lose information in the reconstruction process. In this paper, we summarized the application of a recent approach to vehicle detection and classification directly in the compressive measurement domain to human targets. The raw videos were collected using a pixel-wise code exposure (PCE) camera, which condensed multiple frames into one frame. A combination of two deep learning-based algorithms (you only look once (YOLO) and residual network (ResNet)) was used for detection and confirmation. Optical and mid-wave infrared (MWIR) videos from a well-known database (SENSIAC) were used in our experiments. Extensive experiments demonstrated that the proposed framework was feasible for target detection up to 1500 m, but target confirmation needs more research.

13 pages, 551 KiB  
Article
Breast Tumor Classification Using an Ensemble Machine Learning Method
by Adel S. Assiri, Saima Nazir and Sergio A. Velastin
J. Imaging 2020, 6(6), 39; https://doi.org/10.3390/jimaging6060039 - 29 May 2020
Cited by 114 | Viewed by 8361
Abstract
Breast cancer is the most common cause of death for women worldwide. Thus, the ability of artificial intelligence systems to detect possible breast cancer is very important. In this paper, an ensemble classification mechanism is proposed based on a majority voting mechanism. First, the performance of different state-of-the-art machine learning classification algorithms was evaluated on the Wisconsin Breast Cancer Dataset (WBCD). The three best classifiers were then selected based on their F3 score. The F3 score is used to emphasize the importance of false negatives (recall) in breast cancer classification. Then, these three classifiers (simple logistic regression learning, support vector machine learning with stochastic gradient descent optimization, and a multilayer perceptron network) are used for ensemble classification using a voting mechanism. We also evaluated the performance of hard and soft voting mechanisms. For hard voting, a majority-based voting mechanism was used, and for soft voting we used voting methods based on the average, product, maximum, and minimum of probabilities. The hard voting (majority-based voting) mechanism shows better performance with 99.42%, as compared to the state-of-the-art algorithm for WBCD.
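The scoring and voting rules named in this abstract can be sketched in a few lines of plain Python. This is a minimal illustration, not the authors' code: the function names `f_beta`, `hard_vote`, and `soft_vote` and the example probabilities are assumptions made for the sketch.

```python
from collections import Counter
from math import prod


def f_beta(tp, fp, fn, beta=3.0):
    """F-beta score from confusion counts; beta > 1 weights recall above precision."""
    b2 = beta * beta
    denom = (1 + b2) * tp + b2 * fn + fp
    return (1 + b2) * tp / denom if denom else 0.0


def hard_vote(labels):
    """Majority vote over the class labels predicted by each classifier."""
    return Counter(labels).most_common(1)[0][0]


def soft_vote(probas, rule="average"):
    """Combine per-classifier class probabilities and return the winning class index.

    probas: one probability vector per classifier, all of the same length.
    rule: 'average', 'product', 'max', or 'min'.
    """
    combine = {
        "average": lambda col: sum(col) / len(col),
        "product": prod,
        "max": max,
        "min": min,
    }[rule]
    # zip(*probas) groups the probabilities of each class across classifiers.
    scores = [combine(col) for col in zip(*probas)]
    return max(range(len(scores)), key=scores.__getitem__)


if __name__ == "__main__":
    # Hypothetical outputs of three classifiers for classes [0, 1].
    probas = [[0.9, 0.1], [0.4, 0.6], [0.45, 0.55]]
    labels = [max(range(2), key=p.__getitem__) for p in probas]
    print(hard_vote(labels))             # two of three classifiers pick class 1
    print(soft_vote(probas, "average"))  # one confident classifier can outweigh them
```

With beta = 3, a false negative costs nine times as much as a false positive in the denominator, which is why the score suits a screening task where missed tumors are the costlier error. The example also shows that hard and soft voting can disagree on the same inputs.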

18 pages, 2094 KiB  
Article
Spatially and Spectrally Concatenated Neural Networks for Efficient Lossless Compression of Hyperspectral Imagery
by Zhuocheng Jiang, W. David Pan and Hongda Shen
J. Imaging 2020, 6(6), 38; https://doi.org/10.3390/jimaging6060038 - 28 May 2020
Cited by 12 | Viewed by 3746
Abstract
To achieve efficient lossless compression of hyperspectral images, we design a concatenated neural network, which is capable of extracting both spatial and spectral correlations for accurate pixel value prediction. Unlike conventional neural-network-based methods in the literature, the proposed neural network functions as an adaptive filter, thereby eliminating the need for pre-training using decompressed data. To meet the demand for low-complexity onboard processing, we use a shallow network with only two hidden layers for efficient feature extraction and predictive filtering. Extensive simulations on commonly used hyperspectral datasets and the standard CCSDS test datasets show that the proposed approach attains significant improvements over several other state-of-the-art methods, including standard compressors such as ESA, CCSDS-122, and CCSDS-123.
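The idea of a predictor that "functions as an adaptive filter" and needs no pre-training can be illustrated with a much simpler stand-in. The sketch below swaps the paper's concatenated neural network for a normalized-LMS-style linear predictor over a one-dimensional sample sequence. The function names, the filter order, and the step size are assumptions chosen for illustration only. The point it shows is that encoder and decoder adapt identical weights from already-decoded samples, so only integer residuals need to be stored and reconstruction is exact.

```python
def _predict(weights, context):
    """Integer prediction from the current weights and the previous samples."""
    return round(sum(w * x for w, x in zip(weights, context)))


def _update(weights, context, error, step):
    """Normalized LMS-style weight update driven by the prediction error."""
    norm = sum(x * x for x in context) + 1.0  # +1.0 avoids division by zero
    return [w + step * error * x / norm for w, x in zip(weights, context)]


def encode(samples, order=3, step=0.5):
    """Return integer prediction residuals for a sequence of integer samples."""
    weights = [0.0] * order
    history = [0] * order
    residuals = []
    for s in samples:
        err = s - _predict(weights, history)
        residuals.append(err)
        weights = _update(weights, history, err, step)
        history = history[1:] + [s]
    return residuals


def decode(residuals, order=3, step=0.5):
    """Rebuild the samples by replaying the same adaptation as the encoder."""
    weights = [0.0] * order
    history = [0] * order
    samples = []
    for err in residuals:
        s = _predict(weights, history) + err
        samples.append(s)
        weights = _update(weights, history, err, step)
        history = history[1:] + [s]
    return samples


if __name__ == "__main__":
    band = [100, 102, 104, 107, 109, 112, 115, 117]  # hypothetical pixel values
    assert decode(encode(band)) == band
```

In a full compressor the residuals would be passed to an entropy coder; that stage is omitted here.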

22 pages, 10311 KiB  
Article
Explainable Machine Learning Framework for Image Classification Problems: Case Study on Glioma Cancer Prediction
by Emmanuel Pintelas, Meletis Liaskos, Ioannis E. Livieris, Sotiris Kotsiantis and Panagiotis Pintelas
J. Imaging 2020, 6(6), 37; https://doi.org/10.3390/jimaging6060037 - 28 May 2020
Cited by 42 | Viewed by 6043
Abstract
Image classification is a very popular machine learning domain in which deep convolutional neural networks have become the dominant approach. These networks achieve remarkable prediction accuracy, but they are considered black box models since they cannot interpret their inner working mechanism or explain the main reasoning behind their predictions. There is a variety of real-world tasks, such as medical applications, in which interpretability and explainability play a significant role. When decisions on critical issues such as cancer prediction are made with black box models that achieve high prediction accuracy but provide no explanation for their predictions, accuracy alone cannot be considered sufficient or ethically acceptable. Reasoning and explanation are essential in order to trust these models and support such critical predictions. Nevertheless, the definition and the validation of the quality of a prediction model’s explanation can, in general, be considered extremely subjective and unclear. In this work, an accurate and interpretable machine learning framework is proposed for image classification problems, able to produce high-quality explanations. For this task, we develop a feature extraction and explanation extraction framework and also propose three basic general conditions that validate the quality of any model’s prediction explanation for any application domain. The feature extraction framework extracts and creates transparent and meaningful high-level features for images, while the explanation extraction framework is responsible for creating good explanations relying on these extracted features and the prediction model’s inner function with respect to the proposed conditions. As a case study application, brain tumor magnetic resonance images were utilized for predicting glioma cancer. Our results demonstrate the efficiency of the proposed model, since it achieved sufficient prediction accuracy while also being interpretable and explainable in simple human terms.
(This article belongs to the Special Issue Deep Learning in Medical Image Analysis)