Search Results (3)

Search Parameters:
Keywords = Cumulative Spectral Gradient (CSG) metric

23 pages, 4501 KB  
Article
Complexity-Driven Adversarial Validation for Corrupted Medical Imaging Data
by Diego Renza, Jorge Brieva and Ernesto Moya-Albor
Information 2026, 17(2), 125; https://doi.org/10.3390/info17020125 - 29 Jan 2026
Viewed by 384
Abstract
Distribution shifts commonly arise in real-world machine learning scenarios in which the fundamental assumption that training and test data are drawn from independent and identically distributed samples is violated. In medical data, such shifts often occur during data acquisition and pose a significant challenge to the robustness and reliability of artificial intelligence systems in clinical practice. Moreover, quantifying these shifts without training a model remains a key open problem. This paper proposes a comprehensive methodological framework for evaluating the impact of such shifts on medical image datasets under artificial transformations that simulate acquisition variations, leveraging the Cumulative Spectral Gradient (CSG) score as a measure of the multiclass classification complexity induced by distributional changes. Building on prior work, the approach is extended to twelve 2D medical imaging benchmarks from the MedMNIST collection, covering both binary and multiclass tasks as well as grayscale and RGB modalities. We evaluate the metric by analyzing its robustness to clinically inspired distribution shifts that are systematically simulated through motion blur, additive noise, brightness and contrast variation, and sharpness variation, each applied at three severity levels. The result is a large-scale benchmark that enables a detailed analysis of how dataset characteristics, transformation types, and distortion severity influence distribution shifts. The findings show that while the metric remains generally stable under noise and focus distortions, it is highly sensitive to variations in brightness and contrast. In addition, the proposed methodology is compared against Cleanlab's widely used Non-IID score on the RetinaMNIST dataset using a pre-trained ResNet-50 model, including both class-wise analysis and a correlation assessment between the metrics. Finally, interpretability is incorporated through class activation map analysis on BloodMNIST and its corrupted variants to support and contextualize the quantitative findings.
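The graded-severity corruption scheme the abstract describes can be sketched with plain NumPy. This is an illustrative assumption, not the paper's actual parameters: the severity-to-strength mappings below are made up, and motion blur and sharpness variation are omitted for brevity.

```python
import numpy as np

def add_gaussian_noise(img, severity):
    # Severity levels 1-3 map to increasing noise standard deviation
    # (illustrative values, not the paper's).
    sigma = {1: 0.04, 2: 0.08, 3: 0.16}[severity]
    noisy = img + np.random.default_rng(0).normal(0.0, sigma, img.shape)
    return np.clip(noisy, 0.0, 1.0)

def adjust_brightness(img, severity):
    # Additive brightness shift, growing with severity.
    delta = {1: 0.1, 2: 0.2, 3: 0.4}[severity]
    return np.clip(img + delta, 0.0, 1.0)

def adjust_contrast(img, severity):
    # Rescale pixel values about the image mean; lower factor = lower contrast.
    factor = {1: 0.75, 2: 0.5, 3: 0.25}[severity]
    mean = img.mean()
    return np.clip((img - mean) * factor + mean, 0.0, 1.0)

# Build a small corrupted benchmark from one clean (normalized) image.
clean = np.random.default_rng(42).random((28, 28))
corrupted = {
    (name, s): fn(clean, s)
    for name, fn in [("noise", add_gaussian_noise),
                     ("brightness", adjust_brightness),
                     ("contrast", adjust_contrast)]
    for s in (1, 2, 3)
}
```

Applying each transformation at each severity to every image in a dataset yields the grid of shifted variants on which a complexity score such as CSG can then be compared against the clean baseline.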
17 pages, 1102 KB  
Article
Identifying and Mitigating Label Noise in Deep Learning for Image Classification
by César González-Santoyo, Diego Renza and Ernesto Moya-Albor
Technologies 2025, 13(4), 132; https://doi.org/10.3390/technologies13040132 - 1 Apr 2025
Cited by 6 | Viewed by 5842
Abstract
Labeling errors in datasets are a persistent challenge in machine learning because they introduce noise and bias and reduce a model's ability to generalize. This study proposes a novel methodology for detecting and correcting mislabeled samples in image datasets by using the Cumulative Spectral Gradient (CSG) metric to assess the intrinsic complexity of the data. The methodology is applied to the noisy CIFAR-10/100 and CIFAR-10n/100n datasets, where mislabeled samples in CIFAR-10n/100n are identified and relabeled using CIFAR-10/100 as a reference. DenseNet and Xception models pre-trained on ImageNet are fine-tuned to evaluate the impact of label correction on model performance. Evaluation metrics based on the confusion matrix are used to compare performance on the original, noisy, and label-corrected datasets. The results show that correcting mislabeled samples significantly improves the accuracy and robustness of the model, highlighting the importance of dataset quality in machine learning.
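As a rough illustration of the detect-and-relabel loop described above (a generic confident-learning-style heuristic, not the CSG-based procedure the paper proposes), one can flag samples whose assigned label receives unusually low predicted probability and propose the model's most confident class as the correction. The function name and thresholding rule are assumptions for illustration.

```python
import numpy as np

def flag_mislabeled(probs, labels):
    """Flag likely mislabeled samples given an (n_samples, n_classes)
    matrix of predicted probabilities and integer labels.
    A sample is suspect when the probability of its given label falls
    below that class's average self-probability."""
    n_classes = probs.shape[1]
    # Per-class threshold: mean predicted probability of class c among
    # the samples currently labeled c.
    thresholds = np.array([probs[labels == c, c].mean()
                           for c in range(n_classes)])
    given = probs[np.arange(len(labels)), labels]
    suspect = given < thresholds[labels]
    # Proposed relabel: the model's most confident class per sample.
    proposed = probs.argmax(axis=1)
    return suspect, proposed

# Toy example: sample 2 is labeled 0 but looks like class 1.
probs = np.array([[0.9, 0.1],
                  [0.8, 0.2],
                  [0.2, 0.8],
                  [0.1, 0.9]])
labels = np.array([0, 0, 0, 1])
suspect, proposed = flag_mislabeled(probs, labels)
```

In the toy example only the third sample is flagged, and its proposed relabel is class 1; a real pipeline would re-train on the corrected labels and compare confusion-matrix metrics, as the abstract outlines.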

18 pages, 1508 KB  
Article
Adversarial Validation in Image Classification Datasets by Means of Cumulative Spectral Gradient
by Diego Renza, Ernesto Moya-Albor and Adrian Chavarro
Algorithms 2024, 17(11), 531; https://doi.org/10.3390/a17110531 - 19 Nov 2024
Cited by 2 | Viewed by 1903
Abstract
The main objective of a machine learning (ML) system is to obtain a trained model from input data such that predictions can be made on new i.i.d. (independently and identically distributed) data with the lowest possible error. But how can we assess whether the training and test data have a similar distribution? To answer this question, this paper presents a proposal to determine the degree of distribution shift between two datasets. To this end, a metric for evaluating complexity in datasets is used, applicable to multi-class problems, comparing each pair of classes across the two sets. The proposed methodology has been applied to three well-known datasets, MNIST, CIFAR-10, and CIFAR-100, together with corrupted versions of each. Through this methodology, it is possible to evaluate which types of modification have the greatest impact on model generalization without the need to train multiple models multiple times, and also to determine which classes are most affected by corruption.
(This article belongs to the Special Issue Machine Learning Algorithms for Image Understanding and Analysis)
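For context, the generic adversarial-validation idea the title builds on can be sketched as a "domain classifier": train a model to distinguish training samples from test samples, and read an AUC near 0.5 as evidence that the two sets are hard to tell apart, i.e. little distribution shift. The NumPy sketch below uses a hand-rolled logistic regression and is an illustration under stated assumptions, not the CSG-based variant the paper proposes.

```python
import numpy as np

def adversarial_validation_auc(train_X, test_X, epochs=200, lr=0.1):
    """Return the AUC of a logistic-regression domain classifier that
    tries to separate train (label 0) from test (label 1) samples.
    AUC near 0.5 suggests similar distributions; near 1.0, a shift."""
    X = np.vstack([train_X, test_X])
    y = np.concatenate([np.zeros(len(train_X)), np.ones(len(test_X))])
    # Standardize features for stable gradient descent.
    X = (X - X.mean(axis=0)) / (X.std(axis=0) + 1e-8)
    w = np.zeros(X.shape[1])
    b = 0.0
    for _ in range(epochs):
        p = 1.0 / (1.0 + np.exp(-(X @ w + b)))  # sigmoid
        grad = p - y                            # dLoss/dlogit
        w -= lr * (X.T @ grad) / len(y)
        b -= lr * grad.mean()
    scores = X @ w + b
    # Rank-based AUC (0-based ranks, no ties expected here).
    ranks = scores.argsort().argsort()
    n1, n0 = int((y == 1).sum()), int((y == 0).sum())
    return (ranks[y == 1].sum() - n1 * (n1 - 1) / 2) / (n1 * n0)

# Same distribution -> AUC near 0.5; mean-shifted -> AUC near 1.0.
rng = np.random.default_rng(0)
train = rng.normal(0.0, 1.0, (200, 5))
auc_same = adversarial_validation_auc(train, rng.normal(0.0, 1.0, (200, 5)))
auc_shift = adversarial_validation_auc(train, rng.normal(2.0, 1.0, (200, 5)))
```

The CSG-based methodology in the paper pursues the same goal, quantifying shift per class pair, without having to train a discriminator at all.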
