Article

A Noisy Sample Selection Framework Based on a Mixup Loss and Recalibration Strategy

School of Information Technology, Jiangsu Open University, Nanjing 210036, China
* Author to whom correspondence should be addressed.
These authors contributed equally to this work.
Mathematics 2024, 12(15), 2389; https://doi.org/10.3390/math12152389
Submission received: 21 June 2024 / Revised: 25 July 2024 / Accepted: 30 July 2024 / Published: 31 July 2024
(This article belongs to the Special Issue Machine Learning Methods and Mathematical Modeling with Applications)

Abstract

Deep neural networks (DNNs) have achieved breakthrough progress in various fields, largely owing to the support of large-scale datasets with manually annotated labels. However, obtaining such datasets is costly and time-consuming, making high-quality annotation a challenging task. In this work, we propose an improved noisy sample selection method, a sample selection framework based on a mixup loss and recalibration strategy (SMR), which enhances the robustness and generalization ability of models. First, we introduce a robust mixup loss function to pre-train two models with identical structures separately. This approach avoids additional hyperparameter tuning and reduces the need for prior knowledge of the noise type. We then use a Gaussian Mixture Model (GMM) to divide the entire training set into labeled and unlabeled subsets, followed by robust training with semi-supervised learning (SSL) techniques. Furthermore, we propose a recalibration strategy based on the cross-entropy (CE) loss to prevent the models from converging to local optima during the SSL process, further improving performance. Ablation experiments on CIFAR-10 with 50% symmetric noise and 40% asymmetric noise show that the two modules introduced in this paper improve the accuracy of the baseline (i.e., DivideMix) by 1.5% and 0.5%, respectively. Moreover, experimental results on multiple benchmark datasets demonstrate that the proposed method effectively mitigates the impact of noisy labels and significantly enhances the performance of DNNs on noisy datasets. For instance, on the WebVision dataset, our method improves the top-1 accuracy by 0.7% and 2.4% compared to the baseline method.
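To make the selection step in the abstract concrete, the sketch below illustrates the two standard building blocks it refers to: a mixup-style loss for pre-training and a two-component GMM fit to per-sample losses, where the low-mean component is treated as the clean (labeled) subset and the rest as the noisy (unlabeled) subset. This is a minimal sketch under common assumptions, not the authors' released implementation; names such as `mixup_batch`, `split_by_gmm`, the Beta parameter `alpha`, and `clean_threshold` are illustrative.

```python
# Minimal sketch of mixup pre-training inputs and GMM-based sample selection.
# Assumptions: per-sample CE losses are computed elsewhere; thresholds are illustrative.
import numpy as np
import torch
from sklearn.mixture import GaussianMixture

def mixup_batch(x, y_onehot, alpha=4.0):
    """Mix a batch of inputs and one-hot targets with a Beta-sampled coefficient."""
    lam = np.random.beta(alpha, alpha)
    idx = torch.randperm(x.size(0))
    x_mix = lam * x + (1.0 - lam) * x[idx]
    y_mix = lam * y_onehot + (1.0 - lam) * y_onehot[idx]
    return x_mix, y_mix

def split_by_gmm(per_sample_loss, clean_threshold=0.5):
    """Fit a 2-component GMM to min-max normalized per-sample losses and return a
    boolean 'clean' mask plus the posterior probability of the low-loss component."""
    losses = np.asarray(per_sample_loss, dtype=np.float64)
    losses = (losses - losses.min()) / (losses.max() - losses.min() + 1e-8)
    losses = losses.reshape(-1, 1)
    gmm = GaussianMixture(n_components=2, max_iter=50, reg_covar=5e-4)
    gmm.fit(losses)
    clean_component = int(gmm.means_.argmin())       # component with the smaller mean loss
    prob_clean = gmm.predict_proba(losses)[:, clean_component]
    return prob_clean > clean_threshold, prob_clean
```

In a DivideMix-style pipeline, the mask returned by `split_by_gmm` would route samples into the labeled set (trained with their given labels) and the unlabeled set (trained with model-guessed labels) for the subsequent SSL stage.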
Keywords: deep neural networks; noisy labels; semi-supervised learning; image classification

Share and Cite

MDPI and ACS Style

Zhang, Q.; Yu, D.; Zhou, X.; Gong, H.; Li, Z.; Liu, Y.; Shao, R. A Noisy Sample Selection Framework Based on a Mixup Loss and Recalibration Strategy. Mathematics 2024, 12, 2389. https://doi.org/10.3390/math12152389

AMA Style

Zhang Q, Yu D, Zhou X, Gong H, Li Z, Liu Y, Shao R. A Noisy Sample Selection Framework Based on a Mixup Loss and Recalibration Strategy. Mathematics. 2024; 12(15):2389. https://doi.org/10.3390/math12152389

Chicago/Turabian Style

Zhang, Qian, De Yu, Xinru Zhou, Hanmeng Gong, Zheng Li, Yiming Liu, and Ruirui Shao. 2024. "A Noisy Sample Selection Framework Based on a Mixup Loss and Recalibration Strategy" Mathematics 12, no. 15: 2389. https://doi.org/10.3390/math12152389

