Article

ReliaMatch: Semi-Supervised Classification with Reliable Match

Tao Jiang, Luyao Chen, Wanqing Chen, Wenjuan Meng and Peihan Qi
1 School of Cyber Engineering, Xidian University, Xi’an 710126, China
2 College of Information Engineering, Northwest A&F University, Xianyang 712100, China
3 State Key Laboratory of Integrated Service Networks, Xidian University, Xi’an 710071, China
* Author to whom correspondence should be addressed.
Appl. Sci. 2023, 13(15), 8856; https://doi.org/10.3390/app13158856
Submission received: 28 June 2023 / Revised: 17 July 2023 / Accepted: 26 July 2023 / Published: 31 July 2023

Abstract: Deep learning has been widely used over the past decade in tasks such as computer vision, natural language processing, predictive analysis, and recommendation systems. However, practical scenarios often lack labeled data, posing challenges for traditional supervised methods. Semi-supervised classification methods address this by leveraging both labeled and unlabeled data to enhance model performance, but they struggle to use unlabeled data effectively and to separate reliable information from unreliable sources. This paper introduces ReliaMatch, a semi-supervised classification method that addresses these challenges by using a confidence threshold. It incorporates a curriculum learning stage, a feature filtering module, and a pseudo-label filtering module to improve classification accuracy and reliability. The feature filtering module eliminates ambiguous semantic features by comparing labeled and unlabeled data in the feature space. The pseudo-label filtering module removes unreliable pseudo-labels with low confidence, enhancing algorithm reliability. ReliaMatch adopts a curriculum learning training mode that gradually increases the difficulty of the training dataset by combining selected samples and their pseudo-labels with the labeled data, which then participate in training in a supervised way. Experimental results show that ReliaMatch effectively overcomes the underutilization of unlabeled data and the introduction of erroneous information, outperforming the pseudo-label strategy in semi-supervised classification.

1. Introduction

In the past decade, deep learning has dominated the machine learning landscape in data classification [1,2,3], predictive analysis [4], recommendation systems [5], anomaly detection [6,7], and beyond. Supervised classification methods have markedly improved performance on a wide range of classification tasks, but obtaining expert-provided labels remains difficult in many big data scenarios. Unsupervised classification methods have obvious advantages in handling unlabeled samples; however, they sacrifice accuracy because they cannot directly use label information to evaluate and optimize the model. Semi-supervised learning [8] has therefore attracted the attention of researchers: it significantly improves on unsupervised learning by leveraging datasets with a small amount of labeled data, and it has the additional practical value of reducing the cost and time of manual labeling. Recently, many semi-supervised learning methods based on deep learning have been proposed [9,10,11], achieving strong performance from only a small fraction of labeled samples.
However, semi-supervised learning faces two major challenges: (i) how to transfer information obtained from limited labeled data to unlabeled data, and (ii) how to learn information as accurately as possible directly from a large amount of unlabeled data. To address these issues, semi-supervised learning uses three key loss terms (entropy minimization, generalization regularization, and consistency regularization) to better guide the model toward the downstream task. Entropy minimization encourages the model to predict the output of unlabeled data confidently, improving accuracy and robustness. Generalization regularization constrains the model’s parameters to avoid overfitting the training data, improving generalization. According to the consistency assumption, data points that are close to each other tend to share the same label and structure [12]; consistency regularization therefore improves accuracy and generalization by enforcing consistency on the data manifold.
In addition, many studies have explored information propagation between different data points [13], offering a route to the first challenge, while pseudo-labeling and consistency regularization offer routes to the second. The pseudo-labeling method [14] uses the predictions of a classification model or a clustering algorithm as artificial labels to retrain the model. The consistency regularization method [10,11,15] forces the model to make the same prediction for the same sample under different transformations, thereby learning from unlabeled data. However, these semi-supervised methods do not account for the varying levels of erroneous information that can be introduced during training, which can lead to low classification accuracy.
In the feature extraction stage, as plotted in Figure 1 (left), the model may have difficulty differentiating semantic differences at the classification boundary because feature representation boundaries are ambiguous. Without an anchor for each class, the model may learn incorrect semantic information; with one, a confidence threshold can be set from the similarity between each sample and the anchor, allowing low-confidence features to be filtered out, as shown in Figure 1 (right). When assigning pseudo-labels to unlabeled data, as plotted in Figure 2, labeling samples with low predicted confidence may cause confirmation bias [16], where the model overfits to incorrect labels and its performance degrades. A fixed threshold cannot adapt to the dynamic changes of the dataset; a global dynamic threshold, by contrast, adjusts the threshold based on the confidence distribution of unlabeled samples in the current iteration, avoiding this problem.
To address these issues, we propose ReliaMatch, a semi-supervised classification method that filters unreliable information based on a confidence threshold. ReliaMatch adopts a confidence-threshold filtering strategy: it matches the similarity of labeled and unlabeled data in feature space by setting anchor points, thereby filtering out outliers and boundary points with ambiguous semantics. A dynamic threshold is used to select reliable pseudo-labels, eliminating the model’s confirmation bias toward pseudo-labels and improving classification performance. Additionally, ReliaMatch adopts the training mode of curriculum learning [18], combining the screened samples and their pseudo-labels with the labeled data, gradually increasing the difficulty of the training dataset, and participating in model training in a supervised way. In summary, we make the following three main contributions:
(1) We propose ReliaMatch, a semi-supervised classification method that addresses the confirmation bias arising from unlabeled data with differing semantics and low prediction confidence near the classification boundary.
(2) ReliaMatch employs a confidence-threshold filtering strategy that matches the similarity of labeled and unlabeled data in feature space by setting anchor points, filtering out outliers and boundary points with ambiguous semantics. To eliminate the model’s confirmation bias toward pseudo-labels and improve classification performance, ReliaMatch uses a dynamic threshold to select reliable pseudo-labels.
(3) ReliaMatch employs the curriculum learning training mode, which combines the screened samples and their pseudo-labels with the labeled data and gradually increases the difficulty of the training dataset, thereby participating in model training in a supervised manner and further improving classification performance.

2. Related Work

2.1. Semi-Supervised Classification

Semi-supervised learning (SSL) has been extensively studied in various fields, including image classification [19], object detection [20], and semantic segmentation [21]. SSL methods in image classification aim to reduce reliance on labeled data by leveraging unlabeled data. In SSL, labels are typically obtained through consistency regularization [17,22,23], pseudo-labeling [14], and entropy minimization [24]. Consistency regularization ensures that the model produces consistent predictions for different transformations of the same image. Pseudo-labeling employs model confidence to assign labels and guide the training process, while entropy minimization encourages the model to produce highly confident predictions. These labeling strategies have been widely adopted in many SSL approaches.

2.2. Consistency Regularization

Consistency regularization plays a crucial role in modern semi-supervised learning (SSL) algorithms. The core idea behind consistency regularization is that the same input sample should produce consistent outputs under different perturbations. Early works, such as [10,25,26], proposed this concept, which was further developed in [9,15,17]. The fundamental form of consistency regularization in SSL is often achieved through a loss term. The equation below represents this basic form:
$$\left\| p_m\big(y \mid A(x); \theta\big) - p_m\big(y \mid A(x); \theta\big) \right\|_2^2,$$
where $A$ is a stochastic function, so that two evaluations of $A(x)$ yield different values, and $p_m$ denotes the model’s output probability. In [26], random data augmentation, dropout, and random max pooling were employed as $A$ to keep the predictions of neural networks similar. Ref. [11] instead adopts adversarial transformations for augmentation. Another related approach [10] extends the perturbations across time, requiring the current prediction for a sample to be similar to the set of past predictions for the same sample. These perturbations mainly arise from different network states and data augmentations.
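As a concrete illustration, the basic consistency term can be written in a few lines of PyTorch. This is a minimal sketch, not any particular paper’s implementation; `model` and `augment` (the stochastic function $A$) are hypothetical placeholders.

```python
import torch
import torch.nn.functional as F

def consistency_loss(model, x, augment):
    """Squared L2 distance between predictions of two independent
    stochastic augmentations of the same batch (the loss term above)."""
    p1 = F.softmax(model(augment(x)), dim=1)  # first draw of A(x)
    p2 = F.softmax(model(augment(x)), dim=1)  # second, independent draw
    # Mean over the batch of ||p1 - p2||_2^2.
    return ((p1 - p2) ** 2).sum(dim=1).mean()
```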
Further approaches build on this idea. In [15], two networks with the same structure were used, and the consistency constraint was enforced by comparing the predicted distributions with KL divergence or cross-entropy. This was developed further in [27], where uncertainty weighting was applied to unlabeled samples, focusing on samples with lower uncertainty. Virtual adversarial training, proposed by Miyato et al. [11], injects adversarial noise into data samples and then regularizes the resulting predictions uniformly. Luo et al. [28] suggest a comparison loss as the regularization term, ensuring that predictions from the same (or different) categories are similar (or different); this extends consistency regularization to cover consistency between different samples and can be combined with methods such as [15] or [11] for improved performance. To address model memorization and sensitivity to adversarial data, Mixup, proposed by Zhang et al. [29], trains on convex combinations of pairs of examples and their labels. Verma et al. [30] built on Mixup with interpolation consistency training, which encourages consistency between the predictions of unlabeled samples and the prediction at their interpolation. Finally, in [17], consistency regularization is achieved by estimating low-entropy labels, generating data-augmented unlabeled samples, and using Mixup to combine labeled and unlabeled samples.

2.3. Pseudo-Labeling

Pseudo-labels are artificial labels generated by the model itself and used to further train it. With the pseudo-labeling method, both labeled samples and pseudo-labeled samples serve as new training data to update the model, greatly improving the utilization of unlabeled samples.
Lee et al. [14] chose the class with the highest prediction probability of the model as the pseudo-label; however, pseudo-labels were only used in the fine-tuning stage, and the network needed to be pre-trained. Mandal et al. [31] proposed a deep semi-supervised framework that seamlessly handles labeled and unlabeled data. The framework is trained in two alternating parts: a label prediction component first predicts labels for the unlabeled portion of the training data, and then a common representation of the two modalities is learned for cross-modal retrieval. Caron et al. [32] proposed a deep clustering algorithm combining K-means with a convolutional neural network, using the K-means clustering results on unlabeled data as pseudo-labels to assist the CNN in classification. Based on extreme value theory, Cascante-Bonilla et al. [33] put forward Curriculum Labeling (CL), which uses careful curriculum selection as a pacing criterion to strengthen pseudo-labeling. Hu et al. [34] designed an end-to-end Iterative Feature Clustering Graph Convolutional Network (IFC-GCN) that enhances the standard GCN with an iterative feature clustering module, together with an EM-like framework that improves performance by alternately rectifying pseudo-labels and node features.

3. Method

The core idea of ReliaMatch is to match the correlation between labeled and unlabeled data, by filtering reliable unlabeled data and generating pseudo-labels, which are then used as new training data in the supervised learning process of the model. The detailed process of ReliaMatch is shown in Figure 3.
This method uses a confidence threshold to filter unreliable information, namely feature vectors that lie on the classification boundary of the unlabeled data or are outliers, together with artificially assigned pseudo-labels that may be incorrect. ReliaMatch adopts a self-training framework: the performance of the deep learning model is improved by iteratively learning from the unlabeled and labeled datasets. During training, ReliaMatch uses the trained model to predict the unlabeled data and adds the reliable data points among the predictions, together with their pseudo-labels, to the pseudo-label dataset. The pseudo-labeled dataset and the labeled dataset are then merged to train the next round’s model. This process is repeated until a preset number of iterations is reached or performance converges.
For labeled samples, ReliaMatch trains them using the feature extraction model and the classification model. After training, feature extraction and classification prediction are performed on the labeled samples. At the same time, the average feature vector of each category of the labeled samples is calculated and used as a feature anchor for filtering reliable features of the unlabeled samples.
For unlabeled samples, there are two main modules: the feature filtering module and the pseudo-label filtering module, which handle feature noise and label noise, respectively. In conjunction with Figure 3, the feature filtering module comprises filtering 1, and the pseudo-label filtering module (i.e., the filtering of label noise) comprises filtering 2 and filtering 3. ReliaMatch first applies data augmentation techniques to expand the unlabeled samples. Next, the augmented unlabeled samples are fed into the model for feature extraction. The feature filtering module then calculates the similarity between the extracted features of the unlabeled samples and the feature anchors, and sets a feature similarity threshold to filter out unlabeled samples with low similarity; this is filtering 1. Each unlabeled sample retained by filtering 1 receives the class label of its nearest feature anchor as a pseudo-label (hard label). Next, ReliaMatch feeds these feature-filtered samples into the classifier for prediction. For each unlabeled sample, it checks whether the class with the maximum predicted probability (soft label) is consistent with the pseudo-label’s class; if they are inconsistent, the sample is filtered out. This is filtering 2. Finally, a dynamic threshold is set on the predicted probability to filter out unlabeled samples whose maximum predicted probability falls below it; this is filtering 3.
After three rounds of filtering, the remaining unlabeled samples are considered to be high-confidence reliable samples and are combined with their pseudo-labels to form a pseudo-labeled dataset. These pseudo-labeled samples are merged with labeled samples to form a new labeled dataset, which is used to train a new model. This process is iterated continuously to gradually increase the size of the labeled dataset and improve the performance of semi-supervised learning.
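The following Python-style sketch summarizes this loop. It is illustrative only: all helper names (`train_supervised`, `compute_anchors`, `feature_filter`, `pseudo_label_filter`) are hypothetical stand-ins for the components of Figure 3, not functions from the authors’ code.

```python
def reliamatch(labeled, unlabeled, rounds):
    """Illustrative sketch of the ReliaMatch self-training loop."""
    model = None
    for t in range(rounds):
        model = train_supervised(labeled)            # re-initialized each round
        anchors = compute_anchors(model, labeled)    # per-class mean features
        kept = feature_filter(model, unlabeled, anchors)   # filtering 1
        pseudo = pseudo_label_filter(model, kept)          # filtering 2 and 3
        labeled = labeled + pseudo                   # grow the labeled set
        pseudo_x = {x for (x, y) in pseudo}
        unlabeled = [x for x in unlabeled if x not in pseudo_x]
    return model
```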

3.1. Problem Description

To describe the design of the ReliaMatch model precisely, we assume that, in the $t$-th iteration, the training dataset $X_N^t$ contains $N^t$ samples of image data from different categories, $X_N^t = \{x_i^t\}_{i=1}^{N^t}$. The training dataset is divided into a labeled dataset $X_L^t = \{x_{i,l}^t\}_{i=1}^{N_L^t}$ and an unlabeled dataset $X_U^t = \{x_{i,u}^t\}_{i=1}^{N_U^t}$. Let $f_{\theta}^t$ be the convolutional neural network used for feature extraction in the $t$-th iteration, and let $Z_N^t = f_{\theta}^t(x_i^t)$ denote the feature vector set obtained from $X_N^t$ after processing by $f_{\theta}^t$. $Z_N^t$ consists of two parts, $Z_L^t$ and $Z_U^t$, where $Z_L^t = f_{\theta}^t(x_{i,l}^t, y_{i,l}^t) = \{(z_{i,l}^t, y_{i,l}^t)\}_{i=1}^{N_L^t}$ is the labeled feature vector set and $Z_U^t = f_{\theta}^t(x_{i,u}^t) = \{z_{i,u}^t\}_{i=1}^{N_U^t}$ is the unlabeled feature vector set in the $t$-th iteration. Let $g_{\varpi}^t$ be the fully connected network (classifier) used for classification prediction in the $t$-th iteration, whose output is the predicted probability of a sample for each category. Let $p_{i,j}^t$ denote the probability that the model predicts sample $x_i^t$ as category $j$, and let $K$ be the number of categories. Then $P(x_i^t) = g_{\varpi}^t(f_{\theta}^t(x_i^t)) = \{p_{i,j}^t\}_{j=1}^{K}$.

3.2. Feature Anchoring

Our first contribution is feature anchoring, which averages the features of each category in the labeled data to obtain an anchor. We assume that nearby points are likely to share the same label, and that points on the same structure (usually called a cluster or manifold) are also likely to share the same label. Our method therefore focuses on the representation learning of images: we use feature-level similarity to filter out unreliable samples and assign a pseudo-label to each reliable one. A schematic of filtering poor features out of the samples is shown in Figure 4.
We first calculate the average feature vector of each class $j$ in the labeled feature dataset $Z_L^t$ at the $t$-th iteration to generate $K$ feature anchor points $A^t = \{a_j^t\}_{j=1}^{K}$, using the following equation:

$$a_j^t = \frac{1}{|Z_{j,L}^t|} \sum_{(z_{i,l}^t,\, y_{i,l}^t) \in Z_{j,L}^t} z_{i,l}^t,$$

where $Z_{j,L}^t$ is the subset of the feature vector set $Z_L^t$ whose label $y_{i,l}^t = j$ in the $t$-th iteration, and $|Z_{j,L}^t|$ denotes the number of samples in the subset $Z_{j,L}^t$.
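As a concrete illustration, the anchor computation can be written in a few lines of PyTorch. This is a minimal sketch under the assumption that every class has at least one labeled sample, not the authors’ implementation.

```python
import torch

def compute_anchors(features: torch.Tensor, labels: torch.Tensor,
                    num_classes: int) -> torch.Tensor:
    """Per-class mean feature vectors (the anchors a_j^t above).

    features: (N_L, D) labeled feature vectors z_{i,l}^t
    labels:   (N_L,)  integer class labels y_{i,l}^t
    """
    anchors = torch.zeros(num_classes, features.size(1), device=features.device)
    for j in range(num_classes):
        anchors[j] = features[labels == j].mean(dim=0)  # assumes class j non-empty
    return anchors
```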
Next, we use the cosine similarity between the extracted features of the augmented unlabeled samples and the feature anchor points:

$$f_{SimFeat} = s(z_{i,u}^t, a_j^t) = \frac{z_{i,u}^t \cdot a_j^t}{\|z_{i,u}^t\| \, \|a_j^t\|} = \frac{\sum_{d=1}^{D} z_{i,u,d}^t \, a_{j,d}^t}{\sqrt{\sum_{d=1}^{D} (z_{i,u,d}^t)^2} \cdot \sqrt{\sum_{d=1}^{D} (a_{j,d}^t)^2}},$$

where $z_{i,u}^t$ is the feature vector of the unlabeled data, $a_j^t$ is the anchor point, $\cdot$ denotes the dot product, $\|\cdot\|$ denotes the vector norm, and $D$ is the dimension of the vectors. A cosine similarity close to 1 indicates that the two vectors are very close in space, a value close to −1 indicates that they point in nearly opposite directions, and a value close to 0 indicates no obvious correlation between the two vectors.
Next, the minimum similarity between labeled sample features and the anchor points is used as the feature similarity threshold, with a hyperparameter dynamically adjusting its size:

$$\tau_{Feat} = \alpha_{Feat} \cdot \min\big( s(z_{i,l}^t, a_j^t) \big) = \alpha_{Feat} \cdot \min\left( \frac{z_{i,l}^t \cdot a_j^t}{\|z_{i,l}^t\| \, \|a_j^t\|} \right),$$

where $\alpha_{Feat} \in [0, 1]$ is a coefficient used to dynamically adjust the threshold, $z_{i,l}^t$ is the feature vector of the labeled sample, $a_j^t$ is the feature anchor point, $\cdot$ denotes the dot product, and $\|\cdot\|$ denotes the vector norm. The feature similarity threshold satisfies $\tau_{Feat} \in [-1, 1]$.

If the similarity $f_{SimFeat}$ between the feature of an unlabeled sample and a feature anchor point exceeds the threshold, i.e., $f_{SimFeat} > \tau_{Feat}$, the pseudo-label of the current sample $x_{i,u}^t$ is taken to be the label $j$ of the feature anchor point most similar to its feature vector.

In the feature filtering module, the labeled dataset remains unchanged. In the $t$-th iteration, the pseudo-labels of the unlabeled samples are obtained after feature filtering. The resulting dataset of filtered unlabeled samples and their pseudo-labels is denoted $X_{U1}^t = \{(x_{i,u1}^t, y_{i,u1}^t)\}_{i=1}^{N_{U1}^t}$, where $y_{i,u1}^t$ is the pseudo-label of $x_{i,u1}^t$ and $N_{U1}^t$ is the number of samples in the dataset.
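A possible vectorized realization of filtering 1 is sketched below in PyTorch. It reads the minimum in the threshold formula as the minimum similarity of each labeled feature to its own class anchor; that reading, like the function itself, is our assumption rather than the paper’s reference code.

```python
import torch
import torch.nn.functional as F

def feature_filter(z_u, z_l, y_l, anchors, alpha_feat=1.0):
    """Filtering 1: keep unlabeled features close to a class anchor.

    z_u: (N_U, D) unlabeled features; z_l: (N_L, D) labeled features;
    y_l: (N_L,) labels; anchors: (K, D) per-class mean features.
    Returns a boolean keep mask and the anchor-based pseudo-labels.
    """
    a_n = F.normalize(anchors, dim=1)
    sim_u = F.normalize(z_u, dim=1) @ a_n.T          # cosine sims, (N_U, K)
    sim_l = F.normalize(z_l, dim=1) @ a_n.T          # (N_L, K)
    # Threshold: scaled minimum similarity of labeled features to their
    # own class anchor (our reading of the min in the formula above).
    tau_feat = alpha_feat * sim_l[torch.arange(len(y_l)), y_l].min()
    best_sim, pseudo = sim_u.max(dim=1)              # nearest anchor per sample
    keep = best_sim > tau_feat
    return keep, pseudo
```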

3.3. Dynamic Allocation of Pseudo-Labels

In the pseudo-labeling strategy, threshold selection is an important issue. Traditional semi-supervised classification methods usually use a fixed threshold: the class of each unlabeled sample is predicted, and pseudo-labels are assigned to high-confidence samples above the threshold for training. However, this approach may not adapt well to changes in the data distribution; because samples differ in classification difficulty, it may either filter out useful samples excessively or admit unreliable pseudo-labels, leading to classification errors.
To address these issues, the pseudo-label filtering module in ReliaMatch adopts a global dynamic threshold to filter reliable samples. Specifically, in each iteration, the module sets the confidence threshold for predicting probability based on the average confidence of unlabeled samples for the entire dataset. If the average confidence of unlabeled samples is high, it indicates that the algorithm has good classification performance for unlabeled samples, and the threshold can be increased. If the change in labels between two iterations is significant, it indicates that the algorithm has not yet converged and the threshold should be appropriately lowered to ensure the accuracy of the model.
For each unlabeled sample $x_{i,u1}^t$, its predicted probabilities $P(x_{i,u1}^t) = \{p_{i,u1,j}^t\}_{j=1}^{K}$ can be converted into a hard label by creating a vector of length $K$, denoted $O_i^t = \{o_{i,j}^t\}_{j=1}^{K}$, as follows:

$$o_{i,j}^t = \begin{cases} 1, & \text{if } j = \arg\max_j \, p_{i,u1,j}^t \\ 0, & \text{otherwise}, \end{cases}$$

where $o_{i,j}^t$ indicates the label assigned to sample $x_{i,u1}^t$ for the $j$-th class: the class with the highest predicted probability is assigned a label of 1 and all others are assigned 0.
Next, the average confidence score $p_u^t$ is obtained by averaging the maximum predicted probability over all unlabeled samples:

$$p_u^t = \frac{1}{|X_{U1}^t|} \sum_{x_{i,u1} \in X_{U1}^t} \max\big( P(x_{i,u1}^t) \big),$$

where $X_{U1}^t$ denotes the unlabeled dataset after feature filtering, and $P(x_{i,u1}^t)$ represents the predicted probabilities of sample $x_{i,u1}$ in the $t$-th training round.
ReliaMatch uses $p_u^t$ as the confidence level of the unlabeled data to adjust the credibility threshold of the pseudo-labels and thereby ensure their quality. The global dynamic threshold $\tau_{Pre}^t$ in round $t$ can therefore be represented as:

$$\tau_{Pre}^t = \begin{cases} \max\left( \dfrac{1}{1 + e^{-p_u^t}},\; \tau_{Pre}^0 \right), & t = 1 \\[2ex] \max\left( \dfrac{1}{1 + e^{-p_u^t}},\; p_u^t \cdot \dfrac{\left\| y_{i,u1}^t - y_{i,u1}^{t-1} \right\|_F}{\left\| p_u^t(x_{i,u1}^t) - p_u^{t-1}(x_{i,u1}^{t-1}) \right\|_F + \epsilon} \right), & t > 1, \end{cases}$$

where $\tau_{Pre}^0$ is the initial threshold, $\epsilon$ is a very small constant that avoids division by zero, $\|\cdot\|_F$ denotes the matrix norm, and $p_u^t$ is the average confidence level in round $t$. $\| y_{i,u1}^t - y_{i,u1}^{t-1} \|_F$ measures the difference between the pseudo-labels of the unlabeled samples in rounds $t$ and $t-1$, and $\| p_u^t(x_{i,u1}^t) - p_u^{t-1}(x_{i,u1}^{t-1}) \|_F$ measures the difference between the corresponding confidence levels.
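A sketch of this computation in PyTorch follows. The sigmoid’s negative exponent and the tensor layouts reflect our reading of the formula above; they are assumptions, not a verified reference implementation.

```python
import torch

def dynamic_threshold(max_probs, tau0=0.95, labels_t=None, labels_prev=None,
                      probs_t=None, probs_prev=None, eps=1e-8):
    """Global dynamic threshold tau_Pre^t (a sketch of the formula above).

    max_probs: (N,) max predicted probabilities in round t;
    labels_t/labels_prev: (N, K) one-hot pseudo-labels in rounds t and t-1;
    probs_t/probs_prev: (N, K) predicted probabilities in rounds t and t-1.
    """
    p_u = max_probs.mean()                           # average confidence
    sig = 1.0 / (1.0 + torch.exp(-p_u))
    if labels_prev is None:                          # round t = 1
        return torch.maximum(sig, torch.tensor(tau0))
    label_change = (labels_t - labels_prev).float().norm()  # ||y^t - y^{t-1}||_F
    conf_change = (probs_t - probs_prev).norm()             # confidence change
    return torch.maximum(sig, p_u * label_change / (conf_change + eps))
```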
To select the most reliable pseudo-labels, ReliaMatch further filters the unlabeled data in $X_{U1}^t$ by the following criteria (see the sketch after this list):
  • Only select samples whose maximum predicted probability exceeds the predicted probability threshold $\tau_{Pre}$:

$$\max\big( P(x_{i,u1}^t) \big) > \tau_{Pre}.$$

  • Only select samples whose predicted category is consistent with the pseudo-label in the pseudo-label filtering module:

$$O_i^t = y_{i,u1}^t.$$

After screening, only samples meeting both requirements are added to the labeled dataset and used for supervised network training. The resulting pseudo-labeled dataset is denoted $X_{U2}^t = \{(x_{i,u2}^t, y_{i,u2}^t)\}_{i=1}^{N_{U2}^t}$, where $y_{i,u2}^t$ is the pseudo-label of $x_{i,u2}^t$ and $N_{U2}^t$ is the number of samples in this dataset:

$$X_{U2}^t = \left\{ (x_{i,u2}^t, y_{i,u2}^t) \right\}_{i=1}^{N_{U2}^t} = \left\{ (x_{i,u1}^t, y_{i,u1}^t) \;\middle|\; \max\big( P(x_{i,u1}^t) \big) > \tau_{Pre} \,\wedge\, O_i^t = y_{i,u1}^t \right\}.$$
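Both criteria can be applied in one vectorized step, as in the hypothetical helper below (a sketch, not the authors’ code):

```python
import torch

def select_reliable(probs, feat_pseudo, tau_pre):
    """Filtering 2 and 3: keep samples whose predicted class matches the
    anchor-based pseudo-label and whose confidence exceeds tau_Pre.

    probs: (N, K) predicted probabilities after feature filtering;
    feat_pseudo: (N,) pseudo-labels from the nearest anchor.
    """
    conf, pred = probs.max(dim=1)        # soft-label confidence and argmax class
    keep = (conf > tau_pre) & (pred == feat_pseudo)
    return keep, pred
```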

3.4. Loss

In ReliaMatch, the unlabeled dataset $X_{U2}^t$, after undergoing feature filtering and pseudo-label filtering, is transferred to the labeled dataset for supervised training, forming a new labeled dataset $X_L^{t+1}$ and an updated unlabeled dataset $X_U^{t+1}$:

$$X_L^{t+1} = \{ x_{i,l}^{t+1}, y_{i,l}^{t+1} \}_{i=1}^{N_L^t + N_{U2}^t} := X_L^t \cup X_{U2}^t,$$

$$X_U^{t+1} = \{ x_{i,u}^{t+1} \}_{i=1}^{N_U^t - N_{U2}^t} := X_U^t \setminus X_{U2}^t.$$

Based on this, the model parameters are updated using the supervised loss $\mathcal{L}_s^t$. Specifically, for a sample $x_{i,l}^t$ in the labeled dataset with label $y_{i,l}^t$, we have:

$$\mathcal{L}_s^t = \mathcal{L}_{CE}\big( g_{\varpi}^t(f_{\theta}^t(x_{i,l}^t)),\; y_{i,l}^t \big),$$

where $x_{i,l}^t$ is a sample from the labeled dataset in the $t$-th iteration, $y_{i,l}^t$ is its label, and $\mathcal{L}_{CE}$ is the cross-entropy loss function.
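For completeness, one supervised update on the merged dataset might look like the generic PyTorch step below, assuming `model` composes the feature extractor $f_\theta$ and classifier $g_\varpi$; this is a sketch, not the paper’s training code.

```python
import torch.nn.functional as F

def supervised_step(model, optimizer, x, y):
    """One update with the cross-entropy loss L_s^t on merged labeled data."""
    optimizer.zero_grad()
    logits = model(x)                   # g(f(x)): features, then classifier
    loss = F.cross_entropy(logits, y)   # L_CE
    loss.backward()
    optimizer.step()
    return loss.item()
```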

4. Experiments

4.1. Datasets

CIFAR-10: A dataset of 60,000 32 × 32 images evenly distributed over ten categories, with 50,000 images in the training set and 10,000 in the test set. Our validation set size is 5000 for CIFAR-10 [35].
SVHN: A dataset of 99,289 32 × 32 images spanning ten categories. The training set contains 73,257 images and the test set contains 26,032 images. Our validation set size is 5000 for SVHN [36].
CIFAR-100: A dataset of 60,000 32 × 32 images evenly distributed over one hundred categories, with 50,000 images in the training set and 10,000 in the test set. Our validation set size is 5000 for CIFAR-100 [37].

4.2. Model Details

ReliaMatch uses CNN-13 [38] and WideResNet-28 [39] for classification on the CIFAR-10 and SVHN datasets. For fair comparison with other methods, the same parameters as in [33] were used. Optimization is performed with stochastic gradient descent and Nesterov momentum, combined with weight decay regularization to reduce overfitting; the momentum factor is set to 0.9 and the initial learning rate to 0.1. To further improve optimization, cosine annealing [40] is employed to update the learning rate. By default, the hyperparameter $\alpha_{Feat}$ is set to 1 and the initial threshold $\tau_{Pre}^0$ to 0.95.
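In PyTorch, this optimization setup might be configured as below. `model`, `epochs`, and the weight decay coefficient are placeholders, since the text does not specify the decay value.

```python
import torch

def make_optimizer(model, epochs, weight_decay=5e-4):
    """Optimization setup described above; weight_decay is assumed,
    since the text does not give its value."""
    optimizer = torch.optim.SGD(model.parameters(), lr=0.1, momentum=0.9,
                                nesterov=True, weight_decay=weight_decay)
    scheduler = torch.optim.lr_scheduler.CosineAnnealingLR(optimizer,
                                                           T_max=epochs)
    return optimizer, scheduler
```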

4.3. Experimental Results and Analysis

In this section, we compared ReliaMatch with other common semi-supervised learning methods on CIFAR10 and SVHN datasets, including Pseudo-Label [14], LP-MT [13], PL-CB [16], Curriculum Labeling [33], FlexMatch [41] based on pseudo-labeling, π Model [10], Temporal Ensembling [10], Mean Teacher [15], VAT [11], Ladder Net [9], ICT [42] based on consistency regularization, and MixMatch [17] based on strong mixup. We also compared ReliaMatch with Meta-Semi [43], a meta-learning based SSL algorithm.
Additionally, the reliability of ReliaMatch can be assessed by the Test Error Rate (%) and by the error rate under different numbers of labeled samples: lower test error rates across networks and datasets demonstrate the effectiveness of ReliaMatch, while reducing the number of labeled samples without a significant increase in error rate demonstrates its robustness.
From the results shown in Table 1 and Table 2, it can be seen that, compared with other semi-supervised learning methods, ReliaMatch accounts for feature and label noise and filters out semantically ambiguous features and unreliable pseudo-labels, and thus performs better. Specifically, ReliaMatch uses the same WideResNet-28 network as previously proposed advanced semi-supervised classification methods. On the CIFAR-10 dataset with 4000 labeled data, it achieved a test error rate of only 5.86%; in comparison, the Pseudo-Label method reached 17.78%, Curriculum Labeling 8.92%, and PL-CB 6.28%. FlexMatch was slightly better than ReliaMatch, with a test error rate of 4.19% on CIFAR-10. On the SVHN dataset with only 1000 labeled data, ReliaMatch achieved a test error rate of 4.04%, significantly better than Pseudo-Label (7.62%), Curriculum Labeling (5.65%), and FlexMatch (6.72%).
In addition, the ReliaMatch method uses the same CNN-13 network as advanced semi-supervised classification methods proposed by previous researchers. On the CIFAR-10 dataset, using 4000 labeled data for testing, the test error rate of ReliaMatch method was 7.42%, which is lower than the test error rates of Pseudo-Label methods (LP-MT, Curriculum Labeling) and consistency regularization methods (Ladder Net, Temporal Ensembling).
We also performed an evaluation on the CIFAR-100 [37] dataset, where ReliaMatch achieved test error rates of 38.67% and 37.28% with WideResNet-28 and CNN-13, respectively. This differs markedly from the results on the other two datasets, which may be because CIFAR-100 is more complex: with many more categories, the maximum predicted probability can differ significantly across categories, so the global dynamic threshold may not adapt well.
One common way to evaluate semi-supervised classification algorithms is by varying the size of the labeled dataset. By reducing the number of available labeled samples, it is possible to better simulate real-world scenarios. As shown in Figure 5, on the CIFAR-10 and SVHN datasets, we tested the error rates of the ReliaMatch method using the WideResNet-28 network under different numbers of labeled samples. We used datasets with 500, 1000, 2000, and 4000 labeled samples, respectively, and only changed the number of labeled samples during each training while keeping other hyperparameters the same as when using 4000 labeled samples.
The experimental results show that the classification performance of the ReliaMatch method does not significantly degrade under different numbers of labeled samples. This indicates that the ReliaMatch method has good robustness and can provide stable performance even with very limited labeled data, which is crucial for practical applications since, in many cases, only very limited labeled data can be obtained. Therefore, these results suggest that the ReliaMatch method is an effective semi-supervised classification algorithm that can be useful in practical applications.
In addition, we also investigated the effectiveness of the feature filtering module and pseudo-label filtering module in the ReliaMatch method. As shown in Table 3, we separately or collectively removed these two modules and evaluated their impact on the method performance by applying data augmentation during training. The experiments were conducted on the CIFAR-10 dataset with 4000 labeled samples, and the WideResNet-28 was used as the backbone network.
After analyzing Table 3, we reach the following conclusions. The pseudo-label filtering module has a significant impact on the performance of ReliaMatch: removing it raises the model error rate from 5.86% to 9.12%, indicating that this module removes low-confidence pseudo-labels and thereby reduces noise and improves performance. The effect of the feature filtering module is less pronounced, but it still contributes: removing it raises the error rate from 5.86% to 6.74%, indicating that the module selects high-quality features. Removing both modules causes a sharp increase, from 5.86% to 16.9%. This demonstrates that both the feature and pseudo-label filtering modules make important contributions to the performance of ReliaMatch, and both are necessary.
Therefore, both feature and pseudo label filtering modules play important roles in the ReliaMatch method. The feature filtering module can select high-quality features, while the pseudo label filtering module can remove low-confidence pseudo labels. Their combined effect can improve model performance.
We demonstrated the training process of ReliaMatch using the CNN-13 network on the CIFAR-10 and SVHN datasets. ReliaMatch adopts a curriculum learning training approach. The method combines the selected samples and their reliable pseudo-labels with the labeled data, gradually increasing the difficulty of the training data to participate in model training in a supervised manner. As shown in Figure 6 and Figure 7, during the training process, ReliaMatch reinitializes the model training after each label transfer to mitigate the problem of confirmation bias caused by pseudo-labels.
Discussion. ReliaMatch solves the problems of underutilizing the relationship between labeled and unlabeled data and of introducing unreliable information into the model, and it achieves desirable classification results across different networks and dataset scenarios. Although ReliaMatch performs well on image semi-supervised classification, some potential limitations remain: (1) its performance depends on the confidence threshold, which must be well designed and controlled in both the feature filtering module and the pseudo-label filtering module to obtain the best results; and (2) ReliaMatch may also be affected by the choice of feature extractor and model type. In future work, confidence thresholds and hyperparameters could be selected adaptively to further improve and optimize ReliaMatch, enhancing existing methods and providing insights for future semi-supervised learning tasks.

5. Conclusions

In this paper, we proposed a reliable semi-supervised deep learning classification algorithm, ReliaMatch. The algorithm integrates a curriculum labeling stage, a feature filtering module, and a pseudo-label filtering module, aiming to improve classification accuracy and algorithm reliability by focusing on key features, making better use of unlabeled data, and avoiding confirmation bias toward pseudo-labels. The curriculum labeling stage improves the classification accuracy and the reliability of the algorithm. The feature filtering module eliminates unnecessary features, making the algorithm attend to key features. The pseudo-label filtering module eliminates confirmation bias toward pseudo-labels and filters out unreliable features and labels with low confidence, making the algorithm more stable and reliable. The experimental results show that ReliaMatch achieves state-of-the-art classification results on multiple datasets under the control of the confidence threshold.

Author Contributions

Conceptualization, T.J. and W.C.; methodology, T.J.; validation, L.C. and W.C.; data curation, W.C.; writing—original draft preparation, T.J. and W.C.; writing—review and editing, T.J., W.C., W.M. and P.Q.; visualization, L.C. and W.C.; supervision, T.J., W.M. and P.Q.; funding acquisition, T.J. and P.Q. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by National Natural Science Foundation of China No. 62171334, Fundamental Research Funds for the Central Universities No. ZYTS23162, and Scientific Research Foundation of Northwest A&F University No. Z1090121092.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

“The CIFAR-10 dataset” is available at https://www.cs.toronto.edu/~kriz/cifar.html (accessed on 28 June 2023) and “The Street View House Numbers (SVHN) Dataset” is available at http://ufldl.stanford.edu/housenumbers/ (accessed on 28 June 2023).

Conflicts of Interest

The authors declare no conflict of interest.

References

1. Lee, H.; Kwon, H. Going deeper with contextual CNN for hyperspectral image classification. IEEE Trans. Image Process. 2017, 26, 4843–4855.
2. Qi, P.; Zhou, X.; Ding, Y.; Zhang, Z.; Zheng, S.; Li, Z. FedBKD: Heterogenous federated learning via bidirectional knowledge distillation for modulation classification in IoT-edge system. IEEE J. Sel. Top. Signal Process. 2023, 17, 189–204.
3. Zheng, S.; Zhou, X.; Zhang, L.; Qi, P.; Qiu, K.; Zhu, J.; Yang, X. Toward next-generation signal intelligence: A hybrid knowledge and data-driven deep learning framework for radio signal classification. IEEE Trans. Cogn. Commun. Netw. 2023, 9, 564–579.
4. Wang, Y.; Long, M.; Wang, J.; Gao, Z.; Yu, P.S. PredRNN: Recurrent neural networks for predictive learning using spatiotemporal LSTMs. In Proceedings of the Advances in Neural Information Processing Systems 30, Long Beach, CA, USA, 4–9 December 2017; pp. 879–888.
5. Huang, W.; Wu, Z.; Liang, C.; Mitra, P.; Giles, C.L. A neural probabilistic model for context based citation recommendation. In Proceedings of the Twenty-Ninth AAAI Conference on Artificial Intelligence, Austin, TX, USA, 25–30 January 2015; pp. 2404–2410.
6. Ergen, T.; Kozat, S.S. Unsupervised anomaly detection with LSTM neural networks. IEEE Trans. Neural Netw. Learn. Syst. 2019, 31, 3127–3141.
7. Qi, P.; Jiang, T.; Wang, L.; Yuan, X.; Li, Z. Detection tolerant black-box adversarial attack against automatic modulation classification with deep learning. IEEE Trans. Reliab. 2022, 71, 674–686.
8. Chapelle, O.; Schölkopf, B.; Zien, A. Semi-Supervised Learning (Chapelle, O. et al., Eds.; 2006) [Book Reviews]. IEEE Trans. Neural Netw. 2009, 20, 542.
9. Rasmus, A.; Berglund, M.; Honkala, M.; Valpola, H.; Raiko, T. Semi-supervised learning with ladder networks. In Proceedings of the Advances in Neural Information Processing Systems, Montreal, QC, Canada, 7–12 December 2015; pp. 3546–3554.
10. Laine, S.; Aila, T. Temporal ensembling for semi-supervised learning. In Proceedings of the 5th International Conference on Learning Representations, Toulon, France, 24–26 April 2017.
11. Miyato, T.; Maeda, S.; Koyama, M.; Ishii, S. Virtual adversarial training: A regularization method for supervised and semi-supervised learning. IEEE Trans. Pattern Anal. Mach. Intell. 2019, 41, 1979–1993.
12. Zhou, D.; Bousquet, O.; Lal, T.N.; Weston, J.; Schölkopf, B. Learning with local and global consistency. In Proceedings of the Advances in Neural Information Processing Systems, Vancouver, BC, Canada, 13–18 December 2004; pp. 321–328.
13. Iscen, A.; Tolias, G.; Avrithis, Y.; Chum, O. Label propagation for deep semi-supervised learning. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Long Beach, CA, USA, 15–20 June 2019; pp. 5070–5079.
14. Lee, D.H. Pseudo-label: The simple and efficient semi-supervised learning method for deep neural networks. In Proceedings of the Workshop on Challenges in Representation Learning, Atlanta, GA, USA, 16–21 June 2013; p. 896.
15. Tarvainen, A.; Valpola, H. Mean teachers are better role models: Weight-averaged consistency targets improve semi-supervised deep learning results. In Proceedings of the 5th International Conference on Learning Representations, Toulon, France, 24–26 April 2017.
16. Arazo, E.; Ortego, D.; Albert, P.; O’Connor, N.E.; McGuinness, K. Pseudo-labeling and confirmation bias in deep semi-supervised learning. In Proceedings of the 2020 International Joint Conference on Neural Networks, Glasgow, UK, 19–24 July 2020; pp. 1–8.
17. Berthelot, D.; Carlini, N.; Goodfellow, I.; Papernot, N.; Oliver, A.; Raffel, C. MixMatch: A holistic approach to semi-supervised learning. In Proceedings of the Advances in Neural Information Processing Systems, Vancouver, BC, Canada, 8–14 December 2019; pp. 5050–5060.
18. Bengio, Y.; Louradour, J.; Collobert, R.; Weston, J. Curriculum learning. In Proceedings of the 26th Annual International Conference on Machine Learning, Montreal, QC, Canada, 14–18 June 2009; pp. 41–48.
19. Deng, J.; Dong, W.; Socher, R.; Li, L.; Li, K.; Fei-Fei, L. ImageNet: A large-scale hierarchical image database. In Proceedings of the 2009 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR 2009), Miami, FL, USA, 20–25 June 2009; pp. 248–255.
20. Lin, T.; Maire, M.; Belongie, S.J.; Hays, J.; Perona, P.; Ramanan, D.; Dollár, P.; Zitnick, C.L. Microsoft COCO: Common objects in context. In Proceedings of the Computer Vision—ECCV 2014, Zurich, Switzerland, 6–12 September 2014; Lecture Notes in Computer Science; Springer: Berlin/Heidelberg, Germany, 2014; Volume 8693, pp. 740–755.
21. Cordts, M.; Omran, M.; Ramos, S.; Rehfeld, T.; Enzweiler, M.; Benenson, R.; Franke, U.; Roth, S.; Schiele, B. The Cityscapes dataset for semantic urban scene understanding. In Proceedings of the 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR 2016), Las Vegas, NV, USA, 27–30 June 2016; pp. 3213–3223.
22. Wang, K.; Yang, C.; Betke, M. Consistency regularization with high-dimensional non-adversarial source-guided perturbation for unsupervised domain adaptation in segmentation. In Proceedings of the Thirty-Fifth AAAI Conference on Artificial Intelligence (AAAI 2021), Virtual Event, 2–9 February 2021; pp. 10138–10146.
23. Abuduweili, A.; Li, X.; Shi, H.; Xu, C.; Dou, D. Adaptive consistency regularization for semi-supervised transfer learning. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR 2021), Virtual, 19–25 June 2021; pp. 6923–6932.
24. Grandvalet, Y.; Bengio, Y. Semi-supervised learning by entropy minimization. In Proceedings of the Advances in Neural Information Processing Systems 17 (NIPS 2004), Vancouver, BC, Canada, 13–18 December 2004; pp. 529–536.
25. Bachman, P.; Alsharif, O.; Precup, D. Learning with pseudo-ensembles. In Proceedings of the Advances in Neural Information Processing Systems 27, Montreal, QC, Canada, 8–13 December 2014; pp. 3365–3373.
26. Sajjadi, M.; Javanmardi, M.; Tasdizen, T. Regularization with stochastic transformations and perturbations for deep semi-supervised learning. In Proceedings of the Advances in Neural Information Processing Systems 29, Barcelona, Spain, 5–10 December 2016; pp. 1163–1171.
27. Loshchilov, I.; Hutter, F. Decoupled weight decay regularization. In Proceedings of the 7th International Conference on Learning Representations (ICLR 2019), New Orleans, LA, USA, 6–9 May 2019.
28. Luo, Y.; Zhu, J.; Li, M.; Ren, Y.; Zhang, B. Smooth neighbors on teacher graphs for semi-supervised learning. In Proceedings of the 2018 IEEE Conference on Computer Vision and Pattern Recognition (CVPR 2018), Salt Lake City, UT, USA, 18–22 June 2018; pp. 8896–8905.
29. Zhang, H.; Cissé, M.; Dauphin, Y.N.; Lopez-Paz, D. Mixup: Beyond empirical risk minimization. In Proceedings of the 6th International Conference on Learning Representations, Vancouver, BC, Canada, 30 April–3 May 2018.
30. Verma, V.; Lamb, A.; Kannala, J.; Bengio, Y.; Lopez-Paz, D. Interpolation consistency training for semi-supervised learning. In Proceedings of the Twenty-Eighth International Joint Conference on Artificial Intelligence (IJCAI 2019), Macao, China, 10–16 August 2019; pp. 3635–3641.
31. Mandal, D.; Rao, P.; Biswas, S. Semi-supervised cross-modal retrieval with label prediction. IEEE Trans. Multim. 2020, 22, 2345–2353.
32. Caron, M.; Bojanowski, P.; Joulin, A.; Douze, M. Deep clustering for unsupervised learning of visual features. In Proceedings of the Computer Vision—ECCV 2018, Munich, Germany, 8–14 September 2018; Lecture Notes in Computer Science; Springer: Berlin/Heidelberg, Germany, 2018; Volume 11218, pp. 139–156.
33. Cascante-Bonilla, P.; Tan, F.; Qi, Y.; Ordonez, V. Curriculum labeling: Revisiting pseudo-labeling for semi-supervised learning. In Proceedings of the 35th AAAI Conference on Artificial Intelligence, Washington, DC, USA, 2–9 February 2021; pp. 6912–6920.
34. Hu, Z.; Kou, G.; Zhang, H.; Li, N.; Yang, K.; Liu, L. Rectifying pseudo labels: Iterative feature clustering for graph representation learning. In Proceedings of the 30th ACM International Conference on Information and Knowledge Management (CIKM ’21), Virtual Event, Australia, 1–5 November 2021; pp. 720–729.
35. CIFAR10. Available online: http://www.cs.toronto.edu/~kriz/cifar.html (accessed on 28 June 2023).
36. SVHN. Available online: http://ufldl.stanford.edu/housenumbers/ (accessed on 28 June 2023).
37. CIFAR100. Available online: http://www.cs.utoronto.ca/~kriz/cifar.html (accessed on 28 June 2023).
38. Springenberg, J.T.; Dosovitskiy, A.; Brox, T.; Riedmiller, M.A. Striving for simplicity: The all convolutional net. In Proceedings of the 3rd International Conference on Learning Representations, San Diego, CA, USA, 7–9 May 2015.
39. Zagoruyko, S.; Komodakis, N. Wide residual networks. arXiv 2016, arXiv:1605.07146.
40. Loshchilov, I.; Hutter, F. SGDR: Stochastic gradient descent with warm restarts. In Proceedings of the 5th International Conference on Learning Representations, Toulon, France, 24–26 April 2017.
41. Zhang, B.; Wang, Y.; Hou, W.; Wu, H.; Wang, J.; Okumura, M.; Shinozaki, T. FlexMatch: Boosting semi-supervised learning with curriculum pseudo labeling. In Proceedings of the Advances in Neural Information Processing Systems 34 (NeurIPS 2021), Virtual, 6–14 December 2021; pp. 18408–18419.
42. Verma, V.; Kawaguchi, K.; Lamb, A.; Kannala, J.; Solin, A.; Bengio, Y.; Lopez-Paz, D. Interpolation consistency training for semi-supervised learning. Neural Netw. 2022, 145, 90–106.
43. Wang, Y.; Guo, J.; Song, S.; Huang, G. Meta-Semi: A meta-learning approach for semi-supervised learning. arXiv 2020, arXiv:2007.02394.
44. Wang, Y.; Chen, H.; Heng, Q.; Hou, W.; Fan, Y.; Wu, Z.; Wang, J.; Savvides, M.; Shinozaki, T.; Raj, B.; et al. FreeMatch: Self-adaptive thresholding for semi-supervised learning. In Proceedings of the Eleventh International Conference on Learning Representations (ICLR 2023), Kigali, Rwanda, 1–5 May 2023.
45. Tang, H.; Jia, K. Towards discovering the effectiveness of moderately confident samples for semi-supervised learning. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR 2022), New Orleans, LA, USA, 18–24 June 2022; pp. 14638–14647.
Figure 1. Error information in feature space. The (left) panel shows the semantic inconsistency between adjacent samples, and the (right) panel illustrates how anchors are used to set confidence thresholds for filtering features.
Figure 2. Comparison of different strategies for pseudo-label generation. The maximum predicted probability strategy (from [14]) may cause confusion in the early stages but improves in the later stages. The fixed threshold strategy (from [17]) may lead to either insufficient or incorrect labeling. The dynamic threshold strategy, however, can adjust the threshold based on the confidence of unlabeled samples and achieve higher accuracy in pseudo-labeling.
Figure 3. Illustration of the ReliaMatch framework. The framework mainly includes two parts: labeled data and unlabeled data. (1) For labeled samples, ReliaMatch trains a feature extraction model and a classification model; after training, feature extraction and classification prediction are performed on the labeled samples. (2) For unlabeled samples, there are two main modules: the feature filtering module and the pseudo-label filtering module, which correspond to feature and label noise filtering, respectively. The feature filtering module includes filtering 1, and the pseudo-label filtering module contains filtering 2 and filtering 3.
Figure 4. Schematic diagram of anchor generation and feature filtering. In the tth iteration, five labeled samples and five unlabeled samples are selected from three categories; samples with the same color belong to the same category.
Figure 5. Comparison of test error rates of different methods using different amounts of labeled data. (a) CIFAR-10. (b) SVHN.
Figure 6. Training process of ReliaMatch on the CIFAR-10 dataset.
Figure 7. Training process of ReliaMatch on the SVHN dataset.
Table 1. Comparison of Test Error Rate (%) of WideResNet-28 with Different Semi-supervised Methods.

| Method | CIFAR-10 ($N_L$ = 4000) | SVHN ($N_L$ = 1000) |
| --- | --- | --- |
| PL † | 17.78 ± 0.57 | 7.62 ± 0.29 |
| Curriculum Labeling † | 8.92 ± 0.03 | 5.65 ± 0.11 |
| PL-CB † | 6.28 ± 0.30 | - |
| π Model † | 16.37 ± 0.63 | 7.19 ± 0.27 |
| Mean Teacher | 10.36 ± 0.28 | 5.65 ± 0.47 |
| VAT † | 13.86 ± 0.27 | 5.63 ± 0.20 |
| VAT+EntMin † | 13.13 ± 0.39 | 5.35 ± 0.19 |
| ICT † | 7.66 ± 0.17 | 3.53 ± 0.07 |
| MixMatch † | 6.24 ± 0.06 | 3.27 ± 0.31 |
| FlexMatch ‡ | 4.19 ± 0.01 | 6.72 ± 0.30 |
| Meta-Semi § | 6.10 ± 0.10 | - |
| ReliaMatch * | 5.86 ± 0.12 | 4.04 ± 0.08 |

$N_L$ represents the number of labeled samples in the training set; † indicates that the result was reported in [33], ‡ in [44], and § in [45]; * represents the average of 5 runs of the method proposed in this paper.
Table 2. Comparison of Test Error Rate (%) of Different Semi-supervised Methods Using CNN-13.

| Method | CIFAR-10 ($N_L$ = 4000) | SVHN ($N_L$ = 1000) |
| --- | --- | --- |
| LP-MT † | 10.16 ± 0.28 | - |
| Curriculum Labeling † | 9.81 ± 0.22 | 4.75 ± 0.28 |
| Ladder Net † | 12.16 ± 0.31 | - |
| Temporal Ensembling † | 12.16 ± 0.24 | 4.42 ± 0.16 |
| ReliaMatch * | 7.42 ± 0.05 | 7.13 ± 0.28 |

$N_L$ represents the number of labeled samples in the training set; † indicates that the result was reported in [33]; * represents the average of five runs of the proposed method.
Table 3. Exploring the influence of important modules of ReliaMatch on classification results.

| Method | Test Error Rate (%) |
| --- | --- |
| w/o Feature Filtering | 6.74 |
| w/o Pseudo-label Filtering | 9.12 |
| w/o Feature Filtering and Pseudo-label Filtering | 16.9 |
| ReliaMatch (benchmark) | 5.86 |

