Article

Semi-Supervised Medical Image Classification with Pseudo Labels Using Coalition Similarity Training

Kun Liu, Shuyi Ling and Sidong Liu
1 School of Information Engineering, Shanghai Maritime University, Shanghai 200135, China
2 Australia Institute of Health Innovation, Macquarie University, Sydney 2109, Australia
* Author to whom correspondence should be addressed.
Mathematics 2024, 12(10), 1537; https://doi.org/10.3390/math12101537
Submission received: 28 February 2024 / Revised: 12 May 2024 / Accepted: 14 May 2024 / Published: 15 May 2024

Abstract

The development of medical image classification models requires a substantial number of labeled images for training. In real-world scenarios, sample sizes are typically limited and labeled samples often constitute only a small portion of the dataset. This paper investigates a collaborative similarity learning strategy that optimizes pseudo-labels to enhance model accuracy and expedite convergence, termed the coalition similarity training framework. By integrating semantic similarity and instance similarity, the pseudo-labels are mutually refined to ensure their quality during initial training. Furthermore, the similarity score is used as a weight to steer samples away from misclassified predictions during classification. To enhance the model's generalization ability, an adaptive consistency constraint is introduced into the loss function to improve performance on untrained datasets. The model achieved a satisfactory accuracy of 93.65% at an 80% labeling ratio, comparable to the performance of supervised learning methods. Even at a very low labeling ratio (e.g., 5%), the model still attained an accuracy of 74.28%. Comparisons with other techniques, such as Mean Teacher and FixMatch, show that our approach significantly outperforms them in medical image classification tasks, improving accuracy by approximately 2% and demonstrating the framework's leading performance in this field.

1. Introduction

Physicians increasingly rely on imaging data for diagnostic tasks. For example, histopathological assessment based on whole-slide images is the current gold standard for cancer diagnosis. As more hospitals, clinics, laboratories, and doctors move to digital platforms, how to use large-scale imaging data to assist diagnosis has emerged as a hot research question. Deep learning, particularly convolutional neural networks (CNNs), is making great strides in medical image-based diagnosis, demonstrating strong results in many clinical applications [1]. However, the acquisition of medical datasets faces two difficulties: labeled samples are scarce because manual annotation is costly and time-consuming, and rare diseases are difficult to categorize correctly because they account for only a small fraction of the total samples. Achieving high classification accuracy with a small number of labeled samples is therefore an active research field.
To address these challenges, semi-supervised learning (SSL) was proposed to make use of the large amount of unlabeled samples [2]. Pseudo-labeling is a widely used SSL method, in which a model trained on labeled data predicts labels for unlabeled data and the predictions are used as pseudo-labels. Since the model is not fully trained at the beginning, incorrect pseudo-labels can amplify the model's error over successive iterations, so improving the confidence of predicted labels is a top research priority. Therefore, other methods, such as consistency regularization, are usually combined with pseudo-labeling in SSL. Wang et al. [3], in order to solve the quality problem of pseudo-labels, used different inputs to encourage the label network to learn invariant and robust representations, thereby improving generalization. Zhou et al. [4] proposed a growth threshold for pseudo-labeling and pseudo-label dropout to adapt the threshold of each category and alleviate category imbalance. Wang et al. [5] designed feature pseudo-labels that combine model predictions with feature similarity to improve pseudo-label accuracy.
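To illustrate the basic mechanism (this is a hedged sketch of generic threshold-based pseudo-labeling, not the method proposed in this paper), the following PyTorch fragment shows the classic self-training step; the model interface and threshold value are assumptions for the example:

```python
import torch
import torch.nn.functional as F

def pseudo_label_step(model, x_unlabeled, threshold=0.95):
    """Threshold-based pseudo-labeling: keep only confident predictions."""
    with torch.no_grad():
        probs = F.softmax(model(x_unlabeled), dim=1)  # class probabilities
    confidence, pseudo = probs.max(dim=1)             # max prob and its class
    mask = confidence >= threshold                    # confident samples only
    return x_unlabeled[mask], pseudo[mask]            # reusable as (x, y) pairs
```

The retained pairs are then mixed with the labeled data for the next training round, and the cycle repeats as the model improves.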
After reviewing the existing research, it is evident that most pseudo-labeling methods overlook the importance of ensuring high-quality initial pseudo-labels; substandard quality in this regard can lead to model collapse. At the same time, when only a very small number of labeled samples are available, the training results are not ideal. These are the key issues we focus on in our research.
In this work, a semi-supervised learning framework for medical image classification is proposed. The method consists of three components. Firstly, it uses a ResNet-18 model to generate and refine pseudo-labels. However, rather than predicting labels solely with the Softmax function as most methods do [6], our method, inspired by the SimMatch framework [7], refines pseudo-labels to reduce errors and make the model converge faster. In this process, semantic similarity and instance similarity are calibrated and adjusted against each other, encouraging augmented instances of the same category to be more similar. Secondly, we use information entropy to weight the pseudo-labels. Information entropy is closely related to the proportion of a class within the total sample size [8]. Samples that are difficult to categorize have higher entropy values than common disease samples, and assigning them higher weights focuses the network on learning them. The accuracy of the semantic pseudo-labels is enhanced by incorporating these weights. The accuracy of the initial pseudo-labels is widely recognized as having a significant impact on model training; to improve it, the semantic pseudo-labels generated from embeddings are used for label calibration. Thirdly, the method adds adaptive consistency regularization to the loss function. We utilize knowledge distillation [9] and transfer learning [10], so that a trained model transfers its knowledge through target instances and can be applied quickly and accurately to a specified dataset.
Our main contributions can be summarized in the following points:
  • Developing a collaborative similarity learning strategy aimed at optimizing pseudo-labels in order to enhance accuracy and expedite model convergence.
  • To ensure the quality of pseudo-labels during initial training, we employ a strategy of mutual correction involving semantic similarity and instance similarity. Furthermore, in order to improve the performance of the model, the similarity score is utilized as a weight to guide samples towards maintaining an appropriate distance from misclassification results during the classification process.
  • The model’s generalization ability can be improved by incorporating adaptive consistency constraints into the loss function, thus enhancing its performance on untrained data sets.
  • We performed extensive experiments and demonstrated performance that surpasses existing semi-supervised methods. To assess the generalization of the method, we evaluate our model on two medical image datasets: the BreakHis dataset [11] and the Chest X-Ray14 dataset [12].

2. Related Work

2.1. Semi-Supervised Learning

SSL is a learning strategy that improves the performance of deep neural networks by utilizing unlabeled samples when labeled data are scarce [13]. Current SSL methods can be categorized into four groups: generative models [14], consistency regularization [15], graph neural network (GNN) models [16], and pseudo-labeling [17].
The objective of the generative model is to acquire knowledge about the underlying data distribution, enabling it to generate novel samples that exhibit similarity while maintaining distinctiveness from the training dataset. Although generative models have made remarkable advancements in various domains, they also encounter certain limitations and challenges. For instance, they may experience instability during the training process, which potentially necessitates a substantial amount of training data to achieve optimal performance. Given the limited number of samples typically available in medical image datasets, this approach is evidently unsuitable.
GNNs are a class of neural network models specifically designed for processing graph-structured data, with the goal of acquiring node and edge representations to tackle tasks on such data. They effectively capture the inherent structure and patterns within a graph by propagating information through inter-node relationships. However, there exist two primary limitations: firstly, when dealing with categories containing fewer nodes, the model’s performance in the learning process may be subpar. Secondly, different tasks and datasets often necessitate distinct model architectures and parameter configurations. Consequently, this paper seeks a generalized model that can address the prevalent issue of data imbalance commonly encountered in medical image tasks.
Consistency regularization is a regularization method used in supervised and semi-supervised learning to reduce overfitting; the idea is to force the same unlabeled sample under different perturbations to produce the same output after passing through the model. Zhu et al. [18] jointly constrained local and global consistency in a projection learning framework in pursuit of a robust model. Consistency-regularization-based SSL methods are overly dependent on domain-specific data augmentation and are not well suited to medical images.
Pseudo-labeling, another major SSL method proposed by Lee [19], makes predictions on unlabeled data in the semi-supervised setting, aiming to improve model performance by predicting labels and thereby increasing the amount of training data. This classification task uses a self-training approach: self-training generates pseudo-labels for unlabeled samples, the data selected above a predetermined threshold are fed into the network along with the labeled data, and the cycle continues until the network reaches an optimal state. Shi et al. [20] improved the model by encouraging larger inter-class distances and smaller intra-class distances. Wu et al. [21] incorporated a density peak of the data space structure into an iterative process to optimize attribute values for unlabeled data. Li et al. [22] learned a soft weighting to optimize pseudo-label weights so that the network better discriminates which samples should be emphasized for learning. Pham et al. [23] investigated meta pseudo labels, allowing the teacher model to generate proxy labels through meta-learning and improving the student model's learning to adjust the target distribution of the training dataset.
Each of the four approaches to SSL has drawbacks, and multiple methods are sometimes combined to enhance SSL, such as pseudo-labeling with consistency regularization [24]. Sohn et al. [25] proposed the FixMatch framework, which combines consistency regularization and pseudo-labeling, training the network by forcing the pseudo-labels of weakly and strongly augmented versions of the same image to agree. Wang et al. [26] leveraged the large amount of underlying knowledge extracted through deep virtual adversarial self-training with consistency regularization to improve the discrimination capability of trained models. Zhou et al. [27] used intra- and inter-instance consistency based on augmented samples and the k nearest neighbors of an anchor image, aiming to improve the robustness of the classification method by forcing medical images that differ slightly but belong to the same disease to receive the same prediction.

2.2. Similarity Learning

Similarity learning is an important problem in computer vision and machine learning, where the goal is to learn a similarity measure under which the distance between samples of the same class decreases and the distance between samples of different classes increases. There are many ways to compute similarity; two commonly used measures, adopted here, are Euclidean distance and cosine similarity.
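For concreteness, the two measures can be computed for batches of feature vectors as in the minimal PyTorch sketch below; the batch and feature dimensions are arbitrary choices for illustration:

```python
import torch
import torch.nn.functional as F

a = torch.randn(8, 128)  # a batch of 8 feature vectors
b = torch.randn(8, 128)  # another batch of 8 feature vectors

# pairwise Euclidean distances, shape (8, 8)
euclidean = torch.cdist(a, b, p=2)

# pairwise cosine similarities, shape (8, 8)
cosine = F.cosine_similarity(a.unsqueeze(1), b.unsqueeze(0), dim=2)
```

Euclidean distance is sensitive to vector magnitude, while cosine similarity depends only on direction, which is why the two measures complement each other.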
In addition, some local metric learning and nonlinear metric learning methods have been proposed to deal with complex distributed data. Local metric learning overcomes the shortcoming of traditional single-global-metric learning by learning a local distance measure for each category or each sample. Nonlinear metric learning improves the learning ability of traditional metric learning on complex distributed data by using kernel-based nonlinearization or by directly learning a nonlinear distance or similarity. However, these metric learning methods are mainly built on existing feature representations, making it difficult to break through the bottleneck of hand-crafted features.
With the remarkable success of convolutional neural network (CNN) models in visual tasks such as image classification and object detection in recent years, numerous researchers have started integrating distance metric learning with convolutional neural networks, proposing a series of joint learning models for feature representation and similarity measurement. Xia et al. [28] integrated multiple cosine similarity functions to ensure the learner receives a diversity of samples. Ye et al. [29] proposed a Unified Multi-Metric Learning framework, a set of indicators describing similarity from different perspectives. However, most methods represent dissimilarity by projecting samples to single points in a Euclidean space whose distance satisfies the triangle inequality. To explore more of the potential of similarity in semi-supervised learning, Zhang et al. [30] built a learning framework that represents the similarity between two images with a graph and verified the interpretability of the proposed framework. Wang et al. [31] considered that the correlation between unlabeled and labeled samples can help the model extract more discriminative features and thus obtain more accurate predictions.
Simultaneously, networks specifically designed for measuring similarity continue to be developed, with the Siamese convolutional neural network being the most prominent example. The Siamese CNN is a special type of convolutional neural network mainly used for data requiring pairwise comparison, such as determining whether two inputs are similar. Hamrouni et al. [32] proposed using Siamese training to distinguish the similarity and dissimilarity of the obtained samples. The study by Huang et al. [33] suggested that a dual-path Siamese CNN can effectively leverage deep convolutional neural networks even with limited training samples. Xiao et al. [34] combined the similarity measures of Siamese and CNN-based subnetworks to increase the similarity of samples from the same class and the difference between samples from different classes, ultimately improving classification accuracy.

3. Materials and Methods

The architecture of the model is elaborated in this section. Firstly, the preliminary work of the framework is presented. Secondly, the overall framework and its components are introduced. Finally, the training process and some details of the proposed method are discussed.

3.1. Preliminaries

The goal of our work is to classify medical images more accurately in a semi-supervised setting. Let $D_L = \{(x_i^l, y_i)\}_{i=1}^{|D_L|}$ be the labeled training set, where $x_i^l \in X$ is an input image from the labeled set, $y_i \in Y$ is its label from the set of classes, and $|D_L|$ is the number of labeled images. There are $K$ categories in total. An unlabeled training set $D_U = \{x_i^u\}_{i=1}^{|D_U|}$ is defined, where $x_i^u \in X$ is an unlabeled image. The entire dataset $D_S = D_L \cup D_U = \{(x_i, y_i)\}_{i=1}^{|D_L| + |D_U|}$ is the input of the model. The batch size is $B$.
After determining the image dataset, the samples are augmented using a weak augmentation function $T_w(\cdot)$ and a strong augmentation function $T_s(\cdot)$. The weak augmentation techniques include flipping and cropping, while the strong augmentation employs random transformations. Then, ResNet-18 [35] is used as a feature extractor $F(\cdot)$; the extracted features are $h = F(x_i)$. Finally, the output of the fully connected layer, passed through the softmax function, gives the predictions $y_p$.
The predictions for labeled samples are optimized using the cross-entropy loss against the labels. The formula is as follows, where $\mathcal{L}_{ce}(\cdot)$ denotes the cross-entropy loss function and $y_p^l$ is the predicted label of a labeled sample:
$$ L_l = \mathcal{L}_{ce}(y_p^l, y_i), \quad (x_i^l, y_i) \in D_L. \qquad (1) $$
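These preliminaries can be sketched in PyTorch as follows; the specific augmentation operations and the eight-class head are illustrative assumptions, not the authors' exact configuration:

```python
import torch
import torch.nn as nn
from torchvision import models, transforms

K = 8  # e.g., the eight BreakHis classes

# weak augmentation T_w: flipping and cropping
weak_aug = transforms.Compose([
    transforms.RandomHorizontalFlip(),
    transforms.RandomCrop(224, padding=16),
    transforms.ToTensor(),
])
# strong augmentation T_s: random transformations
strong_aug = transforms.Compose([
    transforms.RandAugment(),
    transforms.ToTensor(),
])

# ResNet-18 feature extractor F(.) with a K-way classification head
backbone = models.resnet18(weights=models.ResNet18_Weights.IMAGENET1K_V1)
backbone.fc = nn.Linear(backbone.fc.in_features, K)

criterion = nn.CrossEntropyLoss()

def supervised_loss(x_labeled, y):
    logits = backbone(x_labeled)  # predictions y_p^l before softmax
    return criterion(logits, y)   # Equation (1): L_l
```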

3.2. Coalition Similarity Training Framework

In the SSL framework, pseudo-labeling methods [36] are designed to utilize large amounts of unlabeled data to improve classification accuracy. The quality of the pseudo-labels is closely related to the resulting classification performance. To further improve pseudo-label quality, two strategies are usually adopted: pseudo-label refinement [37] and consistency regularization [38]. Therefore, we propose a similarity computing module that combines semantic similarity and instance similarity. This semi-supervised method weights and cross-calibrates two kinds of pseudo-labels by similarity. The framework is shown in Figure 1.
Specifically, the framework uses the ResNet-18 model, pre-trained on the ImageNet dataset [39], as a feature extractor. The model generates two different types of pseudo-labels: semantic pseudo-labels are computed by applying the Softmax function to the feature representation, while instance pseudo-labels are obtained from the low-dimensional embedding of the aggregated feature map. These two kinds of pseudo-labels cooperate through mutual calibration to produce the final pseudo-label.
Consistency regularization losses are incorporated into the model during training, which helps the model learn more robust and generic features on unlabeled data and improves its performance in real-world applications. Therefore, the coalition similarity training framework not only improves the quality of pseudo-labels but also ensures the robustness and generalization ability of the model during learning.

3.2.1. Semantic Pseudo Labels

The framework for this section is shown in Figure 2.
The class prototypes are generated from the feature representations of the samples predicted to belong to each class. In this paper, the prototype of class c is computed as the class centroid, i.e., a running mean of the feature representations:
$$ P_c^t = \begin{cases} \dfrac{1}{2}\left[\displaystyle\sum_{j=1}^{H \times W} z_j \, \mathbb{1}(y_p = c) + P_c^{t-1}\right], & \text{if } y_p = c \\ P_c^{t-1}, & \text{otherwise,} \end{cases} \qquad (2) $$
where $z_j$ is obtained by flattening the height $H$ and width $W$ dimensions of the feature map extracted from the $i$-th image, with $j$ ranging from 1 to $H \times W$; $\mathbb{1}(y_p = c)$ is the indicator function, equal to 1 if $y_p$ belongs to class $c$ and 0 otherwise; and $P_c^{t-1}$ is the class-$c$ prototype obtained at step $t-1$, with $c \in [1, K]$.
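A hedged sketch of this running prototype update (Equation (2)) in PyTorch; the per-image feature summation and the batch interface are simplifying assumptions:

```python
import torch

def update_prototypes(prototypes, m, preds):
    """prototypes: (K, D) old prototypes P^{t-1};
    m: (N, D) summed spatial features per image (the z_j sums);
    preds: (N,) predicted classes y_p for the N images."""
    new_protos = prototypes.clone()
    for c in range(prototypes.size(0)):
        mask = preds == c
        if mask.any():
            # Equation (2): average the class-c feature sum with P_c^{t-1}
            new_protos[c] = 0.5 * (m[mask].sum(dim=0) + prototypes[c])
        # otherwise P_c^t = P_c^{t-1} (unchanged)
    return new_protos
```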
Then, the cosine similarity [40] between the new prototype $m_i$ and the prototype $P_c^{t-1}$ of class $c$ is computed:
$$ sim_{i,c} = \frac{m_i \cdot P_c^{t-1}}{\|m_i\| \, \|P_c^{t-1}\|}, \quad m_i = \sum_{j=1}^{H \times W} z_j, \qquad (3) $$
where the new prototype $m_i$ is the sum of the flattened feature vectors extracted from the $i$-th image and $\cdot$ denotes the dot product.
The normalized similarity is defined as
$$ \widetilde{sim}_{i,c} = \frac{\exp(sim_{i,c})}{\sum_{j=1}^{K} \exp(sim_{i,j})}, \qquad (4) $$
where $\exp(\cdot)$ is the exponential function; the normalized similarity reflects the probability that the new sample is similar to each old class prototype. The similarities of the new sample to all old class prototypes constitute the vector $\widetilde{sim}_i$, which serves as the weight of the semantic pseudo-label:
$$ p_i^w = \widetilde{sim}_i \, y_{i,p}^w, \qquad (5) $$
where $y_{i,p}^w$ is the prediction obtained from the weakly augmented unlabeled sample.
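Equations (3)-(5) can be expressed compactly as the sketch below (assuming `m` collects the per-image feature sums and `y_weak` the weak-view softmax predictions; both interfaces are illustrative):

```python
import torch
import torch.nn.functional as F

def semantic_pseudo_labels(m, prototypes, y_weak):
    """m: (N, D) new prototypes m_i; prototypes: (K, D) old prototypes
    P^{t-1}; y_weak: (N, K) predictions from weakly augmented views."""
    # Equation (3): cosine similarity of each sample to each old prototype
    sim = F.cosine_similarity(m.unsqueeze(1), prototypes.unsqueeze(0), dim=2)
    # Equation (4): softmax-normalize the similarities over the K classes
    sim_tilde = F.softmax(sim, dim=1)
    # Equation (5): weight the weak predictions by the similarity vector
    return sim_tilde * y_weak
```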

3.2.2. Instance Pseudo Labels

After extracting the features, a nonlinear projection head maps the feature representation into a low-dimensional embedding $e$, where $e^w$ and $e^s$ are the embeddings from the weak and strong augmentations, respectively. Within a batch of image samples, we compute the similarity of each instance $e_i$ ($i \in [1, \dots, B]$) to $e^w$ and $e^s$, then normalize it to obtain the similarity distributions [7]:
$$ q_i^w = \frac{\exp(sim(e_i, e^w)/t)}{\sum_{k=1}^{B} \exp(sim(e_k, e^w)/t)}, \qquad (6) $$
$$ q_i^s = \frac{\exp(sim(e_i, e^s)/t)}{\sum_{k=1}^{B} \exp(sim(e_k, e^s)/t)}, \qquad (7) $$
where t is the temperature parameter that controls the sharpness.
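A sketch of Equations (6) and (7): normalizing the embeddings makes the dot product a cosine similarity, and the batch-level softmax with temperature t follows the formulas above.

```python
import torch
import torch.nn.functional as F

def instance_similarity(e_batch, e_view, t=0.1):
    """e_batch: (B, d) embeddings e_i of the batch; e_view: (d,) embedding
    e^w or e^s of one augmented view; t: temperature (0.1 in Section 4)."""
    e_batch = F.normalize(e_batch, dim=1)  # unit-norm embeddings so that
    e_view = F.normalize(e_view, dim=0)    # dot products are cosine sims
    logits = e_batch @ e_view / t          # sim(e_i, e) / t for each i
    return F.softmax(logits, dim=0)        # q_i^w or q_i^s over the batch
```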
In the above process, two kinds of pseudo-labels are calculated. Enabling interaction between semantic similarity and instance similarity further harnesses the potential of the label information. The instance pseudo-labels are adjusted as follows:
$$ \hat{q}_i = \frac{p_i^{unfold} \, q_i^w}{\sum_{k=1}^{K} q_k^w \, p_k^{unfold}}, \qquad (8) $$
where $p_i^{unfold}$ is $p_i^w$ unfolded to the same dimension as $q_i^w$.
Conversely, instance similarity is used to adjust the semantic similarity by aggregating $q_i^w$ to the dimension of $p_i^w$, denoted $q_i^{agg}$. The impact of pseudo-label quality on model training is widely acknowledged. To optimize the trained model over the entire unlabeled dataset, calibration of the semantic pseudo-labels becomes essential, which can be expressed as follows:
$$ \hat{p}_i = (1 - \alpha) \, q_i^{agg} + \alpha \, p_i^w. \qquad (9) $$
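The unfold/aggregate calibration of Equations (8) and (9) can be sketched as below, following SimMatch-style bookkeeping; treating the batch argmax as the class assignment of each instance is a simplifying assumption:

```python
import torch

def calibrate(p_w, q_w, labels, alpha=0.95):
    """p_w: (K,) semantic pseudo-label of one sample; q_w: (B,) instance
    similarities to the batch; labels: (B,) class assignments of the batch
    instances; alpha: smoothing parameter (0.95 in Section 4)."""
    # Equation (8): unfold p_w to the instance dimension and renormalize
    p_unfold = p_w[labels]                           # (B,)
    q_hat = p_unfold * q_w / (p_unfold * q_w).sum()

    # Equation (9): aggregate q_w back to the class dimension and smooth
    q_agg = torch.zeros_like(p_w).index_add_(0, labels, q_w)  # (K,)
    p_hat = (1 - alpha) * q_agg + alpha * p_w
    return q_hat, p_hat
```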

3.3. Loss Functions

To enhance the model's robustness, a consistency regularization loss is introduced to ensure that weakly augmented views and strongly augmented views exhibit similar similarity distributions:
$$ L_{in} = \frac{1}{B} \sum_{i=1}^{B} H(q_i^s, \hat{q}_i), \qquad (10) $$
where $H(\cdot)$ is the cross-entropy loss function.
A consistency loss, the maximum mean discrepancy (MMD), is widely used in domain adaptation. This loss measures the distance between two different but related distributions and can be viewed as evaluating whether two sets of data are similar. For these reasons, it is well suited to minimizing the disparity between labeled and unlabeled samples.
Filtering out low-quality pseudo-labels during training is crucial to mitigate their impact on subsequent outcomes. A buffer is established to store the samples that meet a threshold: if $y_p^l > \tau$, the sample $x_i^l$ is deposited into $T_l$; if $\hat{p}_i > \tau$, the sample $x_i^u$ is deposited into $T_u$. The MMD loss is then computed:
$$ L_u = MMD(T_l, T_u) = \left\| \frac{1}{m} \sum_{i=1}^{m} k(x_i^l) - \frac{1}{n} \sum_{j=1}^{n} k(x_j^u) \right\|_{\mathcal{H}}^2, \qquad (11) $$
where $\tau$ is the confidence threshold and $k(\cdot)$ is the feature mapping into the reproducing kernel Hilbert space $\mathcal{H}$.
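Equation (11) leaves the kernel implicit; the sketch below estimates the squared MMD with a Gaussian kernel, which is an assumption for illustration rather than the paper's stated choice:

```python
import torch

def mmd_loss(feat_l, feat_u, sigma=1.0):
    """feat_l: (m, d) features of thresholded labeled samples (buffer T_l);
    feat_u: (n, d) features of thresholded unlabeled samples (buffer T_u)."""
    def rbf(x, y):
        # Gaussian kernel values from pairwise squared distances
        dist2 = torch.cdist(x, y, p=2) ** 2
        return torch.exp(-dist2 / (2 * sigma ** 2))

    # biased estimate of || mean k(x^l) - mean k(x^u) ||_H^2
    return rbf(feat_l, feat_l).mean() + rbf(feat_u, feat_u).mean() \
        - 2 * rbf(feat_l, feat_u).mean()
```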
A detailed explanation of each component is provided above. Consequently, the overall model loss can be defined as
$$ L_S = L_l + \lambda_{in} L_{in} + \lambda_u L_u, \qquad (12) $$
where the $\lambda$ terms are balancing factors that control the weight of each loss.

3.4. Model Training

In each iteration, the feature representations of images are extracted using ResNet-18, with weak augmentation applied to labeled samples ($h_l^w$) and both weak and strong augmentation applied to unlabeled samples ($h_u^w$ and $h_u^s$). The entire feature extraction network is trainable. The weight for the semantic pseudo-labels is determined by the similarity between old and new prototypes in each training iteration: the weight increases when this similarity is higher, indicating that images with a stronger resemblance to a category are more likely to be classified as belonging to it. After the smoothing operation, the instance pseudo-labels are refined to improve their accuracy.
The specific training process is as follows:
  • Loading the network structure and randomly initializing the network parameters.
  • The ResNet-18 network processes both labeled and unlabeled samples. The resulting feature representations are passed through the Softmax layer to obtain the predicted labels for the labeled samples and the pseudo-labels for the unlabeled samples.
  • Utilize the outputs of the fully connected layer to compute the class prototype, followed by calculating the cosine similarity between the new and old prototypes as the measure of similarity weight. Similarity weights and pseudo-labels are combined to generate semantic pseudo-labels.
  • The instance similarities are computed separately from the weakly and strongly augmented embeddings. The weakly augmented embeddings are calibrated with the semantic pseudo-labels to generate instance pseudo-labels, which are fed into the cross-entropy loss function together with the strongly augmented embeddings for optimization.
  • The semantic pseudo-labels are smoothed by instance similarity to obtain the final pseudo-labels. The samples chosen by the given threshold are used to compute the MMD loss. Finally, the network parameters are optimized through minimization of the overall loss function.
  • Repeat steps (2)–(5) for each training iteration.
  • Assess the performance of the trained model by applying it to the test dataset. The test sample is utilized as the input, and the trained ResNet-18 network generates the predicted classification output, which is compared with the truth values to compute the accuracy.
A more explicit description is given in Algorithm 1.
Algorithm 1. Coalition similarity framework.
1: require: $x^l$ and $x^u$: a batch $B$ of labeled and unlabeled samples; $T$: total training steps.
2: while $t < T$ do
3:   compute semantic pseudo-labels $p^w$ by Equations (4) and (5)
4:   compute instance pseudo-labels $q^s$, $q^w$ and $q^{agg}$ by Equations (6)–(8)
5:   optimize the model by Equation (12)
6:   update the model's parameters
7:   update the $K$ class prototypes by Equation (2)
8:   $t \leftarrow t + 1$
9: end while
10: output: the trained model $F$
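Putting the pieces together, one training iteration of Algorithm 1 might look like the following sketch. It reuses the illustrative helpers defined earlier (`semantic_pseudo_labels`, `calibrate`, `mmd_loss`) and assumes a model that returns classification logits, a low-dimensional embedding, and a pooled feature vector for each input; all of these interfaces are assumptions, not the authors' exact implementation:

```python
import torch
import torch.nn.functional as F

def train_step(model, optimizer, x_l, y_l, x_uw, x_us, prototypes,
               tau=0.7, lambda_in=1.0, lambda_u=10.0, t=0.1, alpha=0.95):
    # supervised branch, Equation (1)
    logits_l, _, feat_l = model(x_l)
    loss_l = F.cross_entropy(logits_l, y_l)

    # unlabeled branches: weak and strong views
    logits_w, e_w, m_w = model(x_uw)
    _, e_s, feat_us = model(x_us)

    # semantic pseudo-labels, Equations (3)-(5)
    p_w = semantic_pseudo_labels(m_w, prototypes, logits_w.softmax(dim=1))

    # batch-level instance similarities, Equations (6) and (7)
    z_w, z_s = F.normalize(e_w, dim=1), F.normalize(e_s, dim=1)
    q_w = F.softmax(z_w @ z_w.t() / t, dim=1)
    q_s = F.softmax(z_w @ z_s.t() / t, dim=1)

    # per-sample cross-calibration, Equations (8) and (9)
    labels = p_w.argmax(dim=1)
    q_hat, p_hat = zip(*[calibrate(p_w[i], q_w[i], labels, alpha)
                         for i in range(p_w.size(0))])
    q_hat, p_hat = torch.stack(q_hat), torch.stack(p_hat)

    # consistency loss, Equation (10)
    loss_in = -(q_hat * torch.log(q_s + 1e-8)).sum(dim=1).mean()

    # MMD loss over confidence-thresholded samples, Equation (11)
    mask = p_hat.max(dim=1).values > tau
    loss_u = mmd_loss(feat_l, feat_us[mask]) if mask.any() \
        else feat_l.new_zeros(())

    # total loss, Equation (12), then one optimization step
    loss = loss_l + lambda_in * loss_in + lambda_u * loss_u
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```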

4. Experiments

For the experiments below, we use the BreakHis dataset [11] and the Chest X-Ray14 dataset [12]. The BreakHis dataset is composed of 2480 benign and 5429 malignant samples (700 × 460 pixels) across eight categories: 444 samples of adenosis (A), 569 of tubular adenoma (TA), 1014 of fibroadenoma (F), 453 of phyllodes tumor (PT), 3451 of ductal carcinoma (DC), 626 of lobular carcinoma (LC), 792 of mucinous carcinoma (MC), and 560 of papillary carcinoma (PC). The training set consists of 6327 images and the test set contains 1582 images. The eight types of images are shown in Figure 3.
The Chest X-ray14 dataset has a total of 112,120 X-ray images covering 14 types of lung disease. The experiments selected six common diseased-lung categories (2564 Atelectasis, 2406 Effusion, 2296 Infiltration, 1302 Mass, 1646 Nodule, and 1335 Pneumothorax) and 10,711 normal lungs. The training set consists of 3519 images and the test set contains 2678 images. The six disease types are shown in Figure 4.
The datasets were pre-processed: each image was resized to 224 × 224 pixels. The training set was then partitioned into labeled and unlabeled subsets, with varying proportions of labeled data (5%, 10%, 20%, 50%, 80%, and 95%) used to test the framework.
The experiments in this study were performed on a workstation with an 11 GB RTX 2080 Ti GPU. The models were implemented in Python (v3.6). The optimizer used during training is stochastic gradient descent (SGD) with momentum; the momentum rate is 0.9, the initial learning rate is 0.001, and the learning rate follows cosine decay. The batch size for both the labeled and unlabeled datasets is 16. The confidence threshold is 0.7, and the balancing parameters λ of the loss function are set to (1, 10). All parameters are the best results obtained after 100 iterations of experimentation. Numbers highlighted in bold in the subsequent tables indicate the optimal values.
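A sketch of this optimization setup, assuming `backbone` is the ResNet-18 classifier from the earlier sketch and that one `train_step` call constitutes an iteration:

```python
import torch

total_steps = 90_000  # 200,000 for Chest X-ray14 (see below)
optimizer = torch.optim.SGD(backbone.parameters(), lr=0.001, momentum=0.9)
scheduler = torch.optim.lr_scheduler.CosineAnnealingLR(
    optimizer, T_max=total_steps)  # cosine learning-rate decay

for step in range(total_steps):
    # ... draw a batch of 16 labeled and 16 unlabeled images,
    # run one training iteration, then decay the learning rate ...
    scheduler.step()
```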
The accuracy rate serves as the benchmark for assessing the efficacy of the method. The experiments compared our method with other methods, including Pseudo-Labeling [19], interpolation consistency training (ICT) [41], FixMatch [25], and Mean Teacher [42]. The model for the BreakHis dataset was trained for 90,000 iterations and that for the Chest X-ray14 dataset for 200,000 iterations. The baseline method is the simplest pseudo-label model, which uses only a single standard ResNet-18 for feature extraction and classification with a Softmax loss function.
Table 1 shows the classification accuracy obtained using different networks and different annotated percentages of labeled images; the data in Table 1 represent the mean values of 50 repeated trials. Table 2 compares the proposed method with state-of-the-art methods. Tables 1 and 2 demonstrate that our proposed method achieves superior performance. Because annotated data are scarce and expensive, it is imperative to consider scenarios where only a limited portion of the training data is labeled in semi-supervised learning. The method also maintains very stable performance when the proportion of labeled data is below 50%, which underscores the value of our research.
Figure 5 illustrates the feature distribution for different labeling ratios. Feature clustering is effective under both ratios, and the distinction between the two can also be observed in the diagram. Taking the most representative orange cluster as an example, there are fewer scattered points with 95% labeled data.
Figure 6 shows the confusion matrices for different labeling ratios on the BreakHis dataset. Table 3 reports the per-class accuracy evaluated with 80% labeled samples. The superiority of our method is evident compared to FixMatch, which performs best among the other four methods. Hence, our approach holds significant research value.
The visualization at a 70% annotation ratio is depicted in Figure 7. To determine whether our method improves accuracy, the logits-layer features are visualized for the original pseudo-labels, the mutual adjustment of the two kinds of pseudo-labels, and the coalition similarity training. Figure 7a shows that the clusters of the original pseudo-label features are scattered and partially overlapping. Figure 7b is based on the SimMatch framework; the features exhibit clustering, yet there are evidently intertwined components. Figure 7c visualizes our method. The comparison of the three visualizations demonstrates our progress.
Figure 8 is a graphical representation of the accuracy. The framework's accuracy reaches a maximum of approximately 86% but is not sustained in the absence of cross-calibration and semantic pseudo-label weighting (green polyline). The base model incorporates cross-calibration between instance and semantic pseudo-labels; its accuracy reaches a maximum of 92% and typically ranges between 90% and 91% (yellow polyline). On this basis, performance is further improved by weighting the semantic pseudo-labels, i.e., the pseudo-labels that most directly optimize the whole model. With the similarity weights added, the result of the full coalition similarity framework is shown by the blue line in Figure 8: the accuracy is stable between 93% and 94%, comparable to supervised learning methods.
To verify the generalization ability of the framework, generalization tests were performed on the Chest X-ray14 dataset. Table 4 shows the classification accuracy on the Chest X-ray14 dataset, demonstrating that the model performs well on diverse pathological datasets.
Figure 9 shows the classification accuracy when using different loss functions. The green line represents using a single cross-entropy loss to optimize the model, without $L_{in}$ and $L_u$; its accuracy is evidently unstable, surpassing the yellow line during early training but declining below 90% in later stages, with an average accuracy of 80.59%. The yellow dashed line indicates that both $L_{in}$ and $L_u$ use the cross-entropy function; the classification accuracy is stable at around 90%. The green dashed line represents the combination introduced in Section 3, where $L_{in}$ is the cross-entropy loss and $L_u$ is the MMD loss. This combination of loss functions achieved the best results.
The loss function drives the optimization of the model, and the balancing factors control the proportion of each loss term. To investigate the impact of the loss weight parameters on performance, we tested different settings. Figure 10 shows the corresponding classification accuracy for different values; the highest accuracy is observed for the pair (1, 10).
Finally, the temperature and smoothing parameters of the model were tested. The temperature in Equations (6) and (7) controls the sharpness of the instance distribution; the optimal accuracy is observed when it is set to 0.1. The smoothing parameter in Equation (9) adjusts the final pseudo-labels. Because it makes little difference at 0.9 versus 0.95 (accuracies of 83.24% and 84.39%, respectively), the experiments use the optimal value of 0.95. An intuitive comparison is shown in Figure 11.

5. Discussion

Automatic classification based on medical images is a crucial component of computer-aided diagnosis systems. The effectiveness of general-purpose, highly accurate models largely depends on a substantial investment in labeled data, and computer-aided diagnosis and treatment systems require a substantial amount of time, money, and high-quality annotations. Consequently, achieving satisfactory accuracy in the medical domain remains challenging for general models. The scarcity of labeled samples gives rise to numerous challenges, including but not limited to overfitting, diminished accuracy, and class imbalance. To address these challenges, this paper presents a semi-supervised coalition similarity training framework. The method was assessed on the BreakHis and Chest X-Ray14 datasets, and its performance reaches the state of the art.
The inspiration for our framework stems from SimMatch, which utilizes two types of pseudo-labels that mutually adjust each other. SimMatch is well suited to color image datasets consisting predominantly of everyday objects, such as CIFAR-10. In medical image datasets, however, the lesions of various categories are minute and challenging to detect, and the manifestations of certain diseases exhibit significant similarities; hence, the SimMatch framework is not directly applicable to our research domain.
In response to the aforementioned challenges, a strategy called the coalition similarity training framework has been developed with the goal of improving the quality of pseudo-labels and speeding up model convergence. This framework leverages image features extracted from the ResNet-18 network to derive prediction labels. To ensure high-quality pseudo-labels during initial training, we implemented a mutual correction strategy that integrates semantic and instance similarity. Furthermore, we introduced similarity scores as a guiding mechanism to maintain an appropriate distance from misclassification outcomes during classification. Additionally, to further enhance model performance, adaptive consistency constraints were integrated into the loss function. Collectively, these measures bolster the model's generalization capability and significantly elevate its performance on untrained datasets.
The proposed method is compared with other methods in Section 4 to demonstrate its effectiveness. Among semi-supervised methods, our framework is compared with the baseline, ICT, FixMatch, and Mean Teacher; to broaden the comparison, it is also compared with supervised and unsupervised learning. Table 1 and Table 4 show the advances of our approach. Compared to the widely adopted FixMatch, the coalition similarity training framework improves accuracy by a notable 2.37% with 50% labeled samples. This outcome underscores the efficacy of a pseudo-label method based on cross-calibration of two similarities for semi-supervised learning.
When image resolution in the dataset is low, the proposed feature extraction framework exhibits a significant decrease in classification performance, primarily because feature representations are difficult to extract accurately from low-resolution images. Another potential issue arises under extreme sample imbalance: when a category has too few samples, they risk being misclassified into categories with more samples, leading to misjudged identification results.
In conclusion, future research will primarily focus on processing low-resolution images, refining feature extraction, and classifying imbalanced data. Resolving these issues will advance image recognition technology and play a crucial role in practical applications.

6. Conclusions

Amid the rapid progress of modern medical technology, computer-aided diagnosis systems have shown an irreplaceable role in clinical practice. Medical image classification algorithms, as an indispensable component, have been a focus for researchers. To improve accuracy, ensuring high-quality annotated data has become a critical and heavy task.
In view of this, this study proposes a coalition similarity training framework. The method significantly improves the accuracy of the final pseudo-labels through a dual calibration mechanism between semantic and instance pseudo-labels, making the framework particularly applicable to the classification of medical image datasets.
The experimental results show that when more than 50% of the samples are labeled, the framework performs at a level comparable to current mainstream supervised learning methods. With only 5% labeled samples, the framework's accuracy improves by an average of 2% over other semi-supervised learning methods, reaching 74.28%. These empirical results demonstrate the innovative and leading position of this framework in current medical image classification technology.

Author Contributions

Funding acquisition, K.L. and S.L. (Sidong Liu); methodology, S.L. (Shuyi Ling); software, S.L. (Shuyi Ling); supervision, K.L.; validation, S.L. (Shuyi Ling); writing—original draft, K.L. and S.L. (Shuyi Ling); writing—review and editing, K.L. and S.L. (Sidong Liu). All authors have read and agreed to the published version of the manuscript.

Funding

This work is sponsored by the National Natural Science Foundation of China Grant No. 62271302, the Shanghai Municipal Natural Science Foundation Grant 20ZR1423500 and the Aeronautical Science Foundation of China under Grant 201955015001.

Data Availability Statement

Dataset available on request from the authors.

Conflicts of Interest

The authors declare no conflicts of interest.

References

  1. Huynh, T.; Nibali, A.; He, Z. Semi-supervised learning for medical image classification using imbalanced training data. Comput. Methods Programs Biomed. 2022, 216, 106628. [Google Scholar] [CrossRef] [PubMed]
  2. Kostopoulos, G.; Karlos, S.; Kotsiantis, S.; Ragos, O. Semi-supervised regression: A recent review. J. Intell. Fuzzy Syst. 2018, 35, 1483–1500. [Google Scholar] [CrossRef]
  3. Wang, K.; Wang, X.S.; Cheng, Y.H. Few-shot learning based on enhanced pseudo-labels and graded pseudo-labeled data selection. Int. J. Mach. Learn. Cybern. 2023, 14, 1783–1795. [Google Scholar] [CrossRef]
  4. Zhou, S.F.; Tian, S.W.; Yu, L.; Wu, W.D.; Zhang, D.Z.; Peng, Z.; Zhou, Z.C. Growth threshold for pseudo labeling and pseudo label dropout for semi-supervised medical image classification. Eng. Appl. Artif. Intell. 2024, 130, 107777. [Google Scholar] [CrossRef]
  5. Wang, P.; Wang, X.X.; Wang, Z.; Dong, Y.F. Learning Accurate Pseudo-Labels via Feature Similarity in the Presence of Label Noise. Appl. Sci. 2024, 14, 2759. [Google Scholar] [CrossRef]
  6. Bai, T.; Zhang, Z.; Guo, S.; Zhao, C.; Luo, X. Semi-supervised cell detection with reliable pseudo-labels. J. Comput. Biol. 2022, 29, 1061–1073. [Google Scholar] [CrossRef] [PubMed]
  7. Zheng, M.; You, S.; Huang, L.; Wang, F.; Qian, C.; Xu, C. SimMatch: Semi-supervised learning with similarity matching. In Proceedings of the 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), New Orleans, LA, USA, 18–24 June 2022; pp. 14451–14461. [Google Scholar] [CrossRef]
  8. Liu, F.; Tian, Y.; Chen, Y.; Liu, Y.; Belagiannis, V.; Carneiro, G. ACPL: Anti-curriculum pseudo-labelling for semi-supervised medical image classification. In Proceedings of the 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), New Orleans, LA, USA, 18–24 June 2022; pp. 20665–20674. [Google Scholar] [CrossRef]
  9. Komodakis, N.; Zagoruyko, S. Paying more attention to attention: Improving the performance of convolutional neural networks via attention transfer. In Proceedings of the 5th International Conference on Learning Representations (ICLR) 2017, Toulon, France, 24–26 April 2017. [Google Scholar]
  10. Li, X.; Grandvalet, Y.; Davoine, F. Explicit inductive bias for transfer learning with convolutional networks. In Proceedings of the 35th International Conference on Machine Learning, Stockholm, Sweden, 10–15 July 2018; pp. 2825–2834. [Google Scholar]
  11. Spanhol, F.A.; Oliveira, L.S.; Petitjean, C.; Heutte, L. A dataset for breast cancer histopathological image classification. IEEE Trans. Biomed. Eng. 2016, 63, 1455–1462. [Google Scholar] [CrossRef] [PubMed]
  12. Wang, X.; Peng, Y.; Lu, L.; Lu, Z.; Bagheri, M.; Summers, R.M. ChestX-Ray8: Hospital-scale Chest X-Ray database and benchmarks on weakly-supervised classification and localization of common thorax diseases. In Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Honolulu, HI, USA, 21–26 July 2017; pp. 3462–3471. [Google Scholar] [CrossRef]
  13. Gui, Q.; Zhou, H.; Guo, N.; Niu, B. A survey of class-imbalanced semi-supervised learning. Mach. Learn. 2023, 1–30. [Google Scholar] [CrossRef]
  14. Zhang, X.; Jing, X.-Y.; Zhu, X.; Ma, F. Semi-supervised person re-identification by similarity-embedded cycle GANs. Neural Comput. Appl. 2020, 32, 14143–14152. [Google Scholar] [CrossRef]
  15. Laine, S.; Aila, T. Temporal ensembling for semi-supervised learning. arXiv 2016, arXiv:1610.02242. [Google Scholar] [CrossRef]
  16. Zheng, F.; Liu, Z.; Chen, Y.; An, J.; Zhang, Y. A novel adaptive multi-view non-negative graph semi-supervised ELM. IEEE Access 2020, 8, 116350–116362. [Google Scholar] [CrossRef]
  17. Shaik, R.U.; Unni, A.; Zeng, W. Quantum based pseudo-labelling for hyperspectral imagery: A simple and efficient semi-supervised learning method for machine learning classifiers. Rem. Sens. 2022, 14, 5774. [Google Scholar] [CrossRef]
  18. Zhu, P.; Zhang, L.; Wang, Y.; Mei, J.; Zhou, G.; Liu, F.; Liu, W.; Takis Mathiopoulos, P. Projection learning with local and global consistency constraints for scene classification. ISPRS J. Photogramm. Remote Sens. 2018, 144, 202–216. [Google Scholar] [CrossRef]
  19. Lee, D.-H. Pseudo-label: The simple and efficient semi-supervised learning method for deep neural networks. In Proceedings of the ICML 2013 Workshop: Challenges in Representation Learning (WREPL), Atlanta, GA, USA, 21 June 2013; Goodfellow, I., Erhan, D., Bengio, Y., Eds.; p. 896. [Google Scholar]
  20. Shi, W.; Gong, Y.; Ding, C.; Ma, Z.; Tao, X.; Zheng, N. Transductive semi-supervised deep learning using min-max features. In Proceedings of the Computer Vision–ECCV 2018, Munich, Germany, 8–14 September 2018; Ferrari, V., Hebert, M., Sminchisescu, C., Weiss, Y., Eds.; Springer: Berlin/Heidelberg, Germany; pp. 311–327. [CrossRef]
  21. Wu, D.; Shang, M.; Luo, X.; Xu, J.; Yan, H.; Deng, W.; Wang, G. Self-training semi-supervised classification based on density peaks of data. Neurocomputing 2018, 275, 180–191. [Google Scholar] [CrossRef]
  22. Li, X.; Huang, J.; Liu, Y.; Zhou, Q.; Zheng, S.; Schiele, B.; Sun, Q. Learning to teach and learn for semi-supervised few-shot image classification. Comput. Vision Image Underst. 2021, 212, 103270. [Google Scholar] [CrossRef]
  23. Pham, H.; Dai, Z.; Xie, Q.; Le, Q.V. Meta pseudo labels. In Proceedings of the 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Nashville, TN, USA, 20–25 June 2021; pp. 11552–11563. [Google Scholar] [CrossRef]
  24. Liu, K.; Liu, Z.; Liu, S. Semi-supervised breast histopathological image classification with self-training based on non-linear distance metric. IET Image Process. 2022, 16, 3164–3176. [Google Scholar] [CrossRef]
  25. Sohn, K.; Berthelot, D.; Carlini, N.; Zhang, Z.; Zhang, H.; Raffel, C.A.; Cubuk, E.D.; Kurakin, A.; Li, C.-L. FixMatch: Simplifying semi-supervised learning with consistency and confidence. In Proceedings of the 34th Conference on Neural Information Processing Systems (NeurIPS 2020), Online Conference, 6–12 December 2020; p. 51. [Google Scholar]
  26. Wang, X.; Chen, H.; Xiang, H.; Lin, H.; Lin, X.; Heng, P.-A. Deep virtual adversarial self-training with consistency regularization for semi-supervised medical image classification. Med. Image Anal. 2021, 70, 102010. [Google Scholar] [CrossRef] [PubMed]
  27. Zhou, Y.; Huang, L.; Zhou, T.; Sun, H. Combating medical noisy labels by disentangled distribution learning and consistency regularization. Future Gener. Comput. Syst. 2023, 141, 567–576. [Google Scholar] [CrossRef]
  28. Xia, P.; Zhang, L.; Li, F. Learning similarity with cosine similarity ensemble. Inf. Sci. 2015, 307, 39–52. [Google Scholar] [CrossRef]
  29. Ye, H.-J.; Zhan, D.-C.; Jiang, Y.; Zhou, Z.-H. What makes objects similar: A unified multi-metric learning approach. IEEE Trans. Pattern Anal. Mach. Intell. 2019, 41, 1257–1270. [Google Scholar] [CrossRef] [PubMed]
  30. Zhang, B.; Zheng, W.; Zhou, J.; Lu, J. Attributable visual similarity learning. In Proceedings of the 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), New Orleans, LA, USA, 18–24 June 2022; pp. 7522–7531. [Google Scholar] [CrossRef]
  31. Wang, Y.; Huang, Y.; Wang, Q.; Zhao, C.; Zhang, Z.; Chen, J. Graph-based self-training for semi-supervised deep similarity learning. Sensors 2023, 23, 3944. [Google Scholar] [CrossRef] [PubMed]
  32. Hamrouni, L.; Kherfi, M.L.; Aiadi, O.; Benbelghit, A. Plant Leaves Recognition Based on a Hierarchical One-Class Learning Scheme with Convolutional Auto-Encoder and Siamese Neural Network. Symmetry 2021, 13, 1705. [Google Scholar] [CrossRef]
  33. Huang, L.B.; Chen, Y.S. Dual-Path Siamese CNN for Hyperspectral Image Classification With Limited Training Samples. IEEE Geosci. Remote Sens. Lett. 2021, 18, 518–522. [Google Scholar] [CrossRef]
  34. Xiao, Y.C.; Zhu, F.; Zhuang, S.X.; Yang, Y. Identification of Unknown Electromagnetic Interference Sources Based on Siamese-CNN. J. Electron. Test. 2023, 39, 597–609. [Google Scholar] [CrossRef]
  35. He, K.; Zhang, X.; Ren, S.; Sun, J. Deep residual learning for image recognition. In Proceedings of the 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Las Vegas, NV, USA, 27–30 June 2016; pp. 770–778. [Google Scholar] [CrossRef]
  36. Wu, H.; Prasad, S. Semi-supervised deep learning using pseudo labels for hyperspectral image classification. IEEE Trans. Image Process. 2018, 27, 1259–1270. [Google Scholar] [CrossRef] [PubMed]
  37. Northcutt, C.; Jiang, L.; Chuang, I. Confident learning: Estimating uncertainty in dataset labels. J. Artif. Intell. Res. 2021, 70, 1373–1411. [Google Scholar] [CrossRef]
  38. Cascante-Bonilla, P.; Tan, F.; Qi, Y.; Ordonez, V. Curriculum labeling: Revisiting pseudo-labeling for semi-supervised learning. Proc. AAAI Conf. Artif. Intell. 2021, 35, 6912–6920. [Google Scholar] [CrossRef]
  39. Deng, J.; Dong, W.; Socher, R.; Li, L.-J.; Kai, L.; Li, F.-F. ImageNet: A large-scale hierarchical image database. In Proceedings of the 2009 IEEE Conference on Computer Vision and Pattern Recognition, Miami, FL, USA, 20–25 June 2009; pp. 248–255. [Google Scholar] [CrossRef]
  40. Phan, M.H.; Ta, T.-A.; Phung, S.L.; Tran-Thanh, L.; Bouzerdoum, A. Class similarity weighted knowledge distillation for continual semantic segmentation. In Proceedings of the 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), New Orleans, LA, USA, 18–24 June 2022; pp. 16845–16854. [Google Scholar] [CrossRef]
  41. Verma, V.; Kawaguchi, K.; Lamb, A.; Kannala, J.; Solin, A.; Bengio, Y.; Lopez-Paz, D. Interpolation consistency training for semi-supervised learning. Neural Netw. 2022, 145, 90–106. [Google Scholar] [CrossRef] [PubMed]
  42. Tarvainen, A.; Valpola, H. Mean teachers are better role models: Weight-averaged consistency targets improve semi-supervised deep learning results. In Proceedings of the 31st Conference on Neural Information Processing Systems (NIPS 2017), Long Beach, CA, USA, 4–9 December 2017; pp. 1195–1204. [Google Scholar]
  43. Mi, W.; Li, J.; Guo, Y.; Ren, X.; Liang, Z.; Zhang, T.; Zou, H. Deep learning-based multi-class classification of breast digital pathology images. Cancer Manag. Res. 2021, 13, 4605–4617. [Google Scholar] [CrossRef] [PubMed]
  44. Boumaraf, S.; Liu, X.; Zheng, Z.; Ma, X.; Ferkous, C. A new transfer learning based approach to magnification dependent and independent classification of breast cancer in histopathological images. Biomed. Signal Process. Control 2021, 63, 102192. [Google Scholar] [CrossRef]
  45. Litrico, M.; Del Bue, A.; Morerio, P. Guiding pseudo-labels with uncertainty estimation for source-free unsupervised domain adaptation. In Proceedings of the 2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Vancouver, BC, Canada, 17–24 June 2023; pp. 7640–7650. [Google Scholar] [CrossRef]
Figure 1. Overview of the framework. The model achieves the training objective by mutually calibrating two pseudo-labels: semantic pseudo-labels (blue) and instance pseudo-labels (orange). At each learning step t, the similarity score between the old and new class prototypes is used to weight the semantic pseudo-labels; the higher the similarity, the higher the score. Simultaneously, instance pseudo-labels are extracted from the embeddings. The two pseudo-labels adjust each other through unfolding and aggregation. Two loss functions, $L_{in}$ and $L_u$, are used for model optimization.
Figure 2. The process of generating semantic pseudo label.
Figure 3. Samples from BreakHis dataset. (a) Adenosis (A); (b) fibroadenoma (F); (c) phyllodes tumor (PT); (d) tubular adenoma (TA); (e) ductal carcinoma (DC); (f) lobular carcinoma (LC); (g) mucinous carcinoma (MC); (h) papillary carcinoma (PC).
Figure 4. Samples from Chest X-ray14 dataset. (a) Atelectasis; (b) Effusion; (c) Infiltration; (d) Mass; (e) Nodule; (f) Pneumothorax.
Figure 5. The distribution of features under different annotation ratios. (a) 95% AP; (b) 50% AP.
Figure 6. Confusion matrix under different labeling ratios. (a) 80% labeled; (b) 20% labeled; (c) 10% labeled.
Figure 7. The distribution of features: (a) original pseudo-label; (b) SimMatch; (c) our method.
Figure 8. The accuracy obtained from ablation experiments on each module of the framework.
Figure 9. Classification accuracy when using different loss functions.
Figure 10. The classification accuracies for different balancing factors of the model.
Figure 11. Classification accuracy of different parameters at 20% annotation ratios. (a) Temperature parameter; (b) smooth parameter.
Table 1. Classification accuracy with other models under different annotated percentages (AP) for the BreakHis dataset.

Method \ AP            5%       10%      20%      50%      80%      95%
Pseudo-Labeling [19]   65.35%   73.98%   79.26%   87.69%   89.99%   91.02%
ICT [41]               69.57%   73.75%   81.36%   89.03%   90.65%   93.44%
FixMatch [25]          71.63%   74.04%   82.31%   90.35%   92.69%   94.04%
Mean Teacher [42]      70.23%   73.66%   81.50%   88.93%   90.89%   93.29%
Ours                   74.28%   77.43%   85.02%   92.72%   93.65%   95.21%

ICT, interpolation consistency training.
Table 2. Accuracy comparison with other learning methods for the BreakHis dataset.

Method                 Type              Percentage   Accuracy
Mi et al. [43]         Supervised        100%         89.67%
Boumaraf et al. [44]   Supervised        100%         92.41%
Ours                   Semi-supervised   80%          93.65%
Litrico et al. [45]    Unsupervised      0%           72.98%
Table 3. The accuracy of each class evaluated using 80% labeled samples.

Type of Disease        FixMatch   Ours
Adenosis               90.23%     91.59%
Ductal carcinoma       90.68%     92.69%
Fibroadenoma           92.36%     94.79%
Lobular carcinoma      78.03%     85.59%
Mucinous carcinoma     93.00%     97.21%
Papillary carcinoma    90.10%     93.06%
Phyllodes tumor        89.27%     87.67%
Tubular adenoma        94.63%     97.39%
Table 4. Classification accuracy with other models under different annotated percentages (AP) for the Chest X-Ray14 dataset.

Method \ AP        5%       10%      20%      50%      80%      95%
Pseudo-Labeling    57.64%   65.27%   72.46%   87.69%   85.21%   88.92%
ICT                65.25%   73.75%   78.22%   82.77%   87.34%   93.44%
FixMatch           68.09%   72.96%   80.34%   84.31%   89.56%   92.08%
Mean Teacher       67.96%   73.01%   79.07%   83.13%   88.23%   92.47%
Ours               69.93%   74.20%   80.21%   84.30%   89.77%   93.15%