Article

Unsupervised Domain Adaptation for Image Classification Using Non-Euclidean Triplet Loss

by Riyam Jabbar Sarhan, Mohammad Ali Balafar * and Mohammad Reza Feizi Derakhshi
Department of Computer Engineering, Faculty of Electrical and Computer Engineering, 29th Bahman Blvd, University of Tabriz, Tabriz 5166616471, Iran
* Author to whom correspondence should be addressed.
Electronics 2023, 12(1), 99; https://doi.org/10.3390/electronics12010099
Submission received: 18 October 2022 / Revised: 19 December 2022 / Accepted: 20 December 2022 / Published: 26 December 2022
(This article belongs to the Section Artificial Intelligence)

Abstract

In recent years, computer vision tasks have increasingly used deep learning techniques. In some tasks, however, insufficient data prevents the model from being trained properly, which reduces generalizability: a model trained on one dataset and tested on another, similar dataset often produces near-random predictions. This paper presents an unsupervised multi-source domain adaptation method that improves transfer learning and increases generalizability. In the proposed method, a new module infers the source of the input data from its extracted features. By making the feature extractor compete against this objective, the learned feature representation generalizes better across the sources; that is, the extracted representation is generic and independent of any particular domain. In the training stage, a non-Euclidean triplet loss function is also utilized, with which similar representations for samples belonging to the same class can be learned more effectively. We demonstrate how the developed framework can be applied to improve accuracy and outperform already effective transfer learning methodologies, and we show that the proposed strategy performs particularly well when the dataset domains differ or when data are insufficient.

1. Introduction

Machine vision is a field of artificial intelligence that gives machines the power to observe the surrounding environment and to analyze and process the resulting information. Machine vision systems emphasize analyzing images and extracting useful information from them for a specific application. Computer vision is used in various applications such as color image segmentation [1,2], medical image processing [3,4,5,6,7,8,9], object detection [10], and fingerprint recognition [11]. In machine vision, the process of identifying and labeling an image based on specific rules is known as image classification. Image classification is utilized in many machine vision applications, including medical image processing [12,13], fire detection [14], radar image classification [15], and classification of satellite images [16]. Supervised tasks such as image classification are trained on a large amount of labeled data, and the trained model is then used to classify unlabeled data. Acquiring sufficiently labeled data can be challenging in any domain: a large amount of data is required to train a model adequately, especially when deep learning models are used, and in some applications obtaining enough data is difficult because labeling is expensive. Numerous approaches, including [17,18,19,20], have been put forth to address this issue. Data augmentation is one of these strategies; by applying data transformations such as image rotation, horizontal or vertical shifting, and horizontal or vertical flipping, the amount of data can be increased [19]. Other techniques, such as few-shot learning, improve data efficiency [17]. Still other techniques classify unlabeled (test) data using labeled data from a different (training) source domain. Transfer learning has the drawback that the distributions of the training and test domains may differ, which reduces the accuracy of the deep learning approach in classifying test data. Multi-source transfer learning can mix many sources and draw knowledge from them to address this issue [18]. Learning common features and transferring knowledge between multiple datasets can increase the generalizability of a model. Multi-source transfer learning also makes it possible to use several datasets, each of which might not be enough to train and generalize the model on its own.
However, the nature of the source datasets may be highly diverse, and the degree of similarity between the source tasks and the target tasks substantially influences the transmission efficiency. Transfer learning can thus sometimes work against the model rather than for its improvement in training. Additionally, there is a probability that the model will discover the unique features of each dataset rather than the features that are common between them. As a result, it hinders the learned model’s ability to generalize.
Problem statement: Most existing methods for image classification are trained and evaluated on images from the same dataset. Using only a single dataset reduces the generalizability of these methods, so the results of training and testing the network on the same dataset are much better than the results of training and testing on different datasets. In other words, in the feature extraction stage, most of the proposed models depend heavily on the domain of the training dataset and do not perform well on unseen datasets. For this reason, they cannot be trusted in real-world applications where the data are new and independent of the training data. Numerous studies demonstrate that the most recent approaches in the literature are unreliable. For example, two well-known studies [21,22] on medical image classification show performance close to random classification on unseen data (i.e., datasets on which the model has not been trained): the classification accuracy decreases from 98.5% on the test set to 59.12% on unseen datasets. The structural and inherent differences between the images in the available datasets are the cause of this issue.
Method: In this study, we present an unsupervised domain adaptation strategy for image classification that uses non-Euclidean triplet embeddings to address the aforementioned issue. We learn similar representations from two distinct datasets, regardless of the domain of each dataset. The source dataset is used for network training, while the target dataset is used to improve transferability and generalizability. In the proposed method, a new module infers the source of the input data from its extracted features. By making the feature extractor compete against this objective, the learned feature representation generalizes better across the sources. Our hypothesis is that this more generalized feature representation will then transfer better to an unknown target. The presented scheme can correctly classify data regardless of the unique features of each input data domain, and both data domains can use the learned representations.
Contribution: The three contributions of this study are as follows: (1) the introduced unsupervised domain adaptation approach minimizes the impact of structural and intrinsic differences in images; (2) it is highly generalizable to other new datasets and real-world applications; (3) it proposes a non-Euclidean triplet loss that helps in generating more accurate and comparable representations for samples belonging to the same class.
The rest of the paper is organized as follows: Section 2 reviews related research; Section 3 introduces the proposed framework; Section 4 describes the experiments; Section 5 concludes.

2. Related Work

In this section, we focus more on domain adaptation methods that provide two separate subspaces for each of the training and test domains.
Discriminative and Invariant Subspace Alignment for visual domains (DISA) [23], in addition to using all the components of previous research, pays attention to maintaining the structure of each domain during adaptation, but it does not consider the inter-domain geometric structure or the geometric adaptation of the domains.
Locality Preserving Joint Transfer (LPJT) for domain adaptation [24] considers sample weighting while reducing the difference between conditional and marginal distributions. It also pays attention to reducing the distance between intra-class samples and increasing the distance between inter-class samples in both the training and test domains. However, this method does not consider the geometric structure between the domains or the geometric adaptation of the domains.
Active Discriminative Cross-Domain Alignment (ADCDA) [25], in addition to increasing the variance of both the training and test domains as in the CS-DDA method, preserves the geometric discriminative information of the training domain by reducing the distance between intra-class samples and increasing the distance between inter-class samples. However, this method does not consider the distinct importance of conditional and marginal distributions or the preservation of domain structure.
Unified Domain Adaptation on Geometrical Manifolds (UDAGM) [26] improves expert and intelligent system performance by optimizing all the required objectives with Regularized Coplanar Discriminant Analysis (RCDA) in a common framework. Since UDAGM considers all the necessary objectives (i.e., (i) subspace alignment; (ii) minimization of distribution divergence using the Maximum Mean Discrepancy criterion; (iii) preservation of source-domain discrimination information; (iv) preservation of the original similarity of the data samples; (v) maximization of target-domain variance), the framework can learn a common feature subspace between both domains with better inter-class separability and intra-class compactness, so the misclassification of target-domain samples is minimized. However, this method neglects to maximize source-domain variance and does not take into account the distinct importance of marginal and conditional distributions.
Unsupervised Domain Adaptation based on Pseudo-Label Confidence (UDA-PLC) [27] projects data from the source and target domains into a latent subspace, aligning the data distributions of the two domains and improving the discriminability of features in both. UDA-PLC then applies Structured Prediction (SP) and the Nearest Class Prototype (NCP) to predict pseudo-labels for target-domain data, and it performs sample selection in every iteration. This method does not take into account the inter-class and intra-class distances in the target domain and across domains, nor the difference between the two subspaces.
The double-weighted domain adaptation (DWDA) [28] method uses distribution alignment weighting and sample reweighting strategies. Specifically, the distribution alignment weighting strategy considers the relative importance of marginal and conditional distribution alignments, while the sample reweighting strategy weights the source and target samples separately based on k-means clustering. In addition, the method considers geometric structure preservation. However, DWDA does not pay attention to maintaining the variances of the source and target domains, nor does it consider reducing intra-class distances and increasing inter-class distances between domains.
Iterative joint classifier and domain adaptation for visual transfer learning (ICDAV) [29] uses the balanced maximum mean discrepancy to improve domain adaptation. Moreover, to learn a classifier that is robust against domain shift, a set of graph manifold regularizers and a modified joint probability maximum mean discrepancy are applied simultaneously to maintain the domain structures and adapt the distribution of projected samples during model learning. This method does not assign distinct importance to discriminant and transfer capabilities, and it does not take into account the inter-domain distances of the samples.
The research in [30] considers feature learning and instance reweighting at the same time. In addition, it uses dynamically weighted marginal and conditional distributions according to their importance, and it learns an adaptive domain-invariant classifier through structural risk minimization. This method does not consider the relationship between cross-domain instances of a class, nor does it pay attention to reducing the difference between the two subspaces.
Balanced Discriminative Transfer Feature Learning for Visual Domain Adaptation (BDTFL) [31] balances the trade-off between marginal and conditional distribution adaptations and captures the category-discriminative information of both domains to prevent category confusion during feature matching. This method does not pay attention to reducing the distance between same-class samples from different domains (this issue is considered only within each domain), and it also neglects maintaining domain variance during feature learning.
Balanced Weight JGSA (BW-JGSA) [32] is a developed form of JGSA that adaptively leverages the importance of the marginal and conditional distributions in JGSA. Like JGSA, this method does not pay attention to reducing the distance between samples of the same class and increasing the distance between samples of different classes across domains.
In most previous methods, the mean distance between training and test samples (marginal distribution difference), the mean distance between samples in each class (conditional distribution difference), the distance of each class mean from the overall mean of the relevant domain (inter-class distance), and the distance of the samples of each class from the mean of that class within the respective domain (intra-class distance) are considered to some extent, but some properties and distances are either considered only within the training or test domain or ignored on an inter-domain scale. Moreover, most of the mentioned methods are highly dependent on the image domain of the datasets on which they were trained. If the test dataset is from the same domain as the training dataset, model performance is acceptable; however, when the domain of the evaluation dataset is different, model performance is significantly reduced. In real-world applications, the domain of the inference image is not always the same as that of the training set; in other words, unseen data are often independent of the training set, so the results would not be reliable.

3. Proposed Method

The proposed framework is shown graphically in Figure 1. As shown in this figure, the proposed model uses two separate datasets to learn common representations that are independent of the domain of each dataset. The source dataset is used for network training, and the target dataset is used to increase transferability and generalizability. The proposed architecture consists of three parts: CNN-based Feature Extractor, classifier, and discriminator. These blocks are responsible for extracting features, classifying data into two or more classes, and distinguishing source data from target data, respectively. The purpose of the proposed method is to learn general features that are useful for both datasets so that the correct classification can be performed regardless of the input source and the specific aspects of each input distribution. The proposed model simultaneously improves generalizability and transferability. Therefore, the learned representations are based on general features independent of the specific domain and dataset.

3.1. Proposed Framework

The proposed architecture consists of three parts: a CNN-based feature extractor, a classifier, and a discriminator. In the feature extraction block, common convolutional architectures such as VGG16 and ResNet can be used with transfer learning techniques; in the proposed model, we use the pre-trained ResNet50 [33] architecture. To classify data into two or more classes, the representations extracted by the feature extraction block are passed through two consecutive modules consisting of Dense, Batch-Normalization, ReLU, and Dropout layers. On top of these two blocks, the final activation function is applied (sigmoid or softmax, according to the number of classes). The output of this model determines the probability of assigning each sample to each class.
The domain classifier (discriminator) is responsible for identifying and distinguishing source data from target data. The main aim is to generate similar representations for images belonging to the two different domains, so that the model can perform correct classification regardless of the specific features of each input distribution. The architecture of this block is the same as that of the classification block, except that its output estimates the probability of assigning each image to each dataset (source or target).
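A minimal PyTorch sketch of this three-part architecture is given below. The hidden width (256), the 10-class output for the digit datasets, the two-class discriminator head, and the assumption that inputs are 3-channel images resized for ResNet50 are illustrative choices, not values taken from the paper.

```python
import torch
import torch.nn as nn
from torchvision import models

class FeatureExtractor(nn.Module):
    """Pre-trained ResNet50 backbone with its final classification layer removed."""
    def __init__(self):
        super().__init__()
        backbone = models.resnet50(weights=models.ResNet50_Weights.IMAGENET1K_V1)
        # Keep everything up to (and including) global average pooling.
        self.features = nn.Sequential(*list(backbone.children())[:-1])

    def forward(self, x):
        # x is assumed to be a 3-channel image batch resized to the backbone's input size.
        return torch.flatten(self.features(x), 1)   # (batch, 2048)

def head(in_dim, hidden_dim, out_dim, p_drop=0.5):
    """Two consecutive Dense + Batch-Normalization + ReLU + Dropout modules, then a linear output."""
    return nn.Sequential(
        nn.Linear(in_dim, hidden_dim), nn.BatchNorm1d(hidden_dim), nn.ReLU(), nn.Dropout(p_drop),
        nn.Linear(hidden_dim, hidden_dim), nn.BatchNorm1d(hidden_dim), nn.ReLU(), nn.Dropout(p_drop),
        nn.Linear(hidden_dim, out_dim),   # sigmoid/softmax applied at loss/prediction time
    )

feature_extractor = FeatureExtractor()
label_classifier = head(2048, 256, 10)       # e.g., 10 classes for the digit datasets
domain_discriminator = head(2048, 256, 2)    # source vs. target
```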

3.2. Training Phase

We train the model with two loss functions using an adversarial training approach in a multi-source transfer learning setting. The classification and discrimination blocks use the features extracted by the feature extraction block to classify the input data and the domain the data come from, respectively. The output predictor (classifier) and the domain classifier (discriminator) are trained classically by backpropagating their respective losses. When the gradient of the domain classifier's loss reaches the feature extractor module, the gradient reversal layer reverses it (multiplies it by −1). As a result, the feature extractor learns a feature representation that is useful for output prediction but indiscriminate of the domain from which the data come, promoting a more generalized representation.
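A gradient reversal layer of this kind is commonly implemented in PyTorch as a custom autograd function that is the identity on the forward pass and negates (and optionally scales) the gradient on the backward pass. The sketch below follows that standard pattern and is not taken from the authors' code; the scaling factor lamb is an optional generalization.

```python
import torch

class GradReverse(torch.autograd.Function):
    """Identity on the forward pass; multiplies the incoming gradient by -lamb on the backward pass."""
    @staticmethod
    def forward(ctx, x, lamb):
        ctx.lamb = lamb
        return x.view_as(x)

    @staticmethod
    def backward(ctx, grad_output):
        # Reverse (and optionally scale) the gradient flowing back to the feature extractor.
        return -ctx.lamb * grad_output, None

def grad_reverse(x, lamb=1.0):
    return GradReverse.apply(x, lamb)

# Usage: the discriminator sees the features through the reversal layer, so training the
# discriminator to tell the domains apart simultaneously pushes the feature extractor
# toward domain-invariant representations.
# domain_logits = domain_discriminator(grad_reverse(features))
```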
The proposed model is trained simultaneously with two loss functions: a classification loss and a discrimination loss. Equation (1) shows the loss function used in this method. This loss combines the classification loss $\mathcal{L}_y$ and the discrimination loss $\mathcal{L}_d$; $\lambda_y$ and $\lambda_d$ are coefficients controlling the bias-vs-variance tradeoff of the generalization.

$$\mathcal{L} = \lambda_y \mathcal{L}_y + \lambda_d \mathcal{L}_d \quad (1)$$
We use the cross-entropy loss function to calculate the domain discriminator loss. The discriminator loss $\mathcal{L}_d$ in this algorithm is defined by Equation (2):

$$\mathcal{L}_d = -\, y \, \log(\hat{y}) \quad (2)$$

where $y$ indicates the correct class and $\hat{y}$ indicates the model prediction.
We use an improved triplet loss function to calculate the classifier output loss $\mathcal{L}_y$. Triplet loss was first introduced in FaceNet [34]. According to the triplet loss principle, a pair of samples from the same class should have similar representations, while a pair of samples from different classes should have representations that differ. We use an improved version of this loss function: the $N_{pos}$ positive samples and $N_{neg}$ negative samples in the proposed loss are chosen from the same class as, and from classes other than, that of the input sample, respectively. Additionally, we adopt a non-Euclidean distance function in the proposed loss rather than the common Euclidean distance. The benefit of this distance metric is that it reduces the impact of noise; it also estimates the distance of nonlinear patterns better than the Euclidean distance function. Positive and negative samples are selected in each batch, and the loss function is calculated by Equation (3).
$$\mathcal{L}_y = \max\!\left( \sum_{i=1}^{N_{pos}} \sum_{m=1}^{M} \left[ 1 - \exp\!\left( -\gamma_m \big( f_\theta(A_m) - f_\theta(P_{i,m}) \big)^2 \right) \right] - \sum_{j=1}^{N_{neg}} \sum_{m=1}^{M} \left[ 1 - \exp\!\left( -\gamma_m \big( f_\theta(A_m) - f_\theta(N_{j,m}) \big)^2 \right) \right],\ 0 \right) \quad (3)$$
where, in Equation (3), $N_{pos}$ is the number of positive samples, $N_{neg}$ is the number of negative samples, $M$ is the number of features in the embedded space, $f_\theta$ maps the data into the embedded space, and $\theta$ are the parameters to be learned. $f_\theta(A_m)$ is the $m$-th feature of the input (anchor) sample in the embedded space, $f_\theta(P_{i,m})$ is the $m$-th feature of the $i$-th positive sample, and $f_\theta(N_{j,m})$ is the $m$-th feature of the $j$-th negative sample.
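The sketch below implements Equation (3) as reconstructed above, with the per-feature kernel distance $1 - \exp(-\gamma_m(\cdot)^2)$. Using a single scalar gamma instead of a per-feature $\gamma_m$, and leaving the in-batch sampling of positives and negatives to the caller, are simplifying assumptions.

```python
import torch

def non_euclidean_triplet_loss(anchor, positives, negatives, gamma=1.0):
    """
    anchor:    (M,) embedding of the input sample, f_theta(A)
    positives: (N_pos, M) embeddings of samples from the same class
    negatives: (N_neg, M) embeddings of samples from opposing classes
    gamma:     kernel width; Eq. (3) allows a per-feature gamma_m, a single scalar is used here
    """
    def kernel_distance(a, b):
        # Non-Euclidean, per-feature distance 1 - exp(-gamma * (a - b)^2), summed over the M features.
        return (1.0 - torch.exp(-gamma * (a - b) ** 2)).sum()

    pos_term = sum(kernel_distance(anchor, p) for p in positives)
    neg_term = sum(kernel_distance(anchor, n) for n in negatives)
    return torch.clamp(pos_term - neg_term, min=0.0)   # Max(..., 0) in Eq. (3)
```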
Using this loss function can help improve data classification; the excellent outcomes of applying it in the current application confirm this assumption.
During training, the source dataset (with labels) is used to train the network, and the classification loss $\mathcal{L}_y$ is calculated only on the source dataset; the target dataset is not used in calculating $\mathcal{L}_y$. The target dataset (unlabeled) is used to calculate the domain discriminator loss $\mathcal{L}_d$. Our goal is to decrease the classifier loss and increase the domain discriminator loss. A decreasing classifier loss shows that the network can predict the output well, while an increasing domain discriminator loss means that the network has learned the features common to both datasets rather than features specific to one dataset (source or target). Because the discriminator gradient is multiplied by −1 in the backpropagation stage, there is no problem in network optimization. Using two datasets simultaneously and learning common features from these two datasets increases the transferability and generalizability of the model.
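Putting the pieces together, one training step could look like the sketch below: the classification loss uses only labeled source images, the domain loss uses both domains with 0/1 domain labels, and the gradient reversal layer handles the sign flip, so a single optimizer step minimizes $\lambda_y \mathcal{L}_y + \lambda_d \mathcal{L}_d$ for the discriminator while adversarially updating the feature extractor. The values $\lambda_y = 4$, $\lambda_d = 1$, and the learning rate follow Section 4; batch construction and the triplet-sampling helper triplet_loss_over_batch are assumed details, and training of the label-classifier head is omitted for brevity.

```python
import torch
import torch.nn.functional as F

optimizer = torch.optim.Adam(
    list(feature_extractor.parameters()) + list(domain_discriminator.parameters()),
    lr=1e-2,
)
lambda_y, lambda_d = 4.0, 1.0

def train_step(src_images, src_labels, tgt_images):
    optimizer.zero_grad()

    src_feat = feature_extractor(src_images)
    tgt_feat = feature_extractor(tgt_images)

    # Classification loss L_y: labeled source data only, using the non-Euclidean triplet
    # loss of Eq. (3) over in-batch anchors/positives/negatives.
    # triplet_loss_over_batch is an assumed helper built on non_euclidean_triplet_loss.
    loss_y = triplet_loss_over_batch(src_feat, src_labels)

    # Domain discriminator loss L_d: both domains, gradient reversed before the discriminator.
    feats = torch.cat([src_feat, tgt_feat], dim=0)
    domain_labels = torch.cat(
        [torch.zeros(len(src_feat)), torch.ones(len(tgt_feat))]
    ).long()
    domain_logits = domain_discriminator(grad_reverse(feats))
    loss_d = F.cross_entropy(domain_logits, domain_labels)

    loss = lambda_y * loss_y + lambda_d * loss_d   # Eq. (1)
    loss.backward()
    optimizer.step()
    return loss.item()
```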

4. Experiments

The effectiveness of the suggested approach is assessed and contrasted with other methods in this section. The parameters and their values for the suggested technique are listed in Table 1.
The batch size and the maximum number of iterations are set to 32 and $2 \times 10^{4}$, respectively. The learning rate in the proposed framework is set to $10^{-2}$ (the Adam optimizer is used). The required parameters in our method are set as follows: $\lambda_y = 4$ and $\lambda_d = 1$. Due to the small number of images, overfitting may occur; to mitigate this, dropout is used along with data augmentation. The dropout rate is set to 0.5.
All parameters are set by trial and error on the validation dataset. Some parameters, such as the batch size and the number of iterations, are usually set to 32 and 100 in similar methods, and we also use these values. The values of the following three parameters strongly affect the results of the proposed method: $\lambda_y$, $\lambda_d$, and the learning rate.
The parameters $\lambda_y$ and $\lambda_d$ were tested for different values in the range (0–10). The experiments showed that the intervals (1, 2) and (3, 4) for $\lambda_d$ and $\lambda_y$, respectively, lead to better results. We also observed that the value of $\lambda_d$ should be lower than the value of $\lambda_y$. Accordingly, we chose 1 for $\lambda_d$ and 4 for $\lambda_y$. The learning rate was tuned in the same way as $\lambda_d$ and $\lambda_y$.
Clearly, higher values of $N_{neg}$ and $N_{pos}$ produce better results because the loss is calculated more accurately. In the experiments, these parameters are set to 5, so the proposed model learns to keep the input sample far away, in the embedded space, from five samples of opposing classes. The negative (dissimilar-class) and positive (similar-class) samples are selected randomly. A value of 5 for $N_{neg}$ seems acceptable: it is neither so high (such as 20) that it incurs a high computational cost nor so low (such as 1) that the input sample is kept far away from only one instance of a negative class. The same holds for $N_{pos}$.

4.1. Evaluation Criteria

Almost all the proposed methods in this field, such as [25,35,36], use only the accuracy criterion to evaluate their method. We also use this criterion for fair evaluation and comparison with other methods. Equation (4) defines this criterion, where TP, FP, TN, and FN stand for True Positive, False Positive, True Negative, and False Negative, respectively. Accuracy is defined as the number of correctly classified samples divided by the total number of samples [37,38,39].
$$\text{Accuracy} = \frac{TN + TP}{TN + TP + FN + FP} \times 100 \quad (4)$$
Moreover, we use Precision, Recall, and F1 criteria to evaluate our method. These evaluation criteria are shown in Equations (5)–(7).
$$\text{Precision} = \frac{TP}{TP + FP} \times 100 \quad (5)$$

$$\text{Recall} = \frac{TP}{TP + FN} \times 100 \quad (6)$$

$$F1 = \frac{2 \times \text{Recall} \times \text{Precision}}{\text{Recall} + \text{Precision}} \times 100 \quad (7)$$
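For completeness, these criteria can be computed directly from a confusion matrix. The short sketch below mirrors Equations (4)–(7), computing the ratios first and reporting them as percentages; the example counts in the comment are purely illustrative.

```python
def classification_metrics(tp, fp, tn, fn):
    """Accuracy, Precision, Recall, and F1 (as percentages), following Equations (4)-(7)."""
    accuracy = (tn + tp) / (tn + tp + fn + fp)
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * recall * precision / (recall + precision)
    return {name: 100 * value for name, value in
            [("accuracy", accuracy), ("precision", precision),
             ("recall", recall), ("f1", f1)]}

# e.g. classification_metrics(tp=95, fp=5, tn=90, fn=10) returns the four
# metrics for a toy confusion matrix (illustrative numbers only).
```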

4.2. Datasets

In recent research, the following datasets are often used to evaluate model performance: MNIST [40], MNIST-M [41], SVHN [42], and USPS. We also test the performance of the proposed method on these datasets. The details of these datasets are reported in Table 2.

4.3. Experiment 1: Analysis of Robustness to Noise

To investigate the robustness of the proposed method, in this experiment we add three common kinds of noise to the image datasets: Gaussian noise with variances of 0.01, 0.03, and 0.05; Salt & Pepper noise with densities of 0.05, 0.1, and 0.2; and Speckle noise with variances of 0.05, 0.1, and 0.15. The results of the proposed method with the Euclidean and non-Euclidean distance functions on the noisy data are shown in Table 3.
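This noise injection can be reproduced with, for example, scikit-image's random_noise utility; the paper does not specify the exact implementation, so the snippet below is one plausible way to generate the noisy variants with the levels listed above.

```python
from skimage.util import random_noise

def make_noisy_variants(image):
    """image: float array in [0, 1]. Returns the noisy copies used in Experiment 1."""
    variants = {}
    for var in (0.01, 0.03, 0.05):          # Gaussian noise, variances from Table 3
        variants[f"gaussian_{var}"] = random_noise(image, mode="gaussian", var=var)
    for amount in (0.05, 0.1, 0.2):         # Salt & Pepper noise, densities from Table 3
        variants[f"salt_pepper_{amount}"] = random_noise(image, mode="s&p", amount=amount)
    for var in (0.05, 0.1, 0.15):           # Speckle noise, variances from Table 3
        variants[f"speckle_{var}"] = random_noise(image, mode="speckle", var=var)
    return variants
```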
As shown in Table 3, the proposed method with the non-Euclidean distance metric performs better than the Euclidean distance metric for all noise levels and datasets. The difference between the best and worst results obtained on all datasets in the proposed method with the non-Euclidean distance metric is, on average, 0.72%. For the proposed algorithm with the Euclidean distance metric, this value is 1.52%. Therefore, the proposed algorithm with a non-Euclidean distance metric has a better overall performance than the Euclidean distance metric on all datasets and is more robust to noise.
Additionally, Table 4 reports the proposed method's performance without applying any noise. This table shows that our method performs better with the non-Euclidean distance function than with the Euclidean distance function: the average accuracy is 98.95% with the non-Euclidean distance and 98.37% with the Euclidean distance. Comparing Table 3 and Table 4 shows that, with the non-Euclidean distance metric, the accuracy of the proposed method on noisy data decreases by less than 1%, while for the Euclidean distance the decrease is approximately 1.5%.

4.4. Experiment 2: Cross-Dataset Evaluation

Here, we assess the suggested approach on the source/target dataset pairs and compare it with other successful approaches. During training, the train set of the source dataset with labels and the train set of the target dataset without labels are used, and results are reported on the unseen test set of the target dataset. Table 5 reports the results of the proposed method and other algorithms. For methods whose source code is not available or that have their own parameter settings, the best results are reported directly from the relevant papers.
As shown in Table 5, the performance of the proposed approach is the best among all approaches, and the outcomes show that it outperforms other effective methods in this area. After our method, DIRT-T, DIRT-T (IN), and SWD perform rather well, whereas DAN, DANN, and DSN produce the worst results of all the methods. The average accuracy of the proposed approach is 98.9%, while the average accuracy of the second-best approach (SHOT) is 98.43%.
Moreover, to investigate the effect of the domain adaptation used in our approach on the final results, we evaluate the performance of the approach once with domain adaptation and once without domain adaptation. We train the model on the source dataset (all data) and evaluate it on the target dataset (all data). As shown in Table 6, the proposed method with domain adaptation performs better than without domain adaptation. The results show that when the model is trained using a large source dataset, such as SVHN, the result is much better than when it is trained by a small source dataset, such as USPS. Moreover, the generalizability of the model is much higher in the first case. Using the domain adaptation technique, the proposed approach results are improved by an average of 34%. Therefore, it can be concluded that by using the domain adaptation technique, better representations are generated. As a result, the quality of the classification is better, specifically for the unseen new samples.

4.5. Experiment 3: Analysis of the Proposed Method with Other Evaluation Criteria

In this section, we evaluate the proposed method with the following three criteria: Precision, Recall, and F1. The results of the proposed method and other methods are stated in Table 7.
The results indicate the high performance of our method. For the proposed method, the confusion matrix of the evaluation on the test set of the target dataset is shown in Figure 2. The average Precision, Recall, and F1 of the proposed method are 98.91%, 98.95%, and 98.92%, respectively. A recall of 98.95% indicates that, on average, only about one image per 100 is predicted incorrectly.

5. Conclusions

In this study, we proposed a multi-source unsupervised domain adaptation method to optimize the transfer learning approach and achieve greater generalizability across many data sources. A new module infers the source of the input data from its extracted features. By forcing the feature extractor to compete against this objective, the learned feature representation generalizes more effectively across sources. As a result, comparable representations across the sources were learned; in other words, the generated representations are generic and independent of the specific dataset domain. In the training stage, a non-Euclidean triplet loss function was also applied. Due to the extreme structural similarity of the images, the proposed loss function can more effectively generate similar representations for samples belonging to the same class.
According to the experiments, the model achieved excellent results on unseen data and increased generalizability and transferability. We also demonstrated the robustness of the suggested technique against various types of noise when using the non-Euclidean distance measure. Numerous advanced models were used to compare the performance of the proposed model, and the findings show that the suggested model improves classification accuracy by at least 2%.
The proposed method could be improved in the future by incorporating other effective loss functions, like a center loss.

Author Contributions

R.J.S.: Conceptualization; Data curation; Formal analysis; Investigation; Methodology; Validation; Visualization; Writing—original draft. M.A.B.: Conceptualization; Data curation; Formal analysis; Investigation; Methodology; Project administration; Resources; Supervision; Validation; Visualization; Writing—review & editing. M.R.F.D.: Conceptualization; Data curation; Formal analysis; Investigation; Methodology; Validation; Visualization; Writing—original draft; Writing—review & editing. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Conflicts of Interest

The authors declare no financial, general, or institutional competing interests.

References

  1. Golzari Oskouei, A.; Hashemzadeh, M.; Asheghi, B.; Balafar, M.A. CGFFCM: Cluster-weight and Group-local Feature-weight learning in Fuzzy C-Means clustering algorithm for color image segmentation. Appl. Soft Comput. 2021, 113, 108005. [Google Scholar] [CrossRef]
  2. Golzari Oskouei, A.; Hashemzadeh, M. CGFFCM: A color image segmentation method based on cluster-weight and feature-weight learning. Softw. Impacts 2022, 11, 100228. [Google Scholar] [CrossRef]
  3. Ghaderzadeh, M.; Aria, M. Management of COVID-19 Detection Using Artificial Intelligence in 2020 Pandemic. In Proceedings of the 2021 5th International Conference on Medical and Health Informatics, Kyoto, Japan, 14–16 May 2021; Association for Computing Machinery: New York, NY, USA, 2021; pp. 32–38. [Google Scholar]
  4. Bayani, A.; Asadi, F.; Hosseini, A.; Hatami, B.; Kavousi, K.; Aria, M.; Zali, M.R. Performance of machine learning techniques on prediction of esophageal varices grades among patients with cirrhosis. Clin. Chem. Lab. Med. (CCLM) 2022, 60, 1955–1962. [Google Scholar] [CrossRef] [PubMed]
  5. Balafar, M.A. Gaussian mixture model based segmentation methods for brain MRI images. Artif. Intell. Rev. 2014, 41, 429–439. [Google Scholar] [CrossRef]
  6. Balafar, M.A.; Ramli, A.R.; Saripan, M.I.; Mashohor, S. Review of brain MRI image segmentation methods. Artif. Intell. Rev. 2010, 33, 261–274. [Google Scholar] [CrossRef]
  7. Balafar, M.A. Fuzzy C-mean based brain MRI segmentation algorithms. Artif. Intell. Rev. 2014, 41, 441–449. [Google Scholar] [CrossRef]
  8. Qiu, X. U-Net-ASPP: U-Net based on atrous spatial pyramid pooling model for medical image segmentation in COVID-19. J. Appl. Sci. Eng. 2022, 25, 1167–1176. [Google Scholar] [CrossRef]
  9. Niu, Q.; Kandhro, I.A.; Kumar, A.; Shah, S.; Hasan, M.; Ahmed, H.M.; Liang, F. Web Scraping Tool For Newspapers And Images Data Using Jsonify. J. Appl. Sci. Eng. 2022, 26, 465–474. [Google Scholar] [CrossRef]
  10. Yang, F.; Fan, H.; Chu, P.; Blasch, E.; Ling, H. Clustered object detection in aerial images. In Proceedings of the IEEE International Conference on Computer Vision, Seoul, Republic of Korea, 27 October–2 November 2019; pp. 8311–8320. [Google Scholar]
  11. Sharma, R.P.; Dey, S. Two-stage quality adaptive fingerprint image enhancement using Fuzzy C-means clustering based fingerprint quality analysis. Image Vis. Comput. 2019, 83–84, 1–16. [Google Scholar] [CrossRef] [Green Version]
  12. Gulshan, V.; Peng, L.; Coram, M.; Stumpe, M.C.; Wu, D.; Narayanaswamy, A.; Venugopalan, S.; Widner, K.; Madams, T.; Cuadros, J.; et al. Development and Validation of a Deep Learning Algorithm for Detection of Diabetic Retinopathy in Retinal Fundus Photographs. JAMA 2016, 316, 2402–2410. [Google Scholar] [CrossRef]
  13. Aria, M.; Nourani, E.; Golzari Oskouei, A. ADA-COVID: Adversarial Deep Domain Adaptation-Based Diagnosis of COVID-19 from Lung CT Scans Using Triplet Embeddings. Comput. Intell. Neurosci. 2022, 2022, 2564022. [Google Scholar] [CrossRef] [PubMed]
  14. Majid, S.; Alenezi, F.; Masood, S.; Ahmad, M.; Gündüz, E.S.; Polat, K. Attention based CNN model for fire detection and localization in real-world images. Expert Syst. Appl. 2022, 189, 116114. [Google Scholar] [CrossRef]
  15. Zhang, H.; Liu, W.; Shi, J.; Fei, T.; Zong, B. Joint Detection Threshold Optimization and Illumination Time Allocation Strategy for Cognitive Tracking in a Networked Radar System. IEEE Trans. Signal Process. 2022, 1–15. [Google Scholar] [CrossRef]
  16. Ahmed, T.; Sabab, N.H.N. Classification and Understanding of Cloud Structures via Satellite Images with EfficientUNet. SN Comput. Sci. 2021, 3, 99. [Google Scholar] [CrossRef]
  17. Altae-Tran, H.; Ramsundar, B.; Pappu, A.S.; Pande, V. Low Data Drug Discovery with One-Shot Learning. ACS Cent. Sci. 2017, 3, 283–293. [Google Scholar] [CrossRef] [PubMed] [Green Version]
  18. Christodoulidis, S.; Anthimopoulos, M.; Ebner, L.; Christe, A.; Mougiakakou, S. Multisource Transfer Learning With Convolutional Neural Networks for Lung Pattern Analysis. IEEE J. Biomed. Health Inform. 2017, 21, 76–84. [Google Scholar] [CrossRef] [Green Version]
  19. Dhungel, N.; Carneiro, G.; Bradley, A.P. A deep learning approach for the analysis of masses in mammograms with minimal user intervention. Med. Image Anal. 2017, 37, 114–128. [Google Scholar] [CrossRef] [Green Version]
  20. Bar, Y.; Diamant, I.; Wolf, L.; Greenspan, H. Deep learning with non-medical training used for chest pathology identification. In Proceedings of the Medical Imaging 2015: Computer-Aided Diagnosis, Orlando, Fl, USA, 21–26 February 2015; International Society for Optics and Photonics: Bellingham, WA, USA, 2015; p. 94140. [Google Scholar]
  21. Wang, L.; Lin, Z.Q.; Wong, A. Covid-net: A tailored deep convolutional neural network design for detection of COVID-19 cases from chest X-ray images. Sci. Rep. 2020, 10, 19549. [Google Scholar] [CrossRef]
  22. Afshar, P.; Heidarian, S.; Naderkhani, F.; Oikonomou, A.; Plataniotis, K.N.; Mohammadi, A. COVID-CAPS: A capsule network-based framework for identification of COVID-19 cases from X-ray images. Pattern Recognit. Lett. 2020, 138, 638–643. [Google Scholar] [CrossRef]
  23. Rezaei, S.; Tahmoresnezhad, J. Discriminative and domain invariant subspace alignment for visual tasks. Iran J. Comput. Sci. 2019, 2, 219–230. [Google Scholar] [CrossRef]
  24. Li, J.; Jing, M.; Lu, K.; Zhu, L.; Shen, H.T. Locality preserving joint transfer for domain adaptation. IEEE Trans. Image Process. 2019, 28, 6103–6115. [Google Scholar] [CrossRef] [PubMed] [Green Version]
  25. Zheng, D.; Zhang, K.; Lu, J.; Jing, J.; Xiong, Z. Active discriminative cross-domain alignment for low-resolution face recognition. IEEE Access 2020, 8, 97503–97515. [Google Scholar] [CrossRef]
  26. Sanodiya, R.K.; Mathew, J.; Aditya, R.; Jacob, A.; Nayanar, B. Kernelized unified domain adaptation on geometrical manifolds. Expert Syst. Appl. 2021, 167, 114078. [Google Scholar] [CrossRef]
  27. Fu, T.; Li, Y. Unsupervised domain adaptation based on pseudo-label confidence. IEEE Access 2021, 9, 87049–87057. [Google Scholar] [CrossRef]
  28. Li, J.; Li, Z.; Lü, S. Unsupervised double weighted domain adaptation. Neural Comput. Appl. 2021, 33, 3545–3566. [Google Scholar] [CrossRef]
  29. Noori Saray, S.; Tahmoresnezhad, J. Iterative joint classifier and domain adaptation for visual transfer learning. Int. J. Mach. Learn. Cybern. 2022, 13, 947–961. [Google Scholar] [CrossRef]
  30. Azarkesht, M.; Afsari, F. Instance reweighting and dynamic distribution alignment for domain adaptation. J. Ambient. Intell. Humaniz. Comput. 2022, 13, 4967–4987. [Google Scholar] [CrossRef]
  31. Limin, S.; Qiang, Z.; Shuang, L.; Harold, L.C. Balanced Discriminative Transfer Feature Learning for Visual Domain Adaptation. ZTE Commun. 2021, 18, 78–83. [Google Scholar]
  32. Samsudin, M.R.; Abu-Bakar, S.A.; Mokji, M.M. Balanced Weight Joint Geometrical and Statistical Alignment for Unsupervised Domain Adaptation. J. Adv. Inf. Technol. 2022, 13, 21–28. [Google Scholar] [CrossRef]
  33. He, K.; Zhang, X.; Ren, S.; Sun, J. Deep residual learning for image recognition. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, USA, 27–30 June 2016; pp. 770–778. [Google Scholar]
  34. Schroff, F.; Kalenichenko, D.; Philbin, J. Facenet: A unified embedding for face recognition and clustering. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Boston, MA, USA, 7–12 June 2015; pp. 815–823. [Google Scholar]
  35. Sun, Y.; Tzeng, E.; Darrell, T.; Efros, A.A. Unsupervised domain adaptation through self-supervision. arXiv 2019, arXiv:1909.11825. [Google Scholar]
  36. Wang, Z.; Liu, Q.; Dou, Q. Contrastive cross-site learning with redesigned net for COVID-19 ct classification. IEEE J. Biomed. Health Inform. 2020, 24, 2806–2813. [Google Scholar] [CrossRef]
  37. Hashemzadeh, M.; Golzari Oskouei, A.; Farajzadeh, N. New fuzzy C-means clustering method based on feature-weight and cluster-weight learning. Appl. Soft Comput. 2019, 78, 324–345. [Google Scholar] [CrossRef]
  38. Golzari Oskouei, A.; Balafar, M.A.; Motamed, C. FKMAWCW: Categorical fuzzy k-modes clustering with automated attribute-weight and cluster-weight learning. Chaos Solitons Fractals 2021, 153, 111494. [Google Scholar] [CrossRef]
  39. Golzari Oskouei, A.; Balafar, M.A.; Motamed, C. EDCWRN: Efficient deep clustering with the weight of representations and the help of neighbors. Appl. Intell. 2022. [Google Scholar] [CrossRef]
  40. LeCun, Y.; Bottou, L.; Bengio, Y.; Haffner, P. Gradient-based learning applied to document recognition. Proc. IEEE 1998, 86, 2278–2324. [Google Scholar] [CrossRef]
  41. Ganin, Y.; Ustinova, E.; Ajakan, H.; Germain, P.; Larochelle, H.; Laviolette, F.; Marchand, M.; Lempitsky, V. Domain-adversarial training of neural networks. J. Mach. Learn. Res. 2016, 17, 2030–2096. [Google Scholar]
  42. Netzer, Y.; Wang, T.; Coates, A.; Bissacco, A.; Wu, B.; Ng, A.Y. Reading digits in natural images with unsupervised feature learning. 2011. Available online: https://www.semanticscholar.org/paper/Reading-Digits-in-Natural-Images-with-Unsupervised-Netzer-Wang/02227c94dd41fe0b439e050d377b0beb5d427cda (accessed on 19 December 2022).
  43. Ye, S.; Wu, K.; Zhou, M.; Yang, Y.; Tan, S.H.; Xu, K.; Song, J.; Bao, C.; Ma, K. Light-weight calibrator: A separable component for unsupervised domain adaptation. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Seattle, WA, USA, 13–19 June 2020; pp. 13736–13745. [Google Scholar]
  44. Li, M.; Zhai, Y.-M.; Luo, Y.-W.; Ge, P.-F.; Ren, C.-X. Enhanced transport distance for unsupervised domain adaptation. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Seattle, WA, USA, 13–19 June 2020; pp. 13936–13944. [Google Scholar]
  45. Xiao, N.; Zhang, L. Dynamic weighted learning for unsupervised domain adaptation. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Nashville, TN, USA, 20–25 June 2021; pp. 15242–15251. [Google Scholar]
  46. Shu, R.; Bui, H.H.; Narui, H.; Ermon, S. A dirt-t approach to unsupervised domain adaptation. arXiv 2018, arXiv:1802.08735. [Google Scholar]
  47. Bousmalis, K.; Trigeorgis, G.; Silberman, N.; Krishnan, D.; Erhan, D. Domain separation networks. In Advances in Neural Information Processing Systems 29: Annual Conference on Neural Information Processing Systems 2016, Barcelona, Spain, 5–10 December; NeurIPS Proceedings: Granada, Spain, 2016. [Google Scholar]
  48. Sankaranarayanan, S.; Balaji, Y.; Castillo, C.D.; Chellappa, R. Generate to adapt: Aligning domains using generative adversarial networks. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA, 18–23 June 2018; pp. 8503–8512. [Google Scholar]
  49. Saito, K.; Watanabe, K.; Ushiku, Y.; Harada, T. Maximum classifier discrepancy for unsupervised domain adaptation. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA, 18–23 June 2018; pp. 3723–3732. [Google Scholar]
  50. Bousmalis, K.; Silberman, N.; Dohan, D.; Erhan, D.; Krishnan, D. Unsupervised pixel-level domain adaptation with generative adversarial networks. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, HI, USA, 21–26 July 2017; pp. 3722–3731. [Google Scholar]
  51. Liang, J.; Hu, D.; Feng, J. Do we really need to access the source data? Source hypothesis transfer for unsupervised domain adaptation. In Proceedings of the International Conference on Machine Learning, Virtual Event, 13–18 July 2020; pp. 6028–6039. [Google Scholar]
  52. Pinheiro, P.O. Unsupervised domain adaptation with similarity learning. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA, 18–23 June 2018; pp. 8004–8013. [Google Scholar]
  53. Lee, C.-Y.; Batra, T.; Baig, M.H.; Ulbricht, D. Sliced wasserstein discrepancy for unsupervised domain adaptation. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Long Beach, CA, USA , 15–20 June 2019; pp. 10285–10295. [Google Scholar]
  54. Liu, M.-Y.; Breuel, T.; Kautz, J. Unsupervised image-to-image translation networks. In Advances in Neural Information Processing Systems 30: Annual Conference on Neural Information Processing Systems 2017, Long Beach, CA, USA, 4–9 December 2017; NeurIPS Proceedings: Granada, Spain, 2016. [Google Scholar]
  55. Haeusser, P.; Frerix, T.; Mordvintsev, A.; Cremers, D. Associative domain adaptation. In Proceedings of the IEEE International Conference on Computer Vision and Pattern Recognition (CVPR), Honolulu, HI, USA, 21–26 July 2017; pp. 2765–2773. [Google Scholar]
  56. Hoffman, J.; Tzeng, E.; Park, T.; Zhu, J.-Y.; Isola, P.; Saenko, K.; Efros, A.; Darrell, T. Cycada: Cycle-consistent adversarial domain adaptation. In Proceedings of the International Conference on Machine Learning, Stockholm, Sweden, 10–15 July 2018; pp. 1989–1998. [Google Scholar]
  57. Long, M.; Cao, Y.; Wang, J.; Jordan, M. Learning transferable features with deep adaptation networks. In Proceedings of the International Conference on Machine Learning, Lille, France, 6–11 July 2015; pp. 97–105. [Google Scholar]
Figure 1. Proposed framework. Forward pass (solid line), backward pass (dashed line), GRL (gradient reversal layer), + (weighted sum of gradients).
Figure 2. Confusion matrix of evaluation on the test set of the target dataset.
Table 1. Parameters.

| Parameter | Value |
| --- | --- |
| batch size | 32 |
| $N_{pos}$ | 5 |
| $N_{neg}$ | 5 |
| coefficient $\lambda_y$ | 4 |
| coefficient $\lambda_d$ | 1 |
| dropout rate | 0.5 |
| learning rate (Adam optimizer) | $10^{-2}$ |
| maximum number of iterations | $2 \times 10^{4}$ |
Table 2. Datasets.

| Dataset | No. of Samples | Train Set | Test Set | Type |
| --- | --- | --- | --- | --- |
| MNIST | 70,000 | 60,000 | 10,000 | digits |
| MNIST-M | 70,000 | 60,000 | 10,000 | digits |
| SVHN | 99,289 | 73,257 | 26,032 | digits |
| USPS | 9298 | 7291 | 2007 | digits |
Table 3. Performance of the proposed method (accuracy) with different kinds of noise.

| Dataset (Source→Target) | Distance | Gaussian 0.01 | Gaussian 0.03 | Gaussian 0.05 | Salt&Pepper 0.05 | Salt&Pepper 0.1 | Salt&Pepper 0.2 | Speckle 0.05 | Speckle 0.1 | Speckle 0.15 |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| SVHN→MNIST | Euclidean | 98.0 | 97.8 | 97.7 | 97.0 | 97.4 | 96.5 | 97.4 | 97.4 | 97.6 |
| SVHN→MNIST | non-Euclidean | 99.3 | 99.2 | 99.1 | 98.9 | 99.0 | 98.6 | 99.1 | 99.0 | 99.1 |
| MNIST→USPS | Euclidean | 97.9 | 97.7 | 97.3 | 96.9 | 97.2 | 96.9 | 97.4 | 97.3 | 97.5 |
| MNIST→USPS | non-Euclidean | 98.0 | 98.0 | 98.1 | 98.0 | 98.0 | 97.7 | 98.0 | 98.0 | 97.9 |
| USPS→MNIST | Euclidean | 97.7 | 97.5 | 97.2 | 97.9 | 97.2 | 97.9 | 97.3 | 97.3 | 97.4 |
| USPS→MNIST | non-Euclidean | 99.2 | 99.1 | 99.0 | 89.7 | 89.9 | 89.4 | 99.0 | 89.9 | 99.0 |
| MNIST→MNIST-M | Euclidean | 98.7 | 98.5 | 98.1 | 97.6 | 98.0 | 97.1 | 98.4 | 98.0 | 98.2 |
| MNIST→MNIST-M | non-Euclidean | 98.8 | 98.7 | 98.6 | 98.3 | 98.5 | 98.0 | 98.6 | 98.5 | 98.6 |
Table 4. The performance of the proposed method without applying any noise.

| Distance | SVHN/MNIST | MNIST/USPS | USPS/MNIST | MNIST/MNIST-M |
| --- | --- | --- | --- | --- |
| Euclidean | 98.9 | 98.0 | 98.4 | 98.2 |
| non-Euclidean | 99.4 | 98.1 | 99.4 | 98.9 |
Table 5. Comparison of the proposed method with other methods.

| Method | SVHN/MNIST | MNIST/USPS | USPS/MNIST | MNIST/MNIST-M |
| --- | --- | --- | --- | --- |
| LWC [43] | 97.1 | 95.6 | 97.1 | - |
| ETD [44] | 97.9 | 96.4 | 97.9 | - |
| DWL [45] | 98.1 | 97.3 | 97.4 | - |
| DANN [41] | 73.8 | 85.1 | 73.0 | 77.4 |
| DIRT-T (IN) [46] | 99.4 | - | - | 98.7 |
| DIRT-T [46] | 99.4 | - | - | 98.9 |
| DSN [47] | 82.7 | 91.3 | - | 83.2 |
| GenToAdapt [48] | 92.4 | 95.3 | 90.8 | - |
| MCD [49] | 96.2 | 94.2 | 94.1 | - |
| PixelDA [50] | - | 95.9 | - | 98.2 |
| SHOT [51] | 98.9 | 98.0 | 98.4 | - |
| SimDA [52] | - | 96.4 | 95.6 | 90.5 |
| SWD [53] | 98.9 | 98.1 | 97.1 | - |
| UDATSS [35] | 85.8 | 96.5 | 90.2 | 98.9 |
| UNIT [54] | 90.5 | 95.9 | 93.5 | - |
| AssocDA [55] | 97.6 | - | - | 89.5 |
| CyCADA [56] | 90.4 | 95.6 | 96.5 | - |
| DAN [57] | 71.1 | 81.1 | - | 76.9 |
| VADA (IN) [46] | 94.5 | - | - | 95.7 |
| VADA [46] | 97.9 | - | - | 97.7 |
| Ours | 99.4 | 98.1 | 99.3 | 98.9 |
Table 6. Domain adaptation analysis.

| Method | SVHN/MNIST | MNIST/USPS | USPS/MNIST | MNIST/MNIST-M |
| --- | --- | --- | --- | --- |
| Without domain adaptation | 82.1 | 73.4 | 66.80 | 70.7 |
| With domain adaptation | 99.2 | 97.0 | 96.3 | 97.7 |
Table 7. Analysis of the proposed method with other evaluation criteria.

| Metric | SVHN/MNIST | MNIST/USPS | USPS/MNIST | MNIST/MNIST-M |
| --- | --- | --- | --- | --- |
| Precision | 99.45 | 97.99 | 99.30 | 98.90 |
| Recall | 99.45 | 98.15 | 99.29 | 98.89 |
| F1 | 99.45 | 98.06 | 99.29 | 98.89 |