Improving Domain Transfer with Consistency-Regularized Joint Distribution Alignment for Medical Image Classification

Zhang, Jiacheng; Li, Rui; Liu, Cheng; Ji, Xiang

doi:10.3390/sym17040515

¹ Department of Computer Science, Shantou University, Shantou 515063, China

² Department of Computer Science, Jiangsu Normal University, Xuzhou 221116, China

* Author to whom correspondence should be addressed.

Symmetry 2025, 17(4), 515; https://doi.org/10.3390/sym17040515

Submission received: 18 February 2025 / Revised: 20 March 2025 / Accepted: 24 March 2025 / Published: 28 March 2025

(This article belongs to the Special Issue Symmetry and Asymmetry in Embedded Systems)

Abstract

Background: Domain transfer plays a vital role in medical image analysis. It mitigates the challenges posed by variations in imaging equipment, protocols, and patient demographics, ultimately improving model performance across different domains or edge-intelligence devices. Methods: This paper introduces a new unsupervised domain adaptation approach, named Consistency-regularized Joint Distribution Alignment (C-JDA). Specifically, our method leverages Convolutional Neural Networks (CNNs) to align the joint distributions of the source and target domains in the feature space, using the pseudo-labels of the target data to compute the relative chi-square divergence that measures the distribution relationship or asymmetry. Compared with traditional alignment methods that rely on complex architectures or adversarial training, our model can be solved with a closed-form equation, which makes it convenient to transfer among various scenarios. Additionally, we propose symmetric consistency regularization to improve the robustness of pseudo-label generation with diverse data augmentation strategies, where the augmented data are symmetric to their original data and should share the same predictions. Distribution alignment and pseudo-label generation can therefore mutually improve each other for better performance. Results: Extensive experiments on multiple public medical image benchmarks demonstrate that C-JDA consistently outperforms both traditional domain adaptation methods and deep learning-based approaches. For the colon disease classification task, C-JDA achieved an accuracy of 87.41%, outperforming existing methods by 3.31%, with an F1 score of 87.26%, an improvement of 2.99%. For the Diabetic Retinopathy (DR) classification task, our method attained an accuracy and F1 score of 96.93%, surpassing state-of-the-art methods by 2.4%. Additionally, ablation studies validated the effectiveness of both the joint distribution alignment and symmetric consistency regularization components. Conclusions: C-JDA achieves state-of-the-art performance over existing domain adaptation methods through improved joint distribution alignment with symmetric consistency regularization.

Keywords: transfer learning; medical image classification; joint distribution alignment; consistency regularization

1. Introduction

Medical imaging informatics utilizes advanced image processing and machine learning techniques to improve diagnostic precision and speed [1]. With the rapid growth of deep learning and the increasing availability of data, significant progress has been made in analyzing medical images [2]. Computer-aided diagnosis (CAD) systems have shown their potential to enhance diagnostic accuracy and workflow efficiency [3], especially with the edge-IoT treatment [4].

Deep learning, in particular, has emerged as a leading approach for various medical image analysis tasks. Unlike traditional medical image processing techniques, deep learning methods do not require manual feature extraction; instead, they can automatically learn task-relevant features directly from medical images, significantly reducing the time and effort needed for manual feature extraction [5,6]. For instance, Dai et al. [7] proposed a novel deep residual network design for skin lesion detection, demonstrating the influential role of convolutional models in advancing medical image diagnostics. Despite the advancements of CNNs in medical image analysis, trained models often struggle to generalize to domains with different data distributions, particularly when significant domain gaps exist. Such distribution discrepancies, caused by varying data acquisition devices, medical institutions, or emerging diseases, are common in the medical field and are referred to as domain shift [8] (shown in Figure 1).

Domain transfer in medical imaging is a key research direction for applying deep learning to healthcare applications. The domain shift refers to substantial differences in data distributions between training and testing datasets, which substantially reduces the performance of the model in the target domain [9]. Domain shifts in medical image analysis arise from variations in equipment, protocols, demographics, and artifacts, leading to significant differences in image characteristics [10,11,12]. Although fine-tuning models on labeled target data can address this issue [13], the high cost and expertise required for medical data annotation make this approach very challenging.

Unsupervised domain adaptation (UDA) [14,15] adapts models to unlabeled target domains by leveraging labeled source domain data. This approach [16] uses labeled source data and unlabeled target data during training to enable the model to achieve reliable performance within the target domain. Developing robust medical image analysis models capable of generalizing across domains without relying on physician annotations is essential [17]. Many UDA approaches prioritize learning domain-invariant features [15,18], aligning source and target features or ensuring their distributions are similar, thereby facilitating the direct application of the source classifier to the target domain. To deal with this problem, there are three distribution matching methods: (1) marginal distribution alignment [19]; (2) class-conditional distribution alignment [16,20]; and (3) joint distribution alignment [21]. These approaches aim to minimize distribution discrepancies between source and target domain data within the feature space, enhancing model adaptability and performance.

Marginal distribution alignment focuses on bridging the gap between the source domain marginal distribution $P^{s}(x)$ and the target domain marginal distribution $P^{t}(x)$ by applying feature transformation techniques. For shallow methods, feature transformation often employs dimensionality reduction matrices [22,23]. In contrast, deep methods utilize neural networks [15,19,24]. Marginal distribution matching aligns the global feature distributions of the source and target domains without considering their label information. Traditional strategies include Maximum Mean Discrepancy (MMD) [25], which minimizes the difference in feature means, and adversarial training approaches such as DANN [24], which leverage domain adversarial networks to make the features of both domains indistinguishable. DV-GLCA [26] is a novel framework that incorporates dual-view information into conditional adversarial domain adaptation, helping to mitigate distribution mismatch and enhance the robustness of domain-invariant features. Several studies [16,27,28,29] have highlighted that neglecting label information during distribution matching can negatively affect the final transfer performance.

Class-conditional distribution matching aims to align the source class-conditional distribution $P^{s}(y | x)$ with the target class-conditional distribution $P^{t}(y | x)$ by learning class- and domain-invariant feature representations [16,20,30]. Specifically, this method aligns the conditional distributions of samples from the same class in both source and target domains, ensuring similar feature representations for samples belonging to the same class. Since obtaining target labels is not feasible in unsupervised domain adaptation, estimating the differences between the two class-conditional distributions becomes complex. Li et al. [31] proposed a conditional adversarial domain adaptation network for structural damage detection. This approach uses a one-dimensional residual network to extract features from raw vibration data and trains a classifier with labeled data from the source domain. To align class-wise features across domains, the classifier’s output serves as a conditional variable for domain alignment training. Additionally, an entropy-aware re-weighting technique is applied to enhance the model’s performance on difficult samples. Several methods [16,28,29,32] leverage techniques introduced by Long et al. [33] to generate pseudo-labels for the target data. However, class-conditional alignment methods often do not consider the difference between the label distributions of the two domains ($P^{s}(y)$ and $P^{t}(y)$), which makes inferring pseudo-labels for the target domain challenging and can compromise the model’s robustness.

Joint distribution matching aligns the joint distributions of the source and target domains by considering both class-conditional information and label information. By integrating adversarial training methods with label information, joint distribution matching enables the model to bridge the gap between marginal distributions while precisely ensuring feature alignment across categories. Specifically, joint distribution matching seeks to align the source joint distribution $P^{s}(x, y)$ with the target joint distribution $P^{t}(x, y)$. This alignment is achieved by minimizing joint distribution discrepancy [34,35] or optimal transformation [36]. Joint distribution matching can effectively reduce the distribution bias between domains to enhance the model’s generalization ability. However, in an unsupervised domain adaptation setting, the absence of labels in the target domain still makes the joint distribution estimation difficult.

To address the challenges in unsupervised domain adaptation, we propose Consistency-regularized Joint Distribution Alignment (C-JDA), which aims to achieve more robust feature alignment for medical image classification. This framework leverages Convolutional Neural Networks (CNNs) to align the joint distributions $P^{s}(x, y)$ and $P^{t}(x, y)$ of the source and target domains in the feature space. Specifically, we first use the pseudo-labels of the target data and the original labeled source data to estimate the joint distribution discrepancy for alignment. Inspired by [34,37,38,39], we adopt relative density ratio estimation to measure the joint distribution discrepancy, which can be solved with a closed-form equation. The whole adaptation training is therefore convenient for domain transfer on medical image classification tasks. However, the pseudo-labels of the target data may be inaccurate due to the domain shift, which can degrade the estimation of the joint distribution. We therefore propose symmetric prediction consistency regularization to enhance the reliability of the target pseudo-labels. This is achieved by enforcing similar predictions between the original data and their symmetric counterparts produced by various augmentation techniques, which improves the robustness of the model. The proposed joint distribution alignment and symmetric consistency mechanism can thus mutually enhance each other for better performance. We observe that C-JDA enables more precise joint distribution alignment, reducing distribution discrepancies across domains and improving adaptation performance. This paper provides the following contributions:

(1) We propose a new Consistency-regularized Joint Distribution Alignment (C-JDA) framework to tackle the domain adaptation challenge for medical image classification tasks.
(2) Unlike traditional alignment methods that rely on complex architectures or adversarial training, our JDA framework provides a closed-form solution, ensuring greater stability and computational efficiency. This formulation enhances interpretability and facilitates seamless transfer across diverse domain adaptation scenarios.
(3) We adopt various data augmentation techniques to enhance the reliability of the target pseudo-label, which in turn improves the joint distribution alignment and results in state-of-the-art performance on multiple medical image benchmarks.

This paper is structured as follows: Section 2 reviews related methods. Section 3 discusses the motivation behind our work and presents the details of the C-JDA framework. Section 4 provides evaluation results and experimental analyses, and Section 5 presents the conclusion.

2. Related Work

Medical image analysis: In recent years, deep learning methods have garnered significant attention in the field of medical image analysis. By enabling computer-based analysis of medical images, these methods enhance diagnostic efficiency, facilitate early detection and treatment, and help prevent delays in providing optimal patient care. For instance, Esteva et al. [40] demonstrated that deep learning models can match dermatologists in skin cancer classification, highlighting the considerable potential of deep learning in medical image analysis. In the area of feature transfer, Euijoon et al. [41] introduced a method that leverages zero-bias convolutional autoencoders combined with context-based feature enhancement. This approach effectively transfers knowledge from the source to the target domain, resulting in improved classification performance. Regarding diabetic retinopathy, Gulshan et al. [42] developed a deep learning algorithm for the automatic detection of this disease from retinal photographs, validating its diagnostic accuracy. This research provides substantial technical support for the early detection of diabetic retinopathy. In colorectal cancer, Kather et al. [43] applied deep learning methods to classify colorectal cancer tissue slices, offering performance evaluations of various classification models and systematically analyzing their effectiveness and challenges. Recently, cross-modality medical image adaptation has become a popular topic. Zheng et al. [44] proposed the Unsupervised Cross-Modality Domain Adaptation Network (UCMDAN), a novel approach designed to address modality discrepancies such as contrast variations and artifacts typically encountered in 2D/3D rigid registration tasks. Recent advancements in AI-based medical imaging techniques, such as the integration of EfficientNetV2 and vision transformers for breast cancer classification [45], have demonstrated promising results in improving diagnostic accuracy. 
However, domain shift is still a serious problem preventing the application of deep learning in the medical field.

Domain adaptation learning: Domain adaptation focuses on leveraging knowledge from multiple related source domains to enhance learning in a target domain with limited labeled data [36,46]. The Domain-Adversarial Neural Network (DANN), introduced by Ganin et al. [24], extracts domain-invariant features by minimizing the divergence between domains while preserving discriminative features from labeled source data. Hu et al. [30] achieved distribution alignment by synchronizing gradient updates for both marginal and class-conditional distributions. Chen et al. [35] proposed a distribution invariant projection method that aligns the distributions of the source and target domains through projection pairs measured using the L2 norm. Qi et al. [47] introduced the Curriculum Feature Alignment Network (CFAN), which identifies reliable pseudo-labeled target samples based on their similarity to the source domain. This method reduces intra-class variability by aligning class-specific features across domains, demonstrating particular effectiveness in epithelial–stromal (ES) classification for histopathological images. Gao et al. [48] developed the Deep Cross-Subject Adaptive Decoding (DCAD) framework, which bridges distributional gaps between domains without requiring labeled target data. For fMRI signals, DCAD enhances cross-subject decoding and facilitates accurate identification of brain states. Jin et al. [49] proposed the DASC-Net model, which incorporates a novel domain adaptation and dual-domain enhanced self-correction learning framework to address domain shift and the challenges posed by limited datasets in COVID-19 CT image segmentation. In the task of ECG arrhythmia classification, ref. [50] proposed an innovative unsupervised domain adaptation (UDA) framework that combines image style transfer, collaborative learning, and adversarial training [51]. 
Different from previous methods, we propose to align the joint distributions across two domains, which is the essential problem of the domain transfer in medical images. We also propose consistency regularization to improve the estimation of joint distribution comparison.

Consistency learning: Consistency training leverages unlabeled data to improve predictions and ensures consistent outputs across different perturbations. Berthelot et al. [52] introduced techniques for aligning distributions and strengthening the model’s foundation, where the network’s ensemble predictions are used to improve consistency, thereby enhancing both stability and accuracy. Lee et al. [53] enhanced model performance by generating pseudo-labels for unlabeled data and using them as targets, further improving the results by combining this approach with consistency regularization. Cubuk et al. [54] explored the role of data augmentation techniques in consistency training, strengthening the model’s robustness to input variations by introducing more perturbations and transformations, thereby improving performance on unlabeled data. FixMatch [55] combines pseudo-labeling and consistency regularization, applying various perturbations to unlabeled data to promote stable predictions under different input variations. This method has proven highly effective in semi-supervised learning tasks. Xu et al. [56] introduced the Cycle Prototype Consistency Learning (CPCL) framework, leveraging non-parametric prototype learning to incorporate unlabeled data by providing explicit supervision. This approach improves segmentation networks by encouraging more discriminative and compact feature representations, transforming unsupervised consistency into supervised consistency and achieving comprehensive real-label supervision. Gu et al. [57] proposed an adversarial domain adaptation method that uses cycle consistency to adapt image features from the source domain to the target domain. This helps with the domain shift problem, which often leads to performance degradation when deep learning models are applied to new datasets from different clinical environments and patient populations.

3. Methods

This work aims to achieve unsupervised domain adaptation for medical images. Following the definition in [35], a domain is characterized by the joint probability distribution $P(x, y)$, where $x \in X$ is a sample from the feature space $X$ and $y \in Y$ is a label from the label space $Y$. To address the distribution alignment problem in domain adaptation, this paper proposes Consistency-regularized Joint Distribution Alignment (C-JDA), aiming to enhance the matching of features between the source and target domains to improve classification performance in the target domain. Figure 2 illustrates the system architecture of the C-JDA framework.

3.1. Joint Distribution Alignment

In domain adaptation, addressing the statistical discrepancy between the source distribution $P^{s}(x, y)$ and the target distribution $P^{t}(x, y)$ is crucial. We use a divergence metric based on the relative chi-square divergence to measure the joint domain gap. This divergence extends the chi-square divergence by introducing an auxiliary distribution $P^{α}(x, y)$ that interpolates between the source and target distributions, and is defined as follows:

$$\mathrm{JDA}_{α}(P^{s}, P^{t}) = \int \left[ \left( \frac{P^{s}(x, y)}{P^{α}(x, y)} \right)^{2} - 1 \right] P^{α}(x, y) \, dx \, dy \tag{1}$$

where $P^{α}(x, y) = α P^{s}(x, y) + (1 - α) P^{t}(x, y)$. The parameter $α \in (0, 1)$ controls the weight of the source and target contributions. This formulation ensures a statistically balanced comparison, enabling the metric to emphasize regions of high-density overlap while penalizing discrepancies more robustly than traditional divergence measures. This metric is nonnegative, bounded above by $(1 / α) - 1$ for $α \in (0, 1)$, and equals zero only when $P^{s}(x, y) = P^{t}(x, y)$ [34].
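As a small numerical illustration of Equation (1), the sketch below evaluates the divergence for discrete distributions, where the integral becomes a sum over cells. The function name `relative_chi_square` and the toy probability vectors are illustrative assumptions and do not come from the paper.

```python
def relative_chi_square(p_s, p_t, alpha=0.5):
    """Equation (1) for discrete distributions given as lists of probabilities."""
    total = 0.0
    for ps, pt in zip(p_s, p_t):
        p_alpha = alpha * ps + (1.0 - alpha) * pt  # mixture P^alpha for this cell
        if p_alpha > 0.0:  # cells with zero mixture mass contribute nothing
            ratio = ps / p_alpha
            total += (ratio ** 2 - 1.0) * p_alpha
    return total


# Toy inputs (assumptions): identical, partially shifted, and disjoint distributions
same = relative_chi_square([0.2, 0.3, 0.5], [0.2, 0.3, 0.5])
shifted = relative_chi_square([0.2, 0.3, 0.5], [0.5, 0.3, 0.2])
disjoint = relative_chi_square([1.0, 0.0], [0.0, 1.0])
print(same, shifted, disjoint)
```

With $α = 0.5$, the identical pair evaluates to zero and the disjoint pair reaches the upper bound $(1 / α) - 1 = 1$, while the partially shifted pair falls strictly between the two, in line with the properties stated above.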

Direct computation of $\mathrm{JDA}_{α}(P^{s}, P^{t})$ is intractable due to its dependence on $P^{s}(x, y)$, $P^{t}(x, y)$, and $P^{α}(x, y)$. To address this, we reformulate the metric using a variational approach. Specifically, we introduce a flexible auxiliary function $r(x, y)$ and express the divergence as follows:

$$\mathrm{JDA}_{α}(P^{s}, P^{t}) = \max_{r} \int \left( 2 \frac{P^{s}(x, y)}{P^{α}(x, y)} r(x, y) - r(x, y)^{2} - 1 \right) P^{α}(x, y) \, dx \, dy \tag{2}$$

Here, $r(x, y)$ acts as a surrogate function that is tailored to capture the unique characteristics of the source and target domains. This reformulation transforms the calculation of domain discrepancy into an optimization task, where $r(x, y)$ is learned from empirical data. Using Monte Carlo sampling, the integral above can be approximated with finite datasets of size $m_{s}$ (source) and $m_{t}$ (target). The empirical form of the objective function is as follows:

$$\mathrm{JDA}_{α}(P^{s}, P^{t}) \approx \max_{r} \left( \frac{2}{m_{s}} \sum_{i = 1}^{m_{s}} r(x_{i}^{s}, y_{i}^{s}) - \frac{α}{m_{s}} \sum_{i = 1}^{m_{s}} r(x_{i}^{s}, y_{i}^{s})^{2} - \frac{1 - α}{m_{t}} \sum_{i = 1}^{m_{t}} r(x_{i}^{t}, y_{i}^{t})^{2} \right) - 1 \tag{3}$$
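The step from Equation (1) to Equations (2) and (3) can be checked with a short derivation. It is added here as a sanity check under the definitions above and is not taken from the original text. The integrand of Equation (2) is maximized pointwise in the value of $r$, and the mixture $P^{α}$ is then expanded into its source and target parts.

```latex
% Pointwise maximization of the integrand of Equation (2) over the value r = r(x, y)
\frac{\partial}{\partial r}\left( 2\,\frac{P^{s}}{P^{\alpha}}\, r - r^{2} - 1 \right)
  = 2\,\frac{P^{s}}{P^{\alpha}} - 2r = 0
\quad\Longrightarrow\quad
r^{*}(x, y) = \frac{P^{s}(x, y)}{P^{\alpha}(x, y)}

% Substituting r^{*} back recovers the integrand of Equation (1)
2\left(\frac{P^{s}}{P^{\alpha}}\right)^{2} - \left(\frac{P^{s}}{P^{\alpha}}\right)^{2} - 1
  = \left(\frac{P^{s}}{P^{\alpha}}\right)^{2} - 1

% Expanding P^{\alpha} = \alpha P^{s} + (1 - \alpha) P^{t} in Equation (2)
\int \left( 2\,\frac{P^{s}}{P^{\alpha}}\, r - r^{2} - 1 \right) P^{\alpha} \, dx \, dy
  = 2\,\mathbb{E}_{P^{s}}[r] - \alpha\,\mathbb{E}_{P^{s}}[r^{2}]
    - (1 - \alpha)\,\mathbb{E}_{P^{t}}[r^{2}] - 1
```

Replacing the three expectations with sample averages over the $m_{s}$ source samples and the $m_{t}$ target samples gives the empirical objective of Equation (3).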

The approximation in Equation (3) enables practical computation and optimization using gradient-based methods. To prevent overfitting and ensure smooth estimation of $r(x, y)$, we introduce a regularization term that penalizes the norm of $r(x, y)$. The regularized optimization problem is as follows:

$$\hat{r} = \underset{r \in R}{\arg\max} \; \hat{F}(r) - λ \| r \|_{R}^{2} \tag{4}$$

where $\hat{F}(r)$ is the empirical objective, and $λ$ is a hyperparameter controlling the regularization strength. This step ensures that the learned $r(x, y)$ generalizes well across unseen data points. To solve the optimization problem efficiently, we parameterize $r(x, y)$ using a finite-dimensional vector $β$ and a set of basis functions. Let $\hat{r}(x, y; β)$ be the parameterized representation of $r(x, y)$. Substituting this into the objective function, we obtain the following:

$$\begin{aligned} \hat{β} &= \underset{β \in \mathbb{R}^{m_{st}}}{\arg\max} \left( \frac{2}{m_{s}} \sum_{i = 1}^{m_{s}} \hat{r}(x_{i}^{s}, y_{i}^{s}; β) - \frac{α}{m_{s}} \sum_{i = 1}^{m_{s}} \hat{r}(x_{i}^{s}, y_{i}^{s}; β)^{2} - \frac{1 - α}{m_{t}} \sum_{i = 1}^{m_{t}} \hat{r}(x_{i}^{t}, y_{i}^{t}; β)^{2} - λ β^{⊤} G β \right) \\ &= \underset{β \in \mathbb{R}^{m_{st}}}{\arg\max} \left( \frac{2}{m_{s}} 1^{⊤} G^{s} β - \frac{α}{m_{s}} β^{⊤} (G^{s})^{⊤} G^{s} β - \frac{1 - α}{m_{t}} β^{⊤} (G^{t})^{⊤} G^{t} β - λ β^{⊤} G β \right) \end{aligned} \tag{5}$$

where $G^{s}$ and $G^{t}$ are feature matrices derived from the source and target data, respectively. The optimization problem is quadratic and admits a closed-form solution. Let $H$ and $b$ represent the quadratic and linear terms of the objective. The optimal parameter $\hat{β}$ is obtained as follows:

$$\hat{β} = (H + λ G)^{-1} b \tag{6}$$

where, matching the terms of Equation (5), $H = \frac{α}{m_{s}} (G^{s})^{⊤} G^{s} + \frac{1 - α}{m_{t}} (G^{t})^{⊤} G^{t}$ and $b = \frac{1}{m_{s}} (G^{s})^{⊤} 1$. Substituting $\hat{β}$ back, the final joint distribution alignment loss is computed as follows:

$$\begin{aligned} ℓ_{Jda} = \widehat{\mathrm{JDA}}_{α}(P^{s}, P^{t}) &= \frac{2}{m_{s}} \sum_{i = 1}^{m_{s}} \hat{r}(x_{i}^{s}, y_{i}^{s}; \hat{β}) - \frac{α}{m_{s}} \sum_{i = 1}^{m_{s}} \hat{r}(x_{i}^{s}, y_{i}^{s}; \hat{β})^{2} - \frac{1 - α}{m_{t}} \sum_{i = 1}^{m_{t}} \hat{r}(x_{i}^{t}, y_{i}^{t}; \hat{β})^{2} - 1 \\ &= 2 b^{⊤} \hat{β} - \hat{β}^{⊤} H \hat{β} - 1 \end{aligned} \tag{7}$$

The divergence enables a principled approach to Joint Distribution Alignment by aligning the distributions at global and class levels, effectively mitigating domain discrepancies. It also handles imbalanced data by prioritizing common regions shared between the source and target domains. Its closed-form optimization significantly boosts training efficiency, making it well-suited for medical applications. Additionally, the integrated regularization strategy reduces overfitting, enhancing cross-domain generalization. These characteristics make joint distribution alignment particularly effective in real-world applications, such as cross-institutional medical image analysis, where robust and scalable solutions are essential.
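The closed-form estimate of Equations (5) to (7) can be sketched in a few lines of NumPy. This is a minimal sketch under stated assumptions, not the authors' implementation. It assumes Gaussian kernel basis functions centered on all source and target samples, so that $m_{st} = m_{s} + m_{t}$. It also assumes that each row of `z_s` and `z_t` is a joint sample, for example a feature vector concatenated with a one-hot label or pseudo-label. The names `gaussian_gram` and `jda_estimate`, the bandwidth `sigma`, and the small `jitter` term are hypothetical additions for illustration.

```python
import numpy as np


def gaussian_gram(a, b, sigma):
    """Gaussian kernel values between every row of a and every row of b."""
    sq_dist = ((a[:, None, :] - b[None, :, :]) ** 2).sum(axis=2)
    return np.exp(-sq_dist / (2.0 * sigma ** 2))


def jda_estimate(z_s, z_t, alpha=0.5, lam=0.01, sigma=1.0, jitter=1e-8):
    """Closed-form estimate of Equation (7) from joint samples z_s and z_t."""
    m_s, m_t = len(z_s), len(z_t)
    centers = np.vstack([z_s, z_t])             # assumed basis centers, m_st = m_s + m_t
    G_s = gaussian_gram(z_s, centers, sigma)    # shape (m_s, m_st)
    G_t = gaussian_gram(z_t, centers, sigma)    # shape (m_t, m_st)
    G = gaussian_gram(centers, centers, sigma)  # shape (m_st, m_st), regularizer matrix
    # Quadratic and linear terms read off Equation (5)
    H = (alpha / m_s) * G_s.T @ G_s + ((1.0 - alpha) / m_t) * G_t.T @ G_t
    b = G_s.mean(axis=0)                        # equals (1 / m_s) * G_s^T 1
    # Equation (6); the jitter is an added numerical safeguard, not part of the method
    beta = np.linalg.solve(H + lam * G + jitter * np.eye(len(centers)), b)
    # Equation (7)
    return float(2.0 * b @ beta - beta @ H @ beta - 1.0)


# Toy data (assumption): 30 two-dimensional joint samples per domain
rng = np.random.default_rng(0)
z_s = rng.normal(size=(30, 2))
print(jda_estimate(z_s, z_s.copy()))  # identical samples
print(jda_estimate(z_s, z_s + 5.0))   # strongly shifted samples
```

For identical samples the estimate cannot exceed zero up to rounding, and for any inputs it cannot exceed $(1 / α) - 1$, because the regularized solution is evaluated on the unregularized objective. In a training loop, the same computation would be carried out on mini-batch features with an automatic differentiation library so that the loss can be backpropagated to the feature extractor.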

3.2. Consistency Regularization

A symmetric consistency regularization mechanism is integrated into the training process to strengthen the robustness and generalization of joint distribution alignment. This approach minimizes the intra-domain variability of feature representations by enforcing consistency between predictions generated from the original and augmented versions of the target domain data (the augmented data are symmetric to the original ones). Specifically, the regularization term is defined based on the Kullback–Leibler (KL) divergence:

$$\min_{θ_{c}, θ_{f}} ℓ_{Reg}(F, C) = E_{x_{t} \sim P^{t}} \left\{ \mathrm{KL}\left[ C(F(x_{t})) \,\|\, C(F(x_{t}^{aug})) \right] \right\} \tag{8}$$

where $F$ denotes the feature extractor, $C$ denotes the classifier, $x_{t}^{aug}$ refers to an augmented variant of the target domain sample $x_{t}$, and $P^{t}$ denotes the target domain data.

Our symmetric consistency regularization is a critical enhancement to joint distribution alignment by enforcing feature representation invariance under input perturbations. This approach mitigates overfitting to source-specific features and stabilizes feature alignment during cross-domain adaptation. Minimizing divergence between original and augmented sample predictions strengthens the model’s robustness against target domain noise and variability, enabling improved generalization. This integration is particularly advantageous in real-world scenarios, such as cross-institutional medical image analysis, where scalable and robust domain adaptation techniques are essential to address challenges posed by limited or imbalanced target domain data.
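The consistency term of Equation (8) can be illustrated with a small, framework-free sketch. It assumes that the classifier outputs logits which are turned into probabilities with a softmax, and that the expectation is replaced by a mini-batch average. The helper names and the `eps` smoothing constant are illustrative assumptions.

```python
import math


def softmax(logits):
    """Convert a list of logits into class probabilities."""
    peak = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def kl_divergence(p, q, eps=1e-12):
    """KL[p || q] for two discrete distributions; eps avoids log(0)."""
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q))


def consistency_loss(logits_orig, logits_aug):
    """Mini-batch average of KL[C(F(x_t)) || C(F(x_t^aug))]."""
    losses = [
        kl_divergence(softmax(lo), softmax(la))
        for lo, la in zip(logits_orig, logits_aug)
    ]
    return sum(losses) / len(losses)


# Toy logits (assumptions) for two target samples and their augmented views
original = [[2.0, 0.0, 0.0], [0.0, 1.0, 3.0]]
agreeing = [[2.0, 0.0, 0.0], [0.0, 1.0, 3.0]]
conflicting = [[0.0, 2.0, 0.0], [3.0, 1.0, 0.0]]
print(consistency_loss(original, agreeing))
print(consistency_loss(original, conflicting))
```

The loss vanishes when the augmented view yields the same prediction as the original and grows as the two predictions diverge, which is the behavior the regularizer rewards during training.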

3.3. Training Procedure

The overall objective function of our medical adaptation model is as follows:

$$L_{total} = ℓ_{Cls} + λ_{Jda} ℓ_{Jda} + λ_{Reg} ℓ_{Reg}, \tag{9}$$

where $ℓ_{Cls}$ is the classification loss on the labeled source data. The hyperparameters $λ_{Jda}$ and $λ_{Reg}$ balance the contributions of the joint distribution alignment loss and the consistency regularization loss, respectively. By jointly minimizing these components, the proposed framework achieves a synergistic balance between domain alignment and source-specific optimization. This integration enhances model generalization and robustness, addressing the challenges of cross-domain medical image analysis tasks.

The network parameters are updated by minimizing the loss function defined in Equation (9), with the optimization procedure detailed in Algorithm 1. Our experiments show that the enhanced joint distribution alignment framework significantly stabilizes the adaptation process when coupled with consistency regularization.

Algorithm 1 Pseudo-code of the proposed C-JDA.

Input: Source dataset with labels {$X_{S}$, $Y_{S}$}, target dataset {$X_{t}$}, mini-batch size $B$, learning rate $ζ$;
Output: $θ_{f}$, $θ_{c}$;

1: for i = 1 to N do
2:     for each mini-batch do
3:         update F, C with $ℓ_{Cls}$ for the main prediction model:
               $θ_{f} \leftarrow SGD (\nabla_{θ_{f}} (ℓ_{Cls}), θ_{f}, ζ)$ ;
               $θ_{c} \leftarrow SGD (\nabla_{θ_{c}} (ℓ_{Cls}), θ_{c}, ζ)$ ;
4:         update F with $ℓ_{Jda}$ for Joint Distribution Alignment:
               $θ_{f} \leftarrow SGD (\nabla_{θ_{f}} (ℓ_{Jda}), θ_{f}, ζ)$ ;
5:         update F, C with $ℓ_{Reg}$ with the pseudo-label $\hat{y}$ for improvement:
               $θ_{f} \leftarrow SGD (\nabla_{θ_{f}} (ℓ_{Reg}), θ_{f}, ζ)$ ;
               $θ_{c} \leftarrow SGD (\nabla_{θ_{c}} (ℓ_{Reg}), θ_{c}, ζ)$ ;
6:     end for
7: end for

4. Experiment

In the experimental section, we thoroughly assess the proposed framework on several publicly accessible medical imaging benchmark datasets that cover a broad range of domain shift scenarios. Specifically, we evaluate its effectiveness using key metrics such as classification accuracy and F1 score for a comprehensive analysis. We compare our method with leading approaches across various tasks to showcase its robustness and versatility.

4.1. Data Augmentations

The proposed method incorporates two levels of data augmentation: “weak” and “strong”. Weak augmentation applies simple transformations, including horizontal and vertical flips with a 50% probability, as well as random translations of up to 20% of the image’s width or height. In contrast, strong augmentation utilizes more complex transformations randomly selected from RandAugment (as introduced in [55]), such as contrast adjustment, histogram equalization, and color channel reduction. A comprehensive list of RandAugment transformations is provided in Table 1.
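The weak augmentation described above can be sketched with NumPy as follows. This is an illustrative sketch rather than the authors' pipeline. The function names are hypothetical, images are assumed to be arrays of shape (H, W) or (H, W, C), and exposed pixels after translation are zero-filled, a choice the text does not specify.

```python
import numpy as np


def translate(image, dx, dy):
    """Shift an image by dx columns and dy rows, zero-filling exposed pixels."""
    h, w = image.shape[:2]
    out = np.zeros_like(image)
    src_x = slice(max(0, -dx), min(w, w - dx))
    dst_x = slice(max(0, dx), min(w, w + dx))
    src_y = slice(max(0, -dy), min(h, h - dy))
    dst_y = slice(max(0, dy), min(h, h + dy))
    out[dst_y, dst_x] = image[src_y, src_x]
    return out


def weak_augment(image, rng, flip_prob=0.5, max_shift=0.2):
    """Random flips (each with probability flip_prob) and a random translation."""
    h, w = image.shape[:2]
    if rng.random() < flip_prob:
        image = image[:, ::-1]  # horizontal flip
    if rng.random() < flip_prob:
        image = image[::-1, :]  # vertical flip
    max_dx, max_dy = int(max_shift * w), int(max_shift * h)
    dx = int(rng.integers(-max_dx, max_dx + 1))  # upper bound is exclusive
    dy = int(rng.integers(-max_dy, max_dy + 1))
    return translate(image, dx, dy)


# Toy usage (assumption): a 224 x 224 RGB image filled with random values
rng = np.random.default_rng(0)
image = rng.random((224, 224, 3))
augmented = weak_augment(image, rng)
print(augmented.shape)
```

The augmented image keeps the input shape, so the original and augmented views can be passed through the same network when computing the consistency term.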

4.2. Datasets

4.2.1. For the Colon Disease Classification Task

We utilized the EBHI-Seg dataset (https://figshare.com/articles/dataset/EBHI-SEG/21540159/1, accessed on 11 November 2022) as the source domain, which consists of histopathological images from colonoscopy biopsies. The dataset comprises 3904 images divided into six tumor differentiation stages: normal, polyp, low-grade intraepithelial neoplasia, high-grade intraepithelial neoplasia, serrated adenoma, and adenocarcinoma. We employed the colon section dataset from Chaoyang Hospital [58] for the target domain. This dataset, annotated by three expert pathologists, consists of 512 × 512 pixel images and includes four categories: normal, serrated, adenoma, and adenocarcinoma. In our experiment, we merged the categories into three groups: normal, adenoma, and other types of colon diseases. The image resolution was standardized to 224 × 224 pixels.

These datasets present significant domain adaptation challenges. High intra-class variability is a key issue, as histological images exhibit substantial variations across patients and even within different regions of the same tissue sample. Furthermore, differences in tissue slicing, staining, and scanning processes introduce variations in color, texture, and fine structural details, exacerbating domain shifts between the EBHI-Seg and Chaoyang datasets. Additionally, the complex pathological features make classification difficult—certain disease stages, such as high-grade adenoma and early-stage adenocarcinoma, share similar morphological characteristics, making it challenging for models to differentiate between them. Figure 3 provides sample images from both datasets, illustrating these variations.

4.2.2. For the Classification Task of Diabetic Retinopathy (DR)

We utilized the Diabetic Retinopathy Arranged dataset (https://tianchi.aliyun.com/dataset/93926, accessed on 10 March 2021) from the Alibaba Tianchi platform as the source domain. This dataset comprises 35,127 fundus images categorized into five classes: No DR, Mild DR, Moderate DR, Severe DR, and Proliferative DR. We labeled “No DR” as Normal for our experiments and merged the other categories into a single DR class, forming a binary classification task. Additionally, the Binary Diabetic Retinopathy (BiDR) dataset (https://www.kaggle.com/datasets/pkdarabi/diagnosis-of-diabetic-retinopathy?resource=download-directory, accessed on 1 July 2024) was used as the target domain data. The BiDR dataset contains many high-resolution retinal images captured under different imaging conditions. Each image was evaluated by medical professionals and classified into two categories: No Diabetic Retinopathy and Diabetic Retinopathy. For data preprocessing, we resized all images to a resolution of 224 × 224. The statistical details of the dataset are shown in Figure 4.
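The merging of the five source-domain grades into a binary task can be sketched as below. The integer codes 0 (Normal) and 1 (DR) and the function name are assumptions for illustration, since the text only states which grades are merged.

```python
# Five source-domain grades as named in the text
DR_GRADES = ["No DR", "Mild DR", "Moderate DR", "Severe DR", "Proliferative DR"]


def to_binary_label(grade):
    """Map a five-class DR grade name to 0 (Normal) or 1 (DR)."""
    if grade not in DR_GRADES:
        raise ValueError(f"unknown grade: {grade!r}")
    # Only "No DR" is treated as Normal; every other grade is merged into DR
    return 0 if grade == "No DR" else 1


binary_labels = [to_binary_label(g) for g in DR_GRADES]
print(binary_labels)
```

Under this mapping the source labels share the two-class label space of the BiDR target data.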

Despite their similarities, significant domain shifts exist between these datasets, posing challenges for domain adaptation. Variability in imaging conditions is a key issue, as images in the Alibaba Tianchi dataset and BiDR dataset are captured using different devices under diverse lighting and focus conditions. Additionally, the subtle differences in disease progression further complicate classification—while Mild DR shares only slight pathological differences with normal cases, Severe and Proliferative DR exhibit more pronounced abnormalities. These factors increase the complexity of adaptation and emphasize the necessity of a robust domain adaptation approach to improve model generalization across different datasets.

4.3. Experimental Setup

For the colorectal cancer classification task, the training set consists of 3904 images, including 1242 adenocarcinoma samples, 969 normal samples, and 1693 others. The test set contains 2144 images, comprising 774 adenocarcinoma samples, 63 normal samples, and 1307 others. For the Diabetic Retinopathy (DR) classification task, the training set includes 35,126 images, with 9316 labeled as diabetic and 25,810 as normal. The test set comprises 2838 images, including 1408 diabetic and 1430 normal samples. All experiments were conducted using the respective training sets for model training and the corresponding test sets for performance evaluation.

We extensively compared our approach with several unsupervised domain adaptation (UDA) techniques, covering conventional and advanced methods. Traditional approaches include DDC [59], which leverages Maximum Mean Discrepancy (MMD) to quantify and minimize the distributional gap between source and target domains; DANN [24], which employs a domain discriminator for adversarial training to align distributions; ADDA [19], which extends adversarial training with two independent feature extractors for better domain adaptation; CDAN [60], which integrates label information into adversarial training to achieve conditional domain alignment; MCD [61], which aligns domains by maximizing classifier disagreement to refine feature alignment; and JAN [62], which reduces marginal and conditional distribution gaps through feature space dimensionality reduction.

In addition to these, we evaluated recent advanced methods. AFN [63] enhances feature transferability by adaptively increasing the feature norm. MCC [64] minimizes the confusion between classifiers to align the domains implicitly. DSAN [29] employs LMMD to align the distributions of associated subdomains based on subdomain adaptation principles. DNA [35] aligns the joint distribution of the source and target domains using neural network mappings. SPA [65] introduces a spectral alignment approach to balance inter-domain transferability and intra-domain discriminability. PIDAN [66] proposes a prototype-based inter-intra alignment method to bridge the feature distribution gap, together with an uncertainty estimation strategy that generates highly reliable pseudo-labels in the target domain, further enhancing adaptation performance.

In this study, we train the network with the inverse learning rate decay strategy described in [24,29]. By progressively reducing the learning rate, the model can make finer adjustments during the later stages of training, thereby stabilizing convergence toward the optimal solution. Dynamic learning rate adjustment improves model performance and convergence, while divergence estimation uses σ set to the median squared distance and λ = 10^{-2}.
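As a rough illustration of the two settings above, the sketch below assumes the inverse-decay form commonly associated with DANN [24], η_p = η_0 / (1 + a·p)^b, where p ∈ [0, 1] is the training progress. The constants η_0 = 0.01, a = 10, and b = 0.75 are frequently quoted defaults and are assumptions here, not values reported in this paper. The second helper shows one possible reading of "median squared distance": the median of pairwise squared Euclidean distances between feature vectors. Both function names are hypothetical.

```python
from itertools import combinations
from statistics import median


def inverse_decay_lr(progress, base_lr=0.01, a=10.0, b=0.75):
    """Learning rate at training progress in [0, 1] (assumed DANN-style form)."""
    if not 0.0 <= progress <= 1.0:
        raise ValueError("progress must lie in [0, 1]")
    return base_lr / (1.0 + a * progress) ** b


def median_squared_distance(features):
    """Median of pairwise squared Euclidean distances (median heuristic)."""
    sq_dists = [
        sum((u - v) ** 2 for u, v in zip(x, y))
        for x, y in combinations(features, 2)
    ]
    return median(sq_dists)


if __name__ == "__main__":
    total_steps = 10000  # hypothetical training length
    for step in (0, 2500, 5000, 10000):
        print(step, inverse_decay_lr(step / total_steps))
    toy_features = [(0.0, 0.0), (1.0, 0.0), (0.0, 2.0)]  # toy data, not from the paper
    print(median_squared_distance(toy_features))
```

The schedule decreases monotonically with progress, which matches the description of finer adjustments in the later stages of training.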

Evaluation metrics: To tackle the class imbalance in the training data, we use weighted precision, weighted recall, and weighted F1 score as evaluation metrics in our experiments, ensuring that the performance evaluation is not skewed towards the dominant classes. Additionally, we report accuracy (%) to give a more complete view of the model’s performance across all classes. The weighted metrics average the per-class precision, recall, and F1 score, with the weights corresponding to the number of true instances (support) of each class. Specifically,

\mathrm{Accuracy} = \frac{TP + TN}{TP + FP + TN + FN}

(10)

\mathrm{Precision} = \frac{TP}{TP + FP}

(11)

S_{i} = TP_{i} + FN_{i}

(12)

\mathrm{Weighted\ Precision} = \frac{1}{N} \sum_{i = 1}^{k} P_{i} \times S_{i}

(13)

\mathrm{Recall} = \frac{TP}{TP + FN}

(14)

\mathrm{Weighted\ Recall} = \frac{1}{N} \sum_{i = 1}^{k} R_{i} \times S_{i}

(15)

\mathrm{F1\_score} = \frac{2 \times \mathrm{Precision} \times \mathrm{Recall}}{\mathrm{Precision} + \mathrm{Recall}}

(16)

\mathrm{Weighted\ F1\_score} = \frac{1}{N} \sum_{i = 1}^{k} F1_{i} \times S_{i}

(17)

True positives (TP), false positives (FP), true negatives (TN), and false negatives (FN) represent the respective counts of correctly and incorrectly classified instances. The support of class i (S_{i}) is the total number of true instances for that class, encompassing both TP and FN. N refers to the overall instance count across all classes. Precision (P_{i}) and recall (R_{i}) measure the performance for class i, while F1_{i} represents its F1 score. Lastly, k indicates the number of classes.
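The support-weighted metrics in Equations (12) to (17) can be computed directly from label lists. The following is a minimal sketch; the function name `weighted_metrics` and the toy labels are illustrative assumptions and do not come from the paper's code or data.

```python
from collections import Counter


def weighted_metrics(y_true, y_pred):
    """Accuracy plus support-weighted precision, recall and F1."""
    classes = sorted(set(y_true))
    support = Counter(y_true)          # S_i = TP_i + FN_i
    n = len(y_true)                    # N, total number of instances
    w_prec = w_rec = w_f1 = 0.0
    for c in classes:
        tp = sum(t == c and p == c for t, p in zip(y_true, y_pred))
        fp = sum(t != c and p == c for t, p in zip(y_true, y_pred))
        fn = support[c] - tp
        prec = tp / (tp + fp) if tp + fp else 0.0
        rec = tp / (tp + fn) if tp + fn else 0.0
        f1 = 2 * prec * rec / (prec + rec) if prec + rec else 0.0
        # Each per-class score is weighted by S_i / N.
        w_prec += prec * support[c] / n
        w_rec += rec * support[c] / n
        w_f1 += f1 * support[c] / n
    accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / n
    return {"accuracy": accuracy, "precision": w_prec,
            "recall": w_rec, "f1": w_f1}


if __name__ == "__main__":
    y_true = [0, 0, 0, 1, 1, 2]   # toy labels
    y_pred = [0, 0, 1, 1, 1, 2]
    print(weighted_metrics(y_true, y_pred))
```

Under Equations (12) and (15), each term R_{i} × S_{i} equals TP_{i}, so weighted recall coincides with the fraction of correctly classified samples.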

4.4. Experimental Results and Analysis

EBHI-Seg→Chaoyang dataset. This study explores unsupervised domain adaptation for colon disease diagnosis, with the experimental results summarized in Table 2. Our proposed method outperforms all compared approaches, achieving notable performance improvements. Among existing methods, the DNA approach reached a classification accuracy of 84.10% on the target domain, while the Minimum Class Confusion Network (MCC) obtained an accuracy of 81.77%. Additionally, the Joint Adaptation Network (JAN), which aligns joint and conditional distributions through dimensionality reduction, achieved an F1 score of 81.16%. These findings underscore the strengths of current domain adaptation techniques while revealing opportunities for further advancements.

Our proposed C-JDA incorporates a dual strategy to enhance domain adaptation. It achieves joint distribution alignment by aligning marginal and class-conditional distributions across source and target domains, using optimization based on relative chi-square divergence. Simultaneously, it integrates consistency regularization, enforcing prediction consistency across original and augmented target samples, strengthening inter-class relationship modeling, and improving feature representation invariance. As a result, our method notably surpasses the source-only baseline model across all evaluation metrics. Specifically, it improves accuracy by 5.64 percentage points over the MCC model and the F1 score by 2.99 percentage points over the top-performing DNA model.

These improvements suggest that our approach uses unlabeled target domain data better than existing methods, leading to a notable increase in the reliability and generalization of cross-domain diagnosis in medical imaging.
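To make the consistency idea described above concrete, the sketch below shows one common way to enforce that an augmented target sample shares the prediction of its original: the pseudo-label is taken from the original view and a cross-entropy penalty is applied to the augmented view. This is an assumed, FixMatch-style formulation for illustration only; the confidence `threshold` parameter and the function names are hypothetical, and the exact loss used by C-JDA is the one defined in the method section.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def consistency_loss(logits_orig, logits_aug, threshold=0.0):
    """Mean cross-entropy between pseudo-labels from the original views and
    predictions on the augmented views; low-confidence samples are skipped."""
    losses = []
    for z_o, z_a in zip(logits_orig, logits_aug):
        p_o = softmax(z_o)
        confidence = max(p_o)
        if confidence < threshold:
            continue                      # pseudo-label deemed unreliable
        pseudo_label = p_o.index(confidence)
        p_a = softmax(z_a)
        losses.append(-math.log(p_a[pseudo_label]))
    return sum(losses) / len(losses) if losses else 0.0


if __name__ == "__main__":
    original = [[4.0, 0.0, 0.0]]           # toy logits for one target sample
    print(consistency_loss(original, [[4.0, 0.0, 0.0]]))  # views agree
    print(consistency_loss(original, [[0.0, 4.0, 0.0]]))  # views disagree
```

The loss is small when both views agree and grows when the augmented view contradicts the pseudo-label, which is the behavior the regularizer is meant to penalize.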

Alibaba Tianchi dataset→BiDR. Table 3 provides a detailed comparison of our proposed method against several state-of-the-art domain adaptation techniques on the Diabetic Retinopathy (DR) classification task. The source-only model, trained solely on source domain data, delivers moderate performance with an accuracy of 85.23%, alongside F1 score, recall, and precision values of 85.21%, 85.23%, and 85.35%, respectively. In contrast, the application of domain adaptation methods significantly enhances performance. Notably, MCC achieves an accuracy of 94.53%, while DANN attains a comparable accuracy of 94.22%. These results highlight the substantial benefits of unsupervised domain adaptation, particularly in improving F1 score, recall, and precision in the target domain. The MCD method demonstrates only marginal performance improvements. We attribute this to insufficient constraints on the classifier, which results in the misclassification of noisy samples in the target domain, thereby limiting its overall effectiveness.

In contrast, our proposed method incorporates consistency regularization, which enhances robustness and reliability by better modeling inter-class relationships and aligning the target domain. Specifically, the proposed model achieves significant performance improvements across all metrics, ultimately reaching 96.93% accuracy and F1 score, substantially outperforming existing approaches. Compared to the second-best MCC method, our approach improves the accuracy and F1 score by about 2.4 percentage points. These results highlight the superior performance of our method in complex medical image domain adaptation tasks, providing a more robust solution for cross-domain diagnosis.

4.5. Experimental Analysis

4.5.1. Feature Visualization

Medical image classification across domains presents significant challenges, especially when substantial distributional disparities exist between the source and target domains. Traditional classification models often suffer from severe performance degradation under such circumstances.

In this experiment, t-SNE was utilized for dimensionality reduction and feature space visualization, providing a systematic assessment of the effectiveness of various domain adaptation methods in aligning features and distinguishing classes within the target domain. Without domain adaptation (Figure 5a), the three classes exhibit significant overlap in the feature space, making them almost indistinguishable. When applying DDC (Deep Domain Confusion) (Figure 5b), the 2D visualization highlights a noticeable separation between the source and target domains, especially in class distributions. Although some overlap exists between the domains, the class distributions remain poorly aligned. The MCD method (Figure 5c) introduces maximum classifier discrepancy, which achieves a more distinct distribution of target domain samples at the class level. However, a degree of overlap between source and target domain samples persists. Our method (Figure 5d) achieves superior alignment by leveraging joint distribution matching, accounting for both marginal and class-conditional distributions. The t-SNE plot shows that source and target domain samples overlap almost entirely, with target samples clustering closely around source samples and minimal variation observed across different class distributions. In particular, the model effectively addresses the regions that overlap between classes, validating its advantages over existing approaches.

Figure 6 illustrates the alignment of sample distributions between the source and target domains in the eye dataset using various methods. Each point represents a sample, with colors indicating the categories (Normal and DR). There is significant misalignment between the source and target domains in the source-only method (Figure 6a). Target domain samples (DR) fail to align with source domain samples in the low-dimensional space, resulting in substantial separation between the domains and unclear category boundaries. The DDC method (Figure 6b) shows some improvement as target domain samples (DR) begin to group in the two-dimensional space. However, a visible gap in distribution remains, particularly at the category boundaries. The MCD method (Figure 6c) achieves better alignment, with target domain samples overlapping more effectively with source domain samples. Still, minor separations persist, and the category boundaries are not entirely distinct. Our method (Figure 6d) addresses these challenges through joint distribution alignment, resulting in almost complete overlap between source and target domain samples. The target domain distribution closely matches that of the source domain, with minimal category variation, successfully removing distributional discrepancies while preserving clear category boundaries.

Figure 7 illustrates the confusion matrices of four models—CDAN (Figure 7a), JAN (Figure 7b), DSAN (Figure 7c), and our proposed model (Figure 7d)—providing an intuitive overview of the experimental results. In the “Adenocarcinoma” class, our model demonstrated outstanding performance, achieving only 61 false positives (FP), which is significantly lower compared to CDAN (101), JAN (72), and DSAN (71). Moreover, in the same class, our approach achieved the lowest false negatives (FN), with only 103 cases, markedly better than CDAN (190), JAN (183), and DSAN (178). These experimental results highlight the effectiveness of our proposed model in addressing class imbalance and optimizing critical metrics such as FP and FN, further substantiating its potential applicability in real-world medical diagnostic systems.

We visualized confusion matrices (shown in Figure 8) to better understand and analyze the experimental outcomes. Each row corresponds to the actual class in these confusion matrices, while each column represents the predicted class. Our model reduced both false positives (FP) and false negatives (FN), leading to notable improvements in these metrics. Specifically, for the “Normal” class, the reduction in false positives was substantial, highlighting the model’s enhanced ability to identify normal samples accurately. For the “DR” class, the number of false negatives decreased to 18, the lowest among all compared models, highlighting the model’s sensitivity in detecting diseased samples. This dual improvement holds critical clinical significance in real-world applications: on the one hand, reducing false positives lowers the risk of misdiagnosing healthy patients, alleviating unnecessary medical burden; on the other hand, minimizing false negatives improves the early diagnosis rate of diseased patients, ensuring timely treatment and better prognosis.
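The FP and FN counts discussed above follow mechanically from a confusion matrix with rows as actual classes and columns as predicted classes. The sketch below illustrates that bookkeeping; the helper name `per_class_errors` is hypothetical and the matrix entries are made-up toy numbers, not values read from Figure 7 or Figure 8.

```python
def per_class_errors(cm, labels):
    """Per-class TP, FP and FN from a confusion matrix.

    cm[i][j] is the number of samples whose actual class is labels[i]
    and whose predicted class is labels[j].
    """
    result = {}
    for k, name in enumerate(labels):
        tp = cm[k][k]
        fn = sum(cm[k]) - tp                   # rest of row k: missed cases
        fp = sum(row[k] for row in cm) - tp    # rest of column k: false alarms
        result[name] = {"TP": tp, "FP": fp, "FN": fn}
    return result


if __name__ == "__main__":
    toy_cm = [[50, 5],    # actual Normal: 50 predicted Normal, 5 predicted DR
              [3, 42]]    # actual DR: 3 predicted Normal, 42 predicted DR
    for name, counts in per_class_errors(toy_cm, ["Normal", "DR"]).items():
        print(name, counts)
```

In the binary case, the FN count of one class equals the FP count of the other, so reducing one type of error for "DR" directly reduces the complementary error for "Normal".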

4.5.2. Ablation Study

Table 4 and Table 5 present the ablation study results of the proposed improved joint distribution alignment with consistency framework for the colon disease and diabetic retinopathy classification tasks. These experiments evaluate the independent contributions and combined effects of the key loss functions: classification loss (ℓ_{Cls}), joint distribution alignment loss (ℓ_{Jda}), and consistency regularization loss (ℓ_{Reg}).

For the colon disease classification task (Table 4), using only the classification loss (ℓ_{Cls}) resulted in suboptimal performance, with an accuracy of 75.71% and an F1 score of 77.19%. Adding the joint distribution alignment loss (ℓ_{Jda}) significantly improved the model’s performance. Incorporating the consistency regularization loss (ℓ_{Reg}) further enhanced the model, achieving the best results with an accuracy of 87.41%, an F1 score of 87.26%, and precision and recall of 87.41% and 87.52%, respectively. These findings demonstrate that the feature consistency alignment strategy plays a crucial role in improving category discrimination and feature compactness.

For the diabetic retinopathy classification task (Table 5), using only the classification loss (ℓ_{Cls}) yielded limited performance, with an accuracy and F1 score of 88.61%. Introducing the joint distribution alignment loss (ℓ_{Jda}) improved the accuracy and F1 score to 91.64%. Further incorporating the consistency regularization loss (ℓ_{Reg}) led to significant improvements, with all metrics exceeding 96.90%. Specifically, the model achieved an accuracy and F1 score of 96.93%.

These results indicate that the consistency regularization strategy effectively enhances the compactness of feature distributions and the discriminative ability across categories, substantially boosting classification performance.
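The three ablation settings can be read as switching terms of a combined objective on and off. The sketch below assumes a simple additive form, ℓ_{Cls} + λ_{Jda} ℓ_{Jda} + λ_{Reg} ℓ_{Reg}; this form, the function name `total_loss`, and the numeric loss values are illustrative assumptions, and the objective actually optimized is the one given in the method section.

```python
def total_loss(l_cls, l_jda=0.0, l_reg=0.0,
               lam_jda=1.0, lam_reg=1.0,
               use_jda=True, use_reg=True):
    """Assumed additive objective with switches for the ablation settings."""
    loss = l_cls
    if use_jda:
        loss += lam_jda * l_jda   # joint distribution alignment term
    if use_reg:
        loss += lam_reg * l_reg   # consistency regularization term
    return loss


if __name__ == "__main__":
    # Toy per-batch loss values, not measurements from the paper.
    l_cls, l_jda, l_reg = 0.70, 0.20, 0.10
    settings = {
        "Cls only": dict(use_jda=False, use_reg=False),
        "Cls + Jda": dict(use_jda=True, use_reg=False),
        "Cls + Jda + Reg": dict(use_jda=True, use_reg=True),
    }
    for name, flags in settings.items():
        print(name, total_loss(l_cls, l_jda, l_reg, **flags))
```

Each ablation row then corresponds to training with one of these flag combinations while the rest of the pipeline stays fixed.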

4.5.3. Parameter Analysis

We examined the sensitivity of our approach to different values of the parameter α in the joint distribution alignment and plotted the classification accuracy for colon cancer as α varies (α ∈ {0.0, 0.1, 0.3, 0.5, 0.7, 0.9}) in Figure 9. As illustrated in the figure, the classification accuracy increases gradually with α, reaching its highest value at α = 0.5, after which it begins to decrease.
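For intuition about what α controls, the sketch below assumes that α is the mixing weight familiar from relative density-ratio estimation, so that the relative chi-square divergence compares P against the mixture αP + (1 − α)Q rather than against Q itself. The discrete form used here, χ²(P ‖ αP + (1 − α)Q) = Σ (p − m)² / m with m = αp + (1 − α)q, is one common convention (some references add a factor of 1/2); the paper's own definition in the method section takes precedence, and the function name is hypothetical.

```python
def relative_chi_square(p, q, alpha):
    """chi^2(P || alpha*P + (1 - alpha)*Q) for discrete distributions p and q."""
    if not 0.0 <= alpha < 1.0:
        raise ValueError("alpha must lie in [0, 1)")
    total = 0.0
    for pi, qi in zip(p, q):
        mix = alpha * pi + (1.0 - alpha) * qi
        if mix == 0.0:
            if pi > 0.0:
                # Only reachable when alpha == 0 and qi == 0: plain chi-square blows up.
                return float("inf")
            continue  # pi == qi == 0 contributes nothing
        total += (pi - mix) ** 2 / mix
    return total


if __name__ == "__main__":
    p = [1.0, 0.0]   # toy distributions with disjoint support
    q = [0.0, 1.0]
    for alpha in (0.0, 0.1, 0.5, 0.9):
        print(alpha, relative_chi_square(p, q, alpha))
```

With α = 0 the measure reduces to the ordinary chi-square divergence, which is unbounded when the supports differ; any α > 0 keeps it finite because the mixture is at least αp wherever p is positive. This trade-off between sensitivity and stability is one plausible explanation for an intermediate α performing best.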

For the classification tasks of colorectal diseases and diabetic retinopathy, the values of the hyperparameters \{λ_{Jda}, λ_{Reg}\} were determined by tuning within the range of \{0.01, 0.05, 0.1, 0.5, 1\}. To assess the sensitivity of these parameters, we further examined the performance of DR detection by adjusting one parameter at a time. The additional results in Table 6 show that the proposed method is highly robust to variations in hyperparameters within the specified range. Performance remains consistent across values of \{0.01, 0.05, 0.1, 0.5, 1\}, with optimal diabetic retinopathy diagnosis performance achieved at λ_{Jda} = 1 and λ_{Reg} = 1. This analysis demonstrates that the proposed method remains stable and efficient under variations in hyperparameters within a reasonable range, without requiring precise tuning.
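The one-parameter-at-a-time protocol described above can be sketched as a small sweep. In the code below, `toy_evaluate` is a placeholder standing in for "train the model with the given weights and return target-domain accuracy"; its formula and all names are hypothetical, and it does not reproduce any number from Table 6.

```python
GRID = (0.01, 0.05, 0.1, 0.5, 1)


def one_at_a_time(evaluate, defaults, grid=GRID):
    """Vary each hyperparameter over grid while the others keep their defaults."""
    results = {}
    for name in defaults:
        for value in grid:
            config = dict(defaults, **{name: value})
            results[(name, value)] = evaluate(**config)
    return results


def toy_evaluate(lam_jda, lam_reg):
    # Placeholder score, NOT a result from the paper.
    return 1.0 - 0.01 * abs(1 - lam_jda) - 0.02 * abs(1 - lam_reg)


if __name__ == "__main__":
    scores = one_at_a_time(toy_evaluate, {"lam_jda": 1, "lam_reg": 1})
    for (name, value), score in scores.items():
        print(f"{name}={value}: {score:.4f}")
```

Compared with a full grid over both weights, this design needs 10 runs instead of 25, at the cost of not probing interactions between λ_{Jda} and λ_{Reg}.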

5. Conclusions

We present a new robust joint distribution alignment framework in unsupervised domain adaptation within medical imaging. Central to our approach, the improved joint distribution alignment with consistency framework uses convolutional neural networks (CNNs) to adaptively align source and target domain distributions in the feature space. By employing relative chi-square divergence as a measure of distribution similarity, our method offers a theoretically sound solution that moves beyond conventional marginal and class-conditional alignment, extending to joint distribution alignment. This approach facilitates effective alignment between the source and target domains, addressing distribution mismatches and improving model adaptability.

Through comprehensive experiments, we showed that the proposed C-JDA framework surpasses existing deep domain adaptation methods, including DANN, ADDA, DSAN, and CDAN, across various medical imaging tasks, particularly in classification accuracy and domain generalization. The C-JDA framework employs a specialized divergence measure to achieve precise alignment of the joint distributions P^{s}(x, y) and P^{t}(x, y), effectively reducing domain shift while retaining essential structural information across domains. Our findings highlight the method’s ability to maintain consistent feature representations across source and target domains, even when handling large-scale and highly diverse medical imaging data.

The proposed C-JDA mechanism overcomes the limitations of previous methods, which generally focus on aligning marginal or conditional distributions. This approach enhances adaptation accuracy and robustness by aligning joint distributions, particularly in complex, high-dimensional medical imaging scenarios. Including joint distribution alignment allows for more comprehensive and effective alignment, ensuring that both the overall feature distribution and class-specific distribution are synchronized across domains. Moreover, the analytical approximation of the divergence offers an efficient solution to the domain adaptation challenge, removing the need for the complex optimization processes typically associated with adversarial methods and significantly improving the model’s practical usability.

In addition to its theoretical contributions, our approach demonstrates practical applicability in real-world settings, where it significantly improves the performance of medical image classification models on target domains with limited labeled data. This capability is crucial in medical imaging, where labeled data for the target domain are often scarce, but reliable adaptation is critical for clinical decision-making.

Future work will explore extending the C-JDA framework to other medical imaging tasks, such as segmentation and detection, where the challenges of domain shift and data scarcity are equally prevalent. We also plan to explore incorporating alternative divergence measures to enhance the adaptability and generalization of domain adaptation models.

Author Contributions

J.Z.: Writing – review and editing, writing – original draft, visualization, validation, software, project administration, methodology, investigation, formal analysis, data curation, conceptualization. R.L.: Validation, project administration, methodology, writing – review and editing, supervision, resources, funding acquisition, data curation. C.L.: Conceptualization, methodology, writing – review and editing, supervision, resources, funding acquisition. X.J.: Conceptualization, methodology, writing – review and editing, resources. All authors have read and agreed to the published version of the manuscript.

Funding

This work was supported in part by the Natural Science Foundation of Guangdong Province (Project No. 2023A1515110696), and in part by Shantou University under Project NTF20007 and Project NTF22012.

Data Availability Statement

The data presented in this study are available from the corresponding author upon request due to privacy restrictions.

Acknowledgments

The authors sincerely appreciate the valuable feedback and suggestions provided by the anonymous reviewers.

Conflicts of Interest

The authors declare no conflicts of interest.

References

Mendelson, D.S.; Rubin, D.L. Imaging informatics: Essential tools for the delivery of imaging services. Acad. Radiol. 2013, 20, 1195–1212.
Litjens, G.; Kooi, T.; Bejnordi, B.E.; Setio, A.A.A.; Ciompi, F.; Ghafoorian, M.; Van Der Laak, J.A.; Van Ginneken, B.; Sánchez, C.I. A survey on deep learning in medical image analysis. Med. Image Anal. 2017, 42, 60–88.
Li, X.; Li, C.; Rahaman, M.M.; Sun, H.; Li, X.; Wu, J.; Yao, Y.; Grzegorzek, M. A comprehensive review of computer-aided whole-slide image analysis: From datasets to feature extraction, segmentation, classification and detection approaches. Artif. Intell. Rev. 2022, 55, 4809–4878.
Cruz Castañeda, W.A.; Bertemes Filho, P. Improvement of an Edge-IoT Architecture Driven by Artificial Intelligence for Smart-Health Chronic Disease Management. Sensors 2024, 24, 7965.
Zhou, K.; Gu, Z.; Liu, W.; Luo, W.; Cheng, J.; Gao, S.; Liu, J. Multi-cell multi-task convolutional neural networks for diabetic retinopathy grading. In Proceedings of the 2018 40th Annual International Conference of the IEEE Engineering in Medicine and Biology Society (EMBC), Honolulu, HI, USA, 18–21 July 2018; IEEE: Piscataway, NJ, USA, 2018; pp. 2724–2727.
Apou, G.; Schaadt, N.S.; Naegel, B.; Forestier, G.; Schönmeyer, R.; Feuerhake, F.; Wemmert, C.; Grote, A. Detection of lobular structures in normal breast tissue. Comput. Biol. Med. 2016, 74, 91–102.
Dai, D.; Dong, C.; Xu, S.; Yan, Q.; Li, Z.; Zhang, C.; Luo, N. Ms RED: A novel multi-scale residual encoding and decoding network for skin lesion segmentation. Med. Image Anal. 2022, 75, 102293.
Zhang, Y. A survey of unsupervised domain adaptation for visual recognition. arXiv 2021, arXiv:2112.06745.
Zhou, K.; Liu, Z.; Qiao, Y.; Xiang, T.; Loy, C.C. Domain generalization: A survey. IEEE Trans. Pattern Anal. Mach. Intell. 2022, 45, 4396–4415.
Ben-David, S.; Blitzer, J.; Crammer, K.; Pereira, F. Analysis of representations for domain adaptation. Adv. Neural Inf. Process. Syst. 2006, 19, 1–8. Available online: https://proceedings.neurips.cc/paper_files/paper/2006/file/b1b0432ceafb0ce714426e9114852ac7-Paper.pdf (accessed on 23 March 2025).
Dou, Q.; Ouyang, C.; Chen, C.; Chen, H.; Heng, P.A. Unsupervised cross-modality domain adaptation of convnets for biomedical image segmentations with adversarial loss. arXiv 2018, arXiv:1804.10916.
Kouw, W.M.; Loog, M. A review of domain adaptation without target labels. IEEE Trans. Pattern Anal. Mach. Intell. 2019, 43, 766–785.
Luo, Z.; Zou, Y.; Hoffman, J.; Fei-Fei, L.F. Label efficient learning of transferable representations acrosss domains and tasks. Adv. Neural Inf. Process. Syst. 2017, 30, 1–13. Available online: https://proceedings.neurips.cc/paper_files/paper/2017/file/a8baa56554f96369ab93e4f3bb068c22-Paper.pdf (accessed on 23 March 2025).
Bousmalis, K.; Silberman, N.; Dohan, D.; Erhan, D.; Krishnan, D. Unsupervised pixel-level domain adaptation with generative adversarial networks. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, HI, USA, 21–26 July 2017; pp. 3722–3731.
Long, M.; Cao, Y.; Cao, Z.; Wang, J.; Jordan, M.I. Transferable representation learning with deep adaptation networks. IEEE Trans. Pattern Anal. Mach. Intell. 2018, 41, 3071–3085.
Cicek, S.; Soatto, S. Unsupervised domain adaptation via regularized conditional alignment. In Proceedings of the IEEE/CVF International Conference on Computer Vision, Seoul, Republic of Korea, 27 October–2 November 2019; pp. 1416–1425.
Guan, H.; Liu, M. Domain Adaptation for Medical Image Analysis: A Survey. IEEE Trans. Biomed. Eng. 2022, 69, 1173–1185.
Sun, B.; Saenko, K. Deep coral: Correlation alignment for deep domain adaptation. In Proceedings of the Computer Vision–ECCV 2016 Workshops, Amsterdam, The Netherlands, 8–10 and 15–16 October 2016; Proceedings, Part III 14. Springer: Berlin/Heidelberg, Germany, 2016; pp. 443–450.
Tzeng, E.; Hoffman, J.; Saenko, K.; Darrell, T. Adversarial discriminative domain adaptation. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, HI, USA, 21–26 July 2017; pp. 7167–7176.
Pei, Z.; Cao, Z.; Long, M.; Wang, J. Multi-adversarial domain adaptation. In Proceedings of the AAAI Conference on Artificial Intelligence, New Orleans, LA, USA, 2–7 February 2018; Volume 32.
Damodaran, B.B.; Kellenberger, B.; Flamary, R.; Tuia, D.; Courty, N. Deepjdot: Deep joint distribution optimal transport for unsupervised domain adaptation. In Proceedings of the European Conference on Computer Vision (ECCV), Munich, Germany, 8–14 September 2018; pp. 447–463.
Si, S.; Tao, D.; Geng, B. Bregman divergence-based regularization for transfer subspace learning. IEEE Trans. Knowl. Data Eng. 2009, 22, 929–942.
Baktashmotlagh, M.; Har, M.; Salzmann, M. Distribution-matching embedding for visual domain adaptation. J. Mach. Learn. Res. 2016, 17, 1–30.
Ganin, Y.; Ustinova, E.; Ajakan, H.; Germain, P.; Larochelle, H.; Laviolette, F.; March, M.; Lempitsky, V. Domain-adversarial training of neural networks. J. Mach. Learn. Res. 2016, 17, 1–35.
Gretton, A.; Borgwardt, K.M.; Rasch, M.J.; Schölkopf, B.; Smola, A. A kernel two-sample test. J. Mach. Learn. Res. 2012, 13, 723–773.
Wu, J.; Fang, Y. Dual-view global and local category-attentive domain alignment for unsupervised conditional adversarial domain adaptation. Neural Netw. 2025, 185, 107129.
Gong, M.; Zhang, K.; Liu, T.; Tao, D.; Glymour, C.; Schölkopf, B. Domain adaptation with conditional transferable components. In Proceedings of the International Conference on Machine Learning, New York, NY, USA, 19–24 June 2016; PMLR: New York, NY, USA, 2016; pp. 2839–2848.
Deng, Z.; Luo, Y.; Zhu, J. Cluster alignment with a teacher for unsupervised domain adaptation. In Proceedings of the IEEE/CVF International Conference on Computer Vision, Seoul, Republic of Korea, 27 October–2 November 2019; pp. 9944–9953.
Zhu, Y.; Zhuang, F.; Wang, J.; Ke, G.; Chen, J.; Bian, J.; Xiong, H.; He, Q. Deep subdomain adaptation network for image classification. IEEE Trans. Neural Netw. Learn. Syst. 2020, 32, 1713–1722.
Hu, L.; Kan, M.; Shan, S.; Chen, X. Unsupervised domain adaptation with hierarchical gradient synchronization. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Seattle, WA, USA, 14–19 June 2020; pp. 4043–4052.
Li, Z.; Weng, S.; Xia, Y.; Yu, H.; Yan, Y.; Yin, P. Cross-domain damage identification based on conditional adversarial domain adaptation. Eng. Struct. 2024, 321, 118928.
Zhang, J.; Li, W.; Ogunbona, P. Joint geometrical and statistical alignment for visual domain adaptation. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, HI, USA, 21–26 July 2017; pp. 5150–5158.
Long, M.; Wang, J.; Ding, G.; Sun, J.; Yu, P.S. Transfer feature learning with joint distribution adaptation. In Proceedings of the IEEE International Conference on Computer Vision, Sydney, Australia, 1–8 December 2013; pp. 2200–2207.
Chen, S.; Harandi, M.; Jin, X.; Yang, X. Semi-supervised domain adaptation via asymmetric joint distribution matching. IEEE Trans. Neural Netw. Learn. Syst. 2020, 32, 5708–5722.
Chen, S.; Harandi, M.; Jin, X.; Yang, X. Domain adaptation by joint distribution invariant projections. IEEE Trans. Image Process. 2020, 29, 8264–8277.
Courty, N.; Flamary, R.; Habrard, A.; Rakotomamonjy, A. Joint distribution optimal transportation for domain adaptation. Adv. Neural Inf. Process. Syst. 2017, 30.
Chen, S.; Hong, Z.; Harandi, M.; Yang, X. Domain neural adaptation. IEEE Trans. Neural Netw. Learn. Syst. 2022, 34, 8630–8641.
Liu, S.; Yamada, M.; Collier, N.; Sugiyama, M. Change-point detection in time-series data by relative density-ratio estimation. Neural Netw. 2013, 43, 72–83.
Mao, X.; Li, Q.; Xie, H.; Lau, R.Y.; Wang, Z.; Paul Smolley, S. Least squares generative adversarial networks. In Proceedings of the IEEE International Conference on Computer Vision, Venice, Italy, 22–29 October 2017; pp. 2794–2802.
Esteva, A.; Kuprel, B.; Novoa, R.A.; Ko, J.; Swetter, S.M.; Blau, H.M.; Thrun, S. Dermatologist-level classification of skin cancer with deep neural networks. Nature 2017, 542, 115–118.
Ahn, E.; Kumar, A.; Fulham, M.; Feng, D.; Kim, J. Unsupervised domain adaptation to classify medical images using zero-bias convolutional auto-encoders and context-based feature augmentation. IEEE Trans. Med. Imaging 2020, 39, 2385–2394.
Gulshan, V.; Peng, L.; Coram, M.; Stumpe, M.C.; Wu, D.; Narayanaswamy, A.; Venugopalan, S.; Widner, K.; Madams, T.; Cuadros, J.; et al. Development and validation of a deep learning algorithm for detection of diabetic retinopathy in retinal fundus photographs. JAMA 2016, 316, 2402–2410.
Kather, J.N.; Krisam, J.; Charoentong, P.; Luedde, T.; Herpel, E.; Weis, C.A.; Gaiser, T.; Marx, A.; Valous, N.A.; Ferber, D.; et al. Predicting survival from colorectal cancer histology slides using deep learning: A retrospective multicenter study. PLoS Med. 2019, 16, e1002730.
Zheng, S.; Yang, X.; Wang, Y.; Ding, M.; Hou, W. Unsupervised Cross-Modality Domain Adaptation Network for X-Ray to CT Registration. IEEE J. Biomed. Health Inform. 2021, 26, 2637–2647.
Hayat, M.; Ahmad, N.; Nasir, A.; Tariq, Z.A. Hybrid Deep Learning EfficientNetV2 and Vision Transformer (EffNetV2-ViT) Model for Breast Cancer Histopathological Image Classification. IEEE Access 2024, 12, 184119–184131.
Chen, S.; Wu, H.; Liu, C. Domain invariant and agnostic adaptation. Knowl.-Based Syst. 2021, 227, 107192.
Qi, Q.; Lin, X.; Chen, C.; Xie, W.; Huang, Y.; Ding, X.; Liu, X.; Yu, Y. Curriculum feature alignment domain adaptation for epithelium-stroma classification in histopathological images. IEEE J. Biomed. Health Inform. 2020, 25, 1163–1172.
Gao, Y.; Zhang, Y.; Cao, Z.; Guo, X.; Zhang, J. Decoding brain states from fMRI signals by using unsupervised domain adaptation. IEEE J. Biomed. Health Inform. 2019, 24, 1677–1685.
Jin, Q.; Cui, H.; Sun, C.; Meng, Z.; Wei, L.; Su, R. Domain adaptation based self-correction model for COVID-19 infection segmentation in CT images. Expert Syst. Appl. 2021, 176, 114848.
Imtiaz, M.N.; Khan, N. Cross-database and cross-channel electrocardiogram arrhythmia heartbeat classification based on unsupervised domain adaptation. Expert Syst. Appl. 2024, 244, 122960.
Zhao, X.; Wang, X. Unsupervised domain adaptation based fracture segmentation method for core CT images. Expert Syst. Appl. 2025, 264, 125857.
Berthelot, D.; Carlini, N.; Cubuk, E.D.; Kurakin, A.; Sohn, K.; Zhang, H.; Raffel, C. Remixmatch: Semi-supervised learning with distribution alignment and augmentation anchoring. arXiv 2019, arXiv:1911.09785.
Lee, D.H. Pseudo-label: The simple and efficient semi-supervised learning method for deep neural networks. In Proceedings of the Workshop on Challenges in Representation Learning, ICML, Atlanta, GA, USA, 17–19 June 2013; Volume 3, p. 896.
Cubuk, E.D.; Zoph, B.; Mane, D.; Vasudevan, V.; Le, Q.V. Autoaugment: Learning augmentation strategies from data. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Long Beach, CA, USA, 15–20 June 2019; pp. 113–123.
Sohn, K.; Berthelot, D.; Carlini, N.; Zhang, Z.; Zhang, H.; Raffel, C.A.; Cubuk, E.D.; Kurakin, A.; Li, C.L. Fixmatch: Simplifying semi-supervised learning with consistency and confidence. Adv. Neural Inf. Process. Syst. 2020, 33, 596–608.
Xu, Z.; Wang, Y.; Lu, D.; Yu, L.; Yan, J.; Luo, J.; Ma, K.; Zheng, Y.; Tong, R.K.y. All-around real label supervision: Cyclic prototype consistency learning for semi-supervised medical image segmentation. IEEE J. Biomed. Health Inform. 2022, 26, 3174–3184.
Gu, Y.; Ge, Z.; Bonnington, C.P.; Zhou, J. Progressive transfer learning and adversarial domain adaptation for cross-domain skin disease classification. IEEE J. Biomed. Health Inform. 2019, 24, 1379–1393.
Zhu, C.; Chen, W.; Peng, T.; Wang, Y.; Jin, M. Hard Sample Aware Noise Robust Learning for Histopathology Image Classification. IEEE Trans. Med. Imaging 2022, 41, 881–894. [Google Scholar] [CrossRef]
Tzeng, E.; Hoffman, J.; Zhang, N.; Saenko, K.; Darrell, T. Deep domain confusion: Maximizing for domain invariance. arXiv 2014, arXiv:1412.3474. [Google Scholar]
Long, M.; Cao, Z.; Wang, J.; Jordan, M.I. Conditional adversarial domain adaptation. Adv. Neural Inf. Process. Syst. 2018, 31. [Google Scholar]
Saito, K.; Watanabe, K.; Ushiku, Y.; Harada, T. Maximum classifier discrepancy for unsupervised domain adaptation. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA, 18–22 June 2018; pp. 3723–3732. [Google Scholar] [CrossRef]
Long, M.; Zhu, H.; Wang, J.; Jordan, M.I. Deep transfer learning with joint adaptation networks. In Proceedings of the International Conference on Machine Learning, Sydney, Australia, 6–11 August 2017; PMLR: New York, NY, USA, 2017; pp. 2208–2217. [Google Scholar]
Xu, R.; Li, G.; Yang, J.; Lin, L. Larger Norm More Transferable: An Adaptive Feature Norm Approach for Unsupervised Domain Adaptation. In Proceedings of the 2019 IEEE/CVF International Conference on Computer Vision (ICCV), Seoul, Republic of Korea, 27 October–2 November 2019; pp. 1426–1435. [Google Scholar] [CrossRef]
Jin, Y.; Wang, X.; Long, M.; Wang, J. Minimum class confusion for versatile domain adaptation. In Proceedings of the Computer Vision–ECCV 2020: 16th European Conference, Glasgow, UK, 23–28 August 2020; Proceedings, Part XXI 16. Springer: Berlin/Heidelberg, Germany, 2020; pp. 464–480. [Google Scholar]
Xiao, Z.; Wang, H.; Jin, Y.; Feng, L.; Chen, G.; Huang, F.; Zhao, J. SPA: A graph spectral alignment perspective for domain adaptation. Adv. Neural Inf. Process. Syst. 2023, 36, 37252–37272. Available online: https://proceedings.neurips.cc/paper_files/paper/2023/file/754e80f98b2a141942f45a0eeb258a3c-Paper-Conference.pdf (accessed on 23 March 2025).
Xie, Z.; Duan, P.; Liu, W.; Kang, X.; Li, S. Prototype-based Inter-Intra Domain Alignment Network for Unsupervised Cross-Scene Hyperspectral Image Classification. In Proceedings of the IGARSS 2024—2024 IEEE International Geoscience and Remote Sensing Symposium, Athens, Greece, 7–12 July 2024; pp. 7795–7798. [Google Scholar] [CrossRef]

Figure 1. Domain shift in medical image classification: differences in data sources and acquisition conditions affect model performance across medical datasets. The figure also shows two distribution alignment strategies used in domain adaptation to address domain shift. Different colors indicate different domains.

Figure 2. Our unsupervised domain transfer framework, combining Joint Distribution Alignment (JDA) and consistency regularization. JDA is built on relative density ratio estimation, which can be solved with a closed-form equation. Consistency regularization enhances the learning of target domain samples by generating robust pseudo-labels, which in turn strengthens the JDA measurement.
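
For readers unfamiliar with relative density ratio estimation, the block below recalls the standard least-squares formulation, which is one way such an estimator admits a closed-form solution. It is a sketch under assumptions: $p$ and $q$ stand for two generic densities over samples $z$ (for example, a feature paired with a label or pseudo-label), $\varphi$ is a vector of basis functions, and $\rho$ is a ridge parameter. These symbols are illustrative, and the paper's own notation in Section 3.1 may differ.

```latex
% Relative density ratio and its linear-in-parameters model
r_\alpha(z) = \frac{p(z)}{\alpha\, p(z) + (1-\alpha)\, q(z)}, \qquad 0 \le \alpha < 1,
\qquad r_\alpha(z) \approx \theta^\top \varphi(z)

% Empirical second-moment matrix and mean vector
\hat{H} = \frac{\alpha}{n_p} \sum_{i=1}^{n_p} \varphi(z_i^p)\,\varphi(z_i^p)^\top
        + \frac{1-\alpha}{n_q} \sum_{j=1}^{n_q} \varphi(z_j^q)\,\varphi(z_j^q)^\top,
\qquad
\hat{h} = \frac{1}{n_p} \sum_{i=1}^{n_p} \varphi(z_i^p)

% Ridge-regularized least-squares fit and its closed-form minimizer
\hat{\theta} = \arg\min_{\theta} \Big[ \tfrac{1}{2}\, \theta^\top \hat{H}\, \theta
             - \hat{h}^\top \theta + \tfrac{\rho}{2}\, \theta^\top \theta \Big]
             = \big(\hat{H} + \rho I\big)^{-1} \hat{h}
```

With $\alpha = 0$ the expression reduces to the ordinary ratio $p/q$; for $\alpha > 0$ the ratio is bounded above by $1/\alpha$, which is the usual motivation for the relative form.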

Figure 3. Image samples from EBHI-Seg and Chaoyang datasets.

Figure 4. Image samples from Diabetic Retinopathy Arranged and Binary Diabetic Retinopathy (BiDR) datasets.

Figure 5. t-SNE visualization of the categories: (a) without adaptation, (b) after DDC adaptation, (c) after MCD adaptation, and (d) after adaptation with our model. Adenoma is represented in blue, normal in green, and other types in red.

Figure 6. t-SNE visualization of the categories: (a) without adaptation, (b) after DDC adaptation, (c) after MCD adaptation, and (d) after adaptation with our model. Normal is shown in orange, and DR is shown in light blue.

Figure 7. Confusion matrix visualization of the diagnostic performance for colon disease.

Figure 8. Confusion matrix visualization of the diagnostic performance for diabetic retinopathy.

Figure 9. Analysis of the impact of different $α$ values in joint distribution alignment on classification accuracy and F1 score.

Table 1. List of transformations applied for generating symmetric data.

Transformation	Description	Parameter	Range
Autocontrast	Maximizes the image contrast by converting the darkest (lightest) pixel to black (white).	-	-
Brightness	Adjusts the image brightness randomly, where $B = 0$ returns a black image, and $B = 1$ returns the original image.	$B$	[0.05, 0.95]
Color	Adjusts the color balance of the image at random, where $C = 0$ returns a black and white image, and $C = 1$ returns the original image.	$C$	[0.05, 0.95]
Contrast	Controls the image contrast at random, where $C = 0$ returns a gray image, and $C = 1$ returns the original image.	$C$	[0.05, 0.95]
Equalize	Equalizes the image histogram.	-	-
Identity	Returns the original image.	-	-
Posterize	Reduces every image pixel to $δ$ bits.	$δ$	[4, 8]
Rotate	Rotates the image by $λ$ degrees.	$λ$	[−30, 30]
Sharpness	Adjusts the image sharpness at random, where $S = 0$ returns a blurred image, and $S = 1$ returns the original image.	$S$	[0.05, 0.95]
Shear_x	Shears the image along the horizontal axis with rate $ϕ$.	$ϕ$	[−0.3, 0.3]
Shear_y	Shears the image along the vertical axis with rate $β$.	$β$	[−0.3, 0.3]
Solarize	Inverts all image pixels above a threshold value of $η$.	$η$	[0, 1]
Translate_x	Translates the image horizontally by $(ϵ \times \text{image width})$ pixels.	$ϵ$	[−0.3, 0.3]
Translate_y	Translates the image vertically by $(μ \times \text{image height})$ pixels.	$μ$	[−0.3, 0.3]
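
The transformations in Table 1 can be sketched as a small sampling routine. The following is a minimal, standard-library-only illustration and not the authors' implementation: the names `RANGES`, `sample_op`, `brightness`, `posterize`, and `solarize` are hypothetical, choosing an operation uniformly at random is an assumption, and only three pixel-level operations are shown, acting on a flat list of 8-bit grayscale values.

```python
import random

# Parameter ranges copied from Table 1; None marks parameter-free operations.
RANGES = {
    "Autocontrast": None,
    "Brightness": (0.05, 0.95),
    "Color": (0.05, 0.95),
    "Contrast": (0.05, 0.95),
    "Equalize": None,
    "Identity": None,
    "Posterize": (4, 8),
    "Rotate": (-30.0, 30.0),
    "Sharpness": (0.05, 0.95),
    "Shear_x": (-0.3, 0.3),
    "Shear_y": (-0.3, 0.3),
    "Solarize": (0.0, 1.0),
    "Translate_x": (-0.3, 0.3),
    "Translate_y": (-0.3, 0.3),
}


def sample_op(rng):
    """Pick one transformation uniformly (an assumption) and draw its parameter."""
    name = rng.choice(sorted(RANGES))
    bounds = RANGES[name]
    if bounds is None:
        return name, None
    low, high = bounds
    if name == "Posterize":  # the bit depth must be an integer
        return name, rng.randint(low, high)
    return name, rng.uniform(low, high)


def brightness(pixels, b):
    """B = 0 gives a black image, B = 1 returns the original values."""
    return [round(p * b) for p in pixels]


def posterize(pixels, bits):
    """Keep only the `bits` most significant bits of each 8-bit pixel."""
    mask = 0xFF & ~((1 << (8 - bits)) - 1)
    return [p & mask for p in pixels]


def solarize(pixels, eta):
    """Invert every pixel whose normalized intensity lies above the threshold eta."""
    return [255 - p if p / 255 > eta else p for p in pixels]
```

Calling `sample_op(random.Random(0))` returns a `(name, value)` pair whose value lies inside the corresponding range; geometric operations such as Rotate or Shear_x would normally be delegated to an image library.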

Table 2. Comparison of methods on the EBHI-Seg → Chaoyang transfer task for colon disease diagnosis. The best performances are given in bold, while the second-best are underlined.

Methods	Accuracy (%)	F1 Score (%)	Recall (%)	Precision (%)
ADDA	79.34	80.37	79.34	81.68
AFN	74.12	76.12	74.12	79.20
CDAN	80.55	80.99	80.55	81.71
DANN	79.25	79.16	79.25	79.73
DDC	80.46	81.07	80.46	81.96
DNA	84.10	84.27	84.10	84.46
DSAN	80.74	80.74	80.74	83.10
JAN	81.16	81.74	81.16	82.63
MCC	81.77	81.67	81.77	83.03
MCD	79.86	82.02	79.86	84.75
PIDAN	82.70	83.47	82.70	84.11
SPA	83.86	83.79	83.86	83.75
Source_only	70.44	71.30	70.44	74.97
Our model	87.41	87.26	87.41	87.52
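
In every row the Accuracy and Recall columns coincide. This is what one would expect if precision, recall, and F1 are averaged over classes with support weights, since support-weighted recall is algebraically equal to accuracy. The paper does not state the averaging scheme in this section, so the sketch below is an assumption-labeled illustration; `weighted_metrics` is a hypothetical helper name.

```python
from collections import Counter


def weighted_metrics(y_true, y_pred):
    """Accuracy plus support-weighted precision, recall, and F1 (assumed averaging)."""
    n = len(y_true)
    support = Counter(y_true)      # number of true samples per class
    predicted = Counter(y_pred)    # number of predictions per class
    true_pos = Counter(t for t, p in zip(y_true, y_pred) if t == p)

    accuracy = sum(true_pos.values()) / n
    precision = recall = f1 = 0.0
    for cls, count in support.items():
        p_c = true_pos[cls] / predicted[cls] if predicted[cls] else 0.0
        r_c = true_pos[cls] / count
        f_c = 2 * p_c * r_c / (p_c + r_c) if (p_c + r_c) else 0.0
        weight = count / n         # class weight = share of true samples
        precision += weight * p_c
        recall += weight * r_c
        f1 += weight * f_c
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}
```

Because each class contributes `(count / n) * (true_pos / count) = true_pos / n` to the weighted recall, the sum over classes is the fraction of correct predictions, which is the accuracy.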

Table 3. Comparison of methods on the Alibaba Tianchi dataset → BiDR transfer task for eye disease diagnosis. The best performances are given in bold, while the second-best are underlined.

Methods	Accuracy (%)	F1 Score (%)	Recall (%)	Precision (%)
ADDA	89.53	89.52	89.53	89.67
AFN	91.01	90.97	91.01	91.58
CDAN	92.63	92.63	92.63	92.77
DANN	94.22	94.21	94.22	94.28
DDC	88.23	88.19	88.23	88.19
DNA	91.64	91.64	91.64	91.65
DSAN	89.46	89.46	89.46	89.53
JAN	89.64	89.64	89.64	89.64
MCC	94.53	94.53	94.53	94.54
MCD	88.51	88.50	88.51	88.68
PIDAN	93.58	93.58	93.58	93.59
SPA	90.87	90.87	90.87	90.89
Source_only	85.23	85.21	85.23	85.35
Our model	96.93	96.93	96.93	96.98

Table 4. An ablation study was conducted to assess the impact of the improved joint distribution alignment with consistency framework on colon disease classification.

$ℓ_{Cls}$	$ℓ_{Jda}$	$ℓ_{Reg}$	Accuracy (%)	F1 Score (%)	Recall (%)	Precision (%)
√			75.71	77.19	75.71	80.86
√	√		84.10	84.27	84.10	84.46
√		√	84.61	84.90	84.61	85.22
√	√	√	87.41	87.26	87.41	87.52

Table 5. An ablation study was performed to evaluate the impact of the improved joint distribution alignment with consistency framework on diabetic retinopathy classification.

$ℓ_{Cls}$	$ℓ_{Jda}$	$ℓ_{Reg}$	Accuracy (%)	F1 Score (%)	Recall (%)	Precision (%)
√			88.61	88.61	88.61	88.65
√	√		91.64	91.64	91.64	91.65
√		√	93.09	93.09	93.09	93.11
√	√	√	96.93	96.93	96.93	96.98

Table 6. Parameter analysis on the diabetic retinopathy classification task.

Parameters	Value	Accuracy (%)	F1 Score (%)	Recall (%)	Precision (%)
$λ_{Jda}$	1	96.93	96.93	96.93	96.98
	0.5	96.65	96.65	96.65	96.73
	0.1	96.86	96.86	96.86	96.92
	0.05	95.34	95.34	95.34	95.36
	0.01	95.31	95.31	95.31	95.32
$λ_{Reg}$	1	96.93	96.93	96.93	96.98
	0.5	96.58	96.58	96.58	96.63
	0.1	96.47	96.47	96.47	96.53
	0.05	93.79	93.79	93.79	93.79
	0.01	93.58	93.58	93.58	93.64
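
The ablation and parameter tables can be read against a weighted sum of the three loss terms. The sketch below assumes the overall objective has the form $ℓ_{Cls} + λ_{Jda} ℓ_{Jda} + λ_{Reg} ℓ_{Reg}$ and that each weight is swept while the other is held at 1; both points are assumptions inferred from the table layout rather than statements confirmed here, and `total_loss`, `WEIGHT_GRID`, and `sweep` are hypothetical names.

```python
def total_loss(l_cls, l_jda=0.0, l_reg=0.0, lambda_jda=1.0, lambda_reg=1.0):
    """Weighted sum of the three loss terms (assumed form of the objective)."""
    return l_cls + lambda_jda * l_jda + lambda_reg * l_reg


# Weight values listed in Table 6.
WEIGHT_GRID = [1, 0.5, 0.1, 0.05, 0.01]


def sweep(l_cls, l_jda, l_reg):
    """One-at-a-time sweep: vary one weight while the other stays at 1 (assumed)."""
    results = {}
    for w in WEIGHT_GRID:
        results[(w, 1)] = total_loss(l_cls, l_jda, l_reg, lambda_jda=w)
        results[(1, w)] = total_loss(l_cls, l_jda, l_reg, lambda_reg=w)
    return results
```

Under this reading, the first ablation row corresponds to `total_loss(l_cls)`, the second and third rows add one extra term each, and the last row uses all three terms.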

© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
