4.3. Performance Evaluation of Change Detection
Because CD-UDA extends and builds on CD, the performance of the underlying CD model directly affects the transferability of the CD-UDA model. The performance of the supervised IRD-CD model therefore needs to be evaluated first. Six methods with stable and advanced performance are selected for comparative experiments: DeepLab [
46], FCSiamD [
47], EUNet [
48], UCDNet [
49], ISNet [
50], and the proposed IRDNet. The comparison experiments are performed separately on each dataset, and the results are presented in Table 2, with the best results highlighted in bold.
The supervised CD methods selected for comparison are all robust and perform well, and their overall results are relatively close. DeepLab and EUNet, both single-stream models, show suboptimal performance on all three datasets, which supports the view that the two-stream Siamese architecture is better suited to the CD task. Compared with the other two-stream models, the proposed IRD-CD model achieves the best performance on all three datasets, with a clear advantage on the most imbalanced dataset, GZ.
4.4. Ablation Experiment of Change Detection Unsupervised Domain Adaptation Model (CD-UDA)
According to the DA strategy and the sample selection strategy introduced in
Section 3.2, the effectiveness of each DA strategy is verified by an ablation experiment. This experiment has the following three parts:
D0: Evaluate the UDA performance of the model when pre-trained with the source domain data, using only the mid-layer marginal distribution alignment (
Section 3.2.1);
D1: Evaluate the UDA performance of the model when pre-trained with the source domain data, using transferable features derived from the probabilistic easy-to-hard target domain sample selection strategy (Section 3.2.3) for the inter-domain conditional distribution alignment (Section 3.2.2) prior to the classification layer;
D2: Building on D1, evaluate the UDA performance after adding the mid-layer marginal distribution alignment (Section 3.2.1).
The results of the ablation experiments are shown in Table 3, where G→L refers to the source domain data being GZ and the target domain data being LEVIR. These results show that the proposed mid-level marginal distribution alignment, deep conditional distribution alignment, and sample selection strategy all improve the performance of CD-UDA. Among them, the probability-based easy-to-hard sample selection strategy significantly reduces the tendency of CD-UDA to converge to a local optimum because of sample imbalance. The D1 configuration, which uses the conditional distribution constraint, shows a significant overall improvement and is particularly effective in mitigating the severe sample imbalance of the CD task. The results also show that sample imbalance can be alleviated by selecting the same number of transferable samples from the changed and unchanged regions during sample screening. However, if the mid-level marginal distribution alignment is omitted, the model converges in a different direction from the source domain model; that is, the prior knowledge of the source domain is ignored. Because the target domain data are unlabeled, CD-UDA can improve the target domain test performance only by minimizing the difference between the feature distributions of the source and target domain data.
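The balancing idea described above can be illustrated with a minimal Python sketch. It is not the paper's Algorithm 2: the function name `select_balanced_samples`, the single confidence `threshold`, and the random subsampling are assumptions made only for illustration. Target pixels whose predicted change probability is confident enough are kept as pseudo-labeled samples, and the larger class is subsampled so that changed and unchanged samples are equal in number. Lowering the threshold over successive rounds mimics an easy-to-hard schedule.

```python
import random


def select_balanced_samples(probs, threshold, seed=0):
    """Pick confident target pixels, with an equal count per class.

    probs: predicted change probabilities in [0, 1], one per pixel.
    threshold: hypothetical confidence threshold in (0.5, 1].
    Returns (changed_idx, unchanged_idx), two index lists of equal length.
    """
    # Pixels confidently predicted as changed or as unchanged.
    changed = [i for i, p in enumerate(probs) if p >= threshold]
    unchanged = [i for i, p in enumerate(probs) if p <= 1.0 - threshold]
    # Keep the same number from each class to counter the imbalance.
    k = min(len(changed), len(unchanged))
    rng = random.Random(seed)
    return sorted(rng.sample(changed, k)), sorted(rng.sample(unchanged, k))


if __name__ == "__main__":
    # Illustrative probabilities only; most pixels look unchanged.
    probs = [0.97, 0.02, 0.04, 0.91, 0.08, 0.55, 0.01, 0.03]
    # Easy-to-hard: start strict, then admit less confident pixels.
    for threshold in (0.95, 0.90):
        changed, unchanged = select_balanced_samples(probs, threshold)
        print(threshold, changed, unchanged)
```

In this toy input, the stricter threshold admits one sample per class and the looser one admits two per class, even though far more unchanged pixels pass the confidence test.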
There are two user-defined parameters in the sample screening strategy (Algorithm 2), the initial probability threshold and the second screening weight, and their values are determined through ablation experiments. These experiments are carried out with all components of the D2 CD-UDA strategy enabled. The ablation experiment is first performed for the initial probability threshold
. Not performing the second screening will lead to model optimization failure due to sample imbalance, so here the second screening parameter
is set to the empirical value of
.
Table 4 shows the performance of
on different cross-sample datasets, where the experimental metrics are
and
. After determining the value of
, ablation experiments are performed on the second sample screening weight
.
Table 5 shows the performance of
on different cross-sample datasets where the experimental metrics are
and
. The best results are shown in bold. From the experimental results, it was determined that
.
4.5. Performance Evaluation of Change Detection Unsupervised Domain Adaptation Model (CD-UDA)
Research on CD-UDA is very limited. To evaluate the performance of the proposed CD-UDA, the experiments are divided into three parts. First, the transferable characteristics of the multi-domain fully connected layers are evaluated with MK-MMD, using the DSDANet model. Second, the inter-domain conditional distribution difference and the probabilistic sample-selection-based transfer strategy proposed in this paper are incorporated into the two-stream UCDNet and ISNet models to verify the generality of the proposed CD-UDA method. Third, to provide a benchmark for the proposed CD-UDA method, several well-performing CD models without a DA strategy are tested on the cross-domain datasets. The experimental results are shown in Table 6, Table 7, Table 8, Table 9, Table 10 and Table 11, with the best results in bold.
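For readers unfamiliar with MK-MMD, the following minimal Python sketch shows the quantity such a criterion measures: the squared maximum mean discrepancy between source and target feature sets under a mixture of Gaussian kernels. It is an illustration under stated assumptions, not the DSDANet or IRD-CD-UDA implementation: the kernels are weighted equally (MK-MMD in the literature typically learns the weights), the bandwidths are arbitrary, the biased estimator is used, and all function names are hypothetical.

```python
import math


def gaussian_kernel(x, y, bandwidth):
    """Gaussian (RBF) kernel between two feature vectors."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq_dist / (2.0 * bandwidth ** 2))


def multi_kernel(x, y, bandwidths):
    """Equally weighted mixture of Gaussian kernels (an assumption)."""
    return sum(gaussian_kernel(x, y, b) for b in bandwidths) / len(bandwidths)


def mk_mmd2(source, target, bandwidths=(0.5, 1.0, 2.0)):
    """Biased estimate of the squared MMD between two feature sets."""
    def mean_k(a, b):
        total = sum(multi_kernel(x, y, bandwidths) for x in a for y in b)
        return total / (len(a) * len(b))

    return (mean_k(source, source) + mean_k(target, target)
            - 2.0 * mean_k(source, target))


if __name__ == "__main__":
    # Toy 2-D features; real features would come from a network layer.
    src = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]]
    near = [[0.1, 0.0], [1.1, 0.0], [0.0, 1.1]]
    far = [[3.0, 3.0], [4.0, 3.0], [3.0, 4.0]]
    print(mk_mmd2(src, near), mk_mmd2(src, far))
```

A target set that lies close to the source yields a small value and a distant one yields a larger value, which is why minimizing this quantity pulls the two feature distributions together.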
UDA is essentially a transfer of existing knowledge from the source domain; it cannot create knowledge out of nothing. It is therefore necessary to first analyze the datasets involved in the transfer and then identify the application scenarios of CD-UDA. The experimental results and the training process show that testing a CD model on the target domain without any UDA strategy reveals the distribution relationship between the datasets, although these results may oscillate drastically across iterations. Based on the experimental results of the models without the CD-UDA strategy, the following conclusions can be drawn: firstly, the
metrics are significantly lower in most experimental results, which indicates richer content information and greater domain uncertainty in the change region in the CD task; secondly,
and
are higher in S→L (
Table 10) and S→G (
Table 11), which can be attributed to the fact that SUYU and the other two datasets differ not only in the distribution of the data, but also in the range of the labeling—the labeling range of SUYU is larger than the other two datasets; thirdly, as shown in
Table 6 and
Table 9, the domain similarity is more obvious in the
G and
L datasets.
The experimental results show that the DSDANet model is noticeably unstable on the CD-DA task, for two reasons. Firstly, because of the sample imbalance in the CD datasets, a model trained without a suitable sample selection strategy tends to converge to the regions of significant change and to the unchanged regions, which produces negative transfer. Secondly, because of the specific structure of the two-stream CD model, if domain feature alignment is applied only at the classification layer, the style information of the samples is ignored and the aligned domain features characterize only the change detection results.
To evaluate the generality of the CD-UDA strategy proposed in this paper, the strategy is added to FCSiamD and UCDNet. Because FCSiamD and UCDNet cannot generate transferable features suitable for the marginal distribution alignment in the middle layer, only the conditional distribution alignment strategy and the sample selection strategy for the classification layer are added in this experiment. Overall, adding the proposed UDA strategy to the CD models yields significant improvements, especially in S→L (
Table 10) and S→G (
Table 11).
For L→S (Table 8) and G→S (Table 7), the results show high accuracy for unchanged regions, low accuracy for changed regions, and no significant gain from CD-UDA. Combined with the overview of the experimental data (Table 1), we attribute this to two factors: firstly, SUYU differs from the other two datasets not only in data distribution but also in labeling range (the discrimination threshold for the changed region), which is larger in SUYU than in the other two datasets; secondly, source domain data with a serious sample imbalance increase the difficulty of performing the CD-UDA task.
From the experimental results of S→L (
Table 10) and S→G (
Table 11), we can see that most of the methods improve their results significantly after adding the proposed UDA strategy. Combined with the overview of the experimental data (
Table 1), we believe that the performance of the CD-UDA model on the target domain dataset is enhanced when the source domain data are broader and the feature distribution is wider.
From the experimental results of G→L (
Table 6) and L→G (
Table 9), it can be seen that CD-UDA mitigates the sample imbalance, which otherwise causes the target domain test results to converge toward the unchanged region. Combined with the overview of the experimental data (
Table 1), CD-UDA is more effective for datasets with similar labeling scales (GZ and LEVIR) and is also more stable during the training process.
The proposed IRD-CD-UDA achieves the best performance in most of the cross-domain experiments. In particular, the F1 and mIoU metrics improve significantly, with F1 improving by 3–22% and mIoU by 2–13%. This demonstrates the effectiveness of the proposed CD-UDA in mitigating the sample imbalance problem, which often causes the model to converge to a local minimum. Meanwhile, the OA achieves the best results on G→S, L→S, L→G, and S→G, and outperforms most of the comparison methods on S→L and L→S. The results show that the CD-UDA proposed in this paper improves the performance on the target domain data without destroying the prior knowledge of the source domain, and improves the generalization ability of the model.
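The contrast between OA and the F1 and mIoU metrics under sample imbalance can be made concrete with a small Python sketch. The definitions used here are the common ones for binary change detection and are assumptions, since this section does not restate the paper's formulas: F1 is computed for the changed class, and mIoU is the mean of the IoU of the changed and unchanged classes. The function name `cd_metrics` is hypothetical.

```python
def cd_metrics(pred, label):
    """Compute OA, F1 (changed class) and mIoU for binary change maps.

    pred, label: flat sequences of 0 (unchanged) or 1 (changed).
    """
    pairs = list(zip(pred, label))
    tp = sum(1 for p, y in pairs if p == 1 and y == 1)
    tn = sum(1 for p, y in pairs if p == 0 and y == 0)
    fp = sum(1 for p, y in pairs if p == 1 and y == 0)
    fn = sum(1 for p, y in pairs if p == 0 and y == 1)

    oa = (tp + tn) / len(pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    # Per-class IoU, then the mean over the two classes.
    iou_changed = tp / (tp + fp + fn) if tp + fp + fn else 0.0
    iou_unchanged = tn / (tn + fp + fn) if tn + fp + fn else 0.0
    return {"OA": oa, "F1": f1, "mIoU": (iou_changed + iou_unchanged) / 2}


if __name__ == "__main__":
    # Imbalanced toy label: one changed pixel out of ten.
    label = [0] * 9 + [1]
    # A model that has collapsed to "unchanged everywhere".
    print(cd_metrics([0] * 10, label))
```

On this toy input, a prediction of "unchanged everywhere" still reaches an OA of 0.9, while F1 is 0 and mIoU is 0.45. This is why gains in F1 and mIoU are the more informative sign that the imbalance has been mitigated.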
4.6. Visualization
A set of bi-temporal data randomly selected from the three datasets is fed into the trained IRD-CD-UDA to produce visualization results. As shown in
Figure 4, the experimental results indicate that the annotation criteria for the CD dataset vary across the datasets. These criteria dictate the designation of regions as changed or unchanged, which directly affects the performance of the CD-UDA model. As shown in
Figure 4, the model is more sensitive to regional changes when the source domain is SUYU than when it is LEVIR or GZ. Because the SUYU dataset is large and rich in annotation categories, the changed regions of the GZ data can be identified better (Figure 4).
To visualize the distribution of differential features before and after CD-UDA, the t-SNE algorithm [
51] was used to plot the classification feature distribution before and after domain adaptation, as shown in
Figure 5, which shows the coupled features of the changed and unchanged regions, the intra-class features of the changed regions, and the intra-class features of the unchanged regions. Prior to DA (Figure 5a–c), the distributions of the coupled features and the intra-class features are mixed, and the intra-class feature distributions differ distinctly between domains. After DA (Figure 5d–f), the coupled features cluster into distinct classes, while the intra-class feature distributions of the different domains align by class. Because of the sample imbalance, the model has reduced confidence in the features of changed regions in a few target domains, which leads to the deep features of these changed regions being confused with features of other categories.