3.2.1. Invariant Feature Network Performance Analysis
The performance of invariant feature networks is assessed using the Pavia University, Pavia Center, and Salinas Valley datasets. The Pavia University dataset comprises 103 effective spectral bands, encompassing nine land-use categories for classification. Similarly, the Pavia Center dataset consists of 102 spectral bands, representing nine land-use targets. The Salinas Valley dataset offers 204 spectral bands, with 16 diverse land-use categories designated for classification.
For sample partitioning, the samples within each category are first shuffled at random. Then, r% of the samples (r = 40, 50, 60, 70) are selected from each category and assigned their true class labels to form the training set, and 30% of the samples are used as the test set. Each experiment is repeated five times to mitigate the influence of edge effects stemming from random sampling and to ensure the reliability of the results, and the classification results are averaged to assess the performance of the algorithm. The network training setup remains consistent with that outlined in Section 3.1.
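The partitioning-and-averaging protocol described above can be sketched as follows; `stratified_split` and `repeated_accuracy` are illustrative names of our own, and `run_once` stands in for training and evaluating any of the compared classifiers, so this is a sketch of the protocol rather than the authors' code.

```python
import numpy as np

def stratified_split(labels, train_ratio, test_ratio=0.3, rng=None):
    """Shuffle each class independently and split it into train/test index sets.

    labels: 1-D array of class ids; train_ratio: fraction (e.g. 0.4 for r = 40).
    """
    rng = np.random.default_rng(rng)
    train_idx, test_idx = [], []
    for c in np.unique(labels):
        idx = rng.permutation(np.flatnonzero(labels == c))  # randomize within class
        n_train = int(round(train_ratio * len(idx)))
        n_test = int(round(test_ratio * len(idx)))
        train_idx.extend(idx[:n_train])
        test_idx.extend(idx[-n_test:])  # disjoint tail forms the 30% test split
    return np.array(train_idx), np.array(test_idx)

def repeated_accuracy(labels, train_ratio, run_once, n_repeats=5):
    """Repeat the split-train-evaluate cycle and average the accuracies."""
    accs = []
    for seed in range(n_repeats):
        tr, te = stratified_split(labels, train_ratio, rng=seed)
        accs.append(run_once(tr, te))
    return float(np.mean(accs))
```

Averaging over several reseeded splits in this way is what damps the edge effects of any single random draw.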
In this section, the proposed algorithm is compared with an SVM based on spectral-extended morphological attribute profile joint feature representation [52] (Spe-EMAP SVM), a stacked autoencoder based on spectral–spatial joint feature representation [1] (JSSAE), a stacked sparse autoencoder based on spectral features (Spe-SSAE), a stacked sparse autoencoder based on extended morphological attribute profile features (EMAP-SSAE), and a stacked sparse autoencoder based on spectral-extended morphological attribute profile joint feature representation (Spe-EMAP SSAE). Table 4, Table 5 and Table 6 present the corresponding results from these comparisons, and Figure 4, Figure 5 and Figure 6 depict the classification outcomes achieved using the various methodologies across the three HSI datasets.
The proposed MDSNI algorithm exhibits exceptional classification performance in these experiments: among the six algorithms tested, it achieves the highest accuracy on the classification task.
It is worth noting that when the support vector machine (SVM) and stacked sparse autoencoder (SSAE) classification models are trained with the same set of features, the SSAE consistently outperforms the SVM in overall classification accuracy. This observation further underscores the stability and robustness of deep features and the significant improvement they bring to hyperspectral image classification performance.
A comparison of the actual classification maps reveals that our methodology outperforms the others in reducing misclassified pixels. This is primarily attributable to the constructed domain-invariant feature extraction algorithm, which effectively harnesses the spectral and spatial information of the hyperspectral images to improve classification accuracy. On the Salinas Valley dataset, comparison with the ground truth shows substantial classification errors in certain regions for SVM and SSAE, whereas the MDSNI exhibits only sporadic misclassifications. This notable difference is primarily attributable to the intraclass and interclass invariant feature structures built in the MDSNI from the Fourier phase information and CORAL correlation-aligned second-order statistics, with the invariant features weighted and enhanced according to the MMD distance. Additionally, the adversarial network with boundary constraints introduced in the MDSNI constrains the source domain data by generating spurious samples in the feature space. Generating synthetic samples in the feature space via the GAN enhances the reliability and richness of the feature space, thereby fortifying its boundaries. These findings collectively underscore the substantial advantages of the proposed MDSNI network in improving classification performance.
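As a rough illustration of the phase-based ingredient mentioned above, a phase-only spectral representation can be obtained from the FFT. The function below is our own minimal sketch (its name and normalization are assumptions, not the MDSNI implementation), but it shows why Fourier phase is invariant to multiplicative amplitude changes such as illumination scaling.

```python
import numpy as np

def phase_only_features(spectra):
    """Discard the FFT magnitudes and keep only the phase, which is unchanged
    by multiplication of the signal with any positive constant.

    spectra: (n_samples, n_bands) array of spectral vectors.
    """
    f = np.fft.rfft(spectra, axis=1)
    phase = np.angle(f)
    # Encode phase as (cos, sin) pairs to avoid the 2*pi wrap-around.
    return np.concatenate([np.cos(phase), np.sin(phase)], axis=1)
```

Scaling every band of a spectrum by the same positive factor leaves these features exactly unchanged, which is the kind of intraclass invariance the paragraph refers to.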
Finally, our experimental results consistently underscore the exceptional performance of our method. At any fixed training rate, the MDSNI algorithm shows significant advantages over the other methodologies, demonstrating its superior ability to extract intraclass and interclass invariant features across multiple source domains. When the training rate is increased from 60% to 70%, the classification results of the MDSNI algorithm change only minimally despite the larger volume of training data. This illustrates that our methodology yields favorable outcomes even with relatively limited training data, and such stable performance augments its practical applicability.
Table 7 compares our algorithm with the latest HSI unsupervised classification algorithms. For sample partitioning, the samples within each category are first shuffled at random, and 70% of the samples in each category are then assigned their true class labels to form the training sample set, while 30% of the samples are used as the test sample set. The experiments are repeated five times to mitigate potential edge effects caused by random sampling and to ensure reliability, and the algorithm performance is assessed by averaging the classification results. The network training settings are the same as those in Section 3.1.
In our experiments, the performance of the MDSNI is compared with that of the PSSA and Diff-HIS algorithms on the HSI classification task. The results show that the MDSNI algorithm outperforms both, and this superiority mainly stems from the unique advantages of our approach in utilizing invariant features and establishing robust boundary strategies.
The MDSNI constructs a boundary constraint-based adversarial network, which reinforces the boundary of the source domain-invariant feature space by populating it with generated fake samples. In contrast, Diff-HIS simply uses a diffusion model pretrained on unlabeled HSI patches for unsupervised feature learning and then exploits intermediate-level features at various time steps for classification. The PSSA combines entropy rate superpixel segmentation (ERS), superpixel-based principal component analysis (PCA), and PCA-domain two-dimensional singularity spectral analysis (SSA) to enhance feature extraction efficacy and efficiency, together with anchor-based graph clustering (AGC) for effective classification. Nevertheless, both methodologies fail to consider the ambiguity of edges in HSI data, which makes classification errors on edge data highly likely and results in lower classification accuracy than that of the MDSNI.
In summary, the excellent performance of our MDSNI algorithm mainly stems from its accurate invariant feature extraction and robust boundary reinforcement strategy. These two points enable the MDSNI to markedly improve the classification accuracy when performing HSI classification tasks, thus making its overall performance superior to other competing algorithms.
3.2.2. Performance Analysis of the MUDA Sample Transfer Algorithm
To assess the effectiveness of the methodology introduced in this chapter for hyperspectral image transfer classification, experiments are conducted on three datasets: Pavia University (PU), Houston (H), and Washington DC Mall (W). The Houston dataset includes a total of 144 bands, with a spatial resolution of 2.5 m. The Washington DC Mall dataset comprises 305 × 280 pixels and 191 bands. Four land-cover classes, namely roads, grasslands, trees, and roofs, are selected as transfer targets for all three datasets. The pseudocolor images and ground truths for the three datasets are shown in Figure 7.
Drawing from the aforementioned hyperspectral remote sensing datasets, two groups of data are selected as the source domains and the remaining group as the target domain to create cross-domain multi-source experimental data for verification. Throughout the experiments, the MDSNI is compared with the SVM, 3D-SAE [55], Deep CORAL, DAN, and UHDAC to demonstrate the superiority of the proposed methodology. The learning settings mirror those utilized in Section 3.1.
In the results analysis stage, the overall classification accuracy (OA) and the kappa coefficient are utilized as evaluation indicators; each experiment is repeated ten times, and the average value is taken. The experimental results are presented in Table 8 and Table 9, and Figure 8 and Figure 9 show the classification results for the HSI data transfer learning.
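For reference, the two evaluation indicators can be computed from a confusion matrix as in the standard sketch below; this is the textbook definition, not the authors' evaluation code.

```python
import numpy as np

def overall_accuracy_and_kappa(y_true, y_pred, n_classes):
    """Compute OA and the kappa coefficient from true/predicted label arrays."""
    cm = np.zeros((n_classes, n_classes), dtype=np.int64)
    for t, p in zip(y_true, y_pred):
        cm[t, p] += 1
    n = cm.sum()
    oa = np.trace(cm) / n                 # observed agreement
    pe = (cm.sum(0) @ cm.sum(1)) / n**2   # agreement expected by chance
    kappa = (oa - pe) / (1 - pe)          # chance-corrected agreement
    return float(oa), float(kappa)
```

Because kappa discounts chance agreement, it penalizes classifiers that merely favor the majority class, which is why it is reported alongside OA.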
The analysis reveals the following.
Traditional hyperspectral image classification algorithms, such as SVM and 3D-SAE, only consider the source domain data during training and ignore the relationship between the source and target domains; thus, they perform poorly on the target domain classification task, resulting in a low overall accuracy and kappa coefficient. In contrast, transfer learning algorithms, such as the MDSNI, UHDAC, DAN, and Deep CORAL, utilize the features of both the source and target domains during training and are designed to mitigate the distributional differences between the two, ultimately enhancing the classification performance in the target domain. For instance, the MDSNI effectively captures the intraclass and interclass invariant feature structure of the source and target domains by incorporating the Fourier phase information and CORAL correlation-aligned second-order statistics. The UHDAC improves domain adaptation through an unsupervised hierarchical deep clustering approach, facilitating the learning of the shared and specific features of the source and target domains. In a similar vein, DAN and Deep CORAL bolster domain adaptability by maximizing the similarity between the feature distributions of the source and target domains, thereby reducing the distributional differences between them. The overarching objective of these methodologies is to leverage information from both domains during training to achieve better classification performance on the target domain. As a result, they significantly outperform traditional hyperspectral image classification algorithms, such as SVM and 3D-SAE, on the target domain classification task, improving both the overall accuracy and the kappa coefficient.
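The distribution-alignment idea shared by these methods can be illustrated with a squared maximum mean discrepancy (MMD) between source and target feature samples. DAN uses a multi-kernel formulation in practice, so the single-RBF-kernel function below, with an assumed `gamma`, is only a schematic sketch of the quantity such losses penalize.

```python
import numpy as np

def rbf_mmd2(X, Y, gamma=1.0):
    """Biased estimate of the squared MMD between samples X and Y
    under an RBF kernel; small values mean similar feature distributions."""
    def k(A, B):
        d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)  # pairwise sq. dists
        return np.exp(-gamma * d2)
    return k(X, X).mean() + k(Y, Y).mean() - 2 * k(X, Y).mean()
```

Minimizing this quantity over the network's feature layers is what pulls the source and target feature distributions together during training.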
The transfer learning algorithms, Deep CORAL and DAN, aim to mitigate the disparity in domain distribution between the source and target domains by introducing domain adaptation layers. Nevertheless, Deep CORAL incorporates fewer domain adaptation layers, which limits its effectiveness when handling hyperspectral data. In addition, Deep CORAL and DAN are implemented without considering the problem of fuzzy boundaries between the source and target domain data. Consequently, their accuracy and kappa coefficient in hyperspectral image classification tasks are lower than those of the MDSNI.
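Deep CORAL's alignment term, referenced above, penalizes the difference between the source and target feature covariances. A minimal NumPy version of that loss (with the surrounding deep network omitted, so only the objective itself is shown) might look like:

```python
import numpy as np

def coral_loss(Xs, Xt):
    """Squared Frobenius distance between source/target feature covariances,
    normalized by 4*d^2 as in the Deep CORAL formulation.

    Xs, Xt: (n_samples, d) feature matrices from the two domains.
    """
    d = Xs.shape[1]
    Cs = np.cov(Xs, rowvar=False)  # source covariance (d x d)
    Ct = np.cov(Xt, rowvar=False)  # target covariance (d x d)
    return ((Cs - Ct) ** 2).sum() / (4 * d * d)
```

In a deep network this term is added, at one or more adaptation layers, to the classification loss; using it at only a few layers is the limitation the paragraph attributes to Deep CORAL.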
The UHDAC strives to establish consistent feature mapping between the source and target domains and aligns the data distributions of the two domains via bidirectional adversarial migration. Nevertheless, the UHDAC tends to overlook the invariant features of the source domain, so its performance in the hyperspectral experiments slightly lags behind that of the MDSNI. While 3D-SAE combines the features of both the source and target domains and extracts spatial–spectral features from hyperspectral images, its limited performance leads to an inaccurate representation of the edge information in hyperspectral images from different domains. Consequently, the parameters of a network trained on the source domain data cannot be directly adapted to training on the target domain data. This difference manifests in the accuracy and kappa coefficient of the MDSNI methodology significantly outperforming those of the 3D-SAE.
Our algorithm notably excels and attains the desired classification results, even in the presence of substantial disparities in the sample data distribution between the source and target domains. This robustly underscores the effectiveness of our approach in addressing discrepancies in data distribution between source and target domains.
The extensive array of experimental results underscores the substantial advantages of our MDSNI method over state-of-the-art transfer learning techniques on widely available benchmark datasets. Our methodology consistently delivers high-quality image classification results for the target domain despite data distribution inconsistencies between the source and target domains. This is confirmed across multiple experiments, affirming the broad potential applicability of our methodology to hyperspectral image classification tasks in real-world scenarios.