Article

Hyperspectral Image Classification Based on Adaptive Global–Local Feature Fusion

1 School of Information and Control Engineering, China University of Mining and Technology, Xuzhou 221116, China
2 School of Electronics and Electrical Engineering, Bengbu University, Bengbu 233030, China
* Author to whom correspondence should be addressed.
Remote Sens. 2024, 16(11), 1918; https://doi.org/10.3390/rs16111918
Submission received: 20 March 2024 / Revised: 15 May 2024 / Accepted: 22 May 2024 / Published: 27 May 2024

Abstract

Labeled hyperspectral image (HSI) information is commonly difficult to acquire, so the lack of valid labeled data is a major obstacle to HSI classification. Semi-supervised methods can efficiently exploit both unlabeled and labeled data for classification, which makes them highly valuable. However, graph-based semi-supervised methods focus on either local or global HSI data and cannot fully utilize spatial–spectral information; this significantly limits the performance of classification models. To solve this problem, we propose an adaptive global–local feature fusion (AGLFF) method. First, the global high-order and local graphs are adaptively fused, and their weight parameters are learned automatically to extract consistency features. The class probability structure is then used to express the relationship between the fused features and the categories and to calculate the corresponding pseudo-labels. Finally, the fused features are imported into the broad learning system as weights, and a weighted broad network performs the broad expansion of the fused features and calculates the model output weights. Experimental results on three datasets demonstrate that AGLFF outperforms other methods.

1. Introduction

Hyperspectral images typically include a large amount of approximately continuous spectral band information together with spatial location information [1,2,3]. HSI classification assigns a category to each pixel; it is a basic and key technology in remote sensing that is successfully applied in numerous fields such as mineral detection, environmental monitoring, and crop monitoring [4,5,6]. Early HSI classification methods, including random forest [7], support vector machine (SVM) [8], and graph-based [9] methods, enhanced feature classification ability by exploring rich and effective spectral information. Other models, such as independent component analysis [11] and principal component analysis (PCA) [10], are often used to identify valid spectral features. However, the lack of spatial contextual information in these early methods led to poor classification results. Therefore, several subsequent classification methods focused on exploiting the rich spatial information of the HSI scene. For example, rich spatial and spectral information has been integrated into sparse representation to achieve high-quality classification [12,13]. Markov random field [14,15] and superpixel methods [16,17] can fully explore spatial position information to achieve satisfactory classification. Although these methods can efficiently handle the HSI classification task, they cannot effectively handle the small differences between different classes or the large variations within the same class.
Deep learning (DL) can efficiently extract high-order features [18] and has been successfully applied to HSI classification tasks. Early deep HSI methods, such as deep belief networks [19], captured HSI features directly without manual feature engineering. The stacked autoencoder (SAE) [20] stacks several autoencoder layers to recognize each input spectral vector. However, these methods cannot consider spectral and spatial information jointly to deal with spectral variability. Consequently, numerous scholars used the convolutional neural network (CNN) [21] model to learn effective spatial and spectral features of HSI, obtaining promising results. Mei et al. [22] combined HSI spectral and spatial information in a feature-learning CNN model to obtain better classification results. Kong et al. [23] designed intra-class and inter-class hypergraph models to extract high-quality spectral band information and effective spatial location information using the CNN model. Chen et al. [24] captured rich spatial–spectral data simultaneously with a 3D CNN for HSI classification. Liu et al. [25] proposed a Siamese CNN to capture rich and effective spatial–spectral information and used an SVM to achieve the final HSI classification. Yang et al. [26] designed 2D and 3D CNNs and improved the regression model for HSI classification. Mou et al. [27] used a residual conv–deconv network to learn spectral–spatial features.
DL can efficiently capture the desired features via multiple stacked units [28,29]. However, DL models usually require repeated restructuring and extensive network training. In contrast, the broad learning system (BLS) constructs a flat network model [30] and easily enables broad (width-wise) expansion of the network. The original input HSI data are transformed to generate mapped features (MFs), which are converted into enhancement nodes (ENs) through randomly generated weight mappings. All ENs and MFs are simultaneously transmitted to the output layer, and the corresponding weights are calculated using ridge regression theory. Jin et al. [31] constructed a graph-regularized BLS model by adding a manifold regularization term to increase the final recognition ability. Kong et al. [32] used a novel graph-regularized SAE structure, fine-tuned partial weight values of MFs and ENs in the BLS model, and finally realized classification through spectral clustering. Wang et al. [33] constructed a domain adaptation BLS that dynamically adjusts the conditional and marginal distributions to maintain data feature alignment by introducing manifold distribution constraints, thus significantly improving the performance of the BLS model. However, in practical classification tasks only a small amount of labeled data is available, while a large amount of unlabeled data is easily obtained but cannot be effectively utilized by supervised methods. BLS methods commonly adopt the supervised learning paradigm and therefore cannot exploit the abundant unlabeled data to obtain better classification performance.
Semi-supervised learning (SSL) can efficiently utilize a large amount of unlabeled data together with a few labeled data. Unlabeled data can provide considerable useful input information to improve the ability of the corresponding SSL model. Graph-based SSL methods are widely adopted owing to their efficient scalability. Basic graph SSL models usually use methods such as k-nearest neighbors [34] and non-negative local linear reconstruction [35] to construct data graphs, which are sensitive to the graph construction strategy and nearest-neighbor parameters and can only capture local graph structures. Therefore, SSL methods combined with sparse graphs have been continually proposed. De Morsier et al. [36] constructed a kernel low-rank sparse graph by solving the sample similarity relationship in Hilbert space and described the data relationship under sparse and low-rank constraints. Ma et al. [37] used a robust non-negative, ℓ1-norm-constrained SSL method to construct reliable inter-sample relationship graphs by exploring the label structure. However, these algorithms do not sufficiently consider the class structure of the data. Therefore, Shao et al. [38] used a probabilistic category model combined with sparse representation to obtain a reliable relationship graph among HSI data. This SSL algorithm ignores the construction of the global relationship graph. Ding et al. [39] obtained high-order information by exploring global neighbor relationships instead of using similarity measurements of HSI data. However, these methods only consider the global or the local graph and do not fully utilize both global and local information, which significantly limits the classification performance of the model.
To solve this problem, we propose an adaptive global–local feature fusion (AGLFF) method. The proposed method extracts the corresponding features of HSI through the adaptive fusion of global high-order and local graphs and uses the class probability (CP) structure to express the relationship between the data and the categories. Simultaneously, a weighted broad learning system (WBLS) is used to enable network broad expansion. Our contributions are as follows:
  • The global–local adaptive fusion graph is built to obtain consistent spatial–spectral data. Adaptive fusion can automatically learn the weight parameters of the global high-order and local graphs, which can realize feature smoothing of intra-class data and increase the discriminability of inter-class data.
  • The CP structure is used to express the relationship between the fused feature and the categories to better utilize unlabeled data, resulting in improved classification performance.
  • Adaptive fusion features are introduced into the BLS model as weights, and the WBLS model is used to expand the breadth of the fused features to further enhance the expressiveness of the data.
The rest of the paper is organized as follows. The classification process of the proposed AGLFF is detailed in Section 2. Detailed experimental results are reported and analyzed in Section 3. Section 4 concludes this paper.

2. Adaptive Global–Local Feature Fusion Method

Figure 1 shows the classification process of the proposed AGLFF, which mainly includes five steps. First, PCA is used to reduce the dimensionality of the original HSI. Second, superpixels are generated with simple linear iterative clustering (SLIC) [40], and the global high-order and local graphs are adaptively fused to obtain consistent spatial–spectral features. Third, the CP structure is applied to calculate the pseudo-labels of the unlabeled samples. Fourth, the fused consistency features are broadened by the BLS to enhance the feature representation. Finally, the fused features are introduced into the BLS as weights, and the output weights are calculated by ridge regression theory.
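As a concrete illustration of the first two steps, the following is a minimal preprocessing sketch, assuming scikit-learn and a recent scikit-image (≥ 0.19) are available; the function name, the number of retained components, and the default superpixel count are illustrative choices rather than the authors' exact settings (the paper reports M = 1500 superpixels in Section 3.3.3).

```python
import numpy as np
from sklearn.decomposition import PCA
from skimage.segmentation import slic

def preprocess_hsi(cube, n_components=3, n_superpixels=1500):
    """PCA reduction of an HSI cube (H, W, B) followed by SLIC superpixel segmentation.

    Returns the reduced pixel matrix X (N x b), the pixel coordinates K (N x 2),
    the superpixel label map, and the per-superpixel spectral means E (M x b)
    and spatial means F (M x 2) used by the adaptive fusion step.
    """
    H, W, B = cube.shape
    X = PCA(n_components=n_components).fit_transform(cube.reshape(-1, B))
    rows, cols = np.meshgrid(np.arange(H), np.arange(W), indexing="ij")
    K = np.stack([rows.ravel(), cols.ravel()], axis=1).astype(float)
    labels = slic(X.reshape(H, W, -1), n_segments=n_superpixels,
                  channel_axis=-1, start_label=0)
    M = labels.max() + 1
    flat = labels.ravel()
    E = np.stack([X[flat == m].mean(axis=0) for m in range(M)])
    F = np.stack([K[flat == m].mean(axis=0) for m in range(M)])
    return X, K, labels, E, F
```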

2.1. Adaptive Feature Fusion

To fully utilize global and local information, consistent spatial–spectral features are obtained by adaptive fusion. HSI typically contains a large amount of approximately continuous spectral band information, so PCA is first used to reduce the dimensionality of the original data. Define the dimensionality-reduced HSI matrix $\mathbf{X} \in \mathbb{R}^{N \times b}$ with pixel vectors $\mathbf{x} \in \mathbb{R}^{b}$, where $N$ and $b$ represent the number of pixels and the feature dimension, respectively. Define a spatial coordinate matrix $\mathbf{K} \in \mathbb{R}^{N \times 2}$, where $\mathbf{k} \in \mathbb{R}^{2}$ represents the spatial coordinates of a pixel. SLIC is then used to segment the dimensionality-reduced data into superpixels. A reliable local adjacency graph is constructed from the spatial–spectral information between superpixels and pixels. Inspired by [39], the nearest neighbors are determined according to the probabilistic neighbor relationship between superpixels and pixels belonging to the same class. The statistical characteristics of each superpixel after segmentation are expressed by its mean values: the average spectral features of the superpixels are $\mathbf{E} = [\mathbf{e}_1, \ldots, \mathbf{e}_M] \in \mathbb{R}^{M \times b}$, and the average spatial coordinates are $\mathbf{F} = [\mathbf{f}_1, \ldots, \mathbf{f}_M] \in \mathbb{R}^{M \times 2}$, where $M$ is the number of superpixels. A smaller spectral distance $\|\mathbf{x}_i - \mathbf{e}_j\|_2^2$ and a smaller spatial distance $\|\mathbf{k}_i - \mathbf{f}_j\|_2^2$ between pixel $i$ and superpixel $j$ correspond to a larger probability $S_{ij}$ of being in the same class. The local adjacency graph model between superpixels and pixels can be expressed as follows:
$\min_{\mathbf{s}_i^{\top}\mathbf{1}=1,\ 0 \le \mathbf{s}_i \le 1} \sum_{i=1}^{N}\sum_{j=1}^{M} \left( \|\mathbf{x}_i-\mathbf{e}_j\|_2^2 + \lambda \|\mathbf{k}_i-\mathbf{f}_j\|_2^2 \right) S_{ij} + \theta \|\mathbf{S}\|_F^2,$  (1)
where $\mathbf{x}_i$ and $\mathbf{k}_i$ represent the spectral information and spatial coordinates of pixel $i$, respectively, and $\mathbf{e}_j$ and $\mathbf{f}_j$ represent the average spectral information and average spatial coordinates of superpixel $j$, respectively. $S_{ij}$ represents the local neighbor relationship between pixel $i$ and superpixel $j$. The term $\|\mathbf{S}\|_F^2$ prevents the trivial nearest-neighbor solution in which a single probability takes the value 1, while $\theta$ and $\lambda$ are the corresponding regularization parameters.
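Row by row, problem (1) has the standard closed form of an adaptive-neighbor assignment [41]: each row of $\mathbf{S}$ is the Euclidean projection of the negative scaled distances onto the probability simplex. The sketch below illustrates this, assuming dense NumPy/SciPy arrays; the helper name and parameter values are illustrative, and very large scenes would need chunked computation.

```python
import numpy as np
from scipy.spatial.distance import cdist

def project_simplex(v):
    """Euclidean projection of a vector onto the probability simplex."""
    u = np.sort(v)[::-1]
    css = np.cumsum(u)
    rho = np.nonzero(u * np.arange(1, v.size + 1) > (css - 1))[0][-1]
    tau = (css[rho] - 1) / (rho + 1.0)
    return np.maximum(v - tau, 0.0)

def local_graph(X, K, E, F, lam=30.0, theta=10.0):
    """Local pixel-to-superpixel graph S (N x M), solving Eq. (1) row by row."""
    D = cdist(X, E, "sqeuclidean") + lam * cdist(K, F, "sqeuclidean")
    return np.vstack([project_simplex(-row / (2.0 * theta)) for row in D])
```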
Owing to the long spatial distances and large spectral variations between intra-class pixels, the local adjacency graph can only capture the relationships of a few nearby neighbor pixels and cannot capture global neighbor relationships. Global consistency features between superpixels and pixels are therefore obtained using the graph topological consistency relationship, which further achieves feature aggregation among intra-class data. Inspired by [39], the topological relationship between two superpixels remains highly consistent if they are connected through consecutive neighbors. The topological consistency relationship is illustrated in the green dashed box in Figure 1: superpixels P and Q belong to the same class, but they are spatially far apart and their spectral information varies considerably, so their spatial–spectral correlation is low. However, they are connected by continuous neighbors, which implies a high topological consistency. The topological consistency relationship can be calculated from the similarity relationships between the superpixels. The relationship model is expressed as follows:
$\min_{\mathbf{Z}} \sum_{r=1}^{M} \left( \frac{1}{2} \sum_{i,j=1}^{M} A_{ij} \left( Z_{ri}-Z_{rj} \right)^2 + \rho \sum_{i=1}^{M} \left( Z_{ri}-I_{ri} \right)^2 \right),$  (2)
where $A_{ij}$ is the similarity between superpixels $i$ and $j$, $\mathbf{Z} \in \mathbb{R}^{M \times M}$ is the topological consistency relationship between the superpixels, $Z_{ri}$ is the topological consistency relationship between superpixels $r$ and $i$, $\rho$ is the regularization parameter, and $\mathbf{I}$ is the identity matrix. The first term in (2) corresponds to the graph topological consistency assumption, and the second term prevents a trivial solution. $\mathbf{A}$ is the similarity matrix between the superpixels, which can be calculated as follows:
$\min_{\mathbf{a}_i^{\top}\mathbf{1}=1,\ 0 \le \mathbf{a}_i \le 1} \sum_{i,j=1}^{M} \left( \|\mathbf{e}_i-\mathbf{e}_j\|_2^2 + \lambda_1 \|\mathbf{f}_i-\mathbf{f}_j\|_2^2 \right) A_{ij} + \theta_1 \|\mathbf{A}\|_F^2,$  (3)
where $\mathbf{e}_j$ and $\mathbf{f}_j$ represent the average spectral information and average spatial coordinates of superpixel $j$, respectively, and $\lambda_1$ and $\theta_1$ are regularization parameters. Equation (3) can be solved according to [41], and (1) can be solved in the same manner. Equation (2) is separable in $r$; therefore, the problem can be solved for each $r$ independently:
$\min_{\mathbf{z}_r} \frac{1}{2} \sum_{i,j=1}^{M} A_{ij} \left( Z_{ri}-Z_{rj} \right)^2 + \rho \sum_{i=1}^{M} \left( Z_{ri}-I_{ri} \right)^2.$  (4)
Taking the derivative of (4) with respect to $\mathbf{z}_r$ and setting it to zero yields the optimal solution $\mathbf{z}_r^*$:
$\mathbf{L}\mathbf{z}_r^* + \rho \left( \mathbf{z}_r^* - \mathbf{I}_r \right) = 0,$  (5)
where $\mathbf{L} \in \mathbb{R}^{M \times M}$ denotes the Laplacian matrix of $\mathbf{A}$, and $\mathbf{I}_r$ and $\mathbf{z}_r^*$ are column vectors. Subsequently,
$\mathbf{z}_r^* = \left( \mathbf{I} + \mathbf{L}/\rho \right)^{-1} \mathbf{I}_r.$  (6)
The $r$th column of $\mathbf{I}$ is $\mathbf{I}_r$; therefore, the $r$th column of $\left( \mathbf{I} + \mathbf{L}/\rho \right)^{-1}$ is $\mathbf{z}_r^*$. The optimal topological relationship can thus be expressed as follows:
$\mathbf{Z} = \left( \mathbf{I} + \mathbf{L}/\rho \right)^{-1}.$  (7)
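The following sketch computes the superpixel similarity of (3) with the same row-wise simplex projection used for (1) and then forms $\mathbf{Z}$ according to (7). The simplex-projection helper is repeated from the previous sketch, and symmetrizing $\mathbf{A}$ before building the Laplacian is an implementation assumption.

```python
import numpy as np
from scipy.spatial.distance import cdist

def project_simplex(v):
    """Euclidean projection onto the probability simplex (same helper as above)."""
    u = np.sort(v)[::-1]
    css = np.cumsum(u)
    rho = np.nonzero(u * np.arange(1, v.size + 1) > (css - 1))[0][-1]
    tau = (css[rho] - 1) / (rho + 1.0)
    return np.maximum(v - tau, 0.0)

def topological_consistency(E, F, lam1=30.0, theta1=10.0, rho=0.1):
    """Superpixel similarity A (Eq. (3)) and topological relationship Z (Eq. (7))."""
    D = cdist(E, E, "sqeuclidean") + lam1 * cdist(F, F, "sqeuclidean")
    A = np.vstack([project_simplex(-row / (2.0 * theta1)) for row in D])
    A = 0.5 * (A + A.T)                               # symmetrize before the Laplacian
    L = np.diag(A.sum(axis=1)) - A                    # unnormalized graph Laplacian of A
    Z = np.linalg.inv(np.eye(A.shape[0]) + L / rho)   # Eq. (7): Z = (I + L/rho)^(-1)
    return A, Z
```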
HSI classification operates on all pixels; thus, the global consistency relationship $\mathbf{G}$ is calculated by combining the local neighbor relationship $\mathbf{S}$ in (1) with the topological relationship $\mathbf{Z}$ in (7). Subsequently,
$\mathbf{G} = \mathbf{S}\mathbf{Z},$  (8)
where $\mathbf{G} \in \mathbb{R}^{N \times M}$ is the global consistency relationship between pixels and superpixels, which can effectively capture global neighbor features. However, the neighbor relationship between pixels and superpixels cannot be fully expressed by global features alone. Therefore, the global and local features are adaptively fused to obtain the neighbor relationship between pixels and superpixels. The model can be expressed as follows:
$\min_{\substack{\mathbf{s}_i^{\top}\mathbf{1}=1,\ 0 \le \mathbf{s}_i \le 1 \\ B+C=1}} \sum_{i=1}^{N}\sum_{j=1}^{M} B \left( \|\mathbf{x}_i-\mathbf{e}_j\|_2^2 + \lambda \|\mathbf{k}_i-\mathbf{f}_j\|_2^2 \right) S_{ij} + \theta \|\mathbf{S}\|_F^2 + \sum_{i=1}^{N}\sum_{j=1}^{M} C \left( \|\mathbf{x}_i-\mathbf{e}_j\|_2^2 + \lambda \|\mathbf{k}_i-\mathbf{f}_j\|_2^2 \right) G_{ij} + \theta_2 \|\mathbf{G}\|_F^2.$  (9)
This can be simply transformed to
$\min_{\substack{\mathbf{s}_i^{\top}\mathbf{1}=1,\ 0 \le \mathbf{s}_i \le 1 \\ B+C=1}} \sum_{i=1}^{N}\sum_{j=1}^{M} B \left( \|\mathbf{x}_i-\mathbf{e}_j\|_2^2 + \lambda \|\mathbf{k}_i-\mathbf{f}_j\|_2^2 \right) S_{ij} + \sum_{i=1}^{N}\sum_{j=1}^{M} C \left( \|\mathbf{x}_i-\mathbf{e}_j\|_2^2 + \lambda \|\mathbf{k}_i-\mathbf{f}_j\|_2^2 \right) S_{ij}\mathbf{Z} + \lambda_2 \|\mathbf{S}\|_F^2,$  (10)
where $\lambda_2 = \theta \mathbf{I} + \theta_2 \mathbf{Z}^{\top}\mathbf{Z}$. Equation (10) is separable in $i$. Subsequently,
$\min_{\substack{\mathbf{s}_i^{\top}\mathbf{1}=1,\ 0 \le \mathbf{s}_i \le 1 \\ B+C=1}} \sum_{j=1}^{M} B \left( \|\mathbf{x}_i-\mathbf{e}_j\|_2^2 + \lambda \|\mathbf{k}_i-\mathbf{f}_j\|_2^2 \right) S_{ij} + \sum_{j=1}^{M} C \left( \|\mathbf{x}_i-\mathbf{e}_j\|_2^2 + \lambda \|\mathbf{k}_i-\mathbf{f}_j\|_2^2 \right) S_{ij}\mathbf{Z} + \lambda_2 \|\mathbf{S}\|_F^2.$  (11)
Let $\mathbf{d}_i = \mathbf{d}_i^{xe} + \lambda \mathbf{d}_i^{kf}$ denote the distance vector, where $d_{ij}^{xe} = \|\mathbf{x}_i-\mathbf{e}_j\|_2^2$ and $d_{ij}^{kf} = \|\mathbf{k}_i-\mathbf{f}_j\|_2^2$ for $j = 1, \ldots, M$; then, we have the following:
$\min_{\substack{\mathbf{s}_i^{\top}\mathbf{1}=1,\ 0 \le \mathbf{s}_i \le 1 \\ B+C=1}} B\,\mathbf{d}_i S_{ij} + C\,\mathbf{d}_i S_{ij}\mathbf{Z} + \lambda_2 \|\mathbf{S}\|_F^2.$  (12)
Equation (12) can be transformed into a vector form as follows:
$\min_{\substack{\mathbf{s}_i^{\top}\mathbf{1}=1,\ 0 \le \mathbf{s}_i \le 1 \\ B+C=1}} \left\| \mathbf{s}_i + \frac{(B\mathbf{I} + C\mathbf{Z})\,\mathbf{d}_i}{2\lambda_2} \right\|_2^2.$  (13)
The Lagrangian function can be expressed as follows:
$\mathcal{L}\left( \mathbf{s}_i, \alpha_i, \boldsymbol{\omega}_i, \gamma_i \right) = \frac{1}{2} \left\| \mathbf{s}_i + \frac{(B\mathbf{I} + C\mathbf{Z})\,\mathbf{d}_i}{2\lambda_2} \right\|_2^2 - \alpha_i \left( \mathbf{s}_i^{\top}\mathbf{1} - 1 \right) - \boldsymbol{\omega}_i^{\top} \mathbf{s}_i - \gamma_i \left( B + C - 1 \right).$  (14)
Taking the derivative of (14) with respect to $\mathbf{s}_i$ and setting it to zero gives
$\mathbf{s}_i + \frac{(B\mathbf{I} + C\mathbf{Z})\,\mathbf{d}_i}{2\lambda_2} - \alpha_i \mathbf{1} - \boldsymbol{\omega}_i = 0.$  (15)
According to the Karush–Kuhn–Tucker conditions, the solution of (15) is expressed as follows:
$S_{ij} = \left( -\frac{\left[ (B\mathbf{I} + C\mathbf{Z})\,\mathbf{d}_i \right]_j}{2\lambda_2} + \alpha_i \right)_{+}.$  (16)
According to the constraint $\mathbf{s}_i^{\top}\mathbf{1} = 1$, we have
$\sum_{j=1}^{M} \left( -\frac{\left[ (B\mathbf{I} + C\mathbf{Z})\,\mathbf{d}_i \right]_j}{2\lambda_2} + \alpha_i \right) = 1 \;\Rightarrow\; \alpha_i = \frac{1}{M}\left( 1 + \sum_{j=1}^{M} \frac{\left[ (B\mathbf{I} + C\mathbf{Z})\,\mathbf{d}_i \right]_j}{2\lambda_2} \right).$  (17)
After obtaining $\alpha_i$, the corresponding $S_{ij}$ can be determined. We denote $\hat{\mathbf{G}} = \mathbf{D}\left( \mathbf{G}\boldsymbol{\Delta}^{-1}\mathbf{G}^{\top} + \mu\mathbf{I} \right)\mathbf{D}$ as the regularized matrix of $\mathbf{G}$, where $\mathbf{G}\boldsymbol{\Delta}^{-1}\mathbf{G}^{\top} + \mu\mathbf{I}$ increases the stability of the fusion model, $\mathbf{D} = \mathrm{diag}\big( \sum_{j=1}^{N} ( \mathbf{G}\boldsymbol{\Delta}^{-1}\mathbf{G}^{\top} + \mu\mathbf{I} )_{ij} \big)$, and $\boldsymbol{\Delta} \in \mathbb{R}^{M \times M}$ with $\Delta_{jj} = \sum_{i} G_{ij}$; the parameter $\mu$ is empirically set to 0.1. Similarly, the regularized matrix $\hat{\mathbf{S}}$ of $\mathbf{S}$ can be calculated. The fused feature can be expressed as
$\mathbf{F}_u = B\hat{\mathbf{S}} + C\hat{\mathbf{G}},$  (18)
where $B$ and $C$ are the fusion coefficients, and $\mathbf{F}_u$ is the fused feature. Algorithm 1 summarizes the fusion process.
Algorithm 1 Adaptive Feature Fusion Process
Input: PCA-reduced HSI representation $\mathbf{X} \in \mathbb{R}^{N \times b}$, pixel spatial coordinates $\mathbf{K} \in \mathbb{R}^{N \times 2}$, superpixel spectral features $\mathbf{E} = [\mathbf{e}_1, \ldots, \mathbf{e}_M] \in \mathbb{R}^{M \times b}$, and superpixel spatial coordinates $\mathbf{F} = [\mathbf{f}_1, \ldots, \mathbf{f}_M] \in \mathbb{R}^{M \times 2}$.
(1) Obtain the local neighbor relationship $\mathbf{S}$ between pixels and superpixels according to (1).
(2) Obtain the topological relationship $\mathbf{Z}$ between superpixels according to (4).
(3) Obtain the global consistency relationship $\mathbf{G}$ according to (8).
(4) Calculate the adaptive fused feature according to (9).
(5) Obtain the optimized $S_{ij}$ according to (16) and (17).
(6) Obtain the adaptive fused feature $\mathbf{F}_u$ according to (18).
Output: Adaptive fused feature $\mathbf{F}_u$.
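A compact numerical sketch of steps (3)–(6) is given below, assuming the graphs S and Z from the previous sketches are available as dense NumPy arrays. The symmetric normalization used for the regularized matrices and the fixed fusion coefficients are simplifying assumptions (the paper learns B and C adaptively and reports a learned C of about 0.8); the result is an N x N matrix, so real scenes require sparse or blocked computation.

```python
import numpy as np

def normalized_graph(Gm, mu=0.1):
    """Regularized matrix D (G Δ^{-1} G^T + μI) D of a pixel-superpixel graph Gm (N x M).

    Assumption: D is taken as the inverse square root of the row sums (symmetric
    normalization); the paper's exact definition of D may differ.
    """
    delta = np.maximum(Gm.sum(axis=0), 1e-12)            # Δ_jj = Σ_i G_ij
    W = (Gm / delta) @ Gm.T + mu * np.eye(Gm.shape[0])   # G Δ^{-1} G^T + μI
    d = 1.0 / np.sqrt(np.maximum(W.sum(axis=1), 1e-12))
    return (W * d[:, None]) * d[None, :]

def fuse_features(S, Z, B=0.2, C=0.8, mu=0.1):
    """Adaptive global-local fusion F_u = B*S_hat + C*G_hat (Eqs. (8) and (18))."""
    G = S @ Z                                            # Eq. (8): global relationship
    return B * normalized_graph(S, mu) + C * normalized_graph(G, mu)
```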

2.2. Class Probability Structure

Unlabeled pixels have no label information and cannot be used directly, so we use a CP structure to calculate their pseudo-labels. The labeled samples generated via adaptive fusion are denoted as $\mathbf{X}_l^F = [\mathbf{x}_1^F, \ldots, \mathbf{x}_l^F] \in \mathbb{R}^{l \times b}$, and their corresponding labels are denoted as $\mathbf{Y}_l = [\mathbf{y}_1, \ldots, \mathbf{y}_l] \in \mathbb{R}^{l \times c}$. The unlabeled samples obtained via adaptive fusion are denoted as $\mathbf{X}_u^F = [\mathbf{x}_1^F, \ldots, \mathbf{x}_u^F] \in \mathbb{R}^{u \times b}$, where $l$ denotes the number of labeled data, $b$ the data dimensionality, $u$ the number of unlabeled data, $c$ the number of categories, and $N = l + u$ the total number of pixels. For any $\mathbf{x}_i^F \in \mathbf{X}_u^F$, the similarity relationship with the labeled samples $\mathbf{X}_l^F$ is
$\min_{\boldsymbol{\sigma}} \left\| \mathbf{x}_i^F - \boldsymbol{\sigma}^{\top}\mathbf{X}_l^F \right\|_F^2 + \delta \|\boldsymbol{\sigma}\|_1 \quad \mathrm{s.t.}\ \boldsymbol{\sigma} \ge 0,$  (19)
where $\delta$ and $\boldsymbol{\sigma}$ are the regularization parameter and the sparse coefficient vector, respectively. Equation (19) can be solved using the alternating direction method of multipliers [42] to determine the CP vector as
$\mathbf{p}_i = \boldsymbol{\sigma}^{\top}\mathbf{Y}_l,$  (20)
where $\mathbf{p}_i = \left[ p_{i1}, \ldots, p_{ik}, \ldots, p_{ic} \right] \in \mathbb{R}^{1 \times c}$, and $p_{ik}$ is the probability that the $i$th sample belongs to the $k$th category. The CP matrix $\mathbf{P}_u = [\mathbf{p}_1; \ldots; \mathbf{p}_u] \in \mathbb{R}^{u \times c}$ of the unlabeled samples can be obtained by label propagation from the given labeled samples, and $\mathbf{P}_l = [\mathbf{p}_1; \ldots; \mathbf{p}_l] \in \mathbb{R}^{l \times c}$ is the corresponding CP matrix of the given labeled samples. Therefore, for any two samples $i$ and $j$, the probability of belonging to the same class is denoted as
$P_{ij} = \begin{cases} 1, & i = j \\ \mathbf{p}_i \mathbf{p}_j^{\top}, & i \neq j \end{cases}.$  (21)
The CP matrix $\mathbf{P}$ can be divided into four blocks and denoted as
$\mathbf{P} = \begin{bmatrix} \mathbf{P}_{ll} & \mathbf{P}_{lu} \\ \mathbf{P}_{ul} & \mathbf{P}_{uu} \end{bmatrix},$  (22)
where $\mathbf{P}_{ll}$ is the same-class probability matrix of the labeled data, $\mathbf{P}_{uu}$ is the same-class probability matrix of the unlabeled data, and $\mathbf{P}_{lu}$ and $\mathbf{P}_{ul}$ are the same-class probability matrices between the labeled and unlabeled data. By calculating the index of the entry with the maximum probability in each row of $\mathbf{P}_{lu}$, the most similar labeled sample can be obtained for each unlabeled sample. Therefore, the pseudo-labels of the unlabeled data can be obtained and expressed as
$p_{ik} = \max\left( \mathbf{p}_i \right) \;\Rightarrow\; y_i^u = y_k^l.$  (23)
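The following sketch mimics the CP computation and pseudo-labeling, assuming one-hot labels. The paper solves the non-negative sparse coding of (19) with ADMM; this sketch substitutes SciPy's non-negative least squares (nnls), whose non-negativity constraint already produces sparse coefficients, so the ℓ1 term is omitted.

```python
import numpy as np
from scipy.optimize import nnls

def class_probability_pseudo_labels(Xl, Yl, Xu):
    """Class probability vectors (Eq. (20)) and pseudo-labels (Eq. (23)).

    Xl: (l, b) labeled fused samples, Yl: (l, c) one-hot labels,
    Xu: (u, b) unlabeled fused samples.
    """
    P_u = np.zeros((Xu.shape[0], Yl.shape[1]))
    for i, x in enumerate(Xu):
        sigma, _ = nnls(Xl.T, x)          # non-negative coding of x on the labeled set
        if sigma.sum() > 0:
            sigma /= sigma.sum()          # normalize so the probabilities sum to one
        P_u[i] = sigma @ Yl               # Eq. (20): p_i = sigma^T Y_l
    pseudo = P_u.argmax(axis=1)           # Eq. (23): most probable class index
    return P_u, pseudo
```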

2.3. Weighted Broad Learning System

Global–local fused features are introduced into the BLS model as weights to construct a WBLS. Using the WBLS to expand the breadth of the fused features further enhances the feature representation of the data. Given the adaptively fused HSI data $\mathbf{X}^F = [\mathbf{X}_l^F; \mathbf{X}_u^F] \in \mathbb{R}^{N \times b}$, the labels $\mathbf{Y}^* = [\mathbf{Y}_l; \mathbf{Y}_u]$ can be computed through the CP structure. The model uses randomly generated weights $\mathbf{W}^M$ and bias values $\boldsymbol{\beta}^M$ to map $\mathbf{X}^F$ to the newly expanded MF, and we have
$\mathbf{U}_i = \mathbf{X}^F \mathbf{W}_i^M + \boldsymbol{\beta}_i^M, \quad i = 1, 2, \ldots, G_M,$  (24)
where $G_M$ is the number of feature groups in the MF, $\mathbf{W}^M = [\mathbf{W}_1^M, \ldots, \mathbf{W}_{G_M}^M]$ are the weights, $\boldsymbol{\beta}^M = [\boldsymbol{\beta}_1^M, \ldots, \boldsymbol{\beta}_{G_M}^M]$ are the bias values, and $\mathbf{U}_i$ denotes the $i$th group of MFs. $\mathbf{W}^M$ is obtained via a sparse autoencoder. Subsequently, the obtained MFs are mapped to ENs through selected functions to further expand the feature breadth. We have
$\mathbf{H}_j = \phi_j\left( \mathbf{U}\mathbf{W}_j^E + \boldsymbol{\beta}_j^E \right), \quad j = 1, 2, \ldots, G_E,$  (25)
where $\phi_j(\cdot)$ denotes the selected nonlinear function, $G_E$ is the number of nodes in the EN, $\mathbf{W}^E = [\mathbf{W}_1^E, \ldots, \mathbf{W}_{G_E}^E]$ are the weights, and $\boldsymbol{\beta}^E = [\boldsymbol{\beta}_1^E, \ldots, \boldsymbol{\beta}_{G_E}^E]$ are the bias values. Finally, the MFs and ENs are combined and passed to the output layer, and the output is denoted as
$\mathbf{Y}^* = \left[ \mathbf{U} \,|\, \mathbf{H} \right]\mathbf{W},$  (26)
where $\mathbf{W}$ denotes the weights of the output layer. The fused global–local matrix $\mathbf{F}_u$ is added to the BLS as a weight to construct the objective function of the WBLS. We have
$\arg\min_{\mathbf{W}} \left\| \mathbf{F}_u \left( \left[ \mathbf{U} \,|\, \mathbf{H} \right]\mathbf{W} - \mathbf{Y}^* \right) \right\|_2^2 + \xi \left\| \mathbf{W} \right\|_2^2,$  (27)
where $\xi$ denotes a regularization parameter. Equation (27) can be optimized using ridge regression theory. We have
$\mathbf{W} = \left( \xi\mathbf{I} + \left[ \mathbf{U} \,|\, \mathbf{H} \right]^{\top} \mathbf{F}_u^{\top}\mathbf{F}_u \left[ \mathbf{U} \,|\, \mathbf{H} \right] \right)^{-1} \left[ \mathbf{U} \,|\, \mathbf{H} \right]^{\top} \mathbf{F}_u^{\top} \mathbf{Y}^*.$  (28)
The prediction of the WBLS can then be calculated as follows:
$\mathbf{Y} = \left[ \mathbf{U} \,|\, \mathbf{H} \right]\mathbf{W}.$  (29)
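Below is a minimal dense sketch of (24)–(29), assuming the fused weight matrix F_u, the fused samples, and the true-plus-pseudo labels are available. The mapped-feature weights are kept random rather than learned with a sparse autoencoder, tanh is an assumed choice for the enhancement nonlinearity, and the hyperparameter values are illustrative.

```python
import numpy as np

def train_wbls(XF, Y, Fu, n_groups=30, n_maps=30, n_enh=400, xi=1e-3, seed=0):
    """Weighted broad learning system sketch following Eqs. (24)-(29)."""
    rng = np.random.default_rng(seed)
    N, b = XF.shape
    # Mapped features: G_M groups of random linear mappings, Eq. (24).
    U = np.hstack([XF @ rng.standard_normal((b, n_maps)) + rng.standard_normal(n_maps)
                   for _ in range(n_groups)])
    # Enhancement nodes: nonlinear mapping of all mapped features, Eq. (25).
    H = np.tanh(U @ rng.standard_normal((U.shape[1], n_enh)) + rng.standard_normal(n_enh))
    A = np.hstack([U, H])                                  # [U | H]
    FA = Fu @ A                                            # fusion-weighted features
    # Weighted ridge regression for the output weights, Eq. (28).
    W = np.linalg.solve(xi * np.eye(A.shape[1]) + FA.T @ FA, FA.T @ Y)
    return W, A @ W                                        # output weights and Eq. (29) predictions
```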
AGLFF uses fused global–local features to achieve data sample smoothing. The pseudo-labels corresponding to unlabeled samples are calculated via the CP structure to effectively utilize HSI unlabeled data. Moreover, the fused features are added to the BLS as weights to enhance feature representation. Algorithm 2 summarizes the AGLFF method.
Algorithm 2 AGLFF Method
Input: Adaptive fused data $\mathbf{X}^F$.
(1) Obtain the class probability matrix $\mathbf{P}$ according to (21).
(2) Obtain the pseudo-labels $\mathbf{Y}_u$ of the unlabeled data $\mathbf{X}_u^F$ according to (23).
(3) Obtain the MF features $\mathbf{U}$ according to (24) and calculate the EN features $\mathbf{H}$ according to (25).
(4) Obtain the output weights $\mathbf{W}$ of the WBLS according to (28).
(5) Obtain the predictive labels $\mathbf{Y}$ of AGLFF according to (29).
Output: Predictive labels $\mathbf{Y}$.

3. Experiments and Analysis

The performance of the AGLFF method was assessed on three real HSI datasets, namely, Indian Pines (IP), Kennedy Space Center (KSC), and Pavia University (PU). The average accuracy (AA), overall accuracy (OA), consumed time (T, s), kappa coefficient, and per-category accuracy were used to evaluate the results, and the reported results were averaged over 10 repeated experiments. The experiments were run with PyTorch and MATLAB R2016a.

3.1. HSI Datasets

The IP dataset was acquired over an agricultural area in northwestern Indiana, USA, and includes 145 × 145 pixels, 16 categories, and 200 spectral bands. Figure 2a,b shows the false-color image and the ground-truth map.
The KSC dataset was acquired over the Kennedy Space Center, Florida, and includes 512 × 614 pixels, 13 categories, and 176 spectral bands. Figure 3a,b shows the false-color image and the ground-truth map.
The PU dataset was acquired over a university campus in Pavia, northern Italy, and includes 610 × 340 pixels, 9 categories, and 103 spectral bands. Figure 4a,b shows the false-color image and the ground-truth map.
The sample selection for the IP, KSC, and PU datasets is presented in Table 1. For all methods, 30 samples of each class were chosen as labeled samples (denoted n.l.s.), and the remainder were treated as unlabeled samples (denoted n.u.s.). Because the grass-pasture-mowed and oats classes in the IP dataset contain few samples, only 15 labeled samples were selected for each of these classes, and the remainder were unlabeled samples.

3.2. Comparative Experiments

To verify the superiority of the proposed AGLFF, the following eleven comparison methods were adopted: traditional methods (SVM [43] and ELM [44]), superpixel methods (SuperPCA [45] and S3PCA [46]), broad learning methods (BLS [30], SBLS [47], and AGLFF1), deep methods (GCN [48] and GCGCN [39]), and graph-based semi-supervised methods (NSCKL [49] and XPGN [50]). AGLFF1 is the AGLFF model without the WBLS, containing only the regular BLS; the hyperparameter values of AGLFF1 and AGLFF were the same. The parameters of the compared methods were set via grid search. To demonstrate the impact of the fused features on HSI classification, the classification results of the global, local, and global–local fused features and their corresponding models were also compared and analyzed. Figure 2, Figure 3 and Figure 4 and Table 2, Table 3 and Table 4 show that:
  • The classification results of AGLFF outperform those of the other methods because the model achieves a consistent fusion of global and local features. In addition, semi-supervised classification is performed using abundant unlabeled data, and the fused features are added to the BLS model as weights, making the features smoother and yielding higher classification accuracy. GCGCN is better than GCN because it uses an efficient GCN variant that captures rich global spectral–spatial features. SBLS uses more unlabeled information for semi-supervised classification through the BLS and obtains relatively good results. The GCN model has the worst classification results because it processes only the spectral information. Except for AGLFF and GCGCN, AGLFF1 has the best classification results owing to the use of global–local fused features, achieving higher classification accuracy in several classes of all three datasets. Hence, the proposed AGLFF outperforms the other methods by using the global–local fused features and introducing them into the BLS model as weights, with an OA value of 96.11% and a kappa value of 95.23% on the IP dataset.
  • The ELM, SVM, BLS, and SuperPCA models consume the least time. Except for ELM, SuperPCA, and SVM, BLS is the fastest model, mainly because the BLS model is relatively simple and its parameters can be calculated from an inverse matrix. The GCGCN and GCN deep methods consume the most time. XPGN takes longer than AGLFF because it uses three branch models to acquire information at various scales, which lengthens training. NSCKL takes less time than AGLFF due to its relatively simple structure. AGLFF takes neither the most nor the least time because it spends some time fusing the global–local features and calculating the class probability matrix between samples. However, AGLFF has the best classification performance and can realize feature smoothing of intra-class data while increasing the discriminability of inter-class data.
  • The accuracy obtained by all methods on the IP dataset is relatively low because some classes differ only slightly; for example, corn-mintill and corn-notill are hard to distinguish and thus more difficult to classify. All methods yield better and less time-consuming results on the KSC dataset because it contains fewer samples per class and less inter-class similarity, making the categories easier to distinguish. AGLFF achieves the best results on the KSC dataset, with an OA value of 99.26% and a kappa value of 99.09%. For the proposed AGLFF, misclassification appears only in Class 13 (water), and the remaining classes are classified correctly. This further illustrates the advantages of AGLFF.

3.3. Parameter Analysis

3.3.1. Semi-Supervised Label Ratio

To evaluate the performance of AGLFF, different ratios of labeled data were selected from each class of the datasets for model training. Figure 5 shows the classification results acquired by seven methods with different numbers of labeled samples on the three datasets. According to the OAs of the different methods, the classification results of all methods improve significantly as the number of labeled samples increases. AGLFF outperforms the other methods, particularly with 20% and 40% labeled samples per class. The proposed method maintains a clear advantage, indicating that AGLFF has a more stable classification effect.

3.3.2. Analysis of Construction Graph

To further verify the superiority of the global–local fused feature model, its performance was compared with the global and local feature models on the different datasets. The classification results of the different feature models in Figure 6 show that the accuracy of the fused features is higher than that of the global and local features for any number of labeled samples on the three datasets. AGLFF performs best, while the local feature model performs relatively poorly, because AGLFF can extract better inter-class discriminative information by searching a large range of intra-class neighbor samples to obtain more consistent information.

3.3.3. Parameter Settings and Analysis

We first analyze the parameters $\lambda$, $\lambda_1$, $\theta$, $\theta_1$, $\theta_2$, $\rho$, $B$, and $C$ involved in the model to evaluate the performance of AGLFF. As parameters $\lambda$, $\lambda_1$, $\theta$, $\theta_1$, and $\theta_2$ contribute equally to the model, and $B$ and $C$ are complementary parameters that sum to 1, for simplicity only the effects of $\lambda$, $\theta$, $\rho$, and $C$ on the model performance were investigated. Parameters $\lambda$, $\theta$, and $\rho$ are within the range $[0.001, 50]$. Figure 7a–c shows that the OAs on the three datasets vary with $\lambda$ and $\theta$; the OAs reach their maximum when $\lambda$ is 30 and $\theta$ is 10. Therefore, parameters $\lambda$ and $\lambda_1$ are set to 30, while $\theta$, $\theta_1$, and $\theta_2$ are set to 10. Figure 7d shows that, as parameter $\rho$ increases gradually from 0.001, the OAs first increase and then decrease, reaching the maximum when $\rho$ is 0.1. Thus, $\rho$ is set to 0.1. Parameter $C$ is considered in the range $[0, 1]$. Figure 7e shows how the classification accuracy varies with $C$. When $C$ is 0, only the local feature is used, and the OA is at its minimum; when $C$ is 1, only the global feature is used. The $C$ obtained by adaptive fusion is 0.8, at which the classification accuracy is highest, indicating the superior performance of the adaptive fusion method.
The number of superpixels is also an important parameter affecting the performance of AGLFF. The number of superpixels $M$ obtained by SLIC segmentation is varied in the range $[100, 1900]$. Figure 7f shows that the OAs gradually increase with the number of superpixels $M$ and become relatively large between 1100 and 1500. When $M$ increases beyond 1500, the OAs decrease, indicating that an excessively large number of superpixels over-segments the HSI and worsens the classification. Therefore, the number of superpixels $M$ is set to 1500 for all datasets.
Subsequently, we set the number of groups $G_M$ in the MF, the number of nodes $d_M$ in the MF, and the number of nodes $G_E$ in the EN of the BLS. Here, the number of groups $G_M$ in the MF is set equal to the number of nodes $d_M$ in the MF. Figure 8 shows that the OAs increase as $G_E$ and $G_M$ grow and finally reach saturation. The performance of AGLFF gradually improves as $G_E$ and $G_M$ become larger, but excessively large values also lead to an overly complex model. Therefore, the corresponding $G_M$ in the MF and $G_E$ in the EN on the three datasets are set to 30 and 500, 20 and 500, and 40 and 400, respectively.

3.4. Ablation Studies

To demonstrate the efficiency of the selected method, ablation studies were performed on the local features (LFs), global features (GFs), global–local fused features (FFs), and the WBLS; the detailed results are given in Table 5 and Table 6. Compared with AGLFF-A and AGLFF-B, AGLFF achieves higher classification performance, indicating that the global–local fused feature can realize feature smoothing of intra-class data and increase the discriminability of inter-class data. Compared with AGLFF-C, AGLFF also achieves higher classification performance, which indicates that the WBLS model can expand the breadth of the fused features to further enhance the expressiveness of the data. Overall, a thorough study of the experimental results with different compositions further validates the efficiency of our model.

4. Conclusions

To effectively utilize the spatial–spectral information of HSI, an AGLFF classification method is proposed. The adaptive fusion of global high-order and local data realizes feature smoothing of intra-class data and increases the discriminability of inter-class data. The adaptive method automatically learns the weight parameters of the global high-order and local data, which reduces the number of parameters to be tuned. The probabilistic relationship between the fused features and the categories is then calculated by the CP structure, and the pseudo-labels corresponding to the unlabeled data are obtained. Moreover, the fused features are introduced into the BLS model as weights, and the broad expansion of the features using the WBLS increases the discriminability of the data. The validation results on three datasets show that the AGLFF method extracts the consistent features of the sample data better and obtains the best classification results compared with other methods. In the future, we will consider adding an adaptive weighting strategy to the BLS model to further improve its performance.

Author Contributions

All of the authors provided significant contributions to the work. C.Y. and Y.C. designed the experiments; C.Y., Y.K. and X.W. performed the experiments; C.Y., Y.K. and Y.C. analyzed the data; C.Y., Y.K. and Y.C. wrote the paper; X.W. reviewed and edited the paper. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the National Natural Science Foundation of China under Grant 62176259 and Grant 61976215. This research was funded by the Key Research and Development Program of Jiangsu Province under Grant BE2022095. This research was also funded by the Key University Natural Science Research Program of Anhui Province under Grant KJ2021A1119.

Data Availability Statement

Publicly available datasets were analyzed in this study, which were obtained from: http://www.ehu.eus/ccwintco/index.php?title=Hyperspectral_Remote_Sensing_Scenes (accessed on 6 October 2023).

Acknowledgments

The authors would like to thank the editors and anonymous reviewers for their detailed and constructive suggestions.

Conflicts of Interest

The authors declare no conflicts of interest.

Abbreviations

The following abbreviations are used in this manuscript:
HSI: Hyperspectral Image
AGLFF: Adaptive Global–Local Feature Fusion
SVM: Support Vector Machine
PCA: Principal Component Analysis
DL: Deep Learning
SAE: Stacked Autoencoder
CNN: Convolutional Neural Network
BLS: Broad Learning System
MF: Mapped Features
EN: Enhancement Nodes
SSL: Semi-supervised Learning
CP: Class Probability
WBLS: Weighted Broad Learning System
SLIC: Simple Linear Iterative Clustering
IP: Indian Pines
KSC: Kennedy Space Center
PU: Pavia University
AA: Average Accuracy
OA: Overall Accuracy

References

  1. Yu, C.; Zhou, S.; Song, M.; Chang, C. Semisupervised hyperspectral band selection based on dual-constrained low-rank representation. IEEE Geosci. Remote Sens. Lett. 2022, 19, 5503005. [Google Scholar] [CrossRef]
  2. Cheng, Y.; Chen, Y.; Kong, Y.; Wang, X. Soft instance-level domain adaptation with virtual classifier for unsupervised hyperspectral image classification. IEEE Trans. Geosci. Remote Sens. 2023, 61, 5509013. [Google Scholar] [CrossRef]
  3. Prades, J.; Safont, G.; Salazar, A.; Vergara, L. Estimation of the Number of Endmembers in Hyperspectral Images Using Agglomerative Clustering. Remote Sens. 2020, 12, 3585. [Google Scholar] [CrossRef]
  4. Wang, H.; Cheng, Y.; Wang, X. A Novel Hyperspectral Image Classification Method Using Class-Weighted Domain Adaptation Network. Remote Sens. 2023, 15, 999. [Google Scholar] [CrossRef]
  5. Wang, H.; Wang, X.; Cheng, Y. Graph meta transfer network for heterogeneous few-shot hyperspectral image classification. IEEE Trans. Geosci. Remote Sens. 2021, 61, 5501112. [Google Scholar] [CrossRef]
  6. Kong, Y.; Wang, X.; Cheng, Y.; Chen, C.L.P. Multi-stage convolutional broad learning with block diagonal constraint for hyperspectral image classification. Remote Sens. 2021, 13, 3412. [Google Scholar] [CrossRef]
  7. Ham, J.; Chen, Y.; Crawford, M.; Ghosh, J. Investigation of the random forest framework for classification of hyperspectral data. IEEE Trans. Geosci. Remote Sens. 2005, 43, 492–501. [Google Scholar] [CrossRef]
  8. Melgani, F.; Bruzzone, L. Classification of hyperspectral remote sensing images with support vector machines. IEEE Trans. Geosci. Remote Sens. 2004, 42, 1778–1790. [Google Scholar] [CrossRef]
  9. Salazar, A.; Safont, G.; Vergara, L.; Vidal, E. Graph Regularization Methods in Soft Detector Fusion. IEEE Access 2023, 11, 144747–144759. [Google Scholar] [CrossRef]
  10. Sun, W.; Du, Q. Graph-regularized fast and robust principal component analysis for hyperspectral band selection. IEEE Trans. Geosci. Remote Sens. 2018, 56, 3185–3195. [Google Scholar] [CrossRef]
  11. Villa, A.; Benediktsson, J.; Chanussot, J.; Jutten, C. Hyperspectral image classification with independent component discriminant analysis. IEEE Trans. Geosci. Remote Sens. 2011, 49, 4865–4876. [Google Scholar] [CrossRef]
  12. Shi, C.; Sun, J.; Wang, T.; Wang, L. Hyperspectral Image Classification Based on a 3D Octave Convolution and 3D Multiscale Spatial Attention Network. Remote Sens. 2023, 15, 257. [Google Scholar] [CrossRef]
  13. Liu, W.; Liu, B.; He, P.; Hu, Q.; Gao, K.; Li, H. Masked Graph Convolutional Network for Small Sample Classification of Hyperspectral Images. Remote Sens. 2023, 15, 1869. [Google Scholar] [CrossRef]
  14. Pan, C.; Gao, X.; Wang, Y.; Li, J. Markov random fields integrating adaptive interclass-pair penalty and spectral similarity for hyperspectral image classification. IEEE Trans. Geosci. Remote Sens. 2019, 57, 2520–2534. [Google Scholar] [CrossRef]
  15. Ghamisi, P.; Benediktsson, J.; Ulfarsson, M. Spectral–spatial classification of hyperspectral images based on hidden markov random fields. IEEE Trans. Geosci. Remote Sens. 2014, 52, 2565–2574. [Google Scholar] [CrossRef]
  16. Lu, T.; Li, S.; Fang, L.; Jia, X.; Benediktsson, J. From subpixel to superpixel: A novel fusion framework for hyperspectral image classification. IEEE Trans. Geosci. Remote Sens. 2017, 55, 4398–4411. [Google Scholar] [CrossRef]
  17. Cai, Y.; Zhang, Z.; Ghamisi, P.; Ding, Y.; Liu, X.; Cai, Z.; Gloaguen, R. Superpixel contracted neighborhood contrastive subspace clustering network for hyperspectral images. IEEE Trans. Geosci. Remote Sens. 2022, 60, 5530113. [Google Scholar] [CrossRef]
  18. Wang, K.; Wang, X.; Zhang, T.; Cheng, Y. Few-shot learning with deep balanced network and acceleration strategy. Int. J. Mach. Learn Cybern. 2022, 13, 133–144. [Google Scholar] [CrossRef]
  19. Chen, Y.; Zhao, X.; Jia, X. Spectral–spatial classification of hyperspectral data based on deep belief network. IEEE J. Sel. Topics Appl. Earth Observ. Remote Sens. 2015, 8, 2381–2392. [Google Scholar] [CrossRef]
  20. Chen, Y.; Lin, Z.; Zhao, X.; Wang, G.; Gu, Y. Deep learning-based classification of hyperspectral data. IEEE J. Sel. Top. Appl. Earth Observ. Remote Sens. 2014, 7, 2094–2107. [Google Scholar] [CrossRef]
  21. Cai, Y.; Liu, X.; Cai, Z. BS-Nets: An end-to-end framework for band selection of hyperspectral image. IEEE Trans. Geosci. Remote Sens. 2020, 58, 1969–1984. [Google Scholar] [CrossRef]
  22. Mei, S.; Ji, J.; Hou, J.; Li, X.; Du, Q. Learning sensor-specific spatial-spectral features of hyperspectral images via convolutional neural networks. IEEE Trans. Geosci. Remote Sens. 2017, 55, 4520–4533. [Google Scholar] [CrossRef]
  23. Kong, Y.; Wang, X.; Cheng, Y. Spectral–spatial feature extraction for HSI classification based on supervised hypergraph and sample expanded CNN. IEEE J. Sel. Top. Appl. Earth Observ. Remote Sens. 2018, 11, 4128–4140. [Google Scholar] [CrossRef]
  24. Chen, Y.; Jiang, H.; Li, C.; Jia, X.; Ghamisi, P. Deep feature extraction and classification of hyperspectral images based on convolutional neural networks. IEEE Trans. Geosci. Remote Sens. 2016, 54, 6232–6251. [Google Scholar] [CrossRef]
  25. Liu, B.; Yu, X.; Zhang, P.; Yu, A.; Fu, Q.; Wei, X. Supervised deep feature extraction for hyperspectral image classification. IEEE Trans. Geosci. Remote Sens. 2018, 56, 1909–1921. [Google Scholar] [CrossRef]
  26. Yang, X.; Ye, Y.; Li, X.; Lau, R.Y.; Zhang, X.; Huang, X. Hyperspectral image classification with deep learning models. IEEE Trans. Geosci. Remote Sens. 2018, 56, 5408–5423. [Google Scholar] [CrossRef]
  27. Mou, L.; Ghamisi, P.; Zhu, X. Unsupervised spectral–spatial feature learning via deep residual conv–deconv network for hyperspectral image classification. IEEE Trans. Geosci. Remote Sens. 2018, 56, 391–406. [Google Scholar] [CrossRef]
  28. Kong, Y.; Wang, X.; Cheng, Y.; Chen, Y.; Chen, C.L.P. Graph domain adversarial network with dual-weighted pseudo-label loss for hyperspectral image classification. IEEE Geosci. Remote Sens. Lett. 2022, 19, 6005105. [Google Scholar] [CrossRef]
  29. Ding, Y.; Pan, S.; Chong, Y. Robust spatial-spectral block-diagonal structure representation with fuzzy class probability for hyperspectral image classification. IEEE Trans. Geosci. Remote Sens. 2020, 58, 1747–1762. [Google Scholar] [CrossRef]
  30. Chen, C.L.P.; Liu, Z. Broad learning system: An effective and efficient incremental learning system without the need for deep architecture. IEEE Trans. Neural Netw. Learn Syst. 2018, 29, 10–24. [Google Scholar] [CrossRef]
  31. Jin, J.; Chen, C.L.P. Regularized robust broad learning system for uncertain data modeling. Neurocomputing 2018, 322, 58–69. [Google Scholar] [CrossRef]
  32. Kong, Y.; Cheng, Y.; Chen, C.L.P.; Wang, X. Hyperspectral image clustering based on unsupervised broad learning. IEEE Geosci. Remote Sens. Lett. 2019, 16, 1741–1745. [Google Scholar] [CrossRef]
  33. Wang, H.; Wang, X.; Chen, C.L.P.; Cheng, Y. Hyperspectral image classification based on domain adaptation broad learning. IEEE J. Sel. Top. Appl. Earth Observ. Remote Sens. 2020, 13, 3006–3018. [Google Scholar] [CrossRef]
  34. Camps-Valls, G.; Marsheva, T.V.B.; Zhou, D. Semi-supervised graph-based hyperspectral image classification. IEEE Trans. Geosci. Remote Sens. 2007, 45, 3044–3054. [Google Scholar] [CrossRef]
  35. Zhang, Y.; Cao, G.; Shafique, A.; Fu, P. Label propagation ensemble for hyperspectral image classification. IEEE J. Sel. Top. Appl. Earth Observ. Remote Sens. 2019, 12, 3623–3636. [Google Scholar] [CrossRef]
  36. De Morsier, F.; Borgeaud, M.; Gass, V.; Thiran, J.-P.; Tuia, D. Kernel low-rank and sparse graph for unsupervised and semi-supervised classification of hyperspectral images. IEEE Trans. Geosci. Remote Sens. 2016, 54, 3410–3420. [Google Scholar] [CrossRef]
  37. Ma, J.; Chow, T.W. Robust non-negative sparse graph for semi-supervised multi-label learning with missing labels. Inf. Sci. 2018, 422, 336–351. [Google Scholar] [CrossRef]
  38. Shao, Y.; Sang, N.; Gao, C.; Ma, L. Spatial and class structure regularized sparse representation graph for semi-supervised hyperspectral image classification. Pattern Recognit. 2018, 81, 81–94. [Google Scholar] [CrossRef]
  39. Ding, Y.; Guo, Y.; Chong, Y.; Pan, S.; Feng, J. Global consistent graph convolutional network for hyperspectral image classification. IEEE Trans. Instrum. Meas. 2021, 70, 5501516. [Google Scholar] [CrossRef]
  40. Achanta, R.; Shaji, A.; Smith, K.; Lucchi, A.; Fua, P.; Süsstrunk, S. SLIC superpixels compared to state-of-the-art superpixel methods. IEEE Trans. Pattern Anal. Machine Intell. 2012, 34, 2274–2282. [Google Scholar] [CrossRef]
  41. Nie, F.; Wang, X.; Huang, H. Clustering and projected clustering with adaptive neighbors. In Proceedings of the 20th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD), New York, NY, USA, 24–27 August 2014; pp. 977–986. [Google Scholar]
  42. Lin, Z.; Liu, R.; Su, Z. Linearized alternating direction method with adaptive penalty for low-rank representation. In Proceedings of the Advances in Neural Information Processing Systems, Granada, Spain, 20 September 2011; pp. 612–620. [Google Scholar]
  43. Wu, Y.; Yang, X.; Plaza, A.; Qiao, F.; Gao, L.; Zhang, B.; Cui, Y. Approximate computing of remotely sensed data: SVM hyperspectral image classification as a case study. IEEE J. Sel. Topics Appl. Earth Observ. Remote Sens. 2016, 9, 5806–5818. [Google Scholar] [CrossRef]
  44. Zhai, H.; Zhang, H.; Zhang, L.; Li, P.; Plaza, A. A new sparse subspace clustering algorithm for hyperspectral remote sensing imagery. IEEE Geosci. Remote Sens. Lett. 2017, 14, 43–47. [Google Scholar] [CrossRef]
  45. Jiang, J.; Ma, J.; Chen, C.; Wang, Z.; Cai, Z.; Wang, L. SuperPCA: A superpixelwise PCA approach for unsupervised feature extraction of hyperspectral imagery. IEEE Trans. Geosci. Remote Sens. 2018, 56, 4581–4593. [Google Scholar] [CrossRef]
  46. Zhang, X.; Jiang, X.; Jiang, J.; Zhang, Y.; Liu, X.; Cai, Z. Spectral—Spatial and superpixelwise PCA for unsupervised feature extraction of hyperspectral imagery. IEEE Trans. Geosci. Remote Sens. 2022, 60, 5502210. [Google Scholar] [CrossRef]
  47. Kong, Y.; Wang, X.; Cheng, Y.; Chen, C.L.P. Hyperspectral imagery classification based on semi-supervised broad learning system. Remote Sens. 2018, 10, 685. [Google Scholar] [CrossRef]
  48. Kipf, T.N.; Welling, M. Semi-supervised classification with graph convolutional networks. arXiv 2016, arXiv:1609.02907. [Google Scholar] [CrossRef]
  49. Su, Y.; Gao, L.; Jiang, M.; Plaza, A.; Sun, X.; Zhang, B. NSCKL: Normalized Spectral Clustering With Kernel-Based Learning for Semisupervised Hyperspectral Image Classification. IEEE Trans. Cybern. 2023, 53, 6649–6662. [Google Scholar] [CrossRef]
  50. Xi, B.; Li, J.; Li, Y.; Song, R.; Xiao, Y.; Du, Q.; Chanussot, J. Semi-supervised Cross-scale Graph Prototypical Network for Hyperspectral Image Classification. IEEE Trans. Neural Netw. Learn. Syst. 2023, 34, 9337–9351. [Google Scholar] [CrossRef]
Figure 1. Framework of the AGLFF model.
Figure 2. Classification maps from the IP dataset. (a) False-color image. (b) Ground-truth map. (c) SVM. (d) ELM. (e) SuperPCA. (f) S3PCA. (g) BLS. (h) SBLS. (i) GCN. (j) GCGCN. (k) NSCKL. (l) XPGN. (m) AGLFF1. (n) AGLFF.
Figure 3. Classification maps from the KSC dataset. (a) False-color image. (b) Ground-truth map. (c) SVM. (d) ELM. (e) SuperPCA. (f) S3PCA. (g) BLS. (h) SBLS. (i) GCN. (j) GCGCN. (k) NSCKL. (l) XPGN. (m) AGLFF1. (n) AGLFF.
Figure 4. Classification maps from the PU dataset. (a) False-color image. (b) Ground-truth map. (c) SVM. (d) ELM. (e) SuperPCA. (f) S3PCA. (g) BLS. (h) SBLS. (i) GCN. (j) GCGCN. (k) NSCKL. (l) XPGN. (m) AGLFF1. (n) AGLFF.
Figure 5. Classification performance of different models with different label ratios. (a) IP. (b) KSC. (c) PU.
Figure 6. Classification performance of the fused, global, and local features. (a) IP. (b) KSC. (c) PU.
Figure 7. Classification performance of the AGLFF model with different parameters. (a) OA versus parameters λ and θ on IP. (b) OA versus parameters λ and θ on KSC. (c) OA versus parameters λ and θ on PU. (d) OA versus parameter ρ on three datasets. (e) OA versus parameter C on three datasets. (f) OA versus parameter M on three datasets.
Figure 8. OA versus $G_E$ and $G_M$. (a) IP. (b) KSC. (c) PU.
Table 1. Description of unlabeled and labeled samples in the IP, KSC, and PU datasets. “n.l.s.” represents labeled samples, and “n.u.s.” represents unlabeled samples.
Class | IP: Surface Object (n.l.s. / n.u.s.) | KSC: Surface Object (n.l.s. / n.u.s.) | PU: Surface Object (n.l.s. / n.u.s.)
1 | Alfalfa (30 / 16) | Scrub (30 / 731) | Asphalt (30 / 6601)
2 | Corn-notill (30 / 1398) | Willow swamp (30 / 213) | Meadows (30 / 18,619)
3 | Corn-mintill (30 / 800) | Cabbage palm hammock (30 / 226) | Gravel (30 / 2069)
4 | Corn (30 / 207) | Slash pine (30 / 222) | Trees (30 / 3034)
5 | Grass-pasture (30 / 453) | Oak/broadleaf (30 / 131) | Painted metal sheets (30 / 1345)
6 | Grass-trees (30 / 700) | Hardwood (30 / 199) | Bare soil (30 / 4999)
7 | Grass-pasture-mowed (15 / 13) | Swamp (30 / 75) | Bitumen (30 / 1300)
8 | Hay-windrowed (30 / 448) | Graminoid marsh (30 / 401) | Self-blocking bricks (30 / 3652)
9 | Oats (15 / 5) | Spartina marsh (30 / 490) | Shadows (30 / 917)
10 | Soybean-notill (30 / 942) | Cattail marsh (30 / 374) |
11 | Soybean-mintill (30 / 2425) | Salt marsh (30 / 389) |
12 | Soybean-clean (30 / 563) | Mud flats (30 / 473) |
13 | Wheat (30 / 175) | Water (30 / 897) |
14 | Woods (30 / 1235) | |
15 | Buildings-grass-trees-drives (30 / 356) | |
16 | Stone-steel-towers (30 / 63) | |
Table 2. Classification performance of different methods on the IP dataset.
Class | SVM | ELM | SuperPCA | S3PCA | BLS | SBLS | GCN | GCGCN | NSCKL | XPGN | AGLFF1 | AGLFF
1 | 20.48 | 35.98 | 100 | 100 | 100 | 100 | 95.00 | 100 | 90.39 | 98.13 | 98.96 | 96.88
2 | 55.49 | 63.72 | 92.65 | 91.82 | 64.89 | 92.17 | 56.71 | 91.15 | 94.69 | 88.80 | 90.53 | 90.79
3 | 42.77 | 43.67 | 96.28 | 93.52 | 55.76 | 94.88 | 51.50 | 92.53 | 89.38 | 96.11 | 96.08 | 96.21
4 | 36.5 | 36.01 | 88.41 | 96.20 | 51.82 | 99.90 | 84.64 | 99.95 | 87.05 | 99.47 | 99.92 | 99.89
5 | 78.45 | 82.96 | 95.14 | 96.03 | 86.36 | 94.08 | 83.71 | 97.04 | 93.71 | 97.09 | 96.91 | 99.40
6 | 91.33 | 94.22 | 97.14 | 97.14 | 88.87 | 99.80 | 94.03 | 96.79 | 98.19 | 99.57 | 99.36 | 99.74
7 | 39.80 | 36.61 | 92.86 | 92.86 | 83.16 | 98.57 | 92.31 | 96.15 | 99.87 | 99.16 | 98.81 | 99.92
8 | 98.93 | 99.29 | 99.55 | 100 | 90.87 | 99.82 | 96.61 | 100 | 100 | 100 | 100 | 100
9 | 30.76 | 29.28 | 100 | 100 | 100 | 100 | 100 | 100 | 73.68 | 99.71 | 63.33 | 100
10 | 47.49 | 57.41 | 89.52 | 90.70 | 63.64 | 86.88 | 77.47 | 91.51 | 95.58 | 95.15 | 95.56 | 91.43
11 | 71.92 | 74.56 | 93.73 | 95.31 | 81.70 | 89.69 | 56.56 | 95.38 | 93.24 | 88.67 | 90.20 | 94.68
12 | 53.36 | 52.18 | 96.67 | 97.34 | 69.82 | 97.80 | 58.29 | 95.81 | 95.92 | 96.10 | 97.13 | 98.94
13 | 89.38 | 92.81 | 99.43 | 99.43 | 90.59 | 99.43 | 100 | 99.31 | 98.02 | 99.39 | 99.62 | 97.93
14 | 93.40 | 93.97 | 90.20 | 91.55 | 90.87 | 97.23 | 80.03 | 99.26 | 99.91 | 99.44 | 99.77 | 99.10
15 | 50.86 | 57.67 | 98.53 | 98.60 | 80.16 | 99.44 | 69.55 | 97.50 | 90.65 | 98.48 | 99.91 | 93.35
16 | 85.47 | 89.58 | 97.82 | 98.41 | 91.88 | 99.05 | 98.41 | 99.52 | 92.30 | 98.17 | 98.94 | 99.38
OA (%) | 64.35 | 68.37 | 94.61 | 95.79 | 81.36 | 93.95 | 69.24 | 95.35 | 94.74 | 94.83 | 95.03 | 96.11
AA (%) | 61.65 | 64.99 | 95.49 | 96.18 | 80.02 | 96.80 | 65.27 | 96.80 | 93.29 | 97.09 | 95.31 | 97.35
Kappa (%) | 59.92 | 64.40 | 92.99 | 93.67 | 80.32 | 92.29 | 80.39 | 94.67 | 93.99 | 94.10 | 94.77 | 95.23
T (s) | 3.76 | 1.51 | 1.95 | 3.98 | 3.17 | 528.75 | 580.00 | 641.00 | 47.97 | 399.54 | 367.83 | 392.57
Table 3. Classification performance of different methods on the KSC dataset.
Class | SVM | ELM | SuperPCA | S3PCA | BLS | SBLS | GCN | GCGCN | NSCKL | XPGN | AGLFF1 | AGLFF
1 | 93.72 | 83.35 | 96.72 | 96.72 | 88.60 | 98.80 | 80.15 | 97.84 | 98.98 | 98.87 | 97.78 | 99.40
2 | 76.49 | 77.84 | 98.71 | 99.53 | 76.74 | 96.06 | 80.02 | 95.85 | 95.59 | 98.12 | 96.62 | 99.48
3 | 73.15 | 61.93 | 97.23 | 98.23 | 70.36 | 96.19 | 76.66 | 97.39 | 99.91 | 100 | 99.85 | 100
4 | 45.16 | 68.55 | 92.17 | 97.75 | 75.61 | 80.23 | 27.63 | 99.64 | 87.08 | 84.98 | 99.73 | 99.24
5 | 60.70 | 71.25 | 96.18 | 97.71 | 69.06 | 99.08 | 72.56 | 98.93 | 88.26 | 98.82 | 99.24 | 98.31
6 | 48.55 | 82.48 | 98.05 | 97.49 | 74.12 | 75.34 | 80.51 | 100 | 99.17 | 98.06 | 100 | 100
7 | 68.69 | 61.33 | 100 | 100 | 72.22 | 79.33 | 95.60 | 100 | 100 | 99.89 | 99.91 | 99.28
8 | 64.89 | 74.69 | 97.72 | 100 | 76.51 | 98.35 | 85.02 | 100 | 99.85 | 99.76 | 100 | 100
9 | 80.65 | 92.59 | 96.73 | 95.51 | 91.87 | 96.29 | 84.59 | 100 | 100 | 100 | 99.82 | 99.45
10 | 98.8 | 98.19 | 92.51 | 89.30 | 100 | 97.33 | 91.64 | 99.48 | 99.09 | 98.11 | 100 | 100
11 | 91.85 | 96.99 | 99.49 | 99.49 | 99.15 | 96.50 | 89.18 | 100 | 99.71 | 91.28 | 99.82 | 100
12 | 79.96 | 89.06 | 99.10 | 100 | 92.38 | 92.30 | 76.20 | 96.89 | 96.06 | 97.65 | 97.72 | 99.65
13 | 99.50 | 98.44 | 96.07 | 96.66 | 100 | 99.02 | 99.49 | 100 | 100 | 99.91 | 94.74 | 94.81
OA (%) | 82.19 | 85.68 | 97.09 | 97.82 | 88.38 | 93.62 | 83.72 | 99.18 | 98.29 | 96.97 | 98.28 | 99.26
AA (%) | 75.55 | 81.28 | 96.97 | 97.57 | 83.59 | 92.24 | 79.94 | 99.09 | 97.21 | 97.34 | 98.86 | 99.53
Kappa (%) | 79.68 | 84.11 | 96.43 | 96.85 | 87.01 | 92.68 | 81.89 | 98.99 | 98.09 | 96.16 | 98.08 | 99.09
T (s) | 3.35 | 1.39 | 1.51 | 14.55 | 2.03 | 243.17 | 289.28 | 356.25 | 234.17 | 254.39 | 173.13 | 186.29
Table 4. Classification performance of different methods on the PU dataset.
Class | SVM | ELM | SuperPCA | S3PCA | BLS | SBLS | GCN | GCGCN | NSCKL | XPGN | AGLFF1 | AGLFF
1 | 91.71 | 95.36 | 81.03 | 94.97 | 97.82 | 86.53 | 69.78 | 94.59 | 97.11 | 95.89 | 91.76 | 96.94
2 | 91.15 | 94.16 | 86.27 | 90.02 | 97.29 | 97.53 | 54.10 | 98.11 | 99.90 | 99.82 | 98.91 | 99.11
3 | 60.59 | 59.05 | 94.10 | 99.14 | 60.98 | 98.44 | 69.69 | 99.35 | 87.69 | 89.13 | 99.12 | 98.55
4 | 74.12 | 75.01 | 78.83 | 95.00 | 87.24 | 87.32 | 91.23 | 96.11 | 94.35 | 97.37 | 84.1 | 94.52
5 | 95.67 | 99.12 | 97.11 | 99.27 | 100 | 99.87 | 98.74 | 99.77 | 100 | 100 | 99.84 | 99.82
6 | 60.51 | 60.56 | 94.62 | 98.67 | 71.02 | 99.37 | 65.34 | 99.69 | 98.66 | 99.16 | 99.29 | 99.56
7 | 54.89 | 54.57 | 96.79 | 98.53 | 57.91 | 99.97 | 86.64 | 99.53 | 98.61 | 99.19 | 99.87 | 99.89
8 | 80.07 | 72.14 | 92.89 | 95.68 | 80.42 | 94.12 | 72.26 | 97.84 | 96.86 | 98.71 | 98.93 | 97.94
9 | 99.91 | 99.98 | 98.32 | 99.13 | 100 | 90.59 | 99.93 | 96.80 | 90.29 | 99.29 | 94.59 | 96.61
OA (%) | 80.04 | 80.98 | 91.00 | 96.24 | 87.19 | 95.09 | 66.19 | 97.71 | 97.15 | 97.09 | 96.76 | 98.01
AA (%) | 78.74 | 78.88 | 91.11 | 96.71 | 83.63 | 94.86 | 58.39 | 97.98 | 95.94 | 97.62 | 96.27 | 98.11
Kappa (%) | 75.19 | 75.62 | 84.14 | 91.98 | 83.19 | 93.47 | 78.63 | 96.98 | 96.21 | 96.53 | 95.69 | 97.35
T (s) | 3.36 | 1.99 | 3.84 | 123.64 | 5.76 | 1121.99 | 1783.00 | 1653.00 | 756.82 | 1057.19 | 897.01 | 932.35
Table 5. Ablation experiments with different compositions (✓ indicates the component is included).
Component | AGLFF-A | AGLFF-B | AGLFF-C | AGLFF
LFs | ✓ | | |
GFs | | ✓ | |
FFs | | | ✓ | ✓
WBLS | ✓ | ✓ | | ✓
Table 6. Ablation analysis results for OA (%).
Dataset | AGLFF-A | AGLFF-B | AGLFF-C | AGLFF
IP | 92.87 | 94.76 | 95.03 | 96.11
KSC | 96.05 | 98.09 | 98.28 | 99.26
PU | 94.07 | 96.19 | 96.76 | 98.01