Article

An Adaptive Noisy Label-Correction Method Based on Selective Loss for Hyperspectral Image-Classification Problem

by Zina Li 1,†, Xiaorui Yang 2,†, Deyu Meng 1 and Xiangyong Cao 3,*

1 School of Mathematics and Statistics, Xi’an Jiaotong University, Xi’an 710049, China
2 School of Statistics and Data Science, Nankai University, Tianjin 300072, China
3 School of Automation, Xi’an Jiaotong University, Xi’an 710049, China
* Author to whom correspondence should be addressed.
† These authors contributed equally to this work.
Remote Sens. 2024, 16(13), 2499; https://doi.org/10.3390/rs16132499
Submission received: 18 May 2024 / Revised: 1 July 2024 / Accepted: 3 July 2024 / Published: 8 July 2024
(This article belongs to the Special Issue Deep Transfer Learning for Remote Sensing II)

Abstract
Due to the intricate terrain and restricted resources, hyperspectral image (HSI) datasets captured in real-world scenarios typically contain noisy labels, which may seriously affect the classification results. To address this issue, we work on a universal method that first rectifies the labels and then trains the classifier with the corrected labels. In this study, we relax the common assumption that all training data are potentially corrupted and instead posit the presence of a small set of reliable data points within the training set. Under this framework, we propose a novel label-correction method named the adaptive selective loss propagation algorithm (ASLPA). Firstly, the spectral–spatial information is extracted from the hyperspectral image and used to construct the inter-pixel transition probability matrix. Secondly, we construct the trusted set from the known clean data and estimate the proportion of accurate labels within the untrusted set. Thirdly, we enlarge the trusted set according to the estimated proportion, identifying an adaptive number of samples with lower loss values from the untrusted set as the supplement. Finally, we conduct label propagation based on the enlarged trusted set. This approach takes full advantage of the label information in both the trusted and untrusted sets; moreover, the exploitation of the untrusted set adjusts adaptively to the estimated noise level. Experimental results on three widely used HSI datasets show that our proposed ASLPA method performs better than state-of-the-art label-cleaning methods.

1. Introduction

Hyperspectral images (HSIs) are obtained by imaging spectrometers in nearly continuous bands and are widely used in agriculture [1,2,3], geology [4,5], military applications [6,7], and city planning [8,9,10]. Specifically, the sensor captures the reflection data of a particular region in different bands and arranges them in spectral order. Different from the three channels (red–green–blue) of regular images as illustrated in [11,12,13], HSIs encompass not only the spatial information found in regular images but also abundant spectral information, which greatly enhances classification capabilities.
Hyperspectral image classification [14,15,16] has been studied by many researchers in recent years and plays an important role in HSI applications. With access to label information, supervised HSI classification methods can leverage the rich body of machine learning classification algorithms to train pixel-wise classifiers. To be specific, some classical classifiers, such as nearest neighbor (NN) [17,18], support vector machine (SVM) [19,20], random forest (RF) [21,22,23,24], extreme learning machine (ELM) [25,26], sparse representation [27,28], and neural networks [29,30], have been applied to HSI classification and trained on spectral similarity features. Researchers have also exploited the spatial information of HSIs and designed spatial–spectral features [31,32,33] for HSI classification, which fully utilize the structural information in HSIs and improve the accuracy of trained classifiers. The good performance of these supervised HSI classification methods relies on the assumption that the training labels used for learning are completely correct or reliable. Nevertheless, in real scenes, HSI datasets usually exhibit different levels of label noise.
In the labeling process of HSIs, label noise can arise for the following reasons. Firstly, HSIs usually capture complicated terrains, such as unknown ground covers and small inconsistent regions surrounded by large consistent regions, and these abnormal pixels are easily assigned wrong labels. Secondly, insufficient information about the ground covers as well as a lack of assessment experience are also likely to introduce label noise during manual labeling. Thirdly, due to the limited spatial resolution of hyperspectral detectors, multiple ground covers may exist within a single pixel of the HSI, which causes inaccurate labeling of that pixel. All these common conditions can influence the final classification performance.
Though it is quite difficult to obtain large-scale data with reliable labels, ensuring some trusted data without label noise is often feasible. In the general setting of HSI classification with noisy labels, it is common practice to use trusted data for the validation and testing sets. Consequently, it is possible to incorporate a number of trusted data in the training process. In reality, such learning scenarios are closer to semi-supervised learning [34], where most labels may be missing or corrupted by noise while a few labels are clean. This hypothesis is mentioned in references [35,36], and a detailed analysis can be found in reference [37]. In addition, Hendrycks et al. [38] validate that a certain amount of trusted data in the training set improves the robustness of classification results. In this paper, we consider the setting where the training set consists of a small subset of trusted data with clean labels and a mass of data potentially with noisy labels, named untrusted data. Additionally, we assume that the trusted data are distributed across every category. The labels of the untrusted data do not necessarily contain noise, and the actual level of noise disturbance is unknown. Moreover, due to limited manpower and material resources, the amount of trusted data is usually small.
For the HSI classification task with noisy labels, corresponding methods can be roughly divided into two categories [39]. One is to design noise-robust deep features, mainly by designing a deep network to improve the robustness of the final classifier and constructing a robust loss function [40]. The other is to refine the noisy dataset, mainly by removing [41,42,43] or correcting [39,44,45] (collectively called cleaning) the noisy labels, and then training basic classifiers on the cleaned dataset. Methods in the first category usually require careful design of the network architecture, and the performance of the trained networks can be hard to guarantee due to the black-box characteristic of deep networks. For the latter strategy, label cleaning can be regarded as a data pre-processing step; moreover, basic classifiers such as RF and ELM can be trained on the cleaned dataset and yield decent results. Therein, the random label-propagation algorithm (RLPA) [45] is a representative noisy label-correction method and achieves good performance in noisy label cleaning. RLPA randomly selects a number of samples from the noisy dataset as ‘clean’ data, and labels are then propagated from the ‘clean’ data to the entire dataset. The process of sampling and propagation is repeated multiple times to guarantee sufficient label propagation. Though effective, the selection number in RLPA is preset or manually set based on personal experimental experience, and the sampling strategy for obtaining ‘clean’ data is random instead of being carefully designed.
To enhance the efficiency of label cleaning and mitigate the potential problems in RLPA, we propose a label-correction method named the adaptive selective loss propagation algorithm (ASLPA) for the HSI classification problem with noisy labels. Since there is only a small proportion of trusted data in the training set, our proposed ASLPA algorithm supplements the initial trusted set with carefully selected samples from the untrusted set and then carries out label propagation on the updated trusted set to obtain the rectified dataset. Firstly, we construct an inter-pixel transition probability matrix based on the spectral–spatial information of the hyperspectral image. Secondly, we estimate the ratio of accurate labels in the untrusted set by utilizing the inter-class corruption probability. Thirdly, we identify the number of supplementary untrusted samples according to a selection threshold function and the estimated ratio. Fourthly, the selection of untrusted data is determined by the sample loss: samples in the untrusted set are ranked by their loss values, and we select samples with smaller loss values, which indicate higher confidence. Finally, we propagate label information from the enlarged trusted set to the entire training set and thereby complete the label correction. The contributions of our work can be summarized as follows:
  • We propose a method to fully utilize the label information in the trusted and untrusted sets and estimate the proportion of accurate labels in the untrusted set, which provides an important reference for supplementing the trusted set. Through automatic setting of the enlarged size, manual selection for this hyper-parameter can be avoided.
  • We construct the initial trusted set with known clean data and then enlarge this set according to the adaptive supplementary number and loss values of the untrusted data, which largely improves the effect of data cleaning. To be more specific, untrusted data with smaller loss values are put into the trusted set.
  • Combining the adaptive enlargement of the trusted set with the label-propagation process, we propose a novel label-correction method named ASLPA. Experimental results show the effectiveness of our employed strategies in label cleaning and the superiority of our method over other state-of-the-art label-cleaning methods in HSI classification on three hyperspectral datasets with different noise levels.
The rest of this paper is organized as follows. In Section 2, we introduce some related work. In Section 3, we describe the proposed algorithm in detail. Experimental results are presented in Section 4 and conclusions are drawn in Section 5.

2. Related Work

2.1. Traditional HSI Classification Methods

Over the past decades, many algorithms have been proposed to improve the classification accuracy on clean HSI datasets. Some of them apply existing supervised classification methods to engineered features extracted from HSI datasets [22,26,46]. Additionally, many scholars have tried to extract the domain knowledge within hyperspectral images and design specific classification methods based on the extracted spectral or spectral–spatial information [47,48,49,50,51]. Mou et al. [47] regarded hyperspectral pixels as sequential data in the spectral domain and developed a novel RNN model using the PRetanh activation function. Paoletti et al. [49] proposed a deep pyramidal residual network which increases the diversity of high-level spectral–spatial attributes across layers. Wang et al. [50] proposed a spatial–spectral squeeze-and-excitation (SSSE) module to resist noise interference and embedded several SSSE modules into a residual network to improve classification performance. Sun et al. [51] introduced the attention module into a spectral–spatial attention network. Yang et al. [52], Xu et al. [53], and Hang et al. [54] constructed spectral and spatial sub-networks and combined these networks in different ways. Recently, Liu et al. [55] proposed a discriminative spectral–spatial–semantic feature network based on shuffle and frequency attention mechanisms and designed multiple attention and feature-extraction modules. Liu et al. [56] proposed a hybrid-scale feature-enhancement network that exploits multiple spectral–spatial structural features and models the global long-range dependencies in these features. These well-designed methods have achieved remarkable results on clean datasets, but their lack of robustness to noisy labels inevitably impairs their performance when the dataset contains noisy labels.

2.2. HSI Classification Methods for Label Noise

In recent years, HSI classification with noisy labels has attracted much more attention. To mitigate the impact of noise on classification, scholars have proposed roughly two kinds of approaches: constructing robust classifiers or cleaning label noise in the dataset. For the first approach, Roy et al. [57] devised a light-weight network abbreviated as HetConv3D, which uses two types of convolutional kernels operating in the spectral and spatial domains and fuses their outputs to produce the final noise-robust feature maps. Xu et al. [40] designed a novel end-to-end network containing residual blocks and a noise-robust loss function. Zhang [58] explored three levels of representation and proposed a triple contrastive representation-learning framework from a deep clustering perspective. Wang et al. [59] devised an end-to-end attentive-adaptive network (AAN), which combines a spectral stem network with a nonadjacent shortcut and a group-shuffle attention module, and additionally developed an adaptive noise-robust loss function to resist label noise. Zhang et al. [60] considered the co-training framework with dual networks and came up with an agreement- and disagreement-based co-learning method for this problem. The methods for label-noise cleaning can be roughly divided into two categories: one is to detect the noisy labels and then remove these noisy samples from the training set; the other is to correct the noisy labels, which retains these samples with refined labels in the training set.
As for the noisy label-detection methods, Pelletier et al. [44] proposed an iterative learning framework based on the random forest algorithm to remove the noisy labels. Then, Tu et al. [42] proposed a kernel entropy component analysis-based method (KECA) that can detect noisy labels based on the anomaly probabilities which are calculated from the kernel matrix of each class and the corresponding estimated entropy distribution. Tu et al. [41] exploited the local density of each training sample and proposed a density peak clustering-based noisy label-detection (DPNLD) method. Subsequently, they improved on this work and incorporated the spatial information when defining the local density in [61]. Then, Tu et al. [62] continued to combine the superpixel-to-pixel weighting distance (SPWD) and density peak clustering to detect and remove noisy labels in the training set. Subsequently, Tu et al. [43] proposed a hierarchical constrained energy minimum (HCEM) method and exploited sample energy distribution to detect mislabeled samples.
However, the above-mentioned label-detection methods simply remove the mislabeled samples and result in information loss especially with large noise. Noisy label-correction methods modify the noisy labels and retain the corrected samples in the training set, so that all data can contribute to the training process. The three methods [39,45,63] all constructed graphs based on different types of structural information, and then generated a transfer matrix to perform label propagation. Specifically, Jiang et al. [45] considered the spectral similarity and the superpixel-based spatial information, and Leng et al. [63] took the spectral–spatial sparse graph into consideration. Additionally, Jiang et al. [39] developed a multi-scale segmentation-based multi-layer spectral–spatial graph (MSSG) method to exploit richer spatial information.

3. Proposed Model

In this section, we will introduce our proposed label-correction algorithm ASLPA. Overall, the ASLPA algorithm can be divided into two parts: supplementing the initial trusted set with selected untrusted data and performing label propagation based on the enlarged trusted set.
For the first part, due to the limited size of the initial trusted set which is composed of clean data, we consider supplementing this dataset with a certain number of untrusted data to guarantee sufficient information for label propagation. Notably, the untrusted data with more confidence are supposed to be put into the trusted set and the supplementary amount should be related to the intensity of label noise. For the second part, the construction of the pixel transfer probability matrix (Section 3.2) and the label-propagation process (Section 3.3) are carried out based on the RLPA [45]. However, the distinction between ASLPA and RLPA lies in the construction of the new trusted set (Section 3.1) in ASLPA, encompassing the number of selected untrusted samples and the selection strategy.

3.1. Construction of the New Trusted Set

To address the issues regarding RLPA mentioned in Section 2, we propose a method to adaptively supplement the trusted set. Intuitively, the number of supplementary samples should be related to the noise intensity in the untrusted set: the heavier the label noise, the fewer samples should be picked, and vice versa.
Due to the limited manpower and material resources, the amount of known trusted data is usually small. Notably, the labels of the untrusted set are potentially corrupted by noise, and with the trusted data serving as extra information, we can better model the label noise in the untrusted set. Furthermore, after analyzing the noise intensity, we set rules for the number of supplementary untrusted data and how these data are identified. The details of constructing the new trusted set are presented in Figure 1.

3.1.1. Construction of Inter-Class Corruption Probability Matrix

In this part, we utilize the GLC algorithm [38] to calculate the inter-class corruption probability matrix. For convenience, let $x$ denote the sample feature, and let $y$ and $\tilde{y}$ denote its true and corrupted labels, respectively. With the label information of the trusted and untrusted sets, the Monte Carlo method is used to estimate the distribution of the labels after corruption by noise.
For the conditional distribution identity
$$p(\tilde{y} \mid y, x)\, p(x \mid y) = p(\tilde{y} \mid y)\, p(x \mid \tilde{y}, y),$$
integrating over all $x$ gives us
$$\int p(\tilde{y} \mid y, x)\, p(x \mid y)\, dx = \int p(\tilde{y} \mid y)\, p(x \mid \tilde{y}, y)\, dx = p(\tilde{y} \mid y).$$
Thus, we need to estimate $\int p(\tilde{y} \mid y, x)\, p(x \mid y)\, dx$. To reduce the computational cost, we assume conditional independence between $y$ and $\tilde{y}$ given $x$:
$$p(\tilde{y} \mid y, x) = p(\tilde{y} \mid x).$$
With this assumption, we just need to estimate
$$\int p(\tilde{y} \mid x)\, p(x \mid y)\, dx.$$
This integral can be regarded as $\mathbb{E}_{x \mid y}[\, p(\tilde{y} \mid x)\,]$. We train a classifier on the untrusted set $X_U$ to obtain $\hat{p}(\tilde{y} \mid x)$ and use it in place of $p(\tilde{y} \mid x)$, while the expectation over $x \mid y$ is approximated by the empirical average over the trusted samples of each class. Through this Monte Carlo estimate, the probability that a sample with true label $p$ receives the corrupted label $q$ is
$$\hat{C}_{pq} = \frac{1}{|A_p|} \sum_{x \in A_p} \hat{p}(\tilde{y} = q \mid x) = \frac{1}{|A_p|} \sum_{x \in A_p} \hat{p}(\tilde{y} = q \mid y = p, x) \approx p(\tilde{y} = q \mid y = p),$$
where $A_p$ is the subset of trusted samples with label $p$.
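To make this estimate concrete, the following sketch (our illustration, not the authors' released code) computes $\hat{C}$ with any probabilistic classifier exposing a scikit-learn-style `predict_proba`; the classifier `clf_untrusted` is assumed to have been fitted on the untrusted set, with classes indexed $0, \ldots, K-1$:

```python
import numpy as np

def estimate_corruption_matrix(clf_untrusted, X_trusted, y_trusted, num_classes):
    """Monte Carlo estimate of C_hat[p, q] ~ p(y_tilde = q | y = p), as in Eq. (5).

    clf_untrusted: probabilistic classifier fitted on the untrusted (noisy) set,
    so its predict_proba approximates p_hat(y_tilde | x).
    X_trusted, y_trusted: features and clean labels of the trusted set.
    """
    C_hat = np.zeros((num_classes, num_classes))
    probs = clf_untrusted.predict_proba(X_trusted)  # rows: p_hat(y_tilde | x)
    for p in range(num_classes):
        A_p = probs[y_trusted == p]   # trusted samples whose true label is p
        C_hat[p] = A_p.mean(axis=0)   # average predicted noisy-label distribution
    return C_hat
```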

3.1.2. Supplementing the Trusted Set

Formally, we denote the initial trusted and untrusted sets as $X_T$ and $X_U$, respectively; together they constitute the training set, represented as $X_{tr}$. Based on the inter-class corruption probabilities, we utilize the conditional probability formula to estimate the proportion of clean labels in $X_U$. Then, we determine the ratio of selected data through a threshold function, and these data supplement $X_T$.
For the conditional probability identity
$$p(y = q \mid \tilde{y} = q) = \frac{p(\tilde{y} = q \mid y = q)\, p(y = q)}{p(\tilde{y} = q)},$$
we substitute the estimated probabilities for the true ones and derive the probability that a sample labeled $q$ in $X_U$ truly belongs to class $q$:
$$\hat{p}(y = q \mid \tilde{y} = q) = \frac{\hat{p}(\tilde{y} = q \mid y = q)\, \hat{p}(y = q)}{\hat{p}(\tilde{y} = q)} = \frac{\hat{C}_{qq}\, \hat{p}(y = q)}{\hat{p}(\tilde{y} = q)},$$
where $\hat{p}(y = q)$ and $\hat{p}(\tilde{y} = q)$ can be calculated from $X_T$ and $X_U$, respectively. From this formula, we obtain the estimated proportion of clean samples for each class in the untrusted set. Moreover, the probability $m$ that a label in $X_U$ is not corrupted by noise is calculated as follows:
$$m = p(y = \tilde{y}) = \sum_{q=1}^{K} p(y = q, \tilde{y} = q) = \sum_{q=1}^{K} p(y = q \mid \tilde{y} = q)\, p(\tilde{y} = q) = \sum_{q=1}^{K} \frac{\hat{C}_{qq}\, \hat{p}(y = q)}{\hat{p}(\tilde{y} = q)}\, \hat{p}(\tilde{y} = q) = \sum_{q=1}^{K} \hat{C}_{qq}\, \hat{p}(y = q).$$
The trusted set $X_T$ is then enlarged according to the ratio of clean labels $m$. Specifically, if $m$ is greater than a threshold (denoted as $\beta$), it indicates a low noise level in $X_U$ and sufficient potentially reliable data; in this case, a proportion $m$ of the samples in $X_U$ is supplemented to $X_T$. A value of $m$ below $\beta$ suggests a seriously corrupted untrusted set; nevertheless, a certain number of untrusted samples are still extracted to ensure sufficient information for label propagation. It is noteworthy that any reasonable value of the hyper-parameter $\beta$ suffices, and we fix it to 0.3 in this paper. Denoting the proportion of extracted untrusted data in $X_U$ as $\delta$, we have
$$\delta = \begin{cases} m, & m > \beta, \\ \dfrac{0.5 - g}{1 - g}, & m \le \beta, \end{cases}$$
where $g$ is the ratio of the trusted set size to the training set size:
$$g = \frac{|X_T|}{|X_{tr}|},$$
so that in the heavy-noise case the enlarged trusted set amounts to half of the training set.
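As a minimal sketch of Equations (8) and (9) (function and variable names are ours), assuming integer class labels in $\{0, \ldots, K-1\}$ and the $\hat{C}$ estimated above:

```python
import numpy as np

def clean_ratio_and_selection(C_hat, y_trusted, y_untrusted, num_classes, beta=0.3):
    """Estimate the clean-label ratio m (Eq. 8) and the selection ratio delta (Eq. 9)."""
    # Class prior p_hat(y = q), estimated from the trusted set.
    p_y = np.bincount(y_trusted, minlength=num_classes) / len(y_trusted)
    # Eq. (8): m = sum_q C_hat[q, q] * p_hat(y = q).
    m = float(np.sum(np.diag(C_hat) * p_y))
    # Eq. (9) with g = |X_T| / |X_tr|.
    g = len(y_trusted) / (len(y_trusted) + len(y_untrusted))
    delta = m if m > beta else (0.5 - g) / (1.0 - g)
    return m, delta
```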
After determining the ratio in (9), we select data from $X_U$ and obtain the enlarged trusted set $X_T$ as well as the updated untrusted set $X_U$, composed of the remaining untrusted data. Instead of the random selection strategy utilized in RLPA, we sample from $X_U$ according to a metric that measures the fidelity or quality of samples. The loss value of a sample usually indicates its confidence, and we tend to select samples with smaller losses, which provide more reliable information for the trusted set. Specifically, we employ the cross-entropy loss as the metric to rank the untrusted data. To calculate the loss values, we pre-train a basic classifier on $X_T$ and then derive the loss of an untrusted sample from its predicted label and noisy label as follows:
$$L(x_i, \tilde{y}_i) = L(h(x_i), \tilde{y}_i),$$
where $L(\cdot, \cdot)$ is the cross-entropy loss and $h(\cdot)$ is the pre-trained classifier. With the selection amount determined above, we can gather the final supplementary samples. It should be noted that the loss-ranking-based selection process yields a unique trusted set $X_T$, and thus the label propagation is conducted only once.
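The loss-ranking selection admits a compact sketch, again assuming a scikit-learn-style classifier `clf_trusted` pre-trained on the trusted set (an illustration of Equations (9) and (11), not the authors' code):

```python
import numpy as np

def select_low_loss_samples(clf_trusted, X_untrusted, y_untrusted_noisy, delta):
    """Return indices of the delta-fraction of untrusted samples with the
    smallest cross-entropy loss w.r.t. their observed (possibly noisy) labels."""
    probs = clf_trusted.predict_proba(X_untrusted)   # h(x_i) of Eq. (11)
    eps = 1e-12
    losses = -np.log(probs[np.arange(len(y_untrusted_noisy)), y_untrusted_noisy] + eps)
    n_select = int(round(delta * len(y_untrusted_noisy)))
    return np.argsort(losses)[:n_select]             # smallest-loss samples first
```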

3.2. Construction of Inter-Pixel Transition Probability Matrix

The construction of the inter-pixel transition probability matrix was introduced in RLPA [45] and utilizes the spectral–spatial information in the HSI. Firstly, entropy rate superpixel segmentation (ERS) [64] is performed to obtain several homogeneous regions of the HSI. Then, the similarity between pixels is calculated from their spectral information and their positional relationship within the homogeneous regions. Finally, we obtain the inter-pixel transition probability matrix from the similarity matrix. Each of these steps is described below.

3.2.1. Entropy Rate Superpixel Segmentation

HSI pixels are classified into $K$ classes and the label space is $\{1, 2, \ldots, K\}$. The feature set of all labeled pixels is denoted as $X = \{x_1, x_2, \ldots, x_N\} \subset \mathbb{R}^d$, where $d$ is the dimension of the spectral space and $N$ is the number of labeled pixels in the HSI. We divide an HSI into several homogeneous regions and assume that if two pixels are in the same region, their labels are more likely to be the same. To reduce the computational cost, we first conduct principal component analysis (PCA) [65] on the HSI and then carry out entropy rate superpixel segmentation (ERS) [64] on the obtained first principal component $I_f$:
$$I_f = \bigcup_{k=1}^{T} X_k, \quad \text{s.t.} \quad X_k \cap X_g = \emptyset \ \text{ if } k \ne g,$$
where $X_k$ and $X_g$ are the pixel sets of the $k$-th and $g$-th regions, respectively, and $T$ denotes the number of homogeneous regions. Here, $T$ is adaptively set by the texture of the image:
$$T = T_{base} \cdot \frac{N_f}{N_I},$$
where $N_f$ is the number of edge pixels detected by the Laplacian of Gaussian operator [66] and $N_I$ is the total number of pixels. Thus, the more complex the texture, the more homogeneous regions are segmented.
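As an illustration of Equation (13), the sketch below counts edge pixels with SciPy's Laplacian-of-Gaussian filter; `T_base`, the filter scale, and the edge threshold are placeholder values of ours, not from the paper:

```python
import numpy as np
from scipy import ndimage

def adaptive_superpixel_count(first_pc, T_base=500, sigma=2.0, edge_thresh=0.1):
    """Set the number of superpixels T from image texture (Eq. 13).

    first_pc: 2-D first principal component of the HSI, scaled to [0, 1].
    """
    log_response = ndimage.gaussian_laplace(first_pc, sigma=sigma)
    N_f = int(np.sum(np.abs(log_response) > edge_thresh))  # pixels flagged as edges
    N_I = first_pc.size                                    # total number of pixels
    return max(1, round(T_base * N_f / N_I))
```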

3.2.2. Construction of Similarity Matrix

Based on the segmentation results, we construct the similarity graph and the inter-pixel similarity matrix. Specifically, if two pixels are in the same homogeneous region, they are connected in the similarity graph and the weight of the edge is determined by their spectral similarity: as the similarity between the spectra of two pixels increases, so does the weight of the edge. Conversely, if two pixels are not in the same homogeneous region, they are less likely to belong to the same class and no edge connects them in the similarity graph. Thus, the weight of the edge between the $i$-th and $j$-th pixels in the similarity graph is
$$W_{ij} = \begin{cases} \exp\left( -\dfrac{sim(x_i, x_j)^2}{2\sigma^2} \right), & x_i, x_j \in X_k, \\ 0, & x_i \in X_k \text{ and } x_j \in X_g,\ k \ne g, \end{cases}$$
where $sim$ is the spectral similarity. Here, $sim(x_i, x_j) = \|x_i - x_j\|_2$ and $\sigma = \left( \frac{1}{|X_k|} \sum_{x_i, x_j \in X_k} \|x_i - x_j\|_2^2 \right)^{0.5}$. We calculate the weights between all pixel pairs and obtain the symmetric similarity matrix $W$.

3.2.3. Construction of Transition Probability Matrix

The transition probability matrix is obtained by the similarity matrix. The more similar two pixels are, the more likely they are to be the same class and the greater the mutual influence of their label information is during label propagation. Thus, the transition probability between the i-th and j-th pixels is
$$T_{ij} = \frac{W_{ij}}{\sum_{k=1}^{N} W_{kj}}.$$
We calculate the transition probabilities between all pixel pairs according to the above formula and obtain the transition probability matrix $T$.
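A small dense-matrix sketch of Equations (14) and (15) follows; it uses the mean squared intra-region distance as a stand-in for the paper's $\sigma$ and is only practical for modest pixel counts, since $W$ is stored as a dense $N \times N$ array:

```python
import numpy as np

def transition_matrix(X, segment_ids):
    """Build the similarity matrix W (Eq. 14) and the column-normalized
    transition matrix T (Eq. 15) for pixel features X (N x d) and superpixel ids."""
    N = X.shape[0]
    W = np.zeros((N, N))
    for k in np.unique(segment_ids):
        idx = np.where(segment_ids == k)[0]
        Xk = X[idx]
        # Pairwise squared spectral distances within the homogeneous region.
        d2 = np.sum((Xk[:, None, :] - Xk[None, :, :]) ** 2, axis=-1)
        sigma2 = d2.mean() + 1e-12     # surrogate for sigma^2 of Eq. (14)
        W[np.ix_(idx, idx)] = np.exp(-d2 / (2.0 * sigma2))
    T = W / W.sum(axis=0, keepdims=True)  # Eq. (15): T_ij = W_ij / sum_k W_kj
    return W, T
```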

3.3. Label Propagation

Essentially, label propagation is the flow of label information within the homogeneous regions. According to the idea of constructing a similarity matrix, pixels that are not in the same homogeneous regions are less likely to be the same class. Therefore, the flow of label information among pixels in different regions is not supposed to occur, while for pixels in the same region, labels propagate and exchange information based on the inter-pixel transition probability matrix.
Specifically, the training set is composed of the enlarged trusted set $X_T = \{x_1, x_2, \ldots, x_l\}$ with label set $\{\tilde{y}_1, \tilde{y}_2, \ldots, \tilde{y}_l\}$ and the updated untrusted set $X_U = \{x_{l+1}, x_{l+2}, \ldots, x_{l+u}\}$, where $l$ and $u$ denote the sizes of $X_T$ and $X_U$, respectively. In the label-propagation process of our algorithm, the label set of $X_T$ is retained and propagated to the whole training set, while the labels of $X_U$ are initially set to zero. Collectively, the initial label of a training sample in label propagation is denoted as $\tilde{y}^{TU}$, and we convert it into the label vector $\tilde{\mathbf{y}}^{TU}$. For the $i$-th pixel $x_i$ ($i = 1, 2, \ldots, l$) in the training set, the conversion rule is
$$\tilde{y}_{i,p}^{TU} = \begin{cases} 1, & \tilde{y}_i = p, \\ 0, & \tilde{y}_i \ne p, \end{cases}$$
where $\tilde{y}_{i,p}^{TU}$ is the $p$-th element of $\tilde{\mathbf{y}}_i^{TU}$. In addition, $\tilde{\mathbf{y}}_i^{TU} = \mathbf{0}$ for $i = l+1, l+2, \ldots, l+u$.
In the process of label propagation, the current label of a pixel is determined by two factors: its original label and the label information received from other pixels in its homogeneous region. Thus, the propagated label of $x_i$ at time $t+1$ is
$$f_i^{t+1} = \alpha \sum_{x_j \in X_k} T_{ij}\, f_j^{t} + (1 - \alpha)\, \tilde{\mathbf{y}}_i^{TU},$$
where $X_k$ is the homogeneous region containing $x_i$ and $\alpha \in (0, 1)$ is a parameter that balances the influence of the original label against the label information from neighbors.
We arrange the initial labels into the label matrix $\tilde{Y}^{TU} = [\tilde{\mathbf{y}}_1^{TU}, \tilde{\mathbf{y}}_2^{TU}, \ldots, \tilde{\mathbf{y}}_{l+u}^{TU}]^{\top} \in \mathbb{R}^{(l+u) \times K}$ and the propagated labels into $F^{t} = [f_1^{t}, f_2^{t}, \ldots, f_{l+u}^{t}]^{\top} \in \mathbb{R}^{(l+u) \times K}$. Then, we have
$$F^{t+1} = \alpha\, T F^{t} + (1 - \alpha)\, \tilde{Y}^{TU}.$$
Furthermore, we can obtain the label matrix after label propagation:
$$F^{*} = \lim_{t \to \infty} F^{t} = (1 - \alpha)(I - \alpha T)^{-1}\, \tilde{Y}^{TU}.$$
Finally, the propagated labels of the samples in $X_U$ can be recovered from the label matrix:
$$y_i^{*} = \arg\max_{j} F_{ij}^{*}, \quad i \in Idx(X_U),$$
where $Idx(X_U)$ denotes the index set of $X_U$. After label propagation, the original labels of $X_T$ are retained, and for $X_U$, the potentially noisy labels are replaced by the updated labels.
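The closed-form propagation of Equations (19) and (20) reduces to a few lines; the sketch below uses a linear solve instead of an explicit matrix inverse for numerical stability:

```python
import numpy as np

def propagate_labels(T, Y_init, alpha=0.99, untrusted_idx=None):
    """Closed-form label propagation, Eq. (19): F* = (1 - alpha)(I - alpha T)^(-1) Y.

    T: (n x n) transition matrix; Y_init: (n x K) one-hot labels of the enlarged
    trusted set, with zero rows for untrusted samples; alpha in (0, 1).
    """
    n = T.shape[0]
    F = (1.0 - alpha) * np.linalg.solve(np.eye(n) - alpha * T, Y_init)
    labels = F.argmax(axis=1)   # Eq. (20): recover hard labels
    return labels if untrusted_idx is None else labels[untrusted_idx]
```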
In this way, noisy labels are corrected and the updated labels can be used for classifier training. Algorithm 1 presents the process of our method ASLPA.
Algorithm 1 Adaptive Selective Loss Propagation Algorithm (ASLPA)
Require: Hyperspectral image $I \in \mathbb{R}^{m \times n \times d}$, the trusted set $X_T$ with corresponding clean labels, and the untrusted set $X_U$ with corresponding labels.
Ensure: The corrected labels.
1: Train two networks $\hat{p}(y \mid x)$ and $\hat{p}(\tilde{y} \mid x)$ on $X_T$ and $X_U$, respectively.
2: for each $q \in \{1, \ldots, K\}$ do
3:    Gather samples with label $q$ from $X_T$ and calculate $\hat{p}(y = q)$.
4:    Calculate $\hat{p}(y = q \mid \tilde{y} = q)$ according to (5) and (7).
5: end for
6: Compute $m$ using (8).
7: Rank samples in $X_U$ according to (11) and extract data with lower loss values from $X_U$ according to (9). Obtain the new trusted set $X_T$, its corresponding labels, and the new untrusted set $X_U$.
8: Apply PCA on $I$ and obtain the first principal component $I_f$.
9: Apply ERS on $I_f$ and obtain the corresponding superpixels $X_k$ ($k = 1, 2, \ldots, T$).
10: Obtain the similarity matrix according to (14).
11: Calculate the inter-pixel transition probability matrix $T$ using (15).
12: Retain the labels of $X_T$ and set the labels of $X_U$ to 0.
13: Calculate $F^{*}$ by (19).
14: Obtain the corrected labels for $X_U$ from $F^{*}$ using (20) and keep the clean labels of $X_T$.
15: return Corrected labels for $X_U$.

4. Experiments

In this section, we conduct a series of experiments on three commonly used HSI datasets and demonstrate the effectiveness of our algorithm under different label-noise levels. In Section 4.1, we introduce the three hyperspectral datasets and the detailed experimental settings. In Section 4.2.1, we first conduct ablation studies on our proposed ASLPA algorithm to verify the effectiveness of the employed strategies, in terms of both the capacity for noise reduction and the classification performance on the corrected datasets, measured by three commonly used metrics in HSI classification: the overall accuracy (OA), average accuracy (AA), and kappa coefficient (Kappa). In Section 4.2.2, we study the effect of different superpixel segmentations on the label-correction performance in terms of the ability to correct noisy samples. In Section 4.3, we compare ASLPA with several representative label-cleaning methods, including state-of-the-art ones, for the HSI classification problem with noisy labels. The evaluation criteria cover the three classification metrics and visual classification maps.
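For reference, the three metrics can be computed from a confusion matrix as in the following sketch (ours, for illustration):

```python
import numpy as np

def classification_metrics(conf_mat):
    """Compute OA, AA, and Kappa from a K x K confusion matrix
    (rows: true classes, columns: predicted classes)."""
    total = conf_mat.sum()
    oa = np.trace(conf_mat) / total                       # overall accuracy
    per_class = np.diag(conf_mat) / conf_mat.sum(axis=1)  # per-class accuracy
    aa = per_class.mean()                                 # average accuracy
    pe = np.sum(conf_mat.sum(axis=0) * conf_mat.sum(axis=1)) / total ** 2
    kappa = (oa - pe) / (1.0 - pe)                        # Cohen's kappa
    return oa, aa, kappa
```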

4.1. Datasets and Experimental Settings

We utilize three publicly available hyperspectral datasets, namely Kennedy Space Center (KSC), University of Pavia (UP), and Salinas Scene (Salinas), to evaluate our proposed ASLPA method. Regarding the construction of the noisy training set, firstly, the training set is extracted from the labeled set at a certain ratio per class and the remaining labeled data are collected into the testing set. The sampling ratios for the KSC, UP, and Salinas datasets are 30%, 10%, and 10%, respectively. Then, 30% of the pixels in the training set are selected into the trusted set for each class for all three datasets, and the remaining training data are collected as the untrusted set. In the following, we present information about the three datasets and category details in Table 1, Table 2 and Table 3, in which ‘All’, ‘Test’, ‘Train’, and ‘Trusted’ represent the number of all labeled pixels, testing pixels, training pixels, and trusted pixels for each category, respectively. Finally, labels in the untrusted set are corrupted according to Equation (21), which specifies the inter-class corruption probability matrix used for label corruption.

4.1.1. Kennedy Space Center

The Kennedy Space Center dataset (KSC) was acquired by the NASA AVIRIS (Airborne Visible/Infrared Imaging Spectrometer) instrument over the Kennedy Space Center, Florida, in the year 1996. Besides the background pixels, the dataset contains 5211 labeled pixels, covering 13 classes. We present category details about this dataset in Table 1. After removing water absorption and low SNR bands, 176 bands were used for analysis. The false-color composite image and the ground-truth class map for the KSC dataset are shown in Figure 2.

4.1.2. University of Pavia

The dataset named University of Pavia (UP) was acquired over the campus of the University of Pavia, Italy, by the ROSIS sensor in 2001. Except for the background pixels, there are 42,776 labeled pixels, covering 9 categories. We present category details about this dataset in Table 2. After removing the bands that are corrupted by water absorption effects, the image size is 610 × 340 × 103. The false-color composite image and the ground-truth class map for the UP dataset are shown in Figure 3.

4.1.3. Salinas Scene

The dataset named Salinas was obtained over the Salinas Valley in California, USA, by the AVIRIS sensor in 1992. It includes vegetables, bare soils, and vineyard fields. Except for the background pixels, there are 54,129 labeled pixels, covering 16 categories. We present category details about this dataset in Table 3. After removing the bands that are corrupted by water-absorption effects, the image size is 512 × 217 × 204. The false-color composite image and the ground-truth class map for the Salinas dataset are shown in Figure 4.
As for the corruption of the untrusted set, labels are tainted by different levels of noise. Specifically, for samples in a certain category, labels are randomly flipped to one of the other categories with overall probability ρ. In the experiments, noise levels range from 0.1 to 0.8 with an interval of 0.1. Therefore, the inter-class corruption probability matrix used to generate noisy labels is
$$C^{T} = \begin{bmatrix} 1 - \rho & \frac{\rho}{K-1} & \cdots & \frac{\rho}{K-1} \\ \frac{\rho}{K-1} & 1 - \rho & \cdots & \frac{\rho}{K-1} \\ \vdots & \vdots & \ddots & \vdots \\ \frac{\rho}{K-1} & \frac{\rho}{K-1} & \cdots & 1 - \rho \end{bmatrix},$$
where $K$ is the number of categories and $\rho$ is the noise level.
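A sketch of this symmetric corruption process (our illustration of Equation (21)):

```python
import numpy as np

def inject_symmetric_noise(labels, num_classes, rho, seed=None):
    """Corrupt integer labels per the symmetric matrix of Eq. (21): each label
    is kept with probability 1 - rho, otherwise flipped uniformly to another class."""
    rng = np.random.default_rng(seed)
    noisy = labels.copy()
    flip = rng.random(len(labels)) < rho
    # A uniform offset in {1, ..., K-1} maps each flipped label to a different class.
    offsets = rng.integers(1, num_classes, size=int(flip.sum()))
    noisy[flip] = (noisy[flip] + offsets) % num_classes
    return noisy
```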
We calculate $\hat{p}(\tilde{y} \mid x)$ from a shallow neural network trained on $X_U$. To reduce the computational cost, we apply PCA to the original hyperspectral image and use the resulting 30 principal components as the input of $\hat{p}(\tilde{y} \mid x)$. The threshold $\beta$ is fixed to 0.3, so the rule for extracting data is
$$\delta = \begin{cases} m, & m > 0.3, \\ \dfrac{0.5 - g}{1 - g}, & m \le 0.3. \end{cases}$$

4.2. Ablation Study

In this section, we conduct ablation studies on the employed strategies and superpixel segmentations in Section 4.2.1 and Section 4.2.2, respectively.

4.2.1. Ablation Study on Employed Strategies

In order to demonstrate the effectiveness of our proposed method ASLPA, we assess its capacity to reduce noisy samples and the classification performance achieved with the SVM classifier. ASLPA is compared with RLPA and with two degraded versions of ASLPA, named ALPA and GLPA.
Since RLPA is a representative label-correction method for HSI classification with noisy labels, we also include it in the ablation experiment. Specifically, we follow the original practice of RLPA and randomly select a preset ratio of samples from the whole training set as the trusted set instead of maintaining the clean set. ALPA and GLPA both establish the clean set as the initial trusted set and randomly select a number of samples from the untrusted set multiple times; moreover, the selection numbers for ALPA and GLPA follow the setting of RLPA. The two degraded versions of ASLPA differ mainly in the number of supplementary untrusted data. To be specific, the size of the enlarged trusted set for ALPA is the same as that for RLPA, whereas for GLPA the number of supplementary untrusted data is identified by Equation (9). Beyond the settings just mentioned for GLPA (maintenance of the clean set and the adaptive supplementary number), ASLPA additionally employs the strategy of loss ranking during data selection, which leads to a unique trusted set and a single corresponding label-propagation process. In the following, we detail the ablation results of ASLPA, its two degraded versions, and RLPA in Table 4 and Table 5, where ‘NL’ is the abbreviation of ‘Noise Level’. For these two tables and the following ones, all results are averaged over ten iterations.
To evaluate our method ASLPA in terms of the range for noise reduction, we apply all compared algorithms on the above-mentioned three HSI datasets with label noise and count the remaining noisy samples after the label-cleaning process. The numbers of noisy samples in the training sets as well as the cleaned datasets are summarized in Table 4. From these results, it can be easily seen that our proposed ASLPA consistently performs better and cleans more noisy samples than all its degraded versions as well as RLPA for different noise levels and different datasets, which validates the effectiveness of the employed strategies in ASLPA.
In more detail, ALPA outperforms RLPA on the UP dataset at all noise levels except 0.2, and the advantage becomes more obvious as the noise rate increases. Moreover, at noise levels of no more than 0.3 and 0.6 for the KSC and Salinas datasets, respectively, ALPA achieves comparable or nearly the same performance as RLPA, with a difference between the two methods of less than one sample, which is indeed small. At higher noise levels on the two datasets, the results of ALPA are better. The small improvement of RLPA over ALPA at low noise levels on these two datasets may arise from the greater randomness, and hence coverage, of the propagated trusted set in RLPA. As the noise level increases, maintaining the known clean set largely helps resist the label noise, validating the effectiveness of this practice.
As for GLPA, it outperforms RLPA and ALPA at all noise levels on the UP and Salinas datasets, and its advantage grows as the noise intensity increases. On the KSC dataset, GLPA is competitive with RLPA and ALPA at noise levels of no more than 0.2; as the noise level increases, GLPA still achieves comparable performance over the noise interval [0.3, 0.5] and much better results at higher noise levels. The advantage of GLPA over ALPA, especially under heavy noise, arises from the adaptive supplementary number and the correspondingly appropriate handling of potentially noisy samples, which validates the effectiveness of the adaptive supplementary amount. In addition, the better overall performance of GLPA over RLPA shows that the simultaneous use of the two strategies, maintenance of the clean set and the adaptive supplementary amount, helps considerably in improving the effect of label cleaning. On top of these two strategies in GLPA, ASLPA also uses loss ranking to pick samples from the untrusted set. The consistently superior performance of ASLPA over GLPA confirms that loss-ranking-based selection is more efficient than random selection.
To further evaluate the quality of corrected datasets in terms of classification performance, we apply a basic classifier named support vector machine (SVM) on all three datasets after label correction by four ablation methods. As can be seen in Table 5, GLPA presents consistently better performance than RLPA and ALPA. Similarly, ASLPA demonstrates superior performance compared to GLPA and all other degraded versions for different noise levels and different datasets. The two comparisons further validate the effectiveness of the employed strategies in ASLPA.
To conclude, the ablation study fully verifies the effectiveness of the three strategies employed in our proposed ASLPA algorithm: maintenance of the known clean set, the adaptive supplementary amount, and loss ranking in data selection.

4.2.2. Ablation Study on the Superpixel Segmentations

As shown in Section 3.2, the segmented superpixels have a direct effect on the construction of the inter-pixel transition probability matrix and thus may influence the label-correction performance. Following the work of Jiang et al. [45], we adopt ERS as the segmentation method, which requires a 1-channel image as input. To evaluate the information loss associated with different superpixel segmentations, we employ three approaches, namely Mode1, Mode2, and Mode3, on the Salinas dataset to retrieve superpixels for our ASLPA. Specifically, Mode1 averages the HSI along the spectral dimension and then applies ERS to the obtained 1-channel image. Mode2 applies ERS to one representative band selected following Wang et al. [67]. Mode3 conducts superpixel segmentation according to Achanta et al. [68]. Additionally, we denote our employed method as Mode4, which directly applies PCA to the original HSI and then conducts ERS on the obtained 1-channel image. Table 6 presents the correction performance of ASLPA with the different superpixel divisions (Mode1–Mode4) on the Salinas dataset in terms of the NoiOri, Numri, and Noires metrics. Therein, ‘NoiOri’ denotes the number of noisy samples in the original dataset, ‘Numri’ denotes the number of noisy samples each method corrects rightly, and ‘Noires’ denotes the number of noisy samples remaining in the corrected dataset. The results demonstrate that different superpixels can cause varying degrees of information loss, thereby affecting the correction performance. Furthermore, Mode4 rectifies the largest number of noisy samples and outperforms the other modes in terms of noisy-sample reduction. Consequently, we adopt PCA followed by ERS in our ASLPA.

4.3. Comparative Experiments

In this part, we carry out extensive experiments to compare ASLPA with other representative label-cleaning methods for HSI classification with label noise, including two noisy label-correction methods, RLPA [45] and MSSG [39], and three noisy label-detection methods, DPNLD [41], KECA [42], and SPWD [62]. Additionally, we consider the BASE method, which uses the original noisy training set without any processing.
To evaluate the quality of the datasets after label cleaning by the different methods, we consider four widely used classifiers in the field of HSI classification, namely nearest neighbor (NN), support vector machine (SVM), random forest (RF), and extreme learning machine (ELM), and train these classifiers on the cleaned datasets. We then use the three commonly used metrics in HSI classification, i.e., OA, AA, and Kappa, to evaluate the trained classifiers. The final results listed in Table 7, Table 8 and Table 9 are averaged over 10 repeated experiments, each of which randomly generates the training set, the trusted set, and the noisy labels in the untrusted set. In all three tables, ‘NL’ is the abbreviation of ‘Noise Level’. For fair comparison, the parameters of all classifiers are carefully tuned and the best results are reported. Among all compared methods, the best results are shown in boldface and underlined for better illustration.
To better visualize the classification performance, we plot a series of classification maps at a medium noise ratio ρ = 0.3. These visual results are presented in Figure 5, Figure 6 and Figure 7. In each figure, the classification maps are arranged as follows: the four rows from top to bottom correspond to the four classifiers, KNN, SVM, RF, and ELM, and the eight columns from left to right correspond to the seven processing methods, BASE, RLPA, DPNLD, KECA, SPWD, MSSG, and ASLPA, plus the ground-truth classification map. To be specific, each map is the fusion of the ten processed classification maps from the ten iterations, and the classification map of each iteration is composed of the training set with its original labels and the testing set with its predicted labels. To better illustrate the difficult regions where different methods are likely to make mistakes, we highlight these regions and enlarge them with bounding boxes.
From Table 7, Table 8 and Table 9, it is easily observed that SVM and ELM perform better than KNN and RF classifiers in general. Therefore, we mainly pay attention to the metrics of SVM and ELM classifiers when comparing different label-cleaning methods.
From Table 7, we can clearly see that ASLPA achieves the best classification results among all compared methods in terms of the OA, AA, and Kappa metrics for different classifiers and noise ratios. In addition, the gap between ASLPA and every other compared method on the three metrics widens as the noise level rises, which validates the superior noise robustness of our proposed ASLPA method. Two factors may contribute to this. Firstly, due to the adaptive supplementary number bounded from below, ASLPA tends to select a smaller but sufficient number of samples when the untrusted set is seriously corrupted by noise; this tendency not only meets the needs of label propagation but also avoids introducing too many noisy labels. Secondly, ASLPA selects samples with lower losses, so data with more confidence are supplemented, generating a trusted set of higher quality.
For the KSC dataset, according to the ground-truth label map, the size of each cluster is small. Moreover, for some categories, the clusters are scattered throughout the scene and can lie close to clusters of a different category. Samples in such regions are more likely to be wrongly classified, and we call these regions difficult ones. At the noise ratio of 0.3 in Figure 5, all methods perform well in terms of overall classification accuracy. For the two difficult regions with enlarged bounding boxes, we observe that our proposed method ASLPA achieves comparable or even better classification performance than the other methods. For example, in the green boxes in the bottom-left corner, the classification maps of ASLPA and the two other label-correction methods are closer to the ground-truth maps and much cleaner than those of the four noisy label-detection methods, while in the red boxes in the top-left corner, ASLPA achieves classification results highly similar to those of the other methods.
In Table 8, we can see that when the noise ratio is set to 0.1, ASLPA achieves performance comparable or nearly identical to the best results. When the noise level is larger than 0.1, ASLPA consistently outperforms the other compared methods in terms of the OA and Kappa metrics. In addition, we observe that these two metrics of ASLPA keep decreasing as the noise ratio rises to 0.7 and then increase when the noise ratio reaches 0.8, which accords with the trend of the number of remaining noisy samples for ASLPA on the UP dataset in Table 4. As for the AA metric, taking the results of ELM as an example, though MSSG performs better than the other methods, ASLPA still achieves comparable results; when the noise ratio is larger than 0.2, ASLPA rises from the third-best to the second-best performer in terms of the AA metric.
As for the results in Figure 6, the superiority of ASLPA over other methods is more obvious. To give some specifics, it is evident that red boxes on the right top corner of the maps for ASLPA are cleaner and smoother than those for other compared methods.
In Table 9, we observe that ASLPA performs consistently better than the other competing methods in terms of the OA, AA, and Kappa metrics. Though MSSG achieves the best performance with the RF classifier, the difference between MSSG and ASLPA is nearly negligible. Taking the metrics of the ELM classifier as an example, the rate of decline for RLPA is greater than that for ASLPA as the noise level increases. Since ASLPA can be regarded as a variant of RLPA, the better performance of ASLPA may be explained as follows: when the noise level rises and the training set contains a large number of noisy labels, our proposed ASLPA can select a proper number of supplementary samples with less noise according to the loss ranking, which ensures the quality of the trusted set to some extent.
In Figure 7, we pay attention to the two largest neighbouring regions, shown in red and green in the ground-truth label map. On the one hand, these two regions occupy a considerable amount of space in the image, and since they are adjacent, they are very likely to be misclassified. On the other hand, the other regions are basically clean for all methods, and the coverage of misclassified samples in those regions is indeed tiny. For the first two rows and the bottom row in Figure 7, the overall coverage of correctly classified samples in the two difficult regions is larger for each label-correction method than for the compared label-removal methods, which is shown more clearly in the red bounding boxes. For the third row, there is no significant difference among the classification maps of the different methods. As for the comparison among the three label-correction methods, the red pixels in the bounding boxes, which are indeed misclassified, are fewer for ASLPA than for the other two label-correction methods with the ELM classifier in the bottom row, and for the remaining rows the visual results are virtually identical for the three methods.
Apart from the classification performance presented above, we also list in Table 10 the running time of all compared methods with the SVM classifier, averaged over ten iterations on the Salinas dataset. It can be seen that ASLPA achieves the best time efficiency on the Salinas dataset among all competing methods.

5. Conclusions and Future Work

For the HSI classification task with noisy labels, it is preferable to correct the labels before training the classifier. Based on the assumption that there is a small proportion of trusted data within the training set, we propose a label-correction algorithm named ASLPA. This approach addresses the limitations of RLPA by focusing on establishing a new trusted set. ASLPA preserves the known trusted data as the initial trusted set and then selects an adaptive number of untrusted data using a loss rank strategy to supplement this dataset. Subsequently, the label-propagation process is employed to obtain the refined dataset. Experimental results indicate that ASLPA achieves promising superiority over other competing label-cleaning methods.
In this paper, noisy labels are generated through a symmetric inter-class corruption probability matrix. For more complex situations, such as varying levels of noise intensity or constraints where certain labels can only flip to specific categories, determining the rules for data extraction becomes the next challenge to be addressed.

Author Contributions

Formal analysis, Z.L. and X.Y.; Funding acquisition, X.C. and D.M.; Methodology, Z.L. and X.Y.; Project administration, X.C. and D.M.; Writing–original draft, Z.L. and X.Y.; Writing–review and editing, X.C. and D.M. All authors have read and agreed to the published version of the manuscript.

Funding

This work was supported in part by the National Key Research and Development Program of China under Grant 2021ZD0112902 and in part by the China NSFC Projects under Contract 62272375 and Contract 12226004; Corresponding authors: Xiangyong Cao, Deyu Meng.

Data Availability Statement

The raw data supporting the conclusions of this article will be made available by the authors on request.

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. Dalponte, M.; Ørka, H.O.; Gobakken, T.; Gianelle, D.; Næsset, E. Tree species classification in boreal forests with hyperspectral data. IEEE Trans. Geosci. Remote Sens. 2012, 51, 2632–2645. [Google Scholar] [CrossRef]
  2. Liang, L.; Di, L.; Huang, T.; Wang, J.; Lin, L.; Wang, L.; Yang, M. Estimation of Leaf Nitrogen Content in Wheat Using New Hyperspectral Indices and a Random Forest Regression Algorithm. Remote Sens. 2018, 10, 1940. [Google Scholar] [CrossRef]
  3. Berger, K.; Atzberger, C.; Danner, M.; Wocher, M.; Mauser, W.; Hank, T. Model-Based Optimization of Spectral Sampling for the Retrieval of Crop Variables with the PROSAIL Model. Remote Sens. 2018, 10, 2063. [Google Scholar] [CrossRef]
  4. Yokoya, N.; Chan, J.C.W.; Segl, K. Potential of Resolution-Enhanced Hyperspectral Data for Mineral Mapping Using Simulated EnMAP and Sentinel-2 Images. Remote Sens. 2016, 8, 172. [Google Scholar] [CrossRef]
  5. Li, J.; Bioucas-Dias, J.M.; Plaza, A.; Liu, L. Robust Collaborative Nonnegative Matrix Factorization for Hyperspectral Unmixing. IEEE Trans. Geosci. Remote Sens. 2016, 54, 6076–6090.
  6. Chi, J.; Crawford, M.M. Spectral unmixing-based crop residue estimation using hyperspectral remote sensing data: A case study at Purdue university. IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens. 2014, 7, 2531–2539.
  7. Yuan, Y.; Wang, Q.; Zhu, G. Fast hyperspectral anomaly detection via high-order 2-D crossing filter. IEEE Trans. Geosci. Remote Sens. 2014, 53, 620–630.
  8. Fauvel, M.; Tarabalka, Y.; Benediktsson, J.A.; Chanussot, J.; Tilton, J.C. Advances in spectral–spatial classification of hyperspectral images. Proc. IEEE 2012, 101, 652–675.
  9. Fang, L.; Li, S.; Duan, W.; Ren, J.; Benediktsson, J.A. Classification of hyperspectral images by exploiting spectral–spatial information of superpixel via multiple kernels. IEEE Trans. Geosci. Remote Sens. 2015, 53, 6663–6674.
  10. Kang, X.; Duan, P.; Li, S.; Benediktsson, J.A. Decolorization-based hyperspectral image visualization. IEEE Trans. Geosci. Remote Sens. 2018, 56, 4346–4360.
  11. Tarasiewicz, T.; Nalepa, J.; Farrugia, R.A.; Valentino, G.; Chen, M.; Briffa, J.A.; Kawulok, M. Multitemporal and multispectral data fusion for super-resolution of Sentinel-2 images. IEEE Trans. Geosci. Remote Sens. 2023, 61, 5406519.
  12. Karim, S.; Qadir, A.; Farooq, U.; Shakir, M.; Laghari, A.A. Hyperspectral imaging: A review and trends towards medical imaging. Curr. Med. Imaging 2023, 19, 417–427.
  13. Zhou, C.; He, Z.; Lou, A.; Plaza, A. RGB-to-HSV: A Frequency-Spectrum Unfolding Network for Spectral Super-Resolution of RGB Videos. IEEE Trans. Geosci. Remote Sens. 2024, 62, 5609318.
  14. Pelletier, C.; Valero, S.; Inglada, J.; Champion, N.; Marais Sicre, C.; Dedieu, G. Effect of training class label noise on classification performances for land cover mapping with satellite image time series. Remote Sens. 2017, 9, 173.
  15. Li, S.; Hao, Q.; Gao, G.; Kang, X. The effect of ground truth on performance evaluation of hyperspectral image classification. IEEE Trans. Geosci. Remote Sens. 2018, 56, 7195–7206.
  16. Tu, B.; Yang, X.; Li, N.; Zhou, C.; He, D. Hyperspectral anomaly detection via density peak clustering. Pattern Recognit. Lett. 2020, 129, 144–149.
  17. Ma, L.; Crawford, M.M.; Tian, J. Local manifold learning-based k-nearest-neighbor for hyperspectral image classification. IEEE Trans. Geosci. Remote Sens. 2010, 48, 4099–4109.
  18. Bo, C.; Lu, H.; Wang, D. Weighted generalized nearest neighbor for hyperspectral image classification. IEEE Access 2017, 5, 1496–1509.
  19. Melgani, F.; Bruzzone, L. Classification of hyperspectral remote sensing images with support vector machines. IEEE Trans. Geosci. Remote Sens. 2004, 42, 1778–1790.
  20. Demir, B.; Ertürk, S. Improving SVM classification accuracy using a hierarchical approach for hyperspectral images. In Proceedings of the 2009 16th IEEE International Conference on Image Processing (ICIP), Cairo, Egypt, 7–10 November 2009; IEEE: Piscataway, NJ, USA, 2009; pp. 2849–2852.
  21. Ham, J.; Chen, Y.; Crawford, M.M.; Ghosh, J. Investigation of the random forest framework for classification of hyperspectral data. IEEE Trans. Geosci. Remote Sens. 2005, 43, 492–501.
  22. Xia, J.; Du, P.; He, X.; Chanussot, J. Hyperspectral remote sensing image classification based on rotation forest. IEEE Geosci. Remote Sens. Lett. 2013, 11, 239–243.
  23. Xia, J.; Ghamisi, P.; Yokoya, N.; Iwasaki, A. Random forest ensembles and extended multiextinction profiles for hyperspectral image classification. IEEE Trans. Geosci. Remote Sens. 2017, 56, 202–216.
  24. Maschler, J.; Atzberger, C.; Immitzer, M. Individual tree crown segmentation and classification of 13 tree species using airborne hyperspectral data. Remote Sens. 2018, 10, 1218.
  25. Samat, A.; Du, P.; Liu, S.; Li, J.; Cheng, L. E2LMs: Ensemble Extreme Learning Machines for Hyperspectral Image Classification. IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens. 2014, 7, 1060–1069.
  26. Li, W.; Chen, C.; Su, H.; Du, Q. Local binary patterns and extreme learning machine for hyperspectral imagery classification. IEEE Trans. Geosci. Remote Sens. 2015, 53, 3681–3693.
  27. Chen, Y.; Nasrabadi, N.M.; Tran, T.D. Hyperspectral image classification via kernel sparse representation. IEEE Trans. Geosci. Remote Sens. 2012, 51, 217–231.
  28. Tu, B.; Zhang, X.; Kang, X.; Zhang, G.; Wang, J.; Wu, J. Hyperspectral image classification via fusing correlation coefficient and joint sparse representation. IEEE Geosci. Remote Sens. Lett. 2018, 15, 340–344.
  29. Ratle, F.; Camps-Valls, G.; Weston, J. Semisupervised neural networks for efficient hyperspectral image classification. IEEE Trans. Geosci. Remote Sens. 2010, 48, 2271–2282.
  30. Song, W.; Li, S.; Fang, L.; Lu, T. Hyperspectral image classification with deep feature fusion network. IEEE Trans. Geosci. Remote Sens. 2018, 56, 3173–3184.
  31. Kang, X.; Li, S.; Benediktsson, J.A. Spectral–spatial hyperspectral image classification with edge-preserving filtering. IEEE Trans. Geosci. Remote Sens. 2013, 52, 2666–2677.
  32. He, L.; Li, J.; Liu, C.; Li, S. Recent advances on spectral–spatial hyperspectral image classification: An overview and new guidelines. IEEE Trans. Geosci. Remote Sens. 2017, 56, 1579–1597.
  33. Zhao, G.; Tu, B.; Fei, H.; Li, N.; Yang, X. Spatial-spectral classification of hyperspectral image via group tensor decomposition. Neurocomputing 2018, 316, 68–77.
  34. Veit, A.; Alldrin, N.; Chechik, G.; Krasin, I.; Gupta, A.; Belongie, S. Learning from Noisy Large-Scale Datasets with Minimal Supervision. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, HI, USA, 21–26 July 2017; IEEE: Piscataway, NJ, USA, 2017; pp. 839–847.
  35. Li, Y.; Yang, J.; Song, Y.; Cao, L.; Luo, J.; Li, L.J. Learning from Noisy Labels with Distillation. In Proceedings of the IEEE International Conference on Computer Vision, Venice, Italy, 22–29 October 2017; IEEE: Piscataway, NJ, USA, 2017; pp. 1910–1918.
  36. Ren, M.; Zeng, W.; Yang, B.; Urtasun, R. Learning to Reweight Examples for Robust Deep Learning. In Proceedings of the International Conference on Machine Learning, PMLR, Stockholm, Sweden, 10–15 July 2018; pp. 4334–4343.
  37. Charikar, M.; Steinhardt, J.; Valiant, G. Learning from Untrusted Data. In Proceedings of the Annual ACM SIGACT Symposium on Theory of Computing, Montreal, QC, Canada, 19–23 June 2017; pp. 47–60.
  38. Hendrycks, D.; Mazeika, M.; Wilson, D.; Gimpel, K. Using Trusted Data to Train Deep Networks on Labels Corrupted by Severe Noise. In Proceedings of the 32nd International Conference on Neural Information Processing Systems, Montreal, QC, Canada, 3–8 December 2018; MIT Press: Cambridge, MA, USA, 2018; pp. 10477–10486.
  39. Jiang, J.; Ma, J.; Liu, X. Multilayer Spectral–Spatial Graphs for Label Noisy Robust Hyperspectral Image Classification. IEEE Trans. Neural Netw. Learn. Syst. 2022, 33, 839–852.
  40. Xu, Y.; Li, Z.; Li, W.; Du, Q.; Liu, C.; Fang, Z.; Zhai, L. Dual-channel residual network for hyperspectral image classification with noisy labels. IEEE Trans. Geosci. Remote Sens. 2021, 60, 5502511.
  41. Tu, B.; Zhang, X.; Kang, X.; Zhang, G.; Li, S. Density Peak-Based Noisy Label Detection for Hyperspectral Image Classification. IEEE Trans. Geosci. Remote Sens. 2019, 57, 1573–1584.
  42. Tu, B.; Zhou, C.; Peng, J.; He, W.; Ou, X.; Xu, Z. Kernel entropy component analysis-based robust hyperspectral image supervised classification. Remote Sens. 2019, 11, 2823.
  43. Tu, B.; Zhou, C.; Liao, X.; Xu, Z.; Peng, Y.; Ou, X. Hierarchical Structure-Based Noisy Labels Detection for Hyperspectral Image Classification. IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens. 2020, 13, 2183–2199.
  44. Pelletier, C.; Valero, S.; Inglada, J.; Dedieu, G.; Champion, N. Filtering mislabeled data for improving time series classification. In Proceedings of the 9th International Workshop on the Analysis of Multitemporal Remote Sensing Images, Brugge, Belgium, 27–29 June 2017; pp. 1–4.
  45. Jiang, J.; Ma, J.; Wang, Z.; Chen, C.; Liu, X. Hyperspectral Image Classification in the Presence of Noisy Labels. IEEE Trans. Geosci. Remote Sens. 2019, 57, 851–865.
  46. Wang, Q.; Lin, J.; Yuan, Y. Salient Band Selection for Hyperspectral Image Classification via Manifold Ranking. IEEE Trans. Neural Netw. Learn. Syst. 2016, 27, 1279–1289.
  47. Mou, L.; Ghamisi, P.; Zhu, X. Deep Recurrent Neural Networks for Hyperspectral Image Classification. IEEE Trans. Geosci. Remote Sens. 2017, 55, 3639–3655.
  48. Li, W.; Wu, G.; Zhang, F.; Du, Q. Hyperspectral Image Classification Using Deep Pixel-Pair Features. IEEE Trans. Geosci. Remote Sens. 2017, 55, 844–853.
  49. Paoletti, M.E.; Haut, J.M.; Fernandez-Beltran, R.; Plaza, J.; Plaza, A.J.; Pla, F. Deep pyramidal residual networks for spectral–spatial hyperspectral image classification. IEEE Trans. Geosci. Remote Sens. 2018, 57, 740–754.
  50. Wang, L.; Peng, J.; Sun, W. Spatial–spectral squeeze-and-excitation residual network for hyperspectral image classification. Remote Sens. 2019, 11, 884.
  51. Sun, H.; Zheng, X.; Lu, X.; Wu, S. Spectral–Spatial Attention Network for Hyperspectral Image Classification. IEEE Trans. Geosci. Remote Sens. 2020, 58, 3232–3245.
  52. Yang, J.; Zhao, Y.Q.; Chan, J.C.W. Learning and Transferring Deep Joint Spectral–Spatial Features for Hyperspectral Classification. IEEE Trans. Geosci. Remote Sens. 2017, 55, 4729–4742.
  53. Xu, X.; Li, W.; Ran, Q.; Du, Q.; Gao, L.; Zhang, B. Multisource Remote Sensing Data Classification Based on Convolutional Neural Network. IEEE Trans. Geosci. Remote Sens. 2018, 56, 937–949.
  54. Hang, R.; Li, Z.; Liu, Q.; Ghamisi, P.; Bhattacharyya, S.S. Hyperspectral Image Classification With Attention-Aided CNNs. IEEE Trans. Geosci. Remote Sens. 2021, 59, 2281–2293.
  55. Liu, D.; Han, G.; Liu, P.; Yang, H.; Chen, D.; Li, Q.; Wu, J.; Wang, Y. A discriminative spectral–spatial-semantic feature network based on shuffle and frequency attention mechanisms for hyperspectral image classification. Remote Sens. 2022, 14, 2678.
  56. Liu, D.; Shao, T.; Qi, G.; Li, M.; Zhang, J. A Hybrid-Scale Feature Enhancement Network for Hyperspectral Image Classification. Remote Sens. 2023, 16, 22.
  57. Roy, S.K.; Hong, D.; Kar, P.; Wu, X.; Liu, X.; Zhao, D. Lightweight heterogeneous kernel convolution for hyperspectral image classification with noisy labels. IEEE Geosci. Remote Sens. Lett. 2021, 19, 5509705.
  58. Zhang, X.; Yang, S.; Feng, Z.; Song, L.; Wei, Y.; Jiao, L. Triple Contrastive Representation Learning for Hyperspectral Image Classification with Noisy Labels. IEEE Trans. Geosci. Remote Sens. 2023, 61, 500116.
  59. Wang, L.; Zhu, T.; Kumar, N.; Li, Z.; Wu, C.; Zhang, P. Attentive-Adaptive Network for Hyperspectral Images Classification with Noisy Labels. IEEE Trans. Geosci. Remote Sens. 2023, 61, 5505514.
  60. Zhang, Y.; Sun, J.; Shi, H.; Ge, Z.; Yu, Q.; Cao, G.; Li, X. Agreement and Disagreement-Based Co-Learning with Dual Network for Hyperspectral Image Classification with Noisy Labels. Remote Sens. 2023, 15, 2543.
  61. Tu, B.; Zhang, X.; Kang, X.; Wang, J.; Benediktsson, J.A. Spatial Density Peak Clustering for Hyperspectral Image Classification with Noisy Labels. IEEE Trans. Geosci. Remote Sens. 2019, 57, 5085–5097.
  62. Tu, B.; Zhou, C.; He, D.; Huang, S.; Plaza, A. Hyperspectral Classification with Noisy Label Detection via Superpixel-to-Pixel Weighting Distance. IEEE Trans. Geosci. Remote Sens. 2020, 58, 4116–4131.
  63. Leng, Q.; Yang, H.; Jiang, J. Label noise cleansing with sparse graph for hyperspectral image classification. Remote Sens. 2019, 11, 1116.
  64. Liu, M.Y.; Tuzel, O.; Ramalingam, S.; Chellappa, R. Entropy rate superpixel segmentation. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Colorado Springs, CO, USA, 20–25 June 2011; IEEE: Piscataway, NJ, USA, 2011; pp. 2097–2104.
  65. Wold, S.; Esbensen, K.; Geladi, P. Principal component analysis. Chemom. Intell. Lab. Syst. 1987, 2, 37–52.
  66. Canny, J. A Computational Approach to Edge Detection. In Readings in Computer Vision; Elsevier: Amsterdam, The Netherlands, 1987; pp. 184–203.
  67. Wang, Q.; Li, Q.; Li, X. Hyperspectral band selection via adaptive subspace partition strategy. IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens. 2019, 12, 4940–4950.
  68. Achanta, R.; Shaji, A.; Smith, K.; Lucchi, A.; Fua, P.; Süsstrunk, S. SLIC superpixels compared to state-of-the-art superpixel methods. IEEE Trans. Pattern Anal. Mach. Intell. 2012, 34, 2274–2282.
Figure 1. The flowchart of constructing the new trusted set.
Figure 2. KSC dataset: (a) False color image. (b) Ground-truth class map.
Figure 3. UP dataset: (a) False color image. (b) Ground-truth class map.
Figure 4. Salinas dataset: (a) False color image. (b) Ground-truth class map.
Figure 5. The classification maps of seven processing methods (BASE, RLPA, DPNLD, KECA, SPWD, MSSG, and ASLPA) with four classifiers (KNN, SVM, RF, and ELM) on the KSC dataset when ρ = 0.3. The small red boxes and large red boxes are the highlighted regions and corresponding enlarged regions, respectively; the same applies to the green boxes.
Figure 6. The classification maps of seven processing methods (BASE, RLPA, DPNLD, KECA, SPWD, MSSG, and ASLPA) with four classifiers (KNN, SVM, RF, and ELM) on the UP dataset when ρ = 0.3. The small green boxes and large green boxes are the highlighted regions and corresponding enlarged regions, respectively.
Figure 7. The classification maps of seven processing methods (BASE, RLPA, DPNLD, KECA, SPWD, MSSG, and ASLPA) with four classifiers (KNN, SVM, RF, and ELM) on the Salinas dataset when ρ = 0.3. The small red boxes and large red boxes are the highlighted regions and corresponding enlarged regions, respectively.
Table 1. Land-cover types and corresponding numbers of all labeled, testing, training, and trusted sets in the KSC dataset.

No. | Class | All | Test | Train | Trusted
1 | Scrub | 761 | 533 | 228 | 68
2 | Willow swamp | 243 | 170 | 73 | 22
3 | CP hammock | 256 | 179 | 77 | 23
4 | CP/Oak | 252 | 176 | 76 | 23
5 | Slash pine | 161 | 113 | 48 | 14
6 | Oak/Broadleaf | 229 | 160 | 69 | 21
7 | Hardwood swamp | 105 | 73 | 32 | 10
8 | Graminoid marsh | 431 | 302 | 129 | 39
9 | Spartina marsh | 520 | 364 | 156 | 47
10 | Cattail marsh | 404 | 283 | 121 | 36
11 | Salt marsh | 419 | 293 | 126 | 38
12 | Mud flats | 503 | 352 | 151 | 45
13 | Water | 927 | 649 | 278 | 83
Table 2. Land-cover types and corresponding numbers of all labeled, testing, training, and trusted sets in the UP dataset.

No. | Class | All | Test | Train | Trusted
1 | Asphalt | 6631 | 5968 | 663 | 207
2 | Meadows | 18,649 | 16,784 | 1865 | 562
3 | Gravel | 2099 | 1889 | 210 | 55
4 | Trees | 3064 | 2758 | 306 | 97
5 | Painted metal sheets | 1345 | 1210 | 135 | 36
6 | Bare Soil | 5029 | 4526 | 503 | 146
7 | Bitumen | 1330 | 1197 | 133 | 38
8 | Self-Blocking Bricks | 3682 | 3314 | 368 | 106
9 | Shadows | 947 | 852 | 95 | 36
Table 3. Land-cover types and corresponding numbers of all labeled, testing, training, and trusted sets in the Salinas dataset.

No. | Class | All | Test | Train | Trusted
1 | Brocoli_green_weeds_1 | 2009 | 1808 | 201 | 60
2 | Brocoli_green_weeds_2 | 3726 | 3353 | 373 | 115
3 | Fallow | 1976 | 1778 | 198 | 64
4 | Fallow_rough_plow | 1394 | 1225 | 139 | 43
5 | Fallow_smooth | 2678 | 2410 | 268 | 73
6 | Stubble | 3959 | 3563 | 396 | 109
7 | Celery | 3579 | 3221 | 358 | 123
8 | Grapes_untrained | 11,271 | 10,144 | 1127 | 350
9 | Soil_vinyard_develop | 6203 | 5583 | 620 | 154
10 | Corn_senesced_green_weeds | 3278 | 2950 | 328 | 97
11 | Lettuce_romaine_4wk | 1068 | 961 | 107 | 31
12 | Lettuce_romaine_5wk | 1927 | 1734 | 193 | 55
13 | Lettuce_romaine_6wk | 916 | 824 | 92 | 30
14 | Lettuce_romaine_7wk | 1070 | 963 | 107 | 41
15 | Vinyard_untrained | 7268 | 6541 | 727 | 221
16 | Vinyard_vertical_trellis | 1807 | 1626 | 181 | 59
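In Tables 1–3, the training set is a per-class fraction of the labeled pixels (roughly 30% for KSC and 10% for UP and Salinas), and the trusted subset is roughly 30% of each class's training samples. The following Python sketch illustrates such a per-class split; it is not the authors' code, and the function name, fractions, and seed are assumptions inferred from the tables.

```python
import numpy as np

def split_labeled_pixels(labels, train_frac=0.1, trusted_frac=0.3, seed=0):
    """Per-class split of labeled pixels into training/testing sets, plus a
    small trusted subset of the training set (illustrative sketch only)."""
    rng = np.random.default_rng(seed)
    train, test, trusted = [], [], []
    for c in np.unique(labels[labels > 0]):  # label 0 = unlabeled background
        idx = rng.permutation(np.flatnonzero(labels == c))
        n_train = max(1, round(train_frac * idx.size))
        train.append(idx[:n_train])
        test.append(idx[n_train:])
        # Trusted samples are a small clean subset of the training samples.
        trusted.append(idx[:max(1, round(trusted_frac * n_train))])
    return np.concatenate(train), np.concatenate(test), np.concatenate(trusted)
```

For example, Salinas class 1 has 2009 labeled pixels; with train_frac = 0.1 and trusted_frac = 0.3 this yields about 201 training and 60 trusted samples, consistent with the corresponding row of Table 3.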
Table 4. Number of noisy samples remaining in the training set with 4 ablation methods (RLPA, ALPA, GLPA, and ASLPA) on the KSC, UP, and Salinas datasets. All results are averaged over ten runs and the best results are boldfaced.

NL | KSC (RLPA, ALPA, GLPA, ASLPA) | UP (RLPA, ALPA, GLPA, ASLPA) | Salinas (RLPA, ALPA, GLPA, ASLPA)
0.1 | 11.5, 11.6, 11.1, 9.0 | 23.5, 22.4, 21.6, 20.8 | 24.4, 24.4, 23.3, 22.8
0.2 | 14.6, 15.3, 14.2, 11.1 | 27.4, 27.6, 24.7, 23.1 | 28.8, 29.1, 27.0, 25.5
0.3 | 19.8, 20.0, 20.4, 16.0 | 34.6, 34.0, 32.2, 26.8 | 33.1, 33.0, 31.7, 27.0
0.4 | 26.4, 25.9, 28.5, 19.4 | 46.4, 44.8, 43.8, 32.1 | 42.0, 42.7, 41.7, 32.5
0.5 | 32.2, 31.4, 34.9, 22.4 | 63.1, 59.5, 59.1, 37.5 | 49.9, 50.9, 51.7, 36.8
0.6 | 38.8, 37.8, 33.7, 25.9 | 83.0, 76.5, 67.8, 47.8 | 65.1, 65.5, 53.7, 40.0
0.7 | 51.7, 49.5, 23.6, 24.8 | 115.2, 97.0, 68.0, 64.6 | 87.7, 85.2, 52.7, 49.1
0.8 | 77.2, 66.0, 23.3, 22.8 | 195.1, 137.5, 69.8, 57.8 | 128.0, 114.2, 52.3, 48.9
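The noise level NL in Table 4 (and the tables that follow) is the fraction ρ of training labels corrupted before cleaning. A common way to simulate this is symmetric label noise, in which a fraction ρ of the labels is flipped to a different, uniformly drawn class; the sketch below shows this protocol under that assumption (the paper's exact corruption scheme is not restated here, and inject_label_noise is a hypothetical helper).

```python
import numpy as np

def inject_label_noise(y, rho, num_classes, seed=0):
    """Flip a fraction rho of labels to a different, uniformly drawn class
    (symmetric label noise; classes assumed to be coded 1..num_classes)."""
    rng = np.random.default_rng(seed)
    y_noisy = y.copy()
    # Choose which training samples get a corrupted label.
    flip = rng.choice(y.size, size=int(round(rho * y.size)), replace=False)
    for i in flip:
        # Replacement label is drawn uniformly from the other classes.
        others = [c for c in range(1, num_classes + 1) if c != y_noisy[i]]
        y_noisy[i] = rng.choice(others)
    return y_noisy
```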
Table 5. OA, AA, and Kappa metrics of 4 ablation methods (RLPA, ALPA, GLPA, and ASLPA) with the SVM classifier on the KSC, UP, and Salinas datasets. The best results are boldfaced.

NL | OA (RLPA, ALPA, GLPA, ASLPA) | AA (RLPA, ALPA, GLPA, ASLPA) | Kappa (RLPA, ALPA, GLPA, ASLPA)

KSC
0.1 | 96.22, 96.26, 96.25, 96.49 | 94.54, 94.67, 94.75, 95.10 | 95.78, 95.83, 95.82, 96.09
0.2 | 96.04, 96.13, 96.22, 96.26 | 93.76, 93.12, 94.47, 94.66 | 95.58, 95.24, 95.79, 95.83
0.3 | 95.53, 95.49, 95.63, 96.28 | 92.93, 93.01, 93.21, 94.85 | 95.02, 94.98, 95.13, 95.86
0.4 | 95.01, 95.17, 95.17, 96.01 | 91.02, 91.46, 91.41, 93.69 | 94.44, 94.62, 94.62, 95.55
0.5 | 93.82, 93.80, 94.21, 95.55 | 88.14, 88.54, 90.01, 92.11 | 93.10, 93.09, 93.54, 95.05
0.6 | 94.02, 93.66, 94.38, 95.20 | 89.47, 88.82, 90.69, 91.48 | 93.33, 92.92, 93.73, 94.65
0.7 | 92.49, 92.88, 95.19, 95.44 | 86.72, 86.74, 91.07, 91.80 | 91.62, 92.06, 94.42, 94.92
0.8 | 90.53, 90.44, 94.95, 95.54 | 83.04, 82.94, 90.19, 91.93 | 89.43, 89.31, 93.79, 95.03

UP
0.1 | 97.91, 97.56, 97.78, 97.86 | 96.69, 96.33, 96.44, 96.71 | 97.23, 96.75, 97.05, 97.16
0.2 | 97.65, 97.41, 97.57, 97.74 | 96.20, 96.15, 96.25, 96.63 | 96.88, 96.56, 96.78, 97.01
0.3 | 96.90, 97.41, 97.57, 97.68 | 95.35, 95.92, 96.00, 96.46 | 95.88, 96.57, 96.77, 96.92
0.4 | 96.79, 97.14, 97.09, 97.56 | 94.68, 95.62, 95.31, 96.18 | 95.74, 96.20, 96.13, 96.76
0.5 | 96.76, 97.03, 97.00, 97.48 | 94.80, 95.44, 94.94, 96.25 | 95.70, 96.05, 96.00, 96.66
0.6 | 95.82, 96.77, 96.95, 97.30 | 92.41, 94.90, 94.78, 95.81 | 94.44, 95.70, 95.94, 96.41
0.7 | 95.75, 96.53, 96.97, 97.26 | 92.85, 94.49, 94.86, 95.73 | 94.35, 95.39, 95.97, 96.36
0.8 | 95.16, 96.45, 96.85, 97.28 | 93.15, 93.79, 94.85, 95.68 | 93.55, 95.28, 95.82, 96.38

Salinas
0.1 | 97.39, 97.40, 97.43, 97.50 | 98.28, 98.32, 98.33, 98.34 | 97.09, 97.10, 97.13, 97.21
0.2 | 97.29, 97.30, 97.30, 97.33 | 98.19, 98.18, 98.22, 98.21 | 96.98, 97.00, 97.00, 97.03
0.3 | 97.22, 97.14, 97.08, 97.24 | 98.13, 98.08, 98.01, 98.14 | 96.90, 96.81, 96.74, 96.93
0.4 | 97.05, 97.05, 96.93, 97.19 | 97.95, 97.98, 97.83, 98.05 | 96.71, 96.72, 96.58, 96.87
0.5 | 96.94, 96.97, 96.86, 97.11 | 97.85, 97.96, 97.75, 97.97 | 96.59, 96.63, 96.50, 96.78
0.6 | 96.67, 96.67, 96.80, 96.98 | 97.56, 97.64, 97.76, 97.78 | 96.29, 96.29, 96.44, 96.64
0.7 | 96.33, 96.33, 96.85, 97.14 | 97.28, 97.32, 97.93, 98.03 | 95.91, 95.92, 96.44, 96.81
0.8 | 95.85, 96.02, 96.74, 97.07 | 96.97, 97.12, 97.67, 98.03 | 95.37, 95.57, 96.35, 96.74
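The OA, AA, and Kappa metrics reported in Table 5 and in Tables 7–9 follow their standard definitions: overall accuracy, mean per-class accuracy, and Cohen's kappa. A minimal Python sketch of how they are computed from a confusion matrix is given below (classification_scores is an illustrative helper; labels are assumed 0-indexed, with every class present in y_true).

```python
import numpy as np

def classification_scores(y_true, y_pred, num_classes):
    """Compute OA, AA, and Cohen's kappa from true/predicted label vectors."""
    # Confusion matrix: rows = true class, columns = predicted class.
    cm = np.zeros((num_classes, num_classes), dtype=np.int64)
    for t, p in zip(y_true, y_pred):
        cm[t, p] += 1
    n = cm.sum()
    oa = np.trace(cm) / n                                 # overall accuracy
    aa = np.mean(np.diag(cm) / cm.sum(axis=1))            # mean per-class accuracy
    pe = (cm.sum(axis=0) * cm.sum(axis=1)).sum() / n**2   # chance agreement
    kappa = (oa - pe) / (1 - pe)                          # Cohen's kappa
    return oa, aa, kappa
```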
Table 6. NoiOri, Numri, and Noires metrics for four superpixel segmentations on the Salinas dataset. The best results are in bold.

NL | NoiOri | Numri (Mode1, Mode2, Mode3, Mode4) | Noires (Mode1, Mode2, Mode3, Mode4)
0.1 | 380 | 369.0, 368.7, 317.3, 373.6 | 24.4, 37.9, 201.3, 22.8
0.2 | 748 | 729.6, 724.4, 631.4, 734.5 | 28.8, 45.1, 235.0, 25.5
0.3 | 1119 | 1088.2, 1081.3, 952.7, 1099.3 | 33.1, 54.5, 257.1, 27.0
0.4 | 1507 | 1467.5, 1454.1, 1278.5, 1482.0 | 42.0, 69.3, 290.7, 32.5
0.5 | 1884 | 1834.5, 1818.6, 1601.4, 1852.6 | 49.9, 78.3, 322.3, 36.8
0.6 | 2271 | 2205.7, 2186.4, 1915.1, 2234.2 | 65.1, 92.7, 381.6, 40.0
0.7 | 2635 | 2564.0, 2565.5, 2240.0, 2601.0 | 87.7, 98.1, 525.5, 49.1
0.8 | 3033 | 2946.6, 2936.0, 2614.8, 2991.3 | 128.0, 115.2, 473.0, 48.9
Table 7. OA, AA, and Kappa metrics of seven processing methods with four classifiers on the KSC dataset. The best results are in boldface and are underlined.

NL | Classifier | OA (Base, RLPA, DPNLD, KECA, SPWD, MSSG, ASLPA) | AA (same method order) | Kappa (same method order)
0.1 | KNN | 86.95, 91.61, 88.96, 87.20, 89.99, 88.29, 91.71 | 82.81, 88.19, 85.24, 83.38, 86.08, 84.79, 88.38 | 85.46, 90.66, 87.71, 85.74, 88.86, 86.95, 90.77
0.1 | SVM | 89.4, 96.22, 95.15, 89.15, 96.32, 95.01, 96.49 | 81.13, 94.54, 95.52, 88.63, 96.44, 94.87, 95.10 | 88.11, 95.78, 94.60, 87.83, 95.90, 94.44, 96.09
0.1 | RF | 94.80, 94.84, 94.61, 94.80, 94.69, 93.91, 94.98 | 91.89, 92.03, 91.97, 92.16, 92.00, 90.86, 92.30 | 94.21, 94.26, 94.00, 94.20, 94.09, 93.22, 94.42
0.1 | ELM | 96.10, 96.27, 95.57, 96.10, 96.24, 96.55, 96.46 | 95.78, 95.50, 94.69, 95.32, 95.16, 95.82, 95.94 | 95.65, 95.85, 95.07, 95.65, 95.82, 96.15, 96.06
0.2 | KNN | 86.50, 91.48, 86.05, 86.51, 86.38, 88.18, 91.65 | 81.27, 87.99, 82.55, 82.82, 82.73, 84.59, 88.28 | 84.95, 90.52, 84.45, 84.96, 84.81, 86.83, 90.71
0.2 | SVM | 83.48, 96.04, 91.04, 83.70, 91.60, 94.76, 96.26 | 69.61, 93.76, 90.57, 75.02, 89.71, 94.71, 94.66 | 81.50, 95.58, 90.00, 81.73, 90.63, 94.16, 95.83
0.2 | RF | 93.49, 94.71, 93.65, 93.78, 93.85, 93.50, 94.99 | 89.99, 91.85, 90.75, 90.91, 91.01, 90.88, 92.24 | 92.75, 94.10, 92.93, 93.07, 93.15, 92.77, 94.42
0.2 | ELM | 95.11, 96.12, 95.02, 95.18, 95.48, 96.11, 96.38 | 94.36, 95.15, 94.13, 94.17, 94.45, 95.20, 95.73 | 94.55, 95.68, 94.46, 94.63, 94.97, 95.66, 95.97
0.3 | KNN | 86.31, 91.24, 85.54, 86.34, 86.10, 88.21, 91.53 | 81.14, 87.75, 81.90, 82.41, 82.11, 84.54, 88.10 | 84.74, 90.25, 83.88, 84.77, 84.50, 86.86, 90.57
0.3 | SVM | 80.03, 95.53, 88.01, 80.40, 87.74, 95.06, 96.28 | 64.63, 92.93, 84.91, 67.28, 83.24, 94.90, 94.85 | 77.60, 95.02, 86.59, 77.99, 86.31, 94.50, 95.86
0.3 | RF | 92.43, 94.59, 92.97, 92.44, 92.88, 92.98, 94.77 | 88.99, 91.68, 89.42, 88.81, 89.26, 89.87, 91.98 | 91.57, 93.97, 92.17, 91.58, 92.08, 92.18, 94.18
0.3 | ELM | 94.16, 95.98, 94.51, 94.21, 94.90, 94.41, 96.18 | 93.29, 95.00, 93.52, 92.92, 93.77, 93.59, 95.36 | 93.49, 95.52, 93.89, 93.55, 94.32, 93.78, 95.75
0.4 | KNN | 85.57, 91.00, 85.08, 85.60, 85.53, 88.21, 91.34 | 79.55, 86.80, 81.53, 81.77, 81.67, 84.63, 87.40 | 83.91, 89.98, 83.35, 83.94, 83.87, 86.86, 90.36
0.4 | SVM | 72.51, 95.01, 85.58, 73.28, 84.28, 94.24, 96.01 | 56.76, 91.02, 85.38, 60.70, 79.40, 93.84, 93.69 | 68.92, 94.44, 83.84, 69.89, 82.37, 93.59, 95.55
0.4 | RF | 90.51, 94.46, 91.55, 90.72, 91.57, 92.30, 94.84 | 86.67, 90.87, 87.76, 86.67, 87.48, 89.76, 91.72 | 89.43, 93.83, 90.59, 89.67, 90.61, 91.42, 94.26
0.4 | ELM | 93.58, 95.67, 93.94, 93.59, 94.39, 94.24, 96.12 | 92.15, 93.66, 93.00, 92.39, 93.53, 93.75, 94.88 | 92.85, 95.18, 93.25, 92.86, 93.75, 93.58, 95.68
0.5 | KNN | 84.75, 90.71, 84.11, 84.95, 85.11, 88.21, 91.22 | 78.34, 86.56, 80.17, 80.63, 80.90, 84.78, 87.21 | 83.00, 89.66, 82.27, 83.22, 83.40, 86.86, 90.23
0.5 | SVM | 74.58, 93.82, 81.77, 76.16, 83.61, 95.64, 95.55 | 52.14, 88.14, 76.24, 57.66, 74.34, 95.91, 92.11 | 71.25, 93.10, 79.53, 73.07, 81.66, 95.14, 95.05
0.5 | RF | 88.46, 94.33, 89.26, 88.47, 89.32, 92.40, 94.66 | 81.57, 90.75, 84.77, 83.72, 84.57, 90.24, 91.37 | 87.16, 93.68, 88.05, 87.17, 88.11, 91.54, 94.06
0.5 | ELM | 92.56, 95.55, 92.74, 92.70, 93.15, 93.67, 95.97 | 89.21, 93.56, 91.29, 91.04, 91.29, 93.64, 94.56 | 91.72, 95.05, 91.90, 91.87, 92.36, 92.94, 95.51
0.6 | KNN | 83.73, 90.42, 83.46, 84.00, 83.97, 87.91, 91.11 | 78.34, 86.54, 79.37, 79.34, 79.41, 83.73, 87.41 | 81.87, 89.34, 81.56, 82.17, 82.14, 86.53, 90.10
0.6 | SVM | 68.95, 94.02, 77.31, 68.36, 75.80, 94.52, 95.20 | 52.14, 89.47, 73.62, 53.03, 63.82, 93.71, 91.48 | 64.69, 93.33, 74.46, 64.02, 72.77, 93.89, 94.65
0.6 | RF | 85.87, 94.30, 87.10, 86.24, 87.15, 91.86, 94.61 | 81.57, 90.80, 82.09, 81.06, 82.08, 89.18, 91.52 | 84.28, 93.65, 85.64, 84.69, 85.70, 90.93, 93.99
0.6 | ELM | 91.67, 95.36, 92.00, 91.35, 91.84, 93.56, 96.05 | 89.21, 93.87, 90.87, 89.43, 89.87, 92.83, 94.87 | 90.72, 94.83, 91.09, 90.37, 90.91, 92.82, 95.61
0.7 | KNN | 81.21, 89.68, 80.72, 81.36, 81.57, 87.99, 91.21 | 75.94, 85.65, 75.56, 75.74, 76.14, 84.18, 86.93 | 79.09, 88.51, 78.52, 79.26, 79.49, 86.62, 90.22
0.7 | SVM | 65.89, 92.49, 73.35, 65.71, 73.49, 90.73, 95.44 | 48.73, 86.72, 68.99, 52.04, 61.98, 92.44, 91.80 | 61.01, 91.62, 69.95, 60.79, 70.15, 89.66, 94.92
0.7 | RF | 82.14, 93.93, 82.76, 82.47, 83.03, 92.02, 94.50 | 77.74, 90.51, 76.92, 76.58, 77.24, 89.46, 90.68 | 80.15, 93.25, 80.83, 80.50, 81.13, 91.11, 93.88
0.7 | ELM | 90.72, 95.15, 91.18, 89.56, 90.08, 93.58, 95.67 | 87.14, 93.77, 89.66, 87.21, 88.21, 93.34, 93.46 | 89.66, 94.60, 90.17, 88.37, 88.95, 92.85, 95.18
0.8 | KNN | 75.00, 88.36, 75.23, 75.38, 75.14, 88.07, 91.25 | 70.62, 84.66, 69.53, 69.33, 69.12, 84.23, 86.74 | 72.25, 87.05, 72.49, 72.70, 72.41, 86.71, 90.26
0.8 | SVM | 60.00, 90.53, 74.41, 59.46, 73.21, 91.91, 95.54 | 42.39, 83.04, 65.63, 41.89, 58.64, 91.17, 91.93 | 54.07, 89.43, 71.22, 53.45, 69.85, 90.99, 95.03
0.8 | RF | 74.76, 93.40, 76.34, 75.52, 79.63, 91.99, 94.51 | 70.48, 90.10, 69.80, 68.98, 69.44, 89.58, 90.45 | 72.03, 92.65, 73.75, 72.86, 73.33, 91.08, 93.89
0.8 | ELM | 90.10, 94.32, 90.81, 88.72, 88.86, 93.50, 95.70 | 86.45, 92.97, 88.86, 86.36, 86.37, 92.92, 93.32 | 88.98, 93.68, 89.75, 87.42, 87.58, 92.76, 95.22
Table 8. OA, AA, and Kappa metrics of seven processing methods with four classifiers on the UP dataset. The best results are in boldface and are underlined.

NL | Classifier | OA (Base, RLPA, DPNLD, KECA, SPWD, MSSG, ASLPA) | AA (same method order) | Kappa (same method order)
0.1 | KNN | 89.04, 92.20, 89.15, 89.05, 89.69, 91.58, 92.45 | 85.77, 89.99, 89.68, 90.02, 86.20, 91.16, 90.31 | 85.15, 89.60, 85.34, 85.17, 86.29, 88.72, 89.92
0.1 | SVM | 96.85, 97.91, 97.15, 96.86, 97.48, 97.20, 97.86 | 95.03, 96.69, 96.87, 97.00, 97.14, 97.11, 96.71 | 95.82, 97.23, 96.22, 95.82, 96.66, 96.28, 97.16
0.1 | RF | 92.47, 92.65, 92.37, 92.64, 92.46, 92.66, 92.65 | 89.73, 89.88, 91.67, 92.41, 92.21, 92.46, 89.68 | 89.93, 90.19, 89.80, 90.16, 89.91, 90.18, 90.17
0.1 | ELM | 97.60, 98.13, 97.70, 97.61, 97.83, 97.63, 98.08 | 96.44, 96.96, 97.22, 97.27, 97.39, 97.17, 96.86 | 96.82, 97.52, 96.95, 96.82, 97.13, 96.85, 97.45
0.2 | KNN | 89.01, 92.10, 88.98, 89.04, 88.97, 91.55, 92.40 | 85.59, 89.97, 89.40, 89.79, 89.70, 91.14, 90.29 | 85.12, 89.46, 85.11, 85.16, 85.07, 88.68, 89.86
0.2 | SVM | 95.38, 97.65, 96.48, 95.80, 96.92, 97.14, 97.74 | 91.65, 96.20, 96.35, 96.26, 96.70, 96.98, 96.63 | 93.83, 96.88, 95.32, 94.41, 95.91, 96.19, 97.01
0.2 | RF | 92.06, 92.63, 92.19, 92.30, 92.37, 92.48, 92.69 | 89.46, 89.80, 91.05, 91.60, 91.74, 92.11, 89.67 | 89.4, 90.16, 89.58, 89.71, 89.8, 89.91, 90.22
0.2 | ELM | 97.31, 98.03, 97.30, 97.29, 97.43, 97.56, 98.04 | 95.90, 96.84, 96.58, 96.75, 96.93, 97.09, 96.78 | 96.43, 97.39, 96.42, 96.40, 96.59, 96.75, 97.39
0.3 | KNN | 88.90, 91.67, 88.87, 89.00, 88.95, 91.56, 92.32 | 85.46, 89.11, 89.08, 89.52, 89.41, 91.13, 90.04 | 84.99, 88.91, 84.97, 85.14, 85.07, 88.71, 89.74
0.3 | SVM | 94.65, 96.90, 96.25, 94.84, 96.32, 97.17, 97.68 | 89.35, 95.35, 95.98, 95.56, 96.09, 96.97, 96.46 | 92.85, 95.88, 95.01, 93.11, 95.12, 96.23, 96.92
0.3 | RF | 91.56, 92.53, 91.80, 91.84, 91.98, 92.48, 92.64 | 89.03, 89.57, 89.93, 90.44, 90.43, 92.84, 89.63 | 88.74, 90.04, 89.07, 89.12, 89.31, 89.90, 90.15
0.3 | ELM | 96.98, 97.88, 97.05, 96.97, 97.19, 97.56, 97.98 | 95.56, 96.47, 96.15, 96.29, 96.35, 97.46, 96.67 | 96.00, 97.18, 96.09, 95.98, 96.27, 96.75, 97.32
0.4 | KNN | 88.75, 91.13, 88.38, 88.45, 88.49, 91.56, 92.19 | 85.67, 88.49, 88.25, 88.54, 88.62, 91.10, 89.73 | 84.84, 88.20, 84.32, 84.40, 84.45, 88.70, 89.57
0.4 | SVM | 94.10, 96.79, 95.56, 94.35, 95.14, 97.07, 97.56 | 88.58, 94.68, 95.43, 95.20, 95.51, 96.94, 96.18 | 92.10, 95.74, 94.10, 92.45, 93.51, 96.10, 96.76
0.4 | RF | 90.36, 92.13, 90.82, 90.92, 91.17, 92.37, 92.61 | 87.74, 89.24, 87.69, 88.24, 88.49, 92.77, 89.56 | 87.17, 89.49, 87.79, 87.91, 88.25, 89.78, 90.11
0.4 | ELM | 96.55, 97.50, 96.54, 96.54, 96.67, 97.56, 97.98 | 95.03, 96.10, 95.52, 95.74, 95.74, 97.49, 96.65 | 95.43, 96.68, 95.41, 95.41, 95.58, 96.76, 97.31
0.5 | KNN | 87.24, 90.40, 87.33, 87.41, 87.33, 91.55, 92.04 | 84.28, 87.50, 86.51, 87.15, 86.66, 91.00, 89.27 | 82.77, 87.24, 82.93, 82.99, 82.90, 88.69, 89.37
0.5 | SVM | 92.89, 96.76, 94.47, 93.07, 93.80, 97.16, 97.48 | 85.78, 94.8, 94.82, 94.57, 94.77, 96.89, 96.25 | 90.46, 95.70, 92.63, 90.71, 91.70, 96.23, 96.66
0.5 | RF | 88.54, 92.09, 89.69, 89.48, 89.71, 92.47, 92.51 | 85.81, 88.92, 85.68, 85.66, 85.96, 92.81, 89.25 | 84.76, 89.43, 86.32, 86.01, 86.32, 89.91, 89.97
0.5 | ELM | 95.94, 97.18, 96.11, 95.96, 96.15, 97.54, 97.79 | 94.02, 95.60, 94.93, 94.78, 95.00, 97.44, 96.29 | 94.62, 96.26, 94.84, 94.64, 94.89, 96.73, 97.07
0.6 | KNN | 85.35, 89.04, 85.58, 85.44, 85.57, 91.50, 91.75 | 81.45, 85.73, 83.33, 83.62, 83.73, 90.76, 88.58 | 80.25, 85.45, 80.60, 80.37, 80.57, 88.61, 88.98
0.6 | SVM | 89.77, 95.82, 92.91, 90.48, 91.18, 97.01, 97.30 | 79.34, 92.41, 93.62, 92.78, 91.86, 96.81, 95.81 | 86.13, 94.44, 90.51, 87.14, 88.10, 96.02, 96.41
0.6 | RF | 85.67, 91.72, 87.67, 86.97, 87.44, 92.19, 92.41 | 82.00, 88.31, 82.19, 81.03, 82.12, 92.82, 88.86 | 80.98, 88.95, 83.67, 82.70, 83.34, 89.51, 89.84
0.6 | ELM | 95.00, 96.84, 95.60, 95.08, 95.43, 97.52, 97.59 | 92.46, 95.24, 94.59, 94.05, 94.12, 97.35, 95.83 | 93.36, 95.81, 94.16, 93.47, 93.94, 96.71, 96.8
0.7 | KNN | 81.70, 88.59, 82.38, 81.83, 82.19, 91.37, 91.57 | 77.95, 85.38, 80.98, 81.23, 77.43, 90.74, 87.71 | 75.62, 84.64, 76.52, 75.72, 76.21, 88.44, 88.73
0.7 | SVM | 89.37, 95.75, 91.25, 89.34, 90.24, 96.96, 97.26 | 78.28, 92.85, 92.44, 91.35, 91.16, 96.80, 95.73 | 85.59, 94.35, 88.22, 85.55, 86.82, 95.96, 96.36
0.7 | RF | 81.79, 90.53, 84.55, 83.23, 84.09, 92.41, 92.32 | 77.89, 86.61, 76.84, 74.43, 76.12, 92.64, 88.31 | 76.04, 87.41, 79.69, 77.89, 79.04, 89.81, 89.71
0.7 | ELM | 93.88, 96.17, 95.04, 94.08, 94.44, 97.43, 97.45 | 91.53, 94.64, 93.72, 93.31, 92.57, 97.27, 95.33 | 91.87, 94.92, 93.42, 92.13, 92.62, 96.58, 96.62
0.8 | KNN | 75.10, 88.21, 79.36, 79.49, 79.49, 91.22, 91.69 | 70.93, 84.43, 81.51, 81.65, 81.42, 90.07, 88.12 | 67.22, 84.01, 71.23, 71.40, 71.40, 88.24, 88.89
0.8 | SVM | 85.36, 95.16, 88.25, 85.71, 87.20, 96.82, 97.28 | 72.27, 93.15, 89.95, 85.78, 87.97, 96.73, 95.68 | 79.91, 93.55, 84.01, 80.42, 82.53, 95.76, 96.38
0.8 | RF | 74.52, 90.78, 78.88, 77.03, 77.93, 92.32, 92.42 | 70.00, 86.68, 68.28, 64.55, 66.54, 92.50, 88.72 | 66.86, 87.70, 72.48, 70.01, 71.23, 89.69, 89.84
0.8 | ELM | 92.74, 95.90, 94.01, 92.98, 93.13, 97.34, 97.52 | 88.60, 94.19, 92.51, 91.86, 92.03, 96.99, 95.56 | 90.32, 94.56, 92.05, 90.65, 90.85, 96.47, 96.71
Table 9. OA, AA, and Kappa metrics of seven processing methods with four classifiers on the Salinas dataset. The best results are in boldface and are underlined.

NL | Classifier | OA (Base, RLPA, DPNLD, KECA, SPWD, MSSG, ASLPA) | AA (same method order) | Kappa (same method order)
0.1 | KNN | 91.42, 92.90, 90.77, 89.67, 91.49, 92.74, 92.99 | 94.38, 96.08, 92.31, 93.54, 92.89, 95.49, 96.15 | 90.45, 92.10, 89.74, 90.46, 90.54, 91.92, 92.19
0.1 | SVM | 95.36, 97.39, 96.60, 95.22, 96.74, 97.15, 97.50 | 97.04, 98.28, 97.76, 97.54, 97.73, 98.09, 98.34 | 94.82, 97.09, 96.21, 94.67, 96.37, 96.83, 97.21
0.1 | RF | 94.09, 93.92, 93.63, 94.10, 93.66, 94.11, 94.09 | 96.35, 96.26, 95.61, 96.30, 95.69, 96.20, 96.37 | 93.42, 93.22, 92.91, 93.43, 92.94, 93.44, 93.42
0.1 | ELM | 97.59, 97.81, 97.68, 97.61, 97.60, 97.70, 98.08 | 98.53, 98.67, 98.50, 98.62, 98.67, 98.61, 98.80 | 97.32, 97.56, 97.41, 97.34, 97.33, 97.43, 97.86
0.2 | KNN | 91.40, 92.86, 90.72, 91.38, 90.75, 92.75, 92.92 | 94.39, 96.01, 92.38, 93.48, 92.45, 95.50, 96.08 | 90.42, 92.05, 89.68, 90.41, 89.71, 91.93, 92.11
0.2 | SVM | 95.25, 97.29, 96.03, 95.21, 96.35, 97.13, 97.33 | 96.88, 98.19, 97.64, 97.55, 97.00, 98.08, 98.21 | 94.70, 96.98, 95.58, 94.66, 95.94, 96.80, 97.03
0.2 | RF | 93.76, 93.91, 93.49, 93.80, 93.58, 94.08, 94.05 | 95.92, 96.23, 95.19, 95.74, 95.32, 96.12, 96.34 | 93.06, 93.21, 92.76, 93.10, 92.86, 93.41, 93.37
0.2 | ELM | 97.07, 97.78, 97.14, 96.86, 97.29, 97.69, 98.03 | 98.27, 98.64, 98.06, 98.25, 98.05, 98.59, 98.76 | 96.74, 97.53, 96.81, 96.50, 96.99, 97.43, 97.80
0.3 | KNN | 91.25, 92.81, 90.81, 91.27, 90.80, 92.73, 92.89 | 94.32, 95.99, 92.60, 93.43, 92.58, 95.46, 96.05 | 90.26, 92.00, 89.78, 90.28, 89.77, 91.91, 92.08
0.3 | SVM | 94.65, 97.22, 95.52, 94.70, 95.69, 97.08, 97.24 | 96.37, 98.13, 97.22, 97.09, 97.07, 98.01, 98.14 | 94.03, 96.90, 95.01, 94.09, 95.20, 96.75, 96.93
0.3 | RF | 93.26, 93.90, 93.14, 93.35, 93.34, 94.04, 94.03 | 95.33, 96.23, 94.59, 95.19, 94.88, 96.07, 96.32 | 92.50, 93.20, 92.37, 92.60, 92.59, 93.37, 93.35
0.3 | ELM | 96.57, 97.73, 96.61, 96.48, 96.64, 97.69, 98.01 | 98.00, 98.59, 97.80, 98.09, 97.25, 98.56, 98.72 | 96.18, 97.47, 96.22, 96.08, 96.26, 97.43, 97.77
0.4 | KNN | 90.96, 92.68, 90.73, 91.05, 90.64, 92.71, 92.81 | 94.12, 95.86, 92.71, 93.15, 92.53, 95.45, 95.97 | 89.94, 91.85, 89.69, 90.04, 89.58, 91.89, 91.99
0.4 | SVM | 94.26, 97.05, 95.13, 94.38, 95.29, 97.07, 97.19 | 95.76, 97.95, 97.00, 96.90, 96.91, 98.03, 98.05 | 93.60, 96.71, 94.57, 93.73, 94.75, 96.73, 96.87
0.4 | RF | 92.61, 93.81, 92.57, 92.93, 92.81, 94.04, 94.03 | 94.52, 96.15, 93.82, 94.21, 93.96, 96.02, 96.31 | 91.78, 93.11, 91.74, 92.13, 92.00, 93.36, 93.35
0.4 | ELM | 96.15, 97.64, 96.05, 96.32, 96.16, 97.68, 97.96 | 97.64, 98.53, 97.61, 97.77, 97.59, 98.55, 98.69 | 95.71, 97.37, 95.60, 95.90, 95.72, 97.41, 97.73
0.5 | KNN | 90.33, 92.56, 90.32, 90.30, 90.29, 92.70, 92.73 | 93.63, 95.75, 92.00, 91.96, 91.89, 95.39, 95.88 | 89.25, 91.72, 89.24, 89.21, 89.20, 91.87, 91.90
0.5 | SVM | 93.59, 96.94, 94.43, 93.66, 95.03, 97.05, 97.11 | 95.62, 97.85, 96.69, 96.61, 96.76, 97.91, 97.97 | 92.84, 96.59, 93.79, 92.93, 94.46, 96.72, 96.78
0.5 | RF | 91.65, 93.83, 91.66, 91.68, 92.03, 93.99, 93.98 | 93.29, 96.14, 92.47, 92.73, 92.79, 95.96, 96.26 | 90.71, 93.13, 90.72, 90.74, 91.13, 93.31, 93.30
0.5 | ELM | 95.76, 97.59, 95.79, 95.79, 95.87, 97.67, 97.92 | 97.42, 98.48, 97.36, 97.29, 97.20, 98.52, 98.63 | 95.28, 97.32, 95.31, 95.31, 95.41, 97.41, 97.68
0.6 | KNN | 88.83, 92.31, 89.00, 88.91, 89.21, 92.61, 92.64 | 92.55, 95.56, 90.23, 90.03, 90.31, 95.21, 95.76 | 87.59, 91.44, 87.77, 87.68, 88.01, 91.78, 91.80
0.6 | SVM | 92.47, 96.67, 92.45, 92.47, 93.35, 96.96, 96.98 | 94.92, 97.56, 96.00, 96.05, 95.49, 97.75, 97.78 | 91.59, 96.29, 91.57, 91.59, 92.58, 96.61, 96.64
0.6 | RF | 90.16, 93.69, 90.24, 90.21, 90.59, 93.94, 93.91 | 91.57, 96.07, 90.44, 90.43, 90.55, 95.88, 96.16 | 89.04, 92.98, 89.14, 89.10, 89.54, 93.25, 93.22
0.6 | ELM | 95.15, 97.33, 95.13, 94.96, 95.15, 97.62, 97.84 | 97.02, 98.38, 97.01, 96.92, 96.35, 98.40, 98.54 | 94.60, 97.03, 94.58, 94.38, 94.60, 97.34, 97.59
0.7 | KNN | 86.32, 92.07, 86.32, 86.25, 86.89, 92.61, 92.72 | 90.03, 95.28, 87.70, 87.74, 87.26, 95.24, 95.84 | 84.84, 91.17, 84.84, 84.76, 85.45, 91.78, 91.90
0.7 | SVM | 91.50, 96.33, 91.57, 91.70, 92.19, 96.97, 97.14 | 92.78, 97.28, 95.74, 95.81, 95.82, 97.81, 98.03 | 90.49, 95.91, 90.49, 90.72, 91.27, 96.62, 96.81
0.7 | RF | 87.81, 93.55, 87.87, 87.70, 88.15, 93.94, 93.91 | 88.42, 95.92, 87.15, 86.94, 87.3, 95.86, 96.23 | 86.45, 92.82, 86.50, 86.32, 86.82, 93.25, 93.22
0.7 | ELM | 94.70, 97.17, 94.70, 94.53, 94.56, 97.62, 97.8 | 96.55, 98.13, 96.63, 96.63, 96.60, 98.40, 98.54 | 94.10, 96.85, 94.10, 93.90, 93.93, 97.35, 97.55
0.8 | KNN | 85.53, 91.53, 85.53, 85.52, 85.49, 92.57, 92.70 | 86.99, 94.81, 87.75, 87.72, 87.68, 95.23, 95.87 | 83.86, 90.57, 83.86, 83.85, 83.81, 91.73, 91.87
0.8 | SVM | 89.35, 95.85, 89.35, 89.36, 89.75, 96.95, 97.07 | 88.04, 96.97, 94.13, 94.13, 94.22, 97.78, 98.03 | 88.06, 95.37, 88.06, 88.08, 88.52, 96.60, 96.74
0.8 | RF | 83.75, 93.28, 83.93, 83.92, 84.21, 93.96, 93.91 | 83.70, 95.65, 81.31, 81.33, 81.70, 95.93, 96.2 | 81.95, 92.52, 82.15, 82.14, 82.46, 93.27, 93.22
0.8 | ELM | 94.15, 96.83, 94.15, 94.16, 94.20, 97.58, 97.77 | 96.18, 97.90, 96.11, 96.12, 96.02, 98.40, 98.54 | 93.48, 96.47, 93.48, 93.49, 93.54, 97.30, 97.52
Table 10. Time (s) of seven processing methods with the SVM classifier on the Salinas dataset. All results are averaged over ten runs. The best results are in bold.

NL | Base | RLPA | DPNLD | KECA | SPWD | MSSG | ASLPA
0.1 | 69.09 | 42.35 | 64.90 | 47.30 | 286.69 | 2117.74 | 41.70
0.2 | 95.59 | 47.26 | 78.42 | 64.35 | 297.82 | 1049.64 | 41.56
0.3 | 114.92 | 57.63 | 87.04 | 77.15 | 276.35 | 1177.68 | 41.39
0.4 | 138.18 | 82.26 | 110.80 | 92.26 | 292.51 | 1266.68 | 42.05
0.5 | 156.13 | 122.69 | 132.11 | 104.52 | 271.30 | 1372.84 | 42.52
0.6 | 176.42 | 173.26 | 153.21 | 116.01 | 297.45 | 1465.28 | 42.08
0.7 | 191.71 | 194.72 | 170.25 | 126.40 | 311.43 | 1543.04 | 41.81
0.8 | 206.80 | 223.41 | 179.45 | 135.89 | 336.77 | 1615.72 | 41.83
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
