## **1. Introduction**

Hyperspectral images have been widely used for many applications, such as classification [1], spectral unmixing [2], target detection [3], environmental monitoring [4] and anomaly detection [5]. Among these applications, classification is one of the most crucial branches. A hyperspectral image contains more than 100 spectral bands, which provide detailed information for discriminating the objects in the scene [6]. However, the high dimensionality of hyperspectral images requires a more complicated model, and such a model in turn requires more training samples to support it. Thus, the imbalance between the number of training samples and the high dimensionality may cause the well-known "Hughes" phenomenon [7], which restricts performance improvement in hyperspectral image classification.

In hyperspectral image classification, traditional spectral-based classification methods are widely used, such as support vector machines (SVM) [8], the back-propagation neural network (BP) [9], random forest (RF) [10] and the 1D deep convolutional neural network (1D CNN) [11]. However, all of these methods are sensitive to the quality and number of training samples; thus, the classification performance is limited when only a small number of training samples is provided. In order to further improve the classification performance, rich spatial-contextual information is used in pixel-wise classification methods [12–15]. For instance, Pan et al. [16] introduced hierarchical guidance filtering to extract different spatial contextual information at different filter scales in hyperspectral images. In [17], a new network that utilizes the spectral and spatial information simultaneously was proposed to achieve more accurate classification results.

During the past few decades, semi-supervised learning methods have shown excellent performance in hyperspectral image classification [18,19]. One goal of semi-supervised learning is to select the most useful unlabeled samples and to determine the labels of these newly selected samples. Generally, semi-supervised learning methods can be divided into generative models [20], co-training models [21], graph-based methods [18–22], etc. All of those methods are based on the assumption that similar samples have the same labels. Hence, graph-based semi-supervised methods have attracted increasing attention in hyperspectral image classification [18,22]. For example, in [23], Wang et al. proposed a novel graph-based semi-supervised learning approach based on a linear neighborhood model, which propagates the labels from the labeled samples to the whole dataset using these linear neighborhoods with sufficient smoothness. In [18], the wealth of unlabeled samples is exploited through a graph-based methodology to handle the special characteristics of hyperspectral images. The label propagation algorithm (LP) [24,25] is a widely-used method in graph-based semi-supervised learning [26,27]; for example, in [28], the information in unlabeled data is effectively exploited by combining the Gaussian random field model and the harmonic function. Wang et al. [24] proposed an approach based on spatial-spectral label propagation for semi-supervised classification, in which labels are propagated from labeled samples to unlabeled samples over a spatial-spectral graph to update the training set. However, the aforementioned graph-based semi-supervised classification methods face three main difficulties: (1) how to generate a large number of high-quality pseudo-labeled samples; (2) how to expand the propagation scope of the samples as much as possible; (3) how to correct the labels that are wrongly propagated to other classes.
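As an illustration of the graph-based label propagation idea discussed above, the following is a minimal sketch and not the implementation of any cited method. It assumes a dense Gaussian similarity graph over feature vectors and an iterative update in which every unlabeled sample takes the weighted average of its neighbors' class scores while labeled samples stay clamped, which is the fixed point satisfied by a harmonic solution. The function names, the bandwidth `sigma`, the iteration count and the toy data are all hypothetical.

```python
import math


def gaussian_weights(features, sigma=1.0):
    """Dense similarity w_ij = exp(-||x_i - x_j||^2 / (2 * sigma^2)), zero diagonal."""
    n = len(features)
    w = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i != j:
                d2 = sum((a - b) ** 2 for a, b in zip(features[i], features[j]))
                w[i][j] = math.exp(-d2 / (2.0 * sigma ** 2))
    return w


def propagate_labels(features, labels, n_classes, sigma=1.0, n_iter=200):
    """labels[i] is a class index, or None for an unlabeled sample (assumed encoding)."""
    n = len(features)
    w = gaussian_weights(features, sigma)
    # Class scores: one-hot for labeled samples, uniform for unlabeled ones.
    f = []
    for i in range(n):
        if labels[i] is None:
            f.append([1.0 / n_classes] * n_classes)
        else:
            f.append([1.0 if labels[i] == c else 0.0 for c in range(n_classes)])
    for _ in range(n_iter):
        new_f = [row[:] for row in f]
        for i in range(n):
            if labels[i] is not None:
                continue  # labeled samples are clamped to their known class
            total = sum(w[i])
            if total == 0.0:
                continue  # isolated sample: keep its previous scores
            new_f[i] = [
                sum(w[i][j] * f[j][c] for j in range(n)) / total
                for c in range(n_classes)
            ]
        f = new_f
    predictions = [max(range(n_classes), key=lambda c: f[i][c]) for i in range(n)]
    confidences = [max(f[i]) for i in range(n)]
    return predictions, confidences


# Toy example: two well-separated 1D clusters with one labeled sample each.
features = [[0.0], [0.2], [0.4], [3.0], [3.2], [3.4]]
labels = [0, None, None, 1, None, None]
predictions, confidences = propagate_labels(features, labels, n_classes=2, sigma=0.5)
print(predictions)
```

In this toy setting, each unlabeled sample ends up with the label of the labeled sample in its own cluster, and the maximum class score serves as a simple confidence value.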

Recently, the superpixel technique [29] has been an effective way to introduce spatial information into hyperspectral image classification [16,30,31]. Each superpixel is a homogeneous region whose size and shape are adaptive. Commonly-used superpixel segmentation methods include the SLIC method [32], the normalized cut method [33], the regional growth method [34], etc. Moreover, superpixel-based classification methods [35,36] have shown good robustness in hyperspectral image classification. Motivated by the idea of the superpixel, we design a novel superpixel-based label propagation framework, extended label propagation (ELP), which uses a two-step propagation process to substantially increase the number of pseudo-labeled samples. In ELP, a spatial-spectral weighted graph is first constructed from the labeled samples and the unlabeled samples in their spatial neighborhoods, and it is used to propagate the class labels to the unlabeled samples. Second, the multi-scale segmentation algorithm [37] is used to generate superpixels, and superpixel propagation is then introduced to assign the same label to all pixels within a superpixel. Finally, a threshold is defined; pseudo-labeled samples whose confidence is higher than this threshold are selected to enrich the training sample set. Note that the second step of the ELP method, i.e., extended label propagation with superpixel segmentation, is the main innovation of the proposed method, because it can generate a large number of high-confidence pseudo-labeled samples.
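The superpixel propagation step and the confidence threshold can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the label of a superpixel is taken as the majority vote of the labeled pixels it contains, and the confidence is assumed to be the fraction of votes won by the majority class (the confidence measure is not specified in this section). The function name `superpixel_propagation`, the flat pixel indexing and the default threshold are hypothetical.

```python
from collections import Counter, defaultdict


def superpixel_propagation(segments, labels, threshold=0.6):
    """Assign each superpixel's majority label to all of its pixels.

    segments[i] is the superpixel id of pixel i; labels[i] is a class index
    or None if pixel i has no label yet. Returns {pixel index: pseudo-label}
    for pixels in superpixels whose vote fraction exceeds `threshold`.
    """
    # Count the class votes of the labeled pixels inside each superpixel.
    votes = defaultdict(Counter)
    for seg, lab in zip(segments, labels):
        if lab is not None:
            votes[seg][lab] += 1

    pseudo_labels = {}
    for i, seg in enumerate(segments):
        if seg not in votes:
            continue  # superpixel contains no labeled pixel: nothing to propagate
        winner, count = votes[seg].most_common(1)[0]
        confidence = count / sum(votes[seg].values())  # assumed confidence measure
        if confidence > threshold:
            pseudo_labels[i] = winner
    return pseudo_labels


# Toy example: 9 pixels in 3 superpixels; pixel 2 carries a label that
# disagrees with the majority of superpixel 0.
segments = [0, 0, 0, 0, 1, 1, 1, 2, 2]
labels = [0, 0, 1, None, 1, None, None, None, None]
pseudo = superpixel_propagation(segments, labels, threshold=0.6)
print(pseudo)
```

In the toy example, the minority label on pixel 2 is overwritten by the majority label of its superpixel, the unlabeled pixels in superpixels 0 and 1 receive pseudo-labels, and superpixel 2 is left untouched because it contains no labeled pixel.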

This work has three motivations. First, we would like to increase the number of high-confidence pseudo-labeled samples through a two-step propagation process. Second, rolling guidance filtering is used to optimize the features of the initial hyperspectral image. In the optimized image, noise and small textures are removed, while the strong structure of the image is preserved, which enhances the discrimination within and between classes. Third, we want to correct the labels that are wrongly propagated by the label propagation algorithm. The proposed ELP-RGF can effectively improve the classification performance with fewer training samples. The contributions of the proposed method consist of:

(1) We propose a novel extended label propagation component that is based on the label propagation algorithm. The second step of ELP, superpixel propagation, is the most innovative part of the proposed method, because it not only expands the scope of the label propagation, but also generates a large number of high-confidence pseudo-labeled samples. Therefore, it performs well in hyperspectral image classification.

(2) In the superpixel propagation step, the labels of the pixels within a superpixel are obtained by a majority vote among the labeled samples belonging to that superpixel. Therefore, some pseudo-labeled samples that received wrong labels in the first step of the ELP method can be corrected. Furthermore, we show that the results of ELP-RGF are much more stable than those of the methods in [38] and [24].

(3) Optimizing the image features with the rolling guidance filter (RGF) [39] can eliminate the noise of the initial image. The filtered image is used as the input to the SVM method to help improve the final classification result.

The remainder of this paper is organized as follows. The related work is described in Section 2. The proposed method is introduced in Section 3. The discussion is provided in Section 5. Finally, conclusions are given in Section 6.

## **2. Related Work**
