2.1.1. RGF

Although the raw pixel spectral vectors could be used directly for training and classification, they do not perform well. Moreover, since we need sub-feature sets derived from spectral features, we must extend each pixel's spectrum to a group of features. Motivated by the effectiveness of RGF and its improvements in HSI classification [37], we use RGF to obtain the sub-feature sets from spectral information.

Let $\mathbf{Q}^p$ denote the filtering result for the *p*th band of a hyperspectral image. We conduct guided filtering [54] by

$$\mathbf{Q}_i^p = a_k^p \mathbf{G}_i + b_k^p, \quad \forall i \in \omega_k, \tag{1}$$

where **G** is the guidance image, $\omega_k$ is a local window centered at pixel $k$, $i$ is a pixel in $\omega_k$, and $a_k^p$ and $b_k^p$ are the coefficients to be estimated. Usually, **G** is the first principal component of the HSI data. Note that **G** serves only as the guidance image and does not reduce the dimensionality of the filtered results. The coefficients are obtained by minimizing the following energy function:

$$E(a_k^p, b_k^p) = \sum_{i \in \omega_k} \left( (a_k^p \mathbf{G}_i + b_k^p - \mathbf{I}_i^p)^2 + \epsilon (a_k^p)^2 \right), \tag{2}$$

where **I** is the input HSI data and $\epsilon$ is a regularization hyper-parameter. Equation (2) can be solved directly by linear ridge regression [55]:

$$\begin{aligned} a_k^p &= \frac{\frac{1}{|\omega|} \sum_{i \in \omega_k} \mathbf{I}_i^p \mathbf{G}_i - \mu_k \bar{\mathbf{I}}_k^p}{\sigma_k^2 + \epsilon}, \\ b_k^p &= \bar{\mathbf{I}}_k^p - a_k^p \mu_k, \end{aligned} \tag{3}$$

where $\mu_k$ and $\sigma_k^2$ denote the mean and variance of **G** in $\omega_k$, $\bar{\mathbf{I}}_k^p$ is the mean of $\mathbf{I}^p$ in $\omega_k$, and $|\omega|$ is the number of pixels in $\omega_k$.
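The closed-form solution can be illustrated with a minimal NumPy sketch for a single band. It is an illustration under stated assumptions, not the implementation used in this paper: the names `box_mean` and `guided_filter_band`, the square window of radius `r`, the edge-replicating border handling, and the default values of `r` and `eps` are all hypothetical. The last step, which averages the coefficients over all windows covering a pixel, follows the standard guided filter formulation and is likewise an assumption, since the text above does not spell it out.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view


def box_mean(x, r):
    """Mean of x over a (2r+1) x (2r+1) window around every pixel.

    Borders are handled by replicating edge values (an assumption).
    """
    k = 2 * r + 1
    padded = np.pad(x, r, mode="edge")
    # windows has shape (H, W, k, k); average over the two window axes.
    windows = sliding_window_view(padded, (k, k))
    return windows.mean(axis=(-2, -1))


def guided_filter_band(I, G, r=2, eps=1e-3):
    """Filter one band I (H x W) with guidance image G (H x W)."""
    I = np.asarray(I, dtype=float)
    G = np.asarray(G, dtype=float)

    mu = box_mean(G, r)                   # mean of G in each window
    I_bar = box_mean(I, r)                # mean of I in each window
    corr = box_mean(I * G, r)             # (1/|w|) * sum of I_i * G_i
    var = box_mean(G * G, r) - mu * mu    # variance of G in each window

    a = (corr - mu * I_bar) / (var + eps)  # Equation (3), first line
    b = I_bar - a * mu                     # Equation (3), second line

    # Assumption: each pixel lies in several windows, so the coefficients
    # are averaged over those windows before applying Equation (1).
    return box_mean(a, r) * G + box_mean(b, r)
```

With a constant input band the covariance term vanishes, so `a` is zero and the output reproduces the constant. A larger `eps` pushes `a` toward zero everywhere and therefore yields stronger smoothing.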

Equation (1) is the linear model assumed by guided filtering, in which $a_k^p$ and $b_k^p$ are the coefficients to be optimized. Equation (2) is the objective function, and Equation (3) is its closed-form solution. The rolling operation replaces **I** with **Q** and repeats the filtering of Equations (1)-(3). Each rolling iteration yields a new HSI data cube. Therefore, using RGF we are able to generate a series of features based on the original spectral vectors. Because RGF mainly reflects the spectral characteristics of HSI data, these features can be considered as spectral sub-feature sets.
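The rolling procedure can be sketched as follows. This is a hedged illustration rather than the authors' code: `first_principal_component`, `rolling_features`, `band_filter`, and `n_rolls` are hypothetical names, the first principal component is computed with a plain SVD, and the guidance image is assumed to stay fixed across rolling iterations. `band_filter(I, G)` stands for any callable that filters one band according to Equations (1)-(3).

```python
import numpy as np


def first_principal_component(cube):
    """Project an (H, W, B) cube onto its first principal component.

    Returns an (H, W) image intended for use as the guidance image G.
    """
    h, w, bands = cube.shape
    X = cube.reshape(-1, bands).astype(float)
    X = X - X.mean(axis=0)  # center each band
    # Rows of vt are the principal axes of the centered data.
    _, _, vt = np.linalg.svd(X, full_matrices=False)
    return (X @ vt[0]).reshape(h, w)


def rolling_features(cube, band_filter, n_rolls=3):
    """Apply band_filter repeatedly, feeding each output back as input.

    Returns a list of n_rolls cubes, each with the same (H, W, B) shape
    as the input, so the spectral dimensionality is preserved.
    """
    G = first_principal_component(cube)  # assumption: G is kept fixed
    current = cube.astype(float)
    outputs = []
    for _ in range(n_rolls):
        # Filter every band separately, then replace I by Q.
        current = np.stack(
            [band_filter(current[..., p], G) for p in range(current.shape[-1])],
            axis=-1,
        )
        outputs.append(current)
    return outputs
```

Each element of the returned list is one filtered version of the data cube. How these versions are grouped into spectral sub-feature sets is left to the caller, since the passage above does not fix that detail.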
