## 2.1.2. LBP

In HSI data, the spatial contextual information can be described by the local texture around each pixel. LBP is a popular texture operator that has been investigated in [46]. The LBP map for **I**<sup>*p*</sup> can be obtained by

$$\mathbf{L}^p(i) = \sum_{k=1}^{|\omega_k|-1} \mathcal{U}\left(\mathbf{I}_k^p - \mathbf{I}_i^p\right) 2^{k-1}, \tag{4}$$

where |*ω<sub>k</sub>*| is the number of pixels in the window *ω<sub>k</sub>*, and 𝒰(·) is the Heaviside step function, equal to 1 for positive arguments and 0 otherwise. From the LBP map, a vector is obtained for each *ω<sub>k</sub>* by computing its histogram; this vector is the new feature representation for pixel *i*. In this paper, uniform LBP is used. With 8 neighbors, uniform LBP yields 59 histogram bins (58 uniform patterns plus one bin shared by all non-uniform patterns), i.e., 59 sub-feature sets are available based on LBP.
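The LBP coding and the 59-bin uniform histogram can be sketched as follows. This is a minimal illustration rather than the implementation used in this paper: it assumes a 3 × 3 window (8 neighbors), assumes that the *k*-th neighbor in circular order contributes the binary weight 2<sup>*k*−1</sup>, and uses hypothetical function names (`lbp_codes`, `uniform_lookup`, `lbp_histogram`) and a synthetic patch.

```python
import numpy as np

def lbp_codes(band):
    """8-neighbour LBP code for every interior pixel of a 2-D band."""
    band = np.asarray(band, dtype=float)
    h, w = band.shape
    center = band[1:-1, 1:-1]
    # Neighbour offsets (dy, dx) in circular order around the centre pixel.
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
               (1, 1), (1, 0), (1, -1), (0, -1)]
    codes = np.zeros(center.shape, dtype=np.int64)
    for k, (dy, dx) in enumerate(offsets):
        neighbour = band[1 + dy:h - 1 + dy, 1 + dx:w - 1 + dx]
        # Heaviside step: 1 if neighbour - centre is positive, else 0.
        codes += (neighbour > center).astype(np.int64) << k
    return codes

def uniform_lookup(n_bits=8):
    """Map each LBP code to a histogram bin under the uniform-pattern rule."""
    uniform = []
    for code in range(2 ** n_bits):
        bits = [(code >> k) & 1 for k in range(n_bits)]
        # Count 0/1 transitions when the bits are read as a circle.
        transitions = sum(bits[k] != bits[(k + 1) % n_bits]
                          for k in range(n_bits))
        if transitions <= 2:
            uniform.append(code)
    # All non-uniform codes share one extra bin, placed last.
    table = np.full(2 ** n_bits, len(uniform), dtype=np.int64)
    for idx, code in enumerate(uniform):
        table[code] = idx
    return table, len(uniform) + 1

def lbp_histogram(codes, table, n_bins):
    """Histogram of uniform-LBP bins over a patch of codes."""
    return np.bincount(table[codes].ravel(), minlength=n_bins)

# Synthetic 9 x 9 patch standing in for a window of one band.
patch = np.random.default_rng(0).random((9, 9))
table, n_bins = uniform_lookup()
hist = lbp_histogram(lbp_codes(patch), table, n_bins)
print(n_bins, hist.shape)
```

Each entry of `hist` counts how often one pattern class occurs in the patch, so the histogram length equals the number of bins stated above.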

## 2.1.3. Gabor Filters

Besides local texture features, recent literature has reported that global spatial features of HSI data, e.g., those extracted by Gabor filters [47,56,57], also contribute to the classification accuracy. Suppose (*x*, *y*) is a pixel coordinate in **I**<sup>*p*</sup>; then the output of a Gabor filter can be expressed as

$$\mathbf{GB}(x, y) = \exp\left(-\frac{x'^2 + \gamma^2 y'^2}{2\sigma^2}\right) \exp\left(\mathrm{j}\left(2\pi \frac{x'}{\delta} + \psi\right)\right),\tag{5}$$

where

$$\begin{aligned} x' &= x\cos\theta + y\sin\theta,\\ y' &= -x\sin\theta + y\cos\theta. \end{aligned} \tag{6}$$

*γ*, *ψ* and *σ* are hyper-parameters of the Gabor filter, *δ* is the wavelength of the sinusoidal function, and *θ* is the orientation of the filter. By selecting different values of *δ* and *θ*, the original HSI data can be transformed into many sub-feature sets.
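A small filter bank over several wavelengths and orientations can be sketched as below. This is an illustration under stated assumptions, not the configuration used in this paper: the kernel size, the parameter values, the use of the response magnitude, and the names `gabor_kernel` and `filter_band` are all hypothetical, and the coordinates are rotated by *θ* in the standard way before evaluating the Gaussian envelope and the complex sinusoid.

```python
import numpy as np

def gabor_kernel(size, delta, theta, sigma, gamma=0.5, psi=0.0):
    """Complex Gabor kernel sampled on a (size x size) grid, size odd."""
    half = size // 2
    y, x = np.mgrid[-half:half + 1, -half:half + 1].astype(float)
    # Rotate the coordinates by the filter orientation theta.
    x_r = x * np.cos(theta) + y * np.sin(theta)
    y_r = -x * np.sin(theta) + y * np.cos(theta)
    envelope = np.exp(-(x_r ** 2 + gamma ** 2 * y_r ** 2) / (2 * sigma ** 2))
    carrier = np.exp(1j * (2 * np.pi * x_r / delta + psi))
    return envelope * carrier

def filter_band(band, kernel):
    """Magnitude of the linear convolution, cropped to the band's size."""
    h, w = band.shape
    kh, kw = kernel.shape
    shape = (h + kh - 1, w + kw - 1)
    # Zero-padded FFTs give linear (not circular) convolution.
    full = np.fft.ifft2(np.fft.fft2(band, shape) * np.fft.fft2(kernel, shape))
    top, left = kh // 2, kw // 2
    return np.abs(full[top:top + h, left:left + w])

# Assumed parameter grid: 2 wavelengths x 4 orientations = 8 filters.
deltas = [4.0, 8.0]
thetas = [0.0, np.pi / 4, np.pi / 2, 3 * np.pi / 4]
bank = [gabor_kernel(size=9, delta=d, theta=t, sigma=3.0)
        for d in deltas for t in thetas]

# Synthetic 16 x 16 band standing in for one band of the HSI data.
band = np.random.default_rng(0).random((16, 16))
responses = [filter_band(band, k) for k in bank]
print(len(responses), responses[0].shape)
```

Each (*δ*, *θ*) pair produces one response map with the same spatial size as the input band, which is how a single band expands into several sub-feature sets.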

Based on the RGF, LBP and Gabor filters, we can construct a new feature set containing many subsets. Note that the dimensionality of the features in the obtained set is the same as that of the original HSI data. Traditional feature-fusion methods usually stack these features directly or use a weighted voting strategy. In this paper, we instead extract hierarchical features from HSI data, and this feature set is used as the input of the next hierarchy.

#### *2.2. Hashing Based Hierarchical Feature Representation*

The major motivation of the proposed hashing based method is to extract very sparse features by increasing the feature dimensionality. We first divide the obtained feature set into several subsets, each containing the same number of features. Suppose *N* is the number of features in a single subset, and *L* is the dimensionality of the features. The hierarchical feature representation for this subset is illustrated in Figure 2. In general, the hashing based hierarchical feature representation method for a subset includes three main steps.

**Figure 2.** An illustration of the hashing based hierarchical feature representation. This figure presents the process for only one pixel and a single sub-feature set.
