**2. Theoretical Background**

The effect of the SV of endmembers in the SU and the approaches developed to deal with it have been taken into consideration in [4,10,11]. Furthermore, the problems caused by the spectral correlation among the endmembers and its adverse effects on the reliability of the results of the SU have been investigated in [8]. Given the centrality of the optimal band selection to enhancing the stability of the elected set against the spectral variation and also to decrease the spectral correlation among the endmembers in the proposed method, the algorithms that have dealt with the problem by the band selection approach are reviewed in this section.

#### *2.1. Feature Selection Algorithms to Decrease the Spectral Variability Effect*

The precise selection of bands that are stable against the SV (e.g., those bands that minimize the intra-class variance and maximize the inter-class dispersion) plays a significant role in the accuracy improvement of the estimation of fractional abundances. Previous studies on the variability of the optical properties of leaf, litter, and soil in semi-arid and arid areas had illustrated that the SWIR2 region from 2050–2500 nm is the least dependent on variations in structural and biochemical attributes [12]. Therefore, this region was selected as a stable spectral region for the aforementioned materials to be used in the SMA for developing the AutoSWIR algorithm in [12]. However, since the position and the spectral region of bands, as well as the number of stable spectral regions depend on the spatial, spectral, and temporal complexity as well as the mixture of endmembers that are present in the scene [4], this algorithm was not extendable for different ecosystems.

In [13], a more applicable spectral feature selection algorithm entitled the stable zone unmixing (SZU) was introduced. In this algorithm, the sensitive wavelengths to the SV are evaluated using the instability index (ISI). Then, a protocol was introduced to enhance the spectral subset selection by accounting for a tradeoff between the number of wavelengths used in the analysis (i.e., information) and the ISI (i.e., spectral variability).

The redundancy problem of the hyperspectral images and high correlation between their bands was not taken into account in either of the methods (AutoSWIR and SZU). A greater potential in computation efficiency and fraction estimate accuracy could only be provided if the independent bands were employed in the LMM [4].

#### *2.2. Feature Selection Algorithms to Decrease the Spectral Correlation of Endmembers*

The most common way to deal with the correlation of endmembers is to eliminate the collinear endmembers [8], which causes two disadvantages: (1) If two endmembers are highly correlated, which one should be excluded? (2) When an endmember is eliminated, it may contain some useful information that could result in destabilizing the LMM. Another solution is to combine the two correlated classes to form a new endmember. However, this will inevitably result in missed classes or similar problems to exclude the endmembers. There are a number of transformations (e.g., principal component analysis (PCA) [14] and the maximum noise fraction (MNF) [15]) which can be employed to reduce the correlation of endmembers by de-correlating the band-to-band correlation. The major drawback of these methods is encountering endmembers whose spectral response has no physical meaning [8].

Another obvious solution that comes to mind is to use a number of subsets of the spectral region. Regarding the high spectral resolution of the hyperspectral images, the spectral region could be decomposed into the visible, near-infrared, and shortwave infrared sections. Then, only those regions that contain the absorption features of the objects of interest could be employed in the appropriate applications. Of course, different regions of the electromagnetic wave spectrum have been theoretically considered by researchers in order to examine different phenomena (i.e., using the shortwave infrared region for analyzing minerals). However, upon decreasing the spectral region, the correlation among endmembers increases [8].

Recently, several endmember-extraction-based methods were applied to feature selection methods, which elect the distinctive spectral signatures. Some of these methods (e.g., geometrical feature selection (G-FS) [16] and linear prediction (LP) [17]) operate in the pixel space. The dimensionality of such a space is equal to the number of participating image pixels. Some others, such as prototype feature selection (PFS) and maximum tangent discrimination (MTD) [18] extract informative bands in an unsupervised manner via geometrical interpretation of distinctive bands in the Prototype Space (PS) [9]. In this way, the axes of the PS are defined based on the spectrum of the extracted endmembers or the clusters' center of the existing classes in the scene. Therefore, the dimensionality of such a space is equal to the number of constructive components of the scene, and each band is represented as a point or vector in this space. Using these methods, a subset of independent bands that are distributed throughout the spectral region of the hyperspectral sensor could be provided by eliminating the dependent bands in a meaningful manner.

#### 2.2.1. Quantitative Evaluation of the Endmembers' Correlation

In order to quantitatively estimate the correlation of endmembers constructing the coefficient matrix (M), two measures can be employed [8]: (1) deriving a measure from the coefficient matrix based on its singular values, and (2) extracting a measure from the correlation of the endmembers' spectra. The SVD of the endmembers matrix (M) to extract singular values (i.e., the square root of eigenvalues) is M = U ∑ VT, where U and V are both square, unitary, and orthonormal matrices, and ∑ is a diagonal matrix with the singular values of M. The ratio of the largest singular value to the smallest one is called the condition number of the matrix. The ideal value of one for the condition number indicates that the matrix is fully orthogonal. By increasing the condition number of the endmember matrix, the correlation of endmembers increases, and in an extreme case, will cause the singularity of the matrix. If the correlation of endmembers exceeds 0.6, the condition number increases exponentially [8]. In this way, the condition number is a good measure to evaluate the endmembers' correlation.

The correlation matrix of the endmembers partially shows the collinearity of each pair of endmembers. The average value of the upper or the lower triangle elements of the matrix provides a measure that indicates the average correlation of the endmembers constructing the coefficient matrix, and could be used as a measure of the overall correlation of endmembers.

#### **3. The Proposed ISI-PS Method**

This section explains the proposed ISI-PS method, which is an incorporation of the methods to decrease the SV effect and to select the independent bands of the hyperspectral images. These two factors directly play a positive role in enhancing the accuracy and the reliability of the fractional abundances computed. The proposed method has been developed by assuming the existence of some pure pixels in the image for each class.

In order to explain the relation of the different parts of the proposed method, its flowchart (Figure 1) and pseudo-code are presented, and detailed information is provided in the following sections. By assuming the existence of pure pixels in the hyperspectral images, a set of pure spectra is firstly provided for each class, which indicates the class spectral variability. These sets are employed to create the endmember of each class and statistical analysis in each band. In this way, the extraction of the information of the endmembers in the proposed method has been compiled based on the geometrical endmember extraction algorithms and by assuming the existence of pure pixels in the image. By means of intra-class and inter-class analysis using measures (e.g., ISI) that consider the SV, the persistent bands against the variability are prioritized. Thereafter, the PS is established based on the spectra of the pure representative of each class, and then bands are represented in this space. Finally, the correlation among the prioritized bands is measured by computing their angles in the PS, and then those that have a similar behavior are eliminated. Thus, the remaining bands are independent and persistent against the variability, which are employed to do the SU and estimate the fractional abundance of each endmember.

**Figure 1.** Flowchart of the proposed method.

Pseudocode of the proposed method:


Ω = {*Bi*}*<sup>L</sup> i*=1 where *B*1 ≥ *B*2 ≥··· *BL* ,

where Ω is the set of prioritized persistent bands (*B*) and *i*th is the prioritizing index of bands.


#### *3.1. Establishing a Set of Spectral Variabilities for Each Endmember*

The LMM is widely used to model the spectral composition of a spectrum. However, many reasons lead to the SV of endmembers, such as the change of environmental illumination, as well as atmospheric and temporal conditions [10]. The methods for dealing with the endmembers' variability could be categorized into two classes [10]: (1) endmembers as sets, and (2) endmembers as statistical distributions. Usually, methods of the first category require a spectral library of the SV of endmembers to deal with the variability phenomena. Collecting a spectral library of the endmembers' variabilities is an expensive and time-consuming process. Therefore, automatic extraction of endmembers' sets from an image is greatly beneficial. In this regard, automated endmember bundles (AEB) [19] has established endmembers sets by executing standard endmember extraction algorithms such as N-FINDR [20], orthogonal subspace projection (OSP) [21], unsupervised fully constrained least squares (UFCLS) [22], iterative error analysis (IEA) [23] and vertex component analysis (VCA) [24] and clustering the resulted endmembers from different methods. Recently, the authors in [25] showed that VCA is essentially the same as simplex growing algorithm (SGA) [26] as long as their initial conditions are the same. So, other conventional endmember extraction algorithms such as SGA can be used in AEB.

Spectral features of those spectra that are located in each endmember set indicate the representative of that endmember. Obviously, it is due to the SV that these features are not exactly similar. Therefore, these representatives should have the general condition of an endmember, including: (1) they should lie next to the vertex of point clouds in the feature space; (2) they should situate in homogenous regions in the spatial domain; and (3) pixels of each class should have a similar spectral behavior. Recently, a module entitled spatial–spectral preprocessing (SSPP) was presented in [27], which could be used as a preprocessing function prior to endmember identification and SU. This module firstly computes the spatial homogeneity index for the pixels of the image, which is used to determine the homogenous regions of the image. Simultaneously, unsupervised clustering is employed to identify the spectral classes. Finally, by fusion of this spatial and spectral information, a subset of pixels that are spatially homogenous and spectrally pure is identified in each class, which could be used as the input of the endmember extraction algorithms.

In this way, the applied procedure to determine the endmembers has been shown in the dashed box of Figure 2 and runs as follows. Firstly, the virtual dimensionality (VD) of the image (p) is determined via Hysime [28]. Thereafter, the dimension is reduced using a PCA or MNF transformation into (p-1) bands. In this reduced feature space, endmembers are found via the well-known pixel purity index (PPI) [29] technique with a threshold value equal to zero. Then, the obtained results from the PPI are clustered into p clusters. Instead of the common K-means clustering (which is suitable for classes with spherical distributions), the Fast Density Peak Detection (FDPC) method [30] is applied since herein the classes are seen to have a variety of different distributions. Since the endmembers are found, image homogeneous regions are also detected by a Gaussian filter, and according to Equations (6) and (7). The previously found endmembers are prioritized according to both their spectral purity and homogeneity indices. Next, the spatial and spectral maps are generated based on the first 20% of the purest pixels, as well as 30% of the most homogeneous ones. The ultimate representative endmembers of each class are finally selected from the overlap of these two maps.

**Figure 2.** The spectrum of alunite and kaolinite minerals and the highly-correlated regions among the adjacent bands.

#### *3.2. Reducing the Spectral Variability Effect by Selecting the Optimal Bands*

In contrast with the conventional SMA methods which use the overall spectral region (i.e., all bands), SZU [13] has been designed based on the selection of persistent bands against the SV phenomena using ISI. The numerical value of this index is computed using Equation (2), which has been developed based on the Fisher separability function. This index is defined for each band based on the ratio of the intra-class dispersion (i.e., the total standard deviation of endmembers for each class) to the inter-class variability of endmembers (i.e., the average distance among the center of classes). The value of one indicates that the intra-class and inter-class variations are similar, and the smaller this value, the better the situation for that band to separate classes.

$$ISI\_{\lambda} = \frac{\Delta\_{\text{within},\lambda}}{\Delta\_{\text{between},\lambda}} = \frac{2}{p(p-1)} \sum\_{i=1}^{p-1} \sum\_{j=i+1}^{p} \frac{1.96(\sigma\_{i,\lambda} + \sigma\_{j,\lambda})}{\left| \overline{m}\_{i,\lambda} - \overline{m}\_{j,\lambda} \right|},\tag{2}$$

where *p* is the number of endmembers and *<sup>σ</sup>i*,*<sup>λ</sup>* and *mi*,*<sup>λ</sup>* are the standard deviation and the average of class *i* in the band *λ*, respectively. The image's bands could be prioritized with respect to the SV of their constituent spectra using this index.

#### *3.3. Reducing the Correlation of Endmembers by Selecting the Independent Bands in the Prototype Space*

Linear correlation of two or more endmembers always exists in the SU of hyperspectral images. However, little attention has been paid to this [8]. One of the main objectives of this paper is to reduce these correlations without the elimination of the dependent endmembers. This is because—as was mentioned—it would be of interest to separate the different species in some applications. On the other hand, the correlation of the spectrum of endmembers' sets for adjacent bands has not been considered in the band prioritizing process to reduce the effect of the SV. These two issues are closely related to each other. Therefore, by eliminating those bands for which the endmembers' sets spectrum is similar, the less-correlated spectral features of these endmembers could be achieved. The spectrum of alunite and kaolinite minerals from the USGS spectral library is represented by the spectral response of the AVIRIS sensor in Figure 2. As can be seen, the bands in the blue, red, and yellow regions are redundant. Besides, these regions may be correlated with each other.

In order to estimate the correlation of bands, some methods have been proposed based on the divergence and correlation functions on the histogram of the image's bands [31,32]. However, since the goal is to improve the condition of the coefficient matrix to reduce the endmembers' correlation, in this paper, the angle of bands in the prototype space has been used as a measure of the correlation of endmembers' sets in those two bands. In other words, by establishing the PS using the endmembers and representing the bands in this space, the dependent bands could be identified. The advantage of this method is that the correlation of bands is evaluated dealing with the endmembers' sets, because the axes of the PS are the endmembers. Therefore, if the combination of the endmembers is changed in the scene, the proposed method will select a new subset of bands that also have the minimum correlation.

Bands are categorized into three classes in the PS: (1) informative bands: the lager the distance of bands from the main diagonal of the space, the better the bands can separate the image' classes; (2) correlated bands: the bands that have a similar spectral response of endmembers are gathered close together in this space—this concept is beyond the correlation of the adjacent bands in the hyperspectral images, because it would have occurred for those bands that are not adjacent; (3) non-informative bands: those bands that are located close to the main diagonal of the space, which have the exact same response for different classes [9].

The spectrum (i.e., bands) of alunite and kaolinite minerals (Figure 2) are illustrated in the PS, which was constructed based on these two endmembers in Figure 3. In this figure, examples of highly correlated bands (using blue, red, and yellow colors, which are shown with magnification in the right view), informative bands (the two bands that are shown with blue squares and that have the largest distances from the main diagonal of the space, and those bands that are illustrated with red triangles in which the two spectra have an appropriate distance from each other) and uninformative bands (those bands that are shown with magenta color and that are located close to the main diagonal of the space) are illustrated.

**Figure 3.** The Prototype Space (PS) constructed using the two endmembers alunite and kaolinite.

It is obvious that the angles of the correlated bands are close to each other, even if these regions with a similar spectrum are not adjacent. Therefore, the correlation of bands in dealing with the endmembers could be understood by extracting these angles.

#### *3.4. Determining the Threshold Value to Identify the Independent Bands*

A threshold value should be defined in order to identify and eliminate the correlated bands, and the decision to preserve or eliminate each band is made by the comparison of the angle between that band and the previously selected bands with the pre-defined threshold value. The angle between bands is computed in the prototype space. If this angle is less that the pre-defined threshold value, it means that the band evaluated is similar to a band in the set of previously selected bands. Otherwise, the band evaluated is added to the set of bands.

In order to determine the threshold value, in this paper, the independent bands are extracted by defining a range (e.g., from 0.5 to five degrees) and a step (e.g., 0.15 degrees) for the variation of this threshold. Then, an image is reconstructed from the selected bands. The root mean square error (RMSE) of discrepancies between the estimated fractional abundances from this image and the ground truth map is computed. Finally, the threshold value that leads to the minimum RMSE is selected as the optimal threshold value. However, the obtained bands using this threshold value are selected as the optimal bands.

The ground truth map is not available in most applications of spectral unmixing. In these cases, the map of the extracted pure pixels from the image in Section 3.1 of the proposed method could be used as the ground truth map. In other words, similar to the case that the ground truth map is available, the proper threshold value is determined by evaluating the RMSE obtained from the estimation of the fractional abundances at the position of the image's pure pixels.
