**5. Discussion**

This review of the hyperspectral vegetation classification literature has found that every aspect of a study can greatly influence the selected wavebands and classification performance. Despite this, we have identified some important and consistent patterns that appear throughout the literature. Visible wavelengths and their associations with photosynthetic pigments have played important discriminatory roles in a wide range of studies, and their high levels of selection are clearly evident in this review (Figure 1). Selection rates in the VIS showed only minor variations between VIS/SWIR and VIS/NIR studies (Figure 1), although the comparisons between canopy and leaf scale spectra demonstrated significant differences (Figure 3). The discriminatory value of the red edge has been well documented, owing to its close relationship to chlorophyll concentration and structural features. This is reflected in the consistently high rates of selection of the red edge, as well as in the robustness of its selection, with only minor variation in magnitude between the comparisons. The inclusion of structural features in canopy spectra can provide high levels of interspecific variation in the NIR, primarily in the form of differences in albedo rather than spectral shape [68]. However, selection rates from the non-red edge NIR are low, with the selected bands generally being related to water absorption features and potentially high levels of within-class variability. Additionally, the NIR has demonstrated the greatest degree of variability between the canopy and leaf scale spectra studies. Wavebands selected in the SWIR are associated with water absorption and non-photosynthetic biochemicals, with selection rates heavily skewed towards the NSWIR over the FSWIR.

The reported importance of NIR bands [45,52] appears contentious, as it is driven primarily by the use of a single feature selection technique. Selection rates for the NIR with and without SDA as the feature selector contrast starkly, with the importance of the NIR being significantly higher when SDA is used. The criticisms of SDA, and of stepwise methods in general, perhaps offer an explanation for the selection biases presented in this review.
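The kind of comparison described above can be illustrated with a minimal sketch. It assumes one possible definition of a selection rate (the fraction of studies that selected at least one band in a region), which may differ from the definition used in this review. The region boundaries, the selector labels, and the study data are all hypothetical and were constructed only to show the mechanics.

```python
# Assumed region boundaries in nm (illustrative only; not taken from the review).
REGIONS = {
    "VIS": (400, 680),
    "red edge": (680, 750),
    "NIR": (750, 1300),
    "SWIR": (1300, 2500),
}


def region_of(wavelength_nm):
    """Return the name of the region containing a wavelength, or None."""
    for name, (low, high) in REGIONS.items():
        if low <= wavelength_nm < high:
            return name
    return None


def selection_rates(studies):
    """Fraction of studies that selected at least one band in each region."""
    counts = {name: 0 for name in REGIONS}
    for bands in studies:
        # Count each region at most once per study.
        for name in {region_of(b) for b in bands} - {None}:
            counts[name] += 1
    return {name: counts[name] / len(studies) for name in REGIONS}


# Hypothetical studies: (feature selector label, selected band centres in nm).
studies = [
    ("SDA", [550, 720, 850, 1100]),
    ("SDA", [670, 900, 1650]),
    ("RF", [550, 710, 1450]),
    ("GA", [480, 730, 2200]),
]

with_sda = selection_rates([bands for method, bands in studies if method == "SDA"])
without_sda = selection_rates([bands for method, bands in studies if method != "SDA"])
print("NIR rate with SDA:", with_sda["NIR"])
print("NIR rate without SDA:", without_sda["NIR"])
```

Splitting the studies by selector before computing the rates makes it visible when a single technique dominates the apparent importance of a region.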

It is apparent that there is no single best feature selection method, with the same method performing very differently within and between studies. This suggests that best practice may be either to apply multiple methods to the data or to use an ensemble of methods, a conclusion supported by this review and previously suggested by some studies [36]. Moreover, multiple subsets of selected features have been shown to discriminate species equally well [8], and in some cases the original data, without any feature selection, have outperformed feature-selected subsets [7,9,36]. Finally, as computational power and dataset sizes grow and machine learning techniques advance, the need for feature selection as a data reduction technique diminishes.
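One simple way to combine several feature selection methods is rank aggregation. The sketch below uses a Borda count, which is our choice for illustration and not a method prescribed by the studies reviewed. The function name `borda_aggregate` and the three ranked band lists are hypothetical.

```python
from collections import defaultdict


def borda_aggregate(rankings, top_k):
    """Combine several ranked band lists into one consensus list (Borda count).

    Each ranking lists band centres from most to least important. A band
    earns more points the higher it is ranked, and no points from a ranking
    in which it does not appear.
    """
    all_bands = set().union(*rankings)
    n = len(all_bands)
    scores = defaultdict(int)
    for ranking in rankings:
        for position, band in enumerate(ranking):
            scores[band] += n - position
    # Sort by descending score; break ties by ascending wavelength.
    ordered = sorted(all_bands, key=lambda band: (-scores[band], band))
    return ordered[:top_k]


# Hypothetical band centres (nm) as ranked by three hypothetical selectors.
rankings = [
    [680, 720, 550, 1450, 970],
    [720, 680, 1450, 550, 2200],
    [720, 550, 680, 970, 1450],
]

consensus = borda_aggregate(rankings, top_k=3)
print(consensus)  # [720, 680, 550]
```

Bands that rank highly under several selectors rise to the top of the consensus list, whereas a band favoured by only one selector contributes little, which dampens the bias of any single method.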
