**1. Introduction**

The classification of reflectance spectra to determine broad plant type or species has been explored increasingly over the past two decades. This has been driven by the increased availability of hyperspectral sensing from imaging spectrometers and field spectroradiometers, and by growing demand from environmental conservation, agriculture, and forestry groups [1]. High classification accuracies, particularly at fine taxonomic units such as species, or even clones for grapevine varieties [2], have in some cases been enabled by hyperspectral observation [3]. Hyperspectral measurements have been used to classify a variety of plant types, including annual gramineous weeds [4], food crops [5], arid zone shrubs [6], and montane/sub-alpine trees [7], growing in equally varied environments, including tropical wetlands [8], urban streetscapes [9], savanna plains [10], and alpine forests [11]. Mapping and monitoring the world's vegetation at scale requires fast, generalisable, and objective methods whose results can be quickly and easily shared and analysed. Hyperspectral imagery and data can fulfil these requirements, producing digital measurements that can be easily shared and quickly analysed with semi-automated procedures in a repeatable and objective manner. However, the potential generalisability of classification models has yet to be fully evaluated.

Hyperspectral measurements consist of numerous, finely spaced, contiguous measurements (wavebands), providing considerably more information about targets than broadband multispectral observations. These advantages come at the cost of high dimensionality and large data volumes. Hyperspectral instruments record radiance within the range of 350 to 2500 nm of the electromagnetic spectrum, with bandwidths often between 1 and 10 nm. The number of wavebands per observation varies from hundreds to thousands. Training a classification model with such large numbers of spectral features generally requires a large sample size. However, since the collection of samples for hyperspectral studies is onerous, with high costs for imagery and arduous fieldwork for gathering field measurements, sample sizes tend to be small. Data of this high dimensionality are prone to the Hughes phenomenon, also known as the curse of dimensionality, whereby adding features initially improves classification, but adding still more features decreases performance as the noise and sparsity of the feature space increase [12]. This problem is exacerbated by small sample sizes [13].
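
The Hughes phenomenon described above can be illustrated with a small simulation. The following is a minimal sketch under stated assumptions, not a method taken from any of the reviewed studies: the data are synthetic Gaussian "spectra", the classifier is a simple nearest-centroid rule, and the function name `simulate_hughes` and all parameter values (five informative features, ten training samples per class) are hypothetical choices made for illustration only.

```python
import numpy as np


def simulate_hughes(feature_counts=(1, 2, 5, 10, 50, 200, 1000),
                    n_informative=5, n_train=10, n_test=500,
                    shift=1.0, n_repeats=20, seed=0):
    """Mean test accuracy of a nearest-centroid classifier per feature count.

    Two synthetic Gaussian classes differ only in the first `n_informative`
    features; every further feature is pure noise. All values are assumptions
    chosen for illustration, not measurements from any study.
    """
    rng = np.random.default_rng(seed)
    accuracy = {}
    for d in feature_counts:
        # Class 1 is offset from class 0 in the informative features only.
        offset = np.zeros(d)
        offset[:n_informative] = shift
        scores = []
        for _ in range(n_repeats):
            # Small training set (n_train samples per class), larger test set.
            train0 = rng.normal(size=(n_train, d))
            train1 = rng.normal(size=(n_train, d)) + offset
            test0 = rng.normal(size=(n_test, d))
            test1 = rng.normal(size=(n_test, d)) + offset
            # Class centroids estimated from the training samples.
            c0, c1 = train0.mean(axis=0), train1.mean(axis=0)
            test = np.vstack([test0, test1])
            labels = np.repeat([0, 1], n_test)
            # Assign each test sample to the nearer centroid.
            dist0 = ((test - c0) ** 2).sum(axis=1)
            dist1 = ((test - c1) ** 2).sum(axis=1)
            predicted = (dist1 < dist0).astype(int)
            scores.append((predicted == labels).mean())
        accuracy[d] = float(np.mean(scores))
    return accuracy


if __name__ == "__main__":
    for d, acc in simulate_hughes().items():
        print(f"{d:>5} features: accuracy {acc:.3f}")
```

Under these assumptions, accuracy should rise while the added features carry class information and fall once further features contribute only noise to the centroid estimates. Increasing `n_train` should soften the decline, consistent with the observation that the problem is exacerbated by small sample sizes.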

To overcome this, the ratio between sample size and data dimensionality must be improved. In this review, we focus on reducing dimensionality via feature selection, though methods of artificially increasing sample size through data augmentation, semi-supervised classification, and active learning can also aid in countering the curse of dimensionality [14–16]. Hyperspectral measurements tend to include noisy or redundant features, with high levels of collinearity between wavebands. The elimination of collinearity can substantially improve classification efforts and is in fact a requirement of parametric statistical methods that assume the independence of all variables [17,18]. Additionally, feature selection inherently reveals the spectral regions that offer the greatest discriminatory power for a set of samples. Long-held associations between specific spectral regions or individual wavebands and biophysical or biochemical foliar traits [19] have often guided researchers in the selection of features to differentiate species or plant types. The overall aim of this review is to assess these assumptions in light of the evidence from 22 years of hyperspectral plant studies.
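
To make the notion of collinearity between wavebands concrete, the sketch below shows one simple way redundant bands could be screened out: a greedy filter based on pairwise Pearson correlation. This is an illustrative assumption only, one of many possible feature selection approaches, and is not presented as the method of any reviewed study. The function name `drop_collinear_bands` and the 0.95 threshold are hypothetical, and the input is assumed to contain no constant bands (for which correlation is undefined).

```python
import numpy as np


def drop_collinear_bands(X, threshold=0.95):
    """Return indices of wavebands kept by a greedy correlation filter.

    X is an (n_samples, n_bands) array. Bands are visited in order, and a
    band is kept only if its absolute Pearson correlation with every band
    already kept is below `threshold` (an assumed, illustrative cut-off).
    """
    corr = np.abs(np.corrcoef(X, rowvar=False))
    kept = []
    for j in range(X.shape[1]):
        if all(corr[j, k] < threshold for k in kept):
            kept.append(j)
    return kept


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    a = rng.normal(size=200)
    b = rng.normal(size=200)
    # Band 1 is almost an exact multiple of band 0; band 2 is independent.
    X = np.column_stack([a, 2 * a + 1e-3 * rng.normal(size=200), b])
    print(drop_collinear_bands(X))
```

Because the filter is unsupervised, it removes redundancy without regard to class labels. Identifying the spectral regions with the greatest discriminatory power would additionally require a selection criterion that uses the labels.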
