2.3.5. Regression Models

In general, hyperspectral data can be used to identify specific wavelengths and/or indices that are particularly useful for assessing plant and ecosystem variables [37]. These wavelengths and/or indices can then be used to estimate plant variables with multivariate statistical methods, such as multiple linear regression (MLR), principal component regression (PCR), partial least squares regression (PLSR), and SVM regression (SVMR) [38,39]. When a new independent variable is introduced into an MLR model in a stepwise manner, the variables already selected must be checked for collinearity, and the procedure continues until no new variable can be introduced and no existing one can be removed. The resulting equation is simple and effectively limits collinearity between the retained variables. PCR compresses many features in a high-dimensional space into a few mutually uncorrelated principal components that retain most of the variation in the original data, which reduces the amount of data and simplifies computation. PLSR combines the characteristics of MLR, canonical correlation analysis (CCA), and PCR; it avoids the multicollinearity problem and remains usable when the predictors are highly correlated or outnumber the samples, where classic regression methods become unstable. SVMR uses an inner-product kernel function in place of an explicit nonlinear mapping into a higher-dimensional space, and it remains well suited to cases in which the feature dimension exceeds the number of samples. Moreover, because SVMR is based on the principle of structural risk minimization, it mitigates overfitting and shows high generalizability.
