*3.3. Retrieval Methods of Canopy N Status*

For crop N status assessment and monitoring, modeling methods can be divided into statistical, physically based, and hybrid methods. Statistical methods aim at high-precision N diagnosis by establishing mathematical relationships between ground-truth data and remote sensing spectral information, and can be further classified into traditional statistical methods and machine learning methods. Physically based methods take the structural information and physicochemical properties of plants and leaves as input parameters, simulate radiation absorption and scattering inside the canopy, and output a simulated reflectance spectrum, thereby establishing the correlation between crop parameters and the ground reflectance spectrum. These models are divided into radiative transfer models (RTMs) and geometric optical models; RTMs are used more often in research because rice and wheat are continuously and uniformly distributed in cultivation. Hybrid methods gain advantages by combining multiple categories of models but are currently rarely used in N retrieval (discussed in Section 5).

**Table 1.** Spectral indices for assessing crop N status.


*Ri* stands for reflectance at wavelength *i* nm.

#### 3.3.1. Traditional Statistical Methods

Traditional statistical methods have simple mechanisms, and their inversion accuracy depends heavily on the rationality of the modeling parameters. As a result, they cannot overcome the influence of environmental and other factors, and prediction models are difficult to transfer to other datasets. However, because the models are simple, convenient, and easy to use, they are still widely used in crop N monitoring. Univariate linear regression and its nonlinear transformations, as the basis of statistics, have achieved good results in testing whether crop N indicators are significantly correlated with individual VIs, and occupy an important position in N status inversion [104,132,133]. Hansen et al. [89] found that fitting N status with partial least squares regression (PLSR) gave results equal to or better than exponential regression, with a maximum increase in R<sup>2</sup> of 23%, and PLSR was considered a good alternative to univariate statistical models. When multiple growth stages are involved or phenological information is lacking, an individual index, which does not fully utilize the spectral data, cannot describe the relationship between N and spectral information, so multiple linear regression (MLR) is a better choice [134]. MLR can reduce the role of chance in the model. The key factors influencing a multiple regression model are whether the parameters involved actually respond to N and whether redundancy among the parameters leads to overfitting. The Pearson correlation coefficient is generally used as the selection measure in these studies, and combinations of parameters with larger correlation coefficients are selected to construct MLR models with good interannual scalability [64,69]. PLSR, principal component analysis (PCA), stepwise multiple linear regression (SMLR), ridge regression, and similar methods are advantageous in dealing with collinearity between parameters. In these studies, many collinear spectral variables were reduced to a few uncorrelated latent variables to avoid overfitting [91,123,135]. Selecting a suitable dimensionality reduction method based on the quantitative relationship between the number of samples and the number of variables, and combining it with multiple regression, significantly improves inversion accuracy and applicability.
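The collinearity problem described above can be illustrated with a minimal sketch. It assumes two hypothetical, nearly identical vegetation indices (`vi_a`, `vi_b`) and hypothetical plant N values (`n_content`). None of these numbers come from the cited studies, and the helper names are illustrative. The sketch computes the Pearson correlation between the two predictors, then compares ordinary least squares with ridge regression (penalty `ridge_lambda`, an arbitrarily chosen value) on centered data.

```python
import math

def center(values):
    """Subtract the mean so that no intercept term is needed."""
    mean = sum(values) / len(values)
    return [v - mean for v in values]

def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length lists."""
    xc, yc = center(x), center(y)
    num = sum(a * b for a, b in zip(xc, yc))
    den = math.sqrt(sum(a * a for a in xc) * sum(b * b for b in yc))
    return num / den

def fit_two_predictors(x1, x2, y, ridge_lambda=0.0):
    """Solve (X'X + lambda*I) b = X'y for two centered predictors.

    ridge_lambda = 0 gives ordinary least squares; a positive value
    gives ridge regression. The 2x2 system is solved by Cramer's rule.
    """
    a, b, t = center(x1), center(x2), center(y)
    s11 = sum(v * v for v in a) + ridge_lambda
    s22 = sum(v * v for v in b) + ridge_lambda
    s12 = sum(p * q for p, q in zip(a, b))
    r1 = sum(p * q for p, q in zip(a, t))
    r2 = sum(p * q for p, q in zip(b, t))
    det = s11 * s22 - s12 * s12
    return ((r1 * s22 - r2 * s12) / det, (r2 * s11 - r1 * s12) / det)

# Hypothetical data: two almost identical indices and plant N (%).
vi_a = [0.30, 0.35, 0.41, 0.48, 0.52, 0.60, 0.66, 0.71]
vi_b = [0.31, 0.34, 0.42, 0.47, 0.53, 0.59, 0.67, 0.70]
n_content = [1.9, 2.2, 2.3, 2.8, 2.9, 3.3, 3.4, 3.8]

r_ab = pearson_r(vi_a, vi_b)  # close to 1: the predictors are collinear
ols = fit_two_predictors(vi_a, vi_b, n_content)
ridge = fit_two_predictors(vi_a, vi_b, n_content, ridge_lambda=0.01)

ols_norm = math.hypot(*ols)      # length of the OLS coefficient vector
ridge_norm = math.hypot(*ridge)  # ridge shrinks this length

print(f"r(vi_a, vi_b) = {r_ab:.4f}")
print(f"OLS coefficients:   {ols[0]:.2f}, {ols[1]:.2f}")
print(f"Ridge coefficients: {ridge[0]:.2f}, {ridge[1]:.2f}")
```

With strongly correlated predictors, the unpenalized coefficients are poorly determined, whereas the ridge penalty shrinks the coefficient vector. PLSR and PCA address the same problem differently, by projecting the predictors onto a few latent variables; they are not shown here.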

Traditional statistical methods can describe different rates of change between N status and spectral information, providing fitting models for a variety of response patterns. For region-specific datasets, models with good inversion performance can be obtained by comparing different regression methods; for datasets from different growing environments, the best regression model differs from study to study [64,89,104,134]. Although statistical models are not stable enough to overcome environmental effects, the different regression methods are consistent in principle, have explicit mathematical forms, are easy to understand and apply, and are very convenient to use in fixed research areas. Therefore, traditional statistical methods still dominate existing N remote sensing monitoring studies.
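As a rough sketch of what comparing regression methods on one dataset can look like, the example below fits a linear and an exponential univariate model to the same data and ranks them by R<sup>2</sup>. The data are synthetic and deliberately generated from an exponential curve, and the function names (`linear_fit`, `r_squared`, `compare_models`) are illustrative assumptions, not taken from the cited studies. The exponential model is fitted by linear regression on log-transformed N values, which is one common simplification.

```python
import math

def linear_fit(x, y):
    """Least-squares slope and intercept for y = slope * x + intercept."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    slope = sxy / sxx
    return slope, my - slope * mx

def r_squared(y, y_hat):
    """Coefficient of determination on the original scale of y."""
    my = sum(y) / len(y)
    ss_res = sum((a - b) ** 2 for a, b in zip(y, y_hat))
    ss_tot = sum((a - my) ** 2 for a in y)
    return 1.0 - ss_res / ss_tot

def compare_models(vi, n_status):
    """Return R^2 of a linear and an exponential fit of N status on one VI."""
    slope, intercept = linear_fit(vi, n_status)
    linear_pred = [slope * v + intercept for v in vi]

    # Exponential model N = a * exp(b * VI), fitted as ln(N) = ln(a) + b * VI.
    b, ln_a = linear_fit(vi, [math.log(n) for n in n_status])
    exp_pred = [math.exp(ln_a + b * v) for v in vi]

    return {
        "linear": r_squared(n_status, linear_pred),
        "exponential": r_squared(n_status, exp_pred),
    }

# Synthetic example: N status follows an exponential response to the index.
vi = [0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9]
n_status = [0.5 * math.exp(2.5 * v) for v in vi]

scores = compare_models(vi, n_status)
best = max(scores, key=scores.get)
print(scores, "best:", best)
```

On data from a different environment the ranking may reverse, which is why the best regression form is reported to differ between datasets.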
