2.4.2. Selection of Vegetation Indices

When wheat is infected with PM, the physiological and biochemical parameters of its leaves change accordingly [16], which in turn alters their spectral reflectance. In total, 15 vegetation indices (VIs) frequently investigated in disease research were selected for this study, as listed in Table 2. All 15 VIs are closely related to these physiological and biochemical parameters.


**Table 2.** Vegetation indices used in this study.

### *2.5. Development of the Recognition Model for Wheat Leaf Disease*

The PLS-LDA method was used to classify healthy and diseased wheat leaves, using the sensitive vegetation and texture indices as features. PLS-LDA is effective for data characterized by small sample sizes, high dimensionality, and multicollinearity [39]. In the recognition model, the measured DS was converted into two classes, healthy and diseased, to discriminate PM infection. Model performance was evaluated using the overall accuracy (OA) derived from a confusion matrix and the Kappa coefficient. Higher OA and Kappa values indicate a more accurate model.
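As an illustration of the two evaluation measures, the following minimal sketch derives OA and Kappa from a two-class confusion matrix. It assumes that the Kappa coefficient is Cohen's kappa, which the text does not state explicitly. The function names and the eight example labels are hypothetical and are not taken from the study's data.

```python
def confusion_matrix(actual, predicted, labels):
    """Count samples per (actual, predicted) pair; rows are actual classes."""
    index = {label: k for k, label in enumerate(labels)}
    matrix = [[0] * len(labels) for _ in labels]
    for a, p in zip(actual, predicted):
        matrix[index[a]][index[p]] += 1
    return matrix


def overall_accuracy(matrix):
    """Fraction of samples on the diagonal of the confusion matrix."""
    total = sum(sum(row) for row in matrix)
    correct = sum(matrix[k][k] for k in range(len(matrix)))
    return correct / total


def kappa_coefficient(matrix):
    """Cohen's kappa (assumed): agreement beyond what chance would produce."""
    total = sum(sum(row) for row in matrix)
    observed = overall_accuracy(matrix)
    # Chance agreement from the row (actual) and column (predicted) totals.
    expected = sum(
        sum(matrix[k]) * sum(row[k] for row in matrix)
        for k in range(len(matrix))
    ) / (total * total)
    return (observed - expected) / (1 - expected)


# Made-up labels for eight leaves, for illustration only.
actual = ["healthy"] * 4 + ["diseased"] * 4
predicted = ["healthy", "healthy", "healthy", "diseased",
             "diseased", "diseased", "diseased", "healthy"]

cm = confusion_matrix(actual, predicted, ["healthy", "diseased"])
oa = overall_accuracy(cm)
kappa = kappa_coefficient(cm)
print(cm, oa, kappa)
```

In this toy case six of the eight leaves are classified correctly, so OA is 0.75. Because half of that agreement would be expected by chance for two balanced classes, Kappa is lower, at 0.5.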

### *2.6. Construction of the DS Estimation Model for Wheat Leaf Disease*

Furthermore, partial least-squares regression (PLSR) was used to establish regression relationships between the indices (VIs and NDTIs) and the DS of wheat leaves at different growth stages. As a mature multiple linear regression algorithm, PLSR is the simplest partial least-squares method and has been widely used to analyze agricultural remote-sensing data [40]. In the severity model, the relationship between the measured DS and the indices was examined to evaluate the sensitivity of the indices to PM at different growth stages. The performance of the wheat DS estimation model was evaluated by the coefficient of determination (*R*<sup>2</sup>), the root mean square error (*RMSE*), and the relative root mean square error (*RRMSE*). *R*<sup>2</sup> indicates the strength of the relationship between the VIs/NDTIs and DS, while *RMSE* and *RRMSE* were used to assess the robustness of the models. The higher the *R*<sup>2</sup> and the lower the *RMSE* and *RRMSE*, the better the model performs. *R*<sup>2</sup>, *RMSE*, and *RRMSE* are calculated with the following equations:

$$R^2 = 1 - \frac{\sum_{i=1}^{M} \left(y^{i} - \hat{y}^{i}\right)^{2}}{\sum_{i=1}^{M} \left(y^{i} - \bar{y}\right)^{2}} \tag{2}$$

$$RMSE = \sqrt{\frac{1}{M} \sum_{i=1}^{M} \left(y^{i} - \hat{y}^{i}\right)^{2}} \tag{3}$$

$$RRMSE = \frac{RMSE}{\bar{y}} \tag{4}$$

where *y*<sup>*i*</sup> is the observed DS of sample *i*, *ŷ*<sup>*i*</sup> is the estimated DS of sample *i*, *ȳ* is the mean of the observed DS, and *M* is the number of samples.
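The three measures can be sketched in a few lines of Python. This is a minimal illustration that assumes the standard residual-based definition of *R*<sup>2</sup> (one minus the ratio of the residual to the total sum of squares) and an *RRMSE* expressed as a fraction rather than a percentage. The function names and the four DS values are hypothetical.

```python
import math


def r_squared(observed, estimated):
    """1 - SS_res / SS_tot, with SS_tot taken about the observed mean."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((y - y_hat) ** 2 for y, y_hat in zip(observed, estimated))
    ss_tot = sum((y - mean_obs) ** 2 for y in observed)
    return 1 - ss_res / ss_tot


def rmse(observed, estimated):
    """Root of the mean squared difference between observed and estimated."""
    m = len(observed)
    squared = sum((y - y_hat) ** 2 for y, y_hat in zip(observed, estimated))
    return math.sqrt(squared / m)


def rrmse(observed, estimated):
    """RMSE divided by the mean of the observations (a fraction, not %)."""
    return rmse(observed, estimated) / (sum(observed) / len(observed))


# Made-up DS values, for illustration only.
observed = [10.0, 20.0, 30.0, 40.0]
estimated = [12.0, 18.0, 33.0, 37.0]

r2 = r_squared(observed, estimated)
error = rmse(observed, estimated)
relative_error = rrmse(observed, estimated)
print(r2, error, relative_error)
```

Dividing *RMSE* by the observed mean makes the error comparable across growth stages whose DS ranges differ.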

A calibration model built on all of the data tends to show better apparent performance, but it may be overfitted. To guard against this, 10-fold cross-validation was adopted to assess model performance. The data were split into 10 folds; nine folds were used to build the PLSR model, which was then used to predict the remaining fold. After 10 such trials, with each fold held out once, DS estimates were obtained for all samples.
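The fold logic described above can be sketched as follows. To keep the example self-contained, a one-predictor ordinary least-squares fit (`fit_line`) is swapped in for PLSR; it is only a stand-in, and any routine that returns a prediction function could be passed in its place. The shuffling seed, the function names, and the synthetic data are all assumptions made for illustration.

```python
import random


def fit_line(x, y):
    """Ordinary least squares with one predictor; a stand-in for PLSR."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return lambda value: intercept + slope * value


def cross_val_estimates(x, y, fit, n_folds=10, seed=0):
    """Return one out-of-fold estimate per sample."""
    order = list(range(len(x)))
    random.Random(seed).shuffle(order)
    # Deal the shuffled sample indices into n_folds roughly equal folds.
    folds = [order[k::n_folds] for k in range(n_folds)]
    estimates = [None] * len(x)
    for held_out in folds:
        held = set(held_out)
        train = [i for i in order if i not in held]
        # Build the model on nine folds, then predict the held-out fold.
        model = fit([x[i] for i in train], [y[i] for i in train])
        for i in held_out:
            estimates[i] = model(x[i])
    return estimates


# Synthetic, exactly linear data, for illustration only.
index_values = [i / 10 for i in range(50)]
severity = [5.0 + 12.0 * v for v in index_values]

cv_estimates = cross_val_estimates(index_values, severity, fit_line)
print(len(cv_estimates))
```

Each sample is predicted exactly once, by a model that never saw it during fitting, so the resulting estimates can be compared with the observed DS without the optimism of a model fitted to all data.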
