2.1.4. Meteorological Data

Meteorological data are the basis for analyzing and describing climate characteristics and how they change [28]. The occurrence and prevalence of wheat yellow rust depend on the interaction among wheat varieties, the amount of yellow rust pathogen present, and environmental conditions [1,4,8]. When both pathogen and host have the potential for an epidemic, environmental conditions, specifically meteorological conditions, become the dominant factor in a wheat yellow rust epidemic [4,29].

Considering the influence of climate conditions on the infection of wheat by the yellow rust pathogen, five types of meteorological data for March to May 2018 were obtained from the National Meteorological Information Center for 37 sampling sites around Ningqiang County: average temperature (TEM), precipitation (PRE), sunshine hours (SSD), wind speed (WIN), and relative humidity (RHU). For each variable, we calculated a monthly mean value for March, April, and May, giving a total of 15 meteorological features (5 variables × 3 months) in this study.
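The feature construction described above can be sketched as follows. This is a minimal illustration, not the processing chain used in the study: it assumes the raw data for one site are available as daily `(month, variable, value)` records, and the function name `monthly_features` and the feature keys such as `TEM_03` are hypothetical.

```python
from collections import defaultdict
from statistics import mean

# Variable codes as named in the text; months 3-5 are March to May.
VARIABLES = ("TEM", "PRE", "SSD", "WIN", "RHU")
MONTHS = (3, 4, 5)


def monthly_features(records):
    """Average the records of one site into variable-by-month features.

    `records` is an iterable of (month, variable, value) tuples
    (an assumed input layout). Returns a dict keyed like "TEM_03".
    """
    grouped = defaultdict(list)
    for month, variable, value in records:
        if month in MONTHS and variable in VARIABLES:
            grouped[(variable, month)].append(value)
    # One mean per (variable, month) pair: at most 5 x 3 = 15 features.
    return {
        f"{variable}_{month:02d}": mean(values)
        for (variable, month), values in sorted(grouped.items())
    }
```

Applied to each of the 37 sites, a function of this kind would yield one 15-element feature vector per site.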

### *2.2. Vegetation Indices for Plant Diseases Discrimination*

Crops under stress from pests and diseases often undergo changes that affect their spectral properties, including changes in pigmentation, moisture, and biomass. The sensitive spectral bands were combined mathematically to construct vegetation indices (VIs). VIs related to plant growth status, vegetation coverage, and pigment content were used to capture the physiological and biochemical changes caused by wheat yellow rust infection (Table 1).


**Table 1.** Multispectral vegetation indices for wheat yellow rust discrimination.

### *2.3. Two-Stage Vegetation Index for Wheat Yellow Rust Monitoring*

The study area belongs to the wheat region of southwest China. Generally, the initial stage of wheat yellow rust disease in this region runs from the end of March to the beginning of April; in 2018, the yellow rust outbreak occurred in mid-May. Accordingly, we selected Sentinel-2 images acquired on 2 April and 12 May 2018. Based on the commonly used vegetation indices listed in Table 1, we calculated the magnitude of change from 2 April to 12 May using the following normalized difference formula:

$$\text{nVIs} = \frac{\text{VI}_{12\,\text{May}} - \text{VI}_{2\,\text{April}}}{\text{VI}_{12\,\text{May}} + \text{VI}_{2\,\text{April}}} \tag{1}$$

where nVIs represents the change in a vegetation index between the two stages, and VI<sub>2April</sub> and VI<sub>12May</sub> are the values of the vegetation index extracted from the images acquired at the first occurrence of yellow rust (2 April 2018) and at the large outbreak of yellow rust (12 May 2018), respectively.
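As a worked illustration of Equation (1), the short sketch below computes the normalized two-stage change for a single pixel. The function name `normalized_change` and the example index values are hypothetical, and returning NaN when the denominator is zero is an assumption made here, since the text does not say how that case was handled.

```python
import math


def normalized_change(vi_april, vi_may):
    """Two-stage normalized change of one vegetation index, as in Eq. (1)."""
    denominator = vi_may + vi_april
    if denominator == 0:
        # Assumption: treat an undefined ratio as missing data.
        return math.nan
    return (vi_may - vi_april) / denominator


# Hypothetical pixel whose index drops from 0.8 (2 April) to 0.4 (12 May).
change = normalized_change(0.8, 0.4)
```

For indices with positive values, a negative result indicates that the index decreased between the two dates, and the normalization keeps the result within [-1, 1].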

### *2.4. Spectral VIs and Meteorological Features Importance Ranking*

Many features, including vegetation indices and meteorological parameters, are potentially relevant to crop disease monitoring; however, their sensitivity varies substantially. It is therefore necessary to quantify how much each feature affects the model predictions. In this study, random forest (RF), first described by Breiman [39], was applied for classification and feature importance analysis. RF is an ensemble approach that builds multiple decision trees for prediction. The feature importance in RF is computed as the average contribution of each feature across the trees in the RF [39]. We used the out-of-bag (OOB) data to calculate the error (errOOB<sub>t</sub>) for each tree t in the RF. Subsequently, for each feature X<sup>i</sup>, we compared the OOB error before (errOOB<sub>t</sub>) and after (errOOB<sub>t</sub><sup>i</sup>) adding noise to that feature and averaged the difference over all trees, where N<sub>tree</sub> denotes the number of trees in the RF [22]. Finally, the importance of feature X<sup>i</sup> was defined as:

$$V(X^{i}) = \frac{1}{N_{\text{tree}}} \sum_{t=1}^{N_{\text{tree}}} \left( \text{errOOB}_{t} - \text{errOOB}_{t}^{i} \right) \tag{2}$$
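The aggregation step in Equation (2) can be sketched as below. This covers only the averaging over trees and assumes the per-tree OOB errors before and after perturbing the feature have already been obtained from a fitted forest. The function name `feature_importance` and the error values are hypothetical.

```python
def feature_importance(err_oob, err_oob_noised):
    """Average per-tree OOB error difference for one feature, as in Eq. (2).

    err_oob[t]        -- OOB error of tree t on the original data
    err_oob_noised[t] -- OOB error of tree t after adding noise to the feature
    """
    if not err_oob or len(err_oob) != len(err_oob_noised):
        raise ValueError("need one error pair per tree")
    n_tree = len(err_oob)
    # With the sign convention of Eq. (2), a feature whose perturbation
    # increases the OOB error yields a negative value.
    return sum(before - after
               for before, after in zip(err_oob, err_oob_noised)) / n_tree


# Hypothetical errors for a two-tree forest.
score = feature_importance([0.2, 0.3], [0.4, 0.5])
```

Features can then be ranked by the magnitude of this score, since a larger change in OOB error means the model relies more heavily on that feature.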

In addition, we used analysis of variance (ANOVA) to test the significance of the selected features [40]. The statistical significance, expressed by the *p* value, reflects the suitability of the feature [9,40]. Finally, we selected the features that were both highly important and statistically significant as the optimal features for yellow rust detection.
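To make the significance test concrete, the sketch below computes the F statistic of a one-way ANOVA for a single feature. The text does not state which ANOVA design was used, so the one-way form is an assumption, as is grouping the samples by disease class. The function name `anova_f` and the sample values are hypothetical. Obtaining the *p* value additionally requires the F distribution, for example through `scipy.stats.f_oneway`, which returns both the statistic and the *p* value.

```python
from statistics import mean


def anova_f(groups):
    """One-way ANOVA F statistic for one feature.

    `groups` holds one list of feature values per class
    (for example, healthy and infected samples).
    """
    k = len(groups)                      # number of classes
    n = sum(len(g) for g in groups)      # total number of samples
    grand_mean = mean(v for g in groups for v in g)
    group_means = [mean(g) for g in groups]
    ss_between = sum(len(g) * (m - grand_mean) ** 2
                     for g, m in zip(groups, group_means))
    ss_within = sum((v - m) ** 2
                    for g, m in zip(groups, group_means) for v in g)
    # Ratio of between-class to within-class mean squares.
    return (ss_between / (k - 1)) / (ss_within / (n - k))


# Hypothetical feature values for two classes.
f_stat = anova_f([[1.0, 2.0, 3.0], [2.0, 3.0, 4.0]])
```

A larger F statistic means the feature separates the classes more strongly relative to its within-class spread, which corresponds to a smaller *p* value.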
