#### *3.1. Supervised Classification of Hyperspectral Pixels at the Ground Canopy Scale*

One approach to analyzing hyperspectral data at the field scale is the pixel-wise classification into general classes (background, straw, healthy leaf tissue) and disease-specific symptom classes. In the 2018 field experiment, YR reached a significant disease severity, and classifiers were derived for this disease.

The use of 16 VIs yielded a reasonable but not competitive performance. It should be noted, however, that the 16 VIs were compared with 161 features, which represents a substantial reduction in dimensionality. These results can be integrated into a later discussion of the various feature sets obtained by feature selection.

Based on these results, SVM SNV (Table 1) was selected as the most appropriate approach. A visual comparison of the SAM result and the SVM SNV result is shown in Figure 5. Significant differences were apparent. The SVM detected many more senescent leaves, e.g., all leaves from the lower leaf levels, whereas the SAM assigned these to the background or to the healthy leaves. The SVM was also more sensitive in detecting ears, which caused major problems in the SAM image, where ears were partly assigned to YR. Overall, both approaches were sensitive to YR, but the SVM was much more accurate in the very bright image parts as well as in the darker background parts, while the class YR was overrepresented in the SAM classification. The visually most striking aspect was the large number of blue pixels in the visualization of the SVM result. YR disease was present at all leaf levels and led to early senescence in the lower leaf levels.
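
To make the two compared approaches more concrete, the following minimal sketch shows the SNV transformation of individual spectra and a spectral-angle-based assignment as used by SAM. The function names, the array layout (one spectrum per row), and the four-band toy spectra are assumptions made for illustration; the sketch does not reproduce the processing pipeline of this study.

```python
import numpy as np

def snv(spectra):
    """Standard normal variate: center and scale each spectrum (row) individually."""
    spectra = np.asarray(spectra, dtype=float)
    mean = spectra.mean(axis=1, keepdims=True)
    std = spectra.std(axis=1, ddof=1, keepdims=True)
    return (spectra - mean) / std

def spectral_angle(pixel, reference):
    """Angle in radians between a pixel spectrum and a reference spectrum."""
    pixel = np.asarray(pixel, dtype=float)
    reference = np.asarray(reference, dtype=float)
    cos = np.dot(pixel, reference) / (np.linalg.norm(pixel) * np.linalg.norm(reference))
    # Clip to guard against rounding slightly outside [-1, 1]
    return float(np.arccos(np.clip(cos, -1.0, 1.0)))

def sam_classify(pixels, references):
    """Assign each pixel to the reference class with the smallest spectral angle."""
    angles = np.array([[spectral_angle(p, r) for r in references] for p in pixels])
    return angles.argmin(axis=1)

# Hypothetical toy spectra with four bands (not measured data)
references = np.array([[0.1, 0.2, 0.6, 0.7],    # class 0
                       [0.3, 0.4, 0.4, 0.3]])   # class 1
pixels = np.array([[0.2, 0.4, 1.2, 1.4],        # shape of class 0 at double brightness
                   [0.3, 0.45, 0.4, 0.3]])      # close to class 1
labels = sam_classify(pixels, references)
```

Both operations are insensitive to a multiplicative brightness factor: the spectral angle depends only on the direction of the spectrum, and SNV removes the offset and scale of each spectrum before a classifier such as an SVM is trained.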

The classification models were validated via two approaches: (1) pixel-wise classification of the hold-out test data set, consisting of manually annotated pixels from new images of separate plots, and (2) prediction of the pixel classes of all images obtained on a respective day and comparison of the percentage of disease-class pixels among all plant pixels with the visual assessment done by the expert (Figure 6). Approach 1 resulted in a confusion matrix allowing the calculation of multiple performance measures such as the overall accuracy and the sensitivity (recall), whereas Approach 2 provided the R² value, the correlation coefficient, and a regression plot. Table 1 shows the overall accuracy and the F1 score for the different classification methods.
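
The performance measures of Approach 1 can be derived from a confusion matrix as in the following minimal sketch. The function name, the row/column convention, and the three-class counts are hypothetical and serve only to illustrate the calculation; they are not results of this study.

```python
import numpy as np

def classification_metrics(confusion):
    """Overall accuracy and per-class recall, precision, and F1 score.

    Assumed convention: rows hold the annotated (true) classes,
    columns hold the predicted classes. Every class is assumed to
    occur at least once in both rows and columns (no zero division).
    """
    cm = np.asarray(confusion, dtype=float)
    true_pos = np.diag(cm)
    recall = true_pos / cm.sum(axis=1)       # also called sensitivity
    precision = true_pos / cm.sum(axis=0)
    f1 = 2 * precision * recall / (precision + recall)
    accuracy = true_pos.sum() / cm.sum()
    return accuracy, recall, precision, f1

# Hypothetical pixel counts for three classes (e.g., background, healthy, YR)
confusion = [[50,  0,  0],
             [ 5, 40,  5],
             [ 0, 10, 40]]
accuracy, recall, precision, f1 = classification_metrics(confusion)
```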

Presumably due to light reflections and transmission or a deviating weighting of the different canopy levels, the SVM prediction overestimated the ratio of diseased pixels. To compensate for this, a linear regression model was applied. However, deviations between the predicted disease severity and the visual assessment can have various reasons; e.g., the section of the plot observed by the sensor may not have represented the true status evaluated by the visual assessment. The viewing angle produced a variable composition of different leaves and leaf layers in the fields of view of the human and the sensors. Furthermore, the visibility of the lower leaf levels was low for the imaging system viewing from the top, whereas the human rater could assess them more accurately by going deep into the crop stand for individual leaf disease rating. In addition, the visual assessment produced a single value averaging the affected leaf area. For repeated disease assessments by multiple experts, deviations of varying magnitude have been reported in the literature [39,40]. Visual disease assessment is subjective and depends on the individuals performing it. Another prime factor for deviations and classification inaccuracies is the biological heterogeneity, which has to be considered highly dynamic within one field, one plot, and one location, and even across leaf layers and single leaves. It can be affected by many factors, e.g., the leaf color and status, stem elongation (distance between leaf layers), the density of the canopy, and other biological growth processes.
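
The linear calibration step can be illustrated with the following minimal sketch, which fits a least-squares line from predicted disease ratios to visual ratings and reports the coefficient of determination. The function names and all numerical values are hypothetical; the toy data are constructed so that the prediction overestimates the visual rating by a constant scale and offset.

```python
import numpy as np

def fit_linear_calibration(predicted, visual):
    """Least-squares line mapping predicted disease ratios onto visual ratings."""
    slope, intercept = np.polyfit(predicted, visual, deg=1)
    return slope, intercept

def r_squared(observed, estimated):
    """Coefficient of determination of estimates against observations."""
    observed = np.asarray(observed, dtype=float)
    estimated = np.asarray(estimated, dtype=float)
    ss_res = np.sum((observed - estimated) ** 2)
    ss_tot = np.sum((observed - observed.mean()) ** 2)
    return 1.0 - ss_res / ss_tot

# Hypothetical plot-level values in percent (not measured data):
# the visual rating equals 0.5 * predicted - 1
predicted = np.array([10.0, 20.0, 40.0, 60.0, 80.0])
visual = np.array([4.0, 9.0, 19.0, 29.0, 39.0])

slope, intercept = fit_linear_calibration(predicted, visual)
calibrated = slope * predicted + intercept
r2_before = r_squared(visual, predicted)
r2_after = r_squared(visual, calibrated)
```

Such a calibration removes systematic scale and offset differences between prediction and visual rating, but it does not change the correlation coefficient between the two, so scatter caused by the factors discussed above remains.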

**Figure 5.** Visual comparison of the spectral angle mapper (SAM) classification (**top left**) and the support vector machine standard normal variate (SVM SNV) classification (**top right**) with the original RGB visualization of the corresponding ground-based image of one representative measurement location of a plot inoculated with YR (**bottom left**). The image was captured with the hyperspectral camera Specim V10. The classes (**bottom right**) were generated from manual annotation of training and test data.

**Figure 6.** Scatter plot of the relation between visual assessment and predicted disease ratios for yellow rust on 23 May 2018 before (**left**) and after the application of a linear calibration model (**right**). The calibration model had the purpose of compensating for scale differences in the prediction values.

#### *3.2. Evaluation of Hyperspectral UAV Observations Using a Filter-System Hyperspectral Camera*

To characterize the reflectance of the field plots, the spectra of the central 4 × 2 m area of each plot were averaged. Intra-plot variations were neglected. Multiple traits were predicted with reasonable accuracy based on SVM and SVR analysis of the 55 recorded bands from 500–900 nm. Table 2 shows the obtained performance parameters based on an SNV representation and the integration of all 55 bands.
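
The plot-wise averaging can be sketched as follows, assuming the UAV data are available as a cube with the layout (rows, columns, bands) and that the pixel window covering the central plot area is already known. The function name, the window indices, and the random cube are hypothetical placeholders, not the data of this study.

```python
import numpy as np

def plot_mean_spectrum(cube, row_slice, col_slice):
    """Average all pixel spectra inside a rectangular window of a (rows, cols, bands) cube."""
    window = cube[row_slice, col_slice, :]
    return window.reshape(-1, cube.shape[-1]).mean(axis=0)

# Hypothetical cube: 20 x 30 pixels with 55 bands of random reflectance values
rng = np.random.default_rng(0)
cube = rng.uniform(0.05, 0.6, size=(20, 30, 55))

# Hypothetical pixel window standing in for the central plot area
spectrum = plot_mean_spectrum(cube, slice(5, 15), slice(5, 25))
```

The result is a single spectrum per plot and observation date, which is why intra-plot variation is no longer available to the subsequent models.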

**Figure 7.** Prediction results of a YR infestation obtained by applying a leave-one-out procedure to the support vector regression (SVR) on the UAV scale. Each point represents a plot observation, and its color indicates the observation date (DD-MM).


**Table 2.** Performance values for different SVM and SVR models predicting the treatments of the wheat field experiment based on UAV observations.

Using the SVR approach, a prediction of the disease severity in percent was possible with reasonable accuracy (Figure 7). The interpretation of this result has to take into account that a value from the visual assessment might not represent the plot average, because diseases may occur in localized zones within the plot, and the assessment locations may or may not fall within these zones.
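
A leave-one-out evaluation of an SVR model, as referred to in Figure 7, can be sketched with scikit-learn as follows. The kernel, the hyperparameters, the feature scaling step, and the synthetic data are assumptions chosen for illustration only; the settings used in this study are not reproduced here.

```python
import numpy as np
from sklearn.model_selection import LeaveOneOut, cross_val_predict
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVR

# Hypothetical data: 30 plot observations with 55 bands each (not measured data)
rng = np.random.default_rng(0)
X = rng.normal(size=(30, 55))
severity = 20.0 + 10.0 * X[:, 0] + rng.normal(scale=1.0, size=30)

# Assumed model settings: linear kernel and C = 10 are placeholders
model = make_pipeline(StandardScaler(), SVR(kernel="linear", C=10.0))

# Each observation is predicted by a model trained on all other observations
predicted = cross_val_predict(model, X, severity, cv=LeaveOneOut())
```

Plotting `predicted` against `severity` gives a scatter plot of the kind shown in Figure 7, in which every point is an out-of-sample prediction.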
