## *2.5. Dimensionality Reduction*

Both RF and XGBoost provide internal measures of VI. RF provides two measures of VI, namely mean decrease Gini (MDG) and mean decrease accuracy (MDA) [25]. MDG quantifies VI as the sum of all decreases in the Gini index produced by a particular variable. MDA measures the change in OOB error that results from comparing the OOB error on the original dataset to that on a dataset in which the values of the variable have been randomly permuted. In this study, MDA was used to compute VI, following the recommendations of [22,54,55]. The MDA VI for a waveband *Xj* is defined by [56]:

$$VI(X\_j) = \frac{1}{\text{ntree}} \sum\_{t} (errOOB\_t^j - errOOB\_t) \tag{1}$$

where *errOOBt* is the misclassification rate of tree *t* on the *OOBt* bootstrap sample not used to construct tree *t*, and *errOOBtj* is the error of tree *t* on the permuted *OOBtj* sample, in which the values of waveband *Xj* have been randomly permuted.
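Equation (1) amounts to averaging, over all trees, the increase in OOB error caused by permuting waveband *Xj*. A minimal sketch in Python (the per-tree error lists below are hypothetical placeholders, not values from this study):

```python
def mda_importance(err_oob_permuted, err_oob):
    """Eq. (1): mean over trees of (errOOB_t^j - errOOB_t), i.e. the
    average rise in OOB misclassification after permuting waveband Xj."""
    ntree = len(err_oob)
    return sum(ep - e for ep, e in zip(err_oob_permuted, err_oob)) / ntree

# Toy per-tree OOB errors for one waveband (illustrative values only):
# permuting the band raises each tree's error, so its importance is positive.
vi = mda_importance(err_oob_permuted=[0.30, 0.25, 0.35],
                    err_oob=[0.10, 0.15, 0.05])
```

A waveband whose permutation leaves the OOB error unchanged receives a VI near zero, which is the basis for discarding it during dimensionality reduction.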

XGBoost ranks VI based on Gain [30]. Gain measures the degree of improved accuracy brought by the addition of a given waveband. VI is calculated for each waveband used for node splitting in a given tree, and then averaged across all trees to produce the final VI per waveband [30]. Similar to [23,24], the top 10% (*p* = 18) of the wavebands, as ranked by RF and XGBoost importance, was used to create a subset of important wavebands. RF and XGBoost models were produced for both the original dataset and the subset of 18 wavebands.
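The top-10% subset selection is a simple rank-and-slice over the per-waveband importance scores. A sketch, where `importances` is a hypothetical list holding one score per waveband (assuming a full spectrum of about 180 wavebands, so that 10% rounds to the *p* = 18 used above):

```python
def top_wavebands(importances, fraction=0.10):
    """Indices of the top `fraction` of wavebands, ranked by
    importance score (e.g. MDA for RF, Gain for XGBoost)."""
    p = max(1, round(len(importances) * fraction))
    order = sorted(range(len(importances)),
                   key=lambda i: importances[i], reverse=True)
    return order[:p]

# Hypothetical importance scores, one per waveband:
importances = [i / 180 for i in range(180)]
subset = top_wavebands(importances)  # indices of the 18 best wavebands
```

The reduced models are then trained on only the selected waveband columns.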

## *2.6. Accuracy Assessment*

To provide an unbiased estimate of model accuracy, an independent test set was used to evaluate all RF and XGBoost models. To this end, a second dataset of spectral samples (*n* = 60) was collected for both stressed (*n* = 30) and non-stressed (*n* = 30) vines. Both algorithms were trained using the first dataset of 60 samples and tested using the second dataset. Overall classification accuracies were computed using a confusion matrix [57]. Additionally, Kappa analysis was used to evaluate model performance. The KHAT statistic [58] provides a measure of the difference between the actual and the chance agreement in the confusion matrix:

$$
\hat{K} = \frac{p\_a - p\_c}{1 - p\_c} \tag{2}
$$

where *pa* describes the actual agreement and *pc* describes the chance agreement. Following [20,23,59], McNemar's test was employed to determine whether the differences in accuracies yielded by RF and XGBoost were statistically significant. Abdel-Rahman et al. [23] stated that McNemar's test can be expressed as the following chi-squared formula:
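The KHAT computation in Equation (2) can be sketched directly from a confusion matrix, with *pa* as the diagonal fraction and *pc* derived from the row and column marginals (the matrix below is illustrative, not a result from this study):

```python
def khat(cm):
    """KHAT statistic (Eq. 2). p_a is the observed agreement (diagonal
    fraction); p_c is the chance agreement from row/column marginals."""
    n = sum(sum(row) for row in cm)
    p_a = sum(cm[i][i] for i in range(len(cm))) / n
    row_tot = [sum(row) for row in cm]
    col_tot = [sum(col) for col in zip(*cm)]
    p_c = sum(r * c for r, c in zip(row_tot, col_tot)) / n ** 2
    return (p_a - p_c) / (1 - p_c)

# Hypothetical 2-class confusion matrix (stressed vs. non-stressed):
cm = [[25, 5],
      [5, 25]]
```

A KHAT of 1 indicates perfect agreement, while 0 indicates agreement no better than chance.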

$$\chi^2 = \frac{(f\_{xgb} - f\_{rf})^2}{f\_{xgb} + f\_{rf}} \tag{3}$$

where *fxgb* denotes the number of samples misclassified by RF but correctly classified by XGBoost, and *frf* denotes the number of samples misclassified by XGBoost but correctly classified by RF. A *χ*2 value greater than 3.84, at the 0.05 level of significance, indicates that the results of the two classifiers are significantly different [23,59].
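Equation (3) and the 3.84 decision threshold can be sketched directly from the two discordant counts (the counts below are illustrative, not results from this study):

```python
def mcnemar_chi2(f_xgb, f_rf):
    """McNemar's chi-squared (Eq. 3) from the two discordant counts."""
    if f_xgb + f_rf == 0:
        return 0.0  # identical predictions: no evidence of a difference
    return (f_xgb - f_rf) ** 2 / (f_xgb + f_rf)

# Chi-squared above 3.84 (0.05 level, 1 d.o.f.) => accuracies differ.
significant = mcnemar_chi2(f_xgb=10, f_rf=2) > 3.84
```

Note that only the discordant samples enter the statistic; samples on which both classifiers agree carry no information about their relative accuracy.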
