4.2.2. Results for the PLS-Wave Model

At the plant level, the performance of the PLS-Wave models varied widely in both environments, with calibration *R*<sup>2</sup> ranging from 0.11 to 0.94 (Table 4, top). Differences between the SG and FD processing were negligible in most cases. Fits for Mg in ET and protein in the US were high (0.88–0.94), but there was little consistency across locations. The *t*-test comparisons show significant differences in model performance between the two locations for all nutrients and processing methods, indicating that the PLS-Wave regression model did not replicate well in predicting plant nutrients across these two environments.

**Table 4.** Partial least squares (PLS) regression and *t*-test results for the waveband selection model (PLS-Wave) for plant and grain. SG: Savitzky–Golay; FD: First derivative; NLV: Number of latent variables; std: Standard deviation.


\* *t*-test significant at 0.05, \*\* at 0.01, \*\*\* at 0.001.

At the grain level, the PLS-Wave models generally performed well for all nutrients across the two locations, with calibration *R*<sup>2</sup> ranging from 0.52 to 0.95 (Table 4, bottom). Fits for protein were lower for ET than for the US, but fits for Ca and Mg were similar. Bootstrapping resulted in normal or approximately normal distributions of *R*<sup>2</sup> for all nutrients and processing methods except Ca in ET, for which a bimodal distribution was observed (Figure S2, Supplementary Material). Further investigation showed that the original Ca ET data were themselves bimodal and right-skewed rather than normally distributed, which explains the bimodal and skewed distribution of the bootstrapped *R*<sup>2</sup> values. The *t*-test comparisons again indicate significant differences in all cases, despite similar *R*<sup>2</sup> values for some of the nutrient-component combinations, indicating that the PLS-Wave regression model did not replicate well across these two environments.
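As an illustration of how two sets of bootstrapped *R*<sup>2</sup> values might be compared, the following minimal sketch computes a two-sample *t* statistic. It assumes Welch's unequal-variance form, which is not necessarily the variant used in this study, and the function name `welch_t` and the input values are hypothetical placeholders, not results from Table 4.

```python
import math
import statistics


def welch_t(sample_a, sample_b):
    """Return (t statistic, Welch-Satterthwaite degrees of freedom).

    Assumption: Welch's unequal-variance t-test; the study's exact
    t-test variant is not stated in this section.
    """
    n_a, n_b = len(sample_a), len(sample_b)
    # Squared standard errors of each mean (sample variance uses n - 1).
    se2_a = statistics.variance(sample_a) / n_a
    se2_b = statistics.variance(sample_b) / n_b
    mean_diff = statistics.fmean(sample_a) - statistics.fmean(sample_b)
    t_stat = mean_diff / math.sqrt(se2_a + se2_b)
    df = (se2_a + se2_b) ** 2 / (
        se2_a ** 2 / (n_a - 1) + se2_b ** 2 / (n_b - 1)
    )
    return t_stat, df


# Hypothetical bootstrapped R^2 values for one nutrient at two sites.
r2_site_a = [0.90, 0.92, 0.88, 0.91, 0.93]
r2_site_b = [0.60, 0.55, 0.65, 0.58, 0.62]
t_stat, df = welch_t(r2_site_a, r2_site_b)
print(f"t = {t_stat:.2f}, df = {df:.1f}")
```

A *p*-value would additionally require the *t* distribution; if SciPy is available, `scipy.stats.ttest_ind(a, b, equal_var=False)` performs the same Welch comparison and returns one.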

In addition to significant differences in the model fits, the PLS waveband selection procedure also indicated considerable differences in the number and position of important wavebands for nutrient prediction (Figure 6). At both the plant and grain levels, Jaccard indices were low (<0.2), with three instances having no overlapping bands (Table 5). For instance, nine wavebands were selected as important for predicting plant-level Ca in the US (SG): one at 760 nm and eight from 1020 to 1060 nm (Figure 6). In contrast, the important bands for predicting Ca from the ET samples (SG) comprised fifteen bands positioned from 1045 to 1115 nm. The Jaccard index for this pair was 0.069 (Table 5), indicating minimal overlap in band positioning.
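For reference, the Jaccard index of two band sets is the size of their intersection divided by the size of their union. The sketch below shows the calculation; the function name `jaccard_index` and the band centres are hypothetical and chosen only for illustration, so they do not reproduce the values in Table 5.

```python
def jaccard_index(bands_a, bands_b):
    """Jaccard similarity |A ∩ B| / |A ∪ B| between two band selections."""
    set_a, set_b = set(bands_a), set(bands_b)
    union = set_a | set_b
    if not union:
        # Assumed convention: two empty selections share nothing.
        return 0.0
    return len(set_a & set_b) / len(union)


# Hypothetical selected band centres (nm) at two sites.
site_a = [760, 1020, 1030, 1040, 1050]
site_b = [1050, 1060, 1070, 1080]

# One shared band (1050 nm) out of eight distinct bands overall.
print(jaccard_index(site_a, site_b))  # 0.125
```

An index of 0 corresponds to no shared bands and an index of 1 to identical selections, so values below 0.2 indicate that most selected bands are unique to one site.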


**Table 5.** Jaccard index for similarity between bands selected by partial least squares (PLS) regression for the US and Ethiopia (ET) datasets for the various models.

In summary, the results from this study indicate that PLS regression, using either the full model or the waveband selection process, did not generate replicable findings across the two study areas. We found statistically significant differences in model fits for 11 of the 12 comparisons; only the model for grain Mg using SG filtering (Table 4) was not statistically different. Even when model fits are not comparable, the wavebands identified as important for prediction could still be similar. However, the Jaccard index (Table 5) and plots of the important wavebands for each model (Figure 6) indicate very little overlap in the important wavebands between the US and ET samples, further weakening the case for replicability across these two study sites.
