*2.4. Nutrient Analysis*

Since tef serves as both a food (grain) and forage (plant) crop, nutrient analyses were performed on both the plant and the grain. All samples were analyzed for calcium (Ca), magnesium (Mg), and protein content (Table 1). For details on the Ca and Mg laboratory procedures, readers are directed to the Soil Science Society of America [42] plant analysis guidelines; for details on protein analysis, readers are directed to the National Forage Testing Association's (NFTA) Forage Analysis Procedures [37]. For the US samples, nutrient analyses were performed at the Oklahoma State University Soil, Water, and Forage Analytical Laboratory. For the ET samples, analyses were performed by the Ethiopian Public Health Institute in Addis Ababa. These labs were chosen for their rigor and for their location, as samples could not be transported across national borders. Moreover, both institutions followed the same established standards [37,42], with the aim of minimizing the effect of using two laboratories on the results. Ca and Mg values are expressed in ppm (mg/kg), while protein values are expressed as a percentage (%) of total sample weight. All values are reported on a dry matter basis. These nutrient data from the plant material and grain ultimately serve as the dependent variables in the PLS regression analyses (discussed below).



## **3. Analytical Methods**

Partial least squares (PLS) regression was implemented to assess the relationship between reflectance (independent variable) and nutrient content (dependent variable) of both the plant and grain. PLS, which can also stand for projection to latent structures [43], was selected over other forms of regression because it mitigates the overfitting that is common when analyzing hyperspectral data [13,44]. Briefly, PLS regression finds a set of components (called latent factors) from *X*, a matrix of predictors collected on the observations, that best predict *Y*, a matrix of dependent variables measured on the same observations [43]. These latent factors, or latent vectors, are mutually orthogonal and are chosen to explain as much of the covariance between *X* and *Y* as possible, often resulting in fewer components than principal component regression requires. PLS regression extracts *X*-scores from the latent factors to construct a model that predicts the *Y*-scores. In PLS, the *X*- and *Y*-scores are subject to redundancy analysis that seeks directionality in factor space until the most accurate prediction is found [25,45]. When implementing PLS regression with hyperspectral data, it is important to ensure that the number of latent variables does not far exceed the number of independent variables being used, because overfitting can otherwise occur [13].
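
To make the extraction of latent factors concrete, the following is a minimal, dependency-free sketch of single-response PLS (often called PLS1) using NIPALS-style deflation. It is illustrative only and is not the implementation used in this study: the function names (`pls1_fit`, `pls1_predict`), the three toy "bands", and all numeric values are assumptions made up for the example, and the response is constructed as an exact linear function of the predictors so that the behavior is easy to check.

```python
from math import sqrt

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def center(rows):
    """Column-center a list of rows; return (centered rows, column means)."""
    n = len(rows)
    means = [sum(col) / n for col in zip(*rows)]
    return [[x - m for x, m in zip(r, means)] for r in rows], means

def pls1_fit(X, y, n_components):
    """Fit single-response PLS by successive extraction and deflation."""
    Xc, x_mean = center(X)
    y_mean = sum(y) / len(y)
    yc = [v - y_mean for v in y]
    W, P, Q, T = [], [], [], []
    for _ in range(n_components):
        # Weight vector: the direction in X with maximal covariance with y.
        w = [dot(col, yc) for col in zip(*Xc)]
        norm = sqrt(dot(w, w))
        if norm < 1e-12:  # no covariance left to explain
            break
        w = [v / norm for v in w]
        t = [dot(row, w) for row in Xc]             # X-scores (latent factor)
        tt = dot(t, t)
        p = [dot(col, t) / tt for col in zip(*Xc)]  # X-loadings
        q = dot(yc, t) / tt                         # y-loading
        # Deflate X and y so the next factor is orthogonal to this one.
        Xc = [[x - ti * pj for x, pj in zip(row, p)] for row, ti in zip(Xc, t)]
        yc = [yi - q * ti for yi, ti in zip(yc, t)]
        W.append(w); P.append(p); Q.append(q); T.append(t)
    return {"x_mean": x_mean, "y_mean": y_mean, "W": W, "P": P, "Q": Q, "T": T}

def pls1_predict(model, x):
    """Predict y for one observation by replaying the deflation steps."""
    r = [xi - m for xi, m in zip(x, model["x_mean"])]
    y_hat = model["y_mean"]
    for w, p, q in zip(model["W"], model["P"], model["Q"]):
        t = dot(r, w)
        r = [ri - t * pj for ri, pj in zip(r, p)]
        y_hat += q * t
    return y_hat

# Synthetic example (NOT study data): 6 observations, 3 hypothetical bands.
X = [[1, 2, 0], [2, 1, 1], [3, 4, 1], [4, 3, 3], [5, 6, 2], [6, 5, 5]]
y = [2 * a - b + 0.5 * c + 1 for a, b, c in X]  # exact linear response

model = pls1_fit(X, y, n_components=3)
predictions = [pls1_predict(model, row) for row in X]
```

In this toy setting, using as many factors as there are predictors reproduces the noise-free response, while fewer factors give a more constrained fit. With real hyperspectral data, the number of factors would normally be chosen by cross-validation rather than fixed in advance.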
