2.2.3. Model Calibration and Validation

All possible band combinations of the MDATT and DLARI were computed from the adaxial, abaxial, and bifacial (i.e., including both adaxial and abaxial reflectance measurements) datasets, and each combination was correlated with LCC. For each dataset, the waveband combinations with the highest coefficient of determination (*R*<sup>2</sup>) were selected as the optimal indices and used to model LCC. Relationships between the measured LCC and the indices were established using empirical regression analysis. The form of the fitting function (e.g., linear, exponential, logarithmic) relating the indices to LCC appeared to have only a marginal impact compared with that of band selection [24]. Therefore, we restricted the fitting method to ordinary least-squares linear regression.
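The band-selection step can be illustrated with a minimal sketch, assuming reflectance is stored as a samples-by-bands array. The names `best_band_combination`, `r_squared`, and `placeholder_index` are hypothetical, and `placeholder_index` is a generic two-band normalized difference used only as a stand-in: it is not the MDATT or the DLARI, whose formulas would be passed in as `index_fn` with the appropriate number of bands.

```python
import itertools

import numpy as np


def r_squared(x, y):
    """R^2 of an ordinary least-squares linear fit of y on x.

    For simple linear regression this equals the squared Pearson correlation.
    """
    return np.corrcoef(x, y)[0, 1] ** 2


def best_band_combination(reflectance, lcc, index_fn, n_bands):
    """Score every ordered band combination and return the best (combo, R^2).

    reflectance: array of shape (n_samples, n_wavebands)
    lcc:         array of shape (n_samples,)
    index_fn:    callable taking n_bands reflectance vectors (hypothetical)
    """
    best_combo, best_score = None, -np.inf
    for combo in itertools.permutations(range(reflectance.shape[1]), n_bands):
        values = index_fn(*(reflectance[:, b] for b in combo))
        # Skip combinations that produce undefined or constant index values.
        if not np.all(np.isfinite(values)) or np.std(values) == 0:
            continue
        score = r_squared(values, lcc)
        if score > best_score:
            best_combo, best_score = combo, score
    return best_combo, best_score


def placeholder_index(r1, r2):
    # Assumption: illustrative stand-in only, NOT the MDATT or DLARI.
    return (r1 - r2) / (r1 + r2)
```

An exhaustive search grows quickly with the number of wavebands and the number of bands per index, so in practice the loop would likely be vectorized or restricted to a coarser spectral sampling.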

The performance of the index-based models was evaluated using the *R*<sup>2</sup> and root mean square error (RMSE) with respect to the biochemically measured LCC. To avoid dependence on a single random partitioning of the datasets and to ensure that all samples were used for both training and validation, a repeated 10-fold cross-validation was used to evaluate the performance of each index [40]. The dataset was split into 10 consecutive folds, and each fold was then used once for validation while the remaining 9 folds formed the training dataset. This process was repeated 50 times, and the combined *R*<sup>2</sup><sub>cv</sub> and RMSE<sub>cv</sub> values were calculated as the means of the values from each repetition.
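The repeated cross-validation procedure can be sketched as follows. This is an illustration under stated assumptions rather than the exact implementation used here: the samples are reshuffled before each repetition so that the 50 partitions differ, the out-of-fold predictions of one repetition are pooled before computing its *R*<sup>2</sup> and RMSE, and *R*<sup>2</sup> is taken as the squared correlation between measured and predicted LCC. The function name `repeated_kfold_cv` and its parameters are hypothetical.

```python
import numpy as np


def repeated_kfold_cv(index_values, lcc, n_folds=10, n_repeats=50, seed=0):
    """Return mean cross-validated (R^2, RMSE) of a linear index-to-LCC model."""
    x = np.asarray(index_values, dtype=float)
    y = np.asarray(lcc, dtype=float)
    rng = np.random.default_rng(seed)
    r2_values, rmse_values = [], []

    for _ in range(n_repeats):
        # Assumption: shuffle once per repetition, then cut into consecutive folds.
        order = rng.permutation(len(y))
        predicted = np.empty_like(y)
        for held_out in np.array_split(order, n_folds):
            train = np.setdiff1d(order, held_out)
            # Ordinary least-squares linear fit on the 9 training folds.
            slope, intercept = np.polyfit(x[train], y[train], 1)
            predicted[held_out] = slope * x[held_out] + intercept

        # Assumption: metrics are computed on the pooled out-of-fold predictions.
        rmse_values.append(np.sqrt(np.mean((y - predicted) ** 2)))
        r2_values.append(np.corrcoef(y, predicted)[0, 1] ** 2)

    # Combined values are the means over all repetitions.
    return float(np.mean(r2_values)), float(np.mean(rmse_values))
```

Calling `repeated_kfold_cv(index_values, lcc)` with the optimal index of a dataset would return one pair of cross-validated statistics for that index.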

### **3. Results**
