*2.1. Dataset*

Serum samples were collected from 707 early lactation cows (<30 d in milk) from 13 farms located in southeastern Australia. Descriptive statistics of the animals included in the experiment, including herd of origin, stage of lactation (reported as days in milk, or the number of days post-calving), parity and serum BHBA concentrations and are summarized in Table 1. Of particular interest were the BHBA results obtained from Farm 1, which had a greater mean and standard deviation than observed in other farms.


**Table 1.** Descriptive statistics of dataset used in this experiment, including farm details, number of cows (N), mean and standard deviation (shown in parentheses) of parity, days in milk (DIM), and serum β hydroxybutyrate (BHBA) concentrations obtained from dairy cows in the first 30 days of lactation from 13 farms in south eastern Australia.

1 South Gippsland Region, 2 West Gippsland Region, 3 Macalister Irrigation District, 4 Goulburn Valley Region, 5 Tasmania.

#### *2.2. 1H NMR Spectroscopy of Serum Samples*

1H NMR spectra were complex; however, more than 20 metabolites could be identified. Spectra were dominated by organic acids, amino acids, glucose and phospholipid intermediates (Figure S1 and Table S1).

#### *2.3. Preliminary Data Analysis Using Principal Component Analysis*

Preliminary data analysis and outlier identification was performed using principal component analysis (PCA). Plots of the first 2 principal components (PCs) identified several samples located outside the 95% confidence level. These spectra were manually inspected, and a single outlier with erroneous phasing was identified and removed from subsequent analyses.

PCA was repeated after outlier removal. The first 13 PCs explained greater than 90% of the variation in the spectra. Scores plots of the first three PCs, which explain 47.64%, 15.59%, and 7.45% of variation, respectively, are shown in Figure 1a–c. There was obvious clustering of samples by herd of origin. Samples from Farm 1 showed greater variation than those from the other farms. The separation between farm clusters was most obvious along PC1 and PC2. Visual comparisons based on stage of lactation (defined as weeks in milk (WIM)) and parity were also performed, but no obvious clustering or separation was observed. Loadings plots of the first three PCs show that energy metabolites BHBA, lactate, acetate and glucose, have the largest influences on spectral di fferences (Figure 1d–f), with smaller influences from the branched chain amino acids, lipoproteins, glycine, creatine, and betaine.

**Figure 1.** Results of principal component analysis (PCA) of 707 proton nuclear magnetic resonance ( 1H NMR) spectra of serum obtained from dairy cows in early lactation; (**a**) principal component (PC) 1 vs. PC 2 scores, (**b**) PC 1 vs. PC 3 scores, (**c**) PC 2 vs. PC 3 scores, (**d**) PC 1 loadings, (**e**) PC 2 loadings, and (**f**) PC 3 loadings plots. Scores plots are colored by farm of origin. The δ 6.5 to 8.5 region of loadings plots have been magnified for clarity purposes. α-Glu = α glucose, Ace = acetate, Ala = alanine, β-Glu = β glucose, Bet = betaine, BHBA = β hydroxybutyrate, Cr = creatine, Glu = glucose, Gly = glycine, Hip = hippurate, Ile = isoleucine, Lac = lactate, Leu = leucine, Val = valine, VLDL/LDL = Very low density lipoprotein and low density lipoprotein.

#### *2.4. Principal Component Analysis of Spectra Corrected for Fixed E*ff*ects*

Principal component analysis (PCA) was repeated on spectra that had been corrected for (1) WIM, (2) Parity, (3) Herd, and (4) WIM, Parity and Herd simultaneously (hereafter referred to as all fixed effects) (Models 1 to 4). Results derived from spectra corrected separately for WIM and Parity are nearly identical to uncorrected spectra (Figures S2 and S3). By contrast, scores plots derived from PCA of spectra corrected for Herd (Figure S4), and spectra corrected for all fixed e ffects (Figure 2a–c), show no obvious clustering of samples by Herd, WIM or Parity. There is, however, still considerable

separation of samples along all three PC axes, suggesting that significant inter-animal variation in the serum metabolome exists after accounting for fixed effects. Compared to the uncorrected data; (1) more PCs were required to explain >90% of spectral variation (24 vs. 13), (2) the percentage of variation captured by PC1 was lower (25.70% vs. 47.64%), and (3) the percentage of variation captured by PC2 and PC3 was higher (16.92% vs. 15.59% and 11.79% vs. 7.45%, respectively). Loadings plots are shown in Figure 2d–f. Interestingly, separation of samples along PC1 (25.70%) is due almost entirely to lactate. Loadings on PC2 and PC3 are similar to uncorrected data.

**Figure 2.** Results of PCA of 707 1H NMR spectra of serum, corrected for herd of origin, week of lactation, and parity obtained from dairy cows in early lactation; (**a**) PC 1 vs. PC 2 scores, (**b**) PC 1 vs. PC 3 scores, (**c**) PC 2 vs. PC 3 scores, (**d**) PC 1 loadings, (**e**) PC 2 loadings, and (**f**) PC 3 loadings plots. Scores plots are colored by farm of origin. The δ 6.5 to 8.5 region of loadings plots have been magnified for clarity purposes. α-Glu = α glucose, Ace = acetate, Ala = alanine, β-Glu = β glucose, Bet = betaine, BHBA = β hydroxybutyrate, Cr = creatine, Glu = glucose, Gly = glycine, Ile = isoleucine, Lac = lactate, Leu = leucine, Val = valine, VLDL/LDL = Very low density lipoprotein and low density lipoprotein.

#### *2.5. E*ff*ect of Stage of Lactation, Parity, and Herd E*ff*ects on 1H NMR Spectra*

In order to quantify the effect of each fixed effect on NMR spectra, we calculated Pearson's correlations between scores for the first three PCs from the previously described PCAs (Figure 3). The largest differences (i.e., lowest correlations) were seen between uncorrected spectra, and spectra corrected for Herd (r = 0.43). This suggests that there are significant differences between those 2 spectral datasets, and that Herd, therefore, has a significant effect on the serum NMR metabolome. This is consistent with the clustering of samples by farm in the original PCA (Figure 1a–c). By comparison, the correlations between PC scores derived from uncorrected spectra, and spectra corrected for WIM and Parity, were high (0.99 and 0.97, respectively). This suggests that these spectra are nearly identical, and that these fixed effects have minimal influence on the serum metabolome. Correlations between PC2 scores were consistent with those observed between PC1 scores, and correlations between PC3 scores were all high (≥0.89).

**Figure 3.** Pearson's correlations between scores derived from PCA of uncorrected 1H NMR spectra of bovine serum, and the same spectra corrected using linear regression for week of lactation (WIM), parity, herd of origin, and WIM, parity, and herd simultaneously (All). Color map shows strength of Pearson's correlation.

To test the statistical significance of fixed effects on 1H NMR spectra, we used conditional Wald F statistics derived from multiple linear regression models on the first three PC scores (Model 5). The higher the F statistic, the greater the effect of that variable on the PC score, and the lower the P value, the greater the statistical significance. Results derived from these models are summarized in Table 2. PC1 results were consistent with the results of Pearson's correlations, showing that Herd had the greatest effect. Interestingly, results for PC2 and PC3 differed slightly from Pearson's correlations. While Herd had a relatively large and significant (P < 0.001) impact on both PC2 and PC3, the effect of Parity was nearly as grea<sup>t</sup> on PC2 scores and greater on PC3 scores.

**Table 2.** Results of multiple linear regression models of principal component (PC) scores derived from PCA of 1H NMR spectra, against weeks in milk (WIM), parity, and herd of origin. Conditional Wald F statistics (F-con) and corresponding *P* values describe the magnitude and statistical significance of each fixed effect, respectively.


R<sup>2</sup> values obtained from models 1–4 were used to investigate which regions of the NMR spectra were most strongly influenced by the fixed effects. As the signal intensity at each chemical shift was treated as a separate response variable, the R<sup>2</sup> values from Models 1, 2, and 3 describe the effect of WIM, parity, and herd on each of the 24,349 chemical shifts, respectively. These R<sup>2</sup> values were color-coded, and overlaid on an average NMR spectrum. Plots showing the effects of WIM and Parity were unremarkable (all R<sup>2</sup> < 0.2, Figure S5), however R<sup>2</sup> values obtained from Model 2 showed that approximately 10–20% of the variation in glucose and acetate concentration could be explained by parity. The plot showing the effect of herd is shown in Figure 4. The strongest effect was seen in the downfield region of the spectrum, with close to 80% of variation in the concentration of some phenolic compounds being explained by Herd. Of these, hippurate could be clearly identified. Peaks at δ 7.31 and 7.39 were tentatively assigned to 3-phenyllactate, but the peak at δ 7.22 could not be identified. Lactate, acetate, BHBA, betaine, pyruvate, glycine, and glucose concentrations were also strongly influenced by herd effect.

**Figure 4.** Average 1H NMR spectrum of bovine serum. Color coding represents the percentage of variation in the signal at each chemical shift intensity that can be explained by herd of origin. The δ 6.5 to 8.5 region has been magnified for clarity purposes. Ace = acetate, Bet = betaine, BHBA = β hydroxybutyrate, Gly = glycine, Hip = hippurate, Lac = lactate, Pla = 3-phenyllactate, Pyr = pyruvate, U = unidentified peak. \* indicates tentative identification.

The results of ANOVA-simultaneous component analysis (ASCA) were consistent with results of linear regression spectral correction and are shown in Table S2. Herd had the greatest effect (43.99, P = 0.02), followed by parity (4.10, P = 0.02) and WIM (1.37, P = 0.02). When ASCA was performed on corrected spectra, the effect of the fixed effect(s) was reduced to zero. For example, when ASCA was performed on spectra corrected for Herd, the effect of herd was zero (P = 1.00), but the effects of WIM (1.68, P = 0.02) and parity (3.30, P = 0.02) were retained.

#### *2.6. Robustness of 1H NMR Predictions of Serum BHBA*

Our results show that 1H NMR spectra can be used to predict serum BHBA concentration with good accuracy. This result is expected, as BHBA is directly quantifiable from NMR spectra. The overall robustness of our approach was assessed using a "leave-one-farm-out" external validation of OPLS models built using uncorrected data. This involved iteratively setting aside data from one farm, training models using data from the remaining 12 farms, then using the withheld data for external validation. R<sup>2</sup> results were variable (0.30 ≤ R<sup>2</sup> ≤ 0.99), however RMSE values remained relatively low (≤ 0.18) (Table 3). Interestingly, RMSE values were considerably higher when Farm 1 data were withheld for validation.

**Table 3.** Results of leave-one-farm out external validation of orthogonal partial least squares (OPLS) regression models predicting serum BHBA concentration from uncorrected 1H NMR spectra. Validation farm specifies the identity of the data used for validation, N the number of animals used in calibration and validation datasets. The coefficient of determination (R2) and root mean square error (RMSE) are reported for each calibration/validation subset.


#### *2.7. Influence of Fixed E*ff*ects on Interpretation of 1H NMR Metabolomic Data*

The impact of fixed effects on the interpretation of 1H NMR metabolomic data was determined by comparing the results of OPLS models built using (1) data from Farm 1 only (used as a control), (2) uncorrected data from all farms, and (3) data from all farms corrected for all fixed effects. Fixed effects appeared to have minimal effect on the predictive ability of models. We observed similar 10-fold cross validation prediction accuracies for all 3 datasets (Table 4). Interestingly, RMSE results were quite close to the results of the leave-one-farm out external validation (0.05 ≤ RMSE ≤ 0.18).

**Table 4.** Results obtained from 10-fold cross validation of OPLS regression models predicting serum BHBA concentration from 1H NMR spectra using data from Farm 1 only, uncorrected data from all farms, and data from all farms corrected for the effect of herd. Number of cows (N), number of latent variables included in each mode (LV), coefficient of determination (R2) and root mean square error (RMSE) of calibration (C) and 10-fold cross validation (CV) are shown.


1 P-value derived from permutation testing (50 iterations) and pairwise Wilcoxon signed rank test.

The influences of fixed effects on biomarker discovery were investigated by comparing loadings on LV1. Results obtained using only Farm 1 data were used as a reference and show a strong positive correlation between BHBA concentration and acetate, and strong negative correlations with lactate and glucose (Figure 5a,b). Loadings from the complete dataset corrected for all fixed effects were very similar (Figure 5e,f). Results from uncorrected data, however, were quite different (Figure 5c,d), with BHBA being positively correlated with lactate and glycine. Examination of scores plots shows obvious clustering and separation by herd (especially Farm 1) when uncorrected data are used (Figure S6a), but not when corrected data are used (Figure S6b). Results from the original PCA showed that samples from Farm 1 clustered at the positive end of PC1, and that lactate and glycine both had

strong positive influences on PC1 loadings. Therefore, it is possible that OPLS results are confounded by a strong herd effect when uncorrected data are used.

**Figure 5.** Results of OPLS regressions of serum BHBA concentration against 1H NMR spectrum of bovine serum: (**a**) Farm 1 (N = 179) LV1 vs. LV2 scores and (**b**) LV1 loadings, (**c**) all farms (N = 707) uncorrected data LV1 vs. LV2 scores and (**d**) LV1 loadings, and (**e**) all farms data LV1 vs. LV2 scores and (**d**) LV1 loadings.
