**3. Discussion**

To the best of the authors' knowledge, this is the first large-scale serum metabolomics study to investigate the impact of systematic environmental and physiological fixed effects on the 1H NMR serum metabolome of clinically healthy dairy cattle. Our results indicate that herd-specific environmental factors have much greater effects on the serum metabolome of early lactation dairy cows than physiological factors such as WIM and parity. We demonstrate that, while confounding from herd effects can significantly influence the results of biomarker discovery, models built using data collected from multiple farms can give robust predictions of external phenotypes such as BHBA. In order to overcome the potential confounding of fixed effects on biomarker discovery, we propose a method to correct 1H NMR spectra prior to multivariate analysis using multiple linear regression.

#### *3.1. Di*ff*erences in 1H NMR Spectra Between Herds*

Our results clearly demonstrate that there are significant differences in the serum metabolomes of animals from different herds. The fact that energy metabolites BHBA, lactate, acetate and glucose dominated PCA loadings (Figure 2d–f), and that herd e ffect accounted for a large percentage of the variation seen in lactate, acetate, pyruvate, glucose, and BHBA concentrations, (Figure 4) suggests that metabolic state, in particular energy balance, varied significantly between farms.

The importance of lactate was particularly interesting. Lactate was one of the most abundant metabolites identified in this experiment. This is very di fferent to the findings of Sun et al. [8], who reported that lactate was one of the weakest signals in serum 1H NMR spectra obtained from early-lactation cows fed a total mixed ration. One possible explanation for the very high concentrations of lactate seen in our dataset could be ruminal lactate production. During spring, dairy cows in pastoral farming systems of southeastern Australia are typically fed rations high in fermentable carbohydrate, and low in neutral detergent fiber. As a consequence, ruminal acidosis is common [24]. Serum concentrations of lactate, and in particular D-lactate from microbial fermentation, have been shown to increase following experimental induction of ruminal acidosis [25,26]. Without the use of a shift reagen<sup>t</sup> and specialized experiments it is not possible to di fferentiate between the di fferent lactate isomers by 1H NMR [27]. We therefore plan to quantify the relative contributions of L- and D- lactate to better understand the cause of high lactate concentrations in our dataset.

The strong influence of Herd on the concentration of phenolic compounds could also be consistent with ruminal acidosis. Signal intensities in the downfield region of 2D spectra were weak, meaning clear identification of some of the phenolic peaks in our dataset was not possible. Our tentative identification of 3-phenyllactate is consistent with the findings of Yang et al. [26], who demonstrated that beef steers fed high starch (corn) diets had higher plasma concentrations of phenyllactate compared to those fed low starch diets. This study also identified L-phenyllalanine biosynthesis and metabolism as important metabolic pathways in high starch feeding. We plan to (1) enrich samples and repeat 2D analyses and (2) perform LCMS-based metabolomics on a subset of samples to identify these compounds.

Nearly 80% of the variation seen in hippurate concentration could be explained by herd e ffect (Figure 4). Hippurate is formed by the conjugation of glycine and benzoic acid, and has been associated with microbial degradation of dietary compounds [28]. Concentrations of hippurate increase with increased consumption of phenolic compounds [13], which are present in relatively high concentrations in pasture species. Milk hippurate concentration has been proposed as a biomarker of pasture/forage intake in goats [29], and it is possible that our results represent di fferences in feeding regimens between farms. Hippurate has also been proposed as a biomarker for gu<sup>t</sup> microbiome diversity in humans [30], and our results may indicate di fferences in the gastrointestinal health of animals from di fferent farms (i.e., ruminal acidosis). Detailed information of ration formulations is very di fficult to define in grazing systems as pasture quality and intake vary considerably within and between herds. This information was therefore not available for the herds in our dataset and more data are required to further investigate this finding.

Results of the initial PCA showed that data from Farm 1 were significantly di fferent to, and showed more variation than, data from the other farms. The reasons for these di fferences are hard to determine from our dataset, as Farm 1 di ffered in environment/management, breed and reference BHBA concentrations (and therefore it is assumed animal metabolic status). Given that we also observed clustering and separation of the 12 Holstein-Friesian herds in the initial PCA (Figure 2), it appears that herd-specific environmental factors have a larger e ffect on the serum metabolome than breed. However, Liao et al. [31] recently reported clear di fferences in the serum metabolomes (GC-MS) of three di fferent breeds of beef steers, all the same age, fed the same ration, and managed under the same conditions. Further data are therefore required to investigate if there are di fferences between the serum metabolomes of di fferent dairy breeds.

Pre-analytical sample handling and processing have been shown to have significant e ffects on human metabolomic data [32], and considerable e fforts are made to streamline and standardize sample collection and processing protocols [33–35]. Standardizing protocols in livestock studies provides its own challenges, when relatively large number of samples are being taken at once, often in diverse, challenging and remote locations. While all attempts were made to ensure consistency, there were

some unavoidable differences in the way samples from different herds were handled (for example time, between blood sample collection and centrifugation varied from approximately 2–4 h). It is therefore possible that some of the variation between farms seen in our data could be due to pre-analytical sample handling. However, overall our results sugges<sup>t</sup> that metabolomic differences between animals from different farms are due largely to differences in diet/nutritional management. We plan to collect more samples from animals receiving different diets to investigate this further.

#### *3.2. E*ff*ect of Lactation Stage and Parity on Serum Metabolome*

Our results sugges<sup>t</sup> that stage of lactation appeared to have a minimal effect on the NMR spectra. This is consistent with the findings of Ilves et al. [17] who found that the mass spectrometry (MS) based plasma metabolome of dairy cows was more heavily influenced by animal individuality than by lactation stage. By contrast, several authors report that both the NMR and MS-derived milk metabolome changes across lactation [17,36]. This suggests that blood-based metabolomics may be more suitable for identification of individual animal-specific differences within a population, and therefore provide more robust metabotypes for genetic selection.

Parity appeared to have a small but significant (P < 0.05) effect on the overall 1H NMR serum metabolome. We could find no other reports in the literature describing the effect of parity on the entire serum metabolome. However, our results are consistent with other studies that showed parity has a significant effect on the concentration of several metabolites in serum including glucose, creatinine, urea and BHBA [37–39]. This suggests that parity should be taken into consideration when undertaking metabolomic studies in dairy cows.

#### *3.3. Accuracy of OPLS Models for Predicting Serum BHBA Concentration*

Despite the significant influence of fixed effects on the serum metabolome, results obtained from the leave-one-farm out external validation sugges<sup>t</sup> that prediction models constructed with data from multiple farms are quite robust. R<sup>2</sup> values varied significantly depending on which farm was used for validation (0.30 ≤ R<sup>2</sup> ≤ 0.99); however, the R<sup>2</sup> is known to be affected by the range of the dataset, and RMSE is often considered to be a better predictor of model performance [40]. Promisingly, external validation RMSE results (0.05 ≤ R<sup>2</sup> ≤ 0.18) were close to those obtained from 10-fold cross validation of models built using only Farm 1 data (RMSE = 0.12) and all data (RMSE = 0.10). The fact that prediction errors were highest when Farm 1 data were withheld for validation suggests that the increased variation observed in Farm 1 data represents valuable biological variation rather than confounding/noise.

Correcting data for fixed effects had very little impact on the predictive ability of OPLS models. Furthermore, when corrected spectra were used, y-values also had to be corrected, making interpretation of phenotypic values difficult. Interestingly, Wanichthanarak, et al. [20] found that "readjusting" mass spectroscopy metabolite signals using patient metadata and linear mixed models improved the sensitivity and specificity of classification of human tissue samples with and without colorectal cancer. Conversely, Posma et al. [41] found that adjusting NMR data for confounding factors lead to a loss of predictive power for cardiovascular risk in a large-scale human NMR metabolomic dataset. Whether using NMR spectra corrected using linear regression will improve the performance of classification models (as opposed to regression against a continuous variable as used in this study) requires further investigation. Overall, our results sugges<sup>t</sup> that models constructed using uncorrected data collated from multiple farms may be appropriate for prediction of external phenotypes which are influenced by both genetic and environmental factors.

#### *3.4. Impact of Fixed E*ff*ects on the Interpretation of Metabolomic Data for Biomarker and Metabotype Discovery*

Loadings from OPLS models built using uncorrected spectra from Farm 1, and spectra from all farms corrected for fixed effects, were consistent with the literature. BHBA and glucose concentrations have been shown to be negatively correlated in the serum of cows in early lactation dairy cows [42]. L-lactate is an important gluconeogenic substrate in dairy cows [43,44], so it follows that lactate

concentration is also negatively correlated with BHBA concentration. Our results are also consistent with the findings of Sun et al. [8] who showed that cows with subclinical (1.2 < BHBA < 2.9 mmol/L) and clinical ketosis (BHBA > 2.9 mmol/L) had lower lactate and glucose concentrations and higher BHBA and acetate concentrations than the healthy controls.

The fact that loadings were di fferent when uncorrected spectra from all farms were used demonstrates that herd-specific environmental e ffects can influence the results of biomarker discovery. How significant this is ultimately depends on the research question being asked. If the study aim is to identify biomarkers of external phenotypes (i.e., biomarkers that represent both genetic and environmental factors which are used for managemen<sup>t</sup> purposes such as disease prediction), then the impact of environmental e ffects is important and must be captured. However, if the aim is to identify biomarkers indicative of inter-animal di fferences free of environmental confounding, or to understand biological processes, our results sugges<sup>t</sup> that the influence of environmental e ffects could lead to erroneous results. This is consistent with the findings of Posma et al. [41] who showed that di fferences in fixed e ffects between subjects from the north and south of China explained some metabolite associations, which had previously been attributed to cardiovascular disease risk. This study also reported that adjusting metabolomic data for confounding using an algorithm called Covariate-Adjusted Projection to Latent Structures (CA-PLS) improved model interpretability and led to the identification of more robust biomarkers. Our results are also consistent with other studies that have explored the impacts of data pretreatments on the interpretation of metabolomics data. For example, van den Berg et al. [45] showed that pretreatment methods such as scaling, centering and transformations can greatly a ffect the outcome of metabolomic analyses (including the biological ranking of important metabolites) and have the potential to enhance biological interpretability. Similarly, Emwas et al. [46] concluded that the choice of spectral processing and post-processing depended on many factors including the aim of the experiment and the quality of data.

We believe that our approach has particular application in animal breeding, where the aim is to understand the biological processes that underpin economically important traits [47] and to identify metabotypes that represent inter-animal variation independent of confounding from systematic environment e ffects. Even with the advent of genomic selection, livestock genetic studies require relatively large numbers of animals to ensure there is adequate genetic variation in the study population [48,49]. The same is likely to be true for metabotype discovery studies. Such large datasets can be hard to compile, especially when the trait of interest is di fficult and/or expensive to measure. As well as collecting data from multiple farms, another potential solution is data sharing through international collaboration. This is routinely done by geneticists; for example, de Haas et al. [50] used data from Holstein cattle in Europe, North America and Australasia to improve genomic prediction accuracies for feed intake. The ability to correct metabolomic data for factors such as experimental batch, diet, herd, year and season should allow similar collaborations in metabotype studies.

#### **4. Materials and Methods**

All procedures undertaken in this study were conducted in accordance with the Australian Code of Practice for the Care and Use of Animals for Scientific Purposes (National Health and Medical Research Council, 2013). Approval to proceed was granted by the Agricultural Research and Extension Animal Ethics Committee of the Department of Jobs, Precincts and Resources Animal Ethics Committee (DJPR, 475 Mickleham Road, Attwood, Victoria 3049, Australia), and the Tasmanian Department of Primary Industries, Parks, Water and Environment (DPIPWE Animal Biosecurity and Welfare Branch, 13 St Johns Avenue, New Town, Tasmania 7008, Australia). AEC project approval codes 2017-05 and 2018-07.
