Article

Genomewide Association Analyses of Lactation Persistency and Milk Production Traits in Holstein Cattle Based on Imputed Whole-Genome Sequence Data

1
Department of Animal Sciences, Purdue University, West Lafayette, IN 47907, USA
2
Department of Animal Sciences, State University of Ponta Grossa, Ponta Grossa 84030-900, Brazil
3
Centre for Genetic Improvement of Livestock, Department of Animal Biosciences, University of Guelph, Guelph, ON N1G2W1, Canada
4
Farm Animal Genetic Resources Exploration and Innovation Key Laboratory of Sichuan Province, College of Animal Science & Technology, Sichuan Agricultural University, Chengdu 611130, China
5
Department of Animal and Food Science, University of Wisconsin River Falls, River Falls, WI 54022, USA
*
Author to whom correspondence should be addressed.
Genes 2021, 12(11), 1830; https://doi.org/10.3390/genes12111830
Submission received: 30 October 2021 / Revised: 13 November 2021 / Accepted: 17 November 2021 / Published: 19 November 2021
(This article belongs to the Special Issue Genome-Wide Association Analysis of Cattle)

Abstract

Lactation persistency and milk production are among the most economically important traits in the dairy industry. In this study, we explored the association of over 6.1 million imputed whole-genome sequence variants with lactation persistency (LP), milk yield (MILK), fat yield (FAT), fat percentage (FAT%), protein yield (PROT), and protein percentage (PROT%) in North American Holstein cattle. We identified 49, 3991, 2607, 4459, 805, and 5519 SNPs significantly associated with LP, MILK, FAT, FAT%, PROT, and PROT%, respectively. Various known associations were confirmed while several novel candidate genes were also revealed, including ARHGAP35, NPAS1, TMEM160, ZC3H4, SAE1, ZMIZ1, PPIF, LDB2, ABI3, SERPINB6, and SERPINB9 for LP; NIM1K, ZNF131, GABRG1, GABRA2, DCHS1, and SPIDR for MILK; NR6A1, OLFML2A, EXT2, POLD1, GOT1, and ETV6 for FAT; DPP6, LRRC26, and the KCN gene family for FAT%; CDC14A, RTCA, HSTN, and ODAM for PROT; and HERC3, HERC5, LALBA, CCL28, and NEURL1 for PROT%. Most of these genes are involved in relevant gene ontology (GO) terms such as fatty acid homeostasis, transporter regulator activity, response to progesterone and estradiol, response to steroid hormones, and lactation. The significant genomic regions found contribute to a better understanding of the molecular mechanisms related to LP and milk production in North American Holstein cattle.

1. Introduction

Milk production and composition are the most intensively selected traits in dairy cattle breeding programs around the world due to their direct economic impact to the industry and close link with nutritional properties [1,2]. Additionally, lactation persistency (LP), defined as the ability of a cow to maintain milk production at a high level after reaching the milk production peak, greatly impacts the economic return of the dairy sector [3]. Different indicators of LP have been proposed over time [4,5,6,7], with heritability estimates ranging from 0.14 to 0.24 [8,9]. Heritability estimates in Holstein cattle for milk (MILK), fat (FAT), and protein (PROT) yields usually range from 0.24 to 0.52, and from 0.36 to 0.68 for fat percentage (FAT%) and protein percentage (PROT%) [10,11,12,13].
Identifying genomic regions and candidate genes related to milk production traits is crucial to better understand the biological mechanisms underlying their phenotypic expression and to optimize genomic evaluation of milk-related traits [14]. In this context, genomewide association studies (GWAS) have been extensively performed in recent years to find associations between genomic polymorphisms and economically important traits in dairy cattle populations [15,16,17]. The large majority of these studies have been performed based on medium-density genotyping arrays. However, the identification of variants associated with target traits is highly dependent on the extent of linkage disequilibrium (LD) between the SNPs and the causal variants. Thus, an alternative to increase the ability to identify key genomic regions is the use of whole-genome sequence (WGS) data. More accurate quantitative trait loci (QTL), causative mutations, and consequently, candidate genes, are expected to be identified based on WGS [18]. Generating WGS data for a large number of animals is still expensive, but a technique known as genotype imputation can be employed to impute missing markers from animals genotyped with medium- or high-density genotyping arrays to WGS with accuracies greater than 90–95% [19]. The potential of imputed WGS (iWGS) data to discover genetic variants in GWAS has been shown in previous studies of dairy cattle [20,21] and other species [22,23,24].
New genomic regions associated with milk production traits in dairy cattle have been recently reported based on iWGS studies. For instance, Teissier et al. [25] identified 493 QTL for MILK, FAT, PROT, FAT%, and PROT% in Holstein, Montbéliarde, and Normande cattle breeds, demonstrating that a large number of genomic regions influence milk production traits. Furthermore, meta-analysis studies have reported pleiotropic effects of genes related to milk production traits and other dairy traits such as mammary system conformation and milking temperament [26,27]. In this context, genomewide fine mapping has been well explored for milk yield and milk solids [28,29,30]. However, to the best of our knowledge, there are no reports of LP-related genes identified using WGS or iWGS.
LP reflects the cow’s ability to maintain milk production after the lactation peak and may be indicative of postcalving development of the mammary gland [31]. Improving LP can potentially increase cow health and welfare [9,32,33]. Despite the importance of LP to dairy cattle production, few studies have investigated its genomic background. Moreover, to the best of our knowledge, all previous studies used medium- to high-density SNP panels [34,35,36,37] instead of WGS or iWGS. Therefore, the main objectives of this study were to perform: (1) iWGS-based GWAS for LP, MILK, FAT, PROT, FAT%, and PROT%, aiming to identify key genomic regions and candidate genes influencing these traits in North American Holstein cattle; and (2) functional genomic analyses to better understand the biological pathways associated with LP and milk production traits.

2. Materials and Methods

2.1. Animals and Phenotypes

All datasets (i.e., pedigree, phenotypes, and genotypes) were provided by the Canadian Dairy Network (CDN), a member of Lactanet (Guelph, ON, Canada). Between 3447 (LP) and 8264 (MILK) animals with pseudo-phenotypes (de-regressed estimated breeding values, dEBVs) for MILK, FAT, PROT, FAT%, PROT%, and LP were used in this study (Table 1). The phenotypes used in the genetic evaluations were obtained by the Dairy Herd Improvement (DHI, Canada) field staff through milk recording and subsequent laboratory analyses. Official genetic evaluations for all evaluated traits and their associated reliabilities were provided by CDN (www.cdn.ca, accessed on 30 October 2021) based on their routine genetic evaluation models. The EBVs for LP were computed using the solutions from the Canadian test-day model, where an EBV for milk yield at day 60 and at day 280 was calculated for each animal and each lactation (1, 2, and 3). Then, the EBVs were standardized to relative breeding values and combined into a single EBV (mean of 100 and standard deviation of 5), in which a higher EBV indicated a more persistent animal. The dEBVs were computed as in VanRaden et al. [38], and only dEBVs with a reliability greater than 0.28 (accuracies greater than 0.50) were kept for this study.

2.2. iWGS and Genomic Quality Control

Imputed WGS data from 9131 Holstein cows, with 29,548,077 SNPs were available for this study. Genotype imputation was performed first from medium-density SNP panels (MD, 9131 cows and 56,955 or 60,914 SNPs; Illumina, San Diego, CA, USA) to a high-density panel (HD, 311,725 SNPs; Illumina, San Diego, CA, USA); and secondly, from HD to WGS, resulting in the iWGS datasets. The HD reference population had 2397 animals (from the North American Holstein population), and there were 1147 animals with WGS data (from the 1000 Bull Genomes Project, which also included North American Holstein animals). Imputation was performed using the FImpute software [39]. Imputation accuracies greater than 95% have been achieved for Holstein and other North American dairy cattle breeds [19,40].
Before imputation, SNPs with a call rate lower than 0.95, extreme departure from Hardy–Weinberg equilibrium ($p < 10^{-8}$), location on nonautosomal chromosomes or unknown position, or excess of heterozygosity greater than 0.15 were removed from the dataset. Only SNPs present in both MD and HD files were retained in the MD dataset. After quality control (QC), 38,955 SNPs remained in the MD dataset to be used in the imputation analyses. A total of 297,114 SNPs (from the initial 311,725 SNPs) remained in the HD panel dataset after QC. Subsequently, imputation from HD to WGS was performed for all autosomal chromosomes.
After imputation to WGS, an additional QC was performed to remove individuals and SNPs with a missing rate greater than 0.1, and SNPs with a minor allele frequency (MAF) lower than 0.01 or extreme deviation from Hardy–Weinberg equilibrium ($p < 10^{-8}$; e.g., [41,42,43]). Finally, between 5,108,861 and 6,101,357 SNPs remained for the GWAS analyses of LP and milk production traits. The PLINK software [44] was used to perform all QC steps. All these analyses followed Chen et al. [21].
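The post-imputation SNP filters described above can be illustrated with a minimal Python sketch. It is not the PLINK implementation used in the study: the function name `snp_passes_qc` and the genotype coding (0/1/2 allele counts, `None` for missing) are assumptions made for illustration, and the Hardy–Weinberg check uses a simple 1-df chi-square approximation rather than an exact test.

```python
import math

def snp_passes_qc(genotypes, max_missing=0.1, min_maf=0.01, hwe_p=1e-8):
    """Return True if one SNP passes the missing-rate, MAF, and HWE filters.

    genotypes: list of alternate-allele counts (0, 1, 2), None for missing.
    Thresholds follow the text; the HWE test is a chi-square approximation
    (an assumption of this sketch, not the method used by PLINK).
    """
    called = [g for g in genotypes if g is not None]
    if not called:
        return False
    missing_rate = 1 - len(called) / len(genotypes)
    if missing_rate > max_missing:
        return False

    n = len(called)
    freq = sum(called) / (2 * n)          # alternate-allele frequency
    if min(freq, 1 - freq) < min_maf:     # minor allele frequency filter
        return False

    # Observed vs. expected genotype counts under Hardy-Weinberg equilibrium.
    observed = [called.count(k) for k in (0, 1, 2)]
    expected = [n * (1 - freq) ** 2, 2 * n * freq * (1 - freq), n * freq ** 2]
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    p_value = math.erfc(math.sqrt(chi2 / 2))  # chi-square survival, 1 df
    return p_value >= hwe_p
```

For example, a SNP with genotype counts of 50, 40, and 10 across 100 animals passes all three filters, whereas a SNP for which every animal is heterozygous fails the Hardy–Weinberg check.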

2.3. Association Analyses, Statistical Models, and Significance Testing

Association analyses were performed using the GCTA package [45], fitting a mixed linear model (MLM), including a polygenic effect. Therefore, SNP effects were estimated using the following statistical model:
$$\mathbf{y} = \mathbf{1}\mu + \mathbf{X}b + \mathbf{Z}\mathbf{u} + \mathbf{e},$$
where $\mathbf{y}$ is a vector of pseudo-phenotypes (dEBVs); $\mathbf{1}$ is a vector of ones; $\mu$ is the overall mean; $b$ is the fixed effect of the SNP tested for association with each trait; $\mathbf{X}$ is a vector containing the genotype score for the tested SNP; $\mathbf{u}$ is a vector of polygenic effects with $\mathbf{u} \sim N(\mathbf{0}, \mathbf{G}\sigma_u^2)$, where $\mathbf{G}$ is the genomic-based relationship matrix (GRM) and $\sigma_u^2$ is the additive genetic variance of polygenic effects; $\mathbf{Z}$ is the incidence matrix of $\mathbf{u}$; and $\mathbf{e}$ is a vector of residual effects with $\mathbf{e} \sim N(\mathbf{0}, \mathbf{I}\sigma_e^2)$, where $\mathbf{I}$ is an identity matrix and $\sigma_e^2$ is the residual variance.
To avoid double-fitting candidate SNPs [46], GRMs were alternatively constructed by randomly sampling 50,000 SNPs from all chromosomes, except the one in which the analyzed SNPs were located. After the GWAS analyses, all SNPs were ranked based on their p-values and clumped according to their LD pattern ($r^2 > 0.90$), as suggested by Prive et al. [47]. The genomic inflation factor ($\lambda$) was calculated as the ratio of the median of the observed distribution of the $\chi^2$ statistic to the expected median ($\hat{\lambda} = \mathrm{median}(\chi^2)/0.4549$), for which a 95% confidence interval (CI) of the $\lambda$ value was further derived.
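As a rough illustration of the genomic inflation factor defined above, the following Python sketch converts p-values back to 1-df chi-square statistics and divides their median by 0.4549. The function name `genomic_inflation` is hypothetical, and the sketch assumes two-sided p-values in the interval (0, 1]; it does not compute the confidence interval.

```python
from statistics import NormalDist, median

CHI2_1DF_MEDIAN = 0.4549  # expected median of a chi-square with 1 df

def genomic_inflation(p_values):
    """Estimate lambda as median(chi-square) / 0.4549.

    Each two-sided p-value is mapped to a 1-df chi-square statistic via
    the squared standard-normal quantile at p / 2 (by symmetry this is
    the same as the quantile at 1 - p / 2, but numerically safer).
    """
    std_normal = NormalDist()
    chi2 = [std_normal.inv_cdf(p / 2) ** 2 for p in p_values]
    return median(chi2) / CHI2_1DF_MEDIAN
```

Under the null hypothesis p-values are uniformly distributed, so an evenly spaced grid of p-values should give a value close to 1; values well above 1 would suggest inflation of the test statistics.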
To avoid the excessive false-negative results that a Bonferroni adjustment would produce given the large number of variants tested [48], we instead calculated the significance threshold as 0.05 divided by the number of independent chromosomal segments ($M_e$) at the chromosome-wise level [49]. $M_e$ is a function of the effective population size ($N_e$) and chromosome length ($L$, in centimorgans, cM), and was calculated as [50]:
$$M_e = \frac{2 N_e L}{\log(N_e L)}$$
$N_e$ was considered to be equal to 58 [51] and one cM equivalent to 1 Mb [52]. SNPs were considered statistically significant if their $-\log_{10}(P)$ was higher than the chromosome-wide threshold.
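The chromosome-wise threshold can be sketched numerically as follows. This is only an illustration under stated assumptions: the function names are hypothetical, the logarithm is taken to be the natural logarithm (the text does not specify the base), and the 158 cM chromosome length in the usage note is an arbitrary example value rather than a figure from the study.

```python
import math

def independent_segments(ne, length_cm):
    """Number of independent chromosomal segments, M_e = 2*Ne*L / log(Ne*L).

    Assumes a natural logarithm and L expressed in cM, as in the text.
    """
    product = ne * length_cm
    return 2 * product / math.log(product)

def chromosome_threshold(ne, length_cm, alpha=0.05):
    """Chromosome-wise significance threshold on the p-value scale."""
    return alpha / independent_segments(ne, length_cm)

def is_significant(p_value, ne, length_cm, alpha=0.05):
    """Compare -log10(P) against the -log10 chromosome-wise threshold."""
    threshold = chromosome_threshold(ne, length_cm, alpha)
    return -math.log10(p_value) > -math.log10(threshold)
```

With $N_e = 58$ and a hypothetical chromosome of 158 cM, `independent_segments(58, 158)` is roughly 2009, which corresponds to a threshold of about 2.5 × 10⁻⁵ (a $-\log_{10}(P)$ of about 4.6).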

2.4. Functional Genomic Analyses

The SNP coordinates were based on the ARS-UCD1.2 assembly of the cattle reference genome (GCA_002263795.2). The annotation information was obtained from the National Center for Biotechnology Information (NCBI; www.ncbi.nlm.nih.gov, accessed on 30 October 2021). The GALLO R package [53] was used to detect genes located within ±100 Kb of the significant SNPs and QTL regions previously cited in the Animal QTLdb [54]. The Variant Effect Predictor (VEP) tool from Ensembl [55] was utilized to identify novel variants associated with the main peaks observed in the Manhattan plots. Functional enrichment analyses of the candidate genes identified were performed using the DAVID platform [56]. Gene network joint analyses were performed using the STRING database [57].
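The ±100 Kb gene search described above amounts to an interval-overlap query. The Python sketch below illustrates the idea only; it is not the GALLO API (GALLO is an R package), and the function name, the record layout, and the gene names `GENE_A` to `GENE_D` with their coordinates are all hypothetical.

```python
def genes_near_snp(snp_chrom, snp_pos, genes, window=100_000):
    """Return names of genes overlapping [snp_pos - window, snp_pos + window].

    genes: iterable of dicts with 'name', 'chrom', 'start', 'end' keys
    (a hypothetical annotation layout used only for this sketch).
    """
    lower, upper = snp_pos - window, snp_pos + window
    return [
        gene["name"]
        for gene in genes
        if gene["chrom"] == snp_chrom
        and gene["start"] <= upper   # gene begins before the window ends
        and gene["end"] >= lower     # gene ends after the window begins
    ]

# Hypothetical annotation records; coordinates are invented for illustration.
EXAMPLE_GENES = [
    {"name": "GENE_A", "chrom": 14, "start": 300_000, "end": 370_000},
    {"name": "GENE_B", "chrom": 14, "start": 460_000, "end": 470_000},
    {"name": "GENE_C", "chrom": 14, "start": 600_000, "end": 650_000},
    {"name": "GENE_D", "chrom": 5, "start": 460_000, "end": 470_000},
]
```

Calling `genes_near_snp(14, 465_742, EXAMPLE_GENES)` returns `GENE_A` and `GENE_B`: the first overlaps the lower edge of the window and the second contains the SNP, while the other two lie outside the window or on another chromosome.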

3. Results

3.1. GWAS

After QC and LD-based clumping [47], the remaining number of informative SNPs ranged from 1,673,052 (FAT) to 2,117,121 (LP). The Manhattan plots illustrate the chromosomal distribution of SNPs significantly associated with each trait (Figure 1). Additionally, Manhattan plots with a y-axis truncated at a lower level for the milk production traits are available as Figure S1 for a better visualization of peaks other than those found on BTA14. For the milk production traits, the significant peaks were higher and sharper, suggesting a more precise detection of narrower QTL regions distributed across the whole genome. A strong association was found on BTA14 (Table 2 and Table 3), with ARHGAP39, bta-mir-2308, C14H8orf82, LRRC24, LRRC14, RECQL4, MFSD3, GPT, PPP1R16A, FOXH1, KIFC2, CYHR1, TONSL, VPS28, ENSBTAG00000053637, SLC39A4, CPSF1, and ADCK5 harboring the most significant SNPs for MILK, FAT, FAT%, and PROT%; and with MAF1, SHARPIN, CYC1, GPAA1, EXOSC4, OPLAH, SPATC1, GRINA, PARP10, PLEC, and bta-mir-2309 being the most significant candidate genes for PROT. Milk production traits were highly associated with the diacylglycerol O-acyltransferase 1 (DGAT1) gene (Tables S1–S5), contributing to the highest peak found on BTA14. For LP, significant SNPs were spread across various chromosomes and formed less-defined peaks. The most significant regions for LP were observed on BTA28 (p-value = $5.28 \times 10^{-7}$) and BTA18 (p-value = $8.56 \times 10^{-7}$); for the BTA28 peak, the most significant SNPs were associated with ZMIZ1 and PPIF, and for the BTA18 peak, with ARHGAP35, NPAS1, TMEM160, ZC3H4, and SAE1 (Table 2). Moreover, the genes presented in Table 2 and Table 3 were previously reported in the Animal QTLdb to be associated with a large number of QTL regions. Most of these QTL are related to milk production traits, but some are linked to reproduction, health, production, and exterior (conformation and appearance).

3.1.1. GWAS for Milk Yield, Fat Yield, and Fat Percentage

For MILK, 3991 SNPs, located on 25 chromosomes, were significantly associated with 1098 genes within ±100 Kb genomic regions. The genomic regions located on BTA5, BTA6, BTA13, BTA14, and BTA20 were the most significant ones (p-value < $10^{-10}$), where MGST1 and SLC15A5 (BTA5); MOB1B, DCK, and SLC4A4 (BTA6); ISM1 and TASP1 (BTA13); and NIM1K, ZNF131, ENSBTAG00000042376, ENSBTAG00000054352, ENSBTAG00000052195, and ENSBTAG00000051111 (BTA20), were the top candidate genes associated with those regions, in addition to those already mentioned above for BTA14. For FAT, 2607 SNPs, located on 21 chromosomes, were significantly associated with 989 genes. Of those, the key candidate genes were SLC15A5 (BTA5); NR6A1, bta-mir-181a-2, bta-mir-181b-2, OLFML2A, WDR38, RPL35, ARPC5L, and GOLGA1 (BTA11); ACCSL, ACCS, and EXT2 (BTA15); CEP350 and QSOX1 (BTA16); and PKD2L1, SCD, bta-mir-12016, WNT8B, SEC31B, NDUFB8, and HIF1AN (BTA26), where the genomic regions located on BTA5, BTA11, BTA14, BTA15, BTA16, and BTA26 were the most significant ones (p-value < $10^{-9}$). Furthermore, for FAT%, 4459 SNPs, located on 22 chromosomes, were significantly associated with 2016 genes within ±100 Kb genomic regions. For this trait, the chromosomes with the most significant regions (p-value < $10^{-10}$) were BTA5, BTA6, BTA11, BTA14, BTA16, and BTA20. The top candidate genes located in those regions were SLC15A5 (BTA5); HERC3, PIGY, and HERC5 (BTA6); GADD45G (BTA11); FAM163A, TOR1AIP2, and TOR1AIP1 (BTA16); and ENSBTAG00000054476 and MRPS30 (BTA20). The top candidate genes located on BTA14 for MILK, FAT, and FAT% were ARHGAP39, bta-mir-2308, C14H8orf82, LRRC24, LRRC14, RECQL4, MFSD3, GPT, PPP1R16A, FOXH1, KIFC2, CYHR1, TONSL, VPS28, ENSBTAG00000053637, SLC39A4, CPSF1, and ADCK5. VEP analysis confirmed FOXH1 as a gene containing variants strongly related to MILK, FAT, and FAT%. Furthermore, VEP analysis indicated OLFML2A (BTA11) as a possible novel candidate variant associated with FAT (Table S6).

3.1.2. GWAS for Protein Yield, Protein Percentage, and Lactation Persistency

A total of 805 SNPs, located on 24 chromosomes, were significantly associated with 898 genes for PROT. Of those, BTA3, BTA13, and BTA14 contained the most significant regions (p-value < $10^{-8}$), where CDC14A, ENSBTAG00000054319, ENSBTAG00000015759, and RTCA (BTA3); ARHGAP12 (BTA13); and MAF1, ENSBTAG00000051469, SHARPIN, CYC1, GPAA1, EXOSC4, OPLAH, ENSBTAG00000015040, SPATC1, GRINA, PARP10, PLEC, and bta-mir-2309 (BTA14) were the candidate genes related to the most significant SNPs identified for PROT. Moreover, for PROT%, 5519 SNPs, located on 24 chromosomes, were significantly associated with 2739 genes within ±100 Kb genomic regions. The top significant genes for PROT% (p-value < $10^{-12}$) were SLC15A5 (BTA5); HERC3, PIGY, and HERC5 (BTA6); ARHGAP39, bta-mir-2308, C14H8orf82, LRRC24, LRRC14, RECQL4, MFSD3, GPT, PPP1R16A, FOXH1, KIFC2, CYHR1, TONSL, VPS28, ENSBTAG00000053637, SLC39A4, CPSF1, and ADCK5 (BTA14); STIM1, RHOG, PGAP2, and NUP98 (BTA15); and PAIP1, ENSBTAG00000049623, C20H5orf34, TMEM267, CCL28, HMGCS1, ENSBTAG00000048672, and NIM1K (BTA20). Finally, for LP, 49 SNPs located on 18 chromosomes were significantly associated with 85 genes (Table S7). The most significant genes (p-value < $10^{-7}$) were located on BTA18 and BTA28, represented by ARHGAP35, NPAS1, TMEM160, ZC3H4, and SAE1 (BTA18); and ZMIZ1 and PPIF (BTA28) (Table 3). The VEP analysis indicated the genes GRINA and PARP10 for PROT, CCL28 for PROT%, and the genes ZC3H4 and ZMIZ1 for LP (Table S6), all of which were located in the main observed GWAS peaks.

3.2. Commonly Identified Genes for Two or More Traits

Similar genomic regions were detected to be associated with different LP and milk production traits (Figure 2), indicating potential pleiotropic effects. LP presented common candidate genes with MILK, FAT, FAT%, and PROT%, where 15 candidate genes were concurrently associated with LP and the mentioned production traits: CXCL13 and LDB2 (MILK); ZMIZ1, bta-mir-371, NLRP12 and PPIF (FAT); INPP5A (FAT%); and SERPINB6, SERPINB9, IGF2BP1, and DLX4 (PROT%). Additionally, LP, FAT%, and PROT% showed common candidate genes between the three traits simultaneously: ABI3, GNGT2, B4GALNT2, and PHOSPHO1, demonstrating their importance not only in the genetic background of milk solids production, but also in the duration of the lactation peak.
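The overlap of candidate genes between traits summarized above can be computed with plain set intersections. The Python sketch below uses small, partial gene sets assembled from genes named in this article purely for illustration; they are not the full candidate lists, and the function name `shared_genes` is hypothetical.

```python
from itertools import combinations

def shared_genes(trait_genes):
    """Map each pair of traits to the set of genes they have in common.

    trait_genes: dict of trait name -> set of candidate gene names.
    Pairs with no genes in common are omitted from the result.
    """
    shared = {}
    for trait_a, trait_b in combinations(sorted(trait_genes), 2):
        common = trait_genes[trait_a] & trait_genes[trait_b]
        if common:
            shared[(trait_a, trait_b)] = common
    return shared

# Partial, illustrative subsets only (not the complete results).
TOY_SETS = {
    "LP": {"LDB2", "ZMIZ1", "PPIF", "ABI3", "SERPINB6"},
    "MILK": {"LDB2", "NIM1K"},
    "FAT": {"ZMIZ1", "PPIF", "NR6A1"},
    "FAT%": {"ABI3", "SLC15A5"},
    "PROT%": {"ABI3", "SERPINB6", "SLC15A5"},
}

# Genes common to three traits are a chained intersection.
LP_FATPCT_PROTPCT = TOY_SETS["LP"] & TOY_SETS["FAT%"] & TOY_SETS["PROT%"]
```

With these toy sets, `shared_genes(TOY_SETS)` pairs LP with MILK through LDB2 and with FAT through ZMIZ1 and PPIF, while the three-way intersection of LP, FAT%, and PROT% contains only ABI3.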
As a result of the high genetic correlation existing between the milk production traits, 98 genes were commonly associated with the five milk production traits (MILK, FAT, FAT%, PROT, and PROT%), as demonstrated in Figure 2. All those common genes were located on BTA14, reinforcing the impact of this genomic region on milk production traits. The candidate genes located closest to the top SNP (BTA14: 465,742 bp) for MILK, FAT, FAT%, and PROT% were PPP1R16A, FOXH1, KIFC2, and CYHR1. In addition, the common genes closest to the top SNP for PROT (BTA14: 827,938 bp) were GRINA, PARP10, and PLEC.
The gene interaction network analysis revealed a strong connection between LP and milk production traits (Figure 3). Genes such as ARHGAP35, TMEM160, and SAE1 (BTA18), which were highly associated with LP, were shown to be linked with ARHGAP39, PPP1R16A, FOXH1, and CYHR1 (BTA14), which were significantly associated with milk production. Three large clusters formed around ARHGAP39, TONSL, and ADCK5, suggesting that the molecular interactions among these three genes are related to the control of gene expression and protein regulation for milk traits.

3.3. Functional Analyses of Candidate Genes

Gene ontology (GO) enrichment analyses were performed to better understand the functional role of the candidate genes identified. GO terms for 14 biological processes and 12 molecular functions were significantly enriched, with 44 genes for MILK, 86 genes for FAT, and 33 for FAT% (Table 4). Furthermore, GO terms for 26 biological processes and nine molecular functions were significantly related to 41 genes for PROT, 109 genes for PROT%, and 6 for LP (Table 5). Five GO terms were found to be related to two or more traits. The GO:1903494 term associated with the response to dehydroepiandrosterone, GO:1903496 linked to the response to 11-deoxycorticosterone, and GO:0032355 associated with the response to estradiol were commonly identified for MILK, PROT, and PROT%. In addition, GO:0015125, related to bile acid transmembrane transporter activity, was identified for FAT and FAT%; and GO:0005149, which is related to interleukin-1 receptor binding, was associated with FAT% and PROT%.
The following 26 genes were identified as influencing the phenotypic expression of two or more traits: CSN1S1, CSN1S2, CSN2, CSN3, CHD7, KCNMB4, SLCO1A2, SLCO1B3, SLCO1C1, SLCO2B1, HCK, PTK2, SCX, DDX1, TNFRSF1A, LTBR, IL1A, IL1B, IL1F10, IL1RN, IL36RN, IL36A, IL36B, IL36G, IL37, and SERPINB9. Furthermore, some genes were repeated in certain GO terms for the same trait, including the GABR, CSN, and GST gene families, CLIC5, and MGST1 (MILK); EYA3, DHCR7, PPDPF, and GADD gene family (FAT); the LYSB gene and the BPI, LYZ, and CSN gene families (PROT), and C1QBP, ID2, JMJD6, LALBA, MAGOHB, PABPC1, PRLR, PRPF4B, PUF60, SRSF2, SRSF7, STAT5B, WNT11, and ZPR1 genes, as well as CSN, SLC, and RBM gene families (PROT%).

4. Discussion

The majority of previous GWAS carried out to search for genomic regions associated with economically important traits in dairy cattle have been conducted using MD or HD SNP panel data. However, iWGS can increase the statistical power for detecting QTLs and causative variants for complex traits [58]. Based on iWGS-based GWAS, we identified novel genomic regions of interest, revealing novel candidate genes (ARHGAP35, NPAS1, TMEM160, ZC3H4, SAE1, ZMIZ1, PPIF, LDB2, ABI3, SERPINB6, and SERPINB9 for LP; NIM1K, ZNF131, GABRG1, GABRA2, DCHS1, and SPIDR for MILK; NR6A1, EXT2, POLD1, GOT1, and ETV6 for FAT; DPP6, LRRC26, and the KCN gene family for FAT%; CDC14A, RTCA, HSTN, and ODAM for PROT; and HERC3, HERC5, LALBA, and NEURL1 for PROT%), and confirmed previously reported associations. The fairly high number of novel candidate genes for LP and milk production traits supports the use of a denser mapping of the genome for GWAS purposes, as well as the polygenic nature of these traits. Liu et al. [59], evaluating 1220 Holstein cows with a SNP panel containing 124,743 markers, identified 10 highly significant SNPs associated with FAT and PROT. However, those authors did not detect any significant SNP related to milk yield, even with a moderate number of markers involved in the GWAS, reinforcing the argument that the use of WGS can substantially contribute to this type of analysis. For instance, numerous QTLs were found for MILK (146), FAT (152), and PROT (166) in five French and Danish dairy breeds when using WGS data [18]. Using iWGS-based GWAS, we identified multiple genomic regions affecting LP, a complex trait with few known QTLs. Therefore, our findings greatly contribute to elucidating the genetic background of LP in Holstein cattle.

4.1. Candidate Genes for Lactation Persistency

Among the traits evaluated in this study, LP is the least explored and, to the best of our knowledge, this is the first iWGS-based GWAS for LP, which presents a good opportunity to explore novel genomic regions influencing its phenotypic expression. Although the peaks were not well defined due to the highly polygenic nature of LP [7], BTA18 and BTA28 harbored the most significant regions. The genes highly associated with LP were ARHGAP35, NPAS1, TMEM160, ZC3H4, and SAE1 on BTA18, and ZMIZ1 and PPIF on BTA28. None of these genes were previously linked to LP. Interestingly, most of these genes were previously linked to fertility traits in dairy cattle [60,61,62], indicating that fertility and LP are genetically correlated in dairy cattle populations [63]. Other studies have also demonstrated the genetic relationship between reproductive traits and LP in dairy cattle. For instance, Muir et al. [64] showed that heifers with a difficult first calving tended to have more persistent first lactations and lower peak yields, indicating an antagonistic relationship between calving ease and LP. More recently, Yamazaki et al. [9] reported that differences in LP are related to a cow’s ability to conceive after the second calving. Therefore, the best bulls for improving female fertility after the second calving may differ with the production system and herd milk production, given that a strong genetic correlation was verified between LP and fertility traits in Japanese Holstein cattle [9]. Another important finding was the gene network connection among ARHGAP35, TMEM160, and SAE1 (BTA18) and ARHGAP39, PPP1R16A, FOXH1, and CYHR1 (BTA14), as demonstrated in Figure 3. No other study has reported a close network interaction between some of the main genes associated with LP and milk production traits; this interaction indicates potential pleiotropic effects (Figure 3 and Supplemental Figure S2).
Further studies should investigate the connection between LP and milk production traits at the molecular level. Evidence of interaction between LP and milk production traits was previously reported by Jakobsen et al. and Yamazaki et al. [65,66], who observed moderate to high genetic correlations between these two trait groups.
Three GO terms were significantly enriched for LP: GO:0030334 (regulation of cell migration), GO:1903955 (positive regulation of protein targeting to mitochondrion), and GO:0004867 (serine-type endopeptidase inhibitor activity). The genes linked to these GO terms, ABI3 and LDB2 (GO:0030334), SAE1 (GO:1903955), and SERPINB6 and SERPINB9 (GO:0004867), could also be considered novel candidate genes, and their molecular roles in LP should be investigated further. ABI3 (ABI gene family member 3) was significantly associated in our study not only with LP but also with FAT% and PROT% (Figure 2), confirming its influence on the phenotypic expression of bovine milk-related traits. The ABI3 gene was previously related to milk pregnancy-associated glycoproteins in Holstein cattle, but its molecular role in dairy cattle has been little explored [67]. SAE1, which also appears as one of the most significant genes for LP, was previously associated with lactation evolution in mice, indicating that this gene plays an important role in the expression of lactation in other mammalian species [68]. SERPINB6 and SERPINB9 belong to the serpin superfamily of protease inhibitors, which use a conformational change to inhibit target enzymes. They are central genes controlling many proteolytic cascades, including important mammalian coagulation pathways [69]. Both genes were formerly associated with milk production traits in buffaloes [70], somatic cell count in Jersey cows, and clinical ketosis lactation in Holstein cattle [71,72], but this is the first time that SERPINB6 and SERPINB9 have been associated with LP. Given their relationship with other dairy cattle traits, their molecular role in the expression of LP should be further investigated.

4.2. Candidate Genes for Milk Yield

Several strong associations detected here support previously reported genomic regions for milk production traits. For instance, the region containing highly significant SNPs on BTA14 for MILK, including ARHGAP39, PPP1R16A, FOXH1, KIFC2, CYHR1, and TONSL, was reported in other studies [15,37,73,74]. Atashi et al. [74] reported this region on BTA14 to be associated with 305-day milk yield and peak yield in dairy cows. Other relevant genomic regions were found on BTA20, BTA5, and BTA6, also containing genes previously identified as potentially influencing milk production. MGST1 and SLC15A5 (BTA5); MOB1B, DCK, and SLC4A4 (BTA6); and NIM1K and ZNF131 (BTA20) can be highlighted for their strong association with MILK (Table 2). Raven et al. [75] reported the link between MGST1 and milk production in multibreed cattle. Additionally, it is noteworthy that the SLC4A4 gene had an important function in the production of MILK in a study with Holstein cattle in the USA [73]. SLC4A4 is a solute transporter, belonging to one of the major transporter superfamilies mostly involved in the active transport of glucose. Glucose uptake by mammary epithelial cells is a crucial stage in milk synthesis, and therefore, directly influences MILK [76]. Furthermore, NIM1K and ZNF131 have been reported to prolong the lactation period, culminating in higher milk production levels in Canadian dairy cattle [36]. Banos et al. [77] reported that ZNF131 was expressed in the milk transcriptome and the mammary gland of dairy sheep, highlighting the impact of this gene on the molecular transcription of regions related to sheep milk production. The importance of zinc finger protein 131 (ZNF131) for the transcription of genes involved in milk production seems clear. Lastly, this is the first time that NIM1K and ZNF131 have been directly associated with milk yield in dairy cattle.
Four other novel candidate genes for MILK were linked to significant GO terms (FDR ≤ 1%) in the enrichment analyses. GABRG1 (BTA14) and GABRA2 (BTA6) for GO:0004890 and GO:0005230; DCHS1 (BTA15) for GO:0003273; and SPIDR (BTA14) for GO:0071479, shown in Table 4, can be highlighted as genes that might play fundamental roles in metabolic functions involving MILK. Interestingly, the four genes were previously linked to FAT [62,78,79] or other milk components [80], confirming their close relationship with milk production traits. For instance, the γ-aminobutyric acid type A genes (GABRG1 and GABRA2), which contribute to γ-aminobutyric acid (GABA) chloride ion channel activity and participate in GABA-A receptor activity, were previously associated with milk production traits in Holstein and Xinjiang Brown cattle [81,82].

4.3. Candidate Genes for Fat Yield

As observed for MILK, the GWAS for FAT revealed several genomic regions with highly significant SNPs associated with 989 genes spread across 21 chromosomes. Besides the genes located on BTA14 (ARHGAP39, PPP1R16A, FOXH1, KIFC2, CYHR1, and TONSL) identified for MILK, a highly significant genomic region was identified for FAT on BTA5, which harbors the SLC15A5 gene. SLC15A5 was previously associated with FAT and lies in a large region of 88.26–93.69 Mb on BTA5 that seems to have a cluster of additive effects linked with MGST1, PLEKHA5, and ABCC9 [62]. The same genes (MGST1, PLEKHA5, and ABCC9) were also significantly associated with FAT in our study (Supplemental Table S1), suggesting that this genomic region plays an important role in the expression of FAT in Holstein cattle. Other significant peaks for FAT were observed on BTA11, BTA15, BTA16, and BTA26, revealing novel candidate genes. On BTA11, the most significant region contains the NR6A1 gene, which was also included in the significant GO:0000978, demonstrating its potential as a novel candidate gene associated with FAT. NR6A1 has high homology among different species [83], and acts in the expression of traits linked to metabolism, reproduction, and production, as demonstrated in a study with swine where this gene was related to fat deposition [84]. Another relevant gene located on BTA11 that might be interesting to investigate further is OLFML2A. Besides being highly associated with FAT, OLFML2A was also found in the VEP analysis, confirming its potential as a novel candidate gene associated with FAT. OLFML2A (Olfactomedin Like 2A) is a protein-coding gene related to protein homodimerization activity and extracellular matrix binding. To the best of our knowledge, this gene has never been associated with any milk trait in the literature before, and its influence on milk-related traits warrants deeper investigation.
Interestingly, the OLFML2A gene was found differently expressed in a study of fat depot-specific gene signatures in mice, contributing to the distinct patterns of extracellular matrix remodeling and adipose function in different fat depots [85]. Additionally, on BTA15, the most significant SNP found was associated with EXT2, which was also related with a significant GO term responsible for cell differentiation (GO:0030154). EXT2 was previously cited as a suggestive gene associated with milk iron content in Jersey cows [86]. However, this is the first time that this gene has been directly associated with FAT in Holstein cattle.
Among the most significant GO terms, GO:0055089 is involved in fatty acid homeostasis and includes important genes such as DGAT1. The link between DGAT1 and FAT and other milk production traits is widely known [28,87]. However, out of the five genes related to GO:0055089, POLD1 and GOT1 have not been previously associated with FAT. POLD1, which is located on BTA18, was previously associated with PROT in Nordic Holstein cattle, but its function in the expression of milk production traits is not yet well established [88]. GOT1, located on BTA26, was previously associated with milk fatty acids, acting in the transformation of citrate by ATP-citrate lyase in the cytosol, which is required for fatty acid synthesis [89]. Another novel candidate gene for FAT is ETV6, which was identified in two GO terms (GO:0030154 and GO:0000978), representing cell differentiation and RNA polymerase II core promoter proximal region sequence-specific DNA binding, respectively. ETV6 was previously associated with FAT% in Brown Swiss cattle [90] and MILK in Holstein cattle [78], but this is the first time that ETV6 has been linked to FAT.

4.4. Candidate Genes for Fat Percentage

For FAT%, 4459 SNPs, distributed across 22 chromosomes and associated with 2016 annotated genes, were found. Among the highly significant regions, the SLC15A5 (BTA5) and MRPS30 (BTA20) genes can be highlighted, in addition to the BTA14 genes already mentioned for MILK and FAT. SLC15A5, a solute carrier family member, was also recently associated with FAT% in Holstein cattle [62,91], confirming its importance to milk quality. Furthermore, other genes from the solute carrier family appear to be relevant for FAT%, since SLCO1A2, SLCO1B3, SLCO1C1, and SLCO2B1 were grouped in GO:0015125, one of the significant GO terms enriched for this trait (Table 4). MRPS30 was first associated with FAT% by Jiang et al. [62] and was previously linked to MILK in studies reported by Fang et al. and Cai et al. [92,93].
The most significant GO terms for FAT% are GO:0005149 (interleukin-1 receptor binding) and GO:0015459 (potassium channel regulator activity). GO:0005149 presented the highest significance (8.5 × 10⁻⁸) and included a large group of IL family genes associated with FAT% (Table 4). There are few reports in the literature connecting the interleukin-1 receptor with dairy cattle, and these are limited to research that associated genes from this family with mastitis or fertility indicators [94,95]. However, there is evidence in other species, including humans, of a relationship between the IL gene family and the structuring of fat in different tissues, which may indicate the importance of some of these genes in the structuring of milk fat [96,97]. Another important GO term found was GO:0015459, which is related to potassium channel regulator activity. The main genes involved in this GO term are DPP6 (BTA4) and LRRC26 (BTA11), along with others from the KCN gene family. Genes from the KCN family were previously associated with milk traits [98], but their molecular involvement with FAT% needs to be further explored. A possible mechanism that might be worth investigating is that, in milk, potassium is correlated with lactose, and therefore with milk yield, via osmotic regulation [99]. Therefore, changes in percentage traits could be driven by potassium-induced changes in milk volume.
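The volume argument above can be made explicit with a short worked relation. This is only an illustrative sketch and not part of the original analyses: the symbols $F$ (fat yield), $M$ (milk yield), and $p$ (fat percentage) are notation assumed here, and the same reasoning would apply to protein percentage.

```latex
p = 100 \cdot \frac{F}{M}
\quad\Longrightarrow\quad
\frac{\mathrm{d}p}{p} = \frac{\mathrm{d}F}{F} - \frac{\mathrm{d}M}{M}
```

Under these assumptions, if fat yield is unchanged ($\mathrm{d}F = 0$), a 5% increase in milk volume lowers the percentage by roughly 5% in relative terms; for example, a fat percentage of 4.00% would become $4.00 / 1.05 \approx 3.81\%$.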

4.5. Candidate Genes for Protein Yield

For PROT, CDC14A and RTCA (BTA3), and MAF1, SHARPIN, CYC1, EXOSC4, PARP10, OPLAH, GRINA, and PLEC (BTA14) are key candidate genes for milk protein expression. On BTA3, this is the first time that CDC14A and RTCA have been associated with PROT, but interestingly both genes were already detected in signatures of selection for other milk production traits in Valdostana cattle populations [100]. Furthermore, RTCA was previously associated with milk production traits in buffalo [101], but its functional connection with milk traits has not been described yet. The MAF1 gene located on BTA14 has been associated with milk protein synthetic capacity and, for this reason, has been pointed out as a key candidate gene for PROT [34,62,102]. The other genes identified on BTA14 (SHARPIN, CYC1, EXOSC4, PARP10, OPLAH, GRINA, and PLEC) were recently strongly associated with milk production traits, including PROT [37,93,102]. According to Jena et al. [103], SHARPIN influences mammary gland development and controls extracellular matrix organization of the stroma during branching morphogenesis, which induces alveologenesis during pregnancy and lactation. Moreover, Lin et al. [104] found a strong association of SHARPIN, CYC1, EXOSC4, and PARP10 with milk serum albumin, which is one of the main proteins in cattle milk. These facts reinforce the hypothesis that the large genomic region where these genes are located (0.5 Mb upstream and downstream from SHARPIN) is important for milk protein expression. It is important to highlight that both GRINA and PARP10 were found in our VEP analysis, supporting that these two genes are highly associated with PROT and should be considered as harboring novel relevant variants for this trait. The Glutamate Receptor Ionotropic NMDA-Associated Protein 1 (GRINA) belongs to the Lifeguard family and is involved in calcium homeostasis [105]. This gene was previously reported to play an important role in lipid, major protein, and cholesterol homeostasis in the milk of dairy cows, suggesting that GRINA might contribute to the regulation of solids present in dairy milk [12,102]. PARP10, which is a member of the poly (ADP-ribose) polymerase family, is related to several essential biological functions, such as immunity, metabolism, apoptosis, and DNA damage repair [106]. From a physiological perspective, the concentration of albumin in milk is influenced by pathological and genetic factors, which could link PARP10 to the regulation of albumin content in dairy cattle milk [104].
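The idea of a candidate region defined as 0.5 Mb upstream and downstream from an anchor gene can be sketched in a few lines of Python. This is a minimal illustration and not the pipeline used in the study: the `Gene` record, the `genes_in_window` helper, and every coordinate below are hypothetical placeholders (they are not real positions on any cattle assembly), and the neighboring gene names are invented.

```python
from typing import List, NamedTuple


class Gene(NamedTuple):
    name: str
    chrom: str
    start: int  # start position in bp
    end: int    # end position in bp


def genes_in_window(genes: List[Gene], anchor_name: str,
                    flank_bp: int = 500_000) -> List[str]:
    """Return names of genes overlapping the anchor gene +/- flank_bp."""
    anchor = next(g for g in genes if g.name == anchor_name)
    lo = max(0, anchor.start - flank_bp)
    hi = anchor.end + flank_bp
    # A gene is kept if it is on the same chromosome and overlaps [lo, hi].
    return [g.name for g in genes
            if g.chrom == anchor.chrom and g.end >= lo and g.start <= hi]


# Placeholder coordinates for illustration only (NOT real annotations).
GENES = [
    Gene("SHARPIN", "14", 1_000_000, 1_010_000),
    Gene("NEAR_GENE", "14", 1_300_000, 1_320_000),         # inside the window
    Gene("FAR_GENE", "14", 2_000_000, 2_050_000),          # outside the window
    Gene("OTHER_CHROM_GENE", "6", 1_000_000, 1_020_000),   # other chromosome
]

print(genes_in_window(GENES, "SHARPIN"))
```

In practice the gene records would come from a genome annotation file, and the flank size would be chosen to reflect the extent of linkage disequilibrium in the population.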
The GO enrichment analyses for PROT (Table 5) also revealed the casein gene cluster containing the CSN1S1, CSN1S2, CSN2, and CSN3 genes, which encode αs1, αs2, β, and κ casein, respectively. These genes have a strong influence on casein synthesis in cattle milk, and polymorphisms in this region have a significant impact on milk protein composition [107]. In this context, HSTN and ODAM, which are located in the same region as the casein gene cluster, can be considered novel candidate genes for PROT due to their close proximity on BTA6 to CSN1S1, CSN1S2, CSN2, and CSN3, with which they are in linkage disequilibrium.

4.6. Candidate Genes for Protein Percentage

With several genomic regions adjacent to those mentioned for the other milk production traits, PROT% was the trait with the largest number of significant markers, i.e., 5519 SNPs, spread across 24 chromosomes and 2739 annotated genes. The most significant regions were found on BTA5, BTA6, BTA14, BTA15, and BTA20, with special emphasis on BTA6, BTA20, and BTA14. The most significant genes located on BTA6 are HERC3, HERC5, and PIGY. HERC3 and HERC5 can be considered novel candidate genes for PROT%, as they had previously been associated with PROT and FAT%, respectively [29,30], rather than with PROT%. Genes from the HERC family of ubiquitin ligases associate with prolactin to regulate important milk proteins such as β-casein [108], making these two genes strong candidates for PROT%. On BTA20, PAIP1, C20H5orf34, TMEM267, CCL28, HMGCS1, and NIM1K are the most significant genes. Of those, only CCL28 was previously reported to be associated with PROT% in North American Holstein cattle [62]. However, the other five genes also deserve special attention, not only for being highly associated with PROT% but also for being grouped in a narrow genomic region (31.20–31.50 Mb), which possibly makes these genes act together in the expression of PROT%. The C-C motif chemokine ligand 28 gene (CCL28) belongs to the subfamily of small cytokine CC genes that are involved in immunoregulatory and inflammatory processes [109]. Because of its relation with antimicrobial activity, CCL28 can play an important role in mastitis control and thus indirectly influence milk production [110]. This gene seems to be a strong candidate, as it was identified in our VEP analysis and is directly associated with one of the most significant markers for PROT% on BTA20.
Eighteen GO terms were significantly enriched (Table 5) and associated with various processes, especially GO:0005149 (interleukin-1 receptor binding), GO:1903496 (response to 11-deoxycorticosterone), GO:1903494 (response to dehydroepiandrosterone), and GO:0007595 (lactation). As found for FAT%, the interleukin-1 receptor binding term contains a large group of IL family genes for PROT%, which are crucial in the expression of milk production traits in many species, including dairy cattle [97]. The terms GO:1903496 and GO:1903494 were also found for PROT, reinforcing the relevance of the casein gene cluster for milk protein, both as total yield and as percentage. Additionally, in both terms, the gene LALBA (milk whey protein α-lactalbumin) was also identified for PROT%, highlighting the involvement of this well-known gene with PROT%, as previously reported in other dairy cattle studies [111,112]. In GO:0007595, many genes known for their association with PROT% were identified, such as STAT5A, STAT5B, ATP2B2, CSN3, CSN2, and PRLR, validating the connection of these genes with milk composition in dairy cattle. VDR has also been previously linked to PROT% in Holstein and Jersey cows [113].

4.7. Common Genes Identified in GO Terms

Several overlapping genomic regions were found either between milk production traits or between milk production and LP traits (Figure 2). This supports the hypothesis that many genes associated with these traits could have a pleiotropic effect in dairy cattle [16]. According to Oliveira et al. [16], the pleiotropic effects observed for genes related to milk traits suggest a biological function in the use of energy resources that directly affects the synthesis of milk and solids. Few genes have been reported to commonly influence LP and milk production traits, which reinforces that the common genes found in our study can help to elucidate the molecular interactions among various candidate genes with potential pleiotropic effects.
Fifteen genes were significantly associated with LP and at least one of the milk production traits, namely CXCL13 and LDB2 (MILK); ZMIZ1, bta-mir-371, NLRP12, and PPIF (FAT); INPP5A (FAT%); and SERPINB6, SERPINB9, IGF2BP1, and DLX4 (PROT%). Additionally, LP, FAT%, and PROT% shared four candidate genes among the three traits simultaneously (ABI3, GNGT2, B4GALNT2, and PHOSPHO1), demonstrating their importance for the genetic structure of milk solid production as well as for the duration of peak lactation. Of these genes, SERPINB6 and SERPINB9 have been previously associated with somatic cell count in Jersey cattle [16] and milk production traits in water buffalo [114]. Furthermore, it is noteworthy that the genes LDB2 (MILK), SERPINB6 and SERPINB9 (PROT%), and ABI3 (FAT% and PROT%) were also present in the significant GO terms for biological process, revealing their relationship with regulation of cell migration and regulation of protein and enzyme inhibitor activity.
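The overlap described above can be reproduced with simple set operations. The sketch below is only an illustration built from the gene names listed in the preceding paragraph; the dictionary `SHARED_WITH_LP` and the helper `genes_shared_by` are names introduced here, and the sketch assumes that each gene is shared with LP only for the trait(s) given in parentheses in the text.

```python
from typing import Dict, Iterable, Set

# Genes reported in the text as shared between LP and each milk trait.
# ABI3, GNGT2, B4GALNT2, and PHOSPHO1 are listed under both FAT% and PROT%.
SHARED_WITH_LP: Dict[str, Set[str]] = {
    "MILK": {"CXCL13", "LDB2"},
    "FAT": {"ZMIZ1", "bta-mir-371", "NLRP12", "PPIF"},
    "FAT%": {"INPP5A", "ABI3", "GNGT2", "B4GALNT2", "PHOSPHO1"},
    "PROT%": {"SERPINB6", "SERPINB9", "IGF2BP1", "DLX4",
              "ABI3", "GNGT2", "B4GALNT2", "PHOSPHO1"},
}


def genes_shared_by(traits: Iterable[str]) -> Set[str]:
    """Genes shared with LP for every trait in `traits` (set intersection)."""
    sets = [SHARED_WITH_LP[t] for t in traits]
    return set.intersection(*sets) if sets else set()


# Union over traits: genes shared between LP and at least one milk trait.
all_shared = set().union(*SHARED_WITH_LP.values())

print(len(all_shared))                            # number of distinct genes
print(sorted(genes_shared_by(["FAT%", "PROT%"])))  # common to LP, FAT%, PROT%
```

With full per-trait gene lists (for example, those in the supplementary tables), the same intersections would yield the counts summarized in a Venn-type diagram.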
As expected, due to the high genetic correlation between the studied traits, numerous genes were simultaneously associated with the five milk production traits. Oliveira Junior et al. [13], working with the same North American Holstein population, found high genetic correlations (>0.48) among MILK, FAT, PROT, FAT%, and PROT%, highlighting the close genetic relationship between these traits. High correlations among production traits are usually observed even in multibreed dairy populations [115]. In total, 98 common genes were identified for all milk production traits (Figure 2). Interestingly, all these genes are located on BTA14, reinforcing the importance of this genomic region for milk production traits. The genes located closest to the genomic region linked to the top SNP found for MILK, FAT, FAT%, and PROT% are PPP1R16A, FOXH1, KIFC2, and CYHR1. Nayeri et al. [15] reported that PPP1R16A, FOXH1, and CYHR1 were commonly linked with FAT and FAT% in Holstein cattle, and Cai et al. [93] also demonstrated that PPP1R16A, FOXH1, KIFC2, and CYHR1 presented a potential pleiotropic effect on MILK, FAT, and PROT. Finally, the common genes closest to the top SNP found for PROT are GRINA, PARP10, and PLEC. GRINA and PARP10 were also reported by Cai et al. [93] as genes with a pleiotropic effect on MILK, FAT, and PROT. Furthermore, PLEC was the only gene reported to be commonly associated with MILK, PROT, and FAT in Chinese Holstein cattle [17], which is in agreement with our findings.

4.8. Potential Implications and Limitations

Several novel candidate genes associated with LP and milk production traits in dairy cattle were identified, while previous associations were also confirmed. These findings will be useful for optimizing genomic prediction of breeding values in Holstein cattle and other dairy breeds, by adding the significant SNPs to commercial SNP panels to increase the accuracy of predictions and by giving differential weight to these important genomic regions through biology-driven genomic prediction methods [116]. Furthermore, the genomic regions revealed are initial targets for future studies investigating the molecular mechanisms influencing the phenotypic expression of milk-related traits. For instance, some important candidate genes require a better understanding of their molecular functions, such as SAE1, SERPINB6, and SERPINB9, which were highly associated with LP but whose biological functions are not clear. SAE1 (SUMO activating enzyme subunit 1), a gene linked to one of the most significant SNPs for LP and also present in one of the enriched GO terms found here, was previously associated with dairy cow mammary gland epithelial cells [117], but its molecular functions related to LP are still unclear.
Additionally, even with strict quality control applied to reduce the influence of poorly imputed variants and individuals on the GWAS analysis, it is still expected that some of the removed low-frequency variants could be associated with the studied traits and were not identified here. Although identifying causal mutations at low frequency is highly advantageous [20], such variants could also be false positives. Future studies should focus on the biological validation of the key candidate genes reported in this study. This could be done based on in vitro experiments and gene knock-out models.

5. Conclusions

We have shown that the use of imputed whole-genome sequence data for GWAS enabled the identification of a high number of SNPs associated with lactation persistency and milk production traits in dairy cattle. Several genomic regions and candidate genes were identified, which are widely distributed across all autosomal chromosomes, especially BTA5, BTA6, BTA14, BTA18, BTA20, and BTA28. This study also confirmed the importance of BTA14 for milk production traits. Additionally, many genomic regions with potential pleiotropic effects were identified. Numerous novel candidate genes were revealed: ARHGAP35, NPAS1, TMEM160, ZC3H4, SAE1, ZMIZ1, PPIF, LDB2, ABI3, SERPINB6, and SERPINB9 (LP); NIM1K, ZNF131, GABRG1, GABRA2, DCHS1, and SPIDR (MILK); NR6A1, OLFML2A, EXT2, POLD1, GOT1, and ETV6 (FAT); DPP6, LRRC26, and the KCN gene family (FAT%); CDC14A, RTCA, HSTN, and ODAM (PROT); and HERC3, HERC5, LALBA, CCL28, and NEURL1 (PROT%). These genes are involved in key biological pathways such as fatty acid homeostasis, transporter regulator activity, response to progesterone and estradiol, response to steroid hormones, and lactation. Lastly, another relevant finding was that a variety of genomic regions related to LP and milk production were previously associated with fertility traits in dairy cattle, confirming the links between these two groups of traits. Our findings contribute to a better understanding of the molecular mechanisms underlying the phenotypic expression of lactation persistency and milk production traits, which can be useful for improving the genomic evaluation of economically important traits in Holstein cattle.

Supplementary Materials

The following are available online at https://www.mdpi.com/article/10.3390/genes12111830/s1, Figure S1: Manhattan plots with truncated Y-axis for the GWAS results for milk yield (MILK), fat yield (FAT), fat percentage (FAT%), protein yield (PROT) and protein percentage (PROT%), based on imputed whole-genome sequence data. Figure S2: Gene interaction network for genes associated with lactation persistency in North American Holstein cattle. Table S1: Description of significant SNPs, associated gene IDs and names, and gene biotype for milk yield (MILK) in North American Holstein cattle. Table S2: Description of significant SNPs, associated gene IDs and names, and gene biotype for fat yield (FAT) in North American Holstein cattle. Table S3: Description of significant SNPs, associated gene IDs and names, and gene biotype for fat percentage (FAT%) in North American Holstein cattle. Table S4: Description of significant SNPs, associated gene IDs and names, and gene biotype for protein yield (PROT) in North American Holstein cattle. Table S5: Description of significant SNPs, associated gene IDs and names, and gene biotype for protein percentage (PROT%) in North American Holstein cattle. Table S6: Variants identified in variant effect predictor analysis for LP, MILK, FAT, FAT%, PROT, and PROT% in North American Holstein cattle. Table S7: Description of significant SNPs, associated gene IDs and names, and gene biotype for lactation persistency (LP) in North American Holstein cattle.

Author Contributions

Conceptualization of the study, V.B.P., F.S.S. and L.F.B.; preparation of data, V.B.P., F.S.S., S.-Y.C., M.G.M. and L.F.B.; methodology and statistical analyses, V.B.P., S.-Y.C. and L.F.B.; writing—original draft preparation, V.B.P. and L.F.B.; writing—review and editing, V.B.P., F.S.S., S.-Y.C., H.R.O., T.M.C., M.G.M. and L.F.B. All authors have read and agreed to the published version of the manuscript.

Funding

This study was partially funded by Purdue University as part of AgSEED Crossroads funding to support Indiana’s Agriculture and Rural Development.

Institutional Review Board Statement

No Animal Care Committee approval was necessary for the purposes of this study as all information required was obtained from pre-existing databases.

Informed Consent Statement

Not applicable.

Data Availability Statement

All the data supporting the results of this study are included in the article and in the Supplementary Materials Section.

Acknowledgments

The University of Guelph is a partner in the 1000 Bull Genomes Project and thanks the Project for the use of the full genome sequence genotypes. F. S. Schenkel acknowledges the financial support provided mainly by Agriculture and Agri-Food Canada, with additional contributions from Dairy Farmers of Canada, the Canadian Dairy Network, and the Canadian Dairy Commission under the Agri-Science Clusters Initiative. As per the research agreement, aside from providing financial support, the funders have no role in the design and conduct of the studies, data collection and analysis, or interpretation of the data. Researchers maintain independence in conducting their studies, own their data, and report the outcomes regardless of the results.

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. Han, B.; Yuan, Y.; Li, Y.; Liu, L.; Sun, D. Single nucleotide polymorphisms of NUCB2 and their genetic associations with milk production. Genes 2019, 10, 449.
  2. Brito, L.F.; Bedere, N.; Douhard, F.; Oliveira, H.R.; Arnal, M.; Peñagaricano, F.; Schinckel, A.P.; Baes, C.F.; Miglior, F. Genetic selection of high-yielding dairy cattle toward sustainable farming systems in a rapidly changing world. Animal 2021, 100292.
  3. Sehested, J.; Gaillard, C.; Lehmann, J.O.; Maciel, G.M.; Vestergaard, M.; Weisbjerg, M.R.; Mogensen, L.; Larsen, L.B.; Poulsen, N.A.; Kristensen, T. Extended lactation in dairy cattle. Animal 2019, 13, s65–s74.
  4. Gaines, W.L. Persistency of Lactation in Dairy Cows: A Preliminary Study of Certain Guernsey and Holstein Records; University of Illinois Agricultural Experiment Station: Urbana, IL, USA, 1927.
  5. Danell, B. Studies on lactation yield and individual test-day yields of Swedish dairy cows: IV. Extension of part-lactation records for use in sire evaluation. Acta Agric. Scand. 1982, 32, 103–114.
  6. Grossman, M.; Hartz, S.M.; Koops, W.J. Persistency of lactation yield: A novel approach. J. Dairy Sci. 1999, 82, 2192–2197.
  7. Cole, J.B.; Null, D.J. Genetic evaluation of lactation persistency for five breeds of dairy cattle. J. Dairy Sci. 2009, 92, 2248–2258.
  8. Dhakal, K.; Tiezzi, F.; Clay, J.S.; Maltecca, C. Causal relationships between clinical mastitis events, milk yields and lactation persistency in US Holsteins. Livest. Sci. 2016, 189, 8–16.
  9. Yamazaki, T.; Takeda, H.; Osawa, T.; Yamaguchi, S.; Hagiya, K. Genetic correlations among fertility traits and lactation persistency within and across Holstein herds with different milk production during the first three lactations. Livest. Sci. 2019, 219, 97–103.
  10. Loker, S.; Bastin, C.; Miglior, F. Genetic and environmental relationships between body condition score and milk production traits in Canadian Holsteins. J. Dairy Sci. 2012, 95, 410–419.
  11. Miglior, F.; Fleming, A.; Malchiodi, F.; Brito, L.F.; Martin, P.; Baes, C.F. A 100-year review: Identification and genetic selection of economically important traits in dairy cattle. J. Dairy Sci. 2017, 100, 10251–10271.
  12. Do, D.N.; Fleming, A.; Schenkel, F.S.; Miglior, F.; Zhao, X.; Ibeagha-Awemu, E.M. Genetic parameters of milk cholesterol content in Holstein cattle. Can. J. Anim. Sci. 2018, 98, 714–722.
  13. Oliveira, G.A., Jr.; Schenkel, F.S.; Alcantara, L.; Houlahan, K.; Lynch, C.; Baes, C.F. Estimated genetic parameters for all genetically evaluated traits in Canadian Holsteins. J. Dairy Sci. 2021, 104, 9002–9015.
  14. Cochran, S.D.; Cole, J.B.; Null, D.J.; Hansen, P.J. Discovery of single nucleotide polymorphisms in candidate genes associated with fertility and production traits in Holstein cattle. BMC Genet. 2013, 14, 49.
  15. Nayeri, S.; Sargolzaei, M.; Abo-Ismail, M.K.; May, N.; Miller, S.P.; Schenkel, F.; Moore, S.S.; Stothard, P. Genome-wide association for milk production and female fertility traits in Canadian dairy Holstein cattle. BMC Genet. 2016, 17, 75.
  16. Oliveira, H.R.; Cant, J.P.; Brito, L.F.; Feitosa, F.L.B.; Chud, T.C.S.; Fonseca, P.A.S.; Jamrozik, J.; Silva, F.F.; Lourenco, D.A.L.; Schenkel, F.S. Genome-wide association for milk production traits and somatic cell score in different lactation stages of Ayrshire, Holstein, and Jersey dairy cattle. J. Dairy Sci. 2019, 102, 8159–8174.
  17. Wang, D.; Ning, C.; Liu, J.; Zhang, Q.; Jiang, L. Association studies for milk production traits in Chinese Holstein by an efficient rotated linear mixed model. J. Dairy Sci. 2019, 102, 2378–2383.
  18. Van den Berg, I.; Boichard, D.; Lund, M.S. Comparing power and precision of within-breed and multibreed genome-wide association studies of production traits using whole-genome sequence data for 5 French and Danish dairy cattle breeds. J. Dairy Sci. 2016, 99, 8932–8945.
  19. Larmer, S.G.; Sargolzaei, M.; Brito, L.F.; Ventura, R.V.; Schenkel, F.S. Novel methods for genotype imputation to whole-genome sequence and a simple linear model to predict imputation accuracy. BMC Genet. 2017, 18, 120.
  20. Hayes, B.J.; MacLeod, I.M.; Daetwyler, H.D.; Bowman, P.J.; Chamberlain, A.J.; Vander Jagt, C.J.; Capitan, A.; Pausch, H.; Stothard, P.; Liao, X.; et al. Genomic prediction from whole genome sequence in livestock: The 1000 bull genomes project. In Proceedings of the World Congress of Genetics Applied to Livestock Production, Vancouver, BC, Canada, 2014; pp. 1–7.
  21. Chen, S.-Y.; Oliveira, H.R.; Schenkel, F.S.; Pedrosa, V.B.; Melka, M.G.; Brito, L.F. Using imputed whole-genome sequence variants to uncover candidate mutations and genes affecting milking speed and temperament in Holstein cattle. J. Dairy Sci. 2020, 103, 10383–10398.
  22. Moghaddar, N.; Khansefid, M.; van der Werf, J.H.J.; Bolormaa, S.; Duijvesteijn, N.; Clark, S.A.; Swan, A.A.; Daetwyler, H.D.; MacLeod, I.M. Genomic prediction based on selected variants from imputed whole-genome sequence data in Australian sheep populations. Genet. Sel. Evol. 2019, 51, 1–14.
  23. Van den Berg, S.; Vandenplas, J.; van Eeuwijk, F.A.; Bouwman, A.C.; Lopes, M.S.; Veerkamp, R.F. Imputation to whole-genome sequence using multiple pig populations and its use in genome-wide association studies. Genet. Sel. Evol. 2019, 51, 1–13.
  24. Talouarn, E.; Bardou, P.; Palhière, I.; Oget, C.; Clément, V.; Tosser-Klopp, G.; Rupp, R.; Robert-Granié, C. Genome wide association analysis on semen volume and milk yield using different strategies of imputation to whole genome sequence in French dairy goats. BMC Genet. 2020, 21, 19.
  25. Teissier, M.; Sanchez, M.P.; Boussaha, M.; Barbat, A.; Hoze, C.; Robert-Granie, C.; Croiseau, P. Use of meta-analyses and joint analyses to select variants in whole genome sequences for genomic evaluation: An application in milk production of French dairy cattle breeds. J. Dairy Sci. 2018, 101, 3126–3139.
  26. Xiang, R.; van den Berg, I.; MacLeod, I.M.; Daetwyler, H.D.; Goddard, M.E. Effect direction meta-analysis of GWAS identifies extreme, prevalent and shared pleiotropy in a large mammal. Commun. Biol. 2020, 3, 1–14.
  27. Van den Berg, I.; Xiang, R.; Jenko, J.; Pausch, H.; Boussaha, M.; Schrooten, C.; Tribout, T.; Gjuvsland, A.B.; Boichard, D.; Nordbø, Ø. Meta-analysis for milk fat and protein percentage using imputed sequence variant genotypes in 94,321 cattle from eight cattle breeds. Genet. Sel. Evol. 2020, 52, 1–16.
  28. Daetwyler, H.D.; Capitan, A.; Pausch, H.; Stothard, P.; Van Binsbergen, R.; Brøndum, R.F.; Liao, X.; Djari, A.; Rodriguez, S.C.; Grohs, C.; et al. Whole-genome sequencing of 234 bulls facilitates mapping of monogenic and complex traits in cattle. Nat. Genet. 2014, 46, 858–865.
  29. Sanchez, M.P.; Gion, A.G.; Croiseau, P.; Fritz, S.; Hozé, C.; Miranda, G.; Martin, P.; Leterrier, A.B.; Letaïef, R.; Rocha, D.; et al. Within-breed and multi-breed GWAS on imputed whole-genome sequence variants reveal candidate mutations affecting milk protein composition in dairy cattle. Genet. Sel. Evol. 2017, 49, 68.
  30. Tribout, T.; Croiseau, P.; Lefebvre, R.; Barbat, A.; Boussaha, M.; Fritz, S.; Boichard, D.; Hoze, C.; Sanchez, M.P. Confirmed effects of candidate variants for milk production, udder health, and udder morphology in dairy cattle. Genet. Sel. Evol. 2020, 52, 1–26.
  31. Bissonnette, N. Genetic association of variations in the osteopontin gene (SPP1) with lactation persistency in dairy cattle. J. Dairy Sci. 2018, 101, 456–461.
  32. Cole, J.B.; VanRaden, P.M. Genetic evaluation and best prediction of lactation persistency. J. Dairy Sci. 2006, 89, 2722–2728.
  33. Walsh, S.W.; Williams, E.J.; Evans, A.C.O. A review of the causes of poor fertility in high milk producing dairy cows. Anim. Reprod. Sci. 2011, 123, 127–138.
  34. Nayeri, S.; Sargolzaei, M.; Miller, S.; Schenkel, F.; Moore, S.S.; Stothard, P. Genome-wide association study for lactation persistency, female fertility, longevity, and lifetime profit index traits in Holstein dairy cattle. J. Dairy Sci. 2017, 100, 1246–1258.
  35. Yue, S.J.; Zhao, Y.Q.; Gu, X.R.; Yin, B.; Jiang, Y.L.; Wang, Z.H.; Shi, K.R. A genome-wide association study suggests new candidate genes for milk production traits in Chinese Holstein cattle. Anim. Genet. 2017, 48, 677–681.
  36. Do, D.N.; Bissonnette, N.; Lacasse, P.; Miglior, F.; Zhao, X.; Ibeagha-Awemu, E.M. A targeted genotyping approach to enhance the identification of variants for lactation persistency in dairy cows. J. Anim. Sci. 2019, 97, 4066–4075.
  37. Wang, T.; Li, J.; Gao, X.; Song, W.; Chen, C.; Yao, D.; Ma, J. Genome-wide association study of milk components in Chinese Holstein cows using single nucleotide polymorphism. Livest. Sci. 2020, 233, 103951.
  38. VanRaden, P.M.; Van Tassell, C.P.; Wiggans, G.R.; Sonstegard, T.S.; Schnabel, R.D.; Taylor, J.F.; Schenkel, F.S. Invited review: Reliability of genomic predictions for North American Holstein bulls. J. Dairy Sci. 2009, 92, 16–24.
  39. Sargolzaei, M.; Chesnais, J.P.; Schenkel, F.S. A new approach for efficient genotype imputation using information from relatives. BMC Genom. 2014, 15, 478.
  40. Larmer, S.G.; Sargolzaei, M.; Schenkel, F.S. Extent of linkage disequilibrium, consistency of gametic phase, and imputation accuracy within and across Canadian dairy breeds. J. Dairy Sci. 2014, 97, 3128–3141.
  41. May, K.; Sames, L.; Scheper, C.; König, S. Genomic loci and genetic parameters for uterine diseases in first-parity Holstein cows and associations with milk production and fertility. J. Dairy Sci. 2021.
  42. Klein, S.-L.; Scheper, C.; May, K.; König, S. Genetic and nongenetic profiling of milk β-hydroxybutyrate and acetone and their associations with ketosis in Holstein cows. J. Dairy Sci. 2020, 103, 10332–10346.
  43. Song, Y.; Xu, L.; Chen, Y.; Zhang, L.; Gao, H.; Zhu, B.; Niu, H.; Zhang, W.; Xia, J.; Gao, X. Genome-wide association study reveals the PLAG1 gene for knuckle, biceps and shank weight in Simmental beef cattle. PLoS ONE 2016, 11, e0168316.
  44. Purcell, S.; Neale, B.; Todd-brown, K.; Thomas, L.; Ferreira, M.A.R.; Bender, D.; Maller, J.; Sklar, P.; De Bakker, P.I.W.; Daly, M.J.; et al. REPORT PLINK: A tool set for whole-genome association and population-based linkage analyses. Am. J. Hum. Genet. 2007, 81, 559–575. [Google Scholar] [CrossRef] [Green Version]
  45. Yang, J.; Lee, S.H.; Goddard, M.E.; Visscher, P.M. REPORT GCTA: A tool for genome-wide complex trait analysis. Am. J. Hum. Genet. 2011, 88, 76–82. [Google Scholar] [CrossRef] [PubMed] [Green Version]
  46. Yang, J.; Zaitlen, N.A.; Goddard, M.E.; Visscher, P.M.; Price, A.L. Perspective: Advantages and pitfalls in the application of mixed-model association methods. Nat. Publ. Gr. 2014, 46, 100–106. [Google Scholar] [CrossRef] [Green Version]
  47. Prive, F.; Aschard, H.; Ziyatdinov, A.; Blum, M.G.B.; Timc-imag, L. Genetics and population analysis: Efficient analysis of large-scale genome-wide data with two R packages: Bigstatsr and bigsnpr. Bioinformatics 2018, 34, 2781–2787. [Google Scholar] [CrossRef] [PubMed] [Green Version]
  48. Johnson, R.C.; Nelson, G.W.; Troyer, J.L.; Lautenberger, J.A.; Kessing, B.D.; Winkler, C.A.; Brien, S.J.O. Accounting for multiple comparisons in a genome-wide association study (GWAS). BMC Genom. 2010, 11, 724. [Google Scholar] [CrossRef] [Green Version]
  49. Li, X.; Buitenhuis, A.J.; Lund, M.S.; Li, C.; Sun, D.; Zhang, Q.; Poulsen, N.A.; Su, G. Joint genome-wide association study for milk fatty acid traits in Chinese and Danish Holstein populations. J. Dairy Sci. 2015, 98, 8152–8163. [Google Scholar] [CrossRef] [Green Version]
  50. Goddard, M.E.; Hayes, B.J.; Meuwissen, T.H.E. Using the genomic relationship matrix to predict the accuracy of genomic selection. J. Anim. Breed. Genet. 2011, 128, 409–421. [Google Scholar] [CrossRef]
  51. Makanjuola, B.O.; Miglior, F.; Abdalla, E.A.; Schenkel, F.S.; Baes, C.F. Effect of genomic selection on rate of inbreeding and coancestry and effective population size of Holstein and Jersey cattle populations. J. Dairy Sci. 2020, 103, 5183–5199. [Google Scholar] [CrossRef]
  52. Wang, Z.; Shen, B.; Jiang, J.; Li, J.; Ma, L. Effect of sex, age and genetics on crossover interference in cattle. Sci. Rep. 2016, 6, 1–10. [Google Scholar] [CrossRef] [Green Version]
  53. Fonseca, P.A.S.; Suarez-Vega, A.; Marras, G.; Canóvas, Á. GALLO: An R package for genomic annotation and integration of multiple data sources in livestock for positional candidate loci. Giga Sci. 2020, 9, giaa149. [Google Scholar] [CrossRef] [PubMed]
  54. Hu, Z.-L.; Park, C.A.; Reecy, J.M. Building a livestock genetic and genomic information knowledgebase through integrative developments of Animal QTLdb and CorrDB. Nucleic Acids Res. 2019, 47, D701–D710. [Google Scholar] [CrossRef] [PubMed] [Green Version]
  55. McLaren, W.; Gil, L.; Hunt, S.E.; Riat, H.S.; Ritchie, G.R.S.; Thormann, A.; Flicek, P.; Cunningham, F. The ensembl variant effect predictor. Genome Biol. 2016, 17, 1–14. [Google Scholar] [CrossRef] [PubMed] [Green Version]
  56. Huang, D.W.; Sherman, B.T.; Zheng, X.; Yang, J.; Imamichi, T.; Stephens, R.; Lempicki, R.A. Extracting biological meaning from large gene lists with DAVID. Curr. Protoc. Bioinform. 2009, 27, 1–13. [Google Scholar] [CrossRef]
  57. Szklarczyk, D.; Franceschini, A.; Wyder, S.; Forslund, K.; Heller, D.; Huerta-Cepas, J.; Simonovic, M.; Roth, A.; Santos, A.; Tsafou, K.P. STRING v10: Protein–protein interaction networks, integrated over the tree of life. Nucleic Acids Res. 2015, 43, D447–D452. [Google Scholar] [CrossRef] [PubMed]
  58. Frischknecht, M.; Bapst, B.; Seefried, F.R.; Signer-hasler, H.; Garrick, D.; Stricker, C.; Consortium, I.; Fries, R.; Russ, I.; Sölkner, J.; et al. Genome-wide association studies of fertility and calving traits in Brown Swiss cattle using imputed whole-genome sequences. BMC Genom. 2017, 18, 910. [Google Scholar] [CrossRef]
  59. Liu, L.; Zhou, J.; Chen, C.J.; Zhang, J.; Wen, W.; Tian, J.; Zhang, Z.; Gu, Y. GWAS-based identification of new loci for milk yield, fat, and protein in Holstein cattle. Animals 2020, 10, 2048. [Google Scholar] [CrossRef]
  60. Ribeiro, E.S.; Monteiro, A.P.A.; Bisinotto, R.S.; Lima, F.S.; Greco, L.F.; Ealy, A.D. Conceptus development and transcriptome at preimplantation stages in lactating dairy cows of distinct genetic groups and estrous cyclic statuses. J. Dairy Sci. 2016, 99, 4761–4777. [Google Scholar] [CrossRef]
  61. Liu, A.; Wang, Y.; Sahana, G.; Zhang, Q.; Liu, L.; Lund, M.S. Genome-wide association studies for female fertility traits in Chinese and Nordic Holsteins. Sci. Rep. 2017, 7, 1–12. [Google Scholar] [CrossRef] [Green Version]
  62. Jiang, J.; Ma, L.; Prakapenka, D.; Vanraden, P.M.; Cole, J.B.; Cole, J.B. A large-scale genome-wide association study in US Holstein Cattle. Front. Genet. 2019, 10, 412. [Google Scholar] [CrossRef]
  63. Albarran-Portillo, B.; Pollott, G.E. The relationship between fertility and lactation characteristics in Holstein cows on United Kingdom commercial dairy farms. J. Dairy Sci. 2013, 96, 635–646. [Google Scholar] [CrossRef] [PubMed]
  64. Muir, B.L.; Fatehi, J.; Schaeffer, L.R. Genetic relationships between persistency and reproductive performance in first-lactation Canadian Holsteins. J. Dairy Sci. 2004, 87, 3029–3037. [Google Scholar] [CrossRef] [Green Version]
  65. Jakobsen, J.H.; Madsen, P.; Jensen, J.; Pedersen, J.; Christensen, L.G.; Sorensen, D.A. Genetic parameters for milk production and persistency for Danish Holsteins estimated in random regression models using REML. J. Dairy Sci. 2002, 85, 1607–1616. [Google Scholar] [CrossRef]
  66. Yamazaki, T.; Hagiya, K.; Takeda, H.; Yamaguchi, S.; Osawa, T.; Nagamine, Y. Genetic correlations among female fertility, 305-day milk yield and persistency during the first three lactations of Japanese Holstein cows. Livest. Sci. 2014, 168, 26–31. [Google Scholar] [CrossRef]
  67. Santos, D.J.A.; Cole, J.B.; Null, D.J.; Byrem, T.M.; Ma, L. Genetic and nongenetic profiling of milk pregnancy-associated glycoproteins in Holstein cattle. J. Dairy Sci. 2018, 101, 9987–10000. [Google Scholar] [CrossRef] [PubMed] [Green Version]
  68. Le Guillou, S.; Sdassi, N.; Laubier, J.; Passet, B.; Vilotte, M.; Castille, J.; Polyte, J.; Jaffre, F.; Cribiu, E.; Vilotte, J.; et al. Overexpression of miR-30b in the developing mouse mammary gland causes a lactation defect and delays involution. PLoS ONE 2012, 7, e45727. [Google Scholar] [CrossRef] [PubMed]
  69. Law, R.H.P.; Zhang, Q.; Mcgowan, S.; Buckle, A.M.; Silverman, G.A.; Wong, W.; Rosado, C.J.; Chris, G.; Pike, R.N.; Bird, P.I.; et al. An overview of the serpin superfamily. Genome Biol. 2006, 7, 1–11. [Google Scholar] [CrossRef] [PubMed] [Green Version]
  70. De Camargo, G.M.F.; Aspilcueta-borquis, R.R.; Cardoso, D.F.; Santos, D.J.A. Prospecting major genes in dairy buffaloes. BMC Genom. 2015, 16, 872. [Google Scholar] [CrossRef] [Green Version]
  71. Soares, R.A.N.; Vargas, G.; Duffield, T.; Schenkel, F.; Squires, J. Genome-wide association study and functional analyses for clinical and subclinical ketosis in Holstein cattle. J. Dairy Sci. 2021, 104, 1–14. [Google Scholar] [CrossRef]
  72. Oliveira, H.R.; Lourenco, D.A.L.; Masuda, Y.; Misztal, I.; Tsuruta, S.; Jamrozik, J.; Brito, L.F.; Silva, F.F.; Cant, J.P.; Schenkel, F.S. Single-step genome-wide association for longitudinal traits of Canadian Ayrshire, Holstein, and Jersey dairy cattle. J. Dairy Sci. 2019, 102, 9995–10011. [Google Scholar] [CrossRef]
  73. Clancey, E.; Kiser, J.N.; Moraes, J.G.N.; Dalton, J.C.; Spencer, T.E.; Neibergs, H.L. Genome-wide association analysis and gene set enrichment analysis with SNP data identify genes associated with 305-day milk yield in Holstein dairy cows. Anim. Genet. 2019, 9, 254–258. [Google Scholar] [CrossRef] [PubMed]
  74. Atashi, H.; Crowe, M.; Salavati, M.; De Koster, J.; Ehrlich, J.; Crowe, M.; Opsomer, G.; Hostens, M. Genome-wide association for milk production and lactation curve parameters in Holstein dairy cows. J. Anim. Breed. Genet. 2020, 137, 292–304. [Google Scholar] [CrossRef] [PubMed]
  75. Raven, L.; Cocks, B.G.; Hayes, B.J. Multibreed genome wide association can improve precision of mapping causative variants underlying milk production in dairy cattle. BMC Genom. 2014, 15, 62. [Google Scholar] [CrossRef] [PubMed] [Green Version]
  76. Pradeep, J.; Monika, S.; Ankita, S.; Ks, U.; Amit, K.; Ashok, M.; Bp, M.; Sandeep, M.; Rs, K.; Kaushik, J.; et al. Expression analysis of solute carrier (SLC2A) genes in milk derived mammary epithelial cells during different stages of lactation in Sahiwal (Bos indicus) cows advances in dairy research. Adv. Dairy Res. 2014, 2, 2–7. [Google Scholar] [CrossRef] [Green Version]
  77. Banos, G.; Clark, E.L.; Id, S.J.B.; Dutta, P.; Id, G.B.; Arsenos, G.; Hume, D.A.; Id, A.P. Genetic and genomic analyses underpin the feasibility of concomitant genetic improvement of milk yield and mastitis resistance in dairy sheep. PLoS ONE 2019, 14, e0214346. [Google Scholar] [CrossRef] [Green Version]
  78. Meredith, B.K.; Kearney, F.J.; Finlay, E.K.; Bradley, D.G.; Fahey, A.G.; Berry, D.P.; Lynn, D.J. Genome-wide associations for milk production and somatic cell score in Holstein-Friesian cattle in Ireland. BMC Genet. 2012, 13, 21. [Google Scholar] [CrossRef] [Green Version]
  79. Buitenhuis, B.; Janss, L.L.G.; Poulsen, N.A.; Larsen, L.B.; Larsen, M.K.; Sørensen, P. Genome-wide association and biological pathway analysis for milk-fat composition in Danish Holstein and Danish Jersey cattle. BMC Genom. 2014, 15, 1112. [Google Scholar] [CrossRef] [Green Version]
  80. Buitenhuis, B.; Poulsen, N.A.; Gebreyesus, G.; Larsen, L.B. Estimation of genetic parameters and detection of chromosomal regions affecting the major milk proteins and their post translational modifications in Danish Holstein and Danish Jersey cattle. BMC Genet. 2016, 17, 114. [Google Scholar] [CrossRef] [Green Version]
  81. Cole, J.B.; Wiggans, G.R.; Ma, L.; Sonstegard, T.S., Jr.; Lawlor, T.J.; Crooker, B.A.; Van Tassell, C.P.; Yang, J.; Wang, S.; Matukumalli, L.K.; et al. Genome-wide association analysis of thirty one production, health, reproduction and body conformation traits in contemporary US Holstein cows. BMC Genom. 2011, 12, 408. [Google Scholar] [CrossRef] [Green Version]
  82. Zhou, J.; Liu, L.; Chen, C.J.; Zhang, M.; Lu, X.; Zhang, Z. Genome-wide association study of milk and reproductive traits in dual-purpose Xinjiang Brown cattle. BMC Genom. 2019, 20, 827. [Google Scholar] [CrossRef] [Green Version]
  83. Zhang, X.; Li, C.; Li, X.; Liu, Z.; Ni, W.; Cao, Y.; Yao, Y.; Islamov, E. Association analysis of polymorphism in the NR6A1 gene with the lumbar vertebrae number traits in sheep. Genes Genom. 2019, 41, 1165–1171. [Google Scholar] [CrossRef] [PubMed]
  84. Klomtong, P.; Chaweewan, K.; Phasuk, Y.; Duangjinda, M. genetic differentiation in Thai native, wild boars, and Duroc and Chinese Meishan pigs. Genet. Mol. Res. 2015, 14, 12723–12732. [Google Scholar] [CrossRef] [PubMed]
  85. Tokunaga, M.; Inoue, M.; Jiang, Y.; Barnes, R.H., II; Buchner, D.A.; Chun, T.-H. Fat depot-specific gene signature and ECM remodeling of Sca1high adipose-derived stem cells. Matrix Biol. 2014, 36, 28–38. [Google Scholar] [CrossRef] [PubMed] [Green Version]
  86. Buitenhuis, B.; Poulsen, N.A.; Larsen, L.B.; Sehested, J. Estimation of genetic parameters and detection of quantitative trait loci for minerals in Danish Holstein and Danish Jersey milk. BMC Genet. 2015, 16, 52. [Google Scholar] [CrossRef]
  87. Grisart, B.; Coppieters, W.; Farnir, F.; Karim, L.; Ford, C.; Berzi, P.; Cambisano, N.; Mni, M.; Reid, S.; Simon, P.; et al. Positional candidate cloning of a QTL in dairy cattle: Identification of a missense mutation in the bovine DGAT1 gene with major effect on milk yield and composition. Genome Res. 2002, 12, 222–231. [Google Scholar] [CrossRef] [Green Version]
  88. Cai, Z.; Guldbrandtsen, B.; Lund, M.S.; Sahana, G. Dissecting closely linked association signalsin combination with the mammalianphenotype database can identify candidategenes in dairy cattle. BMC Genet. 2018, 19, 15. [Google Scholar]
  89. Palombo, V.; Milanesi, M.; Sgorlon, S.; Capomaccio, S.; Mele, M.; Nicolazzi, E. Genome-wide association study of milk fatty acid composition in Italian Simmental and Italian Holstein cows using single nucleotide polymorphism arrays. J. Dairy Sci. 2018, 101, 11004–11019. [Google Scholar] [CrossRef] [Green Version]
  90. Frischknecht, M.; Pausch, H.; Bapst, B.; Signer-hasler, H.; Flury, C.; Garrick, D.; Stricker, C.; Fries, R.; Gredler-grandl, B. Highly accurate sequence imputation enables precise QTL mapping in Brown Swiss cattle. BMC Genom. 2017, 18, 999. [Google Scholar] [CrossRef] [PubMed] [Green Version]
  91. Ning, C.; Kang, H.; Zhou, L.; Wang, D.; Wang, H.; Wang, A.; Fu, J. Performance gains in genome-wide association studies for longitudinal traits via modeling time-varied effects. Sci. Rep. 2017, 1–12. [Google Scholar] [CrossRef] [Green Version]
  92. Fang, Z.-H.; Pausch, H. Multi-trait meta-analyses reveal 25 quantitative trait loci for economically important traits in Brown Swiss cattle. BMC Genom. 2019, 20, 695. [Google Scholar] [CrossRef] [Green Version]
  93. Cai, Z.; Dusza, M.; Guldbrandtsen, B.; Lund, M.S.; Sahana, G. Distinguishing pleiotropy from linked QTL between milk production traits and mastitis resistance in Nordic Holstein cattle. Genet. Sel. Evol. 2020, 52, 1–15. [Google Scholar] [CrossRef] [Green Version]
  94. Fonseca, I.; Cardoso, F.F.; Higa, R.H.; Giachetto, P.F.; Brandão, H.M.; Brito, M.A.V.P.; Ferreira, M.B.D.; Guimarães, S.E.F.; Martins, M.F. Gene expression profile in zebu dairy cows (Bos taurus indicus) with mastitis caused by Streptococcus agalactiae. Livest. Sci. 2015, 180, 47–57. [Google Scholar] [CrossRef] [Green Version]
  95. Koh, Y.; Peiris, H.; Vaswani, K.; Almughlliq, F.; Meier, S.; Burke, C.; Mitchell, M. Exosomes from dairy cows of divergent fertility; Action on endometrial cells. J. Reprod. Immunol. 2020, 137, 102624. [Google Scholar] [CrossRef] [PubMed]
  96. Marcos-Carcavilla, A.; Calvo, J.H.; González, C.; Moazami-Goudarzi, K.; Laurent, P.; Bertaud, M.; Hayes, H.; Alves, M.E.F.; Serrano, M. Short communication: IL-1 family members as possible candidate genes affencting economically important traits in cattle. Span. J. Agric. Res. 2007, 5, 38–42. [Google Scholar] [CrossRef] [Green Version]
  97. Yu, G.I.; Song, D.K.; Shin, D. Associations of IL1RAP and IL1RL1 gene polymorphisms with obesity and inflammation mediators. Inflamm. Res. 2020, 69, 191–202. [Google Scholar] [CrossRef]
  98. Ogorevc, J.; Kunej, T.; Razpet, A.; Dovc, P. Database of cattle candidate genes and genetic markers for milk production and mastitis. Anim. Genet. 2009, 40, 832–851. [Google Scholar] [CrossRef] [PubMed] [Green Version]
  99. Kaniyamattam, K.; De Vries, A. Agreement between milk fat, protein, and lactose observations collected from the Dairy Herd Improvement Association (DHIA) and a real-time milk analyzer. J. Dairy Sci. 2014, 97, 2896–2908. [Google Scholar] [CrossRef] [PubMed]
  100. Mastrangelo, S.; Ben, S.; Elena, J.; Gianluca, C.; Moscarelli, A.; Boussaha, M.; Montedoro, M.; Pilla, F.; Cassandro, M. Genome-wide detection of signatures of selection in three Valdostana cattle populations. J. Anim. Breed. Genet. 2020, 137, 609–621. [Google Scholar] [CrossRef]
  101. Liu, J.J.; Liang, A.X.; Campanile, G.; Plastow, G.; Zhang, C.; Wang, Z.; Salzano, A.; Gasparrini, B. Genome-wide association studies to identify quantitative trait loci affecting milk production traits in water buffalo. J. Dairy Sci. 2018, 101, 433–444. [Google Scholar] [CrossRef] [Green Version]
  102. Raschia, M.A.; Nani, J.P.; Carignano, H.A.; Amadio, A.F.; Maizon, D.O.; Poli, M.A.; Nacional, I.; Agropecuaria, D.T.; De Genética, I.; Favret, E.A.; et al. Weighted single-step genome-wide association analyses for milk traits in Holstein and Holstein x Jersey crossbred dairy cattle. Livest. Sci. 2020, 242, 104294. [Google Scholar] [CrossRef]
  103. Jena, M.K.; Mohanty, A.K. New insights of mammary gland during different stages of development. Asian J. Pharm. Clin. Res. 2017, 10, 35–40. [Google Scholar] [CrossRef] [Green Version]
  104. Lin, S.; Wan, Z.; Zhang, J.; Xu, L.; Han, B.; Sun, D. Genome-wide association studies for the concentration of albumin in colostrum and serum in Chinese Holstein. Animals 2020, 10, 2211. [Google Scholar] [CrossRef]
  105. Jiménez-González, V.; Ogalla-García, E.; García-Quintanilla, M.; García-Quintanilla, A. Deciphering GRINA/Lifeguard1: Nuclear location, ca2+ homeostasis and vesicle transport. Int. J. Mol. Sci. 2019, 20, 4005. [Google Scholar] [CrossRef] [PubMed] [Green Version]
  106. Kaufmann, M.; Feijis, K.; Lüscher, B. Endogenous ADP-Ribosylation. 2014. Available online: https://link.springer.com/chapter/10.1007/82_2014_379 (accessed on 30 October 2021).
  107. Zhou, C.; Li, C.; Cai, W.; Liu, S.; Yin, H.; Shi, S.; Zhang, Q. Genome-wide association study for milk protein composition traits in a Chinese Holstein population using a single-step approach. Front. Genet. 2019, 10, 1–17. [Google Scholar] [CrossRef] [Green Version]
  108. Do, D.N.; Bissonnette, N.; Lacasse, P.; Miglior, F.; Sargolzaei, M.; Zhao, X. Genome-wide association analysis and pathways enrichment for lactation persistency in Canadian Holstein cattle. J. Dairy Sci. 2017, 100, 1955–1970. [Google Scholar] [CrossRef] [PubMed] [Green Version]
  109. Mohan, T.; Deng, L.; Wang, B.-Z. CCL28 chemokine: An anchoring point bridging innate and adaptive immunity. Int. Immunopharmacol. 2017, 51, 165–170. [Google Scholar] [CrossRef]
  110. Tomazi, T.; Gonçalves, J.L.; Barreiro, J.R.; Arcari, M.A.; Dos Santos, M.V. Bovine subclinical intramammary infection caused by coagulase-negative staphylococci increases somatic cell count but has no effect on milk yield or composition. J. Dairy Sci. 2015, 98, 3071–3078. [Google Scholar] [CrossRef] [Green Version]
  111. Huang, W.; Peñagaricano, F.; Ahmad, K.R.; Lucey, J.A.; Weigel, K.A.; Khatib, H. Association between milk protein gene variants and protein composition traits in dairy cattle. J. Dairy Sci. 2012, 95, 440–449. [Google Scholar] [CrossRef]
  112. Yang, F.; Zhang, M.; Rong, Y.; Liu, Z.; Yang, S.; Zhang, W. A novel SNPs in alpha-lactalbumin gene effects on lactation traits in Chinese Holstein dairy cows. Animals 2020, 10, 60. [Google Scholar] [CrossRef] [PubMed] [Green Version]
  113. Raven, L.; Cocks, B.G.; Kemper, K.E.; Chamberlain, A.J.; Vander, C.J.; Michael, J.; Hayes, B.J. Targeted imputation of sequence variants and gene expression profiling identifies twelve candidate genes associated with lactation volume, composition and calving interval in dairy cattle. Mamm. Genome 2016, 27, 81–97. [Google Scholar] [CrossRef]
  114. Du, C.; Deng, T.; Zhou, Y.; Ye, T.; Zhou, Z.; Zhang, S.; Shao, B.; Wei, P.; Sun, H.; Khan, F.A.; et al. Systematic analyses for candidate genes of milk production traits in water buffalo (Bubalus Bubalis). Anim. Genet. 2019, 50, 207–216. [Google Scholar] [CrossRef] [PubMed]
  115. Laodim, T.; Elzo, M.A.; Koonawootrittriron, S.; Suwanasopee, T.; Jattawa, D. Genomic-polygenic and polygenic predictions for milk yield, fat yield, and age at first calving in Thai multibreed dairy population using genic and functional sets of genotypes. Livest. Sci. 2019, 219, 17–24. [Google Scholar] [CrossRef]
  116. Mrode, R.; Ojango, J.M.K.; Okeyo, A.M.; Mwacharo, J.M. Genomic selection and use of molecular tools in breeding programs for indigenous and crossbred cattle in developing countries: Current status and future prospects. Front. Genet. 2019, 9, 694. [Google Scholar] [CrossRef] [PubMed]
  117. Rong, L.; Qing-Zhang, L.; Jian-Guo, H.; Li-min, L.; Xue-Jun, G. Proteomic identification of differentially expressed proteins in Vaccaria segetalis-treated dairy cow mammary epithelial cells. J. Northeast Agric. Univ. 2013, 20, 24–31. [Google Scholar]
Figure 1. Manhattan plots of the GWAS results for milk yield (MILK), fat yield (FAT), fat percentage (FAT%), protein yield (PROT), protein percentage (PROT%), and lactation persistency (LP) based on imputed whole-genome sequence data. Statistically significant SNPs are represented by red dots.
Figure 2. Number of genes and list of common genes associated with lactation persistency (a) and shared between production traits (b), considering milk yield (MILK), fat yield (FAT), fat percentage (FAT%), protein yield (PROT), protein percentage (PROT%), and lactation persistency (LP).
Figure 3. Gene interaction network for the central genes associated with lactation persistency, milk yield, fat yield, fat percentage, protein yield, and protein percentage in North American Holstein cattle.
Table 1. Summary of the pseudo-phenotypes (de-regressed estimated breeding values, dEBVs) for milk yield (MILK), fat yield (FAT), protein yield (PROT), fat percentage (FAT%), protein percentage (PROT%), and lactation persistency (LP) in North American Holstein cattle.
| Trait | Sample Size | dEBV Mean | dEBV Min | dEBV Max | dEBV SD | Reliability Mean | Reliability SD |
|---|---|---|---|---|---|---|---|
| MILK | 8264 | −1507.99 | −58,315.8 | 89,084.9 | 13,218.18 | 63.3 | 9.5 |
| FAT | 8262 | −57.62 | −3416.98 | 5254.22 | 893.60 | 64.7 | 9.9 |
| PROT | 8263 | −55.87 | −2124.84 | 2479.37 | 526.11 | 60.2 | 8.9 |
| FAT% | 8262 | 0.27 | −37.47 | 40.01 | 9.61 | 64.7 | 9.9 |
| PROT% | 8263 | 0.11 | −17.91 | 17.96 | 4.86 | 60.2 | 8.9 |
| LP | 3447 | 95.68 | −810.60 | 1008.61 | 486.37 | 36.7 | 6.8 |
Min: minimum; Max: maximum; SD: standard deviation.
Table 2. Description of SNPs, most significant genes, and number of QTLs significantly associated with milk yield (MILK), fat yield (FAT), and fat percentage (FAT%) in North American Holstein cattle.
| Trait | N ¹ | Chr ² | Position (bp) | p-Value | Genes (within ±100 kb) | QTL ³ | QTL_Type ⁴ |
|---|---|---|---|---|---|---|---|
| MILK | 8 | 1 | 141,861,836 | 2.69 × 10^−6 | ENSBTAG00000050235, ENSBTAG00000045771, ENSBTAG00000046015 | 1 | Meat/Carcass |
| MILK | 2 | 2 | 80,496,336 | 9.86 × 10^−6 | CAVIN2 | 0 | - |
| MILK | 6 | 3 | 26,833,328 | 2.52 × 10^−6 | CD58, ATP1A1 | 3 | Reproduction |
| MILK | 16 | 4 | 116,598,211 | 2.28 × 10^−5 | DPP6 | 4 | Milk |
| MILK | 871 | 5 | 93,545,576 | 3.62 × 10^−12 | MGST1, SLC15A5 | 357 | Milk |
| MILK | 579 | 6 | 86,348,368 | 2.27 × 10^−10 | MOB1B, DCK, SLC4A4 | 69 | Milk |
| MILK | 28 | 7 | 5,452,360 | 7.17 × 10^−6 | B3GNT3, FCHO1, MAP1S, UNC13A, COLGALT1 | 10 | Reproduction |
| MILK | 24 | 8 | 36,048,320 | 1.30 × 10^−7 | bta-mir-2285bg, PTPRD | 1 | Meat/Carcass |
| MILK | 91 | 9 | 61,950,514 | 1.39 × 10^−7 | SPACA1 | 0 | - |
| MILK | 20 | 10 | 87,491,769 | 5.67 × 10^−8 | GPATCH2L | 1 | Reproduction |
| MILK | 769 | 11 | 103,587,292 | 5.24 × 10^−8 | NACC2, TMEM250, LHX3 | 4 | Milk |
| MILK | 1 | 12 | 72,628,679 | 1.02 × 10^−5 | ENSBTAG00000023309 | 0 | - |
| MILK | 37 | 13 | 6,976,693 | 2.95 × 10^−10 | ISM1, TASP1 | 15 | Exterior |
| MILK | 5564 | 14 | 465,742 | 5.98 × 10^−196 | ARHGAP39, bta-mir-2308, C14H8orf82, LRRC24, LRRC14, RECQL4, MFSD3, GPT, PPP1R16A, FOXH1, KIFC2, CYHR1, TONSL, VPS28, ENSBTAG00000053637, SLC39A4, CPSF1, ADCK5 | 765 | Milk |
| MILK | 1012 | 15 | 54,030,861 | 2.13 × 10^−9 | POLD3, CHRDL2, RNF169, U6, ENSBTAG00000054207, ENSBTAG00000042319 | 3 | Milk |
| MILK | 308 | 16 | 64,637,424 | 9.60 × 10^−9 | SMG7, NCF2, ARPC5, APOBEC4 | 1 | Health |
| MILK | 35 | 18 | 56,457,916 | 5.07 × 10^−6 | IZUMO2, ENSBTAG00000053322, MYH14, KCNC3, NAPSA, ENSBTAG00000048283, NR1H2, POLD1 | 37 | Milk |
| MILK | 132 | 19 | 8,625,414 | 1.77 × 10^−6 | CCDC182, ENSBTAG00000045351, ENSBTAG00000051336, MRPS23, CUEDC1 | 15 | Reproduction |
| MILK | 1867 | 20 | 31,609,872 | 3.36 × 10^−32 | NIM1K, ENSBTAG00000042376, ZNF131, ENSBTAG00000054352, ENSBTAG00000052195, ENSBTAG00000051111 | 27 | Milk |
| MILK | 27 | 21 | 33,151,073 | 6.13 × 10^−6 | ENSBTAG00000051111, ODF3L1, CSPG4, SNX33 | 2 | Health |
| MILK | 6 | 22 | 21,859,149 | 4.59 × 10^−6 | bta-mir-2285am, U6, SUMF1 | 0 | - |
| MILK | 436 | 23 | 12,589,558 | 6.37 × 10^−8 | GLO1, DNAH8 | 0 | - |
| MILK | 6 | 24 | 21,019,665 | 7.60 × 10^−6 | MOCOS, ELP2, SLC39A6, RPRD1A | 0 | - |
| MILK | 40 | 28 | 18,592,444 | 1.03 × 10^−6 | ZNF365 | 5 | Milk |
| MILK | 16 | 29 | 9,512,241 | 6.55 × 10^−6 | PICALM | 9 | Milk |
| FAT | 30 | 1 | 126,143,578 | 8.14 × 10^−8 | ENSBTAG00000038111, PCOLCE2 | 3 | Reproduction |
| FAT | 46 | 2 | 67,353,179 | 3.19 × 10^−6 | DPP10, ENSBTAG00000050341 | 0 | - |
| FAT | 15 | 4 | 28,468,986 | 5.19 × 10^−8 | POLR1F, ENSBTAG00000050341, TMEM196 | 0 | - |
| FAT | 1580 | 5 | 93,627,511 | 8.00 × 10^−26 | SLC15A5 | 167 | Milk |
| FAT | 31 | 6 | 36,205,216 | 1.00 × 10^−7 | HERC3, PIGY, HERC5 | 202 | Milk |
| FAT | 18 | 7 | 21,215,200 | 5.74 × 10^−6 | GADD45B, LMNB2, TIMM13, TMPRSS9, SPPL2B, LSM7, LINGO3, PEAK3, OAZ1 | 23 | Reproduction |
| FAT | 8 | 8 | 42,176,903 | 6.89 × 10^−6 | ENSBTAG00000051041 | 7 | Milk |
| FAT | 26 | 10 | 86,711,416 | 4.16 × 10^−6 | JDP2, bta-mir-10162, BATF, FLVCR2 | 3 | Health |
| FAT | 1275 | 11 | 95,801,995 | 2.86 × 10^−9 | NR6A1, bta-mir-181a-2, bta-mir-181b-2, OLFML2A, U6, WDR38, RPL35, ARPC5L, GOLGA1 | 1 | Production |
| FAT | 12 | 12 | 69,122,129 | 1.67 × 10^−5 | ENSBTAG00000051519, 5S_rRNA, TGDS, GPR180, U2, SOX21 | 1 | Exterior |
| FAT | 322 | 13 | 54,614,993 | 1.43 × 10^−7 | DIDO1, TCFL5, COL9A3, OGFR, MRGBP, NTSR1, SLCO4A1, ENSBTAG00000051754, ENSBTAG00000051012 | 3 | Milk |
| FAT | 4490 | 14 | 465,742 | 2.08 × 10^−183 | ARHGAP39, bta-mir-2308, C14H8orf82, LRRC24, LRRC14, RECQL4, MFSD3, GPT, PPP1R16A, FOXH1, KIFC2, CYHR1, TONSL, VPS28, ENSBTAG00000053637, SLC39A4, CPSF1, ADCK5 | 765 | Milk |
| FAT | 181 | 15 | 74,165,872 | 1.09 × 10^−9 | ACCSL, ACCS, EXT2 | 6 | Reproduction |
| FAT | 260 | 16 | 61,287,081 | 9.53 × 10^−9 | CEP350, QSOX1 | 0 | - |
| FAT | 12 | 18 | 60,972,942 | 2.06 × 10^−5 | bta-mir-371, NLRP12, MGC157082, ENSBTAG00000014953 | 50 | Milk |
| FAT | 35 | 19 | 34,952,167 | 1.85 × 10^−6 | NT5M, COPS3, FLCN, PLD6, MPRIP | 7 | Production |
| FAT | 18 | 22 | 27,251,453 | 4.27 × 10^−6 | CNTN3 | 3 | Health |
| FAT | 13 | 23 | 11,102,541 | 3.20 × 10^−6 | ENSBTAG00000048838, PIM1, ENSBTAG00000045936, TMEM217, TBC1D22B | 0 | - |
| FAT | 639 | 26 | 21,354,112 | 1.07 × 10^−9 | PKD2L1, SCD, bta-mir-12016, WNT8B, SEC31B, NDUFB8, HIF1AN | 296 | Milk |
| FAT | 6 | 28 | 34,969,442 | 7.51 × 10^−6 | ZMIZ1, PPIF, ZCCHC24 | 2 | Production |
| FAT | 20 | 29 | 49,353,940 | 1.30 × 10^−6 | TSPAN32, ASCL2, TH, INS, IGF2 | 8 | Milk |
| FAT% | 8 | 1 | 1,025,407 | 1.89 × 10^−5 | RCAN1, KCNE1, ENSBTAG00000026259, ENSBTAG00000051226, FAM243A, SMIM11A, KCNE2 | 6 | Milk |
| FAT% | 47 | 2 | 128,607,054 | 8.79 × 10^−7 | STPG1, GRHL3, U6 | 2 | Milk |
| FAT% | 29 | 4 | 106,456,105 | 6.10 × 10^−6 | ENSBTAG00000049510, ENSBTAG00000048380, ENSBTAG00000053286, OR6V1, ENSBTAG00000052365, PIP, ENSBTAG00000050494 | 0 | - |
| FAT% | 2070 | 5 | 93,627,511 | 3.55 × 10^−34 | SLC15A5 | 167 | Milk |
| FAT% | 41 | 6 | 36,205,216 | 5.31 × 10^−12 | HERC3, PIGY, HERC5 | 202 | Milk |
| FAT% | 19 | 7 | 21,215,200 | 2.60 × 10^−5 | GADD45B, LMNB2, TIMM13, TMPRSS9, SPPL2B, LSM7, LINGO3, PEAK3, OAZ1 | 23 | Reproduction |
| FAT% | 11 | 8 | 88,461,261 | 8.40 × 10^−8 | GADD45G | 10 | Reproduction |
| FAT% | 20 | 9 | 61,950,514 | 9.91 × 10^−7 | SPACA1 | 0 | - |
| FAT% | 3 | 10 | 77,587,306 | 4.17 × 10^−6 | FUT8 | 4 | Meat/Carcass |
| FAT% | 2365 | 11 | 105,500,024 | 6.21 × 10^−6 | COL5A1, FCN1, ENSBTAG00000054425, OLFM1, ENSBTAG00000052600 | 15 | Milk |
| FAT% | 6 | 12 | 70,194,006 | 2.47 × 10^−6 | ENSBTAG00000047383 | 0 | - |
| FAT% | 161 | 13 | 47,813,637 | 3.64 × 10^−7 | GPCPD1, ENSBTAG00000054005, ENSBTAG00000051557 | 7 | Milk |
| FAT% | 6274 | 14 | 465,742 | 2.82 × 10^−317 | ARHGAP39, bta-mir-2308, C14H8orf82, LRRC24, LRRC14, RECQL4, MFSD3, GPT, PPP1R16A, FOXH1, KIFC2, CYHR1, TONSL, VPS28, ENSBTAG00000053637, SLC39A4, CPSF1, ADCK5 | 765 | Milk |
| FAT% | 789 | 15 | 51,993,618 | 9.14 × 10^−9 | CLPB, PDE2A, ENSBTAG00000050827 | 2 | Production |
| FAT% | 714 | 16 | 61,059,994 | 7.32 × 10^−10 | FAM163A, TOR1AIP2, TOR1AIP1, U6 | 11 | Reproduction |
| FAT% | 33 | 17 | 65,230,489 | 1.97 × 10^−6 | KIAA1671, 7SK, ENSBTAG00000053952, CRYBB3, CRYBB2 | 18 | Reproduction |
| FAT% | 215 | 19 | 35,457,767 | 2.41 × 10^−7 | KCNJ12, UTP18, MBTD1 | 0 | - |
| FAT% | 1293 | 20 | 29,991,518 | 2.95 × 10^−23 | ENSBTAG00000054476, MRPS30 | 179 | Milk |
| FAT% | 6 | 25 | 11,488,475 | 4.73 × 10^−6 | CPPED1 | 0 | - |
| FAT% | 8 | 26 | 51,156,653 | 5.45 × 10^−6 | INPP5A, ENSBTAG00000051139, BNIP3, ENSBTAG00000050527 | 2 | Reproduction |
| FAT% | 3 | 27 | 36,400,257 | 1.07 × 10^−5 | 5S_rRNA, GOLGA7, GINS4 | 27 | Milk |
| FAT% | 13 | 28 | 26,978,779 | 1.00 × 10^−5 | ADAMTS14, TBATA, SGPL1, ENSBTAG00000054819, PCBD1 | 18 | Milk |
¹ Number of genes present in each chromosome; ² Chr = chromosome; ³ QTL = number of QTL previously reported in Animal QTLdb; ⁴ QTL_type = main type of QTL trait group previously identified.
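The "Genes (within ±100 kb)" column reports genes located near the most significant SNP on each chromosome. As a minimal sketch of how such a window lookup can work, the Python example below uses a simple interval-overlap rule. The `Gene` class, the `genes_near_snp` function, and the toy gene names and coordinates are hypothetical and serve only as illustration; this is not the annotation pipeline used in the study.

```python
from dataclasses import dataclass

WINDOW_BP = 100_000  # ±100 kb, matching the window stated in the table header


@dataclass(frozen=True)
class Gene:
    name: str
    chrom: int
    start: int  # bp, inclusive
    end: int    # bp, inclusive


def genes_near_snp(genes, chrom, position, window=WINDOW_BP):
    """Return names of genes overlapping [position - window, position + window]."""
    lo, hi = position - window, position + window
    return [
        g.name
        for g in genes
        if g.chrom == chrom and g.start <= hi and g.end >= lo
    ]


# Hypothetical genes and coordinates, for illustration only.
toy_genes = [
    Gene("GENE_A", 5, 93_500_000, 93_520_000),  # overlaps the window
    Gene("GENE_B", 5, 93_700_000, 93_750_000),  # starts beyond the window
    Gene("GENE_C", 6, 93_540_000, 93_560_000),  # different chromosome
]

hits = genes_near_snp(toy_genes, chrom=5, position=93_545_576)
print(hits)
```

A gene counts as a hit when any part of it overlaps the window, so a long gene whose body starts far upstream can still be reported if it extends into the ±100 kb interval.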
Table 3. Description of SNPs, most significant genes, and number of QTLs significantly associated with protein yield (PROT), protein percentage (PROT%), and lactation persistency (LP) in North American Holstein cattle.
Table 3. Description of SNPs, most significant genes, and number of QTLs significantly associated with protein yield (PROT), protein percentage (PROT%), and lactation persistency (LP) in North American Holstein cattle.
TRAITN 1Chr 2Position (bp)p-ValueGenes (Within ±100 Kb)QTL 3QTL_Type 4
PROT2361117,208,3941.39 × 10−7CLRN11Reproduction
PROT1221,465,2078.33 × 10−6AMER33Reproduction
PROT16342,939,3268.31 × 10−8CDC14A, ENSBTAG00000054319, ENSBTAG00000015759, RTCA0-
PROT614106,606,4198.84 × 10−6ENSBTAG00000050494, TAS2R39, TAS2R40, GSTK1, TMEM139, CASP2, CLCN10-
PROT72591,526,3055.46 × 10−6PIK3C2G, ENSBTAG0000004617823Milk
PROT65686,795,2184.49 × 10−7SLC4A4278Milk
PROT6793,911,4281.49 × 10−5KIAA0825, SLF10-
PROT308948,0006.96 × 10−7PALLD, 5S_rRNA16Production
PROT211010,774,7131.37 × 10−6CMYA5, SNORA72, ENSBTAG000000490540-
PROT1211177,730,2441.57 × 10−7TDRD154Production
PROT81277,353,2978.04 × 10−6TMTC4, ENSBTAG000000537170-
PROT4891333,556,3562.02 × 10−9ARHGAP125Reproduction
PROT154114827,9383.24 × 10−22MAF1, ENSBTAG00000051469, SHARPIN, CYC1, GPAA1, EXOSC4, OPLAH, ENSBTAG00000015040, SPATC1, GRINA, PARP10, PLEC, bta-mir-2309670Milk
PROT321541,254,1065.23 × 10−7GALNT181Milk
PROT221765,001,6415.01 × 10−6ENSBTAG00000054184, PIWIL3, SGSM1, TMEM2113Reproduction
PROT481836,657,4231.48 × 10−6CYB5B, ENSBTAG00000052086, NFAT50-
PROT371942,052,2756.05 × 10−6JUP, P3H4, FKBP10, NT5C3B, KLHL10, KLHL11, ACLY, ENSBTAG00000050335, TTC25, CNP, DNAJC72Milk
PROT72110,379,6062.70 × 10−6ENSBTAG000000493518Reproduction
PROT302223,112,2173.53 × 10−6CRBN, TRNT1, IL5RA4Milk
PROT628235,551,6043.67 × 10−7FAM83B2Milk
PROT312421,507,2811.32 × 10−6GALNT1, INO80C0-
PROT92510,257,5795.43 × 10−6ENSBTAG00000050716, ENSBTAG00000050363, LITAF2Milk
PROT82615,946,0655.94 × 10−6PLCE1, NOC3L, U6, TBC1D12, ENSBTAG00000051299, ENSBTAG00000049089, HELLS, 7SK13Milk
PROT15281,284,9441.04 × 10−6RAB4A, CCSAP, ENSBTAG00000048654, ENSBTAG000000509857Milk
PROT%561142,907,5643.41 × 10−6SLC37A1, PDE9A98Milk
PROT%10639,965,0205.49 × 10−10FCRL6, DUSP23, CRP9Milk
PROT%3949,165,2203.80 × 10−6ENSBTAG00000052341, MTERF10-
PROT%828593,655,6801.81 × 10−12SLC15A5172Milk
PROT%696636,205,2161.28 × 10−18HERC3, PIGY, HERC5202Production
PROT%67104,076,1381.10 × 10−5U60-
PROT%1681,868,6122.86 × 10−6MFAP3L, ENSBTAG00000051098, AADAT6Reproduction
PROT%61926,103,9174.56 × 10−8TPD52L1, RNF2170-
PROT%361040,647,3472.08 × 10−6ENSBTAG000000345803Reproduction
PROT%13091163,476,4821.52 × 10−10SLC1A4, CEP68, RAB1A26Milk
PROT%51276,184,9082.08 × 10−5UBAC2, GPR18, GPR183, ENSBTAG000000382684Milk
PROT%1001346,366,4982.96 × 10−7ADARB2, ENSBTAG00000054346, WDR37, IDI1, GTPBP4, U6, LARP4B, ENSBTAG0000005196215Milk
PROT%594914465,7422.75 × 10−122ARHGAP39, bta-mir-2308, C14H8orf82, LRRC24, LRRC14, RECQL4, MFSD3, GPT, PPP1R16A, FOXH1, KIFC2, CYHR1, TONSL, VPS28, ENSBTAG00000053637, SLC39A4, CPSF1, ADCK5765Milk
PROT%22211551,232,7961.85 × 10−13STIM1, RHOG, PGAP2, NUP9814Health
PROT%7491660,724,6552.34 × 10−9SOAT1, AXDND1, NPHS2, TDRD54Milk
PROT%9971935,457,7674.58 × 10−9KCNJ12, UTP18, MBTD10-
PROT%22012031,391,0585.47 × 10−45PAIP1, ENSBTAG00000049623, C20H5orf34, TMEM267, CCL28, HMGCS1, ENSBTAG00000048672, NIM1K72Milk
PROT%402254,244,2673.21 × 10−6CLEC3B, EXOSC7, ZDHHC3, TMEM42, GHRL, SEC133Milk
PROT%5542347,176,1958.66 × 10−10SLC35B30-
PROT%252456,331,0313.77 × 10−6WDR70-
PROT%52514,923,1402.20 × 10−6ENSBTAG000000510406Milk
PROT%2812623,088,3242.56 × 10−8GBF1, NFKB2, PSD, FBXL15, CUEDC2, bta-mir-146b, MFSD13A, ACTR1A, SUFU40Milk
PROT%262835,624,1391.40 × 10−5ENSBTAG00000048082, SFTPD, MBL1, SFTPA1, ENSBTAG00000052322, MAT1A, DYDC12Health
PROT%6862940,803,1594.78 × 10−10ASRGL1, ENSBTAG00000042287, SCGB1A1, AHNAK19Milk
LP2419,848,8321.74 × 10−6THSD7A2Milk
LP236104,139,8006.86 × 10−6STK32B, 5S_rRNA, CYTL18Milk
LP372,852,9469.20 × 10−6ENSBTAG00000051744, ENSBTAG00000052719, ENSBTAG000000491901Milk
LP3859,108,2541.81 × 10−6ENSBTAG00000042498, ENSBTAG00000049991, FAM205C0-
LP2911,532,9012.65 × 10−5RIMS11Reproduction
LP51268,955,7691.48 × 10−5ENSBTAG00000054671, ENSBTAG00000051263, DCT, ENSBTAG00000051519, 5S_rRNA8Milk
LP01410,086,1649.93 × 10−6-17Milk
LP21517,222,0161.88 × 10−5ELMOD1, SLN3Reproduction
LP01735,605,6871.69 × 10−5-0-
LP261854,117,7538.56 × 10−7ARHGAP35, NPAS1, TMEM160, ZC3H4, SAE14Production
LP251936,768,6725.36 × 10−6DLX4, ENSBTAG00000045805, U6, ENSBTAG00000053450, ENSBTAG00000049677, KAT7, ENSBTAG0000005279320Milk
LP42131,125,8762.14 × 10−5UBE2Q2, ENSBTAG00000048528, FBXO22, ENSBTAG000000431870-
LP32248,814,7491.19 × 10−6POC1A, DUSP74Milk
LP162315,502,6511.39 × 10−5FOXP4, MDFI, TFEB, PGC, FRS3, ENSBTAG00000038916, TOMM63Milk
LP152651,053,4305.16 × 10−6INPP5A, ENSBTAG000000549670-
LP22735,948,5113.46 × 10−5ZMAT41Meat/Carcass
LP22834,869,8575.28 × 10−7ZMIZ1, PPIF5Milk
LP12922,382,6522.17 × 10−5SLC17A60-
1 Number of genes identified in each chromosome; 2 Chr = chromosome; 3 QTL = number of QTL previously reported in Animal QTLdb; 4 QTL_type = main type of QTL trait group previously identified.
Table 4. Most significantly enriched gene ontology (GO) terms of candidate genes for milk yield (MILK), fat yield (FAT), and fat percentage (FAT%) in North American Holstein cattle.
| Trait | GO | Term | p-Value | Genes |
| --- | --- | --- | --- | --- |
| MILK | GO:0004890 | GABA-A receptor activity | 2.4 × 10−4 | GABRA2, GABRG1, GABRA4, GABRB1, and GABRD |
| MILK | GO:0006749 | Glutathione metabolic process | 6.7 × 10−4 | OPLAH, ALDH5A1, CLIC5, GSTA2, GSTA3, GSTA4, GSTK1, and MGST1 |
| MILK | GO:0005230 | Extracellular ligand-gated ion channel activity | 7.9 × 10−4 | GABRA2, GABRG1, GABRA4, GABRB1, and GABRD |
| MILK | GO:1903496 | Response to 11-deoxycorticosterone | 1.4 × 10−3 | CSN1S1, CSN1S2, CSN2, and CSN3 |
| MILK | GO:0007605 | Sensory perception of sound | 3.2 × 10−3 | BARHL1, EYA4, FBXO11, NIPBL, USH1G, CLIC5, CHD7, COL2A1, DCDC2, MYH14, SNAI2, SLC1A3, and TUB |
| MILK | GO:0005513 | Detection of calcium ion | 7.0 × 10−3 | CALM2, CALM3, KCNMB4, and STIM1 |
| MILK | GO:0043950 | Positive regulation of cAMP-mediated signaling | 7.0 × 10−3 | CXCL10, CXCL11, CXCL9, and PTGIR |
| MILK | GO:0003273 | Cell migration involved in endocardial cushion formation | 8.3 × 10−3 | DCHS1, NOTCH1, and SNAI2 |
| MILK | GO:0071479 | Cellular response to ionizing radiation | 9.4 × 10−3 | FBXO4, RAD1, CLOCK, EEF1D, SPIDR, and SNAI2 |
| MILK | GO:0004364 | Glutathione transferase activity | 1.0 × 10−3 | CLIC5, GSTA2, GSTA3, GSTA4, GSTA5, GSTK1, and MGST1 |
| FAT | GO:0015125 | Bile acid transmembrane transporter activity | 4.1 × 10−4 | SLCO1A2, SLCO1B3, SLCO1C1, and SLCO2B1 |
| FAT | GO:0055089 | Fatty acid homeostasis | 5.4 × 10−4 | POLD1, DGAT1, GOT1, GPAM, and INS |
| FAT | GO:0030154 | Cell differentiation | 6.1 × 10−4 | DHCR7, EHF, ETV6, EYA3, HCK, SPIB, CREBL2, EXT2, GADD45B, GADD45G, MGP, NR5A1, PPDPF, PRRC2B, PTK2, PTK6, RGS19, SCX, SFRP5, STYK1, SNAPC4, SRMS, TRAPPC9, and TTF1 |
| FAT | GO:0007275 | Multicellular organism development | 2.4 × 10−3 | ALX4, DDX1, EYA3, SUFU, TNFRSF1A, TNFRSF6B, GADD45B, GADD45G, LBH, LTBR, PPDPF, PLCZ1, SFRP5, STRBP, SPRED2, TPI1, TRIM54, and ZFAT |
| FAT | GO:0000978 | RNA polymerase II core promoter proximal region sequence-specific DNA binding | 2.7 × 10−3 | AEBP2, EHF, ETV6, FEZF2, FOSL2, JDP2, MAFA, MXD1, MEIS1, NACC2, SOX18, TLX1, ASCL2, BHLHE41, BATF, CHD7, FOXJ2, GMEB2, HSF1, HHEX, NR1H2, NR6A1, OTX1, TP73, and ZGPAT |
| FAT | GO:0042127 | Regulation of cell proliferation | 4.3 × 10−3 | DHCR7, HCK, NDRG1, NKX2-3, SRC, TNFRSF1A, TNFRSF6B, APOBEC1, GUCY2C, HHEX, LTBR, PTGS1, PTK2, PTK6, STYK1, and SRMS |
| FAT | GO:0016509 | Long-chain-3-hydroxyacyl-CoA dehydrogenase activity | 6.5 × 10−3 | HADHA, HADHB, and HSD17B12 |
| FAT | GO:0036094 | Small molecule binding | 7.2 × 10−3 | LCN2, LCN9, PAEP, and RBP4 |
| FAT | GO:0001671 | ATPase activator activity | 9.9 × 10−3 | AHSA2, ATP1B3, TOR1AIP1, and TOR1AIP2 |
| FAT% | GO:0005149 | Interleukin-1 receptor binding | 8.5 × 10−8 | IL1A, IL1B, IL1F10, IL1RN, IL36RN, IL36A, IL36B, IL36G, and IL37 |
| FAT% | GO:0007585 | Respiratory gaseous exchange | 7.6 × 10−4 | PBX3, TLX3, CHST11, FUT8, GRIN1, SFTPB, and SFTPD |
| FAT% | GO:0015125 | Bile acid transmembrane transporter activity | 9.1 × 10−4 | SLCO1A2, SLCO1B3, SLCO1C1, and SLCO2B1 |
| FAT% | GO:0015459 | Potassium channel regulator activity | 3.6 × 10−3 | DPP6, LRRC26, KCNMB4, KCNIP4, KCNE1, KCNE2, and KCNE3 |
| FAT% | GO:0005031 | Tumor necrosis factor-activated receptor activity | 5.1 × 10−3 | RELT, TNFRSF1A, TNFRSF1B, TNFRSF8, LTBR, and NGFR |
Table 5. Most significantly enriched gene ontology (GO) terms of candidate genes for protein yield (PROT), protein percentage (PROT%), and lactation persistency (LP) in North American Holstein cattle.
| Trait | GO | Term | p-Value | Genes |
| --- | --- | --- | --- | --- |
| PROT | GO:0008289 | Lipid binding | 1.7 × 10−5 | BPIFA1, BPIFA3, BPIFA2A, BPIFA2B, BPIFA2C, BPIFB1, BPIFB2, BPIFB3, BPIFB4, BPIFB6, ACBD7, and PLTP |
| PROT | GO:0016998 | Cell wall macromolecule catabolic process | 6.9 × 10−5 | LYSB, LYZ1, LYZ3, LYZ2, and LYZ |
| PROT | GO:0003796 | Lysozyme activity | 8.0 × 10−5 | LYSB, LYZ1, LYZ3, LYZ2, and LYZ |
| PROT | GO:0042742 | Defense response to bacterium | 1.2 × 10−4 | DEFB122, DEFB122A, CSN1S2, DEFB116, DEFB119, DEFB123, DEFB124, DEFB119, HSTN, LYZ1, and NOD2 |
| PROT | GO:0019835 | Cytolysis | 1.7 × 10−4 | LYSB, LYZ1, LYZ3, LYZ2, and LYZ |
| PROT | GO:1903496 | Response to 11-deoxycorticosterone | 2.4 × 10−4 | CSN1S1, CSN1S2, CSN2, and CSN3 |
| PROT | GO:0050829 | Defense response to Gram-negative bacterium | 8.5 × 10−4 | BPI, LYSB, LYZ1, LYZ3, LYZ2, and LYZ |
| PROT | GO:0045087 | Innate immune response | 2.7 × 10−3 | BPIFA1, BPIFB1, BPIFB3, CYLD, HCK, DEFB122, DEFB122A, DEFB116, DEFB119, DEFB123, DEFB124, NOD2, TRIM10, TRIM15, and TRIM31 |
| PROT | GO:0032570 | Response to progesterone | 3.4 × 10−3 | CSN1S1, CSN1S2, CSN2, and CSN3 |
| PROT | GO:0032355 | Response to estradiol | 3.4 × 10−3 | CSN1S1, CSN1S2, CSN2, and CSN3 |
| PROT | GO:0007586 | Digestion | 6.4 × 10−3 | LYZ1, LYZ3, LYZ2, and UCN3 |
| PROT | GO:0045028 | G-protein coupled purinergic nucleotide receptor activity | 6.4 × 10−3 | GPR171, GPR87, P2RY12, and P2RY14 |
| PROT | GO:0050830 | Defense response to Gram-positive bacterium | 9.2 × 10−3 | H2B, LYSB, LYZ1, LYZ3, LYZ2, and LYZ |
| PROT% | GO:0005149 | Interleukin-1 receptor binding | 9.0 × 10−8 | IL1A, IL1RN, IL36A, IL36B, IL37, IL1B, IL36G, IL36RN, IRAK4, and IL1F10 |
| PROT% | GO:1903496 | Response to 11-deoxycorticosterone | 3.5 × 10−4 | CSN1S2, CSN3, LALBA, CSN1S1, and CSN2 |
| PROT% | GO:1903494 | Response to dehydroepiandrosterone | 3.5 × 10−4 | CSN1S2, CSN3, LALBA, CSN1S1, and CSN2 |
| PROT% | GO:0005452 | Inorganic anion exchanger activity | 3.8 × 10−4 | SLC22A12, SLC4A8, SLC22A6, SLC22A8, SLC4A4, SLC22A10, SLC4A5, SLC22A9, and SLC22A11 |
| PROT% | GO:0032355 | Response to estradiol | 2.0 × 10−3 | STAT3, CSN1S2, CSN3, LALBA, CSN1S1, and CSN2 |
| PROT% | GO:0015347 | Sodium-independent organic anion transmembrane transporter activity | 2.2 × 10−3 | SLC22A12, SLC22A6, SLCO4A1, SLCO2B1, SLC22A8, SLC22A10, SLC22A9, and SLC22A11 |
| PROT% | GO:0046983 | Protein dimerization activity | 2.6 × 10−3 | TFAP2A, PPP3CA, STAT5B, HEY1, MYC, ID2, TCF23, STAT3, ANO4, and E2F6 |
| PROT% | GO:0043252 | Sodium-independent organic anion transport | 3.0 × 10−3 | SLC22A12, SLC22A6, SLCO4A1, SLCO2B1, SLC22A8, SLC22A10, SLC22A9, and SLC22A11 |
| PROT% | GO:0043153 | Entrainment of circadian clock by photoperiod | 3.3 × 10−3 | PPP1CB, PER1, RBM4, ID2, CRY1, RBM4B, and TP53 |
| PROT% | GO:0007595 | Lactation | 3.9 × 10−3 | STAT5A, STAT5B, VDR, NEURL1, ATP2B2, CSN3, CSN2, and PRLR |
| PROT% | GO:0007259 | JAK-STAT cascade | 4.8 × 10−3 | STAT5A, STAT5B, CTR9, IL31RA, PRLR, and SOCS5 |
| PROT% | GO:0030282 | Bone mineralization | 5.1 × 10−3 | KLF10, CLEC3B, WNT11, PKDCC, RSPO2, FBXL15, IFITM5, and LGR4 |
| PROT% | GO:0048013 | Ephrin receptor signaling pathway | 6.2 × 10−3 | EPHB6, EFNA1, EFNA3, EFNB3, NCK2, EFNA4, and PTK2 |
| PROT% | GO:0010628 | Positive regulation of gene expression | 6.4 × 10−3 | CRP, SEC16B, OSR2, RAMP2, PRKAA1, ODAM, ROCK2, RBM4B, SCX, SERPINB9, LRRC32, WNT11, FGF8, FABP4, ID2, KIT, RPS3, DROSHA, STAP1, KRAS, APOB, IL7R, ZBTB7B, and ZPR1 |
| PROT% | GO:0008380 | RNA splicing | 6.5 × 10−3 | RBFOX2, MTERF3, PRPF4B, MAGOHB, JMJD6, RBM4, RBMXL2, PUF60, C1QBP, SRSF2, LUC7L3, PABPC1, SRSF7, and ZPR1 |
| PROT% | GO:0005344 | Oxygen transporter activity | 6.9 × 10−3 | MB, HBE2, HBE1, HBE4, HBB, and CYGB |
| PROT% | GO:0006397 | mRNA processing | 7.1 × 10−3 | RNASEL, RBFOX2, DDX1, PRPF4B, HNRNPLL, MAGOHB, JMJD6, RBM4, RBMXL2, ALKBH5, PUF60, C1QBP, SRSF2, PABPC1, AURKAIP1, SRSF7, and ZPR1 |
| PROT% | GO:0048704 | Embryonic skeletal system morphogenesis | 7.8 × 10−3 | OSR2, COL11A1, PCGF2, HOXB4, HOXB3, HOXB2, HOXB1, HOXB7, and DSCAML1 |
| LP | GO:0030334 | Regulation of cell migration | 1.3 × 10−1 | ABI3 and LDB2 |
| LP | GO:1903955 | Positive regulation of protein targeting to mitochondrion | 1.5 × 10−1 | ELMOD1 and SAE1 |
| LP | GO:0004867 | Serine-type endopeptidase inhibitor activity | 1.9 × 10−1 | SERPINB6 and SERPINB9 |
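For readers less familiar with how GO enrichment p-values such as those in Tables 4 and 5 can arise, the following is a minimal sketch of a one-sided hypergeometric over-representation test. This is an illustrative assumption only: this section does not state which enrichment tool or test statistic produced the tabulated values, and the function name and all counts below are hypothetical.

```python
from math import comb


def hypergeom_enrichment_p(k: int, n: int, K: int, N: int) -> float:
    """Return P(X >= k) for a hypergeometric draw.

    N: number of background genes (hypothetical).
    K: background genes annotated with the GO term.
    n: number of candidate genes tested.
    k: candidate genes annotated with the GO term.
    """
    upper = min(n, K)
    total = comb(N, n)
    # Sum the upper tail; math.comb returns 0 when the draw is impossible.
    tail = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return tail / total


if __name__ == "__main__":
    # Hypothetical counts, not taken from the study.
    p = hypergeom_enrichment_p(k=3, n=200, K=40, N=20000)
    print(f"enrichment p-value: {p:.3g}")
```

Under such a test, a term supported by only two candidate genes, as for the three LP terms above, would typically give a weak p-value, in line with the values of 1.3 × 10−1 to 1.9 × 10−1 reported for LP.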
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Share and Cite

MDPI and ACS Style

Pedrosa, V.B.; Schenkel, F.S.; Chen, S.-Y.; Oliveira, H.R.; Casey, T.M.; Melka, M.G.; Brito, L.F. Genomewide Association Analyses of Lactation Persistency and Milk Production Traits in Holstein Cattle Based on Imputed Whole-Genome Sequence Data. Genes 2021, 12, 1830. https://doi.org/10.3390/genes12111830
