Genome-Wide Association Study and Phenotype Prediction of Reproductive Traits in Large White Pigs
Simple Summary
Abstract
1. Introduction
2. Materials and Methods
2.1. Data Source and Processing
2.2. Genome-Wide Association Analysis
2.3. Phenotype Prediction Model
2.3.1. Conventional Model
2.3.2. Machine Learning Model
2.3.3. Evaluation Indicators
3. Results
3.1. Significant SNP Markers and Candidate Gene Identification
3.2. Functional Enrichment Analysis of Candidate Genes Reveals Key Biological Processes and Signaling Pathways
3.3. Comparison of Phenotypic Prediction Performance Across Different Models
4. Discussion
5. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Acknowledgments
Conflicts of Interest
Trait | Mean | Standard Deviation | Coefficient of Variation | Minimum | Maximum |
---|---|---|---|---|---|
NH | 13.34 | 2.68 | 20.28% | 4 | 21 |
NW | 11.38 | 1.72 | 15.16% | 4 | 15 |
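SNP | Trait | Chr | p Value | Start Physical Position/bp | End Physical Position/bp | Candidate Gene |
---|---|---|---|---|---|---|
ALGA0037969 | NH | 7 | 7.7 × 10⁻⁵ | 1,684,215 | 1,692,980 | SERPINB1 |
ALGA0037969 | NH | 7 | 7.7 × 10⁻⁵ | 1,792,876 | 1,805,903 | NQO2 |
ALGA0037969 | NH | 7 | 7.7 × 10⁻⁵ | 1,841,239 | 1,879,765 | RIPK1 |
ALGA0037969 | NH | 7 | 7.7 × 10⁻⁵ | 1,882,720 | 1,909,854 | BPHL |
ALGA0037969 | NH | 7 | 7.7 × 10⁻⁵ | 1,656,984 | 1,667,879 | WRNIP1 |
ALGA0037969 | NH | 7 | 7.7 × 10⁻⁵ | 1,721,383 | 1,735,074 | SERPINB9 |
ALGA0037969 | NH | 7 | 7.7 × 10⁻⁵ | 1,910,269 | 1,914,668 | TUBB2A |
ALGA0037969 | NH | 7 | 7.7 × 10⁻⁵ | 1,951,407 | 1,956,119 | TUBB2B |
ALGA0037969 | NH | 7 | 7.7 × 10⁻⁵ | 1,988,696 | 2,131,908 | SLC22A23 |
ALGA0037969 | NH | 7 | 7.7 × 10⁻⁵ | 2,309,088 | 2,310,080 | FAM50B |
ALGA0037969 | NH | 7 | 7.7 × 10⁻⁵ | 2,382,495 | 2,408,674 | PRPF4B |
ALGA0037969 | NH | 7 | 7.7 × 10⁻⁵ | 2,421,054 | 2,446,319 | ECI2 |
ALGA0037969 | NH | 7 | 7.7 × 10⁻⁵ | 2,231,293 | 2,248,863 | PXDC1 |
ALGA0037969 | NH | 7 | 7.7 × 10⁻⁵ | 2,846,090 | 2,966,240 | CDYL |
ALGA0037969 | NH | 7 | 7.7 × 10⁻⁵ | 2,987,271 | 3,005,271 | RPP40 |
ALGA0037969 | NH | 7 | 7.7 × 10⁻⁵ | 1,752,689 | 1,765,107 | SERPINB6 |
H3GA0032302 | NH | 11 | 5.7 × 10⁻⁵ | 66,481,923 | 66,641,651 | MBNL2 |
H3GA0032302 | NH | 11 | 5.7 × 10⁻⁵ | 66,679,253 | 66,720,045 | RAP2A |
H3GA0032302 | NH | 11 | 5.7 × 10⁻⁵ | 67,004,109 | 67,059,988 | IPO5 |
H3GA0032302 | NH | 11 | 5.7 × 10⁻⁵ | 67,144,645 | 67,457,543 | FARP1 |
H3GA0032302 | NH | 11 | 5.7 × 10⁻⁵ | 67,458,352 | 67,615,842 | STK24 |
H3GA0032302 | NH | 11 | 5.7 × 10⁻⁵ | 67,652,193 | 67,710,933 | SLC15A1 |
ALGA0098819 | NW | 18 | 4.0 × 10⁻⁵ | 51,131,584 | 51,185,435 | BLVRA |
ALGA0098819 | NW | 18 | 4.0 × 10⁻⁵ | 51,322,240 | 51,362,177 | STK17A |
ALGA0098819 | NW | 18 | 4.0 × 10⁻⁵ | 51,909,473 | 51,921,238 | PSMA2 |
ALGA0098819 | NW | 18 | 4.0 × 10⁻⁵ | 51,928,145 | 51,928,145 | C7orf25 |
ALGA0098819 | NW | 18 | 4.0 × 10⁻⁵ | 50,609,418 | 50,684,964 | OGDH |
ALGA0098819 | NW | 18 | 4.0 × 10⁻⁵ | 50,726,922 | 50,757,649 | NPC1L1 |
ALGA0098819 | NW | 18 | 4.0 × 10⁻⁵ | 50,699,470 | 50,702,704 | TMED4 |
ALGA0098819 | NW | 18 | 4.0 × 10⁻⁵ | 50,705,632 | 50,715,033 | DDX56 |
ALGA0098819 | NW | 18 | 4.0 × 10⁻⁵ | 50,759,221 | 50,830,332 | NUDCD3 |
ALGA0098819 | NW | 18 | 4.0 × 10⁻⁵ | 50,861,466 | 50,956,225 | CAMK2B |
ALGA0098819 | NW | 18 | 4.0 × 10⁻⁵ | 50,979,167 | 51,024,494 | GCK |
ALGA0098819 | NW | 18 | 4.0 × 10⁻⁵ | 50,960,113 | 50,971,029 | YKT6 |
ALGA0098819 | NW | 18 | 4.0 × 10⁻⁵ | 51,046,899 | 51,053,854 | AEBP1 |
ALGA0098819 | NW | 18 | 4.0 × 10⁻⁵ | 51,038,358 | 51,046,779 | POLD2 |
ALGA0098819 | NW | 18 | 4.0 × 10⁻⁵ | 51,079,503 | 51,088,781 | POLM |
ALGA0098819 | NW | 18 | 4.0 × 10⁻⁵ | 51,387,836 | 51,802,945 | HECW1 |
ALGA0098819 | NW | 18 | 4.0 × 10⁻⁵ | 52,404,072 | 52,697,900 | GLI3 |
WU_10.2_7_117818027 | NW | 7 | 8.1 × 10⁻⁵ | 117,438,622 | 117,472,005 | BDKRB2 |
WU_10.2_7_117818027 | NW | 7 | 8.1 × 10⁻⁵ | 117,609,791 | 117,667,247 | AK7 |
WU_10.2_7_117818027 | NW | 7 | 8.1 × 10⁻⁵ | 117,676,083 | 117,740,348 | PAPOLA |
WU_10.2_7_117839956 | NW | 7 | 2.1 × 10⁻⁵ | 117,942,929 | 118,025,991 | VRK1 |
WU_10.2_7_117839956 | NW | 7 | 2.1 × 10⁻⁵ | 117,506,298 | 117,586,651 | ATG2B |
WU_10.2_7_117839956 | NW | 7 | 2.1 × 10⁻⁵ | 117,586,742 | 117,607,575 | GSKIP |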
Parity | Trait | SNP | Chr | Position/bp | p Value | Nearest Gene | Location |
---|---|---|---|---|---|---|---|
First parity | NW | ALGA0032380 | 5 | 66,707,514 | 8.926 × 10⁻⁵ | PRMT8 | within |
Second parity | NH | ASGA0035681 | 12 | 56,494,112 | 9.981 × 10⁻⁵ | MAP2K4 | within |
Second parity | NH | WU_10.2_14_21652102 | 14 | 162,537,498 | 8.056 × 10⁻⁵ | EXTI | 89,715 |
Second parity | NW | ALGA0116097 | 4 | 21,319,382 | 9.981 × 10⁻⁵ | MED30 | 90,972 |
Second parity | NW | ASGA0072736 | 16 | 24,798,218 | 6.289 × 10⁻⁵ | U2 | 17,042 |
Second parity | NW | ALGA0027774 | 4 | 116,167,774 | 1.650 × 10⁻⁵ | OLFM3 | 49,569 |
Second parity | NW | DRGA0015980 | 16 | 25,960,781 | 6.289 × 10⁻⁵ | MROH2B | within |
Second parity | NW | ASGA0072745 | 16 | 26,154,317 | 6.289 × 10⁻⁵ | PLCXD3 | 2,001 |
Second parity | NW | ASGA0072743 | 16 | 26,247,215 | 6.289 × 10⁻⁵ | PLCXD3 | within |
Fourth parity | NH | MARC0041460 | 3 | 131,287,295 | 8.089 × 10⁻⁵ | RNASEH1 | 5,365 |
Fourth parity | NH | WU_10.2_6_61351656 | 6 | 61,351,656 | 8.902 × 10⁻⁶ | PLCXD3 | within |
Fourth parity | NW | ALGA0034179 | 5 | 21,319,382 | 9.687 × 10⁻⁵ | PYM1 | within |
Fourth parity | NW | ALGA0039880 | 7 | 30,978,428 | 2.793 × 10⁻⁵ | ANKS1A | within |
Fourth parity | NW | WU_10.2_12_4154172 | 12 | 4,154,172 | 3.152 × 10⁻⁵ | SEPTIN9 | 48,323 |
Trait | Pathway | Description | Candidate Gene | Value |
---|---|---|---|---|
NH | ssc05132 | Salmonella infection | TUBB2B/RIPK1 | 0.028 |
NH | ssc04540 | Gap junction | TUBB2B | 0.032 |
Parity | Trait | GO Terms | Gene Name |
---|---|---|---|
First parity | NW | GO:0018216, peptidyl-arginine methylation; GO:0006479, protein methylation; GO:0018193, peptidyl-amino acid modification | PRMT8 |
Second parity | NH | No significantly enriched terms | NULL |
Fourth parity | NW | GO:0060261, positive regulation of transcription initiation by RNA polymerase II; GO:0006352, DNA-templated transcription initiation; GO:0019827, stem cell population maintenance; GO:0008081, phosphoric diester hydrolase activity | MED30/PLCXD3 |
Fourth parity | NH | GO:0043137, DNA replication, removal of RNA primer; GO:0042578, phosphoric ester hydrolase activity | RNASEH1/PLCXD3 |
Fourth parity | NW | GO:1903259, exon-exon junction complex disassembly; GO:0032984, protein-containing complex disassembly; GO:0022411, cellular component disassembly; GO:0005525, GTP binding | PYM1/SEPTIN9 |
Evaluation Indicator | Features | GBLUP | BL | BRR | LightGBM | RF | GBDT | Adaboost.R2 |
---|---|---|---|---|---|---|---|---|
PCC | 20% | −0.116 | −0.086 | −0.092 | −0.056 | 0.096 | 0.131 | 0.0233 |
PCC | 50% | −0.129 | −0.103 | −0.097 | 0.057 | −0.035 | −0.057 | 0.021 |
PCC | 80% | −0.119 | −0.098 | −0.095 | 0.013 | −0.004 | 0.014 | 0.059 |
PCC | All | −0.12 | −0.095 | −0.089 | 0.021 | −0.048 | −0.009 | 0.071 |
PCC | PCA | 0.011 | 0.087 | 0.072 | 0.119 | 0.105 | 0.141 | 0.113 |
MAE | 20% | 0.778 | 0.755 | 0.773 | 0.748 | 0.757 | 0.752 | 0.742 |
MAE | 50% | 0.778 | 0.771 | 0.765 | 0.801 | 0.779 | 0.827 | 0.742 |
MAE | 80% | 0.777 | 0.777 | 0.772 | 0.808 | 0.78 | 0.811 | 0.784 |
MAE | All | 0.777 | 0.776 | 0.79 | 0.807 | 0.774 | 0.822 | 0.962 |
MAE | PCA | 0.747 | 0.742 | 0.743 | 0.744 | 0.771 | 0.743 | 0.823 |
MSE | 20% | 1.07 | 1.026 | 1.059 | 1.014 | 1.019 | 0.996 | 1.002 |
MSE | 50% | 1.07 | 1.064 | 1.044 | 1.121 | 1.071 | 1.195 | 1 |
MSE | 80% | 1.068 | 1.067 | 1.059 | 1.149 | 1.076 | 1.151 | 1.111 |
MSE | All | 1.068 | 1.07 | 1.061 | 1.138 | 1.069 | 1.175 | 1.546 |
MSE | PCA | 1.014 | 1.008 | 1.015 | 0.984 | 0.982 | 0.982 | 1.245 |
RMSE | 20% | 1.029 | 1.005 | 1.022 | 1.007 | 1.01 | 0.998 | 1.001 |
RMSE | 50% | 1.029 | 1.023 | 1.014 | 1.059 | 1.035 | 1.093 | 1 |
RMSE | 80% | 1.028 | 1.025 | 1.021 | 1.072 | 1.038 | 1.073 | 1.054 |
RMSE | All | 1.028 | 1.026 | 1.022 | 1.067 | 1.034 | 1.084 | 1.243 |
RMSE | PCA | 1.006 | 0.996 | 0.999 | 0.992 | 0.991 | 0.991 | 1.116 |
Evaluation Indicator | Features | GBLUP | BL | BRR | LightGBM | RF | GBDT | Adaboost.R2 |
---|---|---|---|---|---|---|---|---|
PCC | 20% | −0.111 | 0.045 | 0.047 | −0.016 | 0.052 | 0.012 | 0.064 |
PCC | 50% | −0.119 | 0.041 | 0.029 | 0.011 | 0.108 | 0.037 | 0.047 |
PCC | 80% | −0.121 | 0.04 | 0.029 | 0.001 | 0.047 | −0.013 | 0.032 |
PCC | All | −0.114 | 0.043 | 0.036 | 0.053 | 0.044 | 0.016 | 0.062 |
PCC | PCA | 0.072 | 0.12 | 0.115 | 0.146 | 0.121 | 0.121 | 0.087 |
MAE | 20% | 0.778 | 0.786 | 0.764 | 0.831 | 0.78 | 0.817 | 0.751 |
MAE | 50% | 0.778 | 0.763 | 0.765 | 0.821 | 0.766 | 0.815 | 1.084 |
MAE | 80% | 0.778 | 0.767 | 0.764 | 0.833 | 0.771 | 0.829 | 0.767 |
MAE | All | 0.777 | 0.764 | 0.765 | 0.816 | 0.771 | 0.821 | 0.760 |
MAE | PCA | 0.799 | 0.751 | 0.748 | 0.76 | 0.753 | 0.782 | 0.808 |
MSE | 20% | 1.068 | 1.019 | 1.034 | 1.177 | 1.035 | 1.159 | 1.078 |
MSE | 50% | 1.068 | 1.031 | 1.04 | 1.17 | 1.006 | 1.148 | 0.740 |
MSE | 80% | 1.069 | 1.045 | 1.041 | 1.183 | 1.035 | 1.18 | 1.111 |
MSE | All | 1.066 | 1.043 | 1.041 | 1.141 | 1.035 | 1.156 | 1.096 |
MSE | PCA | 1.112 | 0.992 | 1 | 0.979 | 0.983 | 0.984 | 1.182 |
RMSE | 20% | 1.027 | 0.999 | 1.007 | 1.085 | 1.017 | 1.077 | 1.038 |
RMSE | 50% | 1.027 | 1.006 | 1.011 | 1.082 | 1.003 | 1.071 | 1.041 |
RMSE | 80% | 1.028 | 1.014 | 1.011 | 1.087 | 1.017 | 1.086 | 1.054 |
RMSE | All | 1.027 | 1.012 | 1.011 | 1.068 | 1.017 | 1.075 | 1.047 |
RMSE | PCA | 1.045 | 0.987 | 0.99 | 0.989 | 0.992 | 0.992 | 1.087 |
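The four evaluation indicators in the tables above can be illustrated with a short sketch. It assumes the usual definitions (PCC as the Pearson correlation coefficient between observed and predicted phenotypes, MAE, MSE, and RMSE as mean absolute, mean squared, and root mean squared error); the function names and the toy data are illustrative and are not taken from the article.

```python
import math


def pcc(y_true, y_pred):
    """Pearson correlation coefficient between observed and predicted values."""
    n = len(y_true)
    mean_t = sum(y_true) / n
    mean_p = sum(y_pred) / n
    cov = sum((t - mean_t) * (p - mean_p) for t, p in zip(y_true, y_pred))
    var_t = sum((t - mean_t) ** 2 for t in y_true)
    var_p = sum((p - mean_p) ** 2 for p in y_pred)
    return cov / math.sqrt(var_t * var_p)


def mae(y_true, y_pred):
    """Mean absolute error."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def mse(y_true, y_pred):
    """Mean squared error."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def rmse(y_true, y_pred):
    """Root mean squared error, the square root of the MSE on the same data."""
    return math.sqrt(mse(y_true, y_pred))


# Toy data (hypothetical): predictions are perfectly correlated but biased.
observed = [1.0, 2.0, 3.0, 4.0]
predicted = [2.0, 4.0, 6.0, 8.0]
scores = {
    "PCC": pcc(observed, predicted),
    "MAE": mae(observed, predicted),
    "MSE": mse(observed, predicted),
    "RMSE": rmse(observed, predicted),
}
```

The toy data show why the tables report both kinds of indicator: PCC measures only how well predictions rank and track the observations, so it can be high while the error-based indicators are poor. If the tabulated values are averages over cross-validation folds (an assumption; the tables do not say), an averaged RMSE need not equal the square root of the averaged MSE.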
Trait | Method | Optimal Hyperparameters |
---|---|---|
NH | LightGBM | learning_rate = 0.01, max_depth = 19, n_estimators = 25 |
NH | RF | max_depth = 6, n_estimators = 87 |
NH | GBDT | learning_rate = 0.05, max_depth = 8, n_estimators = 10 |
NH | Adaboost.R2 | n_estimators = 50, learning_rate = 0.01 |
NW | LightGBM | learning_rate = 0.1, max_depth = 2, n_estimators = 4 |
NW | RF | max_depth = 1, n_estimators = 17 |
NW | GBDT | learning_rate = 0.09, max_depth = 1, n_estimators = 16 |
NW | Adaboost.R2 | n_estimators = 50, learning_rate = 0.01 |
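As a minimal sketch of how the tabulated hyperparameters could be turned into model objects, the mapping below copies the values from the table, and a hypothetical helper `build_model` instantiates the matching regressor. It assumes scikit-learn and LightGBM are the libraries in use, which this table does not confirm; `BEST_PARAMS` and `build_model` are illustrative names only.

```python
# Hyperparameters copied from the table above, keyed by trait and method.
BEST_PARAMS = {
    "NH": {
        "LightGBM": {"learning_rate": 0.01, "max_depth": 19, "n_estimators": 25},
        "RF": {"max_depth": 6, "n_estimators": 87},
        "GBDT": {"learning_rate": 0.05, "max_depth": 8, "n_estimators": 10},
        "Adaboost.R2": {"n_estimators": 50, "learning_rate": 0.01},
    },
    "NW": {
        "LightGBM": {"learning_rate": 0.1, "max_depth": 2, "n_estimators": 4},
        "RF": {"max_depth": 1, "n_estimators": 17},
        "GBDT": {"learning_rate": 0.09, "max_depth": 1, "n_estimators": 16},
        "Adaboost.R2": {"n_estimators": 50, "learning_rate": 0.01},
    },
}


def build_model(trait, method):
    """Hypothetical helper: return an unfitted regressor for one trait/method pair.

    Imports are deferred so the mapping itself can be used without
    scikit-learn or LightGBM installed (both are assumed, not confirmed).
    """
    params = BEST_PARAMS[trait][method]
    if method == "LightGBM":
        from lightgbm import LGBMRegressor
        return LGBMRegressor(**params)
    from sklearn.ensemble import (
        AdaBoostRegressor,          # scikit-learn's AdaBoost.R2 implementation
        GradientBoostingRegressor,
        RandomForestRegressor,
    )
    classes = {
        "RF": RandomForestRegressor,
        "GBDT": GradientBoostingRegressor,
        "Adaboost.R2": AdaBoostRegressor,
    }
    return classes[method](**params)
```

For example, `build_model("NH", "RF")` would return a `RandomForestRegressor` with `max_depth=6` and `n_estimators=87`, ready to be fitted on a genotype matrix.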
© 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Zhang, H.; Bao, S.; Zhao, X.; Bai, Y.; Lv, Y.; Gao, P.; Li, F.; Zhang, W. Genome-Wide Association Study and Phenotype Prediction of Reproductive Traits in Large White Pigs. Animals 2024, 14, 3348. https://doi.org/10.3390/ani14233348