A Breast Cancer Polygenic Risk Score Validation in 15,490 Brazilians Using Exome Sequencing
Abstract
1. Introduction
2. Materials and Methods
2.1. Study Population
2.2. Exome Sequencing and Imputation
2.3. Relatedness Calculation and Data Filtering
2.4. Polygenic Risk Score Calculation
2.5. Ancestry Evaluation
2.6. Genetic Principal Component Analysis (PCA)
2.7. Paired Imputed and Sequenced Genomes Analysis
2.8. Statistical Analyses
3. Results
3.1. Removal of P/LP Variants Prior to PRS Assessment and Ancestry Composition of the Cohort
3.2. Three PRSs Identify Increased Breast Cancer Risk for Brazilian Women
3.3. Effect Sizes of All PRSs Are Less Pronounced than in the Original Studies
3.4. PRS313 and PRS3820 Can Stratify BC Risk in Groups with Different Ancestry Compositions
3.5. Imputation Is a Reliable Tool for PRS Assessment
3.6. PRS3820 Top Decile OR Is Comparable to That of a Moderate Risk BC Gene
4. Discussion
5. Conclusions
Supplementary Materials
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Acknowledgments
Conflicts of Interest
Abbreviations
PRS | Polygenic Risk Score |
BC | Breast Cancer |
OR | Odds Ratio |
GWAS | Genome-Wide Association Studies |
WES | Whole Exome Sequencing |
WGS | Whole Genome Sequencing |
CAP | College of American Pathologists |
IRB | Institutional Review Board |
CAAE | Certificado de Apresentação de Apreciação Ética |
GATK | Genome Analysis Toolkit |
1KGP | 1000 Genomes Project |
VCF | Variant Call Format |
P/LP | Pathogenic or Likely-Pathogenic |
PGS | Polygenic Score |
ID | Identification |
UK | United Kingdom |
MAF | Minor Allele Frequency |
AFR | Africa |
AMR | America |
EAS | East Asia |
EUR | Europe |
HGDP | Human Genome Diversity Project |
PCA | Principal Component Analysis |
PC | Principal Component |
AUC | Area Under The Curve |
CI | Confidence Interval |
SD | Standard Deviation |
UKBB | UK Biobank |
PRS90 | 90th to 100th Percentile of Polygenic Risk Score |
ER | Estrogen Receptor |
CNPq | Conselho Nacional de Desenvolvimento Científico e Tecnológico |
Characteristic | Subgroup | Case | Control | Total | p-Value |
---|---|---|---|---|---|
Total | | 5598 | 8767 | 14,365 | - |
Sex | Female | 5598 | 4187 | 9785 | - |
 | Male | - | 4580 | 4580 | - |
Age, mean (SD) | Total | 49.8 (11.6) | 41.6 (13.3) | 44.8 (13.3) | <0.001 |
 | Female | 49.8 (11.6) | 41.9 (13.7) | 46.4 (13.1) | <0.001 |
 | Male | - | 41.3 (12.9) | 41.3 (12.9) | - |
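The counts and age summaries in the table can be cross-checked from the reported values alone. The sketch below is a minimal illustration, not the authors' analysis: it recomputes the pooled totals and sample-size-weighted mean ages, and it assumes (the source does not state which test was used) a Welch-type comparison of mean age between female cases and female controls with a normal approximation. The helper names `pooled_mean` and `welch_z` are hypothetical.

```python
import math

# (n, mean age, SD) copied from the table rows.
cases_f = (5598, 49.8, 11.6)      # female cases
controls_f = (4187, 41.9, 13.7)   # female controls
controls_m = (4580, 41.3, 12.9)   # male controls


def pooled_mean(*groups):
    """Sample-size-weighted mean of several (n, mean, sd) groups."""
    total_n = sum(n for n, _, _ in groups)
    return sum(n * m for n, m, _ in groups) / total_n


def welch_z(a, b):
    """Welch statistic for a difference in means, from summary statistics."""
    (na, ma, sa), (nb, mb, sb) = a, b
    return (ma - mb) / math.sqrt(sa**2 / na + sb**2 / nb)


total_n = cases_f[0] + controls_f[0] + controls_m[0]
all_controls_mean = pooled_mean(controls_f, controls_m)
female_mean = pooled_mean(cases_f, controls_f)
overall_mean = pooled_mean(cases_f, controls_f, controls_m)

# Assumption: a two-sided p-value under a normal approximation,
# which is reasonable at these sample sizes.
z = welch_z(cases_f, controls_f)
p_value = math.erfc(abs(z) / math.sqrt(2))

print("total n:", total_n)
print("mean age, all controls:", round(all_controls_mean, 1))
print("mean age, all women:", round(female_mean, 1))
print("mean age, whole cohort:", round(overall_mean, 1))
print("p < 0.001:", p_value < 0.001)
```

Under these assumptions the recomputed totals and weighted means agree with the Total column, and the age difference between cases and controls is far below the 0.001 threshold, so cases are clearly older than controls in this cohort.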
© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Share and Cite
Eichemberger Rius, F.; Santa Cruz Guindalini, R.; Viana, D.; Salomão, J.; Gallo, L.; Freitas, R.; Bertolacini, C.; Taniguti, L.; Imparato, D.; Antunes, F.; et al. A Breast Cancer Polygenic Risk Score Validation in 15,490 Brazilians Using Exome Sequencing. Diagnostics 2025, 15, 1098. https://doi.org/10.3390/diagnostics15091098
APA Style: Eichemberger Rius, F., Santa Cruz Guindalini, R., Viana, D., Salomão, J., Gallo, L., Freitas, R., Bertolacini, C., Taniguti, L., Imparato, D., Antunes, F., Sousa, G., Achjian, R., Fukuyama, E., Gregório, C., Ventura, I., Gomes, J., Taniguti, N., Maistro, S., Krieger, J. E., ... Schlesinger, D. (2025). A Breast Cancer Polygenic Risk Score Validation in 15,490 Brazilians Using Exome Sequencing. Diagnostics, 15(9), 1098. https://doi.org/10.3390/diagnostics15091098