SWAAT Bioinformatics Workflow for Protein Structure-Based Annotation of ADME Gene Variants
Abstract
1. Introduction
2. Methods
2.1. Obtaining 3D Structures of Proteins
2.2. Constructing and Evaluating a Predictive Model for Variant Effect Prediction
2.3. Evaluation of the Classifier’s Performance Using a Benchmarking Dataset
2.4. Dependencies for Working with SWAAT
2.5. Overall Description of SWAAT
2.6. Implementation
3. Results
3.1. Annotated Genes
3.2. Building and Assessing the Predictive Model
3.3. Benchmarking SWAAT Using ADME and TP53 Variants
4. Practical Application
5. Discussion
6. Conclusions
Supplementary Materials
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Acknowledgments
Conflicts of Interest
Abbreviations
ADME: Absorption, Distribution, Metabolism, and Excretion
CPIC: Clinical Pharmacogenetics Implementation Consortium
PDB: Protein Data Bank
PSSM: Position-Specific Scoring Matrix
SWAAT: Structural Workflow for Annotating ADME Targets
VEP: Variant Effect Predictor
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.
© 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Othman, H.; Jemimah, S.; da Rocha, J.E.B. SWAAT Bioinformatics Workflow for Protein Structure-Based Annotation of ADME Gene Variants. J. Pers. Med. 2022, 12, 263. https://doi.org/10.3390/jpm12020263