ASVmaker: A New Tool to Improve Taxonomic Identifications for Amplicon Sequencing Data
Abstract
1. Introduction
1.1. Amplicon Sequencing
1.2. Public Reference Database and Taxonomic Limitations
1.3. Available Tools to Assign Taxonomy
2. Materials and Methods
2.1. Environment
2.2. ASVmaker Functionalities
2.2.1. Structure
2.2.2. Taxonomy
2.2.3. Amplicon
2.2.4. Usage
2.3. Creation of a New Database
2.4. Application on Environmental Samples
2.4.1. Sample and DNA Extraction
2.4.2. Amplicon Sequencing
2.4.3. Bioinformatic Analysis
3. Results
3.1. ASV-Specific Database for 16S rRNA, ITS and EF1α Genes
3.2. Environmental Samples Application
- C1: Confirmation of the identification obtained with pre-trained classifiers (from the Silva/UNITE databases) with the ASV-specific database;
- C2: Precision increase to the species level with the ASV-specific database;
- C3: Change of species identification with the ASV-specific database;
- C4: Precision obtained with the ASV-specific database, narrowed to a few candidate species (simple case);
- C5: Precision obtained with the ASV-specific database, with several candidate species remaining (complex case).
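The five categories can be read as a small decision rule comparing the pre-trained classifier's result with the ASV-specific database result. The sketch below is only an illustration of that reading, not the authors' procedure: the function name `classify_case`, the `few_max` threshold separating "a few" from "several" species, and the use of a two-word name to recognize a species-level identification are all assumptions.

```python
def classify_case(pretrained, asv_specific, few_max=5):
    """Assign a C1-C5 category to one ASV (illustrative rule only).

    pretrained   -- taxon from the pre-trained classifier, or None if absent
    asv_specific -- list of species matched in the ASV-specific database
    few_max      -- hypothetical cut-off between "a few" and "several" species
    """
    if not asv_specific:
        raise ValueError("no match in the ASV-specific database")

    # Several species share the amplicon: C5 (complex) or C4 (simple).
    if len(asv_specific) > few_max:
        return "C5"
    if len(asv_specific) > 1:
        return "C4"

    # Exactly one species (or species complex) remains.
    species = asv_specific[0]
    if pretrained == species:
        return "C1"  # identification confirmed

    # Assumption: a name of two or more words is a species-level call.
    pretrained_is_species = pretrained is not None and len(pretrained.split()) >= 2
    return "C3" if pretrained_is_species else "C2"
```

For example, a genus-level result such as `"Streptomyces"` paired with a single ASV-specific match falls into C2 under this rule, while two different species names fall into C3.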
4. Discussion
5. Conclusions
Supplementary Materials
Author Contributions
Funding
Data Availability Statement
Acknowledgments
Conflicts of Interest
References
1. Mbareche, H.; Veillette, M.; Bilodeau, G.; Duchaine, C. Comparison of the performance of ITS1 and ITS2 as barcodes in amplicon-based sequencing of bioaerosols. PeerJ 2020, 8, e8523.
2. Bukin, Y.S.; Galachyants, Y.P.; Morozov, I.V.; Bukin, S.V.; Zakharenko, A.S.; Zemskaya, T.I. The effect of 16S rRNA region choice on bacterial community metabarcoding results. Sci. Data 2019, 6, 190007.
3. Abellan-Schneyder, I.; Matchado, M.S.; Reitmeier, S.; Sommer, A.; Sewald, Z.; Baumbach, J.; List, M.; Neuhaus, K. Primer, Pipelines, Parameters: Issues in 16S rRNA Gene Sequencing. mSphere 2021, 6, e01202-20.
4. Tedersoo, L.; Lindahl, B. Fungal identification biases in microbiome projects. Environ. Microbiol. Rep. 2016, 8, 774–779.
5. Schoch, C.L.; Seifert, K.A.; Huhndorf, S.; Robert, V.; Spouge, J.L.; Levesque, C.A.; Chen, W.; Fungal Barcoding Consortium; Fungal Barcoding Consortium Author List; Bolchacova, E.; et al. Nuclear ribosomal internal transcribed spacer (ITS) region as a universal DNA barcode marker for Fungi. Proc. Natl. Acad. Sci. USA 2012, 109, 6241–6246.
6. Bahram, M.; Anslan, S.; Hildebrand, F.; Bork, P.; Tedersoo, L. Newly designed 16S rRNA metabarcoding primers amplify diverse and novel archaeal taxa from the environment. Environ. Microbiol. Rep. 2018, 11, 487–494.
7. Comeau, A.M.; Vincent, W.F.; Bernier, L.; Lovejoy, C. Novel chytrid lineages dominate fungal sequences in diverse marine and freshwater habitats. Sci. Rep. 2016, 6, 30120.
8. Callahan, B.J.; McMurdie, P.J.; Rosen, M.J.; Han, A.W.; Johnson, A.J.A.; Holmes, S.P. DADA2: High-resolution sample inference from Illumina amplicon data. Nat. Methods 2016, 13, 581–583.
9. Prodan, A.; Tremaroli, V.; Brolin, H.; Zwinderman, A.H.; Nieuwdorp, M.; Levin, E. Comparing bioinformatic pipelines for microbial 16S rRNA amplicon sequencing. PLoS ONE 2020, 15, e0227434.
10. Benson, D.A.; Cavanaugh, M.; Clark, K.; Karsch-Mizrachi, I.; Lipman, D.J.; Ostell, J.; Sayers, E.W. GenBank. Nucleic Acids Res. 2013, 41, D36–D42.
11. Tateno, Y. DNA Data Bank of Japan (DDBJ) for genome scale research in life science. Nucleic Acids Res. 2002, 30, 27–30.
12. Quast, C.; Pruesse, E.; Yilmaz, P.; Gerken, J.; Schweer, T.; Yarza, P.; Peplies, J.; Glöckner, F.O. The SILVA ribosomal RNA gene database project: Improved data processing and web-based tools. Nucleic Acids Res. 2013, 41, D590–D596.
13. DeSantis, T.Z.; Hugenholtz, P.; Larsen, N.; Rojas, M.; Brodie, E.L.; Keller, K.; Huber, T.; Dalevi, D.; Hu, P.; Andersen, G.L. Greengenes, a Chimera-Checked 16S rRNA Gene Database and Workbench Compatible with ARB. Appl. Environ. Microbiol. 2006, 72, 5069–5072.
14. Martin, D.; Rybicki, E. RDP: Detection of recombination amongst aligned sequences. Bioinformatics 2000, 16, 562–563.
15. Deshpande, V.; Wang, Q.; Greenfield, P.; Charleston, M.; Porras-Alfaro, A.; Kuske, C.R.; Cole, J.R.; Midgley, D.J.; Tran-Dinh, N. Fungal identification using a Bayesian classifier and the Warcup training set of internal transcribed spacer sequences. Mycologia 2016, 108, 1–5.
16. Nilsson, R.H.; Larsson, K.-H.; Taylor, A.F.S.; Bengtsson-Palme, J.; Jeppesen, T.S.; Schigel, D.; Kennedy, P.; Picard, K.; Glöckner, F.O.; Tedersoo, L.; et al. The UNITE database for molecular identification of fungi: Handling dark taxa and parallel taxonomic classifications. Nucleic Acids Res. 2019, 47, D259–D264.
17. Pham, V.H.; Kim, J. Cultivation of unculturable soil bacteria. Trends Biotechnol. 2012, 30, 475–484.
18. Bokulich, N.A.; Kaehler, B.D.; Rideout, J.R.; Dillon, M.; Bolyen, E.; Knight, R.; Huttley, G.A.; Caporaso, J.G. Optimizing taxonomic classification of marker-gene amplicon sequences with QIIME 2's q2-feature-classifier plugin. Microbiome 2018, 6, 90.
19. Rognes, T.; Flouri, T.; Nichols, B.; Quince, C.; Mahé, F. VSEARCH: A versatile open source tool for metagenomics. PeerJ 2016, 4, e2584.
20. Zahariev, M.; Chen, W.; Visagie, C.M.; Lévesque, C.A. Cluster oligonucleotide signatures for rapid identification by sequencing. BMC Bioinform. 2018, 19, 395.
21. Pereira, F.; Azevedo, F.; Carvalho, A.; Ribeiro, G.F.; Budde, M.W.; Johansson, B. Pydna: A simulation and documentation tool for DNA assembly strategies using python. BMC Bioinform. 2015, 16, 142.
22. Parada, A.E.; Needham, D.M.; Fuhrman, J.A. Every base matters: Assessing small subunit rRNA primers for marine microbiomes with mock communities, time series and global field samples. Environ. Microbiol. 2016, 18, 1403–1414.
23. Apprill, A.; McNally, S.; Parsons, R.; Weber, L. Minor revision to V4 region SSU rRNA 806R gene primer greatly increases detection of SAR11 bacterioplankton. Aquat. Microb. Ecol. 2015, 75, 129–137.
24. Bokulich, N.A.; Mills, D.A. Improved Selection of Internal Transcribed Spacer-Specific Primers Enables Quantitative, Ultra-High-Throughput Profiling of Fungal Communities. Appl. Environ. Microbiol. 2013, 79, 2519–2526.
25. Cobo-Díaz, J.F.; Baroncelli, R.; Le Floch, G.; Picot, A. A novel metabarcoding approach to investigate Fusarium species composition in soil and plant samples. FEMS Microbiol. Ecol. 2019, 95, fiz084.
26. Jeanne, T.; D’astous-Pagé, J.; Hogue, R. Spatial, temporal and technical variability in the diversity of prokaryotes and fungi in agricultural soils. Front. Soil Sci. 2022, 2, 945888.
27. Bolyen, E.; Rideout, J.R.; Dillon, M.R.; Bokulich, N.A.; Abnet, C.C.; Al-Ghalith, G.A.; Alexander, H.; Alm, E.J.; Arumugam, M.; Asnicar, F.; et al. Reproducible, interactive, scalable and extensible microbiome data science using QIIME 2. Nat. Biotechnol. 2019, 37, 852–857.
28. Martin, M. Cutadapt removes adapter sequences from high-throughput sequencing reads. EMBnet. J. 2011, 17, 10–12.
29. Aurrecoechea, C.; Barreto, A.; Brestelli, J.; Brunk, B.P.; Cade, S.; Doherty, R.; Fischer, S.; Gajria, B.; Gao, X.; Gingle, A.; et al. EuPathDB: The Eukaryotic Pathogen database. Nucleic Acids Res. 2012, 41, D684–D691.
30. Chen, W.; Radford, D.R.; Hambleton, S. Towards Improved Detection and Identification of Rust Fungal Pathogens in Environmental Samples Using a Metabarcoding Approach. Phytopathology 2022, 112, 535–548.
31. Grinevich, D.; Harden, L.; Grinevich, D.O.; Callahan, B.J. Serovar-level Identification of Bacterial Foodborne Pathogens from Full-length 16S rRNA Gene Sequencing. Microbiology 2023, preprint.
32. Boutigny, A.-L.; Gautier, A.; Basler, R.; Dauthieux, F.; Leite, S.; Valade, R.; Aguayo, J.; Ioos, R.; Laval, V. Metabarcoding targeting the EF1 alpha region to assess Fusarium diversity on cereals. PLoS ONE 2019, 14, e0207988.
Microbial Group | Target | Region | Forward Primer | Reverse Primer | Reference |
---|---|---|---|---|---|
Bacteria | 16S | V4V5 | 515FB: GTGYCAGCMGCCGCGGTAA | 926R: CCGYCAATTYMTTTRAGTTT | [22,23] |
Fungi | ITS | ITS1 | BITS: ACCTGCGGARGGATCA | B58S3: GAGATCCRTTGYTRAAAGTT | [24] |
Fusarium | EF1α | EF1α | Fa-150: CCGGTCACTTGATCTACCAG | Ra-2: ATGACGGTGACATAGTAGCG | [25] |
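The primers in the table above contain IUPAC degenerate bases (for example Y, M and R), so locating them in a reference sequence requires matching every base the code stands for. The following is a minimal sketch of in silico amplicon extraction using only regular expressions; it is not ASVmaker's implementation, it tolerates no mismatches, and the function names are hypothetical.

```python
import re

# IUPAC nucleotide codes mapped to regex character classes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "S": "[CG]", "W": "[AT]",
    "K": "[GT]", "M": "[AC]", "B": "[CGT]", "D": "[AGT]",
    "H": "[ACT]", "V": "[ACG]", "N": "[ACGT]",
}
# Complement table that also handles degenerate codes.
COMPLEMENT = str.maketrans("ACGTRYSWKMBDHVN", "TGCAYRSWMKVHDBN")


def reverse_complement(seq):
    """Reverse-complement a (possibly degenerate) DNA string."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def primer_regex(primer):
    """Turn a degenerate primer into an exact-match regex pattern."""
    return "".join(IUPAC[base] for base in primer.upper())


def extract_amplicon(sequence, forward, reverse, keep_primers=False):
    """Return the region flanked by the primers, or None if either is absent.

    The reverse primer is searched as its reverse complement, downstream
    of the first forward-primer match.
    """
    sequence = sequence.upper()
    fwd = re.search(primer_regex(forward), sequence)
    if fwd is None:
        return None
    downstream = sequence[fwd.end():]
    rev = re.search(primer_regex(reverse_complement(reverse)), downstream)
    if rev is None:
        return None
    start = fwd.start() if keep_primers else fwd.end()
    end = fwd.end() + (rev.end() if keep_primers else rev.start())
    return sequence[start:end]
```

With the 16S pair from the table, `extract_amplicon(seq, "GTGYCAGCMGCCGCGGTAA", "CCGYCAATTYMTTTRAGTTT")` would return the V4V5 region of `seq` without the primers when both sites are present.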
Sample Code | Crop | Diagnostic (Conventional) | Code | Best Taxonomic Identification by Pre-Trained Classifiers (SILVA/UNITE) | Confidence | Complementary Identification with ASV-Specific Database | Shared Amplicon (SA) | Relative Abundance (%) |
---|---|---|---|---|---|---|---|---|
Bacterial amplification: 16S rDNA | | | | | | | | |
Case1 | Squash | Pseudomonas syringae | C5 | Pseudomonas | 1 | Pseudomonas_SA46 | 35 species | 8.55 |
Case1 | Squash | Pseudomonas syringae | C5 | Pseudomonas | 1 | Pseudomonas_SA63 | 28 species | 30.89 |
Case2 | Cabbage | Xanthomonas campestris | C5 | Xanthomonas | 0.997 | Xanthomonas_SA1 | 27 species | 57.35 |
Case3 | Squash | Erwinia tracheiphila | C1 | Erwinia tracheiphila | 1 | Erwinia tracheiphila | | 31.03 |
Case3 | Squash | Erwinia tracheiphila | C5 | Pseudomonas | 1 | Pseudomonas_SA22 | 44 species | 0.06 |
Case3 | Squash | Erwinia tracheiphila | C4 | Streptomyces | 1 | Streptomyces_SA167 | S. roseirectus, S. niveiscabiei, S. acidiscabies | 0.06 |
Case4 | Cabbage | Xanthomonas campestris | C5 | Xanthomonas | 0.997 | Xanthomonas_SA1 | 27 species | 63.11 |
Case5 | Wheat | Xanthomonas campestris | C5 | Pseudomonas | 1 | Pseudomonas_SA22 | 44 species | 0.26 |
Case5 | Wheat | Xanthomonas campestris | C5 | Xanthomonas | 0.997 | Xanthomonas_SA1 | 27 species | 3.97 |
Case5 | Wheat | Xanthomonas campestris | C5 | Xanthomonas | 1 | Xanthomonas_SA3 | 15 species | 49.50 |
Case6 | Potato | Streptomyces scabies | C5 | Pseudomonas | 1 | Pseudomonas_SA39 | 30 species | 0.18 |
Case6 | Potato | Streptomyces scabies | C5 | Pseudomonas | 1 | Pseudomonas_SA46 | 35 species | 0.92 |
Case6 | Potato | Streptomyces scabies | C5 | Streptomyces | 0.999 | Streptomyces_SA63 | 25 species | 7.19 |
Case6 | Potato | Streptomyces scabies | C2 | Streptomyces | 1 | Streptomyces scabrisporus | | 0.36 |
Fungal amplification: ITS1 | | | | | | | | |
Case7 | Potato | Colletotrichum, Dickeya sp., Fusarium, Pythium, Verticillium | C4 | Colletotrichum coccodes | 0.998 | Colletotrichum_SA61 | C. nigrum, C. coccodes, C. gloeosporioides complex | 24.46 |
Case7 | Potato | Colletotrichum, Dickeya sp., Fusarium, Pythium, Verticillium | C4 | Verticillium nubilum | 0.998 | Verticillium_SA1 | V. longisporum, V. dahliae | 17.81 |
Case8 | Corn | Ustilago maydis | C1 | Ustilago maydis | 1 | Ustilago maydis | | 13.32 |
Case9 | Melon | Verticillium dahliae | C3 | Colletotrichum fuscum | 1 | Colletotrichum destructivum complex | | 0.10 |
Case9 | Melon | Verticillium dahliae | C5 | Septoria epilobii | 0.93 | Septoria_SA3 | 38 species | 0.01 |
Case9 | Melon | Verticillium dahliae | C4 | Verticillium nubilum | 0.998 | Verticillium_SA1 | V. longisporum, V. dahliae | 12.51 |
Case10 | Melon | Verticillium dahliae | C3 | Colletotrichum fuscum | 0.998 | Colletotrichum destructivum complex | | 0.01 |
Case10 | Melon | Verticillium dahliae | C4 | Verticillium nubilum | 0.998 | Verticillium_SA1 | V. longisporum, V. dahliae | 28.52 |
Fusarium-specific amplification: EF1α (only with ASV-specific database) | | | | | | | | |
Case11 | Corn | Fusarium graminearum, Fusarium avenaceum | C2 | | | Fusarium tricinctum complex | | 10.01 |
Case11 | Corn | Fusarium graminearum, Fusarium avenaceum | C2 | | | Fusarium tricinctum complex | | 23.89 |
Case12 | Corn | Fusarium sporotrichioides, Fusarium graminearum, Fusarium equiseti | C2 | | | Fusarium fujikuroi complex | | 28.56 |
Case12 | Corn | Fusarium sporotrichioides, Fusarium graminearum, Fusarium equiseti | C2 | | | Fusarium incarnatum-equiseti complex | | 67.20 |
Case13 | Corn | Kebatiellose Fusarium | C2 | | | Fusarium incarnatum-equiseti complex | | 1.11 |
Case13 | Corn | Kebatiellose Fusarium | C4 | | | Fusarium_SA89 | F. incarnatum-equiseti complex, F. sporotrichioides | 46.91 |
Case13 | Corn | Kebatiellose Fusarium | C4 | | | Fusarium_SA93 | F. asiaticum, F. armeniacum, F. boothii, F. graminearum, F. meridionale | 3.19 |
Case13 | Corn | Kebatiellose Fusarium | C2 | | | Fusarium serpentinum | | 0.97 |
Case13 | Corn | Kebatiellose Fusarium | C2 | | | Fusarium sporotrichioides | | 0.79 |
Case13 | Corn | Kebatiellose Fusarium | C2 | | | Fusarium sporotrichioides | | 0.53 |
Case13 | Corn | Kebatiellose Fusarium | C2 | | | Fusarium tricinctum complex | | 7.37 |
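The "Shared Amplicon (SA)" column lists species whose amplicon sequences are identical and therefore cannot be told apart by that marker. One way to picture such groups is to bucket reference amplicons by exact sequence, as in the sketch below. This is an assumption-laden illustration rather than the tool's actual algorithm: the function name, the `Genus_SA<n>` numbering scheme, and taking the genus from the first member are all hypothetical choices.

```python
from collections import defaultdict


def group_shared_amplicons(records):
    """Group species that share an identical amplicon sequence.

    records -- iterable of (species_name, amplicon_sequence) pairs
    Returns a list of (label, sorted_species, sequence) tuples in
    first-seen order of the sequences.
    """
    by_sequence = defaultdict(set)
    for species, sequence in records:
        by_sequence[sequence.upper()].add(species)

    counters = defaultdict(int)  # per-genus running SA number (assumed scheme)
    groups = []
    for sequence, species_set in by_sequence.items():
        members = sorted(species_set)
        if len(members) == 1:
            # Amplicon is unique to one species: keep the species name.
            label = members[0]
        else:
            # Assumption: all members belong to the genus of the first one.
            genus = members[0].split()[0]
            counters[genus] += 1
            label = f"{genus}_SA{counters[genus]}"
        groups.append((label, members, sequence))
    return groups
```

Under this scheme, two Verticillium species with the same ITS1 amplicon would collapse into a single `Verticillium_SA1` entry, while a species with a unique amplicon keeps its own name.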
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
© 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Plessis, C.; Jeanne, T.; Dionne, A.; Vivancos, J.; Droit, A.; Hogue, R. ASVmaker: A New Tool to Improve Taxonomic Identifications for Amplicon Sequencing Data. Plants 2023, 12, 3678. https://doi.org/10.3390/plants12213678