GDS: A Genomic Database for Strawberries (Fragaria spp.)
Abstract
1. Introduction
2. Materials and Methods
2.1. Web Server and Code
2.2. Formatting the Genomic Data of Strawberries
3. Results
3.1. Overview of the GDS
3.2. The Homepage of GDS
3.3. Introduction to the Strawberry Species and Genomes
3.4. Data Sets
3.5. Completeness of the Genomes
3.6. Phylogenomic Relationships among the Strawberry Genomes
3.7. Genomic Comparison of Gene Orthogroups
3.8. Gene Annotations
3.9. Sequence Searches Using Basic Local Alignment Search Tool (BLAST)
3.10. Genomic Visualization Using JBrowse
3.11. Tracing Whole-Genome Duplication Using Synteny Browse Search
3.12. microRNA Search
3.13. Transcription Factor Search
3.14. Gene Search
3.15. Download
3.16. Community of Strawberry Researchers
4. Discussion
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Acknowledgments
Conflicts of Interest
Species | Assembly Size (Mb) | Ploidy | Scaffold N50 (kb) | Contig N50 (kb) | BUSCO V5 (%) |
---|---|---|---|---|---|
Fragaria × ananassa | 805.5 | 8x = 56 | 5980.469 | 79.973 | 99.6 |
Fragaria iinumae | 199.6 | 2x = 14 | 4.112 | 0.824 | 98.4 |
Fragaria nipponica | 206.4 | 2x = 14 | 1.952 | 0.617 | 46.7 |
Fragaria nubicola | 203.7 | 2x = 14 | 1.982 | 0.618 | 92.0 |
Fragaria orientalis | 214.2 | 4x = 28 | 1.913 | 0.480 | 23.9 |
Fragaria vesca | 220.8 | 2x = 14 | 36,100 | 7900 | 98.2 |
Fragaria viridis | 214.9 | 2x = 14 | 29,200 | 3500 | 94.2 |
Fragaria nilgerrensis | 270.3 | 2x = 14 | 38,300 | 8510 | 93.8 |
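
The scaffold and contig N50 columns above summarize assembly contiguity. As a minimal sketch of the standard definition (N50 is the length of the shortest sequence in the smallest set of longest sequences that together cover at least half of the total assembly length), the snippet below computes N50 from a list of lengths. The function name `n50` and the example lengths are hypothetical illustrations, not GDS code or GDS data.

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths.

    Sequences are sorted from longest to shortest and accumulated until
    the running total reaches at least half of the summed length; the
    length of the sequence that crosses that threshold is the N50.
    """
    if not lengths:
        raise ValueError("lengths must not be empty")
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length


# Hypothetical scaffold lengths in kb (illustrative only, not GDS data).
example_lengths_kb = [80, 70, 50, 40, 30, 20, 10]
print(n50(example_lengths_kb))  # prints 70
```

Because N50 depends only on the length distribution, it says nothing about correctness or gene content, which is why the table reports BUSCO completeness alongside it.
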
Data Type | Count |
---|---|
Nuclear genome | 8 |
Chloroplast genome | 7 |
Coding sequence | 455,467 |
Protein | 436,160 |
GO term | 3,107,804 |
KEGG | 309,589 |
Gene family | 243,687 |
Signal peptide | 27,481 |
Transcription factor (TF) | 1918 |
© 2021 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Zhou, Y.; Qiao, Y.; Ni, Z.; Du, J.; Xiong, J.; Cheng, Z.; Chen, F. GDS: A Genomic Database for Strawberries (Fragaria spp.). Horticulturae 2022, 8, 41. https://doi.org/10.3390/horticulturae8010041