Article

The Complete Chloroplast Genome Sequences of the Medicinal Plant Forsythia suspensa (Oleaceae)

1
College of Life Science, Shanxi Agricultural University, Taigu 030801, China
2
College of Plant Protection, Northwest Agriculture & Forestry University, Yangling 712100, China
*
Authors to whom correspondence should be addressed.
Int. J. Mol. Sci. 2017, 18(11), 2288; https://doi.org/10.3390/ijms18112288
Submission received: 12 September 2017 / Revised: 24 October 2017 / Accepted: 25 October 2017 / Published: 31 October 2017
(This article belongs to the Special Issue Chloroplast)

Abstract

Forsythia suspensa is an important medicinal plant that has traditionally been used to treat inflammation, pyrexia, gonorrhea, diabetes, and other conditions. However, limited sequence and genomic information is available for F. suspensa. Here, we produced the complete chloroplast genome of F. suspensa using Illumina sequencing technology. F. suspensa is the first sequenced member of the genus Forsythia (Oleaceae). The gene order and organization of the F. suspensa chloroplast genome are similar to those of other Oleaceae chloroplast genomes. The F. suspensa chloroplast genome is 156,404 bp in length and exhibits a conserved quadripartite structure, with a large single-copy (LSC; 87,159 bp) region and a small single-copy (SSC; 17,811 bp) region interspersed between inverted repeat (IRa/b; 25,717 bp) regions. A total of 114 unique genes were annotated, including 80 protein-coding genes, 30 tRNA genes, and four rRNA genes. The low GC content (37.8%) and codon usage bias for A- or T-ending codons may largely affect gene codon usage. Sequence analysis identified 26 forward repeats and 23 palindrome repeats with lengths ≥30 bp (identity > 90%), as well as 54 simple sequence repeats (SSRs) with an average rate of 0.35 SSRs/kb. We predicted 52 RNA editing sites in the chloroplast of F. suspensa, all of which were C-to-U transitions. IR expansion or contraction and the divergent regions were analyzed among several species, including F. suspensa as reported in this study. Phylogenetic analysis based on whole plastomes revealed that F. suspensa, as a member of the Oleaceae family, diverged relatively early from the Lamiales lineage. This study will contribute to medicinal resource conservation, molecular phylogenetics, and genetic engineering research on this species.

1. Introduction

Forsythia suspensa (Thunb.) Vahl, known as “Lianqiao” in Chinese, is a well-known plant used in traditional Asian medicine and is widely distributed in many Asian and European countries [1]. In folk medicine, the extract of the dried fruit has long been used to treat a variety of diseases, such as inflammation, pyrexia, gonorrhea, tonsillitis, and ulcers [2]. In recent years, the dried ripe fruit of F. suspensa has often been prescribed for the treatment of diabetes in China [3,4].
Chloroplast (cp) genomes are mostly circular DNA molecules, which have a typical quadripartite structure composed of a large single copy (LSC) region and a small single copy (SSC) region interspersed between two copies of inverted repeats (IRa/b) [5]. The cp genome sequences can provide vast information not only about genes and their encoded proteins, but also on functional implications and evolutionary relationships [6]. Due to high-throughput capabilities and relatively low costs, next-generation sequencing techniques have made it more convenient to obtain a large number of cp genome sequences [7]. After the first complete cp DNA sequences were reported in Nicotiana tabacum [8] and Marchantia polymorpha [9], complete cp DNA sequences of numerous plant species were determined [6,10,11,12]. To date, approximately 1300 plant cp genomes are publicly available as part of the National Center for Biotechnology Information (NCBI) database.
Within the Oleaceae family, the complete cp genomes of several plant species have been published [12,13,14,15], thereby providing additional evidence for the evolution and conservation of cp genomes. Nevertheless, no cp genome from the genus Forsythia has been reported, and few data are available for the F. suspensa cp genome.
To characterize the complete cp genome sequence of F. suspensa and expand our understanding of the diversity of the genus Forsythia, we report the details of the cp genome structure and organization in this paper. F. suspensa is also the first sequenced member of the genus Forsythia (Oleaceae). We compare the F. suspensa cp genome with previously annotated cp genomes of other Lamiales species. Our study could provide basic data for medicinal species conservation and for molecular phylogenetic research on the genus Forsythia and Lamiales.

2. Results and Discussions

2.1. Genome Features

Whole genome sequencing on an Illumina Hiseq 4000 PE150 platform generated 19,241,634 raw reads. Clean reads were obtained by removing adaptors and low-quality read pairs. We then collected 662,793 cp-genome-related reads (3.44% of total reads), reaching an average coverage of 636× over the cp genome. With PCR-based experiments, we closed the gaps and validated the sequence assembly, and ultimately obtained a complete F. suspensa cp genome sequence, which was then submitted to GenBank (accession number: MF579702).
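The read fraction and coverage figures above can be reproduced with a short calculation. The sketch below assumes a read length of 150 bp, which is inferred from the "PE150" platform label rather than stated explicitly in the text; the function names are illustrative only.

```python
# Assumption: each read is 150 bp long (inferred from the "PE150" label).
READ_LENGTH_BP = 150


def mapped_fraction(mapped_reads: int, total_reads: int) -> float:
    """Percentage of all reads that were assigned to the cp genome."""
    return 100.0 * mapped_reads / total_reads


def mean_coverage(mapped_reads: int, read_length_bp: int, genome_length_bp: int) -> float:
    """Average sequencing depth: total mapped bases divided by genome length."""
    return mapped_reads * read_length_bp / genome_length_bp


print(f"cp-related reads: {mapped_fraction(662_793, 19_241_634):.2f}% of total")
print(f"mean coverage: {mean_coverage(662_793, READ_LENGTH_BP, 156_404):.0f}x")
```

Under this assumption the reported values (3.44% and 636×) follow directly from the read counts and the genome length.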
Most cp genomes of higher plants have a typical quadripartite structure composed of an LSC region and an SSC region interspersed between the IRa/b regions [5]. The complete cp genome of F. suspensa has a total length of 156,404 bp, with a pair of IRs of 25,717 bp each that separate an LSC region of 87,159 bp and an SSC region of 17,811 bp (Figure 1). The total GC content was 37.8%, which was similar to that of the published Oleaceae cp genomes [12,13,14,15]. The GC content of the IR regions was 43.2%, higher than that of the LSC and SSC regions (35.8% and 31.8%, respectively).
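As a consistency check on these figures, the region lengths should sum to the genome length, and the length-weighted mean of the regional GC contents should agree with the overall GC content. The sketch below uses only the numbers reported above; the `gc_content` helper is a generic illustration and is not taken from the authors' pipeline.

```python
# Region name -> (length in bp, GC content in %), as reported in the text.
REGIONS = {
    "LSC": (87_159, 35.8),
    "SSC": (17_811, 31.8),
    "IRa": (25_717, 43.2),
    "IRb": (25_717, 43.2),
}


def total_length(regions: dict) -> int:
    """Sum of all region lengths in bp."""
    return sum(length for length, _ in regions.values())


def weighted_gc(regions: dict) -> float:
    """Length-weighted mean GC content (%) across regions."""
    weighted = sum(length * gc for length, gc in regions.values())
    return weighted / total_length(regions)


def gc_content(seq: str) -> float:
    """GC percentage of a nucleotide sequence (generic helper)."""
    seq = seq.upper()
    return 100.0 * sum(seq.count(base) for base in "GC") / len(seq)


print(total_length(REGIONS))           # genome length implied by the regions
print(round(weighted_gc(REGIONS), 1))  # overall GC implied by regional values
```

Both derived values match the reported genome length (156,404 bp) and overall GC content (37.8%).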
The gene content and sequence of the F. suspensa cp genome are relatively conserved and show the basic characteristics of land plant cp genomes [16]. The genome encodes a total of 114 unique genes, of which 19 are duplicated in the IR regions. Of the 114 genes, 80 are protein-coding genes (70.2%), 30 are tRNA genes (26.3%), and four are rRNA genes (rrn5, rrn4.5, rrn16, rrn23) (3.5%) (Table 1). Eighteen genes contained introns: fifteen (nine protein-coding and six tRNA genes) contained one intron, and three (rps12, ycf3, and clpP) contained two introns (Table 2). The rps12 gene is trans-spliced, with its three exons distributed between the LSC region and the IR regions. The complete matK gene was located within the intron of trnK-UUU. One pseudogene (a non-functional duplicate of a functional gene), ycf1, was identified at the boundary between IRb and SSC. The partial duplication of the gene might have caused its loss of protein-coding ability. In general, the junctions between the IR and LSC/SSC regions vary among higher plant cp genomes [17,18,19]. In the F. suspensa cp genome, the ycf1 gene extended into the IR region at the IR/SSC junctions, while rpl2 was 51 bp away from the LSC/IR junction.
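The gene-class percentages and intron tallies in this paragraph are simple proportions of the 114 unique genes. A minimal sketch that recomputes them from the reported counts (the dictionary keys are illustrative labels):

```python
# Unique gene counts by class, as reported in the text.
GENE_COUNTS = {"protein-coding": 80, "tRNA": 30, "rRNA": 4}

# Intron-containing genes: one-intron (protein-coding + tRNA) and two-intron.
ONE_INTRON = {"protein-coding": 9, "tRNA": 6}
TWO_INTRONS = ["rps12", "ycf3", "clpP"]


def class_shares(counts: dict) -> dict:
    """Percentage of each gene class among all unique genes, to one decimal."""
    total = sum(counts.values())
    return {name: round(100.0 * n / total, 1) for name, n in counts.items()}


print(sum(GENE_COUNTS.values()))                   # total unique genes
print(class_shares(GENE_COUNTS))                   # share of each class
print(sum(ONE_INTRON.values()) + len(TWO_INTRONS)) # intron-containing genes
```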

2.2. Comparison to Other Lamiales Species

The IR regions are highly conserved and play an important role in stabilizing the cp genome structure [20,21]. Expansion and contraction of the boundaries between the IR and SC regions are commonly considered the main mechanism behind the length variation of angiosperm cp genomes [22,23]. In this study, we compared the LSC/IRb/SSC/IRa junctions of seven Lamiales cp genomes (Figure 2) and observed expansions and contractions in the IR boundary regions.
The rps19 genes of the four Oleaceae species were all located entirely in the LSC region, whereas the IR region expanded into the rps19 gene in the other three genomes, creating a short rps19 pseudogene of 43 bp, 30 bp, and 40 bp at the IRa/LSC border in S. miltiorrhiza, S. indicum, and S. takesimensis, respectively. The border between IRb and SSC extended into the ycf1 gene, creating ycf1 pseudogenes in all seven species. The length of the ycf1 pseudogene was very similar among the four Oleaceae species (1091 or 1092 bp) and was longer than that in S. miltiorrhiza (1056 bp), S. indicum (1012 bp), and S. takesimensis (886 bp). Overlaps between the ycf1 pseudogene and the ndhF gene were detected in five cp genomes (all except S. indicum and S. takesimensis), and the overlaps had similar lengths (25 or 26 bp) in the four Oleaceae species. The trnH-GUG genes were all located in the LSC region, 3–22 bp from the LSC/IRa boundary. Overall, the IR/SC junctions of the Oleaceae species were similar to one another and differed somewhat from those of Lamiaceae (S. miltiorrhiza), Pedaliaceae (S. indicum), and Scrophulariaceae (S. takesimensis). Our results suggest that the cp genomes of closely related species might be conserved, whereas greater diversity might occur among species belonging to different families, such as the loss of one inverted repeat in the cp genome of Astragalus membranaceus [24] and the large inversions in Eucommia ulmoides [25].

2.3. Codon Usage Analysis

Synonymous codons often have different usage frequencies in plant genomes, a phenomenon termed codon usage bias. A variety of evolutionary factors that affect gene mutation and selection may lead to codon bias [26,27].
To examine codon usage, the effective number of codons (Nc) of 52 protein-coding genes (PCGs) was calculated. The Nc values for each PCG in F. suspensa are shown in Table S2. The Nc values ranged from 37.83 (rps14) to 54.75 (ycf3) across the selected PCGs. Most Nc values were greater than 44, which suggested a weak codon bias in the F. suspensa cp genome. The rps14 gene showed the most biased codon usage, with the lowest Nc value of 37.83. Table 3 shows the codon usage and relative synonymous codon usage (RSCU). Thirty codons had RSCU values >1, indicating preferential usage in the F. suspensa cp genes. Interestingly, 29 of these 30 codons were A- or T-ending. Conversely, the G- or C-ending codons exhibited the opposite pattern (RSCU values < 1), indicating that they are less common in F. suspensa cp genes. Stop codon usage was biased toward TAA. A similar bias toward A- or T-ending codons was also found in poplar, rice, and other plants [28,29,30].
The factors affecting codon usage may vary among genes or species. In a related study, Zhou et al. [30] considered genomic nucleotide mutation bias to be a main cause of codon bias in seed plants such as Arabidopsis and poplar. Morton [31] reported that cp gene codon usage was largely affected by the asymmetric mutation of cp DNA in Euglena gracilis. Our results suggest that the low GC content and the bias toward A- or T-ending codons may be major factors shaping cp gene codon usage in F. suspensa.
The 52 unique PCGs comprised 63,555 bp that encoded 21,185 codons. The amino acid (AA) frequencies of the F. suspensa cp genome were further computed. Of these codons, 2237 (10.56%) encode leucine, which was the most frequently used AA in the F. suspensa cp genome (Table 3). Cysteine was the least common AA, encoded by only 223 (1.05%) codons.
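The codon total and amino acid frequencies quoted above follow from the reported counts: the number of codons is the coding length divided by three, and each AA frequency is its codon count divided by the codon total. A minimal sketch using only the figures in the text:

```python
CODING_LENGTH_BP = 63_555  # combined length of the 52 selected PCGs


def codon_total(coding_length_bp: int) -> int:
    """Number of codons in a coding length that is a multiple of three."""
    if coding_length_bp % 3 != 0:
        raise ValueError("coding length is not a multiple of 3")
    return coding_length_bp // 3


def aa_frequency(aa_codons: int, total_codons: int) -> float:
    """Percentage of all codons that encode a given amino acid."""
    return 100.0 * aa_codons / total_codons


total = codon_total(CODING_LENGTH_BP)
print(total)
print(f"leucine:  {aa_frequency(2237, total):.2f}%")
print(f"cysteine: {aa_frequency(223, total):.2f}%")
```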

2.4. Repeats and Simple Sequence Repeats Analysis

Repeat sequences in the F. suspensa cp genome were analyzed with REPuter, and no complement or reverse repeats were found. Twenty-six forward repeats and 23 palindrome repeats were detected with lengths ≥ 30 bp (identity > 90%) (Table 4). Of the 49 repeats, 34 (69.4%) were 30–39 bp long, 11 (22.4%) were 40–49 bp long, and four (8.2%) were 50–59 bp long; the longest repeat was 58 bp. Generally, repeats are mostly distributed in noncoding regions [32,33]; however, 53.1% of the repeats in the F. suspensa cp genome were located in coding regions (CDS) (Figure 3A), mainly in ycf2, similar to the pattern in S. dentata and S. takesimensis [34]. Meanwhile, 40.8% of the repeats were located in intergenic spacers (IGS) and introns, and 6.1% spanned parts of both IGS and CDS.
Simple sequence repeats (SSRs) are widely distributed across the entire genome and exert significant influence on genome recombination and rearrangement [35]. As valuable molecular markers, SSRs have been used in polymorphism investigations and population genetics [36,37]. The occurrence, type, and distribution of SSRs were analyzed in the F. suspensa cp genome. In total, we detected 54 SSRs in the F. suspensa cp genome (Table 5), accounting for 700 bp of the total sequence (0.45%). The majority of these SSRs were mono- and di-nucleotide repeats, which were found 35 and seven times, respectively. Tri- (1), tetra- (4), and penta-nucleotide (1) repeats were detected at a much lower frequency. Six compound SSRs were also found. Fifty SSRs (92.6%) were composed of A and T nucleotides, while tandem G or C repeats were quite rare, in agreement with other studies [38,39]. Of these SSRs, 42 (77.8%) and six (11.1%) were located in IGS and introns, respectively, together accounting for 88.9% of all SSRs (Figure 3B). Only five SSRs were found in coding genes, including rpoC2, rpoA, and ndhD, and one spanned parts of both IGS and CDS. In addition, almost all SSRs were located in the LSC region, except for (T)19, and no SSRs were detected in the IR regions. These SSRs may be developed into lineage-specific markers, which might be useful in evolutionary and genetic diversity studies.
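The summary statistics for the SSRs can be recomputed from the counts given in this section and the genome length of 156,404 bp. The sketch below is only an arithmetic check; the variable names are illustrative.

```python
GENOME_LENGTH_BP = 156_404

# SSR counts by class, as reported in the text.
SSR_CLASSES = {"mono": 35, "di": 7, "tri": 1, "tetra": 4, "penta": 1, "compound": 6}
TOTAL_SSRS = sum(SSR_CLASSES.values())


def per_kb(count: int, genome_length_bp: int) -> float:
    """Density expressed as events per kilobase."""
    return count / (genome_length_bp / 1000.0)


def percent(part: float, whole: float) -> float:
    """Part of a whole expressed as a percentage."""
    return 100.0 * part / whole


print(TOTAL_SSRS)                                    # all SSR classes combined
print(round(per_kb(TOTAL_SSRS, GENOME_LENGTH_BP), 2))  # SSRs per kb
print(round(percent(700, GENOME_LENGTH_BP), 2))        # % of genome in SSRs
print(round(percent(50, TOTAL_SSRS), 1))               # % of SSRs made of A/T
print(round(percent(42 + 6, TOTAL_SSRS), 1))           # % in IGS plus introns
```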

2.5. Predicted RNA Editing Sites in the F. suspensa Chloroplast Genes

In the F. suspensa cp genome, we predicted 52 RNA editing sites, which occurred in 21 genes (Table 6). The ndhB gene contained the most editing sites (10), a finding consistent with other plants such as rice, maize, and tomato [40,41,42]. Meanwhile, the genes ndhD and rpoB were predicted to have six editing sites; matK, five; rpoC2, three; accD, ndhA, ndhF, ndhG, and petB, two; and one each in atpA, atpF, atpI, ccsA, petG, psbE, rpl2, rpl20, rpoA, rps2, and rps14. All of these editing sites were C-to-U transitions. RNA editing is also commonly found in the chloroplasts and mitochondria of seed plants [43]. Fourteen, 38, and zero editing sites were located at the first, second, and third codon positions, respectively. Of the 52 sites, 20 were of the U_A type, a bias similar to that found in previous studies of RNA editing sites [10,44]. In addition, 48 RNA editing events in the F. suspensa cp genome led to amino acid changes to highly hydrophobic residues, such as leucine, isoleucine, valine, tryptophan, and tyrosine. Conversions from serine to leucine were the most frequent. This feature of RNA editing, a form of post-transcriptional regulation of gene expression, has already been reported in most RNA editing studies [44], and our results provide additional supporting evidence.
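To illustrate how a C-to-U edit at a given codon position changes the encoded amino acid, the sketch below applies the edit to a single codon and translates it before and after using the standard genetic code. The example codons are illustrative and are not taken from Table 6; the function `edit_codon` is a hypothetical helper, with U written as T in the DNA alphabet.

```python
from itertools import product

# Standard genetic code, built in TCAG order (third position varies fastest).
BASES = "TCAG"
AMINO = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def edit_codon(codon: str, position: int) -> tuple:
    """Apply a C-to-U edit at a 1-based codon position.

    Returns (amino acid before editing, amino acid after editing).
    """
    codon = codon.upper().replace("U", "T")
    if codon[position - 1] != "C":
        raise ValueError("no C at the requested codon position")
    edited = codon[: position - 1] + "T" + codon[position:]
    return CODON_TABLE[codon], CODON_TABLE[edited]


# A second-position edit of UCA (serine) yields UUA (leucine), i.e. a U_A-type
# site in which the edited base is flanked by U and A.
print(edit_codon("TCA", 2))
# A first-position edit of CAU (histidine) yields UAU (tyrosine).
print(edit_codon("CAT", 1))
```

Second-position edits such as serine to leucine replace a polar residue with a hydrophobic one, which is the kind of change described in the paragraph above.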

2.6. Phylogeny Reconstruction of Lamiales Based on Complete Chloroplast Genome Sequences

Complete cp genomes contain abundant phylogenetic information that can be applied to phylogenetic studies of angiosperms [11,45,46]. To identify the evolutionary position of F. suspensa within Lamiales, an improved resolution of phylogenetic relationships was achieved by using the whole cp genome sequences of 36 Lamiales species. Three species, C. arabica, I. purpurea, and O. nivara, were chosen as outgroups. The maximum likelihood (ML) bootstrap values were fairly high, with values ≥ 98% for 32 of the 36 nodes, and 30 nodes had 100% bootstrap support (Figure 4). F. suspensa, whose cp genome is reported in this study, was closely related to A. distichum, and these two species formed a cluster with H. palmeri, J. nudiflorum, and the Olea species from Oleaceae with 100% bootstrap support. Notably, Oleaceae diverged relatively early from the Lamiales lineage. In addition, four phylogenetic relationships were supported only by lower ML bootstrap values, possibly because fewer samples were available for these families. The cp genome is also expected to be useful in resolving the deeper branches of the phylogeny as more whole genome sequences become available.

3. Materials and Methods

3.1. Plant Materials

Samples of F. suspensa were collected in Zezhou County, Shanxi Province, China. The voucher specimens were deposited in the Herbarium of Shanxi Agricultural University, Taigu, China. Additionally, the location of the specimens was not within any protected area.

3.2. DNA Library Preparation, Sequencing, and Genome Assembly

Genomic DNA was extracted from fresh young leaves of the F. suspensa plant using the mCTAB method [47]. Genomic DNA was fragmented into 400–600 bp fragments using a Covaris M220 Focused-ultrasonicator (Covaris, Woburn, MA, USA). Library preparation was conducted using the NEBNext® Ultra™ DNA Library Prep Kit for Illumina (New England Biolabs, Ipswich, MA, USA). Sample sequencing was carried out on an Illumina Hiseq 4000 PE150 platform.
Next, raw sequence reads were assembled into contigs using SPAdes [48], CLC Genomics Workbench 8 (Available online: http://www.clcbio.com), and SOAPdenovo2 [49]. Chloroplast genome contigs were selected by BLAST (Available online: http://blast.ncbi.nlm.nih.gov/) [50] and were assembled by Sequencher 4.10 (Available online: http://genecodes.com/). All reads were mapped to the cp genome using Geneious 8.1 [51], which verified the selected contigs. Gaps were closed by specific primer design, PCR amplification, and Sanger sequencing. Finally, we obtained a high-quality complete F. suspensa cp genome, and the result was submitted to NCBI (Accession Number: MF579702).

3.3. Genome Annotation and Comparative Genomics

Chloroplast genome annotation was performed using DOGMA (Dual Organellar GenoMe Annotator) [52] (Available online: http://dogma.ccbb.utexas.edu/). Putative protein-coding genes, tRNAs, and rRNAs were identified by BLASTX and BLASTN searches (Available online: http://blast.ncbi.nlm.nih.gov/). The cp genome map was drawn using OrganellarGenomeDRAW [53] (Available online: http://ogdraw.mpimp-golm.mpg.de/index.shtml), with subsequent manual editing. The boundaries between the IR and SC regions of F. suspensa and six other Lamiales species were compared and analyzed.

3.4. Repeat Sequence Analyses

The REPuter program [54] (Available online: https://bibiserv.cebitec.uni-bielefeld.de/reputer) was used to identify repeats including forward, reverse, palindrome, and complement sequences. The length and identity of the repeats were limited to ≥30 bp and >90%, respectively, with the Hamming distance equal to 3 [55,56]. The cp SSRs were detected using MISA [57] with the minimum repeats of mono-, di-, tri-, tetra-, penta-, and hexanucleotides set to 10, 5, 4, 3, 3, and 3, respectively.
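The SSR thresholds above can be illustrated with a small scanner. This is a simplified sketch written for illustration, not the MISA implementation: it reports perfect tandem repeats only, ignores compound SSRs, and skips motifs that are themselves repeats of a shorter unit (so that a poly-A run is not also reported as an "AA" dinucleotide repeat). The function names are hypothetical.

```python
# Minimum number of repeat units per motif length, as stated in the text.
MIN_REPEATS = {1: 10, 2: 5, 3: 4, 4: 3, 5: 3, 6: 3}


def is_primitive(motif: str) -> bool:
    """True if the motif is not a repetition of a shorter unit."""
    n = len(motif)
    return not any(
        n % k == 0 and motif == motif[:k] * (n // k) for k in range(1, n)
    )


def find_ssrs(seq: str, min_repeats: dict = MIN_REPEATS) -> list:
    """Return sorted (1-based start, motif, repeat count) for perfect SSRs."""
    seq = seq.upper()
    hits = []
    for unit_len, min_rep in min_repeats.items():
        i = 0
        while i + unit_len <= len(seq):
            motif = seq[i : i + unit_len]
            # Count consecutive copies of the motif starting at position i.
            count = 1
            while seq[i + count * unit_len : i + (count + 1) * unit_len] == motif:
                count += 1
            if count >= min_rep and is_primitive(motif) and set(motif) <= set("ACGT"):
                hits.append((i + 1, motif, count))
                i += count * unit_len  # continue after the reported repeat
            else:
                i += 1
    return sorted(hits)


# Toy sequence (not from the F. suspensa genome): a 12-bp poly-A run and
# six copies of AT.
toy = "GC" + "A" * 12 + "GCGC" + "AT" * 6 + "CCG"
print(find_ssrs(toy))
```

On the toy sequence the scanner reports the mononucleotide run and the dinucleotide repeat once each, and does not report the same stretches again under longer motif lengths.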

3.5. Codon Usage

To ensure sampling accuracy, only the 52 PCGs with a length >300 bp were selected for synonymous codon usage analysis. Two relevant parameters, Nc and RSCU, were calculated using the program CodonW 1.4.2 (Available online: http://downloads.fyxm.net/CodonW-76666.html). Nc is often used to evaluate codon bias at the individual gene level and ranges from 20 (extremely biased) to 61 (totally unbiased) [58]. RSCU is the observed frequency of a codon divided by its expected frequency under equal usage of synonymous codons; values close to 1.0 indicate a lack of bias [59]. AA frequency was also calculated, expressed as the percentage of codons encoding a given amino acid among all codons.
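The RSCU definition above can be written out directly: for each amino acid, the observed count of a codon is divided by the count expected if all of its synonymous codons were used equally. The sketch below is an independent illustration using the standard genetic code, not the CodonW implementation; the function name `rscu` and the toy input are assumptions made for the example.

```python
from collections import Counter, defaultdict
from itertools import product

# Standard genetic code, built in TCAG order (third position varies fastest).
BASES = "TCAG"
AMINO = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def rscu(cds_sequences: list) -> dict:
    """Relative synonymous codon usage for a set of in-frame coding sequences."""
    counts = Counter()
    for cds in cds_sequences:
        cds = cds.upper()
        # Walk the sequence codon by codon, ignoring a trailing partial codon.
        for i in range(0, len(cds) - len(cds) % 3, 3):
            codon = cds[i : i + 3]
            if codon in CODON_TABLE:
                counts[codon] += 1

    # Group synonymous codons by the amino acid (or stop) they encode.
    families = defaultdict(list)
    for codon, aa in CODON_TABLE.items():
        families[aa].append(codon)

    values = {}
    for codons in families.values():
        total = sum(counts[c] for c in codons)
        if total == 0:
            continue  # amino acid absent from the input
        expected = total / len(codons)
        for c in codons:
            values[c] = counts[c] / expected
    return values


# Toy input: alanine encoded three times by GCT and once by GCA.
result = rscu(["GCTGCTGCTGCA"])
print({codon: result[codon] for codon in ("GCT", "GCC", "GCA", "GCG")})
```

In the toy input the four alanine codons have an expected count of 1 each, so GCT (observed three times) is over-represented and GCC and GCG are absent.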

3.6. Prediction of RNA Editing Sites

Prep-Cp [60] (Available online: http://prep.unl.edu/) and CURE software [61] (Available online: http://bioinfo.au.tsinghua.edu.cn/pure/) were applied to the prediction of RNA editing sites, and the parameter threshold (cutoff value) was set to 0.8 to ensure prediction accuracy.

3.7. Phylogenomic Analyses

ML phylogenetic analyses were performed using the F. suspensa complete cp genome and 32 Lamiales plastomes with three species, Coffea arabica, Ipomoea purpurea, and Oryza nivara, as outgroups (Table S1). All of the plastome sequences were aligned using MAFFT program version 7.0 [62] (Available online: http://mafft.cbrc.jp/alignment/server/index.html) and adjusted manually where necessary. These plastome nucleotide alignments were subjected to ML phylogenetic analyses with MEGA7.0 [63] based on the General Time Reversible model. A discrete Gamma distribution was used to model evolutionary rate differences among sites. The branch support was estimated by rapid bootstrap analyses using 100 pseudo-replicates.

4. Conclusions

The cp genome of the medicinal plant F. suspensa is reported for the first time in this study, and its organization is described and compared with that of other Lamiales species. This genome is 156,404 bp in length, with a quadripartite structure and genomic contents common to most land plant cp genomes. The low GC content of the cp genome might have caused the codon usage bias toward A- or T-ending codons. All of the predicted RNA editing sites in the genome were C-to-U transitions. Among several related species, the genome size and IR expansion or contraction exhibited some differences, and the divergent regions were also analyzed. Repeat sequences and SSRs within F. suspensa were analyzed, which may be useful in developing molecular markers for the analysis of infraspecific genetic differentiation within the genus Forsythia (Oleaceae). Phylogenetic analysis based on the entire cp genome revealed that F. suspensa, as a member of the Oleaceae family, diverged relatively early from the Lamiales lineage. Overall, the sequences and annotation of the F. suspensa cp genome will facilitate medicinal resource conservation, as well as molecular phylogenetic and genetic engineering research on this species.

Supplementary Materials

Supplementary materials can be found at www.mdpi.com/1422-0067/18/11/2288/s1.

Acknowledgments

This work was supported by the Research Project Supported by the Shanxi Scholarship Council of China (2015-066), the Preferential Research Project Supported by Ministry of Human Resources and Social Security of the People’s Republic of China, Doctor Research Grant of Shanxi Agricultural University (XB2009002), the Modernization Base Construction Project of Traditional Chinese Medicine Supported by the Health and Family Planning Commission of Shanxi Province, and Doctor Research Grant of Shanxi Agricultural University (2016ZZ04).

Author Contributions

Wenbin Wang, Huan Yu, Jiahui Wang, Wanjun Lei, Jianhua Gao, and Xiangpo Qiu carried out the experiments; and Wenbin Wang and Jinsheng Wang designed the project and wrote the manuscript.

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. Zhao, L.; Yan, X.; Shi, J.; Ren, F.; Liu, L.; Sun, S.; Shan, B. Ethanol extract of Forsythia suspensa root induces apoptosis of esophageal carcinoma cells via the mitochondrial apoptotic pathway. Mol. Med. Rep. 2015, 11, 871–880. [Google Scholar] [CrossRef] [PubMed]
  2. Piao, X.L.; Jang, M.H.; Cui, J.; Piao, X. Lignans from the fruits of Forsythia suspensa. Bioorg. Med. Chem. Lett. 2008, 18, 1980–1984. [Google Scholar] [CrossRef] [PubMed]
  3. Kang, W.; Wang, J.; Zhang, L. α-glucosidase inhibitors from leaves of Forsythia suspense in Henan province. China J. Chin. Mater. Med. 2010, 35, 1156–1159. [Google Scholar]
  4. Bu, Y.; Shi, T.; Meng, M.; Kong, G.; Tian, Y.; Chen, Q.; Yao, X.; Feng, G.; Chen, H.; Lu, Z. A novel screening model for the molecular drug for diabetes and obesity based on tyrosine phosphatase Shp2. Bioorg. Med. Chem. Lett. 2011, 21, 874–878. [Google Scholar] [CrossRef] [PubMed]
  5. Wicke, S.; Schneeweiss, G.M.; dePamphilis, C.W.; Muller, K.F.; Quandt, D. The evolution of the plastid chromosome in land plants: Gene content, gene order, gene function. Plant Mol. Biol. 2011, 76, 273–297. [Google Scholar] [CrossRef] [PubMed]
  6. He, Y.; Xiao, H.; Deng, C.; Xiong, L.; Yang, J.; Peng, C. The complete chloroplast genome sequences of the medicinal plant Pogostemon cablin. Int. J. Mol. Sci. 2016, 17, 820. [Google Scholar] [CrossRef] [PubMed]
  7. Shendure, J.; Ji, H. Next-generation DNA sequencing. Nat. Biotechnol. 2008, 26, 1135–1145. [Google Scholar] [CrossRef] [PubMed]
  8. Shinozaki, K.; Ohme, M.; Tanaka, M.; Wakasugi, T.; Hayashida, N.; Matsubayashi, T.; Zaita, N.; Chunwongse, J.; Obokata, J.; Yamaguchi-Shinozaki, K.; et al. The complete nucleotide sequence of the tobacco chloroplast genome: Its gene organization and expression. EMBO J. 1986, 5, 2043–2049. [Google Scholar] [CrossRef] [PubMed]
  9. Ohyama, K.; Fukuzawa, H.; Kohchi, T.; Shirai, H.; Sano, T.; Sano, S.; Umesono, K.; Shiki, Y.; Takeuchi, M; Chang, Z.; et al. Chloroplast gene organization deduced from complete sequence of liverwort marchantia polymorpha chloroplast DNA. Nature 1986, 572–574. [Google Scholar] [CrossRef]
  10. Maier, R.M.; Neckermann, K.; Igloi, G.L.; Kossel, H. Complete sequence of the maize chloroplast genome: Gene content, hotspots of divergence and fine tuning of genetic information by transcript editing. J. Mol. Biol. 1995, 251, 614–628. [Google Scholar] [CrossRef] [PubMed]
  11. Kim, K.; Lee, S.C.; Lee, J.; Yu, Y.; Yang, K.; Choi, B.S.; Koh, H.J.; Waminal, N.E.; Choi, H.I.; Kim, N.H.; et al. Complete chloroplast and ribosomal sequences for 30 accessions elucidate evolution of Oryza AA genome species. Sci. Rep. 2015, 5, 15655. [Google Scholar] [CrossRef] [PubMed]
  12. Besnard, G.; Hernandez, P.; Khadari, B.; Dorado, G.; Savolainen, V. Genomic profiling of plastid DNA variation in the Mediterranean olive tree. BMC Plant Biol. 2011, 11, 80. [Google Scholar] [CrossRef] [PubMed] [Green Version]
  13. Lee, H.L.; Jansen, R.K.; Chumley, T.W.; Kim, K.J. Gene relocations within chloroplast genomes of Jasminum and Menodora (Oleaceae) are due to multiple, overlapping inversions. Mol. Biol. Evol. 2007, 24, 1161–1180. [Google Scholar] [CrossRef] [PubMed]
  14. Zedane, L.; Hong-Wa, C.; Murienne, J.; Jeziorski, C.; Baldwin, B.G.; Besnard, G. Museomics illuminate the history of an extinct, paleoendemic plant lineage (Hesperelaea, Oleaceae) known from an 1875 collection from Guadalupe island, Mexico. Biol. J. Linn. Soc. 2015, 117, 44–57. [Google Scholar] [CrossRef]
  15. Kim, H.-W.; Lee, H.-L.; Lee, D.-K.; Kim, K.-J. Complete plastid genome sequences of abeliophyllum distichum nakai (oleaceae), a Korea endemic genus. Mitochondrial DNA Part B 2016, 1, 596–598. [Google Scholar] [CrossRef]
  16. Sugiura, M. The chloroplast genome. Plant Mol. Biol. 1992, 19, 149–168. [Google Scholar] [CrossRef] [PubMed]
  17. Zhang, Y.J.; Ma, P.F.; Li, D.Z. High-throughput sequencing of six bamboo chloroplast genomes: Phylogenetic implications for temperate woody bamboos (poaceae: Bambusoideae). PLoS ONE 2011, 6, e20596. [Google Scholar] [CrossRef] [PubMed]
  18. Xu, Q.; Xiong, G.; Li, P.; He, F.; Huang, Y.; Wang, K.; Li, Z.; Hua, J. Analysis of complete nucleotide sequences of 12 Gossypium chloroplast genomes: Origin and evolution of allotetraploids. PLoS ONE 2012, 7, e37128. [Google Scholar] [CrossRef] [PubMed]
  19. Wang, S.; Shi, C.; Gao, L.Z. Plastid genome sequence of a wild woody oil species, prinsepia utilis, provides insights into evolutionary and mutational patterns of rosaceae chloroplast genomes. PLoS ONE 2013, 8, e73946. [Google Scholar] [CrossRef] [PubMed]
  20. Marechal, A.; Brisson, N. Recombination and the maintenance of plant organelle genome stability. New Phytol. 2010, 186, 299–317. [Google Scholar] [CrossRef] [PubMed]
  21. Fu, J.; Liu, H.; Hu, J.; Liang, Y.; Liang, J.; Wuyun, T.; Tan, X. Five complete chloroplast genome sequences from diospyros: Genome organization and comparative analysis. PLoS ONE 2016, 11, e0159566. [Google Scholar] [CrossRef] [PubMed]
  22. Chumley, T.W.; Palmer, J.D.; Mower, J.P.; Fourcade, H.M.; Calie, P.J.; Boore, J.L.; Jansen, R.K. The complete chloroplast genome sequence of pelargonium x hortorum: Organization and evolution of the largest and most highly rearranged chloroplast genome of land plants. Mol. Biol. Evol. 2006, 23, 2175–2190. [Google Scholar] [CrossRef] [PubMed]
  23. Yang, M.; Zhang, X.; Liu, G.; Yin, Y.; Chen, K.; Yun, Q.; Zhao, D.; Al-Mssallem, I.S.; Yu, J. The complete chloroplast genome sequence of date palm (Phoenix dactylifera L.). PLoS ONE 2010, 5, e12762. [Google Scholar] [CrossRef] [PubMed]
  24. Lei, W.; Ni, D.; Wang, Y.; Shao, J.; Wang, X.; Yang, D.; Wang, J.; Chen, H.; Liu, C. Intraspecific and heteroplasmic variations, gene losses and inversions in the chloroplast genome of Astragalus membranaceus. Sci. Rep. 2016, 6, 21669. [Google Scholar] [CrossRef] [PubMed]
  25. Wang, L.; Wuyun, T.N.; Du, H.; Wang, D.; Cao, D. Complete chloroplast genome sequences of Eucommia ulmoides: Genome structure and evolution. Tree Genet. Genomes 2016, 12, 12. [Google Scholar] [CrossRef]
  26. Ermolaeva, M.D. Synonymous codon usage in bacteria. Curr. Issues Mol. Biol. 2001, 3, 91–97. [Google Scholar] [PubMed]
  27. Wong, G.K.; Wang, J.; Tao, L.; Tan, J.; Zhang, J.; Passey, D.A.; Yu, J. Compositional gradients in gramineae genes. Genome Res. 2002, 12, 851–856. [Google Scholar] [CrossRef] [PubMed]
  28. Liu, Q.; Xue, Q. Codon usage in the chloroplast genome of rice (Oryza sativa L. ssp. japonica). Acta Agron. Sin. 2004, 30, 1220–1224. [Google Scholar]
  29. Zhou, M.; Long, W.; Li, X. Analysis of synonymous codon usage in chloroplast genome of Populus alba. For. Res. 2008, 19, 293–297.
  30. Zhou, M.; Long, W.; Li, X. Patterns of synonymous codon usage bias in chloroplast genomes of seed plants. For. Sci. Pract. 2008, 10, 235–242.
  31. Morton, B.R. The role of context-dependent mutations in generating compositional and codon usage bias in grass chloroplast DNA. J. Mol. Evol. 2003, 56, 616–629.
  32. Nazareno, A.G.; Carlsen, M.; Lohmann, L.G. Complete chloroplast genome of Tanaecium tetragonolobum: The first Bignoniaceae plastome. PLoS ONE 2015, 10, e0129930.
  33. Yao, X.; Tang, P.; Li, Z.; Li, D.; Liu, Y.; Huang, H. The first complete chloroplast genome sequences in Actinidiaceae: Genome structure and comparative analysis. PLoS ONE 2015, 10, e0129347.
  34. Ni, L.; Zhao, Z.; Dorje, G.; Ma, M. The complete chloroplast genome of ye-xing-ba (Scrophularia dentata; Scrophulariaceae), an alpine Tibetan herb. PLoS ONE 2016, 11, e0158488.
  35. Cavalier-Smith, T. Chloroplast evolution: Secondary symbiogenesis and multiple losses. Curr. Biol. 2002, 12, R62–R64.
  36. Xue, J.; Wang, S.; Zhou, S.L. Polymorphic chloroplast microsatellite loci in Nelumbo (Nelumbonaceae). Am. J. Bot. 2012, 99, e240–e244.
  37. Hu, J.; Gui, S.; Zhu, Z.; Wang, X.; Ke, W.; Ding, Y. Genome-wide identification of SSR and SNP markers based on whole-genome re-sequencing of a Thailand wild sacred lotus (Nelumbo nucifera). PLoS ONE 2015, 10, e0143765.
  38. Qian, J.; Song, J.; Gao, H.; Zhu, Y.; Xu, J.; Pang, X.; Yao, H.; Sun, C.; Li, X.; Li, C.; et al. The complete chloroplast genome sequence of the medicinal plant Salvia miltiorrhiza. PLoS ONE 2013, 8, e57607.
  39. Kuang, D.Y.; Wu, H.; Wang, Y.L.; Gao, L.M.; Zhang, S.Z.; Lu, L. Complete chloroplast genome sequence of Magnolia kwangsiensis (Magnoliaceae): Implication for DNA barcoding and population genetics. Genome 2011, 54, 663–673.
  40. Freyer, R.; López, C.; Maier, R.M.; Martín, M.; Sabater, B.; Kössel, H. Editing of the chloroplast ndhB encoded transcript shows divergence between closely related members of the grass family (Poaceae). Plant Mol. Biol. 1995, 29, 679–684.
  41. Kahlau, S.; Aspinall, S.; Gray, J.C.; Bock, R. Sequence of the tomato chloroplast DNA and evolutionary comparison of solanaceous plastid genomes. J. Mol. Evol. 2006, 63, 194–207.
  42. Chateigner-Boutin, A.L.; Small, I. A rapid high-throughput method for the detection and quantification of RNA editing based on high-resolution melting of amplicons. Nucleic Acids Res. 2007, 35, e114.
  43. Bock, R. Sense from nonsense: How the genetic information of chloroplasts is altered by RNA editing. Biochimie 2000, 82, 549–557.
  44. Jiang, Y.; Yun, H.E.; Fan, S.L.; Jia-Ning, Y.U.; Song, M.Z. The identification and analysis of RNA editing sites of 10 chloroplast protein-coding genes from virescent mutant of Gossypium hirsutum. Cotton Sci. 2011, 23, 3–9.
  45. Jansen, R.K.; Cai, Z.; Raubeson, L.A.; Daniell, H.; Depamphilis, C.W.; Leebens-Mack, J.; Muller, K.F.; Guisinger-Bellian, M.; Haberle, R.C.; Hansen, A.K.; et al. Analysis of 81 genes from 64 plastid genomes resolves relationships in angiosperms and identifies genome-scale evolutionary patterns. Proc. Natl. Acad. Sci. USA 2007, 104, 19369–19374.
  46. Huang, H.; Shi, C.; Liu, Y.; Mao, S.Y.; Gao, L.Z. Thirteen Camellia chloroplast genome sequences determined by high-throughput sequencing: Genome structure and phylogenetic relationships. BMC Evol. Biol. 2014, 14, 151.
  47. Li, J.; Wang, S.; Yu, J.; Wang, L.; Zhou, S. A modified CTAB protocol for plant DNA extraction. Chin. Bull. Bot. 2013, 48, 72–78.
  48. Bankevich, A.; Nurk, S.; Antipov, D.; Gurevich, A.A.; Dvorkin, M.; Kulikov, A.S.; Lesin, V.M.; Nikolenko, S.I.; Pham, S.; Prjibelski, A.D.; et al. SPAdes: A new genome assembly algorithm and its applications to single-cell sequencing. J. Comput. Biol. 2012, 19, 455–477.
  49. Luo, R.; Liu, B.; Xie, Y.; Li, Z.; Huang, W.; Yuan, J.; He, G.; Chen, Y.; Pan, Q.; Liu, Y.; et al. SOAPdenovo2: An empirically improved memory-efficient short-read de novo assembler. GigaScience 2012, 1, 18.
  50. Altschul, S.F.; Madden, T.L.; Schaffer, A.A.; Zhang, J.; Zhang, Z.; Miller, W.; Lipman, D.J. Gapped BLAST and PSI-BLAST: A new generation of protein database search programs. Nucleic Acids Res. 1997, 25, 3389–3402.
  51. Kearse, M.; Moir, R.; Wilson, A.; Stones-Havas, S.; Cheung, M.; Sturrock, S.; Buxton, S.; Cooper, A.; Markowitz, S.; Duran, C.; et al. Geneious Basic: An integrated and extendable desktop software platform for the organization and analysis of sequence data. Bioinformatics 2012, 28, 1647–1649.
  52. Wyman, S.K.; Jansen, R.K.; Boore, J.L. Automatic annotation of organellar genomes with DOGMA. Bioinformatics 2004, 20, 3252–3255.
  53. Lohse, M.; Drechsel, O.; Kahlau, S.; Bock, R. OrganellarGenomeDRAW: A suite of tools for generating physical maps of plastid and mitochondrial genomes and visualizing expression data sets. Nucleic Acids Res. 2013, 41, W575–W581.
  54. Kurtz, S.; Choudhuri, J.V.; Ohlebusch, E.; Schleiermacher, C.; Stoye, J.; Giegerich, R. REPuter: The manifold applications of repeat analysis on a genomic scale. Nucleic Acids Res. 2001, 29, 4633–4642.
  55. Vieira Ldo, N.; Faoro, H.; Rogalski, M.; Fraga, H.P.; Cardoso, R.L.; de Souza, E.M.; de Oliveira Pedrosa, F.; Nodari, R.O.; Guerra, M.P. The complete chloroplast genome sequence of Podocarpus lambertii: Genome structure, evolutionary aspects, gene content and SSR detection. PLoS ONE 2014, 9, e90618.
  56. Chen, J.; Hao, Z.; Xu, H.; Yang, L.; Liu, G.; Sheng, Y.; Zheng, C.; Zheng, W.; Cheng, T.; Shi, J. The complete chloroplast genome sequence of the relict woody plant Metasequoia glyptostroboides Hu et Cheng. Front. Plant Sci. 2015, 6, 447.
  57. Thiel, T.; Michalek, W.; Varshney, R.K.; Graner, A. Exploiting EST databases for the development and characterization of gene-derived SSR-markers in barley (Hordeum vulgare L.). Theor. Appl. Genet. 2003, 106, 411–422.
  58. Wright, F. The effective number of codons used in a gene. Gene 1990, 87, 23–29.
  59. Sharp, P.M.; Tuohy, T.M.; Mosurski, K.R. Codon usage in yeast: Cluster analysis clearly differentiates highly and lowly expressed genes. Nucleic Acids Res. 1986, 14, 5125–5143.
  60. Mower, J.P. The PREP suite: Predictive RNA editors for plant mitochondrial genes, chloroplast genes and user-defined alignments. Nucleic Acids Res. 2009, 37, W253–W259.
  61. Du, P.; Jia, L.; Li, Y. CURE-chloroplast: A chloroplast C-to-U RNA editing predictor for seed plants. BMC Bioinform. 2009, 10, 135.
  62. Katoh, K.; Standley, D. MAFFT multiple sequence alignment software version 7: Improvements in performance and usability. Mol. Biol. Evol. 2013, 30, 772–780.
  63. Kumar, S.; Stecher, G.; Tamura, K. MEGA7: Molecular evolutionary genetics analysis version 7.0 for bigger datasets. Mol. Biol. Evol. 2016, 33, 1870–1874.
Figure 1. Chloroplast genome map of Forsythia suspensa. Genes drawn inside the circle are transcribed clockwise, and those outside are transcribed counterclockwise. Genes are color-coded by function, as shown in the legend at the bottom left. The inner circle indicates the inverted repeat boundaries and the GC content.
Figure 2. Comparison of the LSC, SSC, and IR region borders among six Lamiales chloroplast genomes. Ψ indicates a pseudogene. Color coding distinguishes the genes on either side of the junctions. The numbers above the gene features give the distance between the gene ends and the junction sites, and the arrows mark the span each distance refers to. This figure is not to scale.
Figure 3. Distribution of repeat sequences and simple sequence repeats (SSRs) within the F. suspensa chloroplast genome. (A) Distribution of repeats; and (B) distribution of SSRs. IGS: intergenic spacer.
Figure 4. Maximum likelihood phylogeny of the Lamiales species inferred from complete chloroplast genome sequences. Numbers near branches are bootstrap values from 100 pseudo-replicates. The tree in the right panel was redrawn manually from the left one, and its branch lengths carry no meaning. Branches without numbers have 100% bootstrap support.
Table 1. A list of genes found in the plastid genome of Forsythia suspensa.
| Category of Genes | Group of Genes | Name of Genes |
|---|---|---|
| Photosynthesis related genes | Rubisco | rbcL |
| | Photosystem I | psaA, psaB, psaC, psaI, psaJ |
| | Assembly/stability of photosystem I | ycf3 *, ycf4 |
| | Photosystem II | psbA, psbB, psbC, psbD, psbE, psbF, psbH, psbI, psbJ, psbK, psbL, psbM, psbN, psbT, psbZ |
| | ATP synthase | atpA, atpB, atpE, atpF *, atpH, atpI |
| | cytochrome b/f complex | petA, petB *, petD *, petG, petL, petN |
| | cytochrome c synthesis | ccsA |
| | NADPH dehydrogenase | ndhA *, ndhB *, ndhC, ndhD, ndhE, ndhF, ndhG, ndhH, ndhI, ndhJ |
| Transcription and translation related genes | transcription | rpoA, rpoB, rpoC1 *, rpoC2 |
| | ribosomal proteins | rps2, rps3, rps4, rps7, rps8, rps11, rps12 *, rps14, rps15, rps16 *, rps18, rps19, rpl2 *, rpl14, rpl16 *, rpl20, rpl22, rpl23, rpl32, rpl33, rpl36 |
| | translation initiation factor | infA |
| RNA genes | ribosomal RNA | rrn5, rrn4.5, rrn16, rrn23 |
| | transfer RNA | trnA-UGC *, trnC-GCA, trnD-GUC, trnE-UUC, trnF-GAA, trnG-UCC *, trnG-GCC *, trnH-GUG, trnI-CAU, trnI-GAU *, trnK-UUU *, trnL-CAA, trnL-UAA *, trnL-UAG, trnfM-CAU, trnM-CAU, trnN-GUU, trnP-UGG, trnQ-UUG, trnR-ACG, trnR-UCU, trnS-GCU, trnS-GGA, trnS-UGA, trnT-GGU, trnT-UGU, trnV-GAC, trnV-UAC *, trnW-CCA, trnY-GUA |
| Other genes | RNA processing | matK |
| | carbon metabolism | cemA |
| | fatty acid synthesis | accD |
| | proteolysis | clpP * |
| Genes of unknown function | conserved reading frames | ycf1, ycf2, ycf15, ndhK |

* indicates intron-containing genes.
Table 2. Genes with introns within the F. suspensa chloroplast genome and the length of exons and introns.
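| Gene | Location | Exon I (bp) | Intron I (bp) | Exon II (bp) | Intron II (bp) | Exon III (bp) |
|---|---|---|---|---|---|---|
| trnA-UGC | IR | 38 | 814 | 35 | | |
| trnG-GCC | LSC | 24 | 676 | 48 | | |
| trnI-GAU | IR | 42 | 942 | 35 | | |
| trnK-UUU | LSC | 38 | 2494 | 37 | | |
| trnL-UAA | LSC | 37 | 473 | 50 | | |
| trnV-UAC | LSC | 38 | 572 | 37 | | |
| rps12 * | LSC | 114 | - | 231 | 536 | 27 |
| rps16 | LSC | 40 | 864 | 227 | | |
| atpF | LSC | 144 | 705 | 411 | | |
| rpoC1 | LSC | 445 | 758 | 1619 | | |
| ycf3 | LSC | 129 | 714 | 228 | 737 | 153 |
| clpP | LSC | 69 | 815 | 291 | 642 | 228 |
| petB | LSC | 6 | 707 | 642 | | |
| petD | LSC | 8 | 713 | 475 | | |
| rpl16 | LSC | 9 | 865 | 399 | | |
| rpl2 | IR | 393 | 664 | 435 | | |
| ndhB | IR | 777 | 679 | 756 | | |
| ndhA | SSC | 555 | 1106 | 531 | | |

* rps12 is a trans-spliced gene, with its 5′ end located in the LSC region and its duplicated 3′ end in the IR regions.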
Table 3. The relative synonymous codon usage of the Forsythia suspensa chloroplast genome.
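| Amino Acid | Codon | Number | RSCU | AA Frequency | Amino Acid | Codon | Number | RSCU | AA Frequency |
|---|---|---|---|---|---|---|---|---|---|
| Phe | UUU | 779 | **1.32** | 5.59% | Ser | UCU | 472 | **1.76** | 7.59% |
| | UUC | 405 | 0.68 | | | UCC | 247 | 0.92 | |
| Leu | UUA | 720 | **1.93** | 10.56% | | UCA | 307 | **1.15** | |
| | UUG | 451 | **1.21** | | | UCG | 152 | 0.57 | |
| | CUU | 486 | **1.30** | | | AGU | 339 | **1.26** | |
| | CUC | 129 | 0.35 | | | AGC | 91 | 0.34 | |
| | CUA | 301 | 0.81 | | Pro | CCU | 351 | **1.55** | 4.26% |
| | CUG | 150 | 0.40 | | | CCC | 170 | 0.75 | |
| Ile | AUU | 890 | **1.47** | 8.57% | | CCA | 269 | **1.19** | |
| | AUC | 377 | 0.62 | | | CCG | 113 | 0.50 | |
| | AUA | 548 | 0.91 | | Thr | ACU | 430 | **1.63** | 4.98% |
| Met | AUG | 495 | 1.00 | 2.34% | | ACC | 201 | 0.76 | |
| Val | GUU | 423 | **1.48** | 5.41% | | ACA | 324 | **1.23** | |
| | GUC | 126 | 0.44 | | | ACG | 100 | 0.38 | |
| | GUA | 447 | **1.56** | | Ala | GCU | 526 | **1.84** | 5.41% |
| | GUG | 151 | 0.53 | | | GCC | 177 | 0.62 | |
| Tyr | UAU | 631 | **1.61** | 3.70% | | GCA | 328 | **1.14** | |
| | UAC | 152 | 0.39 | | | GCG | 115 | 0.40 | |
| TER | UAA | 28 | **1.62** | 0.25% | Cys | UGU | 171 | **1.53** | 1.05% |
| | UAG | 10 | 0.58 | | | UGC | 52 | 0.47 | |
| | UGA | 14 | 0.81 | | Arg | CGU | 275 | **1.30** | 6.00% |
| His | CAU | 404 | **1.58** | 2.42% | | CGC | 90 | 0.42 | |
| | CAC | 108 | 0.42 | | | CGA | 284 | **1.34** | |
| Gln | CAA | 595 | **1.52** | 3.69% | | CGG | 97 | 0.46 | |
| | CAG | 186 | 0.48 | | Arg | AGA | 392 | **1.85** | |
| Asn | AAU | 796 | **1.56** | 4.81% | | AGG | 133 | 0.63 | |
| | AAC | 224 | 0.44 | | Gly | GGU | 493 | **1.33** | 7.00% |
| Lys | AAA | 837 | **1.54** | 5.15% | | GGC | 145 | 0.39 | |
| | AAG | 253 | 0.46 | | | GGA | 594 | **1.60** | |
| Asp | GAU | 690 | **1.59** | 4.09% | | GGG | 251 | 0.68 | |
| | GAC | 176 | 0.41 | | Glu | GAA | 866 | **1.54** | 5.32% |
| Trp | UGG | 386 | 1.00 | 1.82% | | GAG | 262 | 0.46 | |

Relative synonymous codon usage (RSCU) values > 1 are highlighted in bold.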
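The RSCU values in Table 3 can be reproduced from the codon counts alone: a codon's RSCU is its observed count divided by the mean count of all synonymous codons for the same amino acid. The following is a minimal sketch of that calculation using the Phe and Tyr counts from the table; the function name `rscu` is illustrative and is not taken from the software used in this study.

```python
def rscu(counts):
    """Return the RSCU of each codon in one synonymous family.

    counts: dict mapping codon -> observed count for a single amino acid.
    RSCU = observed count / (family total / number of synonymous codons).
    """
    mean = sum(counts.values()) / len(counts)
    return {codon: n / mean for codon, n in counts.items()}


# Codon counts copied from Table 3.
phe = rscu({"UUU": 779, "UUC": 405})
tyr = rscu({"UAU": 631, "UAC": 152})

for family in (phe, tyr):
    for codon, value in family.items():
        print(f"{codon}: {value:.2f}")
```

For a two-codon family the two RSCU values always sum to 2, so a value above 1 for the A- or U-ending codon directly implies a value below 1 for its G- or C-ending partner.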
Table 4. Repetitive sequences of Forsythia suspensa calculated using REPuter.
| No. | Size (bp) | Type # | Repeat 1 Start (Location) | Repeat 2 Start (Location) | Region |
|---|---|---|---|---|---|
| 1 | 30 | F | 10,814 (trnG-GCC *) | 38,746 (trnG-UCC) | LSC |
| 2 | 30 | F | 17,447 (rps2-rpoC2) | 17,448 (rps2-rpoC2) | LSC |
| 3 | 30 | F | 44,547 (psaA-ycf3) | 44,550 (psaA-ycf3) | LSC |
| 4 | 30 | F | 45,978 (ycf3 intron2) | 101,338 (rps12_3end-trnV-GAC) | LSC, IRa |
| 5 | 30 | F | 91,923 (ycf2) | 91,965 (ycf2) | IRa |
| 6 | 30 | F | 110,167 (rrn4.5-rrn5) | 110,198 (rrn4.5-rrn5) | IRa |
| 7 | 30 | F | 133,335 (rrn5-rrn4.5) | 133,366 (rrn5-rrn4.5) | IRb |
| 8 | 30 | F | 149,178 (ycf2) | 149,214 (ycf2) | IRb |
| 9 | 30 | F | 149,196 (ycf2) | 149,214 (ycf2) | IRb |
| 10 | 30 | F | 151,568 (ycf2) | 151,610 (ycf2) | IRb |
| 11 | 32 | F | 9313 (trnS-GCU *) | 37,781 (psbC-trnS-UGA *) | LSC |
| 12 | 32 | F | 40,965 (psaB) | 43,189 (psaA) | LSC |
| 13 | 32 | F | 53,338 (ndhC-trnV-UAC) | 53,358 (ndhC-trnV-UAC) | LSC |
| 14 | 32 | F | 115,350 (ndhF-rpl32) | 115,378 (ndhF-rpl32) | SSC |
| 15 | 34 | F | 94,332 (ycf2) | 94,368 (ycf2) | IRa |
| 16 | 34 | F | 94,350 (ycf2) | 94,368 (ycf2) | IRa |
| 17 | 35 | F | 149,188 (ycf2) | 149,206 (ycf2) | IRb |
| 18 | 39 | F | 45,966 (ycf3 intron2) | 101,326 (rps12_3end-trnV-GAC) | LSC, IRa |
| 19 | 39 | F | 45,966 (ycf3 intron2) | 122,604 (ndhA intron1) | LSC, SSC |
| 20 | 41 | F | 40,953 (psaB) | 43,177 (psaA) | LSC |
| 21 | 41 | F | 101,324 (rps12_3end-trnV-GAC) | 122,602 (ndhA intron) | IRa, SSC |
| 22 | 42 | F | 94,320 (ycf2) | 94,356 (ycf2) | IRa |
| 23 | 42 | F | 149,165 (ycf2) | 149,201 (ycf2) | IRb |
| 24 | 44 | F | 94,340 (ycf2) | 94,358 (ycf2) | IRa |
| 25 | 58 | F | 94,332 (ycf2) | 94,340 (ycf2) | IRa |
| 26 | 58 | F | 149,165 (ycf2) | 149,183 (ycf2) | IRb |
| 27 | 30 | P | 9315 (trnS-GCU *) | 47,653 (trnS-GGA) | LSC |
| 28 | 30 | P | 14,359 (atpF-atpH) | 14,359 (atpF-atpH) | LSC |
| 29 | 30 | P | 34,338 (trnT-GGU-psbD) | 34,338 (trnT-GGU-psbD) | LSC |
| 30 | 30 | P | 37,783 (psbC-trnS-UGA *) | 47,653 (trnS-GGA) | LSC |
| 31 | 30 | P | 45,978 (ycf3 intron2) | 142,195 (trnV-GAC-rps12_3end) | LSC, IRb |
| 32 | 30 | P | 91,923 (ycf2) | 151,568 (ycf2) | IRa, IRb |
| 33 | 30 | P | 91,965 (ycf2) | 151,610 (ycf2) | IRa, IRb |
| 34 | 30 | P | 110,167 (rrn4.5-rrn5) | 133,335 (rrn5-rrn4.5) | IRa, IRb |
| 35 | 30 | P | 110,198 (rrn4.5-rrn5) | 133,366 (rrn5-rrn4.5) | IRa, IRb |
| 36 | 30 | P | 122,764 (ndhA intron1) | 122,766 (ndhA intron1) | SSC |
| 37 | 34 | P | 94,332 (ycf2) | 149,161 (ycf2) | IRa, IRb |
| 38 | 34 | P | 94,350 (ycf2) | 149,161 (ycf2) | IRa, IRb |
| 39 | 34 | P | 94,368 (ycf2) | 149,179 (ycf2) | IRa, IRb |
| 40 | 34 | P | 94,368 (ycf2) | 149,179 (ycf2) | IRa, IRb |
| 41 | 39 | P | 45,966 (ycf3 intron2) | 45,966 (ycf3 intron2) | LSC, IRb |
| 42 | 41 | P | 122,602 (ndhA intron1) | 142,198 (trnV-GAC-rps12_3end) | SSC, IRb |
| 43 | 42 | P | 94,320 (ycf2) | 149,165 (ycf2) | IRa, IRb |
| 44 | 42 | P | 94,356 (ycf2) | 149,201 (ycf2) | IRa, IRb |
| 45 | 44 | P | 77,475 (psbT-psbN) | 77,475 (psbT-psbN) | LSC |
| 46 | 44 | P | 94,340 (ycf2) | 149,161 (ycf2) | IRa, IRb |
| 47 | 44 | P | 94,358 (ycf2) | 149,179 (ycf2) | IRa, IRb |
| 48 | 58 | P | 94,332 (ycf2) | 149,165 (ycf2) | IRa, IRb |
| 49 | 58 | P | 94,340 (ycf2) | 149,183 (ycf2) | IRa, IRb |

# F: forward; P: palindrome; * partly within the gene.
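The Region column of Table 4 follows from the start coordinates once the region lengths are known (LSC 87,159 bp, IR 25,717 bp, SSC 17,811 bp, as reported in the abstract). The sketch below assumes the regions are laid out in the order LSC, IRa, SSC, IRb starting from position 1; that ordering is inferred from the table entries rather than stated explicitly, so the junction coordinates it produces should be treated as approximate. The names `region`, `NAMES`, and `ENDS` are illustrative.

```python
from bisect import bisect_left

# Region lengths (bp) reported for the F. suspensa plastome.
LSC, IR, SSC = 87_159, 25_717, 17_811

# Assumed layout from position 1: LSC, IRa, SSC, IRb.
NAMES = ["LSC", "IRa", "SSC", "IRb"]
# Last base of each region under that assumed layout.
ENDS = [LSC, LSC + IR, LSC + IR + SSC, LSC + 2 * IR + SSC]


def region(pos):
    """Map a 1-based genome coordinate to the name of its region."""
    if not 1 <= pos <= ENDS[-1]:
        raise ValueError(f"position {pos} is outside the genome (1..{ENDS[-1]})")
    return NAMES[bisect_left(ENDS, pos)]


# Start coordinates taken from Table 4 (repeats 1, 15, 14, and 8).
for start in (10_814, 94_332, 115_350, 149_178):
    print(start, region(start))
```

The last entry of `ENDS` equals the full genome length of 156,404 bp, which is a quick consistency check on the three region lengths.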
Table 5. Distribution of SSR loci in the chloroplast genome of Forsythia suspensa.
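| SSR Type # | SSR Sequence | Size (bp) | Start | SSR Location | Region |
|---|---|---|---|---|---|
| p1 | (A)10 | 10 | 31,855 | psbM-trnD-GUC | LSC |
| | | 10 | 31,992 | psbM-trnD-GUC | LSC |
| | | 10 | 38,025 | trnS-UGA-psbZ | LSC |
| | | 10 | 73,886 | clpP intron1 | LSC |
| | | 10 | 85,390 | rpl16 intron | LSC |
| | (T)10 | 10 | 507 | trnH-GUG-psbA | LSC |
| | | 10 | 9056 | psbK-psbI | LSC |
| | | 10 | 11,162 | trnR-UCU-atpA | LSC |
| | | 10 | 59,781 | rbcL-accD | LSC |
| | | 10 | 66,291 | petA-psbJ | LSC |
| | | 10 | 69,202 | petL-petG | LSC |
| | (C)10 | 10 | 5236 | trnK-UUU-rps16 | LSC |
| | (T)11 | 11 | 19,678 | rpoC2 | LSC |
| | | 11 | 50,871 | trnF-GAA-ndhJ | LSC |
| | | 11 | 61,662 | accD-psaI | LSC |
| | | 11 | 72,263 | rpl20-clpP | LSC |
| | | 11 | 74,741 | clpP intron2 | LSC |
| | (T)12 | 12 | 20,216 | rpoC2 | LSC |
| | | 12 | 81,254 | rpoA | LSC |
| | | 12 | 83,666 | rps8-rpl14 | LSC |
| | (A)13 | 13 | 12,741 | atpA-atpF | LSC |
| | | 13 | 46,877 | ycf3-trnS-GGA | LSC |
| | (T)13 | 13 | 14,109 | atpF-atpH | LSC |
| | | 13 | 34,486 | trnT-GGU-psbD | LSC |
| | | 13 | 37,645 | psbC-trnS-UGA | LSC |
| | | 13 | 86,860 | rpl22-rps19 | LSC |
| | (T)14 | 14 | 48,630 | rps4-trnT-UGU | LSC |
| | (A)15 | 15 | 33,163 | trnE-UUC-trnT-GGU | LSC |
| | (A)16 | 16 | 46,618 | ycf3 intron2 | LSC |
| | (A)19 | 19 | 44,559 | psaA-ycf3 | LSC |
| | (T)19 | 19 | 117,928 | ndhD | SSC |
| | (A)20 | 20 | 29,957 | trnC-GCA-petN | LSC |
| p2 | (AT)5 | 10 | 4646 | trnK-UUU-rps16 | LSC |
| | | 10 | 6558 | rps16-trnQ-UUG | LSC |
| | | 10 | 21,057 | rpoC2 | LSC |
| | (TA)5 | 10 | 69,619 | trnW-CCA-trnP-UGG | LSC |
| | (TA)6 | 12 | 48,772 | rps4-trnT-UGU | LSC |
| | | 12 | 49,291 | trnT-UGU-trnL-UAA | LSC |
| | | 12 | 69,931 | trnP-UGG-psaJ | LSC |
| p3 | (CCT)4 | 12 | 69,371 | petG-trnW-CCA | LSC |
| p4 | (AAAG)3 | 12 | 73,413 | clpP intron1 | LSC |
| | (TCTT)3 | 12 | 31,191 | petN-psbM | LSC |
| | (TTTA)3 | 12 | 55,102 | trnM-CAU-atpE | LSC |
| | (AAAT)4 | 16 | 9284 | psbI-trnS-GCU | LSC |
| p5 | (TCTAT)3 | 15 | 9458 | trnS-GCU-trnG-GCC | LSC |
| c | - | 23 | 17,456 | rps2-rpoC2 | LSC |
| | - | 27 | 63,589 | ycf4-cemA | LSC |
| | - | 33 | 78,324 | petB intron | LSC |
| | - | 45 | 71,570 | rps18-rpl20 | LSC |
| | - | 59 | 38,501 | psbZ-trnG-UCC | LSC |
| | - | 90 | 57,078 | atpB * | LSC |

# p1: mono-nucleotide; p2: di-nucleotide; p3: tri-nucleotide; p4: tetra-nucleotide; p5: penta-nucleotide; c: compound; * partly within the gene.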
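Perfect SSRs of the kind listed in Table 5 can be located with a simple regular-expression scan. The sketch below is illustrative only: the minimum repeat counts (10 for mono-, 5 for di-, 4 for tri-, and 3 for tetra- and penta-nucleotide motifs) are assumptions inferred from the shortest entry of each type in the table, not confirmed settings of the tool used in this study, and compound SSRs are not handled. The names `MIN_REPEATS` and `find_ssrs` are hypothetical.

```python
import re

# Assumed minimum repeat counts per motif length, inferred from the shortest
# entries of each type in Table 5 (not confirmed settings of the original tool).
MIN_REPEATS = {1: 10, 2: 5, 3: 4, 4: 3, 5: 3}


def find_ssrs(seq):
    """Return sorted (start, motif, repeats, size) tuples; start is 1-based."""
    seq = seq.upper()
    hits = []
    for unit, min_rep in MIN_REPEATS.items():
        # Capture one motif of length `unit`, then require min_rep - 1 more copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (unit, min_rep - 1))
        for m in pattern.finditer(seq):
            motif = m.group(1)
            # Skip motifs that are repeats of a shorter unit (e.g. "AA", "TATA").
            if any(motif == motif[:k] * (unit // k)
                   for k in range(1, unit) if unit % k == 0):
                continue
            size = len(m.group(0))
            hits.append((m.start() + 1, motif, size // unit, size))
    return sorted(hits)


# Toy sequence containing (A)10, (TA)6, and (CCT)4 separated by short spacers.
toy = "GG" + "A" * 10 + "CC" + "TA" * 6 + "GG" + "CCT" * 4 + "G"
for hit in find_ssrs(toy):
    print(hit)
```

The filter on shorter sub-units matters: without it, the (TA)6 run in the toy sequence would also be reported as a tetra-nucleotide (TATA)3 repeat.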
Table 6. Predicted RNA editing sites in the Forsythia suspensa chloroplast genes.
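| Gene | Codon Position | Amino Acid Position | Codon (Amino Acid) Conversion | Score |
|---|---|---|---|---|
| accD | 794 | 265 | uCg (S) => uUg (L) | 0.8 |
| | 1403 | 468 | cCu (P) => cUu (L) | 1 |
| atpA | 914 | 305 | uCa (S) => uUa (L) | 1 |
| atpF | 92 | 31 | cCa (P) => cUa (L) | 0.86 |
| atpI | 629 | 210 | uCa (S) => uUa (L) | 1 |
| ccsA | 71 | 24 | aCu (T) => aUu (I) | 1 |
| matK | 271 | 91 | Ccu (P) => Ucu (S) | 0.86 |
| | 460 | 154 | Cac (H) => Uac (Y) | 1 |
| | 646 | 216 | Cau (H) => Uau (Y) | 1 |
| | 1180 | 394 | Cgg (R) => Ugg (W) | 1 |
| | 1249 | 417 | Cau (H) => Uau (Y) | 1 |
| ndhA | 344 | 115 | uCa (S) => uUa (L) | 1 |
| | 569 | 190 | uCa (S) => uUa (L) | 1 |
| ndhB | 149 | 50 | uCa (S) => uUa (L) | 1 |
| | 467 | 156 | cCa (P) => cUa (L) | 1 |
| | 586 | 196 | Cau (H) => Uau (Y) | 1 |
| | 611 | 204 | uCa (S) => uUa (L) | 0.8 |
| | 737 | 246 | cCa (P) => cUa (L) | 1 |
| | 746 | 249 | uCu (S) => uUu (F) | 1 |
| | 830 | 277 | uCa (S) => uUa (L) | 1 |
| | 836 | 279 | uCa (S) => uUa (L) | 1 |
| | 1292 | 431 | uCc (S) => uUc (F) | 1 |
| | 1481 | 494 | cCa (P) => cUa (L) | 1 |
| ndhD | 2 | 1 | aCg (T) => aUg (M) | 1 |
| | 47 | 16 | uCu (S) => uUu (F) | 0.8 |
| | 313 | 105 | Cgg (R) => Ugg (W) | 0.8 |
| | 878 | 293 | uCa (S) => uUa (L) | 1 |
| | 1298 | 433 | uCa (S) => uUa (L) | 0.8 |
| | 1310 | 437 | uCa (S) => uUa (L) | 0.8 |
| ndhF | 290 | 97 | uCa (S) => uUa (L) | 1 |
| | 671 | 224 | uCa (S) => uUa (L) | 1 |
| ndhG | 314 | 105 | aCa (T) => aUa (I) | 0.8 |
| | 385 | 129 | Cca (P) => Uca (S) | 0.8 |
| petB | 418 | 140 | Cgg (R) => Ugg (W) | 1 |
| | 611 | 204 | cCa (P) => cUa (L) | 1 |
| petG | 94 | 32 | Cuu (L) => Uuu (F) | 0.86 |
| psbE | 214 | 72 | Ccu (P) => Ucu (S) | 1 |
| rpl2 | 596 | 199 | gCg (A) => gUg (V) | 0.86 |
| rpl20 | 308 | 103 | uCa (S) => uUa (L) | 0.86 |
| rpoA | 830 | 277 | uCa (S) => uUa (L) | 1 |
| rpoB | 338 | 113 | uCu (S) => uUu (F) | 1 |
| | 551 | 184 | uCa (S) => uUa (L) | 1 |
| | 566 | 189 | uCg (S) => uUg (L) | 1 |
| | 1672 | 558 | Ccc (P) => Ucc (S) | 0.86 |
| | 2000 | 667 | uCu (S) => uUu (F) | 1 |
| | 2426 | 809 | uCa (S) => uUa (L) | 0.86 |
| rpoC2 | 1792 | 598 | Cgu (R) => Ugu (C) | 0.86 |
| | 2305 | 769 | Cgg (R) => Ugg (W) | 1 |
| | 3746 | 1249 | uCa (S) => uUa (L) | 0.86 |
| rps2 | 248 | 83 | uCa (S) => uUa (L) | 1 |
| rps14 | 80 | 27 | uCa (S) => uUa (L) | 1 |
| | 149 | 50 | cCa (P) => cUa (L) | 1 |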
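In Table 6, both the amino acid position and the edited base within the codon can be derived from the value in the "Codon Position" column, assuming that column is a 1-based nucleotide coordinate within the coding sequence (an interpretation consistent with the rows of the table, though not stated explicitly). A minimal sketch, with illustrative function names:

```python
def locate_edit(nt_pos):
    """Return (amino_acid_position, base_in_codon) for a 1-based CDS coordinate."""
    if nt_pos < 1:
        raise ValueError("CDS coordinates are 1-based")
    aa_pos = (nt_pos - 1) // 3 + 1        # index of the codon containing the base
    base_in_codon = (nt_pos - 1) % 3 + 1  # 1, 2, or 3 within that codon
    return aa_pos, base_in_codon


def apply_edit(codon, base_in_codon):
    """Apply a C-to-U edit at the given position of a codon, e.g. UCG, 2 -> UUG."""
    i = base_in_codon - 1
    if codon[i].upper() != "C":
        raise ValueError("only C-to-U edits are modelled")
    return codon[:i] + "U" + codon[i + 1:]


# Rows from Table 6: accD 794 (uCg => uUg) and ndhD 2 (aCg => aUg).
print(locate_edit(794), apply_edit("UCG", locate_edit(794)[1]))
print(locate_edit(2), apply_edit("ACG", locate_edit(2)[1]))
```

The ndhD row illustrates why editing matters functionally: the edit at position 2 converts the ACG codon at amino acid position 1 into AUG, the standard start codon.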

Share and Cite

MDPI and ACS Style

Wang, W.; Yu, H.; Wang, J.; Lei, W.; Gao, J.; Qiu, X.; Wang, J. The Complete Chloroplast Genome Sequences of the Medicinal Plant Forsythia suspensa (Oleaceae). Int. J. Mol. Sci. 2017, 18, 2288. https://doi.org/10.3390/ijms18112288
