Article

Complete Chloroplast Genome Sequence and Phylogenetic Analysis of the Medicinal Plant Artemisia annua

1 Institute of Chinese Materia Medica, Artemisinin Research Center, China Academy of Chinese Medical Sciences, Beijing 100700, China
2 School of Chinese Materia Medica, Tianjin University of Traditional Chinese Medicine, Tianjin 300193, China
3 College of Pharmacy, Hubei University of Chinese Medicine, Wuhan 430065, Hubei, China
4 College of Pharmacy and Chemistry, Dali University, Dali 671000, Yunnan, China
* Authors to whom correspondence should be addressed.
These authors contributed equally to this work.
Molecules 2017, 22(8), 1330; https://doi.org/10.3390/molecules22081330
Submission received: 23 June 2017 / Revised: 31 July 2017 / Accepted: 8 August 2017 / Published: 11 August 2017

Abstract

The complete chloroplast genome of Artemisia annua (Asteraceae), the primary source of artemisinin, was sequenced and analyzed. The A. annua cp genome is 150,955 bp and harbors a pair of inverted repeat regions (IRa and IRb) of 24,850 bp each, which separate the large (LSC, 82,988 bp) and small (SSC, 18,267 bp) single-copy regions. Our annotation revealed that the A. annua cp genome contains 113 unique genes, 18 of which are duplicated in the IR regions. The gene order in the SSC region of A. annua is inverted; this is consistent with the sequences of chloroplast genomes from three other Artemisia species. Fifteen forward and seventeen inverted repeats were detected in the genome. The abundance of SSR loci in the genome suggests opportunities for future population genetics work on this antimalarial medicinal plant. In A. annua cpDNA, the rps19 gene was found in the LSC region rather than the IR region, and the rps19 pseudogene was absent from the IR region. Sequence divergence analysis of five Asteraceae species indicated that the most highly divergent regions were found in the intergenic spacers, and that the differences between A. annua and A. fukudo were very slight. A phylogenetic analysis revealed a sister relationship between A. annua and A. fukudo. This study identified the unique characteristics of the A. annua cp genome. These results offer valuable information for future research on Artemisia species identification and for the selective breeding of A. annua with high pharmaceutical efficacy.

1. Introduction

Artemisia annua, an herbaceous annual with a strong volatile aroma, belongs to the genus Artemisia (Asteraceae). It is the sole natural source of the antimalarial drug artemisinin [1], and is cultivated as a high-value medicinal plant (Qing hao). Antimalarial artemisinin combination therapy (ACT) has received strong interest from the global health community because of the efficacy of artemisinin and its derivatives [2]. Furthermore, the 2015 Nobel Prize in Physiology or Medicine was awarded to Professor Youyou Tu for the discovery of artemisinin [3]. However, there are concerns that the production of high-quality artemisinin may not be sufficient to meet future demand [2].
A. annua has a broad, global distribution and has many distinct locally-adapted ecotypes [4]. Beyond China, A. annua is also present in Eastern Europe, North America, and elsewhere in Asia [5]. However, the artemisinin content of A. annua ecotypes varies widely from region to region [5]. With the exception of a few rare high-artemisinin ecotypes found in China, the artemisinin content of A. annua ecotypes is generally insufficient (i.e., <1%) for commercial extraction [6], and no other species has been found to be suitable for mass production of artemisinin [1,7]. Singlet oxygen released from chloroplasts in A. annua can upregulate the expression of genes involved in artemisinin biosynthesis, and can also drive the conversion of dihydroartemisinic acid to artemisinin [8,9].
In addition to their role in photosynthesis, chloroplasts are also involved in cytoplasmic male sterility (CMS) [10] and secondary metabolic activities [11]. The chloroplast (cp) genome has a conserved quadripartite structure: a large single-copy (LSC) region, a small single-copy (SSC) region, and two inverted repeat (IR) regions. The majority of angiosperm cp genomes exhibit significant conservation of gene order and contents [12]. However, large-scale genome rearrangements and intron gains and losses have been identified in several angiosperm lineages [13,14,15]. A draft cp genome assembly for A. annua is of great importance for exploring putative links between A. annua’s chloroplast function and its adaptability and phytochemical characteristics.
The transcriptome sequences and genetic map of A. annua have been previously reported [16,17,18], but little is known about its cp genomic structure. Here we report the complete chloroplast genome sequence of A. annua, along with a characterization of long repeats and SSRs, and comparative analyses of the cp genome as a whole. Comparative analyses with the cp genomes of other Asteraceae species revealed significant variation in genome size, highly divergent regions in intergenic spacers, and gene loss. Comprehensive cp genomic analyses will help to identify Artemisia species, provide insight into their evolutionary history, and improve the development of A. annua as a pharmacological resource [19,20].

2. Results and Discussion

2.1. Characteristics of A. annua cpDNA

The complete cp genome of A. annua is 150,955 bp in size, with a pair of IR regions of 24,850 bp that separate an LSC region of 82,988 bp from an SSC region of 18,267 bp (Table 1 and Figure 1). The overall GC and AT contents of the A. annua cp genome are 37.5% and 62.5%, respectively, which is similar to the cp genomes of other Asteraceae spp. [21,22,23]. The IR regions possess higher GC content (43%) than do the LSC (35.5%) or SSC regions (30.8%) (Table 1). Within the protein-coding regions (CDS), the AT content of the first, second, and third codon positions is 54.6%, 62.4%, and 70.0%, respectively (Table 1). The bias toward a higher AT representation at the third codon position has been found to be common in other plant cp genomes [15,24], and this bias is used to discriminate cpDNA from nuclear and mitochondrial DNA [25]. The coding regions constitute 52.6% of the genome, and therefore the non-coding regions (including introns, pseudogenes, and intergenic spacers) account for 47.4%.
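The region lengths and base-composition figures in this paragraph are simple to recompute from a sequence. The following is a minimal sketch, not the procedure used in this study (composition was computed with MEGA, see Section 3.2); the function names and the toy sequences are illustrative assumptions. Summing the four region lengths gives 150,955 bp, the genome size used in the comparisons of Section 2.4.

```python
# Region lengths (bp) as reported in the text; IRa and IRb are identical in size.
REGIONS = {"LSC": 82_988, "SSC": 18_267, "IRa": 24_850, "IRb": 24_850}


def total_length(regions):
    """Sum the four regions of a quadripartite chloroplast genome."""
    return sum(regions.values())


def gc_percent(seq):
    """Percentage of G and C among unambiguous bases (A, C, G, T)."""
    seq = seq.upper()
    counted = sum(seq.count(base) for base in "ACGT")
    if counted == 0:
        return 0.0
    return 100.0 * (seq.count("G") + seq.count("C")) / counted


def at_percent_by_codon_position(cds):
    """AT percentage at codon positions 1, 2 and 3 of an in-frame CDS."""
    cds = cds.upper()
    usable = len(cds) - len(cds) % 3  # ignore a trailing partial codon
    result = []
    for pos in range(3):
        bases = cds[pos:usable:3]
        at = bases.count("A") + bases.count("T")
        result.append(100.0 * at / len(bases) if bases else 0.0)
    return result


print(total_length(REGIONS))
print(gc_percent("ATGCGCAT"))                       # toy sequence, not real data
print(at_percent_by_codon_position("ATGGCCTAA"))    # toy CDS, not real data
```

Applied to the annotated CDS set of a genome, the last function would yield the three per-position AT values of the kind reported in Table 1.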
The A. annua cp genome encodes 113 predicted functional genes, including 80 protein-coding genes, 29 tRNA genes, and four rRNA genes (Table S1). In addition, there are 18 genes duplicated in the IR, making a total of 131 genes present in the A. annua cp genome (Figure 1). These genes have also been observed in Artemisia frigida [26]. Among these genes, seven protein-coding, seven tRNA, and all four rRNA genes are duplicated in the IR regions. The LSC region contains 62 protein-coding and 22 tRNA genes, whereas the SSC region contains one tRNA gene and 12 protein-coding genes.
Based on the sequences of protein-coding and tRNA genes, the frequency of codon usage was estimated for the A. annua cp genome and is summarized in Table 2. Together, all genes in the A. annua cp genome are encoded by 26,445 codons. Among these, leucine, with 2853 (10.7%) of the codons, is the most frequent amino acid in the cp genome, and cysteine, with 293 (1.1%), is the least frequent (Table 2). A- and U-ending codons were common. Except for trnL-CAA, all types of preferred synonymous codons (RSCU > 1) ended with A or U.
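Relative synonymous codon usage (RSCU) is the observed count of a codon divided by the count expected if all synonymous codons for that amino acid were used equally, so values above 1 mark preferred codons. The sketch below illustrates that definition only; it is not the MEGA implementation used for Table 2, and the helper names and the toy sequence are assumptions. It uses the standard amino acid assignments, which the plastid code shares.

```python
from collections import Counter
from itertools import product

# Standard genetic code in TCAG order; '*' marks stop codons.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def count_codons(cds_list):
    """Count in-frame codons over a list of coding sequences (DNA or RNA)."""
    counts = Counter()
    for cds in cds_list:
        cds = cds.upper().replace("U", "T")
        for i in range(0, len(cds) - len(cds) % 3, 3):
            codon = cds[i:i + 3]
            if codon in CODON_TABLE:  # skip codons with ambiguous bases
                counts[codon] += 1
    return counts


def rscu(counts):
    """RSCU = observed count / (family total / number of synonymous codons)."""
    families = {}
    for codon, aa in CODON_TABLE.items():
        families.setdefault(aa, []).append(codon)
    values = {}
    for codons in families.values():
        total = sum(counts.get(c, 0) for c in codons)
        if total == 0:
            continue  # amino acid absent from the input
        expected = total / len(codons)
        for c in codons:
            values[c] = counts.get(c, 0) / expected
    return values


# Toy example (not real data): Ala is encoded by GCT twice and GCA once.
toy = rscu(count_codons(["ATGGCTGCTGCATAA"]))
print(round(toy["GCT"], 3), round(toy["GCA"], 3), toy["GCC"])
```

In the toy input, GCT is used more often than expected for the four alanine codons and therefore has an RSCU above 1.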
In total, there are 17 intron-containing genes, 15 (nine protein-coding and six tRNA genes) of which contain one intron, and two of which (ycf3 and clpP) contain two introns (Table 3). The trnK-UUU has the largest intron (1860 bp), which itself contains the matK gene. The rps12 gene is a trans-spliced gene with the 5′ end located in the LSC region and the duplicated 3′ ends in the IR regions. Ycf3 is required for the stable accumulation of the photosystem I complex [27,28]. The intron gain in ycf3 of A. annua may be useful for further studies of the mechanism of photosynthesis evolution, and of variation in singlet oxygen released by chloroplasts in Artemisia.
Introns may contain “old code”, i.e., the part of a gene that loses its function during evolution. Several unicellular eukaryotes seem to experience selective pressures to lose introns. Therefore, intron gain and intron loss require an evolutionary explanation. A common partial explanation for the range of intron densities is the random accumulation of introns in nuclear genomes over time after inheritance from an intron-poor ancestor. More experimental evidence is required to reveal whether the variation of the introns in the A. annua cp genome is related to adaptation to environmental stresses, or whether it facilitates artemisinin biosynthesis.

2.2. Long Repeat and SSR Analysis

For repeat structure analysis, 15 forward and 17 inverted repeats were detected in the A. annua cp genome (Table 4). Most of these repeats show lengths between 30 and 39 bp, while the ycf2 gene possesses the two longest inverted repeats at 60 bp. Two repeats relevant to psa genes (No. 4 and 5) and three forward and three inverted repeats (No. 1–3, No. 16–18) in the intergenic spacers are distributed in the LSC region. Moreover, two forward and eight inverted repeats (No. 11 and 12, No. 22–29) associated with ycf2, two forward and two inverted repeats (No. 14 and 15, No. 31 and 32) in the intergenic spacers, are distributed in the IR region.
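To make the two repeat classes concrete: a forward repeat is the same sequence occurring twice on one strand, whereas an inverted repeat is a sequence whose reverse complement occurs elsewhere on that strand. The sketch below finds exact repeats of a fixed length k by indexing k-mers. It is a deliberately simplified illustration and not the REPuter algorithm used in this study, which also handles variable lengths and mismatches; the function names, the value of k, and the toy sequence are assumptions.

```python
from collections import defaultdict

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse complement of an unambiguous DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def exact_repeats(seq, k):
    """Return (forward, inverted) lists of start-position pairs for exact k-mers.

    Positions are 0-based; each pair is (first_start, second_start).
    """
    seq = seq.upper()
    positions = defaultdict(list)
    for i in range(len(seq) - k + 1):
        positions[seq[i:i + k]].append(i)

    forward, inverted = [], []
    for kmer, starts in positions.items():
        # Forward repeat: the same k-mer at two different positions.
        for a in range(len(starts)):
            for b in range(a + 1, len(starts)):
                forward.append((starts[a], starts[b]))
        # Inverted repeat: the reverse complement occurs further downstream.
        rc = reverse_complement(kmer)
        if rc in positions:
            for i in starts:
                for j in positions[rc]:
                    if i < j:
                        inverted.append((i, j))
    return sorted(forward), sorted(inverted)


# Toy sequence (not real data): ACGGTC occurs twice, and its reverse
# complement GACCGT occurs once near the 3' end.
fwd, inv = exact_repeats("ACGGTCAAAACGGTCTTTGACCGT", k=6)
print(fwd)
print(inv)
```

With a real genome, k would be set to the minimum repeat length of interest (most repeats reported here are 30 to 39 bp), and overlapping k-mer hits would then need to be merged into maximal repeats.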
SSRs, also known as microsatellites, are short (1–6 bp), tandemly repeated DNA sequences that are widely distributed throughout the genome. cpSSRs, which are uniparentally inherited, have been widely employed in the analysis of plant population structure, diversity, differentiation, and maternity [29,30,31]. Here, the distribution of SSRs was analyzed for the A. annua cp genome, and 35 SSRs, most of them distributed in the LSC, were identified. These included 31 mononucleotide SSRs (88.57%), two dinucleotide SSRs (5.71%), and two trinucleotide SSRs (5.71%) (Table 5). Sixteen of the 35 SSR loci were found in the intergenic regions, while the other 19 SSRs were located in genes. All 31 mononucleotide SSRs belonged to the A/T type. Our results are consistent with the hypothesis that cpSSRs are generally composed of short polyadenine (polyA) or polythymine (polyT) repeats and rarely contain tandem guanine (G) or cytosine (C) repeats. Thus, these SSRs contribute to the AT richness of cp genomes. cpSSRs have been important resources for the study of economically important plants and their relatives. Furthermore, cpSSRs have considerable potential to offer unique insights into species identification, genetic diversity, and evolutionary processes in wild plant species [32]. Our results will provide cpSSR markers that can be used to examine genetic diversity in A. annua and its related species, and to provide an efficient means by which to select germplasm with anti-malarial pharmaceutical efficacy.
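The SSR definition above (a 1–6 bp unit repeated in tandem) translates directly into a scan of the sequence. The sketch below is an illustration of that definition rather than the MISA tool used in this study; the minimum repeat counts are assumed values chosen for the example, since the thresholds applied here are not stated, and the function names and toy sequence are likewise hypothetical.

```python
# ASSUMPTION: minimum number of tandem repeats per unit length (1-6 bp).
# These thresholds are illustrative, not the settings used in the study.
ASSUMED_MIN_REPEATS = {1: 10, 2: 5, 3: 4, 4: 3, 5: 3, 6: 3}


def is_primitive(motif):
    """True if the motif is not itself a repetition of a shorter unit ('AA', 'ATAT')."""
    n = len(motif)
    return not any(n % d == 0 and motif == motif[:d] * (n // d) for d in range(1, n))


def find_ssrs(seq, min_repeats=None):
    """Return a sorted list of (start, motif, repeat_count); start is 0-based."""
    thresholds = ASSUMED_MIN_REPEATS if min_repeats is None else min_repeats
    seq = seq.upper()
    hits = []
    for unit_len, min_rep in sorted(thresholds.items()):
        i = 0
        while i + unit_len <= len(seq):
            motif = seq[i:i + unit_len]
            if set(motif) <= set("ACGT") and is_primitive(motif):
                # Extend the run for as long as the motif keeps repeating.
                repeats = 1
                while seq.startswith(motif, i + repeats * unit_len):
                    repeats += 1
                if repeats >= min_rep:
                    hits.append((i, motif, repeats))
                    i += repeats * unit_len  # jump past the reported SSR
                    continue
            i += 1
    return sorted(hits)


# Toy sequence (not real data): a 12-bp poly-A run and an (AT)6 repeat.
toy = "GC" + "A" * 12 + "GCGC" + "AT" * 6 + "GG"
for start, motif, repeats in find_ssrs(toy):
    print(start, motif, repeats)
```

The primitive-motif check keeps a mononucleotide run such as poly-A from also being reported as an (AA)n dinucleotide repeat.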

2.3. Comparative Chloroplast Genomic Analysis

The whole cp genome sequence of A. annua was compared to those of Artemisia fukudo, Lactuca sativa, Jacobaea vulgaris, and Cynara cornigera. The cp genome size of A. annua is the second smallest among the five completed Asteraceae cp genomes. It is larger than J. vulgaris (150,689 bp) (Table S2), but smaller than the cp genomes of A. fukudo, C. cornigera, and L. sativa by 56 bp, 1595 bp, and 1817 bp, respectively. A. annua has the smallest SSC region (18,267 bp) among these sequenced Asteraceae cp genomes. The next smallest SSC region is from J. vulgaris, with a size of 18,276 bp. The SSC and IR regions differ little in length among these genomes, so variation in the length of the LSC region is the main source of the differences in overall genome size.
Comparative genome analysis [33] permits the examination of how DNA sequences diverge among related species. The whole sequence identity of the five Asteraceae cp genomes was plotted using mVISTA, with the annotated A. annua cp genome as a reference (Figure 2). The comparison shows that the two IR regions are less divergent than the LSC and SSC regions. In addition, the coding regions are more conserved than the non-coding regions, and the highly divergent regions among the five cp genomes occur in the intergenic spacers, including trnH-psbA, psbM-petN, trnC-GCA-petN, trnE-UUC-rpoB, trnY-GUA-trnE-UUC, trnV-UAC-ndhC, rbcL-accD, accD-psaI, and rpl32-trnL-UAG in the LSC, as well as ndhI-ndhG and ycf1-rps15 in the SSC. Similar results have been observed in other plant cp genomes [21,34]. Moreover, the most divergent coding regions among the five Asteraceae cp genomes are the ndhF, ycf1, and ycf2 genes. However, there is only a very slight difference between A. annua and A. fukudo. In our study, we observed that all eight rRNA genes are highly conserved.

2.4. IR Contraction and Expansion in the A. annua cp Genome

Although IRs are the most conserved regions of the cp genomes, contraction and expansion at the borders of IR regions are common evolutionary events, and are hypothesized to explain size differences between cp genomes [35,36]. Detailed comparisons of the IR-SSC and IR-LSC boundaries among four Asteraceae cp genomes (Artemisia annua, Artemisia fukudo, Artemisia frigida, and Artemisia montana) are presented in Figure 3. The IRb/SSC border is generally positioned between the ycf1 pseudogene and the ndhF gene. The ycf1 pseudogene has proven to be useful for analyzing cp genome variation in higher plants and algae [37]. The ndhF gene, related to photosynthesis, was found to be 56 bp, 58 bp, 60 bp, and 75 bp away from the IRb/SSC border in A. montana, A. annua, A. fukudo, and A. frigida, respectively. However, some unique structural differences exist in the A. annua cp genome: the trnH gene lies at the longest distance (114 bp) from the LSC edge; the rps19 pseudogene is absent in A. annua due to the contraction of the borders of the IR regions; and the rps19 gene is present in the LSC region due to the expansion of the LSC. It has been reported that the rps19 gene is one of the most abundant transcripts in the chloroplast genome [38]. The IR/LSC boundaries are not static among the cp genomes of Artemisia species; rather, they shift through conservative expansions and contractions, as has been found in other plants [39].
The comparison of cp genome size among examined Asteraceae species is displayed in Table S3. The length of the IR (24,850 bp) in A. annua is 106 bp smaller than that of A. fukudo, 122 bp smaller than that of A. frigida, and 109 bp smaller than that of A. montana. These differences may be related to the loss of rps19 and rps19 pseudogenes in A. annua IR regions. However, there are no significant differences in the length of the whole cp genome among the four Asteraceae cp genomes. The cp genome of A. annua (150,955 bp) is 56 bp smaller than that of A. fukudo, 121 bp smaller than that of A. frigida, and 175 bp smaller than that of A. montana. Non-functional DNA is rapidly deleted, resulting in the failure of pseudogenes to accumulate, which is the likely cause of this variation.
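As a worked check of the arithmetic in this paragraph, the lengths implied for the three other Artemisia genomes can be derived by adding the quoted differences to the A. annua values. This is only a sketch of that calculation; the dictionary and function names are illustrative, and the authoritative per-species lengths are those listed in Table S3.

```python
# A. annua lengths (bp) and the differences (bp) quoted in the text.
A_ANNUA = {"genome": 150_955, "IR": 24_850}
DIFFERENCES = {
    "A. fukudo": {"genome": 56, "IR": 106},
    "A. frigida": {"genome": 121, "IR": 122},
    "A. montana": {"genome": 175, "IR": 109},
}


def implied_lengths(reference, differences):
    """Add each quoted difference to the reference length for every species."""
    return {
        species: {key: reference[key] + delta for key, delta in diffs.items()}
        for species, diffs in differences.items()
    }


for species, lengths in implied_lengths(A_ANNUA, DIFFERENCES).items():
    print(species, lengths["genome"], lengths["IR"])
```

The IR difference enters the genome total twice (once per IR copy), which is why a roughly 100-bp shorter IR in A. annua can coexist with smaller differences in total genome length when the single-copy regions are longer.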
Pairwise cp genomic alignment between A. annua and the three Artemisia cp genomes (A. frigida, A. fukudo, and A. montana) revealed a high degree of synteny (Figures S1–S3). Previous work had reported that the cp genome of A. frigida had two inversion events in the LSC region, and at least one re-inversion event in the SSC [26]. Our results suggest that A. annua has similar sequence rearrangements. To further confirm the accuracy of the assembly and the gene order of the SSC in A. annua, four primers were designed to amplify the junctions between the IRs and the LSC/SSC (Table S4). The resulting PCR amplicons were then analyzed by Sanger sequencing. The inversion and re-inversion events in A. annua suggest that the SSC may be an active region for sequence rearrangements in plant cp genomes. Outside the Asteraceae [40,41], other angiosperms have been found to have an inverted SSC region, including Piper cenocladum [42], Dioscorea elephantipes, and Chloranthus spicatus [43]. Although chloroplast gene order is generally conserved in land plant genomes [44], many sequence rearrangements have been reported in cp genomes from a wide variety of plant species, including inversions in the LSC region [45,46,47], IR contractions or expansions with inversions [48], and re-inversion in the SSC region. It has been proposed that sequence rearrangements in cp genomes are caused by intramolecular recombination events [49]. Sequence rearrangements that alter cp genome structure in related species may also provide genetic diversity information that can be used for molecular classification and evolution studies.

2.5. Phylogenetic Analysis

A. annua belongs to the tribe Anthemideae in the Asteraceae. Several studies have reported analyses of the phylogenetic relationships within the Asteraceae based on chloroplast coding or non-coding sequences [50,51]. The availability of a completed A. annua cp genome provides us with sequence information that can be used to study the molecular evolution and phylogeny of A. annua. We performed multiple sequence alignments using 50 protein-coding genes shared by the cp genome sequences of 20 Asteraceae species. One additional cp genome, Berberis bealei (Berberidaceae), was included as an outgroup (Figure 4). Using the GTR + G + I nucleotide substitution model recommended by jModelTest, the ML analysis supported, with a 100% bootstrap value, the hypothesis that A. annua is sister to the closely related species Artemisia fukudo. Furthermore, we hypothesize that Artemisia fukudo may have similar phytochemical properties [52].

3. Materials and Methods

3.1. DNA Sequencing, cp Genome Assembly, and Validation

Fresh A. annua leaves were collected from tissue-cultured seedlings. Total DNA was extracted from approximately 10 g of fresh leaf tissue using the modified CTAB method [53]. The DNA concentration for each sample was estimated by measuring A260 using an ND-2000 spectrophotometer [54] (Nanodrop Technologies, Wilmington, DE, USA), and DNA quality was assessed visually by agarose gel electrophoresis. Pure DNA was used to construct shotgun libraries (250 bp) according to the manufacturer’s instructions. Sequencing was performed on an Illumina HiSeq 1500 platform (San Diego, CA, USA), yielding approximately 100 Gb of data. First, raw reads were trimmed using FastQC. Next, we performed BLAST searches between the trimmed reads and a reference sequence (Artemisia frigida) to extract cp-like reads [55]. Finally, the cp-like reads were used for sequence assembly with SOAPdenovo [56]. Sequence extension was executed using SSPACE [57], and gaps were filled using GapCloser [58]. To verify the assembly, the four junction regions between the IR regions and LSC/SSC were confirmed by PCR amplification and Sanger sequencing, using the primers listed in Table S4. The final cp genome of A. annua was submitted to GenBank (Accession Number: MF623173).

3.2. Gene Annotation and Sequence Analyses

The initial gene annotation was performed with CPGAVAS [59] (http://www.herbalgenomics.org/cpgavas) and was further confirmed using BLAST and DOGMA [60]. tRNA genes were identified by tRNAscan-SE [61]. The circular cp genome map was drawn using the OGDRAW v1.2 [62] program (http://ogdraw.mpimp-golm.mpg.de/). To analyze the characteristics of variations in synonymous codon usage, relative synonymous codon usage (RSCU) values, codon usage, and AT content were determined using MEGA 5.2 [63].

3.3. Genome Comparison

MUMmer [64] was used to perform pairwise cp genomic alignment. The mVISTA [65] program in Shuffle-LAGAN mode [66] was employed to compare the cp genome of A. annua with the cp genomes of Artemisia fukudo, Lactuca sativa, Jacobaea vulgaris, and Cynara cornigera (GenBank accessions KU360270, AP007232, HQ234669, and KP842707, respectively), using the annotation of A. annua as the reference. MISA [67] was used to identify the SSRs, and REPuter [68] was used to identify forward and inverted repeats.

3.4. Phylogenetic Analysis

A total of 19 complete cp genome sequences were downloaded from the NCBI Organelle Genome and Nucleotide Resources database. For the phylogenetic analysis, a set of 50 protein-coding genes shared by all 20 analyzed genomes was used. Genes were aligned with ClustalW2 [69]. jModelTest 3.7 [70] was used to select the best model for maximum likelihood (ML) analysis, and the phylogenetic tree was inferred using RAxML-HPC2 7.6.3 on XSEDE at the CIPRES Science Gateway (http://www.phylo.org/). Bootstrap analysis was executed with 1000 replicates and TBR branch swapping. In addition, Berberis bealei was set as the outgroup.
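A common way to prepare such a multi-gene dataset for ML analysis is to keep only the genes present in every genome and to concatenate their alignments into one supermatrix, recording where each gene starts and ends. The sketch below illustrates that step under stated assumptions; it is not a description of the scripts used in this study, and the function names, gene names, and toy alignments are hypothetical.

```python
def shared_genes(alignments, taxa):
    """Names of genes whose alignment contains every taxon, in sorted order."""
    return sorted(
        gene for gene, aln in alignments.items() if all(t in aln for t in taxa)
    )


def concatenate(alignments, taxa):
    """Concatenate per-gene alignments into a supermatrix.

    alignments: {gene: {taxon: aligned_sequence}}
    Returns (matrix, partitions), where matrix maps taxon -> concatenated
    sequence and partitions lists (gene, start, end) with 1-based inclusive
    coordinates.
    """
    matrix = {taxon: "" for taxon in taxa}
    partitions = []
    start = 1
    for gene in shared_genes(alignments, taxa):
        aln = alignments[gene]
        length = len(aln[taxa[0]])
        if any(len(aln[t]) != length for t in taxa):
            raise ValueError(f"alignment for {gene} has unequal row lengths")
        for taxon in taxa:
            matrix[taxon] += aln[taxon]
        partitions.append((gene, start, start + length - 1))
        start += length
    return matrix, partitions


# Toy alignments (not real data); 'geneC' lacks taxon_B and is dropped.
toy_alignments = {
    "geneA": {"taxon_A": "ATG-CA", "taxon_B": "ATGGCA"},
    "geneB": {"taxon_A": "TTA", "taxon_B": "TTG"},
    "geneC": {"taxon_A": "GGG"},
}
matrix, parts = concatenate(toy_alignments, ["taxon_A", "taxon_B"])
print(matrix)
print(parts)
```

The partition list can be used to assign a separate substitution model to each gene, or ignored if a single model is applied to the whole matrix.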

4. Conclusions

Here we report the first complete cpDNA sequence of A. annua, an important medicinal plant. Compared to the cp genomes of three related Artemisia species, the cp genome of A. annua has the smallest size, while the genome structure and composition are similar. In addition, the cp genome of A. annua has an inverted SSC region, and is similar in that respect to most Asteraceae. However, a re-inversion event in the SSC region of the A. annua lineage suggests that the SSC might be an active region for inversion events in Asteraceae species. Repeated sequences, together with the aforementioned SSRs, are informative sources for the development of new molecular markers. Phylogenetic relationships among 20 Asteraceae species strongly supported the known taxonomic status of A. annua in Asteraceae and its sister relationship with the closely related species A. fukudo. The comprehensive data presented in this study provide insight into the evolutionary relationships between species of the genus Artemisia, and provide an assembly of the whole cp genome of A. annua, which may be useful for future breeding and further biological discoveries.

Supplementary Materials

Table S1. Gene contents in the Artemisia annua chloroplast genome (113 genes). Table S2. Size comparison of Artemisia annua chloroplast genomic regions and three other Asteraceae chloroplast genomes. Table S3. Size comparison of Artemisia annua chloroplast genomic regions and three other Artemisia chloroplast genomes. Table S4. Primers used for assembly validation. Figure S1. Chloroplast genomic alignment between Artemisia annua and Artemisia frigida. Figure S2. Chloroplast genomic alignment between Artemisia annua and Artemisia fukudo. Figure S3. Chloroplast genomic alignment between Artemisia annua and Artemisia montana.

Acknowledgments

This work was supported by grants from the National Natural Science Foundation of China (81403053 and 81503469) and from the China Academy of Chinese Medical Sciences Special Fund for Health Service Development of Chinese Medicine (ZZ0908067).

Author Contributions

S.C. and J.X. conceived and designed the research framework; X.S., Z.L., S.X., and R.B. prepared the sample and performed the experiments; B.L. and M.W. analyzed the data; X.S. wrote the paper; X.L. and B.Z. made revisions to the final manuscript. All authors have read and approved the final manuscript.

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. Klayman, D.L. Qinghaosu (artemisinin): An antimalarial drug from China. Science 1985, 228, 1049–1055. [Google Scholar] [CrossRef] [PubMed]
  2. Arrow, K.J.; Panosian, C.B.; Gelband, H. Saving Lives, Buying Time: Economics of Malaria Drugs in an Age of Resistance; National Academies Press: Washington, DC, USA, 2004. [Google Scholar]
  3. Tu, Y.Y. Artemisinin—A fift from traditional chinese medicine to the world (nobel lecture). Angew. Chem. Int. Ed. Engl. 2016, 55, 10210–10226. [Google Scholar] [CrossRef] [PubMed]
  4. Mert, A.; Krc, S.; Ayanoğlu, F. The effects of different plant densities on yield, yield components and quality of Artemisia annua L. Ecotypes. J. Herbs Spices Med. Plants 2002, 9, 413–418. [Google Scholar] [CrossRef]
  5. Delabays, N.; Simonnet, X.; Gaudin, M. The genetics of artemisinin content in Artemisia annua L. and the breeding of high yielding cultivars. Curr. Med. Chem. 2001, 8, 1795–1801. [Google Scholar] [CrossRef] [PubMed]
  6. Zhong, G.Y.; Zhou, H.R.; Lun, Y.; Hu, M.; Zhao, P.P. Studies on quality germplasm resources of Artemisia annua. Chin. Herbal Med. 1998, 29, 264–267. [Google Scholar]
  7. Hu, S.L.; Xu, Q.C.; Liu, J.F.; Gu, Y.X. Studies on plant resources of artemisinin. China J. Chin. Mater. Med. 1981, 2, 13–16. [Google Scholar]
  8. Guo, X.X.; Yang, X.Q.; Yang, R.Y. Salicylic acid and methyl jasmonate but not Rose Bengal enhance artemisinin production through invoking burst of endogenous singlet oxygen. Plant Sci. 2010, 178, 390–397. [Google Scholar] [CrossRef]
  9. Zeng, Q.P.; Zeng, X.M.; Yang, R.Y. Singlet oxygen as a signaling transducer for modulating artemisinin biosynthetic genes in Artemisia annua. Biol. Plantarum. 2011, 55, 69–674. [Google Scholar] [CrossRef]
  10. Sun, C.; Fan, C.; Zhang, F.; Niu, T.; Sun, Y.; Guo, X. Cloning and sequence analysis of ps1A1 and ps1A2 genes amplified specifically from the chloroplast and of maintainer of CMS Sorghum. Chin. J. Appl. Environ. Biol. 2003, 9, 501–505. [Google Scholar]
  11. Nielsen, A.Z.; Ziersen, B.; Jensen, K.; Lassne, L.M.; Olsen, C.E.; Moller, B.L.; Jensen, P.E. Redirecting photosynthetic reducing power toward bioactive natural product synthesis. ACS Synth. Biol. 2013, 2, 308–315. [Google Scholar] [CrossRef] [PubMed]
  12. Wicke, S.; Schneeweiss, G.M.; Depamphilis, C.W.; Kai, F.M.; Quandt, D. The evolution of the plastid chromosome in land plants: Gene content, gene order, gene function. Plant Mol. Biol. 2011, 76, 273–297. [Google Scholar] [CrossRef] [PubMed]
  13. Wolfe, K.H.; Mordent, C.W.; Ems, S.C.; Palmer, J.D. Rapid evolution of the plastid translational apparatus in a nonphotosynthetic plant: Loss or accelerated sequence evolution of tRNA and ribosomal protein genes. Mol. Evol. 1992, 35, 304–317. [Google Scholar] [CrossRef]
  14. Jansen, R.K.; Cai, Z.Q.; Raubeson, L.A.; Daniell, H.; dePamphilis, C.W.; Leebeans-Mack, J.; Müller, K.F.; Guisinger-Bellian, M.; Haberle, R.C.; Hansen, A.K.; et al. Analysis of 81 genes from 64 plastid genomes resolves relationships in angiosperms and identifies genome-scale evolutionary patterns. Proc. Natl. Acad. Sci. USA 2007, 104, 19369–19374. [Google Scholar] [CrossRef] [PubMed]
  15. He, L.; Qian, J.; Sun, Z.Y.; Xu, X.L.; Chen, S.L. Complete chloroplast genome of medicinal plant Lonicera japonica: Genome rearrangement, intron gain and loss, and implications for phylogenetic studies. Molecules 2017, 22, 249. [Google Scholar] [CrossRef] [PubMed]
  16. Soetaert, S.; Van Nieuwerburgh, F.; Brodelius, P.; Goossens, A.; Deforce, D. Transcriptome analysis of apical and sub-apical cells of Artemisia annua trichomes with next-generation-sequencing. In Proceedings of the 10th International Meeting on All Aspects of the Chemistry and Biology of Terpenes and Isoprenoids (Terpnet 2011): Biosynthesis and Function of Isoprenoids in Plants, Microorganisms and Parasites, Kalmar, Sweden, 22–26 May 2011; p. 170. [Google Scholar]
  17. Soetaert, S.S.; Neste, C.M.V.; Vandewoestyne, M.L.; Head, S.R.; Goossens, A.; Van Nieuwerburgh, F.C.; Deforce, D.L. Differential transcriptome analysis of glandular and filamentous trichomes in Artemisia annua. BMC Plant Biol. 2013, 13, 220. [Google Scholar] [CrossRef] [PubMed] [Green Version]
  18. Graham, I.A.; Besser, K.; Blumer, S.; Branigan, C.A.; Czechowski, T.; Elias, L.; Guterman, I.; Harvey, D.; Issac, P.G.; Khan, A.M.; et al. The genetic map of Artemisia annua L. identifies loci affecting yield of the antimalarial drug artemisinin. Science 2010, 327, 327–331. [Google Scholar] [CrossRef] [PubMed]
  19. Chen, S.L.; Song, J.Y. Herbgenomics. China J. Chin. Mater. Med. 2016, 41, 3881–3889. [Google Scholar]
  20. Chen, S.L.; Song, J.Y.; Sun, C.; Xu, J.; Zhu, Y.J.; Verpoorte, R.; Fan, T.P. Herbal genomics: Examining the biology of traditional medicines. Science 2015, 347, 27–29. [Google Scholar]
  21. Nie, X.; Lv, S.; Zhang, Y.; Du, X.; Wang, L.; Biradar, S.S.; Tan, X.; Wan, F.; Weining, S. Complete chloroplast genome sequence of a major invasive species, crofton weed (Ageratina adenophora). PLoS ONE 2012, 7, e36869. [Google Scholar] [CrossRef] [PubMed]
  22. Ding, P.; Shao, Y.; Li, Q.; Gao, J.; Zhang, R.; Lai, X.; Wang, D.; Zhang, H. The complete chloroplast genome sequence of the medicinal plant Andrographis paniculata. Mitochondr. DNA 2016, 27, 2347–2348. [Google Scholar]
  23. Jia, Y.; Yang, J.; He, Y.L.; He, Y.; Niu, C.; Gong, L.-L.; Li, Z.-H. Characterization of the whole chloroplast genome sequence of Acer davidii Franch (Aceraceae). Conserv. Genet. Resour. 2016, 8, 141–143. [Google Scholar] [CrossRef]
  24. Xiang, B.; Li, X.; Qian, J.; Wang, L.; Ma, L.; Tian, X.; Wang, Y. The complete chloroplast genome sequence of the medicinal plant Swertia mussotii. Using the PacBio RS II platform. Molecules 2016, 21, 1029. [Google Scholar] [CrossRef] [PubMed]
  25. Clegg, M.T.; Gaut, B.S.; Learn, G.H.; Morton, B.R. Rates and patterns of chloroplast DNA evolution. Proc. Natl. Acad. Sci. USA 1994, 91, 6795–6801. [Google Scholar] [CrossRef] [PubMed]
  26. Liu, Y.; Huo, N.; Dong, L.; Wang, Y.; Zhang, S.; Yooung, H.A.; Feng, X.; Gu, Y.Q. Complete chloroplast genome sequences of Mongolia medicine Artemisia frigida and phylogenetic relationships with other plants. PLoS ONE 2013, 8, e57533. [Google Scholar] [CrossRef] [PubMed]
  27. Boudreau, E.; Takahashi, Y.; Lemieux, C.; Turmel, M.; Rochaix, J.D. The chloroplast ycf3 and ycf4 open reading frames of Chlamydomonas reinhardtii are required for the accumulation of the photosystem I complex. EMBO J. 1997, 16, 6095–6104. [Google Scholar] [CrossRef] [PubMed]
  28. Naver, H.; Boudreau, E.; Rochaix, J.D. Functional studies of Ycf3: Its role in assembly of photosystem I and interactions with some of its subunits. Plant Cell 2001, 13, 2731–2745. [Google Scholar] [CrossRef] [PubMed]
  29. Bryan, G.J.; McNicol, J.W.; Meyer, R.C.; Ramsay, G.; De Jong, W.S. Polymorphic simple sequence repeat markers in chloroplast genomes of Solanaceous plants. Theor. Appl. Genet. 1999, 99, 859–867. [Google Scholar] [CrossRef]
  30. Provan, J. Novel chloroplast microsatellites reveal cytoplasmic variation in Arabidopsis thaliana. Mol. Ecol. 2000, 9, 2183–2185. [Google Scholar] [CrossRef] [PubMed]
  31. Flannery, M.L.; Mitchell, F.J.; Coyne, S.; Kavanagh, T.A.; Burke, J.I.; Salamin, N.; Dowding, P.; Hodkinson, T.R. Plastid genome characterisation in Brassica and Brassicaceae using a new set of nine SSRs. Theor. Appl. Genet. 2006, 113, 1221–1231. [Google Scholar] [CrossRef] [PubMed]
  32. Ebert, D.; Peakall, R. Chloroplast simple sequence repeats (cpSSRs): Technical resources and recommendations for expanding cpSSR discovery and applications to a wide array of plant species. Mol. Ecol. Resour. 2009, 9, 673–690. [Google Scholar] [CrossRef] [PubMed]
  33. Zhihai, H.; Jiang, X.; Shuiming, X.; Baosheng, L.; Yuan, G.; Chaochao, Z.; Xiaohui, Q.; Wen, X.; Shilin, C. Comparative optical genome analysis of two pangolin species: Manis pentadactyla and Manis javanica. Gigascience 2016, 5, 1–5. [Google Scholar] [CrossRef] [PubMed]
  34. Ni, L.H.; Zhao, Z.L.; Xu, H.X.; Chen, S.L.; Dorje, G. The complete chloroplast genome of Gentiana straminea (Gentianaceae), an endemic species to the Sino-Himalayan subregion. Gene 2016, 577, 281–288. [Google Scholar] [CrossRef] [PubMed]
  35. Raubeson, L.A.; Peery, R.; Chumley, T.W.; Dziubek, C.; Fourcade, H.M.; Boore, J.L.; Jansen, R.K. Comparative chloroplast genomics: Analyses including new sequences from the angiosperms Nuphar advena and Ranunculus macranthus. BMC Genom. 2007, 8, 174–201. [Google Scholar] [CrossRef] [PubMed]
  36. Wang, R.J.; Cheng, C.L.; Chang, C.C.; Wu, C.L.; Su, T.M.; Chaw, S.M. Dynamics and evolution of the inverted repeat-large single copy junctions in the chloroplast genomes of monocots. BMC Evol. Biol. 2008, 8, 36–50. [Google Scholar] [CrossRef] [PubMed]
  37. De Cambiaire, J.C.; Otis, C.; Lemieux, C.; Turmel, M. The complete chloroplast genome sequence of the chlorophycean green alga Scenedesmus obliquus reveals a compact gene organization and a biased distribution of genes on the two DNA strands. BMC Evol. Biol. 2006, 6, 37–52. [Google Scholar] [CrossRef] [PubMed]
  38. Lee, J.; Kang, Y.; Shin, S.C.; Park, H.; Lee, H. Combined analysis of the chloroplast genome and transcriptome of the Antarctic vascular plant Deschampsia antarctica Desv. PLoS ONE 2014, 9, e92501. [Google Scholar] [CrossRef] [PubMed]
  39. Ma, J.; Yang, B.; Zhu, W.; Sun, L.; Tian, J.; Wang, X. The complete chloroplast genome sequence of Mahonia bealei (Berberidaceae) reveals a significant expansion of the inverted repeat and phylogenetic relationship with other angiosperms. Gene 2013, 528, 120–131. [Google Scholar] [CrossRef] [PubMed]
  40. Kim, K.J.; Choi, K.S.; Jansen, R.K. Two chloroplast DNA inversions originated simultaneously during the early evolution of the sunflower family (Asteraceae). Mol. Biol. Evol. 2005, 22, 1783–1792. [Google Scholar] [CrossRef] [PubMed]
  41. Timme, R.E.; Kuehl, J.V.; Boore, J.L.; Jansen, R.K. A comparative analysis of the Lactuca and Helianthus (Asteraceae) plastid genomes: Identification of divergent regions and categorization of shared repeats. Am. J. Bot. 2007, 94, 302–312. [Google Scholar] [CrossRef] [PubMed]
  42. Cai, Z.; Penaflor, C.; Kuehl, J.V.; Leebens-Mack, J.; Carlson, J.E.; dePamphilis, C.W.; Boore, J.L.; Jansen, R.K. Complete plastid genome sequences of Drimys, Liriodendron, and Piper: Implications for the phylogenetic relationships of magnoliids. BMC Evol. Biol. 2006, 6, 77–97. [Google Scholar] [CrossRef] [PubMed]
  43. Hansen, D.R.; Dastidar, S.G.; Cai, Z.; Penaflor, C.; Kuehl, J.V.; Boore, J.L.; Jansen, R.K. Phylogenetic and evolutionary implications of complete chloroplast genome sequences of four early-diverging angiosperms: Buxus (Buxaceae), Chloranthus (Chloranthaceae), Dioscorea (Dioscoreaceae), and Illicium (Schisandraceae). Mol. Phylogenet. Evol. 2007, 45, 547–563. [Google Scholar] [CrossRef] [PubMed]
  44. Raubeson, L.A.; Jansen, R.K. Chloroplast DNA evidence on the ancient evolutionary split in vascular land plants. Science 1992, 255, 1697–1699. [Google Scholar] [CrossRef] [PubMed]
  45. Kumar, S.; Hahn, F.M.; McMahan, C.M.; Cornish, K.; Whalen, M.C. Comparative analysis of the complete sequence of the plastid genome of Parthenium argentatum and identification of DNA barcodes to differentiate Parthenium species and lines. BMC Plant Biol. 2009, 9, 131–143. [Google Scholar] [CrossRef] [PubMed]
  46. Jansen, R.K.; Palmer, J.D. A chloroplast DNA inversion marks an ancient evolutionary split in the sunflower family (Asteraceae). Proc. Natl. Acad. Sci. USA 1987, 84, 5818–5822. [Google Scholar] [CrossRef] [PubMed]
  47. Doyle, J.J.; Davis, J.I.; Soreng, R.J.; Garvin, D.; Anderson, M.J. Chloroplast DNA inversions and the origin of the grass family (Poaceae). Proc. Natl. Acad. Sci. USA 1992, 89, 7722–7726. [Google Scholar] [CrossRef] [PubMed]
  48. Palmer, J.D.; Nugent, J.M.; Herbon, L.A. Unusual structure of geranium chloroplast DNA: A triple-sized inverted repeat, extensive gene duplications, multiple inversions, and two repeat families. Proc. Natl. Acad. Sci. USA 1987, 84, 769–773. [Google Scholar] [CrossRef] [PubMed]
  49. Ogihara, Y.; Terachi, T.; Sasakuma, T. Intramolecular recombination of chloroplast genome mediated by short direct-repeat sequences in wheat species. Proc. Natl. Acad. Sci. USA 1988, 85, 8573–8577. [Google Scholar] [CrossRef] [PubMed]
  50. Panero, J.L.; Funk, V.A. The value of sampling anomalous taxa in phylogenetic studies: Major clades of the Asteraceae revealed. Mol. Phylogenet. Evol. 2008, 47, 757–782. [Google Scholar] [CrossRef] [PubMed]
  51. Fernandez, I.A.; Aguilar, J.F.; Panero, J.L.; Feliner, G.N. A phylogenetic analysis of Doronicum (Asteraceae, Senecioneae) based on morphological, nuclear ribosomal (ITS), and chloroplast (trnL-F) evidence. Mol. Phylogenet. Evol. 2001, 20, 41–64. [Google Scholar] [CrossRef] [PubMed]
  52. Chen, S.B.; Peng, Y.; Chen, S.L.; Xiao, P.G. Introduction of Pharmaphylogeny. Mod. Tradit. Chin. Med. Mater. Med. World Sci. Technol. 2005, 7, 97–103. [Google Scholar]
  53. Shi, Q.H.; Yao, Z.P.; Zhang, H.; Xu, L.; Dai, P.H. Comparison of four methods of DNA extraction from Chickpea. J. Xinjiang Agric. Univ. 2009, 1, 64–67. [Google Scholar]
  54. Urreizti, R.; Garcia-Giralt, N.; Riancho, J.A.; Gonzalez-Macias, J.; Civit, S.; Guerri, R.; Yoskovitz, G.; Sarrion, P.; Mellibovsky, L.; Diez-Perez, A.; et al. COL1A1 haplotypes and hip fracture. J. Bone Miner. Res. 2012, 27, 950–953. [Google Scholar] [CrossRef] [PubMed]
  55. Deng, P.; Wang, L.; Cui, L.; Feng, K.; Liu, F.; Du, X.; Tong, W.; Niu, X.; Ji, W.; Weining, S. Global identification of MicroRNAs and their targets in barley under salinity stress. PLoS ONE 2015, 10, e0137990. [Google Scholar] [CrossRef] [PubMed]
  56. Gogniashvili, M.; Naskidashvili, P.; Bedoshvili, D.; Kotorashvili, N.; Kotaria, N.; Beridze, T. Complete chloroplast DNA sequences of Zanduri wheat (Triticum spp.). Genet. Resour. Crop Evol. 2015, 62, 1269–1277. [Google Scholar] [CrossRef]
  57. Boetzer, M.; Henkel, C.V.; Jansen, H.J.; Butler, D.; Pirovano, W. Scaffolding pre-assembled contigs using SSPACE. Bioinformatics 2011, 27, 578–579. [Google Scholar] [CrossRef] [PubMed]
  58. Acemel, R.D.; Tena, J.J.; Irastorza-Azcarate, I.; Marletaz, F.; Gomez-Marin, C.; de la Calle-Mustienes, E.; Bertrand, S.; Diaz, S.G.; Aldea, D.; Aury, J.M.; et al. A single three-dimensional chromatin compartment in amphioxus indicates a stepwise evolution of vertebrate Hox bimodal regulation. Nat. Genet. 2016, 48, 336–341. [Google Scholar] [CrossRef] [PubMed]
  59. Liu, C.; Shi, L.; Zhu, Y.; Chen, H.; Zhang, J.; Lin, X.; Guan, X. CpGAVAS, an integrated web server for the annotation, visualization, analysis, and GenBank submission of completely sequenced chloroplast genome sequences. BMC Genom. 2012, 13, 715. [Google Scholar] [CrossRef] [PubMed]
  60. Wyman, S.K.; Jansen, R.K.; Boore, J.L. Automatic annotation of organellar genomes with DOGMA. Bioinformatics 2004, 20, 3252–3255. [Google Scholar] [CrossRef] [PubMed]
  61. Schattner, P.; Brooks, A.N.; Lowe, T.M. The tRNAscan-SE, snoscan and snoGPS web servers for the detection of tRNAs and snoRNAs. Nucleic Acids Res. 2005, 33, 686–689. [Google Scholar] [CrossRef] [PubMed]
  62. Lohse, M.; Drechsel, O.; Bock, R. Organellar Genome DRAW (OGDRAW): A tool for the easy generation of high-quality custom graphical maps of plastid and mitochondrial genomes. Curr. Genet. 2007, 52, 267–274. [Google Scholar] [CrossRef] [PubMed]
  63. Tamura, K.; Peterson, D.; Peterson, N.; Stecher, G.; Nei, M.; Kumar, S. MEGA5: Molecular evolutionary genetics analysis using maximum likelihood, evolutionary distance, and maximum parsimony methods. Mol. Biol. Evol. 2011, 28, 2731–2739. [Google Scholar] [CrossRef] [PubMed]
  64. Kurtz, S.; Phillippy, A.; Delcher, A.L.; Smoot, M.; Shumway, M.; Antonescu, C.; Salzberg, S.L. Versatile and open software for comparing large genomes. Genome Biol. 2004, 5, R12. [Google Scholar] [CrossRef] [PubMed]
  65. Mayor, C.; Brudno, M.; Schwartz, J.R.; Poliakov, A.; Rubin, E.M.; Frazer, K.A.; Pachter, L.S.; Dubchak, I. VISTA: Visualizing global DNA sequence alignments of arbitrary length. Bioinformatics 2000, 16, 1046–1047. [Google Scholar] [CrossRef] [PubMed]
  66. Frazer, K.A.; Pachter, L.; Poliakov, A.; Rubin, E.M.; Dubchak, I. VISTA: Computational tools for comparative genomics. Nucleic Acids Res. 2004, 32, 273–279. [Google Scholar] [CrossRef] [PubMed]
  67. Yang, X.M.; Sun, J.T.; Xue, X.F.; Zhu, W.C.; Hong, X.Y. Development and characterization of 18 novel EST-SSRs from the Western Flower Thrips, Frankliniella occidentalis (Pergande). Int. J. Mol. Sci. 2012, 13, 2863–2876. [Google Scholar] [CrossRef] [PubMed]
  68. Kurtz, S.; Choudhuri, J.V.; Ohlebusch, E.; Schleiermacher, C.; Stoye, J.; Giegerich, R. REPuter: The manifold applications of repeat analysis on a genomic scale. Nucleic Acids Res. 2001, 29, 4633–4642. [Google Scholar] [CrossRef] [PubMed]
  69. Larkin, M.A.; Blackshields, G.; Brown, N.P.; Chenna, R.; McGettigan, P.A.; McWilliam, H.; Valentin, F.; Wallace, I.M.; Wilm, A.; Lopez, R.; et al. Clustal W and Clustal X version 2.0. Bioinformatics 2007, 23, 2947–2948. [Google Scholar] [CrossRef] [PubMed]
  70. Posada, D. jModelTest: Phylogenetic model averaging. Mol. Biol. Evol. 2008, 25, 1253–1259. [Google Scholar] [CrossRef] [PubMed]
Sample Availability: Sequence data of Artemisia annua are available from the authors.
Figure 1. Gene map of the A. annua chloroplast genome. Genes drawn inside the circle are transcribed clockwise, and those outside are counterclockwise. Genes belonging to different functional groups are color-coded. The darker gray in the inner circle corresponds to GC content, while the lighter gray corresponds to AT content.
Figure 2. Comparison of five chloroplast genomes using mVISTA. Grey arrows and thick black lines above the alignment indicate gene orientation. Purple bars represent exons, blue bars represent UTRs, and pink bars represent conserved non-coding sequences (CNS). The Y-axis represents the percent identity (shown: 50–100%). Genome regions are color-coded as protein-coding exons, rRNAs, tRNAs, or CNS.
Figure 3. Comparison of the borders of the LSC, SSC, and IR regions among five chloroplast genomes. Ψ: pseudogenes, /: distance from the edge.
Figure 4. ML phylogenetic tree reconstruction of 20 taxa of the Asteraceae clade based on concatenated sequences from 50 chloroplast protein-coding genes. The position of Artemisia annua is indicated in block letters. Berberis bealei was set as the outgroup.
Table 1. Base composition in the A. annua chloroplast genome.
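| Region | T (U) (%) | C (%) | A (%) | G (%) | Length (bp) |
|---|---|---|---|---|---|
| LSC | 32.4 | 17.5 | 32.1 | 18.0 | 82,988 |
| SSC | 34.2 | 16.1 | 35.0 | 14.7 | 18,267 |
| IRA | 28.5 | 20.8 | 28.3 | 22.3 | 24,850 |
| IRB | 28.3 | 22.3 | 28.5 | 20.8 | 24,850 |
| Total | 31.3 | 18.7 | 31.2 | 18.8 | 150,955 |
| CDS | 31.6 | 17.6 | 30.7 | 20.1 | 79,335 |
| 1st position | 24.0 | 18.9 | 30.6 | 26.7 | 26,445 |
| 2nd position | 33.0 | 20.2 | 29.4 | 17.7 | 26,445 |
| 3rd position | 38.0 | 13.8 | 32.0 | 16.0 | 26,445 |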
CDS: protein-coding regions.
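The region lengths and base percentages in Table 1 can be cross-checked with simple arithmetic. The sketch below is illustrative only: the dictionary layout and function names are assumptions introduced here, and the values are copied from the table. Because the tabulated percentages are rounded to one decimal place, the length-weighted GC content agrees with the "Total" row (C + G = 37.5%) only approximately.

```python
# Values copied from Table 1: (length in bp, GC % computed as C % + G %).
# The names below are hypothetical helpers, not part of the original analysis.
regions = {
    "LSC": (82_988, 17.5 + 18.0),
    "SSC": (18_267, 16.1 + 14.7),
    "IRA": (24_850, 20.8 + 22.3),
    "IRB": (24_850, 22.3 + 20.8),
}


def total_length(regions):
    """Sum the lengths of all four regions of the circular genome."""
    return sum(length for length, _ in regions.values())


def weighted_gc(regions):
    """Length-weighted mean GC percentage across the regions."""
    total = total_length(regions)
    return sum(length * gc for length, gc in regions.values()) / total


print(total_length(regions))           # genome length implied by the four regions
print(round(weighted_gc(regions), 1))  # approximate genome-wide GC content (%)
```

The summed region lengths reproduce the 150,955 bp given in the "Total" row, and the three codon positions (3 × 26,445 bp) add up to the 79,335 bp listed for the CDS.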
Table 2. Codon-anticodon recognition patterns and codon usage of the A. annua chloroplast genome.
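| Amino Acid | Codon | No. | RSCU | tRNA | Amino Acid | Codon | No. | RSCU | tRNA |
|---|---|---|---|---|---|---|---|---|---|
| Phe | UUU | 993 | 1.32 | | Tyr | UAU | 811 | 1.64 | |
| Phe | UUC | 510 | 0.68 | trnF-GAA | Tyr | UAC | 178 | 0.36 | trnY-GUA |
| Leu | UUA | 890 | 1.87 | | Stop | UAA | 52 | 1.77 | |
| Leu | UUG | 579 | 1.22 | trnL-CAA | Stop | UAG | 21 | 0.72 | |
| Leu | CUU | 622 | 1.31 | | His | CAU | 471 | 1.51 | |
| Leu | CUC | 198 | 0.42 | | His | CAC | 151 | 0.49 | trnH-GUG |
| Leu | CUA | 368 | 0.77 | | Gln | CAA | 732 | 1.52 | trnQ-UUG |
| Leu | CUG | 196 | 0.41 | | Gln | CAG | 230 | 0.48 | |
| Ile | AUU | 1092 | 1.47 | | Asn | AAU | 1017 | 1.56 | |
| Ile | AUC | 433 | 0.58 | trnI-CAU | Asn | AAC | 287 | 0.44 | |
| Ile | AUA | 706 | 0.95 | | Lys | AAA | 1042 | 1.47 | |
| Met | AUG | 633 | 1.00 | trnM-CAU | Lys | AAG | 371 | 0.53 | |
| Val | GUU | 512 | 1.44 | | Asp | GAU | 868 | 1.61 | |
| Val | GUC | 174 | 0.49 | trnV-GAC | Asp | GAC | 213 | 0.39 | trnD-GUC |
| Val | GUA | 546 | 1.54 | | Glu | GAA | 1001 | 1.50 | trnE-UUC |
| Val | GUG | 188 | 0.53 | | Glu | GAG | 337 | 0.50 | |
| Ser | UCU | 588 | 1.74 | | Cys | UGU | 202 | 1.38 | |
| Ser | UCC | 324 | 0.96 | trnS-GGA | Cys | UGC | 91 | 0.62 | trnC-GCA |
| Ser | UCA | 417 | 1.23 | trnS-UGA | Stop | UGA | 15 | 0.51 | |
| Ser | UCG | 167 | 0.49 | | Trp | UGG | 462 | 1.00 | trnW-CCA |
| Pro | CCU | 441 | 1.58 | | Arg | CGU | 350 | 1.33 | trnR-ACG |
| Pro | CCC | 188 | 0.67 | | Arg | CGC | 107 | 0.41 | |
| Pro | CCA | 329 | 1.18 | trnP-UGG | Arg | CGA | 343 | 1.30 | |
| Pro | CCG | 159 | 0.57 | | Arg | CGG | 124 | 0.47 | |
| Thr | ACU | 535 | 1.63 | | Arg | AGA | 485 | 1.84 | trnR-UCU |
| Thr | ACC | 246 | 0.75 | trnT-GGU | Arg | AGG | 174 | 0.66 | |
| Thr | ACA | 411 | 1.25 | trnT-UGU | Ser | AGU | 410 | 1.21 | |
| Thr | ACG | 124 | 0.38 | | Ser | AGC | 122 | 0.36 | trnS-GCU |
| Ala | GCU | 617 | 1.74 | | Gly | GGU | 589 | 1.32 | |
| Ala | GCC | 228 | 0.64 | | Gly | GGC | 189 | 0.42 | trnG-GCC |
| Ala | GCA | 415 | 1.17 | | Gly | GGA | 707 | 1.58 | |
| Ala | GCG | 158 | 0.45 | | Gly | GGG | 306 | 0.68 | |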
RSCU: Relative Synonymous Codon Usage.
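As a worked illustration of how the RSCU column relates to the codon counts, the sketch below assumes the usual definition of RSCU: the observed count of a codon divided by the mean count of all synonymous codons for the same amino acid. The function name `rscu` is hypothetical, and the software used to generate Table 2 is not stated in this section. Under that assumption, the Phe and Stop rows of the table are reproduced from their counts.

```python
def rscu(counts):
    """Relative synonymous codon usage for one synonymous codon family.

    counts maps each codon of the family to its observed count. Each value
    is divided by the mean count of the family, so a codon used exactly as
    often as expected under equal usage scores 1.00.
    """
    mean = sum(counts.values()) / len(counts)
    return {codon: n / mean for codon, n in counts.items()}


# Counts copied from Table 2.
phe = {"UUU": 993, "UUC": 510}
stop = {"UAA": 52, "UAG": 21, "UGA": 15}

print({c: round(v, 2) for c, v in rscu(phe).items()})
# {'UUU': 1.32, 'UUC': 0.68}
print({c: round(v, 2) for c, v in rscu(stop).items()})
# {'UAA': 1.77, 'UAG': 0.72, 'UGA': 0.51}
```

Values above 1 indicate codons used more often than expected under equal usage. In Table 2 these are predominantly codons ending in A or U.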
Table 3. The length of exons and introns in genes with introns in the A. annua chloroplast genome.
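| Gene | Location | Exon I (bp) | Intron I (bp) | Exon II (bp) | Intron II (bp) | Exon III (bp) |
|---|---|---|---|---|---|---|
| trnK-UUU | LSC | 37 | 1860 | 35 | | |
| trnG-UCC | LSC | 23 | 729 | 47 | | |
| trnL-UAA | LSC | 37 | 424 | 50 | | |
| trnV-UAC | LSC | 38 | 572 | 37 | | |
| trnI-GAU | IR | 42 | 777 | 35 | | |
| trnA-UGC | IR | 38 | 812 | 35 | | |
| rps12 * | LSC | 232 | 535 | 26 | | 114 |
| rps16 | LSC | 40 | 876 | 185 | | |
| rpl16 | LSC | 9 | 1015 | 399 | | |
| rpl2 | IR | 394 | 626 | 470 | | |
| rpoC1 | LSC | 430 | 734 | 1640 | | |
| ndhA | SSC | 556 | 1064 | 539 | | |
| ndhB | IR | 777 | 670 | 756 | | |
| ycf3 | SSC | 127 | 700 | 230 | 735 | 153 |
| petB | LSC | 6 | 747 | 642 | | |
| atpF | LSC | 145 | 699 | 410 | | |
| clpP | LSC | 71 | 796 | 292 | 606 | 228 |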
* The rps12 gene is a trans-spliced gene with the 5′ end located in the LSC region and the duplicated 3′ ends in the IR regions.
Table 4. Long repeat sequences in the A. annua chloroplast genome.
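| ID | Repeat Start 1 | Type | Size (bp) | Repeat Start 2 | Mismatch (bp) | E-Value | Gene | Region |
|---|---|---|---|---|---|---|---|---|
| 1 | 8544 | F | 32 | 34,909 | −3 | 4.65E-05 | IGS | LSC |
| 2 | 28,063 | F | 31 | 29,661 | −3 | 1.69E-04 | IGS | LSC |
| 3 | 28,070 | F | 30 | 29,666 | −2 | 2.18E-05 | IGS | LSC |
| 4 | 38,054 | F | 32 | 40,278 | −2 | 1.55E-06 | psaB; psaA | LSC |
| 5 | 38,065 | F | 30 | 40,289 | −3 | 6.09E-04 | psaB; psaA | LSC |
| 6 | 43,070 | F | 41 | 96,883 | −1 | 1.63E-13 | ycf3 (intron); IGS | LSC; IRA |
| 7 | 43,072 | F | 39 | 118,107 | −1 | 2.48E-12 | ycf3 (intron); ndhA (intron) | LSC; SSC |
| 8 | 43,075 | F | 35 | 93,834 | −3 | 9.59E-07 | ycf3 (intron); ndhB (intron) | LSC; IRA |
| 9 | 66,346 | F | 30 | 98,046 | −2 | 2.18E-05 | IGS | LSC; IRA |
| 11 | 86,539 | F | 30 | 147,378 | −3 | 6.09E-04 | ycf2 | IRA; IRB |
| 12 | 90,121 | F | 30 | 90,157 | −1 | 5.00E-07 | ycf2 | IRA |
| 13 | 96,885 | F | 39 | 118,107 | 0 | 2.12E-14 | IGS; ndhA (intron) | IRA; SSC |
| 14 | 105,777 | F | 30 | 105,809 | −2 | 2.18E-05 | IGS | IRA |
| 15 | 128,104 | F | 30 | 128,136 | −2 | 2.18E-05 | IGS | IRB |
| 16 | 8548 | I | 30 | 44,753 | −2 | 2.18E-05 | IGS | LSC |
| 17 | 29,662 | I | 30 | 29,881 | −2 | 2.18E-05 | IGS | LSC |
| 18 | 34,911 | I | 30 | 44,755 | −1 | 5.00E-07 | IGS | LSC |
| 19 | 43,070 | I | 41 | 137,019 | −1 | 1.63E-13 | ycf3 (intron); IGS | LSC; IRB |
| 20 | 43,075 | I | 35 | 140,074 | −3 | 9.59E-07 | ycf3 (intron); ndhB (intron) | LSC; IRB |
| 21 | 66,346 | I | 30 | 135,867 | −2 | 2.18E-05 | IGS | LSC; IRB |
| 22 | 90,109 | I | 60 | 143,756 | −2 | 7.68E-23 | ycf2 | IRA; IRB |
| 23 | 90,109 | I | 42 | 143,756 | −2 | 2.57E-12 | ycf2 | IRA; IRB |
| 24 | 90,121 | I | 30 | 143,756 | −1 | 5.00E-07 | ycf2 | IRA; IRB |
| 25 | 90,124 | I | 45 | 143,756 | 0 | 5.18E-18 | ycf2 | IRA; IRB |
| 26 | 90,127 | I | 60 | 143,774 | −2 | 7.68E-23 | ycf2 | IRA; IRB |
| 27 | 90,142 | I | 45 | 143,774 | 0 | 5.18E-18 | ycf2 | IRA; IRB |
| 28 | 90,145 | I | 42 | 143,792 | −2 | 2.57E-12 | ycf2 | IRA; IRB |
| 29 | 90,157 | I | 30 | 143,792 | −1 | 5.00E-07 | ycf2 | IRA; IRB |
| 30 | 105,777 | I | 30 | 128,104 | −2 | 2.18E-05 | IGS | IRA; IRB |
| 31 | 105,809 | I | 30 | 128,136 | −2 | 2.18E-05 | IGS | IRA; IRB |
| 32 | 118,107 | I | 39 | 137,019 | 0 | 2.12E-14 | ndhA (intron); rps12 (CDS) | SSC; IRB |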
F: Forward; I: Inverted; IGS: intergenic spacer; CDS: protein-coding regions.
Table 5. Simple sequence repeats in the A. annua chloroplast genome.
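| cpSSR ID | Repeat Motif | Length (bp) | Start | End | Region | Annotation |
|---|---|---|---|---|---|---|
| 1 | (A)15 | 15 | 3204 | 3218 | LSC | matK |
| 2 | (A)14 | 14 | 3708 | 3721 | LSC | |
| 3 | (A)10 | 10 | 6121 | 6130 | LSC | |
| 4 | (T)10 | 10 | 9944 | 9953 | LSC | |
| 5 | (A)10 | 10 | 13,630 | 13,639 | LSC | rpoB |
| 6 | (A)12 | 12 | 20,826 | 20,837 | LSC | rpoC2 |
| 7 | (T)10 | 10 | 23,027 | 23,036 | LSC | rpoC2 |
| 8 | (A)11 | 11 | 26,289 | 26,299 | LSC | atpH |
| 9 | (A)14 | 14 | 28,513 | 28,526 | LSC | atpA |
| 10 | (A)11 | 11 | 39,312 | 39,322 | LSC | psaA |
| 11 | (A)10 | 10 | 48,206 | 48,215 | LSC | |
| 12 | (AT)6 | 12 | 52,028 | 52,039 | LSC | |
| 13 | (T)14 | 14 | 53,085 | 53,098 | LSC | atpB |
| 14 | (A)17 | 17 | 53,306 | 53,322 | LSC | atpB |
| 15 | (A)19 | 19 | 54,902 | 54,920 | LSC | rbcL |
| 16 | (A)10 | 10 | 56,832 | 56,841 | LSC | |
| 17 | (A)14 | 14 | 57,920 | 57,933 | LSC | accD |
| 18 | (A)11 | 11 | 59,654 | 59,664 | LSC | ycf4 |
| 19 | (T)10 | 10 | 59,775 | 59,784 | LSC | ycf4 |
| 20 | (T)10 | 10 | 64,476 | 64,485 | LSC | |
| 21 | (T)10 | 10 | 64,902 | 64,911 | LSC | |
| 22 | (A)11 | 11 | 66,255 | 66,265 | LSC | |
| 23 | (T)10 | 10 | 69,525 | 69,534 | LSC | |
| 24 | (A)14 | 14 | 70,210 | 70,223 | LSC | |
| 25 | (T)10 | 10 | 71,655 | 71,664 | LSC | psbB |
| 26 | (TA)6 | 12 | 72,640 | 72,651 | LSC | psbB |
| 27 | (T)14 | 14 | 73,210 | 73,223 | LSC | psbN |
| 28 | (A)15 | 15 | 80,929 | 80,943 | LSC | |
| 29 | (T)10 | 10 | 81,209 | 81,218 | LSC | |
| 30 | (T)11 | 11 | 101,234 | 101,244 | IRA | |
| 31 | (GAA)5 | 15 | 108,039 | 108,053 | SSC | ndhF |
| 32 | (TAA)5 | 15 | 117,240 | 117,254 | SSC | ndhI |
| 33 | (T)10 | 10 | 118,903 | 118,912 | SSC | |
| 34 | (A)14 | 14 | 121,936 | 121,949 | SSC | ycf1 |
| 35 | (A)11 | 11 | 132,700 | 132,710 | IRB | |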
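To show how entries of the kind listed in Table 5 (motif, length, and 1-based start and end coordinates) can be derived from a sequence, the following is a minimal sketch of a perfect-repeat scan. It is not the pipeline used in this study. The function name `find_ssrs` and the thresholds (at least 10 repeats for mononucleotides, 6 for dinucleotides, and 5 for trinucleotides) are assumptions, with the thresholds inferred from the smallest entries in the table.

```python
# Minimum number of repeat units per motif length.
# Assumption: inferred from the smallest entries in Table 5.
MIN_REPEATS = {1: 10, 2: 6, 3: 5}


def find_ssrs(seq, min_repeats=MIN_REPEATS):
    """Return (motif, repeat_count, start, end) for perfect tandem repeats.

    Coordinates are 1-based and inclusive, matching the Start/End columns
    of Table 5. Only perfect repeats are reported.
    """
    seq = seq.upper()
    hits = []
    for unit_len, min_n in min_repeats.items():
        i = 0
        while i + unit_len <= len(seq):
            motif = seq[i:i + unit_len]
            # A di- or trinucleotide motif made of one base is a homopolymer;
            # it is handled by the mononucleotide pass instead.
            if unit_len > 1 and len(set(motif)) == 1:
                i += 1
                continue
            # Count how many consecutive copies of the motif start at i.
            n = 1
            while seq[i + n * unit_len:i + (n + 1) * unit_len] == motif:
                n += 1
            if n >= min_n:
                hits.append((motif, n, i + 1, i + n * unit_len))
                i += n * unit_len  # continue after the reported repeat
            else:
                i += 1
    return sorted(hits, key=lambda h: h[2])


# Toy sequence containing one repeat of each class.
toy = "GC" + "A" * 15 + "GCGC" + "AT" * 6 + "CCG" + "GAA" * 5 + "C"
for motif, n, start, end in find_ssrs(toy):
    print(f"({motif}){n}", end - start + 1, start, end)
```

For each reported repeat, the length equals End − Start + 1, as in Table 5 (for example, cpSSR 1 spans 3204–3218, i.e., 15 bp).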

Shen, X.; Wu, M.; Liao, B.; Liu, Z.; Bai, R.; Xiao, S.; Li, X.; Zhang, B.; Xu, J.; Chen, S. Complete Chloroplast Genome Sequence and Phylogenetic Analysis of the Medicinal Plant Artemisia annua. Molecules 2017, 22, 1330. https://doi.org/10.3390/molecules22081330