#### *4.1. Plant Material, DNA Extraction, and Sequencing*

We selected seven species according to their potential medicinal uses: *A. kaempferi*, *A. kunmingensis*, *A. macrophylla*, *A. mollissima*, and *A. moupinensis* from subgenus *Siphisia*, and *A. tagala* and *A. tubiflora* from subgenus *Aristolochia* (Table 7). Genomic DNA was isolated from silica-gel-dried leaf tissue or herbarium specimens using a Plant Genomic DNA Kit (TIANGEN, Beijing, China). DNA integrity was examined by electrophoresis in a 1% (*w*/*v*) agarose gel, and DNA concentration was measured using a NanoDrop 2000 spectrophotometer (Thermo Scientific; Waltham, MA, USA). The DNA was used to construct paired-end (PE) libraries with insert sizes of 150 bp, which were sequenced on the Illumina HiSeq X platform according to the manufacturer's manual.


**Table 7.** Sampled species and their voucher specimens used in this study.

#### *4.2. Chloroplast Genome Assembly and Annotation*

We used Trimmomatic version 0.36 (Max Planck Institute of Molecular Plant Physiology, Potsdam, Germany) [58] to trim low-quality reads. We retrieved the plastome sequences of *A. contorta* (NC\_036152.1), *A. debilis* (NC\_036153.1), *Asarum costatum* (AP018513.1), *Asarum minamitanianum* (AP018514.1), and *Asarum sakawanum* (AP017908.1) from GenBank and used them as references [28,30,31]. The plastomes were assembled by combining reference-guided mapping and de novo assembly, as implemented in Geneious R11 (Biomatters, Auckland, New Zealand) [59].

The cp genomes of the seven species were annotated using the online programs Dual Organellar GenoMe Annotator (DOGMA) (University of Texas at Austin, Austin, TX, USA) [60], Annotation of Organellar Genomes (GeSeq) [61], and Chloroplast Genome Annotation, Visualization, Analysis, and GenBank Submission (CPGAVAS) (Institute of Medicinal Plant Development, Chinese Academy of Medical Sciences and Peking Union Medical College, Beijing, China) [62]. The tRNA genes were confirmed using tRNAscan-SE (v2.0, University of California, Santa Cruz, CA, USA) [63]. Plastome annotations were manually corrected with Artemis [64]. The gene map was drawn using Organelle Genome DRAW (OGDRAW) [65,66] with default settings and checked manually. The complete cp genome sequences of the seven species were deposited in GenBank under accession numbers MK503927-MK503933 (Table S6).

#### *4.3. Genome Structure Analyses*

Codon usage was investigated with CodonW (University of Texas, Houston, TX, USA) using relative synonymous codon usage (RSCU) values [67]. GC content was analyzed using Molecular Evolutionary Genetics Analysis (MEGA v6.0, Tokyo Metropolitan University, Tokyo, Japan) [68]. The REPuter program (https://bibiserv.cebitec.uni-bielefeld.de/reputer) (University of Bielefeld, Bielefeld, Germany) [69] was used to identify the size and location of repeat sequences, including forward, palindromic, reverse, and complement repeats, in the seven cp genomes. For all repeat types, the minimal size was set to 30 bp and the two repeat copies were required to share at least 90% similarity. The Perl script MISA (https://webblast.ipk-gatersleben.de/misa/) [70] was used to detect microsatellites (mono-, di-, tri-, tetra-, penta-, and hexanucleotide repeats) with the following minimum repeat numbers: ten repeat units for mononucleotide SSRs, five repeat units for dinucleotide SSRs, four repeat units for trinucleotide SSRs, and three repeat units each for tetra-, penta-, and hexanucleotide SSRs.
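The SSR thresholds above can be illustrated with a minimal Python sketch. It is not the MISA script itself: the function name `find_ssrs` and the regular-expression approach are assumptions made for illustration, and only perfect (uninterrupted) repeats are reported.

```python
import re

# Minimum number of repeat units per motif size, as stated in the text
THRESHOLDS = {1: 10, 2: 5, 3: 4, 4: 3, 5: 3, 6: 3}


def find_ssrs(seq):
    """Return sorted (start, unit, repeats) tuples for perfect SSRs (0-based start)."""
    seq = seq.upper()
    hits = []
    for size, min_rep in THRESHOLDS.items():
        # One motif of `size` bases followed by at least (min_rep - 1) more copies
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (size, min_rep - 1))
        for m in pattern.finditer(seq):
            unit = m.group(1)
            # Skip motifs that are repeats of a shorter motif (e.g. "AA", "ATAT");
            # those are counted under the shorter motif size instead
            if any(unit == unit[:k] * (size // k)
                   for k in range(1, size) if size % k == 0):
                continue
            hits.append((m.start(), unit, len(m.group(0)) // size))
    return sorted(hits)
```

Calling `find_ssrs` on a plastome sequence string returns one tuple per microsatellite. Compound or interrupted SSRs, which MISA can also report, are outside the scope of this sketch.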

#### *4.4. Positive Selection Analysis*

To identify genes under selection, we scanned the cp genomes of seven species within Piperales using codeml in the PAML4 package [71,72]. The software was used to calculate the non-synonymous (dN) and synonymous (dS) substitution rates and their ratio (ω = dN/dS). Selective pressures were analyzed along the ML tree in Newick format (S7), which was built from the whole CDS region to determine the phylogenetic relationships of these seven species. Each single-copy CDS was aligned according to its amino acid sequence. Five site models (M0, M1a, M2a, M7, and M8) were employed to identify signatures of adaptation across the cp genomes. These site-specific models allow the ω ratio to vary among sites while keeping it fixed across all branches. Two pairs of site models, M1a (nearly neutral) vs. M2a (positive selection) and M7 (β) vs. M8 (β & ω), were compared in order to detect positive selection [73]. A likelihood ratio test (LRT) was applied to each comparison (M1a vs. M2a and M7 vs. M8) to evaluate the strength of selection, and a chi-square (χ²) *p* value smaller than 0.05 was considered significant. The Bayes Empirical Bayes (BEB) inference [74] was implemented in site models M2a and M8 to estimate the posterior probabilities and positive selection pressures for the selected genes.
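As a hedged illustration of the LRT step, the sketch below computes the test statistic 2ΔlnL and its χ² *p* value from two log-likelihoods. It assumes two degrees of freedom, the usual choice for the M1a vs. M2a and M7 vs. M8 comparisons but not stated explicitly in the text; the function name `lrt_pvalue` and the log-likelihood values in the usage lines are hypothetical.

```python
import math


def lrt_pvalue(lnl_null, lnl_alt, df=2):
    """Return (2*delta_lnL, p value) for a pair of nested site models.

    Assumption: df = 2, for which the chi-square survival function
    has the closed form exp(-x / 2).
    """
    if df != 2:
        raise ValueError("this sketch only implements the df = 2 case")
    # The statistic cannot be negative for properly nested models;
    # clamp small negative values caused by optimisation noise
    stat = max(0.0, 2.0 * (lnl_alt - lnl_null))
    return stat, math.exp(-stat / 2.0)


# Hypothetical log-likelihoods for one gene under M7 (null) and M8 (alternative)
stat, p = lrt_pvalue(lnl_null=-1234.56, lnl_alt=-1230.12)
significant = p < 0.05
```

A gene is flagged as a candidate for positive selection only when the test is significant; the BEB posterior probabilities are then inspected for individual sites.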

#### *4.5. Genome Comparison and Nucleotide Variation Analysis*

The whole-genome alignment (excluding one copy of the IR region) of the cp genomes of seven Piperales species, comprising our *A. moupinensis*, *A. kunmingensis*, and *A. tubiflora* and four previously reported species (*A. contorta*, *As. canadense*, *S. henryi*, and *P. cenocladum*), was performed and plotted with the mVISTA program (http://genome.lbl.gov/vista/mvista/submit.shtml) in Shuffle-LAGAN mode [75,76], with *A. moupinensis* as the reference. The seven cp genomes of *Aristolochia* were first aligned using MAFFT v7 [77] and then manually adjusted using BioEdit v7.0.9 [78]. Variable sites and nucleotide variability across the complete cp genomes and the LSC, IR, SSC, and CDS regions of the seven species were calculated using DnaSP v5 [79]. Furthermore, for the seven cp genomes minus one copy of the IR region, a sliding window analysis was conducted in DnaSP to evaluate nucleotide variability, with a step size of 200 bp and a window length of 600 bp.
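The sliding window analysis can be sketched in Python as follows. This is an illustrative approximation rather than the DnaSP implementation: the function names are hypothetical, alignment columns containing gaps or ambiguous bases are dropped (complete deletion), uppercase input is assumed, and nucleotide diversity is taken as the mean number of pairwise differences per retained site.

```python
from itertools import combinations


def nucleotide_diversity(seqs):
    """Mean pairwise differences per site, ignoring columns with non-ACGT characters."""
    cols = [c for c in zip(*seqs) if all(b in "ACGT" for b in c)]
    if len(seqs) < 2 or not cols:
        return 0.0
    pairs = list(combinations(range(len(seqs)), 2))
    diffs = sum(sum(1 for c in cols if c[i] != c[j]) for i, j in pairs)
    return diffs / (len(pairs) * len(cols))


def sliding_pi(seqs, window=600, step=200):
    """Return (window midpoint, pi) pairs along an alignment of equal-length sequences."""
    length = len(seqs[0])
    result = []
    for start in range(0, length - window + 1, step):
        chunk = [s[start:start + window] for s in seqs]
        result.append((start + window // 2, nucleotide_diversity(chunk)))
    return result
```

Plotting the returned values against the window midpoints gives a variability profile in which peaks mark candidate hotspot regions.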

#### *4.6. Phylogenetic Analyses*

To estimate phylogenetic relationships within the Aristolochiaceae, plastomes of 18 taxa were compared, including nine samples from *Aristolochia*, six cp genomes from *Asarum*, and one from *Saruma* (Table S5). A total of 11 cp genomes were downloaded from the NCBI database. In the phylogenetic analyses, *P. auritum* and *P. cenocladum* of *Piper* were used as outgroups. Phylogenetic trees were constructed by maximum parsimony (MP), maximum likelihood (ML), and Bayesian inference (BI) methods using the complete cp genomes and the LSC, SSC, IR, CDS, and hotspot regions. The sequences of these regions were aligned using MAFFT v7. MP analysis was performed with PAUP\*4.0b10 [80], using a heuristic search with 1000 replications and tree bisection-reconnection (TBR) branch swapping. BI was conducted using MrBayes v3.2 [81] with the GTR+I+G model at the CIPRES Science Gateway website (http://www.phylo.org/) [82]. The Markov Chain Monte Carlo (MCMC) analysis was run for 2,000,000 generations, sampling every 1000 generations. The posterior probabilities (PP) of the phylogeny and its branches were determined from the combined set of trees, discarding the first 25% of trees from each run as burn-in, as determined by Tracer v1.7 [83]. ML phylogenies were constructed by a fast and effective stochastic algorithm using IQ-TREE v1.6.2 [84] with the best-fit model selected by ModelFinder [85] according to the Bayesian information criterion (BIC), and the robustness of the topology was estimated using 2000 bootstrap replicates. FigTree v1.4 (http://tree.bio.ed.ac.uk/software/figtree/) [86] was used to visualize and annotate trees.

#### **5. Conclusions**

The complete cp genomes of *A. kaempferi*, *A. kunmingensis*, *A. macrophylla*, *A. mollissima*, and *A. moupinensis* of the subgenus *Siphisia*, and *A. tagala* and *A. tubiflora* of the subgenus *Aristolochia*, were reported in this study. The cp genome length and gene content of the genus *Aristolochia* were comparatively conserved. Although genomic structure and size were highly conserved, the IR-SC boundary regions varied among the nine cp genomes of *Aristolochia*. The duplication of the whole *trnH* gene in the five species of *Siphisia* is one of the major differences between the plastomes of the subgenera *Siphisia* and *Aristolochia*. We also identified SSR sites, five positive selection sites, and 16 variable regions, which provide a reference for developing tools to further study *Aristolochia* species. Furthermore, the phylogenetic reconstructions based on six datasets from 18 cp genomes yielded robust and consistent relationships with high support.

**Supplementary Materials:** Supplementary materials can be found at http://www.mdpi.com/xxx/s1.

**Author Contributions:** X.L. performed the experiments, analyzed the data, and wrote the manuscript; Y.Z. assembled sequences and revised the manuscript; X.Z. and S.L. collected and identified plant materials and gave suggestions on the manuscript; J.M. revised the manuscript. All authors have read and approved the final manuscript.

**Funding:** This work was supported by the National Natural Science Foundation of China (No. 31370225).

**Acknowledgments:** The authors give special thanks to Shuwan Li, Yuan Wang, Zhanghua Wang, and Zhixiang Hua for collecting plant material. Our sincere thanks also go to the anonymous reviewers for their comments and suggestions.

**Conflicts of Interest:** The authors declare no conflict of interest.
