#### *2.1. Structure of Chloroplasts in Utricularia amethystina*

A total of 2,873,574 paired-end reads were generated across all *Utricularia amethystina* morphotypes. Approximately 7.57%, 9.10%, and 9.56% of the reads were cpDNA-derived and were used for the de novo assembly of *U. amethystina* purple, white, and yellow, respectively. For each morphotype, assembly with SPAdes produced one contig containing the entire LSC region and two further contigs containing the IR and SSC regions. The cpDNA contigs were joined into a supercontig and circularized using MITObim iterative read mapping. The three *U. amethystina* cpDNAs share the quadripartite structure found in most other angiosperms and vary slightly in size (Figure 2, Table 1, Supplementary Table S1).

**Table 1.** Summary of the characteristics of the *Utricularia amethystina* chloroplast genomes. Values in parentheses give the percentage of the whole cpDNA that each part represents.


**Figure 2.** Chloroplast genome map for *Utricularia amethystina* purple, white, and yellow. A single map represents all three cpDNAs: gene order and number are the same, except that yellow has an inversion of the *pet*N and *psb*M genes (at 2 o'clock on the map). Thick black lines on the outer circle indicate the extent of the inverted repeats. Arrows denote the direction of transcription. Genes are colored according to their functional groups. The inner graph shows the GC content of each cpDNA region for each morphotype. Purple, yellow, and gray bars denote the *U. amethystina* purple, yellow, and white morphotypes, respectively.

The three cpDNAs contain a total of 137 annotated genes, including 39 unique protein-coding genes, 30 tRNA genes, and four rRNA genes. Eighteen genes (*pet*B, *pet*D, *atp*F, *rpo*C1, *rps*12, *rps*16, *rpl*2, *ndh*A, *ndh*B (2×), *trn*K-UUU, *trn*A-UGC (2×), *trn*I-GAC (2×), *trn*G-UCC, *trn*L-UAA, *trn*V-UAC) contain one intron, and two genes (*clp*P, *ycf*3) contain two introns. A further 18 genes (*rpl*2, *rpl*23, *ycf*2, *ycf*15, *ndh*B, *rps*7, *rps*12, *trn*I-CAU, *trn*L-CAA, *trn*V-GAC, 16S rRNA, *trn*I-GAU, *trn*A-UGC, *trn*R-ACG, *trn*N-GUU, 23S rRNA, 4.5S rRNA, 5S rRNA) are duplicated, and five partial genes (*ycf*68, *orf*42, *orf*56, *ycf*1, *rps*19; putative pseudogenes) occur in the IR regions. All assessed cpDNAs are collinear in gene content and arrangement. The main difference among the three morphotypes is that the positions of the *pet*N and *psb*M genes are inverted in *U. amethystina* yellow relative to the other morphotypes and to other *Utricularia* (Figure 2).
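The collinearity check described above can be illustrated with a minimal Python sketch. It is not the method used in the study: the function name `find_inversion` and the flanking labels `geneA` and `geneB` are hypothetical placeholders, and the toy gene orders are not taken from the annotations. The sketch compares two gene-order lists and reports the single block whose order is reversed; it ignores strand, which would also flip in a true inversion.

```python
def find_inversion(reference, query):
    """Locate a single reversed block between two gene orders.

    Returns (start, end) as a half-open slice into the lists,
    or None when the two orders are identical (collinear).
    Raises ValueError if gene content differs or if the difference
    is not explained by one reversed block.
    """
    if sorted(reference) != sorted(query):
        raise ValueError("gene content differs; orders are not comparable")
    if reference == query:
        return None

    # First and last positions at which the two orders disagree.
    start = next(i for i, (a, b) in enumerate(zip(reference, query)) if a != b)
    from_end = next(
        i for i, (a, b) in enumerate(zip(reversed(reference), reversed(query)))
        if a != b
    )
    end = len(reference) - from_end

    # A simple inversion means the query block is the reference block reversed.
    if query[start:end] != reference[start:end][::-1]:
        raise ValueError("difference is not a single inversion")
    return start, end


# Toy example: placeholder flanking genes around petN and psbM.
reference_order = ["geneA", "petN", "psbM", "geneB"]
yellow_like_order = ["geneA", "psbM", "petN", "geneB"]
inverted_block = find_inversion(reference_order, yellow_like_order)
```

In this toy case `inverted_block` is the slice covering the two swapped genes. A real comparison would take gene orders parsed from the annotated GenBank records and would also compare strand orientation.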

#### *2.2. Repeats and Chloroplast Microsatellites (cpSSR)*

REPuter identified 22, 29, and 27 repeats in *U. amethystina* purple, white, and yellow, respectively (Figure 3A). Most were forward or palindromic repeats. The repeats were located mainly in the *ycf*2, *rpo*C1, *trn*S-GCU, *ycf*3, and *ndh*A genes and in the *rbc*L-*acc*D and *rps*12-*trn*V-GAC intergenic regions (Supplementary Tables S2–S4). The identified microsatellites (cpSSRs) vary from 7 to 369 bp in *U. amethystina* purple, 7 to 264 bp in *U. amethystina* yellow, and 7 to 245 bp in *U. amethystina* white (Figure 3B). The numbers of cpSSRs are similar among the morphotypes: mononucleotide (346, 350, 343) and dinucleotide (42, 43, 40) repeats are the most abundant, followed by trinucleotide (4, 3, 4) and tetranucleotide repeats (2 for *U. amethystina* purple, 10 for white, and 7 for yellow). Interestingly, pentanucleotide repeats were found only in *U. amethystina* yellow, and hexanucleotide repeats were not found in any morphotype (Supplementary Table S5).
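As an illustration of how cpSSRs can be tallied by motif length, the following is a minimal Python sketch using regular expressions. This section does not state which tool or thresholds were used for cpSSR detection, so the function name `find_ssrs` and the minimum repeat counts in `MIN_REPEATS` are assumptions chosen for illustration only, not the study's settings.

```python
import re
from collections import Counter

# Hypothetical thresholds: motif length -> minimum number of tandem copies.
MIN_REPEATS = {1: 7, 2: 4, 3: 4, 4: 3, 5: 3, 6: 3}


def _is_degenerate(unit):
    """True if the motif is itself a repeat of a shorter motif (e.g. 'AA', 'ATAT')."""
    n = len(unit)
    return any(unit == unit[:k] * (n // k) for k in range(1, n) if n % k == 0)


def find_ssrs(seq, min_repeats=MIN_REPEATS):
    """Return sorted (start, motif, copies) tuples for perfect tandem repeats."""
    seq = seq.upper()
    hits = []
    for unit_len in sorted(min_repeats):
        copies_needed = min_repeats[unit_len]
        # A motif of unit_len bases followed by at least copies_needed - 1 more copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (unit_len, copies_needed - 1))
        pos = 0
        while True:
            match = pattern.search(seq, pos)
            if match is None:
                break
            unit = match.group(1)
            if _is_degenerate(unit):
                # Step forward one base so an overlapping true SSR is not missed.
                pos = match.start() + 1
                continue
            hits.append((match.start(), unit, len(match.group(0)) // unit_len))
            pos = match.end()
    return sorted(hits)


def count_by_motif_length(hits):
    """Tally SSRs as mono- (1), di- (2), tri- (3) ... repeats."""
    return Counter(len(unit) for _, unit, _ in hits)


# Toy sequence: an A run of 8 and five copies of AT, separated by spacers.
toy = "GC" + "A" * 8 + "CG" + "AT" * 5 + "GC"
toy_hits = find_ssrs(toy)
toy_counts = count_by_motif_length(toy_hits)
```

Counts obtained this way depend strongly on the chosen thresholds, so they are comparable across morphotypes only when the same settings are applied to every genome.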

**Figure 3.** Number of repeats in *Utricularia amethystina*. (**A**) Long repeats. (**B**) Simple sequence repeats (cpSSRs). Purple, yellow, and gray bars denote the *U. amethystina* purple, yellow, and white morphotypes, respectively. Additional information can be found in Supplementary Table S5.
