#### *4.6. Interspecific Comparison*

Using *Utricularia amethystina* purple as a reference, together with previously published *Utricularia* sequences (*Utricularia foliosa*, KY025562; *U. gibba*, KC997777; *U. reniformis*, KT336489; *U. macrorhiza*, HG80317), cpDNA sequence identity was assessed using the mVISTA online software (http://genome.lbl.gov/vista/mvista/submit.shtml) in Shuffle-LAGAN mode.

#### *4.7. RNA Extraction, Sequencing and RNA Editing Site Analyses*

Corollas of *Utricularia amethystina* from each analyzed population were stored in RNAlater® (Thermo Fisher Scientific, MA, USA) and used as plant tissue for RNA-Seq. The corollas (~5 per specimen) were pooled into three replicates for *U. amethystina* white and purple and two for *U. amethystina* yellow, and total RNA was extracted using the PureLink RNA Mini Kit (Thermo Fisher Scientific, MA, USA) according to the manufacturer's protocol. The extracted RNA was analyzed with an Agilent 2100 Bioanalyzer and a Qubit 2.0 Fluorometer for quality and quantity assessment, and only samples with an RNA Integrity Number (RIN) > 7.0 were used for sequencing.

The eight libraries (three each for *U. amethystina* purple and white, and two for *U. amethystina* yellow) were constructed following the TruSeq Stranded mRNA LS sample preparation protocol. Paired-end (2 × 100 bp) sequencing was performed in one lane on an Illumina HiSeq 2500 platform following supplier-provided protocols (Illumina, San Diego, CA, USA).

The raw sequencing data were preprocessed with high stringency using the following steps: (1) for the 3′ end, adapters and low-quality reads were removed using Scythe (https://github.com/vsbuffalo/scythe; default parameters, except for -n 5 and -M 15); (2) for the 5′ end, adapters and low-quality reads were removed with Cutadapt [72] (default parameters, except for --overlap 5; --minimum-length = 15; --times = 2); (3) reads with more than 30% unknown bases (Ns) and poly-A/T tails were filtered using Prinseq [73].
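The unknown-base criterion in step (3) can be illustrated with a minimal Python sketch. This is not Prinseq's implementation nor the pipeline used here; the record layout and the names `n_fraction` and `filter_by_n_content` are hypothetical and serve only to make the 30% threshold concrete.

```python
from typing import Iterable, Iterator, Tuple

# Hypothetical in-memory record: (header, sequence, quality string).
FastqRecord = Tuple[str, str, str]

MAX_N_FRACTION = 0.30  # reads with more than 30% Ns are discarded (from the text)


def n_fraction(sequence: str) -> float:
    """Fraction of unknown bases (N) in a read; empty reads count as all-N."""
    if not sequence:
        return 1.0
    return sequence.upper().count("N") / len(sequence)


def filter_by_n_content(
    records: Iterable[FastqRecord],
    max_n_fraction: float = MAX_N_FRACTION,
) -> Iterator[FastqRecord]:
    """Yield only the reads whose N fraction does not exceed the threshold."""
    for record in records:
        if n_fraction(record[1]) <= max_n_fraction:
            yield record


if __name__ == "__main__":
    reads = [
        ("@r1", "ACGTNNNNAC", "IIIIIIIIII"),  # 40% N, removed
        ("@r2", "ACGTACGTNN", "IIIIIIIIII"),  # 20% N, kept
    ]
    for header, _, _ in filter_by_n_content(reads):
        print(header)
```

A read with exactly 30% Ns is kept in this sketch, because the criterion stated above is "more than 30%".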

Filtered RNA-seq reads were mapped against the assembled chloroplast genome using STAR version 2.7.2a [74] with default parameters, except for those adjusted to perform end-to-end mapping, reduce multiple mapping of the same reads, set the minimum and maximum intron sizes, and limit the number of allowed mismatches (--outFilterMultiMapMax = 3; --outFilterMismatchNmax = 2; --outFilterMismatchNoverLmax = 0.1; --outSJfilterReads = Unique; --alignEndsType = EndtoEnd; --alignIntronMin = 70; --alignIntronMax = 2500). To estimate differential transcript abundance between biological replicates, normalized count data were obtained using the relative log expression (RLE) method in DESeq2 version 3.9 [75], and results are shown as log2(normalized counts + 1). Because *rps*12 is a duplicated trans-spliced gene, it was analyzed in three parts, with "\_2" and "\_3" denoting the duplicated regions.

Putative RNA editing sites were predicted using the PREPACT3 software [76] with BlastX searches (default parameters) against *Nicotiana tabacum* as the reference. The predictions were compared with the results of an in-house script that counts editing sites from reads mapped with STAR as described above, except that the number of allowed mapping locations per read was set to 1 (only uniquely mapped reads). All sites were inspected for C-to-U nucleotide substitutions with a custom Perl script, using the following criteria: presence in at least two of the biological replicates and a minimum coverage of 10×.
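The replicate and coverage criteria for editing sites can be sketched as follows. This Python example is illustrative only and is not the custom Perl script used in this study. The pileup layout, the function names, and the `MIN_EDITED_READS` threshold are assumptions, since the text does not state how many edited reads are required for a site to count as present in a replicate. Base counts are assumed to be already oriented to the coding strand, so a C-to-U edit appears as C-to-T in the cDNA reads.

```python
from typing import Dict, List, Tuple

# Hypothetical per-replicate pileup: position -> (reference base, {base: read count}).
Pileup = Dict[int, Tuple[str, Dict[str, int]]]

MIN_COVERAGE = 10     # minimum coverage of 10x (from the text)
MIN_REPLICATES = 2    # presence in at least two biological replicates (from the text)
MIN_EDITED_READS = 1  # assumption: not specified in the text


def site_is_edited(ref_base: str, counts: Dict[str, int]) -> bool:
    """True if one replicate supports a C-to-U (observed as C-to-T) change."""
    coverage = sum(counts.values())
    return (
        ref_base == "C"
        and coverage >= MIN_COVERAGE
        and counts.get("T", 0) >= MIN_EDITED_READS
    )


def candidate_editing_sites(replicates: List[Pileup]) -> List[int]:
    """Positions supported as C-to-U sites in at least MIN_REPLICATES replicates."""
    support: Dict[int, int] = {}
    for pileup in replicates:
        for pos, (ref_base, counts) in pileup.items():
            if site_is_edited(ref_base, counts):
                support[pos] = support.get(pos, 0) + 1
    return sorted(pos for pos, n in support.items() if n >= MIN_REPLICATES)


if __name__ == "__main__":
    rep1 = {100: ("C", {"C": 4, "T": 8}), 200: ("C", {"C": 3, "T": 2})}
    rep2 = {100: ("C", {"C": 6, "T": 9}), 200: ("C", {"C": 12, "T": 5})}
    rep3 = {100: ("C", {"C": 20}), 300: ("A", {"A": 15, "T": 3})}
    print(candidate_editing_sites([rep1, rep2, rep3]))
```

In this toy input, only position 100 passes both criteria: position 200 reaches 10× coverage in a single replicate, and position 300 does not have C as its reference base.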
