*2.1. Characteristics of Four Chloroplast Genomes*

Previous studies have shown that the plastomes of flowering plants are highly conserved in structural organization and gene content, although contraction and expansion do occur [47,48]. The complete chloroplast genomes of *C. farinosa*, *C. glandulosa*, *M. crassifolia* and *M. oblongifolia* each have a circular, quadripartite structure. Genome size ranged from 155,436 bp (*M. oblongifolia*) to 156,560 bp (*C. glandulosa*); the coding region ranged from 76,614 bp (*C. glandulosa*) to 78,080 bp (*C. farinosa*), corresponding to 48.93% and 49.89% of the total genome length, respectively. The large single-copy (LSC) region ranged from 84,153 bp (*M. oblongifolia*) to 85,681 bp (*C. glandulosa*), whereas the small single-copy (SSC) region ranged from 18,031 bp (*C. glandulosa*) to 18,481 bp (*M. oblongifolia*); the two inverted repeats (IRa and IRb), which are separated by the SSC region, ranged from 26,294 bp (*M. crassifolia*) to 26,430 bp (*C. farinosa*) (Table 1 and Figure 1). The four Capparaceae chloroplast genome sequences were deposited in GenBank (accession numbers: *C. farinosa*, MN603027; *C. glandulosa*, MN603028; *M. crassifolia*, MN603029; and *M. oblongifolia*, MN603030).
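The quadripartite layout implies a simple length identity, total = LSC + SSC + 2 × IR, which can serve as a sanity check on the reported region sizes. The following is a minimal sketch rather than part of the original analysis: the function names are hypothetical, the inputs are the three *C. glandulosa* values quoted above, and the IR length it computes is derived from the identity (assuming two equal-length IR copies and no unassigned bases), not a value reported in the text.

```python
def implied_ir_length(total_bp: int, lsc_bp: int, ssc_bp: int) -> int:
    """Length of one IR copy implied by total = LSC + SSC + 2 * IR."""
    remainder = total_bp - lsc_bp - ssc_bp
    if remainder < 0 or remainder % 2:
        # An odd or negative remainder cannot be split into two equal IR copies.
        raise ValueError("region lengths are inconsistent with two equal IR copies")
    return remainder // 2


def percent_of_genome(region_bp: int, total_bp: int) -> float:
    """Share of the genome occupied by a region, in percent."""
    return 100.0 * region_bp / total_bp


# Values for C. glandulosa as quoted in the text (bp).
TOTAL_BP, LSC_BP, SSC_BP, CODING_BP = 156_560, 85_681, 18_031, 76_614

ir_bp = implied_ir_length(TOTAL_BP, LSC_BP, SSC_BP)
coding_pct = percent_of_genome(CODING_BP, TOTAL_BP)

print(f"implied IR length: {ir_bp} bp")
print(f"coding fraction:   {coding_pct:.2f}%")
```

The implied IR length falls inside the 26,294–26,430 bp range given for the four species, so the quoted total, LSC and SSC sizes for *C. glandulosa* are mutually consistent under these assumptions.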

**Table 1.** Base content in the *C. farinosa*, *C. glandulosa*, *M. crassifolia* and *M. oblongifolia* chloroplast genomes.


In the four species, the plastomes are well conserved in gene order and gene number, with slight variation in the presence of the *psbG* gene, which is absent in *C. glandulosa*. Gene annotation revealed a total of 137 genes in *C. glandulosa* and 138 genes in *C. farinosa*, *M. crassifolia* and *M. oblongifolia*, of which 116–117 are situated in the LSC and SSC regions and 19 are located in the IRa and IRb regions. The plastomes contain 80 protein-coding genes in *C. glandulosa* and 81 in the other species, together with four rRNA genes and 31 tRNA genes (Figure 1 and Table 2). Eight protein-coding genes, four rRNA genes and seven tRNA genes were found in the IR regions. The LSC region contains 61 protein-coding genes in *C. glandulosa* and 62 in the other species, along with 23 tRNA genes; the remaining 12 protein-coding genes and one tRNA gene are situated in the SSC region.

Some protein-coding and tRNA genes in angiosperm chloroplast genomes contain introns [49,50], and this is also the case in the plastomes of the four species studied here (*C. farinosa*, *C. glandulosa*, *M. crassifolia* and *M. oblongifolia*). Of the total genes in each plastome, 13 in *C. glandulosa* and *M. crassifolia* and 14 in *C. farinosa* and *M. oblongifolia* contain introns; nine of these (in *C. farinosa* and *M. oblongifolia*) or eight (in *C. glandulosa* and *M. crassifolia*) are protein-coding genes, and the remaining five are tRNA genes. Four intron-containing genes (*rpl2*, *ndhB*, *trnI-GAU* and *trnA-UGC*) are situated in the inverted repeat regions, *ndhA* is located in the SSC region and the remainder are found in the LSC region. All of these genes have a single intron except *ycf3* and *clpP*, which have two. The gene *trnK-UUU* has the longest intron (2542–2571 bp) because the *matK* gene is located within it.
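The per-species intron counts given above can be cross-checked with a short tally. This is only an illustrative sketch: the dictionary layout and names are hypothetical, and the numbers are copied from the paragraph rather than taken from the annotation files.

```python
# Intron-containing genes per species, split by gene class (counts from the text).
INTRON_GENES = {
    "C. farinosa": {"protein_coding": 9, "tRNA": 5},
    "C. glandulosa": {"protein_coding": 8, "tRNA": 5},
    "M. crassifolia": {"protein_coding": 8, "tRNA": 5},
    "M. oblongifolia": {"protein_coding": 9, "tRNA": 5},
}

# Totals stated in the text for each species.
REPORTED_TOTALS = {
    "C. farinosa": 14,
    "C. glandulosa": 13,
    "M. crassifolia": 13,
    "M. oblongifolia": 14,
}

for species, by_class in INTRON_GENES.items():
    total = sum(by_class.values())
    status = "ok" if total == REPORTED_TOTALS[species] else "mismatch"
    print(f"{species}: {total} intron-containing genes ({status})")
```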

**Figure 1.** Chloroplast genome maps of the four Capparaceae species. Genes drawn inside the circle are transcribed clockwise, while those outside the circle are transcribed counter-clockwise. The inner dark gray circle corresponds to the GC content and the inner light gray circle corresponds to the AT content. Genes belonging to different functional groups are shown in different colors.




**Table 2.** *Cont.*

+ Gene with one intron; ++ gene with two introns; <sup>a</sup> gene with multiple copies. \* *ndhK* is placed in the photosystem II group in *C. farinosa* and in the NADH dehydrogenase group in *C. glandulosa*, *M. crassifolia* and *M. oblongifolia*.
