**1. Introduction**

Pomegranates (*Punica granatum* L.) are an economically important fruit tree of the tropical and subtropical regions of the world. It is native to central Asia and has been highly praised in many human cultures since ancient times [1]. Pomegranates have showy edible fruit with a high content of anthocyanins and flavonoids [2,3]. It has been well demonstrated that pomegranates are valuable to human health due to high levels of flavonoids and anthocyanins, which are considered potent antioxidants offering protection against heart disease and cancer [4,5]. Also, the pomegranate tree is suitable for genetic analysis due to its short juvenile period, and the high number of progenies [6]. As important resources for basic research and crop improvement, the genome of *P. granatum* 'Taishanhong' has been determined [7]. This genome will shed new light on the understanding of some unique biological processes and pomegranate breeding. Compared to the nuclear genome, the

complete chloroplast genome is a low-cost and efficient way to get valuable genomic resources that can be used to understand evolution at multiple taxonomic levels [8,9] and analyze the population [10] because of its highly conserved structures and comparatively moderate substitution rates [11].

Chloroplasts (cp) are the photosynthetic organelles of the plant cells, which are derived from free-living cyanobacteria through endosymbiosis [12]. Apart from playing key roles in photosynthesis, chloroplasts are also responsible for other aspects of plant physiology and development [13]. A new study found that chloroplast retrograde signaling can regulate nuclear alternative splicing of a subset of *Arabidopsis thaliana* transcripts [14,15]. Interestingly, researchers have found that chloroplasts play diverse roles in plant defense, including contributing to the production of defense compounds [16]. Chloroplasts contain their own genome, the chloroplast DNA (cpDNA), which is highly conserved in genomic structure, gene content, and gene order. Cp genomes have been proved to be an effective biological tool for rapid and accurate species recognition as super-barcode [17,18]. With the advent of high-throughput sequencing technology, the increasing number of cp genomes of fruit crops has been published [19–21]. Before the development of next-generation sequencing technology, cp genome assembly was usually based on conventional primer walking strategies [22,23], which are laborious and costly. It is now convenient to obtain complete a cp genome by using genomic DNA extracted from the leaf tissue, because a large number of cp genomes are present in the sample [24]. Based on homology to cp from related species, these reads from cp can be assembled into circle genomes. Many bioinformatic tools have been developed to recover the cp genome sequence from total genomic DNA, such as NOVOPlasty [25], chloroExtractor [26], and GetOrganelle [27].

In the present study, we obtained cp genomes of pomegranates from three phenotypically different cultivars using the whole genome sequencing data. This study aimed to conduct a comprehensive analysis of the pomegranate cp genome, including gene content, genomic structure, codon usage, and potential RNA editing sites. In addition, combined with previously published cp genomes of Myrtales, phylogenetic analysis was performed to determine the taxonomic position of *P. granatum*. The results obtained here will provide valuable information for understanding the phylogenetic position of pomegranates and the evolutionary history of the order Myrtales.

#### **2. Results and Discussion**

#### *2.1. General Features of Pomegranate Chloroplast Genomes*

The complete cp genomes of 'Nana', 'Tunisia', and 'Taishanhong' were de novo assembled using whole genome sequencing data with GetOrganelle [27]. The cp genomes of 'Nana', 'Tunisia', and 'Taishanhong' were found to be 158,638, 158,639, and 158,638 bp in size, respectively. All of them exhibited a typical quadripartite structure, consisting of a pair of IRs separated by a large single copy region (LSC) and a small single copy region (SSC) (Figure 1). There are identical sets of 113 genes with the same gene order, including 79 protein-coding, 30 tRNA, and 4 rRNA genes. Six protein-coding genes (*rps7*, *rps12*, *rpl2*, *rpl23*, *ndhB*, *ycf2*), seven tRNA genes (*trnI-CAU*, *trnN-GUU*, *trnR-ACG*, *trnA-UGC*, *trnI-GAU*, *trnV-GAC*, *trnL-CAA*), and all rRNA genes (4.5S, 5S, 16S, 23S) are located at the IR regions. Eleven of the protein-coding genes and six of the tRNA genes contain introns, 14 of which contain a single intron, whereas three (*rps12*, *ycf3*, *clpP*) have two introns (Table 1). In particular, the *rps12* is a trans-spliced gene, whose first exon is located in the LSC, while the second and third exons reside in IRs. The *infA* gene was identified as a pseudogene because of the accumulation of the premature stop codons [28]. Another pseudogene *ycf1* existed because of the incomplete duplication of the normal copy of *ycf1* in the IRa and SSC junction, which is identical with previous reports [29,30]. There are some exceptions where non-ATG codons were identified as start codons, such as ACG for *psbL*, GTG for *rps19*, and ACG for *ndhD*. Alternate start codons have been found in other plant species [31]. Alternate start codons are still translated as Met when they are the start of a protein because a separate transfer RNA is used for initiation [32]. The overall GC content was 36.92%; this was consistent with previously

reported GC content of IRs (42.78%) being higher than that of the LSC (34.89%) and SSC (30.64%) [33]. The high GC percentage of IRs could be due to the presence of rRNA sequences in these regions [34]. higher than that of the LSC (34.89%) and SSC (30.64%) [33]. The high GC percentage of IRs could be due to the presence of rRNA sequences in these regions [34].

was 36.92%; this was consistent with previously reported GC content of IRs (42.78%) being

codons [28]. Another pseudogene *ycf1* existed because of the incomplete duplication of the normal copy of *ycf1* in the IRa and SSC junction, which is identical with previous reports [29,30]. There are some exceptions where non-ATG codons were identified as start codons, such as ACG for *psbL*, GTG for *rps19*, and ACG for *ndhD*. Alternate start codons have been found in other plant species [31]. Alternate start codons are still translated as Met when they are the start

**Figure 1.** Chloroplast genome maps of *P. granatum*. Genes drawn outside the outer circle are transcribed clockwise, and those inside are transcribed counter-clockwise. Genes belonging to different functional groups are color-coded. **Figure 1.** Chloroplast genome maps of *P. granatum*. Genes drawn outside the outer circle are transcribed clockwise, and those inside are transcribed counter-clockwise. Genes belonging to different functional groups are color-coded.

Chloroplast DNA has already been used in accessing the genetic diversity and phylogenetic structure at an intraspecies level. For instance, hypervariable regions of cp DNA such as *atpB-rbcL*, *trnL-trnF* and *rps16-trnQ* were used to assess the genetic diversity of Tunisian apricot accessions [35]. Also, chloroplast microsatellite loci were used to investigate the genetic diversity of Iranian pomegranate genotypes [36]. In our present study, the sequence diversity of pomegranate cp genomes was investigated combined with two previously reported cp genomes (NC\_035240, MG878386). The results showed that the sequence diversity of pomegranates is extremely low (0.0008). Only 42 singleton variable sites were detected, and there were no parsimony variable sites in the alignment of the cp genomes of the five Chloroplast DNA has already been used in accessing the genetic diversity and phylogenetic structure at an intraspecies level. For instance, hypervariable regions of cp DNA such as *atpB-rbcL*, *trnL-trnF* and *rps16-trnQ* were used to assess the genetic diversity of Tunisian apricot accessions [35]. Also, chloroplast microsatellite loci were used to investigate the genetic diversity of Iranian pomegranate genotypes [36]. In our present study, the sequence diversity of pomegranate cp genomes was investigated combined with two previously reported cp genomes (NC\_035240, MG878386). The results showed that the sequence diversity of pomegranates is extremely low (0.0008). Only 42 singleton variable sites were detected, and there were no parsimony variable sites in the alignment of the cp genomes of the five pomegranate accessions. Therefore, we propose that cp genome sequences might not be appropriate for investigating the genetic diversity of pomegranate genotypes.


**Table 1.** The groups of genes within the *P. granatum* chloroplast genome.

One asterisk indicates that the genes contained two copies. a and b indicate one- and two-intron containing genes, respectively.

#### *2.2. Codon Usage Bias*

As an essential evolutionary feature, the codon usage pattern has been widely investigated in many plant species [37–39]. In our study, we explored the codon usage pattern in the cp genomes of pomegranates. Protein-coding genes with more than 300 nucleotides were selected for further analysis. Firstly, the base composition on three different codon positions was determined, and the data are displayed in Figure 2A. The results indicated that the average GC content of the first (GC1), second (GC), and third codon positions (GC3) were 47.04, 39.79, and 28.34%, respectively. The base compositions of the three different positions were distributed unevenly. The average GC3 content was significantly lower than those of GC1 and GC2. The results of neutrality plots (Figure 2B) showed that no significant correlation (R2 = 0.0036) between GC12 and GC3 was observed, which suggests that selective pressure affects the codon usage bias in the pomegranate cp genomes [40,41]. The codon adaptation index (CAI) value (Figure 2C) ranged from 0.5 to 1 with a default *E. coli* reference gene set as the reference. According to their functions in the chloroplast, the protein-coding genes can be classified into three categories: photosynthesis related genes (photo-genes), genetic system related genes (genet-genes), and other genes. A recent study about codon usage bias of cp genomes in cultivated and wild Solanum species concluded that photo-genes always had higher CAI values than genet-genes because the

solid.

expression level of photo-genes is relatively higher than that of genet-genes [42]. The same result was also observed in the cp genome of pomegranates. The main reason is probably due to the fact that photo-genes may have a higher codon usage bias for the requirement of high gene expression than do the genet-genes in the plant cp genomes. Furthermore, the relationship between bas compositions and codon usage was investigated by ENC-plot (Figure 2D). Effective number of codons (ENC) values ranged from 35.73 to 61, suggesting that codon usage bias is relatively weak in the pomegranate cp genomes. The distribution of most genes was far away from the standard curve, which shows that there are other factors that affect the codon usage, other than base compositions [43–45]. As the cp genomes were highly AT-rich, it was not surprising that AT-ending codons would be predominant in the protein-coding genes. The results are also consistent with the mutational bias towards AT being the force driving the strong bias of codon usage of plan cp genomes [43]. Also, the Arg amino acid coded with the AGA codon was the most frequent codon with a relative synonymous codon usage (RSCU) value 2.02 (Table 2). The main reason is probably due to the fact that photo-genes may have a higher codon usage bias for the requirement of high gene expression than do the genet-genes in the plant cp genomes. Furthermore, the relationship between bas compositions and codon usage was investigated by ENC-plot (Figure 2D). Effective number of codons (ENC) values ranged from 35.73 to 61, suggesting that codon usage bias is relatively weak in the pomegranate cp genomes. The distribution of most genes was far away from the standard curve, which shows that there are other factors that affect the codon usage, other than base compositions [43–45]. As the cp genomes were highly AT-rich, it was not surprising that AT-ending codons would be predominant in the protein-coding genes. The results are also consistent with the mutational bias towards AT being the force driving the strong bias of codon usage of plan cp genomes [43]. Also, the Arg amino acid coded with the AGA codon was the most frequent codon with a relative synonymous codon usage (RSCU) value 2.02 (Table 2).

usage bias in the pomegranate cp genomes [40,41]. The codon adaptation index (CAI) value (Figure 2C) ranged from 0.5 to 1 with a default *E. coli* reference gene set as the reference. According to their functions in the chloroplast, the protein-coding genes can be classified into three categories: photosynthesis related genes (photo-genes), genetic system related genes (genet-genes), and other genes. A recent study about codon usage bias of cp genomes in cultivated and wild Solanum species concluded that photo-genes always had higher CAI

that of genet-genes [42]. The same result was also observed in the cp genome of pomegranates.

**Figure 2.** The codon usage pattern of the pomegranate chloroplast (cp) genome. (**A**) GC content on three different positions. (**B**) Neutrality plot (GC12 against GC3). (**C**) The codon adaptation index (CAI) value of gene sets with different functions. (**D**) Relationship between GC3 and effective number of codons (ENC) (ENC-plot). The expected ENC from GC3 is shown as a **Figure 2.** The codon usage pattern of the pomegranate chloroplast (cp) genome. (**A**) GC content on three different positions. (**B**) Neutrality plot (GC12 against GC3). (**C**) The codon adaptation index (CAI) value of gene sets with different functions. (**D**) Relationship between GC3 and effective number of codons (ENC) (ENC-plot). The expected ENC from GC3 is shown as a solid.


**Table 2.** Putative preferred codons in the *P. granatum* cp genome. RSCU = relative synonymous codon usage.

Preferred codons (RSCU value > 1.0) are indicated with (\*).

## *2.3. RNA Editing Sites*

RNA editing is a posttranscriptional process, which has been experimentally identified in organellar transcriptomes from several species [46,47]. It mainly involves the conversion of cytidine to uridine, which generally results in amino acid changes. Therefore, knowing where sites of RNA editing exist in the organelle transcriptome could provide information for understanding the structure and function of the translated proteins [48]. The potential RNA editing sites in the pomegranate cp genome were predicted using the online program PREP. A total of 64 editing sites in 20 protein-coding genes were identified (Table 3). The *ndhB* gene had the highest number of potential editing sites (11), followed by the *ndhD* gene (9). In accordance with previous reports [49,50], we observed that most conversions at the codon positions changed from serine (S) to leucine (L) and most RNA editing sites led to amino acid changes from polar to apolar, which resulted in an increase in protein hydrophobicity.


**Table 3.** Predicted RNA editing sites in the cp genome of *P. granatum.*

#### *2.4. Sequence Diversity of the Chloroplast Genomes among Lythraceae Species*

Four complete cp genomes within Lythraceae, available in GenBank with our newly assembled 'Taishanhong', were selected to analyze the sequence diversity. The mean P-distance value was designated to represent the level of divergence. The genetic distance of all 76 protein-coding genes (Figure 3A) ranged from 0.003053 (*psbN*) to 0.108932 (*rpl22*), with an average of 0.024379. The intergenic regions had a relatively higher genetic distance (Figure 3B) compared to the protein-coding regions, ranging from 0.005621 (*trnL-ndhB*) to 0.23463 (*trnL-ycf2*) with an average value of 0.069775. The results

agree with previous reports that intergenic regions showed greater divergence than coding regions (Figure 3D) [51]. The SSC region exhibited higher divergence levels than the LSC and IRs (Figure 3C). Three intergenic regions with genetic distance values over the 95th percentile were considered as highly divergent regions, including *trnH-psbA*, *trnL-ccsA*, and *trnL-ycf2*. These highly variable regions may be regarded as potential molecular markers for application in phylogenetic analyses in Lythraceae. regions showed greater divergence than coding regions (Figure 3D) [51]. The SSC region exhibited higher divergence levels than the LSC and IRs (Figure 3C). Three intergenic regions with genetic distance values over the 95th percentile were considered as highly divergent regions, including *trnH-psbA*, *trnL-ccsA*, and *trnL-ycf2*. These highly variable regions may be regarded as potential molecular markers for application in phylogenetic analyses in Lythraceae.

*ycf2*) with an average value of 0.069775. The results agree with previous reports that intergenic

Four complete cp genomes within Lythraceae, available in GenBank with our newly assembled 'Taishanhong', were selected to analyze the sequence diversity. The mean Pdistance value was designated to represent the level of divergence. The genetic distance of all 76 protein-coding genes (Figure 3A) ranged from 0.003053 (*psbN*) to 0.108932 (*rpl22*), with an average of 0.024379. The intergenic regions had a relatively higher genetic distance (Figure 3B)

*2.4. Sequence Diversity of the Chloroplast Genomes among Lythraceae Species*

**Figure 3.** The genetic distance based on Kimura's two-parameter model. (**A**) The P-distance value of protein-coding genes. (**B**) The P-distance value of intergenic regions. (**C**) Boxplots of P-distance value difference among LSC, SSC, and IRs. (**D**) Boxplots of P-distance value differences between protein-coding genes and intergenic regions. **Figure 3.** The genetic distance based on Kimura's two-parameter model. (**A**) The P-distance value of protein-coding genes. (**B**) The P-distance value of intergenic regions. (**C**) Boxplots of P-distance value difference among LSC, SSC, and IRs. (**D**) Boxplots of P-distance value differences between protein-coding genes and intergenic regions.

#### *2.5. Structure Comparison among the Chloroplast Genomes of Lythraceae Species*

*2.5. Structure Comparison among the Chloroplast Genomes of Lythraceae Species* Five complete cp genomes within Lythraceae were selected for comparison with each other. The genome sizes was ranged from 152,205 to 159,219 bp. The length of the LSC, SSC, and IRs varied in the range of 84,046–89,021 bp, 16,914–18,821 bp, and 23,902–25,914 bp, respectively. *Lagerstroemia indica* has the smallest genome, and this difference is mostly attributed to variation in the length of the LSC and IR regions (Table 4). A detailed comparison on four borders between the two IRs and the two single-copy regions showed that the border structures were highly similar with one another (Figure 4). However, a slight difference in junction positions was observed among these five cp genomes. For instance, the *ndhF* gene was located at the SSC region in *Sonneratia alba*, *Trapa maximowiczii*, *Punica granatum*, and *Heimia*  Five complete cp genomes within Lythraceae were selected for comparison with each other. The genome sizes was ranged from 152,205 to 159,219 bp. The length of the LSC, SSC, and IRs varied in the range of 84,046–89,021 bp, 16,914–18,821 bp, and 23,902–25,914 bp, respectively. *Lagerstroemia indica* has the smallest genome, and this difference is mostly attributed to variation in the length of the LSC and IR regions (Table 4). A detailed comparison on four borders between the two IRs and the two single-copy regions showed that the border structures were highly similar with one another (Figure 4). However, a slight difference in junction positions was observed among these five cp genomes. For instance, the *ndhF* gene was located at the SSC region in *Sonneratia alba*, *Trapa maximowiczii*, *Punica granatum*, and *Heimia myrtifolia*, while it varied from 3 to 54 bp apart from the IRb/SSC junction. However, the *ndhF* gene crossed over the IRb/SSC region in *Lagerstroemia indica*. The *rps19* gene was located in the junction of the LSC/IRb in *Trapa maximowiczii*, *Lagerstroemia indica*, and *Punica granatum*, with 24–83 bp located in the IRb. However, in *Heimia myrtifolia* and *Sonneratia alba*, the *rps19* gene was fully located in the LSC region, and 4–16 bp apart from the LSC/IRb border. Overall, the IR boundary regions varied slightly in the Lythraceae cp genomes. IR expansion and contraction often results in genome size variations among various plant lineages, which can be used to study the phylogenetic classification and the genome evolution among plant lineages. Three reasons may explain the diversification of the IR boundary region sequences: the first is intramolecular recombination, the second is the presence of multiple repeat sequences, and the third is the indels, which caused a mismatch that resulted in the upstream sequence becoming a single copy [52].

Pairwise alignment of the *P. granatum* cp genome with the other Lythraceae species revealed a high degree of synteny and gene order conservation (Figure 5), suggesting an evolutionary conservation of the Lythraceae cp genomes at the genome-scale level. **Table 4.** Summary of the complete chloroplast genome characteristics of five species in Lythraceae.

*myrtifolia*, while it varied from 3 to 54 bp apart from the IRb/SSC junction. However, the *ndhF* gene crossed over the IRb/SSC region in *Lagerstroemia indica*. The *rps19* gene was located in the junction of the LSC/IRb in *Trapa maximowiczii*, *Lagerstroemia indica*, and *Punica granatum*, with 24–83 bp located in the IRb. However, in *Heimia myrtifolia* and *Sonneratia alba*, the *rps19* gene was fully located in the LSC region, and 4–16 bp apart from the LSC/IRb border. Overall, the IR boundary regions varied slightly in the Lythraceae cp genomes. IR expansion and contraction often results in genome size variations among various plant lineages, which can be used to study the phylogenetic classification and the genome evolution among plant lineages. Three reasons may explain the diversification of the IR boundary region sequences: the first is


**Table 4.** Summary of the complete chloroplast genome characteristics of five species in Lythraceae. **Species** *Punica granatum indica alba maximowicizz myrtifolia*

*Sonneratia* 

*Trapa* 

*Heimia* 

*Lagerstromeia* 

**Figure 4.** Comparison of the borders of the LSC, SSC, and IRs regions among five Lythraceae **Figure 4.** Comparison of the borders of the LSC, SSC, and IRs regions among five Lythraceae cp genomes.

**Figure 5.** Co-linear analysis of five cp genomes within Lythraceae. **Figure 5.** Co-linear analysis of five cp genomes within Lythraceae.

*2.6. Positive Selection Analysis*

Those genes contained one subunit of acetyl-CoA carboxylase (*accD*), one photosystem I subunit gene (*psaI*), two NADH-dehydrogenase subunit genes (*ndhF*, *ndhJ*), one ribosome large subunit gene (*rpl22*), five ribosome small subunit genes (*rps2*, *rps4*, *rps7*, *rps8*, *rps12*), and the *ycf1* gene. According to the M8 model, the ycf1 gene possessed 10 positive sites, followed by *ndhF* (7) and *rpl22* (5). The other eight genes each had only one positive site. The Photo-genes included four genes (*accD*, *psaI*, *ndhF*, *ndhJ*). The Genet-genes included six genes (*rpl22*, *rps2*, *rps4*, *rps7*, *rps8*, *rps12*). The *ycf1* gene was considered as the other gene. Most positively selected genes were genetic system or photosynthesis related genes, which indicated that the

chloroplast functional genes played vital roles during the plant evolution [53,54].

## *2.6. Positive Selection Analysis*

The ratios of non-synonymous (dN) and synonymous (dS) substitutions for 75 protein-coding genes among five Lythraceae were calculated based on the site-specific model. Eleven genes with positively selected sites within the Lythraceae family were identified (Table 5). Those genes contained one subunit of acetyl-CoA carboxylase (*accD*), one photosystem I subunit gene (*psaI*), two NADH-dehydrogenase subunit genes (*ndhF*, *ndhJ*), one ribosome large subunit gene (*rpl22*), five ribosome small subunit genes (*rps2*, *rps4*, *rps7*, *rps8*, *rps12*), and the *ycf1* gene. According to the M8 model, the ycf1 gene possessed 10 positive sites, followed by *ndhF* (7) and *rpl22* (5). The other eight genes each had only one positive site. The Photo-genes included four genes (*accD*, *psaI*, *ndhF*, *ndhJ*). The Genet-genes included six genes (*rpl22*, *rps2*, *rps4*, *rps7*, *rps8*, *rps12*). The *ycf1* gene was considered as the other gene. Most positively selected genes were genetic system or photosynthesis related genes, which indicated that the chloroplast functional genes played vital roles during the plant evolution [53,54].


**Table 5.** Log-likelihood values of the site-specific models, with detected sites having non-synonymous/ synonymous (dN/dS) values > 1.

\* *p* < 0.05; \*\* *p* < 0.01.

#### *2.7. The Phylogenetic Position of P. granatum*

Complete cp genomes are important as they can provide information for understanding the phylogenetic relationships at multiple taxonomic levels. For example, recent phylogenetic analyses based on protein-coding genes of the cp genome provided the broad phylogenetic framework for Viridiplantae, which has significant value both to evolutionary biologists and ecologists [55]. The genus *Punica* belongs to the order Myrtales and most likely originated from Saxifragales. However, the placement of genus *Punica* under different families such as Punicaceae, Lythraceae, and Myrtaceae has remained debatable [6]. Recent phylogenomic analysis based on 106 single-copy nuclear genes was performed and supported that *P. granatum* belongs to the Lythraceae family rather than the monogeneric Punicaeceae family [7].

To determine the position of *P. granatum* and to further analyze the relationships within Myrtales, we used the complete cp genome sequences to perform the phylogenetic analysis. Eighty-five species representing five families from the order Myrtales were selected. Two species from the order Vitales

*2.7. The Phylogenetic Position of P. granatum*

(*Vitis acerifolia*, *Vitis vinifera*) were selected as outgroups. To avoid systematic errors produced by poor alignment, we removed poorly aligned sites using Gblocks. After the removal of the ambiguously aligned regions, the alignment contained 106,571 sites in total, including 20,088 parsimony-informative sites, 9178 singleton sites, and 77,305 constant sites. The method of data analysis (ML and BI) did not affect the results, and the treetopologies of ML and BI were found to be the same. The phylogenetic relationships among five families were fully resolved, and node support values for the ML type were higher than 90. five species representing five families from the order Myrtales were selected. Two species from the order Vitales (*Vitis acerifolia*, *Vitis vinifera*) were selected as outgroups. To avoid systematic errors produced by poor alignment, we removed poorly aligned sites using Gblocks. After the removal of the ambiguously aligned regions, the alignment contained 106,571 sites in total, including 20,088 parsimony-informative sites, 9178 singleton sites, and 77,305 constant sites. The method of data analysis (ML and BI) did not affect the results, and the treetopologies of ML and BI were found to be the same. The phylogenetic relationships among five families were fully resolved, and node support

Myrtales, we used the complete cp genome sequences to perform the phylogenetic analysis. Eighty-

Complete cp genomes are important as they can provide information for understanding the phylogenetic relationships at multiple taxonomic levels. For example, recent phylogenetic analyses based on protein-coding genes of the cp genome provided the broad phylogenetic framework for Viridiplantae, which has significant value both to evolutionary biologists and ecologists [55]. The genus *Punica* belongs to the order Myrtales and most likely originated from Saxifragales. However, the placement of genus *Punica* under different families such as Punicaceae, Lythraceae, and Myrtaceae has remained debatable [6]. Recent phylogenomic analysis based on 106 single-copy nuclear genes was performed and supported that *P.granatum* belongs to the Lythraceae family rather

The tree showed (Figure 6) that all the accessions of the pomegranate were grouped into a single clade with other closely related species of the Lythraceae family. The monophyly of the family Lythraceae and the sister relation with family Onagraceae is highly supported (>90). Myrtaceae were strongly supported as monophyletic and formed a sister relationship with Melastomataceae. Five pomegranate accessions formed a clade with zero or nearly zero branches length, which suggests that the cp genome might be of limited use for cultivar identification and population genetic studies of *P. granatum*. Our full genomic data set resolved the phylogenomic placement of *Punica* and provided strong support for most relationships of Myrtales. values for the ML type were higher than 90. The tree showed (Figure 6) that all the accessions of the pomegranate were grouped into a single clade with other closely related species of the Lythraceae family. The monophyly of the family Lythraceae and the sister relation with family Onagraceae is highly supported (> 90). Myrtaceae were strongly supported as monophyletic and formed a sister relationship with Melastomataceae. Five pomegranate accessions formed a clade with zero or nearly zero branches length, which suggests that the cp genome might be of limited use for cultivar identification and population genetic studies of *P. granatum*. Our full genomic data set resolved the phylogenomic placement of *Punica* and provided strong support for most relationships of Myrtales.

**Figure 6.** Phylogenetic tree was reconstructed using Maximum likelihood (ML) and Bayes inference (BI) methods based on complete cp genomes of the order Myrtales. Only the tree topology of the ML tree was presented. **Figure 6.** Phylogenetic tree was reconstructed using Maximum likelihood (ML) and Bayes inference (BI) methods based on complete cp genomes of the order Myrtales. Only the tree topology of the ML tree was presented.

#### *Int. J. Mol. Sci.* **2019**, *20*, x; doi: www.mdpi.com/journal/ijms **3. Materials and Methods**
