*2.7. Comparison of the Genome Structure in Rosaceae cp Genomes*

The chloroplast genome structure of most higher plants is relatively stable and the number, sequence, and composition of their genes are conserved. However, because different plant groups have different evolutionary histories and genetic backgrounds, the chloroplast genome size, genome structure, and gene numbers vary. Insertion/deletion is the most frequent type of microstructural variation in the chloroplast genome, and it occurs frequently in some segments where the variation is high, such as *trnH*-*psbA* and *trnS-G*. In Rosaceae, an insertion/deletion of 277 bp in the intergenic region of the *trnS-G* gene was reported in peach plants [22], and an insertion/deletion of 198 bp in the intergenic region of *trnL-F* was identified in *P. mume* [23].

The collinear method was used to analyze and compare the chloroplast genomes of the two genotypes of *P. hopeiensis*, the other three sequenced *Pyrus*, and other related Rosaceae (*P. pashia*, *P. pyrifolia*, *P. spinosa*, *M. prunifolia*, *P. mume*, and *C. japonica*). The results showed optimal collinearity between *P. hopeiensis* HB-1 and *P. hopeiensis* HB-2, and only a few sites contained insertions and deletions (Figure 5). Compared with the other Rosaceae, the genome structure and gene sequences were highly conserved, with more linear relationships indicating high chloroplast genome homology among the different plants. *Int. J. Mol. Sci.* **2018**, *19*, x FOR PEER REVIEW 11 of 19

**Figure 5.** Co-linear analysis of various plant chloroplast genomes.
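As a rough illustration of how the insertion/deletion sites mentioned above could be tallied once two chloroplast genomes have been aligned, the following Python sketch counts runs of gap columns in a pairwise alignment. It is not the collinearity method used in this study; the function name `indel_events` and the toy sequences are assumptions made purely for illustration.

```python
from itertools import groupby


def indel_events(aligned_a: str, aligned_b: str, gap: str = "-"):
    """Return (start_column, length, kind) for each run of gap columns.

    kind is "gap_in_a" when sequence A carries the gap and "gap_in_b"
    when sequence B carries it. Columns are 0-based.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")

    def column_state(pair):
        a, b = pair
        if a == gap and b != gap:
            return "gap_in_a"
        if b == gap and a != gap:
            return "gap_in_b"
        return None  # match, mismatch, or gap in both sequences

    events = []
    column = 0
    # Consecutive columns with the same state form one indel event.
    for state, run in groupby(zip(aligned_a, aligned_b), key=column_state):
        length = sum(1 for _ in run)
        if state is not None:
            events.append((column, length, state))
        column += length
    return events


# Toy alignment (made-up sequences, not data from this study).
hb1 = "ATGC---TTAGGCA"
hb2 = "ATGCAAATTA--CA"
events = indel_events(hb1, hb2)
for start, length, kind in events:
    print(f"column {start}: {length} bp, {kind}")
```

Treating each contiguous gap run as one event, rather than counting gap characters individually, matches the way a 277 bp or 198 bp insertion/deletion is reported as a single variant.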

#### *2.8. IR Contraction Analysis*

The IR region is considered to be consistent and stable in the chloroplast genome. However, in the evolution of species, border region contraction and expansion are common. In this study, the IR boundaries of both genotypes of *P. hopeiensis* were compared. The IRa/LSC boundary extended into the *rps19* gene, and 120 bp of *rps19* extended into the IRa region. The IRa/SSC boundary extended into the *ndhF* gene, and 12 bp of *ndhF* extended into the IRa region. The IRb/SSC boundary extended into 1074 bp of *ycf1* and the IRb/LSC border extended into the *rpl2* gene, with the *trnH-GTG* gene located downstream.

The IR boundaries were compared among the Rosaceae, including the five *Pyrus* species sequenced, and *P. pashia*, *P. pyrifolia*, *P. spinosa*, *M. prunifolia*, *P. mume*, and *C. japonica* (Figure 6). The IRa/LSC boundary of these plants extended to the *rps19* gene. The IRa/SSC boundaries were located upstream of the *ndhF* gene, except in *M. prunifolia*, whose IRa/SSC junction lost *ndhF* but extended to *ycf1*. All IRb/SSC boundaries except those of *P. ussuriensis* Maxim. cv. Jingbaili, *P. communis* L. cv. Early Red Comice, and *P. spinosa* extended to *ycf1*; in these three, the IRb/SSC boundary lost *ycf1*. These findings were similar to those in the Actinidiaceae, Theaceae, and Primulaceae, but differed markedly from those in Ericaceae. For the IRb/LSC boundary, all but that of *P. spinosa* extended into the *rpl2* gene; in *P. spinosa*, the IRb/LSC boundary had no *rpl2* and, with the *trnH-GTG* gene located downstream, extended into the *rpl23* gene.


**Figure 6.** IR contraction analysis of Rosaceae.
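Statements such as "120 bp of *rps19* extended into the IRa region" amount to an interval overlap between an annotated gene and an IR region. The sketch below shows that arithmetic in Python. The coordinates are hypothetical, chosen only so that the overlap reproduces the 120 bp figure, and the names `Interval`, `overlap_bp`, and `describe_boundary` are illustrative assumptions rather than part of any tool used in this study. It also assumes that neither interval wraps around the origin of the circular genome.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Interval:
    """Closed, 1-based interval on the genome (start <= end)."""
    name: str
    start: int
    end: int


def overlap_bp(gene: Interval, region: Interval) -> int:
    """Number of bases of `gene` that lie inside `region` (0 if disjoint)."""
    return max(0, min(gene.end, region.end) - max(gene.start, region.start) + 1)


def describe_boundary(gene: Interval, region: Interval) -> str:
    """Summarise how a gene sits relative to a region boundary."""
    inside = overlap_bp(gene, region)
    gene_length = gene.end - gene.start + 1
    if inside == 0:
        return f"{gene.name} lies entirely outside {region.name}"
    if inside == gene_length:
        return f"{gene.name} lies entirely inside {region.name}"
    return f"{inside} bp of {gene.name} extend into {region.name}"


# Hypothetical coordinates (not taken from the annotated genomes).
ira = Interval("IRa", 1000, 27000)
rps19 = Interval("rps19", 841, 1119)
print(describe_boundary(rps19, ira))
```

Applying the same function to each junction (IRa/LSC, IRa/SSC, IRb/SSC, IRb/LSC) for every species would give the per-species overlaps that a boundary comparison such as Figure 6 summarises.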

#### *2.9. Phylogenetic Analysis*

To gain an insight into the position of *Pyrus* within the Rosaceae, a molecular phylogenetic tree was constructed using 57 protein-coding genes (*accD*, *atpA*, *atpE*, *atpH*, *atpI*, *ccsA*, *cemA*, *ndhA*, *ndhB*, *ndhC*, *ndhD*, *ndhE*, *ndhG*, *ndhH*, *ndhJ*, *ndhK*, *petA*, *petG*, *petL*, *petN*, *psaA*, *psaB*, *psaI*, *psbA*, *psbB*, *psbC*, *psbD*, *psbE*, *psbF*, *psbH*, *psbJ*, *psbM*, *psbN*, *psbT*, *rbcL*, *rpl14*, *rpl16*, *rpl2*, *rpl22*, *rpl23*, *rpl33*, *rpoA*, *rpoC1*, *rps11*, *rps12*, *rps14*, *rps15*, *rps18*, *rps19*, *rps2*, *rps3*, *rps4*, *rps7*, *rps8*, *ycf2*, *ycf3*, *ycf4*) from the chloroplast genomes of 36 Rosaceae, which were downloaded from GenBank, with *Arabidopsis thaliana* as the outgroup. The resulting phylogenetic tree was consistent with the traditional plant morphological taxonomy (Figure 7), in which the sampled Rosaceae are divided into three subfamilies: Maloideae, Prunoideae, and Rosoideae. The Maloideae include *Pyrus*, *Malus*, *Sorbus* L., and *Eriobotrya*. *Prunus* lies within the Prunoideae, and *Fragaria* is included in the Rosoideae. The Prunoideae were more closely related to the Maloideae than the Rosoideae were, and the relationship between *Malus* and *Pyrus* was the closest.

Within *Pyrus*, the relationship between *P. hopeiensis* HB-1 and *P. hopeiensis* HB-2 was the closest, and the relationship between *P. betulifolia* and *P. ussuriensis* Maxim. cv. Jingbaili was closer than that between other *Pyrus* and *P. hopeiensis*. In addition, the evolutionary tree shows that Rosoideae is a subfamily that split off separately from the other two.

**Figure 7.** The ML phylogenetic tree of the Rosaceae clade based on the same protein-coding genes. Numbers above or below the nodes are bootstrap support values.
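A tree built from many shared protein-coding genes is commonly inferred from a concatenated alignment (a supermatrix). This section does not state how the 57 genes were combined, so the following Python sketch is only one plausible way to do it, assuming each gene has already been aligned separately. The function name `concatenate_alignments`, the gap-padding of missing taxa, and the toy sequences are all assumptions for illustration.

```python
def concatenate_alignments(gene_alignments, gene_order, missing="-"):
    """Build a supermatrix: one concatenated sequence per taxon.

    gene_alignments maps gene name -> {taxon: aligned sequence}.
    Taxa absent from a gene are padded with `missing` characters.
    Returns (supermatrix, partitions); partitions lists
    (gene, start, end) with 1-based inclusive coordinates.
    """
    taxa = sorted({t for aln in gene_alignments.values() for t in aln})
    pieces = {t: [] for t in taxa}
    partitions = []
    offset = 0
    for gene in gene_order:
        aln = gene_alignments[gene]
        lengths = {len(seq) for seq in aln.values()}
        if len(lengths) != 1:
            raise ValueError(f"{gene}: sequences are not aligned to equal length")
        width = lengths.pop()
        for taxon in taxa:
            pieces[taxon].append(aln.get(taxon, missing * width))
        partitions.append((gene, offset + 1, offset + width))
        offset += width
    supermatrix = {t: "".join(parts) for t, parts in pieces.items()}
    return supermatrix, partitions


# Toy per-gene alignments (made-up sequences, not data from this study).
toy = {
    "rbcL": {"Pyrus": "ATGTCA", "Malus": "ATGTCA", "Arabidopsis": "ATGTCT"},
    "psbA": {"Pyrus": "ATGA", "Malus": "ATGA"},
}
matrix, parts = concatenate_alignments(toy, ["rbcL", "psbA"])
for taxon, seq in matrix.items():
    print(taxon, seq)
```

The returned partition list records where each gene starts and ends in the supermatrix, which is the information a maximum-likelihood program typically needs if separate substitution models are to be applied per gene.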

**3. Discussion**

this study, the chloroplast genomes of the two genotypes *P. hopeiensis* HB-1 and *P. hopeiensis* HB-2 and those of three other major pear plants, *P. ussuriensis* Maxim. cv. Jingbaili, *P. communis* L. cv. Early

*Pyrus hopeiensis* is a valuable wild resource of *Pyrus*, which belongs to the family Rosaceae.
