Article

Comparison of Boraginales Plastomes: Insights into Codon Usage Bias, Adaptive Evolution, and Phylogenetic Relationships

1
Key Laboratory of Adaptation and Evolution of Plateau Biota, Northwest Institute of Plateau Biology, Chinese Academy of Sciences & Institute of Sanjiangyuan National Park, Chinese Academy of Science, Xining 810008, China
2
College of Manufacturing Engineering, Maanshan University, Maanshan 243000, China
*
Author to whom correspondence should be addressed.
Diversity 2022, 14(12), 1104; https://doi.org/10.3390/d14121104
Submission received: 30 October 2022 / Revised: 6 December 2022 / Accepted: 8 December 2022 / Published: 12 December 2022

Abstract

The Boraginales (Boraginaceae s.l.) comprise more than 2450 species worldwide. However, little is known about the characteristics of their complete plastid (pt) genomes. In this study, three new sequences, representing the first pt genomes of Heliotropiaceae and Cordiaceae, were assembled and compared with those of other Boraginales species. The pt genome sizes of Cordia dichotoma, Heliotropium arborescens, and Tournefortia montana were 151,990 bp, 156,243 bp, and 155,891 bp, respectively. Multiple optimal codons were identified, which may provide meaningful information for enhancing the gene expression of Boraginales species. Furthermore, codon usage bias analyses revealed that natural selection and other factors may dominate codon usage patterns in the Boraginales species. The boundaries of the IR/LSC and IR/SSC regions differed markedly among species, and we found a clear signal of IR region expansion in the pt genomes of Nonea vesicaria and Arnebia euchroma. Genes with high nucleotide diversity (pi) values were also identified, which may be used as potential DNA barcodes to investigate phylogenetic relationships in Boraginales. psaI, rpl33, rpl36, and rps19 were found to be under positive selection, and these genes may play an important role in our understanding of the adaptive evolution of the Boraginales species. Phylogenetic analyses implied that Boraginales can be divided into two groups. The existence of two subfamilies (Boraginoideae and Cynoglossoideae) in Boraginaceae is also strongly supported. Our study provides valuable information on pt genome evolution and phylogenetic relationships in the Boraginales species.

1. Introduction

The Boraginales (Boraginaceae s.l.) comprise more than 2500 species worldwide, with 294 species distributed in China [1]. In the Flora of China, Boraginaceae are divided into four subfamilies: Boraginoideae, Ehretioideae, Cordioideae, and Heliotropioideae. However, the taxonomic treatment of Boraginales has been revised with the development of molecular systematics. Eleven families were adopted in Boraginales based on molecular evidence [2,3,4]. Under this treatment, the four subfamilies (Boraginoideae, Ehretioideae, Cordioideae, and Heliotropioideae) form independent families (Boraginaceae, Ehretiaceae, Cordiaceae, and Heliotropiaceae). In contrast, the Angiosperm Phylogeny Group classifications (APG I, APG II, APG III, and APG IV) adopt a single family in Boraginales (Boraginaceae s.l.) [5,6,7,8]. The phylogenetic position of Boraginales remains controversial [9,10,11,12]. In recent years, Boraginales (Ehretiaceae + Heliotropiaceae) were demonstrated to be close to Solanales based on genome/transcriptome data [13]. In addition, phylogenetic analyses based on the protein-coding genes of the pt genome implied that Gentianales clustered with Boraginales with an 87% bootstrap value [14]. However, these studies included only limited samples of Boraginales and its related taxa. For the borage family (Boraginaceae s.str.), three subfamilies have been accepted: Echiochiloideae, Boraginoideae, and Cynoglossoideae [15].
The plastid is an important organelle in plant photosynthesis, and its genome mainly encodes proteins related to photosynthesis as well as ribosomal proteins. Generally, the plastid (pt) genome is conserved in angiosperms and contains a large single copy (LSC) region, two inverted repeat (IR) regions, and one small single copy (SSC) region [16]. Variation exists in the pt genome across different species, such as in gene number and gene content [17]. Gene loss in the pt genome is a common phenomenon; some genes have been lost in one species but are retained in another. Furthermore, the number of pseudogenes also differs across species [18]. Studies on gene loss and pseudogenes can help us to better understand the evolution of species. Analysis of the pt genome can provide meaningful information on genetic evolution and phylogenetic relationships in land plants [19]. The pt genome contains rich genetic information, such as simple sequence repeats (SSRs), which are abundant in the plastid genome and have been used for population genetic analyses [20,21]. Furthermore, highly divergent regions of the pt genome, identified for example by high nucleotide diversity (pi) values, may be used as potential molecular markers for interpreting phylogenetic relationships [22]. The taxonomic relationships of many taxa have been revised using the plastid genome in recent years [23,24]. However, for the species-rich Boraginales, pt genomes of fewer than 20 species are available in the NCBI database. Compared with other groups, comparative analyses of the pt genome of Boraginales are relatively scarce. Sequencing and analyzing the pt genomes of Boraginales species can help us to better understand the genetic evolution of this group.
A codon is a group of three adjacent nucleotides in an mRNA molecule that specifies a particular amino acid during protein synthesis. Codons encoding the same amino acid are termed synonymous codons. The unequal frequency with which synonymous codons are used, which differs across species, is termed codon usage bias. Codon bias affects a series of cellular processes, such as transcription and translation [25,26]. Natural selection for translational efficiency, mutation pressure, and genetic drift are the three main theoretical explanations for the evolution of codon usage bias [27,28]. Furthermore, gene length, GC content, and mRNA structure also play an important role in shaping codon usage bias [29]. Analyzing codon usage patterns can help us to better understand genetic evolution, environmental adaptation, and evolutionary relationships between species [30,31]. In addition, constructing a heterologous expression vector based on optimal codons can enhance protein expression [32,33].
Comparative analysis of the pt genome can help us to better understand genetic evolution. However, little knowledge exists on the characteristics of the complete plastid genome of the Boraginales species. In this study, the first plastid genomes of Cordiaceae and Heliotropiaceae were sequenced and assembled. Our main goals were to (1) compare the pt genome of Boraginales species; (2) investigate codon usage patterns; and (3) infer the phylogenetic relationship of Boraginales at the family level.

2. Materials and Methods

2.1. Sample Collection

Fresh leaves of Cordia dichotoma were collected from Yongde County, Yunnan Province (99°8′, 23°48′). Fresh leaves of Tournefortia montana and Heliotropium arborescens were collected from Fusui County, Guangxi Province (107°51′, 22°40′) and the South China National Botanical Garden (113°22′, 23°11′), respectively. Specimens of Tournefortia montana, Heliotropium arborescens, and Cordia dichotoma were deposited in the Qinghai-Tibetan Plateau Museum of Biology, Northwest Institute of Plateau Biology, Chinese Academy of Sciences (HNWP). The silica gel-dried leaves of these three species were sent to Genepioneer Biotechnologies (Nanjing, China) for plastid genome sequencing. The genomic DNA of the three species was extracted using a modified cetyltrimethylammonium bromide (CTAB) method [34], as follows. An appropriate amount of PVP was added to 1 g of young leaves, which were ground in liquid nitrogen and transferred to 20 mL of 2× CTAB extraction buffer preheated to 55 °C. After 2-mercaptoethanol was added, the sample was mixed gently in a 55 °C water bath until the temperature of the sample solution exceeded 50 °C. It was then removed from the water bath and allowed to stand for 10–15 min. Next, 20 mL of trichloromethane:isoamyl alcohol (24:1) was added to extract the cell residue and chloroplasts; the lid was tightened, and the tube was inverted evenly about 100 times until the trichloromethane mixture changed color. After centrifugation for 20 min, the supernatant was transferred to a new centrifuge tube, and an equal volume of isoamyl alcohol, precooled at −20 °C, was added. The lid was closed tightly, and the tube was inverted evenly. Genomic DNA precipitated, and the DNA precipitate was fished out with a glass rod. The DNA precipitate was washed twice with 5 mL of 70% ethanol and once with 5 mL of 100% ethanol, and then dried at 37 °C for 30–60 min. Then, 1 mL of TE buffer and 0.5 µL of 100 g/mL RNase were added, and the DNA was dissolved at 65 °C. The quality of the DNA was checked by electrophoresis of 1.0–2.0 µL of sample on a 1.0% agarose gel. Once the DNA had passed quality control, sequencing was performed on the NovaSeq 6000 Platform (Illumina Inc., San Diego, CA, USA) with a 150 bp paired-end read (insert size of 350 bp) strategy.

2.2. Plastid Genome Assembly and Annotation

Low-quality reads, short reads, and adapters were removed from the raw data using Trimmomatic version 0.33 and FastQC version 0.11.8. De novo assembly of the clean data of the three species was performed with the GetOrganelle v1.7.5.0 pipeline using k-mers of 21, 45, 65, 85, and 105 [35]. We used Bandage to confirm whether each assembled pt genome formed a closed loop [36]. The pt genomes were annotated with the online program GeSeq, using the well-annotated Ehretia dicksonii (MZ555766) and Nonea vesicaria (NC_060826) plastid genomes as references [37]. We attempted to annotate missing genes using the BLAST program of GenBank (with default settings). Finally, genes with incorrect annotations were corrected using Sequin software. We used OrganellarGenomeDRAW (https://chlorobox.mpimp-golm.mpg.de/OGDraw.html) (accessed on 15 September 2022) to visualize the pt genomes of Cordia dichotoma, Tournefortia montana, and Heliotropium arborescens.

2.3. Codon Usage Bias Analyses

The effective number of codons (ENC) ranges from 20 to 61 and is one of the indicators used to measure codon usage bias. Generally, highly expressed genes show strong codon bias and therefore have small ENC values. The values of ENC and GC3s (the GC content at the third position of synonymous codons) were calculated with CodonW 1.4.2. The online program CUSP (http://emboss.toulouse.inra.fr/cgi-bin/emboss/cusp) (accessed on 15 September 2022) was used to calculate the GC content of plastid genes at the first (GC1), second (GC2), and third (GC3) codon positions. We used MEGA X to calculate the relative synonymous codon usage (RSCU) values [38]. The average GC content was also calculated. Correlations among these indices were analyzed using SPSS.
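The positional GC values described above can be illustrated with a minimal sketch. The function name `gc_by_position` and the toy input are assumptions made for illustration; this is not the CUSP or CodonW implementation, and unlike GC3s it counts every codon, including Met, Trp, and stop codons.

```python
def gc_by_position(cds: str) -> dict:
    """Return GC fractions at codon positions 1-3, overall GC, and GC12
    for an in-frame coding sequence (hypothetical helper, illustration only)."""
    seq = cds.upper().replace("U", "T")
    # Drop trailing bases that do not form a complete codon.
    seq = seq[: len(seq) - len(seq) % 3]
    if not seq:
        raise ValueError("sequence contains no complete codon")
    result = {}
    for pos in range(3):
        bases = seq[pos::3]  # every base at this codon position
        result[f"GC{pos + 1}"] = sum(b in "GC" for b in bases) / len(bases)
    result["GC_all"] = sum(b in "GC" for b in seq) / len(seq)
    # GC12 is the mean of GC1 and GC2, as used in neutrality plots.
    result["GC12"] = (result["GC1"] + result["GC2"]) / 2
    return result
```

For a real analysis, each protein-coding gene would be passed in separately so that per-gene values can be correlated across the genome.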
A codon with an RSCU value greater than one is termed a high-frequency codon. The genes were ranked by ENC value from low to high, and the 10% of genes at each end were selected to construct high- and low-expression gene sets, respectively; ΔRSCU (the difference in RSCU between the high- and low-expression sets) was then calculated. A codon with a ΔRSCU value greater than 0.08 and an RSCU value greater than one was selected as an optimal codon [39].
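The selection rule above can be sketched as follows. RSCU is computed here as the observed count of a codon divided by the mean count of all synonymous codons for the same amino acid. The function names, the use of the standard codon table, and the choice to apply the RSCU > 1 filter to the pooled gene set are assumptions for illustration, not the MEGA X procedure used in the study.

```python
from collections import Counter
from itertools import product

BASES = "TCAG"
# Standard genetic code in TCAG order; '*' marks stop codons.
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}

def rscu(cds_list):
    """Relative synonymous codon usage pooled over a list of in-frame CDSs."""
    counts = Counter()
    for cds in cds_list:
        seq = cds.upper().replace("U", "T")
        for i in range(0, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if codon in CODON_TABLE:  # skips codons with ambiguous bases
                counts[codon] += 1
    values = {}
    for aa in set(AMINO) - {"*"}:
        family = [c for c, a in CODON_TABLE.items() if a == aa]
        total = sum(counts[c] for c in family)
        if total == 0:
            continue  # amino acid absent from the data
        for c in family:
            values[c] = counts[c] * len(family) / total
    return values

def optimal_codons(all_genes, high_genes, low_genes, min_delta=0.08):
    """Codons with overall RSCU > 1 and RSCU(high) - RSCU(low) > min_delta."""
    overall, high, low = rscu(all_genes), rscu(high_genes), rscu(low_genes)
    return sorted(
        codon for codon, value in overall.items()
        if value > 1 and high.get(codon, 0.0) - low.get(codon, 0.0) > min_delta
    )
```

Here `high_genes` and `low_genes` would be the 10% of genes with the lowest and highest ENC values, respectively.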
With ENC as the ordinate and GC3 as the abscissa, the standard curve was drawn following the formula $\mathrm{ENC} = 2 + \mathrm{GC3} + 29/[\mathrm{GC3}^2 + (1 - \mathrm{GC3})^2]$. The position of each gene relative to the standard curve indicates whether its codon bias is shaped mainly by mutation pressure (genes on or near the curve) or by selection pressure and other factors (genes well away from the curve) [40].
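The standard curve can be evaluated directly from the formula above. The sketch below is illustrative only; the function names and the helper that flags genes lying below the curve are assumptions, not part of the original analysis.

```python
def expected_enc(gc3: float) -> float:
    """Expected ENC under the standard curve for a GC3 fraction in [0, 1]."""
    if not 0.0 <= gc3 <= 1.0:
        raise ValueError("gc3 must be a fraction between 0 and 1")
    return 2 + gc3 + 29 / (gc3 ** 2 + (1 - gc3) ** 2)

def below_curve(observed_enc: float, gc3: float) -> bool:
    """True if a gene's observed ENC lies below the standard curve."""
    return observed_enc < expected_enc(gc3)
```

Note that GC3 must be given as a fraction, not a percentage; the curve peaks at GC3 = 0.5.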
The content of the four bases (A, T, C, G) at the third codon position was analyzed. PR2-bias plot analyses were performed with the proportion of G3 in the GC content of the third codon position (G3/(G3 + C3)) as the abscissa and the proportion of A3 in the AT content of the third codon position (A3/(A3 + T3)) as the ordinate [41]. The central position in the PR2 plot (where A = T and G = C) represents no bias in codon use. The farther a gene lies from the central position, the stronger its codon bias.
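The PR2 coordinates defined above can be computed per gene as in the following sketch. It is a simplification under stated assumptions: the function name is hypothetical, and all third codon positions are counted, whereas published PR2 analyses often restrict the calculation to four-fold degenerate codons.

```python
from collections import Counter

def pr2_point(cds: str):
    """Return (G3/(G3+C3), A3/(A3+T3)) for an in-frame CDS, or None if undefined."""
    seq = cds.upper().replace("U", "T")
    seq = seq[: len(seq) - len(seq) % 3]  # keep complete codons only
    third = Counter(seq[2::3])            # bases at the third codon position
    gc = third["G"] + third["C"]
    at = third["A"] + third["T"]
    if gc == 0 or at == 0:
        return None  # one of the ratios has a zero denominator
    return third["G"] / gc, third["A"] / at
```

A gene at (0.5, 0.5) shows no third-position bias; a point in the lower right quadrant corresponds to G > C and T > A.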
By analyzing the correlation between GC12 (representing the average of GC1 and GC2) and GC3, we can infer the influence of mutation pressure and selection pressure on codon usage bias [42]. If GC12 and GC3 are significantly correlated, it indicates that mutation pressure mainly affects codon bias; otherwise, selection pressure mainly affects codon bias [43].
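A neutrality plot reduces to an ordinary least-squares regression of GC12 on GC3 across genes. The sketch below shows one way to obtain the slope and adjusted R²; the function name and return layout are assumptions for illustration, and the study itself does not state which software fitted the regression lines.

```python
def neutrality_regression(gc3, gc12):
    """Least-squares fit of GC12 (y) on GC3 (x).

    Returns (slope, intercept, r_squared, adjusted_r_squared).
    Requires at least three genes and non-constant inputs.
    """
    n = len(gc3)
    if n != len(gc12) or n < 3:
        raise ValueError("need two equal-length lists with at least 3 values")
    mean_x = sum(gc3) / n
    mean_y = sum(gc12) / n
    sxx = sum((x - mean_x) ** 2 for x in gc3)
    syy = sum((y - mean_y) ** 2 for y in gc12)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(gc3, gc12))
    if sxx == 0 or syy == 0:
        raise ValueError("inputs must not be constant")
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r2 = sxy ** 2 / (sxx * syy)
    # Adjusted R^2 for one predictor; it can be negative when the fit is poor.
    adj_r2 = 1 - (1 - r2) * (n - 1) / (n - 2)
    return slope, intercept, r2, adj_r2
```

A slope near zero with a low R² suggests that GC12 varies largely independently of GC3, which is the pattern usually read as selection outweighing mutation pressure.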

2.4. Comparative Analyses of the Plastid Genome

All of the published pt genomes of Boraginales (Boraginaceae s.l.) species were extracted and aligned using PhyloSuite software [44]; the shared protein-coding genes were then used to calculate nucleotide diversity (pi) using DnaSP v6 [45]. The IR/LSC and IR/SSC boundary regions of nine species from Boraginales, including five species from five tribes in Boraginaceae and four species from Ehretiaceae, Heliotropiaceae, and Cordiaceae, were plotted using IRscope (https://irscope.shinyapps.io/irapp/) (accessed on 15 September 2022). Homologous protein sequences between Cordia dichotoma and Lappula myosotis, Nonea vesicaria, Cynoglossum amabile, Trigonotis peduncularis, Arnebia euchroma, Tournefortia montana, Ehretia dicksonii, and Heliotropium arborescens were obtained using BLASTN (accessed on 15 September 2022). The protein-coding genes shared between Cordia dichotoma and the remaining eight species were aligned using MAFFT version 7. We used KaKs_Calculator version 2 (accessed on 15 September 2022) to calculate the ratio of nonsynonymous (Ka) to synonymous (Ks) substitution rates (Ka/Ks) for the aforementioned nine species [46].
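Nucleotide diversity (pi) is the average number of pairwise differences per site among aligned sequences. The following sketch shows the calculation for a single alignment; the function name is hypothetical, and the decision to drop every column containing a gap or ambiguous base is an assumption that may not match the DnaSP settings used in the study.

```python
from itertools import combinations

def nucleotide_diversity(alignment):
    """Average pairwise differences per site for equal-length aligned sequences."""
    seqs = [s.upper() for s in alignment]
    if len(seqs) < 2 or len({len(s) for s in seqs}) != 1:
        raise ValueError("need at least two sequences of equal length")
    # Keep only columns in which every sequence has an unambiguous base.
    columns = [col for col in zip(*seqs) if all(b in "ACGT" for b in col)]
    if not columns:
        raise ValueError("no usable alignment columns")
    pairs = list(combinations(range(len(seqs)), 2))
    diffs = sum(col[i] != col[j] for col in columns for i, j in pairs)
    return diffs / (len(pairs) * len(columns))
```

Applying the function to each gene or intergenic alignment in turn gives the per-region values that are compared when screening for candidate barcodes.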

2.5. Phylogenetic Analyses

All of the published pt genomes of the Boraginales species, combined with two outgroups (Gentiana rigescens MW251944 and Heracleum millefolium MW228410), were extracted and aligned with the MAFFT component of PhyloSuite [44]. The shared protein-coding genes were then concatenated by species. Some poorly aligned regions were manually adjusted in MEGA X [38]. Maximum likelihood (ML) estimation with 1000 bootstrap replicates was performed in IQ-TREE version 2 (accessed on 15 September 2022) [47] with the TVM + F + R4 model selected by ModelFinder.

3. Results

3.1. Characterization of the Plastid Genome Sequences and Comparison with Other Species

The clean data were 1.6 GB, 1.5 GB, and 1.9 GB for Cordia dichotoma, Heliotropium arborescens, and Tournefortia montana, respectively. The total reads of clean data were 22,550,951, 19,531,888, and 26,334,951 for Cordia dichotoma, Heliotropium arborescens, and Tournefortia montana, respectively.
The three species (Cordia dichotoma, Heliotropium arborescens, and Tournefortia montana) have the typical quadripartite structure of plastid genomes, with LSC, SSC, and two IR regions (Figure 1). The pt genome sizes of Cordia dichotoma, Heliotropium arborescens, and Tournefortia montana were 151,990 bp, 156,243 bp, and 155,891 bp, respectively. A total of 133 genes were found in each of the pt genomes of Heliotropium arborescens, Tournefortia montana, and Cordia dichotoma. The clpP gene was partial in the pt genome of Cordia dichotoma, but complete in the pt genomes of Tournefortia montana and Heliotropium arborescens. Furthermore, the three species all contained 37 tRNA and 8 rRNA genes.
We compared the general features of the pt genomes of nine Boraginales species, each representing a different genus (Table S1). We reannotated the sequences downloaded from GenBank using GeSeq and the BLAST program of GenBank (with default settings) to verify whether genes were truly missing. We found that ycf2 (Lappula myosotis), the pseudogene ycf1 (Cynoglossum amabile and Trigonotis peduncularis), ycf15 (Cynoglossum amabile and Trigonotis peduncularis), and rps2 (Cynoglossum amabile) had not been annotated. ndhB, rpl23, rps7, and rpl2 each had two copies in all nine species. Furthermore, rpl22 (Nonea vesicaria), rps12 (two copies of partial fragments; Heliotropium arborescens, Cordia dichotoma, Tournefortia montana, Ehretia dicksonii, Lappula myosotis, Trigonotis peduncularis, and Arnebia euchroma), rps3 (Nonea vesicaria), rps19 (Nonea vesicaria and Arnebia euchroma), and ycf15 (Heliotropium arborescens, Cordia dichotoma, Ehretia dicksonii, Trigonotis peduncularis, Cynoglossum amabile, and Nonea vesicaria) had two copies in some species. These duplicated genes are located in the IR regions. The GC content differed among the nine species. The pt genomes of Cynoglossum amabile, Nonea vesicaria, and Tournefortia montana each had a relatively low GC content of 37.4%. Ehretia dicksonii had the highest GC content, at 39.7%. All nine species contained eight rRNA genes. The length of the pt genome varied considerably among the nine species. The greatest variation was found in the LSC region, with a length variation of 7162 bp among the nine species. The SSC (2696 bp) and IR (2135 bp) regions showed relatively lower length variation (Table S1).

3.2. Nucleotide Composition Analyses of the Plastid Genome in Nine Species

The online program CUSP was used to investigate the codon base composition of nine species (Lappula myosotis, Nonea vesicaria, Cynoglossum amabile, Trigonotis peduncularis, Arnebia euchroma, Cordia dichotoma, Tournefortia montana, Ehretia dicksonii, and Heliotropium arborescens), and CodonW 1.4.2 was used to calculate the ENC values (Table S2). The GC content of plastid genes at the first codon position (GC1) was higher than that at the second (GC2) and third (GC3) positions. A/T (A/U) mainly occurs at the second and third codon positions of plastid genes. The effective number of codons (ENC) values ranged from 32.76–52.69, 34.46–56.13, 34.02–52.16, 35.97–53.18, 33.08–54.31, 33.37–54.19, 33.74–53.55, 33.69–52.38, and 33.76–53.69 for Lappula myosotis, Nonea vesicaria, Cynoglossum amabile, Trigonotis peduncularis, Arnebia euchroma, Cordia dichotoma, Tournefortia montana, Ehretia dicksonii, and Heliotropium arborescens, respectively. The majority of the ENC values were above 35, and most fell between 40 and 50.
Correlation analyses among GC1, GC2, GC3, GC_all (total GC content), ENC, and the number of codons (Codon.no) were performed (Figure 2). The results show that GC1 and GC2 were highly significantly correlated, whereas GC3 was only weakly correlated with GC1 and GC2. GC_all was correlated with GC1, GC2, and GC3. ENC was significantly correlated with GC3 in the plastid genes of Nonea vesicaria, Tournefortia montana, Trigonotis peduncularis, and Heliotropium arborescens. The correlation between GC3 and ENC was not obvious in the plastid genes of Lappula myosotis, Cynoglossum amabile, Arnebia euchroma, Cordia dichotoma, and Ehretia dicksonii. The codon number was significantly correlated with GC3 in the pt genomes of Tournefortia montana and Nonea vesicaria.

3.3. Neutral Plot Analyses

To understand the influence of mutation pressure and natural selection on the codon usage bias of plastid genomes in different species, neutral plot analyses were performed for species representing different genera in Boraginales (Figure 3). The adjusted R² ranged from −0.05 to 0.039, indicating that there was no significant correlation between GC12 and GC3. The slopes of the regression lines in Lappula myosotis, Nonea vesicaria, Cynoglossum amabile, Trigonotis peduncularis, Arnebia euchroma, Cordia dichotoma, Tournefortia montana, Ehretia dicksonii, and Heliotropium arborescens were 0.287, 0.307, 0.253, 0.169, 0.151, 0.155, 0.159, 0.164, and −0.039, respectively.

3.4. ENC-Plot Analyses

We found that the nine species had similar distribution patterns of ENC and GC3 values in the ENC plot analyses (Figure 4). The results of ENC plot analyses indicated that the ENC values of the majority of genes are located below the prediction curve, and deviate from the prediction value. Only a few genes had ENC values near the standard curve.

3.5. PR2 Plot Analyses

The Parity Rule 2 (PR2) plot was generated to investigate the base composition at the third codon position in the plastid genome (Figure 5). The abscissa represents the ratio of the frequency of G at the third codon position to the total of (G + C), and the ordinate represents the ratio of the frequency of A to the total of (A + T). In our results, the four bases were unevenly used at the third codon position. Most of the genes appear at the bottom right of the graph, indicating that the usage of T and G at the third position of codons was higher than that of A and C. Only a few genes appear near the center, suggesting that both mutation pressure and selection pressure play a crucial role in the codon usage bias of plastid genes in the nine species.

3.6. Determination of the Optimal Codon

The RSCU values of all nine species were calculated. Codons with RSCU values greater than one were regarded as high-frequency codons. The number of high-frequency codons differed across species; for example, Trigonotis peduncularis had 31 codons with RSCU values greater than one, whereas Cynoglossum amabile and Ehretia dicksonii had 33 and 30 high-frequency codons, respectively (Table S3). The nine species shared a common feature: the majority of high-frequency codons ended with A/U. The number of optimal codons also differed across species; 15, 19, 20, 17, 20, 14, 18, and 22 optimal codons were detected in the pt genomes of Arnebia euchroma, Cordia dichotoma, Cynoglossum amabile, Ehretia dicksonii, Heliotropium arborescens, Lappula myosotis, Nonea vesicaria, Tournefortia montana, and Trigonotis peduncularis, respectively. Only three optimal codons (CGA, GAA, and GUU) were shared among the nine species. Furthermore, the majority of the optimal codons ended with A/U.

3.7. IRscope Analyses

To investigate the genes at the boundaries of the IR/LSC and IR/SSC regions, IRscope analyses were performed for species representing the family Boraginaceae in the traditional sense (Figure 6). At the IRb/LSC boundary, the rps19 gene generally spanned the boundary region; for example, the rps19 gene in the pt genome of Cordia dichotoma was located in the LSC region, with 64 bp extending into the IRb region. However, the rps19 genes of Nonea vesicaria and Arnebia euchroma were located entirely in the IRb region, and the genes at the IRb/LSC boundary were rpl16 and rps3 in the pt genomes of Nonea vesicaria and Arnebia euchroma, respectively. The rps19 gene in the pt genome of Trigonotis peduncularis was located in the LSC region, 10 bp away from the IRb region. The length of the rps19 gene in the pt genomes of Boraginales was 279 bp.
Except for in Nonea vesicaria and Arnebia euchroma, the rpl2 genes were all located in the IRb region, 11 to 131 bp away from the LSC region. In the pt genome of Nonea vesicaria, the first gene in the IRb region that was near the boundary of the IRb/LSC region was rps3, which was entirely located in the IRb region, 80 bp away from the LSC region. The rpl22 gene of Arnebia euchroma was mainly located in the IRb region, with 120 bp extending to the LSC region.
At the IRb/SSC boundary, the termination codon of the truncated ycf1 copy was not located within the IRb region; the truncated copy therefore crossed the IRb/SSC boundary, with 755 bp to 1179 bp in the IRb region and 3 bp to 107 bp in the SSC region.
The ndhF genes of the nine species were all located at the SSC/IRb boundary. In the pt genomes of Trigonotis peduncularis, Nonea vesicaria, Arnebia euchroma, Tournefortia montana, and Cordia dichotoma, the ndhF gene spanned the SSC/IRb boundary, with 2194 bp to 2235 bp in the SSC region and 2 bp to 60 bp in the IRb region. The ndhF gene in the pt genomes of Ehretia dicksonii, Lappula myosotis, Cynoglossum amabile, and Heliotropium arborescens was entirely located in the SSC region. The length of the ndhF gene ranged from 2196 bp to 2235 bp.
At the SSC/IRa boundary, the ycf1 gene spanned the boundary, with 4158 bp to 4487 bp in the SSC region and 755 bp to 1179 bp in the IRa region.
The trnN gene showed the most conservative structure in the pt genomes of nine species, and was located in the IRa region in all cases. The rpl2 genes were not all located in the IRa/LSC boundary region. The rps3 and rps19 genes appeared at the boundary of the IRa/LSC region for Nonea vesicaria and Arnebia euchroma, respectively. The trnH gene was located at the boundary of the LSC/IRa region.

3.8. Pi Value Analyses

We performed nucleotide diversity analyses for all species available in the NCBI database using DnaSP v6 (Figure S1). The results showed that the intergenic regions were more divergent than the protein-coding regions. In the protein-coding regions, three genes (matK, rpl23, and ycf15) had a nucleotide diversity (pi) value greater than 0.1, of which ycf15 had the highest value. Ten genes had a pi value greater than 0.05, including psbT, rpl33, ndhF, ccsA, rps15, rps16, matK, rpl23, and ycf15. The intergenic regions formed a hotspot, with all of them having a pi value greater than 0.1. Only five intergenic regions (psbC_trnS-UGA, ycf15_trnV-GAC, trnL-CAA_ndhB, rpl2_trnI-CAU, and psaC_ndhE) had pi values less than 0.2. Twenty-eight intergenic regions showed pi values greater than 0.4, of which trnI-CAU_trnL-CAA had the highest pi value.

3.9. Selection Pressure Analyses

A ratio of nonsynonymous (Ka) to synonymous (Ks) substitution rates (Ka/Ks) greater than one indicates that a gene is under positive selection, whereas a Ka/Ks value less than one indicates that the gene might be under purifying selection. We used the Cordia dichotoma plastid genome as a reference to search for genes under selection pressure in the other species (Figure 7). psaI was found to be under positive selection in the plastid genomes of Lappula myosotis, Cynoglossum amabile, Trigonotis peduncularis, and Nonea vesicaria. rpl33 was under positive selection in Ehretia dicksonii, Heliotropium arborescens, and Tournefortia montana. Moreover, the Ka/Ks values of rpl36 and rps19 were greater than one in the Tournefortia montana pt genome. The Ka/Ks values of the majority of genes in the Boraginales species were less than one, indicating that these genes may be under purifying selection.

3.10. Phylogenetic Analyses

An ML tree was constructed based on 80 shared protein-coding genes, and the majority of the branches received high bootstrap support values (Figure 8). In our phylogenetic tree, the Boraginales were divided into two main clades, with a 100% bootstrap support value. One clade consists of Boraginaceae (s.str.), in which the monophyly of the two subfamilies (Boraginoideae and Cynoglossoideae) was well supported (100%). The subfamily Boraginoideae consists of the genera Arnebia, Echium, Onosma, Borago, and Nonea. The genera Cynoglossum, Trigonotis, and Lappula gather into one clade, forming the subfamily Cynoglossoideae. The genera Arnebia and Lithospermum were clustered into one branch, with an 86% bootstrap value. Echium was closest to Onosma (100%), and Borago clustered with Nonea, with a 100% bootstrap value. Cynoglossum was closer to Trigonotis than to Lappula. In the other main clade, Heliotropiaceae and Ehretiaceae + Cordiaceae were clustered together, with a 100% bootstrap value. Ehretiaceae and Cordiaceae clustered into one clade, with a bootstrap value of 72%.

4. Discussion

Similar to the majority of angiosperms, the three species (Cordia dichotoma, Heliotropium arborescens, and Tournefortia montana) have a typical quadripartite plastid genome structure with an LSC, an SSC, and two IR regions. Generally, the inverted repeat regions (IRs) are more conserved than the single copy regions (LSC and SSC) [48], and our results also show that the LSC region was the most variable region. Furthermore, the length of the SSC region was relatively conserved, with a length variation only 561 bp greater than that of the IR region. The GC content also differed across species, and ranged from 37.4% (Cynoglossum amabile, Nonea vesicaria, and Tournefortia montana) to 39.7% (Ehretia dicksonii). GC content has also been reported to differ between species of the same genus [49], which may result from different codon usage biases among these species.
A previous study demonstrated that codon usage bias is affected by several biological factors. Codon usage bias in the Boraginales species is not well known. A total of nine species representing different genera in Boraginales were analyzed in this study in order to investigate the pattern of codon usage bias. The nucleotide composition at the third codon site (GC3) plays a crucial role in shaping the pattern of codon usage [31]. The GC content at the first, second, and third codon sites in the Boraginales species followed the pattern GC1 > GC2 > GC3. Our results were consistent with the finding that third codon positions are richer in A + T than in G + C, more so in dicot species than in monocot species [50,51]. However, this conclusion is based on a limited number of genomes, and more genomes of monocot species are therefore needed to support it. We also performed correlation analyses between GC1, GC2, GC3, ENC, GC_all, and the number of codons. GC3 showed no correlation with GC1 and GC2, whereas the correlation between GC1 and GC2 was significant, implying that the base composition at the third codon site differs from that at the first and second sites. These results are consistent with those for the pt genome of Dalbergia odorifera [52]. Species in Boraginales differ in the correlation between GC3 and ENC, as well as between GC3 and the number of codons, which may result from adaptive evolution. There was no correlation between ENC and the number of codons, implying that gene length plays a minor role in shaping the codon usage pattern. These phenomena were consistent with codon usage patterns in the pt genome of Sium ventricosum [50].
In our neutral plot analyses, we found no correlation between GC3 and GC12. This phenomenon might result from natural selection pressure. Our neutral plot results contrast with those for tobacco, tomato, and potato, in which GC3 and GC12 were significantly correlated [53]. Given that the slopes of the regression lines were very small, we infer that natural selection was dominant in shaping codon usage bias patterns in Boraginales species. Natural selection and mutation pressure both affect nucleotide composition. Generally, when the frequencies of A3 and T3, and of G3 and C3, are unequal, natural selection rather than mutation pressure may dominate the codon usage bias pattern [43,54]. In our PR2 plot result, the usage of T and G at the third position of codons was higher than that of A and C, and most of the genes appeared at the bottom right of the graph. Therefore, the result of the PR2 plot indicated that natural selection and other factors may dominate codon usage patterns in the Boraginales species [55], and similar results have been found in the pt genomes of Gossypium species [56]. An ENC value greater than 35 indicates a weak codon usage bias [57]. The majority of genes had ENC values greater than 35, indicating that codon bias is weak in the Boraginales species. The ENC values of the majority of genes were located below the prediction curve, which further implied that selection pressure plays a dominant role in shaping codon usage bias. Furthermore, multiple optimal codons were identified, which may provide meaningful information for enhancing the gene expression of the Boraginales species.
The IR/LSC and IR/SSC boundary regions have been widely analyzed across different land plants. Generally, variation in the length of the pt genome is thought to result from the expansion and contraction of the IR region. Furthermore, variation in the IR/SSC and IR/LSC boundaries may carry phylogenetic signal, as in subtribe Gentianinae and in Caryophyllales [58,59]. The gene order at the LSC/IRb boundary is rps19-rpl2 in taxa closely related to Boraginales, such as Lamiaceae and Rubiaceae [60,61]. In this study, we found a signal of obvious IR region expansion in the pt genomes of Nonea vesicaria and Arnebia euchroma. For example, rpl16, rps3, rpl22, and rps19 are typically located in the LSC region, but these genes were found in the IRb region in Nonea vesicaria. Similarly, rps3, rpl22, and rps19 were found in the IRb region in Arnebia euchroma.
Because cpDNA sequences are cheap and easy to obtain, cpDNA has been widely used in population genetic and phylogenetic analyses. The relationships of most Boraginales taxa have been revised using cpDNA markers, including trnL-trnF [62], rpl32-trnL, trnH-psbA [63], atpB [64], and trnL [65]. However, some of the resulting phylogenetic trees had low bootstrap support. The pt genome contains abundant genetic information, and various regions in the plastid genome can provide potential cpDNA barcodes [60]. Our results revealed several variable regions with high pi values in protein-coding and intergenic spacer regions, such as psbT, rpl33, ndhF, ccsA, rps15, rps16, matK, rpl23, ycf15, and trnI-CAU_trnL-CAA. The intergenic spacers were more divergent than the protein-coding regions. ycf1 has been widely used in population genetic and phylogenetic analyses, and has been promoted as a core cpDNA barcode in land plants [66], but it had a low pi value in our results. Some common cpDNA markers were also detected in Boraginales, such as ndhF and matK, which have been widely used in previous research [67,68]. The intergenic spacers in the Boraginales species all had higher pi values than those of other species, which may reflect rapid genetic diversification in Boraginales.
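For readers unfamiliar with the pi statistic, the following Python sketch shows how nucleotide diversity can be computed for one aligned region as the average proportion of differing sites over all pairs of sequences. It is a simplified illustration, not the DnaSP procedure used in this study; the function name `nucleotide_diversity` is hypothetical, and the sketch assumes that columns containing gaps or ambiguous bases are excluded.

```python
from itertools import combinations


def nucleotide_diversity(aligned_seqs) -> float:
    """Average pairwise difference per site (pi) for an alignment.

    Assumptions of this sketch: all sequences have the same length, and
    any column with a gap or ambiguous base in any sequence is dropped.
    """
    seqs = [s.upper() for s in aligned_seqs]
    if len(seqs) < 2:
        raise ValueError("need at least two sequences")
    length = len(seqs[0])
    if any(len(s) != length for s in seqs):
        raise ValueError("sequences must be aligned to the same length")

    # Keep only columns where every sequence has an unambiguous base.
    sites = [i for i in range(length) if all(s[i] in "ACGT" for s in seqs)]
    if not sites:
        raise ValueError("no comparable sites in the alignment")

    pairs = list(combinations(seqs, 2))
    total_diffs = sum(
        sum(1 for i in sites if a[i] != b[i]) for a, b in pairs
    )
    return total_diffs / (len(pairs) * len(sites))


if __name__ == "__main__":
    # Toy alignment for illustration only.
    print(nucleotide_diversity(["ACGT", "ACGA", "ACGT"]))
```

Computing this value gene by gene, or in a sliding window along the alignment, gives the kind of profile from which high-pi candidate barcode regions are picked.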
Species undergo various selection pressures in the process of evolution, and some genes may undergo mutations that allow adaptation to extreme environments. The ratio of nonsynonymous (Ka) to synonymous (Ks) substitution rates is commonly used to evaluate whether genes are under selection pressure [69]. In this study, four genes were found to be under positive selection: psaI, rpl33, rpl36, and rps19. Previous studies have demonstrated that psaI (photosystem I subunit VIII) is not essential for the photosystem I redox reaction; however, it plays a role in stabilizing photosystem I complexes [70]. Although rpl33 encodes ribosomal protein L33, loss of the rpl33 gene does not seem to affect plant viability or growth; for example, the function of rpl33 has been observed to be lost in the pt genomes of Phaseolus vulgaris and Rhipsalis teres [71,72,73]. However, rpl33 plays an important role in maintaining plastid translation capacity under cold conditions [72]. The rpl36 gene (encoding ribosomal protein L36) is nonessential in the plastid, but knockout of rpl36 affects the plant phenotype [74]. These selected genes are of great significance for understanding the adaptive evolution of the Boraginales species.
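The interpretation of the Ka/Ks ratio can be sketched as follows. This Python snippet is only an illustration of the usual decision rule (ratio above 1 for positive selection, below 1 for purifying selection, equal to 1 for neutral evolution); it does not estimate Ka or Ks, which in this study were obtained with KaKs_Calculator [46]. The function names and the gene labels in the example are hypothetical.

```python
def classify_ka_ks(ka: float, ks: float) -> str:
    """Classify one gene from its Ka and Ks estimates.

    Returns "undefined" when Ks is zero, because the ratio cannot be
    computed for genes without synonymous substitutions.
    """
    if ks == 0:
        return "undefined"
    ratio = ka / ks
    if ratio > 1:
        return "positive"
    if ratio < 1:
        return "purifying"
    return "neutral"


def genes_under_positive_selection(rates: dict) -> list:
    """Return the sorted gene names whose Ka/Ks ratio exceeds 1.

    rates maps a gene name to a (Ka, Ks) tuple.
    """
    return sorted(
        gene for gene, (ka, ks) in rates.items()
        if classify_ka_ks(ka, ks) == "positive"
    )


if __name__ == "__main__":
    # Made-up values for illustration only, not results from this study.
    example = {"geneA": (0.02, 0.01), "geneB": (0.01, 0.05), "geneC": (0.0, 0.0)}
    print(genes_under_positive_selection(example))
```

In practice, a ratio above 1 from a single pairwise comparison is usually treated as a candidate signal that needs further statistical support.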
The four families showed identical tree topologies across our phylogenetic results. It remains difficult to say whether Boraginales should be treated as a single family, as in the APG system, or split into several families [2,8]. Under the criterion of monophyly, both views are supported by this study, and more pt genome sequences from this taxon are therefore needed to resolve the dispute. Boraginaceae (s.str.) were separated from the other three families and formed a monophyletic group, consistent with molecular analyses in a previous study [15]. Boraginaceae (s.str.) were divided into two well-supported subfamilies (Lithospermeae and Boragineae), consistent with a phylogenetic tree based on trnL-trnF, rps16, trnS-trnG, and ITS [15]. Our results also indicate that Ehretiaceae is close to Cordiaceae, which is consistent with previous research [75].

5. Conclusions

In this study, we sequenced and assembled the first plastid genomes of Heliotropiaceae and Cordiaceae. We generated neutral plots, ENC plots, and PR2 plots to investigate codon usage patterns in the Boraginales species. Multiple optimal codons were identified, which may provide meaningful information for enhancing the gene expression of the Boraginales species. We also compared the boundaries of the IR/LSC and IR/SSC regions, and the results showed significant differences in gene content in the boundary regions of the nine species. High-variation regions were also detected, which may be used as DNA barcodes for phylogenetic analyses in further research. Phylogenetic analyses revealed that Boraginales was divided into two main clades, while Heliotropiaceae and Ehretiaceae + Cordiaceae were clustered into one clade.

Supplementary Materials

The following supporting information can be downloaded at https://www.mdpi.com/article/10.3390/d14121104/s1: Figure S1: Comparative analysis of the nucleotide variability values of Boraginales species; Table S1: Comparison of general characteristics of the plastid genome of the nine species; Table S2: Analysis of the correlation among the number of codons, GC content, and ENC value of the nine species; Table S3: Detecting optimal codons in the nine species.

Author Contributions

R.W. conceived of the project, designed the research, and conducted sequencing; Q.L. wrote the paper; Q.L. and R.W. analyzed the data; R.W. conducted the format review and modification. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

The pt genome sequences have been deposited in GenBank under the accession numbers ON872366, ON872367, and ON872368. Raw data are available at SRA under the accession numbers PRJNA850506, PRJNA850517, and PRJNA850538.

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. Hu, C.; Kelso, S. Flora of China. In Flora of China; Science Press: Beijing, China, 1996; pp. 99–108.
  2. Luebert, F.; Cecchi, L.; Frohlich, M.W.; Gottschling, M.; Guilliams, C.M.; Hasenstab-Lehman, K.E.; Hilger, H.H.; Miller, J.S.; Mittelbach, M.; Nazaire, M.; et al. Familial classification of the boraginales. Taxon 2016, 65, 502–522.
  3. Refulio-Rodriguez, N.F.; Olmstead, R.G. Phylogeny of Lamiidae. Am. J. Bot. 2014, 101, 287–299.
  4. Weigend, M.; Luebert, F.; Gottschling, M.; Couvreur, T.; Hilger, H.H.; Miller, J.S. From capsules to nutlets-phylogenetic relationships in the Boraginales. Cladistics 2014, 30, 508–518.
  5. The Angiosperm Phylogeny Group. An Ordinal Classification for the Families of Flowering Plants. Ann. Missouri Bot. Gard. 1998, 85, 531–553.
  6. The Angiosperm Phylogeny Group. An update of the Angiosperm Phylogeny Group classification for the orders and families of flowering plants: APG II. Bot. J. Linn. Soc. 2003, 141, 399–436.
  7. The Angiosperm Phylogeny Group. An update of the Angiosperm Phylogeny Group classification for the orders and families of flowering plants: APG III. Bot. J. Linn. Soc. 2009, 161, 1–20.
  8. The Angiosperm Phylogeny Group. An update of the Angiosperm Phylogeny Group classification for the orders and families of flowering plants: APG IV. Bot. J. Linn. Soc. 2016, 181, 1–20.
  9. Bremer, B.; Bremer, K.; Heidari, N.; Erixon, P.; Olmstead, R.G.; Anderberg, A.A.; Källersjö, M.; Barkhordarian, E. Phylogenetics of asterids based on 3 coding and 3 non-coding chloroplast DNA markers and the utility of non-coding DNA at higher taxonomic levels. Mol. Phylogenet. Evol. 2002, 24, 274–301.
  10. Moore, M.J.; Soltis, P.S.; Bell, C.D.; Burleigh, J.G.; Soltis, D.E. Phylogenetic analysis of 83 plastid genes further resolves the early diversification of eudicots. Proc. Natl. Acad. Sci. USA 2010, 107, 4623–4628.
  11. Nazaire, M.; Hufford, L. A Broad Phylogenetic Analysis of Boraginaceae: Implications for the Relationships of Mertensia. Syst. Bot. 2012, 37, 758–783.
  12. Wu, F.-Y.; Tang, C.-Y.; Guo, Y.-M.; Bian, Z.-W.; Fu, J.-Y.; Lu, G.-H.; Qi, J.-L.; Pang, Y.-J.; Yang, Y.-H. Transcriptome analysis explores genes related to shikonin biosynthesis in Lithospermeae plants and provides insights into Boraginales’ evolutionary history. Sci. Rep. 2017, 7, 4477.
  13. Tang, C.; Li, S.; Wang, Y.; Wang, X. Comparative genome/transcriptome analysis probes Boraginales’ phylogenetic position, WGDs in Boraginales, and key enzyme genes in the alkannin/shikonin core pathway. Mol. Ecol. Resour. 2020, 20, 228–241.
  14. Li, H.T.; Yi, T.-S.; Gao, L.-M.; Ma, P.-F.; Zhang, T.; Yang, J.-B.; Gitzendanner, M.A.; Fritsch, P.W.; Cai, J.; Luo, Y.; et al. Origin of angiosperms and the puzzle of the Jurassic gap. Nat. Plants 2019, 5, 461–470.
  15. Chacón, J.; Luebert, F.; Hilger, H.H.; Ovchinnikova, S.; Selvi, F.; Cecchi, L.; Guilliams, C.M.; Hasenstab-Lehman, K.; Sutorý, K.; Simpson, M.G.; et al. The borage family (Boraginaceae s.str.): A revised infrafamilial classification based on new phylogenetic evidence, with emphasis on the placement of some enigmatic genera. Taxon 2016, 65, 523–546.
  16. Teshome, G.E.; Mekbib, Y.; Hu, G.; Li, Z.-Z.; Chen, J. Comparative analyses of 32 complete plastomes of Tef (Eragrostis tef) accessions from Ethiopia: Phylogenetic relationships and mutational hotspots. PeerJ 2020, 8, e9314.
  17. Gruzdev, E.V.; Kadnikov, V.V.; Beletsky, A.V.; Mardanov, A.V.; Ravin, N.V. Extensive plastome reduction and loss of photosynthesis genes in Diphelypaea coccinea, a holoparasitic plant of the family Orobanchaceae. PeerJ 2019, 7, e7830.
  18. Namgung, J.; Do, H.D.K.; Kim, C.; Choi, H.J.; Kim, J.-H. Complete chloroplast genomes shed light on phylogenetic relationships, divergence time, and biogeography of Allioideae (Amaryllidaceae). Sci. Rep. 2021, 11, 3262.
  19. Shen, X.; Guo, S.; Yin, Y.; Zhang, J.; Yin, X.; Liang, C.; Wang, Z.; Huang, B.; Liu, Y.; Xiao, S.; et al. Complete chloroplast genome sequence and phylogenetic analysis of aster tataricus. Molecules 2018, 23, 2426.
  20. Yin, K.; Zhang, Y.; Li, Y.; Du, F.K. Different Natural Selection Pressures on the atpF Gene in Evergreen Sclerophyllous and Deciduous Oak Species: Evidence from Comparative Analysis of the Complete Chloroplast Genome of Quercus aquifolioides with Other Oak Species. Int. J. Mol. Sci. 2018, 19, 1042.
  21. Khadivi-Khub, A.; Zamani, Z.; Fattahi, R.; Wünsch, A. Genetic variation in wild Prunus L. subgen. Cerasus germplasm from Iran characterized by nuclear and chloroplast SSR markers. Trees Struct. Funct. 2014, 28, 471–485.
  22. Liu, S.; Xu, Q.; Liu, K.; Zhao, Y.; Chen, N. Chloroplast Genomes for Five Skeletonema Species: Comparative and Phylogenetic Analysis. Front. Plant Sci. 2021, 12, 774617.
  23. Zhou, T.; Zhu, H.; Wang, J.; Xu, Y.; Xu, F.; Wang, X. Complete chloroplast genome sequence determination of Rheum species and comparative chloroplast genomics for the members of Rumiceae. Plant Cell Rep. 2020, 39, 811–824.
  24. Ye, J.; Niu, Y.; Feng, Y.; Liu, B.; Hai, L.; Wen, J.; Chen, Z. Taxonomy and biogeography of Diapensia (Diapensiaceae) based on chloroplast genome data. J. Syst. Evol. 2020, 58, 696–709.
  25. Quax, T.E.; Claassens, N.J.; Söll, D.; van der Oost, J. Codon Bias as a Means to Fine-Tune Gene Expression. Mol. Cell 2015, 59, 149–161.
  26. Liu, Y. A code within the genetic code: Codon usage regulates co-translational protein folding. Cell Commun. Signal. 2020, 18, 145.
  27. Ingvarsson, P.K. Molecular evolution of synonymous codon usage in Populus. BMC Evol. Biol. 2008, 8, 307.
  28. Liu, Q. Mutational Bias and Translational Selection Shaping the Codon Usage Pattern of Tissue-Specific Genes in Rice. PLoS ONE 2012, 7, e48295.
  29. Deb, B.; Uddin, A.; Chakraborty, S. Codon usage pattern and its influencing factors in different genomes of hepadnaviruses. Arch. Virol. 2020, 165, 557–570.
  30. Athey, J.; Alexaki, A.; Osipova, E.; Rostovtsev, A.; Santana-Quintero, L.V.; Katneni, U.; Simonyan, V.; Kimchi-Sarfaty, C. A new and updated resource for codon usage tables. BMC Bioinform. 2017, 18, 391.
  31. Mazumdar, P.; Othman, R.B.; Mebus, K.; Ramakrishnan, N.; Harikrishna, J.A. Codon usage and codon pair patterns in non-grass monocot genomes. Ann. Bot. 2017, 120, 893–909.
  32. Kwon, K.-C.; Chan, H.-T.; León, I.R.; Williams-Carrier, R.; Barkan, A.; Daniell, H. Codon Optimization to Enhance Expression Yields Insights into Chloroplast Translation. Plant Physiol. 2016, 172, 62–77.
  33. Parvathy, S.T.; Udayasuriyan, V.; Bhadana, V. Codon usage bias. Mol. Biol. Rep. 2022, 49, 539–565.
  34. Doyle, J.J.; Doyle, J.L. A rapid DNA isolation procedure for small quantities of fresh leaf tissue. Phytochem. Bull. 1987, 19, 11–15.
  35. Jin, J.-J.; Yu, W.-B.; Yang, J.-B.; Song, Y.; Depamphilis, C.W.; Yi, T.-S.; Li, D.-Z. GetOrganelle: A fast and versatile toolkit for accurate de novo assembly of organelle genomes. Genome Biol. 2020, 21, 241.
  36. Wick, R.R.; Schultz, M.B.; Zobel, J.; Holt, K.E. Bandage: Interactive visualization of de novo genome assemblies. Bioinformatics 2015, 31, 3350–3352.
  37. Tillich, M.; Lehwark, P.; Pellizzer, T.; Ulbricht-Jones, E.S.; Fischer, A.; Bock, R.; Greiner, S. GeSeq—versatile and accurate annotation of organelle genomes. Nucleic Acids Res. 2017, 45, W6–W11.
  38. Kumar, S.; Stecher, G.; Li, M.; Knyaz, C.; Tamura, K. MEGA X: Molecular Evolutionary Genetics Analysis across Computing Platforms. Mol. Biol. Evol. 2018, 35, 1547–1549.
  39. Wu, S.; Lu, Y.-C.; McMurtrey, J.E.; Weesies, G.; Devine, T.E.; Foster, G.R. Soil Conservation Benefits of Large Biomass Soybean (LBS) for Increasing Crop Residue Cover. J. Sustain. Agric. 2004, 24, 107–128.
  40. Wright, F. The ‘effective number of codons’ used in a gene. Gene 1990, 87, 23–29.
  41. Sueoka, N. Translation-coupled violation of Parity Rule 2 in human genes is not the cause of heterogeneity of the DNA G+C content of third codon position. Gene 1999, 238, 53–58.
  42. Sueoka, N. Directional mutation pressure and neutral molecular evolution. Proc. Natl. Acad. Sci. USA 1988, 85, 2653–2657.
  43. Kawabe, A.; Miyashita, N.T. Patterns of codon usage bias in three dicot and four monocot plant species. Genes Genet. Syst. 2003, 78, 343–352.
  44. Zhang, D.; Gao, F.; Jakovlić, I.; Zhou, H.; Zhang, J.; Li, W.X.; Wang, G.T. PhyloSuite: An integrated and scalable desktop platform for streamlined molecular sequence data management and evolutionary phylogenetics studies. Mol. Ecol. Resour. 2020, 20, 348–355.
  45. Rozas, J.; Ferrer-Mata, A.; Sánchez-DelBarrio, J.C.; Guirao-Rico, S.; Librado, P.; Ramos-Onsins, S.E.; Sánchez-Gracia, A. DnaSP 6: DNA Sequence Polymorphism Analysis of Large Data Sets. Mol. Biol. Evol. 2017, 34, 3299–3302.
  46. Wang, D.; Zhang, Y.; Zhang, Z.; Zhu, J.; Yu, J. KaKs_Calculator 2.0: A Toolkit Incorporating Gamma-Series Methods and Sliding Window Strategies. Genom. Proteom. Bioinform. 2010, 8, 77–80.
  47. Minh, B.Q.; Schmidt, H.A.; Chernomor, O.; Schrempf, D.; Woodhams, M.D.; Von Haeseler, A.; Lanfear, R.; Teeling, E. IQ-TREE 2: New Models and Efficient Methods for Phylogenetic Inference in the Genomic Era. Mol. Biol. Evol. 2020, 37, 1530–1534.
  48. Gao, C.; Deng, Y.; Wang, J. The Complete Chloroplast Genomes of Echinacanthus Species (Acanthaceae): Phylogenetic Relationships, Adaptive Evolution, and Screening of Molecular Markers. Front. Plant Sci. 2019, 9, 1989.
  49. Sobreiro, M.B.; Vieira, L.D.; Nunes, R.; Novaes, E.; Coissac, E.; Silva-Junior, O.B.; Grattapaglia, D.; Collevatti, R.G. Chloroplast genome assembly of Handroanthus impetiginosus: Comparative analysis and molecular evolution in Bignoniaceae. Planta 2020, 252, 91.
  50. Camiolo, S.; Melito, S.; Porceddu, A. New insights into the interplay between codon bias determinants in plants. DNA Res. 2015, 22, 461–470.
  51. Myers, N.; Mittermeier, R.A.; Mittermeier, C.G.; Da Fonseca, G.A.B.; Kent, J. Biodiversity hotspots for conservation priorities. Nature 2000, 403, 853–858.
  52. Yuan, X.; Li, Y.; Zhang, J.; Wang, Y. Analysis of Codon Usage Bias in the chloroplast genome of Dalbergia odorifera. Guihaia 2021, 41, 622–630.
  53. Anwar, A.M.; Aljabri, M.; El-Soda, M. Patterns of genome-wide codon usage bias in tobacco, tomato and potato. Biotechnol. Biotechnol. Equip. 2021, 35, 657–664.
  54. Sueoka, N. Intrastrand parity rules of DNA base composition and usage biases of synonymous codons. J. Mol. Evol. 1995, 40, 318–325.
  55. Xiang, H.; Zhang, R.; Butler, R.R.; Liu, T.; Zhang, L.; Pombert, J.-F.; Zhou, Z. Comparative Analysis of Codon Usage Bias Patterns in Microsporidian Genomes. PLoS ONE 2015, 10, e0129223.
  56. Wang, L.; Xing, H.; Yuan, Y.; Wang, X.; Saeed, M.; Tao, J.; Feng, W.; Zhang, G.; Song, X.; Sun, X. Genome-wide analysis of codon usage bias in four sequenced cotton species. PLoS ONE 2018, 13, e0194372.
  57. He, Z.; Gan, H.; Liang, X. Analysis of Synonymous Codon Usage Bias in Potato Virus M and Its Adaption to Hosts. Viruses 2019, 11, 752.
  58. Fu, P.; Sun, S.; Twyford, A.D.; Li, B.; Zhou, R.; Chen, S.; Gao, Q.; Favre, A. Lineage-specific plastid degradation in subtribe Gentianinae (Gentianaceae). Ecol. Evol. 2021, 11, 3286–3299.
  59. Yao, G.; Jin, J.-J.; Li, H.-T.; Yang, J.-B.; Mandala, V.S.; Croley, M.; Mostow, R.; Douglas, N.; Chase, M.; Christenhusz, M.J.; et al. Plastid phylogenomic insights into the evolution of Caryophyllales. Mol. Phylogenet. Evol. 2019, 134, 74–86.
  60. Wu, H.; Ma, P.-F.; Li, H.-T.; Hu, G.-X.; Li, D.-Z. Comparative plastomic analysis and insights into the phylogeny of Salvia (Lamiaceae). Plant Divers. 2021, 43, 15–26.
  61. Amenu, S.G.; Wei, N.; Wu, L.; Oyebanji, O.; Hu, G.; Zhou, Y.; Wang, Q. Phylogenomic and comparative analyses of Coffeeae alliance (Rubiaceae): Deep insights into phylogenetic relationships and plastome evolution. BMC Plant Biol. 2022, 22, 88.
  62. Khoshsokhan Mozaffar, M.; Kazempour Osaloo, S.; Oskoueiyan, R.; Naderi Saffar, K.; Amirahmadi, A. Tribe Eritrichieae (Boraginaceae s. str.) in West Asia: A molecular phylogenetic perspective. Plant Syst. Evol. 2013, 299, 197–208.
  63. Nasrollahi, F.; Kazempour-Osaloo, S.; Saadati, N.; Mozaffarian, V.; Zare-Maivan, H. Molecular phylogeny and divergence times of Onosma (Boraginaceae s.s.) based on nrDNA ITS and plastid rpl32-trnL(UAG) and trnH-psbA sequences. Nord. J. Bot. 2019, 37, e02060.
  64. Långström, E.; Chase, M.W. Tribes of Boraginoideae (Boraginaceae) and placement of Antiphytum, Echiochilon, Ogastemma and Sericostoma: A phylogenetic analysis based on atpB plastid DNA sequence data. Plant Syst. Evol. 2002, 234, 137–153.
  65. Selvi, F.; Papini, A.; Hilger, H.H.; Bigazzi, M.; Nardi, E. The phylogenetic relationships of Cynoglottis (Boraginaceae-Boragineae) inferred from ITS, 5.8S and trnL sequences. Plant Syst. Evol. 2004, 246, 195–209.
  66. Dong, W.; Xu, C.; Li, C.; Sun, J.; Zuo, Y.; Shi, S.; Cheng, T.; Guo, J.; Zhou, S. ycf1, the most promising plastid DNA barcode of land plants. Sci. Rep. 2015, 5, 8348.
  67. Kim, S.; Park, C.-W.; Kim, Y.-D.; Suh, Y. Phylogenetic relationships in family Magnoliaceae inferred from ndhF sequences. Am. J. Bot. 2001, 88, 717–728.
  68. Hidalgo, O.; Garnatje, T.; Susanna, A.; Mathez, J. Phylogeny of valerianaceae based on matK and ITS markers, with reference to matK individual polymorphism. Ann. Bot. 2004, 93, 283–293.
  69. Ding, S.; Dong, X.; Yang, J.; Guo, C.; Cao, B.; Guo, Y.; Hu, G. Complete chloroplast genome of clethra fargesii franch., an original sympetalous plant from central china: Comparative analysis, adaptive evolution, and phylogenetic relationships. Forests 2021, 12, 441.
  70. Schöttler, M.A.; Thiele, W.; Belkius, K.; Bergner, S.V.; Flügel, C.; Wittenberg, G.; Agrawal, S.; Stegemann, S.; Ruf, S.; Bock, R. The plastid-encoded PsaI subunit stabilizes photosystem i during leaf senescence in tobacco. J. Exp. Bot. 2017, 68, 1137–1155.
  71. Hallick, R.B.; Hong, L.; Drager, R.G.; Favreau, M.R.; Monfort, A.; Orsat, B.; Spielmann, A.; Stutz, E. Complete sequence of Euglena gracilis chloroplast DNA. Nucleic Acids Res. 1993, 21, 3537–3544.
  72. Rogalski, M.; Schöttler, M.A.; Thiele, W.; Schulze, W.X.; Bock, R. Rpl33, a nonessential plastid-encoded ribosomal protein in tobacco, is required under cold stress conditions. Plant Cell 2008, 20, 2221–2237.
  73. da Silva, G.M.; Lopes, A.D.S.; Pacheco, T.G.; Machado, K.L.d.G.; Silva, M.C.; de Oliveira, J.D.; de Baura, V.A.; Balsanelli, E.; de Souza, E.M.; Pedrosa, F.D.O.; et al. Genetic and evolutionary analyses of plastomes of the subfamily Cactoideae (Cactaceae) indicate relaxed protein biosynthesis and tRNA import from cytosol. Braz. J. Bot. 2021, 44, 97–116.
  74. Fleischmann, T.; Scharff, L.; Alkatib, S.; Hasdorf, S.; Schöttler, M.A.; Bock, R. Nonessential Plastid-Encoded Ribosomal Proteins in Tobacco: A Developmental Role for Plastid Translation and Implications for Reductive Genome Evolution. Plant Cell 2011, 23, 3137–3155.
  75. Gottschling, M.; Luebert, F.; Hilger, H.H.; Miller, J.S. Molecular delimitations in the ehretiaceae (boraginales). Mol. Phylogenet. Evol. 2014, 72, 1–6.
Figure 1. Plastid genome maps for Cordia dichotoma, Heliotropium arborescens, and Tournefortia montana.
Figure 2. Correlation analysis of GC content, the number of codons, and ENC values. Numbers in the lower left corner circle: p value; numbers in the upper right corner: correlation coefficient. Only the red circle has no number, indicating that the p value is less than 0.05.
Figure 3. Neutrality plot of the nine species in Boraginales.
Figure 4. ENC plot analyses show the relationship between ENC and GC3.
Figure 5. The PR2-bias plots of nine species.
Figure 6. Comparison of the IR/LSC and IR/SSC regions of the nine species.
Figure 7. The Ka/Ks values of plastid genes in the nine species, with Cordia dichotoma as a reference genome.
Figure 8. Phylogenetic analyses of the Boraginales species using the maximum likelihood (ML) method, based on 80 protein-coding genes.
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Share and Cite

MDPI and ACS Style

Li, Q.; Wei, R. Comparison of Boraginales Plastomes: Insights into Codon Usage Bias, Adaptive Evolution, and Phylogenetic Relationships. Diversity 2022, 14, 1104. https://doi.org/10.3390/d14121104
