Article

Extreme Reconfiguration of Plastid Genomes in Papaveraceae: Rearrangements, Gene Loss, Pseudogenization, IR Expansion, and Repeats

1
College of Plant Protection, Henan Agricultural University, Zhengzhou 450002, China
2
Marine College, Shandong University, Weihai 264209, China
3
College of Life Sciences, Henan Agricultural University, Zhengzhou 450046, China
*
Authors to whom correspondence should be addressed.
Int. J. Mol. Sci. 2024, 25(4), 2278; https://doi.org/10.3390/ijms25042278
Submission received: 8 January 2024 / Revised: 8 February 2024 / Accepted: 10 February 2024 / Published: 14 February 2024
(This article belongs to the Section Molecular Genetics and Genomics)

Abstract

The plastid genomes (plastomes) of angiosperms are typically highly conserved, with extreme reconfiguration being uncommon, although reports of such events have emerged in some lineages. In this study, we conducted a comprehensive comparison of the complete plastomes from twenty-two species, covering seventeen genera from three subfamilies (Fumarioideae, Hypecooideae, and Papaveroideae) of Papaveraceae. Our results revealed a high level of variability in the plastid genome size of Papaveraceae, ranging from 151,864 bp to 219,144 bp in length, which might be triggered by the expansion of the IR region and a large number of repeat sequences. Moreover, we detected numerous large-scale rearrangements, primarily occurring in the plastomes of Fumarioideae and Hypecooideae. Frequent gene loss or pseudogenization was also observed for the ndh genes, accD, clpP, infA, rpl2, rpl20, rpl32, rps16, and several tRNA genes, particularly in Fumarioideae and Hypecooideae, which might be associated with the structural variation in their plastomes. Furthermore, we found that the plastomes of Fumarioideae exhibited a higher GC content and more repeat sequences than those of Papaveroideae. Our results showed that Papaveroideae generally displayed a relatively conserved plastome, with the exception of Eomecon chionantha, while Fumarioideae and Hypecooideae typically harbored highly reconfigured plastomes, showing high variability in genome size, gene content, and gene order. This study provides insights into the plastome evolution of Papaveraceae and may contribute to the development of effective molecular markers.

1. Introduction

The plastid genome (plastome) has quickly become a commonly used marker in plant systematics and phylogenetic research due to its structural conservation, small genome size (approximately 150–200 kb), and uniparental inheritance [1,2,3]. The plastomes of higher plants have a highly conserved quadripartite circular structure, comprising a large single-copy region (LSC) and a small single-copy region (SSC) separated by two inverted repeat regions (IR) [4]. Although some atypical forms such as the extreme contraction of IRs, entire IR loss, and direct repeats (DRs) have been detected, the typical plastome architecture, including genome size, gene content, and gene order, is generally highly conserved, with two configurations usually coexisting and interchanging via IRs [5]. Nevertheless, an increasing number of studies have revealed structural variation, including large-scale rearrangements in Fabaceae [6,7,8,9,10], Geraniaceae [11,12,13,14], Cactaceae [15,16], Campanulaceae [17,18], and Passifloraceae [19,20,21,22], as well as large IR contractions in Schisandraceae [23] and Lauraceae [24], and even loss of one IR region in Leguminosae [7,25], Geraniaceae [12], and Passifloraceae [20].
Papaveraceae, one of the most diverse families of Ranunculales, consists of about 850 species belonging to 45 genera and predominantly occurs in forests, subalpine meadows, alpine tundra, grasslands, and deserts, especially in temperate regions of the northern hemisphere [26]. This family exhibits a high degree of endemism, with over 70% of its species being geographically restricted [26]. Papaveraceae is renowned for the opium poppy (Papaver somniferum), which can serve as a source of analgesic drugs [27]. In addition, Corydalis yanhusuo, Dactylicapnos torulosa, Chelidonium majus, Macleaya cordata, Hylomecon japonica, Hypecoum erectum, and Eomecon chionantha are also used in traditional medicine [28,29,30,31,32,33,34]. The taxonomy and phylogeny of Papaveraceae have been a subject of ongoing debate. Hoot et al. [35] divided Papaveraceae into three subfamilies, i.e., Pteridophylloideae, Papaveroideae, and Fumarioideae. Wang et al. [36] divided Papaveraceae into two subfamilies, i.e., Fumarioideae (including Pteridophyllum and Hypecoum) and Papaveroideae. Afterwards, Hoot et al. [37] divided Papaveraceae into four subfamilies, i.e., Pteridophylloideae, Papaveroideae, Hypecooideae, and Fumarioideae, with Pteridophylloideae supported as sister to the other subfamilies and Hypecooideae as sister to Fumarioideae. Although there is still controversy about the systematic position of the two single-genus subfamilies Hypecooideae and Pteridophylloideae, this classification is currently generally accepted [26].
Within Fumarioideae, Park et al. [38] first reported the large-scale rearrangements of plastomes in Lamprocapnos and speculated that the expansion and contraction of the IR region might contribute to these rearrangements. Additionally, Xu and Wang revealed Corydalis to be another unusual lineage with extensive large-scale plastome rearrangements and significant pseudogenizations or losses of the accD, clpP, and ndh genes [39]. Furthermore, recent wide-scale comparative studies confirmed the extensive rearrangements and gene losses throughout the plastomes of Corydalis [40,41]. Moreover, the plastomes of Fumaria also showed signatures of structural rearrangement and loss of the accD gene [41]. In contrast, the plastomes of Papaveroideae appeared more conserved, with no structural rearrangements or gene losses detected, despite the release of plastomes from several genera [42]. For other closely related Ranunculales taxa, Sun et al. [43] reported plastid rearrangements in Circaeasteraceae and found inversions of approximately 49 kb and 3.5 kb in the LSC region, with pseudogenizations or losses of the accD and ndh genes. Nevertheless, the plastomes of Eupteleaceae, Lardizabalaceae, Menispermaceae, Berberidaceae, and Ranunculaceae were generally conserved [44,45], with the exception of several genera, such as the inversion of a 44.8 kb segment in the LSC region and loss of the rpl32 gene in Adonis [46], and the small expansion of the IR region in Mahonia and Asteropyrum [47,48].
Structural variations are fundamental characteristics of a specific group [49]. Therefore, inferring structural variations of the plastomes in diverse phylogenetic lineages is an interesting research topic, and the results may provide more insights into their evolutionary history. Despite the recent surge in complete plastome sequencing, comprehensive comparisons have been largely confined to a few genera in Fumarioideae, such as Corydalis, Lamprocapnos, and Fumaria [38,39,40,41], leaving genomic changes throughout Papaveraceae poorly understood. In this study, we carried out a detailed comparative analysis of the plastomes of Papaveraceae with expanded sampling, aiming to characterize the plastomes from major lineages of Papaveraceae and investigate the factors contributing to plastome variation. Our study will advance the understanding of plastome diversity and evolution in plants.

2. Results

2.1. Chloroplast Genome Characterization

A total of 185 Gb of 150 bp PE Illumina reads were generated. After quality control using SOAPnuke, an average of approximately 7.2 GB of clean data was obtained for each sample (Table S1). The plastomes of Papaveraceae exhibited a standard circular quadripartite structure comprising two single-copy regions (LSC and SSC) separated by a pair of inverted repeats (IRa and IRb). The plastid genome size differed significantly, ranging from 151,864 bp (Meconopsis integrifolia) to 219,144 bp (Corydalis sheareri). The length of LSC ranged from 76,421 bp (Dactylicapnos torulosa) to 100,365 bp (Corydalis adunca), the length of SSC varied from 136 bp (Eomecon chionantha) to 31,701 bp (Dactylicapnos torulosa), and the IR lengths ranged from 25,653 bp (Meconopsis integrifolia) to 62,384 bp (Corydalis sheareri) (Table 1). The overall GC content of Papaveraceae varied from 37.9% (Eomecon chionantha) to 41.1% (Corydalis adunca), with the lowest GC content observed in the SSC region and a relatively higher content in the IR region. Notably, Corydalis, Lamprocapnos, and Eomecon exhibited a disproportionately lower GC content (Figure 1).
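The two summary statistics used above can be illustrated with a minimal Python sketch. It assumes that the total plastome length is the sum of the LSC, the SSC, and two IR copies, and that GC content is computed over unambiguous bases only; the function names are hypothetical and this is not the pipeline used in the study.

```python
def plastome_length(lsc: int, ssc: int, ir: int) -> int:
    """Total length of a quadripartite plastome: LSC + SSC + two IR copies."""
    return lsc + ssc + 2 * ir


def gc_content(seq: str) -> float:
    """GC percentage of a sequence, counting only A, C, G, and T (assumption)."""
    seq = seq.upper()
    gc = seq.count("G") + seq.count("C")
    at = seq.count("A") + seq.count("T")
    if gc + at == 0:
        raise ValueError("sequence contains no A, C, G, or T")
    return 100.0 * gc / (gc + at)


if __name__ == "__main__":
    # Toy values, not taken from Table 1.
    print(plastome_length(lsc=10, ssc=5, ir=3))
    print(gc_content("ATGCNN"))
```

Because each IR is counted twice, an IR expansion of a given length adds twice that length to the two IR copies while removing it only once from the adjacent single-copy region.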
The number of protein-coding genes (PCGs) ranged from 85 (Corydalis triternatifolia) to 100 (Lamprocapnos spectabilis), and the number of tRNAs varied from 36 (Corydalis adunca) to 44 (Corydalis sheareri) (Figure 1, Table 1). All Papaveraceae species shared 91 unique plastid genes, including 60 PCGs, 27 tRNA genes, and 4 rRNA genes. Of those, ten genes harbored at least one intron (atpF, petB, petD, rpl16, rpoC1, rps12, ycf3, trnA-UGC, trnK-UUU, trnL-UAA), and the largest intron was observed in trnK-UUU (~2400 bp), which contains the matK gene. In addition, ten PCGs (ccsA, matK, psaC, psaI, psbA, rpl22, rpl23, rps15, rps3, ycf1), four rRNAs (rrn16, rrn23, rrn4.5, rrn5), and eight tRNAs (trnH-GUG, trnK-UUU, trnQ-UUG, trnI-CAU, trnL-UAG, trnT-UGU, trnG-GCC, trnfM-CAU) were duplicated in the IR regions of some species. Gene losses occurred frequently in Papaveraceae, especially in the subfamily Fumarioideae. For example, the accD gene was lost in C. triternatifolia, C. gamosepala, C. tomentella, C. sheareri, C. longicalcarata, C. adunca, Dactylicapnos torulosa, and Fumaria schleicheri. The clpP gene was lost in Dactylicapnos torulosa and Eomecon chionantha. The ndhA and ndhI genes were lost in Corydalis triternatifolia and C. adunca. The ndhB, ndhC, ndhF, ndhG, and ndhK genes were lost in C. triternatifolia. The trnV-UAC gene was lost in C. sheareri. Additionally, three genes (ndhF, ndhH, and ndhJ) were truncated in two species of Corydalis, and three genes (clpP, ndhD, trnV-UAC) were identified as pseudogenes in Corydalis or Dactylicapnos. In sharp contrast with the frequent gene loss or pseudogenization in the subfamily Fumarioideae, only clpP was pseudogenized, lost, or truncated in three species (Hypecoum erectum, Eomecon chionantha, and Sanguinaria canadensis) of Papaveroideae and Hypecooideae (Figure 1).

2.2. Phylogenetic Analyses

The maximum likelihood (ML) tree was constructed based on the 91 shared genes. The aligned length of the concatenated plastid genes was 76,127 bp, with 12,340 variable sites (22.42%) and 6449 parsimony-informative sites (12.08%) (gaps were not included). Our phylogeny recovered a topology largely congruent with previous studies [26]. As expected, three subfamilies were recovered with high support, and Papaveroideae was strongly supported as sister to a monophyletic group comprising all species from Hypecooideae and Fumarioideae (Figure 1A and Figure S1). Within Papaveroideae, Chelidonieae and Papavereae were recovered as non-monophyletic due to the exceptional position of Stylophorum lasiocarpum. Additionally, the relationships of Chelidonieae, Papavereae, and Eschscholzieae were not fully resolved. Within Fumarioideae, Fumarieae was nested in Corydaleae and sister to Corydalis.
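Counts of variable and parsimony-informative sites such as those reported above can be sketched in Python as follows. The sketch assumes that any alignment column containing a gap or an ambiguous base is excluded entirely, which is one possible reading of "gaps were not included"; the function name `site_stats` is hypothetical and the study's actual procedure may differ.

```python
from collections import Counter


def site_stats(alignment):
    """Return (sites_used, variable_sites, parsimony_informative_sites).

    alignment: list of equal-length aligned sequences.
    Columns with any symbol other than A, C, G, or T are skipped (assumption).
    """
    if not alignment:
        raise ValueError("empty alignment")
    length = len(alignment[0])
    if any(len(seq) != length for seq in alignment):
        raise ValueError("sequences must have equal length")

    used = variable = informative = 0
    for column in zip(*(seq.upper() for seq in alignment)):
        if any(base not in "ACGT" for base in column):
            continue
        used += 1
        counts = Counter(column)
        if len(counts) > 1:
            variable += 1
            # Informative: at least two states, each present in >= 2 sequences.
            if sum(1 for n in counts.values() if n >= 2) >= 2:
                informative += 1
    return used, variable, informative


if __name__ == "__main__":
    toy = ["ACGTA-", "ACGTAA", "ATGCAA", "ATGGAA"]
    used, var, pis = site_stats(toy)
    print(used, var, pis, round(100 * var / used, 2))
```

Percentages then follow by dividing each count by the number of columns retained, so the denominator depends on how gapped columns are treated.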

2.3. Genome Structure Variations

Multiple alignment analysis across twenty-three species showed the presence of several locally collinear blocks, suggesting that the plastomes of Papaveraceae might have undergone varying degrees of rearrangement (Figure 2). Firstly, one block (~6 kb) containing five genes (trnV-UAC, trnM-CAU, atpE, atpB, and rbcL) and the associated non-coding sequences relocated from the typically posterior part of the LSC region to the front. This block was relocated downstream of the matK gene in Corydalis triternatifolia, C. sheareri, C. gamosepala, C. longicalcarata, and Dactylicapnos torulosa. In Corydalis tomentella and Hypecoum erectum, this block was relocated downstream of the atpH and accD genes, respectively. In addition, these five genes in Corydalis triternatifolia, C. sheareri, C. gamosepala, C. longicalcarata, C. tomentella, Dactylicapnos torulosa, and Hypecoum erectum were inverted, a phenomenon not observed in Papaveroideae. The remaining species exhibited a typical position of angiosperms downstream of the ndhC gene. Secondly, the rps16 (~1 kb) gene was relocated from the typically front part of the LSC region to the posterior in Corydalis longicalcarata. In Fumaria schleicheri and Lamprocapnos spectabilis, the rps16 gene transferred from the typically front part of the LSC region to the IR region. Furthermore, the rps16 gene was inverted in Corydalis longicalcarata, Fumaria schleicheri, and Lamprocapnos spectabilis. Thirdly, one block (~7 kb) containing eight genes (psbK, psbI, trnS-GCU, trnG-UCC, trnR-UCU, atpA, atpF, and atpH) relocated from the typically front part of the LSC region to the posterior in Corydalis longicalcarata. Moreover, this block was inverted in Corydalis longicalcarata and Fumaria schleicheri. Fourthly, one block (~14–15 kb) comprising five genes (atpI, rps2, rpoC2, rpoC1, and rpoB) relocated from the typically front part of the LSC region to the posterior in Hypecoum erectum. 
In addition, in Corydalis longicalcarata, Fumaria schleicheri, Lamprocapnos spectabilis, and Hypecoum erectum, this block was inverted. Fifthly, one block (~13–15 kb) containing fourteen genes (from trnD-GUC to ycf3) was inverted in Corydalis gamosepala, C. longicalcarata, Fumaria schleicheri, and Hypecoum erectum; in Corydalis longicalcarata and Fumaria schleicheri, it was also relocated from the middle to the front of the LSC region. Sixthly, one block (~2 kb) containing three genes (trnS-GGA, rps4, and trnT-UGU) relocated from the typical middle of the LSC region to the posterior in Hypecoum erectum, while it was relocated from the middle of the LSC region to the front in Corydalis longicalcarata and Fumaria schleicheri. This block was inverted in Corydalis gamosepala, C. longicalcarata, Fumaria schleicheri, and Hypecoum erectum. Seventhly, one block (~2 kb) containing five genes (trnL-UAA, trnF-GAA, ndhJ, ndhK, and ndhC) was inverted in Hypecoum erectum, Corydalis longicalcarata, and C. gamosepala. Eighthly, one block (~7 kb) in the IR region containing one gene (ycf2) relocated from the typical front of the IR region to the posterior in Hypecoum erectum, and this block was inverted in Corydalis tomentella and Hypecoum erectum. Ninthly, one block (~7 kb) in the IR region containing the ndhB gene was inverted in all taxa of Fumarioideae and Hypecooideae except for Corydalis tomentella. Similarly, one block (~11 kb) in the IR region containing ten genes (from rps7 to trnR-ACG) was inverted in all sampled species of Fumarioideae and Hypecooideae except for C. tomentella. Tenthly, one block (~2 kb) in the SSC region containing only the ndhF gene relocated to the IR region in Eomecon chionantha, Dactylicapnos torulosa, Fumaria schleicheri, Corydalis tomentella, C. longicalcarata, C. gamosepala, and C. sheareri due to the expansion of the IR region. However, this block was relocated to the LSC region in C. adunca. 
In addition, in Eomecon chionantha and Corydalis adunca, it was inverted. Lastly, one block (~6 kb) in the SSC region, which included seven genes (from trnL-UAG to ndhI), relocated to the IR region in Eomecon chionantha, Fumaria schleicheri, Lamprocapnos spectabilis, and all six Corydalis species due to the expansion of the IR region absorbing the SSC region, and was inverted in Lamprocapnos spectabilis and Eomecon chionantha. In conclusion, we observed that relocations and inversions were widely distributed in Papaveraceae, especially in Hypecooideae and Fumarioideae. Within Papaveroideae, the rearrangement was mainly concentrated in some specific lineage, such as Eomecon (Figure 2).
The IR boundary analyses indicated that the IR regions of Papaveroideae plastomes were highly conserved, while the IR boundaries of Hypecooideae and Fumarioideae plastomes exhibited high variation (Figure 3). In Papaveroideae, most species had similar structures: the rps19 gene was located at the LSC/IRb boundary, the intergenic region between the trnN and ndhF genes resided precisely at the IRb/SSC boundary, the ycf1 gene crossed the IRa/SSC boundary, and the trnH gene was located at the IRa/LSC boundary. Specifically, in Papaver nudicaule, the ycf1 gene resided precisely at the IRb/SSC boundary. In Eomecon chionantha, the IRb region extended into the SSC region, absorbing the ndhF gene. In contrast, in Macleaya cordata and Dicranostigma leptopodum, the IR region expanded into the LSC region, assimilating the rps19, rpl22, and rps3 genes and thereby placing the rps3 gene at the IR/LSC boundary. In the subfamilies Fumarioideae and Hypecooideae, the rps19 gene was located in the LSC region in Corydalis gamosepala, C. longicalcarata, C. tomentella, Fumaria schleicheri, and Hypecoum erectum at distances ranging from 156 bp to 1156 bp from the LSC/IRb boundary, while the rps19 gene resided precisely at the LSC/IRb boundary in Ichtyoselmis macrantha. The rpl2 gene was located at the LSC/IRb boundary in Corydalis triternatifolia, C. sheareri, C. adunca, and Dactylicapnos torulosa, while in Lamprocapnos spectabilis, the LSC/IRb boundary was situated between the trnI and trnQ genes. For the IRb/SSC boundary of Fumarioideae and Hypecooideae, the ycf1 gene was located within the IRb region in Corydalis sheareri, C. gamosepala, and C. longicalcarata, at a distance of 31 bp to 423 bp from the boundary. Furthermore, the ndhF gene was located near the IRb/SSC boundary in Dactylicapnos torulosa, Ichtyoselmis macrantha, Lamprocapnos spectabilis, and Hypecoum erectum. 
The rps15 gene was located near the IRb/SSC boundary in Corydalis triternatifolia and Fumaria schleicheri, and in C. adunca, the IRb/SSC boundary was located between the ndhG and ndhH genes. Nevertheless, in C. tomentella, the ndhA gene was located at the IRb/SSC boundary. For the IRb/SSC boundary of Fumarioideae and Hypecooideae, the ycf1 gene was located near the boundary in Corydalis sheareri, C. gamosepala, C. longicalcarata, C. tomentella, C. adunca, Fumaria schleicheri, and Ichtyoselmis macrantha, at a distance of 6 bp to 521 bp from the boundary, and the ycf1 gene of Hypecoum erectum was located exactly at the boundary. In Corydalis triternatifolia, the ndhH gene was 2184 bp away from the IRb/SSC boundary. In Dactylicapnos torulosa, the IRb/SSC boundary was situated between the rpl32 and ndhF genes, and in Lamprocapnos spectabilis, the ndhF gene was 157 bp away from the boundary. For the IRa/LSC boundary of Fumarioideae and Hypecooideae, the trnH gene was located at a distance of 10 bp to 297 bp from the boundary in all sampled species with the exception of L. spectabilis, in which the boundary was situated between the trnQ and psbK genes.

2.4. Codon Usage and Repeat Sequence Analysis

Codons with RSCU values greater than one were considered to have relatively high usage frequencies. We examined the codon usage frequency of PCGs in Papaveraceae and found that eighteen codons encoding eighteen amino acids had RSCU values > 1 (Table S2). Among them, the highest frequency was observed for the codon CGU, which encodes arginine (R). Furthermore, the codon UGG encoding tryptophan (W) and the codon AUG encoding methionine (M) had RSCU values of 1, as expected for amino acids encoded by a single codon (Table S2).
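As an illustration of how RSCU values behave, the following Python sketch computes RSCU as the observed count of a codon divided by the mean count of all synonymous codons for the same amino acid. It is a minimal sketch and not the CodonW implementation: it assumes the standard amino acid assignments, ignores stop codons and codons with ambiguous bases, and returns codons in the DNA alphabet (T instead of U). The function name `rscu` is hypothetical.

```python
from collections import Counter
from itertools import product

# Standard genetic code in TCAG order; "*" marks stop codons.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def rscu(cds_list):
    """Return {codon: RSCU} pooled over a list of in-frame coding sequences."""
    counts = Counter()
    for cds in cds_list:
        cds = cds.upper().replace("U", "T")
        for i in range(0, len(cds) - len(cds) % 3, 3):
            codon = cds[i:i + 3]
            if CODON_TABLE.get(codon, "*") != "*":
                counts[codon] += 1

    # Group sense codons into synonymous families.
    families = {}
    for codon, aa in CODON_TABLE.items():
        if aa != "*":
            families.setdefault(aa, []).append(codon)

    values = {}
    for codons in families.values():
        total = sum(counts[c] for c in codons)
        if total == 0:
            continue  # amino acid absent from the input
        expected = total / len(codons)  # equal usage of synonymous codons
        for c in codons:
            values[c] = counts[c] / expected
    return values


if __name__ == "__main__":
    result = rscu(["ATGGCTGCTGCCTGG"])  # toy CDS: M A A A W
    for codon in ("ATG", "TGG", "GCT", "GCC", "GCA"):
        print(codon, round(result[codon], 3))
```

In the toy example, ATG and TGG each form a one-codon family, so their RSCU is 1 regardless of how often they occur, whereas GCT exceeds 1 because it is used more often than the other alanine codons.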
We detected a total of 732 SSRs, including 678 mononucleotide repeats, 45 dinucleotide repeats, 7 trinucleotide repeats, and 2 hexanucleotide repeats (Figure 4, Table S3). No tetranucleotide or pentanucleotide repeats were detected. The majority of mononucleotide repeats consisted of A/T (96.3%), with C/G accounting for only 3.7%. In addition, we discovered 2999 forward repeats, 1610 palindromic repeats, 28 reverse repeats, and 9 complementary repeats (Figure 4, Table S3). Among the sampled species, Corydalis sheareri had the largest numbers of forward repeats (1079) and palindromic repeats (1001). Complementary repeats were only detected in C. adunca, Dicranostigma leptopodum, and Coreanomecon hylomeconoides (Figure 4, Table S3). We also detected 749 tandem repeat sequences. Overall, the subfamily Fumarioideae harbored more tandem repeat sequences than the subfamily Papaveroideae (Figure 4). We tested the correlation between the genome size and the total number of repeats, total tandem repeat number, total SSR number, and total dispersed repeat number, respectively. We found a weak correlation between the genome size and total SSR number (rs = 0.215, p = 0.336) (Figure 4B), while the total dispersed repeat number, total tandem repeat number, and total repeat number showed a very strong correlation with the plastid genome size (Figure 4C–E). We also analyzed the correlation between the genome size and total SSR size, total tandem repeat size, and total dispersed repeat size, and similar results were obtained (Figure S3A–C). Furthermore, the GC content was significantly correlated with the size of the repeated sequences (Figure S2D).
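To make the SSR categories concrete, the following Python sketch finds perfect SSRs with a regular expression. It is a simplified illustration and not the MISA algorithm: it assumes the minimum repeat numbers given in Section 4.5 are 10, 6, 5, 5, 5, and 5 for mono- to hexanucleotide motifs, reports 0-based start positions, and skips motifs that are themselves repeats of a shorter unit. The names `MIN_REPEATS` and `find_ssrs` are hypothetical.

```python
import re

# Motif length -> minimum number of repeats (assumed reading of Section 4.5).
MIN_REPEATS = {1: 10, 2: 6, 3: 5, 4: 5, 5: 5, 6: 5}


def find_ssrs(seq, min_repeats=MIN_REPEATS):
    """Return a sorted list of (start, motif, repeat_count) for perfect SSRs."""
    seq = seq.upper()
    hits = []
    for unit_len, min_rep in sorted(min_repeats.items()):
        # One motif copy followed by at least (min_rep - 1) further copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (unit_len, min_rep - 1))
        for match in pattern.finditer(seq):
            motif = match.group(1)
            # Skip motifs such as "AA" or "ATAT" that reduce to a shorter unit.
            if re.fullmatch(r"(.+)\1+", motif):
                continue
            hits.append((match.start(), motif, len(match.group(0)) // unit_len))
    return sorted(hits)


if __name__ == "__main__":
    toy = "GG" + "A" * 12 + "CC" + "AT" * 6 + "GG"
    for start, motif, n in find_ssrs(toy):
        print(start, motif, n)
```

The redundancy check keeps a run of twelve A bases from being reported a second time as a dinucleotide "AA" repeat.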

2.5. Nucleotide Diversity and Positive Selection Analyses

The nucleotide diversity (Pi) values were calculated to assess the genetic variation among species within Papaveraceae. The nucleotide diversity for 91 shared genes varied from 0.00357 (trnP-UGG) to 0.2627 (ycf1), with an average of 0.044 (Figure 5). Apart from ycf1, rps15 (0.2267), matK (0.10616), rps3 (0.09903), trnK-UUU (0.09514), and ycf2 (0.0948) also exhibited high Pi values in Papaveraceae (Figure 5). In addition, ycf1 (0.22656), ycf2 (0.09397), and rps3 (0.09206) showed higher Pi values in Hypecooideae and Fumarioideae, while rps15 (0.19499) and ycf1 (0.16229) showed higher Pi values in Papaveroideae, indicating their potential suitability as candidate barcodes for the identification of Papaveraceae species (Figure 5 and Figure S3). The ratios of the non-synonymous (dN) to synonymous (dS) substitution rates for the 60 shared PCGs were calculated with the PAML program. We found that the dN/dS values of seven genes (psaJ, psbT, rps18, rpl33, rps19, ycf1, ycf3) were significantly greater than 1, indicating positive selection for these genes within Papaveraceae (Figure 6). However, the genes under positive selection differed slightly among subfamilies, with rpl23, ycf3, rpl33, and rps19 in Papaveroideae, and ycf3, rps19, rpl33, rps18, and psaJ in Fumarioideae and Hypecooideae (Figure S4). In addition, most rps genes exhibited relatively high dN/dS values, with the majority being above 0.5.
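Nucleotide diversity can be illustrated as the average number of pairwise differences per site across all pairs of aligned sequences. The Python sketch below follows that definition under stated assumptions: columns containing gaps or ambiguous bases are removed before comparison, and no sliding window is used. The software and settings used in the study are not stated in this section, so this is only an illustration, and the function name is hypothetical.

```python
from itertools import combinations


def nucleotide_diversity(alignment):
    """Average pairwise differences per site over all sequence pairs."""
    n = len(alignment)
    if n < 2:
        raise ValueError("need at least two sequences")
    length = len(alignment[0])
    if any(len(seq) != length for seq in alignment):
        raise ValueError("sequences must have equal length")

    # Complete deletion: keep only columns made of A, C, G, and T (assumption).
    columns = [
        col for col in zip(*(seq.upper() for seq in alignment))
        if all(base in "ACGT" for base in col)
    ]
    if not columns:
        raise ValueError("no usable alignment columns")

    total_diffs = 0
    for i, j in combinations(range(n), 2):
        total_diffs += sum(1 for col in columns if col[i] != col[j])

    n_pairs = n * (n - 1) // 2
    return total_diffs / (n_pairs * len(columns))


if __name__ == "__main__":
    print(nucleotide_diversity(["ACGT", "ACGA", "ACCA"]))
```

Applied gene by gene to the aligned shared genes, a function like this would give one value per gene that can be ranked to nominate candidate barcode regions.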

3. Discussion

In Papaveraceae, although a large number of plastomes have been released, extremely reconfigured plastomes have only been reported in Corydalis, Lamprocapnos, and Fumaria [38,39,40,41], three genera of Fumarioideae. Based on dense sampling and comprehensive comparison, our study provided a valuable opportunity to further investigate the plastome variation in Papaveraceae. We identified more locally collinear blocks showing rearrangements and more genes that were lost, pseudogenized, or truncated in more lineages of Fumarioideae, such as Dactylicapnos. Notably, we report the plastome reconfiguration of Hypecooideae and Papaveroideae for the first time. We hypothesize that Papaveroideae plastomes are relatively conserved with the exception of Eomecon chionantha, while Fumarioideae and Hypecooideae usually harbor extremely reconfigured plastomes, which show a high level of variability in genome size, gene content, gene order, and rearrangements (Table 1; Figure 2 and Figure 3).
The largest (Corydalis sheareri, 219,144 bp) and smallest (Meconopsis integrifolia, 151,864 bp) plastome sizes differed significantly (~67 kb, Table S1) in Papaveraceae. We hypothesize that the genome size variation was mainly due to the IR expansion and the large number of repeat sequences. Moreover, we found that the high variability of the genome size was likely triggered by different factors in different lineages. The plastomes of Corydalis, Fumaria, Lamprocapnos, and Eomecon experienced an extreme expansion of their IR region into the SSC region, resulting in one very small SSC region (less than 10 kb) and two very large IR regions (approximately 38–62 kb), which further led to a substantial increase in the total genome size (Figure 3, Table 1). Additionally, more dispersed repeats were also detected in Corydalis, Fumaria, and Lamprocapnos (Figure 4A), which indicated that both the expansion of IRs and large numbers of repeats contributed to the increase in the genome size in these taxa. However, for Dactylicapnos and Ichtyoselmis, which exhibited a typical IR region (26–27 kb) and a larger SSC region of more than 22 kb (Table 1), a large number of repeats were detected in the SSC region (Table S4), indicating that their slightly larger plastomes were likely caused by the increase of repetitive sequences.
Apart from the variation in the genome size, the expansion of the IR region also significantly contributed to the gene content variation in Papaveraceae (Figure 1 and Figure 3, Table 1). Complete gene duplications (rpl32, trnL-UAG, ccsA, psaC, rps15, and ycf1) were documented due to the expansion of the IR region. In addition, gene loss or pseudogenization occurred frequently for the ndh genes, accD, clpP, infA, rpl2, rpl20, rpl32, and rps16, which also resulted in the variation in the gene content. The loss or pseudogenization of the ndhC, ndhJ, and ndhK genes was strongly correlated with the adjacent rearrangement of the LSC region, while the loss or pseudogenization of the remaining ndh genes might be related to the IR border shift (Figure 2 and Figure 3), which was similar to previous studies [38,43]. Our results suggested that the loss of the ndh genes mainly occurred in two species of Corydalis that belong to different subgenera [50,51], suggesting that there were at least two independent gene loss/pseudogenization events in Corydalis. The ndh genes encode subunits of the chloroplast NAD(P)H dehydrogenase (NDH) complex, which is involved in photosystem I (PSI) cyclic electron transfer and chlororespiration [52]. Wicke et al. [53] mentioned that ndh gene losses mainly occurred in some groups with a certain degree of heterotrophy, which might render the plastid-encoded ndh1 subunit dispensable, a phenomenon not commonly observed in seed plants. However, in recent years, an increasing number of autotrophic species have been reported to have lost the ndh genes, such as Erodium [54], Paphiopedilum [55], and Cycas [56]. The ndh genes have been suggested to be strongly related to the IR/SSC junction stability [57]. The IR boundary of Papaveraceae was very different from the typical angiosperm boundary and exhibited high diversity, particularly in Corydalis, the largest and most diverse genus of Papaveraceae. 
We inferred that the loss of the ndh genes was likely associated with the IR boundary stability in the poppy family. The accD gene encodes acetyl-CoA carboxylase, an enzyme that plays a critical role in plants, bacteria, and some eukaryotes [58], and the clpP gene encodes a protease that participates in regulating plant growth, development, photosynthesis, and responses to environmental stress [59,60,61]. In our results, the accD gene loss was observed in Corydalis, Fumaria, and Dactylicapnos, and the clpP gene was lost in Dactylicapnos and Eomecon. We inferred that the loss of the accD gene possibly occurred in the ancestors of Corydalis, Fumaria, and Dactylicapnos, while the clpP gene loss occurred multiple times in Papaveraceae.
Coincidentally, the species that lost the accD gene, such as Corydalis, Fumaria, and Dactylicapnos, all exhibited extensive rearrangement (Figure 2 and Figure 3). Moreover, one species from Papaveroideae (Eomecon chionantha) and one species from Fumarioideae (Dactylicapnos torulosa) lost the clpP gene. Of those, Eomecon chionantha was the only species that exhibited significant IR expansion and rearrangement in Papaveroideae. Additionally, clpP was observed as a pseudogene in Hypecoum erectum, one species from Hypecooideae, which also exhibited extensive rearrangement. The higher substitution rates in the accD and clpP genes were correlated with the structural variation in Hypericum [62], Fagopyrum [63,64], Oenothera [65], and Caprifoliaceae [66]. Given all the above evidence, we speculated that the loss of the accD and clpP genes might be related to plastome rearrangements in Papaveraceae. Moreover, the repetitive sequences in Fumarioideae and Hypecooideae were generally more abundant than those in Papaveroideae (Figure 4B), indicating that the recombination and instability of these repetitive sequences might also contribute to the plastome reconfiguration in Papaveraceae. In addition, the average GC content of Fumarioideae was slightly higher than that in Papaveroideae and Hypecooideae, particularly in Corydalis, and the GC content exceeded 40% for most species (Figure 1). In Papaveraceae, the GC content was strongly correlated with the size of the repeated sequences (Figure S2D), which indicated that the variation in the GC content might result from the extreme genome reorganization.

4. Materials and Methods

4.1. Plant Materials, Taxon Sampling, DNA Extraction, and Sequencing

A total of twenty-two species were sampled, spanning seventeen genera from three subfamilies (Fumarioideae, Hypecooideae, and Papaveroideae) of Papaveraceae. Pteridophylloideae was not sampled because it is narrowly distributed in certain regions of Japan. For the two species-rich subfamilies (Fumarioideae and Papaveroideae), five of six previously recognized tribes [26] were collected. For the largest genus, Corydalis, more than one species was sampled due to the extreme complexity of the structural variation reported in previous studies [39,40,41]. The six sampled Corydalis species covered all three subgenera and six major clades that were previously recognized [50]. Although a large number of plastomes were reported in previous studies [38,39,40,41,42,50], we independently sequenced and assembled a portion of the species to avoid assembly deviations caused by the different sequencing methods or software used in different studies. In total, thirteen species were newly sequenced, while another ten species, including one outgroup (Euptelea pleiosperma), were directly downloaded from the GenBank database (https://www.ncbi.nlm.nih.gov/, accessed on 1 December 2023) or retrieved from our previous studies [51] (Table S1). Total DNA was extracted from silica gel-dried leaves using the modified CTAB (cetyltrimethylammonium bromide) method [67]. The library was constructed with an insert size of approximately 350 bp using the MGIEasy DNA library preparation kit (Beijing Genomics Institution, Shenzhen, China) following the manufacturer’s instructions. Sequencing was carried out on the BGISEQ-500 platform at BGI, generating 150 bp paired-end (PE) reads. All the raw data have been submitted to the SRA database, and the accession numbers are provided in Table S1.

4.2. Plastome Assembly, Annotation, and Plastid Gene Extraction

Clean data were obtained using SOAPnuke [68] with the default parameters to remove adapters and low-quality reads. The data quality was assessed using FastQC v0.12.1 [69]. Next, the filtered clean reads were assembled de novo using GetOrganelle v1.7.5 [70], and Bandage v0.8.1 [71] was then used to inspect and adjust the assembly graphs. Each assembled plastome was annotated using PGA [72], with the plastome of Amborella trichopoda (AJ506156) as the reference. The start/stop codons and intron/exon boundaries of genes were manually adjusted based on the reference sequences using Geneious Prime v2023.2 (https://www.geneious.com/features/#sequence-analysis, accessed on 1 December 2023). The PCGs, rRNA genes, and tRNA genes were separately extracted from the annotated plastomes with Geneious Prime. Each gene was aligned using MAFFT v7.450 [73].

4.3. Phylogenetic Analyses

Maximum likelihood (ML) analysis was conducted based on 98 concatenated plastid genes with IQ-TREE v1.6.12 [74]. ModelFinder [75] was used to select the best-fit model, and branch support was assessed with 1000 standard bootstrap replicates. The resulting tree was then visualized and manually edited with FigTree v1.4.4 (http://tree.bio.ed.ac.uk/software/figtree, accessed on 1 December 2023).

4.4. Genome Structure Analyses

To identify potential genomic rearrangements and locally collinear blocks (LCBs), the plastomes were compared using the “progressiveMauve” algorithm implemented in Mauve v2.4.0 [76], with the plastome of Euptelea pleiosperma (NC_029429) as the reference. CPJSdraw v1.0.0 [77] was employed to assess the expansion and contraction of the IR regions.

4.5. Codon Usage and Repeat Sequence Analysis

Due to the degeneracy of the genetic code, most amino acids can be encoded by multiple synonymous codons, and these synonymous codons are not necessarily used at equal frequencies. Synonymous codon usage bias (SCUB) is species-specific and varies within or among genomes [78]. The relative synonymous codon usage (RSCU) is the ratio of the observed frequency of a codon to the frequency expected if all synonymous codons for that amino acid were used equally, and it serves as a standard measure of codon preference. We selected 60 shared PCGs and conducted nucleotide composition analysis using CodonW v1.4.2 (https://sourceforge.net/projects/codonw/, accessed on 1 December 2023). Simple sequence repeats (SSRs) were identified using MISA v2.0 [79], with the minimum number of repeats for mononucleotide, dinucleotide, trinucleotide, tetranucleotide, pentanucleotide, and hexanucleotide motifs set to 10, 6, 5, 5, 5, and 5, respectively. Tandem repeat sequences were detected using Tandem Repeats Finder v4.09 [80]. The alignment parameters match, mismatch, delta, match probability, indel probability, minimum alignment score, and maximum period size were set to 2, 7, 7, 10, 50, 80, and 500, respectively. REPuter [81] was used to detect forward, reverse, complement, and palindromic dispersed repeats, with a minimum repeat size of 30 bp and a Hamming distance of 3. To assess the correlation between genome size and repeat sequences, Spearman correlation analysis was performed in SPSS v27.0 [82] under the default settings, and the strength of the correlation was classified as follows: negligible or very weak (0.1–0.19), weak (0.20–0.29), moderate (0.30–0.39), strong (0.4–0.69), very strong (0.70–0.99), and perfect (1.0) [83].
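The RSCU measure described above can be illustrated with a minimal Python sketch. This is not the CodonW implementation used in this study; the function name `rscu`, the exclusion of stop codons, and the skipping of codons containing ambiguous bases are assumptions made for illustration only. The codon assignments are those of the standard genetic code, which plastid translation table 11 shares.

```python
from collections import Counter
from itertools import product

BASES = "TCAG"
# Standard genetic code, codons enumerated in TCAG order (TTT, TTC, TTA, ...).
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def rscu(cds_list):
    """Return {codon: RSCU} for in-frame coding sequences (illustrative helper).

    RSCU = observed codon count / (count expected if all synonymous
    codons of that amino acid were used equally).
    """
    counts = Counter()
    for cds in cds_list:
        seq = cds.upper().replace("U", "T")
        usable = len(seq) - len(seq) % 3  # ignore a trailing partial codon
        for i in range(0, usable, 3):
            codon = seq[i:i + 3]
            if codon in CODON_TABLE:  # skips codons with ambiguous bases
                counts[codon] += 1

    # Group codons into synonymous families by amino acid.
    families = {}
    for codon, aa in CODON_TABLE.items():
        families.setdefault(aa, []).append(codon)

    values = {}
    for aa, codons in families.items():
        if aa == "*":  # assumption: stop codons are left out
            continue
        total = sum(counts[c] for c in codons)
        if total == 0:  # amino acid absent from the input
            continue
        expected = total / len(codons)
        for c in codons:
            values[c] = counts[c] / expected
    return values


if __name__ == "__main__":
    example = ["ATGGCTGCTGCTGCATAA"]  # toy sequence, not real data
    for codon, value in sorted(rscu(example).items()):
        print(codon, round(value, 2))
```

Under this definition, an RSCU above 1 means a codon is used more often than expected under equal usage of its synonymous codons, and a value below 1 means it is used less often.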

4.6. Nucleotide Diversity and Positive Selection Analyses

The nucleotide diversity of each gene was calculated using DnaSP v6.0 [84]. PAML v4.9 [85] was used to calculate the non-synonymous substitution rate (dN) and synonymous substitution rate (dS) of the coding DNA sequences (CDS) under Model 0. Values of dN/dS > 1, dN/dS = 1, and dN/dS < 1 suggest positive selection, neutral evolution, and purifying selection, respectively.
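For readers unfamiliar with the statistic, nucleotide diversity (Pi) is the average proportion of sites that differ between two sequences, taken over all pairs of sequences in an alignment. The following Python sketch illustrates the calculation only; it is not the DnaSP procedure used here. The function name `nucleotide_diversity` is hypothetical, and excluding every alignment column that contains a gap or ambiguous base is an assumption that may differ from how DnaSP treats such sites.

```python
from itertools import combinations


def nucleotide_diversity(aligned_seqs):
    """Average pairwise proportion of differing sites (illustrative helper)."""
    if len(aligned_seqs) < 2:
        raise ValueError("at least two aligned sequences are required")
    length = len(aligned_seqs[0])
    if any(len(s) != length for s in aligned_seqs):
        raise ValueError("sequences must be aligned to the same length")

    seqs = [s.upper() for s in aligned_seqs]
    # Assumption: use only columns where every sequence has A, C, G, or T.
    columns = [i for i in range(length) if all(s[i] in "ACGT" for s in seqs)]
    if not columns:
        raise ValueError("no comparable sites in the alignment")

    pairs = list(combinations(seqs, 2))
    differences = sum(
        sum(a[i] != b[i] for i in columns) for a, b in pairs
    )
    return differences / (len(pairs) * len(columns))


if __name__ == "__main__":
    toy_alignment = ["ACGT", "ACGA", "ACGT"]  # toy data, not from this study
    print(nucleotide_diversity(toy_alignment))
```

In the toy alignment, two of the three sequence pairs differ at one of four sites, so the value is 2 / (3 × 4).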

Supplementary Materials

The supporting information can be downloaded at: https://www.mdpi.com/article/10.3390/ijms25042278/s1.

Author Contributions

Conceptualization, Y.L., J.L. and H.W.; data curation, J.C., Y.C., S.K. and Y.L.; formal analysis, J.C., Y.L., Y.C. and S.K.; writing—original draft preparation, Y.L. and J.C.; writing—review and editing, Y.L., J.L. and H.W.; funding acquisition, Y.L. and J.L.; investigation, J.C. and Y.L.; supervision, Y.L., J.L. and H.W.; project administration, Y.L., J.L. and H.W.; visualization, J.C., S.K. and Y.L. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the Science and Technology Planning Project of Henan Province of China (222102110255) and the National Natural Science Foundation of China (32000170).

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

All analyzed data for this study are included in the contents of this article and Supplementary Materials.

Acknowledgments

We thank Min Chen (Institute of Botany, Jiangsu Province and Chinese Academy of Sciences) and Xuling Chen (Emeishan Botanical Garden) for their help in the sample collections.

Conflicts of Interest

The authors declare no conflicts of interest.

References

  1. Daniell, H.; Lin, C.-S.; Yu, M.; Chang, W.-J. Chloroplast genomes: Diversity, evolution, and applications in genetic engineering. Genome Biol. 2016, 17, 134.
  2. Jansen, R.K.; Cai, Z.; Raubeson, L.A.; Daniell, H.; Depamphilis, C.W.; Leebens-Mack, J.; Müller, K.F.; Guisinger-Bellian, M.; Haberle, R.C.; Hansen, A.K.; et al. Analysis of 81 genes from 64 plastid genomes resolves relationships in angiosperms and identifies genome-scale evolutionary patterns. Proc. Natl. Acad. Sci. USA 2007, 104, 19369–19374.
  3. Mower, J.P.; Vickrey, T.L. Structural diversity among plastid genomes of land plants. Adv. Bot. Res. 2018, 85, 263–292.
  4. Maier, R.M.; Neckermann, K.; Igloi, G.L.; Kössel, H. Complete sequence of the maize chloroplast genome: Gene content, hotspots of divergence and fine tuning of genetic information by transcript editing. J. Mol. Biol. 1995, 251, 614–628.
  5. Wang, J.; Kan, S.L.; Liao, X.Z.; Zhou, J.W.; Tembrock, L.R.; Daniell, H.; Jin, S.X.; Wu, Z.Q. Plant organellar genomes: Much done, much more to do. Trends Plant Sci. 2024.
  6. Bai, H.-R.; Oyebanji, O.; Zhang, R.; Yi, T.S. Plastid phylogenomic insights into the evolution of subfamily Dialioideae (Leguminosae). Plant Divers. 2021, 43, 27–34.
  7. Cai, Z.; Guisinger, M.; Kim, H.-G.; Ruck, E.; Blazier, J.C.; McMurtry, V.; Kuehl, J.V.; Boore, J.; Jansen, R.K. Extensive reorganization of the plastid genome of Trifolium subterraneum (Fabaceae) is associated with numerous repeated sequences and novel DNA insertions. J. Mol. Evol. 2008, 67, 696–704.
  8. Kolodner, R.; Tewari, K.K. Inverted repeats in chloroplast DNA from higher plants. Proc. Natl. Acad. Sci. USA 1979, 76, 41–45.
  9. Palmer, J.D.; Osorio, B.; Thompson, W.F. Evolutionary significance of inversions in legume chloroplast DNAs. Curr. Genet. 1988, 14, 65–74.
  10. Tangphatsornruang, S.; Sangsrakru, D.; Chanprasert, J.; Uthaipaisanwong, P.; Yoocha, T.; Jomchai, N.; Tragoonrung, S. The chloroplast genome sequence of mungbean (Vigna radiata) determined by high-throughput pyrosequencing: Structural organization and phylogenetic relationships. DNA Res. 2010, 17, 11–22.
  11. Blazier, J.C.; Jansen, R.K.; Mower, J.P.; Govindu, M.; Zhang, J.; Weng, M.-L.; Ruhlman, T.A. Variable presence of the inverted repeat and plastome stability in Erodium. Ann. Bot. 2016, 117, 1209–1220.
  12. Guisinger, M.; Kuehl, J.; Boore, J.; Jansen, R. Extreme reconfiguration of plastid genomes in the angiosperm family Geraniaceae: Rearrangements, repeats, and codon usage. Mol. Biol. Evol. 2011, 28, 583–600.
  13. Ruhlman, T.A.; Zhang, J.; Blazier, J.C.; Sabir, J.S.M.; Jansen, R.K. Recombination-dependent replication and gene conversion homogenize repeat sequences and diversify plastid genome structure. Am. J. Bot. 2017, 104, 559–572.
  14. Weng, M.L.; Blazier, J.C.; Govindu, M.; Jansen, R.K. Reconstruction of the ancestral plastid genome in Geraniaceae reveals a correlation between genome rearrangements, repeats, and nucleotide substitution rates. Mol. Biol. Evol. 2013, 31, 645–659.
  15. Amaral, D.T.; Bombonato, J.R.; da Silva Andrade, S.C.; Moraes, E.M.; Franco, F.F. The genome of a thorny species: Comparative genomic analysis among South and North American Cactaceae. Planta 2021, 254, 44.
  16. de Almeida, E.M.; Sader, M.A.; Rodriguez, P.E.; Loeuille, B.; Felix, L.P.; Pedrosa-Harand, A. Assembling the puzzle: Complete chloroplast genome sequences of Discocactus bahiensis Britton & Rose and Melocactus ernestii Vaupel (Cactaceae) and their evolutionary significance. Braz. J. Bot. 2021, 44, 877–888.
  17. Cosner, M.E.; Raubeson, L.A.; Jansen, R.K. Chloroplast DNA rearrangements in Campanulaceae: Phylogenetic utility of highly rearranged genomes. BMC Evol. Biol. 2004, 4, 27.
  18. Haberle, R.C.; Fourcade, H.M.; Boore, J.L.; Jansen, R.K. Extensive rearrangements in the chloroplast genome of Trachelium caeruleum are associated with repeats and tRNA genes. J. Mol. Evol. 2008, 66, 350–361.
  19. Cauz dos Santos, L.; Munhoz, C.; Rodde, N.; Cauet, S.; Santos, A.; Penha, H.; Dornelas, M.; Varani, A.; Oliveira, G.; Bergès, H.; et al. The chloroplast genome of Passiflora edulis (Passifloraceae) assembled from long sequence reads: Structural organization and phylogenomic studies in Malpighiales. Front. Plant Sci. 2017, 8, 334.
  20. Cauz dos Santos, L.; Portugal, Z.; Callot, C.; Cauet, S.; Zucchi, M.I.; Bergès, H.; van den Berg, C.; Vieira, M.-L. A repertory of rearrangements and the loss of an inverted repeat region in Passiflora chloroplast genomes. Genome Biol. Evol. 2020, 12, 1841–1857.
  21. Rabah, S.O.; Shrestha, B.; Hajrah, H.N.; Sabir, M.J.; Alharby, H.F.; Sabir, M.J.; Alhebshi, A.M.; Sabir, L.S.M.; Gilbert, L.E.; Ruhlman, T.A.; et al. Passiflora plastome sequencing reveals widespread genomic rearrangements. J. Syst. Evol. 2019, 57, 1–14.
  22. Shrestha, B.; Weng, M.-L.; Theriot, E.C.; Gilbert, L.E.; Ruhlman, T.A.; Krosnick, S.E.; Jansen, R.K. Highly accelerated rates of genomic rearrangements and nucleotide substitutions in plastid genomes of Passiflora subgenus Decaloba. Mol. Phylogenet. Evol. 2019, 138, 53–64.
  23. Li, B.; Zheng, Y. Dynamic evolution and phylogenomic analysis of the chloroplast genome in Schisandraceae. Sci. Rep. 2018, 8, 9285.
  24. Song, Y.; Yu, W.B.; Tan, Y.; Liu, B.; Yao, X.; Jin, J.; Padmanaba, M.; Yang, J.B.; Corlett, R. Evolutionary comparisons of the chloroplast genome in Lauraceae and insights into loss events in the Magnoliids. Genome Biol. Evol. 2017, 9, 2354–2364.
  25. Palmer, J.D.; Thompson, W.F.J.C. Chloroplast DNA rearrangements are more frequent when a large inverted repeat sequence is lost. Cell 1982, 29, 537–550.
  26. Peng, H.W.; Xiang, K.L.; Erst, A.S.; Lian, L.; Ortiz, R.D.C.; Jabbour, F.; Chen, Z.D.; Wang, W. A complete genus-level phylogeny reveals the Cretaceous biogeographic diversification of the poppy family. Mol. Phylogenet. Evol. 2023, 181, 107712.
  27. Jacomet, S. Plant economy and village life in Neolithic lake dwellings at the time of the Alpine Iceman. Veg. Hist. Archaeobot. 2009, 18, 47–59.
  28. Chen, N.; Qi, Y.; Ma, X.; Xiao, X.; Liu, Q.; Xia, T.; Xiang, J.; Zeng, J.; Tang, J. Rediscovery of traditional plant medicine: An underestimated anticancer drug of chelerythrine. Front. Pharmacol. 2022, 13, 906301.
  29. Inada, M.; Shindo, M.; Kobayashi, K.; Sato, A.; Yamamoto, Y.; Akasaki, Y.; Ichimura, K.; Tanuma, S.I. Anticancer effects of a non-narcotic opium alkaloid medicine, papaverine, in human glioblastoma cells. PLoS ONE 2019, 14, e0216358.
  30. Kilic, M.; Sener, B.; Kaya, E. Anti-cholinergic activities of Turkish Corydalis DC. species. Phytochem. Lett. 2021, 45, 142–156.
  31. Lou, G.; Wang, J.; Hu, J.; Gan, Q.; Peng, C.; Xiong, H.; Huang, Q. Sanguinarine: A double-edged sword of anticancer and carcinogenesis and its future application prospect. Anti-Cancer Agent. Med. Chem. 2021, 21, 2100–2110.
  32. Zhang, B.; Huang, R.; Hua, J.; Liang, H.; Pan, Y.; Dai, L.; Liang, D.; Wang, H. Antitumor lignanamides from the aerial parts of Corydalis saxicola. Phytomedicine 2016, 23, 1599–1609.
  33. Sánchez-Arreola, E.; Hernandez, L.; Sanchez-Salas, J.; Martínez-Espino, G. Alkaloids from Bocconia frutescens and biological activity of their extracts. Pharm. Biol. 2008, 44, 540–543.
  34. Qin, F.; Chen, Y.; Wang, F.F.; Tang, S.Q.; Fang, Y.L. Corydalis saxicola Bunting: A review of its traditional uses, phytochemistry, pharmacology, and clinical applications. Int. J. Mol. Sci. 2023, 24, 1626.
  35. Hoot, S.B.; Kadereit, J.W.; Blattner, F.R.; Jork, K.B.; Schwarzbach, A.E.; Crane, P.R. Data congruence and phylogeny of the Papaveraceae s.l. based on four data sets: atpB and rbcL sequences, trnK restriction sites, and morphological characters. Syst. Bot. 1997, 22, 575–590.
  36. Wang, W.; Lu, A.M.; Ren, Y.; Endress, M.E.; Chen, Z.D. Phylogeny and classification of Ranunculales: Evidence from four molecular loci and morphological data. Perspect. Plant Ecol. Evol. Syst. 2009, 11, 81–110.
  37. Hoot, S.; Wefferling, K.; Wulff, J. Phylogeny and character evolution of Papaveraceae s. l. (Ranunculales). Syst. Bot. 2015, 40, 474–488.
  38. Park, S.; An, B.; Park, S. Reconfiguration of the plastid genome in Lamprocapnos spectabilis: IR boundary shifting, inversion, and intraspecific variation. Sci. Rep. 2018, 8, 13568.
  39. Xu, X.; Wang, D. Comparative chloroplast genomics of Corydalis species (Papaveraceae): Evolutionary perspectives on their unusual large-scale rearrangements. Front. Plant Sci. 2021, 11, 600354.
  40. Raman, G.; Nam, G.H.; Park, S. Extensive reorganization of the chloroplast genome of Corydalis platycarpa: A comparative analysis of their organization and evolution with other Corydalis plastomes. Front. Plant Sci. 2022, 13, 1043740.
  41. Kim, S.C.; Ha, Y.H.; Park, B.K.; Jang, J.E.; Kang, E.S.; Kim, Y.S.; Kimspe, T.H.; Kim, H.J. Comparative analysis of the complete chloroplast genome of Papaveraceae to identify rearrangements within the Corydalis chloroplast genome. PLoS ONE 2023, 18, e0289625.
  42. Wang, L.; Li, F.; Wang, N.; Gao, Y.; Liu, K.; Zhang, G.; Sun, J. Characterization of the Dicranostigma leptopodum chloroplast genome and comparative analysis within subfamily Papaveroideae. BMC Genom. 2022, 23, 794.
  43. Sun, Y.; Moore, M.J.; Lin, N.; Adelalu, K.F.; Meng, A.; Jian, S.; Yang, L.; Li, J.; Wang, H. Complete plastome sequencing of both living species of Circaeasteraceae (Ranunculales) reveals unusual rearrangements and the loss of the ndh gene family. BMC Genom. 2017, 18, 592.
  44. Ji, J.; Luo, Y.; Pei, L.; Li, M.; Xiao, J.; Li, W.; Wu, H.; Luo, Y.; He, J.; Cheng, J.; et al. Complete plastid genomes of nine species of Ranunculeae (Ranunculaceae) and their phylogenetic inferences. Genes 2023, 14, 2140.
  45. Szczecińska, M.; Sawicki, J. Genomic resources of three Pulsatilla species reveal evolutionary hotspots, species-specific sites and variable plastid structure in the family Ranunculaceae. Int. J. Mol. Sci. 2015, 16, 22258–22279.
  46. Nyamgerel, N.; Baasanmunkh, S.; Oyuntsetseg, B.; Bayarmaa, G.A.; Erst, A.S.; Park, I.; Choi, H.J.J. Insight into chloroplast genome structural variation of the Mongolian endemic species Adonis mongolica (Ranunculaceae) in the Adonideae tribe. Sci. Rep. 2023, 13, 22014.
  47. He, J.; Yao, M.; Lyu, R.D.; Lin, L.L.; Liu, H.J.; Pei, L.Y.; Yan, S.X.; Xie, L.; Cheng, J. Structural variation of the complete chloroplast genome and plastid phylogenomics of the genus Asteropyrum (Ranunculaceae). Sci. Rep. 2019, 9, 15285.
  48. Tong, R.; Gui, C.; Zhang, Y.; Su, N.; Hou, X.; Liu, M.; Yang, Z.; Kang, B.; Chang, Z.; Jabbour, F.; et al. Phylogenomics, plastome structure and species identification in Mahonia (Berberidaceae). BMC Genom. 2022, 23, 766.
  49. Wu, C.S.; Wang, Y.N.; Hsu, C.Y.; Lin, C.P.; Chaw, S.M. Loss of different inverted repeat copies from the chloroplast genomes of Pinaceae and cupressophytes and influence of heterotachy on the evaluation of gymnosperm phylogeny. Genome Biol. Evol. 2011, 3, 1284–1295.
  50. Chen, J.T.; Lidén, M.; Huang, X.H.; Zhang, L.; Zhang, X.J.; Kuang, T.H.; Landis, J.B.; Wang, D.; Deng, T.; Sun, H. An updated classification for the hyper-diverse genus Corydalis (Papaveraceae: Fumarioideae) based on phylogenomic and morphological evidence. J. Integr. Plant Biol. 2023, 65, 2138–2156.
  51. Liu, Y.Y.; Cao, J.L.; Kan, S.L.; Wang, P.H.; Wang, J.L.; Cao, Y.N.; Wang, H.W.; Li, J.M. Phylogenomic analyses sheds new light on the phylogeny and diversification of Corydalis DC in Himalaya–Hengduan Mountains and adjacent regions. Mol. Phylogenet. Evol. 2024, in press.
  52. Peng, L.; Yamamoto, H.; Shikanai, T. Structure and biogenesis of the chloroplast NAD(P)H dehydrogenase complex. Biochim. Biophys. Acta 2011, 1807, 945–953.
  53. Wicke, S.; Schneeweiss, G.M.; dePamphilis, C.W.; Müller, K.F.; Quandt, D. The evolution of the plastid chromosome in land plants: Gene content, gene order, gene function. Plant Mol. Biol. 2011, 76, 273–297.
  54. Chris Blazier, J.; Guisinger, M.M.; Jansen, R.K. Recent loss of plastid-encoded ndh genes within Erodium (Geraniaceae). Plant Mol. Biol. 2011, 76, 263–272.
  55. Guo, Y.Y.; Yang, J.X.; Bai, M.-Z.; Zhang, G.Q.; Liu, Z.J. The chloroplast genome evolution of Venus slipper (Paphiopedilum): IR expansion, SSC contraction, and highly rearranged SSC regions. BMC Plant Biol. 2021, 21, 248.
  56. Wu, C.S.; Wang, Y.N.; Liu, S.M.; Chaw, S.M. Chloroplast genome (cpDNA) of Cycas taitungensis and 56 cp protein-coding genes of Gnetum parvifolium: Insights into cpDNA evolution and phylogeny of extant seed plants. Mol. Biol. Evol. 2007, 24, 1366–1379.
  57. Kim, H.T.; Kim, J.S.; Moore, M.J.; Neubig, K.M.; Williams, N.H.; Whitten, W.M.; Kim, J.H. Seven new complete plastome sequences reveal rampant independent loss of the ndh gene family across Orchids and associated instability of the inverted repeat/small single-copy region boundaries. PLoS ONE 2015, 10, e0142215.
  58. Caroca, R.; Howell, K.A.; Malinova, I.; Burgos, A.; Tiller, N.; Pellizzer, T.; Annunziata, M.G.; Hasse, C.; Ruf, S.; Karcher, D.; et al. Knockdown of the plastid-encoded acetyl-CoA carboxylase gene uncovers functions in metabolism and development. Plant Physiol. 2021, 185, 1091–1110.
  59. Clarke, A.; Stanne, T.; Sjögren, L. The ATP-dependent Clp protease in chloroplasts of higher plants. Physiol. Plant 2005, 123, 406–412.
  60. Shikanai, T.; Shimizu, K.; Ueda, K.; Nishimura, Y.; Kuroiwa, T.; Hashimoto, T. The chloroplast clpP gene, encoding a proteolytic subunit of ATP-dependent protease, is indispensable for chloroplast development in tobacco. Plant Cell Physiol. 2001, 42, 264–273.
  61. Kim, J.; Olinares, P.D.; Oh, S.H.; Ghisaura, S.; Poliakov, A.; Ponnala, L.; van Wijk, K.J. Modified Clp protease complex in the ClpP3 null mutant and consequences for chloroplast development and function in Arabidopsis. Plant Physiol. 2013, 162, 157–179.
  62. Claude, S.J.; Park, S.; Park, S. Gene loss, genome rearrangement, and accelerated substitution rates in plastid genome of Hypericum ascyron (Hypericaceae). BMC Plant Biol. 2022, 22, 135.
  63. Cho, K.S.; Yun, B.K.; Yoon, Y.H.; Hong, S.Y.; Mekapogu, M.; Kim, K.H.; Yang, T.J. Complete chloroplast genome sequence of tartary buckwheat (Fagopyrum tataricum) and comparative analysis with common buckwheat (F. esculentum). PLoS ONE 2015, 10, e0125332.
  64. Yamane, K.; Yasui, Y.; Ohnishi, O. Intraspecific cpDNA variations of diploid and tetraploid perennial buckwheat, Fagopyrum cymosum (Polygonaceae). Am. J. Bot. 2003, 90, 339–346.
  65. Erixon, P.; Oxelman, B. Whole-gene positive selection, elevated synonymous substitution rates, duplication, and indel evolution of the chloroplast clpP1 gene. PLoS ONE 2008, 3, e1386.
  66. Park, S.; Jun, M.; Park, S.; Park, S. Lineage-specific variation in IR boundary shift events, inversions, and substitution rates among Caprifoliaceae s.l. (Dipsacales) plastomes. Int. J. Mol. Sci. 2021, 22, 10485.
  67. Rogers, S.O.; Bendich, A.J. Extraction of DNA from milligram amounts of fresh, herbarium and mummified plant tissues. Plant Mol. Biol. 1985, 5, 69–76.
  68. Chen, Y.; Chen, Y.; Shi, C.; Huang, Z.; Zhang, Y.; Li, S.; Li, Y.; Ye, J.; Yu, C.; Li, Z.; et al. SOAPnuke: A MapReduce acceleration-supported software for integrated quality control and preprocessing of high-throughput sequencing data. GigaScience 2017, 7, 1–6.
  69. Andrews, S. FastQC: A Quality Control Tool for High Throughput Sequence Data. 2014. Available online: http://www.bioinformatics.babraham.ac.uk/projects/fastqc/ (accessed on 1 December 2023).
  70. Jin, J.J.; Yu, W.B.; Yang, J.B.; Song, Y.; dePamphilis, C.W.; Yi, T.S.; Li, D.Z. GetOrganelle: A fast and versatile toolkit for accurate de novo assembly of organelle genomes. Genome Biol. 2020, 21, 241.
  71. Wick, R.R.; Schultz, M.B.; Zobel, J.; Holt, K.E. Bandage: Interactive visualization of de novo genome assemblies. Bioinformatics 2015, 31, 3350–3352.
  72. Qu, X.J.; Moore, M.J.; Li, D.Z.; Yi, T.S. PGA: A software package for rapid, accurate, and flexible batch annotation of plastomes. Plant Methods 2019, 15, 50.
  73. Katoh, K.; Standley, D.M. MAFFT multiple sequence alignment software version 7: Improvements in performance and usability. Mol. Biol. Evol. 2013, 30, 772–780.
  74. Nguyen, L.T.; Schmidt, H.A.; von Haeseler, A.; Minh, B.Q. IQ-TREE: A fast and effective stochastic algorithm for estimating maximum-likelihood phylogenies. Mol. Biol. Evol. 2015, 32, 268–274.
  75. Kalyaanamoorthy, S.; Minh, B.Q.; Wong, T.K.F.; von Haeseler, A.; Jermiin, L.S. ModelFinder: Fast model selection for accurate phylogenetic estimates. Nat. Methods 2017, 14, 587–589.
  76. Darling, A.C.; Mau, B.; Blattner, F.R.; Perna, N.T. Mauve: Multiple alignment of conserved genomic sequence with rearrangements. Genome Res. 2004, 14, 1394–1403.
  77. Li, H.; Guo, Q.; Xu, L.; Gao, H.; Liu, L.; Zhou, X. CPJSdraw: Analysis and visualization of junction sites of chloroplast genomes. PeerJ 2023, 11, e15326.
  78. Grantham, R. Working of the genetic code. Trends Biochem. Sci. 1980, 5, 327–331.
  79. Beier, S.; Thiel, T.; Münch, T.; Scholz, U.; Mascher, M. MISA-web: A web server for microsatellite prediction. Bioinformatics 2017, 33, 2583–2585.
  80. Benson, G. Tandem repeats finder: A program to analyze DNA sequences. Nucleic Acids Res. 1999, 27, 573–580.
  81. Kurtz, S.; Choudhuri, J.V.; Ohlebusch, E.; Schleiermacher, C.; Stoye, J.; Giegerich, R. REPuter: The manifold applications of repeat analysis on a genomic scale. Nucleic Acids Res. 2001, 29, 4633–4642.
  82. IBM Corp. IBM SPSS Statistics for Windows, Version 22.0; IBM Corp.: Armonk, NY, USA, 2013.
  83. Akoglu, H. User’s guide to correlation coefficients. Turkish J. Emerg. Med. 2018, 18, 91–93.
  84. Rozas, J.; Ferrer-Mata, A.; Sánchez-DelBarrio, J.C.; Guirao-Rico, S.; Librado, P.; Ramos-Onsins, S.E.; Sánchez-Gracia, A. DnaSP 6: DNA sequence polymorphism analysis of large data sets. Mol. Biol. Evol. 2017, 34, 3299–3302.
  85. Yang, Z. PAML 4: Phylogenetic analysis by maximum likelihood. Mol. Biol. Evol. 2007, 24, 1586–1591.
Figure 1. Gene loss and GC content of the Papaveraceae plastomes. The left tree was constructed for twenty-three taxa based on ninety-one common unique plastid genes with maximum likelihood (ML) analyses. Asterisks (*) represent 100% bootstrap value.
Figure 2. Structural alignments of Papaveraceae plastomes using Mauve with Euptelea pleiosperma as the reference. Colored blocks represent locally collinear blocks (LCBs), and lines connecting the blocks indicate homology. Only one copy of the inverted repeat (IR) is shown, and the pink boxes below each plastome indicate its IR region.
Figure 3. Comparison of the borders of LSC, SSC, and IR regions among Papaveraceae plastomes. The distance in the figure is not to scale.
Figure 4. The histograms indicate the number of repetitive sequences ((A) left, SSRs; middle, dispersed repeats; right, tandem repeats), and the scatter plots represent the correlation between the number of repetitive sequences and genome size (B–E).
Figure 5. Nucleotide diversity (Pi) of 91 common plastid genes in Papaveraceae.
Figure 6. The dN/dS ratio of PCGs in Papaveraceae.
Table 1. Characteristics of Papaveraceae plastomes.

| Species | Subfamily | Genome Size (bp) | LSC (bp) | IR (bp) | SSC (bp) | GC (%) | PCG | tRNA | rRNA |
|---|---|---|---|---|---|---|---|---|---|
| Corydalis triternatifolia | Fumarioideae | 161,946 | 84,522 | 38,486 | 452 | 38.6 | 85 | 37 | 8 |
| Corydalis sheareri | Fumarioideae | 219,144 | 94,030 | 62,384 | 346 | 40.4 | 99 | 44 | 8 |
| Corydalis gamosepala | Fumarioideae | 198,140 | 92,169 | 52,590 | 791 | 40.5 | 95 | 40 | 8 |
| Corydalis longicalcarata | Fumarioideae | 195,842 | 86,954 | 54,197 | 494 | 40.3 | 96 | 38 | 8 |
| Corydalis tomentella | Fumarioideae | 189,383 | 95,825 | 41,947 | 9664 | 40.2 | 94 | 38 | 8 |
| Corydalis adunca | Fumarioideae | 186,049 | 100,365 | 38,100 | 9484 | 41.1 | 92 | 36 | 8 |
| Fumaria schleicheri | Fumarioideae | 190,324 | 86,789 | 48,833 | 5869 | 40.3 | 97 | 40 | 8 |
| Dactylicapnos torulosa | Fumarioideae | 160,598 | 76,421 | 26,238 | 31,701 | 40.3 | 96 | 40 | 8 |
| Ichtyoselmis macrantha | Fumarioideae | 164,552 | 88,457 | 27,031 | 22,033 | 39.9 | 89 | 37 | 8 |
| Lamprocapnos spectabilis | Fumarioideae | 188,795 | 84,398 | 51,376 | 1645 | 39.2 | 100 | 42 | 8 |
| Hypecoum erectum | Hypecooideae | 163,471 | 93,182 | 27,885 | 14,519 | 38.2 | 86 | 38 | 8 |
| Hylomecon japonica | Papaveroideae | 160,071 | 88,293 | 26,712 | 18,354 | 38.8 | 88 | 37 | 8 |
| Coreanomecon hylomeconoides | Papaveroideae | 158,824 | 86,914 | 26,686 | 18,538 | 38.7 | 89 | 37 | 8 |
| Chelidonium majus | Papaveroideae | 159,734 | 87,696 | 26,735 | 18,568 | 38.7 | 89 | 37 | 8 |
| Dicranostigma leptopodum | Papaveroideae | 163,173 | 87,793 | 28,309 | 18,762 | 39.4 | 91 | 37 | 8 |
| Macleaya cordata | Papaveroideae | 163,107 | 87,919 | 28,303 | 18,582 | 38.6 | 91 | 37 | 8 |
| Eomecon chionantha | Papaveroideae | 178,768 | 88,302 | 45,165 | 136 | 37.9 | 99 | 38 | 8 |
| Sanguinaria canadensis | Papaveroideae | 161,083 | 87,980 | 27,372 | 18,359 | 38.5 | 87 | 37 | 8 |
| Stylophorum lasiocarpum | Papaveroideae | 153,196 | 83,230 | 25,789 | 18,388 | 38.9 | 89 | 37 | 8 |
| Papaver nudicaule | Papaveroideae | 152,867 | 83,104 | 25,731 | 18,301 | 38.9 | 85 | 37 | 8 |
| Meconopsis integrifolia | Papaveroideae | 151,864 | 82,809 | 25,653 | 17,749 | 38.8 | 87 | 37 | 8 |
| Eschscholzia californica | Papaveroideae | 160,201 | 88,147 | 26,781 | 18,492 | 38.7 | 88 | 37 | 8 |
Share and Cite

MDPI and ACS Style

Cao, J.; Wang, H.; Cao, Y.; Kan, S.; Li, J.; Liu, Y. Extreme Reconfiguration of Plastid Genomes in Papaveraceae: Rearrangements, Gene Loss, Pseudogenization, IR Expansion, and Repeats. Int. J. Mol. Sci. 2024, 25, 2278. https://doi.org/10.3390/ijms25042278
