Article

Integrative Analysis of Oleosin Genes Provides Insights into Lineage-Specific Family Evolution in Brassicales

1 National Key Laboratory for Tropical Crop Breeding, Hainan Key Laboratory for Biosafety Monitoring and Molecular Breeding in Off-Season Reproduction Regions, Institute of Tropical Biosciences and Biotechnology/Sanya Research Institute of Chinese Academy of Tropical Agricultural Sciences, Haikou 571101, China
2 Hubei Provincial Key Laboratory for Protection and Application of Special Plants in Wuling Area of China, College of Life Science, South-Central University for Nationalities, Wuhan 430074, China
3 College of Biology and Food Engineering, Guangdong University of Petrochemical Technology, Maoming 525011, China
* Authors to whom correspondence should be addressed.
Plants 2024, 13(2), 280; https://doi.org/10.3390/plants13020280
Submission received: 3 November 2023 / Revised: 16 December 2023 / Accepted: 19 December 2023 / Published: 18 January 2024
(This article belongs to the Special Issue Molecular Genetics and Breeding of Oilseed Crops)

Abstract

Oleosins (OLEs) are a class of small but abundant structural proteins that play essential roles in the formation and stabilization of lipid droplets (LDs) in seeds of oil crops. Despite the proposal of five oleosin clades (i.e., U, SL, SH, T, and M) in angiosperms, their evolution in eudicots has not been well established. In this study, we employed Brassicales, an economically important order of flowering plants possessing the lineage-specific T clade, as an example to address this issue. Three to ten members were identified from 10 species representing eight plant families, which include Caricaceae, Moringaceae, Akaniaceae, Capparaceae, and Cleomaceae. Evolutionary and reciprocal best hit-based homologous analyses assigned 98 oleosin genes into six clades (i.e., U, SL, SH, M, N, and T) and nine orthogroups (i.e., U1, U2, SL, SH1, SH2, SH3, M, N, and T). The newly identified N clade represents an ancient group that had already appeared in the basal angiosperm Amborella trichopoda; in the tree fruit crop Carica papaya, its member is constitutively expressed, including in the pulp and seeds of the fruit. Moreover, similar to Clade N, the previously defined M clade is actually not Lauraceae-specific but an ancient and widely distributed group that diverged before the radiation of angiosperms. Compared with A. trichopoda, lineage-specific expansion of the family in Brassicales was largely driven by recent whole-genome duplications (WGDs) as well as the ancient γ event shared by all core eudicots. In contrast to the flower-preferential expression of Clade T, transcript profiling revealed an apparent seed/embryo/endosperm-predominant expression pattern of most oleosin genes in Arabidopsis thaliana and C. papaya. Moreover, structural and expression divergence of paralogous pairs was frequently observed, a good example being the lineage-specific gain of an intron. These findings provide insights into lineage-specific family evolution in Brassicales, which facilitates further functional studies in nonmodel plants such as C. papaya.

1. Introduction

Oleosins are a class of highly abundant structural proteins of lipid droplets (LDs), which represent a major carbon reserve and are widely present in various plant organs such as seeds, pollen, flowers, fruits, and certain tubers [1,2,3,4,5]. Oleosins are typical for their small molecular weight (MW) of 14–30 kDa [5,6,7,8,9,10,11,12,13]. Nevertheless, all of them share a conserved central hydrophobic portion of approximately 72 residues, which could form a hairpin penetrating the surface phospholipid monolayer of an LD into the matrix. The hydrophobic hairpin is composed of two arms (each of about 30 residues) connected by a 12-residue loop with the pattern of PX5SPX3P, where X represents a nonpolar residue. By contrast, N- and C-terminal peptides, which lie on the phospholipid surface and may act as a receptor for metabolic enzymes or regulatory proteins, are amphipathic and usually variable [8,14]. Genome-wide surveys reveal that oleosin genes have already appeared in the single-celled algae, e.g., Chlamydomonas reinhardtii, and have diverged into at least six clades known as P (primitive), U (universal), SL (seed low), SH (seed high), T (tapetum), and M (mesocarp) during later evolution [4,8,15]. The most primitive Clade P was only found in green algae, mosses, and ferns, whereas Clade U, which is typical for the C-terminal AAPGA, is universally present in all land plants including Selaginella moellendorffii. Clade SL, which is present in seeds of both gymnosperms and angiosperms, was named after the low MW. This clade was proposed to first evolve from Clade U and later gave rise to Clades SH, M, and T. Clade SH, which is usually present in seeds of angiosperms, is typical for the high MW and C-terminal insertion relative to Clade SL. By contrast, Clades M and T were reported to be lineage-specific, which are confined to Lauraceae and Brassicaceae, respectively [4,8]. 
Comparative genomics analyses indicated that, for most clades, gene expansion was mainly contributed by whole-genome duplications (WGDs) especially those lineage-specific recent WGDs, e.g., the Brassicaceae-specific α WGD and the ρ WGD shared by cassava (Manihot esculenta) and rubber tree (Hevea brasiliensis) in Euphorbiaceae [5,6,12], in stark contrast to a key role of tandem duplication for Clade T in Brassicaceae [3,8,16].
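The proline knot pattern PX5SPX3P described above can be located in a protein sequence with a simple regular expression. The following is a minimal sketch, not the procedure used in this study: the function name and the example peptide are hypothetical, and X is relaxed to any residue rather than restricted to nonpolar residues.

```python
import re

# PX5SPX3P: P, five residues, S, P, three residues, P (12 residues in total).
# Assumption: X is matched as any uppercase letter, which is looser than the
# "nonpolar residue" definition given in the text.
PROLINE_KNOT = re.compile(r"P[A-Z]{5}SP[A-Z]{3}P")


def find_proline_knots(sequence):
    """Return (0-based start, matched motif) for each non-overlapping match."""
    sequence = sequence.upper()
    return [(m.start(), m.group()) for m in PROLINE_KNOT.finditer(sequence)]


# Hypothetical peptide, constructed only to illustrate the pattern.
example = "MAAAAPLLVFFSPVLVPAAAA"
print(find_proline_knots(example))
```

Variants such as those with G in place of S would need a correspondingly relaxed pattern.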
Brassicaceae belongs to the order Brassicales, which includes 17 families, 398 genera, and 4450 species that have experienced multiple independent WGDs [17]. Thus far, genome-wide identification of oleosin family genes has been reported in 10 species within Brassicales. However, most of them (80%) belong to the Brassicaceae family [3,8,10]. Although it was established that Clade T is absent from papaya (Carica papaya, Caricaceae) and spider flower (Tarenaya hassleriana, Cleomaceae) [8], whether it is present or has been lost in other families within Brassicales is yet to be addressed. Recently available or updated genome assemblies for species in five Brassicales families beyond Brassicaceae, i.e., papaya [18], horseradish (Moringa oleifera, Moringaceae) [19], Bretschneidera sinensis (Akaniaceae) [20], caperbush (Capparis spinosa, Capparaceae) [21], Cleome violacea (Cleomaceae), acaya (Gynandropsis gynandra, Cleomaceae) [22], and spider flower [23], provide a good chance to uncover lineage-specific evolution of the oleosin gene family in this important plant order.
This study presents a comprehensive comparative analysis of the oleosin gene family in Brassicales. Significantly, our results showed that Clade M is actually not Lauraceae-specific but an ancient group that was already present in the basal angiosperm Amborella trichopoda and is preserved in the early-diverging eudicot Aquilegia coerulea and all Brassicales species examined in this study. Moreover, a novel but ancient group named N was identified in most tested species, i.e., A. trichopoda, papaya, horseradish, C. violacea, acaya, and spider flower. In papaya, an economically and nutritionally important tree fruit crop widely cultivated in tropical and subtropical areas [18], this group was shown to be constitutively expressed, including in the pulp and seeds of the fruit. Herein, we report our findings.

2. Results

2.1. Identification of Oleosin Genes in A. trichopoda, Avocado, A. coerulea, and Representative Brassicales Species

To gain insight into lineage-specific family evolution in Brassicales, recently available chromosome (Chr)-level genome assemblies of A. trichopoda (a single living representative within the sister lineage Amborellales to all other flowering plants) [24], avocado (Persea americana, a Laurales member of an early-branching lineage of angiosperms that includes one M oleosin) [25], and A. coerulea (a Ranunculales member of the basal-most eudicot clade) [26] were first employed to identify oleosin family genes, resulting in five, three, and five members, respectively (Table 1). Five members identified in A. trichopoda and A. coerulea are consistent with what is found in previous assemblies [8], whereas only two avocado oleosin genes (i.e., PaOLE2 and -3) have been reported by previous studies [4,27]. Moreover, an allele for PaOLE2 that was discarded for further analyses in this study was also identified from tig00003364, and their coding sequences (CDS) were shown to exhibit 98.8% sequence identity, including only five single nucleotide polymorphisms (SNPs). Further mining genomes of representative Brassicales species resulted in six to 10 family members from papaya, horseradish, B. sinensis, caperbush, C. violacea, acaya, and spider flower (Table 1). Notably, compared with the previous study [8], one more member was identified in both papaya and spider flower, which were named CpOLE6 and ThOLE8, respectively (Table 1).
Physicochemical parameters and conserved domains of deduced oleosin proteins are summarized in Table 1. In contrast to the great majority of oleosins featuring a single oleosin domain, MoOLE6 harbors two instead. Since the sequence was also found in two other genome assemblies [28,29], it is more likely to be a true gene that resulted from tandem duplication. The sequence length of oleosins varies from 115 (CsOLE3) to 267 (MoOLE6) amino acids (AA) with an average of 151 AA, and correspondingly, their theoretical MW varies from 11.92 (CsOLE3) to 28.01 (MoOLE6) kDa with an average of 16.01 kDa. It is worth noting that CpOLE6, MoOLE6, CvOLE6, GgOLE7, and ThOLE8 possess unexpectedly low pI values of 4.43–6.56, in striking contrast to the alkaline values of 9.23–11.00 for the others. Except for BsOLE9, which exhibits an unusual grand average of hydropathicity (GRAVY) value of −0.144, the values for the others are greater than 0, varying from 0.078 to 0.784 (Table 1). Nevertheless, all proteins possess relatively high aliphatic index (AI) values of 88.90–123.83 (Table 1) as well as similar Kyte–Doolittle hydrophobicity plots (except for MoOLE6) (Figure S1), which is in accordance with their amphipathic property.
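For readers who wish to reproduce the GRAVY and AI values for their own sequences, both quantities can be computed directly from the amino acid composition. The sketch below assumes the standard Kyte–Doolittle hydropathy scale for GRAVY and Ikai's formula for the aliphatic index (AI = X_Ala + 2.9 X_Val + 3.9 (X_Ile + X_Leu), with X in mole percent); the function names and the example peptide are hypothetical, and the code is an illustration rather than the tool used in this study.

```python
# Kyte-Doolittle hydropathy values for the 20 standard amino acids.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def gravy(sequence):
    """Grand average of hydropathicity: mean hydropathy over all residues."""
    sequence = sequence.upper()
    return sum(KYTE_DOOLITTLE[aa] for aa in sequence) / len(sequence)


def aliphatic_index(sequence):
    """Aliphatic index from the mole percent of Ala, Val, Ile, and Leu."""
    sequence = sequence.upper()
    length = len(sequence)

    def mole_percent(aa):
        return 100.0 * sequence.count(aa) / length

    return (mole_percent("A")
            + 2.9 * mole_percent("V")
            + 3.9 * (mole_percent("I") + mole_percent("L")))


# Hypothetical peptide used only to show the calls.
peptide = "MAAAAPLLVFFSPVLVPAAAA"
print(round(gravy(peptide), 3), round(aliphatic_index(peptide), 2))
```

A positive GRAVY value indicates an overall hydrophobic protein, which is why the negative value reported for BsOLE9 stands out.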

2.2. Evolutionary Analysis and Definition of Orthogroups

To uncover their relationships, an unrooted evolutionary tree was first constructed using full-length protein sequences of five AtrOLEs, three PaOLEs, five AcOLEs, six CpOLEs, six MoOLEs, 10 BsOLEs, eight CsOLEs, six CvOLEs, seven GgOLEs, eight ThOLEs, eight MeOLEs, nine PtOLEs, and 17 AtOLEs. As shown in Figure 1A, they were clustered into six clades, five of which were previously defined as U, SL, SH, T, and M [4,8]. Whereas Clade T is restricted to Arabidopsis (Arabidopsis thaliana), Clade M, which was first described in the Lauraceae family [4,27], was unexpectedly found in all species examined in this study. The presence of Clade M in A. trichopoda (i.e., AtrOLE2) supports its early origin before the radiation of angiosperms. Moreover, a novel clade denoted N is present not only in papaya (i.e., CpOLE6), horseradish (i.e., MoOLE6), C. violacea (i.e., CvOLE6), acaya (i.e., GgOLE7), and spider flower (i.e., ThOLE8) but also in A. trichopoda (i.e., AtrOLE5), implying its early origin and lineage/species-specific gene loss during later evolution. Structural features of Clade N relative to other CpOLEs are shown in Figure 1B. In contrast to AtrOLE5, which possesses the conserved PX5SPX3P pattern, other members of Clade N exhibit PX5S/GPX3G/F variants. Moreover, an 18-residue insertion that is present in Clade SH was not detected in this clade or in CpOLE4, MoOLE5, and CsOLE6, implying their divergence. Notably, AtrOLE4 possesses a 22-residue insertion instead (Figure 1B and Figure S2). Additionally, whereas the majority of U oleosins feature the C-terminal AAPGA, AcOLE1 and GgOLE1 harbor AAPSA instead (Figure S3).
Furthermore, the best reciprocal hit (BRH) method was used to identify orthologs across different species. Except for T oleosins, which were proven to be widely present in Brassicaceae plants [16], the criterion of at least one member present in more than one species examined in this study was used to define orthogroups (OGs). As shown in Figure 2 and Table S1, a total of nine OGs were obtained, i.e., U1/-2, M, SL, SH1/-2/-3, N, and T, where five AtrOLE genes belong to U1, M, SL, SH1, and N, respectively, supporting early diversification of this family in angiosperms. During later evolution, lineage-specific expansion and contraction were found. Notably, only two OGs (i.e., U1 and M) are preserved in avocado, whereas four OGs (i.e., U1, M, SL, and SH1) are retained in A. coerulea (Figure 2).
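The logic of the BRH method can be illustrated with a short sketch: a gene pair is retained only when each gene is the top-scoring hit of the other in the two opposite searches. The code below is a generic illustration under stated assumptions, not the pipeline used in this study; the gene identifiers and similarity scores are hypothetical, and in practice the scores would come from pairwise sequence searches between two proteomes.

```python
def best_hits(scores):
    """Map each query to its highest-scoring subject.

    `scores` maps (query, subject) pairs to a similarity score.
    """
    best = {}
    for (query, subject), score in scores.items():
        if query not in best or score > best[query][1]:
            best[query] = (subject, score)
    return {query: subject for query, (subject, _) in best.items()}


def reciprocal_best_hits(a_vs_b, b_vs_a):
    """Return sorted (a, b) pairs that are each other's best hit."""
    forward = best_hits(a_vs_b)
    reverse = best_hits(b_vs_a)
    return sorted((a, b) for a, b in forward.items() if reverse.get(b) == a)


# Hypothetical scores between species A (a1, a2) and species B (b1, b2).
a_vs_b = {("a1", "b1"): 200, ("a1", "b2"): 150, ("a2", "b2"): 90}
b_vs_a = {("b1", "a1"): 195, ("b2", "a1"): 140, ("b2", "a2"): 80}
print(reciprocal_best_hits(a_vs_b, b_vs_a))
```

In this toy input, a2 and b2 are not paired because the best hit of b2 is a1, which shows how the reciprocity requirement filters out one-directional matches.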

2.3. Analysis of Exon–Intron Structure

To learn more about structural divergence, the exon–intron structures were analyzed on the basis of revised gene models. As shown in Table 1, a single intron was found in 27 out of 64 identified oleosin genes (approximately 42.19%), a smaller proportion than the 88.24% found in Arabidopsis (At-T8 represents the sole member possessing two introns) (Table S2). These intron-containing genes belong to Clades U, SL, SH, and N, and the introns appear to have been gained independently. Notably, no intron was found in Clade M or in any member of A. trichopoda, avocado, and A. coerulea, whereas one intron is present in all SL members of papaya, horseradish, B. sinensis, and other Brassicales species. Moreover, in C. violacea, acaya, and spider flower, all U and SH members harbor an intron, whereas GgOLE7 represents the unique N member with one intron (Table 1). Interestingly, the intron position appears to be conserved within clades but differs between clades. Whereas Clade SL features one intron immediately after the sequence encoding the hydrophobic hairpin, the intron found in Clade N is located at the C-terminus of the hydrophobic hairpin; the intron found in Clade SH is located before the hydrophobic hairpin; and the intron found in Clade U is located at the C-terminus of the proline knot. These results imply an independent and lineage-specific gain of an intron (Figure 1 and Figure S3).

2.4. Gene Localization, Synteny Analysis, and Lineage-Specific Family Evolution in Brassicales

Gene localization revealed that identified oleosin genes are distributed across two-to-six chromosomes of A. coerulea, avocado, B. sinensis, A. trichopoda, papaya, caperbush, and acaya, and five-to-six scaffolds (Scfs) of horseradish, C. violacea, and spider flower, respectively (Figure 3). Further analysis of gene duplication events resulted in 54 duplicate pairs. Whereas most duplicate pairs were characterized as dispersed repeats, CpOLE2/-3 and BsOLE6/-7 were characterized as transposed and tandem repeats, respectively (Figure 3). Interestingly, despite the presence of five oleosin genes in A. trichopoda, intra-synteny analysis showed that none of them is located within syntenic blocks, which is similar to that observed in A. coerulea, papaya, and acaya. By contrast, one, one, one, two, four, four, and four WGD duplicate pairs were identified in avocado (i.e., PaOLE1/-2), horseradish (i.e., MoOLE4/-5), C. violacea (i.e., CvOLE4/-5), spider flower (i.e., ThOLE4/-5 and ThOLE6/-7), B. sinensis (i.e., BsOLE3/-4, BsOLE5/-6, BsOLE8/-9, and BsOLE8/-10), caperbush (i.e., CsOLE3/-4, CsOLE3/-5, CsOLE6/-7, and CsOLE6/-8), and Arabidopsis (i.e., At-Sm1/-2, At-S3/-5, At-S1/-4, and At-S2/-4), respectively (Figure 3 and Figure 4).
Inter–synteny analyses were further conducted between A. trichopoda, avocado, A. coerulea, papaya, and Arabidopsis. As shown in Figure 4A, AtrOLE genes were shown to have three, two, and one syntelogs in avocado, A. coerulea, and papaya, respectively, but none in Arabidopsis; AcOLE genes also harbor one and three syntelogs in avocado and papaya, respectively, but none in Arabidopsis. These results reflect a long time of evolution, as well as two additional rounds of WGDs and massive chromosomal rearrangements that occurred in Arabidopsis after the split with papaya [30]. Nevertheless, three out of six CpOLE genes (i.e., CpOLE1, -3, and -4) still have eight syntelogs in Arabidopsis, i.e., one-to-two and one-to-three, reflecting their close relationship and lineage-specific WGDs. It is worth noting that, besides At-S3 and At-S5, both At-T1 and At-T8 were also characterized as syntelogs of CpOLE3, which provides direct evidence for the origin of Clade T from Clade SL. Additionally, PaOLE3, a well-identified M member [4,27], still has syntelogs in A. trichopoda (i.e., AtrOLE2) and A. coerulea (i.e., AcOLE2) (Figure 4A), whereas AcOLE2 still has syntelogs in poplar (Populus trichocarpa) (i.e., PtOLE2a/-2b) and cassava (i.e., MeOLE2) (Figure 4B).
In addition to CpOLE1, -3, and -4, CpOLE2 and -6 were also shown to have syntelogs in at least one species of horseradish, B. sinensis, caperbush, C. violacea, acaya, and spider flower (Figure 4C,D). Though no syntelog was identified for CpOLE5 in any examined species, its orthologs MoOLE5, BsOLE9, BsOLE10, and MeOLE5 are still located within syntenic blocks (Figure S4), implying a species-specific transposition of CpOLE5. Moreover, MoOLE4/-5, BsOLE8/-9/-10, and MeOLE4b/-5 were also shown to be located within syntenic blocks, implying that the two groups were derived from one WGD shared by these species, probably the γ event. Additionally, CsOLE6/-7/-8, CvOLE4/-5, ThOLE6/-7, and At-S1/-2/-4 are also located within syntenic blocks. In fact, CpOLE4 and -5 exhibit a Ks value of 2.2048 (Table S3), which is comparable to that of MoOLE4/-5 (2.0410) and CvOLE4/-5 (1.9437) (Table 2). However, this value is higher than the 1.5864 of BsOLE8/-10 and the 1.7862 of MeOLE4b/-5, implying different evolutionary rates of γ WGD-derived repeats in these species. Similar cases were also observed for recent WGD repeats. Among four β WGD repeats identified in Brassicales species, CsOLE3/-4 and ThOLE6/-7 exhibit similar Ks values of 1.5677–1.6273, in contrast to the high sequence divergence of CsOLE7/-8 and At-S1/-4. As for three α WGD repeats identified in Arabidopsis, At-Sm1/-2 and At-S3/-5 exhibit similar Ks values of 1.3093–1.3683, which are lower than the 1.5782 between At-S1 and -4. By contrast, the Ks values of other recent WGD repeats identified in Brassicales species were lower, varying from 0.1962 to 0.5869, which is comparable to the 0.1619–0.3696 of four p WGD repeats found in poplar and lower than the 0.4175–0.7428 of three ρ WGD repeats identified in cassava (Table 2). In addition to CpOLE4/-5, three other dispersed repeats may also be derived from WGDs: BsOLE1/-2 exhibit a Ks value of 0.1713, which is comparable to that of three α WGD repeats identified in B. sinensis, i.e., 0.1962–0.2810; GgOLE1/-2 and ThOLE1/-2 possess Ks values of 0.6687 and 0.4058, respectively, which are comparable to that of the α WGD repeat ThOLE4/-5 (0.3409) but lower than that of the β WGD repeat ThOLE6/-7 (1.5677) (Table 2 and Table S3). Notably, the Ka/Ks values of all repeats identified in this study were shown to be less than one, implying that they are subject to purifying selection.

2.5. Expression Divergence of Oleosin Genes

Global expression profiles of AtOLE genes were first examined using the Arabidopsis RNA-seq Database, which includes 28,164 libraries. As shown in Figure S5, most members of Clade T are preferentially expressed in flowers, though At-T8 is also expressed in embryos and seeds. By contrast, other members are predominantly expressed in seeds, embryos, and endosperms, as well as in siliques. Notably, At-Sm2 and At-Sm3 were also shown to be expressed in pollen and flowers. Moreover, during embryo development, transcripts of most members (including At-T8) increase gradually, peaking at the mature green stage. At the 8-cell/16-cell stage, At-S5, At-Sm2, and At-Sm1 represent the three most expressed isoforms, contributing 83.23% of total transcripts. Then, a sudden drop in total transcripts was observed at the globular stage, where At-S5, At-Sm2, and At-Sm1 still contribute 75.98% of total transcripts. At stages from early heart to late torpedo, At-S5 represents the most expressed isoform, contributing 44.69–62.58% of total transcripts. At the bent cotyledon stage, At-S1, At-S3, and At-S5 represent the three most expressed isoforms, contributing 76.94% of total transcripts. At the mature green stage, At-S3, At-S4, and At-S1 represent the three most expressed isoforms, contributing 80.93% of total transcripts (Figure S6).
Then, papaya was used as an example of a fruit plant to study the expression evolution of oleosin genes. The RNA-seq data of various tissues, i.e., callus, shoot, hypocotyl, leaf, root, phloem sap, stamen, pollen, ovule, and pulp of mature fruit, were first investigated. As shown in Figure 5, their transcripts were detected in at least one of the tested tissues, though gene abundances are highly diverse. Total transcripts of the whole gene family were most abundant in shoot (100%), followed by callus (5.21–20.37%), and they were considerably low in other tissues (0.12–0.65%). In contrast to the constitutive expression of CpOLE6, CpOLE1 was rarely expressed in sap and pulp. Whereas CpOLE6 represents the unique isoform expressed in sap, three members were shown to be expressed in pulp, i.e., CpOLE6, -3, and -5. In the shoot and callus, CpOLE3, -4, and -2 represent three dominant isoforms, which contribute 80.98–91.69% of total transcripts. On the contrary, CpOLE4 was rarely expressed in other tissues; CpOLE3 was rarely expressed in sap, stamen, pollen, and ovule; CpOLE2 was rarely expressed in root, sap, and pulp; and CpOLE5 was rarely expressed in root and sap. As expected, according to their expression patterns over various tissues, six CpOLE genes were grouped into three main clusters: Cluster I includes the two most expressed genes in shoot and callus, i.e., CpOLE3 and -4; Cluster II includes two moderately expressed isoforms, i.e., CpOLE2 and -5; and Cluster III includes CpOLE6 and -1, which were constitutively expressed in most tissues (Figure 5A).
Since no transcriptome data are available for the seed tissue, qRT–PCR analysis was further conducted using seeds derived from mature fruits. As shown in Figure 5B, except for CpOLE6 and -1, the expression levels of other CpOLE genes were significantly higher than that of the reference gene CpEIEF, ranging from 2.61- to 36.81-fold, implying their divergence. Notably, CpOLE3 and -4 were shown to represent two dominant isoforms whose transcript levels were comparable (Figure 5B).

3. Discussion

The importance of oleosins in LD formation and stabilization has prompted active research in oil crops [31,32,33,34,35,36,37,38]. Nevertheless, despite the proposal of five oleosin clades (i.e., U, SL, SH, M, and T) in angiosperms [4,8], their evolution in eudicots has not been well-established. According to the comparison reported by Huang and Huang (2015), five oleosin genes present in A. trichopoda were assigned into two clades, i.e., U (1) and SL (4), though an M member was clearly identified [8]. Moreover, the distribution of the M clade, which was previously considered to be Lauraceae-specific [4], has not been well-studied.
In the present study, we used Brassicales, an economically important order of flowering plants that harbors the lineage-specific T clade [3,8,17], as an example to address evolution patterns of the oleosin gene family. In addition to 34 members reported in Arabidopsis, cassava, and poplar ([12], this study), a total of 64 oleosin family genes were identified from ten species representing eight plant families, i.e., Amborellaceae (A. trichopoda), Lauraceae (avocado), Ranunculaceae (A. coerulea), Caricaceae (papaya), Moringaceae (horseradish), Akaniaceae (B. sinensis), Capparaceae (caperbush), and Cleomaceae (C. violacea, acaya, and spider flower), with gene numbers of the family varying from three to ten. Interestingly, family sizes are usually larger in species that experienced recent WGDs. According to comparative genomics analysis, after the split with A. coerulea, the last common ancestor of core eudicots underwent the γ whole-genome triplication (WGT) event at around 117 million years ago (MYA) [39]. Furthermore, Brassicaceae species, represented by Arabidopsis, experienced two more WGDs named At-β (60–65 MYA) and At-α (~35 MYA) [30,40], where the At-β WGD was shown to be shared by caperbush, C. violacea, acaya, and spider flower [17,21,22,23]. In the Capparaceae lineage, caperbush further experienced one independent WGD known as Cs-α at 18.6 MYA [21]. In the Cleomaceae lineage, after the split with C. violacea, the last common ancestor of acaya and spider flower first experienced one independent WGD known as Gg-α (~22 MYA), which was followed by the addition of a third genome (Th-α, ~18.4 MYA) to spider flower but not acaya [41]. After the split with papaya, B. sinensis in the Akaniaceae lineage was also shown to experience one independent WGD known as Bs-α [20]. Correspondingly, compared with five members present in both A. trichopoda and A. coerulea, one more was identified in papaya, horseradish, and C. violacea. By contrast, seven or more members were identified in B. sinensis, acaya, and spider flower, which is comparable to the eight and nine reported in cassava and poplar, respectively [12].
According to evolutionary analysis, 98 oleosin genes were grouped into six clades, one more than described before [4,8,12]. Interestingly, the novel clade, denoted N, is present in A. trichopoda and most Brassicales species examined in this study, implying its early origin and lineage-specific gene loss. Besides Clade N, four other AtrOLE genes were assigned into four clades, i.e., U, SL, SH, and M, instead of only two as proposed by Huang and Huang (2015) [8]. The updated classification is supported not only by evolutionary analysis but also by BRH-based orthologous and synteny analyses. Whereas Clades SL, M, N, and T contain a single OG, U and SH have evolved to form two and three, respectively, a high number of which are still located within syntenic blocks. As for Clade M, PaOLE3, AtrOLE2, AcOLE2, MeOLE2, PtOLE2a, and PtOLE2b were shown to be located within syntenic blocks, whereas CpOLE2, MoOLE3, BsOLE3, BsOLE4, CsOLE2, CvOLE2, GgOLE3, ThOLE3, and At-Sm3 were also characterized as syntelogs, implying a highly conserved evolution of this clade, which argues against the Lauraceae-specific distribution proposed by Huang and Huang (2016) [4]. Moreover, this clade has expanded in B. sinensis and poplar via recent WGDs, which were shown to be Akaniaceae- and Salicaceae-specific, respectively [20,42]. As for Clade N, despite a frequent loss in species examined in this study, CvOLE6, GgOLE7, and ThOLE8 are still located within syntenic blocks, implying possible functions in specific biological processes that are yet to be studied. As for Clade U, which is typical for the C-terminal AAPGA [8,12], gene expansion was observed in avocado, cassava, B. sinensis, acaya, spider flower, and Arabidopsis, which was contributed by WGDs and dispersed duplication. Among them, though BsOLE1 and -2 are no longer located within syntenic blocks, both of them were characterized as syntelogs of CsOLE1, which is consistent with their comparable Ks value to three Bs-α WGD repeats identified in this species, i.e., BsOLE3/-4, BsOLE5/-6, and BsOLE9/-10, implying their WGD derivation and chromosome rearrangement of the BsOLE2-encoding region. Similar cases were also observed for GgOLE1/-2 and ThOLE1/-2, where GgOLE1, GgOLE2, and ThOLE1 were characterized as syntelogs of CsOLE1, At-Sm1, and At-Sm2, though ThOLE2 is no longer located within syntenic blocks. As for Clade SL, gene expansion was observed in A. coerulea, B. sinensis, caperbush, spider flower, Arabidopsis, cassava, and poplar, which was contributed by WGDs, as well as tandem and dispersed duplication. Notably, BsOLE6 and -7 represent the unique pair of tandem repeats beyond Clade T. Compared with other clades, Clade SH has extensively expanded in core eudicots, forming three OGs as identified in this study. Among them, SH1 and SH2 are more likely to have arisen from the γ event [39], and MoOLE4, MoOLE5, BsOLE8, BsOLE9, BsOLE10, MeOLE4a, and MeOLE5 are still located within syntenic blocks with similar Ks values, whereas SH3 appears to have been generated by the At-β event [30]. Moreover, SH1 has further expanded in caperbush, Arabidopsis, cassava, and poplar via lineage-specific recent WGDs, i.e., Cs-α, At-α, ρ, and p, respectively [20,30,42,43]. It is worth noting that, despite the wide presence of Clade T in Brassicaceae plants [8,16], no ortholog was identified in any other Brassicales species examined in this study, implying its appearance sometime after the Brassicaceae–Cleomaceae divergence. Nevertheless, At-T1 and At-T8 were characterized as syntelogs of CpOLE3, ThOLE4, ThOLE5, GgOLE4, CsOLE3, CsOLE4, and CsOLE5, implying that Clade T was indeed derived from Clade SL.
In addition to species-specific retention of repeats after WGDs, structural divergence was also shown to play a role in the evolution of the oleosin family. In contrast to the absence of introns in oleosin genes of A. trichopoda, avocado, and A. coerulea, Clade SL has gained one intron immediately after the sequence encoding the hydrophobic hairpin stretch in all Brassicales species examined in this study, which is similar to that reported in Salicaceae and Euphorbiaceae [12]. Interestingly, the intron position found in Clade SL is different from that observed in several members of Clades U, SH, and N, implying an independent gain of an intron. Since all SH members in C. violacea, acaya, spider flower, and Arabidopsis feature the intron that is located before the hydrophobic hairpin, its gain may have occurred sometime after the split with Capparaceae but before the Brassicaceae–Cleomaceae divergence. The absence of the Cleomaceae U intron in Arabidopsis, which is located at the C-terminus of the proline knot, implies that its gain occurred sometime after the split with Brassicaceae. By contrast, the intron found in GgOLE7, which is located at the C-terminus of the hydrophobic hairpin, may be Gynandropsis-specific, since it is absent from its orthologs CvOLE6 and ThOLE8.
Expression divergence also plays an important role in the evolution of oleosin family genes in Brassicales. Among six oleosin genes identified in papaya, CpOLE6 in Clade N and CpOLE1 in Clade U have evolved to be constitutively expressed, whereas CpOLE3 in Clade SL and CpOLE4 in Clade SH have evolved into two dominant isoforms in seeds, calluses, and shoots, though CpOLE4 is more likely to be a WGD (γ) repeat of CpOLE5, another SH member. The constitutive expression of U oleosin genes has been widely reported in other species, e.g., castor bean (Ricinus communis), physic nut (Jatropha curcas), cassava, rubber tree, safflower (Carthamus tinctorius), rapeseed (Brassica napus), and tigernut (Cyperus esculentus) [5,9,10,11,12,13]. Nevertheless, to our surprise, CpOLE1 was shown to be rarely expressed in both sap and pulp, which is different from CpOLE6. Compared with CpOLE4, transcript levels of CpOLE5 were shown to be considerably lower in seeds, calluses, and shoots. By contrast, it was also moderately expressed in pollen, stamens, and ovules, as well as pulp. Notably, though Clade M was previously reported to be mesocarp-abundant [8,42], the expression of CpOLE2 was rarely detected in pulp or roots and sap. Interestingly, the transcript level of CpOLE2 is usually higher than that of CpOLE5, CpOLE6, and CpOLE1 in most tissues. By contrast, its ortholog in Arabidopsis (At-Sm3) is always less expressed than most members beyond Clade T. Moreover, among several repeat pairs identified in Arabidopsis, i.e., At-Sm1/-Sm2, At-S3/-S5, At-S1/-S2/-S4, and At-T1/-T2/-T3/-T4/-T5/-T6/-T7/-T8/-T9, At-Sm1, At-S5, At-S4, and At-T5 have evolved into dominant isoforms, respectively. In Brassicales, the lineage-specific expansion and tissue-specific expression of oleosin genes reflect their roles in the oil accumulation of seeds and anther [3,34]. 
In seeds, the accumulation of oleosins is usually negatively correlated with LD size but positively associated with oil content, which can affect not only seed germination but also the freezing tolerance of seeds [34,35]. Moreover, Brassicaceae-specific T oleosins were acquired for tapetosome formation and confer an additive benefit to pollen vigor [44].

4. Materials and Methods

4.1. Sequence Retrieval and Identification of Oleosin Family Genes

Oleosin genes reported in Arabidopsis (Brassicaceae, Brassicales), poplar (Salicaceae, Malpighiales), and cassava (Euphorbiaceae, Malpighiales) were updated according to references [6,12], and detailed information is shown in Table S2. Genomic sequences of A. trichopoda (v2.1; Amborellaceae, Amborellales), avocado (Gwen v1; Lauraceae, Laurales), A. coerulea (v3.1; Ranunculaceae, Ranunculales), papaya (Sunset v1; Caricaceae, Brassicales), B. sinensis (v1; Akaniaceae, Brassicales), horseradish tree (v1; Moringaceae, Brassicales), caperbush (v1; Capparaceae, Brassicales), C. violacea (v2.1; Cleomaceae, Brassicales), acaya (v1; Cleomaceae, Brassicales), and spider flower (v1; Cleomaceae, Brassicales) were downloaded from public databases, i.e., Phytozome (v13, https://phytozome.jgi.doe.gov/pz/portal.html, accessed on 31 October 2023), NGDC (http://bigd.big.ac.cn/gsa, accessed on 31 October 2023), and NCBI (https://www.ncbi.nlm.nih.gov/, accessed on 31 October 2023). To identify oleosin homologs, the oleosin domain profile (PF01277) was used for HMMER (v3.3, http://hmmer.janelia.org/, accessed on 31 October 2023) searches as described before [45,46]. All predicted gene models were manually curated with available mRNAs, including nucleotides, Sanger expressed sequence tags (ESTs), and RNA sequencing (RNA-seq) reads that were accessed from NCBI (accessed on 31 October 2023). The presence of the conserved oleosin domain in deduced peptides was confirmed using MOTIF Search (https://www.genome.jp/tools/motif/, accessed on 31 October 2023), whereas protein properties were calculated using ProtParam (http://web.expasy.org/protparam/, accessed on 31 October 2023). Additionally, pseudogenes and/or homologous fragments present in related genomes were also identified with CDS sequences of the obtained oleosin genes as previously described [12].

4.2. Sequence Alignment, Evolutionary Analysis, and Definition of Orthogroups

Multiple sequence alignment was conducted using MUSCLE [47], and the alignment was then used for evolutionary tree construction in MEGA 6.0 [48] with the maximum likelihood method, Jones–Taylor–Thornton (JTT) model, uniform rates, complete deletion of gaps, nearest-neighbor interchange (NNI), and a bootstrap of 1000 replicates. Orthologs between species were identified using the BRH (best reciprocal hit) method, and OGs across different species were defined as described before [49,50]; an OG was assigned only when it was present in at least two of the species tested.
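As an illustration of the best-reciprocal-hit idea mentioned above, the following is a minimal sketch and not the pipeline used in this study. It assumes that pairwise similarity hits are already available as (query, subject, bit score) tuples, for example parsed from BLAST tabular output; the function names and the toy gene identifiers are hypothetical.

```python
def best_hits(hits):
    """Return {query: subject}, keeping the highest-scoring hit per query.

    `hits` is an iterable of (query, subject, bit_score) tuples
    (assumed input format, e.g. parsed from a tabular BLAST report).
    """
    best = {}
    for query, subject, score in hits:
        if query not in best or score > best[query][1]:
            best[query] = (subject, score)
    return {query: subject for query, (subject, _) in best.items()}


def reciprocal_best_hits(a_vs_b, b_vs_a):
    """Return sorted (a, b) pairs that are each other's best hit."""
    forward = best_hits(a_vs_b)    # species A genes searched against B
    backward = best_hits(b_vs_a)   # species B genes searched against A
    return sorted((a, b) for a, b in forward.items() if backward.get(b) == a)


# Toy data with made-up identifiers and scores (not from the study).
a_vs_b = [("A1", "B1", 200.0), ("A1", "B2", 150.0),
          ("A2", "B2", 180.0), ("A3", "B2", 90.0)]
b_vs_a = [("B1", "A1", 198.0), ("B2", "A2", 175.0), ("B2", "A3", 80.0)]

pairs = reciprocal_best_hits(a_vs_b, b_vs_a)
print(pairs)
```

In this toy case A3 has B2 as its best hit, but B2's best hit is A2, so A3 is not reported as an ortholog of B2.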

4.3. Gene Localization, Synteny Analysis, and Calculation of Evolutionary Rate

Gene locations on chromosomes and/or scaffolds were inferred from the revised genome annotation and displayed using TBtools [51]. For synteny analysis, duplicate pairs between or within species were identified using the all-to-all BLASTp [52] method with an E-value cutoff of 1 × 10⁻¹⁰, and gene collinearity was inferred using MCScanX [53] with a cutoff of five BLAST hits. Duplication modes such as tandem, proximal, transposed, dispersed, and WGD were identified using the DupGen_finder pipeline as previously described [54], and Ks (synonymous substitution rate) and Ka (nonsynonymous substitution rate) of duplicate pairs were calculated using codeml [55].

4.4. Gene Expression Analysis

Expression profile data of AtOLE genes were accessed from the Arabidopsis RNA-seq Database (https://plantrnadb.com/athrdb/, accessed on 31 October 2023) and the Arabidopsis Embryo eFP Browser (https://bar.utoronto.ca/efp/cgi-bin/efpWeb.cgi, accessed on 31 October 2023), whereas global expression profiles of CpOLE genes were analyzed using transcriptome datasets as shown in Table S4. Raw sequence reads in the FASTQ format were obtained using fastq-dump, and quality control was performed using Trimmomatic [56]. Read mapping was conducted using HISAT2 [57], and the FPKM (fragments per kilobase of exon per million fragments mapped) method was used to determine relative transcript levels.
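For readers unfamiliar with the FPKM unit, the sketch below shows the standard definition as a small function. It is an illustration only, not the software used in this study; the function name, argument names, and the numbers in the example are hypothetical.

```python
def fpkm(fragments, exon_length_bp, total_mapped_fragments):
    """Fragments per kilobase of exon per million fragments mapped.

    fragments              -- fragments assigned to the gene
    exon_length_bp         -- summed exon length of the gene, in base pairs
    total_mapped_fragments -- all mapped fragments in the library
    """
    if exon_length_bp <= 0 or total_mapped_fragments <= 0:
        raise ValueError("length and library size must be positive")
    # Dividing by (length / 1e3) and by (library size / 1e6)
    # is the same as multiplying the ratio by 1e9.
    return fragments * 1e9 / (exon_length_bp * total_mapped_fragments)


# Made-up example: 500 fragments on a 1 kb transcript, 10 million mapped.
value = fpkm(500, 1000, 10_000_000)
print(value)
```

Because the value is normalized for both transcript length and library size, genes of different lengths can be compared within and across samples, subject to the usual caveats of this unit.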
To uncover the relative expression levels of CpOLE genes in the seed tissue, mature seeds were collected from the yellow fruits of the Zhongbai cultivar as described before [58]. Total RNA extraction, synthesis of the first-strand cDNA, and qRT–PCR analysis were conducted as previously described [59], where CpEIEF was used as the reference gene. Primers used in this study are shown in Table S5. Relative gene expression levels were estimated with the 2^(−ΔΔCt) method, and statistical analysis was performed using SPSS Statistics 20 as described before [60].
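The 2^(−ΔΔCt) calculation can be sketched as follows. This is a generic illustration that assumes roughly 100% amplification efficiency for both target and reference genes; the choice of calibrator sample is not stated in the text, so the calibrator arguments, the function name, and the Ct values below are assumptions for demonstration only.

```python
def relative_expression(ct_target, ct_reference,
                        ct_target_calibrator, ct_reference_calibrator):
    """Relative expression by the 2^(-ddCt) method.

    dCt  = Ct(target) - Ct(reference), computed per sample
    ddCt = dCt(sample of interest) - dCt(calibrator sample)
    """
    delta_ct_sample = ct_target - ct_reference
    delta_ct_calibrator = ct_target_calibrator - ct_reference_calibrator
    delta_delta_ct = delta_ct_sample - delta_ct_calibrator
    return 2.0 ** (-delta_delta_ct)


# Made-up Ct values: the target amplifies two cycles earlier (relative to
# the reference gene) in the sample than in the calibrator.
fold = relative_expression(22.0, 18.0, 24.0, 18.0)
print(fold)
```

A ΔΔCt of −2 corresponds to a fourfold higher relative transcript level, and a sample identical to the calibrator gives a value of 1.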

5. Conclusions

In this study, a comparative analysis of the oleosin gene family was conducted with a focus on Brassicales, covering 13 species representing 10 plant families. Ninety-eight oleosin genes were assigned into six clades (i.e., U, SL, SH, M, N, and T) and nine OGs (i.e., U1, U2, SL, SH1, SH2, SH3, M, N, and T). The newly identified Clade N represents an ancient group that diverged before the radiation of angiosperms. Interestingly, this group was constitutively expressed in papaya, including in the fruit and sap. Moreover, the previously defined Clade M is not Lauraceae-specific but an ancient and widely distributed group that had already appeared in the basal angiosperm A. trichopoda. Compared with A. trichopoda, the family expansion in Brassicales was largely contributed by lineage-specific recent WGDs as well as the ancient γ event shared by all core eudicots. The expression of Clade T was shown to be flower-preferential, whereas other members exhibit an apparent seed/embryo/endosperm-predominant expression pattern. Structural and expression divergence of paralogous pairs was frequently observed, a good example being the lineage-specific gain of an intron. These findings provide insights into lineage-specific family evolution in Brassicales and will facilitate further functional studies in papaya and other nonmodel species.

Supplementary Materials

The following supporting information can be downloaded at: https://www.mdpi.com/article/10.3390/plants13020280/s1, Figure S1 Kyte–Doolittle hydrophobicity plots of identified oleosins using ProtScale. (Ac: A. coerulea; Atr: A. trichopoda; Bs: B. sinensis; Chr: chromosome; Cp: C. papaya; Cs: C. spinosa; Cv: C. violacea; Gg: G. gynandra; Mo: M. oleifera; OLE: oleosin; Pa: P. americana; Scf: scaffold; Th: T. hassleriana). Figure S2 Sequence alignment and structural features of SH oleosins. (Ac: A. coerulea; At: A. thaliana; Atr: A. trichopoda; Bs: B. sinensis; Chr: chromosome; Cp: C. papaya; Cs: C. spinosa; Cv: C. violacea; Gg: G. gynandra; Me: M. esculenta; Mo: M. oleifera; OLE: oleosin; Pa: P. americana; Pt: P. trichocarpa; Scf: scaffold; Th: T. hassleriana). Figure S3 Gene models of 27 intron-containing oleosin genes identified in this study. (Bs: B. sinensis; Cp: C. papaya; Cs: C. spinosa; Cv: C. violacea; Gg: G. gynandra; Mo: M. oleifera; OLE: oleosin; Th: T. hassleriana). Figure S4 Synteny analysis between C. papaya, M. oleifera, B. sinensis, and M. esculenta. (Bs: B. sinensis; Chr: chromosome; Cp: C. papaya; Me: M. esculenta; Mo: M. oleifera; OLE: oleosin; Scf: scaffold). Figure S5 Global expression profiles of AtOLE genes. (At: A. thaliana; OLE: oleosin). Figure S6 Expression profiles of AtOLE genes during embryo development. (At: A. thaliana; OLE: oleosin). Table S1 Nine identified orthogroups of the oleosin family based on analyzing representative species. Except for Clade T, systematic group names were assigned only when at least one member is found in at least two of the species examined. (Ac: A. coerulea; At: A. thaliana; Atr: A. trichopoda; Bs: B. sinensis; Chr: chromosome; Cp: C. papaya; Cs: C. spinosa; Cv: C. violacea; Gg: G. gynandra; M: mesocarp; Me: M. esculenta; Mo: M. oleifera; N: novel; OLE: oleosin; Pa: P. americana; Pt: P. trichocarpa; Scf: scaffold; SH: seed high-molecular-weight; SL: seed low-molecular-weight; Th: T. hassleriana; T: tapetum; U: universal). Table S2 Detailed information of oleosin family genes present in A. thaliana, P. trichocarpa, and M. esculenta. (At: A. thaliana; Chr: chromosome; Me: M. esculenta; OLE: oleosin; Pt: P. trichocarpa). Table S3 Evolutionary rate of dispersed repeats that may be derived from WGDs. (Bs: B. sinensis; Cp: C. papaya; Gg: G. gynandra; Ka: nonsynonymous substitution rate; Ks: synonymous substitution rate; OLE: oleosin; Th: T. hassleriana; WGD: whole-genome duplication). Table S4 Detailed information of transcriptome data used in this study. Table S5 Primers used in this study. (Cp: C. papaya; OLE: oleosin).

Author Contributions

Z.Z. conceived the idea and designed the project outline. Z.Z., L.Z. and Y.Z. performed the experiments and analyzed the data. Z.Z. and Y.Z. prepared and refined the manuscript. All authors have read and agreed to the published version of the manuscript.

Funding

This work was supported by the National Natural Science Foundation of China (31971688), the Natural Science Foundation of Hainan province (320RC705), the Central Public-interest Scientific Institution Basal Research Fund for Chinese Academy of Tropical Agricultural Sciences (1630052022001), and the Project of Sanya Yazhou Bay Science and Technology City (SCKJ-JYRC-2022-66). The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.

Data Availability Statement

SRA accession numbers of transcriptome data used in this study are shown in Table S4.

Acknowledgments

The authors appreciate the contributors who made the related genome and transcriptome data accessible in public databases. They also thank the reviewers for their helpful suggestions.

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. Huang, A.H. Oleosins and oil bodies in seeds and other organs. Plant Physiol. 1996, 110, 1055–1061.
  2. Frandsen, G.I.; Mundy, J.; Tzen, J.T. Oil bodies and their associated proteins, oleosin and caleosin. Physiol. Plant. 2001, 112, 301–307.
  3. Kim, H.U.; Hsieh, K.; Ratnayake, C.; Huang, A.H. A novel group of oleosins is present inside the pollen of Arabidopsis. J. Biol. Chem. 2002, 277, 22677–22684.
  4. Huang, M.D.; Huang, A.H. Subcellular lipid droplets in vanilla leaf epidermis and avocado mesocarp are coated with oleosins of distinct phylogenic lineages. Plant Physiol. 2016, 171, 1867–1878.
  5. Zou, Z.; Zheng, Y.J.; Zhang, Z.T.; Xiao, Y.H.; Xie, Z.N.; Chang, L.L.; Zhang, L.; Zhao, Y.G. Molecular characterization oleosin genes in Cyperus esculentus, a Cyperaceae plant producing oil in underground tubers. Plant Cell Rep. 2023, 42, 1791–1808.
  6. Liu, Q.; Sun, Y.; Su, W.; Yang, J.; Liu, X.; Wang, Y.; Wang, F.; Li, H.; Li, X. Species-specific size expansion and molecular evolution of the oleosins in angiosperms. Gene 2012, 509, 247–257.
  7. Fang, Y.; Zhu, R.L.; Mishler, B.D. Evolution of oleosin in land plants. PLoS ONE 2014, 9, e103806.
  8. Huang, M.D.; Huang, A.H. Bioinformatics reveal five lineages of oleosins and the mechanism of lineage evolution related to structure/function from green algae to seed plants. Plant Physiol. 2015, 169, 453–470.
  9. Lu, Y.; Chi, M.; Li, L.; Li, H.; Noman, M.; Yang, Y.; Ji, K.; Lan, X.; Qiang, W.; Du, L.; et al. Genome-wide identification, expression profiling, and functional validation of oleosin gene family in Carthamus tinctorius L. Front. Plant Sci. 2018, 9, 1393.
  10. Chen, K.; Yin, Y.; Liu, S.; Guo, Z.; Zhang, K.; Liang, Y.; Zhang, L.; Zhao, W.; Chao, H.; Li, M. Genome-wide identification and functional analysis of oleosin genes in Brassica napus L. BMC Plant Biol. 2019, 19, 294.
  11. Yuan, Y.; Cao, X.; Zhang, H.; Liu, C.; Zhang, Y.; Song, X.L.; Gai, S. Genome-wide identification and analysis of oleosin gene family in four cotton species and its involvement in oil accumulation and germination. BMC Plant Biol. 2021, 21, 569.
  12. Zou, Z.; Zhao, Y.; Zhang, L. Genomic insights into lineage-specific evolution of the oleosin family in Euphorbiaceae. BMC Genom. 2022, 23, 178.
  13. Zhang, W.; Xiong, T.; Ye, F.; Chen, J.H.; Chen, Y.R.; Cao, J.J.; Feng, Z.G.; Zhang, Z.B. The lineage-specific evolution of the oleosin family in Theaceae. Gene 2023, 868, 147385.
  14. Deruyffelaere, C.; Purkrtova, Z.; Bouchez, I.; Collet, B.; Cacas, J.L.; Chardot, T.; Gallois, J.L.; D’Andrea, S. PUX10 is a CDC48A adaptor protein that regulates the extraction of ubiquitinated oleosins from seed lipid droplets in Arabidopsis. Plant Cell 2018, 30, 2116–2136.
  15. Huang, N.L.; Huang, M.D.; Chen, T.L.; Huang, A.H. Oleosin of subcellular lipid droplets evolved in green algae. Plant Physiol. 2013, 161, 1862–1874.
  16. Schein, M.; Yang, Z.; Mitchell-Olds, T.; Schmid, K.J. Rapid evolution of a pollen-specific oleosin-like gene family from Arabidopsis thaliana and closely related species. Mol. Biol. Evol. 2004, 21, 659–669.
  17. Mabry, M.E.; Brose, J.M.; Blischak, P.D.; Sutherland, B.; Dismukes, W.T.; Bottoms, C.A.; Edger, P.P.; Washburn, J.D.; An, H.; Hall, J.C.; et al. Phylogeny and multiple independent whole-genome duplication events in the Brassicales. Am. J. Bot. 2020, 107, 1148–1164.
  18. Yue, J.; VanBuren, R.; Liu, J.; Fang, J.; Zhang, X.; Liao, Z.; Wai, C.M.; Xu, X.; Chen, S.; Zhang, S.; et al. SunUp and Sunset genomes revealed impact of particle bombardment mediated transformation and domestication history in papaya. Nat. Genet. 2022, 54, 715–724.
  19. Tian, Y.; Zeng, Y.; Zhang, J.; Yang, C.; Yan, L.; Wang, X.; Shi, C.; Xie, J.; Dai, T.; Peng, L.; et al. High quality reference genome of drumstick tree (Moringa oleifera Lam.), a potential perennial crop. Sci. China Life Sci. 2015, 58, 627–638.
  20. Zhang, H.; Du, X.; Dong, C.; Zheng, Z.; Mu, W.; Zhu, M.; Yang, Y.; Li, X.; Hu, H.; Shrestha, N.; et al. Genomes and demographic histories of the endangered Bretschneidera sinensis (Akaniaceae). Gigascience 2022, 11, giac050.
  21. Wang, L.; Fan, L.; Zhao, Z.; Zhang, Z.; Jiang, L.; Chai, M.; Tian, C. The Capparis spinosa var. herbacea genome provides the first genomic instrument for a diversity and evolution study of the Capparaceae family. Gigascience 2022, 11, giac106.
  22. Zhao, W.; Li, J.; Sun, X.; Zheng, Q.; Liu, J.; Hua, W.; Liu, J. Integrated global analysis in spider flowers illuminates features underlying the evolution and maintenance of C4 photosynthesis. Hortic. Res. 2023, 10, uhad129.
  23. Cheng, S.; van den Bergh, E.; Zeng, P.; Zhong, X.; Xu, J.; Liu, X.; Hofberger, J.; de Bruijn, S.; Bhide, A.S.; Kuelahoglu, C.; et al. The Tarenaya hassleriana genome provides insight into reproductive trait and genome evolution of crucifers. Plant Cell 2013, 25, 2813–2830.
  24. Käfer, J.; Bewick, A.; Andres-Robin, A.; Lapetoule, G.; Harkess, A.; Caïus, J.; Fogliani, B.; Gâteblé, G.; Ralph, P.; de Pamphilis, C.W.; et al. A derived ZW chromosome system in Amborella trichopoda, representing the sister lineage to all other extant flowering plants. New Phytol. 2022, 233, 1636–1642.
  25. Nath, O.; Fletcher, S.J.; Hayward, A.; Shaw, L.M.; Masouleh, A.K.; Furtado, A.; Henry, R.J.; Mitter, N. A haplotype resolved chromosomal level avocado genome allows analysis of novel avocado genes. Hortic. Res. 2022, 9, uhac157.
  26. Filiault, D.L.; Ballerini, E.S.; Mandáková, T.; Aköz, G.; Derieg, N.J.; Schmutz, J.; Jenkins, J.; Grimwood, J.; Shu, S.; Hayes, R.D.; et al. The Aquilegia genome provides insight into adaptive radiation and reveals an extraordinarily polymorphic chromosome with a unique history. eLife 2018, 7, e36426.
  27. Sánchez-Albarrán, F.; Suárez-Rodríguez, L.M.; Ruíz-Herrera, L.F.; López-Meza, J.E.; López-Gómez, R. Two oleosins expressed in the mesocarp of native mexican avocado, key genes in the oil content. Plant Foods Hum. Nutr. 2021, 76, 20–25.
  28. Chang, Y.; Liu, H.; Liu, M.; Liao, X.; Sahu, S.K.; Fu, Y.; Song, B.; Cheng, S.; Kariba, R.; Muthemba, S.; et al. The draft genomes of five agriculturally important African orphan crops. Gigascience 2019, 8, giy152.
  29. Shyamli, P.S.; Pradhan, S.; Panda, M.; Parida, A. De novo whole-genome assembly of Moringa oleifera helps identify genes regulating drought stress tolerance. Front. Plant Sci. 2021, 12, 766999.
  30. Bowers, J.E.; Chapman, B.A.; Rong, J.; Paterson, A.H. Unravelling angiosperm genome evolution by phylogenetic analysis of chromosomal duplication events. Nature 2003, 422, 433–438.
  31. Hsieh, K.; Huang, A.H. Endoplasmic reticulum, oleosins, and oils in seeds and tapetum cells. Plant Physiol. 2004, 136, 3427–3434.
  32. Siloto, R.M.P.; Findlay, K.; Lopez, V.A.; Yeung, E.C.; Nykifork, C.L.; Moloney, M.M. The accumulation of oleosins determines the size of seed oil bodies in Arabidopsis. Plant Cell 2006, 18, 1961–1974.
  33. Shimada, T.L.; Shimada, T.; Takahashi, H.; Fukao, Y.; Hara-Nishimura, I. A novel role for oleosins in freezing tolerance of oilseeds in Arabidopsis thaliana. Plant J. 2008, 55, 798–809.
  34. Huang, A.H. Plant lipid droplets and their associated proteins: Potential for rapid advances. Plant Physiol. 2018, 176, 1894–1918.
  35. Shao, Q.; Liu, X.; Su, T.; Ma, C.; Wang, P. New insights into the role of seed oil body proteins in metabolism and plant development. Front. Plant Sci. 2019, 10, 1568.
  36. Zhang, D.; Zhang, H.; Hu, Z.; Chu, S.; Yu, K.; Lv, L.; Yang, Y.; Zhang, X.; Chen, X.; Kan, G.; et al. Artificial selection on GmOLEO1 contributes to the increase in seed oil during soybean domestication. PLoS Genet. 2019, 15, e1008267.
  37. Guzha, A.; Whitehead, P.; Ischebeck, T.; Chapman, K.D. Lipid droplets: Packing hydrophobic molecules within the aqueous cytoplasm. Annu. Rev. Plant Biol. 2023, 74, 195–223.
  38. Hu, J.; Chen, F.; Zang, J.; Li, Z.; Wang, J.; Wang, Z.; Shi, L.; Xiu, Y.; Lin, S. Native promoter-mediated transcriptional regulation of crucial oleosin protein OLE1 from Prunus sibirica for seed development and high oil accumulation. Int. J. Biol. Macromol. 2023, 253, 126650.
  39. Jiao, Y.; Leebens-Mack, J.; Ayyampalayam, S.; Bowers, J.E.; McKain, M.R.; McNeal, J.; Rolf, M.; Ruzicka, D.R.; Wafula, E.; Wickett, N.J.; et al. A genome triplication associated with early diversification of the core eudicots. Genome Biol. 2012, 13, R3.
  40. Vanneste, K.; Baele, G.; Maere, S.; Van de Peer, Y. Analysis of 41 plant genomes supports a wave of successful genome duplications in association with the Cretaceous-Paleogene boundary. Genome Res. 2014, 24, 1334–1347.
  41. Hoang, N.V.; Sogbohossou, E.O.D.; Xiong, W.; Simpson, C.J.C.; Singh, P.; Walden, N.; van den Bergh, E.; Becker, F.F.M.; Li, Z.; Zhu, X.G.; et al. The Gynandropsis gynandra genome provides insights into whole-genome duplications and the evolution of C4 photosynthesis in Cleomaceae. Plant Cell 2023, 35, 1334–1359.
  42. Tuskan, G.A.; Difazio, S.; Jansson, S.; Bohlmann, J.; Grigoriev, I.; Hellsten, U.; Putnam, N.; Ralph, S.; Rombauts, S.; Salamov, A.; et al. The genome of black cottonwood, Populus trichocarpa, Torr. & Gray. Science 2006, 313, 1596–1604.
  43. Bredeson, J.V.; Lyons, J.B.; Prochnik, S.E.; Wu, G.A.; Ha, C.M.; Edsinger-Gonzales, E.; Grimwood, J.; Schmutz, J.; Rabbi, I.Y.; Egesi, C.; et al. Sequencing wild and cultivated cassava and related species reveals extensive interspecific hybridization and genetic diversity. Nat. Biotechnol. 2016, 34, 562–570.
  44. Huang, C.Y.; Chen, P.Y.; Huang, M.D.; Tsou, C.H.; Jane, W.N.; Huang, A.H. Tandem oleosin genes in a cluster acquired in Brassicaceae created tapetosomes and conferred additive benefit of pollen vigor. Proc. Natl. Acad. Sci. USA 2013, 110, 14480–14485.
  45. Zou, Z.; Yang, J.H. Genomic analysis of Dof transcription factors in Hevea brasiliensis, a rubber-producing tree. Ind. Crops Prod. 2019, 134, 271–283.
  46. Zou, Z.; Zheng, Y.J.; Xie, Z.N. Analysis of Carica papaya informs lineage-specific evolution of the aquaporin (AQP) family in Brassicales. Plants 2023, 12, 3847.
  47. Edgar, R.C. MUSCLE: Multiple sequence alignment with high accuracy and high throughput. Nucleic Acids Res. 2004, 32, 1792–1797.
  48. Tamura, K.; Stecher, G.; Peterson, D.; Filipski, A.; Kumar, S. MEGA6: Molecular Evolutionary Genetics Analysis version 6.0. Mol. Biol. Evol. 2013, 30, 2725–2729.
  49. Zou, Z.; Yang, J.H.; Zhang, X.C. Insights into genes encoding respiratory burst oxidase homologs (RBOHs) in rubber tree (Hevea brasiliensis Muell. Arg.). Ind. Crops Prod. 2019, 128, 126–139.
  50. Zou, Z.; Zhao, Y.G.; Zhang, L.; Xiao, Y.H.; Guo, A.P. Analysis of Cyperus esculentus SMP family genes reveals lineage-specific evolution and seed desiccation-like transcript accumulation during tuber maturation. Ind. Crops Prod. 2022, 187, 115382.
  51. Chen, C.; Chen, H.; Zhang, Y.; Thomas, H.R.; Frank, M.H.; He, Y.; Xia, R. TBtools: An integrative toolkit developed for interactive analyses of big biological data. Mol. Plant 2020, 13, 1194–1202.
  52. Altschul, S.F.; Madden, T.L.; Schäffer, A.A.; Zhang, J.; Zhang, Z.; Miller, W.; Lipman, D.J. Gapped BLAST and PSI-BLAST: A new generation of protein database search programs. Nucleic Acids Res. 1997, 25, 3389–3402.
  53. Wang, Y.; Tang, H.; Debarry, J.D.; Tan, X.; Li, J.; Wang, X.; Lee, T.H.; Jin, H.; Marler, B.; Guo, H.; et al. MCScanX: A toolkit for detection and evolutionary analysis of gene synteny and collinearity. Nucleic Acids Res. 2012, 40, e49.
  54. Qiao, X.; Li, Q.; Yin, H.; Qi, K.; Li, L.; Wang, R.; Zhang, S.; Paterson, A.H. Gene duplication and evolution in recurring polyploidization-diploidization cycles in plants. Genome Biol. 2019, 20, 38.
  55. Yang, Z. PAML 4: Phylogenetic analysis by maximum likelihood. Mol. Biol. Evol. 2007, 24, 1586–1591.
  56. Bolger, A.M.; Lohse, M.; Usadel, B. Trimmomatic: A flexible trimmer for Illumina sequence data. Bioinformatics 2014, 30, 2114–2120.
  57. Kim, D.; Langmead, B.; Salzberg, S.L. HISAT: A fast spliced aligner with low memory requirements. Nat. Methods 2015, 12, 357–360.
  58. Zou, Z.; Li, M.Y.; Jia, R.Z.; Zhao, H.; He, P.P.; Zhang, Y.L.; Guo, A.P. Genes encoding light-harvesting chlorophyll a/b-binding proteins in papaya (Carica papaya L.) and insight into lineage-specific evolution in Brassicaceae. Gene 2020, 748, 144685.
  59. Xu, Y.G.; Zou, Z.; Guo, J.Y.; Kong, H.; Zhu, G.P.; Guo, A.P. Cloning and functional analysis of CpMGT1, a magnesium transporter gene from Carica papaya. Chin. J. Trop. Crop. 2022, 43, 1114–1121.
  60. Zou, Z.; Gong, J.; An, F.; Xie, G.S.; Wang, J.K.; Mo, Y.Y.; Yang, L.F. Genome-wide identification of rubber tree (Hevea brasiliensis Muell. Arg.) aquaporin genes and their response to ethephon stimulation in the laticifer, a rubber-producing tissue. BMC Genom. 2015, 16, 1001.
Figure 1. Multiple sequence alignment and evolutionary analysis of oleosins. (A) Evolutionary analysis of oleosins. Shown is an unrooted evolutionary tree resulting from full-length oleosins with MEGA6 (maximum likelihood method and bootstrap of 1000 replicates), where the distance scale denotes the number of AA substitutions per site and the name of each clade is indicated next to the corresponding clade. (B) Sequence alignment and structural features of N oleosins together with AtrOLE4, MoOLE4, MoOLE5, and other CpOLEs. MoOLE6N and MoOLE6C represent the N- and C-termini of the MoOLE6 protein, whereas sequence alignment and display were conducted using MUSCLE and Boxshade, respectively. Identical and similar residues are highlighted in black and dark grey, respectively. The conserved 12-residue proline knot is underlined, whereas the C-terminal AAPGA of Clade U and the putative C-terminal insertion of Clade SH are boxed. (Ac: A. coerulea; At: A. thaliana; Atr: A. trichopoda; Bs: B. sinensis; Cp: C. papaya; Cs: C. spinosa; Cv: C. violacea; Gg: G. gynandra; Me: M. esculenta; Mo: M. oleifera; M: mesocarp; N: novel; OLE: oleosin; Pa: P. americana; Pt: P. trichocarpa; SH: seed high-molecular-weight; SL: seed low-molecular-weight; Th: T. hassleriana; T: tapetum; U: universal).
Figure 2. Species-specific distribution of nine oleosin orthogroups identified in this study. Taxonomy relationships of tested species follow that of NCBI Taxonomy (M: mesocarp; N: novel; SH: seed high-molecular-weight; SL: seed low-molecular-weight; T: tapetum; U: universal).
Figure 3. Chromosomal locations and duplication events of oleosin genes. Serial numbers are indicated at the top of each chromosome/scaffold, and the scale is in Mb. Duplicate pairs identified in this study are connected using lines in different colors, i.e., tandem (blue), transposed (green), dispersed (purple), and WGD (gold). (Ac: A. coerulea; Atr: A. trichopoda; Bs: B. sinensis; Chr: chromosome; Cp: C. papaya; Cs: C. spinosa; Cv: C. violacea; Gg: G. gynandra; Mo: M. oleifera; OLE: oleosin; Pa: P. americana; Scf: scaffold; Th: T. hassleriana).
Figure 4. Synteny analyses within and between C. papaya and other species. (A) C. papaya, A. thaliana, A. coerulea, P. americana, and A. trichopoda. (B) P. trichocarpa, M. esculenta, A. coerulea, and A. trichopoda. (C) C. papaya, B. sinensis, C. spinosa, and G. gynandra. (D) C. violacea, G. gynandra, T. hassleriana, and A. thaliana. Syntenic blocks were inferred using MCScanX (E-value ≤ 1 × 10⁻¹⁰; BLAST hits ≥ 5). Oleosin-encoding chromosomes/scaffolds are shown, and only syntenic blocks that contain oleosin genes are marked in red (intra) and purple (inter), respectively. (Ac: A. coerulea; At: A. thaliana; Atr: A. trichopoda; Bs: B. sinensis; Chr: chromosome; Cp: C. papaya; Cs: C. spinosa; Cv: C. violacea; Gg: G. gynandra; Me: M. esculenta; Mo: M. oleifera; OLE: oleosin; Pa: P. americana; Pt: P. trichocarpa; Scf: scaffold; Th: T. hassleriana).
Figure 5. Expression profiles of CpOLE genes. (A) Tissue-specific expression profiles of CpOLE genes. Color scale represents FPKM normalized log2 transformed counts where blue indicates low expression and red indicates high expression. (B) CpOLE transcript abundance relative to the reference gene CpEIEF. Bars indicate SD (N ≥ 3) and uppercase letters indicate difference significance tested following Duncan’s one-way multiple-range post hoc ANOVA (p < 0.01). (Cp: C. papaya; FPKM: fragments per kilobase of exon per million fragments mapped; OLE: oleosin).
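As an illustration of the normalization named in the Figure 5 caption, the sketch below computes FPKM from its standard definition (fragments per kilobase of exon per million fragments mapped) and then applies a log2 transform. The function names, the input numbers, and the pseudocount of 1 are assumptions made for this example; the caption does not state which pseudocount, if any, was used.

```python
import math


def fpkm(fragments: int, exon_length_bp: int, total_mapped_fragments: int) -> float:
    """Fragments per kilobase of exon per million fragments mapped.

    FPKM = fragments * 1e9 / (exon length in bp * total mapped fragments)
    """
    if exon_length_bp <= 0 or total_mapped_fragments <= 0:
        raise ValueError("exon length and library size must be positive")
    return fragments * 1e9 / (exon_length_bp * total_mapped_fragments)


def log2_fpkm(value: float, pseudocount: float = 1.0) -> float:
    """log2 transform; the pseudocount (an assumption here) avoids log2(0)."""
    return math.log2(value + pseudocount)


# Hypothetical gene: 500 fragments on 2000 bp of exon in a 20-million-fragment library.
example_fpkm = fpkm(fragments=500, exon_length_bp=2000, total_mapped_fragments=20_000_000)
example_log2 = log2_fpkm(example_fpkm)
print(example_fpkm, round(example_log2, 3))
```

With a pseudocount of 1, a gene with no mapped fragments maps to 0 on the log2 scale, which keeps unexpressed genes at the low end of a heatmap color scale.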
Table 1. Oleosin genes identified in A. trichopoda, P. americana, A. coerulea, and representative Brassicales species. (AA: amino acid; Ac: A. coerulea; AI: aliphatic index; Atr: A. trichopoda; Bs: B. sinensis; Chr: chromosome; Cp: C. papaya; Cs: C. spinosa; Cv: C. violacea; Gg: G. gynandra; GRAVY: grand average of hydropathicity; II: instability index; kDa: kilodalton; Mo: M. oleifera; MW: molecular weight; OLE: oleosin; Pa: P. americana; pI: isoelectric point; Scf: scaffold; Th: T. hassleriana).
| Gene Name | Locus | Position | Intron No. | AA | MW (kDa) | pI | GRAVY | AI | Duplicate | Mode | Oleosin Location | Clade |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| AtrOLE1 | AmTrH2.13G041800 | Chr13:8312320..8313509(−) | 0 | 168 | 17.80 | 10.31 | 0.263 | 105.71 | - | - | 42..154 | U |
| AtrOLE2 | AmTrH2.05G030700 | Chr5:5790540..5791413(+) | 0 | 147 | 15.51 | 9.36 | 0.349 | 105.58 | AtrOLE1 | Dispersed | 22..134 | M |
| AtrOLE3 | AmTrH2.13G011500 | Chr13:2034840..2035253(−) | 0 | 137 | 14.07 | 9.75 | 0.411 | 103.43 | AtrOLE1 | Dispersed | 22..134 | SL |
| AtrOLE4 | AmTrH2.03G086400 | Chr3:27204354..27205476(+) | 0 | 150 | 15.56 | 9.36 | 0.365 | 104.60 | AtrOLE5 | Dispersed | 21..136 | SH |
| AtrOLE5 | AmTrH2.10G130400 | Chr10:44382338..44383409(−) | 0 | 140 | 14.75 | 9.94 | 0.531 | 103.14 | AtrOLE3 | Dispersed | 17..131 | N |
| PaOLE1 | g26506 | Chr5:7463879..7464397(−) | 0 | 172 | 17.99 | 10.01 | 0.294 | 100.41 | - | - | 47..157 | U |
| PaOLE2 | g9736 | Chr7:51190093..51190608(+) | 0 | 171 | 17.75 | 10.00 | 0.322 | 97.02 | PaOLE1 | WGD | 46..157 | U |
| PaOLE3 | g12771 | Chr2:32085665..32086144(+) | 0 | 159 | 17.47 | 9.74 | 0.211 | 98.81 | PaOLE1 | Dispersed | 19..126 | M |
| AcOLE1 | Aqcoe3G048300 | Chr3:3052078..3052669(+) | 0 | 167 | 17.95 | 9.67 | 0.257 | 99.88 | - | - | 41..153 | U |
| AcOLE2 | Aqcoe7G144100 | Chr7:9197082..9198218(−) | 0 | 150 | 16.14 | 9.70 | 0.112 | 94.33 | AcOLE1 | Dispersed | 24..135 | M |
| AcOLE3 | Aqcoe3G267500 | Chr3:31392522..31393370(+) | 0 | 146 | 15.38 | 9.30 | 0.482 | 112.81 | AcOLE1 | Dispersed | 27..137 | SL |
| AcOLE4 | Aqcoe7G093500 | Chr7:5627997..5628401(−) | 0 | 134 | 13.90 | 10.02 | 0.516 | 108.43 | AcOLE3 | Dispersed | 23..119 | SL |
| AcOLE5 | Aqcoe3G202700 | Chr3:21602227..21603086(−) | 0 | 171 | 18.12 | 9.39 | 0.116 | 95.85 | AcOLE3 | Dispersed | 35..158 | SH |
| CpOLE1 | sunset09G0006960 | Chr9:5118166..5118919(−) | 0 | 167 | 18.08 | 9.84 | 0.396 | 99.76 | - | - | 41..152 | U |
| CpOLE2 | sunset07G0007350 | Chr7:6423723..6424251(+) | 0 | 131 | 13.68 | 9.56 | 0.422 | 108.85 | CpOLE2 | Transposed | 17..125 | M |
| CpOLE3 | sunset09G0008730 | Chr9:6575284..6575789(−) | 1 | 136 | 14.14 | 9.89 | 0.347 | 104.78 | CpOLE1 | Dispersed | 15..127 | SL |
| CpOLE4 | sunset01G0003770 | Chr1:3107234..3107629(−) | 0 | 131 | 13.56 | 10.89 | 0.675 | 122.82 | CpOLE4 | Dispersed | 26..129 | SH |
| CpOLE5 | sunset09G0012790 | Chr9:14063375..14064171(+) | 0 | 149 | 15.96 | 10.34 | 0.169 | 106.71 | CpOLE3 | Dispersed | 28..138 | SH |
| CpOLE6 | sunset04G0023010 | Chr4:30227031..30227636(+) | 0 | 145 | 14.83 | 5.56 | 0.720 | 117.72 | CpOLE3 | Dispersed | 27..104 | N |
| MoOLE1 | - | Scf12:425030..425509(+) | 0 | 159 | 17.39 | 10.00 | 0.348 | 94.47 | - | - | 33..145 | U |
| MoOLE2 | GLEAN_10017149 | Scf5:3253661..3254092(+) | 0 | 143 | 15.25 | 9.23 | 0.344 | 100.35 | MoOLE1 | Dispersed | 21..132 | M |
| MoOLE3 | GLEAN_10002091 | Scf132:402521..407622(+) | 1 | 137 | 14.64 | 9.89 | 0.397 | 112.48 | MoOLE1 | Dispersed | 17..127 | SL |
| MoOLE4 | GLEAN_10017990 | Scf4:3104843..3105331(+) | 0 | 162 | 16.81 | 10.28 | 0.355 | 107.22 | MoOLE4 | γ WGD | 37..149 | SH |
| MoOLE5 | GLEAN_10007003 | Scf35:698143..698517(−) | 0 | 124 | 13.15 | 9.95 | 0.784 | 121.13 | MoOLE6 | Dispersed | 29..116 | SH |
| MoOLE6 | GLEAN_10005491 | Scf65:559782..564316(+) | 0 | 267 | 28.01 | 6.07 | 0.471 | 98.58 | MoOLE3 | Dispersed | 25..123; 171..265 | N |
| BsOLE1 | BsiG0022789 | Chr4:24719773..24720252(−) | 0 | 159 | 17.43 | 9.84 | 0.394 | 102.45 | - | - | 33..145 | U |
| BsOLE2 | BsiG0031356 | Chr5:115702358..115702837(+) | 0 | 159 | 17.55 | 9.69 | 0.424 | 106.73 | BsOLE1 | Dispersed | 33..145 | U |
| BsOLE3 | BsiG0027711 | Chr5:12196723..12197160(+) | 0 | 145 | 15.60 | 9.52 | 0.239 | 98.97 | BsOLE1 | Dispersed | 21..131 | M |
| BsOLE4 | BsiG0023505 | Chr4:38309092..38309529(−) | 0 | 145 | 15.48 | 9.55 | 0.374 | 106.28 | BsOLE3 | α WGD | 21..132 | M |
| BsOLE5 | BsiG0007300 | Chr1:161375811..161376341(−) | 1 | 139 | 14.61 | 9.77 | 0.397 | 103.17 | BsOLE1 | Dispersed | 18..128 | SL |
| BsOLE6 | BsiG0026540 | Chr4:140933592..140934122(−) | 1 | 135 | 14.12 | 9.52 | 0.400 | 105.48 | BsOLE5 | α WGD | 13..124 | SL |
| BsOLE7 | BsiG0026541 | Chr4:140939897..140940427(−) | 1 | 135 | 14.08 | 9.52 | 0.417 | 107.63 | BsOLE6 | Tandem | 13..124 | SL |
| BsOLE8 | BsiG0004876 | Chr1:127499323..127499826(−) | 0 | 167 | 17.89 | 9.39 | 0.180 | 107.49 | BsOLE5 | Dispersed | 38..147 | SH |
| BsOLE9 | BsiG0006900 | Chr1:156712314..156712805(+) | 0 | 163 | 17.58 | 9.51 | -0.144 | 90.98 | BsOLE8 | γ WGD | 34..138 | SH |
| BsOLE10 | BsiG0025867 | Chr4:130692072..130692554(−) | 0 | 160 | 17.10 | 9.97 | 0.078 | 104.25 | BsOLE9 | α WGD | 23..133 | SH |
| CsOLE1 | Cs02G002030 | Chr2:9103395..9103838(+) | 0 | 147 | 15.81 | 9.35 | 0.563 | 112.18 | - | - | 26..133 | U |
| CsOLE2 | Cs15G003740 | Chr15:2440935..2441444(+) | 0 | 146 | 15.87 | 9.68 | 0.205 | 88.90 | CsOLE1 | Dispersed | 21..132 | M |
| CsOLE3 | Cs12G001310 | Chr12:683951..684453(+) | 1 | 115 | 11.92 | 11.00 | 0.738 | 123.83 | CsOLE1 | Dispersed | 16..114 | SL |
| CsOLE4 | Cs06G005610 | Chr6:16853769..16854634(−) | 1 | 149 | 15.51 | 9.99 | 0.170 | 94.36 | CsOLE3 | β WGD | 21..132 | SL |
| CsOLE5 | Cs02G003290 | Chr2:9784653..9785174(+) | 1 | 134 | 14.18 | 10.20 | 0.352 | 108.51 | CsOLE3 | α WGD | 16..128 | SL |
| CsOLE6 | Cs01G009580 | Chr1:23015766..23016309(−) | 1 | 149 | 15.40 | 9.59 | 0.358 | 110.13 | CsOLE7 | α WGD | 34..147 | SH |
| CsOLE7 | Cs02G004630 | Chr2:10712950..10713929(−) | 1 | 161 | 16.81 | 9.69 | 0.103 | 99.44 | CsOLE3 | Dispersed | 32..148 | SH |
| CsOLE8 | Cs14G006880 | Chr14:6118834..6119582(+) | 1 | 152 | 16.32 | 9.69 | 0.245 | 100.07 | CsOLE7 | β WGD | 29..140 | SH |
| CvOLE1 | Clevi.0032s0439 | Scf32:644486..645807(−) | 1 | 159 | 17.04 | 9.75 | 0.383 | 103.14 | - | - | 38..145 | M |
| CvOLE2 | Clevi.0001s1658 | Scf1:3646632..3647045(−) | 0 | 137 | 14.65 | 9.61 | 0.412 | 104.01 | CvOLE1 | Dispersed | 19..128 | SL |
| CvOLE3 | Clevi.0015s0023 | Scf15:3700181..3701021(−) | 1 | 143 | 14.98 | 10.20 | 0.262 | 96.22 | CvOLE1 | Dispersed | 19..130 | SL |
| CvOLE4 | Clevi.0004s1912 | Scf4:2316861..2318229(+) | 1 | 157 | 16.42 | 9.98 | 0.297 | 104.33 | CvOLE6 | Dispersed | 32..144 | SH |
| CvOLE5 | Clevi.0042s0814 | Scf42:1538196..1539122(−) | 1 | 161 | 16.79 | 9.69 | 0.441 | 106.02 | CvOLE4 | γ WGD | 35..147 | SH |
| CvOLE6 | Clevi.0015s0551 | Scf15:942332..943312(−) | 0 | 145 | 14.89 | 5.25 | 0.560 | 106.34 | CvOLE3 | Dispersed | 27..132 | N |
| GgOLE1 | GG13G018590 | Chr13:9683189..9684092(+) | 1 | 164 | 17.30 | 9.57 | 0.410 | 104.15 | - | - | 43..150 | U |
| GgOLE2 | GG05G000440 | Chr5:273726..274730(−) | 1 | 162 | 17.16 | 9.41 | 0.446 | 107.72 | GgOLE1 | Dispersed | 41..148 | U |
| GgOLE3 | GG07G021290 | Chr7:10989271..10989687(+) | 0 | 138 | 14.79 | 9.72 | 0.442 | 110.22 | GgOLE1 | Dispersed | 19..128 | M |
| GgOLE4 | GG02G144870 | Chr2:66911747..66912474(+) | 1 | 144 | 15.05 | 10.20 | 0.272 | 96.94 | GgOLE1 | Dispersed | 19..130 | SL |
| GgOLE5 | GG05G049880 | Chr5:23864915..23865662(+) | 1 | 159 | 16.78 | 9.89 | 0.302 | 104.91 | GgOLE7 | Dispersed | 34..146 | SH |
| GgOLE6 | GG15G098790 | Chr15:45889738..45890318(−) | 1 | 161 | 16.77 | 9.52 | 0.406 | 109.01 | GgOLE5 | Dispersed | 35..147 | SH |
| GgOLE7 | - | Chr6:796424..801985(−) | 1 | 139 | 14.63 | 4.43 | 0.609 | 110.79 | GgOLE4 | Dispersed | 26..124 | N |
| ThOLE1 | LOC104821850 | Scf34:1318549..1319624(−) | 1 | 155 | 16.50 | 9.63 | 0.665 | 113.81 | - | - | 34..141 | U |
| ThOLE2 | LOC104819676 | Scf3:6116766..6117891(−) | 1 | 156 | 16.49 | 9.39 | 0.485 | 105.64 | ThOLE1 | Dispersed | 35..142 | U |
| ThOLE3 | LOC104818593 | Scf3:633964..634782(−) | 0 | 138 | 14.70 | 9.56 | 0.449 | 106.81 | ThOLE1 | Dispersed | 19..128 | M |
| ThOLE4 | LOC104825056 | Scf42:463230..464045(+) | 1 | 144 | 15.05 | 10.20 | 0.332 | 102.99 | ThOLE1 | Dispersed | 19..130 | SL |
| ThOLE5 | LOC104811538 | Scf2:1261264..1262172(−) | 1 | 144 | 15.22 | 9.90 | 0.273 | 100.28 | ThOLE4 | α WGD | 22..133 | SL |
| ThOLE6 | LOC104805374 | Scf11:1401936..1403042(+) | 1 | 159 | 16.89 | 9.89 | 0.177 | 98.74 | ThOLE8 | Dispersed | 34..146 | SH |
| ThOLE7 | LOC104802395 | Scf8:1757388..1758247(+) | 1 | 161 | 16.80 | 9.69 | 0.455 | 110.81 | ThOLE6 | β WGD | 35..147 | SH |
| ThOLE8 | LOC104811693 | Scf2:1907125..1907884(+) | 0 | 142 | 14.50 | 6.56 | 0.492 | 104.44 | ThOLE4 | Dispersed | 28..99 | N |
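For readers unfamiliar with the GRAVY and AI columns of Table 1, the following sketch shows how both indices are conventionally derived from a protein sequence. It assumes the standard definitions (GRAVY as the mean Kyte-Doolittle hydropathy per residue; the aliphatic index as a weighted sum of the mole percentages of Ala, Val, Ile, and Leu), which are the ones used by tools such as ExPASy ProtParam. This section does not state which tool produced the tabulated values, and the peptide below is a toy input, not an oleosin sequence.

```python
# Kyte-Doolittle hydropathy values for the 20 standard amino acids.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def _clean(seq: str) -> str:
    """Uppercase the sequence and reject empty input or non-standard residues."""
    seq = seq.strip().upper()
    unknown = set(seq) - set(KYTE_DOOLITTLE)
    if not seq or unknown:
        raise ValueError(f"empty sequence or unsupported residues: {sorted(unknown)}")
    return seq


def gravy(seq: str) -> float:
    """Grand average of hydropathicity: mean hydropathy over all residues."""
    seq = _clean(seq)
    return sum(KYTE_DOOLITTLE[aa] for aa in seq) / len(seq)


def aliphatic_index(seq: str) -> float:
    """AI = X(Ala) + 2.9 * X(Val) + 3.9 * (X(Ile) + X(Leu)), X in mole percent."""
    seq = _clean(seq)

    def mole_percent(aa: str) -> float:
        return 100.0 * seq.count(aa) / len(seq)

    return (mole_percent("A") + 2.9 * mole_percent("V")
            + 3.9 * (mole_percent("I") + mole_percent("L")))


toy_peptide = "AVIL"  # hypothetical input, chosen only to make the arithmetic easy to follow
print(gravy(toy_peptide), aliphatic_index(toy_peptide))
```

Positive GRAVY values, as reported for nearly all proteins in Table 1, indicate an overall hydrophobic sequence under this scale.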
Table 2. Evolutionary rate of WGD repeats identified in this study. Ks and Ka were calculated using PAML. (At: A. thaliana; Bs: B. sinensis; Cs: C. spinosa; Cv: C. violacea; Ka: nonsynonymous substitution rate; Ks: synonymous substitution rate; Me: M. esculenta; Mo: M. oleifera; OLE: oleosin; Pa: P. americana; Pt: P. trichocarpa; Th: T. hassleriana).
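| Gene1 | Gene2 | Identity (%) | Ks | Ka/Ks |
|---|---|---|---|---|
| PaOLE1 | PaOLE2 | 77.8 | 0.8501 | 0.1372 |
| MoOLE4 | MoOLE5 | 57.7 | 2.0410 | 0.1831 |
| BsOLE3 | BsOLE4 | 89.3 | 0.2700 | 0.2629 |
| BsOLE5 | BsOLE6 | 89.0 | 0.1962 | 0.2749 |
| BsOLE8 | BsOLE9 | 55.0 | 1.5864 | 0.9415 |
| BsOLE9 | BsOLE10 | 76.6 | 0.2810 | 0.2839 |
| CsOLE3 | CsOLE4 | 54.9 | 1.6273 | 0.1248 |
| CsOLE3 | CsOLE5 | 76.5 | 0.3691 | 0.1054 |
| CsOLE6 | CsOLE7 | 75.8 | 0.5869 | 0.1216 |
| CsOLE7 | CsOLE8 | 60.3 | - | - |
| CvOLE4 | CvOLE5 | 64.0 | 1.9437 | 0.1515 |
| ThOLE4 | ThOLE5 | 86.0 | 0.3409 | 0.1610 |
| ThOLE6 | ThOLE7 | 65.6 | 1.5677 | 0.1797 |
| At-Sm1 | At-Sm2 | 66.9 | 1.3093 | 0.1880 |
| At-S3 | At-S5 | 58.2 | 1.3683 | 0.1592 |
| At-S1 | At-S4 | 53.1 | - | - |
| At-S2 | At-S4 | 61.8 | 1.5782 | 0.1625 |
| PtOLE2a | PtOLE2b | 75.1 | 0.3138 | 0.5186 |
| PtOLE3a | PtOLE3b | 86.2 | 0.1619 | 0.8850 |
| PtOLE4a | PtOLE4b | 90.7 | 0.2091 | 0.3161 |
| PtOLE5a | PtOLE5b | 84.0 | 0.3696 | 0.2675 |
| MeOLE1a | MeOLE1b | 81.1 | 0.7428 | 0.1198 |
| MeOLE3a | MeOLE3b | 76.7 | 0.6126 | 0.2047 |
| MeOLE4a | MeOLE4b | 78.1 | 0.4175 | 0.3827 |
| MeOLE4b | MeOLE5 | 59.6 | 1.7862 | 0.1548 |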
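Table 2 reports Ks and the Ka/Ks ratio but not Ka itself. The sketch below, using four rows copied from the table, recovers Ka as Ks × (Ka/Ks) and attaches the conventional reading of the ratio (below 1 suggests purifying selection, above 1 positive selection). The class and function names are hypothetical, and the labels are the generic textbook interpretation rather than a conclusion drawn in this section; pairs listed with "-" are treated as not estimated.

```python
from typing import NamedTuple, Optional


class Pair(NamedTuple):
    gene1: str
    gene2: str
    ks: Optional[float]     # synonymous substitution rate; None where Table 2 shows "-"
    ka_ks: Optional[float]  # Ka/Ks ratio; None where Table 2 shows "-"


# A subset of Table 2 rows, copied verbatim.
PAIRS = [
    Pair("BsOLE3", "BsOLE4", 0.2700, 0.2629),
    Pair("BsOLE8", "BsOLE9", 1.5864, 0.9415),
    Pair("CsOLE7", "CsOLE8", None, None),
    Pair("PtOLE3a", "PtOLE3b", 0.1619, 0.8850),
]


def ka(pair: Pair) -> Optional[float]:
    """Recover Ka from the two reported columns: Ka = Ks * (Ka/Ks)."""
    if pair.ks is None or pair.ka_ks is None:
        return None
    return pair.ks * pair.ka_ks


def selection_label(ratio: Optional[float]) -> str:
    """Conventional interpretation of a Ka/Ks ratio."""
    if ratio is None:
        return "not estimated"
    if ratio < 1:
        return "purifying"
    if ratio > 1:
        return "positive"
    return "neutral"


for pair in PAIRS:
    value = ka(pair)
    shown = "-" if value is None else f"{value:.4f}"
    print(f"{pair.gene1}/{pair.gene2}: Ka={shown}, {selection_label(pair.ka_ks)}")
```

Every ratio estimated in Table 2 is below 1, so under this reading all of the listed duplicate pairs fall in the purifying-selection range, although BsOLE8/BsOLE9 (0.9415) and PtOLE3a/PtOLE3b (0.8850) sit close to the neutral boundary.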
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

Zou, Z.; Zhang, L.; Zhao, Y. Integrative Analysis of Oleosin Genes Provides Insights into Lineage-Specific Family Evolution in Brassicales. Plants 2024, 13, 280. https://doi.org/10.3390/plants13020280