Article

Frequent Acquisition of Glycoside Hydrolase Family 32 (GH32) Genes from Bacteria via Horizontal Gene Transfer Drives Adaptation of Invertebrates to Diverse Sources of Food and Living Habitats

1
Department of Entomology, 123 Waters Hall, Kansas State University, Manhattan, KS 66506, USA
2
Hard Winter Wheat Genetics Research Unit, Center for Grain and Animal Health Research, US Department of Agriculture, Agricultural Research Services, 4008 Throckmorton Hall, Kansas State University, Manhattan, KS 66506, USA
3
Department of Biochemistry and Molecular Biophysics, 141 Chalmers Hall, Kansas State University, Manhattan, KS 66506, USA
*
Author to whom correspondence should be addressed.
Int. J. Mol. Sci. 2024, 25(15), 8296; https://doi.org/10.3390/ijms25158296 (registering DOI)
Submission received: 19 June 2024 / Revised: 23 July 2024 / Accepted: 24 July 2024 / Published: 30 July 2024
(This article belongs to the Section Biochemistry)

Abstract

Glycoside hydrolases (GHs, also called glycosidases) catalyze the hydrolysis of glycosidic bonds in polysaccharides. Numerous GH genes have been identified from various organisms and are classified into 188 families, abbreviated GH1 to GH188. Enzymes in the GH32 family hydrolyze fructans, which are present in approximately 15% of flowering plants and are widespread across microorganisms. GH32 genes are rarely found in animals, as fructans are not a typical carbohydrate source utilized in animals. Here, we report the discovery of 242 GH32 genes identified in 84 animal species, ranging from nematodes to crabs. Genetic analyses of these genes indicated that the GH32 genes in various animals were derived from different bacteria via multiple, independent horizontal gene transfer events. The GH32 genes in animals appear functional based on the highly conserved catalytic blades and triads in the active center despite the overall low (35–60%) sequence similarities among the predicted proteins. The acquisition of GH32 genes by animals may have a profound impact on sugar metabolism for the recipient organisms. Our results together with previous reports suggest that the acquired GH32 enzymes may not only serve as digestive enzymes, but also may serve as effectors for manipulating host plants, and as metabolic enzymes in the non-digestive tissues of certain animals. Our results provide a foundation for future studies on the significance of horizontally transferred GH32 genes in animals. The information reported here enriches our knowledge of horizontal gene transfer, GH32 functions, and animal–plant interactions, which may result in practical applications. For example, developing crops via targeted engineering that inhibits GH32 enzymes could aid in the plant’s resistance to animal pests.

1. Introduction

Glycoside hydrolases (GHs, also called glycosidases) catalyze the hydrolysis of glycosidic bonds in polysaccharides [1]. The primary function of GHs is to release simple sugars from polysaccharides including glucans (polyglucose) and fructans (polyfructose). In addition, GHs are involved in other functions such as plant defense, signaling, and stress responses [2]. GHs are ubiquitous and functionally diverse. To date, more than 1,000,000 putative GH sequences have been identified and are classified into 188 families (https://www.cazypedia.org/index.php/Glycoside_Hydrolase_Families) (accessed on 1 May 2024). Common examples of GHs include cellulases, hemicellulases, amylases, pectinases, and fructosidases. One of the GH gene families is GH32, which contains glycoside hydrolases that hydrolyze the glycosidic bond between two or more fructose-containing carbohydrates (fructan), or between a fructose-containing carbohydrate and a non-carbohydrate moiety [3]. Well-characterized GH32 enzymes include invertases (also called β-fructofuranosidases) that hydrolyze sucrose in diverse species across multiple kingdoms, including the bacterium Thermotoga maritima [4], the yeast Saccharomyces cerevisiae [5], the plant Arabidopsis thaliana [6], and the insect Bombyx mori [7]. Other examples include exo-inulinases that hydrolyze β-(2 → 1) linked fructans (inulin) from the fungus Aspergillus awamori [8] and the plant Cichorium intybus [9]; and levanases that hydrolyze β-(2 → 6) linked fructans (levan) from the bacterium Bacteroides thetaiotaomicron [10]. The 3D structures of these enzymes in complex with ligands suggest that each contains a catalytic triad necessary for enzymatic activity, with the consensus sequences “wmnDpng”, “RDP”, and “Ec” [7].
Although fructans and their hydrolyzing enzymes, such as GH32s, are widespread across microorganisms, only approximately 15% of flowering plants contain significant amounts of fructans [11]. The most abundant fructan-containing plant species grow in dry environments or go through cold winters, suggesting that fructan content might enhance their tolerance to drought and cold stresses. In fact, the evolution and diversification of plants containing fructans is speculated to have occurred in the mid-Tertiary era and again in the phases of the early Holocene epoch which are associated with dry climates [11]. Crops such as wheat, bananas, onion, garlic, artichoke, asparagus, and chicory contain a relatively high content of fructans in their roots, tubers, stems, tiller bases, and leaf sheath tissues [12]. Fructans are not used as an energy reserve by the majority of plant species or by any animal species. In fact, during evolution, animals lost GH32 genes and the ability to digest fructans using their own digestive systems. Animals that live on plants containing significant amounts of fructans therefore need an alternative way to use host fructans as a source of nutrients.
Horizontal gene transfer (HGT) is the phenomenon of acquiring genetic material by one species directly from a different species. It is often a means for the recipient to gain extra abilities, for example, digesting undigestible food [13]. HGT has been documented in a wide range of organisms from viruses to animals. Genes gained via HGT can provide certain advantages for the recipient organism. In some cases, a single HGT event can lead to drastic adaptation to otherwise hostile environments for the recipient. Some examples include the horizontal transfer of a lectin-like antifreeze gene, which allows several recipient fish species to live in an otherwise lethal environment [14]; the acquisition of antibiotic resistance genes allows a range of bacteria to gain drug resistance [15]; and the gaining of genes encoding plant cell wall–degrading enzymes enhances the ability of fungal parasites to attack hosts [16].
The most-documented horizontally transferred GH genes from prokaryotes to eukaryotes encode enzymes that digest cell walls, including pectinases [17] and cellulases [18], which enhance the ability of parasites to attack host plants. Recent evidence suggests that GH32 genes encoding levanases and inulinases are also potential HGT targets from prokaryotes to eukaryotes. Gaining these enzymes could allow the recipients to utilize otherwise undigestible nutrients. For example, an HGT-derived, sucrose-hydrolyzing GH32 gene allows the nematode Globodera pallida to utilize host-derived sucrose as nutrients [19]. Previously, our group discovered a single HGT event of a GH32 gene that led to the acquisition of a fructan metabolic pathway in a gall midge [20]. Here, we expanded our search for horizontally transferred GH32 genes to all animal species with a sequenced genome deposited in GenBank. We detected a wide range of animal species that have independently acquired different GH32 genes from different sources. Substantial evidence suggests that the repeated acquisition of GH32 genes by various animal species may be a major driving force for species adaptation to diverse food sources and habitats.

2. Results

2.1. GH32 Genes in Diverse Animals

As of 31 December 2023, more than 3300 animal genomes had been sequenced and deposited in GenBank. We searched these genomes and found that 940 (28%) of the animal species contain GH32-encoding genes. These species are distributed across nine phyla: Porifera (sponges, 1 species), Cnidaria (coelenterates, 3 species), Platyhelminthes (flatworms, 1 species), Annelida (segmented worms, 1 species), Mollusca (mollusks, 1 species), Tardigrada (water bears, 2 species), Rotifera (rotifers, 11 species), Nematoda (nematodes, 24 species), and Arthropoda (arthropods, 895 species) (Table 1, Figure S1). The largest phylum containing GH32-encoding genes was Arthropoda, with 895 species distributed in 18 orders. Lepidoptera (moths and butterflies) is the largest order, with 730 species that contain GH32-encoding genes. Because every lepidopteran genome sequenced to date contains GH32 gene(s) with similar gene structures (all without introns) and protein similarity > 80%, we selected 11 representative lepidopterans for this study. Table 1 lists the number of animal species per group containing GH32 genes, the number of GH32 genes found per species, and the total number of GH32 genes identified per group.

2.2. Conservation of the Catalytic Triads and 3D Structures

Despite an overall sequence identity of only 35–60% among the GH32 putative proteins identified from different animals, the catalytic triad sequences are highly conserved (Figure 1A, Figure S2). The consensus sequence at the active site of region 1 is XWZNDPNGZ, where Z represents a hydrophobic amino acid residue of either M, I, L, V, Y, or W; and X represents any amino acid residue. The consensus sequence of the active site of region 2 is XXZRDPXZZ, and the consensus sequence of active region 3 is XXZZECPXZ. Taken together, these active sites produced a catalytic triad of D, D, and E, which were conserved among 93.4%, 94.1%, and 99.2% of all the sequences at the three sites, respectively.
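The three consensus patterns described above can be expressed as regular expressions. The following is a minimal sketch in Python, not the procedure used in this study (which relied on alignment to the B. mori enzyme and on predicted structures). The function names and the toy sequence are hypothetical and serve only to illustrate how X (any residue) and Z (M, I, L, V, Y, or W) translate into a pattern.

```python
import re

# Z in the text: one hydrophobic residue out of M, I, L, V, Y, or W.
HYDROPHOBIC = "[MILVYW]"
# X in the text: any amino acid residue (any single character here).
ANY_RESIDUE = "."


def consensus_to_regex(consensus):
    """Translate a consensus string such as 'XWZNDPNGZ' into a compiled regex."""
    parts = []
    for symbol in consensus:
        if symbol == "X":
            parts.append(ANY_RESIDUE)
        elif symbol == "Z":
            parts.append(HYDROPHOBIC)
        else:
            parts.append(re.escape(symbol))
    return re.compile("".join(parts))


# Consensus strings copied from the text; the dictionary keys are illustrative.
REGIONS = {
    "region1": consensus_to_regex("XWZNDPNGZ"),
    "region2": consensus_to_regex("XXZRDPXZZ"),
    "region3": consensus_to_regex("XXZZECPXZ"),
}


def find_triad_regions(protein):
    """Return the 0-based start of the first match per region, or None."""
    starts = {}
    for name, pattern in REGIONS.items():
        match = pattern.search(protein.upper())
        starts[name] = match.start() if match else None
    return starts


# Invented toy sequence (not a real GH32 protein) embedding the three motifs.
toy_protein = "MKT" + "AWMNDPNGL" + "GGG" + "STVRDPKVL" + "GGG" + "AKLVECPDL" + "K"
print(find_triad_regions(toy_protein))
```

A sequence lacking one of the regions yields None for that key, which gives a quick way to flag candidates that may lack a complete D, D, E triad before closer inspection.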
The three-dimensional structures of the identified proteins are also highly conserved. As shown in Figure 1B, five blades (marked as blades I, II, III, IV, and V) at the active site of the Bombyx mori β-fructofuranosidase were determined via X-ray diffraction of the protein crystal [7]. All three-dimensional structures of the identified proteins predicted using AlphaFold-2 (Figure 1B and Figure S3) contained these five blades, although some proteins contain additional domains.

2.3. Structural Variation among Genes from Different Animals and Genes from the Same Animal

Large variation exists not only in the sequences of the putative proteins but also in gene structure, both among different animal species and among some genes within the same species (Figure 2). First, the positions and distribution of introns in genes from different animals vary (Figure 2A). Second, the number of introns per gene also varies, ranging from zero (in 141 genes) to fourteen (in 7 genes). Third, intron sizes differ greatly: most introns are 100–200 nucleotides long (Figure 2C), but some exceed 5000 bp.
Not only are the gene structures between animals different, but also they can be quite different within the same animal. For example, there are three GH32 genes in the genome of Sinella curviseta, and each of the genes has a different structure (Figure 2D). Two genes have two introns. However, the locations of the introns are different between the two genes based on the positions corresponding to their encoded amino acid sequences. The other gene has four introns. Genes with different numbers of introns from the same animal were also found in M. destructor, Contarinia nasturtii, Bradysia coprophila, Bradysia odoriphaga, Cyphoderus albinus, Pogonognathellus flavescens, Yoshiicerus persimilis, Tomocerus vulgaris, Tomocerus qinae, Orchesella cincta, Folsomia candida, Thrips palmi, and Oedothorax gibbosus.

2.4. Faster Diversification Once GH32 Genes Transferred from Bacteria to Animals

The proteins identified from various animal species along with best-matched bacterial sequences and selected plant and fungal proteins were subjected to a CLANS analysis. The results revealed that the GH32s from animal species are broadly divided into two major groups, except for some GH32s from nematodes and rotifers, which formed two additional small clusters (Figure 3). The two major groups are named GH32A, which includes the genes encoding proteins with levanase/inulinase activities, and GH32B, which includes genes encoding proteins with sucrase activities (Figure 3) [20]. The functions of the two minor GH32 clusters remain to be determined. Bacterial GH32s are widely distributed across all the groups. Interestingly, within each group, bacterial sequences are mainly clustered in the central region, whereas those from animals are located on the edge area of each cluster. This phenomenon suggests that GH32 genes have been diversifying more rapidly once transferred to animals than in their original bacterial donors.

2.5. Multiple GH32-Gene Transfer Events to Animal Genomes from Diverse Bacteria

The predicted GH32 proteins are highly diverse, with sequence identity among proteins from different species ranging from 35 to 60%. The predicted protein sequences were searched with BLAST against the bacterial proteins deposited in GenBank. The best bacterial matches of these 242 proteins were proteins from 21 orders of bacteria (Figure 4A). The majority (approximately 80%) of the matches were from six orders: Bacillales (32%), Cytophagales (18%), Enterobacterales (10%), Chitinophagales (8%), Hyphomicrobiales (6%), and Lactobacillales (5%). The remaining ~20% of the matches were from other orders, including Halanaerobiales, Sphingobacteriales, Bacteroidales, Flavobacteriales, Streptomycetales, Burkholderiales, Gemmatales, Chloroflexales, and Phototrophicales, with each representing less than 3%.
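The order-level shares reported above amount to a frequency tally over the best-match taxa. A minimal sketch in Python follows, assuming that each animal protein has already been assigned the bacterial order of its best match. The function name and the toy list are hypothetical, and the toy data are invented rather than taken from this study.

```python
from collections import Counter


def order_shares(best_match_orders):
    """Return (order, percent) pairs sorted from most to least frequent."""
    counts = Counter(best_match_orders)
    total = sum(counts.values())
    return [(order, 100.0 * n / total) for order, n in counts.most_common()]


# Invented toy input: one bacterial order per animal GH32 protein.
toy_orders = ["Bacillales"] * 5 + ["Cytophagales"] * 3 + ["Enterobacterales"] * 2

for order, percent in order_shares(toy_orders):
    print(f"{order}: {percent:.0f}%")
```

The same tally can be run per animal group or per species to show how many bacterial orders are represented among the best matches of a single recipient.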
The relationship between potential bacterial donors and animal recipients appears complex. GH32 proteins from a single animal group (order) with more than two species all had the best matches to proteins from at least two different bacterial orders (Figure 4B). For example, 35 GH32 proteins from seven species of dipterans had the best matches to proteins from seven orders of bacteria. Six GH32 proteins from two species of water bears had the best matches to proteins from five orders of bacteria. More surprisingly, different GH32 proteins even from a single animal species had the best matches to proteins from different orders of bacteria. Figure 4C shows that 12 (5%) animal species from eight (38%) groups had the best matches to proteins from different orders of bacteria. For example, the springtail, Sinella curviseta, has three GH32 genes and each of the three encoded proteins had the best matches to proteins from different orders of bacteria. The 12 predicted GH32 proteins from the dipteran, Contarinia nasturtii, had the best matches to proteins from five different orders of bacteria.
Multiple best bacterial matches of animal GH32 proteins suggest that the GH32 genes from these animals could have been derived independently from different bacterial donors. The GH32 predicted proteins from different types of animals clustered with GH32 proteins from different types of bacteria based on a phylogenetic analysis (Figure 5). For example, GH32 proteins from beetles are distributed in four different clusters together with proteins from different orders of bacteria. GH32 proteins from wasps are distributed in five clusters. GH32 proteins from other groups of animals including nematodes, dipterans (flies), spiders/mites, rotifers, and water bears are also distributed to different clusters. The only exceptions are the GH32 proteins from moths/butterflies, which are found in a single cluster with bacterial proteins. Indeed, GH32 genes from moths/butterflies share the same gene structure without any introns, indicating that they may share the same origin before diversifying into different species.

2.6. Impact of Derived GH32 Genes on Sugar Metabolism in the Recipient Animals

We hypothesized that the acquisition of GH32 genes by animals might give these animals the ability to digest fructans. Initial metabolite profiling indicated changes in the abundance of both fructose and glucose in the tissues of animals containing GH32 genes (Figure 6A). For example, relatively high levels of fructose were detected in feeding-stage Hessian fly larvae, likely derived from the digestion of ingested fructans by the GH32 enzymes. In contrast, virtually no fructose was detected in the Hessian fly’s close relative, the fungus gnat (Bradysia spp.), which, based on PCR results, does not contain any GH32 genes. Interestingly, we detected low levels of fructose in the pea aphid, which also contains no GH32 genes [22]. It is possible that the low levels of fructose were derived from the high concentrations of sucrose in its phloem diet. We also detected low levels of fructose in Manduca sexta larvae, whose genome contains several GH32 genes. In comparison, the levels of glucose fell within a similar range in all the animals except pea aphids, which contained very high levels of glucose (Figure 6A). This is likely due to the high levels of sucrose ingested from the phloem.
To further examine the potential impact of HGT-derived GH32 enzymes on sugar metabolism, we selected four representative insect species to compare the absolute concentrations of fructose, difructose, and glucose. Difructose is the last intermediate in the digestion of fructans, such as levan and inulin, before the release of fructose. We compared the levels in the Hessian fly; the wheat stem sawfly, Cephus cinctus (which feeds on the same tissue of wheat as the Hessian fly); the fungus gnat; and the red flour beetle, Tribolium castaneum (which feeds on wheat grains or flour). According to our analyses, of these four species only the Hessian fly genome contains GH32 genes [23,24]. We detected a large amount of fructose in Hessian fly larvae, and essentially no fructose in the other three species (Figure 6B). Similarly, a high level of difructose is present in Hessian fly larvae, but very little or none is present in larvae of the wheat stem sawfly, fungus gnat, or red flour beetle. In contrast, lower levels of glucose were observed in the Hessian fly and sawfly larvae compared to the levels in the fungus gnat and red flour beetle larvae.

3. Discussion

Horizontally transferred genes from a different species are common in animal species. The transfer of GH genes from bacteria to animals is one of the main forces driving animal evolution and adaptation. Previous studies have mainly focused on horizontally transferred GH genes encoding enzymes capable of hydrolyzing cell walls, such as pectinases, cellulases, and hemicellulases [16,17,25]. The acquisition of cell wall-hydrolyzing enzymes enhances the ability of animal species such as nematodes and other parasitic species to attack host plants more effectively [19]. Recently, it was reported that genes encoding GH32 enzymes were horizontally transferred from bacteria to animals [19,20,26]. This study confirmed many animal species have obtained GH32 genes via horizontal gene transfer. It is possible that the acquisition of GH32 genes by animal species could not only be to gain the ability to attack host plants but also to access nutrients from hosts. Fructans, such as levan and inulin, are nonstructural storage sugars used as alternative energy reserves to glucans in various microorganisms and in certain plant species due to their roles in both drought and cold tolerance [27,28]. In contrast, animals and most other plants only use glucans, mainly glycogens and starch, as energy reserves. Animals cannot access the nutrients of fructans due to the lack of GH32 enzymes. We speculate that the gaining of GH32 genes via horizontal gene transfer allows the recipient animals to use the otherwise inaccessible nutrients of storage fructans.
A striking feature of horizontally transferred GH32 genes from bacteria to animals in comparison with other horizontally transferred genes is the frequency and independence associated with multiple transfer events from diverse donors to diverse recipients. As shown in Figure 4, most of the GH32 genes do not have an orthologous relationship among the genes from related animal species. We highlighted GH32 proteins from different beetles that formed different clusters with proteins from different orders of bacteria. Similar phenomena exist with GH32 proteins from other animal groups, such as wasps, flies, and rotifers. These observations suggest multiple gene transfer events have occurred independently even among sister species. In some cases, different GH32 genes from the same animal species were derived via different transfer events from different donors. An example includes the three genes from S. curviseta (Figure 1C and Figure 3D), which were apparently obtained from three different donors. This phenomenon is quite different from the horizontal transfer of other GH gene families from bacteria to animals. For example, the transfer of a pectinase gene (a GH28 member) from a bacterium to an insect preceded the diversification of the suborder of Polyneoptera, and the transferred gene has been retained in many herbivorous species of this suborder [29]. A perfect orthologous relationship exists among the horizontally transferred cellulase genes (GH5 family members) identified from multiple cerambycid beetles including Apriona japonica and A. glabripennis, suggesting that a single transfer event happened preceding the diversification of the beetles into different species [25]. A gene (a member of the GH43 family) encoding an α-arabinofuranosidase transferred from a bacterium to the springtail, Folsomia candida, is also closely related to genes found in other springtails including Orchesella cincta and Allacma fusca [30].
The reason(s) for the frequent and dynamic transferring of GH32 genes from bacteria to animals remains unknown. One possible explanation is that there are different types of fructans that require different enzymes to digest them. For example, the fructan inulin has mostly or exclusively the (2 → 1) fructosyl–fructose linkages, whereas levan has mostly or exclusively the (2 → 6) fructosyl–fructose linkages, depending on different types of plants [31]. There are two different ways to gain enzymes that digest substrates with different linkages. One way is to duplicate the same gene and then diversify the duplicates to gain different substrate specificities. This has apparently happened in the gall midge, Mayetiola destructor, where a single horizontally transferred gene duplicated into more than 10 copies, resulting in functional proteins with enzymatic activities towards different substrates [20]. The accelerated diversification of GH32 genes once they are transferred to animals from bacteria (Figure 3) also supports the possibility of GH32 genes acquiring different enzymatic specificity via diversification in the recipient animals. Another way to gain the ability to digest different substrates is to acquire genes encoding enzymes with different substrate specificities directly from other species via multiple horizontal gene transfer events. This might have happened in most animal species with transferred GH32 genes since there is no orthologous relationship between the GH32 genes even among closely related animal species.
Another interesting phenomenon associated with GH32 genes in animals is that GH32 genes can be present in one animal species but absent in a closely related species. For example, multiple GH32 genes are present in the dipterans M. destructor and Contarinia nasturtii, but not in related species such as Drosophila melanogaster, Bactrocera dorsalis, or Orseolia oryzae. This could be because only approximately 15% of plant species contain storage fructans, and the plant species with fructans are not necessarily genetically related. For example, although rice and wheat are genetically related cereals, wheat stems contain significant amounts of fructans [32], whereas rice stems contain few fructans [33,34]. Even within the same plant species, one tissue may contain large amounts of fructans while other tissues do not. For example, wheat stems contain relatively high levels of fructans, but mature wheat grains contain very few fructans [35,36]. If two animal species share the same origin but differentiated to live on two different but genetically related host plants, one of which has fructans and the other does not, the animal species feeding on the plant with fructans would gain an evolutionary advantage by acquiring GH32 genes, whereas the species feeding on plants without fructans would not need to acquire or retain any GH32 genes.
The acquisition of GH32 genes in some recipient animals has an advantageous impact on their metabolism. For example, the combined abundance of fructose and difructose exceeds the abundance of glucose in Hessian fly larvae (Figure 6B). During the long course of evolution, metabolism in animals was optimized to use glucose and trehalose (di-glucose) as blood sugars and as the major fuel for energy and metabolic intermediate production. In mammals, fructose can only be metabolized in certain tissues [37]. In fact, too much fructose in mammals can trigger the release of an intracellular alarm signal, which results in the organism going into a so-called ‘safety mode’, leading to multiple potential health issues [38]. How GH32-recipient animals metabolize the increased fructose efficiently needs to be investigated. Even though glucose and fructose have the same chemical composition (C6H12O6), they cannot be converted to each other directly in a reversible chemical reaction in living organisms. Rather, fructose is converted to glucose through multiple, very costly chemical reactions via the aldolase pathway [39]. One hypothetical way to use elevated fructose efficiently without disrupting sugar metabolism is to use the oligomers and polymers of fructose directly as storage sugars, as some plants do, without converting them into glucose and then into glucose polymers such as glycogen. Initial evidence of GH32 genes expressed in non-digestive tissues supports such a hypothesis. In Hessian fly larvae, several GH32 genes are expressed at high levels in fat bodies and Malpighian tubules, and the enzymatic activity of GH32 enzymes is also detected in these tissues [20]. The fructans levan and inulin are detected in non-digestive tissues as well as in the non-feeding stage (eggs) [20]. These pieces of evidence suggest that animals may directly use fructans as storage sugars.
Further studies are needed to reveal how fructans ingested from plants are transported to other non-digestive tissues and/or how fructans are synthesized in non-digestive tissues with fructose released from ingested fructans via the action of GH32 enzymes. The Hessian fly genome does not contain genes encoding enzymes for fructan synthesis. However, genes encoding enzymes for fructan synthesis were found in bacteria associated with Hessian flies. It is possible that symbiotic bacteria help to synthesize fructans from GH32 enzyme-released fructose in the non-digestive tissues and non-feeding stages of the Hessian fly. Further research is needed to explore these possibilities.
How GH32 genes were transferred from various bacteria to numerous animal species remains unknown. The GH32 genes are located in bacterial genomes without any obvious transposon elements around them. Presumably, there are only two possible routes: direct transfer of a gene fragment, or integration of a gene into a vector (plasmid or phage) followed by vector-mediated transfer into an animal germ cell. Further research is needed to delineate the transfer mechanisms. Similarly, the genetic mechanisms driving the expression of GH32 genes in different tissues of the recipient animal species remain unclear. It is known that some promoters derived from certain transposons or viruses can enable the transcription of horizontally transferred bacterial genes in eukaryotes [40,41]. However, analyses of nucleotide sequences in the promoter regions of the GH32 genes in animals revealed no evidence of potential transposon- or virus-originated elements or sequences from bacteria, indicating that the cis-elements driving the expression of the horizontally transferred bacterial genes in different tissues of the recipient animals may have arisen by some de novo mechanism.
In conclusion, we identified a large number of GH32 family genes in a wide range of organisms living on different food sources and habitats. Phylogenetic analyses suggest that GH32 genes in different animals were obtained via multiple gene transfer events independently from diverse bacterial donors. The predicted three-dimensional structures and the catalytic triads of the proteins encoded by the identified GH32 genes are highly conserved despite the overall low sequence similarity among these proteins. The conservation of the three-dimensional structures and the presence of the catalytic triads in nearly all the identified proteins suggest these GH32 enzymes are functional. The presence of GH32 genes in animal genomes appears to have a profound impact on sugar metabolism in the recipient animal species.

4. Materials and Methods

4.1. Gene Identification

We used the amino acid sequence of the GH32 β-fructofuranosidase (GenBank accession number ABN04092) from the bacterium Bifidobacterium longum, whose structure is known, to search the nucleotide sequence databases in GenBank with tblastn [42]. We searched against the whole-genome shotgun contigs (WGS) database since it contains more sequenced genomes than the nonredundant (NR) database. The GH32 genes identified in the WGS database were compared with those in the NR database if the sequences were available. Due to the large number of genome sequences available in the WGS database, we performed multiple searches, each limited to the genomes of one phylum, one class, or one order of animals, to increase the computational efficiency of the BLAST searches. A BLAST hit was considered a match to GH32 if the e-value was ≤0.05 and sequence identity was ≥30% with gaps ≤10%.
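The acceptance criteria above (e-value ≤0.05, identity ≥30%, gaps ≤10%) can be written as a simple filter. The sketch below is illustrative only. It assumes that the e-value, percent identity, and percent gaps have already been parsed from the tblastn output into a record, and the `BlastHit` class, its field names, and the toy hits are hypothetical rather than part of any BLAST tool or of this study's pipeline.

```python
from dataclasses import dataclass


@dataclass
class BlastHit:
    """One parsed alignment; the field names are illustrative assumptions."""
    subject_id: str
    evalue: float
    identity_pct: float  # percent identical positions in the alignment
    gap_pct: float       # percent gap positions in the alignment


def passes_gh32_thresholds(hit, max_evalue=0.05, min_identity_pct=30.0,
                           max_gap_pct=10.0):
    """Apply the three cutoffs stated in the text; all bounds are inclusive."""
    return (hit.evalue <= max_evalue
            and hit.identity_pct >= min_identity_pct
            and hit.gap_pct <= max_gap_pct)


# Invented toy hits, one passing and one failing on each criterion.
toy_hits = [
    BlastHit("contig_A", 1e-40, 42.0, 3.5),   # passes all three thresholds
    BlastHit("contig_B", 0.2, 38.0, 2.0),     # fails: e-value too high
    BlastHit("contig_C", 1e-10, 27.5, 1.0),   # fails: identity too low
    BlastHit("contig_D", 1e-12, 35.0, 14.0),  # fails: too many gaps
]

kept = [hit.subject_id for hit in toy_hits if passes_gh32_thresholds(hit)]
print(kept)
```

Keeping the cutoffs as keyword arguments makes it easy to see how the candidate list changes if a stricter e-value or identity threshold is applied.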
Once a potential gene was identified from the BLAST analysis, the candidate gene sequence was extracted from GenBank. Intron–exon boundaries were determined by aligning DNA sequences with protein sequences following the rule of the junction consensus of intron donors and acceptors [43]. If a DNA sequence contained a small deletion or mutation that could disrupt the translation of the DNA sequence into a full-length protein, primers were designed to amplify that region, and PCR products were sequenced to check whether there were errors in the original sequence.
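As an illustration of checking a candidate intron against donor and acceptor consensus sequences, the sketch below assumes the canonical GT-AG rule (introns begin with GT and end with AG on the coding strand). Whether this is exactly the rule applied via reference [43] is an assumption, and the function names, coordinates, and toy sequence are hypothetical.

```python
def has_canonical_splice_sites(genomic, start, end):
    """Check that genomic[start:end] begins with GT and ends with AG.

    Coordinates are 0-based and half-open, on the coding strand.
    """
    intron = genomic[start:end].upper()
    return len(intron) >= 4 and intron.startswith("GT") and intron.endswith("AG")


def splice(genomic, introns):
    """Remove the given non-overlapping introns and return the joined exons."""
    pieces = []
    position = 0
    for start, end in sorted(introns):
        pieces.append(genomic[position:start])
        position = end
    pieces.append(genomic[position:])
    return "".join(pieces)


# Invented toy gene: exon 1 + a 12-nt intron + exon 2 (not a real GH32 gene).
toy_gene = "ATGGCC" + "GTAAGTTTTCAG" + "GAATAA"
toy_intron = (6, 18)

print(has_canonical_splice_sites(toy_gene, *toy_intron))
print(splice(toy_gene, [toy_intron]))
```

In practice, the spliced sequence would then be translated and compared with the aligned protein sequence to confirm that the reading frame is preserved across each junction.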

4.2. Identification of the Catalytic Triads

The locations of the catalytic triads in a protein were determined by aligning individual sequences with the GH32 enzyme (GenBank accession: NP_001119721) from B. mori [7]. The location of a triad region identified from an alignment was further confirmed by comparing it with the corresponding catalytic site in the three-dimensional structure predicted using AlphaFold-2 [44]. The triads in each protein were marked with different colors in the protein sequence in Figure S2.

4.3. Prediction of Three-Dimensional Structures of the Identified Proteins

The three-dimensional structures of the identified proteins were predicted using AlphaFold version 2 [44]. The predicted structures are given in Figure S3.

4.4. Best Bacterial Protein Matches of the Identified Animal GH32 Proteins

Translated protein sequences from the genes identified from various animals were used to search against the bacterial proteins contained in Genbank using BlastP [42]. The first hit of the bacterial sub-database was taken as the best bacterial match and the bacterial protein was extracted from Genbank for downstream phylogenetic and other analyses.
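The "first hit" rule can be sketched as follows, assuming the BlastP results are available in the standard 12-column tabular format (query ID, subject ID, percent identity, and so on), in which hits for each query are listed best first. The function name is hypothetical, and the authors may instead have read the first hit directly from the web interface.

```python
def best_hit_per_query(tabular_lines):
    """Map each query ID to the subject ID of its first-listed hit."""
    best = {}
    for line in tabular_lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment lines
        fields = line.split("\t")
        query_id, subject_id = fields[0], fields[1]
        # Hits are assumed to be ordered best first, so keep only the first one seen.
        if query_id not in best:
            best[query_id] = subject_id
    return best
```

Deduplicating the returned subject IDs would give the nonredundant set of bacterial proteins carried forward to the downstream analyses.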

4.5. CLANS and Phylogenetic Analyses

The 242 GH32 proteins predicted based on genes identified from various animals, along with 133 nonredundant bacterial proteins that were the best matches to the identified animal proteins, were used for the analyses of protein groups and diversification. In addition, three GH32s from wheat (Triticum aestivum), four GH32s from Arabidopsis thaliana, and five GH32s from various fungi, including Kluyveromyces marxianus, Fusarium sarcochroum, Glonium stellatum, Daedalea quercina, and Brettanomyces bruxellensis, were included in the analysis. The analysis was carried out using CLANS, a Java-based application for visualizing protein families [21], with default parameters. The taxonomic groups were labeled with different colors.
The same set of proteins was used for the phylogenetic analysis, which was conducted with the neighbor-joining approach [45] using Molecular Evolutionary Genetics Analysis version 11 (MEGA11) with 500 bootstrap replicates [46]. All the protein sequences used in this analysis are listed in Figure S2.

4.6. Metabolic Profiling and Concentration Determination

Metabolic profiling was carried out to measure soluble metabolites via a commercial contract by the company Metabolon (Durham, NC, USA). Briefly, the samples were processed through three steps: extraction, derivatization, and analysis. Before extraction, recovery standards were added for quality control purposes. For extraction, 0.25 mL of precooled (–20 °C) solvent containing water, acetonitrile, and isopropanol (2:3:3, v/v/v) was added to a sample consisting of 5 mg of frozen ground insect tissues. The samples were agitated for 5 min at 4 °C on a chilling and heating dry bath. The samples were then centrifuged for 2 min at 13,200 rpm, and 0.5 mL of the supernatant was transferred to a clean tube. The samples were dried via a speed vacuum. For sample derivatization, 20 μL of a solution of methoxyamine hydrochloride in pyridine (40 mg/mL) was added to each sample. The samples were mixed in a dry bath for 30 min at 40 °C, followed by the addition of 180 μL of N-methyl-N-(trimethylsilyl)trifluoroacetamide (MSTFA). The samples were then heated at 40 °C for 30 min in an orbital shaker. LC/MS analysis was conducted on a platform based on a Waters ACQUITY UPLC and a Thermo-Finnigan LTQ mass spectrometer, which consisted of an electrospray ionization (ESI) source and a linear ion-trap (LIT) mass analyzer. The animal extracts were split into two aliquots, dried, and then reconstituted in acidic or basic LC-compatible solvents, each of which contained 11 or more injection standards at fixed concentrations. One aliquot was analyzed using acidic positive ion-optimized conditions and the other using basic negative ion-optimized conditions in independent injections with separate columns. The extracts reconstituted in acidic conditions were gradient eluted using water and methanol, both containing 0.1% formic acid, whereas basic extracts were eluted using water/methanol containing 6.5 mM ammonium bicarbonate. The MS analyses alternated between MS and data-dependent MS2 scans using dynamic exclusion.
The compounds were identified by comparison to library entries of purified standards or recurrent unknown entities. Known chemical entities were identified against a metabolomic library of over 1000 commercially available purified standard compounds. The combination of chromatographic properties and mass spectra indicated a match to a specific compound or an isobaric entity. A variety of curation procedures were carried out to ensure that a high-quality data set was available for statistical analysis and data interpretation.
Sugar extraction and quantification were performed on an in-house system with samples prepared following the procedure described previously [47]. Briefly, for the quantification of fructose, difructose, and glucose, ~20 mg fresh animal samples were collected separately into 1.7 mL microcentrifuge tubes and immediately frozen in liquid nitrogen. The frozen samples were ground using a mortar and pestle in liquid nitrogen. Tissue powder was suspended in 200 µL of 75% ethanol and then incubated for one hour at 4 °C. Following incubation, the samples were centrifuged at 12,000× g for 15 min at 4 °C. The supernatants were transferred to new 1.7 mL microcentrifuge tubes and then dried in a Refrigerated CentriVap Vacuum Concentrator (Labconco, Kansas City, MO, USA). The dried extracts were dissolved in 200 µL of LC-MS grade water and then filtered with a 0.2 µm Captiva PES Filter Vial (Agilent, Santa Clara, CA, USA) to remove high-molecular-weight substances. The resulting extracts were separated and quantified by passing through an Agilent InfinityLab Poroshell 120 HILIC-Z column (2.1 mm × 150 mm, 2.7 µm) on an Agilent 1290 Infinity II Liquid Chromatography (LC) system combined with an Agilent G6545A Q-TOF mass spectrometer (LC-Q-TOF) (Agilent, Santa Clara, CA, USA). The concentrations of specific sugars were determined by comparing them with the corresponding individual standards. The LC-Q-TOF conditions and parameters were as described by Mack and Wei [48] and Dai and Hsiao [49].
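Quantification against individual standards can be sketched as an external calibration. The example below assumes a linear relation between peak area and standard concentration and converts the result to ng per mg of tissue. The linear model, the function names, and the numbers in the usage note are illustrative assumptions, not details reported by the authors.

```python
def fit_line(concentrations, peak_areas):
    """Ordinary least-squares fit of peak area = slope * concentration + intercept."""
    n = len(concentrations)
    mean_x = sum(concentrations) / n
    mean_y = sum(peak_areas) / n
    sxx = sum((x - mean_x) ** 2 for x in concentrations)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(concentrations, peak_areas))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def sugar_ng_per_mg(peak_area, slope, intercept, extract_volume_ul, tissue_mg):
    """Convert a sample peak area to ng of sugar per mg of tissue."""
    conc_ng_per_ul = (peak_area - intercept) / slope  # invert the calibration line
    return conc_ng_per_ul * extract_volume_ul / tissue_mg
```

For instance, with hypothetical standards at 1, 2, and 4 ng/µL giving peak areas of 100, 200, and 400, a sample peak area of 300 in a 200 µL extract from 20 mg of tissue corresponds to about 30 ng/mg.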
Data were analyzed using the SAS software (SAS Institute, Inc., Cary, NC, USA). Multiple comparisons of the means of different treatments were computed using Fisher’s least significant difference (LSD). Data sets with LSD significance were reanalyzed with a Dunnett test to reduce the risk of a type I error. Additionally, t-tests were conducted to determine whether the means of two samples were different.
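As an illustration of the two-sample comparison, the following sketch computes a pooled-variance (Student) t statistic and its degrees of freedom using only the Python standard library. The analyses in this study were run in SAS. Whether an equal-variance or unequal-variance test was used is not stated, so the pooled form here is an assumption, and the function name is hypothetical.

```python
import math
import statistics


def pooled_t_statistic(sample_a, sample_b):
    """Return (t, degrees of freedom) for a two-sample, equal-variance t-test."""
    n_a, n_b = len(sample_a), len(sample_b)
    var_a = statistics.variance(sample_a)  # sample variance (n - 1 denominator)
    var_b = statistics.variance(sample_b)
    df = n_a + n_b - 2
    pooled_var = ((n_a - 1) * var_a + (n_b - 1) * var_b) / df
    standard_error = math.sqrt(pooled_var * (1 / n_a + 1 / n_b))
    t = (statistics.mean(sample_a) - statistics.mean(sample_b)) / standard_error
    return t, df
```

The resulting t value would be compared with the critical value of the t distribution at the chosen significance level for the returned degrees of freedom.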

Supplementary Materials

The following supporting information can be downloaded at: https://www.mdpi.com/article/10.3390/ijms25158296/s1.

Author Contributions

M.-S.C. conceived and designed the study. X.C. and X.L. conducted experiments. X.C., K.W.J. and J.Y. conducted database searching and data analyses. R.J.W. and Y.P. provided the samples and other resources for this research. All authors have read and agreed to the published version of the manuscript.

Funding

This study was financially supported by yearly funding to M.-S.C. from USDA-ARS.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

All sequence data have been deposited here. Other data and information are provided here as Supplementary Figures.

Acknowledgments

We would like to thank KSU students Jia Tan and Katelyn Hinck for their help in database searching and figure drawing. We also would like to thank Raymond Cloyd and Maureen Gorman, respectively, for providing fungus gnat and Manduca samples. Mention of trade names or commercial products in this publication is solely for the purpose of providing specific information and does not imply recommendation or endorsement by the U.S. Department of Agriculture. The USDA is an equal opportunity provider and employer.

Conflicts of Interest

The authors declare that this research was carried out without any commercial or financial relationship that could be considered as a potential conflict of interest.

References

  1. Henrissat, B.; Callebaut, I.; Mornon, J.P.; Fabrega, S.; Lehn, P.; Davies, G. Conserved Catalytic Machinery and the Prediction of a Common Fold for Several Families of Glycosyl Hydrolases. Proc. Natl. Acad. Sci. USA 1995, 92, 7090–7094. Available online: https://www.jstor.org/stable/2367800 (accessed on 1 May 2024).
  2. Minic, Z. Physiological roles of plant glycoside hydrolases. Planta 2008, 227, 723–740.
  3. Lammens, W.; Roy, K.L.; Schroeven, L.; Laere, A.V.; Rabijns, A.; Van den Ende, W. Structural insights into glycoside hydrolase family 32 and 68 enzymes: Functional implications. J. Exp. Bot. 2009, 60, 727–740.
  4. Alberto, F.; Bignon, C.; Sulzenbacher, G.; Henrissat, B.; Czjzek, M. The Three-dimensional Structure of Invertase (β-Fructosidase) from Thermotoga maritima Reveals a Bimodular Arrangement and an Evolutionary Relationship between Retaining and Inverting Glycosidases. J. Biol. Chem. 2004, 279, 18903–18910.
  5. Sainz-Polo, M.A.; Ramírez-Escudero, M.; Lafraya, A.; González, B.; Marín-Navarro, J.; Polaina, J.; Sanz-Aparicio, J. Three-dimensional structure of Saccharomyces invertase: Role of a non-catalytic domain in oligomerization and substrate specificity. J. Biol. Chem. 2013, 288, 9755–9766.
  6. Lammens, W.; Le Roy, K.; Van Laere, A.; Rabijns, A.; Van den Ende, W. Crystal structures of Arabidopsis thaliana cell-wall invertase mutants in complex with sucrose. J. Mol. Biol. 2008, 377, 378–385.
  7. Miyazaki, T.; Oba, N.; Park, E.Y. Structural insight into the substrate specificity of Bombyx mori β-fructofuranosidase belonging to the glycoside hydrolase family 32. Insect Biochem. Mol. Biol. 2020, 127, 103494.
  8. Nagem, R.A.P.; Rojas, A.L.; Golubev, A.M.; Korneeva, O.S.; Eneyskaya, E.V.; Kulminskaya, A.A.; Neustroev, K.N.; Polikarpov, I. Crystal structure of exo-inulinase from Aspergillus awamori: The enzyme fold and structural determinants of substrate recognition. J. Mol. Biol. 2004, 344, 471–480.
  9. Verhaest, M.; Van den Ende, W.; Roy, K.L.; Ranter, C.J.D.; Van Laere, A.; Rabijns, A. X-ray diffraction structure of a plant glycosyl hydrolase family 32 protein: Fructan 1-exohydrolase IIa of Cichorium intybus. Plant J. 2005, 41, 400–411.
  10. Ernits, K.; Eek, P.; Lukk, T.; Visnapuu, T.; Alamäe, T. First crystal structure of an endo-levanase–the BT1760 from a human gut commensal Bacteroides thetaiotaomicron. Sci. Rep. 2019, 9, 8443.
  11. Hendry, G.A.F. Evolutionary origins and Natural Functions of Fructans-a Climatological, Biogeographic and Mechanistic Appraisal. New Phytol. 1993, 123, 3–14. Available online: https://www.jstor.org/stable/2557765 (accessed on 1 May 2024).
  12. Muir, J.G.; Shepherd, S.J.; Rosella, O.; Rose, R.; Barrett, J.S.; Gibson, P.R. Fructan and free fructose content of common Australian vegetables and fruit. J. Agric. Food Chem. 2007, 55, 6619–6627.
  13. Keeling, P.; Palmer, J. Horizontal gene transfer in eukaryotic evolution. Nat. Rev. Genet. 2008, 9, 605–618.
  14. Graham, L.A.; Lougheed, S.C.; Ewart, K.V.; Davies, P.L. Lateral transfer of a lectin-like antifreeze protein gene in fishes. PLoS ONE 2008, 3, e2616.
  15. van Hoek, A.H.A.M.; Mevius, D.; Guerra, B.; Mullany, P.; Roberts, A.P.; Aarts, H.J.M. Acquired antibiotic resistance genes: An overview. Front. Microbiol. 2011, 2, 203.
  16. Druzhinina, I.S.; Chenthamara, K.; Zhang, J.; Atanasova, L.; Yang, D.; Miao, Y. Massive lateral transfer of genes encoding plant cell wall-degrading enzymes to the mycoparasitic fungus Trichoderma from its plant-associated hosts. PLoS Genet. 2018, 14, e1007322.
  17. Vicente, C.S.L.; Nemchinov, L.G.; Mota, M.; Eisenback, J.D.; Kamo, K.; Vieira, P. Identification and characterization of the first pectin methylesterase gene discovered in the root lesion nematode Pratylenchus penetrans. PLoS ONE 2019, 14, e0212540.
  18. Busch, A.; Danchin, E.G.J.; Pauchet, Y. Functional diversification of horizontally acquired glycoside hydrolase family 45 (GH45) proteins in Phytophaga beetles. BMC Evol. Biol. 2019, 19, 100.
  19. Danchin, E.G.; Guzeeva, E.A.; Mantelin, S.; Berepiki, A.; Jones, J.T. Horizontal gene transfer from bacteria has enabled the plant-parasitic nematode Globodera pallida to feed on host-derived sucrose. Mol. Biol. Evol. 2016, 33, 1571–1579.
  20. Cheng, X.; Garcés-Carrera, S.; Whitworth, R.J.; Fellers, J.P.; Park, Y.; Chen, M.S. A horizontal gene transfer led to the acquisition of a fructan metabolic pathway in a gall midge. Adv. Biosyst. 2020, 4, 1900275.
  21. Frickey, T.; Lupas, A. CLANS: A Java application for visualizing protein families based on pairwise similarity. Bioinformatics 2004, 20, 3702–3704.
  22. The International Aphid Genomics Consortium. Genome Sequence of the Pea Aphid Acyrthosiphon pisum. PLoS Biol. 2010, 8, e1000313.
  23. Robertson, H.M.; Waterhouse, R.M.; Walden, K.O.; Ruzzante, L.; Maarten, J.M.; Reijnders, F.; Coates, B.S.; Legeai, F.; Gress, J.C.; Biyiklioglu, S.; et al. Genome sequence of the wheat stem sawfly, Cephus cinctus, representing an early-branching lineage of the Hymenoptera, illuminates evolution of hymenopteran chemoreceptors. Genome Biol. Evol. 2018, 10, 2997–3011.
  24. The Tribolium Genome Sequencing Consortium. The genome of the model beetle and pest Tribolium castaneum. Nature 2008, 452, 949–955.
  25. Pauchet, Y.; Kirsch, R.; Giraud, S.; Vogel, H.; Heckel, D.G. Identification and characterization of plant cell wall degrading enzymes from three glycoside hydrolase families in the cerambycid beetle Apriona japonica. Insect Biochem. Mol. Biol. 2014, 49, 1–13.
  26. Subramanyam, S.; Nemacheck, J.A.; Bernal-Crespo, V.; Sardesai, N. Insect derived extra oral GH32 plays a role in susceptibility of wheat to Hessian fly. Sci. Rep. 2021, 11, 2081.
  27. Livingston, D.P., III; Hincha, D.K.; Heyer, A.G. Fructan and its relationship to abiotic stress tolerance in plants. Cell. Mol. Life Sci. 2009, 66, 2007–2023.
  28. Benkeblia, N. Insights on fructans and resistance of plants to drought stress. Front. Sustain. Food Syst. Sect. Crop Biol. Sustain. 2022, 6, 2022.
  29. Shelomi, M.; Danchin, E.G.J.; Heckel, D.; Wipfler, B.; Bradler, S.; Zhou, X.; Pauchet, Y. Horizontal gene transfer of pectinases from bacteria preceded the diversification of stick and leaf insects. Sci. Rep. 2016, 6, 26388.
  30. Le, N.G.; van Ulsen, P.; van Spanning, R.; Brouwer, A.; van Straalen, N.M.; Roelofs, D. A functional carbohydrate degrading enzyme potentially acquired by horizontal gene transfer in the genome of the soil invertebrate Folsomia candida. Genes 2022, 13, 1402.
  31. Wang, M.; Cheong, K.L. Preparation, structural characterisation, and bioactivities of Fructans: A review. Molecules 2023, 28, 1613.
  32. Yoshida, M. Fructan structure and metabolism in overwintering plants. Plants 2021, 10, 933.
  33. Ahn, S.; Anderson, J.A.; Sorrells, M.E.; Tanksley, S.D. Homoeologous relationships of rice, wheat and maize chromosomes. Mol. Gen. Genet. 1993, 241, 483–490.
  34. Kawakami, A.; Sato, Y.; Yoshida, M. Genetic engineering of rice capable of synthesizing fructans and enhancing chilling tolerance. J. Exp. Bot. 2008, 59, 793–802.
  35. Schnyder, H.; Gillenberg, C.; Hinz, J. Fructan contents and dry matter deposition in different tissues of the wheat grain during development. Plant Cell Environ. 1993, 16, 179–187.
  36. Cimini, S.; Locato, V.; Vergauwen, R.; Paradiso, A.; Cecchini, C.; Vandenpoel, L.; Verspreet, J.; Courtin, C.M.; D’Egidio, M.G.; Van den Ende, W.; et al. Fructan biosynthesis and degradation as part of plant metabolism controlling sugar fluxes during durum wheat kernel maturation. Front. Plant Sci. 2015, 6, 89.
  37. Herman, M.H.; Birnbaum, M.J. Molecular Aspects of Fructose Metabolism and Metabolic Disease. Cell Metab. 2021, 33, 2329–2354. Available online: https://www.sciencedirect.com/science/article/pii/S1550413121004290 (accessed on 1 May 2024).
  38. Johnson, R.J.; Stenvinkel, P.; Andrews, P.; Sánchez-Lozada, L.G.; Nakagawa, T.; Gaucher, E.; Andres-Hernando, A.; Rodriguez-Iturbe, B.; Jimenez, C.R.; Garcia, G.; et al. Fructose metabolism as a common evolutionary pathway of survival associated with climate change, food shortage and droughts (Review-Symposium). J. Intern. Med. 2020, 287, 252–262.
  39. Bismut, H.; Hers, H.G.; Van Schaftingen, E. Conversion of fructose to glucose in the rabbit small intestine. A reappraisal of the direct pathway. Eur. J. Biochem. 1993, 213, 721–726.
  40. Seternes, T.; Tonheim, T.; Myhr, A.; Dalmo, R.A. A plant 35S CaMV promoter induces long-term expression of luciferase in Atlantic salmon. Sci. Rep. 2016, 6, 25096.
  41. Palazzo, A.; Caizzi, R.; Moschetti, R.; Marsano, R.M. What Have We Learned in 30 Years of Investigations on Bari Transposons? Cells 2022, 11, 583.
  42. Johnson, M.; Zaretskaya, I.; Raytselis, Y.; Merezhuk, Y.; McGinnis, S.; Madden, T.L. NCBI BLAST: A better web interface. Nucleic Acids Res. 2008, 36, W5–W9.
  43. Shapiro, M.B.; Senapathy, P. RNA splice junctions of different classes of eukaryotes: Sequence statistics and functional implications in gene expression. Nucleic Acids Res. 1987, 15, 7155–7174.
  44. Jumper, J.; Evans, R.; Pritzel, A.; Green, T.; Figurnov, M.; Ronneberger, O. Highly accurate protein structure prediction with AlphaFold. Nature 2021, 596, 583–589.
  45. Saitou, N.; Nei, M. The neighbor-joining method: A new method for reconstructing phylogenetic trees. Mol. Biol. Evol. 1987, 4, 406–425.
  46. Tamura, K.; Stecher, G.; Kumar, S. MEGA11: Molecular evolutionary genetics analysis version 11. Mol. Biol. Evol. 2021, 38, 3022–3027.
  47. Zhao, L.; Chanon, A.M.; Chattopadhyay, N.; Dami, I.E.; Blakeslee, J.J. Quantification of carbohydrates in grape tissues using capillary zone electrophoresis. Front. Plant Sci. 2016, 7, 818.
  48. Mack, A.; Wei, T.-C. Analysis of Sugars Using an Agilent InfinityLab Poroshell 120 HILIC-Z Column. Agilent Application Note 5991-8984EN. 2019. Available online: https://www.agilent.com/cs/library/applications/5991-8984EN_Poroshell%20HILICZ_sugars_application.pdf (accessed on 1 May 2024).
  49. Dai, Y.; Hsiao, J. Discovery metabolomics LC MS Methods Optimized for Polar Metabolites. Agilent Application Note 5994-1492EN. 2019. Available online: https://hpst.cz/sites/default/files/download/2021/03/application-discovery-metabolomics-hilic-z-5994-1492en-agilent.pdf (accessed on 1 May 2024).
Figure 1. Conservation in the three triad sequence regions and the overall three-dimensional structures of the predicted proteins. (A). Logos show consensus sequences at the three active site regions. The residues D, D, and E at position 5 in each graph are the three triads. (B). The three-dimensional structures of three representative proteins. The structure of the protein BmSUC1 was determined by X-ray diffraction [7]. The structures of the proteins Scur-GH32-2 and Hsch-GH32-1 were predicted using AlphaFold-2. The five critical blades for catalytic activity are marked with I, II, III, IV, and V in each structure. The three-dimensional structures of other identified proteins are given in Figure S3.
Figure 2. Variation in the structures of the identified genes. (A): The positions of the introns in genes from different types of animals. The middle bar indicates the positions of amino acids where an intron/exon boundary is located relative to the protein. The longest protein contains 538 amino acid residues. The types of animals are marked with different colors and symbols. The number of genes with an intron at the same position is given either above or underneath each symbol. (B): Genes with different numbers of introns. (C): The size distribution of introns in different genes. The X-axis indicates the base pairs of an intron whereas the Y-axis indicates the number of genes. (D): Different numbers and position of introns in the three genes from the springtail, Sinella curviseta. Genbank accession numbers are given in the last exon of each gene.
Figure 3. A two-dimensional map for GH32 groups of sequences from various taxa. A total of 242 eukaryotic proteins and 133 bacterial proteins were included. Default parameters in the software CLANS [21] were used and different taxonomic units are depicted by the colors in the figure legend. A subset of the upper left map was reconstructed in the map on the lower right. The dotted lines divide the two major clusters of GH32A levanase/inulinase and GH32B sucrase groups. Two minor groups containing GH32 proteins from nematodes and rotifers are located on the right and left sides of the two major groups.
Figure 4. Predicted GH32 proteins from different animals had the best matches to proteins from different orders of bacteria in Genbank. (A): The distribution (proportion) of bacterial orders with best matches to the identified GH32 proteins from animals. H, S (brown), BA, F, S (blue), and BU represent bacterial orders Halanaerobiales, Sphingobacteriales, Bacteroidales, Flavobacteriales, Streptomycetales, and Burkholderiales. Other bacterial orders (Others) included Bifidobacteriales, Micrococcales, Saprospirales, Eubacteriales, Thermoanaerobacterales, Rhodospirillales, Gemmatales, Chloroflexales, and Phototrophicales. (B): GH32 proteins from the same type (order) of animals had the best matches to proteins from multiple orders of bacteria. (C): Different GH32 proteins from a single species of animals had the best matches to proteins from different orders of bacteria.
Figure 5. Phylogenetic relationship of the 242 GH32 proteins identified from animal genomes along with their best-matched bacterial proteins. (A): An overall phylogenetic tree. Red, blue, green, and black colors indicate bacterial, plant, fungal, and animal protein sequences, respectively. A single line could represent multiple sequences if they are closely related. The origin of the sequences is marked next to each branch. The branch marked by the symbol “*” is enlarged in panel B. (B): An enlarged view of one of the branches in panel A. Different colors represent different origins of the protein sequences. The sizes of the arrows are proportional to the number of the sequences in a sub-branch, which are indicated in the parentheses next to the origin of sequences.
Figure 6. Abundance of fructose, difructose, and glucose among insect species with/without GH32 genes. (A): The relative abundance (expressed as scaled intensity) of fructose and glucose determined by a total metabolite profiling of whole insects. The numbers under the abscissa represent the instars of larvae, and M and F represent adult males and females. (B): Concentrations (ng/mg) of fructose, difructose, and glucose. The abbreviations of the insects are as follows: Fly, the Hessian fly, Mayetiola destructor; Gnat, the fungus gnat, Bradysia spp.; Aphid, the pea aphid, Acyrthosiphon pisum; Sexta, Manduca sexta; Sawfly, the wheat stem sawfly, Cephus cinctus; Beetle, the red flour beetle, Tribolium castaneum. The genomes of the Hessian fly and Manduca contain GH32 genes. The remaining insect species contain no GH32 genes.
Table 1. Putative GH32 genes identified from the genome sequences of different animals in Genbank *.
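| Animal Group | Scientific Name | Species with GH32 | Genes Per Species | Total Genes |
|---|---|---|---|---|
| Dipterans | Diptera | 7 | 1–12 | 35 |
| Arachnids | Arachnida | 11 | 1–5 | 21 |
| Water bears | Tardigrada | 2 | 1, 6 | 6 |
| Wasps | Hymenoptera | 6 | 1–4 | 11 |
| Springtails | Collembola | 10 | 1–4 | 23 |
| Rotifers | Bdelloida | 8 | 1–7 | 34 |
| Caddisflies | Trichoptera | 4 | 1–5 | 9 |
| Moths/Butterflies | Lepidoptera | 11 | 1–9 | 44 |
| Beetles | Coleoptera | 9 | 1–4 | 24 |
| Nematodes | Nematoda | 5 | 1–7 | 16 |
| Thrips | Thysanoptera | 2 | 2–3 | 6 |
| Millipedes | Diplopoda | 1 | 1 | 1 |
| Segmented worms | Annelida | 1 | 1 | 1 |
| Clam shrimps | Spinicaudata | 1 | 1 | 1 |
| Water fleas | Anomopoda | 1 | 1 | 1 |
| Woodlice | Isopoda | 1 | 1 | 1 |
| Crabs | Pleocyemata | 1 | 1 | 1 |
| Cockroaches | Blattodea | 1 | 3 | 3 |
| True bugs | Hemiptera | 1 | 2 | 2 |
| Flatworms | Platyhelminthes | 1 | 1 | 1 |
| Sea spiders | Pycnogonida | 1 | 1 | 1 |
| Total | | 84 | | 242 |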
* Species containing GH32 are as follows (numbers in parentheses indicate the number of genes in each species). Flies: Contarinia nasturtii (12), Mayetiola destructor (10), Bradysia coprophila (2), B. odoriphaga (5), Sitodiplosis mosellana (4), Phormia regina (1), and Bactrocera tryoni (1). Arachnids: Oedothorax gibbosus (1), Trichonephila clavipes (1), Tr. clavata (1), Caerostris darwini (1), Araneus ventricosus (1), Argiope bruennichi (1), Neoseiulus cucumeris (3), Oppiella nova (2), Leptotrombidium delicense (5), Tetranychus urticae (2), and Te. cinnabarinus (2). Water Bears: Ramazzottius varieornatus (1) and Hypsibius dujardini (6). Wasps: Tetragonula mellipes (4), Eupelmus annulatus (3), Macrocentrus cingulum (1), Trichogramma brassicae (1), Tr. evanescens (1), and Tr. pretiosum (1). Springtails: Sinella curviseta (3), Dicyrtomina minuta (1), Cyphoderus albinus (3), Pogonognathellus flavescens (4), Yoshiicerus persimilis (2), Tomocerus vulgaris (2), T. qinae (2), Allacma fusca (2), Orchesella cincta (2), and Folsomia candida (2). Rotifers: Macrotrachela quadricornifera (7), Rotaria sp. Silwood1 (5), Rotaria sp. Silwood2 (6), R. sordida (1), Didymodactylos carnosus (7), Adineta ricciae (2), A. vaga (1), and A. steineri (5). Caddisflies: Drusus annulatus (3), Micrasema longulum (1), Micropterna sequax (1), and Halesus radiatus (5). Moths and Butterflies: Helicoverpa armigera (9), Manduca sexta (4), Papilio polytes (3), Spodoptera exigua (4), Danaus plexippus (3), Amyelois transitella (5), Heliconius melpomene (1), Bombyx mori (2), Pararge aegeria (4), Trichoplusia ni (4), and Ostrinia furnacalis (5). Beetles: Listronotus oregonensis (4), Limonius californicus (4), Ignelater luminosus (2), Anoplophora glabripennis (3), Dendroctonus ponderosae (3), Rhynchophorus ferrugineus (1), Sphenophorus levis (1), Sitophilus oryzae (2), and Agrilus planipennis (4). Nematodes: Acrobeloides nanus (2), Heterodera schachtii (2), H. glycines (1), Meloidogyne enterolobii (7), and Globodera rostochiensis (4). Thrips: Frankliniella occidentalis (3) and Thrips palmi (2). True bugs: Bemisia tabaci (1). Cockroaches: Blattella germanica (3). Millipede: Julidae sp. (1). Segmented worm: Streblospio benedicti (1). Clam shrimp: Eulimnadia texana (1). Water flea: Daphnia similis (1). Woodlouse: Idotea baltica (1). Crab: Portunus trituberculatus (1). Flatworm: Schmidtea mediterranea (1). Sea Spider: Nymphon striatum (1).
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

Share and Cite

MDPI and ACS Style

Cheng, X.; Liu, X.; Jordan, K.W.; Yu, J.; Whitworth, R.J.; Park, Y.; Chen, M.-S. Frequent Acquisition of Glycoside Hydrolase Family 32 (GH32) Genes from Bacteria via Horizontal Gene Transfer Drives Adaptation of Invertebrates to Diverse Sources of Food and Living Habitats. Int. J. Mol. Sci. 2024, 25, 8296. https://doi.org/10.3390/ijms25158296
