*2.5. Quantitative Real-Time PCR*

We used qRT-PCR to verify the expression of randomly selected candidate *T. yunnanensis* detoxification genes in different tissues. The cDNA was synthesized with an input of 1 μg of total RNA using the Prime ScriptRT Reagent Kit with gDNA Eraser to remove gDNA (AG, Changsha, China). For quantitative real-time PCR (qPCR), various tissues including antennae, heads (without antennae), legs, and carcasses from female and male adults were collected and immediately immersed in liquid nitrogen. qPCR was performed using SYBR Premix EX Taq™ (AG, Changsha, China) with three technical replicates of each template from three independent biological pools. For the qPCR analysis, the primers (Table S1) were designed by Primer Premier 5.0 [30]. Each reaction contained a total volume of 20 μL, consisting of 10 μL of SYBR Green PCR Master Mix, 0.8 μL of each primer (10 μM), 2 μL (20 ng) of cDNA template, and 6.4 μL of nuclease-free water. The β-actin gene was used as an endogenous control. qPCR cycling parameters were: 94 ◦C for 4 min followed by 40 cycles at 94 ◦C for 20 s and 60 ◦C for 30 s. Relative gene expression level was calculated using the Q-GENE statistical analysis package [31].

### **3. Results**

### *3.1. Transcriptome Assembly*

An Illumina HiSeq platform was used to sequence antennae, head, legs, and carcasses of adult female and male *T. yunnanensis* transcriptomes. We obtained 61.51 million (MA-1), 47.40 million (MA-2), and 52.92 million (MA-3) raw reads from the antennae of a male adult, 58.15 million (MH-1), 50.67 million (MH-2), and 61.09 million (MH-3) raw-reads from the head of a male adult, and there are raw readings from other organizations in Table 1. Filtering resulted in 58.28 million (MA-1), 46.26 million (MA-2), 51.60 million (MA-3), 55.80 million (MH-1), 49.30 million (MH-2), 59.09 million (MH-3), 55.93 million (ML-1), 53.53 million (ML-2), and 56.48 million (ML-3) clean reads (Table 1). The percentages of reads with Q20 and Q30 values for each library were approximately 98% and 93%, respectively. The GC content ranged from 38.55% to 52.60% (Table 1). The final transcript dataset contained 100,455 unigenes with a mean length of 877 bp and an N50 length of 1153 bp (Table 2), indicating the high quality of our assembly.

### *3.2. Functional Annotation of the Unigenes in T. yunnanensis*

A total of 100,455 unigenes were annotated by searching against six databases using BLAST. Specifically, 60,067 (59.79%) unigenes were matched in the non-redundant (NR) database, which accounted for the largest match. Further, 54,782 (54.53%) were annotated in the nucleotide (NT) database, 50,102 (49.87%) were annotated by blasting against the Swiss Port database. Additionally, the euKaryotic Ortholog Groups (KOG) database had the lowest number of annotated unigenes, with 19,808 (19.71%) unigenes (Table 3). Species with the highest proportion of similar genes were *Nephila clavipes* (21.00%), followed by *Dendroctonus ponderosae* (19.2%), *Hyalella azteca* (8.0%), *Toxocara canis* (4.2%) and *Lucilia cuprina* (2.1%) (Figure 1).


**Table 1.** Summary of the transcriptome sequencing data from *T. yunnanensis*.

MA: male antennae, MH: male head, ML: male leg, MC: male carcasses, FA: female antennae, FH: female head, FL: female leg, FC: female carcasses.

**Table 2.** Summary of transcriptome splicing length distribution data from *T. yunnanensis*.


**Table 3.** Statistics of gene annotation success rate.


Using a gene ontology (go) method, annotated genes were divided into three categories (a total of 56 functional groups): biological process, cell composition, and molecular function. Among the biological processes, the subcategories cellular process, metabolic process, and single-organism process contained the most unigenes. In the cellular component class, the subcategories cell and cell part contained the most unigenes. Binding and catalytic activity were the most numerous subcategories in the "molecular function" category (Figure 2a). For the euKaryotic Ortholog Groups (KOG) functional classification, we annotated about 19,808 unigenes and divided them into 25 molecular families (Figure 2b). Among them, the largest category was the general function prediction only, followed by amino acid transport and metabolism and energy production and conversion. Cell motility

and nuclear structure were the smallest groups (Figure 2b). A KEGG analysis was used to classify the annotated genes into different KEGG pathway functional categories (Figure 2c). The most representative pathways were amino acid metabolism, carbohydrate metabolism, overview, and signal transduction (Figure 2c).

**Figure 1.** The unigene BLASTx searches against the Nr database for species distribution analysis.

*3.3. Identification of Candidate CYPs, GSTs, and CCEs*

In this study, a total of 51 predicted CYPs transcripts were identified from the transcriptomes of different issues of *T. yunnanensis* using the BLASTx program. The sequence identities of these candidate CYPs with other Coleoptera insects ranged from 32.69% to 96.54% in the NCBI database (Table 4). According to the CYP nomenclature, we classified 51 CYP sequences into four families (CYP2, CYP3, CYP4, mitochondrial CYP). After removing 25 short sequences (aa < 390) [27], 26 sequences with complete open reading frames were used to construct a phylogenetic tree (Figure 3). The phylogenetic tree showed that TyunCYP had a high homology with DponCYP. The largest family was the CYP3 family, which included 13 members and the CYP3 family contained two subfamilies, CYP6 and CYP9. The results of the phylogenetic tree show that five CYPs (TyunCYP6BW2, Tyun-CYP6DE1, TyunCYP6DF1, TyunCYP6BX1, TyunCYP6DJ1) belong to the CYP6 subfamily. The second largest family is the CYP4 family; we identified eight genes (TyunCYP393A1, TyunCYP6BK1, TyunCYP411A1, TyunCYP349B2, TyunCYP4CV1, TyunCYP4BQ1, Tyun-CYP4G2, TyunCYP4BG1) belonging to the CYP4 family.

**Figure 2.** Results of BLASTx matches of *T. yunnanensis* transcriptome unigenes, gene ontology, KOG classification, and KEGG pathway annotation. (**a**) Gene ontology classifications of *T. yunnanensis* unigenes. (**b**) KOG classifications of *T. yunnanensis* unigenes. (**c**) KEGG classification of *T. yunnanensis* unigenes.

(**c**)


**Table 4.** Best BLASTX matches of *T. yunnanensis* CYPs.

A total of 33 candidate GSTs were identified from the total transcriptome of the different developmental stages of *T. yunnanensis*. The sequence identities of these candidate GSTs with other dipteran insects ranged from 51.96% to 97.14% in the NCBI database (Table 5). After removing short sequences (aa < 150) [32], 18 GSTs were chosen for the phylogenetic analysis. A neighbor-joining tree was subsequently constructed using our identified putative GST proteins and the sequences from four other Coleoptera species, *D. ponderosae*, *T. castaneum*, *D. valens*, and *A. planipennis* (Figure 4). Ten GSTs were identified in the transcriptome of *T. yunnanensis*, including six delta/epsilon class GSTs, four omega class GST, five sigma class GSTs, two theta class GSTs, and one microsomal class GST. The phylogenetic tree showed that TyunGST had a high homology with DarmGST.

**Figure 3.** Neighbor-joining tree of candidate CYPs. Bootstrap values after 1000 replications. Dpon, *Dendroctonus ponderosae*; Rpro, *Rhodnius prolixus*; Tcas, *Tribolium castaneum*; Tyun, *Tomicus yunnanensis*.

We identified 29 transcripts encoding CCEs in the *T. yunnanensis* transcriptome by a bioinformatics analysis and all of them were more conserved across other CCEs variants, with 42.86 to 93.03% amino acid identity (Table 6). To guarantee the reliability of the phylogenetic tree, 13 CCEs encoding short sequences wereremoved (aa < 480) [33], and 16 CCEs in our transcriptomes were aligned with CCEs from other coleoptera species (Figure 5). A total of 16 CCE genes were identified in the transcriptome of *T. yunnanensis*, including nine xenobiotic metabolizing enzymes class CCE, three microsomal and alphaesterases class CCE, two beta- and pheromone esterases class CCE, one JHE class CCE, and one CO class CCE.


**Table 5.** Best BLASTX matches of *T. yunnanensis* GSTs.

### *3.4. Tissue Expression Profile of the CYP, GST, and CCE Genes*

In order to explore the expression profiles of detoxification genes in different tissues, we screened all 113 genes from the DEG library. In total, 64 genes were found, and 49 genes were missing from the DEG data. The expression profiles based on FPKM values revealed that several detoxification genes (TyunCYP410A1, TyunCYP9AN1, TyunCYP393A1, TyunGST22, TyunGST29, TyunGST31, TyunCCE8, and TyunCCE17) are highly expressed in the antenna (FPKM > 100) and the expression levels of these genes are all higher in male antennae than in female antennae (Figure 6). TyunCYP4G2 is highly expressed in the residue. Tyun315A1, TyunCYP6BK1, TyunCYP6DF1, and TyunGST3 are expressed in all tissues; TyunCYP6DF1 expression in antennae is higher than in other tissues. TyunCYP345E1 and TyunCYP345F1 are little expressed in all tissues (Figure 6a). TyunGST11 and TyunGST33 are expressed in all tissues, the expression of TyunGST11 in foot and antennae is higher than other tissues, and the expression level of male feet is higher than that of female legs. The antenna expression of TyunGST33 is significantly higher than that of other tissues and the male antenna expression is higher than that of the female (Figure 6b). TyunCCE21 is expressed in all tissues. TyunCCE16 is strongly expressed in the head and carcasses, and expression in these two parts of the females is higher than that of the males (Figure 6c).

**Figure 4.** Neighbor-joining tree of candidate GSTs. Bootstrap values after 1000 replications. Dpon, *Dendroctonus ponderosae*; Darm, *Dendroctonus armandi*; Agla, *Anoplophora glabripennis*; Ldec *Leptinotarsa decemlineata*; Tcas, *Tribolium castaneum*; Tyun, *Tomicus yunnanensis*.



**Figure 5.** Neighbor-joining tree of candidate CCEs. Bootstrap values after 1000 replications. Dpon, *Dendroctonus ponderosae*; Darm, *Dendroctonus armandi*; Agla, *Anoplophora glabripennis*; Tcas, *Tribolium castaneum*; Tyun, *Tomicus yunnanensis*.

**Figure 6.** Expression profiles of detoxification genes in *T. yunnanensis*. (**a**) CYP; (**b**) GST; (**c**) CCE. MA: male antennae, MH: male head, ML: male legs, MC: male carcasses, FA: female antennae, FH: female head, FL: female legs, FC: female carcasses.

Further, qPCR was employed to validate the expression of some detoxification genes and to investigate their expression profiles. The qPCR results of TyunCYP4G2, TyunCYP6DF1, TyunGST11, TyunGST33, TyunCCE16, and TyunCCE17 are consistent with the FPKM value analysis results, but the qPCR results of TyunCCE21 differ from the FPKM value analysis, which may be caused by the inconsistency between the qPCR analysis samples and the sequencing samples (Figure 7).

**Figure 7.** qPCR analysis of *T. yunnanensis* detoxification genes transcript levels in different tissues. H: head; C: carcasses; L: leg; A: antenna.

### **4. Discussion**

*Tomicus yunnanensis* is one of the most important pests of *Pinus yunnanensis*. In most insect species, detoxification proteins play a key role in the degradation of plant secondary metabolites. In order to better understand the clues of how insects degrade plant secondary metabolites, we first identified candidate detoxification proteins in the transcripts of the antennae, heads, legs, and carcasses of *T. yunnanensis*, and studied some of the detoxification proteins and the expression profiles of four different organizations. Our research results provide new evidence for the molecular basis of the detoxification proteins of *T. yunnanensis* in the metabolic function of detoxification, which may help to develop better methods to control this pest. The identification of at least 51 CYP genes placed *T. yunnanensis* within the middle of the range of the P450 gene family size in insects for which the genome has been sequenced, ranging from a low of 46 in the honeybee (*Apis mellifera* Linnaeus, 1758 (Hymenoptera: Apidae)) to 143 in *T. castaneum* [34], and it was similar to the mountain pine beetle (*D. armandi*), which has 64 CYPs. Previous studies on *T. yunnanensis* were mainly in the direction of olfactory-related proteins, but there are few reports on the detoxification genes of *T. yunnanensis* [35]. In this study, 51 TyunCYPs were analyzed in different adult tissue transcripts, and we found that seven TyunCYPs are predominantly expressed in the antennae of both sexes, belonging to three CYP families, CYP2: TyunCYP305F1; mitochondrial CYP clan: TyunCYP315A1; CYP4: TyunCYP410A1, TyunCYP393A1, and TyunCYP305F1. CYP345E2, a member of CYP3, is an antenna-specific CYP from *D. pondersae* that has been proved to catalyze the oxidation of

monoterpene volatile compounds in pine hosts [36]. In our study, we found two CYP345E2 homologues, TyunCYP345E1 and TyunCYP345F1 in *T. yunnanensis*, but the expression levels of TyunCYP345E1 and TyunCYP345F1 in the antennae were both low, which may be caused by differences between species. Some members of the CYP4 family are also involved in odor degradation, and most of them are involved in the synergistic reaction of detoxification and pheromone synthesis [37]. The expression level of TyunCYP393A1 in the antennae of females was higher than that of males, which might be related to the host location of females, while the expression level of TyunCYP410A1 in the antennae of males was higher than that of females, which may be because the male needs to degrade the secondary metabolites produced by the plant defense mechanism. TyunCYP4G2 is specifically and highly expressed in the residue, which may be related to the metabolism of toxic substances in *T. yunnanensis*. In the CYP6 family, few members have been reported to have odorant clearance functions [38]. In many (but not all) studies, genes from the CYP6 subfamily are shown to metabolize xenobiotics and plant natural compounds [27]. Several genes have been reported to have specific expression in the olfactory organs, such as CYP6B48, CYP6B42, and CYP6AE49 expressed in the male and female antennae of *Spodoptera litura* [39]. PxCYP6BG3 and PxCYP6BG6 were found in *Plutella xylostell*, which may be related to larval odor clearance [40]. In this study, we did not find any genes that were specifically and highly expressed in the olfactory organs of the CYP6 family.

The functions of GSTs in many insects' physiological functions, such as insecticide resistance and detoxification (plant secondary metabolites) have been fully demonstrated [40–42]. In this study, we identified 33 TyunGSTs from the *T. yunnanensis* transcriptome. According to some reports, A *Manduca sexta* olfactory specific GST called gstmsolf1 is reported as a degradable plant volatile trans-2-hexanal, belonged to the delta subfamily [40,43], in our study, we identified two new GSTs, TyunGST16 and TyunGST29, that were highly expressed in the antennae, and we found that they belonged to the delta/epsilon GST subfamily, Further family classification is needed for the functional speculation of TyunGST16 and TyunGST29. At present, we speculate that they may be related to odor degradation.. TyunGST30 and DaGSTs1 formed a linage in the sigma subfamily. DaGSTs1 may play a role in reducing the negative effects of terpenoids on beetles [44]. We speculate that TyunGST30 also has this function.

Insect CCE is a superfamily, which participates in many physiological processes and contains a variety of substrates [45]. Some of these are secreted enzymes, which refresh ORs by removing redundant esterase odorants from the surrounding area. Besides Z11-16: Ald, Z11-16: Ac is another major sex pheromone component reported in [46]. We found 29 CCE genes in the transcriptome of *T. yunnanensis*. The total number is higher than that of *D. armandi* (8) [47] and less than that of *T. castaneum* (63) [48]. The phylogenetic tree shows that the CCE genes of *T. yunnanensis* can be divided into five categories. Among them, TyunCCE5 and DarmCCE3 (AYN64425.1) cluster into the same branch, and DarmCCE3 is inferred to play a role in host detoxification [47], which we speculate that TyunCCE5 may also have. Using the FPKM value and a qPCR analysis, we found that three CCEs have a higher expression in the antennae: TyunCCE8, TyunCCE17, and TyunCCE21. TyunCCE8 and DarmCCE2 (AYN64424.1) are clustered into the same clade, whose most enzymes have a dietary detoxification function or ester odor degradation function, such as SlCXE10 [49] and SexCXE10 [50] with antennae dominant expression that can degrade ester plant secondary metabolites. Unlike these three CCEs, TyunCCE8, TyunCCE17, and TyunCCE21 have no obvious gender-biased expression. TyunCCE23, DarmCCE (AYN64426.1), and TcasCCE (NP\_001180223.1) are divided into the JHE branch. TcasCCE (NP\_001180223.1) has been confirmed to have the function of degrading juvenile hormones. We speculate that TyunCCE23 also has this function; of course, this inference needs to be further verified.
