#### *4.4. Reclassification of GA A07 Strain*

In total, 882 *Bacillus* genomes were downloaded to identify the most closely related species. FastANI v1.2 [41] was used to find the genomes with the highest average nucleotide identity to the GA A07 genome. Average amino acid identity was also determined as follows: (1) Prodigal v2.6.3 [25] was used to batch-predict protein-coding genes from the genomes, and the predicted genes were translated into amino acid sequences; (2) the predicted proteins of the GA A07 genome were compared with those of every downloaded *Bacillus* genome using BLASTP [26] (parameters: -max\_target\_seqs 1 -evalue 1e-10); and (3) the mean of the best-hit identities was taken as the average amino acid identity between the two genomes. The Tetra Correlation Search (TCS) function of the JSpeciesWS webserver [27] was also used to find the most closely related bacterial genome based on tetranucleotide composition.
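Step (3) of the average amino acid identity calculation can be illustrated with a minimal sketch. It assumes that the BLASTP results were written in the standard 12-column tabular format (`-outfmt 6`), which the text does not state, and it keeps the highest-scoring alignment per query protein as the best hit. The function name and the toy records are hypothetical.

```python
import csv
import io


def average_amino_acid_identity(blast_tabular):
    """Mean percent identity over the best hit of each query protein.

    Assumes BLAST tabular columns: qseqid, sseqid, pident, length, mismatch,
    gapopen, qstart, qend, sstart, send, evalue, bitscore.
    """
    best = {}  # query id -> (bit score, percent identity) of its best hit
    for row in csv.reader(blast_tabular, delimiter="\t"):
        if not row or row[0].startswith("#"):
            continue  # skip blank lines and comment lines
        query, pident, bitscore = row[0], float(row[2]), float(row[11])
        if query not in best or bitscore > best[query][0]:
            best[query] = (bitscore, pident)
    if not best:
        raise ValueError("no BLASTP hits found")
    return sum(pident for _, pident in best.values()) / len(best)


# Toy input (hypothetical): protein p1 has two alignments, p2 has one.
example = io.StringIO(
    "p1\tr1\t98.0\t200\t4\t0\t1\t200\t1\t200\t1e-100\t400\n"
    "p1\tr1\t60.0\t50\t20\t0\t1\t50\t1\t50\t1e-12\t60\n"
    "p2\tr7\t90.0\t150\t15\t0\t1\t150\t1\t150\t1e-80\t300\n"
)
aai = average_amino_acid_identity(example)
print(f"AAI = {aai:.1f}%")
```

Proteins without any hit below the e-value cutoff do not appear in the tabular output, so in this sketch they are simply excluded from the mean.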

Only genomes with better assembly quality (defined as genomes with at most ten scaffolds) were used to build the phylogenetic tree. The tree was generated using ezTree [42], which identifies single-copy marker genes among the input genomes, creates a concatenated alignment of all marker genes, and runs FastTree 2 [43] with the Jones–Taylor–Thornton (JTT) evolutionary model and 1000 resampling tests to construct the tree. The resulting Newick (nwk) file of the phylogenetic tree was then visualized using Molecular Evolutionary Genetics Analysis (MEGA) X software [31].
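The assembly-quality filter can be sketched as follows, under the assumption that each genome is a FASTA file with one record per scaffold. The function names and the in-memory toy genomes are hypothetical; only the threshold of ten scaffolds comes from the text.

```python
import io


def count_scaffolds(fasta_handle):
    """Count FASTA records, assuming one header line ('>') per scaffold."""
    return sum(1 for line in fasta_handle if line.startswith(">"))


def select_high_quality(genomes, max_scaffolds=10):
    """Return the names of genomes with at most `max_scaffolds` scaffolds.

    `genomes` maps a genome name to an open FASTA file-like object.
    """
    return [
        name
        for name, handle in genomes.items()
        if count_scaffolds(handle) <= max_scaffolds
    ]


# Toy genomes (hypothetical): one with 2 scaffolds, one with 11.
genomes = {
    "genome_a": io.StringIO(">s1\nACGT\n>s2\nGGCC\n"),
    "genome_b": io.StringIO("".join(f">s{i}\nACGT\n" for i in range(11))),
}
selected = select_high_quality(genomes)
print(selected)
```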

#### *4.5. Identification and Analysis of GT Genes*

The dbCAN2 webserver [44] was employed to identify potential GTs from the *B. thuringiensis* GA A07 genome. An unrooted phylogenetic tree of all extracted GT protein sequences was constructed using MEGA X software [31] with the maximum-likelihood method, 500 bootstrap replications, the general reversible mitochondrial model [30], and partial deletion.

#### *4.6. Fermentation and Biotransformation of GAA*

*Bacillus subtilis* ATCC 6633 or *B. thuringiensis* GA A07 was cultivated in a 250-mL baffled Erlenmeyer flask containing 20 mL of LB medium with 5% glucose at 180 rpm and 28 °C. When the OD600 of the cell culture reached 0.6, 1 mg/mL of GAA was added to the broth. Cultivation was continued for another 32 h, and 0.5 mL of fermentation broth was sampled at predetermined time intervals and analyzed by UPLC to measure the biotransformation activity.

#### *4.7. UPLC Analysis*

The UPLC system (Acquity UPLC H-Class, Waters, Milford, MA, USA) was equipped with an analytic C18 reversed-phase column (Kinetex® C18, 1.7 μm, 2.1 i.d. × 100 mm, Phenomenex, Torrance, CA, USA). The operating conditions of UPLC for analysis of GAA, antcin K, and celastrol were consistent with those of our previous study [4] except for the 430-nm absorbance detection of celastrol. Operating conditions for flavonoids were from our previous study [11].

#### *4.8. Expression and Purification of GT from GA A07 Strain*

Genomic DNA of the GA A07 strain was isolated using the commercial kit Geno *Plus*™ (Viogene, Taipei, Taiwan). Candidate GT genes were amplified from genomic DNA by PCR with specific primer sets (Table S2). The amplified GT genes were subcloned into the pETDuet-1™ vector through suitable restriction enzyme sites (Table S2) to obtain the expression vector, pETDuet-BtGT (Figure S1a). Expression vectors were transformed into *E. coli* BL21 (DE3) via electroporation to obtain recombinant *E. coli*.

Recombinant BtGT\_16345, BtGT\_19840, and BtGT\_19010 were produced and purified from the recombinant *E. coli*, and analyzed by SDS-PAGE (Figure S1b–d). The protein concentration was determined by a Bradford assay using bovine serum albumin as the standard. The experimental procedures were the same as those in our previous study [4].

#### *4.9. In Vitro Biotransformation Assay*

In vitro biotransformation was performed using purified GT proteins. The 0.1-mL standard reaction mixture contained 1 μg of purified GT protein, 0.02 mg/mL of GAA, 1 mM of UDP-glucose, 10 mM of MgCl2, and 50 mM of Tris at pH 8.0. The reaction was carried out at 40 °C for 30 min, stopped by adding 0.9 mL of methanol, and analyzed by UPLC.

For optimization experiments, the pH, temperature, or metal ion of the standard reaction was varied, and 1 mg/mL of GAA was used. For pH testing, PB at pH 6.0 to 7.5 and Tris buffer at pH 8.0 and pH 9.0 were used. For metal ion testing, 10 mM of MgCl2, CaCl2, or MnCl2 was used. The relative activity was obtained by dividing the product peak area in the UPLC profile of each reaction by that of the reference reaction (Tris at pH 8.0, 40 °C, and 10 mM of MgCl2).
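The relative activity calculation amounts to a simple ratio of peak areas. A minimal sketch is given below; expressing the ratio as a percentage is an assumption, and the peak areas are hypothetical values in arbitrary units, not measured data.

```python
def relative_activity(product_peak_area, reference_peak_area):
    """Product peak area as a percentage of the reference reaction's peak area."""
    if reference_peak_area <= 0:
        raise ValueError("reference peak area must be positive")
    return 100.0 * product_peak_area / reference_peak_area


# Hypothetical UPLC product peak areas (arbitrary units), keyed by metal ion.
# The MgCl2 reaction plays the role of the reference condition.
peak_areas = {"MgCl2": 1200.0, "CaCl2": 900.0, "MnCl2": 300.0}
reference = peak_areas["MgCl2"]
relative = {
    ion: relative_activity(area, reference) for ion, area in peak_areas.items()
}
print(relative)
```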

For the substrate specificity assay, each substrate was dissolved in dimethyl sulfoxide at 25 mg/mL; under the assay conditions, all tested substances were soluble in the reaction buffer. Each test compound (1 mg/mL) was mixed with 1 μg of purified GT protein, 10 mM of UDP-glucose, 10 mM of MgCl2, and 50 mM of PB at pH 7.0 in a 0.1-mL reaction mixture and incubated at 30 °C for 30 min. After incubation, the reaction mixture was analyzed by UPLC.

For the kinetic experiments, different concentrations of GAA were mixed with 10 μg of purified GT protein, 10 mM of UDP-glucose, 10 mM of MgCl2, and either 50 mM of PB at pH 7.0 (BtGT\_16345) or 50 mM of Tris at pH 8.0 (BsUGT398 and BsUGT489) in a 1-mL reaction mixture, and incubated for 20 min at 30 °C (BtGT\_16345) or 40 °C (BsUGT398 and BsUGT489). During incubation, samples were taken from each reaction every 2 min and analyzed by UPLC. The amount of GAA-15-*O*-β-glucoside produced was calculated from the UPLC peak area using a standard curve. The reaction velocity at each GAA concentration was obtained from the slope of the plot of product amount versus time. The kinetic parameters were calculated by nonlinear regression fitting of the Michaelis–Menten equation using SigmaPlot 14.0 software (Systat Software, San Jose, CA, USA). The kcat values were calculated using the predicted molecular mass of each recombinant enzyme.
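The kinetic analysis can be illustrated with a short sketch. It is not the SigmaPlot procedure used here: it substitutes a hand-written Gauss-Newton least-squares fit, started from a Hanes-Woolf linearization, for the nonlinear regression. All numbers are synthetic, noise-free values, and the 50,000 Da molecular mass is a placeholder, not the predicted mass of any enzyme in this study.

```python
def linear_fit(xs, ys):
    """Ordinary least-squares line through (xs, ys); returns (slope, intercept)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def fit_michaelis_menten(substrate, velocity, iterations=20):
    """Fit v = Vmax * S / (Km + S) by Gauss-Newton nonlinear least squares."""
    # Initial guess from the Hanes-Woolf line: S/v = S/Vmax + Km/Vmax.
    slope, intercept = linear_fit(
        substrate, [s / v for s, v in zip(substrate, velocity)]
    )
    vmax, km = 1.0 / slope, intercept / slope
    for _ in range(iterations):
        a11 = a12 = a22 = b1 = b2 = 0.0
        for s, v in zip(substrate, velocity):
            residual = v - vmax * s / (km + s)
            j1 = s / (km + s)                # derivative with respect to Vmax
            j2 = -vmax * s / (km + s) ** 2   # derivative with respect to Km
            a11 += j1 * j1
            a12 += j1 * j2
            a22 += j2 * j2
            b1 += j1 * residual
            b2 += j2 * residual
        det = a11 * a22 - a12 * a12
        vmax += (a22 * b1 - a12 * b2) / det  # solve the 2x2 normal equations
        km += (a11 * b2 - a12 * b1) / det
    return vmax, km


def turnover_number(vmax_umol_per_min, enzyme_ug, molecular_mass_da):
    """kcat in min^-1; ug divided by g/mol gives umol of enzyme."""
    return vmax_umol_per_min / (enzyme_ug / molecular_mass_da)


# Reaction velocity from a time course sampled every 2 min (synthetic data).
times_min = [0, 2, 4, 6, 8, 10]
product_umol = [0.0, 0.002, 0.004, 0.006, 0.008, 0.010]
initial_velocity, _ = linear_fit(times_min, product_umol)

# Synthetic velocities generated with Vmax = 0.002 umol/min and Km = 0.5 mM.
gaa_mm = [0.1, 0.25, 0.5, 1.0, 2.0, 4.0]
velocities = [0.002 * s / (0.5 + s) for s in gaa_mm]
vmax, km = fit_michaelis_menten(gaa_mm, velocities)

# 10 ug of enzyme as in the assay; 50,000 Da is a hypothetical molecular mass.
kcat = turnover_number(vmax, enzyme_ug=10.0, molecular_mass_da=50000.0)
print(f"v0 = {initial_velocity:.4f} umol/min, Vmax = {vmax:.4f}, "
      f"Km = {km:.2f} mM, kcat = {kcat:.1f} per min")
```

With real, noisy data the linearized estimate and the nonlinear fit will generally differ, which is one reason to prefer nonlinear regression for the final parameters.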

#### **5. Conclusions**

A novel GT28 family enzyme, BtGT\_16345, from a new genome assembly of the *B. thuringiensis* GA A07 strain, was identified that can biotransform GAA into GAA-15-*O*-β-glucoside. To our knowledge, BtGT\_16345 is the first GT28 family enzyme with triterpenoid glycosylation activity.

*Int. J. Mol. Sci.* **2019**, *20*, 5192

**Supplementary Materials:** Supplementary materials can be found at http://www.mdpi.com/1422-0067/20/20/5192/s1. The following materials are available online. Table S1: Single copy marker genes used for building the phylogenetic tree; Table S2: Nucleotide sequences of the primers used for amplification of glycosyltransferase (GT) genes in the present study; Figure S1: Expression and purification of glycosyltransferases (GTs) from *Bacillus thuringiensis* GA A07 in *E. coli*.

**Author Contributions:** Conceptualization, T.-S.C., T.-Y.W., and Y.-W.W.; data curation, Y.-W.W., T.-Y.H., Y.-W.L., H.-M.C., W.-X.C., and T.-S.C.; methodology, Y.-W.W., C.-M.C., T.-Y.W., J.-Y.W., and T.-S.C.; project administration, T.-S.C.; Writing—original draft, Y.-W.W., C.-M.C., T.-Y.W., J.-Y.W., and T.-S.C.; writing—review and editing, Y.-W.W., C.-M.C., T.-Y.W., J.-Y.W., and T.-S.C.

**Funding:** This research was financially supported by grants mainly from the Ministry of Science and Technology of Taiwan (Project No. MOST108-2221-E-024-008-MY2 to T.-S.C.) and partially from MOST-108-2628-E-038-002-MY3 to Y.-W.W. and Academia Sinica to T.-Y.W.

**Conflicts of Interest:** The authors declare no conflicts of interest.
