*2.2. Bioinformatics Analysis of the CsHCT Gene and Amino Acid Sequence*

This study analyzed the highly conserved domains of the HCT protein sequences of *A. thaliana*, *N. tabacum*, *H. cannabinus*, *T. cacao*, and *F. vesca* to design a degenerate primer, and used cDNA from the Chin-Shin Oolong tea plant as the template for PCR to obtain a gene fragment of *CsHCT*. The SMARTer™ RACE cDNA amplification kit (Clontech Laboratories, Inc., Mountain View, CA, USA) was used to amplify the 5′-end and 3′-end cDNA sequences. The full-length cDNA sequence of *CsHCT* was obtained after sequencing.
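As an illustration of what a degenerate primer encodes, the following minimal sketch expands IUPAC ambiguity codes into the concrete oligonucleotides they represent and counts them. The primer string `GGNTAYTTYGG` and the function names are hypothetical and chosen only for demonstration; the actual primer sequence used in this study is not given in this section.

```python
from itertools import product

# IUPAC nucleotide ambiguity codes and the bases each code stands for.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT",
    "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG",
    "N": "ACGT",
}


def degeneracy(primer):
    """Return the number of distinct sequences a degenerate primer represents."""
    count = 1
    for base in primer.upper():
        count *= len(IUPAC[base])
    return count


def expand(primer):
    """List every concrete sequence encoded by a degenerate primer."""
    pools = [IUPAC[base] for base in primer.upper()]
    return ["".join(combo) for combo in product(*pools)]


# Hypothetical primer, for illustration only (not the primer used in the study).
example_primer = "GGNTAYTTYGG"
print(degeneracy(example_primer))
print(expand(example_primer)[:4])
```

A lower degeneracy generally means that a larger fraction of the primer pool matches the target exactly, which is one reason conserved domains are preferred as primer sites.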

In the bioinformatics analysis of the CsHCT amino acid sequence, the ExPASy Translate tool (https://web.expasy.org/translate/) (accessed in July 2016) was used to deduce the amino acid sequence from the nucleotide sequence. The ExPASy Compute pI/Mw tool (https://web.expasy.org/compute\_pi/) (accessed in July 2016) was used to estimate the protein molecular weight and isoelectric point. Multiple sequence alignments were performed using the EMMA function of the EMBOSS explorer (http://www.bioinformatics.nl/emboss-explorer/) (accessed in July 2016) with the BLOSUM50 scoring matrix, and GeneDoc software was used to compare the results. Subsequently, Motif Scan (https://myhits.isb-sib.ch/cgi-bin/motif\_scan) (accessed in July 2016) was used to predict the structural and functional regions of the protein. The National Center for Biotechnology Information (https://www.ncbi.nlm.nih.gov/) (accessed in July 2016) and Phytozome v10.3 (https://phytozome.jgi.doe.gov/pz/portal.html) (accessed in July 2016) websites were used to obtain the protein sequences of clade Vb BAHD (BEAT, benzylalcohol-O-acetyltransferase; AHCT, anthocyanin O-hydroxycinnamoyltransferase; HCBT, anthranilate N-hydroxycinnamoyl-benzoyltransferase; and DAT, deacetylvindoline 4-O-acetyltransferase) acyltransferases from *A. thaliana*, *Oryza sativa*, *Populus trichocarpa*, *Coffea canephora*, and *H. cannabinus*. The sequences were aligned using ClustalW. A phylogenetic tree was constructed in MEGA6 using the neighbor-joining method, and branch support was assessed with 1000 bootstrap replicates. SignalP (http://www.cbs.dtu.dk/services/SignalP/) (accessed in July 2016) was used to predict whether the protein contained a signal peptide. The Hphob./Kyte and Doolittle scale of ExPASy ProtScale (https://web.expasy.org/protscale/) (accessed in July 2016) was used to predict whether the protein was hydrophilic or hydrophobic. The subcellular localization of the protein was predicted using WoLF PSORT (https://www.genscript.com/wolf-psort.html) (accessed in July 2016).
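The Kyte and Doolittle hydropathy analysis mentioned above can be sketched as a sliding-window average over per-residue hydropathy values. The code below is a minimal illustration, not the ProtScale implementation: the window size of 9, the unweighted average, the function names, and the short test peptide are assumptions, because this section does not state the settings that were used. The `gravy` helper (mean hydropathy over the whole sequence) is an additional summary that is not part of the original analysis.

```python
# Kyte and Doolittle (1982) hydropathy values for the 20 standard amino acids.
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydropathy_profile(seq, window=9):
    """Return the unweighted mean hydropathy of every full window in seq.

    Positive values indicate hydrophobic stretches, negative values
    hydrophilic ones. The window size of 9 is an assumed default.
    """
    seq = seq.upper()
    if window < 1 or window > len(seq):
        raise ValueError("window must be between 1 and the sequence length")
    scores = [KD[aa] for aa in seq]
    return [
        sum(scores[i:i + window]) / window
        for i in range(len(seq) - window + 1)
    ]


def gravy(seq):
    """Return the mean hydropathy over the whole sequence (assumed summary)."""
    seq = seq.upper()
    return sum(KD[aa] for aa in seq) / len(seq)


# Hypothetical peptide, for illustration only (not the CsHCT sequence).
peptide = "MKIEVKESTMVRPAAETP"
profile = hydropathy_profile(peptide)
print(len(profile), min(profile), max(profile))
print(gravy(peptide))
```

Under these assumptions, a profile that stays mostly below zero, or a negative whole-sequence mean, would point to a predominantly hydrophilic protein.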
