*3.1. Bioinformatics Analysis of the CsHCT Gene and Amino Acid Sequences of C. sinensis L.*

*C. sinensis* L. contains numerous polyphenolic compounds that provide multiple health benefits. To understand the role of HCT in the formation of acylated flavonol glycosides, the *CsHCT* gene was cloned from the Chin-Shin Oolong tea plant. The *CsHCT* cDNA is 1552 bp long and includes a 35-bp 5′ untranslated region and a 182-bp 3′ untranslated region (excluding the poly-A tail). The open reading frame comprises 1311 nucleotides and encodes a protein of 436 amino acids (GenBank accession number: MH271107).
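The relationship between the reported ORF length and protein length can be checked with a short calculation. The sketch below assumes that the 1311-nucleotide ORF length includes a single terminal stop codon; the function name `orf_protein_length` is hypothetical and used only for illustration.

```python
STOP_CODON_NT = 3  # assumption: the reported ORF length includes one stop codon


def orf_protein_length(orf_nt: int) -> int:
    """Return the number of amino acids encoded by an ORF of orf_nt nucleotides."""
    if orf_nt < STOP_CODON_NT or orf_nt % 3 != 0:
        raise ValueError("ORF length must be a positive multiple of 3")
    # Every codon except the final stop codon contributes one residue.
    return (orf_nt - STOP_CODON_NT) // 3


print(orf_protein_length(1311))  # 436
```

Under this assumption, 437 codons minus one stop codon gives the 436 residues reported for CsHCT.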

Sequence alignment analysis was performed on the amino acid sequence of CsHCT of *C. sinensis* L. and the HCT sequences of *A. thaliana* (AtHCT), *H. cannabinus* (HcHCT), *P. trichocarpa* (PtHCT), and *C. canephora* (CcHCT). Using global alignment and the BLOSUM50 scoring matrix, the similarities between CsHCT and AtHCT, HcHCT, PtHCT, and CcHCT were found to be 79.5%, 81.5%, 81.9%, and 82.1%, respectively. Like the HCTs of these other higher plants, CsHCT contains the HXXXD and DFGWG motifs, which are conserved in BAHD acyltransferases (Figure 1). Motif scanning of the CsHCT amino acid sequence then predicted N-myristoylation, casein kinase II phosphorylation, and protein kinase C phosphorylation sites in the N-terminal region of the protein.
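As an illustration of how the two conserved BAHD motifs can be located in a protein sequence, the following minimal sketch uses regular expressions. It is not the motif-scanning tool used in this study; the function name `find_bahd_motifs` and the toy sequence are hypothetical, and the toy sequence is not a fragment of CsHCT.

```python
import re

# X in HXXXD stands for any residue; DFGWG is matched literally.
BAHD_MOTIFS = {
    "HXXXD": r"H[A-Z]{3}D",
    "DFGWG": r"DFGWG",
}


def find_bahd_motifs(protein: str) -> dict[str, list[int]]:
    """Return 1-based start positions of each motif, including overlapping hits."""
    seq = protein.upper()
    hits = {}
    for name, pattern in BAHD_MOTIFS.items():
        # A lookahead lets overlapping occurrences be reported as separate hits.
        lookahead = re.compile(f"(?=({pattern}))")
        hits[name] = [m.start() + 1 for m in lookahead.finditer(seq)]
    return hits


toy = "MKIHAAADGLSTDFGWGKP"  # hypothetical fragment for demonstration only
print(find_bahd_motifs(toy))  # {'HXXXD': [4], 'DFGWG': [13]}
```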

**Figure 1.** Alignment of deduced amino acid sequences of CsHCT with other putative HCTs. Black and gray shadings indicate conservation of 100% and at least 80%, respectively. Amino acid residues enclosed by squares correspond to consensus sequences of SXXD, SXXE, HXXXD, GVXXGV, TXXD, GLXXTI, DFGWG, etc. AtHCT, hydroxycinnamoyl-CoA shikimate/quinate hydroxycinnamoyl transferase from *A. thaliana* (NP\_199704); HcHCT, putative hydroxycinnamoyl-CoA shikimate/quinate hydroxycinnamoyl transferase from *H. cannabinus* (AFN85668); PtHCT, quinate O-hydroxycinnamoyl transferase/shikimate O-hydroxycinnamoyl transferase from *P. trichocarpa* (ACC63882); and CcHCT, hydroxycinnamoyl-CoA shikimate/quinate hydroxycinnamoyl transferase from *C. canephora* (ABO47805). Alignment was performed using the ClustalW algorithm. Sequence identities with CsHCT were as follows: AtHCT, 79.5%; HcHCT, 81.5%; PtHCT, 81.9%; and CcHCT, 82.1%.
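Percent identity values such as those listed in the caption are derived from aligned sequence pairs. The sketch below shows one common way to compute such a value from two already-aligned sequences. The choice of denominator (all alignment columns that are not gaps in both sequences) is an assumption, since the text does not state which definition was used, and the function name `percent_identity` and the example sequences are hypothetical.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent of alignment columns in which both sequences carry the same residue."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    # Ignore columns that are gaps in both sequences (possible in multiple alignments).
    columns = [
        (a, b)
        for a, b in zip(aligned_a.upper(), aligned_b.upper())
        if not (a == gap and b == gap)
    ]
    if not columns:
        raise ValueError("no comparable columns in the alignment")
    matches = sum(1 for a, b in columns if a == b and a != gap)
    return 100.0 * matches / len(columns)


# Hypothetical aligned pair: 4 identical columns out of 6.
print(round(percent_identity("ACDE-G", "ACDQFG"), 1))  # 66.7
```

Other definitions divide by the length of the shorter sequence or by the number of ungapped columns, so values from different tools are not always directly comparable.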

Compute pI/Mw was used to analyze the amino acid sequence of CsHCT; its molecular weight and isoelectric point were predicted to be 48.53 kDa and 5.86, respectively. SignalP analysis did not identify a signal peptide in CsHCT. The hydrophilicity and hydrophobicity of the protein were then analyzed using ProtScale, and the results demonstrated that the amino acid sequence did not possess an apparent hydrophobic end. The distribution of the hydropathy indices indicated that CsHCT is a hydrophilic protein (Figure 2A). Tuominen et al. (2011) reported that HCT is a member of clade Vb of the BAHD acyltransferase family. To understand the phylogenetic relationships between CsHCT and other clade Vb members, a phylogenetic tree was constructed with the neighbor-joining method in MEGA6 for the CsHCT of *C. sinensis* L. and the clade Vb proteins of *A. thaliana*, *O. sativa*, *P. trichocarpa*, *C. canephora*, and *H. cannabinus*. The results demonstrated that CsHCT was most closely related to AtHCT (Figure 2B); thus, the two proteins may have similar biochemical characteristics.

**Figure 2.** Hydropathy plot and phylogenetic tree analysis for CsHCT. (**A**) Hydropathy plot of CsHCT using the Kyte–Doolittle method with a window size of 436 amino acids. Each value plotted along the x-axis is the average hydropathy of the entire window, assigned to the amino acid at the middle of that window. Values above 0 (zero) indicate hydrophobic regions in the protein, and those below 0 (zero) indicate hydrophilic regions. (**B**) Unrooted phylogram of members of the HCT protein family. Phylogenetic tree of AtSHT (at2g19070), AtHCT (at5g48930), and HXXXD-type acyl transferase family protein (at5g57840) proteins in *Arabidopsis*; transferase family protein (Os02g39850; Os04g42250; Os06g08580; Os06g08640; Os08g10420; Os08g43040; Os09g25460; and Os11g07960) in *O. sativa*; PtHCT1 (Potri.003G183900), PtHCT3 (Potri.018G104800), PtHCT4 (Potri.018G104700), PtHCT6 (Potri.018G105400), PtHCT7 (Potri.005G028000), SHTL2 (Potri.018G109900), Shikimate O-hydroxycinnamoyl transferase (Potri.005G028100 and Potri.005G028400), transferase family protein (Potri.006G157100; Potri.015G147300; Potri.003G057200; Potri.001G042900; Potri.001G128100; Potri.014G025500; and Potri.012G144500), anthranilate N-hydroxycinnamoyl/benzoyltransferase-like protein (Potri.014G025600) in *P. trichocarpa*; CaHCT1 (CAJ40778), CaHCT2 (CAT00082), CaHCT3 (CAT00081), CcHCT (ABO77955) in *C. canephora*; and HcHCT (AFN85668) in *H. cannabinus*. The phylogenetic tree was constructed using the neighbor-joining algorithm implemented in the MEGA 6 software package, with 1000 bootstrap replicates.
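The sliding-window averaging described for panel A can be sketched as follows. This is a minimal illustration of the Kyte–Doolittle calculation, not the ProtScale implementation; the function name `hydropathy_profile` is hypothetical, and the 9-residue window is an assumption chosen for demonstration rather than the setting reported in the caption.

```python
# Kyte-Doolittle hydropathy index for the 20 standard amino acids.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydropathy_profile(protein: str, window: int = 9) -> list[tuple[int, float]]:
    """Return (1-based centre position, mean hydropathy) for every full window."""
    if window < 1 or window % 2 == 0:
        raise ValueError("window must be a positive odd integer")
    scores = [KYTE_DOOLITTLE[aa] for aa in protein.upper()]
    half = window // 2
    profile = []
    for start in range(len(scores) - window + 1):
        mean = sum(scores[start:start + window]) / window
        # The average is assigned to the residue at the middle of the window.
        profile.append((start + half + 1, mean))
    return profile


# Hypothetical peptide: a hydrophobic stretch followed by a charged stretch.
for position, value in hydropathy_profile("IIILLLKKKDDD", window=3):
    print(position, round(value, 2))
```

Positive averages mark hydrophobic windows and negative averages mark hydrophilic ones, matching the reading of the plot given in the caption. A sequence shorter than the window yields an empty profile.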
