*4.1. Data Set Collection and Identification of SlUGlcAE Genes*

The protein databases of all ten species were retrieved from the National Center for Biotechnology Information (NCBI) FTP site (available online: http://www.ncbi.nlm.nih.gov/Ftp/). The cDNA, CDS, and genome sequence data for tomato were downloaded from the Solanaceae Genomics Network (SGN) (available online: http://solgenomics.net) [37] and the Tomato Functional Genomics Database (TFGD) (available online: http://ted.bti.cornell.edu) [38]. Sequences and related information for the *Arabidopsis thaliana* UGlcAEs (AtUGlcAEs) were obtained from the *Arabidopsis* Information Resource (TAIR; available online: http://www.arabidopsis.org/) [39]. The UGlcAE proteins of tomato (SlUGlcAEs) were predicted using the UGlcAE hidden Markov model (HMM) profile from the Pfam database (available online: http://pfam.sanger.ac.uk/) [40], which was used to search the *S. lycopersicum* protein sequences with the HMMSEARCH program from the HMMER software (available online: http://hmmer.janelia.org) [41]. To account for incomplete protein databases, all of the resulting hits were then used as queries in TBLASTN searches against the tomato genomic sequences. To further confirm the UGlcAE proteins, the domains of the candidate sequences were predicted with the Pfam online server (available online: http://pfam.sanger.ac.uk/) [40] and the SMART online server (available online: http://smart.embl-heidelberg.de/) [42]. The candidate tomato sequences were also checked using BLASTP at the NCBI site (available online: http://blast.ncbi.nlm.nih.gov), retaining only those sequences with highly significant matches to annotated UGlcAE proteins. The same procedure was used to search for UGlcAE family members in the protein databases of the following nine species: *Cucumis sativus*, *Capsicum annuum*, *Solanum tuberosum*, *Arabidopsis thaliana*, *Nicotiana tabacum*, *Populus trichocarpa*, *Solanum pennellii*, *Zea mays*, and *Arabidopsis lyrata* subsp. *lyrata*.
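The candidate-selection step of the HMM search can be illustrated with a minimal Python sketch that reads the per-sequence table written by `hmmsearch --tblout` and keeps hits below an E-value cutoff. The cutoff of 1e-5, the function name `parse_tblout`, and the sequence identifiers in the example table are assumptions made for illustration only; the thresholds and identifiers used in this study are not specified here.

```python
from typing import List, NamedTuple


class Hit(NamedTuple):
    target: str    # name of the matched protein sequence
    evalue: float  # full-sequence E-value
    score: float   # full-sequence bit score


def parse_tblout(text: str, max_evalue: float = 1e-5) -> List[Hit]:
    """Return hits from HMMER --tblout text with E-value <= max_evalue.

    The cutoff is an illustrative assumption, not a value from the study.
    """
    hits = []
    for line in text.splitlines():
        # Comment lines in the tabular output start with '#'.
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.split()
        if len(fields) < 6:
            continue  # skip malformed lines
        # Columns: target, accession, query, accession, E-value, score, ...
        hit = Hit(fields[0], float(fields[4]), float(fields[5]))
        if hit.evalue <= max_evalue:
            hits.append(hit)
    # Most significant hits first.
    return sorted(hits, key=lambda h: h.evalue)


# Hypothetical table with made-up identifiers and scores.
EXAMPLE_TBLOUT = """\
# target name  accession  query name  accession  E-value  score  bias ...
cand_B - QUERY_HMM - 3.0e-12 45.2 0.0 4.1e-12 44.8 0.0 1.0 1 0 0 1 1 1 1 hypothetical
cand_A - QUERY_HMM - 1.2e-80 270.1 0.1 1.5e-80 269.8 0.1 1.0 1 0 0 1 1 1 1 hypothetical
cand_C - QUERY_HMM - 0.5 8.1 0.0 0.7 7.6 0.0 1.0 1 0 0 1 1 0 0 hypothetical
"""

if __name__ == "__main__":
    for hit in parse_tblout(EXAMPLE_TBLOUT):
        print(hit.target, hit.evalue, hit.score)
```

In a workflow of this kind, the retained identifiers would then be passed on to the TBLASTN search and the Pfam and SMART domain checks described above.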

The tomato *UGlcAE* gene subfamilies were named after the orthologous *UGlcAE* genes in the *A. thaliana* genome. The subfamilies of *UGlcAE* genes in tomato were distinguished by Arabic numerals, and the different members within a subfamily were likewise designated with numbers.
