**2. Results**

### *2.1. Identification, Phylogenetic Analysis and Genomic Distribution of Wheat VTL Genes*

Thirty-one wheat VIT family sequences were identified by an Ensembl Pfam search and bidirectional BLAST analysis (Table S1). To study the phylogenetic relationships among VIT and VTL family protein sequences from wheat, *Brachypodium*, maize, rice, *Arabidopsis* and *S. cerevisiae*, an unrooted neighbour-joining tree was constructed. This analysis separated the sequences into two distinct clades representing VTL and VIT proteins, and clustered the wheat VIT family members into 8 VIT and 23 VTL sequences (Figure 1, Table S2). Owing to the occurrence of homoeologs, the 23 VTL sequences were grouped into four *VTL* genes, named *TaVTL1*, *TaVTL2*, *TaVTL4* and *TaVTL5* after the corresponding rice orthologs, followed by the chromosome number. None of the wheat sequences showed high-confidence similarity with rice vacuolar iron transporter homolog 3. *TaVTL1* and *TaVTL4* each had three homoeologs, while *TaVTL2* had four. In contrast, the phylogenetic analysis grouped 13 highly similar sequences together with rice vacuolar iron transporter homolog 5; these were named *TaVTL5* (Figure S1).

*TaVIT1* and *TaVIT2* have been reported previously [14]. Interestingly, a new wheat *VIT* with two homoeologs on chromosome 7 (sub-genomes A and D) was identified (referred to as *TaVIT3*). *VIT* genes were located on chromosome groups 2, 5 and 7, while *VTL* genes were on chromosome groups 2, 4 and 6, with the largest contribution from chromosome group 2. Nine *VTL* genes were present in the B sub-genome and seven each in the A and D sub-genomes. The largest number of VTL sequences was located on chromosome 2B (Figure 2A).
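The counts reported above can be cross-checked with a short tally. The sketch below is only a bookkeeping aid: the numbers are copied from the text, while the variable names are illustrative and not part of the original analysis.

```python
# Counts as reported in the text; the dictionary and variable names are
# illustrative assumptions, not identifiers from the original analysis.
vtl_sequences_per_gene = {"TaVTL1": 3, "TaVTL2": 4, "TaVTL4": 3, "TaVTL5": 13}
vtl_sequences_per_subgenome = {"A": 7, "B": 9, "D": 7}
vit_sequences = 8

# Total VTL sequences, summed over the four named genes.
vtl_total = sum(vtl_sequences_per_gene.values())

# The per-gene and per-sub-genome tallies should describe the same set.
subgenome_total = sum(vtl_sequences_per_subgenome.values())

# VIT plus VTL sequences gives the size of the whole VIT family.
family_total = vit_sequences + vtl_total

print(vtl_total, subgenome_total, family_total)
```

Both tallies give 23 VTL sequences, and adding the 8 VIT sequences recovers the 31 family members identified in the initial search.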

**Figure 1.** Phylogenetic analysis showing separation of the vacuolar iron transporter (VIT) family in *Arabidopsis*, *Brachypodium*, *Oryza sativa*, *Zea mays* and *Triticum aestivum* into two distinct clades: the vacuolar iron transporter-like (VTL) clade and the VIT clade. The neighbour-joining phylogenetic tree was generated using MEGA. The numbers represent bootstrap values from 1000 replicates.

### *2.2. Gene, Protein Structure and Subcellular Localization*

*VIT* genes in wheat have three introns and four exons, whereas *VTL* genes each consist of a single exon with no introns (Figure 2B), so gene structure also clearly divides the *VIT* family into two sub-families. CDS length varied from 657 to 747 nucleotides for wheat *VIT* genes and from 549 to 810 nucleotides for *VTL* genes, except for *TaVTL5-2A\_3*, which was 378 nucleotides long. The short length of this *VTL* gene is due to missing sequence information at the stop site. TaVIT peptides ranged from 218 to 256 amino acids in length, while TaVTL proteins ranged from 125 to 269 amino acids. The division between VIT and VTL proteins was also evident from the predicted sub-cellular localization (Table S2): TaVIT proteins were predicted to be predominantly localized on the plasma membrane and chloroplast thylakoid membrane, whereas most TaVTL proteins (87%) were predicted to be present on the vacuolar membrane. TaVTL4-4A was predicted to be localized on the plasma membrane. VIT proteins had 3–4 predicted trans-membrane (TM) domains. TaVTL1, 2 and 4 mostly had five TM domains, except for TaVTL4-4B, which was predicted to have six. Among the TaVTL5 sequences, only TaVTL5-2D\_3 had five TM domains; the other paralogs/homoeologs of TaVTL5 had fewer TM domains, probably due to gene duplication events or missing information. To summarize, TaVTLs predominantly have five TM domains, as listed in Table S2. The crystal structure of VIT1 from *Eucalyptus grandis* (EgVIT1) was solved recently [32] and was used to check the VIT family protein topology predicted using Phobius [33]. EgVIT1 was predicted to have only three TM domains, whereas the crystal structure showed five. Therefore, VIT as well as VTL protein sequences from wheat were aligned to EgVIT1 to identify possible TM domains in addition to those predicted by Phobius (Figure S2).
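The CDS and peptide lengths quoted for the *VTL* genes are related by simple codon arithmetic. The following is a minimal sketch of that conversion, assuming each reported CDS length includes one terminal stop codon; the text does not state this, and the function name `peptide_length` is hypothetical.

```python
def peptide_length(cds_nt: int, includes_stop: bool = True) -> int:
    """Convert a CDS length in nucleotides to a peptide length in residues.

    Assumption (not stated in the text): the CDS length counts one terminal
    stop codon, which encodes no amino acid.
    """
    if cds_nt <= 0 or cds_nt % 3 != 0:
        raise ValueError("CDS length must be a positive multiple of 3")
    codons = cds_nt // 3
    return codons - 1 if includes_stop else codons


# VTL CDS lengths reported in the text (nucleotides).
for cds in (378, 549, 810):
    print(cds, peptide_length(cds))
```

Under this assumption, the 378-nucleotide and 810-nucleotide CDS correspond to 125 and 269 residues, matching the reported TaVTL protein length range.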

**Figure 2.** Genomic distribution and exon-intron arrangement of *VTL* genes. (**A**) Genomic distribution of *VTL* genes. Wheat *VTL* genes were present on chromosome groups 2, 4 and 6, with the largest number on chromosome group 2, which was selected to show the *VTL* gene distribution on chromosomes 2A, 2B and 2D. (**B**) Genomic structure of wheat *VTL* and *VIT* genes. The intron-exon arrangement was identified using the Gene Structure Display Server (GSDS). Exons and introns are represented by pink boxes and cyan lines, respectively. The scale indicates the size of the genomic regions.
