#### *2.1. Filtering Steps for 150 bp Fragments*

All 150-nucleotide sequences, obtained by sliding a window across the genome sequence, were filtered using four criteria: (1) any sequence with at least one undefined nucleotide (N) was excluded; (2) sequences with fewer than 10 copies were excluded; (3) fragments with a significant match against repetitive elements, identified using RepeatMasker software (version 4.0.7) [22] with the parameters "-species trypanosoma -pa 60 -u -xm -engine ncbi -excln", were excluded; and (4) any repeat from multigenic family genes was excluded. For criterion (4), fragments were aligned with BLASTn against an in-house multigenic family database composed of 4999 genes (surface protease (GP63), mucin-associated surface protein (MASP), retrotransposons and trans-sialidase) from CL Brener (-S and -P haplotypes) using the parameters "-e-value 1e-72 -dust no -qcov\_hsp\_perc 100".
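Criteria (1) and (2) can be illustrated with a minimal Python sketch. It is not the pipeline used in this study: the function names, the window step of one nucleotide, and the choice to count copies as exactly identical windows on the forward strand are all assumptions made for illustration.

```python
from collections import Counter


def sliding_windows(genome, size=150, step=1):
    """Yield every window of length `size`; the step of 1 nt is an assumption."""
    for start in range(0, len(genome) - size + 1, step):
        yield genome[start:start + size]


def filter_windows(genome, size=150, min_copies=10, step=1):
    """Apply criteria (1) and (2) and return {sequence: copy number}."""
    # Criterion (1): drop windows containing an undefined nucleotide (N).
    windows = [
        w for w in sliding_windows(genome.upper(), size, step) if "N" not in w
    ]
    # Criterion (2): keep only sequences present in at least `min_copies` copies.
    # Assumption: a "copy" is an exactly identical window on the forward strand.
    counts = Counter(windows)
    return {seq: n for seq, n in counts.items() if n >= min_copies}
```

On a toy input, `filter_windows("ACGACGACGNACG", size=3, min_copies=3)` keeps only `ACG`, which occurs four times outside the windows that overlap the `N`. Criteria (3) and (4) depend on RepeatMasker and BLASTn runs and are not covered by this sketch.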

#### *2.2. Search Terms in the TriTryp Database*

To establish the number of specific genes in *T. cruzi* strains Dm28c, Y, TCC, and the CL Brener haplotypes Esmeraldo-like (S) and non-Esmeraldo-like (P), the following terms were searched in TriTrypDB [23]: "trans-sialidase", "mucin-associated surface protein", "TcMUC", "mucin like", "surface protease GP63", "hypothetical protein", "90 kDa surface protein", "serine-alanine and proline-rich protein", "dispersed gene family protein 1", "elongation factor 1-γ" and "retrotransposon hot spot protein".

The search terms "TcMUC" and "mucin like" were merged into a single category, "mucin". Likewise, the terms "90 kDa surface protein" and "serine-alanine and proline-rich protein" were grouped under "90 kDa surface protein".
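The grouping of search terms into categories can be expressed as a simple lookup, sketched here in Python. The names `CATEGORY_OF_TERM` and `count_by_category` are hypothetical, and summing per-term counts assumes that no gene is returned by two terms of the same category; otherwise gene identifiers would need to be de-duplicated first.

```python
from collections import Counter

# Search terms that are merged into a shared category; terms not listed
# here keep their own name as the category.
CATEGORY_OF_TERM = {
    "TcMUC": "mucin",
    "mucin like": "mucin",
    "90 kDa surface protein": "90 kDa surface protein",
    "serine-alanine and proline-rich protein": "90 kDa surface protein",
}


def count_by_category(hits_per_term):
    """Sum per-term gene counts into per-category counts."""
    totals = Counter()
    for term, n in hits_per_term.items():
        totals[CATEGORY_OF_TERM.get(term, term)] += n
    return dict(totals)
```

For example, with purely illustrative counts, `count_by_category({"TcMUC": 3, "mucin like": 2, "trans-sialidase": 5})` reports five genes under "mucin" and five under "trans-sialidase".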
