#### *2.4. Detection of Potential Recombination Events in RHS Sequences*

The RHS sequences selected for the phylogenetic study were also used to identify recombination events in the clone CLB using the RDP4 program (Recombination Detection Program) [24], which allows the identification and statistical analysis of recombination events from a set of aligned sequences. It uses non-parametric recombination detection methods (algorithms RDP, GENECONV, MaxChi, Chimera, Bootscan, 3Seq, and SiSscan) to identify the breakpoints in the genomic sequences where recombination begins and ends, as well as the donor parental sequences of the recombinant fragment. A sequence was considered recombinant only when the recombination event was detected by at least 6 of the 7 algorithms in the RDP4 package.
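The acceptance criterion above (support from at least 6 of the 7 methods) can be illustrated with a minimal sketch. It assumes the per-event method calls have already been exported into a dictionary; the event identifiers, the input layout, and the function name `supported_events` are hypothetical and are not part of RDP4.

```python
from typing import Dict, List, Set

# The seven detection methods named in the text.
ALGORITHMS = frozenset(
    {"RDP", "GENECONV", "MaxChi", "Chimera", "Bootscan", "3Seq", "SiSscan"}
)
MIN_SUPPORT = 6  # threshold stated in the text: at least 6 of 7 methods


def supported_events(
    events: Dict[str, Set[str]], min_support: int = MIN_SUPPORT
) -> List[str]:
    """Return the IDs of events detected by at least `min_support` methods.

    `events` maps a (hypothetical) event ID to the set of method names
    that flagged it. Names outside ALGORITHMS are ignored.
    """
    kept = []
    for event_id, methods in events.items():
        n_methods = len(methods & ALGORITHMS)
        if n_methods >= min_support:
            kept.append(event_id)
    return kept


# Illustrative input only; these are not results from the study.
example = {
    "event_1": {"RDP", "GENECONV", "MaxChi", "Chimera", "Bootscan", "3Seq"},
    "event_2": {"RDP", "MaxChi", "3Seq"},
    "event_3": set(ALGORITHMS),
}
accepted = supported_events(example)
print(accepted)
```

In this toy input, `event_2` is rejected because only three methods support it; the other two events meet the threshold.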

#### *2.5. Expression and Purification of Recombinant RHS*

An 877-bp fragment encoding a 292-aa region of the carboxy-terminal domain of the RHS (TcCLB.511055.20) was amplified by PCR from CLB genomic DNA, cloned into pGEM-T, and sequenced to confirm gene identity. It was then subcloned into pGEX-1λT to produce the RHS-GST fusion protein as described by Martins et al., 2015 [24]. *E. coli* BL21 bacteria were transformed with the RHS-GST construct and grown in LB medium, and protein expression was induced with 1 mM isopropyl β-D-1-thiogalactopyranoside (IPTG). The RHS recombinant protein was extracted from the insoluble fraction of bacterial lysates with Laemmli's sample buffer and separated on 10% SDS-PAGE. The band corresponding to the recombinant protein was excised from the gel and extracted by dialysis against ammonium bicarbonate and distilled water [24]. The purity of recombinant RHS was checked by SDS-PAGE stained with colloidal Coomassie Blue and by immunoblotting (Figure S2). Purified protein was quantified with Coomassie Plus (Pierce, Thermo Fisher Scientific, Waltham, MA, USA) in 96-well plates at 620 nm.

#### *2.6. Antibody Production, Western Blot, and Immunofluorescence Analyses*

About 2 mg of the purified RHS recombinant protein was sent to Rheabiotech Research and Development Laboratory, SP, Brazil, for the production of polyclonal anti-RHS antibodies in mice. The specificity and reactivity of the anti-RHS antibodies were determined by ELISA and Western blot assays using the recombinant RHS protein.

Epimastigotes (10<sup>8</sup> cells) of *T. cruzi* (clone CLB, strain G), *T. cruzi marinkellei*, and *T. rangeli*, and procyclic forms (10<sup>7</sup> cells) of *T. brucei* were washed in PBS and lysed with 4× Laemmli's sample buffer, and the extracts were subjected to SDS-PAGE (10% separating gel and 3% stacking gel) at 120 V for 45 min. Proteins were transferred to Hybond ECL membranes (Amersham, GE Healthcare Life Sciences). For the Western blot reaction, the membrane was blocked in 1× PBS solution containing 7.5% skimmed milk powder (PBS/milk solution) for 1 h at room temperature. The membrane was then incubated with anti-RHS1 (dilution 1:500) in PBS/milk solution for 1 h at room temperature. Subsequently, the membrane was washed three times (3 × 5 min) in PBS containing 0.05% Tween 20 (PBS/Tween solution) and incubated with secondary antibodies (Sigma Aldrich) at a dilution of 1:10,000 for 1 h at room temperature. Bound antibodies were detected with ECL (Enhanced Chemiluminescence) substrate (GE Healthcare), and luminescent bands were visualized in an Alliance 2.7 photodocumentation system (UVItec).

For the indirect immunofluorescence assay, *T. cruzi* epimastigotes (10<sup>7</sup> cells) were harvested from the culture medium, washed with PBS, and fixed with 2% paraformaldehyde in PBS for 15 min at room temperature. The parasites were then washed with PBS and incubated with anti-RHS antibodies (1:1000 dilution) in the presence of 0.1% saponin and 1% PBS/BSA for 1 h at room temperature. The parasites were washed once more with PBS and incubated for 1 h with a goat anti-mouse IgG antibody conjugated to Alexa Fluor 568, diluted 1:100 in 1% PBS/BSA, and 1 mM DAPI (4′,6-diamidino-2-phenylindole, Molecular Probes). Subsequently, epimastigotes were washed with PBS and the slides were mounted using Glycerol-PPD (p-Phenylenediamine). Images were acquired with a TCS SP5 II Tandem Scanner confocal microscope (Leica Microsystems, Wetzlar, Germany) using a 63× NA 1.40 PlanApo oil immersion objective and processed with Imaris software 7.0 (Bitplane).

#### **3. Results**

#### *3.1. Mapping of RHS Sequences on the Chromosomes of Clone CLB of T. cruzi*

Natural populations of *T. cruzi* reproduce predominantly by binary fission and therefore exhibit a clonal population structure [25–28]. However, hybridization has been demonstrated in vitro [29] and also in natural populations of *T. cruzi* [28,30–36]. Based on several genetic markers, *T. cruzi* isolates were classified into six discrete typing units (DTUs), named lineages TcI to TcVI [37–39]. The isolates from lineages TcV and TcVI have a hybrid evolutionary origin from at least two hybridization events between lineages TcII and TcIII [28,33,34,39].

The clone CL Brener (CLB) is a hybrid strain grouped in lineage TcVI, and sequence analysis of its genome revealed the presence of two haplotypes [2], one of which has contigs similar to the Esmeraldo strain of lineage TcII. The sequence divergence between the two haplotypes is 5.4% [2]. The genomic sequences generated in the Genome Project of *T. cruzi* clone CLB have been organized into 41 pairs of homologous chromosomes (TcChr), with the smallest having 77,958 bp (TcChr1) and the largest 2,371,736 bp (TcChr41) [2,40,41]. Due to the hybrid nature of CLB, each pair of homologous chromosomes consists of one homolog of the Esmeraldo-like haplotype (S) and one homolog of the non-Esmeraldo-like haplotype (P), totaling 82 in silico chromosomes (TcChr) [2,40]. A search for RHS sequences in the CLB genome deposited in the TriTrypDB database retrieved 525 RHS sequences (111 genes, 384 pseudogenes, and 30 truncated sequences), which are distributed in the haplotypes as follows: 48 complete genes, 177 pseudogenes, and 8 truncated sequences in the Esmeraldo haplotype (S), and 63 complete genes, 207 pseudogenes, and 22 truncated sequences in the non-Esmeraldo haplotype (P) (Table S1). Besides these sequences, we found 42 complete RHS genes, 175 pseudogenes, and 11 truncated sequences among the unallocated contigs, totaling 753 RHS sequences in the CLB genome. RHS gene sizes range from 351 to 3014 bp. The estimated RHS content of the CLB genome was 3,271,841 bp, comprising about 5.4% of the *T. cruzi* genome sequence.
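As a quick consistency check, the per-haplotype counts reported above can be tallied as follows. This is only an arithmetic sketch using the numbers given in the text; the dictionary layout and the helper name `total` are illustrative choices.

```python
# RHS sequence counts as reported in the text (Table S1 is the source of record).
counts = {
    "S": {"genes": 48, "pseudogenes": 177, "truncated": 8},
    "P": {"genes": 63, "pseudogenes": 207, "truncated": 22},
    "unallocated": {"genes": 42, "pseudogenes": 175, "truncated": 11},
}


def total(groups):
    """Sum all sequence categories over the given groups."""
    return sum(sum(counts[g].values()) for g in groups)


# Sequences assigned to the 82 in silico chromosomes (haplotypes S and P).
assigned = total(["S", "P"])

# Per-category totals for the assigned sequences.
assigned_by_category = {
    category: counts["S"][category] + counts["P"][category]
    for category in ("genes", "pseudogenes", "truncated")
}

# All RHS sequences, including those on unallocated contigs.
overall = total(counts)

print(assigned, assigned_by_category, overall)
```

The tallies reproduce the figures stated in the paragraph: 525 sequences on the assigned chromosomes (111 genes, 384 pseudogenes, 30 truncated) and 753 sequences overall.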

The distribution of RHS sequences along the CLB chromosomes is shown in Figure S3. Among 82 chromosomes, three chromosomes, TcChr1-S, TcChr4-S, and TcChr34-S, did not show RHS sequences. Larger chromosomes, such as TcChr40 and TcChr41, have predominantly RHS pseudogenes (Table S1), suggesting that RHS and other repetitive sequences could be involved in the expansion of the chromosome size. It is important to highlight that the total number of RHS sequences present in the CLB genome may be even greater than that obtained in this analysis. When non-transcribed sequences were included in our analysis, the total number of RHS sequences was larger than one thousand, showing the presence of fragments dispersed in the genome, which are reminiscent of RHS genes. These results reflect the complexity of the *T. cruzi* genome and RHS family [2,6,42]. The haploid genome of *T. cruzi* is about 2- and 5-fold larger than that of *T. brucei* and *Leishmania* spp., respectively. In addition, multigenic families (trans-sialidases, mucins, DGF-1, MASP, RHS, and GP63 proteases) underwent a very pronounced expansion process in *T. cruzi* [2,3,6,42–44].

The frequency of RHS sequences in each chromosome of CLB was plotted as a heatmap in Figure 1, and the proportion of total RHS length in each chromosome is shown in Figure S4. RHS sequences comprise 0.34% to 6.14% of the entire length of each CLB chromosome. Overall, the frequency of RHS was similar in most pairs of homologous chromosomes. However, in some homologous pairs, this proportion was quite different, e.g., between the haplotypes S and P of the chromosome TcChr20 or TcChr21.

**Figure 1.** Circos diagram depicting the genomic organization and recombination events of the RHS family in the whole genome of *T. cruzi* clone CLB. Inner track 1 represents the recombination between RHS genes. The recombinant sequences are linked to their putative major and minor parental sequences by purple and green lines, respectively. Track 2 shows the genomic organization of RHS genes in chromosomes. Genes on forward and reverse strands are colored in blue and red, respectively. Track 3 shows the genomic organization of RHS pseudogenes in chromosomes. Pseudogenes on forward and reverse strands are colored in green and orange, respectively. Track 4 depicts a heat map of RHS gene and pseudogene density for each chromosome. Values were obtained by summing the lengths (bp) of RHS genes and pseudogenes and dividing by the chromosome size. Outer track 5 shows the representation of *T. cruzi* CLB chromosomes for Esmeraldo (haplotype S) and non-Esmeraldo (haplotype P) allelic loci.
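The density value described for track 4 (summed length of RHS genes and pseudogenes divided by chromosome size) can be sketched as below. The coordinates and chromosome length in the example are hypothetical, the function name `rhs_density` is an illustrative choice, and 1-based inclusive coordinates are an assumption; overlapping features are not merged in this simple version.

```python
from typing import Iterable, Tuple


def rhs_density(features: Iterable[Tuple[int, int]], chromosome_length: int) -> float:
    """Fraction of a chromosome occupied by RHS features.

    `features` holds (start, end) pairs in 1-based inclusive coordinates
    (an assumption of this sketch). Feature lengths are summed as-is,
    so overlapping features would be counted twice.
    """
    if chromosome_length <= 0:
        raise ValueError("chromosome_length must be positive")
    total_bp = sum(end - start + 1 for start, end in features)
    return total_bp / chromosome_length


# Hypothetical chromosome of 100,000 bp carrying two RHS features
# of 2,000 bp and 1,000 bp; not data from the study.
features = [(1001, 3000), (50001, 51000)]
density = rhs_density(features, 100_000)
print(f"{density:.2%}")
```

Multiplying the returned fraction by 100 gives a percentage comparable to the per-chromosome range quoted in the text.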
