*2.5. Intra-Assemblage B Genetic Diversity Analysis*

Regardless of the molecular marker used, genetic diversity was far higher within assemblage B than within assemblage A sequences. Multiple sequence alignments of BIII, BIV, and ambiguous BIII/BIV sequences at the *gdh*, *bg*, and *tpi* loci against the corresponding reference sequences revealed SNPs at multiple sites, ranging from 11 sites (B sequences at the *bg* locus) to 32 sites (BIII/BIV sequences at the *tpi* locus) (Table S6). Overall, 611 SNPs were identified among assemblage B sequences across the three loci. Of these, 16.9% (103/611) were single point mutations and 83.1% (508/611) were double peaks. Most of these SNPs (66.9%; 409/611) accumulated at defined positions (hotspots) within each locus. Within *gdh*, the main hotspots were C87, T147, G150, C204, C309, and G402 for BIII sequences (reference sequence: AF069059), and C123, T135, T183, G186, C255, C273, C345, T366, T387, and A438 for BIII/BIV sequences (reference sequence: L40508). Within *tpi*, C108 was the only hotspot for BIII sequences (reference sequence: AF069561), whereas A5, T57, T131, T134, A176, A395, and C470 were the hotspots for BIII/BIV sequences (reference sequence: AF069560). Within *bg*, the main hotspots identified for B sequences were C165, A183, C309, and T519 (reference sequence: AY072727).
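As a quick arithmetic check, the overall proportions reported above can be recomputed from the stated counts. The following is a minimal sketch in Python; the helper name `share` and the dictionary layout are illustrative choices, not part of the original analysis, and only the counts (103, 508, 409, and 611) are taken from the text.

```python
def share(part: int, total: int) -> float:
    """Return part/total as a percentage rounded to one decimal place."""
    if total <= 0:
        raise ValueError("total must be positive")
    return round(100 * part / total, 1)


# Counts reported in the text for assemblage B sequences (all three loci).
TOTAL_SNPS = 611
counts = {
    "single point mutations": 103,
    "double peaks": 508,
    "SNPs at hotspot positions": 409,
}

for label, n in counts.items():
    print(f"{label}: {share(n, TOTAL_SNPS)}% ({n}/{TOTAL_SNPS})")
```

The two SNP classes are mutually exclusive, so their counts add up to the total of 611, whereas the hotspot count is a separate partition of the same total.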

The distribution of single point mutations and double peaks differed substantially among sub-assemblages and loci. At the *gdh* locus, hotspot sites accumulated 57.7% of all SNPs detected in BIII sequences, compared with 72.9% in BIII/BIV sequences. Double peaks accounted for 37.8% of the SNPs detected in BIII sequences, but for 67.8% of those detected in ambiguous BIII/BIV sequences (Figure 2A). At the *tpi* locus, hotspot sites accumulated 18.2% of all SNPs detected in BIII sequences, compared with 54.5% in BIII/BIV sequences (Figure 2B). Finally, at the *bg* locus, hotspot sites accumulated 78.4% of the SNPs detected in assemblage B sequences, of which 58.1% corresponded to double peaks (Figure 2C).
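To illustrate how SNPs might be sorted into these two classes, the sketch below compares an aligned query sequence with its reference, position by position. It rests on several assumptions that are not stated in the text: that double peaks are recorded in the consensus sequence as two-base IUPAC ambiguity codes, that site labels such as C87 combine the reference base with its 1-based position, and that the two sequences are already aligned to equal length. The function name `classify_snps` and the toy sequences are hypothetical and do not represent study data.

```python
# Two-base IUPAC ambiguity codes, used here as a stand-in for double peaks
# (assumption: heterozygous chromatogram positions were encoded this way).
IUPAC_TWO_BASE = {"R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC"}
UNAMBIGUOUS = {"A", "C", "G", "T"}


def classify_snps(reference: str, query: str) -> list[tuple[str, str]]:
    """Label each differing site as a 'point mutation' or a 'double peak'.

    Both sequences must be pre-aligned and of equal length. Gaps, N, and
    any other symbols are skipped rather than counted as SNPs.
    """
    if len(reference) != len(query):
        raise ValueError("sequences must be aligned to the same length")
    calls = []
    for pos, (ref, obs) in enumerate(zip(reference.upper(), query.upper()), start=1):
        if ref not in UNAMBIGUOUS or obs == ref:
            continue
        if obs in UNAMBIGUOUS:
            calls.append((f"{ref}{pos}", "point mutation"))
        elif obs in IUPAC_TWO_BASE:
            calls.append((f"{ref}{pos}", "double peak"))
    return calls


# Hypothetical toy alignment, not taken from the study.
toy_reference = "ACGTACGT"
toy_query = "ACRTACTT"
print(classify_snps(toy_reference, toy_query))
```

In this toy alignment, position 3 carries the ambiguity code R against a reference G and is labelled a double peak, while position 7 carries T against a reference G and is labelled a point mutation. Tallying such calls per site across all sequences at a locus would give the kind of per-hotspot shares summarized above.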

**Figure 2.** Distribution of single nucleotide polymorphisms (SNPs), classified as single point mutations or double peaks, in *G. duodenalis* assemblage B sequences. (**A**) SNPs at the glutamate dehydrogenase (*gdh*) locus; (**B**) SNPs at the triosephosphate isomerase (*tpi*) locus; (**C**) SNPs at the beta-giardin (*bg*) locus, and overall figures for all assemblage B sequences.
