2.6.2. Comparing *G. duodenalis* Serial Results

Out of the 478 observations with more than one sample, 62% (295/478) always tested negative for *G. duodenalis*, 4% (20/478) always tested positive, and 34% (163/478) were discontinuously positive. As such, among observations ever positive for *G. duodenalis*, 11% (20/183) were continuously positive. Of those, 70% were aged 0–9 years old, 55% were from tribe 1, and 35% were from tribe 5. Overall, 85% did not report hand washing, 30% did not report washing fresh produce, 50% reported open defecation near the households, and 45% reported open defecation in the woods. However, in the multivariable analysis, discontinuous positivity was strongly associated with the number of samples (aOR = 0.30, 95% CI: 0.12–0.66) and tribe 5 (aOR = 4.76, 95% CI: 1.30–18.6), but no further significant association was found (Tables S7 and S8).
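The percentages reported above follow directly from the three serial-result counts. The short sketch below recomputes them as a consistency check; the variable and function names are illustrative and are not taken from the study's analysis code.

```python
# Counts reported in the text for observations with more than one sample.
always_negative = 295
always_positive = 20
discontinuously_positive = 163

total = always_negative + always_positive + discontinuously_positive
ever_positive = always_positive + discontinuously_positive


def pct(numerator: int, denominator: int) -> int:
    """Return a percentage rounded to the nearest whole number."""
    return round(100 * numerator / denominator)


print("total observations:", total)
print("ever positive:", ever_positive)
print("always negative (%):", pct(always_negative, total))
print("always positive (%):", pct(always_positive, total))
print("discontinuously positive (%):", pct(discontinuously_positive, total))
print("continuously positive among ever positive (%):",
      pct(always_positive, ever_positive))
```

The denominator for the last figure is the 183 observations that were positive at least once, not the full 478.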

Finally, comparing observations that were always negative versus always positive, the multivariable model only retained age group and tribe, showing evidence of a protective effect of older age groups (aOR = 0.24, 95% CI: 0.06–0.89 for children 5–9 years old; aOR = 0.17, 95% CI: 0.03–0.68 for children 10–14 years old; aOR = 0.05, 95% CI: 0.01–0.20 for individuals ≥15 years old), and a strong positive association with tribe 5 (aOR = 5.55, 95% CI: 1.61–19.4) (Tables S9 and S10). However, when adjusting for the effect of coinfections, the best-fit model suggested a protective effect of *E. nana* (aOR = 0.25, 95% CI: 0.08–73) and of the oldest age group, ≥15 years old (aOR = 0.07, 95% CI: 0.01–0.28), and a positive effect of tribe 5 (aOR = 5.87, 95% CI: 1.60–22.1) (Table S11).
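Adjusted odds ratios from logistic regression are estimated on the log scale, so a Wald-type 95% confidence interval is approximately symmetric around the point estimate after taking logarithms. The sketch below illustrates this for the tribe 5 estimate from the always-positive versus always-negative comparison. It assumes a Wald interval with a normal quantile of 1.96; the text does not state how the intervals were computed, and the function name is hypothetical.

```python
import math

Z_95 = 1.96  # normal quantile for a two-sided 95% interval (assumed Wald CI)


def log_scale_summary(lower: float, upper: float) -> tuple[float, float]:
    """Return (geometric midpoint, SE of the log odds ratio) implied by a CI."""
    midpoint = math.sqrt(lower * upper)
    se_log_or = (math.log(upper) - math.log(lower)) / (2 * Z_95)
    return midpoint, se_log_or


# Tribe 5, always positive versus always negative: aOR = 5.55, 95% CI 1.61-19.4.
midpoint, se = log_scale_summary(1.61, 19.4)
print(f"geometric midpoint of CI: {midpoint:.2f}")
print(f"implied SE of log(aOR):   {se:.2f}")
```

Under this assumption, the geometric midpoint of the interval should lie close to the reported aOR, with small differences attributable to rounding of the published limits.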

#### **3. Discussion**

This survey presents new insights into the epidemiology of *G. duodenalis* in Amazonian indigenous communities. The main contributions of the study include the demonstration that (i) giardiasis is a common finding (13–22%) in apparently healthy Tapirapé people, mainly affecting children in the age group of 0–9 years old; (ii) assemblage B was responsible for nearly 70% of the mostly asymptomatic infections detected; and (iii) a high degree of genetic heterogeneity was observed within assemblage B (but not assemblage A) sequences, regardless of the molecular marker used.

Several epidemiological studies conducted in endemic areas worldwide have shown that *G. duodenalis* infections do not seem to correlate positively with diarrhoea [23,24], demonstrating that asymptomatic giardiasis is the rule rather than the exception in these settings. This fact would explain why giardiasis is systematically absent from global burden estimations of diarrhoeal disease [25]. This also seems to be the case in the present study, where *G. duodenalis* infections were detected at similar rates in asymptomatic individuals (33.8%) and in individuals presenting with diarrhoea or other gastrointestinal manifestations (35.3%). Taken together, this information supports the hypothesis that some enteric protist species (e.g., *Blastocystis* sp., *Dientamoeba fragilis*, *G. duodenalis*) might in fact be protective against disease [26]. This is an attractive possibility, implying that these agents are indeed acting as pathobionts (that is, microorganisms that normally live as harmless symbionts but under certain circumstances can be pathogenic) forming part of the host eukaryome.

We have shown in our study that *G. duodenalis* infection was strongly related to younger age, to tribe (with tribes 1 and 5 showing a stronger association), and to seasonality. This may be due to external factors associated with indirect transmission pathways of the infection (e.g., source of drinking water, consumption of contaminated fresh produce, swimming in contaminated surface waters, defecation on the open ground near households, and high density of companion or domestic animals) or to an increased risk of reinfection within the tribe from other infected members through direct person-to-person contact. Contact with faecally contaminated water and produce may be more likely in the rainy season. Children <15 years old with giardiasis more frequently reported vomiting, abdominal pain, and the presence of mucus/blood in faeces compared to adults, although the observed differences did not reach statistical significance. Young children with an immature immune system may be at higher risk of infection and probably of more severe disease episodes. Conversely, older adults may have acquired immunity after a previous infection. Indeed, it has been shown that levels of intestinal inflammation caused by *G. duodenalis* infection decrease with subsequent infections [27,28]. This implies that there is acquired protection against the severity of giardiasis but not from reinfection [29]. In this regard, it should be noted that the composition and abundance of the host's microbiota have also been suggested to play an important role in the outcome of the infection [30].

Detection of giardiasis was also strongly dependent on the number of samples taken, even considering that conventional microscopy (a method that is widely known to have limited diagnostic sensitivity) was the screening method for the initial detection of *G. duodenalis* in the present survey. This suggests that possible reinfections or chronic infections with intermittent positivity may be more common than initially anticipated. Reinfection may be more pronounced in the rainy season. In addition, no evident differences were found between individuals continuously positive and those discontinuously positive for *G. duodenalis*. However, a bias among those presenting for sampling cannot be ruled out, although this is unlikely to be a major factor given the lack of symptoms in most cases.
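One simple way to see why positivity rises with the number of samples is an independence model: if a single microscopy examination detects a truly infected person with probability s, then at least one of n examinations is positive with probability 1 − (1 − s)^n. The sketch below is illustrative only. The per-sample sensitivity of 0.5 is a hypothetical value, not an estimate from this study, and independence between serial samples is an assumption that intermittent cyst shedding may violate.

```python
def cumulative_detection(sensitivity: float, n_samples: int) -> float:
    """Probability that at least one of n independent samples tests positive."""
    if not 0.0 <= sensitivity <= 1.0:
        raise ValueError("sensitivity must be between 0 and 1")
    if n_samples < 0:
        raise ValueError("n_samples must be non-negative")
    return 1.0 - (1.0 - sensitivity) ** n_samples


HYPOTHETICAL_SENSITIVITY = 0.5  # assumed for illustration, not from the study

for n in range(1, 5):
    p = cumulative_detection(HYPOTHETICAL_SENSITIVITY, n)
    print(f"{n} sample(s): probability of at least one positive = {p:.4f}")
```

Under these assumptions, the chance of detecting an existing infection grows quickly with repeated sampling, which is one of several explanations (alongside reinfection and response bias) compatible with the pattern described above.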

Regarding coinfections, the presence of *G. duodenalis* was not associated with any other enteric parasite species, except possibly *E. nana*. These results may be biased by the relatively small number of positive samples detected for certain pathogens and should, therefore, be interpreted with caution. Similarly, a counter-intuitive positive association between *G. duodenalis* infection and washing fresh produce was found. This result may be the consequence of confounding by other variables not considered here, such as the manipulation of fresh produce or the use of contaminated washing water. The latter possibility would support the relevance of waterborne transmission for human giardiasis.

Molecular sequence analyses of the three loci used here for genotyping purposes also revealed interesting data. There were no differences in age between individuals infected by either assemblage A or assemblage B of *G. duodenalis*. Regarding age-related patterns in the distribution of *G. duodenalis* assemblages, our results are in contrast with those previously obtained in surveys targeting clinical populations. For instance, children have been shown to be more commonly infected by assemblage B (83%, 44/53) than adults (52%, 22/42) in patients of all age groups in Spain [31]. Moreover, in that country, assemblage B was significantly more prevalent than assemblage A in asymptomatic outpatient children, but not in individuals of older age [32].

Remarkably, no association between the occurrence of diarrhoea (or any other gastrointestinal manifestation) and the *G. duodenalis* assemblage involved in the infection was found in the investigated population. This result corroborates that observed in children under 5 years of age (*n* = 222) recruited under the Global Enteric Multicentre Study (GEMS) in Mozambique [33]. However, it should be noted that other surveys have shown different, even contradictory, results. For instance, assemblage A was more prevalent than assemblage B in Bangladeshi people (*n* = 343) [34], in Turkish clinical patients (*n* = 44) [35], and in Spanish outpatient children (*n* = 43) [32]. The opposite trend was reported in asymptomatic infected individuals (*n* = 18) in the Netherlands [36].

Genotyping data generated here demonstrated that assemblage B was responsible for three out of four *G. duodenalis* infections in the Tapirapé people, a proportion similar to that (78%) described in paediatric populations in the Amazonas State [37]. Of note, assemblage A tends to be the predominant *G. duodenalis* genetic variant circulating in humans in Brazil (Table S1). These facts may be indicative of differences in sources of infection, transmission pathways, or even geographical segregation patterns of the parasite in the country. The absence of the non-human, host-specific assemblages C–F seems to suggest that companion, production, and free-living animal species are not significant contributors to giardiasis in the surveyed population. This is in spite of the fact that swine and poultry were reared in all seven tribes, and that domestic dog and cat densities were also high. In addition, cattle (but not sheep) farming was frequent in their proximity. Taken together, these data indicate that human giardiasis is mainly of anthropic nature among the Tapirapé people. The extent and accuracy of this statement should be corroborated in future molecular epidemiological studies including animal and environmental (water) samples.

This study also confirms the high genetic variability within *G. duodenalis* assemblage B (but not assemblage A) frequently reported in similar molecular epidemiological surveys conducted in endemic areas globally [38,39], including Brazil [40,41]. This finding was particularly evident at the *gdh* and *tpi* loci, for which most of the generated BIII (78–87%), BIV (100%), and BIII/BIV (90–92%) sequences corresponded to distinct genotypes of the parasite. Sequences unmistakably assigned to BIII and BIV at the *gdh*/*tpi* loci tended to vary only in one to six positions (hotspots) either as mutations or ambiguous (double peak) sites. In these sets of hotspots, the proportion of sites involving double peaks in BIII sequences varied from 38% at the *gdh* locus to 18% at the *tpi* locus. Interestingly, these percentages increased in both cases to 55–68% in ambiguous BIII/BIV sequences, explaining why these isolates were difficult to allocate to a given sub-assemblage. Two independent mechanisms have been proposed to explain the presence of ambiguous (double peak) positions. The first one involves the occurrence of true mixed infections (e.g., BIII + BIV) and would fit well with an epidemiological scenario characterised by high infection and reinfection rates, such as the one described in the present study. The second one would be associated with the occurrence of genetic recombination. Evidence for the latter possibility comes from independent investigations demonstrating low levels of allelic sequence heterozygosity (implying a genetic homogenisation mechanism) within assemblage A [42] and, to a lesser extent, within assemblage B [43]. Additional evidence of genetic recombination events has been demonstrated within assemblage B in single (trophozoite and cyst) cells [44] and within sub-assemblages BIII and BIV at the genetic population level [45].

The results obtained in the present study may be biased by certain design and methodological constraints. For instance, the initial screening of *G. duodenalis* was based on conventional microscopy, so the true prevalence of the infection is likely to be underestimated. In addition, there may be a response bias, as people may be more or less inclined to return to the study depending on whether they had a negative or positive test result. Interestingly, the positivity rate increased with the number of tests performed, suggesting that over time people were likely to have had a giardiasis episode, that they may have had a false-negative result at microscopy examination, or that there was an inherent response bias in that people who were likely to be positive would return for testing. Limitations associated with the main dataset may arise from the combination of period-specific data, although most of the independent variables considered (e.g., demographics, access to safe drinking water, and sanitary conditions) were not expected to change over time. As our analyses used the first negative test result, we could not further explore the effect of seasonality in the multivariable analysis. However, we have already shown in the descriptive data that seasonality is associated with infections and repeated infections. The lack of association between *G. duodenalis* genetic variants and the occurrence of clinical symptoms may be influenced by the fact that other diarrhoea-causing agents (including viral and bacterial pathogens) were not assessed. In addition, suspected mixed infections were not further investigated by cloning of PCR amplicons or next-generation sequencing, highly sensitive methods able to detect genetic variants of the parasite that are underrepresented in the population pool and that are otherwise undetectable using conventional PCR methods and Sanger sequencing. Finally, the typing scheme used in the present study may lack enough phylogenetic resolution to correctly differentiate between sub-assemblage BIII and BIV sequences. This issue has been highlighted in recent molecular studies for assemblage B and assemblage A sequences [46,47]. This important point emphasises the need to identify new markers and to develop novel methods for MLST purposes.

#### **4. Materials and Methods**
