## *4.3. Molecular Study*

#### 4.3.1. DNA Extraction and Purification

Molecular analyses were performed only on stool samples that were *Cryptosporidium*-positive by immunoassay. Genomic DNA was isolated from approximately 200 mg of faecal material using the QIAamp DNA Stool Mini Kit (Qiagen, Hilden, Germany) according to the manufacturer's instructions, except that samples mixed with ASL lysis buffer were incubated for 10 min at 95 °C. The resulting eluates (200 μL in PCR-grade water) were stored at −20 °C and shipped to the Spanish National Centre for Microbiology at Majadahonda (Spain) for downstream molecular analysis.

#### 4.3.2. Molecular Detection and Characterisation of *Cryptosporidium* spp.

As this study was based on samples that were *Cryptosporidium*-positive by ELISA, the following diagnostic and genotyping algorithm was implemented to optimise time and resources. A nested PCR protocol was initially used to amplify an 870-bp fragment of the *gp60* gene of the parasite as previously described [59]. This approach allowed for the differential diagnosis of *C. hominis* and *C. parvum* (the two *Cryptosporidium* species most prevalent in humans), and for the identification of subtype families within these two species. The outer primers were AL-3531\_F (5'-ATAGTCTCCGCTGTATTC-3') and AL-3535\_R (5'-GGAAGGAACGATGTATCT-3'), and the inner primers were AL-3532\_F (5'-TCCGCTGTATTCTCAGCC-3') and AL-3534\_R (5'-GCAGAGGAACCAGCATC-3'). Reaction mixtures (50 μL) contained 200 nM of each primer and 2–3 μL of template DNA. Cycling conditions included one step of 94 °C for 5 min, followed by 35 cycles of amplification (denaturation at 94 °C for 45 s, annealing at 59 °C for 45 s, and elongation at 72 °C for 1 min), concluding with a final extension at 72 °C for 10 min. The same conditions were used in the secondary reaction, except that the annealing temperature was 50 °C.

Samples with a negative result by *gp60*-PCR were re-analysed by a nested PCR to amplify a 587-bp fragment of the *ssu* rRNA gene of the parasite [60]. This approach allowed for the detection of *Cryptosporidium* infections with low parasite burdens and for the identification of *Cryptosporidium* species other than *C. hominis* or *C. parvum*. The outer primers were CR-P1 (5'-CAGGGAGGTAGTGACAAGAA-3') and CR-P2 (5'-TCAGCCTTGCGACCATACTC-3'), and the inner primers were CR-P3 (5'-ATTGGAGGGCAAGTCTGGTG-3') and CPB-DIAGR (5'-TAAGGTGCTGAAGGAGTAAGG-3'). In all cases, reaction mixtures (50 μL) contained 300 nM of each primer and 3 μL of template DNA. Cycling conditions consisted of one step of 94 °C for 5 min, followed by 35 cycles of amplification (denaturation at 94 °C for 40 s, annealing at 50 °C for 40 s, and elongation at 72 °C for 1 min), concluding with a final extension at 72 °C for 10 min.

Samples that were identified by *ssu*-PCR (and Sanger sequencing, see below) as *C. meleagridis* were re-analysed at the *gp60* locus by a nested PCR specifically developed for this *Cryptosporidium* species [21]. This protocol amplifies a 900-bp fragment of the *gp60* gene. The outer primers were CRSout115F (5'-GATGAGATTGTCGCTCGTTATC-3') and CRSout1328R (5'-AACCTGCGGAACCTGTG-3'), and the inner primers were ATGFmod (5'-GAGATTGTCGCTCGTTATCG-3') and GATR2 (5'-GATTGCAAAAACGGAAGG-3'). Reaction mixtures (50 μL) contained 250 nM of each primer and 2–3 μL of template DNA. Cycling conditions included one step of 95 °C for 4 min, followed by 35 cycles of amplification (denaturation at 95 °C for 30 s, annealing at 60 °C for 30 s, and elongation at 72 °C for 1 min), concluding with a final extension at 72 °C for 7 min. The same conditions were used in the secondary reaction, except that the annealing temperature was 58 °C.
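The sequential testing algorithm described above can be summarised as a short decision routine. The following Python sketch is only an illustration of the workflow logic: the function name `assays_for_sample` and the assay labels are hypothetical and do not correspond to any software used in the study.

```python
from typing import List, Optional


def assays_for_sample(gp60_positive: bool,
                      ssu_species: Optional[str] = None) -> List[str]:
    """List the nested PCR assays a sample passes through (illustrative only).

    gp60_positive: result of the initial gp60 nested PCR.
    ssu_species:   species assigned by ssu-PCR and Sanger sequencing,
                   or None if the ssu-PCR was negative or not performed.
    """
    # Every ELISA-positive sample starts with the gp60 nested PCR.
    assays = ["gp60 nested PCR (870 bp)"]
    if gp60_positive:
        return assays

    # gp60-negative samples are re-analysed at the ssu rRNA locus.
    assays.append("ssu rRNA nested PCR (587 bp)")

    # Only C. meleagridis isolates go on to the species-specific gp60 PCR.
    if ssu_species == "C. meleagridis":
        assays.append("C. meleagridis-specific gp60 nested PCR (900 bp)")
    return assays


if __name__ == "__main__":
    print(assays_for_sample(True))
    print(assays_for_sample(False, "C. meleagridis"))
```

Under this scheme, the *ssu* rRNA assay is run only on *gp60*-negative samples, which is how the algorithm saves time and reagents.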

The nested PCR protocols described above were conducted on a 2720 Thermal Cycler (Applied Biosystems, CA, USA). Reaction mixes always included 2.5 units of MyTAQ™ DNA polymerase (Bioline GmbH, Luckenwalde, Germany), and 5× MyTAQ™ Reaction Buffer containing 5 mM dNTPs and 15 mM MgCl₂. Laboratory-confirmed positive and negative DNA samples of human origin were routinely used as controls and included in each round of PCR. PCR amplicons were visualised on 2% D5 agarose gels (Conda, Madrid, Spain) stained with Pronasafe nucleic acid staining solution (Conda) and recorded using the MiniBIS Pro system controlled by GelCapture version 7.5.2 software (DNR Bio-Imaging Systems, Jerusalem, Israel). A 100 bp DNA ladder (Boehringer Mannheim GmbH, Baden-Württemberg, Germany) was used for the sizing of obtained amplicons. PCR-positive products were directly sequenced in both directions using the internal primer sets described above. DNA sequencing was conducted by capillary electrophoresis using the BigDye® Terminator chemistry (Applied Biosystems) on an ABI PRISM 3130 automated DNA sequencer at the Core Genomic Facility of the Spanish National Centre for Microbiology, Majadahonda (Spain). Sequencing reactions were repeated on samples for which genotyping was unsuccessful in the first instance.

The *Cryptosporidium* sequences obtained in this study have been deposited in GenBank under accession numbers MW480826–MW480846 (*gp60* locus) and MW487256–MW487266 (*ssu* rRNA locus).

## *4.4. Data Analysis*

#### 4.4.1. Epidemiological Analysis

PCR data were entered in a Microsoft Excel spreadsheet (Redmond, WA, USA) and then checked for accuracy and consistency by independent laboratory personnel. Clinical and demographic data were extracted from the original GEMS dataset. *Cryptosporidium* spp., study groups (diarrhoeal vs. non-diarrhoeal, MSD vs. LSD), age groups, and study years were treated as categorical variables. Differences in frequencies were compared using the Chi-squared test or Fisher's exact test as appropriate. A *p*-value < 0.05 was considered statistically significant. Missing values were excluded from the analyses; thus, denominators for some comparisons may differ. Data analyses were performed in Stata version 14 (StataCorp LP, College Station, TX, USA).
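To illustrate the two frequency tests named above, the following Python sketch implements them for a 2 × 2 table using only the standard library. It is not the Stata code used in the study. The rule for choosing between the tests (Fisher's exact test when any expected cell count is below 5) is a common convention assumed here, since the text says only "as appropriate". All function names and the example counts are hypothetical.

```python
from math import comb, erfc, sqrt


def chi2_2x2(a, b, c, d):
    """Pearson chi-squared test (no continuity correction) for [[a, b], [c, d]].

    Returns (statistic, p_value). With 1 degree of freedom the survival
    function of the chi-squared distribution equals erfc(sqrt(x / 2)).
    """
    n = a + b + c + d
    stat = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    return stat, erfc(sqrt(stat / 2))


def fisher_exact_2x2(a, b, c, d):
    """Two-sided Fisher's exact test: sums the hypergeometric probabilities of
    all tables (with the observed margins) no more likely than the observed one."""
    row1, row2, col1 = a + b, c + d, a + c
    total = comb(row1 + row2, col1)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    probs = {x: comb(row1, x) * comb(row2, col1 - x) / total
             for x in range(lo, hi + 1)}
    p_obs = probs[a]
    # Small relative tolerance guards against floating-point ties.
    return min(1.0, sum(p for p in probs.values() if p <= p_obs * (1 + 1e-7)))


def compare_frequencies(a, b, c, d):
    """Pick the test by the expected-count rule (ASSUMED convention: < 5 -> Fisher)."""
    n = a + b + c + d
    if n == 0:
        raise ValueError("empty table")
    expected = [r * k / n for r in (a + b, c + d) for k in (a + c, b + d)]
    if min(expected) < 5:
        return "fisher", fisher_exact_2x2(a, b, c, d)
    return "chi2", chi2_2x2(a, b, c, d)[1]


if __name__ == "__main__":
    # Hypothetical counts, not study data: species X vs. other, by study group.
    test, p = compare_frequencies(3, 1, 1, 3)
    print(test, round(p, 4), "significant" if p < 0.05 else "not significant")
```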

#### 4.4.2. Sequence and Phylogenetic Analysis

Raw sequencing data in both forward and reverse directions were visually inspected using the Chromas Lite version 2.1 sequence analysis program [61]. Special attention was paid to the detection and recording of ambiguous (double peak) positions. The BLAST tool was used to search for identity among sequences deposited in the National Center for Biotechnology Information (NCBI) public repository database [62]. Multiple sequence alignment analyses with appropriate reference sequences were conducted using MEGA 6 to identify *Cryptosporidium* species and to annotate the presence of single nucleotide polymorphisms (SNPs) [63]. *Cryptosporidium hominis* and *C. parvum* subtypes were assigned according to the number of TCA (A), TCG (G), ACATCA/ACATCG (R), and TCTT (T) fragment repeats in the microsatellite region of the *gp60* gene, in accordance with the established nomenclature, as previously described [14].
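The repeat-counting step of the subtype nomenclature can be sketched in Python as follows. This is a deliberately simplified illustration, not the procedure used in the study. It takes the longest uninterrupted run of TCA/TCG trinucleotides as the microsatellite region, counts ACATCA/ACATCG copies only when they immediately follow that run, and ignores the T repeat. The subtype family prefix (e.g., Ib or IIc) is assigned by alignment with reference sequences and is outside the scope of this sketch. The function name `gp60_repeat_code` and the test sequence are hypothetical.

```python
import re

# Trinucleotide repeats of the gp60 microsatellite: TCA -> "A", TCG -> "G".
_SERINE_RUN = re.compile(r"(?:TCA|TCG)+")
# Hexanucleotide repeat ACATCA/ACATCG -> "R".
_R_RUN = re.compile(r"(?:ACATC[AG])+")


def gp60_repeat_code(seq: str) -> str:
    """Return the repeat part of a gp60 subtype name, e.g. 'A9G3R1'.

    Simplifying assumptions: the microsatellite is the longest contiguous
    TCA/TCG run in the sequence, and R repeats are counted only when they
    start directly after that run.
    """
    seq = seq.upper()
    best = max(_SERINE_RUN.finditer(seq),
               key=lambda m: len(m.group(0)), default=None)
    if best is None:
        return ""

    run = best.group(0)
    codons = [run[i:i + 3] for i in range(0, len(run), 3)]
    n_a, n_g = codons.count("TCA"), codons.count("TCG")

    # Look for R repeats anchored at the end of the trinucleotide run.
    r_match = _R_RUN.match(seq, best.end())
    n_r = len(r_match.group(0)) // 6 if r_match else 0

    code = f"A{n_a}"
    if n_g:
        code += f"G{n_g}"
    if n_r:
        code += f"R{n_r}"
    return code


if __name__ == "__main__":
    # Hypothetical sequence fragment built for illustration only.
    fragment = "GGC" + "TCA" * 9 + "TCG" * 3 + "ACATCA" + "GGTAAG"
    print(gp60_repeat_code(fragment))
```

In practice, the counts produced by such a routine would still need to be checked against the chromatograms, because double peaks within the repeat region can shift the apparent number of repeats.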

The evolutionary relationships among the identified *Cryptosporidium* species and subtypes were inferred by a phylogenetic analysis using the neighbor-joining method in MEGA 6 [64]. Only sequences with unambiguous (no double peak) positions were used in the analyses. The evolutionary distances were computed using the Kimura 2-parameter method and modelled with a gamma distribution. The reliability of the phylogenetic analyses at each branch node was estimated by the bootstrap method using 1000 replications. Representative sequences of different *Cryptosporidium* species and subtypes were retrieved from the NCBI database and included in the phylogenetic analysis for reference and comparative purposes.
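For reference, the Kimura 2-parameter distance between two aligned sequences can be computed as in the following Python sketch. With P and Q denoting the proportions of transitional and transversional differences, the standard distance is d = −½ ln(1 − 2P − Q) − ¼ ln(1 − 2Q), and the gamma-corrected form with shape parameter a is d = (a/2)[(1 − 2P − Q)^(−1/a) + ½(1 − 2Q)^(−1/a) − 3/2]. This is an independent illustration and not the MEGA 6 implementation. The gamma shape value is not reported in the text, so the `gamma_shape` argument is an assumption left to the user. Skipping gapped or ambiguous sites pairwise is also a choice made here.

```python
from math import log

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def k2p_distance(seq1: str, seq2: str, gamma_shape=None) -> float:
    """Kimura 2-parameter distance between two aligned sequences.

    gamma_shape: optional shape parameter 'a' for gamma-distributed rate
                 variation among sites (hypothetical value; not given in the text).
    Sites with gaps or ambiguous bases in either sequence are skipped.
    """
    if len(seq1) != len(seq2):
        raise ValueError("sequences must be aligned to the same length")

    sites = transitions = transversions = 0
    for x, y in zip(seq1.upper(), seq2.upper()):
        if x not in "ACGT" or y not in "ACGT":
            continue
        sites += 1
        if x == y:
            continue
        pair = {x, y}
        if pair <= PURINES or pair <= PYRIMIDINES:
            transitions += 1      # A<->G or C<->T
        else:
            transversions += 1    # purine <-> pyrimidine

    if sites == 0:
        raise ValueError("no comparable sites")

    p, q = transitions / sites, transversions / sites
    w1, w2 = 1 - 2 * p - q, 1 - 2 * q
    if w1 <= 0 or w2 <= 0:
        raise ValueError("sequences too divergent for the K2P model")

    if gamma_shape is None:
        return -0.5 * log(w1) - 0.25 * log(w2)
    a = gamma_shape
    return (a / 2) * (w1 ** (-1 / a) + 0.5 * w2 ** (-1 / a) - 1.5)


if __name__ == "__main__":
    # Toy alignment with a single transition (C<->T) in 10 sites.
    print(k2p_distance("ACGTACGTAC", "ACGTACGTAT"))
    print(k2p_distance("ACGTACGTAC", "ACGTACGTAT", gamma_shape=1.0))
```

As the shape parameter grows, the gamma-corrected value converges to the uncorrected distance. Small shape values (strong rate heterogeneity) give larger distances for the same observed differences.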

## **5. Conclusions**

This study provides the most comprehensive description of the molecular diversity of the enteric protozoan parasite *Cryptosporidium* spp. in Mozambique to date. Our findings revealed the circulation of at least three *Cryptosporidium* species in young Mozambican children primarily affected by diarrhoea. High intra-species genetic variability was observed within *C. hominis* (subtype families Ia, Ib, Id, Ie, and If) and *C. parvum* (subtype families IIb, IIc, IIe, and IIi), but not within *C. meleagridis* (subtype family IIIb). No associations between *Cryptosporidium* species/genetic variants and age-related patterns could be demonstrated. The predominance of mainly anthroponotically transmitted *C. hominis* and *C. parvum* IIc strongly suggests that most of the *Cryptosporidium* infections detected in the surveyed paediatric population are of human origin. However, a significant proportion of the infections were caused by host-adapted *Cryptosporidium* species (e.g., *C. meleagridis*) or genetic variants (e.g., *C. parvum* "bovine genotype"), suggesting the occurrence of zoonotic transmission events at an unknown rate. Further molecular epidemiological studies are warranted to assess the actual contribution of livestock, poultry, and other domestic animal species to the environmental burden of *Cryptosporidium* oocysts (including in soils and in surface waters intended for human consumption) in Mozambique and other endemic areas of Africa.

**Supplementary Materials:** The following are available online at https://www.mdpi.com/article/10.3390/pathogens10040452/s1, Figure S1: Seasonal distribution and temporal clustering of *Cryptosporidium* species in children under 5 years of age, with and without diarrhoea, recruited during the Global Enteric Multicenter Study at the Manhiça district (Maputo, southern Mozambique), 2007–2012. Figure S2: Seasonal distribution and temporal clustering of *Cryptosporidium* subtype families in children under 5 years of age, with and without diarrhoea, recruited during the Global Enteric Multicenter Study at the Manhiça district (Maputo, southern Mozambique), 2007–2012. Table S1: PCR and sequencing data. Table S2: Diversity and frequency of *Cryptosporidium* family subtypes within *C. hominis* (subtype family I), *C. parvum* (subtype family II) and *C. meleagridis* (subtype family III) in asymptomatic (non-cases) children under 5 years of age according to severity of clinical manifestations, age group, and HIV coinfection. Children were recruited during the Global Enteric Multicenter Study at the Manhiça district (Maputo, southern Mozambique), 2007–2012. Figures between brackets represent relative frequencies.

**Author Contributions:** Conceptualisation, A.M.J., K.K., M.M.L., P.L.A., D.C. and I.M.; methodology, K.K., M.M.L., P.L.A., D.C. and I.M.; software, A.M.J. and P.C.K.; validation, D.C. and I.M.; formal analysis, A.M.J., P.C.K., M.G., D.C. and I.M.; investigation, A.M.J., P.C.K., M.G., T.N., S.M., A.C. and Q.B.; resources, K.K., M.M.L., P.L.A., D.C. and I.M.; data curation, D.C. and I.M.; writing—original draft preparation, A.M.J., D.C. and I.M.; writing—review and editing, A.M.J., P.C.K., M.G., T.N., Q.B., A.C., K.K., M.M.L., P.L.A., D.C. and I.M.; visualisation, A.M.J., K.K., M.M.L., P.L.A., D.C. and I.M.; supervision, D.C. and I.M.; project administration, D.C. and I.M.; funding acquisition, K.K., M.M.L., P.L.A., D.C. and I.M. All authors have read and agreed to the published version of the manuscript.

**Funding:** This research was funded by the Bill and Melinda Gates Foundation through the Center for Vaccine Development at the University of Maryland School of Medicine, which coordinated GEMS, grant numbers 38874 (GEMS) and OPP1033572 (GEMS1A). Additional funding was obtained from the Health Institute Carlos III (ISCIII), Ministry of Economy and Competitiveness (Spain), grant number PI16CIII/00024, from the Fundo Nacional de Investigação, Ministry of Science and Technology (Mozambique), grant number 245-INV, and from the USAID Country Office of Mozambique, grant number AID-656-F-16-00002.

**Institutional Review Board Statement:** The study was conducted according to the guidelines of the Declaration of Helsinki and approved by the Review Board of the Mozambican National Bioethics Committee for Health, Mozambique (Ref. 11/CNBS/07), the Ethics Committee of the Hospital Clinic of Barcelona, Spain (Ref. 2006/3260) and the Institutional Review Board for Human Subject Research at University of Maryland Baltimore, USA.

**Informed Consent Statement:** Informed consent was obtained from all subjects involved in the study.

**Data Availability Statement:** All relevant data are within the paper and its Supplementary Materials.

**Acknowledgments:** We thank the children and their caretakers who participated in the study, as well as the clinical, field, and laboratory staff who worked tirelessly to ensure that data collection and laboratory testing were performed according to the standardized protocol. We also thank all the local government authorities (district Administration and Health Directorate) and all community leaders for supporting and collaborating in the study. CISM is supported by the Government of Mozambique and the Spanish Agency for International Development Cooperation (AECID). ISGlobal receives support from the Spanish Ministry of Science and Innovation through the "Centro de Excelencia Severo Ochoa 2019–2023" Program (CEX2018-000806-S), and support from the Generalitat de Catalunya through the CERCA Program.

**Conflicts of Interest:** The authors declare no conflict of interest. The funders had no role in the design of the study; in the collection, analyses, or interpretation of data; in the writing of the manuscript, or in the decision to publish the results.
