#### *3.3. Origin of Infection*

Of the 398 patients, 250 (63%) were infected in Sweden, while 138 (35%) had acquired the infection abroad during the two weeks preceding disease onset. The travel-related cases involved 55 different countries, representing all continents except Oceania and Antarctica. Ten cases had missing or uncertain data concerning origin of infection (Table 2). *Cryptosporidium parvum* was the most common species both in cases infected in Sweden (84%; 211/250) and in cases infected abroad (57%; 78/138), whereas *C. hominis* was identified in 3% (8/250) of domestic cases and in 30% (41/138) of the cases infected abroad.
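The proportions quoted above can be re-derived from the raw counts. The following is a minimal sketch for checking the arithmetic; the variable and function names are illustrative and not part of the study.

```python
# Counts are taken from the paragraph above; names are illustrative only.
total = 398
origin = {"Sweden": 250, "abroad": 138, "unknown": 10}

# The three origin categories should account for every patient.
assert sum(origin.values()) == total

def pct(part, whole):
    """Return part/whole as a percentage rounded to the nearest integer."""
    return round(100 * part / whole)

shares = {
    "infected in Sweden": pct(250, total),
    "infected abroad": pct(138, total),
    "C. parvum, domestic": pct(211, 250),
    "C. parvum, abroad": pct(78, 138),
    "C. hominis, domestic": pct(8, 250),
    "C. hominis, abroad": pct(41, 138),
}

for label, value in shares.items():
    print(f"{label}: {value}%")
```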

*Cryptosporidium* chipmunk genotype I (*n* = 5), *C. ubiquitum* (*n* = 2), and *C. ditrichi* (*n* = 1) were found only in cases infected in Sweden, while *C. meleagridis* (*n* = 8), as well as *C. suis*, *C. viatorum*, and *Cryptosporidium* horse genotype (*n* = 1 each), were exclusively found in travel-related cases. *Cryptosporidium felis*, *C. erinacei*, and *C. cuniculus* were diagnosed both in domestic and travel-related cases (Table 2).

#### *3.4. Molecular Characterization of Cryptosporidium parvum*

Subtyping using the *gp60* protocol was successful for 99% (296/300) of the *C. parvum*-positive samples, including the sample with mixed *C. hominis* and *C. parvum* (Table 3). In total, nine subtype families (IIa, IIc, IId, IIe, IIl, IIn, IIr, IIs, and IIt) and 42 subtypes were identified (Table 3). The most common subtype family was IIa, which was observed in 164 patients. It was represented by 22 different subtypes, of which IIaA16G1R1 (*n* = 42), IIaA15G2R1 (*n* = 31), and IIaA17G1R1 (*n* = 20) were the most common. The remaining 19 IIa subtypes were found in 1–13 patients each. Subtype family IId (*n* = 118 patients) was the second-most common subtype family, represented by 11 different subtypes, of which IIdA22G1 (*n* = 37) and IIdA20G1 (*n* = 24) were the most frequent. The remaining nine IId subtypes were found in 1–16 patients each. Three patients were co-infected with two *C. parvum* subtypes: one with IIaA14G2R1 and IIaA15G2R1, and two with IIaA15G2R1 and IIdA19G1 (Table 3). Subtypes from subtype families IIc, IIe, IIl, IIn, IIr, IIs, and IIt were found in one or two patients each (Table 3).
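The subtype names used here follow the usual *gp60* convention, in which the subtype family is followed by the number of TCA repeats (A), the number of TCG repeats (G), and, where present, a family-specific trailing repeat (R). As a rough illustration, such names can be split into their components as follows. The function name and the regular expression are assumptions made for this sketch; it handles only simple names of the form seen in this section, and designations such as XIIa-1 are deliberately rejected.

```python
import re

# A = TCA repeats, G = TCG repeats, R = family-specific trailing repeat.
# Anything after these fields (e.g. "b" or "r1") is kept as a free suffix.
GP60_NAME = re.compile(
    r"(?P<family>[IVX]+[a-z])"   # e.g. IIa, IId, Ib
    r"A(?P<tca>\d+)"             # number of TCA repeats
    r"(?:G(?P<tcg>\d+))?"        # optional number of TCG repeats
    r"(?:R(?P<r>\d+))?"          # optional trailing repeat count
    r"(?P<suffix>.*)"            # variant letters, if any
)

def parse_gp60(name):
    """Split a simple gp60 subtype name into its components."""
    m = GP60_NAME.fullmatch(name)
    if m is None:
        raise ValueError(f"unsupported subtype name: {name!r}")
    return {
        "family": m["family"],
        "tca": int(m["tca"]),
        "tcg": int(m["tcg"] or 0),
        "r": int(m["r"] or 0),
        "suffix": m["suffix"],
    }

for name in ["IIaA16G1R1", "IIdA22G1", "IbA10G2", "IaA18R3"]:
    print(name, parse_gp60(name))
```

Grouping parsed names by their `family` field gives the per-family counts reported in the text, provided the input list contains one entry per patient.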

**Table 3.** *Cryptosporidium parvum gp60* sequences generated in this study.




<sup>1</sup> Sequences submitted to GenBank during this study are indicated in boldface type. Accession numbers in non-boldface type refer to reference sequences from GenBank. <sup>2</sup> Four samples from an outbreak [27]; <sup>3</sup> two samples from an outbreak [27].

Among the IIa and IId subtypes identified in the present study, several sequence variations were observed, either in the conserved non-repetitive part or in the repetitive area. In Tables 3 and 4, all subtypes and subtype variants are referred to via a corresponding GenBank accession number with 100% identity. For instance, in subtype IIaA16G1R1, the most common variant was IIaA16G1R1b (EU647727), which was found in 39 patients and was the most common subtype in patients infected in Sweden (*n* = 30). Subtype IIaA16G1R1b\_variant (KT895368), wherein the TCG repeat is located at a different position in the repeat region compared to IIaA16G1R1b, was found in two patients. This type of sequence variation has been described by Alsmark et al. [14], and it was found in four IIa subtypes, but not in any other subtype family (Table 3). Subtype variations were also seen in IId subtypes, exemplified by IIdA22G1, where three variants differing by a single nucleotide polymorphism (SNP) in the conserved part were identified: AY166806 (*n* = 16), FJ917374 (IIdA22G1c; *n* = 19), and KR349103 (*n* = 2). Another type of sequence variation was found in IIaA14G1R1r1, in which the TCA repeats are interrupted by an ACA. This subtype, which was found in six patients, was named according to the nomenclature proposed by Amer et al. [28], who described a similar sequence variation, IIaA14G1R1r1b, in a sample from a calf.

Three new *C. parvum gp60* subtype families, IIr, IIs and IIt, were identified in one patient each. The sequences showed 100% identity with *C. parvum* at the SSU rRNA gene: IIr and IIt to AF164102 (gene copy A), and IIs to LC270281 (gene copy B). At the *gp60* locus, subtype IItA13R1 (KU852718) was identified in a patient (Swec402) who had visited Tanzania prior to infection. The closest matches in GenBank were 93% to *C. parvum* subtype family IIb (AY166805) and 90% to IIp (MK956000). The other two patients (Swec447 and Swec434) were both infected in Sweden, the first with *gp60* subtype IIsA14G1 (KU852720), where the closest match was 95% identity to subtype family IIe (KU852716), and the second with subtype IIrA5G1 (KU852719), which showed 92% similarity to the *C. hominis* subtype family Ie (AY738184). This observation prompted us to investigate two additional genes, actin and *hsp70*. At the actin locus, the isolate was 100% identical to *C. parvum* sequence MH542350. At the *hsp70* locus, where 1911 bp were successfully sequenced, it differed by eight SNPs in the conserved part from the IOWA *C. parvum* strain (XM625373). These SNPs, however, did not result in any amino acid changes.


**Table 4.** *Cryptosporidium hominis gp60* sequences generated in this study.

<sup>1</sup> Sequences submitted to GenBank during this study are indicated in boldface type. Accession numbers in non-boldface type refer to reference sequences from GenBank. <sup>2</sup> Cases described in Lebbad et al., 2018 [29].

#### *3.5. Molecular Characterization of Cryptosporidium hominis*

*Gp60* subtyping was successful for 49 out of 50 *C. hominis* isolates (including the sample with mixed species) (Table 4). Seven subtype families, Ia, Ib, Id, Ie, If, Ii, and Ik, and 17 subtypes were identified. The most common subtype was IbA10G2 (*n* = 15); 13 of these cases had contracted the infection abroad, and the remaining 2 were domestic infections. The second-most common subtype was IbA9G3 (*n* = 9), detected in patients who were infected in nine different countries, mainly in Africa (Table 4). In addition to the two cases with IbA10G2, another three subtypes were found in patients infected with *C. hominis* in Sweden: IaA18R3 (*n* = 2), IfA12G1R5 (*n* = 1), and IkA18G1 (*n* = 2). The cases with the uncommon subtype families Ii and Ik have been described in a separate article [29].

#### *3.6. Molecular Characterization of C. hominis/C. parvum Mixed Infection*

One patient, a 4-year-old girl from Syria, had a mixed *C. hominis* and *C. parvum* infection with subtypes IaA18R3 and IIaA16R1.

#### *3.7. Outbreaks and Family Clusters*

Four outbreaks, all caused by *C. parvum*, were identified during the study period (Table 5). The first outbreak occurred among veterinary students in 2013, and two different subtypes were found, IIaA16G1R1 and IIdA24G1 [27]. Outbreaks 2, 3 and 4 were all foodborne and involved subtypes IIaA16G2R1, IIaA17R1, and IIdA17R1, respectively. Three small family clusters were also detected; two related to traveling, and one where a contaminated water well was the suspected vehicle (Table 5).

**Table 5.** Outbreaks and family clusters involving *C. parvum* and *C. hominis* identified in the study period.


<sup>1</sup> Outbreak described in Kinross et al., 2015 [27]; <sup>2</sup> described in Lebbad et al., 2018 [29].

#### *3.8. Molecular Characterization of Additional Species*

Of 30 isolates from additional species, 28 were successfully sequenced at the SSU rRNA locus. Subtyping with the *gp60* protocol was successful for 29 isolates; the only exception was *C. ditrichi*, for which no suitable primers were available (Table 6).


**Table 6.** *Cryptosporidium gp60* sequences from non-*hominis* and non-*parvum Cryptosporidium* species generated in this study.

<sup>1</sup> Sequences submitted to GenBank during this study are indicated in boldface type. Accession numbers in non-boldface type refer to reference sequences from GenBank. <sup>2</sup> Isolates included in Stensvold et al., 2014 [23]; <sup>3</sup> isolate included in Stensvold et al., 2015 [22]; <sup>4</sup> isolates included in Rojas et al., 2020 [21]; <sup>5</sup> case described in Beser et al., 2015 [30]; <sup>6</sup> case described in Beser et al., 2020 [31].

A recently adopted child from Lithuania was infected with *C. suis*. A *gp60* sequence was obtained using the *Cryptosporidium* chipmunk primers [19] together with additional sequencing primers (data not shown), and the isolate was identified as subtype XXVaR37. This case and method will be described in a separate article.

*Cryptosporidium* chipmunk genotype I subtype XIVaA20G2T1 and *C. ubiquitum* subtype XIIa-1 were found in five and two patients, respectively, all infected in Sweden.

Two patients were infected with *C. erinacei*. The SSU rDNA sequences from their samples were not identical; the sequence from patient Swec627 (KU892565) infected in Sweden differed by two SNPs from the closest *C. erinacei* match in GenBank (KC3056047), while the sequence from patient Swec653 infected in Greece was 100% identical to KC3056047 (Figure 2). The actin DNA sequence obtained from isolate Swec627 was 100% identical to a *C. erinacei* isolate (MN237648) from a European badger in Poland (unpublished). Based on *gp60* analysis, a new *C. erinacei* subtype, XIIIaA23R12, was identified in the first patient (Swec627), while subtype XIIIaA24R9 was found in the second patient (Swec653).

**Figure 2.** Phylogenetic relationships between partial SSU rDNA *Cryptosporidium* sequences obtained in the present study and sequences retrieved from the NCBI database. Trees were constructed using the neighbor-joining method based on genetic distance calculated by the Kimura's 2-parameter model as implemented in MEGA X. The final dataset included 749 positions. Bootstrap values ≥50% from 1000 replicates are indicated at each node. Isolates from this study are indicated in boldface.

Four patients were diagnosed with *C. felis*; three in Sweden, and one in Indonesia. The recently described *gp60* assay for *C. felis* [21] yielded four different sequence types.

One patient infected in Kenya with *C. viatorum* subtype XVaA3b was identified during the study period. Material from this case was used to develop a *C. viatorum gp60* subtyping assay [22].

The eight patients diagnosed with *C. meleagridis* were all infected in Asia. Six of the samples were identified as genotype 1 at the SSU rRNA locus (AF112574), while sequencing of this gene failed for the remaining two isolates. Investigation of the *gp60* gene was successful for all eight isolates and revealed three subtype families, IIIb, IIIe and IIIg, and six different subtypes (Table 6). One sequence (KU852727) differed by eight SNPs in the post-repetitive part from IIIbA23G1R1b (KJ210609) and was therefore named IIIbA23G1R1c, according to the proposed nomenclature [23].

Two subtype families of *C. cuniculus*, Va and Vb, and five different *gp60* subtypes were observed. It was noted that the *C. cuniculus* Vb sequences published in GenBank carried 2–5 ACA repeats just after the TCA repeats (data not shown); thus, we followed the recommendation by Nolan et al. from 2010 [32] and included an R in the subtype designation, as in VbA29R4 for isolate Swe658, with 29 TCA repeats followed by 4 ACA repeats (KU852734). One patient infected in Greece carried a new subtype, VbA31R4. The sequence from patient Swec678 had the same numbers of TCA and ACA repeats as a VbA20R2 strain from the UK (GU971649), but differed by six SNPs and a 3 bp deletion in the post-repetitive part of the sequence, resulting in five amino acid changes. The subtype variant was designated VbA20R2b (KU852735), according to the proposed *gp60* nomenclature [33].
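The Vb naming rule described above (count the TCA repeats, then append R with the number of ACA repeats that follow) can be sketched in a few lines. The function name is hypothetical and the input is a synthetic repeat region built for illustration, not a sequence from this study; a real repeat region could also contain other trinucleotides, which this simplified version rejects.

```python
import re

def vb_designation(repeat_region):
    """Name a Vb-style repeat region: a run of TCA followed by a run of ACA."""
    m = re.fullmatch(r"((?:TCA)+)((?:ACA)*)", repeat_region)
    if m is None:
        raise ValueError("repeat region is not a TCA run followed by an ACA run")
    tca = len(m.group(1)) // 3   # number of TCA repeats -> "A" field
    aca = len(m.group(2)) // 3   # number of ACA repeats -> "R" field
    return f"VbA{tca}" + (f"R{aca}" if aca else "")

# Synthetic example: 29 TCA repeats followed by 4 ACA repeats.
synthetic = "TCA" * 29 + "ACA" * 4
print(vb_designation(synthetic))  # VbA29R4
```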

A third subtype family of *Cryptosporidium* horse genotype, VIc, was identified in a patient (Swe490) who had visited Kenya. This subtype, referred to as VIcA16 (KU852738), showed 88% and 87% identity to subtype families VIa and VIb, respectively. Four ACA repeats followed just after the TCA repeats, a feature also observed in *C. cuniculus* subtype family Vb. An extended molecular investigation including the SSU rRNA, actin, and *hsp70* genes was performed. The SSU rDNA sequence (KU892564) differed by one SNP from *Cryptosporidium* horse genotype sequences deposited in GenBank (FJ435962, MK775041). The closest match for the actin sequence (KU892571) was a sequence from *Cryptosporidium tyzzeri* (AF382343), from which it differed by eight SNPs. No actin sequences from the horse genotype were available in GenBank for comparison. The *hsp70* sequence KU892577 (1895 bp) showed 100% identity with the four *Cryptosporidium* horse genotype sequences available in GenBank. However, none of these sequences (298–403 bp) were long enough to cover the area with the 12 bp segment repeats towards the 3′ end of the *hsp70* gene. The horse genotype sequence from our study exhibited 10 repeats of a 12 bp segment with SNPs at the third and sixth bases: GG(C/T)GG(A/T)ATGCCA.
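The degenerate 12 bp unit GG(C/T)GG(A/T)ATGCCA translates directly into a regular expression, which makes it straightforward to count tandem copies in a sequence. The sketch below uses a made-up test sequence, not KU892577, and the function name is an assumption for illustration.

```python
import re

# The 12 bp unit from the text, with the two variable positions as classes.
UNIT = "GG[CT]GG[AT]ATGCCA"
UNIT_LENGTH = 12

def longest_tandem_run(seq, unit=UNIT):
    """Return the largest number of back-to-back copies of `unit` in `seq`."""
    runs = re.finditer(f"(?:{unit})+", seq)
    return max((len(m.group(0)) // UNIT_LENGTH for m in runs), default=0)

# Made-up sequence: 10 unit copies cycling through the four possible variants,
# with short arbitrary flanks on either side.
variants = ["GGCGGAATGCCA", "GGTGGTATGCCA", "GGCGGTATGCCA", "GGTGGAATGCCA"]
synthetic = "ACGT" + "".join(variants[i % 4] for i in range(10)) + "TTGA"

print(longest_tandem_run(synthetic))  # 10
```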

#### *3.9. Phylogenetic Analyses*

A phylogenetic tree was constructed from representative SSU rDNA sequences from all 12 species and genotypes detected in the present study, together with published sequences from most *Cryptosporidium* species/genotypes hitherto detected in humans (Figure 2). In the *gp60* tree, one representative sequence from each subtype family detected in this study, except *C. felis* and *C. suis*, was included (*n* = 26). Sequences from this study clustered with reference sequences from the corresponding established subtype families. The new *C. parvum* subtype family IIt clustered with IIb, subtype family IIr with Ie, and subtype family IIs with IIe, while the new *Cryptosporidium* horse genotype subtype family VIc clustered with VIa and VIb (Figure 3).

**Figure 3.** Phylogenetic relationships between partial *gp60 Cryptosporidium* sequences obtained in the present study and sequences retrieved from the NCBI database. Trees were constructed using the neighbor-joining method based on genetic distance calculated by the Kimura's 2-parameter model as implemented in MEGA X. The final dataset included 840 positions. Bootstrap values ≥50% from 1000 replicates are indicated at each node. New subtype families observed in this study are indicated by filled circles. All isolates from this study are indicated in boldface.
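For readers unfamiliar with the Kimura 2-parameter model named in the figure captions, the pairwise distance is d = −½ ln[(1 − 2P − Q)·√(1 − 2Q)], where P and Q are the proportions of sites showing transitions and transversions, respectively. The following is a minimal sketch of that formula for two pre-aligned sequences. It is not the MEGA X implementation; in particular, skipping any site with a gap or ambiguous base is an assumption of this example.

```python
import math

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}

def k2p_distance(seq1, seq2):
    """Kimura 2-parameter distance between two aligned DNA sequences."""
    if len(seq1) != len(seq2):
        raise ValueError("sequences must be aligned to the same length")
    sites = transitions = transversions = 0
    for a, b in zip(seq1.upper(), seq2.upper()):
        if a not in "ACGT" or b not in "ACGT":
            continue  # assumption: ignore gaps and ambiguous bases
        sites += 1
        if a == b:
            continue
        if {a, b} <= PURINES or {a, b} <= PYRIMIDINES:
            transitions += 1      # A<->G or C<->T
        else:
            transversions += 1    # purine <-> pyrimidine
    if sites == 0:
        raise ValueError("no comparable sites")
    p = transitions / sites
    q = transversions / sites
    # math.log raises ValueError if the sequences are too divergent
    # for the model (argument <= 0).
    return -0.5 * math.log((1 - 2 * p - q) * math.sqrt(1 - 2 * q))

# One transition in ten sites: P = 0.1, Q = 0.
print(round(k2p_distance("AAAAAAAAAA", "GAAAAAAAAA"), 4))
```

A neighbor-joining tree is then built from the matrix of such pairwise distances; the tree-building and bootstrap steps are not shown here.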
