1. Introduction
Shiga toxin-producing
Escherichia coli (STEC) are a heterogeneous group of foodborne pathogens, and
E. coli O157:H7 is the most common member of this group. The first outbreak associated with this microorganism occurred in Oregon and Michigan, United States (US), in 1982. The organism was isolated from individuals with bloody diarrhea and severe abdominal cramps who had consumed beef burgers from a well-known fast-food restaurant chain [
1]. A retrospective search of this serotype in culture collections showed few positive results—only eight strains were deposited before 1982, one in the US, one in the United Kingdom and six in Canada [
2]. This low number of O157 strains could be related to a recent emergence of the pathogen and its entry into the agrifood chain.
E. coli O157 infections can range from asymptomatic carriage to mild diarrhea, hemorrhagic colitis or hemolytic uremic syndrome (HUS), a severe extraintestinal disease characterized by microangiopathic hemolytic anemia, thrombocytopenia and acute renal failure [
3]. Between 3% and 9% of STEC infections progress to HUS [
4], and the mortality rate is 3–5% with long-term morbidity occurring in approximately 30% of patients [
5]. Enterohemorrhagic
E. coli (EHEC) is a subgroup of STEC strains characterized as
stx/eae positive and recognized by their ability to cause severe disease in humans, such as HUS.
E. coli O157:H7 is the most frequent EHEC serotype, but other non-O157 EHEC serogroups are also implicated in HUS [
6]. Many countries that use only culture-based confirmation of HUS cases, focusing on sorbitol-nonfermenting strains, may miss non-O157 isolates and therefore bias the reports on the incidence of each serotype.
Majowicz et al. [
7] estimated that STEC causes 2,801,000 acute illnesses and leads to 3890 cases of HUS and 230 deaths annually, worldwide. Important differences exist in the incidence of
E. coli O157 infections and HUS. Surveillance practices vary considerably, and therefore, caution is required when comparing STEC incidence rates among countries. The incidence of
E. coli O157 infections per 100,000 inhabitants is approximately 1.0 in the US, 2.1 in England, 0.43 in Germany and 0.08 in France [
8]. There are also important regional differences within each country. For example, cases in Scotland increase from west to east and from north to south [
9]. In Argentina, where post-diarrheal HUS is endemic, around 400 new cases are reported each year. The disease is the leading cause of acute renal failure in children and the second most frequent cause of chronic renal failure [
10]. During 2016, 356 HUS cases were notified to the National Health Surveillance System, which corresponds to a rate of 0.82 cases per 100,000 inhabitants [
9]. During the last decade, the annual incidence has ranged from 8 to 12 cases per 100,000 children under 5 years of age [
11]. The distribution of cases shows a marked difference between the different regions of the country. The Northern regions show rates below the national average (0.17 and 0.52 per 100,000 inhabitants for northeast and northwest, respectively), the central region is near the national average (0.89 per 100,000), while central Cuyo (1.08 per 100,000) and particularly the Southern region (1.31 per 100,000) have the highest rates in the country.
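The burden figures quoted above can be related to one another with simple arithmetic. The following is a minimal sketch, not part of the cited surveillance reports: the function names are hypothetical, and the population denominator is back-calculated from the notified cases and the reported rate rather than taken from census data.

```python
def rate_per_100k(cases: float, population: float) -> float:
    """Incidence expressed as cases per 100,000 inhabitants."""
    return cases / population * 100_000


def implied_population(cases: float, rate: float) -> float:
    """Population denominator implied by a case count and a rate per 100,000."""
    return cases / rate * 100_000


# Figures quoted in the text (Argentina, 2016).
hus_cases_2016 = 356
hus_rate_2016 = 0.82
# Assumption: the denominator is inferred here, not reported in the text.
population = implied_population(hus_cases_2016, hus_rate_2016)

# Global annual estimates quoted in the text (Majowicz et al.).
global_illnesses = 2_801_000
global_hus = 3_890
global_deaths = 230
hus_share = global_hus / global_illnesses   # HUS cases per acute illness
deaths_per_hus = global_deaths / global_hus  # deaths per HUS case (crude ratio)

print(f"Implied denominator: {population / 1e6:.1f} million inhabitants")
print(f"HUS cases per acute STEC illness: {hus_share:.2%}")
print(f"Deaths per HUS case (crude ratio): {deaths_per_hus:.1%}")
```

The last ratio is only a crude comparison of two global estimates, since the text does not state that every death followed a HUS episode.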
Ruminants, especially cattle, have been recognized as the main reservoir for
E. coli O157 [
12], and many studies have shown large variations in its prevalence in livestock [
11]. Sheep, and possibly goats, may be other reservoirs [
13]. These animals are not affected by this organism. It can also be found in asymptomatic bison and cervids and in other mammals, such as pigs, camelids, rabbits, horses, dogs and cats. Other free-living wild species, such as raccoons, opossums and rats, may carry this organism in their intestinal tract [
14].
E. coli O157:H7 may be detected in wild or domesticated birds, including chickens, turkeys, geese, pigeons, starlings, and many other species. Some studies have examined a possible relationship between wild birds and livestock suggesting a role of wild birds in disseminating
E. coli O157:H7 strains from feedlot pens to the environment [
15]. In some instances, it is difficult to prove whether a species is actually a maintenance host or just a temporary carrier [
14]. The foods involved are quite variable and include hamburgers, preparations with different types of meat, sausages, dairy products, cider, lettuce, spinach and vegetable sprouts, among others. A study conducted in the UK, Ireland, Denmark, Norway, Finland, the US, Canada and Japan found that the sources of transmission of
E. coli O157 during outbreaks were different foods (42.2% of cases), dairy products (12.2%), contact with animals (7.8%), water (6.7%), the environment (2.2%), and those of unknown origin (28.9%) [
16,
17]. Transmission from person-to-person, a process that is especially linked to children’s daycare facilities or nurseries, has also been described. Domestic transmission is more frequent in children under 4 years of age [
18].
Genomic studies have allowed researchers to postulate an evolutionary step-by-step model from a non-toxigenic sorbitol fermenter precursor related to the enteropathogen,
E. coli O55:H7. This ancestor carries the locus of enterocyte effacement (LEE), which mediates the intimate attachment of the bacterium to the intestinal epithelium. The first evolutionary steps were the acquisition of the gene coding for Shiga toxin type 2, followed by the somatic antigen switch from O55 to O157 and the acquisition of the large virulence plasmid, pO157. Finally, these strains lost the ability to ferment sorbitol and acquired genes encoding Shiga toxin 1 [
19]. Another lineage retained the sorbitol positive phenotype but lost motility, giving rise to the German O157:H- clone [
14]. These strains have emerged as important causes of human disease in continental Europe [
20]. Bacteriophages have played important roles in the genome changes of
E. coli O157, and their genes have been gained and lost very dynamically and quickly [
21]. The analysis of Single Nucleotide Polymorphisms (SNP) in stable regions of the genome of both ancestral and current strains from different continents and different sources has shown that the strains are very similar. This may be related to a recent origin of
E. coli O157 [
22]. The O157 genome has a 5.5 Mb size and includes a 4.1 Mb backbone shared with most of the
E. coli serotypes. The rest of the genome originates, largely, from the horizontal transfer of genes, mainly through bacteriophages [
23]. The gains and losses of phage genes along with the variation in nucleotides throughout the genome have guided the evolution and diversity of this pathogen [
21].
The EDL933 strain associated with the Michigan outbreak and the Sakai City strain were the first
E. coli O157:H7 genomes to be sequenced [
23,
24]. At present, a large number of O157:H7 strains have been sequenced, and whole genome comparisons can provide new insight into the underlying epidemiology of this pathogen. In the near future, the application of whole-genome sequencing (WGS) techniques to the analysis of large
E. coli O157 strain collections will become an invaluable tool for molecular subtyping and will facilitate the establishment of evolutionary relationships [
25].
Clinical strains of
E. coli O157 are characterized by the presence of a specific set of genes, including those coding for Shiga toxins (
stx1,
stx2), intimin (
eae), and hemolysin (
ehxA) [
26]. There are several subtypes of Stx1 (Stx1a, Stx1c, Stx1d) and Stx2 (Stx2a, Stx2b, Stx2c, Stx2d, Stx2e, Stx2f and Stx2g) [
27]. Most human isolates of
E. coli O157 produce Stx1, Stx2a or Stx2c alone or in combination with other subtypes. Strains that produce Stx2 are more virulent and are more frequently related to severe diseases [
28], and those harboring the
stx2a gene cause more serious illnesses than strains carrying
stx2c.
There is a clear geographical difference in the incidence and severity of infections due to
E. coli O157. For example, the incidence is generally higher in Scotland than in other European countries [
8]. In Latin America, the incidence of HUS is very high in Argentina and lower in the other countries of the region. These differences could be due to (i) differences in the prevalence of cattle colonization; (ii) the load of
E. coli O157 in the environment; (iii) the proportion of humans living in areas of high cattle density; (iv) different feeding habits; (v) different genetic structures of pathogen populations; (vi) the pathogen survival in different food types and ecological niches; (vii) differences in genotypes as well as in the infectivity and virulence of circulating strains; or (viii) a combination of these factors [
29,
30]. The proportion of clinical genotypes in cattle is weakly related to the incidence of HUS in each country, but this is not enough to explain the differences in the international incidence of HUS [
24].
A meta-analysis conducted by Salim et al. [
31], including 140 studies from 38 countries with more than 220,000 cattle, established a global prevalence of
E. coli O157 of 5.68% (95% CI, 5.16–6.20). The study showed great regional variation; the highest prevalence was in Africa (31.20%), followed by North America (7.35%), Oceania (6.85%), Europe (5.15%), Asia (4.69%), and the lowest prevalences were detected in Latin America and the Caribbean (1.65%). Large differences were found between the prevalence in feedlot cattle (19.58%, CI 15.57–23.59) and dairy cattle (1.75%, CI 1.26–2.24). Several studies carried out in Argentina have shown prevalences ranging from 0.21% (CI 0.04–0.61) to 4.07% (CI 2.82–5.67) [
32,
33,
34,
35,
36]. These data show that in Argentina, the country with the highest HUS incidence worldwide, the frequency of cattle colonization with
E. coli O157 is close to the world average and lower than in many places with low rates of disease. Therefore, the prevalence of cattle colonization alone does not seem to explain the geographical differences in the severity of the associated diseases.
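The prevalence estimates above are reported with 95% confidence intervals. As an illustration of how such an interval can be obtained for a single herd survey, the sketch below computes a Wilson score interval. The method used by the cited studies is not stated here, so the choice of the Wilson interval, the function name and the example counts are all assumptions made for illustration only.

```python
import math


def wilson_interval(positives: int, sampled: int, z: float = 1.96) -> tuple[float, float]:
    """Wilson score interval for a proportion (z = 1.96 gives roughly 95%)."""
    if sampled <= 0 or not 0 <= positives <= sampled:
        raise ValueError("need 0 <= positives <= sampled and sampled > 0")
    p = positives / sampled
    denom = 1 + z**2 / sampled
    center = (p + z**2 / (2 * sampled)) / denom
    half_width = z * math.sqrt(p * (1 - p) / sampled + z**2 / (4 * sampled**2)) / denom
    return max(0.0, center - half_width), min(1.0, center + half_width)


# Hypothetical survey: 12 positive animals out of 400 sampled (not data from the review).
low, high = wilson_interval(12, 400)
print(f"Prevalence {12 / 400:.2%} (95% CI {low:.2%} to {high:.2%})")
```

The Wilson interval is chosen in this sketch because it behaves sensibly for the low prevalences typical of these surveys, where a simple normal approximation can produce a negative lower bound.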
Although cattle and other ruminants are the natural reservoir of
E. coli O157, only a small subset of serotypes present in animals is related to human diseases [
37]. Furthermore, some genetic subtypes or lineages of
E. coli O157 are more strongly associated with human disease, whereas others are frequent in animals but rare in humans. This could be related to a low virulence or transmissibility to humans of some
E. coli O157 bovine genotypes [
38]. Genes of
E. coli O157 that encode virulence factors (including products of LEE and pO157) have shown increased expression in clinical genotypes, while genes related to acid resistance and stress fitness were shown to be relatively upregulated in bovine-biased genotypes [
39]. Most cattle isolates harbor
stx2c as the sole gene encoding Stx, whereas
stx2a is more frequent in patients with severe symptoms [
33]. The
E. coli O157 strains associated with cattle show pronounced differences in their geographical distribution. These differences may have several causes: (a) production type or system, such as dairy herds or feedlots; (b) age, with a higher prevalence among young animals; (c) season, with an increase in the warmer months of the year; and (d) diet, which may also affect
E. coli O157 populations [
40,
41]. This regional association suggests that strains of
E. coli O157 have diverged evolutionarily in different parts of the world through founder effects, genetic drift or regional selective pressures. In this way, differences in the virulence of the strains in each geographical area could explain the differences in the incidence and severity of human diseases related to this microorganism [
33,
39]. Several researchers have identified genetic markers that are found in different frequencies in strains of clinical cases and animals. Some studies have shown that these genotypic differences are attributable to insertions of bacteriophages, deletions and duplications of DNA fragments of different sizes [
42,
43]. Initially, an octamer-based genomic scanning was used, through which two lineages were identified: lineage I, composed mainly of strains of clinical origin; and lineage II, composed of strains of animal origin [
44]. Subsequently, a new technique was developed, lineage-specific polymorphism assay-6 (LSPA-6), based on the use of a multiplex PCR to detect alleles from six loci that identify lineages I and II [
45]. In 2010, Zhang et al. [
46] identified another lineage, I/II, with intermediate characteristics between lineages I and II. They also showed that strains from lineage I and I/II produce more Stx2 than strains of lineage II, regardless of their origin. Furthermore, lineage I/II has been related to more severe pathologies, such as HUS [
47]. Notably, the distribution of LSPA-6 lineages in human and cattle isolates is very different in the Netherlands, the US and Japan. A similar pattern occurs when other countries or regions are analyzed.
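The lineage assignment described above can be sketched as a simple lookup on the six-locus allele profile. The coding convention used here (profile 111111 for lineage I, 211111 for lineage I/II, any other profile for lineage II) is the one commonly reported in the LSPA-6 literature. It is not stated in this text, so it should be read as an assumption, and the function name and isolate labels are hypothetical.

```python
from collections import Counter


def lspa6_lineage(profile: str) -> str:
    """Map a six-digit LSPA-6 allele profile to a lineage label.

    Assumed convention: 111111 -> lineage I, 211111 -> lineage I/II,
    every other six-digit profile -> lineage II.
    """
    if len(profile) != 6 or not profile.isdigit():
        raise ValueError(f"expected a six-digit allele profile, got {profile!r}")
    if profile == "111111":
        return "I"
    if profile == "211111":
        return "I/II"
    return "II"


# Hypothetical isolates, for illustration only.
isolates = {"human_01": "211111", "human_02": "111111", "cattle_01": "222222"}
lineage_counts = Counter(lspa6_lineage(p) for p in isolates.values())
print(dict(lineage_counts))
```

Tabulating lineage counts separately for human and cattle isolates from one country gives the kind of distribution that is compared between countries in the studies cited above.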
There is also a great variability in the clinical presentation of pathologies caused by
E. coli O157. These differences are even more striking when comparing the number of HUS cases and hospitalization rates during different outbreaks. For example, HUS and hospitalization rates during the spinach outbreak in the US in 2006 [
48] were higher than those of previous outbreaks in the US [
49] and those of the 1996 outbreak in Japan [
50]. Manning et al. [
51] postulated the existence of
E. coli O157 strains with great variation in their virulence and suggested that this diversity could explain the different incidences of severe diseases observed during outbreaks. Phylogenetic studies, based on the analysis of SNP in 36 loci of strains from different outbreaks, allowed the description of nine clades. Within them, clade 8 was related to a high number of HUS cases and the highest rates of hospitalization. For that reason, it is known as the hypervirulent clade.
Kulasekara et al. [
52] sequenced the complete genome of strain TW14359 related to the spinach outbreak of 2006. The analysis of this sequence and its comparison with the sequences of other
E. coli O157 strains already sequenced (EDL933, from the US outbreak in 1982, and Sakai, from the 1996 outbreak in Japan) identified some characteristic genetic determinants that could be related to the high virulence of this strain. These putative virulence factors include ECSP_0242, which encodes a factor linked to protein–protein interactions; ECSP_2687, which encodes a protein that reduces the expression of cytokines, decreasing the immune response of the host; ECSP_3620, which encodes the anaerobic nitric oxide reductase, NorV; ECSP_3286, which encodes a protein that binds heme with high affinity; ECSP_1773, which encodes a protein that interferes with the innate immune response; and ECSP_2870/2872, which encodes a protein related to adaptation to plant hosts. The presence of the intact
norV gene (ECSP_3620) combined with any of the other virulence factors may contribute to the high virulence of these strains.
Although the increased production of Stx2 is a characteristic of clade 8 strains, it is not unique to this clade and, in addition, not all strains of this clade express high levels of Stx2. The differences in the severity of infections caused by strains of different clades could be explained, at least in part, by the differential production of Stx2. In addition, clade 8 strains overexpress LEE genes. Therefore, the virulence of the strains of this clade probably reflects the upregulation of several discrete virulence systems [
53]. Several authors have shown that LSPA-6 lineage II strains are less pathogenic, probably due to low Stx production [
46,
54,
55]. Adherence to epithelial cells is higher for clade 8 strains than for clade 2 strains, although no differences have been observed in the invasiveness between the two clades. Strains belonging to clade 8 show upregulation of major virulence genes, including 29 of 41 LEE island genes, which are critical for adherence. The same has been observed for Stx coding genes and for virulence genes encoded in the plasmid, pO157 [
56].
The
stx2 gene is located on the λ family prophages immediately downstream of the phage late promoter (pR’). The expression of the
stx2 gene is regulated by the anti-terminator Q, which allows transcription initiated at the late promoter pR’ to read through the downstream terminator. It has been suggested that the anti-terminator
q gene on the bacteriophage Q933 could be a useful marker of strains with high toxin production. In contrast, the
q gene of bacteriophage 21 has been reported from
E. coli O157:H7 with low Stx production [
37,
57,
58].
STEC strains can colonize cattle for several months, and in this way, may serve as a gene reservoir and may be the origin of
E. coli O157 genotypes with high virulence. Characterizing the genotypes that circulate in the livestock of a given area is therefore an important starting point for strategies to reduce risks to human health [
59]. Considering this, our group has previously carried out broad molecular characterization of the human and bovine O157 strains circulating in Argentina using different methodologies (PFGE, LSPA-6, SNP analysis,
stx subtyping, and putative virulence factors and allele
q detection, among others). Our data allow us to conclude that, in contrast to the great genetic diversity observed in other studies worldwide, cattle and human strains in Argentina are highly homogeneous, with almost exclusive circulation of strains belonging to the hypervirulent clade 8 described by Manning et al. [
51], which also carry a significant set of putative virulence factors [
60,
61]. Other methods applied to STEC subtyping, like the Multiple-Locus Variable number tandem repeat Analysis (MLVA) and Multilocus Sequence Typing (MLST), were not used in our previous studies.
The aim of this review was to compare the genetic background of E. coli O157 strains isolated in countries that have conducted similar studies and to attempt to correlate specific O157 genotypes with the incidence and severity of E. coli O157-associated diseases. This review focuses on E. coli O157:H7 (named throughout the manuscript as E. coli O157) because this serotype is the etiologic agent of more than 75% of HUS cases in Argentina.
A thorough web-based and PubMed search was conducted to identify relevant studies on these topics. We used the following search terms: Escherichia coli O157, Escherichia coli O157:H7, E. coli O157:H7, E. coli O157, STEC, EHEC, VTEC, Shiga toxin combined with LSPA-6 or clades or Q alleles or stx genotypes or Kulasekara factors or putative virulence factors.
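The search strategy above can be written as a Boolean query string. The sketch below only assembles text and does not call any search service. The exact query syntax used for the review is not given, so the grouping (any organism term AND any topic term), the quoting of phrases and the helper names are assumptions.

```python
# Search terms as listed in the text.
organism_terms = [
    "Escherichia coli O157", "Escherichia coli O157:H7",
    "E. coli O157:H7", "E. coli O157",
    "STEC", "EHEC", "VTEC", "Shiga toxin",
]
topic_terms = [
    "LSPA-6", "clades", "Q alleles", "stx genotypes",
    "Kulasekara factors", "putative virulence factors",
]


def or_group(terms: list[str]) -> str:
    """Join quoted terms with OR inside one parenthesized group."""
    return "(" + " OR ".join(f'"{term}"' for term in terms) + ")"


def build_query(organisms: list[str], topics: list[str]) -> str:
    """Assumed structure: (organism terms) AND (topic terms)."""
    return f"{or_group(organisms)} AND {or_group(topics)}"


query = build_query(organism_terms, topic_terms)
print(query)
```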