**Salmonella Pathogenesis and Host Restriction**

Printed Edition of the Special Issue Published in *Microorganisms*

Edited by France Daigle

www.mdpi.com/journal/microorganisms


## **Salmonella: Pathogenesis and Host Restriction**

Editor

**France Daigle**

MDPI • Basel • Beijing • Wuhan • Barcelona • Belgrade • Manchester • Tokyo • Cluj • Tianjin

*Editor* France Daigle, Université de Montréal, Canada

*Editorial Office* MDPI St. Alban-Anlage 66 4052 Basel, Switzerland

This is a reprint of articles from the Special Issue published online in the open access journal *Microorganisms* (ISSN 2076-2607) (available at: https://www.mdpi.com/journal/microorganisms/special_issues/salmonella_host).

For citation purposes, cite each article independently as indicated on the article page online and as indicated below:

LastName, A.A.; LastName, B.B.; LastName, C.C. Article Title. *Journal Name* **Year**, *Volume Number*, Page Range.

**ISBN 978-3-0365-0912-9 (Hbk) ISBN 978-3-0365-0913-6 (PDF)**

Cover image courtesy of Jadranka Barešić.

© 2021 by the authors. Articles in this book are Open Access and distributed under the Creative Commons Attribution (CC BY) license, which allows users to download, copy and build upon published articles, as long as the author and publisher are properly credited, which ensures maximum dissemination and a wider impact of our publications.

The book as a whole is distributed by MDPI under the terms and conditions of the Creative Commons license CC BY-NC-ND.

## **Contents**


**Jean-Guillaume Emond-Rheault, Jérémie Hamel, Julie Jeukens, Luca Freschi, Irena Kukavica-Ibrulj, Brian Boyle, Sandeep Tamber, Danielle Malo, Eelco Franz, Elton Burnett, France Daigle, Gitanjali Arya, Kenneth Sanderson, Martin Wiedmann, Robin M. Slawson, Joel T. Weadge, Roger Stephan, Sadjia Bekal, Samantha Gruenheid, Lawrence D. Goodridge and Roger C. Levesque**

The *Salmonella enterica* Plasmidome as a Reservoir of Antibiotic Resistance

Reprinted from: *Microorganisms* **2020**, *8*, 1016, doi:10.3390/microorganisms8071016

## **About the Editor**

**France Daigle**, Professor, Université de Montréal, Montréal, QC. Dr. Daigle, Ph.D. (Microbiology, Université de Montréal, 1996) (post-doctoral studies at INRA, Toulouse (1996–1997) and Washington University in St. Louis (1998–2001)), has been a Professor in the Microbiology, Infectiology, and Immunology Department at Université de Montréal since 2001. She is a molecular microbiologist with expertise in bacterial pathogenesis, genetics and genomics. She uses an arsenal of genetics, genomics, and cellular tools to understand how *Salmonella* strains develop host-adaptive traits and virulence factors. She has very strong expertise in host–pathogen interactions, using host cells as a model to study bacterial adherence, uptake, and survival. She also developed a unique approach to studying bacterial gene expression in vivo. Her current projects involve the study of molecular adaptation of *Salmonella* to the host, with a focus on adherence mechanisms and biofilm formation and on high-throughput screening of *Salmonella* serovars in cell culture to estimate virulence.

### *Editorial* **Special Issue "Salmonella: Pathogenesis and Host Restriction"**

**France Daigle <sup>1,2</sup>**


Academic Editor: Ute Römling

Received: 20 November 2020; Accepted: 3 February 2021; Published: 5 February 2021

Bacteria of the *Salmonella* genus include several serovars that are closely related, although they can colonize different ecological niches, different hosts, and cause different diseases. Some serovars have a broad host spectrum, including mammals, insects, birds, and plants, while others are specific to only one host species. This Special Issue contains a research article on genomic comparative analyses that focus on serovar adaptation to pigs [1] and a review on the role of the iron-acquisition systems of serovars found in chickens [2], among others.

*Salmonella* serovars either persist in an asymptomatic carrier state or cause symptomatic infection. In humans, non-typhoidal *Salmonella* (NTS), predominantly serovars *S.* Typhimurium and *S.* Enteritidis, cause an estimated 94 million cases of gastroenteritis and approximately 155,000 deaths globally each year [3]. In developed countries, NTS outbreaks are often associated with contaminated processed food and fresh produce. Bacteremia arises when invasive NTS (iNTS) infections occur in immunosuppressed individuals, in individuals suffering from malnutrition or from other severe infections such as malaria or human immunodeficiency virus (HIV), and in elderly individuals and young children. iNTS infections cause high morbidity and mortality, particularly in developing countries, with a global burden of 3.4 million cases and over 680,000 deaths [4]. Typhoidal *Salmonella* (TS), which include serovar Typhi causing typhoid fever and serovars Paratyphi A, B, and C causing paratyphoid fever, are responsible for enteric fever, a systemic life-threatening disease with an estimated global annual burden of between 11.9 and 26.9 million cases, resulting in 128,000 to 216,500 deaths per year [5].

After ingestion of contaminated food or water, *Salmonella* colonizes the intestine and induces inflammation, which results in gastroenteritis. *Salmonella* invades the intestinal epithelium using a type three secretion system located on *Salmonella* pathogenicity island 1 (SPI-1) [6] and the invasin proteins Rck and PagN [7]. In this Special Issue, Lerminiaux et al. [8] have applied phylogenetic comparisons to study the distribution, evolution, and stability of SPI-1, while Wu et al. [9] have compared PagN sequences and have identified a residue that contributes to adhesion and invasion. Following invasion, *Salmonella* spreads via the bloodstream to the spleen, liver, and gallbladder, where it replicates in macrophages. Verma et al. [10] have reviewed the use of organoids as a complex model to study *Salmonella* host cell interactions. Each step of the infection process needs to be tightly regulated. In a two-component system, specific regulatory proteins transmit an environmental signal by phosphorelay, producing various outputs, including changes in the expression of several genes, that allow the bacterium to adapt to a particular environmental change. Murret-Labarthe et al. [11] investigate the role of all two-component systems in serovar Typhi.

*Salmonella* forms biofilms to persist in the environment or to resist antimicrobial peptides and antibiotics, reduce phagocytosis, and resist the innate immune system. Biofilms are defined as a highly organized multicellular community of bacteria embedded in a self-induced extracellular matrix. Here, Hahn and Gunn [12] describe the role of extracellular polymeric substances in biofilms and in host innate immunity. Furthermore, Sokaribo et al. [13] describe the expression of CsgD, the major biofilm regulator, under several conditions. The information gained in these studies will help to develop strategies to fight against *Salmonella*.

As with many bacterial pathogens, antimicrobial resistance is also rising in *Salmonella*. Multidrug resistance has been detected in NTS isolates from animals and humans [14], including iNTS [15] and typhoid isolates [16]. Transfer of plasmids between isolates represents a common mechanism for the spread of antibiotic resistance. Emond-Rheault et al. [17] describe a procedure to detect plasmids and antimicrobial resistance genes by whole genome sequencing.

Current advances in sequencing and bioinformatics, and the development of new models to study host cell interactions, have provided important insights into the roles of genes and their regulation, which may provide keys to combating *Salmonella* infections.



*Microorganisms* **2021**, *9*, 325


© 2021 by the author. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).

## *Article* **Genetic Markers in** *S.* **Paratyphi C Reveal Primary Adaptation to Pigs**

**Satheesh Nair <sup>1</sup>, Maria Fookes <sup>2</sup>, Craig Corton <sup>2</sup>, Nicholas R. Thomson <sup>2</sup>, John Wain <sup>3,4,</sup>\* and Gemma C. Langridge <sup>4</sup>**


Received: 17 March 2020; Accepted: 27 April 2020; Published: 30 April 2020

**Abstract:** *Salmonella enterica* with the identical antigenic formula 6,7:c:1,5 can be differentiated biochemically and by disease syndrome. One grouping, *Salmonella* Paratyphi C, is currently considered a typhoidal serovar, responsible for enteric fever in humans. The human-restricted typhoidal serovars (*S.* Typhi and Paratyphi A, B and C) typically display high levels of genome degradation and are cited as an example of convergent evolution for host adaptation in humans. However, *S.* Paratyphi C presents a different clinical picture to *S.* Typhi/Paratyphi A, in a patient group with predisposition, raising the possibility that its natural history is different, and that infection is invasive salmonellosis rather than enteric fever. Using whole genome sequencing and metabolic pathway analysis, we compared the genomes of 17 *S.* Paratyphi C strains to other members of the 6,7:c:1,5 group and to two typhoidal serovars: *S.* Typhi and Paratyphi A. The genome degradation observed in *S.* Paratyphi C was much lower than in *S.* Typhi/Paratyphi A, but similar to that in the other 6,7:c:1,5 strains. Genomic and metabolic comparisons revealed little to no overlap between *S.* Paratyphi C and the other typhoidal serovars, arguing against convergent evolution and instead providing evidence of a primary adaptation to pigs in accordance with the 6,7:c:1,5 strains.

**Keywords:** host adaptation; convergent evolution; genome degradation; genomic lesions

#### **1. Introduction**

Standardisation in the classification and fine typing of *Salmonella* in the mid-20th century led to a revolution in our understanding of the biological properties, particularly host associations, of this heterogenous group of animal pathogens [1,2]. For *Salmonella enterica* there are six defined subspecies. Nearly all isolates from humans are from subspecies I, in which more than 1500 serotypes can be distinguished by their cell wall (O) and flagella (H) antigens [3]. The combination of O:H1:H2 is known as the antigenic formula and each unique combination is given a name. For example (using the simplified antigenic formulae used by front line diagnostic laboratories) 9:d:- (where "-" means no antigen) identifies *S.* Typhi isolates and 2:a:1,5 is *S.* Paratyphi A. Both of these serovars are human-restricted and cause enteric fever. The situation, however, is more complex for *S.* Paratyphi B where its antigenic formula, 4:b:1,2 is shared by several *Salmonella* capable of causing mild to severe disease [4]. The 4:b:1,2 group can be split into subtypes using metabolic ability (e.g., tartrate utilisation), but these groups do not map to specific host species. For example, the human restricted *S.* Paratyphi B *sensu stricto* is always tartrate negative, but other 4:b:1,2 tartrate negative strains are highly diverse and are collectively known as *S.* Java. To date, only sequence typing has been able to define the phylogroup of tartrate negative strains that are human restricted [4]. Currently all *Salmonella* with the antigenic formula 4:b:1,2 are categorised as dangerous pathogens [5], but this may change as evidence linking the sub-types to clinical phenotypes accumulates.
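The O:H1:H2 notation described above can be illustrated with a short parser. This is a minimal sketch, not a tool used in the study: the `AntigenicFormula` type and `parse_formula` function are hypothetical names, and the parser assumes the simplified formulae quoted in the text, where "-" marks a missing antigen and commas separate multiple factors.

```python
from typing import NamedTuple, Optional, Tuple


class AntigenicFormula(NamedTuple):
    """Hypothetical container for a simplified O:H1:H2 formula."""
    o_antigens: Tuple[str, ...]            # cell wall (O) antigen factors
    h1: Optional[str]                      # phase 1 flagellar antigen, None if absent
    h2: Optional[Tuple[str, ...]]          # phase 2 flagellar factors, None if absent


def parse_formula(formula: str) -> AntigenicFormula:
    """Split a simplified antigenic formula into its O, H1 and H2 parts."""
    parts = formula.strip().split(":")
    if len(parts) != 3:
        raise ValueError(f"expected O:H1:H2, got {formula!r}")
    o, h1, h2 = parts
    return AntigenicFormula(
        o_antigens=tuple(o.split(",")),
        h1=None if h1 == "-" else h1,
        h2=None if h2 == "-" else tuple(h2.split(",")),
    )


# Formulae quoted in the text above.
for label, formula in [
    ("S. Typhi", "9:d:-"),
    ("S. Paratyphi A", "2:a:1,5"),
    ("4:b:1,2 group", "4:b:1,2"),
    ("6,7:c:1,5 group", "6,7:c:1,5"),
]:
    print(label, parse_formula(formula))
```

Because several distinct organisms can share one formula (as with 4:b:1,2), a parsed formula identifies a group rather than a single serovar or host association.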

The fourth *Salmonella* serotype considered to be a human restricted enteric fever pathogen is *S.* Paratyphi C; this belongs to a highly variable group with the antigenic formula 6,7:c:1,5, for which subtyping has never been fully resolved [6,7]. The most commonly isolated member of this group, *S.* Choleraesuis, has been divided into biotypes (Table 1): *sensu stricto* and Kunzendorf have a limited ability to grow using single sugars as a carbon source and differ only in their ability to produce H2S from sulphates—a test which is difficult to standardise for routine diagnostic use.


**Table 1.** Differentiation by biochemistry of 6,7:c:1,5 *Salmonella.*

Adapted from the Kauffmann-White scheme [3]. \* Typhisuis is *d*-tartrate negative in contrast to the other four.

*S.* Choleraesuis var. Decatur, on the other hand, has a greater metabolic capacity but is also far more genetically diverse [8]. Current biotyping schemes often lead to errors in identification [8], and so molecular tests have been developed using multi locus sequence typing (MLST) [6]. However, the mapping of these sequence types to phylogenetically related clusters is not clean and the association with animal hosts even less so [8]. The utility of these biotypes is therefore disputed.

The host restricted biotypes within the 6,7:c:1,5 group (*S.* Paratyphi C to humans and *S.* Typhisuis to pigs) are clearly distinguishable from *S.* Choleraesuis; this has led to the hypothesis that the adaptation of salmonellae to humans by *S.* Typhi, Paratyphi A, B and C, is the result of convergent evolution of a diverse set of *Salmonella* adapting to survive and transmit between human hosts [7]. Comparative genomic studies on *S.* Typhi and Paratyphi A support this theory [9,10], but for *S.* Paratyphi B and C the evolutionary trajectory is less clear. Sequencing of ancient DNA from humans suggests that the 6,7:c:1,5 group of *Salmonella* were far more common in the past than now and were part of a historic clade (Ancient Eurasian Super Branch) which falls within the much larger extant *S. enterica* diversity. The early members of this clade cluster with their pig-adapted modern counterparts [11].

Clinically, the disease caused by *S.* Typhi and Paratyphi A is indistinguishable [12], but the evidence from a search of the accessible published literature that *S.* Paratyphi C causes only enteric fever is less clear. Rather, the clinical descriptions range from opportunistic infection resembling sepsis [13–16] to classic enteric fever [17]. This presentation, in some cases, is very similar to the infection caused in humans by the pig-adapted *S.* Choleraesuis [18]. The clinical picture is confused by difficulties in identifying *S.* Paratyphi C in the laboratory, but it is possible that this human restricted pathogen has undergone very different selective pressures to *S.* Typhi and Paratyphi A. If *S.* Paratyphi C has evolved convergently with *S.* Typhi and Paratyphi A, then the genome sequence should reveal this through the accumulation, across serotypes, of genetic changes that cause lesions in the pathways associated with host adaptation [19]. Here, we investigate this by describing the genomic lesions present in the genomes of 58 isolates from the 6,7:c:1,5 group, and comparing the evolutionary dynamics of this important group with those reported for *S.* Typhi and Paratyphi A.

#### **2. Materials and Methods**

#### *2.1. Isolates*

All 6,7:c:1,5 strains used in the study are listed in Table S1, along with accession numbers for the raw sequence data. We assembled a collection of 6,7:c:1,5 isolates for genome sequencing and supplemented these with publicly available sequences where necessary.

#### *2.2. DNA Preparation*

All isolates were cultured on non-selective agar (LB agar, Difco, Oxford, United Kingdom) for purity checking, and in non-selective broth (LB broth, Difco) for DNA extraction. DNA was extracted using a genomic DNA extraction kit (Sigma, Gillingham, United Kingdom) and sequenced using the Illumina Genome Analyzer II (Cambridge, United Kingdom), as previously described [20].

#### *2.3. Genomes from Databases*

A search was performed in Enterobase [21] for entries with the antigenic formula 6,7:c:1,5. Two additional genome sequences were identified and included in this study: *S.* Choleraesuis var. Decatur SARB70 (ERR3482081) and *S.* Paratyphi C RKS4594 (CP000857). We also used 21 *S.* Typhi genomes, CT18 (accession AL513382), Ty2 (AE014613) and 19 from [22], and 2 *S.* Paratyphi A genomes, ATCC 9150 (CP000026) and AKU\_12601 (FM200053).

#### *2.4. Reference Genome Assembly*

For the *S.* Typhisuis 61-6 and *S.* Paratyphi C 66-8 reference genomes, a high-quality sequence was assembled using data from two sequencing platforms. DNA was sequenced on both 454 Roche GS FLX Titanium (paired end library with 3 kb insert; Connecticut, United States) and the Illumina Genome Analyzer II (200–300 bp standard paired end library run in one lane for 37 cycles). Illumina sequences were assembled using Velvet [23] and combined with 454 sequences using Newbler (Roche). The combined assemblies were converted to Gap4 databases [24], to guide gap closure based upon 454 read pair information. ABACAS [25] was used to order and orient the fragmented assemblies against the complete genomes of *S.* Enteritidis P125109 (accession AM933172) and *S.* Choleraesuis SC-B67 (AE017220), enabling many small repeat regions to be correctly assembled. Finally, iCORN [26] was used to correct the assembled sequences using the Illumina data and checked in the Gap4 database. Initial genome annotation was carried out using annotation transfer via RATT [27] from *S.* Choleraesuis SC-B67. Genome assemblies are available under the project accession PRJEB37271.

#### *2.5. Sequence QC*

Where necessary, sequence data were trimmed using Trimmomatic (Galaxy v0.38.0) [28] with sliding window trimming of 4:20 (a window of 4 bases with a required average quality of 20), leading and trailing trimming at quality 3 and appropriate adapter clipping (GAII or HiSeq).
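The quality trimming steps named above can be sketched in a few lines. This is a simplified illustration of the idea, not Trimmomatic's implementation: the function names are hypothetical, adapter clipping is omitted, and the order of steps (leading, trailing, then sliding window) is an assumption because the text does not state it.

```python
def trim_leading(quals, min_q=3):
    """Return the index of the first base with quality >= min_q."""
    i = 0
    while i < len(quals) and quals[i] < min_q:
        i += 1
    return i


def trim_trailing(quals, min_q=3):
    """Return the end index after dropping 3' bases with quality < min_q."""
    j = len(quals)
    while j > 0 and quals[j - 1] < min_q:
        j -= 1
    return j


def sliding_window_cut(quals, window=4, min_avg=20):
    """Return how many bases to keep: cut at the start of the first
    window whose mean quality falls below min_avg."""
    for start in range(0, max(len(quals) - window + 1, 0)):
        if sum(quals[start:start + window]) / window < min_avg:
            return start
    return len(quals)  # no failing window (or read shorter than the window)


def trim_read(seq, quals, window=4, min_avg=20, min_q=3):
    """Apply leading, trailing and sliding-window trimming to one read."""
    start = trim_leading(quals, min_q)
    end = trim_trailing(quals, min_q)
    seq, quals = seq[start:end], quals[start:end]
    keep = sliding_window_cut(quals, window, min_avg)
    return seq[:keep], quals[:keep]


# Toy read with invented Phred scores, for illustration only.
print(trim_read("ACGTACGTAC", [2, 30, 30, 30, 30, 30, 10, 10, 10, 2]))
```

In this toy read the first and last bases are removed by the quality 3 thresholds, and the read is then cut where the four-base window mean first drops below 20.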

#### *2.6. Comparative Genomics*

The identification of genomic lesions (mutations resulting in premature stop codons, frameshifts or insertion/deletions relative to an intact version of the gene), and comparison between genomes, was carried out using Artemis Comparison Tool (v10) [29]. For each identified lesion, the appropriate reference genome annotation was amended to accurately reflect the mutation, and therefore allow inclusion of these genomic features in downstream analysis. Velvet assemblies were generated for all isolates sequenced in this study. Per cluster, all mutations leading to genomic lesions in the reference genome were checked for their presence in at least 2 other draft genome assemblies from that cluster. The comparators (*S.* Choleraesuis, RKS1235 and RKS1249; *S.* Paratyphi C, 6610 and 664; *S.* Typhisuis, 38K and 871997) were chosen for their divergent positions within the phylogeny. Genomic lesions were deemed core if the same mutation was present in both, or variable if the mutation was absent in one or both sequences. OrthoMCL [30] with default parameters was used to determine orthologues between the three 6,7:c:1,5 genomes, *S.* Typhi CT18 and *S.* Paratyphi A AKU\_12601. *S.* Enteritidis P125109 was also used as a comparator. All core disrupted genes were compared via orthology to identify which were shared or unique to each cluster. Core genomic lesions were also identified for *S.* Typhi across 19 genomes and *S.* Paratyphi A across the 2 publicly available genomes at the time.
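The core/variable rule described above can be expressed as a simple set membership check. The following is a minimal sketch under stated assumptions: lesions are represented as hypothetical `(gene, position, change)` tuples, the gene names and coordinates are invented placeholders, and the function name `classify_lesions` is not from the study, which performed these checks in the Artemis Comparison Tool.

```python
def classify_lesions(reference_lesions, comparator_a, comparator_b):
    """Label each lesion found in the reference genome.

    A lesion is 'core' if the same mutation is present in both
    comparator assemblies, and 'variable' if it is absent from
    one or both of them.
    """
    labels = {}
    for lesion in reference_lesions:
        in_both = lesion in comparator_a and lesion in comparator_b
        labels[lesion] = "core" if in_both else "variable"
    return labels


# Invented placeholder lesions: (gene, position, change).
reference = {("geneA", 120, "C>T stop"), ("geneB", 45, "del 1 bp"), ("geneC", 300, "ins 2 bp")}
comparator_1 = {("geneA", 120, "C>T stop"), ("geneB", 45, "del 1 bp")}
comparator_2 = {("geneA", 120, "C>T stop"), ("geneC", 300, "ins 2 bp")}

for lesion, label in sorted(classify_lesions(reference, comparator_1, comparator_2).items()):
    print(lesion, label)
```

Matching on the full tuple rather than on the gene name alone reflects the requirement that the same mutation, not merely a lesion in the same gene, be present in both comparators.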

#### *2.7. Pathway Comparison*

Lists of disrupted genes were overlaid onto the metabolic pathway overview of the *S.* Typhi CT18 pathway/genome database [31] to identify disrupted pathways and transport reactions. Disrupted pathways were compared between the 5 *Salmonella* to produce an UpSet plot [32].
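The counts shown in an UpSet plot are the sizes of exclusive intersections: the number of pathways disrupted in exactly one given combination of genomes. A minimal sketch of that calculation follows; the function name and the pathway labels are hypothetical placeholders, and the genome abbreviations (SPC, STS, SCH, SPA, STY) are borrowed from the figure legends in this article. Plotting itself is left out.

```python
from collections import Counter


def exclusive_intersections(disrupted):
    """Count pathways by the exact combination of genomes in which they are disrupted.

    `disrupted` maps a genome label to the set of pathways disrupted in it.
    Returns a Counter keyed by frozenset of genome labels.
    """
    membership = {}
    for genome, pathways in disrupted.items():
        for pathway in pathways:
            membership.setdefault(pathway, set()).add(genome)
    return Counter(frozenset(genomes) for genomes in membership.values())


# Invented placeholder data, for illustration only.
toy = {
    "SPC": {"pathway_1", "pathway_2"},
    "STS": {"pathway_1", "pathway_3"},
    "SCH": {"pathway_1"},
    "SPA": {"pathway_4"},
    "STY": {"pathway_4", "pathway_5"},
}

for combo, n in exclusive_intersections(toy).items():
    print(sorted(combo), n)
```

A combination that never occurs simply has a count of zero, which corresponds to an empty column in the plot.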

#### **3. Results and Discussion**

#### *3.1. Reductive Evolution and Genome Degradation*

We generated a core SNP phylogeny from the genomes of 58 isolates to investigate the relationship between related 6,7:c:1,5 biotypes of *S. enterica* subspecies I (Figure 1). These included 17 isolates of *S.* Paratyphi C which spanned a range of 57 years and were collected from Europe, Africa and the Middle East (Table S1).

**Figure 1.** Phylogenetic clusters of 6,7:c:1,5 *Salmonella.* Unrooted core SNP phylogeny using *S.* Choleraesuis SC-B67 as a reference.

The two *S.* Decatur genomes showed more SNP variation between them than was found across all of the remaining biotypes (Figure S1), and so they were removed from the phylogeny. It is generally accepted that organisms classified as *S.* Decatur represent a very diverse group which should be considered separate from other 6,7:c:1,5 *Salmonella*. The remaining strains grouped into four phyloclusters (Figure 1, supported by hierBAPS analysis, Table S1). We also prepared an extended phylogeny, including an additional 116 assemblies from Enterobase, which confirmed the phylogenetic relationships were robust (Figure S1). The separation of these clusters was matched to the currently accepted classification scheme: 1, *S.* Typhisuis; 2, *S.* Paratyphi C; 3, *S.* Choleraesuis var. *sensu stricto;* 4, *S.* Choleraesuis var. Kunzendorf. This phylogeny confirmed that the isolates included in this study were typical of the 6,7:c:1,5 group, and also allowed us to select representative genomes for comparative analyses. For this analysis, the two biotypes of *S.* Choleraesuis were considered together because the core variation across the two groups was similar to that seen across the *S.* Typhisuis group, and because this study focused on the human adaptation of *S.* Paratyphi C, whereas *S.* Choleraesuis var. *sensu stricto* and var. Kunzendorf are both associated with pigs.

In bacteria restricted to a niche, such as infection of the human host, the accumulation of mutations that disrupt gene sequences (termed "pseudogenes") is well described [10]. The term pseudogene typically implies a loss of gene function but that is not always accurate, as many of these changes (deletions, insertions or major amino acid substitutions) can cause a change of function [33,34], and so here we use the term genomic lesion (GL) to denote changes that are predicted to have an impact on phenotype but do not necessarily cause loss of function. The association of genomic lesions with host adaptation is an active research area [35]; however, grouping genomic lesions by functional impact remains challenging. We have therefore treated all GLs (mutations resulting in premature stop codons, frameshifts or insertion/deletions relative to an intact version of the gene) as equal. We found that the number of GLs per serotype was much greater for *S.* Paratyphi A and *S.* Typhi than for *S.* Paratyphi C, suggesting a potentially shorter evolutionary time period for accumulation. However, the presence of *S.* Paratyphi C DNA in ancient human samples [36,37], coupled with the descriptions of *S.* Paratyphi C as being very common in more ancient periods of human history, suggests that the effective population size was at least as large as that of *S.* Typhi. This, in turn, suggests a different selective pressure driving the preservation of metabolic diversity, by the removal of genomic lesions from the population. For *S.* Paratyphi C then, a history of colonisation and transmission between varied environments seems more likely than one of being restricted to the human host. However, it is possible that *S.* Paratyphi C is at an early point along the pathway to human restriction, and so we looked for signs of early adaptation to the human host.
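The lesion types grouped together above (premature stop codons, frameshifts and insertion/deletions relative to an intact gene) can be told apart with a few simple checks on coding sequences. The sketch below is illustrative only and is not the procedure used in the study: `classify_lesion` is a hypothetical helper, it assumes both sequences start at the same start codon and are full-length coding sequences, and it ignores amino acid substitutions.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def first_stop(cds):
    """Return the nucleotide offset of the first in-frame stop codon, or None."""
    for i in range(0, len(cds) - 2, 3):
        if cds[i:i + 3].upper() in STOP_CODONS:
            return i
    return None


def classify_lesion(intact, candidate):
    """Compare a candidate coding sequence with an intact version of the gene."""
    length_diff = len(candidate) - len(intact)
    if length_diff % 3 != 0:
        return "frameshift"          # indel length not a multiple of three
    if length_diff != 0:
        return "in-frame indel"
    stop = first_stop(candidate)
    if stop is not None and stop < len(intact) - 3:
        return "premature stop"      # stop codon before the final codon
    return "intact"


# Invented toy sequences, for illustration only.
intact_gene = "ATGAAACCCGGGTAA"
print(classify_lesion(intact_gene, "ATGAAATAGGGGTAA"))  # stop codon introduced
print(classify_lesion(intact_gene, "ATGAACCCGGGTAA"))   # one base deleted
print(classify_lesion(intact_gene, "ATGAAAGGGTAA"))     # one codon deleted
```

As the text notes, none of these categories by itself establishes loss of function, which is why all of them are treated equally as genomic lesions.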

We performed comparative genomics within and between multiple genome sequences of *S.* Paratyphi C, *S.* Typhisuis and *S.* Choleraesuis, to determine if genome degradation was evident. At the time of sequencing, a high-quality contiguous reference genome only existed for *S.* Choleraesuis SC-B67; we therefore generated references for both *S.* Paratyphi C and *S.* Typhisuis (Table S1). Within each biotype, a core set of genomic lesions was identified by comparison of the reference with two other members of that group, selected to represent the diversity within the group (Table S2). Any lesion that was absent from one or more of the sequences was considered variable and not included. All the 6,7:c:1,5 biotypes sequenced here showed similar levels of genome degradation, with just over 100 genomic lesions seen in each genome representative (Figure 2). When compared with the core genomic lesions of the human restricted serotypes, it was evident that the scale of genome degradation was much greater in the two classic human restricted serotypes *S.* Typhi (186 lesions) and Paratyphi A (144 lesions). To understand the impact of these differences in genome degradation, we investigated the possible functional consequences of these genomic lesions.

**Figure 2.** Distribution of disrupted genes. Network of shared and unique disrupted genes in *S.* Paratyphi C (SPC), *S.* Typhisuis (STS), *S.* Choleraesuis (SCH), *S.* Paratyphi A (SPA) and *S.* Typhi (STY). Yellow circles are false nodes representing each *Salmonella*. Diamonds indicate a single genomic lesion; size and colour indicate how many *Salmonella* share that lesion. Black lines connect *Salmonella* to genomic lesions. Lesions unique to each *Salmonella* are shown as fans around the false nodes. Mbp, megabase pairs; GL, genomic lesion. Dashed circles indicate lesions shared between SPC and STY and/or SPC and SPA. 1: *torC*, *ratC*, STY4541, STY2432, STY1834, STY1781; 2: *slrP* and *fliB*; 3: STY4472, STY4044, STY1408, STY1353.

#### *3.2. Genomic Lesions are Functionally Distinct*

Genomic lesions were located in metabolic pathways according to a pathway/genome database generated for *S.* Typhi (Table S3). This enabled us to determine which (potentially different) lesions affected the same pathways or transporters between the three 6,7:c:1,5 representatives and *S.* Typhi/Paratyphi A (Figure 3). The intersection of affected pathways and transporters revealed that *S.* Paratyphi C had no lesions in pathways/transporters that were also degraded in *S.* Typhi or Paratyphi A (Figure S2). However, there were two genomic lesions that were common to all three 6,7:c:1,5 representatives and *S.* Typhi/Paratyphi A: *sopA* and *mglA*. The former encodes a Type Three secretion system (T3SS) effector protein secreted by the T3SS encoded on *Salmonella* Pathogenicity Island 1 (SPI-1), and the latter encodes a galactoside transporter. SPI-1 effectors are associated with the invasion of host intestinal cells and subsequent enteritis [38], and so the disruption of SopA through genomic lesion in all of the *Salmonella* investigated here is not surprising. Disruption of *mglA* has previously been reported in other host-adapted *Salmonella* [39], which combined with our findings suggests that the gene is linked to the former generalist lifestyle of all of these adapted salmonellae.

**Figure 3.** Distribution of disrupted pathways. Annotated UpSet plot of shared and unique disrupted pathways. Number of genes disrupted in pathways given above the columns (includes multiple genes disrupted in same pathway). Open circles: no disrupted pathways are shared between *S.* Typhi, *S.* Paratyphi A and *S.* Paratyphi C.

For the 6,7:c:1,5 biotypes we saw 29 genomic lesions, two of which impacted two shared pathways/transporters: arabinose degradation and a PTS fructose transport system, which have not been linked to bacterial colonisation of pigs previously. This most likely reflects their shared ancestry but may be the result of selection for colonisation of a shared habitat. We therefore examined the exact nature of the mutations causing these genomic lesions: the same base pair change at the same site was considered to be inheritance whereas a different mutation in the same gene (or pathway) was considered to show selection indicative of convergent evolution. The lesions affecting arabinose degradation and PTS fructose transport were identical and thus considered inheritance. This leaves open the possibility that the ancestor of *S.* Paratyphi C evolved as a pig pathogen, which then crossed the species barrier into humans. If this was the case, it seems likely that *S.* Paratyphi C has crossed from pigs to become an opportunistic human pathogen that has subsequently become restricted to the human host. *S.* Paratyphi C and *S.* Choleraesuis cause a similar pathology in humans but *S.* Paratyphi C has lost the ability to infect pigs. This hypothesis is supported by the reporting of *Salmonella* genomes in ancient DNA studies [11]. The genomes of *Salmonella* from 400 to 900 years ago cluster with modern *S.* Paratyphi C, but the older genomes (1600–5000 years old) cluster more closely with the pig-adapted *S.* Choleraesuis. This supports the hypothesis that the ancestor of *S.* Paratyphi C was adapted to pigs and that host restriction in humans actually represents an evolutionary dead end, explaining why *S.* Paratyphi C is so rare today.
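The rule used above to separate inheritance from convergent evolution can be written as a small comparison function. This is a sketch under stated assumptions rather than the authors' procedure: the `Lesion` record and `compare_lesions` function are hypothetical, the example mutations are invented placeholders, and real comparisons would need positions expressed in a shared coordinate system (for example via orthologue alignment).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Lesion:
    """Hypothetical record of one disrupting mutation."""
    gene: str       # orthologous gene (or pathway) affected
    position: int   # position of the change within the gene
    change: str     # description of the base pair change


def compare_lesions(a: Lesion, b: Lesion) -> str:
    """Classify a pair of lesions observed in two different lineages.

    Same base pair change at the same site -> 'inheritance'.
    Different mutation in the same gene    -> 'convergent'.
    Different genes                        -> 'unrelated'.
    """
    if a.gene != b.gene:
        return "unrelated"
    if (a.position, a.change) == (b.position, b.change):
        return "inheritance"
    return "convergent"


# Invented placeholder lesions, for illustration only.
lineage_1 = Lesion("geneX", 210, "G>A stop")
lineage_2 = Lesion("geneX", 210, "G>A stop")
lineage_3 = Lesion("geneX", 87, "del 1 bp")

print(compare_lesions(lineage_1, lineage_2))  # identical mutation
print(compare_lesions(lineage_1, lineage_3))  # same gene, independent mutation
```

Under this rule, the identical lesions reported for arabinose degradation and PTS fructose transport fall into the inheritance category.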

Five pathways/transporters were found to be disrupted in both *S.* Typhi and Paratyphi A, of which two were shared by them alone. For *S.* Typhi/Paratyphi A, approximately 25% of their genome is shared through a recombination event, and the mutations causing GLs in this region of the genome accordingly have identical base pair changes [9]. However, outside of this shared region there are many mutations causing GLs that are clearly the result of convergent evolution: independent mutations linked to the disruption of the same metabolic pathways. Our analysis agrees with this finding: we identified genomic lesions in 11 genes and two pathways present only in *S.* Typhi and Paratyphi A, once all shared lesions with the 6,7:c:1,5 biotypes had been considered. Of these lesions, some were identical mutations and some were independent [9], suggesting convergent evolution by both selection for independent mutations and horizontally acquired characteristics.

#### *3.3. Anaerobic Respiration Intact in S. Paratyphi C*

The link between host adaptation (leading towards restriction) and metabolism in *Salmonella* is perhaps best described for the metabolic functions associated with tetrathionate. Tetrathionate reduction allows the bacterial cell to carry out anaerobic respiration; this in turn increases the growth rate in the anaerobic environment of the animal gut and generates specific metabolic end products which stimulate inflammatory diarrhoea and, in turn, transmission between hosts [40]. The loss of anaerobic metabolism through the disruption of tetrathionate reduction is a hallmark of host adaptation, where a balance between pathogenicity and long-term colonisation of the host is the evolutionary strategy, rather than population expansion through rapid transmission. In *Salmonella*, disruption of anaerobic respiration is seen in *S.* Typhi, *S.* Paratyphi A and the *S.* Gallinarum/Pullorum group (restricted to birds) [10,19]. For *S.* Paratyphi C no such disruption was seen: the anaerobic respiration pathways were intact. In fact, only two genomic lesions were shared solely between *S.* Paratyphi C, *S.* Typhi and *S.* Paratyphi A (*slrP* and *fliB*), and these did not map to metabolic functions.

#### **4. Conclusions**

For *S.* Typhi and Paratyphi A, there is a clear genomic signal that suggests convergent evolution during adaptation and eventual restriction to the human host. In this study of 17 *S.* Paratyphi C genomes, we found: (i) the level of genome degradation caused by mutational disruption was much lower than in other host-restricted *Salmonella*, (ii) the metabolic pathways involved did not match the pathways disrupted in the human restricted *Salmonella*, and (iii) the acquisition of mutations appeared to be through inheritance from the ancestor with *S.* Choleraesuis, rather than by the selection of randomly occurring errors of DNA replication. In short, this suggests that convergent evolution does not explain the host restriction of *S.* Paratyphi C. Indeed, since anaerobic respiration is intact in *S.* Paratyphi C, we hypothesise that pathogenicity would be more similar to non-typhoidal *Salmonella* than to the enteric fever group. This has implications for both biological understanding and risk assessment for safety in clinical laboratories.

**Supplementary Materials:** Supplementary materials can be found at http://www.mdpi.com/2076-2607/8/5/657/s1; **Figure S1.** Extended phylogeny. Strains in red indicate those sequenced as part of this study. Strains in black are from Enterobase assemblies with > 60x coverage from Illumina sequencing. All *S.* Typhisuis and *S.* Paratyphi C genome assemblies from Enterobase with >60x coverage (representing 10 and 41 strains, respectively). We then selected a diverse geographical and temporal range of assemblies from Choleraesuis Kunzendorf (25), sensu stricto (15) and those defined only as Choleraesuis (25), all with > 60x coverage. 14 strains appear to be distantly related, possibly indicating misclassification. All other strains conform to the same phylogenetic relationship demonstrated by the strains sequenced in this study. **Figure S2.** Network of disrupted pathways and transporters. Yellow circles are false nodes representing each *Salmonella*. Diamonds indicate a single disrupted pathway (connected by straight lines) or transporters (connected by wavy lines); size and colour indicate how many *Salmonella* share that disrupted pathway/transporter. Disrupted pathways/transporters unique to each *Salmonella* are shown as fans around the false nodes. Table S1. 6,7:c1,5 strains in this paper. Table S2. Genomic lesions affecting genes: *S.* Paratyphi C (SPC), *S.* Typhisuis (STS), *S.* Choleraesuis (SCH), *S.* Paratyphi A (SPA) and *S.* Typhi (STY). Table S3. Genomic lesions affecting pathways: *S.* Paratyphi C (SPC), *S.* Typhisuis (STS), *S.* Choleraesuis (SCH), *S.* Paratyphi A (SPA) and *S.* Typhi (STY).

**Author Contributions:** Concept and study design: S.N., N.R.T., J.W. and G.C.L.; data analysis: M.F., C.C. and G.C.L.; drafting and revision of the manuscript: S.N., J.W. and G.C.L. All authors have read and agreed to the published version of the manuscript.

**Funding:** The authors gratefully acknowledge the support of the BBSRC; G.C.L. and J.W. were funded by the BBSRC Institute Strategic Programme Microbes in the Food Chain BB/R012504/1 and its constituent project BBS/E/F/000PR10352.

**Acknowledgments:** We thank Ken Sanderson, Wolfgang Rabsch, Reiner Helmuth, Sam Kariuki and Cheng-Hsun Chiu for their kind donation of strains.

**Conflicts of Interest:** The authors declare no conflict of interest.

#### **References**


© 2020 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).

## *Review* **Iron-Uptake Systems of Chicken-Associated** *Salmonella* **Serovars and Their Role in Colonizing the Avian Host**

#### **Dinesh H. Wellawa 1,2, Brenda Allan 1, Aaron P. White 1,2 and Wolfgang Köster 1,2,\***


Received: 15 July 2020; Accepted: 31 July 2020; Published: 7 August 2020

**Abstract:** Iron is an essential micronutrient for most bacteria. *Salmonella enterica* strains, representing human and animal pathogens, have adopted several mechanisms to sequester iron from the environment depending on availability and source. Chickens act as a major reservoir for *Salmonella enterica* strains which can lead to outbreaks of human salmonellosis. In this review article we summarize the current understanding of the contribution of iron-uptake systems to the virulence of non-typhoidal *S. enterica* strains in colonizing chickens. We aim to address the gap in knowledge in this field, to help understand and define the interactions between *S. enterica* and these important hosts, in comparison to mammalian models.

**Keywords:** *Salmonella*; iron homeostasis and regulation; chicken; pathogenicity; iron transport

#### **1. Introduction**

The genus *Salmonella* is composed of two species, *Salmonella enterica* and *Salmonella bongori*. *Salmonella enterica* is subdivided into six subspecies: *enterica* (I), *arizonae* (IIIa), *diarizonae* (IIIb), *houtenae* (IV), *salamae* (II) and *indica* (VI), based on antigenic properties (somatic (O), flagellar (H1, H2) and capsule (K) antigens) and biochemical properties [1–3]. *Salmonella bongori* predominantly resides as a commensal in ectotherms and, except for a few incidences, mammalian infections are rare [4,5]. It is hypothesized that adaptation to different niches paved the way for speciation of *S. enterica* and *S. bongori* from a common ancestor by means of gene gain, gene loss and conjugation events [6–8]. *Salmonella enterica* contains >2600 serovars, which can infect insects, wild birds, reptiles and mammals. The vast majority of human salmonellosis cases (>99%) are caused by serovars of subspecies I (*enterica*); hence, it is the most important category in terms of public health. The clinical manifestation of salmonellosis can vary among serovars. The gastroenteritis-causing strains are collectively known as non-typhoidal *Salmonella* (NTS) strains, and this review mainly focuses on NTS. Gastroenteritis is associated with intestinal inflammation and diarrhea, generally without fever. NTS strains have the capacity to infect a broad range of livestock species, yet chickens (*Gallus gallus domesticus*) are known to be a major reservoir. This is supported by epidemiological data indicating that poultry represent a major source of human (non-typhoidal) salmonellosis globally (Table 1) [9,10].


**Table 1.** Some of the global incidences of human salmonellosis linked to poultry.

<sup>a</sup> Number of reported incidences. <sup>b</sup> Estimated due to inadequate reporting. <sup>c</sup> More than 3 NTS serovars were involved. <sup>d</sup> According to the online data published by the Centers for Disease Control and Prevention in June 2020 (https://www.cdc.gov/Salmonella/outbreaks.html). <sup>e</sup> According to the online data published by the European Centre for Disease Control and Prevention in June 2020 (https://www.ecdc.europa.eu/en/infectiousdiseases-and-public-health/salmonellosis/threats-and-outbreaks). <sup>f</sup> Public Health Services, Canada website (https://www.canada.ca/en/public-health/services/diseases/salmonellosis-salmonella.html). NTS: non-typhoidal *Salmonella*.

#### *1.1. Iron Homeostasis by Salmonella in a Nutshell: Regulation and Iron-Uptake Systems*

Iron is an indispensable element for *S. enterica*. Key enzymes involved in bacterial metabolism, including DNA synthesis and repair enzymes, depend on iron as a cofactor [16]. As a transition metal, iron can exist as either Fe<sup>2+</sup> or Fe<sup>3+</sup> at physiological pH (7.2). In anaerobic environments, Fe<sup>2+</sup> can be dominant over ferric iron, while Fe<sup>3+</sup> can be abundant in aerobic conditions. *Salmonella* has established various mechanisms to internalize iron depending on its availability. In this review, we will discuss several important iron-uptake systems available in chicken-associated NTS strains (Figure 1). For more detail about iron homeostasis in bacteria in general, readers are directed to several references [16–21].

#### *1.2. Ferric Uptake Regulator (Fur)-Mediated Regulation of Iron Uptake, Storage and Utilization*

In addition to its innumerable beneficial effects, iron also catalyzes the formation of toxic metabolites such as superoxides and hydroxyl free radicals through Haber-Weiss and Fenton reactions in vivo. These can damage bacterial DNA and iron-sulfur clusters, so iron is harmful unless regulated [22,23]. The regulation is mainly under the control of an auto-regulated protein called ferric uptake regulator (Fur) [24]. Fur acts as a repressor for most promoters related to iron uptake [25–27]. Under iron-rich conditions, Fur binds to Fe<sup>2+</sup>, which causes Fur dimerization and subsequent binding to a consensus DNA sequence called the "fur box" (GATAATGATAATCATTATC), often present in promoter-containing regions. Binding overlaps the RNA polymerase (RNAP) binding sequence in the promoter region of iron-regulated genes [28]. This, in turn, hinders transcription of genes by the RNAP. Under iron-depleted conditions, Fe<sup>2+</sup> dissociates from the dimer, the blockade for RNAP is removed, and iron-regulated genes are expressed. Apart from serving as a direct transcriptional repressor, Fur positively regulates iron storage and iron utilization genes via small RNAs called RyhB (*E. coli*) or its homologues (RfrA/B in *Salmonella*) [29,30]. For example, under iron-rich conditions Fur upregulates iron storage proteins called bacterioferritins in *E. coli* via RyhB [31]. First, Fur-Fe<sup>2+</sup> represses *ryhB* transcription and reduces RyhB accumulation in the cell. The low intracellular RyhB concentration in turn alleviates RyhB-mediated destruction of mRNA transcripts and leads to the upregulation of iron storage proteins. The network of partners interacting with RyhB and its homologs has added more complexity to Fur-mediated iron regulation, and the interactome of these RNAs is under active research [32].
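The idea of a consensus "fur box" can be illustrated with a minimal sketch that slides a 19-bp window along a sequence and counts mismatches against the consensus quoted above. The function name, the mismatch cut-off and the promoter fragment are all hypothetical and chosen only for illustration; real binding-site predictions typically rely on position weight matrices and experimental validation rather than a simple mismatch count.

```python
# Illustrative sketch only: find windows that resemble the 19-bp Fur box consensus.
FUR_BOX = "GATAATGATAATCATTATC"  # consensus sequence quoted in the text


def find_fur_box_candidates(sequence, max_mismatches=4):
    """Return (start, site, mismatches) for each window within
    max_mismatches of the consensus. The cut-off is an arbitrary assumption."""
    sequence = sequence.upper()
    width = len(FUR_BOX)
    hits = []
    for start in range(len(sequence) - width + 1):
        window = sequence[start:start + width]
        # Count positions where the window differs from the consensus
        mismatches = sum(1 for a, b in zip(window, FUR_BOX) if a != b)
        if mismatches <= max_mismatches:
            hits.append((start, window, mismatches))
    return hits


# Hypothetical promoter fragment (not a real Salmonella sequence)
promoter = "TTGACA" + FUR_BOX + "TATAAT"
exact_hits = find_fur_box_candidates(promoter, max_mismatches=0)
print(exact_hits)
```

Relaxing `max_mismatches` returns weaker candidate sites as well, which mirrors the fact that natural Fur-binding sites usually deviate from the consensus to some degree.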

**Figure 1.** Schematic representation of iron-uptake systems in non-typhoidal *Salmonella* strains. (**A**) Fe<sup>3+</sup> uptake systems. Enterobactin, salmochelin and aerobactin are secreted (e.g., through EntS and IroC) to sequester Fe<sup>3+</sup> and then bind to their cognate receptors in the outer membrane (OM). Coprogen and ferrichrome are other ferric iron chelators present in the environment. Energy is generated through the proton motive force (PMF) in the cytoplasmic membrane (CM) and transduced to the receptor by the Ton complex (TonB-ExbB-ExbD). The energized receptor undergoes a conformational change which opens the pathway to mediate uptake of the iron-loaded siderophores into the periplasm (PP). The iron-liganded siderophores bind to periplasmic binding proteins (FhuD, FepB) which then shuttle them through ABC family permeases into the cytosol. 1 and 2 represent linearized forms of enterobactin and salmochelin, respectively. (**B**) Fe<sup>2+</sup> uptake systems. Ferrous iron in aqueous medium travels through porin channels in the OM according to the concentration gradient. FeoABC is specific for Fe<sup>2+</sup> uptake. Both MntH and SitABCD are divalent metal transporters. (**C**) Forms of siderophores. Cyclic forms of enterobactin and salmochelin are hydrolyzed by enzymes encoded in the *iro* gene cluster to produce linearized forms of iron chelators.

#### *1.3. Uptake of Ferric (Fe<sup>3+</sup>) Iron via Siderophores*

Fe<sup>3+</sup> is insoluble and often sequestered by host proteins (e.g., hemoglobin, transferrin, lactoferrin) or bound in complexes (Fe(OH)3) outside the host. *Salmonella* secretes high-affinity iron-binding molecules called siderophores (500–1000 Da) to capture Fe<sup>3+</sup>. Two siderophores belonging to the catecholate type are well-characterized: enterobactin and salmochelin. Enterobactin has been described as nature's superglue for Fe<sup>3+</sup>, forming an extremely stable complex with ferric ion (Kf = 10<sup>49</sup>; Kf = formation constant) [33]. Chemically it is designated as the cyclic trilactone of *N*-2,3-dihydroxybenzoyl-L-serine. *N*-2,3-dihydroxybenzoyl-L-serine (DBS) is the building block of enterobactin, which undergoes cyclization to accommodate iron through six coordinating oxygen atoms in three DBS units. DBS itself can scavenge Fe<sup>3+</sup> with low affinity [34]. *Salmonella* uses nonribosomal peptide synthesis (NRPS) pathways encoded by *entBCDE* (Figure 1C) to generate enterobactin in the cytoplasm, which is then exported by EntS located in the inner membrane [35]. Except for some chicken-specific *Salmonella* serovars, all other chicken-associated *Salmonella* produce enterobactin [36]. Enterobactin can be further linearized by the action of hydrolase enzymes (IroE) located in the bacterial periplasm before secretion (Figure 1C). The linearized forms of enterobactin (Ent-trimer, Ent-dimer) retain the ability to scavenge ferric iron, but with reduced affinity compared to the cyclic form (Kf = 10<sup>43</sup>) [33]. Once the secreted enterobactin and linearized forms are iron-loaded, they are taken up by their cognate receptors in the *Salmonella* outer membrane. Cyclic and linearized forms of enterobactin (e.g., Ent-trimer) specifically bind to FepA. Evidence has suggested that enterobactin breakdown products such as DBS can be transported via Cir, FepA and IroN once loaded with Fe<sup>3+</sup> [37]. These receptors share sequence similarity and follow the same general structure [18]. 
They are composed of a 22-stranded antiparallel β-barrel (which forms the channel) and an *N*-terminal globular domain referred to as the "plug" or "cork". The energy generated by the proton motive force in the inner membrane is coupled to the outer membrane receptors via the TonB-ExbB-ExbD complex to achieve siderophore internalization (passage through the upper binding pocket), followed by migration through the channel (the plug undergoes conformational changes) into the periplasm. Internalized iron is then released through degradation of enterobactin by the Fes esterase located in the cytoplasm.

Enterobactin can be glycosylated by a glycosyl transferase enzyme, IroB, to form salmochelin [38]. Glycosylation affixes glucose molecules to enterobactin, thus forming the more hydrophilic salmochelin. It has been hypothesized that salmochelin is produced to counteract the iron starvation imposed by the host. This is supported by the observation that salmochelin is a better iron scavenger than enterobactin in the presence of serum albumin, and that it is not bound by the mammalian innate immune molecule lipocalin 2 (Lcn-2), which captures apo-enterobactin or Fe<sup>3+</sup>-enterobactin to impede iron scavenging by bacteria [39,40]. Lcn-2 is secreted by phagocytic cells (macrophages, neutrophils) and epithelial cells during the host's inflammatory response. Glycosylation of the enterobactin moiety sterically hinders binding by Lcn-2, and therefore salmochelin is considered a "stealth" siderophore. IroB can sequentially synthesize several versions of salmochelin, termed mono-glycosylated enterobactin (MGE), di-glycosylated enterobactin (DGE/S4) and tri-glycosylated enterobactin (TGE) [38,39]. Work by Lin et al., 2005 has further demonstrated that the periplasmic enzyme IroE can linearize salmochelin to linear trimer (linearized TGE/S3, not shown in Figure 1C), linear dimer (DGE/S2), MGE trimer, linear *C*-glycosylated (DBS)2 (S1) and linear monomer (SX) in vitro [39]. The authors also showed that IroD, a cytoplasmic esterase, can degrade the salmochelin forms into their building blocks (DBS), thus releasing the iron into the bacterial cytoplasm [40]. Salmochelins have high specificity for the outer membrane receptor IroN and are subject to TonB-dependent uptake like other siderophores.

Some NTS serovars produce aerobactin, a mixed-type siderophore of the citrate-hydroxamate type. Aerobactin is synthesized by a NRPS pathway utilizing enzymes encoded in the *iucABCD* operon. During synthesis, L-lysine is first converted to *N*<sup>6</sup>-acetyl-*N*<sup>6</sup>-hydroxy-L-lysine and then coupled to a citric acid backbone [41]. The iron complex formation constant of aerobactin (Kf = 10<sup>23</sup>) is lower than that of enterobactin [42]. Aerobactin follows the same rule as catecholate-type siderophores regarding its uptake (Iut receptor) and TonB-dependent transport into the bacterial periplasm. Once in the periplasm, aerobactin is transported through the binding-protein-dependent ABC transport system FhuBCD [43]. FhuBCD also mediates the energy-dependent uptake of ferrichromes and coprogen from the environment (Figure 1A) [43].

A less common class of siderophores found in *Salmonella* serovars is the phenolate type, such as yersiniabactin (Ybt). Ybt is abundantly produced in *Yersinia* species and is encoded by a genomic island called high pathogenicity island 1 (HPI 1) [44]. HPI 1 is absent from most serovars of *Salmonella enterica* subspecies I [44], and hence its distribution in *Salmonella* serovars is low. Seven proteins (HMWP1, HMWP2, YbtD, YbtE, YbtS, YbtT and YbtU) have been described in Ybt synthesis from the precursor isochorismic acid in *Yersinia* species. The final product is a four-ring structure composed of salicylate, one thiazolidine and two thiazoline rings. Ybt shows a higher affinity for Fe<sup>3+</sup> (Kf = 10<sup>36</sup>) than aerobactin, hence it is a potent iron chelator. Once loaded with iron, yersiniabactin is taken up by the Psn/FyuA receptor in the outer membrane and then shuttled through the YbtPQ ABC transporter across the inner membrane (not shown in the figure).
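To put the formation constants quoted in this section side by side, the following minimal sketch ranks the siderophores by log10(Kf), reading the values as powers of ten (10^49, 10^43, 10^36 and 10^23). The dictionary labels and the helper function are illustrative names only, and the comparison covers nothing beyond the constants cited here; it does not account for pH, solubility or host binding proteins, which also influence iron acquisition.

```python
# log10 of the Fe3+ formation constants (Kf) as quoted in this section.
# Labels are illustrative; values are read from the text, not measured here.
LOG10_KF = {
    "enterobactin (cyclic)": 49,
    "enterobactin (linearized)": 43,
    "yersiniabactin": 36,
    "aerobactin": 23,
}


def log10_fold_difference(stronger, weaker):
    """Orders of magnitude separating two quoted formation constants."""
    return LOG10_KF[stronger] - LOG10_KF[weaker]


# Rank siderophores from highest to lowest quoted formation constant
ranked = sorted(LOG10_KF, key=LOG10_KF.get, reverse=True)
for name in ranked:
    print(f"{name}: Kf = 10^{LOG10_KF[name]}")

gap = log10_fold_difference("enterobactin (cyclic)", "aerobactin")
print(f"Cyclic enterobactin vs aerobactin: {gap} orders of magnitude")
```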

#### *1.4. Uptake of Ferrous Iron (Fe<sup>2+</sup>) via FeoABC, SitABCD and MntH*

Ferrous iron is water-soluble and can readily pass through the outer membrane porin proteins into the periplasm following the concentration gradient. Once in the periplasmic space, *Salmonella* can take up Fe<sup>2+</sup> via three systems: FeoABC, SitABCD and MntH. FeoABC belongs to a family of transporters that have high specificity for Fe<sup>2+</sup>. For the FeoABC system, the FeoB permease forms a channel in the inner membrane and FeoA and FeoC interact with FeoB in the cytoplasm. The *N*-terminal, cytoplasmic portion of FeoB contains a G-protein domain which can perform GTP binding and hydrolysis. Therefore, Feo-mediated Fe<sup>2+</sup> uptake is coupled to GTP hydrolysis and signal transduction. For the latest structure and biology of the FeoABC system, readers are directed to two recent articles [45,46].

SitABCD is an ABC transporter family protein complex that primarily transports Mn<sup>2+</sup> at alkaline pH but is capable of transporting Fe<sup>2+</sup> with low affinity [47]. Kehres et al., 2002 showed that SitABCD of *Salmonella* Typhimurium only transported Fe<sup>2+</sup> when the concentration of Fe<sup>2+</sup> reached 1 μM or higher in vitro [47]. MntH also predominantly transported Mn<sup>2+</sup> rather than Fe<sup>2+</sup>. Uptake of Mn<sup>2+</sup> was independent of pH, while Fe<sup>2+</sup> transport increased at acidic pH [47]. Further, the affinity of MntH for Fe<sup>2+</sup> was much lower than that of SitABCD, and MntH only transported ferrous iron at concentrations higher than 1 μM in vitro [47]. Since the free, labile iron level is believed to be extremely low in biological fluids (<10<sup>−18</sup> M) and tissues, the role of SitABCD and MntH in ferrous iron transport is hypothesized to be of relatively minor significance compared to Feo-mediated iron uptake. The FeoABC system is recognized as the main ferrous iron transporter for many Enterobacteriaceae [48].
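The scale of the mismatch behind this argument can be made explicit with a short calculation. The sketch below simply compares the two concentrations quoted above (the roughly 1 μM in vitro threshold for Fe<sup>2+</sup> transport by SitABCD and MntH, and the upper bound of 10<sup>−18</sup> M for free iron in biological fluids); the constant and function names are hypothetical, and treating the threshold as a single sharp cut-off is a simplifying assumption.

```python
import math

# Concentrations quoted in the text (mol/L); names are illustrative only.
TRANSPORT_THRESHOLD_M = 1e-6     # ~1 uM Fe2+ needed for SitABCD/MntH transport in vitro
FREE_IRON_UPPER_BOUND_M = 1e-18  # upper bound on free, labile iron in biological fluids


def orders_of_magnitude_gap(threshold_m, available_m):
    """Return log10(threshold / available) for two positive concentrations."""
    if threshold_m <= 0 or available_m <= 0:
        raise ValueError("concentrations must be positive")
    return math.log10(threshold_m / available_m)


gap = orders_of_magnitude_gap(TRANSPORT_THRESHOLD_M, FREE_IRON_UPPER_BOUND_M)
print(f"Free iron lies at least {gap:.0f} orders of magnitude below the in vitro threshold")
```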

#### **2. Emergence of Chicken-Associated Invasive NTS: The Iron Link**

NTS strains are mainly asymptomatic colonizers in adult chickens, but strains of certain serovars can be fatal when infecting day-old chicks [49,50]. The major chicken-associated NTS serovars with potential to cause human epidemics are listed in Table 2. In countries belonging to the European Union (EU), the majority of breeders and layers were infected with *Salmonella* Enteritidis (SEn), while broilers were predominantly colonized by *Salmonella* Virchow (SVr) [51]. In contrast to the EU countries, *Salmonella* Kentucky (SKn) has been the predominant serovar isolated from poultry products in North America [10,52]. Generally, there is a high degree of genetic synteny among NTS serovars of chicken origin (listed in Table 2) at the core genome level [53]. Table 2 lists only some of the genetic differences which may be linked to virulence in chickens or humans.


**Table 2.** Most prevalent chicken-associated NTS serovars with public-health risk.



\* Global pandemic, ciprofloxacin-resistant strain of SKn (ST198-X1-SGI1) that originated from chickens. # ACSSuT = ampicillin, chloramphenicol, streptomycin, sulfamethoxazole and tetracycline.

The chicken host has been a hotspot for shaping new NTS pathotypes that can cause extraintestinal disease in humans due to bacteremia, often with antimicrobial-resistant (AMR) phenotypes. NTS bacteremia can lead to severe inflammation within different organs, leading to organ dysfunction and sometimes death (Figure 2). In these more systemic infections, antibiotics are required for successful treatment. *Salmonella* Heidelberg (SHb) and SVr, for which chickens act as a reservoir, are among the top four NTS serovars with the highest invasiveness indices (proportion of bacteremia isolates among total isolates) globally [75–78]. In addition, *Salmonella* Typhimurium (STm) and SEn are together responsible for the largest number of human epidemics globally, with the potential to cause blood-borne infections [79]. Comparative genomic analysis has predicted that *Salmonella* pathogenicity islands (SPI), adhesin molecules (fimbriae, invasins), secretion systems, virulence plasmid (spv), toxins, multidrug-resistance genomic islands and colonization factors have a role in causing blood-borne infection in humans [52,53,57,58,80,81]. Another important virulence trait that has been overlooked in NTS serovars is iron uptake. As summarized in Table 2, there is a general trend in strains of important NTS serovars to acquire additional iron-uptake systems. Kajanchi et al. (2017) reported that a significant number of STm strains isolated from chickens, turkeys and humans carried ColV plasmids, which encoded genes for divalent metal uptake (*sitABCD*) and Fe<sup>3+</sup> uptake via synthesis, secretion and translocation of aerobactin (*iucABCD*, *iut*) [62]. The plasmid-encoded *sitABCD* was phylogenetically distinct from the chromosomally encoded locus. The effect of having two *sitABCD* operons on clonal expansion and/or virulence is still unknown. ColV plasmids have been associated with SKn strains and, to a lesser extent, with SHb strains isolated from poultry in the USA [55]. For SKn, there was a significant fitness defect in colonizing the chicken cecum in strains lacking pColV [55]. In addition, systemic dissemination and the ability to cause splenic lesions were reduced in the pColV-null background compared to the pColV-positive strain, indicating that genetic factors carried on pColV plasmids are important virulence determinants during extraintestinal disease [55]. SKn is an emerging pathogen which can cause blood-borne infections in humans [82–84]; thus, ColV plasmid-encoded factors, including iron uptake functions, most likely contribute to overall virulence.
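The invasiveness index mentioned above is a simple proportion, and a minimal sketch of its calculation is given below. The serovar labels and isolate counts are entirely hypothetical placeholders, not data from the cited surveillance studies, and the function name is illustrative.

```python
def invasiveness_index(blood_isolates, total_isolates):
    """Proportion of isolates recovered from blood among all isolates of a serovar."""
    if total_isolates <= 0:
        raise ValueError("total_isolates must be positive")
    if not 0 <= blood_isolates <= total_isolates:
        raise ValueError("blood_isolates must lie between 0 and total_isolates")
    return blood_isolates / total_isolates


# Hypothetical counts: (blood isolates, total isolates) per serovar
hypothetical_counts = {
    "serovar A": (30, 400),
    "serovar B": (5, 1000),
}

indices = {
    name: invasiveness_index(blood, total)
    for name, (blood, total) in hypothetical_counts.items()
}

# Rank serovars from most to least invasive by this measure
for name in sorted(indices, key=indices.get, reverse=True):
    print(f"{name}: {indices[name]:.1%} of isolates from blood")
```

Because the index depends on which isolates are submitted for typing, comparisons between serovars are only meaningful within a consistent surveillance scheme.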

The aerobactin operon (*iucABCD*), also carried on pColV plasmids, is of particular interest because its prevalence is normally low in most *Salmonella* [42]. Aerobactin-producing NTS serovars (SEn, STm, SVr, SIn, etc.) were highly associated with human salmonellosis caused by ingestion of contaminated poultry products in Spain [85]. In some reports, aerobactin production has been linked exclusively to blood-borne infection rather than gastroenteritis, as aerobactin-producing NTS serovars were exclusively isolated from human blood [86,87]. In fact, some of the properties of aerobactin can provide NTS serovars with better survival during systemic dissemination, even though the affinity of aerobactin for Fe<sup>3+</sup> is lower than that of most other siderophores. These features include a higher transfer rate of Fe<sup>3+</sup> from transferrin receptors to aerobactin in the serum, higher solubility, low wastage of resources during aerobactin production (it is recycled) and rapid secretion out of the cell to be available for ferric uptake, compared to enterobactin, which tends to accumulate in the inner membrane [42]. The iron at the mucosal surface of the gastrointestinal tract is mainly bound by lactoferrin, which, like transferrin, has a high affinity for iron (Kf = 10<sup>20</sup>) [88,89]. Therefore, secretion of additional siderophores such as aerobactin may provide NTS strains with a competitive advantage for multiplication and invasion in the gastrointestinal tract. In addition, aerobactin is not bound by Lcn-2, which provides a defense against Lcn-2-mediated iron starvation during inflammation. In short, aerobactin may be involved not only in the systemic phase of infection but also in enteric infection. The pColV plasmids are well-distributed among *E. coli* strains, and it is believed that chicken-associated NTS strains may have acquired pColV from an avian pathogenic *E. coli* (APEC) strain. 
APEC strains cause high morbidity and mortality in chickens (colibacillosis) due to their ability to cause septicemia [90]. Dozois et al., 2003 showed that among pathogen-specific gene clusters expressed in APEC strains, both aerobactin and salmochelin were important for virulence in chickens [91]. Further, a significant reduction of colibacillosis-associated pathology was observed in an aerobactin-knockout APEC strain carrying ColV plasmids [92]. Similarly, the hypervirulent *Klebsiella pneumoniae* strain relies solely on aerobactin for its hypervirulent phenotype, which leads to septicemia in humans [93]. Taken together, acquiring aerobactin production may indeed cause chicken-associated NTS serovars to become more virulent once they infect humans. There are a number of other genetic factors encoded on pColV plasmids which can contribute to virulence, including the *iss* gene associated with increased serum survival in APEC strains [94]. Therefore, experimental approaches will be necessary to study the role of pColV-encoded aerobactin in the virulence of NTS serovars in chickens and humans.

A recently emerging poultry-associated multidrug-resistant *Salmonella* Infantis (SIn) lineage harbored yersiniabactin secretion systems (*irp*) on pESI-like plasmids [71,95]. As mentioned earlier, yersiniabactin is rarely present in *Salmonella* strains, and its role in chicken-associated NTS strains is unknown. In addition to ferric iron, yersiniabactin can sequester copper ions to form a stable complex (yersiniabactin-cupric) which resists proteasomal degradation. In a series of experiments, Chaturvedi and colleagues showed that the yersiniabactin-cupric complex neutralized superoxide generated in phagosomes (superoxide dismutase-like activity), which gave uropathogenic *E. coli* bacteria a survival advantage in vitro and in vivo [96,97]. This new paradigm for the role of yersiniabactin in virulence is highly applicable to NTS serovars, because *Salmonella enterica* strains need to resist copper (Cu<sup>2+</sup>) accumulation inside macrophages in order to survive [98]. Once accumulated in the cytoplasm of macrophages, Cu<sup>2+</sup> is reduced to Cu<sup>+</sup>, which is toxic to bacteria. Co-expression of yersiniabactin and catecholate siderophores (enterobactin, salmochelin) in the SIn strain may therefore provide a survival advantage by facilitating iron acquisition as well as resistance to copper-mediated toxicity.

**Figure 2.** Bacteremia-induced complications by non-typhoidal *Salmonella*. Generally, 5% of gastroenteritis cases develop into bacteremia-associated complications in immunocompetent people. However, the burden of NTS bacteremia is higher in immunocompromised patients and children under 5 years old (can reach up to 34%). Data related to epidemiology were obtained from a variety of published case reports and outbreak analyses [99–110].

The acquisition of siderophore secretion and metal-ion uptake systems in chicken-associated NTS serovars might be linked to their invasive phenotypes in humans, but more studies are needed to confirm their role. Whether they are important for pathogenesis in chickens remains to be answered.

#### **3. Iron Uptake in NTS Virulence: Chicken vs. Mammalian Models**

Most of our understanding of the role of iron-regulated gene clusters in *Salmonella* pathogenesis derives from experimental infection with *Salmonella* Typhimurium (STm) using mouse models and mammalian cell culture assays. Due to differences in how pathogens interact with avian environments, we cannot directly extrapolate this information to chickens [111,112]. The relationship between virulence and various iron-uptake systems in pathogenic bacteria, including *Salmonella enterica*, has been extensively reviewed [16–21,113]. Unfortunately, limited data from chicken models and avian cell lines remain a barrier to understanding the host–pathogen interactions involving the iron-uptake systems of *Salmonella* serovars. Here we discuss the potential role of iron-uptake systems of NTS serovars in infection and colonization of chickens compared to mammals. Some of the gaps in knowledge which need to be addressed in poultry are summarized in Figure 3.

**Figure 3.** Interaction of non-typhoidal *Salmonella* strains with chickens and the environment. The role of iron-uptake systems during the key steps of the *Salmonella* life cycle illustrated here needs to be investigated in chicken models in the future [50,114–122].

#### *3.1. Feo-Mediated Fe<sup>2+</sup> Uptake Involved in Rapid Colonization of the Gut and Systemic Spread*

In profiling genes differentially expressed by STm during colonization of the lumen of the chicken cecum (compared to in vitro cultures), Harvey et al., 2011 detected upregulation of the *sitABCD* operon [123]. In contrast, the major Fe<sup>2+</sup> uptake facilitator, the FeoABC system, was not differentially expressed in the same experiment. However, these researchers only assessed gene expression at 16 h post-infection in newly hatched chicks, so they may not have captured the full spectrum of iron-regulated gene expression over time [123]. The *sitABCD* gene cluster was a major virulence factor in an avian pathogenic *E. coli* (APEC) strain causing colibacillosis in a chicken air sac model [124]. Evidence suggested that manganese uptake was more important than Fe<sup>2+</sup> uptake during the extraintestinal phase in APEC strains [124]. In contrast, both Mn<sup>2+</sup> and Fe<sup>2+</sup> uptake contributed to the full virulence of STm in causing typhoid disease in mice [125]. Portillo et al., 1992 estimated that 1 μM of free Fe<sup>2+</sup> prevailed inside the STm-containing vacuole within Madin-Darby canine kidney cells and that it was sufficient for replication for at least 8 h of infection [126]. This suggested that Fe<sup>2+</sup> uptake might be more important than Fe<sup>3+</sup> uptake in the initial stages of STm establishment in the gastrointestinal tract. Supporting this hypothesis, Tsolis et al., 1996 showed that the lack of the Feo system significantly reduced the fecal shedding of STm in mice (C57BL/6) at day 4 post-challenge, while a Fe<sup>3+</sup> uptake null strain was recovered at a level similar to the wildtype [127]. In line with these findings, Costa et al., 2017 showed that Feo-mediated iron uptake provided a fitness advantage for STm during gastrointestinal colonization (fecal shedding) following intragastric challenge in a streptomycin-pretreated mouse (C57BL/6) colitis model at 2 days post-infection [128].

Similar to mammals, chickens mediate Fe<sup>2+</sup> egress from macrophages by expressing NRAMP-1 (Natural Resistance-Associated Macrophage Protein 1) in the phagosomal membrane [129]. The action of NRAMP-1 is thought to limit the free, labile iron pool available to intracellular pathogens [130]. Thus, it is very likely that Feo-mediated ferrous iron uptake plays a crucial role for *Salmonella* in establishing systemic infections in chickens. NRAMP-1 expression has been linked to *Salmonella* resistance in certain chicken genetic lines (White Leghorn W1) [131]. The susceptible chicken line (CC) had a conservative mutation at amino acid residue 223 (Arg223→Gln223) of NRAMP-1, which was highly predictive of a functional anomaly in the NRAMP-1 protein [131]. Consistent with this finding, the authors observed that only 15% of birds in the susceptible chicken line (CC) survived a parenteral challenge with STm to 7 days post-infection, while almost all birds survived in the resistant chicken line [131]. However, beyond day 7 post-infection the mortality rate of the susceptible chicken line was comparable to that of the resistant chicken line, irrespective of NRAMP-1 status [131]. This suggested that Fe<sup>2+</sup> starvation in the presence of a functional NRAMP-1 did limit the rapid systemic spread of STm in chickens, but that the bacteria somehow adapted to the new iron status in the host and survived, retaining their virulence during persistent infection. Future studies are needed to examine the iron distribution during *Salmonella* pathogenesis in chickens (cecal colonization and extraintestinal dissemination) and how this shapes the overall regulation of iron-uptake systems in NTS serovars.

While further experiments are warranted to investigate the role of Fe<sup>2+</sup> uptake in relation to the NRAMP-1 status of chicken lines, mouse models of infection have provided some insight into the interplay between NRAMP-1 and Feo-mediated iron uptake. Feo-mediated iron uptake provided a competitive advantage during persistent infection with STm (SL1344, a calf-virulent isolate) in both NRAMP-positive and -negative mouse backgrounds [132]. The authors observed that a Δ*feo* STm strain was significantly reduced in its ability to colonize deeper gut-associated tissues such as Peyer's patches (PP) and mesenteric lymph nodes (MLN), as well as the liver and spleen, during a mixed infection [132]. Mice were challenged orally, resembling natural infection with *Salmonella*. In the same study, it was documented that the lack of Feo-mediated Fe<sup>2+</sup> uptake affected the overall iron homeostasis of the STm strain in a single-infection challenge model [132]. The study revealed that the Δ*feo* STm strain compensated for its iron requirement by upregulating siderophore-mediated Fe<sup>3+</sup> uptake systems (enterobactin, salmochelin) during systemic infection (liver and spleen) [132]. Interestingly, this upregulation of siderophore-mediated ferric uptake resulted in an increased bacterial burden in the liver and spleen during persistent infection in NRAMP+/+ mice [132]. A growing body of evidence indicates that *Salmonella* preferentially resides in hemophagocytic macrophages in the liver and spleen during infection [133–135]. One plausible explanation is the abundant source of iron (Fe<sup>3+</sup>/Fe<sup>2+</sup>) that *Salmonella* can exploit during the degradation of erythrocytes in those macrophages. Hence, hypersecretion of siderophores may benefit growing *Salmonella* under such conditions. Expression of iron-uptake systems may well differ among the types of tissue the bacterium encounters, or vary with host responses. For example, transferrin-bound iron (Fe<sup>3+</sup>) in the intestines provides a good source of iron for *Salmonella*, and its uptake can be facilitated by the stress-induced hormone norepinephrine, which is produced abundantly in the mesenteric organs of both chickens and mice [136].

#### **Highlights-1:**


#### *3.2. Siderophore Synthesis Is Important During Persistent Infection and Bacteremia*

Iron restriction has been well studied in relation to the antimicrobial properties of egg white in vitro [137–142]. Kang et al., 2006, showed that a Δ*entF* strain of SEn, which was unable to produce a catecholate siderophore, was significantly attenuated in its ability to survive in egg albumen in vitro, which suggested that siderophore production is an important virulence determinant during internal contamination of eggs [143]. The egg is enriched with a variety of iron chelators such as ovotransferrin (in egg white) and phosvitin (in yolk); hence, it is very likely that potent ferric hijacking systems benefit *Salmonella* in colonizing eggs during transovarian transmission. Van Immerseel et al., 2010, proposed the hypothesis that stress-induced survival mechanisms governed by SEn led to egg-associated human outbreaks, given that eggs possess an arsenal of antimicrobial properties [144]. However, in vivo gene expression studies did not identify iron-uptake systems as differentially expressed gene clusters during oviduct colonization or egg contamination [145,146]. These gene expression studies were conducted using an intravenous challenge model, which is an unnatural route of infection in hens. So, it is possible that gene expression of *Salmonella* during intravenous challenge differs from that during oral infection in hens. Siderophore-mediated ferric iron uptake has so far not been identified as a major virulence determinant during gut colonization and systemic infection in chickens. Too few studies have used iron-homeostasis-related mutants of *Salmonella* to investigate their role in infection, colonization and transmission in a chicken model. In a series of experiments executed by Rabsch et al., 2003, it was proposed that siderophore degradation products such as *N*-2,3-dihydroxybenzoyl-L-serine (DBS) would be more important in colonization and systemic spread in the absence of an active siderophore uptake system in chickens [37]. The authors confirmed this hypothesis in a mouse model of infection (intragastric route) using a Δ*fepA* Δ*iroN* Δ*cir* strain of STm (SL1344), which was significantly attenuated in colonization of the cecum and systemic spread; in contrast, this attenuation was not observed in a Δ*fepA* Δ*iroN* mutant (enterobactin and salmochelin uptake deficient). In a chicken model, SEn strains carrying *fepA iroN* mutation profiles behaved similarly as in mice, indicating that siderophore uptake was not essential during early colonization events [37]. Interestingly, the authors concluded that in BALB/c mice, which are intrinsically susceptible to *Salmonella* infection, salmochelin was not important for causing infection. All these data have to be interpreted cautiously for the following reasons: (i) *N*-2,3-dihydroxybenzoyl-L-serine (DBS) does not occur naturally in the environment.
It needs to be synthesized (*entABCDE*) or liberated as a byproduct of the action of Fes and IroE (Figure 1) on enterobactin/salmochelin. So, if DBS is important, so is siderophore synthesis. When uptake routes are blocked, spontaneous breakdown of siderophores can be a rapid process. (ii) At any given time, siderophores and their degradation products (enterobactin, salmochelins, Ent-trimer, Ent-dimer, DGE-trimer, DBS, etc.) can be present, and this cocktail may have a biological role in vivo. For example, degradation to a more soluble form such as DBS enables *Salmonella* to internalize iron rapidly. The mixture of derivatives might also exhaust the immune system in mounting an effective antibody response (antibodies against one particular siderophore derivative will spare others in the mixture) [147,148]. (iii) The genetic background of the host organism will have a major effect on the outcome of animal experiments. For example, findings on the importance of iron-uptake systems in mice that are genetically susceptible or resistant to *Salmonella* have been contrasting [149,150]. This will most likely apply to chickens as well (*Salmonella*-resistant and -susceptible chicken lines). Another crucial factor is the age of the birds: e.g., chicks (weak immune system) vs. adult chickens.

Fe<sup>3+</sup> uptake via FepB (the periplasmic binding protein for some catecholate-type siderophores) has been identified as an absolute requirement for persistent infection of mice (Sv129S6-Nramp1+/+) with STm (SL1344) [151]. FepB is needed to shuttle Fe<sup>3+</sup> bound to enterobactin, salmochelin or DBS (*N*-2,3-dihydroxybenzoyl-L-serine) from the periplasm to the inner membrane transport components (Figure 1A). Deletion of *fepB* in STm dramatically lowered bacterial recovery, to below the detection limit in most of the tissues examined in mice (cecum, MLN, PP, liver and spleen) [151]. Most importantly, the authors of these studies showed that siderophore synthesis (enterobactin, salmochelin) played a significant role in gastrointestinal colonization and systemic spread during persistent infection [151].

Salmochelin synthesis and export have been identified as major virulence factors during bacteremia in mice (C3H, Nramp+), as measured by mortality after intraperitoneal injection of STm [152]. Parenteral injection of STm carrying a mutation in *tonB*, which completely blocks all siderophore uptake, has previously been shown to significantly increase the LD50 in mice compared to challenge with wildtype STm [127]. Further, in a study that analyzed differentially expressed genes in STm SL1344 by transcriptomic and proteomic techniques, enterobactin synthesis and uptake genes were highly upregulated during systemic infection in a mouse (C57BL/6) model [153]. The most interesting finding of that study was that, in addition to enterobactin-related genes, salmochelin-related genes were upregulated in an immune-deficient mouse background (deficient in ROS generation) but not in a wildtype mouse background [153]. In the same study, high bacterial growth was observed in the spleen of immune-deficient mice, which may have been linked to a high demand for iron for the growth of STm [153]. It is well documented that salmochelin provides a defense against Lcn-2-mediated enterobactin chelation by the host during inflammation (mouse colitis model) [154]. Hence, it is possible that mice deficient in the respiratory burst rely on other antimicrobial mechanisms, such as increased Lcn-2 secretion, to limit *Salmonella* replication in phagocytic cells. Also, serum is considered an extremely low-iron compartment for pathogens in vertebrates [155]. Serum iron is mostly bound to transferrin, albumin and ferritins. In the presence of serum albumin, enterobactin is not considered an efficient iron chelator as it is rapidly cleared [156,157]. Hence, secretion of stealth siderophores (aerobactin, salmochelin and yersiniabactin) will be beneficial for NTS serovars during bacteremia.

The extracellular fatty acid-binding protein (ExFABP) of chickens has been identified as the chicken equivalent of Lcn-2 [158]. Its overall structure is similar to that of Lcn-2, yet it has a more extended, positively charged calyx (the binding pocket for ligands) with two binding specificities: one for siderophores and the other for lysophosphatidic acid [158]. Interestingly, the calyx of ExFABP accommodates one form of salmochelin, mono-glycosylated enterobactin (MGE/S1), which is not normally bound by Lcn-2 [158]. Lcn-2 cannot bind any salmochelin derivatives. So, the "chicken lipocalin-2" seems to be more potent than Lcn-2 in withholding iron during *Salmonella* infection. There is ample evidence for expression of ExFABP in the cecum in association with inflammation when day-old chicks are infected by NTS [159,160]. Chicken egg white, which has antibacterial properties against *Salmonella* in vitro, also contains ExFABP [145]. Adult chickens generate a more tolerogenic response towards non-typhoidal *Salmonella* (NTS) infection [161,162]. The inflammation induced in adult chickens is transient yet sufficient to contain the bacteria in the gut, while some may spread systemically to colonize the spleen and liver. Significant inflammation in the liver and spleen has not been observed in more mature birds, except for follicular lesions [161]. The lack of a marked inflammatory response towards NTS infection in adult chickens indicates that secretion of some of the stealth siderophores might not be as essential during the colonization process as it is in mammals.

#### **Highlights-2:**


#### **4. Opening the Pandora's Box of Gallus-Iron-***Salmonella* **Interaction**

Iron uptake is a primary virulence factor for *Salmonella*. But how each iron-uptake system partakes in pathogenesis in a chicken model still needs thorough investigation. This is intriguing because chickens are the major reservoir for *Salmonella*, yet this is the host about whose interactions with the bacterium we know the least. We want to highlight some of the important aspects of the Gallus-Iron-*Salmonella* interaction that need to be addressed in future experiments using chickens as a model. This will lay a platform for discussing the potential for developing therapeutics targeted at iron homeostasis in *Salmonella*.

#### *4.1. Nutritional Immunity Status in Chicken during Salmonella Infection*

Nutritional immunity is defined as the part of the host's innate immune response that withholds essential nutrients, including iron, from invading pathogens [163]. The interplay between iron-withholding mechanisms in chickens and iron homeostasis in *Salmonella* during pathogenesis is largely unknown. The interaction between siderophores and extracellular fatty acid-binding protein (ExFABP), which is part of the chicken iron-withholding strategy, has recently been well documented in eggs [164]. The study revealed that SEn has to synthesize stealth siderophores such as salmochelin to overcome the iron starvation induced by ExFABP (chelation of Fe<sup>3+</sup>-enterobactin) in egg white in vitro [164]. Ovotransferrin, synthesized by oviduct cells, is a transferrin family protein which transports iron into the growing embryo. It is a major constituent of egg albumen. The iron complex formation constant of ovotransferrin (10<sup>18</sup> for the C lobe and 10<sup>14</sup> for the N lobe) is low compared to those of most siderophores secreted by *Salmonella*; hence, iron restriction is not the major mechanism behind its antibacterial effect [165]. Egg yolk is the major iron store for growing embryos and almost all of its iron is bound to phosvitin. The affinity of phosvitin for iron is comparable to that of ovotransferrin (K<sub>f</sub> = 10<sup>18</sup>) [166]. So, *Salmonella* can rely on enterobactin (K<sub>f</sub> = 10<sup>49</sup>) rather than on the expression of stealth siderophores to hijack iron from phosvitin, unless ExFABP is expressed in sufficient amounts. However, there is currently no evidence that stealth siderophores are indeed expressed to counteract ExFABP-mediated nutritional immunity in chickens during colonization of various tissues in vivo.
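The affinity argument above can be made concrete with a small calculation. The sketch below is only illustrative: it uses the formation constants quoted in the text (as log10 values) and reports how many orders of magnitude separate enterobactin from each host iron-binding protein. The dictionary keys and the function name are hypothetical labels chosen for this example, and a comparison of formation constants alone ignores pH, concentrations and kinetics.

```python
# Formation constants quoted in the text, stored as log10(K_f).
# Keys are hypothetical labels introduced for this illustration.
LOG10_KF = {
    "ovotransferrin_C_lobe": 18,
    "ovotransferrin_N_lobe": 14,
    "phosvitin": 18,
    "enterobactin": 49,
}


def log10_advantage(chelator, competitor, table=LOG10_KF):
    """Orders of magnitude by which `chelator` exceeds `competitor` in K_f."""
    return table[chelator] - table[competitor]


for host_protein in ("ovotransferrin_C_lobe", "ovotransferrin_N_lobe", "phosvitin"):
    gap = log10_advantage("enterobactin", host_protein)
    print(f"enterobactin vs {host_protein}: 10^{gap}-fold higher K_f")
```

Under these quoted values, enterobactin exceeds ovotransferrin and phosvitin by at least 31 orders of magnitude, which is the quantitative basis for the statement that iron restriction by these proteins alone is unlikely to starve a siderophore-producing strain.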

Adaptation to an iron-deficiency status in humans plays an important role in resistance to bacterial and viral infections [167]. The response is also termed hypoferremia of inflammation or anemia of inflammation (AI). The key player in the hypoferremic response is hepcidin, the master regulator of iron metabolism in humans; in chickens, the response is believed to be hepcidin-independent (no hepcidin gene has been identified in the chicken genome to date) [168–170]. Inflammation caused by pathogenic invasion induces hepcidin secretion from the liver [171]. Hepcidin mediates ferroportin (Fpn) degradation, which inhibits iron efflux from macrophages and iron absorption from the intestines [172]. Fpn degradation also affects hepatocytes, increasing their ferritin levels and their ability to store accumulated iron. All these mechanisms lead to a significant drop in the serum iron level (hypoferremia). The low level of iron in the serum may limit bacteremia, yet current evidence suggests that the burden of NTS increases at sites of systemic infection, such as the spleen, during the hypoferremic response in mice [173–175]. Similarly, infection with chicken-specific serovars such as *Salmonella* Gallinarum and *Salmonella* Pullorum led to anemia of inflammation (AI) in chickens, with increased bacterial burden in the spleen and liver [176]. The increased *Salmonella* colonization of these "systemic sites" correlated with a spike in iron content in both mice and chickens. Such a spike of iron in the spleen can be partly due to the accelerated red blood cell turnover triggered by the inflammation-induced hypoferremic response. It has been documented that in mammals the half-life of red blood cells decreases dramatically during the AI response, leading to increased destruction of red blood cells by macrophages in the spleen and liver [177].
Since *Salmonella* can profit from the iron abundance (Fe<sup>2+</sup>/Fe<sup>3+</sup>) in hemophagocytic cells [132], it may preferentially rely on a specific iron-uptake system during AI. Supporting this hypothesis, the African lineage of invasive NTS (iNTS), *Salmonella* Typhimurium sequence type 313 (ST313), appears not to rely on salmochelin-mediated Fe<sup>3+</sup> uptake during systemic infection in mice [178]. There is a strong association of African lineages of iNTS strains with malaria parasites, which increase intracellular iron levels in macrophages [179]. An abundant Fe<sup>2+</sup> pool may have led ST313 to adapt by reducing the expression of stealth siderophore uptake systems, which are metabolically demanding to produce. There are currently no experimental data indicating the occurrence of AI in chickens during NTS infection. Broad host range serovars such as NTS strains mainly colonize the gastrointestinal tract without overt inflammation in adult birds. In such a situation, AI will not be profound. The virulence of NTS varies according to serovar, and chicken susceptibility depends on the genetic background and age of the birds. For example, NTS such as STm and SEn do cause systemic inflammation during enteric infection in young chickens. They are also capable of infecting the yolk sac in young birds, leading to the development of omphalitis. Yolk sac infections result in high mortality due to septicemia [180]. The role of the iron-uptake systems of NTS when the host undergoes hypoferremia needs to be investigated in a chicken model of infection. It will be especially important to examine iron distribution in compartments such as the blood, liver, spleen and gastrointestinal tract of chickens following infection. Research on iron-regulated gene expression combined with proteomic studies is needed to assess how each iron-uptake system is regulated in parallel to anemia of inflammation.

#### *4.2. Non-Canonical Function of Siderophores: Defense against Respiratory Burst and Immunomodulatory Function*

A growing body of evidence suggests that siderophores have other biological functions apart from Fe<sup>3+</sup> uptake [181]. One such alternative function is the defense against oxidative stress provided by catecholate siderophores [182–184]. The mechanism behind enterobactin-mediated defense against reactive oxygen species is currently being investigated. One proposed mechanism is that Ent-trimer (Figure 1C), the linearized molecule of enterobactin, participates in ROS scavenging by providing hydroxyl groups from the freed end of the backbone [185]. Generation of a robust respiratory burst is a key mechanism for killing *Salmonella* inside phagocytic cells [186]. In this regard, catecholate siderophore production will provide a survival advantage inside macrophages, which are a major replication niche during systemic dissemination and colonization. It is also plausible that synthesis of catecholate siderophores is important irrespective of iron limitation inside phagocytic cells, because of their diverse functions apart from iron scavenging. Some of the phagocytic cells in the chicken immune system do not mount a strong respiratory burst against *Salmonella*. The chicken lacks neutrophils yet has heterophils, which are functionally equivalent to neutrophils. Heterophils are unable to synthesize myeloperoxidases and rely on a repertoire of antimicrobial peptides, instead of the respiratory burst, to kill bacteria [114,187,188]. It has been documented that enterobactin inhibited myeloperoxidase activity and provided *E. coli* with a survival advantage in the inflamed gut [184]. Hence, it is important to investigate the interplay between enterobactin and heterophils during gastrointestinal colonization. Macrophages from *Salmonella*-resistant chicken lines (*SALI*) showed a more pronounced respiratory burst, while susceptible and inbred lines showed low and variable levels, respectively [189]. So, future experiments are warranted in chickens to investigate the role of siderophore-mediated defense against reactive oxygen and nitrogen species.

Holden et al., 2016, showed that siderophores produced by *Klebsiella pneumoniae* (enterobactin, salmochelin and yersiniabactin) can induce inflammation in lung epithelial tissue by stabilizing the hypoxia inducible factor-1α (HIF-1α) in C57BL/6 mice [190]. In a previous study, Holden et al., 2014, showed that enterobactin together with Lcn-2 can potentiate the induction of pro-inflammatory cytokines in cultured murine lung epithelial cells through chelation of iron [191]. These data strongly suggest that siderophores can induce inflammation in vivo. The inflammatory cytokines liberated will help to attract macrophages and dendritic cells to the infective loci and promote subsequent systemic spread. It will be interesting to investigate whether siderophores facilitate systemic infection by NTS serovars through the induction of inflammation at different colonization sites in chickens. Enterobactin-mediated iron chelation has been documented to polarize macrophages from the M1 phenotype to the M2 phenotype in bone-marrow-derived cells [192]. The M2 phenotype will safeguard intracellular pathogens such as *Salmonella* because these macrophages do not mount an oxidative killing mechanism [193]. Chickens have low numbers of resident macrophages in organs and rely on bone-marrow-derived monocytes migrating to the inflammatory loci for pathogen control [187]. The presence of distinct M1 (killing/towards inflammatory) and M2 (healing/towards adaptive response) phenotypes [194] of chicken macrophages is yet to be fully elucidated. Further studies are needed to unravel how *Salmonella* mediates iron homeostasis in infected chicken macrophages, as this microenvironment may impose a different iron status during polarization [195].

#### **5. Concluding Remarks**

Our understanding of iron in infection and immunity remains in its infancy due to the complex nature of the interaction and the ever-growing number of *Salmonella* serovars found in nature. Concerning chickens as a reservoir, it will be pivotal to understand how iron-regulated genes of *Salmonella* are expressed during pathogenesis in a chicken model of infection (Figure 3). Enhanced in situ detection of siderophore production during colonization of different chicken host niches will be key to understanding their role in the future. Experiments are needed to address how iron metabolism and homeostasis in the chicken are regulated in response to NTS infection. There are other metal uptake systems (Mn<sup>2+</sup>, Cu, Zn<sup>2+</sup>) apart from iron uptake whose role in NTS colonization is not well characterized in a chicken model. We believe that these efforts to understand the involvement of iron homeostasis in the pathogenesis of NTS will pave the way for the development of a successful therapeutic strategy in the poultry industry to limit chicken-associated *Salmonella* "spillovers" to humans and the environment.

**Author Contributions:** D.H.W. and W.K. designed the study, collected data and created the figures; D.H.W., W.K., B.A. and A.P.W. wrote the manuscript. Figures were created with BioRender.com (https://biorender.com/) under a subscription held by D.H.W. All authors have read and agreed to the published version of the manuscript.

**Funding:** Alberta Agriculture and Forestry: 2018F145R, Chicken Farmers of Saskatchewan: KOE2018002, Egg Farmers of Alberta: 2018F145R, Canadian Poultry Research Council: KOE2018002, Chicken Farmers of Canada: 1433-19.

**Acknowledgments:** We thank Po-King S. Lam, Colette Wheler and Arshud Dar for fruitful discussions. Work in the laboratory of W.K. was supported through funds from the Chicken Farmers of Canada (CFC), the Canadian Poultry Research Council (CPRC), the Saskatchewan Chicken Industry Development Fund (SCIDF), the Canadian Poultry Research Council/Agri-Food (Canada), Egg Farmers of Alberta (EFA), the Department of Alberta Agriculture and Forestry (AAF), the Saskatchewan Ministry of Agriculture—Agriculture Development Fund (ADF) and Egg Farmers of Canada (EFC). This report was published with the permission of the Director of VIDO-InterVac as journal series number 898.

**Conflicts of Interest:** The authors declare no conflict of interest.

#### **References**


© 2020 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).

## *Salmonella* **Pathogenicity Island 1 (SPI-1): The Evolution and Stabilization of a Core Genomic Type Three Secretion System**

#### **Nicole A. Lerminiaux 1,2, Keith D. MacKenzie 1,2 and Andrew D. S. Cameron 1,2,\***


Received: 5 March 2020; Accepted: 10 April 2020; Published: 16 April 2020

**Abstract:** *Salmonella* Pathogenicity Island 1 (SPI-1) encodes a type three secretion system (T3SS), effector proteins, and associated transcription factors that together enable invasion of epithelial cells in animal intestines. The horizontal acquisition of SPI-1 by the common ancestor of all *Salmonella* is considered a prime example of how gene islands potentiate the emergence of new pathogens with expanded niche ranges. However, the evolutionary history of SPI-1 has attracted little attention. Here, we apply phylogenetic comparisons across the family Enterobacteriaceae to examine the history of SPI-1, improving the resolution of its boundaries and unique architecture by identifying its composite gene modules. SPI-1 is located between the core genes *fhlA* and *mutS*, a hotspot for the gain and loss of horizontally acquired genes. Despite the plasticity of this locus, SPI-1 demonstrates stable residency of many tens of millions of years in a host genome, unlike short-lived homologous T3SS and effector islands including *Escherichia* ETT2, *Yersinia* YSA, *Pantoea* PSI-2, *Sodalis* SSR2, and *Chromobacterium* CPI-1. SPI-1 employs a unique series of regulatory switches, starting with the dedicated transcription factors HilC and HilD, and flowing through the central SPI-1 regulator HilA. HilA is shared with other T3SS, but HilC and HilD may have their evolutionary origins in *Salmonella*. The *hilA*, *hilC*, and *hilD* gene promoters are the most AT-rich DNA in SPI-1, placing them under tight control by the transcriptional repressor H-NS. In all *Salmonella* lineages, these three promoters resist amelioration towards the genomic average, ensuring strong repression by H-NS. Hence, early development of a robust and well-integrated regulatory network may explain the evolutionary stability of SPI-1 compared to T3SS gene islands in other species.

**Keywords:** genomic island; SPI-1; *Salmonella*; pathogenicity island; comparative genomics; type III secretion system

#### **1. Introduction**

Bacterial genomes are highly dynamic, able to gain and lose genes over short evolutionary times. Comparative genomics enables the differentiation of genes that are shared by all members of a species (core genes) from genes with variable distributions across a species (accessory genes). *Escherichia coli* is a prime example of genomic variability. Although gene content ranges from 3744 to 6844 open reading frames in individual *E. coli* isolates, only 1000 genes are shared by the over 21,000 whole genome sequences that represent this species and are currently available in GenBank [1]. As accessory genes constitute the bulk of an average bacterial genome, understanding their evolutionary histories and genetic dynamics is central to understanding bacterial functions and capabilities.
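The core/accessory distinction can be sketched in a few lines of code. The example below is a minimal illustration, assuming that genes have already been grouped into families and that each isolate is represented simply as a set of family identifiers. The isolate names and the families `geneX`, `geneY` and `geneZ` are hypothetical, and real pan-genome analyses first require ortholog clustering, which is not shown here.

```python
from collections import Counter


def split_core_accessory(genomes):
    """Split gene families into core (present in every isolate) and accessory.

    `genomes` maps an isolate name to the set of gene families it carries.
    """
    if not genomes:
        return set(), set()
    # Count in how many isolates each gene family occurs.
    counts = Counter(gene for genes in genomes.values() for gene in genes)
    n_isolates = len(genomes)
    core = {gene for gene, n in counts.items() if n == n_isolates}
    accessory = set(counts) - core
    return core, accessory


# Hypothetical toy data: three isolates sharing two gene families.
toy_genomes = {
    "isolate_A": {"mutS", "fhlA", "geneX"},
    "isolate_B": {"mutS", "fhlA", "geneY"},
    "isolate_C": {"mutS", "fhlA", "geneX", "geneZ"},
}

core, accessory = split_core_accessory(toy_genomes)
print(sorted(core))       # ['fhlA', 'mutS']
print(sorted(accessory))  # ['geneX', 'geneY', 'geneZ']
```

A strict "present in all isolates" definition is used here for simplicity. With thousands of draft genomes, a relaxed threshold is often preferred so that assembly gaps do not erode the core set.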

Accessory genes often become physically linked on contiguous segments of DNA, and these islands can range from two to dozens of genes. Horizontal gene transfer (HGT) is a driver of this coalescence because a physical connection between functionally linked genes increases the frequency of successful transfer [2–4]. HGT disconnects the phylogenetic history of a genomic island from that of its host genome [5], which can be reflected in islands having nucleotide and codon frequencies that differ from a genomic average [6–8]. Further signatures of HGT are that islands often insert adjacent to genetic elements that facilitate recombination, such as tRNA genes and mobile genetic elements [8–11].
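The compositional signature mentioned above can be illustrated with a simple sliding-window scan. This is a minimal sketch, not the method used in this study: the window size, step and deviation threshold are arbitrary assumptions, the toy sequence is synthetic, and only GC content (rather than codon usage or higher-order nucleotide frequencies) is considered.

```python
def gc_fraction(seq):
    """Fraction of G and C among unambiguous bases (A, C, G, T)."""
    seq = seq.upper()
    gc = sum(1 for base in seq if base in "GC")
    acgt = sum(1 for base in seq if base in "ACGT")
    return gc / acgt if acgt else 0.0


def atypical_windows(seq, window=1000, step=500, max_dev=0.2):
    """Return the genome-wide GC fraction and windows deviating from it.

    A window is reported as (start, end, gc) when its GC fraction differs
    from the whole-sequence average by more than `max_dev`.
    """
    genome_gc = gc_fraction(seq)
    hits = []
    for start in range(0, max(len(seq) - window + 1, 1), step):
        chunk = seq[start:start + window]
        gc = gc_fraction(chunk)
        if abs(gc - genome_gc) > max_dev:
            hits.append((start, start + len(chunk), round(gc, 3)))
    return genome_gc, hits


# Synthetic example: a 50% GC backbone with a 1 kb AT-only insert.
backbone = "GCAT" * 500          # 2000 bp, 50% GC
island = "AATT" * 250            # 1000 bp, 0% GC
toy_sequence = backbone + island + backbone

genome_gc, hits = atypical_windows(toy_sequence)
print(genome_gc)  # 0.4
print(hits)       # [(2000, 3000, 0.0)]
```

Only the window that falls entirely within the AT-only insert is flagged; windows that straddle its edges deviate less and stay below the threshold. Real islands are rarely this distinct, and their composition ameliorates towards the host average over time, which is why composition alone is a weak indicator for ancient acquisitions.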

Acquisition of genomic islands can provide new ecological functions and facilitate the invasion of new niches, enabling evolutionary leaps and even speciation [12]. Examples of biological functions that mobilize as genomic islands include sugar catabolism [13], plant symbioses [14], antibiotic resistance [15], virulence factors [6], and other pathogenicity determinants [8]. The contributions of "pathogenicity islands" to bacterial evolution and niche adaptation are best understood in the model pathogen *Salmonella*. A total of 24 pathogenicity islands have been identified in this genus, though not all of these islands have been experimentally validated to contribute to virulence phenotypes [16–18]. The largest is *Salmonella* Pathogenicity Island 1 (SPI-1) [19], which encodes a type three secretion system (T3SS) and type three secretion effectors (T3SEs) that mediate intracellular invasion of intestinal cells in animal hosts [20–23].

Acquisition of SPI-1 is a defining event in the evolution of *Salmonella*, occurring after divergence from the common ancestor with *Escherichia* over 100 million years ago [24–26]. The boundaries of SPI-1 were initially determined through DNA hybridization, then later through alignment with *E. coli* DNA sequence [21–23] (Figure 1A). Homologous T3SS and T3SE genes have been acquired by lineages of *E. coli*, but these are rare and a prominent example in *E. coli* O157:H7 is losing functionality [27]. Broader phylogenetic comparisons have identified homologous T3SS in facultative human pathogens such as *Yersinia, Chromobacterium,* and *Shigella*, and plant pathogens such as *Pantoea* (Table 1) [28–34]. These T3SSs are found only in select members of each genus. For example, PSI-2 appears to be undergoing frequent HGT and loss in the genus *Pantoea* [32]. Similarly, *Yersinia* YSA is present in *Yersinia enterocolitica* but is absent from *Yersinia pestis* [35]. The Mxi-Spa T3SS has entered the *Shigella* clade multiple times on several types of pINV plasmids [36–38].

Despite its prominence as the archetypal pathogenicity island and intensive research attention for several decades, little is known about why SPI-1 is uniquely stable among T3SS genomic islands in Enterobacteriaceae. Previous studies of T3SS genetic architecture have included cursory analyses of the evolutionary transitions that differentiate T3SS-containing gene islands in bacteria [27,28,31,32,39–42]. Pathogenicity islands can have mosaic structures arising from the merger of smaller islands that were acquired at different points in evolutionary history [8]. While several studies have suggested that SPI-1 has a mosaic structure [43,44], an in-depth evolutionary analysis of the island's history has not yet been conducted.

The most robust approach to find genomic islands and their boundaries is through comparative genomics. xenoGI is a recently released comparative genomics program that identifies genomic islands that are shared by a clade of bacteria as well as islands that are unique to certain strains by grouping genes by origin on a phylogenetic tree [45]. Using this locus-based approach, the objectives of our study were to: (1) identify the genomic islands of SPI-1, (2) examine the evolutionary history of the SPI-1 locus, and (3) evaluate the evolutionary history of SPI-1-encoded transcriptional regulators *hilA, hilC,* and *hilD*. Tracking how genomic islands originate, spread, and decay is key to determining how islands enable genomic diversification and adaptation. Connecting the evolutionary history of an infection-relevant pathogenicity island, SPI-1, to extensive experimental characterization of its molecular components helps develop and improve our understanding of pathogen emergence.

**Figure 1.** Comparative genomic analysis of *Salmonella* Pathogenicity Island 1 (SPI-1). (**A**) Alignment of *Salmonella enterica* serovar Typhimurium LT2 SPI-1 to the same locus in *Escherichia coli* K12. Genes are coloured by function based on annotations in GenBank and SalCom [46,47]. Grey bars represent sequence homology determined by blastx (minimum length 100 bp, e-value < 0.00001). (**B**) Alignment of the SPI-1 locus in *Salmonella, Escherichia, Citrobacter,* and *Klebsiella.* Whole-genome phylogeny was constructed with PATRIC. Gene colouring corresponds to different gene clusters identified by xenoGI [45]: *fhlA* (dark brown), *ygbA* (blue), *sitABCD* (orange), *avrA*–*invH* (green), *pig* genes and *pphB* (light blue/purple) and *mutS* (grey). *S. enterica* Senftenberg strain ATCC 43845 is included in this figure. Small open reading frames (white) between *sitD* and *hilA* in *S. enterica* Enteritidis, *S. enterica* Paratyphi, and *S. enterica* Senftenberg may be due to different annotation pipelines and will not be examined here. (**C**) The transcriptional response to the intra-macrophage environment by *S.* Typhimurium 4/74 reflects that SPI-1 is an island composed of transcriptionally-cohesive modules. Genes are coloured according to the fold differences quantified by RNA-seq in the macrophage vacuole compared to early stationary phase in Lennox broth; data from [48].

#### **2. Materials and Methods**

#### *2.1. Bacterial Strains*

Table 2 has a complete list of the 29 bacterial strains and accession numbers included in the xenoGI analysis. These strains were chosen to capture phylogenetic and GC content diversity. Unless otherwise specified, all gene and ortholog names are as annotated in *Salmonella enterica* serovar Typhimurium LT2 to avoid confusion over different annotations for the same genes. All genome sequence files and corresponding annotations were downloaded from NCBI Genbank. Several other strains that were used for genomic comparison but not included in the xenoGI analysis are *S.* Senftenberg strain N17-509 (accession: CP026379.1), *E. coli* ISC11 (accession: CBWP010000030.1), and *E. coli* O104:H11 strain RM14721 plasmid RM14721 (accession: NZ\_CP027106.1). Strains with genomes at the NCBI assembly level of "complete" (gapless chromosome) were selected over draft genomes because of their higher quality and the ability to distinguish independent units such as plasmids.

#### *2.2. Whole-Genome Phylogenetic Tree Building for xenoGI Input*

The phylogenetic tree was built using PATRIC 3.4.2 [49], which constructs trees based on coding sequence similarity. All bacterial strains listed in Table 1 were the focal group, except for *Pseudomonas aeruginosa*, which was used as the outgroup (Supplementary Figure S1).

#### *2.3. xenoGI Parameters*

Analysis was run using default parameters in xenoGI v2.2.0 [45] with the following exceptions: rootFocalClade was set to i26 (Figure S1) and evalueThresh was set to 1e-8. Computation took approximately 3 h for 29 strains. Bed files were generated from xenoGI scripts and were used to visualize islands in Integrated Genome Browser version 9.1.0 [50] and Easyfig version 2.2.3 [51]. Analysis of islands was done with the interactiveAnalysis.py script. Annotations used to determine gene function were obtained from the Genbank flat files (.gbff) and SalCom (Table 1) [46,47,52]. Raw analysis output from xenoGI is provided in Supplementary Table S1.

#### *2.4. AT Content Analysis*

AT content was calculated from the Genbank flat files (.gbff) for select strains in Table 1 using Geneious R11 (https://www.geneious.com) [53] and plotted as a heatmap. The average GC content for each nucleotide position was determined using a 100 base sliding window.
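As an illustration of the sliding-window calculation, the following minimal Python sketch computes the AT fraction in each window of a sequence. It is not the Geneious implementation used here; the function name `at_content_windows`, the `step` parameter, and the treatment of ambiguous bases as non-AT are assumptions made for the example. For unambiguous sequence, the AT fraction is one minus the GC fraction, so the same windows yield either measure.

```python
from typing import List


def at_content_windows(seq: str, window: int = 100, step: int = 1) -> List[float]:
    """Return the AT fraction of every `window`-base window, moving `step` bases at a time."""
    if window <= 0 or step <= 0:
        raise ValueError("window and step must be positive")
    seq = seq.upper()
    # Prefix sums of A/T counts let each window be scored in constant time.
    # Assumption: ambiguous bases such as N are counted as non-AT.
    prefix = [0]
    for base in seq:
        prefix.append(prefix[-1] + (1 if base in ("A", "T") else 0))
    # Sequences shorter than one window produce an empty list.
    return [
        (prefix[start + window] - prefix[start]) / window
        for start in range(0, len(seq) - window + 1, step)
    ]


# Toy sequence with a short window so the result is easy to check by eye.
print(at_content_windows("ATATGCGC", window=4, step=2))  # [1.0, 0.5, 0.0]
```

With the default `window=100` and `step=1`, the function returns one value per starting position, which is the form needed to draw a per-position heatmap.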

#### *2.5. Hil Phylogenies*

We used blastx 2.10.0+ [54] to search for *hil* gene homologs in other bacteria excluding the *Salmonella* clade (taxid:590) and filtering for hits that covered > 80% of the query sequence. *Salmonella enterica* serotype Typhimurium LT2 HilA (AAL21756.1), HilC (NP\_461788.1), and HilD (NP\_461796.1) were used as the query sequences. Because *S. enterica* Typhimurium LT2 HilC and HilD have high sequence similarity (36.4 % identity over 88 % query coverage, e-value < 3e-51, bit score = 168), we chose an 80% query cut-off filter when searching for HilC and HilD homologs to capture both regulators of interest and sufficient diversity. In other words, when searching in the *Salmonella* clade using HilC as a query with these settings, HilD would be returned as a hit and vice versa. The accession numbers included in the phylogenetic analysis for HilA are listed in Supplementary Table S2 and accession numbers for HilC/D are listed in Supplementary Table S3. *Salmonella bongori* NCTC 12419 HilA (CCC31553.1), HilC (WP\_000243993.1) and HilD (WP\_000432692.1) and *Salmonella enterica* serotype Typhi CT18 HilA (CAD05983.1) were included as representatives in the phylogenetic trees. Multiple protein alignments were done using MUSCLE [55] in MEGA 7 [56]. Maximum likelihood trees were constructed using the LG+G amino acid substitution model for HilA in Figure 6, the LG+G+I amino acid substitution model for HilA in Supplementary Figure S2 and the JTT+G amino acid substitution model for HilC/D in MEGA X [57]. The maximum-likelihood phylogenies were supported with 1000 bootstrap replicates. Tree visualization was done with iTOL [58].
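The query-coverage filter can be sketched as follows. This is a minimal example, assuming that each hit is available as a record carrying the aligned query coordinates and the query length (for instance, parsed from tabular BLAST output). The `Hit` record, its field names, and the toy hits are hypothetical and are not taken from the searches reported here. Coverage is computed per alignment, which may differ from a summary coverage value that merges several alignments to the same subject.

```python
from typing import Iterable, List, NamedTuple


class Hit(NamedTuple):
    """One alignment between a query and a subject (hypothetical record layout)."""
    query: str
    subject: str
    q_start: int  # 1-based start of the alignment on the query
    q_end: int    # 1-based end of the alignment on the query
    q_len: int    # full length of the query, in the same units


def query_coverage(hit: Hit) -> float:
    """Fraction of the query spanned by this single alignment."""
    start, end = sorted((hit.q_start, hit.q_end))
    return (end - start + 1) / hit.q_len


def filter_by_coverage(hits: Iterable[Hit], min_coverage: float = 0.80) -> List[Hit]:
    """Keep hits whose alignment covers more than `min_coverage` of the query."""
    return [hit for hit in hits if query_coverage(hit) > min_coverage]


# Toy hits against a made-up 300-residue query.
hits = [
    Hit("HilC", "subject_1", 1, 290, 300),    # about 97% coverage: kept
    Hit("HilC", "subject_2", 120, 260, 300),  # 47% coverage: dropped
]
print([hit.subject for hit in filter_by_coverage(hits)])  # ['subject_1']
```

The comparison is strict, so a hit covering exactly 80% of the query is excluded, matching the "> 80%" criterion stated above.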






#### **3. Results**

#### *3.1. SPI-1 is a Mosaic of Gene Islands*

We conducted a fine scale analysis of gene content and architecture at the *fhlA*/-/*mutS* locus using xenoGI, a program based explicitly on phylogenetic comparisons to identify islands. Briefly, xenoGI sorts genes into families, then sorts families into islands based on synteny or location within a genome, while also accounting for amino acid similarity. Many genomic island-finding programs exist [reviewed in [62,63]], but unlike other methodologies, xenoGI requires a phylogenetic tree for input. The phylogeny is used to determine which islands are shared by a clade and to identify at which branch they were acquired [45].
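To make the role of the input tree concrete, the sketch below places an island on the most recent common ancestor of the strains that carry it, which is the simplest inference under a single-gain, no-loss assumption. It is an illustration only and does not reproduce the xenoGI algorithm [45]; the tree encoding, the node names, and the toy topology are all hypothetical.

```python
from typing import Dict, List, Set

# Assumed encoding: internal node name -> list of child names; leaves have no entry.
Tree = Dict[str, List[str]]


def leaves_under(tree: Tree, node: str) -> Set[str]:
    """Collect the names of all leaves (strains) descending from `node`."""
    children = tree.get(node, [])
    if not children:
        return {node}
    found: Set[str] = set()
    for child in children:
        found |= leaves_under(tree, child)
    return found


def acquisition_node(tree: Tree, root: str, carriers: Set[str]) -> str:
    """Return the deepest node whose descendants include every carrier strain."""
    if not carriers:
        raise ValueError("at least one carrier strain is required")
    if not carriers <= leaves_under(tree, root):
        raise ValueError("some carrier strains are not in the tree")
    node = root
    while True:
        for child in tree.get(node, []):
            if carriers <= leaves_under(tree, child):
                node = child  # all carriers sit below this child: descend
                break
        else:
            return node  # no single child contains all carriers


# Toy topology for illustration only; it is not the phylogeny in Figure 1B.
toy_tree = {
    "root": ["n1", "Klebsiella"],
    "n1": ["n2", "Escherichia"],
    "n2": ["S_enterica", "S_bongori"],
}
print(acquisition_node(toy_tree, "root", {"S_enterica", "S_bongori"}))  # n2
```

An island carried by strains on both sides of the root would be placed at the root under this rule, which is one reason real methods also consider losses and independent gains.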

The core chromosomal genes that have been previously defined as the boundaries of the SPI-1 locus are *fhlA*, encoding the formate hydrogenlyase transcriptional activator, and *mutS*, encoding a DNA mismatch repair protein, as noted previously from direct comparison of *E. coli* K-12 to *S.* Typhimurium [23,64] (Figure 1A). Comparing this locus in *Salmonella* to homologs in closely related genera *Citrobacter*, *Escherichia*, *Shigella*, *Enterobacter* and *Klebsiella* confirmed that *fhlA* and *mutS* define the boundaries of a plastic locus across these Enterobacteriaceae. In the reference *Salmonella* Typhimurium genome, xenoGI divided SPI-1 into three gene islands, which are coloured in Figure 1B: *sitABCD* (orange), *avrA-invH* (green), and *pigA*-STM2908 (purple & blue). The distinct nature of these three islands is reflected in their independent transcriptional output in infection-relevant conditions [46,48] (Figure 1C). Each of these islands is examined separately below.

#### *3.2. A Cohesive SPI-1 Gene Set is Highly Conserved in* Salmonella

To evaluate SPI-1 conservation across the genus *Salmonella*, we selected representative strains from both species (*S. enterica* and *S. bongori*), three *S. enterica* subspecies (*arizonae*, *diarizonae*, and *enterica*), and four serotypes of model pathogens (Typhimurium, Typhi, Paratyphi, and Enteritidis). This selection includes the deepest branches within the genus [65] and a range of genome sizes (4.46-5.26 Mbp). Figure 1B shows that the *avrA-invH* island (green), which encodes the T3SS and associated effectors (T3SE), is conserved across all eight *Salmonella* genomes but is absent from this locus in other Enterobacteriaceae.

A single gene in the *avrA-invH* island is not conserved across the eight representative *Salmonella* genomes. The effector gene *avrA* is absent in *S. enterica* Typhi, *S. enterica* Paratyphi and *S. enterica arizonae* (Figure 1B). The frequent loss of *avrA* in independent lineages of *Salmonella* is illustrated more comprehensively in the analysis of 445 *Salmonella* strains by Worley and colleagues [66]. We note that *avrA* expression is unchanged during macrophage infection (Figure 1C), suggesting its dispensability is reflected in low integration into the SPI-1 regulatory network.

#### *3.3. The* ygbA *and* sitABCD *Islands Predate Core SPI-1*

The *sitABCD* island identified by xenoGI is conserved across *Salmonella*, *Citrobacter*, and *Klebsiella* (orange genes in Figure 1B). Hypothetical protein *ygbA* at this locus was classified into two islands: one in *Salmonella, Citrobacter, Escherichia* and *Enterobacter lignolyticus* and one in *Klebsiella* where it has a reverse orientation at the other side of the *sitABCD* operon (Figure 1B, blue). The simplest interpretation of this phylogeny is that the common ancestor of these Enterobacteriaceae had *ygbA* and *sitABCD* at this locus, but the *sitABCD* operon was subsequently lost from the *Escherichia*/*Shigella* clade and some minor reorganization has placed *ygbA* in two alternate positions either adjacent to *sitD* or *sitA*. Furthermore, the *ygbA* and *sitABCD* islands predate the insertion of the *avrA-invH* (T3SS) island in the *Salmonella* ancestor.

To test whether *E. coli* strains might encode the *sitABCD* operon at other genomic locations, we used blastx to search for orthologs in whole genome sequences. *Shigella flexneri* strain 2a 301 and *E. coli* IAI39 have *ygbA* at the *fhlA*/*-*/*mutS* locus and *sitABCD* at another location in their genome. We also examined a small number of species from sister families in the order Enterobacterales. The *sitABCD* operon is present in these bacterial families located either at chromosomal locations separate from *mutS* in *Sodalis glossinidius* and *Sodalis praecaptivus* (Pectobacteriaceae) and *Yersinia pestis* (Yersiniaceae), or on a plasmid in *Pantoea ananatis* (Erwiniaceae) (Figure 2). These species all lack *ygbA* homologs.

**Figure 2.** Genomic positions of *ygbA* and *sitABCD* relative to *mutS* in select Enterobacterales. The whole-genome phylogeny was constructed by PATRIC [49]. Variations of the *ygbA* and *sitABCD* gene clusters are coloured for *ygbA* (blue) and *sitABCD* (orange). Chromosome lengths are drawn to scale.

In *E. coli* K12, *ygbA* is the only gene located between the core genes *fhlA* and *mutS* (Figure 1A). However, the broader phylogenetic comparison in Figure 2 shows the presence of *ygbA* and *sitABCD* at this locus in the deepest-branching *E. coli* considered here (strain ISC11). This evidence further supports the supposition that many lineages of *E. coli* have lost *sitABCD* while in others the operon has relocated.

#### *3.4. The Highly Variable* mutS*-Proximal Region*

The region adjacent to the *mutS* gene promoter is highly variable in gene content, demonstrating gene gain and loss in all Enterobacteriaceae lineages examined here (Figure 1B). In the reference *Salmonella* genome, *S. enterica* Typhimurium, this region encodes several pathogenicity island genes (*pig* genes) and mobile elements (Figure 3). Several of the *pig* genes are regulated by SPI-1-encoded HilC, but they do not seem to contribute to *Salmonella*'s virulence phenotype and their specific functions remain unknown [43]. Reconstructing the temporal events in gene gain and loss in the representative *Salmonella* suggests an initial gain of *pigC, pigD, pphB* (purple) and a transposase (pink). This was followed by gain of *pigA, pigB* and an insertion element (blue) then loss of these elements by *S.* Enteritidis and *S.* Paratyphi. The *S.* Enteritidis and *S.* Paratyphi lineage acquired four hypothetical proteins (red) (Figure 3). No *pig* genes are found at the *fhlA*/-/*mutS* locus in other Enterobacteriaceae. *pphB* encodes a serine phosphatase that is ancestral at this locus but is located on the opposing side of *mutS* in *E. coli* (Figure 1A).

**Figure 3.** Alignment of the SPI-1 downstream boundary across the *Salmonella* clade. Whole-genome phylogeny was constructed with PATRIC [49]. Gene colouring corresponds to different gene clusters identified by xenoGI [45]: *invAEGFH* (green), *pigAB* and insertion element (blue), *pigCD* (purple), transposase (pink), *pphB* (purple), hypothetical coding sequences (red, grey). *S. enterica* Senftenberg strain ATCC 43845 is included in this figure. Grey bars represent sequence homology determined by blastx (min length 100 bp, e-value < 0.00001). IS, insertion element; Tr, transposase.

#### *3.5. Decay and Loss of SPI-1*

The core SPI-1 island is exceptional among related T3SS-T3SE systems in its long-term residency and stability in a bacterial clade. Nevertheless, SPI-1 can be lost, as observed in *S. enterica* serotype Senftenberg [67–72]. *S.* Senftenberg ATCC 43845 was included in the analysis presented in Figure 1B. However, the presence of pseudogenes in the *avrA-invH* island, including the SPI-1 regulator *hilD*, indicates genetic decay that is expected to cause loss of invasion functions.

To understand the genetic changes associated with the loss of SPI-1, we aligned *S.* Senftenberg ATCC 43845 to *S.* Senftenberg strain N17-509, a strain that has lost SPI-1 [72]. *S.* Senftenberg N17-509 has lost the entire SPI-1 core region (*avrA-invH*) but retains homologs to the *ygbA*, *sitABCD* and *pig* genes (Figure 4). Another gene island containing mobile elements, a toxin-antitoxin system, and a restriction endonuclease has inserted between the *pig* genes and *mutS* (Figure 4).

**Figure 4.** Alignment of the SPI-1 locus in *S. enterica* Typhimurium LT2, *S. enterica* Senftenberg strains ATCC 43845 and N17-509 and *Escherichia coli* K12. Gene colouring corresponds to different gene clusters identified by xenoGI [45]: *fhlA* (dark brown), *ygbA* (blue), *sitABCD* (orange), *avrA – invH* (green), *pig* genes and *pphB* (light blue/purple) and *mutS* (grey). Genomic island in *S.* Senftenberg N17-509 with no homology to SPI-1 is coloured beige. Grey bars represent sequence homology determined by blastx (min length 100 bp, e-value < 0.00001).

#### *3.6. The fhlA*/*-*/*mutS Locus is a Hotspot for Island Acquisition*

Our analysis shows that SPI-1 inserted at a highly plastic locus that is a hotspot for the acquisition of small and large gene islands. In *S. enterica* subsp. *diarizonae* and subsp. *arizonae*, a second island exists in each strain between *invH* and *mutS* (blue and gold, respectively, in Figure 1B). These islands consist largely of hypothetical proteins and encode mobile elements such as integrases (*S. enterica* subsp. *arizonae*) and transposases (*S. enterica* subsp. *diarizonae*).

Various *Escherichia* species and strains were included in the xenoGI analysis to capture lineage diversity (Table 2). *S. flexneri* has two integrases encoded between *ygbA* and *mutS,* but no other genes. Several *Escherichia* strains (*E. albertii, E. fergusonii* and *E. coli* IAI39) have a single transcriptional regulator *modE* located between *fhlA* and *ygbA*. *E. coli* O104:H4 2011C-3493 has a hypothetical protein between *ygbA* and *mutS*. *E. coli* K12 and *E. coli* O157:H7 are identical at this locus. Of the *Escherichia* species included in the xenoGI analysis, only *Escherichia fergusonii* had islands larger than single genes inserted at the *fhlA*/*-*/*mutS* locus; these contain genes for metabolic functions and sugar transport.

Representatives of two *Klebsiella* species, *K. oxytoca* and *K. pneumoniae*, were included in xenoGI analysis. Both strains have multiple gene islands inserted between *fhlA* and *mutS*, and two islands are shared by both species (Figure 1B, green and purple). The shared islands encode genes for an iron transport system and homologs of the sugar translocation proteins EIIB and EIIC in the phosphotransferase system, suggesting that these islands enable nutrient acquisition in *Klebsiella*.

#### *3.7. AT Nucleotide Content and the Evolution of Transcriptional Control*

A paradigm in bacterial genomics is that AT-rich DNA is a signature of horizontally acquired genes [73]. SPI-1 has been resident in *Salmonella* for many tens of millions of years, yet the island has a high AT content that has resisted amelioration to match the nucleotide composition of the core genome [65]. The AT-richness of SPI-1 is maintained by a higher GC-to-AT substitution rate, in contrast to the higher AT-to-GC substitution rate in core genes [65]. Protein-DNA interactions in gene regulatory networks may explain nucleotide frequencies that resist amelioration to genomic averages [65]. In bacteria, several global transcription factors favour the nucleotide composition and physical properties of AT-rich DNA [73,74]. One such protein is the nucleoid-associated protein H-NS, a global repressor of gene expression [75,76]. The strong repression of SPI-1 gene expression by H-NS suggests that SPI-1 resists amelioration to remain within the H-NS regulon.
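The amelioration argument can be made concrete with a deliberately simplified two-state model, in which each site is either AT or GC and switches state at a constant per-site rate. This model is an assumption made here purely for illustration and is not the analysis in [65]. Under it, the AT fraction converges to the GC-to-AT rate divided by the sum of the two rates, so a region with a relatively higher GC-to-AT rate remains AT-rich regardless of its starting composition. The rate values below are arbitrary.

```python
def equilibrium_at_fraction(gc_to_at: float, at_to_gc: float) -> float:
    """Equilibrium AT fraction of a two-state (AT/GC) substitution model."""
    if gc_to_at < 0 or at_to_gc < 0 or gc_to_at + at_to_gc == 0:
        raise ValueError("rates must be non-negative and not both zero")
    return gc_to_at / (gc_to_at + at_to_gc)


def simulate_at_fraction(start: float, gc_to_at: float, at_to_gc: float, steps: int) -> float:
    """Iterate the model; rates are per-site probabilities per step and should be small."""
    at = start
    for _ in range(steps):
        # GC sites gained as AT minus AT sites lost to GC in this step.
        at += gc_to_at * (1.0 - at) - at_to_gc * at
    return at


# Arbitrary rates with a 3:1 bias towards GC-to-AT changes.
print(equilibrium_at_fraction(3.0, 1.0))  # 0.75
```

Starting `simulate_at_fraction` from any initial AT fraction with rates in the same 3:1 ratio approaches the same value, which is the sense in which a biased substitution pattern can hold a region away from the genomic average.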

We compared AT content at SPI-1 in the eight representative *Salmonella* genomes, which confirmed the expectation that all eight lineages have maintained an amazingly consistent pattern of high AT content (Supplementary Figure S3). This heatmap of nucleotide composition shows the level of conservation and resilience of the high AT content of SPI-1. Moreover, it illustrates how gene islands can be internally consistent in nucleotide composition and can differ dramatically from neighbouring islands, which is especially apparent in the cases of the GC-rich islands in *Klebsiella* (Supplementary Figure S3).

Two regions in SPI-1 are very AT-rich: the *hilD-hilA* and *hilC* loci (Figure 5A). This pattern fits particularly well with the model that AT content is selected to maintain membership in the H-NS regulon. Transcriptional activation of SPI-1 begins with antagonism of H-NS repression by the transcriptional activators HilC and HilD [77–79] (Figure 5B). When active in DNA binding, HilC and HilD activate their own promoters to create a feed-forward signal that counteracts H-NS silencing of the *hilA* promoter [80]. HilA, in turn, activates transcription of the regulator *invF* and acts directly at the T3SS and T3SE gene promoters.

#### *3.8. Evolution of Transcriptional Control: Acquisition of* hilA

HilA, a DNA binding protein in the OmpR/ToxR family of transcription factors, is the master activator of SPI-1 transcription. HilA binds to the *invF* and *prgH* promoters, triggering the activation of T3SS and T3SE genes [81–83]. A recent survey of T3SS in ~20,000 bacterial genomes classified SPI-1 according to gene conservation and synteny into what the authors termed category II [31]. Category II T3SS are scattered among Gammaproteobacteria and Betaproteobacteria, but *hilA* is missing from most genomes with a category II T3SS [31], suggesting this regulatory module is a relatively recent addition within this family of homologous T3SSs. We examined the evolutionary connection between *hilA*, T3SS, and T3SE genes in Proteobacteria (Figure 6, Supplementary Figure S2). The HilA phylogeny does not recapitulate organismal phylogeny, consistent with the role of HGT in distributing T3SS across diverse strains and species. The dynamic architecture of SPI-1 homologs is further illustrated by aligning genes according to the largest conserved island, *spaS-invF*. This alignment helps illustrate how the T3SS and T3SE components are genetically divisible into distinct islands, *orgCBA-prgKJIH*, *hilA-iagB*, *iacP-sipADCB-sicA*, and *spaSRQPO-invJICBAEGF* (Figure 6). These constituent islands are illustrated at the bottom of Figure 6, and correspond approximately to the microsynteny blocks (MSBs) described by Hu and colleagues [31]: MSB1+*orgC*, MSB5+*iacP*/*sipA*, and MSB3+MSB4+MSB2, the last of which is the *spaSRQPO*/*invJICBAEGF* island. The shuffled orientation and composition of these blocks mean that no two genera possess the same island architecture.

**Figure 5.** Nucleotide composition and gene regulation in SPI-1. (**A**) AT content schematic of *hil* gene promoters across eight *Salmonella* strains. AT content was overlaid as a heatmap on the regions surrounding *hilC*, *hilD* and *hilA*. Heatmap values were generated with a 100 base sliding window using Geneious R11 [53]. (**B**) Core elements of the SPI-1 transcriptional regulatory network.

**Figure 6.** HilA phylogeny and genetic context for *hilA* in T3SS islands. Best blastx hits covering at least 80% of the query (*S. enterica* Typhimurium LT2 HilA) were identified and one representative protein sequence was selected from each species to capture phylogenetic diversity. Protein sequences were aligned using MUSCLE [55] and phylogeny was built using a maximum-likelihood model LG+G with 1000 bootstrap replicates [57]. *hilA* is illustrated in bold with white and red hashes and genes are coloured according to function (see Figure 1 legend). Conserved gene modules in T3SSs with *hilA* are identified at the bottom of the figure. Two strains (bottom) were included in xenoGI analysis but not the phylogenetic tree. Both have a T3SS but are either missing *hilA* (*E. coli* O157:H7 Sakai) or have a truncated *hilA* (*S. praecaptivus*). For a more detailed reconstruction of *hilA* evolution, see Supplementary Figure S2.

*hilA* is contiguous with T3SS genes in 26 of 30 representative genomes we examined. In the other four genomes (three strains of *Escherichia* and one strain of *Chromobacterium vaccinii*), *hilA* is either located alone on the chromosome or is absent from the genome (Figure 6). In *E. coli* O42 and *E. coli* O157:H7, an alternate transcription factor, *ygeH*, occupies the approximate location of *hilA*, adjacent to the T3SS genes (Figure 6). YgeH has low (29%) similarity to HilA, yet the *E. coli* O42 YgeH can functionally replace HilA in *Salmonella* and, like HilA, its expression is regulated by H-NS [84,85].

The *E. coli* O157:H7 ETT2 is undergoing mutational attrition and becoming a cryptic gene island [27,84]. The accumulation of pseudogenes is accompanied by a loss of *hilA*, and the *ygeH* ortholog is non-functional as a transcription factor [84]. *Sodalis* presents another genus where the evolutionary stages in the decline of T3SS can be observed through comparative genomics. Members of this genus encode two T3SS homologs of SPI-1: SSR1 and SSR2. In *Sodalis praecaptivus*, SSR1 lacks *hilA* and SSR2 encodes a truncated *hilA* (Figure 6). In the endosymbiont *Sodalis glossinidius*, no *hilA* is present in the genome, and SSR1 and SSR2 are accumulating pseudogenes, consistent with a loss of island function due to the host occupying a highly specialized and obligate niche in tsetse flies.

#### *3.9. Evolution of Transcriptional Control: Addition of the HilC*/*D Paralogs*

Regulation of SPI-1 by the AraC-family proteins HilC and HilD appears to be unique because related T3SS and T3SE islands in Enterobacteriaceae do not include *hilC* or *hilD* homologs (Figure 6, Table 1). We reconstructed the evolutionary history of *hilC* and *hilD* by searching for all homologs in GenBank that are similar across 80% or more of the length of each query protein. With this search parameter, using HilC as a query recovers HilD, and vice versa, because the two proteins are closely related (36.4 % identity over 88 % query coverage, e-value < 3e-51). *hilC* and *hilD* are core elements of SPI-1 (Figure 1B), and so a GenBank search for homologs was conducted after excluding all *Salmonella* genomes from the search. Seventy-two non-redundant proteins were identified when either HilC or HilD served as a query sequence, whereas each individual query identified four unique proteins. Although HilC and HilD are ubiquitous in *Salmonella* (except for the case of *S*. Senftenberg that has lost SPI-1), searching all non-*Salmonella* genome sequences in GenBank revealed that HilC and HilD homologs are rare and sporadically distributed, occurring in only five genera outside *Salmonella* (Figure 7, Supplementary Figure S4). Moreover, each genome outside of *Salmonella* has a single HilC/D homolog, highlighting another unique feature of the SPI-1 regulatory network.

A HilC/D phylogeny has low bootstrap support that prevents ordering the deepest branches (illustrated as a polytomy in Figure 7). Nevertheless, the ubiquity of HilC and HilD in *Salmonella* compared to the sporadic distribution of homologs is consistent with the two genes arising from gene duplication in *Salmonella*. The phylogeny is consistent with a series of horizontal gene transfer events that spread HilD to *Edwardsiella*, *Enterobacter*, and *Escherichia*. The largest number of homologs was detected in incomplete *Escherichia* genomes in GenBank.

In most *Escherichia*, HilC/D homologs are located adjacent to type IV pilus genes and plasmid-specific genes. For example, the complete genome of *E. coli* O104:H11 strain RM14721 includes a 106 kb plasmid, and a HilC/D homolog labeled as CofS is found on the plasmid [86] (Figure 7). This protein is part of the *cof* operon, which encodes a type 4b pilus colonization factor antigen in enterotoxigenic *E. coli* and is used to attach to host cells [87]. A homolog of HilC/D in *Enterobacter lignolyticus* was also found located near type IV pilus genes, albeit on a chromosome. A small number of HilC/D homologs were found in *Citrobacter*, *Edwardsiella*, and *Hafnia* (Figure 7). Most of these strains have complete genome sequence available, and the HilC/D homolog appears to be in the same chromosomal location across these genera. Of these genomes, only one *C. freundii* strain has a T3SS, which is located at a different position in the chromosome from the HilC/D homolog. Based on GenBank annotations, the T3SS genes are converting to pseudogenes as mutations accumulate, suggesting that this is most likely a non-functional gene island.

**Figure 7.** HilCD phylogeny. Nodes with bootstrap support below 0.70 are condensed to polytomies, and branches are coloured by genus. The number of strains in each collapsed node is indicated on the figure. *S. bongori* and *S. enterica* Typhimurium HilC and HilD sequences were included as representatives from the *Salmonella* clade. The top 100 best blastx hits covering at least 80% of the query for *S. enterica* Typhimurium LT2 HilC and HilD sequences were aligned using MUSCLE [55] and the phylogeny was inferred using a maximum-likelihood model JTT+G with 1000 bootstrap replicates [57]. For the full unrooted phylogeny, see Supplementary Figure S4.
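Condensing weakly supported nodes into polytomies can be sketched as below: any internal node whose bootstrap proportion falls under the threshold is removed and its children are attached directly to its parent. The `Node` class and the toy tree are hypothetical and stand in for whatever tree representation the visualization software uses; this is not the iTOL procedure itself.

```python
from typing import List, Optional


class Node:
    """Minimal tree node (hypothetical; real tree libraries differ)."""

    def __init__(self, name: str, support: Optional[float] = None,
                 children: Optional[List["Node"]] = None) -> None:
        self.name = name
        self.support = support  # bootstrap proportion; None for leaves and the root
        self.children = children or []


def collapse_low_support(node: Node, threshold: float = 0.70) -> Node:
    """Return a copy in which internal nodes below `threshold` become polytomies."""
    kept: List[Node] = []
    for child in node.children:
        collapsed = collapse_low_support(child, threshold)
        weak = (
            bool(collapsed.children)
            and collapsed.support is not None
            and collapsed.support < threshold
        )
        if weak:
            kept.extend(collapsed.children)  # attach grandchildren directly to this node
        else:
            kept.append(collapsed)
    # The node passed in by the caller (the root) is never removed itself.
    return Node(node.name, node.support, kept)


# Toy tree: n1 is weakly supported (0.55), n2 is well supported (0.95).
tree = Node("root", None, [
    Node("n1", 0.55, [Node("A"), Node("B")]),
    Node("n2", 0.95, [Node("C"), Node("D")]),
])
print([child.name for child in collapse_low_support(tree).children])  # ['A', 'B', 'n2']
```

The function builds a new tree rather than editing in place, so the original tree with its full branching order remains available.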

#### **4. Discussion**

In bacteria, the forces of horizontal gene transfer and recombination have significant impacts on genome content and organization, accelerating evolution and community diversity. The consequences are etched across bacterial genomes in the form of vast numbers of accessory genes and gene islands, each with distinct phylogenetic histories. Genomic islands are born from HGT: physical linkage between cooperative genes is beneficial for the simultaneous transfer of the genetic information required to assemble a cellular machine or perform a metabolic process. For example, contiguity increases the odds of successful transfer of large islands, like the genomic island replacement and serogroup conversion that drove the emergence of pathogenic *Vibrio cholerae* O139 [88].

Specific genomic regions can be hotspots for the gain and loss of islands. The plasticity of a genomic region is often attributed to the presence of DNA sequences that facilitate the integration or excision of mobile genetic elements. Prominent examples include sequences that facilitate intramolecular and intermolecular recombination such as direct repeat sequences, inverted repeat sequences, bacteriophage attachment sites, and the 3′ end of tRNA genes [9,42,89]. SPI-1 is located in a hotspot for island insertion, as revealed in our comparative analysis across multiple genera of Enterobacteriaceae (Figure 1B). However, the genetic features such as tRNA sites, phage integration sites and repeat sequence elements associated with high recombination rates are not apparent at the SPI-1 locus [23].

Even in the absence of recognizable DNA elements that facilitate recombination, *mutS* and the surrounding region are known to have high rates of horizontal exchange relative to other regions in *Salmonella* and *E. coli* genomes [43,90–97]. A defective *mutS* results in higher recombination rates, which can be beneficial in times of stress but can also have long-term negative consequences on the cell [93,94,98]. As a result of higher recombination rates, there is a more frequent changing of *mutS* alleles, which increases the likelihood that a defective *mutS* allele will be rescued by a new, functioning allele [90,93]. As the *mutS* region favours horizontal gene transfer, *Salmonella* may have concurrently acquired a *mutS* allele and a T3SS that rose to dominance in the population.

The sporadic distribution of gene islands related to SPI-1 outside the genus *Salmonella* suggests that many independent gain and loss events have scattered variant SPI-1-like islands across Proteobacteria (Figure 6). In *Shigella, Pantoea*, and *Yersinia*, T3SSs are located on plasmids, which helps explain their sporadic distribution [32,35–38,59,99]. SPI-1 differs in GC content, genetic organization and phylogenetic history from the plasmid-borne islands in *Yersinia* and *Shigella*, indicating that these plasmids were not the original source of SPI-1 in *Salmonella* [64]. Alternatively, it is unlikely that SPI-1 represents a progenitor that has been subdivided after transfer from a *Salmonella* donor to other Proteobacteria (Figure 6). *Salmonella* HilA proteins form a distinct, highly-supported clade, providing further evidence that *Salmonella* is not the ancestral source of HilA and linked T3SS islands in other Proteobacteria.

Even archetypal islands like SPI-1 are themselves mosaics composed of smaller islands. For example, as part of the validation of xenoGI, Bush and colleagues evaluated the SPI-2 locus in *Salmonella* and found that it is composed of several smaller gene islands. SPI-2 is composed of a T3SS gene island and a tetrathionate reductase operon island [45]. These findings are consistent with a previous analysis of genetic flux in *Salmonella* that identified the *ttr* gene island as a more ancient acquisition than the SPI-2 T3SS gene island [100]. Similarly, three phylogenetically distinct T3SS islands –including homologs of SPI-1 and SPI-2– are distributed sporadically across the *Pantoea* genus, and each island is a composite of reassorting subcomponents [31,32].

During macrophage infection, the distinct transcriptional profiles of gene modules in and around SPI-1 confirm the phylogenetic evidence for the locus being an archipelago of smaller islands with distinct evolutionary histories and regulatory programs (Figure 1C). In the macrophage vacuole, transcription of *ygbA* and *sitABCD* is very high, whereas transcription of genes in the *avrA*-*invH* island is repressed. In contrast, transcription of genes STM2901-STM2908 is largely unchanged between laboratory and intracellular environments [48] (Figure 1C). Thus, the genes that contribute to shared cellular functions, such as the T3SS or metal ion import, are co-regulated. Additional examples of island-specific transcriptional responses in and around SPI-1 can be visualized in the *Salmonella* compendium of transcriptomic data [46], further confirming that the boundaries between differentially regulated transcriptional units align well with the xenoGI assignment of gene islands based on evolutionary histories.

Another aspect of the mosaicism within a genomic island is the gain and loss of transcription factors. This is pertinent in the study of SPI-1 because it encodes an unusually high number of transcription factors (five), at least four of which are required to activate transcription of SPI-1 genes. HilA is a central regulator of SPI-1 transcription, but most homologous T3SSs do not have HilA (Table 1) [31]. Some lineages in Figure 6 have lost HilA, but this is always correlated with mutational attrition of the T3SS and T3SE genes, suggesting HilA is dispensable only after its regulatory targets have lost their biological function. Although HilA is a diffusible trans-acting factor, it is almost always contiguous with the T3SS genes it regulates, further highlighting how selection for effective HGT builds and maintains the composition of genomic islands.

Xenogeneic silencing of transcription at genomic islands by H-NS and counter-silencing by transcriptional activators like HilC and HilD have been thoroughly reviewed elsewhere [73,79]. Scientists have observed that winged helix-turn-helix proteins like HilA are effective activators of horizontally-acquired AT-rich DNA because their low specificity for DNA binding sites enables competition and displacement of H-NS across broad regions of gene promoters [101]. Less well understood are the mechanisms and evolutionary steps through which local-acting dedicated transcription factors, like HilA, HilC, and HilD, are gained and integrated into a functionally cohesive island. We posit that the unusually long-term stability of SPI-1 in *Salmonella* arises from the fine-tuning of transcriptional activation only when ecologically appropriate. Further, we suspect that integration is reinforced by coordination of SPI-1 expression with core housekeeping functions by regulators like H-NS, in addition to development of cross-talk between SPI-1 regulators and the transcriptional control of other genomic islands [102].

SPI-1 has many features consistent with the classical model of a genomic island: a history of insertion revealed by comparative genomics plus a high AT content exceeding the genomic average. Yet in other respects, SPI-1 is unusual as a genomic island. For example, unlike its related islands that demonstrate short residency times in bacterial strains, SPI-1 has been resident in its host for many tens of millions of years. The average residency time of a genomic island is difficult to estimate due to the absence of a calibrated evolutionary record in the vast majority of bacteria. In laboratory conditions, pathogenicity islands have natural deletion frequencies ranging from 10<sup>−4</sup> to 10<sup>−7</sup> [reviewed in [103]], including several pathogenicity islands (PAIs) in uropathogenic *E. coli* [104] and the High-Pathogenicity Island (HPI) in *Yersinia* [105]. Residency and lasting integration will depend on organismal biology, stochastic molecular genetic events, and broader ecological pressures acting on the host organism and the host's ability to use the genetic potential in the island.

The *sitABCD* genes are traditionally treated as members of the SPI-1 island. Yet they have previously been suggested to have a different history from the other SPI-1 genes, based on the similarity of their AT content to the genomic average, which contrasts with the AT-rich SPI-1 [44] (Supplementary Figure S3). Our analysis supports this hypothesis, as the *sitABCD* genes are found at the same locus in *Klebsiella* and at other locations in other Enterobacteriaceae (Figure 2). There are conflicting reports on whether *sitABCD* is essential for SPI-1 virulence, but the majority argue that it is required for infection [44,106–108]. Although *sitABCD* may be useful for iron and/or manganese acquisition during infection, this operon was present at the *fhlA*/-/*mutS* locus prior to acquisition of the structural T3SS genes. If *sitABCD* and SPI-1 do cooperate during infection, their physical proximity on the genome may be purely coincidental.

xenoGI is an easy-to-use program that was able to conduct our multi-genome analysis in under three hours. Basing comparative analysis on a phylogenetic tree makes xenoGI a powerful tool to analyze the history of genomic islands. When released, xenoGI was validated with several examples using similar clades of bacteria to our analysis (*Salmonella* and *Escherichia*, with *Klebsiella* and *Serratia* used as outgroups) [45]. Our analysis covers a larger phylogenetic breadth than previously tested, and xenoGI resolved *sitABCD*, *fhlA*, and *mutS* gene conservation consistent with the whole genome phylogenies. xenoGI is constrained to the analysis of coding sequences from complete genomes, meaning that it is unable to recognize small RNAs and promoter sequences. There is a small RNA, *invR*, adjacent to *invH* that is an important regulator of SPI-1 [109], but it was not considered in this analysis for this reason.

SPI-1 can almost be considered a core gene set in *Salmonella*, but it fails the bioinformatic definition of a "core" genetic element due to its loss from some members of the genus. Similar to the loss from strains of *S.* Senftenberg [67–72], isolates of *S.* Litchfield have also lost SPI-1 [68,71]. These strains were isolated from environmental samples, not from animals or human infections [68,71]. These strains remain able to invade animal cells, albeit at a reduced rate [69]. After many millions of years of integration and a near ubiquity in extant members of the diverse *Salmonella* genus, SPI-1 is expected to perform key ecological functions, including in the less-studied species *S. bongori*. The natural loss of SPI-1 presents a test case for predicting ecological functions based on gene content [110]. Specifically, a reduced ability to colonize animal hosts may be accompanied by a loss of metabolic pathways for host-derived nutrients. Loss of SPI-1 may not be strongly selected against in some niches, but the lineages lacking SPI-1 may be evolutionary dead ends [111].

*Microorganisms* **2020**, *8*, 576

**Supplementary Materials:** The following are available online at http://www.mdpi.com/2076-2607/8/4/576/s1, Figure S1: Cladogram of bacterial strains used as input for xenoGI analysis, Figure S2: AT content values of the *fhlA*/-/*mutS* locus overlaid as a heatmap on genomic islands in Enterobacteriaceae, Figure S3: HilA phylogeny alongside genomic context for *hilA* in various T3SSs, Figure S4: Unrooted HilCD phylogeny of homologs found outside the *Salmonella* clade, Table S1: Summary of xenoGI island classification for genes at the *fhlA*/-/*mutS* locus, Table S2: HilA homolog accession list from representative species, Table S3: HilC/D homolog accession list from best blastx hits.

**Author Contributions:** Conceptualization, N.A.L. and A.D.S.C.; Formal analysis, N.A.L and A.D.S.C.; Funding acquisition, N.A.L., K.D.M. and A.D.S.C.; Investigation, N.A.L.; Methodology, N.A.L.; Visualization, N.A.L., K.D.M. and A.D.S.C.; Writing—original draft, N.A.L., K.D.M. and A.D.S.C.; Writing—review & editing, N.A.L., K.D.M. and A.D.S.C. All authors have read and agreed to the published version of the manuscript.

**Funding:** This research was funded by the Natural Sciences and Engineering Research Council of Canada (NSERC); N.A.L. is supported by an NSERC Canada Graduate Scholarship - Doctoral, K.D.M. is supported by an NSERC Postdoctoral Fellowship, and A.D.S.C. is supported by NSERC Discovery grant number 2019-07135.

**Acknowledgments:** The authors thank Charles Dorman and three anonymous reviewers for their comments that helped improve the quality of this manuscript. We thank Eliot Bush, Chris Yost, Illona Monkman, and Emre Islam for helpful discussions.

**Conflicts of Interest:** The funders had no role in the design of the study; in the collection, analyses, or interpretation of data; in the writing of the manuscript, or in the decision to publish the results.

#### **References**


© 2020 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).

*Article*

## **The Not so Good, the Bad and the Ugly: Differential Bacterial Adhesion and Invasion Mediated by** *Salmonella* **PagN Allelic Variants**

**Yanping Wu 1,2,†, Qiaoyun Hu 1,†,‡, Ruchika Dehinwal 1, Alexey V. Rakov 1,§, Nicholas Grams 1, Erin C. Clemens 3,-, Jennifer Hofmann 3, Iruka N. Okeke 3,4 and Dieter M. Schifferli 1,\***


Received: 26 February 2020; Accepted: 28 March 2020; Published: 30 March 2020

**Abstract:** While advances in genomic sequencing have highlighted significant strain variability between and within *Salmonella* serovars, only a few protein variants have been directly related to evolutionary adaptation for survival, such as host specificity or differential virulence. The current study investigated whether allelic variation of the *Salmonella* adhesin/invasin PagN influences bacterial interaction with their receptors. The *Salmonella enterica, subspecies enterica* serovar Typhi (*S.* Typhi) allelic variant of PagN was found to bind significantly better to different enterocytes as well as to the extracellular matrix protein laminin than did the major *Salmonella enterica, subspecies enterica* serovar Typhimurium (*S.* Typhimurium) allele. The two alleles differed at amino acid residues 49 and 109 in two of the four predicted PagN surface loops, and residue substitution analysis revealed that a glutamic acid at residue 49 increased the adhesive and invasive properties of *S.* Typhi PagN. PagN sequence comparisons from 542 *Salmonella* strains for six representative *S. enterica* serovars and *S. diarizonae* further supported the role of glutamic acid at residues 49 and 109 in optimizing adhesion to cells and laminin, as well as for cell invasion. In summary, this study characterized unique residues in allelic variants of a virulence factor that participates in the colonization and invasive properties of different *Salmonella* strains, subspecies and serovars.

**Keywords:** *Salmonella*; *S.* Typhi; *S.* Typhimurium; *S. diarizonae*; PagN; adhesin; invasin; alleles; allelic variants

#### **1. Introduction**

*Salmonella enterica* subsp. *enterica* (*S. enterica*) is an entero-invasive bacterial pathogen that utilizes a type three secretion system (T3SS) encoded on the *Salmonella* pathogenicity island 1 (SPI-1) to invade intestinal epithelial cells. T3SS-driven uptake is particularly critical for intestinal *S. enterica* infections in mammals, as demonstrated with several experimental animal models [1]. *S. enterica* strains are differentiated by their flagella and O-antigens, which are highly variable and provide the basis for identification of over 2500 different serovars [2,3]. There is a direct correlation between the range of host adaptation of different *S. enterica* serovars with their levels of virulence and types of pathogenesis [4]. Serovars that are better adapted to a specific host species, such as the human-restricted *S. enterica* serovar Typhi (*S.* Typhi), are extremely pathogenic due to their ability to leave the intestines and spread hematogenously, resulting in sepsis. In contrast, serovars such as *S. enterica* serovar Typhimurium (*S.* Typhimurium) that have a broad host range are restricted to local invasion, and their containment by host inflammatory responses narrows their pathogenesis to gastrointestinal symptoms in humans. Finally, some *Salmonella* such as *S. enterica* subsp. *diarizonae* (*S. diarizonae*) are primarily associated with cold-blooded animals, and only rarely result in invasive diseases in sheep or humans [5,6].

Even though all *Salmonella* have and express SPI-1 genes [7], variations in the sequence of specific SPI-1 proteins and the unique repertoire of translocated SPI-1 effector proteins by different species, serovars and even strains within the same serovar have a direct impact on the invasion efficiency of different cell types [8–10]. Moreover, the diversity of *S. enterica* invasion levels for various cell types and host species can be partially attributed to variations in the regulation and export efficiencies of effector proteins through the SPI-1 T3SS [11]. In addition to the SPI-1 T3SS, cell invasion by *S. enterica* involves two outer membrane proteins (OMP), Rck and PagN, that have been designated adhesins–invasins due to their ability to promote both bacterial binding to host receptors and cellular uptake [12–14]. Rck has several adhesive properties: it binds to factor H to mediate bacterial resistance to complement, adheres to laminin, and interacts with the epidermal growth factor receptor for a zipper mechanism of cell invasion [13,15–17]. Rck also contains a self-association motif that has the potential to mediate interbacterial attachment, as described for Hra1/Hek, an integral outer membrane protein of enteroaggregative *Escherichia coli* (*E. coli*) [18,19]. The adhesins/invasins Hra1/Hek, Hra2 and Tia of enteroaggregative or enterotoxigenic *E. coli* share homologies with membrane spanning domains of PagN, but their surface-exposed loops are less similar, alluding to variable binding affinities for different host receptors [18,20–22].

Unlike *rck*, which is plasmid-encoded and absent in many virulent *S. enterica* serovars and strains, *pagN* is encoded on the bacterial chromosome and present in most, if not all, *S. enterica*. One study indicated that deletion of *pagN* in *S.* Typhimurium SL1344 grown under PhoPQ-activating conditions to inhibit SPI-1 gene transcription decreased bacterial adhesion and invasion of HT-29 cells [12]. In contrast, deletion of *pagN* in the *S.* Typhimurium strain LT2 did not impact adhesion to HT-29 or other cells [23], potentially due to the expression of other adhesins such as the type 1 fimbriae induced by the growth culture conditions. More surprisingly, despite the presence of both SPI-1 T3SS and type IV pili that can each independently mediate invasion, deletion of *pagN* impacted both adhesion and invasion of HT-29 cells in a strain of *S.* Typhi [23,24]. Although the culture conditions used might not have activated functionally detectable T3SS expression, type IV pili were still functional, since a *S.* Typhi mutant lacking type IV pili affected invasion as much as the Δ*pagN* mutant [23]. The latter result suggested that both PagN and the type IV pili contribute to *S.* Typhi invasion of cells. In addition, oral challenge of iron-overloaded Swiss Albino mice with wild type and *pagN* mutants of *S.* Typhi indicated that the former strain was more invasive and had increased lethality at an infectious dose of 2 × 10<sup>7</sup> CFU and in competition assays. Finally, even though no differences were observed between the wild type and *pagN* mutant strains of *S.* Typhimurium [23], use of PagN from each serovar as an immunogen demonstrated some level of immune protection in mice challenged with the wild type strain from the corresponding serovar [23]. Thus, the role of PagN in the invasive properties of *S.* Typhi and *S.* Typhimurium remains somewhat controversial, and studies of its function are likely impacted by the confounding effects of additional adhesins/invasins, different levels of PagN expression, and/or variations in the protein sequence of PagN.

Based on these results and our previous findings on FimH adhesin alleles in various *S. enterica* serovars and strains for host cell binding specificities [25,26], we wondered whether allelic variation in PagN [10] modulates bacterial adhesion and invasion. Here we determined that the *S.* Typhi allelic variant of PagN provides significantly better bacterial binding and invasive efficiencies for different enterocytes and the extracellular matrix protein laminin, as compared to the major *S.* Typhimurium allele. To our surprise, PagN from *S. enterica* subsp. *diarizonae* was also significantly more adhesive and invasive than the major *S.* Typhimurium allele. Sequence comparison and functional analysis of substitutions at specific residue positions revealed that amino acids in two of the four PagN surface-exposed loops contributed to these phenotypes. These results further support the role of allelic variation of virulence factors in adjusting the level of pathogenic attributes among *Salmonella* subspecies and serovars.

#### **2. Materials and Methods**

#### *2.1. Bacterial Strain and Plasmid Constructions*

The bacterial strains and plasmids used in this study are described in Table 1 and all PCR primers are listed in Table S1. Unless stated otherwise, all the reagents were from MilliporeSigma (St. Louis, MO, USA). Bacteria were routinely grown in LB-Lennox media unless otherwise indicated. When appropriate, ampicillin (200 μg/mL) or kanamycin (45 μg/mL) was added to the growth medium. For plasmid constructions, the *pagN* genes were amplified from *S.* Typhimurium, *S.* Typhi and *S. diarizonae* genomic DNA by PCR with the Q5 High-Fidelity DNA Polymerase (New England Biolabs Inc., Ipswich, MA, USA). For PagN expression, the allelic genes were cloned into the AHT-inducible plasmid pRS1 using Gibson assembly with amplicons prepared with appropriate primers. Three site-directed substitution mutants were prepared by Gibson assembly of pRS1 and amplicons prepared with *pagN* external and internal primers for *S.* Typhimurium or *S.* Typhi, or amplicons prepared with mutagenic primers. A *pagN-his* fusion construct was prepared by insertion of a *Nde*I-*Hind*III restricted amplicon into pET22b. The *S.* Typhimurium SL1344 *sipB::aphA-3* mutant was prepared by P22 generalized transduction from strain DMS1507 [27,28], and SL1344 *sipB::aphA-3* Δ*pagN* was constructed by Gibson assembly and allelic exchange as described [29–31]. All plasmid and strain constructs were confirmed by PCR and sequencing.



#### *2.2. Protein Expression and Antibody Preparation*

The histidine-tagged PagN was expressed from the pET22b construct using the IPTG inducer and isolated by metal chelation chromatography as described previously [35]. A specific polyclonal antiserum against PagN-His was prepared in rabbits by using a conventional immunization protocol (Cocalico Biologicals Inc., Reamstown, PA, USA). The antiserum was adsorbed with *E. coli* BL21(DE3)/pET22b before use as described previously [36]. Briefly, 1 mL antiserum with 0.06% sodium azide was incubated with bacterial pellets from 10-mL cultures grown overnight for 18 h at 4 °C. After three adsorption cycles, the antiserum was filtered (0.02 μm-pore-size) before use. PagN expression in *E. coli* AAEC189 for all the binding and invasion studies was induced with AHT by growing the bacteria overnight at 30 °C, or for 2 h at 37 °C, starting with log phase cultures (A<sub>600</sub> = 0.3). Comparable levels and bacterial surface expression of the three cloned PagN alleles were standardized by using various concentrations of inducers (0.005–2 μg/mL AHT) for Western blot analysis and ELISA, as done previously [25]. Outer membrane proteins were prepared as described previously [37].

#### *2.3. Cell Cultures*

The human colonic cell line RKO (ATCC CRL2577) was cultured in Dulbecco's Modified Eagle Medium (DMEM; Invitrogen, Life Technologies) supplemented with 15% (*v*/*v*) heat inactivated fetal bovine serum (FBS) and antibiotics to a final concentration of 100 U/mL penicillin and 100 μg/mL streptomycin (Gibco, Life Technologies) [25]. The porcine cell line IPEC-J2 (DSMZ ACC 701) was cultured with 15% heat inactivated FBS (Sigma-Aldrich), 1% penicillin/streptomycin, 1% insulin/transferrin/selenium (Gibco) and 5 ng/mL epidermal growth factor (Sigma) in DMEM/F-12/HAM (1/1/1, *v*/*v*/*v*; Gibco). The cells were incubated at 37 °C in a humid atmosphere with 5% CO<sub>2</sub>.

#### *2.4. Bacterial Binding and Invasion Assays*

Both epithelial cell cultures grown to confluence in 24-well plates (Corning, CLS3596) were used for the binding assays with recombinant *E. coli* AAEC189 (*E. coli* Δ*fim*) carrying pRS1 plasmid constructs expressing a *Salmonella* PagN allele. Bacteria were grown overnight, diluted 10<sup>−2</sup> in LB broth with inducers (see above), incubated for 16–17 h, washed three times with PBS and diluted in DMEM to inoculate with a multiplicity of infection of 100 bacteria (in 0.25 mL) to 1 enterocyte. Exact bacterial inoculum numbers were checked by standard CFU counts. The culture plates were centrifuged (600 × *g*, 5 min) to initiate contact between the bacteria and the cells and incubated for 1 h at 37 °C in 5% CO<sub>2</sub>. To evaluate cell adhesion, the infected monolayers were washed thrice with PBS to remove non-associated bacteria and treated with 0.5% Triton X-100 to release and count the cell-associated bacteria by CFU enumeration. For the invasion assays, a standard gentamicin protection assay was performed [38]. Following a 1 h infection (see above), cells were incubated with medium containing gentamicin (100 μg/mL) for 90 min at 37 °C in 5% CO<sub>2</sub>. After three washes with PBS, bacteria were released with Triton X-100 and enumerated as described above. All experiments were done in triplicate wells and repeated at least thrice.
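The inoculum arithmetic behind such an assay can be sketched as follows. This is a minimal illustration, not the authors' procedure: the function names are hypothetical, the enterocyte count per well (7.5 × 10<sup>4</sup>) is an assumed value chosen only because it yields a suspension density of 3 × 10<sup>7</sup> CFU/mL at a multiplicity of infection of 100 in 0.25 mL, and expressing recovered CFU as a percentage of the inoculum is one common normalization that the text does not explicitly specify.

```python
def inoculum_for_moi(cells_per_well, moi, volume_ml):
    """Return bacteria needed per well and the matching suspension density."""
    bacteria_per_well = cells_per_well * moi
    return {
        "bacteria_per_well": bacteria_per_well,
        "cfu_per_ml": bacteria_per_well / volume_ml,
    }


def percent_recovered(recovered_cfu, inoculum_cfu):
    """Express recovered CFU (cell-associated or gentamicin-protected)
    as a percentage of the inoculum (an assumed normalization)."""
    if inoculum_cfu <= 0:
        raise ValueError("inoculum_cfu must be positive")
    return 100.0 * recovered_cfu / inoculum_cfu


# Hypothetical well: 7.5e4 enterocytes (assumed), MOI 100, 0.25 mL inoculum.
plan = inoculum_for_moi(cells_per_well=7.5e4, moi=100, volume_ml=0.25)
print(plan)

# Hypothetical recovery of 1.5e5 CFU from that well.
print(percent_recovered(recovered_cfu=1.5e5,
                        inoculum_cfu=plan["bacteria_per_well"]))
```

In practice the inoculum would be taken from the plated CFU counts rather than from the planned value.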

#### *2.5. Microscopy*

For fluorescence microscopy, *E. coli* strain AAEC189 carrying plasmids pRS1, pDMS1973 (*pagN*Ty), pDMS1974 (*pagN*Tm) or pDMS2062 (*pagN*di) were grown and induced for PagN expression as described above. The bacterial cells were deposited on slides, dried for 20 min and washed once with PBS. Bacteria were fixed with 4% paraformaldehyde in PBS, pH 7.4, then labeled with adsorbed anti-PagN antisera (1:500), followed by anti-rabbit Alexa Fluor 488 (1:1000; Invitrogen, Life Technologies, Grand Island, NY, USA). Images were captured with a Coolsnap digital camera (Photometrics, Tucson, AZ, USA) mounted onto a Nikon Eclipse E600 microscope with Coolsnap version 1.2.0 software (Roper Scientific, Tucson, AZ, USA).

#### *2.6. Binding to Extracellular Matrix Proteins*

Immuno Maxisorb plates with 96 wells (Nunc; Thermo Fisher Scientific, Rochester, NY, USA) were coated with 10 μg/mL of human collagen I, chicken collagen II, human collagen IV, bovine fibronectin, murine laminin or BSA (Sigma-Aldrich) in PBS at 4 °C overnight and then washed with PBS and blocked with PBS plus 1% BSA for 2 h. Binding of bacteria to laminin coated on plates was studied by using PagN-expressing *E. coli*. Bacteria grown and induced to make PagN were centrifuged, suspended in PBS to 10<sup>7</sup> CFU in 100 μL and added to laminin-coated wells. Bacteria harboring empty plasmid pRS1 were used as a control. After incubation for 1 h at 37 °C, unbound bacteria were removed by three washing cycles, anti-PagN antiserum (1:500) was added, followed by wash cycles and incubation with goat anti-rabbit HRP-conjugated antibody (1:2000). After three wash cycles, bound antibodies were detected by using the 1-Step Turbo TMB ELISA substrate (Thermo Fisher Scientific) followed by 2 M sulfuric acid and measuring the absorbance at 450 nm. For the binding inhibition assays, two-fold serial dilutions of heparin or heparan sulfate (400–0.4 μg/mL) were incubated with bacteria for 1 h and the mixtures were added to the laminin-coated wells for further processing as described above.
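As a small worked example of the inhibitor concentration range, the sketch below lists a dilution series starting at 400 μg/mL. It assumes the dilutions are two-fold serial dilutions and that the series has 11 points (an assumption made here because ten successive halvings of 400 give about 0.39 μg/mL, which rounds to the stated lower bound of 0.4 μg/mL); the function name is hypothetical.

```python
from typing import List


def twofold_series(top: float, n_points: int) -> List[float]:
    """Return n_points concentrations, each half of the previous one."""
    if n_points < 1:
        raise ValueError("n_points must be at least 1")
    return [top / (2 ** i) for i in range(n_points)]


# Assumed 11-point series for heparin or heparan sulfate, in ug/mL.
series = twofold_series(top=400.0, n_points=11)
for conc in series:
    print(f"{conc:.3f} ug/mL")
```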

#### *2.7. Bacterial Genomes and PagN Sequences*

*Salmonella* genomic sequences from 497 previously studied *S. enterica* from 7 serovars with two biovars (serovars Typhi, Dublin, Choleraesuis, Typhimurium, Enteritidis, Newport and Gallinarum, with biovars Gallinarum and Pullorum for the latter serovar) and from 45 confirmed *S. diarizonae* (Table S2) were obtained from the NCBI RefSeq database or assembled from NCBI SRA, EBI ENA and Wellcome Sanger Institute repositories as described [10]. The genomes were used to determine their encoded PagN sequences. The genomes of 13 incorrectly serotyped *S. diarizonae* strains were detected with SISTR (their corrected serovar attribution was added to Table S2) and not analyzed for PagN [39]. Protein alignments were done with Megalign, DNASTAR Lasergene (Madison, WI, USA).
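Once PagN sequences are aligned, tallying the residues found at positions of interest is straightforward. The sketch below is only an illustration of that kind of tally and is not the Megalign workflow used here: the sequences are synthetic placeholders (not real PagN sequences), the labels and helper names are hypothetical, and it assumes a gap-free alignment in which string positions match the residue numbering used in the text.

```python
from collections import Counter


def residues_at(seq, positions):
    """Return the residues found at the given 1-based positions."""
    return tuple(seq[p - 1] for p in positions)


def tally_alleles(seqs, positions=(49, 109)):
    """Count how many sequences carry each residue combination."""
    return Counter(residues_at(s, positions) for s in seqs.values())


def toy(r49, r109):
    """Build a synthetic placeholder sequence with chosen residues
    at positions 49 and 109 (alanine everywhere else)."""
    return "A" * 48 + r49 + "A" * 59 + r109 + "A" * 10


# Synthetic stand-ins mimicking the residue patterns described in the text.
toy_seqs = {
    "Typhimurium-like-1": toy("D", "D"),
    "Typhimurium-like-2": toy("D", "D"),
    "Typhi-like": toy("E", "D"),
    "Newport-like": toy("D", "Q"),
    "diarizonae-like": toy("D", "E"),
}

counts = tally_alleles(toy_seqs)
for combo, n in counts.most_common():
    print(combo, n)
```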

#### *2.8. PagN Structure Analysis*

The 3D-structural model of PagN was predicted by using the corresponding sequence from *S.* Typhimurium LT2 and I-TASSER [40,41]. Among the five best predicted models, the third model was chosen to be shown in Figure 1B (Protean 3D, DNASTAR Lasergene), with an overall ERRAT quality factor of 84.4 [42], in agreement with a published model for *S.* Typhi PagN [43]. The five best I-TASSER predicted *S. diarizonae* PagN structures essentially overlapped with the ones of *S.* Typhimurium and *S.* Typhi.

#### *2.9. Statistical Analysis*

Student's unpaired *t* test (two-tailed) was used with Prism 8 (GraphPad Software, San Diego, CA, USA) to calculate statistical significance for all the binding assays (\* *p* < 0.05, \*\* *p* < 0.01, \*\*\* *p* < 0.001 and \*\*\*\* *p* < 0.0001).
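For readers who want to see the arithmetic, the sketch below computes the pooled-variance (Student's) *t* statistic for two independent groups and maps a *p* value onto the asterisk notation above. It is a standard-library illustration with made-up numbers, not a reproduction of the Prism analysis; the function names are hypothetical, and the *p* value itself is not computed here because that requires the *t* distribution from a statistics package.

```python
from math import sqrt
from statistics import mean, variance


def unpaired_t(a, b):
    """Student's pooled-variance t statistic and degrees of freedom
    for two independent samples."""
    na, nb = len(a), len(b)
    df = na + nb - 2
    pooled = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / df
    t = (mean(a) - mean(b)) / sqrt(pooled * (1 / na + 1 / nb))
    return t, df


def stars(p):
    """Map a p value onto the asterisk notation used in the figures."""
    if p < 0.0001:
        return "****"
    if p < 0.001:
        return "***"
    if p < 0.01:
        return "**"
    if p < 0.05:
        return "*"
    return "ns"


# Made-up triplicate values for two conditions.
t_stat, dof = unpaired_t([5.0, 6.0, 7.0], [1.0, 2.0, 3.0])
print(round(t_stat, 3), dof)
print(stars(0.005))
```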

#### **3. Results and Discussion**

PagN of *S.* Typhimurium and *S.* Typhi had similar sequences with two to three substitutions in predicted surface-exposed loops 1 and 2 (Figure 1). Since PagN of both serovars acts as an adhesin and invasin for human intestinal epithelial cells, we wondered whether these allelic variants differentially impacted these properties. To ensure consistent and comparable expression of PagN, we cloned each gene into inducible expression plasmids to make pDMS1973 and pDMS1974 for the expression of PagN from *S.* Typhi (*pagN*Ty) and *S.* Typhimurium (*pagN*Tm), respectively. Western blot analysis of outer membrane preparations from *E. coli* AAEC189 carrying one of these two plasmid constructs detected bands specific for each PagN allele (Figure 2A).


**Figure 2.** Expression of *pagN* alleles from *S.* Typhimurium, *S.* Typhi and *S. diarizonae*. (**A**) Expression of PagN in *E. coli* AAEC189 transformed with pRS1 (empty vector, -), pDMS1973 (*pagN*Ty), pDMS1974 (*pagN*Tm) or pDMS2062 (*pagN*di) was induced by AHT (0.2–0.4 μg/mL) for 2 h at 37 °C. Isolated outer membrane proteins analyzed by SDS-PAGE, followed by Coomassie blue staining or western blotting with anti-PagN antisera, showed a clear band for PagN expression at 25 kDa. (**B**) Visualization of PagN surface expression in *E. coli* AAEC189 expressing the allele from *S.* Typhimurium (**c**,**d**), *S.* Typhi (**e**,**f**), *S. diarizonae* (**g**,**h**) or empty vector as negative control (**a**,**b**). Phase-contrast microscopy (right) and fluorescence microscopy (left) were used to detect bacteria labeled with anti-PagN antisera, followed by Alexa Fluor 488-conjugated anti-rabbit IgG.

Moreover, both alleles of PagN were detectable on the bacterial surface by immunofluorescence (Figure 2B, panels a to f) and ELISA (not shown). Not surprisingly, the strongest signal was observed for the *S.* Typhimurium PagN allele, as this was the immunogen used to prepare the antiserum.

The *E. coli* strain AAEC189 lacks adhesive type 1 fimbriae, providing us with a bacterial context in which to study PagN-mediated bacterial binding and cellular uptake free of known *Salmonella* adhesins and invasins. Using both RKO and IPEC-J2 enterocytes, we found that both PagNTy and PagNTm mediated bacterial binding and invasion, as previously reported for hamster ovary epithelial-like CHO-K1 cells and human colonic HT-29 cells [12,23]. More importantly, comparisons of the two alleles indicated that PagNTy was more efficient than PagNTm for both cell adhesion and invasion (Figure 3A), supporting an allelic variant effect on a virulence property. Additional studies with a *S.* Typhimurium *pagN* and *sipB* (essential SPI-1 translocon subunit) deletion mutant complemented with different PagN-expressing plasmids did not show significantly different levels of adhesion/invasion of RKO cells (data not shown), suggesting a dominant phenotype due to the expression of one or more other *Salmonella* adhesin(s) and invasin(s) under the growth conditions used. Thus, the use of *E. coli* with a controlled expression system allowed us to bypass the masking effect of additional *Salmonella* adhesins/invasins differentially expressed in different serovars and environmental growth conditions and identify functional differences due only to sequence variations of PagN alleles [44,45].

**Figure 3.** Binding and uptake of *E. coli* AAEC189 to intestinal epithelial cells mediated by PagN expression. (**A**) Adherence and invasion of *E. coli* AAEC189 with empty vector pRS1 (no *pagN*), pDMS1973 or pDMS1974 to express PagNTy or PagNTm respectively, to human (RKO) and porcine (IPEC-J2) cells was analyzed by incubating 3 × 10<sup>7</sup> CFU/mL bacteria at a MOI of 100 for 60 min for adherence and another 90 min for invasion assays. Bacteria expressing PagNTy bound and invaded significantly better than the bacteria expressing PagNTm (*p* < 0.01–0.001). (**B**) Adherence of *E. coli* AAEC189 expressing PagNTm to RKO cells was significantly enhanced (*p* < 0.05) when PagNTm was mutated from aspartate to glutamate (at site 49, D49E) with pDMS2081 or to glutamine (at site 109, D109Q) with pDMS2082, whereas invasion of epithelial cells was affected only by the D49E substitution (*p* < 0.001), but not by the D109Q substitution. Data represent one of three separate and reproducible experiments each with triplicate data expressed as mean ± SEM (ns is for not significant, \* *p* < 0.05, \*\* *p* < 0.01, \*\*\* *p* < 0.001 and \*\*\*\* *p* < 0.0001).

PagN sequence alignments for 52–75 strains for each of the seven *S. enterica* serovars, including *S.* Typhimurium [10], highlighted that all PagN had aspartic acid residues at positions 49 and 109, with the exceptions of *S.* Typhi that had a glutamic acid at position 49, and a group of *S.* Newport strains that had a glutamine at position 109 (Figure 1A). Since the allelic PagN proteins of all strains of *S.* Typhi and *S.* Typhimurium varied at these two positions and were each predicted to be located in a different bacterial surface loop (Figure 1B), we investigated their relative involvement in bacterial adhesion and invasion. For this, we generated PagNTm with a D49E or a D109Q substitution and expressed them in *E. coli* strain AAEC189. Although both mutated PagN mediated better bacterial binding to RKO cells than did PagNTm, only the D49E substitution in PagN had a significantly stronger effect on bacterial invasion (Figure 3B). This result highlighted the contribution of the *S.* Typhi glutamic acid at position 49 for the improved interaction of PagN with a host cell receptor to promote bacterial uptake. This result is consistent with a role for the PagN allele in the increased pathogenicity of *S.* Typhi relative to *S.* Typhimurium pathogenesis following human infection, in agreement with the former serovar's hematogenous bacterial spreading in humans resulting frequently in sepsis, in contrast to the latter serovar and its pathogenesis that is usually contained in the gastro-intestinal organs.

Other than subsp. *enterica,* most subsp. of *S. enterica* are associated with cold-blooded animals. Therefore, we expected that a construct expressing the PagN allele of these non-*enterica* subspecies would neither bind nor efficiently invade human RKO cells and therefore would serve as a negative control. Thus, we cloned the *S. diarizonae* PagN (PagNdi) in the same inducible expression plasmid used for the two other PagN alleles to make pDMS2062 and confirmed protein production (Figure 2A) and surface expression (Figure 2B, panels g and h). To our surprise, *E. coli* making PagNdi were significantly more adhesive and invasive than the ones expressing PagNTm (Figure 4A).

**Figure 4.** Binding and uptake of *E. coli* expressing PagN from non-*enterica* allelic variants. (**A**) Adhesion and invasion of *E. coli* AAEC189 expressing the *pagN* allele from *S. diarizonae* (PagNdi) or *S.* Typhimurium (PagNTm) to RKO cells showed that bacteria expressing PagNdi bound significantly better than those expressing PagNTm (*p* < 0.01) and were slightly more invasive (*p* < 0.05). (**B**) A substitution of aspartate to glutamate at site 109 in *S.* Typhimurium PagN (D109E) increased both binding and invasion of *E. coli* into RKO cells as compared to the bacteria expressing PagNTm (*p* < 0.01). Data represent one of three separate and reproducible experiments, each with triplicate data expressed as mean ± SEM (\* *p* < 0.05, \*\* *p* < 0.01, \*\*\* *p* < 0.001 and \*\*\*\* *p* < 0.0001).

PagN alignments of 45 available strains of *S. diarizonae* highlighted that they all had an aspartic acid at position 49, like most evaluated subsp. *enterica* serovars (Typhi being the exception, Figure 1A). In contrast, unlike the subsp. *enterica* serovars, all PagNdi carried a glutamic acid at position 109, suggesting that a D109E substitution at this position would be functionally significant. In support, a D109E substitution in the *S.* Typhimurium PagN (pDMS2083) improved not only bacterial binding, but also bacterial uptake by RKO cells (Figure 4B), indicating that glutamic acid in this position optimizes the function of PagN as both an adhesin and invasin. Whether other substitutions in PagNdi, such as the arginine or lysine at positions 150 and 154 in loop 3 (Figure 1), modulate bacterial invasion remains to be determined. Like *S.* Typhi, *S. diarizonae* have type IVB pili involved in intestinal cell invasion, raising the possibility that the concomitant increased binding property of the *S. diarizonae* PagN allele contributes to the reported increased bacterial virulence in humans under specific conditions [5]. Taken together, the results showed that PagN alleles with a glutamic acid residue at either 49 or 109 improve bacterial binding and/or invasiveness, possibly due to the more ionizable and long chain characteristics of glutamic acid relative to aspartic acid and glutamine.

In addition to promoting bacterial attachment to cells, many bacterial adhesins interact with various host glycoproteins of the extracellular matrix, as exemplified by the ability of PagNTy to bind to laminin [23,46–48]. *S. enterica* may encounter laminin either on intestinal areas denuded of epithelial cells (e.g., extrusion zones at villi tips) or in the subepithelial space after invading enterocytes and crossing the intestinal epithelial layer. Notably, the low pH and low magnesium concentrations in *Salmonella*-containing vacuoles (SCV) of invaded cells induce the PhoPQ two-component system responsible for PagN expression [49–51]. Thus, whereas the *S. enterica* SPI-1 system is induced in the intestinal environment and used for enterocyte invasion, expression of PagN in SCV (where SPI-1 is largely repressed) may prepare *Salmonella* for future cycles of cell invasion after escape from intestinal epithelial cells. Our side-by-side comparison of PagNTy-, PagNTm- and PagNdi-expressing *E. coli* showed that all three bacteria adhered significantly to laminin, although the PagNTy-expressing bacteria bound better than the other two (Figure 5A). None of the bacteria bound to collagen I, II or IV or fibronectin (not shown).

**Figure 5.** Binding of PagN allelic variants to murine laminin. (**A**) The PagNTy-expressing bacteria bound best to laminin. *E. coli* AAEC189 expressing PagNTm, PagNTy, PagNdi or containing empty vector (no *pagN*) were incubated with microtiter wells coated with murine laminin. Bacterial binding was detected by an ELISA-method with anti-PagN antiserum. (**B**) All substitution mutants bound better to laminin than bacteria expressing PagNTm, with the mutant expressing glutamic acid at position 49 of PagNTm (D49E) binding best (*p* < 0.0001). Data represent one of three separate and reproducible experiments, each with triplicate data expressed as mean ± SEM (\*\* *p* < 0.01, \*\*\* *p* < 0.001 and \*\*\*\* *p* < 0.0001).

To determine the potential role of PagN residues 49 and 109 in PagNTy binding to laminin, we tested the three mutants described above. As with the intestinal epithelial cell binding results, bacteria with a glutamic acid at position 49 of PagNTm (D49E) bound significantly better to laminin than did bacteria with PagNTm (Figure 5B). Glutamic acid or glutamine at position 109 of PagNTm (D109E and D109Q) also increased binding, albeit less efficiently.

Since both PagN and heparin bind to laminin [52], we next determined whether heparin or heparan sulfate could inhibit PagN-mediated bacterial binding to laminin. In contrast to the reported inhibitory effect of heparin on PagN-mediated invasion of CHO-K1 cells [53], neither heparin nor heparan sulfate at concentrations as high as 400 μg/mL interfered with the binding of PagNTy-expressing bacteria to laminin (data not shown). These combined results suggested that PagN and heparin bind to different laminin sites and that the invasion of CHO-K1 cells by PagN-expressing bacteria is laminin-independent. Thus, our studies highlight the independent adhesive properties of PagN in binding to either intestinal cells or the extracellular matrix protein laminin, two relevant targets for *Salmonella* host invasion.

In summary and together with our previous studies on variants of fimbrial adhesins [25,26,54], this study on PagN, a *Salmonella* adhesin/invasin, further supports the importance of protein sequence allelic variants in virulence properties [55], including pathogenic properties such as adhesion and invasion [4].

**Supplementary Materials:** The following are available online at http://www.mdpi.com/2076-2607/8/4/489/s1.

**Author Contributions:** Conceptualization, D.M.S. and I.N.O.; methodology, D.M.S. and I.N.O.; software, A.V.R. and D.M.S.; validation, D.M.S. and R.D.; formal analysis, Y.W., Q.H., R.D., N.G.; investigation, Y.W., Q.H., R.D., N.G., E.C.C. and J.H.; resources, D.M.S. and I.N.O.; data curation, R.D. and D.M.S.; writing—original draft preparation, D.M.S.; writing—review and editing, D.M.S.; visualization, R.D., Y.W., Q.H.; supervision, D.M.S. and I.N.O.; project administration, D.M.S. and I.N.O.; funding acquisition, D.M.S. and I.N.O. All authors have read and agreed to the published version of the manuscript.

**Funding:** This research was funded by USDA National Institute of Food and Agriculture grant 2013–67015-21285, NIH/NIAID grants AI117135 and AI139982, and the PennVet Center for Host-Microbial Interactions to DMS, and NSF RUI awards MCB 0948460 and MCB1329248 to INO. QH was supported by a grant from the China Scholarship Council and YW was supported by a grant from the International Cooperative Research and Exchange Program of Zhejiang University. The funding bodies had no role in the design of the study and collection, analysis and interpretation of data and in writing the manuscript.

**Acknowledgments:** We thank all the contributors for publicly sharing their *Salmonella* genomic data through the National Center for Biotechnology Information, European Bioinformatics Institute and Wellcome Sanger Institute *Salmonella* genome projects. We thank Mark Goulian for plasmid pMG81, Olivier Cunrath and Dirk Bumann for *E. coli* strain JKE201 and Gibson assembly protocols, and Leslie King for proofreading the manuscript. The opinions expressed in this manuscript are solely the responsibility of the authors and do not necessarily represent the official views and policy of the National Institutes of Health.

**Conflicts of Interest:** The authors declare no conflict of interest. The funders had no role in the design of the study; in the collection, analyses or interpretation of data; in the writing of the manuscript, or in the decision to publish the results.

#### **References**


© 2020 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).

## *Review* **Spheres of Influence: Insights into** *Salmonella* **Pathogenesis from Intestinal Organoids**

**Smriti Verma 1,2,\*, Stefania Senger 1,2, Bobby J. Cherayil 1,2 and Christina S. Faherty 1,2**


Received: 28 February 2020; Accepted: 28 March 2020; Published: 1 April 2020

**Abstract:** The molecular complexity of host-pathogen interactions remains poorly understood in many infectious diseases, particularly in humans due to the limited availability of reliable and specific experimental models. To bridge the gap between classical two-dimensional culture systems, which often involve transformed cell lines that may not have all the physiologic properties of primary cells, and in vivo animal studies, researchers have developed the organoid model system. Organoids are complex three-dimensional structures that are generated in vitro from primary cells and can recapitulate key in vivo properties of an organ such as structural organization, multicellularity, and function. In this review, we discuss how organoids have been deployed in exploring *Salmonella* infection in mice and humans. In addition, we summarize the recent advancements that hold promise to elevate our understanding of the interactions and crosstalk between multiple cell types and the microbiota with *Salmonella*. These models have the potential for improving clinical outcomes and future prophylactic and therapeutic intervention strategies.

**Keywords:** organoids; enteroids; *Salmonella*; host-pathogen interactions; model systems; infectious diseases; organotypic culture system

#### **1. Introduction**

Many basic life processes are conserved in all organisms, with similarities in molecular mechanisms often increasing as the phylogenetic distances decrease. This conservation enables investigation into various biological mechanisms and diseases, with the expectation that the knowledge gained from one system will provide insight into the workings of related organisms. The similarity between different organisms is particularly important for studying human biology and diseases, since human experimentation is typically unfeasible or unethical. However, several factors must be considered before employing an organism or derived cells as a model for humans, including ease of maintenance and reproducibility, amenability to genetic and molecular manipulations, and perhaps most importantly, similarity to the characteristics of the human biological process under investigation.

The mainstays of research for many decades have been the cell culture and animal model systems. Classically, the in vitro cell culture methodology involves either growing cell lines that are derived from tumors or are genetically transformed to allow continuous propagation, or culturing primary cells derived from the tissues of humans or experimental animals. Use of cell lines has a significant limitation in that the cells do not recapitulate "normal" cellular physiology. Primary cells closely represent the tissue of origin but are limited in availability, have a finite life-span in vitro without being transformed, and display variability arising from differences in donors. Conventional cell culture comprises growing cells (cell lines/primary cells) either in liquid suspension or as adherent, two-dimensional (2D) monolayers on an impermeable solid surface. The cell culture model is also a reductionist approach since the model cannot account for dynamic and complex interactions that occur in vivo, particularly those that result from the interactions with multiple cell types. Although processes such as immune responses, cell signaling, or crosstalk with microbiota have been studied using cell culture models of various cell types either alone or in combination, the biggest disadvantage of this system is that the cells may not fully recapitulate in vivo phenotypes and behaviors. More complex model systems, such as *Caenorhabditis elegans*, *Danio rerio*, *Drosophila melanogaster*, mice, or primates, provide a whole organism perspective and allow compensation for some of the limitations of cultured cells as the complexity of the models increases. However, these animal models resemble, but do not completely represent, all the conditions encountered in humans, an issue that is especially important in the understanding of disease and development [1]. Recently, many researchers have developed sophisticated, yet tractable, non-cancerous tissue culture models that bridge the gap between simple 2D cell culture and complex in vivo experiments. Such models are broadly known as organotypic culture systems, which come in various types as detailed below. In general, these models consist of a three-dimensional (3D) organ-like structure that is made up of organ-specific differentiated cells of multiple lineages and that recapitulates the unique organizational and functional characteristics of the corresponding organ in vivo.

Historically, the term "organoid" referred to short-term in vitro cultures of tissues such as lung and intestine [2], or 3D cell aggregates such as spheroids. More recently, it has been applied to organized 3D structures generated in vitro from cells with a whole range of origins, such as tissue segments [3,4] and their derived adult stem cells (ASCs) [5,6], transformed cell lines [7], and pluripotent stem cells (PSCs) [8]. However, as research has progressed, the term "organoid" has taken on a more restricted meaning. Organoids have the following characteristics: (1) self-organization: individual cells arrange in vitro into a 3D structure that mimics the in vivo organ or tissue; (2) multicellularity: organoids are composed of multiple cell types typically found in the organ or tissue in equivalent proportions; (3) functionality: the organoid structure should be able to execute at least some of the organ- or tissue-specific functions; and (4) sustainability: organoids can propagate indefinitely without requiring transformation by maintaining a pool of progenitor cells. In 2012, the Intestinal Stem Cell Consortium [9] set guidelines for the nomenclature to be used to define the 3D structures generated in vitro depending on their origin and cellular composition. When referring to intestinal organotypic models, 3D structures that are composed of just epithelial cell types are generally known as "enteroids" if derived from the small intestinal epithelium, or "colonoids" if derived from colonic origin. The term "organoid" is usually reserved for 3D structures containing more than one cell lineage. However, it should be noted that these guidelines have not been uniformly adopted by the field. Many researchers commonly use "organoids" as a blanket term for 3D structures derived from ASCs, PSCs, or comprising transformed cell lines that resemble in vivo 3D architecture and physiology.

Methodologies have been adapted to develop organoids from various normal and cancerous mouse and human tissues including colon [6,10], stomach [11], liver [12–14], pancreas [15], and kidney [16] to name a few. These techniques have been well reviewed elsewhere [11,17–21]. For the purposes of this review, we will provide a historical context and examples of the different types of intestinal organoid models (Figure 1), followed by a discussion of the application of the models to the study of *Salmonella* pathogenesis.


**Figure 1.** Use of organotypic models in *Salmonella* biology. We highlight studies that have utilized organotypic models to better understand *Salmonella* biology, as well as studies that do not directly pertain to *Salmonella* research but have the potential to be deployed for better understanding of the dynamics of host-*Salmonella* interactions.

#### **2. Historical Background**

The organoid model systems have been established on the basis of several studies that have contributed to various aspects of the technology [22]. One key advance was the development of cell culture methods that permitted cells to form 3D structures. Many of the non-physiological properties of cells in the conventional 2D cell cultures are believed to be due to the loss of 3D geometry [23]. Several techniques have been developed to deal with this problem, such as the hanging drop method [24], 3D micro-molds [25], and rotating wall vessel (RWV) bioreactors [26], a more advanced method that permits larger-scale production of 3D assemblies. Nickerson et al. [26] utilized an RWV bioreactor developed by NASA that generates low shear and microgravity, allowing cells to remain in suspension to aggregate and grow three dimensionally. The authors were able to establish a 3D culture of Int-407 cells (derived from a normal fetal intestine but later shown to have HeLa cell contamination), thereby promoting cellular differentiation. The resulting 3D aggregates modeled human in vivo differentiation with well-defined cell-to-cell borders, tight junctions, apical-to-basal polarity, and microvilli development. These features were not present when the Int-407 cells were grown as conventional monolayer cultures. Thus, the study helped to demonstrate that 3D organization provides important physical and mechanical cues that facilitate organogenesis. A 3D platform is generally achieved by using matrices such as Matrigel, which consist of proteins typically found in the extracellular matrix (ECM). The ECM is the non-cellular component present within all tissues and organs, and consists mostly of collagen, enzymes, and glycoproteins to provide both structural and biochemical support to surrounding cells. ECMs have been shown to promote the maintenance of structural and functional characteristics of the tissue of origin [27].
The other major events that propelled the development of organoids were advancements in the understanding of stem cell properties and developmental biology of the intestine. The defining characteristics of stem cells, i.e., the capacity for clonal expansion and the ability to differentiate into daughter cells of multiple lineages, have been vital for the development of organoid technology. In fact, with the right cues daughter cells differentiate into multiple cell types and undergo two processes: (1) cell sorting out, the formation of discrete domains by different cell types, and (2) spatially restricted lineage commitment, the specialization of function of a cell based on its position in the tissue architecture [28]. These processes result in complex 3D, self-propagating, multicellular, and functional structures [29]. The improved understanding of intestinal development helped define some of these cues. Studies have shown that different molecular gradients converge to create the micro-environment that shapes the intestinal epithelium. Epidermal growth factor (EGF) and Wingless-related integration site protein (Wnt) are highly active in the crypt and are necessary for proliferation, while bone morphogenetic protein (BMP) and Notch signaling pathways active in the villus control cell fate programming [30–32].

Another milestone in the field came in 2007 when Barker and colleagues [33] demonstrated in adult mice that intestinal crypt stem cells, which are responsible for proliferation and migration of the epithelial layer towards the villus tip, express the marker leucine-rich repeat containing G-protein coupled receptor 5 (Lgr5) [33]. Shortly thereafter, Ootani et al. [34] developed a long-term culture of primary mouse intestines using air-liquid interface methodology [34]. Fragments of neonatal mouse intestines containing epithelial and mesenchymal cells were cultured to generate cyst-like structures containing all major cell types found in an adult mouse intestine, including Lgr5<sup>+</sup> stem cells that the authors demonstrated could be modulated by Wnt signaling [34]. The next critical step occurred in 2009, when Sato and colleagues [5] realized the power of these organ-specific stem cells in 3D cultures by isolating intestinal crypts from adult mice. The authors demonstrated that a single Lgr5<sup>+</sup> cell could form 3D crypt-villus epithelial structures in the presence of appropriate cues, including an environmental signal from a laminin-rich 3D matrix (Matrigel) and a physiological signal in the form of growth factors (the Wnt agonist R-spondin1, EGF, and the BMP-inhibitor Noggin) [5]. The model recapitulated the physiology and organization of the mouse intestine and, when transplanted into mice, successfully reconstituted crypt-villus units [35,36]. Thus, the adult stem cells (ASCs) develop self-organized structures that consist of various types of intestinal epithelial cells. The authors referred to these structures as "organoids"; however, according to the guidelines set by the Intestinal Stem Cell Consortium [9], the model should be called "enteroids" since it was derived from the small intestinal epithelial cells.

Regardless of the definition, this work laid the groundwork for the utilization of pluripotent stem cells (PSCs), either derived from embryonic tissue or generated by inducing stemness in mature somatic cells (iPSCs) [37–40]. The latter offer the additional advantages that invasive procedures to obtain intestinal or colonic biopsies are not necessary and that iPSCs bypass ethical concerns regarding procurement of embryonic stem cells. The PSC-derived structures consist of epithelial cells surrounded by mesenchymal stromal cells and are referred to as "organoids". During embryonic development, the mesenchyme induces and specifies epithelial identity and differentiation, and is critical for the maintenance of tissue identity [41,42]. However, for interrogation of epithelial cell-specific functions, the mesenchymal cells may serve as a hindrance. Mithal and colleagues [43] have recently identified a directed differentiation protocol that generates mesenchyme-free human intestinal organoids (HIOs), which were employed to measure cystic fibrosis transmembrane conductance regulator (CFTR) gene function using cystic fibrosis patient-derived iPSC lines. The epithelial cells of organoids are not fully mature and resemble fetal epithelial cells when transcriptionally profiled [44]. The maturation of these organoids requires implantation in vivo, such as on the kidney capsule of immunocompromised mice [8,39,44,45]. The fetal-like nature of PSC-derived organoids presents an advantage as it allows for studies that would otherwise be difficult to conduct given the limited availability and ethical concerns surrounding the use of human fetal tissues. PSC-derived organoids follow a differentiation path observed during development [46] and take one to two months to develop, while tissue-specific ASCs develop organoids by a process that recapitulates tissue repair and is complete within weeks [20].
The epithelial cells derived from ASCs (enteroids) represent adult-like mature epithelium. The cells of the enteroids also preserve characteristics of the tissue of origin, including a diseased phenotype if derived from patients bearing an intestinal condition. These characteristics make enteroids from ASCs a faithful model for intestinal diseases [47–49].

#### **3. Intestinal Organoids**/**Enteroids**

Intestinal enteroids derived from the intestinal stem cells develop multi-lobed structures (Figure 2) consisting of crypt-villus units, with a clearly defined proliferative, crypt-like zone and a large, non-proliferative differentiated zone surrounding proliferative areas [50]. To generate organoids, the pluripotent stem cells are first programmed towards endoderm development to form the epithelial tube that eventually gives rise to the foregut, midgut, and hindgut in response to a combination of anteriorizing and posteriorizing growth factors provided. Additional mesenchymal cells are also present and consist of myofibroblasts and smooth muscle cells surrounding the epithelial cells, presumably derived from remnant cells of mesodermal lineage after endoderm induction [20,51]. The epithelial cells reach homeostasis with a proliferative rate matching the rate of shedding of the cells at the edge of the monolayer. All major epithelial cell types are generated, including goblet cells, enteroendocrine cells, Paneth cells, antigen-sampling microfold cells (M cells), and columnar enterocytes that have a brush border of apical microvilli. The cells of the intestinal enteroids are suitably polarized, and produce and secrete mucus from the apical surface. The cellular composition of the organoids/enteroids can be varied depending on the factors that are supplemented during the process of growth and differentiation [34]. Cell types not previously generated or maintained in traditional cell culture have been produced and/or characterized in organoids/enteroids, including the enteroendocrine cells and M cells [52,53]. In 2017, Haber and colleagues [54] profiled epithelial cells from both the mouse small intestine and crypt-derived enteroids generated from the mouse small intestine, with the aim of identifying novel subtypes and defining genetic signatures.
As M cells are a scarce cell type comprising only 10% of the rare follicle-associated epithelium, the authors did not detect M cells in the transcriptomes profiled from the mouse small intestines. However, small intestine-derived enteroids treated with the growth factor Receptor Activator of Nuclear factor Kappa-B Ligand (RANKL) to generate M cells displayed M cell-specific signatures [54].

**Figure 2.** Bright-field photomicrograph depicting a representative field of human duodenum-derived enteroids in Matrigel. Each of the structures in the figure represents an enteroid at 8 days of culture, consisting of a 3D cellular aggregate organized into an epithelial monolayer resembling that of the small intestine. The apical surface of the monolayer faces the center of the enteroid while the basolateral surface faces the exterior. Scale bar indicates 1.0 mm; image from Stefania Senger, unpublished data.

A unique property of the enteroids is that the 3D structure of the tissue is not required for terminal cellular differentiation [55,56]. Thus, organoids/enteroids can be dissociated enzymatically and reseeded onto Transwells to generate short-term 2D monolayer cultures consisting of various differentiated epithelial cell types that can then be used for experimentation [55–58]. Moon and colleagues [55] established epithelial cell monolayers from cultured intestinal spheroids derived from the colon of mice. The differentiated monolayers displayed a robust transepithelial electrical resistance (TEER, a measure of barrier integrity) and the ability to transcytose secretory immunoglobulin A (sIgA) upon stimulation with microbial products [55]. In another study, the same group (VanDussen et al. [56]) adapted the technique to generate polarized 2D monolayers from human enteroids that were used to conduct adherence assays with enteropathogenic and diarrheagenic *Escherichia coli* [56]. The monolayers generated from enteroids maintain apical-to-basal polarity and barrier function while allowing easy access to the apical and basolateral compartments. The monolayers also maintain the genetic and functional phenotypes of the organoid/enteroids of origin [56] and can be manipulated to express a crypt-like or villus-like phenotype by changing media composition [49]. This system lends itself to many standard functional assays designed for 2D cells as well as to high throughput assays that cannot be easily performed on spherical enteroids/organoids, including but not limited to adhesion/invasion assays, transmigration assays, and inflammatory immune responses [49,51,55,56,59]. These monolayers can be cultured for up to three weeks (Stefania Senger, unpublished observations).
Recent studies [60–62] have developed techniques to generate monolayers that exhibit compartmentalization of proliferative crypt-like domains and differentiated villus-like regions closely resembling in vivo distribution. Liu et al. [60] generated self-renewing 2D monolayers by plating intestinal stem cells on a thin layer (10 μm) of Matrigel coated onto glass sheets, while Wang et al. [63] relied on the presence of collagen hydrogels in tissue culture plates. Liu and colleagues [60] also altered the combination of growth factors added to the medium, such as addition of blebbistatin and removal of EGF, to improve the survival and growth of the stem cell population. These models do not provide access to the basolateral side of the monolayers. To overcome this limitation, Altay and colleagues [62] cultured mouse-derived intestinal enteroids on Transwells coated with Matrigel that provided proper mechanical stiffness. The authors also boosted the proliferation of the stem cells by supplementing the culture medium for the basolateral compartment with conditioned medium from intestinal sub-epithelial myofibroblasts. The resulting monolayers possessed all cell types found in vivo as well as an effective barrier function. Thus, the 2D monolayers offer in vivo-like structural and functional characteristics with the convenience of the 2D format and should prove quite useful, particularly in studies interrogating effects on intestinal stem cells, including during *Salmonella* infection [64,65].

Overall, organoids/enteroids have several advantages. The models maintain the in vivo tissue architecture, cellular composition, and region-specific differentiation programming [28] while being genetically stable [66]. Organoids/enteroids are amenable to many of the established techniques of molecular analysis and manipulation, such as CRISPR/Cas9 technology [67], lentivirus transduction [42], single-cell RNA sequencing [53], and mass spectrometry [68], to name a few. Organoids/enteroids have been utilized for improving the basic understanding of tissue homeostasis, organogenesis, and physiological functions [4,69]. The systems have also been used to model diseases [49,51], to study host-pathogen interactions [59,65,70,71] and cancer [72], and to test potential vaccines and drugs [73–75]. The organoids/enteroids are also used as new tools for personalized medicine, where patient-derived models can serve as platforms for testing treatment options. For example, Dekker and colleagues [76] utilized intestinal enteroids derived from patients with cystic fibrosis to assess the responsiveness to CFTR-modulating drugs. Several large repositories of organoids/enteroids from multiple patients have been established and can be a source of both healthy and diseased cells globally, providing researchers access to a varied genetic background to test their hypotheses [10]. The models even have the potential to be a source for transplantation [35]. Below, we will highlight the use of organoids/enteroids for studying *Salmonella* pathogenesis.

#### **4.** *Salmonella enterica*

The genus *Salmonella* is a major foodborne pathogen [77]. It comprises two species: *S. bongori* and *S. enterica*. Almost all *Salmonella* organisms that cause disease in humans and domestic animals are serovars belonging to *S. enterica* subspecies *enterica*. Broadly, the diseases caused by *Salmonella* in humans are of two kinds: (1) a systemic febrile illness termed typhoid/enteric fever, and (2) an acute self-limiting gastroenteritis. The serovars that cause typhoid are referred to as typhoidal *Salmonella* and include *S. enterica* subsp. *enterica* serovar Typhi and *S. enterica* subsp. *enterica* serovar Paratyphi A, B, and C. *S.* Typhi causes approximately 76.3% of global enteric fever cases [78]. *S.* Typhi and *S.* Paratyphi are restricted to humans and higher primates, and clinical manifestations of the infection include sustained high fever, abdominal pain, headache, weakness, malaise, and transient diarrhea/constipation. Without appropriate and effective antibiotic therapy, the infection may lead to gastrointestinal bleeding, intestinal perforation, septic shock, and death [79,80]. The Non-Typhoidal *Salmonella* (NTS) serovars cause self-limiting gastroenteritis and include *S. enterica* subsp. *enterica* serovar Typhimurium and *S. enterica* subsp. *enterica* serovar Enteritidis, which are the most prevalent clinical isolates, according to the World Health Organization (WHO). These pathogens are broader in host range and infect humans and animals such as poultry, cattle, reptiles, and amphibians. Infections with NTS typically involve diarrhea, stomach cramps, headache, vomiting, and fever that resolve on their own; however, the infection can be severe in children and the elderly and can sometimes be fatal [81]. NTS serovars have been reported to cause an invasive infection similar to typhoid, particularly in sub-Saharan Africa, predominantly in children and HIV-positive adults, with several co-morbidities such as ongoing or recent malaria infection and malnutrition contributing to higher mortality [82–84]. *Salmonella* infections represent a considerable economic burden and public health concern in both developing and developed countries. The Centers for Disease Control and Prevention (CDC) estimates that 1.35 million infections by *Salmonella* occur in the United States each year. NTS causes more than 93 million global infections per year, with 155,000 deaths [85]. A study examining the global burden of diseases, injuries, and risk factors in 2017 reported 14.3 million cases of enteric fever globally, with a case fatality of 0.95% resulting in approximately 135,000 deaths [78]. Children, the elderly, and those residing in lower-income countries account for the greatest incidence [86,87].
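
The burden figures quoted above can be cross-checked with simple arithmetic. The short Python sketch below is only an illustrative check and is not part of the cited analyses: it multiplies the 14.3 million enteric fever cases by the 0.95% case fatality reported in [78], applies the 76.3% *S.* Typhi share quoted earlier, and derives the fatality ratio implied by the NTS estimates in [85]. The variable names are our own, and the NTS case count is assumed to be 93 million, which is a lower bound for the "more than 93 million" stated in the text.

```python
# Illustrative cross-check of the burden figures quoted in the text.
# All inputs come from the paragraph above; variable names are hypothetical.

enteric_fever_cases = 14_300_000   # global enteric fever cases in 2017 [78]
case_fatality = 0.0095             # 0.95% case fatality [78]
typhi_share = 0.763                # share of cases attributed to S. Typhi [78]

nts_cases = 93_000_000             # assumed lower bound: "more than 93 million" per year [85]
nts_deaths = 155_000               # NTS deaths per year [85]

# Deaths = cases x case fatality
enteric_fever_deaths = enteric_fever_cases * case_fatality
# Cases attributable to S. Typhi = cases x share
typhi_cases = enteric_fever_cases * typhi_share
# Upper bound on the implied ratio, because the case count is a lower bound
nts_fatality_ratio = nts_deaths / nts_cases

print(f"Enteric fever deaths: {enteric_fever_deaths:,.0f}")
print(f"S. Typhi cases: {typhi_cases / 1e6:.1f} million")
print(f"Implied NTS fatality ratio: at most {nts_fatality_ratio:.2%}")
```

The first product, about 135,850, agrees with the approximately 135,000 deaths quoted from [78]. The other two values are derived here for orientation only and do not appear in the cited sources.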

*Salmonella* is acquired via contaminated food and water. Luminal bacteria invade M cells and absorptive enterocytes via a specialized apparatus called the type three secretion system (T3SS) [88] encoded on the *Salmonella* Pathogenicity Islands (SPIs) [89]. The T3SS injects bacterial proteins into host cells, allowing the bacteria to essentially commandeer host cellular processes to induce cytoskeletal rearrangements that engulf the bacteria into specialized vesicles called the *Salmonella*-containing vacuoles (SCVs) [90]. This invasion process, with subsequent translocation across the epithelium, is followed by the uptake of the bacteria by macrophages and dendritic cells in the intestinal sub-mucosa, where bacterial proteins interfere with phagolysosomal maturation and allow the bacteria to survive inside the cells [91–93]. NTS serovars undergo prolific growth in the intestine, while T3SS effectors induce fluid secretion and promote inflammation. Immune signaling via recognition of pathogen-associated molecular patterns such as lipopolysaccharide (LPS) and flagella also induces a robust inflammatory response, which actually provides *Salmonella* with a growth advantage over resident microflora [81,94–96]. The immune response eventually limits *Salmonella* growth; nevertheless, the short-term proliferation is sufficient to ensure propagation. Typhoidal *Salmonella* strains elicit a more attenuated inflammatory response in the intestine, especially in terms of limited neutrophil recruitment [79,97,98]. Bacteria migrate to the mesenteric lymph nodes (MLN) and spread systemically within reticuloendothelial cells and as free bacteria in the blood or lymph, establishing new foci of infection in the liver, spleen, bone marrow, and gallbladder [99]. At these new sites the bacteria replicate and re-enter the intestinal lumen via secretion in bile, promoting shedding of the bacteria to continue the cycle of new infections by contaminated food and water.
In 3%–5% of cases, the bacteria can persist for long durations in the gallbladder, which serves as a reservoir of chronic infection [100]. Chronic infection with *Salmonella* has been found to be a risk factor for the development of malignant neoplasms, including gallbladder cancer [101,102] and colorectal cancer [102,103].

The emergence of multi-drug resistance to conventional antibiotics complicates the treatment of *Salmonella* infection [104–107]. Antibiotic treatment destroys the resident microflora, provides a niche for *Salmonella* to proliferate, and may lead to increased levels of bacterial shedding [108]. Additionally, bacterial populations that express an antibiotic-tolerant phenotype can evade treatment and persist, causing relapses of the infection as well as the evolution of bacterial virulence [109,110]. Asymptomatic carriers act as reservoirs, contribute to the continued propagation of the pathogen, and are particularly important for food safety considerations. Currently, there are no effective vaccines against gastrointestinal *Salmonella* infections. Several typhoid vaccines are available and licensed in many countries; however, the protection they provide is limited, and vaccination has been associated with injection-site reactions. Furthermore, the vaccines have not been widely adopted by public health programs [111]. With the significant aging populations in both developed and developing countries [112,113], more people are at risk for severe consequences of *Salmonella* infections. Advances in the mechanistic understanding of *Salmonella* infections will facilitate the development of improved control strategies, particularly safe and effective vaccines. The broad conservation of host responses as well as the molecular machinery used by *Salmonella* strains during infection of various hosts, namely T3SSs encoded by SPI1 and SPI2 that enable invasion of epithelial cells and subsequent intracellular survival, allows several model systems to be applied for *Salmonella* research [114]. Each non-human model has pros and cons, and varies in the ability to recapitulate natural infection. Given the host-specific aspects of infection physiology, it is necessary to be cautious in applying the results from these model systems to human patients. 
Additionally, with *Salmonella* being an important tool to understand host physiology, metabolism, immune function, and interactions with microbes, human-specific investigations are valuable to the research community.

#### **5. Model Systems to Study** *Salmonella* **Biology**

Transformed cell lines such as Cos-1, MDCK, HeLa, HepG2, CaCo2, and T-84 have been used to carry out several fundamental studies on *Salmonella* pathogenicity, such as the identification of the T3SS and the SPIs [88,115]. However, these cells have the drawback noted earlier in that they do not adequately represent the physiological characteristics of normal human tissue [116,117]. Explant tissue cultures have organotypic properties that can be vital for studies on development and physiology, but are limited by culturing difficulties and short life span [118,119]. Classically, animal models have offered solutions for several of these limitations, and have been used to corroborate data obtained from other model systems as well as to investigate the deeper molecular mechanism of infection. For example, *S.* Typhimurium is a natural pathogen of calves and causes a gastroenteritis with clinical and pathological manifestations similar to those in humans, namely diarrhea, anorexia, fever, localized infection, and neutrophil infiltration [120,121]. Meanwhile, the bovine ligated loop was instrumental in characterizing fluid accumulation and host inflammatory responses following *S.* Typhimurium infection. For example, loss of the invasion proteins SipA, SopA, and SopD was shown to have little to no effect on the ability of *S.* Typhimurium to invade epithelial cells, but to reduce fluid accumulation and neutrophil influx in bovine loops [122,123].

More importantly, the vast majority of studies on *Salmonella* pathogenesis have been conducted in the murine model, including studies for the development of new vaccines. This model has been useful in clarifying various aspects of in vivo *Salmonella* pathogenesis; however, it does have limitations with respect to its ability to faithfully reproduce all aspects of *Salmonella* infection in humans. *S.* Typhimurium causes two very different types of diseases in humans and mice. In humans, the infection is localized, dominated by the infiltration of neutrophils and self-limiting. In mice, *S.* Typhimurium spreads systemically, with slow infiltration of mononuclear inflammatory cells and little or no localized intestinal tissue injury. The host responses involved are also dramatically different. Since the clinical manifestations and pathology of mice infected with *S.* Typhimurium resemble those observed in humans with *S.* Typhi infection, the model has been used as a surrogate to study typhoid pathogenesis [124]. A mouse model of *S.* Typhimurium-induced enterocolitis has been developed and involves pre-treatment of mice with a single dose of streptomycin. This procedure diminishes the colonization resistance by the commensal microbiota, allowing *Salmonella* to grow to high densities in the cecum and large intestine and trigger acute gastroenteritis [125]. Mouse models have also been developed to mimic chronic infections of *Salmonella* observed in certain carrier individuals, which typically involve infecting susceptible mice with avirulent strains, or sub-lethal doses of *S.* Typhimurium in resistant mice. Experimental interpretations from the mouse model may not translate to human disease. *S.* Typhi and *S.* Typhimurium share about 89% of their genes, with approximately 500 genes unique to *S.* Typhimurium and 600 genes unique to *S.* Typhi, including the genes encoding typhoid toxin and the immunoprotective capsule [126,127]. *S.* Typhi and *S.* Paratyphi also have pseudogenes as well as small sequence differences in genes encoding the T3SS apparatuses and related effectors that may have important implications in pathogenesis. It is possible that virulence factors that may have no role in *S.* Typhi-mediated infection in humans may be important for mouse infection by *S.* Typhimurium and vice versa. Finally, *S.* Typhi genes required for causing typhoid but absent in *S.* Typhimurium cannot be studied easily using the latter. Typhoid fever can be induced experimentally by oral infection in higher primates or human volunteers, but these studies come with their own set of difficulties and ethical objections. 
Researchers have attempted to compensate for the discrepancies by developing humanized mice, i.e., immunodeficient mice engrafted with human hematopoietic cells [128,129], but these models are cumbersome, expensive, and still do not guarantee that mouse-specific factors will not add complexities or variations to the data generated. In addition, host genetic background has been found to play a role in susceptibility to invasive NTS infections, a concept that borders on the realm of personalized medicine [130]. Thus, organoids offer a promising model to mechanistically study host-specific aspects of infection.

#### **6. Organoids**/**Enteroids in** *Salmonella* **Biology**

One of the earliest studies that utilized organotypic 3D structures to investigate *Salmonella* pathology was carried out by Nickerson, CA et al., in 2001 [26]. As described above, the authors generated 3D organotypic cultures by growing the human embryonic intestinal cell line Int-407 in RWV bioreactors and subsequently infected the cells with *S.* Typhimurium. The resulting infection was quite different from what had previously been observed in monolayer cultures. There was minimal loss of structural integrity, lower ability of the bacteria to adhere to and invade epithelia, and lowered expression of cytokines in 3D Int-407 aggregates as compared to infected Int-407 monolayers. Since the authors observed that the 3D Int-407 aggregates more closely resembled in vivo characteristics (tissue organization, tight junctions, apical-to-basal polarity, microvilli development, expression of extracellular and basement membrane proteins, and greater M cell glycosylation pattern), the authors concluded that the infection phenotypes observed in the 3D aggregates were likely representative of an in vivo infection. This study laid the groundwork for the use of 3D organotypic cultures to study *Salmonella* biology. The following section will highlight research performed in both mouse and human models that has improved our understanding of *Salmonella* pathogenesis (Figure 3).

#### *6.1. Mouse-Derived Models*

Following the establishment of protocols to generate crypt-derived mouse intestinal enteroids (referred to as organoids by the authors) by Sato et al., Zhang and colleagues [65] in 2014 utilized the system to analyze interaction of *S.* Typhimurium with epithelial cells. The authors visualized bacterial infection, while also observing bacterial-induced disruption of tight junctions, activation of the nuclear factor kappa-light-chain-enhancer of activated B cells (NF-κB)-mediated inflammatory response, and a decrease in the stem cell marker Lgr5. The authors noted that these observations were similar to findings in animal models. The caveat to this study is that the *Salmonella* were not delivered into the lumen of the enteroids, the location of the initial contact of bacteria with epithelial cells in vivo; instead, the bacteria were added to the medium and came in contact with the enteroids basolaterally. Nevertheless, this study established mouse-derived enteroids as a model system for studying *Salmonella* infection biology.

Since this initial study, enteroids have been used to interrogate various aspects of *Salmonella* pathology, including investigating cell types that were previously not accessible to study in vitro. Farin and colleagues [131], in a 2014 study, used mouse intestinal enteroids to study the control of Paneth cell (PC) degranulation in response to bacteria or bacterial molecules such as LPS. The authors found that PC degranulation did not occur upon stimulation with microbial molecules or *Salmonella,* but was induced by a novel mechanism requiring only the presence of recombinant interferon gamma (IFN-γ) [131]. In another study, Wilson and colleagues [132] interrogated the antimicrobial role of Paneth cell α-defensin peptides. The authors developed small intestinal epithelial enteroids from both wild-type mice and mice deficient in α-defensin production (*Mmp7-*/*-* mice; MMP7 is a matrix metalloprotease that is required to generate bactericidally active α-defensins in mice [133]), and infected the enteroids with *S.* Typhimurium by microinjecting the bacteria directly into the lumen. The absence of mature α-defensins reduced the intra-luminal bacterial killing, which could be partially restored by expression of the human defensin HD5 [132]. This study demonstrated the contribution of α-defensins to the innate immune response to *Salmonella,* which previously had been a challenge to examine since most of the earlier experimental systems inadequately recapitulated in vivo cellular processes [134].

Epidemiological studies have suggested that *Salmonella* contributes to the development of cancer [102]. Scanu and colleagues [135] probed this phenomenon in a 2015 study using the case of gallbladder cancer (GBC). The authors derived gallbladder enteroids from mice carrying mutations that inactivate p53 and are known to be found in GBC patients in India, where the disease is prevalent. When exposed to wild-type *S.* Typhimurium, single cells derived from the gallbladder enteroids carrying these predisposing mutations generated new enteroids that exhibited growth factor independence, which is one of the hallmarks of transformation, and had histopathological features consistent with neoplastic transformation, thus establishing a direct association between *Salmonella* and cancer. To delve into the mechanism of this transformation, the authors looked at the *Salmonella* T3SS-mediated activation of AKT or mitogen-activated protein (MAP) kinase pathways, which have been shown to be elevated in human cancers. The signals activated by AKT and MAPK were found to be key in driving the cellular transformation and were sustained even after the eradication of the *Salmonella* infection. Studies have shown that AKT and MAPK pathways are activated by other bacteria and viruses that have been associated with various cancers [136–139]. Although the authors employed a murine gallbladder enteroid model and *S.* Typhimurium to study GBC, the authors proposed that the AKT and MAPK pathways are activated by both *S.* Typhimurium and *S.* Typhi serovars, and contribute to development of cancer that is associated with chronic *S.* Typhi infection in humans. Chronic infection by *Salmonella* has also been found to be a risk factor for developing cancers in the ascending and transverse colon [103]. However, detailed mechanistic studies of *Salmonella*-associated colon carcinogenesis need to be carried out and organoid model systems may prove to be extremely useful for this purpose.

Finally, another important aspect of *Salmonella* is the interaction of the pathogen with the host microbiome. Lu and colleagues [140] recently demonstrated that *Lactobacillus acidophilus*, a well-established probiotic bacterium, can alleviate damage caused by *S.* Typhimurium. Earlier studies had shown that *L. acidophilus* can inhibit adhesion of *Salmonella* to CaCo2 cells [141]. In this study, the authors extended the mechanism of protection to include the effects on the host. *L. acidophilus* altered the differentiation of epithelial cells in crypt-derived enteroids by impeding the *Salmonella*-mediated expansion of Paneth cells [131], thus maintaining homeostasis and appropriate epithelial composition during the infection. This study not only improved our understanding of the role of *L. acidophilus* in protecting the epithelial lining, but also demonstrated the ability to include microbiota-specific analyses to study *Salmonella* infection with enteroids/organoids.

#### *6.2. Human-Derived Models*

The potential to gain significant insight into *Salmonella* pathogenesis is particularly relevant in relation to the use of human-specific organoids, especially since these models possess human genetic specificities absent in mice. Studies have interrogated the usefulness of intestinal enteroids/organoids derived from human ASCs or PSCs to understand complex interactions between the epithelium and *Salmonella*. In 2015 Forbester and colleagues [70] used RNA sequencing to examine the epithelial transcriptional signature following injection of *S.* Typhimurium into the lumen of organoids derived from human induced PSCs (hiPSCs). The analysis showed significant up-regulation of genes for cytokine-mediated signaling, NF-κB activation, angiogenesis, and chemotaxis. Enhanced release of pro-inflammatory cytokines IL-8, IL-6, and TNF-α was also confirmed. The findings were consistent with prior studies in animal and mouse organoid models, thus establishing the human organoids as a viable infection model for *Salmonella*. The study also demonstrated that a noninvasive mutant strain (deficient in *invA* gene) could be used in the model to examine *Salmonella* pathogenicity and the functionality of defined mutants [70]. Furthermore, the authors generated an RNA sequencing data set following basolateral administration of *Salmonella* to the organoids. Interestingly, 49 of the 100 most highly upregulated genes were also significantly induced in the data set obtained by microinjecting the bacteria for apical infection. The data provide credence to the results of the Zhang and colleagues study [65], which documented similar patterns of gene expression upon basolateral administration of *Salmonella* to mouse organoids as had been observed earlier in literature with other model systems. Thus, the hiPSC organoids maintain a conserved response to *Salmonella* infection and provide a human-specific model for pathogenesis studies.
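The comparison described above, in which the 100 most highly upregulated genes from one infection route are checked against the genes significantly induced by the other route, can be sketched as follows. This is an illustrative sketch only and not the authors' pipeline: the function name, the gene identifiers, and the fold-change values are hypothetical placeholders.

```python
def top_n_overlap(fold_changes, significant_in_other, n=100):
    """Return the top-n upregulated genes that are also significant elsewhere.

    fold_changes: dict mapping gene name -> fold change in the first data set.
    significant_in_other: set of genes significantly induced in the second data set.
    Returns (shared_genes, fraction_of_top_n_shared).
    """
    # Rank genes from most to least upregulated and keep the first n.
    ranked = sorted(fold_changes, key=fold_changes.get, reverse=True)
    top = ranked[:n]
    if not top:
        return [], 0.0
    shared = [gene for gene in top if gene in significant_in_other]
    return shared, len(shared) / len(top)


# Hypothetical toy data standing in for the basolateral and apical data sets.
basolateral_fc = {"GENE_A": 6.1, "GENE_B": 4.8, "GENE_C": 3.9,
                  "GENE_D": 2.2, "GENE_E": 1.1}
apical_significant = {"GENE_A", "GENE_C", "GENE_E"}

shared, fraction = top_n_overlap(basolateral_fc, apical_significant, n=4)
print(shared, fraction)
```

With real data, `n=100` and an overlap of 49 genes would correspond to a fraction of 0.49, the figure reported by Forbester and colleagues [70].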

A subsequent study further demonstrated the validity of hiPSCs as a model to study human-specific responses to *Salmonella* infection. Using the same model system, the authors explored the role of the cytokine IL-22 in priming intestinal epithelial cells towards a more effective response to *S.* Typhimurium [142]. The study showed that pre-treatment of hiPSC-derived organoids with IL-22 increased phagolysosomal fusion, leading to enhanced antimicrobial activity. Thus, this study confirmed earlier observations made in mouse organoids [143].

The fidelity of organoid-derived data in representing human disease was further demonstrated in 2018 by Nickerson, KP and colleagues [71], who compared infection of human tissue biopsies and human intestinal enteroid-derived monolayers seeded on a 2D Transwell system, and observed that the enteroid-derived epithelial monolayers recapitulated *S.* Typhi infection observations made in the tissue biopsy model. The authors also carried out transcriptional profiling of both the host tissue and the bacteria in order to determine early critical interactions. Infection with *S.* Typhi significantly down-regulated several host genes, including those involved in activation of the mucosal immune response, bacterial clearance, and cytoskeletal rearrangement. Interestingly in this model, a down regulation of SPI1 genes in *S.* Typhi was observed. This work demonstrated that *S.* Typhi reduces intestinal inflammation by limiting the induction of pathogen-induced processes through the regulation of virulence gene expression, which is a characteristic feature of human infection with *S.* Typhi. Transmission electron microscopic comparisons of the tissues and human organoid-derived epithelial monolayers showed that the monolayers reproduced the cytoskeletal arrangements, microvilli destruction, and vesicle-bound bacteria observed in tissues. There were no changes observed in paracellular permeability, increased death of host cells, or bacterial association with M cells, suggesting divergence from *S.* Typhimurium infection in mice. This study highlights the ability of organoids to compare human-specific responses to each *Salmonella* serovar, which is important in the context of translational capacity for developing prophylactic or therapeutic intervention strategies against *S.* Typhimurium versus *S.* Typhi infections.

Despite the multiple advantages of the organoids as an experimental system, the technology is still in its infancy and has certain limitations. The complex structure of the organoids poses a practical limitation in accessing the internal luminal compartment. Researchers have used microinjections to access the apical epithelium. This approach may preserve the internal microenvironment, but is resource-intensive, may not allow synchronous exposure and suffers from variability in volume that can be injected due to heterogeneity of the organoid/enteroid sizes. In addition, the lumen of 3D organoid/enteroid accumulates cellular debris, which may bind bacteria or hamper interactions with the apical membranes. As noted above (in Section 3), researchers have turned to organoid/enteroid-derived 2D monolayers to better access the apical side of the model and enable more efficient, user-friendly analyses in a multiple-well plate format. However, this modification can limit the number of processes that can be interrogated, especially when considering the lack of 3D structure. Interestingly, Co and colleagues [144] demonstrated in a recent study that the polarity of human enteroids could be reversed such that the apical surface faced the medium and was readily accessible. The enteroids released mucus and extruded cells outwards into the culture medium rather than having the cells embedded in the basement membrane. Using enteroids with reversed polarity, the authors showed that *S.* Typhimurium invades and induces actin ruffles more efficiently at the apical surface compared to the basolateral surface. The authors observed a more diffuse process of epithelial invasion rather than invasion only or predominantly at the M cells [144], which confirmed the *S.* Typhi observations by Nickerson, KP et al. [71].

Current organoid/enteroid models are devoid of muscles, innervation, vascularization, and immune cells. Several approaches are being pursued to increase the complexity of organoid models, including co-culturing techniques. In 2011, Salerno-Goncalves and colleagues [7] generated an organotypic model using the human ileocecal adenocarcinoma cell line HCT-8 and adding primary endothelial cells, fibroblasts, and peripheral blood mononuclear cells (PBMCs), which they used in a 2019 study to probe the crosstalk between these cell types during infection with *S.* Typhi, *S.* Paratyphi A, or *S.* Paratyphi B [145]. An ECM composed of collagen-I enriched with other gastrointestinal basement proteins was embedded with the fibroblasts and epithelial cells, and transferred to a RWV bioreactor containing epithelial cells. Under low microgravity and low shear conditions, the HCT-8 cells behaved as multi-potent progenitor cells and gave rise to multiple cell types, including absorptive enterocytes, goblet cells, and M cells. After one to two weeks, PBMCs were added to the system. The co-culture model was then infected with the various *Salmonella* serovars to compare responses to the three strains. The authors found that the presence of the immune cells in the model resulted in secretion of the cytokines IL-1β and CCL3, while secretion of cytokines IL-6 and TNF-α was enhanced. Using depletion experiments, the authors showed that macrophages were the PBMC cell type responsible for the enhanced secretion of IL-6 and TNF-α. The authors further used the Transwell system to show that supernatants from organotypic models built with whole or macrophage-depleted PBMCs infected with the three *Salmonella* strains varied in their ability to elicit transmigration of macrophages and neutrophils [145]. Interestingly, the two immune cell types displayed crosstalk during infections with *S.* Paratyphi A and *S.* Paratyphi B, such that the presence of macrophages in the co-culture reduced neutrophil migration as compared to the system built without macrophages [145]. This study illustrates that co-cultures can aid in probing the contribution of immune cells to *Salmonella* infection at the mucosal surface. Finally, this model has also been used to assess the inflammatory response to several candidate *S.* Typhi vaccine strains in comparison to the response elicited by the oral vaccine strain Ty21a and its wild-type parent strain Ty2 [146]. Salerno-Goncalves and colleagues [146] found that specific changes to the genetic makeup of the candidate vaccine strains (in the form of deletions of specific metabolic genes) elicited host changes in intestinal permeability, inflammatory cytokine secretion, as well as activation of innate immunity pathways. Higgins and colleagues [73] also used the model to test the inflammatory response of an *S.* Typhimurium vaccine strain that they generated. These studies highlight the usefulness of co-cultured organoid/enteroid models in assessing important factors to be considered while designing vaccines.

Schulte and colleagues [147] generated a co-culture system of a human intestinal epithelial cell line (Caco-2), primary human microvascular endothelial cells, primary intestinal collagen scaffold, and PBMCs in a Transwell setup. Using GFP-labeled *S.* Typhimurium, microscopy, and flow cytometry, the authors demonstrated that the bacteria can be found in epithelial but not endothelial cells, thus modeling the epithelium-restricted infection of humans with *S.* Typhimurium. These findings are in contrast to those of Spadoni and colleagues [99] in the mouse model of *S.* Typhimurium infections where a breach of the gut-vascular barrier by the bacteria was observed. The endothelial cells respond to the infection process by bringing about changes in transcription of various genes and releasing the phagocyte chemoattractant IL-8. Such models, ideally with enteroid/organoid-derived cells replacing cell lines where used, should prove to be extremely useful and versatile in interrogating the role of different immune cells, vasculature, and the related crosstalk with epithelial cells during infection with *Salmonella,* especially for *S.* Typhi, where the bacteria spread systemically both as free bacteria and within reticuloendothelial cells [148].

#### **7. Future Directions**

Currently, researchers have little or no control over how cells self-organize into organoids. The physical environment of the enteroids/organoids in the form of the 3D scaffold provides cues such as adhesive ligands and stiffness. The current ECMs are derived from animals, are poorly defined, may show batch to batch variation that can lead to heterogeneous growth and differentiation between the organoids generated at different times, and are not mechanically pliable after plating. Thus, improvements to the 3D scaffolds are expected to yield better consistency, which would facilitate mechanistic infection studies and even provide a better platform for clinical applications. Some recent work [149] has focused on using chemically defined 3D scaffolds to improve the uniformity of the environmental cues for growth and differentiation. Indeed, a better defined and mechanically dynamic matrix would increase the potential of organoid technologies for therapeutic development (also reviewed in [150] and [151]).

Another important requirement is the further characterization of organoids/enteroids to ensure that the models faithfully represent the in vivo human physiology so that in vitro analyses with the system will be relevant for subsequent clinical development. Recent work in brain organoids has shown that important differences exist in the expression profiles between cells derived from organoids as compared to human brain cells [130]. Similar studies must be carried out in intestinal organoid/enteroid models, and indeed in all organoid models, to ensure fidelity of representation.

The enteric nervous system (ENS) carries out important functions in the gastrointestinal tract, such as motility and contractility, regulation of blood flow, maintenance of epithelial barrier, and fluid exchange [7]. Future investigations of the role of the ENS in *Salmonella* infection will be critical to our mechanistic understanding of the pathogen. Recent studies have shown that the neurons in the gut also protect against *S.* Typhimurium infections [152,153]. For example, Lai and colleagues [153] demonstrated that the pain-sensing neurons that lie beneath Peyer's patches in the gastrointestinal tracts of mice, become activated and release the neuropeptide calcitonin gene-related peptide (CGRP) upon detecting *S.* Typhimurium. This process results in a decreased number of M cells while increasing the levels of segmented filamentous bacteria (SFB) that can protect against *Salmonella* infection [153]. In light of studies like this, it becomes essential to probe and understand the interaction of *Salmonella* with the ENS. To facilitate analyses with the ENS, researchers have developed an ENS-HIO model. Workman and colleagues [154] generated vagal-like neural crest cells (NCCs, the precursors of ENS) from PSCs and incorporated them into human intestinal organoids via direct co-culture. The NCCs subsequently differentiated into multiple cell types including neurons and glia. The authors demonstrated that the neurons were functional by observing a response to oscillations in calcium. The ENS-HIO was then transplanted into mice and displayed contractile activity. Given the recent findings of the role of neurons in potential protection against *Salmonella* infection, this model has tremendous promise to further our understanding of the dynamics of the ENS-*Salmonella* interaction.

Methods to incorporate the microbiome into organoid-based models are essential to future *Salmonella* research. It is well known that *Salmonella* must successfully compete with the resident microflora in order to infect the intestinal epithelium, and several competitive mechanisms deployed by *Salmonella* have been elucidated [155]. Most often, microbiome studies are carried out in germ-free or humanized mice; however, there are significant differences between the composition of the microbiota and host physiology when using these animals as a model of humans. The healthy gut contains several anaerobic bacteria in the lumen given the decreasing oxygen gradient that extends from the anaerobic lumen to the hypoxic epithelium. Despite the complexities of microbiota interactions and oxygen levels, a few studies, such as the 2020 study by Lu et al. described above [140], have made progress in incorporating the microbiota into organoid models or reproducing anaerobic environments. Karve and colleagues [156] in a 2017 study incubated induced human intestinal organoids (iHIOs) with human neutrophils to model innate cellular responses to commensal and pathogenic *Escherichia coli*, monitoring neutrophil recruitment by microscopy. The authors were able to culture commensal *E. coli*, despite the fact that the bacterial strain is a facultative anaerobe. Greater recruitment of the neutrophils occurred when the organoids had been injected with pathogenic bacteria as compared to saline or commensal bacteria. In another study, Leslie and colleagues [157] demonstrated that anaerobic *Clostridium difficile* could remain viable for at least 12 h when injected into the lumen of HIOs, indicating that the model reproduces appropriate oxygen levels. Finally, LeBlay and colleagues [158] have devised a specialized bioreactor for cultivation of human intestinal microbiota that will enable researchers to grow a wide range of commensal bacteria for analyses. The authors immobilized fecal microbiota from a two-year-old child in gel beads and cultured them under anaerobic conditions with continuous flow of medium containing chyme. The bacteria grew as biofilms and formed stable populations. *S.* Typhimurium, also immobilized onto beads, was added to the reactor to simulate gut infection in children. 
This step was followed by addition of two concentrations of amoxicillin, and the effects of the treatment on the microbial composition and the metabolites generated were analyzed. The authors observed a strong disturbance in the microbial composition upon antibiotic treatment, with *Bifidobacteria* significantly decreasing in numbers while the *C. coccoides*-*E. rectale* group strongly increased. *S.* Typhimurium levels were also strongly decreased upon amoxicillin treatment, but returned to previous levels upon interruption of the antibiotic treatment. Antibiotic treatment also resulted in a decrease in the concentration of the metabolites acetate and butyrate, which remained at lower levels even after the withdrawal of antibiotic treatment. Together, these studies show that the incorporation of the microbiota into human-specific organoid systems will allow mechanistic studies investigating the crosstalk between host physiology, microbiota, and *Salmonella* pathogenicity.

#### **8. Conclusions**

Infections caused by *S. enterica* remain a major health concern worldwide. Models used to study the disease pathology so far have provided valuable advancements. However, there remains a disconnect between what works at the bench and what works at the bedside, particularly in the case of vaccines. The development of organoids/enteroids offers a tremendous opportunity to bridge this gap by bringing human-specific factors into the research models (Figure 3).

Gastrointestinal organoid and enteroid models have been shown to capture the cellularity, organization, and complexity of the intestine in vivo along with providing the flexibility of an in vitro system. These models have provided new and fundamental knowledge in human physiology, pathology and the molecular basis of host-microbe interactions. In addition, recent studies have seen efforts to build upon existing paradigms. Research is being conducted to improve the fidelity and reproducibility of the organoid model systems. Co-culture systems have been developed allowing the integration of various cell types, which allows us to better understand and interrogate the crosstalk between a wide range of potential combinations of lineages in vivo. Most importantly, the integration of immune cell types is likely to help in understanding how *Salmonella* is viewed by and responds to immune processes, particularly at the mucosal epithelium that is the first site of contact and thus an important stage of the infection process for the development of vaccines. There have also been studies that have integrated the microbiota into the organoid model and future work may delve into the role of an individual's microbiota in *Salmonella* pathogenesis. Despite its potential, it is important to keep in mind certain limitations of organoid-based systems, the most significant being that the models have to be thoroughly characterized for their ability to represent in vivo conditions for appropriate translation into the clinic. The addition of multiple cell types into organoid co-cultures only recapitulates a part of the body, i.e., the organ from which they are derived. The results of organoid co-cultures have to be complemented with whole organism studies and compared to human clinical findings.

We anticipate that the organoid and enteroid models will play key roles in future advancements in the understanding of *Salmonella* pathogenesis. Their use will facilitate research in drug development, host-microbe interactions, crosstalk between the host, microbiota and pathogens, and personalized medicine, and will contribute towards the development of successful vaccines for *Salmonella* Typhimurium and *Salmonella* Typhi.

**Figure 3.** Insights into *Salmonella* pathogenesis from intestinal organoids/enteroids. The key findings for *S.* Typhi and *S.* Typhimurium are highlighted.

**Funding:** S.V. and B.J.C. were supported by National Institutes of Health Grant R01AI089700. C.S.F. was supported by the National Institute of Allergy and Infectious Diseases Grant K22AI104755 and MGH ECOR Interim Support Fund Grant 2020A003037. S.S. was partially supported by MGH ECOR Interim Support Fund Grant 2019A004390 and NIDDK RO1DK104344-01A1 grant.

**Conflicts of Interest:** The authors declare no conflict of interest. The content is solely the responsibility of the authors and does not necessarily represent the official views of any of the funding agencies.

#### **References**


© 2020 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).

## **New Roles for Two-Component System Response Regulators of** *Salmonella enterica* **Serovar Typhi during Host Cell Interactions**

#### **Claudie Murret-Labarthe, Maud Kerhoas, Karine Dufresne and France Daigle \***

Département de microbiologie, infectiologie, immunologie, Université de Montréal, Montréal, QC H3T 1J4, Canada; c.m.labarthe@gmail.com (C.M.-L.); maud1693@hotmail.com (M.K.); karine.dufresne@umontreal.ca (K.D.)

**\*** Correspondence: france.daigle@umontreal.ca

Received: 9 April 2020; Accepted: 9 May 2020; Published: 13 May 2020

**Abstract:** In order to survive external stresses, bacteria need to adapt quickly to changes in their environment. One adaptive mechanism is to coordinate and alter their gene expression by using two-component systems (TCS). TCS are composed of a sensor kinase that activates a transcriptional response regulator by phosphorylation. TCS are involved in motility, virulence, nutrient acquisition, and envelope stress in many bacteria. The pathogenic bacterium *Salmonella enterica* serovar Typhi (*S.* Typhi) possesses 30 TCSs, is specific to humans, and causes typhoid fever. Here, we have individually deleted each of the 30 response regulators and determined their role during interaction with host cells (epithelial cells and macrophages). Deletion of most of the systems (24 out of 30) resulted in a significant change during infection. We have identified 32 new phenotypes associated with TCS of *S.* Typhi. Some previously known phenotypes associated with TCSs in *Salmonella* were also confirmed. We have also uncovered phenotypic divergence between *Salmonella* serovars, as distinct phenotypes between *S.* Typhi and *S.* Typhimurium were identified for *cpxR*. This finding highlights the importance of specifically studying *S.* Typhi to understand its pathogenesis mechanisms and to develop strategies to potentially reduce typhoid infections.

**Keywords:** *Salmonella* Typhi; two-component system; *cpxR*

#### **1. Introduction**

Bacteria possess a variety of systems that enable them to respond to diverse signals received from the external environment. These signals are mainly detected by two-component systems (TCS) composed of a histidine sensor kinase (SK) and a response regulator (RR). Physical or chemical signals, such as changes in extracellular ion concentrations, pH, oxygen, osmolarity, quorum sensing, and the presence of antibiotics are some of the signals detected by TCS. TCS are involved in adaptation to several conditions, notably stress conditions, host–pathogen interactions, symbiotic interactions, and intracellular signaling [1,2].

The SK partner of the TCS is located in the inner membrane and generally comprises two domains, a receiver and a transmitter domain that contains a kinase activity with a conserved histidine residue. Typically, the RR proteins are located in the cytoplasm and also comprise two domains, a receiver domain in the N-terminal section of the protein containing a conserved aspartate residue and a response domain in the C-terminal section of the protein. Detection of a signal by the SK results in autophosphorylation of the conserved histidine residue, an ATP-dependent process. The SK then activates the RR through transfer of its phosphoryl group to the conserved RR aspartate residue.

Once activated, the RR initiates the adaptive transcriptional response, through activation or repression of genes that will adjust the bacterial lifestyle to the conditions encountered [3].
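The signalling sequence described above (signal detection, ATP-dependent autophosphorylation of the SK histidine, phosphotransfer to the RR aspartate, RR activation) can be sketched as a toy state model. This is only an illustrative abstraction: the class and function names are hypothetical, and the model ignores kinetics, phosphatase activity, and cross-talk between systems.

```python
from dataclasses import dataclass


@dataclass
class SensorKinase:
    """Toy sensor kinase (SK); tracks only the conserved histidine."""
    his_phosphorylated: bool = False

    def sense(self, signal_present: bool, atp_available: bool = True) -> None:
        # Autophosphorylation of the conserved histidine is ATP-dependent.
        if signal_present and atp_available:
            self.his_phosphorylated = True


@dataclass
class ResponseRegulator:
    """Toy response regulator (RR); tracks only the conserved aspartate."""
    asp_phosphorylated: bool = False

    @property
    def active(self) -> bool:
        # In this sketch, the phosphorylated RR is the active form.
        return self.asp_phosphorylated


def phosphotransfer(sk: SensorKinase, rr: ResponseRegulator) -> None:
    """Move the phosphoryl group from the SK histidine to the RR aspartate."""
    if sk.his_phosphorylated:
        sk.his_phosphorylated = False
        rr.asp_phosphorylated = True


def signal_cascade(signal_present: bool, atp_available: bool = True) -> bool:
    """Return True if the RR ends up active after one round of signalling."""
    sk, rr = SensorKinase(), ResponseRegulator()
    sk.sense(signal_present, atp_available)
    phosphotransfer(sk, rr)
    return rr.active


if __name__ == "__main__":
    for signal in (True, False):
        print(f"signal={signal}: RR active = {signal_cascade(signal)}")
```

In this simplified picture, deleting the RR gene, as done in this study, removes the output of the pathway regardless of which kinase supplies the phosphoryl group.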

*Salmonella enterica* serovar Typhi (*S.* Typhi) is a human-specific bacterial pathogen and the etiologic agent of typhoid fever. This disease is common in Africa and Southeast Asia and causes between 11.9 and 26.9 million cases and 128,000 to 216,500 deaths per year [4]. Infection with this pathogen occurs through the ingestion of contaminated food or water. Once ingested, *Salmonella* must first resist stomach acidity [5,6], then reach the small intestine, cross the mucosal barrier of the intestine, and gain access to intestinal epithelial cells. Bacteria can then invade epithelial cells using the type-three secretion system (T3SS) located on *Salmonella* pathogenicity island 1 (SPI-1) [7,8]. *S.* Typhi does not elicit a strong intestinal immune response or inflammation, mainly by producing the Vi capsule [9]. It crosses the intestinal barrier, infects macrophages, and survives within vacuoles by using a second T3SS located on SPI-2 [10,11]. *S.* Typhi then causes a systemic infection by disseminating to deeper tissues including spleen, liver, bone marrow, and gallbladder [12].

Currently, most of our knowledge concerning TCS was obtained from studies done in *Escherichia coli* or *Salmonella enterica* serovar Typhimurium. Thus far, only six TCS have been characterized in *S.* Typhi. Both the EnvZ-OmpR system and the Rcs system activate the expression of the Vi capsule [13,14]. The Rcs system also represses invasion proteins and flagellin [14–17]. The PhoPQ system regulates the *S.* Typhi-specific CdtB, ClyA, and TaiA toxins [18–20], is expressed in typhoid patients [21], and a *phoPQ* deletion was used in a live attenuated *S.* Typhi vaccine [22]. The SsrAB system had no role in survival in macrophages in *S.* Typhi [23]. The CpxAR system is involved in adhesion and invasion of human intestinal epithelial cells and is activated by osmolarity [24]. The QseCB system is activated by several signals, including neurotransmitters (epinephrine and norepinephrine) [25,26], and invasion of epithelial cells increased in a *qseB* mutant of *S.* Typhi [27]. UhpBA regulates glucose-6-phosphate transport [28]. A comparative study of the transcriptional profile performed in *S.* Typhi indicates that UhpA was involved in the sulfur assimilation pathway [29]. Other TCS have not been studied in *S.* Typhi and some TCS have not been investigated in *S.* Typhimurium (CitAB, CreCB, DpiBA, TctED, and TorSR).

As some TCS play a role in *S.* Typhi infection, it is likely that other TCS may have a significant role in different stages of disease by this pathogen. To study the TCS of *S.* Typhi, we have deleted each of the genes encoding RR proteins, since it has been shown that some SK can also activate non-specific RR and complement defects of the specific corresponding SK mutant [30]. Non-polar deletions of genes encoding each RR protein were created by allelic exchange and we evaluated the ability of each mutant to adhere, invade, and replicate in human epithelial cells and to be phagocytosed and survive in human macrophages. This study represents a comprehensive characterization of all *S.* Typhi TCS and identifies a potential role for each of these systems in *S.* Typhi pathogenesis.

#### **2. Materials and Methods**

#### *2.1. Bacterial Strains and Growth Conditions*

*S.* Typhi strain ISP1820 was used throughout this study as the main wild-type strain [31]. Strains and plasmids used in this study are listed in Supplementary Tables S1 and S2, respectively. Bacteria were routinely grown overnight in Luria-Bertani (LB) broth, with agitation at 37 °C, unless indicated otherwise. When required, antibiotics or supplements were added at the following concentrations: 34 μg/mL chloramphenicol and 50 μg/mL diaminopimelic acid. Bacterial transformation was performed using the calcium/manganese-based method, as previously described [32].

#### *2.2. Chromosomal Deletion of TCS Regulatory Genes*

Thirty TCS were identified in the sequenced genome of *S.* Typhi strain CT18 [33] by searching for DNA binding proteins and regulators. Non-polar deletions of all the response regulator (RR) encoding genes were obtained by allelic exchange, as described previously [34], using the overlap-extension PCR method [35]. Deletions were confirmed by PCR. The primers used for mutagenesis are listed in Supplementary Table S3.

#### *2.3. Interaction with Cultured Human Epithelial Intestinal Cells*

The INT-407 (ATCC CCL-6) cells were cultivated in Eagle minimal essential medium (EMEM) (Wisent, St-Bruno, QC, Canada) supplemented with 10% heat-inactivated fetal bovine serum (FBS) (Wisent) and 25 mM HEPES (Wisent, St-Bruno, QC, Canada). The gentamicin protection assay described previously was adapted to 96-well plates and performed at a multiplicity of infection (MOI) of 20 [34]. Bacteria were grown overnight under static conditions (low aeration) in LB-NaCl (300 mM) to induce SPI-1 and were added in triplicate. After 90 min, infected cells were washed with phosphate-buffered saline (PBS) and fresh medium supplemented with 50 μg/mL gentamicin was added to kill the extracellular bacteria. Cells were lysed with PBS and 0.1% sodium deoxycholate (PBS-DOC) at 90 min (adhesion), 180 min (invasion), and 18 h (survival) post-infection. Serial dilutions were performed for enumeration of viable bacteria as colony-forming units (CFU/mL). The assay was performed at least three times in triplicate.
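The arithmetic behind such an assay (inoculum volume for a target MOI, CFU/mL from a countable dilution plate, and expression of a mutant relative to the wild-type) can be sketched as follows. This is a minimal illustration, not the authors' analysis script: the function names are hypothetical, and the host cell number, culture density, and colony counts are invented values used only to exercise the formulas.

```python
def cfu_per_ml(colonies: int, dilution_factor: float, plated_volume_ml: float) -> float:
    """CFU/mL of the undiluted sample from one countable plate.

    dilution_factor is the total fold-dilution of the plated sample
    (for example 1000 for a 10^-3 dilution).
    """
    return colonies * dilution_factor / plated_volume_ml


def inoculum_volume_ul(host_cells_per_well: float, moi: float,
                       culture_cfu_per_ml: float) -> float:
    """Volume of bacterial culture (in microlitres) giving the desired MOI."""
    bacteria_needed = host_cells_per_well * moi
    return bacteria_needed / culture_cfu_per_ml * 1000.0


def percent_of_wild_type(mutant_cfu: float, wild_type_cfu: float) -> float:
    """Express a mutant count relative to the wild-type count (wild-type = 100%)."""
    return 100.0 * mutant_cfu / wild_type_cfu


# Hypothetical numbers for illustration only (not data from this study).
lysate = cfu_per_ml(colonies=42, dilution_factor=1e3, plated_volume_ml=0.1)
volume = inoculum_volume_ul(host_cells_per_well=5e4, moi=20, culture_cfu_per_ml=1e9)
relative = percent_of_wild_type(mutant_cfu=2.1e5, wild_type_cfu=4.2e5)
print(f"{lysate:.3g} CFU/mL, {volume:.2f} uL per well, {relative:.0f}% of wild-type")
```

Normalizing each mutant to the wild-type measured in the same experiment is what allows the results to be reported as a percentage of the wild-type level.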

#### *2.4. Infection of Cultured Macrophages*

The THP-1 (ATCC TIB-202) cells were cultivated in RPMI 1640 (Wisent, St-Bruno, QC, Canada) supplemented with 10% heat-inactivated FBS (Wisent), 1 mM sodium pyruvate (Wisent), and 1% MEM non-essential amino acids (Wisent, St-Bruno, QC, Canada). The human monocytic cells were differentiated into macrophages by addition of 10<sup>−7</sup> M phorbol 12-myristate 13-acetate (Sigma) for 48 h before the infection. Similarly, the RAW264.7 (ATCC TIB-71) murine macrophages were cultivated in Dulbecco's Modified Eagle's Medium (DMEM; Wisent, St-Bruno, QC, Canada). The method was adapted to 96-well plates and performed at a MOI of 10 [36]. To obtain a similar number of intracellular bacteria, a MOI of 10 was used for macrophages to compensate for the phagocytic activity. Briefly, following an overnight growth in LB broth, the strains were added in triplicate. After 30 min, infected cells were washed with PBS, treated with gentamicin (50 μg/mL), and lysed with 0.1% PBS-DOC at 30 min (phagocytosis) and 18 h (survival) post-infection. Serial dilutions were then performed for enumeration of viable colony counts (CFU/mL). Each deletion was tested at least three times in triplicate.

#### *2.5. Motility Assays*

Motility assays were performed in tubes containing "Motility Test Medium" (BBL, BD, Mississauga, ON, Canada), to which a 1% solution of triphenyltetrazolium chloride was added. These agar tubes were inoculated by stabbing the agar with an overnight culture of bacteria. The tubes were then incubated at 37 °C for approximately 18 h, to evaluate the motility of the mutants. For each deletion, this assay was performed at least three times. Motility assays on plates were performed as described previously [37].

#### **3. Results**

#### *3.1. Deletion and Characterization of RR Mutants*

We have identified 30 RR genes in the genome of *S.* Typhi and an overview of their putative functions is summarized in Table 1. These TCS were all detected in the genome of the closely related serovar Typhimurium. However, these two serovars have a different host range and cause distinct diseases, suggesting that potential differences between these serovars may involve differences in gene regulation. All RR were deleted individually. Deletion of an internal fragment of each RR was achieved by allelic exchange in *S.* Typhi strain ISP1820. Each marker-less deletion was in frame, to avoid any polar effect. Mutants were characterized for their growth, susceptibility to aminoglycoside, and motility. All mutants had a similar growth curve in LB compared to the wild-type parent strain (data not shown). The *arcA* mutant produced smaller colonies on LB agar. The mutants were all sensitive to gentamicin and most mutants were as motile as the wild-type (except for *cheY*, as expected, and *ompR*, which demonstrated a reduced swimming area, 85% of the wild-type, in motility medium).


**Table 1.** Two-component systems of *Salmonella* Typhi and their putative function.

<sup>a</sup> Role in *Escherichia coli*.

#### *3.2. Adhesion, Invasion, and Replication in Epithelial Cells*

Passage through the intestinal epithelial cell barrier is a key step in the pathogenesis of *S.* Typhi. We used infection of epithelial cells to evaluate the adhesion, invasion, and replication of the TCS mutants of *S.* Typhi in this cell type. The wild-type *S.* Typhi ISP1820 strain was used as the reference control and its isogenic *invA* (SPI-1)/*ssrB* (SPI-2) mutant (here referred to as ΔSPIs) was used as a low virulence control, as this strain exhibits impaired host cell entry.

The adhesion level for the different TCS mutants ranged from 45 to 144% of the wild-type (Figure 1A). There were 5 mutants that showed a significant change in adherence compared to the wild-type strain. Three mutants (*cheY*, *ompR* and *pgtA*) were less adherent and 2 mutants (*narP* and *rcsB*) were more adherent. The *ompR* mutant was the least adherent, whereas the *rcsB* mutant had the highest level of cell adherence.

For the cell invasion phenotype, differences in invasion varied from 7 to 370% of the wild-type, and several mutants (16/30) showed a significant difference in cell invasion compared to the wild-type strain. Seven mutants showed increased invasion (*cheY*, *citB*, *narP*, *pgtA*, *pmrA*, *qseB* and *rcsB*) and 9 showed decreased invasion (*arcA*, *baeR*, *cpxR*, *ompR*, *phoP*, *qseF*, *sirA*, *tctD* and *torR*) (Figure 1B). The negative control (ΔSPIs) showed only 1.3% invasion compared to the wild-type, as expected. The TCS mutant demonstrating the most decreased invasion was *sirA* and the mutant with the highest increased invasion was *rcsB*.

**Figure 1.** Effect of loss of TCS response regulators on interaction with human epithelial cells. INT-407 epithelial cells were infected with *S*. Typhi wild-type strain and the isogenic RR mutants, and the level of bacteria associated with cells was determined upon adherence (90 min) (**A**), invasion (180 min) (**B**), or after 18 h (**C**). All assays were conducted in triplicate and repeated independently at least three times. The results are expressed as the mean ± SEM of the replicate experiments. Significant differences (\* *p* < 0.0001) in the levels recovered as compared to the wild-type were determined by the Student's unpaired *t*-test. The dashed line corresponds to the wild-type level.
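The summary statistics named in the figure captions (mean ± SEM and the Student's unpaired *t*-test) can be sketched with the Python standard library as follows. The sketch computes only the *t* statistic and its degrees of freedom under the equal-variance assumption of the classical Student's test; obtaining a *p* value would additionally require the *t* distribution (for example from a statistics package). The replicate values are invented for illustration and are not data from this study.

```python
from math import sqrt
from statistics import mean, variance


def mean_sem(values):
    """Return (mean, standard error of the mean) for a list of replicates."""
    sem = sqrt(variance(values) / len(values))  # sample variance (n - 1)
    return mean(values), sem


def student_t(a, b):
    """Student's unpaired t statistic (pooled variance) and degrees of freedom."""
    na, nb = len(a), len(b)
    df = na + nb - 2
    pooled = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / df
    t = (mean(a) - mean(b)) / sqrt(pooled * (1 / na + 1 / nb))
    return t, df


# Hypothetical percent-of-wild-type values from three independent experiments.
wild_type = [100.0, 96.0, 104.0]
mutant = [45.0, 50.0, 40.0]

m, sem = mean_sem(mutant)
t, df = student_t(mutant, wild_type)
print(f"mutant: {m:.1f} ± {sem:.1f} (t = {t:.2f}, df = {df})")
```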

For intracellular replication, the range was from 70 to 264% of the wild-type. There were 8 mutants demonstrating significantly different levels of replication, 6 that were higher (*copR*, *glnG*, *pgtA*, *pmrA*, *qseB*, and *uhpA*) and 2 that were lower (*citB* and *rstA*) than the wild-type control (Figure 1C). The *rstA* mutant had the greatest decrease, whereas the *qseB* mutant had the highest level of replication in epithelial cells. Interestingly, several mutants that were defective in invasion were able to replicate similarly to the wild-type.

#### *3.3. Uptake and Survival in Macrophages*

Some TCSs are important for survival of *Salmonella* inside macrophages, and survival within these cells represents a crucial step in the pathogenesis and virulence of *S.* Typhi, allowing it to disseminate systemically. Thus, we investigated the role of each TCS in uptake and survival in macrophages. The wild-type *S.* Typhi ISP1820 strain was used as the reference control and the *phoP24* isogenic mutant (PhoP constitutive) [108], known to be defective in virulence and macrophage survival [109], was used as a low virulence control. This control was chosen because the isogenic *invA* (SPI-1)/*ssrB* (SPI-2) mutant (ΔSPIs) survives as well as the wild-type strain in macrophages [23]. The level of internalization by macrophages varied from 76 to 404% of the wild-type (Figure 2A). There were 10 mutants with a significant difference in uptake by macrophages compared to the wild-type strain. Seven of the mutants showed increased uptake (*arcA*, *kdpE*, *narL*, *narP*, *ompR*, *pgtA*, and *rcsB*) and three mutants (*cpxR*, *dcuR*, and *glnG*) showed decreased macrophage uptake. The *glnG* mutant demonstrated the lowest level of uptake and the *rcsB* mutant showed the highest level of uptake by macrophages.

The level of survival in macrophages ranged from 29 to 249% of the wild-type (Figure 2B). There were 16 mutants with a significant difference in survival compared to the wild-type strain, 6 showed an increased survival (*copR*, *pmrA*, *rcsB*, *sirA*, *tctD*, and *torR*) and 10 demonstrated a decreased survival (*arcA*, *baeR*, *cheY*, *citB*, *cpxR*, *dpiA*, *kdpE*, *narP*, *ompR*, and *phoP*). The *phoP* mutant demonstrated the lowest survival and the *rcsB* mutant had the highest level of survival in macrophage.

**Figure 2.** Effect of loss of TCS response regulators during interaction with human macrophages. THP-1 cells were differentiated into macrophages and infected with *S*. Typhi wild-type strain and the isogenic RR mutants. The level of bacterial uptake (phagocytosis) (**A**) and the level of survival after 18 h infection (**B**) were determined. All assays were conducted in duplicate and repeated independently at least three times. The results are expressed as the mean ± SEM of replicate experiments. Significant differences (\* *p* < 0.0001) as compared to wild-type were determined by the Student's unpaired *t*-test. The dashed line corresponds to the wild-type level.

#### *3.4. Complementation*

In order to confirm that the phenotypic difference was associated with the RR mutation, we selected 4 mutants that were strongly under- or over-represented compared to the wild-type strain in invasion or survival level in macrophages. The *cpxR*, *ompR*, *rcsB*, and *sirA* mutants were complemented with a wild-type copy of the gene on a low-copy vector. Interactions with epithelial cells and macrophages were evaluated. Wild-type levels of association with cells were restored in the complemented strains (Figure 3).

**Figure 3.** Complementation. Epithelial INT-407 cells (**A**) and THP-1 macrophages (**B**) were infected with *S*. Typhi wild-type strain, the *cpxR, ompR, rcsB*, and *sirA* mutants and complemented mutants with a wild-type copy on a low-copy vector. All assays were conducted in triplicate and repeated independently at least three times. The results are expressed as the mean ± SEM of the replicate experiments. Significant differences (\* *p* < 0.0001) in the level between the wild-type and the mutant were determined by the Student's unpaired *t*-test. The dashed line corresponds to the wild-type level.

#### *3.5. Impact of the Vi Antigen*

It was previously demonstrated that RcsB and OmpR regulate the Vi capsule [14–17,19,20]. As these TCS showed strong phenotypes, often opposite, except for phagocytosis, we investigated the role of the Vi antigen during host cell interaction. We have constructed a *tviB* mutant as well as a double *tviB-ompR* and a double *tviB-rcsB* mutant and evaluated these strains with epithelial cells and macrophages (Figure 4). The lower level of adhesion to epithelial cells observed for the *ompR* mutant was specific to *ompR*, as the *tviB* mutant was not significantly different from the wild-type, whereas the double *tviB-ompR* mutant was similar to the *ompR* mutant. Similarly, the high level of invasion of epithelial cells observed for the *rcsB* mutant was specific to the *rcsB* mutation, as the *rcsB* mutant and the double mutant *tviB-rcsB* were both significantly different from the wild-type, whereas the *tviB* mutant was not. The loss of the Vi antigen did not increase the phagocytosis and survival level in macrophages, suggesting that the phenotypes observed were specific to the *ompR* and the *rcsB* mutations. We have confirmed by immuno-staining that the *ompR*, *rcsB*, and *tviB* mutants did not express the Vi antigen, in contrast to the wild-type strain and other mutants (Figure 4C).

**Figure 4.** Role of Vi capsule. Epithelial INT-407 cells (**A**) and THP-1 macrophages (**B**) were infected with *S.* Typhi wild-type strain, the *tviB*, *ompR*, and *rcsB* mutants, and the double mutants *tviB-ompR* and *tviB-rcsB*. (**C**) Production of the Vi antigen by immuno-staining. All assays were conducted in triplicate and repeated independently at least three times. The results are expressed as the mean ± SEM of the replicate experiments. Significant differences (\* *p* < 0.0001) in the level between the wild-type and the mutant were determined by the Student's unpaired *t*-test.

#### *3.6. Strain Specificity*

As all mutants were tested in *S.* Typhi strain ISP1820, we also investigated if the *ompR* phenotype was conserved in another *S.* Typhi strain. We generated an *ompR* deletion in *S.* Typhi Ty2, and this mutant also showed decreased infection of epithelial cells or macrophages (Figure 5).

**Figure 5.** Role of *ompR* mutant in *S.* Typhi strain Ty2. Epithelial INT-407 cells and THP-1 macrophages were infected with *S*. Typhi Ty2 strain and its isogenic *ompR* mutant. All assays were conducted in triplicate and repeated independently at least three times. The results are expressed as the mean ± SEM of the replicate experiments. Significant differences (\* *p* < 0.05) compared to the wild-type were determined by the Student's unpaired *t*-test.

Then, as the *cpxR* mutant was found to be significantly less invasive than the wild-type strain in *S.* Typhi, but was able to invade and replicate in epithelial cells at levels comparable to the wild-type strain in *S.* Typhimurium [53,76], we constructed this mutant in *S.* Typhimurium SL1344 and investigated its interaction with cells (Figure 6). During interaction with epithelial cells, the *cpxR* mutant of *S.* Typhimurium was similar to the wild-type strain, suggesting that the effect is specific to *S.* Typhi. There was also no difference between the wild-type and the SL1344 *cpxR* mutant when tested in the murine macrophages RAW264.7.

**Figure 6.** Comparison of the *cpxR* mutant of *S.* Typhi and *S.* Typhimurium. Epithelial INT-407 cells and THP-1 macrophages were infected with *S.* Typhi ISP1820 strain and *S.* Typhimurium SL1344 and their isogenic *cpxR* mutants. Both WT strains are set at 100%. All assays were conducted in triplicate and repeated independently at least three times. The results are expressed as the mean ± SEM of the replicate experiments. Significant differences (\* *p* < 0.0001) compared to the wild-type were determined by the Student's unpaired *t*-test.

#### **4. Discussion**

TCS are usually the first to detect a perturbation in the intracellular or extracellular environment and will react quickly to modify bacterial gene expression. They are involved in sensing a variety of signals (pH, ions, nutrients, stress, etc.). Therefore, TCS are critical for bacterial adaptation and survival. Here, we have identified 30 TCS in the genome of *S.* Typhi and summarized their putative function and role in *Salmonella* (Table 1). We have deleted each of the TCS regulator encoding genes from *S.* Typhi and tested interactions with human epithelial cells (adhesion, invasion, and replication) and macrophages (uptake and survival), which constitute two important niches of *S*. Typhi infection. Moreover, these mutants represent important tools to advance our knowledge of *S.* Typhi pathogenesis by investigating their roles during interactions with cells or under different environmental conditions.

All the TCS mutants grew similarly to the wild-type strain in liquid culture. Most of the TCS mutants (24/30) showed a significant difference compared to the wild-type strain during at least one step of infection (adhesion, invasion, replication, uptake, or survival) (Table 2). There were 9 phenotypes previously associated with 8 TCS in *S.* Typhimurium that were confirmed in *S.* Typhi (*arcA*, *cheY*, *phoP*, *qseB*, *qseF*, *rcsB*, *sirA*, and *yehT*) (Table 2). Interestingly, several of the TCS previously associated with *S.* Typhimurium virulence in mice (*cheY*, *cpxR*, *narP*, *ompR*, *phoP*, *qseB*, *qseF*, *rcsB*, *sirA*, and *ssrB*) display a phenotype during host cell interaction with *S.* Typhi, except for *cpxR* and *ssrB*, see below (Table 2). An important aspect of this study was the identification of 32 new phenotypes associated with *S.* Typhi TCS mutants (Table 2). Interestingly, the *cpxR* mutant had phenotypes distinct from *S.* Typhimurium found in the literature (Table 2). The *S.* Typhimurium *cpxR* mutant was not affected for invasion or intracellular replication in epithelial cells (HEp2 and Caco-2) or survival in RAW264.7 macrophages [53], while the *S.* Typhi *cpxR* mutant was defective in invasion of INT407 cells and survival in THP-1 macrophages (Table 2). Thus, we have deleted *cpxR* in *S.* Typhimurium SL1344 and evaluated its level of adhesion, invasion, and replication in epithelial cells and in macrophages (Figure 6). No significant difference from the wild-type was observed, confirming a difference in the role of CpxR between *S.* Typhi and *S.* Typhimurium.
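As a cross-check of this tally, the lists of significantly affected mutants reported in Sections 3.2 and 3.3 can be encoded and combined. The sketch below is an illustrative bookkeeping exercise rather than part of the original analysis; the variable and function names are hypothetical, and the direction codes (+1 higher, −1 lower than the wild-type) are transcribed from the Results text.

```python
# +1 = significantly higher than wild-type, -1 = significantly lower.
PHENOTYPES = {
    "adhesion": {"cheY": -1, "ompR": -1, "pgtA": -1, "narP": +1, "rcsB": +1},
    "invasion": {
        "cheY": +1, "citB": +1, "narP": +1, "pgtA": +1, "pmrA": +1,
        "qseB": +1, "rcsB": +1,
        "arcA": -1, "baeR": -1, "cpxR": -1, "ompR": -1, "phoP": -1,
        "qseF": -1, "sirA": -1, "tctD": -1, "torR": -1,
    },
    "replication": {
        "copR": +1, "glnG": +1, "pgtA": +1, "pmrA": +1, "qseB": +1,
        "uhpA": +1, "citB": -1, "rstA": -1,
    },
    "uptake": {
        "arcA": +1, "kdpE": +1, "narL": +1, "narP": +1, "ompR": +1,
        "pgtA": +1, "rcsB": +1, "cpxR": -1, "dcuR": -1, "glnG": -1,
    },
    "survival": {
        "copR": +1, "pmrA": +1, "rcsB": +1, "sirA": +1, "tctD": +1,
        "torR": +1,
        "arcA": -1, "baeR": -1, "cheY": -1, "citB": -1, "cpxR": -1,
        "dpiA": -1, "kdpE": -1, "narP": -1, "ompR": -1, "phoP": -1,
    },
}

# Mutants reported as similar to the wild-type in all conditions tested.
NO_PHENOTYPE = {"creB", "hydG", "phoB", "ssrB", "ttrR", "yehT"}

# Mutants with a significant difference in at least one assay.
affected = set().union(*PHENOTYPES.values())


def profile(mutant: str) -> dict:
    """Direction of change per assay for one mutant (0 = no difference)."""
    return {assay: hits.get(mutant, 0) for assay, hits in PHENOTYPES.items()}


print(len(affected), "of", len(affected | NO_PHENOTYPE), "mutants affected")
print("rcsB:", profile("rcsB"))
```

With the lists as transcribed, the union contains 24 mutants, which together with the six unaffected mutants accounts for all 30 response regulators.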

**Table 2.** Phenotype of regulator mutant during interaction with host.

Blue = significantly lower; Grey = no difference, Red = significantly higher than the wild-type. C = confirmed phenotype; N = new phenotype; \* = divergent phenotype.

Six RR mutants, *creB*, *hydG*, *phoB*, *ssrB*, *ttrR* and *yehT*, were similar to the wild-type strain in all conditions tested. The mutation of 4 TCS (*creB*, *hydG*, *phoB*, *ssrB*) in *S.* Dublin also resulted in a phenotype similar to the wild-type strain during infection of epithelial cells [110]. The deletion of the Ttr system of *S.* Dublin caused a higher level of invasion, but in *S.* Typhi, the *ttrS* sensor is a pseudogene (see below), which may explain why no phenotypes were observed. It may be surprising that the SsrAB system, which is the principal regulator of SPI-2, demonstrated no defect, but we have previously demonstrated that the entire SPI-2 was not essential for *S.* Typhi survival in macrophages [23], and SPI-2 was not required for *S.* Typhi infection in a humanized mouse model [111], highlighting one of the major differences with *S.* Typhimurium.

*S.* Typhi has evolved as a human-restricted pathogen without any known environmental niche. This specialization is associated with genome degradation, as up to 5% of its genome includes predicted open reading frames that have become pseudogenes. There are two TCS that are pseudogenes in *S.* Typhi: TorR and the sensor TtrS. The TtrSR system is involved in tetrathionate respiration in the inflamed gut, which provides a competitive advantage against the intestinal microbiota [112]. However, the production of the Vi capsule by *S.* Typhi prevents intestinal inflammation [9], suggesting that *S.* Typhi does not need the TtrSR system; consistent with this, the *ttrR* mutant did not show any phenotype in the conditions tested here. The TorSR system is not characterized in *Salmonella*. In *E. coli*, TorR activates the transcription of *torCAD* [113], which encodes proteins required for anaerobic respiration [114–116]. Here, even in the absence of a functional sensor, the *torR* mutant was defective in invasion and had a higher level of survival in macrophages.

Invasion was the step of epithelial cell infection at which the largest number of TCS mutants differed significantly from the wild-type, as 16 mutants demonstrated changes in invasion (7 increased invasion and 9 decreased invasion). As expected, the *sirA* (*Salmonella* invasion regulator) deletion resulted in decreased invasion, consistent with the role of SirA in inducing SPI-1 [97,117]. The complementation of this mutant restored the wild-type level (Figure 3). By contrast, only 5 mutants were affected in their adhesion level and 8 in intracellular replication, compared to the wild-type. Interestingly, none of the TCS mutants had the same phenotypic pattern (Table 1), except for the 6 aforementioned mutants that did not differ from the wild-type. This emphasizes the diversity of TCS used to respond to environmental changes as well as the specificity of each system, as each TCS is unique.

The *rcsB* mutant showed increases in cell interactions for almost all tested conditions, except for intracellular replication in epithelial cells. RcsB belongs to the Rcs phosphorelay, a complex TCS with three members, RcsC, RcsD, and RcsB, and several accessory proteins involved in the envelope stress response. RcsB was shown to repress important virulence factors, including fimbriae and SPI-1, and also to activate expression of the Vi capsule [16,90]. Thus, in the *rcsB* mutant, some virulence genes are derepressed, which leads to increased adhesion and invasion, and the Vi capsule is not expressed, which increases phagocytosis by macrophages [118].

The *ompR* mutant showed the lowest level of adhesion and one of the lowest levels of invasion (Figure 1). These defects were restored by the addition of a wild-type copy of *ompR* (Figure 3). The motility of the *ompR* mutant was also reduced to 85% of the wild-type. An *ompR* mutant was attenuated in *S.* Typhimurium [71] and OmpR was associated with the activation of SPI-2 [73–75] and motility genes [119]. This regulation pattern is exactly the opposite of the Rcs system, which may explain why these mutants have strong and opposite phenotypes. These phenotypes are specific to each mutation and did not involve the Vi capsule.

#### **5. Conclusions**

Virulence gene expression needs to be tightly regulated in order for *S.* Typhi to adapt and survive within the host. TCS participate in the regulation of several virulence factors, and we have shown that several TCS contribute to adhesion, invasion, replication, uptake, and survival of *S.* Typhi. Distinct phenotypes of the CpxR mutant of *S.* Typhi compared to *S.* Typhimurium may reveal fundamental regulatory differences associated with *S.* Typhi niche specialization. Further characterization of the regulons associated with TCS involved in virulence and identification of the signals required for their activation will be important to understand *S.* Typhi pathogenesis. This will help to identify and develop strategies to prevent or reduce typhoid infections.

**Supplementary Materials:** The following are available online at http://www.mdpi.com/2076-2607/8/5/722/s1: Table S1: Bacterial strains used in this study; Table S2: Plasmids used in this study; Table S3: Primers used in this study.

**Author Contributions:** Conceptualization, C.M.-L. and F.D.; methodology, C.M.-L., M.K. and K.D.; validation, C.M.-L., M.K., K.D. and F.D.; formal analysis, C.M.-L., F.D.; resources, F.D.; writing—original draft preparation, C.M.-L.; writing—review and editing, C.M.-L., M.K., K.D. and F.D.; supervision, F.D.; funding acquisition, F.D. All authors have read and agreed to the published version of the manuscript.

**Funding:** This research was supported by the Natural Sciences and Engineering Research Council of Canada (FD: Discovery grant 25114-12).

**Conflicts of Interest:** The authors declare no conflict of interest.

#### **References**

1. Beier, D.; Gross, R. Regulation of bacterial virulence by two-component systems. *Curr. Opin. Microbiol.* **2006**, *9*, 143–152.


© 2020 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).

*Article*

## *Salmonella* **Extracellular Polymeric Substances Modulate Innate Phagocyte Activity and Enhance Tolerance of Biofilm-Associated Bacteria to Oxidative Stress**

#### **Mark M. Hahn 1,2 and John S. Gunn 1,2,3,\***


Received: 30 December 2019; Accepted: 10 February 2020; Published: 13 February 2020

**Abstract:** *Salmonella enterica* serovar Typhi causes 14.3 million acute cases of typhoid fever that are responsible for 136,000 deaths each year. Chronic infections occur in 3%–5% of those infected and *S.* Typhi persists primarily in the gallbladder by forming biofilms on cholesterol gallstones, but how these bacterial communities evade host immunity is not known. *Salmonella* biofilms produce several extracellular polymeric substances (EPSs) during chronic infection, which are hypothesized to prevent pathogen clearance either by protecting biofilm-associated bacteria from direct humoral attack or by modulating innate phagocyte interaction with biofilms. Using wild-type and EPS-deficient planktonic and biofilm *Salmonella*, the direct attack hypothesis was tested by challenging biofilms with human serum and antimicrobial peptides. Biofilms were found to be tolerant to these molecules, but these phenotypes were independent of the tested EPSs. By examining macrophage and neutrophil responses, new roles for biofilm-associated capsular polysaccharides and slime polysaccharides were identified. The *S.* Typhi Vi antigen was found to modulate innate immunity by reducing macrophage nitric oxide production and neutrophil reactive oxygen species (ROS) production. The slime polysaccharides colanic acid and cellulose were found to be immune-stimulating and represent a key difference between non-typhoidal serovars and typhoidal serovars, which do not express colanic acid. Furthermore, biofilm tolerance to the exogenously-supplied ROS intermediates hydrogen peroxide (H2O2) and hypochlorite (ClO<sup>−</sup>) indicated an additional role of the capsular polysaccharides for both serovars in recalcitrance to H2O2 but not ClO−, providing new understanding of the stalemate that arises during chronic infections and offering new directions for mechanistic and clinical studies.

**Keywords:** *Salmonella*; biofilm; innate immunity; extracellular polymeric substances

#### **1. Introduction**

*Salmonella enterica* subspecies *enterica* serovar Typhi (*S.* Typhi) is a chronic pathogen of the gallbladder, where it forms biofilms anchored to cholesterol gallstones and encased in self-produced extracellular polymeric substances (EPSs) [1–5]. The conditions and location of these recalcitrant infections are both perplexing and problematic. Bile is rich in bile acids and bile salts with extensive immune-stimulating, antimicrobial, and detergent-like properties that can disrupt bacterial membranes, halt proton gradients, and induce redox stress [6,7]. How and why *S.* Typhi establishes chronic infections in such a repressive environment is not well understood [7]. Furthermore, bile is an important environmental signal that causes opposite effects between non-typhoidal and typhoidal *Salmonella*, the most prominent being those involved in host cell invasion [6–9]. The chronic biofilm lifecycle of *S.* Typhi is also problematic as it presents unique challenges for diagnosing infections [10–13], providing efficacious treatment [5,14], and eradicating endemic disease [15–17]. However, these issues must be addressed because chronic carriers are the only known reservoir of *S.* Typhi and eradication of these infections will be an essential step in preventing the 136,000 deaths caused by 14.3 million cases of acute typhoid fever each year [15].

*S.* Typhi is a human-restricted pathogen. An important aspect that sets *S.* Typhi apart from *Salmonella enterica* subspecies *enterica* serovar Typhimurium (*S.* Typhimurium) and other non-typhoidal serovars is rampant genomic decay, with pseudogenes representing roughly 5% of its genome, as well as specific gene acquisitions [18–20]. These adaptations resulted in host specialization and the ability to cause systemic disease, which begins primarily by the infection of M cells in the distal ileum. Subsequent invasion and persistence inside macrophages of Peyer's patches allows *S.* Typhi to disseminate to deep tissues, such as the liver [21,22]. From the liver, *S.* Typhi descends the hepatobiliary duct to the gallbladder and establishes acute or chronic infection [1,4]. Throughout this process, planktonic *S.* Typhi exhibit anti-inflammatory properties to avoid immune detection [23–30]. However, the intracellular lifecycle is a major factor in immune evasion, and biofilms, which exist extracellularly, must have additional mechanisms to elicit immune modulation.

Many chronic pathogens are known to inhibit immune cell functions with a well-established link between biofilms and cellular suppression [31–35]. Specific EPSs are often attributed to these abilities, as EPSs have been found to be fundamental to recalcitrance to host immunity and pharmaceutical approaches. One of the most prominent examples of this phenomenon comes from *Pseudomonas aeruginosa* biofilms, which commonly infect cystic fibrosis and burn wound patients. Alginate, the major exopolysaccharide of *P. aeruginosa* biofilms, scavenges hypochlorite and inhibits innate immunity at multiple processes, including complement activation, polymorphonuclear chemotaxis, and phagocytosis by neutrophils and macrophages [36]. *Salmonella* biofilms produce various EPSs in vitro, and biofilm formation during chronic infection has been directly observed in vivo [3,4]. The asymptomatic nature of chronic biofilm infections suggests that *S.* Typhi EPSs have a role in altering innate immune activities to avoid detection. Therefore, we hypothesized that one or more of the EPSs are crucial for biofilm tolerance and contribute to the chronic pathogenicity of *S.* Typhi biofilms by skewing innate immune function(s). Broadly, this prediction presented two functional categories: (a) EPSs that protect biofilm-associated bacteria from innate immune functions that are otherwise inhibitory to planktonic *Salmonella* or (b) EPSs that have immune-modulating function(s) and thus alter the host response to biofilms. We have previously defined the major EPSs of *Salmonella* biofilms (curli fimbriae, colanic acid, cellulose, extracellular DNA, O antigen capsule, and Vi antigen) and characterized the role of each component for biofilm development in vitro [37]. Although these EPSs (particularly curli fimbriae) account for 90% of the biofilm biomass [38], the contribution of each EPS to resisting innate immune functions has not been thoroughly studied.

To test our hypothesis, we compared the outcomes of planktonic and biofilm-associated *Salmonella* when challenged with soluble innate immune molecules (normal human serum and antimicrobial peptides [AMPs]) and tested neutrophils and macrophages for functional responses to planktonic and biofilm *Salmonella*. Furthermore, by comparing wild-type (WT) and EPS-deficient biofilms, we present new evidence that *Salmonella* biofilm tolerance to the host oxidative burst is dependent on the Vi antigen, the O antigen capsule, and colanic acid.

#### **2. Materials and Methods**

#### *2.1. Bacterial Strains and Growth Conditions*

The *Salmonella* strains used in this study were the parental WT strains or derivatives of *S.* Typhimurium ATCC 14028 (JSG210) and *S.* Typhi Ty2 (JSG698, JSG4383) (Table 1). The latter WT *S.* Typhi was substituted when appropriate to correct for a known *rpoS* mutation in this strain [39]. At times, clinical isolates were also tested (Table 2). All clinical *S.* Typhi isolates tested positive for Vi antigen by serum agglutination tests. The O antigen capsule mutant does not alter the LPS structure. All planktonic and biofilm cultures were grown in tryptic soy broth (TSB). Planktonic cells were collected from 16-hour overnight broth cultures. Biofilms were cultured in 96-well polypropylene microtiter plates coated with 500 μg of cholesterol. To initiate biofilm growth, overnight planktonic bacteria were normalized to OD490 = 0.65 in TSB and then diluted 1:6 in TSB and incubated at 37 °C for 3 hours in a static 12-well polypropylene plate. After 3 hours, static cultures were diluted 1:2500 in TSB and distributed (200 μL/well) to the aforementioned cholesterol-coated wells and transferred to 30 °C. Biofilms were cultured on a nutator at 30 °C for 96 hours. Supernatants were replaced with fresh TSB once every 24 hours. Prior to experimental use, mature biofilms were washed with phosphate-buffered saline (PBS) to remove unattached and planktonic bacteria.
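The two successive dilutions used to seed the biofilm plates can be combined into a single factor. The sketch below is only an arithmetic illustration: it assumes that "1:6" and "1:2500" denote one part culture in 6 and 2500 parts total volume (the text does not state which convention was used), and the function name is illustrative.

```python
from fractions import Fraction


def cumulative_dilution(*steps):
    """Combine successive dilution steps, each given as a total-volume ratio (6 means 1:6)."""
    total = Fraction(1)
    for step in steps:
        total *= Fraction(1, step)
    return total


# Two-step inoculum dilution: 1:6 for the static pre-culture, then 1:2500 into the wells.
inoculum = cumulative_dilution(6, 2500)
print(f"Cumulative dilution: 1:{inoculum.denominator}")

# Volume of the OD-normalized culture carried into each 200 uL well (assumed convention).
well_volume_ul = 200
print(f"Normalized culture per well: {float(well_volume_ul * inoculum):.4f} uL")
```

Under that assumption the combined dilution is 1:15,000, ignoring any growth during the 3-hour static incubation between the two steps.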


**Table 1.** Strains used in this study. EPS: extracellular polymeric substances.

**Table 2.** Clinical isolates used in this study.


#### *2.2. Mutant Generation*

The Vi antigen was eliminated from a *S.* Typhi clinical isolate (JSG3074) by λ-Red mutagenesis [42]. The primer designs are detailed in Table 3. The marked Δ*tviB* gene deletion was transformed into *S.* Typhi JSG3074 carrying the λ-Red recombinase (creating strain JSG4097), and the antibiotic resistance marker was subsequently removed using pCP20 [43]. The final mutant (JSG4123) was verified by PCR and analysis by gel electrophoresis.


**Table 3.** Oligonucleotide primers used in this study.

#### *2.3. Sensitivity and Tolerance to Human Serum*

Sensitivity to normal human serum was determined using WT planktonic *Salmonella* spp. Overnight broth cultures were normalized to 1.0 × 10<sup>7</sup> colony forming units per milliliter (CFUs/mL) in 30% human AB serum pooled from healthy male donors (Mediatech Inc.; Manassas, VA, USA). Heat-inactivated serum (56 °C for 30 minutes) was included as the negative control. Cultures were incubated at 37 °C for 3 hours on a nutator and then quantified by dilution plating and colony forming unit (CFU) enumeration.
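As a minimal sketch of how dilution plating counts translate into CFUs/mL and into inhibition relative to the heat-inactivated control, the following uses the standard back-calculation (colonies × dilution factor / plated volume). The colony counts, dilution factor, and plated volume are hypothetical placeholders, not values from this study, and the function names are illustrative.

```python
def cfu_per_ml(colonies, dilution_factor, plated_volume_ml):
    """Back-calculate CFU/mL from the colony count on a single dilution plate."""
    return colonies * dilution_factor / plated_volume_ml


def percent_inhibition(treated_cfu, control_cfu):
    """Percent reduction of the treated sample relative to its control."""
    return 100.0 * (1.0 - treated_cfu / control_cfu)


# Hypothetical counts: same dilution (10^-4) and plated volume (0.1 mL) for both conditions.
control = cfu_per_ml(colonies=120, dilution_factor=10**4, plated_volume_ml=0.1)
serum = cfu_per_ml(colonies=60, dilution_factor=10**4, plated_volume_ml=0.1)

print(f"Heat-inactivated control: {control:.2e} CFU/mL")
print(f"Normal serum:             {serum:.2e} CFU/mL")
print(f"Inhibition:               {percent_inhibition(serum, control):.1f}%")
```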

After washing 2× with PBS, biofilms were challenged with 30% normal human serum or heat-inactivated serum. Serum-exposed biofilms were incubated at 37 °C for 3 hours on a nutator and then washed with PBS to remove residual serum. Viable bacteria remaining in the biofilm were enumerated by serial dilution plating.

#### *2.4. Sensitivity and Tolerance to Antimicrobial Peptides*

WT planktonic *Salmonella* spp. normalized to approximately 2.0 × 10<sup>6</sup> CFUs/mL in TSB were used to determine the minimum inhibitory concentration (MIC) of the AMPs polymyxin B sulfate (Gibco; Billings, MT, USA) and melittin (Sigma-Aldrich; St. Louis, MO, USA). Concentrations were assayed in two-fold serial dilution series, and growth at 37 °C was monitored by OD600 on a SpectraMax M3 plate reader. The MIC of each AMP was defined as the lowest concentration tested that prevented detectable growth. Mature biofilms were washed and exposed to each AMP at 10× the WT planktonic MIC. Biofilms were incubated with each AMP at 37 °C for 2 hours on a nutator before PBS washing to remove trace peptides and enumeration by mechanical collection and serial dilution plating.
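The MIC readout described above can be sketched as follows. The concentrations and growth calls are hypothetical, and the function name is illustrative. As a cautious design choice that the text does not specify, the sketch reports the lowest concentration at and above which no well showed growth, so that an isolated clear well below a turbid one is not taken as the MIC.

```python
def mic_from_growth(concentrations, grew):
    """Return the lowest concentration at and above which no growth was detected.

    concentrations: tested concentrations (any order).
    grew: matching booleans, True where growth was detectable.
    Returns None if the highest concentration still permitted growth.
    """
    mic = None
    # Walk from the highest concentration downward; stop at the first well with growth.
    for conc, growth in sorted(zip(concentrations, grew), reverse=True):
        if growth:
            break
        mic = conc
    return mic


# Hypothetical two-fold series (ug/mL) and growth calls; not data from this study.
series = [8.0, 4.0, 2.0, 1.0, 0.5, 0.25]
growth_detected = [False, False, False, False, True, True]

mic = mic_from_growth(series, growth_detected)
biofilm_challenge = 10 * mic  # biofilms were challenged at 10x the planktonic MIC
print(f"MIC = {mic} ug/mL; biofilm challenge = {biofilm_challenge} ug/mL")
```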

#### *2.5. Biofilm Aggregate Collection*

Mature biofilm aggregates were used for all host-response experiments and the subsequent pathway analysis. Aggregates were mechanically collected by scraping microtiter plate biofilms and normalized by total protein quantification (Bradford method). The reported multiplicity of infection (MOI) values refer to the biofilm aggregate protein equivalent (MOIeq) to that of planktonic bacteria at the reported MOI. Normalized aggregates were analyzed for size and granularity characteristics by flow cytometry (assay optimization only) using a BD FACSCanto II flow cytometer (BD Biosciences; San Jose, CA, USA).

#### *2.6. Macrophage Nitric Oxide Response to Salmonella*

The THP-1 cell line was maintained in Roswell Park Memorial Institute (RPMI) media supplemented with 10% fetal bovine serum (FBS) and 2 mM of L-glutamine. Prior to infection, THP-1 cells were washed and normalized in equivalent media lacking phenol red. Then, the cell line was infected with planktonic or biofilm samples (MOIeq = 100, 4.0 × 10<sup>5</sup> THP-1 cells total), and infections were synchronized by centrifugation. Cultures were incubated for 3 hours at 37 °C, 5% CO2. Supernatant nitric oxide (NO) was measured each hour by the Griess diazotization reaction, and viable extracellular CFUs remaining in the supernatant were quantified by serial dilution plating.
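The inoculum implied by the stated MOIeq can be worked out directly. The sketch below computes the planktonic CFU input for MOI = 100 with 4.0 × 10<sup>5</sup> THP-1 cells and then the matching aggregate protein mass, following the protein-equivalence definition in Section 2.5. The protein content per CFU is a hypothetical placeholder (in practice it would come from a Bradford measurement of a planktonic culture of known CFU count), and the function names are illustrative.

```python
def planktonic_cfu_for_moi(moi, host_cells):
    """Number of planktonic bacteria needed for a given multiplicity of infection."""
    return moi * host_cells


def aggregate_protein_for_moi_eq(moi, host_cells, protein_ug_per_cfu):
    """Protein mass (ug) of biofilm aggregate matching the planktonic inoculum at this MOI."""
    return planktonic_cfu_for_moi(moi, host_cells) * protein_ug_per_cfu


thp1_cells = 4.0e5
moi_eq = 100

# Hypothetical placeholder, not a value from this study.
protein_ug_per_cfu = 2.0e-7

cfu_input = planktonic_cfu_for_moi(moi_eq, thp1_cells)
protein_input = aggregate_protein_for_moi_eq(moi_eq, thp1_cells, protein_ug_per_cfu)
print(f"Planktonic input: {cfu_input:.1e} CFU")
print(f"Aggregate protein equivalent: {protein_input:.1f} ug (with the assumed per-CFU value)")
```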

#### *2.7. Neutrophil Reactive Oxygen Species Response to Salmonella*

PLB-985 cells [44,45] were differentiated to a neutrophil-like phenotype [46,47] by 6-day incubation in Advanced RPMI supplemented with 0.5% N,N-dimethylformamide, 0.5% FBS, 1% Nutridoma-SP (Roche; Mannheim, Germany), 2 mM L-glutamine, and 1× penicillin/streptomycin. Media was replaced on day 3. Differentiated PLB-985 cells were infected with planktonic or biofilm samples (MOIeq = 50, 6.0 × 10<sup>6</sup> PLB-985 cells total) that had been opsonized in 20% normal human serum for 20 minutes prior to infection. Uninfected PLB-985 cells were stimulated with phorbol 12-myristate 13-acetate (PMA) (final concentration of 1.0 × 10<sup>−4</sup> mg/mL). All samples were supplied with luminol (final concentration of 500 μM); then, infections were synchronized by centrifugation. Reactive oxygen species (ROS) production at 37 °C was monitored in triplicate by luminol-dependent chemiluminescence measured every 2 minutes for 1 hour using a SpectraMax M3 plate reader. Area under the curve (AUC) was calculated for each condition and normalized to the AUC calculated for PMA-stimulated PLB-985 cells. Parallel infections were conducted to determine total CFUs remaining at 80 minutes post-infection. Gentamicin was not added to the media, and the PLB-985 cells were lysed by 0.1% sodium dodecyl sulfate treatment prior to serial dilution plating; thus, the reported CFUs represent total bacteria remaining from input after 80 minutes.
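The AUC normalization described above can be illustrated with a short sketch. The text does not state how the AUC was integrated; the trapezoidal rule is assumed here, and both chemiluminescence traces are synthetic placeholders sampled every 2 minutes for 1 hour, not measurements from this study.

```python
def trapezoid_auc(times, signal):
    """Area under a sampled curve by the trapezoidal rule."""
    area = 0.0
    for i in range(1, len(times)):
        area += (times[i] - times[i - 1]) * (signal[i] + signal[i - 1]) / 2.0
    return area


# Read times in minutes: every 2 minutes for 1 hour (31 reads).
times_min = list(range(0, 61, 2))

# Synthetic traces in arbitrary luminescence units.
pma_trace = [100.0 for _ in times_min]        # PMA-stimulated control
sample_trace = [float(t) for t in times_min]  # infected sample

auc_pma = trapezoid_auc(times_min, pma_trace)
auc_sample = trapezoid_auc(times_min, sample_trace)
normalized = auc_sample / auc_pma
print(f"AUC (PMA) = {auc_pma}, AUC (sample) = {auc_sample}, normalized = {normalized:.2f}")
```

Expressing each condition as a fraction of the PMA control makes runs from different days comparable even when the absolute luminescence scale shifts.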

#### *2.8. Sensitivity and Tolerance to Oxidative Species*

Overnight cultures were normalized to approximately 2.0 × 10<sup>6</sup> CFUs/mL in TSB and incubated at 37 °C on a rolling drum. Planktonic sensitivity to hydrogen peroxide (H2O2) and hypochlorite (ClO<sup>−</sup>) was initially assayed using WT *S.* Typhimurium or *S.* Typhi exposed to 0 mM, 5 mM, or 10 mM H2O2 or 0 μg/mL, 250 μg/mL, or 1000 μg/mL ClO<sup>−</sup>. Viable cells remaining after 30- and 60-minute exposure were enumerated by dilution plating.

The MIC of each oxidative species against planktonic *Salmonella* was determined by 16-hour growth curves (OD600). Starting cultures contained approximately 2.0 × 10<sup>6</sup> CFUs/mL in TSB with H2O2 or ClO<sup>−</sup> supplied at final concentrations ranging from 10 mM to 0.3125 mM or 1000 μg/mL to 15 μg/mL (respectively) in two-fold serial dilutions. Microtiter plates were incubated at 37 °C for 16 hours and growth was monitored by OD600. Readings were recorded every 30 minutes using a SpectraMax M3 plate reader. The MIC of each oxidative species was defined as the lowest concentration that prevented detectable growth.
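The two-fold dilution series stated above can be enumerated explicitly. This is a small sketch with an illustrative function name; it assumes that the reported lower bound of 15 μg/mL for ClO<sup>−</sup> is the rounded value of 15.625 μg/mL, which is what repeated halving from 1000 μg/mL yields.

```python
def two_fold_series(top, bottom):
    """Halve from `top` until the next value would fall below `bottom`."""
    series = [top]
    while series[-1] / 2 >= bottom:
        series.append(series[-1] / 2)
    return series


h2o2_mM = two_fold_series(10.0, 0.3125)
clo_ug_per_ml = two_fold_series(1000.0, 15.0)

print("H2O2 (mM):   ", h2o2_mM)
print("ClO- (ug/mL):", clo_ug_per_ml)
```

With these bounds the H2O2 series has six concentrations and the ClO<sup>−</sup> series has seven, so any MIC read from these plates is necessarily one of the listed values.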

Biofilm aggregates were challenged with H2O2 or ClO<sup>−</sup> supplied at 1×, 10×, 25×, or 50× the experimentally determined planktonic cell MIC and incubated at 37 °C for 2 hours on an orbital shaker. Biofilm aggregates were disrupted, and viable bacteria were enumerated before and after exposure by serial dilution plating.

#### **3. Results**

#### *3.1. Biofilm Tolerance to Innate Immunity*

#### 3.1.1. The Four Major *Salmonella* Biofilm EPSs Do Not Contribute to Tolerance to Innate Immunity

Cytolytic activity of the complement system membrane attack complex and of AMPs is dependent on direct cell contact. Therefore, these soluble innate immune factors readily target planktonic bacterial surfaces (Figure 1A, Figure 2A) but could be functionally inhibited by EPSs. To address the hypothesis that EPSs are responsible for biofilm tolerance to soluble innate immune factors that successfully target planktonic bacteria, WT and EPS-deficient *S.* Typhimurium (Δ*csgA*Δ*wcaM*Δ*yihO*Δ*bcsE*) and *S.* Typhi biofilms were challenged with normal human serum and AMPs with the expectation that viable CFUs in EPS-deficient biofilms would be reduced compared to WT biofilms.

**Figure 1.** Sensitivity and tolerance to human serum. JSG698 was used for wild-type (WT) *S.* Typhi. (**A**) Growth from input of WT planktonic *Salmonella* following challenge with 30% normal human serum or heat-inactivated serum determined by colony forming unit (CFU) enumeration (data are normalized to heat-inactivated control) (*n* = 5). Significance between conditions was identified by multiple *t* tests using the Holm–Sidak method (α = 0.05) to correct for multiple comparisons (\*\*\*\*, *p* < 0.000001). (**B**) CFU enumeration of biofilms following the same conditions as planktonic bacteria. Significant differences were tested for using two-way analysis of variance (ANOVA) and the Sidak method to correct for multiple comparisons (*n* = 4, daily experiments conducted in quintuplicate). Error bars indicate standard deviation (SD).

WT planktonic *Salmonella* were significantly inhibited by 30% serum, with *S*. Typhimurium inhibited by 50% and *S*. Typhi by 39% (Figure 1A). The same concentration of serum was unable to significantly reduce the viability of WT *Salmonella* biofilm-associated bacteria after 3 hours, even when the four major EPSs (*S.* Typhimurium) or the Vi antigen (*S.* Typhi) had been genetically eliminated (Figure 1B). Further analysis of individual EPS mutants was not conducted because their combined elimination did not produce an altered phenotype. The addition of Vi antigen to *S*. Typhimurium also did not significantly affect serum resistance (Figure 1B). These data suggest that *S*. Typhimurium maintains intrinsic tolerance to complement due to the biofilm lifestyle by unidentified EPSs or other multicellular behaviors.


**Figure 2.** Sensitivity and tolerance to antimicrobial peptides (AMPs). JSG698 used for WT *S*. Typhi. (**A**) Growth from input of WT planktonic *Salmonella* after 2-hour exposure to each AMP at the minimum inhibitory concentration (MIC) determined by CFU enumeration (data are normalized to AMP-free control) (*n* = 3). *S*. Typhimurium challenged with 0.49 μg/mL polymyxin B and 10 μg/mL melittin. *S*. Typhi challenged with 0.24 μg/mL polymyxin B and 10 μg/mL melittin. Significance calculated by two-way ANOVA with Dunnett's multiple comparison test (\*\*\*, *p* <0.0005; \*\*\*\*, *p* < 0.0001). (**B**,**C**) Biofilm tolerance to AMPs supplied at 10× the WT planktonic MIC. Significance was tested for by two-way ANOVA with a Sidak multiple comparison correction (*n* = 3, daily experiments conducted in triplicate). Error bars indicate SD.

To test the activity of AMPs, the WT planktonic MIC was experimentally determined by assaying for growth inhibition. For *S.* Typhimurium, the MICs for polymyxin B and melittin were 0.49 μg/mL and 10 μg/mL, respectively. *S.* Typhi had the same melittin MIC but was twice as susceptible to polymyxin B, with an MIC of 0.24 μg/mL (Figure 2A). Similar to the serum sensitivity results, the viability of biofilm-associated bacteria was not reduced by challenge with polymyxin B or melittin supplied at 10× the WT planktonic MIC (Figure 2B,C). This result was consistent for the WT of both serovars and for each of the EPS mutants (*S.* Typhimurium Δ*csgA*Δ*wcaM*Δ*yihO*Δ*bcsE*, *S.* Typhimurium Vi antigen+, and *S.* Typhi Δ*tviB*), so further analysis of individual EPS mutants was not conducted. Once again, these data indicate that there may be additional EPSs or biofilm behaviors responsible for tolerance to soluble immune components.

3.1.2. Laboratory *S.* Typhi is Representative of Clinical Isolates from Both Acute and Chronic Patients

The majority of EPS mutants were generated and tested in the *S.* Typhimurium serovar because *S.* Typhimurium generates a chronic typhoid-like disease in mice similar to *S.* Typhi in humans and allows for in vivo modeling of a human-restricted pathogen. Since biofilm tolerance data (Figures 1 and 2) indicate that tolerance mechanisms exist beyond the EPS components that were tested, a panel of *S.* Typhi clinical isolates acquired from acute and chronic patients was surveyed to determine whether the laboratory strain of *S.* Typhi under scrutiny is representative of acute and chronic patient isolates. When cultured under biofilm-inducing conditions and challenged with 10× polymyxin B, biofilm-associated CFUs were not significantly reduced for any of the 8 acute or 8 chronic isolates (supplemental data; Figure S1A). Similar results were obtained by challenge with 10× melittin, although one chronic isolate (Ch-1) did exhibit reduced biofilm-associated CFUs (supplemental data; Figure S1B). Interestingly, when the Vi antigen was eliminated from this isolate, no inhibition from polymyxin B or melittin was observed (supplemental data; Figure S1A,B). Additionally, no major distinctions were evident between the acute patient isolates, the chronic patient isolates, and the laboratory *S.* Typhi WT.

#### *3.2. The Innate Immune Response to Salmonella Biofilms*

After determining that the four major *Salmonella* EPSs do not play a role in resisting attack by the tested innate immune factors, the possibility that biofilms somehow regulate host immunity was investigated. Given the prominent published anti-inflammatory role of the Vi antigen for planktonic cells [48] compared to the relative lack of phenotypes detected for the other major EPSs in vitro, the Vi antigen was predicted to have a significant role in biofilm inhibition of innate phagocytic cell function.

Conducting these experiments required the collection and normalization of biofilm aggregate populations of similar physical characteristics. The system developed to achieve this goal (scraped biofilms from 96-well plates) was validated by flow cytometry quantifying the size and granularity distribution of the normalized aggregates. While the biofilm aggregates were distinct from planktonic controls, among the biofilm aggregates, no major differences were observed between either serovar or EPS mutants in terms of predominant size, distribution of sizes, or granularity (supplemental data; Figure S2). Using aggregate populations of consistent size and granularity limits the possibility that subsequent experimental outcomes are due to differences in physical interactions between the aggregates and host cells.

#### 3.2.1. Vi Antigen Has a Direct Effect on Macrophage Nitric Oxide Production

Dendritic cells and macrophages are responsible for the delivery of *S.* Typhi from the gut to the liver, and tissue-resident macrophages are presumably the first innate immune cell to interact with *S.* Typhi after it arrives at the gallbladder [21,22,49,50]. Due to these extensive and early interactions between *S.* Typhi and macrophages, *S.* Typhi would be predicted to benefit from the ability to inhibit NO production in vivo. Both planktonic and biofilm infections of macrophages induced peak NO production 2 hours post infection. At these early time points, extracellular planktonic bacteria and biofilm aggregates resisted killing by macrophages equally (Figure 3A). While the four major EPSs did not influence NO production, addition of the Vi antigen to *S.* Typhimurium planktonic and biofilm cultures significantly reduced macrophage NO (Figure 3B,C). Interestingly, elimination of the Vi antigen from both planktonic and biofilm *S.* Typhi cultures showed a similar reduction of NO as was observed with *S.* Typhimurium, instead of the expected increase.

#### 3.2.2. Vi Antigen Inhibition of Neutrophil ROS is Dependent on the Growth State

Upon infiltration, neutrophils have the potential to inflict extensive damage through ROS production and other antimicrobial activities. Preventing these functions would enhance biofilm survival in the gallbladder. To determine if EPSs altered neutrophil ROS production, neutrophils were challenged with planktonic bacteria and biofilm aggregates of WT *Salmonella* and EPS-deficient mutants. Similar to the experiments in macrophages, planktonic bacteria and biofilms resisted early killing by neutrophils (Figure 4A,B) with the total CFUs (bacteria inside and outside neutrophils) not different from control conditions, yet the ROS response from neutrophils varied markedly by the microbial growth state (biofilm or planktonic) and EPS mutation (Figure 4C,D). Consistent with studies of planktonic *Salmonella* indicating anti-inflammatory functions of Vi antigen [23,27,29,48,51–53], less ROS was produced in response to planktonic *S.* Typhi (Vi antigen present) compared to planktonic *S.* Typhimurium (Vi antigen absent). However, for biofilm aggregates, ROS levels were similar between neutrophils infected with *S.* Typhi and *S.* Typhimurium. As expected, loss of the Vi antigen in both planktonic and biofilm *S.* Typhi resulted in a ROS increase upon infection, although the increase was much higher in planktonic than biofilm infections (Figure 4C). For *S.* Typhimurium producing the Vi antigen (versus *S.* Typhimurium), a decrease in ROS production was expected, but it was only observed with biofilm aggregates and not planktonic bacteria. Thus in most, but not all conditions, Vi antigen suppressed ROS production.

**Figure 3.** Effect of planktonic and biofilm-associated *Salmonella* on macrophage nitric oxide (NO) production. JSG4383 used for WT *S.* Typhi. (**A**) Planktonic and biofilm aggregate survival determined by CFU enumeration after 2-hour incubation with THP-1 macrophages. Analysis by two-way ANOVA with Sidak correction for multiple comparisons demonstrated no difference in CFU viability between growth states (*n* = 4, daily enumeration conducted in triplicate). (**B**,**C**) NO production by THP-1 macrophages after 2-hour infection with planktonic or biofilm-aggregate *Salmonella.* Significance for *S.* Typhimurium experiments was determined by ordinary one-way ANOVA with Dunnett correction for multiple comparisons (\*\*, *p* < 0.005). *S.* Typhi significant differences were identified by unpaired *t* test (\*, *p* < 0.05) (*n* = 4, daily experiments conducted in duplicate). Error bars indicate SD.

#### 3.2.3. Slime Polysaccharides Have a Role in ROS Stimulation

Surprisingly, a precipitous drop in ROS production was observed with the *S.* Typhimurium Δ*csgA*Δ*wcaM*Δ*yihO*Δ*bcsE* strain in the biofilm state (Figure 4C). These data indicated a need to investigate individual EPSs that may be responsible for this observation. Loss of colanic acid or cellulose from biofilm aggregates of *S.* Typhimurium eliminated this phenotype and resulted in a ROS response equivalent to infections with their planktonic counterparts (Figure 4D), thus implicating these two EPS components, along with Vi antigen, in the regulation of ROS production in neutrophils.

**Figure 4.** Effect of planktonic and biofilm-associated *Salmonella* on neutrophil ROS production. JSG4383 used for WT *S.* Typhi. (**A**,**B**) CFU enumeration of total (intracellular and extracellular) planktonic and biofilm aggregate survival for each WT and all mutants 80 minutes post-infection. No significant differences in survival were identified by two-way ANOVA with Sidak's method for multiple comparisons (*n* = 3, enumerated in triplicate). Error bars indicate SD. (**C**,**D**) Reactive oxygen species (ROS) production from neutrophils challenged with planktonic or biofilm aggregates of WT and all EPS mutants. Daily experiments were normalized by a phorbol 12-myristate 13-acetate (PMA)-stimulated control and the data shown are representative of three independent experiments, each with similar trends.

#### 3.2.4. Biofilms Are Tolerant to H2O2 but Provide no Protection to ClO<sup>−</sup>

Having identified the Vi antigen as a crucial regulatory component used by *S.* Typhi during both planktonic and biofilm infections, we investigated if mechanisms other than the reported blocking of complement-fixing antibodies [53] might be in play, hypothesizing that the capsule enhances bacterial detoxification pathways and/or inhibits host toxicity pathways. Briefly, host neutrophils produce superoxide through the NADPH oxidase + cytochrome B complex. Bacterial superoxide dismutase converts superoxide to H2O2, which can be further detoxified to water by catalase or converted to ClO− by myeloperoxidase. ClO− is a precursor for singlet oxygen, peroxynitrites, and chloramines, all of which are potent antimicrobial compounds. By supplying H2O2 or ClO<sup>−</sup> independent of host and bacterial functions, the following experiments aimed to identify if one of these divergent pathways in the ROS antimicrobial response was mitigated by *Salmonella*.

Initial concentration ranges of H2O2 and ClO<sup>−</sup> over which *Salmonella* demonstrates stratified sensitivity were determined using planktonic bacteria. A challenge was conducted with 5–10 mM H2O2 or 250–1000 μg/mL ClO<sup>−</sup> for 60 minutes to determine the best experimental range (supplemental data; Figure S3A,B). Using these data in new experiments, WT planktonic bacteria from both serovars were found to have the same MICs for H2O2 and ClO<sup>−</sup>: 2.5 mM and 500 μg/mL, respectively (Figure 5A–D). Interestingly, the planktonic MIC to H2O2 was reduced in *S.* Typhi Δ*tviB* (1.25 mM) and enhanced for *S.* Typhimurium Vi antigen<sup>+</sup> (5 mM), but it remained unchanged in *S.* Typhimurium lacking other EPSs (supplemental data; Figures S4 and S5), suggesting the involvement of Vi antigen in planktonic tolerance to H2O2. The planktonic sensitivity of both serovars to ClO<sup>−</sup> was unchanged in EPS mutants versus their parental WT.

To determine if EPSs afforded protection to biofilm-resident bacteria, biofilm aggregates were challenged with H2O2 or ClO<sup>−</sup> at the WT planktonic MIC or at 10×, 25×, or 50× the MIC concentrations of each compound. Surprisingly, biofilms were as sensitive as planktonic cultures to ClO−, as no viable CFUs could be detected beyond challenge at the WT planktonic MIC for either serovar (Figure 5E–I). In contrast, biofilms of *S.* Typhimurium challenged with H2O2 at planktonic-lethal doses demonstrated remarkable tolerance. This tolerance was in part due to EPS, as the *S.* Typhimurium Δ*csgA*Δ*wcaM*Δ*yihO*Δ*bcsE* EPS mutant was more sensitive to H2O2 (Figure 5E–I). Since a H2O2 tolerance defect was detected in *S.* Typhimurium Δ*csgA*Δ*wcaM*Δ*yihO*Δ*bcsE* biofilms (Figure 5G), further analysis of EPS single mutants was conducted (Figure 6) and revealed that the primary EPSs involved in the tolerance phenotype were the O antigen capsule (Δ*yihO*) and colanic acid (Δ*wcaM*). As observed in *S.* Typhimurium, *S.* Typhi biofilms were tolerant to H2O2 compared to planktonic cultures, although this serovar was notably less tolerant than *S.* Typhimurium (Figure 5E,F,I). Consistent with planktonic data, *S.* Typhi Δ*tviB* biofilms were less tolerant to H2O2 than WT and survived only at the WT planktonic MIC (Figure 5I).

**Figure 5.** Sensitivity and tolerance to oxidative species. JSG4383 used for WT *S.* Typhi. (**A**,**B**) Growth of WT planktonic *Salmonella* in the presence of H2O2 demonstrating a minimum inhibitory concentration (MIC) of 2.5 mM for both serovars. (**C**,**D**) Growth of WT planktonic *Salmonella* in the presence of ClO− demonstrating a MIC of 500 μg/mL for both serovars. (**E**–**I**) CFU enumeration of biofilm aggregates challenged with H2O2 supplied at 1× (2.5 mM), 10× (25 mM), 25× (62.5 mM), or 50× (125 mM) the WT planktonic MIC or ClO<sup>−</sup> supplied at 1× (0.50 mg/mL), 10× (5 mg/mL), 25× (12.5 mg/mL), or 50× (25 mg/mL) the WT planktonic MIC. Significant reductions in tolerance were identified by comparing control (0×) CFUs with the CFUs of each concentration tested by two-way ANOVA with Tukey's multiple correction method (\*\*, *p* < 0.01; \*\*\*, *p* < 0.0005; \*\*\*\*, *p* < 0.0001). All experiments were conducted in triplicate and the data are the averages of three independent experiments. Error bars indicate SD.

**Figure 6.** Tolerance of EPS mutants to H2O2. Biofilm aggregates deficient in one or more EPS(s) were challenged with H2O2 at the same concentrations detailed above. *S.* Typhimurium Δ*csgA*Δ*wcaM*Δ*yihO*Δ*bcsE* data were used from Figure 5 for comparative purposes. Significance between CFUs determined by two-way ANOVA with Tukey multiple correction (\*, *p* < 0.05; \*\*, *p* < 0.01; \*\*\*\*, *p* < 0.0001). Error bars indicate SD.
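
The challenge concentrations used for the biofilm experiments follow directly from the WT planktonic MICs. As a minimal sketch of that arithmetic (the names `MIC`, `MULTIPLES`, `challenge_series`, and `fold_of_wt_mic` are illustrative and not part of the original methods), the series in Figure 5 and the fold-changes of the mutant MICs can be reproduced as follows:

```python
# WT planktonic MICs reported in the text (same for both serovars).
MIC = {"H2O2": (2.5, "mM"), "ClO-": (500.0, "ug/mL")}

# Multiples of the WT planktonic MIC used to challenge biofilm aggregates.
MULTIPLES = (1, 10, 25, 50)


def challenge_series(compound):
    """Return [(multiple, concentration, unit), ...] for one compound."""
    mic, unit = MIC[compound]
    return [(m, m * mic, unit) for m in MULTIPLES]


def fold_of_wt_mic(observed_mic, compound):
    """Express an observed MIC as a fold-change relative to the WT MIC."""
    return observed_mic / MIC[compound][0]


if __name__ == "__main__":
    for compound in MIC:
        for multiple, conc, unit in challenge_series(compound):
            print(f"{compound}: {multiple}x MIC = {conc:g} {unit}")
    # Planktonic H2O2 MICs reported for the Vi antigen strains.
    print("S. Typhi dtviB:", fold_of_wt_mic(1.25, "H2O2"), "x WT MIC")
    print("S. Typhimurium Vi+:", fold_of_wt_mic(4.25, "H2O2"), "x WT MIC")
```

Note that ClO<sup>−</sup> is expressed here in μg/mL, so the 10× challenge of 5000 μg/mL corresponds to the 5 mg/mL given in the Figure 5 legend.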

#### **4. Discussion**

Our analysis probed for phenotypic differences caused by EPS mutations with two unique perspectives: differences in bacterial survival versus differences in host response. Experiments to determine EPSs that enhance biofilm recalcitrance to selected soluble innate immune factors demonstrated that while the biofilm lifestyle did provide tolerance, an EPS that is solely responsible for inhibiting the bactericidal activities of human serum and AMPs was not identified (Figures 1 and 2). It was hypothesized that EPSs would physically protect biofilm-associated *Salmonella* by preventing diffusion into the biofilm and contact with bacterial membranes, or that they may sequester these molecules through means including electrostatic attraction. However, while EPSs may provide a shield to some innate immune factors, the sensitivity of planktonic cells but robust survival by biofilms deficient for all four major EPSs (*S.* Typhimurium Δ*csgA*Δ*wcaM*Δ*yihO*Δ*bcsE*, Figures 1 and 2) supports the notion that unknown EPSs or other biofilm-induced surface alterations can result in recalcitrance to humoral immunity.

Dendritic cells and macrophages generate NO via the upregulation of inducible nitric oxide synthase (iNOS) [54–56], which is positively stimulated by the pro-inflammatory mediators interleukin-12, interferon-γ, and tumor necrosis factor-α as well as bacterial pathogen-associated molecular patterns [54]. Given the early interaction of dendritic cells and macrophages with planktonic *S.* Typhi and the continued presence of both cell types (planktonic and biofilm) during chronic infection [57], it would be beneficial for the pathogen to limit iNOS activity during the acute and chronic stages of disease. The macrophage NO response to *Salmonella* is independent of curli fimbriae, colanic acid, O antigen capsule, and cellulose (Figure 3B,C). However, we found that the presence of Vi antigen inhibits the NO response to both planktonic and biofilm *Salmonella*, as the addition of Vi antigen to *S.* Typhimurium had an anti-inflammatory effect (Figure 3B,C). Failure by macrophages to exhibit a hyper-inflammatory response to *S.* Typhi Δ*tviB* suggests that *S.* Typhi possesses additional mechanisms for controlling the host response. It is reasonable to expect such a stealthy pathogen to have redundant mechanisms in order to not only control but also fine-tune iNOS activity. The ability to reduce, but not abrogate, iNOS activity would benefit the long-term survival of *S.* Typhi in the gallbladder and fits emerging models for the transition from Th1 to Th2 immunity characteristic of chronic *S.* Typhi infections [57–60]. Although several studies have demonstrated NO to be a potent and early defense mechanism to many intracellular infections [54–56,61–63], the global stationary-phase regulator *rpoS* confers *S.* Typhi with a high degree of resistance [55], and a recent study [64] of persistent *Salmonella* found abundant expression of genes for the sensing and detoxification of reactive nitrogen species (RNS) even when the bacteria were in a dormant state. The *S.* Typhi used in our study is *rpoS*<sup>+</sup> and therefore was expected to be resistant to NO. Furthermore, NO is highly diffusible and has immunosuppressive effects on T and B cell activity [54]. It has long been known that iNOS-deficient macrophages are more likely to become apoptotic when infected by *S.* Typhi [55] and that NO has potent anti-apoptotic activity [55,65,66]. By permitting some NO production, *S.* Typhi may limit the production of RNS to a tolerable concentration while preventing an interferon-driven apoptotic response that would lead to Th1 immunity. While some amount of iNOS activity does lead to an initial Th1 response to gallbladder biofilms, control by *S.* Typhi aids in the development of M2 polarization and the transition to Th2 immunity in the gallbladder.

RAW-264 mouse macrophages treated with the iNOS inhibitor Nω-monomethyl L-arginine (L-NMMA) and iNOS−/<sup>−</sup> mice are still able to resist *S.* Typhi infection [55]. In these cases, the antimicrobial activity is mediated by superoxide activity as ROS production occurs independent of iNOS. We chose to examine neutrophil ROS activity because it has been reported that iNOS enhances neutrophil rolling, adhesion, and migration (diapedesis) [67,68], and we have previously found neutrophils in the gallbladder of chronically-infected mice [57]. Although expression of the Vi antigen from planktonic *S.* Typhi has previously been shown to inhibit neutrophil chemotaxis and ROS production [48], it was unknown if this function also occurs in biofilms. Addition of the Vi antigen to planktonic *S.* Typhimurium failed to reduce ROS production from neutrophils, but the presence of Vi antigen in biofilms did cause a decrease in ROS production (Figure 4C). As expected, planktonic *S.* Typhi Δ*tviB* elicited a robust response from neutrophils. Although the response to biofilms from the same mutant was also greater than the response to WT biofilms, the increase in ROS production was surprisingly less dramatic (Figure 4C). These data indicate that while the planktonic inhibition of neutrophil ROS is dependent on the Vi antigen, ROS inhibition by biofilms is dependent on a combination of Vi antigen and other biofilm functions. This biofilm-dependent enhancement demonstrates how neutrophil recruitment may prevent *S.* Typhi from causing systemic disease (which would require the release of planktonic bacteria) but simultaneously is unable to clear biofilm infections. This type of stalemate is in line with many persistent infections that are restricted but not sterilized by the host.

Infections with *S.* Typhimurium EPS mutants indicated that the slime polysaccharides cellulose and colanic acid are required for neutrophil recognition of *S.* Typhimurium biofilms (Figure 4D). Considering that *S.* Typhimurium elicits early neutrophil recruitment and inflammatory disease in humans (to a point of self-limiting infection in the gut) and the fact that *S.* Typhi lacks colanic acid, this discovery highlights one way that *S.* Typhi has undergone host specialization and has an advantage in biofilm recalcitrance to innate immunity. On the other hand, both serovars possess cellulose, which presents an interesting question to investigate further. Since we found cellulose to be immune-stimulating, we predict that the localization of cellulose to the host–biofilm interface would reduce biofilm stealth but the presence of other EPS components in vivo prevents this interaction from occurring.

Consistent with a stalemate infection, our data indicated planktonic and biofilm *Salmonella* resisted killing by oxidative species (Figure 4A,B). To determine the degree to which this phenotype is intrinsic to *Salmonella* versus the biofilm lifestyle, we challenged planktonic and biofilm *Salmonella* with exogenously supplied (chemical) oxidative species. We demonstrated that 500 μg/mL ClO− readily kills planktonic and biofilm-associated *Salmonella* alike (Figure 5C–I), indicating that the presence of EPSs does not afford protection against peroxynitrites, chloramines, or singlet oxygen (downstream ROS molecules produced by ClO−). However, the survival of biofilm-associated bacteria in conditions with 50-fold more H2O2 than the WT planktonic MIC (Figure 5A,B,E–I) indicates that EPSs augment biofilm tolerance to the ROS precursor of ClO−. We predict that EPSs slow the diffusion of H2O2 into the biofilm to a rate that permits the effective detoxification of H2O2 by catalase enzymes expressed by *Salmonella* [69–71]. The H2O2 tolerance defect detected in *S.* Typhimurium Δ*csgA*Δ*wcaM*Δ*yihO*Δ*bcsE* biofilms (Figure 5E,G) can be attributed specifically to the O antigen capsule and colanic acid (Figure 6). Interestingly, *S.* Typhi Δ*tviB* biofilms also demonstrated a tolerance defect, as they only survived H2O2 challenge at the planktonic MIC (Figure 5F,I). Consistently, the MIC of planktonic *S.* Typhi Δ*tviB* was reduced to 1.25 mM H2O2 (supplemental data; Figure S4A); the growth of planktonic *S.* Typhimurium expressing the Vi antigen was reduced, but not inhibited, by the WT H2O2 MIC, and complete inhibition required 4.25 mM H2O2 (1.7-fold the WT MIC) (supplemental data; Figure S5). Together, these data implicate biofilm Vi antigen as well as *S.* Typhimurium O antigen capsule and colanic acid in tolerance to oxidative killing.

In our view, it is no coincidence that *S.* Typhi has dispensed with colanic acid; even though the polysaccharide augments *S.* Typhimurium tolerance to H2O2, the ability of *S.* Typhi to outright prevent or reduce oxidative pathways by eliminating this immunostimulatory EPS pays far greater dividends. Furthermore, the other *S.* Typhimurium EPS that was identified as important for tolerance to H2O2 (O antigen capsule) is analogous to the Vi antigen in typhoidal serovars in that it drives H2O2 tolerance. The discovery that each capsule is responsible for H2O2 tolerance in its respective serovar fits with our other evidence and highlights how *S.* Typhi relies on the Vi antigen to establish and maintain chronic infections. While *S.* Typhimurium biofilms depend on the O antigen capsule to reduce ROS production and for tolerance to H2O2, the O antigen capsule was unable to reduce NO production in the same manner that the Vi antigen and *S.* Typhi succeed in reducing NO production. Overall, the evolution of the Vi antigen in typhoidal serovars and the ability to fine-tune iNOS activity set these human-adapted pathogens apart from other *Salmonella*, as the Vi antigen has a prominent role in regulating the host NO and ROS responses to biofilm infections as well as enhancing biofilm recalcitrance during chronic infections that are bound to be exposed to some level of antimicrobial defenses. We have shown how the Vi antigen represents a key EPS of *S.* Typhi biofilms and that it likely elicits modulation of the innate immune system of chronic carriers.

#### **5. Conclusions**

We have provided strong evidence that *S.* Typhi inhibits innate immunity, but also acknowledge that host functions can ultimately prevail. Adaptive immunity has been documented in chronic carriers in the form of *Salmonella*-specific circulating antibodies and *Salmonella*-specific T cells [61,64,72,73]. Yet the inability to clear infections by innate or adaptive means indicates a true stalemate infection. From the perspective of the infected individual, a chronic stalemate infection is beneficial in the sense that it is preventing systemic disease. Of course, from a public health standpoint, chronic disease is the apex issue and must be mitigated, but in a manner that does not target biofilms so effectively that they rapidly disperse and cause systemic morbidity. By studying the innate immune interactions of *Salmonella* biofilms, we have demonstrated one pathway in which innate immunity could potentially be enhanced to help achieve this goal, and we believe that activating myeloperoxidase activity in vivo in response to Vi antigen would benefit the natural ability of the host to clear infection.

**Supplementary Materials:** The following are available online at http://www.mdpi.com/2076-2607/8/2/253/s1, Figure S1: Biofilm tolerance of *S.* Typhi clinical isolates, Figure S2: Analysis of biofilm aggregates, Figure S3: Planktonic sensitivity to oxidative species, Figure S4: EPS mutant planktonic MICs, Figure S5: *S*. Typhimurium Vi antigen<sup>+</sup> planktonic MIC (H2O2).

**Author Contributions:** Conceptualization, M.M.H. and J.S.G.; methodology, M.M.H. and J.S.G.; formal analysis, M.M.H. and J.S.G.; investigation, M.M.H.; data curation, M.M.H.; writing—original draft preparation, M.M.H.; writing—review and editing, J.S.G.; visualization, M.M.H.; supervision, J.S.G.; funding acquisition, J.S.G. All authors have read and agreed to the published version of the manuscript.

**Funding:** This research was funded by grant R01AI116917 from the National Institutes of Health to J.S.G. and by funds from the Abigail Wexner Research Institute at Nationwide Children's Hospital (J.S.G.).

**Acknowledgments:** We thank Bradley Eichar for mutant generation and Evan Bernard for experimental support.

**Conflicts of Interest:** The authors declare no conflict of interest. The funders had no role in the design of the study; in the collection, analyses, or interpretation of data; in the writing of the manuscript, or in the decision to publish the results.

#### **References**


© 2020 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).

## *Article* **Metabolic Activation of CsgD in the Regulation of** *Salmonella* **Biofilms**

**Akosiererem S. Sokaribo 1,2, Elizabeth G. Hansen 1, Madeline McCarthy 1,2, Taseen S. Desin 2,3, Landon L. Waldner 1, Keith D. MacKenzie 4,5, George Mutwiri Jr. 1, Nancy J. Herman 1, Dakoda J. Herman 1, Yejun Wang <sup>6</sup> and Aaron P. White 1,2,\***


Received: 8 May 2020; Accepted: 20 June 2020; Published: 27 June 2020

**Abstract:** Among human food-borne pathogens, gastroenteritis-causing *Salmonella* strains have the most real-world impact. Like all pathogens, their success relies on efficient transmission. Biofilm formation, a specialized physiology characterized by multicellular aggregation and persistence, is proposed to play an important role in the *Salmonella* transmission cycle. In this manuscript, we used luciferase reporters to examine the expression of *csgD*, which encodes the master biofilm regulator. We observed that the CsgD-regulated biofilm system responds differently to regulatory inputs once it is activated. Notably, the CsgD system became unresponsive to repression by Cpx and H-NS in high osmolarity conditions and less responsive to the addition of amino acids. Temperature-mediated regulation of *csgD* on agar was altered by intracellular levels of RpoS and cyclic-di-GMP. In contrast, the addition of glucose repressed CsgD biofilms seemingly independent of other signals. Understanding the fine-tuned regulation of *csgD* can help us to piece together how regulation occurs in natural environments, knowing that all *Salmonella* strains face strong selection pressures both within and outside their hosts. Ultimately, we can use this information to better control *Salmonella* and develop strategies to break the transmission cycle.

**Keywords:** biofilm; *Salmonella*; CsgD; curli; cellulose; CpxR

#### **1. Introduction**

*Salmonella enterica* strains that cause gastroenteritis and typhoid fever were recently ranked first and second in terms of global disease impact (i.e., disability adjusted life years) among 22 of the most common food-borne pathogens [1]. *S. enterica* strains are distributed within >2000 serovars, with yearly estimates of approximately 94 million cases of gastroenteritis [2] and 21 million cases of typhoid fever worldwide [3]. The serovars associated with typhoid fever (i.e., Typhi, Paratyphi and few others [4]) consist of human-restricted strains and are collectively referred to as typhoidal *Salmonella* (TS). The serovars associated with gastroenteritis (i.e., Typhimurium, Enteritidis and >1600 others) consist of host-generalist strains and are collectively referred to as nontyphoidal *Salmonella* (NTS) [5]. NTS outbreaks are relatively common occurrences and are often linked to the consumption of contaminated food produce, such as poultry [6,7], fruits, vegetables [8,9] and processed foods [10]. In general, NTS strains have a remarkable ability to persist and survive in harsh conditions, including extremes of drying and nutrient limitation [11–13].

The majority of NTS strains can form biofilms, a specialized physiology that is characterized by multicellular aggregation, long-term survival, and resistance. Biofilm formation has been linked to *Salmonella* persistence on food surfaces, plants, and other produce, and is thought to provide protection during food processing [14–16]. Aside from the food-borne aspects, biofilm formation is hypothesized to be an integral part of the life cycle of gastroenteritis-causing *Salmonella* strains, by ensuring long-term survival of cells as they cycle between hosts and the environment [14,17]. We have speculated that biofilms are connected to the host generalist lifestyle since the environment (soil and water) would be a common collecting point for multiple host species. In contrast, there is widespread loss of biofilm formation in TS strains and other more invasive strains, such as the specialized NTS strains associated with human bloodstream infections in sub-Saharan Africa [18,19], although TS produce alternative biofilms on gallstones inside human carriers [20]. There are multiple selection pressures acting on biofilm formation in diverse *Salmonella* strains. In short, biofilms are thought to represent the most dominant form of bacterial life on the planet and understanding the regulation of this specialized physiology is important.

Biofilm-forming strains of *S. enterica* can be identified by the production of distinct rdar (red, dry and rough) morphotype colonies when grown on agar-containing media supplemented with the dye Congo red. Cells within the colony are held together by curli fimbriae for short-range interactions and cellulose for long-range interactions [12,21,22]. In addition, other polymers are part of the extracellular biofilm matrix, including polysaccharides (i.e., O-Ag capsule, colanic acid and cellulose), proteins (i.e., BapA, curli, and flagella), lipopolysaccharides and DNA [8,23]. Curli, cellulose and the biofilm matrix impart survival and persistence properties on cells within the biofilm [12–14]. It is not known if the survival traits are specific to the polymers themselves or are emergent properties associated with cells entering a unique physiological state [24–26]. Perhaps the microenvironments generated within a biofilm are responsible for the adaptations, heterogeneity and cellular differentiation observed during biofilm formation [27,28].

In *Salmonella*, regulation of curli, cellulose and other polymers is coordinated through CsgD, the main transcriptional controller of biofilms. The activation of CsgD in vitro has been well-defined, with growth conditions of low osmolarity, lower temperatures and limiting nutrients necessary to activate *csgD* transcription [29]. Expression of *csgD* is repressed tightly at early stages of growth but is induced up to 370-fold when cells enter the stationary phase of growth [30]. The same general principles apply in *E. coli*, which shares the CsgD, curli, and cellulose biofilm components [31]. In the stationary phase of growth, cell density in the culture is high, nutrients become limiting and cells express the alternative sigma factor RpoS [32]. RpoS controls the general stress response [33] and selectively transcribes *csgD* [34,35]. The effects of osmolarity are mediated through the EnvZ/OmpR and CpxA/CpxR two-component signal transduction systems [36]. In low osmolarity, low levels of phosphorylated OmpR bind to a high-affinity binding site −50.5 bp upstream of the *csgD* transcription start sites, which activates *csgD* transcription [37]. In high osmolarity, transcription is repressed through binding of phosphorylated CpxR to multiple sites on the *csgD* promoter [36], as well as phosphorylated OmpR binding to a low-affinity site in the *csgD* promoter [38]. We realized that the complex regulatory network behind *csgD* activation [8] was even more dynamic when it was discovered that CsgD was produced in a bistable manner [39–41]. Biofilm cells are maintained in a CsgD-ON state due to a predicted feed-forward loop consisting of RpoS, CsgD and IraP, a protein that stabilizes RpoS [35,42]. The remaining single cells are in a CsgD-OFF state and express several important virulence factors [14]. 
The connection between persistence and virulence during biofilm formation brings into question the hierarchical regulation of this process, as well as determining how individual cells become activated and remain in their CsgD-ON or -OFF states.

The regulation of *Salmonella* biofilms is also strongly influenced by the intracellular levels of the second messenger, cyclic-di-GMP (c-di-GMP). It is synthesized from two guanosine 5′-triphosphate molecules by diguanylate cyclases (DGCs) and degraded by specific phosphodiesterases (PDEs). In general, high levels of c-di-GMP are associated with biofilm formation, sessility and persistence, and low levels of c-di-GMP are associated with motility and virulence [43]. The change in c-di-GMP levels in *S. enterica* is controlled by the enzymatic activity of 17 different DGCs and PDEs. For biofilms, the cellulose synthase enzyme, BcsA, is allosterically activated by c-di-GMP that is produced by AdrA, a DGC that is transcriptionally activated by CsgD. Expression of CsgD itself is influenced by c-di-GMP synthesis and breakdown by a network of DGC and PDE enzymes [44]. The importance of c-di-GMP regulation is underscored by the observation that *S. enterica* isolates that are defective in the production of DGCs are both avirulent and unable to form biofilms [45].

In this manuscript, we analyzed the regulation of *csgD* transcription and the activity of CsgD through activation of curli biosynthesis (*csgBAC*) and cellulose production (*adrA*). We examined the response to different environmental signals (i.e., temperature, osmolarity, nutrients) and discovered that there is a hierarchy of regulation. These environmental signals were selected because their effects on *csgD* expression before induction have been well established and we hypothesized that these conditions would be encountered during food processing and in both host and non-host environments. We established that the CsgD system responds differently or not at all to known regulatory inputs once it has been activated. This is similar to some dedicated, point-of-no-return processes, such as sporulation in *Bacillus subtilis* [46]; however, we show that the CsgD system can be reversed by other signals, such as glucose. The implications for the *Salmonella* lifecycle are discussed.

#### **2. Materials and Methods**

#### *2.1. Bacterial Strains, Media, and Growth Conditions*

The bacterial strains used in this study are listed in Table 1. For standard growth, strains were inoculated from frozen stocks onto LB agar (lysogeny broth, 1% NaCl, 1.5% agar) supplemented with appropriate antibiotic (50 μg mL<sup>−1</sup> kanamycin (Kan), or 5 μg mL<sup>−1</sup> tetracycline (Tet)) and grown overnight at 37 °C. Isolated colonies were used to inoculate 5 mL LB broth and the culture was incubated for 18 h at 37 °C with agitation at 200 RPM.


**Table 1.** Strains and plasmids used in this study.

For analysis of colony morphology and gene expression, 4 μL of overnight culture was spotted on 1% tryptone agar supplemented with 0.2% freshly made glucose, 25 mM salt or 100 mM salt (agar supplemented with glucose was used within 24 h). Plates were incubated at 28 °C or 37 °C for two days. Visible and luminescence images were captured with a spectrum CT in vivo imaging system (PerkinElmer, Waltham, MA, USA).

#### *2.2. Generation of* S. typhimurium *14028 Mutant Strains*

Lambda red recombination [49] was used to generate Δ*cpxR* and Δ*iraP S.* Typhimurium mutant strains. Primers containing 50-nucleotide sequences on either side of *cpxR* or *iraP* (Table 2) were used to amplify the *cat* gene from pKD3 using Phusion high-fidelity DNA polymerase (New England BioLabs, Ipswich, MA, USA). The PCR products were solution purified and electroporated into *S*. Typhimurium 14028 cells containing pKD46. Mutants were first selected by growth at 37 °C on LB agar supplemented with 10 μg mL<sup>−1</sup> chloramphenicol (Cam) before streaking onto LB agar containing 34 μg mL<sup>−1</sup> Cam. PCR primers upstream and downstream of *cpxR* or *iraP* (Table 2) were used to amplify sequence from the genome of mutant *S. typhimurium* 14028 strains and verify loss of the corresponding open reading frames. The Δ*cpxR* or Δ*iraP* mutations were moved into a clean *S. typhimurium* strain background with P22 phage [50]. The *cat* gene was resolved from the chromosome using pCP20 [49].


#### **Table 2.** Oligonucleotides used in this study.

<sup>a</sup> Nucleotide sequences corresponding to restriction enzyme sites are underlined.
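
The primer layout used for lambda red recombination in Section 2.2 (a 50-nucleotide homology arm followed by a sequence that primes on the resistance cassette) can be sketched in code. This is a minimal illustration only: the function names, the toy flanking sequences, and the priming-site strings passed as arguments are hypothetical placeholders, and the primers actually used in this study are those listed in Table 2.

```python
def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A/C/G/T only)."""
    complement = {"A": "T", "T": "A", "G": "C", "C": "G"}
    return "".join(complement[base] for base in reversed(seq.upper()))


def knockout_primers(upstream, downstream, fwd_priming, rev_priming, arm=50):
    """Build a forward/reverse primer pair for a gene replacement.

    upstream    -- top-strand sequence ending immediately before the gene
    downstream  -- top-strand sequence starting immediately after the gene
    fwd_priming -- 3' end of the forward primer, annealing to the cassette
    rev_priming -- 3' end of the reverse primer, annealing to the cassette
    arm         -- homology arm length in nucleotides (50 in the text)
    """
    if len(upstream) < arm or len(downstream) < arm:
        raise ValueError("flanking sequence shorter than the homology arm")
    # Forward primer: last `arm` nt upstream of the gene, then cassette priming site.
    forward = upstream[-arm:].upper() + fwd_priming.upper()
    # Reverse primer: reverse complement of the first `arm` nt downstream,
    # then the cassette priming site for the opposite strand.
    reverse = reverse_complement(downstream[:arm]) + rev_priming.upper()
    return forward, reverse


if __name__ == "__main__":
    # Toy sequences with a short arm so the result is easy to check by eye.
    fwd, rev = knockout_primers("GGGGACGT", "TTGACCCC", "AAAA", "CCCC", arm=4)
    print(fwd, rev)
```

Passing the cassette priming sites as arguments keeps the sketch independent of any particular template plasmid.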

#### *2.3. Generation of Bacterial Luciferase Reporters and Other Plasmid Vectors*

Luciferase fusion reporter plasmids containing the promoters of *csgDEFG, csgBAC* and *adrA* have been previously described [12]. The *cpxP* reporter plasmid was generated to monitor the levels of CpxA/CpxR activation within the cell. The intergenic region containing the *cpxR* and *cpxP* promoter sequences was PCR amplified from *S*. Typhimurium 14028 using primers cpxR1 and cpxR2 (Table 2) and Phusion high-fidelity DNA polymerase (New England BioLabs, Ipswich, MA, USA). The resulting PCR product was purified, sequentially digested with *Xho*I and *Bam*HI, and ligated (in the *cpxP* direction) using T4 DNA ligase (New England BioLabs, Ipswich, MA, USA) into pU220 digested with *Xho*I and *Bam*HI. The *stm1987* luciferase reporter plasmid was generated similarly using primers STM14\_2408for1 and STM14\_2408rev2 (Table 2), with cloning into pCS26. PCR screening with primers pZE05 and pZE06 was used to verify the successful fusion of promoter regions to *luxCDABE.*

For plasmid-based overexpression of cyclic-di-GMP related enzymes, fragments containing *stm1987* and *yhjH* genes with their native promoters were PCR amplified from *S. typhimurium* 14028 gDNA using Phusion high-fidelity DNA polymerase and appropriate primers (Table 2). Resulting PCR products were purified, digested with *Eco*RI and *Aat*II, and ligated using T4 DNA ligase into *Eco*RI/*Aat*II-digested pBR322. The pACYC-*rpoS* plasmid vector has previously been described [48]. Reporter plasmids and overexpression plasmids were co-transformed into *S. typhimurium* strains by electroporation and selected by growth at 37 °C on LB agar supplemented with 50 μg mL<sup>−1</sup> Kan (pCS26) and 10 μg mL<sup>−1</sup> Tet (pBR322 or pACYC).

#### *2.4. Luciferase Reporter Assays*

96-well bioluminescence assays were performed with *S*. Typhimurium luciferase reporter strains. Overnight cultures were diluted 1 in 600 into individual wells of black, clear bottom 96-well plates (9520 Costar; Corning Life Sciences, Tewksbury, MA, USA) containing 150 μL of 1% tryptone broth supplemented with 50 μg mL<sup>−1</sup> of kanamycin (Kan). When noted, media was supplemented before growth with NaCl (25,150 mM), sucrose (50,150 mM), CuCl2 (1 mM), casamino acids (12%) or individual amino acids (15 mM) to the final concentrations as indicated. For the addition of media supplements during growth, cells were inoculated into 135 μL of media and grown for 18 h at 28 °C before supplements were added as 15 μL aliquots to the appropriate wells. This included glucose ranging from 25–150 mM. To minimize evaporation of the media during the assays, cultures were overlaid with 50 μL of mineral oil per well. Cultures were assayed for absorbance (600 nm, 0.1 s) and luminescence (1 s; in counts per second (CPS)) every 30 min during growth at 28 °C with agitation in a Victor X3 multilabel plate reader (PerkinElmer, Waltham, MA, USA).
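
Because supplements added during growth go in as 15 μL aliquots to 135 μL cultures, each aliquot has to be prepared at ten times the intended final concentration. The following sketch shows that calculation, along with the inoculum volume implied by a 1 in 600 dilution; the function names are illustrative, and it assumes (as stated above) that the quoted supplement concentrations are final concentrations in the well.

```python
def stock_concentration(final_conc, culture_ul=135.0, aliquot_ul=15.0):
    """Concentration the aliquot must have to reach `final_conc` in the well.

    Units of the result match the units of `final_conc` (e.g., mM).
    """
    total_ul = culture_ul + aliquot_ul
    return final_conc * total_ul / aliquot_ul


def inoculum_volume(total_ul=150.0, dilution=600.0):
    """Volume of overnight culture (in uL) for a 1-in-`dilution` dilution."""
    return total_ul / dilution


if __name__ == "__main__":
    # Final glucose concentrations at the ends of the range given in the text.
    for final_mM in (25, 150):
        print(f"{final_mM} mM final -> {stock_concentration(final_mM):g} mM stock")
    print(f"Inoculum per 150 uL well: {inoculum_volume():g} uL")
```

In practice, a volume as small as the per-well inoculum would more likely be handled by diluting the overnight culture into a master mix before dispensing; that step is an assumption and is not described in the text.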

#### **3. Results**

*3.1. Osmolarity Has No Effect Once csgD Transcription Is Activated*

Osmolarity is a key regulatory factor for *Salmonella* biofilm formation in vitro [29,51]. In the presence of high concentrations of NaCl, *csgD* transcription is abolished [52]. In *E. coli*, this repression is mediated through the CpxA/R two-component system [37]. We performed transcription experiments with *S. enterica* serovar Typhimurium ATCC 14028 (i.e., *S. typhimurium* 14028). Consistent with *E. coli*, expression of *csgD* was highest in low osmolarity media (i.e., no salt) and reduced sequentially in media supplemented with increasing concentrations of NaCl (Figure 1A). Reduced *csgD* expression in media supplemented with 75 mM or more salt correlated with basal expression of *csgBAC* (curli biosynthesis) and *adrA* (cellulose biosynthesis) (Figure 1B,C). To gauge the activity of the CpxA/R system, and its potential role in repression, we monitored expression of *cpxP*, a known regulatory target of CpxR [53]. Expression of *cpxP* was highest in media supplemented with 150 mM NaCl (Figure 1D), which was inversely correlated with *csgD* expression levels. This was consistent with CpxR-mediated repression of *csgD* transcription.

Regulation of *csgD* expression via the CpxA/R system is thought to be a dynamic process involving surface-sensing and feedback during curli production [36,54]. Therefore, we performed experiments where salt was added to growing cultures after 18 h of growth, rather than being premixed into the media before growth. At 18 h of growth, *csgD* expression level is rapidly increasing and *csgBAC* and *adrA* expression are just beginning to increase [26]. Under these conditions, *csgD* expression did not change when increasing concentrations of salt were added during growth; the expression curves were nearly superimposable regardless of the amount of salt added (Figure 1E). Expression of *csgBAC* was also not inhibited by the addition of salt and was actually increased at high salt concentrations (Figure 1F). For *adrA*, mild repression was observed, but expression was well above background levels, even in the presence of 150 mM salt (Figure 1G). *cpxP* expression, on the other hand, was similar to the premixed experiments, with highest expression in the 150 mM salt media and lowest expression in non-supplemented media (Figure 1H). These results indicated that the Cpx system was activated by the addition of salt during growth but was no longer causing repression of *csgD* transcription and the downstream genes involved in curli and cellulose production.

**Figure 1.** Response of the *Salmonella csgD* regulatory network to changes in osmolarity. *csgDEFG* (**A**,**E**), *csgBAC* (**B**,**F**), *adrA* (**C**,**G**), and *cpxP* (**D**,**H**) expression was measured in *S. typhimurium* 14028 during growth at 28 °C in media premixed with 25, 50, 75, 100, 125 or 150 mM salt (**A**–**D**) or with 50, 100, or 150 mM salt added during growth (**E**–**H**; vertical line shows the time of addition at 18 h). For each graph, luminescence (light counts per second) divided by the optical density at 600 nm (Lum/OD) was plotted as a function of time with each curve representing a single growth condition. The mean and standard deviations are plotted from experiments performed in triplicate (**A**–**C**, **E**–**H**) or from a single representative experiment (**D**).
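The Lum/OD normalization used in the figure legends can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions: the readings are invented, the function names are hypothetical, and the guard against non-positive OD values is our own addition rather than a step described in the text.

```python
from statistics import mean, stdev


def lum_per_od(lum_cps, od600):
    """Divide luminescence (CPS) by OD600 at each time point.
    Non-positive OD readings give NaN instead of raising an error."""
    ratios = []
    for lum, od in zip(lum_cps, od600):
        ratios.append(lum / od if od > 0 else float("nan"))
    return ratios


def summarize(replicate_curves):
    """Mean and sample standard deviation at each time point,
    computed across replicate Lum/OD curves of equal length."""
    return [(mean(vals), stdev(vals)) for vals in zip(*replicate_curves)]


# Hypothetical readings: three replicate wells, three time points each.
replicates_lum = [
    [1200.0, 9000.0, 60000.0],
    [1000.0, 10000.0, 66000.0],
    [1100.0, 11000.0, 63000.0],
]
replicates_od = [
    [0.10, 0.25, 0.50],
    [0.10, 0.25, 0.55],
    [0.11, 0.26, 0.52],
]

curves = [lum_per_od(l, o) for l, o in zip(replicates_lum, replicates_od)]
summary = summarize(curves)
for t, (m, sd) in enumerate(summary):
    print(f"time point {t}: mean Lum/OD = {m:.0f}, SD = {sd:.0f}")
```

Normalizing each well before averaging keeps differences in cell density between replicates from inflating the spread of the expression curves.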

#### *3.2. Repressive Effect of CpxR on csgD Transcription Is Alleviated During Growth*

To examine the effects of Cpx-mediated repression of *csgD* transcription in more detail, we monitored gene expression in a Δ*cpxR* mutant background. The Cpx system can be activated by high concentrations of metals and a variety of other signals, with each thought to represent a form of periplasmic stress [54,55]. Growth of *S. typhimurium* 14028 in media supplemented with 1 mM copper chloride resulted in activation of the Cpx system, as measured by an increase in *cpxP* expression (Figure 2A, + inducer). As expected, *cpxP* expression was off in the Δ*cpxR* strain background (Figure 2A, red line). In the presence of copper chloride, *csgD* expression reached high levels, similar to those in non-supplemented media (Figure 2B). There was also a slight increase in the Δ*cpxR* strain, which was consistent with CpxR being a repressor of *csgD* transcription. This effect was more pronounced for *csgBAC*, as expression was approximately four times higher in the Δ*cpxR* strain (Figure 2C). We performed the same experiment with the addition of copper chloride after 18 h of growth. The Cpx system was activated normally, as shown by elevated *cpxP* expression levels in the presence of the inducer (Figure 2D). However, expression of *csgDEFG* and *csgBAC* was unchanged in the Δ*cpxR* mutant strain, showing no evidence of CpxR-mediated repression (Figure 2E,F). This indicated that once the *csgD* network was activated, the system was unresponsive to CpxR.

**Figure 2.** The Cpx system has no repressive effect on *csgD* transcription once the biofilm network is activated. Expression of *cpxP* (**A**,**D**), *csgDEFG* (**B**,**E**), and *csgBAC* (**C**,**F**) operons was measured during growth of *S. typhimurium* 14028 wild-type (blue) or Δ*cpxR* strains (red) at 28 ◦C in media supplemented with 1.0 mM CuCl2 (+ inducer) added at the beginning of growth (**A**–**C**) or added after 18 h of growth (**D**–**F**; the vertical, dotted line represents the time of addition). For each graph, luminescence divided by the optical density at 600 nm (Lum/OD) was plotted as a function of time and each curve represents a single growth condition. The mean and standard deviations are plotted from three biological replicate experiments measured in triplicate.

We also measured biofilm gene expression after the addition of sucrose (Figure 3). In *E. coli*, sucrose has been shown to repress *csgD* transcription, due to the activity of H-NS [36]. Sucrose is also a cleaner measure of osmolarity because, unlike salt, it does not change the ionic strength. In general, the *csgDEFG*, *csgBAC* and *adrA* expression profiles were consistent with what was measured in response to salt addition. When sucrose was added to the media before growth, significant repression was observed for all three promoters (Figure 3A–C). However, when sucrose was added to growing cultures at 18 h, no repression was measured (Figure 3E–G). The addition of sucrose had minimal effect on *cpxP* expression (Figure 3D,H), and, therefore, did not appear to engage the CpxR/A system, similar to what was observed in *E. coli* [36]. These results indicated that the *S. typhimurium csgD* biofilm network is not influenced by changes in osmolarity after it has been activated. Moreover, this appears to be a general effect that is not restricted to repression by the Cpx system.

**Figure 3.** Effect of sucrose addition on the *Salmonella csgD* regulatory network. Expression of *csgDEFG* (**A**,**E**), *csgBAC* (**B**,**F**), *adrA* (**C**,**G**), and *cpxP* (**D**,**H**) operons was measured during growth of *S. typhimurium* 14028 at 28 ◦C in media premixed with 50, 100 or 150 mM sucrose (**A**–**D**) or with sucrose added during growth (**E**–**H**; vertical line represents the time of addition at 18 h). For each graph, luminescence (light counts per second) divided by the optical density at 600 nm (Lum/OD) is plotted as a function of time and each curve represents a single growth condition. The mean and standard deviations are plotted from three biological replicate experiments measured in triplicate.

#### *3.3. Temperature and Glucose Repress csgD Expression*

The idea that the biofilm system can become unresponsive to known regulatory inputs once it is activated fits with one of the hallmarks of a bistable gene expression system, in that a proportion of cells can remain activated even when the inducer is absent [56]. There are other bacterial physiologies, such as sporulation, where the cellular differentiation process is irreversible [46]. The *csgD* biofilm network has been shown to have bistable expression [40,42]. We wondered if the response we had observed with osmolarity was representative of a non-reversible system.
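The hallmark described above, cells remaining activated after the inducer is removed, can be illustrated with a generic toy model of a self-activating gene. The sketch below is not derived from the data in this study and is not a model of the *csgD* network; the equation, the parameter values and the function name are all assumptions chosen only to show hysteresis in a bistable switch.

```python
def simulate(x0, schedule, dt=0.01, vmax=1.0, k=0.5, n=4, decay=1.0):
    """Euler integration of a toy positive-feedback model:
        dx/dt = s + vmax * x**n / (k**n + x**n) - decay * x
    `schedule` is a list of (duration, s) phases, where s is the
    strength of an external inducer during that phase."""
    x = x0
    for duration, s in schedule:
        steps = int(round(duration / dt))
        for _ in range(steps):
            feedback = vmax * x**n / (k**n + x**n)
            x += dt * (s + feedback - decay * x)
    return x


# A cell that never sees the inducer stays in the OFF state.
never_induced = simulate(0.0, [(100.0, 0.0)])

# A cell exposed to the inducer and then deprived of it stays ON,
# because the positive feedback now sustains expression by itself.
induced_then_removed = simulate(0.0, [(50.0, 0.3), (50.0, 0.0)])

print(f"never induced:        x = {never_induced:.3f}")
print(f"induced then removed: x = {induced_then_removed:.3f}")
```

Both simulated cells end in the same inducer-free environment but in different expression states, which is the history dependence that defines bistability. A signal that shuts the system off regardless of its history, as tested in this section, would correspond to a change strong enough to eliminate the ON state altogether.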

In most *Salmonella* and *E. coli* strains, *csgD* expression and biofilm formation are activated at temperatures below 30 °C and repressed at higher temperatures [52]. There are strains that produce biofilms at higher temperatures (i.e., 37 °C), but these typically possess single nucleotide polymorphisms in the *csgD* promoter region that allow for dysregulated expression [50,52,57,58]. We tested whether increased temperature could shut off activated biofilm gene expression by first growing cells at 28 °C for 18 h and then shifting the temperature to 30, 32, 35, or 37 °C. At 30 °C or 32 °C, there was a measurable drop in *csgDEFG*, *csgBAC* and *adrA* expression, but it was still above background levels (Figure 4A–C). However, a temperature shift above 32 °C reduced gene expression to baseline levels (Figure 4A–C). This showed that high temperature was able to override the activation of *csgD* and biofilm-related genes.

**Figure 4.** The *csgD* biofilm network in *Salmonella* is repressed by the addition of glucose or an increase in growth temperature. Expression of *csgDEFG* (**A**,**D**), *csgBAC* (**B**,**E**), and *adrA* (**C**,**F**) was measured during growth of *S. typhimurium* 14028 at 28 ◦C for 18 h prior to temperature shift (**A**–**C**) or the addition of 25, 50, 75, 100, 125, or 150 mM glucose (**D**–**F**). The vertical dotted line represents the time of temperature shift or glucose addition. For each graph, luminescence divided by the optical density at 600 nm (Lum/OD) is plotted as a function of time and each curve represents a single growth condition. The mean and standard deviations are plotted from three biological replicate experiments measured in triplicate.

Glucose is another powerful repressor of *csgD* expression and biofilm formation in vitro [26,59]. For *S. typhimurium*, the addition of glucose to growing cultures at 18 h rapidly abolished *csgDEFG* (Figure 4D), *csgBAC* (Figure 4E) and *adrA* (Figure 4F) expression, even at the lowest added concentration of 25 mM. We tested lower concentrations of glucose and found that in each case, *csgD* transcription was immediately repressed but was restored at later timepoints, presumably when all glucose was metabolized. This showed that glucose was a powerful repressive signal. Together, these experiments showed that activation of the *S*. Typhimurium *csgD* biofilm network is a reversible process and suggested the existence of a regulatory hierarchy.

#### *3.4. Effect of Casamino Acids on Biofilm Formation*

Expression of *csgD* is known to be activated once cells reach a critical density and nutrients start to run out [30]. Since 1% tryptone is primarily an amino acid-based media [60], we speculated that the addition of amino acids would reduce or delay expression of *csgD* and other biofilm genes. Casamino acids (CAA) are a complex mixture of amino acids and small peptides that are used for nutritional investigations of bacterial growth. The addition of CAA to the medium prior to *S*. Typhimurium growth reduced *csgDEFG* expression approximately 15-fold in the presence of 0.5, 1.0 or 2.0% CAA (Figure 5A). *csgBAC* and *adrA* expression dropped to near baseline levels when CAA was added at the beginning of growth (Figure 5B,C). The addition of CAA to growing *S. typhimurium* 14028 cultures also reduced the expression of all three promoters, but in a more dose-dependent manner. Expression of *csgDEFG* was reduced to ~75%, 50% and 25% of initial levels after the addition of 0.5% CAA, 1.0% CAA and 2% CAA, respectively (Figure 5D). Expression of *csgBAC* was reduced after the addition of 0.5% or 1.0% CAA, but the promoter was still considered active, whereas expression returned to baseline after the addition of 2.0% CAA (Figure 5E). *adrA* expression returned to near baseline levels, even with the addition of 0.5% CAA (Figure 5F). These experiments demonstrated that there is a metabolic feedback into *csgD* expression and that the system responds differently once it has been activated.

**Figure 5.** The *csgD* biofilm regulatory network in *Salmonella* is repressed by the addition of amino acids. Expression of *csgDEFG* (**A**,**D**), *csgBAC* (**B**,**E**), and *adrA* (**C**,**F**) was measured during growth of *S. typhimurium* 14028 at 28 °C in media premixed with 0.5%, 1.0% or 2.0% casamino acids (**A**–**C**) or in media where casamino acids were added during growth (**D**–**F**; the dotted line represents the time of addition at 18 h). For each graph, luminescence (light counts per second) divided by the optical density at 600 nm (Lum/OD) is plotted as a function of time and each curve represents a single growth condition. The mean and standard deviations are plotted from three biological replicate experiments measured in triplicate.

#### *3.5. Differing Effects of Individual Amino Acids on csgD Gene Expression*

We wanted to test how individual amino acids contributed to the repression of biofilm gene expression caused by CAA. We measured *csgBAC* expression (i.e., curli production) as a proxy for biofilm formation and as a readout for CsgD activity. Only Asn, Pro and Arg had a direct repressive effect on *csgBAC* expression when added individually (Figure 6A; blue bars). The expression curves were lower for the entirety of growth (Figure 6B). Six amino acids had no significant effect (Figure 6A, grey bars; examples in 6C) and seven amino acids caused an increase in expression (Figure 6A, pink bars). The addition of Gly and Thr yielded an approximately three-fold boost to *csgBAC* expression (Figure 6D), which was unexpected. These results indicated that the repression caused by CAA must have been due to the cumulative effect of multiple amino acids.

When individual amino acids were added to *S*. Typhimurium cultures after 18 h of growth, the effects on *csgBAC* expression were not predictable based on their previous groupings (Figure 6E; see color distribution). No amino acids caused a decrease in expression, and some that were repressive when added before growth (i.e., Arg, Pro) now caused a significant boost in expression (Figure 6E,F). Eight amino acids showed no significant difference from the water control (Figure 6G, Lys, Ser). Val, Ala, Gln and Thr led to increased *csgBAC* expression when added before or during growth, suggesting that these amino acids have a positive effect on curli fimbriae synthesis. Glycine, on the other hand, had no significant effect when added during growth (Figure 6H). Overall, we could not explain the differing effects of individual amino acids when added during growth. However, the results were consistent with our previous observation that the *S. typhimurium* biofilm network responds differently to regulatory inputs after the *csgD* network has been activated.
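The grouping of amino acids into repressive, neutral and stimulatory classes rests on comparing maximum Lum/OD values against a water control. The sketch below shows one way such a classification could be computed. The statistical test is not specified in this section, so the exact permutation test, the 0.05 threshold, the function names and all of the numbers are assumptions used for illustration only.

```python
from itertools import combinations
from statistics import mean


def permutation_p_value(treated, control):
    """Exact two-sided permutation test on the difference in means."""
    pooled = list(treated) + list(control)
    n_treated = len(treated)
    n_control = len(pooled) - n_treated
    observed = abs(mean(treated) - mean(control))
    total = sum(pooled)
    extreme = 0
    n_perm = 0
    for idx in combinations(range(len(pooled)), n_treated):
        group_sum = sum(pooled[i] for i in idx)
        diff = abs(group_sum / n_treated - (total - group_sum) / n_control)
        if diff >= observed - 1e-9:  # tolerance for floating-point ties
            extreme += 1
        n_perm += 1
    return extreme / n_perm


def classify(treated, control, alpha=0.05):
    """Label a supplement relative to the water control."""
    if permutation_p_value(treated, control) >= alpha:
        return "neutral"
    return "stimulatory" if mean(treated) > mean(control) else "repressive"


# Hypothetical maximum Lum/OD values (arbitrary units), five wells each.
water = [100, 105, 95, 102, 98]
supplements = {
    "Asn": [40, 45, 38, 42, 41],
    "Lys": [101, 97, 104, 99, 96],
    "Thr": [300, 310, 290, 305, 295],
}

for name, values in supplements.items():
    print(name, classify(values, water))
```

A permutation test is used here only because it needs no distributional assumptions and no external libraries; any two-sample test appropriate to the replicate structure could take its place.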

#### *3.6. Regulation of Rdar Morphotype on Agar-Containing Media*

In the bistable expression of CsgD, the proportion of cells in the "ON" state is thought to be maintained by a feed-forward loop consisting of RpoS, the stationary phase sigma factor that controls *csgD* transcription, IraP, a protein that stabilizes RpoS, and CsgD itself [35,42]. In addition, *csgD* expression and CsgD activity can be influenced by the bacterial secondary messenger, cyclic-di-GMP (c-di-GMP) [44]. We wanted to investigate how these additional regulatory components influenced metabolic control of the *S. typhimurium* biofilm regulatory network. Strains were grown at 28 ◦C or 37 ◦C on 1% tryptone agar, with different components added to the media. To modulate intracellular c-di-GMP levels, strains were transformed with plasmids over-expressing *stm1987*, encoding a DGC enzyme that generates c-di-GMP, or *yhjH*, encoding a PDE enzyme that breaks down c-di-GMP. To analyze the proposed feed-forward loop, we utilized a plasmid over-expressing RpoS and measured gene expression in Δ*rpoS* and Δ*iraP* strains. Each strain was transformed with a luciferase reporter plasmid so that we could visualize *csgBAC* expression.

**Figure 6.** Individual amino acids have differing effects on the *csgD* biofilm regulatory network in *S. typhimurium* 14028. Maximum expression of the *csgBAC* operon (curli production) was recorded during growth of *S. typhimurium* 14028 at 28 °C in media premixed with 15 mM of individual amino acids (**A**) or in media where the amino acids were added after 18 h of growth (**E**). The maximum Lum/OD values after addition of each amino acid were statistically compared to a water control and amino acids were determined to have a repressive (blue), neutral (grey) or stimulatory effect (purple) on *csgB* expression (**A**). This color scheme was used to represent the same amino acids when they were added after 18 h of growth (**E**). Lum/OD values were plotted as a function of time corresponding to selected amino acids premixed into the media (**B**–**D**) or added at 18 h of growth (**F**–**H**; the dotted line represents the time of addition). For each curve, the mean and standard deviations are plotted from three biological replicate experiments measured in triplicate.

The vector-only *S*. Typhimurium 14028 control strain displayed robust light production at 28 °C, with faint *csgBAC* signals also observed in the presence of 25 mM salt (Figure 7, vector). Over-expression of *rpoS* appeared to elevate *csgBAC* expression under most conditions, including in the presence of salt and at 37 °C (Figure 7, *rpoS*). The importance of RpoS was emphasized in that the Δ*rpoS* strain had no visible *csgBAC* expression under any tested condition, unless it was co-transformed with pACYC/*rpoS* (Figure 7; 28 °C Δ*rpoS*). A strong stimulatory effect was also caused by over-expression of *stm1987*, which allowed for robust *csgBAC* expression and biofilm colony morphology under most conditions (Figure 7, *stm1987*). The strain transformed with pBR322/*stm1987* was the only one to have detectable *csgBAC* expression at 37 °C in the presence of 25 mM salt (Figure 7). This indicated that elevated levels of c-di-GMP may be enough to overcome temperature-based repression of *csgBAC*. Emphasizing the importance of c-di-GMP, the expression of *yhjH* was sufficient to abolish *csgBAC* expression at 28 °C (Figure 7, *yhjH*), as well as in all other tested conditions. In contrast, deletion of *iraP* appeared to have little effect on *csgBAC* expression, with only a mild reduction observed at 28 °C (Figure 7, *iraP*). Finally, the presence of glucose in the media abolished *csgBAC* expression in all strain and plasmid combinations (Figure 7, 0.2% Glc). This experiment indicated that increased levels of RpoS and c-di-GMP could partially overcome some *csgBAC* repression, and that glucose was perhaps the most powerful metabolic signal feeding into the *S. typhimurium csgD* regulatory network.

**Figure 7.** Visualization of *S. typhimurium* curli expression in response to changing growth conditions. *S. typhimurium* 14028 wild-type, Δ*rpoS* or Δ*iraP* reporter strains containing a *csgBAC* promoter–luciferase fusion were transformed with pBR322 (vector), pACYC/rpoS (*rpoS*), pBR322/stm1987 (*stm1987*) or pBR322/yhjH (*yhjH*) plasmids. Cells were inoculated onto T agar or T agar supplemented with 0.2% glucose, 25 mM or 100 mM NaCl and grown at 28 °C or 37 °C. Colony morphology (left column) and luminescence (right column) were recorded after 48 h of growth. Control strains containing pACYC were also tested, but the *csgBAC* expression profiles were similar to strains transformed with pBR322; therefore, only the pBR322 pictures are shown.

#### **4. Discussion**

Biofilm formation is subject to tight and complex regulation through the transcription factor CsgD. In *S*. Typhimurium, the intergenic region between the divergent *csgDEFG* and *csgBAC* operons is among the longest non-coding regions at 582 bp, which allows for a highly sophisticated signaling network. CsgD expression is regulated at the transcriptional, post-transcriptional, translational and post-translational levels, in response to a variety of external and internal signals [8]. In this study we show that once activated, the CsgD biofilm network responds differently to metabolic inputs.

The ability of *S. enterica* strains to form biofilms is thought to be critical for the success of *Salmonella* as pathogens, particularly for gastroenteritis-causing strains [14]. With bistability of CsgD synthesis resulting in distinct cell types, namely multicellular aggregates associated with persistence (CsgD-ON) and single cells associated with virulence (CsgD-OFF) [42], there is a need to have a flexible and dynamic response. We speculated that this phenotypic heterogeneity was a form of bet-hedging. A bet-hedging strategy ensures that at least one group of cells will be better adapted to the specific set of conditions that is encountered [61]. For some bacterial processes, such as sporulation, the advantage of the sporulating cell is obvious; however, for the non-sporulating cells, the advantage lies in being capable of more rapid growth when an influx of new nutrients occurs [62]. For *Salmonella*, considerable energy is devoted to generating the polymers associated with biofilm aggregates [26,63]; in the virulent, single cell group, synthesis of the type three secretion apparatus also requires a significant outlay of energy [64]. This type of population split makes the most sense in response to the unpredictability of transmission [65] or perhaps for modulating host–pathogen interactions, as observed for *Vibrio cholerae* [66]. We analyzed regulation before *csgD* activation, which has been tested before in *S. typhimurium* and *E. coli* and generally gave the expected results, and compared this to regulation after *csgD* activation, which to our knowledge has not been tested before. We observed that *csgD* transcription and activation of downstream biofilm components was no longer repressed by increased osmolarity, and that the response to nutrient addition was also different, whether nutrients were added as individual amino acids or as a set of pooled amino acids. In contrast, the addition of glucose and temperatures above 32 °C rapidly repressed *csgD*, *csgB* and *adrA* expression even after induction. We approached these experiments from the point of view of biofilm formation as a developmental process [67,68], and our results show that CsgD-mediated biofilm formation is reversible, but can also be viewed as irreversible, depending on the signal. Our results, therefore, suggest the existence of a regulatory hierarchy among external signals that regulate biofilm formation.

For osmolarity, it has been well established that the optimal conditions for *csgD* expression and rdar biofilm formation in vitro include low osmolarity [51,52,69]. Key transcription factors have been identified (i.e., OmpR, CpxR, H-NS, MlrA and others) and binding within the *csgD* promoter region has been characterized [29,36,37,70,71]. Yet, there are still some intriguing aspects; for example, *S. enterica* biofilm cells produce high levels of osmoprotectants even when growing in low osmolarity conditions [26]. To explain the accumulation of osmoprotectants, we hypothesized that there could be high osmolarity microenvironments created within biofilms due to nutrient and ion trapping by the extracellular matrix [25,72]. The presence of hyperosmolar environments was recently observed with *E. coli* biofilms [73]. Our experiments show that once CsgD has activated downstream target genes (i.e., *csgBAC* (curli) and *adrA* (cellulose)), transcription of all units becomes unresponsive to increases in osmolarity. This was shown specifically for the CpxR/A two-component system, both in high-salt conditions and when CpxR was activated in ways that are not expected to cause a significant change in osmolarity (i.e., metal stress) [55]. There has been some recent controversy about the role of CpxR in surface sensing or adhesion [74], but it is a well-established regulator of *csgD* [75]. The osmolarity effect was also general, as similar gene expression patterns were observed after the addition of sucrose, which was shown to repress *csgD* transcription in *E. coli* by acting through H-NS [36]. In our experiments, the CpxR/A system was not activated by the addition of sucrose; therefore, we assume that the same H-NS-mediated signaling occurs in *Salmonella*. To explain the results with *csgD*, it is possible that the presence of osmoprotectants produced early on during biofilm formation could mute the signaling effects associated with high external osmolarity [76].
Although we have shown that several osmoprotectant-associated genes are produced in time with *csgBAC* and *adrA* [26], we do not know the detailed time course for the appearance of the molecules themselves. The biological relevance for a lack of response to increased osmolarity is not clear, however, a recent paper described a real-world scenario where such a characteristic could be favored. Grinberg et al. 2019 [77] demonstrated that bacterial aggregates have enhanced survival on the surfaces of leaves in microdroplets that are not visible to the naked eye. As liquid evaporates from the leaf surfaces, solutes become concentrated and the microdroplets become hyperosmolar solutions. One could envision *S. enterica* biofilm aggregates surviving well in this scenario due to their stress-resistance adaptations and the altered *csgD* regulatory program identified here. We hypothesize that these microdroplets represent an environment where biofilms, and presumably biofilm-forming strains, would be favored over individual cells that do not aggregate together or strains that do not form biofilms.

Nutrient limitation was one of the first activating signals identified for *csgD* transcription [30,51]. In 1% tryptone or lysogeny broth, which are predominantly comprised of amino acids [60], *csgD* transcription occurs when cell density increases and cells start to run out of nutrients [8,63]. While this was initially attributed to phosphate and nitrogen depletion [30], we tested if supplementation with additional amino acids would delay or prevent activation of *csgD* transcription. When amino acids were added together (i.e., casamino acids), the transcription of *csgD* and downstream biofilm genes was delayed for almost the entire 70-h growth period, well after high cell densities were reached. When CAA were added during growth, *csgD* expression was shifted down in a dose-dependent manner. This showed that after induction, *csgD* expression was still responsive to negative regulation by CAA. The dose response could represent a subpopulation of *S. typhimurium* cells that retain metabolic flexibility [78] and are able to shift their metabolism away from biofilm formation. Based on the results with CAA, we predicted that individual amino acids might also have a repressive effect on biofilm formation. We measured the expression of the curli biosynthesis operon (i.e., *csgBAC*), a direct target of CsgD. Only Asn, Pro and Arg reduced *csgB* expression when added before growth, while Ile, Val, Gln, Met, Ala, Thr and Gly all increased expression. This indicated that the repression observed with CAA was the cumulative effect of the individual amino acids, as recently observed [79]. When added during growth, Leu, Arg, His, Val, Pro, Ala, Gln and Thr increased *csgB* expression, and no single amino acid decreased expression. This again showed that the CsgD biofilm network responds differently once it is activated. The production of sugars from gluconeogenesis is important for biofilm formation, as *S*. 
Typhimurium strains with mutations of *pckA* and *ppsA* are unable to form biofilms [26]. PckA and PpsA are important gluconeogenic enzymes required for the synthesis of phosphoenolpyruvate (PEP). Pck catalyzes the conversion of oxaloacetate to PEP [80], while Pps catalyzes the conversion of pyruvate to PEP. Ala, Gly and Thr are gluconeogenic amino acids that enter the gluconeogenic pathway through pyruvate [81]. In support of this, Ala and Thr increased *csgB* expression when introduced before and during growth. Gly also increased *csgB* expression when added before and during growth, but the change was not statistically significant. CsgD was shown to directly stimulate Gly biosynthesis during *E. coli* biofilm formation [82], presumably to ensure there is enough Gly supply to produce large quantities of the major curli subunit, CsgA (i.e., 16% Gly residues). Increased *csgB* expression in the presence of Ala, Gly and Thr is consistent with their conversion to pyruvate contributing to gluconeogenesis. For the aromatic amino acids, due to solubility and concentration problems, we only tested Phe, which had no significant effect on *csgB* expression. This was unfortunate since *S. enterica* strains defective in aromatic amino acid biosynthesis are unable to form biofilms [83], and tryptophan has been shown to have an important role in *S*. Typhimurium biofilms [84]. Tryptophan was also not present in CAA, as it is destroyed during the acid hydrolysis process [85]. More research is needed to understand the impact of individual amino acids on *csgD* expression.

Glucose was the most powerful external signal tested in our experiments. Under all growth conditions, the presence of exogenous glucose completely repressed the transcription of *csgD*, *csgB* and *adrA*. Expression of *csgD* was repressed in the presence of glucose even when *rpoS* was over-expressed from a plasmid or when levels of c-di-GMP were enhanced due to STM1987 activity. In the initial paper on carbon source foraging [86], the presence of glucose had a streamlining effect on the metabolism of *E. coli* when compared with growth on lower-quality carbon sources. This study was a genome-wide illustration of carbon catabolite repression [87], where growth on optimal carbon sources occurs first and genes for the metabolism of other carbon sources are repressed, usually acting through cyclic AMP (cAMP) and cAMP receptor proteins (CRP). Glucose had a repressive effect on biofilm formation in both *S*. Typhimurium and *E. coli* [12,59,79,88], however how the regulation is mediated is reported to be the opposite. High levels of cAMP repress *csgD* transcription in *S*. Typhimurium [79], but stimulate *csgD* transcription in *E. coli* [88]. It was also initially reported in *S*. Typhimurium that cAMP/CRP had no effect on *csgD* transcription [30]. It is hard to believe that the conserved divergent *csgDEFG* and *csgBAC* operons [89], biofilm networks and large intergenic region are capable of having opposite regulation in *S*. Typhimurium and *E. coli*. However, as pointed out by Hufnagel et al. [88], *E. coli* and *S*. Typhimurium have different evolutionary histories, hence could have differing regulatory responses to glucose. Another important aspect of cAMP/CRP regulation and glucose metabolism pertains to the quality of nitrogen source available [90], making this complex regulatory network in need of further study. 
It should be noted that the repressive effect of glucose did not change according to whether *csgD* transcription was activated or not, which was in contrast to the other nutritional signals that we tested.

The effects of temperature and c-di-GMP on *csgD* transcription were also evaluated. Temperature was one of the first conditions identified to regulate biofilm formation [50,52]. Activation at temperatures below 30 ◦C is known to represent RpoS-dependent transcription of *csgD*. *S. enterica* strains with *csgD* promoter mutations can alleviate temperature-based repression by shifting transcription to be RpoD-dependent [51,52]. This may be a way for natural *rpoS* mutant strains to retain the ability to form biofilms, as there are always a few isolates within natural collections that display temperature-independent biofilm formation [12,21,50]. Temperature was able to shut off the biofilm network even after *csgD* was activated, proving that it is also a strong regulatory signal. *S. typhimurium* biofilm colonies were only formed at 37 ◦C if c-di-GMP levels were enhanced by *stm1987* overexpression, with partial restoration if *rpoS* was overexpressed. Although these conditions are somewhat artificial, the c-di-GMP regulatory principles could be an important observation. We recently discovered that curli can be synthesized by *S. typhimurium* during murine infections, with *csgD* transcription activated at 37 ◦C in vivo [91]. It is also of note that iron limitation [52] and exposure to bile [92] can alleviate temperature-based repression of *csgD* transcription. Finally, expression of the c-di-GMP-degrading enzyme, YhjH (or STM3611), was enough to repress *csgD* expression in all tested conditions, which is similar to previous observations [93,94].

#### **5. Conclusions**

We have started to dissect the external signal hierarchy that regulates *csgD* transcription and CsgD-mediated biofilm formation in *S. enterica*. Most significantly, we identified differences in the regulatory responses based on whether or not *csgD* was activated before being exposed to a signal. These findings are summarized in Figure 8A,B. We hypothesize that the differences upon activation are related to the bistable expression of CsgD [40,42], similar to dedicated processes in other bacterial species. Even a seemingly well-understood process such as diauxie, the switching of *E. coli* growth between two carbon sources, is subject to heterogeneity, as one subpopulation of cells ceases growth once glucose has been exhausted, while the other subpopulation begins to grow on the second carbon source without delay [78]. The diauxic behavior was originally interpreted as the whole population of cells stopping growth during a transition period before starting growth on the second carbon source [95]. We hypothesize that many of the *csgD* regulatory elements that we have examined here are consistent between *S*. Typhimurium and *E. coli* [41], with some notable differences. With respect to phenotypic heterogeneity, we may only fully understand biofilm regulation once we are able to examine the fate of individual cells [27].

**Figure 8.** Graphical illustration of the CsgD regulatory principles identified in this manuscript. The divergent *csg* operons are shown (without *csgFG* and *csgC*) with the intergenic region highlighted by transcription factor binding sites that have been experimentally verified in *Salmonella* (CpxR: black bars; H-NS: grey box; OmpR: hatched boxes). Phosphorylated OmpR binds the proximal, high affinity site under conditions of low osmolarity, which activates *csgD* transcription, and binds the distal, low affinity sites under conditions of high osmolarity, which represses *csgD* transcription [38]. The different regulatory elements that we have tested are shown: glucose; amino acids; growth temperature; and osmolarity, with sodium chloride, which is known to act via the CpxR/A system [37], and sucrose, which is known to act via H-NS [36]. The *adrA* gene encodes a diguanylate cyclase, which produces cyclic-di-GMP and allosterically activates cellulose production. (**A**) Glucose (>25 mM), amino acids (>0.5% casamino acids), temperature (>32 °C), salt and sucrose (>25 mM) caused a reduction in *csgD* transcription and blocked transcription of *csgBAC* and *adrA*, preventing curli and cellulose biosynthesis. The effect of reduced c-di-GMP was tested by overexpression of the YhjH phosphodiesterase. The effect of adding individual amino acids was variable, with three leading to reduced *csgD* transcription (Asn, Pro, Arg), and seven leading to increased *csgD* transcription (Ile, Val, Gln, Met, Ala, Thr, Gly). (**B**) When the same regulatory components were tested after 18 h of growth, the effects were different. We assume that by this time point, the CsgD-IraP-RpoS feed-forward loop [35] is activated, although deletion of *iraP* in our experiments had little effect. The addition of salt and sucrose had no effect on *csgD* transcription, and casamino acids were not as repressive. The effect of increased c-di-GMP was tested by overexpression of the diguanylate cyclase STM1987, which was able to relieve temperature-based repression of *csgD* transcription. The response to individual amino acids was again variable; however, none caused a reduction in *csgD* transcription and eight were stimulatory (Leu, Arg, His, Val, Pro, Ala, Gln, Thr). The question mark signifies that we do not fully understand the regulatory effects of individual amino acids.

**Author Contributions:** Conceptualization, A.P.W., L.L.W., Y.W.; methodology, A.P.W.; formal analysis, A.S.S., L.L.W., A.P.W.; investigation, A.S.S., E.G.H., M.M., T.S.D., L.L.W., K.D.M., G.M.J., N.J.H., D.J.H., Y.W.; resources, A.P.W.; data curation, A.S.S., A.P.W.; writing—original draft preparation, A.S.S., L.L.W., A.P.W.; writing—review and editing, A.P.W., A.S.S.; supervision, A.P.W.; project administration, A.P.W.; funding acquisition, A.P.W. All authors have read and agreed to the published version of the manuscript.

**Funding:** This research was supported by Natural Sciences and Engineering Research Council (NSERC) Discovery grants to A.P.W. (Grant #2017-05737 to A.P.W.; Alexander Graham Bell Canada Graduate Scholarship to K.D.M.; Undergraduate Research Award to E.G.H.), the Jarislowsky Chair in Biotechnology (A.P.W.), Saskatchewan Health Research Foundation (3866 to K.D.M.); the University of Saskatchewan (Graduate research fellowship to A.S.S.; Biomedical Research Award to D.J.H., Postdoctoral Research Award from the Saskatchewan Health Research Foundation to Y.W.). The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.

**Acknowledgments:** The authors would like to thank Tracy Raivio for helpful discussions about the Cpx system, Neil Rawlyk for assisting in experimental design and methodology, and Sylvia van den Hurk for use of her Victor multilabel plate reader.

**Conflicts of Interest:** The authors declare no conflict of interest.

#### **References**


© 2020 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).

## *Article* **The** *Salmonella enterica* **Plasmidome as a Reservoir of Antibiotic Resistance**

**Jean-Guillaume Emond-Rheault 1, Jérémie Hamel 1, Julie Jeukens 1, Luca Freschi 1, Irena Kukavica-Ibrulj 1, Brian Boyle 1, Sandeep Tamber 2, Danielle Malo 3,4, Eelco Franz 5, Elton Burnett 6, France Daigle 7,8, Gitanjali Arya 9, Kenneth Sanderson 10, Martin Wiedmann 11, Robin M. Slawson 12, Joel T. Weadge 12, Roger Stephan 13, Sadjia Bekal 14, Samantha Gruenheid 15, Lawrence D. Goodridge 16,\* and Roger C. Levesque 1,\***


Received: 1 May 2020; Accepted: 29 June 2020; Published: 8 July 2020

**Abstract:** The emergence of multidrug-resistant bacterial strains worldwide has become a serious problem for public health over recent decades. Antimicrobial resistance has been spreading via plasmids, mobile genetic elements that encode antimicrobial resistance (AMR) genes and are transferred vertically and horizontally. This study focuses on *Salmonella enterica*, one of the leading foodborne pathogens in industrialized countries. *S. enterica* is known to carry several plasmids involved not only in virulence but also in AMR. In the current paper, we present an integrated strategy to detect plasmid scaffolds in whole genome sequencing (WGS) assemblies. We developed a two-step procedure to predict plasmids based on i) the presence of essential elements for plasmid replication and mobility, as well as ii) sequence similarity to a reference plasmid. Next, to confirm the accuracy of the prediction in short-read sequencing data from 1750 *S. enterica* isolates, we combined Oxford Nanopore MinION long-read sequencing with Illumina MiSeq short-read sequencing in hybrid assemblies for 84 isolates to evaluate the proportion of plasmids that had been detected. At least one scaffold with an origin of replication (ORI) was predicted in 61.3% of the *Salmonella* isolates tested. The results indicated that IncFII and IncI1 ORIs were distributed in many *S. enterica* serotypes and were the most prevalent AMR gene carriers, whereas IncHI2A/IncHI2 and IncA/C2 were more serotype restricted but bore several AMR genes. Comparison between hybrid and short-read assemblies revealed that 81.1% of plasmids were found in the short-read sequencing using our pipeline. Through this process, we established that plasmids are prevalent in *S. enterica* and we also substantially expanded the set of AMR genes known in the resistome of this species.

**Keywords:** plasmid; *Salmonella enterica*; antimicrobial resistance; long-read sequencing; hybrid assembly

#### **1. Introduction**

Non-typhoidal *Salmonella enterica* is responsible for 88,000 cases of gastroenteritis in Canada each year. The symptoms of gastroenteritis can be mild to severe depending on the health conditions of individuals. Generally, the patient may recover without antibiotic treatment. However, antibiotic intervention may be necessary for children, the elderly, and immunosuppressed patients.

The *Salmonella* genus belongs to the *Enterobacteriaceae* family and includes two species, *bongori* and *enterica*. According to the Kauffmann–White scheme, more than 2500 serotypes have been characterized [1]. As reported by the US Centers for Disease Control and Prevention (CDC), although all *S. enterica* serotypes can cause disease in humans, fewer than 100 serotypes account for most of the infections.

In 2014, a global report by the World Health Organization (WHO) on the surveillance of antimicrobial resistance (AMR) revealed that increasing resistance across many different infections has become a serious concern for public health worldwide [2]. AMR can be acquired either by spontaneous mutations or by horizontal gene transfer (HGT), in which plasmids are known to play a key role [3]. Plasmids are mobile genetic elements (MGE) that encode their own replication and transfer. The genes responsible for plasmid maintenance and transmission form a "backbone", a core set of genes encoding essential plasmid functions [4]. Plasmids also provide non-essential cellular functions, such as virulence factors, AMR, metabolic pathways, and unknown functions defined by genes encoding hypothetical and unknown proteins, all of which can confer competitive advantages to the bacterial host in specific situations. Once an AMR gene becomes stable on a plasmid through environmental pressures, it can quickly spread across species and ecosystems, which can lead to its transfer from the surrounding environment to human pathogens [5,6].

Plasmids are amenable to detailed analysis using whole genome sequencing (WGS) data and complementary software, including mlplasmid, PlasmidFinder, cBar, plasmidSPAdes, Recycler, and PLACNET [7–12]. A recent comparison of five bioinformatics tools showed that plasmidSPAdes was capable of fully or partially predicting 84% of plasmids used as references [12]. However, as plasmidSPAdes separates plasmids from chromosomes based on read coverage, plasmid contigs with a similar coverage to the chromosomal contigs are often mislabelled [8].

Plasmids are widespread in *S. enterica*, where they are known to carry nonessential genes involved in AMR and virulence [13,14]. Given that the spread of AMR genes among microorganisms is a major issue for public health worldwide, the prediction of plasmids carrying AMR genes will give insight into their dissemination across bacterial strains. In the current study, we expanded our knowledge of AMR genes carried by the plasmidome of 1750 *S. enterica* genomes. These genomes, sequenced by Illumina MiSeq as part of a *Salmonella* Syst-OMICS project, were analyzed by Plasmid-Gather, a pipeline designed to predict plasmid scaffolds based on the presence of essential genes for plasmid replication and mobility, and on sequence similarity to a reference plasmid.

#### **2. Materials and Methods**

#### *2.1. Bacterial Isolates and Growth Conditions*

The *Salmonella enterica* isolates used in this study are described in Supplementary Table S1. The isolates were grown for 16–18 h at 37 °C on brain heart infusion agar (BHIA; Difco). The isolates were then transferred into Luria–Bertani (LB) broth with either 15% (*v*/*v*) glycerol or 8% (*v*/*v*) DMSO and stored at −80 °C until needed.

#### *2.2. DNA Preparation and Sequencing*

Genomic DNA was extracted from 1 mL of 4 mL LB broth cultures incubated for 16–18 h at 37 °C with agitation at 200 rpm, using the E–Z 96 Tissue DNA Kit (Omega Bio-tek, Norcross, GA, USA). Approximately 500 ng of genomic DNA was mechanically fragmented for 40 s with a Covaris M220 (Covaris, Woburn, MA, USA) using the default settings. Libraries were synthesized using the NEBNext Ultra II DNA library prep kit for Illumina (New England Biolabs, Ipswich, MA, USA) according to the manufacturer's instructions and sequenced to obtain 30× coverage in an Illumina MiSeq 300 bp paired-end run at the Plateforme d'Analyses Génomiques of the Institut de Biologie Intégrative et des Systèmes (Université Laval, Québec, QC, Canada).

#### *2.3. Library Preparation and Oxford Nanopore MinION Sequencing*

Genomic DNA was extracted from 16–18 h LB broth cultures grown at 37 °C using the DNeasy Blood and Tissue Kit (QIAGEN, Toronto, ON, Canada). The manufacturer's protocol was adapted to maximize the read length (no rapid pipetting or vortexing; homogenization by inversion only). We used 1.2 μg of genomic DNA for library preparation with the SQK-LSK109 Kit and followed the manufacturer's recommendations. Native barcoding expansion PCR-free EXP-NBD104 (1–12) and EXP-NBD114 (13–24) kits were used according to the manufacturer's protocol. Twenty-four libraries were pooled and sequenced with an R9.4.1 flow cell (FLO-MIN106D) on a MinION device. Each sequencing run yielded between 20 and 28 Gb of raw data.

#### *2.4. Bio-informatic Analysis and Databases*

The origins of replication (ORIs) database (PlasmidFinder-DB) was downloaded from the PlasmidFinder web server [7] (accessed 23/10/2018 10:10). An in-house database (MOBs-DB) of proteins involved in plasmid conjugation, mobilization, or transfer was constructed from annotated proteins encoded by plasmids described in the UniProtKB database [15] (accessed 29/08/2017 20:48). The resulting plasmid mobility proteins were clustered by CD-HIT [16] using 60% amino acid identity as the cut-off and 85% as the minimal alignment coverage for both the longer and the shorter sequences. After this process, the mobility proteins database contained 361 reference proteins. A plasmids database (Plasmids-DB), containing 13,924 plasmids on 23 October 2018, was downloaded from the National Center for Biotechnology Information (NCBI) (ftp://ftp.ncbi.nlm.nih.gov/genomes/refseq/plasmids/, accessed 23/10/2018 10:15). To ensure that the sequences in the Plasmids-DB were of plasmid origin and not mislabeled chromosomal DNA, we examined all the plasmids with more than 550 kb of DNA. We identified 2 large sequences labeled as *Salmonella* plasmids (NZ\_CP022019.1 = 4,627 kb and NZ\_LN868944.1 = 728 kb). NZ\_CP022019.1 is most likely a complete chromosome mislabeled as a plasmid because it has the same size as a typical *Salmonella* chromosome. A search of the NCBI non-redundant nucleotide collection database (nr/nt) using the basic local alignment search tool (BLAST) revealed that NZ\_LN868944.1 aligned completely with *Salmonella* chromosomes; these 2 sequences were removed from the Plasmids-DB.

#### *2.5. Plasmid-Gather Pipeline*

With the aim of detecting plasmids that can carry AMR or virulence genes, we developed an approach that combines 2 types of databases, thereby enhancing the discovery rate of contigs as plasmid fragments. The Plasmid-Gather pipeline is summarized in Figure 1. Briefly, reads were trimmed using Trimmomatic (v. 0.36) [17] and assembled by SPAdes (v. 3.10.1) [18]. We chose SPAdes as an assembler instead of plasmidSPAdes (an algorithm based on the de Bruijn graph to assemble plasmids), because the latter assembles only reads with coverage different from the chromosome, and therefore low copy plasmids with similar coverage to the chromosome are missed. SPAdes was used to obtain the whole genome assembly, and the plasmids were recovered based on two databases. BLAST (v. 2.6.0) [19] was performed on the resulting scaffolds against the PlasmidFinder-DB (see the above section on databases) to predict the ORI. Since some plasmids were expected to assemble into several scaffolds because of repeated elements [20,21], we added a BLAST (v. 2.6.0) [19] search against the MOBs-DB (see the above section on databases), instead of using the PlasmidFinder-DB exclusively, to maximize the discovery of contigs as plasmid fragments. To remove possible chromosome contigs from the scaffolds predicted by the pipeline, scaffolds larger than 300 kb were not taken into account. A second threshold excluded sequences less than or equal to 1 kb. To investigate whether the scaffolds corresponded to a known plasmid, the resulting scaffolds were aligned using BLAST (v. 2.6.0) [19] against the NCBI Plasmid database (≥ 95% sequence identity). The percentage of GC content was calculated using the infoseq application from the EMBOSS package (v. 6.5.7) [22].
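The selection criteria in this section (an ORI and/or a mobility protein hit, and a size above 1 kb and not above 300 kb) can be sketched as a small filter. This is a minimal illustration and not the Plasmid-Gather code itself: the `Scaffold` class, its field names, and the assumption that 1 kb equals 1000 bp are all hypothetical choices made for the example.

```python
from dataclasses import dataclass

# Size thresholds quoted in the text; 1 kb = 1000 bp is an assumption here.
MIN_LEN_BP = 1_000      # sequences <= 1 kb are excluded
MAX_LEN_BP = 300_000    # scaffolds > 300 kb are treated as possible chromosome contigs


@dataclass
class Scaffold:
    """Hypothetical record for one assembled scaffold."""
    name: str
    length_bp: int
    has_ori: bool   # significant BLAST match against PlasmidFinder-DB
    has_mob: bool   # significant BLAST match against MOBs-DB


def is_plasmid_scaffold(s: Scaffold) -> bool:
    """Keep a scaffold only if it is in the size window and carries an ORI and/or a mobility protein."""
    in_size_window = MIN_LEN_BP < s.length_bp <= MAX_LEN_BP
    return in_size_window and (s.has_ori or s.has_mob)


def select_plasmid_scaffolds(scaffolds):
    """Return the scaffolds that satisfy the criteria, in input order."""
    return [s for s in scaffolds if is_plasmid_scaffold(s)]


# Illustrative input only; these scaffolds are invented for the example.
example = [
    Scaffold("scaffold_1", 45_000, True, False),
    Scaffold("scaffold_2", 421_000, False, True),   # too large, dropped
    Scaffold("scaffold_3", 900, True, True),        # too small, dropped
    Scaffold("scaffold_4", 12_000, False, False),   # no ORI or mobility protein, dropped
]
kept = select_plasmid_scaffolds(example)
```

Keeping the size test separate from the marker test makes it easy to see why a large scaffold with a mobility protein hit, such as an integrated element, is excluded by size alone.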

**Figure 1.** Plasmid-Gather workflow. First, MiSeq Illumina paired-end reads were trimmed using Trimmomatic (version 0.36) [17] and assembled by SPAdes (version 3.10.1) [18]. BLAST analyses (version 2.6.0) [19] against the origins of replication (ORIs) (PlasmidFinder-DB) and mobility proteins (MOBs-DB) databases were performed to predict which scaffolds carried one of these elements. The significant matches against the PlasmidFinder-DB were separated into two groups depending on the threshold: A) ORIs highly similar to those in the database or B) ORIs related to those in the PlasmidFinder-DB. With the latter threshold, we aimed to expand the discovery of plasmid scaffolds. Only scaffolds between 1 and 300 kb in size encoding an ORI and/or a plasmid mobility protein were kept for further analysis.

#### *2.6. Antibiotic Resistance Gene Analysis*

AMR genes were predicted using Resistance Gene Identifier (RGI) (v. 4.2.2) based on the BLAST search against the Comprehensive Antibiotic Resistance Database (CARD) [23]. The presence of AMR genes was determined based on the curated e-value cut-offs.

#### *2.7. SISTR*

The *Salmonella* serotypes were predicted from the genome assemblies using the *Salmonella In Silico* Typing Resource (SISTR) (v. 1.0.2) [24].

#### *2.8. Plasmid Fragments Recovery (Post-Recovery)*

To increase the discovery rate of contigs as plasmid DNA fragments in the Illumina MiSeq assembly, we added a step to the Plasmid-Gather pipeline illustrated in Figure 1 to recover plasmid fragments without ORI or mobility proteins. For each scaffold with an ORI identified by Plasmid-Gather, we identified the plasmid with the highest bit score using BLAST against the NCBI Plasmid database (≥ 95% sequence identity, ≥ 15% query coverage per subject, and ≤ 550 kb of subject length). This plasmid, which varied with the ORI-containing plasmid scaffold used as the query sequence, served as a reference plasmid against which the Illumina MiSeq assembly was aligned. To be considered a plasmid fragment, a scaffold had to share ≥ 95% sequence identity and ≥ 25% query coverage per subject with the reference.
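The two threshold sets in this post-recovery step can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: hits are represented as dictionaries keyed by the BLAST+ tabular field names `sseqid`, `pident`, `qcovs` (query coverage per subject), `slen`, and `bitscore`, and the function names and example hits are hypothetical.

```python
from typing import Optional

# Thresholds quoted in the text for the post-recovery step.
MIN_IDENTITY = 95.0            # percent identity, both steps
MIN_QCOV_REFERENCE = 15.0      # query coverage per subject when choosing the reference
MAX_SUBJECT_LEN_BP = 550_000   # reference plasmids longer than 550 kb are not used
MIN_QCOV_FRAGMENT = 25.0       # query coverage per subject when calling a fragment


def pick_reference(hits: list) -> Optional[dict]:
    """Among eligible hits for an ORI-bearing scaffold, return the one with the highest bit score."""
    eligible = [
        h for h in hits
        if h["pident"] >= MIN_IDENTITY
        and h["qcovs"] >= MIN_QCOV_REFERENCE
        and h["slen"] <= MAX_SUBJECT_LEN_BP
    ]
    return max(eligible, key=lambda h: h["bitscore"], default=None)


def is_plasmid_fragment(hit: dict) -> bool:
    """Decide whether a scaffold aligned to the chosen reference counts as a plasmid fragment."""
    return hit["pident"] >= MIN_IDENTITY and hit["qcovs"] >= MIN_QCOV_FRAGMENT


# Invented hits for one ORI-bearing scaffold; identifiers are placeholders.
ori_hits = [
    {"sseqid": "ref_A", "pident": 99.1, "qcovs": 40.0, "slen": 176_000, "bitscore": 5000.0},
    {"sseqid": "ref_B", "pident": 99.9, "qcovs": 90.0, "slen": 4_627_000, "bitscore": 9000.0},
    {"sseqid": "ref_C", "pident": 93.0, "qcovs": 80.0, "slen": 150_000, "bitscore": 7000.0},
]
reference = pick_reference(ori_hits)
```

In this toy input, the hit with the highest bit score is rejected because its subject is longer than 550 kb, which mirrors the screen against mislabeled chromosomes described for the Plasmids-DB.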

#### **3. Results**

#### *3.1. Construction of a Bioinformatics Pipeline for Plasmid Identification*

In the framework of the Syst-OMICs genome project, 1750 *S. enterica* isolates (Supplementary Table S1) representing 153 serotypes, sequenced by WGS (Supplementary Table S2) and available in the SalFoS database (https://salfos.ibis.ulaval.ca/) [25], were analyzed using the Plasmid-Gather pipeline depicted in Figure 1. Plasmid-Gather was developed to identify plasmid-related scaffolds in WGS data for *S. enterica* isolates using a systematic, integrated strategy.

Plasmid-Gather identified 2211 scaffolds matching our criteria (i.e., scaffolds between 1 and 300 kb in size encoding either a plasmid mobility protein and/or an ORI) (Supplementary Table S3). There were three scaffolds > 300 kb (750, 515, and 421 kb) that were predicted to be plasmids based upon protein homology with sequences contained in the MOBs-DB (none encoded an ORI). A BLAST of these scaffolds against the NCBI Nucleotide collection (nr/nt) database revealed that they aligned with *Salmonella* chromosomes, but at 94%, 84%, and 94% of query cover, respectively. The genomic regions encoding the mobility proteins predicted by Plasmid-Gather were missed in these three alignments. A BLAST against the NCBI Nucleotide collection (nr/nt) database of these genomic regions where mobility proteins were identified revealed they corresponded to partial plasmid sequences. By evaluating this information, these three scaffolds could presumably be chromosomal DNA contigs carrying an integrated plasmid or an integrated conjugative element (ICE).

Scaffolds identified by Plasmid-Gather were labelled as "plasmid scaffolds". Of these 2211 plasmid scaffolds, our pipeline predicted at least one ORI for 1910 of them. The GC-content, ORI type(s), and presence of the mobility protein are described in Supplementary Table S3. To provide an overview of the plasmid scaffolds that have already been characterized, the percentage coverage of the best alignments using BLAST (identity ≥ 95%) against the NCBI Plasmid database is presented in Supplementary Table S3.

We predicted from 1 to 10 plasmid scaffolds in 1097 *S. enterica* isolates. Of these 1097 isolates, one or several ORIs were predicted in 1073 isolates. Some plasmids belonging to the *Enterobacteriaceae* have been characterized and confirmed to contain more than one ORI [26]. We assume that possessing multiple ORIs allows plasmids to replicate when transferred into another species or serotype, thereby broadening their host range. Overall, we predicted 2350 ORIs, of which 851 were termed putative ORIs (Supplementary Table S4). Although an ORI termed "putative ORI" carries the name of its closest BLAST match (e.g., putative-*IncFII*), putative ORIs were considered a different type of ORI. Further analysis will be needed to assign a putative ORI to the same incompatibility (Inc) group as its closest BLAST match. A collection of 12 plasmid scaffolds carried three ORIs (10 IncFIB/IncFII/IncX1, 1 IncFIB/IncFII/IncI1, and 1 IncFIA/IncHI1A/IncHI1B), 416 had two ORIs (251 IncFIB/IncFII and 16 other combinations), and 1471 had one ORI. The three most frequent ORIs identified in *Salmonella* isolates, excluding the 851 putative ORIs, were the transferable IncFII (nisolate = 533), IncFIB (nisolate = 371), and IncI (nisolate = 142) (Table 1). All three Inc group plasmids have been shown to carry virulence-associated and AMR genes within *Enterobacteriaceae* [14,27–30]. The majority of the IncFII and IncFIB ORIs were identified from *S.* Enteritidis and *S.* Typhimurium (Table 1) (at 33% and 28% for IncFII and at 47% and 39% for IncFIB, respectively). Moreover, IncFII was co-carried with IncFIB in 97% of the isolates (359/371). With putative ORIs excluded (Table 1), the three most widespread ORIs among *Salmonella* serotypes were IncFII (nserotype = 50), IncI1 (nserotype = 43), and ColpVC (nserotype = 37). In contrast, the two ORIs IncX1 (nserotype = 10) and IncA/C2 (nserotype = 12) were limited to a few serotypes (Table 1).


**Table 1.** Nine most frequent ORIs found across the most frequent *S. enterica* serotypes.

#### *3.2. Antibiotic Resistance Genes of the S. enterica Plasmidome*

To evaluate the diversity of plasmid-encoded AMR genes that can potentially complicate disease treatments and potentially be transferred to other bacteria, we next predicted AMR genes carried by the *S. enterica* plasmidome using the Resistance Gene Identifier (RGI v. 4.2.2) [23]. The RGI predicted 863 AMR genes across 375 plasmid scaffolds in 327 *S. enterica* genomes (Supplementary Table S5). Plasmid scaffolds encoded 55 different AMR genes, and 96 unique resistomes encoded by plasmid scaffolds were found across the 327 genomes (Figure 2).
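Counting "unique resistomes" amounts to grouping genomes by the exact set of plasmid-borne AMR genes they carry. The sketch below shows one way to do this; the function name, the genome identifiers, and the gene sets are invented for illustration and do not reproduce Supplementary Table S5.

```python
from collections import Counter


def unique_resistomes(genes_by_genome: dict) -> Counter:
    """Count how many genomes share each distinct set of plasmid-borne AMR genes.

    Genomes without any predicted plasmid AMR gene are ignored.
    """
    return Counter(
        frozenset(genes) for genes in genes_by_genome.values() if genes
    )


# Toy input: hypothetical isolates with gene names used only as labels.
toy = {
    "isolate_1": {"sul1", "tet(B)"},
    "isolate_2": {"tet(B)", "sul1"},      # same resistome as isolate_1
    "isolate_3": {"blaTEM-1"},
    "isolate_4": set(),                   # no plasmid AMR gene, not counted
}
profiles = unique_resistomes(toy)
n_unique = len(profiles)
```

Using `frozenset` makes the comparison independent of gene order, so two genomes carrying the same genes always fall into the same profile.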

**Figure 2.** Antimicrobial resistance (AMR) genes of the *S. enterica* plasmidome predicted using the Resistance Gene Identifier (RGI) (v. 4.2.2), based on the Comprehensive Antibiotic Resistance Database (CARD) [23]. The bar plot above shows the frequency of unique resistomes. Numbers below the heatmap indicate the antimicrobial resistance profile (AMRp) (Supplementary Table S5). The AMRps were assigned to the resistome of each isolate plasmidome. The antibiotic family or function are shown on the right.

The plasmid scaffolds with the highest number of AMR genes were a single scaffold without an ORI (nAMR = 9), three containing IncA/C2 (nAMR = 8), and one with IncL/M (nAMR = 7). We found that the three most frequent ORIs carrying AMR gene(s) were IncI1 (nORI = 98), followed by IncFII (nORI = 50) and IncX1 (nORI = 25). IncI1 has previously been the most common ORI type identified in multidrug-resistant isolates [31]. However, in our study, plasmid scaffolds without an ORI were the second most common AMR gene carriers (86 scaffolds in 76 isolates). These are likely DNA fragments of larger plasmids whose ORI assembled on another scaffold. Plasmids can be assembled in several scaffolds because of repeated elements. Further analysis using PCR or long-read DNA sequencing will be required to order scaffolds for assembly.

AMR genes were also identified in the 1750 *S. enterica* genomes from which plasmid scaffolds had been removed, and both resistomes were compared (Supplementary Table S6). The prevalence of AMR genes among *Salmonella* genomes from the NCBI Chromosome and NCBI Plasmid databases, as available from the CARD website (v. 4.2.2), was included in Supplementary Table S6 for comparative purposes. An analysis of the 1750 *Salmonella* chromosomes showed an average of 39 AMR genes per genome (from 30 to 56 AMR genes/genome), representing 207 different AMR genes. Thirty-four AMR genes were predicted in nearly all chromosomes (from 95 to 100% of chromosomes); 26 of these 34 resistance mechanisms belong to an antibiotic efflux pump complex (Supplementary Table S6). These genes likely correspond to the core resistome of *S. enterica*. Moreover, the prevalence of 25 of these AMR genes in the NCBI Plasmid and Chromosome databases, as reported in CARD, was consistent with our prediction (Supplementary Table S6). Overall, 17 AMR genes that were infrequently predicted in chromosomes are normally limited to the NCBI Plasmid database according to CARD (0% in NCBI Chromosome database) (Supplementary Table S6), which suggests that some plasmid scaffolds may remain unidentified among chromosome scaffolds.

#### *3.3. Increasing Recovery of Plasmid Scaffolds Using a Reference*

To investigate whether plasmid scaffolds remained within the *S. enterica* chromosome scaffolds, the four chromosomal scaffolds carrying *aac(3)-VIa*, a resistance gene limited to the NCBI Plasmid database, were aligned against the NCBI Plasmid database, and their best match was used as reference to map the Illumina MiSeq assemblies (Figure 3).

In three of the four cases, the best match from the NCBI Plasmid database was the multidrug-resistant plasmid IncA/C pSN254, while the fourth was identified as IncHI2 pAPEC-01-R [32,33]. To confirm the ORI type of the plasmids from Plasmids-DB, we aligned all the plasmids of the NCBI Plasmid database using BLAST against the PlasmidFinder-DB [7]. The plasmid pSN254, as well as many *Salmonella* plasmids first published as IncA/C (pAM04528, peH4H, pAR060302, p1643\_10, p33676, pCVM2245, pCVM22462, pCVM22513, pCVM21538, pCVMN1543, and pCVM21550), perfectly matched with IncA/C2; meanwhile, pRA1 and pRAx matched with IncA/C [34–38]. As mentioned by Carattoli et al. (2006), IncA/C and IncA/C2 exhibit 26 nucleotide substitutions [39]. As demonstrated in Figure 3, at least two scaffolds that aligned with the reference plasmid had already been detected in the WGS using our pipeline; one encoded the TraI mobility protein, whereas the second carried the ORI(s) (IncA/C2 or IncHI2/IncHI2A). Furthermore, as depicted in Figure 3, several new plasmid scaffolds were recovered using the closest homologous plasmid from NCBI as the reference plasmid. Figure 3 also shows the complexity of plasmid reconstruction, which is probably due to the high plasticity of some plasmids, as observed for IncHI2 and IncA/C [31,40,41].

To improve the detection of plasmid fragments, we added to the Plasmid-Gather pipeline (Figure 1) a final step that uses a reference plasmid. However, instead of using the scaffold encoding the AMR gene to select the closest reference plasmid, we picked the reference based on the best match with the scaffold bearing the ORI. We recovered 1172 additional scaffolds for the *S. enterica* plasmidome, giving a total of 3383 scaffolds. Taking these data into account, the scaffolds were re-sorted into two groups: plasmid scaffolds and chromosome scaffolds.

**Figure 3.** Recovery of plasmid fragments in whole genome sequencing (WGS) data based on comparisons with reference plasmids. The reference plasmids above each alignment were selected based on the best match against the National Center for Biotechnology Information (NCBI) Plasmid database with the scaffold encoding *aac(3)-VIa* gene usually found in plasmid sequences. The WGS data were mapped using CONTIGuator v. 2.7 against the reference plasmids. Several scaffolds without ORI or mobility protein have been identified as plasmid scaffolds in the four *Salmonella* genomes using a reference plasmid.

#### *3.4. The Resistome of S. enterica Plasmids and Chromosomes*

We again predicted AMR genes using RGI v. 4.2.2 [23], this time in the *S. enterica* plasmid and chromosome scaffolds separated using the strategy that we called post-recovery. The AMR genes predicted in the post-recovery plasmid and chromosome scaffolds of *S. enterica* showed a better specificity of the AMR genes carried by each. RGI predicted a total of 1174 AMR genes in plasmids (311 more than in the previous prediction). Several predicted AMR genes that are found more frequently in the NCBI Plasmid database than in the NCBI Chromosome database showed distributions similar to those reported in CARD (Supplementary Table S7). For instance, before the recovery of plasmid scaffolds, 36 and 37 *bla*CMY-2 genes were predicted in plasmids and chromosomes, respectively (Supplementary Table S6). After the post-recovery strategy based on a reference plasmid, 69 (6.26%) and 4 (0.23%) *bla*CMY-2 genes were identified within plasmids and chromosomes, respectively (Supplementary Table S7). The prevalence of the *bla*CMY-2 gene in the *S. enterica* genomes from the NCBI Plasmid and NCBI Chromosome databases is 6.62% and 0.73%, respectively, which is consistent with what we obtained post-recovery (Supplementary Table S7). A similar trend was observed for the *sul1*, *bla*TEM-1, *tet(B)*, *tet(C)*, *tet(D)*, *tetR*, and *aph(3')-Ia* AMR genes.

To determine whether there is an enrichment of certain AMR genes between the plasmids and chromosomes of *S. enterica*, we calculated *p* values using Fisher's exact test (Supplementary Table S7). We noted that 52 and 13 AMR genes were significantly enriched (*p* < 0.001) in chromosomes and plasmids, respectively (Supplementary Table S8 and Table 2).
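For readers unfamiliar with the test, a two-sided Fisher's exact test on a 2×2 table can be written with the standard library alone. The sketch below is illustrative: the table layout (gene present or absent, in plasmids versus chromosomes) is one plausible arrangement rather than the one documented for Supplementary Table S7, and the group sizes of 1100 and 1750 are placeholders chosen for the example. Only the counts 69 and 4 are taken from the text.

```python
from math import comb


def fisher_exact_two_sided(a: int, b: int, c: int, d: int) -> float:
    """Two-sided Fisher's exact p-value for the 2x2 table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of all tables with the same margins
    that are no more likely than the observed table.
    """
    row1, row2 = a + b, c + d
    col1 = a + c
    total_tables = comb(row1 + row2, col1)

    def prob(x: int) -> float:
        # Probability that the top-left cell equals x given fixed margins.
        return comb(row1, x) * comb(row2, col1 - x) / total_tables

    p_observed = prob(a)
    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    # Small relative tolerance guards against floating-point ties.
    return sum(
        p for p in (prob(x) for x in range(lowest, highest + 1))
        if p <= p_observed * (1 + 1e-7)
    )


# Illustrative table: gene found in 69 of 1100 plasmidomes and 4 of 1750 chromosomes.
# The denominators are assumptions made for this example only.
p_value = fisher_exact_two_sided(69, 1100 - 69, 4, 1750 - 4)
enriched = p_value < 0.001
```

A gene would be called enriched in one compartment when the p-value falls below the 0.001 threshold used in the text; the direction of enrichment is read from the observed proportions.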



*Microorganisms* **2020**, *8*, 1016

#### *3.5. Analysis of Plasmid Content Using Long-Read DNA Sequencing*

As depicted in Supplementary Table S9, 84 *Salmonella* isolates were selected from distant branches of a phylogenetic tree representing 2544 *S. enterica* genomes to capture as much as possible of the genome and plasmid diversity contained in the *S. enterica* species (Figure 4). Long reads from Oxford Nanopore sequencing were combined with Illumina MiSeq short reads in hybrid genome assemblies of the complete chromosome and plasmid contents, which allowed us to evaluate the Plasmid-Gather predictions made from the short-read data.

**Figure 4.** Unrooted maximum likelihood tree of 2544 *S. enterica* genomes based on 173,657 single nucleotide polymorphisms (SNPs). Genomes were assembled using SPAdes. Isolates used are labelled in green and blue. Green nodes were isolates sequenced using Oxford Nanopore.

The hybrid assemblies gave the complete bacterial chromosome in all 84 selected isolates. In addition to the complete chromosomes, the hybrid assemblies contained 73 extrachromosomal scaffolds, of which 64 (87.7%) were predicted as plasmids using the ORI (PlasmidFinder-DB) and mobility protein (MOBs-DB) databases that were used to predict plasmids in the short-read assemblies (Supplementary Table S9). Mobility proteins or ORI sequences were not identified in the *Salmonella* chromosomes, demonstrating a high specificity of the two databases for extrachromosomal elements. The percentage of each hybrid-assembled plasmid covered by the scaffolds predicted in the MiSeq data was calculated, and the sum showed that 81.1% of plasmids were found in the short-read assemblies. Nine extrachromosomal elements, of which eight are small (less than 7 kb), could not be identified as plasmids in either assembly, although online BLAST searches on NCBI indicated that they matched plasmid sequences.
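One way to compute the fraction of hybrid-assembled plasmid sequence recovered by short-read scaffolds is to merge the aligned intervals on each plasmid and divide the covered bases by the total plasmid length. The text does not describe the exact calculation, so the sketch below is an assumption: the half-open interval convention, the function names, and the toy plasmids are all hypothetical.

```python
def covered_bases(intervals) -> int:
    """Number of positions covered by at least one half-open interval [start, end)."""
    total = 0
    current_start = current_end = None
    for start, end in sorted(intervals):
        if current_end is None or start > current_end:
            # Close the previous merged block and open a new one.
            if current_end is not None:
                total += current_end - current_start
            current_start, current_end = start, end
        else:
            # Overlapping or adjacent interval: extend the current block.
            current_end = max(current_end, end)
    if current_end is not None:
        total += current_end - current_start
    return total


def overall_recovery(plasmids: dict) -> float:
    """Covered bases summed over all plasmids, divided by their summed length.

    `plasmids` maps a plasmid name to (length_bp, list of aligned intervals).
    """
    covered = sum(covered_bases(ivs) for _, ivs in plasmids.values())
    length = sum(length_bp for length_bp, _ in plasmids.values())
    return covered / length


# Toy example with invented plasmids and alignments.
toy_plasmids = {
    "plasmid_A": (1000, [(0, 500), (400, 800)]),   # overlapping scaffolds cover 800 bp
    "plasmid_B": (200, [(0, 100)]),                # 100 bp covered
}
recovery = overall_recovery(toy_plasmids)
```

Merging intervals before counting prevents overlapping short-read scaffolds from being counted twice, which would otherwise inflate the recovered fraction.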

The AMR genes were predicted using RGI v. 4.2.2 in the 73 extrachromosomal elements only [23] (Supplementary Table S9). This analysis gave two major observations: 1) the most frequent ORI carrying AMR genes was IncI1 (4/5), and 2) the plasmid with the greatest number of AMR genes was an IncA/C2 (nAMR = 12; *aac(3)-VIa, aph(3")-Ib, aph(6)-Id, bla*CMY-35*, bla*CMY-44*, bla*CMY-80*, bla*CMY-90*, aadA13, floR, sul1, sul2* and *tet(C))* (Supplementary Table S9). Both observations were consistent with our prediction in the MiSeq data described above.

#### **4. Discussion**

By combining a collection of public and SalFoS data, we identified a high proportion of plasmid contigs in Illumina MiSeq WGS assemblies using two databases containing essential conserved plasmid elements (PlasmidFinder-DB and MOBs-DB) combined with known reference plasmids. One added value of this approach is an increase in the number of plasmid sequences identified. The databases can be regularly updated to include new ORIs and mobility genes for future analyses.

Using Plasmid-Gather and the combined strategies described here, IncFII and IncFIB were the most frequent ORIs predicted in *S. enterica*; this was presumably caused by the over-representation of *S.* Enteritidis and *S.* Typhimurium isolates in the dataset (Table 1). These two serotypes accounted for more than 23% of all the isolates and together carried 61% and 87%, respectively, of all the IncFII and IncFIB ORIs (Table 1). The distribution of ORIs among the *S. enterica* serotypes showed that IncFII, IncI1, and ColpVC were found in a broad range of serotypes, whereas IncA/C2 and IncX1 were restricted to a dozen serotypes (Table 1). Interestingly, Lindsey et al. (2009) demonstrated, by a cluster-based analysis using pulsed-field gel electrophoresis (PFGE) of 216 multidrug-resistant *S. enterica* isolates, that IncI1 is not clonally distributed, whereas IncA/C is commonly observed in the same serotypes. Hence, IncI1 is presumably much more mobile than IncA/C [42]. The IncI1 incompatibility group has often been associated with multidrug resistance and with the widespread distribution of beta-lactam resistance genes [28,29,41,43]. Likewise, we observed that plasmids with IncI1 are among the most important carriers of AMR genes (Supplementary Table S5 and Supplementary Table S9). Hence, one may assume that the mobility of IncI1 also leads to the spread of AMR genes in many *S. enterica* serotypes, whereas IncA/C2 seems more serotype-restricted but is associated with several AMR genes.

Large plasmids, representing different Inc ORIs, are known to integrate and carry transposons or integrons conferring AMR [3,14]. Several multi-resistance plasmids have been identified in *Salmonella*. Among them is the Inc group A/C (IncA/C and A/C2), consisting of 150 kb plasmids [31,33,38,42,44]. In our study, 49 of the 53 IncA/C2 plasmid scaffolds were smaller than 54 kb. This may be due to the limitations of plasmid assembly, as demonstrated in Figure 3. Three of the four IncA/C2 scaffolds assembled at the expected size encoded seven AMR genes each (*aph(3")-Ib*, *aph(6)-Ib*, *bla*CMY-2, *sul1*, *sul2*, *aad* or *aad7*, and *floR* or *aac(3)-IV*) (Supplementary Table S5). In addition, the IncA/C2 plasmid reconstructed by hybrid assembly from the S624 isolate possessed 12 AMR genes, a greater number than any other AMR-carrying plasmid obtained using hybrid assemblies (Supplementary Table S9). Furthermore, in the Illumina MiSeq data we noted that nearly all the *S. enterica* isolates with IncA/C2 (52/53) had one or more AMR genes encoded by scaffolds in their plasmidome. Isolates with IncA/C2 plasmids carried on average five AMR genes (up to 11 for the S628 isolate); IncA/C2 was the only ORI predicted in 23/53 genomes. Multidrug-resistant isolates have previously been linked to IncA/C [42].

Regarding the 52 AMR genes significantly enriched amongst *S. enterica* chromosomes, 34 were part of what we call "the core resistome", i.e., AMR genes found in more than 95% of *S. enterica* genomes (described in Supplementary Table S8). The predominant AMR mechanism in this core resistome is antibiotic efflux (26/34). Efflux transporters exist as either single-component (e.g., Tet) or multi-component pumps (e.g., the MdsABC complex) [45]. Multidrug efflux pumps are common resistance mechanisms among Gram-negative bacteria [45]. However, because several efflux pumps with wide substrate specificity can compensate for one another, it remains a challenge to identify which efflux pump confers resistance to a given drug. Among the other, less frequent AMR genes found in *S. enterica*, *aac(6')-Iaa* and *aac(6')-Iy*, which share 99% amino acid identity, were found in different serotypes (e.g., Typhimurium, Braenderup, and I 4,[5],12:i:- for *aac(6')-Iaa*; Enteritidis, Newport, and Heidelberg for *aac(6')-Iy*). Together, these two *N*-acetyltransferases (AAC) were encoded within 1738 chromosomes (99.3%), as shown in Supplementary Table S8. The gene *fosA7*, conferring resistance to fosfomycin, was predicted in 100% of the *S.* Heidelberg isolates (51/51). In line with these results, fosfomycin resistance has previously been found in *S.* Heidelberg isolated from broiler chickens [46]. Similarly, *fosA7* was found in most or all isolates of several other serotypes: in 96% of *S.* Agona (25/26), in 100% of *S.* Telelkebir (8/8), in 67% of *S.* Derby (8/12), and in 70% of *S.* Alachua (7/10). The remaining *fosA7* genes (*n* = 23) were distributed among 14 under-represented serotypes.
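The "core resistome" criterion used above (AMR genes found in more than 95% of genomes) can be sketched as a simple prevalence filter. The input layout (one set of predicted AMR genes per genome), the function name, and the toy gene labels are assumptions made for illustration; they do not reproduce the RGI-based workflow used in the study.

```python
from collections import Counter
from typing import Dict, Set


def core_resistome(
    genes_by_genome: Dict[str, Set[str]], min_fraction: float = 0.95
) -> Set[str]:
    """Return the AMR genes present in more than `min_fraction` of genomes."""
    n_genomes = len(genes_by_genome)
    if n_genomes == 0:
        return set()
    # Each genome contributes at most once per gene because genes are in sets.
    counts = Counter(gene for genes in genes_by_genome.values() for gene in genes)
    return {gene for gene, n in counts.items() if n / n_genomes > min_fraction}


# Toy data (made up): 20 genomes; geneA in all, geneB in 18, geneC in 2.
TOY_GENOMES: Dict[str, Set[str]] = {f"genome_{i:02d}": {"geneA"} for i in range(20)}
for i in range(18):
    TOY_GENOMES[f"genome_{i:02d}"].add("geneB")
for i in range(2):
    TOY_GENOMES[f"genome_{i:02d}"].add("geneC")

if __name__ == "__main__":
    print(sorted(core_resistome(TOY_GENOMES)))
    print(sorted(core_resistome(TOY_GENOMES, min_fraction=0.85)))
```

With the default threshold only `geneA` qualifies in the toy data; lowering `min_fraction` to 0.85 also admits `geneB` (18 of 20 genomes), which shows how sensitive the core set is to the chosen cut-off.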

Although considered chromosome-encoded, some efflux pumps have been identified on plasmids, such as the one encoded by the *tetA* gene conferring tetracycline resistance [47]. As shown in Table 2, 3 of the 13 AMR genes enriched in the *Salmonella* plasmidome encoded efflux pumps: two conferred resistance to tetracycline (*tet(C)* and *tet(D)*) and the third conferred resistance to chloramphenicol/florfenicol (*floR*). Tetracycline has been overused in human and veterinary medicine and as a growth promoter in animals [48,49]. The *tet(C)* and *tet(D)* AMR genes have often been reported on MGEs, such as genomic islands (GEIs), as part of conjugative elements, and on plasmids [50–54]. We observed a low abundance of *tet(C)* and *tet(D)* in isolates carrying plasmid scaffolds (7.9% and 2.2%, respectively) (Table 2). Previous studies have also shown the rare occurrence of these AMR genes in *Salmonella enterica* strains [55,56]. The serotypes of *S. enterica* from SalFoS harboring *tet* genes were mostly *S.* Newport (29.9%, *n* = 26) and *S.* Typhimurium (23%, *n* = 20) for *tet(C)* and *S.* Kentucky (41.7%, *n* = 10) for *tet(D)*. The *floR* gene is the only significantly enriched plasmid gene conferring resistance to chloramphenicol (Table 2). We also noticed that 92.3% of the plasmidomes encoding *floR* also carried an IncA/C2 ORI, suggesting that this ORI is strongly associated with the dissemination of *floR*. The connection between *floR* and IncA/C2 is also seen in the hybrid assemblies, in which *floR* was predicted only once, on an IncA/C2 plasmid (Supplementary Table S9). The *floR* gene has already been highlighted as the most common chloramphenicol resistance gene in chloramphenicol-resistant *Salmonella* strains [56,57].
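The association between *floR* and IncA/C2 reported above is a conditional proportion: among plasmidomes that carry a given AMR gene, the share that also carry a given ORI. A minimal sketch follows, assuming a hypothetical per-isolate record of predicted replicons and AMR genes; the record layout, names, and toy isolates are illustrative and are not taken from the study's data.

```python
from dataclasses import dataclass
from typing import FrozenSet, Iterable, Optional


@dataclass(frozen=True)
class Plasmidome:
    """Hypothetical summary of the plasmid scaffolds predicted in one isolate."""

    isolate: str
    replicons: FrozenSet[str]
    amr_genes: FrozenSet[str]


def cooccurrence(
    plasmidomes: Iterable[Plasmidome], gene: str, replicon: str
) -> Optional[float]:
    """Fraction of gene-carrying plasmidomes that also carry the replicon.

    Returns None when no plasmidome carries the gene.
    """
    carriers = [p for p in plasmidomes if gene in p.amr_genes]
    if not carriers:
        return None
    with_replicon = sum(1 for p in carriers if replicon in p.replicons)
    return with_replicon / len(carriers)


# Toy data (made up): three floR carriers, two of them with IncA/C2.
TOY_PLASMIDOMES = [
    Plasmidome("iso1", frozenset({"IncA/C2"}), frozenset({"floR", "sul2"})),
    Plasmidome("iso2", frozenset({"IncA/C2", "IncI1"}), frozenset({"floR"})),
    Plasmidome("iso3", frozenset({"IncFII"}), frozenset({"floR"})),
    Plasmidome("iso4", frozenset({"IncI1"}), frozenset({"tet(C)"})),
]

if __name__ == "__main__":
    print(cooccurrence(TOY_PLASMIDOMES, "floR", "IncA/C2"))
```

A high proportion of this kind indicates co-occurrence within the same plasmidome, not necessarily carriage on the same scaffold, which is why the hybrid assemblies are useful as a complementary check.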

Among the AMR genes detected in plasmids (Table 2), the most common resistance encoded was resistance to streptomycin (*strA* and *strB* resistance genes). In addition to its use in human medicine, streptomycin is used as a feed supplement for pigs and as a pesticide in agriculture [48,58]. Likely because it is extensively used in agriculture, resistance to streptomycin was frequently found in environmental and pathogenic isolates [59,60]. Moreover, aminoglycosides were the most prevalent drug class identified.

In this study, we were also interested in AMR genes that may complicate the treatment of salmonellosis and raise public health issues through the HGT of AMR genes. In 2014, of all the antimicrobials prescribed in human medicine to treat bacterial infections, the beta-lactam amoxicillin represented the largest proportion (26%), followed by azithromycin (9%) and ciprofloxacin (8%) [61]. In the same year, 5% of the non-typhoidal *Salmonella* isolates were resistant to amoxicillin, while no resistance to azithromycin or ciprofloxacin was observed; these last two antimicrobials were largely prescribed to treat severe and invasive salmonellosis [61–64]. Three AMR genes identified in the *S. enterica* plasmidome confer resistance to amoxicillin (*bla*TEM-1, *n* = 47), azithromycin (*mphA*, *n* = 4), or ciprofloxacin (*aac(6')-Ib-cr*, *n* = 1) (Supplementary Table S7). Fortunately, these three AMR genes are infrequent in the *S. enterica* plasmidome, except for *bla*TEM-1, and are not co-carried by the same isolate. However, once an AMR gene is stably maintained on a plasmid, resistance can quickly spread through bacterial communities, so these genes may need future monitoring. There is no clear pattern among the *S. enterica* isolates harboring *bla*TEM-1: these strains were isolated from 1981 to 2011 in five countries, from eight species, and represent 15 *Salmonella* serotypes. Furthermore, nine different ORIs were found to be associated with scaffolds carrying *bla*TEM-1.

#### **5. Conclusions**

Dealing with the increasing multi-resistance of *S. enterica* isolates remains a major worldwide challenge. Over the last decade, mobile genetic elements including plasmids have contributed to the spread of AMR genes vertically and horizontally between serotypes. *S. enterica*, one of the leading foodborne pathogens in industrialized countries, is known to carry plasmids encoding AMR and virulence. We present an integrated strategy to identify plasmid scaffolds using WGS. We combined two databases containing essential elements for plasmid DNA replication (PlasmidFinder-DB) and for plasmid mobility (MOBs-DB). In the current study, we highlight the great diversity of plasmids present in *S. enterica*, as reflected by ORI diversity. Plasmids were searched for in 1750 *S. enterica* genomes, representing 153 serotypes, and 61.3% of the genomes (1073 of 1750) had at least one plasmid carrying an ORI, thereby confirming plasmid prevalence in *S. enterica*. Although the databases from NCBI, EMBL, and DDBJ are overflowing with WGS data, these data are of limited value without metadata and without isolates available for future functional studies. The SalFoS *Salmonella* database was constructed for the public distribution of isolates for functional studies and serves as a convenient resource for expanding the metadata.

**Supplementary Materials:** The following are available online at http://www.mdpi.com/2076-2607/8/7/1016/s1, **Table S1:** Isolates, **Table S2:** Serotypes, **Table S3:** Put. Plasmids, **Table S4:** Replicons Count, **Table S5:** AMR Profiles, **Table S6:** AMR Prevalence, **Table S7:** AMR Post-Recov, **Table S8:** AMR Chromosome, **Table S9:** Hybrid Assemblies.

**Author Contributions:** J.-G.E.-R., J.H., J.J., L.F., I.K.-I., B.B. participated in the technology development, database construction, and experiments. S.T., D.M., E.F., E.B., F.D., G.A., K.S., M.W., R.M.S., J.T.W., R.S., S.B., S.G., L.D.G. furnished isolates and metadata. R.C.L., J.-G.E.-R., L.D.G., D.M., F.D. wrote and revised the manuscript. All authors have read and agreed to the published version of the manuscript.

**Funding:** This study was supported by funding from Genome Québec, Genome Canada and Genome British Columbia to R.C. Levesque, L.D. Goodridge, D. Malo, F. Daigle, J. Weadge, S. Bekal and S. Gruenheid.

**Acknowledgments:** We express our gratitude to the team of the IBIS Genomics analysis platform for excellent work and collaborations (http://www.ibis.ulaval.ca/en/services-2/genomic-analysis-platform/). We also would like to thank Alexander Gill from Health Canada, Bureau of Microbial Hazards, Ottawa, ON, for the *S. enterica* isolates provided for the study.

**Conflicts of Interest:** The authors declare no conflicts of interest.

#### **References**


© 2020 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).

MDPI St. Alban-Anlage 66 4052 Basel Switzerland Tel. +41 61 683 77 34 Fax +41 61 302 89 18 www.mdpi.com

*Microorganisms* Editorial Office E-mail: microorganisms@mdpi.com www.mdpi.com/journal/microorganisms
