#### *4.1. Sampling Locations*

A 40-year-old, 33,000-square-foot building with an average potable water usage of 3.6 million gallons (13.6 million L) per year was used in this study. Water usage at any given location varies widely depending on the activity being conducted (floor washing, water storage tank cleanout, etc.) and on the facility cooling demand during warmer months. The building's potable water supply is derived from river water treated by coagulation, flocculation, and sedimentation; followed by sand, gravel, and granular activated carbon filtration; and then chlorination. Cold potable water samples were collected seasonally, every three months, from six locations throughout the building (Table 1, Figure S1). The total number of samples for each site was 30 (10 first draw bulk water, 10 second draw bulk water, and 10 biofilm samples) collected over a 28-month period, October 2016 to February 2019, except for site PVC-FC, where the total number of samples was 21 (seven each of first draw bulk water, second draw bulk water, and biofilm samples) collected over an 18-month period, August 2017 to February 2019. Sampling time points are denoted F for fall, W for winter, Sp for spring, and Su for summer, followed by the corresponding year. F samples were collected in October or November, and W, Sp, and Su samples were collected in February, May, and August, respectively.

Within this building, a semi-closed pipe loop distribution system simulator was fed with the chlorinated municipal drinking water, described above, and amended with ammonium hydroxide and sodium hypochlorite (Sigma Aldrich, St. Louis, MO, USA) to yield a 2 mg L<sup>−1</sup> monochloramine residual as previously described [88]. Average monochloramine and ammonia levels (± SD) during this sampling period were 1.25 ± 0.37 and 0.16 ± 0.07 ppm, respectively.

#### *4.2. Sample Collection and Processing*

For each sampling location, the first draw sample was taken immediately after turning the tap on, while the second draw was collected after 10 s of flushing (approximately 4 L), except for Fountain, where the second draw sample was collected after 30 s of flushing (approximately 2 L). The 10–30 s flush time was used to ensure collection of non-stagnant water that was still representative of water quality conditions within the BWS. Sampling took place early in the morning after an overnight stagnation period. Water samples were collected in sterile 1 L plastic bottles, and 1 mL of 10% w/v sodium thiosulfate was added to neutralize any disinfectant residual. An additional 100 mL was collected for water quality analysis as described below. Approximately 1 L of each bulk water sample was filtered through a 0.2 μm polyethersulfone membrane (Supor® Membrane, PALL Life Sciences, Nassau, NY, USA). Filters were placed into 11 mL of UV-light dechlorinated, 0.22 μm filtered drinking water (dfH2O) and vortexed at maximum speed for 1 min to resuspend the concentrated bulk water material. For biofilm collection, a sterile polyester-tipped applicator was used to swab an area of approximately 2 cm<sup>2</sup> inside the tap. The applicator was then placed in a 14-mL round-bottom tube containing 2 mL of dfH2O and vortexed vigorously for 1 min to resuspend the collected biofilm material.

Approximately 1 mL of each concentrated bulk water and biofilm suspension was analyzed for CFU, as described in Section 4.4, and the remaining volume was centrifuged at high speed (13,000 rcf, room temperature, 10 min; Eppendorf, Foster City, CA, USA). Pellets were resuspended in 200 μL of dfH2O and placed in a Lysing Matrix A tube (MP Biomedicals, Solon, OH, USA) along with the washed filter or biofilm swab for nucleic acid extraction as described below.

#### *4.3. Water Quality Analysis*

Bulk water samples were analyzed for pH, turbidity, temperature, disinfectant residual, and heterotrophic plate count (HPC). Free chlorine and total chlorine measurements were performed using the DPD colorimetric method (Powder Pillows; Hach USA) and monochloramine and free ammonia measurements were performed using the indophenol method (method 10200, Powder Pillows, free ammonia chlorinating solution; Hach USA). HPCs were enumerated by the spread plate method on Reasoner's 2A agar (R2A, Difco Laboratories, Detroit, MI, USA) following incubation at 28 °C for 7 d. The limit of detection (LOD) for bulk water samples was 1.0 log10 CFU 100 mL<sup>−1</sup> and 0.7 log10 CFU cm<sup>−2</sup> for biofilm samples.

#### *4.4. Legionella Enumeration and Presumptive Colony Analysis*

For colony forming unit (CFU) enumeration, undiluted and serially diluted suspensions were spread plated on buffered charcoal yeast extract (BCYE) agar plates (BD Diagnostics, Franklin Lakes, NJ, USA) and incubated for 4–6 days at 37 °C [44]. Presumptive *Legionella* colonies were counted, and a subset was isolated and confirmed as *Legionella* spp. or *L. pneumophila* via polymerase chain reaction (PCR) using the 16S rRNA gene assays described in Section 4.6. An aliquot of the processed bulk water and biofilm samples was also heat-treated (incubation in a 55 °C water bath for 30 min) before plating on BCYE agar plates to evaluate potential differences in *Legionella* recovery from this pretreatment method [44]. Although growth of non-*Legionella* bacteria was inhibited by heat treatment, there were no significant differences in *Legionella* CFU between unheated and heated samples (data not shown).

For most probable number (MPN) enumeration, Legiolert® (Idexx Laboratories, Westbrook, ME, USA) was used to analyze 10 mL of the unconcentrated bulk water samples and 0.5 mL of the resuspended biofilm samples for only the Su2018, F2018, and W2019 time points. To obtain pure isolates from the Legiolert® tray, positive wells were punctured using a 26-gauge needle and 50–1000 μL of the well contents was collected. A 20 μL aliquot of the sampled well was streaked onto a BCYE agar plate and incubated for 4–6 days at 37 °C.

Isolates identified as *L. pneumophila* by PCR were serotyped using the Oxoid™ *Legionella* Latex Agglutination Kit (ThermoFisher, Waltham, MA, USA), which allows for the separate identification of *L. pneumophila* serogroup 1 and serogroups 2–14 and detection of seven other *Legionella* species (*L. anisa*; *L. bozemanii* 1 and 2; *L. dumoffii*; *L. gormanii*; *L. jordanis*; *L. longbeachae* 1 and 2; and *L. micdadei*). Two *L. pneumophila* isolates identified as belonging to serogroups 2–14 via latex agglutination (Su2018 PVC-FC 1 and W2019 PVC-Loop Legiolert® 2) and one *Legionella* spp. PCR-positive isolate (F2018 PVC-FC) were sent to an external laboratory (EMSL Analytical Inc., Cinnaminson, NJ, USA) for further identification via indirect immunofluorescent antibody assay [44].

To account for zero values, 1 was added to all data points before conversion to the log10 scale (e.g., log10 (CFU + 1)). Calculations from CFU and molecular analyses were adjusted and expressed as units per mL or cm<sup>2</sup> for bulk water samples and biofilms, respectively. The LOD for bulk water samples was 1.0 log10 CFU 100 mL<sup>−1</sup> and 0.7 log10 CFU cm<sup>−2</sup> for biofilm samples.
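The back-calculation and log transform described above can be sketched as follows. This is a minimal illustration, not the authors' analysis script: the function names are hypothetical, and the plated volume (0.1 mL in the usage note) is an assumption, since the text does not state it. The suspension volumes (11 mL for bulk water, 2 mL for biofilm) and sampled amounts (about 1 L of water, about 2 cm<sup>2</sup> of surface) come from Section 4.2.

```python
import math


def cfu_per_unit(colonies, plated_ml, suspension_ml, sampled_amount):
    """Back-calculate CFU per unit of original sample.

    colonies       -- colonies counted on the plate
    plated_ml      -- volume of suspension spread on the plate (assumed value)
    suspension_ml  -- total volume the sample was resuspended in
    sampled_amount -- mL of water filtered, or cm^2 of surface swabbed
    """
    if plated_ml <= 0 or sampled_amount <= 0:
        raise ValueError("plated_ml and sampled_amount must be positive")
    # Scale the plate count up to the whole suspension ...
    cfu_in_suspension = colonies * suspension_ml / plated_ml
    # ... then express it per mL (bulk water) or per cm^2 (biofilm).
    return cfu_in_suspension / sampled_amount


def log10_plus_one(value):
    """log10(x + 1), so that a count of zero maps to 0 instead of -infinity."""
    return math.log10(value + 1)
```

For example, `cfu_per_unit(5, 0.1, 2.0, 2.0)` treats five colonies from a hypothetical 0.1 mL plated out of a 2 mL biofilm suspension collected from 2 cm<sup>2</sup>, and `log10_plus_one` of that result gives the value that would be plotted.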

#### *4.5. Isolation and Preparation of Total DNA*

DNA was extracted from bacterial cells using the MasterPure™ Complete DNA purification kit (Epicentre Biotechnologies Inc., Madison, WI, USA) according to the manufacturer's protocol and the Mini-Beadbeater-16 (Biospec Products, Bartlesville, OK, USA), where samples were processed twice for 30 s at 3450 oscillations min<sup>−1</sup>. The DNA pellet was resuspended in 100 μL of molecular grade water.

#### *4.6. Quantitative Polymerase Chain Reaction (qPCR)*

Biofilm and bulk water DNA samples were analyzed in duplicate using the Applied Biosystems QuantStudio 6 Flex Fast Real-Time PCR system (ThermoFisher, Waltham, MA, USA). A 10-fold dilution of each sample was also analyzed in duplicate to test for the presence of environmental qPCR inhibitors. The TaqMan qPCR assays for *Legionella* spp., *L. pneumophila*, and *Mycobacterium intracellulare* detection, targeting the 16S rRNA gene, were performed as previously described [63,89,90]. The TaqMan qPCR assay for *Acanthamoeba* spp. and the SYBR green qPCR assay for *Vermamoeba vermiformis* detection, targeting the 18S rRNA gene, were performed as previously described [91,92].
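One common way to read a 10-fold dilution control is to compare the observed shift in quantification cycle (Cq) with the shift expected from dilution alone. The sketch below illustrates that idea only; the decision rule, the function names, and the 1-cycle tolerance are assumptions and are not taken from this study.

```python
import math


def expected_delta_cq(dilution_factor=10.0, efficiency=1.0):
    """Cq shift expected from dilution alone.

    efficiency = 1.0 means the template doubles every cycle, giving
    log2(10), about 3.32 cycles, for a 10-fold dilution.
    """
    return math.log(dilution_factor) / math.log(1.0 + efficiency)


def looks_inhibited(cq_undiluted, cq_diluted, tolerance=1.0,
                    dilution_factor=10.0, efficiency=1.0):
    """Flag a sample whose diluted Cq rose by less than expected.

    When inhibitors are present, dilution relieves the inhibition, so the
    diluted reaction amplifies earlier than dilution alone would predict.
    The tolerance (in cycles) is an assumed, adjustable threshold.
    """
    observed = cq_diluted - cq_undiluted
    return observed < expected_delta_cq(dilution_factor, efficiency) - tolerance
```

With these assumed defaults, a sample moving from Cq 30.0 to 33.3 on dilution would not be flagged, while one moving from 34.0 to only 34.5 would be.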

The forward and reverse primers and probe sequences (5' to 3') and cycling parameters used in this study for the *Legionella* spp. qPCR assay, respectively, are 16S-LegF1c: TAG TGG AAT TTC CGG TGT A; 16S-LegR1c: CCA ACA GCT AGT TGA CAT C; 16S-LegP1: CGG CTA CCT GGC CTA ATA CTG A; and 50 °C for 2 min, 95 °C for 10 min, 40 cycles of 95 °C for 10 s and 50 °C for 30 s, and at 70 °C for 30 s [90]. The forward and reverse primers and probe sequences (5' to 3') and cycling parameters used in this study for the *L. pneumophila* qPCR assay, respectively, are LpneuF1: CGG AAT TAC TGG GCG TAA AGG; LpneuR1: GAG TCA ACC AGT ATT ATC TGA CCG T; LpneuP1: AAG CCC AGG AAT TTC ACA GAT AAC TTA ATC AAC CA; and 95 °C for 10 min, 40 cycles of 95 °C for 10 s, and at 60 °C for 1 min [63]. The forward and reverse primers and probe sequences (5' to 3') and cycling parameters used in this study for the *M. intracellulare* qPCR assay, respectively, are F: GGG TGA GTA ACA CGT GTG CAA; R: CCA CCT AAA GAC ATG CGA CTA AA; P: TGC ACT TCG GGA TAA GCC TGG GAA A; and 50 °C for 2 min, 95 °C for 10 min, 40 cycles of 95 °C for 15 s, and 60 °C for 1 min [89]. The forward and reverse primers and probe sequences (5' to 3') and cycling parameters used in this study for the *Acanthamoeba* spp. qPCR assay, respectively, are TaqAcF1: CGA CCA GCG ATT AGG AGA CG; TaqAcR1: CCG ACG CCA AGG ACG AC; TaqAcP1: TGA ATA CAA AAC ACC ACC ATC GGC GC; and 50 °C for 2 min, 95 °C for 10 min, followed by 40 cycles at 95 °C for 15 s and 60 °C for 1 min [92]. The forward and reverse primer sequences (5' to 3') and cycling parameters used in this study for the *V. vermiformis* qPCR assay, respectively, are Hv1227F: TTA CGA GGT CAG GAC ACT GT; Hv1728R: GAC CAT CCG GAG TTC TCG; and 95 °C for 3 min, followed by 40 cycles at 95 °C for 20 s, 56 °C for 30 s, and 72 °C for 40 s, and then 72 °C for 10 min [91].

For the *Legionella* spp. and *L. pneumophila* qPCR assays, standard curves were generated on each plate using a plasmid vector (pUCIDT-AMP; Integrated DNA Technologies, Inc., Coralville, IA, USA) containing a cloned 189-bp region of the *L. pneumophila* Philadelphia-1 16S rRNA gene (NCBI reference sequence NC\_002942.5, positions 609325 to 609513) that contains the targets for both of these qPCR assays. *M. intracellulare* standard curves were generated from serially diluted purified genomic DNA. Cell-based calibration curves were constructed for *Acanthamoeba* spp. and *V. vermiformis* by preparing 10-fold serial dilutions of DNA extracted from amoeba cell cultures of known densities.

Standards ranging from 1 to 10<sup>7</sup> gene copies (GC) for the *Legionella* spp. and *L. pneumophila* qPCR assays; 4 to 10<sup>4</sup> GC for the *M. intracellulare* qPCR assay; and 1 to 10<sup>5</sup> cell equivalents (CE) for the amoeba qPCR assays were generated and analyzed in triplicate along with duplicate no-template controls for each 96-well plate. Data were expressed as log10 gene copy or CE or GU per mL or cm<sup>2</sup>. The limits of detection for bulk water and biofilm samples were 1.6 log10 GC L<sup>−1</sup> and 1.3 log10 GC cm<sup>−2</sup> for the *Legionella* spp. and *L. pneumophila* assays; 1.3 log10 GC L<sup>−1</sup> and 0.9 log10 GC cm<sup>−2</sup> for the *M. intracellulare* assay; 1.4 log10 CE L<sup>−1</sup> and 1.0 log10 CE cm<sup>−2</sup> for the *Acanthamoeba* spp. assay; and 2.4 log10 CE L<sup>−1</sup> and 2.0 log10 CE cm<sup>−2</sup> for the *V. vermiformis* assay, respectively.
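Quantification from a plate-specific standard curve is conventionally done by regressing Cq on log10 of the standard quantity and inverting the fitted line for unknowns. The following is a generic sketch of that convention, assuming an ordinary least-squares fit; it is not the instrument software's algorithm, and the function names and the Cq values in the usage note are hypothetical.

```python
import math


def fit_standard_curve(quantities, cq_values):
    """Least-squares fit of Cq = slope * log10(quantity) + intercept."""
    xs = [math.log10(q) for q in quantities]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(cq_values) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, cq_values))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def amplification_efficiency(slope):
    """Efficiency implied by the slope; 1.0 corresponds to perfect doubling."""
    return 10 ** (-1.0 / slope) - 1.0


def quantity_from_cq(cq, slope, intercept):
    """Invert the standard curve to estimate gene copies or cell equivalents."""
    return 10 ** ((cq - intercept) / slope)
```

For instance, fitting standards of 1 to 10<sup>7</sup> GC against made-up Cq values that fall 3.32 cycles per decade yields a slope near −3.32 and an implied efficiency near 1.0; `quantity_from_cq` then converts a sample Cq into GC per reaction, which still has to be scaled by the extraction and sample volumes to reach units per mL or cm<sup>2</sup>.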

#### *4.7. Whole Genome Sequencing and Sequence Analyses*

Twenty-one bulk water isolates were chosen for whole genome sequencing. Total DNA from each strain was isolated as described in Section 4.5. DNA concentrations were estimated using the NanoDrop ND-1000 Spectrophotometer (NanoDrop Technologies, Inc., Wilmington, DE, USA). Total DNA was submitted for whole genome sequencing (Wright Labs LLC, Huntingdon, PA, USA), where genomic libraries were prepared using the Nextera XT Index Kit v2 Set A and sequenced on the HiSeq 4000 platform (Illumina Inc., San Diego, CA, USA) with a HiSeq 3000/4000 PE Cluster kit (2 × 150 bp). Prior to assembly, libraries were (i) cleaned of contaminants (adapters, phiX, artifacts, and human sequences), (ii) error corrected, (iii) normalized to ≤100× coverage, (iv) stripped of low-coverage (<6×) reads, and (v) filtered to a minimum read length of 100 nt. Reads were processed using the software package BBMap v37.90 (http://sourceforge.net/projects/bbmap) and assembled de novo using Unicycler v0.4.4 [93]. The Illumina reads are deposited in the National Center for Biotechnology Information (NCBI) Sequence Read Archive database under the BioProject accession number PRJNA558750.

Sequence-based typing (SBT) analysis was performed in silico with legsta and multi-locus sequence typing (mlst) as described previously [94]. The phylogenetic tree was constructed by combining the sequenced genomes from this study with a set of closely related genomes. Relatedness was determined by alignment similarity to a select subset of COG (Clusters of Orthologous Groups) domains. The phylogenetic tree was reconstructed using FastTree 2 [95] to determine the maximum likelihood phylogeny. Average nucleotide identity (ANI), an index of similarity between two genomes [96], was calculated using FastANI v1.3 (https://github.com/ParBLiSS/FastANI) [97]. ANI is defined as the mean nucleotide identity of orthologous gene pairs shared between two microbial genomes. No ANI output is reported for a genome pair if the ANI value is below 80%.
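The ANI definition and the 80% reporting cutoff can be illustrated with a toy calculation. This sketch only averages a list of per-pair identities that are assumed to be already computed; it does not reproduce FastANI's mapping procedure, and the function name and input values are hypothetical.

```python
def average_nucleotide_identity(pair_identities, min_ani=80.0):
    """Mean percent identity over shared orthologous pairs.

    Returns None when there are no shared pairs or when the mean falls
    below min_ani, mirroring the rule that no value is reported under 80%.
    """
    if not pair_identities:
        return None
    ani = sum(pair_identities) / len(pair_identities)
    return ani if ani >= min_ani else None
```

For example, `average_nucleotide_identity([99.0, 98.0, 97.0])` returns `98.0`, whereas a distantly related pair averaging below 80% returns `None` and would be absent from the output.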
