### *5.5. DNA Extractions*

*S. adareanum*-associated microbial cell preparations were extracted with the Powerlyzer DNEasy extraction kit (Qiagen Inc.) following the manufacturer's instructions, starting with the addition of the lysis solution. Samples were processed in parallel in batches of twelve using the QiaVac 24 Plus Vacuum Manifold. Two rounds of lysis (5000 rpm for 60 s each, with incubation on ice) were performed on the MiniLys using the 0.1 mm glass beads supplied with the Powerlyzer kit. DNA concentrations of the final preparations were estimated using Quant-iT Picogreen dsDNA Assay Kit (Invitrogen) fluorescence detection on a Spectramax Gemini (Molecular Devices, Mountain View, CA, USA).

DNA from bacterioplankton samples was extracted following [58], and DNA from bacterial cultures was extracted using the DNeasy Blood and Tissue kit (Qiagen Inc.) following the manufacturer's instructions. All DNA concentrations were estimated using Picogreen.

### *5.6. The 16S rRNA Gene Sequencing*

Illumina tag sequencing for the *S. adareanum* microbiome (SaM) targeted the V3–V4 region of the 16S rRNA gene using prokaryote-targeted primers 341F (5′-CCTACGGGNBGCASCAG-3′ [59]) and 806R (5′-GGACTACHVGGGTWTCTAAT-3′ [60]; source: Integrated DNA Technologies). The first round of PCR amplified the V3–V4 region using HIFI HotStart Ready Mix (Kapa Biosystems, Wilmington, MA, USA). Cycling conditions were an initial denaturation at 95 °C for 3 min, 20 cycles of 95 °C for 30 s, 55 °C for 30 s and 72 °C for 30 s, and a final extension at 72 °C for 5 min before holding at 4 °C. The second round of PCR added Illumina-specific sequencing adapter sequences and unique indexes, permitting multiplexing, using the Nextera XT Index Kit v2 (Illumina, Inc., San Diego, CA, USA) and HIFI HotStart Ready Mix (Kapa Biosystems). Cycling conditions for the second round were an initial denaturation at 95 °C for 3 min, 8 cycles of 95 °C for 30 s, 55 °C for 30 s and 72 °C for 30 s, and a final extension at 72 °C for 5 min before holding at 4 °C. Amplicons were cleaned up using AMPure XP beads (Beckman Coulter, Indianapolis, IN, USA). A no-template PCR amplification control was processed in parallel; it did not show a band in the V3–V4 amplicon region and was sequenced for confirmation. A Qubit dsDNA HS Assay (ThermoFisher Scientific, Waltham, MA, USA) was used for DNA concentration estimates. The average size of the library was determined with the High-Sensitivity DNA Kit (Agilent), and the prepared libraries were quantified with the Library Quantification Kit, Illumina/Universal (KAPA Biosystems). The amplicon pool was sequenced on an Illumina MiSeq, generating paired-end 301 bp reads, which were demultiplexed using Illumina's bcl2fastq.
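Both primers contain IUPAC ambiguity codes (for example N, B and S in 341F), so each primer is in practice a pool of related oligonucleotides. The following minimal Python sketch shows how such a degenerate primer can be expanded into a regular expression and how many distinct sequences it represents. It is an illustration only and was not part of the pipeline described here; the function names `primer_to_regex` and `degeneracy` are hypothetical.

```python
import re
from math import prod

# Standard IUPAC nucleotide ambiguity codes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}

def primer_to_regex(primer):
    """Compile a degenerate primer into a regex matching every sequence it encodes."""
    parts = []
    for base in primer.upper():
        options = IUPAC[base]
        parts.append(options if len(options) == 1 else f"[{options}]")
    return re.compile("".join(parts))

def degeneracy(primer):
    """Number of distinct non-degenerate sequences represented by the primer."""
    return prod(len(IUPAC[base]) for base in primer.upper())

# Primer sequences as given in the text (5' to 3').
PRIMER_341F = "CCTACGGGNBGCASCAG"
PRIMER_806R = "GGACTACHVGGGTWTCTAAT"

if __name__ == "__main__":
    for name, seq in (("341F", PRIMER_341F), ("806R", PRIMER_806R)):
        print(name, primer_to_regex(seq).pattern, degeneracy(seq))
```

Note that 806R is a reverse primer, so matching it against a forward-strand read would require taking the reverse complement first; that step is omitted from this sketch.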

The bacterioplankton samples were sent to the Joint Genome Institute (JGI, Walnut Creek, CA, USA) for library preparation and paired-end (2 × 250 bp) Illumina MiSeq sequencing of the prokaryote-targeted variable region 4 (V4) using primers 515F (5′-GTGCCAGCMGCCGCGGTAA-3′) and 806R (5′-GGACTACHVGGGTWTCTAAT-3′ [60]; source: Integrated DNA Technologies). No-template PCR amplification controls were run as above, and were negative. Sequence processing at JGI included removal of PhiX contaminants and Illumina adapters.

The identity of the cultivated isolates was confirmed by 16S rRNA gene sequencing using Bact27F and Bact1492R primers, either by directly sequencing agarose gel-purified PCR products (Qiagen Inc.) or by TA cloning (Invitrogen, ThermoFisher Scientific) of PCR fragments into *E. coli* following the manufacturer's instructions. For the cloned libraries, three clones were sequenced per library and plasmids were purified (Qiagen Inc.) at the Nevada Genomics Center, where Sanger sequencing was conducted on an ABI3700 (Applied Biosystems, Life Technologies, Foster City, CA, USA). Sequences were trimmed and quality checked using Sequencer, v. 5.1.

### *5.7. Bioinformatic Analysis of 16S rRNA Gene tag Sequences*

We employed a QIIME2 pipeline [61] using the DADA2 plug-in [62] to de-noise the data and generate amplicon sequence variant (ASV) occurrence matrices for the SaM and bacterioplankton samples. The rigor of ASV determination was preferred in this instance because it improves the ability to uncover variability in the limited geographic study area, to detect patterns of host-specificity, and ultimately to identify the conserved, core members of the microbiome, at least one of which may be capable of PalA biosynthesis. Sequence data sets were initially imported into QIIME2 working format and the quality of the forward and reverse reads was checked. Default trimming parameters were applied: all bases after the first quality score of 2 were trimmed, the first 10 bases were trimmed, and reads shorter than 250 bases were discarded. Next, the DADA2 algorithm was used to de-noise the reads (correcting substitution and insertion/deletion errors and inferring sequence variants). After de-noising, reads were merged. The ASVs were constructed by grouping the unique full de-noised sequences (the equivalent of 100% OTUs, operational taxonomic units). The ASVs were further curated in the QIIME2-DADA2 pipeline by removing chimeras in each sample individually when they could be exactly reconstructed by combining a left segment and a right segment from two more abundant "parent" sequences. A pre-trained SILVA 132 99% 16S rRNA Naive Bayes classifier (https://data.qiime2.org/2019.1/common/silva-132-99-nb-classifier.qza) was used to perform the taxonomic classification. Compositions of the SaM and the bacterioplankton ASVs were summarized by proportion at different taxonomy levels, including genus, family, order, class, and phylum ranks. In order to retain all samples for diversity analysis, we set the lowest read frequency per sample (n = 62 samples at 19,003 reads; n = 63 samples at 9987 reads) as the rarefaction depth to normalize the data for differences in sequence count. ASVs assigned to Eukarya or with unassigned taxa (suspected contaminants) were removed from the final occurrence matrix, so the final matrix read counts were slightly uneven, with the lowest number of reads per sample being 9961.
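The rarefaction step above amounts to randomly subsampling each sample, without replacement, down to a common read depth. The Python sketch below illustrates that idea using only the standard library. It is not the QIIME2 implementation; the function name `rarefy`, the fixed seed, and the toy counts are assumptions made for illustration.

```python
import random

def rarefy(counts, depth, seed=0):
    """Subsample one sample's ASV counts to exactly `depth` reads, without replacement.

    counts: dict mapping ASV identifier -> read count for a single sample.
    Returns a new dict of subsampled counts, or None if the sample has fewer
    than `depth` reads (such a sample would be dropped at this depth).
    """
    total = sum(counts.values())
    if total < depth:
        return None
    # Expand counts into one entry per read, then draw `depth` reads at random.
    pool = [asv for asv, n in counts.items() for _ in range(n)]
    rng = random.Random(seed)
    rarefied = {}
    for asv in rng.sample(pool, depth):
        rarefied[asv] = rarefied.get(asv, 0) + 1
    return rarefied

if __name__ == "__main__":
    # Hypothetical toy sample; real samples here were rarefied to 19,003 or 9987 reads.
    sample = {"asv1": 50, "asv2": 30, "asv3": 20}
    print(rarefy(sample, 40))
```

Choosing the lowest per-sample read count as the depth, as described above, guarantees that no sample falls below the threshold and is dropped.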

The SaM ASVs were binned into three fractions: Core (highly persistent) if present in ≥ 80% of samples (Core80); Dynamic if present in 50%–79% of samples (Dynamic50); and Variable, the naturally fluctuating portion of the microbiome, if present in < 50% of the samples [2,3]. We used these conservative groupings of the core microbiome due to the low depth of sequencing in our study [3].
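The prevalence thresholds above can be expressed as a short classification routine. The Python sketch below assumes an ASV occurrence table stored as a dictionary of per-sample read counts, and treats an ASV as present in a sample when its count is greater than zero. The function name `bin_asvs_by_prevalence` and the example table are hypothetical.

```python
def bin_asvs_by_prevalence(counts_by_asv, core_pct=80, dynamic_pct=50):
    """Assign each ASV to Core80, Dynamic50 or Variable by its prevalence.

    counts_by_asv: dict mapping ASV identifier -> list of read counts,
    one entry per sample (all lists the same length).
    """
    bins = {}
    for asv, counts in counts_by_asv.items():
        n_samples = len(counts)
        if n_samples == 0:
            raise ValueError(f"no samples for {asv}")
        n_present = sum(1 for c in counts if c > 0)
        # Integer comparison avoids floating-point issues at the thresholds.
        if n_present * 100 >= core_pct * n_samples:
            bins[asv] = "Core80"
        elif n_present * 100 >= dynamic_pct * n_samples:
            bins[asv] = "Dynamic50"
        else:
            bins[asv] = "Variable"
    return bins

if __name__ == "__main__":
    # Hypothetical table with 10 samples.
    table = {
        "asv_a": [5, 3, 9, 1, 2, 7, 4, 6, 0, 0],  # present in 8 of 10 samples
        "asv_b": [1, 0, 2, 0, 3, 0, 4, 0, 5, 0],  # present in 5 of 10 samples
        "asv_c": [0, 0, 0, 9, 0, 0, 1, 0, 0, 2],  # present in 3 of 10 samples
    }
    print(bin_asvs_by_prevalence(table))
```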

ASV identities between the SaM, the *S. adareanum* bacterial isolates and the bacterioplankton data sets were compared using CD-HIT (cd-hit-est-2d; http://cd-hit.org). The larger SaM data set, which included 19,003 sequences per sample, was used for these comparisons to maximize the ability to identify matches. This set excludes one sample, Bon1b, which had half as many ASVs, but overall it includes nearly 200 additional sequences in the Variable fraction for comparison. ASVs with 100% and 97% identity between the pairwise comparisons were summarized in terms of their membership in the Core, Dynamic or Variable fractions of the SaM. Likewise, CD-HIT was used to dereplicate the isolate sequences at a level of 99% sequence identity, and then the dereplicated set was compared against the bacterioplankton iTag data set.

Phylogenetic analysis of the SaM ASVs, *S. adareanum* bacterial isolates, and 16S rRNA gene cloned sequences from Riesenfeld et al. [13] was conducted with respect to neighboring sequences identified in the Ribosomal Database Project and SILVA and an archaeal outgroup using MEGA v.7 [63]. Two maximum likelihood trees were constructed: the first with the Core80 ASVs, and the second with both Core80 and Dynamic50 ASVs. A total of 369 aligned positions were used in both trees. A total of 1000 bootstrap replicates were run in both instances; the percentage of trees (≥ 50%) in which the associated taxa clustered together is shown next to the branches.
