**2. Results**

### *2.1. Variation in Holobiont PalA Levels Across Ascidian Colonies and Collection Sites*

Typical procedures for natural product chemistry samples utilize bulk specimen collections for chemistry extraction (~30 individual ascidian lobes per lipophilic extraction in the case of *S. adareanum*). Prior to this study, variation in PalA content at the individual lobe or colony level (inset, Figure 1) was unknown. Our sampling design addressed within- and between-colony variation at a given sampling site, as well as between site variation. The sites were constrained to the region that was logistically accessible by Zodiac boat in the Anvers Island Archipelago off-shore of the United States Antarctic Program (USAP)-operated Palmer Station. *S. adareanum* colonies were sampled across seven dive sites (Figure 1), in which three lobes per multi-lobed colony were sampled from three colonies per dive site, totaling 63 lobes for PalA comparison. PalA stands out as the dominant peak in all LC–MS analyses of the dichloromethane:methanol soluble metabolite fraction of all samples analyzed (e.g., Figure 2a,b). The range in PalA levels varied at an order of magnitude of approximately 0.49–4.06 mg PalA x g<sup>−</sup><sup>1</sup> host dry weight across the 63 lobes surveyed. Our study design revealed lobe-to-lobe, intra-site colony level and some site-to-site differences in PalA levels (*p* < 0.05) in the archipelago (Figure 2c and Figure S1). Within a given colony, the lobe-to-lobe variation was often high and significantly different in 17 of 21 colonies surveyed. Significant differences in PalA levels between colonies were also observed within some sites (Janus Island (Jan), Bonaparte Point (Bon), Laggard Island (Lag), and Litchfield Island (Lit); see Figures 1 and 2c), in which at least one colony had significantly different levels compared to another colony or both. Despite this, we found differences between some of the sites. Namely, Bon was significantly lower than all six other sites. This site is the closest to the largest island, Anvers

Island, and Palmer Station. Samples from Killer Whale Rocks (Kil) and Lit sites were also found to have significantly higher PalA levels than Jan, Bon and Norsel Point (Nor), although these did not appear to have a particular spatial pattern or association with sample collection depth.

**Figure 2.** Palmerolide A (PalA) detection in *S. adareanum* holobionts. (**a**) Total ion chromatogram derived from sample Lit-1a. The PalA peak dominates the dichloromethane-methanol fraction of the *S. adareanum* extract. Inset: PalA structure. (**b**) Mass spectrum of PalA (sample Lit-1a) derived from peak at scan number 631 showing [M – H2O + H]+ (C33H47N2O6 calculated *m*/*z* 567.3429), [M + H]+ (C33H48N2O7 calculated *m*/*z* 585.3534) and [M + Na]+ (C33H47N2O7Na calculated *m*/*z* 607.3359). (**c**) Levels of PalA normalized to tissue dry weight detected by mass spectrometry in *S. adareanum* holobiont tissues (three lobes per colony) surveyed in three colonies per site across the Anvers Island Archipelago. Error represents individual lobe technical replication (standard deviation). Colonies with significant differences in PalA levels within a site (e.g., Jan-1:Jan-2) are indicted with triangles, in which the direction of the point indicates a significantly higher or lower colony. Filled triangles indicate significance (*p* < 0.05) in comparison to the other two colonies, and open triangles are those that were different from only one of the two colonies. Most colonies had significant lobe-to-lobe differences in PalA concentration, and some site-level differences were observed (Figure S1).

### *2.2. Characterization of Host-Associated Cultivated Bacteria*

Given our interest in identifying a PalA-producing microorganism [13], we executed a cultivation effort with *S. adareanum* homogenate on three different marine media formulations at 10 ◦C. The 16S

rRNA gene sequencing revealed seven unique isolates (of 16 brought into pure culture) at a level of < 99% sequence identity. All but one of the isolates was a ffiliated with the class Gammaproteobacteria, including five di fferent genera commonly isolated from marine environments (*Shewanella*, *Moritella*, *Photobacterium*, *Psychromonas* and *Pseudoalteromonas*—of which, nine were highly related). In many cases, we characterized their near neighbors as well-known marine psychrophiles, with many from polar habitats (Figure S2). The exception to this was the isolation of a cultivar associated with the Alphaproteobacteria class, *Pseudovibrio* sp. str. TunPSC04-5.I4, in which its two nearest neighbors were isolated from a temperate marine sponge and a temperate ascidian (associated with a di fferent Ascidiacea family). This result marks the first *Pseudovibrio* sp. cultivated from high latitudes. HPLC screening results of biomass from all sixteen isolates did not reveal the presence of PalA.

### *2.3. Synoicum Adareanum Microbiome (SaM)*

To understand the nature of conservation in the composition of the host-associated microbiome of *S. adareanum*, we identified the microbiome structure and diversity (based on the V3–V4 region of the 16S rRNA gene) with sections of the 63 samples used for holobiont PalA determinations. This e ffort resulted in 461 16S rRNA gene amplicon sequence variants (ASVs) distributed over 13 bacterial phyla (Table 1, Table S1). The core suite of microbes, defined as those present in > 80% of samples (referred to as the Core80), included 21 ASVs (six of which were present across all 63 samples). The Core80 ASVs represented the majority of the sequenced iTags (95% on average across all 63 samples), in which the first four ASVs dominated the sequence set (Figure 3). The ASVs present in 50%–79% of samples represented the Dynamic50 category and contained 14 ASVs, which represented only 3.3% of the data set. The remaining ASVs fell into the Variable fraction, which included 426 ASVs, representing 1.7% of the iTag sequences, ye<sup>t</sup> the majority of phylogenetic richness (Table 1). Comparative statistical analyses were conducted with the complete sample set, which was subsampled to the lowest number (9987) of iTags per sample. This procedure was limited by one sample (Bon1b) that underperformed in terms of iTag sequence yield. After the elimination of this sample from the analysis, the 62-sample set had 19,003 iTags per sample with a total of 493 ASVs, the same 21 sequences in the Core80 (with seven common to all 62 samples), the same 14 Dynamic50 sequences, and a total of 458 Variable ASVs.


**Table 1.** Relative proportions (average +/− standard deviations, n = 63) of phyla (and class for the Proteobacteria) across the di fferent microbiome fractions.

**Figure 3.** Heatmap of square root transformed amplicon sequence variant (ASV) occurrence data for the core microbiome. ASVs (ranked and numbered) are shown on the *y*-axis, and hierarchically clustered samples (63) are shown on the *x*-axis (site-based; square root transformed abundance data). Nodes with significant clusters are indicated from left to right (*p* < 0.05); order of clustering inside the node was not significant. The horizontal line drawn below SaM\_ASV6 demarcates those ASVs that were present in all 63 samples.

The Core80 microbiome category included a relatively high degree of phylogenetic novelty, with nearly one-third of the membership having low (< 95.0%) sequence identities to nearest neighbors. At the same time, the rest of the Core80 ASVs (14) matched mostly uncultivated marine taxa from polar seas or sediments—in many cases, with identities > 97%. Several of the nearest neighboring sequences originated from marine invertebrate microbiomes. The Core80 ASVs were distributed across five phyla (Proteobacteria, Actinobacteria, Nitrospirae, Verrucomicrobia and Bacteroidetes; Table 1), in which seven highly related ASVs were associated with the Gammaproteobacteria genus *Microbulbifer,* and dominated the ASV sequences (Figure S1). Two other Gammaproteobacteria ASVs in the Core80 were affiliated with the *Endozoicomonas* genus (*SaM\_*ASV7) and *Nitrosospira* (*SaM\_*ASV13). There were several Alphaproteobacteria with nearest neighbors falling in *Pseudovibrio, Hoeflea, Sulfitobacter,* and *Octadecabacter*, genera. A Rhodospirillales-related sequence was only distantly related to known taxa, with an 86.6% sequence identity to its nearest neighbor. A Deltaproteobacteria *Bdellovibrio*-related ASV was also part of the Core80 microbiome (*SaM\_*ASV9) that was also unique, with a sequence identity of 90.3% to its nearest neighbor. The Actinobacteria-affiliated core ASV (*SaM\_*ASV20) was related to an uncultivated Solirubrobacterales sequence. ASV11 most closely matched a sequence in the *Nitrospirae* family from Arctic marine sediments. There were two Verrucomicrobium-affiliated sequences represented in different families (*Puniceicoccaceae*, SaM\_ASV14, and *Opitutacae*, SaM\_ASV15). Lastly, there were two ASVs affiliated with the phylum Bacteroidetes: one related to polar strain, *Brumimicrobium glaciale* (SaM\_ASV19), and the other to a marine *Lutibacter* strain (SaM\_ASV12).

Five Dynamic50 ASVs were affiliated with the marine Bacteroidetes phylum (*Cryomorphaceae* and *Flavobacteriacae*-related), in addition to six ASVs associated with the class Gammaproteobacteria (including four additional *Microbulbifer*-related sequences). There were also two additional phyla, an ASV related to a sponge-affiliated Verrucomicrobium isolate, and a Planktomycetes-related ASV (Table 1, Figure S2). Several of these ASVs were most closely related to isolates from marine sediments.

Interestingly, five sequences identified from earlier cloning and sequencing efforts with this host-associated microbiome (Figure S2; [13]) matched sequences in the Core80 and Dynamic50 data sets. Phylogenetic comparisons also revealed that the SaM isolates were distinct from the Core80 and Dynamic50 except for the *Pseudovibrio* sp. TunPSC04-5.I4 isolate, which was present in both the Core80, and the clone and sequencing study.

The hierarchical clustering of Core80 ASVs (based on Bray–Curtis similarity) across all 63 samples did not reveal strong trends in site or colony specific patterns (Figure 3). There were eight instances where two of three lobes paired as closest neighbors, and four of eight primary clusters that included three lobes derived from the same colony. Sample Bon1b clustered apart from them all. In some cases, clusters could be attributed to specific ASVs. For example, cluster 2 (Figure 3) had the highest relative levels of SaM\_ASV15 (an *Opitutaceae* family-affiliated sequence), whereas cluster 3 (Figure 3) had the highest relative levels of ASV4 (affiliated with *Microbulbifer* spp.).

Overall, the community structures in the *S. adareanum* microbiomes across the 63 lobes surveyed had a high degree of similarity. Bray–Curtis pairwise similarity comparisons between lobes and colonies within each site were higher than 54% in all cases. When comparing the averages of pairwise similarity values within and between colonies, all sites, other than Lag, had higher similarity values within lobes in the same colony (ranging from 69.9% to 82.2%) compared to colonies within a site (66.5%–81.0%; Figure S3), although the differences were small, and only Bon and Jan were significantly different (*p* < 0.05). We performed a two-dimensional tmMDS analysis based on Bray–Curtis similarity to investigate the structure of the microbiome between sites (Figure 4a). The microbiomes sampled at Kil and Lit had the highest overall degree of clustering (> 75% similarity) while Kil, Lag, Lit, and Jan samples all clustered at a level of 65%. The microbiomes from DeLaca (Del) were the most dissimilar, which was supported by SIMPER analysis, in which two of the most abundant ASVs in the Core80 were lower than the average across other sites (SaM\_ASV1 and 3) while others in the Core80 (SaM\_ASV4, 15, and 17) were higher than the average across sites. We also performed a 2D tmMDS on the SaM fractions (Core80, Dynamic50, Variable) with and without permutational iterations, which showed similar trends although partitioning of community structures between sites was more evident with the permutations (Figure S4). Site-based clustering patterns shifted to some degree in the different SaM fractions. For the Core80 alone, Jan samples clustered apart from the others. For Dynamic50, both Jan and Kil were outliers. Finally, for the Variable fraction, Kil and Del samples clustered apart from the other sites. The Variable fraction was more homogeneous, obscuring any site-to-site variability, while the core displayed tighter data clouds that showed a modest level of dispersion.

**Figure 4.** Similarity relationships amongs<sup>t</sup> the *S. adareanum* microbiome samples in the Anvers Island Archipelago. (**a**) tmMDS of Bray–Curtis similarities of square root transformed ASV occurrence data representing the microbiome of the 63 *S. adareanum* lobe samples using the complete ASV occurrence profiles. Microbiome samples with significant levels of similarity are shown (see legend). (**b**) β-diversity across Anvers Island Archipelago sites represented by PERMDISP (9999 permutations) reveals differences between the highly persistent core, dynamic and variable portions of the *S. adareanum* microbiome (standard error shown). The degree of dispersion (variance) around the centroid changes significantly (*p* < 0.0001) for the different microbiome classifications. The lowest levels of dispersion are found in the core microbiome.

A PERMANOVA analysis investigated the drivers of variability in community structure, in which we tested the role of colony-level, site-level, and stochastic environmental variation. When evaluating the whole community (all sites and ASVs), site-to-site di fferences explained 25% of the variability in the microbiome (Table S2). Colony-to-colony di fferences explained 28%, while the remaining 47% of the variability was unexplained and is likely attributed to stochastic environmental variation. When the SaM fractions were analyzed separately, the most significant di fference (*p* < 0.05) was in the Variable fraction of the microbiome, in which the site-to-site di fferences explained only 19.2% of the variation and the residual (stochastic) level increased to 58.4% (Table S2). Conversely, PERMDISP (Figure 4b), representing dispersion about the centroid calculated for each site (a measure of β-diversity), revealed di fferences in the community structures in each SaM fraction across sites, as well as di fferences between the microbiome fractions. The Core80 had a low level of dispersion (PERMDISP average of 9.1, range: 6.8-12.1), compared to the Dynamic50 (average 28.6, range: 24.5- 38.6), which were moderately dispersed about the centroid, and were more variable with di fferences between the more apparent sites; Bon, Del, and Nor had higher values compared to the other four sites. The Variable fraction had high PERMDISP values (average 54.7, range: 51.6–57.6), representing high di fferences in β-diversity, in which values were relatively close across the seven sites.

### *2.4. ASV Co-Occurrences, and Relationship to PalA*

To further investigate the ascidian–microbiome ecology, we performed a network analysis of the ascidian holobiont with a particular emphasis on ASV co-occurrence and explored the relationship between ASV and PalA using direct and indirect measures. The co-occurrence network depicted a sparse graph with 102 nodes and 64 edges (average degree 1.255; diameter 15; average path length 5.81) that associated ASVs from similar SaM fractions (assortativity coe fficient of 0.41), confirming the robustness of the above discrimination. Upon inspection of the co-occurrence network, a dominant connected component contained 42.1% of ASVs, in which we identified three highly connected modules (via the application of an independent network analysis WGCNA, soft threshold = 9, referred to here as subsystems; Figure 5). The network also contained several smaller systems (2–6 nodes), and 30 singletons—only one of which was in the Core80 (Gammaproteobacteria class, *Endozoicomonas*-a ffiliated). The three subsystems included ASVs from the Core80, Dynamic50 and Variable SaM fractions, but were unevenly distributed. Mostly driven by Core80 SaM fractions, Subsystem 1 harbored most of the highly represented ASVs, including the *Microbulbifer*-related ASVs as well as the *Pseudovibrio* ASV. Subsystem 2, interconnecting Subsystems 1 and 3, is mostly driven by Variable fraction ASVs and includes lower relative abundant Core80 *Hoeflea* and the *Opitutaceae* ASVs, as well as several *Bacteroidetes* taxa. Finally, Subsystem 3 was smaller. It included several diverse taxa dominated by Dynamic50 fraction, including *Lentimonas*, and *Cryomorphaceae*-a ffiliated ASVs from the Core80. The overall role of Variable ASVs in the network is surprising, as they appear to be critical nodes linking the subsystems via 13 edges that lie between Subsystems 2 and 3. Another, perhaps noteworthy smaller connected component is in the lower left of the graph, in which three Core80 ASVs, the chemoautotrophic ammonia and nitrite oxidizers, *Nitrosomonas* and *Nitrospira*, and an uncharacterized Rhodospirillales-related ASV, were linked with a Variable ASV.

To address the first-order question as to whether there was a relationship between the PalA concentration levels detected in the LC–MS analysis and the semi-quantitative ASV occurrences of *S. adareanum* ASVs, we performed three complementary analyses: correlation analysis, weighted co-occurrence networks analysis (WGCNA) and niche robust optima with PalA concentration as a variable. Pearson correlations between ASV occurrences and PalA concentrations ranged from −0.33 to 0.33 at the highest (24 of which were significant, ≤ 0.05; although none of those with a significant relationship were part of the core microbiome and were present in < 50% of the samples with the highest occurrence of 24 sequences), suggesting little relationship at the gross level of dry weight-normalized PalA and ASV relative abundance. This indicated that relative abundance of ASV is not a good predictor of PalA. A complementary WGCNA result showed no significant association between subsystem co-occurrence topology and PalA (correlations = −0.23; 0.033; −0.17 for subsystems 1, 2 and 3 respectively), pointing to the lack of relationship between microbial community structure and PalA. Finally, another aspect of the relationship between ASV occurrence and PalA levels was explored using the robust optimum method, which estimates the ecological optimum and tolerance range for ASVs about PalA. In this case, we calculated the PalA niche optimum for each ASV and ranked them based on the median (Figure S5). Core80 ASVs showed a consistent PalA niche range. Furthermore, the median optima values of the Dynamic50 ASVs lie, for the most part, in lower or higher PalA optima compared to the Core80, with a substantial niche overlap between Core80. The Variable ASVs collectively lie at the lower and higher extremes of the optimum and tolerance range. Altogether, these complementary analyses advocated for not considering an individual nor a community effect on PalA, but rather, acclimation to the high levels of PalA observed (Figure 2c) that likely rely on unknown metabolic or environmental controls.

**Figure 5.** ASV co-occurrence network. The largest connected component of the co-occurrence network (seeded with ASVs found in at least 5 samples, 102 in total) identified three subsystems. Node colors represent the microbiome fractions (Core80, green; Dynamic50, blue; Variable, pink). Taxonomic identities of the ASVs are shown next to the nodes, with the phylum\_highest level taxon identified shown.

### *2.5. Culture Collection: Microbiome and Bacterioplankton Comparisons*

To address whether the composition of the *Synoicum adareanum*-associated bacterial cultivars was also present in the SaM or in the free-living bacterioplankton (< 2.5-μm fraction), we considered the overlap in membership between the isolate 16S rRNA gene sequences and these other two data sets. Representation of isolates in the 16S rRNA gene ASV data set was estimated by comparing the two sequence data sets, albeit di fferent ascidian samples were used for the culture collection and the *S. adareanum* survey. Excepting *Pseudovibrio* sp. TunPSC04-5I4 that was present in the Core80 with a 100% sequence match, two other isolates also had 100% matches to sequences in the Variable SaM (BONS.1.10.24 and BOMB.9.10.19). Three other isolate 16S rRNA sequences (BOMB.3.2.14, BOMB.9.10.16, BOMB.9.10.21) were found to match sequences in the Variable microbiome at a level of 97% or higher. Only the *Pseudoalteromonas*-related isolate and the *Shewanella*-related isolate sequences were not detected with relatives at a level of at least 97% identity to the SaM ASVs; which could be explained by undersampling. The bacterioplankton composition was dominated by Gammaproteobacteria ASVs (47.35% of all ASVs; Table S3), in which six of the isolates (BOMB.9.10.21, BONS.1.10.24, BOMB.9.10.16, BOMB.3.2.20, BOMB.3.2.14, BONSW.4.10.32) matched sequences in the bacterioplankton data set at > 99.2% identity; even at a level of 95% sequence identity, the remaining ten isolates did not match sequences in the plankton, including the *Pseudovibrio* sp. str. TunPSC04-5I4 isolate.

### *2.6. Microbiome: Bacterioplankton Comparisons*

Although at a high taxonomic level Proteobacteria and Bacteroidetes phyla dominated the microbiome and bacterioplankton (Table S3), the relative proportions varied and the taxa represented were quite di fferent. Membership between SaM and the bacterioplankton data set indicated a low-level overlap at 100% identity, with 39 of 604 perfectly matched ASVs. At 100% ASV sequence identity, the results indicate a single, Core80 SaM ASV was a perfect match with the bacterioplankton data set—the *Microbulbifer*-associated sequence that is the most abundant across all 63 SaM data sets. Interestingly, this sequence was only identified in one bacterioplankton sample (IPY-225-9) at a low occurrence (14 of > 1.18 million tags distributed across 604 ASVs). There were three (of 14) Dynamic50 ASVs that were perfect matches with the bacterioplankton ASVs. These were a ffiliated with a poorly classified *Bacteroidetes* family *Flavobacteraceae* ASV (77% of SaM samples), a Gammaproteobacteria-associated *Sinobacterium* (73% of SaM samples), and a second Gammaproteobacteria-associated ASV that is associated with *Candidatus* Tenderia (71% of SaM samples). The remaining 35 perfect match ASVs between the two data sets were classified as part of the Variable microbiome. These sequences fell across four phyla and nine classes—13 of which were well distributed across the Bacterioplankton data set samples (> 50%).

At a level of 97% ASV sequence identity, there were three additional matches between the Bacterioplankton data set and the Core80. These included Alphaproteobacteria-related *Hoeflea* and *Halocynthibacter*-related sequences, and a *Nitrospira*-related sequence. There were also two other Dynamic50-related ASVs: these were both related to unclassified *Flavobacteriaceae*. The rest (79 ASVs) of the matches at > 97% were a ffiliated with the Variable fraction of the microbiome.
