**1. Introduction**

The endolithic microbiome of intertidal carbonate rocks has been the subject of intensive study since the 1800s [1,2], with a main focus on the characterization of bioerosive agents within these communities. The agents, boring organisms referred to as euendoliths, excavate the rock substrate and create pore spaces for their own growth. Le Campion-Alsumard and colleagues [3,4] first examined succession and colonization by microscopic inspection in order to better understand the ecological principles that drive euendolith community formation. As concern for coral destruction rose in the 1990s, others [5–11] applied the same procedures to understand these dynamics and mitigate bioerosion in reef ecosystems. These studies on porous, biogenic carbonates from coral skeletons showed swift initial colonization by euendolithic algae, with successional changes occurring within months and communities reaching maturity after a year. Kiene [10], Gektidis [9] and Chacón et al. [12] examined hard mineral carbonates as well, finding that euendolithic cyanobacteria, not algae, were the dominant boring organisms there and that hard substrates led to more diverse cyanobacterial populations than those of corals.

Although early research was informative in identifying and characterizing major euendolithic players, the use of morphological characterization alone has been found to underestimate microbial diversity in endolithic cyanobacterial communities [12,13]. Indeed, in the case of marine carbonate

communities, high-throughput amplicon sequencing has demonstrated that morphology-based studies can underrepresent cyanobacterial diversity estimates by factors of 10 to 100 [12,14,15]. Early research identified three major morphotypical groups of euendolithic cyanobacteria. One is represented by the thin, filamentous, *Leptolyngbya*-like organisms most commonly assigned to *Plectonema terebrans* (or *Leptolyngbya terebrans*), which are typically one of the most abundant euendolith morphotypes, at times exceeding 80% of total euendolithic biovolume [6]. Unfortunately, no 16S rRNA gene sequence of *P. terebrans* has been obtained from cultures, making it impossible to identify it with certainty in molecular surveys. Environmental sequences best matching *Halomicronema* and *Leptolyngbya* species have been tentatively suggested to represent the elusive *P. terebrans* [14]. The second group corresponds to the species *Mastigocoleus testarum,* which is characterized by a complex, true-branching filamentous morphology, making it easily identifiable from microscopic examination. It has been recently redescribed on the basis of a polyphasic approach based on strain BC008, showing congruency between molecular and traditional approaches [16], has served as a model to elucidate the physiological mechanism of boring [16–18], and is the first euendolith whose genome has been fully sequenced [19]. *Mastigocoleus testarum* is one of the earliest colonizers in soft carbonates, being found as early as one week after initial exposure [3,11,20]. A third, diverse group includes several members of the order Pleurocapsales in the genera *Hyella, Solentia*, *Hormathonema*, and the recently described *Candidatus* Pleuronema. Members of the Pleurocapsales typically act as pioneer borers but can bore only to shallow depths and are easily preyed upon by grazers, leading to low abundance in mature communities [6,7,11,20].

Through comprehensive, high-throughput molecular surveys, we recently found a diverse phototrophic community in the endolithic habitat of coastal hard carbonates, which included four distinct anoxygenic phototrophic bacterial (APB) groups. The most dominant APBs were members of the Chloroflexales (green nonsulfur bacteria) [21–23] and *Erythrobacter* (aerobic anoxygenic phototrophs) [24,25]. APBs could comprise upwards of 80% of the total phototroph community [15] in some samples. Our findings broadened the known habitats for APBs and suggested that some microscopic characterizations of endolithic thin filamentous organisms (*Plectonema*-like) may have in fact been APBs. Thus, APBs could be euendolithic in nature, potentially upending the long established understanding of endolith ecology by broadening the pool of possible pioneer organisms and boring mechanisms.

Therefore, to provide new molecular insights into euendolith colonization and succession and to attempt to answer questions that arose from our prior work, we set up a colonization experiment in the intertidal zone of Playa Ulvero, Isla de Mona, Puerto Rico. We anchored nonporous travertine (a dense, compact form of calcium carbonate) tiles onto beach rock 5 m from shore and collected samples every 3 months over a 9-month period. Our study had four specific aims: (1) to elucidate APB colonization timing to identify if APBs are pioneer organisms with the ability to bore; (2) to examine cyanobacterial euendolith colonization and succession using molecular methods; (3) to measure the colonization dynamics of the *Leptolyngbya*-like (*Plectonema*), *Mastigocoleus*-like, and Pleurocapsalean euendolithic cyanobacterial groups; and (4) to compare colonization progress to previously described steady-state climax communities of similar geological composition and geographic location in order to gauge community maturity

#### **2. Materials and Methods**

#### *2.1. Tile Placement and Sample Collection*

Commercial, 4 inches wide, 1.5 inches thick travertine square tiles, were anchored onto intertidal beach rock some 5 m from the high-tide shoreline at Playa Uvero on Isla de Mona, Puerto Rico, (18◦03-36.2" N 67◦54-21.8" W) (Figure 1) after having received permits from the Departamento de Recursos Naturales y Ambientales (Commonwealth of Puerto Rico). Tiles were fastened to the beach rock using a combination of Red Head 5" × 3/8" 316 Stainless Steel Wedge Anchors and JB Weld

Waterweld putty. Three tiles were sacrificially collected every three months, air-dried and shipped, reaching the laboratory in less than a week, and then stored on arrival at −80 ◦C until analysis.

**Figure 1.** Experimental tile placement. (**a**) Location near Playa Uvero (yellow star) on Isla de Mona, Puerto Rico. (**b**) Anchoring on a stretch of intertidal beach rock (yellow box) as seen at low tide. (**<sup>c</sup>**–**f**) Aspect of virgin (**c**) and exposed tiles harvested after harvested after 3 (**d**), 6 (**e**), and 9 months (**f**). Part of the growth observable in the pictures was epilithic in nature.

#### *2.2. Endolithic Community DNA Extraction*

Tiles were vigorously brushed with sterile toothbrushes and sterilized seawater to remove epilithic biomass. To ensure a consistent sampling effort, each tile was sampled four times in 2 cm by 2 cm squares, 1 cm from the edge of the tile (sampling is shown in Figure 2d–f). Sampled fragments were ground in sterile mortars following the protocol described in Wade and Garcia-Pichel 2003 [26], and 0.5 g of powered rock was used as the input material for a MoBio PowerPlant Pro DNA extraction kit (Mo Bio Laboratories, Inc., Carlsbad, CA, USA) following the protocol provided, except that, before the first lysis step, the contents of the bead tubes were homogenized horizontally at 2200 rpm for 10 min, and, additionally, subjected to seven freeze–thaw cycles using liquid nitrogen to ensure full disruption of bacterial membranes.

#### *2.3. Quantitative PCR of 16S rRNA Gene Content*

In order to quantify the number of 16S rRNA gene copies in the extracts, quantitative real-time PCR was conducted using universal V3 16S rRNA gene primers 338F (5'- ACTCCTACGGGAGGCAGCAG-3') and 518R (5'-GTATTACCG CGGCTGCTGG-3'). PCRs were performed in triplicate using Sso Fast mix (Bio-Rad, Hercules, CA, USA) following Couradeau et al. [27]. Following quantification, triplicate 16S rRNA gene counts were averaged and then converted to counts per square meter using the surface area of the tile analyzed. The total counts per square meter were then multiplied by the associated proportional abundance of any clade of interest in order to obtain absolute population size for that clade. Separate biological replicates (i.e., tiles) were then averaged.

#### *2.4. 16S rRNA Gene Library Preparation and Illumina Sequencing*

Amplicon sequencing of the V3–V4 variable region of the 16S rRNA gene was performed using the universal bacterial PCR primers 341F (5'-CCTACGGGNGGCWGCAG) [28] and 806R (5'-GGACTACVSGGGTATCTAAT) [29]. PCR amplifications were done in triplicate, then pooled and quantified using Quant-iT™ PicoGreen®dsDNA Assay Kit (Invitrogen). Two hundred forty nanograms of DNA per sample were pooled and then cleaned using QIA quick PCR purification kit (QIAGEN). The PCR pool was quantified by Illumina library Quantification Kit ABI Prism®(Kapa Biosystems). DNA pool was determined and diluted to a final concentration of 4 nM then denatured

diluted to a final concentration of 4 pM with a 30% of PhiX. Finally, the DNA library was loaded in the MiSeq Illumina sequencer using the chemistry version 3 (2 × 300 paired-end) and following the guidelines of the manufacturer.

#### *2.5. Data Analysis Pipeline*

Raw sequences were processed using the QIIME2 2018.2 analysis pipeline [30]. Demultiplexed sequences were imported into QIIME2 and processed using the DADA2 [31] denoised-paired plugin with the following parameters: trunc\_len\_f:280, trunc\_len\_r:235, trim\_left\_f:20, trim\_left\_r:25, and max\_ee:8, so as to obtain amplicon sequence variants (ASVs). After resolving ASVs, any sequences found in the control tile extracts (uncolonized tiles) were filtered from the final feature table. Sequencing depth of the experimental tiles ranged from 21,897 to 180,168 (post filtering), and alpha-rarefaction analysis indicated that all samples had reached convergence (Figure S1). In order to conduct diversity analysis, representative sequences were aligned using MAFFT7 [32], and a phylogenetic tree was generated using FastTree [33]. Diversity metrics were calculated using the core-metrics-phylogenetic plugin, including Weighted and Unweighted UniFrac metrics [34]. ASVs were initially classified using the classify-sklearn plugin, (Available online: https://github.com/qiime2/q2-feature-classifier) with a Green Genes 13\_8 [35] based classifier. The feature table was then exported, and di fferential abundance analysis was conducted using the QIIME1 [36] plugin *di*ff*erential\_abundance.py* and the DESeq2 algorithm [37]. PCoAs were generated using the vegan package [38], and graphics were created using R [39] and the ggplot2 package [40]. Statistical analyses were conducted either using R (Student's *t*-test) or within Qiime2 (Kruskal–Wallis, PERMANOVA).

#### *2.6. Cyanobacterial ASV Classification*

To identify key euendolithic cyanobacterial clades, the representative sequence output from QIIME2 was filtered to only include cyanobacterial sequences (plastids were removed). These comprised at least 95% of the total number of reads within each sample. Next, the sequences were aligned to the Cydrasil reference alignment [41] using PaPaRa [42], and placed into the Cydrasil reference tree using the Evolutionary Placement Algorithm (based on the maximum-likelihood model) feature of RAxML8 [43]. The output was visualized using the ITOL3 website [44]. An ASV was considered a likely euendolith if it was placed on a branch containing only known euendolithic cyanobacteria with a >70% certainty. Biomass was calculated for each tile by multiplying the total relative abundance of the cluster by the total areal concentration of 16S rRNA genes in that sample. Then, biological replicates for each time point were averaged and graphed using R and ggplot2.

#### *2.7. Steady-State Climax Community Comparisons*

Three natural substrate samples from Couradeau et al. [14] and Roush et al. [15] (samples denoted as H001-H003 in SRA) were used as proxies for steady-state climax communities for comparison of colonization progress. The samples were chosen based upon their geographic proximity to the tile placement location and their similar geological composition (calcite). The raw sequencing data was processed using the same parameters and pipeline as described above. Pigment analysis was conducted in the same manner as for the tiles.

#### *2.8. Unknown Boring Cluster (UBC) Phylogenetic Tree*

In order to assess the nearest neighbors of the unknown boring cluster, a multiple sequence alignment (MSA) was generated using SSU-Align [45]. The MSA was comprised of the three most di fferentially abundant ASVs identified from DESeq2 and EPA placement analysis, the nearest sequences from Cydrasil, and the top seven most similar NCBI nr database sequences identified using BLAST [46]. The resulting alignment of 398 sequences was then used as input into RAxML8 [47] to generate a phylogenetic tree using the rapid-bootstrap algorithm with 1000 bootstraps and the GTR GAMMA model. The remaining ASVs were then checked using BLAST and the nr database for proximity to the resulting clade.

#### *2.9. Pigment Extraction and Analysis*

In order to extract lipid-soluble pigments, the remaining powdered sample (the same samples used for DNA extraction) was suspended in 7:2 acetone:methanol solvent and sonicated twice for 30 s in an ice bath in the dark. Extracts were centrifuged at 2100× *g* for 10 min, decanted, and the supernatant filtered through a 0.22 μm nylon filter. These steps were repeated and the supernatants were pooled until the extract was devoid of color. The resulting extract was then evaporated under a N2 stream in the dark and resuspended in 200 μL of HPLC-grade acetone. HPLC analysis was conducted on a Waters Alliance e2695 HPLC with an inline Waters 2998 photodiode array detector, using a protocol adapted from Frigaard et al. [48] for use on a CORTECS C18 4.6 mm × 150 mm (90 Å pore size, 2.7 μm particles) column. Separation was performed as follows: the initial solvent gradient composed of ethyl-acetate:methanol:acetonitrile:water in a 21:23.9:47.6:7.5 ratio by volume and linearly changed to 30:20:50:0 ratio by volume in 13.43 min, held for 3.87 min, and then immediately returned to the initial ratio (21:23.9:47.6:7.5 by volume) and held for 5.7 min. Total runtime per sample was 23 min, at a flow rate of 2 mL min–<sup>1</sup> and column temperature of 30 ◦C. Pigment identification was done by comparison of retention time and spectrum against standards of Chl *a* and BChl *a* obtained from Sigma Aldrich. All other pigments were identified from known spectra [49]. Injected pigment mass was calculated from the chromatogram using the equation m = FA (*<sup>e</sup>*m d)−1, where m is the mass of BChl or Chl in milligrams, *F* is the solvent flow rate (1 mL min−1), A is the peak area (in Au), *e*m is the extinction coefficient in L mg<sup>−</sup><sup>1</sup> cm<sup>−</sup>1, and d is the path length of the PDA detector (1 cm). Extinction coefficients were taken from Ley et al. 2006 [50].
