### *3.2. Total Genomic DNA Extraction*

The genomic DNA of *Leptothoe kymatousa* TAU-MAC 1615 was extracted using a DNA extraction kit (E.Z.N.A.® SP Plant DNA Mini Kit Protocol—Fresh/Frozen Samples, Omega Bio-Tek, Norcross, GA, USA). Cultures (50 mL) were harvested by centrifugation at 7000× *g* for 10 min, and the pellets were transferred to microcentrifuge tubes. A total of 200 μL of glass beads (two different sizes: 425 to 600 μm and 710 to 1180 μm; Sigma-Aldrich, St. Louis, MO, USA) and SP buffer were added, and the cells were disrupted mechanically with a FastPrep-24 homogenizer (MP Biomedicals, Irvine, CA, USA) at 6.5 m/s for 30 s (2 cycles). DNA was then extracted from the lysate as described in the manufacturer's protocol.

Cultures (50 mL) of *Le. spongobia* TAU-MAC 1115 were harvested by centrifugation at 7000× *g* for 10 min, washed twice with washing buffer (50 mM Tris-HCl, 100 mM EDTA, 100 mM NaCl), and transferred to microcentrifuge tubes. After centrifugation (at 7000× *g* for 4 min), the supernatant was discarded, glass beads (two different sizes: 425 to 600 μm and 710 to 1180 μm; Sigma-Aldrich, St. Louis, MO, USA) were added, and the cells were frozen at −80 °C. The sample was thawed at 64 °C, and 800 μL of GOS buffer (100 mM Tris-HCl (pH 8), 1.5% SDS, 10 mM EDTA, 1% deoxycholate, 1% Igepal-CA630, 5 mM thiourea, 10 mM dithiothreitol) [63] was added. The cells were disrupted using the FastPrep homogenizer at 5 m/s for 30 s (2 cycles). The rest of the extraction procedure was performed as previously described in detail [63].

The purity, concentration, and quality of the DNA were determined using a Nanodrop ND-1000 Spectrophotometer (Nanodrop Technologies, Wilmington, DE, USA), gel electrophoresis, and an Agilent TapeStation (Agilent Technologies, Lexington, MA, USA).

### *3.3. Genome Sequencing and Assembly of Sponge-Associated Cyanobacteria*

High-molecular-weight DNA was subjected to library construction (Illumina TruSeq PCR-free 150 bp) and sequenced on the Illumina HiSeq 2500 platform with a paired-end 100-cycle run (Macrogen Europe, Amsterdam, the Netherlands). The quality of the raw data was initially assessed using FastQC v0.10.1 [64]. Prinseq [65] was used for quality filtering, and genome assembly was performed with SPAdes 3.5.0 [66], followed by scaffolding and gap-closing with Platanus 1.2.1 [67]. Scaffolds from culture contaminants were identified by Kraken 1.0 [68] and removed using ZEUSS 1.0.2 [69]. Genome assembly statistics were obtained using Assemblathon 2 [70]. Completeness and contamination of the genomes were assessed using CheckM v1.1.3 [71].
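As an illustration of the kind of summary reported among the assembly statistics, the following minimal Python sketch computes scaffold count, total length, longest scaffold, and N50 from a list of scaffold lengths. It is not the Assemblathon 2 script itself; the function names and the example lengths are hypothetical and chosen only for illustration.

```python
def n50(lengths):
    """Return the N50: the length of the scaffold at which the cumulative
    sum of scaffolds, sorted from longest to shortest, first reaches at
    least half of the total assembly length."""
    if not lengths:
        raise ValueError("no scaffold lengths given")
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length


def assembly_stats(lengths):
    """Collect a few common assembly summary values in a dictionary."""
    return {
        "scaffolds": len(lengths),
        "total_bp": sum(lengths),
        "longest_bp": max(lengths),
        "n50_bp": n50(lengths),
    }


# Hypothetical scaffold lengths (bp), for illustration only.
example_lengths = [100, 200, 300, 400, 500]
stats = assembly_stats(example_lengths)
print(stats)
```

In practice the lengths would be read from the scaffold FASTA file, and a dedicated tool should be preferred for reported values.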

### *3.4. Phylogenomic and Phylogenetic Analysis*

An alignment of 120 bacterial single-copy conserved marker genes was generated with the Genome Taxonomy Database toolkit GTDB-Tk [72] from 90 cyanobacterial genomes, including the two newly sequenced sponge-associated *Leptothoe* genomes as well as 35 genomes registered in GenBank as *Leptolyngbya* or Leptolyngbyaceae and representative taxa of Nostocales, Oscillatoriales, and Chroococcales. A maximum-likelihood phylogenomic tree was constructed with RAxML [73] under the amino acid substitution model LG+I+G, selected by a BIC calculation in ProtTest [74], with 1000 rapid bootstrap searches.
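The BIC-based model choice can be sketched as follows. This is a minimal illustration of the standard criterion, BIC = −2 ln L + k ln n, and not ProtTest's implementation; the log-likelihoods, parameter counts, and the use of alignment length as the sample size n are all assumptions invented for the example.

```python
import math


def bic(log_likelihood, n_params, n_sites):
    """Bayesian information criterion: -2 lnL + k ln(n). Lower is better."""
    return -2.0 * log_likelihood + n_params * math.log(n_sites)


def best_model(candidates, n_sites):
    """Return (name of the lowest-BIC model, dict of all BIC scores).

    candidates maps a model name to (log-likelihood, number of free
    parameters).
    """
    scores = {
        name: bic(lnl, k, n_sites) for name, (lnl, k) in candidates.items()
    }
    return min(scores, key=scores.get), scores


# Hypothetical values, for illustration only (not results from this study).
candidates = {
    "LG": (-1000.0, 10),
    "LG+G": (-990.0, 11),
    "LG+I+G": (-980.0, 12),
}
chosen, scores = best_model(candidates, n_sites=1000)
print(chosen, scores)
```

The penalty term k ln n means that a richer model is preferred only when its gain in likelihood outweighs the added parameters.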

For the 16S rRNA phylogenetic analysis, a dataset was generated consisting of gene sequences belonging to the genus *Leptothoe* (>94.5% sequence similarity via BLASTn searches), sequences of closely affiliated genera (such as *Salileptolyngbya*, *Nodosilinea*, *Halomicronema*), and sequences of other filamentous cyanobacteria. Multiple sequence alignment was performed in MEGA v. 7.0 [75] using ClustalW [76]. The phylogenetic tree was constructed using maximum likelihood (ML) and Bayesian inference (BI). Two 16S rRNA gene sequences of the cyanobacterium *Gloeobacter violaceus* were used as outgroups (GenBank acc. no. AF132790, AF132791). The GTR+I+G model was determined by a BIC calculation in jModelTest 0.1.1 [77] as the most appropriate. The ML analysis was performed in MEGA v. 7.0 [75], with bootstrap resampling on 1000 replicates. Bayesian analysis was conducted using MrBayes 3.2.6 [78]. Four Metropolis-coupled MCMC chains (three heated and one cold) were run for 10,000,000 generations; the first 2,500,000 generations were discarded as burn-in, and the remaining generations were sampled every 1000th generation. Phylogenomic and phylogenetic trees were visualized using the FigTree (v1.4.3) software (http://tree.bio.ed.ac.uk/software/figtree/, accessed on 12 March 2021).
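The MCMC settings above imply a fixed number of retained posterior samples, which the short sketch below works out. The function name is hypothetical, and the count ignores the initial state at generation 0, which some programs also record.

```python
def posterior_samples(generations, sample_every, burnin_generations):
    """Return (total samples, samples discarded as burn-in, samples kept)
    for a single run sampled at a fixed interval."""
    total = generations // sample_every
    discarded = burnin_generations // sample_every
    return total, discarded, total - discarded


# Settings stated in the text: 10,000,000 generations, sampling every
# 1000th generation, first 2,500,000 generations discarded as burn-in.
total, discarded, kept = posterior_samples(10_000_000, 1000, 2_500_000)
print(f"{total} samples, {discarded} discarded, {kept} retained")
```

With these settings the burn-in corresponds to 25% of the run, leaving 7500 samples for summarizing the posterior.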

### *3.5. Annotation and Comparative Analyses of Genomes*

Open reading frame (ORF) prediction and annotation were performed on the draft genomes of the two sponge-associated *Leptothoe* strains using the RAST (Rapid Annotation using Subsystem Technology) prokaryotic genome annotation server (version 2.0) [28] with standard procedures. For comparative analysis, five genomes of strains belonging to the genus *Leptothoe* were identified in NCBI's GenBank [79] and selected. Prior to the comparative genomic analyses, all genome datasets were re-annotated using RAST [28] and PROKKA [80]. Subsystem annotation of all seven genomes was performed with the RAST server [28] and the SEED tool [29]. CDSs (predicted using RAST) of all seven genomes were annotated on the basis of clusters of orthologous groups (COGs) of proteins using the online server WebMGA (e-value = 0.001) [81]. Pseudogenes were predicted using NCBI's annotation pipeline. The different classes of mobile elements were analyzed separately: PHASTER [82] was used for phage detection, TransposonPSI (http://transposonpsi.sourceforge.net/, accessed on 12 March 2021) for transposon identification, and ISEScan [83] for identification of insertion sequence elements. To detect eukaryotic-like proteins (ELPs) such as ankyrin repeats (ANKs), tetratricopeptide repeats (TPRs), leucine-rich repeats, WD40 proteins, and pyrroloquinoline quinone (PQQ), we searched the annotation files manually using the key words 'repeats', 'Ankyrin', 'Tetratricopeptide', 'leucine', and 'PQQ' (similar to Karimi et al. [7,34]).
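The key-word search was done manually; a scripted equivalent might look like the sketch below. The key words are those listed above, while the function name, the case-insensitive matching, the tab-separated line layout, and the example annotation lines are assumptions made for illustration.

```python
import re

# Key words taken from the text above.
KEYWORDS = ("repeats", "Ankyrin", "Tetratricopeptide", "leucine", "PQQ")


def find_elp_candidates(annotation_lines, keywords=KEYWORDS):
    """Return (matched key word, line) pairs for annotation lines that
    contain any key word. Matching is case-insensitive (an assumption)."""
    pattern = re.compile(
        "|".join(re.escape(k) for k in keywords), re.IGNORECASE
    )
    hits = []
    for line in annotation_lines:
        match = pattern.search(line)
        if match:
            hits.append((match.group(0), line.rstrip("\n")))
    return hits


# Hypothetical annotation lines, for illustration only.
example_lines = [
    "gene_0001\tAnkyrin repeat protein",
    "gene_0002\tDNA polymerase III subunit alpha",
    "gene_0003\tTetratricopeptide repeat (TPR) family protein",
]
hits = find_elp_candidates(example_lines)
for keyword, line in hits:
    print(keyword, "->", line)
```

A broad key word such as 'leucine' also matches unrelated annotations (for example, tRNA ligases), so hits from such a search would still need manual curation.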

Average nucleotide and amino acid identities were estimated, and the corresponding heatmaps generated, using the program GET\_HOMOLOGUES [84]. To gain further insight into their metabolic potential, all seven genomes were screened for natural product biosynthetic gene clusters (BGCs) using antiSMASH 5.1.1 [85].
