*3.1. Biological Sample*

Specimens of *Protopalythoa variabilis* (Duerden, 1898) were collected in coastal reefs of Porto de Galinhas, Pernambuco, Brazil (8◦3020" S, 35◦0034" W) during low tide. A voucher specimen (GPA 181) was identified by us and was kept at the cnidarian collection of the Anthozoan Research Group (GPA) at the Academic Center of Vitória, Federal University of Pernambuco (Brazil). This species has been mentioned by some authors as *Palythoa variabilis*, in reason of a proposed synonymy for the genera *Palythoa* and *Protopalythoa* [75]. However, the issue of distinctive genera is not completely solved ye<sup>t</sup> (see, for example, [76]), despite new molecular phylogenetic approach, based on the universal target-enrichment baits, has been recently developed to help resolve long-standing controversial relationships in the class Anthozoa [77]. Thus, until an extensive revision of the group is not definitively resolved with morphological and molecular data precisely combined, with inclusion of species of both genera, the binomial nomenclature *Protopalythoa variabilis* is used herein, as for decades.

### *3.2. RNA Library Construction and Origin of Zoantharian RNA Sequences*

The RNA isolation, library preparation and the transcriptome assembly of the *P. variabilis* holobiont were performed as described in one of our previous articles [24]. RNA sequencing was performed using the Illumina HiSeq 2500 platform. Reads were cleaned up before the de novo transcriptome assembly using the Trinity method for transcriptome reconstruction [78]. This Transcriptome Shotgun Assembly (TSA) project was deposited in DDBJ/EMBL/GenBank under the accession GCVI00000000, associated with the BioProject PRJNA279783 and biosample SAMN03450566. The statistics of the RNA sequencing and contig assembling are summarized in the Supplementary Table S1.

### *3.3. Assessment of the Biodiversity Composition of the P. variabilis Holobiont*

*P. variabilis* transcripts with a high similarity to 16S rRNA, COI, and rbcL sequences were identified using BLASTn with an E-value limit of 1E-40. The closest related species were characterized by homology search of these transcripts against the NCBI nr database. Only the first hits in concordance with the selected gene markers aforementioned were retained for species composition identification in the holobiont assemblage.

### *3.4. Sequence Annotations for Enzyme Precursors in the P. variabilis Holobiont*

The unigenes from the *P. variabilis* transcriptome were investigated for structural enzyme homology using BLASTx (BLAST+ suite, version 2.5.0) [79] with a fixed E-value of 1E-5 against four public protein databases: the NCBI non-redundant (nr) protein database (accessed from October to November 2016), the Clusters of Orthologous Groups (COG) database, version 2003–2014, the UniProtKB/Swiss-Prot database, downloaded in October 2016), and the EuKaryotic Orthologous Groups (KOG) database, version 2003. Basic statistics of the sequence annotations are described in Supplementary Figures S1 and S2. Species information of the selected annotations was extracted from the BLAST output files to discern the taxonomic distribution.

### *3.5. Gene Ontology, Enzyme Codes and KEGG Pathway Assignments*

The Blast2GO software, version 4.0.2 [80], was used for the subsequent steps under default parameters to carry out the InterProScan protein domain analysis, followed by the Gene Ontology (GO) annotation. The annotations were then subjected to a generic GO-slim reduction, prior to the mapping of the Enzyme Commission codes (EC) and KEGG pathways. The GO annotations chart was plotted using WEGO [81], while further KEGG pathways information was retrieved through the KEGG BRITE hierarchies site [82].

### *3.6. Sequence Alignment and Phylogenetic Inference*

Multiple sequence alignments of predicted enzyme sequences were performed using MUSCLE v3.8.31 [83] or Kalign v2.9.0b2 [84], depending on whether the enzyme domain/motif was a single structural unit or multiple-domain repeats, respectively. The phylogenetic tree was inferred using MEGA 7.0.26 [85] based on the LG+G+I model, with a bootstrap test of 500 replicates and edited using FigTree v 1.4.3. The amino acid sequence identities and similarities were determined using Jalview 2.9.0b2 [86].

### *3.7. Prediction of Enzymes with Two or More Activities*

An alternative annotation strategy was used to identify potential enzymes with two or more enzymatic activities. To favor the multiple domains (hits) discovery, we first performed a BLASTx search against the UniProtKB/Swiss-Prot database, keeping an E-value of 1E-5 with the additional

"culling\_limit" option set to "1". The selected set was reduced to only sequences with at least two non-overlapping hits on the same frame. During a second round of selection, only sequences with predicted protein products containing regions found by BLAST were kept. Finally, an InterProScan protein domain analysis [87] was performed to map each sequence to its GO annotation and its associated EC codes and KEGG pathways using the external EC2GO [88] and KEGG2GO mapping databases [89,90]. Only sequences with two or more ECs were kept, and data on the enzyme's substrate(s) and product(s) were retrieved from KEGG Enzyme [82].

### *3.8. Analysis of Predicted Enzyme with Partial EC Number*

Results of the annotation generated during the Blast2GO analysis were used to extract a list of the transcripts with a partial EC number. For each of these sequences, the corresponding protein sequence is predicted and subsequently submitted to InterProScan 5 via RESTful service 80 to validate the presence of a catalytic domain corresponding to the partial EC. These steps were automatized using a python script, depending on the external EC2GO mapping database [88]. To perform the observed distance analysis between the predicted *P. variabilis* and the known enzyme sequences, the available collections of sequences corresponding to each EC of interest were retrieved from the BRENDA website (www.brenda-enzymes.org, release 2018.1) [91] and sequences with a length shorter or higher to two population standard deviation were removed. The multiple sequence alignment was done using Kalign v2.04 77, and the analysis was computed with MEGA 7.0.26 78 using the Neighbor-Joining method and the p-distance method, removing missing ambiguous positions for each sequence pair, and with a bootstrap test of 500 replicates. The cladograms were edited in the Interactive Tree Of Life website [92].
