#### *3.1. Materials and Animals*

Maraena whitefish were provided by the Institute for Fisheries of the State Research Centre for Agriculture and Fishery Mecklenburg-Western Pomerania (Born, Germany) and by BiMES, Binnenfischerei GmbH (Friedrichsruhe, Germany). Fish were held in freshwater recirculation systems with a 12:12 day-and-night cycle at 18 °C. Water quality was maintained by automated purification and disinfection (bio-filter and UV light). In addition, selected chemical and physical water parameters were monitored continuously.

Sampling of ten organs or tissues (gills; gonads; head kidney; heart; liver; muscle; spleen; and the brain regions hypothalamus, hind brain, and telencephalon) from four maraena whitefish followed the standards described in the German Animal Welfare Act and was approved by the Landesamt für Landwirtschaft, Lebensmittelsicherheit und Fischerei, Mecklenburg-Vorpommern, Germany (LALLF M-V/TSD/7221.3-1-069/18) in November 2018. The tissues were sampled rapidly, immediately frozen in liquid nitrogen, and kept at −80 °C until RNA extraction.

#### *3.2. In Silico Identification and Phylogenetic Analysis of ST8Sia Sequences*

A local alignment (BLAST) approach was used to retrieve the vertebrate *st8sia* nucleotide sequences with significant homology to the mammalian sequences from the genomic and Transcriptome Shotgun Assembly (TSA) divisions of the GenBank/EBI databases at the National Center for Biotechnology Information (NCBI) (last accessed on 27 September 2019), from ENSEMBL (release 97), and from the PhyloFish database [7,14,37]. Protein sequences were analysed using the Expert Protein Analysis System (ExPASy; Swiss Institute of Bioinformatics, Switzerland; https://www.expasy.org/). Sequence alignments were performed using ClustalW (PRABI; https://npsa-prabi.ibcp.fr/cgi-bin/npsa\_automat.pl?page=/NPSA/npsa\_clustalw.html). For the phylogenetic analysis, the known vertebrate ST8Sia sequences were aligned with MUSCLE in MEGA 7.0 [40]. The multiple sequence alignments of the selected vertebrate ST8Sia sequences were generated using the MUSCLE and Clustal Omega algorithms in MEGA 7.0 and manually refined (see Supplementary Data 1 and 2). Phylogenetic trees were produced by the neighbor-joining (NJ), maximum likelihood, and minimum evolution methods in MEGA 7.0 [40,69].
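
Distance-based tree methods such as neighbor-joining and minimum evolution start from pairwise distances between aligned sequences. The following minimal sketch illustrates one simple distance, the proportion of differing sites (p-distance). It is an illustration only: the substitution model used in MEGA is not stated here, and the function name `p_distance` and the toy alignment are hypothetical.

```python
def p_distance(seq_a, seq_b, gap="-"):
    """Proportion of differing sites between two aligned sequences.

    Columns in which either sequence carries a gap are ignored.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    differing = 0
    for a, b in zip(seq_a, seq_b):
        if a == gap or b == gap:
            continue  # skip gapped columns
        compared += 1
        if a != b:
            differing += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return differing / compared


# Toy alignment (hypothetical sequences, not real ST8Sia data).
alignment = {
    "seq1": "MKT-LLA",
    "seq2": "MKTALLG",
    "seq3": "MRT-ILG",
}
names = sorted(alignment)
matrix = {
    (x, y): p_distance(alignment[x], alignment[y])
    for x in names
    for y in names
}
print(matrix[("seq1", "seq3")])
```

A matrix of this kind is the input that a neighbor-joining implementation would cluster into a tree.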

To determine the consequences of the duplication or loss of genes in a given order of Actinopterygii, we considered what happened to the closest paralogue. The amino acid substitutions that occurred at its base were deduced using the parsimony method implemented in the Protpars program (PHYLIP package, version 3.69) [70].

#### *3.3. Synteny Analysis, Paralogon Detection, and Ancestral Genome Reconstruction*

Synteny between the *st8sia* gene loci and neighbouring genes in vertebrate genomes was assessed by manual chromosome walking and reciprocal BLAST. Paralogous blocks were detected and visualized with Genomicus (version 97.01; http://www.genomicus.biologie.ens.fr/genomicus-92.01/cgi-bin/search.pl, last accessed August 2019) [53]. When the *st8sia* gene of interest was not found in a genome, physically close genes were used as seeds to identify syntenic segments.
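
The logic of a reciprocal BLAST comparison can be sketched as follows. This minimal example assumes that the best hit of each query has already been extracted from the BLAST output into plain dictionaries; the gene identifiers and the function name `reciprocal_best_hits` are hypothetical and not part of the original analysis.

```python
def reciprocal_best_hits(a_to_b, b_to_a):
    """Return sorted pairs (a, b) that are each other's best hit.

    a_to_b: maps each gene of genome A to its best BLAST hit in genome B
    b_to_a: maps each gene of genome B to its best BLAST hit in genome A
    """
    pairs = []
    for gene_a, gene_b in a_to_b.items():
        # Keep the pair only if the reverse search points back to gene_a.
        if b_to_a.get(gene_b) == gene_a:
            pairs.append((gene_a, gene_b))
    return sorted(pairs)


# Hypothetical identifiers, for illustration only.
forward = {"A_st8sia2": "B_g1", "A_nbr1": "B_g2", "A_nbr2": "B_g3"}
backward = {"B_g1": "A_st8sia2", "B_g2": "A_nbr1", "B_g3": "A_other"}

rbh = reciprocal_best_hits(forward, backward)
print(rbh)
```

Neighbouring genes that pass this reciprocal test are the candidates one would then use as anchors for a conserved syntenic segment.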

#### *3.4. Sequence Alignments, Motifs Analysis, and 3D Simulation of PSTD*

Multiple sequence alignments of published sequences from selected species (accession numbers in Supplementary Data 1) were performed with MUSCLE at EMBL-EBI (version 3.8; https://www.ebi.ac.uk/Tools/msa/muscle/). The alignments were edited and annotated in the Java Alignment Viewer Jalview 2.11.0 and manually refined [67]. The 3D structures of the human PSTD of ST8Sia II and ST8Sia IV were generated in YASARA (version 19.9.17) using the Protein Data Bank (PDB) entry for ST8Sia IV (code: 6AHZ; https://www.rcsb.org/pdb/home/sitemap.do; https://www.wwpdb.org/pdb?id=pdb\_00006ahz). The ST8Sia II model was generated in YASARA by changing the amino acids at positions N3, K4, L5, and K6 to H3, H4, V5, and N6, respectively. The human amino acid sequences of ST8Sia II and ST8Sia IV were modified at the following positions according to the different fish PSTD sequences: *C. maraena* ST8Sia II, L2, T4, D6, H8, F11, N28, and Q30; *P. fluviatilis* ST8Sia II, L2, T4, D6, F11, and N28; *I. punctatus* ST8Sia IV, R3, P6, H8, M9, V17, and N30; and *C. maraena* ST8Sia IV, V1, R3, R6, I29, and N30. We designed only one PSTD motif for *C. maraena* ST8Sia II because ST8Sia II-r1 and ST8Sia II-r2 differ only at position R22.
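
At the sequence level, the residue exchanges described above amount to a list of point substitutions applied to a template. The sketch below illustrates this bookkeeping only; it is not the YASARA workflow. It assumes that the exchanges at positions 3 to 6 read as N3 to H, K4 to H, L5 to V, and K6 to N, and it uses a short placeholder string rather than the real PSTD sequence. The function name `apply_substitutions` is hypothetical.

```python
import re


def apply_substitutions(sequence, substitutions):
    """Apply point substitutions such as 'N3H' (1-based position) to a sequence.

    Each substitution is checked against the template residue first, so a
    mismatch between the list and the template is reported instead of
    being silently applied.
    """
    residues = list(sequence)
    for sub in substitutions:
        match = re.fullmatch(r"([A-Z])(\d+)([A-Z])", sub)
        if match is None:
            raise ValueError(f"cannot parse substitution {sub!r}")
        old, pos, new = match.group(1), int(match.group(2)), match.group(3)
        if not 1 <= pos <= len(residues):
            raise ValueError(f"position {pos} is outside the sequence")
        if residues[pos - 1] != old:
            raise ValueError(
                f"expected {old} at position {pos}, found {residues[pos - 1]}"
            )
        residues[pos - 1] = new
    return "".join(residues)


# Placeholder template: NOT the real PSTD; only positions 3-6 (N, K, L, K)
# follow the residues named in the text.
toy_template = "AANKLKAA"
toy_variant = apply_substitutions(toy_template, ["N3H", "K4H", "L5V", "K6N"])
print(toy_variant)
```

Checking the original residue before each exchange guards against off-by-one errors in position numbering, which are easy to introduce when motif positions and full-length protein positions are mixed.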

#### *3.5. RNA Extraction, cDNA Synthesis, Primer Design, and RT-qPCR*

Total RNA was isolated from the individually homogenized organs and tissues using TRIzol (Invitrogen/Thermo Fisher Scientific, Darmstadt, Germany), followed by an additional purification step (RNeasy Mini Kit, Qiagen). The quantity and integrity of the isolated RNA were determined using the NanoDrop 2000 photometer (Thermo Fisher Scientific) and agarose gel electrophoresis. Subsequently, we reverse-transcribed the total RNA using the SuperScript II Reverse Transcriptase (Thermo Fisher Scientific) and a mixture of oligo-d(T) and random hexanucleotides. This reaction was carried out at 42 °C (50 min), followed by an inactivation step (70 °C, 15 min). The resulting cDNA was diluted in 100 μL distilled water.

Real-time fluorescence-based quantitative RT-PCR (RT-qPCR) was used to determine the mRNA abundance of the two *st8sia2* gene variants in the above ten organs and tissues of maraena whitefish (*n* = 4). To this end, we identified discriminating sequence motifs to derive the oligonucleotides for *st8sia2-r1* (sense, 5′-AGCCTCATCAGGAAGAACATCC-3′; antisense, 5′-TTCCCTACGATGGCACAGCGT-3′) and *st8sia2-r2* (sense, 5′-CGTTCAACAGGAGCCTCTCTAA-3′; antisense, 5′-TTCCCTACGATGGCACAGCGC-3′). Moreover, we designed a *st8sia4*-specific primer pair (sense, 5′-ATGATAAGGAAGGACGTGCTGC-3′; antisense, 5′-TGTTGAGCGTTCGGCGTCTGT-3′). These RT-qPCR primers were designed with the Pyrosequencing Assay Design software (v.1.0.6; Biotage, Uppsala, Sweden) to yield amplicons between 121 bp and 226 bp. *eef1a1a2* (encoding eukaryotic translation elongation factor, variant a2), *rpl9*, and *rpl32* (encoding ribosomal proteins L9 and L32) were selected as reference genes [71]. The RT-qPCR analyses were conducted with the LightCycler 96 System (Roche, Mannheim, Germany) using the SensiFAST SYBR No-ROX Kit (Bioline, Luckenwalde, Germany). We only considered crossing point (CP) values >35 for the expression analysis of *st8sia2-r1*, *st8sia2-r2*, and *st8sia4*. Their copy numbers were calculated from standard curves generated with 10-fold dilutions of the respective PCR-generated fragments (1 × 10<sup>3</sup> to 1 × 10<sup>6</sup> copies). Melting-curve analyses validated the amplification of the distinct products. Amplicons were visualized on 3% agarose gels to assess product size and quality.
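
The copy-number calculation from a standard curve can be sketched as follows. The example fits CP against log10(copies) by ordinary least squares and inverts the fitted line for an unknown sample. It is a minimal illustration, not the LightCycler software's procedure; the CP values are invented, and the function names `fit_standard_curve` and `copies_from_cp` are hypothetical.

```python
import math


def fit_standard_curve(copies, cp_values):
    """Least-squares fit of CP = slope * log10(copies) + intercept."""
    xs = [math.log10(c) for c in copies]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(cp_values) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, cp_values))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def copies_from_cp(cp, slope, intercept):
    """Invert the standard curve to estimate the copy number for a CP value."""
    return 10 ** ((cp - intercept) / slope)


# 10-fold dilution series covering the range given in the text.
standard_copies = [1e3, 1e4, 1e5, 1e6]
# Invented CP values for illustration only (not measured data).
standard_cp = [30.0, 26.6, 23.2, 19.8]

slope, intercept = fit_standard_curve(standard_copies, standard_cp)
estimate = copies_from_cp(25.0, slope, intercept)
print(round(slope, 3), round(intercept, 3), round(estimate))
```

Because the fit is linear in log10(copies), estimates for CP values outside the range of the dilution series are extrapolations and should be treated with corresponding caution.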

#### *3.6. Data Availability*

To identify the maraena whitefish ST8Sia II sequences, the orthologous sequences from rainbow trout and Atlantic salmon were aligned with the software Bowtie2 (v 2.2.4) to our RNA-seq read collection from maraena whitefish [72]. The alignments were then indexed and sorted with the software package Samtools (v.16) and the final consensus sequences were obtained with the Ugene software (v 1.29).
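
The idea behind deriving a consensus from aligned reads can be illustrated with a simple per-position majority vote. This sketch is a simplification under stated assumptions: reads are ungapped, each read is given with a 0-based start position on the reference, and uncovered positions are reported as `N`. It does not reproduce the consensus algorithm of Ugene, and the function name `majority_consensus` and the toy reads are hypothetical.

```python
from collections import Counter


def majority_consensus(aligned_reads, length, uncovered="N"):
    """Majority-vote consensus from ungapped reads placed on a reference.

    aligned_reads: list of (start, sequence) tuples, start is 0-based
    length: length of the reference the reads were aligned to
    """
    columns = [Counter() for _ in range(length)]
    for start, seq in aligned_reads:
        for offset, base in enumerate(seq):
            pos = start + offset
            if 0 <= pos < length:
                columns[pos][base] += 1
    # Take the most frequent base per column; mark uncovered columns.
    return "".join(
        col.most_common(1)[0][0] if col else uncovered for col in columns
    )


# Toy reads (hypothetical), placed on a reference of length 7.
reads = [(0, "ACGT"), (2, "GTTA"), (1, "CGTT"), (3, "AT")]
consensus = majority_consensus(reads, 7)
print(consensus)
```

In the toy example, position 3 is covered by three reads carrying T and one carrying A, so the vote resolves to T, and the last position has no coverage and is therefore reported as `N`.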
