#### *2.2. PL7 Phylogenic and Structural Analyses*

Alginate lyases of the PL7 family are widely distributed in bacteria, share a typical β-jelly roll fold, and include both endolytic and exolytic enzymes. To date, crystal structures of eleven PL7 alginate lyases have been elucidated, and at least 40 representatives of the family have been characterized (CAZy database, February 2021). Based on the sequence similarity of catalytic domains, the PL7 family has been subdivided into five subfamilies (SF1-SF5) [26]. Subfamily 6 (SF6) was proposed by Thomas et al. in an extensive study of the PL7 alginate lyases from *Z. galactanivorans* Dsij<sup>T</sup> [21]. It has been suggested that PL7 enzymes from SF6 are conserved only in marine representatives of the *Flavobacteriaceae*. Later, the alginate lyase Aly7B\_Wf from *Wenyingzhuangia fucanilytica* CZ1127<sup>T</sup> was characterized and classified as belonging to subfamily SF6 of the PL7 family [27]. Recently, the crystal structure of a novel PL7 alginate lyase, AlyC3 from *Psychromonas* sp. C-3, was reported [28]. AlyC3, along with several other unclassified PL7 alginate lyases, was attributed to a subfamily designated SF6 in that study, which, given the earlier use of this name [21], implies that these enzymes belong to a novel subfamily, SF7. Thus, despite some confusion in the literature, subfamilies SF6 and SF7 can be distinguished in addition to the well-known subfamilies SF1-SF5.

Initially, to clarify the classification of the predicted alginate lyases from *Zobellia* within the PL7 family, a phylogenetic tree was constructed with all the available characterized PL7 alginate lyases from the CAZy database (data not shown). To avoid redundancy, a second phylogenetic tree was built that included only the most representative alginate lyases of each subfamily along with the target sequences (Figure 3).

According to the ML tree, 41 PL7 alginate lyases from *Zobellia* fall strictly into SF3, SF5, and SF6. Although AlyA1 and AlyA5 from *Z. galactanivorans* Dsij<sup>T</sup> have been biochemically and structurally characterized in detail [21], close inspection of orthologous and paralogous genes is of great value for investigating the evolution of PL7 enzymes at the genus level. For convenience, subclades of paralogs and orthologs were numbered from one to six on the phylogenetic tree.

It was revealed that only 6 of the 12 *Zobellia* representatives encode PL7 lyases belonging to subfamily SF3, namely *Z. galactanivorans* Dsij<sup>T</sup>, *Z. galactanivorans* OII3, *Z. uliginosa*, *Zobellia* sp. Asnod2-B07-B, *Zobellia* sp. Asnod3-E08-A, and '*Z. barbeyronii*'. Five of these, including AlyA1, clustered together as presumptive orthologs, while ZbarT\_PL7sf3\_3 clustered reliably with AlyQ from *Persicobacter* sp. CCB-QB2. The studied PL7-SF3 lyases have a modular organization: all of them contain a cleaved lipoprotein signal and a CBM32 module in addition to the catalytic domain. The same was determined earlier for AlyA1 [3]. Furthermore, domains moderately resembling CBM16 and CBM6 were found in the architectures of AlyQ and ZbarT\_PL7sf3\_3, respectively.

AlyA1 is an endolytic guluronate lyase [21], and AlyQ is most active on alginate, although it can also act on polyguluronate (poly-G) and polymannuronate (poly-M) [29]. For the putative PL7-SF3 lyases from *Zobellia*, homology modeling based on the AlyA1 and AlyQ crystal structures was carried out (data not shown). The congruence of the phylogeny with the structural similarities between these presumptive orthologs indicates that they may possess similar activities. Considering that AlyA1 appears to have been acquired via horizontal gene transfer (HGT) from marine *Actinobacteria* [19], it seems likely that ZbarT\_PL7sf3\_3 was laterally acquired from other taxa as well. It is possible that the CBM modules were fused with the catalytic domains in ancestral genes before their transfer.

We consider the PL7 alginate lyases of subfamily SF5 to be conserved within the *Zobellia* genus because at least one homolog was found in each genome. According to the phylogenetic tree, a well-supported (bpp = 94) orthologous group (OG) composed of 12 sequences is clearly distinguished and designated as 4 in Figure 3. Genes encoding putative PL7-SF5 lyases were duplicated and are represented by paralogous pairs (marked as 5 in Figure 3) in seven *Zobellia* representatives, namely *Z. russellii*, *Z. laminariae*, '*Z. barbeyronii*', *Zobellia* sp. Asnod2-B07-B, *Zobellia* sp. Asnod3-E08-A, *Z. amurskyensis* KMM 3526T, and *Z. amurskyensis* MAR 2009 138. It should be noted that *Zobellia* sp. Asnod1-F08 and Asnod2-B02-B encode only one copy of the PL7 alginate lyases attributed to the OG of subfamily SF5. All the predicted PL7-SF5 lyases possess a cleaved lipoprotein signal and a PL7 catalytic domain. The amino acid sequence identities of the PL7 catalytic domains between the in-paralogs SF5\_4 and SF5\_5 ranged from 65.0% to 67.51%. The identities between AlyA5 from *Z. galactanivorans* Dsij<sup>T</sup> and the orthologs from OG SF5\_4 ranged from 83.16% to 98.97%, while those between AlyA5 and the out-paralogs from SF5\_5 ranged from 64.98% to 66.79% (Table S2).
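The identity ranges reported above can be illustrated with a minimal sketch of a pairwise identity calculation. It assumes that the two catalytic-domain sequences have already been aligned to equal length and that identity is counted over columns in which neither sequence has a gap; the text does not state which denominator or software was used, so both choices, as well as the function names `percent_identity` and `identity_range`, are illustrative assumptions.

```python
from typing import Iterable, Tuple


def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identity between two rows of the same alignment.

    Assumption: columns where either sequence has a gap are ignored,
    so the denominator is the number of ungapped aligned columns.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    matches = 0
    compared = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == gap or b == gap:
            continue  # skip gapped columns
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


def identity_range(reference: str, others: Iterable[str]) -> Tuple[float, float]:
    """Lowest and highest identity of a reference row against other rows."""
    values = [percent_identity(reference, other) for other in others]
    if not values:
        raise ValueError("no sequences to compare against")
    return min(values), max(values)


# Toy aligned fragments (not real PL7 sequences), for illustration only.
reference = "ACDE-G"
group = ["ACDFHG", "ACDE-G"]
low, high = identity_range(reference, group)
print(f"identity range: {low:.2f}% to {high:.2f}%")
```

With a different denominator (for example, the full alignment length or the length of the shorter sequence), the same pair of sequences yields a different percentage, so values are comparable only when computed under one convention.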

**Figure 3.** The phylogenetic tree of PL7 family alginate lyases from *Zobellia* and selected characterized representatives of the PL7 family. For characterized PL7 family members, the corresponding GenBank accession numbers are given; for PL7 proteins of *Zobellia*, locus tags or RAST ORF identifiers are listed. The organism names are listed in brackets. Proteins with a solved crystal structure are marked with violet diamonds, and identifiers from unpublished genomes are marked with asterisks. Bootstrap values lower than 50 are not indicated.

A comprehensive study determined that AlyA5 cleaves monomers from the non-reducing end of oligoalginates in an exolytic fashion [21]. The three-dimensional structures of the in-paralogous ZlamT\_PL7sf5\_4 and ZlamT\_PL7sf5\_5 were modeled using AlyA5 (PDB ID: 4BE3) as the template and superimposed. Based on the structural alignment (Figure S1), the main divergences between the catalytic domains of the ortholog and paralogs were colored in the ribbon representation (Figure 4a). The orthologs from OG SF5\_4 were more similar to each other in 3D structure than the in-paralogs were, which confirms the conclusions drawn from the pairwise sequence identity analysis. Although the overall fold of the *Zobellia* PL7-SF5 representatives is mostly similar, particularly in the catalytic groove, there are slight differences in the external loop configurations.

**Figure 4.** Superimposition of paralogous PL7 alginate lyases from SF5 and SF6 onto known structures: (**a**) ribbon representation of the superimposition of the predicted 3D models for paralogous PL7-SF5\_4 and SF5\_5 from *Z. laminariae* KMM 3676T onto prototype AlyA5 from *Z. galactanivorans* Dsij<sup>T</sup> (PDB code 4BE3); (**b**) ribbon representation of the superimposition of predicted 3D models for paralogous PL7-SF6\_1 and SF6\_2 from *Zobellia* sp. Asnod3-E08-A and AlyA2 SF6\_1 from *Z. galactanivorans* Dsij<sup>T</sup> onto prototype FlAlyA from *Flavobacterium* sp. UMI-01 (PDB code 5Y33). Ribbon representations of the superimpositions are presented in shades of grey. Differences in the external loops between the spatial structures of the paralogous alginate lyases are shown in color. The color corresponding to each structure is indicated.

The most curious and puzzling insights were obtained for the PL7 enzymes belonging to subfamily SF6. It was revealed that 10 of the 12 *Zobellia* representatives encode PL7-SF6 lyases, the exceptions being *Zobellia* sp. Asnod1-F08 and Asnod2-B02-B. A putative OG for subfamily SF6 was distinguished (bpp = 55) and designated as 1 on the phylogenetic tree (Figure 3). In the same representatives that carry SF5 in-paralogs, except *Z. russellii*, these lyases were represented by in-paralogs. All putative PL7-SF6 lyases contain only a cleaved lipoprotein signal along with the PL7 catalytic domain. Pairwise identities calculated for in-paralogs (59.92–61.54%) and for AlyA2 versus alleged orthologs (66.80–100%) and out-paralogs (64.78–68.83%; Table S2) were insufficient for reliable delineation of an orthologous group in SF6 because the values obtained did not agree with generally accepted criteria.

To date, only two enzymes among the PL7 alginate lyases from subfamily SF6 have been studied. Aly7B\_Wf from *W. fucanilytica* CZ1127<sup>T</sup> was characterized as an endo-acting bifunctional alginate lyase that preferably cleaves poly-M [27]. FlAlyA from *Flavobacterium* sp. UMI-01 was first characterized as an endolytic enzyme with a preference for polymannuronate [30], and its crystal structure was solved later [31]. The three-dimensional structures of the orthologous AlyA2 and of the in-paralogous PL7-SF6 lyases from *Zobellia* sp. Asnod3-E08-A were modeled using FlAlyA (PDB ID: 5Y33) as the template and superimposed. The most significant divergences between the 3D structures were colored in the ribbon representation (Figure 4b). The overall fold of the *Zobellia* PL7-SF6 lyases mostly matched the prototype structure, but there were moderate differences in the external loop configurations, which may imply sub-functionalization of PL7-SF6. Considering the observed peculiarities of the 3D structures, which are reflected in the structural alignment (Figure S2), the PL7-SF6 lyases appear to be highly diversified within the *Zobellia* genus.

A detailed exploration of the genetic loci containing PL7 lyase genes may shed light on the debatable issues of both OG delineation and the role of gene duplication.

#### *2.3. Comparative Analysis of PL7-Containing Loci between Zobellia Genomes*

The marine flavobacterium *Z. galactanivorans* Dsij<sup>T</sup> is a model organism for studying the bioconversion of algal polysaccharides, including alginates [14]. The Alginate Utilization System (AUS) of a marine bacterium was first identified, and studied in detail, in *Z. galactanivorans* Dsij<sup>T</sup> [3]. Another comprehensive investigation of alginate utilization loci was carried out for the marine '*Gramella forsetii*' KT0803 [6]. Recently, it was reported that key enzymes for alginate utilization are widespread across 60 strains isolated from marine environments and belonging to the phyla *Proteobacteria* and *Bacteroidetes* [7].

According to the literature, the AUS of *Z. galactanivorans* Dsij<sup>T</sup> is encoded by two operons and two stand-alone genes in the genome [14]. The activity of the system is tightly controlled by the presence of alginate in the medium [32] and by AusR, a GntR family repressor [33]. As described in [33], the current model of alginate degradation by *Z. galactanivorans* Dsij<sup>T</sup> implies stepwise depolymerization of alginate by the concerted action of the extracellular lyases AlyA1 (PL7) and AlyA7 (PL14). The resulting oligosaccharides, recruited by surface-exposed PKD-containing and SusD-like proteins, are then imported into the periplasm via TBDT, where they are further degraded into unsaturated mono-uronic acid by the alginate lyases AlyA2 (PL7), AlyA3 (PL17), AlyA4 (PL6), AlyA5 (PL7), and AlyA6 (PL6). Further conversions in the cytoplasm occur through the consecutive action of KdgF, SDR, and KdgK1. Finally, KDPG (2-keto-3-deoxy-6-phosphogluconate) is assimilated into the central metabolism through the Entner-Doudoroff pathway.
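The degradation model summarized above can be written down as a small ordered data structure, which makes it easy to look up where each PL7 lyase acts. This is only an illustrative encoding of the description in the text: the stage labels (in particular "cell surface" for the capture and import step), the role strings, and the names `AUS_MODEL`, `PL_FAMILY`, `compartment_of`, and `pl7_by_compartment` are assumptions introduced here, not terms from the cited studies.

```python
from typing import Dict, List, NamedTuple, Optional


class Stage(NamedTuple):
    compartment: str       # illustrative label for where the step occurs
    components: List[str]  # proteins named in the text for this step
    role: str              # short paraphrase of the step


# Ordered stages of the alginate degradation model as described in the text.
AUS_MODEL: List[Stage] = [
    Stage("extracellular", ["AlyA1", "AlyA7"],
          "stepwise depolymerization of alginate to oligosaccharides"),
    Stage("cell surface", ["PKD-containing protein", "SusD-like protein", "TBDT"],
          "recruitment of oligosaccharides and import into the periplasm"),
    Stage("periplasm", ["AlyA2", "AlyA3", "AlyA4", "AlyA5", "AlyA6"],
          "degradation to unsaturated mono-uronic acid"),
    Stage("cytoplasm", ["KdgF", "SDR", "KdgK1"],
          "consecutive conversions preceding KDPG assimilation"),
]

# Polysaccharide lyase family of each alginate lyase, as given in the text.
PL_FAMILY: Dict[str, str] = {
    "AlyA1": "PL7", "AlyA2": "PL7", "AlyA3": "PL17", "AlyA4": "PL6",
    "AlyA5": "PL7", "AlyA6": "PL6", "AlyA7": "PL14",
}


def compartment_of(component: str) -> Optional[str]:
    """Return the stage label in which a component acts, or None if unknown."""
    for stage in AUS_MODEL:
        if component in stage.components:
            return stage.compartment
    return None


def pl7_by_compartment() -> Dict[str, List[str]]:
    """Group the PL7 lyases of the model by the stage in which they act."""
    grouped: Dict[str, List[str]] = {}
    for stage in AUS_MODEL:
        members = [c for c in stage.components if PL_FAMILY.get(c) == "PL7"]
        if members:
            grouped[stage.compartment] = members
    return grouped


print(pl7_by_compartment())
```

In this encoding the three PL7 enzymes split into one extracellular lyase (AlyA1) and two periplasmic lyases (AlyA2 and AlyA5), matching the roles assigned to them in the model.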

In order to clarify the issues mentioned above, we performed a comparative analysis of loci containing PL7 genes across the *Zobellia* genomes (Figure 5).

It was revealed that PL7 genes are localized in six genetic loci (I–VI), which are part of a complex and evolved AUS, in agreement with those previously defined for AlyA1, AlyA2, and AlyA5 from *Z. galactanivorans* Dsij<sup>T</sup> [14]. Thus, the trend toward the distribution of PL7 alginate lyase genes over separate loci is conserved within the *Zobellia* genus. It is noteworthy that PL7 genes of subfamilies SF3 and SF6 are encoded strictly in separate loci: V or VI for SF3, and I for SF6. In contrast, in-paralogous genes from PL7-SF5 may be located adjacently in locus II or separately in loci II, III, and IV.

As shown on the synteny plot, the PL7-SF3 orthologs from *Z. galactanivorans* Dsij<sup>T</sup>, *Z. galactanivorans* OII3, *Z. uliginosa*, *Zobellia* sp. Asnod2-B07-B, and Asnod3-E08-A are encoded in locus V within the same genomic surroundings, whereas the ortholog from '*Z. barbeyronii*' is located in a different environment in locus VI. This supports the view that all of them were laterally acquired, in agreement with the phylogenetic analysis. It can be assumed that such dispersed genes are plastic in genomes and participate more frequently in HGT processes.

Ten PL7 genes from OG SF5\_4 are collocated with PL6 lyase genes in locus II, which was previously described in the *Z. galactanivorans* Dsij<sup>T</sup> genome as the small operon *zgal*\_4130-4132 [3]. Seven PL7-SF5 genes are represented by in-paralogs; among them, three SF5\_5 paralogs are collocated with SF5\_4 in locus II (in *Z. russellii*, *Z. amurskyensis* KMM 3526T, and *Z. amurskyensis* MAR 2009 138), whereas the four other SF5\_5 paralogs lie in a different locus, IV (in *Z. laminariae*, '*Z. barbeyronii*', *Zobellia* sp. Asnod2-B07-B, and Asnod3-E08-A). Interestingly, the SF5\_4 genes of *Zobellia* sp. Asnod1-F08 and Asnod2-B02-B are placed in locus III, which is not represented in the *Z. galactanivorans* Dsij<sup>T</sup> genome. We suppose that gene duplication and local gene transfer have shaped the evolution of PL7-SF5 genes at the *Zobellia* genus level. Summarizing the phylogenetic and synteny analyses, where SF5 in-paralogs are collocated, the SF5\_4 gene from the orthologous group is located upstream. Surprisingly, the pairwise identities of the PL7 catalytic domains between SF5\_4 and SF5\_5 remained about 66% on average, regardless of whether the genes were collocated.

Altogether, all genomes except those of *Zobellia* sp. Asnod1-F08 and Asnod2-B02-B contain PL7-SF6 orthologs, localized in a large operon (locus I) along with other carbohydrate-related genes. The operon *zgal*\_2624-2612 containing AlyA2 from *Z. galactanivorans* Dsij<sup>T</sup> was described in detail [3]. Six PL7-SF6 genes are represented by in-paralogs, for which upstream localization is a reliable criterion for belonging to OG PL7-SF6, as was identified for PL7-SF5 and discussed above. The hypothesized sub-functionalization of the in-paralogs SF6\_1 and SF6\_2 is supported by the presence of additional non-paralogous transporter and PKD-containing protein genes, as well as by the duplication of the repressor *AusR*, in locus I. Another interesting observation from the synteny analysis of locus I is the variation in PL lyase content. The first type of locus I contains genes for PL17 and PL7-SF6\_1 (strains of *Z. galactanivorans*, *Z. uliginosa*, and *Z. russellii*); the second comprises PL17, the PL7-SF6 in-paralogs, and PL6 (*Z. laminariae*, '*Z. barbeyronii*', *Zobellia* sp. Asnod2-B07-B and Asnod3-E08-A, and strains of *Z. amurskyensis*); the third contains PL17 and PL6 (*Zobellia* sp. Asnod1-F08 and Asnod2-B02-B).
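The three types of locus I described above amount to a simple decision rule on which lyase genes are present. The sketch below encodes that rule; the gene labels (`"PL17"`, `"PL6"`, `"PL7-SF6_1"`, `"PL7-SF6_2"`) and the function name `locus_one_type` are hypothetical conventions chosen for illustration, and any combination not described in the text is reported as unclassified rather than guessed.

```python
from typing import Iterable


def locus_one_type(lyase_genes: Iterable[str]) -> int:
    """Classify locus I by its polysaccharide lyase content.

    Type 1: PL17 and PL7-SF6_1
    Type 2: PL17, both PL7-SF6 in-paralogs, and PL6
    Type 3: PL17 and PL6, with no PL7-SF6 gene
    Raises ValueError for combinations not described in the text.
    """
    genes = set(lyase_genes)
    has_pl17 = "PL17" in genes
    has_pl6 = "PL6" in genes
    has_sf6_1 = "PL7-SF6_1" in genes
    has_sf6_2 = "PL7-SF6_2" in genes

    if has_pl17 and has_sf6_1 and not has_sf6_2 and not has_pl6:
        return 1
    if has_pl17 and has_sf6_1 and has_sf6_2 and has_pl6:
        return 2
    if has_pl17 and has_pl6 and not has_sf6_1 and not has_sf6_2:
        return 3
    raise ValueError(f"unclassified locus I content: {sorted(genes)}")


# Lyase content per type, following the description in the text.
examples = {
    "type 1 (e.g. Z. russellii)": ["PL17", "PL7-SF6_1"],
    "type 2 (e.g. Z. laminariae)": ["PL17", "PL7-SF6_1", "PL7-SF6_2", "PL6"],
    "type 3 (e.g. Zobellia sp. Asnod1-F08)": ["PL17", "PL6"],
}
for label, genes in examples.items():
    print(label, "->", locus_one_type(genes))
```

Raising an error for undescribed combinations keeps the rule faithful to the three observed types instead of silently assigning a new genome to the nearest one.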
