*2.3. Analysis of Contacts in Complexes of CGL and Its Mutants with Oligosaccharides*

The mucin-binding activities of the mutants did not correlate with the changes in the calculated binding energy of galactose with the CGL mutants (Table 1). This can be explained by the fact that the affinity of CGL for different ligands depends on their structure. To clarify how ligand structure affects the binding activity of CGL and the mechanism by which CGL attaches to its ligands, contacts of the CGL mutants in complexes with globotriose (Gb3) were analyzed in silico with the MOE 2018.01 program. The results showed that alanine substitution of His37, His129, Glu75, Asp127, His85, Asn27 and Asn119 changed the contacts of CGL with Gb3 (Figure 3) and the total binding energy of CGL with the ligands (Table 1).

**Table 1.** Mucin-binding activity of wild-type and mutant recombinant CGL and the change in the binding energy (ΔE = Emut − Ewt) of the CGL mutants with galactose and globotriose.


a/b—change in the binding energy of the CGL mutants with galactose (ΔE a) or globotriose (ΔE b); c—mucin-binding activity of the wild type CGL was 100%; \*—amino acid residues (aa) from Site 1, \*\*—aa from Site 2, \*\*\*—aa from Site 3.

**Figure 3.** 2D-diagram of Gb3-binding sites (Site 1, 2, 3) and mucin binding in Site 3 of the wild type CGL. Hydrogen bonds lost in the corresponding mutant are indicated with a cross (**x**).

The analysis of contacts between CGL and Gb3 showed that the Asn27 and Asn119 residues form hydrogen bonds not only with the C6-OH group of the terminal galactose residue of Gb3 but also with the neighboring galactose residue (Figure 3).

The Asn27Ala and Asn119Ala mutants lost three hydrogen bonds with Gb3 in Sites 1 and 2 compared with wild-type CGL (Figure 3), which correlates with a drastic decrease in their affinity for mucin, which, like Gb3, carries two terminal galactose residues (Table 1, Figure 2).

The Glu75 residue in Site 3 of CGL occupies the same position as Asn27 and Asn119 in Sites 1 and 2 and, like them, forms a hydrogen bond with the C6-OH group of the terminal monosaccharide residue. According to the modeling results, the Glu75Ala mutant lost only one hydrogen bond (Figure 3) and therefore retained a higher percentage (31%) of the lectin activity than the Asn27Ala and Asn119Ala mutants (Table 1).

The His37, His85 and His129 residues form two hydrogen bonds, both with the terminal monosaccharide residue of Gb3 only, and the binding energies of the His37Ala, His85Ala and His129Ala mutants with both galactose and globotriose are similar (Table 1, Figure 3). However, the lectin activities of the His37Ala, His85Ala and His129Ala mutants differed (Table 1, Figure 2). Apparently, the activities of these CGL mutants also depend on the structural rearrangement of the sites after alanine substitution of His37, His85 and His129. Distinct affinities of Sites 1–3 of CGL for galactose were also shown by NMR titrations [12].

According to the modeling data, Asp127 forms only one hydrogen bond with the terminal monosaccharide of Gb3 (Figure 3). These results are fully consistent with the crystallographic data from the Protein Data Bank (PDB accession numbers 5F8W, 5F8Y and 5F90). However, the activity of the Asp127Ala mutant with porcine stomach mucin (PSM) as the ligand was only 22% of that of the wild-type lectin, although only one hydrogen bond disappeared in the complexes with galactose and globotriose (Table 1, Figure 3).

To explain the drastic change in the activity of this mutant, a model of the Asp127Ala mutant in complex with the PSM oligosaccharide was constructed by molecular docking of CGL with the PSM-like trisaccharide of the blood group A epitope, GalNAcα1-3Gal [Fucα1-2], because no crystal structure data for PSM itself were available in the literature (Figures 3 and 4).

**Figure 4.** 3D-superimposition of the globotriose (Gb3) and porcine stomach mucin (PSM) trisaccharides in binding Site 3 of the wild-type CGL. The ligand structures are shown as sticks in blue (Gb3) and pink (PSM).

The analysis of contacts between Site 3 of CGL and the PSM-trisaccharide GalNAcα1-3Gal [Fucα1-2] showed that Asp127 forms a hydrogen bond with the C3-OH group of the terminal galactose residue and two additional hydrogen bonds with the OH groups at C2 and C3 of the third residue, fucose (Figures 3 and 4). The Asp127Ala mutant lost all three hydrogen bonds with the PSM trisaccharide, which can explain the sharp decrease (down to 22% of the wild-type CGL) in the mucin-binding activity of this mutant (Table 1, Figure 2).

The Asp35 and Asp83 residues in binding Sites 1 and 2 occupy the same positions as Asp127 in Site 3 and can form three hydrogen bonds with the PSM trisaccharide. The activities of the Asp35Ala and Asp83Ala mutants have not yet been studied experimentally, but it may be assumed that they will decrease as in the case of the Asp127Ala mutant.

#### **3. Discussion**

CGL is a GalNAc/Gal- and mucin-specific lectin with an amino acid sequence that distinguishes it from lectins of known families [9]. To date, this new lectin family, for which the name mytilectin was proposed [13], includes, besides CGL, five lectins from the sea mussels *Mytilus trossulus* (MTL), *Mytilus galloprovincialis* (MytiLecs 1–3) and *Mytilus californianus* (MCL) [15–19]. These lectins share a common β-trefoil fold and contain three carbohydrate-binding sites [9–13,19]. The β-trefoil fold was first proposed for the crystal structure of the Kunitz soybean trypsin inhibitor [20]. It is now known that the β-trefoil fold is shared by proteins from several subfamilies, including cytokines, ricin B-like lectins, agglutinins and actin-cross-linking proteins [21], which have no sequence similarity and differ in their ligands, modes of ligand binding and functions.

According to the crystal data, CGL exhibits a characteristic pseudo three-fold symmetry and contains three structurally conserved subdomains [11,12]. Each of these subdomains is composed of four β-strands. Two strands from each subdomain collectively form a six-stranded β-barrel, and the remaining two β-strands from each subdomain together form a β-hairpin triplet that caps one end of the barrel [11]. The putative glycan-binding pocket in the first CGL subdomain is formed by the side chains of His16, Tyr18, Val31, His33, Asp35, His37 and Arg39 and the backbone of Gly19 and Gly20 (HYGGVHDHR). The second binding pocket of CGL is formed by the same amino acid residues (HYGGVHDHR), whereas in the third pocket tyrosine is substituted by lysine and arginine by alanine (HKGGVHDHA) [11]. The structure of the CGL-galactosamine complex obtained by Liao et al. [12] also revealed three carbohydrate-binding sites in CGL: Site 1 consisted of His16, Gly19, Asp35, His37 and Asn119; Site 2 included His64, Gly67, Asp83, His85 and Asn27; Site 3 comprised His108, Gly111, Asp127, His129 and Glu75. Superimposition of the three carbohydrate-binding sites indicates that all three sites have the same amino acid composition except for the replacement of Asn by Glu in Site 3. These data confirmed our predictions based on homology modeling [10].
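The difference between the three pocket motifs quoted above can be checked with a few lines of Python. This is only an illustrative sketch: the names `SITE_MOTIFS` and `motif_differences` are hypothetical, and the reported positions are indices within the nine-letter motif string as written in the text, not residue numbers in the CGL sequence (the pocket residues are not contiguous in the sequence).

```python
# One-letter motifs of the three glycan-binding pockets, as quoted in the text.
SITE_MOTIFS = {
    "Site 1": "HYGGVHDHR",
    "Site 2": "HYGGVHDHR",
    "Site 3": "HKGGVHDHA",
}


def motif_differences(reference: str, other: str) -> list[tuple[int, str, str]]:
    """Return (1-based motif position, reference residue, other residue)
    for every position at which two equal-length motifs differ."""
    if len(reference) != len(other):
        raise ValueError("motifs must have the same length")
    return [
        (i + 1, a, b)
        for i, (a, b) in enumerate(zip(reference, other))
        if a != b
    ]


if __name__ == "__main__":
    ref = SITE_MOTIFS["Site 1"]
    for site, motif in SITE_MOTIFS.items():
        print(site, motif, motif_differences(ref, motif))
```

Under these assumptions, Sites 1 and 2 show no differences, while Site 3 differs at motif positions 2 (Tyr to Lys) and 9 (Arg to Ala), matching the substitutions described in [11].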

In our previous study, we evaluated the contribution of the three conserved HPK(Y)G motifs to the hemagglutinating and carbohydrate-binding activities of CGL by site-specific mutagenesis [10]. Alanine substitutions of His16, Pro17 and Gly19 in Site 1 and of His64, Pro65 and Gly67 in Site 2 resulted in complete loss of the hemagglutinating and mucin-binding activities of CGL, whereas the mutant CGL carrying the His108Ala, Pro109Ala and Gly111Ala substitutions in Site 3 retained its mucin-binding activity [10].

In this study, we applied the same approach to elucidate the individual contributions of the amino acid residues from CGL binding Sites 1–3 to the carbohydrate-binding activity. Alanine substitution of none of the studied residues (His37 and Asn119 from Site 1; His85 and Asn27 from Site 2; Asp127, His129 and Glu75 from Site 3) led to complete loss of the mucin-binding activity of CGL, owing to the presence of the two other intact sites. However, the contributions of these residues to the mucin-binding activity of CGL were not equal. The Asn119Ala substitution in Site 1 and the Asn27Ala substitution in Site 2 decreased the mucin-binding activity of CGL more strongly (down to 9% and 17%, respectively) than alanine substitution of Glu75, which occupies the same position in Site 3 as Asn119 and Asn27 in Sites 1 and 2, respectively (Table 1). This supports the suggestion of Jakób and co-authors [11] that the binding sites differ in their affinity (or specificity) for glycan moieties, and it agrees with our previous experimental data [10].

Moreover, in silico analysis of CGL binding to galactose, globotriose and mucin has shown that the affinity of CGL for these ligands depends on their structures, which determine the number of hydrogen bonds in the CGL-ligand complex and, consequently, its total binding energy. The maximal decrease in the mucin-binding activity, observed for the Asn119Ala mutant in Site 1 and the Asn27Ala mutant in Site 2, could be explained by the loss of all three hydrogen bonds with the two terminal galactose residues of the oligosaccharides compared with the wild-type CGL (Table 1, Figure 3). The Asp127 residue in Site 3 (and the equivalent residues Asp35 and Asp83 in Sites 1 and 2) was found to play a decisive role in the higher specificity of the lectin for mucin than for globotriose (Figure 4). Thus, the efficiency of CGL binding depends on the composition of the terminal monosaccharide units of the oligosaccharides, because the amino acid residues of Sites 1–3 differ in their capability to bond with the OH groups of the second galactose and the third fucose in addition to binding the terminal galactose.
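The relation between lost hydrogen bonds and residual activity discussed above can be tabulated in a short Python sketch. It uses only the values quoted in the running text (not the full Table 1); the record type `MutantRecord`, its field names and the helper functions are hypothetical and serve illustration only. The PSM-trisaccharide count is given in the text only for Asp127Ala and is left empty for the other mutants.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class MutantRecord:
    name: str
    site: int
    hbonds_lost_gb3: int             # hydrogen bonds lost in the Gb3 model
    hbonds_lost_psm: Optional[int]   # lost with the PSM trisaccharide, if stated
    activity_pct: int                # residual mucin-binding activity (wild type = 100%)


# Values as quoted in the text of Sections 2.3 and 3.
RECORDS = [
    MutantRecord("Asn119Ala", 1, 3, None, 9),
    MutantRecord("Asn27Ala", 2, 3, None, 17),
    MutantRecord("Asp127Ala", 3, 1, 3, 22),
    MutantRecord("Glu75Ala", 3, 1, None, 31),
]


def max_hbonds_lost(record: MutantRecord) -> int:
    """Largest number of hydrogen bonds lost with any modeled ligand."""
    return max(record.hbonds_lost_gb3, record.hbonds_lost_psm or 0)


def ranked_by_activity(records: list[MutantRecord]) -> list[MutantRecord]:
    """Sort mutants from lowest to highest residual mucin-binding activity."""
    return sorted(records, key=lambda r: r.activity_pct)


if __name__ == "__main__":
    for r in ranked_by_activity(RECORDS):
        print(f"{r.name:10s} Site {r.site}  bonds lost: {max_hbonds_lost(r)}  "
              f"activity: {r.activity_pct}%")
```

With these four mutants, every mutant that loses three hydrogen bonds with at least one modeled ligand retains less activity than Glu75Ala, which loses one. This is a consistency check on the quoted numbers, not a statistical analysis.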

#### **4. Materials and Methods**

#### *4.1. In Silico Analysis of Contacts between CGL and Ligands and Mutagenesis*

The model of the CGL spatial structure was constructed as described previously [10] on the basis of the crystal structure of the lectin MytiLec determined at a resolution of 1.05 Å (PDB code 3WMV) [13]. The analysis of contacts between CGL and ligands, in silico mutagenesis, molecular docking and visualization of the results were carried out with the Ligand Interaction and Dock modules of the MOE 2018.01 program [22]. The crystal structures of CGL complexes with galactose (PDB 5F8W), galactosamine (PDB 5F8Y) and globotriose Gb3 (PDB 5F90), and the trisaccharide motif GalNAcα1-3Gal [Fucα1-2] from porcine stomach mucin (PSM-trisaccharide), which is identical to the terminal trisaccharide of the human histo-blood group A antigen (HBGA A-trisaccharide) (PDB 2WMI) [23], were used in the docking analysis. Molecular docking of the PSM-trisaccharide GalNAcα1-3Gal [Fucα1-2] with CGL was carried out using the complex with galactosamine (PDB 5F8Y) as a template. The ligand binding energy (the molecular mechanics generalized Born interaction energy) was the non-bonded interaction energy between the receptor and the ligand and comprised the van der Waals, Coulomb and generalized Born implicit solvent interaction energies [24]. The change in the binding energy of the CGL mutants with galactose or globotriose was calculated as ΔE = Emut − Ewt. The results were obtained with the use of the equipment of the IACP FEB RAS Shared Resource Center "Far Eastern Computing Resource" (https://cc.dvo.ru).
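In symbols, the two energy definitions given in this section can be written as follows. The subscripted names are shorthand introduced here for the three non-bonded terms listed in the text; they are not notation taken from MOE.

$$
E_{\text{bind}} = E_{\text{vdW}} + E_{\text{Coul}} + E_{\text{GB}}, \qquad \Delta E = E_{\text{mut}} - E_{\text{wt}}
$$

Here $E_{\text{mut}}$ and $E_{\text{wt}}$ are the values of $E_{\text{bind}}$ for the mutant and the wild-type CGL with the same ligand. Assuming the usual convention that a more negative interaction energy corresponds to stronger binding, a positive $\Delta E$ would indicate that the mutant binds the ligand more weakly than the wild type.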
