#### *2.2. Overall Fold*

The structures were solved by molecular replacement, using *A. oryzae* amylase structures as templates (pdb-ID: 7taa and 3vx0), to a resolution of 1.4 Å for RpAM, 1.2 Å for TeAM and 1.35 Å for CfAM. The final model of RpAM includes two monomers in the asymmetric unit comprising residues 1 to 438 in both chains, which superpose on each other with an r.m.s.d. of 0.54 Å. The model of TeAM contains one monomer in the asymmetric unit including residues 1 to 438. For CfAM, there are two monomers in the asymmetric unit comprising residues 19 to 459 for chain A and 19 to 460 for chain B, which superpose with an r.m.s.d. of 0.3 Å. All three amylases have the classical domain structure with a central (β/α)8-barrel with the active site located on its C-terminal face, together with a small subdomain inserted between the third strand and helix, and a C-terminal β-sandwich (Figure 2a). All three superpose with each other (Figure 2b) and with TAKA-amylase with an r.m.s.d. between 0.6 and 0.9 Å for up to 423 residues. Two conserved disulphide bridges stabilize flexible loops in subdomains A and B. There is an additional disulphide bridge in CfAM, located in the C-terminal domain. All three α-amylases have the conserved canonical calcium binding site located between the (β/α)8-barrel and the insertion domain B.
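For readers who want to reproduce the kind of r.m.s.d. values quoted above, the following is a minimal sketch of the calculation over matched atom pairs (typically Cα atoms). It assumes the two coordinate sets have already been optimally superposed and paired residue by residue; the superposition step itself is not shown, and the function name `rmsd` and the toy coordinates are illustrative only, not taken from the deposited structures or from the software used in this work.

```python
import math
from typing import Sequence, Tuple

Point = Tuple[float, float, float]


def rmsd(coords_a: Sequence[Point], coords_b: Sequence[Point]) -> float:
    """Root-mean-square deviation between two paired coordinate lists.

    Assumes both lists are already superposed and that element i of
    coords_a corresponds to element i of coords_b.
    """
    if len(coords_a) != len(coords_b):
        raise ValueError("coordinate lists must have the same length")
    if not coords_a:
        raise ValueError("coordinate lists must not be empty")

    # Sum of squared distances over all matched atom pairs.
    total = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        total += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2

    return math.sqrt(total / len(coords_a))


# Toy example (hypothetical coordinates in Å): chain B is chain A
# displaced by 0.5 Å along z, so every pair is 0.5 Å apart.
chain_a = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0)]
chain_b = [(0.0, 0.0, 0.5), (3.8, 0.0, 0.5), (7.6, 0.0, 0.5)]
print(f"{rmsd(chain_a, chain_b):.2f} Å")  # prints "0.50 Å"
```

Because the value depends on which residues are paired, an r.m.s.d. is only meaningful together with the number of matched residues, as in "for up to 423 residues" above.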

**Figure 2.** Structural overviews. (**a**) ribbon representation of the structure of CfAM. The domains are colored separately, with the central barrel in purple, subdomain B in yellow and the C-terminal β-sandwich in green. The bound ligands acarbose transglycosylation product (ATgp) and maltose are shown as spheres; (**b**) structural superposition of CfAM (purple), TeAM (orange) and RpAM (green).

#### *2.3. Ligand Binding Site*

Although all three amylases were co-crystallized with acarbose, a well-known inhibitor of amylases, a complex with bound acarbose was only obtained for TeAM and CfAM. The reason why acarbose was not bound to RpAM is not clear. As expected, the acarbose was found in the substrate binding cleft in each monomer of TeAM and CfAM, with the acarviosine unit sitting in subsites −1 and +1 (Figure 3a–d). In both enzymes, the binding mode is conserved, and the ligands superpose with each other (Figure 3e), except for the unit in subsite −4. The distorted pseudosugar valienamine in subsite −1, with its 2H3 half-chair conformation, mimics the conformation of the putative transition state along the catalytic itinerary of α-amylases. Interestingly, additional density at the non-reducing end, in subsites −2, −3 and −4, was modelled as a second acarbose unit covalently attached to the first acarbose. The catalytic nucleophile D190/D192 (CfAM/TeAM) is in a near-attack conformation poised to react with the anomeric carbon, whilst the catalytic acid/base E214/E216 (CfAM/TeAM) forms a hydrogen bond with the bridging nitrogen of the glycosidic bond with the 4-deoxyglucose in subsite +1. In addition, a hydrogen bond with H194/H196 stabilizes the 4-deoxyglucose in that subsite. The +3 subsite is formed by the sugar tong, composed of Y142/Y144 of subdomain B and F216/F218 of the central domain, sandwiching the glucose between them. The reducing end of acarbose is stabilized by a hydrophobic platform interaction with Y240/F242 and a hydrogen bond with the main chain nitrogen of G218/G220. In the additional acarbose unit at the non-reducing end, the glucose in subsite −2 is stabilized by multiple hydrogen bonds with D323/D325, R327/R329 and W375/W377. The glucose in subsite −3 is held in place by only one hydrogen bond with D323/D325. The last visible part of the acarbose molecule is the acarviosine unit in subsite −4, which is not stabilized by direct interactions with the protein. Furthermore, the acarviosine unit is in two different positions in the two structures, reflecting the lack of strong stabilizing interactions between the ligand and the protein beyond subsite −3 (Figure 3e).

**Figure 3.** Acarbose transglycosylation product binding in CfAM and TeAM. (**a**,**b**) stick representation of the acarbose-derived transglycosylation product in the substrate binding crevice of CfAM and TeAM, respectively. The 2Fo-Fc electron density around the ligands is contoured at 0.3 e/Å³. The interacting residues are shown as cylinders. (**c**,**d**) hydrogen bonding pattern between ATgp and CfAM and TeAM in the active site. (**e**) stereo view of the overlay of the binding crevice of CfAM (purple) and TeAM (orange). The residues and the ligands overlap very closely, with the only major difference being the orientation of the acarviosine subunit in subsite −4.

#### *2.4. Secondary Glucose Binding Site*

In CfAM, a secondary binding site in domain C was identified and modelled as maltose located at the edge of the β-sandwich (Figure 4). The glucose units are held in place mainly via hydrogen bonds without the usual stacking interactions with aromatic side chains.

**Figure 4.** The secondary maltose binding site in the C-terminal domain of CfAM. (**a**) stereo view showing the maltose in cylinder representation with the corresponding 2Fo-Fc electron density contoured at 0.4 e/Å³. The interacting residues are shown as blue cylinders; (**b**) superposition of the C-terminal domain (green) with the CBM20 domain from *A. niger* glucoamylase (pdb-ID: 1ac0) in beige. The bound β-cyclodextrin of CBM20 and the maltose unit are shown as glycoblocks [15].

#### *2.5. N-Glycosylation*

There are three N-glycosylation sites, one at N144 in RpAM and two at N180 and N412 in TeAM. We observed only the core GlcNAc residue in all three enzymes. In the case of TeAM, this is due to the deglycosylation procedure with EndoH.

#### *2.6. Isoaspartate Formation*

We observed the formation of an isoaspartate by succinimide formation and deamidation of N120 in chain B of RpAM. The same asparagine in chain A shows high flexibility and the resulting density suggests partial isoaspartate formation, but a model could not be built with confidence.

### **3. Discussion**

We have structurally and functionally analyzed three novel fungal α-amylases with potential for use in the food industry and other industrial processes. All three structures show the canonical amylase fold and superpose on each other with an r.m.s.d. between 0.6 and 0.9 Å (Figure 2b). Further analysis of the sequences showed that both RpAM and CfAM have a slightly lower number of charged residues and a higher number of hydrophobic residues compared to TeAM and TAKA-amylase, which might contribute to the higher thermostability of these two variants. Increased internal hydrophobicity combined with retained external hydrophilicity was found to correlate well with the thermostability of *Bacillus* α-amylases [16]. Furthermore, the shortened loops in these enzymes may also contribute to the overall rigidity of the enzymes and therefore to their thermostability, as observed for other enzymes as well [17,18].
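The comparison of charged and hydrophobic residue content can be illustrated with a short sketch that computes both fractions from a one-letter amino acid sequence. The residue classes used here (D, E, K, R as charged; A, V, L, I, M, F, W as hydrophobic) are an assumption made for illustration, since the classification used for the analysis above is not specified, and the function name `composition` and the toy sequence are hypothetical.

```python
# Assumed residue classes; other classifications (e.g. counting H as
# charged, or C, Y and P as hydrophobic) would give different numbers.
CHARGED = set("DEKR")
HYDROPHOBIC = set("AVLIMFW")


def composition(sequence: str) -> dict:
    """Return the fractions of charged and hydrophobic residues."""
    # Remove whitespace and line breaks, and normalise to upper case.
    seq = "".join(sequence.split()).upper()
    if not seq:
        raise ValueError("sequence must not be empty")

    n_charged = sum(1 for aa in seq if aa in CHARGED)
    n_hydrophobic = sum(1 for aa in seq if aa in HYDROPHOBIC)
    return {
        "charged": n_charged / len(seq),
        "hydrophobic": n_hydrophobic / len(seq),
    }


# Toy sequence, not one of the amylases discussed here.
fractions = composition("DEKRAVLIGS")
print(fractions)  # {'charged': 0.4, 'hydrophobic': 0.4}
```

Sequence-level fractions of this kind do not distinguish buried from surface-exposed residues, so they can only hint at the internal hydrophobicity discussed for the *Bacillus* enzymes.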

The substrate crevice in all three amylases, if defined on the basis of protein-carbohydrate interactions, spans from subsite −3 to +3. Having only three defined subsites for the non-reducing end is common for amylases and is in line with the number of donor subsites described for TAKA-amylase. Potentially, there could be more subsites for additional carbohydrate units at the reducing end, which might connect the active site crevice with the observed second binding site (see below).

The observed complexes are most likely the result of limited transglycosylation, an unusual side reaction previously reported *in crystallo* for several amylases, for example TAKA-amylase [19]. Though this reaction is common in the closely related CGTases (GH13\_2) and amylomaltases (GH77), it was not observed in solution for α-amylases. However, in crystals, transglycosylation products with 10 or more units have been reported as a result of multiple transglycosylation events. Interestingly, the final complex always has the pseudosaccharide unit, thought to mimic the transition state, in the −1 subsite, rendering the enzyme inactive. Other binding modes are clearly possible, as evidenced by the final product and a pre-Michaelis complex observed for GH77 *Thermus aquaticus* amylomaltase with acarbose [20].

All three amylases have as their hallmark a shortened loop between β2/α3 and two shorter loops in subdomain B located between β3 and α4 of the central (β/α)8-barrel, compared to structures of other fungal amylases, e.g., TAKA-amylase (Figure 5a). The importance of subdomain B for physicochemical properties such as pH stability, as well as for substrate and product specificity, is well known [21–24]. Indeed, the shorter loops open up the substrate crevice on the non-reducing end (Figure 5b), which might explain the shift in the product profile for all three amylases towards oligomers with a higher dp compared to TAKA-amylase (Figures 1c and 5c).

The C-terminal domain in α-amylases is implicated in starch binding and shows structural similarity to classic CBM domains, based on an analysis using PDBeFOLD [25]. The additional binding site in this domain in CfAM strengthens the case for a role of this domain in substrate binding. Additional carbohydrate binding sites have also been observed in other amylases, for example in barley α-amylase 1 [26]. While none of these sites overlap with the binding site seen in CfAM, a structure of a CBM20 in complex with β-cyclodextrin revealed two binding sites, with the site termed SB1 in close proximity to the binding site in CfAM (Figure 4b) [27]. This was confirmed to be the primary binding site for the interaction with raw starch, and it is likely that the observed binding site in CfAM is a genuine carbohydrate binding site. Furthermore, it is intriguing to speculate about a potential path from the primary substrate crevice to the secondary glucose binding site, which could readily be envisaged as a simple extension of the acarbose from the reducing end.

Only limited information about the influence of glycosylation on amylase activity is available. It was shown that, for the α-amylase Amy1 from the yeast *Cryptococcus flavus*, N-glycosylation enhances thermostability and resistance to proteolytic degradation [28]. The same effect is observed for *Trichoderma reesei* Cel7a [29]. Indeed, N144 of RpAM is located in an extended loop, and N-glycosylation might help to shield the loop against proteolytic attack. The other two glycosylation sites, N180 and N412 of TeAM, are located in or at the beginning of secondary structure elements, with N412 being located in the C-terminal domain.

Isoaspartate formation is usually thought to be an age-related side effect of protein decomposition, but a functional role cannot be ruled out [30]. Indeed, it was shown in GH77 enzymes that such unusual posttranslational rearrangements might play a functional role in glycoside hydrolases [31,32]. The observed isoaspartate is located in one of the shortened loops in subdomain B close to the substrate binding cleft, suggesting a functional role in RpAM as well.

**Figure 5.** (**a**) Stereo view of all three amylases compared to TAKA-amylase, with the three shortened loops in the front marked with arrows. The ligand in CfAM is shown as sticks to identify the active site; (**b**) surface representation of CfAM with the bound ligand. The substrate crevice is more open at the donor subsites; (**c**) surface representation of TAKA-amylase. The elongated loops create a more restricted active site crevice, precluding the binding mode observed in CfAM and TeAM due to steric clashes.

### **4. Materials and Methods**
