*2.5. Nb23 Structural Features*

The ten best Nb23 structures from energy minimization were deposited in the PDB (PDB ID 7EH3) and are henceforth referred to as the NOE-restrained best cluster. The first structure of the NOE-restrained best cluster is shown in Figure 4. The dispersion of the structures within this cluster was assessed by Cα-RMSD. The average Cα-RMSD with respect to the best structure was 1.57 ± 0.32 Å. Excluding the CDR3 (residues 101–117), which is expected to be more mobile and is the most variable part of immunoglobulin domains, and residues 1, 2, and 129, the Cα-RMSD was instead 1.23 ± 0.30 Å, highlighting the extent of the CDR3 contribution. An overlay of the backbone of the NOE-restrained best cluster is shown in Figure 5a. The corresponding β-structure content, detailed in Table 3 for each element of the cluster, can be compared to the experimental data from the ∆δ <sup>13</sup>Cα − ∆δ <sup>13</sup>Cβ chemical shift indexing (CSI) analysis and the TALOS-N assessment of secondary structure content shown in Figure 3. The superposition of the CS-Rosetta ensemble displayed in Figure 5b highlights the much larger dispersion of the CDR3 region with respect to the NOE-restrained best cluster. A visualization of the positions of the β-strands is shown in Figure 5c. The average β-structure content of the NOE-restrained best cluster is 40.9%, which is lower than the CSI and TALOS-N estimates. Structure 3 (43.4% β-structure content) and especially Structure 8 (46.5% β-structure content) exhibit better and very similar overlap with the CSI, TALOS-N, and CS-Rosetta models, while the remaining conformers of the ensemble have a lower β-structure content than that inferred from the CSI and TALOS-N analyses. It is possible that proper β-structure did not appear in the fragments highlighted in Figure 4 due to the relatively low number of constraints found for Nb23.
Given that the β-strand content scores from CSI, TALOS-N, and CS-Rosetta modeling all indicate higher values, in analogy with the evidence from CD, the β-structure content of the NOE-restrained best cluster may be underestimated. However, the absence of inter-strand NOEs, especially at the edges of the sheets and primarily involving backbone residues, also suggests the occurrence of loose geometry in solution, as observed with isolated immunoglobulin motifs in solution [8,10].
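To make the dispersion metric concrete, the following minimal Python sketch shows one way a Cα-RMSD relative to a reference conformer could be computed after optimal (Kabsch) superposition, with and without a set of excluded residues. It is an illustration only and not the procedure used for the values reported here: the function names `kabsch_rmsd` and `ensemble_rmsd`, the array layout (one Cα per residue, residue *n* stored at index *n* − 1), and the use of the sample standard deviation are all assumptions.

```python
import numpy as np

def kabsch_rmsd(P, Q):
    """RMSD between two (N, 3) coordinate sets after optimal superposition."""
    P = np.asarray(P, dtype=float)
    Q = np.asarray(Q, dtype=float)
    # Remove the translational component by centering both sets.
    Pc = P - P.mean(axis=0)
    Qc = Q - Q.mean(axis=0)
    # Kabsch algorithm: SVD of the covariance matrix gives the best rotation.
    U, _, Vt = np.linalg.svd(Pc.T @ Qc)
    # Guard against an improper rotation (reflection).
    d = 1.0 if np.linalg.det(Vt.T @ U.T) >= 0 else -1.0
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    diff = Pc @ R.T - Qc
    return float(np.sqrt((diff ** 2).sum() / len(P)))

def ensemble_rmsd(models, ref_index=0, exclude=()):
    """Mean and sample SD of the Ca-RMSD of each model to a reference model.

    models  : array of shape (M, N, 3), Ca coordinates of M conformers
    exclude : 1-based residue numbers left out of fit and RMSD (assumption:
              residue n is stored at index n - 1)
    """
    models = np.asarray(models, dtype=float)
    excluded = set(exclude)
    keep = np.array([i for i in range(models.shape[1]) if (i + 1) not in excluded])
    ref = models[ref_index][keep]
    values = [kabsch_rmsd(m[keep], ref)
              for k, m in enumerate(models) if k != ref_index]
    # Sample standard deviation (ddof=1) is an assumption of this sketch.
    return float(np.mean(values)), float(np.std(values, ddof=1))

# Residues excluded in the text: CDR3 (101-117) plus residues 1, 2, and 129.
EXCLUDED = set(range(101, 118)) | {1, 2, 129}

if __name__ == "__main__":
    # Synthetic coordinates only; these are NOT the 7EH3 models.
    rng = np.random.default_rng(0)
    base = rng.normal(scale=10.0, size=(129, 3))
    demo = np.stack([base + rng.normal(scale=0.8, size=base.shape) for _ in range(10)])
    print(ensemble_rmsd(demo))
    print(ensemble_rmsd(demo, exclude=EXCLUDED))
```

With real data, the `(M, N, 3)` array would be filled with the Cα coordinates of the deposited conformers, and the lowest-energy model would be passed as `ref_index`.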

**Figure 4.** The best Nb23 structure from energy minimization of the NOE-restrained PONDEROSA-C/S models. The structure is the lowest-energy conformer of the NOE-restrained best cluster deposited in the PDB (7EH3). It has the general features of a variable immunoglobulin domain, with the characteristic extended CDR3 of nanobodies, which for Nb23 shields the solvent-exposed hydrophobic sidechains of Phe37, Phe47, Ile51, and Trp119. The β-strand content in the NOE-restrained best cluster is under-represented with respect to the analogous content of the CS-Rosetta structure ensemble. Red highlights the fragments that are extended but devoid of regular β-structure. Table 3 shows the positions of the β-strands for each structure of the NOE-restrained best cluster.

**Figure 5.** (**a**) An overlay of the Nb23 backbone of the NOE-restrained best cluster. The Cα-RMSD with respect to the best structure was 1.57 ± 0.32 Å, with substantial conformational dispersion localized in the CDR3 (highlighted in red). By excluding the CDR3 and residues 1, 2, and 129 from the alignment, the Cα-RMSD was 1.23 ± 0.30 Å. (**b**) An overlay of the Nb23 backbone of the CS-Rosetta ensemble. The Cα-RMSD with respect to the lowest-energy structure was 3.42 ± 2.12 Å. By excluding the CDR3 (highlighted in blue), the Cα-RMSD was 1.53 ± 0.99 Å, calculated over the fragments 1–102 and 117–122. The conformational dispersion at the CDR3 is much more pronounced than the spread of the corresponding region in the NOE-restrained best cluster. (**c**) A visualization of the positions of the β-strands, lettered in white or grey. The only whole strand missing (C") is highlighted in red, and the protein termini are shown in black.


**Table 3.** β-structure content of the calculated Nb23 structures.

\* The A and G strands are composed of two separate β-segments as per the CSI and TALOS-N analyses. A dash (-) indicates the absence of a particular segment in the corresponding NOE-based Nb23 structures.

A different assessment of this scenario may come from an evaluation of the structural data that were obtained by CS-Rosetta or by NOE-restrained energy-minimization modeling, based on the recently proposed ANSURR method [29]. According to this validation approach, the accuracy of an NMR structure cannot be inferred from the spread of the final conformation ensemble, which reflects only the precision of the determination. The structural dispersion must be coupled to the correlation between the CSI and the flexibility of the molecule, as scored by software suites that exploit prior knowledge from data banks and/or neural networks. The ANSURR evaluation tested on decoys and real structures shows an interesting diversification between prevalently helical proteins and prevalently β proteins, with the former exhibiting a much higher flexibility-CSI correlation score than RMSD score, and the latter showing the opposite, i.e., a higher RMSD score than flexibility-CSI correlation score. The ANSURR evaluation of the CS-Rosetta ensemble appears to feature, to some extent, the characteristics of the prevalently β-structured proteins, with average correlation and RMSD scores of 24 ± 15 and 89 ± 11, respectively. Conversely, the NOE-restrained energy-minimized models exhibit unsatisfactory average correlation and RMSD scores of 9 ± 6 and 12 ± 6, respectively. A graphical presentation of the ANSURR results is reported in the Supplementary Materials (Figure S3). The close Cα-RMSD values of the CS-Rosetta ensemble (1.53 ± 0.99 Å) and the NOE-restrained best cluster (1.57 ± 0.32 Å) seem to conflict with the RMSD scores of ANSURR, which appear satisfactorily high, as expected for β-rich proteins, only for the CS-Rosetta ensemble. The CSI-flexibility correlation score also shows an appreciable difference between the CS-Rosetta and the NOE-restrained ensembles.
Given the identity of the sequence and the associated chemical shift list, with the consequent flexibility estimates, the difference in CSI-flexibility correlation between the two ANSURR assessments must be related to the different β-structure content of the two ensembles, namely the small deviations from regular geometry in the NOE-restrained ensemble shown in Figure 4, which prevent classification as β-structure and therefore conflict with the local CSI. Even with a modest CSI-flexibility correlation score and a structural dispersion equivalent to that of the NOE-restrained best cluster, the CS-Rosetta cluster reaches the typically large RMSD score of the β-rich proteins.

No helical segments were identified from the ∆∆δ <sup>13</sup>Cα − ∆∆δ <sup>13</sup>Cβ chemical shift indexing analysis, although TALOS-N predicted four helical segments. Four of the NOE-restrained minimized structures have a right-handed helical fragment between residues 29 and 31. This fragment coincides with the putative CDR1 loop, and the recurrent three-residue helix in the structures could be an indication of a 3<sub>10</sub>-helical segment, which has a characteristic three-residue turn. The carbonyl oxygen of Thr28 (i) seems to face the HN of Ser31 (i + 3) at an average distance of 2.4 Å. The remaining structures have a helically shaped loop at the same location; however, no secondary structure element was assigned for those structures. A similar helical segment is formed in eight of the ten structures of the NOE-restrained best cluster, between residues 62 and 64, with the carbonyl oxygen of Thr61 facing the HN of Val64. There is also a three-residue helical tract, i.e., a helical turn, where the carbonyl oxygen of Lys87 (i) seems to face the HN of Asp90 (i + 3) at an average distance of 2.1 Å, with the residues completing a full turn. This is possibly also a 3<sub>10</sub>-helix. One segment in helical conformation is present in all of the NOE-restrained best cluster structures, in the supposed CDR3 loop, from position 107 to 111 (107–109 for one structure). This segment is in a right-handed α-helical conformation, where the carbonyl oxygen of Thr107 (i) faces the HN of Thr111 (i + 4) at an average distance of 2.4 Å. The residues complete a full turn consistent with an α-helical segment. Another segment in helical conformation can be found in five of the structures between positions 113 and 115. In this segment, the carbonyl oxygen of Arg112 (i) faces the HN of Asn115 (i + 3) at an average distance of 2.1 Å, i.e., a geometry that is consistent with a 3<sub>10</sub>-helix.
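The reasoning used above to distinguish the helix types relies on the offset between the carbonyl acceptor (residue i) and the amide donor (residue i + n): i → i + 3 hydrogen bonds are characteristic of 3<sub>10</sub>-helices and i → i + 4 of α-helices. The short Python sketch below encodes that rule and applies it to the contacts listed in the text. The function name `helix_type_from_hbond` and the `CONTACTS` table are hypothetical, and a single hydrogen-bond offset is only suggestive of a helix type, not a full secondary structure assignment.

```python
def helix_type_from_hbond(acceptor_residue, donor_residue):
    """Suggest a helix type from the C=O(i) ... H-N(i+n) residue offset."""
    offset = donor_residue - acceptor_residue
    # Standard hydrogen-bond patterns of helical turns.
    patterns = {3: "3_10", 4: "alpha", 5: "pi"}
    return patterns.get(offset, "unclassified")

# Contacts quoted in the text: (acceptor C=O residue, donor HN residue).
CONTACTS = {
    "Thr28-Ser31": (28, 31),
    "Thr61-Val64": (61, 64),
    "Lys87-Asp90": (87, 90),
    "Thr107-Thr111": (107, 111),
    "Arg112-Asn115": (112, 115),
}

if __name__ == "__main__":
    for label, (acceptor, donor) in CONTACTS.items():
        print(label, helix_type_from_hbond(acceptor, donor))
```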

Figure 6 shows the orientation and surface of the CDR loops for the first structure of the NOE-restrained best cluster. The orientation of the CDR3 is of particular interest, given its length and the degree of mobility at the beginning of the loop evidenced by the <sup>15</sup>N{<sup>1</sup>H} NOE analysis. Hence, several different orientations for the CDR3 were, in principle, possible. This is also reflected in the CS-Rosetta-generated models, where the β-core of the structure is very similar for each model while the CDR3 has a different conformation for each model. The CDR3 of the PONDEROSA-C/S energy-minimized structures included in the cluster has instead a more consistent conformation, with limited variations in the CDR3 relative to the CS-Rosetta models (Figure 5a,b). Fundamental to the orientation of the CDR3 in the NOE-restrained best cluster are the NOEs between Arg50 in β-strand C' and Tyr104 of the CDR3. This interaction, readily detectable in the NOE spectra, suggests a possible cation–π electrostatic interaction [30] between the Arg50 sidechain and the aromatic ring of Tyr104, which would partially keep the loop in a more defined orientation. Interestingly, position 104 of Nb24, the aforementioned nanobody whose binding properties toward the β2m mutants are similar to those of Nb23, is occupied by a cysteine that forms a disulfide bond with Cys33 of β-strand C, essentially freezing the loop in a rigid conformation in Nb24. Position 33 is structurally arranged to be adjacent to position 50. Therefore, the cation–π interaction of Nb23 could substitute for the Cys33-Cys104 disulfide bridge of Nb24. One possible orientation of the sidechains of Arg50 and Tyr104 in Nb23 is shown in Figure 7, where the Arg50 sidechain faces the aromatic ring, making the cation–π interaction possible [30].
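A simple geometric check can help judge whether a given conformer is compatible with the proposed cation–π arrangement. The Python sketch below computes the distance between the centroid of an aromatic ring and a cationic atom (for instance the guanidinium carbon of an arginine), together with the angle between the ring normal and the centroid-to-cation vector. It is a hedged illustration, not the analysis performed here: the function name `cation_pi_geometry`, the choice of atoms, and the 6.0 Å distance cutoff are assumptions, and the coordinates in the demo are an idealized hexagon rather than the Nb23 models.

```python
import numpy as np

def cation_pi_geometry(ring_atoms, cation_atom, max_distance=6.0):
    """Distance and angle describing a candidate cation-pi contact.

    ring_atoms   : (6, 3) coordinates of the aromatic ring carbons
    cation_atom  : (3,) coordinate of the cationic atom (e.g., Arg CZ)
    max_distance : illustrative cutoff in angstroms (assumption of this sketch)
    """
    ring = np.asarray(ring_atoms, dtype=float)
    cation = np.asarray(cation_atom, dtype=float)
    centroid = ring.mean(axis=0)
    # The ring normal is the direction of least variance of the ring atoms.
    _, _, vt = np.linalg.svd(ring - centroid)
    normal = vt[-1]
    vec = cation - centroid
    distance = float(np.linalg.norm(vec))
    # Angle between the ring normal and the centroid-to-cation vector;
    # 0 degrees means the cation sits directly above the ring face.
    cos_angle = np.clip(abs(np.dot(vec, normal)) / distance, 0.0, 1.0)
    angle = float(np.degrees(np.arccos(cos_angle)))
    return {"distance": distance, "angle": angle,
            "within_cutoff": distance <= max_distance}

if __name__ == "__main__":
    # Idealized benzene-like hexagon in the xy-plane (C-C distance 1.39 A).
    angles = np.radians(np.arange(0, 360, 60))
    ring = np.column_stack([1.39 * np.cos(angles),
                            1.39 * np.sin(angles),
                            np.zeros(6)])
    print(cation_pi_geometry(ring, [0.0, 0.0, 4.0]))
```

Applied to each conformer of an ensemble, such a check would indicate how consistently the two sidechains adopt a face-on arrangement, bearing in mind that the cutoff is a heuristic.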

**Figure 6.** The CDRs of Nb23, with CDR1 in yellow, CDR2 in green, and CDR3 in salmon. The left column shows the cartoon representation of Nb23 without any sidechains. The central column shows the CDRs with sidechains (and only backbone for the β-core). The right column shows the surface of the protein with the CDRs highlighted. The predominance of the CDR3 in the antigen-binding site is evident, highlighting its importance in interacting with the antigen(s). Its orientation affects the size and shape of the antigen-binding site for the unbound nanobody, although the flexibility in residues 102–106 suggests that the CDR3 conformation may change as the nanobody binds its antigen(s).

**Figure 7.** The Tyr104 phenolic ring in the CDR3 and the Arg50 βCH<sub>2</sub> in β-strand C' of Nb23 show proximity as per the assigned NOE constraints. This indicates the possible presence of a guanidinium–π interaction, partially keeping the CDR3 in a defined orientation. The cartoon shows one of the arrangements of the residues in the NOE-restrained best cluster.
