1. Introduction
Monitoring and describing molecular dynamics is an important area of investigation in modern physical chemistry. Internal and global motions in solution affect, directly or indirectly, most spectroscopic methods aimed at characterizing non-rigid molecules, such as Nuclear Magnetic Resonance (NMR) relaxation [1,2,3], fluorescence anisotropy decay [4] and time-resolved X-ray experiments [5], as well as single-molecule experiments such as site-directed spin-labelled electron spin resonance [6,7], Förster fluorescence resonance energy transfer [8] and atomic force microscopy [9].
In particular, NMR spectroscopy is a powerful experimental technique for observing the dynamic properties of macromolecules. Among the macroscopic physical observables are the relaxation times T1 and T2 and the heteronuclear nuclear Overhauser effect (NOE) of 15N, 2H and 13C nuclei, which are extremely sensitive to molecular motions. These observables make it possible to probe localized dynamics (e.g., conformational motions in the active site of a protein) and to build a spatially resolved map of macromolecular flexibility. However, interpretative tools can be complex owing to several factors, such as (i) the need to take into account diverse kinds of interactions, e.g., dipolar 15N and 13C and quadrupolar 2H interactions, and (ii) the coupling between global reorientation and large-amplitude motions of entire domains, as well as limited local readjustments and restricted single-residue motions. In general, different spectroscopic techniques probe different physical observables, which in turn provide information on motions taking place on different time scales. It is therefore particularly important to introduce sets of coordinates adapted to the observable involved in a particular experimental approach. This consideration is especially relevant in the case of NMR relaxation [1,10], for which interpretative methods for internal relaxation processes were introduced early on in the form of adaptable simple spectral densities, as in the Lipari–Szabo (LS) approach [11,12], or later in the form of explicit dynamic models, as in the Slowly Relaxing Local Structure (SRLS) model [13,14].
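For reference, the LS spectral density is usually written in the two-term form below, in which the generalized order parameter $S^2$, the global correlation time $\tau_m$ and the effective internal correlation time $\tau_e$ are the adjustable quantities. The notation is the conventional one from the LS literature and is an assumption here, since this paper does not spell it out.

```latex
J(\omega) = \frac{2}{5}\left[
  \frac{S^{2}\,\tau_m}{1+\omega^{2}\tau_m^{2}}
  + \frac{\left(1-S^{2}\right)\tau}{1+\omega^{2}\tau^{2}}
\right],
\qquad
\frac{1}{\tau} = \frac{1}{\tau_m} + \frac{1}{\tau_e}
```

The first term describes the overall tumbling scaled by the degree of spatial restriction of the local motion, and the second term collects the faster internal contribution.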
A rational approach to the in silico interpretation of the relaxation data of flexible molecules requires two distinct elements. First, a precise geometrical analysis is needed to relate the dynamic model properly to a set of experimentally observable quantities. In particular, care should be taken to account for the tensorial nature of the spectroscopic interactions by defining proper local frames of reference. Second, the relaxation times or other observables are linked to time correlation functions/spectral densities of a specific nature, which can be evaluated on the basis of the dynamic model itself. The latter can range from a fully atomistic molecular dynamics (MD) simulation to simplified semi-analytical expressions for the correlation functions. At an intermediate level of complexity, several approaches have been devised based on various approximations. For instance, one can make the simplifying assumption that local motions are due, at least for semi-rigid systems, to a network of dynamically coupled neighbours (network model) [15,16] or are caused by partial diffusive reorientation within a local potential (SRLS) [14]. One can also assume specific statistical characteristics (diffusive or Brownian dynamics, fractional Brownian dynamics [17], etc.).
Recently, a systematic approach [18,19] has been proposed that combines a detailed definition of the molecular geometry with a correct description of the associated dynamical features. The method includes information on the geometry, topology and interactions of the molecule in a general stochastic model, which can be tailored to different levels of accuracy by introducing specific approximations based on time-scale separation arguments. A master equation can be obtained and, with suitable approximations, solved numerically. In particular, a basic implementation, named the Semi-Flexible Brownian (SFB) model, has been developed for the description of partially flexible macromolecules in solution. Until now, the SFB model has been applied only to model cases, and no calculations of directly measurable observables have been shown. In this paper, we present a full investigation of the performance of the SFB model in the evaluation of the 13C NMR relaxation parameters T1 and T2 and the heteronuclear NOE of several oligosaccharides, which were previously interpreted on the basis of ad hoc stochastic modelling. In particular, we discuss the computational strategy and implementation of the method and present detailed results, which show that the calculated NMR relaxation parameters are in satisfactory agreement with the experimental data. We suggest that this general approach can be safely applied to diverse classes of molecular systems, with minimal use of adjustable parameters.
The paper is organized as follows. Section 2 summarizes the basic features of the SFB model and its implementation. The main results are shown in Section 3. A discussion is provided in Section 4.
3. Results
We tested our method on a collection of oligosaccharides, for which previous analyses based on specific stochastic models were presented. The following systems are considered:
α-L-Rhap-α-(1→2)-α-L-Rhap-OMe (two residues, R2R); experimental data: Reference [28]
β-D-Glcp-(1→6)-α-D-[6-13C]-Manp-OMe (two residues, BGL); experimental data: Reference [29]
β-D-Glcp-(1→3)[β-D-Glcp-(1→2)]-α-D-Manp-OMe (three residues, GGM); experimental data: Reference [30]
α-D-Manp-(1→2)-α-D-Manp-(1→6)-α-D-[6-13C]-Manp-OMe (three residues, TRI); experimental data: Reference [31]
α-L-Fucp-(1→2)-β-D-Galp-(1→3)-β-D-GlcpNAc-(1→3)-β-D-Galp-(1→4)-D-Glcp (five residues, LNF); experimental data: Reference [31]
γ-cyclodextrin (eight residues, GCY); experimental data: Reference [32]
Estimates of the 13C NMR parameters T1, T2 and the NOE for selected CH and CH2 probes were calculated with SALEM starting from a local-minimum structure (Figures 2–7) and the related Hessian matrix. Both were obtained with the Tinker 8 program package [27] using the popular MM3 force field (FF) [33,34,35] (see References [36,37] for a review of several FFs for carbohydrates). Minimization was performed with the minimize tool of Tinker using an “RMS gradient per atom” criterion of 0.01 kcal/mol/Å. The Hessian matrix was calculated with the Tinker utility testhess. The hydrodynamic model used by SALEM to evaluate the friction tensor requires four quantities: the temperature, the local viscosity, the hydrodynamic boundary conditions and the effective radius of the atoms. Temperature and viscosity were set by the experimental conditions. For sugars in water or polar solvents, stick boundary conditions can be considered appropriate. The only true free parameter was therefore the effective radius.
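To illustrate how temperature, viscosity and a radius enter a hydrodynamic estimate, the minimal sketch below uses the Stokes–Einstein–Debye relation for a single sphere with stick boundary conditions. It is a deliberately simplified stand-in and not the friction-tensor calculation performed by SALEM, which works on the full molecular geometry. The function names and the numerical values (water-like viscosity, 5 Å sphere) are assumptions chosen only for illustration.

```python
import math

K_B = 1.380649e-23  # Boltzmann constant, J/K


def rotational_diffusion_sphere(temperature_k, viscosity_pa_s, radius_m):
    """Stokes-Einstein-Debye rotational diffusion coefficient (1/s).

    Assumes a single sphere with stick boundary conditions, for which the
    rotational friction is 8 * pi * eta * R^3.
    """
    friction = 8.0 * math.pi * viscosity_pa_s * radius_m ** 3
    return K_B * temperature_k / friction


def correlation_time(d_rot):
    """Rank-2 rotational correlation time (s) of an isotropic rotor: 1 / (6 D)."""
    return 1.0 / (6.0 * d_rot)


# Illustrative (hypothetical) inputs: 298 K, 0.89 mPa s, 5 Angstrom sphere.
d_rot = rotational_diffusion_sphere(298.0, 0.89e-3, 5.0e-10)
tau = correlation_time(d_rot)
print(f"D_rot = {d_rot:.3e} 1/s, tau = {tau:.3e} s")
```

In this simple picture the diffusion coefficient scales linearly with temperature, inversely with viscosity and with the inverse cube of the radius, which is why the effective radius is such a sensitive parameter.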
For each system, the optimal effective radius was determined as the value providing the lowest sum of squared percentage deviations for T1, T2 and the NOE over the entire set of experimental data available for that system. Tables 1–6 report, for each set of available experimental data (typeset in boldface in the tables), the optimal theoretical estimates together with the associated optimal effective radius and the percentage deviations from the experimental T1, T2 and NOE (e(T1), e(T2) and e(NOE), respectively).
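The selection criterion described above can be sketched as a one-dimensional scan over candidate radii. In the sketch below, `toy_model` is a purely hypothetical placeholder for the actual relaxation calculation, and the "experimental" numbers are invented so that the example is self-contained. Only the objective function (sum of squared percentage deviations) follows the text.

```python
def percentage_deviation(calculated, experimental):
    """Signed percentage deviation of a calculated value from experiment."""
    return 100.0 * (calculated - experimental) / experimental


def objective(calculated, experimental):
    """Sum of squared percentage deviations over all observables."""
    return sum(
        percentage_deviation(c, e) ** 2 for c, e in zip(calculated, experimental)
    )


def best_radius(candidates, model, experimental):
    """Return (radius, score) for the candidate radius with the lowest objective."""
    scored = [(objective(model(r), experimental), r) for r in candidates]
    score, radius = min(scored)
    return radius, score


def toy_model(radius):
    """Hypothetical stand-in returning [T1, T2, NOE] for a given radius."""
    return [0.30 * radius, 0.25 * radius, 1.0 + 0.5 * radius]


# Invented reference data, ordered as [T1, T2, NOE].
experimental = [0.60, 0.50, 2.00]
candidates = [1.6 + 0.1 * i for i in range(7)]  # scan 1.6 to 2.2 Angstrom

radius, score = best_radius(candidates, toy_model, experimental)
print(f"optimal radius = {radius:.1f} Angstrom, objective = {score:.3g}")
```

A grid scan is used here only because there is a single free parameter; any one-dimensional minimizer would serve the same purpose.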
The analysis of the various oligosaccharides studied in this work allowed us to attempt a comparison of the model performance and of its sensitivity to some experimental parameters, with the caveat that a systematic investigation, which goes beyond the preliminary nature of this work, is still needed. The most important factor is temperature. Increasing the temperature lowers the viscosity and thus increases the principal values of the diffusion tensor. Assuming that the conformational free energy of the molecules does not change over the range of temperatures at which the experiments were carried out, a larger diffusion tensor shortens the characteristic correlation times, which move closer to the extreme narrowing limit. This, in turn, increases T1 and T2 and brings them much closer to each other (since the rotational anisotropy is low in small molecules). As the extreme narrowing limit is approached, the spectral densities tend to become proportional to the overall tumbling correlation time and thus lose sensitivity to the conformational dynamics, so that the approximations made for the internal motions matter less. Conversely, at lower temperatures and larger viscosities the observables are more sensitive to the internal dynamics, and a worse performance of the model was observed, as could be expected. In the case of BGL at 253 K (Table 2), which combines a low temperature with a solvent of higher viscosity, we found agreement with the experimental data within a relative error of 20%, the worst result among the test calculations presented here. Simulations around 298 K and in solvents with a similar viscosity agreed with the experiments within 10%.
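The approach of T1 and T2 to a common value near the extreme narrowing limit can be checked with the textbook expressions for an isolated 13C-1H pair relaxed by the dipolar interaction only. The sketch below assumes an isotropic rigid rotor with a single correlation time, neglects chemical shift anisotropy and uses an assumed C-H distance of 1.09 Å and a field of 11.74 T (about 500 MHz for protons). None of these choices come from the SFB model, which uses a far more detailed description.

```python
MU0_OVER_4PI = 1.0e-7     # vacuum permeability / 4 pi, T m / A
HBAR = 1.054571817e-34    # reduced Planck constant, J s
GAMMA_H = 2.6752218744e8  # 1H gyromagnetic ratio, rad / (s T)
GAMMA_C = 6.728284e7      # 13C gyromagnetic ratio, rad / (s T)
R_CH = 1.09e-10           # assumed C-H distance, m


def spectral_density(omega, tau):
    """Isotropic rigid rotor: J(w) = (2/5) tau / (1 + w^2 tau^2)."""
    return 0.4 * tau / (1.0 + (omega * tau) ** 2)


def relaxation(tau, b0_tesla):
    """Return (T1, T2, NOE) for a 13C-1H pair, dipolar mechanism only."""
    w_h = GAMMA_H * b0_tesla
    w_c = GAMMA_C * b0_tesla
    d2 = (MU0_OVER_4PI * HBAR * GAMMA_H * GAMMA_C / R_CH ** 3) ** 2

    def j(omega):
        return spectral_density(omega, tau)

    r1_terms = j(w_h - w_c) + 3.0 * j(w_c) + 6.0 * j(w_h + w_c)
    r2_terms = 4.0 * j(0.0) + r1_terms + 6.0 * j(w_h)
    r1 = 0.25 * d2 * r1_terms
    r2 = 0.125 * d2 * r2_terms
    noe = 1.0 + (GAMMA_H / GAMMA_C) * (6.0 * j(w_h + w_c) - j(w_h - w_c)) / r1_terms
    return 1.0 / r1, 1.0 / r2, noe


# Shorter correlation times mimic higher temperature / lower viscosity.
for tau in (1.0e-9, 1.0e-10, 1.0e-11):
    t1, t2, noe = relaxation(tau, 11.74)
    print(f"tau = {tau:.0e} s: T1 = {t1:.3f} s, T2 = {t2:.3f} s, NOE = {noe:.3f}")
```

For the shortest correlation time the two relaxation times become practically equal and the NOE approaches its maximum value of 1 + γH/(2γC), as expected when all sampled spectral densities are proportional to the correlation time.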
The relaxation data of the pentasaccharide LNF were calculated for five probes located on the five different sugar rings. At fixed temperature and viscosity, NMR relaxation depends on the local geometry (i.e., how the C-H probes are oriented with respect to the principal axes of the diffusion tensor) and on the conformational free energy. In [38], the authors studied the effect of the shape of the R2R potential energy surface (PES) along one torsion angle on T1, T2 and the NOE. MD simulations showed that the PES was bistable. Calculating the NMR relaxation data with a harmonic approximation around the most important minimum led to a 10% error with respect to the calculation carried out with the bistable potential. For LNF, the reader is encouraged to inspect Figure 5 of [31], which reports the 2D PESs along the four (φ, ψ) pairs of dihedral angles connecting the different sugar units. The PES along the angles connecting rings C and D was bistable, while the other three surfaces could be considered, to a first approximation, harmonic. A one-to-one mapping between the probe position and the conformational free energy was not possible. Overall, the approximations made in the SFB model meant that a few of the estimated NMR relaxation data fell outside the experimental error (e.g., for the A and the E rings), and the overall agreement was 5–10% worse than that found with the diffusive chain stochastic model applied in the past [31].
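What a harmonic approximation around the most important minimum discards can be visualized with a one-dimensional toy example. The sketch below uses a hypothetical asymmetric double-well potential (in units of kT, along a dimensionless coordinate), expands it to second order around its deepest minimum and compares Boltzmann-averaged fluctuations for the two potentials. The functional form and all parameters are assumptions and are unrelated to the actual PESs of R2R or LNF.

```python
import numpy as np


def bistable(x):
    """Hypothetical asymmetric double well, energy in units of kT."""
    return 4.0 * (x ** 2 - 1.0) ** 2 + 0.8 * x


def harmonic_fit(potential, grid):
    """Second-order expansion around the deepest minimum found on the grid."""
    values = potential(grid)
    i = int(np.argmin(values))
    step = grid[1] - grid[0]
    # Central finite difference for the curvature at the minimum.
    curvature = (values[i + 1] - 2.0 * values[i] + values[i - 1]) / step ** 2
    return grid[i], values[i], curvature


def boltzmann_average(observable, energy):
    """Average of an observable with weights exp(-E) on a uniform grid."""
    weights = np.exp(-(energy - energy.min()))
    return float(np.sum(observable * weights) / np.sum(weights))


grid = np.linspace(-2.0, 2.0, 4001)
x0, v0, k = harmonic_fit(bistable, grid)

v_full = bistable(grid)
v_harm = v0 + 0.5 * k * (grid - x0) ** 2

mean_full = boltzmann_average(grid, v_full)
mean_harm = boltzmann_average(grid, v_harm)
var_full = boltzmann_average((grid - mean_full) ** 2, v_full)
var_harm = boltzmann_average((grid - mean_harm) ** 2, v_harm)

print(f"minimum at x0 = {x0:.3f}, curvature = {k:.2f}")
print(f"variance, full potential:     {var_full:.3f}")
print(f"variance, harmonic expansion: {var_harm:.3f}")
```

The harmonic expansion reproduces the local fluctuations inside the dominant well but misses the population of the second well entirely, so any observable that depends on jumps between wells is affected.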
The sensitivity of the method to the spectrometer frequency can also be discussed. Although, again, a trend cannot be strictly established (since many factors are at play at the same time: geometry, PES, temperature/viscosity, experimental setup), the simulations at lower frequencies tended to be more accurate than those at higher frequencies. This observation can be rationalized, in part, as follows: as the Larmor frequency increases, the NMR relaxation data are more affected by the tails of the spectral densities, which are in turn more sensitive to the fast-relaxing processes, i.e., the internal motions described by the simplified harmonic PES in the SFB model.
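The growing weight of fast processes at higher frequencies can be illustrated with a two-term spectral density of the Lipari–Szabo type, used here only as a simple surrogate for the SFB spectral densities. The order parameter and the two correlation times in the sketch below are hypothetical, and the three fields correspond roughly to 300, 500 and 700 MHz proton frequencies.

```python
GAMMA_C = 6.728284e7  # 13C gyromagnetic ratio, rad / (s T)


def fast_fraction(omega, s2, tau_m, tau_e):
    """Fraction of a two-term spectral density carried by the fast term.

    slow term: S^2 tau_m / (1 + w^2 tau_m^2)
    fast term: (1 - S^2) tau / (1 + w^2 tau^2), with 1/tau = 1/tau_m + 1/tau_e
    """
    tau = 1.0 / (1.0 / tau_m + 1.0 / tau_e)
    slow = s2 * tau_m / (1.0 + (omega * tau_m) ** 2)
    fast = (1.0 - s2) * tau / (1.0 + (omega * tau) ** 2)
    return fast / (slow + fast)


# Hypothetical dynamics: S^2 = 0.7, 500 ps tumbling, 20 ps internal motion.
for b0 in (7.05, 11.74, 16.44):
    omega_c = GAMMA_C * b0
    share = fast_fraction(omega_c, 0.7, 5.0e-10, 2.0e-11)
    print(f"B0 = {b0:5.2f} T: fast-term share of J(omega_C) = {100.0 * share:.2f}%")
```

Because the slow term decays with frequency sooner than the fast one, the share of the fast term increases monotonically with the field, so errors in the description of the internal motions weigh more at higher fields.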
Overall, the SFB model tended to perform better at higher temperatures, lower viscosities and lower spectrometer frequencies, and for molecules with limited internal flexibility. However, our purpose here was to propose the SFB approach as a general tool, with minimal parametrization, capable of interpreting diverse experimental observations without resorting to ad hoc hypotheses concerning the molecular geometry, internal PES, dissipative properties and so on. The cases presented in this study showed that, under the best conditions, relative errors within 5% were found, which were usually compatible with the experimental errors. The average percentage deviation over all calculations was 8.5, 5.8 and 7.7 for T1, T2 and the NOE, respectively. The maximum percentage deviation was 22.4, 13.6 and 18.5, respectively. Such results are significant if one considers the rather drastic approximations included in the SFB model. They confirm, at least for this class of systems, its capabilities, which could be considerably improved by lifting some approximations, for instance by going beyond the harmonic description of the internal PES and by upgrading the estimate of the friction tensor to include hydrophilicity/hydrophobicity effects. The latter suggestion is based on the observed values of the only free parameter of the model, the effective radius. The optimal value was found in the range 1.6–2.2 Å for all systems except LNF, for which a higher value of 3.2 Å was obtained, possibly because the molecule is particularly hydrophilic. This, in turn, would imply that the molecular dimensions should be increased to account for a layer of water surrounding the pentasaccharide.
4. Discussion
The results reported in the previous section show that the SFB model reproduces the observed relaxation times and NOEs of the oligosaccharides considered here with an average accuracy of 5–10%, which demonstrates its ability to capture the long-range dynamics of these systems. The study considered molecules of increasing size, and no significant drift in the performance of the model was found with increasing molecular dimensions. This observation is promising, as it suggests that the model could profitably be employed for large macromolecules, provided they can still be described as semi-flexible objects, i.e., molecules that mainly fluctuate about a minimum free energy structure (e.g., globular proteins).
The agreement with experimental data was in most cases within 10% relative error. In some cases, larger errors of up to 22% were observed. Such discrepancies were expected, since the free energy profiles along some of the sugars’ dihedral angles are bistable, while only harmonic energy profiles were used in SALEM. In the last case analysed here, GCY, as discussed in [32], the rotation of the hydroxymethyl group with respect to the sugar ring, described by a torsional angle (see Figure 7), features two energy minima, one of which is predominant. The Tinker energy minimization led to a structure with six out of eight units in the predominant conformation, and the results shown above were obtained by running SALEM on one of these six probes.
Still, the present general-purpose SFB model performed well when compared with results obtained directly from molecular dynamics simulations, especially considering the long trajectories required in the latter case [28,39]. As mentioned before, the only free parameter here was the effective radius, which was adapted case by case; as a rule of thumb, a starting value of 2 Å usually provided good agreement with the experiment.
Perspectives
The main purpose of this paper was to validate a general stochastic model, based on a minimal amount of assumed phenomenological information, on a class of similar molecular systems for which published NMR relaxation data are directly available for comparison. In our previous work [18,19], we formulated a systematic approach to describe the dynamics of a non-rigid molecule, derived from fundamental classical and statistical mechanics, in the form of a family of multidimensional Fokker–Planck operators for the probability density of internal and external degrees of freedom, retaining inertial effects and dissipation. The SFB model is the simplest implementation of this general methodology [19]. This approach seems particularly relevant for the description of large molecular objects, such as proteins, which represent the main domain of application in the authors’ perspective. The method provides a physically sound framework and is amenable to an efficient treatment at a modest computational cost. We are currently working along three lines of development: (i) we are applying the present implementation of the SFB approach to the interpretation of NMR relaxation data of medium-sized folded proteins in solution, without additional improvements apart from a more accurate evaluation of the internal PES; (ii) we are exploring the effects of including large-amplitude motions in the SFB model to account for locally more mobile regions; and (iii) we are streamlining SALEM, to make its usage as clear as possible with intuitive interfaces and instructions, and to release it as a free tool for the general community.