*4.2. DG*/*DDD*

The ROE-/NOE-interproton distances as well as the RDCs served as input for the distance geometry (DG) and distance bounds driven dynamics (DDD) calculations. The DG pseudo force-field employed for all simulations presented in this study takes the form defined by Equation (1):

$$E\_{\text{total}} = E\_{\text{dist}} + E\_{\text{chir}} + E\_{\text{NOE}} + E\_{\text{RDC}} + E\_{\text{DBL}},\tag{1}$$

where the dimensionless total pseudo energy *E*<sub>total</sub> is the sum of distance errors for holonomic bond lengths (*E*<sub>dist</sub>), chiral volume violations (*E*<sub>chir</sub>), deviations of the experimental NOE (*E*<sub>NOE</sub>) and RDC (*E*<sub>RDC</sub>) data from values back-calculated from the structures, and a special term for deviations of double bonds from planarity (*E*<sub>DBL</sub>). No additional atom-type dependent parameters of the kind customary in physical force-fields are used. All pseudo energy terms take the form of sums of squared violations (Δ*X*)<sup>2</sup> as defined by Equation (2):

$$E\_X = \frac{1}{2} K\_X \sum \left(\Delta X\right)^2,\tag{2}$$

with Δ*X* = *X*<sub>exp</sub> − *X*<sub>calc</sub>, and empirically chosen force-constants *K<sub>X</sub>* that account for the size and allowed range of each type of parameter violation Δ*X*.

*E*<sub>dist</sub> and *E*<sub>chir</sub> (Equation (1)) represent the violations originating from differences in holonomic distances Δ*r<sub>i,j</sub>* (i.e., bond lengths) and chiral volumes Δ*V<sub>i</sub>*. The latter are defined by the scalar triple product *V*<sub>chir</sub> = $\vec{a} \cdot (\vec{b} \times \vec{c})$ of three vectors spanning planar sp<sup>2</sup>-type (*V*<sub>chir</sub> = 0) or tetrahedral sp<sup>3</sup>-type atomic centers (i.e., stereogenic centers with *V*<sub>chir</sub> ≠ 0), thus encoding the configuration of the latter through opposite signs (*V*<sub>chir</sub><sup>(*S*)</sup> = −*V*<sub>chir</sub><sup>(*R*)</sup>) (see Figure 12a). For reference, holonomic distance bounds (for all atom-atom pairs for which upper and lower bounds of inter-atom distances can be established based on the molecular constitution) and chiral volumes are obtained from an initial guess (input) structure of arbitrary configuration and conformation. As these values depend solely on the constitution (which must be known), the DG approach is completely independent of the structure initially assumed [22]. Here, chiral volume restraints were used only on a single stereogenic center, simply to avoid enantiomeric structures, as well as on CH<sub>3</sub>-groups (to keep them tetrahedral) and all sp<sup>2</sup>-centers (to keep them planar, *V*<sub>chir</sub> = 0). Thus, through the deliberate absence of chiral volume restraints, all stereogenic centers (except one) and all CH<sub>2</sub>-groups with diastereotopic protons were allowed to "*float*", so that their configurations and/or assignments evolve on the basis of experimental NMR data (NOEs and/or RDCs) only.
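The sign behavior of the chiral volume can be illustrated with a minimal sketch. It is not taken from the ConArch+/DG code; the function name `chiral_volume` and the choice of the three vectors as pointing from the central atom to three of its substituents are assumptions made for illustration only (the exact vector definition is given in Figure 12a).

```python
def _sub(p, q):
    """Component-wise difference p - q of two 3D points."""
    return tuple(pi - qi for pi, qi in zip(p, q))


def _cross(u, v):
    """Cross product of two 3D vectors."""
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])


def _dot(u, v):
    """Dot product of two vectors."""
    return sum(ui * vi for ui, vi in zip(u, v))


def chiral_volume(center, s1, s2, s3):
    """Scalar triple product a . (b x c).

    Assumption: a, b, c point from the central atom to three substituents.
    """
    a, b, c = _sub(s1, center), _sub(s2, center), _sub(s3, center)
    return _dot(a, _cross(b, c))


center = (0.0, 0.0, 0.0)
s1, s2, s3 = (1.0, 1.0, 1.0), (1.0, -1.0, -1.0), (-1.0, 1.0, -1.0)

# Tetrahedral-like center: non-zero volume.
v_one = chiral_volume(center, s1, s2, s3)
# Exchanging two substituents inverts the configuration and the sign.
v_mirror = chiral_volume(center, s2, s1, s3)
# Planar (sp2-like) center: all substituents in one plane with the center.
v_planar = chiral_volume(center, (1.0, 0.0, 0.0),
                         (-0.5, 0.866, 0.0), (-0.5, -0.866, 0.0))
```

In this sketch, exchanging two substituents flips the sign of the volume, whereas the planar arrangement gives zero. This mirrors the use of *V*<sub>chir</sub> = 0 as a planarity restraint and of the sign as a configuration label.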

In Equation (1), *E*<sub>NOE</sub> denotes the deviations of back-calculated (and ⟨*r*<sup>−6</sup>⟩-averaged where applicable) NOE distances from experimental upper and lower distance bounds, with Δ*r* defined as follows:

$$
\Delta r = \begin{cases}
r\_{i,j} - r\_{\text{lower}} & \text{for } r\_{i,j} < r\_{\text{lower}} \\
0 & \text{for } r\_{\text{lower}} \le r\_{i,j} \le r\_{\text{upper}} \\
r\_{i,j} - r\_{\text{upper}} & \text{for } r\_{i,j} > r\_{\text{upper}}
\end{cases}
. \tag{3}
$$

In this study, all NOE bounds *r*<sub>lower</sub> and *r*<sub>upper</sub> were derived from the corresponding NMR volume integrals and used as *r*<sub>mean</sub> ± 10%, with force-constants *K*<sub>NOE</sub> = 10.0 Å<sup>−2</sup> unless stated otherwise.
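The flat-bottomed NOE term can be sketched in a few lines. This is a minimal illustration rather than the ConArch+/DG implementation. It assumes that the ±10% bounds are applied symmetrically to *r*<sub>mean</sub> and that distances are in Å, and the function names `noe_bounds`, `noe_violation` and `noe_energy` are hypothetical. Because the violation is squared in Equation (2), its sign convention does not affect the energy.

```python
def noe_bounds(r_mean, tolerance=0.10):
    """Lower and upper distance bounds as r_mean -/+ 10% (assumed symmetric)."""
    return (1.0 - tolerance) * r_mean, (1.0 + tolerance) * r_mean


def noe_violation(r_calc, r_lower, r_upper):
    """Flat-bottomed violation: zero whenever r_calc lies within the bounds."""
    if r_calc < r_lower:
        return r_calc - r_lower
    if r_calc > r_upper:
        return r_calc - r_upper
    return 0.0


def noe_energy(r_calc_list, r_mean_list, k_noe=10.0):
    """Harmonic pseudo energy 0.5 * K * sum of squared violations."""
    total = 0.0
    for r_calc, r_mean in zip(r_calc_list, r_mean_list):
        r_lower, r_upper = noe_bounds(r_mean)
        total += noe_violation(r_calc, r_lower, r_upper) ** 2
    return 0.5 * k_noe * total


# Three back-calculated distances tested against the same 3.0 A restraint:
# one inside the bounds, one too long by 0.2 A, one too short by 0.2 A.
example_energy = noe_energy([3.0, 3.5, 2.5], [3.0, 3.0, 3.0])
```

Only distances outside the 2.7–3.3 Å window contribute in this example, so structures that satisfy all bounds are not penalized for deviating from *r*<sub>mean</sub> itself.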

The mathematics of the RDC calculations used here was taken from Glaser et al. [53], and the formalism for including RDC data in 4D and 3D DG simulations (see below) has been described in full detail in Ref. [22]. The harmonic RDC "pseudo energy" *E*<sub>RDC</sub> is based on violations Δ*D* = *D*<sub>exp</sub> − *D*<sub>calc</sub> between experimental and back-calculated values. RDC data sets derived from multiple alignment media can be used simultaneously, as an increasing number of experimental NMR restraints, in our *ConArch*+/*DG* approach simply by expanding the corresponding sum in *E*<sub>RDC</sub>. Empirically, it proved best to scale the force-constant *K*<sub>RDC</sub> with the number of RDCs or, equivalently, with the number *M* of alignment media used; we therefore employ *K*<sub>RDC</sub> = 1.5/*M* Hz<sup>−2</sup> in this study.
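The effect of scaling the force-constant with the number of alignment media can be shown with a small sketch. It only evaluates the harmonic sum for given back-calculated values and does not fit the alignment tensor. The function name `rdc_energy` and the list-of-lists data layout are assumptions for illustration.

```python
def rdc_energy(d_exp_by_medium, d_calc_by_medium, k_base=1.5):
    """Harmonic RDC pseudo energy with K_RDC = k_base / M.

    Each argument holds one list of RDC values (in Hz) per alignment medium.
    """
    n_media = len(d_exp_by_medium)
    k_rdc = k_base / n_media
    total = 0.0
    for d_exp, d_calc in zip(d_exp_by_medium, d_calc_by_medium):
        total += sum((e - c) ** 2 for e, c in zip(d_exp, d_calc))
    return 0.5 * k_rdc * total


d_exp = [10.0, -5.0]
d_calc = [9.0, -3.0]

# One medium: K_RDC = 1.5 Hz^-2.
energy_one_medium = rdc_energy([d_exp], [d_calc])
# Two media with identical violations: K_RDC = 0.75 Hz^-2.
energy_two_media = rdc_energy([d_exp, d_exp], [d_calc, d_calc])
```

In this toy case both calls return the same value, so adding a second data set with comparable violations does not increase the overall weight of the RDC term relative to the other terms in Equation (1).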

In addition to the application of chiral volume restraints on sp<sup>2</sup>-type atomic centers (*V*<sub>chir</sub> = 0), the term *E*<sub>DBL</sub> in Equation (1) is used to reasonably restrain double bonds and aryl rings to planarity (restraining *V*<sub>chir</sub> = 0 on neighboring atomic centers alone is not sufficient). Here, Δ*X* (used cf. Equation (2)) is defined as Δ*X* = 1 − cos<sup>2</sup> φ, where φ<sub>*i*,*j*,*k*,*l*</sub> are the corresponding torsion angles *i*–*j*–*k*–*l* with sp<sup>2</sup>-sp<sup>2</sup>-type central bonds, and Δ*X* vanishes for φ = 0° and φ = 180° only (*cis*- and *trans*-configurations). The rather high force-constant *K*<sub>DBL</sub> = 100 used here efficiently removes local energy minima which originate from slight bending of C=C double bonds and aromatic rings, revealing more distinctive energy steps separating alternate conformational families. In general, the final best-fit energy-minimum structures have very low distortional energy terms of *E*<sub>DBL</sub> ≤ 5 × 10<sup>−3</sup>, and the efficiency with which different configurations and conformations are sampled on the basis of experimental NMR data is largely unhampered by these types of restraints.
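A minimal sketch of the planarity term follows, assuming the torsion angle is evaluated from the normals of the two planes *i*–*j*–*k* and *j*–*k*–*l*. This is a standard construction, and the source does not state how ConArch+/DG evaluates it. The function names are hypothetical.

```python
import math


def _sub(p, q):
    return tuple(pi - qi for pi, qi in zip(p, q))


def _cross(u, v):
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])


def _dot(u, v):
    return sum(ui * vi for ui, vi in zip(u, v))


def cos_torsion(p_i, p_j, p_k, p_l):
    """Cosine of the torsion angle i-j-k-l from the two plane normals."""
    b1, b2, b3 = _sub(p_j, p_i), _sub(p_k, p_j), _sub(p_l, p_k)
    n1, n2 = _cross(b1, b2), _cross(b2, b3)
    return _dot(n1, n2) / math.sqrt(_dot(n1, n1) * _dot(n2, n2))


def planarity_violation(p_i, p_j, p_k, p_l):
    """Delta X = 1 - cos^2(phi); zero only for phi = 0 or 180 degrees."""
    return 1.0 - cos_torsion(p_i, p_j, p_k, p_l) ** 2


def dbl_energy(torsions, k_dbl=100.0):
    """0.5 * K_DBL * sum of squared violations over all listed torsions."""
    return 0.5 * k_dbl * sum(planarity_violation(*t) ** 2 for t in torsions)


p_i, p_j, p_k = (1.0, 0.0, 0.0), (0.0, 0.0, 0.0), (0.0, 0.0, 1.0)
cis = (p_i, p_j, p_k, (1.0, 0.0, 1.0))       # phi = 0 degrees
trans = (p_i, p_j, p_k, (-1.0, 0.0, 1.0))    # phi = 180 degrees
twisted = (p_i, p_j, p_k, (0.0, 1.0, 1.0))   # phi = 90 degrees

energy_planar = dbl_energy([cis, trans])
energy_twisted = dbl_energy([twisted])
```

Because Δ*X* depends only on cos<sup>2</sup> φ, the term does not distinguish *cis* from *trans*. It penalizes only out-of-plane twisting, which is largest at φ = 90°.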

**Figure 12.** (**a**) Definition of chiral volumes as scalar triple products *V*<sub>chir</sub> = $\vec{a} \cdot (\vec{b} \times \vec{c})$ for sp<sup>3</sup>- and sp<sup>2</sup>-type atomic centers. (**b**) In analogy to 2D chiral objects that can be transformed into each other through a rotation in higher dimensions (3D), the configuration of 3D-chiral objects can be inverted by simple rotations in 4D (see text for details). (**c**) Projection mode of higher dimensional objects into lower dimensional space (here visualized as 3D → 2D) along the eigenvector associated with the largest eigenvalue λ<sub>3</sub> of the inertia tensor (with λ<sub>1</sub> < λ<sub>2</sub> ... < λ<sub>*n*</sub>). Similar projections 4D → 3D optimally preserve interatomic distances. (**d**) Temperature dependent scaling factors of projection forces applied during 4D simulated annealing smoothly transform higher dimensional objects into 3D structures (τ<sub>*T*</sub> = 150 K, see text for details).

The initial input structure is used by DG only for setting up the holonomic bounds and distance matrices (±1% bond lengths), and subsequent configurational and conformational sampling is carried out by our ConArch+/DG approach in an automated sequence of steps. First, molecular structures are generated in four-dimensional (4D) space ("metrization" step, i.e., embedding based on holonomic distance bounds), followed by a 4D "floating-chirality" restrained DG (fc-rDG) and distance bounds driven dynamics (DDD) simulation (simulated annealing). After reduction of dimensionality, the simulated annealing is repeated in 3D space, and each simulation in 4D and 3D is concluded by a gradient-descent type optimization of structures against all restraints, minimizing the total pseudo energy *E*<sub>total</sub>. In all dynamics and optimization calculations, the partial derivatives ∂*E*<sub>total</sub>/∂*r*<sub>α</sub> of all energy terms with respect to the 4D and 3D Cartesian atomic coordinates (α ∈ {*x*, *y*, *z*(, *w*)} for all atoms) are interpreted and used as forces governing the evolution of the system. All derivatives are calculated analytically by ConArch+/DG. During each step of the rDG/DDD runs using RDCs, full updates of the Saupe or alignment tensors are computed based on a singular value decomposition (SVD) algorithm.
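How analytic derivatives serve as forces in both 4D and 3D can be sketched for a single harmonic distance term. This is an illustration under simplifying assumptions (one restraint, a fixed step size, plain gradient descent) and not the optimizer used by ConArch+/DG. All function names are hypothetical.

```python
import math


def distance(p, q):
    """Euclidean distance; works for 3D and 4D coordinates alike."""
    return math.sqrt(sum((pi - qi) ** 2 for pi, qi in zip(p, q)))


def restraint_energy(coords, i, j, r_target, k=1.0):
    """E = 0.5 * k * (r_target - r_ij)^2 for one atom pair."""
    return 0.5 * k * (r_target - distance(coords[i], coords[j])) ** 2


def restraint_gradient(coords, i, j, r_target, k=1.0):
    """Analytic dE/d(coordinate) for every atom and every dimension.

    dE/dx_i = -k * (r_target - r) * (x_i - x_j) / r, and the opposite for j.
    The force on each atom is the negative of this gradient.
    """
    dim = len(coords[0])
    r = distance(coords[i], coords[j])
    factor = -k * (r_target - r) / r
    grad = [[0.0] * dim for _ in coords]
    for a in range(dim):
        d = coords[i][a] - coords[j][a]
        grad[i][a] = factor * d
        grad[j][a] = -factor * d
    return grad


def descend(coords, i, j, r_target, k=1.0, step=0.1, n_steps=100):
    """Plain gradient descent: move every coordinate against the gradient."""
    current = [list(p) for p in coords]
    for _ in range(n_steps):
        grad = restraint_gradient(current, i, j, r_target, k)
        current = [[x - step * g for x, g in zip(p, gp)]
                   for p, gp in zip(current, grad)]
    return current


# Two atoms in 4D (x, y, z, w), initially about 3.04 apart, target 1.5.
start = [[0.0, 0.0, 0.0, 0.0], [1.0, 2.0, 2.0, 0.5]]
relaxed = descend(start, 0, 1, r_target=1.5)
```

The same functions accept 3D coordinates unchanged, since the dimensionality enters only through the length of each coordinate list.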

Sampling molecular structures first in 4D very efficiently generates diastereomeric geometries, as inversion barriers can be overcome easily [70,71]. Configurational inversion in 3D is reduced to a simple rotation in 4D (see Figure 12b); consequently, during simulated annealing in 4D space with chiral restraints removed on all but selected chiral centers (fc-DDD), the transition barriers between diastereomers are significantly lowered or removed altogether.

To increase sampling efficiency, it is crucial to transfer as much 4D information as possible into 3D in order to produce chemically relevant structure models. Projection from higher to lower dimensionality optimally preserves atom-atom distances when carried out along the eigenvector associated with the largest eigenvalue of the inertia tensor *I* defined by Equation (4):

$$I = \sum\_{k} \left( (\vec{r}\_{k} \cdot \vec{r}\_{k})E - \vec{r}\_{k} \otimes \vec{r}\_{k} \right),\tag{4}$$

where the sum runs over all atomic positional vectors ($\vec{r}$<sub>*k*</sub> for *k* particles) centered about the origin (Σ<sub>*k*</sub> $\vec{r}$<sub>*k*</sub> = 0) and weighted with unity mass (see Figure 12c) [72]. During 4D simulated annealing, we apply a temperature dependent scaling factor *f* = exp(−(*T*/τ<sub>*T*</sub>)<sup>2</sup>) with an empirical temperature coupling factor τ<sub>*T*</sub> = 150 K to forces acting along this eigenvector, which gradually restrain the 4D molecular models to a 3D subspace thereof (see Figure 12d). At high temperatures (*T* > 300 K), all structures evolve freely, but they are restrained increasingly and smoothly to the 3D subspace during the cooling phase (*T* < 300 K) of the simulated annealing. Finally, all models are projected into pure 3D space and are subjected to an additional simulated annealing therein.
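The projection direction and the scaling factor can be illustrated with a short standard-library sketch. It builds the unit-mass inertia tensor of Equation (4) for centered coordinates and finds the eigenvector of its largest eigenvalue by power iteration. Power iteration is a choice made here for brevity and is not stated in the source, and it assumes that the largest eigenvalue is non-degenerate and that the start vector is not orthogonal to its eigenvector. All function names are hypothetical.

```python
import math


def inertia_tensor(coords):
    """Unit-mass inertia tensor: sum_k ((r_k . r_k) E - r_k (x) r_k)."""
    n, dim = len(coords), len(coords[0])
    centroid = [sum(p[a] for p in coords) / n for a in range(dim)]
    tensor = [[0.0] * dim for _ in range(dim)]
    for p in coords:
        r = [p[a] - centroid[a] for a in range(dim)]  # center on the origin
        r2 = sum(x * x for x in r)
        for a in range(dim):
            for b in range(dim):
                tensor[a][b] += (r2 if a == b else 0.0) - r[a] * r[b]
    return tensor


def dominant_eigenvector(matrix, iterations=500):
    """Eigenvector of the largest eigenvalue by power iteration (assumption)."""
    dim = len(matrix)
    v = [1.0 / math.sqrt(dim)] * dim
    for _ in range(iterations):
        w = [sum(matrix[a][b] * v[b] for b in range(dim)) for a in range(dim)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    return v


def projection_scale(temperature, tau=150.0):
    """Scaling factor f = exp(-(T / tau_T)^2) for the projection forces."""
    return math.exp(-(temperature / tau) ** 2)


# Six points that are almost three-dimensional: small extent along w only.
coords_4d = [(1.0, 0.0, 0.0, 0.1), (-1.0, 0.0, 0.0, 0.1),
             (0.0, 1.0, 0.0, -0.1), (0.0, -1.0, 0.0, -0.1),
             (0.0, 0.0, 1.0, 0.0), (0.0, 0.0, -1.0, 0.0)]
tensor = inertia_tensor(coords_4d)
axis = dominant_eigenvector(tensor)  # points along the w axis here

# f is close to 0 at 300 K and rises towards 1 as T approaches 0 K.
scales = [projection_scale(t) for t in (300.0, 150.0, 0.0)]
```

In this example the largest eigenvalue belongs to the direction in which the point set has the smallest extent, so removing that component changes the interatomic distances least. The scaling factor is about 0.02 at 300 K, about 0.37 at 150 K and exactly 1 at 0 K.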

In this study, for each compound **1**–**5** a total of 1000 structures (configurations and conformations) were generated initially in 4D space. All consecutive simulated annealing simulations in 4D and 3D used 5000 steps of equilibration (*T* = 300 K) and 5000 steps of cooling (*T* → 0 K) each (2 fs time steps). The final structures were collected and sorted by their pseudo energy, and a final selection of the ranked structures of lowest energy was used for the plots presented here. For the global energy minimum best-fit structures, errors in calculated RDCs are estimated from Monte-Carlo bootstrapping analysis including tensor updates [22,23]. Total single processor CPU (Intel(R) Core(TM) i7-4790 CPU @ 3.60 GHz) wall time used was about 7–8 min for each compound **1**–**3**, and up to approx. 50 min for **4** and **5** when RDC data sets from three alignment media are used. However, the entire process can be parallelized very efficiently on an arbitrary number of shared memory CPU cores, reducing the total wall time accordingly to a few minutes only.

**Supplementary Materials:** The following are available online at http://www.mdpi.com/1660-3397/18/6/330/s1, Table S1: NOE data used for Axinellamine A (**1**), Table S2: NOE data used for Tetrabromostyloguanidine (**2**), Table S3: NOE data used for 3,7-*epi*-Massadine chloride (**3**), Table S4: RDC data used for Tubocurarine (**4**), Table S5: NOE data used for **4**, Table S6: RDC data used for Vincristine (**5**), Table S7: NOE data used for **5**, Table S8: Atomic coordinates for rDG structures of compounds **1**-**5**. Figure S1: rDG structure of compound **1**, Figure S2: Extended rDG simulations of **1**, Figure S3: Structures of **1** obtained from extended rDG simulations, Figure S4: rDG structure of compound **2**, Figure S5: Variations of chiral volume restraints used for **2**, Figure S6: Structures of **2** obtained from unrestrained rDG simulations, Figure S7: Structures of **2** obtained from variations of chiral volume restraints, Figure S8: rDG detailed analysis of **2**, Figure S9: Structures of **2** from rDG detailed analysis, Figure S10: rDG structure of compound **3**, Figure S11: rDG detailed analysis of **3**, Figure S12: Structures of **3** from rDG detailed analysis, Figure S13: rDG structure and back-calculated RDCs for **4**, Figure S14: rDG structure of **4**, Figure S15: rDG structure and back-calculated RDCs for **5**, Figure S16: rDG structure of **5**.

**Author Contributions:** Conceptualization, M.K. and S.I.; methodology, M.K. and S.I.; software, S.I.; writing—original draft preparation, M.K. and S.I.; writing—review and editing, M.K., M.R., and S.I.; visualization, M.K. and S.I. All authors have read and agreed to the published version of the manuscript.

**Funding:** This research was funded by the Deutsche Forschungsgemeinschaft (DFG), grant number Re1007/9-1.

**Acknowledgments:** We would like to thank M.R. Scheek (University of Groningen) for providing his modified version of the DG-II program package. We also thank the Center for Scientific Computing (CSC) of the Goethe University, Frankfurt, for granting access to the high-performance computing cluster and providing the CPU time required for the DFT calculations (Figure 11).

**Conflicts of Interest:** The authors declare no conflict of interest.
