*2.2. Free Energy Calculations*

Binding constants of toxins are routinely measured and readily available for most channel-toxin complexes. Thus, in the absence of mutation data, they provide the only means for validating a complex model. This, in turn, requires an accurate calculation of the binding constant. There are two classes of methods that can be used for this purpose: (i) path independent alchemical perturbation methods, where the ligand is simultaneously destroyed in the binding pocket while created in bulk; and (ii) path dependent potential of mean force (PMF) methods, where the ligand is physically pulled from the binding pocket to bulk [19,20]. Here the PMF gives the continuous free energy profile of the toxin along the chosen reaction coordinate. The first method is computationally cheaper but its application to peptide toxins—which are flexible and have many charged residues—suffers from sampling problems [45]. Thus until the problems in applying the perturbation method to charged-flexible ligands are resolved, the PMF approach will remain the method of choice.

Formally, the binding constant is determined from the 3-D integral of the PMF of the ligand, *W* (r)

$$K\_{eq} = \int\_{site} e^{-W(\mathbf{r})/k\_B T} \, d^3 r \tag{1}$$

where it is assumed that *W* = 0 in bulk. Because computation of a 3-D PMF is not feasible, one invokes a 1-D approximation and determines the PMF along the reaction coordinate, which is the channel axis for a channel-toxin complex. Taking the channel axis along the z axis, the binding constant is given by

$$K\_{eq} = \pi R^2 \int\_{z\_1}^{z\_2} e^{-W(z)/k\_B T} \, dz \tag{2}$$

where *z*<sup>1</sup> and *z*<sup>2</sup> refer to the toxin's center of mass (COM) coordinates at the binding site and in the bulk, respectively. The factor *πR*<sup>2</sup> measures the average cross-sectional area of the binding pocket as explored by the COM of the toxin, and the radius *R* is determined from the transverse fluctuations of the COM of the toxin in the binding pocket. The accuracy of this approximation has been checked for ions by independent free energy perturbation calculations [46]. Because toxins are much heavier, their transverse fluctuations along the *z* axis are further suppressed compared with ions, so the 1-D approximation should work even better for toxins. The absolute binding free energy is determined from the binding constant using

$$G\_b = -k\_B T \ln(K\_{eq} C\_0) \tag{3}$$

where *C*<sup>0</sup> is the standard concentration of 1 mol/liter (*i.e*., 1/*C*<sup>0</sup> = 1661 Å3).

The most common method used in PMF calculations is the umbrella sampling MD simulations, where one introduces harmonic potentials to enhance sampling of the ligand at high-energy points [47]. For convenience, umbrella potentials are introduced at regular intervals (e.g., 0.5 Å) along the reaction coordinate. To generate the simulation windows, a harmonic force with *k* in the range of 20–40 kcal/mol/ Å<sup>2</sup> is applied to the COM of the toxin backbone, which is pulled at a constant speed of 5 Å/ns over 0.5 Å. After each pulling step, the toxin is equilibrated at the window position by applying the same harmonic restraint for 200 ps to relax the effect of steering on the environment. Initially

windows are generated for up to 15 Å starting from the binding pocket, which are extended further if necessary (*i.e*., until the PMF becomes flat). If insufficient overlap occurs between the adjoining windows (typically less than 5%) due to a local potential barrier, an extra window is included in the middle of the two windows. The COM coordinates of the toxin, measured with respect to the COM of the channel, are collected during the umbrella sampling MD simulations. The PMF is obtained using the weighted histogram analysis method [48], which unbiases the COM coordinates in each window and combines them in an optimal way. Convergence of the PMF is the sole criterion on how long one should run each window, which can be studied using block data analysis. Typically PMF obtained from individual blocks of data first monotonically decreases and then fluctuates around a base line. In the first phase, the system is still equilibrating and the data should be discarded. Fluctuations in the second phase indicate that equilibration has been reached, so the final PMF should be constructed using the production data from this part.

An alternative method for constructing the PMF of a ligand is to use Jarzynski's equation [49] in steered MD simulations [50]. Due to its simplicity, this method has become quite popular in recent years. However, detailed comparisons with the umbrella sampling method indicate that its application to biomolecules suffers from sampling problems and the convergence of PMFs is too slow to be useful in practice [51,52].

The PMF method was first applied to the KcsA potassium channel-charybdotoxin complex, where the structure was known, hence providing an important test case [31]. In this study, a large discrepancy was found in the calculated absolute binding free energy, which was caused by the distortion of the toxin during umbrella sampling simulations. In a follow-up study, conformational restraints were used to prevent the deformation of the toxin, which enabled calculation of the absolute binding free energy within chemical accuracy of 1 kcal/mol [53]. Since then, the PMF method has been used in several computational studies of toxin binding to ion channels [54–61]. As long as a validated complex structure was employed in the PMF calculations, the absolute binding free energy was obtained within chemical accuracy in all cases.

Improving the affinity and selectivity of a drug lead via mutations poses a less taxing computational problem, as one is interested in the change in the binding free energy due to a mutation. Provided the binding mode is not altered by the mutation, this is most efficiently calculated using the alchemical transformation methods such as free energy perturbation (FEP) and thermodynamic integration (TI) [62]. In both methods, one introduces a hybrid Hamiltonian, *H*(λ) = (1 − λ)*H*<sup>0</sup> + λ*H*1, where *H*<sup>0</sup> represents the Hamiltonian in the initial state (wild type ligand) and *H*<sup>1</sup> in the final state (mutant ligand). The alchemical transformation is performed by changing the parameter λ from 0 to 1 in small steps, which ensures that the change in the free energy in each step is small enough to enable sufficient sampling of the system in a reasonable time frame. In the FEP method, the interval [0,1] is divided into *n* subintervals with {λ*i*,*i* =1,*n* − 1}, and for each subinterval the free energy difference is calculated from the ensemble average

$$
\Delta G\_i = -kT \ln \langle \exp[-(H(\lambda\_{i+1}) - H(\lambda\_i))/k\_B T] \rangle\_{\lambda\_i} \tag{11.11}
$$

(4)

The free energy difference between the initial and final states is obtained from the sum, Δ*G* = Σ*<sup>i</sup>* Δ*Gi*. The number of subintervals (windows) is chosen such that the free energy change at each step does not exceed 2 kcal/mol, otherwise the method may lose its validity. For mutations of involving charged residues, this requires over hundred windows if uniform subintervals are used. Using exponentially spaced subintervals instead, one could reduce the number of windows substantially.

In the TI method, the ensemble average of the derivative, *∂H(λ)/∂λ*, is obtained at several *λ* values, and the free energy difference is calculated from the integral,

$$
\Delta G = \int\_0^1 \left\langle \frac{\partial H(\lambda)}{\partial \lambda} \right\rangle\_\lambda d\lambda \tag{5}
$$

The TI method is especially advantageous for mutations involving charged residues, because Gaussian quadrature allows evaluation of the integral using a small number of windows, which can be sampled for longer times to check convergence of the results. A seven-point quadrature was found to be adequate in previous applications of the TI method [46,63]. In both methods, it is important to perform the backward calculation to check for hysteresis effects. If the difference between the forward and backward results is much larger than 1 kcal/mol, the calculated free energies are not reliable, most likely due to insufficient sampling of the system.

Mutation of a charged residue to a neutral one is a more challenging problem and requires additional considerations to avoid sampling problems. Firstly, one needs to preserve the net charge in the system during the calculations. This can be achieved by performing the following thermodynamic cycle: While a charged residue on the toxin in the binding site is mutated to a neutral one, the reverse transformation is simultaneously applied to a mutant toxin in bulk. Secondly, the Coulomb and Lennard-Jones (LJ) interactions need to be handled separately, which can be implemented by introducing uncharged residues as intermediate steps. For example, the free energy change due to a Lys to Ala (*K* → *A*) mutation can be expressed as

$$
\Delta\Delta G = \Delta\Delta G(K \to K^0) + \Delta\Delta G(K^0 \to A^0) + \Delta\Delta G(A^0 \to A) \tag{6}
$$

The first term represents the discharging of the side chain of a Lys residue on the bound toxin while the reverse process is performed on a toxin in bulk with an uncharged Lys side chain. In the second term, the uncharged Lys side chain is transformed to an uncharged Ala side chain while the reverse is performed on the bulk toxin. Finally, the third term corresponds to charging of the Ala side chain in binding site while the one in bulk is discharged. Each of these contributions to the free energy difference can be calculated using the FEP or TI methods.
