**1. Introduction**

The developments of molecular simulations started first with statistical methods like Monte-Carlo simulations (MC) to address the time-progression of multi-particle systems. The use of macroscopic spheres to simulate atomic motions dates backs to early 1940. The work on elastic collision in phase transition by Alder and Wainwright is attributed as the first realistic simulation [1]. The progress in molecular dynamics (MD) simulations from that point is phenomenal thanks to ever evolving computer architectures and development of efficient algorithms. While the initial applications of the MD simulations were aimed at material science applications, they covered biophysics and biomolecules very fast. The applications in biomolecular systems are numerous: X-ray structure processing, protein folding, and receptor-ligand interactions, to name a few [2–4]. Performance and accuracy

**Citation:** Roy, D.; Kovalenko, A. Biomolecular Simulations with the Three-Dimensional Reference Interaction Site Model with the Kovalenko-Hirata Closure Molecular Solvation Theory. *Int. J. Mol. Sci.* **2021**, *22*, 5061. https://doi.org/10.3390/ ijms22105061

Academic Editor: Małgorzata Borówko

Received: 9 April 2021 Accepted: 10 May 2021 Published: 11 May 2021

**Publisher's Note:** MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

**Copyright:** © 2021 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https:// creativecommons.org/licenses/by/ 4.0/).

of MD simulations were verified by comparison with experimental data obtained from diffraction and NMR experiments. An essential component of MD simulation, the force field is developed by fitting against high-level quantum chemical calculations [5–7]. The other variant of the force field, the Kirkwood-Buff (KB) force fields designed through application of the KB theory in calculating densities and other physical properties of multicomponent solutions, has shown immense potential for use in molecular simulations, both with all atom and united atom settings [8–10]. The deviations of MD simulations from experimental results are attributed to the shortcomings of the force field(s) used, as well as inadequate simulation time frame. It is imperative to point to the local minima problem faced by MC and MD methods for potential energy surfaces (PESs) with multiple minima separated by large energy barriers [11–13]. A plethora of research has been devoted to overcome such issues [14,15]. The explicit solvent simulations using MD techniques are the most adequate ones for modeling biologically important molecules which often requires specific environments. Explicit solvation simulations are quintessential for solvation free energy as well as receptor-ligand binding energy calculations. All these theoretical and computational techniques essentially deal with multiple interactions present in liquid environment (e.g., solution). These interactions involve solvent-solvent and solute-solvent interactions. The solute-solvent interaction further breaks down into the electrostatic and non-polar components. To complete the interaction terms in order to calculate solvation energy, solute polarization and deformation energies are important factors. The last term becomes more significant for binding studies in biomolecular simulations. Incorporating all these intra- and intermolecular terms in solvation process modeling is a daunting task, and justifies development of several theoretical methods to address molecular solvation.

The differences in molecular properties between isolated systems and continuum calculations are often the results of different scalabilities of the systems under consideration. The "gold-standard" quantum mechanical (QM) methods can achieve accuracy up to one-tenth of a kcal/mol, but limited to systems with small sizes [16–19]. Different continuum solvation models (e.g., PCM and its variants, SMD, COSMO) are calibrated against experimental solvation energy databases of small molecules, and are problematic for absolute solvation energy calculations of systems beyond the chemical classes covered in calibration databases (viz. transition metal containing systems) [20–24]. The difficulty in achieving accurate solvation free energy prediction can be attributed to the absence of specific solute-solvent interactions, limited (or most of the time, absent) sampling of the solute conformal steps, etc. Application of quantum chemical calculations of biomolecules (protein, DNA/RNA) are restricted due to system size resulting in a large number of basis functions required to describe such systems. The ONIOM methodology as well as QM/MM and QM/MM/MD techniques provide respite to this handicap by offering a computationally more amenable scenario where site(s) of importance are treated with high level QM calculations while the rest of the system is treated with molecular mechanics potentials for specialized applications [25–28].

The applicability of different theoretical methods to solvation dynamics of systems of different sizes and dimensions is quite compartmentalized. Thus, a theoretical model that spans over a large scale of computational requirements with reasonable accuracy and speed is desirable, and much research activity is devoted toward this goal. The reference interaction site model (RISM) is based on first principle statistical mechanics, with proven applications in the field of van der Waals fluid, biomolecules, material science, and drug development [29–31]. The theoretical framework of RISM is suitable to couple with MD-engines and QM self-consistent-field (SCF) iterations [32–35]. This makes the RISM formalism an excellent candidate from the perspective of building materials of desired properties, as the theory provides understanding of all the underlying interactions between different constituent fragments. The RISM theory with the integral equation formalism was developed and used for solvation structure and energetics calculations, although the potential of this theory goes beyond regular solvation energy calculations and expands to molecular partitioning, physical-chemical property calculation, and molecular

simulations [36–48]. The key feature of the RISM theory is that it can provide reasonably accurate result rapidly, a feature that made this the theory an essential part of the multiscale modeling framework (Figure 1).

**Figure 1.** Computational simulation scale and versality of the 3D-RISM theory.

## **2. Theoretical Background**

The foundation of the RISM theory is credited to the seminal works of Chandler and coworkers [49–56]. This theory grew enormously over past forty years. The key theoretical aspects are outlined in this section. Further theoretical backgrounds are provided for individual applications in the respective sections. For a solute of arbitrary shape, the 3-dimensional (3D-) version of the RISM theory provides a probability distribution of all possible interaction sites (*γ*) of solvent molecules around the solute at position **r** which is a product of the average number density (*ργ*) in the bulk solution and the normalized density distribution, *gγ*(*r*) (Figure 2). The density enhancement and/or depletion (*gγ*(*r*)>1 and/or *gγ*(*r*) < 1) relative to the average density at a point in solution bulk where g*γ*(*r*) → 1 is provided by the average number density. The total correlation function of solvent sites in 3D is related to the 3D direct correlation function *<sup>c</sup>γ*(*r*) and site-site bulk susceptibility function for α-solvent sites around a solute by Equation (1).

$$h\_{\uparrow}(r) = \sum\_{a} \int dr' c\_{a} \left( r - r' \right) \chi\_{a \uparrow}(r') \tag{1}$$

Additionally, *gγ*(*r*) = *hγ*(*r*) + 1 and *<sup>c</sup>γ*(*r*) ~ <sup>−</sup>*uγ*(*r*)/(*kBT*), where *T* is temperature and kB is the Boltzmann constant. The bulk susceptibility function *χ* is an essential input to the 3D-RISM integral equation, and is constructed from the intramolecular correlation function *<sup>ω</sup>*α*γ* from the dielectrically consistent RISM (DRISM) [57]:

$$
\omega\_{\alpha\gamma\gamma}(r) = \omega\_{\alpha\gamma}(r) + \omega\_{\alpha\gamma}(r)\,\rho\_{\gamma}\,h\_{\alpha\gamma}(r) \tag{2}
$$

**Figure 2.** The normalized distribution of the water oxygen sites around the scorpion toxin protein (PDB ID: 1AHO) computed using the 3D-RISM-KH theory and the modified TIP3P water model. The protein backbone is colored in cyan.

The intramolecular correlation function can be expressed in reciprocal k-space via terms of a zeroth-order Bessel function:

$$
\omega\_{a\gamma}(r) = j\_0(kl\_{a\gamma})\tag{3}
$$

The intra- and inter-species correlation functions are renormalized through an analytical dielectric bridge function for solvents with high dielectric constant value, thus ensuring that all inter- and intra-species interactions are considered for a few solvent layers around a solute (or cosolvent, etc.). The renormalized form of the dielectric correction is written in terms of zeroth- and first-order Bessel functions over the position of each atom *r*α = (*<sup>x</sup>α*, *y<sup>α</sup>*, *<sup>z</sup>α*) with partial charge *qα* of site *α* on species with respect to its molecular origin:

$$\chi\_{a\gamma}(k) = j\_0(k\chi\_a)j\_0(ky\_a)j\_1(k\varpi\_a)h\_c(k)j\_0(kx\_\gamma)j\_0(ky\_\gamma)j\_1(k\varpi\_\gamma) \tag{4}$$

The envelope function *h*c(*k*) determines the dielectric constant of the solution using a non-oscillatory form with amplitude A falling off rapidly at wavevectors *k* larger than the characteristic size *l* of the liquid. The characteristic length is important for DRISM calculations, to avoid spurious non-physical distribution functions.

$$h\_{\varepsilon}(k) = A \exp\left(-l^2 k^2 / 4\right) \tag{5}$$

$$A = \frac{1}{\rho\_{\text{Polar}}} \left( \frac{\varepsilon}{\mathcal{Y}} - 3 \right) \tag{6}$$

For a mixed solvent scenario, the total number density of polar species and solution dielectric susceptibly *y* can be used in combination with Equations (4)–(6) to apply 3D-RISM formalism as:

$$\rho\_{\text{polar}} = \sum\_{s \in \text{polar}} \rho\_s \tag{7}$$

$$y = \frac{4\pi}{9k\_B T} \sum\_{s \in \text{polar}} \rho\_s \left(d\_s\right)^2 \tag{8}$$

A closure function is required to integrate an infinite chain of correlation diagrams generated from the direct and total correlation function. The functional form of such a closure function in unknown, and several approximated forms were reported over time for simplified computations. Closure functions differ from each other in the mathematical form of the bridging function used in the construct. The Kovalenko-Hirata (KH) closure is among the best closure relations till date in terms of both numerical stability and reasonable accuracy [58,59]. The mathematical form of the KH closure is given as:

$$\mathcal{g}\_{\gamma}(r) = \begin{cases} \ \exp(-u\_{\gamma}(r)/(k\_{\mathcal{B}}T) + h\_{\gamma}(r) - c\_{\gamma}(r)) & \text{for} \quad \mathcal{g}\_{\gamma}(r) \le 1\\ 1 - u\_{\gamma}(r)/(k\_{\mathcal{B}}T) + h\_{\gamma}(r) - c\_{\gamma}(r) & \text{for} \quad \mathcal{g}\_{\gamma}(r) > 1 \end{cases} \tag{9}$$

The overall form of the KH closure can be explained as a coupling of the mean spherical approximation for the regions of density enrichment (*gγ*(*r*) > 1) with the hypernetted chain approximation for the region of density depletion (*gγ*(*r*) < 1). The excess chemical potential and the solvation free energy is obtained from the analytical form of the KH closure as:

$$\begin{aligned} \mu\_{\text{solv}} &= \sum\_{\gamma} \int\_{V} dr \, \Phi\_{\gamma}(r) \\ \Phi\_{\gamma}(r) &= \rho\_{\gamma} k\_{\mathbb{B}} T \Big[ \frac{1}{2} h\_{\gamma}^{2}(r) \Theta(-h\_{\gamma}(r)) - c\_{\gamma}(r) - \frac{1}{2} h\_{\gamma}(r) c\_{\gamma}(r) \Big] \end{aligned} \tag{10}$$

The *Φγ*(*r*) is the Heaviside step function. Important thermodynamic parameters are derived from the excess chemical potential for solute sites (u) and solvent sites (v) as:

$$
\Delta\mu = \Delta\varepsilon^{\rm uv} + \Delta\varepsilon^{\rm vv} - \rm T\Delta\rm s\_{\rm V} \tag{11}
$$

∼

The entropy ( ΔSV) and partial molar volume (PMV, V) are calculated as:

$$
\Delta S\_V = -\frac{1}{T} \left( \frac{\partial \Delta \mu}{\partial T} \right)\_V \tag{12}
$$

$$
\dots \tag{13}
$$

$$\widetilde{\mathbf{V}} = k\_B T \chi\_T \left( 1 - \sum\_{\gamma} \rho\_\gamma \int dr \, c\_\gamma(r) \right) \tag{13}$$

The errors in 3D-RISM calculations have several origins. Firstly, the internal pressure calculated in the 3D-RISM molecular solvation theory is wrong. A few correction schemes were developed to counter this error [60,61]. Another source of errors arises from the choice of the Lennard-Jones potential used for calculating interaction potentials. A careful calibration is warranted while selecting a force field for a specific application. For example, the computational framework of 3D-RISM failed to converge for polar protic hydrogen atom (e.g., water) with conventional force fields, as the hydrogen atoms has no van der Walls parameters assigned. This problem is circumvented by using a non-zero van der Wall's terms for hydrogens [62,63]. The KH closure is known to shift the strongly associated peaks while broadening them simultaneously; interestingly, this provides an adequately correct solvation structure.

#### **3. Biomolecular Simulations with the 3D-RISM-KH Molecular Solvation Theory**

The center of simulations for biophysics related problems are structure-function features of protein and nucleic acids. The structural landscape of biomolecular folding is a high demand research field. Recent achievements in achieving millisecond time scale simulation of protein structure opened further developments in order to explore the entire folding landscape of proteins of reasonable sizes [64–66]. The molecular dynamics simulation with the 3D-RISM-KH theory was first incorporated in the AMBER MD simulation suite, using the Sander engine as well as standalone unit for single point solvation free energy calculations [32]. The standard Sander implementation was modified to support long time scale simulations using damped Langevin dynamics for a canonical ensemble to address the instability of the multiple time step MD (MTS-MD). This is achieved by

combining two simulation cycles for two different parts of the system (MTS-MD). The outer time steps are obtained from 3D-RISM-KH calculation. For each inner step, the effective solvation force is used to extrapolate solvation force coordinates, based on the outer time steps. The force matrix {F}(*k*) working on each solute atom is approximated as a linear combination of forces at *N* previous steps, at any given time step *tk*:

$$\{\mathbf{F}\}^{(k)} \;= \sum\_{i=1}^{N} a\_{ki} \{\mathbf{F}\}^{(i)}, \; i \in \text{3D}-\text{RISM steps} \tag{14}$$

The weighted coefficients *aki* for a given time step are obtained from the best projection of *N* previous steps. These non-conservative potentials provided a smooth transition between steps. However, strong coupling through the Nosé-Hoover chain of thermostats impeded structural transitions. The next generation of development provided the advanced solvation force extrapolation scheme (ASFE). This development used the optimized isokinetic Nosé-Hoover thermostat (OIN) for each atom by imposing kinetic energy constraints. The fast-dynamics (solute-solute) and slow dynamics (solute-solvent via 3D-RISM) are separated in the ASFE implementation. The accuracy of extrapolation was estimated by relative mean square deviation of the extrapolated effective solvation forces from their original values calculated from converged 3D-RISM-KH for the outer time step. The applicability of the novel formalism containing two separate time cycles for MD simulation of solute-solvent systems were validated against conformational space of alanine dipeptide in water. The subsequent developments, generalized solvation force extrapolation (GSFE), used rotational transformation of the relative coordinates for each atom in order to smoothen the force matrix described previously. In this new development which also used OIN thermostats, a weighting function was introduced for each discretized space. The new algorithm also takes into account that the nearest neighbors have maximum effect on mean solvation forces for any given atom. The efficiency of this new algorithm was shown by the MTS-MD/OIN/GFSE/3D-RISM-KH simulation of a miniprotein (PDB: 1L2Y) and protein G with their reported folded forms [67–69]. The miniprotein folding was achieved via the MTS-MD formalism, starting from a fully extend denatured state, at about 60 ns simulation in comparison to the average physical folding time in the order of μs observed via experiment [68].

#### **4. Binding Site Mapping**

Receptor-ligand binding is in the heart of early-stage drug discovery. A correct mapping helps to stop waste of resources, both financially and computationally. Traditionally, lead-like molecules are used to find potential binding site(s) on a receptor surface. An alternative option to this is fragment-based mapping. These processes will lead to a set of fragments/probe molecules that are potential binders on a receptor surface with defined binding sites [70,71]. Chemical linking based on available linker databases and knowledge of chemical space yields potential leads. The success of this process depends on correctly finding a binding site, usually using empirical scoring algorithms. The 3D-RISM-KH theory essentially provides a 3D-distribution of solvent sites around a solute of arbitrary shape. Thus, one can replace the solvent with a small molecule fragment and even a mixture of fragments, and develop a distribution of unique sites from a mixture of fragments, around a solute of interest. The concept behind this process is easy to visualize, but requires specialized algorithms that can reduce computational burden and help in finding a physically meaningful solution. The 3D-distribution function for ligand site *γ* is given as:

$$\mathcal{g}\_{\gamma}(r\_{\gamma}(\mathbb{R}, \Omega)) = \int \mathcal{g}\_{\gamma}(r) \rho\_{\gamma}(r - r\_{\gamma}(\mathbb{R}, \Omega)) dr \tag{15}$$

The ligand sites spatial position is defined with three cartesian coordinates ( *R*, translational) and three Euler angles ( Ω, rotational). In practice, the ligand site density distribution is described using a Gaussian-type function. The so-called "site-integrated" potential of

mean force (W, SI-PMF) is used to find the most probable binding site(s) of a ligand probe on a receptor [72]:

$$W\left(\Delta \stackrel{\rightarrow}{R}, \Omega\right) = -k\_B T \sum\_{\gamma} \ln \gspace{Z}\_{\mathcal{T}}\left(r\_{\gamma}\left(\stackrel{\rightarrow}{R}, \Omega\right)\right) \tag{16}$$

The latest development of this concept used spatial distribution function of a ligand around a protein and thus explored all possible binding modes of the ligand, and final filtering was done based on a scoring function [73]. This scoring function is based on estimated free energy terms and is written as:

$$\mathcal{W}\_{\rm SP}(\{r\};\Omega) \approx -RT \ln \left[ \prod\_{i} \mathcal{g}\_{i}(r\_{i};\Omega) \right] \tag{17}$$

These methodologies were validated against several small molecule binders and protein-ligand datasets [72–74].

Another important aspect of protein-ligand interactions is the role played by bindingsite water molecules in ligand recognition [75–77]. For a regular molecular simulation with explicit solvent molecules, it is cumbersome to look for such binding site water molecules. This search of binding site waters can be eased with the help of the 3D-RISM-KH water distribution function around a solute molecule. Water distribution in the Lysozyme cavity was first successfully explored to locate binding site water using the 3D-RISM-KH theory [78]. The most updated protocol was reported by Sindhikara and Hirata [79,80]. In their method (placevent), a new successive orthogonal image (SOI) technique for sampling was employed to analyze the distribution function (Figure 3). The SOI method calculates the rotational space of three orthogonal vectors for a spherical search space by using a heavy atom of the solvent as anchor of rotation (e.g., oxygen atoms of water molecules). The solvation site volume (∼ V*n*) is calculated by applying the Kirkwood-Buff equation in a 3D-RISM-KH calculation as:

$$\widetilde{\mathbf{V}}\_{n} = k\_{B} T \chi\_{T} \left( 1 - \rho\_{0} \sum\_{\gamma} \int\_{\gamma\_{n}} c\_{\gamma}(r) \, dr \right) \tag{18}$$

The success of this applications was reported against experimentally determined binding site and poses of ligands in biologically relevant targets. The 3D-RISM-KH based water site prediction is implemented in the MOE© suite [81]. Some examples of successful applications of the 3D-RISM-KH theory in exploiting the explicit role of water maps are reported for in-drug design and protein aggregation studies [82–84]. A very recent modification of the 3D-RISM theory in mapping solvation sites in enzyme active site was reported by Nguyen et al. [85]. This new development extended the GIST (grid inhomogeneous solvation theory) based mapping technique in to the 3D-RISM grids. Briefly, an approximated distribution of oxygen site *α* (from water) around a site of interest is related to thermochemical property of interest ( *A*) as position r as:

$$A(r) \approx A\_a(r) + g\_a(r) \sum\_{\gamma \neq a} \omega\_{a\gamma}(r) \* A\_\gamma(r) \tag{19}$$

The number density distribution *g*α(*r*) is used to weight the convolution (\*) in the right-hand side of Equation (19). This formalism did not consider the non-local effect in the distribution, although the authors reported negligible errors in the final distribution resulting from this issue in comparison to molecular simulations and experimental data.

**Figure 3.** Distribution of water oxygen atoms from 3D-RISM-KH calculations on protein 3UG9 (white spheres). The crystallographic waters are represented with magenta spheres. The catalytic binding site waters were marked in red circle.

Among other reports of biomolecular simulations using the 3D-RISM-KH theory, the effect of (micro-) solvent environments on amyloid structure and potential of mean force calculations of solute permeation across UT-B and AQP1 proteins provided further extension of the applications of the 3D-RISM-KH theory based molecular solvation [86,87].

#### **5. Protein-Ligand Binding Energy**

Heart to drug development is correct prediction or binding affinity of small molecules toward target receptor, if not quantitatively accurate then a trend of binding affinity. Methodologies developed for calculating binding energy for the process, Protein + Ligand → Complex, are the linear-response approximation (LRA), protein-dipole Langevindipole approach (PDLD), linear interaction energy (LIE) approach, and MM/PB(GB)SA approach [88–93]. The MM/PBSA method is favored over the others, as it avoids empirical parameterizations. In this method, the free energy of binding (*G*bind) is expressed via the molecular mechanics (MM) based energy (*E*MM, gas phase, for reactants covering internal, electrostatic, and van der Wall's terms), the solvation energy term (*G*solv), and the entropy term (-*T*SMM), computed at temperature T.

*G*bind = *E*MM + *G*solv − *TS*MM = *E*int + *E*el + *E*vdw + *G*solv, polar + *g*solv, non-polar − *TS*MM (20)

> The solvation terms are calculated by solving the Poisson-Boltzmann (PB) equation (or via generalized Born, GB, model). While this method is fast enough to estimate binding energy in complex, the detailed inter- and intra-species interactions are not transferred properly due to the use of implicit solvation method(s). In the 3D-RISM-KH based binding energy calculation method, the MM/PBSA part is replaced with the 3D-RISM-KH calculations using solvent distribution functions around solutes in the MD simulation trajectory. The PB/GB polar and solvent accessible surface area (SASA) nonpolar solvation terms

are replaced with the solvation free energy term from Equation (10). This modification was shown to be equally effective as of traditional MM-PBSA or MM-GBSA methods. The most notable difference was reported between the 3D-RISM-KH and SASA computed non-bonded terms. The later was found to be always favoring binding, whereas the former was not [91,94]. The 3D-RISM-KH based binding calculations were also reported for other protein-ligand complexes and host-guest complexes [95–97].

#### **6. Molecular Solvation Energy Calculations**

The excess chemical potential obtained from 3D-RISM-KH calculations are theoretically a direct measure of solvation energy. However, as mentioned previously, due to erroneous calculations of internal pressure, the computed solvation energy (Gaussian fluctuation excess chemical potential) showed large deviation from actual experimental solvation energy. Other possible reasons for deviations in calculated solvation energy in the 3D-RISM are approximate nature of the closure relations, absence of explicit cavitation energy terms, and inadequacy in force field terms, etc. Incomplete sampling of solutes and short simulation times are also responsible in errors in solvation energy calculations, which reflects in physical property calculations. A correction scheme was developed in order to address this shortcoming, initially for solvation free energy [98]. This so-termed universal correction has the form:

$$
\Delta G\_{\text{hydrotion}} = \Delta G^{\text{GF}}\_{\text{hydrotion}\prime \text{ RISM}} + a \times \text{PMV} + b \tag{21}
$$

The partial molar volume (PMV) of a solute in water is an output of a RISM calculation. The coefficients *a* and *b* were obtained from regression analysis against the experimental solvation energy database of Mobley and co-workers [99]. For hydration energy calculations with the 3D-RISM-KH formalism, a modified correction scheme was reported by Truchon et al. which aimed to account for the cavitation energy term [100]. Further development of solvation energy calculations in various solvents were reported from the lab of the authors, both in the context of exploring liquid state of pure solvents as well as in calculating molecular partitioning and permeability properties. Effect of atomic charge assignment schemes in hydration free energy calculation was reported by Roy et al. for an extended database of compounds with experimental hydration free energy [37]. Literature reports on hydration free energy calculations with the 3D-RISM-KH theory obtained excellent results with the GAFF force field parameters of the solute with the modified point charge models of water. While water is one of the most polar solvents used in biochemical simulations, the RISM-KH theory is extended to non-polar solvents too. Non-polar solvents that are modeled using the 3D-RISM-KH theory are hydrocarbons (hexadecane, cyclohexane), haloalkane (chloroform), and alcohol (n-octanol, t-butanol) [41,47,101,102]. Other solvents like nitrocompounds (nitromethane, nitroethane, and nitrobenzene), acetonitrile, and dimethyl sulfoxide (DMSO) were also used in RISM-KH calculations for both liquid structure and solvation energetic studies [42,103,104]. The performance of the 3D-RISM-KH calculations is summarized in Table 1. The performance of the 3D-RISM-KH theory in solvation free energy calculation, in comparison to the performance of MD and/or quantum chemical models, together with computation speed made this theory ideal for such applications.


**Table 1.** Performance of the 3D-RISM-KH theory in predicting solvation free energy of solutes in various solvents reported in the literature. Performance of different computational method in solvation free energy calculation is provided in parentheses.

*a* Mean absolute error. *b* Relative mean square error. *c* RMSE computed from MD simulation in ref. [99]. *d* RMSE computed using CPCM continuum solvation model on Minnesota solvation database [105].

## **7. Conclusions**

The 3D-RISM theory is under continuous development, and the range of the application of this theory is ever expanding. For biomolecular simulations, MTS-MD provides a platform to combine fast dynamics of the solute with slow solute-solvent dynamics, and is proven to be able to avoid the local minima problem in molecular dynamics. Several important modifications covering the algorithm of the 3D-RISM code for application with massive parallel computer architectures were reported [106–108]. The algorithm was also coupled with density functional theory based electronic grids for electronic structure calculations [33,109]. It is important to understand that 3D-RISM-KH molecular solvation theory deals with liquid state. Thus, a direct comparison of the simulation results obtained from 3D-RISM calculations with structures determined from solid state experiments (e.g., solid-state X-Ray, neutron diffraction, etc.) may not result in a grea<sup>t</sup> match. For a better comparison, data obtained from experiments with liquid state should be used. Further, the Gaussian fluctuation excess chemical potentials from a 3D-RISM calculation should not be taken as an absolute measure of solvation free energy. For solvation energy calculations, the results should be compared against experimental datasets and should be fitted for use against a test set, should such a need arise. The 3D-RISM calculations provide a unique machinery to represent liquid medium with specific concentrations of cosolvent(s), additives, etc., and thus providing an opportunity to model a more realistic environment. The theory is extendable to multiphasic systems with inhomogeneous version of molecular solvation theory. However, one should keep in mind that the 3D-RIM-KH theory is not one a "size fits all" theory. It requires detailed benchmarking of every aspects of a specific problem before using it for predictive modeling.

**Author Contributions:** The manuscript was written through equal contributions of all the authors. Both authors have read and agreed to the published version of the manuscript.

**Funding:** This work was financially supported by the NSERC Discovery Grant (RES0029477), and the Alberta Prion Research Institute Explorations VII Research Grant (RES0039402).

**Informed Consent Statement:** Not applicable.

**Data Availability Statement:** All data are available in the original research articles referenced in this review.

**Acknowledgments:** Generous computing time provided by WestGrid (www.westgrid.ca) and Compute Canada/Calcul Canada (www.computecanada.ca) is acknowledged in completing different sub-projects resulting in developments of the 3D-RISM-KH simulation protocols in AK's laboratory. **Conflicts of Interest:** The authors declare no conflict of interest.
