1. Introduction
Diffusion is a spontaneous process and one of the physicochemical phenomena by which a substance is distributed and as such, it plays an important role in the life sciences [
1,
2]. For example, following the administration of a pharmaceutical, drug molecules are transported via the bloodstream and distributed to organs by active and passive transport. As diffusion is the driving force behind passive transport [
2,
3], physicochemical information on drug diffusion is useful for analyzing drug delivery systems and pharmacokinetics as well as investigating the distributions or diffusion velocities of central nervous system drugs in the brain after they penetrate the blood–brain barrier [
4,
5,
6]. With respect to drug discovery from bacteria, a multidimensional diffusion-based gradient culture system for bacteria has been recently devised [
7]. This circular culture apparatus represents one application of diffusion in the life sciences. In relation to this culture system, we have more recently developed a method to estimate translational diffusion coefficients (hereinafter referred to as diffusion coefficients) of small molecules using diffusion experiments in agar gel [
8]. To our knowledge, few experimental values of diffusion coefficients of small molecules have been reported, and one possible reason for this is that the diffusion coefficients are measured with a special apparatus [
9,
10] although rotational diffusion coefficients of macromolecules have been examined by the NMR experiments [
11,
12]. To be used in drug screening, however, we need quicker procedures that do not require special apparatus; in other words, computational approaches such as molecular modeling, for estimating diffusion coefficients of small molecules is desirable.
The mean-square travel distance of a particle diffusing in one dimension (
x) is given by the Einstein–Smoluchowski equation:
where
D is the diffusion coefficient of the molecule (particle) and
t is the length of time the molecule has been diffusing [
13]. This equation shows that the magnitude of the diffusion coefficient governs the degree to which the molecule diffuses. When a diffusing molecule is approximated by a sphere of radius
r,
D is expressed by the Stokes–Einstein equation:
where
kB is the Boltzmann constant,
η0 is the solvent viscosity, and
T is the absolute temperature [
14]. Using this equation, the diffusion coefficient
D can be determined from an approximated molecular radius.
The Stokes–Einstein equation has been used in previous studies to estimate the diffusion coefficients. As for small molecules, the diffusion coefficients have been estimated based on physical models such as space-filling models [
15]. With respect to macromolecules, the rotational diffusion tensors and other hydrodynamic properties of the globular proteins with atomistic structures were calculated by means of the bead and shell computer models in conjunction with the NMR experiments [
11,
12] while some groups used the mesoscale coarse-grained models to estimate the translational diffusion coefficients, treating entire macromolecules as single interacting centers [
16]. As far as we know, however, methods that use a series of stable conformers of small molecules derived by molecular modeling have not been reported to calculate diffusion coefficients by the Stokes–Einstein equation.
Therefore, we aimed to develop a method to theoretically estimate the diffusion coefficients of small molecules, such as sugars and drugs, based on the Stokes–Einstein equation. In our approach, the stable conformations of small molecules are first calculated using molecular modeling. The approximate radii of the conformers are then estimated, and the diffusion coefficients are finally calculated using the Stokes–Einstein equation. The goal of this work is to demonstrate the effectiveness of this approach by comparing the theoretically derived diffusion coefficients with the experimental diffusion coefficients reported in the literature. Increased availability of diffusion coefficients is expected to provide an additional physicochemical molecular descriptor to be used in drug screening.
2. Methods
The MOE (Molecular Operating Environment) software system developed by the Chemical Computing Group was used for the molecular modeling [
17]. The carboxy group and the amino group were treated as the free form and the protonated form, respectively. The stable conformations of molecules were calculated using the Low Mode MD module with the force field MMFF94x [
18]. Approximated molecular radii were calculated for the stable conformations with Δ
E < 3 kcal/mol from the most stable conformation and the average radius was then derived by taking the Boltzmann distribution based on Δ
E into account (
Figure 1). The threshold of 3 kcal/mol was chosen because the contribution of the conformer with Δ
E > 3 kcal/mol at 298 K was less than 1% compared with the most stable conformer.
Although the concept of the approximated molecular radius is beyond this discussion, the specific procedure to calculate it is not trivial. Previously, the approximated radii were usually calculated based on the van der Waals volume of an ellipsoidal molecular shape, leading to some reduction for molecules having a radius less than around 4.5 Å [
15] or the introduction of correction factors to individual molecules to account for the hydration effect [
19]. The van der Waals volumes used were, however, calculated by the simple incremental rules of atoms or physical models such as space-filling models [
20], indicating less accurate estimations of molecular volumes. As we used the latest molecular modeling technique, no correction factors were applied in advance. Two types of approximated radii, that is, a simple radius
rs and an effective radius
re, were thus proposed and calculated in this study (
Figure 2).
In both cases, the molecular shape was first expressed as a set of grid points based on the atomic coordinates and their van der Waals radii. The volume within this shape is called the van der Waals volume (
Vvdw). The simple radius was derived using the following equation:
where
Vs. is the volume of a sphere with the simple radius
rs. The effective radius was introduced based on the radius of gyration
rg to take into account the molecular shape. Based on the grid points,
rg is calculated by the following equation:
where
i represents the individual grid point and the mass is treated as evenly distributed among all points. The radius of gyration is essentially smaller than the simple radius and as for a generic sphere with the radius
r,
rg is written as:
The radius of gyration was then multiplied by the coefficient
K to provide
re. In other words, the effective radius
re is written as:
The value of 1.29, which converts the radius of gyration into the effective radius, was thus used for
K in this work. Notably, a similar correction factor of ~1.3 was presented to compensate for the hydration shell effect by the Trovato group [
16]. Diffusion coefficients
Ds and
De in water with
η0 = 0.8902 mN·s/m
2 were calculated for
rs and
re at 298 K, respectively, using the Stokes–Einstein equation.
3. Results and Discussion
Diffusion coefficients were estimated for 18 compounds consisting of the sugars, amino acids, and drugs listed in
Table 1. Xylose, fructose, galactose, and glucose are monosaccharides, and sucrose, lactose, trehalose, and maltose are heterodisaccharides. When dissolved in water, glucose exists as an equilibrated mixture of two different forms, α-
d-glucose (α-form) and β-
d-glucose (β-form). The ratio of α-form:β-form is known to be 36:64 [
21]. The stable conformations of both forms of glucose were then investigated and the average
rs and
re were calculated using the known ratio. Aspirin (CAS No. 50-78-2) and loxoprofen (CAS No. 68767-14-6) are non-steroidal anti-inflammatory drugs, and salbutamol (CAS No. 18559-94-9) is used to treat asthma. As for loxoprofen, two kinds of stereoisomers, specifically (2S, 2′R)-isomer and (2S, 2′S)-isomer were investigated. Fast Green FCF (CAS No. 2353-45-9) is a food coloring and was used in the agar-gel diffusion experiments [
8]. Since its molecular weight corresponds to the upper limit of the drug candidates [
22,
23], it was consequently included and expedientially classified as a drug here.
Table 1 shows the number of conformers with Δ
E < 3 kcal/mol, in addition to approximated radii and diffusion coefficients together with some literature data. The relative energies and Boltzmann populations as well as the cartesian coordinates of all the conformers of each molecule listed in
Table 1 are found in the
Supplementary Tables S1–S18 and MolFiles. Due to the extent of the flexibility of disaccharides, the numbers of disaccharide conformers with Δ
E < 3 kcal/mol are generally greater than those of other conformers except for Fast Green FCF. According to the Stokes–Einstein equation, the diffusion coefficient of a molecule is expected to increase in inverse proportion to its approximate radius. Therefore, a larger molecular weight is related to a larger approximate radius, so that the diffusion coefficient is generally smaller. This broad trend is readily discernible in the data presented here. From
Table 1, the simple radius
rs is smaller than the effective radius
re in each case, and the diffusion coefficient
Ds is accordingly larger than the diffusion coefficient
De, with the ratio of
re to
rs; in other words, the ratio of
Ds to
De ranges from 1.1 to 1.3.
D0 is the diffusion coefficient extrapolated to the infinitely dilute solution based on the diffusion coefficients obtained from solutions with different concentrations,
Dc.
Dc for finite concentration solutions is generally observed to be smaller than
D0 with no intersolute interactions. No intermolecular interactions are taken into account in the modeling; therefore,
D0 corresponds to those (
Ds or
De) derived by our method. As
D0 is only reported in the literature for sugars, their
D0 was first compared with the corresponding
Ds and
De. Scrutiny of
Table 1 reveals that
D0 is between
Ds and
De, and
De showed good correspondence to
D0 in most cases, with a reasonably small deviation of 0.27 × 10
−6 cm
2/s between them on average. Since the experimental standard deviation errors on the diffusion coefficients of sugars were around 0.2–0.3 × 10
−6 cm
2/s [
10], our value of 0.27 × 10
−6 cm
2/s which correspond to the standard deviation error of 0.19 × 10
−6 cm
2/s, was comparable to previous experiments. The error for lactose was rather large among sugars but the reason for this is not obvious. Lactose has a β-glycosidic bond whereas the other disaccharides have α-glycosidic linkages and this might be related to the larger deviation. As for fructose, glucose, and sucrose,
De was fairly consistent with
D0 reported in the literature. These results suggest that the effective radius
re is suitable for estimating the diffusion coefficients of sugars.
Next, the diffusion coefficients of the other compounds, that is, amino acids and drugs, are discussed. As observed from the sugar data in
Table 1,
Dc is smaller than
D0, with an average deviation of 0.65 × 10
−6 cm
2/s. Assuming that a similar shift occurs for the other molecules,
D0 for those molecules is derived by adding 0.65 × 10
−6 cm
2/s to
Dc and are listed as
drvD0 with parentheses in
Table 1. In all cases,
Ds calculated from
rs showed better correspondence to
D0 with an average deviation of 0.30 × 10
−6 cm
2/s. As experimental concentrations of sugars were around five times higher than those of the other compounds, the actual
D0 should be somewhat smaller than
drvD0 in
Table 1. Considering this, it is imagined that the average error of those approaches that of sugars. The error for alanine, with a molecular weight <100 as well as
rs < 3 Å, was larger than that of other molecules. According to the previous work, the diffusion coefficients of molecules with a small radius of around 2.7 Å would be corrected by multiplying ~1.2 [
15]. In our case of alanine, the multiplication of
Ds (9.01) by 1.1 provided good agreement with
D0 (9.86), indicating a smaller correction factor. Such a correction, however, does not seem to be necessary to other molecules with
rs < 4.5 Å although it has been previously suggested [
15]. Excluding alanine, the rather small average deviation of 0.24 × 10
−6 cm
2/s was obtained.
In
Table 1, the error values of the estimated diffusion coefficients are presented as
De −
D0 and
Ds −
D0 for all compounds, and their percentage errors are graphically represented in
Figure 3. From the figure, it is easily confirmed that
De provided better correspondence to
D0 for sugars whereas
Ds showed better correspondence to
D0. It also shows that the estimated values are smaller than
D0 in most cases. When 0.25 × 10
−6 cm
2/s is then subtracted from the estimated values, a very small average deviation of 0.17 × 10
−6 cm
2/s is attained. Nonetheless, this procedure has no theoretical basis and must be regarded as being tentative. Further work may be required to elucidate the nature of this deviation.
From the above discussion, it may be said that the diffusion coefficients estimated by molecular modeling corresponded reasonably well with the experimental data. With respect to the two parameters for radii,
re is suitable for sugars and
rs is suitable for the other compounds to be used with the Stokes–Einstein equation. The strong hydration ability of sugars seems to be related to this distinction due to the apparent effect of its size increase. From a different point of view,
re may reflect the hydration effect properly, similarly to the correction factor for hydration used in the simplified macromolecular models [
16]. This analysis has led to the conclusion that
re should be used for molecules with strong hydration ability and
rs should be used for other compounds, providing the diffusion coefficients are accurate within the reasonable deviation of ~0.3 × 10
−6 cm
2/s.
In future work, we would like to increase the number and variety of compounds to answer questions such as: does a more favorable approximated molecular radius depend on the type of compound? Is there any single approximated molecular radius that reproduces the measured diffusion coefficients? Furthermore, refinement of the stable conformation with firmly hydrated water molecules may be achieved through molecular dynamics simulations in the bulk water model. Although such issues remain to be addressed currently, we consider that these results have some degree of positive significance because they demonstrate that diffusion coefficients can be reasonably estimated using a computational approach with molecular modeling. In drug screening, it is advisable to have enough physicochemical molecular descriptors such as P
ka and clogP [
22]. As stated in
Section 1, the diffusion velocities of central nervous system drugs might affect their efficacy. We hope our approach presented here will add the diffusion coefficient as one of those descriptors used for that purpose.