

C62F6 C66HF12I 

> 9-(1,1,2,2,3,3,4,4,5,5,6,6- Dodecafluoro-6- iodohexyl)-1H-(C60- Ih)[5,6]fullerene

For the transposition of cisPt at the cellular level through numerous barriers and to avoid their connection with the B group vitamins [7], an attempt was made to deposit Cisplatin on rhombellanes and functionalized fullerenes C60, as possible nanodrug complexes. The symmetry of the analyzed nanostructures is an important factor determining the mutual affinity of the tested ligand and nanocarriers.

Detailed analysis of structural properties after docking showed many interesting features. Behavior of Cisplatin with respect to rhombellane homeomorphs and functionalized fullerenes C60, in terms of their (interacting) energy, geometry and topology, was studied. After the docking procedure, the optimal values of ligand-rhombellane and ligand- C60 fullerene affinities were found, which is an important result for homeomorphs.

#### **2. Methods**

#### *2.1. Docking Procedure*

The structures of rhombellane homeomorphs were prepared by Topo Cluj Group [8–13], while the functionalized C60 structures were downloaded from the PubChem Database [22]. The docking procedure was realized with the use of non-commercial docking program AutoDock 4.2 [23,24]. This employs a stochastic Lamarckian genetic algorithm for computing ligand conformations and simultaneously minimizing its scoring function, which approximates the thermodynamic stability of the ligand bound to the target fullerene. For all considered nanocarriers there were established grid box dimensions equal to 26 × 26 × 26 Å. The center of the grid box was placed in the center of the considered nanomolecules. In most cases, the center of the grid box had xyz coordinates equal to 000. During the docking procedure, the molecules were loaded and stored as pdb-files, after assigning hydrogen bonds. The investigated ligands were loaded, and their torsions along the rotatable bonds were assigned and then saved as "ligand.pdbqt". The grid menu after loading "pdbqt" was toggled [25]. For the search of the ligand–Rbl nanostructure and ligand-C60 interactions, the map files were selected directly, setting up the grid points separately for each structure. The Lamarckian genetic algorithm completed the docking parameter files [26]. The structural analysis of the obtained complexes related with the identification of hydrogen bonds was realized with use of the Visual Molecular Dynamics (VMD) package [27].

#### *2.2. Results and Discussion*

The results are presented in the following tables and figures. Rhombellane structures are given by their atom number.

Table 1 and Figure 5 represent binding affinity in kcal/mol of the ligand cisPt molecule relative to spherical nanosystems, such as rhombellane structures (first eleven structures in tables) and functionalized fullerene C60 (structures 12 to 21 in tables) obtained during docking stage.

**Table 1.** Values of binding affinity [kcal/mol] of the ligand CisPt molecule relative to spherical nanosystems (rhombellane structures and functionalized fullerene C60) obtained during the docking stage, generated by software Autodock 4.2.



**Table 1.** *Cont.*

Among Rbl homeomorphs, Cisplatin has the highest a ffinity for Rbl 444 with an a ffinity value of −2.72 kcal/mol and for Rbl 360b with an a ffinity value of −2.66 kcal / mol (Table 1, Figure 5). The lowest binding energy is observed in the case of Sta ffanes (stf) 114 (Table 1, Figure 5). Among functionalized structures of C60 fullerene, the best a ffinity is shown by CID\_16156307, followed by CID\_10121832 and finally the CID\_10121831 structure with values of a ffinity equal to −3.44, −3.13 and −2.83 kcal/mol, respectively (Table 1, Figure 5).

**Figure 5.** Graphical presentation of the values of binding a ffinity [kcal/mol] of the ligand CisPt molecule relative to spherical nanosystems (rhombellane structures and functionalized fullerene C60) obtained during docking stage, generated by software Autodock 4.2.

7

In general, the examined structures show a significantly larger affinity in the case of C60 functionalized fullerene than in the case of Rbl homeomorphs. The obtained affinity values are collected in Tables 2 and 3.


**Table 2.** The best binding affinity of ligand CisPt, maximum and minimum bind energy, and Kmax values of the binding constant estimated with use of the binding free energy obtained for the best complex of ligand with nanostructure Rbl after the docking procedure, generated by software Autodock 4.2.

The values of binding energy are correlated with values of the Kmax constant and the highest values are obtained in the case of 444 fullerene and 360b nanostructure (Table 2).

The two last columns show the equilibrium K value of the bonds, calculated using the equation:

$$\mathbb{K}\_{\rm B} = \exp^{\left(-\frac{\Delta G\_{\rm b}}{\rm K}\right)}$$

where Kb is the binding constant, R is the gas constant (J/mol\*K), T is the temperature of 298 K and ΔGb is the binding affinity (J/mol).

The higher the K value, the more the reaction proceeds towards the formation of the complex.

**Table 3.** The best binding affinity of the ligand CisPt, maximum and minimum binding energy and Kmax values of the binding constant estimated with use of the binding free energy obtained for the best complex of the ligand with functionalized fullerene C60 after the docking procedure, generated by software Autodock 4.2.


Table 3 presents Kmax values of the binding constant estimated with use of the binding free energy obtained for the best complex of the ligand with functionalized fullerene C60. The highest values of the binding energy are also here correlated with values of the constant Kmax. This value is higher in the case of derivatives CID\_16156307 of C60 fullerene, followed by CID\_101218236 and CID\_101218232 (Table 3).

Cisplatin (cisPt) and functionalized fullerene C60 easily create two or three hydrogen bonds between them with strong and medium strength (Figure 6).

**Figure 6.** The structure of functionalized fullerene C60 and Cisplatin complexes.

The criterion for classification of the strength of hydrogen bonds is the assessment of the distance between acceptor and hydrogen atoms: strong interactions are characterized by a distance <1.6 Å, medium by values in the range from 1.6 Å to 2.0 Å, and weak by distances <3 Å.

The complex nanostructure CID\_16156307 (C72 H9F2OP)–CisPt has been created by two hydrogen bonds with medium strengths, i.e., 1.94 Å and 2.16 Å; both are between the oxygen of the phosphoryl group of the nanocarrier and the hydrogen of the amino groups of cisPt (Figure 6). Again, two hydrogen bonds have been created in the case of CID\_ 101218236 (C69 H9Cl3O) and CID\_101218232 (C63 H4ClF3O) complexes. The first case involves the oxygen atom of the methoxy group while the second involves the oxygen atom of the ethoxy group, and both form a Hydrogen Bond (HB) with the hydrogen atom of the amino groups of cisPt with values 2.18 Å, 2.19 Å and 2.01 Å (Figure 6). Three hydrogen bonds appeared

only in the case of CID\_10138212 (C62F6), all with medium strength, i.e., 2.13 Å, 2,13 Å and 2.51 Å (Figure 6), respectively. The structures of formed complexes are presented in Figure 6.

Despite the fact that a ffinity is higher in the case of functionalized fullerene C60–cisPt, compared to a ffinities of Rbl–CisPt complexes, the latter created a larger number of hydrogen bonds (Figure 7). The best binding energy has been observed for the 444 nanostructure with four hydrogen bonds of medium strength, i.e., 1.88 Å, 1.95 Å, 1.96 Å and 2.03 Å, respectively. Also, for the 360b nanostructure there were created four hydrogen bonds with medium strength, i.e., 2.08 Å, 2.09 Å, 2.26 Å and 2.56 Å, respectively (Figure 7). The worst a ffinity is represented by the stf114 fullerene. Even so, in this case there are also four hydrogen bonds formed between the nanostructure and Cisplatin. The structures of formed complexes are presented in Figure 7.

Stf\_114\_cispt 

**Figure 7.** The structure of cube rhombellane and Cisplatin complexes.

#### **3. Conclusions**

As a proposal for a new nanodrug, an attempt was made to implement the Cisplatin (cisPt) ligand on the functionalized C60 fullerenes and on a cube rhombellane homeomorphic surface. Cisplatin (cisPt) is one of the strongest anticancer agents with proven clinical activity against a wide range of solid tumors. Theoretical studies sugges<sup>t</sup> that nonspecific interactions of Cisplatin can also take place with many competitive compounds, such as vitamins of the B group, containing aromatic rings with lone-pair orbitals. It is argued that such a direct analogy to the interaction of canonical purines can impair the therapeutic e ffect of Cisplatin, which might be an indicator of the reduction of the anticancer therapeutic e ffects of the Cisplatin drug in the presence of vitamins of the B group inside the cell nucleus. This is why it seems to be important to connect cisPt Should be "cisPt" with nanostructures and, in this way, make it impossible to combine the drug with the B vitamins. Behavior of Cisplatin with respect to rhombellane homeomorphs and functionalized fullerenes C60, in terms of their (interacting) energy, geometry and topology was studied. The symmetry of the analyzed nanostructures is an important factor determining the mutual a ffinity of the tested ligand and nanocarriers. The docking procedure was realized with use of AutoDock 4.2. Detailed analysis of structural properties after docking showed many interesting features. Among Rbl homeomorphs, Cisplatin has the highest affinity for Rbl 444 with an a ffinity value of −2.72 kcal/mol and Rbl 360b with an a ffinity value of −2.66 kcal/mol. Among functionalized structures of C60 fullerene the best a ffinity is shown by CID\_16156307, followed by CID\_10121832 and finally CID\_10121831, with values of a ffinity equal to −3.44, −3.13 and −2.83 kcal/mol, respectively. In general, the examined structures show a significantly larger a ffinity in the case of C60 functionalized fullerene than in the case of Rbl homeomorphs. Cisplatin (cisPt) and functionalized fullerene C60 easily create two or three hydrogen bonds between them with strong and medium strength. High ligand-nanostructure a ffinity is reflected by the number and most of all the quality of formed hydrogen bonds. However, despite the fact that a ffinity is better in the case of functionalized fullerene C60–cisPt compared with a ffinities of complexes Rbl–CisPt Should be "cisPt", the latter created a larger number of hydrogen bonds but of lesser quality. The performed investigations enabled the identification of the most promising rhombellane and functionalized C60 fullerene structures that could be used as nanocarriers for cisPt molecules.

**Author Contributions:** Conceptualization, B.S.; Methodology, B.S. and P.C.; Validation, B.S. and P.C.; Formal Analysis, B.S.; Investigation, B.S.; Resources, B.S.; Data Curation, B.S.; Writing–Original Draft Preparation, B.S.; Writing–Review & Editing, B.S. and P.C.; Visualization, B.S. and P.C.; Supervision, B.S.; Project Administration, B.S.; Funding Acquisition, B.S.

**Acknowledgments:** Gratitude is expressed for fruitful cooperation for many years to Professor MV Diudea; this article was supported by PL-Grid Infrastructure (http://www.plgrid.pl/en).

**Conflicts of Interest:** The authors declare no conflict of interest.

## **References**


© 2019 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).

#### **Application of the Consonance Solvent Concept for Accurate Prediction of Buckminster Solubility in 180 Net Solvents using COSMO-RS Approach**

#### **Piotr Cysewski**

Chair and Department of Physical Chemistry, Faculty of Pharmacy, Collegium Medicum of Bydgoszcz, Nicolaus Copernicus University in Toru ´n, Kurpi ´nskiego 5, 85-950 Bydgoszcz, Poland; piotr.cysewski@cm.umk.pl; Tel.: +48-52-585-3611

Received: 10 June 2019; Accepted: 20 June 2019; Published: 22 June 2019

**Abstract:** The default COSMO-RS (Conductor like Screening Model for Real Solvents) approach is incapable of accurate computation of C60 solubility in net solvents. Additionally, there is no adequate selection of single or multiple reference solvent, which can be applied to the whole population of 180 solvents for improving prediction of mole fraction at saturated conditions. This failure cannot be addressed to inaccurate data of the Buckminster fusion, although they pose a challenge for experimental measurement due to intense sublimation of C60 at elevated temperatures and the possibility of solvates precipitation. However, taking advantage of the richness of experimental data of fullerene solubility, it is possible to identify the source of errors expressed in terms of fluidization a ffinity. Classification of solvents according to the value of this fluidization term allowed for formulation of a consonance solvents approach, which enables accurate prediction of C60 solubility using the single reference solvent method.

**Keywords:** Buckminster; solubility; COSMO-RS (Conductor like Screening Model for Real Solvents); reference solvent; net organic solvent; fusion

#### **1. Introduction**

Buckminster fullerene is a highly symmetric all carbon molecule, in which structure is represented by truncated icosahedron with a cage-like fused-rings made of twenty hexagons and twelve pentagons. The sixty carbon atoms placed at each vertex of each polygon are covalently bonded along each polygon edge. Fullerene C60 belongs to a broad class of Goldberg polyhedron-like carbon allotropes occurring in the form of spheres, ellipsoids or tubes. It was first generated in 1984 using a laser induced carbon vaporization in a supersonic helium beam [1]. The unique structure of C60 results [2] in its unparalleled unique physicochemical properties [3], which were very welcome in many industries taking advantage of nanomaterials as for example biomedicine [4,5], optics, electronics and cosmetics [6]. Unfortunately, low solubility of C60 in many organic solvents [7] stands for the major cost of the production from soot [8], since extraction by organic solvents is the first step for obtaining fullerenes-rich fractions further separated using HPLC [9]. Moreover, strong tendency of precipitation in the form of solvates [10–17] makes it di fficult to preserve purity of the solid. This is why modeling of C60 solubility attracted so much attention and resulted in a variety of theoretical approaches, among which the best predications come so far from non-linear modeling via machine learning [18]. Other approaches taking advantage of quantitative structure-property relationships (QSPR) [19–21], the multiple linear regression (MLR) [19,22], partial least square regression (PLS) [23], support vector machines (SVMs) [22] and neural networks (NNs) [20] have been reported for predicting the solubility of C60 fullerenes in di fferent organic solvents. Although these models o ffer quite an acceptable estimate suitable for screening of new solvents, they all rely on the sets of molecular descriptors characterizing solute-solvents

properties. Very often these descriptors have no simple physical meaning and the predictive models can seriously su ffer from the over-fitting problem in the training phase.

The di fferent philosophy relies on the calculating of bulk phase equilibria from the first principles using well-established quantum chemistry approaches. Among these methods COSMO-RS (Conductor like Screening Model for Real Solvents) [24–26] o ffers an attractive alternative to linear and non-linear chemometrics approaches. Although it was originally formulated for computing thermodynamic properties of bulk liquid systems [27], it was extended also for treatment of interphases [28,29]. Particularly solids solubility [30] in organic solvents and their mixtures was the subject of significant interests [31–37]. Although the obtained results are of relatively poor or at most medium accuracy, they still o ffer valuable insight into the solid-liquid equilibrium phenomena. It is however important to emphasize the uniqueness of solids solubility modelling as requiring a proper thermodynamic formulation. Some basic, but important remarks are provided in the forthcoming paragraph.

Thermodynamic conditions for equilibrated mixtures of a solid–liquid saturated solution are characterized by equal values of chemical potential of pure solute solid phase, μ*solid i* , and of the solute in the saturated solution, μ*l i*. Formally this can be expressed as follows:

$$
\mu\_i^l = \mu\_i^{\text{solid}} \tag{1}
$$

$$
\mu\_i^{o,l} + RTln\left(\mathbf{y}\_i^l \mathbf{(x}\_i^l\right) \cdot \mathbf{x}\_i^l\right) = \mu\_i^{o,solid} + RTln\left(a\_i^{solid}\right),
\tag{2}
$$

where μ*o*,*l i* μ*<sup>o</sup>*,*solid i* are the reference state chemical potentials, γ*l i xl i* is an activity coe fficient at the given value of mole fraction of solute in saturated solution, *xl i*, R is the universal gas constant and T denotes temperature. The thermodynamic reference state for the solute in the solution is often defined as the pure compound in a hypothetical super-cooled melt state at given temperature. On the other hand, in chemical practice often the activity of the solid phase is set to unity, which means that the reference state for the solid phase is the solid phase itself. Then, the solute activity in the saturated solution represents simply the di fference between values of chemical potentials of chosen thermodynamic reference states, which means that

$$\ln \left( a\_i^l \{ \mathbf{x}\_i^l \} \right) = \ln \left( \gamma\_i^l \{ \mathbf{x}\_i^l \} \mathbf{x}\_i^l \right) = \frac{1}{RT} \left( \mu\_i^{o,solid} - \mu\_i^{o,l} \right). \tag{3}$$

Hence, activity of the solid phase, which obviously is di fferent from unity, is exclusively solute related and in any saturated solution has exactly the same value provided that the same solid form is preserved. This of course does not mean that solubility represented by the mole fraction of a given solute is the same in di fferent solvents because the activity coe fficients are local properties related to concentration and solvent type, *al i xl i* . Of course, experimental determination of solubility allows for quantification of activities if activity coe fficients are known. Particularly, in the case of ideal solutions solubility equals the activity of the solute and can be deduced just from calorimetric measurements. In the case of real solvents varying solubility simply stands for deviations from the Raoult law and such non-ideality is accounted by the values of activity coe fficients. Due to the fact that chemical potential by definition is a molar value of Gibbs free energy at a given temperature and pressure, Equation (3) can be rewritten in terms of pure solute properties leading to the following formula:

$$\ln \left( \gamma\_i^l \cdot \mathbf{x}\_i^l \right) = -\frac{\Delta G\_i^{fus}(T)}{RT} = \frac{1}{RT} \left( \text{T} \Delta S\_i^{fus}(T) - \Delta H\_i^{fus}(T) \right) \tag{4}$$

where Δ *Gf us i* is the value of a hypothetical partial molar Gibbs free energy of the melt at the system temperature and pressure. This value is temperature dependent and becomes zero for the pure solute at its melting point. Knowledge of the temperature related change of heat capacities of solid and melt solute Δ*C<sup>f</sup> us pi* (*T*) = *Clpi*(*T*) − *Csolid pi* (*T*), allows for computing entropic and enthalpic contributions to the values of fusion Gibbs free energy by the following basic definitions:

$$
\Delta S\_i^{fus}(T) = \Delta S\_i^{fus}(T\_m) + \int\_{T\_m}^{T} \frac{\Delta C\_{p\_i}^{fus}}{T} dT \tag{5}
$$

$$
\Delta H\_i^{fus}(T) = \Delta H\_i^{fus}(T\_m) + \int\_{T\_m}^{T} \Delta C\_{p\_i}^{fus} dT \tag{6}
$$

where *T*m represents melting temperature of pure solute. Combining Equations (4) and (5) leads to the following thermodynamically rigorous relationship (if no phase transition occurs between different solid states):

$$\ln\left(a\_i^l\right) = -\frac{\Delta H\_i^{fus}}{RT}\left(\frac{T\_m}{T\_m} - 1\right) - \frac{1}{RT}\int\_{T\_m}^{T} \Delta C\_{p\_i}^{fus} dT + \frac{1}{R}\int\_{T\_m}^{T} \frac{\Delta C\_{p\_i}^{fus}}{T} dT\tag{7}$$

Unfortunately, direct application of this equation is impossible due to the lack of experimental data. With contemporary thermo-analytical techniques the temperature related values of isobaric heat capacity of the solid can be determined below the melting temperature. Additionally *<sup>C</sup>lpi*(*T*), characterizing the solute melt state, can be measured above the melting temperature, if the compound does not decompose. Unfortunately, the super-cooled melt system used as a reference state is in practice experimentally inaccessible far away from the melting temperature, as it is the case in the instance of typical solubility measurements. Hence, to overcome this difficulty several simplifications of Equation (7) were proposed enabling for extrapolation of heat capacities. First of all, it is possible to assume that Δ*C<sup>f</sup> us pi* (*T*) is small in comparison to the other terms and is omitted by imposing a zero value. This means that the enthalpy of fusion is independent of the temperature. This seems to be reasonable for temperatures not too remote from the melting point [38,39]. On the other hand, there are strong suggestions that in general this represents serious oversimplification [40–42] since change of the heat capacity upon phase transition is very often significant. Hence, the alternative proposed by Hildebrand and Scott [43,44] assumes that Δ*C<sup>f</sup> us pi* (*T*) is constant and can be approximated by the entropy of fusion at the melting temperature. In turn, this value can be approximated by the ratio of fusion enthalpy and system temperature. Although this assumption is not exceptionally accurate, it has been shown [41] that the true value of Δ*C<sup>f</sup> us pi* (*T*) is generally much closer to ΔS*<sup>f</sup> us i* (*T*) than to zero. The importance of this assumption was validated by Moller [45] reporting a mean difference of 10% between these two simplifications. However, a more systematic study on complex substances did not confirm this notion since no significant influence has been found on the overall solubility predictions [46] in relation to assumptions of Δ*C<sup>f</sup> us pi* (*T*) model. There are also different semi-empirical models developed for characteristics of solid-liquid heat capacities enabling much higher precision of solute activity estimation and thermodynamically rigorous extrapolation over a broad range of temperatures [47,48].

On the other hand, Equation (7) plays an important role in the area of theoretical solubility predictions and since it defines the problem with two unknowns coming from different sources all the above comments are valid also in this context. Many theoretical models were developed for estimating activity coefficients based on which solubility can be deduced. Among many theoretical approaches the COSMO-RS theory offers a very attractive way of chemical potential characteristics in bulk systems. Foundations of this theory are available in original papers [24–27], so details will be omitted here. It is suffice to remind that the values of the chemical potentials of compounds in the solvents are calculated by an iterative solution of the following equation:

$$\mu\_i^l(\sigma) = -\frac{\text{RT}}{a\_{eff}} \ln \left[ \int P\_i(\sigma \nu) \cdot \exp\left(\frac{a\_{eff}}{\text{RT}} \left(\mu\_i^l(\sigma \nu) - \text{E}(\sigma, \sigma \nu)\right)\right) d\sigma' \right] \tag{8}$$

where σ represents the charge density of molecular segment, E(<sup>σ</sup>,<sup>σ</sup>') is the sum of electrostatic and hydrogen bonding contributions to the total energy of the system computed as pair-wise additive interactions of molecular surface segments of charge density σ. The molecular surface is defined by the molecule polarization after immersing in dielectric continuum or conductor. This solvation model [25] results in a so called σ-profile, Pi(σ), representing simply a histogram of surfaces of a given charge density. The most important information of the molecular polarity and intermolecular interactions is directly encoded in σ-profile. The electrostatics (ES), hydrogen bonding (HB) and van der Waals interactions of contacting surface pieces of charge density σ and σ' are computed based on the pair-wise additivity assumption. The detailed definitions are of less importance here, except for the fact that they comprise scaling factors and adjustable model parameters fitted to a large number of thermodynamic data [25,26]. After integrating the segments chemical potentials over the surface of the i-th compound in the multicomponent system, the value of total chemical potential is computed by inclusion of a combinatorial term: 

$$
\mu\_i^l = \int P\_i(\sigma) \cdot \mu\_i^l(\sigma) d\sigma + \mu\_{\text{COMP},i}^l(\sigma). \tag{9}
$$

The first component plays the role of the residual part and the latter accounts for size and shape effects of solute and solvent [24,27]. Since the above equation offers a general way of computing chemical potential of any system, it is directly applicable to liquids solubility prediction. However, for solid solutes a generalization was proposed by accounting also the fusion contribution:

$$RT\log\left(\mathbf{x}\_{i}^{l,(itrr+1)}\right) = \left[\boldsymbol{\mu}\_{i}^{o,l} - \boldsymbol{\mu}\_{i}^{l}\Big|\mathbf{x}\_{i}^{l,(itrr)}\right) - \max\left(0, \Delta G\_{i}^{fus}(T)\right)\Big|.\tag{10}$$

This solubility equation is solved iteratively until self-consistency criterion is met by re-computing values of the chemical potentials dependent on the mole fraction of solute in solution at the current iteration. This leads to systematic improvement of the initial guess of solubility representing solubility at infinite dilution in given solvent or solvents mixture.

The solubility in Equation (10) can be used also in the reversed manner for Gibbs free energy of fusion computations. Based on COMSO-RS derived chemical potential and experimental solubility fusion data are directly available. This value is to be treated as apparent Gibbs free energy of fusion, which knowledge, along with computed values of chemical potential, is sufficient for exact reproduction of the experimental solubility.

The main advantage of using COSMO calculations for calculating phase equilibria is that the results are obtained from the first principles. However, application of this approach for solid solutes solubility prediction, apart from its own shortcomings of estimation of chemical potentials, is also prone to fusion related inaccuracies as discussed above. The goal of this paper is three-fold and inherently associated with Equation (10) First of all, the critical evaluation of quality of predicted values of C60 solubility in 180 net solvents is undertaken. Then, the origin of the observed inaccuracies is discussed in the light of the values of Gibbs free energy of fusion derived from solubility data. Decomposing these data into meaningful contributions provides insight into the source of observed discrepancies between predicted and computed data. Finally, classification of solvents using the consonance solvents approach is proposed and solubility data predicted based on this idea are confronted against experimental data.

#### **2. Materials and Methods**

#### *2.1. Solubility Dataset*

Solubility data of C60 in a variety of solvents were compiled by different authors for modeling purposes [18,19,21,49,50]. In few cases, some discrepancies were found and averaged values were used in this paper. In Tables 1–3 all experimental values are provided.




**Table 2.** The list of type B of consonance reference solvents suggested according to modest and fine criterions of solvents classification. Notation is the same as in Table 1.


**Table 3.** The list of type B of consonance reference solvent suggested according to modest and fine criterions of solvents classification. Notation is the same as in Table 1.


**Table 3.** *Cont.*

#### *2.2. Thermodynamic Data of C60 Fusion*

So far the fusion of fullerene C60 has not been observed at any temperature and existing estimate values do not offer consensus. For example, the following data were reported: *T*m > 950 K [51], *T*m ≈ 1200 K [52] or *T*m ≈ 2023 K [53]. These temperatures are rather debatable since C60 sublimates at temperature between 700 and 945 K [54], depending on the experimental protocol used for the measurements. However, the enthalpy of fusion can be obtained from the available experimental enthalpy of sublimation and an estimated enthalpy of vaporization of the liquid fullerene. This leads to different estimates depending on the assumed values [49]. Here, all solubility computations rely on the values of fusion provided by Kulkarni et al. [55]. Hence, 25.77 kJ/mol was used for approximate fusion enthalpy as calculated from crystal energy in toluene assessed using solubility parameters approach [55]. Instead of melting temperature the sublimation one is assumed here, *T*m = *T*sub = 750 K [56]. These values were successfully used for C60 solubility prediction in solvent mixtures with an aid of Wohl's equation and Scatchard–Hildebrand theory [57].

#### *2.3. COSMO-RS Computations*

The structures of solute and solvent molecules were represented by sets of relevant conformations generated using COSMOconf 4.2 and Turbomole 7.0 as the default engine for geometries optimization. The obtained conformers used further for characteristics of bulk systems had their geometries fully optimized using BP functional and TZVP basis set. All structures were generated both in the gas phase and including environment effects via the COSMO-RS [25] solvation model. For solubility computations TZVPD-FINE level was used, which corresponds to single point calculation with TZVPD basis set and the same density functional based on previously generated geometries. The BP-TZVPD-FINE\_19.01 parameterization was used for all physicochemical properties computations as implemented in COSMOtherm [Version 19.0.1 (Revision 5259)].

#### **3. Results and Discussion**

The starting point of this project was the frustrating observation of extremely high deviations of values computed using the COSMO-RS approach compared to experimental solubility data of fullerene C60 dissolved in 180 net solvents. This is documented in Figure 1 by the distribution marked with black crosses representing results of default computations done in COSMOtherm on fine level. Practically, there is no correlation between predicted and measured values (R<sup>2</sup> = 0.364). Although the mean absolute percentage error, MAPE = 23% seems to be acceptable, only half of the systems

is characterized with absolute percentage error, APE, less than 20%. In the case of cyclopentane the percentage error, PE, exceeds 55% and for pyrrolidine reaches –73%. These values characterize deviations expressed in the decadal logarithm. This means in practice one order of magnitude underor overestimation of mole fractions of C60 in these solvents. For the majority of solvents the solubility of C60 is significantly overestimated by COSMO-RS without any detectable trend related to molecular structure of the solvents. For some alcohols, as for example 1,4-Butanediol, computed C60 solubility perfectly match experimental ones, APE = 1.0%. For other solvents of this type, as for example methanol, a significant discrepancy with APE = 32.2% is observed. Generally poorer performance is noticed for less polar systems containing aliphatic chains and better for systems being able to form hydrogen bonds and with high contributions of electrostatic interactions. For detailed inspection of the origin of this situation the reversed solubility computations were performed for all systems and apparent Gibbs free energy of fusion were generated.

**Figure 1.** The correspondence between computed and experimental values of fullerene C60 solubility in 180 net organic solvents. Statistics of regression curve denotes accuracy of consonance solvent selection according to fine criterion.

#### *3.1. Computations of C60 Fusion from Solubility Measurements*

As it was mentioned in the introduction that the procedure of solubility computation in COSMO-RS can be reversed for estimating adequate values of fusion Gibbs free energy based Equation (10) allowing for straightforward determination of ΔGfus values. This means that if a proper value of such fusion affinity is available normal solubility computations using COSMOtherm would reproduce exactly experimental data. This option relies simply on the reference solubility approach in which chemical potential of the solute is determined based on experimental concentration of saturated solution and computed values of the equilibrium activity of the solute *ali xsat i* . The computed values characterizing Buckminster fusion for the set of available 180 solubility values are presented in Figure 2. Presented distribution of the apparent Gibbs free energy of fusion univocally documents that this value is very strongly dependent on the nature of the solvent. For generalization of the discussion, instead of absolute values of solubility-derived fusion, the relative value with respect to calorimetrically measured data will be used and termed here as fluidization contribution, FLUID = Δ*G<sup>f</sup> us*,*COSMO*−*RS C*60,*sat*− Δ*G<sup>f</sup> us*,*cal C*60,

where Δ*G<sup>f</sup> us*,*cal C*60 equals 3.71 kcal/mol at room temperature [55]. The obtained result presented in Figure 2 is rather unexpected since in an ideal situation FLUID should be at least constant if not equal to zero for any solvent at constant temperature and pressure. Evidently this is not the case for fullerene C60 since the deviations from calorimetric fusion value range from –3.4 kcal/mol for nitromethane up to +4.9 kcal/mol in the case of cyclopentane. Data for other few extreme cases are also provided in Figure 2. Such pronounced variability of apparent ΔGfus cannot be addressed solely to improper representation of the heat capacities. Additionally, problems with measurements of melting temperature and heat of fusion of C60 cannot justify such strong irregularities.

**Figure 2.** Correspondence between percentage error and values of fluidization term of Buckminster fusion derived based on experimental solubility data and calorimetric measurements (FLUID = Δ*G<sup>f</sup> us*,*COSMO*−*RS C*60,*sat* − Δ*G<sup>f</sup> us*,*cal C*60 ). Data points are colored according to solvents consonance classified using fine criterion. Codes represent subclasses enlisted in Tables I-III.

According to equation 9 the relationship between log(*xsat C*60) and ΔGfus is linear and if inaccuracy of the fusion value equals 1 kcal/mol than the error of solubility expressed in units of decadal logarithm of mole fraction at room temperature will be equal to RTln(1) ≈ 0.74. This means that for proper estimation of solubility a very high accuracy of ΔGfus is required exceeding the so called chemical accuracy, which is of the order of 1 kcal/mol or 0.03 eV [58]. However, in the case of C60 the solvent imposed alterations are much higher, variable and not negligible. For 103 solvents FLUID exceeds 0.5 kcal/mol and for 76 is lower than –0.5 kcal/mol suggesting that C60 is to be considered as a rather non-ideal agen<sup>t</sup> in the majority of solvents despite low concentrations at saturated conditions. In fact, this is supposed to be the reason of very poor C60 solubility predictions by COSMO-RS, as documented in Figure 1. This means that activities coming from the COSMO-RS approach are inaccurately computed not accounting for all necessary contributions characterizing saturated solutions. The span of solubility values is very broad encompassing solvents in which C60 is practically insoluble as for example water (log(*xsat C*60 = −12.5) and modestly miscible as it is in the case of 1-phenylnaphthalene log(*xsat C*60) = −1.9.

It is not surprising that fluidization terms for cyclopentane and nitromethane have opposite signs due to differences in electronic structure affecting intermolecular interactions in bulk systems. Generally, in the case of non-polar structures as alkanes and cycloalkanes, the fluidization contribution is positive indicating higher affinity of solute to such solvents, what results in overestimation of solubility in the case of using to low Δ Gfus values. To the contrary, highly polar and rich in lone pairs systems, as for example nitromethane or heteroaromatic compounds, are characterized by negative values of FLUID term. Hence, the a ffinity of solute toward such solvents is lower than expected from uncorrected calorimetric values of Gfus and computed solubility will be underestimated. It is a bit surprising that methanol and also other alcohols were found in the same group as some alkanes or halogenated alkanes but this should be addressed probably to di fferent sources of FLUID value and cancelation of such contributions as electrostatic, hydrogen boding and dispersion.

Before proceeding any further it is necessary to comment on the common phenomenon observed in the case of systems in which high a ffinity of solute is observed toward solvent molecules. In such cases, it is quite reasonable to expect that new solid forms can precipitate in the saturated solution in the form stable solvates. This in turn would give rise to a di fferent activity for the solid phase due to altered thermodynamic properties of solid phase of di fferent composition. Such multicomponent solids are well known in medicinal chemistry. It was already recognized that variations in the composition of the contents of the stomach/gu<sup>t</sup> may a ffect the recrystallization kinetics between di fferent solid-state forms of a drug [59,60]. This can be addressed not only to polymorphism but more often to solvates as for example it is the case for carbamazepine [61–63] and its cocrystals [60]. This aspect is also relevant to C60 solid–liquid equilibria in organic solvents. Unfortunately, in reports providing the values of solubility there are not su fficient information about solid phase present in the equilibrium with the saturated solutions. Although no direct clues about C60 solvates can be inferred from solubility measurements this aspect was not overlooked and attracted serious attention by many investigators [10–17,64–68]. For example, the dissolution properties of C60 in aromatic solvents were studied using the thermos-analytical approach [10,12] and many solid solvates were identified. This of course is relevant to the data provided in Figure 2 since documented dramatic influence of the thermodynamic properties of solid solvates might be addressed to this phenomenon of C60.

#### *3.2. Reference Solvent Computations*

The remedy on the lack of Gfus is supposed to be the so called reference solubility computation. It is simply an attempt of computing solubility in one solvent based on information of solubility in another. In practice, this is exactly the same procedure as the one used for computing the FLUID term but two solvents are declared in the input files. Since it is not known *a priori* which reference solvent is the most appropriate for given solvent all possible combinations were considered here. Hence, reference solubility was computed for all 180 × 179 = 32,220 solvents pairs. In each case one solvent was used as a reference one and solubility was computed for the other solvent. Based on obtained results MAPE was computed for each reference solvent. The best result of these computations was provided in Figure 1 as distribution marked with grey diamonds. It happened that the selection of either ethylbenzene, toluene, bromooctane, cis-decahydronaphthalene, propylene glycol, propargyl bromide, 1,5-Pentanediol or n-butylbenzene leads to similar quality of solubility prediction for the rest solvents. Unfortunately, this approach is unsatisfactory as evidenced by the plot collected by Figure 1, where ethylbenzene served as the single reference solvent system. Obviously, the diversity of solvents in the data set prevents from finding one optimal reference system. Hence, this option does not provide any sensible increase of overall accuracy of solubility predictions if a single set of reference solvents is applied to the whole data set. It is also worth mentioning that the reference solvent method cannot help in the cases if solid states in equilibrium with saturated solutions comprise solvates. Since, there is a vast gross of evidences [10–14,16,17,64–66,68] that Buckminster can interact with many solvents, it is quite clear that reference solubility cannot be used neither for serious improvement of predicted solubility data nor for treating cases of unknown fusion characteristics.

#### *3.3. Consonance Solvents Classification*

Fortunately, the situation is not as hopeless as it might seem from results presented so far since some interesting trends can be revealed after sorting data obtained based on the single reference solvent computations. The 180 × 180 matrix comprising APE values of every pair solvent-reference solvent was sorted systematically by rows and columns with criterion of ascending values of APE. After sorting rows, the columns were rearranged accordingly for keeping the same order of rows and columns. Finally, this resulted in a kind of heat map presented in Figure 3.

**Figure 3.** Graphical representation of consonance reference approach. Rows and columns ordered with increasing percentage error values were colored using green (low PE values) to red (high PE values) spectrum. Squares define consonance regions according to fine (PE < 10%) and modest (PE < 15%) criterions. The order of solvents is the same as provided in Tables 1–3.

This operation allows for grouping solvents according to their ability of playing the role of good reference solvent for other solvents found in the near neighborhood in Figure 3. Existence of such regions, encompassing solvents with high similarity in terms of FLUID values, is termed here as the consonance solvent approach. This means that on the map provided in Figure 3, for each case, it is possible to find suitable reference solvent providing quite accurate values of computed C60 solubility. Restricting the expected accuracy to the fine criterion for which APE < 10% of solubility prediction of every pair solvent-reference solvent results in rectangle regions marked with thinner black lines in Figure 3. Alternatively, reducing accuracy to APE < 15% leads to a much lower number of sections marked with bold black squares. In the former case, it is necessary to know thirteen solubility values for proper computation of solubility of the rest of the population. If the modest criterion is accepted only five measurements are indispensable for the same purpose. It happens that solvents in the middle of each class or subclass are the most suited reference solvents. However, three cases are out of this classification for which it is hardly possible to find suitable reference solvent. These are water, cyclopentane and nitromethane. These solvents are treated by COMOS-RS on different manner compared to the rest of the population that the consonance solvent approach is unable to improve solubility predictions. Particularly, water cannot be used as reference solvent due to ultra-low solubility of C60. Consequently, reference solubility predicts complete miscibility in majority cases if water is used as a reference system. Additionally, none of the organic solvents seem to be an appropriate reference for water with one exception. Using nitromethane as a reference solvent leads to 3.8% error of estimated C60 solubility in water. This is probably an accidental artefact. Cyclopentane and nitromethane are those solvents, which were found to be characterized by extreme values of fluidization term, which already anticipated their uniqueness.

The effectiveness of proposed classification is demonstrated in Figure 1 by a series marked with black circles. The mean absolute error is as low as 4.3% and the regression coefficient reached R<sup>2</sup> = 0.977 indicating very high accuracy of computed values. Reducing accuracy to the modest level results in an increase of the overall error up to 8.4% with reduced linearity of the trend to R<sup>2</sup> = 0.904. The coe fficient of the linear regression equation provided in Figure 1 suggests that predicted solubility values are still slightly overestimated compared to measurements. The detailed classification with identified consonant solvents is provided in Tables 1–3.

According to the modest criterion, the first class of consonance solvents comprises 49 solvents. Alkanes and halogenated alkanes were mainly included here. However, surprisingly, also some light alcohols as methanol and ethanol were found within this class. Hexane is suggested as the best reference solvent for this class o ffering accuracy as good as MAPE = 5.4%. There are however many other solvents, which can be treated as consonance ones with similar applicability. For example, such commonly used solvents as tetrahydrofuran, acetone or propan-2-ol are promising substituents for hexane in reference solubility computations of C60 solubility. As it was enumerated in Table 1, for further increase of solubility predictions this class can be subdivided into three subclasses. This increases the accuracy by a factor of two, reducing MAPE to 2.1%. Hence, very accurate predictions of C60 solubility in all solvents collected in Table 1 are possible provided that three experimental measured values are available one for each subclass A1, A2 or A3. Interestingly, class A is characterized by positive and highest values of fluidization term being in the range from 1.36 kcal/mol for thiophene and 4.02 kcal/mol for pentane.

Similarly, collection of further 60 solvents provided in Table 2 can also be classified according to the consonance rule. Although the best reference solvent for class B is nonan-1-ol but also such solvents as 1,2-dichloroethane, 1,2-dibromoethane, iodobenzene, butanoic acid or decan-1-ol o ffer similar accuracy. The average percentage error is slightly higher in comparison with class A and is equal to 7.5%. As it is presented in Figure 2 class B is characterized by smaller values of fluidization term being within the range of 0.35 kcal/mol for 1,4-dimethylbenzene and 1.84 kcal/mol for pentan-3-ol. Class B also can be subdivided into three subclasses for increase of accuracy of computed C60 solubility.

Finally, in Table 3 the classification of the remaining 68 solvents is provided. Here are the solvents characterized either by close to zero values of fluidization term or negative values. This class is more heterogeneous compared to the previous two and consists of a larger selection of 59 solvents augmented with two small sets grouping three (class C') or six (class D) solvents. It has been found that 1,5,9-cyclododecatriene is the best reference solvent for class C o ffering 10.3% accuracy expressed in term of MAPE value. Alternatively, other solvents o ffer similar accuracy as for example 1-bromooctadecane, thiophenol, 1-bromo-2-chloro-benzene, 1,5,9-cyclododecatriene, 1,2-dichlorobenzene, benzyl bromide or 1,2-dibromocyclohexane. Class C', which merges with class C4 in case of fine classification, comprises solvents for which selection of piperidine seems to be the best reference solvent. Finally, selection of tetralin is suggested as consonance solvent for the final class.

#### *3.4. Multiple Reference Solvent Computations*

Results presented above tempted for testing if any benefit will result from solubility computations based on information of solubility in several solvents. This option, referred to as the multiple reference solvent approach, was extensively tested here for C60 solubility. There were several conducted trials with a varying number of solvents starting from two and ending on the maximum possible number in the current version of COSMOtherm. Combinations of suggested consonance solvents were used and also randomly selected ones but no significant benefit was achieved as a result of all these attempts. The best results found during this stage were provided in Figure 1 as distribution marked with open grey circles. The conclusion is quite simple and univocal. No sets of multiple reference solvents were found for which the quality of prediction would be comparable to the single consonance solvent approach.

#### *3.5. Decomposition of Fluidization Term*

In the context of the mentioned variability of fluidization term, a closer inspection of the factors contributing to its values seems to be valuable. Fluidization encompasses all contributions, which are inherently associated with dissolution but not included in the value of chemical potential of solute in the saturated solution computed by COSMO-RS. This might be related to improper accounting for entropy by the combinatory term. In addition, the residual portion of chemical potential might be the source of non-negligible FLUID values due to the inaccurate characteristics of some solvent specific aspects of solubilization including mixing, solvation and solvent structure alterations imposed by solute cavitation in the presence of the solute. Indeed, the incoherence of the COSMO-RS theory in the current formulation was already noticed in the case of strongly interacting species which resulted in formulation of the COMSO-RS-DARE approach [69]. This extension of the original method was introduced for proper evaluation of activity coe fficients for the binary mixtures of carboxylic acids and non-polar components. Recently the author demonstrated the e ffectiveness of this approach for ethenzamide solubility predictions [70].

The quantification of fluidization contributions to the values of fusion Gibbs free energy is indispensable for a proper prediction of solids. To the author's best knowledge, this aspect was not raised so far, mainly because of the fact that there is no other substance, which solubility was so extensively studied as fullerene C60. In this case, conducting measurements in 180 solvents with such high chemical diversity, covering all types of solvents enables in depth analysis of the discussed aspect. For further exploring of the feature of the fluidization term, it seems to be valuable splitting it into combinatory contribution and residual one. Additionally, the latter term can be further decomposed into three major energetic contributions used by the COSMO-RS approach for chemical potential computations.

$$\begin{split} \Delta G\_{f\text{fluid}}(T) &= \Delta G\_{f\text{fluid}}^{\text{COMB}}(T) + \Delta G\_{f\text{fluid}}^{\text{RES}}(T) = \\ &= \Delta G\_{f\text{fluid}}^{\text{COMB}}(T) + \Delta G\_{f\text{fluid}}^{\text{MSF}}(T) + \Delta G\_{f\text{fluid}}^{\text{H}\text{B}}(T) + \Delta G\_{f\text{fluid}}^{\text{vdW}}(T) \end{split} \tag{11}$$

where Δ *GMSF fluid* is the misfit part of the fluidization term coming from electrostatics interactions, Δ *GHBfluid* stands for hydrogen bonding contribution and Δ *GvdWfluid* accounts for all non-specific interactions expressed as van der Waals interactions. All four contributions can be estimated by scaling capabilities of COMSOtherm and enforcing corresponding scaling factors to be equal to zero. The result of separations of these contributions is provided in Figure 4.

First of all, the provided series show a broad trend suggesting that higher solubility is generally associated with lower values of fluidization term and such contributions as vdW and RES. On the contrary, the stronger the electrostatics the lower the solubility of C60. The remaining two contributions, namely HB and COMBI have much smaller values. Moreover, the origin of the observed variation of FLUID comes from the vdW term. It is somewhat compensated with misfit values for those solvents with higher solubility and consequently COMS-RS performs slightly better in such situations. Contrary to this, low solubility systems pose a real challenge to COSMO-RS.

**Figure 4.** Relationship between different fluidization contributions and experimental solubility of C60 in studied solvents. MSF denotes the misfit part of fluidization term coming from electrostatics interactions (Δ*GMSF fluid*), HB stands for hydrogen bonding contributions (Δ*GHBfluid* ), vdW accounts for all non-specific interactions expressed as van der Waals interactions (Δ*GvdWfluid* H), RES represents the residual part of the chemical potential by summing of these three terms, COMBI is the combinatorial part of chemical potentials and FLUID is the sum of all contributions.

Finally, these conclusions can be also supported by statistical analysis of the distributions by performing the principal component analysis (PCA) on covariance matrix. This step reveals that all contributions involved in Equation (11) can be reduced just to two statistically significant factors encompassing 98.5% of the total variance. Values of loadings provided in Figure 5 sugges<sup>t</sup> that the whole fluidization term is mainly determined by its residual part and these two contributions are highly correlated with PC1 with correlation coefficient as high as –0.99 and –0.98, respectively. The major contribution to the residual part comes from dispersion and electrostatics influences RES in much lesser extent. It is really interesting to see vdW and MSF loading to PC2 not only because of the fact that they predominantly define this component, but also for their opposite signs of their contributions. This explains why so chemically diverse compounds were grouped into the same classes of consonance solvents. Indeed, increase of dispersion is compensated by electrostatic contribution to fluidization term and vice versa. Due to small values of both combinatorial term and hydrogen boding their overall importance is marginal for considered data set what was already documented in Figure 4.

**Figure 5.** Results of principal component analysis of FLUID and its components including the residual part of the chemical potential (RES), combinatorial term (COMB), contributions coming from electrostatic (MSF), hydrogen boding (HB) and van der Waals (vdW) interactions.

#### **4. Conclusions**

Solubility of Buckminster dissolved in 180 net organic solvents was predicted using the COSMO-RS approach. This universal methodology, taking advantage of the first principle quantum chemistry computations augmented with statistical thermodynamics, allows in principle for characteristics of any bulk liquid systems or their interfaces. Formally, application to solid solubility requires data characterizing the fusion process. It was documented that from the perspective of the COSMO-RS approach it is indispensable to distinguish calorimetric contributions to Gibbs free energy of fusion from the fluidization term resulting from inappropriate characteristics of probably both residual and combinatorial parts of chemical potential. The higher the FLUID the lower the experimental solubility, which is also associated with a significant increase of error of computed solubility.

These negative conclusions came from performed computations stimulated for seeking some solution to this discouraging result. This goal has been achieved by proposal of classification of solvents into groups with similar values of fluidization term. It was possible to find sets of solvents used in the reference solvent computations protocol that lead to very accurate prediction of C60 solubility. However, it is not possible to define any set of reference solvents suitable for computation of the solubility for the whole set of solvents. As the rule of thumb solvents within interval of 1 kcal/mol of fluidization term can be mutually exchanged either as the solvent or the reference.

It is of course debatable if the observations made in this paper are of general nature or hold just for Buckminster dissolvent in net organic solvents. What is certain, however, is that observations were possible because of the richness of the experimental data available in the literature and further exploration is worth the effort.

**Funding:** This research received no external funding.

**Acknowledgments:** In this section you can acknowledge any support given which is not covered by the author contribution or funding sections. This may include administrative and technical support, or donations in kind (e.g., materials used for experiments).

**Conflicts of Interest:** The author declares no conflict of interest.
