Article

QSPR and Nano-QSPR: Which One Is Common? The Case of Fullerenes Solubility

by Alla P. Toropova 1,*, Andrey A. Toropov 1 and Natalja Fjodorova 2
1 Laboratory of Environmental Chemistry and Toxicology, Istituto Di Ricerche Farmacologiche Mario Negri, IRCCS, Via Mario Negri, 2, 20156 Milano, Italy
2 Laboratory for Chemoinformatics, Theory Department, National Institute of Chemistry, Hajdrihova 19, 1001 Ljubljana, Slovenia
* Author to whom correspondence should be addressed.
Inorganics 2023, 11(8), 344; https://doi.org/10.3390/inorganics11080344
Submission received: 30 June 2023 / Revised: 4 August 2023 / Accepted: 18 August 2023 / Published: 21 August 2023
(This article belongs to the Special Issue Advances in Fullerene Science)

Abstract

Background: The system of self-consistent models is an attempt to develop a tool to assess the predictive potential of various approaches by considering a group of random distributions of available data into training and validation sets. Considering many different splits is more informative than considering a single model. Methods: The models studied here are built for the solubility of fullerenes C60 and C70 in different organic solvents using so-called quasi-SMILES, which combine traditional simplified molecular input-line entry system (SMILES) strings with codes that reflect the presence of C60 or C70. In addition, fragments of local symmetry (FLS) in quasi-SMILES are applied to improve the predictive potential of the solubility models (solubility is expressed as mole fraction at 298 K). Results: Several versions of the Monte Carlo procedure are studied. The use of the fragments of local symmetry along with a special vector of the ideality of correlation improves the predictive potential of the models. The average value of the determination coefficient on the validation sets is equal to 0.9255 ± 0.0163. Conclusions: The comparison of different versions of the Monte Carlo optimization of the correlation weights has shown that the best predictive potential was observed for models where both the fragments of local symmetry and the vector of the ideality of correlation were applied.

1. Introduction

Like the world of real material movements, in which all events that are visible and tangible to us in everyday life, such as wind, rain, and the movement of clouds, take place, there is a world of probabilistic actions, accidents, and tendencies that influence each other. However, these are not visible and not tangible to us. Perhaps quantitative structure-property/activity relationships (QSPR/QSAR) allow one to look into this world of accidents and trends that affect each other.
There is no mysticism here, but the phenomena occurring in such a space are not always described ideally and reliably. In other words, encountering situations that defy logic is possible. For example, the quality of calculations (models) can be affected by the collection of substances, which are available in the database, as well as priorities and criteria selected in the software used for QSPR/QSAR simulation.
However, in any case, it remains an indisputable axiom that models of random events are knowledge only when they are understandable and allow the possibility of verification by establishing and confirming their reproducibility.
Traditional QSPR was initially based on molecular structure [1,2,3] and later came to involve an extended set of descriptors that included information not only on molecular architecture but also on the magnitudes of various physicochemical properties [4]. This would be a good fit for nano-QSPR, if not for the lack of a clear relationship between the molecular structure and the pleasant/useful/dangerous nano-physicochemical properties of the respective nanomaterials [5,6,7,8,9,10,11]. It cannot be said that the molecular architecture does not affect the physicochemical properties of nano-substances in any way, but this influence is very complex for nano-substances. That is, whereas for small organic molecules a modification of the geometrical/topological arrangement of a pair of atoms necessarily changes the physicochemical parameters, for fullerenes, and even more so for multilayer nanotubes, the effects of changes in the arrangement of a pair of substituent atoms are very difficult to establish and/or measure experimentally. Naturally, simple homologous series, which formed the basis of the first QSPR experiments on organic compounds [1,2,3], are extremely rare for nano-substances, owing to the high cost of, and weak motivation for, experimental work designed to provide the corresponding numerical data on the physicochemical parameters of homologous series of fullerenes.
How do we obtain information on all promised abilities to apply nanomaterials? How do we select and use the unique potentials of nanomaterials? Hints, hypotheses, and intuition must be transformed into knowledge.
Can a model be knowledge?
Knowledge is a tool. It is preferable if knowledge is convenient for use in solving practical problems. Consequently, a model can be a way to reach knowledge when all excess is removed from the model and only the necessary remains. Nothing is surprising in that a brief instruction may be more useful than an excessively detailed one. That is why most researchers profess the principle that “to understand is to simplify”.
Taking into account the absence of large databases on various nanomaterials and the availability of sufficiently large arrays of experimental data on the interaction of individual nanomaterials with different organic substances (for example, with solvents [12]), one should look for the possibility of constructing models of the behaviour of nanomaterials in interaction with “traditional” organic substances.
In the QSPR study of the solubility of C60 and C70 fullerenes [12], the traditional paradigm of QSPR/QSAR simulation is represented as
$$S = F(M) \tag{1}$$
This may be extended as
$$S = F(M, \mathrm{Fullerene}) \tag{2}$$
where S = solubility, F = mathematical function, and M = molecular structure.
The transition from the model expressed by Equation (1) to the model expressed by Equation (2) is essentially a transition from traditional QSPR to nano-QSPR.
It should be noted that the model expressed by Equation (2) must (as well as the model expressed by Equation (1)) comply with the requirements for the QSPR formulated as well-known OECD principles [13]:
  • A defined endpoint (including experimental protocol);
  • An unambiguous algorithm;
  • A defined domain of applicability;
  • Appropriate measures of goodness-of-fit, robustness, and predictive power;
  • A mechanistic interpretation, when it is possible.
One can use these principles for nano-QSPR expressed by Equation (2). Can the OECD principles be improved? Latent attempts to do this can be seen in many studies [14,15,16,17,18,19,20,21].
The approach considered here is that each object (solvent = SMILES, fullerene = [C60] or [C70]) is represented by a character string. The program divides the symbols into special groups, for which the so-called correlation weights (some coefficients) are found. The descriptor for each object is the sum of the correlation weights. The Monte Carlo method is used to find such correlation weights that provide the maximum value of the objective function. This optimization is carried out on the basis of partitioning the available data into special subsets: an active training set (its task is to develop a model), a passive training set (its task is to check the objectivity of the current model), a calibration set (its task is to detect the start of the overtraining), and the validation set to assess the predictive potential of the final model.

2. Results

The three schemes for constructing models of the solubility of fullerenes C60 and C70 in organic solvents were evaluated.
  • First Scheme
The models were constructed using new components of the model, which are named correlation weights of fragments of local symmetry (FLS). However, the Monte Carlo optimization of the extended set of quasi-SMILES codes was planned without using the correlation idealization vector, which has two components: the index of ideality of correlation (IIC) and the correlation intensity index (CII).
  • Second Scheme
The models were constructed via the Monte Carlo optimization of the set of quasi-SMILES codes, without correlation weights of FLS, using the above-mentioned vector of the ideality of correlation.
  • Third Scheme
The models were built using the Monte Carlo optimization of an extended list of the correlation weights, including FLS, along with using the vector of the ideality of correlation.
Figure 1 contains the graphical representation of the simulation processes observed for the three schemes in the case of split 1. One can see that the third scheme appears to be the most promising.
In addition, one can see that the practically reasoned optimal descriptor for the first scheme is DCW(3,5). In contrast, the preferable optimal descriptor for the second and third schemes is DCW(3,15).
According to the principle “QSAR/QSPR is a random event”, it is necessary to study the statistical quality of models observed under different distributions in the training set (here, the set is structured into three components: active training, passive training, and calibration sets). Table 1 contains the results of applying the first scheme on splits 1–10.
One can see that the determination coefficients for the active training, passive training, and calibration sets are, as a rule, equal to or even slightly larger than the determination coefficient for the validation set. However, if the optimization is continued, the determination coefficients for the active and passive training sets will increase, whereas the determination coefficient for the external validation set will decrease (Figure 1).
The second scheme (Table 2) is characterized by a significant decrease in the statistical quality for the active and passive training sets, accompanied by a noticeable increase in the coefficient of determination for the validation set. This confirms the observed influence of IIC and CII described in the literature [18]; IIC and CII improve the statistical quality of the QSPR/QSAR models for the validation set, but to the detriment of the statistical quality of the model for the training set.
The statistical quality and the general logic of the models obtained using the third scheme (Table 3) are very similar, but not identical, to those of the models obtained using the second scheme.
Figure 2 shows the models obtained using the second and third schemes for split 1. It should be noted that although the statistical quality of the model for the active and passive training sets is low, these sets contain two latent correlations (Figure 2). Apparently, this is an effect of the vector of the ideality of correlation. Analogous pairs of correlations were observed in computer experiments described in the literature [20,21]. Figure 2 indicates that the latent correlations on the active and passive training sets are statistically more significant than the total correlations on these sets.
It is under these circumstances that the problem arises regarding how to distinguish between the two approaches (second and third schemes). Which approach is more efficient, more precise, and more reliable?
Figure 1 shows some improvement in the statistical quality of the model for the case of the third scheme compared with the results observed in the case of the second scheme. However, it is related to split 1. Will this conclusion/hypothesis be true for splits 2, 3, …, 10?

2.1. System of Self-Consistent Models Observed for the Second Scheme

Table 4 contains the results of testing the predictive potential of the models on external validation sets from which the quasi-SMILES involved in constructing the tested models were removed.
It can be seen that the results of applying the models to different test sets, after removing the quasi-SMILES that participated in the construction of the corresponding models, are far from being the same. However, in all cases, there is a good predictive potential. The average value of the determination coefficients for the external validation sets is $\overline{R_v^2}$ = 0.8989 ± 0.0267.

2.2. System of Self-Consistent Models Observed for the Third Scheme

Table 5 contains the results of testing the predictive potential of the models on external validation sets from which the quasi-SMILES involved in constructing the tested models were removed.
It can be seen again that the results of applying the models to different test sets, after removing the quasi-SMILES that participated in the construction of the corresponding models, are far from being the same. However, in all cases, there is a good predictive potential. The average value of the determination coefficients for the external validation sets is $\overline{R_v^2}$ = 0.9255 ± 0.0163.

2.3. The Comparison of Second and Third Schemes

The predictive potential of models built using the third scheme is better than that of models built using the second scheme: the average determination coefficient on the external validation sets is 0.9255 versus 0.8989. The dispersion of the determination coefficient values for the third scheme (± 0.0163) is also smaller than that for the second scheme (± 0.0267).
Figure 3 shows the difference in the predictive potential of models obtained using the second and third schemes. One can see the preferable predictive potential for the second scheme for splits #2, #7, and #10. However, all other splits demonstrate the advantage of using the third scheme.

2.4. What Do QSAR/QSPR and Nano-QSAR/QSPR Have in Common?

First, QSAR/QSPR and nano-QSAR/QSPR are random events.
Second, the predictive potential in both cases can change markedly depending on the distribution of available data into training and validation sets.
Third, both QSAR/QSPR and nano-QSAR/QSPR cannot replace a natural experiment in measuring the values of various “usual” and nano-endpoints.

3. Discussion

The principle “QSAR/QSPR is a random event” is confirmed by the results obtained in this study: completely homogeneous distributions of the data into the training and control subsets, with the same approach to simulating the solubility of fullerenes in organic solvents, provide different values of the statistical characteristics of the models (Table 1, Table 2 and Table 3).
For the proposed approach, which provides a certain “mathematical expectation” for the models obtained using the second and third schemes, it becomes possible to compare the average values, based on which it is possible to put forward a fairly reasonable hypothesis that the third scheme provides the best models compared to the models obtained using the second scheme. It is appropriate to note that the reliable criteria for the quality of models are not only the average values of the coefficients of determination but also their variances, which was observed in previous studies where systems of self-consistent models were used [20,21].
The FLS described here may not be a universal tool for developing arbitrary models, but it is only a technique that has proven successful for this task (i.e., for developing a model for the solubility of fullerenes in organic solvents). However, the vector of the ideality of correlations (or maybe the IIC and the CII, separately) perhaps can be recognized as useful and versatile tools for testing, and maybe even for improving, the predictive potential of traditional QSAR/QSPR and nano-QSAR/QSPR.
In this study, a quite simple version of quasi-SMILES has been applied to develop the models. However, one can easily extend the list of codes for quasi-SMILES to express more detailed and complex experimental conditions. In other words, one can hope that the quasi-SMILES serve as a language of communication between “classic” experimentalists who study nanomaterials and developers of nano-QSAR/QSPR models. A certain trend towards recognizing this language and even some experience in the practical use of this language have already been outlined [22].

4. Materials and Methods

4.1. Data

The experimental solubility values of C60 and C70 fullerenes in diverse solvents were reported in mole fraction determined at 298 K [12]. Table 6 contains the list of pairs of duplicates observed in [12]. Of each pair of duplicates, only one was left for further analysis. After this removal, 206 quasi-SMILES representing various pairs of fullerenes (C60 or C70) and solvents were used for further computational experiments.
To this end, these quasi-SMILES were randomly distributed into the following subsets: (i) active training set (25%); (ii) passive training set (25%); (iii) calibration set (25%); and (iv) validation set (25%). Ten random splits with these proportions were examined. Table 7 contains the measures of identity for these ten splits.
Each of the above sets has a defined task. The active training set is used to build the model: the molecular features extracted from the quasi-SMILES of the active training set are involved in the Monte Carlo optimization, which searches for the correlation weights of these features that maximize the target function. The target function is calculated from the descriptors (each descriptor is the sum of the correlation weights of all the components of a quasi-SMILES) and the endpoint values on the active training set. The task of the passive training set is to check whether the model obtained for the active training set is satisfactory for quasi-SMILES that were not involved in the active training set. The calibration set should detect the start of overtraining (overfitting); the optimization must stop when overtraining starts. After the optimization procedure has stopped, the validation set is used to assess the predictive potential of the obtained model.
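The random distribution into four equal subsets described above can be sketched as follows. This is a minimal illustration and not the procedure implemented in the authors' software: the function name `split_four_ways`, the use of integer seeds, and the placeholder identifiers `qs000` ... `qs205` are assumptions introduced here.

```python
import random

SUBSETS = ("active", "passive", "calibration", "validation")

def split_four_ways(items, seed):
    """Randomly distribute items into four subsets of nearly equal size."""
    rng = random.Random(seed)          # seeded, so each split is reproducible
    shuffled = list(items)
    rng.shuffle(shuffled)
    n = len(shuffled)
    # Boundaries at 0%, 25%, 50%, 75%, and 100% of the shuffled list.
    bounds = [round(i * n / 4) for i in range(5)]
    return {name: shuffled[bounds[i]:bounds[i + 1]]
            for i, name in enumerate(SUBSETS)}

# Ten splits of 206 placeholder identifiers (hypothetical stand-ins for quasi-SMILES).
records = [f"qs{i:03d}" for i in range(206)]
splits = [split_four_ways(records, seed) for seed in range(1, 11)]
```

Because 206 is not divisible by four, the subsets in this sketch hold 51 or 52 items each; how the original study handled the remainder is not stated.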

4.2. Optimal Descriptor

The model of fullerene solubility in organic solvents studied here is as follows:
$$\log S = C_0 + C_1 \times DCW(T, N) \tag{3}$$
where DCW(T,N) is the optimal descriptor.
The optimal descriptor is the basis for calculating the model value of the solubility of fullerenes in organic solvents from the correlation weights of the quasi-SMILES codes representing the “fullerene-solvent” systems. The quasi-SMILES reflect the presence of nano-features by two codes, [C60] and [C70], which indicate the fullerenes C60 and C70, respectively. From the traditional SMILES representing the solvent, data on the atomic composition of the solvent (denoted as S) and on interatomic bonds (denoted as SS) are extracted. It should be noted that “atoms” here means SMILES atoms, each of which is one symbol (e.g., ‘C’, ‘N’, ‘=’) or a group of symbols that cannot be considered separately (e.g., ‘Cl’, ‘%11’). In this study, the so-called fragments of local symmetry (FLS) are additionally used. Three types of FLS are considered: (i) XYX; (ii) XYYX; and (iii) XYZYX, where X and Y are arbitrary symbols, but X is not equal to Y. FLS are characteristics of the SMILES/quasi-SMILES strings. Generally, they are not reflections of molecular features that are somehow correlated with traditional symmetry. Nevertheless, as SMILES or quasi-SMILES features, they can be useful participants in the described optimization procedure since they improve the predictive potential of the models obtained using the approach considered here.
The above-listed features extracted from quasi-SMILES have so-called correlation weights (CW) obtained via the Monte Carlo optimization. Thus, the optimal descriptor is calculated as follows:
$$DCW(T, N) = \sum CW(S) + \sum CW(SS) + \sum CW(XYX) + \sum CW(XYYX) + \sum CW(XYZYX) \tag{4}$$
where T is the threshold, i.e., an integer to separate codes into two categories. If a code has a frequency in the active training set less than T, it is considered rare and removed from the simulating process. If the code has a frequency in the active training set larger than T, it is considered active and involved in the simulating process. N is the number of epochs of the Monte Carlo optimization.
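The feature extraction and the descriptor sum can be illustrated with a short sketch. Several points are assumptions made here and are not taken from the source: the tokenizer regular expression, the example string `ClCCl[C60]` (including placing the fullerene code after the solvent SMILES), keying each FLS by its concrete symbols, counting the frequency of a code as its total number of occurrences, and treating a frequency equal to T as active. The names `tokenize`, `extract_features`, `active_features`, and `dcw` are hypothetical.

```python
import re
from collections import Counter

# SMILES atoms: bracketed codes such as [C60], ring closures such as %11,
# two-letter elements, or any single symbol.
TOKEN = re.compile(r"\[[^\]]+\]|%\d{2}|Cl|Br|.")

def tokenize(quasi_smiles):
    return TOKEN.findall(quasi_smiles)

def extract_features(quasi_smiles):
    """Return S, SS, and FLS (XYX, XYYX, XYZYX) features as (kind, text) pairs."""
    t = tokenize(quasi_smiles)
    feats = [("S", a) for a in t]
    feats += [("SS", a + b) for a, b in zip(t, t[1:])]
    for i in range(len(t) - 2):
        if t[i] == t[i + 2] and t[i] != t[i + 1]:
            feats.append(("XYX", "".join(t[i:i + 3])))
    for i in range(len(t) - 3):
        w = t[i:i + 4]
        if w[0] == w[3] and w[1] == w[2] and w[0] != w[1]:
            feats.append(("XYYX", "".join(w)))
    for i in range(len(t) - 4):
        w = t[i:i + 5]
        if w[0] == w[4] and w[1] == w[3] and w[0] != w[1]:
            feats.append(("XYZYX", "".join(w)))
    return feats

def active_features(active_training, threshold):
    """Features whose frequency in the active training set reaches the threshold T."""
    counts = Counter(f for qs in active_training for f in extract_features(qs))
    return {f for f, n in counts.items() if n >= threshold}

def dcw(quasi_smiles, weights):
    """Sum of correlation weights; rare (absent) features contribute zero."""
    return sum(weights.get(f, 0.0) for f in extract_features(quasi_smiles))

print(extract_features("ClCCl[C60]"))
```

With a dictionary of correlation weights produced by an optimization run, `dcw` returns the descriptor value that enters the linear model for log S.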

4.3. The Monte Carlo Optimization

The correlation weights required by Equation (4) to calculate the optimal descriptors DCW(T,N) are obtained via the Monte Carlo optimization based on special target functions. Here, two target functions for the Monte Carlo optimization are examined:
$$TF_0 = r_{AT} + r_{PT} - \left| r_{AT} - r_{PT} \right| \times 0.1 \tag{5}$$
$$TF_1 = TF_0 + (IIC + CII) \times 0.3 \tag{6}$$
Here, $r_{AT}$ and $r_{PT}$ are the correlation coefficients between the observed and predicted endpoints for the active and passive training sets, respectively; IIC is the index of ideality of correlation [14,15]; and CII is the correlation intensity index [14,15].
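The two target functions, together with one simple way of maximizing the first of them, can be sketched as follows. The search below (perturb one correlation weight at a time and keep the change only if the target function grows) is an assumption chosen for illustration and is not claimed to be the algorithm used by the authors. IIC and CII are taken as ready-made inputs because their formulas are not given in this text. The stopping rule based on the calibration set is omitted, and all function names are hypothetical.

```python
import math
import random

def pearson(x, y):
    """Pearson correlation coefficient; 0.0 if either input is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / math.sqrt(sxx * syy)

def tf0(r_at, r_pt):
    return r_at + r_pt - abs(r_at - r_pt) * 0.1

def tf1(r_at, r_pt, iic, cii):
    return tf0(r_at, r_pt) + (iic + cii) * 0.3

def score(weights, active, passive):
    """TF0 for the current weights; each data item is (codes, endpoint)."""
    def corr(data):
        descriptors = [sum(weights.get(c, 0.0) for c in codes) for codes, _ in data]
        return pearson(descriptors, [y for _, y in data])
    return tf0(corr(active), corr(passive))

def optimize(active, passive, n_epochs, step=0.1, seed=0):
    """Assumed search: random single-weight moves, accepted only if TF0 improves."""
    rng = random.Random(seed)
    codes = sorted({c for cs, _ in active for c in cs})
    weights = {c: 1.0 for c in codes}
    best = score(weights, active, passive)
    for _ in range(n_epochs):              # one epoch visits every code once
        for c in codes:
            old = weights[c]
            weights[c] = old + rng.uniform(-step, step)
            trial = score(weights, active, passive)
            if trial > best:
                best = trial
            else:
                weights[c] = old           # reject the move
    return weights, best

# Toy data (invented): lists of codes paired with log-solubility values.
active = [(["C", "C", "O"], -3.0), (["C", "Cl"], -2.0),
          (["c", "c", "Cl"], -1.0), (["c", "O"], -2.5)]
passive = [(["C", "O"], -2.8), (["c", "Cl"], -1.2), (["C", "c"], -2.0)]
weights, best = optimize(active, passive, n_epochs=20)
```

Replacing `tf0` with `tf1` inside `score` would require a routine for IIC and CII, which is left out here on purpose.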
Figure 1 shows the history of the optimization process for various options for the optimal descriptor and the objective function. A comparison of the results presented in Figure 1 indicates that the most promising option for obtaining the best predictive potential is the option where IIC, CII, and FLS are used (third Scheme). Table 8 contains the correlation weights for quasi-SMILES codes for the model (split 1).
Table 9 contains quasi-SMILES, split into active (A) and passive (P) training sets, calibration (C), and validation (V) sets, and the experimental and calculated values of fullerene C60 and C70 solubility in an organic solvent. Table 10 shows an example of the DCW(3,15) calculation.

4.4. The Applicability Domain

The applicability domain is considered in many studies devoted to QSPR/QSAR analysis [16]. The main question is, “Can the resulting model be applied to a given substance of interest?”. However, the counter-question is also logical. Is it not better to determine for which substances the model being developed is intended before developing it [17]? Can the model’s applicability domain change if one changes the distribution of available data into training and validation sets?
It should be noted that for the approach studied here, the applicability domain for different splits slightly changes.
The applicability domain for the described CORAL models is defined via the so-called statistical defects of the codes used in quasi-SMILES. These defects are calculated as follows:
$$d_k = \frac{\left| P(S_k) - P'(S_k) \right|}{N(S_k) + N'(S_k)} + \frac{\left| P(S_k) - P''(S_k) \right|}{N(S_k) + N''(S_k)} + \frac{\left| P'(S_k) - P''(S_k) \right|}{N'(S_k) + N''(S_k)} \tag{7}$$
where P(Sk), P′(Sk), and P″(Sk) are the probability of Sk in the active training set, passive training set, and calibration set, respectively; N(Sk), N′(Sk), and N″(Sk) are the frequencies of Sk in the active training set, passive training set, and calibration set, respectively. The statistical defects of quasi-SMILES (Dj) are calculated as follows:
$$D_j = \sum_{k=1}^{NA} d_k \tag{8}$$
where NA is the number of non-blocked codes in quasi-SMILES.
A quasi-SMILES falls in the applicability domain, if
$$D_j < 2 \times \overline{D} \tag{9}$$
where $\overline{D}$ is the average statistical defect for the active training set.
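The applicability-domain check can be sketched as follows. The sketch assumes that the frequency N of a code is the number of quasi-SMILES in a set that contain it, that the probability P is this frequency divided by the size of the set, and that a term with a zero denominator contributes zero. These conventions, and all function names, are assumptions introduced for illustration.

```python
from collections import Counter

def frequencies(dataset):
    """Number of quasi-SMILES (given as lists of codes) that contain each code."""
    return Counter(code for codes in dataset for code in set(codes))

def _term(p1, p2, n1, n2):
    # Assumption: a code absent from both sets contributes nothing to the defect.
    return abs(p1 - p2) / (n1 + n2) if (n1 + n2) > 0 else 0.0

def code_defects(active, passive, calibration):
    """Statistical defect d_k for every code seen in the active training set."""
    sets = (active, passive, calibration)
    freq = [frequencies(s) for s in sets]
    size = [len(s) for s in sets]
    defects = {}
    for code in freq[0]:
        n = [freq[i][code] for i in range(3)]
        p = [n[i] / size[i] for i in range(3)]
        defects[code] = (_term(p[0], p[1], n[0], n[1])
                         + _term(p[0], p[2], n[0], n[2])
                         + _term(p[1], p[2], n[1], n[2]))
    return defects

def quasi_smiles_defect(codes, defects):
    """D_j: sum of d_k over the codes of one quasi-SMILES; blocked codes add zero."""
    return sum(defects.get(c, 0.0) for c in codes)

def in_domain(codes, defects, active):
    mean_defect = sum(quasi_smiles_defect(c, defects) for c in active) / len(active)
    return quasi_smiles_defect(codes, defects) < 2 * mean_defect

# Toy sets (invented), each quasi-SMILES already split into codes.
active = [["C", "O"], ["C", "Cl"]]
passive = [["C", "O"], ["C", "N"]]
calibration = [["C", "Cl"], ["C", "Cl"]]
defects = code_defects(active, passive, calibration)
print(in_domain(["C", "O"], defects, active))
```

A code that is evenly represented in all three sets (here `C`) has a defect of zero, so quasi-SMILES built only from such codes always fall inside the domain.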

4.5. Mechanistic Interpretation

With the numerical data on the correlation weights of codes applied in quasi-SMILES, which was observed in several runs of the Monte Carlo optimization, one can extract three categories of these codes:
  • Codes that have a positive value of the correlation weight in all runs. These are promoters of endpoint increase;
  • Codes with a negative correlation weight value in all runs. These are promoters of endpoint decrease;
  • Codes with negative and positive correlation weight values in different optimization runs. These codes have unclear roles (one cannot classify these features as promoters of increase or decrease for endpoint).
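The three categories above can be assigned mechanically once the correlation weights from several optimization runs are available. The sketch below assumes each run is stored as a dictionary mapping a code to its correlation weight and considers only codes present in every run; the function name and the example weights are invented for illustration.

```python
def classify_codes(runs):
    """Label each code by the sign of its correlation weight across all runs."""
    common = set.intersection(*(set(run) for run in runs))
    labels = {}
    for code in sorted(common):
        values = [run[code] for run in runs]
        if all(v > 0 for v in values):
            labels[code] = "promoter of increase"
        elif all(v < 0 for v in values):
            labels[code] = "promoter of decrease"
        else:
            labels[code] = "unclear"
    return labels

# Invented weights from three hypothetical runs.
runs = [
    {"[C70]": 0.8, "O": -0.4, "Cl": 0.2},
    {"[C70]": 1.1, "O": -0.9, "Cl": -0.1},
    {"[C70]": 0.5, "O": -0.2, "Cl": 0.3},
]
labels = classify_codes(runs)
```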

4.6. System of Self-Consistent Models

The reliability of an approach can be assessed by the so-called system of self-consistent models [18,19]. The main idea of such a system is to test the performance of an approach on many random splits of the available data into training and validation subsets. This task can be represented by a matrix of determination coefficients related to applying the model built using split 1 to the validation set observed for split 2. Suppose some quasi-SMILES, which are allocated to the validation set of split 2, are present in the training or the calibration sets of split 1 at the same time. In that case, they may improve the statistical quality of model 1 for the split 2 validation set.
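$$
\begin{pmatrix}
R^2_{1,1} & R^2_{1,2} & \cdots & R^2_{1,10} \\
R^2_{2,1} & R^2_{2,2} & \cdots & R^2_{2,10} \\
\vdots & \vdots & \ddots & \vdots \\
R^2_{10,1} & R^2_{10,2} & \cdots & R^2_{10,10}
\end{pmatrix} \tag{10}
$$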
In order for the assessment of the statistical quality of model 1 for the validation set of split 2 to be adequate, it is necessary to remove the abovementioned quasi-SMILES from consideration. It can be expressed as the following:
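$$
\begin{pmatrix}
R^2_{1,1} & R^{*2}_{1,2} & \cdots & R^{*2}_{1,10} \\
R^{*2}_{2,1} & R^2_{2,2} & \cdots & R^{*2}_{2,10} \\
\vdots & \vdots & \ddots & \vdots \\
R^{*2}_{10,1} & R^{*2}_{10,2} & \cdots & R^2_{10,10}
\end{pmatrix} \tag{11}
$$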
Figure 4 indicates the essence of asterisks in the matrix (11). It is clear that the principles for selecting quasi-SMILES in the validation set of split 2 to assess the predictive potential of model 1 can be clearly translated for the arbitrary pairs of the i-th model vs. the j-th split (i ≠ j).
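The construction of the matrix with asterisks can be sketched as follows. The sketch assumes that each split is a dictionary with the keys `active`, `passive`, `calibration`, and `validation`, that each model is a callable returning a predicted endpoint for a quasi-SMILES, and that the determination coefficient is the squared Pearson correlation between observed and predicted values. Pooling all matrix entries into one mean and standard deviation is also an assumption. All names and the toy data are hypothetical.

```python
import math
import statistics

def r_squared(observed, predicted):
    """Squared Pearson correlation; NaN when it cannot be computed."""
    n = len(observed)
    if n < 2:
        return math.nan
    mo, mp = sum(observed) / n, sum(predicted) / n
    sop = sum((o - mo) * (p - mp) for o, p in zip(observed, predicted))
    soo = sum((o - mo) ** 2 for o in observed)
    spp = sum((p - mp) ** 2 for p in predicted)
    if soo == 0 or spp == 0:
        return math.nan
    return sop ** 2 / (soo * spp)

def external_subset(splits, i, j):
    """Validation items of split j that model i never saw while being built."""
    used = (set(splits[i]["active"]) | set(splits[i]["passive"])
            | set(splits[i]["calibration"]))
    return [q for q in splits[j]["validation"] if q not in used]

def consistency_matrix(splits, models, observed):
    """Entry [i][j]: model i evaluated on the cleaned validation set of split j."""
    matrix = []
    for i in range(len(splits)):
        row = []
        for j in range(len(splits)):
            subset = external_subset(splits, i, j)
            obs = [observed[q] for q in subset]
            pred = [models[i](q) for q in subset]
            row.append(r_squared(obs, pred))
        matrix.append(row)
    return matrix

def summarize(matrix):
    values = [v for row in matrix for v in row if not math.isnan(v)]
    return statistics.mean(values), statistics.stdev(values)

# Toy example (invented): six items, two splits, two linear "models".
observed = {"a": 1.0, "b": 2.0, "c": 3.0, "d": 4.0, "e": 5.0, "f": 6.0}
splits = [
    {"active": ["a"], "passive": ["b"], "calibration": ["c"], "validation": ["d", "e", "f"]},
    {"active": ["d"], "passive": ["a"], "calibration": ["b"], "validation": ["c", "e", "f"]},
]
models = [lambda q: 2 * observed[q], lambda q: -observed[q]]
matrix = consistency_matrix(splits, models, observed)
```

For the diagonal entries nothing is removed, because the validation set of a split is disjoint from its own training and calibration sets.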

4.7. Comparison with Other Models

The unusual influence of IIC and CII on the simulation process, which improves the statistical quality of the model for the calibration set, makes it tempting to compare different models in terms of their quality for the external validation set. Table 11 contains the comparisons of models for the solubility of fullerenes in various solvents.

5. Conclusions

A model observed for a single distribution of the available data into a training and a validation set can be either too good or too bad. It is preferable to consider a set of models built on sufficiently diverse distributions of the available data into the training and validation sets to obtain reliable information about the suitability of the chosen approach. The two-component vector of the ideality of correlation, based on the use of the IIC and CII in the Monte Carlo optimization, improves the predictive potential of the model. However, the paradoxical effect of this vector is to reduce the determination coefficient values for the active and passive training sets. Nevertheless, if the main aim of the simulation is to obtain a satisfactory prediction for the external validation set, then the effect of the vector of the ideality of correlation, which improves the statistical quality of a model for the external validation set even to the detriment of the training set, is useful rather than adverse. Using the proposed fragments of local symmetry (FLS) significantly improves the predictive potential of the solubility model of fullerenes C60 and C70 in organic solvents. It would be wrong to claim that FLS is related to traditional classical symmetry. However, it is clear that FLS contains some information that can improve the statistical quality and possibly the interpretability of the models. QSPR and nano-QSPR are random events since the appearance of new experimental data may challenge the already created models. Therefore, each model should be considered useful only temporarily, and one should be prepared for the need for its radical alteration. The quasi-SMILES technique offers the possibility of the fast modification of models taking into account new conditions/circumstances.

Author Contributions

Conceptualization, A.A.T., A.P.T. and N.F.; Data curation, A.A.T., A.P.T. and N.F.; Writing—original draft, A.A.T., A.P.T. and N.F.; Review and editing, A.A.T., A.P.T. and N.F. All authors have read and agreed to the published version of the manuscript.

Funding

A.P.T. and A.A.T. acknowledge EFSA for the financial contribution to the project sOFT-ERA, OC/EFSA/IDATA/2022/02.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

Technical details on five models are available on request.

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. Wiener, H. Correlation of heats of isomerization, and differences in heats of vaporization of isomers, among the paraffin hydrocarbons. J. Am. Chem. Soc. 1947, 69, 2636–2638.
  2. Wiener, H. Structural Determination of paraffin boiling points. J. Am. Chem. Soc. 1947, 69, 17–20.
  3. Wiener, H. Influence of interatomic forces on paraffin properties. J. Chem. Phys. 1947, 15, 766.
  4. Venkatapathy, R.; Wang, C.Y.; Bruce, R.M.; Moudgal, C. Development of quantitative structure-activity relationship (QSAR) models to predict the carcinogenic potency of chemicals. I. Alternative toxicity measures as an estimator of carcinogenic potency. Toxicol. Appl. Pharmacol. 2009, 234, 209–221.
  5. Harris, P.J.F. Fullerene-related structure of commercial glassy carbons. Philos. Mag. 2004, 84, 3159–3167.
  6. Olmstead, M.M.; Costa, D.A.; Maitra, K.; Noll, B.C.; Phillips, S.L.; Van Calcar, P.M.; Balch, A.L. Interaction of curved and flat molecular surfaces. The structures of crystalline compounds composed of fullerene (C60, C60O, C70, and C120O) and metal octaethylporphyrin units. J. Am. Chem. Soc. 1999, 121, 7090–7097.
  7. Johnson, R.D.; Bethune, D.S.; Yannoni, C.S. Fullerene structure and dynamics: A magnetic resonance potpourri. Acc. Chem. Res. 1992, 25, 169–175.
  8. Liu, X.; Schmalz, T.G.; Klein, D.J. Favorable structures for higher fullerenes. Chem. Phys. Lett. 1992, 188, 550–554.
  9. Rao, C.N.R.; Seshadri, R.; Govindaraj, A.; Sen, R. Fullerenes, nanotubes, onions and related carbon structures. Mater. Sci. Eng. R Rep. 1995, 15, 209–262.
  10. Dinadayalane, T.C.; Leszczynski, J. Remarkable diversity of carbon-carbon bonds: Structures and properties of fullerenes, carbon nanotubes, and graphene. Struct. Chem. 2010, 21, 1155–1169.
  11. Mikheev, I.V.; Verkhovskii, V.A.; Byvsheva, S.M.; Volkov, D.S.; Proskurnin, M.A.; Ivanov, V.K. Simultaneous quantification of fullerenes C60 and C70 in organic solvents by excitation–emission matrix fluorescence spectroscopy. Inorganics 2023, 11, 136.
  12. Gupta, S.; Basant, N. Predictive modeling: Solubility of C60 and C70 fullerenes in diverse solvents. Chemosphere 2018, 201, 361–369.
  13. Prana, V.; Fayet, G.; Rotureau, P.; Adamo, C. Development of validated QSPR models for impact sensitivity of nitroaliphatic compounds. J. Hazard. Mater. 2012, 235–236, 169–177.
  14. Toropova, A.P.; Toropov, A.A.; Fjodorova, N. Quasi-SMILES for predicting toxicity of Nano-mixtures to Daphnia Magna. NanoImpact 2022, 28, 100427.
  15. Toropov, A.A.; Toropova, A.P. Correlation intensity index: Building up models for mutagenicity of silver nanoparticles. Sci. Total Environ. 2020, 737, 139720.
  16. Tropsha, A.; Golbraikh, A. Predictive QSAR modeling workflow, model applicability domains, and virtual screening. Curr. Pharm. Des. 2007, 13, 3494–3504.
  17. Toropov, A.A.; Toropova, A.P.; Benfenati, E. Additive SMILES-based carcinogenicity models: Probabilistic principles in the search for robust predictions. Int. J. Mol. Sci. 2009, 10, 3106–3127.
  18. Toropov, A.A.; Toropova, A.P. The system of self-consistent models for the uptake of nanoparticles in PaCa2 cancer cells. Nanotoxicology 2021, 15, 995–1004.
  19. Toropova, A.P.; Toropov, A.A. The system of self-consistent models: A new approach to build up and validation of predictive models of the octanol/water partition coefficient for gold nanoparticles. Int. J. Environ. Res. 2021, 15, 709–722.
  20. Toropov, A.A.; Toropova, A.P.; Roncaglioni, A.; Benfenati, E. The system of self-consistent models for pesticide toxicity to Daphnia magna. Toxicol. Mech. Methods, 2023; in press.
  21. Toropov, A.A.; Toropova, A.P.; Achary, P.G.R.; Raškova, M.; Raška, I. The searching for agents for Alzheimer’s disease treatment via the system of self-consistent models. Toxicol. Mech. Methods 2022, 32, 549–557.
  22. Toropova, A.P.; Toropov, A.A. (Eds.) QSPR/QSAR Analysis Using SMILES and Quasi-SMILES. Challenges and Advances in Computational Chemistry and Physics; Springer: Cham, Switzerland, 2023; Volume 33, pp. 1–467.
  23. Petrova, T.; Rasulev, B.F.; Toropov, A.A.; Leszczynska, D.; Leszczynski, J. Improved model for fullerene C60 solubility in organic solvents based on quantumchemical and topological descriptors. J. Nanopart. Res. 2011, 13, 3235–3247.
  24. Ghasemi, J.B.; Salahinejad, M.; Rofouei, M.K. Alignment independent 3DQSAR modeling of fullerene (C60) solubility in different organic solvents. Fuller. Nanotub. Carbon Nanostruct. 2013, 21, 367–380.
  25. Cheng, W.-D.; Cai, C.-Z. Accurate model to predict the solubility of fullerene C60 in organic solvents by using support vector regression. Fuller. Nanotub. Carbon Nanostruct. 2017, 25, 58–64.
  26. Toropova, A.P.; Toropov, A.A. QSPR and nano-QSPR: What is the difference? J. Mol. Struct. 2019, 1182, 141–149. [Google Scholar] [CrossRef]
  27. Roy, J.K.; Kar, S.; Leszczynski, J. Optoelectronic properties of C60 and C70 fullerene derivatives: Designing and evaluating novel candidates for efficient P3HT polymer solar cells. Materials 2019, 12, 2282. [Google Scholar] [CrossRef] [PubMed]
  28. Kar, S.; Sizochenko, N.; Ahmed, L.; Batista, V.S.; Leszczynski, J. Quantitative structure-property relationship model leading to virtual screening of fullerene derivatives: Exploring structural attributes critical for photoconversion efficiency of polymer solar cell acceptors. Nano Energy 2016, 26, 677–691. [Google Scholar] [CrossRef]
  29. Pourbasheer, E.; Aalizadeh, R.; Ardabili, J.S.; Ganjali, M.R. QSPR study on solubility of some fullerenes derivatives using the genetic algorithms-Multiple linear regression. J. Mol. Liq. 2015, 204, 162–169. [Google Scholar] [CrossRef]
Figure 1. The histories of the Monte Carlo optimization using different target functions.
Figure 2. R2A refers to all the points in the “experiment versus calculation” coordinates; R2 “without A” refers to the points of the upper and lower clusters. For the calibration and validation sets, the corresponding values are marked with black or uncoloured triangles (bottom of the figure).
Figure 3. The comparison of the predictive potential of models 1–10 in the cases of applying the second and the third schemes.
Figure 4. Demonstration of the scheme for assessing a model: let 100 quasi-SMILES be distributed into active training (A1), passive training (P1), and calibration (C1) sets, which are used to build model 1. The subset of the validation set of split 2, denoted as V2*, is used to assess the predictive potential of model 1. One can see that, instead of 26 quasi-SMILES (Nv2), only 6 are involved in the assessment.
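The assessment scheme of Figure 4 can be illustrated with plain set operations. This is a minimal sketch, assuming that V2* consists of those members of the validation set of split 2 that were not used (as A1, P1, or C1) to build model 1. The function name `assessment_subset`, the integer identifiers standing in for quasi-SMILES, and the random splits are hypothetical illustrations, not the authors' software.

```python
import random


def assessment_subset(validation_other_split, active, passive, calibration):
    """Keep only the validation items that the model has never seen.

    Assumption: an item is 'seen' if it belongs to the active training,
    passive training, or calibration set used to build the model.
    """
    used = set(active) | set(passive) | set(calibration)
    return [item for item in validation_other_split if item not in used]


random.seed(0)
ids = list(range(100))  # stand-ins for 100 quasi-SMILES

# Split 1: one random permutation cut into A1, P1, C1, V1.
shuffled = random.sample(ids, k=len(ids))
a1, p1, c1, v1 = shuffled[:25], shuffled[25:50], shuffled[50:74], shuffled[74:]

# Split 2: only its validation set (26 items) matters for this illustration.
v2 = random.sample(ids, k=26)

v2_star = assessment_subset(v2, a1, p1, c1)
print(f"Nv2 = {len(v2)}, Nv2* = {len(v2_star)}")
```

Under this assumption V2* is simply the overlap of V2 with V1, which is why it is much smaller than V2 itself.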
Table 1. Statistical characteristics of the models built without IIC and CII but with the correlation weights of FLS (first scheme). The average determination coefficient on the validation sets for the ten models is 0.7742 ± 0.0713.
Split | Set * | n | R2 | CCC | IIC | CII | Q2 | RMSE | MAE | F
1 | A | 52 | 0.8276 | 0.9056 | 0.7797 | 0.9089 | 0.8119 | 0.716 | 0.586 | 240
1 | P | 53 | 0.7569 | 0.8687 | 0.8561 | 0.8812 | 0.7362 | 0.630 | 0.492 | 159
1 | C | 51 | 0.7428 | 0.8020 | 0.4311 | 0.8367 | 0.7231 | 0.859 | 0.667 | 142
1 | V | 50 | 0.7406 | - | - | - | - | 0.794 | - | -
2 | A | 51 | 0.7236 | 0.8396 | 0.8179 | 0.8495 | 0.6967 | 0.879 | 0.718 | 128
2 | P | 51 | 0.7216 | 0.8153 | 0.6959 | 0.8431 | 0.6794 | 0.790 | 0.621 | 127
2 | C | 51 | 0.8579 | 0.9084 | 0.9109 | 0.9175 | 0.8395 | 0.575 | 0.437 | 296
2 | V | 53 | 0.8348 | - | - | - | - | 0.618 | - | -
3 | A | 52 | 0.8036 | 0.8911 | 0.8301 | 0.8830 | 0.7845 | 0.654 | 0.498 | 205
3 | P | 50 | 0.6151 | 0.7085 | 0.7008 | 0.7917 | 0.5817 | 0.983 | 0.800 | 77
3 | C | 52 | 0.8910 | 0.9250 | 0.4079 | 0.9232 | 0.8839 | 0.460 | 0.334 | 409
3 | V | 52 | 0.8762 | - | - | - | - | 0.458 | - | -
4 | A | 50 | 0.7992 | 0.8884 | 0.7615 | 0.8647 | 0.7820 | 0.715 | 0.504 | 191
4 | P | 53 | 0.7990 | 0.8811 | 0.7767 | 0.8669 | 0.7838 | 0.724 | 0.557 | 203
4 | C | 50 | 0.8139 | 0.8192 | 0.4604 | 0.8856 | 0.7945 | 0.734 | 0.547 | 210
4 | V | 53 | 0.8328 | - | - | - | - | 0.645 | - | -
5 | A | 50 | 0.7798 | 0.8762 | 0.8151 | 0.8745 | 0.7592 | 0.715 | 0.575 | 170
5 | P | 52 | 0.7798 | 0.8478 | 0.5541 | 0.8771 | 0.7549 | 0.889 | 0.686 | 177
5 | C | 52 | 0.8522 | 0.8985 | 0.9073 | 0.9087 | 0.8405 | 0.524 | 0.381 | 288
5 | V | 52 | 0.8411 | - | - | - | - | 0.585 | - | -
6 | A | 50 | 0.8871 | 0.9402 | 0.8694 | 0.9299 | 0.8761 | 0.517 | 0.403 | 377
6 | P | 50 | 0.8862 | 0.9293 | 0.6925 | 0.9328 | 0.8760 | 0.614 | 0.503 | 374
6 | C | 53 | 0.6340 | 0.6938 | 0.7221 | 0.8138 | 0.5901 | 1.04 | 0.770 | 88
6 | V | 53 | 0.6686 | - | - | - | - | 1.182 | - | -
7 | A | 50 | 0.8230 | 0.9029 | 0.8374 | 0.8807 | 0.8022 | 0.657 | 0.480 | 223
7 | P | 53 | 0.8076 | 0.8959 | 0.8748 | 0.8847 | 0.7896 | 0.678 | 0.530 | 214
7 | C | 50 | 0.8283 | 0.8610 | 0.7257 | 0.8999 | 0.8110 | 0.684 | 0.544 | 232
7 | V | 53 | 0.6583 | - | - | - | - | 0.760 | - | -
8 | A | 51 | 0.7471 | 0.8553 | 0.6557 | 0.8584 | 0.7253 | 0.799 | 0.594 | 145
8 | P | 52 | 0.7199 | 0.8250 | 0.7774 | 0.8430 | 0.6965 | 0.836 | 0.613 | 129
8 | C | 51 | 0.8100 | 0.8941 | 0.8478 | 0.8945 | 0.7947 | 0.497 | 0.416 | 209
8 | V | 52 | 0.7854 | - | - | - | - | 0.583 | - | -
9 | A | 50 | 0.8187 | 0.9003 | 0.7109 | 0.8917 | 0.8003 | 0.717 | 0.541 | 217
9 | P | 53 | 0.8068 | 0.8897 | 0.8728 | 0.8839 | 0.7897 | 0.730 | 0.589 | 213
9 | C | 53 | 0.6486 | 0.7302 | 0.7753 | 0.7840 | 0.6130 | 0.917 | 0.678 | 94
9 | V | 50 | 0.7649 | - | - | - | - | 0.832 | - | -
10 | A | 53 | 0.7235 | 0.8396 | 0.7039 | 0.8572 | 0.6981 | 0.875 | 0.708 | 133
10 | P | 51 | 0.6071 | 0.7545 | 0.5784 | 0.7947 | 0.5760 | 0.841 | 0.629 | 76
10 | C | 51 | 0.7786 | 0.8462 | 0.6447 | 0.8545 | 0.7643 | 0.628 | 0.486 | 172
10 | V | 51 | 0.7690 | - | - | - | - | 0.658 | - | -
* A, P, C, and V denote active training, passive training, calibration, and validation sets, respectively.
Table 2. Statistical characteristics of the models obtained using IIC and CII without the correlation weights of fragments of local symmetry (second scheme). The average determination coefficient on the validation sets for the ten models is 0.8832 ± 0.0273.
Split | Set * | n | R2 | CCC | IIC | CII | Q2 | RMSE | MAE | F
1 | A | 52 | 0.6643 | 0.7983 | 0.6986 | 0.8428 | 0.6323 | 0.984 | 0.822 | 99
1 | P | 53 | 0.5997 | 0.7187 | 0.6328 | 0.7861 | 0.5688 | 0.958 | 0.805 | 76
1 | C | 51 | 0.9023 | 0.9498 | 0.9496 | 0.9437 | 0.8942 | 0.365 | 0.293 | 453
1 | V | 50 | 0.9075 | - | - | - | - | 0.342 | - | -
2 | A | 51 | 0.6622 | 0.7968 | 0.7825 | 0.8162 | 0.6314 | 0.972 | 0.824 | 96
2 | P | 51 | 0.5412 | 0.6758 | 0.3856 | 0.8273 | 0.4867 | 1.09 | 0.932 | 58
2 | C | 51 | 0.9128 | 0.9445 | 0.9548 | 0.9498 | 0.8974 | 0.435 | 0.327 | 513
2 | V | 53 | 0.9185 | - | - | - | - | 0.416 | - | -
3 | A | 52 | 0.6182 | 0.7641 | 0.7863 | 0.8035 | 0.5877 | 0.912 | 0.748 | 81
3 | P | 50 | 0.4222 | 0.5886 | 0.5358 | 0.7507 | 0.3527 | 1.17 | 0.997 | 35
3 | C | 52 | 0.9144 | 0.9398 | 0.9544 | 0.9451 | 0.9080 | 0.357 | 0.285 | 534
3 | V | 52 | 0.8822 | - | - | - | - | 0.399 | - | -
4 | A | 50 | 0.5512 | 0.7107 | 0.6853 | 0.7685 | 0.4975 | 1.07 | 0.830 | 59
4 | P | 53 | 0.6112 | 0.6850 | 0.6404 | 0.7733 | 0.5791 | 1.04 | 0.879 | 80
4 | C | 50 | 0.7550 | 0.8602 | 0.8688 | 0.8762 | 0.7195 | 0.521 | 0.384 | 148
4 | V | 53 | 0.8491 | - | - | - | - | 0.411 | - | -
5 | A | 50 | 0.5951 | 0.7462 | 0.7714 | 0.8024 | 0.5615 | 0.970 | 0.809 | 71
5 | P | 52 | 0.5972 | 0.7572 | 0.6266 | 0.7966 | 0.5627 | 1.02 | 0.817 | 74
5 | C | 52 | 0.8523 | 0.9209 | 0.9202 | 0.9277 | 0.8311 | 0.395 | 0.320 | 288
5 | V | 52 | 0.8816 | - | - | - | - | 0.404 | - | -
6 | A | 50 | 0.3899 | 0.5611 | 0.4522 | 0.7307 | 0.3218 | 1.20 | 0.966 | 31
6 | P | 50 | 0.6585 | 0.6472 | 0.6957 | 0.8256 | 0.6253 | 1.14 | 0.987 | 93
6 | C | 53 | 0.7680 | 0.8347 | 0.8512 | 0.8634 | 0.6777 | 0.491 | 0.392 | 169
6 | V | 53 | 0.8654 | - | - | - | - | 0.387 | - | -
7 | A | 50 | 0.6543 | 0.7910 | 0.6890 | 0.8271 | 0.6215 | 0.918 | 0.759 | 91
7 | P | 53 | 0.5105 | 0.7100 | 0.6265 | 0.7749 | 0.4656 | 1.11 | 0.916 | 53
7 | C | 50 | 0.8562 | 0.9250 | 0.9244 | 0.9157 | 0.8442 | 0.418 | 0.326 | 286
7 | V | 53 | 0.8619 | - | - | - | - | 0.389 | - | -
8 | A | 51 | 0.6199 | 0.7654 | 0.7571 | 0.8057 | 0.5836 | 0.980 | 0.788 | 80
8 | P | 52 | 0.5475 | 0.6955 | 0.6491 | 0.7697 | 0.5116 | 1.05 | 0.843 | 60
8 | C | 51 | 0.8821 | 0.9381 | 0.9370 | 0.9386 | 0.8693 | 0.353 | 0.288 | 367
8 | V | 52 | 0.8455 | - | - | - | - | 0.441 | - | -
9 | A | 50 | 0.5886 | 0.7410 | 0.5556 | 0.7920 | 0.5543 | 1.08 | 0.927 | 69
9 | P | 53 | 0.5540 | 0.7425 | 0.7153 | 0.7762 | 0.5145 | 1.05 | 0.871 | 63
9 | C | 53 | 0.8186 | 0.9038 | 0.9041 | 0.9014 | 0.8035 | 0.418 | 0.333 | 230
9 | V | 50 | 0.8894 | - | - | - | - | 0.370 | - | -
10 | A | 53 | 0.6650 | 0.7988 | 0.6252 | 0.8247 | 0.6322 | 0.963 | 0.768 | 101
10 | P | 51 | 0.4433 | 0.6448 | 0.6243 | 0.7746 | 0.4035 | 0.995 | 0.807 | 39
10 | C | 51 | 0.8538 | 0.9129 | 0.9237 | 0.9243 | 0.8407 | 0.463 | 0.363 | 286
10 | V | 51 | 0.9306 | - | - | - | - | 0.368 | - | -
* A, P, C, and V denote active training, passive training, calibration, and validation sets, respectively.
Table 3. Statistical characteristics of the models obtained using IIC, CII, and the correlation weights of FLS (third scheme). The average determination coefficient on the validation sets for the ten models is 0.9170 ± 0.0117.
Split | Set * | n | R2 | CCC | IIC | CII | Q2 | RMSE | MAE | F
1 | A | 52 | 0.5733 | 0.7288 | 0.7571 | 0.8361 | 0.5278 | 1.13 | 0.934 | 67
1 | P | 53 | 0.4732 | 0.6379 | 0.4299 | 0.7750 | 0.4355 | 1.01 | 0.803 | 46
1 | C | 51 | 0.9179 | 0.9456 | 0.9579 | 0.9566 | 0.9096 | 0.355 | 0.283 | 548
1 | V | 50 | 0.9365 | - | - | - | - | 0.306 | - | -
2 | A | 51 | 0.7079 | 0.8289 | 0.7479 | 0.8333 | 0.6823 | 0.904 | 0.743 | 119
2 | P | 51 | 0.5492 | 0.6832 | 0.3766 | 0.8277 | 0.4973 | 1.07 | 0.895 | 60
2 | C | 51 | 0.8844 | 0.9319 | 0.9402 | 0.9490 | 0.8711 | 0.472 | 0.387 | 375
2 | V | 53 | 0.9115 | - | - | - | - | 0.447 | - | -
3 | A | 52 | 0.8214 | 0.9020 | 0.7769 | 0.8947 | 0.8040 | 0.624 | 0.492 | 230
3 | P | 50 | 0.5127 | 0.6948 | 0.6822 | 0.7602 | 0.4599 | 1.08 | 0.887 | 51
3 | C | 52 | 0.9139 | 0.9460 | 0.9555 | 0.9456 | 0.9069 | 0.393 | 0.320 | 531
3 | V | 52 | 0.9108 | - | - | - | - | 0.408 | - | -
4 | A | 50 | 0.7237 | 0.8397 | 0.8507 | 0.8341 | 0.6961 | 0.838 | 0.632 | 126
4 | P | 53 | 0.7083 | 0.8106 | 0.6961 | 0.8228 | 0.6859 | 0.887 | 0.684 | 124
4 | C | 50 | 0.8863 | 0.8999 | 0.9380 | 0.9355 | 0.8765 | 0.498 | 0.401 | 374
4 | V | 53 | 0.9056 | - | - | - | - | 0.360 | - | -
5 | A | 50 | 0.6889 | 0.8158 | 0.8300 | 0.8392 | 0.6589 | 0.850 | 0.719 | 106
5 | P | 52 | 0.7014 | 0.8092 | 0.6337 | 0.8391 | 0.6740 | 0.981 | 0.823 | 117
5 | C | 52 | 0.9383 | 0.9623 | 0.9681 | 0.9623 | 0.9316 | 0.298 | 0.228 | 761
5 | V | 52 | 0.9322 | - | - | - | - | 0.360 | - | -
6 | A | 50 | 0.5614 | 0.7191 | 0.7492 | 0.8023 | 0.5244 | 1.02 | 0.850 | 61
6 | P | 50 | 0.7684 | 0.7680 | 0.8519 | 0.8535 | 0.7503 | 0.955 | 0.828 | 159
6 | C | 53 | 0.8415 | 0.9132 | 0.9165 | 0.8969 | 0.8292 | 0.416 | 0.337 | 271
6 | V | 53 | 0.9160 | - | - | - | - | 0.318 | - | -
7 | A | 50 | 0.7424 | 0.8522 | 0.7954 | 0.8554 | 0.7181 | 0.792 | 0.644 | 138
7 | P | 53 | 0.6792 | 0.8182 | 0.7000 | 0.8529 | 0.6447 | 0.882 | 0.725 | 108
7 | C | 50 | 0.8920 | 0.9362 | 0.9425 | 0.9259 | 0.8840 | 0.410 | 0.327 | 397
7 | V | 53 | 0.9072 | - | - | - | - | 0.357 | - | -
8 | A | 51 | 0.6460 | 0.7849 | 0.7144 | 0.8109 | 0.6122 | 0.945 | 0.755 | 89
8 | P | 52 | 0.6078 | 0.7413 | 0.7690 | 0.7930 | 0.5763 | 0.984 | 0.811 | 77
8 | C | 51 | 0.8807 | 0.9369 | 0.9368 | 0.9400 | 0.8644 | 0.371 | 0.291 | 362
8 | V | 52 | 0.9041 | - | - | - | - | 0.324 | - | -
9 | A | 50 | 0.7199 | 0.8372 | 0.6144 | 0.8507 | 0.6955 | 0.891 | 0.709 | 123
9 | P | 53 | 0.6734 | 0.8159 | 0.5683 | 0.8307 | 0.6442 | 0.902 | 0.748 | 105
9 | C | 53 | 0.8874 | 0.9304 | 0.9406 | 0.9293 | 0.8794 | 0.385 | 0.320 | 402
9 | V | 50 | 0.9337 | - | - | - | - | 0.396 | - | -
10 | A | 53 | 0.7013 | 0.8244 | 0.7477 | 0.8629 | 0.6707 | 0.909 | 0.749 | 120
10 | P | 51 | 0.4248 | 0.6333 | 0.4521 | 0.7334 | 0.3807 | 1.07 | 0.836 | 36
10 | C | 51 | 0.8766 | 0.9237 | 0.9342 | 0.9362 | 0.8653 | 0.429 | 0.345 | 348
10 | V | 51 | 0.9122 | - | - | - | - | 0.403 | - | -
* A, P, C, and V denote active training, passive training, calibration, and validation sets, respectively.
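The summary value quoted in the caption of Table 3 can be checked directly from the ten validation-set rows. The sketch below assumes that the "±" term is the population standard deviation over the ten splits; the caption does not say which dispersion measure is used, so this is an assumption that happens to reproduce the quoted 0.9170 ± 0.0117. The variable names are illustrative.

```python
from statistics import fmean, pstdev

# Determination coefficients of the validation sets (rows "V" of Table 3),
# splits 1 to 10, third scheme.
r2_validation = [
    0.9365, 0.9115, 0.9108, 0.9056, 0.9322,
    0.9160, 0.9072, 0.9041, 0.9337, 0.9122,
]

mean_r2 = fmean(r2_validation)
# Assumption: dispersion = population standard deviation (divide by n, not n - 1).
dispersion_r2 = pstdev(r2_validation)

print(f"{mean_r2:.4f} ± {dispersion_r2:.4f}")
```

Using the sample standard deviation (`statistics.stdev`) instead would give a slightly larger spread, so the choice matters when comparing schemes by their dispersion.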
Table 4. System of self-consistent models built without the correlation weights of FLS.
Model * | Quantity | s1 | s2 | s3 | s4 | s5 | s6 | s7 | s8 | s9 | s10 | Average | Dispersion
m1 | Nv* | - | 12 | 17 | 21 | 19 | 19 | 19 | 23 | 21 | 22 | 19.2 | 3.1
m1 | R2v* | - | 0.8818 | 0.8608 | 0.9199 | 0.9561 | 0.9642 | 0.8657 | 0.8844 | 0.9447 | 0.9145 | 0.9102 | 0.0369
m2 | Nv* | 12 | - | 16 | 22 | 23 | 20 | 23 | 20 | 21 | 18 | 19.4 | 3.4
m2 | R2v* | 0.9473 | - | 0.9024 | 0.9474 | 0.9656 | 0.8744 | 0.8915 | 0.9470 | 0.9494 | 0.9414 | 0.9296 | 0.0298
m3 | Nv* | 17 | 16 | - | 19 | 17 | 16 | 20 | 19 | 20 | 19 | 18.1 | 1.5
m3 | R2v* | 0.8358 | 0.8952 | - | 0.8289 | 0.8508 | 0.9573 | 0.8919 | 0.9064 | 0.9349 | 0.9239 | 0.8917 | 0.0424
m4 | Nv* | 21 | 22 | 19 | - | 27 | 21 | 21 | 22 | 22 | 25 | 22.2 | 2.3
m4 | R2v* | 0.7574 | 0.8989 | 0.7975 | - | 0.8782 | 0.8856 | 0.7983 | 0.8812 | 0.8742 | 0.8730 | 0.8493 | 0.0478
m5 | Nv* | 19 | 23 | 17 | 27 | - | 18 | 20 | 19 | 17 | 16 | 19.6 | 3.3
m5 | R2v* | 0.8055 | 0.9305 | 0.9025 | 0.9158 | - | 0.9459 | 0.9295 | 0.9191 | 0.9483 | 0.9617 | 0.9176 | 0.0432
m6 | Nv* | 19 | 20 | 16 | 21 | 18 | - | 23 | 23 | 28 | 22 | 21.1 | 3.3
m6 | R2v* | 0.8914 | 0.7225 | 0.8831 | 0.7803 | 0.8665 | - | 0.8635 | 0.9043 | 0.8815 | 0.9276 | 0.8577 | 0.0264
m7 | Nv* | 19 | 23 | 20 | 21 | 20 | 23 | - | 20 | 26 | 24 | 21.8 | 2.2
m7 | R2v* | 0.8910 | 0.8761 | 0.8585 | 0.8896 | 0.9050 | 0.9094 | - | 0.9390 | 0.9108 | 0.9194 | 0.8999 | 0.0226
m8 | Nv* | 23 | 20 | 19 | 22 | 19 | 23 | 20 | - | 24 | 18 | 20.9 | 2.0
m8 | R2v* | 0.8662 | 0.9003 | 0.9158 | 0.9358 | 0.8875 | 0.9022 | 0.8981 | - | 0.9171 | 0.9099 | 0.9037 | 0.0186
m9 | Nv* | 21 | 21 | 20 | 22 | 17 | 28 | 26 | 24 | - | 24 | 22.6 | 3.1
m9 | R2v* | 0.8469 | 0.9132 | 0.8995 | 0.8551 | 0.9176 | 0.9000 | 0.8932 | 0.9033 | - | 0.9015 | 0.8923 | 0.0232
m10 | Nv* | 22 | 18 | 19 | 25 | 16 | 22 | 24 | 18 | 24 | - | 20.9 | 3.0
m10 | R2v* | 0.9130 | 0.9226 | 0.9457 | 0.9273 | 0.9454 | 0.9498 | 0.9159 | 0.9614 | 0.9481 | - | 0.9366 | 0.0162
* m1–m10 denote the models from 1 to 10; s1–s10 denote the splits from 1 to 10; Average and Dispersion are the average value and the dispersion of Nv* or R2v* in the corresponding row.
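Table 5. System of self-consistent models built using the correlation weights of FLS.
Model * | Quantity | s1 | s2 | s3 | s4 | s5 | s6 | s7 | s8 | s9 | s10 | Average | Dispersion
m1 | Nv* | - | 12 | 17 | 21 | 19 | 19 | 19 | 23 | 21 | 22 | 19.2 | 3.1
m1 | R2v* | - | 0.9481 | 0.9025 | 0.9381 | 0.9539 | 0.9619 | 0.9384 | 0.9491 | 0.9483 | 0.9291 | 0.9410 | 0.0163
m2 | Nv* | 12 | - | 16 | 22 | 23 | 20 | 23 | 20 | 21 | 18 | 19.4 | 3.4
m2 | R2v* | 0.9176 | - | 0.9380 | 0.9414 | 0.9520 | 0.8814 | 0.8853 | 0.9482 | 0.9635 | 0.9168 | 0.9271 | 0.0273
m3 | Nv* | 17 | 16 | - | 19 | 17 | 16 | 20 | 19 | 20 | 19 | 18.1 | 1.5
m3 | R2v* | 0.8514 | 0.9264 | - | 0.9147 | 0.9196 | 0.9461 | 0.8722 | 0.9098 | 0.9123 | 0.9587 | 0.9123 | 0.0314
m4 | Nv* | 21 | 22 | 19 | - | 27 | 21 | 21 | 22 | 22 | 25 | 22.2 | 2.3
m4 | R2v* | 0.8856 | 0.9089 | 0.9185 | - | 0.9190 | 0.8723 | 0.8395 | 0.9201 | 0.9369 | 0.9325 | 0.9037 | 0.0300
m5 | Nv* | 19 | 23 | 17 | 27 | - | 18 | 20 | 19 | 17 | 16 | 19.6 | 3.3
m5 | R2v* | 0.9386 | 0.9539 | 0.9325 | 0.9397 | - | 0.9144 | 0.9501 | 0.9501 | 0.9492 | 0.9654 | 0.9438 | 0.0138
m6 | Nv* | 19 | 20 | 16 | 21 | 18 | - | 23 | 23 | 28 | 22 | 21.1 | 3.3
m6 | R2v* | 0.9619 | 0.8626 | 0.9248 | 0.9009 | 0.9391 | - | 0.9051 | 0.9227 | 0.9227 | 0.9349 | 0.9194 | 0.0264
m7 | Nv* | 19 | 23 | 20 | 21 | 20 | 23 | - | 20 | 26 | 24 | 21.8 | 2.2
m7 | R2v* | 0.8412 | 0.9526 | 0.8420 | 0.8539 | 0.9322 | 0.9185 | - | 0.9390 | 0.8842 | 0.9074 | 0.8968 | 0.0406
m8 | Nv* | 23 | 20 | 19 | 22 | 19 | 23 | 20 | - | 24 | 18 | 20.9 | 2.0
m8 | R2v* | 0.9117 | 0.9259 | 0.9402 | 0.9468 | 0.9780 | 0.9530 | 0.9364 | - | 0.9731 | 0.9457 | 0.9456 | 0.0198
m9 | Nv* | 21 | 21 | 20 | 22 | 17 | 28 | 26 | 24 | - | 24 | 22.6 | 3.1
m9 | R2v* | 0.9250 | 0.9553 | 0.9367 | 0.9450 | 0.9522 | 0.9290 | 0.9136 | 0.9467 | - | 0.9463 | 0.9389 | 0.0131
m10 | Nv* | 22 | 18 | 19 | 25 | 16 | 22 | 24 | 18 | 24 | - | 20.9 | 3.0
m10 | R2v* | 0.9136 | 0.9302 | 0.9430 | 0.9418 | 0.9288 | 0.8891 | 0.9137 | 0.9544 | 0.9201 | - | 0.9261 | 0.0185
* m1–m10 denote the models from 1 to 10; s1–s10 denote the splits from 1 to 10; Average and Dispersion are the average value and the dispersion of Nv* or R2v* in the corresponding row.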
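The aggregation behind Table 5 can be sketched in two steps: first average each model over the nine splits it was not built on (the diagonal cell is empty), then average the ten row averages. The helper `row_stats` is a hypothetical name, and treating the dispersion as a population standard deviation is an assumption; with it, the second step gives the 0.9255 ± 0.0163 reported in the abstract.

```python
from statistics import fmean, pstdev


def row_stats(values):
    """Average and population standard deviation, skipping empty cells (None)."""
    kept = [v for v in values if v is not None]
    return fmean(kept), pstdev(kept)


# Step 1: model m1 assessed on splits s1..s10 (Table 5); s1 is the empty diagonal.
m1_r2 = [None, 0.9481, 0.9025, 0.9381, 0.9539, 0.9619,
         0.9384, 0.9491, 0.9483, 0.9291]
m1_mean, m1_dispersion = row_stats(m1_r2)

# Step 2: the ten row averages of R2v* (Table 5, models m1..m10).
row_averages = [0.9410, 0.9271, 0.9123, 0.9037, 0.9438,
                0.9194, 0.8968, 0.9456, 0.9389, 0.9261]
system_mean, system_dispersion = row_stats(row_averages)

print(f"m1 average: {m1_mean:.4f}")
print(f"system: {system_mean:.4f} ± {system_dispersion:.4f}")
```

Representing the diagonal as `None` keeps the row aligned with the split labels, so the same list can be indexed by split number without an offset.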
Table 6. The list of duplicated quasi-SMILES observed in [12].
CAS of Solvent | Quasi-SMILES | Mole Fraction | Comment
493-01-6 | C1CCC2CCCCC2C1[C60] | −3.300 | Deleted
493-02-7 | C1CCC2CCCCC2C1[C60] | −3.500 | Involved
6876-23-9 | CC1CCCCC1C[C60] | −4.600 | Deleted
2207-01-4 | CC1CCCCC1C[C60] | −4.600 | Involved
74-97-5 | C(Cl)Br[C60] | −4.200 | Deleted
74-97-5 | C(Cl)Br[C60] | −4.200 | Involved
540-49-8 | C(=CBr)Br[C60] | −3.700 | Deleted
540-49-8 | C(=CBr)Br[C60] | −3.670 | Involved
2586-62-1 | CC1=C(C2=CC=CC=C2C=C1)Br[C60] | −2.100 | Deleted
2586-62-1 | CC1=C(C2=CC=CC=C2C=C1)Br[C60] | −2.130 | Involved
112-71-0 | CCCCCCCCCCCCCCBr[C60] | −2.590 | Deleted
112-89-0 | CCCCCCCCCCCCCCBr[C60] | −2.530 | Involved
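Duplicates such as those in Table 6 can be located by grouping the records on the quasi-SMILES string. The sketch below uses a few rows from Table 6 and Table 9; the function name `find_duplicates` and the record layout are hypothetical, and the rule for choosing which duplicate to keep is not stated in this section, so the code only reports the groups.

```python
from collections import defaultdict

# (CAS of solvent, quasi-SMILES, value) for a few illustrative records.
records = [
    ("493-01-6", "C1CCC2CCCCC2C1[C60]", -3.300),
    ("493-02-7", "C1CCC2CCCCC2C1[C60]", -3.500),
    ("74-97-5", "C(Cl)Br[C60]", -4.200),
    ("74-97-5", "C(Cl)Br[C60]", -4.200),
    ("109-66-0", "CCCCC[C60]", -6.100),
]


def find_duplicates(rows):
    """Group rows by quasi-SMILES and return only groups with more than one entry."""
    groups = defaultdict(list)
    for cas, quasi_smiles, value in rows:
        groups[quasi_smiles].append((cas, value))
    return {qs: entries for qs, entries in groups.items() if len(entries) > 1}


duplicates = find_duplicates(records)
for quasi_smiles, entries in duplicates.items():
    print(quasi_smiles, entries)
```

Grouping on the string rather than on the CAS number matters here: several pairs in Table 6 have different CAS numbers but identical quasi-SMILES, so a model based on quasi-SMILES cannot distinguish them.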
Table 7. The percentage of identity for random splits examined in this study.
Split | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10
1 | 100 | 33.0 | 38.5 | 41.2 | 37.3 | 33.3 | 45.1 | 50.5 | 41.2 | 47.6
2 | 31.1 | 100 | 29.1 | 45.5 | 41.6 | 35.6 | 27.7 | 39.2 | 39.6 | 46.2
3 | 39.2 | 30.5 | 100 | 37.3 | 29.4 | 45.1 | 31.4 | 31.1 | 37.3 | 41.9
4 | 40.8 | 41.5 | 36.2 | 100 | 42.0 | 36.0 | 36.0 | 43.6 | 36.0 | 46.6
5 | 35.3 | 43.8 | 32.7 | 51.4 | 100 | 32.0 | 38.0 | 53.5 | 38.0 | 42.7
6 | 40.8 | 37.7 | 30.5 | 39.6 | 34.3 | 100 | 40.0 | 35.6 | 40.0 | 42.7
7 | 44.7 | 43.4 | 38.1 | 39.6 | 38.1 | 43.4 | 100 | 37.6 | 40.0 | 44.7
8 | 51.0 | 38.1 | 36.5 | 41.9 | 36.5 | 43.8 | 38.1 | 100 | 31.7 | 46.2
9 | 52.0 | 40.8 | 39.2 | 42.7 | 33.3 | 54.4 | 50.5 | 47.1 | 100 | 35.0
10 | 43.6 | 34.6 | 36.9 | 48.1 | 31.1 | 42.3 | 46.2 | 35.0 | 47.5 | 100
If i > j, the matrix element [i, j] is the percentage of identity for the active training sets; if i < j, the matrix element [i, j] is the percentage of identity for the validation (external) sets. Here, i and j are the numbers of the 10 splits examined.
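A percentage of identity between two sets can be computed as in the sketch below. The formula used here, the number of shared items divided by the mean size of the two sets, is an assumption: this section does not state the definition, and the sets are toy integer identifiers rather than the real splits. The function name `identity_percent` is likewise hypothetical.

```python
def identity_percent(set_i, set_j):
    """Percentage of identity between two sets of quasi-SMILES identifiers.

    Assumed definition: 100 * N_shared / (0.5 * (N_i + N_j)).
    """
    set_i, set_j = set(set_i), set(set_j)
    shared = len(set_i & set_j)
    return 100.0 * shared / (0.5 * (len(set_i) + len(set_j)))


# Toy validation sets of two hypothetical splits: 50 and 53 items, 17 in common.
validation_a = range(0, 50)
validation_b = range(33, 86)

print(f"{identity_percent(validation_a, validation_b):.1f}")
print(f"{identity_percent(validation_a, validation_a):.1f}")
```

With this definition a set compared with itself gives 100, which matches the diagonal of Table 7, and the measure is symmetric in its two arguments.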
Table 8. Correlation weights of the codes of quasi-SMILES used to build the model of solubility of fullerenes C60 and C70 in organic solvents (split 1, third scheme).
Code | CW (Code) | Frequency of Code in Active Training Set | Frequency of Code in Passive Training Set | Frequency of Code in Calibration Set | Statistical Defect of Code | Code Is Involved in Simulation
#...........0.03101.0000FALSE
(...(.......−0.41296460.0053TRUE
(...........−0.12683034280.0020TRUE
1...(.......0.16021017100.0069TRUE
1...........−0.33861729180.0069TRUE
2...(.......0.01001.0000FALSE
2...........−0.17685320.0114TRUE
2...1.......0.01001.0000FALSE
3...........0.01001.0000FALSE
3...2.......0.01001.0000FALSE
=...(.......−0.3789131470.0075TRUE
=...........0.46852030170.0069TRUE
=...1.......0.09631222130.0078TRUE
=...2.......0.05385210.0191TRUE
=...3.......0.01001.0000FALSE
C...#.......0.03101.0000FALSE
C...(.......−0.43472833260.0026TRUE
C...........0.17975052500.0003TRUE
C...1.......−0.49571729180.0069TRUE
C...2.......0.36295320.0114TRUE
C...3.......0.01001.0000FALSE
C...=.......−0.37021524160.0060TRUE
C...C.......0.16774043430.0012TRUE
F...(.......0.01021.0000FALSE
F...........0.01021.0000FALSE
Br..(.......0.02671.0000FALSE
Br..........−0.48588790.0037TRUE
Br..C.......−0.40076140.0175TRUE
I...(.......0.02301.0000FALSE
I...........0.03601.0000FALSE
I...C.......0.01401.0000FALSE
Cl..(.......0.4380810100.0030TRUE
Cl..........−0.0828911100.0023TRUE
Cl..1.......0.01001.0000FALSE
Cl..2.......0.00101.0000FALSE
Cl..C.......0.03121.0000FALSE
N...#.......0.02101.0000FALSE
N...(.......0.00201.0000FALSE
N...........0.03721.0000FALSE
N...1.......0.00101.0000FALSE
N...2.......0.00011.0000FALSE
N...=.......0.00211.0000FALSE
N...C.......0.01511.0000FALSE
O...(.......−0.16699450.0108TRUE
O...........0.22541613140.0029TRUE
O...1.......0.00201.0000FALSE
O...=.......0.32007630.0095TRUE
O...C.......0.44748590.0075TRUE
S...(.......0.00301.0000FALSE
S...........0.00501.0000FALSE
S...1.......0.00101.0000FALSE
S...=.......0.00201.0000FALSE
S...C.......0.00201.0000FALSE
[C60].......−0.35784549390.0024TRUE
[C70].......−0.134874120.0139TRUE
[CH2].......0.00211.0000FALSE
[CH]........0.01011.0000FALSE
[Ge]........0.00101.0000FALSE
[N+]........0.03011.0000FALSE
[O-]........0.03011.0000FALSE
[Si]........0.00011.0000FALSE
[Sn]........0.02001.0000FALSE
[xyx0]......0.38341513130.0021TRUE
[xyx1]......0.24741814190.0043TRUE
[xyx2]......−0.44006860.0036TRUE
[xyx3]......−0.123358100.0087TRUE
[xyx4]......0.02611.0000FALSE
[xyx5]......0.02211.0000FALSE
[xyx6]......0.03111.0000FALSE
[xyx7]......0.01001.0000FALSE
[xyx9]......0.00101.0000FALSE
[xyyx0].....0.20194434400.0035TRUE
[xyyx1].....0.031151.0000FALSE
[xyyx2].....0.02861.0000FALSE
[xyyx3].....0.02001.0000FALSE
[xyyx4].....0.01001.0000FALSE
[xyzyx0]....−0.43344544470.0013TRUE
[xyzyx1]....−0.11165840.0085TRUE
[xyzyx2]....0.02001.0000FALSE
[xyzyx3]....0.00101.0000FALSE
Table 9. Quasi-SMILES encoding the solutions of fullerenes C60 and C70 in organic solvents, with the values of the optimal descriptor DCW(3,15), the experimental (Expr) and calculated (Calc) values of the mole fraction, the statistical defect of each quasi-SMILES, and the applicability domain (AD) status, for split 1 (third scheme). The regression formula is as follows: logS = −7.608 + 0.4426 × DCW(3,15).
Set | CAS * | Quasi-SMILES | DCW(3,15) | Expr | Calc | The Statistical Defect of Quasi-SMILES | AD
P109-66-0CCCCC[C60]3.6226−6.1000−6.00500.0174YES
V110-54-3CCCCCC[C60]4.0950−5.1000−5.79580.0189YES
C111-65-9CCCCCCCC[C60]5.0400−5.2000−5.37760.0217YES
P26635-64-3CC(C)CC(C)(C)C[C60]5.3953−5.2000−5.22031.0549YES
V124-18-5CCCCCCCCCC[C60]5.9849−4.7000−4.95940.0246YES
A112-40-3CCCCCCCCCCCC[C60]6.9298−3.5000−4.54110.0275YES
V493-02-7C1CCC2CCCCC2C1[C60]9.3895−3.5000−3.45240.1283YES
A137-43-9C1CCC(C1)CBr[C60]9.4279−4.2000−3.43540.0891YES
V542-18-7C1CCC(CC1)Cl[C60]7.5453−4.1000−4.26870.0717YES
C108-85-0C1CCC(CC1)Br[C60]8.2381−3.4000−3.96201.0701YES
P626-62-0C1CCC(CC1)I[C60]8.3333−2.8000−3.91992.0664YES
P5401-62-7C1CCC(C(C1)Br)Br[C60]9.9420−2.6000−3.20794.0855No
C110-83-8C1CCC=CC1[C60]7.5129−3.8000−4.28300.0692YES
C108-87-2CC1CCCCC1[C60]7.0015−4.5000−4.50940.0536YES
P75-09-2C(Cl)Cl[C60]4.5883−4.6000−5.57750.0320YES
A56-23-5C(Cl)(Cl)(Cl)Cl[C60]5.8391−4.4000−5.02390.0672YES
V74-95-3C(Br)Br[C60]6.9622−4.5000−4.52683.0236YES
P75-25-2C(Br)(Br)Br[C60]8.1953−3.2000−3.98105.0366No
A74-88-4CI[C60]4.5584−4.2000−5.59082.0096YES
C74-96-4CCBr[C60]6.2277−5.2000−4.85190.0323YES
P75-03-6CCI[C60]5.0308−4.5000−5.38162.0110YES
P79-34-5C(C(Cl)Cl)(Cl)Cl[C60]7.6012−3.1000−4.24390.0718YES
A107-06-2C(CCl)Cl[C60]5.0251−5.0000−5.38421.0318YES
C71-55-6CC(Cl)(Cl)Cl[C60]5.6861−4.7000−5.09160.0510YES
A540-54-5C[CH]CCl[C60]3.4486−5.6000−6.08202.0155YES
P107-08-4CCCI[C60]5.5033−4.6000−5.17252.0124YES
A75-29-6CC(C)Cl[C60]4.7185−5.9000−5.51990.0298YES
C75-26-3CC(C)Br[C60]4.9687−5.4000−5.40911.0289YES
A75-30-9CC(C)I[C60]5.0639−4.8000−5.36702.0252YES
C78-87-5CC(CCl)Cl[C60]5.4975−4.9000−5.17511.0333YES
C142-28-9C([CH]CCl)Cl[C60]6.3111−4.8000−4.81492.0297YES
V78-75-1CC(CBr)Br[C60]8.0234−4.3000−4.05702.0476YES
P627-31-6C(CI)CI[C60]7.0158−3.4000−4.50305.0240No
C96-11-7C(C(CBr)Br)Br[C60]10.5461−2.9000−2.94054.0637No
A96-18-4C(C(CCl)Cl)Cl[C60]7.4126−4.0000−4.32741.0541YES
V513-36-0CC(C)CCl[C60]5.1553−5.4000−5.32661.0297YES
P513-38-2CC(C)CI[C60]5.8766−4.3000−5.00732.0274YES
V507-19-7CC(C)(C)Br[C60]4.8092−5.0000−5.47971.0437YES
C127-18-4C(=C(Cl)Cl)(Cl)Cl[C60]8.1332−3.8000−4.00850.0852YES
P513-37-1CC(=CCl)C[C60]5.5615−4.5000−5.14671.0486YES
V71-43-2C1=CC=CC=C1[C60]8.1013−4.0000−4.02261.0973YES
P95-47-6CC1=CC=CC=C1[CH2][C60]9.0053−2.9000−3.62242.0987YES
V108-38-3CC1=CC(=CC=C1)C[C60]9.2607−3.3000−3.50941.1166YES
C526-73-8CC1=C(C(=CC=C1)C)C[C60]9.9717−3.1000−3.19472.1265YES
A95-63-6CC1=CC(=C(C=C1)C)C[C60]9.1010−2.5000−3.58011.1372YES
P108-67-8CC1=CC(=CC(=C1)C)C[C60]9.4917−3.5000−3.40710.1388YES
A527-53-7CC1=CC(=C(C(=C1)C)C)C[C60]10.1268−2.4000−3.12602.1422YES
P119-64-2C1CCC2=CC=CC=C2C1[C60]10.1800−2.5000−3.10252.1722YES
C103-65-1CCCC1=CC=CC=C1[C60]9.5187−3.5000−3.39521.1016YES
A98-82-8CCC1=CC=CC=C1C(C)C[C60]10.8426−3.6000−2.80922.1187YES
V104-51-8CCCCC1=CC=CC=C1[C60]9.9911−3.4000−3.18611.1030YES
V98-06-6CC(C)(C)C1=CC=CC=C1[C60]9.2698−3.7000−3.50542.1248YES
C462-06-6C1=CC=C(C=C1)F[C60]8.4784−4.1000−3.85573.1246YES
P108-90-7C1=CC=C(C=C1)Cl[C60]8.8771−3.0000−3.67921.1299YES
V108-86-1C1=CC=C(C=C1)Br[C60]9.5699−3.3000−3.37262.1283YES
P95-50-1C1=CC=C(C(=C1)Cl)Cl[C60]10.8547−2.4000−2.80391.1392YES
P108-36-1C1=CC(=CC(=C1)Br)Br[C60]12.6461−2.6000−2.01093.1299YES
C694-80-4C1=CC=C(C(=C1)Cl)Br[C60]11.5475−2.4000−2.49722.1375YES
P108-37-2C1=CC(=CC(=C1)Br)Cl[C60]11.9533−3.0000−2.31762.1315YES
V120-82-1C1=CC(=C(C=C1Cl)Cl)Cl[C60]12.1777−2.8000−2.21831.1482YES
V100-42-5C=CC1=CC=CC=C1[C60]9.2708−3.2000−3.50491.1230YES
V98-95-3C1=CC=C(C=C1)[N+](=O)[O-][C60]7.4812−3.9000−4.29703.1715No
P100-47-0C1=CC=C(C=C1)CCN[C60]9.3119−4.2000−3.48675.1274No
P100-66-3COC1=CC=CC=C1[C60]7.7096−3.1000−4.19591.1205YES
C100-52-7C1=CC=C(C=C1)C=O[C60]7.9172−4.2000−4.10411.1527YES
P103-71-9C1=CC=C(C=C1)N=C=O[C60]9.3125−3.4000−3.48655.1544No
A99-08-1CC1=CC(=CC=C1)[N+](=O)[O-][C60]8.0163−3.4000−4.06023.1614YES
P108-98-5C1=CC=C(C=C1)S[C60]8.0975−3.0000−4.02433.1246YES
C100-39-0C1=CC=C(C=C1)CBr[C60]11.2321−3.1000−2.63681.1487YES
A30583-33-6CC1=CC(=C(C=C1Cl)Cl)Cl[C60]12.6502−3.0000−2.00911.1496YES
A90-12-0CC1=CC=CC2=CC=CC=C12[C60]10.7555−2.2000−2.84782.1924YES
A28804-88-8CC1=CC2=C(C=C1)C=C(C=C2)C[C60]11.3352−2.1000−2.59123.2245No
A605-02-7C1=CC=C(C=C1)C2=CC=CC3=CC=CC=C32[C60]15.1824−1.9000−0.88838.2616No
A64-17-5CCO[C60]2.7640−7.1000−6.38500.0214YES
C71-36-3CCCCO[C60]3.7089−5.9000−5.96670.0242YES
C71-41-0CCCCCO[C60]4.1814−5.3000−5.75760.0257YES
P67-64-1CC(=O)C[C60]2.7551−7.0000−6.38890.0603YES
P68-12-2CN(C)C=O[C60]3.8967−5.3000−5.88363.0493YES
P110-01-0C1CCSC1[C60]5.9939−5.4000−4.95543.0474YES
V110-02-1C1=CSC=C1[C60]6.5278−4.4000−4.71913.0862YES
P554-14-3CC1=CC=CS1[C60]7.3816−3.0000−4.34124.0719No
P872-50-4CN1CCCC1=O[C60]7.3505−3.9000−4.35493.0688YES
P110-86-1C1=CC=NC=C1[C60]8.9043−4.0000−3.66714.0906No
C91-22-5C1=CC=C2C(=C1)C=CC=N2[C60]11.6924−2.9000−2.43314.1982No
V62-53-3C1=CC=C(C=C1)N[C60]9.0161−3.9000−3.61773.1246YES
C100-61-8CNC1=CC=CC=C1[C60]9.2792−3.8000−3.50124.1027No
V121-69-7CN(C)C1=CC=CC=C1[C60]10.3144−3.2000−3.04304.1147No
C4904-61-4C1CC=CCCC=CCCC=C1[C60]10.8090−2.7000−2.82411.1096YES
A629-59-4CCCCCCCCCCCCCC[C60]7.8747−4.3000−4.12290.0303YES
A110-82-7C1CCCCC1[C60]6.5290−5.3000−4.71850.0521YES
C591-49-1CC1=CCCCC1[C60]8.1394−3.8000−4.00570.0653YES
A2207-01-4CC1CCCCC1C[C60]7.7404−4.6000−4.18230.0651YES
C1678-91-7CCC1CCCCC1[C60]7.4740−4.3000−4.30030.0550YES
V67-66-3C(Cl)(Cl)Cl[C60]5.2137−4.8000−5.30070.0496YES
V106-93-4C(CBr)Br[C60]7.5510−4.2000−4.26622.0461YES
A106-94-5CCCBr[C60]6.7002−5.2000−4.64270.0337YES
C109-64-8C(CBr)CBr[C60]9.2132−4.2000−3.53041.0665YES
A78-77-3CC(C)CBr[C60]7.0735−4.9000−4.47750.0486YES
C507-20-0CC(C)([CH2])Cl[C60]2.9118−5.7000−6.31961.0451YES
A558-17-8CC(C)(C)I[C60]4.9044−4.4000−5.43762.0400YES
A79-01-6C(=C(Cl)Cl)Cl[C60]7.5078−3.8000−4.28530.0676YES
C108-88-3CC1=CC=CC=C1[C60]8.5737−3.4000−3.81351.0987YES
A106-42-3CC1=CC=C(C=C1)C[C60]8.4511−3.3000−3.86782.1202YES
P488-23-3CC1=C(C(=C(C=C1)C)C)C[C60]10.2939−2.9000−3.05212.1422YES
V100-41-4CCC1=CC=CC=C1[C60]9.0462−3.4000−3.60431.1002YES
V135-98-8CCC(C)C1=CC=CC=C1[C60]9.9017−3.6000−3.22572.1115YES
V591-50-4C1=CC=C(C=C1)I[C60]9.6651−3.5000−3.33043.1246YES
P541-73-1C1=CC(=CC(=C1)Cl)Cl[C60]11.3457−3.4000−2.58650.1361YES
V583-53-9C1=CC=C(C(=C1)Br)Br[C60]12.1551−2.6000−2.22834.1329No
C88-72-2CC1=CC=CC=C1[N+](=O)[O-][C60]8.3836−3.4000−3.89763.1473YES
V100-44-7C1=CC=C(C=C1)CCl[C60]9.3139−3.4000−3.48592.1297YES
P90-13-1C1=CC=C2C(=C1)C=CC=C2Cl[C60]11.5646−2.0000−2.48973.2094No
P71-23-8CCCO[C60]3.2365−6.4000−6.17580.0228YES
V111-27-3CCCCCCO[C60]4.6538−5.1000−5.54850.0271YES
V111-87-5CCCCCCCCO[C60]5.5988−5.0000−5.13030.0300YES
A107-13-1C=CCCN[C60]4.2477−6.4000−5.72824.0323No
P111-96-6COCCOCCOC[C60]2.2569−5.2000−6.60942.0611YES
C111-84-2CCCCCCCCC[C60]5.5124−4.9200−5.16850.0232YES
C79-00-5C(C(Cl)Cl)Cl[C60]6.9758−4.7800−4.52080.0542YES
A109-65-9CCCCBr[C60]7.1726−3.7400−4.43360.0351YES
P629-04-9CCCCCCCBr[C60]8.5900−3.3000−3.80630.0394YES
A111-83-1CCCCCCCCBr[C60]9.0625−3.0900−3.59710.0408YES
V112-89-0CCCCCCCCCCCCCCBr[C60]11.8972−2.5300−2.34240.0494YES
A67-56-1CO[C60]2.2916−8.8700−6.59410.0199YES
A143-08-8CCCCCCCCCO[C60]6.0712−4.2900−4.92110.0314YES
C112-30-1CCCCCCCCCCO[C60]6.5437−4.1500−4.71200.0328YES
V112-42-5CCCCCCCCCCCO[C60]7.0161−3.9900−4.50290.0343YES
P67-63-0CC(C)O[C60]2.5223−6.6500−6.49200.0390YES
C78-92-2CCC(C)O[C60]2.9947−6.3400−6.28280.0404YES
V6032-29-7CCCC(C)O[C60]3.4672−5.5700−6.07370.0418YES
A584-02-1CCC(CC)O[C60]3.4672−5.3600−6.07370.0418YES
A504-63-2C(CO)CO[C60]1.9748−7.0500−6.73430.0556YES
C110-63-4C(CCO)CO[C60]2.4473−6.5700−6.52520.0571YES
V111-29-5C(CCO)CCO[C60]2.9197−6.1900−6.31610.0585YES
A102-04-5C1=CC=C(C=C1)CC(=O)CC2=CC=CC=C2[C60]12.8645−3.4000−1.91432.2878YES
P104-92-7COC1=CC=C(C=C1)Br[C60]9.1904−2.5400−3.54053.1377YES
A2398-37-0COC1=CC(=CC=C1)Br[C60]10.0001−2.5500−3.18212.1341YES
P573-98-8CC1=C(C2=CC=CC=C2C=C1)C[C60]11.4939−2.1200−2.52092.2303YES
A75-05-8CCCN[C60]4.3075−7.5400−5.70184.0110No
V109-99-9C1CCOC1[C60]5.4793−5.1700−5.18310.0652YES
V108-75-8CC1=CC(=NC(=C1)C)C[C60]10.1468−2.8000−3.11723.1314YES
C64-19-7CC(=O)O[C60]1.5955−6.2700−6.90220.0712YES
V79-09-4CCC(=O)O[C60]2.0679−5.7900−6.69310.0726YES
V107-92-6CCCC(=O)O[C60]2.5404−5.7400−6.48400.0740YES
P109-52-4CCCCC(=O)O[C60]3.0128−5.0500−6.27480.0754YES
A142-62-1CCCCCC(=O)O[C60]3.4853−4.5000−6.06570.0769YES
A111-14-8CCCCCCC(=O)O[C60]3.9577−4.2600−5.85660.0783YES
V124-07-2CCCCCCCC(=O)O[C60]4.4302−4.9800−5.64750.0797YES
P112-05-0CCCCCCCCC(=O)O[C60]4.9026−4.4100−5.43840.0812YES
A76-13-1C(C(F)(Cl)Cl)(F)(F)Cl[C60]7.6642−5.7700−4.21619.0770No
V540-49-8C(=CBr)Br[C60]9.2824−3.6700−3.49982.0618YES
C1649-08-7C(C(F)(F)Cl)Cl[C60]6.9149−5.3800−4.54776.0501No
P123-91-1C1COCCO1[C60]5.3115−5.3100−5.25742.0652YES
P95-48-7CC1=CC=CC=C1O[C60]8.0059−5.5400−4.06482.1016YES
P287-92-3C1CCCC1[C60]6.0566−6.5200−4.92760.0507YES
P75-11-6C(I)I[C60]6.2755−4.8200−4.83075.0183No
A79-24-3CC[N+](=O)[O-][C60]4.1130−6.7000−5.78792.0553YES
P74-97-5C(Cl)Br[C60]5.2811−4.2000−5.27091.0304YES
P109-73-9CCCCN[C60]5.3981−3.3000−5.21912.0139YES
V583-57-3C[C@@H]1CCCC[C@H]1C[C60]6.7955−4.6000−4.60060.0623YES
A106-96-7CCCCBr[C60]4.9448−4.6400−5.41973.0347YES
V10026-04-7[Si](Cl)(Cl)(Cl)Cl[C60]6.5498−4.8200−4.70931.0622YES
P10038-98-9Cl[Ge](Cl)(Cl)Cl[C60]6.9654−4.1000−4.52541.0499YES
A7646-78-8Cl[Sn](Cl)(Cl)Cl[C60]7.2152−3.7000−4.41481.0499YES
V13465-77-5[Si]([Si](Cl)(Cl)Cl)(Cl)(Cl)Cl[C60]7.9273−4.0500−4.09962.0975YES
C7789-66-4[Si](Br)(Br)(Br)Br[C60]9.0656−3.8900−3.59588.0467No
A7789-67-5Br[Sn](Br)(Br)Br[C60]9.8161−4.5200−3.26367.0374No
C107-83-5CCCC(C)C[C60]4.7527−5.4900−5.50470.0354YES
C96-14-0CCC(C)CC[C60]4.7527−5.3500−5.50470.0354YES
V142-82-5CCCCCCC[C60]4.5675−5.0100−5.58670.0203YES
A75-52-5C[N+](=O)[O-][C60]3.6406−4.8200−5.99702.0538YES
P75-15-0C(=S)=S[C60]5.5718−3.1800−5.14225.0450No
A110-89-4C1CCNCC1[C60]7.5213−2.1400−4.27933.0488YES
P123-75-1C1CCNC1[C60]7.0489−2.2700−4.48843.0474YES
C2586-62-1CC1=C(C2=CC=CC=C2C=C1)Br[C60]12.5551−2.1300−2.05123.2311No
A109-66-0CCCCC[C70]3.6495−6.5720−5.99300.0289YES
C110-54-3CCCCCC[C70]4.1220−5.6830−5.78390.0304YES
C142-82-5CCCCCCC[C70]4.5944−5.0830−5.57480.0318YES
C111-65-9CCCCCCCC[C70]5.0669−5.0950−5.36570.0332YES
V124-18-5CCCCCCCCCC[C70]6.0118−4.9130−4.94740.0361YES
C112-40-3CCCCCCCCCCCC[C70]6.9567−4.5780−4.52920.0390YES
V110-82-7C1CCCCC1[C70]6.5560−4.9870−4.70660.0636YES
A67-64-1CC(=O)C[C70]2.7820−6.7700−6.37700.0717YES
C67-63-0CC(C)O[C70]2.5492−6.6990−6.48000.0505YES
C56-23-5C(Cl)(Cl)(Cl)Cl[C70]5.8660−4.8570−5.01200.0787YES
P106-42-3CC1=CC=C(C=C1)C[C70]8.4780−3.2360−3.85582.1317YES
A108-67-8CC1=CC(=CC(=C1)C)C[C70]9.5187−3.6130−3.39520.1503YES
V108-88-3CC1=CC=CC=C1[C70]8.6007−3.7500−3.80151.1102YES
V71-43-2C1=CC=CC=C1[C70]8.1282−3.8590−4.01071.1088YES
P75-15-0C(=S)=S[C70]5.5987−3.1510−5.13035.0565No
V75-09-2C(Cl)Cl[C70]4.6152−5.2150−5.56560.0435YES
P95-50-1C1=CC=C(C(=C1)Cl)Cl[C70]10.8816−2.3160−2.79191.1507YES
P95-47-6CC1=CC=CC=C1[CH2][C70]9.0323−2.6500−3.61052.1102YES
C541-73-1C1=CC(=CC(=C1)Cl)Cl[C70]11.3726−2.5950−2.57460.1476YES
A119-64-2C1CCC2=CC=CC=C2C1[C70]10.2069−2.6970−3.09062.1837YES
A67-56-1CO[C70]2.3185−8.7420−6.58220.0314YES
A64-17-5CCO[C70]2.7910−7.2720−6.37300.0329YES
C71-23-8CCCO[C70]3.2634−6.4570−6.16390.0343YES
C71-36-3CCCCO[C70]3.7359−6.0230−5.95480.0357YES
A71-41-0CCCCCO[C70]4.2083−6.4350−5.74570.0372YES
V111-27-3CCCCCCO[C70]4.6808−5.2740−5.53660.0386YES
C111-87-5CCCCCCCCO[C70]5.6257−5.0500−5.11830.0415YES
V111-70-6CCCCCCCO[C70]5.1532−5.0040−5.32750.0400YES
C143-08-8CCCCCCCCCO[C70]6.0981−4.3590−4.90920.0429YES
V112-30-1CCCCCCCCCCO[C70]6.5706−4.1520−4.70010.0443YES
C112-42-5CCCCCCCCCCCO[C70]7.0431−4.2160−4.49100.0458YES
* CAS is related to the corresponding solvent; A, P, C, and V denote active training, passive training, calibration, and validation sets, respectively.
Table 10. Quasi-SMILES CCCCC[C60] is the code of the solution for fullerene C60 in pentane.
Table 10. Quasi-SMILES CCCCC[C60] is the code of the solution for fullerene C60 in pentane.
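| Code of Quasi-SMILES | CW (Code) | Frequency of Code in Active Training Set | Frequency of Code in Passive Training Set | Frequency of Code in Calibration Set |
|---|---|---|---|---|
| [C60]....... | 0.4203 | 45 | 49 | 39 |
| C........... | 0.3047 | 50 | 52 | 50 |
| C........... | 0.3047 | 50 | 52 | 50 |
| C........... | 0.3047 | 50 | 52 | 50 |
| C........... | 0.3047 | 50 | 52 | 50 |
| C........... | 0.3047 | 50 | 52 | 50 |
| C...C....... | 0.1677 | 40 | 43 | 43 |
| C...C....... | 0.1677 | 40 | 43 | 43 |
| C...C....... | 0.1677 | 40 | 43 | 43 |
| C...C....... | 0.1677 | 40 | 43 | 43 |
| [xyx1]...... | 0.2474 | 18 | 14 | 19 |
| [xyyx0]..... | 0.2019 | 44 | 34 | 40 |
| [xyzyx0].... | 0.5584 | 45 | 44 | 47 |
| | 3.6226 | | | |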
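The final entry of Table 10 (3.6226) can be checked against the listed correlation weights. The sketch below assumes that this entry is the sum of the CW values over all codes extracted from CCCCC[C60], as is usual for descriptors built from correlation weights; that reading is not stated in the table itself, and the names `TABLE_10_CODES` and `sum_correlation_weights` are hypothetical. The four-decimal weights printed in the table sum to about 3.6223, which agrees with the tabulated 3.6226 to within the rounding of the individual weights.

```python
# Correlation weights (CW) as printed in Table 10 for quasi-SMILES CCCCC[C60].
# Each tuple is (code, CW, number of rows in which the code is listed).
# Assumption: the last table entry (3.6226) is the sum of these weights.
TABLE_10_CODES = [
    ("[C60].......", 0.4203, 1),
    ("C...........", 0.3047, 5),
    ("C...C.......", 0.1677, 4),
    ("[xyx1]......", 0.2474, 1),
    ("[xyyx0].....", 0.2019, 1),
    ("[xyzyx0]....", 0.5584, 1),
]


def sum_correlation_weights(codes):
    """Return the sum of CW over every listed occurrence of every code."""
    return sum(cw * count for _code, cw, count in codes)


dcw = sum_correlation_weights(TABLE_10_CODES)
print(f"Sum of correlation weights: {dcw:.4f}")
print(f"Difference from tabulated value: {abs(dcw - 3.6226):.4f}")
```

The small residual is expected: each of the 13 listed weights is rounded to four decimals, so the rounded sum can differ from a total computed at full precision by a few units in the fourth decimal place.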
Table 11. The comparison of different approaches for simulation of solubility of fullerenes in different solvents.
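| Approach | Set | n | R² | References |
|---|---|---|---|---|
| MLR * | Training set | 92 | 0.861 | [23] |
| | Validation set | 30 | 0.903 | |
| PLS | Training set | 80 | 0.674 | [24] |
| | Validation set | 28 | 0.692 | |
| SVM | Training set | 92 | 0.871 | [25] |
| | Validation set | 30 | 0.940 | |
| DTB | Training set | 145 | 0.970 | [12] |
| | Validation set | 36 | 0.964 | |
| Monte Carlo | Training set | 55 | 0.947 | [26] |
| | Validation set | 35 | 0.915 | |
| DFT | Training set | 44 | 0.73 | [27] |
| | Validation set | 15 | 0.74 | |
| Quantum-mechanical descriptors | Training set | 44 | 0.76 | [28] |
| | Validation set | 15 | 0.70 | |
| CODESSA software | Training set | 21 | 0.745 | [29] |
| | Validation set | 6 | 0.801 | |
| Self-consistent models | Training set | ≈100 | ≈0.73 | In this study |
| | Validation set | 19–22 | 0.84–0.94 | |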
* MLR = multiple linear regression; PLS = partial least squares regression; SVM = support vector machine; DTB = decision tree boost; DFT = density functional theory.
