3.1. Amines
The mono-substituted N-alkylamines were treated earlier [1], and in that work we established an NH2 amine Group Contribution (GC) parameter value of +13 kJ/mol. The averaged absolute difference between the experimental values and those of our GC model was found to be 1.20 kJ/mol (Table S15 in the Supplementary Materials).
Referring to what we discussed above, it would be more appropriate to exclusively use reliable verified experimental data. When we disregard the entries for which we had data from the CAPEC database [
32] (the explicit values were not published and are therefore not quoted in this paper), we end up with
Table 1. The entries in
Table 1 originate either from Rossini [
33], Pedley et al. [
34], or Steel et al. [
35]. Compared to the larger set in Ref. [
1], the averaged absolute deviation drops from 1.20 kJ/mol to 0.71 kJ/mol, with the annotation that methylamine was excluded. Firstly, this illustrates the point made above about the quality of experimental data and the benefit of a stricter data selection. Secondly, for methylamine, the presence of a methyl Group on the nitrogen cannot be described appropriately, i.e., within chemical accuracy, by the GC model. We will observe this more often in cases discussed in subsequent sections, and therefore methyl-substituted nitrogen species should be considered separately. The G4 quantum results, taking error bars into account, are all within chemical accuracy, including that for methylamine, which confirms that G4 results are in good agreement with experimental values.
Now, considering the 2-aminoalkanes, viz.
Table 2, using the GC parameter for the NH
2 group, the heats of formation of isobutylamine and isobutylamine differ from their experimental values by clearly more than chemical accuracy (4 kJ/mol). However, when we introduce an additional GC parameter associated with an NH
2 group in R-CH(NH
2)-R’ and a numerical value of +3 kJ/mol, we obtain the results shown as the first two entries in
Table 2, revealing agreement with experimental values within chemical accuracy. We will later see two more examples for which the introduction of this GC parameter leads to good agreement between experiment and model. As can be corroborated from
Table 2, these results also agree very well with the G4 results.
Next, we consider the symmetrical di-alkyl-substituted secondary amines R2NH. Experimental data and model values are collected in
Table 3 along with the G4 quantum results. We needed, as expected based on what we saw before, to introduce another GC parameter for the secondary amine R-NH-R’, and established a value of +51 kJ/mol to obtain good agreement between experimental and model values for four of the species in
Table 3 (entries 2, 3, 4, and 7). As for the primary alkyl amines, dimethylamine is an exception; it should be treated separately rather than by introducing additional parameters for a single species only. As for methylamine, one can apply the G4 method to obtain a value within chemical accuracy of the experimental one.
The GC results for diisopropyl and diisobutylamine reveal a larger difference when compared to the experimental values. The G4 results (column 6 in
Table 3) also reveal a difference of over 8 kJ/mol for diisopropylamine, all with reference to the experimental data from Pedley et al. [
34]. There are also data due to Pilcher c.s. [
37], and adopting their value for diisopropylamine reveals very good agreement with the Boltzmann-averaged G4 result (difference 0.7 kJ/mol), whereas the GC value is off by 6.7 kJ/mol, which is considerably less than the deviation from Pedley’s value (over 14 kJ/mol) and not that much beyond chemical accuracy. As we overall get good results from our GC model and G4 generally provides values close to the experimental ones, we suspect that the experimental value for diisopropylamine from Pedley c.s. is insufficiently reliable. For diisobutylamine, the conformationally averaged G4 result deviates significantly from the experimental values (by 7.9 and 9.5 kJ/mol for Pedley’s and Pilcher’s experimental values, respectively). Some steric hindrance is realistic for diisopropylamine and may account for the difference between the GC value and the experimental value, whereas the G4 and the experimental value (Pilcher c.s.) agree very well. When we compare the experimental and model findings collected in
Table 3, it remains elusive why both the GC and G4 values are 8–10 kJ/mol away from the experimental value for diisobutylamine, see
Figure 1, as for this structure no steric effects are expected. The only potential reason seems to be a problem with the experimental value, as the G4 results are consistent throughout, and without any expected steric or electronic effects, in good agreement with the GC model.
For the tertiary trialkylamines R
3N, we observe very similar results, as shown in
Table 4. Another GC parameter was introduced for the tertiary nitrogen with a value of +92.5 kJ/mol. All three G4 results agree very well with the experimental values. For the GC approach, once more the trimethylamine is the exception and should be treated as a Group by itself. The other two GC values represent the experimental data within chemical accuracy.
Table 5 comprises data related to a number of other secondary and tertiary alkyl amines. For the first entry, butylisobutylamine, we see a difference between the GC model and experimental values beyond chemical accuracy, but the GC value is quite close to the G4 Boltzmann averaged value. A relatively high value for the error in the experimental value has been given as 5.2 kJ/mol [
34], and therefore we conclude that the GC model and G4 value can be considered within chemical accuracy of the true value. Entries 2 and 4 have a methyl Group and, in line with previous examples mentioned above, the GC value is, as expected, beyond chemical accuracy from the experimental value. However, for N-methyl-butanamine, the difference between model and experimental value is positive rather than negative as for the other (previous) examples, with a methyl Group attached to the amine nitrogen. The G4 value of −83.6 kJ/mol is very much different from the experimental value, but when we consider the difference between the GC model value and the G4 value, we do find a negative difference of −8.8 kJ/mol. These results strongly suggest that the experimental value is in error, and this confirms once more that only carefully verified experimental data (see also text in the Introduction) should be taken into account when developing a GC model.
Entry 3, N,N-dimethyloctylamine, reveals good agreement between the GC value and the experimental value despite the presence of methyl Groups. The G4 value of −174.2 kJ/mol for this species also agrees comparatively well with the experimental and GC model values. The good agreement is unexpected because of the presence of methyl Groups and contrasts with other examples already presented, where we see deviations significantly beyond chemical accuracy. When we look at the last entry, N,N-dimethylbutane-1-amine, we have no sufficiently reliable experimental value, but the difference between the GC and G4 values is not much beyond chemical accuracy (5.5 kJ/mol). A preliminary conclusion is that N,N-dimethylamines do not show the larger deviation.
Tert-butylisopropylamine shows a deviation slightly beyond chemical accuracy, but considering the error in the experimental value of 3.2 kJ/mol as provided by Verevkin c.s. [
40], the GC model is considered to perform well enough.
In
Table 6, we collect data for diamines. For all three species with a 1,2-diamine, we need two different amine parameters, as determined before (see above) for N-alkyl amines and 2-aminoalkanes, respectively. The resulting GC values for the heat of formation agree with the experimental values within chemical accuracy, as do the G4 results.
In summary, for the amines, we generally observed good agreement between our GC model and the experimental values. In the few cases for which the difference was somewhat beyond chemical accuracy, the error in the experimental data can account for this, and therefore the GC model values are considered acceptable for a model aiming for chemical accuracy. The general exception to good agreement within chemical accuracy is formed by those species that have a methyl Group attached to the amine nitrogen. These include methylamine (−6.4 kJ/mol), dimethylamine (−14.1 kJ/mol), trimethylamine (−10.9 kJ/mol), N-methyl-butanamine (−8.8 kJ/mol when compared to the G4 result), and tert-butylmethylamine (−14.7 kJ/mol). From a pragmatic point of view, when adding +11 kJ/mol to all methyl-substituted species, we have agreement between experimental and GC values within, or almost within, chemical accuracy for all five named species. A preliminary conclusion is that this larger deviation for methylamines does not apply to N,N-dimethylamines, but further data are required to confirm this. A better way, when the expertise and methodology are available, is to obtain the heat of formation for such species from G4 calculations.
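The pragmatic +11 kJ/mol offset can be checked with a few lines of arithmetic. The following Python sketch takes the five deviations quoted in this paragraph (GC model minus reference value), adds the offset, and compares each residual with the 4 kJ/mol chemical-accuracy threshold used throughout this work. The variable and function names are illustrative assumptions and not part of the GC model itself.

```python
# Deviations (GC model minus reference, kJ/mol) as quoted in the text.
# For N-methyl-butanamine the reference is the G4 result, not experiment.
CHEMICAL_ACCURACY = 4.0     # kJ/mol
METHYL_N_CORRECTION = 11.0  # kJ/mol, pragmatic offset from the text

deviations = {
    "methylamine": -6.4,
    "dimethylamine": -14.1,
    "trimethylamine": -10.9,
    "N-methyl-butanamine": -8.8,
    "tert-butylmethylamine": -14.7,
}


def corrected_residuals(devs, correction):
    """Add a constant correction to every deviation (hypothetical helper)."""
    return {name: round(d + correction, 1) for name, d in devs.items()}


residuals = corrected_residuals(deviations, METHYL_N_CORRECTION)
for name, r in residuals.items():
    status = "within" if abs(r) <= CHEMICAL_ACCURACY else "beyond"
    print(f"{name:24s} {r:+5.1f} kJ/mol ({status} chemical accuracy)")
```

With these inputs, four of the five residuals fall within 4 kJ/mol and methylamine ends up at +4.6 kJ/mol, just beyond the threshold.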
3.2. Alkyl Esters
Experimental and model values for the heats of formation of 26 alkyl esters are collected in
Table 7. The Marrero–Gani (MG) [
10], Constantinou–Gani (CG) [
9] and Joback and Reid (JR) [
8] GC values were obtained via the Propred module in the ICAS23 software suite [
41]. Column 3 contains the values from our GC model as developed previously [
1,
2,
3,
4]. We previously developed the GC parameter COOH for the carboxylic acids, with a numerical value of −391 kJ/mol. For the esters, we need further GC parameters in order to achieve chemical accuracy. Recalling what we experienced while describing the aliphatic ethers [2], i.e., that we needed a separate GC parameter for Me-O-R compared to R’-O-R, it seems logical to adopt this for the esters too and have separate GC parameters for COO(Me) and COO(R). Our two new GC parameter values represent the COO Group in COOMe and COOR, respectively, as the alkyl part is not included in the parameter value. We initially established the COO(R) value as −339 kJ/mol. According to B3LYP calculations we performed, the methyl ester is 13 kJ/mol less stable than the ethyl ester, which can be translated into a numerical difference between the COO(R) and the COO(Me) Group. With a Group parameter value of −339 kJ/mol for COO(R), we thus arrive at −326 kJ/mol for COO(Me).
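The derivation of the COO(Me) parameter is a single shift of the COO(R) value by the computed destabilisation of the methyl ester. A minimal Python sketch of that step is given below; the constant and function names are hypothetical and only serve to make the bookkeeping explicit.

```python
# Group parameter and B3LYP energy difference as quoted in the text (kJ/mol).
COO_R = -339.0                  # COO(R) group parameter
METHYL_DESTABILISATION = 13.0   # methyl ester less stable than ethyl ester


def coo_methyl_parameter(coo_r, destabilisation):
    """Shift the COO(R) parameter by the methyl destabilisation (hypothetical helper)."""
    return coo_r + destabilisation


COO_ME = coo_methyl_parameter(COO_R, METHYL_DESTABILISATION)
print(f"COO(Me) = {COO_ME:.0f} kJ/mol")
```

A less negative parameter corresponds to a less stable Group, which is why the destabilisation is added rather than subtracted.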
When we adopt the named Groups and their parameter values, we arrive at the results shown in
Table 7, from which we see that our GC model for the linear (non-branched) alkanoic acid methyl esters (first nine entries) gives results clearly within chemical accuracy. This is generally not the case for the other three GC models, which show differences with experimental values of up to 26 kJ/mol for individual cases; this can be attributed to the fact that these do not have a separate GC parameter for COO(Me). For a number of alkanoic acid methyl esters, the Marrero–Gani GC dHf values are in reasonably good agreement with the experimental values, some within chemical accuracy. However, for some species the differences are comparatively large, although the structures differ only in the number of aliphatic CH2 groups and one should therefore expect very regular behaviour. The difference in dHf between nonanoic acid methyl ester and decanoic acid methyl ester according to the Marrero–Gani method is only 7.3 kJ/mol, whereas the addition of a single CH2 group in the alkyl chain should lead to a difference of approximately 20 kJ/mol. For acetic acid methyl ester, the Marrero–Gani value is spot on, but for many others, including pentanoic acid methyl ester, hexanoic acid methyl ester, and nonanoic acid methyl ester, the difference with the experimental value is more than 10 kJ/mol. A more detailed analysis reveals the origin of these observations.
Whereas for the linear (non-branched) alkanoic acid methyl esters we should always have the same set of Groups constituting the molecule, the Marrero–Gani method turned out to have a third-order contribution CH3-(CH2)m-CH2COO in some cases, CH3-CHm-COO in some others, and yet still others in which both contributions are taken into account. As these are the only differences within the series, this must be related to the irregularly varying dHf, even though one should simply have an approximately 20 kJ/mol difference upon one more (or less) CH2 group.
Table 7 shows that for the first 50% of the entries, we find excellent agreement between the experimental value and our GC model. The four available G4 results are also in very good agreement with both the experimental and the GC model values. But going down the table, we find somewhat unexpected results. For the first 50% of the entries in
Table 7, we note that the increment per CH2 group is close to 20 kJ/mol, which is in accordance with all other alkyl-like systems, and is also the same for the different GC models. However, when we look at the sequence pentanoic acid ethyl ester–pentanoic acid propyl ester–pentanoic acid butyl ester, the increment is about 27 kJ/mol, which cannot be understood from any known physico-chemical argument. The same increment, 27 kJ/mol, is found for the pair butanoic acid 1-methylpropyl ester and pentanoic acid 1-methylpropyl ester (last two entries in
Table 7). There is, as we think, only one possible explanation, which is that long-range interactions cause this effect at the side of the ester group (and not at the opposite side; see the first nine entries in
Table 7). We can obtain an initial verification by using quantum calculations of relative stabilities. We compare three molecules with identical chemical formulae but different alkyl chain lengths at the ester side. For pentanoic acid ethyl ester, butanoic acid propyl ester, and propanoic acid butyl ester, both B3LYP and ωB97X-D yield only marginal differences in the total energies, namely, within 3 kJ/mol, which cannot account for the much larger differences we see in
Table 7.
Next, when we consider pentanoic acid methyl ester (one of these nine), 2-methylbutanoic acid methyl ester, and 3-methylbutanoic acid methyl ester (the three entries in green in
Table 7), where all species have the same chemical formulae, the results from our GC model suggest that these three have close dHf values with a range of 7 kJ/mol. The experimental values show a range of 26 kJ/mol, with the difference between our GC model and experimental values for the methylbutanoic methyl esters being 14.8 and 18.1 kJ/mol. As it was not clear why this difference is so large, we performed quantum calculations of the B3LYP//6-311+G** and ωB97X-D//6-311++G** methods. Both methods reveal differences between the three structures of several kJ/mol only, which does not support the large differences between the experimental values. The quantum calculations also reveal that, in accordance with the experimental and GC values, 3-methylbutanoic acid methyl ester is the most stable one of the three structures. Whereas steric effects due to methyl substitution next to the carboxylic group as in 2-methylbutanoic acid ethyl ester and 2-methylbutanoic acid methyl ester may be an explanation for the differences between experimental and GC model values, the fact that for 2,2-dimethylpropanoic acid methyl ester we have pretty good agreement between the experimental and our GC model values (4.7 kJ/mol, just beyond chemical accuracy) means that steric effects do not seem to be involved. A similar reasoning applies to the three ethyl esters (the three entries in blue in
Table 7). As the differences between experimental and GC model results, which we could not explain up until now, do not give us any indication whether, and if so, which additional GC parameter(s) should be introduced, we should look at a passage from the introduction of the Pedley paper: [
34] ‘
Esters (Table 2.11) For esters the agreement between calculated and experimental values is rather erratic, almost certainly due to long range interactions through the ester group; some discrepancies may also be due to unreliable experimental data. Note that many of the experimental values for uncertainties are considerably greater than the corresponding calculated values, indicating that the experimental data in these cases are of little value in defining the reliability of the method’. With this in mind, we decided to evaluate the heats of formation of all esters in
Table 7 using the G4 method. Because of the large number of possible conformations per species, we estimated the contribution of conformational averaging based on the earlier results for similar alkyl branches, and these values are provided in the column entitled ‘est. confor avgd’. Only the G4 values for species with long alkyl chains are more seriously affected, but the changes still remain in the kJ/mol range. What we observe now is very interesting: all except four values (those in red) reveal very good agreement, within chemical accuracy, between the GC model and the conformationally averaged G4 values. The four exceptions share the common feature that the structure has a branched substituent directly at the ether oxygen. As the difference between the G4 and GC model values for these four is 9.55 ± 0.55 kJ/mol, a substitution correction energy of this magnitude added to the GC parameters will lead to perfect agreement for all 25 esters. We could establish this result, which was not found by any other GC method, because we proceeded step-wise and without automation, inspecting every individual result and connecting it with (dis-)similarities in structure. Equally important was the challenging of experimental data by invoking G4 calculations.
In conclusion, by invoking G4 calculations, we could lend support to our GC model showing excellent performance, whereas some experimental data must be considered in error, which is in line with the remarks made by Pedley c.s. quoted above.
3.3. Substituted Benzenes
In one of the previous papers in this series, we reported the introduction of alkyl substitution corrections when bonded to a benzene ring. The magnitude of this substitution was reported as depending on the number of substituents: for the mono-alkyl substituted benzene +6 kJ/mol, for the di-substituted alkyl benzene +18.5 kJ/mol, for the tri-alkyl substituted benzene +30 kJ/mol, and for the tetra alkyl substituted benzene +40 kJ/mol. At this point, it is fully open whether other substituents will exhibit similar behaviour. Because of the interaction with the aromatic ring, it is not to be expected that the magnitude of these parameters is the same for other substituents as for the alkyl species. Furthermore, it is to be investigated whether multi-substituted species which include an alkyl substituent still involve one of the former alkyl-related substituent parameters or entirely new multi-substituent parameters.
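The alkyl substitution corrections listed above amount to a small lookup table keyed by the number of alkyl substituents on the ring. The Python sketch below encodes the four quoted values and prints the increment contributed by each additional substituent; the helper name is hypothetical, and no value is assumed for substitution counts that the text does not quote.

```python
# Alkyl-on-benzene substitution corrections (kJ/mol) quoted in the text,
# keyed by the number of alkyl substituents on the ring.
ALKYL_RING_CORRECTION = {1: 6.0, 2: 18.5, 3: 30.0, 4: 40.0}


def alkyl_ring_correction(n_alkyl):
    """Return the quoted correction for n_alkyl substituents (hypothetical helper)."""
    if n_alkyl not in ALKYL_RING_CORRECTION:
        raise ValueError(f"no correction quoted for {n_alkyl} alkyl substituents")
    return ALKYL_RING_CORRECTION[n_alkyl]


# Increment gained with each additional substituent; a strictly additive
# scheme would give a constant increment.
increments = [
    alkyl_ring_correction(n) - alkyl_ring_correction(n - 1) for n in (2, 3, 4)
]
print(increments)
```

The increments are not constant, so the correction cannot be reduced to a single per-substituent term. This is one reason why it is not obvious in advance how other substituents will behave.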
3.3.1. t-Butylbenzenes
Four t-butylbenzenes were considered and the data are collected in
Table 8. An interaction term of +13.5 kJ/mol was added to account for the interaction between the t-butyl Group and the benzene ring. This value was determined by looking for a good fit between model and experimental values. We retained the mono-alkyl substitution correction of +6 kJ/mol for mono-substitution and for di-alkyl substitution the correction of +18.5 kJ/mol [
1]. We subsequently observed good agreement between the model and experimental values for the heat of formation for 3- and 4-t-butyltoluene, viz.
Table 8, whereas for 2-t-butyltoluene we observed a larger discrepancy, which is attributed to the steric interaction between the methyl and the neighbouring t-butyl group as (semi-qualitatively) confirmed by B3LYP calculations (+27 kJ/mol compared to 3- and 4-t-butyltoluene).
We finally note that in the
Supplementary Material to Ref. [
1], we erroneously gave the GC model value for t-butylbenzene as −18.85 kJ/mol, which actually was the result for sec-butylbenzene.
3.3.2. Anilines
Table 9 contains data on anilines. The GC model comprises the Group parameters as previously established for the various classes of amines. In addition, for the anilines with an alkyl substituent to the benzene ring, there is a substitution parameter involved, as we established for the alkyl substituted benzenes [
1], 6 kJ/mol for single substitution and 18.5 kJ/mol for double substitution. In addition, we needed to introduce an additional interaction parameter of a magnitude of −7 kJ/mol for the NH
2–benzene pair to obtain good agreement between experimental and GC model values. The agreement between experimental and GC model values is good, except for N,N-dimethylaniline. The G4 quantum values agree with the experimental values for all species including N,N-dimethylaniline, and, consequently, we conclude that there is a specific problem with the GC model for N,N-dimethylaniline. Although the difference is very similar to the methyl-substituted amines, the good agreement between the GC model and experimental values for the other methylanilines in
Table 9 suggests a different origin, but we currently have no explanation for the failure of the GC model for this particular molecule. As we may learn through other GC models, when we compare to a few other Group Contribution approaches as implemented in the ICAS23 software suite, the deviation with the experimental value for N,N-dimethylaniline is more than 20 kJ/mol for the Marrero–Gani model (so even more than our GC model), but the Constantinou–Gani model is almost spot on with 102.8 kJ/mol (experimental value: 100 kJ/mol). However, for N-methylaniline, the Constantinou–Gani method is 9 kJ/mol off, whereas both our GC model and the Marrero–Gani model values are within chemical accuracy. So, none of these three models accounts with chemical accuracy for all species in
Table 9. For the Marrero–Gani method, we found an averaged absolute deviation for all species, N,N-dimethylaniline excepted, of 2.3 kJ/mol, whereas our GC model gives 2.3 kJ/mol, so, in essence, the same averaged performance. Whereas for the Marrero–Gani method three values differ beyond chemical accuracy, for our model, this is only the case for N,N-dimethylaniline, and this despite the fact that the Marrero–Gani method employs more specific benzene substitution parameters such as 1,2,3 or the 1,3,5 pattern, whereas we only differentiate between single, double, or triple substitution [
1].
As we have seen that the G4 method generally gives very good values (within chemical accuracy when compared to experiments) for the heat of formation, it is reasonable that we can rely on G4-calculated heats of formation for other anilines too. Interestingly, our GC model value for N,N-diethylaniline of 44.0 kJ/mol agrees very well with the G4 value of 43.1 kJ/mol (
Table 9b), as is the case for 2-ethylaniline. The GC model value for 3,5-dimethylaniline is still within chemical accuracy, but 2,6-dimethylaniline is clearly more off (7.4 kJ/mol) but not dramatically so. When we compare to another GC approach, in this case specifically the Marrero–Gani approach, we observe a phenomenon that we have observed before in our series of studies. Whereas one would not expect a deviation for N,N-diethylaniline, as there is no steric hindrance or similar that the GC method does not account for, there is no significant deviation for 2,6-dimethylaniline, indicating that the parametrisation has not been established based on the proper physical chemistry.
3.3.3. Phenols
Data on alkyl-substituted phenols are shown in
Table 10. Regarding the GC parameters, an additional OH–phenyl group interaction parameter was introduced and has a value of +7 kJ/mol. This leads to good agreement between experimental and GC model values, with an averaged absolute deviation of 2.8 kJ/mol. The G4 quantum values, adopted from van der Spoel c.s. [
19], reveal an averaged absolute difference of 1.8 kJ/mol. Two GC model values are slightly beyond chemical accuracy, namely, for 4-ethylphenol and 2,4-dimethylphenol. Somewhat surprisingly, the G4 value for 3-methylphenol shows a difference of 6.4 kJ/mol compared to the experimental value (Pedley c.s. [
34]), whereas the G4 value is in agreement with our GC model within chemical accuracy. Regarding the three methylphenols, whereas the experimental heats of formation, as presented in
Table 10, reveal the most negative value for 3-methylphenol, a variety of quantum method results, including CBS-QB3, G2, G3, G4 (all from Ref. [
19]), T1 [
45] (actual results obtained with the Spartan program [
31]), and B3LYP relative energy calculations (present work), reveal that there is a monotonic decrease from 2-methylphenol to 4-methylphenol. In other words, according to all these methods, 2-methylphenol has the most negative heat of formation, which puts serious doubt on the experimental value for 3-methylphenol. Therefore, we conclude that our GC model performs very well, with almost all individual values within chemical accuracy and the exceptions not much beyond (less than 5 kJ/mol).
The additional OH–phenyl group interaction parameter introduced here has a value of +7 kJ/mol. One may question whether this parameter should be seen at an identical base as the interaction parameter for an alkyl substitution for which we previously established a parameter value of +6 kJ/mol. The potential advantage would be that we have one parameter less, which is generally good for a model with many parameters. With a value of +6 kJ/mol, the results for the phenols are still acceptable throughout. However, when we treat an OH and an alkyl substitution on an equal footing, we should consequently invoke the multiple substitution parameters established previously (di-, tri-, and tetra-substituted benzene) [
1]. This leads to significant differences between experimental and model values for the tri-substituted methyl-phenols (e.g., 2,3-dimethylphenol), and therefore we concluded that including an additional separate phenol correction is the proper approach.
Finally, other GC methods, including the Marrero–Gani and Constantinou–Gani approaches, which give a heat of formation for phenol of −98.3 and −97.8 kJ/mol, respectively, show a larger deviation, beyond chemical accuracy, from the experimental value quoted in
Table 10. It is likely, viewing the times the methods were developed, that the differences are due to the use of an experimental value like the −96.4 kJ/mol from Pedley c.s. [
34], and not to an inherent difference between the models as such. This once more confirms that the selection of experimental values and their reliability is absolutely crucial when developing a GC model with chemical accuracy.
3.3.4. Methoxybenzenes
Experimental and computed data for methoxybenzenes are collected in
Table 11. For the mono- and di-methoxybenzenes (upper part of
Table 11), the agreement between the G4 quantum values is very good when compared to the experimental data, with an averaged absolute deviation of 2.0 kJ/mol and all values within chemical accuracy. We note that van der Spoel c.s. [
19] have reported a G4 value of −229.4 kJ/mol for 1,2-dimethoxybenzene, which is significantly off the experimental value and also differs significantly from the G4 result of −209.4 kJ/mol reported by Verevkin [
44]. Regarding our GC model, when introducing a GC value for the methoxy Group of −152 kJ/mol, all GC values agree within chemical accuracy with the experimental values, except a larger deviation for 1,2-dimethoxybenzene.
Somewhat peculiar is the observation that one of the values for methoxybenzene provided by the NIST database [
46] is −76.69 kJ/mol, with reference to ‘Reanalyzed by Pedley, Naylor, et al., 1986’, whereas the original 1986 Pedley paper quotes −67.9 kJ/mol, the value we are using in the present work. Also noteworthy is that Pedley’s value of −223.3 kJ/mol for 1,2-dimethoxybenzene (−206 kJ/mol according to Verevkin) is within chemical accuracy of the GC model value, but the G4 quantum result (−209.4 kJ/mol) does not support this value. We have further confirmation from B3LYP relative energy calculations that we performed in the course of the present work that the experimental and G4 values in
Table 11 are correct. The difference between the experimental values for 1,2- and 1,3-dimethoxybenzene of 17.6 kJ/mol is to be compared to the 16.2 kJ/mol from the B3LYP calculations. Similarly, differences between 1,4- and 1,3-dimethoxybenzene read 7.4 kJ/mol and 8.7 kJ/mol, respectively. Thus, the GC model does not account properly for 1,2-dimethoxybenzene, and consequently also not for the trimethoxybenzenes with adjacent methoxy groups.
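The comparison between experimental and B3LYP isomer differences can be reproduced from the experimental values in Table 11. The Python sketch below is an illustration only: it takes the 1,3-isomer as the reference, as the text does, and the dictionary and function names are assumptions introduced here.

```python
# Experimental heats of formation (kJ/mol) from Table 11.
exp = {"1,2": -206.0, "1,3": -223.6, "1,4": -216.2}

# B3LYP relative energies with respect to the 1,3-isomer (kJ/mol), from the text.
b3lyp_relative = {"1,2": 16.2, "1,4": 8.7}


def relative_to_meta(values, isomer):
    """Difference between an isomer and the 1,3-isomer (hypothetical helper)."""
    return round(values[isomer] - values["1,3"], 1)


for isomer, calc in b3lyp_relative.items():
    exp_rel = relative_to_meta(exp, isomer)
    print(f"{isomer}-dimethoxybenzene: exp {exp_rel:.1f}, "
          f"B3LYP {calc:.1f}, difference {calc - exp_rel:+.1f} kJ/mol")
```

Both B3LYP differences lie within about 1.5 kJ/mol of the experimental ones, which is the basis for regarding the experimental ordering of the three isomers as genuine.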
Table 11.
Experimental, present GC model, and G4 model values for the heat of formation for methoxybenzenes. All values in kJ/mol. Experimental and G4 quantum data for the upper part of the table are from Verevkin c.s. [44] except for the experimental values for methoxybenzene and 1-methoxy-3-methylbenzene (Pedley [34]) and the G4 value for methoxybenzene [19]. Experimental and G4 results in the lower part of the table were taken from Verevkin c.s. [47]. The sixth column contains our GC model results with methoxy–methoxy neighbour interactions accounted for (see text for further explanation).
Methoxybenzenes | Exp. | Model | Model − Exp. | ABS (Model − Exp.) | ABS (Model + Methoxy Corr. − Exp.) | ABS (Exp. − G4) | G4 |
---|---|---|---|---|---|---|---|
methoxybenzene (anisole) | −67.9 | −67.5 | 0.4 | 0.4 | 0.4 | 3.7 | −71.6 |
1,2-dimethoxybenzene | −206.0 | −219.5 | −13.5 | 13.5 | 2.5 | 3.4 | −209.4 |
1,3-dimethoxybenzene | −223.6 | −219.5 | 4.1 | 4.1 | 0.0 | 0.6 | −224.2 |
1,4-dimethoxybenzene | −216.2 | −219.5 | −3.3 | 3.3 | 3.3 | 0.2 | −216.0 |
1-methoxy-3-methylbenzene | −104.1 | −103.9 | 0.2 | 0.2 | 0.2 | | |
averaged absolute difference | | | | 4.3 | 1.3 | 2.0 | |
1,2,3-trimethoxybenzene | −346.0 | −371.5 | −25.5 | 25.5 | 0.6 | 5.8 | −351.8 |
1,2,4-trimethoxybenzene | −360.6 | −371.5 | −10.9 | 10.9 | 0.1 | 0.1 | −360.7 |
1,3,5-trimethoxybenzene | −381.6 | −371.5 | 10.1 | 10.1 | 1.9 | 0.4 | −382.0 |
3,4,5-trimethoxytoluene | −382.2 | −407.9 | −25.7 | 25.7 | 3.7 | 1.2 | −383.4 |
averaged absolute difference | | | | 18.0 | 1.6 | 1.9 | |
As for 1,2-dimethoxybenzene, the GC model value is more negative than the experimental value, and the cause of the difference is likely of steric origin. When we adopt a value of +11 kJ/mol for the interaction energy between two neighbouring methoxy groups and −4.1 kJ/mol for the interaction energy between two methoxy groups at the 1 and 3 positions, we obtain a GC model corrected for methoxy–methoxy interactions, with the model values shown in column 6 of
Table 11. The numerical values +11 kJ/mol and −4.1 kJ/mol were determined by considering simultaneously the methoxybenzenes, the methoxyphenols, and the methoxybenzaldehydes so as to obtain suitable numerical values applying to all three classes. For the methoxybenzenes, all GC model values now agree with the corresponding experimental values within chemical accuracy, with an averaged absolute deviation of 1.6 kJ/mol. The availability of sufficient data for the methoxybenzenes made this possible.
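The effect of the two methoxy–methoxy interaction terms can be illustrated on the upper part of Table 11, where each species has at most one methoxy–methoxy pair. The Python sketch below applies +11 kJ/mol per adjacent pair and −4.1 kJ/mol per 1,3 pair and recomputes the averaged absolute difference. The row layout and function names are assumptions made for this illustration. The trimethoxy species are left out because the way several pair terms combine is not spelled out in this section.

```python
ADJACENT = 11.0  # kJ/mol, neighbouring methoxy groups
META = -4.1      # kJ/mol, methoxy groups at the 1 and 3 positions

# (name, experimental, uncorrected GC model, adjacent pairs, 1,3 pairs)
rows = [
    ("methoxybenzene",            -67.9,  -67.5, 0, 0),
    ("1,2-dimethoxybenzene",     -206.0, -219.5, 1, 0),
    ("1,3-dimethoxybenzene",     -223.6, -219.5, 0, 1),
    ("1,4-dimethoxybenzene",     -216.2, -219.5, 0, 0),
    ("1-methoxy-3-methylbenzene", -104.1, -103.9, 0, 0),
]


def corrected_model(model, n_adjacent, n_meta):
    """GC model value plus methoxy-methoxy pair corrections (hypothetical helper)."""
    return model + n_adjacent * ADJACENT + n_meta * META


def mean_abs_difference(table, use_correction):
    """Averaged absolute difference between model and experiment."""
    diffs = []
    for _, exp_value, model, n_adj, n_meta in table:
        value = corrected_model(model, n_adj, n_meta) if use_correction else model
        diffs.append(abs(value - exp_value))
    return sum(diffs) / len(diffs)


mad_plain = mean_abs_difference(rows, use_correction=False)
mad_corrected = mean_abs_difference(rows, use_correction=True)
print(f"uncorrected: {mad_plain:.1f} kJ/mol, corrected: {mad_corrected:.1f} kJ/mol")
```

For these five species, the averaged absolute difference drops from 4.3 to 1.3 kJ/mol, matching the corresponding row of Table 11.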
3.3.5. Methoxyphenols
Experimental and computed data for methoxyphenols are collected in
Table 12. First of all, we observe that the G4 results compare very favourably with the experimental values, with an averaged absolute difference of 1.9 kJ/mol and not a single deviation beyond chemical accuracy, which is yet another demonstration that the G4 method can be used to fill gaps where experimental data are not available or are insufficiently reliable. The GC model for these species involves the phenol-related correction (OH–phenyl) parameter (+7 kJ/mol) as well as the methoxy Group parameter (−152 kJ/mol). Using the OH–benzene and methoxy–benzene interaction parameters established previously for the phenols and methoxybenzenes, we find four out of eight GC model values for the methoxyphenols within chemical accuracy of the experimental ones, viz. columns 3 and 4 in
Table 12. The most unexpected difference is that for 4-methoxyphenol, as no steric effects are to be expected. However, the difference with 2-methoxyphenol is well reproduced by B3LYP relative energy calculations (12 kJ/mol compared to 13 kJ/mol based on the experimental values), and it is thus to be considered genuine.
For the dimethoxyphenols, we observe a larger difference between experimental and GC model values for 2,6- and 3,4-dimethoxyphenol. The value for 3,5-dimethoxyphenol is somewhat beyond chemical accuracy. We observed similar significant deviations for the Marrero–Gani GC method [
10] for the same species (in fact, larger deviations than for our GC model), viz.
Table 12, and we found that this also holds for the Constantinou and Gani [
9] and the Joback and Reid [
8] GC methods (all these GC results are from the ICAS23 software suite [
41]). This indicates that current GC methods cannot account for these species.
However, when we now introduce the methoxy–methoxy interaction parameters as established in the previous section for the methoxybenzenes, i.e., +11 kJ/mol for adjacent (e.g., 3,4-) and −4.1 kJ/mol for next-to-adjacent (e.g., 3,5-) dimethoxy interactions, we arrive at the results shown in column 6 of
Table 12. The averaged absolute difference between experimental and GC model values decreases from 6.6 to 4.6 kJ/mol.
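The rule for choosing between the two interaction parameters can be sketched as follows for a species carrying exactly two methoxy groups. The function name and the use of ring-position numbers as input are assumptions made for illustration; only the two parameter values and the adjacent/next-to-adjacent distinction come from the text.

```python
ADJACENT = 11.0           # kJ/mol, neighbouring methoxy groups (e.g., 3,4-)
NEXT_TO_ADJACENT = -4.1   # kJ/mol, one ring position in between (e.g., 3,5-)


def dimethoxy_correction(pos_a, pos_b):
    """Methoxy-methoxy correction for two methoxy groups on a benzene ring.

    Positions are ring locants 1..6. Groups opposite each other on the
    ring (separation of three positions) receive no correction.
    """
    if pos_a == pos_b or not (1 <= pos_a <= 6 and 1 <= pos_b <= 6):
        raise ValueError("positions must be two different locants in 1..6")
    separation = abs(pos_a - pos_b)
    separation = min(separation, 6 - separation)  # shortest way round the ring
    if separation == 1:
        return ADJACENT
    if separation == 2:
        return NEXT_TO_ADJACENT
    return 0.0


print(dimethoxy_correction(3, 4))  # 11.0
print(dimethoxy_correction(3, 5))  # -4.1
print(dimethoxy_correction(2, 6))  # -4.1
```

Under this rule, positions 2 and 6 count as next to adjacent because they are separated by position 1 when going round the ring.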
Still, apart from the 4-methoxyphenol mentioned earlier, two larger deviations remain, namely those for 2,3- and 2,6-dimethoxyphenol. Firstly, when we compare the GC model values (column 6) with the G4 values rather than the experimental values, the differences become smaller. Secondly, the two problematic cases both have a methoxy group adjacent to the phenol group, which can lead to steric as well as electronic interaction effects. In the next section, we will see that the same two substitution patterns exhibit similarly large deviations within the methoxybenzaldehyde family.
3.3.6. Methoxybenzaldehydes
Experimental and computed data for methoxybenzaldehydes are collected in
Table 13. Again, the G4 results compare very favourably with the available experimental data. Our GC model shows good (within chemical accuracy) to reasonably good agreement with the available experimental data for the majority of species considered. GC model values for 4-methoxybenzaldehyde and 2,5-dimethoxybenzaldehyde are within chemical accuracy of the experimental values considering the experimental errors of 2.0 and 2.2 kJ/mol, respectively. The larger differences include 2,3-dimethoxybenzaldehyde and 2,6-dimethoxybenzaldehyde, for which we have no reliable experimental data and which we therefore compare to the G4 values. The conclusions are very much like those for the methoxyphenols: a few GC results do not agree with available experimental or G4 results, and this discrepancy is also observed for those species when invoking other GC methods such as the Marrero–Gani approach [
10] as implemented in ICAS23 [
41], viz. the last column in
Table 13. The introduction of the methoxy–methoxy interaction parameter (see previous section) leads to the differences shown in column 6 in
Table 13. The 2,3- and 2,6-dimethoxybenzaldehydes, as before, show significant differences, which are attributed to interaction with the aldehyde group. The deviation for 3,4-dimethoxybenzaldehyde, which is now larger than the one obtained without the additional methoxy–methoxy interaction parameters, remains unaccounted for at present.
It would be possible to account for the remaining deviations by introducing additional GC parameters, but that would be based on a few data only and is, therefore, not pursued.
3.3.7. Benzoic Acids
Experimental and GC model results are shown in
Table 14. To obtain a good description, we introduced an additional substitution parameter of a magnitude of +12 kJ/mol for the COOH to benzene ring correction. In addition, the previously defined additional terms for the methoxy Group (−152 kJ/mol), the amino Group (−7 kJ/mol) attached to a benzene ring, and t-butyl attached to a benzene ring (+13.5 kJ/mol) are involved in the benzoic acid series considered here.
Once more, as GC methods do not tend to account for steric effects, e.g., neighbour steric overlap, they predict identical or almost identical heats of formation for 2-, 3-, and 4-alkyl-substituted species. The agreement between experimental values and GC model values for 3- and 4-methylbenzoic acid is within chemical accuracy, but for 2-methylbenzoic acid, the experimental value is less negative than the GC value, suggesting that steric effects are responsible. We see the same behaviour for ethylbenzoic acid. The difference between the experimental results for 2- and 4-ethylbenzoic acid of 15 kJ/mol is to be compared with the B3LYP calculated energy difference of 14 kJ/mol. The combination of these results is important with respect to a proper GC parametrization, as other GC approaches such as Marrero–Gani and Constantinou–Gani (implementation in the ICAS software suite [
41]) reveal a GC value in good agreement with 2-ethylbenzoic acid and not in good agreement with 4-ethylbenzoic acid, whereas it should be the other way around, because 2-ethylbenzoic acid is subject to steric effects not accounted for by the GC methods. The same behaviour is observed for the three mono-methoxybenzoic acids, with 2-methoxybenzoic acid being significantly less stable due to steric effects, as confirmed by B3LYP relative energy calculations. The same behaviour is also seen for the three tert-butylbenzoic acids. We also observe that agreement between experimental and model values is reasonably good for 3,5-diethylbenzoic acid, which is, as expected, not suffering from steric overlap effects between the substituents.
The aminobenzoic acids reveal a somewhat different behaviour, as the 2- and 4-aminobenzoic acids have an identical heat of formation (within experimental accuracy) and both are more stable than 3-aminobenzoic acid by some 13 kJ/mol. The error in the experimental results was given by Pedley et al. as 3.9 kJ/mol for 3-aminobenzoic acid and 3.8 kJ/mol for 4-aminobenzoic acid (for 2-aminobenzoic acid, the given error is 1.3 kJ/mol). This implies that the GC model values for 3- and 4-aminobenzoic acid are in fact within experimental accuracy of the experimental values.
Finally, when we consider the good performance of the G4 method for the cases discussed before, as well as the majority of cases in
Table 14, we see that for 2- and 3-aminobenzoic acid as well as 3- and 4-t-butylbenzoic acid, the difference between the conformationally averaged G4 value and the experimental value is around 10 kJ/mol. In three of these four cases, there are no steric effects and so no potential weak interactions which may not be well accounted for. It remains to be settled what the origin of the problem is, but a re-evaluation of the experimental data should inevitably be part of this.
3.3.8. Acetophenones
Data for acetophenones are collected in
Table 15. In contrast to benzaldehyde, here we needed the GC parameter for a keto Group, which we determined previously [
1]. In addition, we needed to introduce the acetyl-benzene interaction parameter to which we attributed the value +8 kJ/mol to arrive at a good fit between the GC model and experimental values, viz.
Table 15. The G4 results and available experimental values agree just within chemical accuracy. The larger deviation for 2-ethylacetophenone is similar to that observed for similar molecules in the previous subsection, and is to be attributed to steric interactions between neighbouring substituents, which are generally not accounted for in GC models. When we disregard the three 2-substituted species, the model values reveal an averaged absolute difference of 2.7 kJ/mol with the experimental values or, when these are not available, the G4 calculated values.
The value of the acetyl-benzene parameter can be varied; e.g., when adopting a value of +6 kJ/mol, the averaged absolute deviation drops to 1.9 kJ/mol, but at the same time, the deviation for 2-methylacetophenone increases to 6.0 kJ/mol, beyond chemical accuracy. This can either be attributed to steric effects (as for 2-ethylacetophenone) or considered acceptable, as the error in the experimental value was given as 2.3 kJ/mol for 2-methylacetophenone, and thus the GC model is within chemical accuracy of the experimental value once this error is taken into account. More data would be required to establish the optimum value for the acetyl-benzene parameter.
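The argument about 2-methylacetophenone can be made explicit with a small check. This is a sketch under a stated assumption: "within chemical accuracy considering the experimental error" is read here as the absolute deviation, reduced by the reported experimental error, not exceeding the 4 kJ/mol threshold used in this paper. The function name and this particular reading are illustrative and not taken from the text.

```python
CHEMICAL_ACCURACY = 4.0  # kJ/mol


def within_chemical_accuracy(deviation, exp_error=0.0,
                             threshold=CHEMICAL_ACCURACY):
    """True if |deviation|, reduced by the experimental error, is within threshold.

    Assumed reading: the experimental error bar is subtracted from the
    absolute deviation before comparing with the threshold.
    """
    residual = max(abs(deviation) - exp_error, 0.0)
    return residual <= threshold


# 2-methylacetophenone with the acetyl-benzene parameter set to +6 kJ/mol:
print(within_chemical_accuracy(6.0))                 # False
print(within_chemical_accuracy(6.0, exp_error=2.3))  # True
```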
Overall, we can conclude that our model performs very satisfactorily in view of the limited availability of experimental data and the agreement with G4 data.
3.3.9. Furans
The furans form an interesting class when we look at the data collected in
Table 16. Because of the ring strain expected, we adopted the bare furan species as a Group by itself. Consequently, there is no model value in the table for it, as this is now by definition the experimental value of −34.9 kJ/mol. Other furans in
Table 16 comprise this Group enhanced with substituents. The first two reveal reasonable agreement between the GC model and G4 values, with deviations somewhat beyond chemical accuracy. Two things should be mentioned here. Firstly, the error in the experimental result (last column) means that for furfural the GC model value is still within chemical accuracy considering this error. Secondly, the G4 value for 2-furanmethanol is almost identical to the GC model value, which may suggest an issue with the experimental value. Also for furfural, the G4 and GC model values are very close. Furancarboxylic acid shows a significant difference both between the GC model and experimental values, and between the G4 and experimental values. As furan is a five-membered ring, it will be subject to ring strain and this ring strain is generally substituent-dependent, but for the five-membered ring this effect is not expected to be very significant [
4]. For the last two entries, we observe good agreement between experimental and GC model values, but a clearly much larger deviation for the G4 results. So, all in all, we have a somewhat diffuse picture, and with the present data, it is not possible to draw final conclusions. Of course, this situation is also related to the fact that we want to achieve chemical accuracy, and other GC methods (in ICAS23) we evaluated also reveal larger differences with the experimental data.
3.3.10. Indoles
Experimental and model data for indole (
Figure 2, left-hand structure) and indoline (
Figure 2, right-hand structure) and several of their derivatives are shown in
Table 17. Following our earlier approach, both structures are considered a Group by themselves. Except for 1-methylindoline, the data in
Table 17 reveal agreement between experimental and G4 results within chemical accuracy, indicating that there is little doubt about their correctness (within the respective error bars). The GC model values for 2-methylindole and 2-methylindoline are also in good agreement with the experimental values. For the 1-methyl substituted species, we need to consider which Groups are to be involved. In previous sections on amines, we established different Group parameters for RNH₂, R₂NH, and R₃N. Thus, substitution at the 1-position in indole or indoline requires a modification in the Group contribution. In order to avoid the introduction of new parameters, for 1-methylindole (and similarly for 1-methylindoline) we attempted to take the indole Group, subtract the contribution for R₂NH, and subsequently add the parameter value for the R₃N Group. In addition, we added the interaction parameter of −7 kJ/mol for the NH₂-benzene pair, which we adopted from the aniline series discussed earlier. Although there was no guarantee that this would work, we obtained a very good result for 1-methylindole, well within chemical accuracy (1.2 kJ/mol). For 1-methylindoline, the deviation is as high as 11.1 kJ/mol. It remains open whether this is a real problem, as the G4 result lies more or less precisely in between the experimental and the GC model value. Thus, taking into account the good performance of the G4 method, if the G4 value were the correct one, the GC model value would almost be within chemical accuracy. More data are needed to put this on more solid ground.
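The parameter bookkeeping described above for the 1-methyl species can be sketched as a short function. Only the −7 kJ/mol interaction term is taken from the text; the function name and argument names are hypothetical, and the numbers in the usage line are placeholders chosen to make the arithmetic easy to follow, not parameter values from this work. The text does not state how the methyl carbon itself is counted, so it is passed as a separate, optional term.

```python
N_BENZENE_INTERACTION = -7.0  # kJ/mol, adopted from the aniline series


def n_substituted_estimate(parent_group, r2nh, r3n, substituent=0.0):
    """Estimate for an N-substituted ring system from existing parameters.

    parent_group : Group value of the unsubstituted ring system (indole/indoline)
    r2nh, r3n    : secondary and tertiary amine Group parameters
    substituent  : contribution of the added substituent, if counted separately
                   (assumption; set to 0.0 if the R3N parameter covers it)
    """
    return parent_group - r2nh + r3n + N_BENZENE_INTERACTION + substituent


# Placeholder inputs only, to show the order of operations:
print(n_substituted_estimate(100.0, 50.0, 80.0, substituent=-40.0))  # 83.0
```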