Article

Computation of the pKa Values of Gallic Acid and Its Anionic Forms in Aqueous Solution: A Self-Similar Transformation Approach for Accurate Proton Hydration Free Energy Estimation

Department of Quantum Chemistry, Faculty of Chemistry, Adam Mickiewicz University of Poznań, ul. Uniwersytetu Poznańskiego 8, 61-614 Poznań, Poland
Molecules 2025, 30(3), 742; https://doi.org/10.3390/molecules30030742
Submission received: 5 January 2025 / Revised: 27 January 2025 / Accepted: 30 January 2025 / Published: 6 February 2025
(This article belongs to the Special Issue Computational Chemistry Insights into Molecular Interactions)

Abstract

The Gibbs free energies of gallic acid (GA) and its anionic forms in aqueous solution were computed using density functional theory (DFT) at the LSDA, M062X, and B3LYP/QZVP levels, in conjunction with the SMD solvation model. The pKa values corresponding to the four-step deprotonation of GA were determined through a non-linear self-similar transformation, pKa = a⋅pKa(the)^c, which establishes a link between theoretical and experimental pKa values. This approach replaces the previously employed linear relationship, pKa = a⋅pKa(the) + b. The proposed model reproduces the experimental values of GA, pKa1 = 4.16 ± 0.02, pKa2 = 8.55 ± 0.01, pKa3 = 11.40 ± 0.10, and pKa4 = 12.8 ± 0.40, with a standard error (SE) of 0.045 and a mean absolute error (MAE) of 0.019 pKa units. Furthermore, it facilitates the precise determination of the Gibbs free energy of proton hydration, yielding ∆G(H+)aq = −259.4272(75) [kcal mol−1]. This result agrees well with the experimental value of ∆G(H+)aq = −259.5 [kcal mol−1].

1. Introduction

Deprotonation is a fundamental chemical reaction that occurs in both biological [1,2] and abiotic [2,3,4] systems, and it is a significant area of research on molecular reactivity and selectivity [3,4]. Monocarboxylic acids [5,6], along with their combinations with phenols and polyphenols [7,8], are particularly compelling subjects of study due to the presence of multiple reactive carboxyl and hydroxyl groups that engage actively in the deprotonation process. Experimental and theoretical analyses of deprotonation aim to assess the reactivity of each active group by identifying the sequential order of proton removal from the parent molecule and evaluating the relative susceptibility of each group to deprotonation [3,4,9]. The chemical reactivity of these compounds is largely influenced by the electrophilic nature of the carboxylic moiety, as well as the resonance stabilization that occurs upon proton dissociation. These factors jointly determine both the acidity and the predominant chemical reactivity of the carboxylic and phenolic groups. Acidity can be characterized quantitatively by the parameter pKa, the negative base-10 logarithm of the acid dissociation constant in a specific solvent, usually a hydrophilic one that facilitates proton detachment. Determining the pKa values of compounds experimentally and theoretically, especially in aqueous solution, is essential for a comprehensive understanding of numerous chemical processes and provides insight into the deprotonation state of a molecule within a specific solvent environment. Theoretical approaches to calculating pKa parameters have garnered significant interest, particularly for molecules that have not yet been synthesized, those for which experimental determination of pKa is challenging, and larger molecules where local environmental factors alter the intrinsic pKa values.
An example of the latter is the case of certain amino acids incorporated within polypeptide chains [2]. Achieving chemical accuracy in pKa calculations is inherently challenging, as an error of 1.36 [kcal mol−1] in the free energy change associated with deprotonation in a solvent corresponds to a deviation of 1 pKa unit [2]. In this regard, gallic acid (GA, Figure 1) is among the most extensively studied compounds [3,4,9] with respect to selectivity and functional group activity, owing to its four active hydrogens in the carboxyl moiety and the hydroxyl groups attached to the benzene ring.
Research has demonstrated [3,4] that the deprotonation sequence in GA in water solution follows the order O7′-H, O4′-H, O3′-H, O5′-H, necessitating the use of distinct deprotonation descriptors: pKa1(O7′-H), pKa2(O4′-H), pKa3(O3′-H), pKa4(O5′-H) associated with the four-step deprotonation of GA presented in Figure 2.
Despite the broad utility of the pKa descriptor, the acquisition of accurate and unique experimental data for GA remains challenging. The pKa values are commonly determined through ultraviolet–visible (UV-Vis) and Raman spectroscopy [10,11,12,13], or various analytical techniques [14]. For GA, the following experimental pKa1,2,3,4 values were reported, with standard errors where available: Set I (4.50, 7.05, 8.75, 10.25) [11], Set II (4.4 ± 0.1, 8.8 ± 0.1, 10.0 ± 0.1, 11.4 ± 0.1) [12], Set III (4.44, 8.54, 10.05, 11.30) [14], Set IV (4.16 ± 0.02, 8.55 ± 0.01, 11.40 ± 0.10, 12.8 ± 0.40) [13]. The availability of experimental pKa values, measured with varying degrees of precision, facilitates a comparison with the theoretical results generated by different models and methods employed in the calculations. A straightforward approach to the theoretical simulation of solvent effects involves the application of various solvation models, with particular emphasis on the Conductor-like Polarizable Continuum Model (C-PCM) [15], the Integral Equation Formalism Polarizable Continuum Model (IEF-PCM) [16], and the universal Solvation Model Density (SMD) based on solute electron density [17]. These models are widely employed to calculate the theoretical pKa values of chemical compounds in solution through the relationship shown below [18,19,20,21]:
$$\mathrm{p}K_a = \frac{G(\mathrm{R}^-)_{sol} - G(\mathrm{RH})_{sol} + G(\mathrm{H}^+)_{sol} - RT\ln(V)}{RT\ln(10)} = \frac{\Delta G + G(\mathrm{H}^+)_{sol} - 1.8942}{1.3642} \tag{1}$$
in which R is the gas constant, T = 298.15 [K] is the temperature (which is used in all calculations), and RT∙ln(V) = 1.8942 [kcal mol−1] stands for the correction [21]
$$G^{1\,[\mathrm{mol}]} = G^{1\,[\mathrm{atm}]} - 1.8942\ [\mathrm{kcal\ mol^{-1}}] \tag{2}$$
for the reference state (V = 24.46 [L], T = 298.15 [K]), from 1 [atm] to 1 [mol], whereas ∆G in [kcal mol−1] denotes the Gibbs free energy of the deprotonation according to the reaction
$$\mathrm{RH} \xrightarrow{\text{solvent}} \mathrm{R}^- + \mathrm{H}^+$$
The Gibbs free energy of the solvated proton G(H+)sol, is related to the solvation energy of the proton ∆G(H+)sol by the relation [21]
$$G(\mathrm{H}^+)_{sol} = \Delta G(\mathrm{H}^+)_{sol} + G(\mathrm{H}^+)_{gas} \tag{3}$$
in which G(H+)gas = −6.2883 [kcal mol−1] [22] is the free energy of a proton in the gas phase under a pressure of 1 [atm]. From (1) and (2) it follows that Equation (1) is useful for evaluating pKa using the calculated (theoretical) G(H+)sol value in the 1 [atm] reference state, G(H+)sol,atm, whereas G(H+)sol,mol = G(H+)sol,atm − 1.8942 [kcal mol−1] can be compared to the experimental data. To estimate the Gibbs free energies of the solvated species involved in the deprotonation process, quantum-chemical computational methods at different levels of theory have been employed [5,6]. The calculations performed for the water medium revealed that the application of Equations (1)–(3) in the calculation of pKa presents three primary challenges:
(i) The accuracy of pKa reproduction is contingent upon the computational method employed to determine G(RH) and G(R); it varies within an MAE (mean absolute error) range of 0.51–2.86 pKa for monoprotic acids, depending on the method used in the calculations [5].
(ii) The value of ∆G(H+)aq, as determined by experimental [23], theoretical [24,25,26,27,28,29,30], and mixed [30] approaches, ranges from −244.9 to −266.7 [kcal mol−1], which significantly influences the predicted pKa values.
(iii) Equation (1) was effectively utilized to reproduce pKa parameters for monoprotic [5,6,7] compounds; however, its application to polyprotic molecules undergoing multi-step proton detachment yielded [4] results that deviated significantly from the experimental ones.
To address the first challenge, calculations are conducted using a range of available semi-empirical and ab initio methods, incorporating various basis sets and solvation models to identify those that most accurately reproduce experimental pKa values [5,6]. Additionally, a linear similarity relation [5,31,32,33] such as
$$\mathrm{p}K_a = a\cdot\mathrm{p}K_a(\mathrm{the}) + b \tag{4}$$
is employed to enhance the accuracy of pKa reproduction by adjusting the a and b parameters to the set of experimental pKa = pKa(exp) data and theoretical pKa(the) values. This approach improves the accuracy of pKa prediction from MAE = 0.51–2.86 to MAE = 0.30–0.67 in pKa units for the 22 monoprotic acids analyzed [5]. In the case of thiols, the MAE = 0.57–0.96 [32], whereas for phenols, carboxylic acids, and amines, the MAE values are lower than 0.7 pKa units [31]. However, these errors still exceed the experimental error range of 0.01–0.1 pKa units obtained, e.g., for the multi-step deprotonation of GA [12,13]. To concurrently address the first and second challenges, Dutra et al. [6] proposed an approach wherein the average <G(H+)sol> of the G(H+)sol values could be calculated using the equation below
$$G(\mathrm{H}^+)_{sol} = 1.3642\,\mathrm{p}K_a(\mathrm{exp}) + G(\mathrm{RH})_{sol} - G(\mathrm{R}^-)_{sol} + 1.8942 \tag{5}$$
depending on the experimental pKa(exp) and the theoretical G(RH)sol and G(R−)sol values obtained using diverse calculation methods for a specific class of compounds with similar chemical structures. This average value is subsequently incorporated into Equation (1) to form Equation (6):
$$\mathrm{p}K_a = \frac{G(\mathrm{R}^-)_{sol} - G(\mathrm{RH})_{sol} + \langle G(\mathrm{H}^+)_{sol}\rangle - 1.8942}{1.3642} \tag{6}$$
which can be used to calculate the pKa values for all compounds within the studied class. Calculations performed using this scheme for 22 monocarboxylic acids in an aqueous medium revealed [6] that the combination of density functional theory (DFT) with the local spin density approximation (LSDA) functional provided superior performance in nearly all calculations, achieving accuracy levels comparable to those obtained using the G4CEP (Gaussian 4 Compact Effective Pseudopotential) method employed by Silva and Custodio [5]. In that study [5], the calculations conducted for 22 monoprotic acids using the G4CEP method, in conjunction with (i) the SMD solvation model, (ii) SMD + 1 H2O (with one explicit water molecule), and (iii) SMD + 1 H2O with a linear correction based on Equation (4) and experimental pKa(exp) values, yielded mean absolute errors of 0.83, 0.51, and 0.30 pKa units, respectively. A comparative analysis with the results obtained by DFT at the B3LYP and BMK theory levels revealed significant errors reaching up to 2 pKa units. In contrast, the largest deviations observed for the G4CEP method rarely exceeded 1 pKa unit [5].
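The averaging scheme of Equations (5) and (6) can be sketched as follows. This is an illustrative sketch only: the function names are hypothetical, and the three input pairs are invented placeholder numbers, not data from refs. [5,6] or from this work.

```python
RT_LN10 = 1.3642  # RT ln(10) [kcal mol^-1] at 298.15 K, from the text
RT_LNV = 1.8942   # RT ln(V) [kcal mol^-1], from the text


def proton_energy_from_experiment(pka_exp, delta_g):
    """Equation (5): G(H+)sol implied by one experimental pKa,
    with delta_g = G(R-)sol - G(RH)sol in kcal mol^-1."""
    return RT_LN10 * pka_exp - delta_g + RT_LNV


def averaged_pka(pka_exp_list, delta_g_list):
    """Average Equation (5) over a class of compounds, then apply Equation (6)."""
    g_values = [proton_energy_from_experiment(p, dg)
                for p, dg in zip(pka_exp_list, delta_g_list)]
    g_mean = sum(g_values) / len(g_values)
    pkas = [(dg + g_mean - RT_LNV) / RT_LN10 for dg in delta_g_list]
    return g_mean, pkas


# Hypothetical inputs for illustration only (not data from the text)
pka_exp = [3.75, 4.76, 4.87]
delta_g = [270.10, 271.90, 271.60]  # kcal mol^-1

g_mean, pka_calc = averaged_pka(pka_exp, delta_g)

if __name__ == "__main__":
    print(f"<G(H+)sol> = {g_mean:.3f} kcal/mol")
    print("pKa from Equation (6):", [round(p, 2) for p in pka_calc])
```

By construction, averaging forces the mean of the calculated pKa values to coincide with the mean of the experimental ones, so the residual errors reflect only how well the computed ∆G values track the spread of the experimental data.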
Based on the results reported above, the primary objectives of the present study were fourfold:
(i) To employ a modified version of the method proposed by Dutra et al. [6] to determine the pKa1,2,3,4 parameters for the four-stage deprotonation of GA, within the bounds of experimental errors of 0.01–0.40 pKa units.
(ii) To calculate the Gibbs free energy of the hydrated proton G(H+)aq and proton hydration ∆G(H+)aq by fitting this parameter via the improved version of Equation (4) to the experimental pKa values reported in [11,12,13,14].
(iii) To select the experimental data that most accurately represent the pKa values for GA within the framework of the proposed approach.
(iv) To demonstrate that the accurate reproduction of the pKa of GA, within the range of experimental errors, can be achieved using the DFT method and LSDA/QZVP level of theory combined with the SMD solvation model, without the explicit inclusion of a water molecule.

2. Results and Discussion

To achieve the primary objectives of this work, the geometries of the neutral GA molecule and its four anionic forms (Figure 2) were optimized in the water medium employing the DFT method at different theory levels (LSDA, M062X, B3LYP) with the QZVP basis set and the SMD solvation model; the reasons for these choices are set out in the Materials and Methods section. The optimized geometries of the neutral GA molecule and its anionic forms generated at the LSDA/QZVP theory level are presented in Figure 3, while Table 1 and Table 2 report the corresponding Gibbs free energy values and their differences used in the determination of the pKa1,2,3,4 parameters.
In this study, we examined two potential scenarios (Figure 4) for the deprotonation of GA, involving the dianion A, which possesses a higher energy, G(GA−2) = −642.539452 [Ha], than its conformer B, G(GA−2) = −642.544071 [Ha]. The energy disparity arises from a conformational change that contributes an additional 2.90 [kcal mol−1]. The rotation barrier for an A → B transition is equal to 2.93 [kcal mol−1], whereas for B → A it is 5.61 [kcal mol−1]. These results were calculated using the free energy of the transition state [AB], G = −642.534128 [Ha]. Despite the energetic difference between dianions A and B, the reaction pathways for both molecules are isoenergetic, as ΔG2 + ΔG3 = ΔG’2 + ΔG’3. However, incorporating the low-energy dianion B into the calculations yields ΔG’2 and ΔG’3 values that fail to accurately reproduce the pKa2 and pKa3 descriptors. For this reason, all calculations were carried out using scheme A.
The calculations performed at the LSDA/QZVP theory level using the SMD solvation model indicated that the method based on Equations (5) and (6) yields an unsatisfactory reproduction of the experimental pKa values of GA. For example, set IV provides the mean value <G(H+)aq> = −263(5) [kcal mol−1], which has a substantial standard error, and produces pKa = 1.77, 4.09, 13.16, 22.50. Attempts to refine these results using Equation (4) also failed to generate satisfactory agreement with the experimental pKa(exp) values of 4.16, 8.55, 11.40, and 12.80. For the adjusted parameters a = 0.362(125) and b = 5.47(1.65), obtained with a standard error of estimation (SE) of 2.0479 and a coefficient of determination (R2) of 0.8077, the corrected pKa values were equal to 6.11, 6.95, 10.24, 13.62. This discrepancy suggests that the relationship between the theoretical and experimental pKa values is more complex than the linear relation (4) utilized so far. Accordingly, in this study, the linear similarity relation (4) was replaced with the non-linear self-similar formula shown below
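$$\mathrm{p}K_{aN} = a\cdot\mathrm{p}K_a(\mathrm{the})^{c} = a\left[\frac{\Delta G_N + G(\mathrm{H}^+)_{aq} - 1.8942}{1.3642}\right]^{c}, \qquad \Delta G_N = G(\mathrm{GA}^{-N})_{aq} - G(\mathrm{GA}^{-(N-1)})_{aq}, \qquad N = 1, 2, 3, 4 \tag{7}$$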
which simplifies to the linear relation (4) with b = 0 when c approaches 1. The parameters a, c, and G(H+)aq in Equation (7) fitted to the four sets of available experimental GA data have subsequently been employed to reproduce the pKa1,2,3,4 parameters characterizing the four-step deprotonation of GA. The results of these calculations are reported in Table 2 and Table 3.
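The two transformations can be compared with a short sketch. The linear part re-applies the parameters a = 0.362 and b = 5.47 quoted above to the uncorrected values 1.77, 4.09, 13.16, 22.50; the self-similar part uses the set IV parameters a = 7.271 and c = 0.1858 reported in this work. The function names and the inverse mapping are assumptions added for illustration, and the "implied" pKa(the) values are back-calculated here rather than taken from the tables.

```python
def linear_correction(pka_the, a, b):
    """Equation (4): pKa = a * pKa(the) + b."""
    return a * pka_the + b


def self_similar(pka_the, a, c):
    """Equation (7): pKa = a * pKa(the)**c, defined for pKa(the) > 0."""
    if pka_the <= 0:
        raise ValueError("pKa(the) must be positive for a non-integer exponent")
    return a * pka_the ** c


def self_similar_inverse(pka, a, c):
    """pKa(the) that Equation (7) maps onto a given pKa (hypothetical helper)."""
    return (pka / a) ** (1.0 / c)


# Uncorrected LSDA/QZVP values and linear-fit parameters quoted in the text
pka_the = [1.77, 4.09, 13.16, 22.50]
linear = [linear_correction(p, a=0.362, b=5.47) for p in pka_the]

# Set IV and its fitted a, c; the round trip below is an illustration only
a_fit, c_fit = 7.271, 0.1858
set_iv = [4.16, 8.55, 11.40, 12.80]
implied = [self_similar_inverse(p, a_fit, c_fit) for p in set_iv]
round_trip = [self_similar(x, a_fit, c_fit) for x in implied]

if __name__ == "__main__":
    print("linear-corrected pKa:", [round(p, 2) for p in linear])
    print("pKa(the) implied by Equation (7):", [round(x, 3) for x in implied])
```

With an exponent well below 1, Equation (7) strongly compresses large theoretical pKa values, which a single slope and intercept cannot do across all four deprotonation steps.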
The results obtained reveal that the approach proposed enables both the precise determination of the Gibbs free energy of the hydrated proton G(H+)aq = −265.7155(75) [kcal mol−1] and the accurate reproduction of the experimental pKa values by taking advantage of the LSDA/QZVP theory level and set IV of the datapoints [13]. The results fall within the range of experimental errors, with MAE = 0.019 in pKa units. As experimental data are typically represented by the Gibbs free energy of the proton hydration, this parameter can also be determined through a fitting procedure utilizing Equations (3) and (7). The findings presented in Table 4 demonstrate that parameters ∆G(H+)aq = −259.4272(75) [kcal mol−1], a = 7.271(57), and c = 0.1858(31) reproduce set IV [13] of pKa with R2 = 1.0000 and SE = 0.0454.
The value ∆G(H+)aq = −259.4272(75) [kcal mol−1] calculated using the LSDA/QZVP theory level and set IV is consistent with the experimental value of −259.5 [kcal mol−1] obtained by Lim et al. [23]. This value corresponds to the average, −4.5 [V], of five measurements of the standard hydrogen potential (−4.43, −4.43, −4.44, −4.48, −4.73 [V]), which allows for the determination of a range of the experimental values of ∆G(H+)aq from −254 to −261 [kcal mol−1] [23]. The reproduction of the experimental pKa values of set IV for GA using relationship (7) proved to be more accurate than that obtained using linear Equation (4). The parameters <∆G(H+)aq> = −258.6(7.4) [kcal mol−1], a = 0.36(13), and b = 5.4(1.7) reproduce set IV of pKa with R2 = 0.8079 and SE = 0.7118, yielding corrected pKa values of 5.96, 6.25, 7.38, 8.55. These results clearly indicate that, for the multi-step deprotonation of GA, the linear transformation (4) should be replaced by the self-similar formula given in Equation (7). Furthermore, the plots of the relationship linking pKaN and ΔGN (Figure 5) generated for sets I, II, III, and IV highlight the strongly non-linear nature of this relation. The graphs of pKaN as a function of ∆GN have qualitatively the same form for ∆GN calculated at the LSDA, M062X, and B3LYP/QZVP theory levels.
Previous works [5,6] demonstrated that the accuracy of reproducing the experimental pKa values for monoprotic acids can be enhanced by using an SMD solvation model that incorporates one explicit water molecule. Given that the model proposed in this work effectively reproduces set IV, achieving a maximum R2 of 1.0000 and a satisfactory MAE = 0.019 and NMAE = 0.463 in pKa units, the inclusion of the water molecule should not affect the accuracy of the calculations. To substantiate this hypothesis, the calculations were performed utilizing the scheme proposed by Kelly et al. [21], which generates a = 6.15(12), c = 0.2648(79), and ∆G(H+)aq = −260.166(44) [kcal mol−1] from ∆G1 = 266.4521, ∆G2 = 268.6051, ∆G3 = 278.0447, and ∆G4 = 299.3631 [kcal mol−1], values that reproduce the pKa of set IV with SE = 0.073, MAE = 0.030, and NMAE = 0.7651 in pKa units. These results exhibit a lower degree of accuracy compared to those obtained without the inclusion of the water molecule in the calculations at the LSDA/QZVP theory level. To investigate the impact of dispersion effects on the pKa and ∆G(H+)aq parameters, calculations were performed employing the QZVP basis set, the SMD solvation model, and (i) the wB97XD functional developed by Head-Gordon and collaborators [34], which incorporates Grimme’s D2 dispersion model [34], and (ii) the B3LYP functional combined with the D3 dispersion model. Such a combination has been recommended in [35]. The computational results, summarized in Table S2, indicate that dispersion effects have a negligible influence on the deprotonation of GA. It is noteworthy, however, that the value of the parameter ∆G(H+)aq = −264.10(23) [kcal mol−1] is consistent with ∆G(H+)aq = −264.29 [kcal mol−1] reported by Zhan and Dixon [27].
To evaluate the impact of basis sets on the accuracy of the reproduction of the pKa and ∆G(H+)aq parameters for GA, calculations were conducted at the LSDA level with the 6-311++G(d,p), aug-cc-pVQZ, and QZVP basis sets in conjunction with the SMD solvation model. The results obtained are presented in Table 5. The findings indicate that employing the QZVP basis set is not essential for accurately calculating the pKa parameters in the scheme proposed, as the less extensive 6-311++G(d,p) basis set also yields results consistent with the experimental data. However, an analysis of the ∆G(H+)aq parameter values reveals that the use of larger, more comprehensive basis sets is necessary to achieve results that align closely with the experimental value of ∆G(H+)aq = −259.5 [kcal mol−1] reported by Lim et al. [23].

3. Materials and Methods

For monocarboxylic acids, the best reproduction of the pKa values was achieved [5,6] using the composite G4CEP and DFT methods in conjunction with the LSDA [36], M06-2X [37], and B3LYP [38,39] functionals, the SMD solvation model, and Dunning’s [40] basis sets aug-cc-pVDZ and aug-cc-pVTZ. In light of the findings presented in [5,6], in this work, calculations were conducted utilizing the DFT method implemented in the Gaussian 16 software [40] in conjunction with the LSDA, M062X, and B3LYP functionals. To identify a basis set capable of accurately reproducing the Gibbs energy for GA and its anionic forms, preliminary calculations were performed on the energy of GA in an aqueous environment, employing the DFT/LSDA method and the SMD solvation model with the basis sets cc-pVQZ, aug-cc-pVQZ [40], and QZVP [41]. The resulting energy values (in [Ha]: E1(cc-pVQZ) = −643.496286; E2(aug-cc-pVQZ) = −643.501176; E3(QZVP) = −643.511057) and their corresponding differences (in [kcal mol−1]: ∆E12 = 3.0685; ∆E23 = 6.2004; ∆E13 = 9.2689) guided the selection of the quadruple zeta valence polarized (QZVP) basis set [41] for subsequent calculations, as it yielded the lowest energy value for GA. A similar situation was observed for the M062X and B3LYP functionals; consequently, the DFT method at the LSDA, M062X, and B3LYP/QZVP level of theory, combined with the SMD solvation model, was employed for the primary calculations. The input structures were constructed using the GaussView 6.1 graphical interface [42], and calculations were carried out using the Gaussian 16 A software [42,43].
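The energy differences quoted above can be checked with a few lines of arithmetic. This is a minimal sketch; the Hartree to kcal mol−1 conversion factor and the helper name are assumptions, since the text does not state which factor was used.

```python
HARTREE_TO_KCAL = 627.5095  # assumed conversion factor [kcal mol^-1 per Ha]

# SMD/LSDA energies of GA in water [Ha], as quoted in the text
energies = {
    "cc-pVQZ": -643.496286,
    "aug-cc-pVQZ": -643.501176,
    "QZVP": -643.511057,
}


def energy_gap_kcal(basis_a, basis_b):
    """E(basis_a) - E(basis_b), converted to kcal mol^-1."""
    return (energies[basis_a] - energies[basis_b]) * HARTREE_TO_KCAL


dE12 = energy_gap_kcal("cc-pVQZ", "aug-cc-pVQZ")
dE23 = energy_gap_kcal("aug-cc-pVQZ", "QZVP")
dE13 = energy_gap_kcal("cc-pVQZ", "QZVP")
lowest = min(energies, key=energies.get)  # basis set giving the lowest energy

if __name__ == "__main__":
    print(f"dE12 = {dE12:.4f}, dE23 = {dE23:.4f}, dE13 = {dE13:.4f} kcal/mol")
    print("lowest-energy basis set:", lowest)
```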
Since the computations for the compounds studied at the LSDA, M062X, and B3LYP/QZVP levels are time consuming and their convergence depends on the initial geometry, the optimization was divided into two stages: (i) the determination of an approximate geometry at the LSDA, M062X, and B3LYP/cc-pVQZ theory levels in the gas phase, and (ii) the final calculation at the LSDA, M062X, and B3LYP/QZVP theory levels in the water medium using the SMD solvation model. Thus, optimization was performed at each stage, with the geometry determined at the lower level serving as the starting point for optimization at the higher level. The optimized structures of the compounds considered in this work are displayed in Figure 3. The parameters a, c, G(H+)aq, and ∆G(H+)aq were fitted to the four sets of experimental pKa values for GA, including set I [4.50, 7.05, 8.75, 10.25] [11], set II [4.4(1), 8.8(1), 10.0(1), 11.4(1)] [12], set III [4.44, 8.54, 10.05, 11.3] [14], and set IV [4.16(2), 8.55(1), 11.4(1), 12.8(4)] [13]. The fits were performed with the SigmaPlot 11 software. The results, together with the goodness-of-fit indicators (R2, the coefficient of determination, and SE, the standard error of estimation), are presented in Table 2 and Table 4. The fitted parameters were utilized to calculate the pKaN values for GA, as presented in Table 3 and Table 4, along with the mean absolute errors (MAEs) and the normalized mean absolute errors (NMAEs) associated with the reproduction of the experimental pKaN(exp) data, defined as follows
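$$\mathrm{MAE} = \frac{1}{4}\sum_{N=1}^{4}\left|\mathrm{p}K_{aN} - \mathrm{p}K_{aN}(\mathrm{exp})\right|, \qquad \mathrm{NMAE} = \frac{1}{4}\sum_{N=1}^{4}\frac{\left|\mathrm{p}K_{aN} - \mathrm{p}K_{aN}(\mathrm{exp})\right|}{u_N}$$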
Here, uN denotes the experimental errors in the measurements provided for datasets II (uN = 0.1) [12] and IV (u1 = 0.02, u2 = 0.01, u3 = 0.10, u4 = 0.40) [13]. A value of NMAE ≤ 1 indicates the reproduction of experimental data within the accuracy of the measurement error.
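The two error measures can be sketched as below. The function names are hypothetical; as a worked example, the sketch is applied to the linearly corrected pKa values (6.11, 6.95, 10.24, 13.62) quoted in the Results section against set IV. The resulting numbers are derived here for illustration and are not reported in this work.

```python
def mae(pka_calc, pka_exp):
    """Mean absolute error over the deprotonation steps."""
    return sum(abs(c - e) for c, e in zip(pka_calc, pka_exp)) / len(pka_exp)


def nmae(pka_calc, pka_exp, u):
    """Mean absolute error with each term scaled by its experimental error u_N."""
    return sum(abs(c - e) / un
               for c, e, un in zip(pka_calc, pka_exp, u)) / len(pka_exp)


# Set IV and its experimental errors, as quoted in the text
pka_exp = [4.16, 8.55, 11.40, 12.80]
u = [0.02, 0.01, 0.10, 0.40]

# Linearly corrected values from Equation (4), as quoted in the Results section
pka_linear = [6.11, 6.95, 10.24, 13.62]

mae_linear = mae(pka_linear, pka_exp)
nmae_linear = nmae(pka_linear, pka_exp, u)

if __name__ == "__main__":
    print(f"MAE  = {mae_linear:.4f}")
    print(f"NMAE = {nmae_linear:.4f}")
```

Because each term is divided by uN, the NMAE is dominated by the most precisely measured steps (here pKa1 and pKa2), so a model can have a modest MAE and still fall far outside the NMAE ≤ 1 criterion.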

4. Conclusions

The proposed method for determining the pKa parameters of GA enables the simultaneous validation of both the method used and the experimental data employed in the calculations. Among the four datasets analyzed, the non-linear model demonstrated the best agreement with dataset IV [13], which exhibited the highest statistical consistency when reproduced using the DFT method at the LSDA/QZVP theory level and the SMD solvation model. The parameters presented in Table 3 reproduce experimental dataset IV with a standard error, SE, of 0.0454 and a coefficient of determination, R2, of 1.0000, achieving reproduction of the experimental pKa values [13] for GA within the limits of experimental errors, as NMAE = 0.463 pKa units. These findings confirm the validity of the self-similar Formula (7) in particular and the overall calculation methodology in general. Additionally, the reproduction of the experimental pKa values for GA using relationship (7) proved to be more accurate than that obtained with Equation (4). The results obtained indicate that, for the multi-step deprotonation of GA, the linear transformation (4) should be replaced by the self-similar formula given in Equation (7). The application of Equation (7) in conjunction with the ΔGN values calculated using the DFT method at the LSDA/QZVP theory level, alongside the SMD solvation model, is sufficient to achieve a satisfactory reproduction of the measured pKa values of GA. Thus, the utilization of the computationally time-consuming G4CEP method and the SMD model with the inclusion of an additional water molecule, as suggested in works [5,6], is deemed unnecessary for this objective. It is acknowledged that the proposed approach represents a significant simplification of the real deprotonation process due to the complex reactions that occur between GA and water.
These reactions result in substantial modifications to the structure of GA [44] and its anionic forms, which in turn have a pronounced effect on the kinetics and thermodynamics of GA deprotonation. The proposed model is not derived from first principles but constitutes a quasi-phenomenological framework, wherein the semi-empirical parameters a and c encapsulate all hidden variables that are not explicitly incorporated into the analysis. A first-principles model should, at the molecular level, account for all possible structures of GA, including polymorphs, rotamers, and tautomers and their anionic forms, as well as their interactions with solvent molecules and the formation of complex polymolecular systems. Further research in this direction will be essential.

Supplementary Materials

The following supporting information can be downloaded at: https://www.mdpi.com/article/10.3390/molecules30030742/s1, Table S1. The Gibbs and zero-point energies [Ha] of neutral GA0 molecule and its anionic forms GA−N N = 1,2,3,4 calculated in the water medium at the LSDA, M062X, B3LYP/QZVP theory levels, using the SMD solvation model. Table S2. The effect of dispersion on the theoretical reproduction of proton hydration energy ∆G(H+)aq and pKaN parameter values. The calculations were performed in a water medium at the wB97XD/D2//QZVP, B3LYP/D3/QZVP theory levels, using the SMD solvation model. The theoretical value of ∆G(H+)aq = −264.29 [kcal mol−1] is reported by Zhan and Dixon [27]. For comparison, the results obtained at the B3LYP/QZVP level of theory without taking the dispersion effect into account are shown.

Funding

This research received no external funding.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

All new data created are reported in this work.

Acknowledgments

This research was supported by the computing grant PL0293-01 from Supercomputing and Networking Center (PCSS) in Poznań, Poland. I would like to thank Piotr Kopta from PCSS for technical support and Julian Molski for his constructive comments.

Conflicts of Interest

The author declares no conflicts of interest.

References

  1. Gregory, M.T.; Gao, Y.; Cui, Q.; Yang, W. Multiple deprotonation paths of the nucleophile 3′-OH in the DNA synthesis reaction. Proc. Natl. Acad. Sci. USA 2021, 118, e2103990118.
  2. Alongi, K.S.; Shields, G.S. Theoretical Calculations of Acid Dissociation Constants: A Review Article. Annu. Rep. Comput. Chem. 2010, 6, 113–138.
  3. Marino, T.; Galano, A.; Russo, N. Radical scavenging ability of gallic acid toward OH and OOH radicals. Reaction mechanism and rate constants from the density functional theory. J. Phys. Chem. B 2014, 118, 10380–10389.
  4. Molski, M. Theoretical study on the radical scavenging activity of gallic acid. Heliyon 2023, 9, e12806.
  5. De Souza Silva, C.d.S.; Custodio, R. Assessment of pKa determination for monocarboxylic acids with an accurate theoretical composite method: G4CEP. J. Phys. Chem. A 2019, 123, 8314–8320.
  6. Dutra, F.R.; Silva, C.d.S.; Custodio, R. On the accuracy of the direct method to calculate pKa from electronic structure calculations. J. Phys. Chem. A 2021, 125, 65–73.
  7. Walton-Raaby, M.; Floen, T.; García-Díez, G.; Mora-Diez, N. Calculating the aqueous pKa of phenols: Predictions for antioxidants and cannabinoids. Antioxidants 2023, 12, 1420.
  8. Morency, M.; Néron, S.; Iftimie, R.; Wuest, J.D. Predicting pKa values of quinols and related aromatic compounds with multiple OH groups. J. Org. Chem. 2021, 86, 14444–14460.
  9. Badhani, B.; Kakkar, R. Influence of intrinsic and extrinsic factors on the antiradical activity of gallic acid: A theoretical study. Struct. Chem. 2018, 29, 359–373.
  10. Masoud, M.S.; Ali, A.E.; Haggag, S.S.; Nasr, N.M. Spectroscopic studies on gallic acid and its azo derivatives and their iron(III) complexes. Spectrochim. Acta A Mol. Biomol. Spectrosc. 2014, 120, 505–511.
  11. Agrawal, M.D.; Bhandari, C.S.; Dixit, M.K.; Sogani, N.C. 3,4,5-Trihydroxybenzoesäure als Chelatbildner, 1. Mitt.: Praseodym. Monatsh. Chem. 1976, 713, 75–82.
  12. Huguenin, J.; Hamady, S.O.S.; Bourson, P. Monitoring deprotonation of gallic acid by Raman spectroscopy. J. Raman Spectrosc. 2015, 46, 1062–1066.
  13. Kipton, H.; Powell, J.; Taylor, M.C. Interactions of iron(II) and iron(III) with gallic acid and its homologues: A potentiometric and spectrophotometric study. Aust. J. Chem. 1982, 35, 739–756.
  14. Loginova, L.F.; Medyntsev, V.V.; Khomutov, B.I. Acidic properties of gallic acid and stability constants of iron complexes. Zh. Obshch. Him. 1972, 42, 739–742.
  15. Cossi, M.; Rega, N.; Scalmani, G.; Barone, V. Energies, structures, and electronic properties of molecules in solution with the C-PCM solvation model. J. Comput. Chem. 2003, 24, 669–681.
  16. Tomasi, J.; Mennucci, B.; Cancès, E. The IEF version of the PCM solvation method: An overview of a new method addressed to study molecular solutes at the QM ab initio level. J. Mol. Struct. THEOCHEM 1999, 464, 211–226.
  17. Marenich, A.V.; Cramer, C.J.; Truhlar, D.G. Universal Solvation Model Based on Solute Electron Density and on a Continuum Model of the Solvent Defined by the Bulk Dielectric Constant and Atomic Surface Tensions. J. Phys. Chem. B 2009, 113, 6378–6396.
  18. Xu, L.; Coote, M.L. Methods To Improve the Calculations of Solvation Model Density Solvation Free Energies and Associated Aqueous pKa Values: Comparison between Choosing an Optimal Theoretical Level, Solute Cavity Scaling, and Using Explicit Solvent Molecules. J. Phys. Chem. A 2019, 123, 7430–7438.
  19. Pezzola, S.; Venanzi, M.; Galloni, P.; Conte, V.; Sabuzi, F. Towards the “Eldorado” of pKa Determination: A Reliable and Rapid DFT Model. Molecules 2024, 29, 1255.
  20. Bryantsev, V.S.; Diallo, M.S.; Goddard, W.A. Calculation of solvation free energies of charged solutes using mixed cluster/continuum models. J. Phys. Chem. B 2008, 112, 9709–9719.
  21. Kelly, C.P.; Cramer, C.J.; Truhlar, D.G. Adding explicit solvent molecules to continuum solvent calculations for the calculation of aqueous acid dissociation constants. J. Phys. Chem. A 2006, 110, 2493–2499.
  22. Fifen, J.J.; Dhaouadi, Z.; Nsangou, M. Revision of the thermodynamics of the proton in gas phase. J. Phys. Chem. A 2014, 118, 11090–11097.
  23. Lim, C.; Bashford, D.; Karplus, M. Absolute pKa calculations with continuum dielectric methods. J. Phys. Chem. 1991, 95, 5610–5620.
  24. Tissandier, M.D.; Cowen, K.A.; Feng, W.Y.; Gundlach, E.; Cohen, M.H.; Earhart, A.D.; Coe, J.V.; Tuttle, T.R. The Proton’s Absolute Aqueous Enthalpy and Gibbs Free Energy of Solvation from Cluster-Ion Solvation Data. J. Phys. Chem. A 1998, 102, 7787–7794.
  25. Klots, C.E. Solubility of protons in water. J. Phys. Chem. 1981, 85, 3585–3588.
  26. Tawa, G.J.; Topol, I.A.; Burt, S.K.; Caldwell, R.A.; Rashin, A.A. Calculation of the aqueous solvation free energy of the proton. J. Chem. Phys. 1998, 109, 4852–4863.
  27. Zhan, C.-G.; Dixon, D.A. Absolute Hydration Free Energy of the Proton from First-Principles Electronic Structure Calculations. J. Phys. Chem. A 2001, 105, 11534–11540.
  28. Marković, Z.; Tošović, J.; Milenkovic, D.; Markovic, S. Revisiting the solvation enthalpies and free energies of the proton and electron in various solvents. Comput. Theor. Chem. 2016, 1077, 11–17.
  29. Rimarčík, J.; Lukeš, V.; Klein, E.; Ilčin, M. Study of the solvent effect on the enthalpies of homolytic and heterolytic N–H bond cleavage in p-phenylenediamine and tetracyano-p-phenylenediamine. J. Mol. Struct. THEOCHEM 2010, 952, 25–30. [Google Scholar] [CrossRef]
  30. Mejías, A.; Lago, S. Calculation of the absolute hydration enthalpy and free energy of H+ and OH−. J. Chem. Phys. 2000, 113, 7306–7316. [Google Scholar] [CrossRef]
  31. Galano, A.; Pérez-González, A.; Castañeda-Arriaga, R.; Muñoz-Rugeles, L.; Mendoza-Sarmiento, G.; Romero-Silva, A.; Ibarra-Escutia, A.; Rebollar-Zepeda, A.M.; León-Carmona, J.R.; Hernández-Olivares, M.A.; et al. Empirically Fitted Parameters for Calculating pKa Values with Small Deviations from Experiments Using a Simple Computational Strategy. J. Chem. Inf. Model. 2016, 56, 1714–1724. [Google Scholar] [CrossRef] [PubMed]
  32. Pérez-González, A.; Castañeda-Arriaga, R.; Verastegui, B.; Carreón-González, M.; Alvarez-Idaboy, J.R.; Galano, A. Estimation of empirically fitted parameters for calculating pK a values of thiols in a fast and reliable way. Theor. Chem. Acc. 2018, 137, 5. [Google Scholar] [CrossRef]
  33. Miguel, E.L.M.; Silva, P.L.; Pliego, J.R. Theoretical Prediction of pKa in Methanol: Testing SM8 and SMD Models for Carboxylic Acids, Phenols, and Amines. J. Phys. Chem. B 2014, 118, 5730–5739. [Google Scholar] [CrossRef] [PubMed]
  34. Chai, J.-D.; Head-Gordon, M. Long-range corrected hybrid density functionals with damped atom–atom dispersion corrections. Phys. Chem. Chem. Phys. 2008, 10, 6615–6620. [Google Scholar] [CrossRef]
  35. Schröder, H.; Hühnert, J.; Schwabe, T. Evaluation of DFT-D3 dispersion corrections for various structural benchmark sets. J. Chem. Phys. 2017, 146, 044115. [Google Scholar] [CrossRef]
  36. Parr, R.G.; Yang, W. Density-Functional Theory of Atoms and Molecules; Oxford University Press: New York, NY, USA; Oxford, UK, 1989. [Google Scholar]
  37. Zhao, Y.; Truhlar, D.G. The M06 suite of density functionals for main group thermochemistry, thermochemical kinetics, noncovalent interactions, excited states, and transition elements: Two new functionals and systematic testing of four M06-class functionals and 12 other functionals. Theor. Chem. Acc. 2008, 120, 215–241. [Google Scholar] [CrossRef]
  38. Becke, A.D. Density-functional thermochemistry. III. The role of exact exchange. J. Chem. Phys. 1993, 98, 5648–5652. [Google Scholar] [CrossRef]
  39. Lee, C.; Yang, W.; Parr, R.G. Development of the Colle-Salvetti correlation-energy formula into a functional of the electron density. Phys. Rev. B 1988, 37, 785–789. [Google Scholar] [CrossRef]
  40. Dunning, T.H. Gaussian basis sets for use in correlated molecular calculations. I. The atoms boron through neon and hydrogen. J. Chem. Phys. 1989, 90, 1007–1023. [Google Scholar] [CrossRef]
  41. Weigend, F.; Ahlrichs, R. Balanced basis sets of split valence, triple zeta valence and quadruple zeta valence quality for H to Rn: Design and assessment of accuracy. Phys. Chem. Chem. Phys. 2005, 7, 3297–3305. [Google Scholar] [CrossRef]
  42. Frisch, M.J.; Trucks, G.W.; Schlegel, H.B.; Scuseria, G.E.; Robb, M.A.; Cheeseman, J.R.; Scalmani, G.; Barone, V.; Petersson, G.A.; Nakatsuji; et al. Gaussian 16; Revision A.03; Gaussian, Inc.: Wallingford, CT, USA, 2016. [Google Scholar]
  43. Glendening, E.D.; Reed, A.E.; Carpenter, J.E.; Weinhold, F. NBO Version 3.1. 2017. [Google Scholar]
  44. Braun, D.E.; Bhardwaj, R.M.; Florence, A.J.; Tocher, D.A.; Price, S.L. Complex Polymorphic System of Gallic Acid—Five Monohydrates, Three Anhydrates, and over 20 Solvates. Cryst. Growth Des. 2013, 13, 19–23. [Google Scholar] [CrossRef]
Figure 1. Gallic acid, 3,4,5-trihydroxybenzoic acid.
Figure 2. Four-step deprotonation of gallic acid and related pKa parameters.
Figure 3. The optimized geometries of the neutral GA molecule and its anionic forms GA−N (N = 1, 2, 3, 4). The calculations were performed in a water medium at the LSDA/QZVP theory level, using the SMD solvation model. Bond lengths are shown.
Figure 4. Potential scenarios for the deprotonation of GA, involving dianions A and B. The Gibbs free energies (in kcal mol−1) for reaction pathway A are taken from Table S1 and Table 1, whereas ΔG′2 = 268.1668 and ΔG′3 = 285.9096 kcal mol−1 apply to pathway B. Both processes are isoenergetic, as ΔG2 + ΔG3 = ΔG′2 + ΔG′3.
Figure 5. Plots of the self-similar relationship (7) with ∆GN calculated at the LSDA/QZVP theory level, using the SMD model (water medium), for the four experimental datasets represented by the symbols I (●), II (□), III (×), and IV (♦).
Table 1. The differences ∆GN (kcal mol−1) in Gibbs free energies of the neutral GA0 molecule and its anionic forms GA−N (N = 1, 2, 3, 4), including the zero-point energy corrections from Table S1. The calculations were performed in a water medium at the LSDA, M062X, and B3LYP/QZVP theory levels, using the SMD solvation model.

| ∆GN (kcal mol−1) | Definition | LSDA | M062X | B3LYP |
|---|---|---|---|---|
| ∆G1 | G(GA−1)aq − G(GA0)aq | 267.6773 | 270.6210 | 272.7539 |
| ∆G2 | G(GA−2)aq − G(GA−1)aq | 270.8450 | 279.5730 | 280.5099 |
| ∆G3 | G(GA−3)aq − G(GA−2)aq | 283.2258 | 289.4136 | 290.8582 |
| ∆G4 | G(GA−4)aq − G(GA−3)aq | 295.9617 | 297.0059 | 298.7058 |
Table 2. The values of the parameters a, c (dimensionless), and G(H+)aq (kcal mol−1) fitted via Equation (7) to the four sets of experimental pKa for GA reported in [11,12,13,14]. The goodness-of-fit indicators, namely the standard error (SE) of estimation and the coefficient of determination R², are also given.

| Parameter | Set I, LSDA | Set I, M062X | Set I, B3LYP | Set II, LSDA | Set II, M062X | Set II, B3LYP |
|---|---|---|---|---|---|---|
| G(H+)aq | −265.47(32) | −62.1(3.2) | −266.6(2.5) | −265.775(20) | −268.12(65) | −270.51(44) |
| a | 5.87(59) | 2.02(80) | 2.84(94) | 7.87(64) | 5.36(48) | 5.90(96) |
| c | 0.177(38) | 0.51(11) | 0.41(10) | 0.114(33) | 0.247(66) | 0.216(61) |
| SE | 0.3861 | 0.2031 | 0.2630 | 0.4959 | 0.4020 | 0.4383 |
| R² | 0.9918 | 0.9977 | 0.9962 | 0.9910 | 0.9941 | 0.9930 |

| Parameter | Set III, LSDA | Set III, M062X | Set III, B3LYP | Set IV, LSDA | Set IV, M062X | Set IV, B3LYP |
|---|---|---|---|---|---|---|
| G(H+)aq | −265.764(26) | −267.78(51) | −270.29(38) | −265.7155(75) | −266.81(45) | −269.63(15) |
| a | 7.59(83) | 4.91(57) | 5.48(60) | 7.271(57) | 3.60(34) | 4.33(16) |
| c | 0.126(23) | 0.275(41) | 0.240(40) | 0.1858(31) | 0.420(31) | 0.364(13) |
| SE | 0.3327 | 0.2261 | 0.2666 | 0.0454 | 0.1453 | 0.0766 |
| R² | 0.9959 | 0.9981 | 0.9973 | 1.0000 | 0.9995 | 0.9994 |
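Once G(H+)aq is held fixed, the self-similar form pKa = a⋅pKa(the)^c from the abstract becomes a straight line in log-log coordinates, ln pKa = ln a + c⋅ln pKa(the), so a and c can be estimated by ordinary linear regression. The sketch below illustrates this for the LSDA/Set IV case. Equation (7) itself is not reproduced in this section, so the expression used for pKa(the), including the 1 atm to 1 M standard-state term RT ln(24.46) and T = 298.15 K, is an assumption chosen because it matches the tabulated values; the function names are hypothetical, and the unweighted log-space fit is not necessarily the fitting procedure used for Table 2.

```python
import math

R_KCAL = 1.987204e-3               # gas constant, kcal mol^-1 K^-1
T = 298.15                         # temperature in K (assumed)
RT = R_KCAL * T
STD_STATE = RT * math.log(24.46)   # assumed 1 atm -> 1 M correction (~1.89 kcal/mol)

def pka_theoretical(delta_g, g_proton_aq):
    """Assumed form of pKa(the): deprotonation free energy over RT ln 10."""
    return (delta_g + g_proton_aq - STD_STATE) / (RT * math.log(10.0))

def fit_log_log(x, y):
    """Least-squares fit of ln y = ln a + c ln x; returns (a, c)."""
    lx = [math.log(v) for v in x]
    ly = [math.log(v) for v in y]
    n = len(lx)
    mx, my = sum(lx) / n, sum(ly) / n
    sxx = sum((u - mx) ** 2 for u in lx)
    sxy = sum((u - mx) * (v - my) for u, v in zip(lx, ly))
    c = sxy / sxx
    a = math.exp(my - c * mx)
    return a, c

# Table 1, LSDA column (kcal/mol), and experimental Set IV pKa values
DELTA_G_LSDA = [267.6773, 270.8450, 283.2258, 295.9617]
PKA_EXP_SET_IV = [4.16, 8.55, 11.40, 12.80]
G_PROTON_AQ = -265.7155            # Table 2, Set IV, LSDA (held fixed here)

x = [pka_theoretical(dg, G_PROTON_AQ) for dg in DELTA_G_LSDA]
a_fit, c_fit = fit_log_log(x, PKA_EXP_SET_IV)
print(f"a = {a_fit:.3f}, c = {c_fit:.4f}")
```

Under these assumptions the regression lands close to the Set IV, LSDA entries of Table 2 (a = 7.271, c = 0.1858). Fitting G(H+)aq as well requires a genuinely non-linear optimizer, which this sketch does not attempt.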
Table 3. The values of pKa1,2,3,4 reproduced by Equation (7) with parameters from Table 1 and Table 2, evaluated for the four sets of experimental pKa data reported in [11,12,13,14]. The mean absolute error (MAE) of the data reproduction is also given.

| Parameter | Set I, pKa (exp) | LSDA | M062X | B3LYP | Set II, pKa (exp) | LSDA | M062X | B3LYP |
|---|---|---|---|---|---|---|---|---|
| pKa1 | 4.50 | 4.52 | 4.52 | 4.52 | 4.4(1) | 4.61 | 4.40 | 4.40 |
| pKa2 | 7.05 | 6.92 | 6.95 | 6.93 | 8.8(1) | 8.67 | 8.67 | 8.67 |
| pKa3 | 8.75 | 9.06 | 8.91 | 8.96 | 10.0(1) | 10.40 | 10.32 | 10.35 |
| pKa4 | 10.25 | 10.05 | 10.17 | 10.14 | 11.4(1) | 11.13 | 11.20 | 11.17 |
| MAE | | 0.165 | 0.090 | 0.115 | | 0.253 | 0.163 | 0.178 |

| Parameter | Set III, pKa (exp) | LSDA | M062X | B3LYP | Set IV, pKa (exp) | LSDA | M062X | B3LYP |
|---|---|---|---|---|---|---|---|---|
| pKa1 | 4.44 | 4.55 | 4.44 | 4.44 | 4.16(2) | 4.161 | 4.155 | 4.168 |
| pKa2 | 8.54 | 8.45 | 8.46 | 8.46 | 8.55(1) | 8.536 | 8.607 | 8.600 |
| pKa3 | 10.05 | 10.32 | 10.23 | 10.26 | 11.40(10) | 11.44 | 11.28 | 11.37 |
| pKa4 | 11.30 | 11.12 | 11.19 | 11.16 | 12.80(40) | 12.78 | 12.87 | 12.87 |
| MAE | | 0.163 | 0.118 | 0.108 | | 0.019 | 0.063 | 0.060 |
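The Set IV columns of Table 3 can be approximately regenerated from the ∆GN values of Table 1 and the fitted parameters of Table 2. The following minimal sketch does this for all three functionals. Equation (7) is not reproduced in this section, so the form used for pKa(the), with T = 298.15 K and a standard-state term RT ln(24.46), is an assumption inferred from the tabulated numbers; the function and variable names are hypothetical.

```python
import math

R_KCAL = 1.987204e-3               # gas constant, kcal mol^-1 K^-1
T = 298.15                         # temperature in K (assumed)
RT = R_KCAL * T
STD_STATE = RT * math.log(24.46)   # assumed 1 atm -> 1 M correction (~1.89 kcal/mol)

# Table 1: Gibbs free energy differences dG1..dG4 (kcal/mol)
DELTA_G = {
    "LSDA":  [267.6773, 270.8450, 283.2258, 295.9617],
    "M062X": [270.6210, 279.5730, 289.4136, 297.0059],
    "B3LYP": [272.7539, 280.5099, 290.8582, 298.7058],
}

# Table 2, Set IV: (G(H+)aq in kcal/mol, a, c)
PARAMS_SET_IV = {
    "LSDA":  (-265.7155, 7.271, 0.1858),
    "M062X": (-266.81, 3.60, 0.420),
    "B3LYP": (-269.63, 4.33, 0.364),
}

def pka_self_similar(delta_g, g_proton_aq, a, c):
    """pKa = a * pKa(the)**c, with an assumed expression for pKa(the)."""
    pka_the = (delta_g + g_proton_aq - STD_STATE) / (RT * math.log(10.0))
    return a * pka_the ** c

pka = {
    method: [pka_self_similar(dg, *PARAMS_SET_IV[method]) for dg in DELTA_G[method]]
    for method in DELTA_G
}
for method, values in pka.items():
    print(method, [round(v, 3) for v in values])
```

Because pKa(the) for the first step is a small difference of large numbers, the result is sensitive to the constants used, so agreement with the table should be expected only to within rounding of the published parameters.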
Table 4. The values of the parameters a, c (dimensionless), and ∆G(H+)aq (kcal mol−1) fitted to the four sets of experimental pKa for GA via Equation (7), including Equation (3), with ∆GN calculated at the LSDA/QZVP theory level using the SMD solvation model and a water medium. The goodness-of-fit indicators, namely the standard error (SE) of estimation and the coefficient of determination (R²), are given, together with the mean absolute error (MAE) and the normalized mean absolute error (NMAE, defined in Equation (8)) of the reproduced pKaN parameters.

| Parameter | Set I | Set II | Set III | Set IV |
|---|---|---|---|---|
| ∆G(H+)aq | −259.19(32) | −259.486(20) | −259.476(26) | −259.4272(75) |
| a | 5.87(59) | 7.871(39) | 7.59(43) | 7.271(57) |
| c | 0.177(38) | 0.114(33) | 0.126(23) | 0.1858(31) |
| R² | 0.9918 | 0.9910 | 0.9959 | 1.0000 |
| SE | 0.3861 | 0.4959 | 0.3327 | 0.0454 |
| pKa1 | 4.51 | 4.40 | 4.44 | 4.16 |
| pKa2 | 6.92 | 8.67 | 8.45 | 8.54 |
| pKa3 | 9.08 | 10.40 | 10.32 | 11.44 |
| pKa4 | 10.05 | 11.13 | 11.12 | 12.78 |
| MAE | 0.160 | 0.199 | 0.134 | 0.019 |
| NMAE | | 1.992 | | 0.463 |
Table 5. The effect of the basis set on the theoretical reproduction of the proton hydration energy ∆G(H+)aq and the pKaN parameter values. The calculations were performed in a water medium at the LSDA theory level with the listed basis sets, using the SMD solvation model. The experimental value of ∆G(H+)aq = −259.5 kcal mol−1 was reported by Lim et al. [23].

| Parameter | 6-311++G(d,p) | aug-cc-pVQZ | QZVP |
|---|---|---|---|
| ∆G(H+)aq | −257.1671(51) | −259.2358(72) | −259.4272(75) |
| a | 7.156(34) | 7.268(54) | 7.271(57) |
| c | 0.1934(19) | 0.1869(29) | 0.1858(31) |
| R² | 1.0000 | 1.0000 | 1.0000 |
| SE | 0.0265 | 0.0430 | 0.0454 |
| pKa1 | 4.159 | 4.160 | 4.161 |
| pKa2 | 8.542 | 8.537 | 8.536 |
| pKa3 | 11.423 | 11.435 | 11.436 |
| pKa4 | 12.788 | 12.778 | 12.776 |
| MAE | 0.011 | 0.017 | 0.019 |
| NMAE | 0.268 | 0.427 | 0.463 |
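The MAE and NMAE rows can be checked directly from the reproduced pKaN values and the Set IV experimental data, pKa1 = 4.16(2), pKa2 = 8.55(1), pKa3 = 11.40(10), pKa4 = 12.80(40). The sketch below does so for the three basis sets. Equation (8) is not reproduced in this section, so the NMAE form used here, the mean of absolute deviations each divided by the experimental uncertainty, is an assumption. It gives values close to the tabulated ones when applied to the rounded table entries, but not identical to them. The function names are hypothetical.

```python
PKA_EXP = [4.16, 8.55, 11.40, 12.80]   # experimental Set IV values
PKA_UNC = [0.02, 0.01, 0.10, 0.40]     # their reported uncertainties

# Reproduced pKa values from Table 5 (LSDA, SMD, water)
PKA_CALC = {
    "6-311++G(d,p)": [4.159, 8.542, 11.423, 12.788],
    "aug-cc-pVQZ":   [4.160, 8.537, 11.435, 12.778],
    "QZVP":          [4.161, 8.536, 11.436, 12.776],
}

def mae(calc, exp):
    """Mean absolute error in pKa units."""
    return sum(abs(c - e) for c, e in zip(calc, exp)) / len(exp)

def nmae(calc, exp, unc):
    """Assumed NMAE: mean absolute error in units of the experimental uncertainty."""
    return sum(abs(c - e) / u for c, e, u in zip(calc, exp, unc)) / len(exp)

results = {
    basis: (mae(values, PKA_EXP), nmae(values, PKA_EXP, PKA_UNC))
    for basis, values in PKA_CALC.items()
}
for basis, (m, nm) in results.items():
    print(f"{basis:15s} MAE = {m:.3f}  NMAE = {nm:.3f}")
```

An NMAE below 1 under this definition means the reproduced values fall, on average, within the experimental uncertainties, which is the case for all three basis sets here.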
Molski, M. Computation of the pKa Values of Gallic Acid and Its Anionic Forms in Aqueous Solution: A Self-Similar Transformation Approach for Accurate Proton Hydration Free Energy Estimation. Molecules 2025, 30, 742. https://doi.org/10.3390/molecules30030742
