**Gulliver in the Country of Lilliput: An Interplay of Noncovalent Interactions**

Printed Edition of the Special Issue Published in *Molecules*

Edited by Ilya G. Shenderovich

www.mdpi.com/journal/molecules

## **Gulliver in the Country of Lilliput: An Interplay of Noncovalent Interactions**

Editor

**Ilya G. Shenderovich**

MDPI • Basel • Beijing • Wuhan • Barcelona • Belgrade • Manchester • Tokyo • Cluj • Tianjin

*Editor* Ilya G. Shenderovich University of Regensburg Germany

*Editorial Office* MDPI St. Alban-Anlage 66 4052 Basel, Switzerland

This is a reprint of articles from the Special Issue published online in the open access journal *Molecules* (ISSN 1420-3049) (available at: https://www.mdpi.com/journal/molecules/special_issues/Interplay_Noncovalent_Interactions).

For citation purposes, cite each article independently as indicated on the article page online and as indicated below:

LastName, A.A.; LastName, B.B.; LastName, C.C. Article Title. *Journal Name* **Year**, *Volume Number*, Page Range.

**ISBN 978-3-0365-0430-8 (Hbk) ISBN 978-3-0365-0431-5 (PDF)**

Cover image courtesy of Ilya G. Shenderovich.

© 2021 by the authors. Articles in this book are Open Access and distributed under the Creative Commons Attribution (CC BY) license, which allows users to download, copy and build upon published articles, as long as the author and publisher are properly credited, which ensures maximum dissemination and a wider impact of our publications.

The book as a whole is distributed by MDPI under the terms and conditions of the Creative Commons license CC BY-NC-ND.


## **About the Editor**

**Ilya G. Shenderovich** (Dr.Nat.Sci.) received a BSc degree in Physics (1993, Chemical Physics) and an MSc degree in Physics (1995, Physics of Condensed Matter) from Sankt-Petersburg State University, Russia (Mentor: Dr. G. N. Kuz'min). He became a Candidate of Science in Physics and Mathematics (PhD) in 1999 (Topic: Manifestation of Covalency, Cooperativity, and Symmetry of Strong Hydrogen Bonds in NMR Spectra; Mentor: Prof. Dr. G.S. Denisov) and a Doctor of Science in Physics and Mathematics in 2011 (Topic: Study of Hydrogen Bonds in Amorphous Materials and at Interfaces by NMR). He progressed in his research career under the guidance of Prof. Dr. H.-H. Limbach at the Freie Universität Berlin. He runs the NMR department of the Faculty of Chemistry and Pharmacy at the Universität Regensburg. His main research interests focus on noncovalent interactions in condensed matter. His main research methods are NMR spectroscopy and model DFT calculations. The list of his publications is available at publons.com/researcher/636272/ilya-g-shenderovich and https://www.scopus.com/authid/detail.uri?authorId=6701593020.

### *Editorial* **Editorial to the Special Issue "Gulliver in the Country of Lilliput: An Interplay of Noncovalent Interactions"**

**Ilya G. Shenderovich**

Institute of Organic Chemistry, Faculty of Chemistry and Pharmacy, University of Regensburg, Universitaetstrasse 31, 93053 Regensburg, Germany; Ilya.Shenderovich@ur.de

Noncovalent interactions allow our world to exist. Their study remains vital to the progress of chemistry and chemical physics. This topic has been specifically addressed in a number of the past and present Special Issues of *Molecules*, and almost every publication touches on the subject of noncovalent interactions in one way or another.

The overarching goal of this Special Issue was to bring together publications that consider effects caused by an interplay of noncovalent interactions. A common case is the situation when there is one dominant interaction that determines the structure of the molecular system, and a large number of weaker interactions that are forced to adapt to this structure. Although it is clear that this "Gulliver in the Country of Lilliput" model is only a rough approximation, this view implicitly prompts the assumption that the net effect of weak interactions is negligible, at least on the structure of the system, since multiple small contributions can cancel each other out. In some cases, this may turn out to be true. However, since the total strength of the "Lilliputians" can exceed that of the "Gulliver", a priori conclusions can lose their predictive power. The dominant interaction can be strong and protected from any direct competition, as in the case of the proton-bound homodimer of pyridine, but the geometry of this complex still differs in the gas, solution, and solid phases [1]. Similarly, the result of competition between two strong interactions can be determined by weak interactions [2]. The current challenge is to learn to incorporate multiple competing interactions into effective working models.

The contributions in this Special Issue can be grouped into three thematic areas: (i) specific properties of selected interactions evaluated for bi- or trimolecular complexes [3–6]; (ii) their role in crystal packing [7,8] and in chemical [9,10] and enzymatic [11] reactions; and (iii) manifestations of competing noncovalent interactions in solution [12] and in porous materials [13].

The experimentally challenging study by Suhm et al. [3] analyses the docking preference of alcohols between the two nonequivalent lone electron pairs of the carbonyl group in pinacolone using supersonic jet expansions of 1:1 solvate complexes. The outcome of the interplay between the nonequivalence of the lone electron pairs and distant London dispersion and Pauli repulsion was modulated by the size of the alkyl group of the alcohol. The experimental results serve as high-level benchmarks for verifying the accuracy of theoretical methods, some of which were tested in the study. The importance of London dispersion for the structure and stability of molecular aggregates is worth noting [14].

The paper by Szatylowicz et al. [4] describes the effect of substituents on the energy of specific noncovalent interactions in adenine-based bimolecular complexes and the aromaticity of the partners. Understanding these effects is essential for effective control of acid-base interactions in biochemistry, where stronger does not always mean better. Special attention was paid to the comparison of the energies obtained using different models. The reader may be interested in further reading on this subject [15].

The contribution by Filarowski et al. [5] reports on the balance of repulsive and attractive intramolecular interactions between adjacent carboxyl groups in selectively substituted phthalic acids, the dependence of this balance on intramolecular steric crowding, and the effect of these intramolecular properties on intermolecular interactions of these carboxyl groups. This study represents a combination of the Infrared, Raman, Nuclear Magnetic Resonance (NMR), and Incoherent Inelastic Neutron Scattering spectroscopies and the Car–Parrinello Molecular Dynamics and Density Functional Theory calculations. The structural and energetic parameters of the intra- and intermolecular interactions were estimated for the gas, liquid, and solid phases. Note that despite the originality of the outcomes arising from neutron scattering, the number of molecular systems studied by these methods is limited due to the complexity of the experimental equipment required [16,17].

**Citation:** Shenderovich, I.G. Editorial to the Special Issue "Gulliver in the Country of Lilliput: An Interplay of Noncovalent Interactions". *Molecules* **2021**, *26*, 158. https://doi.org/10.3390/molecules26010158

Received: 22 December 2020 Accepted: 30 December 2020 Published: 31 December 2020

**Publisher's Note:** MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

**Copyright:** © 2020 by the author. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

The paper by Tolstoy et al. [6] describes halogen-bonded complexes formed by trimethylphosphine oxide with 128 different halogen donors of various classes. Correlations between the energetic, geometric and spectral properties of these complexes were established and summarized. These correlations make it possible to estimate the energy and geometry of a given halogen bond from the corresponding experimentally measured spectral parameter. Both the halogen bonding and the acceptor properties of the P=O moiety [18] are of considerable current interest. Therefore, the reported correlations can be very useful.

Vener et al. [7] describe how the calculated structural and spectral parameters of two component crystals of organic salts depend on various parameters of the theoretical approximation used. The experimental parameters of such systems cannot be correctly reproduced using the approximation of small molecular clusters. Using the periodic density functional theory is a reasonable compromise that allows the parameters of complex multicomponent pharmaceuticals to be accurately predicted [19].

The paper by Nenajdenko, Tskhovrebov et al. [8] demonstrates that halogen-halogen interactions play a critical role in self-assembly of highly polarizable molecules in crystals. A series of novel halogenated aromatic dichlorodiazadienes were prepared and characterized using X-ray diffraction and Bader's Theory of Atoms in Molecules. Although halogen-halogen interactions are not strong, they can be a tool for fine-tuning the crystal structure [20].

The review by Grabowski [9] highlights the role of noncovalent interactions as a preliminary stage of chemical reactions. Hydrogen-bonding-assisted proton transfer, halogen bonding in solution, molecular hydrogen elimination via a dihydrogen bond, the intramolecular conformational effect of triel bonds, and tetrel bonds in S<sub>N</sub>2 reactions were considered. Note that these short-lived interactions can both facilitate and hinder other steps [21,22]. For example, Ke and Lin [10] report on the catalytic effect of hydrogen bonding on the methanol steam reforming reaction on a metal surface. This and other publications on this topic may be of direct practical use [23].

Vianello et al. [11] demonstrate how small alterations in weak interactions can cause significant changes in biological activity. Their molecular dynamics simulations correctly reproduce experimental data on the binding energy of histamine within the H2 receptor and its change caused by deuteration. It is well known that deuteration can affect the kinetics of a chemical reaction [24] and result in measurable structural and spectral changes [25]. However, in the paper at hand, the authors were able to identify the mechanism responsible for these changes.

The paper by Shenderovich and Denisov [12] presents an advanced approach to implicitly accounting for the solvent effect. In this adduct-under-field approach, the solvent effect is simulated using an external electric field. It was shown that solute–solvent interactions remarkably affect the geometry of acid-base complexes even if the active sites of these complexes are not accessible for solvent molecules. Note that this approach is applicable to many other molecular systems in solution and in crystal form [26,27].

The review by Buntkowsky and Vogel [13] describes current trends and perspectives in the study of guest molecules in porous silica materials employing solid-state NMR techniques, with particular attention to the effect of an interplay between guest–guest and guest–host interactions. It is shown that such interactions can radically change the physicochemical properties of these systems. Solid-state NMR and relaxometry are among the most effective analytical tools in this area of materials chemistry. They can be applied for probing the structure or dynamics of the materials themselves as well as the behavior of incorporated guests [28,29].

**Funding:** This research received no external funding. The APC was funded by MDPI.

**Acknowledgments:** I want to sincerely thank everyone that contributed to this Special Issue. Special thanks to the assistant editor Lola Huo and the entire team of *Molecules* for their motivation, professional expertise, and support.

**Conflicts of Interest:** The authors declare no conflict of interest.

#### **References**


*Article*

## **The Effect of Deuteration on the H2 Receptor Histamine Binding Profile: A Computational Insight into Modified Hydrogen Bonding Interactions**

#### **Lucija Hok <sup>1</sup>, Janez Mavri <sup>2</sup> and Robert Vianello <sup>1,</sup>\***


#### Academic Editor: Ilya G. Shenderovich

Received: 4 December 2020; Accepted: 17 December 2020; Published: 18 December 2020

**Abstract:** We used a range of computational techniques to reveal an increased histamine affinity for its H2 receptor upon deuteration, which was interpreted through altered hydrogen bonding interactions within the receptor and the aqueous environment preceding the binding. Molecular docking identified the area between third and fifth transmembrane α-helices as the likely binding pocket for several histamine poses, with the most favorable binding energy of −7.4 kcal mol<sup>−1</sup> closely matching the experimental value of −5.9 kcal mol<sup>−1</sup>. The subsequent molecular dynamics simulation and MM-GBSA analysis recognized Asp98 as the most dominant residue, accounting for 40% of the total binding energy, established through a persistent hydrogen bonding with the histamine −NH<sub>3</sub><sup>+</sup> group, the latter further held in place through the N–H···O hydrogen bonding with Tyr250. Unlike earlier literature proposals, the important role of Thr190 is not evident in hydrogen bonds through its −OH group, but rather in the C–H···π contacts with the imidazole ring, while its former moiety is constantly engaged in the hydrogen bonding with Asp186. Lastly, quantum-chemical calculations within the receptor cluster model and utilizing the empirical quantization of the ionizable X–H bonds (X = N, O, S), supported the deuteration-induced affinity increase, with the calculated difference in the binding free energy of −0.85 kcal mol<sup>−1</sup> being in excellent agreement with an experimental value of −0.75 kcal mol<sup>−1</sup>, thus confirming the relevance of hydrogen bonding for the H2 receptor activation.

**Keywords:** deuteration; heavy drugs; histamine receptor; hydrogen bonding; receptor activation

#### **1. Introduction**

Histamine is an important mediator and neurotransmitter that is involved in a broad spectrum of central and peripheral physiological as well as pathophysiological processes, such as allergies and inflammation. It exerts its specific effects by the activation of four receptor subtypes (H1R–H4R) [1]. Histamine receptors are 7-transmembrane receptors, which belong to the family of G-protein coupled receptors (GPCR), a very common target for a wide range of therapeutics used in modern pharmacotherapy, and differ in receptor distribution, ligand binding properties, signaling pathways and functions. Some estimates suggest that GPCRs encompass around 30% of the existing drug targets, while their therapeutic potential might be even larger [2,3].

The literature contains many studies on how GPCRs are activated and transmit their signals from the extracellular side to the G-protein coupling domain located on the intracellular side [4–6]. Instead, we have been interested in how different agonists and antagonists bind to the receptor binding site, and whether these processes are modulated upon non-selective deuteration, which would confirm the assumption that ligand binding is governed by hydrogen bonding interactions. Specifically, the topic of deuterium isotope effects is usually concerned with the impact on chemical reactions of substituting protium (H) atoms with deuterium (D) in a molecule. These effects include changes in the rate of cleavage of covalent bonds to deuterium, or to an atom located adjacent to deuterium, in a reactant molecule. Deuterium isotope effects on other interactions, for example noncovalent interactions between molecules, are also known to occur, but are generally considered to be insignificant, especially in biological experiments where deuterium-substituted molecules are used as tracers. Nevertheless, replacing light hydrogen atoms with their heavier deuterium analogues typically shortens the donor X–D bonds relative to the X–H bonds (X = heteroatom), as the X–D bonds are stronger, more compact and more stable to oxidative processes. Ultimately, this results in the elongation of the corresponding donor···acceptor distance among heteroatoms, also known as the Ubbelohde effect [7], which affects the strength of the involved hydrogen bonds and can, therefore, produce modified affinities during the ligand–target recognition. Indeed, D has a 2-fold higher mass than H, leading to a reduced vibrational stretching frequency of the X–D bond compared to the X–H bond and, consequently, a lower ground state (zero-point) energy. To further confirm that, Bordallo and co-workers recently performed a very accurate neutron diffraction study of the alanine zwitterion to show that deuteration reduces the electrostatic attraction in the acidic N–D bonds by 2.3% relative to the corresponding N–H bonds [8]. This results in the shortening of the N–D distances, as already noticed in various papers [9–13].
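The link between the doubled mass of deuterium, the reduced stretching frequency and the lower zero-point energy can be illustrated with a simple harmonic-oscillator estimate. The sketch below is not part of the original study: it assumes an isolated diatomic X–H oscillator whose force constant is unchanged by deuteration, and the function names and the 3300 cm<sup>−1</sup> wavenumber used in the example are illustrative assumptions only.

```python
import math

# Isotopic masses in unified atomic mass units.
MASS = {"H": 1.007825, "D": 2.014102, "N": 14.003074, "O": 15.994915}

# Unit conversion: 1 kcal/mol corresponds to about 349.755 cm^-1.
CM1_PER_KCAL_MOL = 349.755


def reduced_mass(m1, m2):
    """Reduced mass of a two-body oscillator."""
    return m1 * m2 / (m1 + m2)


def xd_to_xh_frequency_ratio(heavy_atom):
    """Harmonic ratio nu(X-D)/nu(X-H), assuming an identical force constant."""
    mu_h = reduced_mass(MASS[heavy_atom], MASS["H"])
    mu_d = reduced_mass(MASS[heavy_atom], MASS["D"])
    return math.sqrt(mu_h / mu_d)


def zero_point_shift_kcal(wavenumber_xh_cm1, heavy_atom):
    """Zero-point energy change (kcal/mol) of one stretch on H -> D exchange."""
    ratio = xd_to_xh_frequency_ratio(heavy_atom)
    zpe_h = 0.5 * wavenumber_xh_cm1           # ZPE = (1/2) h c nu, in cm^-1
    zpe_d = 0.5 * wavenumber_xh_cm1 * ratio
    return (zpe_d - zpe_h) / CM1_PER_KCAL_MOL


if __name__ == "__main__":
    # 3300 cm^-1 is a hypothetical N-H stretching wavenumber for illustration.
    print(xd_to_xh_frequency_ratio("N"))
    print(zero_point_shift_kcal(3300.0, "N"))
```

Within this approximation the frequency ratio is close to 1/√2 for any heavy atom X, so the zero-point energy of the X–D stretch is always lower than that of X–H. Real hydrogen-bonded systems are anharmonic, which is why the study relies on a dedicated quantization scheme rather than on such an estimate.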

For many years, researchers have sought ways to incorporate deuterium into drug molecules in order to inhibit metabolic conversion into less active or inactive molecules [14,15], with the first such attempts being made nearly 60 years ago [16]. Because bonds to deuterium are stronger than those to hydrogen, early adopters tweaked molecules to better withstand the ravages of drug-metabolizing enzymes like cytochrome P450s. Deuterated drugs, they hoped, would have longer half-lives than their non-deuterated counterparts, which might allow less frequent dosing and produce different metabolites. With this in mind, the focus was placed on drug fragments that were expected to be sites amenable to metabolic transformations, and typically involved chemical deuteration of heteroatom–CH3 groups (or other alkyl units), which were converted into heteroatom–CD3 alternatives. As an illustrative example, Falconnet [17], Brazier [18], and Cherrah [19,20] studied the binding of caffeine (Figure 1) to human serum albumin (HSA) by equilibrium dialysis and demonstrated that the corresponding *K*<sub>a</sub> values for caffeine, caffeine-1-CD3 and caffeine-1,3,7-(CD3)3 were not significantly different, while those for caffeine-3-CD3, caffeine-1,7-(CD3)2, and caffeine-3,7-(CD3)2 were considerably lower than that for caffeine, indicating that HSA had a reduced affinity for the deuterated compounds. On the other hand, very recently, the U.S. Food and Drug Administration granted market approval for the first deuterated drug molecule, deutetrabenazine (Figure 1), which is useful in treating chorea associated with Huntington's disease [21]. Deutetrabenazine is a heavier analogue of the existing drug tetrabenazine, with two −OCH3 groups in the latter being replaced by a pair of −OCD3 groups, thereby altering the rate of metabolism to afford greater tolerability and an improved dosing regimen, and thus an enhanced therapeutic potential.
Still, in both of these instances, one can hardly argue that modified affinities came as a result of changed hydrogen bonding strengths, since methyl groups and their deuterated versions show poor hydrogen bonding abilities. Therefore, the observed effects likely originate in the modified dipole-dipole or dipole-charge interactions, which are generally weak. Knowing that hydrogen bonding interactions are significantly stronger than those mentioned, and that they typically dominate the ligand–target recognition [22], led us to offer some insight into the effect of deuteration on the ability of the H2 receptor to accommodate its endogenous agonist histamine, the latter particularly suitable to inspect alterations in the hydrogen bonding patterns and the accompanying affinities. Namely, histamine is a biogenic diamine (Figure 1), consisting of a free ethylamino group and an imidazole ring, thus involving three distinct sites able to either donate or accept hydrogen bonds, which makes it reasonable to expect that these particular interactions will predominantly govern its binding to the H2 receptor, as was clearly demonstrated in the case of its hydration [23,24].

**Figure 1.** Chemical structures and atom labeling for the systems relevant to the discussion.

With this in mind, instead of utilizing chemical deuteration described earlier, in our preceding work [25], we have taken a different approach of introducing deuteration through the exchange mechanism by performing binding studies in pure D<sub>2</sub>O. In this way, we ensured that all exchangeable hydrogen atoms, both in aqueous solution and within the H2 receptor, were replaced by deuterium, which allowed us to monitor how the hydrogen bonding interactions responsible for both the histamine hydration and its inclusion into the receptor binding site are affected. Experiments were carried out on the H2 receptor present in cell membranes of cultured neonatal rat astrocytes, where we conducted the saturation and inhibition binding experiments using the antagonist <sup>3</sup>H-tiotidine as a radiolabel, and histamine as a displacer of a bound radioligand. The results revealed a significant increase in the histamine affinity, as its pIC<sub>50</sub> values (*p* < 0.05) changed from 7.25 ± 0.11 (control) to 7.80 ± 0.16 (D<sub>2</sub>O). Building on that, our subsequent work undertook the same approach for the binding of two agonists, 2-methylhistamine and 4-methylhistamine, and two antagonists, cimetidine and famotidine, and showed a notable affinity increase for 4-methylhistamine and a reduced one for 2-methylhistamine, while no change was observed for either antagonist [26]. This was interpreted in the context of the altered hydrogen bonding strength upon deuteration, which impacts ligand interactions with binding site residues and solvent molecules preceding the binding.
Our present work builds on the mentioned results [25,26], and considers the parent agonist histamine through a range of computational techniques, involving docking studies, classical molecular dynamics simulation, and quantum-chemical calculations within a large cluster model of the H2 receptor, in order to offer a more precise insight into the structural and electronic features of the studied ligand with the aim to provide the molecular interpretation to the observed binding differences. The outlined analysis is likely to contribute towards understanding the receptor activation, while the in silico discrimination between agonists and antagonists, based on the receptor structure, remains a distant ultimate goal.

#### **2. Results and Discussion**

As already mentioned, in our preceding work [25], we used <sup>3</sup>H-tiotidine as a marker to label histamine H2 receptor binding sites on the cultured neonatal rat astrocytes, and histamine as an agonist to displace it, both in the control system and in deuterated environment. This resulted in a considerable deuteration-induced increase in the histamine affinity, as the measured pIC<sub>50</sub> values (*p* < 0.05) went from 7.25 ± 0.11 (control) to 7.80 ± 0.16 (D<sub>2</sub>O). Although the relationship between IC<sub>50</sub> and Δ*G*<sub>BIND</sub> values is not so straightforward in absolute terms, their relative ratio is connected through the Cheng-Prusoff equation [27] and roughly translates to a difference of ΔΔ*G*<sub>BIND</sub> = −0.75 kcal mol<sup>−1</sup>, which will be used in the rest of the text to evaluate the quality of computational results.
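The conversion of the measured pIC<sub>50</sub> shift into a free energy difference can be sketched as follows. This is an illustrative calculation rather than the authors' procedure: it assumes that the radioligand concentration and its dissociation constant are the same in both media, so that the Cheng-Prusoff correction cancels in the ratio and the IC<sub>50</sub> ratio equals the *K*<sub>i</sub> ratio, and it assumes a temperature of 298.15 K, which the text does not state. The function name is hypothetical.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal mol^-1 K^-1


def ddg_from_pic50(pic50_reference, pic50_modified, temperature=298.15):
    """Relative binding free energy (kcal/mol) from two pIC50 values.

    Assumes the Cheng-Prusoff factor (1 + [L]/Kd) is identical in both
    experiments, so that ddG = -RT ln(10) * (pIC50_modified - pIC50_reference).
    """
    delta_pic50 = pic50_modified - pic50_reference
    return -R_KCAL * temperature * math.log(10.0) * delta_pic50


if __name__ == "__main__":
    # Control (H2O) versus deuterated (D2O) environment, values from the text.
    print(round(ddg_from_pic50(7.25, 7.80), 2))
```

With the reported values of 7.25 and 7.80 this estimate reproduces the quoted difference of about −0.75 kcal mol<sup>−1</sup>; a negative sign means stronger binding in D<sub>2</sub>O.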

#### *2.1. Docking Simulation*

To offer some initial insight into the binding of histamine into the H2 receptor, we employed several docking simulations with the aim of obtaining the relevant binding poses and the accompanying binding free energies, and use these as starting points for the subsequent molecular dynamics (MD) simulations. In doing so, we focused on the more stable N3–H (Nτ) tautomer, which was docked into the homology structure of the H2 receptor. Interestingly, although the entire receptor surface was considered equally during the docking procedure, the obtained results reveal that the first four most favorable binding poses correspond to the identical position within the H2 receptor, and only differ in the conformation of the histamine ligand (Figure 2).

**Figure 2.** Overlap of four most favorable histamine binding poses within the H2 receptor as predicted by molecular docking that differ only in the ligand orientation. The computed binding free energies are −7.4 kcal mol<sup>−1</sup> (blue), −7.1 kcal mol<sup>−1</sup> (green), −6.8 kcal mol<sup>−1</sup> (orange) and −6.7 kcal mol<sup>−1</sup> (red).

Apart from being positioned in the same binding pocket, a closer analysis of the predicted binding poses shows that all histamine molecules are located in the area between third and fifth transmembrane α-helices, in line with many earlier literature reports on the binding of H2 receptor ligands [28–30]. This lends some credence to the obtained results, which is further supported by the calculated binding affinities. Namely, Figure 2 shows that the most favorable pose is associated with the binding energy of −7.4 kcal mol<sup>−1</sup>, which agrees well with the experimental value of −5.9 kcal mol<sup>−1</sup> obtained from the measured p*K*<sub>i</sub> value of 4.3 [31]. Lastly, let us briefly mention that we have repeated the identical docking procedure for the less stable N1–H (Nπ) histamine tautomer, and the results showed an analogous placement within the H2 receptor and the identical binding energy of −7.4 kcal mol<sup>−1</sup>. Still, due to the lower stability and the correspondingly lower population of this tautomer relative to its N3–H analogue, the N1–H tautomer was not considered further.
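The experimental reference value quoted above follows from the standard relation Δ*G* = −*RT* ln(10) · p*K*<sub>i</sub>. The short sketch below reproduces it and the deviation of the docking score from experiment. It assumes a temperature of 298.15 K, which is not stated in the text, and the function name is hypothetical.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal mol^-1 K^-1


def binding_free_energy_from_pki(pki, temperature=298.15):
    """Absolute binding free energy (kcal/mol) from an inhibition constant pKi."""
    return -R_KCAL * temperature * math.log(10.0) * pki


if __name__ == "__main__":
    dg_experiment = binding_free_energy_from_pki(4.3)  # pKi from the text
    dg_docking = -7.4                                   # best docking pose
    print(round(dg_experiment, 1))
    print(round(dg_docking - dg_experiment, 1))         # docking minus experiment
```

Under these assumptions the docking score is about 1.5 kcal mol<sup>−1</sup> more negative than the experimental estimate, which gives a sense of the size of the discrepancy behind the stated agreement.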

#### *2.2. Molecular Dynamics Simulation of Histamine in Aqueous Solution*

In aqueous solution, histamine exists almost exclusively (98%) as a monocation protonated at the free ethylamino group (Figure 1), and this protonation form was considered in a 20 Å-thick truncated octahedron simulation box, which involved 3572 water molecules.

It turned out that histamine is a rather flexible molecule, but the clustering analysis of the obtained structures revealed a predominance of two types of geometries (Figure 3): *gauche*, in which there is an intramolecular N–H···N hydrogen bond between the protonated amine (N2) as a donor and the imino nitrogen (N1) within the imidazole ring as an acceptor, and *trans*, which is elongated and in which such a hydrogen bond is absent. Interestingly, the results reveal around 73% dominance of the *trans* conformation, which is in good agreement with around 80% predicted by other techniques [23,32–35]. It is worth mentioning that two useful geometric parameters, which characterize these two distinct orientations and which will be used later in the analysis of the conformational preference of histamine within the receptor, are (i) the distance between the relevant N1–N2 sites, and (ii) the dihedral angle describing the rotation of the ethylamino group around the imidazole ring. In the representative *trans* geometry these are 4.54 Å and 158.8°, respectively, while in *gauche* these are reduced to 3.01 Å and 62.1°, in the same order. Their distribution during MD simulation (Figure S1) also indicates the preference of the *trans* conformation and further demonstrates the suitability of the described two structures as representative.
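The dihedral-based distinction between the two conformer families can be illustrated with a small script that computes a dihedral angle from four atomic positions and assigns each frame to *gauche* or *trans*. This is a sketch only: the study used clustering analysis, whereas the code below applies a plain angle cutoff, and the 120° boundary, the choice of the four atoms and all function names are assumptions made for illustration.

```python
import math


def _sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])


def _dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def dihedral_deg(p0, p1, p2, p3):
    """Signed dihedral angle (degrees) defined by four atomic positions."""
    b1, b2, b3 = _sub(p1, p0), _sub(p2, p1), _sub(p3, p2)
    n1, n2 = _cross(b1, b2), _cross(b2, b3)   # normals of the two planes
    norm_b2 = math.sqrt(_dot(b2, b2))
    b2_unit = (b2[0] / norm_b2, b2[1] / norm_b2, b2[2] / norm_b2)
    m1 = _cross(n1, b2_unit)
    return math.degrees(math.atan2(_dot(m1, n2), _dot(n1, n2)))


def classify(angle_deg, cutoff_deg=120.0):
    """Label a frame by the magnitude of its dihedral (cutoff is an assumption)."""
    return "trans" if abs(angle_deg) >= cutoff_deg else "gauche"


def populations(angles_deg, cutoff_deg=120.0):
    """Fraction of frames assigned to each conformer family."""
    labels = [classify(a, cutoff_deg) for a in angles_deg]
    return {name: labels.count(name) / len(labels) for name in ("gauche", "trans")}


if __name__ == "__main__":
    # Representative angles from the text: 62.1 (gauche) and 158.8 (trans).
    print(classify(62.1), classify(158.8))
```

Applied to the dihedral angles extracted from every frame of a trajectory, `populations` would give fractions comparable in kind to the 27% and 73% reported for the two families, although the exact numbers depend on the chosen cutoff.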

**Figure 3.** Representative conformations of the histamine monocation in aqueous solution with their populations, as obtained by the molecular dynamics simulation: *gauche* (population = 27%) and *trans* (population = 73%).

The interactions governing the hydration of histamine also reveal interesting trends. All three nitrogen sites (N1–N3) represent crucial locations to interact with water, with the corresponding radial distribution function (RDF) plots showing that the solvent can approach all of them (Figure 4a). Specifically, for all three positions, the predominant interactions are established at N(histamine)···O(water) distances of around 3 Å, which corresponds to rather strong hydrogen bonds in all cases. The interactions with both N-positions within the imidazole ring show very similar patterns, indicating that the N1 and N3 sites participate as hydrogen bond acceptor and donor, respectively, each with one water molecule in the first hydration shell. The latter is evident in the average number of hydrogen bonding contacts, being 1.2 and 0.7 for N1 and N3, respectively. On the other hand, the interaction with the cationic N2 site is much more frequent, and the roughly threefold higher peak for N2 is only partially explained by the fact that the protonated amino group has three equivalent N2–H bonds that can potentially interact with three water molecules at the same time. As a matter of fact, Figure 4b shows that the actual number of hydrogen bonding contacts for this group is predominantly between 2 and 3, with an average value during simulation of 2.2, which is likely due to steric reasons and the exchangeability of individual solvent molecules. In addition, the shape of the RDF curve and a slightly shorter distance for the peak maximum for N2 (2.8 Å relative to 2.9 Å for both N1 and N3) also suggest that its interactions with the solvent molecules are stronger than those of the other two nitrogen sites. 
This notion is in excellent agreement with our earlier report [24], where we utilized the Car-Parrinello molecular dynamics simulation scheme to interpret the experimental IR spectra of histamine in water, which showed a broad feature between 3350 and 2300 cm<sup>−1</sup> including a mixed contribution from the ring N3–H and the aminoethyl N2–H stretching vibrations. That analysis indicated that the ring amino group absorbs at higher frequencies than the three amino N2–H protons, thus implying that the latter form stronger hydrogen bonds with the surrounding waters.
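The RDF curves discussed above measure how the local density of water oxygens around a given nitrogen site compares with the bulk density. A minimal sketch of such a calculation is given below. It is not the analysis tool used in the study: it ignores periodic images, takes the box volume as constant, and all names and default parameters are illustrative assumptions.

```python
import math


def radial_distribution(site_frames, oxygen_frames, box_volume, r_max=8.0, n_bins=80):
    """g(r) between one solute site and water oxygens over a set of frames.

    site_frames   : list of (x, y, z) positions of the site, one per frame
    oxygen_frames : list of lists of (x, y, z) water oxygen positions per frame
    box_volume    : simulation box volume in cubic angstroms (assumed constant)
    """
    dr = r_max / n_bins
    counts = [0] * n_bins
    n_oxygen_total = 0
    for site, oxygens in zip(site_frames, oxygen_frames):
        n_oxygen_total += len(oxygens)
        for oxygen in oxygens:
            d = math.dist(site, oxygen)
            if d < r_max:
                counts[min(int(d / dr), n_bins - 1)] += 1

    n_frames = len(site_frames)
    density = n_oxygen_total / (n_frames * box_volume)  # mean oxygen number density

    centers, g = [], []
    for i, count in enumerate(counts):
        r_lo, r_hi = i * dr, (i + 1) * dr
        shell_volume = 4.0 / 3.0 * math.pi * (r_hi ** 3 - r_lo ** 3)
        centers.append(r_lo + 0.5 * dr)
        # Observed count divided by the count expected for a uniform fluid.
        g.append(count / (n_frames * density * shell_volume))
    return centers, g
```

The position of the first maximum of `g` corresponds to the N···O distances quoted in the text (2.8 to 2.9 Å), and integrating the first peak would give a coordination number comparable to the average hydrogen bond counts.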

**Figure 4.** RDF displays describing the interaction of the solvent water molecules with N1 (black), N2 (red) and N3 (green) sites on histamine (**a**), and the evolution of the number of N2–H···O(water) hydrogen bonds (**b**) during the molecular dynamics simulation in aqueous solution.

#### *2.3. Molecular Dynamics Simulation within the H2 Receptor*

Following the presented docking analysis, the obtained four most favorable docking positions, which differ in the orientation of the histamine ligand within the same binding pocket, were solvated in a 10 Å-thick truncated octahedron simulation box involving 18,850 water molecules, and submitted to the MD simulation for the production run of 300 ns. The validity of this approach is justified through the corresponding RMSD graphs, which reveal converged simulation (Figure S2). The obtained trajectories were analyzed by the MM-GBSA protocol in order to obtain the matching binding free energies, Δ*G*<sub>BIND</sub>. The mentioned four independent simulations gave Δ*G*<sub>BIND</sub> values between −7.5 and −14.2 kcal mol<sup>−1</sup>, with the trajectory corresponding to the most exergonic binding being employed in further analysis. At this point it is worth stressing that the Δ*G*<sub>BIND</sub> values obtained using this approach are somewhat overestimated in absolute terms. This is a known limitation of the MM-GBSA approach, as extensively discussed in a recent review by Homeyer and Gohlke [36], which also underlined its huge potential in predicting relative binding energies in biomolecular complexes, which is how this approach is utilized here.

The contribution of crucial binding residues is presented in Table 1, while a representative snapshot from the MD trajectory is shown in Figure 5. The specific residues considered for the analysis are those whose favorable contribution exceeds −0.06 kcal mol<sup>−1</sup> and those with an unfavorable contribution over +0.02 kcal mol<sup>−1</sup>, in total 18 residues each. It turns out that the most dominant interaction that histamine establishes within the H2 binding site is that with Asp98, which accounts for almost 40% of the binding energy. It is established through charge-charge interactions between the protonated amino group on histamine and the anionic carboxylate side chain of Asp98, which are highly persistent throughout the entire simulation, while typically involving an N–H···O hydrogen bond with only one of the carboxylate oxygen atoms (Figure S3). The reason for the latter is that, besides Asp98, the protonated histamine −NH3<sup>+</sup> group has the potential to interact with the –OH group on the nearby Tyr250, which is locked in position through donating a hydrogen bond to the other carboxylate O-atom on Asp98 (Figure 5). The mentioned histamine···Tyr250 interaction is also persistent during the simulation (Figure S4, top), evident in a favorable Tyr250 contribution of −0.83 kcal mol<sup>−1</sup>, while the Tyr250···Asp98 hydrogen bonding is absent in the beginning, but, once formed, remains stable during the second half of the trajectory (Figure S4, bottom).



**Figure 5.** Representative position of the histamine monocation within the H2 receptor binding site as obtained from the molecular dynamics simulation.

The imidazole ring is less prone to hydrogen bonding interactions. This holds in particular for its imino N1 nitrogen, for which no particular interactions are observed during the entire simulation. In contrast, its amino N3–H group is found in the vicinity of two threonine residues, Thr103 and Thr190. Although N3–H···O hydrogen bonding interactions with the former are more significant in this respect (Figure S5), hence the higher individual contribution of Thr103 over Thr190 (Table 1), both of these are much less frequent and clearly weaker than those with the −NH3<sup>+</sup> group. Still, the individual contribution of Thr190 is quite notable, −0.84 kcal mol<sup>−1</sup>, and is not a consequence of the mentioned hydrogen bonds with the ligand, but rather of notable C–H···π interactions with its imidazole ring (Figure 5). It is very important to underline that this binding pattern is different from the model by Birdsall and co-workers [37], proposed on the basis of the site-directed mutagenesis carried out by Gantz and co-workers [38], which suggested that Thr190 binds histamine at its N1 imino nitrogen through O–H···N1 hydrogen bonding. Yet, a closer inspection of the environment around Thr190 shows that its –OH group forms a persistent and stable hydrogen bond with a more distant backbone carbonyl group of Asp186 during 92% of the simulation time (Figure S6), with an average O···O distance of 2.76 Å, thus ruling out Birdsall's proposal as highly unlikely. To further strengthen this conclusion, we have looked at all other MD trajectories to find an alternative histamine orientation linked with a higher individual Thr190 contribution, which would likely indicate a potential hydrogen bonding connection with the ligand. However, in the case where this was as much as −1.48 kcal mol<sup>−1</sup> for Thr190, this involved a completely different histamine orientation (Figure S7), with an overall binding free energy (−10.9 kcal mol<sup>−1</sup>) that is as much as 3.3 kcal mol<sup>−1</sup> less exergonic, thus being much less relevant. There, the protonated histamine −NH3<sup>+</sup> group approaches Asp186 through hydrogen bonding interactions, which makes the latter residue the most relevant for the binding, with an individual contribution of −3.19 kcal mol<sup>−1</sup>. This is followed by Thr190, which, even in this case, does not form hydrogen bonding contacts with histamine, not with the neighboring −NH3<sup>+</sup> group, let alone with its imidazole ring, but rather again interacts through the already described C–H···π interactions. Let us also mention that such a changed histamine position diminishes the importance of Asp98 and Tyr250 (Figure S7), with the former even disfavoring the binding with a contribution of +0.02 kcal mol<sup>−1</sup>, thus again confirming the insignificance of such ligand binding poses.

Besides the mentioned receptor residues, the rest of the binding pocket is significantly hydrophobic, consisting mostly of aliphatic and aromatic side chains (Figure 5, Table 1). Nevertheless, this still allows histamine to establish a range of additional favorable contacts, including (i) its imidazole ring undergoing T-shaped π–π stacking interactions with Phe251, and (ii) its ethyl moiety utilizing C–H···π interactions with Phe254 (Figure 5). Both of these contacts are rather strong and particularly important, making Phe254 and Phe251 the third and fifth most dominant residues for the histamine binding, with individual contributions reaching −1.60 and −0.97 kcal mol<sup>−1</sup>, respectively (Table 1). This confirms the hydrophobic nature of the H2 receptor binding site, further supported by a significant contribution of Val99 of −1.78 kcal mol<sup>−1</sup>. Still, likely the most profound evidence for the hydrophobic character of the H2 binding pocket is the conformation of histamine during the binding. As already described, in polar hydrophilic environments, such as the aqueous solution, histamine predominantly assumes the elongated *trans* conformation, which disfavors the intramolecular N2–H···N1 hydrogen bonding and exposes the protonated −NH3<sup>+</sup> group for the interactions with the solvent. In contrast, increased hydrophobicity of the environment starts favoring the *gauche* conformation, where the mentioned hydrogen bonding occurs and is allowed by the flexibility of the ethyl linkage. As an illustrative example, in the gas phase, the *gauche* conformer is more stable by as much as 14.8 kcal mol<sup>−1</sup> [23], and clearly dominates in this paradigmatic hydrophobic medium. Along these lines, the conformational preference of the bound histamine reveals interesting trends (Figure 6).
The clustering analysis of histamine conformations inside the H2 receptor shows that, for two thirds of the simulation time, histamine assumes *gauche* conformations, with a mix of structures with and without the N2–H···N1 hydrogen bonding, while only one third of the structures are found in a typical *trans* conformation, thus confirming the hydrophobicity of the H2 receptor interior. Such a distribution of histamine conformations is further evident in the evolution of the corresponding N1–N2 distances and dihedral angles describing the rotation of the ethyl chain (Figure S8), which primarily assume values that support the predominance of the *gauche* orientations.
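As an illustration of how a *gauche*/*trans* assignment can be made from a trajectory frame, the following minimal sketch computes a dihedral angle from four atomic positions and labels the conformer. It is not the clustering procedure used in this work. The choice of the four atoms (for example, ring carbon, the two ethyl carbons and N2) and the 120° threshold are assumptions made only for this example.

```python
import math

def _sub(a, b):
    return tuple(x - y for x, y in zip(a, b))

def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def dihedral_deg(p0, p1, p2, p3):
    """Signed dihedral angle in degrees defined by four points."""
    b0, b1, b2 = _sub(p1, p0), _sub(p2, p1), _sub(p3, p2)
    n1, n2 = _cross(b0, b1), _cross(b1, b2)
    b1_len = math.sqrt(_dot(b1, b1))
    b1_unit = tuple(x / b1_len for x in b1)
    m1 = _cross(n1, b1_unit)
    return math.degrees(math.atan2(_dot(m1, n2), _dot(n1, n2)))

def classify_conformer(angle_deg, threshold=120.0):
    """Hypothetical rule: |angle| below the threshold counts as gauche."""
    return "gauche" if abs(angle_deg) < threshold else "trans"

# Idealized test geometries (arbitrary units), not taken from the simulation.
anti = dihedral_deg((1, 0, 0), (0, 0, 0), (0, 1, 0), (-1, 1, 0))
skew = dihedral_deg((1, 0, 0), (0, 0, 0), (0, 1, 0),
                    (0.5, 1, math.sqrt(3) / 2))
print(classify_conformer(anti), classify_conformer(skew))
```

Applying such a rule to every saved frame and counting the labels would give populations comparable in spirit to those quoted for Figure 6, although the actual values depend on the atom selection and threshold.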

[Figure 6 panels: *gauche* (population = 38%), *gauche* (population = 29%), *trans* (population = 33%)]

**Figure 6.** Representative conformations along with their populations of the histamine monocation at the H2 receptor binding site as revealed by molecular dynamics simulation.

#### *2.4. Quantum-Chemical Calculations*

In order to computationally evaluate the effect of deuteration on the binding of histamine to the H2 receptor, we have undertaken a series of quantum-chemical calculations at the M06–2X/6–31+G(d) level of theory, employing an empirical quantization of the acidic N–H, O–H and S–H bonds, and utilizing the implicit SMD solvation model with the dielectric constants of ε = 78.4 for the aqueous solution and ε = 4.0 for the receptor interior, as already described. We must also note that, before reaching the receptor interior, the ligand is present in the aqueous solution, thus our approach is based on evaluating how deuteration affects both environments individually. This results in the separation of the deuteration-induced change in the overall ligand binding affinity, ΔΔ*E*BIND, into the contribution arising from the energy of hydration, ΔΔ*E*HYDR, and the energy of the interaction with the receptor, ΔΔ*E*INTER, according to the following equation:

$$
\Delta\Delta E_{\mathrm{BIND}}(\mathrm{H} \to \mathrm{D}) = \Delta\Delta E_{\mathrm{HYDR}}(\mathrm{H} \to \mathrm{D}) - \Delta\Delta E_{\mathrm{INTER}}(\mathrm{H} \to \mathrm{D})
$$

In other words, the introduction of deuterium changes not only the geometric parameters of the matching deuterated bonds, but also the energies of the hydrogen bonding interactions in which these bonds participate. It is a fact that deuteration typically reduces the strength of the hydrogen bond, particularly if only one such interaction is considered. Yet, in this case, we are concerned with multiple hydrogen bonds that determine both the hydration and the interaction with the receptor, and since the overall effect (ΔΔ*E*BIND) is a difference between these two quantities, it can, in the end, be either positive or negative, depending on the ligand. In line with that, our earlier report on the same receptor showed that deuteration increased the binding of 4-methylhistamine and reduced the affinity for 2-methylhistamine, while causing no change for the antagonists cimetidine and famotidine [26]. With this in mind, we have extracted the relevant snapshots from the MD simulations in water and in the H2 receptor, both with and without histamine, and truncated the geometries to clusters involving 36 molecules of water, and receptor residues 98–103, 186–190 and 250–254, respectively. These were submitted to an unconstrained optimization of all geometric parameters, corresponding to the situation with lighter H-nuclei, followed by manually shortening all acidic N–H, O–H and S–H bonds by 2.3% and constraining them, which mirrors the deuterated analogues. For histamine, the latter involved all four N–H bonds, three within the protonated N2 amino group and one within the ring N3–H moiety, while for the water molecules this included all O–H bonds. For the receptor fragments, this pertained to shortening the O–H bonds in Thr103, Thr190, Tyr250 and Thr252, and the S–H bond in Cys102.
This choice is supported by knowing that threonine, having the least acidic side chain moiety of all three considered residues, spontaneously exchanges all of its −OH protons in D2O [39], thus justifying the same approach for the more acidic tyrosine and cysteine residues.

The hydration energy, Δ*E*HYDR, is calculated using a reaction scheme depicted in Figure 7 and the obtained values are given in Table 2. This approach relies on transferring histamine from the gas phase into the aqueous solution and forming a hydrated solute-solvent complex. In water, the hydration energy is calculated as −71.63 kcal mol<sup>−1</sup>, indicating that the histamine monocation is well solvated and stabilized in water. This value is reduced in D2O, as a result of modified hydrogen bonding interactions and their strength following deuteration, and assumes −71.20 kcal mol<sup>−1</sup>, a small effect of only 0.43 kcal mol<sup>−1</sup> in favor of H2O. This implies that the hydration, on its own, works in the direction of promoting the binding of a deuterated system to the receptor.

**Figure 7.** Computational scheme to calculate the hydration energy of the histamine monocation in the aqueous solution, Δ*E*HYDR. The selection of dielectric constants is specified in round brackets.

**Table 2.** Calculated deuteration-induced changes in the hydration energy (Δ*E*HYDR), H2 receptor interaction energy (Δ*E*INTER), and the overall receptor binding energy (Δ*E*BIND,calc) as obtained by the (SMD)/M06–2X/6–31+G(d) model (in kcal mol<sup>−1</sup>), the latter compared with the experimentally determined value (Δ*E*BIND,exp) from ref. [25].


On the other hand, the interaction energies with the receptor, Δ*E*INTER, are estimated through the scheme shown in Figure 8, which considers placing a ligand from the gas phase into the cluster model of the receptor binding site. We have to mention that, despite considering only a truncated receptor model and approximating the rest of its structure with the dielectric constant of ε = 4.0, during the geometry optimization, the structure of histamine and its protonated −NH3<sup>+</sup> group remained as such, although it is positioned in direct interaction with the –COO<sup>−</sup> group from Asp98. In other words, we did not observe a spontaneous histamine···Asp98 proton transfer, which could have occurred due to a limited account of the electrostatic environment that disfavors charge separation. Instead, both the structure and the position of histamine within the binding site remained as described during the MD simulation, which justifies our model and the selection of the most important residues for the cluster-continuum approach. With this in mind, it is important to notice that the interaction energies, Δ*E*INTER, are consistently more negative than Δ*E*HYDR, which confirms histamine's ability to leave the aqueous solution and enter the receptor. For the non-deuterated system, this assumes Δ*E*INTER = −82.31 kcal mol<sup>−1</sup>, which is, interestingly, further increased in magnitude by 0.42 kcal mol<sup>−1</sup> to Δ*E*INTER = −82.73 kcal mol<sup>−1</sup> upon deuteration. This already indicates that deuterated histamine is better accommodated within the receptor, but the precise magnitude of the resulting effect is an interplay between this interaction and the histamine placement in the aqueous solution preceding the binding.

**Figure 8.** Computational scheme to estimate the interaction energy of the histamine monocation with the H2 receptor, Δ*E*INTER. The selection of dielectric constants is specified in round brackets.

Combining the mentioned hydration and interaction energies and their differences, one arrives at the overall change in the binding energy for the receptor-ligand recognition, which assumes ΔΔ*E*BIND,calc = −0.85 kcal mol<sup>−1</sup>, being in excellent agreement with the experimentally determined value of −0.75 kcal mol<sup>−1</sup> [25], thus confirming an increased histamine affinity following deuteration. Both the calculated and measured values indicate around 3–4 times higher histamine affinity following deuteration, which is an interesting observation bearing some pharmacological relevance. The agreement between these sets of data is very impressive, particularly given the simplicity of the computational model used for the implicit deuteration conducted on only a small, but carefully selected part of the receptor molecule, which validates the employed methodology and allows its use in other biological systems as well.
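The arithmetic behind these numbers can be checked with a short sketch. It assumes, as one reading consistent with the reported values, that the binding energy in each isotopic form is the interaction energy minus the hydration energy, and that the resulting energy difference can be converted to an affinity ratio through a simple Boltzmann factor at 298.15 K. The function names and the temperature are illustrative choices, not part of the original protocol.

```python
import math

R_KCAL = 0.0019872  # gas constant in kcal mol^-1 K^-1

# Energies quoted in the text (kcal/mol); "H" = non-deuterated, "D" = deuterated.
E_HYDR = {"H": -71.63, "D": -71.20}
E_INTER = {"H": -82.31, "D": -82.73}

def binding_energy(isotope):
    # Assumption: binding = transfer from the hydrated state into the receptor.
    return E_INTER[isotope] - E_HYDR[isotope]

def deuteration_shift():
    # Change in binding energy on going from H to D (negative = stronger binding).
    return binding_energy("D") - binding_energy("H")

def affinity_ratio(delta_delta_e, temperature=298.15):
    # Assumption: the energy difference is treated like a free energy difference.
    return math.exp(-delta_delta_e / (R_KCAL * temperature))

shift = deuteration_shift()
print(f"calculated shift: {shift:.2f} kcal/mol")
print(f"affinity ratio (calculated):   {affinity_ratio(shift):.1f}")
print(f"affinity ratio (experimental): {affinity_ratio(-0.75):.1f}")
```

Under these assumptions, both values correspond to an affinity increase of roughly 3.5 to 4.2 times, in line with the "3–4 times" statement above.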

In concluding this section, let us emphasize that receptor activation is a highly complex and dynamic process associated with large conformational changes between receptor states. These are difficult to investigate experimentally, while, at the same time, occurring on time scales that are inaccessible to direct molecular simulations. Still, the results presented here provide convincing support that hydrogen bonding interactions are involved in the receptor activation and strongly suggest that deuteration, as the simplest possible structural modification, can have a significant impact on the ligand affinity. This opens the door for the development of perdeuterated drugs, which could have different, and in some instances more favorable, clinical profiles compared with already marketed substances.

#### **3. Computational Details**

A homology structure of the H2 receptor was developed earlier [25], which revealed a good agreement with other models reported in the literature [40,41], and was employed here throughout the entire work. The structure of histamine contains an imidazole ring and an aminoethyl side chain, both of which can accept a proton if the medium is acidic enough. According to its p*K*a values, 6.0 for the imidazole nitrogen and 9.7 for the aliphatic amino group [42], at the physiological pH of 7.4, histamine is predominantly a monocation (96%) protonated at the free amino group. Also, its imidazole ring can exist in two tautomeric forms, 1*H*-imidazole and 3*H*-imidazole (Figure 1), denoted as Nπ–H and Nτ–H, respectively, with plenty of experimental and computational evidence in favor of the latter as the predominant structure in the aqueous solution [23,32–35]. With all this in mind, the structure of the histamine Nτ–monocation was considered in all simulations.
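The quoted monocation population can be reproduced from the two p*K*a values with the Henderson–Hasselbalch relation. The sketch below assumes two sequential, independent protonation steps (dication, monocation, neutral) and ignores tautomer populations; the function name is illustrative.

```python
def histamine_fractions(ph, pka_imidazole=6.0, pka_amine=9.7):
    """Fractions of dication, monocation and neutral histamine at a given pH.

    Assumes two sequential protonation equilibria described by macroscopic
    pKa values; populations are expressed relative to the monocation.
    """
    dication = 10 ** (pka_imidazole - ph)   # [dication] / [monocation]
    neutral = 10 ** (ph - pka_amine)        # [neutral] / [monocation]
    total = 1.0 + dication + neutral
    return {
        "dication": dication / total,
        "monocation": 1.0 / total,
        "neutral": neutral / total,
    }

fractions = histamine_fractions(7.4)
for species, value in fractions.items():
    print(f"{species:>10}: {100 * value:.1f}%")
```

At pH 7.4 this gives a monocation share of about 96%, with most of the remainder present as the dication.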

#### *3.1. Docking Analysis*

The structure of the histamine monocation was optimized with the Gaussian 16 software [43] employing the M06–2X DFT functional with the 6–31+G(d) basis set. To account for the effect of the aqueous solution, during the geometry optimization we included the implicit SMD polarizable continuum model [44] with all parameters for pure water. The molecular docking studies have been done with SwissDock [45], a web server for docking of small molecules on the target proteins based on the EADock DSS engine, taking into account the entire protein surface as potential binding sites for the investigated ligands. Both the preparation of the H2 receptor structure and the visualization of results were performed using the UCSF Chimera program (version 1.14) [46].

#### *3.2. Molecular Dynamics Simulation*

Several best binding poses of histamine within the H2 receptor, elucidated through the preceding docking analysis, were used for subsequent molecular dynamics simulations. To parametrize histamine, RESP charges were calculated at the HF/6–31G(d) level of theory in the Gaussian 16 program [43] to be consistent with the employed GAFF force field, while the protein was modeled using the AMBER ff14SB force field. Such a complex was then solvated in a truncated octahedral box of TIP3P water molecules spanning a 10 Å-thick buffer, neutralized by 12 Cl<sup>−</sup> anions, and submitted to the geometry optimization in the AMBER16 program package [47], employing periodic boundary conditions in all directions. An analogous setup, involving a 20 Å-thick buffer of water molecules around the isolated histamine monocation, joined by the Cl<sup>−</sup> counterion, whose position was fixed at the border of the simulation box by a force constant of 30 kcal mol<sup>−1</sup> and a position restraint between 19–21 Å from histamine, was utilized for the MD simulation pertaining to the aqueous solution. Both approaches were identically repeated to set up analogous simulations concerning the receptor and the aqueous solution without histamine and its counterion. In all instances, optimized systems were gradually heated from 0 to 300 K and equilibrated during 30 ps using NVT conditions, followed by production and unconstrained MD simulation of 300 ns, employing a time step of 2 fs at a constant pressure (1 atm) and temperature (300 K), the latter held constant using a Langevin thermostat with a collision frequency of 1 ps<sup>−1</sup>. The long-range electrostatic interactions were calculated employing the Particle Mesh Ewald method [48], and were updated in every second step, while the nonbonded interactions were truncated at 11.0 Å.

Histamine binding free energies, Δ*G*BIND, within the H2 binding pocket were calculated using the established MM-GBSA protocol [49,50] available in AmberTools16 [47], and in line with our earlier reports [51,52]. MM-GBSA is widely used for calculating the binding free energies from snapshots of the MD trajectory with an estimated standard error of 1–3 kcal mol<sup>−1</sup> [49]. For that purpose, 3000 snapshots collected from the last 30 ns of the corresponding MD trajectories were utilized. The calculated MM-GBSA binding free energies were decomposed into a specific residue contribution on a *per-residue* basis according to the established procedure [53,54]. This protocol evaluates contributions to Δ*G*BIND arising from each amino acid residue and identifies the nature of the energy change in terms of the interaction and solvation energies or entropic contributions.
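For orientation, the simulation and sampling parameters given above translate into step counts as in the following sketch. It assumes that the 3000 snapshots are evenly spaced over the final 30 ns, which is not stated explicitly in the text; the function and key names are illustrative.

```python
def md_bookkeeping(total_ns=300.0, timestep_fs=2.0,
                   window_ns=30.0, n_snapshots=3000):
    """Convert simulation lengths into integration steps and snapshot spacing.

    Assumes evenly spaced snapshots within the analysis window.
    """
    steps_total = round(total_ns * 1e6 / timestep_fs)    # 1 ns = 1e6 fs
    steps_window = round(window_ns * 1e6 / timestep_fs)
    stride_steps = steps_window // n_snapshots
    spacing_ps = stride_steps * timestep_fs / 1000.0      # 1 ps = 1000 fs
    return {
        "steps_total": steps_total,
        "steps_window": steps_window,
        "stride_steps": stride_steps,
        "spacing_ps": spacing_ps,
    }

info = md_bookkeeping()
print(info)
```

With the stated values, the production run corresponds to 150 million steps, and one snapshot every 5000 steps (10 ps) over the last 30 ns yields the 3000 frames used for MM-GBSA.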

#### *3.3. Quantum-Chemical Calculations*

Following the MD analysis, which identified residues dominating the histamine binding, we took a representative snapshot and extracted positions of the bound histamine and the surrounding residues 98–103, 186–190 and 250–254. The same residues were pulled out from the H2 receptor MD simulation without histamine. In this way, the cluster representation of the receptor binding site consisted of the following residues: Asp98, Val99, Met100, Leu101, Cys102, Thr103, Asp186, Gly187, Leu188, Val189, Thr190, Tyr250, Phe251, Thr252, Ala253 and Phe254, which were considered in their typical protonation forms according to the PROPKA 3.1 analysis [55] carried out on the entire homology structure. From the MD simulation in water, we extracted the positions of the nearest 36 water molecules within 4 Å from histamine, which allowed for a spherical solvent layer involving histamine's first solvation shell. An analogous cluster with the same number of waters was taken out from the MD simulation of a plain aqueous solution. In this way, we obtained the starting geometries for the quantum-chemical calculations involving the H2 receptor and the aqueous solution, both with and without histamine. These were submitted to a full geometry optimization at the M06–2X/6–31+G(d) level in Gaussian 16 [43]. Total molecular electronic energies were extracted without thermal corrections, so the results reported here correspond to differences in electronic energies. The effect of the rest of the receptor environment was considered through the implicit SMD solvation using a dielectric constant of ε = 4.0, as suggested by Himo and co-workers [56], and a dielectric constant of ε = 78.4 for the aqueous solution, in line with our previous reports [25,26,51].
In addition, such a truncated cluster-continuum model of the entire protein turned out to be very useful in rationalizing various aspects of the catalytic activity [57], selectivity [58] and inhibition [51] of the monoamine oxidase family of enzymes, and is broadly used by different groups to describe various biological phenomena [59–63], which justifies its use here.
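The extraction of a fixed number of nearest water molecules from an MD snapshot can be sketched as follows. This is not the script used in this work: the distance criterion (shortest distance from a water oxygen to any solute atom), the optional cutoff and all names are assumptions made for illustration.

```python
import math

def nearest_waters(solute_atoms, water_oxygens, n_keep=36, cutoff=None):
    """Return indices of the n_keep waters closest to the solute.

    solute_atoms and water_oxygens are sequences of (x, y, z) coordinates in Å.
    The distance of a water is taken as the shortest distance from its oxygen
    to any solute atom; waters beyond the optional cutoff are discarded.
    """
    scored = []
    for index, oxygen in enumerate(water_oxygens):
        distance = min(math.dist(oxygen, atom) for atom in solute_atoms)
        if cutoff is None or distance <= cutoff:
            scored.append((distance, index))
    scored.sort()
    return [index for _, index in scored[:n_keep]]

# Toy example with one solute atom and four waters on a line.
solute = [(0.0, 0.0, 0.0)]
waters = [(1.0, 0.0, 0.0), (5.0, 0.0, 0.0), (2.0, 0.0, 0.0), (3.0, 0.0, 0.0)]
print(nearest_waters(solute, waters, n_keep=2))
print(nearest_waters(solute, waters, n_keep=36, cutoff=4.0))
```

Keeping the number of waters identical in the clusters with and without histamine, as done here, is what makes the resulting electronic energies directly comparable.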

Lastly, although the literature presents a number of methods for the quantization of nuclear motion, relevant for studying the H/D isotope substitution, these are limited to only a few degrees of freedom. They are therefore not applicable here, since we have many critical protons directly involved in the H2 receptor-ligand recognition and water hydration. As such, we employed an approximate empirical treatment of the nuclear quantum effects based on the mentioned experimental work by Bordallo and co-workers [8], which showed that deuteration reduces the length of the acidic N–D bonds by 2.3% relative to the matching N–H bonds. With this in mind, we imposed the empirical quantization in the following way. Initially, all systems were fully optimized, thus mirroring the case with lighter H nuclei. After that, all acidic N–H, O–H and S–H bonds were shortened by 2.3% and kept frozen during the optimization of other geometric parameters, thus corresponding to heavier D nuclei, in accordance with our earlier reports [25,26].
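The geometric part of this empirical treatment can be illustrated with a minimal sketch that moves a hydrogen atom along its bond towards the heavy atom so that the bond becomes 2.3% shorter. Keeping the heavy atom fixed and displacing only the hydrogen is an assumption of this example, and the subsequent freezing of the bond during re-optimization has to be set up in the quantum-chemistry program itself and is not shown.

```python
import math

def shorten_bond(heavy, hydrogen, fraction=0.023):
    """Return new hydrogen coordinates with the X-H bond shortened by `fraction`.

    The heavy atom X stays in place and the hydrogen is moved along the
    X-H bond vector (an assumption of this sketch).
    """
    scale = 1.0 - fraction
    return tuple(x0 + scale * (x - x0) for x0, x in zip(heavy, hydrogen))

# Toy N-H bond of 1.020 Å along the z axis (illustrative geometry).
nitrogen = (0.0, 0.0, 0.0)
hydrogen = (0.0, 0.0, 1.020)
deuterium = shorten_bond(nitrogen, hydrogen)

old_length = math.dist(nitrogen, hydrogen)
new_length = math.dist(nitrogen, deuterium)
print(f"{old_length:.4f} Å -> {new_length:.4f} Å")
```

For a 1.020 Å bond, this corresponds to a shortening of about 0.023 Å.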

#### **4. Conclusions**

This study relied on a range of computational techniques to demonstrate the significance of hydrogen bonding and other non-covalent interactions for the binding of histamine to its H2 receptor, and evaluated how these are affected by deuteration. Molecular docking analysis determined histamine binding poses on the homology model of the H2 receptor, while molecular dynamics simulation underlined crucial residues governing the binding. This recognized Asp98 as the most dominant residue, accounting for almost 40% of the total binding energy, further held in place by Tyr250, which donates hydrogen bonding to Asp98 and accepts it from the histamine −NH3<sup>+</sup> group. In contrast to earlier literature reports, we showed that the significant role of Thr190 is not in the −OH hydrogen bonds, but rather in the C–H···π contacts with the imidazole ring, while the former is persistently involved in the hydrogen bonding with a more distant Asp186. The rest of the binding pocket is hydrophobic, allowing for a range of favorable contacts with Phe254, Phe251 and Val99, but also evident in a clear predominance of the *gauche* histamine conformation within the receptor, unlike the aqueous solution where it is *trans*. Molecular dynamics simulation in the aqueous solution revealed that the first histamine solvation shell involves five water molecules at all three nitrogen sites, yet the interaction with its −NH3<sup>+</sup> group mostly does not occur with three water molecules at the same time, but is linked with an average of 2.2 such contacts during the entire simulation.

Following molecular dynamics simulation, which identified receptor residues crucial for the binding and a representative cluster of 36 water molecules in the aqueous solution, quantum-chemical calculations at the M06–2X/6–31+G(d) level utilized the empirical quantization of the acidic X–H bonds (X = N, O, S) to support the increased histamine affinity upon deuteration. The overall binding was separated into two contributions, that from the interaction with the receptor and the one arising from the interaction with the solvent preceding the binding, which were both modeled through a cluster-continuum approach utilizing the implicit SMD solvation with the dielectric constants of ε = 4.0 for the receptor environment, and ε = 78.4 for the aqueous solution. The used computational setup gave the calculated difference in the binding free energy of −0.85 kcal mol<sup>−1</sup>, being in excellent agreement with the measured value of −0.75 kcal mol<sup>−1</sup>, thus confirming the relevance of hydrogen bonds for the receptor activation.

The results of this study highlight the importance of deuteration for the development of new drugs, as the selective replacement of exchangeable hydrogen atoms with deuterium can increase the duration of action due to their slower decomposition [64,65]. In addition, this can result in different, and in some instances more beneficial, clinical profiles compared with already marketed solutions, and further progress in this area is highly recommended. Finally, we are convinced that advanced molecular simulations of entire receptors with the inclusion of experimental data will eventually lead to a methodology able to discriminate between GPCR agonists and antagonists, which is currently limited to QSAR applications [66].

**Supplementary Materials:** The following are available online, Figures S1–S8 showing various analyses from the molecular dynamics simulation.

**Author Contributions:** Conceptualization, J.M. and R.V.; methodology, J.M. and R.V.; formal analysis, L.H. and R.V.; investigation, L.H.; data curation, L.H.; writing—original draft preparation, R.V.; writing—review and editing, L.H., J.M. and R.V.; visualization, L.H.; supervision, R.V. All authors have read and agreed to the published version of the manuscript.

**Funding:** Part of this research was funded by the Slovenian Research Agency, program group P1–0012.

**Acknowledgments:** L.H. wishes to thank the Croatian Science Foundation for a doctoral stipend through the Career Development Project for Young Researchers. J.M. thanks the Slovenian Research Agency for financial support. L.H. and R.V. would like to thank the Zagreb University Computing Centre (SRCE) for granting computational resources on the ISABELLA cluster.

**Conflicts of Interest:** The authors declare no conflict of interest. The funders had no role in the design of the study; in the collection, analyses, or interpretation of data; in the writing of the manuscript, or in the decision to publish the results.

#### **References**


**Sample Availability:** Samples of the compounds are not available from the authors.

**Publisher's Note:** MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

© 2020 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).

## *Article* **Pinacolone-Alcohol Gas-Phase Solvation Balances as Experimental Dispersion Benchmarks**

#### **Charlotte Zimmermann, Taija L. Fischer and Martin A. Suhm \***

Institut für Physikalische Chemie, Georg-August-Universität Göttingen, Tammannstr. 6, 37077 Göttingen, Germany; czimmer2@gwdg.de (C.Z.); tfische1@gwdg.de (T.L.F.)

**\*** Correspondence: msuhm@gwdg.de; Tel.: +49-551-3933112

Academic Editor: Ilya G. Shenderovich Received: 25 September 2020; Accepted: 28 October 2020; Published: 3 November 2020

**Abstract:** The influence of distant London dispersion forces on the docking preference of alcohols of different size between the two lone electron pairs of the carbonyl group in pinacolone was explored by infrared spectroscopy of the OH stretching fundamental in supersonic jet expansions of 1:1 solvate complexes. Experimentally, no pronounced tendency of the alcohol to switch from the methyl to the bulkier *tert*-butyl side with increasing size was found. In all cases, methyl docking dominates by at least a factor of two, whereas DFT-optimized structures suggest a very close balance for the larger alcohols, once corrected by CCSD(T) relative electronic energies. Together with inconsistencies when switching from a C4 to a C5 alcohol, this points at deficiencies of the investigated B3LYP and in particular TPSS functionals even after dispersion correction, which cannot be blamed on zero point energy effects. The search for density functionals which describe the harmonic frequency shift, the structural change and the energy difference between the docking isomers of larger alcohols to unsymmetric ketones in a satisfactory way is open.

**Keywords:** dispersion; ketone–alcohol complexes; density functional theory; hydrogen bonds; molecular recognition; vibrational spectroscopy; gas phase; benchmark; pinacolone

#### **1. Introduction**

In nature, directional hydrogen bonds to carbonyl groups [1,2] are frequent, for instance in proteins, DNA or other biopolymers [3,4]. London dispersion interactions are less directional, but at least as omnipresent [5]. An accurate and detailed theoretical description of these interactions and their cooperation or competition is urgently needed. As in any complex interplay, there is a risk of error cancellation. One may easily get the right answer for the wrong reason. The situation calls for systematic isolation attempts with respect to the different contributions. This can be achieved by the study of a series of small hydrogen-bonded complexes at low temperature in the supersonically expanded gas phase by rotational and vibrational spectroscopy [6–8]. Even at low temperature, anharmonic zero point vibrational energy (ZPVE) still complicates the comparison between electronic structure theory and experimental information on the relative energy of different molecular arrangements [9]. A more direct test of the potential energy landscape would be very desirable.

This has led to the concept of ketone solvation balances, which were introduced for acetophenone and its derivatives in combination with alcohols as hydrogen bond donors [10,11] and tested for other ketones [12,13]. The idea is to have two very comparable lone electron pairs available at the acetophenone oxygen, to which alcohols can either dock from the phenyl or from the alkyl side, with little difference in ZPVE. Besides the intrinsic preference of a docking alcohol for the methyl side due to the more favorable local hydrogen bond geometry [11], the alkyl group of the alcohol will interact dispersively (and by Pauli repulsion) with the two ketone substituents and thus contribute to the preference for one of the docking sides. These secondary interactions through space are able to tip the balance towards phenyl side docking [11]. The comparison of different alcohols and acetophenones thus provides information on London dispersion interactions competing with the electronic and zero-point vibrational local hydrogen bond effects, which still largely govern the position of the alcoholic OH vibration. The latter is used to spectrally discriminate the docking isomers and it also contains further information on the competition of forces, because hydrogen bonds can be distorted by distant interactions of the donor molecule. Experimental information on the docking preference comes from the relative abundance of the docking isomers in the quasi-equilibrium established by cooling collisions in a supersonic jet expansion, down to some conformational freezing temperature *T*c (roughly 30 to 150 K, depending on low (1 to 5 kJ mol<sup>−1</sup>) and narrow interconversion barriers [10,14,15]) and can thus only be predicted with a large tolerance.

The results of such studies can be used to benchmark the ability of different density functionals to predict the interplay of hydrogen bonding with distant London dispersion and Pauli repulsion, by simply comparing the predictions to experiment. This can be done strictly at the level of observables, without consulting any energy decomposition models [16–18], although the latter are helpful in the interpretation of the findings. A functional which gives the right answer for the right reason in the popular harmonic approximation for vibrations must be able to predict the splitting of the OH stretching vibrations between the docking isomers (because anharmonic effects by construction largely cancel when comparing the isomers) and the relative abundance of the isomers with a reasonable conformational temperature. As a third test, high level single point wavefunction calculations (for which Hessian calculations to reproduce the spectrum would be too costly) at the optimized DFT minima should confirm the energy predictions in a qualitative sense. If at least one of these three diagnostics fails, the DFT functional performance can be proven to be poor down to a sub-kJ/mol accuracy threshold. This was the case for one out of six pairings of aromatic ketones with alcohols in the first systematic study [11], for the otherwise most successful B3LYP-D3 functional. By using the second-most stable and less compact predicted structure in this particular case, the performance could actually be rescued [11]. This former systematic investigation thus suggests a mildly erroneous preference of the B3LYP-D3 functional (at least for a standard def2-TZVP basis set), and to a lesser extent also the TPSS-D3 approach, for compact structures. Other explored functionals such as M06-2X failed the aromatic ketone balance test in several aspects [11] and need not be considered further.

The hypothesis that B3LYP-D3 and TPSS-D3 show an (almost) acceptable performance for ketone dispersion balances obviously calls for further falsification attempts and this is the task of the present study which involves the purely aliphatic pinacolone (see Figure 1), where the phenyl group in acetophenone is replaced by a *tert*-butyl (*t*Bu) group. This removes aromatic–aliphatic dispersive interactions and brings in more bulky donor-acceptor constellations. Cyclopentanol (CpOH) is introduced as a further, more disk-like and flexible alcohol, in addition to methanol (MeOH) and *tert*-butyl alcohol (*t*BuOH), which have been previously explored with acetophenone [11]. Pinacolone monomer does not have a plane of symmetry [19], but in combination with low planarization barriers (Figures S2–S4) and symmetry-breaking alcohol coordination, this should not lead to additional complications in the analysis. Indeed, there are significant variations of the hydrogen bond angle *α* and the dihedral angle *τ* (see Figure 1) with alcohol substitution. These promise to explore the hydrogen bonding potential of carbonyl groups far away from the intrinsic in plane preference.

In this work, we show that, in alcohol–pinacolone balances, the methyl docking side is consistently preferred. According to exploratory calculations, this may extend to many alcohols beyond the experimentally investigated ones. Further, we show that the predictive quality of the two density functionals which were successful for acetophenone (B3LYP and TPSS) decreases with the size of the alcohol, including significant failures for the largest (CpOH). The proposed assignments and observed trends are discussed and an analysis of dispersion interactions on the docking side preference is presented. We provide initial evidence that some of the superficially satisfactory DFT performance for ketone balances must be fortuitous.

**Figure 1.** Schematic representation of the two possible docking sides (*t*Bu and Me) of a pinacolone molecule for different alcohols (R-OH, with the abbreviations Me for methyl, *t*Bu for *tert*-butyl and Cp for cyclopentyl as R).

#### **2. Results and Discussion**

We start with the theoretical description of alcohol–pinacolone 1:1 complexes at the level of DFT before comparing to the experimental findings and finally consulting wave-function theory.

#### *2.1. Density Functional Predictions*

From now on, the abbreviations Pin for the studied ketone pinacolone and MeOH (methanol), *t*BuOH (*tert*-butyl alcohol) and CpOH (cyclopentanol) for the solvating alcohols are consistently used. In Figure 1, two angles *α* and *τ* describing the hydrogen bond geometry are introduced. The hydrogen bond angle *α* between the hydrogen-bonded H and the carbonyl group has a local, sp<sup>2</sup>-explainable preference for ≈120°. The dihedral angle *τ* describes the out-of-plane twist of the docking alcohol OH with respect to the carbonyl plane, with two local preferences near 0° and 180°. Any deviations from these local preferences due to global interactions sensitively affect the hydrogen-bonded OH stretching wavenumber.

As detailed in Table S2 and Figure S1, all six experimentally investigated 1:1 complexes show a narrow distribution for *α* (115–124°) at the four investigated DFT levels (D3-corrected B3LYP and TPSS with triple and quadruple zeta basis sets). *τ* deviates from planarity with increasing size of the alcohol, in steps of roughly 10° from MeOH over *t*BuOH to CpOH. On the *t*Bu docking side of Pin, even MeOH is already displaced by 35–37°, due to the bulkiness of the substituent, whereas the Me docking displacement is less than 10° for MeOH.

The structural trends are reflected in the calculated OH stretching wavenumbers (see Table S4), which are consistently lower for Me docking for all three alcohols, whereas the trend with increasing alcohol size is comparatively weak, relative to the overall hydrogen bond shift. This assists a straightforward interpretation of the experimental spectra.

The energy differences between Me and *t*Bu docking sides fall between 0 and 3 kJ mol<sup>−1</sup>, always preferring the Me side, as shown in Figure 2. The narrow corridor of ±0.2 kJ mol<sup>−1</sup> in the figure (gray lines) illustrates that it makes almost no difference whether harmonic ZPVE is included or not. The effect of basis size extension is similarly small. This is very favorable for a direct judgement of the DFT functional in terms of the predicted electronic energy difference without worrying about major (anharmonic) zero point energy or basis set effects, which can both be quite significant when looking at absolute energies and frequencies [10,20].

**Figure 2.** Harmonically zero-point corrected energy differences Δ*E*<sup>0</sup><sub>Me−*t*Bu</sub> plotted against the electronic energy differences Δ*E*<sup>el</sup><sub>Me−*t*Bu</sub> referenced to the *t*Bu side, computed at B3LYP-D3 (green) and TPSS-D3 (black) level, each with a def2-TZVP (empty symbols) and def2-QZVP (filled symbols) basis set. The electronic energy differences are seen to be a good approximation to experimentally relevant ZPVE-inclusive differences and the methyl docking side is systematically preferred (see also Table S3).

The predicted spread in docking energy difference of about 2.5 kJ mol<sup>−1</sup> across the systems promises a large variation of the experimental abundances, but the absence of a sign reversal (corresponding to an absence of data points in the upper right quadrant of Figure 2) despite varying the alcohol size from 1 to 5 carbon atoms is surprising. An explorative search for almost 20 other alcoholic donors (see Table S5) confirms this systematic bias. The steric disadvantage of the *t*Bu side of Pin together with the flexibility of alcohols provides possible explanations. The latter allows the alcohol to dock on the sterically more accessible Me side and at the same time to exploit London dispersion interaction with the *t*Bu side. A good example is benzyl alcohol, where the Me-sided structure is almost 2 kJ mol<sup>−1</sup> more stable, because the benzyl group can still interact favorably with the *t*Bu group of Pin while the OH group is docking to the Me side of Pin.

Another important feature of carbonyl balances is the feasibility of the isomerization under supersonic jet expansion conditions. A transition state search between the two competing structures for MeOH–Pin yielded an interconversion barrier height of about 3 kJ mol<sup>−1</sup> when viewed from the *t*Bu docking structure. The interconversion path is distinctly out-of-plane, relaxing the hydrogen bond angle *α* while switching between small and large *τ*. This is similar to previous findings for acetophenone [11] and its derivatives and supports a feasible interconversion under supersonic jet conditions, with *T*c values significantly below the starting temperature of the expansion. However, the more numerous the contacts between the residue and Pin are, the larger this barrier may become. This is one reason this work focuses on small alcohols to establish the performance of the DFT functionals.

Before switching to experiment, two important observable predictions need to be explored. One is a sufficiently robust infrared cross section ratio for the docking isomers, which is a precondition for reliable experimental abundance determinations from spectral intensities. As shown in Figure S5, the basis set and functional dependences are modest and the trends are smooth, such that this variation and the double-harmonic approximation are not expected to be critical for the theory-experiment comparison.

The most important theoretical assignment aid concerns the predicted positions and differences or splittings of the OH stretching fundamental vibrations. While the harmonic approximation is too crude for absolute predictions, the harmonic Me-*t*Bu differences involve systematic cancellation of anharmonic contributions for similar docking environments. Furthermore, the structural effect of increasing alcohol size is qualitatively similar on both docking sides, as pointed out above, and should translate into relatively uniform wavenumber splittings as a function of the number of alcoholic C atoms. This is illustrated in Figure 3. In all cases, the Me-docking wavenumber is lower, corresponding to a uniformly negative Δ*ω*Me−*t*Bu value and facilitating experimental assignment. The size of the splitting exceeds the spectral resolution and band width [11] by more than an order of magnitude, which is also favorable.

**Figure 3.** Computed OH wavenumber difference between the two docking sides Δ*ω*Me−*t*Bu relative to the number of C-atoms of the corresponding alcohol. This shows that the employed computational methods predict the same spectral trends for MeOH and *t*BuOH, indicated by dashed lines. For CpOH a somewhat larger discrepancy can be observed, with the smaller basis set TPSS result differing most from the experimental trend (blue) (see Table S4 for details).

With a single exception (TPSS for CpOH for the larger basis set), all predicted harmonic splittings are within ±12 cm<sup>−1</sup> of the average value of −42 cm<sup>−1</sup> and there is only a weakly decreasing trend for the splitting with increasing alcohol size. For CpOH, predictions range from −50 to −63 cm<sup>−1</sup>. These variations are also reflected in the *τ* angle (Table S2). They are robust with respect to cross-over re-optimization and indicate a slight TPSS-bistability of the structure depending on the basis set. Anticipating the experimental (anharmonic) result reported in the next section (blue symbols and lines in Figure 3), the larger basis set result appears less likely in absolute numbers but more likely in terms of the trend. Even beyond this outlier, it is clear that the alcohol size trends are not predicted perfectly, thus underscoring the benchmarking potential of this study.

#### *2.2. Experimental Results*

In Figure 4, the experimental infrared spectra for helium supersonic jet expansions of Pin with MeOH (green), *t*BuOH (orange) and CpOH (blue) are shown. They feature the rovibrationally broadened alcohol monomer OH stretching bands (MeOH, *t*BuOH, CpOH), the downshifted hydrogen-bonded homodimer signals ((MeOH)<sub>2</sub>, (*t*BuOH)<sub>2</sub> and (CpOH)<sub>2</sub>), as well as the narrow bands of the mixed complexes with docking isomerism OMe and O*t*Bu. For MeOH, OMe and O*t*Bu are spectrally downshifted compared to the respective homodimer band, as one might expect from an intrinsically stronger OH···O=C interaction, whereas, in the case of *t*BuOH and CpOH, they are upshifted. This is already an experimental sign for competition between hydrogen bonding and more global London dispersion interactions. Even when the alcohol is in Me docking position, where there is no steric crowding, it is displaced out of the ketone plane to maximize interaction with the *t*Bu group (see Table S2). When CpOH is combined with acetone, which lacks the *t*Bu group (see Figure S6), the homodimer and mixed dimer signals actually overlap. This is partly due to less competition from dispersion interaction with the other side of the ketone for the hydrogen bond.

**Figure 4.** FTIR jet OH stretching spectra of Pin with the three alcohols. The 1:1 complexes are marked with O, indexed by the assigned docking preference. Both docking sides are observed. Pin is only a stronger OH shifting partner than the alcohol itself for MeOH.

The more downshifted mixed dimer band OMe in Figure 4 is always significantly stronger and based on the robust DFT predictions for structure (Table S2) and downshift (Figure 3 and Table S4), it must be due to Me docking, as implied by the label. Given that its spectral visibility (Figure S5) is at best twice that of the *t*Bu isomer, it must also be the more stable isomer, in agreement with the DFT computations (Figure 2).

The experimental shift between OMe and O*t*Bu spans a relatively narrow range of 30 to 40 cm<sup>−1</sup> (Figure 3 and Table S6), which roughly matches the DFT prediction window, except for the TPSS outlier. In many cases, the DFT splitting is somewhat larger than the experimental one, which matches the general overestimation of hydrogen bond shifts by most density functionals. The subtle alcohol substitution trend in the splittings (Figure 3) is not well reproduced, being monotonically decreasing for the DFT predictions and non-monotonic in the experiment, but, considering the superposition of Me and *t*Bu trends, this appears acceptable and does not complicate the spectral assignment.

In Figure 5, the experimentally determined downshifts from the monomer OH fundamental are plotted against the corresponding calculated ones. The fact that all correlation points stay below the diagonal line confirms the systematic overestimation of DFT downshifts, which is more pronounced for TPSS than for B3LYP [11] and only in part due to anharmonicity. The slope of the data points matches the diagonal for methanol (dashed arrows connect isomers), but it becomes flatter for the more bulky alcohols. This indicates that the DFT calculations overestimate the hydrogen bond weakening by bulkiness (dispersion and/or exchange repulsion). Note that non-isomeric acetone docking results [21] (for CpOH, see Figure S6) included in the figure also fit the Pin data for Me docking.

**Figure 5.** Experimental (anharmonic) downshift of the 1:1 complexes Δ*ν̃*<sub>M,exp</sub> plotted against the harmonically computed downshifts Δ*ω*<sub>M,theo</sub> for four computational variants. The harmonic DFT overestimation and the trend between docking sides (dashed arrows from *t*Bu to Me docking) are uniform.

The CpOH–Pin case is also suspicious in terms of the B3LYP energy gap between Me and *t*Bu docking. Based on Figure 4, Me docking should be substantially more stable, even more so if statistically formed conformations freeze rather early in the expansion. However, the predicted energy difference is ≤0.5 kJ mol<sup>−1</sup> (see Table S3), far too low for such an imbalance. Attempts to rescue the situation in analogy to the acetophenone balance study [11] by searching for metastable minima on the DFT hypersurfaces failed (see Tables S14 and S15 for details). In this context, the pseudorotational isomerism of the axial CpOH monomer should be briefly discussed. There are two nearly isoenergetic (B3LYP-D3(BJ,abc)/def2-TZVP energy difference less than 0.1 kJ mol<sup>−1</sup>) and isospectral isomers (about 4 cm<sup>−1</sup> wavenumber difference, leading to slight shift uncertainty), depending on the position of the axial OH in the envelope conformation. This has caused uncertainty in rotational spectroscopy [22] and has also been addressed in liquid-state spectroscopy [23]. However, structure optimizations indicate that this subtle isomerism is relaxed in the complexes with Pin, leading to uniform axial gauche results (or energetically >2 kJ mol<sup>−1</sup> higher conformations). The problem is thus more fundamental, as the following analysis supports.

For this purpose, the experimental abundance is compared to the predicted B3LYP energy difference for all three investigated systems by calculating a concentration ratio *c*Me/*ct*Bu, which follows from the experimental intensity ratio and the def2-TZVP absorption cross sections (see Table 1). The maximum and minimum values for *I*Me/*It*Bu from a Monte Carlo integration program [24] generate a maximum and minimum value for *c*Me/*ct*Bu which is further transformed to a (semi-)experimental *xt*Bu range. The values confirm that Me docking is strongly preferred for all systems. This result is completely robust with respect to the four theoretical levels, even allowing for possible ZPVE errors of ±0.2 kJ mol<sup>−1</sup> and for residual errors in the theoretical cross section ratio (see Tables S3, S7 and S8 for more details).
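The conversion chain described above can be sketched in a few lines. This is a minimal illustration, assuming that the integrated band intensity of each isomer is proportional to its concentration times its computed IR cross section; the function names and the numbers in the example are hypothetical and are not taken from Table 1.

```python
def concentration_ratio(intensity_ratio, cross_section_ratio):
    """c_Me/c_tBu from I_Me/I_tBu and the computed sigma_Me/sigma_tBu.

    Assumes I is proportional to c * sigma for each docking isomer.
    """
    return intensity_ratio / cross_section_ratio


def tbu_fraction(c_me_over_c_tbu):
    """x_tBu = c_tBu / (c_tBu + c_Me), written in terms of c_Me/c_tBu."""
    return 1.0 / (1.0 + c_me_over_c_tbu)


# Hypothetical illustration only: an intensity ratio of 6 combined with a
# cross-section ratio of 2 corresponds to c_Me/c_tBu = 3 and x_tBu = 0.25.
ratio = concentration_ratio(6.0, 2.0)
x_tbu = tbu_fraction(ratio)
```

Because *xt*Bu decreases monotonically with *c*Me/*ct*Bu, the upper bound of the intensity ratio maps onto the lower bound of the *t*Bu fraction and vice versa.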

One should emphasize that the predicted energy imbalance between the two docking isomers is always below 8 % (Table S9), so rather small on an absolute scale. Our experiment is thus rather sensitive in detecting errors in this imbalance, making it suitable for benchmarking studies [10].

**Table 1.** Experimental integrated intensity ratios *I*Me/*It*Bu, B3LYP-D3(BJ,abc)/def2-TZVP cross-section derived docking ratios *c*Me/*ct*Bu and resulting experimental fractions *xt*Bu for *t*Bu docking. The given ranges represent 95% confidence for *I*Me/*It*Bu using an automated statistical evaluation [24] and are carried on to *c*Me/*ct*Bu and *xt*Bu without including a theoretical cross section ratio uncertainty.


Figure 6a plots the (semi-)experimental fraction of *t*Bu docking (Table 1 and Tables S7 and S8) against the energy difference prediction for the four combinations of functional with basis set. The grey areas indicate qualitative inconsistencies between theoretical prediction and experiment, within the assumptions of uniform anharmonicity and accurate cross section ratio. If Me docking is energetically favorable, *t*Bu should not dominate the expansion and vice versa. Asymmetrical error bars are obtained by taking the mean value for *I*Me/*It*Bu as the data point and using the Monte Carlo determined range (see Table 1 and Table S7) as the boundaries.
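The qualitative consistency test behind the grey areas can be written as a small predicate. This is a sketch of one possible reading, assuming a data point counts as inconsistent only when the predicted energy difference favors one side by more than the ZPVE tolerance while the entire experimental *xt*Bu range lies on the other side of 0.5; the function and argument names are hypothetical.

```python
def is_consistent(delta_e0_me_minus_tbu, x_tbu_min, x_tbu_max,
                  zpve_tolerance=0.2):
    """Return True if theory and experiment agree on the dominant side.

    delta_e0_me_minus_tbu: E0(Me) - E0(tBu) in kJ/mol (negative favors Me).
    x_tbu_min, x_tbu_max: experimental range of the tBu docking fraction.
    zpve_tolerance: allowed anharmonic ZPVE error in kJ/mol.
    """
    if delta_e0_me_minus_tbu < -zpve_tolerance:
        # Me docking clearly favored: tBu must not dominate the expansion.
        return x_tbu_min <= 0.5
    if delta_e0_me_minus_tbu > zpve_tolerance:
        # tBu docking clearly favored: Me must not dominate the expansion.
        return x_tbu_max >= 0.5
    # Within the tolerance, no qualitative statement is possible.
    return True
```

With hypothetical inputs, `is_consistent(-1.0, 0.1, 0.3)` is true, whereas `is_consistent(1.0, 0.1, 0.3)` is false.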

At first sight, experiment and DFT theory (Figure 6a) are consistent with each other and different DFT levels cannot be discriminated against each other. Even the obvious outliers for CpOH can be accommodated in the allowed (white) area. However, two closer looks at the data reveal deficiencies.

To bring the different theory levels closer together, one can plot the experimental *t*Bu docking abundance gain Δ*xt*Bu against the theoretical *t*Bu docking energy gain ΔΔ*E*<sup>0</sup><sub>*t*Bu</sub>, when switching from MeOH to a heavier alcohol (Figure S7). One would expect that any energy gain leads to a docking abundance gain, but all DFT methods predict a higher energy gain for CpOH and experiment finds a higher docking abundance gain for *t*BuOH. Clearly, the DFT description is somewhat imbalanced for either CpOH or *t*BuOH or for both.

Another way of analyzing the deficiency is to calculate an effective conformational temperature *T*c for each DFT method and pair of isomers from the experimental band integral ratio and the computed IR band strength ratio [10]. Based on the (semi-)experimental concentration ratios *c*Me/*ct*Bu listed in Table 1 and the computed energy differences Δ*E*<sup>0</sup><sub>Me−*t*Bu</sub> in Figure 6, this can be obtained as

$$T_{\mathrm{c}} \approx -\frac{\Delta E^{0}_{\mathrm{Me}-t\mathrm{Bu}}}{R \ln \dfrac{c_{\mathrm{Me}}}{c_{t\mathrm{Bu}}}}$$

with the universal gas constant *R*, if there are no symmetry differences between the docking isomers and the rovibrational partition functions are sufficiently similar due to supersonic jet cooling. *T*c should roughly fall in the range of 30 to 150 K [11,15]. This is the case for almost all 12 combinations of system and method, within the respective error bar (Table S10). Only TPSS for *t*BuOH–Pin gives higher *T*c values and B3LYP for CpOH–Pin is borderline on the low end. The former could be due to a higher interconversion barrier but the latter is likely due to an overestimated stability of the *t*Bu docking side.
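As a worked illustration of the expression for *T*c, the following sketch evaluates it for a given energy difference and concentration ratio. The function name and the example inputs are hypothetical and are not values from Table 1 or Table S10.

```python
import math

R = 8.314462618e-3  # universal gas constant in kJ mol^-1 K^-1


def conformational_temperature(delta_e0_me_minus_tbu, c_me_over_c_tbu):
    """Effective T_c in K from E0(Me) - E0(tBu) in kJ/mol and c_Me/c_tBu."""
    log_ratio = math.log(c_me_over_c_tbu)
    if log_ratio == 0.0:
        raise ValueError("equal abundances leave T_c undefined")
    return -delta_e0_me_minus_tbu / (R * log_ratio)


# Hypothetical illustration: Me docking favored by 1.0 kJ/mol and observed
# four times more abundant gives a T_c of roughly 87 K.
t_c = conformational_temperature(-1.0, 4.0)
```

A negative result signals that the predicted energy ordering contradicts the observed abundance ordering, which is why the sign of *T*c is itself a diagnostic.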

These inconsistencies call for a check with wavefunction theory, which is presented in the next section.

**Figure 6.** Experimental *t*Bu docking fraction *xt*Bu = *ct*Bu/(*ct*Bu + *c*Me) (based on 95% confidence intervals and the mean value for the ratio *I*Me/*It*Bu from Table 1 and Tables S7 and S8) plotted against the computed ZPVE corrected energy differences Δ*E*<sup>0</sup><sub>Me−*t*Bu</sub>. Grey areas indicate inconsistency between experiment and theory, when allowing for an estimated anharmonic ZPVE error of ±0.2 kJ mol<sup>−1</sup> and assuming correct cross section ratios from the respective theoretical model. (**a**) DFT energies, where all models predict the correct qualitative docking preference, but the correlation of energy and abundance is non-uniform. (**b**) As in (**a**), but with the electronic energy being replaced by the corresponding DLPNO-CCSD(T) value (see Section 2.3).

#### *2.3. DLPNO-CCSD(T) Check*

For the large complexes of interest in this work, harmonic frequency analysis and thus zero-point energy calculation is not very practical beyond DFT level. However, single point energies at DLPNO-CCSD(T) level [25] were calculated at the minima obtained for the various DFT methods (with the setting tightPNO, basis sets aug-cc-pVQZ and aug-cc-pVQZ/C, see Table S1). They offer several benefits.

First, they allow judging which of the DFT methods is likely closer to the true minimum by looking at the absolute CCSD(T) energies [9]. In all cases, B3LYP outperforms TPSS but in most B3LYP cases the smaller basis set gives a slightly lower energy. This may be taken as a weak indication that the B3LYP structures are closer to reality, but there could be some compensation between intra- and intermolecular degrees of freedom.

Second, one can replace the DFT electronic energy difference between isomers by the corresponding DLPNO-CCSD(T) difference and keep the structural and ZPVE contributions from the DFT level. This generates a variant of Figure 6a, in which the data points for all larger alcohols now fall close to or into the lower-right grey and thus unphysical region, where major *t*Bu docking is expected but Me docking is predominantly observed (see Figure 6b). Only MeOH stays in the physically meaningful range. The mere fact that DLPNO-CCSD(T) correction leads to such large energy difference changes casts doubt on the quality of the DFT (in particular TPSS) structures. Note that all 12 corrections (see Table S12) promote *t*Bu docking, so the DFT error is highly systematic. For B3LYP, the corrections stay below 1 kJ mol<sup>−1</sup>, for TPSS they always exceed 1 kJ mol<sup>−1</sup>. Because experiment is consistent with a preference for Me docking in all cases, this likely means that the DFT structures for Me docking are relatively far from the best ones, in particular for TPSS. As the CCSD(T) corrections are quite uniform for all three alcohol–pinacolone complexes, it is plausible that the deficiency does not reside so much in the dispersion correction but rather in the functional and its description of differences in hydrogen bonding to the acceptor C=O group.
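The replacement described in this paragraph amounts to simple bookkeeping, sketched below under the assumption that the harmonic ZPVE contribution is the difference between the ZPVE-corrected and the electronic DFT energy differences. The function name and the example numbers are hypothetical and are not taken from Table S12.

```python
def ccsdt_corrected_delta_e0(delta_e0_dft, delta_eel_dft, delta_eel_cc):
    """Swap the DFT electronic energy difference for the DLPNO-CCSD(T) one.

    All arguments are Me - tBu differences in kJ/mol. The harmonic ZPVE
    contribution is retained from the DFT level.
    """
    zpve_contribution = delta_e0_dft - delta_eel_dft
    return delta_eel_cc + zpve_contribution


# Hypothetical illustration: a DFT preference for Me docking of 1.0 kJ/mol
# shrinks to about 0.2 kJ/mol when the single point energy difference
# promotes tBu docking by 0.8 kJ/mol.
corrected = ccsdt_corrected_delta_e0(-1.0, -1.1, -0.3)
```

The size of the correction, `delta_eel_cc - delta_eel_dft`, is the quantity that the text compares against the 1 kJ mol<sup>−1</sup> mark for B3LYP and TPSS.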

A third application of DLPNO-CCSD(T) is to provide dispersion contributions to the interaction energy in the LED scheme [16,17] (Table S11). This is a refined way of obtaining such (strictly speaking non-observable) dispersion energies, which is conceptually better than simply evaluating the size of the D3 correction in the complex (Table S13). In the present case, the numbers obtained for both methods are quite similar, but this cannot be generalized except perhaps for large distances, where London dispersion is best defined and LED [16], SAPT [18] or empirical dispersion correction [26] should become comparable as leading corrections to long range electrostatic and inductive interactions. Dispersion always favors *t*Bu docking, by 1.5 to 3.1 kJ mol<sup>−1</sup> in the LED scheme (1.6 to 2.8 kJ mol<sup>−1</sup> for D3 corrections). The dependence on the size of the alcohol is quite modest, but CpOH tends to show the largest gains, at least for B3LYP.

Returning to the conformational freezing temperature analysis, now with DLPNO-CCSD(T)-corrected values (Table S10), only MeOH docking yields reasonable *T*c values (larger than 30 K). For *t*BuOH docking, B3LYP predicts borderline *T*c values and TPSS predictions are far too low. For CpOH, none of the CCSD(T)-corrected DFT results give physical *T*c values.

In summary, the DLPNO analysis shows that dispersion-corrected TPSS docking structures are imbalanced, more so than B3LYP structures. It confirms that beyond MeOH, the best isomer energy predictions are inconsistent with experiment or at best borderline (for B3LYP and *t*BuOH docking). Compared to acetophenone [11], Pin is seen to be a more critical test ketone. As it is purely aliphatic, there is likely some error compensation in the apparently more successful mixed aliphatic-aromatic acetophenone case [11].

#### **3. Materials and Methods**

The spectroscopic data were obtained by probing pulsed supersonic slit jet expansions of Pin+alcohol-seeded helium gas with a synchronized FTIR spectrometer. Specifically, helium (Linde 99.996%) is led through a temperature-controlled gas-flow system, where it passes separate gas saturators filled with the analytes pinacolone (Alfa Aesar > 97%) and alcohol (methanol (Roth ≥ 99.9%), *tert*-butyl alcohol (Roth ≥ 99.9%) or cyclopentanol (Fluka Chemicals > 99.9%)). The gas mixture is filled into a 67 L reservoir at a pressure of 0.75 bar and pulsed through six magnetic valves into a pre-expansion chamber which is terminated by a 600 mm long and 0.2 mm wide slit nozzle. During about 0.2 s, the gas flows through this slit into a vacuum chamber connected to a buffer volume (23 m<sup>3</sup>), which is continuously evacuated by a series of pumps with a power of 500 to 2500 m<sup>3</sup> h<sup>−1</sup>. The expansion is crossed by a modulated and softly focused IR beam from a Bruker IFS 66v/S FTIR spectrometer with a 150 W tungsten filament, CaF<sub>2</sub> optics and a liquid nitrogen cooled InSb detector. The scans are obtained with a resolution of 2 cm<sup>−1</sup> and are synchronized with the gas pulse. The shown spectra are averaged over 300–425 gas pulses. More details on the experimental setup can be found elsewhere [27]. No evidence was found that more than two structural isomers of the studied 1:1 complexes are formed during the experiment.

To determine the band integral ratios *I*Me/*It*Bu, an automated statistical evaluation was used, where the main entering parameters include the band positions and band width, which is statistically varied (chosen at (3.0 ± 0.5) cm<sup>−1</sup>) [24]. The program adds synthetic noise to the spectra, providing statistical error bars for *I*Me/*It*Bu. The resulting 95% confidence interval was used for further data processing.
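The general idea of such a statistical evaluation can be sketched as follows. This is not the program of Ref. [24], whose internals are not described here, but a simplified stand-in under stated assumptions: the band width is drawn from a normal distribution, the integration window is taken as twice that width on either side of the band position, the noise is Gaussian, and the 95% interval is read from the 2.5 and 97.5 percentiles. All names, band positions and amplitudes are hypothetical.

```python
import math
import random


def band_area(wavenumbers, absorbances, centre, half_window):
    """Trapezoidal integral over the points within centre +/- half_window."""
    pts = [(x, y) for x, y in zip(wavenumbers, absorbances)
           if abs(x - centre) <= half_window]
    return sum(0.5 * (y0 + y1) * (x1 - x0)
               for (x0, y0), (x1, y1) in zip(pts, pts[1:]))


def intensity_ratio_interval(wavenumbers, absorbances, centre_me, centre_tbu,
                             noise_sigma, n_trials=500, seed=0):
    """95% interval of I_Me/I_tBu from repeated noisy re-integration."""
    rng = random.Random(seed)
    ratios = []
    for _ in range(n_trials):
        # Assumed: band width varied as (3.0 +/- 0.5) cm^-1, kept positive.
        width = max(rng.gauss(3.0, 0.5), 0.5)
        # Assumed: synthetic noise is Gaussian and uncorrelated.
        noisy = [y + rng.gauss(0.0, noise_sigma) for y in absorbances]
        area_me = band_area(wavenumbers, noisy, centre_me, 2.0 * width)
        area_tbu = band_area(wavenumbers, noisy, centre_tbu, 2.0 * width)
        ratios.append(area_me / area_tbu)
    ratios.sort()
    lo = ratios[int(0.025 * (n_trials - 1))]
    hi = ratios[int(0.975 * (n_trials - 1))]
    return lo, hi


def gaussian(x, centre, amplitude, fwhm=3.0):
    s = fwhm / (2.0 * math.sqrt(2.0 * math.log(2.0)))
    return amplitude * math.exp(-0.5 * ((x - centre) / s) ** 2)


# Hypothetical synthetic spectrum: two bands with a true area ratio of 2.
grid = [3480.0 + 0.5 * i for i in range(161)]
spectrum = [gaussian(x, 3500.0, 2.0) + gaussian(x, 3535.0, 1.0) for x in grid]
lo, hi = intensity_ratio_interval(grid, spectrum, 3500.0, 3535.0,
                                  noise_sigma=0.01)
```

For this synthetic input, the returned interval should bracket the true ratio of 2 narrowly. Real jet spectra with overlapping bands and baseline drift would need a more careful band model.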

DFT calculations were used for assignment purposes and to trigger future benchmarking of their ability to describe the combination of hydrogen bonding and distant London dispersion interactions. Therefore, they were limited to two functionals and two basis sets, but others are invited to find more powerful density functionals for this challenge. The initial structural search (manual and using Crest [28]) was carried out at B3LYP-D3/def2-TZVP level [29–32]. Reoptimization was carried out with a def2-QZVP basis set [32] and with the meta-GGA functional TPSS-D3 [33] using the same def2-TZVP and def2-QZVP basis sets. Three body-inclusive D3 dispersion correction [26] with Becke–Johnson damping [34–37] was always applied. Single point energies were obtained using DLPNO-CCSD(T) [25,38,39] at the DFT-optimized structures. For all these calculations, ORCA version 4.2.1 [40] was used. Further information on computational details can be found in the Supplementary Materials (Table S1). Thermal corrections to the isomer equilibrium were neglected due to the low and mode-dependent temperatures in a jet expansion, with rotational temperatures expected to be on the order 10 K. Vibrational temperatures are on the order of 100 K and conformational temperatures, which depend on the barrier between isomers, are discussed in the main text [10]. The harmonic treatment of the ZPVE is expected to be more than sufficient for this kind of systems and for the achievable accuracy, due to the near-equivalence of the two lone electron pairs [11]. A transition state search for one system was carried out with Woelfling (Turbomole [41,42]) and followed by an optimization with ORCA version 4.2.1 [40].

#### **4. Conclusions**

Three alcohols of increasing size were combined with pinacolone to determine the hydrogen bonding preference to either the methyl- or the *tert*-butyl-facing lone electron pair of the keto group. As generally predicted for almost two dozen alcohols by dispersion-corrected B3LYP calculations, the methyl side is preferred for methanol, *tert*-butyl alcohol and cyclopentanol. This was qualitatively confirmed by infrared spectroscopy of supersonic jet expansions in combination with approximate IR absorption cross sections. Quantitatively, the DFT predictive power in terms of the spectral splitting decreases with increasing alcohol size. In addition, the observed spectral abundance does not correlate systematically with the predicted energy difference. DLPNO-CCSD(T) energy calculations indicate that B3LYP provides a somewhat better description of the combined hydrogen bond and London dispersion interaction than TPSS. However, in combination with the experiment, they suggest that docking on the methyl side is systematically underrated by both density functionals on the 1 kJ mol<sup>−1</sup> scale. This only amounts to about 3% of the total binding energy but is quite significant on the relative energy scale of competitive ketone docking.

Intermolecular energy balances are thus shown to be powerful benchmarking tools to assess the ability of DFT methods to describe hydrogen bonding in competition with London dispersion. The ketone balance variety is particularly useful, as it involves systematically compensating zero-point-energy contributions and therefore allows judging electronic structure predictions in a rather direct way. For acetophenone, only a slight deficiency of the B3LYP functional could be identified [11]. For pinacolone, none of the investigated functionals comes close to describing the spectral splitting and the energetics of the docking isomerism for all three alcohols, but D3-corrected B3LYP performs satisfactorily for methanol docking and borderline for *tert*-butyl alcohol. The qualitative failure of theory to describe the experimentally observed cyclopentanol docking invites studies of related complexes, such as cyclohexanol–pinacolone and cyclopentanol–acetophenone. Larger modifications involve the use of phenol [43] and the switch from the OH chromophore to NH stretching as a probe of the conformational preference.

The goal is to find a density functional which systematically reproduces the harmonic wavenumber splitting between docking isomers within better than about 10 cm<sup>−1</sup> and which provides a conformational temperature of the correct sign between about 30 and 150 K across a large number of isomeric complexes with low interconversion barrier. Furthermore, DLPNO-CCSD(T) correction should not change the energy difference between the isomers by more than about 0.5 kJ mol<sup>−1</sup>, thus indicating a sufficiently balanced structural description. The best-performing B3LYP-D3/def2-QZVP approach in the present study only fulfills about half of these criteria for the three systems and the corresponding TPSS-D3 calculation even fewer than a quarter. Considering that some of these matches will be fortuitous, this is clearly not a satisfactory state, calling for further experimental and theoretical investigations.
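As a rough illustration of the conformational temperature criterion, the temperature implied by a two-isomer equilibrium can be estimated from a Boltzmann ratio. The sketch below assumes equal degeneracy of the two isomers and an energy difference that already includes zero-point effects; the function name and the numerical inputs are hypothetical and are not values from this work.

```python
import math

R = 8.314462618e-3  # molar gas constant in kJ mol^-1 K^-1


def conformational_temperature(delta_e_kj_mol, ratio_minor_to_major):
    """Return T (K) at which a two-state Boltzmann distribution gives the ratio.

    delta_e_kj_mol: predicted energy of the less abundant isomer relative
        to the more abundant one (kJ/mol).
    ratio_minor_to_major: observed abundance ratio N_minor / N_major.
    Equal degeneracies are assumed. A negative result means that the
    predicted energy ordering contradicts the observed abundance ordering.
    """
    if ratio_minor_to_major <= 0:
        raise ValueError("abundance ratio must be positive")
    if ratio_minor_to_major == 1:
        raise ValueError("equal populations do not define a finite temperature")
    # N_minor / N_major = exp(-dE / (R T))  =>  T = -dE / (R ln(ratio))
    return -delta_e_kj_mol / (R * math.log(ratio_minor_to_major))


# Hypothetical inputs: minor isomer 1 kJ/mol higher, seen at 30% of the major one
t_conf = conformational_temperature(1.0, 0.3)
print(f"T_conf = {t_conf:.0f} K")
```

A negative return value corresponds to the "wrong sign" case mentioned above, where theory places the less abundant isomer lower in energy.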

**Supplementary Materials:** The following are available online, Figure S1: Structures, Figures S2–S4: Pinacolone torsion scans, Figure S5: Cross section ratios, Figure S6: CpOH-acetone spectra, Figure S7: Substitution trends, Table S1: Keywords, Table S2: Angles, Table S3: Unimportance of ZPVE, Table S4: OH stretching shifts, Table S5: Other explored donors, Table S6: Experimental band centers, Tables S7 and S8: Band integrals, Table S9: Dissociation energies, Table S10: Conformational temperatures, Table S11: LED analysis, Table S12: CCSD(T) energy differences, Table S13: D3 Analysis, Tables S14 and S15: Higher lying isomers, Tables S16–S21: Cartesian coordinates.

**Author Contributions:** Conceptualization, methodology, funding acquisition and supervision, M.A.S.; experimental investigation, C.Z. and T.L.F.; formal analysis, C.Z. and M.A.S.; visualization, data curation and writing–original draft preparation, C.Z.; and writing–review and editing, M.A.S., C.Z. and T.L.F. All authors have read and agreed to the published version of the manuscript.

**Funding:** This research was funded by the Deutsche Forschungsgemeinschaft (DFG, German Research Foundation) grant number 271107160/SPP1807. The APC was co-funded by the Goettingen open access publication funds and by personal membership in the SCS.

**Acknowledgments:** We acknowledge computer time on the GWDG computer cluster as well as the local chemistry cluster— DFG grant number 405832858/INST 186/1294-1 FUGG. Valuable support from the mechanical and electronic workshops of the department is much appreciated, as are discussions with Robert Medel.

**Conflicts of Interest:** The authors declare no conflict of interest. The funders had no role in the design of the study, in the collection, analyses, or interpretation of data, in the writing of the manuscript, or in the decision to publish the results.

#### **Abbreviations**

The following abbreviations are used in this manuscript:


#### **References**


**Sample Availability:** All compounds were obtained commercially and samples of the compounds are not available from the authors.

**Publisher's Note:** MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

© 2020 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).

## *Article* **Halogenated Diazabutadiene Dyes: Synthesis, Structures, Supramolecular Features, and Theoretical Studies**

**Valentine G. Nenajdenko 1,\*, Namiq G. Shikhaliyev 2, Abel M. Maharramov 2, Khanim N. Bagirova 2, Gulnar T. Suleymanova 2, Alexander S. Novikov 3, Victor N. Khrustalev 4,5 and Alexander G. Tskhovrebov 4,6,\***


Academic Editors: Ilya G. Shenderovich and Antonio Caballero Received: 6 October 2020; Accepted: 26 October 2020; Published: 29 October 2020

**Abstract:** Novel halogenated aromatic dichlorodiazadienes were prepared via copper-mediated oxidative coupling between the corresponding hydrazones and CCl4. These rare azo-dyes were characterized using 1H and 13C NMR techniques and X-ray diffraction analysis for five halogenated dichlorodiazadienes. Multiple non-covalent halogen···halogen interactions were detected in the solid state and studied by DFT calculations and topological analysis of the electron density distribution within the framework of Bader's theory (QTAIM method). Theoretical studies demonstrated that non-covalent halogen···halogen interactions play a crucial role in the self-assembly of highly polarizable dichlorodiazadienes. Thus, halogen bonding can dictate a packing preference in the solid state for this class of dichloro-substituted heterodienes, which could be a convenient tool for fine-tuning the properties of this novel class of dyes.

**Keywords:** non-covalent interactions; crystal engineering; halogen bonding; azo dyes; DFT; QTAIM

#### **1. Introduction**

Halogen bonding (XB) is one of the most intensively investigated areas in modern chemistry [1]. The field is currently experiencing a renaissance owing to the exploitation of such weak interactions in a number of functional applications, such as catalysis, drug design, nonlinear optics, reactivity control, and the construction of functional supramolecular architectures [2–10]. The utilization of non-covalent interactions lies at the foundation of the design of supramolecular materials and the control of their ultimate architectures [11–14]. XB has recently emerged as a powerful tool for the creation of such materials due to its stability, directionality and reversibility [15–17]. In this context, halogen-halogen interactions have received particular attention and have been intensively explored both experimentally and theoretically [18–21]. Arguably, XB can be more beneficial than hydrogen bonding in the construction of functional materials and the tuning of their properties due to its higher directionality [10,22,23].

Recently, we discovered a novel class of azo-dyes, i.e., dichlorodiazadienes, which can be easily prepared via an unprecedented copper-catalyzed reaction between CCl4 and *N*-substituted hydrazones (Scheme 1) [24]. Currently, very little is known about the chemistry and properties of these dichloro-substituted heterodienes [25–31].

**Scheme 1.** Copper-catalyzed synthesis of dichlorodiazadienes.

Following our interest in the construction of supramolecular architectures via non-covalent interactions [32–39] and in the chemistry of novel diazadienes, we now report the synthesis of halogenated dichlorodiazadienes to demonstrate that the dichloro-substituted heterodiene fragment can behave as a strong XB donor/acceptor, which can be used in the design of heterodiene azo-dyes and their self-assembly in the solid state. Incorporation of one or more halogen atoms in the backbone of the dichloro-dyes completely changes the way the colorants self-assemble in the crystal. Thus, we show that XB can dictate a packing preference in the solid state for this class of dichloro-substituted heterodienes. In addition, we performed DFT calculations and topological analysis of the electron density distribution within the formalism of Bader's theory (QTAIM method), which support the presence of intermolecular non-covalent halogen···halogen (Hal···Hal) interactions in the solid state.

#### **2. Results and Discussion**

The target halogenated azabutadienes **10**–**18** were synthesized by the Cu<sup>I</sup>-catalyzed reaction between the corresponding hydrazones **1**–**9** and CCl4 and isolated in up to 82% yield as red crystalline solids (Scheme 2).

**Scheme 2.** Copper-catalyzed synthesis of dichlorodiazadienes.

The structures of **10**–**18** were confirmed by 1H and 13C NMR spectroscopy, and by X-ray diffraction analysis for **10**, **13**–**15**, and **17** (Figures 1–4). The 1H and 13C{1H} NMR spectra (CDCl3) are consistent with the solid-state structures. Dyes **10**, **13**–**15**, and **17** could be easily recrystallized to produce large red crystals suitable for single crystal X-ray crystallography. The structural investigations confirmed the formation of azabutadienes. Overall, the metrical parameters for **10**, **13**–**15**, and **17** are similar to those reported for related azabutadienes [26,29–31]. However, the introduction of halogen atoms in the backbone of the dichloro-dyes has a dramatic impact on their self-assembly in the crystal. In the crystal packing of **10** (para-chloro substitution at the phenyl attached to the C=C double bond), dye molecules form shifted columns (Figure 1) via π-π interactions. The columns dimerize in the crystal via attractive Cl···Cl interactions between neighboring dye molecules (type 2 contacts) [23]. The dichloroalkene acts as the halogen bond donor here (Figure 1).

**Figure 1.** Ball-and-stick representation of **10** and its self-assembly via Cl···Cl bonding in the crystal. Blue, green and grey spheres represent nitrogen, chlorine, carbon atoms, respectively. Hydrogen atoms were omitted for clarity.

**Figure 2.** Ball-and-stick representations of **14** and **15** and their self-assembly via Cl···Hal bonding in the crystal. Olive-green, blue, green, and grey spheres represent bromine, nitrogen, chlorine, and carbon atoms, respectively. Hydrogen atoms were omitted for clarity.

**Figure 3.** Ball-and-stick representation of **17** and its supramolecular dimerization via Cl···Cl type 1 bonding in the crystal. Blue, green, and grey spheres represent nitrogen, chlorine, and carbon atoms, respectively. Hydrogen atoms were omitted for clarity.

**Figure 4.** Ball-and-stick representation of **13** and its self-assembly via Cl···Cl bonding in the crystal. Blue, green, and grey and cyan spheres represent nitrogen, chlorine, carbon, and fluorine atoms, respectively. Hydrogen atoms were omitted for clarity.

Functionalization of the dichloro-dyes with an extra halogen atom (compounds **14** and **15**) does not prevent the formation of columns and supramolecular dimerization via Cl···Cl interactions in the crystal (Figure 2). In addition, the columns in the crystals of **14** and **15** interact with other neighboring columns via Cl···Hal (Hal = Cl (**14**), Br (**15**)) type 2 bonding, forming 3D supramolecular frameworks (Figure 2).

Introduction of one more halogen atom in the backbone of the dichloro-dyes completely changes their self-assembly in the crystal. Remarkably, the crystal packing of **17** features only one type of Hal···Hal interaction, between the chlorines of the p-chlorophenyl groups (Figure 3), which corresponds to repulsive type 1 contacts. Halogen atoms attached to the alkene or dichlorobenzene moieties do not form any halogen bonds. The reason for this behavior is not clear at the moment and requires additional studies. One plausible explanation is the insufficient nucleophilicity of the halogens in **17** for the formation of type 2 contacts.

Finally, when the dichloro-dyes are functionalized with a fluorine atom (**13**, para-substitution at the phenyl attached to the C=C double bond, Figure 4), the self-assembly in the crystal is similar to that of the chlorinated and brominated analogs **14** and **15**. The columns form 3D supramolecular frameworks via Cl···Cl and Cl···F type 2 contacts. An interesting peculiarity of the self-assembly of **13** in the crystal is the formation of Cl···F type 1 contacts (Figure 4). Thus, the crystal structure of **13** features a bifurcated XB and a remarkable combination of type 1 and type 2 halogen contacts (Figure 4).

Inspection of the crystallographic data suggests the presence of multiple intermolecular non-covalent Hal···Hal interactions in the crystals of **10**, **13**–**15**, and **17**. Indeed, the observed Hal···Hal distances are shorter than the sum of Bondi's vdW radii for the corresponding atoms [40]. Thus, in addition to the structural analysis, a detailed computational study was desired. In order to understand the nature and quantify the energies of the various short halogen-halogen contacts, DFT calculations followed by topological analysis of the electron density distribution within the QTAIM approach [41] were carried out at the ωB97XD/6-311++G\*\* level of theory for model supramolecular associates containing all types of these noncovalent interactions (see Computational details and Table S1 in the Supplementary Materials). The results of the QTAIM analysis are summarized in Table 1; the contour line diagrams of the Laplacian of electron density distribution ∇<sup>2</sup>ρ(r), bond paths, and selected zero-flux surfaces, as well as the visualization of the electron localization function (ELF) analysis for selected short halogen-halogen contacts, are shown in Figure 5 for illustrative purposes.
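The distance criterion used above can be sketched in a few lines. The radii are Bondi's values for F, Cl, and Br; the function names and the distances in the usage loop are hypothetical illustrations and are not taken from the crystal structures reported here.

```python
# Bondi van der Waals radii in angstroms
BONDI_RADII = {"F": 1.47, "Cl": 1.75, "Br": 1.85}


def is_short_contact(elem_a, elem_b, distance):
    """True if the interatomic distance is below the sum of Bondi radii."""
    return distance < BONDI_RADII[elem_a] + BONDI_RADII[elem_b]


def relative_shortening(elem_a, elem_b, distance):
    """Distance divided by the sum of vdW radii (values below 1 are short)."""
    return distance / (BONDI_RADII[elem_a] + BONDI_RADII[elem_b])


# Hypothetical distances (angstroms), for illustration only
for a, b, d in [("Cl", "Cl", 3.40), ("Cl", "Br", 3.55), ("Cl", "F", 3.30)]:
    print(a, b, is_short_contact(a, b, d), round(relative_shortening(a, b, d), 3))
```

A contact that passes this test is only a candidate for an attractive interaction; the QTAIM analysis is what characterizes it further.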

**Table 1.** Values of the density of all electrons ρ(r), Laplacian of electron density ∇<sup>2</sup>ρ(r) and appropriate λ<sub>2</sub> eigenvalues (with promolecular approximation), energy density H<sub>b</sub>, potential energy density V(r), and Lagrangian kinetic energy G(r) (a.u.) at the bond critical points (3, −1), corresponding to various short halogen-halogen contacts in **10**, **13**–**15**, and **17**, and estimated energies for these interactions E<sub>int</sub> (kcal/mol).


<sup>a</sup> E<sub>int</sub> = 0.49(−V(r)) (this correlation between the interaction energy and the potential energy density of electrons at the bond critical points (3, −1) was specifically developed for noncovalent interactions involving chlorine atoms) [42]. <sup>b</sup> E<sub>int</sub> = 0.47G(r) (this correlation between the interaction energy and the kinetic energy density of electrons at the bond critical points (3, −1) was specifically developed for noncovalent interactions involving chlorine atoms) [42]. \* There are no generally accepted specific correlations between the interaction energy and the potential or kinetic energy densities of electrons at the bond critical points (3, −1) for F···F noncovalent interactions, but the values of V(r) and G(r) suggest that the strength of these contacts in **13** is approximately 2 kcal/mol.

The QTAIM analysis of **10**, **13**–**15**, and **17** demonstrates the presence of bond critical points (3, −1) for all weak contacts presented in Table 1. The low magnitude of the electron density (0.006–0.009 a.u.), the positive values of the Laplacian of electron density (0.021–0.042 a.u.), and the positive, very close to zero energy density (0.001–0.002 a.u.) at these bond critical points (3, −1) are typical of halogen-halogen noncovalent interactions [5,39,43]. The balance between the potential and kinetic energy densities of electrons at the bond critical points (3, −1) for the studied weak contacts in **10**, **13**–**15**, and **17** reveals that a covalent contribution is absent in these interactions [44] (Table 1). The Laplacian of electron density is typically decomposed into the sum of contributions along the three principal axes of maximal variation, giving the three eigenvalues of the Hessian matrix (λ<sub>1</sub>, λ<sub>2</sub> and λ<sub>3</sub>), and the sign of λ<sub>2</sub> can be utilized to distinguish bonding (attractive, λ<sub>2</sub> < 0) weak interactions from non-bonding ones (repulsive, λ<sub>2</sub> > 0) [45,46]. Thus, the discussed noncovalent interactions in **10**, **13**–**15**, and **17** are attractive (Table 1). Overall, it follows from the results of the theoretical calculations that all short halogen-halogen contacts in **10**, **13**–**15**, and **17** are very similar in terms of energies (their estimated strength per contact varies from 1 to 3 kcal/mol), which correlates well with the very close values of the minimal and maximal electrostatic surface potentials on the halogen atoms in the isolated molecules **10**, **13**–**15**, and **17** (Figure S1 in the Supplementary Materials).
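The energy estimates and the λ<sub>2</sub> sign criterion described above can be written out as a minimal sketch. The coefficients 0.49 and 0.47 are those quoted in the footnotes to Table 1; the conversion factor from atomic units to kcal/mol, the function names, and the input values are assumptions made for illustration and are not entries of Table 1.

```python
HARTREE_TO_KCAL_MOL = 627.509  # assumed conversion, 1 a.u. of energy in kcal/mol


def e_int_from_v(v_au):
    """E_int = 0.49 * (-V(r)); V(r) in a.u., result in kcal/mol."""
    return 0.49 * (-v_au) * HARTREE_TO_KCAL_MOL


def e_int_from_g(g_au):
    """E_int = 0.47 * G(r); G(r) in a.u., result in kcal/mol."""
    return 0.47 * g_au * HARTREE_TO_KCAL_MOL


def energy_density(v_au, g_au):
    """Total energy density at the critical point, H_b = V(r) + G(r)."""
    return v_au + g_au


def classify_contact(lambda2):
    """Attractive if lambda_2 < 0, repulsive if lambda_2 > 0."""
    if lambda2 < 0:
        return "attractive"
    if lambda2 > 0:
        return "repulsive"
    return "undetermined"


# Hypothetical values at a bond critical point (a.u.)
v, g, lam2 = -0.005, 0.006, -0.008
print(round(e_int_from_v(v), 2), round(e_int_from_g(g), 2))
print(round(energy_density(v, g), 3), classify_contact(lam2))
```

With inputs of this magnitude both estimates fall in the 1 to 3 kcal/mol range discussed in the text, and H<sub>b</sub> is small and positive.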

**Figure 5.** Contour line diagrams of the Laplacian of electron density distribution ∇<sup>2</sup>ρ(r), bond paths, and selected zero-flux surfaces (left) and visualization of electron localization function (ELF) analysis (right) for intermolecular Cl···Br and Cl···Cl contacts in **15** (top); F···F (center) and Cl···F (bottom) contacts in **13**. Bond critical points (3, −1) are shown in blue, nuclear critical points (3, −3) in pale brown, ring critical points (3, +1) in orange, and cage critical points (3, +3) in light green; length units are Å, bond paths are shown as pale brown lines, and the color scale for the ELF maps is presented in a.u.

To understand what kinds of interatomic contacts give the largest contributions to the crystal packing, we carried out the Hirshfeld surface analysis for all obtained X-ray structures **10**, **13**–**15**, and **17** (Table 2 and Figure 6). This analysis reveals that in all cases the crystal packing is determined primarily by interatomic contacts involving chlorine and hydrogen atoms.


**Table 2.** Main partial contributions of different interatomic contacts to the Hirshfeld surfaces of X-ray structures **10**, **13**–**15**, and **17**.

**Figure 6.** Visualization of Hirshfeld surfaces for X-ray structures **14**, **15**, and **17** (top), **13** and **10** (bottom).

#### **3. Materials and Methods**

General remarks. Unless stated otherwise, all the reagents used in this study were obtained from commercial sources (Aldrich, TCI-Europe, Strem, ABCR). NMR spectra were recorded on a Bruker Avance 300 (1H: 300 MHz, Karlsruhe, Germany); chemical shifts (δ) are given in ppm relative to TMS, coupling constants (J) in Hz. The solvent signals were used as references (CDCl3: δ<sub>C</sub> = 77.16 ppm; residual CHCl3 in CDCl3: δ<sub>H</sub> = 7.26 ppm; CD2Cl2: δ<sub>C</sub> = 53.84 ppm; residual CHDCl2 in CD2Cl2: δ<sub>H</sub> = 5.32 ppm); 1H and 13C assignments were established using NOESY, HSQC, and HMBC experiments; numbering schemes as shown in the Inserts. IR: Perkin-Elmer Spectrum One spectrometer (Waltham, MA, USA), wavenumbers (ν̃) in cm<sup>−1</sup>. Mass spectra were obtained on a Bruker micrOTOF spectrometer equipped with an electrospray ionization (ESI) source (Bremen, Germany); MeOH, CH2Cl2, or a MeOH/CH2Cl2 mixture was used as the solvent. Thermogravimetric analysis (TGA) and differential thermal analysis were performed using a Netzsch TG 209F1 Libra apparatus (Selb, Germany). Solvents were purified by distillation over the indicated drying agents and were transferred under Ar: Et2O (Mg/anthracene), CH2Cl2 (CaH2), hexane (Na/K). Flash chromatography: Merck Geduran® Si 60 (Darmstadt, Germany) (40–63 μm).

The single point calculations based on the experimental X-ray geometries of **10**, **13**–**15**, and **17** were carried out at the DFT level of theory using the dispersion-corrected hybrid functional ωB97XD [47] with the Gaussian-09 program package (M. J. Frisch et al., Gaussian-09, Revision C.01, Gaussian, Inc., Wallingford CT, USA, 2010; the full citation for this program is given in the SI). The 6-311++G\*\* basis sets [48–51] were used for all atoms. The topological analysis of the electron density distribution with the atoms in molecules (QTAIM) method developed by Bader [41] was performed using the Multiwfn program (version 3.6, Beijing, China) [52]. The Cartesian atomic coordinates for model supramolecular associates are presented in Table S1, Supporting Information. The Hirshfeld surface analysis was performed using the CrystalExplorer program (version 17.5, Perth, Australia) [53]. The normalized contact distances (d<sub>norm</sub>) [54] based on Bondi's van der Waals radii [40] were mapped onto the Hirshfeld surfaces.
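For orientation, the normalized contact distance is commonly defined from the distances of a surface point to the nearest atom inside (d<sub>i</sub>) and outside (d<sub>e</sub>) the surface, each scaled by the corresponding van der Waals radius. The sketch below assumes this common definition and Bondi radii; the function name and the example distances are hypothetical, and the sketch does not reproduce the internals of CrystalExplorer.

```python
# Bondi van der Waals radii in angstroms
BONDI_RADII = {"H": 1.20, "C": 1.70, "N": 1.55, "F": 1.47, "Cl": 1.75, "Br": 1.85}


def d_norm(d_i, d_e, elem_i, elem_e):
    """Normalized contact distance at one point of a Hirshfeld surface.

    d_i, d_e: distances (angstroms) from the surface point to the nearest
        nucleus inside and outside the surface, respectively.
    elem_i, elem_e: element symbols of those two atoms.
    Negative values indicate contacts shorter than the sum of vdW radii.
    """
    r_i = BONDI_RADII[elem_i]
    r_e = BONDI_RADII[elem_e]
    return (d_i - r_i) / r_i + (d_e - r_e) / r_e


# Hypothetical Cl...Cl surface point slightly inside both vdW spheres
print(round(d_norm(1.70, 1.70, "Cl", "Cl"), 3))
```

Points with negative values are the ones that typically appear as highlighted close-contact spots when the quantity is mapped onto the surface.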

#### *3.1. Crystal Structure Determination*

X-ray diffraction data for **10**, **13**–**15**, and **17** were collected at the 'RSA' beamline (λ = 0.80246 Å) of the Kurchatov Synchrotron Radiation Source. All datasets were collected at 100 K. In total, 720 frames were collected with an oscillation range of 1.0° in the ϕ scanning mode using two different orientations for each crystal. The semi-empirical correction for absorption was applied using the Scala program [55]. The data were indexed and integrated using the utility iMOSFLM from the CCP4 software suite [56,57]. For details, see Table S1. The structures were solved by the intrinsic phasing modification of direct methods [58] and refined by a full-matrix least-squares technique on F<sup>2</sup> with anisotropic displacement parameters for all non-hydrogen atoms. The hydrogen atoms were placed in calculated positions and refined within the riding model with fixed isotropic displacement parameters [*U*<sub>iso</sub>(H) = 1.5*U*<sub>eq</sub>(C) for the methyl groups and 1.2*U*<sub>eq</sub>(C) for the other groups]. All calculations were carried out using the SHELXTL program [59,60].

Crystallographic data for **10**, **13**–**15**, and **17** have been deposited with the Cambridge Crystallographic Data Center, CCDC 2035010-2035014, respectively. Copies of this information may be obtained free of charge from the Director, CCDC, 12 Union Road, Cambridge CB2 1EZ, UK (fax: +44-1223-336033; e-mail: edeposit@ccdc.cam.ac.uk or www.ccdc.cam.ac.uk).

#### *3.2. Synthetic Part*

Schiff bases **1**–**9** were synthesized according to the reported method [20,21]. A mixture of (2-nitrophenyl)hydrazine (10.2 mmol), CH3COONa (0.82 g) and the corresponding 4-substituted aldehyde (10 mmol) was refluxed with stirring in ethanol (50 mL) for 2 h. The reaction mixture was cooled to room temperature and water (50 mL) was added to give a precipitate of the crude product, which was filtered off, washed with dilute ethanol (1:1 with water) and dried in vacuo.

**1**. White solid (69%), mp 118 °C. 1H NMR (300 MHz, DMSO-*d*6) δ 10.46 (s, 1H, NH), 7.85 (s, 1H, CH), 7.66 (d, *J* = 8.4 Hz, 2H, arom), 7.43 (d, *J* = 8.4 Hz, 2H, arom), 7.23 (t, *J* = 7.7 Hz, 2H, arom), 7.09 (d, *J* = 7.9 Hz, 2H, arom), 6.76 (t, *J* = 7.2 Hz, 1H, arom). 13C NMR (75 MHz, DMSO-*d*6) δ 145.5, 135.4, 135.2, 132.5, 129.5, 129.1, 127.5, 119.41, 112.5.

**2**. White solid (92%), mp 151 °C. 1H NMR (300 MHz, DMSO-*d*6) δ 10.33 (s, 1H, NH), 7.80 (s, 1H, CH), 7.66 (s, 1H, arom), 7.42 (d, *J* = 8.4 Hz, 2H, arom), 7.00 (q, *J* = 8.4 Hz, 5H, arom), 2.09 (s, 3H, CH3). 13C NMR (75 MHz, DMSO-*d*6) δ 138.6, 130.8, 130.1, 127.7, 125.4, 124.5, 123.3, 122.8, 107.92, 16.1.

**3**. White solid (87%), mp 141 °C. 1H NMR (300 MHz, DMSO-*d*6) δ 10.24 (s, 1H, NH), 7.78 (s, 1H, CH), 7.63 (d, *J* = 8.5 Hz, 2H, arom), 7.41 (d, *J* = 8.5 Hz, 2H, arom), 7.01 (d, *J* = 8.9 Hz, 2H, arom), 6.84 (d, *J* = 8.9 Hz, 2H, arom), 3.69 (s, 3H, OCH3). 13C NMR (75 MHz, DMSO-*d*6) δ 153.2, 139.5, 135.5, 134.2, 132.1, 129.0, 127.3, 115.0, 113.5, 55.7.

**4**. White solid (77%), mp 135 °C. 1H NMR (300 MHz, DMSO-*d*6) δ 7.02 (d, 2H, *J* = 6.0 Hz), 7.22 (t, 2H, *J* = 9.1 Hz), 7.37 (d, 2H, *J* = 9.1 Hz), 7.68–7.73 (m, 2H), 7.87 (s, 1H), 10.49 (s, 1H). 13C NMR (75 MHz, DMSO-*d*6) δ 114.3, 115.9, 116.2, 128.0, 132.1, 132.63, 136.7, 145.0, 109.9.

**5**. White solid (76%), mp 153 °C. 1H NMR (300 MHz, DMSO-*d*6) δ 10.58 (s, 1H, NH), 7.85 (s, 1H, CH), 7.67 (d, *J* = 8.3 Hz, 2H, arom), 7.48–7.38 (m, 2H, arom), 7.25 (d, *J* = 8.7 Hz, 2H, arom), 7.07 (d, *J* = 8.7 Hz, 2H, arom). 13C NMR (75 MHz, DMSO-*d*6) δ 144.4, 136.3, 135.0, 132.8, 129.3, 129.1, 127.7, 122.6, 113.9.

**6**. White solid (94%), mp 131 °C. 1H NMR (300 MHz, DMSO-*d*6) δ 10.59 (s, 1H, NH), 7.85 (s, 1H, CH), 7.67 (d, *J* = 8.5 Hz, 2H, arom), 7.43 (d, *J* = 8.5 Hz, 2H, arom), 7.37 (d, *J* = 8.8 Hz, 2H, arom), 7.03 (d, *J* = 8.8 Hz, 2H, arom). 13C NMR (75 MHz, DMSO-*d*6) δ 144.8, 136.4, 134.9, 132.8, 132.2, 129.1, 127.7, 114.4, 110.2, 39.9.

**7**. White solid (72%), mp 119 °C. 1H NMR (300 MHz, DMSO-*d*6) δ 10.30 (s, 1H, NH), 7.81 (s, 1H, CH), 7.65 (d, *J* = 8.5 Hz, 2H, arom), 7.42 (d, *J* = 8.5 Hz, 2H, arom), 6.69 (s, 2H, arom), 6.41 (s, 1H, arom), 2.22 (s, 6H, CH3). 13C NMR (75 MHz, DMSO-*d*6) δ 145.3, 138.5, 135.3, 135.0, 132.3, 129.1, 127.5, 121.3, 110.3, 21.7.

**8**. White solid (88%), mp 114 °C. 1H NMR (300 MHz, DMSO-*d*6) δ 10.16 (s, 1H, NH), 8.28 (s, 1H, CH), 7.69 (d, *J* = 8.7 Hz, 2H, arom), 7.56 (d, *J* = 8.7 Hz, 1H, arom), 7.51–7.43 (m, 3H, arom), 7.35–7.17 (m, 1H, arom). 13C NMR (75 MHz, DMSO-*d*6) δ 162.3, 140.9, 140.1, 134.6, 133.4, 129.2, 129.0, 128.5, 128.1, 122.8, 117.1, 115.5.

**9**. White solid (92%), mp 112 °C. 1H NMR (300 MHz, DMSO-*d*6) δ 10.74 (s, 1H, NH), 7.90 (d, *J* = 14.3 Hz, 1H), 7.69 (d, *J* = 8.3 Hz, 2H, arom), 7.44 (q, *J* = 8.3, 7.5 Hz, 3H, arom), 7.26 (s, 1H, CH), 7.00 (d, *J* = 8.4 Hz, 1H, arom). 13C NMR (75 MHz, DMSO-*d*6) δ 145.6, 137.6, 134.6, 133.2, 132.1, 131.5, 131.3, 130.2, 129.8, 129.1, 128.0, 120.1, 113.3, 112.8.

#### *3.3. Synthesis of Dichlorodiazadienes*

A twenty-milliliter screw neck vial was charged with DMSO (10 mL), **1**–**9** (1 mmol), tetramethylethylenediamine (TMEDA) (295 mg, 2.5 mmol), CuCl (2 mg, 0.02 mmol), and CCl4 (20 mmol, 10 equiv). After 3 h (until TLC analysis showed complete consumption of the corresponding Schiff base), the reaction mixture was poured into a ~0.01 M solution of HCl (100 mL, pH ≈ 2) and extracted with dichloromethane (3 × 20 mL). The combined organic phase was washed with water (3 × 50 mL) and brine (30 mL), dried over anhydrous Na2SO4 and concentrated in vacuo. The residue was purified by column chromatography on silica gel using appropriate mixtures of hexane and dichloromethane (3/1–1/1).

**10**. Red solid (73%), mp 85 °C. 1H NMR (300 MHz, CDCl3) δ 7.71–7.60 (m, 2H, arom), 7.35 (dd, *J* = 7.6, 3.8 Hz, 4H, arom), 7.28 (s, 1H, arom), 7.03 (d, *J* = 8.3 Hz, 2H, arom). 13C NMR (75 MHz, CDCl3) δ 134.6, 131.7, 131.3, 130.7, 129.4, 129.0, 128.4, 127.1, 126.2, 123.1.

**11**. Red solid (79%), mp 90 °C. 1H NMR (300 MHz, CDCl3) δ 7.69 (d, *J* = 8.2 Hz, 2H, arom), 7.42 (d, *J* = 8.3 Hz, 2H, arom), 7.26 (d, *J* = 8.2 Hz, 2H, arom), 7.13 (d, *J* = 8.3 Hz, 2H, arom), 2.42 (s, 3H, CH3). 13C NMR (75 MHz, CDCl3) δ 162.3, 151.2, 150.9, 142.5, 134.7, 131.4, 131.0, 129.7, 128.4, 123.2, 21.5. Crystals, suitable for X-ray analysis, were obtained by the slow evaporation of saturated hexane/EtOAc (5/1) solution.

**12**. Red solid (72%), mp 96 °C. 1H NMR (300 MHz, CDCl3) δ 7.78 (d, *J* = 9.0 Hz, 2H, arom), 7.42 (d, *J* = 8.4 Hz, 2H, arom), 7.13 (d, *J* = 8.4 Hz, 2H, arom), 6.95 (d, *J* = 9.0 Hz, 2H, arom), 3.88 (s, 3H, OCH3). 13C NMR (75 MHz, CDCl3) δ 162.7, 162.3, 151.1, 147.2, 134.6, 131.4, 131.2, 128.4, 125.3, 114.2, 55.6.

**13**. Red solid (68%), mp 77 °C. 1H NMR (300 MHz, CDCl3) δ 7.81 (dd, *J* = 8.6, 5.4 Hz, 2H), 7.43 (d, *J* = 8.3 Hz, 2H), 7.14 (t, *J* = 8.8 Hz, 4H). 13C NMR (75 MHz, CDCl3) δ 167.6, 166.4, 151.1, 149.3, 134.8, 131.4, 130.8, 129.6, 128.5, 125.4, 116.2, 115.9. Crystals, suitable for X-ray analysis, were obtained by the slow evaporation of saturated hexane/EtOAc (5/1) solution.

**14**. Red solid (67%), mp 94 °C. 1H NMR (300 MHz, CDCl3) δ 7.73 (d, *J* = 8.6 Hz, 1H), 7.43 (d, *J* = 8.4 Hz, 2H), 7.12 (d, *J* = 8.4 Hz, 1H). 13C NMR (75 MHz, CDCl3) δ 162.3, 151.1, 137.7, 136.5, 134.9, 131.4, 130.6, 129.3, 128.5, 124.4. Crystals, suitable for X-ray analysis, were obtained by the slow evaporation of saturated hexane/EtOAc (5/1) solution.

**15**. Red solid (70%), mp 105 °C. 1H NMR (300 MHz, CDCl3) δ 7.69–7.56 (m, 4H, arom), 7.49–7.39 (m, 2H, arom), 7.12 (d, *J* = 8.4 Hz, 2H, arom). 13C NMR (75 MHz, CDCl3) δ 151.4, 134.9, 132.3, 131.4, 130.6, 129.8, 128.5, 127.4, 126.3, 124.6. Crystals, suitable for X-ray analysis, were obtained by the slow evaporation of saturated hexane/EtOAc (5/1) solution.

**16**. Red solid (82%), mp 145 °C. 1H NMR (300 MHz, CDCl3) δ 7.44 (d, *J* = 7.7 Hz, 4H, arom), 7.15 (d, *J* = 7.7 Hz, 3H, arom), 2.40 (s, 6H, CH3). 13C NMR (75 MHz, CDCl3) δ 157.7, 148.3, 146.7, 134.2, 130.2, 129.0, 126.8, 126.5, 123.9, 116.5, 16.6.

**17**. Red solid (66%), mp 115 °C. 1H NMR (300 MHz, CDCl3) δ 7.89 (s, 1H, arom), 7.68–7.61 (m, 1H, arom), 7.54 (d, *J* = 8.6 Hz, 1H, arom), 7.44 (d, *J* = 8.3 Hz, 2H, arom), 7.11 (d, *J* = 8.3 Hz, 2H, arom). 13C NMR (75 MHz, CDCl3) δ 151.5, 151.4, 135.7, 135.0, 133.5, 131.3, 130.8, 130.4, 129.8, 128.6, 124.5, 122.7. Crystals, suitable for X-ray analysis, were obtained by the slow evaporation of saturated hexane/EtOAc (5/1) solution.

**18**. Red solid (71%), mp 121 °C. 1H NMR (300 MHz, CDCl3) δ 7.57 (d, *J* = 8.7 Hz, 1H, arom), 7.46 (d, *J* = 2.0 Hz, 1H, arom), 7.37 (d, *J* = 8.4 Hz, 2H, arom), 7.24 (d, *J* = 2.3 Hz, 1H, arom), 7.12 (d, *J* = 8.4 Hz, 2H, arom).

#### **4. Conclusions**

In summary, nine novel halogenated dichlorodiazadienes were prepared and fully characterized, and single crystal structures were determined for five of them. The solid state structures contained multiple Hal···Hal interactions, which were studied by DFT calculations and topological analysis of the electron density distribution within the framework of Bader's theory (QTAIM method). Calculations showed that the Hal···Hal interactions dictate a packing preference for this newly discovered class of dyes. These results further demonstrate the potential of Hal···Hal bonding in supramolecular engineering and its crucial role in the stabilization of the intermolecular networks of dichlorodiazadienes. Further studies in our laboratory into the photophysical properties of halogenated dichlorodiazadienes and their applications are underway and will be reported in due course.

**Supplementary Materials:** Figure S1: Visualization of electrostatic surface potentials for **10**, **13**–**15** and **17** with selected V<sub>s,min</sub>/V<sub>s,max</sub> values (in kcal/mol), Table S1: Crystal data and structure refinement for **10**, **13**–**15** and **17**, Table S2: Values of the density of all electrons ρ(r), Laplacian of electron density ∇<sup>2</sup>ρ(r) and appropriate λ<sub>2</sub> eigenvalues (with promolecular approximation), energy density H<sub>b</sub>, potential energy density V(r), and Lagrangian kinetic energy G(r) (a.u.) at the bond critical points (3, −1), corresponding to Cl···F halogen-halogen contacts in **13**, Table S3: Cartesian atomic coordinates for model supramolecular associates.

**Author Contributions:** Conceptualization, V.G.N. and A.G.T.; writing–original draft preparation, V.G.N., A.G.T. and A.S.N.; writing–review and editing, V.G.N. and A.G.T.; software, A.S.N.; investigation, N.G.S., A.M.M., K.N.B. and G.T.S.; supervision, V.N.K. All authors have read and agreed to the published version of the manuscript.

**Funding:** This work was performed under the support of the FRCCP RAS State task AAAA-A19-119012990175-9. A.S.N. is grateful to Russian Science Foundation for the support of his theoretical studies (project No. 19-73-00001). We acknowledge the RUDN University Program 5-100. V.G.N. is grateful to RFBR for the support (grant N 18-03-00791).

**Conflicts of Interest:** The authors declare no conflict of interest.

#### **References**


**Sample Availability:** Samples of the compounds **1–18** are available from the authors.

**Publisher's Note:** MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

© 2020 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).

## *Article* **Inter- vs. Intramolecular Hydrogen Bond Patterns and Proton Dynamics in Nitrophthalic Acid Associates**

#### **Kinga Jóźwiak 1, Aneta Jezierska 1, Jarosław J. Panek 1, Eugene A. Goremychkin 2, Peter M. Tolstoy 3, Ilya G. Shenderovich 4 and Aleksander Filarowski 1,\***


Academic Editor: Goar Sánchez Received: 10 September 2020; Accepted: 12 October 2020; Published: 14 October 2020

**Abstract:** Noncovalent interactions are among the main tools of molecular engineering. Rational molecular design requires knowledge of how given structural moieties interact within a given phase state. We herein report a study of intra- and intermolecular interactions of 3-nitrophthalic and 4-nitrophthalic acids in the gas, liquid, and solid phases. A combination of Infrared, Raman, Nuclear Magnetic Resonance, and Incoherent Inelastic Neutron Scattering spectroscopies with Car–Parrinello Molecular Dynamics and Density Functional Theory calculations was used. This integrated approach made it possible to assess the balance of repulsive and attractive intramolecular interactions between adjacent carboxyl groups as well as to study the dependence of this balance on steric confinement and the effect of this balance on intermolecular interactions of the carboxyl groups.

**Keywords:** proton dynamics; carboxyl group; CPMD; DFT; IINS; IR; Raman; NMR

#### **1. Introduction**

Hydrogen bonding (H-bonding) and steric effects are important tools of molecular engineering. Under certain conditions, their interplay can stabilize species that otherwise exhibit high chemical reactivity [1–4]. The structural complexity increases when there are either other noncovalent interactions or competing H-bonds. The former is critically important in solids [5–9], at confined geometries [10–12], and in aqueous solutions [13–16]. The latter is characteristic for P=O moiety [17,18], specially designed organic molecules [19,20], but most of all for biomolecules [21,22]. The adjustments of bridging proton positions in H-bonds act as one of the mechanisms governing the chemical properties of macromolecules [23–25] and biosystems [26,27]. Changes of weak specific interactions such as H-bonds can evoke a reorganization on the macroscopic scale. Therefore, many-sided elaborate studies of the conformational phenomena are essential not only for fundamental understanding of H-bond nature but also for a number of practical applications, such as design of materials with the required physicochemical properties [28–30].

The wide variety of effects associated with a competition between intra- and intermolecular H-bonding can be illustrated with salicylic acid. In the simplest case of salicylic acid crystals, the carboxyl groups of the molecules form dimers while their hydroxyl groups form intramolecular H-bonds [31,32]. This structure remains qualitatively valid in an aprotic solution when the dimer is deprotonated [33]. In contrast, when the number of competing interactions increases, the co-crystals of salicylic acid exhibit polymorphism and different solubility [32,34,35]. These changes are critically important for pharmaceutical applications. Besides that, the intramolecular H-bond in salicylic acid derivatives can be controlled through intramolecular steric effects. In crystalline salicylic acid, the O···O distance of this H-bond is about 2.62 Å [31,32]. In 2-hydroxy-3-nitrobenzoic acid, 6-(cyclohexylmethyl)salicylic acid, and 6-(2-cyclohexylethyl)salicylic acid it is only 2.55, 2.54, and 2.52 Å, respectively [32,36]. Is this a general trend that can be expected for other molecular structures?

This paper presents the conformational studies of 3- and 4-nitrophthalic acids (**3** and **4**, Figure 1). These compounds are characterized by the presence of strong intermolecular and intramolecular H-bonds in co-crystals with various organic compounds [37–41]. In compounds with adjacent carboxyl groups, these bonds might convert into one another under the influence of external factors. The first papers about dimer formation by carboxyl groups were published by Pfeiffer et al. in 1910 [42–44]. The carboxylic acid dimer units (2 × (COOH)) still attract the attention of researchers involved in H-bonding studies [45–55]. In References [56–58], the authors show a strong effect of H-bonds on the conformational state of compounds. The domination of the *cis* conformation of the carboxyl group, the so-called *Z*-effect, has been elucidated by Lyssenko et al. [59]. Recently, two polymorphic forms of cinchomeronic acid (a derivative of phthalic acid) have been discovered and studied [60]. It has been shown that the polymorphic forms are caused by proton transfer and reorientation of the carboxyl groups. Computational studies of H-bonds and stable conformers are important for the study of conformational polymorphism in molecular complexes such as benzoic acid with pyridine [61]. Moreover, H-bonded networks of phthalic acids can be used as ligands for metal-organic aggregates [62,63].

**Figure 1.** Chemical structures of 3-nitrophthalic (**3**) and 4-nitrophthalic (**4**) acids.

The main aim of this study was to characterize intramolecular interactions between adjacent carboxyl groups in the presence and absence of intramolecular steric effects and the effect of all these interactions on intermolecular interactions of these carboxyl groups. The nitro substitution was chosen because this moiety is rigid, relatively small, and causes considerable steric strain. Besides static density-functional theory (DFT) computations, this study covers simulations performed using the Car–Parrinello molecular dynamics (CPMD) approach, which supports NMR (Nuclear Magnetic Resonance), IR (infrared), Raman, and IINS (Incoherent Inelastic Neutron Scattering) experimental measurements with the employment of a neutron radiation source.

The outline of the manuscript is as follows. First, the conformational analysis based on static DFT calculations is presented. Next, the dynamics of the protons and functional groups are studied by DFT and CPMD calculations. The following part covers the conformational equilibrium in solution, investigated by NMR spectroscopy, as well as IR, Raman, and IINS studies of the compounds in the solid state. Additionally, a spectral analysis based on the experimental and computational results is performed with the help of H/D isotopic substitution. The concluding remarks are given in the last section.

#### **2. Results and Discussion**

#### *2.1. DFT Study of H-Bond, Nitro, and Carboxyl Groups' Dynamics of Nitrophthalic Acids*

The quantum-mechanical calculations were carried out at the B3LYP/6-311+G(d,p) level of theory to identify the most stable conformers of monomeric **3** and **4**. These calculations show that the most stable conformer does not contain the intramolecular H-bond (Figure 2). Generally, intramolecular H-bonds can significantly decrease the energy of isolated molecules [64]. However, conformers **3(III)**, **3(IV)**, **3(VI)**, **4(IV)**, and **4(V)** with the intramolecular H-bond feature significant steric strain between the carboxyl groups, which increases further if the nitro group is nearby. In consequence, the energies of conformers **3(III)** and **4(IV)** are higher than those of **3(I)** and **4(I)**.

**Figure 2.** Conformers of monomeric **3** (upper row) and **4** (bottom row) and their relative energies (Δ*E* = *E*min(conformer) – *E*i(conformer), kcal/mol) obtained at the B3LYP/6-311+G(d,p) level of theory for the gas phase and in acetonitrile (CH3CN). *E*min(conformer) stands for the energy of 3(I) or 4(I). *E*i(conformer) stands for the energy of the conformer under consideration.

Using the knowledge of the monomers' conformations, the possible structures of hydrogen-bonded dimers were calculated and analyzed (labelled **D3** and **D4** in Figure S1). The most stable conformation of the dimers was obtained when the molecules were arranged orthogonally (**D3(I)**, **D3(II)**, **D4(I)**, and **D4(II)**, Figure S1). However, the planar orientations of the molecules caused only a small increase in energy (**D3(III)**, **D3(IV)**, and **D4(V)**). Indeed, in the crystal of **3**, one carboxyl group of each molecule formed a dimer with the planar orientations of the rings while the other carboxyl group formed a hydrogen-bonded molecular chain between such dimers [37]. Structures in which the second carboxyl group was oriented orthogonally to the intermolecular H-bonded group were energetically beneficial. This result was conditioned by a smaller steric repulsion between carboxyl groups (and the nitro group in the case of **3**). Thus, the formation of intramolecular H-bonds in the dimers was unfavorable. Oligomers **D3(IX)** and **D4(VIII)**, in which molecules did not form carboxyl group dimers, exhibited higher energies (Figure S1). However, the energy increase was quite moderate, especially for compound **4**. Moreover, **D3(X)** and **D4(VII)** possessed one intramolecular H-bond each.

For the assessment of dynamic effects associated with the rotations of carboxyl and nitro groups, the corresponding potential energy profiles were calculated at the B3LYP/6-311+G(d,p) level of theory for the monomers of both acids. The DFT calculations of the rotation of the nitro groups (gradual increase of the torsional angles C2C3NO5 in **3** and C3C4NO5 in **4** (Figure S2)) revealed the similarity of the rotational energy barriers for both compounds: 4.8 and 5.8 kcal/mol for **3** and **4**, respectively (Figure S3). These barriers resulted from the disruption of the π-electronic coupling between the nitro group and the benzene ring, which caused energetically disadvantageous configurations at CCNO ≈ 90° (Figure S3). For **3**, one can also observe a small barrier at C2C3NO5 ≈ 180°, caused by the repulsion between the nitro and carboxyl groups (Figure S3). It is noteworthy that the rotation of either nitro or carboxyl group evoked the simultaneous rotation of the neighboring functional groups and, therefore, it led to moderately high energy barriers.

In contrast to the nitro group rotation, the calculations showed a significant difference between the energy barrier heights for the rotation of the carboxyl groups in **3** and **4** (Figure 3). For **3**, these barriers were 5.5–6.5 kcal/mol, which was 4–5 kcal/mol higher than for **4**. This difference resulted from a strong steric effect between three functional groups in **3**. The steric squeezing between carboxyl groups in **4** was weaker than in **3** because the nitro group was in the *meta* position. The energy barriers for the rotation of the nitro and carboxyl groups were not very high. Thus, significant dynamics of all functional groups can be expected for these compounds.
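To give a feeling for what barriers of this size imply, the minimal sketch below converts a barrier height into an order-of-magnitude rate at 300 K with an Eyring-type expression, k = (k_B T/h) exp(−E/RT). Using the DFT potential energy barrier in place of an activation free energy and setting the transmission coefficient to 1 are assumptions made here purely for illustration; they are not part of the calculations reported above, and the function name `eyring_rate` is hypothetical.

```python
import math

R_KCAL = 1.987204e-3        # gas constant, kcal/(mol K)
KB_OVER_H = 2.083661912e10  # Boltzmann constant / Planck constant, 1/(s K)


def eyring_rate(barrier_kcal_mol, temperature_k=300.0):
    """Order-of-magnitude rate constant (1/s) for crossing a barrier.

    Assumption: the potential energy barrier stands in for the activation
    free energy and the transmission coefficient equals 1.
    """
    prefactor = KB_OVER_H * temperature_k
    return prefactor * math.exp(-barrier_kcal_mol / (R_KCAL * temperature_k))


# Barrier heights quoted in the text (kcal/mol)
barriers = {
    "NO2 rotation in 3": 4.8,
    "NO2 rotation in 4": 5.8,
    "COOH rotation in 3 (upper value)": 6.5,
}
for label, barrier in barriers.items():
    print(f"{label}: {eyring_rate(barrier):.1e} s^-1")
```

Under these assumptions, every barrier listed corresponds to a rate well above 10<sup>7</sup> s<sup>−1</sup> at room temperature, which is in line with the qualitative expectation of significant dynamics.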

**Figure 3.** Calculated potential energy curves for the carboxyl group rotation in **3** (solid line) and **4** (dashed line).

In order to study the H-bond dynamics, we calculated potential energy profiles for proton transfer in intramolecular H-bonds in **3(III)**, **4(IV)**, **D3(III)**, and **D4(I)** in the gas phase and taking into account the effect of a polar solvent (CH3CN) using the polarizable continuum model (PCM) approach (Figure 4 and Figure S4a,b). The O-H distance in one of the carboxyl groups was gradually elongated while the other structural parameters were optimized at each step. The shapes of the curves and their numerical values were similar for monomers and dimers. The calculations of the potential energy curves for the intramolecular proton transfer in the monomeric species showed no second minimum in the range of O···H distances 1.4–1.7 Å (Figure 4, curve **a**). According to the earlier presented analysis [65], which rests upon experimental and computational data, this result proves the absence of the proton transfer within the intramolecular hydrogen bond, i.e., the absence of a tautomeric equilibrium (Figure 5F). In turn, the intermolecular transfer of one proton within the intermolecular hydrogen bond in dimers of compound **3** induced a transfer of the second proton in the adjacent intermolecular hydrogen bond (Figure 4, curve **c**). The calculated potential energy curves for dimers **D3(III)** and **D4(I)** turned out to be double-well. The energy required for this concerted double proton transfer was about 6.3 kcal/mol for the gas phase (Figure S4b). The use of the PCM approximation for acetonitrile reduced the barrier to 5.7 kcal/mol (Figure 4). This fact supports the possibility of observing the tautomeric equilibrium with double proton transfer (Figure 5B,C) in an experiment. Taking into account that the PCM approximation strongly underestimates the effect of polar media on hydrogen-bonded systems [66–69], one can expect a fast, concerted proton transfer in phthalic acid dimers in polar solvents. Previously, a double proton transfer in carboxylic acid dimers was experimentally detected in low-temperature NMR spectra (110 K, CDF3/CDF2Cl mixture as solvent) as a triplet splitting of the bridging proton signal for 13C-labelled acetic acid due to <sup>2</sup>*J*(C,H) spin-spin coupling [70]. Though dimers **D3(I)**–**D3(VII)** are the most stable forms, the formation of oligomers (structure **G**, Figure 5) and complexes of other types (**D** and **E** dimers, Figure 5) is also possible. This is also supported by the crystallographic and spectroscopic studies [71–74].
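The distinction drawn above between a single-well and a double-well profile can be checked mechanically on any one-dimensional energy scan. The sketch below is only an illustration: the two energy lists are invented numbers (the 5.7 kcal/mol maximum merely mirrors the PCM barrier quoted in the text), and the helper names `local_minima` and `barrier_height` are hypothetical.

```python
def local_minima(energies):
    """Indices of local minima of a 1D scan; end points count when they
    lie below their single neighbour."""
    n = len(energies)
    minima = []
    for i, energy in enumerate(energies):
        left_ok = i == 0 or energy < energies[i - 1]
        right_ok = i == n - 1 or energy < energies[i + 1]
        if left_ok and right_ok:
            minima.append(i)
    return minima


def barrier_height(energies):
    """Barrier between the first two minima, relative to the first one.

    Returns None for a single-well profile.
    """
    minima = local_minima(energies)
    if len(minima) < 2:
        return None
    first, second = minima[0], minima[1]
    return max(energies[first:second + 1]) - energies[first]


# Invented scans along the O-H elongation coordinate (kcal/mol)
monomer_like = [0.0, 1.2, 3.5, 6.0, 8.1, 9.9, 11.5]   # energy only rises
dimer_like = [0.0, 1.5, 3.9, 5.7, 4.8, 4.1, 4.6]      # second minimum

print(local_minima(monomer_like), barrier_height(monomer_like))
print(local_minima(dimer_like), barrier_height(dimer_like))
```

A scan whose energy rises monotonically yields one minimum and no barrier, which corresponds to the absence of a tautomeric equilibrium; a second minimum yields a finite barrier for the transfer.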

**Figure 4.** The potential energy profile for a gradual displacement of one proton within the H-bond in the **4(II)** monomer (**a**) and the **D4(I)** dimer (**b** and **c**) calculated in the PCM approximation in acetonitrile. The curves **a** and **c** represent a case when all other structural parameters are optimized. The curve **b** represents a case when the position of the adjacent bridged proton is fixed.

**Figure 5.** Schemes of the prototropic equilibria for the carboxyl aryl derivatives and their intermolecular complexes.

To explore the possibility of a single proton transfer in the dimers (equilibrium **B** ⇌ **A**, Figure 5), the calculation was performed at a constant O-H distance of the adjacent hydrogen bond. There was no local minimum on the potential energy curve (Figure 4, curve **b**). Therefore, the formation of a zwitterionic complex (Figure 5, structure **A**) was disadvantageous and there was little chance of observing the equilibrium **B** ⇌ **A** experimentally (Figure 5). Nevertheless, the profile for the single proton transfer in the dimer was shallower than that in the monomer.

The calculated H-bond energies (Δ*E*(HB) ≈ *E*min(non-HB) − *E*i(HB) [75]) in the studied dimeric complexes were smaller than 7 kcal/mol per H-bond. The estimated values of the energies calculated for the dimers of **3** and **4** correlated well with the energies reported for similar systems. For example, according to the experimental temperature-dependent attenuated total reflection (ATR) IR studies of ibuprofen by Ludwig et al. [76], the enthalpy of the transition from doubly H-bonded cyclic dimers to singly H-bonded linear dimers is equal to −5.07 kcal/mol. The binding energy of a *p*-biphthalate dimer obtained at the B3LYP/6-31+G\* level is about 12.4 kcal/mol (6.2 kcal/mol per hydrogen bond) [77]. Such H-bonds are characterized as weak ones. However, the studied dimers exhibited an easy double proton transfer. Such a phenomenon is typical for Strong Short H-Bonds (SSHB) [78]. This observation can be rationalized as follows. An elongation of the OH distance results in an increase of the electron density on the adjacent oxygen of the same carboxyl group, thereby strengthening the basicity of this oxygen. When the OH bond length is ca. 1.25 Å, the basicity of the adjacent oxygen becomes sufficient to evoke a spontaneous transfer of the adjacent proton from the opposite carboxyl group. A further elongation of the OH bond brings about a moderate decrease of the dimer energy, thus creating a double-well potential. Following Gilli's terminology [79], this phenomenon can be called a charge flow-assisted hydrogen bond.

#### *2.2. Dynamics of Hydrogen Bonding within the Framework of Molecular Dynamics*

Molecular dynamics (MD) schemes, which reproduce time evolution of the studied systems, are useful in the investigations of multi-dimensional and complex phenomena [80–82]. In the studied case of phthalic acid derivatives, it was necessary to use the Car–Parrinello MD scheme (CPMD), which is based on the DFT framework and is able to reproduce H-bond properties [83–88]. This section describes how these CPMD simulations illustrate the impact of H-bond strength on the molecular metric parameters.

Table 1 presents statistical data (averages and standard deviations) for the CPMD production runs. After the thermostatted equilibration phase, the data collection without thermostats lasted 24 ps, and only the last 20 ps were taken as the production runs in order to allow the molecules to relax after thermostatting. It was interesting to see that the intramolecular H-bond in **4** was much stronger than in **3**, but the intermolecular bridges of the dimers were of almost the same strength. While the mean and standard deviations of the donor-acceptor distance were lower for a stronger bonding, the opposite was true for the donor-proton bond length. This was a result of increased delocalization of the proton in the stronger bridge, whereas the dynamics of the H-bridge were weaker. The donor-acceptor distances listed in Table 1 indicate that the intermolecular H-bonds in the dimer of **3** were stronger and more delocalized than the intramolecular one in the monomeric **3**. An opposite phenomenon was observed for **4**. This discrepancy can be explained by the difference in the geometry of the structures. For the dimers of **3** and **4**, the geometry of the H-bridges was planar (COH···O torsional angle ~0°) and linear (OHO angle ~179°), while, for the monomers, the geometry was neither planar nor linear (COH···O and OHO angles were ~64/50° and ~150/160° for **3**/**4**, respectively, Table 1). These deviations from the planarity were caused by a strong electrostatic repulsion between oxygen atoms of the intramolecular H-bonds in the monomers. Moreover, for the monomer of **3**, the phenomenon of non-coplanarity was enhanced by a strong steric repulsion from the nitro group, which led to an additional weakening of the intramolecular H-bond.
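The averaging protocol described above (collect 24 ps, analyse only the last 20 ps) can be sketched as follows. The time series is invented for illustration, the function name `production_stats` is hypothetical, and the use of the population standard deviation is an assumption, since the text does not state which estimator was applied.

```python
import statistics


def production_stats(times_ps, values, discard_ps=4.0):
    """Mean and standard deviation over the production part of a run.

    Frames with time < discard_ps are dropped, mirroring the protocol in
    the text (24 ps collected, last 20 ps analysed). The population
    standard deviation is an assumption of this sketch.
    """
    kept = [v for t, v in zip(times_ps, values) if t >= discard_ps]
    return statistics.mean(kept), statistics.pstdev(kept)


# Invented donor-acceptor distances (Å) sampled along a 24 ps trajectory
times_ps = [0.0, 2.0, 4.0, 8.0, 12.0, 16.0, 20.0, 24.0]
d_oo = [2.80, 2.72, 2.60, 2.64, 2.58, 2.62, 2.60, 2.62]

mean_d, std_d = production_stats(times_ps, d_oo)
print(f"d(OO) = {mean_d:.3f} ± {std_d:.3f} Å")
```

Discarding the initial segment removes the relaxation that follows the removal of the thermostat, so the reported average and spread reflect the equilibrated bridge only.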


**Table 1.** Metric parameters (in Å) for the donor-acceptor (OO) and donor-proton (OH) contacts in the monomers and dimers of **3** and **4**. The CPMD results are given as: Average ± standard deviation.

Additional insight was provided by the time evolution of the bridge distances, depicted in Figure 6 for the monomers and Figure S5 for the dimers. It was striking that even if the monomer of **4** had the shortest donor-acceptor distance among the studied systems, there were no indications of the proton entering the acceptor side. On the other hand, the intermolecular cyclic dimers of **3** and **4** were typical for carboxylic acids. For **4**, there were numerous instances of the bridge proton being located almost in the middle of the bridge, while for **3** there were just two such cases and one of them was a concerted transfer (occurring at the same time in both bridges). Such synchronicity was less obvious for **4**. This delocalization of the protons in the cyclic dimer of **4** showed that the H-bonding in **4** was stronger than in **3** for both the monomer and the dimer.
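One simple way to quantify such events in a trajectory is to flag the frames in which the proton sits close to the bridge midpoint and then intersect the flags of the two bridges of a cyclic dimer. The sketch below uses invented distances and an arbitrary 0.1 Å tolerance; both, as well as the function names, are assumptions of this illustration rather than the criterion used in the study.

```python
def shared_proton_frames(d_donor_h, d_h_acceptor, tolerance=0.1):
    """Indices of frames in which the proton is near the bridge midpoint.

    A frame is flagged when |d(O-H) - d(H...O)| <= tolerance (Å).
    The tolerance is an arbitrary illustrative choice.
    """
    return [
        i
        for i, (a, b) in enumerate(zip(d_donor_h, d_h_acceptor))
        if abs(a - b) <= tolerance
    ]


def concerted_frames(bridge_1, bridge_2, tolerance=0.1):
    """Frames in which both bridges of a dimer show a shared proton."""
    first = set(shared_proton_frames(*bridge_1, tolerance=tolerance))
    second = set(shared_proton_frames(*bridge_2, tolerance=tolerance))
    return sorted(first & second)


# Invented (donor-proton, proton-acceptor) distance series in Å
bridge_1 = ([1.02, 1.05, 1.20, 1.25, 1.04], [1.60, 1.55, 1.28, 1.22, 1.58])
bridge_2 = ([1.01, 1.03, 1.06, 1.24, 1.03], [1.62, 1.60, 1.50, 1.23, 1.59])

print(shared_proton_frames(*bridge_1))
print(shared_proton_frames(*bridge_2))
print(concerted_frames(bridge_1, bridge_2))
```

Frames flagged in both bridges at once correspond to the concerted events discussed above, while frames flagged in only one bridge indicate a less synchronous motion.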

It is worth noting that the H-bond in the monomer of **3** was characterized by the greatest dynamics, due to its non-planar structure and, as a consequence, a significant deformation component.

**Figure 6.** Time evolution of the H-bridge metric parameters. The CPMD gas phase simulations of the monomeric **3** and **4**. Red: Donor-proton distance, green: Proton-acceptor distance, blue: Donor-acceptor distance.

#### *2.3. NMR Studies of Nitrophthalic Acids*

The NMR study of H-bonding in solution is challenging due to the short lifetime of H-bonded complexes of low molecular weight. Generally, only a single NMR line is observed for all mobile protons, which represents an average over different, rapidly interconverting hydrogen-bonded complexes. This problem can be solved using a low-freezing solvent [89]. In such a solvent, complex hydrogen-bonded systems can be characterized in great detail [90–92]. However, such experiments are not without their problems. A simplified qualitative analysis is possible when the mole fractions and the individual chemical shifts of different H-bonded complexes are known.

Neither **3** nor **4** was soluble in weakly polar, aprotic solvents. However, their solubility can be increased in the presence of a dissolved base. Possible scenarios of phthalic acid interaction with bases in solution are shown in Figure 7. If the composition of such an acid:base complex is 1:1, one carboxyl group of the acid interacts with the base while the other carboxyl group can either form an intramolecular H-bond with the former one (scenario **a**) or remain free (scenario **b**). If the base is in excess, a 1:2 acid:base complex can be formed with two near-equal intermolecular H-bonds (scenario **c**). We did not consider complexes of acid dimers because such complexes are unlikely in the presence of the base, given that the solubility of the acid alone is very low. What are the individual 1H chemical shifts of carboxyl protons in these complexes when the base is very strong? In a polar solvent, the 1H chemical shift of the carboxyl proton in a 1:1 complex of 2-nitrobenzoic acid with 2,4,6-trimethylpyridine is equal to 16.8 ppm [93]. This H-bond can be elongated due to steric effects [94,95]. The use of a stronger base can cause either a contraction or a lengthening of the H-bond. The result depends on the position of the bonding proton with respect to the H-bond center. However, the reduction of solvent polarity causes the opposite effect [66–68]. We concluded that the 1H chemical shift of the proton in the intermolecular H-bonds in scenarios **a**, **b**, and **c** should be between 17 and 19 ppm. The 1H chemical shift of the proton in the intramolecular H-bond in scenario **a** was the hardest to estimate. The geometry of this H-bond was forced to adapt to the rigid molecular structure. Most likely, the 1H chemical shift of this proton should be smaller than 15 ppm [33]. The 1H chemical shift of the proton of the free carboxyl group in scenario **b** depended on the interaction with CDCl3. At high concentration of 2,6-bis(trifluoromethyl)benzoic acid in dry CDCl3, its mobile proton resonates at 10 ppm. At high concentration of 2,6-bis(trifluoromethyl)benzoic acid in toluene, its mobile proton resonates at 8.8 ppm at 300 K and at 7.4 ppm at 354 K. In both solvents the chemical shift depends on the monomer–dimer equilibrium of the acid. We believed that 6 ppm was a safe upper limit for the 1H chemical shift of the proton of the free carboxyl group in scenario **b**. Summarizing the above, the mean 1H chemical shifts of the carboxyl protons in scenarios **a**, **c**, and **b** were expected to be about 16 ppm, 18 ppm, and below 12 ppm, respectively.
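The summary values follow from averaging the individual shifts of the two carboxyl protons of one acid molecule. The sketch below reproduces this arithmetic under two assumptions that are made here for illustration: the two protons contribute with equal weight to a single fast-exchange signal, and the intermolecular H-bond is represented by 18 ppm, the midpoint of the 17–19 ppm range.

```python
def mean_shift(*shifts_ppm):
    """Equal-weight average of individual 1H chemical shifts (ppm)."""
    return sum(shifts_ppm) / len(shifts_ppm)


INTER = 18.0      # midpoint of the 17-19 ppm range (assumption of this sketch)
INTRA_MAX = 15.0  # upper limit for the intramolecular H-bond proton
FREE_MAX = 6.0    # upper limit for the free carboxyl proton

scenario_a_upper = mean_shift(INTER, INTRA_MAX)  # inter + intra
scenario_b_upper = mean_shift(INTER, FREE_MAX)   # inter + free
scenario_c = mean_shift(INTER, INTER)            # two intermolecular bonds

print(scenario_a_upper, scenario_b_upper, scenario_c)
```

Because the intramolecular and free-proton shifts enter as upper limits, the values for scenarios **a** and **b** are upper bounds on the mean, consistent with "about 16 ppm" and "below 12 ppm".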

**Figure 7.** Possible scenarios of phthalic acid interaction with bases in nonpolar solution: (**a**) One intra- and one intermolecular H-bond, (**b**) single intermolecular H-bond, and (**c**) two intermolecular H-bonds. Molecular structures of the considered bases.

Figure 8 shows characteristic 1H NMR spectra of **3** and **4** in CDCl3 in the presence of a large excess of triethylamine (Et3N). The limiting mean values of the 1H chemical shift of carboxyl protons measured using a set of spectra collected with a gradual increase in the mole fraction of **3** or **4** were equal to 14.1 ppm for both acids (Tables S1 and S2). Therefore, the most likely structure of a complex of Et3N with phthalic acids in nonpolar solvents corresponded to scenario **a**. This result is rather intuitive, since the bulkiness of Et3N significantly increases the entropic cost of the structure shown in scenario **c**.

**Figure 8.** Characteristic 1H NMR spectra of **3** and **4** in CDCl3 at 300 K in the presence of Et3N. The signals of OH-protons are marked by asterisks. The mole fractions are (**a**) water:**4**:Et3N = 1:1.2:35, (**b**) water:**4**:Et3N = 1:8.7:35, and (**c**) water:**3**:Et3N = 1:8.7:35.

Figure 9a,b shows characteristic 1H NMR spectra of **4** in CDCl3 in the presence of a large excess of *N*,*N*-dimethylpyridin-4-amine (DMAP). The limiting mean value of the 1H chemical shift of carboxyl protons measured using a set of spectra collected with a gradual increase in the mole fraction of **4** was about 18.5 ppm (Table S3). Therefore, the most likely structure of the complex of DMAP with **4** in CDCl3 corresponded to scenario **c**. When the mole fraction of DMAP was only slightly larger than that of **4**, while the total concentration was very low, the 1H NMR spectra exhibited two separate peaks of different mobile protons (Figure 9c). We attributed the peak at 15.7 ppm to a complex of DMAP with **4** and the peak at 2.4 ppm to water interacting with residual DMAP. The former peak obviously corresponded to the structure in scenario **a**. The mean 1H chemical shift in this complex was larger than for Et3N. However, this difference does not necessarily mean that the intermolecular H-bond in DMAP:**4** was stronger than in Et3N:**4**. Recall that the 1H chemical shift of the strongest known H-bond in [FHF]<sup>−</sup> is 16.6 ppm [95] while one of the largest 1H chemical shifts, of 21.7 ppm, has been measured for a moderately strong H-bond in the proton-bound homodimers of pyridine [33]. More about this issue can be found elsewhere [96]. In contrast, the value of 15.7 ppm in DMAP:**4** can be compared to the value of 18.5 ppm in DMAP:**4**:DMAP. The latter complex has two intermolecular H-bonds while the former has one inter- and one intramolecular H-bond. Therefore, if the mutual influence of the adjacent hydrogen bonds on their geometries in each of these complexes was small, the individual 1H chemical shift of the intramolecular H-bond in DMAP:**4** was about 13 ppm.
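The estimate of about 13 ppm can be reproduced with one line of arithmetic. The sketch assumes, as the text does, that the mean shift of the 1:1 complex is the equal-weight average of one intermolecular and one intramolecular proton, and that the intermolecular proton has the same shift as in the 1:2 complex; the function name is hypothetical.

```python
def intramolecular_shift(mean_1to1_ppm, mean_1to2_ppm):
    """Individual shift of the intramolecular H-bond proton (ppm).

    Assumes mean_1to1 = (inter + intra) / 2 and inter = mean_1to2,
    i.e. the adjacent H-bonds do not perturb each other.
    """
    return 2.0 * mean_1to1_ppm - mean_1to2_ppm


estimate = intramolecular_shift(15.7, 18.5)  # DMAP:4 and DMAP:4:DMAP
print(f"{estimate:.1f} ppm")
```

The result, 2 × 15.7 − 18.5 = 12.9 ppm, is the value rounded to "about 13 ppm" in the text.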

**Figure 9.** Characteristic 1H NMR spectra of **4** in CDCl3 at 300 K in the presence of DMAP. The mole fractions are (**a**) water:**4**:DMAP = 1:1.0:260, (**b**) water:**4**:DMAP = 1:3.9:260, and (**c**) water:**4**:DMAP = 1:0.24:0.30.

In contrast to the solution with Et3N, the presence of DMAP did not increase the solubility of **3** in CDCl3. Presumably, **3** did not interact with DMAP by scenario **c** due to a high entropic cost caused by the position of the nitro group. Why did it not interact with DMAP by scenario **a**? We cannot answer this question with certainty.

#### *2.4. H-Bonding Vibrational Modes in Carboxyl Dimers*

Stretching vibrations of H-bonds have a high diagnostic value for determination of the nature and strength of these bonds [97,98]. Previously, the spectral manifestations of dimerization and isotopic effects on spectroscopic observables were studied for different molecular systems [99–108] including carboxylic acid dimers [109,110]. Upon carboxylic acid dimerization, the structure of the OH stretching band in IR spectra changes most prominently: the narrow band of monomers changes to a broad, intense band of dimers with a complex substructure, shifted to lower wavenumbers.

For a comprehensive spectroscopic investigation of **3** and **4**, we accomplished a study based on IR, Raman, and IINS measurements, as well as DFT, CPMD, and Potential Energy Distribution (PED) calculations. The IR, Raman, and IINS spectra of non-deuterated and deuterated (OH → OD replacement) **3** and **4** are shown in Figures 10 and 11. The experimental spectra were interpreted using the calculated vibrational (DFT) and power spectra (CPMD), and the results of the PED analysis (Tables S4 and S5). More about this issue can be found elsewhere [111–113].

**Figure 10.** Normalized experimental IR and Raman spectra of **3** (**A** and **B**) and **4** (**C** and **D**) (black spectra) and their deuterated (OD) derivatives (red spectra).

**Figure 11.** Normalized IINS (**A**, **D**), Raman (**B**, **E**), and IR (**C**, **F**) spectra of compounds **3** (**A**–**C**, black spectra) and **4** (**D**–**F**, black spectra) and their deuterated derivatives (red spectra).

According to the crystallographic data [37], the molecules of **3** form H-bonded oligomeric chains of dimers. These H-bonds were almost of the same length (Table 1) and, consequently, were equally strong. Therefore, the stretching vibrations of the OH group (ν(OH)) were within the same spectral range and overlapped in the experimental IR spectra (Figure 10). The shapes of the ν(OH) and ν(OD) bands were very similar to those of the carboxylic acid dimer studied experimentally and theoretically by Flakus et al. [110]. For **3**, the deuteration caused a shift of the ν(OD) band to lower wavenumbers according to the well-established rule with the isotopic spectroscopic ratio ISR = δOH/δOD = 1.28 [114]. In contrast, for **4** the band ν(OD) expanded strongly; this revealed a complex character of the underlying changes. Deformational (δ(OH)/δ(OD) and γ(OH)/γ(OD)) bands are informative because in **3** and **4** they differed from those observed for intramolecular H-bonds in *ortho*-hydroxy aryl Schiff bases [115–118] and *ortho*-hydroxy acetophenones [119]. The δ(OH) was a doublet at 1409/1395 cm<sup>−1</sup> in **3** and a band at 1383 cm<sup>−1</sup> in **4**. Upon deuteration, these bands disappeared to emerge at 1033/1029 cm<sup>−1</sup> in **3** and at 1014/995 cm<sup>−1</sup> in **4**. Thus, the ISR is in the range of 1.35–1.36 for both compounds (Tables S4 and S5). This characteristic behavior of the ISR deviated from that of *ortho*-hydroxy aryl Schiff bases [115] and *ortho*-hydroxy acetophenones [119]. As for the bands assigned to the deformational γ(OH) vibrations, several bands shifted to the low-wavenumber region after the deuteration: 876, 835, 819, 796, 752, and 691 cm<sup>−1</sup> for **3** and 862, 839, and 763 cm<sup>−1</sup> for **4** in IR/Raman spectra (Tables S4 and S5). The assignment of these bands to the deformational vibrations of the bridging protons was unequivocal because the intensity of the two series of bands at 895, 875, 845, 826 cm<sup>−1</sup> and 778, 715, 668 cm<sup>−1</sup> for **3** and 874 and 704 cm<sup>−1</sup> for **4** was greatly decreased in the IINS spectra after the deuteration (Figure 11 and Table S4). This phenomenon has been studied in the past [120–125]. The emergence of the two series of bands assigned to the deformational vibrations can be explained by the presence of the dimers and the oligomers in the solid state (see above). The attribution of two deformational bands to dimers and monomers was suggested by Miyazawa and Pitzer [126] for formic acid in the gas phase and solid nitrogen matrixes. Thus, the two series of the γ(OH) bands at 876 cm<sup>−1</sup> (γ(OD) = 627 cm<sup>−1</sup>), 835 cm<sup>−1</sup> (627 cm<sup>−1</sup>), 819 cm<sup>−1</sup> (627 cm<sup>−1</sup>), and 796 cm<sup>−1</sup> (580 cm<sup>−1</sup>) and at 752 cm<sup>−1</sup> (545 cm<sup>−1</sup>), 690 cm<sup>−1</sup> (514 cm<sup>−1</sup>), and 874, 704 cm<sup>−1</sup> are assigned to the deformational vibrations of the carboxyl groups of the dimers and the monomers of **3** and **4**, respectively.
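The isotopic spectroscopic ratio of the δ(OH) bands can be recomputed directly from the band positions quoted above. The sketch pairs the doublet components of **3** in the order in which they are listed and uses the 1383/1014 cm<sup>−1</sup> pair for **4**; this pairing is an assumption of the illustration, since the text does not state it explicitly.

```python
def isr(wavenumber_h_cm1, wavenumber_d_cm1):
    """Isotopic spectroscopic ratio of a band before/after deuteration."""
    return wavenumber_h_cm1 / wavenumber_d_cm1


# (delta(OH), delta(OD)) band positions in cm^-1 taken from the text;
# the pairing of the components is an assumption of this sketch.
delta_pairs = {
    "3, first component": (1409, 1033),
    "3, second component": (1395, 1029),
    "4": (1383, 1014),
}

ratios = {label: isr(h, d) for label, (h, d) in delta_pairs.items()}
for label, value in ratios.items():
    print(f"{label}: ISR = {value:.3f}")
```

All three ratios fall between 1.35 and 1.37, in agreement with the range given in the text.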

The assignment of γ(OH) can be supported by the previously published *d*(OO) = *f*(γ(OH)) correlation [127,128] and the crystallographic data for **3** [37]. The lengths of the H-bridges in the range of 2.65–2.70 Å (*d*(OO) is calculated by means of the correlation $d(\mathrm{OO}) = 3.01 - 4.4 \times 10^{-4}\,\gamma(\mathrm{OH})$, where *d*(OO) is in Å and γ(OH) is in cm<sup>−1</sup>) matched very well with the experimentally measured ones (*d*(OO) = 2.698 Å and 2.681 Å [37]). Moreover, the experimentally obtained wavenumber values for the γ(OH) bands can be applied to compare the strength of the H-bonds in **3** and **4**. The obtained results show that the H-bonds in **3** were a bit weaker than in **4** (*d*(OO) is in the range of 2.65–2.70 Å for **3** and 2.63–2.67 Å for **4**), though the difference in the strength of the hydrogen bonding was not large.
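As a quick check of the linear correlation, d(OO) = 3.01 − 4.4 × 10<sup>−4</sup> γ(OH), the sketch below applies it to the γ(OH) bands of **4** listed earlier (862, 839, and 763 cm<sup>−1</sup>). The function name is hypothetical; the coefficients are those quoted in the text.

```python
def d_oo_from_gamma(gamma_oh_cm1):
    """O...O distance (Å) from the gamma(OH) wavenumber (cm^-1)."""
    return 3.01 - 4.4e-4 * gamma_oh_cm1


gamma_oh_4 = [862, 839, 763]  # gamma(OH) bands of 4 quoted in the text
distances_4 = [d_oo_from_gamma(g) for g in gamma_oh_4]

for gamma, distance in zip(gamma_oh_4, distances_4):
    print(f"gamma(OH) = {gamma} cm^-1 -> d(OO) = {distance:.2f} Å")
```

The computed distances span 2.63–2.67 Å, which reproduces the range given for **4**; a higher γ(OH) wavenumber maps onto a shorter, and hence stronger, hydrogen bridge.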

In terms of the H-bond vibrations, IINS spectroscopy allows one to interpret the bands (νσ) unequivocally because these bands disappear almost completely upon deuteration [115,121]. Based on this phenomenon, two low-intensity bands at 555 and 390 cm<sup>−1</sup> were assigned to the vibrations νσasym and νσsym of the H-bonds, respectively. Importantly, these bands overlapped with the bands of other vibrations (insensitive to deuteration) in both the IINS and IR spectra. However, the changes in the IINS spectra were much clearer than those in the IR spectra.

The relative strengths of the H-bonds in dimers and monomers of **3** and **4** can be evaluated using atomic velocity power spectra obtained from the CPMD trajectories. The vibrational spectra related to the atomic motion intensity (arbitrary intensities) are presented in Figure 12. The bands of hydroxyl groups are relatively broad (red bars in Figure 12): 3100–3500 cm<sup>−1</sup> and 2920–3250 cm<sup>−1</sup> for the monomers of **3** and **4** and 2400–3100 cm<sup>−1</sup> and 2250–3200 cm<sup>−1</sup> for the dimers of **3** and **4**. The bands of the dimers are strongly red-shifted as compared to those of the monomers. This shift indicates that the intermolecular H-bonds in the dimers were much stronger than the intramolecular ones in the monomers.

The stretching and bending vibration areas of the dimers of **3** and **4** overlapped (Figure 12, **3d** and **4d**). In contrast, the stretching and bending vibration bands of the monomer of **4** were red- and blue-shifted, respectively, as compared to those of **3** (Figure 12, **3m** and **4m**), in line with the hydroxyl band ranges listed above. Therefore, the strengths of the H-bonds in the dimers of **3** and **4** were similar. In contrast, the intramolecular H-bond in the monomer of **3** was weaker than that in the monomer of **4**. The reason is that the structure of the monomer of **3** was more bent. This conclusion is consistent with the above interpretation of the experimental data and demonstrates that experimental spectroscopic studies and CPMD simulations complement each other.

**Figure 12.** Calculated power spectra of atomic velocity–results of the CPMD runs for the monomers of **3** and **4** (**3m** and **4m**) as well as for the dimers of **3** and **4** (**3d** and **4d**). The CPMD power spectra are presented only for the bridged protons vibrational modes. The stretching vibration area is shown in red. The bending vibration areas are shown in blue and yellow.

#### **3. Materials and Methods**

#### *3.1. Compounds and Deuteration*

The studied compounds and solvents were purchased from Sigma-Aldrich and used without further purification. The deuterated sample was prepared by dissolving the product in deuterated methanol (CH<sub>3</sub>OD). The solution was then heated to 60 °C and refluxed for 30 min. After that, the methanol was removed by evaporation under reduced pressure. This procedure was repeated three times.

#### *3.2. Infrared and Raman Measurements*

The far- and mid-infrared (FIR, MIR) absorption measurements were performed using a Bruker Vertex 70v vacuum Fourier transform spectrometer. The transmission spectra were collected with a resolution of 2 cm<sup>−1</sup> and with 64 and 32 scans per spectrum for FIR and MIR, respectively. The FT-FIR spectra (500–50 cm<sup>−1</sup>) were collected for the samples suspended in Apiezon N grease and placed on a polyethylene (PE) disc. The FT-MIR spectra were collected for the samples in a KBr pellet. The Raman spectra of the analyzed samples were obtained using an FT-Nicolet Magma 860 spectrophotometer. The In:Ga:Ar laser line at 1064 nm was employed for the Raman excitation. The spectra were recorded at room temperature in the range of 200–3800 cm<sup>−1</sup> with a spectral resolution of 4 cm<sup>−1</sup> and with the same number of scans (512 per measurement).

#### *3.3. Incoherent Inelastic Neutron Scattering (IINS) Measurements*

Neutron scattering data were collected at the pulsed IBR-2 reactor at the Joint Institute for Nuclear Research (Dubna) using the time-of-flight inverted-geometry spectrometer NERA at a temperature of 10 K. The spectra were converted from neutrons per channel to the scattering function per energy transfer. At energy transfers between 5 and 1200 cm<sup>−1</sup>, the relative IINS resolution was estimated to be ca. 3%. The *S*(*Q*, ω) function (scattering law) can be expressed in the form of an isotropic harmonic oscillator [129]:

$$S(Q, n\omega) = \frac{\left(Q^2 U^2\right)^n}{n!} \exp\left(-Q^2 U^2\right) \tag{1}$$

where *Q* is the momentum transfer and *U*<sup>2</sup> is the mean square displacement defined as

$$
U^2 = \frac{\hbar}{2m\omega} = \frac{16.795}{\mu\nu} \tag{2}
$$

where μ is the oscillator mass in amu, ν is the oscillator energy in cm<sup>−1</sup>, *U*<sup>2</sup> is expressed in Å<sup>2</sup>, and *n* is the number of excited states.
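As a numerical illustration of the two expressions above, the sketch below evaluates the mean square displacement and the scattering law for one oscillator. It assumes the standard isotropic harmonic oscillator form, with the factor (*Q*<sup>2</sup>*U*<sup>2</sup>)<sup>*n*</sup>/*n*! multiplied by the Debye–Waller term exp(−*Q*<sup>2</sup>*U*<sup>2</sup>). The function names and the input values (μ, ν, *Q*) are hypothetical and chosen only for demonstration.

```python
import math


def mean_square_displacement(mu_amu, nu_cm1):
    """U^2 in angstrom^2 for an oscillator of mass mu (amu) and energy nu (cm^-1)."""
    return 16.795 / (mu_amu * nu_cm1)


def scattering_law(q_inv_angstrom, u2_angstrom2, n=1):
    """S(Q, n*omega) for an isotropic harmonic oscillator (assumed standard form)."""
    x = q_inv_angstrom ** 2 * u2_angstrom2
    return x ** n / math.factorial(n) * math.exp(-x)


# Hypothetical example: a proton-like oscillator (mu = 1 amu) at 875 cm^-1
u2 = mean_square_displacement(1.0, 875.0)
for q in (2.0, 5.0, 8.0):  # momentum transfer in 1/angstrom, illustrative values
    print(f"Q = {q:.1f} 1/A: S(fundamental) = {scattering_law(q, u2, n=1):.4f}")
```

Because *U*<sup>2</sup> is inversely proportional to μ, replacing H by D reduces the displacement of the oscillator, which is one reason why bands of the bridging protons lose intensity upon deuteration.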

#### *3.4. NMR Measurements*

The <sup>1</sup>H spectra were recorded at room temperature on a Bruker Avance III 500 MHz spectrometer. CDCl<sub>3</sub> was purchased from Sigma-Aldrich and used without further purification. The spectra were measured using the solvent peak as an internal reference, and the chemical shifts were converted to the conventional TMS scale. The number of scans varied between 128 and 256.

#### *3.5. Car–Parrinello Molecular Dynamics Simulations*

The dynamical nature of the investigated molecules **3** and **4**, with an emphasis on their hydrogen bridges, was studied using Car–Parrinello molecular dynamics (CPMD) [130]. The models of the monomers and dimers for the CPMD simulations were constructed on the basis of static DFT gas-phase results. The molecular structures were placed in cubic boxes with *a* = 15 Å for the monomeric forms and *a* = 22 Å (for compound **3**) and *a* = 25 Å (for compound **4**) for the dimeric forms. The first-principles molecular dynamics (FPMD) calculations were performed in the gas phase with the empirical van der Waals correction by Grimme (DFT-D2) [131]. The Perdew–Burke–Ernzerhof (PBE) exchange-correlation DFT functional [132] was applied. The core electrons of the studied monomers and dimers were replaced by norm-conserving pseudopotentials of the Troullier–Martins type [133]. The Kohn–Sham orbitals were expanded in a plane-wave basis set with a maximum kinetic energy cutoff of 90 Ry. Hockney's scheme [134] was used to remove interactions with periodic images and to simulate isolated-molecule conditions. The orbital coefficients were propagated using the default value of the fictitious orbital mass, 400 a.u., and the nuclear motion timestep was set to 2 a.u. The CPMD simulations were divided into two steps: the equilibration and the production runs. During the equilibration, the ionic temperature was set to 297 K and controlled by Nosé–Hoover thermostat chains with default settings, with each degree of freedom coupled to a separate thermostat ("massive" thermostatting) [135,136]. The frequency of the Nosé–Hoover thermostat chain was set to 3200 cm<sup>−1</sup>. The equilibration runs lasted for 50,000 steps for both the monomers and the dimers. The data collection lasted for 500,000 steps (24 ps) in the NVE (microcanonical) ensemble, with the thermostat chains detached. The obtained trajectories served as a basis for the analysis of the time evolution of the distances involving the bridged proton and of the dynamics of the functional groups, as well as for determining the vibrational features of the investigated compounds from the power spectra of atomic velocity.
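Two of the quantities mentioned above can be illustrated with a short sketch: the conversion of the number of steps and the timestep (in atomic units) into simulation time, and the evaluation of a velocity power spectrum on a grid of wavenumbers. The code below is a simplified, standard-library illustration and is not the locally written utilities used by the authors; the function names are hypothetical, and the test signal (a single cosine at 3000 cm<sup>−1</sup> sampled every 0.5 fs) is synthetic.

```python
import cmath
import math

AU_TIME_FS = 0.02418884326   # one atomic unit of time in femtoseconds
C_CM_PER_FS = 2.99792458e-5  # speed of light in cm/fs


def simulation_time_ps(n_steps, timestep_au):
    """Total simulated time in picoseconds for n_steps of timestep_au (a.u.)."""
    return n_steps * timestep_au * AU_TIME_FS / 1000.0


def power_spectrum(velocities, dt_fs, wavenumbers_cm1):
    """Squared magnitude of the Fourier sum of a 1D velocity series.

    velocities: samples of one velocity component, spaced dt_fs apart.
    wavenumbers_cm1: wavenumbers (cm^-1) at which the spectrum is evaluated.
    """
    mean = sum(velocities) / len(velocities)
    centered = [v - mean for v in velocities]  # remove the zero-frequency offset
    spectrum = []
    for wavenumber in wavenumbers_cm1:
        omega = 2.0 * math.pi * C_CM_PER_FS * wavenumber  # angular frequency, rad/fs
        total = sum(v * cmath.exp(-1j * omega * k * dt_fs)
                    for k, v in enumerate(centered))
        spectrum.append(abs(total) ** 2)
    return spectrum


# Production run length quoted in the text: 500,000 steps of 2 a.u.
print(f"production run: {simulation_time_ps(500_000, 2.0):.1f} ps")

# Synthetic test signal: one oscillation at 3000 cm^-1 (hypothetical data)
demo_dt_fs = 0.5
demo_signal = [math.cos(2.0 * math.pi * C_CM_PER_FS * 3000.0 * k * demo_dt_fs)
               for k in range(400)]
demo_grid = list(range(0, 4001, 100))
demo_spectrum = power_spectrum(demo_signal, demo_dt_fs, demo_grid)
peak_wavenumber = demo_grid[demo_spectrum.index(max(demo_spectrum))]
print(f"peak of the synthetic spectrum: {peak_wavenumber} cm^-1")
```

For a real trajectory, the velocity components of the atoms of interest (here, the bridged protons) would be processed in the same way and summed.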

The CPMD simulations were carried out using the CPMD 3.17.1 program [137]. The data were analyzed using locally written utilities and the VMD 1.9.3 program [138]. The graphical presentation of the results was prepared with the Gnuplot graphics package [139] and with VMD 1.9.3 [138].

#### *3.6. DFT Calculations*

This part of the calculations was performed with the Gaussian 09 suite of programs [140] using density functional theory (DFT) with the three-parameter functional proposed by Becke and the correlation energy according to the Lee–Yang–Parr formula, denoted as B3LYP [141,142]. The triple-zeta split-valence basis set, denoted as 6-311+G(d,p) in Pople's notation [143–145], was applied. The use of diffuse functions is appropriate for studies of hydrogen bonding [146]. Initially, the geometry optimization was carried out and followed by harmonic frequency calculations, confirming that the obtained structures correspond to minima on the potential energy surface (PES). Next, the one-dimensional reaction path of the bridged proton transfer from the donor to the acceptor atom within the intramolecular hydrogen bond was studied. The applied approach was based on a stepwise elongation of the O-H distance (in 0.05 or 0.1 Å increments) with full optimization of the remaining structural parameters. The calculations were carried out in the gas phase and with the solvent reaction field using acetonitrile as the solvent. The Polarizable Continuum Model (PCM) method [147] was used to reproduce the solvent influence on the studied molecules. All the calculations were conducted for the electronic ground state and without any extra charges on the molecules and dimers. The obtained results were visualized using the MOLDEN software [148].

#### *3.7. PED Analysis*

The potential energy distribution (PED) of the normal modes was calculated in terms of natural internal coordinates [149] using the Gar2ped program [150].

#### **4. Conclusions**

The result of the interplay between competing noncovalent interactions in the condensed phase may be quite unexpected. The conformation of carboxyl groups is assumed to be predominantly *cis* due to the so-called *Z*-effect [59]. However, the conformation can change in H-bonded associates [56–59]. We have reported herein a comprehensive computational and experimental study of this phenomenon using 3-nitrophthalic (**3**) and 4-nitrophthalic (**4**) acids as model systems. It was observed that an intramolecular H-bond interaction between the adjacent carboxyl groups of these molecules became favorable only when one of the groups was involved in a strong intermolecular H-bond. However, even in this case, the spatial distance between the carboxyl groups needed to be increased. If the latter was not possible, for example due to steric hindrance, as in **3**, the intramolecular interaction was energetically unfavorable. As a result, the intramolecular steric hindrance critically affected the solubility, the crystal packing, and the intramolecular proton exchange of phthalic acids.

The structural and energetic parameters of intra- and intermolecular interactions in the monomers, dimers, and aggregates of **3** and **4** were estimated for the gas, liquid, and solid phases.

**Supplementary Materials:** The following are available online, Figure S1: The dimeric forms of compounds **3** and **4** and relative energy values obtained at B3LYP/6-311+G(d,p) level of theory, Figure S2: Structures and atoms numbering of studied compounds **3** and **4**, Figure S3: Calculated potential energy curves for the gradual nitro group rotation of conformers **3(I)** and **4(II)**, Figure S4: Calculated (B3LYP/6-311+G(d,p), PCM approach for acetonitrile (**a**) and gas phase (**b**)) potential energy functions by the gradual displacement of one proton for compounds **3** and **4** whereas the remaining parameters were optimized: in the intramolecular hydrogen bond of monomers, in the intermolecular hydrogen bond of dimers and in the intermolecular hydrogen bond of dimers for fixed adjacent bridged proton; Figure S5: Time evolution of the metric parameters of two symmetric hydrogen bridges. The CPMD gas phase simulations of the dimers of **3** and **4**. Donor-proton distance, proton-acceptor distance, donor-acceptor distance; Table S1: 1H NMR data for compound **3** in CDCl3 in the presence of *N*,*N*-diethylethanamine (Et3N), Table S2: 1H NMR data for compound **4** in CDCl3 in the presence of *N*,*N*-diethylethanamine (Et3N), Table S3: 1H NMR data for compound **4** in CDCl3 in the presence of *N*,*N*-dimethylpyridin-4-amine (DMAP), Table S4: Experimental IR, Raman, IINS and calculated DFT (B3LYP/6-311+G(d,p)) spectral data of compound **3** and its mono deuterated (OH→OD) derivative, Table S5: Experimental IR, Raman, IINS and calculated DFT (B3LYP/6-311+G(d,p)) spectral data of compound **4** and its mono deuterated (OH→OD) derivative.

**Author Contributions:** Conceptualization, A.F.; methodology, K.J., A.J., J.J.P., E.A.G., P.M.T., I.G.S. and A.F.; software, K.J., A.J., J.J.P., E.A.G., P.M.T., I.G.S. and A.F.; validation, K.J., A.J., J.J.P., E.A.G., P.M.T., I.G.S. and A.F.; formal analysis, K.J., A.J., J.J.P., E.A.G., P.M.T., I.G.S. and A.F.; investigation, K.J., A.J., J.J.P., E.A.G., P.M.T., I.G.S. and A.F.; resources, K.J., A.J., J.J.P., E.A.G., P.M.T., I.G.S. and A.F.; data curation, K.J., A.J., J.J.P., E.A.G., P.M.T., I.G.S. and A.F.; writing—original draft preparation, A.F.; writing—review and editing, A.F. and I.G.S.; visualization, K.J., A.J., J.J.P., E.A.G., P.M.T., I.G.S. and A.F.; supervision, A.F.; project administration, A.F.; funding acquisition, I.G.S. All authors have read and agreed to the published version of the manuscript.

**Funding:** This research was funded by the Russian Foundation for Basic Research (RFBR, grant no. 18-13-00050) and the Polish Government Plenipotentiary for the Joint Institute for Nuclear Research in Dubna (75/24/2020; p. 75; date 3 February 2020).

**Acknowledgments:** The authors acknowledge the Wrocław Centre for Networking and Supercomputing Centres (WCSS) for providing computational time and facilities.

**Conflicts of Interest:** The authors declare no conflict of interest.

#### **References**


**Sample Availability:** Samples of the compounds are not available from the authors.

**Publisher's Note:** MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

© 2020 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).

## *Article* **Mutual Relations between Substituent Effect, Hydrogen Bonding, and Aromaticity in Adenine-Uracil and Adenine-Adenine Base Pairs**†

#### **Paweł A. Wieczorkiewicz <sup>1</sup>, Halina Szatylowicz <sup>1,</sup>\* and Tadeusz M. Krygowski <sup>2</sup>**


Academic Editors: Ilya G. Shenderovich and Steve Scheiner Received: 4 July 2020; Accepted: 11 August 2020; Published: 13 August 2020

**Abstract:** The electronic structure of substituted molecules is governed, to a significant extent, by the substituent effect (SE). In this paper, SEs in selected nucleic acid base pairs (Watson-Crick, Hoogsteen, adenine-adenine) are analyzed, with special emphasis on their influence on intramolecular interactions, aromaticity, and base pair hydrogen bonding. Quantum chemistry methods, namely DFT calculations, the natural bond orbital (NBO) approach, the Harmonic Oscillator Model of Aromaticity (HOMA) index, the charge of the substituent active region (cSAR) model, and the quantum theory of atoms in molecules (QTAIM), are used to compare SEs acting on the adenine moiety and H-bonds from various substitution positions. Comparisons of classical SEs in adenine with those observed in para- and meta-substituted benzenes allow for a better interpretation of the obtained results. Hydrogen bond stability and its other characteristics (e.g., covalency) can be significantly changed as a result of the SE, and its consequences are dependent on the substitution position. These changes allow us to investigate specific relations between H-bond parameters, leading to conclusions concerning the nature of hydrogen bonding in adenine dimers; e.g., H-bonds formed by five-membered ring nitrogen acceptor atoms have an inferior, less pronounced covalent nature as compared to those formed by six-membered ring nitrogen. The energies of individual H-bonds (obtained by the NBO method) are analyzed and compared to those predicted by the Espinosa-Molins-Lecomte (EML) model. Moreover, both SE and H-bonds can significantly affect the aromaticity of adenine rings; long-distance SEs on π-electron delocalization are also documented.

**Keywords:** substituent effect; hydrogen bond; aromaticity; adenine

#### **1. Introduction**

Their fundamental biological importance makes DNA and RNA base pairs popular systems for quantum chemical calculations. With the development of quantum chemistry methods and computing power, the modeling of large systems has become easily accessible [1]. Therefore, since the 1990s many papers on this topic have been published, including articles on hydrogen bonding [2,3], π-stacking interactions [4,5], tautomerization [6], benchmarks of various DFT methods [7], and dispersion models [8], leading to a better understanding of nucleic acid structure and the mechanisms of mutations [9].

Various quantum chemical methods were used to investigate intermolecular interactions in Watson-Crick [10], non-canonical Hoogsteen [11], and adenine-uracil RNA base pairs, as well as in mismatched adenine-adenine base pairs [4,5]. The substituent effects (SEs) on hydrogen bonding were most frequently studied in structurally modified Watson-Crick base pairs (adenine-thymine and guanine-cytosine) [12–18]. Recently, the influence of substituents on the electronic structure of the four most stable purine tautomers and their adenine analogues [19], as well as on the stability of adenine quartets [20] has been presented.

Adenine and uracil belong to the five bases constituting DNA or RNA macromolecules that are fundamental to life processes [21]. Undoubtedly, the interactions between all of them, as well as the different types of external influences on their electronic structure, are of great importance for understanding their role in these processes. The influences of electrophilic or nucleophilic agents belong to this type of interaction; they can cause significant changes in the electronic structure of the bases of DNA or RNA macromolecules. However, the interactions caused by cations or anions are temporary and difficult to study systematically. Their contact with the molecules in question, even if very short, is about 100 times longer than the time during which the electronic structure of the molecule is perturbed and open to some "non-typical" reactions with another reagent. Such situations can cause mutations in the attacked molecule and change its function as an inherent part of DNA or RNA macromolecules. To investigate this problem, instead of free charged reagents, electrophilic or nucleophilic groups (substituents) can be attached to a molecular moiety to study their permanent effect on the electronic structure of the substituted molecule. Then, some analogies to a more complex situation can be subject to deeper consideration. The latter type of treatment is presented in this paper, combined with studies of the mutual interactions of the participants in adenine-uracil and adenine-adenine base pairs. In other words, the effect of intramolecular interactions (substituent effect) on individual intermolecular interactions (hydrogen bonds) in substituted base pairs is the subject of this paper. The studied adenine-uracil and adenine-adenine base pairs are shown in Figures 1 and 2 (the separate structures of each dimer are shown in Figures S1–S4 in Supplementary Materials). The naming of the adenine dimers was adopted from the paper of Poater et al. [4] to allow the comparison of the results.

**Figure 1.** Adenine-uracil (**a**) Watson-Crick (abbreviated as WC) and (**b**) Hoogsteen (abbreviated as HG) base pairs with the common numbering of adenine atoms and adopted hydrogen bond numeration. Substitutable positions are marked in red color.

Several methods can be used to estimate the strength of individual hydrogen bonds in base pairs of nucleic acids: the rotational method [22], the compliance constants method [23], the application [25] of the Espinosa-Molins-Lecomte (EML) equation [24], the atom replacement method [26], the estimation of hydrogen bond energy based on the electron density at the bond critical point (BCP) [29], calculated using the quantum theory of atoms in molecules (QTAIM) [27,28], the application [31] of the natural bond orbitals (NBO) method [30], the coordinates interaction approach [32], and the delocalization index [33].

**Figure 2.** Adenine-adenine (**a**) AA2, (**b**) AA3, and (**c**) AA4 base pairs with the common numbering of adenine atoms and adopted hydrogen bond numeration (for AA2 C8-X dimer, numeration in brackets). Substitutable positions are marked in red color.

The best-known substituent descriptors are the Hammett constants [34,35]. However, they can only be used to describe the classical substituent effect, that is, how a substituent X affects the properties of a fixed group Y (the so-called "reaction site") in a substituted system X-R-Y (where R is the transmitting moiety). The use of the cSAR (charge of the substituent active region) descriptor [36–38] allows us to study both classical and reverse substituent effects [39]; the latter describes how the electronic properties of substituents X depend on the properties of the moiety R-Y to which they are attached.

In the presented research, the most stable adenine tautomer, 9H, was chosen as a base for the further modification of the molecule. This tautomer has three substitutable hydrogen atoms at the C2, C8, and N9 positions. For each adenine-uracil and adenine-adenine base pair, substituent positions for further analysis were selected to avoid direct intermolecular interactions of a substituent, such as steric interactions of bulky substituents or the formation of a new hydrogen bond. For comparison, the effect of substituents at the C2, C8, or N9 positions in adenine on its physicochemical properties was also considered in detail. Selected substituents that differ in electronic properties (X = NO, NO2, Cl, F, H, Me, OH, NH2) were introduced into the adenine molecule in monomers and dimers.

The strength of the individual hydrogen bonds was characterized using the NBO approach [30,31], the topological parameters within the QTAIM approach [40], the delocalization index [33], and the H-bond lengths.

This work is mainly devoted to the influence of substituents on hydrogen bonding as well as on changes in the electronic structure of adenine, uracil, and their dimers (shown in Figures 1 and 2). The following issues are considered in greater detail:


• How do these characteristics differ from those estimated for monomers?

#### **2. Methods**

For all studied systems, geometry optimizations without any symmetry constraints and electronic energy calculations were performed using the Gaussian 16 program [41]. Based on our previous research [42], the DFT-D method was used—namely, B97-D3 dispersion corrected density functional [43,44]—with Dunning's [45] aug-cc-pVDZ basis set. The harmonic vibrational frequencies were calculated at the same level of theory to confirm that all the obtained structures correspond to the minima on the potential energy surface. No imaginary frequencies were found for the obtained series. In the case of asymmetric substituents (NO and OH), a lower energy rotamer was considered. NBO 6.0 software (Theoretical Chemistry Institute, University of Wisconsin, Madison, WI, USA) [46] was used for the NBO calculations.

The substituents were characterized using the *c*SAR descriptor. It allows a quantitative comparison of the electron-donating/withdrawing effects of different functional groups and correlates with many physicochemical properties [39]. For the X substituent, cSAR is defined as follows (Equation (1)):

$$c\text{SAR}(\text{X}) = Q_{\text{X}} + Q_{\text{I}} \tag{1}$$

where *Q*<sub>X</sub> is the sum of the partial charges of the X group atoms, and *Q*<sub>I</sub> is the partial charge of the atom to which the substituent is attached (ipso atom).
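The definition above amounts to a simple sum over atomic charges. A minimal sketch is given below; the function name, the atom labels, and the charge values are hypothetical and serve only to show the bookkeeping, not results for any real system.

```python
def csar(charges, substituent_atoms, ipso_atom):
    """cSAR(X): sum of the charges of the substituent atoms plus the ipso-atom charge.

    charges: mapping from atom label to partial charge (e.g., Hirshfeld charges).
    substituent_atoms: labels of the atoms belonging to the substituent X.
    ipso_atom: label of the atom to which X is attached.
    """
    q_x = sum(charges[atom] for atom in substituent_atoms)
    return q_x + charges[ipso_atom]


# Hypothetical partial charges for a nitro group attached to a ring carbon C8
example_charges = {"C8": 0.10, "N": 0.20, "O1": -0.20, "O2": -0.20}
print(f"cSAR(NO2) = {csar(example_charges, ['N', 'O1', 'O2'], 'C8'):+.2f}")
```

The same function can be applied to a reaction site such as the NH2 group by passing the amino atoms as the substituent and C6 as the ipso atom.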

Partial charges calculated using the Hirshfeld [47], Weinhold (NBO) [30], and Voronoi Deformation Density (VDD) [48] methods were compared to select the charge assessment scheme for further investigation. Their values for derivatives of the WC base pair substituted in positions C8-X and N9-X were mutually correlated; the relations between *c*SAR(X) and *c*SAR(X)<sup>Hir</sup> are shown in Figure S5 (Supplementary Materials). Considering both substitution positions (C8 and N9), it can be concluded that, qualitatively, only the VDD and Hirshfeld approaches are nearly equivalent for the estimation of *c*SAR(X) values. To allow comparison with the results of our previous research [19], only the Hirshfeld charges are used in the further discussion, and the superscript Hir is omitted.

The interaction energies between monomers A and B were obtained by the supermolecular method (Equations (2)–(4)) [49], using the counterpoise approach [50]:

$$E_{\rm SM} = E_{\rm AB} - (E_{\rm A} + E_{\rm B}) + E_{\rm BSSE} = E_{\rm AB}^{\rm int} + E_{\rm AB}^{\rm def}, \tag{2}$$

$$E_{\rm AB}^{\rm int} = E_{\rm AB} - \left( E_{\rm A}^{\rm dim} + E_{\rm B}^{\rm dim} \right), \tag{3}$$

$$E_{\rm AB}^{\rm def} = E_{\rm A}^{\rm dim} - E_{\rm A} + E_{\rm B}^{\rm dim} - E_{\rm B}, \tag{4}$$

where *E*<sub>AB</sub> is the electronic energy of a dimer, whereas *E*<sub>A</sub> and *E*<sub>B</sub> are the electronic energies of monomers A and B, and *E*<sub>BSSE</sub> is the basis set superposition error (BSSE) energy correction. *E*<sub>AB</sub><sup>int</sup> and *E*<sub>AB</sub><sup>def</sup> are the "pure" interaction and deformation energies in the AB dimer, respectively, while *E*<sub>A</sub><sup>dim</sup> and *E*<sub>B</sub><sup>dim</sup> are the energies of monomers A and B in the dimer geometry, respectively. For the studied systems, the *E*<sub>BSSE</sub> values were 0.97–1.12 kcal/mol.
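The energy bookkeeping of the supermolecular approach can be sketched as follows. The function names and all energy values are hypothetical (chosen as round numbers in kcal/mol), and the snippet assumes that the counterpoise correction is simply added to the raw energy difference; how the BSSE term is apportioned between the interaction and deformation parts is not specified here.

```python
def interaction_energy(e_ab, e_a_dim, e_b_dim):
    """Interaction energy: dimer energy minus the monomers in the dimer geometry."""
    return e_ab - (e_a_dim + e_b_dim)


def deformation_energy(e_a_dim, e_a, e_b_dim, e_b):
    """Energy needed to distort the relaxed monomers into their dimer geometries."""
    return (e_a_dim - e_a) + (e_b_dim - e_b)


def supermolecular_energy(e_ab, e_a, e_b, e_bsse):
    """Dimer energy relative to the relaxed monomers, with the BSSE correction added."""
    return e_ab - (e_a + e_b) + e_bsse


# Hypothetical energies in kcal/mol (relative values, for illustration only)
e_ab, e_a, e_b = -100.0, -40.0, -45.0
e_a_dim, e_b_dim = -39.0, -44.5
e_bsse = 1.0

e_int = interaction_energy(e_ab, e_a_dim, e_b_dim)
e_def = deformation_energy(e_a_dim, e_a, e_b_dim, e_b)
e_sm = supermolecular_energy(e_ab, e_a, e_b, e_bsse)
print(f"E_int = {e_int:.1f}, E_def = {e_def:.1f}, E_SM = {e_sm:.1f} kcal/mol")
```

With these numbers the uncorrected sum of the interaction and deformation terms differs from the supermolecular energy exactly by the BSSE correction.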

The energy of individual hydrogen bonds was calculated according to the NBO theory [30] as:

$$E_{\rm HB} = E_{\rm n \to \sigma^*} - E_{\rm n \to \sigma}, \tag{5}$$

where *E*<sub>n→σ\*</sub> is the interaction energy between the nonbonding NBO orbital n (lone pair) of an H-bond acceptor atom and an antibonding orbital σ\* of an H-D bond (where D is a hydrogen bond donor atom), calculated by the NBO 6.0 program from the second-order perturbative analysis of the Fock matrix in the NBO basis. *E*<sub>n→σ</sub> is the steric exchange energy between the acceptor's nonbonding Natural Localized Molecular Orbital (NLMO) n and the H-D bonding NLMO σ. The natural steric analysis is accessible in NBO 4.0 and later versions via the STERIC keyword.

The strength of the individual hydrogen bonds was also characterized using the QTAIM topological parameters and the delocalization index. The QTAIM calculations were performed with the AIMAll [51] software.

The delocalization index (δ) is a descriptor capable of characterizing both closed-shell and shared-shell interactions [33]. Its value is a measure of the number of electrons delocalized between atoms. δ(A, B) between atoms A and B is calculated within the QTAIM theory, which defines atomic basins in a molecule. Having defined atomic basins, it is possible to calculate δ(A, B) as:

$$\delta(\text{A}, \text{B}) = 4 \sum_{i,j}^{N/2} S_{ij}(\text{A}) S_{ij}(\text{B}), \tag{6}$$

where *S*<sub>ij</sub>(A) and *S*<sub>ij</sub>(B) are the overlaps between orbitals i and j in the atomic basins of A and B, respectively.
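Given the atomic overlap matrices, the double sum in the expression above is straightforward to evaluate. The sketch below assumes a closed-shell system with *N*/2 doubly occupied orbitals and takes the overlap matrices as plain nested lists; the function name and the numerical values are hypothetical. In practice, these matrices come from a QTAIM integration over the atomic basins.

```python
def delocalization_index(s_a, s_b):
    """delta(A, B) = 4 * sum_ij S_ij(A) * S_ij(B) over doubly occupied orbitals.

    s_a, s_b: square matrices (nested lists) of orbital overlaps integrated
    over the atomic basins of A and B, respectively.
    """
    n_orbitals = len(s_a)
    total = 0.0
    for i in range(n_orbitals):
        for j in range(n_orbitals):
            total += s_a[i][j] * s_b[i][j]
    return 4.0 * total


# Idealized example: one doubly occupied orbital shared equally by two basins
print(delocalization_index([[0.5]], [[0.5]]))

# Hypothetical two-orbital example with a weakly shared second orbital
s_a = [[0.50, 0.05], [0.05, 0.02]]
s_b = [[0.50, -0.05], [-0.05, 0.90]]
print(f"{delocalization_index(s_a, s_b):.3f}")
```

In the idealized one-orbital case, one electron pair is shared equally between the two basins and the index equals one.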

Harmonic Oscillator Model of Aromaticity (HOMA) [52] was chosen as an aromaticity descriptor. It is a geometry-based descriptor dependent on the bond lengths of the studied system in comparison to a hypothetical, fully aromatic reference system. HOMA is defined as:

$$\text{HOMA} = 1 - \frac{1}{n} \sum_{i=1}^{n} \alpha_{\text{j}} \left(d_{\text{opt},\text{j}} - d_{\text{j},\text{i}}\right)^{2}, \tag{7}$$

where *n* is the number of bonds taken into account when carrying out the summation, j denotes the type of bond (e.g., CC or CN), α<sub>j</sub> is an empirical normalization constant, *d*<sub>opt,j</sub> is the optimal length of a given bond assumed to be realized for fully aromatic systems, and *d*<sub>j,i</sub> is an actual bond length in the studied system. The values of HOMA were calculated using the Multiwfn [53] program, with HOMA constants (α<sub>j</sub> and optimal bond lengths) taken from Krygowski's paper [54].
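The HOMA expression can be illustrated with a few lines of code. The parameters used below (α = 257.7 Å<sup>−2</sup> and *d*<sub>opt</sub> = 1.388 Å for CC bonds; α = 93.52 Å<sup>−2</sup> and *d*<sub>opt</sub> = 1.334 Å for CN bonds) are commonly quoted HOMA constants and are an assumption here; the values actually applied in this work are those of Ref. [54] as implemented in Multiwfn. The function name and the bond lengths in the example are hypothetical.

```python
# Assumed HOMA parameters: bond type -> (alpha in 1/angstrom^2, d_opt in angstrom)
HOMA_PARAMS = {"CC": (257.7, 1.388), "CN": (93.52, 1.334)}


def homa(bonds, params=HOMA_PARAMS):
    """HOMA index for a ring given as a list of (bond_type, length_in_angstrom)."""
    n_bonds = len(bonds)
    penalty = sum(params[bond_type][0] * (params[bond_type][1] - length) ** 2
                  for bond_type, length in bonds)
    return 1.0 - penalty / n_bonds


# Idealized six-membered carbon ring with all bonds at the optimal length
ideal_ring = [("CC", 1.388)] * 6
print(homa(ideal_ring))

# Hypothetical ring with alternating short and long CC bonds
alternating_ring = [("CC", 1.34), ("CC", 1.46)] * 3
print(f"{homa(alternating_ring):.3f}")
```

A ring with all bonds at their optimal lengths gives HOMA = 1, while bond length alternation lowers the index.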

#### **3. Results and Discussion**

The discussion of the results is divided into five parts. The first four concern various aspects of the substituent effect. The last part presents the interrelationships between various characteristics describing the strength of individual hydrogen bonds. The obtained values of the substituent effect descriptors (*c*SAR, HOMA) and hydrogen bond strength parameters (*E*<sub>HB</sub>, *d*<sub>HB</sub>, ρ<sub>BCP</sub>, δ(H,A), ∇<sup>2</sup>ρ<sub>BCP</sub>, *E*<sub>def</sub>, *E*<sub>SM</sub>) for the substituted WC and HG base pairs as well as the adenine dimers are presented in Tables S1–S7 (Supplementary Materials).

#### *3.1. Classical Substituent Effect-Intramolecular Interactions*

Adenine contains an amino group at the C6 position, which in base pairs is involved in hydrogen bonding, either through an NH···O or through an N···HN interaction, as shown in Figures 1 and 2. In the studied systems, the substituent also influences the electronic properties of the amino group. The *c*SAR parameter shows how much these interactions affect its electronic structure. Figure 3 illustrates the ranges of the *c*SAR(NH2) values (their exact values are listed in Table S1 (Supplementary Materials)). In all cases, the range of the *c*SAR(NH2) values due to the substituent effect is greater for the adenine monomer than for its dimers, in which the -NH2 group is involved in H-bonding. For adenine in the AA and HG/WC pairs substituted at the C2, C8, or N9 positions, the range is reduced, on average, by approximately 76%, 80%, and 56%, respectively, relative to the monomer. This shows that H-bonding makes the NH2 group in dimers less susceptible to the substituent effect.

The use of the classical interpretation of the substituent effect allows us to consider how the substituent can affect the properties of the reaction site; in our case, it is the adenine NH2 group. For this purpose, a Hammett-type linear equation is used, in which the slope (*a*, also known as the reaction constant) describes the sensitivity of the reaction site to the influence of X substituents. Thus, the electronic structure of the NH2 group, involved in H-bonding, is described by the dependences of *c*SAR(NH2) on *c*SAR(X), as presented in Table 1 and Figure S6 (Supplementary Materials). In all cases, the regressions have good or at least acceptable determination coefficients (*R*<sup>2</sup> > 0.81).
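A Hammett-type relation of this kind is an ordinary least-squares line, whose slope plays the role of the reaction constant. The sketch below shows how the slope and the determination coefficient can be obtained from paired *c*SAR values; the function name and the data points are hypothetical and do not reproduce the values of Table 1.

```python
def linear_fit(x, y):
    """Least-squares line y = slope * x + intercept and its R^2."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    s_xx = sum((xi - mean_x) ** 2 for xi in x)
    s_xy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = s_xy / s_xx
    intercept = mean_y - slope * mean_x
    ss_res = sum((yi - (slope * xi + intercept)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    r_squared = 1.0 - ss_res / ss_tot
    return slope, intercept, r_squared


# Hypothetical cSAR(X) and cSAR(NH2) values for a series of substituents
csar_x = [-0.20, -0.12, -0.05, 0.00, 0.06, 0.11]
csar_nh2 = [0.150, 0.138, 0.121, 0.118, 0.104, 0.097]

slope, intercept, r2 = linear_fit(csar_x, csar_nh2)
print(f"slope a = {slope:.3f}, intercept = {intercept:.3f}, R^2 = {r2:.3f}")
```

The sign and magnitude of the slope indicate how strongly, and in which direction, the reaction site responds to a change in the electronic properties of the substituent.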

**Figure 3.** Ranges of cSAR(NH2) variability and their averaged values in substituted adenine monomers and dimers (WC, HG, and AA).

**Table 1.** Slopes of the obtained linear dependences *c*SAR(NH2) vs. *c*SAR(X) and determination coefficients (*R*<sup>2</sup>) for the substituted WC and HG base pairs, adenine dimers, and substituted monomers (marked in **bold**, data taken from Ref. [19]) (A-C2-X, A-C8-X, A-N9-X). For the asymmetrically substituted AA2 C2-X, C8-X dimer, the *c*SAR values of the underlined monomers are taken.


Looking at the data in Table 1, some observations can be made:


3. In all cases of adenine dimers, the effect of substitution at the C2 or C8 position is greater than that at N9 by an average ratio of 1.88.

In the first case, the number of bonds, *n*, between the substituted C atom and another one substituted by the NH2 group involved in H-bonding is equal to three for the WC pair, whereas for the HG pair it is equal to two. According to the documented rule, substituents affect "the reaction site" more strongly from the para position (the number of bonds between functional groups *n* = 3) than from the meta one (*n* = 2) [55,56]; our case follows this rule. For the next point (2), *n* = 3 for both cases, and hence there are almost identical substituent effects from N9 on NH2. Finally (3), the comparisons of the effect of substituents attached to carbon and nitrogen atoms reveal that the influence of substituent at nitrogen is significantly weaker. A possible interpretation is that the lone electron pair at the nitrogen atom can be directly involved in the interaction with the substituent (N-X), thus resulting in its weaker strength of interaction at longer distances.

#### *3.2. Classical Substituent Effects-Intermolecular Interactions*

The properties of the adenine amino group can also be characterized by the strength of the hydrogen bond in which it is involved. Besides this, substituents may also affect other hydrogen bonds in the system. The following descriptors were used to describe the strength of an individual hydrogen bond: its length (*d*HB); energy (*E*HB); electron density at the H-bond critical point (ρBCP); and delocalization index, δ(H,A). Their calculated values are collected in Table S7 (Supplementary Materials).

For the WC and HG base pairs, there are three types of H-bonds (see Figure 1 and Figure S1 (Supplementary Materials)):


In the case of HB3, despite the presence of a bond critical point satisfying the Koch-Popelier [57] criteria for hydrogen bonding, an NBO analysis showed negligible interactions. For this reason, only HB1 and HB2 will be discussed.

The question to be considered, therefore, is how the substituents attached to C8 (WC pair), C2 (HG pair), and N9 in both pairs affect the stability of the individual H-bonds. As above, this can be expressed by a linear correlation of a hydrogen bond strength descriptor with *c*SAR(X); its slope describes the sensitivity of the H-bond to the electron-accepting/donating properties of the substituent expressed by *c*SAR(X). Figure 4 shows the dependence of the HB1 and HB2 hydrogen bond energies on *c*SAR(X) for the substituted WC and HG pairs, while the slopes and determination coefficients for all the H-bond descriptors used are summarized in Table 2. A first look at its contents reveals that the slopes for the HB1 and HB2 bonds differ in sign. Moreover, the regressions generally have good determination coefficients (*R*<sup>2</sup> > 0.92); they are significantly worse only where the variability of the hydrogen bond descriptor is small. In the further discussion, we consider the dependences of the HB1 and HB2 energies (*E*HB) on the electron-attracting/donating properties of substituents attached to the C8, C2 or N9 atoms in the WC, HG and AA base pairs. The other relationships presented in Table 2 (*d*HB, ρBCP, and δ(H,A) against *c*SAR(X)) lead to similar conclusions.

**Figure 4.** Dependence of the HB1 and HB2 energies on cSAR(X) for the WC and HG base pairs substituted at (**a**) C8 and C2 and (**b**) N9 positions of the adenine moiety.
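The slopes and determination coefficients discussed here come from fitting a straight line to an H-bond descriptor as a function of *c*SAR(X). A minimal Python sketch of such a fit is given below, assuming ordinary least squares; the function name `linear_fit` and all numerical values are illustrative placeholders and are not data from Table 2 or Table S7.

```python
from statistics import mean

def linear_fit(x, y):
    """Ordinary least-squares fit y = a*x + b; returns (a, b, r2)."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    syy = sum((yi - my) ** 2 for yi in y)
    a = sxy / sxx                    # slope: sensitivity to cSAR(X)
    b = my - a * mx                  # intercept
    r2 = (sxy * sxy) / (sxx * syy)   # for a one-variable fit, R^2 = r^2
    return a, b, r2

# Hypothetical cSAR(X) values and H-bond energies (kcal/mol), illustration only.
csar_x = [-0.15, -0.10, -0.05, 0.00, 0.05, 0.10]
e_hb = [-7.0 - 8.0 * c for c in csar_x]  # constructed to be exactly linear

slope, intercept, r2 = linear_fit(csar_x, e_hb)
print(f"slope = {slope:.2f}, intercept = {intercept:.2f}, R2 = {r2:.3f}")
```

With signed (negative) H-bond energies, as assumed in this sketch, a negative slope means that the bond becomes more stable as *c*SAR(X) increases, which is the behaviour described for HB2.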


**Table 2.** Slopes of the obtained linear relations between the H-bond descriptors (y) and *c*SAR(X), and the determination coefficients (*R*<sup>2</sup>) for the substituted WC and HG base pairs.

For HB2 in the HG and WC pairs, the slopes are −8.74 and −7.49, respectively. This indicates an increase in the hydrogen bond stability with an increase in the electron-donating power of substituents, i.e., with the *c*SAR(X) values. Thus, in addition to the substituent effect on the amino group, the substituent X also affects the proton-accepting abilities of the N atoms of both adenine rings. In these cases, the increase in the *c*SAR(X) values causes the proton-accepting nitrogen atoms (N1 in the WC pair and N7 in the HG pair) to increase their negative charge. This, in turn, increases the attraction towards the hydrogen-donating NH group of uracil. The slightly higher slope for HB2 in the HG than in the WC base pairs may be explained by a more efficient transmission of the substituent effect through π-electron delocalization. In the case of the HG base pair, the substituent is attached to C2, which belongs to a fully aromatic ring that may be represented by two equivalent canonical structures. In the case of the WC pair, only one (unexcited) canonical structure is possible, so the ring bearing the substituent has a lower aromatic character and transmits the substituent effect less efficiently.

The HB1 H-bonds represent a different situation. In both cases, an increase in the *c*SAR(X) values of substituents (i.e., their electron-donating ability) results in a decrease in the HB1 stability, expressed by the slopes, which are equal to 3.01 and 6.82 for the HG and WC pairs, respectively. This difference is consistent with the observation of the substituent effect X on the properties of the NH2 group (Figure S6a (Supplementary Materials)), which is involved in the HB1 hydrogen bond. The electron-donating properties of the amino group decrease with the increasing electron-donating ability (more positive *c*SAR(X) value) of substituent X. Therefore, substitution at the carbon atom (C8 or C2) affects the proton-donating ability of the amino group, decreasing the *E*HB for the electron-donating X groups. As mentioned above, the difference in the slopes may be explained by a well-known rule: the substituent effect acting via three bonds (e.g., para-like interactions in benzene derivatives) is more effective than when it operates via two bonds (meta-like interactions). This is usually interpreted as the resonance effect acting more strongly through three than through two bonds [55].

The dependence of the HB1 and HB2 stabilities in the WC and HG base pairs on substitution at N9 is shown in Figure 4b. Despite a similar influence of the X substituent on the properties of the NH2 group (Figure S6a (Supplementary Materials)), its effect on the two H-bonds is significantly different. Furthermore, the trends in the effect of substituents on the stability of the hydrogen bonds are similar to those observed for carbon-substituted AU pairs. The only differences are the weaker substituent effect on the HB1 bond and the stronger effect on the HB2 bond compared to those observed in C-X systems. For the HB1 H-bond, the obtained slopes are 4.93 and 1.34 for the WC and HG pair, respectively (6.82 and 2.71 for the C-X series). In the case of HB2, the resulting slopes are −9.66 and −14.35, respectively, while they are −7.49 and −8.74 for C-X systems. These results reveal the important role of the lone electron pair at the nitrogen (N9) atom in hydrogen bonding for AU base pairs. The substantial HB2 enhancement by the substituent effect in the HG pair, compared to the WC pair, is due to the fact that the nitrogen atom of the five-membered ring is the proton acceptor of this H-bond. This enhancement is associated with an increase in the charge of this adenine nitrogen atom interacting with the NH group of uracil. This, in turn, is due to the increasing electron-donating power of the substituent, i.e., an increase in the *c*SAR(X) values.

The results of studies on the effect of substituents attached to C8, C2 or N9 in one adenine on the stability of individual H-bonds in adenine-adenine pairs are presented in Table 3 and Figure S7 (Supplementary Materials). They reveal a similar relationship between the position of the substituent and the stability of H-bonding for adenine dimers, as observed for the WC and HG pairs. The increase in the electron-donating power of substituents causes a decrease in the HB1 stability while strengthening the second H-bond, HB2. This is documented by the positive and negative slope values (Table 3), respectively. However, for AU pairs (both WC and HG), the greater sensitivity of the substituent effect (absolute slope values, Table 2) on the H-bond stability has always been observed for HB2 compared to HB1. This does not apply to mono-substituted adenine dimers. For C8-substituted systems (AA2 and AA4), HB1 is more sensitive than HB2, while the opposite is true for C2-substituted derivatives (AA2 and AA3). This is consistent with the results shown above, where the substituents at the C8 position affect the electronic properties of the adenine amino group more than at the C2 and N9 positions (Table 1). Therefore, they cause greater changes in HB1 stability, which can be expressed by the ratio of slopes *a* in Table 3. Considering the hydrogen bonding energy, the C8/C2 ratio in AA2 pairs is 1.68, and for C8/N9 in AA4 pairs it is 1.44.

In the case of N9-substituted systems, much larger changes in HB2 strength than in HB1 were noted for the AA2 and AA3 dimers, in which the nitrogen atom of the five-membered ring (N7) is the proton acceptor of this H-bond. For AA4 N9-substituted derivatives, HB1 is slightly more sensitive to the effect of substituents than HB2; the slopes are 5.128 and −4.715, respectively (Table 3). For C8 analogs, they are 7.362 and −3.998, respectively. Interestingly, these systems are characterized by the smallest effect of substituents on the electronic structure of the amino group (Table 1) and the strongest intermolecular interactions between adenine-adenine base pairs. In this case, the amino group (proton donor) and the N1 nitrogen atom, which is a proton acceptor and adjacent to the C6 carbon atom, are involved in hydrogen bonds (Figure 2c). A comparison of the structure of the N1-C6-NH2 part in the monomer and dimer shows a significant increase in resonance effects in this fragment. Additionally, in AA4 pairs and WC pairs, the H-bonds together with the six-membered pyrimidine rings form an anthracene-like geometry, which results in the additional stabilization of these systems. This is confirmed by the analysis of the aromaticity of the quasi-aromatic ring formed by the H-bonds [58], which can be described using the QTAIM theory by the total electron energy density at the ring critical point (*H*RCP). The obtained differences between the average *H*RCP values (in a.u.) in the base pairs are: ΔH(AA4 − AA2) = 0.00034, ΔH(AA4 − AA3) = 0.00061, and ΔH(WC − HG) = 0.00038; the *H*RCP values for the AA4 and WC dimers are 0.00115 and 0.00135, respectively. This is consistent with the fact that the quasi-aromatic hydrogen bonding systems in AA4 and WC pairs show the strongest H-bonds.
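The slope ratios used above to compare sensitivities can be reproduced from the AA4 slopes quoted in the text. The short Python check below uses only those four numbers; the helper name `sensitivity_ratio` is introduced here for illustration.

```python
def sensitivity_ratio(slope_a, slope_b):
    """Ratio of absolute regression slopes, used to compare sensitivities."""
    return abs(slope_a) / abs(slope_b)

# E_HB vs cSAR(X) slopes for AA4 dimers, as quoted in the text (Table 3).
hb1_c8, hb2_c8 = 7.362, -3.998   # C8-X series
hb1_n9, hb2_n9 = 5.128, -4.715   # N9-X series

c8_over_n9_hb1 = sensitivity_ratio(hb1_c8, hb1_n9)
hb1_over_hb2_c8 = sensitivity_ratio(hb1_c8, hb2_c8)
hb1_over_hb2_n9 = sensitivity_ratio(hb1_n9, hb2_n9)

print(f"C8/N9 (HB1):    {c8_over_n9_hb1:.2f}")
print(f"HB1/HB2 (C8-X): {hb1_over_hb2_c8:.2f}")
print(f"HB1/HB2 (N9-X): {hb1_over_hb2_n9:.2f}")
```

The first ratio reproduces the C8/N9 value of 1.44 given for AA4 pairs. The other two are both greater than one, in line with HB1 being the more sensitive bond in AA4 and only slightly so for N9 substitution.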


**Table 3.** Slopes of the obtained linear relations between the H-bond descriptors (y) and *c*SAR(X), and the determination coefficients (*R*<sup>2</sup>) for the substituted adenine dimers.

Three types of doubly substituted adenine-adenine pairs were considered, in which the same substituent is attached to both adenines. Two of them are symmetric (AA3 C2-X, C2-X, and AA4 C8-X, C8-X) and one is asymmetric (AA2 C2-X, C8-X). The relationships between the energy of individual H-bonds and *c*SAR(X) are shown in Figure S8. They document the more complex nature of the mutual interactions between the substituents affecting the H-bonding in question, which is manifested in the slopes of the linear equations and the determination coefficients (*R*<sup>2</sup>). However, it is again clearly shown that the stronger interactions arise from substitution at the C8 position.

#### *3.3. Substituent Effect on the Transmitting Moiety*

The adenine molecule consists of five-membered (AD5) and six-membered (AD6) heterocyclic rings. These rings act as transmitters of the substituent effect between X and the "reaction center", the amino group. In the studied most stable 9H tautomer, both rings have six π electrons, so they satisfy Hückel's 4n + 2 rule [59]. In the AD6 ring, each atom provides one π electron to the delocalized structure, while in AD5, four atoms provide one π electron each, and the N9 atom provides two π electrons for the overall delocalization. Hence, the substituent effect on the π-electron delocalization may not be equivalent. The influence of the substituent on the electronic structure of adenine rings can be expressed using aromaticity indices. For this purpose, the HOMA index was used, which can characterize local aromaticity. The calculated HOMA index values of both adenine rings are summarized in Tables S1–S3 (Supplementary Materials), while Table S6 (Supplementary Materials) contains the HOMA values of uracil from the WC and HG pairs. The effect of substituents on HOMA for six- and five-membered adenine rings was approximated by slope values ΔHOMA/Δ*c*SAR(X), where ΔHOMA = HOMA(NH2) − HOMA(NO2), and Δ*c*SAR(X) = *c*SAR(NH2) − *c*SAR(NO2). The obtained results for the studied base pairs and monomers are shown in Figure 5.

**Figure 5.** Influence of the substituents on Harmonic Oscillator Model of Aromaticity (HOMA) values for (**a**) five-membered ring and (**b**) six-membered ring in the studied base pairs, approximated by the slope ΔHOMA/ΔcSAR(X) values, where ΔHOMA = HOMA(NH2) − HOMA(NO2) and Δ*c*SAR(X) = *c*SAR(NH2) − *c*SAR(NO2).
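The two-point slope ΔHOMA/Δ*c*SAR(X) defined above can be sketched in Python as follows. The function name `two_point_slope` and the end-point values are hypothetical and serve only to illustrate the arithmetic; they are not taken from Tables S1–S3.

```python
def two_point_slope(homa_nh2, homa_no2, csar_nh2, csar_no2):
    """Approximate dHOMA/dcSAR(X) from the X = NH2 and X = NO2 end points."""
    d_csar = csar_nh2 - csar_no2
    if d_csar == 0:
        raise ValueError("cSAR(NH2) and cSAR(NO2) must differ")
    return (homa_nh2 - homa_no2) / d_csar

# Hypothetical end-point values for one ring in one series (illustration only).
slope = two_point_slope(homa_nh2=0.95, homa_no2=0.90,
                        csar_nh2=0.10, csar_no2=-0.15)
print(f"dHOMA/dcSAR(X) = {slope:.2f}")
```

A positive value means that HOMA increases with the electron-donating ability of the substituent, while a negative value means the opposite trend. Because only the two extreme substituents are used, the quantity is an approximation and does not capture non-monotonic behaviour within a series.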

The differences in the HOMA values between the substituted and unsubstituted adenine molecules (Tables S8 and S9 (Supplementary Materials)) show that the substitution at C8 affects the electronic structure of the AD6 ring more than the C2 and N9 substitution. Besides, for the C8 substitution, as the electron-donating effect of the substituent described by *c*SAR(X) increases, the HOMA value of the AD6 ring also increases. Moreover, the intermolecular interactions in pairs slightly enhance the effect of substitution from the C8 position on the aromaticity of the six-membered ring compared to the monomer. In the case of C2 and N9 substitution, HOMA does not change monotonically with *c*SAR(X).

The aromaticity of the AD5 ring of adenine has the highest sensitivity to N9 substitution, and the HOMA values increase with the electron-donating ability of the substituent. This can be explained by an interaction between the substituent at the N9 position and the lone electron pair of the N9 atom, which is important for the delocalization in the AD5 ring. An electron-accepting group, such as NO2, may withdraw electrons from the N9 lone electron pair and disturb electron delocalization within the AD5 ring. The substitution at positions C8 and C2 causes smaller changes in the AD5 HOMA value, although in these cases the effect is opposite—the HOMA value decreases with the increasing electron-donating power of the substituent, i.e., *c*SAR(X) value.

Furthermore, in WC and HG pairs, the substituents in adenine also affect the aromaticity of uracil (Table S6 (Supplementary Materials)). Again, the strongest effect is observed for substitution at C8, followed by N9, and the smallest for substitution at C2. These results therefore document long-distance substituent effects.

#### *3.4. Reverse Substituent Effect*

The reverse substituent effect quantitatively describes how the substituted moiety affects the electronic properties of the substituent X in a system such as R-X or, more generally, X-R-Y. Schematically, it may be presented as in Figure 6.


**Figure 6.** Reverse substituent effect in the X-R-Y system.

It should be emphasized that the reverse substituent effect has been known from the beginning, hence there are different substituent constants σ for the para and meta positions [34], similarly to substituent constants σ<sup>+</sup> and σ<sup>−</sup> for electron-accepting and -donating reaction sites, respectively [60]. The use of *c*SAR(X) allows a quantitative description of the electron-donating or accepting properties of the substituent in any system. This gives additional insight into understanding the substituent effect dependent on various substituted moieties.

In our case, we consider how the amino group, variously involved in H-bonding, affects the properties and electronic structure of the substituents, expressed by the *c*SAR(X) values. Figure 7 presents the ranges of *c*SAR(X) values as a function of the position and type of moiety to which the substituents are attached. All the data for dimers are compared with the values for adenine monomers.

**Figure 7.** Ranges of *c*SAR(X) variability and their averaged values for substituents attached to C2, C8, and N9 in adenine monomer and its pairs (WC, HG, and adenine-adenine).

A comparison of the ranges of *c*SAR(X) values in the adenine monomer and its pairs (WC, HG, and AA) leads to the conclusion that, in the adenine monomer, these ranges are greater than in its pairs for substitution at carbon atoms and almost comparable for substitution at nitrogen. In the case of substitution at C2, the ranges are averaged to 0.238, a value that is significantly lower than for the series substituted at C8 (mean range 0.297), with the ratio of C8/C2 = 1.25. The explanation of this is the same as formerly: the number of bonds between C8, C2 and C6 (with NH2 group attached) is 3 and 2, respectively. In the case of substitution at the nitrogen atom, the former conclusion that the lone electron pair causes a weaker substituent effect due to its possible direct interaction with the substituent can be repeated.

The ranges of *c*SAR(X) values for the examined substituents attached to the C8, C2 and N9 positions and, more generally, to the carbon and nitrogen atoms of adenine in the studied systems are shown in Figure S9 (Supplementary Materials). For a given substituent, these ranges depend on the substitution position (C2, C8, or N9). In the extreme case (N9-Cl), the variation range is 0.042/0.160, i.e., 26%; it is slightly smaller for carbon substitution (NO, 0.066/0.272, i.e., 24%).
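The percentages and the C8/C2 ratio quoted in this subsection follow directly from the stated ranges. A short Python check using only numbers given in the text is shown below; reading the denominators 0.160 and 0.272 as the average *c*SAR(X) ranges for N-X and C-X substitution is our interpretation of the text.

```python
def fraction_percent(part, whole):
    """Express part/whole as a percentage."""
    return 100.0 * part / whole

# Variation of cSAR(X) for one substituent relative to the range for that
# type of substitution position (numbers as quoted in the text).
n9_cl = fraction_percent(0.042, 0.160)  # Cl attached to N9
c_no = fraction_percent(0.066, 0.272)   # NO attached to a carbon atom

# Ratio of the mean cSAR(X) ranges for the C8- and C2-substituted series.
c8_over_c2 = 0.297 / 0.238

print(f"N9-Cl: {n9_cl:.0f}%  C-NO: {c_no:.0f}%  C8/C2: {c8_over_c2:.2f}")
```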

#### *3.5. Interrelations between Hydrogen Bond Parameters*

As shown above, substituents can significantly change the strength of intermolecular interactions. The presence of a substituent in the adenine molecule affects hydrogen bonds in base pairs, either strengthening or weakening them. The strength of individual H-bonds has been characterized by several descriptors, such as the energy of the H-bond (calculated according to the NBO approach), *E*HB; its length, *d*HB; and the QTAIM parameters, i.e., the electron density at the bond critical point, ρBCP; its Laplacian, ∇<sup>2</sup>ρBCP; and the delocalization index between the hydrogen and the H-bond acceptor atom A, δ(H,A). Thus, the results for the N···H and O···H hydrogen bonds in the studied dimers can be used to check the interrelationships of these descriptors of intermolecular interactions in the obtained range of their changes. The values of all applied characteristics are collected in Table S7 (Supplementary Materials).

In general, hydrogen bonds can be classified as weak, moderate, and strong. Jeffrey [61] distinguishes weak H-bonds as those for which the range of absolute energy value is 1–4 kcal/mol; for moderate H-bonds, these energies are 4–15 kcal/mol, and they are 15–40 kcal/mol for strong hydrogen bonds. However, it should be emphasized that there are no "sharp" borders between these hydrogen bonds [62]. The QTAIM theory is also a source of energetic parameters. Rozas, Alkorta, and Elguero [63] suggested that the Laplacian as well as the total electron energy density at hydrogen bond BCP, *H*BCP, should both be used as criteria to characterize hydrogen bonding. They proposed for weak H-bonds that both ∇<sup>2</sup>ρBCP and *H*BCP > 0; for medium H-bonds, they are ∇<sup>2</sup>ρBCP > 0 and *H*BCP < 0, while for strong ones, both ∇<sup>2</sup>ρBCP and *H*BCP < 0. The topological QTAIM parameters also provide information on the nature of the interaction [64,65]. It has been shown that H-bonds shorter than 1.2 Å exhibit a covalent nature, bond lengths in the range 1.2–1.8 Å are associated with the partially covalent character, and H-bonds longer than 1.8 Å are noncovalent; this is also referred to as shared-shell, intermediate closed-shell, and closed-shell, respectively.
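The two classification schemes described above can be written as simple decision rules. The Python sketch below is only an illustration: the function names are ours, the treatment of the boundary values (4 and 15 kcal/mol assigned to the stronger class) is an arbitrary assumption given that the borders are not sharp, and the example inputs are hypothetical.

```python
def jeffrey_class(e_hb_kcal):
    """Classify an H-bond by |E_HB| in kcal/mol (ranges quoted from Jeffrey)."""
    e = abs(e_hb_kcal)
    if e < 1.0:
        return "below weak range"
    if e < 4.0:
        return "weak"
    if e < 15.0:      # boundary assignment is an assumption of this sketch
        return "moderate"
    if e <= 40.0:
        return "strong"
    return "above strong range"

def rozas_class(laplacian_bcp, h_bcp):
    """Classify an H-bond by the signs of the Laplacian and H at the BCP."""
    if laplacian_bcp > 0 and h_bcp > 0:
        return "weak"
    if laplacian_bcp > 0 and h_bcp < 0:
        return "medium"
    if laplacian_bcp < 0 and h_bcp < 0:
        return "strong"
    return "unclassified"  # sign combination not covered by the criteria

# Hypothetical descriptor values (a.u. for the QTAIM quantities), illustration only.
print(jeffrey_class(-7.5))         # energy of 7.5 kcal/mol in absolute value
print(rozas_class(0.08, -0.002))   # positive Laplacian, negative H_BCP
print(rozas_class(0.06, 0.001))    # both positive
```

In the second scheme, the combination of a positive Laplacian and a negative *H*BCP corresponds to the partially covalent interactions discussed later in this section.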

All the calculated energies and their corresponding H-bond lengths in the AU and AA pairs are shown in Figure S10 (Supplementary Materials). The linear dependence of ln(|*E*HB|) on the H-bond length (Figure S11 (Supplementary Materials)) indicates the exponential nature of the relationship shown in Figure S10 (Supplementary Materials). However, a closer look at both figures indicates that three groups of H-bonds should be distinguished among all the points, as shown in Figure 8. The first group (in red) contains O···H and HB2 (N···H) interactions in the WC pairs, the second (in green) contains HB2 in HG pairs and almost linear H-bonds (angle > 175°) in adenine dimers, while the third group (in blue) contains bonds with an H···N-H angle between 161° and 166°; this division is justified by the linear relations shown in Figure S12 (Supplementary Materials). For the same H-bond length, the strength of the interaction decreases as the group number increases. Interestingly, in the case of the strongest interactions, both the nitrogen and oxygen atoms act as acceptors of H-bond protons. Thus, the same relationship describes two types of hydrogen bonds. For comparison, a curve representing the EML equation [24] is added in Figure 8. This equation is based on an exponential fit for 83 experimentally observed O···H hydrogen bonds.
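The link between a linear ln(|*E*HB|) vs. *d*HB plot and an exponential energy-distance relation can be made explicit: if |*E*HB| = A·exp(−b·*d*HB), then ln(|*E*HB|) = ln(A) − b·*d*HB. The Python sketch below recovers A and b by a least-squares fit in logarithmic form. The functional form with two parameters A and b is an assumption made for illustration, and the data are synthetic, generated from arbitrarily chosen parameters rather than taken from Table S7.

```python
import math

def fit_exponential(d, e_abs):
    """Fit |E| = A*exp(-b*d) by least squares on ln|E| = ln(A) - b*d."""
    if len(d) != len(e_abs) or len(d) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(d)
    y = [math.log(v) for v in e_abs]
    md, my = sum(d) / n, sum(y) / n
    sdd = sum((di - md) ** 2 for di in d)
    sdy = sum((di - md) * (yi - my) for di, yi in zip(d, y))
    slope = sdy / sdd                 # equals -b
    ln_a = my - slope * md            # intercept equals ln(A)
    return math.exp(ln_a), -slope     # (A, b)

# Synthetic data from assumed parameters A = 5000 kcal/mol and b = 3.5 per angstrom.
lengths = [1.70, 1.80, 1.90, 2.00, 2.10]                    # d_HB in angstrom
energies = [5000.0 * math.exp(-3.5 * x) for x in lengths]   # |E_HB|

A, b = fit_exponential(lengths, energies)
print(f"A = {A:.1f}, b = {b:.3f}")
```

Fitting each group of H-bonds separately in this way would give one (A, b) pair per group, which is one way to quantify the differences between the groups distinguished in Figure 8.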

Hydrogen bonds in AU base pairs are much stronger than in AA dimers (Table S7 (Supplementary Materials)). In the case of AU systems, N···H interactions (HB2) are stronger than O···H (HB1) H-bonds. For the same H-bond length, HB2 is slightly stronger in the WC than in the HG series. The HB1 bonds in WC pairs are much shorter than in HG dimers, and therefore significantly stronger. For AA dimers, the strength of interactions depends on the N···H-N angle, and, as expected, the linear hydrogen bonds are stronger (Figure 8). Additionally, the obtained values of ∇<sup>2</sup>ρBCP and *H*BCP for HB2 H-bonds (N···H) in the AU base pairs and the strongest AA dimers' H-bonds, AA4 C8-X and AA4 N9-X, X = NO, NO2 (∇<sup>2</sup>ρBCP > 0 and *H*BCP < 0, blue points in Figure S13 (Supplementary Materials)), reveal the partially covalent nature of these interactions. H-bonds in AA4 dimers are longer than 1.8 Å (in the range 1.82–1.88 Å). For the other hydrogen bonds, both QTAIM descriptors are positive, indicating the closed-shell nature of the interactions.

**Figure 8.** Dependences of the H-bond energies, *E*HB, on their lengths, *d*HB, in the studied adenine-uracil and adenine-adenine base pairs.

The relationships between the electron density at the H-bond critical point, ρBCP, or delocalization index, δ(H,A), and the H-bond length are shown in Figure S14 (Supplementary Materials) and Figure 9, respectively. The exponential nature of these relationships is confirmed by the linear equations ln(ρBCP) and ln(δ(H,A)) in relation to the H-bond length, shown in Figures S15 and S16 (Supplementary Materials). The use of these electronic hydrogen bond descriptors allows us to distinguish three types of interaction:


**Figure 9.** Dependences of the delocalization index, δ(H,A), on the H-bond lengths, *d*HB, in the studied adenine-uracil and adenine-adenine base pairs.

For these interaction groups, the dependence of the H-bond energy on the electron density at its BCP is shown in Figure 10. In the case of N1···HN, the energies corresponding to a given ρBCP value are higher than for N7···HN; the same applies to the electron density at BCP for a given hydrogen bond length (Figure S14 (Supplementary Materials)).

**Figure 10.** Adenine-adenine dimers. Energy of individual H-bonds as a function of the electron density at the H-bond critical point ρBCP.

Figure 9 shows an interesting relationship between the delocalization index δ(H,A) and the H-bond length *d*HB. The values of δ(H,A) for a given *d*HB are higher for the N1···HN-type bonds than for N7···HN in the case of both AA and AU pairs. This shows that N1···HN bonds have a higher covalent character than N7···HN bonds, which is consistent with the values of the ∇<sup>2</sup>ρBCP and *H*BCP descriptors discussed earlier and presented in Figure S13 (Supplementary Materials). Even lower values of δ(H,A) are seen for the NH···O-type AU HB1 bonds (marked in blue). This may be interpreted as a stronger attraction of electrons by the more electron-attracting oxygen than by nitrogen atoms in these interactions and, therefore, a less covalent character of the NH···O hydrogen bond compared to N···HN. Thus, the relationships for N···H bonds depend on both the type of the acceptor atom (N1 or N7) and the nature of the interactions. This also confirms the interrelationship between the delocalization index and the electron density at BCP of the H-bond (Figure S17 (Supplementary Materials)).

#### **4. Conclusions**

This theoretical study provides insight into the structural consequences of the substituent effect in biologically important adenine dimers. Four specific aspects were considered: classical SE on intra- and intermolecular interactions, effects on the transmitting moiety, and reverse SE. Additionally, the interrelations between hydrogen bond parameters are presented, revealing more information about the nature of H-bonding in adenine base pairs and other heterocyclic systems.

The classical SE on intramolecular interactions, described by the *c*SAR index, shows that the substituent has a diversified effect on the amino group, and this effect depends on its position. The transmission of the SE to the amino group from the C8 position is more effective than from the C2 or N9 positions. These differences can be explained by an analogy with benzene substituted in the para and meta positions. The C8 position in the adenine molecule is *n* = 3 bonds away (para-like effect), while C2 is *n* = 2 bonds away from the amino group (meta-like effect). N9 substitution has the weakest effect on the amino group due to an interaction of the substituent and N9 atom lone electron pair. These effects also explain the classical SE on intermolecular interactions, i.e., H-bonds. They are presented as relations between the *c*SAR(X) and H-bond energy, length, and H-bond critical point parameters. The H-bond formed by the substituted adenine acceptor atom shows more sensitivity to SE transmitted from N9, which is caused either by the close distance between the N9 and N7 atoms or by the effect on the lone electron pair of the N9 atom.

Changes in the HOMA index for adenine rings show that a substituent, depending on its position, may have either a strong or almost negligible effect on the aromaticity. The five-membered ring aromaticity is highly influenced by N9 because of an interaction between the substituent and the N9 lone electron pair, which greatly contributes to the ring's delocalized structure. The changes in the six-membered ring aromaticity caused by the substituent at the C8 position of the five-membered ring reveal an interesting long-distance SE.

The reverse SE is an effect of the amino group involved in varying intermolecular interactions on the attached substituent. The results presented as changes in the cSAR(X) values showed that the reverse SE is weaker when the amino group forms an H-bond (in base pairs) than in monomers. Moreover, its strength is consistent with para and meta-like effects in classical SE. Furthermore, the changes in the cSAR ranges for a given substituent may reach ca. 25% of the average variation in the *c*SAR(X) range for a particular substitution position (C-X or N-X).

A deeper analysis of the hydrogen bonds and interrelations between their parameters leads to the following observations:


**Supplementary Materials:** The following are available online at http://www.mdpi.com/1420-3049/25/16/3688/s1: Figure S1. Studied substituted adenine-uracil base pairs; Figure S2. Studied substituted adenine-adenine AA2 base pairs. Figure S3. Studied substituted adenine-adenine AA3 base pairs; Figure S4. Studied substituted adenine-adenine AA4 base pairs; Figure S5. Linear regressions between cSAR(X) values calculated by Hirshfeld (Hir) method and data from the VDD and NBO approaches for derivatives of the WC base pair substituted in positions C8-X and N9-X by X = NO, NO2, Cl, F, H, Me, OH, NH2; Figure S6. Relationships between cSAR(NH2) and cSAR(X) in substituted adenine-uracil and adenine-adenine systems; Figure S7. Dependence of HB1 and HB2 energies on cSAR(X) for adenine dimers substituted in C8, C2 and N9 positions of one adenine moiety; Figure S8. Energy of individual H-bonds as a function of cSAR(X) for adenine-adenine pairs with the same substituents attached to both adenines; Figure S9. Ranges of cSAR(X) changes for the examined X substituent attached to C2, C8, and the nitrogen (N9) and carbon atoms in the adenine monomer and its pairs (WC, HG, and adenine-adenine); Figure S10. Dependence of H-bond energies on their lengths for all hydrogen bonds in studied adenine-uracil and adenine-adenine base pairs; Figure S11. Relationship between ln(*E*HB) and H-bond lengths for all hydrogen bonds in studied adenine-uracil and adenine-adenine base pairs; Figure S12. Relationships between ln(|*E*HB|) and hydrogen bond lengths for the groups shown in Figure 8; Figure S13. Dependence of Laplacian electron density on the length of the H-bond for all hydrogen bonds in studied adenine-uracil and adenine-adenine base pairs; Figure S14. Dependences of electron density at the H-bond critical points on their lengths in the studied adenine-uracil and adenine-adenine base pairs; Figure S15. 
Relationships between ln(ρBCP) and H-bond lengths for all hydrogen bonds in studied adenine-uracil and adenine-adenine base pairs; Figure S16. Relationships between ln(δ(H,A)) and H-bond lengths, for all hydrogen bonds in studied adenine-uracil and adenine-adenine base pairs; Figure S17. Delocalization index δ(H,A) between H and A atoms, where A is the acceptor of H-bond as a function of electron density at H-bond critical point for AU and AA dimers; Table S1. Values of cSAR(X), cSAR(NH2) and HOMA for substituted adenine-uracil Hoogsteen (HG) and Watson-Crick (WC) pairs. HOMA values of five-membered adenine ring (HOMA AD5) and six-membered adenine ring (HOMA AD6); Table S2. Values of cSAR(X), cSAR(NH2) and HOMA for substituted adenine-adenine AA2 and AA3 pairs. HOMA values of five-membered adenine ring (HOMA AD5) and six-membered adenine ring (HOMA AD6); Table S3. Values of cSAR(X), cSAR(NH2) and HOMA for substituted adenine-adenine AA4 pairs. HOMA values of five-membered adenine ring (HOMA AD5) and six-membered adenine ring (HOMA AD6); Table S4. Values of cSAR(X) and cSAR(NH2). Symmetrically substituted adenine-adenine AA3 and AA4 pairs; Table S5. Values of cSAR(X) and cSAR(NH2). Asymmetrically substituted adenine-adenine AA2 pairs; Table S6. HOMA values of uracil ring in adenine-uracil Hoogsteen (HG) and Watson-Crick (WC) pairs with substituents at C2, C8, N9 position of adenine moiety; Table S7. Calculated hydrogen bond parameters of all studied base pairs; Table S8. Changes in aromaticity expressed by the HOMA index for (a) AD6 and (b) AD5 rings due to substitution in WC and HG pairs; Table S9. Changes in aromaticity expressed by the HOMA index for AD6 and AD5 rings due to substitution in AA dimers.

**Author Contributions:** Conceptualization, H.S. and T.M.K.; Methodology, H.S. and T.M.K.; Investigation, P.A.W.; Formal analysis, P.A.W.; Validation, H.S. and P.A.W.; Funding acquisition, H.S.; Writing—original draft preparation, H.S. and P.A.W.; Writing—review and editing, T.M.K.; Visualization, P.A.W.; Supervision, H.S. All authors have read and agreed to the published version of the manuscript.

**Funding:** This research was funded by the National Science Centre of Poland (Grant no. UMO-2016/23/B/ST4/00082). APC was partly sponsored by MDPI.

**Acknowledgments:** The authors gratefully acknowledge the Interdisciplinary Center for Mathematical and Computational Modeling (Warsaw, Poland) and the Wrocław Centre for Networking and Supercomputing for providing computer time and facilities.

**Conflicts of Interest:** The authors declare no conflict of interest.

#### **References**


**Sample Availability:** Calculation output files for all studied dimers are available from the authors.

© 2020 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).

*Article*

## **Combined X-ray Crystallographic, IR**/**Raman Spectroscopic, and Periodic DFT Investigations of New Multicomponent Crystalline Forms of Anthelmintic Drugs: A Case Study of Carbendazim Maleate**

#### **Alexander P. Voronin <sup>1</sup>, Artem O. Surov <sup>1</sup>, Andrei V. Churakov <sup>2</sup>, Olga D. Parashchuk <sup>3</sup>, Alexey A. Rykounov <sup>4</sup> and Mikhail V. Vener <sup>5,</sup>\***


Academic Editor: Ilya G. Shenderovich Received: 6 May 2020; Accepted: 18 May 2020; Published: 21 May 2020

**Abstract:** Synthesis of multicomponent solid forms is an important method of modifying and fine-tuning the most critical physicochemical properties of drug compounds. The design of new multicomponent pharmaceutical materials requires reliable information about the supramolecular arrangement of molecules and detailed description of the intermolecular interactions in the crystal structure. It implies the use of a combination of different experimental and theoretical investigation methods. Organic salts present new challenges for those who develop theoretical approaches describing the structure, spectral properties, and lattice energy *E*<sub>latt</sub>. These crystals consist of closed-shell organic ions interacting through relatively strong hydrogen bonds, which leads to *E*<sub>latt</sub> > 200 kJ/mol. Some technical problems that a user of periodic (solid-state) density functional theory (DFT) programs encounters when calculating the properties of these crystals remain unsolved, for example, the influence of cell parameter optimization on the *E*<sub>latt</sub> value, wave numbers, relative intensity of Raman-active vibrations in the low-frequency region, etc. In this work, various properties of a new two-component carbendazim maleate crystal were experimentally investigated, and the applicability of different DFT functionals and empirical Grimme corrections to the description of the obtained structural and spectroscopic properties was tested. Based on this, practical recommendations were developed for further theoretical studies of multicomponent organic pharmaceutical crystals.

**Keywords:** conventional and non-conventional H-bonds; empirical Grimme corrections; lattice energy of organic salts; computation of low-frequency Raman spectra

#### **1. Introduction**

Organic salts are crystalline ionic compounds that contain one or more organic ions in their structure. Organic salts are widely applied in the pharmaceutical industry [1], non-linear optics [2], catalysis [3], as green solvents for chemical production [4], etc. Rational design of organic salts and related materials implies the development of computational methods capable of reliable prediction of industry-relevant properties such as spectroscopic features and crystal lattice energy.

There are several benchmark sets of single-component organic crystals consisting of small rigid molecules without organic fluorine, packed together by van der Waals forces and/or weak and moderate hydrogen bonds (H-bonds), whose lattice energy is accurately computed [5–9]. New and existing theoretical methods are developed and tested based on these sets, and then they are further applied to various compounds. However, most crystals with actual or potential practical application have little in common with the structures from the benchmark sets. Some examples include single-component crystals of larger, conformationally flexible molecules [10–14], fluoroorganic compounds [15,16], and multicomponent crystals [17–19], often with short (strong) [20,21] or ionic H-bonds [22]. The applicability of the methods tested against the benchmark sets for modeling the properties of non-model crystals (e.g., organic salts) is unclear.

In order to describe the properties of "real" crystals, semi-empirical methods based on additive schemes and/or parameterized force fields are often used [23–25]. Their area of application is often limited to a single property (usually to crystal lattice energy), and they are unable to describe a number of properties determined experimentally, including IR and Raman spectra, electron density distribution, etc. The semi-empirical methods provide accurate values of sublimation enthalpies of one-component crystals, consisting of molecules of an arbitrary size [26] and two-component crystals with non-conventional H-bonds [27]. However, these methods are very sensitive to force field parameterization.

For this reason, we chose periodic density functional theory (DFT) methods which allow describing a wide range of properties of the crystalline phase, and which have relatively low computational costs even when treating complex multicomponent crystals [28] of large flexible molecules [29,30] containing aromatic fluorine [31], as well as short (strong) or ionic H-bonds [32,33]. We believe that periodic (solid-state) DFT computations provide a grounded trade-off between the accuracy and the rate of calculations of experimentally observed properties of multi-component organic crystals. In the DFT methods, there are two main approaches based either on Gaussian-type orbitals (GTO) or on plane waves (PW), both with their advantages and disadvantages. Thus, GTO basis sets can better describe isolated molecules in the gas phase, which is essential for *Elatt* calculation [33], while many solid-state properties such as solid-state infrared (IR) spectra are traditionally computed using PW [34]. Only a few articles provide a comparison of the results obtained with GTO and PW bases [35,36]. The choice of the functional is also important for the quality of the obtained data. For example, the B3LYP (Becke 3-parameter, Lee-Yang-Parr) functional is commonly applied in computations with GTOs [37,38], while PBE (Perdew-Burke-Ernzerhof) is used with PW basis sets [30,34,39]. It is well known that B3LYP describes systems with short (strong) or ionic H-bonds better than PBE, while the latter overestimates the stabilization energy in molecular complexes and crystals [40,41]. Non-directed dispersion interactions cause problems for DFT computations in both the GTO and PW versions, making it necessary to use the dispersion corrections of different nature. 
However, it has not yet been investigated how the dispersion corrections [42–44], as well as other parameters such as optimization of cell parameters [7,9,30,34], type of the functional, and basis set, affect the observable properties (e.g., sublimation enthalpy [28,45], IR/Raman spectra [46–48], metric [18,22] and electron density features [42] at bond critical points of conventional and non-conventional hydrogen bonds) of molecular crystals with short (strong) or ionic H-bonds.

Some technical problems that a user of periodic (solid-state) DFT programs encounters when calculating the properties of multicomponent organic crystals remain unsolved. It is still unclear how full optimization (variation of cell parameters) influences the metric parameters of short/ionic H-bonds, *Elatt* values, the wave numbers of normal vibrations in the low-frequency region (<400 cm<sup>−1</sup>), etc.

Soluble drug forms are one of the main areas of application of multicomponent crystals. For this reason, anthelmintic compounds with low aqueous solubility were selected as objects of the present study. Anthelmintic benzimidazole derivatives are basic compounds capable of forming a variety of two-component crystals with pharmaceutically acceptable acids [49–51]. Since the number of potential pharmaceutical crystal forms of target compounds is very high, accurate theoretical estimates of relevant properties are desired to avoid excessive experimental work.

In this paper, a new two-component crystalline form of the anthelmintic drug carbendazim, the carbendazim–maleic acid crystal [CRB + MLE] (1:1), is investigated using X-ray diffraction and IR/Raman spectroscopy in combination with periodic (solid-state) DFT calculations (Figure 1). The applicability of different DFT functionals and empirical Grimme corrections to reproducing the experimentally observed parameters was tested using the CRYSTAL17 and Quantum ESPRESSO DFT codes. As a result, "practical recipes" are proposed for computing multicomponent organic crystals for users of these programs.


**Figure 1.** Molecular structures of carbendazim (**CRB**) and maleic acid (**MLE**).

#### **2. Results and Discussion**

The crystallographic data of the [CRB + MLE] (1:1) salt were recorded at 120 K and at room temperature to study the temperature effect on the metric parameters of the unit cell and most notable H-bonds. The crystallographic information is collected in Table S1 (Supplementary Materials). We see that the thermal expansion in the interval between 120 K and 296 K is small, as the cell volume increases by only 3% (44 Å<sup>3</sup>).

The asymmetric unit contains one CRB cation and one MLE anion. The crystal has H-bonds of different types and strengths: conventional intra- and intermolecular H-bonds and non-conventional C–H···O contacts. A number of equations were proposed to assess the dependence of the H-bond stabilization energy on the distances between the heavy atoms [52,53], the H···O/N distance [54], and electron density descriptors [41,42]. According to these approaches, the intramolecular O24–H24···O21 bond can be considered strong, while the two intermolecular N–H···O bonds can be classified as medium or weak hydrogen bonds (Table S2, Supplementary Materials).

#### *2.1. Hydrogen Bond Patterns*

The molecules in the asymmetric unit are combined into a heterodimer formed by ionic N<sup>+</sup>1–H1···O21 and medium or weak N3–H3···O22 bonds, which build the eight-membered cyclic motif with the *R*<sub>2</sub><sup>2</sup>(8) graph set notation [55]. Another conventional N2–H2···O23 hydrogen bond connects the N–H group of the CRB cation with the adjacent MLE anion and is assisted by two C–H···O contacts (Figure S1, Supplementary Materials). The O21 atom acts as an acceptor of two H-bonds, short (strong) intramolecular and ionic intermolecular ones (Figure 2).

**Figure 2.** Part of the hydrogen bond network in [CRB + MLE] (1:1). The H-bonds and C–H···O contacts are colored blue and green, respectively.

#### *2.2. Effect of Optimization on Cell Parameters*

The volume of the crystallographic cell of the considered two-component crystal increases from 1400.5 to 1444.6 Å<sup>3</sup> as the temperature rises from 120 to 296 K. This means that the cell parameters change slightly when the temperature increases. This result is consistent with the published data [56], according to which the thermal expansion from 120 to 296 K for organic crystals is estimated to range between ~1% and ~3%. The sign and absolute value of the relative change in the volume of the crystallographic cell depend on the functional and type of Grimme corrections (Table 1). The data given in this table indicate that none of the approximations reproduce the experimental value of the thermal expansion of the considered crystal. Some approximations give negative values of this coefficient. Note that such a result was already obtained in a number of articles (see Tables 2 and 3 in Reference [57], Table 2 in Reference [58], Figure 4 in Reference [59], and Table S19 in Reference [34]). These results demonstrate that the change in the unit cell volume during the lattice optimization does not always correspond to the experimental data.
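
The arithmetic behind the quoted thermal expansion can be checked with a few lines of Python. This is a minimal sketch: the two volumes are the experimental values given above, while the function name and the choice of the 120 K volume as the reference are assumptions made for illustration.

```python
def relative_volume_change(v_ref: float, v_new: float) -> float:
    """Return the relative change (v_new - v_ref) / v_ref in percent."""
    return 100.0 * (v_new - v_ref) / v_ref


# Experimental cell volumes of [CRB + MLE] (1:1) quoted in the text, in cubic angstroms.
V_120K = 1400.5
V_296K = 1444.6

expansion = relative_volume_change(V_120K, V_296K)
print(f"Volume increase, 120 K -> 296 K: {V_296K - V_120K:.1f} A^3 ({expansion:.1f}%)")
```

The same helper could be applied to an optimized cell volume versus the experimental one, keeping in mind that the sign then depends on which volume is taken as the reference.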

#### *2.3. Metric Parameters of Conventional and Non-Conventional H-Bonds*

The experimental values of the distances between the heavy atoms involved in the formation of conventional H-bonds are compared in Table 1 with the theoretical values computed using different levels of approximation with fixed unit cell parameters and full unit cell relaxation. We assume that the calculations are in agreement with the X-ray data if the theoretical values of the O···O and O···N distances differ from the experimental ones by no more than 0.01 Å. The data presented in Table 1 suggest that (i) the considered distances are very sensitive to the choice of the functional (B3LYP or PBE), the inclusion of the dispersion correction, and its type (D2, D3, or none), (ii) the variation of the cell parameters greatly changes the distances if the relative change in the volume of the crystallographic cell is more than a few percent, and (iii) the results we get also depend on the basis set type (PW or GTO).
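
The 0.01 Å agreement criterion can be expressed as a short check. In the sketch below, the experimental distances are those listed for the four conventional H-bonds in Table 1; the "computed" values are hypothetical placeholders chosen only to exercise the test, and the function name is an assumption.

```python
TOLERANCE = 0.01  # angstrom; the agreement threshold stated in the text

# Experimental heavy-atom distances (angstrom) for the conventional H-bonds.
experimental = {
    "O24-H24...O21": 2.442,
    "N1-H1...O21": 2.685,
    "N3-H3...O22": 2.761,
    "N2-H2...O23": 2.756,
}

# Hypothetical computed distances (angstrom), for illustration only.
computed = {
    "O24-H24...O21": 2.450,
    "N1-H1...O21": 2.665,
    "N3-H3...O22": 2.766,
    "N2-H2...O23": 2.714,
}


def check_agreement(exp, calc, tol=TOLERANCE):
    """Map each H-bond to (signed deviation, True if |deviation| <= tol)."""
    report = {}
    for bond, d_exp in exp.items():
        deviation = calc[bond] - d_exp
        # A tiny epsilon keeps values sitting exactly on the threshold from
        # failing because of floating-point rounding.
        report[bond] = (deviation, abs(deviation) <= tol + 1e-9)
    return report


report = check_agreement(experimental, computed)
for bond, (deviation, ok) in report.items():
    print(f"{bond:15s} {deviation:+.3f} A  {'agrees' if ok else 'deviates'}")
```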

The metric parameters of the non-conventional H-bonds extracted from the experimental crystal structure are poorly reproduced by all the approximations used (Table S3, Supplementary Materials). We can draw two conclusions. Firstly, the B3LYP approximation with the fixed cell parameters gives the best description of the metric parameters of the conventional H-bonds in the considered crystal. Secondly, it is almost impossible to describe the distances between the heavy atoms involved in the formation of the non-conventional H-bonds with an accuracy of ~0.01 Å in the framework of the approximations used.

#### *2.4. IR Spectrum in the Low-Frequency Region*

The IR spectrum of [CRB + MLE] (1:1) can be divided into high-frequency (>1800 cm<sup>−1</sup>), low-frequency (<400 cm<sup>−1</sup>), and mid-frequency spectral ranges (Figures S2 and S3, Supplementary Materials). For a correct description of the IR frequencies of the asymmetric vibrations of the O–H···O and O–H···N/<sup>−</sup>O···H–N<sup>+</sup> fragments in the high-frequency range, it is necessary to go beyond the double harmonic approximation framework [60]. Explicit accounting of mechanical and electric anharmonicity is very cumbersome and time-consuming [61], especially in the case of organic crystals with intermolecular H-bonds [62,63]. The mid-frequency spectral range is usually described well in the cluster approximation; in some cases, the cluster can consist of one molecule [64].

We focus on reproducing the frequencies, as well as IR and Raman activities, in the low-frequency spectral range. It is currently being intensively studied [29,65,66], since various intermolecular vibrations can be observed in it, in particular, because of the presence of intermolecular H-bonds [36,67–69]. The double harmonic approximation provides a reasonable description of the IR/Raman spectra of organic crystals in the low-frequency spectral range [29,30,70,71].

The experimental IR frequencies of [CRB + MLE] (1:1) in the low-frequency spectral range are compared with the theoretical values computed at different levels of approximation (PBE-D3/6-31G(d,p), B3LYP/6-31G(d,p), B3LYP-D2/6-31G(d,p), B3LYP-D3/6-31G(d,p), and PBE-D3/PW) with fixed unit cell parameters (AtomOnly) and full unit cell relaxation (FullOpt) and the values are collected in Table 2. The periodic DFT calculations in all the approximations used produce reasonable values of vibration frequencies. The IR intensities are reproduced by all the approximations only semi-quantitatively. The use of B3LYP-D3 approximation in the periodic DFT calculations leads to the termination of the IR and Raman activity computations. For this reason, B3LYP-D2 was used to calculate the IR and Raman activities.

The frequencies of the IR-active vibrations of the parent CRB crystal in the range of 400–150 cm<sup>−1</sup> are reproduced well by all the approximations (Table S4, Supplementary Materials). This is due to the absence of short (strong) or ionic H-bonds in this crystal. The IR intensities are reproduced by all the approximations only semi-quantitatively.

Periodic DFT computations of molecular crystals sometimes lead to the appearance of imaginary frequencies [6,72,73]. We encountered this problem when calculating the IR/Raman spectra of [CRB + MLE] (1:1) using the PBE-D3/PW (FullOpt) approximation (see Table S5, Supplementary Materials). Unlike calculations of non-periodic systems, there is no universal recipe for solving the problem of imaginary frequencies appearing in periodic calculations. This problem is usually solved by reducing the space symmetry of a crystal [72,74]. Other methods include (i) the use of extended basis sets [73], (ii) variation of the cell parameters [74], and (iii) increasing the atomic displacement value in numerical second derivative calculations [41]. However, in some cases, these tricks fail to result in a stable structure.
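
Before any of these remedies is tried, the imaginary modes have to be identified in the frequency list. The sketch below assumes the common output convention in which imaginary frequencies are printed as negative wavenumbers; the tolerance for near-zero acoustic modes and the sample list are hypothetical and not taken from this work.

```python
def find_imaginary_modes(wavenumbers, tolerance=5.0):
    """Return (index, wavenumber) pairs for modes printed as negative wavenumbers.

    Modes with |wavenumber| <= tolerance are skipped, because the three
    acoustic modes at the Gamma point are numerically close to zero and
    may carry either sign.
    """
    return [(i, w) for i, w in enumerate(wavenumbers) if w < -tolerance]


# Hypothetical Gamma-point wavenumbers in cm^-1 (not taken from the paper).
modes = [-32.1, -0.4, 0.2, 0.9, 24.8, 41.3]
imaginary = find_imaginary_modes(modes)
print(imaginary)
```

An empty result suggests that the optimized structure is a minimum within the chosen tolerance; a non-empty one points to the modes along which the remedies listed above would be applied.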


| | O24–H24···O21 (intra-) | N1–H1···O21 | N3–H3···O22 | N2–H2···O23 | ΔV = (Vexp − |
|---|---|---|---|---|---|
| Experiment | 2.442 | 2.685 | 2.761 | 2.756 | |
| | 2.462 (2.481) | 2.662 (2.700) | 2.735 (2.798) | 2.701 (2.665) | |
| | 2.446 (2.399) | 2.665 (2.622) | 2.768 (2.718) | 2.714 (2.684) | |
| | 2.468 (2.433) | 2.700 (2.640) | 2.762 (2.746) | 2.750 (2.681) | |
| | 2.457 (2.443) | 2.680 (2.641) | 2.757 (2.745) | 2.701 (2.677) | <−0.1 |
| | 2.453 (2.456) | 2.675 (2.670) | 2.741 (2.744) | | |




(a) vs, very strong; s, strong; numbering given Figure 2; given parenthesis; the plane-wave basis set with a cut-off energy of 100 Ry and PAW pseudopotentials; (e) in the calculations, this is a doublet of bands with almost identical wave numbers and IR intensities.

#### *2.5. Raman Spectrum in the Low-Frequency Region*

The wavenumber of the lowest Raman-active vibration of [CRB + MLE] (1:1) is ~25 cm<sup>−1</sup> (Figure 3). Its theoretical wavenumber is very sensitive to the level of approximation (Table S5, Supplementary Materials). The optimization of the cell parameters greatly affects this value, as well as the number of IR/Raman-active vibrations below 100 cm<sup>−1</sup>. A significant decrease in the cell volume (about 10%) as a result of optimization leads to a blue shift in the wavenumber of the lowest IR/Raman-active vibration by ~10 cm<sup>−1</sup>, in accordance with References [75,76].

**Figure 3.** Raman spectrum of [CRB + MLE] (1:1). Experiment (black line) vs. B3LYP(AtomOnly) computations (red bars). The height of the bars is proportional to the relative Raman intensity of the corresponding transition.

The experimental Raman spectrum of [CRB + MLE] (1:1) in the low-frequency region is shown in Figure 3. The dips in the spectrum at 20.2 cm<sup>−1</sup> and at 302 cm<sup>−1</sup> are the artefacts of the measurements associated with the presence of dust particles on the mirrors. B3LYP with the fixed cell parameters provides a reasonable description of the Raman spectrum of [CRB + MLE] (1:1) (Figure 3). This applies to both the wave numbers and the Raman intensities. In contrast to B3LYP(AtomOnly), B3LYP-D2(FullOpt) does not provide an adequate description of the Raman spectrum (Figure S4, Supplementary Materials). PBE-D3(FullOpt) reproduces the Raman spectrum of the salt somewhat better than B3LYP-D2(FullOpt) (see Figures S4 and S5, Supplementary Materials). However, the calculated wavenumbers of the most intense bands in the region below 100 cm<sup>−1</sup> are blue-shifted compared with the experiment, and the Raman intensity of the vibrations in the region of 100–400 cm<sup>−1</sup> turns out to be very high. This result can be explained by a significant reduction in the cell volume of [CRB + MLE] (1:1) as a result of full optimization (see Table 2).
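
In the figures of this work, the computed transitions are drawn as bars. An alternative that some readers may find convenient for a visual comparison with the measured profile is to broaden the computed sticks with Lorentzian line shapes. The sketch below is an illustration of that generic technique, not the procedure used here; the stick positions, intensities, and line width are hypothetical.

```python
import math


def lorentzian(x, x0, fwhm):
    """Area-normalised Lorentzian centred at x0 with full width at half maximum fwhm."""
    hwhm = 0.5 * fwhm
    return (hwhm / math.pi) / ((x - x0) ** 2 + hwhm ** 2)


def broaden(sticks, grid, fwhm=5.0):
    """Sum intensity-weighted Lorentzians on the grid and scale the maximum to 1."""
    profile = [sum(i * lorentzian(x, x0, fwhm) for x0, i in sticks) for x in grid]
    top = max(profile)
    return [p / top for p in profile]


# Hypothetical (wavenumber in cm^-1, relative Raman intensity) pairs.
sticks = [(25.0, 1.0), (60.0, 0.4), (110.0, 0.6)]
# Wavenumber grid covering 10-440 cm^-1 in 0.5 cm^-1 steps.
grid = [10.0 + 0.5 * k for k in range(861)]
spectrum = broaden(sticks, grid)
```

The line width is a free parameter in such a comparison and has to be chosen to match the experimental resolution.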

In the Raman spectrum of crystalline maleic acid (Figure 4), the most intense band lies in the region of 100 cm<sup>−1</sup>, while the lowest Raman-active vibration is most intense in [CRB + MLE] (1:1). The B3LYP(AtomOnly) approximation reproduces these differences. This approximation provides a reasonable description of the acid Raman spectrum (Figure 4). PBE-D3(FullOpt) and B3LYP-D2(FullOpt) do not reproduce the acid Raman spectrum (Figures S6 and S7, Supplementary Materials).

**Figure 4.** Raman spectrum of crystalline maleic acid. Experiment (black line) vs. B3LYP(AtomOnly) computations (red bars). The height of the bars is proportional to the relative Raman intensity of the corresponding transition.

The signal from the CRB crystal contains a strong (apparently luminescent) background, and its Raman spectrum is very noisy (Figure S8, Supplementary Materials). Therefore, we focus on reproducing the spectrum below 100 cm<sup>−1</sup>, i.e., in the THz region. The B3LYP(AtomOnly) approximation reproduces the position of the most intense low-lying vibration and provides a reasonable description of the spectrum in the considered frequency region (Figure 5). B3LYP-D2(FullOpt) and PBE-D3(FullOpt) do not reproduce the position of the most intense low-lying vibration (Figures S9 and S10, Supplementary Materials).
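
To see why the region below 100 cm<sup>−1</sup> is referred to as the THz region, a wavenumber can be converted to a frequency by multiplying it by the speed of light. The sketch below performs this conversion; the function name is chosen here for illustration.

```python
SPEED_OF_LIGHT_CM_PER_S = 2.99792458e10  # exact SI value expressed in cm/s


def wavenumber_to_thz(wavenumber_cm: float) -> float:
    """Convert a wavenumber in cm^-1 to a frequency in THz."""
    return wavenumber_cm * SPEED_OF_LIGHT_CM_PER_S / 1.0e12


for wn in (25.0, 100.0):
    print(f"{wn:6.1f} cm^-1 = {wavenumber_to_thz(wn):.2f} THz")
```

The upper limit of 100 cm<sup>−1</sup> thus corresponds to about 3 THz.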

**Figure 5.** Raman spectrum of the CRB crystal in the region of 25–100 cm<sup>−1</sup> (see text). Experiment (black line) vs. B3LYP(AtomOnly) computations (red bars). The height of the bars is proportional to the relative Raman intensity of the corresponding transition.

To clarify the effect of cell parameter optimization on the Raman spectrum in the low-frequency region, calculations were performed in the PBE-D3(AtomOnly) approximation. Note that this approximation is used in periodic DFT computations with both GTO [77] and PW [78] basis sets. According to Figure 6, Figures S6 and S10 (Supplementary Materials), PBE-D3(AtomOnly) provides a reasonable description of the Raman spectrum of [CRB + MLE] (1:1) and crystals of pure CRB and MLE.

**Figure 6.** Raman spectrum of [CRB + MLE] (1:1). Experiment (black line) vs. PBE-D3(AtomOnly) computations (blue bars). The height of the bars is proportional to the relative Raman intensity of the corresponding transition.

We conclude that the approximations using cell parameter optimization cannot satisfactorily describe the low-frequency Raman spectrum of the crystals with intra- and intermolecular H bonds of different strengths. This is due to the overestimation of the thermal expansion of the crystals by PBE-D3 and B3LYP-D2 (Table 2 and Table S6, Supplementary Materials). The relative changes in the cell volume for B3LYP-D2(FullOpt) are more than 10%; therefore, this approximation provides a poor description of the low-frequency Raman spectra.

The results obtained in this work show that the low-frequency Raman spectra of organic crystals with intramolecular O–H···O, intermolecular O–H···N and <sup>−</sup>O···H–N<sup>+</sup> bonds are reproduced in the approximations B3LYP(AtomOnly) and PBE-D3(AtomOnly). According to References [31,77], the structure and IR/Raman spectra of crystals containing organic fluorine and non-conventional C–H···F/C–H···O bonds are adequately described by PBE-D3 and modified PBE functionals with modest basis sets in terms of the AtomOnly approximation.

#### *2.6. Lattice Energy Evaluation*

A number of computational approaches to *Elatt* assessment are reported in the literature. They mostly concern single-component crystals and two-component crystals without an intermolecular proton transfer (cocrystals), and they use either careful quantum chemical modeling [6,79–82] or semi-empirical schemes [23,24,83–86]. Molecular salts present new challenges for those developing theoretical approaches describing the lattice energy *Elatt*. These crystals consist of closed-shell organic ions interacting through ionic H-bonds, which may be partially covalent [21,87]. This is one of the reasons for high *Elatt* values in two-component organic crystals, e.g., 246 kJ·mol<sup>−1</sup> (Reference [88]), 259–286 kJ·mol<sup>−1</sup> (Reference [89]), 272 kJ·mol<sup>−1</sup> (Reference [18]), 253–295 kJ·mol<sup>−1</sup> (Reference [90]), 299 kJ·mol<sup>−1</sup> (Reference [91]), etc. It should be pointed out that the *Elatt* values for one-component crystals included in the benchmark sets vary from 25 kJ·mol<sup>−1</sup> for CO<sub>2</sub> to 163 kJ·mol<sup>−1</sup> for cytosine [5]. The intermolecular proton transfer occurs only in the condensed phase [92], which means that there are closed-shell ions in a molecular crystal (BH<sup>+</sup> cation and A<sup>−</sup> anion), and neutral organic molecules B and HA in the gas phase. Gas-phase energies of these species (*Emol*) are different for neutral molecules and closed-shell ions. This leads to some ambiguity in the choice of gas-phase structures during the *Elatt* calculation (see Equations (1) and (2) in Section 3.6). In our case, these structures can be either a CRBH<sup>+</sup> cation and a maleate ion, or molecules of carbendazim and maleic acid (see the Supplementary Materials for the calculation details).

GTO basis sets require evaluation of the correction for the basis set superposition error (BSSE) (Equation (2)). The Crystal17 evaluation scheme for this correction involves only neutral molecules. It was assumed that BSSEs for the neutral molecules are equal to the values of BSSE for the ionic species in [CRB + MLE] (1:1). The *Elatt* value for the neutral molecules in the gas phase was found to be around 250 kJ·mol<sup>−1</sup> (Table 3), which is comparable to the known *Elatt* values for multicomponent pharmaceutical crystals estimated using other schemes [45,93,94]. The *Elatt* values obtained by PBE-D3 with and without variation of cell parameters agree well with each other.

**Table 3.** Crystal lattice energy of [CRB + MLE] (1:1) derived from the periodic DFT computations with plane wave and Gaussian-type orbitals <sup>a</sup>. The units are kJ·mol<sup>−1</sup>.


<sup>a</sup> See the Supplementary Materials.

In addition, they corresponded to *Elatt* computed by PBE-D3/PW. The lattice energies obtained with the CRBH<sup>+</sup> cation and maleate ion in the gas phase were found to be above 600 kJ·mol−<sup>1</sup> (Table 3). Moreover, this value depended on the DFT functional and the applied BSSE correction. Such large *Elatt* values for multicomponent crystals of organic salts are known in the literature [95–99]. It should be noted that there is a special class of one-component crystals consisting of zwitterionic molecules, which are also characterized by large *Elatt* values [33]. We have to admit that the scheme we used for *Elatt* evaluation of crystals of organic salts requires further development. The raised problem of accounting for BSSE for organic salts and uniform description of the species in crystals and gas phase originates from two independent sources: (1) an assumption of additivity of BSSE corrections for multicomponent crystals; (2) limited ability of the existing approaches to treat crystals of organic salts, e.g., BH<sup>+</sup> and A<sup>−</sup> in a crystal and B and HA in the gas phase.

In summary, the optimization of cell parameters does not lead to a noticeable change in the *Elatt* value despite the significant variation of the cell volume. The use of the GTO and PW basis sets in the PBE-D3 approximation leads to close *Elatt* values.

#### **3. Materials and Methods**

#### *3.1. Compounds and Solvents*

Carbendazim (C9H9N3O2, 98%) was purchased from Acros Organics (Geel, Belgium), and maleic acid (C4H4O4, 98%) was purchased from Merck (Darmstadt, Germany). The solvents were purchased from different suppliers and were used as received without further purification.

#### *3.2. Cocrystal Preparation*

The grinding experiments were performed using a Fritsch planetary micro mill (Fritsch, Idar-Oberstein, Germany), model Pulverisette 7, in 12-mL agate grinding jars with ten 5-mm agate balls at a rate of 500 rpm for 40 min. In a typical experiment, 80–100 mg of the carbendazim/maleic acid mixture in the 1:1 stoichiometric ratio was placed into a grinding jar, and 40–50 μL of methanol was added with a micropipette. In the other method, 200 mg of the 1:1 mixture of carbendazim and maleic acid was suspended in 2 mL of methanol and left to be stirred overnight on a magnetic stirrer at room temperature. The precipitate was filtered from the solution and dried at room temperature.

The diffraction-quality single crystals of [CRB + MLE] (1:1) were obtained by dissolving 90 mg of the stoichiometric 1:1 mixture of the components in 5 mL of methanol at 40 °C. After complete dissolution, the solution was gently cooled to room temperature; then, it was covered with Parafilm with a few small holes pierced in it and left for the solvent to evaporate. After a week, small colorless crystals appeared in the solution.
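
For readers reproducing the preparation, the masses of the two components in a 1:1 (molar) mixture follow from the molecular formulas given in Section 3.1. The sketch below is an illustrative calculation; the rounded standard atomic masses and the helper names are assumptions introduced here.

```python
# Rounded standard atomic masses in g/mol (assumed values for this sketch).
ATOMIC_MASS = {"C": 12.011, "H": 1.008, "N": 14.007, "O": 15.999}

CARBENDAZIM = {"C": 9, "H": 9, "N": 3, "O": 2}  # C9H9N3O2
MALEIC_ACID = {"C": 4, "H": 4, "O": 4}          # C4H4O4


def molar_mass(composition):
    """Molar mass in g/mol from an element -> count mapping."""
    return sum(ATOMIC_MASS[element] * count for element, count in composition.items())


def split_equimolar(total_mg, comp_a, comp_b):
    """Masses (mg) of components a and b in an equimolar mixture of total_mg."""
    m_a, m_b = molar_mass(comp_a), molar_mass(comp_b)
    mass_a = total_mg * m_a / (m_a + m_b)
    return mass_a, total_mg - mass_a


crb_mg, mle_mg = split_equimolar(90.0, CARBENDAZIM, MALEIC_ACID)
print(f"carbendazim: {crb_mg:.1f} mg, maleic acid: {mle_mg:.1f} mg")
```

For the 90 mg batch used for crystal growth, this gives roughly 56 mg of carbendazim and 34 mg of maleic acid, before any correction for the stated 98% purity.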

#### *3.3. Thermal Analysis*

The thermal analysis was carried out using a differential scanning calorimeter (DSC) with a refrigerated cooling system (Perkin Elmer DSC 4000, Perkin Elmer Inc., Waltham, MA, USA). The sample was heated in a standard sealed aluminum pan (40 μL volume) at a rate of 10 °C·min<sup>−1</sup> in a nitrogen atmosphere. The unit was calibrated with indium and zinc standards. The accuracy of the weighing procedure was ±0.01 mg. The results of the DSC analysis for [CRB + MLE] (1:1) and pure components are presented in Figure S11 (Supplementary Materials).

#### *3.4. Single-Crystal and Powder X-ray Diffraction (XRD) Experiments*

Single-crystal XRD data were collected on a SMART APEX II diffractometer (Bruker AXS, Karlsruhe, Germany) using graphite-monochromated Mo*K*α radiation (λ = 0.71073 Å) at 120 and 296 K. Absorption corrections based on the measurements of equivalent reflections were applied [100]. The structures were solved by direct methods and refined by full matrix least-squares on *F*<sup>2</sup> with anisotropic thermal parameters for all the non-hydrogen atoms [101]. All the hydrogen atoms were found from a difference Fourier map and refined isotropically. The crystallographic data for [CRB + MLE] (1:1) were deposited with the Cambridge Crystallographic Data Centre (Cambridge, UK) as supplementary publications under the CCDC numbers 1994877 and 1994878 for 120 and 296 K, respectively. This information can be obtained free of charge from the Cambridge Crystallographic Data Centre via www.ccdc.cam.ac.uk/data\_request/cif.

The X-ray powder diffraction (PXRD) data of the bulk materials were recorded under ambient conditions in Bragg–Brentano geometry on a Bruker D2 Phaser diffractometer equipped with a second-generation LynxEye detector (Bruker AXS, Karlsruhe, Germany) with CuKα radiation (λ = 1.5406 Å). PXRD patterns of salt and parent solids are given in Figure S12 (Supplementary Materials).

#### *3.5. IR and Raman Spectroscopy*

The Fourier-transform infrared (FT-IR) spectra of the compounds were recorded in the spectral range of 400–150 cm<sup>−1</sup> from CsBr pellets on a Bruker Vertex 80 V device (Bruker Optik, Ettlingen, Germany) equipped with a Mylar multilayer beamsplitter. The high-quality spectra were obtained and analyzed using the OPUS 6.5.83 software (Bruker Optik, Ettlingen, Germany).

The Raman measurements in the spectral range of 10–440 cm<sup>−1</sup> were performed using a Raman microscope (inVia, Renishaw plc, Spectroscopy Product Division, Old Town Wotton-Under-Edge, Gloucestershire, GL12 7DW, UK) with a 50× objective lens (Leica DM 2500 M, NA = 0.75, Leica Mikrosysteme Vertrieb GmbH, Mikroskopie und Histologie, Ernst-Leitz-Strasse 17-37, 35578 Wetzlar, Germany). The measurements were made with a NExT monochromator (Renishaw plc, UK). The excitation wavelength was 633 nm, as provided by an He–Ne laser (RL633, Renishaw plc, UK) with the maximum power of 17 mW. The acquisition time and number of accumulations were adjusted to maximize the signal-to-noise ratio with the minimal sample degradation. All the spectra for the powder samples were measured at several points and then averaged to reduce the anisotropy effect on the Raman spectra and to increase the signal-to-noise ratio. The background from the Raman spectra was subtracted by the cubic spline interpolation method. All the spectra were divided by the number of accumulations and acquisition time. The dips in the spectra at wavenumbers of 20.2 cm<sup>−1</sup> and 302 cm<sup>−1</sup> are the artefacts of the measurements associated with the presence of dust particles on the NExT monochromator mirrors.
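
Two of the processing steps described above, division by the number of accumulations and the acquisition time and averaging over several sample points, can be sketched in a few lines. The sketch is illustrative only: the function names and raw counts are hypothetical, the order of the two steps is an assumption, and the cubic spline background subtraction is deliberately left out.

```python
def normalise(counts, accumulations, acquisition_time_s):
    """Divide raw counts by the number of accumulations and the acquisition time."""
    scale = accumulations * acquisition_time_s
    return [c / scale for c in counts]


def average(spectra):
    """Point-by-point mean of spectra recorded on the same wavenumber grid."""
    if len({len(s) for s in spectra}) != 1:
        raise ValueError("all spectra must share one wavenumber grid")
    return [sum(column) / len(spectra) for column in zip(*spectra)]


# Hypothetical raw counts at three sample points (4 accumulations of 10 s each).
raw = [
    [400.0, 800.0, 1200.0],
    [440.0, 760.0, 1160.0],
    [360.0, 840.0, 1240.0],
]
processed = average(
    [normalise(s, accumulations=4, acquisition_time_s=10.0) for s in raw]
)
print(processed)
```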

#### *3.6. Periodic (Solid-State) DFT Computations*

In the CRYSTAL17 calculations [102], we employed the B3LYP [103,104] and PBE [105] functionals with the all-electron Gaussian-type localized orbital basis sets 6-31G(d,p). The London dispersion interactions were taken into account by introducing the D3 correction with Becke–Johnson damping (B3LYP-D3 and PBE-D3) and the D2 correction (B3LYP-D2) developed by Grimme et al. [106,107]. In the Quantum ESPRESSO calculations [108,109], we employed PBE with a plane-wave basis set. PAW pseudopotentials with a cut-off energy of 100 Ry were used [110]. The London dispersion interactions were taken into account by introducing the D3 correction with Becke–Johnson damping (PBE-D3). In one series of calculations, the space groups and the unit cell parameters of the crystals obtained from the X-ray diffraction experiment were fixed, and the structural relaxations were limited to the positional parameters of the atoms (AtomOnly). In the other series, the cell parameters were also optimized without cell volume restrictions (FullOpt). The symmetry of the crystals was kept during all computations.

The crystal lattice energy *Elatt* of the *n*-component crystal was estimated from periodic DFT as the difference between the sum of total electronic energies of relaxed isolated species *Emol* and the total energy of the crystal *Ecry* calculated per asymmetric unit [111]:

$$E_{\rm latt} = \sum_{i=1}^{n} E_{{\rm mol},i} - \frac{E_{\rm cry}}{Z} \tag{1}$$

Equation (1) was used in PBE/PW calculations. In the case of GTO basis sets, the basis set superposition error (BSSE) [112] was taken into account:

$$E_{\rm latt} = \sum_{i=1}^{n} \left( E_{{\rm mol},i} + {\rm BSSE}_i \right) - \frac{E_{\rm cry}}{Z} \tag{2}$$
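As a minimal sketch of how Equations (1) and (2) could be evaluated, the helper below sums the energies of the isolated species, optionally adds a per-species BSSE term, and subtracts the crystal energy divided by *Z*. The function name, argument names, and the numbers in the usage note are hypothetical and are not taken from the paper; all energies are assumed to share one unit.

```python
from typing import Optional, Sequence


def lattice_energy(
    e_mol: Sequence[float],
    e_cry: float,
    z: int,
    bsse: Optional[Sequence[float]] = None,
) -> float:
    """Evaluate E_latt = sum_i (E_mol,i + BSSE_i) - E_cry / Z.

    With bsse=None the BSSE terms are zero and the expression
    reduces to Equation (1); otherwise it follows Equation (2).
    """
    if z <= 0:
        raise ValueError("Z must be a positive integer")
    if bsse is None:
        bsse = [0.0] * len(e_mol)
    if len(bsse) != len(e_mol):
        raise ValueError("one BSSE value is required per isolated species")
    # Sum over the n isolated, relaxed species of the crystal.
    species_sum = sum(e + b for e, b in zip(e_mol, bsse))
    return species_sum - e_cry / z
```

For instance, with the made-up inputs `lattice_energy([-10.0, -20.0], -124.0, 4)` the crystal energy per unit is −31.0, so the sketch returns a positive lattice energy, in line with the sign convention of Equation (1).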

Further details of the calculations are given in Section S1 of the Supplementary Materials.

#### **4. Conclusions**

In this work, we investigated the influence of cell parameter optimization on the *Elatt* value, as well as the structural and spectroscopic properties of the new two-component carbendazim maleate crystal. The sign and absolute value of the relative change in the cell volume of the crystal depend on the functional and the type of Grimme correction. Some properties of the considered crystal (metric parameters of short/ionic H-bonds, low-frequency Raman spectra) strongly depend on the changes in the cell volume, while other properties (lattice energy *Elatt*, infrared spectra in the 400–150 cm−<sup>1</sup> frequency region) are weakly related to these variations.

Optimization of the cell parameters of [CRB + MLE] (1:1) and crystals made up of its constituents greatly affects the wavenumber of the lowest Raman-active vibration, the number of Raman-active vibrations below 100 cm−<sup>1</sup>, and the relative intensity of these vibrations. B3LYP and PBE-D3 with fixed cell parameters provide a reasonable description of the low-frequency Raman spectra of the considered molecular crystals. This applies both to the wavenumbers and Raman intensities. B3LYP and PBE-D3 with modest basis sets and fixed cell parameters can be recommended for evaluation of the structure, H-bond pattern, and infrared/Raman spectra of multicomponent pharmaceutical crystals.

The applicability of different DFT approximations to the *Elatt* calculation of the pharmaceutical salt of carbendazim and maleic acid was examined. It is shown that the existing methods for the calculation of *Elatt* require further developments for solving the problem of accounting for BSSE for organic salts and uniform description of the species in crystals (closed-shell organic ions) and gas phase (neutral organic molecules). It is shown that optimization of the cell parameters does not lead to a noticeable change in the *Elatt* value despite the significant variation of the cell volume. The use of the GTO and PW basis sets in the PBE-D3 approximation leads to close *Elatt* values.

The PBE-D3 method with modest basis sets and fixed cell parameters provides a reasonable trade-off between the accuracy and the computational cost in evaluation of a number of relevant properties of multicomponent pharmaceutical crystals.

**Supplementary Materials:** The following are available online: Section S1. Computational details; Table S1. Crystallographic data for [CRB + MLE] (1:1); Table S2. Experimental metric parameters and interaction energies of conventional intermolecular hydrogen bonds in [CRB + MLE] (1:1); Table S3. Experimental vs. theoretical distances between heavy atoms involved in the formation of nonconventional hydrogen bonds and C-H-O bond angles in [CRB + MLE] (1:1); Table S4. Tentative assignment of the bands in the IR spectrum of the pure CRB crystal below 400 cm−<sup>1</sup>; Table S5. Calculated values of low frequency IR or Raman active phonons in [CRB + MLE] (1:1) and the volume of the crystallographic cell; Table S6. Volume of the crystallographic cell of the CRB and MLE crystals after optimization; Figure S1. Part of the crystal structure with notable non-conventional hydrogen bonds; Figure S2. Experimental and theoretical IR spectra in the low-frequency range; Figure S3. Experimental and theoretical IR spectra of [CRB + MLE] (1:1) in high-frequency and mid-frequency range; Figure S4. Experimental vs. theoretical [B3LYP-D2(FullOpt)] Raman spectrum of [CRB + MLE] (1:1); Figure S5. Experimental vs. theoretical [PBE-D3(FullOpt)] Raman spectrum of [CRB + MLE] (1:1); Figure S6. Experimental vs. theoretical [PBE-D3(AtomOnly) and PBE-D3(FullOpt)] Raman spectrum of MLE; Figure S7. Experimental vs. theoretical [B3LYP-D2(FullOpt)] Raman spectrum of MLE; Figure S8. Experimental Raman spectrum of CRB; Figure S9. Experimental vs. theoretical [B3LYP-D2(FullOpt)] Raman spectrum of CRB in the THz region; Figure S10. Experimental vs. theoretical [PBE-D3(AtomOnly) and PBE-D3(FullOpt)] Raman spectrum of CRB in the THz region; Figure S11. Results of DSC analysis of [CRB + MLE] (1:1) and its pure components; Figure S12. Powder X-Ray diffraction patterns for pure CRB, MLE, and [CRB + MLE] (1:1) (experimental and theoretical).

**Author Contributions:** Conceptualization, A.P.V., A.O.S., and M.V.V.; experimental methodology, A.O.S. and A.P.V.; theoretical methodology, M.V.V., A.P.V., and A.A.R.; investigation, A.P.V., A.O.S., A.V.C., O.D.P., A.A.R., and M.V.V.; single-crystal XRD experiment, A.V.C.; Raman spectroscopy, O.D.P.; PW computations, A.A.R.; writing and visualization, A.P.V., A.O.S., and M.V.V.; supervision, M.V.V. and A.O.S.; project administration, A.O.S.; funding acquisition, A.O.S. All authors have read and agreed to the published version of the manuscript.

**Funding:** The work was supported by the Russian Scientific Foundation (project No. 19-73-10005).

**Acknowledgments:** We thank Matvey S. Gruzdev (Institute of Solution Chemistry of Russian Academy of Sciences) for performing the FT-IR spectroscopic experiment in the low-frequency range. M.V.V. thanks Oleg G. Kharlanov (Faculty of Physics, Lomonosov Moscow State University) for drawing his attention to the relationship between the variation of the cell parameters and the low-frequency Raman spectrum of the crystal. We thank "the Upper Volga Region Centre of Physicochemical Research" for technical assistance with PXRD and FT-IR experiments.

**Conflicts of Interest:** The authors declare no conflict of interest.

#### **References**


**Sample Availability:** Samples of crystalline [CRB + MLE] (1:1), input and output files of the computations are available from the authors.

© 2020 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).

## *Article* **Catalytic Effect of Hydrogen Bond on Oxhydryl Dehydrogenation in Methanol Steam Reforming on Ni(111)**

#### **Changming Ke <sup>1</sup> and Zijing Lin <sup>2,</sup>\***


Received: 19 December 2019; Accepted: 25 March 2020; Published: 27 March 2020

**Abstract:** The dehydrogenations of H3COH and H2O are key steps of methanol steam reforming on transition metal surfaces. Oxhydryl dehydrogenation reactions of H*x*COH (*x* = 0–3) and OH on Ni(111) were investigated by DFT calculations with the OptB88-vdW functional. The transition states were searched by the climbing image nudged elastic band method and the dimer method. The activation energies for the dehydrogenation of individual H*x*COH\* are 68 to 91 kJ/mol, and are reduced to 12–17 kJ/mol by neighboring OH\*. Bader charge analysis showed that the catalytic role of OH\* can be attributed to the effect of the hydrogen bond (H-bond) in maintaining the charge of oxhydryl H along the reaction path. The mechanism of H-bond catalysis was further demonstrated by the study of OH\* and N\* assisted dehydrogenation of OH\*. Due to the universality of the H-bond, the H-bond catalysis shown here is of broad implication for studies of reaction kinetics.

**Keywords:** Reaction mechanism; first-principle calculation; Bader charge analysis; activation energy; transition state structure

#### **1. Introduction**

Methanol steam reforming (MSR), H3COH + H2O → CO2 + 3H2, is frequently used for the generation of H2 at temperatures of 200–300 °C and atmospheric pressure [1–4]. MSR can be catalysed by a number of metals and metal oxides. Cu is the most common commercial catalyst but suffers from pyrophoricity and catalyst sintering, limiting its long-term applications [5,6]. Noble metals, such as Pd and Pt, that possess long-term stability and no pyrophoric behavior [3,7] suffer from high price, limiting their large-scale industrial applications. On balance, Ni is a low-cost and highly effective MSR catalyst. The high activity of Ni for catalyzing MSR at T ≥ 300 °C has been demonstrated in recent experiments [5,6], but a convincing theoretical explanation of the reaction mechanism is lacking.

There have been numerous theoretical studies on the reaction mechanism of MSR over a number of catalysts. Lin et al. conducted density functional theory (DFT) calculations and built a kinetic model of MSR on Cu(111) [8]. They found that the kinetically relevant steps with high activation energies include H3COH\* + \* → H3CO\* + H\* and, where \* denotes an active surface site and R\* means a surface adsorbed species R. Zuo et al. discussed the mechanisms of methanol decomposition, methanol oxidation and steam reforming of methanol on Cu(111) [9]. Wang et al. explained the differences in the intrinsic reactivity of MSR on Cu, CuZn and Cu/ZnO [10]. In addition, they proposed a microkinetic model for more in-depth mechanistic research of MSR on Cu [11]. Smith et al. conducted DFT studies on the initial steps of MSR on PdZn and ZnO surfaces, and found that defect sites lower the barrier significantly [12]. Also based on DFT calculations, Krajčí et al. demonstrated the CO/CO2 selectivity of MSR on many alloys, e.g., PdZn, PtZn and NiZn [13]. Chen et al. [14], Lausche et al. [15], and Kramer et al. [16] studied the selectivity of the dehydrogenation of methanol on Cu(110), Ni(100) and Ni(111), respectively. MSR on Pt [16], Pd [16], Pt3Ni alloy [17], Pt-skinned PtNi bimetallic clusters [18] and Co [19] has also been investigated.

The existing MSR studies point to two kinds of MSR kinetics. The first was deduced by considering the dehydrogenation of isolated adsorbed H*x*COH (*x* = 3, 2, 1, 0). As the activation energy for the bond breaking of H3CO-H\* is high, H3COH\* + \* → H3CO\* + H\* was found to be a likely rate-determining step (RDS) [8,10,11,17]. For example, Wang et al. [11] showed in their DFT study that the activation energy for CH3OH\* + \* → CH3O\* + H\* was 103 kJ/mol and that MSR on Cu(111) was mostly limited by the dehydrogenation of CH3OH\*. The second considered the interaction of H3COH\* and OH\*, where the co-adsorbed OH\* significantly reduces the activation energy of H3COH\* + OH\* → H3CO\* + H2O\* and the C–H scission steps were found to be rate limiting [9,13,19]. The resulting kinetics, with a lower activation energy, agrees better with the experiments [20,21] and is a clear improvement over the first one. Unfortunately, this improved understanding has so far rested mainly on deductions from computational details. An in-depth understanding based on a general physical concept and/or a generalizable mechanism is highly desirable.

This work focuses on the role of the hydrogen bond (H-bond) in reducing the activation energies of oxhydryl dehydrogenation that are important for determining the kinetics of MSR on Ni(111), the dominant catalyzing surface of micron-sized commercial Ni catalysts [22]. DFT calculations were performed to examine the oxhydryl dehydrogenation of H*x*CO–H\* (*x* = 3, 2, 1, 0), with and without the assistance of the co-adsorbed OH\*. Combined analysis of Bader charges and transition state structures showed that the H-bond is the root cause of the observed high MSR activity. The catalytic role of the H-bond was further supported by investigating the dehydrogenation of O–H\* assisted by co-adsorbed N\*.

#### **2. Computational Methods**

DFT calculations were performed using the Vienna Ab-initio Simulation Package (VASP) [23–26], a plane-wave code. The projector augmented wave (PAW) method [27,28] was used to describe the interaction between the ion cores and the valence electrons. The Kohn–Sham equations were solved with a 380 eV cutoff energy for the wavefunctions of the valence electrons. The exchange-correlation interaction was described by the OptB88-vdW functional [29,30]. OptB88-vdW was chosen as it best describes the van der Waals (vdW) interaction on metal surfaces [31]. The computations were performed on a three-layer slab with a 3 × 3 surface unit cell of Ni(111) and a vacuum region of 10 Å thickness. The surface layer of the slab was allowed to relax, while the bottom two layers were fixed. Spin polarization and dipole correction were included by setting ISPIN = 2 and LDIPOL = .TRUE. in all calculations. The Brillouin zone was sampled by a 5 × 5 × 1 Monkhorst–Pack k-point grid. All stable structures were optimized with an energy-based conjugate gradient algorithm [32]. The convergence criteria for electronic and ionic energies were 10−<sup>6</sup> eV/atom and 10−<sup>5</sup> eV/atom, respectively. The cutoff energy and the k-point grid were verified to be adequate: the adsorption energy of H3CO changes by less than 0.4 kJ/mol and 0.7 kJ/mol when the cutoff energy is raised up to 460 eV and the k-point grid is refined up to 8 × 8 × 1, respectively.

Saddle points were determined by combining the climbing image nudged elastic band (CL-NEB) method [33] and the dimer method [34]. First, the less computationally intensive CL-NEB was used to find the minimum energy path and an approximate transition state. In the CL-NEB calculations, seven images were inserted between reactants and products, and the electronic energies and the forces were converged to 10−<sup>4</sup> eV/atom and 0.03 eV/Å, respectively. Second, the transition states obtained by the CL-NEB searches were used as the inputs for the high-precision dimer method [34] to refine the transition states efficiently. In the dimer method calculations, the electronic energy and force were converged to 10−<sup>7</sup> eV/atom and 0.01 eV/Å, respectively.
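The text does not say how the seven intermediate images were generated. A common starting guess for a nudged elastic band path is plain linear interpolation between the reactant and product geometries, and the sketch below illustrates that idea only. The function name and the Cartesian-list representation are assumptions, and periodic-boundary wrapping and atom reordering are deliberately ignored.

```python
from typing import List

Geometry = List[List[float]]  # one [x, y, z] triple per atom, in Å


def interpolate_images(initial: Geometry, final: Geometry,
                       n_images: int = 7) -> List[Geometry]:
    """Return n_images geometries evenly spaced between two endpoints.

    The endpoints themselves are not included in the returned list.
    """
    if len(initial) != len(final):
        raise ValueError("endpoints must contain the same number of atoms")
    images = []
    for k in range(1, n_images + 1):
        t = k / (n_images + 1)  # fractional position along the path
        image = [
            [a + t * (b - a) for a, b in zip(atom_i, atom_f)]
            for atom_i, atom_f in zip(initial, final)
        ]
        images.append(image)
    return images
```

With seven images the fourth one sits at the midpoint of the path; in a climbing image calculation the highest-energy image is then driven toward the saddle point, which is the structure later handed to the dimer refinement.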

The Bader charge was calculated by the method of partitioning charge density grids into Bader volumes, as proposed by Henkelman's group [35,36].

#### **3. Results and Discussion**

#### *3.1. Oxhydryl Dehydrogenation of HxCOH\**

Oxhydryl dehydrogenation of isolated H*x*COH\* (*x* = 3, 2, 1, 0), H*x*COH\* + \* → H*x*CO\* + H\*, and that assisted by co-adsorbed OH, H*x*COH\* + OH\* → H*x*CO\* + H2O\*, were considered. The activation barriers of the two types of dehydrogenation reactions are compared in Figure 1.

**Figure 1.** Activation energies of dual path dehydrogenation of oxhydryl in H*x*CO–H\* (*x* = 3, 2, 1, 0).

Notice that the direct barrier of H3COH\* dehydrogenation shown in Figure 1 is about 91 kJ/mol. In comparison, the barrier was found to range from 39 to 75 kJ/mol in a few earlier studies [16,37,38]. The difference, due to various factors such as the use of different functionals, surface slab models and transition state search methods, is quite substantial, but is also seen in similar cases. For example, the direct barrier of H3COH\* dehydrogenation on Cu(111) was found to vary from 62 to 138 kJ/mol in different DFT studies [8–11,39]. That is, the spread known for Cu(111) is comparable to that known for Ni(111). Although it is premature to draw any conclusion, the result here may be preferable due to the demonstrated quality of OptB88-vdW for similar systems [31] and the widely accepted slab model and transition state search method.

As shown in Figure 1, the activation energies (*E*<sup>a</sup>) of oxhydryl dehydrogenation of H*x*COH\* are reduced from 68–91 kJ/mol for isolated H*x*COH\* to 12–17 kJ/mol for H*x*COH\* with co-adsorbed OH (*x* = 3, 2, 1, 0). The reduction of *E*<sup>a</sup> for the oxhydryl dehydrogenation of H3COH\* due to the presence of neighboring OH\* is known in the literature. For example, *E*<sup>a</sup> is reduced from 62 kJ/mol to 32 kJ/mol for Cu(111) [9] and from 80 kJ/mol to 22 kJ/mol for Co(111) [19]. The results here concerning the oxhydryl dehydrogenation of H*x*COH\* for *x* = 2, 1, 0 indicate that the effect is quite general. The low activation energies mean that all oxhydryl dehydrogenation processes of H*x*COH\* in MSR should be sufficiently fast. Besides, the energy cost for a close proximity of H*x*COH\* and OH\* as compared to isolated adsorbates is low, at 7.7, 14, 11 and 5.1 kJ/mol for *x* = 3, 2, 1, 0, respectively. Therefore, there is no need to consider the oxhydryl dehydrogenation processes of H*x*COH\* when examining the possible RDS in MSR. This result can be used to simplify the elementary reaction step study in many relevant problems. A low *E*<sup>a</sup> for H3COH\* dehydrogenation is also necessary for the understanding of the high MSR activities of Ni catalysts observed experimentally [5,6].
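To give a feeling for what such a barrier reduction means kinetically, the sketch below compares two Arrhenius rate constants. It assumes equal pre-exponential factors for the direct and OH\*-assisted paths, which the source does not claim, and it ignores coverage effects and the proximity cost quoted above. The function name is hypothetical.

```python
import math

R_KJ_PER_MOL_K = 8.314462618e-3  # molar gas constant in kJ/(mol K)


def arrhenius_rate_ratio(ea_direct: float, ea_assisted: float,
                         temperature_k: float) -> float:
    """Ratio k_assisted / k_direct for two Arrhenius rate constants.

    Barriers are in kJ/mol; identical prefactors are assumed, so the
    ratio is exp((Ea_direct - Ea_assisted) / (R T)).
    """
    if temperature_k <= 0:
        raise ValueError("temperature must be positive")
    delta = ea_direct - ea_assisted
    return math.exp(delta / (R_KJ_PER_MOL_K * temperature_k))


# Illustrative pair taken from the ends of the quoted ranges
# (91 and 17 kJ/mol) at 300 °C = 573.15 K.
ratio = arrhenius_rate_ratio(91.0, 17.0, 573.15)
```

Under these assumptions the ratio comes out in the range of millions, which is why the assisted O–H scission can be dropped from the list of candidate rate-determining steps.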

Notice that, while OH\* facilitates the O–H scission process, OH\* provides no help for C–H scission. The activation energies for CH3O\* + OH\* → CH2O\* + H2O\* and CH3OH\* + OH\* → CH2OH\* + H2O\* are 166 and 149 kJ/mol, respectively. Both are higher than the corresponding activation energies of 86.8 and 91.5 kJ/mol for CH3O\* → CH2O\* + H\* and CH3OH\* → CH2OH\* + H\*, respectively. A similar result has also been observed for the C–H scission on PdZn(111) [40]. As reactions prefer the paths of least resistance and the fractional coverage of OH\* is on the order of 1%, with very low coverages for CH3OH\* and CH*x*O\* on Ni(111) [41], the C–H scission is not expected to be adversely impacted by OH\*. However, due to the high *E*<sup>a</sup> involved in CH3O\* → CH2O\* + H\* or CH3OH\* → CH2OH\* + H\*, the C–H scission step is expected to be rate limiting for MSR on Ni(111).

To reveal the common feature of oxhydryl dehydrogenation in different H*x*COH\*, Figure 2 shows the structures and Bader charges of H*x*COH\*, with and without OH\* co-adsorption, at their initial local minimum geometries and reaction transition states. As seen in Figure 2, the Bader charges of oxhydryl H for isolated H*x*COH\* at their local minimum and transition state structures are on average +0.63 e and +0.14 e, respectively. Clearly, these two charges are quite different. That is, a significant charge density redistribution is required along the reaction path going from a local minimum energy structure to a transition state configuration. A large electronic energy change due to the orbital reorganization, or correspondingly a high energy barrier, is expected in the O–H scission process. In comparison, the Bader charges for oxhydryl H of H*x*COH\* with co-adsorbed OH\* are on average +0.59 e and +0.65 e at the minimum energy and transition state structures, respectively. Little charge redistribution is required along the bond scission reaction paths. Combined with the fact that the O–H bond of H*x*COH\* is floppy, a very low activation energy is encountered for each of the reactions.
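The contrast drawn above can be made explicit by differencing the quoted average Bader charges. The short sketch below does only this arithmetic; the function name is hypothetical and the inputs are the four averages stated in the text.

```python
def charge_shift(q_minimum: float, q_transition_state: float) -> float:
    """Change in the Bader charge of oxhydryl H (in e) from the
    local minimum to the transition state."""
    return q_transition_state - q_minimum


# Average values quoted in the text for HxCOH* (x = 0-3).
shift_isolated = charge_shift(0.63, 0.14)   # without co-adsorbed OH*
shift_assisted = charge_shift(0.59, 0.65)   # with co-adsorbed OH*
```

The isolated case loses about half an elementary charge on H, while the OH\*-assisted case changes by only a few hundredths of e, which is the quantitative content of the "little charge redistribution" statement.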


**Figure 2.** Structures and Bader charges in the oxhydryl dehydrogenation of H*x*COH with and without co-adsorbed OH: (**a**) *x* = 0, (**b**) *x* = 1, (**c**) *x* = 2, (**d**) *x* = 3. From left to right: local minimum and transition state structures of H*x*COH without and with co-adsorbed OH.

Based on the above analysis, it is clear that OH\* plays a catalyzing role in the oxhydryl dehydrogenation of H*x*COH\* on Ni(111). The catalytic effect is realized by minimizing the charge redistribution required along the reaction path. The charge of oxhydryl H of H*x*COH\* in the reaction process is maintained by interacting with OH\*. The H···OH distance at the transition structure is 1.86, 1.61, 1.67 and 2.05 Å for *x* = 0, 1, 2, and 3, respectively. These distances are characteristic of H-bonds. Therefore, the reduced reaction barrier for the oxhydryl dehydrogenation of H*x*COH\* can be attributed to the catalyzing effect of the H-bond interaction.

#### *3.2. Dehydrogenation of OH\* Assisted by H-Bond of O–H···OH and O–H···N*

The catalyzing effect of the H-bond on oxhydryl dehydrogenation can also be seen in the dehydrogenation of OH\* adsorbed on Ni(111) [42]. Figure 3 compares the activation energies, initial and transition state structures and charge distributions of OH\* dehydrogenations with and without co-adsorbed neighboring OH\*. As seen in Figure 3, the activation energy of dehydrogenation is 105 kJ/mol for isolated OH\*, but is reduced by 37 kJ/mol to 68 kJ/mol for OH\* with co-adsorbed OH\* due to the O–H···OH interaction. The reduction of activation energy is also observed for other transition metals. The activation energy for the corresponding reaction is reduced from 1.11 eV to 0.3 eV on Co(111) [19] and from 1.88 eV to 0.82 eV on Cu(111) [9]. Moreover, the higher energy of O\* + H2O\* in comparison with that of OH\* + OH\* is in qualitative agreement with both the theoretical and experimental results of Che et al. [43].
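The Ni(111) barriers above are quoted in kJ/mol while the literature values for Co(111) and Cu(111) are in eV. A small conversion helper, sketched below with a hypothetical function name, puts them on one scale using 1 eV ≈ 96.485 kJ/mol (the elementary charge times the Avogadro constant).

```python
EV_TO_KJ_PER_MOL = 96.48533212  # kJ/mol per eV


def ev_to_kj_per_mol(energy_ev: float) -> float:
    """Convert an energy per particle in eV to a molar energy in kJ/mol."""
    return energy_ev * EV_TO_KJ_PER_MOL


# Literature barriers quoted in the text, before and after OH* assistance.
co_111 = (ev_to_kj_per_mol(1.11), ev_to_kj_per_mol(0.3))
cu_111 = (ev_to_kj_per_mol(1.88), ev_to_kj_per_mol(0.82))
```

On this scale the Co(111) values are roughly 107 and 29 kJ/mol and the Cu(111) values roughly 181 and 79 kJ/mol, which can be set directly against the 105 and 68 kJ/mol found here for Ni(111).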


**Figure 3.** The dehydrogenation of O–H on Ni(111) surface: (**a**) The activation energies of dual path dehydrogenation of OH\*, (**b**) The initial and transition state structures and charge distributions of individual OH\* dehydrogenation, (**c**) The initial and transition state structures and charge distributions of OH-catalyzed OH\* dehydrogenation, (**d**) The initial and transition state structures and charge distributions of N-catalyzed OH\* dehydrogenation.

Like the case shown in Figure 2, the Bader charges of H for the energy minimum and transition state structures in the dehydrogenation of individual OH\* are quite different, at +0.65 e and +0.15 e, respectively. The corresponding Bader charges are +0.62 e and +0.64 e, respectively, in the dehydrogenation of OH\* in the presence of the O–H···OH interaction. Once again, the charge of oxhydryl H remains almost constant in the reaction process due to the influence of the O–H···OH H-bond.

H-bonds are ubiquitous in nature and normally exist between electronegative atoms and H atoms covalently bound to similar electronegative atoms. In addition to the OH···O H-bond discussed above, OH···N is another commonly seen type of H-bond. Even though OH···N is not involved in MSR, it may exist in other reaction processes. To test the conceptual generality of H-bond catalysis, the effect of the OH···N H-bond on OH\* dehydrogenation is examined here.

The activation energy, initial and transition state structures and charge distributions of OH\* dehydrogenations with neighboring N\* are also shown in Figure 3. As shown in Figure 3, the activation energy of OH\* dehydrogenation is reduced by 27 kJ/mol due to the presence of the OH···N interaction. Like the cases shown for OH···O interactions, the charge of oxhydryl H changes little in going from the initial local minimum structure to the transition state, even though the direction and distance of the O–H bond are substantially changed in the process.

Combining Figures 2 and 3, a general conclusion may be drawn: the charge of oxhydryl H is kept unchanged in the initial and transition states of dehydrogenation by the presence of H-bond. As a result, the activation energy for the dehydrogenation reaction is reduced in comparison to that in the absence of the H-bond.

It is worth noting that the activation energy is reduced to a very low value of around 15 kJ/mol for the dehydrogenation of H*x*COH\*, but only to 68–78 kJ/mol for the OH\* dehydrogenation. The notable difference is attributable to the rigidity of the O–H bond in the species: the O–H bond directionality is quite weak in H*x*COH\*, but relatively strong in OH\*. Another point to note is that the activation energy for the H···N assisted reaction is 10 kJ/mol higher than that of the H···O assisted process, even though the H···N interaction is known to be stronger than the H···O interaction on average. This is not surprising, though, as a specific case need not correspond to the average. As shown in Figure 3, the charge of N\* for the H-bond is only −0.92 to −0.96 e, while the charge of O for the H-bond is −1.21 e. Moreover, the H-bond distance of OH···N is larger than that of OH···O. Consequently, the OH···N H-bond here is weaker and less effective in reducing the activation energy than the OH···O H-bond. Nevertheless, it is interesting to see that the relatively weak OH···N H-bond is very effective in maintaining the charge of H along a reaction path in which the O–H bond is stretched from the initial 0.98 Å to 1.51 Å at the transition state. Overall, a near constant charge of H during the O–H scission process, as maintained by the H-bond interaction, is a common feature of the H-bond catalyzed reactions. The effectiveness of an H-bond in lowering the activation energy is, however, dependent on numerous factors, such as the H-bond strength, the O–H bond strength, and the surface material, and thus requires further studies.

#### **4. Conclusions**

Oxhydryl dehydrogenations of H*x*COH\* (*x* = 3, 2, 1, 0) on Ni(111) were investigated by DFT calculations and Bader charge analysis. The activation energies are 68 to 91 kJ/mol for isolated H*x*COH\* and are much reduced, to 12–17 kJ/mol, if assisted by neighboring OH\*. The catalyzing effect of OH\* is attributed to the OH···O H-bond that maintains the charge of oxhydryl H in the O–H bond breaking process. The catalytic mechanism of the H-bond is further supported by the results of OH\* and N\* assisted dehydrogenation of OH\*. Due to the universality of the H-bond, the catalytic mechanism revealed here is of broad implication for the study of reaction kinetics of many systems.

**Author Contributions:** Funding acquisition, Z.L.; Investigation, C.K.; Methodology, C.K.; Resources, Z.L.; Software, Z.L.; Supervision, Z.L.; Writing, C.K. and Z.L. All authors have read and agreed to the published version of the manuscript.

**Funding:** This research was funded by National Natural Science Foundation of China (11774324 & 11574284).

**Acknowledgments:** The computing time provided by the Super-computing Center of the University of Science and Technology of China is gratefully acknowledged.

**Conflicts of Interest:** The authors declare no conflict of interest.

#### **References**


**Sample Availability:** Samples of the compounds are available from the authors.

© 2020 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).

## *Article* **Phosphine Oxides as Spectroscopic Halogen Bond Descriptors: IR and NMR Correlations with Interatomic Distances and Complexation Energy**

#### **Alexei S. Ostras', Daniil M. Ivanov, Alexander S. Novikov and Peter M. Tolstoy \***

Institute of Chemistry, St. Petersburg State University, 198504 St. Petersburg, Russia; st052055@student.spbu.ru (A.S.O.); dan15101992@gmail.com (D.M.I.); a.s.novikov@spbu.ru (A.S.N.) **\*** Correspondence: peter.tolstoy@spbu.ru; Tel.: +7-921-430-8191

Academic Editor: Ilya G. Shenderovich Received: 27 February 2020; Accepted: 16 March 2020; Published: 19 March 2020

**Abstract:** An extensive series of 128 halogen-bonded complexes formed by trimethylphosphine oxide and various F-, Cl-, Br-, I- and At-containing molecules, ranging in energy from 0 to 124 kJ/mol, is studied by DFT calculations in vacuum. The results reveal correlations between R–X···O=PMe3 halogen bond energy Δ*E*, X···O distance *r*, halogen's σ-hole size, QTAIM parameters at halogen bond critical point and changes of spectroscopic parameters of phosphine oxide upon complexation, such as 31P NMR chemical shift, ΔδP, and P=O stretching frequency, Δν. Some of the correlations are halogen-specific, i.e., different for F, Cl, Br, I and At, such as Δ*E*(*r*), while others are general, i.e., fulfilled for the whole set of complexes at once, such as Δ*E*(ΔδP). The proposed correlations could be used to estimate the halogen bond properties in disordered media (liquids, solutions, polymers, glasses) from the corresponding NMR and IR spectra.

**Keywords:** halogen bond; phosphine oxide; 31P NMR spectroscopy; IR spectroscopy; non-covalent interactions; spectral correlations

#### **1. Introduction**

Halogen bonding is one of the most abundant non-covalent interactions in chemistry [1,2]. Due to the anisotropic distribution of electron density around the covalently bound halogen atom, it has two distinct regions: (a) the region of increased electron density (nucleophilic site), located perpendicular to the covalent bond and corresponding to negative values of electrostatic potential (ESP) and (b) the region of decreased electron density (electrophilic site), also called the σ-hole [3], located along the covalent bond. It is the existence of the electron-depleted σ-hole that determines the ability of the halogen atom to participate in attractive interactions with electron-donating atoms or groups [4–6], forming the so-called halogen bond R–X···Y (X is a halogen atom). In halogen-bonded complexes, the X···Y distances are usually shorter than the sum of the van der Waals radii of the X and Y atoms, while the R–X···Y angles tend to be close to 180°, because the σ-hole region is located on the continuation of the R–X axis. The range of halogen bond interaction energies is similar to that for hydrogen bonds, spanning from a fraction of a kJ/mol up to 150 kJ/mol [7,8].
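The two geometric hallmarks named above (an X···Y distance below the sum of van der Waals radii and a nearly linear R–X···Y angle) can be turned into a simple screening test. The sketch below uses Bondi van der Waals radii; the 150° angle threshold and the function name are assumptions made for illustration, not criteria taken from this paper.

```python
# Bondi van der Waals radii in Å for the atoms used in this sketch.
BONDI_RADII = {"F": 1.47, "Cl": 1.75, "Br": 1.85, "I": 1.98,
               "O": 1.52, "N": 1.55}


def looks_like_halogen_bond(halogen: str, acceptor: str,
                            distance: float, angle_deg: float,
                            min_angle_deg: float = 150.0) -> bool:
    """Screen an R-X...Y contact using distance and linearity.

    distance is the X...Y separation in Å, angle_deg the R-X...Y angle.
    min_angle_deg is an assumed cut-off for "close to 180 degrees".
    """
    vdw_sum = BONDI_RADII[halogen] + BONDI_RADII[acceptor]
    return distance < vdw_sum and angle_deg >= min_angle_deg
```

For example, an I···O contact of 2.80 Å at 175° passes the screen (the radii sum is 3.50 Å), whereas a Cl···O contact of 3.40 Å fails on distance (radii sum 3.27 Å). Such a test only flags candidates; it says nothing about the interaction energy.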

The formation of halogen bonds can be detected in solids [9–13], in liquids and solutions [14–18] and in gas phase [19–21]. Halogen bonding is being actively studied, and it has been demonstrated that it plays a significant role in biochemistry [22,23], in crystal design and design of functional materials (liquid crystals, molecular receptors, conductors, luminescence emitters, non-linear optical materials, etc.) [24–32], in organocatalysis [33–38] and in design of pharmaceuticals (here, halogen bonds are considered primarily as a type of hydrophobic functional groups, increasing the lipophilic properties of molecules and allowing them to pass through cell membranes) [39,40]. Besides, the halogen bond has been a subject of numerous theoretical works [41–47].

The main characteristics of halogen bond include (a) the geometric parameters (interatomic distances and angles, which could be estimated from diffraction studies) and (b) the complexation energy (experimentally available using, for example, calorimetric methods, though in condensed phases and for intramolecular interactions the definition of complexation energy might be ambiguous). In practice, it is often needed to characterize halogen bonds in disordered states of the matter, where in many cases it is done by indirect approaches, such as analysis of spectroscopic data based on previously established correlations. As a spectroscopic descriptor in such correlational methods, it is often convenient to take the change of a given parameter upon complexation, i.e., the difference between that parameter for the complex and the free molecule (which could be a probe molecule).

The main goal of this work is to propose a new spectroscopic method for the quantitative characterization of geometry and energy of halogen bonds. For this purpose we have selected trimethylphosphine oxide, Me3PO, as a model probe molecule. As spectroscopic descriptors of the halogen-bonded complexes formed by Me3PO, we have chosen (a) the change of the frequency of P=O stretching vibration upon complexation, Δν, and (b) the change of the 31P NMR chemical shift upon complexation, ΔδP. The choice of Me3PO as a probe molecule is motivated by the fact that, due to the high polarization of the P=O bond, the terminal oxygen atom is an effective electron donor in non-covalent interactions, i.e., it is an effective halogen bond acceptor. For the same reason, it is expected that P=O coordination leads to substantial displacement of vibrational bands in IR spectra. Besides, the relatively rigid skeleton of Me3PO minimizes the influence on ΔδP of factors such as conformation of substituents (which becomes large already for the more flexible triethylphosphine oxide, let alone triphenylphosphine oxide).

Phosphine oxides in general are promising probes for the diagnostics of non-covalent interactions. Intermolecular hydrogen-bonded complexes of phosphine oxides and their related compounds are discussed in several publications [48–51]. Moreover, there are several publications where the participation of phosphine oxides in halogen bonds is considered. For example, in [52,53], the crystal adducts of various phosphine oxides (Ph2(Me)P=O, Ph3PO) with strong halogen donors such as pentafluoroiodobenzene C6F5I and 1,4-diaryl-5-iodotriazole are described. Finally, for a series of crystals containing molecules with P=O groups, the presence of short X···O contacts (X = Cl, Br, I), which can be interpreted as halogen bonds, has been established with the help of X-ray analysis and quantum-chemical calculations [54–61]. Many other examples of halogen bonds with phosphine oxides and related compounds can be found in CCDC data, which allowed us to perform an extensive database search and analyze the distributions of geometric parameters, as described in Section 3.2. The high electron-donating ability of phosphine oxides has previously been employed in the Gutmann–Beckett method to characterize the acceptor properties of solvents [62] and other compounds exhibiting Lewis acidity [63,64], expressed in so-called acceptor numbers, AN [65]. The main experimental parameter in this method is the change of the 31P NMR chemical shift of triethylphosphine oxide upon complexation, ΔδP, normalized to 100 for the complex with SbCl5.

In this work we have used the following criteria for the selection of halogen donor molecules: (i) relatively simple structure; (ii) presence of an electron-accepting group, which increases the positive ESP value in the σ-hole region and, consequently, enhances the electron-accepting ability of the halogen; (iii) absence of acidic hydrogen atoms that could compete with halogen bond by forming hydrogen bonds. Based on these criteria, we have selected an extensive set of 128 halogen-containing neutral molecules belonging to different classes of inorganic and organic chemical compounds R–X, where for simplicity RX stands also for R3X and R5X in case of halogen(III) and halogen(V) compounds, respectively:


For each 1:1 complex formed by a Me3PO molecule and one of the abovementioned halogen donors, we have calculated quantum-chemically (M06-2X/def2-TZVPPD level of theory [66]) the equilibrium geometry, the harmonic vibrational frequencies, the nuclear chemical shielding constants, the complexation energy (corrected for BSSE, the basis set superposition error) and the value of ESP in the σ-hole region of the free halogen donor molecule. Besides, we have calculated and analyzed the electron density properties at the (3; –1) halogen bond critical point (BCP) using Bader's quantum theory of atoms in molecules (QTAIM) methodology [67]. The parameters which are determined and analyzed in this work are shown in Scheme 1.

**Scheme 1.** Schematic structure of 1:1 halogen-bonded complexes formed by Me3PO with RX (X: halogen) and the list of calculated parameters studied in this work: *r*, X···O distance; α and β, X···O=P and R–X···O angles, respectively; Δ*E*, complexation energy (BSSE corrected); ρ, ∇<sup>2</sup>ρ, *V* and *G*, electron density, Laplacian of electron density, and local electron potential and kinetic energy densities at the halogen bond critical point (3; –1), respectively; ΔδP, change of the 31P NMR chemical shift upon complexation; Δν, change of the harmonic P=O stretching wavenumber upon complexation; *ESP*max, the extremal value of the electrostatic potential in the region of the σ-hole on the surface of equal electron density taken at the 0.001 electron/Bohr<sup>3</sup> level.

#### **2. Results and Discussion**

The calculated geometries of all complexes are shown in Figure S1, and all the parameters considered in this work (see Scheme 1) are collected in Tables S1 and S2. In this section we will present only the plots and fitting equations that are necessary for the discussion.

#### *2.1. Angular Distribution*

Based on the calculated optimized geometries of halogen-bonded R–X···O=P complexes (X = F, Cl, Br, I, At), the distributions of angles α (angle X···O=P) and β (angle R–X···O) were constructed. In the case of multivalent halogen bond donors, such as some of the interhalides, the angle β was chosen as the largest one among all possible R–X···O values. In Figure 1, the resulting distributions are shown as histograms in which the bar height is proportional to the number of complexes having the corresponding angle within the given angular range (10° for α and 5° for β).

**Figure 1.** Distribution of angles α (X···O=P) and β (R–X···O) for 128 complexes studied in this work (X = F, Cl, Br, I, At). Histogram bars are taken every 10° for α and every 5° for β.
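The binning behind such histograms can be sketched as follows. This is only an illustration: the function name, the plotted range and the example angles are hypothetical and are not taken from the data set of this work.

```python
def angle_histogram(angles_deg, bin_width, lo, hi):
    """Count angles in equal-width bins on [lo, hi]; the value hi goes to the last bin."""
    n_bins = int(round((hi - lo) / bin_width))
    counts = [0] * n_bins
    for angle in angles_deg:
        if angle < lo or angle > hi:
            continue  # ignore values outside the plotted range
        index = min(int((angle - lo) // bin_width), n_bins - 1)
        counts[index] += 1
    return counts


# Hypothetical beta angles in degrees (illustrative values only).
betas = [178.2, 176.9, 171.4, 179.5, 165.0, 180.0]
beta_counts = angle_histogram(betas, bin_width=5.0, lo=150.0, hi=180.0)
```

The same function with `bin_width=10.0` would correspond to the coarser binning used for α.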

The values of angle α lie primarily in the range 100–120°. This indicates that the halogen bonds are, as expected, formed along the direction of the oxygen lone pairs. For the *sp*<sup>2</sup>-hybridized oxygen, the angle between the P=O bond and the oxygen lone pair should be 120°, though the high polarization of the PO bond increases the contribution of the P<sup>+</sup>–O<sup>−</sup> resonance structure, corresponding to the *sp*<sup>3</sup> hybridization, which makes α closer to the tetrahedral angle of 109.5°. Thus, the distribution of angles α might be considered as an indication that the phosphine oxide structure is intermediate between the neutral one, P=O, and the zwitterionic one, P<sup>+</sup>–O<sup>−</sup>. Further deviations of angles α are most probably due to the presence of secondary interactions or steric hindrance.

The angle β describes how linear the halogen bond is. The values of β are influenced by the shape and size of the σ-hole on the halogen atom. As σ-holes are located along the continuation of the R–X bond, the values of β lie close to 180°. Overall, the stronger and shorter the halogen bond is, the more linear it is (see Figure S2). It stands to reason that the complexation energy increases with the increase of the σ-hole, which in turn depends on the polarizability of the halogen atom; thus, it could be expected that, in general, the broader range of angles β would correspond to the weakest R–F···O=P complexes, and the narrow range of β values close to 180° would be observed for the stronger R–I···O=P and R–At···O=P complexes. Indeed, this is the case within the series of complexes considered in this work (see Figure S2).

The abovementioned angular distributions can be compared with the results of the statistical analysis of CCDC 2020 data for phosphine oxides and related PO-containing compounds acting as O-nucleophiles in halogen bonds. Note that only three halogens (Cl, Br, I) were considered, because (i) no results for At were found, as all astatine isotopes are radioactive; and (ii) although we found two structures (XAQYOF and XUFQOF) for fluorine, this is definitely not enough to collect statistics. The values of angle α for these statistical data lie primarily in the range 120–140°, with a median value of 129.4°, as shown in Figure 2. This indicates that the halogen bonds are still preferably formed along the direction of the oxygen lone pairs, but, in the case of bulky substituents around the P=O functionality (larger than the CH3 groups in the model trimethylphosphine oxide), the halogen donors have to occupy positions with larger values of α, which is less optimal for halogen bond stabilization; even almost linear complexes are possible (the largest α value is 173.2° in OBUCAQ).

**Figure 2.** Distribution of angles α (X···O=P) and β (R–X···O) for CCDC 2020 structures containing the R–X···O=P fragment (X = Cl, Br, I). For the complete list of entries see Table S3. Histogram bars are taken every 10° for α and every 5° for β.

The R–X···O fragments are less strictly linear for the CCDC data than for the theoretically investigated R–X···OPMe3 models (see Figure 2). However, the median value of angle β is 170.1°, which is still rather close to 180°. As was seen for the calculated data, the stronger and shorter the halogen bond is, the more linear it is (see Figure 3). Note that for all halogens under consideration (Cl, Br, I) combined, and for each halogen individually, it is possible to draw a single cut-off line in Figure 3 above which almost all data points are located.

**Figure 3.** Correlation between angles β (R–X···O) and the normalized distance parameter *R* = *r*/(*R*<sub>O</sub> + *R*<sub>X</sub>), where *r* is the X···O distance and *R*<sub>O</sub> and *R*<sub>X</sub> are van der Waals radii of oxygen and halogen atoms, respectively, for CCDC 2020 structures containing the R–X···O=P fragment (X = Cl, Br, I). For the complete list of entries see Table S3.

#### *2.2. Complexation Energy Dependence On Intermolecular Distance*

There seems to be no universal dependence which can be used to estimate the halogen bond energy using the interatomic distances and bond angles. The previously reported correlations were tested on relatively homologous sets of intermolecular complexes [47,68]. Here, using our set of 128 complexes with participation of various halogens, we attempt to construct energy–structure correlations which on one hand would be general enough (i.e., would have similar form for all halogen donor types) and on the other hand would be simple enough, allowing for a rapid energy estimation.

Figure 4 shows the correlations between calculated complexation energies Δ*E* and X···O interatomic distances for all complexes studied in this work. The majority of data points in the plot are located in the range of X···O distances from 2.4 to 3.1 Å, while the overall span of complexation energies is from 3.6 up to 124 kJ/mol.

**Figure 4.** Correlation between calculated complexation energy Δ*E* and interatomic distance *r* (X···O, where X = F, Cl, Br, I, At) for R–X···O=PMe3 halogen-bonded complexes studied in this work. The solid curves correspond to Equation (1).

Quite expectedly, for each given halogen (F, Cl, Br, I or At) the decrease of the X···O distance *r* is accompanied by the increase of complexation energy Δ*E*. The shape of this correlation could be approximated by exponentially decaying functions such as:

$$
\Delta E \text{(kJ/mol)} = A \cdot \text{e}^{-b \cdot r}, \tag{1}
$$

where *A* (in kJ/mol) and *b* (in Å<sup>−1</sup>) are fitting coefficients, individual for each halogen. The corresponding fitted curves are added to Figure 4 as solid lines, while the numerical values of parameters *A* and *b* are collected in Table 1. One can see that the fitted parameters found for complexes with fluorine-containing molecules fall out of the overall trend, which is probably associated not only with the small number of data points, but also with the weak ability of fluorine to act as an electron acceptor (due to its small, sometimes almost absent, σ-hole).
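For illustration, a fit of the form of Equation (1) can be sketched by linear least squares on ln Δ*E* versus *r*. This is a minimal sketch and not the procedure used in this work (the fits reported here were done in Origin, and a log-linearized fit weights the data points differently from a direct non-linear fit). The function name and the values of `A_TRUE` and `B_TRUE` are hypothetical and serve only to generate synthetic, noise-free data.

```python
import math


def fit_exponential(r_values, e_values):
    """Fit E = A * exp(-b * r) by linear regression of ln(E) on r; returns (A, b)."""
    n = len(r_values)
    log_e = [math.log(e) for e in e_values]
    mean_r = sum(r_values) / n
    mean_y = sum(log_e) / n
    sxx = sum((r - mean_r) ** 2 for r in r_values)
    sxy = sum((r - mean_r) * (y - mean_y) for r, y in zip(r_values, log_e))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_r
    return math.exp(intercept), -slope


# Synthetic data generated from hypothetical parameters (not values from Table 1).
A_TRUE, B_TRUE = 5.0e4, 2.8            # kJ/mol and 1/Angstrom
distances = [2.4, 2.6, 2.8, 3.0]       # X...O distances in Angstrom
energies = [A_TRUE * math.exp(-B_TRUE * r) for r in distances]
A_fit, b_fit = fit_exponential(distances, energies)
```

With noise-free input the sketch recovers the generating parameters; with real, scattered data the two fitting strategies would generally give slightly different coefficients.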


**Table 1.** The numerical values of fitted parameters *A*, *b*, *B*, *D*, *K* and *M* used in Equations (1), (3)–(5) and (7). The resulting curves are shown in Figures 4–8.

It should be noted that the quality of the correlation increases in the order from chlorine- to astatine-containing halogen donors (see the corresponding *R*<sup>2</sup> values added to Figure 4), because for heavier atoms with a larger σ-hole (Br, I, At) the halogen bond gets generally stronger and becomes the dominant factor determining the interatomic distance X···O.

The differences between the fitted parameters in Equation (1) for different halogens are partially explained by the fact that halogens simply increase in size going from F to At. Thus it is interesting to replot Figure 4, using for the abscissa not the absolute values of *r* but the "reduced" values, normalized to the sum of van der Waals radii of oxygen and halogen:

$$R = \frac{r}{R_O + R_X},\tag{2}$$

where *R*<sub>O</sub> and *R*<sub>X</sub> are van der Waals radii of oxygen and halogen, respectively. For this purpose we have used Bondi's radii [69] (in Å: *R*<sub>O</sub> = 1.52, *R*<sub>F</sub> = 1.47, *R*<sub>Cl</sub> = 1.75, *R*<sub>Br</sub> = 1.85, *R*<sub>I</sub> = 1.98, *R*<sub>At</sub> = 2.02, the latter value taken from [70]). The result is shown in Figure 5. The *R* values could be considered as a measure of the overlap of atomic electron shells upon complexation: the closer the *R* values are to unity, the weaker the halogen bond is. For example, as fluorine-containing molecules are the weakest halogen donors in the series, the *R* values for their complexes are the largest; in contrast, for astatine-containing molecules, the *R* values are significantly smaller.
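The normalization of Equation (2) is easy to reproduce. The sketch below uses the radii quoted above; the function name and the example distance of 2.80 Å are hypothetical.

```python
# Van der Waals radii in Angstrom as quoted in the text (Bondi [69]; At from [70]).
VDW_RADII = {"O": 1.52, "F": 1.47, "Cl": 1.75, "Br": 1.85, "I": 1.98, "At": 2.02}


def reduced_distance(r_angstrom, halogen):
    """Equation (2): X...O distance divided by the sum of O and X van der Waals radii."""
    return r_angstrom / (VDW_RADII["O"] + VDW_RADII[halogen])


# Hypothetical I...O contact of 2.80 Angstrom (illustrative value only).
R_example = reduced_distance(2.80, "I")
```

For this hypothetical contact the sum of radii is 3.50 Å, so *R* is about 0.80, i.e., the contact is about 20% shorter than the van der Waals sum.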

**Figure 5.** Correlation between calculated complexation energy Δ*E* and the normalized distance parameter *R* = *r*/(*R*<sub>O</sub> + *R*<sub>X</sub>), where *r* is the X···O distance and *R*<sub>O</sub> and *R*<sub>X</sub> are van der Waals radii of oxygen and halogen atoms, respectively, for R–X···O=PMe3 halogen-bonded complexes studied in this work (X = F, Cl, Br, I, At). The solid curves correspond to Equation (3).

Due to the fact that *R* is a simple normalization of initial *r* values, the data sets in Figure 5 could be fitted by the same type of functions, as in Equation (1), with the same quality of the fit:

$$
\Delta E(\text{kJ/mol}) = A \cdot e^{-B \cdot R},
\tag{3}
$$

where the values of the unitless parameters *B* are listed in Table 1. The spread of the fitting curves in Figure 5 is significantly smaller than that in Figure 4, though there is still some difference in the behavior of different halogens. Due to the increase of polarizability, the decrease of electronegativity and, therefore, the increase of the maximal σ-hole potential of halogens from F to At, the shortening of the X···O bond leads to a more effective increase of the interaction energy for At and I, as compared to other halogens. This effect is still noticeable even if *R* values (Figure 5) are taken instead of *r* values (Figure 4) for the correlation.

The proposed correlational functions could be used to estimate the halogen bond energy in crystals when the interatomic distance is known. Considering the scattering of data points, the precision of such an estimation would be roughly ±20% over the whole range of energies. The applicability of the correlations is probably limited to halogen-bonded complexes of the X···O type; for X···N, X···S, X···P and X···X interactions the fitted parameters might differ. Nevertheless, it could be expected that qualitatively the type of fitting function would remain unchanged.

#### *2.3. Complexation Energy Dependence On* σ*-hole Electrostatic Characteristics*

The halogen's σ-hole could be characterized by *ESP*max, which is the maximal value of electrostatic potential measured on the surface of equal electron density. Usually for this purpose the electron density level of 0.001 electron/Bohr<sup>3</sup> is selected. It is convenient to represent *ESP*max values in the units of energy, kJ/mol, as the energy of interaction with a virtual unit probe charge. The *ESP*max values for the halogen donor molecules considered in this work are listed in Table S1 and plotted versus Δ*E* in Figure 6. It is possible to fit the data sets with linear equations:

$$
\Delta E(\text{kJ/mol}) = D \cdot ESP_{\text{max}}, \tag{4}
$$

where the unitless parameters *D* are again individual for a given halogen. The numerical values of coefficients *D* are collected in Table 1. The correlation lines (see Figure 6) pass through the origin, which means that the halogen bond does not form in the absence of σ-hole (Δ*E* = 0). Some of the deviations of the data points from linear correlations are possibly due to the presence of secondary non-covalent interactions, other than halogen bonding, i.e., not associated with the σ-hole on the halogen atom, such as various electrostatic and dispersion interactions.
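A proportionality without an intercept, as in Equation (4), has a closed-form least-squares slope. The sketch below shows that formula; the function name and the data points are hypothetical and do not reproduce the values of Table S1 or Table 1.

```python
def fit_through_origin(xs, ys):
    """Least-squares slope k of y = k * x (no intercept): k = sum(x*y) / sum(x*x)."""
    return sum(x * y for x, y in zip(xs, ys)) / sum(x * x for x in xs)


# Hypothetical ESP_max and complexation energies, both in kJ/mol.
esp_max = [50.0, 100.0, 150.0]
delta_e = [10.0, 21.0, 29.0]
D_fit = fit_through_origin(esp_max, delta_e)
```

The same one-parameter form applies to any of the origin-constrained linear correlations discussed in this section.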

**Figure 6.** Correlation between calculated complexation energy Δ*E* and the extremal value of electrostatic potential *ESP*max, measured in the region of σ-hole of X atom on the surface of equal electron density taken at 0.001 electron/Bohr<sup>3</sup> level for R–X···O=PMe3 halogen-bonded complexes studied in this work (X = F, Cl, Br, I, At). The solid lines correspond to Equation (4).

#### *2.4. Correlation Between Complexation Energy And P*=*O Stretching Frequency*

Vibrational spectroscopy is one of the most widely used methods for the identification and characterization of non-covalent interactions. The formation of an intermolecular complex is usually accompanied by a decrease of the stretching vibrational frequency of the electron-donating group. This frequency shift could be used to construct correlations, allowing one to indirectly estimate other parameters of the complex, such as its energy or geometry. In Figure 7 we show the Δ*E* dependence on the calculated P=O stretching vibration frequency shift upon complexation, Δν. The overall trend is apparent: the stronger the complex, the larger the frequency shift. This could be rationalized as follows: the formation of a halogen-bonded complex leads to electron density transfer to the σ\* molecular orbital of the electron-accepting molecule. This in turn leads to the weakening and lengthening of the P=O bond, which results in the lowering of the harmonic force constant for the P=O stretching vibration.

**Figure 7.** Correlation between calculated complexation energy Δ*E* and absolute value of the change of the P=O stretching vibration frequency upon complexation, Δν, for R–X···O=PMe3 halogen-bonded complexes studied in this work (X = F, Cl, Br, I, At). The solid lines correspond to Equation (5).

The data sets shown in Figure 7 could be approximated by linear functions, passing through the origin:

$$
\Delta E(\text{kJ/mol}) = K \cdot \Delta \nu, \tag{5}
$$

where *K* is the proportionality coefficient expressed in kJ/mol per cm<sup>−1</sup>. The numerical values of coefficients *K*, individual for each halogen, are collected in Table 1. As was the case for the other correlations presented above, the data set for fluorine-containing molecules is subject to the largest approximation errors and is likely to perform poorly beyond a complexation energy of ca. 15 kJ/mol. In order to further simplify Equation (5), admittedly with some loss of accuracy, it is possible to propose one universal "average" equation, suited for rough energy estimations:

$$
\Delta E(\text{kJ/mol}) = 0.85 \cdot \Delta \nu,\tag{6}
$$

where Δν is measured in cm<sup>−1</sup>. Here, as the average coefficient, we take the coefficient obtained by a linear fitting of the entire data set shown in Figure 7.

#### *2.5. Correlation Between Complexation Energy And 31P NMR Chemical Shift*

The 31P NMR chemical shift is very sensitive to the chemical structure of phosphorus-containing compounds and to their environment. The "rule of thumb" is that the 31P NMR signal shifts to high field due to the presence of electron-donating substituents and to low field due to electron-accepting ones. This is consistent with what we observe in calculations: formation of a halogen bond increases the 31P NMR chemical shift of Me3PO. This is illustrated in Figure 8, where a correlation between Δ*E* and ΔδP is shown: the stronger the halogen bond is, the larger the signal shift to low field is. The data sets plotted in Figure 8 can be fitted by linear functions:

$$
\Delta E(\text{kJ/mol}) = M \cdot \Delta \delta P, \tag{7}
$$

where the coefficients *M* (in kJ/mol/ppm) differ slightly for F, Cl, Br, I and At. The numerical values are collected in Table 1. In the same way as was done for Δν, for ΔδP it is possible to construct one universal correlation for rough estimations of complexation energies based on 31P NMR spectra:

$$
\Delta E(\text{kJ/mol}) = 2.7 \cdot \Delta \delta P,\tag{8}
$$

where ΔδP is measured in ppm. Again, the average coefficient refers to the fitting of the entire data set shown in Figure 8. It should be noted that despite the generally high sensitivity of δP to the molecular structure and to non-covalent interactions [71], in the literature there are only a few attempts to use δP for the solution of the reverse spectroscopic problem for non-covalent complexes, i.e., for finding the complex's energy and structure based on the phosphorus chemical shift value [72–75]. Part of the reason for this might be the high sensitivity itself, because contributions to δP from various weak secondary non-covalent interactions might "smudge" the effect of the halogen bonding, thus strongly reducing the diagnostic value of the spectroscopic marker. In the case of phosphine oxides R3PO, we hope that such probe molecules are reasonably rigid and less prone to secondary interactions than, for example, phosphinic acids; if so, the correlation proposed in this work would be of practical value.
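The two universal correlations, Equations (6) and (8), can be applied as in the following sketch. The coefficients are those given in the text; the function names and the input shifts are hypothetical, and the results are meant only as rough estimates.

```python
K_UNIVERSAL = 0.85  # kJ/mol per cm^-1, coefficient of Equation (6)
M_UNIVERSAL = 2.7   # kJ/mol per ppm, coefficient of Equation (8)


def energy_from_ir(delta_nu_cm1):
    """Rough complexation energy (kJ/mol) from the P=O frequency shift, Equation (6)."""
    return K_UNIVERSAL * delta_nu_cm1


def energy_from_nmr(delta_delta_p_ppm):
    """Rough complexation energy (kJ/mol) from the 31P chemical shift change, Equation (8)."""
    return M_UNIVERSAL * delta_delta_p_ppm


# Hypothetical spectroscopic shifts for one complex (illustrative values only).
e_ir = energy_from_ir(30.0)     # delta nu = 30 cm^-1
e_nmr = energy_from_nmr(10.0)   # delta delta P = 10 ppm
```

When both descriptors are available for the same complex, comparing the two estimates gives a simple internal consistency check.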

**Figure 8.** Correlation between calculated complexation energy Δ*E* and change of the 31P NMR chemical shift upon complexation, ΔδP, for R–X···O=PMe3 halogen-bonded complexes studied in this work (X = F, Cl, Br, I, At). The solid lines correspond to Equation (7).

In Figure S7 we test the possibility of using ΔδP to estimate halogen atom's electrophilicity, measured as *ESP*max value. There is significant scattering of data points which has prevented us from proposing a correlation, though it could be speculated that the overall dependence is independent of the type of halogen. A much better universal correlation could be proposed between the reduced interatomic distance *R* (see Equation (2) for the definition) and ΔδP (see Figure 9).

**Figure 9.** Correlation between the normalized distance parameter *R* = *r*/(*R*<sub>O</sub> + *R*<sub>X</sub>), where *r* is the X···O distance and *R*<sub>O</sub> and *R*<sub>X</sub> are van der Waals radii of oxygen and halogen atoms, respectively, and the change of the 31P NMR chemical shift upon complexation, ΔδP, for R–X···O=PMe3 halogen-bonded complexes studied in this work (X = F, Cl, Br, I, At). The solid line corresponds to Equation (9). *R*<sup>2</sup> (coefficient of determination) of the fitted curve equals 0.933.

The data points in Figure 9 could be fit reasonably well with a single exponential function:

$$R = 0.63 + 0.37 \cdot e^{-0.07 \cdot \Delta \delta P}, \tag{9}$$

where ΔδP is measured in ppm.
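Equation (9), combined with the definition of *R* in Equation (2), gives a rough X···O distance from a measured ΔδP. The sketch below assumes the van der Waals radii quoted in Section 2.2; the function names are hypothetical.

```python
import math

# Van der Waals radii in Angstrom as quoted in the text.
VDW_RADII = {"O": 1.52, "F": 1.47, "Cl": 1.75, "Br": 1.85, "I": 1.98, "At": 2.02}


def reduced_distance_from_shift(delta_delta_p_ppm):
    """Equation (9): normalized X...O distance R from the 31P chemical shift change."""
    return 0.63 + 0.37 * math.exp(-0.07 * delta_delta_p_ppm)


def xo_distance_from_shift(delta_delta_p_ppm, halogen):
    """Convert R back to an absolute distance in Angstrom via Equation (2)."""
    radii_sum = VDW_RADII["O"] + VDW_RADII[halogen]
    return reduced_distance_from_shift(delta_delta_p_ppm) * radii_sum
```

By construction, the function returns *R* = 1 (a contact at the van der Waals sum) for ΔδP = 0 and approaches *R* = 0.63 for very large shifts.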

From the dependencies shown in Figures 7 and 8, one could expect that the effects of complexation on IR and NMR spectra are correlated. This is indeed the case, as shown in Figure 10; for all types of halogens, the data points could be approximated with a single linear dependence:

$$
\Delta\delta P = 0.3 \cdot \Delta\nu, \tag{10}
$$

where ΔδP is in ppm and Δν is in cm<sup>−1</sup>. The high degree of correlation between two independent spectral characteristics could serve as an indication of the robustness of the selected probe molecule (Me3PO in our case), as well as an indication that both in IR and NMR spectra the main effects (the change of the P=O stretching frequency and the change of the 31P NMR chemical shift) are caused by the donation of electron density from one of the oxygen lone pairs to the halogen donor molecule.
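As a simple consistency check of the three universal coefficients (a worked estimate, not an additional fit), Equation (10) can be substituted into Equation (6):

$$
\Delta E(\text{kJ/mol}) \approx 0.85 \cdot \Delta\nu = 0.85 \cdot \frac{\Delta\delta P}{0.3} \approx 2.8 \cdot \Delta\delta P
$$

The resulting coefficient of about 2.8 kJ/mol/ppm is close to the value of 2.7 kJ/mol/ppm obtained independently in Equation (8), as would be expected if the three correlations describe the same underlying trend.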

**Figure 10.** Correlation between the change of the P=O stretching vibration frequency upon complexation, Δν, and the change of the 31P NMR chemical shift upon complexation, ΔδP, for R–X···O=PMe3 halogen-bonded complexes studied in this work (X = F, Cl, Br, I, At). The solid line corresponds to Equation (10). *R*<sup>2</sup> (coefficient of determination) of the fitted line equals 0.993.

#### *2.6. QTAIM Analysis of the Electronic Structure of Complexes*

Figures S3–S6 show the dependencies of complexation energy Δ*E* on QTAIM parameters at BCP, namely *G*, *V*, ρ and ∇<sup>2</sup>ρ (see Section 3.1 for more information). In this section, we summarize only the main observations. In all cases, the scattering of data points in the plots is substantial, and no universal trends applicable to all halogens could be noticed. Nevertheless, it is possible to construct linear approximations such as:

$$\begin{aligned} \Delta E(\text{kJ/mol}) &= C_{G} \cdot G,\\ \Delta E(\text{kJ/mol}) &= C_{V} \cdot V,\\ \Delta E(\text{kJ/mol}) &= C_{rho} \cdot \rho,\\ \Delta E(\text{kJ/mol}) &= C_{Lap} \cdot \nabla^{2} \rho, \end{aligned} \tag{11}$$

where the independent variables are expressed in the following units: *G* in kJ/mol/Å<sup>3</sup>, *V* in kJ/mol/Å<sup>3</sup> (both *V* and *G* were taken as absolute values), ρ in a.u./Å<sup>3</sup> and ∇<sup>2</sup>ρ in a.u./Å<sup>5</sup>. The resulting fitting coefficients are collected in Table 2. In all cases, the quality of the fit (*R*<sup>2</sup>) was 0.8–0.9 for F-containing halogen-bonded complexes and 0.91–0.99 for Cl-, Br-, I- or At-containing ones. The proportionality coefficients in Equation (11) are not universal: they vary for different halogens, and the use of one average coefficient would lead to significant errors in energy estimations. This observation is similar to the one made in [76], where similar coefficients were proposed for Cl-, Br- and I-containing halogen bonds (see the values listed in Table 2).

**Table 2.** The numerical values of fitted parameters *C*<sub>G</sub>, *C*<sub>V</sub>, *C*<sub>rho</sub> and *C*<sub>Lap</sub> used in Equation (11) to fit data points shown in Figures S3–S6. For comparison, several correlation coefficients previously reported in [76] are added to the last two columns.


#### **3. Materials and Methods**

#### *3.1. Computational Details*

Full geometry optimizations of all model structures in vacuum, as well as calculations of complexation energies (including the relaxation energy of the optimized monomers and corrected for the basis set superposition error by the counterpoise method [77] in a single-point calculation after the geometry optimization), ESP values and spectroscopic characteristics, were performed at the DFT level of theory using Gaussian 16 software (Gaussian Inc., Wallingford, CT, USA) [78]. The visualization was done using GaussView 6.0 (Gaussian Inc., Wallingford, CT, USA) [79] and ChemCraft [80] software.

For all calculations, we used the hybrid functional M06-2X, which was previously shown to perform well for the investigation of non-covalent interactions of small molecules [81–85]. Due to the correction for dispersion interactions, this functional is well suited for the estimation of the geometry and energy of halogen bonds [86]. The basis set def2-TZVPPD was selected because it includes (i) polarization functions allowing for a better description of asymmetric electron distributions in halogens; (ii) diffuse functions which describe well the relatively long-distance non-covalent bonds and (iii) parametrization of pseudopotentials necessary to describe the relativistic effects for heavy halogens, such as I and At [87].

The calculations of harmonic vibrational frequencies were used to define the shift of the P=O stretching band upon complexation as Δν = ν<sub>0</sub> − ν, where ν<sub>0</sub> and ν are the vibrational frequencies for the free Me3PO and Me3PO in the complex, respectively. For the free Me3PO, ν<sub>0</sub> = 1268 cm<sup>−1</sup>. Taking into account that for all studied complexes ν < ν<sub>0</sub>, the definition used for Δν makes its values positive, which is done for convenience. The calculations of chemical shielding were performed using the GIAO method. The change of the 31P NMR chemical shift upon complexation was defined as ΔδP = σ<sub>0</sub>(31P) − σ(31P), where σ<sub>0</sub>(31P) and σ(31P) are the shielding constants of the 31P nucleus in the free Me3PO and Me3PO in the complex, respectively. For the free Me3PO, σ<sub>0</sub>(31P) = 268 ppm.
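The two definitions above translate directly into code. In this minimal sketch the reference values for free Me3PO are those quoted in the text, while the function names and the example values for the complex are hypothetical.

```python
NU0_CM1 = 1268.0     # harmonic P=O stretching frequency of free Me3PO, cm^-1 (from the text)
SIGMA0_PPM = 268.0   # 31P shielding constant of free Me3PO, ppm (from the text)


def delta_nu(nu_complex_cm1):
    """Frequency shift upon complexation: free minus complexed, positive when nu < nu0."""
    return NU0_CM1 - nu_complex_cm1


def delta_delta_p(sigma_complex_ppm):
    """Change of the 31P chemical shift: shielding of free Me3PO minus that in the complex."""
    return SIGMA0_PPM - sigma_complex_ppm


# Hypothetical calculated values for one complex (illustrative only).
shift_ir = delta_nu(1238.0)
shift_nmr = delta_delta_p(258.0)
```

Because the chemical shift decreases as the shielding increases, a lower shielding in the complex corresponds to a positive ΔδP, i.e., a shift to low field.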

The topological analysis of electron density at halogen bond critical points (BCPs) was carried out within the framework of the QTAIM methodology using MultiWFN software [88]. The following BCP parameters were considered: local electron kinetic (*G*) and potential (*V*) energy densities, electron density ρ as well as its Laplacian ∇<sup>2</sup>ρ.

All of the abovementioned calculated halogen bond characteristics were correlated with the complexation energy and in some cases with each other; the proposed linear and non-linear correlation functions were fitted by the least-squares method using Origin software [89]. The complexes in which the dominant interaction was not the R–X···O=P halogen bond (e.g., pnictogen bond, chalcogen bond or π-hole interaction) were not included in the regression analysis. Such complexes are marked with color in Figure S1 and Tables S1 and S2. Throughout the paper, in all the plots, we show only those data points that were actually included in the regression analysis; in other words, the data for complexes in which the dominant interaction was not the halogen bond can be found only in the Supplementary Materials.

#### *3.2. CCDC Data Search*

The search for the relevant R–X···O=P interactions was performed using the CCDC 2020 offline data (program ConQuest 2.0.4). Search criteria: NM~X···O~P(~NM)3 fragment, where (i) the symbol ~ stands for any bond; (ii) X = Cl, Br, I; (iii) NM is any nonmetal; (iv) d(X···O) ≡ *r* is less than the sum of Bondi's vdW radii; (v) ∠(NM~X···O) ≡ β, 150° ≤ β ≤ 180°; (vi) the number of bonded atoms for X is 1; (vii) the number of bonded atoms for P is 4; (viii) structures are non-disordered; (ix) the final R1 index [I ≥ 2σ(I)] is less than or equal to 10%.
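The numerical part of these criteria can be written as a simple filter. This sketch does not use ConQuest or any CCDC programming interface: the `Contact` record and its field names are hypothetical, and the fragment-level conditions (i) and (iii) are assumed to have been satisfied already by the substructure query.

```python
from dataclasses import dataclass

# Bondi van der Waals radii in Angstrom, as quoted in Section 2.2.
BONDI = {"O": 1.52, "Cl": 1.75, "Br": 1.85, "I": 1.98}


@dataclass
class Contact:
    """Hypothetical record for one R-X...O=P contact extracted from a crystal structure."""
    halogen: str           # element symbol of X
    r: float               # X...O distance, Angstrom
    beta: float            # R-X...O angle, degrees
    x_coordination: int    # number of atoms bonded to X
    p_coordination: int    # number of atoms bonded to P
    disordered: bool       # True if the structure is flagged as disordered
    r1_percent: float      # final R1 index, percent


def passes_criteria(c):
    """Apply criteria (ii) and (iv)-(ix) listed in the text."""
    return (
        c.halogen in ("Cl", "Br", "I")
        and c.r < BONDI["O"] + BONDI[c.halogen]
        and 150.0 <= c.beta <= 180.0
        and c.x_coordination == 1
        and c.p_coordination == 4
        and not c.disordered
        and c.r1_percent <= 10.0
    )


# Hypothetical contacts (illustrative values only).
accepted = Contact("I", 2.90, 172.0, 1, 4, False, 4.5)
too_bent = Contact("Br", 3.00, 140.0, 1, 4, False, 4.5)
too_long = Contact("Cl", 3.40, 175.0, 1, 4, False, 4.5)
```

Here `passes_criteria(accepted)` is true, whereas the other two contacts are rejected by the angle and distance criteria, respectively.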

#### **4. Conclusions**

In this work, we have considered a large set of 128 halogen-bonded complexes formed by trimethylphosphine oxide and halogen donors belonging to various classes of chemical compounds. The energies of these complexes span from 3.6 to 124 kJ/mol, while the halogen bond distances *R*, measured as a percentage of the sum of van der Waals radii of the participating atoms, range from 100% down to 62%. The obtained distributions of interatomic distances and angles are rather similar to those obtained from the comprehensive search in the CCDC 2020 database of various RX···PO short contacts (compare Figures 1 and 2). On the one hand, the Me3P=O molecule could be considered as a probe used to characterize the halogen-donating ability of isolated F-, Cl-, Br-, I- and At-containing species (the size of the σ-hole on the halogen atom). On the other hand, the spectroscopic parameters of phosphine oxide involved in an R–X···O=P complex were used to determine the energy and geometry of the halogen bond. We showed that the changes of the 31P NMR chemical shift and of the P=O stretching frequency upon complexation have practically the same diagnostic value: they are well correlated with each other (Figure 10), linearly correlated to the halogen bond energy Δ*E* (Figures 7 and 8) and exponentially related to the halogen bond geometry (Figure 9). The overall spans of the spectroscopic parameters are substantial: ca. 45 ppm for the 31P NMR chemical shift and ca. 50 cm<sup>−1</sup> for the P=O frequency. Interestingly, the spectroscopic correlation with *R* values is general, i.e., it is fulfilled for the whole set of complexes at once, while in many other cases correlations remain halogen-specific, i.e., different for F-, Cl-, Br-, I- or At-containing halogen donors.
We believe that the interdependences between halogen bond descriptors and spectroscopic markers—Equations (5)–(9)—would be useful in case direct crystallographic or calorimetric data are not available, as in the case of halogen-bonded complexes in liquids, in solutions and in other kinds of disordered media.

**Supplementary Materials:** The following are available online at http://www.mdpi.com/1420-3049/25/6/1406/s1, Figure S1: optimized structures of Me3P=O···XR complexes, Table S1: calculated halogen bond geometries, complexation energies and spectroscopic parameters, Table S2: QTAIM parameters (ρ, ∇<sup>2</sup>ρ, *V* and *G*) at halogen bond BCP, Figure S2: the correlation between angles β (angle O···X–R) and the complexation energy, Table S3: geometric parameters of the R–X···O=P halogen bonds found in CCDC 2020 database for X = Cl, Br, I, Figure S3: correlation between Δ*E* and local kinetic energy density *G* at BCP, Figure S4: correlation between Δ*E* and local potential energy density *V* at BCP, Figure S5: correlation between Δ*E* and electron density ρ at BCP, Figure S6: correlation between Δ*E* and Laplacian of electron density ∇<sup>2</sup>ρ at BCP, Figure S7: correlation between the extremal value of electrostatic potential *ESP*max and change of the 31P NMR chemical shift upon complexation, ΔδP, for Me3P=O···XR complexes.

**Author Contributions:** Conceptualization, P.M.T.; Formal analysis, D.M.I.; Funding acquisition, P.M.T.; Investigation, A.S.O. and D.M.I.; Methodology, A.S.N.; Supervision, P.M.T.; Validation, A.S.O.; Writing—original draft, A.S.O. and P.M.T. All authors have read and agreed to the published version of the manuscript.

**Funding:** This research was funded by RSF grant 18-13-00050.

**Acknowledgments:** Quantum-chemical calculations were performed at the Computing Center of St. Petersburg State University Research Park.

**Conflicts of Interest:** The authors declare no conflict of interest.

#### **References**


**Sample Availability:** Calculation output files for all studied complexes are available from the authors.

© 2020 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).

## *Article* **Adduct under Field—A Qualitative Approach to Account for Solvent Effect on Hydrogen Bonding**

#### **Ilya G. Shenderovich 1,\* and Gleb S. Denisov <sup>2</sup>**


Academic Editor: James Sherwood Received: 1 January 2020; Accepted: 20 January 2020; Published: 21 January 2020

**Abstract:** The location of a mobile proton in acid-base complexes in aprotic solvents can be predicted using a simplified Adduct under Field (AuF) approach, where solute–solvent effects on the geometry of hydrogen bond are simulated using a fictitious external electric field. The parameters of the field have been estimated using experimental data on acid-base complexes in CDF3/CDClF2. With some limitations, they can be applied to the chemically similar CHCl3 and CH2Cl2. The obtained data indicate that the solute–solvent effects are critically important regardless of the type of complexes. The temperature dependences of the strength and fluctuation rate of the field explain the behavior of experimentally measured parameters.

**Keywords:** solvent effect; hydrogen bond; NMR; condensed matter; polarizable continuum model; reaction field; external electric field; proton transfer

#### **1. Introduction**

Proton transfer represents the simplest possible chemical reaction [1] and is ubiquitous in chemistry [2,3], material science [4–6], and biology [7,8]. In the latter case, the complexity of the process can increase to a hydrogen atom transfer [9,10]. In condensed matter, the mechanism and the pathway of proton transfer depend on the local environment. As a result, the study of proton transfer processes in a given system can be used as a tool to study the local environment. In most cases, this requires a theoretical simulation of the proton transfer in question. Such simulations are still very challenging as they depend on a compromise between the size of the modeled molecular system and the quality of accounting for intermolecular interactions. The size should be large enough to include all relevant interactions; the quality should be good enough to estimate their effect correctly. One may prefer to simulate a given molecular system in condensed matter using oversimplified approaches, looking only for a qualitative description of the system. Often, such approaches are fully justified. The available theories of nonadiabatic [11] and adiabatic [12] proton transfer reactions provide a useful background for understanding experimental results on reversible proton transfer in the Zundel cation [13–15] as well as on fast proton dynamics in general [16–18]. The precision of such analysis can be improved further [19]. However, the most challenging part is to account for the effect of fluctuating solute–solvent interactions [20–22].

Often, one needs to restrict proton mobility in order to stabilize individual structures. This is especially important for high-resolution nuclear magnetic resonance (NMR) spectroscopy, whose characteristic time is of the order of 10<sup>−3</sup> s. In principle, proton and molecular exchange can be suppressed by lowering the temperature. However, when studying intermolecular interactions in solution, one is strictly limited by the available temperature range. For aprotic polar solvents, the lowest possible temperature is about 100 K [23]. This temperature is not always low enough to affect proton dynamics [24,25]. Another problem is that the required temperature depends on intermolecular interactions in a complex way [26–28]. Even the structures of complexes with strong noncovalent interactions are affected by interactions with the environment [29–32]. In solution, this can be visualized by molecular dynamics simulations [33,34]. Thus, conventional gas-phase calculations can neither be used to predict at what temperature in a given molecular system proton exchange can be suppressed in a given solvent nor to simulate the mean structure of the system in solution.

The solvent effect can be divided into two parts: (i) fluctuating solute–solvent specific interactions and (ii) macroscopic electric field. The polarizable continuum model (PCM) includes only the latter effect [35]. As a result, this model is not sufficient to simulate the structure of noncovalently bound complexes in polar solvents [36]. The effects of specific solute–solvent interactions are to some extent implicitly included in the SMD (Solvation Model based on Density) model [37]. This model uses a number of solvent-specific parameters. In reality, the tabulated values of Abraham's hydrogen bond acidity and basicity, aromaticity, and electronegative halogenicity of solvents are not always the optimal choice for a given molecular system. The temperature dependence of these parameters is not known. Thus, this model can be a good approximation under standard conditions but may fail otherwise. The problem can be overcome by using molecular dynamics approaches. However, they are computationally demanding and challenging for non-aqueous solutions [38].

Alternatively, the effect of environment can be simulated using a fictitious external electric field [39–41]. The main advantage of this approach in relation to complexes with noncovalent interactions is that their experimental structures can be reproduced using only one parameter—the strength of the external electric field. The physical meaning of this field is illustrated in Figure 1. For the sake of simplicity, we consider a hydrogen-bonded (H-bond) complex in an aprotic polar solvent. The strongest intermolecular interaction in this complex is the acid-base H-bond. The geometry of this bond is affected by the macroscopic electric field generated by dipole moments of solvent molecules. This effect can be simulated by the PCM model. Besides that, there are weak yet multiple interactions with solvent molecules. They also cause changes in the electron density in the acid and base that affect the position of the mobile proton. The PCM model ignores this effect. The SMD model can include this effect through the empirical parameters. Using the external electric field model, we can estimate the relative amplitudes of both macroscopic and specific effects on the properties of the hydrogen bond under question when simulating a given experimental property of H-bond with the external field alone and in combination with the PCM model. The efficiency of this Adduct under Field (AuF) approach was demonstrated using a complex of hydrogen fluoride with pyridine [42].

**Figure 1.** The direction of the external electric field that simulates the effect of solute–solvent interactions on the H-bond.

In this work, we use the AuF model in order to simulate experimentally observed solvent-driven proton transfer in a number of H-bonded complexes. The aim of this study is to formulate a simplified computational approach capable of predicting the temperature at which proton exchange will be suppressed in any given solute–solvent system. The model molecular systems are shown in Figure 2. These complexes have been experimentally studied in the past in a liquid CDF3/CDClF2 mixture, exhibiting a dielectric constant between 20 at 170 K and 38 at 103 K [23]. The proton-bound homodimer of pyridine (**1**) does not have chemically active sites exposed to the solvent while the carboxylic moiety in **2**–**6** can specifically interact with solvent molecules [34].

**Figure 2.** H-bonded complexes studied in this paper: proton-bound homodimer of pyridine (**1**) and complexes of 2,4,6-trimethylpyridine (collidine) with 2-nitrobenzoic acid (**2**), 3,5-dichlorobenzoic acid (**3**), formic acid (**4**), benzoic acid (**5**), and acetic acid (**6**).

Proton-bound homodimers can be of two types: symmetric, in which case the partners equally share the binding proton, and asymmetric, where the proton has a stronger bond to one of the partners at any given moment in time [43]. In the proton-bound homodimers of pyridine derivatives in CDF3/CDClF2 mixtures, the mobile proton jumps between the two bases faster than 10<sup>3</sup> s<sup>−1</sup> down to 120 K [44] and slower than 10<sup>11</sup> s<sup>−1</sup> up to 290 K [45].

Table 1 collects the <sup>1</sup>*J*(<sup>15</sup>N<sup>1</sup>H) scalar coupling constants of **2**–**6** in CDF3/CDClF2 solution [46]. These constants were measured at different temperatures because above these temperatures the solvent-driven exchange between the O-H···N and O−···[H-N]<sup>+</sup> forms of the complexes was fast on the NMR time scale. Proton tautomerism in such complexes has previously been studied in detail [34]. For our purpose, it is important that the process strongly depends on the p*K*a of the involved acid. As a result, the solute–solvent interactions can be analyzed in a large temperature range. We know that in the O−···[H-N]<sup>+</sup> form <sup>1</sup>*J*(<sup>15</sup>N<sup>1</sup>H) ≈ 90 Hz [47]. Thus, for some of these complexes, the tautomerism can be slow on the NMR time scale of chemical shifts and fast on the NMR time scale of scalar couplings. However, such aspects are beyond the precision of our qualitative model.

**Table 1.** Experimental <sup>1</sup>*J*(<sup>15</sup>N<sup>1</sup>H) scalar couplings in **2**–**6** in CDF3/CDClF2 solution [46].


<sup>1</sup> The p*K*a's of the involved acids.

#### **2. Results**

#### *2.1. Proton-Bound Homodimer of Pyridine*

Figure 3a shows the potential energy curve of a non-adiabatic proton transfer in **1** under the PCM approximation at ε = 29.3. The minimum energy structures of the two pyridines of **1** are not equal. Therefore, when the mobile proton is transferred from one pyridine to the other while all other atoms are fixed, the second minimum has a higher energy. In reality, this fictitious profile is not present; it is shown only to illustrate further changes. The ground vibrational level of the proton is higher than the energy of the transition state. The frequency of the stretching vibration (ν*NHN*) estimated under the harmonic approximation is 2486 cm<sup>−1</sup>. Because the potential is anharmonic, this value is a rough estimate and is given for illustrative purposes only [48]. The accuracy of the calculations can only be increased at the cost of making them very time-consuming [49,50]. The value of ε can also be challenged; in CDF3/CDClF2 solution at about 130 K ε ≈ 30 [23]. However, the non-adiabatic proton transfer depends on an optical dielectric constant of about 2 [16]. Under the gas phase harmonic approximation, ν*NHN* = 2142 cm<sup>−1</sup>. Thus, a qualitatively similar potential surface will be observed for any value of ε. We are interested in the situation when this transfer is suppressed. What is important is that solvent polarization alone cannot cause this effect.

CDF3/CDClF2 solution cannot be simulated using the SMD approximation because its parameters are not known. Instead, chemically similar CH2Cl2 can be used. Although ν*NHN* increases under this approximation to 2517 cm<sup>−1</sup>, the ground vibrational level is still higher than the energy of the transition state (Figure 3b), which means that this model cannot reproduce the experimentally observed single-well location of the mobile proton in **1**.

The single-well location becomes possible in the presence of an external electric field. Under the PCM approximation (ε = 29.3) and the field of 0.001 a.u., the energy of the ground vibrational level of the mobile proton is very close to the energy of the transition state, Figure 3c. At 0.002 a.u., the former is lower than the latter, Figure 3d. This increase of the field is accompanied by an increase of ν*NHN* from 2593 cm<sup>−1</sup> to 2684 cm<sup>−1</sup>, Figure 3c,d. Thus, the experimentally observed proton jumps in **1** can be simulated under the PCM approximation and ε ≈ 30 when the strength of the external field is above 0.001–0.002 a.u. There are no hard criteria for choosing the most appropriate value of the field. We can only state that the lower limit of its strength is 0.001 a.u. Within the gas phase approximation, this limit increases to at least 0.003 a.u., Figure 3e,f.

**Figure 3.** Potential energy curve of a proton transfer within **1** at different approximations: (**a**) PCM (ε=29.3), (**b**) SMD (CH2Cl2, ε=8.9), (**c**) PCM (ε=29.3) and the external electric field of 0.001 a.u., (**d**) PCM (ε=29.3) and the external electric field of 0.002 a.u., (**e**) the external electric field of 0.003 a.u., and (**f**) the external electric field of 0.005 a.u. Dashed lines indicate the energy of the ground state. ν*NHN* are the frequencies of the mobile proton stretching vibration. q1 corresponds to the distance of the mobile proton with respect to the H-bond center [51].
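The comparison made in Figure 3 (ground vibrational level versus the energy of the transition state) can be sketched numerically. The Python snippet below assumes a purely harmonic zero-point level, which the text itself treats as a rough estimate for this anharmonic potential. The frequencies are those quoted above; the barrier height in the usage example is a hypothetical placeholder, because the source gives the barriers only graphically.

```python
# 1 cm^-1 expressed in kJ/mol (h * c * N_A); used only for unit conversion.
CM1_TO_KJ_PER_MOL = 0.01196266


def harmonic_zero_point_level(nu_cm1: float) -> float:
    """Zero-point level (cm^-1 above the well bottom) of a harmonic oscillator."""
    return 0.5 * nu_cm1


def level_above_barrier(nu_cm1: float, barrier_cm1: float) -> bool:
    """True if the harmonic ground level lies above the barrier top.

    barrier_cm1 is measured from the same well bottom as the level.
    """
    return harmonic_zero_point_level(nu_cm1) > barrier_cm1


# Harmonic N-H-N stretching frequencies quoted in the text for complex 1 (cm^-1).
NU_NHN = {
    "gas phase": 2142.0,
    "PCM": 2486.0,
    "SMD (CH2Cl2)": 2517.0,
    "PCM + 0.001 a.u.": 2593.0,
    "PCM + 0.002 a.u.": 2684.0,
}

for model, nu in NU_NHN.items():
    level = harmonic_zero_point_level(nu)
    print(f"{model:18s} level = {level:7.1f} cm^-1 "
          f"= {level * CM1_TO_KJ_PER_MOL:5.2f} kJ/mol")

# HYPOTHETICAL barrier height (not taken from the source), for illustration only.
assumed_barrier_cm1 = 1000.0
print(level_above_barrier(NU_NHN["PCM"], assumed_barrier_cm1))
```

In an actual analysis the barrier would have to be read from the computed potential energy curve of each model, since it changes with the field strength.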

#### *2.2. Complexes of Collidine with Acids*

In Table 2, geometric parameters of the H-bond in **4** under different approximations are reported. Although these parameters depend on the level of approximation, the mobile proton is located at the acid in all cases. Only at ε > 29 do there appear higher energy local minima on the potential energy curve of proton transfer that correspond to the proton location at collidine. Taking into account the qualitative character of our analysis, we studied the effect of the external electric field on the location of the mobile proton in **2**–**6** at a computationally efficient ωB97XD/def2svp approximation. We also restricted our analysis to the comparison of the difference between the energies of the two minima (proton at acid and proton at base) on the potential energy curve of proton transfer. The values of ε under the PCM approximation were taken equal to 12.5 for **2** and 29.3 for **3**–**6**. These values are close to the dielectric constant of CDF3/CDClF2 solution at 200 K and 130 K, respectively [23]. There is no need to select ε with a higher precision because Table 2 clearly demonstrates that, at ε > 10, its effect on H-bond geometry remains rather constant.


**Table 2.** H-bond geometry of **4** under different DFT (Density Functional Theory) approximations.

Figure 4 demonstrates the effect of the external electric field on the energy of the molecular (O-H···N) and ionic (O−···[H-N]<sup>+</sup>) forms of H-bonds in **2** (Figure 4a,b), **3** (Figure 4c,d), **4** (Figure 4e,f), **5** (Figure 4g,h), and **6** (Figure 4i,j) under the PCM and gas-phase approximations. For all complexes in both approximations, an increase of the field causes an energy decrease of both forms, although the ionic form is stabilized more strongly. Upon this increase, the profile of the potential energy curve changes from a single-well (molecular) to a double-well to a single-well (ionic) one. The double-well potential interval is shown in Figure 4. For each complex, there is a unique value of the field strength for which the energy minima of the two forms are equal. Δ*E* corresponds to the energy of the complex with respect to the value at this field.

Strictly speaking, in order to find which value of the external field is the best approximation of experimental conditions, one needs (i) to estimate the molar fractions of the two forms from NMR spectra and (ii) to then find at what field the same ratio is obtained in calculations. The former can be done using either the value of <sup>1</sup>*J*(<sup>15</sup>N<sup>1</sup>H) in Table 1 or the <sup>1</sup>H NMR chemical shift of the mobile proton [52]; the latter, by calculating the effect of the field on the free energy. However, both of these estimates are rough and are redundant in the case of the present qualitative analysis. Instead, the lower limit of the external electric field can be associated with the value at which the energy minima of the two forms are equal, Table 3.

**Table 3.** The external electric field at which the energy minima of the molecular (O-H···N) and ionic (O−···[H-N]<sup>+</sup>) forms of H-bonds in **2**–**6** are equal.


**Figure 4.** The effect of the external electric field on the energy of O-H···N (black) and O−···[H-N]<sup>+</sup> (red) forms of H-bonds in **2**–**6**. For each complex, there is a unique value of the field strength for which the energy minima of the two forms are equal. Δ*E* corresponds to the energy with respect to the value at this field. **2**: (**a**) PCM (ε=12.5), (**b**) gas-phase; **3**: (**c**) PCM (ε=29.3), (**d**) gas-phase; **4**: (**e**) PCM (ε=29.3), (**f**) gas-phase; **5**: (**g**) PCM (ε=29.3), (**h**) gas-phase; **6**: (**i**) PCM (ε=29.3), (**j**) gas-phase.
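The two-step procedure mentioned above (molar fractions from NMR, then the same ratio from calculated free energies) can be illustrated with a simple two-state model. The sketch below is not the procedure used in this work, which deliberately skips this step. It assumes fast exchange between exactly two forms, a limiting coupling magnitude of 90 Hz for the ionic form as quoted in the text, and a hypothetical limiting coupling of 0 Hz for the molecular form.

```python
import math

R_KJ_PER_MOL_K = 8.314462618e-3  # gas constant in kJ/(mol K)


def ionic_fraction_from_coupling(j_obs_hz: float,
                                 j_ionic_hz: float = 90.0,
                                 j_molecular_hz: float = 0.0) -> float:
    """Mole fraction of the ionic form from an exchange-averaged coupling.

    Fast two-state exchange: J_obs = x * J_ionic + (1 - x) * J_molecular.
    j_molecular_hz = 0.0 is an ASSUMPTION made for illustration only.
    The result is clamped to the physical range [0, 1].
    """
    x = (j_obs_hz - j_molecular_hz) / (j_ionic_hz - j_molecular_hz)
    return min(1.0, max(0.0, x))


def ionic_fraction_from_free_energy(delta_g_kj_mol: float,
                                    temperature_k: float) -> float:
    """Two-state Boltzmann fraction of the ionic form.

    delta_g_kj_mol = G(ionic) - G(molecular); negative values favor the ionic form.
    """
    return 1.0 / (1.0 + math.exp(delta_g_kj_mol / (R_KJ_PER_MOL_K * temperature_k)))


# Hypothetical inputs: an averaged coupling of 45 Hz and a free energy
# difference of -2 kJ/mol at 130 K.
print(ionic_fraction_from_coupling(45.0))
print(ionic_fraction_from_free_energy(-2.0, 130.0))
```

Matching the two fractions as a function of the field strength would give the field that best represents the experimental conditions, subject to the roughness of both estimates noted in the text.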

#### *2.3. The Gas-Phase Proton Affinities*

For the further discussion of the obtained results, we will use the values of the gas-phase proton affinities (PA). These values are listed in Table 4 for a number of selected proton acceptors.

**Table 4.** Gas-phase proton affinities of selected proton acceptors.

| Acceptor | PA, kJ/mol |
|---|---|
| pyridine | 936 |
| collidine | 988 |
| benzoate | 1421 |
| 2-nitrobenzoate | 1382 |
| 3,5-dichlorobenzoate | 1379 |
| 4-nitrophenolate | 1354 |
| formate | 1431 |
| acetate | 1447 |
| fluoride | 1547 |
| F<sup>−</sup>···HCF3 | 1429 |

#### **3. Discussion**

Figure 5 shows the lower limits of the external electric field that simulates the effect of CDF3/CDClF2 on **1**–**6** under the PCM (Figure 5a) and gas-phase (Figure 5b) approximations as a function of the p*K*a of the proton donor. The p*K*a of pyridine is 5.32 [46]; other values are listed in Tables 1 and 3. For **2**–**6**, the strength of the field required to transfer the proton to the base correlates with the strength of the acids in both approaches. As expected, **1** deviates strongly from these correlations. The energy minima of the two tautomeric forms of **1** are equal at zero field. The values shown for **1** (Figure 5) correspond to the case when this double-well potential energy curve becomes a single-well one (Figure 3). However, in contrast to **2**–**6**, proton tautomerism in **1** remains fast on the NMR time scale. Thus, the physical meanings of the values reported here for **2**–**6** and **1** are different. What is important is that (i) the order of magnitude of the electric field that simulates the effect of CDF3/CDClF2 on **1**, **2**–**6**, and pyridine···HF···(HCF3)n [42] is the same and (ii) its effect on H-bond geometry correlates with the proton-donating power of the involved acids. The former means that the AuF approach is appropriate for a qualitative description of solute–solvent interactions. The latter suggests that it should be possible to predict the effect of a given solvent on the geometry of a given H-bonded complex. What is the most reliable representation of the correlation between the expected strength of the external electric field and the chemical properties of the involved acids and bases?

**Figure 5.** The lower limit of the external electric field that simulates the effect of CDF3/CDClF2 on **2**, **3**, **5** (black), **4**, **6** (red), and **1** (blue) under the PCM (**a**) and gas-phase (**b**) approximations as a function of the p*K*a of the proton donor.

The use of p*K*a as a measure of an acid's proton-donating power in non-aqueous solutions introduces an error into the correlation. The reason is that the p*K*a depends on solvation in water, which is a very specific solvent [53–55]. The p*K*a's of ionizable groups in a non-aqueous environment can be estimated theoretically [56]. However, such calculations are quite demanding. Alternatively, they can be empirically corrected to the solvent in question [57]. The easiest way to estimate the proton-donating and proton-accepting powers is to calculate the gas-phase proton affinity (PA), Table 4 [58,59]. These values are very close to available experimental data for pyridine (930 kJ/mol) [60,61], collidine (980 kJ/mol) [61], benzoate (1422 kJ/mol) and 2-nitrobenzoate (1383 kJ/mol) [62], formate (1445 kJ/mol) and acetate (1456 kJ/mol) [63], and fluoride (1550 kJ/mol) [64].

Figure 6 demonstrates the lower limits of the external electric field that simulates the effect of CDF3/CDClF2 on **2**–**6** under the PCM (Figure 6a) and gas-phase (Figure 6b) approximations as a function of the PA of the involved conjugate bases. We are aware that the use of the PCM approximation perturbs such correlations due to its dependence on the size of the molecular complex under study [58]. Therefore, we intend to use the gas-phase approximation. The analytical expression for the correlation shown in Figure 6b is:

$$F^{\mathrm{gas}}\,[\text{in a.u.}] = \{(0.55 \pm 0.07) \cdot PA\,[\text{in kJ/mol}] - (700 \pm 100)\} \cdot 10^{-4}.$$

**Figure 6.** The lower limit of the external electric field that simulates the effect of CDF3/CDClF2 on **2**, **3**, **5** (black) and **4**, **6** (red) under the PCM (**a**) and gas-phase (**b**) approximations as a function of the PA of the conjugate bases of the involved acids.
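The linear gas-phase correlation above is easy to evaluate. The following Python sketch uses only the central values of the fitted coefficients (0.55 and 700) together with the PA values of Table 4. The function and variable names are illustrative. Given the stated uncertainties of the coefficients, the results should be read as rough estimates.

```python
def field_from_pa_linear(pa_kj_mol: float,
                         slope: float = 0.55,
                         offset: float = 700.0) -> float:
    """Lower-limit gas-phase field (a.u.) from the linear PA correlation.

    Uses the central values of the fitted coefficients; the quoted
    uncertainties (+/- 0.07 and +/- 100) are not propagated here.
    """
    return (slope * pa_kj_mol - offset) * 1e-4


# PA of the conjugate bases of the acids in complexes 2-6 (Table 4, kJ/mol).
PA_CONJUGATE_BASE = {
    "2-nitrobenzoate": 1382.0,
    "3,5-dichlorobenzoate": 1379.0,
    "formate": 1431.0,
    "benzoate": 1421.0,
    "acetate": 1447.0,
}

for base, pa in PA_CONJUGATE_BASE.items():
    print(f"{base:22s} PA = {pa:6.0f} kJ/mol  F = {field_from_pa_linear(pa):.4f} a.u.")
```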

This correlation can be generalized by replacing the PA of the conjugate bases with a difference between the PA's of a proton donor (conjugate base) and an acceptor: Δ*PA* = *PAdonor* − *PAacceptor*. The analytical expression for this final correlation as shown in Figure 7 is:

$$F^{\mathrm{gas}}\,[\text{in a.u.}] = 1.3 \cdot 10^{-4} \cdot \left\{ \exp\left(0.009 \cdot \Delta PA\,[\text{in kJ/mol}]\right) - 1\right\}.$$

**Figure 7.** A functional dependence between the lower limit of the external electric field at which the energy minima of the Acid-H···Base and Acid−···[H-Base]<sup>+</sup> forms of an H-bonded complex in CDF3/CDClF2 are equal and the difference between the PAs of a proton donor and an acceptor, Δ*PA*.
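The exponential correlation of Figure 7 can be evaluated in the same way. The Python sketch below implements the expression as written, with the coefficients 1.3·10⁻⁴ and 0.009 taken from the equation. The sample Δ*PA* values are arbitrary illustrations rather than data from this work.

```python
import math


def field_from_delta_pa(delta_pa_kj_mol: float,
                        prefactor: float = 1.3e-4,
                        rate: float = 0.009) -> float:
    """Lower-limit gas-phase field (a.u.) as a function of Delta PA (kJ/mol).

    F = prefactor * (exp(rate * Delta PA) - 1), so F vanishes at Delta PA = 0.
    """
    return prefactor * (math.exp(rate * delta_pa_kj_mol) - 1.0)


# Arbitrary sample values of Delta PA (kJ/mol), chosen only for illustration.
for delta_pa in (0.0, 100.0, 400.0, 500.0):
    print(f"Delta PA = {delta_pa:5.0f} kJ/mol  F = {field_from_delta_pa(delta_pa):.5f} a.u.")
```

The form with the subtracted unity is what forces the field to vanish for a symmetric donor/acceptor pair, whereas the linear fit in PA has no such constraint.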

Here, *F*<sup>gas</sup> tends to zero as Δ*PA* tends to zero, which is physically correct. Results obtained for **1**, **2**, and **6** provide limiting values for the strengths of the external electric field that simulates the effect of CDF3/CDClF2 on H-bond geometry at 300 K, 200 K, and 100 K, respectively. When the gas-phase approximation is used for the field-strength calculations, these values are about 0.003 a.u., 0.004 a.u., and 0.082 a.u., respectively. Only a part of this field can be associated with solvent polarization and accounted for in the framework of the PCM approach. The effect of this contribution on H-bond geometry is roughly constant and temperature independent. Another part of the field simulates the effect of solute–solvent interactions. Their impact is fluctuating and depends on temperature. Let us estimate the magnitudes of these two contributions.

For **1**, the lower limit of the external electric field estimated under the PCM approximation is about 0.001 a.u. This value can be associated with the effect of solute–solvent interactions. Notice that both pyridines of **1** are affected by these interactions, meaning that 0.001 a.u. reflects the difference between the effects of solvation on the protonated pyridine and the H-bonded one. This value fluctuates faster than 10<sup>3</sup> s<sup>−1</sup> down to 120 K and slower than 10<sup>11</sup> s<sup>−1</sup> up to 290 K [44,45]. As a result, proton exchange within **1** is fast on the NMR and slow on the IR time scales in this temperature range.

Proton tautomerism *acceptor*···*H-donor* ⇌ *[acceptor-H]*<sup>+</sup>···*(donor)*<sup>−</sup> in CDF3/CDClF2 is strongly shifted towards the *[acceptor-H]*<sup>+</sup>···*(donor)*<sup>−</sup> tautomer already at 200 K when the difference between the PAs of the proton donor and the acceptor is smaller than 400 kJ/mol, as in **2**. The larger the difference, the lower the temperature should be. In CDF3/CDClF2, at the lowest experimentally achievable temperature of 100 K, this tautomer dominates completely only when Δ*PA* is smaller than 500 kJ/mol, as in **6**. The lower limits of the external electric field estimated under the PCM approximation required to stabilize the *[acceptor-H]*<sup>+</sup>···*(donor)*<sup>−</sup> tautomers of **2**–**6** in CDF3/CDClF2 vary from 0.0004 a.u. to 0.0027 a.u., Table 3. However, in contrast to **1**, these values fluctuate slowly on the NMR time scale. Tentatively, this field can mostly be associated with solvation of the carbonyl group. The lower the temperature, the more stable the interaction with solvent molecules. For a rough estimate, it can be assumed that the value of this slowly fluctuating field increases from 0.0005 a.u. to 0.0030 a.u. in the temperature range from 200 K to 100 K.
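The Δ*PA* values behind the 400 and 500 kJ/mol thresholds follow directly from Table 4. As a worked illustration, the Python sketch below computes Δ*PA* = *PAdonor* − *PAacceptor* for **2**–**6**, with collidine as the acceptor and the conjugate base of each acid as the donor. The dictionary names are illustrative.

```python
# Gas-phase proton affinities from Table 4 (kJ/mol).
PA = {
    "collidine": 988,
    "2-nitrobenzoate": 1382,
    "3,5-dichlorobenzoate": 1379,
    "formate": 1431,
    "benzoate": 1421,
    "acetate": 1447,
}

# Conjugate base of the acid in each collidine complex (Figure 2).
CONJUGATE_BASE = {
    2: "2-nitrobenzoate",
    3: "3,5-dichlorobenzoate",
    4: "formate",
    5: "benzoate",
    6: "acetate",
}


def delta_pa(donor_conjugate_base: str, acceptor: str) -> int:
    """Delta PA = PA(conjugate base of the donor) - PA(acceptor), in kJ/mol."""
    return PA[donor_conjugate_base] - PA[acceptor]


for number, base in CONJUGATE_BASE.items():
    print(f"complex {number}: Delta PA = {delta_pa(base, 'collidine')} kJ/mol")
```

Complex **2** falls just below 400 kJ/mol and complex **6** below 500 kJ/mol, in line with the limits stated above.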

For the chemically similar CH2Cl2, the lowest experimentally achievable temperature is about 170 K [27]. Thus, when Δ*PA* is smaller than 400 kJ/mol, the *[acceptor-H]*<sup>+</sup>···*(donor)*<sup>−</sup> tautomer should dominate. This estimate can be checked using a complex of 4-nitrophenol with acetate in CD2Cl2 [65]. In this complex, Δ*PA* = *PAacetate* − *PAphenolate* = 93 kJ/mol, Table 4. At 173 K, the *phenol*···*(acetate)*<sup>−</sup> form of the complex dominated. Ab initio molecular dynamics demonstrated that this form was stabilized by interactions of the carbonyl group with solvent molecules. This interaction is implicitly included in our correlation. Formally speaking, these results support our estimate. However, molecular dynamics showed that the location of the tetraalkylammonium cation was also critically important in this case. This interaction is not covered by our correlation. This effect is absent for charged H-bonded complexes only for very bulky anions [25]. Notice that solvation of the phenolate oxygen will reduce the effect of the carbonyl solvation.

The importance of specific interactions is extreme in the case of a complex of pyridine with hydrogen fluoride. This complex was studied by NMR [23,66] and model calculations [43,67]. The strength of the external electric field, at which the experimental geometry of the N···H and H-F bonds of pyridine···HF in CDF3/CDClF2 is reproduced, depends on the number of the solvent molecules coordinated to the fluorine in model adducts pyridine···HF···(HCF3)n. It is about 0.017 a.u. for pyridine···HF, 0.010 a.u. for pyridine···HF···HCF3, and 0.006 a.u. for pyridine···HF···(HCF3)2. For the first of these adducts, Δ*PA* = 611 kJ/mol, Table 4. For pyridine···HF···HCF3, the structure of the proton donor is HF···HCF3 and Δ*PA* = 493 kJ/mol. It is not clear how to estimate the PA of F<sup>−</sup>···(HCF3)2 because the structure of such a composite donor critically depends on its protonation state. In any case, it should be smaller than 500 kJ/mol, which explains the near central location of the mobile proton between the nitrogen and fluorine atoms as observed in experiments. Thus, also for this complex, our qualitative analysis agrees with high-level molecular dynamics simulations [33].

#### **4. Materials and Methods**

The Gaussian 09.D.01 program package was used [68]. If not stated otherwise, geometry optimizations were done at the ωB97XD/def2tzvpp and ωB97XD/def2svp approximations for **1** and **2**–**6**, respectively [69,70]. The identity of minima was confirmed by the absence of imaginary vibrational frequencies. The default SCRF=PCM method has been used to construct the solute cavity. The parameters for SMD calculations were adapted from the Minnesota Solvent Descriptor Database: eps = 8.93, epsinf = 1.4242, H-bond acidity = 0.1, H-bond basicity = 0.05, surface tension at interface = 39.15, carbon aromaticity = 0.0, electronegative halogenicity = 0.667 [71]. Although the SMD model was parametrized for the Minnesota functionals family, for the qualitative analysis presented in this work, we decided to use the same functional for all types of calculations.

The external electric field was added to the calculations using the keyword Field. The C2 symmetry axis of pyridine or collidine was fixed along the direction of the field using the keyword Z-Matrix. The electric dipole field in Gaussian is directed from the negative to the positive potential, which is opposite to the conventional direction of the electric field.
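For readers who want to reproduce such calculations, the field strengths quoted in this work have to be translated into the integer argument of the Field keyword. The Python helper below is a sketch based on our understanding that Gaussian reads `Field=M±N` as a field of N × 0.0001 a.u. along axis M; this should be checked against the Gaussian manual for the version in use. The helper name is hypothetical, and the choice of sign is left to the user in view of the direction convention noted above.

```python
# 1 a.u. of electric field strength expressed in V/Angstrom.
AU_TO_V_PER_ANGSTROM = 51.4220674763


def gaussian_field_keyword(strength_au: float, axis: str = "Z") -> str:
    """Build a 'Field=<axis><sign><N>' token, assuming N counts units of 0.0001 a.u.

    The sign of strength_au is passed through unchanged; whether '+' or '-'
    matches the intended physical direction must be decided by the user.
    """
    if axis not in ("X", "Y", "Z"):
        raise ValueError("axis must be 'X', 'Y' or 'Z'")
    n = round(strength_au / 1e-4)
    sign = "+" if n >= 0 else "-"
    return f"Field={axis}{sign}{abs(n)}"


# Field strengths that appear in the text (a.u.).
for strength in (0.001, 0.002, 0.0027, 0.003, 0.005):
    print(f"{strength:.4f} a.u. = {strength * AU_TO_V_PER_ANGSTROM:.4f} V/A  ->  "
          f"{gaussian_field_keyword(strength)}")
```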

The gas-phase proton affinities (PA) were calculated as follows:

$$PA = \Delta H^{298}(B) + 5RT/2 - \Delta H^{298}(BH).$$

Here, Δ*H*298(*B*) and Δ*H*298(*BH*) stand for the sums of the electronic and thermal enthalpies of a base and its conjugate acid or the conjugate base of an acid and the acid at 298 K. The enthalpies were estimated at the B3LYP/6-311++g(3df,2p) level. This level provides a reasonable description of the structure and harmonic frequencies of the neutral and charged H-bonded systems in the gas phase [72]. It is also sufficient to obtain correct values of enthalpies [73].
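The proton affinity expression above can be applied directly to the enthalpies printed by a quantum-chemical program. The Python sketch below assumes that both enthalpies are given in hartree and that the temperature is 298.15 K. The numerical enthalpies in the usage example are hypothetical placeholders, not results of this work.

```python
HARTREE_TO_KJ_PER_MOL = 2625.4996   # 1 hartree in kJ/mol
R_KJ_PER_MOL_K = 8.314462618e-3     # gas constant in kJ/(mol K)


def proton_affinity(h_base_hartree: float,
                    h_protonated_hartree: float,
                    temperature_k: float = 298.15) -> float:
    """PA = H(B) + 5RT/2 - H(BH), returned in kJ/mol.

    h_base_hartree and h_protonated_hartree are the sums of electronic and
    thermal enthalpies of the base and of its protonated form, in hartree.
    The 5RT/2 term is the enthalpy of the free proton.
    """
    delta_h = (h_base_hartree - h_protonated_hartree) * HARTREE_TO_KJ_PER_MOL
    return delta_h + 2.5 * R_KJ_PER_MOL_K * temperature_k


# HYPOTHETICAL enthalpies (hartree), chosen only to exercise the function.
h_b = -248.000000
h_bh = -248.360000
print(f"PA = {proton_affinity(h_b, h_bh):.1f} kJ/mol")
```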

#### **5. Conclusions**

The gas-phase proton affinity (PA) of conjugate bases is larger than that of most neutral bases. Proton transfer in condensed matter requires either an H-bond network [74] or solvation [54,75]. In specific cases, small alterations can cause pronounced changes [76]. Therefore, neither gas-phase nor PCM calculations can reproduce the geometry of an acid-base complex in condensed matter if its environment is ignored. In contrast, useful qualitative data can be obtained using the Adduct under Field (AuF) approach. The weak yet multiple interactions between the acid-base complex and solvent molecules influence the electron density in the acid and base, which affects the position of the mobile proton. These changes can be simulated using a fictitious external electric field. The macroscopic electric field can be either accounted for by the PCM approach or included in the strength of the field. The strength of the solute–solvent interactions fluctuates and its effective magnitude depends on temperature. In this paper, we report estimates of the strength of the fictitious field that simulates the solvation effect of CDF3/CDClF2 on homo- and heterogeneous acid-base complexes in the temperature range from 300 K to 100 K. With some limitations, the obtained results can be extended to the chemically similar CHCl3 and CH2Cl2. The computational simplicity of the AuF approach could lend itself to wide application including large molecular systems [77–80].

In the presence of the external electric field, the potential energy curve of a proton transfer within the proton-bound homodimers of pyridines changes from a symmetric double-well potential to an asymmetric single-well one. In the temperature range 120–290 K, the fluctuation rate of this field is between 10<sup>3</sup> and 10<sup>11</sup> s<sup>−1</sup>, which defines the rate of proton exchange within the homodimers. The lower limits of this field are reported above. For [FHF]<sup>−</sup> [81,82] or [H2n+1On]<sup>+</sup> [83] proton-bound homodimers, the same strength of the field can be an acceptable approximation only when several solvent molecules are explicitly included in the calculations.

Below 200 K, solvent effects on heterogeneous acid-base complexes can be simulated using a quasi-constant fictitious field. For complexes of pyridine with carboxylic acids, the strength of this field and its temperature dependence are reported above. For complexes of pyridine with alcohols and phenols, the strength will be smaller because interaction of the carbonyl oxygen with solvent molecules increases the proton-donating power of the hydroxylic group of carboxylic acids. When a proton-donating or proton-accepting center is open for a strong interaction with solvent molecules, these molecules should be included in the model adduct. See, for example, the pyridine···HF···HCF3 and pyridine···HF···(HCF3)2 adducts.

The most important conclusion of this study is that solute–solvent interactions remarkably affect the geometry of acid-base complexes in aprotic solvents even if the active sites of these complexes are not accessible to solvent molecules. As a result, these complexes exhibit proton tautomerism *acceptor*···*H-donor* ⇌ *[acceptor-H]*<sup>+</sup>···*(donor)*<sup>−</sup> over a large temperature range. The rate of this process is often slow on the time scales of electronic excitations and molecular vibrations but fast on the time scale of NMR. Therefore, both tautomers can be observed in the former cases, while exchange-averaged parameters will be obtained in the latter. Only in the presence of moderately strong solvation effects, for example, when solvent molecules interact with the proton-donating group, can the lifetime of the *[acceptor-H]*<sup>+</sup>···*(donor)*<sup>−</sup> tautomer become long on the NMR time scale in the temperature range from 200 K to 100 K.

**Author Contributions:** Conceptualization, I.G.S.; methodology, I.G.S. and G.S.D.; data curation, G.S.D.; writing—original draft preparation, I.G.S.; writing—review and editing, G.S.D.; visualization, I.G.S.; supervision, I.G.S. All authors have read and agreed to the published version of the manuscript.

**Funding:** This research was funded by the Russian Foundation of Basic Research (Project 20-03-00231). APC was sponsored by MDPI.

**Acknowledgments:** The authors gratefully acknowledge the Gauss Centre for Supercomputing e.V. (www.gausscentre.eu) for funding this project by providing computing time on the GCS Supercomputer SuperMUC at Leibniz Supercomputing Centre (LRZ, www.lrz.de).

**Conflicts of Interest:** The authors declare no conflict of interest.

#### **References**


**Sample Availability:** Not available.

© 2020 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).

## *Review* **Hydrogen Bond and Other Lewis Acid–Lewis Base Interactions as Preliminary Stages of Chemical Reactions**

#### **Sławomir J. Grabowski 1,2**

<sup>1</sup> Polimero eta Material Aurreratuak: Fisika, Kimika eta Teknologia, Kimika Fakultatea, Euskal Herriko Unibertsitatea UPV/EHU & Donostia International Physics Center (DIPC) PK 1072, 20080 Donostia, Euskadi, Spain; s.grabowski@ikerbasque.org; Tel.: +34-943-018-187

<sup>2</sup> IKERBASQUE, Basque Foundation for Science, 48011 Bilbao, Spain

Academic Editor: Ilya G. Shenderovich Received: 23 September 2020; Accepted: 10 October 2020; Published: 13 October 2020

**Abstract:** Various Lewis acid–Lewis base interactions are discussed as initiating chemical reactions and processes. For example, the hydrogen bond is often a preliminary stage of the proton transfer process, and the tetrel and pnicogen bonds sometimes lead to SN2 reactions. Interactions that are first stages of reactions have numerous characteristics in common: a meaningful electron charge transfer from the Lewis base unit to the Lewis acid is observed, and such interactions possess at least partly covalent character; other features can also be mentioned. The results of different methods and approaches that are applied in numerous studies to describe the character of interactions are presented here. These are, for example, the results of the Quantum Theory of Atoms in Molecules, of the decomposition of the energy of interaction or of the structure-correlation method.

**Keywords:** Lewis acid–Lewis base interactions; hydrogen bond; tetrel bond; pnicogen bond; triel bond; electron charge shifts

#### **1. Introduction**

It is well known that the hydrogen bond plays a crucial role in numerous chemical, physical and biological processes [1,2]. However, other interactions are also important in various processes, particularly biochemical ones [3,4]. It is worth mentioning that the terms interaction and reaction are sometimes even used interchangeably, since interactions often lead to corresponding reactions or at least are the initial steps of chemical reactions. The aim of this study is to display dependencies between the latter phenomena. The interrelations between interactions and reactions, as well as between them and other phenomena and processes, or reasons for the lack of such relations, were discussed in numerous studies, even in very early ones. For example, Lewis has described that "at the recent conference of the Faraday Society (July, 1923) all of those who participated seemed agreed that the average organic molecule is very little polarized, but there were some who believed that polarisation and indeed ionisation precede every reaction" [5].

More recent studies indicate an important role of electron charge shifts (related to the polarisation) in processes corresponding to interactions and then to chemical reactions. For example, Rauk has stated that "all reactions of organic compounds are treated within the framework of generalized Lewis acid - Lewis base theory, their reactivity being governed by the characteristics of the frontier orbitals of the two reactants. All compounds have occupied molecular orbitals and so can donate electrons, that is, act as bases in the Lewis sense. All compounds have empty molecular orbitals and so can accept electrons, that is, act as acids in the Lewis sense" [6].

It was also indicated that the term "noncovalent interactions" is not a proper one since it covers the hydrogen bond and the halogen bond as well as other interactions that often possess characteristics of covalent bonds [7]. The covalent character is often related to the occurrence of polarization and charge transfer processes [8,9]. On the other hand, for very weak interactions ruled mainly by dispersion forces, the electron charge shifts are negligible, if there are any; electrostatic interactions are also not important there. This is why the term Lewis acid–Lewis base interactions seems to be the proper one, since it is related to those interactions where the electron charge shifts occur between Lewis acid and Lewis base units. It also seems that this term excludes very weak interactions, such as those that occur between methane molecules, where dispersion forces are the most important ones and where it is difficult to indicate the Lewis acid and Lewis base centres [7].

Kaplan has concluded in his monograph that "intermolecular interactions are involved in the formation of complicated chemical complexes, such as charge-transfer and hydrogen-bond complexes. Study of the mechanism of elementary chemical reactions is impossible without knowledge of the exchange processes between the translational and electron-vibration energies, which depend on the interaction of particles under collisions. Knowledge of the potential surface, characterizing the mutual trajectories of the reactants, is necessary to obtain the rates of chemical reactions" [10].

This is why the aim of this review is to point out relationships between interactions and reactions: the former phenomena may lead to the latter ones, and thus they initiate numerous structural changes and electron charge redistributions in the species in contact. Of particular interest is which characteristics are possessed by interactions that lead to chemical reactions. It was discussed in a recent review that stronger electrostatic interactions lead to stronger Pauli repulsion, which implies greater electron charge shifts, mainly from the Lewis base unit to the Lewis acid; these phenomena may lead to the chemical reaction [11]. In this review, a few types of interactions are discussed, and the conditions that they must fulfil to lead to chemical reactions are analysed.

#### **2. The Hydrogen Bond as a Preliminary Stage of the Proton Transfer Process**

It has been described in early studies that a fragment of a crystal structure may be treated as "a frozen stage" of a considered chemical reaction. Related fragments that differ in geometry and that are taken from various crystal structures may correspond to the analysed chemical reaction since they reflect structural changes accompanying this reaction [12–14]. This approach is known as the structure-correlation method. It was applied to analyse different reactions such as the nucleophilic addition to a carbonyl group, nucleophilic substitution at tetrahedrally coordinated atoms (SN1 and SN2 reactions), electrocyclic ring closure of polyenes and other chemical processes [12–15].

It is important that the proton transfer (PT) process related to the hydrogen bond (HB) may also be discussed in terms of the structure-correlation method. For example, PT in O-H···O hydrogen bonds was analysed by comparing -C=O···H-O-C- fragments taken from different crystal structures to reconstruct the corresponding reaction path [9,11,16]. For these analyses, high-precision neutron diffraction geometries were taken from the Cambridge Structural Database, CSD [17,18]. The recent CSD release (CSD updates up to May 2020) was applied here to search for the above-mentioned -C=O···H-O-C- fragment with the following search criteria: accurate crystal structures with e.s.d.'s ≤ 0.005 Å, R ≤ 7.5%, error-free structures, without disorder, no polymers and no powder diffraction results. Only neutron diffraction results were taken into account here since they are characterised by precisely determined positions of H-atoms [19], in contrast to the X-ray results, where the refinement of crystal structures is usually based on the spherical approximation of the atomic electron densities that results in the spherical symmetry of atomic scattering factors [20]. One may say that the sample of fragments described above, corresponding to different crystal structures of organic and organometallic compounds, may reflect the reaction path of the following PT process: -C=O···H-O-C- ⇔ -C-O-H···O=C-. In some structures the H-atom is situated at the mid-point of the O···O distance or near this point; therefore -C=O···H<sup>+</sup>···O=C- fragments are also included in the sample. The search found 56 geometrical fragments corresponding to the above-mentioned PT process. A similar search with the same accuracy criteria was performed for fragments where the H-atom is replaced by deuterium, i.e., the -C=O···D-O-C- fragments; in this case only four structures were found.
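The search criteria listed above can be expressed as a simple filter. The following is only a minimal sketch: the `Entry` record, its field names and the three refcodes are hypothetical placeholders introduced for illustration, and the code does not use the actual CSD software or its query interface.

```python
from dataclasses import dataclass


@dataclass
class Entry:
    """Hypothetical summary of one crystal-structure record (not a CSD type)."""
    refcode: str
    neutron: bool       # neutron diffraction determination
    max_esd: float      # largest e.s.d. of the relevant bond lengths, in Å
    r_factor: float     # R factor, in percent
    has_errors: bool
    disordered: bool
    polymeric: bool
    powder: bool


def passes_criteria(e: Entry) -> bool:
    """Apply the accuracy criteria quoted in the text."""
    return (
        e.neutron
        and e.max_esd <= 0.005
        and e.r_factor <= 7.5
        and not e.has_errors
        and not e.disordered
        and not e.polymeric
        and not e.powder
    )


# Made-up records, used only to exercise the filter.
entries = [
    Entry("HYPOTH01", True, 0.003, 4.2, False, False, False, False),
    Entry("HYPOTH02", False, 0.002, 3.1, False, False, False, False),  # X-ray
    Entry("HYPOTH03", True, 0.008, 6.0, False, False, False, False),   # e.s.d. too large
]

kept = [e.refcode for e in entries if passes_criteria(e)]
print(kept)
```

Only the first placeholder record satisfies all criteria; the second is rejected because it is not a neutron diffraction result and the third because its e.s.d. exceeds 0.005 Å.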

Figure 1 presents the PT reaction path based on the two CSD searches described above; this is the relationship between the Δr parameter and the O···O distance. The same relationships were discussed before [9,11,16] but they were based on earlier CSD updates. The Δr parameter is the distance of the H-atom of the O-H···O bridge from the O···O mid-point. For the linear O-H(D)···O systems the O···O distance may be expressed as the rOH + rH···O sum and the Δr parameter as the (rH···O − rOH)/2 term. The rOH and rH···O values correspond to the O-H bond length and the H···O distance, respectively. The points of Figure 1, which correspond to fragments of crystal structures, may be considered as positions of the proton in the PT process. The results of this figure are symmetrised around Δr = 0; this symmetrisation corresponds to the equivalency of systems during the PT reaction because the homonuclear O-H···O hydrogen bond is discussed here. The "points" in the middle of the O···O distance may be considered as the transition state of the proton transfer reaction. These are strongly elongated O-H bonds and they are observed for O···O distances of about 2.4–2.5 Å. For long O···O distances the H-atoms are located far from the mid-points; they are instead situated close to one of the oxygen centres.

**Figure 1.** The dependence between Δr (Å), the displacement of the proton (or deuteron) position from the O···O mid-point, and the O···O distance (Å), for the O-H···O systems (black circles) and the O-D···O ones (white circles). The broken line corresponds to the bond number conservation rule (Equation (1)).

The broken line of Figure 1 corresponds to the relationship expressing the bond order (number) conservation rule (Equation (1)) [12,13,21].

$$\exp\left(\frac{\Delta r_i}{c}\right) + \exp\left(\frac{\Delta r_j}{c}\right) = 1 \tag{1}$$

The Δr*<sub>i</sub>* and Δr*<sub>j</sub>* terms in the above equation correspond to the (r0 − rOH) and (r0 − rH···O) differences; r0 is the typical single O-H bond length not perturbed by any interaction. The bond length of water in the gas phase, equal to 0.957 Å, was chosen here and in other studies [9]. The exponential terms of Equation (1), exp(Δr*<sub>i</sub>*/c), may be treated as the definition of the bond order. However, other names for this term as well as for similar expressions are often applied in various studies; the bond valence [21] or the bond number [22], for example. The constant c for the O-H···O hydrogen bonds is determined from the above exponential expression assuming that for the linear O-H···O system with the H-atom located at the O···O mid-point, and the O···O distance of 2.4 Å, the two equivalent H···O distances possess a bond order equal to 0.5; in other words, c = (0.957 − 1.2)/ln(0.5).
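As a worked illustration of Equation (1), the short sketch below evaluates the constant c from the values quoted in the text (r0 = 0.957 Å, mid-point H···O distance of 1.2 Å, bond order 0.5) and then generates one point of the bond order conservation curve for a linear O-H···O system. The function names are chosen here for illustration only, and the linear-bridge assumption is the one stated in the text.

```python
import math

R0 = 0.957    # Å, gas-phase O-H bond length of water (reference value from the text)
R_MID = 1.2   # Å, H···O distance for the proton at the mid-point of a 2.4 Å O···O contact

# c = (r0 - 1.2) / ln(0.5), as given in the text; this evaluates to about 0.351 Å
C = (R0 - R_MID) / math.log(0.5)


def bond_order(r: float, r0: float = R0, c: float = C) -> float:
    """Bond order exp((r0 - r)/c) of an O-H bond or H···O contact of length r (Å)."""
    return math.exp((r0 - r) / c)


def h_o_distance(r_oh: float, r0: float = R0, c: float = C) -> float:
    """H···O distance for which the two bond orders of Equation (1) sum to unity."""
    p = bond_order(r_oh, r0, c)
    if not 0.0 < p < 1.0:
        raise ValueError("r_oh must be longer than r0 for the rule to apply")
    return r0 - c * math.log(1.0 - p)


def path_point(r_oh: float) -> tuple[float, float]:
    """Return (Δr, O···O distance) for a linear O-H···O bridge."""
    r_ho = h_o_distance(r_oh)
    return (r_ho - r_oh) / 2.0, r_oh + r_ho


if __name__ == "__main__":
    print(f"c = {C:.4f} Å")
    for r_oh in (0.98, 1.00, 1.05, 1.10, 1.20):
        delta_r, r_oo = path_point(r_oh)
        print(f"rOH = {r_oh:.2f} Å  Δr = {delta_r:.3f} Å  O···O = {r_oo:.3f} Å")
```

By construction, an O-H bond length of 1.2 Å returns Δr = 0 and an O···O distance of 2.4 Å, i.e., the mid-point position used to fix c.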

The bond order and the bond order conservation rule ideas for the O-H···O hydrogen bonds may be described in the following way. For the single O-H bond in the gas phase, the bond order is equal to unity. If this bond is involved in a hydrogen bond, it is elongated and its bond order decreases. However, this decrease is compensated by the H···O contact. Hence the sum of bond orders (Equation (1)) of the O-H bond and the H···O contact is equal to unity. The stronger the hydrogen bond, the greater the O-H bond elongation and the greater the bond order of the H···O contact, which is accordingly shorter; the latter is also accompanied by a decrease of the O···O distance (Figure 1). The extreme cases of very short O···O distances with the H-atom located at the O···O mid-point for very strong hydrogen bonds correspond to the transition states of the proton transfer reaction.

One can see that the broken line of Figure 1 that was derived from the bond order conservation rule is only approximately in agreement with the neutron diffraction results. Similar disagreement concerning the relationship between the O-H bond length and the H···O distance was explained by the influence of electrostatic forces that are not properly included in the bond order conservation concept [23]. Hence the corrected reference single O-H bond length of 0.925 Å was proposed to take into account the electrostatic contribution and to have better agreement between experimental results and theoretical evaluations [23]. Figure 1 contains also the O-D···O systems; it is pointed out in several studies that the deuteration of the O-H···O systems results in shortening of the O-D bond and lengthening of the O···O and D···O distances in comparison with their non-deuterated counterparts; it is known as the Ubbelohde effect [24]. Figure 1 shows that the deuterated O-D···O systems are approximately in agreement with the broken curve derived from Equation (1). One may conclude that the reaction path presented in Figure 1 shows that the hydrogen bond, especially for the strong O-H···O interactions, may be treated as the initial stage of PT process.

It is worth mentioning that relationships similar to the one presented in Figure 1 were analysed for other hydrogen bonded systems. For example, it was found that the dependence between the q1 and q2 parameters for the N-H···N hydrogen bond geometries taken from experimental NMR and crystal structure results is in agreement with the bond order conservation rule expressed by an equation similar to the one presented above (Equation (1)) [25]. The q1 and q2 parameters are equal to (rH···N − rN-H)/2 and rH···N + rN-H, respectively; rH···N is the H···N distance and rN-H is the N-H bond length in the N-H···N system.

In another study the proton-bound water dimer, H5O2<sup>+</sup> (Zundel cation), was discussed [26]. The relationships between molecular structures of this cation and the <sup>1</sup>H-NMR chemical shifts were presented; low-temperature neutron diffraction results were used for these relationships. The dependence between the q1 = (rH···O − rO-H)/2 and q2 = rH···O + rO-H parameters, which is very similar to the one presented in Figure 1, has also been discussed [26]. The chemical shifts and other NMR parameters were also analysed for the N-H···N [25,27], F-H···N [28] and F-H···F<sup>−</sup> [28] hydrogen bonded systems. The bond order conservation rule [9,12] was compared there with the experimental results and with the theoretical calculations [25,27,28].

One can see that the geometries of hydrogen bonded systems represent various stages of the proton transfer reaction. This is discussed above for different types of hydrogen bonds. As a consequence, the question arises: what are the characteristics of hydrogen bonds that may be treated as the initial stages of the PT process? It was discussed in earlier studies [9,11,29] that strong hydrogen bonds may lead to proton transfer reactions. It is worth recalling here the effects that accompany the A-H···B hydrogen bond formation. It is a Lewis acid–Lewis base interaction; thus a noticeable electron charge shift from the base unit to the acidic one occurs here [7,11]. This is connected with the nB → σAH\* orbital-orbital interaction [8], where nB is the lone electron pair orbital of the B-centre while σAH\* is the antibonding orbital of the A-H σ-bond. If we exclude from our discussion the blue-shifting hydrogen bonds [30], which do not lead to the PT process, then the hydrogen bond formation is connected with the increase of the polarization of the A-H bond, its elongation and consequently its weakening. In extreme cases of strong hydrogen bonds possessing covalent or at least partly covalent character [9] the PT reaction occurs. The terms resulting from the decomposition of the energy of interaction and related to the electron charge shifts are described as those expressing the covalency. The following terms occur in different decomposition schemes: charge transfer, polarization, delocalization or induction term. The name of the term depends on the decomposition scheme applied. The interaction energy contributions differ in physical meaning in different schemes although they similarly express processes related to the electron charge shifts.

The covalent character may also be detected by the Quantum Theory of Atoms in Molecules (QTAIM) approach [31,32]. A value of the electron density at the H···B bond critical point (BCP), ρBCP, of the order of 0.1 au or more, and a negative Laplacian of this electron density, ∇<sup>2</sup>ρBCP, indicate the covalent character of the hydrogen bond. However, it is assumed in numerous studies that even for ∇<sup>2</sup>ρBCP > 0, a negative value of the total electron energy density at the BCP, HBCP, shows the partly covalent character of the interaction [9,33–36].
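The two sign criteria described above can be summarised in a few lines of code. This is a minimal sketch only: the function name, the category labels and the numerical values in the usage example are introduced here for illustration and are not taken from the cited studies. The electron density threshold is deliberately left out, because the text gives it only as an order of magnitude.

```python
def classify_bcp(laplacian: float, h_bcp: float) -> str:
    """Classify an interaction from QTAIM descriptors at the bond critical point.

    laplacian : Laplacian of the electron density at the BCP (au)
    h_bcp     : total electron energy density at the BCP (au)
    """
    if laplacian < 0.0:
        return "covalent"
    if h_bcp < 0.0:
        # Positive Laplacian but negative total electron energy density
        return "partly covalent"
    return "no covalent character indicated"


# Illustrative (made-up) descriptor values, in au
examples = {
    "case A": (-0.35, -0.20),
    "case B": (0.12, -0.02),
    "case C": (0.03, 0.001),
}

for name, (lap, h) in examples.items():
    print(name, "->", classify_bcp(lap, h))
```

Following the text, the Laplacian is tested first, and the sign of HBCP is consulted only when the Laplacian is positive.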

Figure 2 presents the relationship between the hydrogen···Lewis base distance and the HBCP value at the corresponding BCP, for the sample of hydrogen bonds analysed in an earlier study [37]. These results are based on MP2/6-311++G(d,p) calculations. The following complexes linked by hydrogen bonds were discussed there. The complexes connected by charge assisted hydrogen bonds (CAHBs): (FHF)<sup>−</sup>, H2O···H3O<sup>+</sup>, H3O<sup>+</sup>···HCN, OH<sup>−</sup>···H2O, NH3···NH4<sup>+</sup>, C5H5<sup>−</sup>···HF and C5H5<sup>−</sup>···C2H2. The complexes with π-electrons acting as the proton acceptor: these are the two latter CAHB systems as well as the following species: C2H2···HF, C2H4···HF, C6H6···HF, (C2H2)2 (T-shaped dimer), C2H4···C2H2, C6H6···C2H2, C6H6···CH4, C6H6···CHCl3, C2H2···CH4 and C2H2···CHCl3. There is the sub-sample of complexes linked by the C-H···B hydrogen bonds: F3CH···NCCH3, H3CH···NCCH3, HCCH···NCCH3, F3CH···OCH2, H3CH···OCH2, HCCH···OCH2, H3CH···SH2, HCCH···SH2 and HCCH···S(CH3)2. The other hydrogen bonds analysed in the above-mentioned study [37] may be classified as moderate or strong ones; these are interactions in the following complexes: (C6H5COOH)2, (CH3COOH)2, (HCOOH)2, (HCONH)2, (HCSNH)2, (H2O)2 (trans-linear dimer), H2O···HF and H2CO···HF. The hydrogen bonds are divided into three groups in Figure 2: C-H···π (open circles; the C5H5<sup>−</sup>···HF complex with the F-H proton donating bond is also included there), C-H···B (black squares) and the remaining ones, among them the CAHB systems (black circles). The hydrogen···Lewis base distance is understood here in the following way: it is the H···B distance for the 3c–4e (three centre–four electron [8]) A-H···B hydrogen bonds. For the C-H···π interactions this is the distance between the H-atom of the Lewis acid unit and the carbon atom or the bond critical point of the CC bond of the Lewis base [37]. The latter depends on the kind of bond path characterizing the intermolecular link [37].
Figure 2 shows that for stronger interactions, characterised by shorter distances between the Lewis acid and Lewis base units, the covalent character is revealed, as expressed by the negative HBCP values; the corresponding systems may be treated as potential preliminary stages of PT processes. The weaker hydrogen bonds are characterised by longer distances between these units; these are mainly the C-H···π and C-H···B systems.

**Figure 2.** The dependence between the H···Lewis base distance (between the H-atom of the proton donor and the centre of the Lewis base unit, in Å) and the total electron energy density at the BCP, HBCP, for hydrogen bonded systems. The Lewis base centre is an atom for 3c–4e A-H···B hydrogen bonds (black squares for C-H···B hydrogen bonds and black circles for other 3c–4e systems) and it is the BCP of the CC bond or the C-atom in the Lewis base unit for the A-H···π hydrogen bonds (white circles).

#### **3. The Case of Halogen Bonds**

There are several interactions that are treated in numerous studies as hydrogen bond counterparts; this particularly concerns the so-called σ-hole and π-hole bonds [38–42]. The σ-hole is a region of electron charge depletion at a centre, located approximately in the direction from an atom bonded to this centre, along the extension of this bond [38,39]. On the other hand, the π-hole concerns a centre in a planar molecule or a planar molecular fragment [41,42]; the triel centres, such as the boron centre in trihalides and trihydrides, are examples of such a situation [43,44]. The electron charge depletion at the σ-holes and π-holes often leads to positive electrostatic potential, EP, in these regions; hence they often act as Lewis acid centres.

The halogen bond, XB, is a case of an interaction where the σ-hole at the halogen centre, X (F, Cl, Br, I or At), i.e., the Lewis acid site, interacts with the Lewis base centre [38,39]. This is why this interaction is often considered as the hydrogen bond counterpart; these interactions are compared in numerous studies [11,40,45–47]. However, there is the question whether XB, similarly to the hydrogen bond, may be considered as a preliminary stage of a chemical reaction. The PT process sometimes follows the hydrogen bond formation. Does a similar transfer of a halogen atom occur? Let us briefly discuss studies, particularly the recent ones, on the entities with a halonium cation that links two Lewis base centres. Recent studies on this topic address two challenges. The first one concerns finding Lewis base–halonium ion–Lewis base arrangements with an asymmetrically located halonium cation [48]. They are designated later here as [LbXLb]<sup>+</sup>, where in further discussions Lb is replaced by the specific atomic centre while X marks the halogen. The second challenge concerns finding the [LbXLb]<sup>+</sup> arrangement with fluorine, X = F, situated between the Lewis base sides, since mostly such systems with heavier halogens are known [49]. As concerns the latter challenge, the generation of a symmetrical fluoronium ion in solution was discussed; its existence was evidenced indirectly [50]. These experimental studies were supported by calculations performed at various levels, and all levels applied confirmed the existence of this ion; for example, the B3LYP/6-311++G(d,p) results show equal F···C distances of 1.6 Å in the [CFC]<sup>+</sup> arrangement [50]. It was proved in numerous earlier and recent studies that the heavier halogen atoms (Cl, Br, I) may be engaged in a hypervalent link in solution possessing a formally positive charge. Such a link was usually doubted for the fluorine centre; the studies of Lectka and co-workers discussed here [49,50] are related to this challenge. However, the occurrence of the fluoronium ion as a short-lived reaction intermediate was evidenced only indirectly [50]. It was also discussed that this ion is formed in a solvolysis process from a precursor molecule, probably according to the SN1 reaction through the fluorine centre [50,51]. The crystal structure of the fluoronium ion precursor was determined [50]; the structure of this ion taken from the Cambridge Structural Database, CSD [17,18], is presented in Figure 3. The direct observation of a similar symmetrical [CFC]<sup>+</sup> ion was described recently [49]; this is a stable species in solution and it was characterised by <sup>19</sup>F, <sup>1</sup>H, and <sup>13</sup>C NMR.

**Figure 3.** The fluoronium ion precursor from the crystal structure [50], BEXNOJ refcode.

Another challenge announced here concerns asymmetric [NXN]<sup>+</sup> systems, since mostly the symmetrical systems containing the halogen, X, are known from various studies [48]. For example, experimental NMR spectroscopic and crystal structure studies as well as theoretical calculations were performed recently on systems containing a halogen cation as well as other cations between nitrogen centres, i.e., the [NZN]<sup>+</sup> systems where Z<sup>+</sup> = H<sup>+</sup>, Li<sup>+</sup>, Na<sup>+</sup>, F<sup>+</sup>, Cl<sup>+</sup>, Br<sup>+</sup>, I<sup>+</sup>, Ag<sup>+</sup> and Au<sup>+</sup> [52]. Two series, one of bis(pyridine) entities and the second one containing the (1,2-bis(pyridin-2-ylethynyl)benzene) structure, were considered [52]. In all cases the symmetrical arrangements are observed; only for Z<sup>+</sup> = H<sup>+</sup> and F<sup>+</sup> do the asymmetric systems occur, and this concerns both above-mentioned series. However, in the case of the fluorine species there is no experimental confirmation of such asymmetry; only theoretical calculations reveal such arrangements. Figures 4 and 5 show fragments of bis(pyridine) crystal structures. In two cases (Figure 4), silver [53] and iodine [53] cations are located in the centre of the [NAgN]<sup>+</sup> and [NIN]<sup>+</sup> systems, respectively. In the case of the bis(pyridine)proton structure the asymmetric [NHN]<sup>+</sup> arrangements are observed [54] (see Figure 5).

**Figure 4.** The fragment of the crystal structure of bis(pyridine)-silver(I) nitrate monohydrate, AGPYNO02 refcode (**a**) and di(pyridin-1-yl)iodonium hexafluorophosphate, CICQIQ03 refcode (**b**), reference [53].

**Figure 5.** The fragment of the crystal structure of pyridinium dibromo-chloro-pyridine-zinc(II) pyridine solvate, PYCBZN01 refcode, reference [54].

It is worth mentioning that the potential energy curves were analysed theoretically for the bis(pyridine) series of complexes [52]; the electronic energy of the system was plotted versus the displacement of the Z<sup>+</sup> cation from the N···N mid-point. The single-well potential energy curve is observed for all systems except Z<sup>+</sup> = H<sup>+</sup> and F<sup>+</sup>, where the symmetrical double-minimum curve occurs [52]. Since a similar situation occurs for the series of 1,2-bis(pyridin-2-ylethynyl)benzene structures, one can expect that the potential energy curve shape depends on the cation. However, this shape is a result of numerous factors, not only the kind of cation. It depends on the type of complex and on the environment that results from the arrangement of molecules in crystals or from the type of solvent in liquids. For example, the first asymmetric linear silver complexes and the first asymmetric halonium complexes characterised by the [NAgN]<sup>+</sup> and [NIN]<sup>+</sup> arrangements, respectively, were analysed, and they were confirmed by <sup>1</sup>H and <sup>1</sup>H-<sup>15</sup>N HMBC NMR spectroscopy and by X-ray diffraction results in crystals [55]. Thus one may expect that the halonium ion transfer and the transfer of other cations, such as Ag<sup>+</sup>, is possible for some systems. It is worth mentioning that Zundel compared the hydrogen bond with analogous interactions where the proton is replaced by lithium, sodium or potassium cations; transfers similar to the proton transfer were analysed [56].
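The distinction between a single-well and a symmetrical double-minimum curve can be illustrated with a generic quartic model, E(x) = a·x<sup>4</sup> − b·x<sup>2</sup>, where x is the displacement of the cation from the N···N mid-point. This is only a textbook-style sketch under that assumed functional form; the parameters a and b are hypothetical, and the model is not the computed potential of any of the systems in reference [52].

```python
import math


def quartic_profile(a: float, b: float) -> dict:
    """Characterise the model potential E(x) = a*x**4 - b*x**2 (a > 0).

    For b <= 0 the curve has a single minimum at x = 0.
    For b > 0 it has two symmetric minima at x = ±sqrt(b / (2a)),
    separated by a barrier of height b**2 / (4a) at x = 0.
    """
    if a <= 0.0:
        raise ValueError("a must be positive for a bound potential")
    if b <= 0.0:
        return {"shape": "single well", "minima": [0.0], "barrier": 0.0}
    x_min = math.sqrt(b / (2.0 * a))
    barrier = b * b / (4.0 * a)
    return {"shape": "double well", "minima": [-x_min, x_min], "barrier": barrier}


# Hypothetical parameter sets in arbitrary units
print(quartic_profile(a=1.0, b=-0.5))   # cation at the mid-point
print(quartic_profile(a=1.0, b=2.0))    # two equivalent off-centre positions
```

In this picture the barrier height at x = 0 is the quantity that separates the two equivalent asymmetric positions of the cation.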

Let us compare the systems containing Z<sup>+</sup> cations that were discussed above with the hydrogen bonds. The examples presented earlier indicate the asymmetric position of the proton in the [NHN]<sup>+</sup> arrangement for both the 1,2-bis(pyridin-2-ylethynyl)benzene and bis(pyridine) structures. It seems that the asymmetry of the H-atom position results, at least partly, from the asymmetry of the environment in crystals. In another crystal structure containing the protonated homodimer of pyridine, i.e., pyridinium tetrakis(3,5-bis(trifluoromethyl)phenyl)borate pyridine solvate, the asymmetry of the [NHN]<sup>+</sup> arrangement is also observed; however, the asymmetric [NHN]<sup>+</sup> bridges also occur in solution [57]. The NMR signals in solution show fast reversible proton transfer. All results concerning both crystal and solution indicate the asymmetry of the [NHN]<sup>+</sup> contact, in spite of the fact that the N-H···N hydrogen bond is very strong and covalent in nature and only weak anion coordination is observed. Other strong N-H···N hydrogen bonds in derivatives of the proton-bound homodimer of pyridine were analysed in solution, and it was also found that they are asymmetric in spite of being very strong interactions [58].

It is worth noting, however, that hydrogen bonds characterised by a single symmetrical minimum of the potential energy curve are also well known. The [FHF]<sup>−</sup> anion is an example; the symmetrical species is known in the gas phase [59], although in crystal structures this anion is disturbed by packing forces, since the displacement of the proton from the central position is often observed [60]. Similarly, the OH···O hydrogen bonds were analysed, and the conditions that have to be fulfilled for the central position of the proton were discussed [61]. Figure 1 presented earlier shows geometries of numerous OH···O systems, the asymmetric ones as well as those where the proton is very close to the O···O mid-point. Various studies on hydrogen bonds, and particularly on the O-H···O systems, indicate the complex character of the PT process [62]; for example, the potential barrier height should be taken into account in the analysis of this process. The same probably concerns the transfer of other cations, among them halonium ones. However, there are not sufficient experimental and theoretical results, and there is room for more extended investigations in this matter.

#### **4. The Dihydrogen Bond as a Stage of the Molecular Hydrogen Uptake**

The dihydrogen bond, DHB, is a special type of hydrogen bond where the Lewis base centre is a negatively charged hydrogen atom [63–65]. In other words, it is the link between hydrogen atoms of opposite charge, H<sup>+δ</sup>···<sup>−δ</sup>H. The question arises whether the proton transfer process A-H···B ⇔ A<sup>−</sup>···<sup>+</sup>H-B known for the A-H···B hydrogen bonded systems [62] occurs for dihydrogen bonds. This is discussed in detail in the monograph of Bakhmutov, where numerous studies related to this topic are reviewed [65]. It was generalised that the protonation of the hydride species characterised by the negatively charged hydrogen centre may be reversible or not; the latter phenomenon is connected with the molecular hydrogen elimination that may be expressed by the following transformations.

$$\text{A-H}^{+\delta}\cdots{}^{-\delta}\text{H-B} \rightarrow [\text{B(H}_2\text{)}]^{+}\,\text{A}^{-} \rightarrow \text{A}^{-} + \text{H}_2 + \text{B}^{+} \tag{2}$$

It is worth mentioning that the B centre connected with the hydric hydrogen is most often a transition metal; numerous moieties containing molecular hydrogen attached to the transition metal centre are known, and such systems often occur in crystal structures [66,67]. The model F-H<sup>+δ</sup>···<sup>−δ</sup>H-Li complex corresponding to the phenomena expressed by Equation (2) was analysed theoretically early on at the HF/6-31G(d,p) level [68]. It was found that the presence of an external electric field may lead to the formation and elimination of molecular hydrogen, since the proton transfer reaction gives the following products: F<sup>−</sup> + H2 + Li<sup>+</sup>. The reverse process is observed for the system containing a dihydrogen molecule inserted between Lewis acid and Lewis base units, H3B···H2···NH3, since the DHB system H3B<sup>−</sup>-H···H-<sup>+</sup>NH3 is formed in the external electric field [68]. Bakhmutov, in his monograph [65], gives numerous examples of the PT process in DHB systems. The PT reaction and the dihydrogen elimination were discussed for the X-ray crystal structure of *N*-[2-(6-aminopyridyl)]acetamidine cyanoborohydride [69]; the solid state transformation from the dihydrogen bonded LiBH4-triethanolamine system to the covalently bonded material was described in another study [70].

There is an early example of a theoretical study [71] in which the following reaction, leading to the molecular hydrogen uptake through the dihydrogen bond formation followed by the proton transfer, was analysed for a series of complexes of the AlH4<sup>−</sup> anion.

$$[\text{AlH}_4]^{-} + \text{HX} \rightarrow \text{AlH}_3\text{-H}^{-}\cdots\text{HX} \rightarrow [\text{AlH}_3\text{X}]^{-} + \text{H}_2 \tag{3}$$

The MP2/6-311++G(d,p) calculations were performed for this process (Equation (3)) with the HF, HCl, and H2O species acting as the proton donors. The transition states for the transformation from the DHB systems to the products containing the dihydrogen molecule were also calculated, indicating that the PT reactions are energetically possible here; for example, the potential barrier height for the AlH3-H<sup>−</sup>···HCl → [AlH3Cl]<sup>−</sup> + H2 reaction is equal to 6.7 kcal/mol.

In a more recent study [72], the reverse process, the cleavage of molecular hydrogen at the metal centre that leads to the formation of the dihydrogen bonded complex, was analysed: FM + H2 → FH···HM (M = Cu, Ag, Au). The calculations were performed with the aug-cc-pVQZ basis set (aug-cc-pVQZ-PP for metal atoms) and the MP2, CCSD(T) and CASSCF/CASPT2 methods. The systems with the dihydrogen molecule attached to the metal centre correspond to global minima, while the dihydrogen bonded complexes are characterised by higher energies (local energetic minima). Only for the complexes containing the gold centre is the DHB system lower in energy than the system with the H2 molecule, in the CASSCF/CASPT2 and CCSD(T) calculations. For example, at the CCSD(T)/aug-cc-pVQZ(aug-cc-pVQZ-PP) level, the potential barrier height for the FAu + H2 → FH···HAu reaction amounts to 30.9 kcal/mol and the products of the reaction are lower in energy than the reactants by 1.5 kcal/mol.
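To put the two barrier heights quoted in this section on a common footing, the following minimal sketch converts them to kJ/mol and evaluates the Boltzmann factor exp(−E/RT). The temperature of 298.15 K and the use of the bare barrier height (without zero-point, entropic, or tunnelling corrections) are assumptions made here for illustration only; they are not taken from the cited studies.

```python
import math

R_KCAL = 1.987204e-3   # gas constant in kcal/(mol K)
KCAL_TO_KJ = 4.184     # thermochemical calorie

def boltzmann_factor(barrier_kcal_mol, temperature_k=298.15):
    """Return exp(-E/RT) for a barrier height given in kcal/mol."""
    return math.exp(-barrier_kcal_mol / (R_KCAL * temperature_k))

# Barrier heights quoted in the text (kcal/mol)
barriers = {
    "AlH3-H(-)...HCl -> [AlH3Cl](-) + H2": 6.7,
    "FAu + H2 -> FH...HAu": 30.9,
}

for label, e in barriers.items():
    print(f"{label}: {e * KCAL_TO_KJ:.1f} kJ/mol, "
          f"exp(-E/RT) = {boltzmann_factor(e):.1e}")
```

Under these assumptions the two Boltzmann factors differ by many orders of magnitude, which illustrates why the first barrier is described as easily surmountable while the second is substantial.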

#### **5. The Change of Trigonal Planar Triel Configuration into the Tetrahedral One—Triel Bonds**

The triel bond is an interaction between a group 13 element acting as the Lewis acid centre and the electron-rich region of another or the same species [43,44,73]. It is worth noting that the acidic properties of triel atoms are related to the so-called π-holes [41,42], which are regions of electron charge depletion at centres in planar molecules or in planar fragments of molecules. For example, the trivalent triel atoms in trihydrides and trihalides possess a formally empty p-orbital that is perpendicular to the plane of the molecule and capable of acting as the electron acceptor; numerous early [74–77] and more recent studies [43,44,73] on interactions of the above-mentioned species with Lewis base units are known. Special attention should be paid to the studies of Phillips and co-workers, who analysed such complexes from the theoretical and experimental points of view: high level ab initio calculations were performed and the corresponding crystal structures were discussed [78–82]; in particular, the potential energy as a function of the distance between the Lewis acid and Lewis base units was described [78]. It was found that a double minimum occurs for some of the complexes, a deeper one corresponding to shorter distances and a flat, shallow one for the longer distances. The former minimum corresponds to strong and partly covalent interactions, while the latter corresponds to weak interactions dominated by dispersion forces. These findings are in line with other more recent studies [73,83].

It is worth mentioning that for the triel bonds, similarly to the hydrogen bonds, different sub-classes of interactions exist [84]. There are triel bonds with the one-centre electron donor, A-T···B (where T is the triel atom, B is the electron donor and A is any atom connected with the triel centre), and interactions with π-electrons acting as the electron donor, A-T···π, where alkenes and alkynes [85,86] as well as benzene [87] were considered as the Lewis base units. Another sub-class concerns links where the σ-bond electrons play the role of the electron donor, A-T···σ interactions; the complexes of molecular hydrogen acting as the Lewis base unit with boron trihydride and boron trihalides were analysed in an early study [85], and more recently the complexes of molecular hydrogen with boron and aluminium trihydrides and trifluorides were analysed theoretically [88]. Finally, one can mention intramolecular and bifurcated triel bonds that occur in crystal structures [89]. It is well known that such types of hydrogen bonds are common in crystals [1,2,90].

The most interesting question is which structural changes follow the formation of triel bonds. For example, the trivalent triel species described briefly above lose their planar structure when they interact with Lewis bases. Stronger interactions result in greater changes in the geometry of the triel species; in the case of extremely strong interactions, for example with anions, the configuration of the triel centre becomes close to tetrahedral. It may even acquire ideal Td symmetry, as in the BF4<sup>−</sup> anions, which occur very often in crystal structures [91]. Scheme 1 presents an example of the AlCl3···NH3 complex linked by the Al···N triel bond. The α-angle, which may be treated as the parameter of the deformation resulting from complexation, is defined in this scheme. It is the angle between the Al···N line corresponding to the triel bond and one of the Al-Cl bonds. The same angle parameter may be applied to other simple triel species complexes, where it describes the deformation of the Lewis acid unit resulting from the triel bond formation. This angle is equal to 90° if the interaction is very weak and no molecular deformation is detected, which means that the triel Lewis acid species is still planar, or at least close to planarity. The α-angle increases with the strength of the interaction; for the BF4<sup>−</sup> anion, which is the result of the BF3···F<sup>−</sup> interaction, it amounts to 109.5°, as for methane and other species possessing Td symmetry.

**Scheme 1.** The AlCl3···NH3 complex linked by the Al···N triel bond.
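The α-parameter can be evaluated directly from Cartesian coordinates. The sketch below is a minimal illustration; the function name and the idealised coordinates (a triel centre at the origin, a Lewis base on the +z axis, and one substituent placed either in the plane or in an ideal tetrahedral position) are hypothetical and are not taken from the calculations discussed in the text.

```python
import math

def angle_deg(centre, p, q):
    """Angle p-centre-q in degrees from Cartesian coordinates."""
    u = [a - c for a, c in zip(p, centre)]
    v = [a - c for a, c in zip(q, centre)]
    dot = sum(x * y for x, y in zip(u, v))
    norm = math.sqrt(sum(x * x for x in u)) * math.sqrt(sum(y * y for y in v))
    # Clamp against rounding errors before calling acos
    cos_angle = max(-1.0, min(1.0, dot / norm))
    return math.degrees(math.acos(cos_angle))

# Hypothetical idealised geometries (arbitrary length units)
triel = (0.0, 0.0, 0.0)
base = (0.0, 0.0, 2.0)                        # Lewis base centre on the +z axis
planar_sub = (1.0, 0.0, 0.0)                  # substituent of an undistorted planar acid
tetra_sub = (math.sqrt(8) / 3, 0.0, -1 / 3)   # substituent in an ideal tetrahedron

alpha_planar = angle_deg(triel, base, planar_sub)
alpha_tetra = angle_deg(triel, base, tetra_sub)
print(f"planar limit: {alpha_planar:.1f} deg, tetrahedral limit: {alpha_tetra:.1f} deg")
```

The two limiting cases reproduce the values of 90° and 109.5° quoted for very weak and very strong triel bonds, respectively.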

Figure 6 presents the relationship between the triel centre···Lewis base centre distance and the α-angle for simple complexes of boron and aluminium trihydrides and trichlorides. This dependence is based on the MP2/aug-cc-pVTZ results of one recent study [73]. The following Lewis base units were taken into account there: HCN, NH3, N2 and Cl<sup>−</sup>. Three sub-classes of complexes are thus presented in Figure 6: the ionic species, which are complexes with the chloride anion, and the remaining complexes divided into two groups, complexes of aluminium Lewis acid units and complexes of boron compounds. An excellent linear correlation is observed for the neutral complexes of aluminium. For the remaining complexes, only a tendency is observed: stronger interactions (shorter Lewis acid-base distances) are accompanied by greater α-angles. In two cases of ionic systems, BCl4<sup>−</sup> and AlCl4<sup>−</sup>, the ideal Td symmetry (α-angle equal to 109.5°) is, of course, observed.

**Figure 6.** The dependence between the Lewis acid···Lewis base distance, T···B, (Å) and the α-parameter (see Scheme 1) in A-T···B triel bonds (T-triel centre). Black and white circles correspond to the neutral complexes, aluminium and boron species, respectively. Squares correspond to anion complexes.

Thus, the relationship presented in Figure 6 may be treated as the reaction path for the change of the trigonal planar configuration into the tetrahedral structure; a few complexes linked by the triel bond are stable structures characterised by large dissociation energies. It is interesting that these boron adducts, characterised by structures close to tetrahedral, may further interact with electron-rich species according to the SN2 reaction mechanism. Studies related to this topic appear from time to time; for example, the reactions of borine carbonyl with trimethylamine and triethylamine were analysed experimentally early on [92,93], and in a more recent study the SN2 reactions for the Cl<sup>−</sup>···CH3Cl and NH3···H3BNH3 complexes were analysed theoretically [94], the latter one concerning the tetrahedral structure of boron.

#### **6. Tetrel Bonds and the SN2 Reaction**

The tetrel bond is an interaction between the tetrel centre (group 14 element) of the Lewis acid unit and the electron-rich region of the moiety playing the role of the Lewis base [41,42,95–98]. The region of the tetrel centre that is responsible for the acidic (electron accepting) properties is classified as the σ-hole [95], but there are also cases of tetrel centres that possess π-hole regions [98]. If the tetrel centre interacts with the Lewis base through the σ-hole, the corresponding tetrel bond is often a preliminary stage of the SN2 reaction [97]. The tetrel bond, as a type of σ-hole bond, leads to geometrical changes of the interacting units such that the structure of the complex is close to a trigonal bipyramid, especially in the case of strong interactions [97]. Scheme 2 presents an example of species connected by the σ-hole tetrel bond. It is the SiFH3···Cl<sup>−</sup> complex with the Si···Cl intermolecular link and the SiH3 group in the central part of the complex that tends towards planarity. The latter situation is similar to that observed for transition states of SN2 reactions. Hence a parameter similar to that discussed earlier here for the triel bonds (Scheme 1) may be introduced for tetrel σ-hole bonds. It is the α-angle (Scheme 2) between the tetrel···Lewis base link and one of the bonds of the tetrel species. For the example shown in Scheme 2 it is the angle between the Si···Cl line and the Si-H bond. For very strong interactions this angle is equal to 90° or nearly so. On the contrary, for extremely weak interactions that do not disturb the geometry of the Lewis acid unit, the α-angle should be equal to ~70.5°, which corresponds to the ideal tetrahedral structure of a tetrel species that is not perturbed by external forces.

**Scheme 2.** The SiFH3···Cl<sup>−</sup> and NF4<sup>+</sup>···NCH complexes linked by the tetrel bond and pnicogen bond, respectively.
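The weak-interaction limit of the α-angle follows from the tetrahedral bond angle. As a short check, assuming an ideal tetrahedral Lewis acid and a Lewis base located exactly on the extension of one bond (for SiFH3···Cl<sup>−</sup>, on the extension of the F-Si bond, so that the Cl···Si-H angle is the supplement of the F-Si-H angle):

$$\alpha_{\text{weak}} = 180^{\circ} - \arccos\left(-\tfrac{1}{3}\right) = \arccos\left(\tfrac{1}{3}\right) \approx 70.53^{\circ}$$

In the opposite limit, the three remaining substituents become coplanar with the tetrel centre, the Lewis base lies on the normal to this plane, and α = 90°.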

It is interesting that similar characteristics are often observed for charge assisted pnicogen species that interact with Lewis base units [99]. The complexes of the ZH4<sup>+</sup>, ZF4<sup>+</sup> and ZFH3<sup>+</sup> cations (Z = N, P, As) were analysed theoretically, and Z···Lewis base intermolecular links were observed [99] that may be classified as a type of σ-hole bond, i.e., pnicogen bonds. These charge assisted pnicogen bonds also lead to structural changes similar to those occurring for tetrel bonds, and they may also be classified as preliminary stages of SN2 reactions [99]. Scheme 2 presents the NF4<sup>+</sup>···NCH complex connected by the N···N pnicogen bond. One of the nitrogen centres plays here the role of the electron acceptor, while the other N-centre is the electron donor. This situation is very similar to that occurring for the σ-hole tetrel bonds discussed here (see the same Scheme 2 with an example of the tetrel bonded complex).

Figure 7 presents the relationships between the tetrel/pnicogen centre···Lewis base centre distance and the α-angle. Two sub-samples of complexes of tetrel species and one sub-sample of complexes of pnicogen species are presented here. These geometrical results were taken from earlier studies [97,99] where the MP2/aug-cc-pVTZ calculations were performed on the complexes presented in this figure. The sub-samples are as follows. For the one with units linked by the neutral tetrel bond, the ZH4, ZFH3 and ZF4 (Z = C, Si, Ge) species interact with HCN and LiCN acting as Lewis bases (the nitrogen atom is the Lewis base centre here). For the complexes with negatively charge assisted tetrel bonds, the same Lewis acid units as in the former sub-sample interact with the chloride anion. Finally, for the complexes linked by the pnicogen bond, the ZH4<sup>+</sup>, ZFH3<sup>+</sup> and ZF4<sup>+</sup> species (Z = N, P and As) interact with HCN and LiCN through the Z and nitrogen centres.

**Figure 7.** The dependence between the Lewis acid···Lewis base distance, Z···B, (Å) and the α-parameter (see Scheme 2) in A-Z···B tetrel and pnicogen bonds (Z-tetrel or pnicogen centre). Black and white circles correspond to the tetrel complexes linked by charge assisted tetrel bonds and neutral tetrel bonds, respectively. Squares correspond to complexes linked by the pnicogen bond.

Figure 7 shows that for shorter Lewis acid-base distances, corresponding to stronger interactions, the α-parameter (Scheme 2) tends to 90°. This means that the central part of the complex becomes planar, as is observed for the transition state of the SN2 reaction. Second order polynomial dependences between the distance and the α-angle are observed here, with high values of the correlation coefficients. These dependencies may be treated as the corresponding reaction paths of the above-mentioned SN2 reactions. It is very interesting that the pnicogen and tetrel σ-hole bonds considered here (Figure 7) practically do not differ from each other. For both interactions the Lewis acid units possess the tetrahedral structure, and for both the complexation leads to geometrical changes towards the structure of a trigonal bipyramid. Finally, both interactions may be considered as preliminary stages of SN2 reactions.
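A second order polynomial dependence of this kind can be obtained by an ordinary least-squares fit. The sketch below solves the 3×3 normal equations with the standard library only. The data points are synthetic placeholders generated from an assumed parabola so that the fit can be checked; they are not the values from the cited studies [97,99], and the function names are hypothetical.

```python
def fit_quadratic(xs, ys):
    """Least-squares fit y = a*x**2 + b*x + c via the 3x3 normal equations."""
    s = [sum(x ** k for x in xs) for k in range(5)]                   # sums of x^k
    t = [sum(y * x ** k for x, y in zip(xs, ys)) for k in range(3)]   # sums of y*x^k
    # Augmented matrix for the unknowns (c, b, a)
    m = [[s[0], s[1], s[2], t[0]],
         [s[1], s[2], s[3], t[1]],
         [s[2], s[3], s[4], t[2]]]
    n = 3
    # Gauss-Jordan elimination with partial pivoting
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                f = m[r][col] / m[col][col]
                m[r] = [vr - f * vc for vr, vc in zip(m[r], m[col])]
    c, b, a = (m[i][3] / m[i][i] for i in range(n))
    return a, b, c

def r_squared(xs, ys, coeffs):
    """Coefficient of determination of the fitted parabola."""
    a, b, c = coeffs
    mean_y = sum(ys) / len(ys)
    ss_res = sum((y - (a * x * x + b * x + c)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return 1.0 - ss_res / ss_tot

# SYNTHETIC placeholder points from an assumed parabola
# alpha = 5*d**2 - 40*d + 150 (NOT data from the cited studies)
distances = [2.0, 2.4, 2.8, 3.2, 3.6]                     # Z...B distance, Angstrom
alphas = [5 * d * d - 40 * d + 150 for d in distances]    # alpha, degrees

coeffs = fit_quadratic(distances, alphas)
print("a, b, c =", [round(v, 6) for v in coeffs])
print("R^2 =", round(r_squared(distances, alphas, coeffs), 6))
```

With real geometries, the list of distances and α-angles would simply replace the synthetic points, and the resulting R² value would quantify how well the parabola describes each sub-sample.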

One can also mention C-H···M (M designates metal) contacts that often play an important role in catalysis [100]. These interactions are usually classified as attractive and are named agostic interactions in terms of the Dewar-Chatt-Duncanson model [101,102]. However, the term anagostic interactions was introduced [103] for contacts that are sterically enforced in square-planar transition metal d<sup>8</sup> complexes, to distinguish them from attractive agostic interactions. Orbital interaction schemes were presented by Scherer and co-workers for various types of C-H···M interactions [104]; these are: the above-mentioned repulsive anagostic 3c-4e interaction, which is the contact between the hydric hydrogen and the filled M-dz<sup>2</sup> orbital; the attractive 3c-4e hydrogen bond, which is the interaction between the protic H-atom and the filled M-dz<sup>2</sup> orbital; the preagostic attractive 3c-2e interaction (π-back donation); and the σ-agostic 3c-2e attractive interaction (hydric hydrogen–empty M-dz<sup>2</sup> orbital). On the other hand, in another study it was argued that the formation of numerous C-H···M structures is mainly driven by dispersion forces, so that models based on orbital-orbital interaction schemes are not sufficient to describe the nature of these structures [105]. It seems that both explanations, the "orbital based" one and the one considering dispersion forces, are valid. For example, in a recent study on the C-H···Ni contacts in Ni<sup>II</sup> planar isomers, the occurrence of both covalent type charge delocalisation and London dispersion forces was demonstrated [106]. That study is based on experimental results supported by theoretical analyses in which various approaches were applied [106]. However, it is worth noting that these C-H···M interactions, regardless of the mechanism of their formation, lead to a change of the metal centre coordination. In the case of square-planar structures with two additional C-H···M contacts, the metal structure tends towards the octahedral one. Hence the agostic interaction may be considered as a preliminary stage of the process of structural reconstruction.

#### **7. Summary**

Different interactions are discussed in this article: the hydrogen bond and its special type, the dihydrogen bond; the pnicogen and tetrel bonds as representatives of the σ-hole bond; and the triel bond as an example of the π-hole bond. These interactions may be treated as preliminary stages of various reactions and processes: the proton transfer, the release of molecular hydrogen, or the SN2 reaction.

However, it is also very important that intra- and intermolecular connections lead to numerous structural changes of the interacting units. The trigonal planar triel species interacting with Lewis bases tend to achieve the tetrahedral geometry. On the other hand, the tetrahedral tetrel moieties as well as the tetrahedral pnicogen cations, both characterised by the lack of lone electron pairs, tend to attain the structure of the trigonal bipyramid. It is worth mentioning that different structural changes take place for different species of the same element, as in the case of the σ-hole bonds on the one hand and the π-hole bonds on the other. In general, a variety of structural changes accompany the processes of complexation.

Finally, it is worth noting that other interactions not discussed here may also initiate various chemical reactions; there is room for numerous future studies related to this topic. One of the most important topics to be discussed in future studies is the application of the structure-correlation method [12–14]. This method may be summarised as the analysis of geometrical changes during chemical reactions. One can see that Figures 1 and 2 presented in this study express the changes following the proton transfer process, Figure 6 shows the changes related to the transformation of the trigonal planar system into the tetrahedral structure, and Figure 7 shows the reaction paths of the SN2 reactions.

**Funding:** This research was funded by the Spanish Government MINECO/FEDER, grant number PID2019-109555GB-I00 and Eusko Jaurlaritza, grant number IT-1254-19.

**Acknowledgments:** Technical and human support provided by Informatikako Zerbitzu Orokorra - Servicio General de Informática de la Universidad del País Vasco (SGI/IZO-SGIker UPV/EHU), Ministerio de Ciencia e Innovación (MICINN), Gobierno Vasco Eusko Jaurlaritza (GV/EJ), and European Social Fund (ESF) is gratefully acknowledged.

**Conflicts of Interest:** The authors declare no conflict of interest.

#### **References**


© 2020 by the author. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).

## *Review* **Small Molecules, Non-Covalent Interactions, and Confinement**

**Gerd Buntkowsky 1,\*,**† **and Michael Vogel 2,\*,**†


Academic Editor: Ilya Shenderovich Received: 3 June 2020; Accepted: 15 July 2020; Published: 21 July 2020

**Abstract:** This review gives an overview of current trends in the investigation of small guest molecules, confined in neat and functionalized mesoporous silica materials by a combination of solid-state NMR and relaxometry with other physico-chemical techniques. The reported guest molecules are water, small alcohols, and carbonic acids, small aromatic and heteroaromatic molecules, ionic liquids, and surfactants. They are taken as characteristic role-models, which are representatives for the typical classes of organic molecules. It is shown that this combination delivers unique insights into the structure, arrangement, dynamics, guest-host interactions, and the binding sites in these confined systems, and is probably the most powerful analytical technique to probe these systems.

**Keywords:** confinement; solid-state NMR; molecular dynamics; interfaces and surfaces

#### **1. Introduction**

Porous silicates and aluminosilicates include such diverse materials as the well-known microporous zeolites, a group of crystalline aluminosilicates present in daily life, mesoporous silica materials such as the original ordered periodic mesoporous silica (PMS) [1,2], controlled porous glasses, and aerogels. They span a diameter range from fractions of a nanometer to ca. 50 nm and above. Owing to this wide range of well-defined diameters, these systems are among the most versatile solid host-systems for studies of molecules in confinement.

In the present review an overview of confinement studies in PMS materials with a focus on neat and surface modified MCM-41 (Mobil Composition of Matter No. 41) [3] and SBA-15 (Santa Barbara Amorphous) [4,5] as hosts is given. Both MCM-41 and SBA-15 are characterized by well-ordered hexagonal pore arrays but differ in their pore diameter distributions. Owing to their ordered structures, high porosity, high intrinsic surface area, low density, thermal stability, tunable pore sizes, and functional surface groups, PMS were successfully employed in a large variety of applications ranging from gas storage and separation through heterogeneous catalysis to drug delivery (see, e.g., [6–22]).

PMS-type materials are ideal host systems for confinement studies since they combine narrow pore-diameter distributions and large specific surface with good chemical stability, easy handling, and chemical functionalization. They were employed in studies on the structural and dynamic properties of many different confined molecules, including water, alcohols, carbonic acids, protein solutions, and on thermophysical processes, such as freezing and melting points, glass transitions [23–29], or an electrochemical study of local pKa in confinement [30].

Of particular importance here are the periodic mesoporous silica materials MCM-41 [3] and SBA-15 [4] and their many derivatives, which exhibit well-defined hexagonally arranged mesopore structures, and porous glasses such as Vycor [31] or CPG-10–75 [32,33], which exhibit three-dimensional sponge-like structures. These materials opened up new research fields, as they allowed the introduction of larger molecular entities into well-defined pores. The prosperity of this field can be seen from the fact that, currently (March 2020), there are nearly 10,000 references in the Web of Science which have MCM-41 or SBA-15 in their title.

A large part of the interest in these materials stems from the fact that these materials are chemically very stable, because of the strength of the covalent Si-O-Si bonds, and that their surface silanol (Si-OH) groups are very potent reactive groups for chemical modification or functionalization of their surfaces. Thus, it is relatively easy to introduce the necessary functional groups by linker molecules, which serve, e.g., as possible binding sites for the chemical function of interest, such as a catalytic center. Such linker molecules can change the polarity or hydrophilicity of the surface, modify the hydrogen bonding properties of the surface, or add chemical functions, e.g., amide or carboxy functions [11] by binding functional groups such as amino, amide, carboxyl, phosphate, chloride, or peptide functions by post-synthetic grafting. Alternatively, it is also possible to add some of these functions directly during synthesis by co-condensation with appropriate additives [12,34].

A salient prerequisite for all these applications is a deep understanding of how the pores and pore surfaces interact with the confined molecules, which are, e.g., substrates of a catalytic reaction, a drug to be delivered or a fluid mixture which should be separated into its components.

This understanding is only obtainable by a combination of various complementary spectroscopic, thermodynamic, computational, and general physico-chemical characterization techniques, including multi-nuclear variable-temperature solid-state NMR (SSNMR), differential scanning calorimetry (DSC), powder X-ray diffraction (PXRD), small angle scattering (SAXS and SANS), thermogravimetric analysis (TGA) [35], and molecular dynamics (MD) simulations, as is shown by a number of recent papers (see, e.g., [36–47]). While X-ray diffraction techniques like XRD or SAXS reveal the ordered structures of these materials [48–50], nitrogen adsorption is employed to study the specific surface areas and pore diameters [51,52], DSC or TGA are used for the investigation of phase transitions inside the pores, NMR provides insights into the local ordering and dynamics on the molecular level, and computation interprets these results [35,53].

The purpose of this short review is to collect and report recent results on the investigation of guest molecules confined in mesoporous host systems with a particular emphasis on examples where NMR techniques are prominently employed. As the description and theory of the NMR experiments employed for these investigations and their physical background are found in the literature, they are not described here in detail. Instead the reader is directed to a number of recent reviews on these experiments and references therein [54–57]. The same is true for the background of melting and glass transitions in confinement, where the reader is referred to papers [27,41,58,59] and references cited therein. The main advantage in the application of SSNMR techniques, which makes them complementary to diffraction techniques, is the fact that SSNMR works well on disordered systems or materials strongly affected by local impurities or multi-phase materials, on the one hand [39], and that it is able to analyze not only structural but also dynamical processes and in particular phase transitions on the other [35,53]. The drawback of NMR in general, and SSNMR techniques in particular, is their low sensitivity. For this reason microporous materials like zeolites [54,60–63] or mesoporous materials like MCM-41 and SBA-15 derivatives with high specific surfaces are commonly employed as host materials for NMR-confinement studies [64–67]. To battle this drawback, indirect detection methods under MAS, where the X-nucleus of interest is detected via the far more sensitive protons, a technique originally developed by Ishii and Tycko [68], were successfully applied to porous systems by the Pruski group to achieve remarkable sensitivity enhancements [69–72].

A recent alternative to this pure SSNMR technique for sensitivity enhancement is the application of hyperpolarization techniques like Dynamic Nuclear Polarization (DNP) enhanced SSNMR [73–76], which boost the sensitivity of SSNMR by several orders of magnitude [77–84], and in particular its variant SENS (Surface Enhanced NMR Spectroscopy) [85–95], or parahydrogen-induced polarization (PHIP) [96–98], whose potential for surface studies was demonstrated by Hunger et al. [99,100] and Stepanov and coworkers [101], or spin-exchange optical pumping (SEOP) [102,103].

In particular for functionalized systems, SSNMR techniques provided unprecedented details about the interaction of the linker molecules and the surface and its wetness [104–106]. Motokura et al. [107] employed <sup>13</sup>C CP MAS NMR to investigate the catalytic transformation of epoxides under CO2 atmosphere on silica-supported aminopyridinium halides. Gath et al. [108] ascertained the properties of silylated amorphous silica materials. Wang et al. [109] studied a series of different linker molecules tethered on MCM-41 or SBA-15 by <sup>2</sup>H MAS. Kandel et al. [110,111] investigated inhibitory processes in aldol reactions employing amine-functionalized silica supports. Jayanthi et al. [112–114] combined <sup>2</sup>H and <sup>29</sup>Si MAS NMR with MD simulations to study the dependence of the linker-surface interaction on the water concentration and on the temperature for *N*-(2-(triethoxysilyl)propyl)acetamide-d3 grafted onto MCM-41 and *N*-(2-aminoethyl-d4)propanamide grafted onto SBA-15. In a series of seminal papers, the Bluemel group pioneered the application of CPMAS in general, and of HRMAS (high-resolution magic-angle spinning [115]) in particular, a solid-state NMR experiment that employs the partial motional averaging of anisotropies, for the investigation of physisorbed or chemisorbed molecules on surfaces and the characterization of novel porous catalysts [104,105,116–125]. Important contributions were made by the Coperet group, who studied various supported organometallic catalysts by SSNMR (see [86,87,90,92,126–132]), by the Scott group [133–138], who developed a series of novel porous catalytic materials and investigated in detail the factors determining their adsorption and reactivity properties, and by the Pruski [69–72,139–145] and Buntkowsky [146–152] groups, who employed conventional and DNP-enhanced SSNMR for the characterization of immobilized molecules. Another important aspect is that these materials are potential carriers for bioactive molecules, such as amino acids, peptides, or drugs [19,153–155]. Klimavicius et al. investigated silica confined ionic liquids by CPMAS NMR [156].

While the focus of this review is devoted to results obtained in the DFG special research unit FOR1583, there are also short reports about important contributions from outside this consortium. The rest of this review is organized as follows: Section 2 gives an introduction into the preparation and surface modification of the mesoporous host materials. Section 3 discusses the behavior of simple systems and Section 4 the behavior of complex systems, such as binary liquids or crowded solutions inside the confinement. The review is finished by a summary and an outlook into possible future developments of the field.

#### **2. Porous Host Materials**

#### *2.1. Microporous Materials as Hosts*

Owing to their immense importance both in daily life and in technology, zeolites are most probably the best characterized class of porous materials. They are well-ordered framework silicates with the composition (A<sup>+</sup>, E<sup>2+</sup><sub>0.5</sub>)<sub>x</sub>(AlO2)<sub>x</sub>(SiO2)<sub>y</sub> · (H2O)<sub>z</sub> (A<sup>+</sup> = Na<sup>+</sup>, K<sup>+</sup> and E<sup>2+</sup> = Mg<sup>2+</sup>, Ca<sup>2+</sup>) and belong to the family of tectosilicates. A well-known example, found in nature, is faujasite (Na2Ca[Al4Si10O28] · 20 H2O) [157]. Zeolites find applications, e.g., as molecular sieves [158,159], in heterogeneous catalysis [160,161], for gas storage [162], and as ion exchange resins [157]. An extensive overview about this fascinating class of materials is given in the recent special issue on zeolite chemistry

Owing to their narrow pore diameters and importance as catalysts for organic chemistry, most confinement studies employing zeolites use small molecules [163,164]. Typical recent examples are the deuterium NMR studies by Nishchenko et al. [165] and Lalowicz et al. [166,167] who analyzed the dynamics of tert.-butyl alcohol-d9 and methanol-d4 inside zeolites, respectively. Moreover, NMR field-gradient approaches yielded valuable insights into the diffusion of various small molecules in zeolites, including information about diffusion anisotropy, transport resistance at crystal surfaces, and pore connectivities [168–174].
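The general zeolite composition given in this section implies that every AlO2 unit contributes one negative framework charge, which the extra-framework cations must compensate. The following minimal sketch does this bookkeeping with formal oxidation states. The helper name and the example stoichiometry, a faujasite-type unit taken here as Na2Ca[Al4Si10O28], are assumptions chosen for illustration.

```python
# Formal oxidation states used for the charge bookkeeping
FORMAL_CHARGE = {"Na": 1, "K": 1, "Mg": 2, "Ca": 2, "Al": 3, "Si": 4, "O": -2}

def net_charge(composition):
    """Sum of formal charges for a composition given as {element: count}."""
    return sum(FORMAL_CHARGE[el] * n for el, n in composition.items())

# Assumed example stoichiometry (water of hydration is neutral and omitted)
framework = {"Al": 4, "Si": 10, "O": 28}
cations = {"Na": 2, "Ca": 1}

print(net_charge(framework))                        # -4, one charge per Al
print(net_charge(cations))                          # 4
print(net_charge(framework) + net_charge(cations))  # 0
```

The framework charge equals minus the number of aluminium atoms, which is why the cation content in the general formula scales with the same index x as the AlO2 units.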

#### *2.2. Mesoporous Silica Materials as Hosts*

While microporous zeolites are clearly still the technically most important class of porous host materials, their applications are limited to fairly small molecular sizes. For this reason, the development of new classes of materials [3,6–10,16–21,106] with larger and adjustable pore sizes, such as mesoporous silica and mesoporous carbon materials, has gained in importance. Depending on the material, they are characterized by adjustable pore sizes between ca. 10 and 1000 Å and thus close the gap between the microporous and the macroporous regime. Their combination of large specific volume and specific surface area with high thermal stability and low specific weight creates a large application potential in physics, chemistry, pharmacy, polymer science, and related fields. Characteristic examples include applications in gas storage, in catalysis, in separation techniques, as additives to rubbers for tires, and many more [11–15]. In confinement studies, SBA-15-type materials have the advantage of larger pore diameters, but MCM-41-type materials are better suited as models with narrow confinement. Additional merits of the latter are their generally better surface homogeneity and smoother inner surfaces.

#### *2.3. Preparation and Chemical Functionalization of Mesoporous Silica; NMR Characterization*

As mentioned above, these mesoporous silica materials permit the comparatively simple synthesis of surface-functionalized host systems with well-defined, tunable, narrow pore diameters [175–178], employing a synthesis protocol based on Grünberg et al. [106] and Grün et al. [179]. Details of the synthesis and characterization are given in [176,180] and will not be repeated here. The changes in porosity and specific surface area and the modification of the surface sites of the material can be monitored by a combination of nitrogen adsorption (BET and BJH) analysis and 29Si SSNMR spectroscopy. A typical example of such a synthesis is shown in Figure 1, which displays the APTES ((3-aminopropyl)triethoxysilane) functionalization of MCM-41 materials. For this sample, the BET measurements revealed a specific pore volume of 0.77 cm<sup>3</sup>/g, a specific surface area of 1000 m<sup>2</sup>/g, and a pore diameter of 3.6 nm. From the 29Si SSNMR spectra, the changes of the silanol groups during functionalization are determined by the change from Q-groups to T-groups.
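The quoted pore volume and surface area can be cross-checked against the quoted pore diameter with the purely geometric relation d = 4V/S for uniform, open cylinders. The following minimal sketch assumes that all of the measured volume and surface area belong to ideal cylindrical mesopores, which is an idealization and not the BJH analysis used for the sample; the function name is hypothetical.

```python
def cylindrical_pore_diameter_nm(pore_volume_cm3_per_g, surface_area_m2_per_g):
    """Geometric diameter d = 4V/S of uniform, open cylindrical pores, in nm.

    Assumption: the whole pore volume V and surface area S stem from
    ideal cylinders (no external surface, no roughness, no micropores).
    """
    volume_m3_per_g = pore_volume_cm3_per_g * 1e-6    # cm^3/g -> m^3/g
    diameter_m = 4.0 * volume_m3_per_g / surface_area_m2_per_g
    return diameter_m * 1e9                            # m -> nm


# Values quoted in the text for the MCM-41 sample
d_geo = cylindrical_pore_diameter_nm(0.77, 1000.0)
print(f"geometric estimate: {d_geo:.2f} nm (pore-size analysis: 3.6 nm)")
```

The geometric estimate of about 3.1 nm is of the same order as the 3.6 nm obtained from the adsorption analysis. A deviation of this size is not unexpected, since external surface area and surface roughness both enter S and the two approaches rest on different model assumptions.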

**Figure 1.** Upper panel: Different types of silica sites. Lower panel: 29Si CPMAS (10 kHz) spectra of (**a**) neat and (**b**) functionalized silica, showing the appearance of Tn-groups by the functionalization. (adapted from Weigler et al. [181]). (**c**): 1H SSNMR spectrum of a non-dried MCM-41 at 10 kHz (black). Deconvolution (blue), sum of deconvolution (magenta) and assignment of water species to the peaks [36].

At this point it is important to note that freshly prepared samples in general contain a substantial amount of surface-bound water molecules [36,39,182]. Since the latter can strongly influence the outcome of confinement studies, it is generally necessary to check the hydration state of the sample by 1H MAS NMR measurements (the lower right panel of Figure 1 shows a typical example) and to employ special drying protocols for the preparation of "water-free" silica samples (for details see Brodrecht et al. [183]).

Since naturally occurring porous hybrid materials, like the skeleton of diatoms, are based on modified silica materials consisting of silica and silaffins (polyamines) [184–187], the functionalization of mesoporous silica with peptides and peptoids [44,47,113,175–178,188–194] creates controllable, well-defined model systems for the in vitro study of, e.g., biomineralization.

There are two different strategies for obtaining such peptide-functionalized silica materials, namely an activation of the silica by virtue of a linker group, followed by a grafting of the previously synthesized peptide as shown, e.g., in [47,112] or the direct synthesis of the desired peptide inside the pores employing a modification of the standard SPPS protocol [176,177,194].

Figure 2 displays examples of both strategies and, in particular, how the success of the synthesis can be monitored by 13C CPMAS SSNMR spectroscopy. In the first example (Figure 2a), the collagen-model nonapeptide h-(Gly-Pro-Hyp)3-oh is grafted to silica [47]. The intensity reduction of the succinimidyl signal at ca. 15 ppm, which is visible by comparison of a.ii and a.iv, and the peaks for the nonapeptide in the carbonyl region and also in the aliphatic region indicate the successful immobilization of the peptide, which was proven by DNP enhanced natural abundance 15N spectra (not shown). The second example displays the steps of the SPPS inside mesoporous silica for the addition of one amino acid residue to an N-terminus (for details see Brodrecht et al. [175]).

**Figure 2.** (**a**) 13C CPMAS NMR of the immobilization via the grafting approach of the nonapeptide h-[Gly-Pro-Hyp]3-oh on carboxyl functionalized mesoporous silica: (*i*) neat carboxyl functionalized silica support; (*ii*) TSTU (*N*,*N*,*N*′,*N*′-tetramethyl-*O*-(*N*-succinimidyl)uronium tetrafluoroborate) pre-activated silica; (*iii*) free nonapeptide h-[Gly-Pro-Hyp]3-oh; (*iv*) nonapeptide grafted on silica. Note: Signals marked with \* refer to spinning sidebands (adapted from [47]). (**b**) 13C CPMAS NMR of the steps of the in-pore SPPS (solid-phase peptide synthesis): (*i*) functionalized SBA-15; (*ii*) Fmoc-glycine functionalized species; (*iii*) glycine functionalized species; (*iv*) Fmoc-phenylalanine-glycine functionalized species; (*v*) phenylalanine-glycine functionalized species (adapted from Brodrecht et al. [175]).

Although mesoporous silica materials offer a larger range of pore diameters, they are still limited with respect to the accessible confinement sizes. If functionalized pores with larger diameters are desired, they can be created in a hierarchical three-step process, which is sketched in Figure 3a. In the first step, a membrane, such as a polycarbonate foil, is irradiated in a heavy-ion accelerator, creating ion tracks. The polycarbonate material inside these ion tracks is then removed by etching, creating channels through the polycarbonate foil. By selecting the ion dose, the irradiation directions, and the etching time, the number of channels, their dimensionality, and the channel diameters are selected [195]. In the second step, these etched ion channels are coated with silica by atomic layer deposition (ALD). In the third step, they can be functionalized by grafting of linkers, such as APDMS (3-aminopropyldimethyl silane) or APTES, to the silica inside the channels. Owing to their lower specific surface areas, the detailed chemical characterization and monitoring of the surface functionalization by SSNMR is only feasible by means of DNP enhancement, which boosts the SSNMR sensitivity. As a typical example of these experiments, Figure 3b displays the DNP-enhanced 29Si SSNMR spectra of the material. The broad low-field lines around ca. 60–75 ppm prove the formation of the characteristic Tn-groups, which result from the binding of APTES to the silica inside the channels.

**Figure 3.** (**a**): Sketch of the three-step process to synthesize amine functionalized silica coated porous polycarbonate-membranes (see text for details). (**b**): DNP enhanced 29Si CPMAS spectra (i) revealing the characteristic Tn-groups in the tenfold magnification (ii) (adapted from [196]).

#### **3. Simple Liquids in Confinement**

The simplest case of a confined system is a single-component liquid confined inside the pores of a host material, such as silica. In this case the behavior of the liquid is governed by the competition between the interactions among the liquid's molecules and those between the host surface and the liquid's molecules. In addition to the typical interactions between the liquid molecules, such as hydrogen bonding, hydroaffinity, polarity, and aromaticity, there are also steric effects, which influence the dynamics. In the remainder of this section, some characteristic examples of single-component liquids, such as small polar, nonpolar, and aromatic molecules, confined in mesoporous silica are discussed. The competition of these interactions leads to pronounced changes of their phase behavior, in particular when confined in narrow pores, where a large percentage of the molecules is close to, or in contact with, the pore surface. Confinement in a pore in general causes a depression of the melting/freezing point of the confined molecules or, if the pore diameter is too small, suppresses melting altogether and causes a glass transition instead. As a consequence, many substances that are solid in their bulk phase at a given temperature become liquid when confined inside pores.

The situation becomes even more complicated, when molecules are employed as solvents, e.g., of a chemical reagent such as a drug or in filtering processes. In this situation there will be a competition of solvent-surface and solute-surface interactions with solvent-solute and solvent-solvent interactions. In order to be able to understand these complicated systems, it is a prerequisite to understand the behavior and properties of confined simple liquids.

To obtain this understanding, various analytical methods such as DSC, TGA, near-infrared spectroscopy (n-IR), and variable-temperature XRD are combined with manifold NMR methods, including one- and two-dimensional spectroscopy, spin-lattice relaxometry, and correlation function analysis, as well as broadband dielectric spectroscopy (BDS).

#### *3.1. Water inside Mesoporous Silica*

Polar molecules such as water [36,40,197–202], alcohols like methanol [166,167], tert.-butyl alcohol [165] or octanol [203,204] and carboxylic acids, such as isobutyric acid [38,42], can form hydrogen bonds among themselves and also with the surface's silanol groups (Si–OH).

Owing to its ubiquitous presence, its importance as a solvent and its importance for the life-sciences, water is the most interesting molecule for confinement studies. It is commonly used as a green solvent, employed both in technical processes and in medical applications. Moreover, due to its ability to build hydrogen-bonding networks with itself and also with surface-silanol groups and its rich phase-diagram, it is also a fascinating subject for basic scientific investigations.

Various aspects of water confined in mesoporous silica were investigated in a number of studies. An important outcome of these investigations was that the morphology of water inside the pores depends strongly on the pore diameter. For narrow pores a coexistence of two different water phases (a surface layer and nano-droplets or water-clusters) and for larger diameters a single water phase were detected by 1H MAS NMR [36,40] and 2H SSNMR [197,198]. It was feasible to assign different water species confined in mesoporous MCM-41 by virtue of combined 1H and 2H SSNMR experiments [199,200]. The exchange of the two spin species as a function of the hydration level was studied for MCM-41 filled with D2O [201] employing 2D selective soft-hard inversion recovery experiments [205,206], and the results were interpreted using a three-site exchange model. In this model, the highest exchange rate of 300 s<sup>−1</sup> is found between single hydroxyl protons and water protons. Moreover, as the authors did not observe any coalescence of the lines corresponding to the surface-water chemical exchange, they could provide an upper bound (<1000 s<sup>−1</sup>) for this rate.

When the confinement size is reduced, the melting temperature of water decreases, as described by the Gibbs–Thomson relation, until crystallization is fully suppressed [27,65,207]. Thus, severe geometrical restriction provides access to the properties of deeply cooled liquid water, which are of fundamental importance for an understanding of the anomalies of this liquid, but masked by rapid crystallization in the bulk [208]. In particular, it was proposed that the anomalies of water originate in a liquid-liquid critical point in the supercooled regime, which terminates a phase transition between high-density (HDL) and low-density (LDL) liquid forms [209]. Moreover, it was argued that the associated structural modifications have also a dynamical signature, explicitly, that there is a change in the temperature dependence of the structural α relaxation from a non-Arrhenius behavior characteristic for HDL to an Arrhenius behavior typical of LDL. While a number of studies reported such dynamical crossover of confined water, it remains a subject of controversial discussion whether the phenomenon is indicative of a HDL-LDL transition [210–212]. For example, alternative explanations based on confinement effects were given.
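In its simplest form for a cylindrical pore of radius R, the Gibbs–Thomson relation predicts a melting point depression proportional to 1/R. A commonly used modification subtracts the thickness t of a non-freezing interfacial layer, giving ΔT = C/(R − t). The sketch below evaluates this form. The constant `c_gt` and the layer thickness `t_nm` are illustrative placeholders of the order of magnitude discussed for water in silica pores; they are assumptions and are not taken from this review.

```python
T_BULK_K = 273.15  # melting temperature of bulk ice at ambient pressure


def melting_point_depression(pore_radius_nm, c_gt=52.0, t_nm=0.4):
    """Modified Gibbs-Thomson estimate dT = C / (R - t), in K.

    c_gt (K nm) and t_nm (nm) are assumed, illustrative parameters.
    """
    effective_radius = pore_radius_nm - t_nm
    if effective_radius <= 0:
        # In this simple picture no freezable core remains in the pore.
        raise ValueError("pore too narrow: no freezable core in this model")
    return c_gt / effective_radius


for diameter_nm in (2.8, 4.0, 6.0, 10.0):
    dT = melting_point_depression(diameter_nm / 2.0)
    print(f"d = {diameter_nm:4.1f} nm: dT = {dT:5.1f} K, T_m = {T_BULK_K - dT:6.1f} K")
```

The 1/(R − t) form makes the depression grow rapidly in narrow pores, which is the regime in which crystallization is eventually suppressed.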

To tackle this problem, 2H NMR was used to investigate reorientation dynamics of unfreezable water (D2O) in MCM-41 and SBA-15 pores over wide temperature ranges towards the glass transition [45,176,181,183,213–219]. In particular, spin-lattice relaxation, line-shape analysis, and stimulated-echo experiments were combined to ensure broad dynamic ranges, and the pore size was systematically varied to study possible finite-size effects. Figure 4 shows temperature-dependent correlation times τ obtained from 2H NMR approaches to water reorientation in mesoporous silica. Clearly, there is a dynamic crossover near 220 K. Combining these NMR analyses with DSC and BDS studies [215], it turned out that, for a pore diameter of 2.8 nm, the dynamic crossover occurs near the melting temperature *T*<sub>m</sub>, so that liquid and crystalline water fractions with, respectively, faster and slower rotational dynamics coexist inside the pores below this temperature (see Figure 4a). It was concluded that partial crystallization causes the effect, explicitly, that the dynamics of water changes when ice forms and further restricts the accessible pore volume. To test this hypothesis, later work exploited that the melting temperature *T*<sub>m</sub> can be altered when the pore diameter of the MCM-41 and SBA-15 material is varied [213]. It was found that the dynamic crossover occurs near 220 K, independent of the pore diameter (see Figure 4b). Hence, partial crystallization is not the reason for the change in the temperature dependence in the general case. One may be tempted to argue that the lack of a pore-size dependence also excludes finite-size effects as a possible origin. However, this argument does not hold because liquid water forms an interfacial layer, the thickness of which is largely independent of the pore diameter and, hence, the available space between the silica walls and the ice crystallites remains unaltered.

**Figure 4.** Correlation times of water reorientation in MCM-41 and SBA-15: (**a**) Results for H2O and D2O in MCM-41 pores with a diameter of 2.8 nm from BDS, 2H spin-lattice relaxation (SLR), and 2H stimulated-echo (STE) experiments [215]. The dashed line marks the melting temperature *T*<sub>m</sub> of water in these confinements, as obtained from DSC (adapted from [215]). (**b**) Results for H2O and D2O in MCM-41 pores with the indicated diameters from 2H NMR [213] and BDS (stars) [212,215]. The dashed line is an interpolation of the high-temperature data with a Vogel-Fulcher-Tammann relation. The solid line is an Arrhenius fit of the low-temperature results, yielding an activation energy of 0.5 eV (adapted from [213]).

To study the role of water-host interactions, analogous 2H NMR studies were performed for D2O in MCM-41 pores functionalized with APTES (see Section 2.3). It was found that water reorientation in native and functionalized MCM-41 pores is similar (see Figure 5a) [181]. In particular, the temperature-dependent correlation times τ show a crossover from non-Arrhenius behavior above ca. 220 K to an Arrhenius behavior below this temperature in both types of confinements. Moreover, a common activation energy of *E*<sub>a</sub> ≈ 0.5 eV was observed in the low-temperature regime for water in MCM-41 pores with and without APTES functionalization, but also for water in many other confinements, e.g., at protein surfaces [220–223]. Therefore, the term 'universal water relaxation' was coined for the low-temperature process. However, there are ongoing vigorous discussions whether this dynamic process can still be identified with the structural α relaxation of water or with a secondary β relaxation, which has severe consequences for the value of the glass transition temperature *T*<sub>g</sub> of confined water and, possibly, also bulk water (see below) [210,212].

To obtain information about the nature of the low-temperature dynamics of water, it was utilized that 2H stimulated-echo experiments provide access to not only the rates but also the mechanisms of molecular reorientation dynamics [224–226]. In particular, it can be exploited that the angular resolution of the experiment is determined by the length of the evolution time *t*<sub>e</sub> in the stimulated-echo sequence.

**Figure 5.** (**a**) Temperature-dependent correlation times of H2O and D2O reorientation in native MCM-41 (diameter 2.0 nm, red symbols) [214] and in APTES modified MCM-41 (diameter 2.2 nm, blue symbols) [181]. Results from 2H spin-lattice relaxation, line-shape analysis, and stimulated-echo experiments on D2O dynamics and from BDS on H2O dynamics in native MCM-41 [227] and D2O dynamics in modified MCM-41 [181] are shown. For comparison, data from combined NMR and BDS studies on water reorientation in an elastin matrix are included (green crossed circles) [222]. The dotted line is an interpolation of the high-temperature data with a Vogel-Fulcher-Tammann relation. The dashed line is an Arrhenius fit of the low-temperature results, yielding an activation energy of 0.5 eV. (**b**) Evolution-time dependent normalized correlation times τ(*t*<sub>e</sub>)/τ(*t*<sub>e</sub>→0) of D2O reorientation in an APTES modified MCM-41 (pore) [181], in an elastin matrix [220], and in a 2:1 molar D2O/DMSO mixture [228], and of glycerol (GLY) reorientation in MCM-41 (diameter 2.2 nm). The lines are expectations obtained from computer simulations [229]: (dashed) distorted tetrahedral jumps [220] and (solid) isotropic reorientation comprised of rotational jumps about angles of 2° (98%) and 30° (2%) [230].

It was reported that the observations for water in both silica and protein confinements are largely independent of the evolution time [45,181,214,219,220]. For example, the correlation times τ change in neither of these confinements when *t*<sub>e</sub> is extended and, thus, the angular resolution is enhanced (see Figure 5b). This ineffectiveness of geometrical filtering indicated that water reorientation results from jumps about large angles of the order of the tetrahedral angle. Closer analysis implied that the universal low-temperature water reorientation can be described as distorted tetrahedral jumps or, similarly, as quasi-isotropic large-angle jumps [45,181,214,219,220]. However, it remains elusive whether or not the observed rotational motion is coupled to translational diffusion. The existence of such coupling is a prerequisite for the interpretation of the dynamic crossover in terms of altered structural α relaxation in response to a HDL-LDL transition. By contrast, an absence of such coupling implies interpretations based on a crossover from structural α relaxation to localized β relaxation. Moreover, this aspect has major consequences for the nature of the much-debated glass transition of water at *T*<sub>g</sub> ≈ 136 K [231–233]. As the correlation times of the universal low-temperature water dynamics meet expectations for a glassy arrest at this temperature, a diffusive nature of this process entails a structural glass transition, whereas a localized nature implies an orientational glass transition, which is restricted to the rotational degrees of freedom.
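To give a feeling for the numbers involved, the following sketch evaluates how strongly an Arrhenius process with the activation energy of 0.5 eV quoted above slows down between the crossover region near 220 K and 136 K. It assumes a strictly temperature-independent activation energy over the whole range; the function name is hypothetical and no measured correlation times enter.

```python
import math

K_B_EV_PER_K = 8.617333262e-5  # Boltzmann constant in eV/K


def arrhenius_slowdown(t_high_k, t_low_k, e_a_ev=0.5):
    """Ratio tau(t_low) / tau(t_high) for tau = tau0 * exp(Ea / (kB * T)).

    The prefactor tau0 cancels, so only the activation energy matters.
    """
    return math.exp(e_a_ev / K_B_EV_PER_K * (1.0 / t_low_k - 1.0 / t_high_k))


ratio = arrhenius_slowdown(220.0, 136.0)
decades = math.log10(ratio)
print(f"slowdown between 220 K and 136 K: about {decades:.1f} decades")
```

Under this assumption the correlation time grows by roughly seven orders of magnitude on cooling from 220 K to 136 K, which illustrates why the low-temperature process can reach time scales typical of a glassy arrest in that temperature range.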

#### *3.2. Glycerol inside Mesoporous Silica*

When investigating effects of geometrical restriction on liquid dynamics, it is desirable to compare behaviors of confined and bulk molecules over broad temperature ranges. In that respect, work on water has the drawback that its high tendency for crystallization hampers comparisons in the supercooled regime. On the other hand, use of good glass formers allows one to study molecular dynamics of both confined and bulk liquids in wide time windows. Major NMR contributions to this research field are discussed in a previous review article [67]. Therefore, we restrict ourselves to the case of the archetypal glass former glycerol in this contribution.

Figure 6 shows 2H NMR correlation times of bulk and confined glycerol. It is evident that the time scale of glycerol reorientation is unaffected even in narrow MCM-41 pores with diameters of 2–3 nm and at low temperatures near the glass transition. The temperature dependence is well described by the Vogel-Fulcher-Tammann relation typical of molecular glass formers. To arrive at these results, it is, however, necessary to avoid contamination with water by careful drying of the precursor materials [183,217]. Possibly, the hygroscopic nature of mesoporous silica and the high sensitivity of glycerol dynamics to water admixtures offer an explanation for different conclusions relating to confinement effects on the glass transition of glycerol in BDS studies [234,235]. Moreover, 2H NMR stimulated-echo studies showed that MCM-41 confinement does not alter the mechanism for glycerol reorientation (see Figure 5b). Specifically, the evolution-time dependence of the correlation times τ(*t*<sub>e</sub>) of confined glycerol resembles that of bulk glycerol [230] but differs from that of confined water (see Section 3.1). The observed decrease of τ(*t*<sub>e</sub>) indicates that the reorientation process of glycerol is composed of consecutive small-angle jumps; e.g., the data for the bulk liquid were successfully described by an isotropic reorientation model, which assumes that 98% of the rotational jumps occur about an angle of 2° and only 2% of them involve an angle of 30° [230]. On the other hand, 2H NMR correlation functions were found to be more stretched for confined glycerol than for bulk glycerol [217]. Hence, confinement results in higher dynamical heterogeneity, which, most probably, reflects mobility gradients across the pores with slower dynamics at the pore walls than in the pore centers, as commonly observed in simulation studies on confined liquids [236]. Consistent with these results for silica confinement, it was reported that protein matrices leave the rate of glycerol reorientation unaltered but increase its heterogeneity [237,238].
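The different evolution-time dependence for small-angle and large-angle jumps can be made plausible with a textbook estimate. For uncorrelated, isotropically oriented rotational jumps (Anderson-type model), the rank-2 correlation time is τ<sub>2</sub> = τ<sub>jump</sub>/(1 − ⟨P<sub>2</sub>(cos α)⟩), where α is the jump angle. The sketch below evaluates this ratio for the bimodal jump-angle distribution quoted above and for a tetrahedral jump angle. It is a simplified estimate under these stated assumptions and not a reproduction of the simulations in [229,230].

```python
import math


def legendre_p2(cos_angle):
    """Second Legendre polynomial P2(x) = (3x^2 - 1) / 2."""
    return 0.5 * (3.0 * cos_angle ** 2 - 1.0)


def jumps_per_decorrelation(jump_angles_deg, weights):
    """tau_2 / tau_jump for isotropic random jumps (Anderson-type model).

    jump_angles_deg: elementary jump angles in degrees
    weights: relative probabilities of these angles
    """
    total = sum(weights)
    mean_p2 = sum(
        w * legendre_p2(math.cos(math.radians(a)))
        for a, w in zip(jump_angles_deg, weights)
    ) / total
    return 1.0 / (1.0 - mean_p2)


small_angle = jumps_per_decorrelation([2.0, 30.0], [0.98, 0.02])
tetrahedral = jumps_per_decorrelation([109.47], [1.0])
print(f"98% 2 deg + 2% 30 deg: tau_2 / tau_jump = {small_angle:.0f}")
print(f"tetrahedral jumps:     tau_2 / tau_jump = {tetrahedral:.2f}")
```

In the small-angle case about a hundred elementary jumps are needed for full decorrelation, so an experiment with high angular resolution (long evolution time) already senses the first few steps and yields shorter correlation times. For tetrahedral jumps the ratio is about 0.75, i.e., a single jump essentially suffices and enhancing the angular resolution changes little.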

**Figure 6.** Temperature-dependent correlation times of glycerol-d5 in the bulk liquid and in MCM-41 pores with the indicated diameters. Results from 2H NMR (spin-lattice relaxation and stimulated-echo experiments) and BDS are compared. The solid line is an interpolation with a Vogel-Fulcher-Tammann relation.

#### *3.3. Benzene, Biphenyl, and Naphthalene inside Mesoporous Silica*

While polar molecules like water exhibit strong hydrogen bonding interactions with the confinement, aromatic molecules such as benzene [23,37], biphenyl [43] or naphthalene [239] only weakly interact with the surface due to their hydrophobicity and strong π-π-stacking interactions among themselves. Employing a combination of 2H SSNMR and DSC, the phase behavior of benzene confined in mesoporous silica was studied [23,37]. These experiments revealed a drastic lowering of the transition temperatures of the rotor and translational phases of the confined benzene. In particular for narrow pores, a glass-like benzene phase with a broad distribution of activation energies was found [23]. In this glass-like phase the rotational degrees of freedom were strongly decoupled from the translational ones. The comparison of these results with investigations employing a larger-diameter host material revealed that these glass-like phases are formed by roughly three outer molecular layers and that a "normal" crystalline benzene phase, which behaves like bulk benzene, is formed in the pore interior [37].

This interesting behavior prompted the study of the next larger homologues of benzene, namely of biphenyl confined inside narrow (nominal 2.5 nm and 2.9 nm diameter) silylated and non-silylated MCM-41 pores [43] and of naphthalene confined inside narrow (nominal 3.3 nm pore diameter) MCM-41 pores [239]. With respect to confinement studies, a major difference between these two molecules is that biphenyl has an internal rotational degree of freedom, which naphthalene is lacking. The confinement of biphenyl caused a depression of the melting point by ca. 110–120 K from the bulk value of 342.6 K down to 222 K to 229 K (depending on the pore diameter). Moreover, a careful line-shape analysis of the 2H NMR solid-echo spectra measured just below the melting points revealed indications of a pre-melting process in the form of isotropic motions of a fraction of the biphenyl molecules (Figure 7). The best simulation of the spectra was obtained by a two-phase model with a broad distribution of rotational correlation times, resulting from a broad distribution of activation energies for the rotational motion. For the confined naphthalene, an even stronger reduction of the melting point (152 K) compared to the bulk material was found. For the detailed line-shape analysis of the 2H SSNMR spectra, two different models were employed, namely, on the one hand, a two-phase model with a broad distribution of activation energies, similar to that used for benzene and biphenyl, and, on the other hand, a crystal-like jump model employing an octahedral jump geometry. Both models revealed a narrow melting point distribution of the confined naphthalene, indicating a relatively well-ordered structure of the confined naphthalene molecules [239]. These results were interpreted such that the confined naphthalene molecules most probably form a plastically crystalline phase, similar to naphthalene in ball-milled silica [240–243]. The existence of these plastic phases of confined naphthalene was independently confirmed by a combination of DSC, Raman spectroscopy, and PXRD [244].

**Figure 7.** (**a**) Cartoon of biphenyl [43] and (**b**) naphthalene [239] confined in MCM-41. In the case of biphenyl there are two possible rotational modes, namely, the rotation of a single phenyl ring and a molecular rotation. (**c**) Experimental (lower traces) and simulated (upper traces) 2H solid-echo NMR spectra of the melting process of biphenyl-d10 inside silylated MCM-41 (*d*<sub>aver</sub> = 29 Å) at different temperatures [43]. (**d**) Experimental variable-temperature 2H solid-echo NMR spectra of the melting process of naphthalene-d8 inside MCM-41 (*d*<sub>aver</sub> = 33 Å): (i) experiment; (ii) simulated octahedral jump; (iii) two-phase model. All spectra are normalized to equal height (figure reproduced with permission from [176], copyright Walter de Gruyter and Company).

#### *3.4. Pyridine inside Mesoporous Silica*

While the nonpolar aromatic molecules described in the previous section interact only weakly with the silica surface, polar aromatic molecules such as carboxylic acids like benzoic acid [245], or nitrogen-containing heterocycles such as pyridine [246–248], bipyridinyl [249], dimethylaminopyridine [250], or diethyl-2,6-di-tert.-butylaminopyridine [250], interact strongly with surface silanol groups by means of hydrogen bonds. In the case of the pyridine derivatives, the ring nitrogen acts as a hydrogen bond acceptor. Since the 15N chemical shift is a very sensitive monitor of the hydrogen bond strength, these hydrogen bonds can be conveniently monitored by 15N CPMAS NMR [246,250].

In a series of seminal papers [246–248], Shenderovich and coworkers employed a combination of 15N- and 29Si-CPMAS and MAS NMR spectroscopy, line-shape analysis and structural modelling to describe the silica-surface morphology of different types of mesoporous silica on the atomic scale. While the 29Si MAS spectra revealed the ratio of Q3:Q4 groups and the pore wall ordering of the silica, the 15N NMR revealed the Brønsted basicity and surface morphology and surface defects of the silica. These investigations were paralleled by stray-field diffusion NMR studies, which investigated the diffusion tensor of pyridine as a function of the pore filling [251]. Later, Gurinov et al. [248] studied aluminum oxide containing SBA-15-type materials to investigate in more detail the effects of Lewis and Brønsted acidity by a combination of 15N and 2H NMR techniques. Lesnichin et al. [249] investigated the behavior of 2,2′-bipyridyl in confinement. They found that the molecule can only form one of the two possible hydrogen bonds to the surface and that the surface coverage grows strongly from one molecule per nm<sup>2</sup> at room temperature to 1.6 molecules per nm<sup>2</sup> at 130 K. Very recently, Shenderovich and Denisov studied the hydrogen bonding of pyridine in detail [252].

#### **4. Complex Liquids in Confinement**

In the previous section, the behavior of confined simple liquids was briefly reviewed. In the present section we discuss the behavior of complex liquids, i.e., mixtures or liquid solutions of two or more liquid components [253]. The investigation of these mixtures is of particular interest as they are models for many natural or technical systems, e.g., oil-water mixtures. While the phase behavior of such complex liquids is often well understood in bulk phases, there is still a very large gap in knowledge for confined systems, where the competition of liquid/liquid versus liquid/pore surface interactions creates much more complex scenarios. Understanding the effect of the confinement on the complex liquid and analyzing the structure, dynamics and spatial distribution of the solvents on the molecular level may help in developing new applications, e.g., in the chemical industry, pharmacology or the oil industry, or in developing new strategies to deal, e.g., with crude oil spills. We start with model systems, namely confined water-alcohol and water-isobutyric acid mixtures, to discuss the basic behavior of these confined systems. Then we discuss recent results on confined ionic liquids and their application as solvents in catalysis in the form of supported ionic liquid phases (SILPs) or supported ionic liquid catalysts (SILCs). Finally, we briefly summarize recent results on confined surfactants.

#### *4.1. Confined Water-Octanol Mixtures*

Water-octanol mixtures are important model systems for the investigation of the phase behavior of two immiscible liquids in confinement. For their quantification, the water-octanol partition coefficient, or *P*-value, *K*<sub>ow</sub> is employed (for details see the short review by Hermens et al. [254] and references therein). Hydrophobic liquids have a high *K*<sub>ow</sub> and hydrophilic liquids have a low *K*<sub>ow</sub>. The *K*<sub>ow</sub> values are employed, e.g., in pharmacology for estimating the distribution of drugs within the body. Drugs with high *K*<sub>ow</sub> tend to accumulate in hydrophobic areas of the body, such as lipid bilayers of cells, and drugs with low *K*<sub>ow</sub> tend to accumulate in hydrophilic areas with high water content, e.g., the blood serum. For a detailed discussion on the application of partition coefficients see Leo et al. [255].
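The partition coefficient is defined as the ratio of the equilibrium concentrations of a solute in the octanol and the water phase, Kow = c(octanol)/c(water). The following sketch shows how this ratio translates into the fraction of solute found in the octanol phase. The numerical values are hypothetical examples, and ideal dilute behavior with concentration-independent Kow is assumed.

```python
def fraction_in_octanol(k_ow, v_octanol, v_water):
    """Equilibrium fraction of a solute residing in the octanol phase.

    k_ow: partition coefficient c_octanol / c_water (dimensionless)
    v_octanol, v_water: phase volumes in the same (arbitrary) unit
    """
    amount_octanol = k_ow * v_octanol   # relative amount in octanol
    amount_water = v_water              # relative amount in water
    return amount_octanol / (amount_octanol + amount_water)


# Hypothetical solutes partitioning between equal volumes of both phases
for log_p in (-2, 0, 3):
    f = fraction_in_octanol(10.0 ** log_p, 1.0, 1.0)
    print(f"log Kow = {log_p:+d}: {100.0 * f:.1f}% of the solute in octanol")
```

Because Kow values span many orders of magnitude, they are usually reported on a logarithmic scale, as in the loop above.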

Kumari et al. [203] studied the phase behavior of water/octanol mixtures confined in mesoporous SBA-15 by a combination of SSNMR and MD simulations (see Figure 8). By a combination of 1D SSNMR and FSLG-NMR [204] they could analyze the strength of the magnetic dipolar interactions between the different components and thus determine the distributions of the two liquids inside the confinement. The salient idea is to search for correlations between the chemically different types of 1H-nuclei (e.g., aliphatic protons of the alkyl chain or hydroxyl protons of the alcohol group or water) of the confined liquid and 29Si-sites on the surface of the material. These correlations are created by the magnetic dipolar interaction between these nuclei and are indicated as cross-peaks inside the 2D-NMR spectra. Thus, they are only visible when the corresponding nuclei are in the vicinity of each other. By varying the contact time different distances are probed. A detailed analysis of the 2D-spectra is beyond the scope of the current review and can be found in [203].
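The distance selectivity of these experiments rests on the steep r<sup>−3</sup> dependence of the dipolar coupling between two nuclear spins. The sketch below evaluates the magnitude of the 1H-29Si dipolar coupling constant, (μ<sub>0</sub>/4π)·|γ<sub>H</sub>γ<sub>Si</sub>|·ħ/(2πr<sup>3</sup>), for a few internuclear distances. The distances are arbitrary illustrative values for an isolated, rigid spin pair; motional averaging and multi-spin effects, which matter in real samples, are ignored.

```python
import math

MU0_OVER_4PI = 1.0e-7        # T m / A
HBAR = 1.054571817e-34       # J s
GAMMA_1H = 267.522e6         # rad s^-1 T^-1
GAMMA_29SI = -53.190e6       # rad s^-1 T^-1 (negative gyromagnetic ratio)


def dipolar_coupling_hz(r_angstrom, gamma_i=GAMMA_1H, gamma_s=GAMMA_29SI):
    """Magnitude of the dipolar coupling constant of a rigid spin pair, in Hz."""
    r_m = r_angstrom * 1.0e-10
    d_rad_per_s = MU0_OVER_4PI * gamma_i * gamma_s * HBAR / r_m ** 3
    return abs(d_rad_per_s) / (2.0 * math.pi)


for r in (2.0, 3.0, 4.0, 6.0):
    print(f"r = {r:.1f} A: |d| = {dipolar_coupling_hz(r):7.1f} Hz")
```

For an isolated pair the coupling is slightly below 1 kHz at 3 Å and drops by a factor of eight when the distance is doubled. Weaker couplings need longer contact times to build up cross-polarization, which is why short contact times emphasize protons close to the surface silicon sites and longer contact times also reveal more remote ones.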

**Figure 8.** Upper panel: Room temperature 1H-29Si CPMAS FSLG-HETCOR experiment measured at 8 kHz spinning of dried SBA-15 filled with a mixture of 80:20 mol% of 1-octanol and water with a contact time of (**a**) 3 ms and (**b**) 9 ms. Lower panel: (**c**) Schematic models for interactions of the pore surface of SBA-15 with 1-octanol. (**d**) Graphical visualization of a feasible bilayer formation of 1-octanol inside the pore. Water molecules are concentrated near the pore wall as well as in the pore center. The intermediate region between pore wall and pore center is occupied by the aliphatic hydrophobic chains of the 1-octanol molecules (adapted from Kumari et al. [203]).

#### *4.2. Water-Isobutyric Acid (iBA) Mixtures in Experiment and Simulation*

Another example is the study of binary mixtures of water and isobutyric acid (iBA, 2-methylpropanoic acid) by a combination of SSNMR spectroscopy and MD simulations. In the bulk, this system has a well-known phase diagram with a large miscibility gap as a function of temperature and mole fraction of the liquids. The first NMR studies [38,39] of this system indicated a micro-phase separation of the confined binary mixture, with an anomalous temperature dependence of the self-diffusion coefficient and a bifurcation of the T2 relaxation at a critical temperature of 42 °C. On this basis, a structural model in the form of concentric cylindrical liquid layers inside the pores below the critical temperature was proposed. The inner cylinder was tentatively assigned to the iBA-rich phase and the outer cylindrical shell to the water-rich phase.

This assignment was tested by Harrach et al. [256] by a combination of high-resolution SSNMR on frozen solutions (at 100 K, to suppress any fluid mobility in the NMR experiments and obtain a snapshot of the liquid distribution inside the pores) and MD simulations. By varying the contact time of 1H-29Si FSLG HETCOR experiments (see Figure 9), they mapped out different distance regimes by virtue of the strength of the magnetic dipolar interactions between the protons and the silicon nuclei on the surface, revealing the molecular distribution inside the pores. The latter was interpreted with the help of MD simulations, which yielded the density profiles of water and iBA as a function of the distance from the pore center (see Figure 10). The simulations corroborate the cylindrical model in principle but reveal that the iBA-rich phase, and not the water-rich phase, is close to the pore wall. Furthermore, the calculations indicated that the iBA molecules preferentially adopt an inverted brush-like arrangement, i.e., they are oriented radially, with the carboxylic group pointing towards the pore wall and the aliphatic chains pointing towards the pore center.
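A density profile of this kind can be obtained by counting atoms in concentric cylindrical shells around the pore axis and dividing by the shell volume. The sketch below is a generic illustration and not the analysis code of Harrach et al. [256]: the pore axis is assumed to coincide with the z axis, the coordinates are synthetic and uniformly distributed, and all names and dimensions are hypothetical. In practice one would select atoms by species and average over many trajectory frames.

```python
import math
import random


def radial_density_profile(xy_positions, pore_length, r_max, n_bins):
    """Number density in cylindrical shells around the pore axis (z axis).

    Returns a list of (shell mid-radius, number density) tuples.
    """
    dr = r_max / n_bins
    counts = [0] * n_bins
    for x, y in xy_positions:
        r = math.hypot(x, y)  # distance from the pore axis
        if r < r_max:
            counts[min(int(r / dr), n_bins - 1)] += 1
    profile = []
    for i, n in enumerate(counts):
        r_in, r_out = i * dr, (i + 1) * dr
        shell_volume = math.pi * (r_out**2 - r_in**2) * pore_length
        profile.append((0.5 * (r_in + r_out), n / shell_volume))
    return profile


# Synthetic, uniformly filled pore (radius and length in Angstrom, hypothetical).
random.seed(0)
pore_radius, pore_length = 30.0, 50.0
atoms = []
for _ in range(5000):
    r = pore_radius * math.sqrt(random.random())  # uniform in the cross-section
    phi = 2.0 * math.pi * random.random()
    atoms.append((r * math.cos(phi), r * math.sin(phi)))

for r_mid, rho in radial_density_profile(atoms, pore_length, pore_radius, n_bins=6):
    print(f"r = {r_mid:5.1f} A   density = {rho:.5f} atoms/A^3")
```

For a uniformly filled pore the profile is flat within statistical noise, whereas a layered arrangement such as the one discussed above would show up as species-specific maxima at different distances from the pore center.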

**Figure 9.** 2D 1H-29Si FSLG HETCOR experiments of iBA/H2O mixtures confined in SBA-15 with contact times of (**a**) 3 ms (longer distances) and (**b**) 0.5 ms (shorter distances) clearly reveal that both hydroxy and aliphatic protons are in contact with the surface silicon nuclei (56 wt % iBA, 9.4 T, 100 K, 8 kHz MAS, 89 kHz FSLG homonuclear decoupling [257]) (figure adapted from [256]).

A detailed analysis of the entropic and enthalpic contributions to the free energy revealed that this unexpected phase behavior is mainly caused by the hydrogen-bonding enthalpy and is weakened at higher temperatures, where entropic terms become more important, leading to more complete mixing. For further details see the original paper by Harrach et al. [256].

**Figure 10.** (**a**) Density profiles for iBA and water calculated by molecular dynamics simulations. The center of the pore is at 0 Å. (**b**) Density of hydrogen atoms as a function of the distance to the closest surface silanol group (figure reproduced from [256] with permission, copyright American Chemical Society).

#### *4.3. Confined Water-Glycol Mixtures*

Evidence for confinement-induced micro-phase separation and the confinement-enhanced tendency for crystallization were reported in 2H spin-lattice relaxation studies on mixtures of D2O with propylene glycol (PG), propylene glycol monomethyl ether (PGME), or dipropylene glycol monomethyl ether [217,258].

For example, for a PG-D2O mixture with a water concentration of 45 wt %, 2H spin-lattice relaxation studies revealed that crystallization is fully suppressed in the bulk but occurs in pores at T < 220 K [217] (see Figure 11a). Specifically, bimodal 2H spin-lattice relaxation indicated that partial freezing results in coexisting liquid and crystalline fractions inside MCM-41 pores. By recording the buildup of the magnetization in a staggered way, it was even possible to follow the crystallization process, because the slow step due to the crystalline fraction grew over time at the expense of the fast step associated with the liquid fraction. Similarly, 2H spin-lattice relaxation results for PGME-D2O mixtures at 240 K indicated that freezing occurs in confinement but not in the bulk for a water concentration of 60 wt %, while such a difference was not observed at lower and higher water contents (see Figure 11b) [217]. Thus, at variance with the situation for pure liquids, confined aqueous glycol solutions with intermediate water concentrations show a stronger tendency towards crystallization than their bulk counterparts. This effect was taken as evidence that, as a consequence of confinement-induced micro-phase segregation, the water concentration in some pore regions becomes sufficiently high to allow for ice formation.

**Figure 11.** Buildup of 2H magnetization M(t) for water-glycol mixtures in the bulk and in MCM-41 pores (diameter 2.8 nm): (**a**) Confined PG-D2O mixture (45 wt % water) together with fits to monomodal (222 K) and bimodal (220 K) spin-lattice relaxation. At the lower temperature, the relative height of the fast and slow steps differs during the first and second cycles of a staggered-range measurement performed directly after temperature equilibration (squares), while this discrepancy does not occur during a later measurement. (**b**) Bulk and confined PGME-D2O mixtures at 240 K. The samples are denoted according to the weight percentages of water followed by the letters 'b' and 'c' for bulk (open symbols) and confined (solid symbols) mixtures, respectively.
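The distinction between monomodal and bimodal relaxation can be made concrete with a simple model of the magnetization buildup. The sketch below assumes a saturation-recovery experiment and purely exponential steps, which is a simplification; the fit functions actually used in [217] may differ, for example by using stretched exponentials. All parameter values are hypothetical.

```python
import math


def buildup(t, m_eq, t1_fast, t1_slow=None, fraction_fast=1.0):
    """Saturation-recovery buildup M(t) with one or two exponential steps.

    Monomodal: M(t) = m_eq * (1 - exp(-t / t1_fast))
    Bimodal:   the fast step carries 'fraction_fast' of the signal (liquid),
               the slow step carries the remainder (crystalline).
    """
    recovered = fraction_fast * (1.0 - math.exp(-t / t1_fast))
    if t1_slow is not None:
        recovered += (1.0 - fraction_fast) * (1.0 - math.exp(-t / t1_slow))
    return m_eq * recovered


# Hypothetical relaxation times in seconds: fast liquid, slow crystalline fraction.
T1_LIQUID, T1_CRYSTAL = 0.01, 10.0
for t in (0.001, 0.01, 0.1, 1.0, 10.0, 100.0):
    mono = buildup(t, 1.0, T1_LIQUID)
    bi = buildup(t, 1.0, T1_LIQUID, T1_CRYSTAL, fraction_fast=0.6)
    print(f"t = {t:7.3f} s   monomodal = {mono:.3f}   bimodal = {bi:.3f}")
```

In such a model, the plateau between the two steps gives the liquid fraction, so a decrease of the plateau height over successive measurements reflects ongoing crystallization.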

#### *4.4. Glass Transition of Confined Water-Alcohol Mixtures*

2H NMR also proved useful to ascertain the glass transition of confined aqueous solutions [217,258]. In these studies, the strong slowdown of molecular dynamics related to the increasing viscosity can be monitored by a combination of, in particular, spin-lattice relaxation and stimulated-echo experiments. Moreover, depending on the deuteration scheme of the compounds used, it is possible either to observe the dynamical behavior of a particular component selectively or to probe that of both constituents at the same time.

Figure 12 compares 2H NMR correlation times of a water-glycerol mixture in the bulk with those in protein and silica confinements [217]. In all samples, 25 wt % of water was mixed with selectively labelled glycerol-d5. Hence, 2H NMR exclusively probes glycerol reorientation. The correlation times of the bulk mixture showed the characteristic non-Arrhenius temperature dependence of molecular glass-forming liquids over about 12 orders of magnitude. The agreement of the NMR data for glycerol dynamics with BDS results, which receive strong contributions from water reorientations, indicated coupled dynamics of the components. While glycerol reorientation was notably slowed down in an elastin matrix, the correlation times were unaltered when confining the liquid to MCM-41 pores [217]. This difference may suggest that glycerol interacts more strongly with elastin than with silica surfaces, but it can also be caused by the different confinement sizes in the studied samples. Specifically, the water-glycerol mixture forms only 1–2 solvation layers around the protein for the concentration of 0.3 gsolvent/gelastin used, whereas interfacial and bulk behaviors can coexist in the MCM-41 pores with a diameter of 2.8 nm. Thus, the observed difference can arise because slowed interfacial dynamics was probed for the elastin confinement, while bulk-like behavior in the pore center dominated the findings for the silica confinement. Consistent with the latter argument, spatially resolved analyses in molecular simulation studies of water-alcohol mixtures confined to silica pores revealed strongly retarded motion near the pore walls and bulk-like behavior in the pore center [259].
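A common empirical description of such non-Arrhenius behavior is the Vogel-Fulcher-Tammann (VFT) law, τ(T) = τ0 exp[B/(T − T0)]. The sketch below shows how a VFT curve departs from Arrhenius behavior through an apparent activation energy that grows on cooling. The parameter values are hypothetical and are not fit results from [217].

```python
import math

GAS_CONSTANT = 8.314462618  # J mol^-1 K^-1


def vft_correlation_time(temperature, tau0, b, t0):
    """Vogel-Fulcher-Tammann law: tau(T) = tau0 * exp(B / (T - T0)), for T > T0."""
    if temperature <= t0:
        raise ValueError("the VFT law diverges at T0; use T > T0")
    return tau0 * math.exp(b / (temperature - t0))


def apparent_activation_energy(temperature, b, t0):
    """Local Arrhenius slope R * d ln(tau) / d(1/T) of the VFT law, in J/mol."""
    return GAS_CONSTANT * b * (temperature / (temperature - t0)) ** 2


# Hypothetical VFT parameters (tau0 in s, B and T0 in K), for illustration only.
TAU0, B, T0 = 1.0e-14, 2000.0, 130.0
for temp in (300.0, 250.0, 200.0, 180.0):
    tau = vft_correlation_time(temp, TAU0, B, T0)
    e_app = apparent_activation_energy(temp, B, T0) / 1000.0
    print(f"T = {temp:5.1f} K   tau = {tau:9.3e} s   E_app = {e_app:6.1f} kJ/mol")
```

For T0 = 0 the expression reduces to an Arrhenius law with a temperature-independent activation energy R·B.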

**Figure 12.** Temperature-dependent 2H NMR correlation times of 25 wt % water-glycerol-d5 mixture in the bulk liquid, in MCM-41 pores (diameter 2.8 nm), and in an elastin matrix (0.3 gsolvent/gelastin) [217]. The 2H NMR data (squares, circles, and up triangles) selectively characterize the rotational motion of the deuterated glycerol compound. The BDS data (diamonds, down triangles) receive contributions from both water and glycerol reorientations.

In 2H NMR studies on mixtures of alcohol molecules with heavy water, both components contribute to the observed signals because chemical exchange leads to perpetual redistribution of the provided deuterons [260]. Results obtained for a glass-forming mixture of water and propylene glycol in bulk and confinement are presented in Figure 13 [217]. While 2H spin-lattice relaxation does not yield evidence for confinement effects in the weakly supercooled regime, 2H rotational correlation functions from stimulated-echo experiments in the deeply supercooled regime decay more slowly for the mixture in silica pores than in the bulk liquid. Closer analysis showed that the common structural α relaxation of water and alcohol molecules is observed in the former temperature range, while faster water reorientation decouples as a secondary β relaxation from the viscous slowdown when approaching the glass transition temperature *T*g. Therefore, the authors concluded that the 2H stimulated-echo data do not yield evidence for a slowdown of the structural relaxation of the water-alcohol mixture in silica pores, but rather indicate changes of the secondary process [217].

**Figure 13.** Results from 2H NMR studies on 45 wt % water-propylene glycol mixture in the bulk liquid and in MCM-41 pores (diameter 2.8 nm) [217]: (**a**) Rotational correlation functions F2(tm) from stimulated-echo experiments at ~163 K and ~173 K. The lines are fits to a Kohlrausch function. (**b**) Temperature-dependent correlation times from (triangles) spin-lattice relaxation, (diamonds) line-shape analysis, and (squares) stimulated-echo experiments. The line is a Vogel-Fulcher-Tammann fit of the spin-lattice relaxation results for the confined mixture, which probe its structural α relaxation. For comparison, results for the secondary β relaxation from BDS on the bulk mixture are included as stars [261] (adapted from [217]).
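The Kohlrausch (stretched-exponential) function mentioned in the caption has the form F2(tm) = A exp[−(tm/τK)^β], and its mean correlation time follows as ⟨τ⟩ = (τK/β) Γ(1/β). The sketch below evaluates both expressions. The parameter values are hypothetical and do not reproduce the fits of [217], which may include additional terms such as a residual plateau or a spin-relaxation damping factor.

```python
import math


def kohlrausch(t_m, tau_k, beta, amplitude=1.0):
    """Stretched exponential F2(t_m) = A * exp(-(t_m / tau_K)**beta)."""
    return amplitude * math.exp(-((t_m / tau_k) ** beta))


def mean_correlation_time(tau_k, beta):
    """Mean correlation time <tau> = (tau_K / beta) * Gamma(1 / beta)."""
    return (tau_k / beta) * math.gamma(1.0 / beta)


# Hypothetical fit parameters: tau_K in seconds, stretching parameter beta.
TAU_K, BETA = 1.0e-3, 0.5
for t_m in (1.0e-5, 1.0e-4, 1.0e-3, 1.0e-2, 1.0e-1):
    print(f"t_m = {t_m:8.1e} s   F2 = {kohlrausch(t_m, TAU_K, BETA):.4f}")
print(f"mean correlation time = {mean_correlation_time(TAU_K, BETA):.3e} s")
```

A stretching parameter β below 1 broadens the decay relative to a single exponential, so ⟨τ⟩ exceeds τK; for β = 1 the two coincide.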

#### *4.5. Ionic Liquids and Surfactants in Confinement as Nonconventional Solvents*

Ionic liquids (ILs) and surfactants such as poly(ethylene oxide) are versatile solvents in the field of green chemistry. Owing to their favorable physical and chemical properties, such as being environmentally benign and having a low vapor pressure (see, e.g., [262–265]), they are employed in a wide range of applications, from basic preparative chemical synthesis to heterogeneous catalysis in the form of supported ionic liquid phase (SILP) catalysts [266]. In the latter application, the IL is confined in an oxidic host material.

In the last decade, SSNMR has evolved into one of the most versatile techniques to characterize the structure, dynamics, and phase behavior of ILs in general and SILP catalysts in particular, as demonstrated by the following characteristic examples.

Shylesh et al. [267] combined in situ FT-IR with 31P and 29Si MAS NMR to study the structure of sulfoxantphos (Rh-SX) in a silica-supported ionic liquid film. Haumann et al. [266] studied the water-gas shift reaction employing SILP systems consisting of the ruthenium catalyst ([Ru(CO)3Cl2]2) and either [EMIM][NTf2] (1-ethyl-3-methylimidazolium bis(trifluoromethylsulfonyl)imide) or [BMIM][NTf2] (1-butyl-3-methylimidazolium bis(trifluoromethylsulfonyl)imide), confined in a silica gel, as a function of the pore loading. Le Bideau et al. [268] investigated the dynamics of silica-confined ILs by a combination of variable temperature NMR spectroscopy and relaxometry. In these experiments they observed a strong depression of the freezing point of the IL compared to the bulk liquid and found that the confinement causes only a small slowdown of its dynamics. Rosa Castillo et al. [269] studied the phase behavior of [BMIM][PF6] (1-butyl-3-methylimidazolium hexafluorophosphate) on silica and clay by multinuclear (1H, 13C, 31P) SSNMR spectrometry. Waechtler et al. [270] used a combination of DSC and variable temperature 2H and 19F solid-state NMR to compare the phase behavior of [C2Py][BTA]-d10 (*N*-ethylpyridinium-bis(trifluoromethanesulfonyl)amide) in the bulk and confined in mesoporous silica gel. While two phase transitions were found for the bulk IL, one at 288–289 K indicating the onset of an intermolecular rotation and one at 295 K indicating the melting of the IL, the confined IL exhibited only a single phase transition in the lower temperature range (215–245 K).

Very recently, Hoffmann et al. [271] investigated the behavior of radical-doped nonionic surfactants confined in SBA-15, whose surface was modified with (3-aminopropyl)triethoxysilane (APTES), by a combination of DSC, SSNMR, and dynamic nuclear polarization (DNP) [85,87] enhanced SSNMR spectroscopy.

#### **5. Summary and Outlook**

This paper reviews recent advances in the characterization of small molecules confined in microporous and mesoporous materials by solid-state NMR techniques. It is shown that there is an exciting interplay between guest/guest and guest/host interactions, which can drastically change the physicochemical properties of the confined systems. Solid-state NMR spectroscopy and relaxometry, combined with other techniques such as nitrogen adsorption, differential scanning calorimetry, dielectric spectroscopy, and hyperpolarization, are ideal analytical tools that enable the differentiation between bound, adsorbed, and free molecules inside the pores as well as the observation of diffusion processes inside the pores of mesoporous and microporous materials. A number of examples, mainly from the groups of the two authors, were given to highlight the application of these techniques. These examples were supplemented by short references to the work of other groups in order to give a broader picture of the field of imprisoned molecules.

We end our review with some thoughts on where the field is heading in the next few years. Here, the dramatic technical advances in the sensitivity enhancement of NMR spectroscopy will enable the investigation of even more complex systems, e.g., hierarchical confinements, where the mesoporous silica material itself is confined inside the larger pores of, e.g., a polymer or paper to form a smart membrane.

**Funding:** This research was funded by the Deutsche Forschungsgemeinschaft in the framework of Forschergruppe FOR 1583, grant numbers Bu-911/18–1/2, Vo-905/8–1/2, Vo-905/9–1/2, and Vo-905/10–1/2.

**Conflicts of Interest:** The authors declare no conflicts of interest.

#### **References**


© 2020 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).

MDPI St. Alban-Anlage 66 4052 Basel Switzerland Tel. +41 61 683 77 34 Fax +41 61 302 89 18 www.mdpi.com

*Molecules* Editorial Office E-mail: molecules@mdpi.com www.mdpi.com/journal/molecules
