**Sequence-Specific DNA Binding by Noncovalent Peptide–Azocyclodextrin Dimer Complex as a Suitable Model for Conformational Fuzziness**

## **Zulma B. Quirolo 1,2,3,**†**, M. Alejandra Sequeira 1, José C. Martins <sup>2</sup> and Verónica I. Dodero 1,3,\***


Academic Editors: Pier Luigi Gentili and Carmelo Corsaro Received: 17 May 2019; Accepted: 5 July 2019; Published: 9 July 2019

**Abstract:** Transcription factors are proteins lying at the endpoint of signaling pathways that control the complex process of DNA transcription. Typically, they are structurally disordered in the inactive state, but in response to an external stimulus, like a suitable ligand, they change their conformation, thereby activating DNA transcription in a spatiotemporal fashion. The observed disorder or fuzziness is functionally beneficial because it can add adaptability, versatility, and reversibility to the interaction. In this context, mimetics of the basic region of the GCN4 transcription factor (Tf) and their interaction with dsDNA sequences would be suitable models to explore the concept of conformational fuzziness experimentally. Herein, we present the first example of a system that mimics the DNA sequence-specific recognition by the GCN4 Tf through the formation of a noncovalent tetra-component complex: peptide–azoβ-CyD(dimer)–peptide–DNA. The non-covalent complex is constructed on the one hand by a 30 amino acid peptide corresponding to the basic region of GCN4 and functionalized with an adamantane moiety, and on the other hand an allosteric receptor, the azoCyDdimer, that has an azobenzene linker connecting two β-cyclodextrin units. The azoCyDdimer responds to light stimulus, existing as two photo-states: the first thermodynamically stable with an E:Z isomer ratio of 95:5 and the second obtained after irradiation with ultraviolet light, resulting in a photostationary state with a 60:40 E:Z ratio. Through electrophoretic shift assays and circular dichroism spectroscopy, we demonstrate that the E isomer is responsible for dimerization and recognition. The formation of the non-covalent tetra component complex occurs in the presence of the GCN4 cognate dsDNA sequence ( 5-..ATGA cg TCAT..-3 ) but not with ( 5-..ATGA c TCAT..-3 ) that differs in only one spacing nucleotide. Thus, we demonstrated that the tetra-component complex is formed in a specific manner that depends on the geometry of the ligand, the peptide length, and the ds DNA sequence. We hypothesized that the mechanism of interaction is sequential, and it can be described by the polymorphism model of static fuzziness. We argue that chemically modified peptides of the GCN4 Tf are suitable minimalist experimental models to investigate conformational fuzziness in protein–DNA interactions.

**Keywords:** GCN4 mimetic; peptides–DNA; E:Z photoisomerization; conformational fuzziness

#### **1. Introduction**

#### *1.1. Conformational Fuzziness and Fuzzy Complexes*

Dynamics is inherently related to protein functionality and an essential process in the sophisticated transcriptional machinery. This property generates great flexibility in developing new regulatory circuits where many combinations of activators can regulate a wide variety of genes with different coactivator requirements [1,2]. Dynamics results in conformational ensembles, and indeed many proteins exist simultaneously in different yet functionally relevant conformations. The number and type of the different conformational complexes depend on the cellular environment, such as substrate gradient concentration and the different stress signals. In this sense, this conformational 'fuzziness' is often functionally beneficial because it can add adaptability, versatility, and reversibility to small molecule interactions, protein binding, and thereby ease of regulation not only in protein–protein interactions [2,3] but also in protein–DNA interactions. The description of biological complex systems from their ultimate constituents, i.e., atoms or molecules, is beyond our reach since they have variable patterns, whose recognition is made difficult because of their multiple features, variability, and extreme sensitivity depending on the context [3]. One chemical strategy to tackle this complexity is to obtain tailored biomolecules whose conformation can be controlled externally. Therefore, the conformational fuzziness of the biomolecular interaction can be investigated [4].

Finally, Tompa and Fuxreiter [1] introduced the concept of fuzzy complexes to describe binding situations in which at least one of the elements in the complex remains dynamic. Structural disorder in fuzzy complexes represents a continuum, from rather rigid polymorphic complexes displaying static disorder with only a few alternative conformations to highly dynamic random complexes [1,2,5].

Since up to now, the fundamental understanding of the spatiotemporal control of the interaction between transcription factors (Tfs) and DNA remains elusive, we envisaged designing a suitable Tf biomimetic system combining molecular recognition and supramolecular chemistry to exert the external control of the interaction. We hypothesized that a non-covalent interaction between the Tf and the DNA would be an optimal system to mimic the biological interaction. Under this circumstance, different possible complexes and their respective conformers would exist in equilibrium in the unbound and bound states. Therefore, we anticipate that the concepts of conformational fuzziness and fuzzy complexes would facilitate the analysis of our experimental model. We think that such a system and its analysis would offer a unique strategy to understand the molecular triggers of transcription and the underlying fuzzy mechanism of the Tf–DNA interaction.

#### *1.2. Design of the Non-Covalent Externally Controlled Biomimetic System*

In *Saccharomyces cerevisiae*, GCN4 transcription factor (Tf) is part of the "general control" system of amino acid biosynthesis, a network of at least 12 different biosynthetic pathways [6]. GCN4 is a member of the bZIP (basic leucine zipper) family of transcriptional activators that bind to the major groove of double-stranded DNA as a homodimer (Figure 1A) [7,8]. The natural dsDNA specific target sequences of the GCN4 dimers are the activator protein 1 binding site (AP1, 5 -..ATGAcTCAT..-3 ) and the related cyclic AMP response element (CRE, 5 -..ATGAcgTCAT..-3 ). GCN4 is composed of a C-terminal leucine zipper sequence that associates into non-covalent, parallel, alpha-helical dimers, and an N-terminal basic region necessary for binding DNA. The basic regions are disordered in the absence of DNA and form alpha helices only when as a homodimer binds to the cognate DNA (Figure 1A). Therefore, GCN4 Tf is a suitable protein model to investigate the relationship between structural order/disorder by binding with its mediators or cofactors [9] and in the context of protein–DNA interactions.

Pioneering work in the construction of GCN4 mimetics was performed by Kim et al. [10]. They showed that the replacement of the leucine zipper segment of GCN4 by a cysteine capable of dimerization through disulfide bond formation (GCN4-br1, Figure 1A), allowed specific recognition of its consensus DNA sequences, CRE ( 5-..ATGA cg TCAT..-3 ) and AP1 ( 5-..ATGA c TCAT..-3 ),

with a nanomolar affinity at 4 ◦C (Figure 1B). Importantly, the monomeric 34-amino acid sequence was not capable of binding to dsDNA by itself, pointing to dimerization as an essential prerequisite. Furthermore, it was possible to trim the basic region to only 23 amino acids, while maintaining the specific recognition of consensus sequences by the covalent cysteine dimer at affinities around 10 nM at 4 ◦C [11]. In an alternative approach, Morii et al. [12] used a 23-residue peptide to form a non-covalent heterodimeric complex through host–guest supramolecular interactions. One monomer was equipped with an adamantane group (Ad) and the other with a β-cyclodextrin group (β-CyD). In the presence of the cognate dsDNA, they formed a non-covalent heterodimer. The heterodimer obtained by formation of the inclusion complex between the β-CyD (host) and the Ad (guest) was capable of selectively recognizing the CRE binding domain [12]. To address and control the reversibility of the binding between the GCN4 Tf mimetic and its cognate dsDNA, Mascareñas' group employed an azobenzene moiety covalently attached to the basic region of GCN4 with 26 amino acids [13]. This covalent system was capable of selective binding and recognizing its target dsDNA after UV-light irradiation when the azobenzene was in the (Z) conformation, effectively creating an off–on system. More recently, the same group designed a stimuli-responsive system that targeted different dsDNA triggered by different metals [14]. Building on these previous examples and to address the fuzziness of the Tf–DNA interaction, we envisaged employing an alternative homodimeric non-covalent strategy where the addition of an external molecule would promote not only dimerization but also control over the specific DNA recognition by the geometry of host–guest interactions. In this hypothetical system, the cognate protein–DNA interaction would build a non-covalent tetra-component complex only if the geometry of the host–guest interaction would promote peptide dimerization and a concomitant specific DNA binding occurs. However, many different complexes and their conformers are possible. Considering the non-covalent approach, there are opportunities to control externally the system and to modulate the number and type of complexes and their conformers depending on the geometry of ligand and the dsDNA sequence (Figure 1C).

To address host–guest interactions and switchability of the dimerization, we employed a molecule with two β-cyclodextrin units connected via an azobenzene group, a β-CyD-azobenzene–β-CyD (β-azoCyDdimer). This double host would be capable of binding two basic regions of GCN4 if the peptide were equipped with the adamantane (Ad) guest moiety forming a conjugated peptide–Ad molecule.

In general, double-host β-CyDs bind with higher affinity to Ad than simple β-CyD because of a cooperative effect, as described by Breslow et al. [15]. Moreover, the strategy of a non-covalent β-CyD double host brings the possibility to modulate the geometry and inclusion properties of the peptide–(β-azoCyDdimer) interaction by the photoisomerization of the azobenzene moiety. Thus, the photoswitchable host will work as an allosteric receptor in the petide-DNA interaction. The adamantane moiety was incorporated into two minimal mimetics of GCN4 containing 26 amino acid and a second one with 30 amino acids, SH26 and SH30, respectively (Figure 1A). Both sequences contain a flexible GG linker, and the longer SH30 sequence contains the sequence RMKQ in the C-terminus that should enhance the binding capacity to the cognate dsDNA sequences [9]. To investigate the recognition capacity and the sequence specificity of the peptide–azoβ-CyD(dimer)–peptide–DNA interaction, electrophoretic mobility shift assays (EMSA), and circular dichroism spectroscopy (CD) in the presence of the cognate and non-cognate sequences were tested (Figure 1B). From EMSA experiments the stoichiometry of the interaction, if any, could be identified; meanwhile, the CD would provide us information about the transition from disorder-to-partial-order or disorder-to-order in the bound state by monitoring the alpha helix content of the peptide.

In summary, we hypothesized the formation of a non-covalent tetra-component complex, (peptide–azoβ-CyD(dimer)–peptide–DNA), only if the required geometry to drive dimerization and recognition is provided by the host–guest interaction, which depends ultimately on light illumination (Figure 1C). In this way, new possibilities may arise to investigate the different conformations that are

involved in DNA recognition triggered by an external stimulus and evaluate the concept of fuzziness in protein–DNA interactions.

**Figure 1.** (**A**) Natural GCN4 sequence (GCN4-bzip), and different truncated GCN4 basic regions: GCN4-br1 [8], GCN4-G23 [9], SH26 [11], and SH30. (**B**) Cognate and non-cognate oligonucleotides employed in this work; the recognition and binding sequences are underlined. (**C**) Schematic representation of the possible conformational fuzziness product of the interaction among (β-azoCyDdimer)–peptide–Ad and different dsDNA. By controlling the geometry of the host–guest interaction triggered by light, it would be possible to promote dimerization and the recognition of the cognate dsDNA sequence. The structural transitions of the dimerization and recognition regions upon different partner interactions generate a structural and dynamical continuum towards the formation of the specific non-covalent tetra-component complex *only* in the case of the cognate dsDNA. Although other complexes with different stoichiometry can be formed, they are not shown to simplify the scheme.

#### **2. Results**

#### *2.1. Synthesis and Characterization of the Allosteric Receptor and Adamantyl Substituted Peptides*

#### 2.1.1. Allosteric Receptor Synthesis and Characterization

Herein, we simplified the synthesis of the β-CyD double host, azoCyDdimer [16] (Figure 2A) by direct coupling of 4,4 -bis (carboxy) azobenzene acid and mono-6-amino-6-deoxy-β-cyclodextrin (β-CyDNH2) in DMF using HATU, as a coupling agent and DIPEA. The reaction was monitored by TLC until the disappearance of β-CyDNH2, and purified by preparative HPLC obtaining 20% yield. In general, monosubstituted cyclodextrins (CyD) lack the high symmetry of free or symmetrically substituted CyD and, therefore, present substantially more complex NMR spectra. Previously, the azoCyDdimer was not entirely characterized [14]. Taking this into account, we present here the complete characterization of the azoCyDdimer by a combination of 2D NMR experiments. The azoCyDdimer structure, made by two CyD linked through an azobenzene unit, was initially studied by 1H NMR in DMSO (Figure 2A,C, and Supplementary Materials, Figure S1). Two distinct anomeric proton signals at δ = 4.96 ppm for H-1 and δ = 4.83 ppm for H-1 were observed (with a relative area of 12:2), corresponding to the pattern of monosubstituted CyD. The –OH2 and –OH3 groups were observed between 5.70 to 5.82 ppm. The other group of signals between 3.20 and 3.80 ppm corresponded to the rest of the protons of the glucose units and suffered from substantial overlap (see Supplementary Materials, Figure S1). Nevertheless, the complete assignment could be obtained through the use of COSY, TOCSY, 1H-{13C} HSQC, and HMBC.

Regarding the signals of the photomodulable connector, two doublets of 4H area were observed at δ = 7.98 (*J* = 8.6 Hz) and δ = 8.04 (*J* = 8.6 Hz), which corresponded to a substituted azobenzene derivative in position 4.4 ' (isomer (E)). Using a multiplicity edited HSQC experiment, CH units were differentiated from CH2, while the absence of correlations allowed the identification of signals corresponding to the NH and OH groups. The analysis was completed by HMBC, which allowed establishment of longer distance correlations between 1H and 13C over two or three bonds. This enabled the correlation of Hb at 8.00 ppm (doublet) with the carbonyl of the amide, while a weak correlation was also observed between Cb and the H of the amide thus confirming its integrity (see Supplementary Materials, Figure S2 and S3).

Furthermore, we evaluated the E:Z photoisomerization of the dimer using 1H NMR and UV–Vis spectroscopy (Figure 2C and see Supplementary Materials, Figure S4). Changes in the E:Z ratio after photoisomerization were calculated from the integration of the corresponding areas of the azobenzene protons, as shown in Figure 2C. The signals corresponding to the Z isomer appeared at the higher field than the corresponding signals of the E isomer due to the different contribution of the diamagnetic anisotropy of both benzene rings in the E:Z isomers. As mentioned, the 1H-NMR spectrum of the E isomer (Figure 2(C1)) had two doublets at 8.03 ppm (*J* = 8.6 Hz) and 7.97 ppm (*J* = 8.6 Hz) corresponding to the aromatic protons Hc and Hb of the azobenzene linker group, and a singlet at 8.52 ppm corresponding to the protons of the amide group. After the isomerization, a decrease of these signals was observed while three new signals corresponding to the Z isomer appeared: two doublets at 7.72 ppm (*J* = 8.4 Hz) and 6.90 ppm (*J* = 8.4 Hz) and a singlet at 8.20 of the H corresponding to the amide bond (Figure 2(C2)).

Previously, it was reported that azoCyDdimer forms supramolecular aggregates in water above the concentration 1 mM, while below this concentration it reaches the photostationary state of E:Z 60:40 as determined by UV–Vis spectroscopy [16]. Here, we analyzed the dependence of the concentration and isomerization by the integration of the peaks using 1H-NMR in DMSO. We observed that the most diluted samples favored the formation of the Z isomer, from E:Z ratios of 40:60 (0.5 mM) to 60:40 (15 mM) (Figure 2D,E).

**Figure 2.** (**A**) Structure of the allosteric receptor. (**B**) Representation of the photoisomerization of the azoCyDdimer. (**C1**) 1H-NMR-500MHz spectra of azoCyDdimer (15 mM) at 25 ◦C (DMSO-*d*6). (**C2**) 1H NMR spectra of azoCyDdimer after 4 h irradiation with UV light (λ = 360 nm). Isomer (E) signals are marked with full lines and isomer (Z) signals with dotted lines. (**D**) Independent 1H-NMR (300MHz) experiments of the photoisomerization of azoCyDdimer at different initial concentrations, showing the aromatic region after 20 min of irradiation at 360 nm in DMSO-d6, 25 ◦C (see Materials and Methods). (**E**) Bar graph of E:Z isomer ratio obtained by integration of the 1HNMR signals in D.

2.1.2. Peptide Derivatives Synthesis

To obtain a minimal GCN4 mimick, we synthesized two adamantane GNC4 derivatives (Ad26 and Ad30, Figure 3A). Both peptides were obtained by coupling the bromoacetyladamantane with the thiol group of Cys26 or Cys30, respectively.

**Figure 3.** (**A**) GNC4 peptides derivatives. (**B**) RP-LC–MS-monitoring of the reaction between SH30 and bromoacetyl adamantane to obtain Ad30 at 5 (1) and 120 (2) min.

We took advantage of the nucleophilic reactivity offered by the thiol group of cysteine in a solution using the completely deprotected peptide. The synthesis of the new Ad30 was performed following the established protocol for Ad26 [17,18]. In Figure 3B, LC–MS chromatogram shows the evolution of the reaction, which was verified by ESI. After 2 h of reaction, the peak of SH30 (tr 9.56 min) disappeared and a new peak at 13.85 min appeared, whose m/z corresponded to the final product Ad30.

Both Ad26 and Ad30 peptides were employed as the minimal GCN4 derivatives to evaluate the interaction strategy of non-covalent homodimers. Furthermore, we employed the corresponding disulfur dimers SS52 and SS60 as positive controls of the peptide–DNA interaction [10,17,18].

#### *2.2. Peptide–dsDNA Interaction*

#### 2.2.1. Electrophoretic Mobility Shift Assays (EMSA)

When a GCN4 peptide binds to a cognate dsDNA, it induces a delay in the migration of the peptide–dsDNA complex in comparison with free dsDNA when using EMSA. This interaction leads to a characteristic shift in the position of the associated band, leading to the observation of two bands with an intensity ratio reflecting the equilibrium involved in complex formation. EMSA experiments require a minimal amount of sample in the nM range, making it an ideal and common technique to obtain qualitative information of the respective peptide–DNA interaction [10–13,17–19]. The DNA-binding properties of the synthetic peptides were studied by EMSA under non-denaturing conditions and using SYBR-gold for DNA staining. Initially, the incubation of Ad26 in the presence or absence of azoCyDdimer to both AP1 and CRE dsDNA (Figure 1B) showed no interaction bands, as shown in Figure 4. In this case, a new band could only be observed for the positive control SS52, which is the SH26 disulfide dimer (Figure 4, lane 2), confirming the negative result. It was noticeable that the interaction of Ad26 with CRE (Figure 4C,D) produced a decrease in the free DNA band. We hypothesized that this behavior might be a consequence of the formation of different types of complexes with the dsDNA, which would make diffuse broadband under both conditions. Nevertheless, no strong interaction was detected at the nM range. The azobenzene molecule extends over a considerably different distance when considering the E and Z isomers, being theoretically about 9 to 5.5 Å, respectively [20–22]. When we tried to mimic the binding system of GCN4, this distance may not have been adequate for the specific interaction with the short sequence and its putative dsDNA binding site. Considering that no new migration band was observed in the nM, this system was not suitable for further evaluation, mainly because upon increasing peptide concentration the unspecific interactions started to be relevant.

**Figure 4.** EMSA analysis of DNA binding of Ad26 peptide in the presence and absence of azoCyDdimer. (**A**) Lanes 1–10: 50 nM AP1 ( 5-cggATGA c TCATtttttttc-3 ); lane 2: 300 nM SS52; lanes 3–10: 200 nM Ad26 and 0, 0.5, 1, 2, 5, 20, 50, 100 equivalents azoCyDdimer (E). (**B**) Lanes 1–10: 50 nM AP1 ( 5-cggATGA c TCATtttttttc-3 ); lane 2: 300 nM SS52; lanes 3–10: 200 nM Ad26 and 0, 0.5, 1, 2, 5, 20, 50, 100 equivalents azoCyDdimer (Z). (**C**) Lanes 1–10: 50 nM CRE ( 5-cggATGA cg TCATttttttt-3 ); lane 2: 300 nM SS52; lanes 3–10: 200 nM Ad26 and 0, 0.5, 1, 2, 5, 20, 50, 100 equivalents azoCyDdimer (E). (**D**) Lanes 1–10: 50 nM CRE ( 5-cggATGA cg TCATtttttttc-3 ); lane 2: 300 nM SS52; lanes 3–10: 200 nM Ad26 and 0, 0.5, 1, 2, 5, 20, 50, 100 equivalents azoCyDdimer (Z). DNA was detected by fluorescent staining with SYBR-gold.

On the other hand, incubation of the derivative Ad30 with AP1 showed a new migration band, the molecular weight of which corresponded to the Ad30 monomer that binds to dsDNA (Figure 5A,B, lane 2, compare with the positive interaction with the covalent dimer of SH30, named as SS60) [12,17]. It seems that the monomer Ad30 interacted with one half of the recognition site, but no tri or tetra-component complex was obtained, leading us to conclude that this interaction should be considered as unspecific, although with high affinity. Interestingly, once the azoCyDdimer was present in the incubation mixture with Ad30 and the CRE sequence, two bands appeared, one corresponding to the above-mentioned unspecific interaction, and the second one with a retardation band compatible with a higher order complex (Figure 5C,D). By increasing the concentration of azoCyDdimer, the formation of the second complex was favored. This new migration band was compatible with the size of the tetra-component system Ad30–AzoCyDdimer–Ad30–CRE that migrated less than the disulfide dimer (compare with lane 2, Figure 5). The intense band of specific binding was only observed in the presence of an excess of azoCyDdimer enriched in the isomer (E), to the detriment of the nonspecific binding that nearly disappeared at the highest azoCyDdimer concentration (Figure 5C, lanes 10–12). A similar result was observed when the azoCyDdimer (Z) was added to the incubation mixture; however, in this case, the specific band was more diffuse, showing the existence of different complexes in the mixture (Figure 5D, lanes 9–12).

**Figure 5.** EMSA analysis of DNA binding of Ad30 peptide in the presence and absence of azoCyDdimer. (**A**) Lanes 1–10: 50 nM AP1 ( 5-cggATGA c TCATtttttttc-3 ); lane 2: 300 nM SS60; lanes 3–10: 200 nM Ad30 and 0, 0.5, 1, 2, 5, 20, 50, 100 equivalents azoCyDdimer (E). (**B**) Lanes 1–10: 50 nM AP1 ( 5-cggATGA c TCATtttttttc-3 ); lane 2: 300 nM SS60; lanes 3–10: 200 nM Ad30 and 0, 0.5, 1, 2, 5, 20, 50, 100 equivalents azoCyDdimer (Z). (**C**) Ad30 and azoCyDdimer (E): lane 1–13: 50 nM CRE ( 5-cggATGA cg TCATtttttttc-3 ); lane 2: 100 nM SS60; lanes 3–15: 200 nM Ad30; lanes 3–10: azoCyDdimer (E): 0, 0.5, 1, 2, 10, 50, 100, 500 eq.; lane 11–12: excess of azoCyDdimer (E) (0.15 mM); lane 13: Ad30 200 nM; lane 14: 50 nM ds mCRE ( 5-cggATGAcgttgtttttttc-3 ): 200 nM Ad30; lane 15: 50 nM mCRE ( 5-cggATGAcgttgtttttttc-3 ): 200 nM Ad30, 0,5 eq azoCyDdimer(E). (**D**) Ad30 and azoCyDdimer (Z) lane 1–13: 50 nM CRE ( 5-cggATGA cg TCATtttttttc-3 ); lane 2: 100 nM SS60; lane 3–10: 200 nM Ad30 and azoCyDdimer (Z): 0, 0.5, 1, 2, 10, 50, 100, 500 eq.; lane 11–12: Ad30: 200 nM and excess of azoCyDdimer (Z) (0.15 mM); lane 13: Ad30 200 nM; lane 14: 50 nM ds mCRE ( 5-cggATGAcgttgtttttttc-3 ); 200 nM Ad30; lane 15: 50 nM ds mCRE ( 5-cggATGAcgttgtttttttc-3 ): 200 nM Ad30, 0,5 eq. azoCyDdimer (Z). DNA was detected by fluorescent staining with SYBR-gold.

Additionally, in the presence of the azoCyDdimer (Z), the relative intensity of the non-specific band did not show a significant decrease in comparison with the specific one. To further investigate the specificity of the interaction, we employed a dsDNA with half CRE sequence (mCRE), (Figure 5, in lanes 14 and 15, gels C and D) showing the migration of the band compatible with a complex monomer Ad30–azoCyDdimer–mCRE. The slightly different migration profile could be a different migration of both dsDNA sequences. It is important to mention that under this circumstance, it seems that the mixture enriched in E isomer (95:5) could form the cooperative complex. Therefore, the mixture enriched in the Z isomer, which still contained 60% of E isomer after photo-illumination, could also bind.

Considering the positive binding of the Ad30–azoCyDdimer–Ad30 system and the CRE sequence, we further evaluated its interaction by circular dichroism (CD) spectroscopy.

#### 2.2.2. Circular Dichroism Spectroscopy

There is a proportional relationship between the amount of α-helix and the intensity of the negative signal at 222 nm in the circular dichroism spectrum, especially useful for study the bZIP-DNA interactions [23,24]. In the case of GCN4, which is disordered in the absence of its cognate dsDNA, there is an increase in its folding from random coil to α-helix when it interacts specifically with its consensus DNA. Briefly, Ad30 solution was added to a 5 μM solution of azoCyDdimer (E) in phosphate buffer to monitor any change due to the interaction. A decrease of the band at 222 nm from −0.2118 (CyDdimer only) to <sup>−</sup>16.201 ◦cm2dmol−<sup>1</sup> was observed when Ad30 was added (Figure 6A). In the presence of the CRE sequence ( 5-..ATGA cg TCAT..-3 ), a decrease of the same band was observed because of the interaction with the CRE binding site at <sup>−</sup>30,666 ◦cm2dmol−1, validating the specific interaction of the Ad30–azoCyDdimer (E)–Ad30 system and CRE obtained by EMSA assays (Figure 5C). The sample was irradiated with UV light (360 nm) to increase the ratio of isomer Z in the photostationary state, and a slight change in the band at 222 nm was observed, from <sup>−</sup>29.968 ◦cm2dmol−<sup>1</sup> at 20 min to <sup>−</sup>28.953 ◦cm2dmol−<sup>1</sup> at 40 min. As in the case of isomer (E), the solution enriched with azoCyDdimer (Z) did not show a significant contribution of the ellipticity at 222 nm (−0.2956 ◦cm2dmol−1) (Figure 6B). With the addition of the solution of Ad30 peptide, the value of the band at 222 nm decreased to <sup>−</sup>15.797 ◦cm2dmol−<sup>1</sup> and then when CRE was added, the second decrease in ellipticity was observed, reaching the value of <sup>−</sup>26.709 ◦cm2dmol−1, which showed lower alpha-helical content in comparison with the azoCyDdimer (E). Once this mixture was irradiated with white light, the signal at 222 nm decreased slightly to <sup>−</sup>27.034 ◦cm2dmol−1.

**Figure 6.** Circular dichroism evaluation of the interaction of Ad30 (**A**) with azoCyDdimer (E) and its cognate sequence CRE ( 5-cggATGA cg TCATtttttttc-3 ), (**B**) with azoCyDdimer (Z) and its cognate sequence CRE ( 5-cggATGA cg TCATtttttttc-3 ), (**C**) with azoCyDdimer (E) and its cognate sequence AP1 ( 5-cggATGA c TCATtttttttc-3 ), (**D**) with azoCyDdimer (Z) and its cognate sequence AP1 ( 5-cggATGA c TCATtttttttc-3 ), (**E**) with azoCyDdimer (E) and mCRE ( 5-cggATGAcgttgtttttttc-3 ), and (**F**) with azoCyDdimer (E) in the presence of a random sequence, NON ( 5-ggtatgcgtcgatttttttc-3 ). In all experiments, the band of dsDNA was subtracted; for further experimental details refer to Materials and Methods section.

On the other hand, in the interaction of both isomers with AP1, ( 5-..ATGA c TCAT..-3 ), the band at 222 nm decreased to <sup>−</sup>22.974 ◦cm2dmol−<sup>1</sup> in the case of the isomer E and to <sup>−</sup>21.408 ◦cm2dmol−<sup>1</sup> when the isomer Z was present (Figure 6C,D). Both results were in the same direction as the EMSA, showing that the geometry of the dimer was not favorable for the interaction with AP1, which had only one base spacer between both recognition sites.

Finally, we tested the unspecific interaction with the half sequence of mCRE ( 5-..ATGA cg..-3 ), detecting a decrease of the molar ellipticity to <sup>−</sup>20.214 ◦cm2dmol−1. This value, as expected, was lower than the one observed with the full sequence (compare Figure 6A,E). The observed decrease in molar ellipticity could be due to the insertion of the Ad30 monomer in the major groove and interaction with the bases of the half sequence ( 5-..ATGA cg..-3 ). To investigate the role of the phosphates of the DNA and their binding to the basic Ad30, we experimented with a random DNA sequence that does not possess a cognate binding site, named NON ( 5-ggtatgcgtcgatttttttc -3 ) (Figure 6F). For this case, a slight decrease in the signal from <sup>−</sup>10.389 to <sup>−</sup>14.323 ◦cm2dmol−<sup>1</sup> was observed. From that experiment, we can exclude that the unspecific interaction with the negative phosphates of DNA and the basic GCN4 peptide has a role in the interaction of Ad30.

These results are in agreement with those obtained by EMSA, showing that Ad30 in the presence of azoCyDdimer in the E configuration presented a specific binding to CRE sequence forming the tetra-component complex; meanwhile, in the Z configuration, the helical content was lower. Considering that the E isomer is an equilibrated mixture 95:5 (E:Z) and the Z isomer is a 60:40 (E:Z) mixture, we hypothesized that the E isomer has the right geometry to form the specific tetra-component complex. The CD signal of the interaction with AP1 was in the same range of those observed for the half-recognition sequence, mCRE. This shows the feasibility of a monomer Ad30–AzoCyDdimer to bind to the sequence ( 5-..ATGAcg..-3 ), but not as a dimer as detected by the EMSA experiment (Figure 5A,B). Importantly both CD and EMSA experiments demonstrated the feasibility of formation of the non-covalent complex Ad30–AzoCyDdimer (E)–Ad30 that interacts preferably with CRE sequence, ( 5-..ATGA cg TCAT..-3 ).

2.2.3. Competitive Binding Assay with 1-Adamantane Acetic Acid by CD and EMSA Experiments

To further corroborate the formation of the inclusion complex Ad–βCyD in the system Ad30–AzoCyDdimer (E)–Ad30, we hypothesized that such interaction would collapse upon addition of an excess amount of a competitive β-cyclodextrin binder, such as 1-adamantane acetic acid [12]. Under this circumstance, the helical content would decrease due to the dimer collapse, and the specific band in the EMSA experiment should disappear. Following this hypothesis, we evaluated first the system by the EMSA experiment (Figure 7A).

**Figure 7.** (**A**) EMSA analysis of DNA binding of Ad30–azoCyDdimer (E)–Ad30 in the absence and presence of 1-adamantane acetic acid. Lanes 1–5: 50 nM CRE ( 5-cggATGA cg TCATtttttttc-3 ); lane 2: 300 nM SS60; lane 3: 200 nM Ad30; lane 4: 200 nM Ad30 and 100 equiv azoCyDdimer (E); lane 5: 200 nM Ad30, 100 equiv azoCyDdimer (E) and 200 equiv 1-adamantane acetic acid; lane 6: 50 nM NON, 200 nM Ad30, 100 equiv of azoCyDdimer (E) and 200 equiv 1-adamantane acetic acid. (**B**) CD spectra of the interaction Ad30–AzoCyDdimer (E)–Ad30–CRE in presence and absence of 1-adamantane acetic acid.

It was observed that the tetra-component complex (lane 4) in the presence of an excess of the 1-adamantane acetic acid was not formed (lane 5); instead, only the unspecific shift band of the complex peptide:DNA (1:1) was detected. In the case of the interaction of the NON, any shift band was detected in the presence of the competitive β-cyclodextrin binder (lane 6). By CD, it was observed that in the presence of an excess of 1-adamantane acetic acid, the negative band at 222 nm reached only the value of <sup>−</sup>20.409 ◦cm2dmol−<sup>1</sup> instead of the expected <sup>−</sup>30,666 ◦cm2dmol−<sup>1</sup> for the dimer (Figure 7B). The obtained value was in the range of the unspecific interaction obtained by interaction with the half-sequence mCRE sequence (−20.214 ◦cm2dmol−1) (see Figure 6E). These results confirm the relevance of the βCyD–Ad host–guest interaction to promote dimerization and thus trigger the specific interaction with the target dsDNA.

#### **3. Discussion**

Transcription factors (Tfs) are useful proteins to evaluate the concept of fuzziness in protein–protein and protein–DNA interactions [1,2,7,25–29]. For GCN4 Tf, it has been hypothesized that the binding to DNA occurs sequentially in such a way that a monomer is first assembled into the major groove of DNA, followed by dimerization with a second GCN4 Tf unit [30]. Generally, mimetics of the GCN4 Tf possess a high degree of randomness, but in the presence of a dimerization motif and their cognate ds DNA, there is a substantial increase of α helical structure due to structure inducing complex formation as a result of the specific binding [9–14]. Conformational dynamism and heterogeneity enable context-specific functions to emerge in response to changing environmental conditions and, furthermore, allow a single structural motif to be used in multiple settings [14]. The sequential interaction pathway is a natural strategy that prevents dimers from being trapped for a relatively long time in non-consensus sequences. In the case of GCN4 mimetics, the dimerization motif (Figure 1A), which is not involved in the DNA binding interface, is generally conformationally unaffected by binding to the DNA. Thereby, it retains the capability to modulate the interaction [31]. In general, heterogeneous conformational segments can increase binding affinity. This conformational flexibility and heterogeneity of proteins represent their fuzziness [1–6]. The classical framework of protein interactions establishes that there is a deterministic relationship between protein sequence and function. Based on this, a distinguished three-dimensional arrangement of the amino acids is a prerequisite for a given biological activity and is unambiguously encoded in the sequence [32]. However, protein functions are modulated by different mechanism triggered by different effectors. The effector perturbs one site and thereby leads to altered activity in a second, substrate site [33]. In our system, the effector was the azoCyDdimer, which triggered host–guest interaction of β-CyD of the dimer and two GCN4 peptides containing each an adamantane moiety. The different geometry the azoCyDdimer was previously calculated as 9.1 Å and 6.6 Å for E and Z isomers, respectively [16]. As confirmed here, the azoCyDdimer exists under two photo-states: the first thermodynamically stable with an E:Z isomer ratio of 95:5 and the second obtained after irradiating with ultraviolet light, E:Z 60:40 ratio. Taking into account, the large geometrical difference between both isomers and the relevance of this distance to form a suitable complex, we had hypothesized that it would be unlikely that both isomers could form a homodimer leading to the same specific dsDNA interaction. Considering that the first and second photo-stationary state contains a major proportion of E isomer 95% and 60%, respectively, it can be anticipated that if the E isomer is the responsible of the tetra-component complex, the specific interaction would be observed under both illumination conditions. However, some differences in binding affinity can be expected. On the contrary, if the Z isomer would give the right geometry for the peptide–DNA interaction, the positive interaction would be observed only mainly in the second photostationary stage where the Z isomer is present in a significant concentration (40%). Nevertheless, if both isomers would contribute equally to the tetra-component complex, similar behavior in all the experiments would be expected.

We performed two complementary sets of experiments, EMSA and CD, to evaluate the system. As aforementioned, from EMSA experiments the stoichiometry of the interaction, if any, was identified. Meanwhile, the CD gave information about the transition from disorder-to-order and disorder-to-partial-order in the bound state. In such a case, the value of the molar helicity per

residue would let us set a scale of the order obtained for each interaction and connecting them with the stoichiometry obtained in the EMSA experiment. Considering the theory of fuzzy complexes [1], the disordered-binding elements of GCN4 mimetic may undergo three types of structural transitions upon interaction with the target dsDNA. First, let us consider the disorder-to-order transition to adopt a stable, well-defined conformation in the bound state [32]. This is also referred to as coupled folding to binding [34]. Second, upon partner recognition, a transition from disorder-to-partial order could take place in shallow, often hydrophobic binding pockets. In this case, the interface was generated by many redundant contacts and few specificities, as observed by the formation of the complex DNA–Ad30 and DNA–Ad30–azoCyDdimer (E) [32]. Third, a disorder-to-disorder transition may occur upon binding of the Ad30–azoCyDdimer–Ad30 to the cognate dsDNA; however, in such case, we would not expect an increase of the helical content, but binding should be observed in the EMSA experiment.

We found that Ad30, but not its shorter form Ad26, was able to form two detectable new interaction shift bands during EMSA experiments. The shifted band detected in the absence of azoCyDdimer has a migration compatible with the monomer of Ad30, which binds to all the dsDNA sequences, inclusive to the random one (NON). Thus, this interaction was considered as unspecific. The second shift band was observed in excess of azoCyDdimer, and it corresponds to the tetra-component complex. This second band of the non-covalent tetra-component complex was observed only in the presence of CRE ( 5-..ATGA cg TCAT..-3 ) sequence under both illumination conditions, but not with AP1 ( 5-..ATGA c TCAT..-3 ), nor with mCRE ('5-..ATGAcg..-3 ). The elongation of the sequence in Ad30 by insertion of the affinity enhancing QRMK sequence in before the C-terminal QGGC end, thus leading to the QRMKQGGC sequence (Ad30) instead of QGGC (Ad26), does increase the binding affinity to levels allowing the detection of the interaction of one Ad30 monomer with dsDNA containing the sequences ('5-..ATGAc..-3 ). Probably the positive charge of the arginine (R) and lysine (K) groups are stabilizing effects that favor the observed interaction. In general, dynamics of the interaction may vary in a wide range depending on the truncation of the disordered dimerization domain as observed here [32]. Considering our findings, we hypothesized that the formation of a monomeric complex Ad30–azoCyDdimer and the ('5-..ATGAc..-3 ) sequence might favor the subsequent tetra-component complex formation. This can be explained by the fact that when one of the sites is bound to its cognate site receptor, a second site located close-by binds cooperatively, basically because of the lower entropic cost of a (pseudo)-intramolecular interaction [35]. Considering this concept of multivalency, if the linker connecting the binding and the dimerization domains are flexible, the average distance between the sites is the main factor determining the cooperativity. In such complex, we hypothesized that the QRMKQGGC in Ad30 sequence provides the required flexibility to favor the observed tetra-covalent binding and that this region is responsible for the fuzzy behavior of the mimetic in the bound state.

Considering that the specific migration band with CRE was also observed in both photostationary states where the E isomer is 95% and 60%, we hypothesized that the geometry of azoCyDdimer (E) is the adequate combination causing a migration band compatible with the formation of a stable tetra-component complex in the presence of ds CRE. When we evaluated the interaction of Ad30, the azoCyDdimer (E), and the CRE sequence using CD, we observed a significant increase of the helical content compatible with the formation of a tetra-component system, Ad30–azoCyDdimer (E)–Ad30–CRE, similar to those reported by the covalent dimer SS60 and other GCN4 mimetics [10–14,17,18]. In the case of the second photostationary state (ps) where the ps is E:Z (60:40), a moderate and lower increase of Ad30 helical content compared with the first pss E:Z (95:5) was observed, suggesting that the Z isomer does not contribute to the specific binding.

Importantly, for the interaction of Ad30–azoCyDdimer (E) and the mCRE sequence ('5-..ATGAcg..-3 ), which contains only one binding site instead of the two necessary ( 5-..ATGA cg TCAT..-3 ) for the tetra-component formation, a much lower increase of the helical content was observed (Figure 6E). This is compatible with the interaction of one Ad30–azoCyDdimer (E) or only Ad30 with this sequence, as detected in the EMSA experiment (Figure 5C,D, lane 15). The interaction with the random sequence (NON, Figure 6F), which has no recognition motif, showed nearly no interaction, demonstrating that the interaction of the basic region with the phosphate backbone of the dsDNA has at most a minor contribution to the protein–DNA binding. Summarizing the CD experiments, the increase of the helical content upon binding is related to the bound state of Ad30–azoCyDdimer (E) with the different dsDNA as follow: CRE > AP1~ mCRE > NON. After comparison of the increase of helical content (order) of Ad30- in the presence of CRE and the two azoCyDdimer isomers, it is evident that the photo-state (E) contribute more than the (Z). It seems that dimerization interaction between the adamantane and β-cavity of the azoCyDimer occurs at the wider edge of the cyclodextrin (Figure 2A), instead of its narrow edge, which would increase the binding affinity [15]. Previously, it has been shown by molecular dynamic simulations that while the Z azoCyDdimer isomer makes a 1:1 complex with a small organic molecule with two adamantane moieties, through interaction with the narrow edge; the E azoCyDdimer isomer forms supramolecular head to tail complexes through the wider edge [16]. In the case of the mimetic of GCN4, such supramolecular aggregates are not possible because the peptide has only one binding domain. All our experiments suggested that E conformation provides the required geometry to favor the tetra-component complex, which depends on the dsDNA sequence. On the other hand, we could not discard the formation of the Ad30–azoCyDdimer (Z)–Ad30. Nevertheless, our findings indicate that the geometry of the interaction in the Z conformation is not adequate for DNA recognition.

Finally, to confirm that the host–guest interaction between the β-CyD and the Ad moiety is a necessary requisite to recognize CRE sequence specifically, we performed a competitive assay in the presence of 1-adamantane acetic acid. Upon addition of the competitive β-CyD, the only band observed was the unspecific 1:1. A comparable result was obtained by CD upon addition of the competitive β-CyD reaching a helical content value similar to the one of the unspecific binding to mCRE. These final experiments sustain the hypothesis that the azoCyDdimer is necessary to form the homodimeric complex leading to the formation of the tetra-component system in a sequence-specific manner. We had planned to modulate the peptide–DNA interaction by the structural change promoted by the photoisomerization of the azobenzene group located as a linker between the two β-CyDs moieties in the azoCyDdimer. However, as well as for the covalent version of photomodulable GCN4 mimetic reported by Caamaño et al., it was not possible to switch the interaction effectively by changing the azobenzene conformation in situ [13]. We could not anticipate that the E isomer would be the one that favors the dimerization. Nevertheless, we have the off situation in the absence of azoCyDdimer switching the recognition to *on* above a certain azoCyDdimer (E) concentration (Figure 8). About the mechanism of interaction, we hypothesized that recognition occurs through a sequential mechanism where the monomer Ad30 probably forming a 1:1 complex with the azoCyDdimer recognizes the sequence ('5-..ATGAc..-3 ) Initially with high affinity followed by the cooperative union of a second monomer promoted by the host–guest dimerization motif of azoCyDdimer in the E conformation. This second stage is stabilized by the specific interaction with CRE sequences ( 5-..ATGA cg TCAT..-3 ). During this interaction, different fuzzy conformers are possible that favor the specific interaction with CRE by a transition from disorder-to-partial order of the four components of the complex (Figure 8). For the others dsDNA that lack of the complete binding sequence in mCRE or only one base as a connector between the binding site in AP1, let to the formation of bi-or tri-component complexes. As described before, Tompa et al. have proposed a categorization for fuzzy complexes, as static or dynamic [1]. Considering that in the bound state, most of the Ad30 is order, it is possible to describe the interaction by the polymorphism model of static fuzziness [1]. In such model, one part of the molecule makes the contacts for the interaction whereas its dimerization domain adopts several distinct conformations, and establish the right geometry to favor dimerization and binding to the DNA with the complete recognition sequence in CRE. We hypothesized that the QRMKQGGC region of Ad30 might adopt multiple unrelated conformations to stabilize the interaction with the larger azoCyDdimer (E) instead of the compact (Z) isomer, thus favoring dimerization that contributes to CRE recognition. Probably, this structural variability limits an unfavorable decrease in entropy accompanying complex formation, which enables the combination of rapid and thermodynamically

favorable binding. Truncation of this region led to no-binding as observed for Ad26, which validates our hypothesis. In conclusion, the present report is the first example of a GCN4 mimetic that forms a specific non-covalent tetra-component system with the cognate binding sequence only in the presence of an external ligand in one of two possible conformations, working as an off–on switch. Moreover, we demonstrated that chemically modified mimetics of GCN4 are suitable minimalist models to investigate conformational fuzziness in protein–DNA interactions, opening the opportunity to investigate biomolecular interaction by implementing fuzzy logic sets as proposed by Gentili [36].

**Figure 8.** Schematic structural transitions of the Ad30 upon partner interaction with AzoCyDdimer (E) in the presence of ds CRE, representing the interacting molecules (INPUT), the obtained experimental values (OUTPUT), and the proposed interaction model. Three interaction stages are hypothesized for the system: the unbound-, the intermediate-, and the bound-state that are represented with dashed rectangles in orange, red, and green, respectively. These stages are based on the experimental helical content (θ) obtained from the individual experiments in the presence of CRE, AP1, mCRE, or NON, and the stoichiometry obtained in the EMSA experiments. In each stage, different complexes and their conformational fuzziness are shown. In the on-state *(bound state, green dashed rectangle)*, the specific binding to CRE builds up the tetra-component complex. In the bound state, the structural and dynamical continuum of fuzzy complexes is shown. The off-state is composed of many complexes, which can be unbounded to the dsDNA (*orange dashed rectangle)* or bounded but without the required sequence-specificity (*red dashed rectangle)*, thus they are referred here as the intermediate state. The structural transition forms the unbound state (disorder) to the bound state (partial order) is shown.

#### **4. Materials and Methods**

#### *4.1. Peptide Synthesis*

Disulfide dimers SS52 and SS60 were synthesized from commercial peptides SH26 and SH30, to have reference standards to study their interactions with DNA [15]. Both products were purified by semi-preparative reverse phase HPLC (Waters, Milford, MA, USA) and then lyophilized. The characterization was carried out by MALDI–TOF (Bruker Daltonik, Bremen, Germany), SS52: (M + H) calc. C234H414N98O74S2 5786, found: 5788.5 (43%yield). SS60: (M + H)<sup>+</sup> calc. C278H498N116O80S4 6873.95, found: 6875.36 (50% yield).

For the synthesis of Ad26 [15] and Ad30, on both deoxygenated solutions of SH26 (1.9 mg, 6.2 <sup>×</sup> <sup>10</sup>−<sup>4</sup> mmol) and SH30 (2.0 mg, 5.82 <sup>×</sup> <sup>10</sup>−<sup>4</sup> mmol) in potassium phosphate buffer (150 <sup>μ</sup>L, 100 mM, pH = 8.0) and CH3CN 50 μL were added 4 equiv of bromo acetyldamantane in CH3CN (0.79 mg, 9 μL). The mixtures were stirred at room temperature for 3 h under N2 and checked by RP-LC–MS (Agilent 1100, Santa Clara, CA, USA),. The new compounds were purified by semi-preparative RP-HPLC and then lyophilized, identified by mass spectrometry as the alkylated peptides. Once purified and lyophilized, Ad26 (59%) and Ad30 were obtained in 59% and 66% yield, respectively. MALDI–TOF for Ad26, C130H229N49O37S1: calc. (M + H)<sup>+</sup> 3101.36, found 3102.81; and for Ad30, C152H272N58O42S2: calc. (M + H)<sup>+</sup> 3646.04, found 3647.08.

#### *4.2. Synthesis and Characterization of azoCyDdimer*

4,4 -bis (carboxy) azobenzene (0.036 g, 0.132 mmol) was dissolved in dry DMF (2 mL), HATU in DMF was added (0.10 g, 0.317 mmol) and 0.183 mL DIPEA (0.136 g, 1.056 mmol), was stirred for 5 min at room temperature, and to this solution was added β-CyD-NH2 (0.300 g, 0.264 mmol) dissolved in 2 mL of DMF, the resulting mixture was stirred for 3 h under nitrogen atmosphere. The mixture was poured over a container with ice-cold acetone and a precipitate appeared. The solid was separated by filtration and then dissolved in DMF to be purified by silica gel chromatography column. (CH3CN–H2O–NH4OH 14:0:0.5/4:10:0.5). 1H-NMR (500 MHz (Avance II 500, Bruker, Germany), DMSO-d6) δ (ppm): 3.3–4.33 (m, 42H ov 2 H2O), 4.33–4.56 (br.s, 12H), 4.96 (br.s, 14H), 4.82 (br.s, 7H), 5.7–5.82 (m, OH, 27H), 7.98 (4 H, d, *J* = 8.6 Hz), 8.04 (4 H, d, *J* = 8.6 Hz), 8.50 (2 H, br. s). 13C-NMR (125.75 MHz, DMSO-d6) δ (ppm): 41.8 (CH2), 59.7 (CH2), 71.9 (CH), 72.3 (CH), 73.0 (CH), 81.3 (CH), 81.9 (CH), 84.1 (CH), 101.64 (CH), 101.9 (CH), 122.2 (CH), 128.4 (CH), 137.4 (C), 153.7 (C), 166.2 (C). MALDI–TOF (M + H)+: calcd. for C98H148N4O70 2500.8, (M + Na)+: calc. 2523.58, found 2523.53, (M + K)+: calc. 2539.98, found 2539.96.

#### H-NMR Photoisomerization Experiment

Four independent photoisomerization experiments were performed using different initial concentrations of azoCyDdimer: 15 mM, 8 mM, 2.28 mM, and 0.5 mM. Dilutions were made from a 15 mM stock solution with an isomer ratio of 95:5 (E:Z). The stock solution and the dilutions were irradiated at 360 nm for 20 min, and then the corresponding 1HRMN spectra were acquired (AMX 300, Bruker, Germany), protecting the solution from the visible light. Switching experiments were performed with an 8 W mercury arc lamp with filter of 360 nm from Pleuger, Antwerp, Brussels.

#### *4.3. Annealing of dsDNA*

Oligonucleotides were purchased from Thermo Fisher Scientific GmbH on a 0.2 mmol scale as freeze-dried solids. After solving in H2O milliQ, their concentrations were measured by ultraviolet absorption at 260 nm with a BioRad SmartSpec Plus Spectrophotometer. Absorbance was measured twice, and concentrations were calculated applying Lambert–Beer's equation. The molar extinction coefficients of single-strand oligonucleotides were calculated by using the following formula ε(260 nm) <sup>=</sup> {(8.8 <sup>×</sup> #T) <sup>+</sup> (7.3 <sup>×</sup> #C) <sup>+</sup> (11.7 <sup>×</sup> #G) <sup>+</sup> (15.4 <sup>×</sup> #A)} <sup>×</sup> 0.9 <sup>×</sup> 103 <sup>M</sup>−<sup>1</sup> cm<sup>−</sup>1, where #A, #T, #C, #G stand for the number of each type of bases in the DNA strand. Oligonucleotides were annealed by heating an equimolar mixture of the two complementary single-strand DNAs to 90 ◦C for 2 min and then cooling slowly to room temperature (12 h).

#### *4.4. Electrophoretic Shift binding Assays*

EMSA was performed with a BioRad Mini Protean gel system, powered by an electrophoresis power supplies PowerPac Basic model, maximum power 150 V, frequency 50.60 Hz at 140 V (constant V). The binding reactions were performed over 30 min in a binding mixture (20 or 40 μL) containing 18 mm

tris(hydroxymethyl)aminomethane (Tris; pH 7.5), 90 mm KCl, 1.8 mm MgCl2, 1.8 mm EDTA, 9%glycerol, 0.11 mgmL−<sup>1</sup> bovine serum albumin (BSA), and 2.2% NP-40 (nonidet-P40). Products were resolved by PAGE by using a 10% nondenaturing polyacrylamide gel and 0.5XTBE buffer solution (44.5 mm Tris, 44.5 mm boric acid,1 mm EDTA, pH 8) and analyzed by staining with SyBrGold (Molecular Probes: 5 mL in 50 mL of 1XTBE) for 10 min and visualized with fluorescence. Ad26 working concentration was 200 nM, 50 nM of DNA, and for azoCyDdimer was used from 0 to 100 equivalents of the E and Z dimer, respectively, was selected (stock solution 0.74 mM in H2O, ratio E:Z determined by RP-HPLC, Supplementary Materials Figure S5). The order of addition was azoCyDdimer, Ad26 (pre-incubation for 10 min at 4 ◦C), and then the corresponding DNA. Duplicates of independent experiments were performed. For the competitive binding assay, we added 200 equiv of 1-adamantane acetic acid after the addition of azoCyDdimer.

#### *4.5. Circular Dichroism Spectroscopy*

CD experiments were performed on a Jasco spectrometer (Jasco 715, Tokyo, Japan) with 1 mm path-length cell [17]. Samples in 10 mM phosphate buffer (pH 7.0) and 100 mM NaCl contained 10 or 5 μM of peptide and 5 μM oligonucleotides (double-stranded) in the absence or presence of 1 equiv of azoCyDdimer at 4 ◦C (stock solution 0.74 mM in H2O, ratio E:Z determined by RP-HPLC, see Supplementary Materials Figure S5). Every sample was incubated for 5 min before registering. Duplicates of independent experiments were performed. The CD spectra of the peptides (when measured in the presence of DNA) were calculated as the difference between the spectrum of the peptide/DNA mixture and the measured spectrum of the respective dsDNA oligonucleotide. For the competitive binding assay, we added 200 equiv of 1-adamantane acetic acid after the addition of azoCyDdimer. Smoothing of the signals was performed by using the software KaleidaGraph.

**Supplementary Materials:** The following are available online at http://www.mdpi.com/1420-3049/24/13/2508/s1, Figure S1: 1H-NMR of the azoCyDdimer, Figure S2: 2D-NMR-HSQC of azoCyDdimer, Figure S3: 2D-NMR-HMBC of azoCyDdimer, Figure S4: UV–Vis Spectra of azoCyDdimer depending on illumination conditions; Figure S5 RP-LC–MS of azoCyDdimer depending on illumination conditions. Tables S1 and S2: 1H-NMR and 13C-NMR chemical shifts of AzoCyDdimer (E).

**Author Contributions:** Conceptualization, V.I.D.; Formal analysis, Z.B.Q. and V.I.D.; Funding acquisition, Z.B.Q. and V.I.D.; Investigation, Z.B.Q. and M.A.S.; Methodology, Z.B.Q., J.C.M., and V.I.D.; Resources, J.C.M. and V.I.D.; Supervision, J.C.M. and V.I.D.; Writing—original draft, Z.B.Q.; Writing—review and editing, Z.B.Q., M.A.S., J.C.M., and V.I.D.

**Funding:** This research was funded by National Council of Scientific and Technical Research of Argentina (CONICET, PIP 114-200801-00090), National University of South grant PGI-UNS 2014, MINCYT-DAAD program with University of Bielefeld. The APC was funded by the Deutsche Forschungsgemeinschaft and the Open Access Publication Fund of Bielefeld University.

**Acknowledgments:** Z.B.Q. is very grateful for her CONICET fellowships, Erasmus Mundus fellowship (EuroTango) in J. C. Martins' group and Santander-USC fellowship in J. L. Mascareñas' group. M.A.S. thanks her CONICET fellowship. V.I.D. thanks N. Sewald and J. L. Mascareñas for their support.

**Conflicts of Interest:** The authors declare no conflict of interest. The funders had no role in the design of the study; in the collection, analyses, or interpretation of data; in the writing of the manuscript; or in the decision to publish the results.

#### **References**


**Sample Availability:** Samples of the compounds are not available from the authors.

© 2019 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).

## *Article* **From the Kinetic Theory of Gases to the Kinetics of Rate Processes: On the Verge of the Thermodynamic and Kinetic Limits**

### **Valter H. Carvalho-Silva 1,\*, Nayara D. Coutinho 2,\* and Vincenzo Aquilanti 3,4,\***


Academic Editor: Pier Luigi Gentilli Received: 8 April 2020; Accepted: 28 April 2020; Published: 30 April 2020

**Abstract:** A variety of current experiments and molecular dynamics computations are expanding our understanding of rate processes occurring in extreme environments, especially at low temperatures, where deviations from linearity of Arrhenius plots are revealed. The thermodynamic behavior of molecular systems is determined at a specific temperature within conditions on large volume and number of particles at a given density (the thermodynamic limit): on the other side, kinetic features are intuitively perceived as defined in a range between the extreme temperatures, which limit the existence of each specific phase. In this paper, extending the statistical mechanics approach due to Fowler and collaborators, ensembles and partition functions are defined to evaluate initial state averages and activation energies involved in the kinetics of rate processes. A key step is delayed access to the thermodynamic limit when conditions on a large volume and number of particles are not fulfilled: the involved mathematical analysis requires consideration of the role of the succession for the exponential function due to Euler, precursor to the Poisson and Boltzmann classical distributions, recently discussed. Arguments are presented to demonstrate that a universal feature emerges: Convex Arrhenius plots (*super*-Arrhenius behavior) as temperature decreases are amply documented in progressively wider contexts, such as viscosity and glass transitions, biological processes, enzymatic catalysis, plasma catalysis, geochemical fluidity, and chemical reactions involving collective phenomena. The treatment expands the classical Tolman's theorem formulated quantally by Fowler and Guggenheim: the activation energy of processes is related to the averages of microscopic energies. We previously introduced the concept of "transitivity", a function that compactly accounts for the development of heuristic formulas and suggests the search for universal behavior. The velocity distribution function far from the thermodynamic limit is illustrated; the fraction of molecules with energy in excess of a certain threshold for the description of the kinetics of low-temperature transitions and of non-equilibrium reaction rates is derived. Uniform extension beyond the classical case to include quantum tunneling (leading to the concavity of plots, *sub*-Arrhenius behavior) and to Fermi and Bose statistics has been considered elsewhere. A companion paper presents a computational code permitting applications to a variety of phenomena and provides further examples.

**Keywords:** Maxwell–Boltzmann path; Euler's formula for the exponential; activation; transitivity; transport phenomena

#### **1. Introduction**

A basic task of current molecular science is to elucidate how the kinetic behavior of a physicochemical system manifests within the temperature range of its "life span": thermodynamics has its focus on states of the system and the transition between them, while the study of the rate of evolution of processes is the objective of kinetics. In thermodynamics (as in mechanics), it is ubiquitous to face the balance among various types of energy being exchanged; the connection from the molecular to macroscopic energy levels requires averages over the myriads of ways of change of microscopic configurations that determine the progress of events. The situation in chemical kinetics is intrinsically not so sharp, not only because systems in movement are much harder to be studied than those in steady-state equilibrium. Currently, the techniques in experimental and theoretical kinetics have been advancing enormously (although at a much slower pace when compared with those of thermodynamics), due to progress on production and detection of molecular beams and on classical and quantum simulations of molecular dynamics.

Aspects related to the foundations of the kinetics of rate processes were elaborated recently in previous papers [1–4]. In [1], fundamental concepts concerning statistical distributions and reaction rate theory were presented, including the definition of transitivity, a function of absolute temperature denoted as γ(T), based on extensive phenomenology that is being accumulated; a subsequent paper [2] considered the historical background of developments of chemical kinetics, leading to the basic foundations through analysis of key mathematical ingredients; in [3], the formulations based on the concept of transitivity were compacted and applied to the description of several phenomena on the temperature dependence of rate processes beyond Arrhenius and Eyring; and finally in paper [4], companion of this one in this topical issue, a computational code is described and provided to calculate kinetics and related parameters in chemical transformations and transport phenomena.

The need emerges of differentiating developments from those employed in thermodynamics, in spite of the fact that the kinetic theory of gases by Maxwell (and later by Boltzmann) was formulated more or less contemporary to the thermodynamics of Carnot (and later of Clausius): their thermodynamic vision was later merged turning the Maxwell theory essentially in terms of Boltzmann–Gibbs distributions. Additionally, as a well-known matter of fact in the literature [5–7], the Arrhenius equation, basic to chemical kinetics, was suggested as an empirical adaptation of the thermodynamics of chemical equilibrium developed by van't Hoff (ca. 1880).

In the present work, account will be given to how a derivation of a theory of rate processes from non-equilibrium distributions involves essentially steps that are usual in thermodynamics, specifically as far as averaging procedures are concerned. A specific feature here is that we are progressing in the spirit of the well-known Darwin–Fowler formulation [8–10], which involves departure from the concept of "most probable" configuration emerging following the Boltzmann–Gibbs path [11,12]. Darwin and Fowler dealt with average quantities: they essentially developed a thermodynamics equivalent to the canonical form with no need of the concept of a microcanonical ensemble or even of that of entropy: a similar alternative path was briefly indicated by Eyring and coworkers presenting the foundations of the "Transition-State Theory" of rate processes [13–16]. This approach appears better motivated than the traditional: current experiments involve molecular beams studies of individual events, and advances in quantum mechanical treatments indicate the "royal path": theoretical chemical kinetics proceeds by generating intermolecular potential energy surfaces and simulate computationally the passage from myriads of microscopic events to macroscopic quantities. This can be formulated at least in principle: however, when we consider polyatomic systems in molecular dynamics, we are unable to fully circumventing the difficulty presented by the need of averaging on a large number of events, difficult to be sampled in a statistically converged way. The situation was first anticipated by Maxwell who conjectured that the collective macroscopic observable motion of atoms, if they existed, should be compacted by averaging over statistical ensembles of their "trajectories" [17,18].

Following downward the upper part of the chart in Figure 1, we consider ab initio 'exact' quantum dynamics: it is expected to provide benchmark kinetics data, but is in practice still limited to

simple cases, see [19–21]. Such applications of exemplary chemical reaction kinetics typically proceed according to the descending steps schematized in Figure 1: (a) calculation of the molecular electronic structure interactions involving high-level of quantum chemical accuracy, (b) dynamical evolution in phase-space configurations from the solution of the (usually time-independent) quantum equations of the motion, and (c) extraction of reactive properties from asymptotic scattering theory and calculating in succession key quantities: the quantum scattering matrix, the cumulative reaction probability and the cross sections. Finally, the Boltzmann weight averaging over a large span and fine grid of kinetic energies is needed to obtain the canonical quantity of chemical kinetics, the rate constants *k*(T) as a function of temperature. These rigorous prescriptions can yield benchmark results for quantum evolution of systems over a given potential energy surface and provide reaction rate constants for only a limited number of reactions: in fact, the complexity of the programming and the computationally demanding requirements strongly limit this type of study to reactive processes involving only a few atoms [19,21,22]. However, they serve as prototypes to complex molecular systems and stepping stones for example to processes governed by multiple potential energy surfaces, nonadiabatically coupled.

**Figure 1.** Stepping stones for the sequence (in red) from quantum chemical calculations to theoretical ("exact") rate coefficients (downward along upper part of the chart, see also [2]); and the sequence (in blue) from the transitivity function γ(β) to the phenomenological rate coefficients (upward in the lower part of the chart). Models for γ(β) are discussed in [3] and further in Sections 3 and 4. Here α = 1/ε‡, the reciprocal of Eyring activation energy for β<sup>0</sup> = 0 in the asymptotically high temperature of the thermodynamic limit (Section 2.2). The critical exponent ξ generalizes the integer *n* in reference [3] and will be related (see Figure 5) to classes of universal behavior.

In this paper, the phenomenological rate theory [1–3] is developed by introducing a mechanism where the delay or acceleration of the approach to a well-defined mathematical limit due to Euler accounts for the low-temperature deviations of rates from Arrhenius law. In the next section, revisitation of the classical thermodynamic limit accompanies its extension to kinetics and naturally leads to a deformation of the Boltzmann–Gibbs distribution and to the emergence of a formulation alternative to that of Arrhenius [1–3]. Section 3 illustrates how theory serves to the natural scaling of a variety of physical and chemical processes in extreme conditions and near phase transitions. Implementations to various phenomena are sketched in Section 4. Concluding remarks are given in Section 5. Appendix A presents formulas for the distribution of energies in a reactive process away from equilibrium.

#### **2. Thermodynamic versus Kinetic Limits, Revisited**

#### *2.1. The Exponential as Limit of Euler's Succession: Role in the Early Kinetic Theory of Gases*

The memorable succession for the exponential function from the sum of an infinite series is a powerful variant of the binomial theorem of Newton and was discovered by Euler in the XVIII century. For its occurrence originated in a famous bank account problem solved by Jacob Bernoulli and for aspects of its remarkable facets, see the recent papers [1–3,23]. The tremendous advances in the kinetic theory of gases started in the mid-XIX century with Maxwell's mathematically intuition to look at the microscopic world as composed of greatly many indivisible particles, atoms. According to this vision, which found in Ludwig Boltzmann [24] one of the greatest defenders in times when even the existence of atoms was being questioned, the germs of what is now known as statistical mechanics were formulated: the motion of microscopic particles was correlated to macroscopic observables providing the foundations for the phenomenology of thermodynamics. It is not always recognized that the statistical proposition for success or failure of events (Bernoulli urn or Bernoulli trial binomial process and its generalization) provided through the Euler's limit the foundations for the derivation of extensions to distributions, i.e., the foundations for the progress in the XIX and the early XX century, remarkably those of Poisson, Gauss, Planck, Bose–Einstein, and Fermi–Dirac [1].

At the very origin of the statistical mechanics viewpoint, the investigations reported in the 1860 [17] and 1866 [18] papers by James Clerk Maxwell lead to the famous velocity distribution of molecules under the hypothesis of the independence of Cartesian components of the velocity vectors: this conjecture appeared plausible from the additive properties of the exponential function. In 1868, Ludwig Boltzmann [25], as reviewed, e.g., in reference [26], introduced probabilistic concepts—the "marginal" probability of the energy of a molecule—obtaining a derivation of the Maxwell's law of velocities by rigorous treatment based explicitly on the exponential behavior of velocity according to the Euler's succession, see Figure 2.

A decade after, Maxwell [27] returns to the Boltzmann's formulation proposing a more insightful approach of the problem, see Figure 3, rarely considered in the expositions of the theory in wide number of papers, treatises, and textbooks. In 1916, Jeans in his treatise on "The Dynamical Theory of Gases" [28] addresses Maxwell's latest treatment within a much more concise mathematical analysis generalizing the concept of phase-space: again the exponential velocity distribution law is obtained from the Euler's limit of a succession. This same procedure can be traced in further [29] and recent [30] works, where cases involving finite systems are dealt with essentially by arresting the treatment before taking Euler's limit, namely without taking what is now recognized as the "thermodynamic limit".

**Figure 2.** The page of the Boltzmann's paper where he used a combinatorial argument that we follow in the formulation exposed in the Appendix A. He obtained the Maxwell distribution through the Euler formula (boxed) giving the exponential function as a limit of a succession.

$$\frac{2^{-b}\Gamma\left(\frac{n}{2}\right)}{\Gamma(\sharp)\Gamma\left(\frac{n-1}{2}\right)}\frac{\left[E-V-\frac{1}{2}\mu\_n a\_n^{\ast}\right]^{\frac{n-3}{2}}}{\left[E-V\right]^{\frac{n-4}{2}}}\mu\_n{}^{\sharp}da\_n\ \cdot \ 1 \rightarrow \cdots \rightarrow \cdots \cdots \tag{49}$$

$$\left\| \begin{array}{c} \Gamma\left(\frac{n}{2}\right) \\ \hline \Gamma(\frac{1}{2}) \Gamma\left(\frac{n-1}{2}\right) \\ \hline \end{array} \frac{\left[E-V-k\_{n}\right]^{\frac{n-3}{2}}}{\left[E-V\right]^{\frac{n-3}{2}}} k\_{n}^{-1} dk\_{n} \begin{array}{c} \text{---} \\ \text{---} \dots \text{---} \dots \text{(51)}. \end{array} \right.$$

In the 7th Chapter of his 1938 treatise "Kinetic Theory of Gases", Kennard [31] develops a comprehensive assessment of macroscopic irregular motion of molecules (including, e.g., the Brownian motion) as connected to averaged microscopic fluctuations: the connection between discrete statistical distributions and exponential functions is obtained by the Euler's succession, taking the limit to infinity of the number of particles. Earlier, in a collection of his investigations on statistical mechanics collected in a 1927 book, Tolman [32] had briefly discussed the role of taking limits in the description of fluctuations for a large number of molecules; in his treatise in 1938 [33] the theme of fluctuations and thermodynamic equilibrium are discussed in more details through a detour involving the Stirling formula for factorials and maximization of entropy in the Boltzmann–Gibbs approach. In either way these treatments involved imposing limiting values to specific variables and anticipating the operation that we will discuss in the next section, namely that of taking the thermodynamic limit, see [31,34]: in some cases, as intermediate steps in the course of derivations, physically insightful expressions were encountered.

#### *2.2. The Thermodynamic Limit: The Contribution of Fowler and Collaborators*

To give a general foundation to statistical mechanics, stepping stones can be schematized as follows. Darwin and Fowler [10,35] developed their approach in the early twenties, introducing the concept of temperature as the zero principle and defining as key quantities specific distributions and in particular partition functions [10,36]. In a lucid lecture, in 1948 Schrödinger [37] describes their achievements as major after those of Boltzmann (1868) [25] and Gibbs (1902) [12]. A few years later a mathematical analysis around the concept of the thermodynamic limit was carried out by Yang and Lee [38–40]: they considered the limit as to be taken with respect to properties in the neighborhood of phase transitions, and gave a deep theory of associated analytic singularities.

Basic to a variety of modern treatments, the thermodynamic limit concept was mentioned as central in many ways: there are at least three recent books [41–43] and a dedicated paper on statistical mechanics [44] where the treatments are from the very beginning based on the introduction of the concept of thermodynamic limit. Reference [44] refers to the Darwin–Fowler method as powerful alternative to the Boltzmann–Gibbs celebrated construction and describes the Bogoliubov contribution. Technically, Darwin and Fowler [8,9] and Fowler and Guggenheim [35] obtain average quantities from multivariable distributions using an asymptotic method, that is of the steepest descent: reference [45] on p. 53 shows equivalence with taking the Stirling limit for factorials and the Lagrange maximization of functions by undetermined multipliers, a procedure which is standard in the Boltzmann–Gibbs statistical approach to entropy.

Usually, in most popular books the thermodynamic limit is defined only in words. The quantitative definition [46,47] is provided considering extensive variables, the volume *V* and the number of particles *N* going to infinity while their ratio, the density *ρ* = *N*/*V*, remains finite: see Figure 2 for the reproduction of the original treatment by Boltzmann. The formulas exploit essentially the limit of a succession due to Euler to obtain the exponential function [2]; in books by both Pathria [42] and Huang [40] on statistical mechanics, the very first concept introduced is the thermodynamic limit and provide accessible qualitative, useful presentations of the contributions by Yang and Lee [38–40].

In one of the Landau and Lifshitz series of textbooks, there is a treatment now considered as standard [46]. The section arguably written by L.P. Pitaevski, addresses the problem of fluctuations, obtaining the Poisson distribution considering the volume *V* of gas occupied by a number of particles *N*. Let *v* be a small part of the total volume and proceeding with the same ingredients used by Boltzmann in 1868 ([25] and Figure 2) one can show that the probability for a volume *v* to contain *n* molecules follows a Jacob Bernoulli's binomial distribution of the type

$$P(n) = \frac{1}{n!} \frac{N!}{(N-n)!} \left(\frac{\upsilon}{V}\right)^n \left(1 - \frac{\upsilon}{V}\right)^{N-n}.\tag{1}$$

Taking the limit

$$N \to \infty \text{ and } \ V \to \infty \tag{2}$$

for an average number of particles *n* in volume *V*, while the density, namely the ratio

$$
\rho = \frac{N}{V} = \frac{\overline{n}}{v} \tag{3}
$$

remains finite, the passage from the binomial distribution to the Poisson distribution is obtained by the Euler's formula for the exponential function as a limit of a succession (see also the references [1,41]):

$$P(n) = \frac{\overline{n}^n}{n!}e^{-\pi} \tag{4}$$

This derivation can be taken as representative of the content of the expression "taking the thermodynamic limit".

Other treatments are worthy of mention. Using a path analogous to Boltzmann's "marginal" probability, Eyring and collaborators [14–16] provide an elementary presentation based on a paper by Condon (1938) [34] on a statistical mechanics derivation of the Boltzmann distribution law. The treatment is interesting for chemical kinetics. Considering the equilibrium of a molecular subsystem within a system composed of *s* harmonic oscillators [15], it is possible to calculate the probability of the molecular subsystem to acquire an energy ε. Assuming a total energy *E* of the system, one can identify the number of ways to distribute *<sup>m</sup>* <sup>=</sup> *<sup>E</sup>*−*<sup>ε</sup> <sup>h</sup><sup>ν</sup>* quanta of energy among the *<sup>s</sup>* oscillators of the system (*hν* is the energy per quantum):

$$\mathcal{W}(s,m) = \binom{s+m-1}{m} = \frac{(s+m-1)!}{m!(s-1)!}.\tag{5}$$

The probability of the molecular system to acquire the energy ε is

$$P(s,m) = \frac{\binom{s+m-1}{m}}{\sum\_{j} \binom{s+j-1}{j}}.\tag{6}$$

Taking what is now recognized as the thermodynamic limit (in this case, *s* and *m* tending to infinity) and using the Euler's formula for the exponential function as a limit of a succession, the Boltzmann law is recovered

$$P(\varepsilon\_i) = \frac{e^{-\frac{\varepsilon\_i}{k\_B T}}}{\sum\_j e^{-\frac{\varepsilon\_i}{k\_B T}}}.\tag{7}$$

where *ε* is energy, *kB* is Boltzmann's constant, and *T* the absolute temperature. In reference [1] modifications needed to obtain Bose–Einstein and Fermi–Dirac distributions are sketched.

As we have seen, it went unnoticed that many formulations had been anticipated by Boltzmann in his 1868 article [25]. Indeed, he himself in the famous paper published in 1877 [1] changed focus, and developed the celebrated procedure: that of searching for most probable values with limits on particle numbers at a given total energy to obtain the entropy by the Lagrange method of undetermined multipliers. Due to this spectacular result the attention of most subsequent investigations was diverted away from the kinetic approach towards thermodynamic treatments. However, in 1940 Hinshelwood [48] sketched a pedagogical justification of Boltzmann's exponential distribution considering the probability of favorable collisions that can lead to a specific energy accumulation into molecules: he relies explicitly on the Euler's limit to obtain the exponential function. See more details in [2], where it is emphasized that when interpreting this mechanism as that operating in the activation of molecules one has a profound insight on chemical reactivity and on rate processes.

#### *2.3. Avoiding the Thermodynamic Limit Describes Nonlinearities of Arrhenius Plots*

In chemical kinetics the difficulty occurs in finding macroscopical and canonical kinetics properties, ingredients analogous to those of thermodynamics. Temperature can be introduced assuming the zero principle for the preparation of reactants: the question arises how to use an analogue of the thermodynamic limit when it is arbitrary to define extensive quantities like number of particles (*N*) or volume (*V*) in a reactive process.

For kinetics of the chemical and physical rate processes the evidence [49–56] of deviations from Arrhenius behavior is increasing—arguably it is to be associated to moving away significantly from the Boltzmann–Gibbs distribution, particularly going down to low temperatures. In Boltzmann's equation of Figure 2 substituting *k*/*x* with *ε*‡*β*/*N* we performed the same mathematical operation as for the thermodynamic limit: the passage of the distribution when *n* goes to infinity can be written:

$$P\_N\left(\varepsilon^\ddagger\beta\right) = \left(1 - \frac{\varepsilon^\ddagger\beta}{N}\right)^N \stackrel{N \to \infty}{\to} P\left(\varepsilon^\ddagger\beta\right) = \varepsilon^{-\varepsilon\dagger\beta} \tag{8}$$

where *ε*‡ (the activation energy) represents an energetic obstacle for the process to occur and *β* = 1/*kBT* is the usual Lagrange multiplier, the "coldness" [2,57]. The exponential Boltzmann–Gibbs distribution *P*(*ε*‡*β*) emerges as a limiting case of a power law distribution *PN*(*ε*‡*β*) we have shown [1,58–60] that the latter, corresponding to avoiding taking the limit permits that the low temperature deviations in kinetic processes can be described with remarkable consistency in a generality of contexts. This treatment makes explicit the connection of *PN*(*ε*‡*β*) distribution with Tsallis statistics [61,62] identifying 1/*N* with 1 − *q*, where *N* is allowed to be continuous and Tsallis *q* is classically limited in a small range.

The final expression in (8) is Arrhenius law apart from the pre-exponential factor *A*: Further convenient introduction of the deformation parameter *d* in place of 1/*N* one has

$$k\_d \left( \varepsilon^\ddagger \beta \right) = A \left( 1 - \epsilon \ell \varepsilon^\ddagger \beta \right)^{1/d} \stackrel{d \to 0}{\to} k \left( \varepsilon^\ddagger \beta \right) = \varepsilon^{-\varepsilon^\ddagger \beta} \tag{9}$$

The left-hand side of the correspondence (9) is known as the Aquilanti–Mundim deformed Arrhenius formula. We amply proved that it could be considered uniformly both for *d* < 0 (quantum propensity) and for *d* > 0 (classical propensity). The first case has been treated amply elsewhere [59,63,64]; we mostly focused here on the second case [60,65]. Rarer cases are found for which *d* > 0 and *ε*‡ < 0, and are referred as corresponding to an *anti*-Arrhenius behavior [66–68].

There are some examples in the literature [29,69–71] where there is evidence that our language the classical thermodynamic limit is reached due to the magnitude of Avogadro number O 10<sup>24</sup> . Considering this number, the statistical mechanics treatment indicates that fluctuations of energy in a canonical ensemble turn out extremely sharp and narrow. This peculiarity of the energy distribution is required to admit the equivalence between averages and most probable values of variables: applicability of statistical techniques to the foundations of thermodynamics relies on this kind of argument, upon which is also based the concept of the thermodynamic limit. At low temperatures, especially in reactive and non-reactive processes, conditions can be violated since the order of magnitude of the activated events is controlled by the Eyring pre-exponential factor 2*π <sup>β</sup>* ∼ O 1013 with a consequent possibility of relaxation of the thermodynamic limit.

Far away from the thermodynamic limit at extreme conditions, a lower thermal limit can be assumed for this *super*-Arrhenius case where chemical or physical rate processes would require an infinite time to proceed, i.e., when

$$1 - d\varepsilon^{\ddagger} \beta \approx 0\tag{10}$$

this condition provides an interpretation for the parameter *d* already pointed out and defined by us in previous papers [1–3] (and recently confirmed [72,73]):

$$\mathcal{A} = \frac{k\_B T^\ddagger}{\varepsilon^\ddagger} = \frac{\varepsilon^\ddagger}{\varepsilon^\ddagger} \tag{11}$$

where the superscript † denotes a minimum temperature *T*† or a thermal energy threshold *ε*† for which the kinetic process is operative. The *d* parameter differs from zero as a scale to measure the "thermal limit of propensity" with reference to the lower and higher kinetic energy values respectively to *<sup>ε</sup>*† and *<sup>ε</sup>*‡, see Scheme 1: (i) when *<sup>β</sup>* <sup>→</sup> 0, and also remaining finite <sup>=</sup> *<sup>ε</sup>*† *<sup>ε</sup>*‡ (Cauchy's limit [73]) Equation (9) also tends to the exponential law; and ii) when *<sup>β</sup>* <sup>→</sup> <sup>∞</sup>, *ε*‡*<sup>β</sup>* <sup>→</sup> <sup>1</sup> and <sup>=</sup> *<sup>ε</sup>*† *<sup>ε</sup>*‡ remaining finite, a minimum limit can be identified for which the kinetic process may occur. Case ii) applies for *d* > 0, the *super*-Arrhenius cases: the *sub* limit is consistent with Wigner's threshold law for thermoneutral reactions [74–76]. Furthermore, taking advantage of an alternative expansion [77,78] for the Aquilanti–Mundim law

$$k(k\beta) = \left(1 - d\varepsilon^{\ddagger}\beta\right)^{\frac{1}{d}} = A e^{-\varepsilon\dagger\beta} \left[1 - \frac{1}{2}d\left(\varepsilon^{\ddagger}\beta\right)^2 - \frac{1}{3}d^2\left(\varepsilon^{\ddagger}\beta\right)^3 - \frac{1}{8}(2d - 1)d^2\left(\varepsilon^{\ddagger}\beta\right)^4 + \mathcal{O}\left(\beta^6\right)\right] \tag{12}$$

$$\mathcal{V} \to \infty; \mathbf{N} \to \infty; \quad \mathbf{N}/\mathcal{V} = \mathbf{f} \text{nite}$$

$$\lim\_{\forall \beta \colon d \to 0} A \left( 1 - d \varepsilon^{\sharp} \beta \right)^{1/\_d} = A \ e^{-\varepsilon^{\sharp} \beta}$$

$$\begin{aligned} \text{(i)} \quad \beta &\rightarrow \mathbf{0}; \ d = \frac{\varepsilon^{\dagger}}{\varepsilon^{\ddagger}} = \text{finite} \qquad \lim\_{\forall d \in \beta \to 0} A \left(1 - d\varepsilon^{\ddagger}\beta\right)^{1/d} = \text{} \qquad A \, e^{-\varepsilon^{\ddagger}\beta} \\\\ \text{(ii)} \quad \beta &\rightarrow \infty; \ d\varepsilon^{\ddagger}\beta \to \mathbf{1} \qquad \lim\_{\forall d \in \beta \to 1} A \left\{1 - d\varepsilon^{\ddagger}\beta\right\}^{1/d} = \begin{bmatrix} 0 & \text{(super)} \\\\ \text{A } \beta & \text{(sub)} \\\\ & \text{for thermal} \\ & \text{reactions} \end{bmatrix} \end{aligned}$$

**Scheme 1.** The three limits: the thermodynamic limit is an application of that of Euler for → 0 ; the kinetic limits apply for extreme values of *β* confining the temperature range where the process under consideration is operative. For various cases of the *sub*-Arrhenius kinetic limits for *β* → ∞ see text.

Note a misprint in Equation (17) in reference [3]. The three limits previously described are illustrated as defining the thermodynamic and kinetic limits, summarized in the following scheme (see also Figure 4).

**Figure 4.** 3D Arrhenius plot illustrating the kinetic limits (Scheme 1) through the functional dependence of the AM rate coefficient *k* on the parameters *d* and *β*. The two kinetic limits apply at the extremes of the range of β. For any value of *β*, the Arrhenius law is recovered when *d* tends to zero according to the thermodynamic limit (Scheme 1).

From Figure 4, it is possible to perceive graphically the three limits in a 3D Arrhenius plot, ln*k vs*. *d* and *β*. When *d* tends to zero the Arrhenius law is recovered. For *d* > 0, a convex curvature is generated (*super*-Arrhenius) and the tendency for a lower thermal limit is observed. When *d* < 0 the plot becomes concave (*sub*-Arrhenius) because of quantum mechanically tunneling.

For small tunneling, we showed that [63]

$$d = \frac{-1}{3} \left(\frac{h\nu^\sharp}{2\varepsilon^\sharp}\right)^2\tag{13}$$

where *ε*‡ is the barrier height, directly proportional to ν‡, the square of the frequency for crossing the barrier at the maximum in the potential energy surface. For the concave case, the tendency is attenuated and known as the Wigner limit [74,75] for thermoneutral chemical reactions (see in Scheme 1):

$$\lim\_{\forall d \text{ } d \text{ } \iota \not\subset \emptyset \to 1} A \left( 1 - d \varepsilon^{\sharp} \not\rhd \right)^{\frac{1}{d}} = A \not\rhd \mathbb{1}. \tag{14}$$

#### *2.4. Architecting the Transitivity Concept*

From a conceptual viewpoint and with reference to Figure 1, in a previous paper [2] we emphasized how essentially a statistical mechanics path to chemical kinetics (the theory of change) can be based on a theory of chance, where however criteria for choices need be provided [2,3]: the transitivity concept is exhibited as playing a crucial role.

A recent paper [3] also gives an account of how useful it is the introduction of representations of experimentally or numerically exact rate constant data; the first, of course, is the Arrhenius plane [5], whereby the apparent activation energy is interpreted according to the Tolman's theorem [79]; the second one is the transitivity plane. We sketched here and elaborated elsewhere [4] a further aspect justifying how the definition of the transitivity function that can provide an understanding of microscopic kinetic processes using alternative forms of scaling of the rate data, yielding naturally the conventional statistics used in rate process, from Maxwell–Boltzmann [34] to Tsallis statistics [62], and to the further popular Vogel–Fulcher–Tammann [80–82] distributions.

#### 2.4.1. Tolman's Theorem and the Apparent Activation Energy

Conventionally, starting points in the statistical thermodynamics proposed by Willard Gibbs [12] and Fowler et al. [10,35] are the average energy *E* of a canonical system obtained as the logarithmic derivative of the partition function *Z* with respect to *β*

$$\overline{E} = -\frac{\mathbf{d}}{\mathbf{d} \, \beta} \ln Z \tag{15}$$

When we turn to kinetics, the correspondence can be established considering the average energy to be accumulated by colliding molecules to proceed to reaction. Following Tolman (the first paper is one hundred years old [79]), well-characterized is the concept of apparent activation energy *Eα*(*β*). This entity is customarily obtained by chemical kinetics data on reaction rate coefficients *k*(*β*) *k*(*β*), phenomenologically from the Arrhenius plot (as recommended in 1996 by the definition of the International Union of Pure and Applied Chemistry [83]):

$$E\_a(\beta) = -\frac{\mathbf{d}}{\mathbf{d}\,\beta} \ln k(\beta) \tag{16}$$

The apparent activation energy can be written as the difference between the average internal energy of the reacting molecules and that of all molecules in the system: this is the content of statement of the so-called Tolman's theorem [35,79], which has been analyzed quantum mechanically by Fowler and Guggenheim [35]: the meaning is that *Ea* represents a measure of an energetic obstacle to the progress of the reaction, reinterpreted subsequently and exploited as the barrier height energy in Eyring's formulation of the transition state theory [13]. Basic is to consider the apparent activation energy as essentially the *ε*‡ parameter of the previous section (actually, the double dagger notation is that introduced by Eyring).

#### 2.4.2. Planck Black-Body Radiation and Reciprocal Energy

We provided now a further perspective viewpoint of the phenomenological parameters involved in the definition of the activation energy and its reciprocal, the transitivity function, going back to the elementary formulation of Planck to solve the problem of the average energy of a black body [14,84]. Assuming a system composed of harmonic oscillators with quantized energy  *<sup>n</sup>* = *nhν*, where *ν* is the frequency of the oscillator, the partition function for this system is given as,

$$Z = \sum\_{n} e^{-\mathfrak{c}\_{n}\mathfrak{b}} \tag{17}$$

The total average energy of this system can be calculated using Equation (15)

$$\begin{cases} \overline{E} = -\frac{\text{d}}{\text{d}^3 \beta} \ln \sum\_{n} e^{-\varepsilon\_n \beta} \\ \overline{E} = -\frac{\text{d}}{\text{d}^3 \beta} \ln \left[ 1 + e^{-h\nu\beta} + \left( e^{-h\nu\beta} \right)^2 + \left( e^{-h\nu\beta} \right)^3 + \dots \right] \\ \overline{E} = -\frac{\text{d}}{\text{d}^3 \beta} \ln \left[ \frac{1}{1 - e^{-h\nu\beta}} \right] = \frac{h\nu}{e^{h\nu\beta} - 1} \end{cases} \tag{18}$$

At low temperatures ( *β* → ∞), expansion of the reciprocal of *E* assumes a functional form that produces a power law in *β*,

$$\frac{1}{E} = \frac{1}{h\nu} \left[ 1 + h\nu\beta + \frac{1}{2!} (h\nu\beta)^2 + \frac{1}{3!} (h\nu\beta)^3 + \frac{1}{4!} (h\nu\beta)^4 + \dots \right] \tag{19}$$

suggesting the usefulness of introducing also to this case the inverse of *E*, as an analog of the transitivity function *γ*(*β*). See [1] for elaboration of quantum statistical treatments.

#### 2.4.3. Activation and Transitivity: A Prototypical Unimolecular Reaction Model

Another interesting case that generates a functional form justifying the introduction of transitivity is the model considered in the Twelfth Chapter of reference [35] by Fowler and Guggenheim: they propose the initial steps of a kinetic theory for unimolecular processes, starting from a quantum theoretical formulation of Tolman's theorem for calculating the probability for molecules to react after acquiring a sufficient amount of energy *ε*‡ distributed over the *s* internal degrees of freedom characterizing the reacting molecule. The model, arguably valid within a small enough neighborhood of *β* = 0, is that all active molecules have the same probability of decomposition, requiring an average over all possible values of energy for the active molecules: these hypotheses represent the prototypical theory of unimolecular reactions further elaborate into that of Slater [85] and to the more successful RRKM formulation. We find interesting to elaborate further and provide a continuation of their formulation.

Consider, as first suggested by René Marcelin [86], that a molecule is "active" if the energy exceeding *ε*‡ is distributed over s internal vibrational degree of freedom: the last formula of Fowler and Guggenheim presentation for the unimolecular rate constant can be transcribed here in our notation in a remarkably simplified form

$$k(\beta) = A e^{-\varepsilon^{\ddagger} \beta} \sum\_{r=0}^{s-1} \frac{1}{r!} \left(\varepsilon^{\ddagger} \beta\right)^r. \tag{20}$$

First, we note that the sum can be given in closed form

$$k(\beta) = A \left( \frac{\epsilon^{\varepsilon^{\dagger}\beta} \Gamma(s, \varepsilon^{\dagger}\beta)}{\Gamma(s)} \right) e^{-\varepsilon^{\dagger}\beta}. \tag{21}$$

where Γ *s*,*ε*‡*β* is the incomplete Γ-function. Additionally, we calculated the activation energy for the Fowler–Guggenheim model by logarithmic differentiation of the rate coefficient (21) with respect to *β*

$$E\_{a} = -\frac{\partial \ln k}{\partial \beta} = \varepsilon^{\ddagger} - \frac{\sum\_{r=1}^{s-1} \frac{1}{(r-1)! \beta} \left(\varepsilon^{\ddagger} \beta\right)^{r}}{\sum\_{r=0}^{s-1} \frac{1}{r!} \left(\varepsilon^{\ddagger} \beta\right)^{r}} = \varepsilon^{\ddagger} \frac{\left(\varepsilon^{\ddagger} \beta\right)^{s-1}}{\Gamma\left(s, \varepsilon^{\ddagger} \beta\right)} e^{-\varepsilon^{\ddagger} \beta}. \tag{22}$$

Finally, we got a closed form in *β* also for the transitivity function

$$\gamma(\beta) = \frac{1}{E\_a} = (s - 1)! \sum\_{r=0}^{s-1} \frac{1}{r!\beta} \left(\varepsilon^\sharp \beta\right)^{r-s} = \frac{1}{\varepsilon^\sharp} \frac{\Gamma\left(s, \varepsilon^\sharp \beta\right)}{\left(\varepsilon^\sharp \beta\right)^{s-1}} e^{\varepsilon^\sharp \beta}. \tag{23}$$

This model for unimolecular reactions is of interest for any pseudo-first-order processes. Two other cases analyzed by reference [35] should be similarly investigated, permitting asymptotic analysis through well-known properties of special functions.

#### **3. Scaling in the Transitivity Plane**

#### *3.1. Transitivity and Renormalization Group Coupling*

It is important to recognize the similarity between the functional form of transitivity *γ* with respect to the rate coefficient *k* and the reciprocal temperature *β* (Figure 1); and the renormalization group coupling parameter, credited to Callan [87] and Symanzik [88] (see also Wilson [89]): *βCS*(*g*) = −(*∂g*/*∂* ln *μ*) defines modernly the relationship between the coupling constant *g* and the energy scaling function *μ*. The equation encodes the mathematical apparatus in both quantum field theory and the theories of critical phenomena used to handle problems with singularities, such as those occurring at phase transitions. See the lucid presentation by Weinberg [90] (see also [91,92]).

#### *3.2. Classes of Universal Behaviors*

The previous sections have shown that from a kinetic point view, the reciprocal of the activation energy can be properly defined as the transitivity *γ*, specific of a process and interpreted as a measure of the propensity for the reaction to proceed. Our notations stem from the fact that the transitivity can take a gamma of values smooth as function of *β* in a sufficiently ample range of temperatures. Its limiting values will serve to localize any abrupt changes, e.g., in mechanisms of processes or in phase transitions. Generally, if a Laurent expansion defined in references [1–4] is assumed to hold in a neighborhood around a reference value denoted as *β*0, it behaves asymptotically as

$$
\gamma(\beta) \sim \alpha \left(\beta - \beta\_0\right)^{\tilde{\gamma}} \tag{24}
$$

General series for *γ*(*β*) where previously given in reference [1,3]. Now, the transitivity plane, *γ vs*. *β*, (see Figure 3) can be interpreted as confining the range of existence of a system between limiting temperatures in consonance with the thermal kinetic limits defined in Section 2.2. The two temperatures or limiting coldnesses *<sup>β</sup>* are generally contained between the extremes *<sup>β</sup>* ≈ 0 to *<sup>β</sup>* ≈ *<sup>β</sup>*† defining the temperature window where a process is operative. The simplest model for *γ* is a linear path from *α* = 1/*ε*‡ to *β*†= 1/*ε*‡ according to the AM formula [3]. In fact, the limiting formula derived from Equation (24) in reference [3] yields

$$\gamma(\beta) = \frac{1}{\varepsilon^{\ddagger}} \left( 1 - \frac{\beta}{\beta^{\ddagger}} \right). \tag{25}$$

It is interesting to express known temperature-dependence rate laws generalizing the previous equation as

$$\gamma(\beta) = \frac{1}{\varepsilon^{\frac{\gamma}{4}}} \left( 1 - \frac{\beta}{\beta^{\dagger}} \right)^{\frac{\gamma}{4}} \tag{26}$$

The exponent *ζ* = 0, 1, and 2 generates the Arrhenius, Aquilanti–Mundim (AM), and Vogel–Fulcher–Tammann (VFT) laws, respectively [3]. Many other paths can serve as models for the transitivity function for different values of *ζ* (see Figure 5). Generalization to non-integer values shows perspectives of correlation with critical exponents in mode coupling theory and with universality classes of kinetic transitions (see also Section 4). Studies in the glass transition field show [93–95] that systems with a large fragility (strong non-Arrhenius behavior) present ranges of universality separated by a crossover temperature: in some works considering glass-forming systems and, e.g., for the prototypical reaction F + H2 (D2) at low and ultra-low temperatures [20], should permit to categorize the universality classes in a wide temperature range by the critical exponent *ζ*, possibly empirically a non-integer.

**Figure 5.** The alpha-zeta totem. The transitivity plane, introduced in [3], *γ* = 1/*Ea vs*. *β* = 1/*kBT* serves to give a proper scaling to the phenomenological parameters occurring in the study of nonlinear Arrhenius plots. The Arrhenius behavior is given as corresponding to a line parallel to the *β* axis starting at 1/*ε*‡. Deviations from the Arrhenius behavior give the transitivity function *γ* a straight-line at small *β*, which it is connected to the *d* parameter of the Aquilanti–Mundim (AM) law. At low temperatures, the transitivity function tends to characteristic ultra-cold limiting values: (i) for *d* <0(*sub*-Arrhenius) it tends to the Wigner limit and (ii) for *d* >0(*super*-Arrhenius), *γ*, namely the propensity for reaction to occur, vanishes in *β*† from Mauro–Yue–Ellison–Gupta–Allan (MYEGA) [96], Aquilanti–Mundim (AM) [60] and Vogel–Fulcher–Tammann (VFT) [80–82] path. In [3] 1/*ε*‡ is defined as α, so that the transitivity plane provides a primitive totem for elementary chemical kinetics, employing the first six letters of the Greek alphabet, αβγδεζ for the parameters, not all independent (see text).

In reference [3], we show cases when curvature in the Arrhenius plot can be linearized. Interestingly, a formulation was empirically proposed in 1980s to fit the temperature dependence of properties of glass-forming materials [97–99]. Here, the proposed connection through the Tolman's theorem is assumed as a scaling tool for relaxation processes: the relationships appearing in the transitivity plane turn out as explicitly universal for the linear dependence in *β*, at least in a significantly wide neighborhood near the origin of *β* - 0. Most important is that all parameters are given both a physical meaning and the possibility of being estimated by physical models.

#### **4. Perspectives on Rate Processes from the Arrhenius and the Transitivity Planes**

There are a variety of chemical reactions and rate processes that deviate from Arrhenius behavior, and this list of them is currently expanding upon consideration of several types of phenomena being documented [1,59,60,100,101]. Below, we collected three important examples in which a systematic investigation of the universality is in progress.

(i) The rates of biological processes are strongly affected at low temperatures by deviations from Arrhenius law; however there are large uncertainties especially when quantifying, as usual these deviations using the "Arrhenius Break Temperature" assumption, see previous discussions [3,60]. The difficulty of identifying a transition temperature in the Arrhenius plot for the respiration processes [60,102,103] can be easily overcome using the transitivity plot, emphasizing sudden transitions described within the Aquilanti–Mundim law (universality class with ζ = 1).


**Figure 6.** The Arrhenius (upper panels) and transitivity (lower panels) planes of the temperature dependence of viscosity of silicate mineral (reference in the text). The diamond symbols represent the transitivity values obtained numerically. Red lines emphasize region where the temperature dependence of the transitivity is linearized, as expected by the Aquilanti–Mundim law.

In the exemplary processes presented above, the scaling provided by the transitivity variable makes explicit the corresponding extent of deviations from Arrhenius law, emphasizing kinetic transition temperatures when they appear. Computational tools for assisting in assessing this behavior are presented in a code described in a companion article in this topical collection [4].

#### **5. Conclusions and Outlook**

The approaches to the description of rate processes formulation have evolved due to the synergism between phenomenological and computational approaches: with Arrhenius law representing the former and the transition-state theory standing at the foundations of the latter. However, with the advance of experimental and computational techniques, these approaches needed extensions able to cope with new problems, such as quantum effects (e.g., tunneling and resonance) in atomic and molecular systems, stochastic motion of particles in condensed environment, non-equilibrium effects in classical and quantum formulations. From several modern techniques for treating kinetic problems, we can cite Feynman-like path integral formulations [106–108] to estimate temperature dependence of rate constants in chemical reactions, mode-coupling theory [109] for describing the physics of glass formation; and the development of rational extended thermodynamics [23,110] to treat systems far away from equilibrium.

The implementation of modern formulations to new experimental data and computational simulations requires a complex set of microscopic information to estimate kinetic parameters, making formidable the problem of describing many-particle dynamics and kinetic equations. Concomitantly, phenomenological approaches continue to be important pillars for the enhancement of ab initio formulations beyond: Arrhenius [5], Vogel–Fulcher–Tammann [80–8582 Williams–Landel–Ferry [111], Power law [112], Bässler [113], and Nakamura–Takayanagi–Sato [114] laws are a sequence of useful models to describe problems in extreme and highly complex environments. Motivated by this perception and establishing connection with Tsallis statistics [62] for classical propensities, in the last ten years we have worked in close synergism between phenomenological and ab initio or semiempirical formulations. A key guide came by Euler's expression for the exponential function as a limit of succession, a formulation accompanied by physicochemical meanings originally suggested for gas kinetic theory and chemical kinetics processes.

The "prequel" to the saga has been reconstructed in Section 2. We recapitulate the steps that originated essentially from following the Maxwell–Boltzmann path and involving at some stage application of the Euler's formula: Boltzmann (1868) [25] (Figure 2) was the first that succeeded to prove the Maxwell's distribution working with marginal probabilities in what is now called the thermodynamic limit; subsequently, Maxwell (1879) [27] (Figure 3) and Jeans (1916) [28] developed rigorous formulations performing mathematically the Euler's formula for the thermodynamic limit; Uhlenbeck and Goudsmit [29] in their study of finiteness of particle number stated clearly that their formulation is written in the spirit of the Maxwell–Boltzmann original treatments. In these formulations at the final stage always the Euler's limit is invoked, by which the exponential distribution function is recovered upon taking the thermodynamic limit.

Connection with concomitant modern approaches is relevant. Recently, it has been asserted that the molecular world and its reactivity can be interpreted by theories involving Fuzzy sets and Fuzzy logic [115]. These theories have been formulated by the electronic engineer Lotfi Zadeh, and are useful to model how humans "compute" by using words [116]. Every word of the natural language, represented by a Fuzzy set, is like a "quantum" of information, whose meaning is context-dependent. Similarly, every molecule or every atom of the microscopic world is like a Fuzzy set, i.e., like a word of the "molecular language". Every molecule can exist as a collection of different conformers and every atom as a superposition of different quantum states. These "molecular Fuzzy sets" show context-dependent behaviors.

The connection with modern statistical mechanics appears to emerge as follows. In our context, it is needed to understand the meaning of the deviations from the Arrhenius equation. Let us cite from the incipit of [41]:

In statistical mechanics we are concerned with the physical properties of large systems. We assume the existence of the thermodynamic limit (a main concern in this paper, Section 2). The peculiarity, which requires that the mechanics of such a system is "statistical", stems from the fact that such a system is as a rule incompletely defined. By this we mean that the equations of motion for such a system cannot be uniquely solved. Were this true in Gibbs' time already for the simple reason of mathematical complexity, the real problem is not computational, as is clear from interesting computer simulations currently available. Basically, the need for statistical methods stems from the lack of detailed information on the system.

The chain of emerging connections continues. The interpretation of the experimental evidence might require Zener's geometric programming optimizations [117]: geometric programming is a nonlinear mathematical optimization method used to minimize functions that are in the form of polynomials subject to constraints of the same type. The connection between geometric programming and the Darwin–Fowler method has been established since some time [117] (see also a modern approach [118]). Since the data used in the optimization procedure are always affected by errors and uncertainties, a strategy to handle them is provided by the theory of Fuzzy sets, as discussed very recently [119], for example in reference [4], in generally in most of our work we used the generalized simulated annealing (GSA) [120]. The application of Fuzzy optimization algorithms can avoid rigidity and stiffness and reduce information loss arising from the conventional optimization procedures of statistical mechanics.

**Author Contributions:** Conceptualization, V.H.C.-S., N.D.C. and V.A.; Investigation, V.H.C.-S.; Methodology, V.H.C.-S., N.D.C. and V.A.; Writing—original draft, V.H.C.-S., N.D.C. and V.A.; Writing—review & editing, V.H.C.-S., N.D.C. and V.A. All authors have read and agreed to the published version of the manuscript.

**Funding:** Valter H. Carvalho-Silva thanks Brazilian agency CNPq for the research funding programs (Universal 01/2016—Faixa A—406063/2016-8) and Organizzazione Internazionale Italo-latino Americana (IILA) for a Biotechnology Sector-2019 scholarship. Nayara D. Coutinho and Vincenzo Aquilanti acknowledge the Italian Ministry for Education, University and Research, MIUR, for financial support: SIR 2014 "Scientific Independence for young Researchers" (RBSI14U3VF) coordinated by Dr. Federico Palazzetti.

**Conflicts of Interest:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

#### **Appendix A**

#### *Deformation of Statistical Distributions of Molecular Velocities and Kinetic Energies*

The previous discussion of the so-called Tolman's theorem [79] provides statistical mechanics meaning to the energetics of an activation process: here we present a derivation of the fraction of particles that exceed a certain amount of energy for a distribution of velocities far from the thermodynamic limit and deforming the Maxwell distribution. To calculate the probability distribution of a particle to acquire a certain velocity described by the vector *ν* we used the formulation presented in references [121,122].

$$P(\mathbf{v}) = \mathcal{N}\_d \left(\frac{m\beta}{2\pi}\right)^{\frac{3}{2}} \left(1 - d\frac{m\beta}{2}\mathbf{v}^2\right)^{\frac{1}{d}},\tag{A1}$$

where *m* is the reactant mass, can be interpreted as the parameter defined in the Section 2.2. and normalization is given by

$$\mathcal{N}\_d = \left\{ \begin{array}{ll} (-d)^{\frac{3}{2}} \frac{\Gamma\left(-\frac{1}{d}\right)}{\Gamma\left(-\frac{1}{d} - \frac{3}{2}\right)}, & \text{for } d < 0\\ \frac{1}{4} d^{\frac{1}{2}} (2+d)(2+3d) \frac{\Gamma\left(\frac{1}{2} + \frac{1}{d}\right)}{\Gamma\left(\frac{1}{d}\right)}, & \text{for } d > 0 \end{array} \right\} \tag{A2}$$

when → 0, N → 1 and the Maxwell–Boltzmann molecular speed distribution is recovered [121,122].

To account for the alternatives ways to combine the velocity vectors, one can follow Maxwell obtaining the variety of states available to be accessed by performing an integral over the surface of the area of a sphere of radius *<sup>v</sup>* <sup>=</sup> <sup>|</sup>*v*|, 4*πv*2. From Equation (A2), the distribution assumes the following form,

$$P(v)dv = 4\pi N\_d \left(\frac{m\beta}{2\pi}\right)^{\frac{3}{2}} v^2 \left(1 - d\frac{m\beta}{2}v^2\right)^{\frac{1}{2}} \text{d}v,\tag{A3}$$

which can also be written as a function of the available energy (*E*) to be accumulated in the system [123],

$$P(E)dE = \frac{2\beta^{\frac{3}{2}}}{\pi^{\frac{1}{2}}} \mathcal{N}\_d E^{\frac{1}{2}} (1 - \mathcal{Q}E\beta)^{\frac{1}{d}} \,\mathrm{d}E.\tag{A4}$$

In order to calculate the fraction of molecules, d*N*/*N*, which accumulates energy between *E* and *E* + d*E* during the collisional processes, we can use Equation (A4),

$$\frac{dN}{dN} = \frac{2\beta^{\frac{3}{2}}}{\pi^{\frac{1}{2}}} \mathcal{N}\_d E^{\frac{1}{2}} (1 - dE\beta)^{\frac{1}{d}} \text{ d.E.} \tag{A5}$$

The fraction of molecules *F*‡ with energy in excess energy of a certain specified value *ε*‡ is relevant for chemical kinetics problems and can be expressed in closed form calculating the following integral

$$F^{\ddagger} = \frac{2\beta^{\frac{3}{2}}}{\pi^{\frac{1}{2}}} \mathcal{N}\_d \int\_{\varepsilon^{\sharp}}^{\infty} \mathbf{E}^{\frac{1}{2}} (1 - d\mathbf{E}\beta)^{\frac{1}{d}} d\mathbf{E} = \frac{2\beta^{\frac{3}{2}}}{\pi^{\frac{1}{2}}} \mathcal{N}\_d \varepsilon^{\sharp} \, ^{\frac{3}{2}} \mathcal{F} \left(\frac{3}{2}, -\frac{1}{d}; \frac{5}{2}; d\varepsilon^{\sharp}\beta\right). \tag{A6}$$

where F is the hypergeometric function of Gaussian exemplary numerical simulations of the transitivity function that can be calculated from Equation (A6) and will be presented elsewhere. Being in closed form and containing functionality well characterized mathematically, Equation (A6) should be useful for asymptotic analysis of the physical features of the "deformed" distribution.

#### **References**


© 2020 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).

## *Article* **"Transitivity": A Code for Computing Kinetic and Related Parameters in Chemical Transformations and Transport Phenomena**

**Hugo G. Machado 1,2,\*, Flávio O. Sanches-Neto 1,2,\*, Nayara D. Coutinho 3,\*, Kleber C. Mundim 2, Federico Palazzetti 3,\* and Valter H. Carvalho-Silva 1,2,\***


Academic Editor: Pier Luigi Gentili

Received: 16 August 2019; Accepted: 8 September 2019; Published: 25 September 2019

**Abstract:** The Transitivity function, defined in terms of the reciprocal of the apparent activation energy, measures the propensity for a reaction to proceed and can provide a tool for implementing phenomenological kinetic models. Applications to systems which deviate from the Arrhenius law at low temperature encouraged the development of a user-friendly graphical interface for estimating the kinetic and thermodynamic parameters of physical and chemical processes. Here, we document the Transitivity code, written in Python, a free open-source code compatible with Windows, Linux and macOS platforms. Procedures are made available to evaluate the phenomenology of the temperature dependence of rate constants for processes from the Arrhenius and Transitivity plots. Reaction rate constants can be calculated by the traditional Transition-State Theory using a set of one-dimensional tunneling corrections (Bell (1935), Bell (1958), Skodje and Truhlar and, in particular, the deformed (*d*-TST) approach). To account for the solvent effect on reaction rate constant, implementation is given of the Kramers and of Collins–Kimball formulations. An input file generator is provided to run various molecular dynamics approaches in CPMD code. Examples are worked out and made available for testing. The novelty of this code is its general scope and particular exploit of *d-*formulations to cope with non-Arrhenius behavior at low temperatures, a topic which is the focus of recent intense investigations. We expect that this code serves as a quick and practical tool for data documentation from electronic structure calculations: It presents a very intuitive graphical interface which we believe to provide an excellent working tool for researchers and as courseware to teach statistical thermodynamics, thermochemistry, kinetics, and related areas.

**Keywords:** *d*-TST; activation energy; Transitivity plot; solution kinetic

#### **1. Introduction**

Recent applications of chemical kinetics to a variety of complex systems involves the accurate dealing of properties to be described by techniques, which treat a series of processes beyond elementary chemical quantum dynamics or even approximate classical and semiclassical approaches. We analyze in a companion paper [1] the state of our approaches to these problems from a general viewpoint. Here, we deal with explicit computational calculations that allow moving directly in a simple way to global applications.

Recourse needs to be made at one stage or another to statistical treatments [1–3] among which there is modern insurgence with respect to more traditional ones—exemplary is, in particular, this topical collection essentially dedicated to "Fuzzy Logic" [4,5].

Information on the kinetic and related parameters in chemical transformations and transport phenomena and their role in complex mechanisms is needed: Particularly the temperature dependence of rate processes, *k*(*T*) and often in the low temperature range, where deviations from linearity of Arrhenius plots are revealed. The phenomenology of curvatures in Arrhenius plot span all of chemistry: From the long list that is continuously updated, we refer here to some selected cases, such as combustion chemistry [6], condensed-phase [7], atmospheric and astrochemical reactions [8,9], processes involved in the preservation and aging of food and drugs [10,11], as well as in basic geochemical [12] and biochemical environments [13,14]. The current status of the phenomenology is classified emphasizing case studies, specifically (i) super-Arrhenius kinetics, convex curvature in the Arrhenius plot, where transport phenomena brakes the processes as temperature decreases; (ii) sub-Arrhenius kinetics, concave curvature in the Arrhenius plot, where quantum mechanical tunneling propitiates low temperature reactivity; (iii) anti-Arrhenius kinetics, negative apparent activation energy, where processes are limited by stereodynamic requirements.

The curvature in the Arrhenius plot promotes a temperature dependence on the apparent activation energy *Ea*, the definition of which has been recommended by the International Union for Pure and Applied Chemistry (IUPAC) [13,15] as follows,

$$E\_a = -k\_B \frac{\mathbf{d} \ln k(T)}{\mathbf{d} \left(\frac{1}{T}\right)} = -\frac{\mathbf{d} \ln k(\beta)}{\mathbf{d} \,\beta}.\tag{1}$$

In Equation (1), *kB* is the Boltzmann constant, *T* is the absolute temperature and β = 1/*kBT*. The so-called Tolman's Theorem [16] proposed an interpretation for *Ea* as the difference between the average over the energy of all reacting systems and energy of all (reactive and nonreactive) systems propitiating a connection between canonical quantities with microcanonical features of the potential energy surface. Independently, the same spirit lead to transition-state formulations [15–18]. Recently, it was found expedient to introduce the reciprocal of the apparent activation energy in order to define the Transitivity function [2,3], Equation (2),

$$\gamma(\beta) = \frac{1}{E\_a(\beta)}\tag{2}$$

to construct an appropriate scaling plane, γ(β) vs. β, where the regular curvatures on the Arrhenius plane are approximately linearized: The Transitivity plot. The function γ(β) can be interpreted as a measure of the propensity for the reaction to proceed and permit uniformly to account for experimental and theoretical rate processes, such as quantum tunneling, transport properties, and diffusion in the neighborhoods of phase transitions.

For cases where experimental and theoretical rate processes are difficult to interpret—by molecular complexity, extreme conditions or impossibility of exact solution to the Schrodinger's equation—phenomenological and semiclassical theoretical approaches are of increasing utility. Over the years, the paramount useful phenomenological tool for studying the kinetics of physicochemical processes has been the Arrhenius law. It fails when the temperature range of the rate process becomes large: Empirical laws have been proposed, involving empirical parameters often lacking of physical interpretation, e.g., Kooij [19], power law [20], Vogel–Fulcher–Tamman (VFT) [21–23], Nakamura–Takayanagi–Sato (NTS) [24], deformed (*d*) Aquilanti–Mundim (AM) [25] and Aquilanti–Sanchez–Coutinho–Carvalho (ASCC) [26]. In Reference [27], it was presented how the transitivity concept can sort and interpret all these empirical laws, permitting a microscopic interpretation of the phenomenological parameters.

For understanding and predicting a wide variety of kinetic processes with large molecular complexity and presenting deviation from Arrhenius law, Transition-State Theory (TST) remains an excellent formulation. The TST triggered the development of a variety of improved approaches: Important variants include the variational TST [28], the Marcus theory of electron transfer [29], and quantum [30] and path integral [31,32] versions. However, the variants of TST require additional information of the potential energy surface rendering formidable the task of estimating kinetic parameters as the size of molecular systems increases: Examples are the degradation kinetics of organic pollutants in aquatic, soil, and atmospheric environments [33–38].

The late twentieth century saw the advent, beyond the Transition State formulation, of molecular dynamics simulations, which is nowadays a powerful theoretical tool in understanding mechanisms of physical and chemical processes. Recently, there is a vast activity evaluating whether advances in molecular dynamics simulations can provide quantitatively rate constants [39–41]. However, the extraction of quantitative information on rate constants from molecular dynamics simulations is an important issue but a very difficult one to tackle [42–44].

Software tools to calculate and interpret rate constants are enormously useful in material, biochemical, and geochemical research to permit the exploitation of intense progress in computational hardware. Widely diverse codes have been developed to calculate rate constants in a gas-, liquid-, and solid-phase using Transition-State Theory and its variants: Polyrate [45], TheRate [46], MultiWell [47], TAMkin [48], Mesmer [49], RMG [50], APUAMA [51], KiSThelP [52], FRIGUS [53] and Eyringpy [54] codes are excellent options available to work out the kinetics of chemical reactions. Herein, we describe a new code with a user-friendly graphical interface able to perform various procedures for computing kinetic and related parameters in chemical transformations and transport phenomena: The code is referred to as "Transitivity" (see Figure 1), a concept related to the function defined in Equation (2); written in Python, it is a free open-source code compatible with Windows, Linux, and MacOS platforms. It offers the possibility of estimating phenomenological parameters from Arrhenius and Transitivity plots using a stochastic optimization method, Generalized Simulated Annealing (GSA) [55], with several options: Arrhenius, Aquilanti–Mundim (AM) [25], Vogel–Fulcher–Tammann (VFT) [22], Nakamura–Takayanagi–Sato (NTS) [24,56] and Aquilanti–Sanchez–Coutinho–Carvalho (ASCC) [26] formulas. This code also allows the estimation of unimolecular and bimolecular reaction rate constants with traditional TST using Bell35 [57], Bell58 [58], Skodje–Truhlar (ST) [59] and deformed (*d*-TST) corrections [60]. Solvent effects can be accounted for by the Collins–Kimball [61] and Kramers [62] models. Finally, input files for different first-principles molecular dynamics—Born–Oppenheimer Molecular Dynamics (BOMD), Car–Parrinello Molecular Dynamics (CPMD), Metadynamics, Path-Integral Molecular Dynamics (PIMD), and Trajectory Surface Hopping (TSH)—can be generated to run by the CPMD code [63].

We propose this code, as we expect it serves as a quick and practical tool for documentation data from electronic structure calculations. Additionally, it presents a very intuitive graphical interface which we believe to provide a useful working tool for the general public and researchers and also as courseware to teach statistical thermodynamics, thermochemistry, kinetics, and related areas. The body of this article is structured in two parts, one that deals with a brief theoretical and operational description of the code, and a second one that shows a variety of examples. A final section is devoted to additional and concluding remarks and an Appendix A collects used symbols and their meaning.

**Figure 1.** Logo and main windows of the Transitivity code.

#### **2. Theoretical Background**

#### *2.1. Phenomenology of Temperature Dependence of the Reaction Rate Constant*

The theoretical apparatus to connect the Transitivity function, γ(*T*), and phenomenological reaction rate constant formulas and vice-versa is built and extensively discussed in Reference [27] and references therein. Classical and recent phenomenological reaction rate constant formulas to account for sub-, super-, and anti-Arrhenius behavior are applied in Section 4.1 by use of: (i) Deformed (*d*) Aquilanti–Mundim [25], (ii) VFT (Vogel–Fulcher–Tammann) [22], (iii) NTS (Nakamura–Takayanagi–Sato) [24,56], and (iv) ASCC (Aquilanti–Sanchez–Coutinho–Carvalho) [26]. More details of the formulas are presented in Table 1.

#### *2.2. Calculation of Reaction Rate Constant*

The Transition-State Theory, TST, is the most popular tool used to study the kinetics of chemical reactions with a well-defined activated complex—as customary, a double dagger (‡) denotes the properties pertaining to the transition-state complex. For a general bimolecular reaction, such as Reactants → *TS*‡ → Products, it is necessary to compute the *Q*1, *Q*<sup>2</sup> and *Q*‡ partition functions of reactants *R*<sup>1</sup> and *R*<sup>2</sup> and of the transition state, respectively. At absolute temperature *T*, the rate constant is given by:

$$k^{TST}(T) = \frac{k\mathfrak{p}T}{h} \frac{Q^\ddagger}{Q\_1 Q\_2} \exp\Big(-\frac{\varepsilon^\ddagger}{k\_B T}\Big) \tag{3}$$

where *h* is Planck's constant; and ε‡ is the effective height of the energy barrier, eventually with the addition of the zero-point energy correction in Equation (3). The tunneling correction introduced in TST can be calculated by Skodje–Truhlar [59], Bell35 [57], and Bell58 [58] corrections and by the deformed (*d*) tunneling formulation presented in the next section [60].

#### 2.2.1. Deformed Transition-State Theory (*d*-TST)

The deformed Transition-State Theory (*d*-TST) [60] formulation:

$$k\_d(T) = \frac{k\_B T}{h} \frac{Q^\ddagger}{Q\_1 Q\_2} \left(1 - d \frac{\varepsilon^\ddagger}{k\_B T}\right)^{1/d}, \quad d = -\frac{1}{3} \left(\frac{h \nu^\ddagger}{2 \varepsilon^\ddagger}\right)^2,\tag{4}$$

where ν‡ is frequency for crossing the barrier, uniformly covers the range from classical to moderate tunneling regimes but needs an amendment for deep tunneling in exothermic reactions, a relatively rare case. The proposed variant of the transition-state theory was obtained from the transitivity concept and deformed (*d*) Aquilanti–Mundim law permit comparison with experiments and tests against alternative formulations. The nomenclature in Equation (4) is the same as used in Equation (3). A popular formulation in the literature is to use the Wigner tunneling correction: However, in our previous study [60,64], we have shown that *d*-TST is a more satisfactory approximation also in view of uniform behavior across the height of the energy barrier.

#### 2.2.2. Bell35 and Bell58

To cover the moderate-to-deep tunneling transition in exothermic reactions, we applied both Bell35 correction [57,65], Equation (5):

$$\kappa\_{Bdl35} = \frac{\left[\frac{1}{\hbar v^{\ddagger}} - \frac{1}{k\_B T} \exp\left(\frac{c^{\ddagger}}{k\_B T} - \frac{c^{\ddagger}}{\hbar v^{\ddagger}}\right)\right]}{\frac{1}{\hbar v^{\ddagger}} - \frac{1}{k\_B T}}\tag{5}$$

and Bell58 [58,65] correction truncated at the second term (*2T*)

$$\kappa\_{Rell58-2T} = \frac{\left(\frac{l\nu^{\ddagger}}{2kgl}\right)}{\sin\left(\frac{l\nu^{\ddagger}}{2kgl}\right)} - \frac{\exp\left(\frac{c^{\ddagger}}{kgl} - \frac{c^{\ddagger}}{\Lambda \nu^{\ddagger}}\right)}{\left(\frac{kgl}{\Lambda \nu^{\ddagger}} - 1\right)},\tag{6}$$

that, although non-uniform across the transition between negligible and moderate tunneling regimes, were found to behave smoothly enough to adequately perform practically astride the whole range.

Truncating Equation (6) to the first term, we recover the usual formula used in the literature to describe the tunneling under deep regime Equation (7):

$$\kappa\_{\text{Bell58}} = \frac{\left(\frac{I\text{v}^{\ddagger}}{2k\_B T}\right)}{\sin\left(\frac{I\text{v}\cdot\text{t}}{2k\_B T}\right)}\tag{7}$$

However, this formulation presents divergence at the crossover temperature *Tc* = *h*ν‡/π*kB*. The tunneling regimes can be delimited within four temperatures ranges—negligible (*T* > 2*Tc*), small (2*Tc* > *T* > *Tc*), moderate (*Tc* > *T* > *Tc*/2), and deep (*T* > *Tc*/2) [60,66,67].

#### 2.2.3. Skodje and Truhlar, ST

To avoid spurious divergence at *Tc* in the Bell58 formulation, in 1981, Skodje and Truhlar [59] gave a generalization extending the parabolic barrier treatment. In their approximation for tunneling correction Equations (8a) and (8b)

$$\kappa\_{ST} = \frac{\left(\frac{h\nu^{\ddagger}}{2k\_BT}\right)}{\sin\left(\frac{h\nu^{\ddagger}}{2kgT}\right)} - \frac{\exp\left[\left(\frac{1}{k\_BT} - \frac{1}{h\nu t}\right)\left(\varepsilon^{\ddagger} - \Delta H\right)\right]}{\left(\frac{k\_BT}{h\nu^{\ddagger}} - 1\right)}, \quad \beta \le \mathsf{C},\tag{8a}$$

and

$$\kappa\_{ST} = \frac{1}{\left(\frac{k\_B T}{\hbar \nu^3} - 1\right)} \left\langle \exp\left[ \left(\frac{1}{k\_B T} - \frac{1}{\hbar \nu^4}\right) \left(\varepsilon^\ddagger - \Delta H\right) \right] - 1 \right\rangle, \quad \mathfrak{C} \le \beta,\tag{8b}$$

where ঃ = 1/*kBTc* = π/*h*ν‡ and Δ*H* is the enthalpy of reaction.

#### *2.3. Solvent E*ff*ect on Reaction Rate Constant*

#### 2.3.1. Collins–Kimball Formulation

Treatment of chemical reactions in liquid-phase requires accounting for the solvent effect, considering the ability of the reagents to diffuse and to lead them to effective reactive collisions. According to the Onsager solvent reaction field model [68–70], the solvent creates a solvation layer, as a cage around the molecular entities that participate in the reaction. The reactive process between molecular entities *A* and *B* is represented by the sequential equation,

$$\stackrel{\rightarrow}{k}\_{\text{solv}} + \stackrel{\rightarrow}{B}\_{\text{solv}} \rightleftharpoons \stackrel{\leftarrow}{\{AB\}}\_{\text{solv}} \stackrel{k}{\rightarrow} Product. \tag{9}$$

The "*solv"* symbol indicates that the molecular entity is surrounded by a roughly spherical solvation layer, the cage, having a specific radius. <sup>→</sup> *k <sup>D</sup>* is the diffusion kinetic rate constant which the reagent *<sup>A</sup>* travels in the solvent to find *<sup>B</sup>*; vice versa, <sup>←</sup> *k <sup>D</sup>* is the kinetic rate constant for reverse diffusion; *k* denotes the reaction rate constant due to effective reactive collisions: It can be estimated from Equations (3) or (4). Assuming the Steady-State Approximation [70] for Equation (9), the Smoluchowski expression for the diffusion kinetic rate constant [71], takes the form Equation (10):

$$
\overrightarrow{k}\_D = 4\pi r\_{AB} D\_{AB\_\prime} \tag{10}
$$

where *rAB* is a reaction radial distance and *DAB* is the sum of the diffusion constants for each reagent in the solvent. The generalization of Collins and Kimball [61,72] for the irreversible bimolecular diffusion-controlled reactions at infinite reaction rate, <sup>→</sup> *<sup>k</sup> <sup>D</sup>* <sup>←</sup> *k <sup>D</sup>* [73], yields an overall reaction rate constant, *kObs* Equation (11)

$$\frac{1}{k\_{\rm Obs}} = \frac{1}{k\_{TST}} + \frac{1}{\stackrel{\cdot}{k\_D}}.\tag{11}$$

In the code, the diffusion is accounted for by the Stokes–Einstein formulation and the temperature dependence of viscosity η of the solvent is estimated through the Aquilanti–Mundim formula (see details in Section 3).

#### 2.3.2. Kramers' Formulation

To account for dynamical effects of the solvent in a reactive process and to generalize to unimolecular and pseudo-unimolecular processes, Kramers' model considers a stochastic motion of the system, where the solvent effect is added considering Brownian movements along the reaction path [70].

Assuming that the friction constant, μ (see below), is independent of time, the overall reaction rate constant *kObs* can be calculated as Equation (12)

$$k\_{\rm Obs} = \kappa\_{\rm Kr} k\_{TST} \tag{12}$$

where κ*Kr* is the transmission factor obtained by Kramers [62] as:

$$\kappa\_{Kr} = \frac{1}{\omega^\ddagger} \Biggl( \sqrt{\frac{\mu^2}{4} + \omega^\ddagger^2} - \frac{\mu}{2} \Bigr). \tag{13}$$

For the transition-state theory rate constant *kTST* and variants see Section 2.2. In Equation (13), ω‡ is the imaginary frequency of the transition state and the friction constant is given by μ = (6π*rAB*/*M*)η, where *rAB* and *M* are the radius of the cage and the molecular mass of the transition-state, respectively. Again, the viscosity is calculated in the code through the Aquilanti–Mundim formula.

#### **3. Handling the Transitivity Code**

Several aspects regarding nomenclature and the theoretical background used in this article are presented in already cited references and in Reference [27]. Furthermore, the code with the manual, examples (see next section), and installation video can be freely downloaded in the www.vhcsgroup. com/transitivity web page.

In the main window of the code, the user can choose between three options: (i) Fitting reaction rate constant data as a function of temperature; (ii) predicting the reaction rate constant in the gasand liquid-phase; and (iii) creating the input for the calculation of first-principles molecular dynamics by the CPMD code. If the option chosen is "Kinetic and Related Parameters", the program needs the electronic structure output files provided by the Gaussian program for the structures of the reagents, of the transition state and of the products. If the system under study requires the calculation of the electronic energy using a specifically high level method, the energy values must be included separately in the box. The code provides the reaction rate constants for both unimolecular and bimolecular reactions and the users should indicate if the molecular entity corresponds to an atom or whether the molecule is linear or not. In a new window, the user can choose between TST or *d*-TST [60] and Bell35 [57], Bell58 [58], Skodje–Truhlar (ST) [59] tunneling corrections. In addition, the program provides a visualization of the Arrhenius plot with the possibility of including any experimental and/or theoretical data available for validation. In the same window, reaction properties (such as internal energy, enthalpy, Gibbs-free energy, barrier height, *d* parameter, imaginary frequency (ν‡), crossover temperature (*Tc*), and the parameter of the Skodje–Truhlar model) are exhibited.

If the VOLUME keyword is used in the calculation input file of G09, the option including solvent effects is available through the Kramers and Collins–Kimball formulations. It is necessary to make the choice of solvent parameters for the estimation of the viscosity or use the default (water). The parameters for the viscosity estimation (η/*Poise*) of a solvent other than water should be those fitted by the Aquilanti–Mundim formula [25] with <sup>η</sup>*<sup>o</sup>* in Poise and <sup>ε</sup> in J·mol−<sup>1</sup> and inserted in the "Solvent Type" option. The temperature dependence of viscosity of water using the experimental data [74] is expressed as <sup>η</sup>(*T*) = 2.7024.10−4*Poise*(<sup>1</sup> <sup>−</sup> 213.0543/*T*) <sup>−</sup>2.75634. When the Kramers formulation is chosen, the friction coefficient of the solvent (μ/*s*−1), the Kramers transmission coefficient, and the overall reaction rate constant (*kObs*/cm3·mol−1·s−<sup>1</sup> or <sup>s</sup><sup>−</sup>1) are provided. If the Collins–Kimball formulation is selected, the overall reaction rate constant (*kObs*/cm3·mol<sup>−</sup>1.s−1), separate diffusion coefficients (cm2·s<sup>−</sup>1) for the

reactants and the Smoluchowski diffusion rate constant (<sup>→</sup> *<sup>k</sup> <sup>D</sup>*/cm3·mol−1·s−1) are provided.

With the "GSA Fitting" option, the values of the rate constants and temperatures will be needed. In the software, the fitting in the Arrhenius plot is implemented for the rate constant data using

the Arrhenius, Aquilanti–Mundim, VFT, NTS, and ASCC formulas by the stochastic optimization algorithm GSA. It is possible to insert the guess parameters *d*, ε‡, *Ea*, *E*0, *E*ν, *B* and *T*<sup>0</sup> (see formulas in Table 1). Furthermore, information of the fit are available in "Fit [FormulaName].dat" output file. Internal parameters of GSA can be also controlled [55]. The fitting in the Transitivity plot is implemented of the reaction rate constant data using only Arrhenius, Aquilanti–Mundim, and VFT. The Transitivity plot is calculated using numerical differentiation with the option "Preview" and if necessary the smoothing of the data can be applied with the option "Apply SG" enabling the Savitzky–Golay filter [75]. In addition, the program provides a visualization of the Arrhenius and Transitivity plots with the fitting model chosen step-by-step.

The present code also offers the possibility of creating input to first-principles molecular dynamics simulation by the CPMD computational code. After selecting the input file of the system to be studied, which contains the molecular geometry of the system in a specific extension (\*.xyz, \*.gjf, \*.out and \*.log), the user should choose the molecular dynamics method. In addition, the user has the option to choose the Density Functional Theory (DFT) functional, pseudopotential, the temperature, the charge, the simulation time, and the integration time step. The size of the simulation box can be changed in the "Lattices" section. Additionally, the Transitivity code generates another output file with a \*.gjf extension, where the user can check if the geometry of the system to be simulated is correct.

#### **4. Examples**

#### *4.1. Fitting Mode—Arrhenius and Transitivity Plots*

To illustrate the use of most of functionalities and to validate the accuracy of the running of the Transitivity code, a fitting of the reaction rate constants in the Arrhenius plot is performed as a function of temperature for four different systems in different regimes: sub-Arrhenius, corresponding to both deep and moderate tunneling, super-Arrhenius, and anti-Arrhenius behavior.

The first example concerns the keto–enol tautomerization of 2-(2 -hydroxy-4 -methylphenyl) benzoxazole (MeBO), a well-known case of a deep-tunneling regime [7]: For this case we performed a fitting using Arrhenius, Aquilanti–Mundim, Nakamura–Takayanagi–Sato and ASCC formulas. In order to present the results for the sub-Arrhenius behavior under the moderate-tunneling regime, we evaluated the OH + H2 → H + H2O reaction [76] using the Arrhenius and Aquilanti–Mundim formulas. The third example is centered on the investigations that have revealed the super-Arrhenius behavior for the rates of the processes promoted by enzymatic catalysis [77–79]. Here, we regard the reaction rate constant of the hydride transfer between the substrate and NAD+, catalyzed by F147L, which exhibits a strong convex curvature in the temperature range of 5 to 65 ◦C [78]: A fitting is performed with the Arrhenius, Aquilanti–Mundim and VTF formulas. The final case is aimed at showing a case of anti-Arrhenius behavior, which is characterized by the decrease of the reaction rate constant with the increase in the temperature. The OH + HBr→ Br + H2O reaction is prototypical in studies of an example, both from a theoretical and an experimental point of view: It exhibits negative temperature dependence of the reaction rate constant [80–82]. Fitting is performed with the Arrhenius and the Aquilanti–Mundim formulas.

The fitting parameters and statistical analysis of the quality of the fits for all the systems are given in Table 1. The statistical measure χ<sup>2</sup> is used in tests to compare the quality of the formula with the reference. It is noteworthy that in a fitting process, in order to obtain parameters with a physical meaning, it is necessary to avoid compensation effects [83] exploiting any prior knowledge of the system, since the formulas with more than two parameters can lead to multiple solutions.

Figure 2 shows the experimental data and fitted formulas for each system in the Arrhenius plot. Both NTS and ASCC formulas were satisfactory for fitting sub-Arrhenius behavior under deep-tunneling regime, while Aquilanti–Mundim is satisfactory just within a specific range of temperature. Under moderate-tunneling regime, the Aquilanti–Mundim formula was an excellent option to describe experimental data for OH + H2 reaction. The VTF and Aquilanti–Mundim formulas seem adequate to

fit super-Arrhenius behavior. The Aquilanti–Mundim parameters that were obtained for the OH + HBr reaction indicate that they provide an excellent option for the anti-Arrhenius behavior. As expected, the Arrhenius formula is clearly inadequate to account for deviations at low temperature for all the reactions presented.

**Figure 2.** Arrhenius plots comparing the experimental reaction rate constant and fitted formulas for keto–enol tautomerization reaction (sub-Arrhenius behavior under deep tunneling), OH + H2 −→ H2O + H reaction (sub-Arrhenius behavior under moderate tunneling), hydride transfer with enzymatic catalysis (super-Arrhenius behavior) and OH + HBr −→ H2O + Br reaction (anti-Arrhenius behavior). NTS and ASCC formulas were of use for sub-Arrhenius behavior under deep-tunneling regime. The Aquilanti–Mundim formula was of use for sub-Arrhenius cases under moderate-tunneling regime, for super-Arrhenius and for anti-Arrhenius behaviors. VFT also was of use for super-Arrhenius situations. The references of experimental data can be found in Table 1.


parameters for the Arrhenius, AM, ASCC, NTS and VFT formulas, using the Transitivity code for keto–enol tautomerization [7], OH + H2 catalysis [78] and OH + Br [84] reactions. Energy (*Ea*, ε‡, *E*υand *E*0) is in cal/mol and temperature (*T*0and *B*) in K. Pre-factor units can be identified

**Table 1.** Fitted

#### *Molecules* **2019**, *24*, 3478

[76],

Curvature is easily identified through the Arrhenius plot in non-Arrhenius processes; however, the application of the phenomenological formulas to fit this behavior can lead to multiple solutions that make the physical interpretation of the obtained parameters difficult. In Reference [27], we show that curvature in the Arrhenius plot can be linearized using the transitivity plot. Analogous definition to transitivity function were proposed at 1980s to evaluate glass transition of supercooled materials [85–87]. However, the definitions failed to propose a connection with the Tolman theorem, Equation (1), being important only as scaling tools for relaxation processes. Figure 3 shows the Transitivity plot for the temperature dependence of relaxation time of the propylene carbonate [86], where a linear behavior is observed for a certain temperature range as expected by the deformed Aquilanti–Mundim law. However, at 198 K a transition to another linear regime is observed—a break not perceived in the Arrhenius plot. With the linearization of the data in the transitivity plot, the fitting process becomes much simpler: In the first high temperature range the Aquilanti–Mundim parameters are <sup>1</sup> <sup>ε</sup>‡ <sup>=</sup> 1.71 *<sup>K</sup>*−<sup>1</sup> and *d* = 0.32; for the second lower temperature range, <sup>1</sup> <sup>ε</sup>‡ <sup>=</sup> 0.53*K*−<sup>1</sup> and *<sup>d</sup>* <sup>=</sup> 0.08. The interpretation of these results will be published in a future paper considering a large variety of examples.

**Figure 3.** The Arrhenius (upper panel) and Transitivity (lower panel) planes of the temperature dependence of relaxation time of the propylene carbonate. The diamond symbols represent the transitivity values obtained numerically and smoothing with the Savitzky–Golay filter. Red lines emphasize two regions where the temperature dependence of the transitivity is linearized, as expected by the Aquilanti–Mundim law.

#### *4.2. Reaction Rate Constants' Mode*

The estimation of the rate constants for the OH + HCl → H2O + Cl reaction in the gas-phase was performed to validate TST and tunneling corrections, implemented in the Transitivity code. Furthermore, using the suggestion of the Eyringpy code [54], the NH3 + OH → NH2 + H2O reaction was selected to demonstrate the accuracy of the Collins–Kimball and Kramers models to estimate reaction rate constants in an aqueous solution.

## 4.2.1. The OH + HCl → H2O + Cl Reaction

The reaction rate constants for the reaction between hydroxyl radical and hydrogen chloride are only slightly dependent on temperature in the range 138–300 K, although, as the range increases further, a temperature dependence is observed. This strong concave curvature detected in the Arrhenius plot (sub-Arrhenius behavior) is very convincing evidence of the role of deep quantum tunneling for this reaction [88]. We employed the *d*-TST, Bell35, Bell58, and ST tunneling corrections to calculate the rate constants for the reaction of OH + HCl over a wide range of temperatures (200–2000 K). The electronic structure properties of the reactants, of the products, and of the transition state were calculated employing the MP2/aug-cc-pVDZ calculation level using Gaussian 09 [89]. A complete study of this reaction using our methodology can be found in Reference [44].

Figure 4 obtained by the Transitivity code shows the comparison between the calculated reaction rate constant and the experimental data [90]. No major differences were found using either the Bel58-2T and ST tunneling corrections. Conversely, as expected, a divergence is observed of the Bell58 formula at *Tc*, and *d*-TST does not describe the range of experimental data at low temperature, where the deep-tunneling regime becomes dominant: This confirms that its validity is limited to weak tunneling. Traditional TST is presented for comparative purposes and visualizes the need for corrections.

**Figure 4.** Arrhenius plot obtained from the Transitivity code for the OH + HCl → Cl + H2O reaction using TST with Bell35, Bell58, ST tunneling correction, and *d*-TST. Experimental data in the literature [90] are available for comparison and shown as full dots.

## 4.2.2. The NH3 + OH → NH2 + H2O Reaction

The NH3 + OH → NH2 + H2O reaction permits to illustrate the accuracy of the methodology in the liquid-phase. Energies, geometries, and frequencies of stationary points were extracted at the same level of calculation used in the Eyringpy code [54].

The upper panels in Figure 5 show the temperature dependence of the overall reaction rate constant, *kObs*, for NH3 + OH → NH2 + H2O reaction from 273 to 4000 K using Kramers and Collins–Kimball models. The Smoluchowski diffusion rate constant <sup>→</sup> *k <sup>D</sup>*, which evaluates the diffusion limit for a bimolecular reactive process including the solvent effect, is shown in the lower right panel. The Kramers transmission correction, which evaluates the interference of the friction effect of the solvent in the reactive process as a function of temperature, is shown in the lower left panel. At 298.15 K, Kramers' formulation gives for the reaction rate constant the value 6.73 <sup>×</sup> <sup>10</sup><sup>11</sup> cm<sup>3</sup> mol−<sup>1</sup> <sup>s</sup><sup>−</sup>1, while the Collins-Kimball formulation yields 6.77 <sup>×</sup> 1011 cm<sup>3</sup> mol−<sup>1</sup> s−1, (the experimental value indicates ~1011 cm3 mol−<sup>1</sup> <sup>s</sup><sup>−</sup>1) [91–93]. The value of <sup>→</sup> *k <sup>D</sup>* from Smoluchowski (Collins–Kimball) is 3.73 <sup>×</sup> <sup>10</sup><sup>12</sup> cm3 mol−<sup>1</sup> <sup>s</sup><sup>−</sup>1, in accordance with that calculated in Reference [54], <sup>→</sup> *<sup>k</sup> <sup>D</sup>* <sup>=</sup> 3.60 <sup>×</sup> <sup>10</sup><sup>12</sup> cm3 mol−<sup>1</sup> <sup>s</sup><sup>−</sup>1.

**Figure 5.** Upper panels present the Arrhenius plots as given by the program for the NH3 + OH → NH2 + H2O reaction using Kramers' and Collins–Kimball formulations. The lower panels show the Kramers transmission and Smoluchowski diffusion limit constant as a function of inverse temperature.

#### *4.3. CPMD Input Files Generator*

Coordinates of specific molecules can be selected to test the function of creating a first-principles molecular dynamics input to run by the CPMD code (Figure 6). When selecting this option, the user must choose between the CPMD, PIMD, TSH, MTD and BOMD approaches and must provide the molecular geometry of the system in \*.xyz format. Additional information (DFT functional, simulation temperature, system charge, the maximum number of steps, integration time step, and the size of the box) can be inserted through the indicated boxes. Furthermore, it is possible to generate several aleatory configurations by selecting specific initial conditions for the geometric parameters (For more information, see the input files in the Example directory on the www.vhcsgroup.com/transitivity web page).

**Figure 6.** An exemplary view of the input generation function of the transitivity program. Details of input files can be found in the www.vhcsgroup.com/transitivity web page.

#### **5. Final Remarks**

The Transitivity code presented in this article tends to systematize the tools developed over the last ten years [25] to handle the kinetics of processes beyond Arrhenius, providing options for: (i) The phenomenological fitting and estimation of reaction rate constants in gas- and liquid-phases and (ii) the preparation of first-principles molecular dynamics in order to evaluate the parameters relative to a variety of reactive processes. In addition, the code provides an easy user-friendly interface and may be relevant for didactic purposes.

A characterizing feature of the code is the consistent use of the *d*-formulation, which recently culminated in a series of successful applications from phenomenological to first-principles descriptions of pure and applied chemical kinetics and material science. Examples are available:


Regarding the evolution and consistency of the code, efforts are being made to introduce *d* predictive formulations to transport properties, as well as post-processing of trajectories obtained by first-principles molecular dynamics simulations. Furthermore, the Eckart tunneling correction and the variational Transition-State Theory represent focuses for the future implementation of the Transitivity code.

**Author Contributions:** V.H.C.-S. and F.P. conceived and supervised the study. V.H.C.-S. and N.D.C. wrote the paper. H.G.M. and F. O.S.-N. developed the python code, performed stationary electronic structure calculations and fitting procedures. K.C.M. developed and implemented the GSA Fortran code. The manuscript was discussed and reviewed by all authors.

**Funding:** This research was funded by Brazilian agency CNPq, grant number Universal 01/2016 - Faixa A - 406063/2016-8], by Organizzazione Internazionale Italo-latino Americana (IILA) for a Biotechnology Sector-2019 scholarship and by the Italian Ministry for Education, University and Research, MIUR, for financial support: SIR 2014 "Scientific Independence for Young Researchers" (RBSI14U3VF).

**Acknowledgments:** The authors are grateful for the support given by Brazilian agency CAPES. This research is also supported by the High-Performance Computing Center at the Universidade Estadual de Goiás, Brazil. Valter H. Carvalho-Silva thanks Brazilian agency CNPq for the research funding programs [Universal 01/2016 - Faixa A - 406063/2016-8] and Organizzazione Internazionale Italo-latino Americana (IILA) for a Biotechnology Sector-2019 scholarship. Federico Palazzetti and Nayara D. Coutinho acknowledge the Italian Ministry for Education, University and Research, MIUR, for financial support: SIR 2014 "Scientific Independence for Young Researchers" (RBSI14U3VF). We thank professor Vincenzo Aquilanti for fruitful discussions.

**Conflicts of Interest:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

#### **Appendix A**


**Table A1.** List of Symbols and Nomenclatures.

#### **References**


**Sample Availability:** Not available.

© 2019 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).

MDPI St. Alban-Anlage 66 4052 Basel Switzerland Tel. +41 61 683 77 34 Fax +41 61 302 89 18 www.mdpi.com

*Molecules* Editorial Office E-mail: molecules@mdpi.com www.mdpi.com/journal/molecules

MDPI St. Alban-Anlage 66 4052 Basel Switzerland

Tel: +41 61 683 77 34 Fax: +41 61 302 89 18