*Review* **Spontaneous Switching among Conformational Ensembles in Intrinsically Disordered Proteins**

#### **Ucheor B. Choi 1, Hugo Sanabria 2, Tatyana Smirnova 3, Mark E. Bowen 4,\* and Keith R. Weninger 5,\***


Received: 17 January 2019; Accepted: 15 March 2019; Published: 22 March 2019

**Abstract:** The common conception of intrinsically disordered proteins (IDPs) is that they stochastically sample all possible configurations driven by thermal fluctuations. This is certainly true for many IDPs, which behave as swollen random coils that can be described using polymer models developed for homopolymers. However, the variability in interaction energy between different amino acid sequences provides the possibility that some configurations may be strongly preferred while others are forbidden. In compact globular IDPs, core hydration and packing density can vary between segments of the polypeptide chain leading to complex conformational dynamics. Here, we describe a growing number of proteins that appear intrinsically disordered by biochemical and bioinformatic characterization but switch between restricted regions of conformational space. In some cases, spontaneous switching between conformational ensembles was directly observed, but few methods can identify when an IDP is acting as a restricted chain. Such switching between disparate corners of conformational space could bias ligand binding and regulate the volume of IDPs acting as structural or entropic elements. Thus, mapping the accessible energy landscape and capturing dynamics across a wide range of timescales are essential to recognize when an IDP is acting as such a switch.

**Keywords:** dynamic configuration; free energy landscape; intrinsically disordered protein; IDP

#### **1. Introduction**

In general, proteins do not fold into static structures. Rather, most proteins fluctuate within a pseudocontinuum of accessible configurations about their lowest energy folded state. Often, these fluctuations are coupled to the protein's functional cycle such as catalytic activity or molecular recognition. However, many proteins lack a predominant low-energy state, and instead sample a broad and disparate ensemble of configurations. Such intrinsically disordered proteins (IDPs) lack the stable folded state, which is a central element in the classical structure–function paradigm. Nonetheless, IDPs, and proteins containing intrinsically disordered regions (IDRs), are now understood to play essential roles in cell signaling pathways and regulatory networks [1–6].

Models for how function arises from an ensemble of rapidly interconverting IDP conformers generally fall into two categories: those directly involved in molecular recognition and those acting as linker or structural elements. Unlike molecular recognition by folded domains, wherein the interaction residues are spread across the polypeptide chain, IDP interactions tend to involve short linear motifs (SLiMs), which are contiguous within the chain. Disordered SLiMs adopt an ensemble of conformations and these different conformations of the same sequence may be recognized by distinct binding partners [7]. The timescale of IDP conformational dynamics is an important determinant of the binding mode for IDPs. Rapid conformational dynamics supports the classic induced fit model, where the IDP can reconfigure the binding site before the initial encounter complex dissociates. Many IDPs couple the energy from order transitions to the recognition event, which provides a powerful mechanism for allostery. In other cases, the disorder persists even in the bound state [8,9]. If the timescale for IDP dynamics slows, then the binding mode is limited to conformational selection wherein the interaction is only possible during those windows when the binding competent configuration is present. In this way, the timescale of IDP conformational dynamics plays a key role in differentiating modes of molecular recognition by IDPs.

The dynamic nature of IDPs and IDRs makes them well suited to function as linkers between functional elements such as binding sites or folded domains. For example, the fly-casting model revealed how conformational exchange allows IDPs to explore a large volume in the search for binding partners [10–12]. The entropic clock model showed how the degree of extension of the IDP linker between a pore and its blocking domain controlled the open time of an ion channel [13]. The entropic bristle model suggested that IDPs can fill large volumes of three-dimensional (3D) space during their conformational searching, which can regulate protein interactions [14]. Thus, the timescale and the range of conformational sampling within the ensemble governs the structural properties of IDPs acting as linkers or bristles.

These functional roles for IDPs rely on their dynamic properties to control the energetics of binding reactions as well as for regulating hydrodynamic volume and spacing. Understanding protein structure is the key to describing its function. With IDPs this necessitates extending our understanding beyond minimum energy states to further characterize the ensemble both in terms of the accessible landscape as well as the timescales of conformational dynamics.

The rise of polymer science led to a great foundation of Nobel prize-winning theory to describe the behavior of homopolymers composed of many repeated subunits [15]. Based on different starting assumptions about the polymer, a continuum ensemble of end-to-end distances can be estimated by well-established approximations such as a Gaussian chain, a self-avoiding random coil, or a worm-like chain model [16,17]. While fully appropriate for polymers, in application to IDPs, such models lack molecular detail and rely on assumptions about polymer behavior that are not strictly applicable to polypeptides. Nonetheless, such polymer models still retain great predictive power, particularly for proteins in near ideal solvent conditions. As such, polymer models have shown great utility in describing the ensemble properties of chemically denatured proteins and coil-like IDPs.

Due to the intrinsic flexibility of IDPs, models from polymer physics have proven useful to describe their ensemble properties in many specific cases. A common extrapolation from polymer models is the assumption that the conformational landscape is generally featureless and that the entire landscape is accessible. For proteins selected through evolution to fold, neither of these assumptions is true. The extent to which these assumptions hold true for IDPs depends on their class. Simple swollen coils often follow polymer scaling, but now it is recognized that many IDPs, even polar polypeptides like Huntingtin [18], undergo collapse to form more compact, disordered globules [19,20]. At these higher packing densities, self-interaction becomes more prevalent and conformational dynamics become more complex. In addition, while specific long-range intramolecular interactions are generally absent in IDPs, molecular dynamics simulations have shown that charge patterning does govern IDP conformation through intramolecular electrostatic effects [19,21,22]. Thus, sequence evolution can select for IDPs with limited or biased conformational ensembles such that a continuum of conformations is not possible.

Conformational switching is well accepted in folded proteins, and IDPs may also be able to switch between discrete conformational ensembles while remaining disordered in both states. Here, we will focus on ensemble switching behavior identified in an increasing number of IDPs (Figure 1). In these cases, IDPs were observed to stochastically switch between distinct states within the entirety of conformational space or showed evidence of dynamics on slow timescales (Figure 1B). Both phenomena are suggestive of large energy barriers between states or the existence of metastable intermediates. A combination of methods from ensemble to single molecule is required to elucidate this individual molecular behavior. Regulating the access to distinct regions of IDP conformational space provides mechanisms to govern IDP activity. Understanding the origins of switch-like transitions will provide new insights into the mechanisms by which IDP conformational dynamics can modulate cell signaling.

**Figure 1.** Single molecule observation of state switching in an intrinsically disordered protein (IDP) with single molecule Förster resonance energy transfer (smFRET). The soluble N-ethylmaleimide-sensitive factor activating protein receptor (SNARE) protein synaptosomal nerve-associated protein 25A (SNAP-25A) and the C-terminal tail of the GluN2B subunit of the *N*-methyl-D-aspartate (NMDA) receptor (NMDAR) (C-term-N2B) are intrinsically disordered in their native states. Residues 20 and 197 of SNAP-25A and residues 1273 and 1394 of C-term-N2B were randomly labeled with donor and acceptor fluorophores for smFRET measurements [23]. Labeled single molecules were encapsulated in liposomes (100 nm in diameter) that were then surface-tethered through biotin–streptavidin linkage on a surface passivated with biotinylated bovine serum albumin (BSA). Fluorescence emission was recorded using an emCCD (electron multiplying charge coupled device) camera at a frame rate of 10 Hz. At this time resolution, rapid conformational dynamics are time-averaged, but surface attachment

allows extended observations of the same molecule for seconds to minutes. (**A**) Representative single molecule fluorescence intensity time trace of SNAP-25A (top panel) does not show spontaneous switching. The fluorescence intensities were converted to Förster resonance energy transfer (FRET) efficiency (bottom panel, black line). SNAP-25A molecules showed a stable FRET efficiency with no switching between different FRET values. (**B**) Representative single molecule fluorescence intensity time trace of C-term-N2B (top panel) and FRET efficiency (bottom panel, black line) fit by hidden Markov modeling (HMM) to obtain the dwell times in each state (bottom panel, red line). Donor and acceptor signals for C-term-N2B molecules show step-wise, anticorrelated changes in intensity, yielding steady FRET for seconds before spontaneously switching to different FRET values, which would correspond to distinct disordered states with a different average size. (**C**) FRET efficiency histogram for C-term-N2B molecules show a broad distribution across the entire range of FRET values (black line) in contrast to SNAP-25A molecules (red lines). (**D**) Histogram of state dwell times for C-term-N2B obtained from HMM. The mean state dwell time is on the order of seconds and showed no correlation with FRET efficiency. (**E**) Transition density plot for C-term-N2B shows the FRET efficiency before a transition (*y*-axis) plotted against the FRET efficiency after that transition (*x*-axis) for all observed transitions. These transitions proved too variable to resolve or assign to specific conformations. A.U.: Arbitrary units.

#### **2. Evidence of Configuration Switching in IDP Studies**

One of the best examples of state switching in polypeptides is protein folding. There are two states: the unfolded state, which is characterized by a broad ensemble of disparate, interconverting conformations, and the folded state, which is characterized by local fluctuation within a narrow range of conformational space. In most cases, such folding transitions are unidirectional under physiological conditions. However, some proteins, such as the ankyrin repeat (AR) domain from the IκB transcription inhibitor, undergo reversible order to disorder transitions at room temperature [24]. Only single molecule fluorescence resonance energy transfer (smFRET) of surface attached molecules could detect the "nanospring dynamics" that occurred on the seconds timescale as individual akyrin repeats came "unglued" [25]. Ensemble nuclear magnetic resonance (NMR) measurements on the same protein showed well-resolved NMR cross-peaks and high-order parameters but no signs of dynamic behavior [24]. Such a slow timescale for state switching suggests a large energy barrier separating these regions of conformational space.

A similar conformational state switching has been reported in α-synuclein, an IDP linked to Parkinson's disease that undergoes a disorder-to-order transition upon interaction with amphiphilic small molecules or membranes. However, in this case, the transition is not spontaneous but regulated by functional interactions [26–28]. Other IDPs are known to change their form of disorder in response to physiological signals such as ion influx or posttranslational modifications like phosphorylation [29,30]. A key aspect of state switching in these IDPs is that the entire conformational landscape is not always accessible or there are sharp energy barriers, which separate discrete subregions of conformational space. The conformation is not "continuously tunable" [26].

Given the broad ensemble of conformations an IDP may adopt, it is not always possible to know how much of the conformational landscape is being explored and when such switching is occurring. One hallmark of deeper energy basins within the conformational landscape is the presence of slow timescale dynamics. However, most structural methods are not well suited for detecting slow timescale conformational dynamics.

A comparison of synaptic IDPs and IDRs revealed stochastic conformational switching in two out of five proteins in a test set despite the fact that all proteins appeared similarly intrinsically disordered by other measures [23]. The cytoplasmic domains from both neuroligin and the GluN2B subunit of the *N*-methyl-D-aspartate (NMDA) receptor (NMDAR) showed continuous hop-like conformational diffusion with Förster resonance energy transfer (FRET) shifts equivalent to nanometer scale motions in a single 100 millisecond time bin (Figure 1B). The sensitivity to stoichiometry provided by single molecule detection allows IDP clustering or aggregation to be distinguished from single molecule conformational switching. These IDRs adopted a compact globular form of disorder, yet another IDP (synaptobrevin) with a similar form of disorder failed to show transitions [23].

The transitions in GluN2B were detected with seven different FRET labeling combinations representing dye separations from 83 to 172 residues [31]. Only the shortest separation of 15 residues failed to show any transitions, which is expected because a short polypeptide segment should not be capable of large conformational changes. This important control confirms that photophysical effects on dye environment and orientation are not the origin of the transitioning phenomena. Because of their complexity, the transitions observed in synaptic IDRs proved uninterpretable in terms of structural intermediates [23].

Similar stochastic transitions between FRET efficiency levels were observed in the smFRET traces for protein 4.1, a cytoskeletal adaptor protein that stabilizes spectrin–actin crosslinks [32]. The protein appeared to switch between an unresolved number of discrete conformational states. Interestingly, while binding of protein 4.1 to the nuclear mitotic apparatus (NuMA) protein changed the pattern of transitions, it did not eliminate the transitions, indicating that the complex retains switch-like dynamics. Similarly, binding partner interactions with the synaptic scaffold protein postsynaptic density protein 95 (PSD-95) also showed no effect on conformational switching in the IDR from neuroligin [23]. Conformational switching may play a functional role even after these IDPs interact with downstream factors.

State switching in IDPs is also possible on faster timescales. The yeast prion protein Sup35 is a translation termination factor that forms amyloid fibrils. At low concentrations, smFRET showed that the protein formed a compact disordered globule [33]. However, fluorescence correlation spectroscopy (FCS) analysis of fluorescence quenching revealed at least two well separated components to the dynamics, including a slower component that originated from long-range contacts.

#### **3. Theoretical Framework for Understanding IDP Conformational Switching**

Swollen random coils can fully sample a "flat" energy landscape. Fast sampling is possible because little energy is needed to overcome small barriers between different configurations. Such IDPs have been described using equilibrium statistical mechanics as freely joint Gaussian chains, self-avoiding coils, and worm-like chains [17,23,34]. The conformational switching, described above in Section 2, challenges such approaches because these IDPs are sequestered off from the entirety of the conformational space. Thus, in the absence of the simplifying assumption that all states are present and equally probable, one cannot construct a statistical ensemble that represents all the possible states.

Intrinsically disordered protein conformational switching has been likened to the Lorenz attractor in nonlinear chaotic systems [35]. Like the Lorenz attractor, this switching behavior in IDPs has been shown to be sensitive to initial conditions and is also restricted to be near the attractor points in conformational space. Chaos, in dynamical mechanical systems, is dependent on underlying nonlinear relationships in the governing interactions. Intrinsically disordered proteins certainly have interactions within their peptide chain, with other components in solution, and with the host fluid that are likely to involve nonlinearities.

All conformational transitions are noise-assisted reactions of the type described by Kramers' transition state theory [36–38]. The energy for excursions between basins must come from fluctuations where the polypeptide gains enough energy from the thermal milieu. Therefore, the continuum of timescales relates to the continuum of barriers between different regions of conformational space. Polypeptides are not uniform chemical polymers. Within the core of a compact, globular IDP there exists a nonlinear potential as alternate long-range interactions (favorable and unfavorable) are transiently sampled as distinct regions of the polypeptide chain are brought into close proximity. Given such a fluctuating force, thermally activated transitions could occur across a range of timescales. Even rare transitions are possible within some finite time.

Within Kramers' theory, the time to escape a basin is related to the damping factor associated with internal friction. Studies of IDPs have pointed to the important role of internal friction in modulating conformational dynamics and folding [39,40]. Friction within expanded IDPs depends on the quality of solvent interactions. However, in collapsed, globular IDPs, shielding of residues in the interior removes solvent interactions and can create a complex network of coupled and decoupled chain segments [34,41]. Thus, nonlinear dynamics arising from feedback among these different couplings should not be surprising. Large transitions (e.g., folding–unfolding) involve a complex set of possible pathways within the energy landscape [42].

Conformational switching occurs over a wide range of timescales, ranging from very slow processes lasting for seconds down to microsecond timescales [43]. To explain this, a scale-free approach is needed. The energy landscape we have postulated for these IDPs, containing multiple minima with high energy barriers, is also characteristic of glasses (Figure 2). Mean-field theories of disordered glasses can describe the existence of stable and metastable states [44–46]. At low density or low packing fraction (φ), glasses remain liquid and can sample all states (Figure 2C). As the packing fraction increases, the individual particles are subject to jamming and undergo a glass transition where they become trapped within individual basins (Figure 2D). In this jammed state, excitations can extend over a wide range of timescales [47].

**Figure 2.** Switching in IDPs shows similarity to structural glasses. (**A**) IDPs can be modeled as worm-like chains when the packing fraction (φ) of the constituent particles is low (φL). Glasses can flow when the packing fraction is low but undergo a phase transition at high packing fractions when particles are caged by interactions with their neighbors. Applying this analogy of glasses to describe IDPs, the amino acids are the constituent particles. (**B**) If amino acids interact, the packing fraction increases, giving rise to a more complex energy landscape with states at different energy levels. (**C**) These states (simple basins) represent the minima of the energy landscape. When the packing fraction is lower than the glass transition (φg), or collapsed state, the polypeptide can sample a continuum of states following polymer models. Transitions between states would be described by Kramers' transition state theory. (**D**) When there are many network interactions, the packing fraction increases and not all conformations are accessible. The limiting states are separated by deep wells and discrete states can be identified. Such conformational switching could result from a number of different mechanisms including internal friction, long rage potentials, ion mediated interactions, or other forces that modulate the network interactions. (**E**) Discrete limiting states produce conformational switching, yet there is still a subensemble of conformations inside each basin. (**F**) At high packing fractions, the subensemble of states can also become discontinuous, generating potential transitions across widely separated temporal domains. (Figure adapted from reference [47]).

Deep within the glass phase, each individual basin becomes a metabasin composed of a fractal hierarchy of sub-basins (Figure 2F) [48]. This fractal free-energy landscape was recently proposed to explain the roughness transition in structural glasses [47]. A similar fractal hierarchy or scale freeenergy landscape could arise from packing of chain segments within an IDP. The packing particles are collapsed chain segments within the polymer along with coordinated ions and bound solvent (Figure 2B). A high packing fraction could lead to jamming and produce metastable configurations, which might switch to other equally stable configurations as intrachain interaction forces evolve with the conformational search. In this model, such local phase transitions would extend to other packed chain segments, leading to the observed IDP conformational switching across a wide temporal regime.

#### **4. Possible Mechanisms to Sequester Regions of Conformational Space in IDPs**

The origin of continuous slow timescale dynamics or state switching is not clear. Proline isomerization is one of the conformational changes in polypeptides known to operate on this timescale and can result in state switching. Conformational exchange rates linked to proline isomerization were detected in the IDR of the transcriptional regulator E2 from human papillomavirus (HPV) by collecting NMR nuclear overhauser spectroscopy (NOESY) spectra at different mixing times [20]. This discrete conformational transition in E2 is the rate-limiting step for antibody recognition of this viral antigen. Interestingly, proline depletion of GluN2B reduced the number of molecules that showed any stochastic FRET transitions, but proline-depleted constructs showed similar transition rates to the wild type [49]. Thus, for GluN2B, proline appeared to facilitate switching but not govern the timescale.

Aside from proline isomerization, the major determinant of IDP conformation are self-interactions (i.e., between amino acids in the chain) and solvent interactions. Most intrachain interactions in IDPs are local, yet chain segments can partition into both packed and extended forms of disorder. These chain segments are continuously exposed to one another during the conformational search, with favorable interactions restricting chain motions and unfavorable interactions limiting access to some conformations. Within the disordered globule, the quality and quantity of the solvent is evolving as water and ions interact with local chain segments and are drawn into the "core" of the IDP. For example, ions from solution could neutralize local charge densities stabilizing distinct configuration space from those visited in the absence of the ions. Compacted chain segments of IDPs can temporarily exclude water [34,41,50], which would result in stable local minima that would limit the conformational search. As noted above, the complexity of conformational space within these fractal basins would mean that available thermal energy could be dissipated by small, local rearrangements rather than long-range motions.

Chain reconfiguration could also transiently trap unfavorable interactions within the globular interior. This unfavorable interaction would destabilize any metastable local conformations and drive the system to a new region of conformational space. Reintroduction of water to a temporarily dehydrated chain segment would upset the balance of the chain interactions. Chain reconfiguration will inevitably bring charged amino acids within the globular interior with a finite probability of unfavorable interactions being trapped by approach jamming. These sorts of trapped interactions represent high energy metastable intermediates that could suddenly be released when a random fluctuation exposes a pathway that permits access to alternate regions of conformational space. Such an event would manifest as sudden stochastic changes in the sampled conformational space, or what we have termed IDP conformational switching.

It remains to be established that effects as subtle as sequestering an additional ion or water molecule could generate the spontaneous IDP ensemble switching that is the focus of this discussion. There are some relevant examples where such small perturbations can affect molecular properties. Deoxyribonucleic acid (DNA) certainly can transiently recruit or bind ions from solution changing its polymeric properties [51], bearing in mind that DNA can easily be modeled as a worm-like chain [52,53]. Certainly, changes in proteins at the level of a single amino acid can impact disordered states [54–63]. Changes of the amino acid sequence in IDPs and IDRs have been linked to human

diseases. Aggregation of α-synuclein has been linked to sequence composition in familial forms of Parkinson's disease. Familial mutations that enhanced aggregation slowed conformational dynamics, while mutations that sped up intramolecular diffusion inhibited aggregation [64]. De novo missense mutations in the cytoplasmic IDR of the NMDAR, discussed above, are linked familial forms of epilepsy [65,66]. That changes at the level of individual amino acids can reshape the energy landscape of IDPs suggests that transient interactions with ions or water could generate the observed ensemble switching phenomenon.

#### **5. State Switching Can Occur Close to the Boundary of a Folding Transition**

Most protein folding studies to date have focused on single domain proteins displaying simple two state behavior. The folding of large multidomain proteins can be more complex with metastable intermediate states in their energy landscapes. At low denaturant concentrations (0.65 M GdmCl), the three-domain, 214 amino acid protein adenylate kinase (AK) began to show stochastic FRET transitions between six states [42]. The transitions were too variable to resolve nor could they all be assigned to specific denatured intermediates. Furthermore, the "situation [became] even more complex at higher concentrations of denaturant." [42]. This shows that a folded protein slightly destabilized by denaturants shows similarly complex transitioning to that observed in switching IDPs.

Thus, transitioning molecules arise when the impetus for a polypeptide to fold is lowered slightly and becomes more complex as more of the conformational landscape becomes available. Ultimately, transitions become faster until they are unresolved and become continuous dynamics of the type described by polymer models. Thus, a similar transition should be possible if an expanded coil is brought close to a folded state. Interestingly, smFRET measured in the presynaptic fusion protein SNAP-25 did show scaling that fit to polymer models [23]. However, stochastic FRET switching was induced in SNAP-25 upon binding the SNARE protein syntaxin, which is a binary intermediate in the formation of the tripartite SNARE complex [67]. This confirms that an expanded coil-like IDP can be converted to state switching as protein interactions restrict access to portions of the conformational landscape.

#### **6. Functional Relevance of IDP Ensemble Switching through Phosphorylation**

Aside from transitions involving an ordered state, few biological functions have been directly connected to the modulation of IDP conformational ensembles. Post-translational modifications can change the interaction potential of existing amino acids [29,30,68–71]. Phosphorylation is among the best-documented mechanism for dynamically affecting biological function of proteins. In particular, phosphorylation has been connected with modulating disorder in IDPs [29,68–75].

For example, the ribonucleic acid (RNA) binding protein fused in sarcoma (FUS) forms aggregated protein deposits in neurodegenerative disorders, which are modulated by phosphorylation. Nuclear magnetic resonance studies found that phosphorylation of FUS, or phosphomimetic mutations, did not "alter the disordered structure of FUS" [50]. However, phosphorylation of FUS did decrease transient polypeptide collapse and increase the radius of gyration. These changes in the IDP ensemble were associated with reduced intermolecular interaction and aggregation such that a phosphomimetic variant reduced the toxicity of FUS to live cells [50].

Phosphorylation of an IDR was also linked to biological function of the NMDAR, which uses the energy of neurotransmitter binding to open its ion channel. Allosteric inhibition of channel gating by extracellular zinc can be alleviated by Src kinase phosphorylation of the C-terminal IDR of the GluN2B subunit (C-term-N2B) [76], which switches conformation as noted above (Figure 3A) [23]. The effect of Src phosphorylation is mediated by expansion of this globular IDR without affecting conformational transitions (Figure 3B,C) [31]. Deleting prolines near the phosphorylation sites had the opposite effect and compacted the IDR while reducing the probability of transitions. Both of these modifications, which change the size of the disordered states in opposite directions, eliminated the ability of zinc to allosterically regulate channel gating without disrupting the underlying gating mechanisms [49]. Thus, this IDR appears to have an optimal packing density that supports allosteric coupling between domains of the receptor.

**Figure 3.** Effect of Src kinase phosphorylation on the conformational ensemble of the NMDA receptor. The C-terminal intrinsically disordered region (IDR) of the GluN2B subunit of the NMDA receptor (C-term-N2B) was randomly labeled with donor and acceptor fluorophores at residues 1323 and 1453 [31]. Src kinase phosphorylates C-term-N2B on tyrosine residues 1336 and 1472. (**A**) Representative single molecule fluorescence intensity (top panels) and fluorescence resonance energy transfer (FRET) efficiency fit by hidden Markov modeling (HMM) (bottom panels) for unphosphorylated (left) and phosphorylated (right) C-term-N2B. Phosphorylation did not induce structure in C-term-N2B as the dynamic transitions continued. (**B**) FRET efficiency histogram of C-term-N2B before (black line) and after (red line) Src phosphorylation. Phosphorylation led to shift towards lower FRET indicating a general expansion of the polypeptide, which was confirmed with hydrodynamic measurements. (**C**) Dwell time histograms for transitions in C-term-N2B, obtained from HMM, before (black line) and after (red line) Src phosphorylation. Although phosphorylation shifted the ensemble FRET efficiency, stochastic transitions were unaffected. A.U.: Arbitrary units.

Phosphorylation-induced modulation of an IDP ensemble was also connected to regulation of cellular signaling processes in prostate cancer. Prostate-associated gene 4 (PAGE4) is an IDP that is expressed exclusively in adult males who have prostate cancer. Interactions between PAGE4 and transcription factors have been suggested to control androgen sensitivity [77–83]. Both homeodomain-interacting protein kinase 1 (HIPK1) and CDC-Like Kinase 2 (CLK2) phosphorylate PAGE4, with CLK2 modifying many more sites. Experiments combining NMR, paramagnetic relaxation enhancement (PRE), small angle X-ray scattering (SAXS) and smFRET have determined that HIPK1 phosphorylation compacts the ensemble, while CLK2 phosphorylation leads to expansion. Molecular dynamics (MD) simulations linked changes in PAGE4 ensembles to distinct phosphorylation patterns [84]. These different phosphorylation patterns influenced transcription factor binding. HIPK1-treated PAGE4 binds to AP-1, whereas CLK2 treatment of PAGE4 decreases its affinity for AP-1. Phosphorylation by HIPK1 effectively disrupts the PAGE4 interaction with c-Jun and consequently stimulated c-Jun dependent transcription in prostate cancer cell models [85,86]. In contrast, CLK2 phosphorylation of PAGE4 inhibited c-Jun dependent transcription. Importantly, HIPK1 is

expressed in both androgen-dependent and androgen-independent prostate cancer cells, whereas CLK2 and PAGE4 are expressed only in androgen-dependent cells. A model for the PAGE4–Jun-Fos (AP-1)–AR regulatory circuit suggests phosphorylation patterns in prostate cancer cells can oscillate. Thus, it was proposed that androgen dependence may vary in time [78] with switching between androgen-dependent and androgen-independent phenotypes being a result of details of PAGE4 phosphorylation [77,79,80,84]. This differential phosphorylation is associated with opposing shifts in the conformational ensemble of PAGE4.

#### **7. Conclusions and Prospects for Understanding IDP Switching**

Intrinsically disordered proteins are essential components of cellular signaling pathways because of their adaptability to the local environment. Cell signaling events can lead to changes in ionic composition or pH that affect solvent quality while posttranslational modifications affect net charge and hydropathy. Intrinsically disordered proteins can respond instantly to such signals by shifting to alternate conformational ensembles. Their unique capabilities can allow a single IDP to interact with multiple binding partners. Intrinsically disordered proteins also play structural roles as linkers or entropic elements. Additionally, IDPs are linked to the formation of membraneless organelles (i.e., liquid phase separation) [87,88]. These important functions allow for more nuanced coordination of signal transduction.

Here, we have highlighted a recently identified conformational switching phenomenon observed in a small but growing number of IDPs. Intrinsically disordered proteins in this class appear disordered by standard measures of secondary structure or hydrodynamic mobility but also appear to fluctuate between well-separated regions of conformational space. If the conformational ensembles are functionally distinct, then mechanisms that biases conformational sampling will govern protein activity. We suggest such control over IDP switching may be an important regulator of cellular signaling networks.

Few experimental methods are sensitive to conformational switching in IDPs [89]. Integrative structural biology approaches where several methods are applied to a single protein may help inform our understanding of the molecular mechanisms controlling intrinsic disorder in proteins. Several methods including smFRET, NMR, electron paramagnetic resonance double electron–electron resonance (EPR-DEER), paramagnetic relaxation enhancement (PRE), and small-angle neutron scattering (SANS)/small-angle X-ray scattering (SAXS) have all been applied to IDP studies. These different methods are able to provide complementary information about local chain contacts and global disordered structure. Comparing and cross-validating such results with multiple methods may help reveal specific interactions that are critical for determining details of the disordered state. To date, switching has only been observed in vitro, so it remains to be determined if such switching transitions are present within live cells [3], where smFRET may enable direct observation [90–92]. Similarly, switching has been primarily studied under dilute conditions so it remains unknown whether the phenomenon persists in the condensed phase [93,94].

In addition to experimental approaches, MD simulation is a critical tool to understand the mechanisms that govern access to the full conformational ensemble. Molecular dynamics simulations of IDPs are particularly challenging. Details of the force fields are critically important and are a topic of continued development [95,96]. The long timescales required to observe ensemble switching are difficult to achieve for standard MD simulation and require more sophisticated ensemble sampling approaches. Despite these difficulties progress applying MD simulation to IDPs is an area of active work [84,97–103]. Critically, experimental studies and simulations provide essential feedback to each other [103–107]. The molecular mechanisms controlling rapid dynamics in IDPs are becoming clearer through MD simulation [83,101]. However, spontaneous ensemble switching of an IDP has yet to be reported so molecular details remain unknown.

If IDP ensemble switching does regulate signaling, then these mechanisms could be used for therapeutic intervention in those pathways. Efforts are already underway to identify small molecules that can specifically bind IDPs [2,108,109]. An exciting functional connection has been found for the transcription factor TFIID where a drug-like molecule affected the DNA interaction to prevent transcription initiation by RNA polymerase [2,109]. Additional strategies for intervening in signaling pathways within the diseased state will likely emerge as our understanding grows as to how IDP conformational dynamic leads to function within cellular networks.

**Author Contributions:** Conceptualization, M.E.B., H.S., T.S. and K.R.W.; writing—review and editing, U.B.C., M.E.B., H.S., T.S. and K.R.W.

**Funding:** This research has thus far been unable to secure any external funding.

**Conflicts of Interest:** The authors declare no conflict of interest.

#### **References**


© 2019 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).

## *Article* **Electrostatics of Tau Protein by Molecular Dynamics**

#### **Tarsila G. Castro 1,2, Florentina-Daniela Munteanu <sup>1</sup> and Artur Cavaco-Paulo 1,2,\***


Received: 18 February 2019; Accepted: 20 March 2019; Published: 23 March 2019

**Abstract:** Tau is a microtubule-associated protein that promotes microtubule assembly and stability. This protein is implicated in several neurodegenerative diseases, including Alzheimer's. To date, the three-dimensional (3D) structure of tau has not been fully solved, experimentally. Even the most recent information is sometimes controversial in regard to how this protein folds, interacts, and behaves. Predicting the tau structure and its profile sheds light on the knowledge about its properties and biological function, such as the binding to microtubules (MT) and, for instance, the effect on ionic conductivity. Our findings on the tau structure suggest a disordered protein, with discrete portions of well-defined secondary structure, mostly at the microtubule binding region. In addition, the first molecular dynamics simulation of full-length tau along with an MT section was performed, unveiling tau structure when associated with MT and interaction sites. Electrostatics and conductivity were also examined to understand how tau affects the ions in the intracellular fluid environment. Our results bring a new insight into tau and tubulin MT proteins, their characteristics, and the structure–function relationship.

**Keywords:** tau; microtubules; electrostatics; diffusion; protein structure prediction; molecular modelling; molecular dynamics; tau–microtubule association

#### **1. Introduction**

Tau is an intrinsically disordered protein (IDP) that stabilizes and promotes the assembly of microtubules (MTs) in neurons (Figure 1); it is, therefore, a microtubule-associated protein (MAP) [1]. Tau is involved in cell polarity and, as it is distributed along the axons, is responsible for the maintenance of neuron structure and healthy function [2,3]. Structurally, tau is also considered as a spacer between adjacent MTs, although that is not its main function [4].

**Figure 1.** Graphical representation of a neuron, highlighting a microtubule present in the axon and the interaction with the tau protein, through the four repeats at the microtubule binding region (MTBR). Adapted from Servier Medical Art [5].

The adult largest tau isoform has 441 amino acids, comprising a projection domain with two amino-terminal inserts N, encoded by exons 2 and 3, a proline-rich region and the microtubule binding region (MTBR), with four imperfect repeats (Figure 2). This isoform is commonly cited as 2N4RTau, isoform-F, htau40 and Tau-4, and is the largest one in the human central nervous system (CNS), which has five other isoforms [6].


**Figure 2.** Primary sequence and regions of 2N4RTau isoform, comprising 441 amino acids. N1 and N2 correspond to the two N terminal inserts, P1 and P2 are the proline-rich regions, and R1 to R4 are the four imperfect repeats responsible for the binding to the microtubule.

Besides the relevant structural support function, that comes from the interaction between tau and the tubulin MTs, tau plays an important role in nervous system diseases. If hyperphosphorylated, tau does not perform its normal function, but aggregates in paired helical filaments (PHF) [7,8] which inhibits the MT assembly. The PHFs cause tau lesions found in Alzheimer's disease (AD) brains and in other types of dementia [9–12].

Several post-translational modifications can occur in tau, namely, phosphorylation, glycosylation, nitration, glycation, ubiquitination, among others [13,14]. The degree of these modifications is what will lead to a normal or pathological functioning. In particular, tau phosphorylation is the most studied

alteration, as it is established that the abnormal hyperphosphorylation is associated with AD [15,16], a dementia that currently has no cure.

To date, tau's complete structure has not been fully solved experimentally. Only portions of the canonical sequence are known, or hyperphosphorylated sections, as listed in UniProt (Universal Protein resource) [17] and in the RCSB PDB (Research Collaboratory for Structural Bioinformatics Protein Data Bank) [18]. This fact hinders a more complete understanding of the structure–function relationship for this protein. Indeed, the information concerning tau structure bound to MT is controversial [19–21], even in the most recent papers on the subject [22,23].

Our work sheds light on the tau three-dimensional (3D) structure. We performed the complete modelling of the 2N4RTau isoform, using protein structure prediction (PSP) techniques, and submitted the obtained model to molecular dynamics (MD) simulations in order to characterize the following: tau profile in neuron fluid, its interactions and structure when associated with MT, the effect of tau on ions' diffusion and conductivity, among other properties.

#### **2. Materials and Methods**

#### *2.1. Tau Structure Prediction*

Tau conformation was predicted using the I-TASSER (Iterative Threading ASSEmbly Refinement) server, a method to predict protein structure and function that uses a multiple threading approach based on templates from the Protein Data Bank (PDB) [24]. The I-TASSER is consistently considered the best server for PSP in community-wide CASP (Critical Assessment of protein Structure Prediction) experiments [25,26]. The threading templates used to predict tau structure were the proteins with the following codes: 2nbi, 3zue, 2mz7, 1w0r, 4dur, and 1ziw. 2nbi corresponds to pleuralin-1, a structural protein that shares 19% of sequence identity with tau. 3zue is a virus capsid protein presenting 22% of overall sequence identity with tau. 2mz7 is the only template that matches an NMR tau structure (amino acids 267–312) [21], and fulfil 10% of similarity. 1w0r is properdin, a glycoprotein with 20% of sequence alignment with tau. 4dur corresponds to the X-ray structure of type II human plasminogen, a hydrolase that shares 17% of similarity with tau, and 1ziw is the human toll-like receptor 3, making a total of 17% of identity with tau.

The I-TASSER server starts the modelling from the structure templates mentioned above, identified with LOMETS (Local Meta-Threading-Server), but other procedures and programs take place to generate the final five hit models. From this list we chose the one with the best C-score. The model was predicted in 2017 with the template structures available that year.

#### *2.2. System Setup*

Three types of systems were designed to study tau (Figure 3). The first one was a simulation box composed only of neuronal intracellular fluid as control (Figure 3a). The second was a simulation box containing the predicted tau structure in the neuronal fluid (Figure 3b), and the third contained tau next to a tubulin wall (an MT section (PDB ID: 5JCO)) in the fluid (Figure 3c). The second and third systems are hereinafter referred as tau in fluid and Tau::MT in fluid. For tau in the fluid, the protein was centered in the box, but in the case of Tau::MT systems, the tubulin wall was placed at one of the walls of the box and tau randomly near to MT, as shown in Figure 3.

The fluid inside the neurons was composed of water, a high concentration of potassium and proteins with negative character, and a discrete concentration of sodium and chloride among other ions [27]. In the simulations performed, we fixed the concentration of K+ and Na<sup>+</sup> to the reported values [27–29]: 140 mM for K+ and 5–15 mM for Na+ at the neuron resting state. These cations are the ions involved in the transmission of electrical signals in these cells and, most importantly, the probability of being affected by the negatively charged MT wall and by tau is high.

As it was not feasible to simulate explicitly all cytoplasmic proteins in neuronal fluid, since it would require a greater computational power, we decided to replace the negative contribution from these proteins for Cl−. The final balance of charges, in the simulation boxes, would be the same using negatively charged proteins or Cl− ions. In addition, as described below, we used particle mesh Ewald (PME) for electrostatic treatment and periodic boundary conditions (PBC) in our simulations, therefore the simulation only formally converges if the net electric charge is zero. That is why the addition of ions to achieve neutrality is a common procedure in MD simulations.

**Figure 3.** Front view snapshots of the systems under simulation: (**a**) Intracellular fluid, (**b**) tau in fluid, and (**c**) tau::microtubule wall in fluid. Water molecules were omitted in (**b**) and (**c**) to allow better visualization. Tau is represented in green, microtubule in cyan, K+ in purple, Na<sup>+</sup> in blue, and Cl<sup>−</sup> in green spheres.

In all three cases, a simulation box with an approximate volume of 7200 nm3 was necessary. The final number of simple point-charge (SPC) water molecules depended on the volume occupied by tau or the pair Tau::MT.

#### *2.3. Molecular Dynamics Simulations*

Molecular dynamics simulations were performed on the systems described above, to understand tau structure and properties.

Two stages of energy minimization were performed using a maximum of 50,000 steps: the first using the steepest descent method and the second with the conjugate gradient algorithm [30]. Initialization steps using canonical (NVT, constant number of particles, volume and temperature) and isothermal-isobaric (NPT, constant number of particles, pressure and temperature) ensembles were performed applying position restraints (with force constant of 1000 kJ·mol−1·nm<sup>−</sup>2) to all heavy atoms in the NVT procedure, and to the main chain at the NPT initialization step, during 100 ps each. In the control situation (fluid), no position restraints were applied. The temperature was maintained constant at 310 K with V-rescale algorithm [31] and the pressure was regulated at 1 atm with the Parrinello–Rahman barostat [32]. The following coupling constants were considered: τ<sup>T</sup> = 0.10 ps and τ<sup>P</sup> = 2.0 ps. Subsequently, all systems were submitted to MD simulations: the fluid during 10 ns, tau in fluid during 60 ns, and Tau::MT in fluid for 70 ns, all using the NPT ensemble, without position restraints. Three replicates of each system were run to guarantee a better sampling of conformation states for these very large proteins under study.

All MD simulations were performed using the computational package GROMACS 5.1.4 version [30,33], within the GROMOS 54a7 force field (FF) [34,35]. The Lennard-Jones interactions were truncated at 1.4 nm and we used the particle mesh Ewald (PME) [36] method for electrostatic interactions, with a cut-off of 1.4 nm. The algorithm LINCS [37] was used to constrain the chemical bonds of the proteins and the algorithm SETTLE [38] in the case of water. Parameters for K+ in the scope of G54a7 FF were obtained from the paper of Cordomí et al. [39].

#### *2.4. Analysis*

From the MD simulations, we analyzed the trajectories looking at the root-mean-square deviation (RMSD) to determine when the systems were at equilibrium. Based on the RMSD results, the following analyses were performed from 30 ns.

Root-mean-square fluctuation (RMSF) analysis and CLUSTER analysis with the single-linkage method were used to understand regions and amino acids with greater flexibility and to determine the middle structure of each tau replicate, respectively. This technique adds structures that are below an RMSD cut-off, generating more or less populated clusters and, within the largest cluster, it finds a middle structure that is the most representative of the whole simulation. Radial distribution functions (RDF) were calculated for the positive ions around the tau surface.

GROMACS' mean square displacement (MSD) was used to calculate the ions' mean square displacement in all systems. To calculate the diffusion coefficient (D*i*) of the particles, one can use the Einstein relation [40], illustrated in Equation 1, where *ri* corresponds to the particle center of mass position at a certain time *t*:

$$D\_i = \lim\_{t \to \infty} \frac{1}{6t} \left< \left| \left| r\_i(t) - r\_i(0) \right| \right|^2 \right>. \tag{1}$$

From the D*<sup>i</sup>* obtained with this analysis, it is possible to calculate the molar conductivity, as demonstrated in Equation 2. The Nernst–Einstein equation (2) establishes the relationship between the molar limiting conductivity Λ*m,i* and the diffusion coefficient D*i* for any given ion *i*, at a certain temperature (T). In the equation, *z* corresponds to the charge of the ion *i*, *F* corresponds to Faraday's constant, and *R* is the gas constant. Here, our conductivity calculation is an approximation, since our systems do not represent an infinite dilution of the ions (non-interacting ions). We disregarded the ion–ion correction factor to use Equation (2) directly:

$$
\Lambda\_{m,i}^{\circ} = z\_i^2 D\_i \left( \frac{F^2}{RT} \right). \tag{2}
$$

The MSD was calculated for the total simulation time but considering time intervals of about 10 ns. We chose to do this because this analysis requires a large memory capacity for large systems and to ensure reliable MSD data, resulting in a linear MSD(t) plot for each "-b to -e" time interval option. All analysis programs used are available in the GROMACS 5.1.4 package [29,32].

The analyses of the electrostatic potential of the predicted tau structure and the microtubule section (5JCO) were performed using the PDB2PQR server [41] and the APBS (Adaptive Poisson-Boltzmann Solver) plugin in PyMOL molecular visualization program [42–44]. For the microtubule we calculated the electrostatic surface for a tubulin heterodimer, since it is representative of the whole structure.

All figures presenting molecular structures were made with PyMOL and VMD (Visual Molecular Dynamics) software [45].

#### **3. Results and Discussion**

#### *3.1. Tau Structure Prediction and Validation*

Tau conformation was predicted with the I-TASSER server (Figure 4a). This method generates five hit models, from which we chose the one with the best C-score. The predicted tau is an elongated structure with only two small portions presenting a well-defined secondary structure (SS). The lack of SS is consistent with what is cited in the literature, which describes tau as an intrinsically disordered protein [46,47]. This extended form is important to the tau function, as it allows a proper exposure, flexibility, and contact of the MTBR residues with tubulin MT [48].

Some authors describe how the MTBR can gain some helical SS when interactions take place with MTs or actin filaments [19,49]. Recently, Zabik et al. [23] performed NMR studies in tau peptide fragments and in full-length tau, but tau easily forms oligomers, influencing tau 3D structure. Moreover, tau in solution adopts a more compact conformation, with N- and C- termini interacting and inducing a

paperclip structure (i.e., a folding over the MTBR that approximates the terminals). This conformation contrasts with the extended conformation required for interaction with MT but is the most typical in solution [23,50–52]. To address both situations, tau in intracellular solution and tau bound to MTs, we simulated these two systems to disclose the tau structure in different contexts.

**Figure 4.** (**a**) Predicted structure of tau using I-TASSER and (**b**) 5JCO microtubule section from Protein Data Bank (PDB). Both structures are side by side to its electrostatic potential surface, which in red represents the negative potential values and in blue, the positive potential values. The electrostatic potential (kb T ec <sup>−</sup>1) was calculated using APBS-PDB2PQR software. For the microtubule, the calculation was made using a tubulin heterodimer.

Predicted tau presents an electrostatic surface highly positive in the central region and a negatively charged patch at the N-terminal. Tau electrostatics reinforces the approximation between terminals that occurs in solution, folding over the middle domain of tau and resulting in a paperclip conformation.

In the case of tubulin heterodimer, the constituents of a microtubule, the surface is predominantly negative with few neutral or positive patches on the back (microtubule interior). The electrostatic surface that we calculated is very similar to the one presented by Baker et al. for a larger microtubule structure [43].

Root-mean-square deviation is the most used analysis to monitor the structural behavior of a protein—either to see the maintenance of a tertiary structure (protein stability in a medium) or to follow the equilibration of a sampled conformation (the folding). In the case of the tau protein, we started from a model structure that needed to be equilibrated in its most typical environment, that is, embedded in the neuronal fluid, interacting or not with MTs. A large structural deviation was expected at the beginning of the simulation as well as a conformational variety. Our systems, tau in fluid and Tau::MT in fluid, took about 30 ns to reach equilibration (Figure 5a,b) and to start sampling conformations more similar among each other.

The RMSF provides an insight into tau residue mobility, indicating the more rigid and more flexible regions. In fluid (Figure 6a), tau is more flexible at the N-termini region and projection domain. From the proline-rich region, it becomes less fluctuating. When near to the MT wall (Figure 6b), the amount of interactions between tau and MT (electrostatics, van der Waals and hydrogen bonds), which must form and break repeatedly, induces a more pronounced fluctuation throughout the tau structure, with proline-rich region and MTBR being the most rigid regions for replicates 1 and 2. This fact is in agreement with the strongest interaction of MTBR with the MT wall, reducing the mobility of this area.

The cluster analyses were employed from 30 to 60/70 ns to determine the middle conformation of each replicate (Figure 7a,b). Middle structures for tau in fluid also have helical SS in MTBR, namely, replicate 1: amino acids 275–277 and 315-318; replicate 2: amino acids 315–319; and replicate 3: amino acids 253–259, 269–273, 280–282, and 359–362. However, these structures seem to fold over themselves, becoming more compact and less exposed, agreeing with tau NMR prediction in solution [23]. Visually, the structures obtained for Tau::MT in fluid are more similar and extended. In fact, all three present helical content in MTBR, namely, replicate 1: amino acids 274–277 and 305–310; replicate 2: amino acids 256–260 and 357–362; and replicate 3: amino acids 253–257 and 350–354. In addition, for tau from Tau::MT replicates, we observed a more extended structure with N- and C-termini far from each other.

**Figure 5.** Backbone root-mean-square deviation (RMSD) of tau, fitting the backbone, for all tau replicates (**a**) in fluid and (**b**) when associated with microtubule, from the initial predicted model.

**Figure 6.** Residue root-mean-square fluctuation (RMSF) (**a**) of each tau replicate in fluid and (**b**) of tau::microtubule pair in fluid.

To better understand the interaction between tau and the tubulin MT wall, the middle conformations for the pair Tau::MT were also calculated. Figure 7c shows these pairs highlighting the N-terminal, proline-rich region, MTBR, and C-terminal with the color scheme proposed in Figure 2.

Replicates 2 and 3 interact with parts of two tubulin heterodimers and the N-termini are further apart from the tubulin wall than in the case of replicate 1. There is no consensus on the type of tubulin (α or β) to which tau binds preferentially, on the number of units [48,53], or even if the interaction is longitudinal or lateral [54]. More recently, Kellogg et al. [22] proposed an atomic model of tau bound to MT, suggesting that each tau repeat spans over three tubulin monomers as a continuous stretch, even more extended than our middle structures and reaching a higher number of tubulin monomers. The determination of the interaction mode between tau and MT has to take into account the highly dynamic behavior of tau and its ability to disconnect from one MT to connect to another in the neighborhood, in a kiss-and-hop mechanism, as cited by Janning et al. [55], suggesting that tau binding might be slightly different in interactions each time.

**Figure 7.** (**a**) Middle structures of tau in fluid, (**b**) middle structures of tau when associated with microtubules, with N-termini oriented to the top, and (**c**) tau::microtubule middle structures, with a color scheme following Figure 2: grey for N-terminal, cyan for proline-rich domain, green representing the MTBR, and beige for C-terminal. (**c**) Microtubule wall is represented in cyan cartoon and (**a,b**) tau is represented in green cartoon.

Electrostatic interactions and hydrogen bonds take place in all three replicates associated with MT, although for replicate 3, only the MTBR (Thr263, Lys281, and Lys290) and C-termini interact. Tau replicate 1 interacts along all its structure, with important interactions at the proline-rich region and MTBR (Ser202, Lys224, Arg230, Ser237, Ser241, Val306, Ser316, and Lys317). Replicate 2 interacts a little less with the proline-rich region (Ser202, Val228, and Lys240), much with the MTBR (Arg242, Lys274, Lys 280, His268, Gln288, Cys291, Gly308, and Leu315) and through the C-termini. It has to be stressed that the amino acids cited above are mostly polar and many of them positively charged, resulting in a good binding to the acidic C-terminal of tubulins (cyan helices, near tau, in Figure 7c).

Structurally, tau 3D structure, when tau interacts with MTs, is more extended than tau in solution, with the termini far from each other. If oriented to the same direction, all three replicates are quite similar (Figure 7b), which is a very good result for a simulation of a very dynamic IDP, under the interaction forces from the MT wall. In addition, we observed that tau replicates sample helical portions at the MTBR, as predicted or suggested in some works [19,46]. It is important to note that at the end of each MTBR repeat, there is a PGGG motif. Proline is a helical breaker and glycine, due to the absence of a side chain, is highly flexible and rarely found in helices. This motif should correspond to an extended or bended region that will connect the four repeats. In fact, in our simulations, PGGG does not sample secondary structure, and its inherent flexibility causes repeats not to be completely extended along the MT wall, but to approach one another.

#### *3.2. Tau Effect on Ionic Diffusion and Conductivity*

To analyze the ions' behavior, the RDF and the diffusion coefficients were studied. This is important to understand the role of tau or Tau::MT pair in the distribution of the ions present in the intracellular fluid. The RDF describes how the density of a particle varies as a function of distance from a reference molecule. Following this rationale, we can perceive how the K+ and Na+ ions interact with tau and MT.

Figure 8 shows the RDF graphs for K<sup>+</sup> and Na+ ions for all three replicates of tau in fluid (Figure 8a–c) and Tau::MT in fluid (Figure 8d–f). This analysis helps to understand the role of the electrostatic interaction with the proteins, in the ions' mobility through the fluid. In all cases, K<sup>+</sup> has a higher probability of being closer to the tau negative area (N-termini) than Na+. Note that the concentration of both ions is very different (interfering with the probability), namely, 140 mM for K+ and around 5 mM for Na+, yet sodium is lighter than potassium and could be more mobile and interact with tau more often, but an interaction preference with K+ is perceived. In this analysis, we focus on the ions' distribution up to 1 nm from the protein surface. Naturally, the ions also populate the bulk water, corresponding to a high probability far from protein, but we neglected the distribution in this area. If tau N-terminal retains K+, it can keep these ions further away from the cell membrane inner face, increasing the membrane potential.

**Figure 8.** Radial distribution functions (g(r)) of K<sup>+</sup> and Na<sup>+</sup> ions in the vicinity of the tau surface, for tau replicates in fluid (**a**–**c**) and tau::microtubule replicates (**d**–**f**). Purple trace for potassium and blue trace for sodium.

The ion's diffusion coefficients were calculated for K+, Na+, and Cl<sup>−</sup> (Figure 9). Chloride is used for two reasons: to mimic the negative contribution from cytoplasmic proteins and to neutralize the simulation box. In fact, Cl− is more mobile than the proteins in neuronal cytoplasm, but tau and MT should be less sensitive to this negative ion, especially MT due to its electrostatics. Therefore, the interaction pattern of Na<sup>+</sup> and K+ with tau and MT proteins is probably little influenced by chloride in our simulations.

Experimentally, the ions' diffusion and conductivity are always expressed at 25 ◦C, not at 37 ◦C. In addition, this calculation is made at infinite dilution, so in our systems we have to consider the concentration effect, which makes a direct comparison to experimental values difficult. However, the trend of Na+ < K<sup>+</sup> < Cl<sup>−</sup> in terms of diffusion and conductivity is expected in all simulated systems.

In the first simulation, the intracellular fluid, the ions were free to move around the simulation box, respecting the intracellular concentration and in the presence of chloride ions to neutralize the system. We observed that the diffusion (Figure 9) follow the expected trend of Na+ < K<sup>+</sup> < Cl<sup>−</sup> found in many ionic solutions [56].

Looking at the ions in the system tau in fluid, we emphasize that the presence of tau does not decrease the ions' diffusion. All three ions maintain the diffusion rate, if we consider the estimated error (Figure 9). In contrast, when tau is directly associated with the MT (Tau::MT replicates), there is a decrease in the diffusion. It was reported that tau diffusion along an MT lattice is influenced by the ionic strength and pH, especially by K<sup>+</sup> [57]. This suggests that when tau is near or attached to the MT, the ions' diffusion through the fluid should decrease as the ions are participating in the protein diffusion process.

To circumvent the direct contribution of MT in ionic diffusion, we simulated a tau middle structure (from Tau::MT replicate 2) attached to the box wall, and constraining the MTBR. Thus, we obtained the expected structure when tau interacts with the tubulin wall, but only the tau effect on the conductivity is observed.

A simulation box with the same volume and ion concentrations was modelled during 12 ns (Figure 10). We observed, in this case, that tau increases the diffusion of the Na+ cation, possibly making this ion more available in the fluid to move through the sodium channels and across the membrane, as the probability of tau to retain K<sup>+</sup> is much bigger than to retain Na<sup>+</sup> (radial distribution functions in Figure 8). The diffusion of K<sup>+</sup> is once again similar to the one calculated in the intracellular fluid, reinforcing the statement that tau is a protein that does not prevent the normal movement of this ion and the consequent concentration balance in the interior and exterior of the cell.

**Figure 9.** K+, Na+, and Cl<sup>−</sup> diffusion, D*<sup>i</sup>* (1 <sup>×</sup> <sup>10</sup>−<sup>5</sup> cm2/s), in the different systems under study: Neuron intracellular fluid (Fluid), tau in fluid (Tau::fluid), tau::microtubule in fluid (Tau::MT::Fluid) and constrained tau in fluid (ConstrainedTau).

**Figure 10.** Orthographic view of constrained tau to the box wall. Tau is represented in green cartoon, K<sup>+</sup> in purple, Na<sup>+</sup> in blue, and Cl<sup>−</sup> in green transparent spheres.

Predictions for ion membrane diffusion in neurons indicate a DK<sup>+</sup> of 1.96 cm2/s and a DNa+ of 1.33 cm2/s [58]. In addition, the potassium diffusive coupling has been reviewed in neural network processes [59], but less information is mentioned for the behavior of the ions in the axonal region, that is, when exclusively embedded in the intracellular medium. Our results for K<sup>+</sup> and Na+ diffusions

Λ (

are slightly higher than the cited literature for transmembrane diffusion, but they result in almost the same difference between these ions.

Looking now to the calculated conductivities (Table 1), we observe very similar conductivity values in all systems, except for Tau::MT in fluid. In general, proteins affect the conductivity in two ways: by carrying a charge and by influencing viscosity. Therefore, it is acceptable that the MT, a bulky and negatively charged protein wall, has the effect of diminishing the global conductivity. The MT has a very negative electrostatic profile that retains the cations, decreasing the diffusion and the conductivity.


**Table 1.** Calculated conductivity (Λi) at 310 K, for each ion type in fluid, in all systems.

Our findings indicate that tau in its dephosphorylated form (i.e., when performing its normal function, with the proper conformation in solution and/or associated with MT) maintains the normal K<sup>+</sup> diffusion and discreetly increases Na+ diffusion, which, in fact, must be more available to leave the inside of the neuron. The biological relevance of this finding is still unclear, but tau may have a role in the electrical signal transmission along the axon.

#### **4. Discussion**

The present study unveils tau's structural preferences and electrostatics, when this protein is free in solution (intracellular fluid) or bound to microtubule in neuron fluid. The electrostatic potential surface of tau (Figure 4) reveals a dipolar protein, where the proline-rich region and MTBR are predominantly composed of basic amino acids (positive charge from side-chains). On the other hand, the N-terminal is acidic, with several amino acids negatively charged at physiological pH. Looking at the tubulin MT potential surface, a very negative electrostatic potential can be observed, especially for the exterior (front), indicating that the binding of tau to its wall will occur through the positive region of this protein, and the N-terminal will tend to move away from the wall, as much as possible. In fact, it is widely reported that the binding occurs between tau MTBR and MT, and the electrostatic distribution of both proteins indicates that this is the best choice to prevent repulsion.

The work of Guo et al. [60] (and references therein) summarizes tau's structural basis and points to a paperclip conformation in solution [50], where the terminals are in close proximity, as we verified with our simulations. Kellogg et al. provided, in their work published last year, a near-atomic model of Tau::MT interactions [22]. In contrast to the paperclip model, observed in solution, they determined that tau is in an extended form, in which each MTBR repeat interacts with one tubulin dimer and with its interface to the next tubulin. This configuration separates the terminals, which agrees with what we see for tau in our Tau::MT simulations (Figure 5), although we sampled a less elongated MTBR. Under dynamics, the PGGG motif localized in each repeat causes consecutives bends that result in a folded MTBR, thus our simulated model interacts with a small number of tubulin monomers. Summing up, structurally, when we simulate our tau model, it behaves as expected in solution, as a paperclip, and shares similarities with the findings of Kellogg, such as the distance between terminals.

Tau performs other important functions in the intracellular neuron environment, besides the well-known functions in MT binding and stabilization [60]. Interestingly, with the exception of the C-termini, each tau domain can be related with the interaction to other proteins. The N-terminal has been involved in the inhibition of axoplasmic transport through a signaling cascade [61] and in interactions with proteins at the neuron plasma membrane [62]. The proline-rich region also interacts with several proteins, specially the Src family of protein kinases [63]. The MTBR has been also associated with an interaction with the lipid membranes [64].

The tau localization in neurons is also an important factor for tau interaction and function. Tau is mainly located in axons [63] and in a much smaller amount in somatodendritic compartments [65–67]. Guo et al. [60] summarizes the presence of tau in different neuron locations, but we highlight here the association with neuronal membranes, which is required for tau participation in intracellular signaling pathways [68,69]. It is also in the membrane that the propagation of electrical impulse occurs, through the flow of positively charged ions. The action potential is maintained due to a concentration difference of certain ions. Potassium has a higher concentration inside the cell, and sodium has a higher concentration on the outside. The flux of these ions, using sodium and potassium channels at the membrane, guarantees the process of polarization/depolarization and maintains an ionic balance. Our simulations used concentrations of Na<sup>+</sup> and K<sup>+</sup> typical for the resting state.

Tau has been reported as a protein that interacts with many others and also with the cell membrane, inside the neuron [47,60,63]. However, little information is provided concerning the effect of this protein on sodium and potassium ions [57,70]. Our simulations point to a possible novel function or role: due to its polar profile, tau may also play a role in the normal flow of the ions present in the intracellular fluid, as it is highly localized in the axon. The axon propagates the signal, like a wire, and this occurs due to the potential difference maintained by the flow of ions in the axon membrane [71].

Recently, the behavior of tau near a membrane was evaluated. Patel et al. [72] thought that once tau aggregates in a similar way to amyloid peptides, it would be valid to look at its behavior when close to the cell membrane. A tendency for the formation of ion channels, which allow the passage of nonspecific ions, was observed. This fact reinforces the role of tau in the balance of K<sup>+</sup> and Na<sup>+</sup> ions within the neuron and in the electrical signal propagation between cells. Our work paves the way to disclose these interactions.

K<sup>+</sup> and Na<sup>+</sup> imbalances in neurons were observed in Alzheimer's disease brains [70]. This fact could arise from the malfunctioning of several proteins, including tau, which causes changes at the structural and signaling level, affecting the normal concentrations of these ions.

Globally, tau has a basic character with a higher content of positively charged amino acids, but among the 27% of charged residues, there are also negatively charged amino acids, which results in a predominantly negative area, the N-terminal (Figure 4). Tau is, therefore, able to interact with ions in an attractive or repulsive manner. In particular, the acidic N-terminal is prone to attracting cations, since it is the tau acidic area and the portion that will interact less with MT, also negatively charged. In fact, RDF and diffusion analyses have suggested that tau retains K<sup>+</sup> near its surface, but maintaining a normal diffusion, and increases Na+ mobility, which is less probable to be near tau.

#### **5. Conclusions**

Tau modelling and simulations disclose many properties concerning tau structure and function. First, the prediction of a tau 3D structure leads to a model that was equilibrated in solution—intracellular ionic fluid—and associated with an MT wall. In both cases, the secondary structure features observed for the middle conformations are in agreement with experimental estimates: tau in fluid is more compact, with N− and C− termini closer to each other, and with some helical SS at

MTBR. Tau in Tau::MT systems is in an extended conformation, with helical portions at the MTBR and with the terminals far from each other.

In terms of interactions, tau interacts with two tubulin heterodimers on average, through hydrogen bonds and electrostatic interactions, mostly established between the basic region of tau and the acidic C-termini of tubulins. Kellogg et al. [22] suggested a more elongated MTBR able to interact with a higher number of tubulin dimers. However, we observed a less extended tau MTBR, due to the great dynamics of this protein.

Concerning the effect of tau on ions, we observed a preference for interacting with K+ cations, through the negatively charged N-terminal. More importantly, tau maintains the normal ion diffusion, being a key factor to cell conductivity and signaling. Tau also increases the conductivity of Na+ ions, in comparison with the fluid control, in the surrounding environment.

The maintenance of a normal tau structure–function relationship results in a normal interaction with the ions, keeping K<sup>+</sup> close to the tau surface and Na+ ions more available to leave the cell, as expected, to maintain the intracellular concentration balance of both ions at the axon resting state. Our findings about diffusion and conductivity are a starting point to understand, at the molecular level, the participation of tau in the electrical signal propagation across healthy neurons.

In future work, we intend to study different degrees of phosphorylation in our tau model, to evaluate what it will entail in the structure, in the interaction with MTs, and in the ions' diffusion.

**Author Contributions:** Conceptualization, A.C.-P.; Investigation, T.C.; Methodology, T.C.; Supervision, A.C.-P.; Writing—original draft, T.C.; Writing—review and editing, F.M. and A.C.-P. All authors discussed the results and commented on the manuscript. All authors read and approved the final manuscript.

**Funding:** All authors thank the European Union through the European Regional Development Fund (ERDF) under the Competitiveness Operational Program (*BioCell-NanoART = Novel Bio-inspired Cellular Nano-architectures*, POC-A1.1.4-E-2015 nr. 30/01.09.2016). Artur Cavaco-Paulo thanks the support received from the Portuguese Foundation for Science and Technology (FCT) through the strategic funding of UID/BIO/04469/2019 unit and BioTecNorte Operation (NORTE-01-0145-FEDER-000004) funded by the European Regional Development Fund under the scope of Norte2020—Programa Operacional Regional do Norte.

**Acknowledgments:** Access to computing resources funded by the Project "Search-ON2: Revitalization of HPC infrastructure of UMinho" (NORTE-07-0162-FEDER-000086), co-founded by the North Portugal Regional Operational Programme (ON.2 –O Novo Norte), under the National Strategic Reference Framework (NSRF), through the European Regional Development Fund (ERDF), is gratefully acknowledged. We express our deepest appreciation to Teresa Matamá for valuable advice and discussion on neuron physiology.

**Conflicts of Interest:** The authors declare no conflict of interest.

#### **References**


© 2019 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).
