#### **1. Introduction**

Eukaryotic cells are characterized by the presence of the nuclear envelope (NE), a lipid bilayer membrane that separates the cell into two compartments, i.e., the nucleus and the cytoplasm. The NE contains many nuclear pore complexes (NPCs), which are the sole gateway for the exchange of essential biomolecules between the two compartments. NPCs are large protein complexes, with a molecular mass of ~55–66 MDa [1,2] in yeast and ~125 MDa in vertebrates [3]. The NPC is composed of 30 different types of proteins called nucleoporins (Nups) [4,5]. One third of these Nups are intrinsically disordered proteins (IDPs), which are anchored to the inner wall of the NPC and are rich in phenylalanine-glycine (FG) repeats. Inside the NPC, these FG-Nups form a central meshwork that provides a permeability barrier for translocating molecules. Various studies have revealed that the NPC allows rapid transport of small molecules (up to about 30 kDa or ~5 nm in diameter), but drastically slows down the translocation of larger molecules from one compartment to the other [6–8]. It has also been found that FG-Nups bind to nuclear transport receptors (NTRs) [9,10] by means of hydrophobic interactions, which facilitates the translocation of NTRs by lowering the permeability barrier [11]. Cargoes of diameter up to 40 nm are known to translocate by this facilitated transport mechanism [12,13]. Therefore, the FG-Nups are considered to be crucial in establishing the selective permeability barrier of the NPC.

Nucleocytoplasmic trafficking can be altered by a change in the surface properties of translocating molecules [14], by the deletion of FG-Nups [6,8], and by a change in the cohesiveness of the FG-Nups [15]. For example, mutation of the hydrophobic phenylalanine (F) residues of Nsp1, a representative yeast FG-Nup, into the hydrophilic serine (S) reduces the propensity of Nsp1 to form a hydrogel [16]. These experiments revealed that the hydrogels exclude inert molecules, but allow hydrophobic NTRs to enter, which is explained in terms of a local disruption of the cohesive gel network [16,17]. In a separate study [15], Nsp1 molecules tethered onto the inner surface of solid-state NPC mimics formed a dense phase (over 100 mg/mL) and enabled transport selectivity: Kap95 (a yeast NTR) traversed the pore, whereas the translocation of tCherry (an inert molecule of similar size) was inhibited. The F, I, L, V to S mutation of Nsp1 resulted in a remarkably less dense FG-Nup network inside the pore, which led to a loss of selectivity, as both tCherry and Kap95 were able to translocate. Taken together, these studies show that the transition from a dense, hydrophobic phase to a dispersed, hydrophilic phase results in the nanopores losing their selective barrier function.

The hydrophobicity of FG-Nups can also be altered through phosphorylation, one of the most abundant protein modifications inside the cell [18,19]. Phosphorylation is catalyzed by kinases and can be reversed by phosphatases. It has been shown that extracellular signal-regulated kinase (ERK), a phosphorylating agent, can directly interact with FG-Nups [20,21], leading to the phosphorylation of FG-Nups [22–24]. Several in vitro studies have revealed that specifically Nup62, Nup98, Nup153, Nup214, and Nup358 can undergo phosphorylation [25,26]. Furthermore, there is evidence that FG-Nups undergo phosphorylation in vivo as well [21,23,27,28]. Transport studies demonstrated that the phosphorylation of nucleoporins results in decreased kinetics of active transport of Kap95 [25,27,29] and Kap-cargo complexes [30,31], and increased kinetics of passive transport [32]. These studies indicate that phosphorylation can modulate the selective permeability of the NPCs. However, the molecular mechanism behind the alteration in nucleocytoplasmic transport due to phosphorylation is not well understood.

Molecular dynamics (MD) simulations have proved to be a powerful tool to study the disordered protein structure inside the NPC and the transport through native and biomimetic nanopores [8,15,33–35]. Therefore, in order to understand the molecular mechanism behind the phosphorylation-induced alteration in transport kinetics, we carried out MD studies using our previously developed one-bead-per-amino-acid (1BPA) coarse-grained (CG) model for FG-Nups [35], extended here for phosphorylated FG-Nups. This 1BPA model has been successfully applied to probe the (doughnut-like) density distribution of the disordered domain of yeast NPCs [35], the facilitated transport of NTRs through yeast [33] and biomimetic [15] NPCs, and the size selectivity for passive transport [6], in good agreement with experiments. Although the transport experiments on phosphorylated NPCs cited above were carried out on mammalian NPCs, we here used the yeast NPC model, which has structural and functional similarities to the vertebrate NPC [36].

In the current study, we extended our 1BPA model to phosphorylated FG-Nups and carried out MD simulations of FG-Nups in isolation, as well as within the NPC. We studied the impact of phosphorylation on the structure of the disordered phase and the transport across the NPC in two scenarios. In the first (referred to as the Phos\_N scenario), we used the NetPhosYeast 1.0 server [37] to obtain the phosphorylated residues of the yeast FG-Nups (yielding phosphorylated serine (S) and threonine (T) residues only), and in the second (referred to as the Phos\_Max scenario), we assumed that all phosphorylatable residues (serine (S), histidine (H), threonine (T), and tyrosine (Y)) were phosphorylated. We investigated the changes in conformation of phosphorylated FG-Nups compared to FG-Nups in their native state by using the Stokes radius (*R*<sub>S</sub>) as a measure for their size (see Section 2.1). We found that phosphorylation causes FG-Nups to extend by an amount that depends on the fraction of phosphorylatable residues and the fraction of positively charged residues. In Section 2.2, we present a study on the collective interaction of phosphorylated FG-Nups inside the confined environment of the NPC. We found that phosphorylation drastically reduces the FG-Nup density inside the NPC. Finally, in Section 2.3 we report on simulation results of the phosphorylation-affected transport of inert particles and Kap95, and discuss these results in light of the various contributions to the interaction energy inside the NPC. Our transport simulations are in qualitative agreement with the experimentally observed increase and decrease in transport rate of the passive and active transport pathways, respectively. Note that the Phos\_N scenario predicts more phosphorylation sites than other phosphorylation databases, such as the fungi phosphorylation database (FPD), which provides a comprehensive list of experimentally validated phosphorylation sites [38]. The prediction from the FPD database is incorporated in the Supplementary Materials (see the section "Sensitivity analysis") to provide a scenario for experimentally validated phosphorylation sites. It is important to note that it is unclear which phosphosites predicted in either scenario are phosphorylated simultaneously in vivo, and hence the predictions provided in this study are not meant to mimic specific biological conditions, but rather to shed light on the fundamental mechanisms underlying the changes in transport kinetics of phosphorylated NPCs.

#### **2. Results**

In order to study the effect of phosphorylation on FG-Nups, we started with our previously developed MD model for intrinsically disordered proteins (IDPs) in their native state, coarse-grained at a resolution of one bead per amino acid (1BPA) [35]. This 1BPA model accounts for non-bonded hydrophobic and electrostatic interactions between the amino acids, including the effect of solvent polarity and ionic screening to mimic the solvent conditions inside the NPC. The model is accurate (within 20% error) in predicting the Stokes radius *R*<sub>S</sub> [35] for a range of FG-Nups and FG-Nup segments [39]. In the current study, we extended the model for phosphorylation by accounting for the change in hydrophobicity and charge of four amino acids: S, H, T, and Y. We used a weighted average scheme of five predictor programs (KOWWIN, ClogP, ChemAxon, ALOGPS, and miLogP) [40–43] to predict the change in hydrophobicity due to the change in the chemical structure. For details on the model development for phosphorylated FG-Nups we refer to the Materials and Methods section (Section 4.2). The new parameters for phosphorylated amino acids are summarized in Table 1.

**Table 1.** Parameters in the 1BPA forcefield for phosphorylated amino acids. Note that *ε*<sub>1BPA</sub> and *ε*<sub>weighted</sub> are the normalized hydrophobicity values (between 0 and 1) from the 1BPA model [35] and the weighted average scheme (see Section 4) for the amino acids in their native state, respectively; *ε*<sub>p</sub> is the hydrophobicity of the phosphorylated amino acid; and *q* and *q*<sub>p</sub> denote the charge of the amino acids in their native and phosphorylated conditions, respectively.


#### *2.1. Effect of Phosphorylation on Isolated FG Nups*

We used our newly developed parametrization for phosphorylation and performed MD simulations to study the effect of phosphorylation on the conformation of isolated FG-Nup segments [35]. The simulated trajectories were analyzed to determine the time-averaged *R*<sub>S</sub> using the Hydro program [44,45]. The predicted *R*<sub>S</sub> values for the phosphorylated FG-Nups are compared with those of FG-Nups in their native state (from experiments [39] and simulations [35]) in Figure 1. The error bars for the simulation data represent the standard deviation in time of *R*<sub>S</sub> (see Table S1 in the Supplementary Materials for the source data).

**Figure 1.** Phosphorylation-induced extension of FG-Nup segments. The Stokes radius *R*<sub>S</sub> (in Angstrom) is depicted for a range of FG-Nup segments in their native and phosphorylated states. The suffix *lc* denotes low charge, *hc* high charge, and *s* refers to the stalk region of the Nup. The grey and black bars represent the data in the native state from experiments [39] and simulations (results reproduced from [35]), respectively, and the predictions for the phosphorylated states are plotted in red for Phos\_N and blue for Phos\_Max. For the simulation data, the error bars represent the standard deviation in time (see Table S1 for the source data).

As a result of phosphorylation, the amino acids become more hydrophilic and negatively charged (see Table 1). Thus, compared to the native state, the phosphorylated FG-Nups exhibit enhanced electrostatic repulsion and reduced hydrophobic attraction, leading to an overall decrease in intra-molecular cohesion and thus a more extended configuration (see Figure 1). In Table S2, we have summarized the number of amino acids that can be phosphorylated in each FG-Nup segment. The FG-segments are grouped as low charged (*lc*), high charged (*hc*), and stalk (*s*) domains, following the definition of Yamada et al. [39]. We found that the relative abundance of phosphorylatable residues in the FG-Nup segments ranges from ~15% (for Nup116s) to ~33% (for Nsp1n\_lc) for the maximally phosphorylated (Phos\_Max) condition, whereas for the Phos\_N scenario the range is from ~4% (Nup116s) to ~17% (Nup159\_hc). In order to quantify the change in Stokes radius in terms of the number of residues undergoing phosphorylation, we plot the normalized change in *R*<sub>S</sub> as a function of the fraction of phosphorylatable residues (*n*) for the low charged, high charged, and stalk groups as blue, red, and green data points, respectively, for the Phos\_Max (Figure 2a) and Phos\_N (Figure 2b) scenarios. The change Δ*R*<sub>S</sub> is normalized as Δ*R*<sub>S</sub>/[(*N* − 1)*b*], where *N* is the total number of residues of the FG-Nup segment and *b* is the coarse-grained bond length (3.8 Angstrom). We fitted the data points for the individual groups to a straight line passing through the origin, represented as colored lines in Figure 2a,b. We note that for both Phos\_Max and Phos\_N, the FG-segments from the *lc* group show the highest normalized change in *R*<sub>S</sub> (blue line in Figure 2a,b), whereas phosphorylation has a smaller effect on size for the *hc* and *s* groups (red and green lines in Figure 2a,b).
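The normalization and the fit through the origin described above can be sketched as follows. This is a minimal illustration, not the analysis code used in the study: the function names are hypothetical, the input numbers are placeholders rather than values from Table S1 or S2, and *n* is assumed to be expressed as a fraction between 0 and 1.

```python
B_ANGSTROM = 3.8  # coarse-grained bond length b (Angstrom), as stated in the text


def normalized_delta_rs(rs_phos, rs_native, n_residues, b=B_ANGSTROM):
    """Return (R_S^phos - R_S^native) / ((N - 1) * b), a dimensionless change."""
    return (rs_phos - rs_native) / ((n_residues - 1) * b)


def slope_through_origin(x, y):
    """Least-squares slope m of y = m * x with the intercept fixed at zero."""
    sum_xy = sum(xi * yi for xi, yi in zip(x, y))
    sum_xx = sum(xi * xi for xi in x)
    return sum_xy / sum_xx


# Placeholder segments (NOT data from the study):
# (fraction of phosphorylatable residues n, number of residues N,
#  native R_S in Angstrom, phosphorylated R_S in Angstrom)
segments = [
    (0.15, 101, 20.0, 25.0),
    (0.25, 151, 28.0, 40.0),
    (0.33, 201, 33.0, 55.0),
]

n_values = [n for n, _, _, _ in segments]
y_values = [normalized_delta_rs(rs_p, rs_n, N) for _, N, rs_n, rs_p in segments]
print("slope through origin:", slope_through_origin(n_values, y_values))
```

Fixing the intercept at zero encodes the expectation that a segment without phosphorylatable residues shows no change in *R*<sub>S</sub>.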

**Figure 2.** The normalized change in *R*<sub>S</sub> (i.e., Δ*R*<sub>S</sub>/[(*N* − 1)*b*]) due to phosphorylation as a function of the fraction of phosphorylatable residues (*n*) for (**a**) the Phos\_Max and (**b**) the Phos\_N scenarios [37]. The change in *R*<sub>S</sub> (i.e., Δ*R*<sub>S</sub> = *R*<sub>S</sub><sup>phos</sup> − *R*<sub>S</sub><sup>native</sup>, with *R*<sub>S</sub><sup>phos</sup> and *R*<sub>S</sub><sup>native</sup> being the Stokes radii of the FG-Nup segments in the phosphorylated and native states, respectively) is normalized with (*N* − 1)*b*, where *N* is the total number of residues of the FG-Nup segment and *b* (= 3.8 Angstrom) is the coarse-grained bond length between neighboring amino acids [35,46]. The data for the FG-Nups from the high charged (*hc*), low charged (*lc*), and stalk (*s*) segments [35,39] are represented as red, blue, and green data points, respectively. The data points of each group are fitted to a straight line passing through the origin, revealing different slopes for different groups. For Phos\_Max we observe slopes of 0.087 for *lc* (*R*<sup>2</sup> = 0.94), 0.051 for *hc* (*R*<sup>2</sup> = 0.92), and 0.053 for *s* (*R*<sup>2</sup> = 0.80), whereas for Phos\_N the slopes are 0.1 for *lc* (*R*<sup>2</sup> = 0.86), 0.052 for *hc* (*R*<sup>2</sup> = 0.95), and 0.045 for *s* (*R*<sup>2</sup> = 0.61). (**c**) The ratio of the normalized change in *R*<sub>S</sub> to the fraction of phosphorylatable residues (*n*) is plotted as a function of the fraction of positively charged residues (*p*) for all data points (black) from the Phos\_Max and Phos\_N scenarios. These data points are fitted to a linear equation, as shown in the figure (giving *R*<sup>2</sup> = 0.68). (**d**) The *R*<sub>S</sub><sup>phos</sup> predicted from the theory in Equation (1) compared to the *R*<sub>S</sub><sup>phos</sup> computed from the MD simulations, both in Angstrom, showing good agreement with a fitness measure of *R*<sup>2</sup> = 0.97.

In order to investigate the varying response for the three groups, as shown in Figure 2a,b, we analyzed the change in hydrophobicity upon phosphorylation and found that it is roughly similar for the three groups, i.e., for FG-Nups from the *lc*, *hc*, and *s* groups, the hydrophobicity drops by 13–20%, 16–22%, and 12–21%, respectively, for Phos\_Max (see Table S2). Similarly, for Phos\_N, the reduction in hydrophobicity amounts to 3–10%, 7–10%, and 4–12% for the *lc*, *hc*, and *s* groups, respectively, showing no major difference across the three groups. Clearly, the effect of phosphorylation on hydrophobicity alone cannot account for the different changes in *R*<sub>S</sub> across the groups. It has been argued that the net proline content in IDPs plays an important role in determining the effective Stokes radius [47], as proline provides additional stiffness to the peptide chain because of its ring structure. This effect of proline is included in our 1BPA model in the form of the bonded potentials [46]. However, all the FG-Nup segments analyzed in this study (Figure 1) have a similar proline content of 3–8%, so proline content cannot explain the different response of the Stokes radius across the three groups (see Table S2). Next, we analyzed the effect of charge. Since the net charge of the three families is quite similar (i.e., 2–3%, 0–3%, and 0–1% for *lc*, *hc*, and *s*, respectively), we investigated the occurrence of the positively charged residues R and K in the FG-segments and found that the *lc* group contains only 2–3% of positively charged residues, in contrast to the *hc* and *s* groups, which have more positively charged residues (7–16% for *hc* and 13–17% for *s*, see Table S2). Thus, it seems that the larger amount of positive charge in *hc* and *s* is more efficient in screening the increase in negative charge induced by phosphorylation than the small amount of positive charge in the *lc* group (see Table S2). In order to confirm this, we plotted the ratio of Δ*R*<sub>S</sub>/[(*N* − 1)*b*] to the fraction of phosphorylatable residues (*n*) as a function of the fraction of positively charged residues (*p*) in the FG-segments (see Figure 2c). The data points in Figure 2c can be fitted to a straight line with a slope of −0.41 and a *y*-intercept of 0.1, with an *R*<sup>2</sup> value of 0.68. Using this observation, the Stokes radius for a phosphorylated FG-segment can be predicted using the following expression:

$$R_{\rm S}^{\rm phos} = R_{\rm S}^{\rm native} + b\,n\,(N-1)\,(-0.41\,p + 0.1) \tag{1}$$

We show the predictive power of this formula in Figure 2d, where the *R*<sub>S</sub><sup>phos</sup> predicted using Equation (1) is plotted against the *R*<sub>S</sub><sup>phos</sup> computed from the MD simulations, showing a very good correlation (with *R*<sup>2</sup> = 0.97).
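As an illustration of how Equation (1) can be evaluated for a given segment, the following sketch computes *n* and *p* from an amino acid sequence and applies the fitted expression. It assumes the Phos\_Max scenario (all S, H, T, and Y residues phosphorylated) and that *n* and *p* are fractions between 0 and 1; the function names, the example sequence, and the native Stokes radius are hypothetical and not taken from the study. For the Phos\_N scenario, *n* would instead be taken from the predicted phosphosites.

```python
PHOSPHORYLATABLE = frozenset("SHTY")  # Phos_Max residues named in the text
POSITIVE = frozenset("RK")            # positively charged residues R and K


def residue_fraction(sequence, residues):
    """Fraction of positions in `sequence` whose residue is in `residues`."""
    return sum(1 for aa in sequence if aa in residues) / len(sequence)


def predict_rs_phos(rs_native, n, p, n_residues,
                    b=3.8, slope=-0.41, intercept=0.1):
    """Evaluate Equation (1): R_S^native + b * n * (N - 1) * (slope * p + intercept).

    rs_native and b are in Angstrom; n and p are fractions (0 to 1).
    """
    return rs_native + b * n * (n_residues - 1) * (slope * p + intercept)


# Hypothetical 10-residue sequence and native Stokes radius, for illustration only
sequence = "STGFGAAAAK"
n = residue_fraction(sequence, PHOSPHORYLATABLE)  # S and T -> 2/10
p = residue_fraction(sequence, POSITIVE)          # K -> 1/10
rs_phos = predict_rs_phos(rs_native=10.0, n=n, p=p, n_residues=len(sequence))
print(f"n = {n:.2f}, p = {p:.2f}, predicted R_S^phos = {rs_phos:.3f} Angstrom")
```

Because the factor (−0.41*p* + 0.1) decreases with *p*, the same fraction of phosphorylatable residues yields a smaller predicted extension for segments that are richer in positively charged residues, in line with the trend in Figure 2c.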
