1. Introduction
Eukaryotic cells contain micro-scale compartments that are formed by specific proteins and nucleic acids through condensation. These compartments are not bounded by membranes and are referred to as membraneless organelles; they regulate diverse processes in cells [1,2]. Examples include stress granules, which form when translation initiation is impaired in response to cellular stresses [3], P granules of C. elegans embryos [4], and Cajal bodies [5]. The dynamic equilibrium between the condensation and dissolution of these membraneless organelles is mainly controlled by the multivalent but weak interactions of the constituent proteins and other biomolecules within the organelles [1]. Proteins with an intrinsically disordered low complexity domain (LCD) are often the major components forming these membraneless organelles [6,7]. In vitro, these multivalent proteins can de-mix from solution and form liquid droplets through liquid–liquid phase separation (LLPS). Interestingly, many of these proteins can also form amyloid, which in most cases can be neurotoxic [8].
TDP-43 (transactive response (TAR) element DNA-binding protein of 43 kDa) is a nuclear ribonucleoprotein. It participates in many processes of RNA regulations and is able to autoregulate its own expression [
9]. It is recruited to stress granules in cytoplasm, which contains other proteins with LCD such as GTPase activating protein1 (SH3 domain) (G3BP1), T cell intracellular antigen-1 (TIA-1), etc. [
3]. Although TDP-43 is not an obligatory stress granule component, it modulates stress granule formation and disassembly [
10]. The full-length TDP-43 protein contains 414 residues, with a nuclear localization signal (NLS) and two RNA-recognition motifs at the N-terminal region, but a low complexity domain (LCD) at the C-terminal region (267–414) [
11]. The first RNA-recognition domain and the LCD are responsible for recruiting TDP-43 into stress granules [
9]. In solution, the TDP-43 LCD domain [
12,
13] is able to form liquid droplets by phase separation. However, it can also aggregate into amyloids in common solutions and in cells [
14]. TDP-43 aggregation is considered a hallmark of amyotrophic lateral sclerosis (ALS) [
10,
15]. The protein inclusions are also sometimes found in patients with frontotemporal dementia (FTD) and other neurodegenerative diseases [
9]. Diseases related to the formation of pathological TDP-43 granules and abnormal TDP-43 aggregation are collectively termed TDP-43 proteinopathies. Understanding the mechanisms that regulate TDP-43 LLPS and aggregation is essential for designing molecules that delay TDP-43 aggregation without affecting its LLPS. Most ALS-associated mutations of TDP-43 are located in the LCD, indicating the importance of this domain [
16]. The mechanisms determining TDP-43 LCD LLPS have been extensively investigated but questions remain. For example, LCDs involved in LLPS are usually intrinsically disordered and are composed of polar amino acids punctuated by aromatic residues (TDP-43 LCD sequence analysis in
Table S1a–c) [
17]. However, the TDP-43 LCD contains one secondary structure element of two short α-helices connected by a bend (residue 320–343) [
14]. The TDP-43 LCD sequence contains a higher amount of hydrophobic residues (
Table S1a) than most LCD proteins (LLPSDB statistics, bio-comp.org.cn, accessed on 1 November 2022), with the helices being mostly hydrophobic (
Table S1c). Intermolecular helix–helix interactions driven by hydrophobic contacts have been shown to be the major force for TDP-43 phase separation. However, this segment also belongs to the amyloidogenic core domain. The droplets that TDP-43 LCD forms tend to be relatively small, about 1 μm in diameter under certain conditions (in our hands and as also observed by other groups [
6,
13]), unlike in other phase separation proteins [
6]. We also found it difficult to identify conditions under which ongoing droplet fusion events could be observed by differential interference contrast (DIC) microscopy. These observations indicate that the fusion equilibrium is reached quickly after the protein is dissolved, while the droplets are still very small. Because of these special properties, TDP-43 LCD is a good model for understanding the interplay of different structural components in determining protein condensation and aggregation.
Aside from the helical intermolecular interactions, other interactions have also been found to play roles in TDP-43 LCD LLPS. The two regions flanking the helices are intrinsically disordered and are referred to as intrinsically disordered regions (IDRs): IDR1 (276–320) and IDR2 (343–414). IDR2 contains a Gln/Asn-rich (QN) motif (344–360) immediately after the helices (320–343); the sidechain amide groups of the Q and N residues can participate in H-bonding, promoting protein assembly or amyloid formation [
18,
19,
20]. Both IDR1 and IDR2 contain X-G/S and G/S-X sequence motifs (where X represents the aromatic residues Phe, Trp, or Tyr). Sidechain H-bonding provided by the hydroxyl group of serine, π–π stacking of aromatic rings, and charge–π interactions are all believed to contribute to the intermolecular interactions in protein phase separation [
17,
21,
22]. For example, TDP-43 contains three Trp residues. All three Trp residues were found to play roles in regulating the protein LLPS by mutation studies, with the most important Trp located in the helical region [
14]. TDP-43 LCD also contains several charged residues including five Args (
Table S1) with a protein pI (isoelectric point) of 10.78, and the pH of the solution controls its phase separation ability. Decreasing the pH to 4–5 can abolish phase separation completely, while increasing the pH to 6 and above causes protein droplet formation [
23,
24]. Increasing the NaCl concentration also promotes LLPS by providing electrostatic shielding [
12,
24]. Both the pH and salt effect indicate that the electrostatic repulsion inhibits its phase separation [
25]. Therefore, there must exist an intricate balance between different interactions, which fine tunes the status of TDP-43 in solution. However, the relative contributions of these interactions to TDP-43 LCD LLPS are not well understood.
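The pH dependence described above can be illustrated with a Henderson–Hasselbalch estimate of net charge. The following is a minimal sketch, not the calculation used in this work: the side-chain pKa values are approximate textbook figures, the chain termini are ignored, and the residue counts in `EXAMPLE_COUNTS` are hypothetical placeholders rather than the actual TDP-43 LCD composition (only the five Arg residues come from the text).

```python
# Approximate textbook side-chain pKa values (assumed, not taken from this work).
PKA_BASIC = {"Arg": 12.5, "Lys": 10.5, "His": 6.0}
PKA_ACIDIC = {"Asp": 3.9, "Glu": 4.2}

# Hypothetical residue counts for illustration; only Arg = 5 is stated in the text.
EXAMPLE_COUNTS = {"Arg": 5, "Lys": 1, "His": 1, "Asp": 2, "Glu": 2}


def net_charge(ph, counts):
    """Estimate the net side-chain charge at a given pH (termini ignored)."""
    charge = 0.0
    # Basic groups carry +1 when protonated: fraction = 1 / (1 + 10^(pH - pKa)).
    for residue, pka in PKA_BASIC.items():
        charge += counts.get(residue, 0) / (1.0 + 10.0 ** (ph - pka))
    # Acidic groups carry -1 when deprotonated: fraction = 1 / (1 + 10^(pKa - pH)).
    for residue, pka in PKA_ACIDIC.items():
        charge -= counts.get(residue, 0) / (1.0 + 10.0 ** (pka - ph))
    return charge


for ph in (4.0, 5.0, 6.0, 7.0):
    print(f"pH {ph:.1f}: estimated net charge {net_charge(ph, EXAMPLE_COUNTS):+.2f}")
```

With these placeholder inputs the estimated net positive charge rises as the pH drops from 6 to 4, which is the direction expected if electrostatic repulsion suppresses phase separation at low pH.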
In this research, by screening different conditions, we investigated the special role of the helical region (320–343) and compared it with the rest of the sequence to understand how the balance between the contributions of these two parts controls LLPS and what happens when TDP-43 LCD droplets stop fusing. We also discovered a structural transition intermediate toward aggregation, which involves a decrease in the intermolecular helix–helix interaction and a reduction in helicity. Using TDP-43 LCD as a model, our work provides a better understanding of how multivalence controls intermolecular interactions and protein phase separation.
3. Discussion
3.1. Fine-Tuning of the Different Interactions Affects the Protein LLPS Equilibrium and Droplet Sizes
In this research, we showed ways to manipulate LLPS and the liquid droplet sizes of TDP-43 LCD. Reducing the protein concentration or adding RNA disturbed the LLPS equilibrium. The droplet sizes could be increased, and fusion events could be observed by DIC, only within a narrow protein concentration range. An active fusion event indicates a non-equilibrium situation in which more protein molecules are being recruited to the droplet. Protein LLPS requires multivalence and a balance between different intermolecular interactions. To simplify the picture, the interactions in TDP-43 LCD can be tentatively divided into two groups: those from the helices and those from the rest of the sequence, the IDRs. Together, the two groups of interactions could assemble the molecules into a loosely associated network connected through both the helical region and the IDRs. At 100 μM protein concentration in the PB buffer, the intermolecular interaction mediated by the helices was probably too strong, whereas the interactions mediated by the IDRs were too weak to extend the molecular network; LLPS therefore stopped at a very early stage with small droplet sizes. Decreasing the protein concentration to 40 μM could shift the dynamic equilibrium and reduce the helix–helix intermolecular interactions, so that the different intermolecular interactions became better matched in strength and the droplet sizes increased. Similarly, adding RNA enhanced the interactions mediated by the IDR sequences, bringing them closer in strength to the helical interactions, and the droplet sizes increased.
The real stress granules or TDP-43 granules in cells contain a greater variety of molecules including RNA and full-length TDP-43. In this real situation, the TDP-43 LCD helix–helix interaction may not be so dominant if the protein concentration is lower. Some other proteins found in stress granules such as G3BP1, hnRNPA2, etc. also contain LCD and would interact with TDP-43 and RNA through a similar interaction mechanism, maintaining the stability. Therefore, the principle gained here should still be applicable in a more complicated system, although more components would have to be taken into consideration.
3.2. The Molecular Status of Proteins in the Mature Droplets
The molecular properties of the mature droplets were also investigated. Since the mature droplets did not fuse in pH 6.0 PB buffer at 100 μM protein concentration, the molecules might no longer be liquid-like. The 1H-15N HSQC signal intensity of freshly prepared mature droplets was first compared with that in the MES condition at 70 μM protein concentration, showing that the signal intensity was proportional to the protein concentration for most residues in the IDRs (
Figure 3c). Protein was soluble, and showed very low LLPS in the MES condition. Therefore, the protein molecules in the mature droplets were still liquid-like, with similar dynamic properties as the ones in MES. Using THT fluorescence, we found that the protein aggregation lag time was 200 min for a 100 μM protein concentration in PB (
Figure S3a), indicating that the protein was not severely aggregated within this time frame. Although the mature droplets did not fuse, changes in the solution conditions, such as dilution into buffer or the addition of RNA, quickly altered their state. These observations also indicate that the molecules in the mature droplets were still in an active equilibrium and not in a severely aggregated state. Considering its function in the cell, a mature protein droplet should have a certain lifetime before the protein aggregates: a membraneless organelle is expected to be functionally active and able to dissolve upon regulation.
Although the 1H-15N HSQC spectra showed that the molecules in the mature droplets generally remained highly dynamic, the intermolecular helical interaction was stronger in the PB buffer, as shown by lower peak intensities and negative 15N chemical shift changes. We also found that the 19F signal intensity from the Trp aromatic ring was attenuated (in PB vs. in MES) and not proportional to the protein concentration (40 μM vs. 100 μM in PB), indicating that the aromatic sidechains were involved in the intermolecular interactions for LLPS. Therefore, the protein molecules in the mature droplets were engaged in stronger intermolecular interactions but were still liquid-like and able to change their molecular interactions quickly upon induction.
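The concentration-scaling argument in this section can be made explicit by normalizing each peak intensity by protein concentration before comparing two conditions. This is a minimal sketch under stated assumptions: the function name, the residue labels, and the intensity values are hypothetical and are not data from this work.

```python
def normalized_ratio(intensity_a, conc_a_um, intensity_b, conc_b_um):
    """Return (I_a / c_a) / (I_b / c_b), the ratio of per-concentration intensities."""
    if conc_a_um <= 0 or conc_b_um <= 0:
        raise ValueError("concentrations must be positive")
    if intensity_b == 0:
        raise ValueError("reference intensity must be non-zero")
    return (intensity_a / conc_a_um) / (intensity_b / conc_b_um)


# Hypothetical peak intensities in arbitrary units:
# label -> (intensity in PB at 100 uM, intensity in MES at 70 uM)
example_peaks = {
    "IDR residue (placeholder)": (1000.0, 700.0),
    "helix residue (placeholder)": (400.0, 700.0),
}

for label, (i_pb, i_mes) in example_peaks.items():
    ratio = normalized_ratio(i_pb, 100.0, i_mes, 70.0)
    print(f"{label}: normalized PB/MES ratio = {ratio:.2f}")
```

A ratio near 1 means the signal simply scales with concentration, as described for most IDR residues; a ratio well below 1 indicates attenuation beyond the concentration effect, as described for the helical region and the 19F Trp signal.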
3.3. TDP-43 LCD Aggregation Intermediate
Protein aggregation is usually accompanied by an NMR signal loss for all residues, which was observed for the 100 μM protein concentration in PB after 53 h (Figure 2c). Unexpectedly, we observed a signal increase in the helical region during protein aggregation, while a signal decrease in the IDRs was clearly seen (spectra taken at the 6th and 22nd hours, Figure 2c). The signal increase suggested an enhancement in the molecular dynamics of the helical region, probably through a slight release of the helix–helix interaction. We therefore observed an intermediate step during TDP-43 LCD aggregation from the protein droplets. We then identified two other conditions with reduced helix–helix interaction that indeed showed faster aggregation, with shorter lag times in THT fluorescence assays (70 μM protein in pH 5.5 MES, and 100 μM protein with 150 mM urea in pH 6.0 PB). The 1H-15N HSQC spectra showed a similar trend: the two conditions with weaker helical intermolecular interactions displayed a faster decay in signal intensity. We observed that TDP-16G (without the amyloidogenic helices) aggregated at a rate similar to or slightly slower than that of the wild type (with the helices), suggesting that the IDRs contributed significantly to the aggregation. Our observations therefore support a model in which protein fibrillation starts in the IDRs and is followed by the structural conversion of the helical region. The amyloidogenic helical region has a high multiplicity of binding modes [
27,
28], which would contribute to the stability of the aggregates after the structural transition by forming ordered amyloid structures.
Previous NMR studies have indicated that for residues 321–330, the α-helical structure populated only about 50% of the conformational ensemble, while for residues 331–343, the helical population was even smaller [
12]. Although the values were obtained mostly at a 20 μM protein concentration at pH 6.1 MES, where the protein was not in droplets, it indeed showed that the helices in TDP-43 LCD were in dynamic exchange in the conformation, and not a strong helical structure. Previous research also indicated increased helical structure upon protein LLPS, consistent with our results [
32]. Therefore, a proper helix–helix interaction is needed for protein LLPS, and at the same time, the intermolecular interaction helps maintain the stability of the helical structure. Without sufficient intermolecular interaction, the helical structure loosens easily and the protein aggregates readily, as shown in the urea condition and in our pH 5.5 MES condition. However, even in the mature droplet state, over a long incubation time there are still opportunities for a temporary breakup of the helix–helix interaction and a slight loosening of the helicity. This would explain the increase in HSQC signal intensity in the helical region during protein aggregation after a long incubation time. This observation is consistent with cell-based studies showing that the recruitment of TDP-43 into granules protects the protein from fast aggregation [
32,
33]. Some of the ALS-associated mutations were also reported to alter the helix–helix interactions or the helical propensity [
34]. Our observation again reinforces the importance of the helices for TDP-43 LCD LLPS and aggregation.
3.4. The Polar, Aromatic Residue Rich Sequence of Low Complexity Domain
X-G/S and G/S-X (X represents the aromatic residues) sequences have been found in many proteins with LLPS. Interestingly, the nucleoporins in nuclear pore complexes also contain many phenylalanyl-glycyl (FG)-rich repeats at their selective filter for the random fuzzy interactions with their cargo, the transport factors [
35]. Similar observations on the NMR signal intensity attenuation were also reported on this FG-rich region of nucleoporins. TDP-43 was originally expressed in the nucleus and was transported to the cytoplasm through the nuclear pore complexes. We speculate that the intermolecular interactions between TDP-43 LCD and the nucleoporins may also be present during the transportation of TDP-43 out of the nucleus.
In conclusion, this research examined the molecular properties of TDP-43 LCD in mature droplets. The molecules were still liquid-like, although the intermolecular interactions were stronger than under conditions with lower protein concentration or without LLPS. The equilibrium between protein exit from and re-entry into the droplets could be shifted by modifying the solution environment, as demonstrated here by adding RNA or diluting the protein. The protein in the mature droplets aggregated gradually, but more slowly than under some conditions with decreased helix–helix intermolecular interactions. A partial loosening of the helical intermolecular interaction was identified as an intermediate step in aggregation. Not all interactions were probed here, and the studies were carried out in a highly simplified system to exclude other influencing factors. Recently, fuzzy interactions and the multiplicity of binding modes have been recognized as a framework to explain and predict, based on sequence, the propensity of proteins to form droplets or amyloids (
https://fuzdrop.bio.unipd.it, accessed on 1 November 2022). IDR sequences of TDP-43 LCD sample mostly disordered interactions favoring the protein droplet formation. The information gained could provide useful guidance to design ligands to fine tune the protein phase behaviors.
4. Materials and Methods
4.1. Expression and Purification
The cDNA of human TDP-43 LCD (residues 267–414) was derived from a plasmid encoding thioredoxin (Trx)-fused TDP-43 LCD (a gift from Prof. Hong-Yu Hu). The cDNAs of the mutants TDP-16E and TDP-16G were synthesized directly, and all cDNAs were cloned into pET32M with an N-terminal hexa-His-tag. Proteins were overexpressed in E. coli BL21(DE3) or Rosetta (DE3). The uniformly labeled peptide was expressed in M9 minimal medium containing 4 g of glucose and 1.5 g of 15NH4Cl per liter. Unlabeled peptide was expressed in LB medium. Cells were grown at 37 °C until the OD600 reached 0.8. Protein expression was then induced overnight at 22 °C by adding 0.5 mM isopropyl β-D-thiogalactoside (IPTG) for TDP-43 LCD, or for 6 h at 37 °C by adding 1 mM IPTG for TDP-16E and TDP-16G.
The 5-fluoro-tryptophan (5FW)-labeled peptide was obtained by adding 5-fluoroindole (5FI) to M9 minimal medium. In brief, after cells were grown to an OD600 of about 0.6 in 1 L of M9 minimal medium, they were centrifuged and the pellets were transferred into 0.5 L of M9 minimal medium containing 2 g of glucose, 0.75 g of NH4Cl, 30 mg of 5FI, 20 mg of Tyr, 20 mg of Phe, and 0.5 g of glyphosate (to suppress the shikimate pathway for aromatic amino acid synthesis). Following incubation for 30 min, 0.5 mM IPTG was added to initiate protein expression.
Cells were collected by centrifugation at 10,000× g for 10 min, and the cell pellet was resuspended in 100 mL of lysis buffer (50 mM Tris-HCl, 300 mM NaCl, pH 8.0) with 1 mM PMSF, lysed by French press, and centrifuged at 28,000× g for 30 min. The inclusion bodies containing the peptides were washed with water and resuspended at 4 °C in 20 mL of denaturing binding buffer (50 mM Tris-HCl, 300 mM NaCl, 8 M urea, pH 8.0) until most of the inclusion bodies had dissolved. After further centrifugation at 28,000× g for 10 min at 4 °C, the supernatant was purified on a Ni-NTA affinity column with an elution buffer containing 8 M urea and 500 mM imidazole.
For TDP-43 LCD, the protein solution with 8 M urea and 500 mM imidazole was dialyzed in water for 1 day at room temperature. All dialysates were collected and lyophilized. The dried sample was then dissolved in 30% formic acid and subsequently purified by reverse-phase HPLC on a C18 column eluted by a water–acetonitrile solvent system. The HPLC elution containing pure recombinant proteins was lyophilized and stored at −80 °C for further experiments.
For TDP-16E and TDP-16G, the protein was initially stored in 8 M urea, desalted into the phosphate buffer with a 0.5 mL ZebaSpin Desalting column (Thermo Scientific, Carlsbad, CA, USA), and diluted to 100 μM or 20 μM for the experiments.
4.2. Turbidity Measurements
TDP-43 LCD and its variants were dissolved in different buffers at 25 °C and incubated for 5 min. A 100 μL aliquot of each sample was transferred to a 96-well plate. Turbidity was measured using a plate reader (Enspire, PerkinElmer, Waltham, MA, USA) by monitoring the absorbance at 600 nm. The tested solution conditions were: PB (10 mM phosphate buffer, pH 6.0); PB + RNA (10 mM phosphate buffer, pH 6.0, with 20 ng/μL yeast RNA; Sigma, St. Louis, MO, USA); PB + urea (10 mM phosphate buffer, pH 6.0, with 150 mM urea); and MES (20 mM MES buffer, pH 5.5).
4.3. Thioflavin-T Assays
TDP-43 LCD and its variants were dissolved in different buffers containing 20 μM THT at 25 °C and transferred to a 96-well plate. The fluorescence emission at 480 nm was measured using a plate reader (Enspire, PerkinElmer, USA) with an excitation wavelength of 430 nm [
14]. Five seconds of shaking was applied before each reading. The blank was pH 6.0, 10 mM phosphate buffer only.
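An aggregation lag time can be extracted from a THT fluorescence trace in several ways, and the text does not state which was used here. The following is a minimal sketch of one common option, the tangent method: the steepest tangent to the curve is extrapolated back to the pre-transition baseline. The function names and the synthetic curve parameters are hypothetical and are not measured data.

```python
import math


def logistic(t, baseline, amplitude, rate, t_half):
    """Sigmoidal curve often used to describe amyloid growth kinetics."""
    return baseline + amplitude / (1.0 + math.exp(-rate * (t - t_half)))


def lag_time_tangent(times, signal):
    """Time at which the steepest tangent crosses the baseline (minimum signal)."""
    if len(times) != len(signal) or len(times) < 3:
        raise ValueError("need at least three matched time/signal points")
    baseline = min(signal)
    best = None  # (slope, midpoint time, midpoint signal) of the steepest interval
    for i in range(len(times) - 1):
        slope = (signal[i + 1] - signal[i]) / (times[i + 1] - times[i])
        if best is None or slope > best[0]:
            best = (
                slope,
                0.5 * (times[i] + times[i + 1]),
                0.5 * (signal[i] + signal[i + 1]),
            )
    slope, t_mid, f_mid = best
    if slope <= 0:
        raise ValueError("signal never increases; no lag time can be defined")
    # Extrapolate the tangent line back to the baseline level.
    return t_mid - (f_mid - baseline) / slope


# Synthetic trace with hypothetical parameters, sampled every 5 min up to 600 min.
times = [5.0 * i for i in range(121)]
signal = [logistic(t, 0.0, 1.0, 0.05, 240.0) for t in times]
print(f"estimated lag time: {lag_time_tangent(times, signal):.1f} min")
```

For an ideal logistic curve this construction corresponds to t_half − 2/rate, which is 200 min for the synthetic parameters above. Real traces are noisy, so smoothing or curve fitting would normally precede the slope search.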
4.4. Intrinsic Fluorescence Spectroscopy
A total of 20 μM of TDP-43 LCD and its variants were dissolved in a phosphate buffer and 150 μL of the samples were transferred to a cuvette (Quartz SUPRASIL, Hellma, Mannheim, Germany). For the fluorescence spectroscopy measurements (FluoroMax-4, HORIBA, Edison, NJ, USA), the excitation wavelength was set to 295 nm, and the emission wavelength range was 310–500 nm. Both slits were 5 nm, and the scanning step was 1 nm [
36,
37]. The variation in the fluorescence maximum intensity with time indicates the aggregation rate.
4.5. Differential Interference Contrast (DIC) Microscopy
TDP-43 LCD and its variants were dissolved in different buffers at 25 °C and incubated for 5 min. For each sample, 5 μL of protein solution was placed onto the bottom of a glass dish. The solution was then examined with an inverted microscope (Nikon ECLIPSE Ti, Nikon, Tokyo, Japan) equipped with a 60×/1.49 NA oil objective and imaged with a digital camera (ORCA-Flash 4.0, HAMAMATSU, Hamamatsu, Japan). The blank was 10 mM phosphate buffer, pH 6.0, only.
4.6. Optical Tweezers
An optical tweezers microscope (C-Trap™, LUMICKS, Amsterdam, The Netherlands) with two steerable traps was used to perform the controlled fusion of droplets [38]. TDP-43 LCD (100 μM) was dissolved in phosphate buffer, and droplets formed within minutes. These droplets were flowed into the chamber just before data acquisition. A 1064 nm laser at low power (<0.5 W) was used to minimize heating. One droplet was held in place by one trap, and the other steerable trap was used to capture another droplet and bring it toward the stationary droplet at a velocity of 0.04 μm s−1 until the surfaces of the two droplets touched [39]. Force-extension and image data were acquired at 5 Hz. Touch times were determined from the laser signal and confirmed with the video signal.
4.7. Negative-Staining Transmission Electron Microscope (TEM)
To observe the droplets, 100 μM TDP-43 LCD was dissolved in different buffers and incubated for 5 min. To observe the fibrils, 20 μM TDP-43 LCD and its variants were incubated for 4 days before imaging. A 5 μL aliquot of the sample solution was adsorbed onto a glow-discharged TEM grid (Cu, 300 mesh; Beijing Zhongjingkeyi Technology Co., Ltd., Beijing, China) for 45 s. The grid was then washed with 5 μL of water for 3 s and finally stained with 5 μL of 2% uranyl acetate for 45 s. The TEM images were obtained using a transmission electron microscope (Talos L120C, FEI, Brno, Czech Republic). The acceleration voltage was 120 kV. The exposure time for each image was 2 s.
4.8. X-ray Diffraction (XRD)
TDP-43 LCD and its variants (20 μM) were first incubated for 4 days. The solutions were then centrifuged at 50,000 rpm for 2 h (Optima MAX-TL, Beckman Coulter, Brea, CA, USA) and the pellets were collected. The pellets were measured on a single-crystal X-ray diffraction instrument (Bruker D8 VENTURE, Bruker, Karlsruhe, Germany) using Cu Kα radiation at a wavelength of 1.54184 Å [
40].
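Diffraction angles measured with this source can be converted to lattice spacings with Bragg's law, λ = 2d·sin θ (first order). The following is a minimal sketch using the Cu Kα wavelength stated above; the function names are illustrative, and the 4.7 Å spacing in the example is the inter-strand distance generally associated with cross-β amyloid structure, not a value reported in this section.

```python
import math

CU_K_ALPHA_ANGSTROM = 1.54184  # wavelength stated in the text


def d_spacing(two_theta_deg, wavelength=CU_K_ALPHA_ANGSTROM):
    """First-order Bragg spacing d (in Angstrom) for a diffraction angle 2-theta in degrees."""
    theta = math.radians(two_theta_deg / 2.0)
    if not 0.0 < theta < math.pi / 2.0:
        raise ValueError("2-theta must lie between 0 and 180 degrees")
    return wavelength / (2.0 * math.sin(theta))


def two_theta(d_angstrom, wavelength=CU_K_ALPHA_ANGSTROM):
    """Diffraction angle 2-theta in degrees for a given spacing d (in Angstrom)."""
    ratio = wavelength / (2.0 * d_angstrom)
    if not 0.0 < ratio <= 1.0:
        raise ValueError("spacing too small to diffract at this wavelength")
    return 2.0 * math.degrees(math.asin(ratio))


# Example: where a 4.7 Angstrom reflection would be expected with Cu K-alpha.
print(f"2-theta for d = 4.7 A: {two_theta(4.7):.2f} degrees")
```

The two functions are inverses of each other, so either direction can be used depending on whether the instrument reports angles or spacings.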
4.9. Solution-State NMR
The samples were dissolved in 90% H2O/10% D2O pH 6.0 phosphate buffer with or without RNA/urea, or in pH 5.5 MES buffer. All 2D 1H-15N HSQC NMR experiments were recorded on a Bruker 800 MHz AVANCE III spectrometer at 298 K. The first spectrum was taken within 2 h of sample preparation, and further spectra were taken after ~6, 22, or 53 h. All spectra were collected with the following parameters: 128* and 2048* complex pairs in the indirect 15N and direct 1H dimensions, respectively; 32 scans; and spectral widths of 13.9 and 28 ppm for 1H and 15N, respectively. Each experiment took approximately 1 h 52 min. The 1H-15N HSQC peak assignments were based on a published chemical shift list deposited in the BMRB under accession code 26823. A summary of the 1H and 15N chemical shifts from four different publications is listed in Table S2 to show the similarities and differences between the different samples (BMRB codes: 26823, 50154, 26728, 26816).
Proton chemical shifts were directly referenced using DSS on a TDP-43 LCD sample prepared for this purpose, and the 15N chemical shifts were referenced indirectly. All spectra were processed using either Sparky or Topspin 4.1.3. All chemical shifts obtained from the 1H-15N HSQC spectra under the various conditions prepared in this work are reported in Table S3.
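Indirect 15N referencing is conventionally done by multiplying the absolute 1H frequency of the DSS signal by a fixed frequency ratio Ξ. The following is a minimal sketch of that arithmetic, assuming the commonly tabulated ratio Ξ = 0.101329118 for 15N relative to DSS; the text does not state which ratio was used, and the 800.0 MHz DSS frequency below is a nominal placeholder rather than the measured value.

```python
XI_15N = 0.101329118  # assumed 15N/1H frequency ratio for indirect referencing to DSS


def n15_zero_ppm_mhz(dss_1h_mhz, xi=XI_15N):
    """Absolute frequency (MHz) that defines 0 ppm on the 15N axis."""
    return dss_1h_mhz * xi


def ppm_to_hz(width_ppm, reference_mhz):
    """Convert a width in ppm to Hz; 1 ppm equals the reference frequency in MHz, in Hz."""
    return width_ppm * reference_mhz


dss_1h_mhz = 800.0  # nominal placeholder for the measured DSS 1H frequency
n15_mhz = n15_zero_ppm_mhz(dss_1h_mhz)

print(f"15N 0 ppm frequency: {n15_mhz:.7f} MHz")
print(f"1H spectral width (13.9 ppm): {ppm_to_hz(13.9, dss_1h_mhz):.1f} Hz")
print(f"15N spectral width (28 ppm): {ppm_to_hz(28.0, n15_mhz):.1f} Hz")
```

The same conversion relates the spectral widths quoted in ppm in this section to the sweep widths in Hz that are entered on the spectrometer.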
4.10. 19F NMR Spectroscopy
The samples were dissolved in 90% H2O/10% D2O pH 6.0 phosphate buffer with or without RNA, or in pH 5.5 MES buffer. All NMR spectra were recorded on a Bruker AVANCE 600 MHz spectrometer (Bruker Biospin, Billerica, MA, USA) at 298 K. The first spectrum was taken within ~30 min of sample preparation, and further spectra were taken after ~4, 10, or 22 h. All spectra were collected with 40,960 complex points and 1024 scans. Each experiment took approximately 30 min. All samples contained TFA as an internal reference, which was set to −75.6 ppm. A line broadening of 10 Hz was applied in processing the final spectra. Origin 2018 and MestReNova were used to plot the data.
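Line broadening is usually applied by multiplying the free induction decay (FID) by a decaying exponential before Fourier transformation. The following is a minimal sketch assuming the common window form exp(−π·LB·t); whether the processing software used exactly this convention is an assumption, and the dwell time and toy FID below are hypothetical.

```python
import math


def exponential_window(n_points, dwell_s, lb_hz=10.0):
    """Weights exp(-pi * LB * t) sampled at t = i * dwell for i = 0 .. n_points - 1."""
    return [math.exp(-math.pi * lb_hz * i * dwell_s) for i in range(n_points)]


def apodize(fid, dwell_s, lb_hz=10.0):
    """Multiply a complex FID point by point with the exponential window."""
    weights = exponential_window(len(fid), dwell_s, lb_hz)
    return [point * weight for point, weight in zip(fid, weights)]


# Toy FID: a single undamped 50 Hz signal sampled with a hypothetical 1 ms dwell time.
dwell_s = 1.0e-3
fid = [complex(math.cos(2 * math.pi * 50.0 * i * dwell_s),
               math.sin(2 * math.pi * 50.0 * i * dwell_s)) for i in range(512)]
broadened = apodize(fid, dwell_s, lb_hz=10.0)
print(f"last-point magnitude before: {abs(fid[-1]):.3f}, after: {abs(broadened[-1]):.3e}")
```

A larger LB value suppresses the noisy tail of the FID more strongly, which improves the signal-to-noise ratio at the cost of wider lines in the transformed spectrum.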