#### *2.1. Isolated C-LrtA Was Intrinsically Disordered in Solution*

To map the conformational features of C-LrtA in solution, we used NMR, fluorescence, far-UV circular dichroism (CD) and MD simulations. Fluorescence reports on the overall environment around the fluorescent residues (C-LrtA has 4 tyrosine residues). Far-UV CD provides estimates of the percentages of secondary structure. NMR gives further information about the presence of secondary and tertiary structures. Finally, MD simulations give indications of the conformation of isolated C-LrtA in solution and of the local propensity for secondary structure formation along the backbone.

(a) NMR: In the methyl (Figure 1A) and the amide (Figure 1B) regions of the NMR spectrum of C-LrtA, there was no dispersion, i.e., all the amide signals appeared clustered between 8.2 and 8.6 ppm and most of the alkyl chains appeared between 0.8 and 1.0 ppm. Only a small shoulder appeared up-field shifted at 0.7 ppm, indicative of a local conformation around the methyl group of a valine, leucine or isoleucine probably close to an aromatic residue; we hypothesize that this signal could correspond to the polypeptide patch VIYI (residues 173–176, in the numbering of the whole LrtA).

The NMR spectrum showed significant broadening in all the signals. On the basis of the results of the other techniques (see below in this section, and later in Section 2.2), this could be due to the presence of conformational exchange (equilibria) among protein species with different self-associated order. However, given the mobility of the protein (see MD simulations results, below in this section), we cannot exclude that the broadening observed could be due to conformational exchange in a single protein species.

Therefore, we can conclude from the NMR spectrum that C-LrtA was disordered.

(b) Far-UV CD: The far-UV CD spectrum of C-LrtA showed a minimum at around 200 nm and a wide shoulder at 222 nm (Figure 2A), which could be due to the presence of helix- or turn-like structures, although a contribution from the absorbance of aromatic residues (4 tyrosines and 3 phenylalanines) at the latter wavelength cannot be ruled out [14,15]. Although we cannot exclude protein adsorption to the cell at the lowest protein concentration used, the spectrum intensity was protein-concentration dependent (in the range of 10 to 20 μM of protein, Figure S1), whereas its shape did not change. The fact that the other techniques used (see Section 2.2) indicate the presence of oligomeric species suggests that the variations observed in the far-UV CD spectra are due to the existence of self-associated species.

**Figure 1.** NMR characterization of C-LrtA: (**A**) Methyl and (**B**) amide regions of the 1D <sup>1</sup>H NMR spectrum of C-LrtA. The spectrum was acquired at 20 °C and pH 7.2 (50 mM Tris buffer).

**Figure 2.** Spectroscopic characterization of C-LrtA: (**A**) Far-UV circular dichroism (CD) spectrum of C-LrtA. The spectrum was acquired at 20 °C and pH 7.2 (50 mM Tris buffer) with 20 μM (in protomer units) of protein; (**B**) GdmCl denaturations of C-LrtA followed by fluorescence (right axis, red circles) and CD (left axis, black circles), at 10 μM of protein (in protomer units) and 20 °C; (**C**) Thermal denaturations of C-LrtA followed by fluorescence (right axis, red circles) and CD (left axis, black circles). Experiments were acquired at pH 7.2 (50 mM Tris buffer) and 10 μM of protein (in protomer units). The ellipticity (far-UV CD) units of the thermal denaturations are arbitrary, because values are scaled up.

The shape of the spectrum of C-LrtA was characteristic of IDPs [16]. Its deconvolution, by using the algorithms available at the DICHROWEB site [17,18], yielded percentages of 7–8% for α-helix structure, 15–20% for β-turn, 28–44% for β-sheet and 45–48% for random-coil.

In the presence of increasing GdmCl concentrations, the shoulder at 222 nm of the far-UV CD spectrum decreased (Figure 2B, left axis, black circles) at the two protein concentrations explored (10 and 20 μM). These results suggest that the shoulder was not due to the presence of any well-fixed structure, but rather to flickering helix- or turn-like motifs, or even to local conformations of the aromatic residues [14,15]. Attempts to fit these data to the linear extrapolation model failed, as they led to thermodynamic parameters with no physical meaning (i.e., negative values of *m* or large values of [GdmCl]<sub>1/2</sub>). We further tested the disordered nature of C-LrtA by performing thermal denaturations. We observed a decrease in the ellipticity as the temperature was increased (Figure 2C, left axis, black circles), but not the sigmoidal, co-operative behaviour that would be expected for a well-folded globular domain [16,19].

Therefore, we can conclude from the far-UV CD data that C-LrtA was disordered.
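For reference, the linear extrapolation model mentioned above assumes a two-state transition whose unfolding free energy varies linearly with denaturant concentration, ΔG = *m*([GdmCl]<sub>1/2</sub> − [GdmCl]). The following Python sketch writes that model out; the function names, the flat baselines and the parameter values in the usage lines are illustrative assumptions (they are not values for C-LrtA, nor the fitting procedure used in this work). It only illustrates the sigmoidal curve that a co-operative transition would produce, in contrast with the monotonic changes observed here.

```python
import math

R_GAS = 1.987e-3  # gas constant, kcal mol^-1 K^-1


def delta_g(denaturant, m, d_half):
    """Unfolding free energy (kcal/mol) under the linear extrapolation model."""
    return m * (d_half - denaturant)


def two_state_signal(denaturant, m, d_half, s_native, s_unfolded, temp_k=293.15):
    """Signal of a two-state N <-> U transition, assuming flat baselines."""
    k_unf = math.exp(-delta_g(denaturant, m, d_half) / (R_GAS * temp_k))
    frac_unfolded = k_unf / (1.0 + k_unf)
    return s_native + (s_unfolded - s_native) * frac_unfolded


if __name__ == "__main__":
    # Hypothetical parameters for a well-folded domain (not C-LrtA values).
    for conc in (0.0, 1.0, 2.0, 3.0, 4.0, 5.0, 6.0):
        s = two_state_signal(conc, m=2.0, d_half=3.0, s_native=1.0, s_unfolded=0.0)
        print(f"[GdmCl] = {conc:.1f} M  normalized signal = {s:.3f}")
```

In this model a negative *m*, or a midpoint far outside the explored denaturant range, means that no two-state transition is being described, which is the sense in which the fitted parameters lacked physical meaning.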

(c) Fluorescence: Fluorescence spectra of C-LrtA showed a maximum at 307 nm, corresponding to its 4 tyrosine residues [20,21]. We carried out GdmCl denaturations by following the <λ> (at two different C-LrtA concentrations) after excitation at 280 nm. At both protein concentrations, we observed a linear decrease in the <λ> as the concentration of chemical denaturant was increased (Figure 2B, right axis, red circles). We could not fit these data to the linear extrapolation model, as fitting led to thermodynamic parameters (*m* or [GdmCl]<sub>1/2</sub> values) with no physical meaning (i.e., either negative values or values higher than the protein concentration explored). A similar linear tendency was observed in thermal denaturations (Figure 2C, right axis, red circles).

Therefore, we can conclude from the fluorescence data that C-LrtA was disordered.

(d) MD simulations: C-LrtA (the sequence present in the wild-type protein, i.e., without the His-tag) was simulated starting from an extended conformation with a radius of gyration *R*<sub>g</sub> = 64 Å. The protein structure spontaneously collapsed in 15 ns, and for the subsequent 10 ns maintained a size of *R*<sub>g</sub> = 24 ± 2 Å, in excellent agreement with the value (*R*<sub>g</sub> = 25 Å) predicted for a tag-free IDP of 90 residues with the sequence features of C-LrtA [22]. As reported in Figure 3, the secondary structure propensity of the protein in the time interval considered was reasonably consistent. In contrast, the conformations sampled at sufficiently large intervals (e.g., every 0.1 ns) were relatively different in terms of their three-dimensional arrangement, due to the dynamics of C-LrtA. Although the simulation was not long enough to yield a complete statistical ensemble of conformations, it was not extended, in order to avoid well-known artifacts due to over-compaction of the protein [23,24]: a small drift was observed in its size (*R*<sub>g</sub> decreased by ~0.3 Å/ns on average, Figure 3).
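The two quantities discussed above, the radius of gyration and its drift over time, can be computed as in the following minimal Python sketch. It assumes atomic positions given as plain (x, y, z) tuples in Å and a list of *R*<sub>g</sub> values sampled along the trajectory; the function names and the toy inputs are hypothetical and do not reproduce the analysis pipeline used for the simulation.

```python
import math


def radius_of_gyration(coords, masses=None):
    """Mass-weighted radius of gyration for a list of (x, y, z) positions."""
    if masses is None:
        masses = [1.0] * len(coords)  # equal weights if no masses are given
    total = sum(masses)
    center = [sum(m * c[i] for m, c in zip(masses, coords)) / total
              for i in range(3)]
    weighted_sq = sum(m * sum((c[i] - center[i]) ** 2 for i in range(3))
                      for m, c in zip(masses, coords))
    return math.sqrt(weighted_sq / total)


def linear_drift(times, values):
    """Least-squares slope of values against times (e.g., Å per ns)."""
    n = len(times)
    t_mean = sum(times) / n
    v_mean = sum(values) / n
    num = sum((t - t_mean) * (v - v_mean) for t, v in zip(times, values))
    den = sum((t - t_mean) ** 2 for t in times)
    return num / den


if __name__ == "__main__":
    # Toy check: two equal masses 2 Å apart have Rg = 1 Å.
    print(radius_of_gyration([(-1.0, 0.0, 0.0), (1.0, 0.0, 0.0)]))
    # Hypothetical Rg series (Å) that decreases by 0.3 Å per ns.
    times_ns = list(range(11))
    rg_series = [25.5 - 0.3 * t for t in times_ns]
    print(linear_drift(times_ns, rg_series))
```

A slope of about −0.3 Å/ns sustained over 10 ns corresponds to a decrease of roughly 3 Å, comparable to the ±2 Å spread quoted for *R*<sub>g</sub>.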

**Figure 3.** Simulated secondary structure and radius of gyration of C-LrtA: Backbone properties of the protein without the His-tag are calculated in a 10 ns time interval, following 15 ns of equilibration after starting from an elongated conformation. (**Up**) Secondary structure propensities calculated with VMD [25]: (blue) β-structure, (red) helical structure, and (white) random coil; (**Down**) Radius of gyration of C-LrtA; the drift leading to a small decrease of *R*<sub>g</sub> is also shown (red line).

The simulation results consistently indicated that C-LrtA in solution was a very flexible protein with little secondary structure. The percentages of helical/β-structure were in good agreement with the range of those obtained from the deconvolution of far-UV CD spectra, but the corresponding backbone conformations were in all cases local and did not extend for more than a few residues. Interestingly, among the four tyrosine residues of C-LrtA, only Tyr182 (according to the numbering in intact LrtA) was in a region with β-structure propensity, whereas the other three were in random-coil regions and showed a large conformational freedom in the isolated domain.

In summary, taking into account all the data from fluorescence, far-UV CD, NMR and MD simulations, C-LrtA appeared disordered in solution.

#### *2.2. Isolated C-LrtA Was Involved in Self-Association Equilibria in Solution*

To map the hydrodynamic properties of LrtA we used several biochemical, biophysical and hydrodynamic techniques: blue-native gels; glutaraldehyde cross-linking; iodide quenching; DOSY-NMR (diffusion ordered spectroscopy NMR); SAXS (small-angle X-ray scattering); SEC (size exclusion chromatography) and ITC (isothermal titration calorimetry). We used this wide range of techniques to provide unambiguous evidence of oligomerization in disordered C-LrtA. It is important to point out, however, that NMR reports only on the low-molecular-weight species, whose overall rotational tumbling is very fast; therefore, it provides information only on the monomeric and/or dimeric species.

(a) DOSY-NMR: The DOSY-NMR measurements of C-LrtA yielded a translational diffusion coefficient, *D*, of (5.0 ± 0.2) × 10<sup>−7</sup> cm<sup>2</sup> s<sup>−1</sup> (Figure 4A). By taking into account the hydrodynamic radius, *R*<sub>S</sub>, of dioxane (2.12 Å), and its *D* under our conditions ((8.53 ± 0.02) × 10<sup>−6</sup> cm<sup>2</sup> s<sup>−1</sup>), the estimated *R*<sub>S</sub> for C-LrtA was 36 ± 4 Å. We can compare this value with that theoretically determined for a polypeptide with the sequence length of C-LrtA (including the N-terminal His-tag). The *R* value for an unsolvated, ideal, spherical molecule can be estimated from [21]: $R = \sqrt[3]{3MV/(4N_{A}\pi)}$, where *N*<sub>A</sub> is Avogadro's number, *M* is the molecular weight of the C-LrtA construct (12,449.89 Da), and *V* is the specific volume of the C-LrtA construct (0.721 mL/g). The calculated radius for C-LrtA is 15.3 Å, but taking into account the water shell [21,26], the hydration radius is 18.5 Å; this value is different from that obtained from the experimental DOSY measurements. The *R*<sub>S</sub> for a spherical, folded protein is given by [27]: $R_{S} = (4.75 \pm 1.11)\,N^{0.29}$, where *N* is the number of residues; in a 109-residue-long protein, such as C-LrtA (the His-tag and the 90-residue-long domain), this expression yields 18 ± 4 Å, in good agreement with the other theoretically calculated value. On the other hand, for an unfolded polypeptide chain, the *R*<sub>S</sub> can be estimated from [27]: $R_{S} = (2.21 \pm 1.07)\,N^{0.57}$; for C-LrtA, the value is 32 ± 15 Å, which is closer to the value measured in the DOSY-NMR experiments; however, the use of that expression yields a slightly higher value (43 ± 15 Å) for a dimeric species. Therefore, with the DOSY-NMR experiments we were detecting only low-molecular-weight species, which seemed to be unfolded.
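The estimates in this paragraph can be reproduced with a few lines of arithmetic. The Python sketch below uses only the values quoted in the text (the diffusion coefficients, the dioxane radius, the construct mass and specific volume, and the central values of the two scaling-law prefactors); the function names are illustrative, and the quoted uncertainties are not propagated.

```python
import math

AVOGADRO = 6.02214076e23  # mol^-1


def rs_from_reference(d_protein, d_reference, rs_reference):
    """Hydrodynamic radius from a reference probe in the same sample (R ~ 1/D)."""
    return rs_reference * d_reference / d_protein


def anhydrous_sphere_radius(mol_weight, specific_volume):
    """Radius (Å) of an unsolvated sphere; M in g/mol, V in mL/g."""
    volume_cm3 = mol_weight * specific_volume / AVOGADRO
    radius_cm = (3.0 * volume_cm3 / (4.0 * math.pi)) ** (1.0 / 3.0)
    return radius_cm * 1e8  # cm -> Å


def rs_folded(n_residues):
    """Scaling law for folded proteins, central prefactor only (Å)."""
    return 4.75 * n_residues ** 0.29


def rs_unfolded(n_residues):
    """Scaling law for unfolded chains, central prefactor only (Å)."""
    return 2.21 * n_residues ** 0.57


if __name__ == "__main__":
    print(f"R_S from DOSY:       {rs_from_reference(5.0e-7, 8.53e-6, 2.12):.1f} Å")
    print(f"Unsolvated sphere:   {anhydrous_sphere_radius(12449.89, 0.721):.1f} Å")
    print(f"Folded, N = 109:     {rs_folded(109):.1f} Å")
    print(f"Unfolded, N = 109:   {rs_unfolded(109):.1f} Å")
```

With these inputs the four values come out close to 36, 15.3, 18.5 and 32 Å, matching the figures given in the paragraph.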

(b) SEC: Different amounts of C-LrtA were loaded onto an analytical Superose 12 10/300 GL column at pH 8.0 (50 mM Tris) and 0.250 M NaCl. The chromatograms did not show a single peak (Figure S2), and the lower the protein concentration, the more peaks appeared. This finding, given the purity of the protein (Figure S3A), could be attributed to protein-column interactions of some species, which eluted at larger volumes than expected from their size. Similar delayed peaks, due to protein-column interactions, have been observed for intact LrtA [12]. It is important to note that a small peak appearing at 16.48 mL was present in the chromatogram of the most concentrated sample (500 μM) (Figure S2), as well as at every other protein concentration; we interpreted this peak as due to a monomeric species interacting with the column.

The elution volume of one of the peaks showed a hyperbolic dependence on the protein concentration (Figure 4B). At a very high concentration (500 μM), the protein eluted at 11.98 mL (obtained as the mean of three different measurements, although the elution peak was very broad). This value would correspond to a molecular weight of 100 kDa (for comparison, a protein such as ferritin, with a molecular weight of 400 kDa, elutes from the column at 10.11 mL, Figure S2); these results suggest that C-LrtA was probably an octamer under these conditions. On the other hand, at 30 μM, C-LrtA eluted at 13.88 mL, which would correspond to a molecular weight of 31.6 kDa, close to the expected molecular weight of a dimeric species. Therefore, in the column matrix the protein behaved as a self-associated species with several oligomerization orders, depending on the concentration used.
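The assignment of oligomerization orders from these apparent molecular weights is a simple ratio to the protomer mass, as sketched below in Python. The protomer mass is that of the C-LrtA construct quoted in the DOSY-NMR paragraph (12,449.89 Da); the function name is illustrative, and the sketch takes the apparent masses at face value, although column calibrations are usually built with globular proteins and a disordered chain may therefore appear larger than its true mass.

```python
PROTOMER_KDA = 12.44989  # mass of the C-LrtA construct, in kDa


def oligomer_order(apparent_kda, protomer_kda=PROTOMER_KDA):
    """Apparent number of protomers for a given apparent molecular weight."""
    return apparent_kda / protomer_kda


if __name__ == "__main__":
    for label, mass in (("500 μM sample", 100.0), ("30 μM sample", 31.6)):
        print(f"{label}: {mass} kDa -> {oligomer_order(mass):.1f} protomers")
```

The ratios are about 8.0 for the 100 kDa species and about 2.5 for the 31.6 kDa species.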

**Figure 4.** Hydrodynamic and biophysical measurements of C-LrtA: (**A**) DOSY measurements: Intensity decay (arbitrary units) of the methyl signals as the pulse field gradient strength was increased (x-axis). The line is the fit to the equation described in Section 4.6. The inset shows the linear relationship between the logarithm of the intensity and the square of the gradient strength; (**B**) size exclusion chromatography (SEC) measurements: Elution volumes of one of the peaks observed for C-LrtA in a Superose 12 10/300 GL column at different protein concentrations in buffer at pH 8.0 (50 mM Tris) and 0.250 M NaCl; the data have an error of 0.1 mL, as obtained from three independent measurements at each C-LrtA concentration; (**C**) small-angle X-ray scattering (SAXS) results of C-LrtA, with the solid line representing a fit with a generalized Gaussian coil with a scaling exponent value of 1/3 (Section 4.11), and the dashed line the Guinier description of the low-*Q* limit (see Guinier plot in the inset, and Section 4.11).

(c) BN-PAGE (blue native polyacrylamide gel electrophoresis): C-LrtA exhibited two bands in these experiments, which corresponded to different self-associated species (Figure S3B). The fastest migrating band corresponded to an apparent molecular weight of 33 kDa (close to the molecular weight of a dimer, and similar to that observed at the most dilute protein concentration in the SEC experiments). The other band corresponded to an apparent molecular weight of 66 kDa, denoting a pentamer. Our results also suggest that increasing the amount of SDS (well below the concentration used in denaturing SDS-PAGE gels: 33 mM [28]) had significant effects on the population of self-associated C-LrtA species: the larger the proportion of SDS, the higher the amount of self-associated species detected (the critical micelle concentration of SDS is 1.33 mM). It is important to indicate, at this stage, that the BN-PAGE technique can lead to overestimation of the molecular weights, as some proteins can bind Coomassie dye [29].

(d) Glutaraldehyde cross-linking: To detect the presence of oligomeric species of C-LrtA, we also used glutaraldehyde as a cross-linking agent. We observed dimers (close to the band of the protein marker at 32 kDa) (Figure S3C) at short times after addition of the cross-linking agent, and other high-molecular-weight species at the top of the SDS-PAGE lanes. The population of these high-molecular-weight self-associated species increased at the longest incubation times (Figure S3C).

(e) KI quenching: It is reasonable to assume that, if the self-associated species form at low C-LrtA concentrations and the tyrosine residues are involved in the association interfaces, we should be able to follow protein self-association by KI quenching, and we should expect a decrease in the *K*<sub>sv</sub> constant as the C-LrtA concentration is increased. We observed the following *K*<sub>sv</sub> values for C-LrtA: 1.5 ± 0.3 M<sup>−1</sup> (at 5 μM of protein); 1.1 ± 0.2 M<sup>−1</sup> (at 20 μM of protein); and 1.04 ± 0.05 M<sup>−1</sup> (at 40 μM of protein, all of them in protomer units). Thus, *K*<sub>sv</sub> was protein-concentration dependent in the 5–40 μM range, indicating that C-LrtA self-associates.
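As an illustration of how these constants translate into quenching, the Python sketch below assumes the standard Stern-Volmer relation for collisional quenching, F<sub>0</sub>/F = 1 + *K*<sub>sv</sub>[KI]. The function names and the 0.5 M KI concentration in the usage lines are hypothetical choices for illustration, not experimental conditions taken from this work.

```python
def stern_volmer_ratio(ksv, quencher_molar):
    """F0/F predicted by the Stern-Volmer relation for a quencher concentration."""
    return 1.0 + ksv * quencher_molar


def fit_ksv(quencher_molar, f0_over_f):
    """Least-squares slope of (F0/F - 1) against [Q], forced through the origin."""
    num = sum(q * (r - 1.0) for q, r in zip(quencher_molar, f0_over_f))
    den = sum(q * q for q in quencher_molar)
    return num / den


if __name__ == "__main__":
    # Ksv values reported in the text (M^-1) at 5, 20 and 40 μM of protein.
    for protein_um, ksv in ((5, 1.5), (20, 1.1), (40, 1.04)):
        ratio = stern_volmer_ratio(ksv, 0.5)  # hypothetical 0.5 M KI
        print(f"{protein_um} μM protein: F0/F = {ratio:.2f} at 0.5 M KI")
```

A smaller *K*<sub>sv</sub> means that a given iodide concentration quenches less of the tyrosine fluorescence, as expected if the residues become less solvent-accessible upon self-association.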

(f) ITC experiments: We also tried to test whether C-LrtA dissociated upon dilution, using the heat evolved in the reaction monitored by ITC. For experiments performed at a high protein concentration stock, the heat released upon dilution of the protein into the calorimetric cell was consistent with a dissociation reaction for all injections (Figure S4). We tried to fit the heat released to a simple dimer-monomer equilibrium, but the results of the fitting indicated that this assumption was not good enough, suggesting the presence of higher-order equilibria, as indicated by SEC results (see above in this section).

(g) SAXS experiments: The experiments with C-LrtA indicate an *R*<sub>g</sub> ≈ 26 Å with a υ ≈ 0.33, close to the value for a compact species, but with a value of *R*<sub>g</sub> larger than that of a well-folded protein (Section 4, Figure 4C), which is within the range observed for unfolded polypeptide chains [27]. We obtained good agreement with the expected Guinier regime at low *Q* values, indicating that, although the protein was self-associated, the size of the C-LrtA species was relatively small.
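The Guinier description referred to above approximates the low-*Q* scattering as ln *I*(*Q*) = ln *I*<sub>0</sub> − *Q*<sup>2</sup>*R*<sub>g</sub><sup>2</sup>/3, so that *R*<sub>g</sub> follows from the slope of ln *I* against *Q*<sup>2</sup>. The Python sketch below implements that straight-line fit; the function name and the synthetic, noise-free data (generated for *R*<sub>g</sub> = 26 Å) are assumptions for illustration and are not the measured scattering curve or the fitting software used in this work.

```python
import math


def guinier_fit(q_values, intensities):
    """Return (Rg, I0) from a least-squares line through ln I versus Q^2."""
    x = [q * q for q in q_values]
    y = [math.log(i) for i in intensities]
    n = len(x)
    x_mean = sum(x) / n
    y_mean = sum(y) / n
    slope = (sum((a - x_mean) * (b - y_mean) for a, b in zip(x, y))
             / sum((a - x_mean) ** 2 for a in x))
    intercept = y_mean - slope * x_mean
    rg = math.sqrt(-3.0 * slope)  # slope = -Rg^2 / 3
    return rg, math.exp(intercept)


if __name__ == "__main__":
    # Synthetic Guinier-region data for Rg = 26 Å (Q in Å^-1, Q*Rg <= ~1).
    q = [0.010 + 0.003 * k for k in range(11)]
    i_q = [100.0 * math.exp(-(qk * 26.0) ** 2 / 3.0) for qk in q]
    rg, i0 = guinier_fit(q, i_q)
    print(f"Rg = {rg:.1f} Å, I0 = {i0:.1f}")
```

The approximation holds only at low *Q*; a common rule of thumb is to restrict the fit to *Q*·*R*<sub>g</sub> of roughly 1, which is why the synthetic *Q* range stops at 0.04 Å<sup>−1</sup>.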
