*2.3. Dimeric Actin–Domain V Encounter Complexes Generated by Docking Calculations*

As reported in the literature, a tandem of WH2 motifs, such as N-WASP domain V, can form a ternary complex with two actin molecules [26,32]. Rebowski et al. notably reported a 2:1 actin–domain V complex in which two actins are assembled into a longitudinal filament-like dimer (PDB structure 3M3N) [26]. In this section, we investigate the early steps of formation of these ternary encounter complexes. As for the actin monomer, we blindly docked the representative structures of the 527 most populated clusters of the MD-derived N-WASP domain V conformational ensemble [29], but here onto the longitudinal actin dimer structure extracted from the PDB file 3M3N [26]. It should be noted that each chain of the 3M3N dimer is structurally very similar to actin in 2VCP (RMSD over C*α* atoms of 0.99 and 0.66 Å for chains A and B, respectively). Moreover, unlike in the 2VCP structure, both chains of the actin dimer 3M3N lack the coordinates of their last residue, F375. A total of 754,118 complex structures were generated. The likelihood of these complexes was evaluated with the scoring function 2/3B*best* InterEvScore [30]. We delineated the top 1% of complexes (that is, 7540 conformers) with the highest 2/3B*best* score as the most probable actin dimer–domain V structures. As for the actin monomer, when compared with the 527 cluster representative structures, the domain V conformations retrieved in the most probable complexes with the actin dimer are on average more compact, as indicated by the radius of gyration distributions (Figure S3). The dimeric state of actin did not favor the binding of extended conformations of domain V.
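The selection of the best-scored complexes and the compactness comparison can be illustrated with a minimal sketch. It assumes that docking scores are available as a plain list (higher is better) and that each conformation is a list of (x, y, z) coordinates. The function names, the truncation rule used for the cutoff, and the unweighted radius of gyration are illustrative assumptions, not the procedure actually used in this work.

```python
import math


def top_fraction(scores, fraction=0.01):
    """Return the indices of the best-scored poses (higher score = better).

    Assumption: the number of poses kept is floor(fraction * N); the rounding
    and tie-handling rules used in the study are not specified.
    """
    n_keep = int(len(scores) * fraction)
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return ranked[:n_keep]


def radius_of_gyration(coords):
    """Unweighted radius of gyration of a list of (x, y, z) points.

    Assumption: all atoms carry equal weight (no mass weighting).
    """
    n = len(coords)
    cx = sum(p[0] for p in coords) / n
    cy = sum(p[1] for p in coords) / n
    cz = sum(p[2] for p in coords) / n
    sq = sum((x - cx) ** 2 + (y - cy) ** 2 + (z - cz) ** 2 for x, y, z in coords)
    return math.sqrt(sq / n)
```

Comparing the distribution of `radius_of_gyration` over the selected poses with that over all cluster representatives would then show whether the retained domain V conformations are more compact.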

We then analyzed the probability that domain V residues are in contact with each chain of the actin dimer. We observed again that actin preferentially recognizes the domain V regions 9KAALLDQIRE18 and 37RDALLDQIRQ46, with a pattern similar to that found for the actin monomer (compare Figure 9 with Figure 1), indicating that the recognized N-WASP regions tend to adopt (partially) *α*-helical structures. It is also confirmed that the conserved sequences 22LKKV25 and 50LKSV53 have a low probability of being contacted by the actin dimer in the encounter complexes, suggesting again that they should move and anchor to the actin's surface after the recognition of the previously mentioned regions 9–18 and 37–46.

**Figure 9.** Probability that N-WASP domain V residues lie closer than 4 Å to the actin dimer. Orange and magenta dashed lines indicate the N-WASP regions in *α*-helix (as revealed by the X-ray structure 2VCP [27]) and the consensus sequences "LKKV" [14,31], respectively.
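The per-residue contact probability plotted in Figure 9 can be sketched as follows. This is an illustration under stated assumptions: a residue counts as "in contact" when any of its atoms lies closer than 4 Å to any actin atom, every complex contains the same set of domain V residues, and the data layout (a dictionary of residue atoms plus a list of actin atoms per complex) is hypothetical.

```python
import math

CUTOFF = 4.0  # Å, distance threshold quoted in the Figure 9 caption


def in_contact(residue_atoms, partner_atoms, cutoff=CUTOFF):
    """True if any residue atom lies closer than `cutoff` to any partner atom."""
    return any(
        math.dist(a, b) < cutoff for a in residue_atoms for b in partner_atoms
    )


def contact_probability(complexes, cutoff=CUTOFF):
    """Fraction of complexes in which each residue contacts the partner.

    `complexes` is a list of (ligand, partner) pairs (hypothetical layout):
      ligand  : dict mapping residue id -> list of (x, y, z) atom coordinates
      partner : list of (x, y, z) atom coordinates of the actin dimer
    """
    counts = {}
    for ligand, partner in complexes:
        for res_id, atoms in ligand.items():
            counts.setdefault(res_id, 0)
            if in_contact(atoms, partner, cutoff):
                counts[res_id] += 1
    n = len(complexes)
    return {res_id: c / n for res_id, c in counts.items()}
```

The all-pairs distance loop is quadratic in the number of atoms; for hundreds of thousands of poses, a spatial grid or a k-d tree would be a more practical choice.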

Finally, we determined the preferential location of the domain V regions 9–18 and 37–46 on the actin dimer surface by computing, over the 7540 most probable complexes, the probability that actin residues are contacted by one of these segments (Figure 10). The N-WASP regions 9–18 and 37–46 can be retrieved in the cognate binding site of actin chain A but not of chain B. The presence of chain A at the bottom of chain B probably hinders the approach and accommodation of domain V in the binding site of chain B. As for the actin monomer, we also observed that N-WASP segments 9–18 and 37–46 can bind to other patches of the actin's surface with high probability, notably at residues K191, E195, R256, and F266, which are located at the top of the back of the actin dimer (Figure 10). It is not clear to us whether these non-productive associations are artifacts. Nevertheless, since the consensus sequences "LKKV" have low probabilities of contacting actin, large conformational changes of domain V are likely to occur after the formation of the encounter complexes. Only a productive encounter complex in which the cognate binding site of actin accommodates N-WASP segment 9–18 or 37–46 will be able to form the correct quaternary structure.

**Figure 10.** (**A**) Residue-specific probability that residues of actin dimer chain A (bottom) and chain B (top) lie closer than 4 Å to N-WASP domain V regions 9–18 or 37–46 in the ensemble of 7540 ternary complexes generated by docking. Red dashed lines indicate the actin residues in contact with the N-WASP helical segment in the X-ray structure 2VCP [27]. (**B**,**C**) Back and front views of the actin dimer surface colored according to these probabilities. Blue, white, and red colors indicate actin residues with low, intermediate, and high probabilities of being contacted by N-WASP domain V regions 9–18 or 37–46, respectively. As a reference, yellow and green ribbons represent the helical regions and consensus sequences "LKKV" of the two WH2 motifs observed in the X-ray structure 3M3N [26].

**Figure 11.** Side view of the two best 2:1 actin–domain V encounter complexes with N-WASP segment 9–18 (**left**) or 37–46 (**right**) located and oriented as in structure 3M3N. Black balls are N-terminal C*α*-atoms of domain V. Red and magenta ribbons represent its regions 9–18 or 37–46 and consensus sequences "LKKV", respectively. As a reference, yellow and green ribbons indicate the helical and LKKV regions of domain V in 3M3N.

These productive actin–domain V encounter complexes were identified among the 7540 most probable complexes as those with segment 9–18 or 37–46 making contacts with at least 6 of the 8 hot-spot residues of 3M3N actin chain A (Y143, G146, T148, G168, Y169, L349, T351, and M355), and correctly oriented so that the conserved sequence 22LKKV25 or 50LKSV53 can reach its cognate binding site. We found 10 and 13 productive encounter complexes in which N-WASP segments 9–18 and 37–46 are bound to actin chain A, respectively (Tables S4 and S5). The two complexes for which the regions 9–18 or 37–46 have the lowest RMSD relative to structure 3M3N are displayed in Figure 11. In all productive encounter complexes found, regions 22LKKV25 or 50LKSV53 are detached from actin, and actin chain B is not contacted by other parts of N-WASP domain V. The presence of chain B in the actin dimer does not seem to influence the recognition of N-WASP segments 9–18 or 37–46 by actin chain A. Besides, several representative structures of the domain V conformational ensemble (clusters 105, 145, 230, 333, 407, and 411) were retrieved in the most probable encounter complexes on both the monomeric (2VCP) and dimeric (3M3N) states of actin. Nevertheless, as previously seen, the subsequent binding of residues 22LKKV25 or 50LKSV53 to actin was not persistent in our MD simulation of complexes with the actin monomer, but this association might be stabilized by the presence of a second chain in complexes with the actin dimer. This hypothesis could be assessed using extensive MD simulations. Unfortunately, our limited computational resources for this project did not allow us to perform these calculations.
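The selection criterion for productive encounter complexes can be written as a short filter. The sketch below assumes that the set of actin chain A residues contacted by segment 9–18 or 37–46 has already been computed for each complex, and that the orientation check is supplied as a precomputed boolean. How orientation was evaluated is not detailed here, so that flag and the function name are hypothetical.

```python
# Hot-spot residues of 3M3N actin chain A, as listed in the text
HOT_SPOTS = frozenset(
    {"Y143", "G146", "T148", "G168", "Y169", "L349", "T351", "M355"}
)


def is_productive(contacted_residues, correctly_oriented, min_hits=6):
    """Apply the 'at least 6 of the 8 hot spots' rule plus the orientation test.

    contacted_residues : iterable of actin chain A residue labels contacted by
                         N-WASP segment 9-18 or 37-46 in one complex
    correctly_oriented : bool, assumed to come from a separate orientation
                         check (hypothetical input)
    """
    hits = len(HOT_SPOTS & set(contacted_residues))
    return hits >= min_hits and correctly_oriented
```

For example, a complex contacting Y143, G146, T148, G168, Y169, and L349 with the correct orientation passes the filter, whereas the same contacts with a reversed orientation do not.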

### **3. Discussion**

The characterization of the early events of protein–protein recognition involving intrinsically disordered proteins (IDPs) is important for better understanding the molecular bases of regulation and signaling processes occurring in cells. This task is very challenging with current experimental techniques and can be fruitfully complemented by molecular modeling. However, MD simulations of encounter complexes starting from separated proteins are computationally very demanding and require extremely long trajectories in the case of IDPs. In this study, we propose a less expensive approach that consists, first, of discretizing the large conformational ensemble of the IDP into representative structures of the most populated clusters; second, of generating the protein–protein encounter complexes by rigid coarse-grained protein–protein docking; and, finally, of performing MD calculations on a few selected productive complex conformations.

This approach was used to study the recognition by actin of the two WH2 motifs of N-WASP domain V, which is largely disordered in the free state. Several crystallographic structures of actin–WH2 motif complexes show that the N-terminal part of the WH2 motif is folded into an amphiphilic *α*-helix located in a cleft at the bottom of actin, and that its consensus sequence "LKKV" has a rather extended conformation lying on the actin front surface (Figures 2 and 6). The pathway leading to these bound states remains largely unknown, especially in the case of tandems of WH2 motifs, which bind two actins.

Previously, we identified several structures with transient *α*-helices at regions 9–18 and 37–46 in the unbound domain V conformational ensemble [29]. Our present docking calculations showed that these two regions are indeed preferential binding sites for actin (Figure 1). Our results also suggest that conformations with regions 9–18 or 37–46 completely structured in *α*-helix are not preferentially recognized, and that less folded conformations can be equally accommodated in the cognate binding site on actin (Tables S2–S5). Knowing the binding location on actin's surface of the conserved segments 22LKKV25 or 50LKSV53, it is apparent that non-specific association and orientation of regions 9–18 and 37–46 on actin's surface cannot produce the observed quaternary structure of actin–WH2 motif complexes. Our MD simulations of a productive encounter complex even showed that, when the recognized helical region 37–46 of N-WASP is initially correctly located and oriented in the actin cognate binding site, a slight displacement of this region toward the bottom of actin prevents the segment 50LKSV53 from reaching and binding its specific site on actin (simulation CplxB\_MD2).

In our modeling procedure, it should be noted that only the 7030 encounter complexes with the highest 2/3B*best* score among the 702,920 generated by docking were deemed probable and subsequently analyzed. Although restricting the analysis to this subset could cause some relevant structures to be missed, it is much larger than the number of docking solutions usually analyzed to find near-native protein–protein interfaces (up to 1000) [30]. This provides reasonable confidence that our modeling generated relevant quaternary structures. Besides, the 7030 analyzed structures can be considered representative of both the productive and non-productive encounter complexes (Figure 2), as they probably appear in vitro or in vivo. Strikingly, in all productive encounter complexes, the consensus sequence "LKKV" of WH2 motifs is found distant from actin's surface (Figure 3). This indicates that large-amplitude motions of these segments are likely to occur in a second step to enable the formation of the final quaternary structure, as illustrated in our MD simulations of CplxA (Figures 5 and 7). Thus, we think that our modeling study goes beyond the prediction of the actin–N-WASP complex quaternary structure and also provides insight into its mechanism of formation. To sum up, our study of actin monomer recognition by N-WASP domain V indicates that actin first binds domain V regions 9–18 or 37–46, which are partially folded into amphiphilic helical structures, mainly through hydrophobic interactions. Then, the charged segments 22LKKV25 or 50LKSV53, driven by electrostatic forces, move and attach to their cognate site on actin's surface.

When the binding of domain V to a longitudinal actin dimer was considered, our docking calculations showed that N-WASP helical regions 9–18 and 37–46 can bind their cognate binding sites, but preferentially on actin chain A, access to the specific binding site on chain B being more restricted (Figure 10). Nevertheless, this result might depend on the quaternary structure of the actin dimer, particularly on the actin–actin interface, which can vary significantly, as observed in various crystallographic structures of actin oligomers (3M3N [26], 4JHD [32], and 6FHL [33]). Altogether, our results allow us to propose the following model for the early events of association of N-WASP domain V with two actins and the formation of a ternary complex with a longitudinal filament-like actin dimer, as observed in structure 3M3N (Figure 12). From isolated actin chains and N-WASP domain V, three possible binary complexes can be formed (States II-a, II-b, and II-c). In State II-a, the second WH2 motif attached to actin chain B prevents the approach and binding of chain A [11,15,34] and thus disfavors the formation of intermediate State III-a. When the actin dimer is already formed, our docking calculations indicate that the binding of the second WH2 motif of N-WASP to actin chain B is not favorable. Thus, the direct formation of the ternary State III-a from a preformed actin dimer and the evolution of intermediate State III-b toward the final complex are both very unlikely. These considerations imply that the final state is likely formed through an intermediate ternary complex in which the two WH2 motifs are bound to two loosely interacting actin chains (State III-c). Then, this highly flexible assembly evolves toward the final state through the association of the two actin chains into a longitudinal dimer. This model suggests that the binding of N-WASP domain V to an actin dimer would not be a cooperative process, in line with fluorescence titration experiments reported by Gaucher et al. [27].

**Figure 12.** Possible pathways toward the formation of a 2:1 actin–domain V complex with a longitudinal actin dimer as observed in 3M3N. Starting from two actin chains and one N-WASP domain V (State I), three possible binary encounter complexes can be formed (States II-a, II-b, and II-c), leading to three possible intermediate ternary complexes (States III-a, III-b, and III-c) just before the final structure (State IV). Cyan and red arrows indicate the binding of the N-WASP first and second WH2 motif to actin chain A and B, respectively. Dark grey arrows represent the binding of two actins into a longitudinal dimer.

During this process, it is not clear whether the binding of the conserved sequences 22LKKV25 and 50LKSV53 to their cognate sites occurs before the formation of the longitudinal dimer. In crystallographic structure 2VCP, the four residues 50LKSV53 are found attached to the actin's surface, but our MD simulations in explicit water indicate that this binding is rather transient in 1:1 actin–domain V complexes. We speculate that the interactions between the consensus sequences and actin might guide the dynamics of dimerization into longitudinal assemblies. Altogether, our model for the early events of domain V association with two actins might explain how the two WH2 motifs of N-WASP favor the formation of a longitudinal filament-like conformation of the actin dimer and why they induce more rapid actin polymerization than proteins of the WASP family with only one WH2 motif [28].
