### *2.1. Number of DOT Regions in Protein–RNA Complexes and Length of DOT Regions*

The variation in the number of DOT regions in non-ribosomal and ribosomal complexes is shown in Figure 1. Most complexes have fewer than three DOT regions (88% of non-ribosomal and 100% of ribosomal complexes). Most non-ribosomal proteins have one DOT region, whereas most ribosomal proteins have two or more. The largest number of DOT regions found in any single complex in our dataset is eight.

**Figure 1.** Percentage of protein–RNA complexes containing different numbers of DOT regions.

Further, we analysed the length of each DOT region in non-ribosomal and ribosomal protein–RNA complexes (Figure 2). Most DOT regions are short: in both non-ribosomal and ribosomal complexes, more than 70% of DOT regions are 3 to 10 residues long, and only five regions are longer than 50 residues. This suggests that only a small conformational change may be required to achieve shape complementarity in protein–RNA complexes, and that these short DOT regions help to achieve it.
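As an illustration of how such a length distribution can be tabulated, the following minimal sketch groups DOT-region lengths into coarse bins and reports percentages. The bin edges (3-10, 11-50, >50), the function names, and the example lengths are assumptions chosen to mirror the thresholds mentioned in the text; they are not the bins used in Figure 2.

```python
from collections import Counter

BIN_LABELS = ("3-10", "11-50", ">50")  # assumed bins, not those of Figure 2


def length_bin(length):
    """Map a DOT-region length (in residues) to a coarse bin label."""
    if length <= 10:
        return "3-10"
    if length <= 50:
        return "11-50"
    return ">50"


def length_distribution(lengths):
    """Return the percentage of regions that fall in each length bin."""
    total = len(lengths)
    if total == 0:
        return {label: 0.0 for label in BIN_LABELS}
    counts = Counter(length_bin(n) for n in lengths)
    return {label: 100.0 * counts[label] / total for label in BIN_LABELS}


# Hypothetical region lengths, for illustration only (not from the dataset).
example_lengths = [3, 4, 5, 7, 8, 10, 12, 25, 60, 6]
distribution = length_distribution(example_lengths)
```

Applying the same function separately to the non-ribosomal and ribosomal sets would give the two panels of a length-distribution plot.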

**Figure 2.** Length distribution of DOT regions in protein–RNA complexes in (**a**) non-ribosomal and (**b**) ribosomal complexes.

### *2.2. Binding Frequency of Residues at DOT Regions*

The binding frequencies of residues in DOT regions at distance cut-offs of 3.5 Å (NR3.5 and RB3.5) and 6 Å (NR6 and RB6) are shown in Figure 3 and Figure S2, respectively. Among the positively charged residues (Arg, Lys and His), Arg and Lys show a high preference for binding in both the NR3.5 (Figure 3a) and NR6 (Figure S2a) datasets. Interestingly, only 8 and 13 of the 20 residue types are observed in binding DOT regions in the RB3.5 (Figure 3b) and RB6 (Figure S2b) datasets, respectively, and Arg has the highest binding frequency. Cys, Met, and Trp in DOT regions are not involved in binding RNA, whereas in ordered complexes 0.97%, 4.52%, and 5.54% of Cys, Met, and Trp, respectively, are involved in binding. A comparison of binding-site residues in DOT regions with those in the whole protein gives an expected presence of 1.5% for Met and 2.7% for Trp at the interface of DOT regions. These results show that the absence of Met and Trp at the interface of DOT regions is statistically significant.
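To make the significance argument concrete, one simple way to reason about it is a binomial null model: if a residue type is expected at fraction p among interface residues, the chance of seeing none among n interface residues is (1 - p)^n. The sketch below is an assumption-laden illustration only. The text does not state which statistical test was used or how many DOT-region binding residues were counted, so `n_hypothetical` and the function name are placeholders; only the expected fractions (1.5% and 2.7%) come from the text.

```python
def prob_no_occurrence(expected_fraction, n_binding_residues):
    """P(zero occurrences) under a binomial null model: (1 - p) ** n."""
    return (1.0 - expected_fraction) ** n_binding_residues


# Hypothetical number of binding residues in DOT regions (not from the paper).
n_hypothetical = 300

# Expected fractions for Met and Trp as quoted in the text.
p_met = prob_no_occurrence(0.015, n_hypothetical)
p_trp = prob_no_occurrence(0.027, n_hypothetical)
```

Under this model the probability of complete absence shrinks quickly as the number of interface residues grows, which is why a total absence can be significant even for residue types that are rare to begin with.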

**Figure 3.** Amino acid frequency of binding in the DOT region for (**a**) non-ribosomal and (**b**) ribosomal protein–RNA complexes.

We computed the binding preference of residues in DOT regions by dividing the number of residues in DOT regions by the total number of binding residues. The results are presented in Figure 4 and Figure S3 for the 3.5 Å (NR3.5 and RB3.5) and 6 Å (NR6 and RB6) datasets, respectively. In Figure 4a, a high frequency of Arg, Gly, Lys, and Ser (*z*-score > 1) is observed for the NR3.5 dataset, which suggests that, among all residues in contact with RNA, these residues are more likely to lie in DOT regions. For the NR6 dataset (Figure S3a), however, the result is consistent only for Lys, and two other residues (Glu and Pro) show a high binding frequency. In ribosomal complexes, Ala and Glu have high frequencies at the 3.5 Å cut-off (Figure 4b), and Glu and Tyr at the 6 Å cut-off (Figure S3b).
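The ratio and *z*-score threshold described above can be sketched as follows. This is a minimal illustration under stated assumptions: the numerator is read as the number of RNA-contacting residues of each type that lie in DOT regions, the *z*-score is taken across residue types using the population standard deviation, and all counts and function names are hypothetical rather than taken from the datasets.

```python
from statistics import mean, pstdev


def dot_preference(dot_binding_counts, all_binding_counts):
    """Per residue type: binding residues in DOT regions / all binding residues."""
    return {
        aa: dot_binding_counts.get(aa, 0) / total
        for aa, total in all_binding_counts.items()
        if total > 0
    }


def z_scores(values):
    """Standardize values across residue types (population std, an assumption)."""
    data = list(values.values())
    mu = mean(data)
    sigma = pstdev(data)
    if sigma == 0:
        return {aa: 0.0 for aa in values}
    return {aa: (v - mu) / sigma for aa, v in values.items()}


# Hypothetical counts for five residue types, for illustration only.
all_binding = {"Arg": 200, "Lys": 150, "Gly": 100, "Leu": 80, "Asp": 50}
dot_binding = {"Arg": 60, "Lys": 30, "Gly": 10, "Leu": 4, "Asp": 2}

preference = dot_preference(dot_binding, all_binding)
scores = z_scores(preference)
enriched = sorted(aa for aa, z in scores.items() if z > 1)
```

Residue types with no binding residues at all are skipped instead of being assigned a zero preference, so that they do not distort the mean and standard deviation.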

**Figure 4.** Frequency of DOT regions by contact residues for (**a**) non-ribosomal and (**b**) ribosomal complexes.

### *2.3. Binding Propensity of Residues at DOT Region*

Propensity is calculated by normalizing the binding frequency of residues in DOT regions by the overall frequency of the respective residues in proteins, using Equation (3). This measures the bias in binding of residues in DOT regions independently of their count in DOT regions. We calculated the propensity of amino acids to be in DOT regions using distance cut-offs of 3.5 Å and 6 Å; the results are shown in Figure 5 and Figure S4, respectively. In the NR3.5 dataset (Figure 5a), His, Arg, Asn, Gln, Phe, and Tyr have a high binding propensity, whereas in ribosomal proteins (RB3.5 dataset; Figure 5b) only His shows a high propensity. In the NR6 dataset (Figure S4a), His has a high propensity, whereas Asn, His and Tyr have a high propensity in the RB6 dataset (Figure S4b). Similarly, a high binding propensity has been observed for positively charged residues, along with Tyr and Phe, in protein–RNA complexes [34]. In protein–protein complexes, by contrast, Arg is the only charged residue with a high tendency to bind at DOT regions [35]. Furthermore, non-specific interactions occur frequently in protein–protein complexes, which is not a common trend among the binding residues of DOT regions in protein–RNA complexes. We therefore infer that the preferred residues at DOT regions are specific to protein–RNA complexes and that charged interactions, in particular, are important for RNA binding in DOT regions.
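Equation (3) is not reproduced in this section, so the following is only a sketch of one common form of such a normalization: the fraction of DOT-region binding residues that are of a given type, divided by the fraction of all protein residues that are of that type. The function name, the exact form of the ratio, and the counts are assumptions for illustration and may differ from the definition in Equation (3).

```python
def binding_propensity(dot_binding_counts, protein_counts):
    """Ratio of binding fraction in DOT regions to background fraction in proteins.

    Assumed form, not necessarily identical to Equation (3):
        (n_binding[aa] / N_binding) / (n_protein[aa] / N_protein)
    """
    total_binding = sum(dot_binding_counts.values())
    total_protein = sum(protein_counts.values())
    propensity = {}
    for aa, n_protein in protein_counts.items():
        if n_protein == 0 or total_binding == 0:
            continue  # propensity is undefined without a background or any binding
        binding_fraction = dot_binding_counts.get(aa, 0) / total_binding
        background_fraction = n_protein / total_protein
        propensity[aa] = binding_fraction / background_fraction
    return propensity


# Hypothetical counts, for illustration only (not from the datasets).
dot_binding = {"His": 10, "Arg": 30, "Leu": 10}
protein = {"His": 50, "Arg": 150, "Leu": 300}
propensity = binding_propensity(dot_binding, protein)
```

In this toy example His and Leu contribute the same number of binding residues, yet His has a propensity of 2 and Leu of about 0.33, because His is much rarer in the background. This is the sense in which propensity is independent of raw counts: a value above 1 indicates over-representation at the interface relative to overall abundance.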

**Figure 5.** Propensity for amino acids in (**a**) non-ribosomal and (**b**) ribosomal complexes.
