**1. Introduction**

Intrinsically disordered proteins (IDPs) lack stable three-dimensional structures under physiological conditions and perform important roles in several processes, including signalling, enzymatic activity, and gene regulation [1,2]. To perform these functions, intrinsically disordered regions (IDRs) interact with proteins, RNA, DNA, and small molecules and gain ordered structures [3,4]. Experimentally, interactions mediated by IDRs can be observed using NMR and X-ray crystallography. However, because these experiments suffer from poor resolution and difficulties in crystallization, and are costly in time and resources, computational methods are necessary to identify disorder-mediated interactions [5,6].

Several methods have been developed for characterizing protein disorder using sequence or structural information [7–10]. In addition, the disorder-to-order transition of regions involved in protein–protein interactions (PPI) has been well studied experimentally and computationally [8–12]. For example, LMO4, a putative breast oncoprotein, interacts with various tandem LIM-domain-containing proteins through disordered regions [13]. BRCA1, a tumour suppressor protein, binds multiple protein and DNA partners through its central disordered region of ~1500 amino acids [14]. Recently, Papadakos et al. [15] showed that inducing intrinsic disorder in high-affinity protein–protein interactions reduces the binding affinity.

Many proteins contain disordered regions, and some of these regions attain ordered structures upon binding to their cognate substrates; such segments are known as molecular recognition features (MoRFs) [16,17]. Sugase et al. [18] have shown that folding and binding of IDPs or IDRs are coupled processes. Furthermore, binding partners have also been shown to influence the affinity and kinetics of binding. The flexibility of IDPs helps them bind multiple partners and engage in co-operative interactions [19]. Although induced fit and conformational selection have been proposed to explain the coupling of folding and binding, it is not known which model IDPs prefer [11,20].

The dynamic nature of RNA molecules makes them amenable to disorder-mediated interactions with proteins [21]. Protein–RNA recognition has been studied experimentally using the electrophoretic mobility shift assay (EMSA), the yeast three-hybrid assay, pull-down assays, and cross-linking and immunoprecipitation (CLIP) [22,23]. In parallel, many computational tools have been developed to identify binding sites in RNA-binding proteins [24–32]. All these methods use sequence information to compute features and/or evaluate performance. Recently, Peng and Kurgan [33] developed a webserver for predicting disorder-mediated interactions in protein–RNA, protein–DNA, and protein–protein complexes. However, the mechanisms and factors responsible for the binding of disordered regions to RNA have not yet been fully explored.

In this work, we constructed a dataset of protein–RNA complexes that involve disorder-to-order transitions (DOT); the dataset is provided in the supplementary information. Using this dataset, we analyzed the number and size of DOT regions in protein–RNA complexes, the preference of residues involved in binding in DOT regions, secondary structure, solvent accessibility, pair preference at the interface, preference for different secondary structures of RNA, and the interaction energy between RNA and the DOT and non-DOT regions of the protein at the interface.
