**1. Introduction**

Rett syndrome (RTT; OMIM entry #312750) is a rare disease that was first described by Andreas Rett in 1966 [1]. It is characterized by severe impairments such as deceleration of head growth, loss of speech, seizures, ataxia, movement disorder, and breathing disturbance [2]. Alterations in methyl-CpG-binding protein 2 (*MECP2*), an X-linked gene involved in the regulation of RNA splicing and chromatin remodeling, have been confirmed in approximately 95% of individuals diagnosed with RTT [3], while the remaining cases, regarded as atypical RTT, carry alterations in either cyclin-dependent kinase-like 5 (*CDKL5*) or forkhead box protein G1 (*FOXG1*) [4,5]. Mutations in *MECP2* are generally paternally derived; thus, the syndrome mainly affects girls, with the age of onset varying from 6 to 18 months [2,6]. RTT can also affect males, in whom inactivation of the sole X-linked copy of *MECP2* leads to a severe phenotype and early lethality [7]. In rare cases, it can also occur as somatic mosaicism or co-occur with Klinefelter syndrome in males [8,9]. Although the causative genes have been identified, the rarity of the clinical phenotypes makes diagnosis difficult. Diagnosis is further complicated because many of the clinical features overlap with those of other neurological and neurodevelopmental disorders, and mutations in *MECP2*, *FOXG1*, and *CDKL5* can also cause neurodevelopmental disorders distinct from RTT [10]. As a result, subsequent studies have suggested that alterations in either *CDKL5* or *FOXG1* should be classified as disorders distinct from RTT, as the majority of cases showed some differences in clinical features [11–13]. Moreover, recent studies have suggested that RTT is a monogenic disorder caused by mutations that alter the functionality of the methyl-CpG-binding domain (MBD) and the NCoR/SMRT interaction domain (NID) in *MECP2* [14–16]. This may simplify the development of a treatment strategy. However, a comprehensive molecular-level elucidation of the overlapping symptoms associated with these three proteins also seems necessary, as studies on this topic remain scarce and it may provide meaningful insight, particularly for RTT.

The MeCP2 structure has been determined using various experimental methods, while the structure of FOXG1 has only been investigated by prediction [17,18]. In the case of CDKL5, the structure of the amino-terminal kinase domain has already been determined, but that of the long carboxy-terminal tail has not been clarified [19]. These proteins have been suggested to contain polypeptide segments that are unable to fold spontaneously into three-dimensional structures; these so-called intrinsically disordered regions (IDRs) exist as dynamic ensembles that rapidly interconvert between molten globule (collapsed) and coil or pre-molten globule (extended) states as a result of their relatively flat energy landscapes [20,21]. Whether a region is an IDR or an ordered region (displaying tertiary structure under native conditions) is dictated by its amino acid sequence; IDRs generally lack bulky hydrophobic residues [22]. Proteins are composed of fully structured regions, fully disordered regions (the latter referred to as intrinsically disordered proteins (IDPs)), or a combination of the two, which is the case for most eukaryotic proteins [23]. Although protein function has traditionally been elucidated based on a well-defined structure, it is now widely acknowledged that IDRs contribute to diverse functions, which can be classified into six types: entropic chain activity, display site, chaperone, molecular effector, molecular assembler, and molecular scavenger [23–26]. Excluding entropic chain activity, IDRs adopt specific tertiary conformations, at least locally, in order to perform those functions by binding to other proteins, nucleic acids, membranes, and small molecules or by responding to changes in their environment [20,27]. Hence, IDR structure varies over time; i.e., it exhibits spatiotemporal heterogeneity. Moreover, long IDRs contain more modification sites than fully ordered regions, and their flexibility provides more opportunities for displaying these sites [28,29].
These features explain how IDPs and proteins with IDRs interact with, and are tightly regulated by, various factors to ensure that appropriate levels of proteins are available at the right time and to minimize the possibility of inappropriate protein–protein interactions [26]. Thus, misfolding and altered availability of IDPs or proteins with IDRs are more likely to be associated with disease states. Given that MeCP2, CDKL5, and FOXG1 share these properties, we propose that a collective study of the link between their disordered structural properties and RTT or RTT-like syndromes is necessary.

Restoring *Mecp2* gene function in an animal model abolished the symptoms of RTT. Growth factor stimulation (e.g., insulin-like growth factor 1) and the activation of neurotransmitter pathways (e.g., the β2-adrenergic receptor pathway) can also partially rescue the phenotypes of *Mecp2* knockout mice (RTT model mice), suggesting that the disorder is treatable [15,30,31]. In addition to gene therapy, reactivation of the inactivated X chromosome has been reported as a new therapeutic approach [32,33]. Therapeutic strategies for RTT are under development, and advancing our understanding of this enigmatic disorder requires multiple points of view. Although RTT is considered a monogenic disorder, the complexity of the biological system compels us to broaden our perspective; moreover, MeCP2 contains extensive disordered regions, which may facilitate binding to multiple partners. Considering the points above, we investigated the evolution and molecular features of MeCP2, CDKL5, and FOXG1 and their binding partners using phylogenetic profiling to gain a better understanding of their similarities. Additionally, we predicted the structural order–disorder propensity and assessed the per-site evolutionary rates of MeCP2, CDKL5, and FOXG1 to investigate how disordered structure and other related properties relate to RTT.
