#### **5. Transcription Regulation and LLPS**

Gene transcription requires tight regulation to ensure the physiological balance of the cell. Knowledge of the mechanism of transcription is quite advanced; however, some aspects of its regulation remain unexplored. Recent findings indicate that the regulatory mechanism may depend strongly on spontaneous LLPS. Transcription of tissue-specific genes is initiated at specific genome regions called super-enhancers (SE). SE, first described in embryonic stem cells (ESC) [123], are dense multicomponent assemblies distinct from typical enhancers [124]. Recently, Hnisz et al. [125] performed computational simulations to find a probable explanation for the typical features of SE. The simulations led to the conclusion that the formation, activity and unique properties of SE, such as sensitivity to the concentration of their components, sensitivity to posttranslational modifications and extremely high-frequency bursting [126–128], may originate from the fact that SE are liquid condensates assembled and disassembled via spontaneous LLPS [125]. Hnisz and co-workers were the first to point out the connection and strong dependence between the regulation of transcription initiation at SE and LLPS. Although not experimentally proven, the model serves as a conceptual framework for further research. Recently, Sabari et al. [121] showed that largely disordered BRD4 and the MED1 subunit of the Mediator are in close spatial proximity to one another within SE in murine ESC, and that the co-localised puncta show characteristic features of phase-separated condensates. Moreover, MED1 condensates can incorporate BRD4 and Pol II from nuclear extract [121]. The MED1 subunit also interacts with other major pluripotency TFs, e.g., OCT-4 [129] and estrogen receptor (ER) [130], forming liquid-like puncta at SE of the key pluripotency genes [121,122]. MED1 condensates depend on OCT-4 occupancy [122] and are crucial for the initiation of tissue-specific gene transcription at SE [122,131].
In vitro analyses indicated that the formation of MED1-OCT4 liquid condensates occurs via electrostatic interactions and involves acidic residues enriched in the disordered activation domain of OCT-4 [122]. Interestingly, ER interacts with the MED1 subunit via the LXXLL motif [132], an interaction that involves the ordered ligand-binding domain. This interaction is regulated by estrogen, which means that not only interactions between disordered regions but also interactions between disordered and ordered regions play a role in LLPS-driven transcription regulation [122]. Wu et al. [120] showed that the largely disordered transcription co-activator TAZ forms liquid condensates in vitro and *in vivo*. TAZ condensates compartmentalize the DNA-binding cofactor TEAD4 and other components of the transcription initiation machinery, including BRD4, MED1 and CDK9. Importantly, a deletion mutant that is unable to undergo spontaneous LLPS cannot initiate transcription, although it is still able to bind TAZ partners such as TEAD4.

Importantly, there is some evidence that not only the initiation but also the elongation of transcription depends on LLPS. Transcription elongation requires hyper-phosphorylation of the YSPTSPS consensus sequence, which is repeated multiple times in the disordered C-terminal domain (CTD) of Pol II [133–136]. pTEFb, which initiates the elongation phase, consists of the CDK9 kinase associated with cyclin T1 (CycT1). Lu and co-workers [76] focused on the function of the long C-terminal IDR of CycT1 in the regulation of CDK9 activity. They revealed that a histidine-rich domain (HRD) located in the IDR of CycT1 (residues 480–550) is directly involved in the regulation of the kinase activity [76]. Interestingly, an HRD is also present in some other kinases, for example Dyrk1A, which phosphorylates the CTD of Pol II. Importantly, a homologous kinase, Dyrk3, was shown to be responsible for the disassembly of stress granules [137] and other cellular condensates during cell division [138]. In vitro studies using a set of recombinant IDRs of CycT1 and Dyrk1A revealed that these regions can undergo phase separation in an HRD-dependent manner. The HRD was shown to form condensates that compartmentalize the kinases and the substrate, which enables efficient reactions resulting in hyper-phosphorylation of the CTD of Pol II [76]. Interestingly, the CTD of Pol II can undergo spontaneous LLPS in vitro only in a non-phosphorylated state. The weak CTD-CTD interactions keep the enzyme molecules in hubs within the nucleoplasm. Phosphorylation changes the interaction pattern, allowing the CTD to engage in new multivalent interactions with selected partners [139]. These results indicate that LLPS allows for the condensation of cofactors, which in turn triggers posttranslational modifications leading to the reorganization of the condensate components. As a result, Pol II escapes from the promoter site and enters the active elongation stage [76].

Currently, not much is known about the proteins responsible for the formation of the condensates that are important for transcription regulation. The question of which proteins are the scaffolds and which are the clients remains unanswered. Importantly, not much is known either about the involvement of the bHLH TFs in LLPS, although they are key players in many important cell differentiation and organism development pathways. As we discussed in the previous section, bHLH proteins possess long IDRs which could interact with different partners and be engaged in LLPS. This hypothesis is supported by the experimental verification of the ability of MyoD to undergo LLPS [122], and by the ability, discussed in the previous section, of some bHLH TFs to interact with the Mediator subunits or other elements of the machinery that modifies chromatin accessibility. Interestingly, regulation of the circadian clock by BMAL1 involves binding of CBP, which occurs in discrete nuclear foci. This led to the hypothesis that the formation of nuclear bodies containing BMAL1/CBP provides transcriptionally active sites for target genes such as *Per1-2* [34]. Taking the above into consideration, we asked whether the ability to undergo LLPS is a more general property of the bHLH TFs. As the previously performed disorder prediction gave positive results, and disorder was shown to be important for LLPS initiation [76,121,122], we decided to perform in silico analyses to predict whether members of the bHLH family contain putative sequences able to form liquid condensates. We used the catGranule program (http://service.tartaglialab.com/update\_submission/216885/dd56e32a89) to computationally analyse the putative propensity to undergo LLPS [140] of bHLH proteins representing all established classes (see Table 1). The prediction results showed that hHEB (class I), hMyoD (class II), hMYC and atMYC2 (class III) (Figure 4) contain sequences with a positive LLPS propensity score.
Interestingly, the class IV regulators that do not possess a TAD, hMAD1 and hMAX, like the transcription repressors hID4 (class V) and hHES (class VI), present a very low or even negative score along the whole protein sequence (Figure 5). The bHLH-PAS transcription factors representing class VII, hAHR, hHIF-1α, hCLOCK and hARNT, were predicted to contain some sequences with a high propensity score (Figure 6). Especially interesting is the observation that the transcription repressors show a very low propensity score for LLPS, in contrast to transcription activators such as hHEB or atMYC2. It is possible that the bHLH repressors inhibit transcription by preventing the spontaneous phase separation required to form a complete initiation complex. This hypothesis is supported by the observation for TAZ mutants [120], discussed above.
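To illustrate how a per-residue propensity profile of the kind plotted in the figures could be scanned for positive-scoring regions, a minimal Python sketch is given here. It does not reproduce the catGranule algorithm [140]. The function names `smooth_profile` and `positive_regions`, the window of 5 residues, the threshold of 0, the minimum region length of 3 and the toy profile are all assumptions made for illustration only.

```python
from typing import List, Tuple


def smooth_profile(scores: List[float], window: int = 5) -> List[float]:
    """Centered moving average; the window shrinks at the sequence ends."""
    if window < 1 or window % 2 == 0:
        raise ValueError("window must be a positive odd integer")
    half = window // 2
    smoothed = []
    for i in range(len(scores)):
        lo = max(0, i - half)
        hi = min(len(scores), i + half + 1)
        smoothed.append(sum(scores[lo:hi]) / (hi - lo))
    return smoothed


def positive_regions(
    profile: List[float], threshold: float = 0.0, min_len: int = 3
) -> List[Tuple[int, int]]:
    """Return 1-based inclusive (start, end) runs where profile > threshold."""
    regions = []
    start = None  # 0-based index where the current run began
    for i, value in enumerate(profile):
        if value > threshold:
            if start is None:
                start = i
        elif start is not None:
            if i - start >= min_len:
                regions.append((start + 1, i))
            start = None
    # Close a run that extends to the last residue.
    if start is not None and len(profile) - start >= min_len:
        regions.append((start + 1, len(profile)))
    return regions


if __name__ == "__main__":
    # Hypothetical per-residue scores, NOT real catGranule output.
    toy_profile = [-1.0, 0.5, 0.6, 0.7, -0.2, 0.1, 0.9]
    print(positive_regions(toy_profile))
```

Under these assumptions, a protein whose profile yields no region above the threshold would correspond to the "very low or even negative" case described for the repressors, while the choice of threshold itself remains arbitrary unless calibrated against known LLPS-forming controls.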

**Figure 4.** Prediction of propensity of LLPS formation. (**A**) class I human HEB [Q99081], (**B**) class II human MYOD [P15172], (**C**) class III human MYC [P01106-2] and (**D**) *Arabidopsis thaliana* MYC2 [Q39204].

**Figure 5.** Prediction of propensity of LLPS formation. (**A**) class IV human MAD [Q05195] and (**B**) human MAX [P61244], (**C**) class V human ID4 [P47928], (**D**) class VI human HES1 [Q14469].

**Figure 6.** Prediction of propensity of LLPS formation for bHLH-PAS proteins. (**A**) human AHR [P35869], (**B**) human HIF-1α [Q16665], (**C**) human CLOCK [O08785], (**D**) human ARNT [P27540].

As the range of the propensity score is not precisely defined, we performed, as a control, catGranule predictions for proteins known to undergo LLPS: nucleophosmin (Figure 7A) and estrogen receptor (Figure 7B), which are deposited in the recently published PhaSePro database (https://phasepro.elte.hu) [141].

**Figure 7.** Prediction of propensity of LLPS formation for representative LLPS-enabled proteins. (**A**) nucleophosmin [P06748], (**B**) estrogen receptor [P03372].

The results of the in silico analyses, compared with the control, show that the selected bHLH proteins have regions that might be involved in multivalent interactions leading to the formation of liquid condensates. What their role in condensate formation would be, and how mutations and incorrect dimerization/interaction would influence the formation of condensates containing bHLH TFs, remains a puzzle. However, we believe that such an important family of TFs, engaged in crucial pathways and related to many severe disorders like cancer, should be a subject of research in this field.

#### **6. Concluding Remarks and Future Perspectives**

In eukaryotic cells, the regulation of transcription is a dynamic process that requires very precise temporal and spatial coordination of proteins assembling functional complexes. The bHLH family comprises a large group of TFs which use a conserved DNA-binding domain to interact with DNA, but also additional, often disordered domains and motifs that allow the formation of a complex interaction network with various transcription co-factors. It is possible that flexible disordered regions of the bHLH proteins play a role in the formation of liquid condensates via LLPS and in this way contribute to the regulation of transcription. To date, however, experimental evidence is lacking. Also, the recently published PhaSePro database for LLPS does not contain any bHLH TF [141]. We believe that this is due to the previously mentioned difficulties with experimental studies of the bHLH proteins, and we expect that some bHLH proteins will be added in the future.

The predictions presented in the previous section may give a hint about the link between LLPS of the bHLH proteins and transcription regulation. An interesting observation is the predicted low LLPS propensity score of transcriptional repressors, in contrast to proteins acting as activators. This raises a question about the functional relevance of this discrepancy between family members. Importantly, the connection between LLPS and transcription regulation is not limited to the direct interaction between transcription regulators at active transcription sites. LLPS forms nuclear bodies that maintain, store and modify transcription regulators. Examples include nuclear speckles, promyelocytic leukemia (PML) bodies, the nucleolus, histone locus bodies and others [142]. Within LLPS-formed condensates, proteins can undergo acetylation/deacetylation or sumoylation, proteasome-dependent degradation and other posttranslational modifications that influence their functionality [143–145]. Importantly, the barrier-free character of these phase-separated condensates allows their components to shuttle between the condensates and the nucleoplasm, and whenever needed, molecules can be recruited from these compartments to transcriptionally active sites. The discovery that LLPS, which is well known in polymer chemistry, can play an important role in molecular biology has definitely brought us closer to understanding cell functionality and the regulation of fundamental cellular processes such as transcription. However, our understanding and detailed knowledge are still rudimentary. Many important questions regarding the LLPS concept in transcription regulation remain unanswered. We do not know which components drive association/dissociation events at the active sites, or which molecules serve as scaffolds conditioning the formation of liquid condensates and which are just clients.
How does the type of client molecule influence the function of the phase-separated condensates? We also do not know which factors alter LLPS, and in what way, leading to pathological processes. What would be the role of the bHLH TFs in condensate formation, and how would mutations and incorrect dimerization/interaction of these proteins affect the formation and function of condensates? These questions, as well as many others, await experimental verification. We believe that such an important family of transcription factors, which is engaged in crucial pathways and related to many severe diseases like cancer and neurodegenerative disorders, should be the subject of further intensive studies.

**Author Contributions:** A.T. and B.G.-M. wrote the paper.

**Funding:** The work was supported by a subsidy from the Polish Ministry of Science and Higher Education for the Wroclaw University of Science and Technology, Faculty of Chemistry.

**Acknowledgments:** The authors apologize to investigators whose contributions were not cited more extensively because of space limitations.

**Conflicts of Interest:** The authors declare no conflict of interest.
