Review

A Compendium of G-Flipon Biological Functions That Have Experimental Validation

Discovery, InsideOutBio, 42 8th Street, Unit 3412, Charlestown, MA 02129, USA
Int. J. Mol. Sci. 2024, 25(19), 10299; https://doi.org/10.3390/ijms251910299
Submission received: 21 August 2024 / Revised: 16 September 2024 / Accepted: 18 September 2024 / Published: 25 September 2024

Abstract

As with all new fields of discovery, work on the biological role of G-quadruplexes (GQs) has produced a number of results that at first glance are quite baffling, sometimes because they do not fit well together, but mostly because they are different from commonly held expectations. Like other classes of flipons, those that form G-quadruplexes have a repeat sequence motif that enables the fold. The canonical DNA motif (G3N1–7)3G3, where N is any nucleotide and G is guanine, is a feature that is under active selection in avian and mammalian genomes. The involvement of G-flipons in genome maintenance traces back to the invertebrate Caenorhabditis elegans and to ancient DNA repair pathways. The role of GQs in transcription is supported by the observation that yeast Rap1 protein binds both B-DNA, in a sequence-specific manner, and GQs, in a structure-specific manner, through the same helix. Other sequence-specific transcription factors (TFs) also engage both conformations to actuate cellular transactions. Noncoding RNAs can also modulate GQ formation in a sequence-specific manner and engage the same cellular machinery as localized by TFs, linking the ancient RNA world with the modern protein world. The coevolution of noncoding RNAs and sequence-specific proteins is supported by studies of early embryonic development, where the transient formation of G-quadruplexes coordinates the epigenetic specification of cell fate.

1. Introduction

The idea that the repetitive genome encodes genetic information by shape rather than by sequence is relatively new. The unit of information is the flipon, a genomic element that can adopt alternative structures under physiological conditions. The conformation formed depends on the repeat sequence involved. The classic example is provided by left-handed Z-DNAs and Z-RNAs (collectively called ZNAs) that are formed by runs of alternating guanosine and cytosine [1,2]. Collectively, the repetitive genome comprises over 50% of the human sequence, compared to 2.5% for protein-coding genes.
Flipons in the B-DNA conformation have little informational value, as the repeats are frequent in the genome. They also lack the complexity of codons, so they do not contribute directly to the Watson and Crick genetics that focuses on protein variation. Instead, flipons alter the readout of genetic information by localizing structure-specific complexes to genomic loci able to power the flip from a right-handed B-DNA or A-RNA helix to an alternative DNA or RNA fold. The readout of RNAs then varies dynamically with flipon structure. Here, the focus is on G-flipons that form G-quadruplexes (GQs) in DNA (dGQs), RNA (rGQs) or DNA/RNA hybrids (hGQs). GQs are inherently more stable than ZNA helices. Consequently, G-flipons can actuate biological processes that are quite distinct from those modulated by Z-flipons.
GQ-forming sequences are defined by the canonical DNA motif (G3N1–7)3G3, where G is guanine and N is any nucleotide. Four G-bases hydrogen-bond to each other to form a tetrad that then folds into a four-stranded structure. In place of the Watson–Crick base-pairing scheme, the rather unconventional Hoogsteen hydrogen bonds stabilize the interaction. The G-tetrad was first observed in X-ray diffraction studies of 5′-GMP and 3′-GMP gels, each stacking the tetrads on top of one another differently [3]. The preferred helical arrangement of GQ crystalline fibers was later revealed by structural studies of polyinosinic and polyguanylic RNAs [4].
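The canonical motif lends itself to a simple sequence scan. The following is a minimal sketch, assuming the motif is read literally as four runs of three guanines separated by loops of one to seven nucleotides; the names `GQ_MOTIF`, `reverse_complement` and `find_gq_motifs` are illustrative and do not come from any published tool. A match only flags a sequence with the potential to fold, not a GQ that forms in vivo.

```python
import re

# Canonical motif from the text: (G3 N1-7)3 G3, where N is any nucleotide.
# Assumption: runs of exactly three G are required and loops are 1-7 nt long.
GQ_MOTIF = re.compile(r"(?:G{3}[ACGT]{1,7}){3}G{3}")

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an upper-case DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def find_gq_motifs(seq: str):
    """Scan both strands for the canonical motif.

    Returns a list of (strand, start, end, matched_sequence) tuples.
    Coordinates are 0-based, end-exclusive, and always refer to the
    input (forward) strand. Matches are non-overlapping per strand.
    """
    seq = seq.upper()
    hits = []
    # Forward strand: G-rich motif read directly.
    for m in GQ_MOTIF.finditer(seq):
        hits.append(("+", m.start(), m.end(), m.group()))
    # Reverse strand: search the reverse complement, then map back.
    rc = reverse_complement(seq)
    n = len(seq)
    for m in GQ_MOTIF.finditer(rc):
        hits.append(("-", n - m.end(), n - m.start(), m.group()))
    return hits


if __name__ == "__main__":
    # Four vertebrate telomeric repeats with short flanks (illustrative input).
    for hit in find_gq_motifs("AAGGGTTAGGGTTAGGGTTAGGGTT"):
        print(hit)
```

Because the pattern is deliberately strict, it misses the bulged, two-tetrad and long-loop folds described later in this review, which is one reason experimentally validated maps are preferred over sequence prediction alone.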
It was once widely believed that GQs did not exist in cells. If present, the GQs formed were thought to predispose cells to genetic instability and to disease [5]. There was much excitement when the Tetrahymena telomere sequence repeats [6] were shown to form GQs [7]. In contrast, later work revealed that telomeres in vivo were more likely to form a different type of structure called a T-loop [8]. The closure of the loop led to the formation of a three-stranded DNA structure that incorporated the single-stranded telomeric end and a subtelomeric segment. This structure was protected by a shelterin protein complex. The T-loop model seemingly ruled out a role for GQs in telomere maintenance (but see below). The prevailing view that GQs were bad was reinforced by the many loss-of-function (LOF) helicase variants that were associated with human Mendelian diseases. The failure of these variants to resolve GQs was considered causal for genomic instability, even though the helicases also resolved other non-B structures, such as cruciforms and Holliday junctions (HJs) that form during recombination [9]. Further, a role for GQs in pathology was suggested by an analysis of repeat expansion diseases. In some cases, the sequences involved were predicted to freeze in the GQ conformation, thereby interfering with a variety of cellular functions, including DNA replication, transcription and RNA processing [10].
However, there was evidence that GQs played an essential biological role in the adaptive immune system. The GQs were associated with the class switch recombination of immunoglobulin heavy chain (IgH) genes. Of interest were the noncoding switch (S) regions in the IgH gene that underwent transcription to produce R-loops. The non-template strand was G-rich and 2 to 10 kb in length. When displaced by RNA transcripts, the single-stranded G-rich DNA was able to fold back on itself to form GQs [11]. The targeting of the AID cytidine deaminase protein to the GQ structure by helicase DDX1 was essential for both class switching and the immunoglobulin somatic hypermutation that is critical for antibody affinity maturation [11,12,13,14]. The cytosine-to-uridine substitution catalyzed by the cytidine deaminase was not only mutagenic but also recruited the repair machinery required for DNA recombination. In other contexts, GQ formation in G-rich DNA due to R-loop formation was proposed as pathogenic [15].
Other experimental approaches to unraveling the biology of GQs were complicated by the equilibrium that exists between different flipon conformations, with the transition occurring in unmodified DNA and without requiring any strand cleavage [1]. Early experiments using dimethyl sulfate (DMS) footprinting of RNA failed to show the protection of guanine bases expected if a GQ had formed inside a cell [16]. These results were interpreted to show that GQs were not biologically relevant. However, there was a problem with the experimental design: chemical modification of any G-quadruplexes that unfolded during the time course of the experiment would prevent the structure from reforming [17]. In other words, the longer the experiment ran, the less chance there was of detecting the presence of GQs in a cell. Nevertheless, the study highlighted the possibility that GQs were formed dynamically in cells and that they were rapidly resolved to reform B-DNA.
There were also limitations to other experimental approaches designed to detect GQs. Tools designed to detect GQs in cells were themselves able to induce their formation. This risk of an artifact increased when assays were performed on cell extracts. Here, various factors came into play, such as the buffers used and the loss of proteins that might otherwise restrain the B-DNA flip to GQs. Even well-accepted ChIP-seq protocols to map protein interactions are potentially misleading, as recently shown by a stringent analysis of the interactions between the GQ-binding substrates of PRC2 (Polycomb Repressive Complex 2) [18]. Combined, these uncertainties limited the widespread acceptance of G-flipons as important components of the genetic repertoire. The repetitive genome was just considered “junk” [19].
The intent of this review is to integrate information from a wide range of research papers, including some whose significance has long been overlooked and that are not mentioned in many recent GQ reviews [20,21,22,23,24,25,26,27,28]. The initial focus is on the genetic evidence that speaks to an early evolutionary role for G-flipons in maintaining genomic stability and on the proteins that localize the machinery required for nucleotide and base excision repair (NER and BER, respectively) by inducing GQ formation. Different classes of helicase then power the resolution of GQs to reform B-DNA, completing the flipon cycle. By changing the readout of genetic information, flipons dynamically reprogram a cell in response to environmental perturbations.
I will then discuss the roles of G-flipons in transcription that emerged later in evolution. This feature reflects a change in how GQ recognition occurs, from interactions involving single-stranded DNA loops and modified bases to those mediated by proteins that bind both B-DNA and G-quadruplexes through a different face of the same helix.

2. Biophysical and Computational Studies of the G-Quadruplex

The basic building block for a G-quadruplex is a guanine tetrad formed by Hoogsteen hydrogen bonding [4] (colored in Figure 1B) between bases [29]. Interestingly, the parallel nature of these bonds contributes to sigma bonds, which increase the stability of the G-tetrads relative to those formed by xanthine, where the bonding is anti-parallel [30]. A recent review describes 48 different possible GQ folds, reflecting whether the four strands are parallel, anti-parallel or a mix, made from one to four different strands with a lateral, diagonal or propeller loop topology [31] (Figure 1). Further, the guanosine residues may be in either a syn or an anti conformation (with the guanine base either lying over the sugar or pointed away from it) [32]. The GQs can also be left-handed [33]. The folds are stabilized by a central metal, with a potassium ion preferred over the smaller sodium and lithium ions for parallel-strand GQs. The metal preference for other GQ folds varies and depends on whether they are made from RNA or DNA [34]. Non-consecutive guanosines can form tetrads with the extra residue everted from the stack to form a bulge. In the case of GGA repeats, the adenine bases that are excluded from the quadruplex can interact with the tetrads to produce a heptad structure [35,36].
The stability of GQs is also affected by the loop composition, decreasing with loop length, and varying with the loop nucleotide sequence [38]. With long runs of G-repeats, defined as over 500 bases in length, the loops can base-pair to give even higher-order structures. Of the 299 long G runs reported, over 67% are located within 6 Mbp (megabase pairs) of telomeres [39]. Interestingly, GQ loop length and sequence variation have increased during evolution, especially in mammals, as have GQ length, number and density in the genome [40]. G-flipons are also more frequent on the non-template strand of coding genes [41,42].
Besides GQ formation by neighboring G3 repeats, it has been proposed that GQs are formed by a pair of G3 repeats in an enhancer and a pair of G3 repeats from a promoter [43]. Further, a hybrid GQ can form between a pair of DNA G3 repeats in the non-template strand and a matching RNA G3 pair in the nascent transcript [44]. The GQs formed by strands that are not physically connected to each other also show structural variation. The GQs can assemble by stacking tetrads one on top of the other or by pairing bases from the separate strands to form a G-wire [45,46]. G-wires were originally proposed to explain the alignment of homologous chromosomes during meiosis [47]. Tetrads missing their fourth base can incorporate into the vacant space a guanine provided in trans, potentially acting as a sensor for a local change in the concentration of the replacement nucleotide [48].
RNA tetrads only form parallel rGQs when G-repeats are contiguous [34]. A variety of non-canonical rGQ folds are stabilized by pairing of G-bases that are widely separated by non-G nucleotides [49]. rGQs composed of only two tetrads have been reported [50] and are stabilized by the 2′-hydroxyl groups present in RNA [34]. In contrast, there are many possible variations in dGQs composed of three or more tetrads, making it difficult to computationally predict from sequence alone the flipons that actually form dGQs in vivo. A database that combines results from a variety of experimental methods now overcomes this problem by providing a set of well-validated G-flipons detected in many different studies using a variety of approaches [51]. The mappings show that in the human genome, dGQ-forming sequences are enriched in transcription start sites (TSSs), in introns and at transcription termination sites (TTSs) [40].

3. GQ-Binding Proteins

The plethora of different dGQ topologies allows for different modes of protein recognition (Figure 2 and Figure 3). Strategies to confirm these interactions and the specificity of binding to GQs include those that synthesize control oligonucleotides containing an 8-aza-7-deazaguanosine base (Figure 1C) that will not form the Hoogsteen hydrogen bonds necessary to stabilize a GQ (Figure 1B, crimson shading) despite having the same chemical composition as guanine [52]. In these studies, different modes of docking to GQs have been identified, including binding to loop sequences or to 5′ and 3′ single-strand extensions that give the helicases something to pull on so that they can unwind the structure. Proteins can bind to loops formed when adducted bases such as 8-oxo-G prevent the incorporation of a DNA strand into a GQ, or to the everted bases across from an apurinic/apyrimidinic (AP) site. Proteins also dock to the planar tetrad surfaces that form the GQ endplate. Specific binding to rGQs rather than dGQs is favored by intrinsically disordered regions (IDRs) enriched in arginine and by glycine repeats, as recently reviewed [53], and as visualized in the crystal structure of the Fragile X Mental Retardation Protein (FMRP) bound to an rGQ [54]. In principle, the preformed GQ site for docking IDRs lowers the entropic cost of binding.
The stability of GQs and the strength of their interactions with proteins can vary with the loop length and loop sequence composition [57,58], as revealed by studies of nucleolin and the 2E4 DARPin [59,60]. Further, the latching of a single base by the REV1 polymerase [61], and the docking to an AP site by APE1 (AP endonuclease 1) [62], can create a surface that induces GQ folding. As we will discuss, the use of SANT (Swi3, Ada2, N-Cor and TFIIIB) domains to recognize parallel-strand GQs is of particular interest, as the domains can use the same helix to bind B-DNA in a sequence-specific manner. In total, 50 GQ–peptide structures are present in the Protein Data Bank (PDB), showing a variety of interactions [26,60]. A subset of validated GQ-interacting proteins is given in the figures below. Listings of additional proposed GQ-binding proteins can be found in recent publications [52,63], online in the G4IPBD database (http://people.iiti.ac.in/~amitk/bsbe/ipdb/index.php, accessed on 15 September 2024) [64] and the QUADRatlas database (https://rg4db.cibio.unitn.it/, accessed on 15 September 2024) [65].

4. The Accumulating Evidence for the Biological Importance of G-Quadruplexes

Despite the numerous challenges to studying the cellular functions of high-energy and dynamic flipon conformations, much progress has been made. There are two key aspects to their biology: first, the events that promote and resolve the formation of the alternative flipon structures, and second, the transactions that the alternative flipon conformations actuate. There are well-validated proteins that can induce the flip to GQs and many helicases capable of their resolution (Figure 2, Figure 3, up and down arrows). Although GQ formation does not inherently require any change, modification or cleavage of DNA or RNA, such events may change the propensity of G-flipons to flip from one conformation to another. The GQs formed in these processes differ in topology. The structured loops they form are recognized by specific sets of proteins, as are the GQ endplates (Figure 3, top). The outcomes depend on which cellular machinery is localized to a particular GQ. The complexes formed enable cells to reprogram their responses to environmental perturbations.
The transactions occurring between GQs formed at different sites are also important in understanding their cellular functions. The complexes nucleated by one GQ have the potential to associate with other G4-anchored structures to form membraneless condensates (Figure 4) [66,67]. These complexes can be quite large and visible by light microscopy [66]. Their interactions enable the sequencing and timing of events within the cell (Figure 4A). The pairings of promoter GQs with GQs formed at enhancers, splice sites and polyadenylation sites then generate production lines for the processing of transcripts. Factories form by anchoring of the production lines to the nuclear scaffold [68,69], delivering the transcriptional bursts associated with gene expression [70]. The pliability of these production lines is revealed by constant updates to the nuclear architecture [71]. G-flipons actuate many outcomes in the cell and are exploited by pathogens, as described below.

4.1. Retroviral Latency

The simplest example of GQ-mediated integration may be provided by retroviruses, such as human immunodeficiency virus 1 (HIV-1). These viruses encode G-flipons in the long terminal repeats (LTRs) that are present at either end of their 9.6 kb genomic insert [72] (Figure 4B). This arrangement enables the formation of chromatin loops that separate the viral protein-coding genome from that of the host. In this state, the virus is likely latent. Nevertheless, the virus is poised to replicate upon the removal of the loop restraint (Figure 4B). The HIV-1 plus-strand mRNA also contains 11 potential G-quadruplexes, with 9 in the coding sequence. The topologies are mixed, raising the possibility that particular pairings affect the splicing, stability, recombination, and repair of viral transcripts [73]. Long interspersed elements (LINEs) are another class of retrotransposons that have a G-flipon conserved in their 3′UTR (untranslated region). The pairing of LINE GQs with GQs in cellular enhancers also has the potential to form a loop that controls their expression in a tissue-specific manner [74]. Conversely, the 5′UTR G-flipons that LINE families acquire during evolution can themselves act as tissue-specific enhancers for cellular genes [75].

4.2. Cell Division

Interestingly, the first evidence hinting at a biological role for GQs came from the roundworm Caenorhabditis elegans. Sequences with G-quadruplex motifs were deleted in strains with dog-1 (deletions of guanine-rich DNA) LOF variants, whereas sequences with only three G3 repeats, which are unable to form GQs, were not [76]. Mutant strains of dog-1 lacking the translesion synthesis (TLS) polymerases POL eta and POL kappa had significantly more G-tract deletions than dog-1 by itself [77]. The combined deletion of dog-1 and the spindle-checkpoint component mdf-1 enabled long-term survival [78], even though a high incidence of lethal mutations in this strain was revealed by the use of balancer chromosomes. In total, 126 (13%) of the 954 mono-G/C tracts larger than 14 bp were deleted over 470 generations when both genes were absent. A role of GQs in sister chromatid alignment by the cohesin proteins during mitosis was suggested by the effects of dog-1 LOF on the spindle checkpoint. The absence of other phenotypes also supported the consensus that GQs had only a limited role in normal cell biology, not only in C. elegans but also in other organisms.

4.3. Epigenetic Maintenance

A dog-1 homolog in the DT40 chicken lymphoblastoid cell line, the 5′ FANCJ (Fanconi Anemia Complementation Group J) helicase, a member of the Fe-S superfamily 2 (SF2) [79], was also found to prevent the deletion of guanine repeats (G-repeats) that have the potential to form GQs. The effects of the mutation were enhanced by the loss of REV1 polymerase, which localizes TLS polymerases to sites of polymerase stalling. Interestingly, REV1 catalytic activity was not necessary to prevent deletion, although the LOF variant did enhance the rate of G-repeat loss. In the FANCJ model, the combined deletion of the Werner and Bloom Syndrome 3′ helicases (RecQ SF2) [80] also increased G-repeat deletion, likely because of GQ accumulation [79].
Of interest is that the TLS pathway was required to maintain the epigenetic state of dividing cells, as monitored by the cell-surface expression of a protein with an intronic G-flipon that regulated gene expression. In the wildtype cell, the histone modifications associated with this G-flipon were maintained, whereas they were lost following rev1 deletion. In the mutant, the resolution of the GQs formed during DNA replication occurred instead through the gap-filling repair pathway. The subsequent incorporation of unmodified histones led to diminished gene transcription and surface marker expression. This rev1-dependent phenotype could be reverted by re-expression of human FANCJ helicase [79]. The opposite effect was observed when a G-flipon was experimentally inserted into a repressed locus. In this case, rev1 deletion led to the derepression of the segment, consistent with the replacement of repressive histones with unmodified histones that were permissive to gene expression [81]. These results support a model where the formation of GQs by G-flipons during periods of cell proliferation helps in transmitting the current epigenetic state to progeny, an important biological outcome.

4.4. DNA Replication and Sister Chromatid Conformation

The involvement of GQs in cell proliferation is further supported by other evidence. During assembly of the DNA polymerase complex at the origin of replication (OOR), the MTBP protein assists in the loading of CDC45 into the replicative helicase. The C-terminal domain of MTBP binds GQs in vitro [82]. Notably, G-flipons are enriched in OORs. Indeed, in chicken DT40 cells, a minimal, functional OOR consists of a 90 bp fragment that has two G-flipons on the same strand [83]. These constructs establish the nucleosome-depleted region (NDR), which is bound by histone H2A.Z and is typical of the OOR. Collectively, the results suggest a model in which MTBP binds GQs at the OOR to initiate the assembly of the replication complex.
Another potential role for GQs during the proliferation and transmission of the epigenetic state is to align sister chromatids, as the mapping of intra- and inter-chromatin interactions between homologous chromosomes reveals a high degree of symmetry in the architecture of topologically associated domains (TADs), and in the loops formed within TADs [84]. In this regard, a recent report suggests that G-flipons are enriched near sites bound by the CTCF (CCCTC-binding factor), a protein associated with loop formation. Interestingly, the strand orientation of the G-flipons mirrors the inverse orientation of the two CTCF sites that associate with each other to form the base of the chromatin loop [85]. CTCF, however, is not known to bind GQs [52].

4.5. Nucleotide Excision Repair (NER)

The REV1 pathway also plays a role in NER, which is triggered by UV irradiation and the formation of DNA crosslinks. In this situation, the loading of repair pathway proteins such as XPC and RAD23 is triggered by the protein ZRF1 and its yeast homolog Zuotin, which recognize the lesion and induce GQ formation [86]. Triggering this pathway by cytidine deaminases can result in single-base substitutions (SBS), with a C to G transversion resulting from the preferential insertion of cytidine into the lesion by REV1 [87]. The resulting mutational signature (SBS13) is prevalent in cancers [88].
NER in the transcription-coupled repair pathway (TCR) depends on the Cockayne Syndrome B (CSB) helicase (encoded by ERCC6) that binds GQs [89]. On sensing a lesion, CSB displaces DSIF (DRB Sensitivity Inducing Factor) from the RNA polymerase 2 (RNAP2) complex, inducing a conformational switch that halts transcriptional elongation and initiates TCR [90]. LOF variants of CSB are associated with premature aging phenotypes [89].

4.6. Base Excision Repair (BER)

APE1 plays a similar role in stabilizing GQs formed by AP DNA, but not unmodified DNA, to initiate the BER pathway [62]. This pathway removes oxidized bases, such as 8-oxo-G. It is proposed that the regulation of APE1 by acetylation coordinates the expression of genes involved in cellular pathways that respond to oxidative damage. Interestingly, the GQs involved are formed from G-flipons with a “spare tire” (Figure 1F). The extra runs of G-repeats allow the formation of a GQ despite damage to one of the other repeats [91].
The 8-oxo-G modification can arise due to toxins in the environment. The adduct is also generated routinely during the flavin-dependent LSD1 (lysine demethylase 1A, encoded by KDM1A) demethylation of H3K9me2, where hydrogen peroxide is a product of the reaction. The LSD1 enzyme is activated during the induction of BCL2 gene expression by estrogen [62]. The repair of the lesion through the BER pathway depends on GQ formation. Before the involvement of GQs in this process was known, it was proposed that the DNA strand breaks observed were a general mechanism for initiating gene transcription [92]. These breaks can now be viewed as just another example of how flipons enable the reset of chromatin.

4.7. Hemin and Oxidative Damage

Oxidative damage also arises from the production of highly reactive oxidative species catalyzed by hemin, an iron-containing porphyrin that is present at high concentrations in the cell [93]. Hemin binds with high affinity (Kd ~ 10 nM) to GQs, an interaction that was initially highlighted for its ability to increase the production of superoxides [94]. However, it appears that in cells, this reaction is squelched, presumably by proteins that bind to GQs [93]. In such cases, GQs may act as a sink for free hemin and trigger the rapid repair, through the BER pathway, of the damaged bases produced. In this way, GQs protect, rather than damage, the genome.

4.8. Telomere Protection

The formation of T-loops by telomeres described above does not rule out a role for GQ formation in telomere protection. Indeed, the GQ-binding TRF2/RAP1 (telomere repeat binding factor 2/repressor-activator protein 1) complex protects telomeres from homologous recombination by repressing PARP1 (poly(ADP-ribose) polymerase 1) localization to telomeres and by inhibiting the SLX4 resolvase that binds to HJs. The loss of TRF2 and RAP1 in both humans and mice leads to rapid telomere attrition, with increased rates of telomere deletion and fusion [95]. TRF2 preferentially docks to rGQs rather than dGQs. The protein binds rGQs formed by the noncoding Telomeric Repeat-Containing RNA (TERRA) telomere transcript through an RG-rich domain [96]. Interestingly, the HIV-1 retrovirus may form a dGQ to cap the DNA flap sequence produced during the pre-integration phase of reverse transcription, potentially protecting the end in much the same way as proposed for host telomeres [97].

4.9. Resolution of G-Quadruplexes

Implicit in the G-flipon cycle is the need to reset flipons to a resting state. As shown in Figure 3, many helicases enable this outcome. The most studied example is the ATP-dependent DEAH box SF2 helicase DDX36 (RHAU), a highly specific GQ resolvase that unwinds parallel dGQs. The enzyme makes helical contact with the GQ end plate [98,99]. Binding by the helix alone has a relatively high Kd of 1 μM. The additional engagement of a 3′ single-stranded dGQ tail by other residues accounts for the nM affinity of the enzyme for its substrate. Using a ratchet mechanism, the helicase disassembles the dGQs, one guanine at a time. The chemical energy derived from ATP is converted into a pulling force by rotation of the C-terminal domain. The twist opens up the helicase core [99]. In the absence of nucleotides, or in the ADP-bound state, D. melanogaster DDX36 stabilizes GQs [100].
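The gain in affinity from engaging the single-stranded tail can be expressed as a free energy difference using the standard relation ΔG° = RT ln(Kd). The sketch below is illustrative only: the text gives 1 μM for the helix alone and "nM affinity" for the full contact, so the value of 1 nM, the temperature of 298.15 K and the function name `binding_free_energy` are assumptions made for the example.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal/(mol*K)


def binding_free_energy(kd_molar: float, temp_k: float = 298.15) -> float:
    """Standard binding free energy in kcal/mol, relative to a 1 M standard state.

    Uses dG = R * T * ln(Kd); a smaller Kd gives a more negative value.
    """
    if kd_molar <= 0:
        raise ValueError("Kd must be positive")
    return R_KCAL * temp_k * math.log(kd_molar)


# Kd for the helix-endplate contact alone, as stated in the text.
helix_only = binding_free_energy(1e-6)
# Assumed illustrative value for the "nM affinity" of the full interaction.
with_tail = binding_free_energy(1e-9)
# Extra stabilization attributable to the additional contacts.
gain = helix_only - with_tail

if __name__ == "__main__":
    print(f"helix only: {helix_only:.2f} kcal/mol")
    print(f"with 3' tail: {with_tail:.2f} kcal/mol")
    print(f"gain: {gain:.2f} kcal/mol")
```

Under these assumptions, a thousand-fold tighter Kd corresponds to roughly 4 kcal/mol of additional binding energy, on the order of a few hydrogen bonds or ionic contacts.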
The cocrystal structure of dGQs with the SF1 Thermus oshimai 5′-3′ Pif1 helicase shows the enzyme in an unwinding state with the engagement of a single-stranded thymine repeat [101]. The related yeast helicases Pif1 and Rrm3 cooperate to unfold a wide range of dGQ topologies, including those formed not only by telomeres, but also by centromeres and tRNA repeat sequences [102,103]. The enzyme unfolds dGQs in an ATP-dependent manner, unwinding both parallel and antiparallel dGQs [101]. The interaction of Pif1 with a parallel-stranded dGQ differs from that of DDX36. The contact is mediated by a cluster of amino acids, including two arginine/lysine cation–π interactions at either end of the dGQ, plus ionic contacts with the phosphate backbone. The SF2 RecQ BLM helicase also unfolds a variety of dGQs through a number of different mechanisms [104]. Collectively, the helicases play key but distinct roles in flipping dGQs back to B-DNA.

4.10. G-Flipons and Gene Expression

The widely held assumption is that a crystal structure of a protein engaged with B-DNA precludes an interaction with any other DNA conformation, especially if the substrate is bound with nM affinity. Of course, crystal structures by their nature represent a low-energy state. The example of Rap1 is therefore instructive (Figure 2). Prior to its role in telomere protection, Rap1 was characterized as a sequence-specific transcription factor that was bound to a UAS (upstream activating site) in yeast [105]. Its base-specific interaction with B-DNA was confirmed by a crystallographic study of a telomeric sequence (Figure 2A) [55]. Only later did crystal structures show that Rap1 also docked to GQs. Surprisingly, both DNA interactions involved the same helix, but with a different face [56] (Figure 2B). The GQ contacts were hydrophobic, with the helix lying on the planar surface of the terminal tetrad, while the B-DNA contacts were consistent with those found for the UAS. Both interactions have a Kd ≈ 20–30 nM [56], yielding a switch that has two stable states (Figure 2C). The switch state then depends on the context and the availability of the helicases. The example illustrates the potential of flipons to switch the readout of genetic information from a genome by changes to their conformation [106].
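The consequence of two near-equal dissociation constants can be illustrated with a single-site binding isotherm, θ = [P]/([P] + Kd). This is a minimal sketch under simplifying assumptions: each site is treated independently, the free protein concentrations are arbitrary illustrative values, and the text does not state which end of the 20–30 nM range applies to which conformation. The function name `fraction_bound` is hypothetical.

```python
def fraction_bound(free_protein_nm: float, kd_nm: float) -> float:
    """Fractional occupancy of a single site: theta = [P] / ([P] + Kd)."""
    if free_protein_nm < 0 or kd_nm <= 0:
        raise ValueError("concentration must be >= 0 and Kd must be > 0")
    return free_protein_nm / (free_protein_nm + kd_nm)


if __name__ == "__main__":
    # Both ends of the reported Kd range (nM); assignment to B-DNA or GQ is not assumed.
    kd_low, kd_high = 20.0, 30.0
    # Illustrative free Rap1 concentrations in nM (assumed values).
    for conc in (5.0, 25.0, 100.0):
        tight = fraction_bound(conc, kd_low)
        weak = fraction_bound(conc, kd_high)
        print(f"[P] = {conc:5.1f} nM: occupancy {weak:.2f} to {tight:.2f}")
```

Across the concentrations tried, the two occupancies stay close to each other, so binding affinity alone does not favor one conformation. This is consistent with the point made in the text that the state of the switch is set by context and by helicase availability rather than by the protein's intrinsic preference.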
While this finding might seem anomalous, many subsequent studies have demonstrated the ability of proteins to bind specifically both to a cognate B-DNA sequence and to a GQ, often with nanomolar affinity for both conformations. This finding is true for the binding of the SP1 transcription factor to its B-DNA cognate sequence and a c-MYC parallel GQ [107] and for many other proteins that bind both to GQs and to a B-DNA motif [21]. Interestingly, like Rap1, many of the GQ-binding proteins, such as ZRF1 [108] and TRF2 [109,110], include a SANT/Myb domain. The yeast Zuotin protein has replaced the SANT domain with a highly hydrophobic helix that could well interact with a GQ endplate [108]. SANT-domain proteins are found in multiple chromatin-modifying and remodeling complexes, although their interactions with GQs are not yet reported [111].

4.11. Enhancer Promoter Condensates

Given the enrichment of G-flipons in promoters, a key question was how proteins that stabilize and resolve GQs impact transcription. GQ-binding proteins like YY1 (Yin Yang 1) are known to form homodimers that promote enhancer–promoter contacts [52,112,113]. So do transcription factors that bind GQs. One of the surprises of the ENCODE project was the identification of HOT (high occupancy target) loci where upwards of 100 TFs bound, even to sites lacking their sequence-specific binding motif. The findings were initially dismissed as methodological artifacts [114], but were later shown not to be so [115,116]. The primary studies focused on the sequence-specificity of TFs, not on the GQs that were also formed at promoters. The ability of TFs to bind both B-DNA and GQs offered a resolution to this HOT dilemma [52]. Indeed, recent findings suggest that it is GQ formation that recruits TFs to transcriptional hubs [117]. In this new model, as described here, TFs play a non-traditional role. Through the complexes they anchor, TFs localize helicases to resolve the GQs formed by promoters. A specific helicase might recognize a particular GQ fold or a GQ loop of a particular length or composition, or display a preference for a 5′ or 3′ single-stranded flanking sequence. The biological outcomes then depend on the GQ’s topology and the helicase involved. G-flipon cycles are then able to actuate a diverse set of transactions (Figure 3).

4.12. Transcriptional Bursting

One extension of this model is that the docking of TFs to GQs maintains a constant state following the initiation of transcription by the binding of a sequence-specific TF to B-DNA. Consequently, there would be no need for any further sequence-specific interactions with the promoter. However, this possibility is not consistent with the observed rapid reset of promoters that occurs after each round of transcription [118,119]. The fast disassembly of the transcriptional complexes following each round of transcription is mirrored by the abrupt dissolution of promoter condensates, triggered by the high levels of nascent RNAs produced [120]. This evidence suggests that transcription occurs in bursts followed by a reset, rather than by a constant, preset rate of expression.
Figure 4. G-flipons nucleate condensates. In the scheme presented, GQ formation seeds condensates that promote transactions between different genomic regions. This arrangement can maintain G-flipons in an active but poised state, locked and loaded, ready to actuate a particular outcome. (A) The sequential contacts between the GQs formed at different chromosomal sites ensure that RNA processing occurs in the correct temporal order. The splice and polyadenylation sites selected may vary with the specific promoter used to seed the condensate. The GQ folds formed may vary by site. The dotted lines indicate that both DNA and RNA GQs can participate in the transactions. (B) Retroviruses are enriched for G-flipons in their LTR, which can adopt different conformations, as seen in the NMR structures PDB:2N47 [121] and PDB:6HiK [122]. The formation of GQs by both retroviral LTRs may anchor chromosomal loops that are stabilized by a condensate at their base. In this state, the viral genome is poised, but not actively transcribed. The dissolution of the condensate then releases RPOL2 to initiate transcription. The viral plus strand has eleven conserved sequences capable of forming different GQ folds, nine of which are in protein-coding regions. Two potential GQs are present on the negative strand [73,123,124]. The number of potential G-flipons in the LTR differs between retroviruses [72]. Whether the presence of more GQs increases viral virulence or whether the G-flipons enable different processing events is not currently known.
Earlier experiments based on single-molecule FISH suggested that the transcriptional burst frequency, but not the burst size, depended on the rate of promoter reset [118]. One contribution to burst size was the frequency with which sister chromatids were transcribed. Curiously, only one allele was active at a time, rather than both undergoing simultaneous transcription [118]. The localization of many different helicases to the locus might then allow one allele to reload a sequence-specific TF to reform an initiation complex while the other one fired. Such coordinated activity is supported by the symmetrical chromatin architecture observed for sister chromatids, as described above [84]. The lack of co-bursting by maternal and paternal chromosomes is consistent with recent single-cell studies of allele-specific transcription [125].

4.13. Promoter Pausing

How, then, do GQs modulate transcriptional bursting? The formation of GQs at promoters is often detectable before the initiation of transcription [126]. In such situations, there is no preference for which strand forms a GQ, further suggesting that gene transcription is not required to flip B-DNA to the GQ conformation [107]. Further, the GQ flip is not modulated by human topoisomerase I (TOP1), even though the enzyme is enriched at these sites [127]. Instead, TOP1 is inhibited by GQs, with an IC50 ~ 100 nM [128].
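To give a feel for what an IC50 of roughly 100 nM implies, the following is a minimal sketch that assumes a simple one-site dose-response curve. The functional form, the Hill coefficient of 1 and the function name `fraction_active` are illustrative assumptions; they are not taken from the cited study [128].

```python
def fraction_active(inhibitor_nM, ic50_nM=100.0, hill=1.0):
    """Fraction of enzyme activity remaining at a given inhibitor concentration.

    Assumes a standard dose-response curve,
        activity = 1 / (1 + ([I] / IC50) ** hill),
    with IC50 ~ 100 nM as quoted in the text. The Hill coefficient of 1
    is an assumption made for illustration only.
    """
    if inhibitor_nM < 0:
        raise ValueError("concentration must be non-negative")
    return 1.0 / (1.0 + (inhibitor_nM / ic50_nM) ** hill)


if __name__ == "__main__":
    # Print the remaining activity across a range of GQ concentrations.
    for conc in (0, 10, 100, 900):
        print(f"{conc:>4} nM GQ -> {fraction_active(conc):.2f} of TOP1 activity")
```

Under these assumptions, half of the TOP1 activity remains at 100 nM GQ and about one tenth remains at 900 nM.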
It is possible that the GQs formed at promoters engage RPOL2, but prevent elongation by holding the enzyme in a poised state (Figure 5A). The YY1-mediated looping between promoter and enhancer GQs could further freeze RPOL2 in place through the condensate formed [52,112,113]. Locking down RPOL2 then provides the time to properly position other GQ-anchored condensates required to correctly splice and polyadenylate the pre-mRNA produced. Without the proper arrangements in place to coordinate downstream events, an elongating RPOL2 will terminate transcription prematurely and detach from the DNA template (Figure 4) [129]. Following the release of RPOL2 from the TSS, the enhancer–promoter condensate undergoes disassembly, enabling the promoter to reset for another transcriptional burst [120].

4.14. G- and Z-Flipons and Promoter Reset

This scenario provides a different perspective on why TF engagement in promoter GQs is important in regulating gene expression. In the new scheme, TFs do not directly drive gene expression by binding a cognate motif. Instead, they engage GQs, localizing complexes that contain the helicases required to rapidly resolve GQs. The reset then allows the DNA duplex to reform and B-DNA sequence-specific proteins to seed the formation of a new promoter/enhancer condensate.
It is also necessary to clear the existing pre-initiation complex (PIC) used previously to dock RPOL2 at the transcription bubble (Figure 5). This process often involves Z-DNA-forming sequences near the TSS. Here, the negative supercoiling generated by an elongating RNAP2 powers the formation of Z-DNA [130,131,132]. The energy stored in Z-DNA then actuates the removal of the existing PIC from the promoter. A new PIC more in tune with the current state of the cell can then form (Figure 5C) [130]. The formation of Z-DNA may also be necessary to reengage the RPOL2 complex. There is preliminary evidence that GTFE (General Transcriptional Factor E) binds to Z-DNA, followed by docking of RPOL2 to a newly formed transcription bubble [130,131].
The reengagement of the RPOL2 complex also depends on the binding of sequence-specific TFs to promoter B-DNA, which helps seed the PIC. The positive supercoiling induced by PIC engagement [132] then modulates the conformation of both G- and Z-flipons. The positive supercoiling will lead to the unwinding of DNA on either side of the PIC [133]. The upstream uncoiling of DNA will promote GQ formation, signaling that the PIC is engaged, while the downstream unwinding will assist in opening the RNAP2 catalytic center, allowing the coding strand to enter (Figure 5C).
The reset and reinitiation mechanism based on flipons is likely quite ancient. Indeed, many of the promoters regulating embryonic and neurological development contain both G- and Z-flipons that have been validated experimentally [134]. This mechanism is quite flexible and adaptable. The insertion of flipons during evolution into promoters by retrotransposons provides a mechanism for modulating gene expression. In humans, the copying and pasting of the ALU family of SINEs (short interspersed nuclear elements) throughout the genome has greatly enhanced this type of genetic variation [106]. An extreme example of the alternative outcomes enabled by the insertion of flipons into promoter regions is provided by the experimental observation that some flipons can form either GQs or Z-DNA [134,135]. The particular structure adopted may depend on which TSS is used to initiate gene expression. The formation of Z-DNA downstream would be driven by transcription, while the flip upstream to GQs would be driven by TF engagement.
The involvement of G-flipons in both polymerase pausing and in promoter reset may produce some paradoxical results when ligands that stabilize GQs are employed experimentally. The outcome then depends on the step in the G-flipon cycle that is most affected. The immediate effect of disrupting the GQ-dependent enhancer–promoter condensate is the release of RPOL2 from the promoter and a transcriptional burst. Failure to reset the promoter will maintain the NDR and increase the chance of DNA damage, leading to decreased transcription initiation and eventually to cell death. It is also possible that a GQ-stabilizing ligand may disrupt the reset of GQs at genomic sites other than the promoter, leading to premature transcript termination due to the disruption of downstream RNA processing events.

4.15. Gene Repression

The promoter reset occurs in competition with proteins that suppress gene expression. These competitors include the PRC2 complex that engages the GQs formed at promoters through the SANT domain of the EZH2 (enhancer of zeste 2) component. For active genes, the binding of PRC2 to the GQs formed by a nascent RNA likely prevents the engagement of the GQs formed by single-stranded promoter DNA [136]. However, in other situations, the binding of a small RNA to the coding strand would promote GQ formation by promoter DNA without the transcription of a GQ RNA competitor. In this situation, proteins, such as PRC2, that are localized to the site by the small RNA would enhance the formation of a repressive complex at the promoter. In these situations, the small RNA could be produced from a locus elsewhere in the genome [134]. Indeed, small RNAs direct the hiwi (human ortholog of piwi)-mediated repression of human endogenous retroelements in early development and are produced from over 6000 clusters [137,138,139]. By localizing a different set of proteins to the site, small RNAs acting in trans could also promote transcriptional activation (Figure 4A). Such a role has been proposed for other piwi-related argonaute family-member complexes [140,141].

4.16. R Loop Resolution

A number of mechanisms exist to regulate dGQ formation by R-loops (Figure 3). For example, helicases such as SETX and RTEL1 can facilitate the flip of GQs back to B-DNA through the resolution of RNA:DNA hybrids [142,143]. Nucleases that digest the RNA strand of hybrids, such as RNaseH1, play an important role in their removal [144]. Other proteins such as ATRX prevent R-loop formation at telomeres by sequestering RNA. The deletion of ATRX leads to the increased formation of GQs at telomeres [145].

4.17. Chromatin Loops and Transcript Elongation

In cellulo studies reveal that delays in RNAP2 transcript elongation occur at the CTCF-binding sites involved in chromatin loop formation. CTCF binds to the large subunit of RNAP2 and the interaction is also associated with cohesin recruitment [146,147,148]. CTCF only binds unmethylated DNA. Consequently, CTCF binding increases following the deletion of the DNA methylase DNMT1.
These findings are consistent with a model where stalling of the polymerase by CTCF results in an R-loop that promotes GQ formation at the site. The GQ structure produced then inhibits DNMT1, preventing DNA methylation of the locus by trapping the enzyme. The trap works because the binding affinity of DNMT1 is higher for GQs than for either duplex, hemi-methylated or single-stranded DNA [149]. The resolution of the GQs by helicases then allows the redocking of the CTCF to the original DNA site, leading to the reinstatement of the chromatin loop formed with the promoter (Figure 4). The CTCF binding sites necessary for the loop transactions lie in reverse orientation to each other. They are then fully aligned at the base of the loop and held in that state until the next round of elongation [85]. After the loop is reestablished, the flipon cycle then resets the DNA locus to be ready for the next transcriptional burst.

4.18. Chromatin Loops and Splicing

How GQ formation by DNA affects splicing is therefore of considerable interest. Pausing of RNAP2 is associated with alternative splicing (reviewed in [150]). The sites at which RNAP2 pauses have been investigated at nucleotide resolution. Careful in vivo measurements show a dependence of pause sites on the structure of the RNA:DNA hybrid produced, but not on the canonical DNA motifs that form GQs [151]. The lack of direct involvement of dGQs may reflect the action of the FACT (Facilitated Chromatin Transcription) complex in maintaining the existing epigenetic state by removing nucleosomes in front of the RNAP2 and replacing them behind the enzyme. This mechanism prevents the net accumulation of local DNA supercoiling that might otherwise change flipon conformation [152].
However, CTCF-mediated looping is associated with alternative splicing and may allow dGQs to play an indirect role in splicing by maintaining CTCF sites methylation-free. The role for the CTCF is well substantiated. There is evidence that the DNA loops formed between the promoter and the spliceosome mediate the transfer of various splicing factors that initially accumulate in promoter regions [153,154]. There is also ancillary evidence that R-loop formation at promoter sites promotes splicing [155], consistent with the role of GQs in forming promoter/spliceosome condensates.
Alternative splicing is also associated with demethylated DNA, consistent with the role of CTCF-anchored loops in splicing. The deletion of DNMT1 enhances the alternative splicing of the CD45 transcript, as does inducing DNA demethylation, by increasing the expression of TET1 (tet methylcytosine dioxygenase 1) and TET2 enzymes [156,157]. Interestingly, the complement of the degenerate RPOL2 pause motif given by Gajos et al. [151] has a weak match to the CTCF motif (the orientation is inverted relative to those enriched at the TSS). In this case, the inhibition of DNA methylation by GQs may provide a partial explanation for how this conformation can indirectly influence the selection of splice sites [41].
The CTCF-dependent mechanism of connecting promoters with the RNA-processing condensates involved in splicing is quite flexible. For example, the multiple alternative splices of the protocadherin Pcdh gene family connect the production of each isoform with a different active promoter [158,159]. A similar dependence on promoter selection is reported for other RNA-processing steps in which the polyadenylation of transcripts occurs at different sites [160,161] (Figure 4). In both outcomes, GQs potentially prevent the loss of CTCF-binding sites by inhibiting the DNA methylation of the locus. The GQ also localizes proteins with roles in their splicing and polyadenylation. The many proposed GQ-binding proteins involved are listed in [63], in the G4IPBD database and in the QUADRatlas database, with a validated subset given in [52].

5. RNA and G-Quadruplexes

5.1. RNA Modifications and Splicing

rGQs can also form in the RNA transcripts produced, including those with only two tetrads [34] and those folded with non-contiguous G nucleotides [49]. These structures have the potential to alter the RPOL2 elongation rate and the RNA processing performed [41,42]. For example, the splicing factors U2AF65 and SRSF1 bind to GQ RNA with nanomolar affinity, each showing specificity for different GQ substrates [162]. The small molecule cephaeline and the related compound emetine are both reported to impair the formation of GQs by RNA. Both compounds globally disrupt alternative RNA splicing [163].
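As a rough illustration of how putative rGQ-forming sequences are often flagged computationally, the sketch below scans a transcript for the canonical (G3N1–7)3G3 motif given in the abstract and, by lowering the minimum G-run length to two, for candidate two-tetrad folds. The function name `find_putative_gq` and its parameters are hypothetical. A motif match only indicates folding potential, and the regular expression does not capture GQs built from non-contiguous G nucleotides [49].

```python
import re


def find_putative_gq(seq, min_g_run=3, max_loop=7):
    """Return (start, end, sequence) for non-overlapping putative GQ motifs.

    The pattern encodes four G-runs of at least `min_g_run` guanines
    separated by three loops of 1..`max_loop` nucleotides, i.e. the
    canonical (G3N1-7)3G3 motif when the defaults are used.
    Setting min_g_run=2 relaxes the search to two-tetrad candidates.
    """
    # Treat DNA input as RNA so that a single alphabet is used.
    rna = seq.upper().replace("T", "U")
    pattern = re.compile(
        r"(?:G{%d,}[ACGU]{1,%d}){3}G{%d,}" % (min_g_run, max_loop, min_g_run)
    )
    return [(m.start(), m.end(), m.group()) for m in pattern.finditer(rna)]


if __name__ == "__main__":
    print(find_putative_gq("AAGGGAGGGUGGGAGGGAA"))
    print(find_putative_gq("GGAGGAGGAGG", min_g_run=2))
```

Positions are zero-based and half-open, following Python slicing conventions. Because `finditer` reports non-overlapping matches, closely spaced or nested candidates are merged into the first match found.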
GQ formation may also alter the co-transcriptional N6-methyladenosine (m6A) modification of RNA. It has been proposed that this epigenetic mark can affect splice site selection, but that issue is unresolved [164,165,166]. The involvement of rGQs in m6A modification is also controversial. Interestingly, the methyltransferase METTL3/METTL14 heterodimer that writes m6A within the consensus DRACH motif (D = A, G, or U; R = A or G; H = A, C, or U) binds to rG4 structures preferentially through its RGG domain [167,168]. In addition, the RBM15 protein, which also binds rG4, localizes METTL3 to certain transcripts and to a subset of H3K36me3 marks [52,166,169]. The mapping of GQs and m6As to splice junctions is dependent on the methods used. Over 81% of GQs that are mapped in HeLa cells are formed from only two tetrads that can stably fold into rGQs [164]. The mapping frequency also depends on the m6A detection protocol employed and the cell line studied, varying from 14% in HeLa cells to 40% in HEK cells [164]. More recent methods are even more sensitive than those used in the earlier analysis, but reproducibility across studies remains a problem [170]. Current mappings do not reveal any enrichment of the DRACH motif in GQ loops, suggesting that rGQs might localize METTL3 to modify sequences in their neighborhood [164]. Alternatively, m6A modification may inhibit rGQ formation, as seen for GGA repeats [171]. Interestingly, m6A bases are read by heterogeneous nuclear ribonucleoproteins (hnRNPs), which are involved in alternative splicing, such as hnRNP C and hnRNP A2B1 [172].
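The DRACH consensus defined above translates directly into a short pattern search. The following is a minimal sketch; the function name `find_drach_sites` is hypothetical, and a sequence match marks only a candidate site, not an experimentally confirmed m6A.

```python
import re

# DRACH: D = A, G or U; R = A or G; A; C; H = A, C or U.
# The lookahead lets overlapping motifs be reported separately.
DRACH = re.compile(r"(?=([AGU][AG]AC[ACU]))")


def find_drach_sites(rna):
    """Return (motif_start, candidate_m6A_position, motif) for each DRACH match.

    Positions are zero-based. The candidate m6A is the central adenosine,
    i.e. the third base of the five-nucleotide motif.
    """
    # Accept DNA-style input by converting T to U.
    rna = rna.upper().replace("T", "U")
    return [(m.start(1), m.start(1) + 2, m.group(1)) for m in DRACH.finditer(rna)]


if __name__ == "__main__":
    for start, a_pos, motif in find_drach_sites("GGACUGACU"):
        print(start, a_pos, motif)
```

Such a scan could be combined with coordinates of mapped rGQs to ask whether candidate sites fall within GQ loops or in flanking sequence, which is the comparison discussed in the text.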
The role of m6A in splicing was also investigated in genetically modified animals. The expression of a hypomorphic METTL3 allele in mouse embryonic stem cells did not appear to change splicing patterns, although there was a slower turnover of many of the wildtype m6A-modified RNAs [165]. Further, in wildtype cells, the distribution of m6A in processed nuclear mRNAs was similar to that found in cytoplasmic mRNAs. Around 70% of the observed m6A sites were in terminal exons, with ~70% in the 3′UTR. Among chromatin-associated RNAs that were not completely processed, ~93% of the m6As in the partially spliced transcripts were in exons and only ~10% of m6As were within 50 nucleotides of 5′ or 3′ splice sites. Notably, methylation was mostly performed before splicing [173].
Rather than working with a genomic knockout, another group examined the immediate effects of the acute depletion of METTL3 protein. This approach was designed to minimize the downstream effects on the expression of other genes resulting from METTL3 loss. Around 6–10% of high-confidence m6A regions were mapped to introns, mainly in protein-coding genes, either around stop-codon regions or at the beginning of the 3′UTR. The loss of METTL3 disrupted the inclusion of alternative introns/exons in the nascent transcriptome, particularly at the 5′ splice sites proximal to m6A peaks, suggesting that the sites were occluded or the isoforms were protected by proteins bound to m6A. Among the genes showing altered splicing were those encoding proteins for m6A modification (Wtap, Ythdc1, Ythdf1 and Spen), suggesting a negative feedback regulatory mechanism that would be absent in cells with METTL3 deleted from their germline [166]. Overall, the different results for GQ RNA formation at splice sites and METTL3 deficiency are consistent with a model where rGQ-folding in introns can promote the m6A modification of exons, with the rapid degradation of splicing isoforms with retained introns marked by m6A.

5.2. Ribosome Assembly

rGQs appear to play an important role in ribosome structure and maturation, with ribosomal RNAs enriched for G-flipons [174]. Many ribosomal proteins have been identified as rGQ ligands in different screens [65,162]. Further, rGQ-binding and resolving proteins such as nucleolin and nucleophosmin help structure the nucleolar condensates that guide ribosome assembly [59,175,176,177].

5.3. Translation

rGQ formation by mRNA is the subject of much interest, especially in the untranslated regions that regulate translation. These exons contain alternative translation initiation sites and microRNA (miR)-binding sites that affect the production of different protein isoforms. The complexities involved are described in a number of recent reviews. These articles provide examples of how rGQs in 5′UTRs can switch the use of start codons to produce completely different protein products, while rGQs in the 3′UTR can modulate the translation of mRNAs and interactions with small regulatory RNAs such as miRNA [178,179,180,181]. An analysis of G-flipons in 5′- and 3′UTRs provides evidence of positive selection, which can alter the alternative splicing of these exons. Single-nucleotide variants in both 5′- and 3′UTRs are associated with quantitative trait loci [182]. Bioinformatic approaches have also been used to identify G-flipon RNA-binding proteins, as annotated in the QUADRatlas database.
By modulating mRNA translation, rGQs contribute in many ways to phenotypic pliability [28]. Here, helicases such as DHX36 and CCHC-type zinc-finger nucleic acid-binding protein (CNBP/ZNF9) play a central role in promoting mRNA translation by resolving rGQs [183,184]. The m6A modifications of RNA that are associated with rGQ formation during transcription (as described above) also impact translation. The removal of these marks from the 5′UTR near the start codon by the m6A erasers AlkB homolog H5 (ALKBH5) and fat mass and obesity-associated protein (FTO) decreases ribosome translational pausing, increasing protein synthesis [185]. Such m6A modifications also dynamically regulate heat-shock responses by enhancing N7-methylguanosine cap-independent translation [186]. Further, the class I cytoplasmic m6A readers YTHDF1 and YTHDF3 promote the degradation of target transcripts [187], potentially eliminating partially processed transcripts with retained introns. The endogenous repeat elements present in these introns, such as ALU SINE inverted repeats, might otherwise activate dsRNA- and Z-RNA-dependent immune responses [130]. The potential of rGQs to enhance m6A modifications provides additional mechanistic insight into how G-flipons increase phenotype pliability by regulating RNA-dependent epigenetic outcomes.

6. Flipons and Development

6.1. Pioneering Factors and Flipons

Other mechanisms exist for the induction of alternative flipon conformations. Sequence-specific pioneering transcription factors, such as HNF4 and GATA4, can dock to their motifs on nucleosome-bound DNA. The master-regulators of embryonic development then localize complexes that evict histone octamers from the locus, generating a negatively supercoiled NDR at the site [188,189]. The energy released by the removal of a nucleosome is sufficient to induce a number of different alternative DNA conformations [190]. The relaxation of these structures to B-DNA is sufficient to power the assembly of the different biological machines that actuate alternative cellular responses (Figure 3).

6.2. Bootstrapping Flipon Conformation with Noncoding RNAs

GQs are able to facilitate a number of different processes in the cell that are directed by sequence-specific TFs. Small noncoding RNAs, such as those used in the piwi system to regulate endogenous retroelements [191], provide another means by which GQ formation can be regulated in a sequence-specific manner. In both cases, the alternative flipon conformations engage the same structure-specific cellular machinery. The question arises as to how these two different systems for sequence-specific regulation of gene expression and RNA translation are used to program development, especially during early embryogenesis. To explore the role of small RNAs in this process, the sequence-specific match between experimentally confirmed flipons and miR highly conserved in eutherian mammals was explored. Intriguingly, promoters with miR matches to G- and Z-flipons were highly enriched in developmental genes (FDR < 10^−100). The findings are consistent with a role for miRNA in programming flipon conformation during early embryogenesis [134].
Notably, GQs are enriched in human embryonic stem cells (hESCs). About 18,000 GQs were mapped to the NDR, as defined by ATAC-seq. Following differentiation into neural stem cells and cranial neural crest cells, the number of detectable GQs was reduced by 25–50%, with findings differing by lineage [192]. In hESCs, GQs were mapped to ~50% of bivalent promoters that contain both active H3K4me1 and repressive H3K27me3 marks, and were lowly transcribed. The GQs in hESCs overlapped sites bound by the CTCF (~36%), the cohesin component RAD21 (~50%) and RING1B, which mediates repression by recruiting PRC1 to R-loops (~55%) [193]. Differentiation was associated with the loss of bivalent promoters, reflecting the potential of GQs to localize either activating or repressive protein complexes during lineage specification. Collectively, the results are consistent with a model where small RNAs bootstrap development, much in the same way a computer loads an initial program to specify the inputs and outputs that are necessary for an operating system to run. Here, the programming of flipon conformation by small RNAs would establish epigenetic marks to template tissue differentiation by sequence-specific B-DNA-binding proteins. The bootstrapping by small RNAs that occurs after the erasure of existing parental epigenetic marks early in development could potentially involve miR, transmitted by either maternal or paternal gametes [194,195,196,197]. Further research is needed to address such mechanisms.

7. Summary and Outlook

Flipons are genetically encoded elements that dynamically change their conformation under physiological conditions without requiring strand cleavage or a change in sequence. They vary by the non-B-DNA structures they form. Z-flipons flip rapidly, with an in vitro relaxation time of 100 ms, and have ancient, well-documented roles in self-recognition and immunity through their structure-specific interactions with the Zα domain [130]. G-flipons are much more stable, with higher melting temperatures than their B-DNA structure and the potential to form bistable switches. Yet, like Z-flipons, GQs are formed and resolved dynamically to perform a number of important biological roles (Figure 3). Flipons that form triplexes are also likely to influence gene expression and development [198,199], with examples related to the hemoglobin locus [200] and to triplex stabilization by histone H3 tails [201,202]. Notably, the Drosophila GAGA protein binds triplex-DNA through the same domain that binds B-DNA in a sequence-specific manner [203]. Triplex-forming sequences are also enriched in repeat elements, such as ALU SINEs, which form part of the repetitive genome [106]. Other triplexes are formed by long noncoding RNAs. Their biology then reflects the RNA motifs that the triplex-forming sequences deliver to a locus. The sequence- and structure-specific proteins engaged by the tag-along motifs then scaffold the formation of various chromatin-modifying complexes [204].
Based on a dynamic form of encoding, flipon biology can be best visualized as a cycle that exchanges energy for information. The flip to an alternative conformation is regulated genetically, by environmental events, and by base modifications that enhance or suppress the transition. The outcomes depend upon proteins and noncoding RNAs that modulate the formation or resolution of their alternative conformation. These modulators are themselves subject to modifications that help tune the cycle. Other factors also affect the flipon equilibrium by binding in a sequence-specific manner to right-handed B-DNA conformations or to single-stranded RNA to oppose the flip to the alternative conformation.
While it has been usual to consider the effects of evolution on the individual protein components involved in cellular processes, the optimization of so many different parameters represents a combinatorically challenging calculation full of cascading complexity, similar in logic to the epicycles once used to predict planetary orbits in a bygone era. Instead, flipons offer a simpler alternative to optimize context-specific responses that allow rapid adjustments of the cellular state in response to environmental perturbations. By programming and refreshing the epigenetic state of a cell, flipons facilitate the formation and maintenance of cellular memory [2]. Here, the various ways in which G-flipons impact a wide variety of biological processes are described, with a focus on the recent experimental validations of GQs and descriptions of what is currently unknown.

Funding

This research received no external funding.

Conflicts of Interest

The author declares no conflict of interest.

References

  1. Herbert, A. A Genetic Instruction Code Based on DNA Conformation. Trends Genet. 2019, 35, 887–890. [Google Scholar] [CrossRef]
  2. Herbert, A. Flipons and the Logic of Soft-Wired Genomes, 1st ed.; CRC Press: Boca Raton, FL, USA, 2024. [Google Scholar]
  3. Gellert, M.; Lipsett, M.N.; Davies, D.R. Helix Formation by Guanylic Acid. Proc. Natl. Acad. Sci. USA 1962, 48, 2013–2018. [Google Scholar] [CrossRef]
  4. Arnott, S.; Chandrasekaran, R.; Marttila, C.M. Structures for polyinosinic acid and polyguanylic acid. Biochem. J. 1974, 141, 537–543. [Google Scholar] [CrossRef]
  5. Sauer, M.; Paeschke, K. G-quadruplex unwinding helicases and their function in vivo. Biochem. Soc. Trans. 2017, 45, 1173–1182. [Google Scholar] [CrossRef]
  6. Blackburn, E.H.; Gall, J.G. A tandemly repeated sequence at the termini of the extrachromosomal ribosomal RNA genes in Tetrahymena. J. Mol. Biol. 1978, 120, 33–53. [Google Scholar] [CrossRef]
  7. Sundquist, W.I.; Klug, A. Telomeric DNA dimerizes by formation of guanine tetrads between hairpin loops. Nature 1989, 342, 825–829. [Google Scholar] [CrossRef]
  8. Griffith, J.D.; Comeau, L.; Rosenfield, S.; Stansel, R.M.; Bianchi, A.; Moss, H.; de Lange, T. Mammalian telomeres end in a large duplex loop. Cell 1999, 97, 503–514. [Google Scholar] [CrossRef]
  9. Huber, M.D.; Duquette, M.L.; Shiels, J.C.; Maizels, N. A conserved G4 DNA binding domain in RecQ family helicases. J. Mol. Biol. 2006, 358, 1071–1080. [Google Scholar] [CrossRef]
  10. Maizels, N. G4-associated human diseases. EMBO Rep. 2015, 16, 910–922. [Google Scholar] [CrossRef]
  11. Duquette, M.L.; Handa, P.; Vincent, J.A.; Taylor, A.F.; Maizels, N. Intracellular transcription of G-rich DNAs induces formation of G-loops, novel structures containing G4 DNA. Genes. Dev. 2004, 18, 1618–1629. [Google Scholar] [CrossRef]
  12. Okazaki, I.M.; Kinoshita, K.; Muramatsu, M.; Yoshikawa, K.; Honjo, T. The AID enzyme induces class switch recombination in fibroblasts. Nature 2002, 416, 340–345. [Google Scholar] [CrossRef]
  13. Qiao, Q.; Wang, L.; Meng, F.L.; Hwang, J.K.; Alt, F.W.; Wu, H. AID Recognizes Structured DNA for Class Switch Recombination. Mol. Cell 2017, 67, 361–373.e364. [Google Scholar] [CrossRef]
  14. Ribeiro de Almeida, C.; Dhir, S.; Dhir, A.; Moghaddam, A.E.; Sattentau, Q.; Meinhart, A.; Proudfoot, N.J. RNA Helicase DDX1 Converts RNA G-Quadruplex Structures into R-Loops to Promote IgH Class Switch Recombination. Mol. Cell 2018, 70, 650–662.e658. [Google Scholar] [CrossRef]
  15. Richard, P.; Manley, J.L. R Loops and Links to Human Disease. J. Mol. Biol. 2017, 429, 3168–3180.
  16. Guo, J.U.; Bartel, D.P. RNA G-quadruplexes are globally unfolded in eukaryotic cells and depleted in bacteria. Science 2016, 353, aaf5371.
  17. Di Antonio, M.; Ponjavic, A.; Radzevicius, A.; Ranasinghe, R.T.; Catalano, M.; Zhang, X.; Shen, J.; Needham, L.M.; Lee, S.F.; Klenerman, D.; et al. Single-molecule visualization of DNA G-quadruplex formation in live cells. Nat. Chem. 2020, 12, 832–837.
  18. Guo, J.K.; Blanco, M.R.; Walkup, W.G., IV; Bonesteele, G.; Urbinati, C.R.; Banerjee, A.K.; Chow, A.; Ettlin, O.; Strehle, M.; Peyda, P.; et al. Denaturing purifications demonstrate that PRC2 and other widely reported chromatin proteins do not appear to bind directly to RNA in vivo. Mol. Cell 2024, 84, 1271–1289.e12.
  19. Doolittle, W.F. Is junk DNA bunk? A critique of ENCODE. Proc. Natl. Acad. Sci. USA 2013, 110, 5294–5300.
  20. Varshney, D.; Spiegel, J.; Zyner, K.; Tannahill, D.; Balasubramanian, S. The regulation and functions of DNA and RNA G-quadruplexes. Nat. Rev. Mol. Cell Biol. 2020, 21, 459–474.
  21. Spiegel, J.; Adhikari, S.; Balasubramanian, S. The Structure and Function of DNA G-Quadruplexes. Trends Chem. 2020, 2, 123–136.
  22. Yadav, P.; Kim, N.; Kumari, M.; Verma, S.; Sharma, T.K.; Yadav, V.; Kumar, A. G-Quadruplex Structures in Bacteria: Biological Relevance and Potential as an Antimicrobial Target. J. Bacteriol. 2021, 203, e0057720.
  23. Wang, E.; Thombre, R.; Shah, Y.; Latanich, R.; Wang, J. G-Quadruplexes as pathogenic drivers in neurodegenerative disorders. Nucleic Acids Res. 2021, 49, 4816–4830.
  24. Lejault, P.; Mitteaux, J.; Sperti, F.R.; Monchaud, D. How to untie G-quadruplex knots and why? Cell Chem. Biol. 2021, 28, 436–455.
  25. Sato, K.; Knipscheer, P. G-quadruplex resolution: From molecular mechanisms to physiological relevance. DNA Repair 2023, 130, 103552.
  26. Troisi, R.; Sica, F. Structural overview of DNA and RNA G-quadruplexes in their interaction with proteins. Curr. Opin. Struct. Biol. 2024, 87, 102846.
  27. Sahayasheela, V.J.; Sugiyama, H. RNA G-quadruplex in functional regulation of noncoding RNA: Challenges and emerging opportunities. Cell Chem. Biol. 2024, 31, 53–70.
  28. Cammas, A.; Desprairies, A.; Dassi, E.; Millevoi, S. The shaping of mRNA translation plasticity by RNA G-quadruplexes in cancer progression and therapy resistance. NAR Cancer 2024, 6, zcae025.
  29. Sen, D.; Gilbert, W. A sodium-potassium switch in the formation of four-stranded G4-DNA. Nature 1990, 344, 410–414.
  30. Fonseca Guerra, C.; Zijlstra, H.; Paragi, G.; Bickelhaupt, F.M. Telomere Structure and Stability: Covalency in Hydrogen Bonds, Not Resonance Assistance, Causes Cooperativity in Guanine Quartets. Chem. Eur. J. 2011, 17, 12612–12622.
  31. Sundaresan, S.; Uttamrao, P.P.; Kovuri, P.; Rathinavelan, T. The entangled world of DNA quadruplex folds. bioRxiv 2024.
  32. Marusic, M.; Sket, P.; Bauer, L.; Viglasky, V.; Plavec, J. Solution-state structure of an intramolecular G-quadruplex with propeller, diagonal and edgewise loops. Nucleic Acids Res. 2012, 40, 6946–6956.
  33. Roschdi, S.; Yan, J.; Nomura, Y.; Escobar, C.A.; Petersen, R.J.; Bingman, C.A.; Tonelli, M.; Vivek, R.; Montemayor, E.J.; Wickens, M.; et al. An atypical RNA quadruplex marks RNAs as vectors for gene silencing. Nat. Struct. Mol. Biol. 2022, 29, 1113–1121.
  34. Fay, M.M.; Lyons, S.M.; Ivanov, P. RNA G-Quadruplexes in Biology: Principles and Molecular Mechanisms. J. Mol. Biol. 2017, 429, 2127–2147.
  35. Matsugami, A.; Okuizumi, T.; Uesugi, S.; Katahira, M. Intramolecular higher order packing of parallel quadruplexes comprising a G:G:G:G tetrad and a G(:A):G(:A):G(:A):G heptad of GGA triplet repeat DNA. J. Biol. Chem. 2003, 278, 28147–28153.
  36. Palumbo, S.L.; Memmott, R.M.; Uribe, D.J.; Krotova-Khan, Y.; Hurley, L.H.; Ebbinghaus, S.W. A novel G-quadruplex-forming GGA repeat region in the c-myb promoter is a critical regulator of promoter activity. Nucleic Acids Res. 2008, 36, 1755–1769.
  37. Fleming, A.M.; Zhou, J.; Wallace, S.S.; Burrows, C.J. A Role for the Fifth G-Track in G-Quadruplex Forming Oncogene Promoter Sequences during Oxidative Stress: Do These “Spare Tires” Have an Evolved Function? ACS Cent. Sci. 2015, 1, 226–233.
  38. Piazza, A.; Adrian, M.; Samazan, F.; Heddi, B.; Hamon, F.; Serero, A.; Lopes, J.; Teulade-Fichou, M.P.; Phan, A.T.; Nicolas, A. Short loop length and high thermal stability determine genomic instability induced by G-quadruplex-forming minisatellites. EMBO J. 2015, 34, 1718–1734.
  39. Williams, J.D.; Houserova, D.; Johnson, B.R.; Dyniewski, B.; Berroyer, A.; French, H.; Barchie, A.A.; Bilbrey, D.D.; Demeis, J.D.; Ghee, K.R.; et al. Characterization of long G4-rich enhancer-associated genomic regions engaging in a novel loop:loop ‘G4 Kissing’ interaction. Nucleic Acids Res. 2020, 48, 5907–5925.
  40. Wu, F.; Niu, K.; Cui, Y.; Li, C.; Lyu, M.; Ren, Y.; Chen, Y.; Deng, H.; Huang, L.; Zheng, S.; et al. Genome-wide analysis of DNA G-quadruplex motifs across 37 species provides insights into G4 evolution. Commun. Biol. 2021, 4, 98.
  41. Lee, C.Y.; McNerney, C.; Ma, K.; Zhao, W.; Wang, A.; Myong, S. R-loop induced G-quadruplex in non-template promotes transcription by successive R-loop formation. Nat. Commun. 2020, 11, 3392.
  42. Georgakopoulos-Soares, I.; Parada, G.E.; Wong, H.Y.; Medhi, R.; Furlan, G.; Munita, R.; Miska, E.A.; Kwok, C.K.; Hemberg, M. Alternative splicing modulation by G-quadruplexes. Nat. Commun. 2022, 13, 2404.
  43. Hegyi, H. Enhancer-promoter interaction facilitated by transiently forming G-quadruplexes. Sci. Rep. 2015, 5, 9165.
  44. Zheng, K.W.; Xiao, S.; Liu, J.Q.; Zhang, J.Y.; Hao, Y.H.; Tan, Z. Co-transcriptional formation of DNA:RNA hybrid G-quadruplex and potential function as constitutional cis element for transcription control. Nucleic Acids Res. 2013, 41, 5533–5541.
  45. Varizhuk, A.M.; Protopopova, A.D.; Tsvetkov, V.B.; Barinov, N.A.; Podgorsky, V.V.; Tankevich, M.V.; Vlasenok, M.A.; Severov, V.V.; Smirnov, I.P.; Dubrovin, E.V.; et al. Polymorphism of G4 associates: From stacks to wires via interlocks. Nucleic Acids Res. 2018, 46, 8978–8992.
  46. Kolesnikova, S.; Curtis, E.A. Structure and Function of Multimeric G-Quadruplexes. Molecules 2019, 24, 3074.
  47. Sen, D.; Gilbert, W. Formation of parallel four-stranded complexes by guanine-rich motifs in DNA and its implications for meiosis. Nature 1988, 334, 364–366.
  48. Li, X.-m.; Zheng, K.-w.; Zhang, J.-y.; Liu, H.-h.; He, Y.-d.; Yuan, B.-f.; Hao, Y.-h.; Tan, Z. Guanine-vacancy–bearing G-quadruplexes responsive to guanine derivatives. Proc. Natl. Acad. Sci. USA 2015, 112, 14581–14586.
  49. Banco, M.T.; Ferre-D’Amare, A.R. The emerging structural complexity of G-quadruplex RNAs. RNA 2021, 27, 390–402.
  50. Lavezzo, E.; Berselli, M.; Frasson, I.; Perrone, R.; Palu, G.; Brazzale, A.R.; Richter, S.N.; Toppo, S. G-quadruplex forming sequences in the genome of all known human viruses: A comprehensive guide. PLoS Comput. Biol. 2018, 14, e1006675.
  51. Qian, S.H.; Shi, M.W.; Xiong, Y.L.; Zhang, Y.; Zhang, Z.H.; Song, X.M.; Deng, X.Y.; Chen, Z.X. EndoQuad: A comprehensive genome-wide experimentally validated endogenous G-quadruplex database. Nucleic Acids Res. 2024, 52, D72–D80.
  52. Spiegel, J.; Cuesta, S.M.; Adhikari, S.; Hansel-Hertsch, R.; Tannahill, D.; Balasubramanian, S. G-quadruplexes are transcription factor binding hubs in human chromatin. Genome Biol. 2021, 22, 117.
  53. Brazda, V.; Cerven, J.; Bartas, M.; Mikyskova, N.; Coufal, J.; Pecinka, P. The Amino Acid Composition of Quadruplex Binding Proteins Reveals a Shared Motif and Predicts New Potential Quadruplex Interactors. Molecules 2018, 23, 2341.
  54. Vasilyev, N.; Polonskaia, A.; Darnell, J.C.; Darnell, R.B.; Patel, D.J.; Serganov, A. Crystal structure reveals specific recognition of a G-quadruplex RNA by a β-turn in the RGG motif of FMRP. Proc. Natl. Acad. Sci. USA 2015, 112, E5391–E5400.
  55. König, P.; Giraldo, R.; Chapman, L.; Rhodes, D. The Crystal Structure of the DNA-Binding Domain of Yeast RAP1 in Complex with Telomeric DNA. Cell 1996, 85, 125–136.
  56. Traczyk, A.; Liew, C.W.; Gill, D.J.; Rhodes, D. Structural basis of G-quadruplex DNA recognition by the yeast telomeric protein Rap1. Nucleic Acids Res. 2020, 48, 4562–4571.
  57. Guedin, A.; Gros, J.; Alberti, P.; Mergny, J.L. How long is too long? Effects of loop size on G-quadruplex stability. Nucleic Acids Res. 2010, 38, 7858–7868.
  58. Zhang, A.Y.; Bugaut, A.; Balasubramanian, S. A sequence-independent analysis of the loop length dependence of intramolecular RNA G-quadruplex stability and topology. Biochemistry 2011, 50, 7251–7258.
  59. Saha, A.; Duchambon, P.; Masson, V.; Loew, D.; Bombard, S.; Teulade-Fichou, M.P. Nucleolin Discriminates Drastically between Long-Loop and Short-Loop Quadruplexes. Biochemistry 2020, 59, 1261–1272.
  60. Ngo, K.H.; Liew, C.W.; Heddi, B.; Phan, A.T. Structural Basis for Parallel G-Quadruplex Recognition by an Ankyrin Protein. J. Am. Chem. Soc. 2024, 146, 13709–13713.
  61. Weaver, T.M.; Cortez, L.M.; Khoang, T.H.; Washington, M.T.; Agarwal, P.K.; Freudenthal, B.D. Visualizing Rev1 catalyze protein-template DNA synthesis. Proc. Natl. Acad. Sci. USA 2020, 117, 25494–25504.
  62. Roychoudhury, S.; Pramanik, S.; Harris, H.L.; Tarpley, M.; Sarkar, A.; Spagnol, G.; Sorgen, P.L.; Chowdhury, D.; Band, V.; Klinkebiel, D.; et al. Endogenous oxidized DNA bases and APE1 regulate the formation of G-quadruplex structures in the genome. Proc. Natl. Acad. Sci. USA 2020, 117, 11409–11420.
  63. Pipier, A.; Devaux, A.; Lavergne, T.; Adrait, A.; Coute, Y.; Britton, S.; Calsou, P.; Riou, J.F.; Defrancq, E.; Gomez, D. Constrained G4 structures unveil topology specificity of known and new G4 binding proteins. Sci. Rep. 2021, 11, 13469.
  64. Mishra, S.K.; Tawani, A.; Mishra, A.; Kumar, A. G4IPDB: A database for G-quadruplex structure forming nucleic acid interacting proteins. Sci. Rep. 2016, 6, 38144.
  65. Bourdon, S.; Herviou, P.; Dumas, L.; Destefanis, E.; Zen, A.; Cammas, A.; Millevoi, S.; Dassi, E. QUADRatlas: The RNA G-quadruplex and RG4-binding proteins database. Nucleic Acids Res. 2023, 51, D240–D247.
  66. Handwerger, K.E.; Cordero, J.A.; Gall, J.G. Cajal bodies, nucleoli, and speckles in the Xenopus oocyte nucleus have a low-density, sponge-like structure. Mol. Biol. Cell 2005, 16, 202–211.
  67. Shin, Y.; Brangwynne, C.P. Liquid phase condensation in cell physiology and disease. Science 2017, 357, eaaf4382.
  68. Iborra, F.J.; Pombo, A.; Jackson, D.A.; Cook, P.R. Active RNA polymerases are localized within discrete transcription “factories” in human nuclei. J. Cell Sci. 1996, 109, 1427–1436.
  69. Jackson, D.A. The amazing complexity of transcription factories. Brief. Funct. Genom. Proteomic 2005, 4, 143–157.
  70. Chubb, J.R.; Trcek, T.; Shenoy, S.M.; Singer, R.H. Transcriptional pulsing of a developmental gene. Curr. Biol. 2006, 16, 1018–1025.
  71. Marshall, W.F.; Straight, A.; Marko, J.F.; Swedlow, J.; Dernburg, A.; Belmont, A.; Murray, A.W.; Agard, D.A.; Sedat, J.W. Interphase chromosomes undergo constrained diffusional motion in living cells. Curr. Biol. 1997, 7, 930–939.
  72. Ruggiero, E.; Tassinari, M.; Perrone, R.; Nadai, M.; Richter, S.N. Stable and Conserved G-Quadruplexes in the Long Terminal Repeat Promoter of Retroviruses. ACS Infect. Dis. 2019, 5, 1150–1159.
  73. Amrane, S.; Jaubert, C.; Bedrat, A.; Rundstadler, T.; Recordon-Pinson, P.; Aknin, C.; Guedin, A.; De Rache, A.; Bartolucci, L.; Diene, I.; et al. Deciphering RNA G-quadruplex function during the early steps of HIV-1 infection. Nucleic Acids Res. 2022, 50, 12328–12343.
  74. Sahakyan, A.B.; Murat, P.; Mayer, C.; Balasubramanian, S. G-quadruplex structures within the 3′ UTR of LINE-1 elements stimulate retrotransposition. Nat. Struct. Mol. Biol. 2017, 24, 243–247.
  75. Sakamoto, M.; Ishiuchi, T. YY1-dependent transcriptional regulation manifests at the morula stage. Micropubl. Biol. 2024.
  76. Kruisselbrink, E.; Guryev, V.; Brouwer, K.; Pontier, D.B.; Cuppen, E.; Tijsterman, M. Mutagenic capacity of endogenous G4 DNA underlies genome instability in FANCJ-defective C. elegans. Curr. Biol. 2008, 18, 900–905.
  77. Jones, M.; Rose, A. A DOG’s View of Fanconi Anemia: Insights from C. elegans. Anemia 2012, 2012, 323721.
  78. Tarailo-Graovac, M.; Wong, T.; Qin, Z.; Flibotte, S.; Taylor, J.; Moerman, D.G.; Rose, A.M.; Chen, N. Spectrum of variations in dog-1/FANCJ and mdf-1/MAD1 defective Caenorhabditis elegans strains after long-term propagation. BMC Genom. 2015, 16, 210.
  79. Sarkies, P.; Murat, P.; Phillips, L.G.; Patel, K.J.; Balasubramanian, S.; Sale, J.E. FANCJ coordinates two pathways that maintain epigenetic stability at G-quadruplex DNA. Nucleic Acids Res. 2012, 40, 1485–1498.
  80. Liu, Y.; Zhu, X.; Wang, K.; Zhang, B.; Qiu, S. The Cellular Functions and Molecular Mechanisms of G-Quadruplex Unwinding Helicases in Humans. Front. Mol. Biosci. 2021, 8, 783889.
  81. Sarkies, P.; Reams, C.; Simpson, L.J.; Sale, J.E. Epigenetic instability due to defective replication of structured DNA. Mol. Cell 2010, 40, 703–713.
  82. Kumagai, A.; Dunphy, W.G. MTBP, the partner of Treslin, contains a novel DNA-binding domain that is essential for proper initiation of DNA replication. Mol. Biol. Cell 2017, 28, 2998–3012.
  83. Poulet-Benedetti, J.; Tonnerre-Doncarli, C.; Valton, A.L.; Laurent, M.; Gerard, M.; Barinova, N.; Parisis, N.; Massip, F.; Picard, F.; Prioleau, M.N. Dimeric G-quadruplex motifs-induced NFRs determine strong replication origins in vertebrates. Nat. Commun. 2023, 14, 4843.
  84. Mitter, M.; Gasser, C.; Takacs, Z.; Langer, C.C.H.; Tang, W.; Jessberger, G.; Beales, C.T.; Neuner, E.; Ameres, S.L.; Peters, J.M.; et al. Conformation of sister chromatids in the replicated human genome. Nature 2020, 586, 139–144.
  85. Hou, Y.; Li, F.; Zhang, R.; Li, S.; Liu, H.; Qin, Z.S.; Sun, X. Integrative characterization of G-Quadruplexes in the three-dimensional chromatin structure. Epigenetics 2019, 14, 894–911.
  86. De Magis, A.; Gotz, S.; Hajikazemi, M.; Fekete-Szucs, E.; Caterino, M.; Juranek, S.; Paeschke, K. Zuo1 supports G4 structure formation and directs repair toward nucleotide excision repair. Nat. Commun. 2020, 11, 3907.
  87. Ketkar, A.; Smith, L.; Johnson, C.; Richey, A.; Berry, M.; Hartman, J.H.; Maddukuri, L.; Reed, M.R.; Gunderson, J.E.C.; Leung, J.W.C.; et al. Human Rev1 relies on insert-2 to promote selective binding and accurate replication of stabilized G-quadruplex motifs. Nucleic Acids Res. 2021, 49, 2065–2084.
  88. Sondka, Z.; Dhir, N.B.; Carvalho-Silva, D.; Jupe, S.; Madhumita; McLaren, K.; Starkey, M.; Ward, S.; Wilding, J.; Ahmed, M.; et al. COSMIC: A curated database of somatic variants and clinical data for cancer. Nucleic Acids Res. 2024, 52, D1210–D1217.
  89. Liano, D.; Chowdhury, S.; Di Antonio, M. Cockayne Syndrome B Protein Selectively Resolves and Interact with Intermolecular DNA G-Quadruplex Structures. J. Am. Chem. Soc. 2021, 143, 20988–21002.
  90. Kokic, G.; Wagner, F.R.; Chernev, A.; Urlaub, H.; Cramer, P. Structural basis of human transcription-DNA repair coupling. Nature 2021, 598, 368–372.
  91. Fleming, A.M.; Zhu, J.; Ding, Y.; Esders, S.; Burrows, C.J. Oxidative Modification of Guanine in a Potential Z-DNA-Forming Sequence of a Gene Promoter Impacts Gene Expression. Chem. Res. Toxicol. 2019, 32, 899–909.
  92. Ju, B.G.; Lunyak, V.V.; Perissi, V.; Garcia-Bassets, I.; Rose, D.W.; Glass, C.K.; Rosenfeld, M.G. A topoisomerase IIbeta-mediated dsDNA break required for regulated transcription. Science 2006, 312, 1798–1802.
  93. Gray, L.T.; Puig Lombardi, E.; Verga, D.; Nicolas, A.; Teulade-Fichou, M.P.; Londono-Vallejo, A.; Maizels, N. G-quadruplexes Sequester Free Heme in Living Cells. Cell Chem. Biol. 2019, 26, 1681–1691.e5.
  94. Li, Y.; Geyer, C.R.; Sen, D. Recognition of anionic porphyrins by DNA aptamers. Biochemistry 1996, 35, 6911–6922.
  95. Rai, R.; Chen, Y.; Lei, M.; Chang, S. TRF2-RAP1 is required to protect telomeres from engaging in homologous recombination-mediated deletions and fusions. Nat. Commun. 2016, 7, 10881.
  96. Mei, Y.; Deng, Z.; Vladimirova, O.; Gulve, N.; Johnson, F.B.; Drosopoulos, W.C.; Schildkraut, C.L.; Lieberman, P.M. TERRA G-quadruplex RNA interaction with TRF2 GAR domain is required for telomere integrity. Sci. Rep. 2021, 11, 3509.
  97. Lyonnais, S.; Hounsou, C.; Teulade-Fichou, M.P.; Jeusset, J.; Le Cam, E.; Mirambeau, G. G-quartets assembly within a G-rich DNA flap. A possible event at the center of the HIV-1 genome. Nucleic Acids Res. 2002, 30, 5276–5283.
  98. Heddi, B.; Cheong, V.V.; Martadinata, H.; Phan, A.T. Insights into G-quadruplex specific recognition by the DEAH-box helicase RHAU: Solution structure of a peptide-quadruplex complex. Proc. Natl. Acad. Sci. USA 2015, 112, 9608–9613.
  99. Chen, M.C.; Tippana, R.; Demeshkina, N.A.; Murat, P.; Balasubramanian, S.; Myong, S.; Ferre-D’Amare, A.R. Structural basis of G-quadruplex unfolding by the DEAH/RHA helicase DHX36. Nature 2018, 558, 465–469.
  100. You, H.; Lattmann, S.; Rhodes, D.; Yan, J. RHAU helicase stabilizes G4 in its nucleotide-free state and destabilizes G4 upon ATP hydrolysis. Nucleic Acids Res. 2017, 45, 206–214.
  101. Dai, Y.X.; Guo, H.L.; Liu, N.N.; Chen, W.F.; Ai, X.; Li, H.H.; Sun, B.; Hou, X.M.; Rety, S.; Xi, X.G. Structural mechanism underpinning Thermus oshimai Pif1-mediated G-quadruplex unfolding. EMBO Rep. 2022, 23, e53874.
  102. Muellner, J.; Schmidt, K.H. Yeast Genome Maintenance by the Multifunctional PIF1 DNA Helicase Family. Genes 2020, 11, 224.
  103. Varon, M.; Dovrat, D.; Heuze, J.; Tsirkas, I.; Singh, S.P.; Pasero, P.; Galletto, R.; Aharoni, A. Rrm3 and Pif1 division of labor during replication through leading and lagging strand G-quadruplex. Nucleic Acids Res. 2024, 52, 1753–1762.
  104. Wu, W.Q.; Hou, X.M.; Li, M.; Dou, S.X.; Xi, X.G. BLM unfolds G-quadruplexes in different structural environments through different mechanisms. Nucleic Acids Res. 2015, 43, 4614–4626.
  105. Huet, J.; Cottrelle, P.; Cool, M.; Vignais, M.L.; Thiele, D.; Marck, C.; Buhler, J.M.; Sentenac, A.; Fromageot, P. A general upstream binding factor for genes of the yeast translational apparatus. EMBO J. 1985, 4, 3539–3547.
  106. Herbert, A. ALU non-B-DNA conformations, flipons, binary codes and evolution. R. Soc. Open Sci. 2020, 7, 200222.
  107. Esain-Garcia, I.; Kirchner, A.; Melidis, L.; Tavares, R.C.A.; Dhir, S.; Simeone, A.; Yu, Z.; Madden, S.K.; Hermann, R.; Tannahill, D.; et al. G-quadruplex DNA structure is a positive regulator of MYC transcription. Proc. Natl. Acad. Sci. USA 2024, 121, e2320240121.
  108. Shrestha, O.K.; Sharma, R.; Tomiczek, B.; Lee, W.; Tonelli, M.; Cornilescu, G.; Stolarska, M.; Nierzwicki, L.; Czub, J.; Markley, J.L.; et al. Structure and evolution of the 4-helix bundle domain of Zuotin, a J-domain protein co-chaperone of Hsp70. PLoS ONE 2019, 14, e0217098.
  109. Biffi, G.; Tannahill, D.; Balasubramanian, S. An intramolecular G-quadruplex structure is required for binding of telomeric repeat-containing RNA to the telomeric protein TRF2. J. Am. Chem. Soc. 2012, 134, 11974–11976.
  110. Sharma, S.; Mukherjee, A.K.; Roy, S.S.; Bagri, S.; Lier, S.; Verma, M.; Sengupta, A.; Kumar, M.; Nesse, G.; Pandey, D.P.; et al. Human telomerase is directly regulated by non-telomeric TRF2-G-quadruplex interaction. Cell Rep. 2021, 35, 109154.
  111. Boyer, L.A.; Latek, R.R.; Peterson, C.L. The SANT domain: A unique histone-tail-binding module? Nat. Rev. Mol. Cell Biol. 2004, 5, 158–163.
  112. Weintraub, A.S.; Li, C.H.; Zamudio, A.V.; Sigova, A.A.; Hannett, N.M.; Day, D.S.; Abraham, B.J.; Cohen, M.A.; Nabet, B.; Buckley, D.L.; et al. YY1 Is a Structural Regulator of Enhancer-Promoter Loops. Cell 2017, 171, 1573–1588.e28.
  113. Li, L.; Williams, P.; Ren, W.; Wang, M.Y.; Gao, Z.; Miao, W.; Huang, M.; Song, J.; Wang, Y. YY1 interacts with guanine quadruplexes to regulate DNA looping and gene expression. Nat. Chem. Biol. 2021, 17, 161–168.
  114. Wreczycka, K.; Franke, V.; Uyar, B.; Wurmus, R.; Bulut, S.; Tursun, B.; Akalin, A. HOT or not: Examining the basis of high-occupancy target regions. Nucleic Acids Res. 2019, 47, 5735–5745.
  115. Ramaker, R.C.; Hardigan, A.A.; Goh, S.T.; Partridge, E.C.; Wold, B.; Cooper, S.J.; Myers, R.M. Dissecting the regulatory activity and sequence content of loci with exceptional numbers of transcription factor associations. Genome Res. 2020, 30, 939–950.
  116. Partridge, E.C.; Chhetri, S.B.; Prokop, J.W.; Ramaker, R.C.; Jansen, C.S.; Goh, S.T.; Mackiewicz, M.; Newberry, K.M.; Brandsmeier, L.A.; Meadows, S.K.; et al. Occupancy maps of 208 chromatin-associated proteins in one human cell type. Nature 2020, 583, 720–728.
  117. Lago, S.; Nadai, M.; Cernilogar, F.M.; Kazerani, M.; Dominiguez Moreno, H.; Schotta, G.; Richter, S.N. Promoter G-quadruplexes and transcription factors cooperate to shape the cell type-specific transcriptome. Nat. Commun. 2021, 12, 3885.
  118. Bartman, C.R.; Hsu, S.C.; Hsiung, C.C.; Raj, A.; Blobel, G.A. Enhancer Regulation of Transcriptional Bursting Parameters Revealed by Forced Chromatin Looping. Mol. Cell 2016, 62, 237–247.
  119. Hasegawa, Y.; Struhl, K. Promoter-specific dynamics of TATA-binding protein association with the human genome. Genome Res. 2019, 29, 1939–1950.
  120. Henninger, J.E.; Oksuz, O.; Shrinivas, K.; Sagi, I.; LeRoy, G.; Zheng, M.M.; Andrews, J.O.; Zamudio, A.V.; Lazaris, C.; Hannett, N.M.; et al. RNA-Mediated Feedback Control of Transcriptional Condensates. Cell 2021, 184, 207–225.e24.
  121. De Nicola, B.; Lech, C.J.; Heddi, B.; Regmi, S.; Frasson, I.; Perrone, R.; Richter, S.N.; Phan, A.T. Structure and possible function of a G-quadruplex in the long terminal repeat of the proviral HIV-1 genome. Nucleic Acids Res. 2016, 44, 6442–6451.
  122. Butovskaya, E.; Heddi, B.; Bakalar, B.; Richter, S.N.; Phan, A.T. Major G-Quadruplex Form of HIV-1 LTR Reveals a (3 + 1) Folding Topology Containing a Stem-Loop. J. Am. Chem. Soc. 2018, 140, 13654–13662.
  123. Krafcikova, P.; Demkovicova, E.; Halaganova, A.; Viglasky, V. Putative HIV and SIV G-Quadruplex Sequences in Coding and Noncoding Regions Can Form G-Quadruplexes. J. Nucleic Acids 2017, 2017, 6513720.
  124. Pathak, R. G-Quadruplexes in the Viral Genome: Unlocking Targets for Therapeutic Interventions and Antiviral Strategies. Viruses 2023, 15, 2216.
  125. Ramskold, D.; Hendriks, G.J.; Larsson, A.J.M.; Mayr, J.V.; Ziegenhain, C.; Hagemann-Jensen, M.; Hartmanis, L.; Sandberg, R. Single-cell new RNA sequencing reveals principles of transcription at the resolution of individual bursts. Nat. Cell Biol. 2024.
  126. Shen, J.; Varshney, D.; Simeone, A.; Zhang, X.; Adhikari, S.; Tannahill, D.; Balasubramanian, S. Promoter G-quadruplex folding precedes transcription and is controlled by chromatin. Genome Biol. 2021, 22, 143.
  127. Baranello, L.; Wojtowicz, D.; Cui, K.; Devaiah, B.N.; Chung, H.J.; Chan-Salis, K.Y.; Guha, R.; Wilson, K.; Zhang, X.; Zhang, H.; et al. RNA Polymerase II Regulates Topoisomerase 1 Activity to Favor Efficient Transcription. Cell 2016, 165, 357–371.
  128. Marchand, C.; Pourquier, P.; Laco, G.S.; Jing, N.; Pommier, Y. Interaction of Human Nuclear Topoisomerase I with Guanosine Quartet-forming and Guanosine-rich Single-stranded DNA and RNA Oligonucleotides. J. Biol. Chem. 2002, 277, 8906–8911.
  129. Schwalb, B.; Michel, M.; Zacher, B.; Fruhauf, K.; Demel, C.; Tresch, A.; Gagneur, J.; Cramer, P. TT-seq maps the human transient transcriptome. Science 2016, 352, 1225–1228.
  130. Herbert, A. The ancient Z-DNA and Z-RNA specific Zα fold has evolved modern roles in immunity and transcription through the natural selection of flipons. R. Soc. Open Sci. 2024, 11, 240080.
  131. Beknazarov, N.; Konovalov, D.; Herbert, A.; Poptsova, M. Z-DNA formation in promoters conserved between human and mouse are associated with increased transcription reinitiation rates. Sci. Rep. 2024, 14, 17786.
  132. Le, S.N.; Brown, C.R.; Harvey, S.; Boeger, H.; Elmlund, H.; Elmlund, D. The TAFs of TFIID Bind and Rearrange the Topology of the TATA-Less RPS5 Promoter. Int. J. Mol. Sci. 2019, 20, 3290.
  133. Herbert, A. Flipons and small RNAs accentuate the asymmetries of pervasive transcription by the reset and sequence-specific microcoding of promoter conformation. J. Biol. Chem. 2023, 299, 105140.
  134. Herbert, A.; Pavlov, F.; Konovalov, D.; Poptsova, M. Conserved microRNAs and Flipons Shape Gene Expression during Development by Altering Promoter Conformations. Int. J. Mol. Sci. 2023, 24, 4884.
  135. Kouzine, F.; Wojtowicz, D.; Baranello, L.; Yamane, A.; Nelson, S.; Resch, W.; Kieffer-Kwon, K.R.; Benham, C.J.; Casellas, R.; Przytycka, T.M.; et al. Permanganate/S1 Nuclease Footprinting Reveals Non-B DNA Structures with Regulatory Potential across a Mammalian Genome. Cell Syst. 2017, 4, 344–356.
  136. Song, J.; Gooding, A.R.; Hemphill, W.O.; Love, B.D.; Robertson, A.; Yao, L.; Zon, L.I.; North, T.E.; Kasinath, V.; Cech, T.R. Structural basis for inactivation of PRC2 by G-quadruplex RNA. Science 2023, 381, 1331–1337.
  137. Watanabe, T.; Totoki, Y.; Toyoda, A.; Kaneda, M.; Kuramochi-Miyagawa, S.; Obata, Y.; Chiba, H.; Kohara, Y.; Kono, T.; Nakano, T.; et al. Endogenous siRNAs from naturally formed dsRNAs regulate transcripts in mouse oocytes. Nature 2008, 453, 539–543.
  138. Ha, H.; Song, J.; Wang, S.; Kapusta, A.; Feschotte, C.; Chen, K.C.; Xing, J. A comprehensive analysis of piRNAs from adult human testis and their relationship with genes and mobile elements. BMC Genom. 2014, 15, 545.
  139. Ozata, D.M.; Yu, T.; Mou, H.; Gainetdinov, I.; Colpan, C.; Cecchini, K.; Kaymaz, Y.; Wu, P.H.; Fan, K.; Kucukural, A.; et al. Evolutionarily conserved pachytene piRNA loci are highly divergent among modern humans. Nat. Ecol. Evol. 2020, 4, 156–168.
  140. Li, L.C.; Okino, S.T.; Zhao, H.; Pookot, D.; Place, R.F.; Urakami, S.; Enokida, H.; Dahiya, R. Small dsRNAs induce transcriptional activation in human cells. Proc. Natl. Acad. Sci. USA 2006, 103, 17337–17342.
  141. Matsui, M.; Chu, Y.; Zhang, H.; Gagnon, K.T.; Shaikh, S.; Kuchimanchi, S.; Manoharan, M.; Corey, D.R.; Janowski, B.A. Promoter RNA links transcriptional regulation of inflammatory pathway genes. Nucleic Acids Res. 2013, 41, 10086–10109.
  142. Leonaite, B.; Han, Z.; Basquin, J.; Bonneau, F.; Libri, D.; Porrua, O.; Conti, E. Sen1 has unique structural features grafted on the architecture of the Upf1-like helicase family. EMBO J. 2017, 36, 1590–1604.
  143. Lansdorp, P.; van Wietmarschen, N. Helicases FANCJ, RTEL1 and BLM Act on Guanine Quadruplex DNA in Vivo. Genes 2019, 10, 870.
  144. Nguyen, H.D.; Yadav, T.; Giri, S.; Saez, B.; Graubert, T.A.; Zou, L. Functions of Replication Protein A as a Sensor of R Loops and a Regulator of RNaseH1. Mol. Cell 2017, 65, 832–847.e4.
  145. Yan, Q.; Wulfridge, P.; Doherty, J.; Fernandez-Luna, J.L.; Real, P.J.; Tang, H.Y.; Sarma, K. Proximity labeling identifies a repertoire of site-specific R-loop modulators. Nat. Commun. 2022, 13, 53.
  146. Chernukhin, I.; Shamsuddin, S.; Kang, S.Y.; Bergstrom, R.; Kwon, Y.W.; Yu, W.; Whitehead, J.; Mukhopadhyay, R.; Docquier, F.; Farrar, D.; et al. CTCF interacts with and recruits the largest subunit of RNA polymerase II to CTCF target sites genome-wide. Mol. Cell. Biol. 2007, 27, 1631–1648.
  147. Gomes, N.P.; Espinosa, J.M. Gene-specific repression of the p53 target gene PUMA via intragenic CTCF-Cohesin binding. Genes Dev. 2010, 24, 1022–1034.
  148. Nanavaty, V.; Abrash, E.W.; Hong, C.; Park, S.; Fink, E.E.; Li, Z.; Sweet, T.J.; Bhasin, J.M.; Singuri, S.; Lee, B.H.; et al. DNA Methylation Regulates Alternative Polyadenylation via CTCF and the Cohesin Complex. Mol. Cell 2020, 78, 752–764.e6.
  149. Mao, S.Q.; Ghanbarian, A.T.; Spiegel, J.; Martinez Cuesta, S.; Beraldi, D.; Di Antonio, M.; Marsico, G.; Hansel-Hertsch, R.; Tannahill, D.; Balasubramanian, S. DNA G-quadruplex structures mold the DNA methylome. Nat. Struct. Mol. Biol. 2018, 25, 951–957.
  150. Alharbi, A.B.; Schmitz, U.; Bailey, C.G.; Rasko, J.E.J. CTCF as a regulator of alternative splicing: New tricks for an old player. Nucleic Acids Res. 2021, 49, 7825–7838.
  151. Gajos, M.; Jasnovidova, O.; van Bommel, A.; Freier, S.; Vingron, M.; Mayer, A. Conserved DNA sequence features underlie pervasive RNA polymerase pausing. Nucleic Acids Res. 2021, 49, 4402–4420.
  152. Ehara, H.; Kujirai, T.; Shirouzu, M.; Kurumizaka, H.; Sekine, S.I. Structural basis of nucleosome disassembly and reassembly by RNAPII elongation complex with FACT. Science 2022, 377, eabp9466.
  153. Cramer, P.; Pesce, C.G.; Baralle, F.E.; Kornblihtt, A.R. Functional association between promoter structure and transcript alternative splicing. Proc. Natl. Acad. Sci. USA 1997, 94, 11456–11460.
  154. Cramer, P.; Caceres, J.F.; Cazalla, D.; Kadener, S.; Muro, A.F.; Baralle, F.E.; Kornblihtt, A.R. Coupling of transcription with alternative splicing: RNA pol II promoters modulate SF2/ASF and 9G8 effects on an exonic splicing enhancer. Mol. Cell 1999, 4, 251–258.
  155. He, X.; Yuan, J.; Gao, Z.; Wang, Y. Promoter R-Loops Recruit U2AF1 to Modulate Its Phase Separation and RNA Splicing. J. Am. Chem. Soc. 2023, 145, 21646–21660.
  156. Shukla, S.; Kavak, E.; Gregory, M.; Imashimizu, M.; Shutinoski, B.; Kashlev, M.; Oberdoerffer, P.; Sandberg, R.; Oberdoerffer, S. CTCF-promoted RNA polymerase II pausing links DNA methylation to splicing. Nature 2011, 479, 74–79.
  157. Marina, R.J.; Sturgill, D.; Bailly, M.A.; Thenoz, M.; Varma, G.; Prigge, M.F.; Nanan, K.K.; Shukla, S.; Haque, N.; Oberdoerffer, S. TET-catalyzed oxidation of intragenic 5-methylcytosine regulates CTCF-dependent alternative splicing. EMBO J. 2016, 35, 335–355.
  158. Guo, Y.; Monahan, K.; Wu, H.; Gertz, J.; Varley, K.E.; Li, W.; Myers, R.M.; Maniatis, T.; Wu, Q. CTCF/cohesin-mediated DNA looping is required for protocadherin alpha promoter choice. Proc. Natl. Acad. Sci. USA 2012, 109, 21081–21086.
  159. Monahan, K.; Rudnick, N.D.; Kehayova, P.D.; Pauli, F.; Newberry, K.M.; Myers, R.M.; Maniatis, T. Role of CCCTC binding factor (CTCF) and cohesin in the generation of single-cell diversity of protocadherin-alpha gene expression. Proc. Natl. Acad. Sci. USA 2012, 109, 9125–9130. [Google Scholar] [CrossRef]
  160. Lamas-Maceiras, M.; Singh, B.N.; Hampsey, M.; Freire-Picos, M.A. Promoter-Terminator Gene Loops Affect Alternative 3′-End Processing in Yeast. J. Biol. Chem. 2016, 291, 8960–8968. [Google Scholar] [CrossRef]
  161. Tan-Wong, S.M.; Zaugg, J.B.; Camblong, J.; Xu, Z.; Zhang, D.W.; Mischo, H.E.; Ansari, A.Z.; Luscombe, N.M.; Steinmetz, L.M.; Proudfoot, N.J. Gene loops enhance transcriptional directionality. Science 2012, 338, 671–675. [Google Scholar] [CrossRef]
  162. von Hacht, A.; Seifert, O.; Menger, M.; Schutze, T.; Arora, A.; Konthur, Z.; Neubauer, P.; Wagner, A.; Weise, C.; Kurreck, J. Identification and characterization of RNA guanine-quadruplex binding proteins. Nucleic Acids Res. 2014, 42, 6630–6644. [Google Scholar] [CrossRef] [PubMed]
  163. Zhang, J.; Harvey, S.E.; Cheng, C. A high-throughput screen identifies small molecule modulators of alternative splicing by targeting RNA G-quadruplexes. Nucleic Acids Res. 2019, 47, 3667–3679. [Google Scholar] [CrossRef]
  164. Jara-Espejo, M.; Fleming, A.M.; Burrows, C.J. Potential G-Quadruplex Forming Sequences and N(6)-Methyladenosine Colocalize at Human Pre-mRNA Intron Splice Sites. ACS Chem. Biol. 2020, 15, 1292–1300. [Google Scholar] [CrossRef]
  165. Darnell, R.B.; Ke, S.; Darnell, J.E., Jr. Pre-mRNA processing includes N(6) methylation of adenosine residues that are retained in mRNA exons and the fallacy of “RNA epigenetics”. RNA 2018, 24, 262–267. [Google Scholar] [CrossRef]
  166. Wei, G.; Almeida, M.; Pintacuda, G.; Coker, H.; Bowness, J.S.; Ule, J.; Brockdorff, N. Acute depletion of METTL3 implicates N(6)-methyladenosine in alternative intron/exon inclusion in the nascent transcriptome. Genome Res. 2021, 31, 1395–1408. [Google Scholar] [CrossRef]
  167. Fleming, A.M.; Nguyen, N.L.B.; Burrows, C.J. Colocalization of m(6)A and G-Quadruplex-Forming Sequences in Viral RNA (HIV, Zika, Hepatitis B, and SV40) Suggests Topological Control of Adenosine N(6)-Methylation. ACS Cent. Sci. 2019, 5, 218–228. [Google Scholar] [CrossRef]
  168. Yoshida, A.; Oyoshi, T.; Suda, A.; Futaki, S.; Imanishi, M. Recognition of G-quadruplex RNA by a crucial RNA methyltransferase component, METTL14. Nucleic Acids Res. 2022, 50, 449–457. [Google Scholar] [CrossRef] [PubMed]
  169. Patil, D.P.; Chen, C.K.; Pickering, B.F.; Chow, A.; Jackson, C.; Guttman, M.; Jaffrey, S.R. m(6)A RNA methylation promotes XIST-mediated transcriptional repression. Nature 2016, 537, 369–373. [Google Scholar] [CrossRef]
  170. Ye, H.; Li, T.; Rigden, D.J.; Wei, Z. m6ACali: Machine learning-powered calibration for accurate m6A detection in MeRIP-Seq. Nucleic Acids Res. 2024, 52, 4830–4842. [Google Scholar] [CrossRef]
  171. Iwasaki, Y.; Ookuro, Y.; Iida, K.; Nagasawa, K.; Yoshida, W. Destabilization of DNA and RNA G-quadruplex structures formed by GGA repeat due to N(6)-methyladenine modification. Biochem. Biophys. Res. Commun. 2022, 597, 134–139. [Google Scholar] [CrossRef]
  172. Shi, H.; Wei, J.; He, C. Where, When, and How: Context-Dependent Functions of RNA Methylation Writers, Readers, and Erasers. Mol. Cell 2019, 74, 640–650. [Google Scholar] [CrossRef] [PubMed]
  173. Ke, S.; Pandya-Jones, A.; Saito, Y.; Fak, J.J.; Vagbo, C.B.; Geula, S.; Hanna, J.H.; Black, D.L.; Darnell, J.E., Jr.; Darnell, R.B. m(6)A mRNA modifications are deposited in nascent pre-mRNA and are not required for splicing but do specify cytoplasmic turnover. Genes Dev. 2017, 31, 990–1006. [Google Scholar] [CrossRef] [PubMed]
  174. Mestre-Fos, S.; Penev, P.I.; Suttapitugsakul, S.; Hu, M.; Ito, C.; Petrov, A.S.; Wartell, R.M.; Wu, R.; Williams, L.D. G-Quadruplexes in Human Ribosomal RNA. J. Mol. Biol. 2019, 431, 1940–1955. [Google Scholar] [CrossRef] [PubMed]
  175. Scognamiglio, P.L.; Di Natale, C.; Leone, M.; Poletto, M.; Vitagliano, L.; Tell, G.; Marasco, D. G-quadruplex DNA recognition by nucleophosmin: New insights from protein dissection. Biochim. Biophys. Acta 2014, 1840, 2050–2059. [Google Scholar] [CrossRef]
  176. Okuwaki, M.; Saotome-Nakamura, A.; Yoshimura, M.; Saito, S.; Hirawake-Mogi, H.; Sekiya, T.; Nagata, K. RNA-recognition motifs and glycine and arginine-rich region cooperatively regulate the nucleolar localization of nucleolin. J. Biochem. 2021, 169, 87–100. [Google Scholar] [CrossRef] [PubMed]
  177. Santos, T.; Salgado, G.F.; Cabrita, E.J.; Cruz, C. Nucleolin: A binding partner of G-quadruplex structures. Trends Cell Biol. 2022, 32, 561–564. [Google Scholar] [CrossRef]
  178. Tian, B.; Manley, J.L. Alternative polyadenylation of mRNA precursors. Nat. Rev. Mol. Cell Biol. 2017, 18, 18–30. [Google Scholar] [CrossRef]
  179. Leppek, K.; Das, R.; Barna, M. Functional 5′ UTR mRNA structures in eukaryotic translation regulation and how to find them. Nat. Rev. Mol. Cell Biol. 2018, 19, 158–174. [Google Scholar] [CrossRef]
  180. Schuster, S.L.; Hsieh, A.C. The Untranslated Regions of mRNAs in Cancer. Trends Cancer 2019, 5, 245–262. [Google Scholar] [CrossRef]
  181. Mayr, C. What Are 3′ UTRs Doing? Cold Spring Harb. Perspect. Biol. 2019, 11, a034728. [Google Scholar] [CrossRef]
  182. Lee, D.S.M.; Ghanem, L.R.; Barash, Y. Integrative analysis reveals RNA G-quadruplexes in UTRs are selectively constrained and enriched for functional associations. Nat. Commun. 2020, 11, 527. [Google Scholar] [CrossRef] [PubMed]
  183. Sauer, M.; Juranek, S.A.; Marks, J.; De Magis, A.; Kazemier, H.G.; Hilbig, D.; Benhalevy, D.; Wang, X.; Hafner, M.; Paeschke, K. DHX36 prevents the accumulation of translationally inactive mRNAs with G4-structures in untranslated regions. Nat. Commun. 2019, 10, 2421. [Google Scholar] [CrossRef] [PubMed]
  184. Benhalevy, D.; Gupta, S.K.; Danan, C.H.; Ghosal, S.; Sun, H.W.; Kazemier, H.G.; Paeschke, K.; Hafner, M.; Juranek, S.A. The Human CCHC-type Zinc Finger Nucleic Acid-Binding Protein Binds G-Rich Elements in Target mRNA Coding Sequences and Promotes Translation. Cell Rep. 2017, 18, 2979–2990. [Google Scholar] [CrossRef]
  185. Dong, L.; Mao, Y.; Zhou, A.; Liu, X.M.; Zhou, J.; Wan, J.; Qian, S.B. Relaxed initiation pausing of ribosomes drives oncogenic translation. Sci. Adv. 2021, 7, eabd6927. [Google Scholar] [CrossRef] [PubMed]
  186. Zhou, J.; Wan, J.; Gao, X.; Zhang, X.; Jaffrey, S.R.; Qian, S.B. Dynamic m(6)A mRNA methylation directs translational control of heat shock response. Nature 2015, 526, 591–594. [Google Scholar] [CrossRef]
  187. Zaccara, S.; Jaffrey, S.R. A Unified Model for the Function of YTHDF Proteins in Regulating m(6)A-Modified mRNA. Cell 2020, 181, 1582–1595.e1518. [Google Scholar] [CrossRef]
  188. Cirillo, L.A.; Lin, F.R.; Cuesta, I.; Friedman, D.; Jarnik, M.; Zaret, K.S. Opening of compacted chromatin by early developmental transcription factors HNF3 (FoxA) and GATA-4. Mol. Cell 2002, 9, 279–289. [Google Scholar] [CrossRef]
  189. Zaret, K.S. Pioneer Transcription Factors Initiating Gene Network Changes. Annu. Rev. Genet. 2020, 54, 367–385. [Google Scholar] [CrossRef]
  190. Herbert, A. Nucleosomes and flipons exchange energy to alter chromatin conformation, the readout of genomic information, and cell fate. Bioessays 2022, 44, e2200166. [Google Scholar] [CrossRef]
  191. Czech, B.; Munafo, M.; Ciabrelli, F.; Eastwood, E.L.; Fabry, M.H.; Kneuss, E.; Hannon, G.J. piRNA-Guided Genome Defense: From Biogenesis to Silencing. Annu. Rev. Genet. 2018, 52, 131–157. [Google Scholar] [CrossRef]
  192. Zyner, K.G.; Simeone, A.; Flynn, S.M.; Doyle, C.; Marsico, G.; Adhikari, S.; Portella, G.; Tannahill, D.; Balasubramanian, S. G-quadruplex DNA structures in human stem cells and differentiation. Nat. Commun. 2022, 13, 142. [Google Scholar] [CrossRef] [PubMed]
  193. Skourti-Stathaki, K.; Torlai Triglia, E.; Warburton, M.; Voigt, P.; Bird, A.; Pombo, A. R-Loops Enhance Polycomb Repression at a Subset of Developmental Regulator Genes. Mol. Cell 2019, 73, 930–945.e934. [Google Scholar] [CrossRef] [PubMed]
  194. Yang, Q.; Lin, J.; Liu, M.; Li, R.; Tian, B.; Zhang, X.; Xu, B.; Liu, M.; Zhang, X.; Li, Y.; et al. Highly sensitive sequencing reveals dynamic modifications and activities of small RNAs in mouse oocytes and early embryos. Sci. Adv. 2016, 2, e1501482. [Google Scholar] [CrossRef] [PubMed]
  195. Zhang, Y.; Zhang, X.; Shi, J.; Tuorto, F.; Li, X.; Liu, Y.; Liebers, R.; Zhang, L.; Qu, Y.; Qian, J.; et al. Dnmt2 mediates intergenerational transmission of paternally acquired metabolic disorders through sperm small non-coding RNAs. Nat. Cell Biol. 2018, 20, 535–540. [Google Scholar] [CrossRef]
  196. Paloviita, P.; Hyden-Granskog, C.; Yohannes, D.A.; Paluoja, P.; Kere, J.; Tapanainen, J.S.; Krjutskov, K.; Tuuri, T.; Vosa, U.; Vuoristo, S. Small RNA expression and miRNA modification dynamics in human oocytes and early embryos. Genome Res. 2021, 31, 1474–1485. [Google Scholar] [CrossRef]
  197. Tomar, A.; Gomez-Velazquez, M.; Gerlini, R.; Comas-Armangue, G.; Makharadze, L.; Kolbe, T.; Boersma, A.; Dahlhoff, M.; Burgstaller, J.P.; Lassi, M.; et al. Epigenetic inheritance of diet-induced and sperm-borne mitochondrial RNAs. Nature 2024, 630, 720–727. [Google Scholar] [CrossRef]
  198. Maldonado, R.; Langst, G. The chromatin—Triple helix connection. Biol. Chem. 2023, 404, 1037–1049. [Google Scholar] [CrossRef]
  199. Leisegang, M.S.; Warwick, T.; Stotzel, J.; Brandes, R.P. RNA-DNA triplexes: Molecular mechanisms and functional relevance. Trends Biochem. Sci. 2024, 49, 532–544. [Google Scholar] [CrossRef]
  200. Zhou, Z.; Giles, K.E.; Felsenfeld, G. DNA.RNA triple helix formation can function as a cis-acting regulatory mechanism at the human beta-globin locus. Proc. Natl. Acad. Sci. USA 2019, 116, 6130–6139. [Google Scholar] [CrossRef]
  201. Maldonado, R.; Schwartz, U.; Silberhorn, E.; Langst, G. Nucleosomes Stabilize ssRNA-dsDNA Triple Helices in Human Cells. Mol. Cell 2019, 73, 1243–1254.e1246. [Google Scholar] [CrossRef]
  202. Kohestani, H.; Wereszczynski, J. The effects of RNA.DNA-DNA triple helices on nucleosome structures and dynamics. Biophys. J. 2023, 122, 1229–1239. [Google Scholar] [CrossRef] [PubMed]
  203. Jimenez-Garcia, E.; Vaquero, A.; Espinas, M.L.; Soliva, R.; Orozco, M.; Bernues, J.; Azorin, F. The GAGA factor of Drosophila binds triple-stranded DNA. J. Biol. Chem. 1998, 273, 24640–24648. [Google Scholar] [CrossRef] [PubMed]
  204. Leisegang, M.S.; Bains, J.K.; Seredinski, S.; Oo, J.A.; Krause, N.M.; Kuo, C.C.; Gunther, S.; Senturk Cetin, N.; Warwick, T.; Cao, C.; et al. HIF1alpha-AS1 is a DNA:DNA:RNA triplex-forming lncRNA interacting with the HUSH complex. Nat. Commun. 2022, 13, 6563. [Google Scholar] [CrossRef] [PubMed]
Figure 1. GQs fold in many different ways. (A) The core four-stranded structure formed by stacking the guanine tetrads shown in (B), with Hoogsteen hydrogen bonds highlighted in yellow and crimson. The four strands may form from G-repeats on the same molecule or on different molecules, and may be either RNA or DNA. (C) The base 8-aza-7-deazaguanosine retains the same molecular composition as guanosine, but with the red ring nitrogen in a different position, preventing the formation of the Hoogsteen hydrogen bonds shown with crimson shading. Control oligonucleotides incorporating this nucleotide will not form GQs. In intramolecular GQs, the strands may be parallel (D,G), anti-parallel (E,H) or hybrid (F,I). The topology of the connecting loops is shown in blue and can be propeller (D), lateral (E) or diagonal (F). M+ indicates a metal ion located at the core of the tetrad. K+ promotes GQ formation while Li+ does not. The dotted strand in (F) labeled 5 indicates that many sequences capable of forming GQs contain a “spare tire” that can maintain the fold when one of the other repeats is damaged [37]. The cartoons in (G–I) show the phosphate backbone as a ribbon and the bases as sticks. PDB codes are given below the structures.
Figure 2. Some proteins bind both B-DNA and GQs. The yeast Rap1 protein binds both to B-DNA in a sequence-specific manner (A) [55] and to GQs in a structure-specific manner (B) [56] through different faces of the same helix in the SANT/Myb1 domain (images by Daniela Rhodes). (C) The flipon cycle creates a two-state switch with similar affinities of Rap1 for B-DNA and for GQs (i.e., Kb ≈ Kg ≈ 20–30 nM). Flipons thereby enable binary coding within the genome.
Figure 3. The G-flipon cycle. Many factors induce and resolve GQs to modulate specific outcomes. Others, such as nucleosomes and repressor proteins, prevent their formation. Transcription factors can dock to G-flipons either in a sequence-specific manner, when the flipon is in its right-handed B-DNA conformation, or in a structure-specific manner, when it forms a GQ. Once formed, the resolution of GQs can be coupled to the different outcomes shown, with both activation and inhibition of gene expression. The inhibition of enzymes like DNMT1 and TOP1 favors the maintenance of an unmethylated, nucleosome-depleted state that is necessary to rapidly reprogram cellular responses to environmental inputs. (BER: base excision repair; NER: nucleotide excision repair). References for each gene are given in the text.
Figure 5. Flipon-cycle promoters. The reset and reinitiation of transcription complexes are actuated by G- and Z-flipons. (A) In this model, a condensate is anchored by GQs formed by enhancer and promoter sequences. The condensate stabilizes an enhancer–promoter loop and holds the RNA polymerase (RPOL2) in a poised state. (B) Breakdown of the condensate triggers transcript elongation. The negative supercoiling 5′ to RNAP2 induces Z-DNA formation that actuates the removal of the pre-initiation complex (PIC). (C) The resolution of the promoter GQ by helicases then enables the rebinding of TFs (transcription factors) and reassembly of the PIC. The separation of strands indicates the partial unwinding that is necessary to form the transcription bubble and a GQ.
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

Share and Cite

MDPI and ACS Style

Herbert, A. A Compendium of G-Flipon Biological Functions That Have Experimental Validation. Int. J. Mol. Sci. 2024, 25, 10299. https://doi.org/10.3390/ijms251910299
