#### *2.1. Transcriptional Control of the Major IE Gene*

The outcome of HCMV infection is believed to depend largely on the level and timing of expression from the major IE gene [23,72,73]. This is the first viral gene to be transcribed following initial infection, and likely during reactivation from latency, in a process that does not require de novo viral protein synthesis [13,23,74]. Expression of the major IE gene is highly dynamic with transcription levels ranging from extremely high to negligibly low depending on the type and differentiation or activation state of the infected cell. While productive HCMV infection is linked to activated transcription, viral latency is characterized by transcriptional repression at this gene. The major IE gene is located in the unique long (UL) segment of the viral genome, close to the internal repeat elements. The organization of this gene is unusually complex, not just by viral standards, as multiple promoters and numerous transcripts including both sense and antisense RNAs have been identified in this region [3,75–83].

Some of these promoters appear to have a specific role during latent infection or reactivation from latency [75,79,84]. However, the combined major IE enhancer and promoter (MIEP) is considered the principal driver of IE transcription during productive HCMV infection. It contains an extremely strong enhancer which has been widely utilized in heterologous expression systems. The MIEP is bidirectional and has been roughly divided into four functional entities: a core promoter (+1 to −40 nucleotides from the transcription start site), an enhancer (−40 to −550 nucleotides), a unique region (−550 to −750 nucleotides) and a modulator (−750 to −1140 nucleotides) [85,86] (Figure 1). The modulator's role is largely unknown, although a cell-type specific regulatory function has been suggested [87–89]. "Rightward" transcription from the MIEP is suppressed by the unique region, which binds cellular homeobox proteins and appears to function as an insulator between the enhancer and UL127 [90–95] (Figure 1). The core promoter is sufficient, yet not required, for low-level transcription in the "leftward" direction of the major IE gene [76,96]. It contains a TATA-box as well as the *cis*-repressive sequence (crs) that serves as a binding site for IE2 dimers (see Section 3.1). The enhancer hugely augments transcription from the major IE gene, in part via a number of small *cis*-acting repeat sequences (18-bp, 19-bp and 21-bp), and is required for viral replication. It may be further divided into proximal and distal enhancer halves (−40 to −300 nucleotides and −300 to −550 nucleotides, respectively) that differ in structural makeup, yet function jointly by contributing multiple *cis*-acting elements to provide efficient MIEP activation and viral replication. Accordingly, a long list of activating cellular transcription factors has been shown or proposed to bind to the *cis*-acting elements in the enhancer, unique region and modulator. In addition, binding of several repressive cellular transcription factors to the enhancer and modulator has been reported (Figure 1). The vast number of transcription factors that may activate or repress the MIEP is thought to account for much of the highly dynamic expression observed at the major IE gene [13,23,74].
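The segment boundaries quoted above can be captured in a small lookup, which may help when mapping a binding site or reporter construct onto the MIEP. The following is a minimal sketch only: the names `MIEP_REGIONS` and `miep_region` are hypothetical, the coordinates are the approximate values given in the text, and assigning a shared boundary (e.g., −40) to the segment closer to the transcription start site is an arbitrary assumption made for illustration.

```python
# Approximate MIEP segments as quoted in the text, in nucleotides relative to
# the transcription start site (+1). Listed from the start site outwards.
# Names and boundary handling are illustrative assumptions.
MIEP_REGIONS = [
    ("core promoter", -40, 1),
    ("proximal enhancer", -300, -40),
    ("distal enhancer", -550, -300),
    ("unique region", -750, -550),
    ("modulator", -1140, -750),
]


def miep_region(position: int) -> str:
    """Return the MIEP segment that contains a position relative to the start site."""
    if position == 0:
        raise ValueError("position 0 does not exist; numbering runs ..., -2, -1, +1")
    for name, upstream, downstream in MIEP_REGIONS:
        # Shared boundaries go to the first match, i.e. the segment
        # closer to the transcription start site (an assumption).
        if upstream <= position <= downstream:
            return name
    raise ValueError(f"position {position} lies outside the MIEP as defined here")


if __name__ == "__main__":
    for pos in (-20, -100, -400, -600, -1000):
        print(pos, miep_region(pos))
```

Because the published boundaries are themselves approximate, a lookup of this kind is best treated as a coarse orientation aid rather than a precise annotation.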

The complexity of MIEP regulation increases further when transcription is considered in its chromatin or "epigenetic" context. The MIEP may undergo limited DNA methylation, especially in systems for transgene expression, and the major IE gene exhibits CpG dinucleotide suppression [97–101]. Beyond these observations, there is little evidence that MIEP activity or IE transcription is regulated by DNA methylation following HCMV or MCMV infection [102–104]. Nuclear HCMV genomes form nucleosomes, octamers of core histones H2A, H2B, H3 and H4 wrapped with just under 150 bp of DNA, resembling host chromatin structure [9,105]. Consequently, the chromatin of HCMV and other DNA viruses that replicate in the nucleus is subject to regulation by nucleosome occupancy, histone composition and post-translational histone modification [106–108]. Nucleosome occupancy on the MIEP is believed to be low during productive infection, but likely increases during establishment of latency based on findings from the mouse model and by analogy to other herpesviruses [9,105,109]. Numerous studies have shown correlations between activating or repressive histone modifications associated with the major IE gene and the levels of viral gene expression. For example, association of the MIEP with H3K4me2, H3K4me3, H3K9/14ac, H3S10ph or H4Kac has been linked to high levels of IE (or transgene) transcription and productive infection or reactivation from latency [72,110–120]. By contrast, the presence of H3K9me2, H3K9me3 or H3K27me3 at the MIEP generally correlates with low levels of IE transcription and either latent infection or the onset (pre-IE phase) of productive infection [110–112,115,116] (Figure 1). In agreement with these observations, histone modifying enzymes and enzyme complexes including histone acetyltransferases (e.g., KAT6A/MOZ), histone deacetylases (e.g., HDAC1, HDAC3), histone methyltransferases (e.g., EHMT2/G9A, EZH2, SETDB1, SUV39H1), histone demethylases (e.g., KDM1A/LSD1, KDM4A/JMJD2, KDM6B/JMJD3) and histone kinases (e.g., MSK family) have all been implicated in regulating transcription from the MIEP [72,119–129] (Figure 1). The histone modifying proteins are typically recruited by transcription factors bound to the MIEP including cAMP responsive element binding protein 1 (CREB1), ETS2 repressor factor (ERF) and Yin Yang 1 transcription factor (YY1). In turn, chromatin modifications lead to the recruitment of further activators or repressors that may affect IE expression, making for a complex hierarchy of transcriptional regulation [107,130,131].

Histone deacetylases, histone demethylases and other proteins conferring repressive histone modifications to HCMV chromatin may be considered components of the intrinsic cellular immune system also known as restriction factors [132]. Many of the best known restriction factors for HCMV reside in nuclear organelles referred to as nuclear domain 10 or promyelocytic leukaemia (PML) bodies [133–135]. While PML bodies may confer transcriptional repression as a whole, constituents of these organelles including alpha thalassemia/mental retardation syndrome X-linked protein (ATRX), death domain-associated protein (DAXX), PML protein and SP100 nuclear antigen have been shown or proposed to act as repressors of major IE gene expression in part via chromatin-based mechanisms [136–138]. More recently, cellular proteins that mediate foreign or damaged DNA sensing and signalling, including cyclic guanosine monophosphate-adenosine monophosphate (cGAMP) synthase, interferon (IFN) gamma-inducible protein 16 (IFI16) and stimulator of IFN genes (STING), have been identified as restriction factors of HCMV and other DNA viruses [139–142]. These proteins are known or predicted to restrict IE transcription, at least indirectly, although IFI16 may activate rather than repress the MIEP [143,144].

Expression of the major IE gene also varies with the activity of cellular signalling pathways that connect the extra- and intracellular environment to the nucleosomes and transcription factors associated with the HCMV genome including the MIEP. The virus has been shown to activate, rewire or inhibit many of these signalling pathways. HCMV infection triggers both pathways considered to be proviral and pathways linked to innate immune responses resulting in the production of proinflammatory and antiviral cytokines. In fact, many signalling pathways appear to exhibit both pro- and antiviral potential, and the net effect on the virus depends on various factors including cell type and stage of infection. Binding of HCMV to receptor proteins on the cell surface initiates the first wave of signalling. The virus engages various cellular entry receptors, several of which activate similar pathways relevant to the IE phase of infection [145–149]. In particular, epidermal growth factor receptor (EGFR), platelet-derived growth factor receptor alpha and integrins independently trigger the phosphatidylinositol 3-kinase and protein kinase B (PI3K/AKT) pathway [150–152]. The PI3K/AKT pathway is central to many cellular properties including motility, proliferation and survival [153–155]. Transient induction of this pathway triggered by receptor signalling appears to be followed by more sustained activation involving the viral major IE proteins [156–159]. Initial PI3K/AKT activation is required for efficient viral entry as well as optimal replication in fibroblasts and establishment of latency in monocytes [156,158,160–163]. However, at later times during infection, inhibition of EGFR or PI3K seems to favour viral replication and reactivation from latency, suggesting a negative regulatory role at this point [164–167]. Besides PI3K/AKT signalling, various other kinase pathways are known to be activated very early during HCMV infection.
These pathways include mitogen-activated protein kinase (MAPK) signalling both via extracellular signal-regulated kinase (ERK) 1 and 2 including RAF1 (MAPKKK upstream of ERK) as well as via p38 MAPK [168–172]. Other kinases thought to be involved in the IE phase of HCMV infection include adenosine monophosphate-activated protein kinase (AMPK) [173], hematopoietic cell kinase (a src family kinase) [174], cyclin-dependent kinases (CDKs) [175], protein kinase A [176] and mitogen- and stress-activated kinase (MSK) [128]. The activation of kinase signalling pathways in the initial infection phase comes with multiple, mostly beneficial consequences for the virus including major IE gene activation. For example, ERK mediates induction of major IE gene expression via binding of CREB to the MIEP and recruitment of MSK. In turn, MSK-mediated histone H3 phosphorylation promotes histone demethylation and the subsequent exit of HCMV from latency [128]. One of the most crucial transcription factors linked to the PI3K/AKT, MAPK and other signalling pathways relevant to HCMV infection is nuclear factor kappa B (NF-κB). Canonical NF-κB activation requires degradation of inhibitor of NF-κB (IκB), which depends on phosphorylation by a three-subunit IκB kinase (IKK). IKK-mediated phosphorylation of IκB is triggered as early as five minutes after exposure of cells to HCMV particles, resulting in activation of preformed NF-κB [177,178]. This first phase of the NF-κB response to HCMV infection may facilitate IE expression via binding sites in the proximal enhancer of the MIEP (Figure 1). However, the requirement of NF-κB for efficient IE expression varies widely with cell type, virus strain and other conditions of infection [179–182]. A second phase of NF-κB activation due to initiation of NF-κB transcription allows for continued expression throughout infection. While NF-κB activation benefits HCMV replication, at least under certain conditions, it also comes with adverse effects for the virus. NF-κB, along with IFN regulatory factor 3 (IRF3), binds to promoters and stimulates transcription of numerous cytokine and chemokine genes. Some of these genes encode antiviral proteins including type I IFNs. HCMV gene products including tegument proteins (e.g., pUL35, pUL82/pp71, pUL83/pp65) and IE proteins as well as non-coding RNAs target the IFN response and other signalling pathways, adding a further layer of complexity. Targeting of host cell signalling by HCMV will be discussed below in the context of IE proteins (see Section 3.4), but is otherwise beyond the scope of this review. For a comprehensive and detailed account of this topic, the reader is referred to several other recent reviews [183–185].

**Figure 1.** Organisation of the human cytomegalovirus (HCMV) major IE enhancer and promoter (MIEP) and select protein factors involved in its regulation. The MIEP is composed of a core promoter containing a TATA-box and the crs that mediates repression by IE2, an enhancer with proximal and distal parts, a unique element and a modulator. Nucleotide positions relative to the transcription start sites and the direction of transcription (grey arrows) are indicated. "Leftward" transcription results in mRNAs encoding the IE1 and IE2 proteins ("rightward" transcription results in uncharacterized mRNAs containing the UL127 open reading frame). Transcription factors known or predicted to bind to the individual parts of the MIEP are shown above (repressors are shown in purple). Chromatin modifiers and histone tail modifications reported to activate or repress the MIEP are shown below. A few examples of virion components and cell signalling pathways known to activate the MIEP are shown at the left and right side, respectively, of the diagram. ARID5B/MRF1, AT-rich interaction domain 5B protein; ATF, activating transcription factor family; CBX/HP1, heterochromatin protein 1; CEBPA, CCAAT enhancer binding protein alpha; CHD4, chromodomain helicase DNA binding protein 4, nucleosome remodeling and deacetylase (NuRD) subunit; CUX1/CDP, cut-like homeobox 1 protein; ELK1, ETS transcription factor Elk1; ETS, Ets proto-oncogene transcription factor; EZH2, enhancer of zeste 2 polycomb repressive complex 2 (PRC2) subunit; FOS, Fos proto-oncogene, activator protein 1 (AP-1) transcription factor subunit; FOX, forkhead transcription factor family; GFI1, growth factor-independent 1 transcriptional repressor; HMGB1/SBP, high mobility group box 1 protein; JUN, Jun proto-oncogene, AP-1 transcription factor subunit; KAT6A/MOZ, lysine acetyltransferase 6A; KDM1A/LSD1, lysine demethylase 1A; KDM4A/JMJD2, lysine demethylase 4A; KDM6B/JMJD3, lysine demethylase 6B; MDBP, methylated DNA binding protein family; MTA2, metastasis-associated 1 family member 2, NuRD subunit; NFI/CTF, nuclear factor 1 family; PDX1, pancreatic and duodenal homeobox 1 protein; PPARG, peroxisome proliferator-activated receptor gamma; RARA, retinoic acid receptor alpha; RBBP4, Rb binding protein 4 chromatin remodelling factor, NuRD subunit; RXRA, retinoic X receptor alpha; SATB1, special AT-rich sequence binding homeobox 1 protein; SETDB1, SET domain bifurcated histone lysine methyltransferase 1; SP1, Sp1 transcription factor; SP3, Sp3 transcription factor; SRF, serum response factor; SUZ12, suppressor of zeste 12 PRC2 subunit; TBP, TATA-box binding protein; TRIM28/KAP1, tripartite motif containing 28 protein. See main text for other abbreviations.

#### *2.2. Post-Transcriptional and Translational Control of the Major IE Gene*

The primary transcript derived from the MIEP is subject to extensive regulation at the post-transcriptional and translational level. It undergoes alternative splicing and polyadenylation to generate multiple mRNA species assigned to either the IE1 or IE2 family [12,13,74]. This differential post-transcriptional regulation is believed to involve the cellular 65-kDa U2-associated factor and ubiquitin-dependent segregase valosin containing protein p97 [186,187]. RNA sequencing showed increased IE1 and decreased IE2 splicing following p97 knockdown [187]. The processed IE1 and IE2 mRNAs accumulate with different kinetics and share the first three exons [186–188]. However, IE1 mRNAs contain exon 4 while IE2 mRNAs contain exon 5 sequences.

Expression of IE1 and IE2 is also subject to control by viral non-coding RNAs [189–191]. For example, the HCMV long non-coding RNA 4.9 has been reported to bind to the MIEP and recruit the repressor complex PRC2 to this region [120]. In addition, HCMV miRNA miR-UL112-1 was shown to target the IE1 mRNA and to reduce the corresponding protein levels by translational inhibition [192–194]. Likewise, HCMV miR-UL25-1 and miR-UL25-2 appear to be linked to reduced IE1 protein levels, although most likely indirectly via cellular targets [195].

The major IE mRNAs ultimately give rise to the IE1 (UL123) and IE2 (UL122) families of proteins with several members each. The largest, most abundant and by far best studied family members are the 72-kDa (491 amino acids) IE1 protein, also known as IE72, and the 86-kDa (579 amino acids) IE2 protein, also known as IE86. The two proteins share 85 amino acids encoded by exons 2 and 3 at their amino termini but are otherwise unrelated. For simplicity, they are referred to as IE1 and IE2 in this review.

#### *2.3. Post-Translational Control of the Major IE Proteins*

IE1 and IE2 are both believed to exist as dimers, while IE2 may also form higher order oligomers [196–200]. Both IE proteins can undergo at least two types of post-translational modification, phosphorylation at serine or threonine residues [201,202] and conjugation to small ubiquitin-like modifiers (SUMOylation) at lysine residues [203–206]. Various positive or negative regulatory effects on IE protein function and HCMV replication have been ascribed to these modifications [205,207–214]. While IE1 is a metabolically highly stable protein with an estimated intracellular half-life between 21 and >30 h [215,216], IE2 exhibits a much shorter half-life of approximately 2.5 h in cells [197,215]. Alongside post-transcriptional mechanisms (see Section 2.2), the differences in metabolic stability contribute to the much higher steady-state levels of IE1 compared to IE2 observed during productive HCMV infection. Nuclear localization signals in IE1 and IE2 target the proteins to the cell nucleus, where they are found in various compartments including PML bodies, chromatin and the nucleoplasm [11,13,14].
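To illustrate how the difference in half-life alone could shape steady-state protein levels, the following minimal sketch assumes a simple one-compartment model with constant synthesis and first-order decay, in which the steady-state level equals the synthesis rate divided by the decay constant (ln 2 divided by the half-life). The model, the function name `steady_state_level` and the assumption of equal synthesis rates for IE1 and IE2 are illustrative choices, not taken from the cited studies. The half-lives are the values quoted above, using 21 h as the lower estimate for IE1.

```python
import math


def steady_state_level(synthesis_rate: float, half_life_h: float) -> float:
    """Steady-state level for constant synthesis and first-order decay.

    Solves dP/dt = synthesis_rate - k_deg * P = 0, with k_deg = ln(2) / half-life.
    """
    k_deg = math.log(2) / half_life_h  # first-order decay constant (per hour)
    return synthesis_rate / k_deg


# Half-lives quoted in the text; equal synthesis rates are an assumption.
IE1_HALF_LIFE_H = 21.0  # lower estimate for IE1
IE2_HALF_LIFE_H = 2.5

ratio = steady_state_level(1.0, IE1_HALF_LIFE_H) / steady_state_level(1.0, IE2_HALF_LIFE_H)

if __name__ == "__main__":
    print(f"Predicted IE1:IE2 steady-state ratio at equal synthesis: {ratio:.1f}")
```

Under these assumptions the ratio reduces to the ratio of the half-lives, 21/2.5 or about 8.4-fold in favour of IE1. Any real difference would also reflect the post-transcriptional mechanisms described in Section 2.2, which this sketch deliberately ignores.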
