**Transcriptional Regulation and Its Misregulation in Human Diseases**

Editors

**Amelia Casamassimi, Alfredo Ciccodicola, Monica Rienzo**

MDPI • Basel • Beijing • Wuhan • Barcelona • Belgrade • Manchester • Tokyo • Cluj • Tianjin

*Editors*

Amelia Casamassimi, Department of Precision Medicine, University of Campania "Luigi Vanvitelli", Naples, Italy

Alfredo Ciccodicola, Institute of Genetics and Biophysics "Adriano Buzzati-Traverso", CNR, Naples, Italy

Monica Rienzo, Department of Environmental, Biological, and Pharmaceutical Sciences and Technologies, University of Campania "Luigi Vanvitelli", Caserta, Italy

*Editorial Office*

MDPI, St. Alban-Anlage 66, 4052 Basel, Switzerland

This is a reprint of articles from the Special Issue published online in the open access journal *International Journal of Molecular Sciences* (ISSN 1422-0067) (available at: www.mdpi.com/journal/ijms/special_issues/Regulation_Misregulation).

For citation purposes, cite each article independently as indicated on the article page online and as indicated below:

LastName, A.A.; LastName, B.B.; LastName, C.C. Article Title. *Journal Name* **Year**, *Volume Number*, Page Range.

**ISBN 978-3-0365-7737-1 (Hbk)**

**ISBN 978-3-0365-7736-4 (PDF)**

© 2023 by the authors. Articles in this book are Open Access and distributed under the Creative Commons Attribution (CC BY) license, which allows users to download, copy and build upon published articles, as long as the author and publisher are properly credited, which ensures maximum dissemination and a wider impact of our publications.

The book as a whole is distributed by MDPI under the terms and conditions of the Creative Commons license CC BY-NC-ND.

## **Contents**


Reprinted from: *Int. J. Mol. Sci.* **2021**, *22*, 12178, doi:10.3390/ijms222212178


## **About the Editors**

#### **Amelia Casamassimi**

Amelia Casamassimi obtained her Biological Sciences degree in 1989 from the University of Naples, Federico II (Italy). She worked at the IGB-CNR institute and the Pascale Foundation (IRCCS) in Naples. Currently, she is working at the Department of Precision Medicine of the University of Campania "Luigi Vanvitelli". She has been interested in genomics and post-genomics approaches, particularly transcriptome analysis, to study human diseases. Currently, she is studying the role of PRDMs in cancer and immunity. She is co-author of several scientific papers in these research fields.

#### **Alfredo Ciccodicola**

Alfredo Ciccodicola graduated in Biological Sciences from the Federico II University of Naples. He is currently Research Director at the Institute of Genetics and Biophysics "Adriano Buzzati-Traverso" of the National Research Council of Naples and Professor of Molecular Biology at the Department of Science and Technology of the "Parthenope" University of Naples. His main scientific interests concern the study of the molecular mechanisms underlying complex human diseases (type 2 diabetes, obesity, and cancer) using 'omics' approaches including the analysis of Next Generation Sequencing (NGS) data. Furthermore, his studies are also aimed at understanding the mechanisms of transcriptional regulation, epigenetic regulation, alternative splicing, and their deregulation in human diseases.

#### **Monica Rienzo**

Monica Rienzo obtained her Biological Sciences degree in 2003 from the University of Campania "Luigi Vanvitelli" in Caserta (Italy) and her PhD in Neuroscience at the University of Naples, Federico II. She worked at the IGB-CNR institute and the Department of Precision Medicine of the University of Campania, in Naples. Currently, she is working at the Department of Environmental, Biological, and Pharmaceutical Sciences and Technologies, University of Campania "Luigi Vanvitelli" in Caserta. She has been interested in post-genomics approaches, particularly transcriptome analysis, to identify differential expression of alternative transcripts in human diseases. Currently, she is studying the role of PRDMs and their transcripts in cancer and immunity.

## **Preface to "Transcriptional Regulation and Its Misregulation in Human Diseases"**

Transcriptional regulation is a critical biological process that allows a cell or organism to respond to a variety of intra- and extracellular signals, to define cell identity during development, to maintain it throughout its lifetime, and to coordinate cellular activity. This control involves multiple temporal and functional steps, as well as innumerable molecules, including transcription factors, cofactors, and chromatin regulators. Many human disorders are characterized by global transcriptional dysregulation, since most signaling pathways ultimately target the transcription machinery. Indeed, many syndromes and genetic and complex diseases, including cancer, autoimmunity, neurological and developmental disorders, and metabolic and cardiovascular diseases, can be caused by mutations or alterations in regulatory sequences, transcription factors, splicing regulators, cofactors, chromatin regulators, ncRNAs, and other components of the transcription apparatus. Advances in our understanding of the molecules and mechanisms involved in the transcriptional circuitry and apparatus are leading to new insights into the pathogenetic mechanisms of various human diseases and disorders. Thus, this Special Issue focuses on molecular genetics and genomics studies exploring the effects of transcriptional misregulation on human diseases.

**Amelia Casamassimi, Alfredo Ciccodicola, and Monica Rienzo**

*Editors*

## *Editorial* **Transcriptional Regulation and Its Misregulation in Human Diseases**

**Amelia Casamassimi <sup>1</sup>, Alfredo Ciccodicola <sup>2,3,</sup>\* and Monica Rienzo <sup>4</sup>**


Transcriptional regulation is a critical biological process that allows a cell or organism to respond to a variety of intra- and extracellular signals, to define cell identity during development, to maintain it throughout its lifetime, and to coordinate cellular activity. This control involves multiple temporal and functional steps, as well as innumerable molecules, including transcription factors, cofactors, and chromatin regulators. Many human disorders are characterized by global transcriptional dysregulation, since most signaling pathways ultimately target the transcription machinery. Indeed, many syndromes and genetic and complex diseases, including cancer, autoimmunity, neurological and developmental disorders, and metabolic and cardiovascular diseases, can be caused by mutations or alterations in regulatory sequences, transcription factors, splicing regulators, cofactors, chromatin regulators, ncRNAs, and other components of the transcription apparatus. Advances in our understanding of the molecules and mechanisms involved in the transcriptional circuitry and apparatus are leading to new insights into the pathogenetic mechanisms of various human diseases and disorders. Thus, this Special Issue focuses on molecular genetics and genomics studies exploring the effects of transcriptional misregulation on human diseases [1,2].

This Special Issue contains a total of 18 papers [3–20], comprising 13 original research studies [3–6,10–12,14–16,18–20] and five reviews [7–9,13,17]. They cover many aspects of transcriptional regulation, including studies on cis-regulatory elements, transcription factors (TFs), chromatin regulators, and ncRNAs. As expected, a substantial number of transcriptome studies and computational analyses are also included in this issue.

Significantly, all the published papers provide novel insights into human pathophysiological mechanisms, with the aim of proposing novel biomarkers and/or therapeutic targets for the diagnosis and treatment of many diseases, especially cancer. In the field of TFs, two papers by Nagel et al. [3,4] focus on the class of homeobox genes, which encode developmental factors containing a homeodomain with three helices that interacts with DNA, cofactors, and chromatin, thus enabling gene-regulating activities essential for the control of basic cell and tissue differentiation decisions. In accordance with their normal functions, these genes, when deregulated, contribute to carcinogenesis, including hematopoietic malignancies. In one of the manuscripts [3], the authors established the so-called myeloid TALE-code, representing a TALE homeobox gene expression pattern in normal myelopoiesis. The class of TALE homeobox genes comprises specific homeodomain factors that share a three-amino-acid residue loop extension (abbreviated as TALE) between helix 1 and helix 2. These transcription factors control basic developmental decisions. The same authors had previously constructed the lymphoid TALE-code, which codifies expression patterns of all active TALE class homeobox genes in early hematopoiesis and lymphopoiesis; the new work thus extends the TALE-code to the entire hematopoietic system. Collectively, the data showed expression patterns for eleven TALE homeobox genes and highlighted the exclusive expression of IRX1 in megakaryocyte-erythroid progenitors, suggesting that this TALE class member is involved in a specific myeloid differentiation route. Interestingly, the analysis of public transcription profiles from acute myeloid leukemia (AML) patients revealed aberrant expression of IRX1, IRX3, and IRX5, indicating an oncogenic role for these TALE homeobox genes when deregulated in AML [3].

**Citation:** Casamassimi, A.; Ciccodicola, A.; Rienzo, M. Transcriptional Regulation and Its Misregulation in Human Diseases. *Int. J. Mol. Sci.* **2023**, *24*, 8640. https://doi.org/10.3390/ijms24108640

Received: 4 May 2023 Accepted: 9 May 2023 Published: 12 May 2023

**Copyright:** © 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

In the second work [4], Nagel and colleagues investigated the subclass of NKL homeobox genes, which also function in normal development and are often deregulated in hematopoietic malignancies; indeed, a previous systematic analysis revealed 18 deregulated NKL homeobox genes in AML, underlining the relevance of these developmental oncogenes in driving this cancer type. In this newly published study, the authors also identified aberrantly activated NKL genes, NKX2-3 and NKX2-4, in cell lines derived from two different AML subtypes, where they deregulate target genes involved in megakaryocytic and erythroid differentiation, thus providing a molecular basis for the classification of specific AML subtypes [4].

Another TF involved in tumor progression and metastasis is Yin-Yang transcription factor 1 (YY1); indeed, it is also overexpressed in different cancers, including leukemia. Antonio-Andres et al. [5] observed that the expression of *YY1* in patients with pediatric acute lymphoblastic leukemia (ALL) positively correlates with *HIF1A* transcription. Moreover, their findings indicate, for the first time, that *YY1* is transcriptionally regulated by HIF-1α and suggest that both the *HIF1A* and *YY1* transcription factors could be possible therapeutic targets and/or biomarkers of ALL [5].

A further regulatory mechanism that may play a role in tumor development and progression involves two additional TFs, Forkhead Box Protein P3 (*FOXP3*) and activating transcription factor 3 (*ATF3*) [6]. FOXP3 has an essential role in autoimmunity, cancer development, and Treg development, with hundreds of target genes already identified in both cancer cells and Treg cells. Additionally, *FOXP3* is a recognized X-linked breast and prostate tumor suppressor gene, acting as a transcriptional repressor for several oncogenes. Using several human cell lines, Wang et al. [6] assessed the function of *FOXP3* in the transcriptional activity of *ATF3*. ATF3 binds several promoters of key regulatory proteins that determine cell fate, circadian signaling, and homeostasis; it is rapidly induced by many pathophysiological signals and is essential in the cellular stress response. Overall, their findings suggest that *FOXP3*, through a FOX protein response element, functions as a novel repressor of *ATF3* and that phosphorylation at Y342 plays a critical role in *FOXP3* transcriptional activity [6].

In addition, three reviews of this Special Issue are mainly focused on TFs. The first one discusses the regulation of *SNAI1* (Snail Family Transcriptional Repressor 1), a zinc finger transcription factor that acts as a master regulator of epithelial–mesenchymal transition (EMT). Notably, *SNAI1* is involved in the formation of cancer metastases by epigenetic regulation and post-translational modifications [7]. The review by Waku and Kobayashi [8] describes the pathophysiological aspects of the biomolecular pathways regulated by NRF3 (NFE2L3; NFE2-like BZIP Transcription Factor 3), a transcription factor belonging to the cap'n'collar (CNC) basic leucine zipper family and functioning through proteasome regulation. The NRF3 factor and its regulated axes are involved in cancer cell growth and have anti-obesity potential, thus suggesting a possible role in the development of obesity-induced cancer [8]. Finally, Rai et al. critically summarized the studies performed on the regeneration of sensory hair cells (HCs) in adult mammalian cochleas to elucidate the molecular pathways, and particularly the transcription factors, involved in the regeneration of cochlear HCs; this knowledge may help in proposing biological approaches for better therapeutics to treat hearing loss and restore hearing [9].

In recent years, the old annotation of protein-coding genes, based on the presence of an open reading frame (ORF) with minimal lengths for translated proteins, has significantly changed. Indeed, the recent literature indicates that the proteome is more complex than previously estimated, since RNAs previously considered noncoding, such as long noncoding RNAs (lncRNAs) and circular RNAs, are instead translated into functional small proteins [10]. In an interesting study of this Special Issue, the authors utilized transcriptome and polysome profiling to identify novel micropeptides that originate from lncRNAs expressed exclusively in hepatocellular carcinoma (HCC) cells, but not in the liver or other normal tissues. Specifically, they found three HCC-specific lncRNAs containing at least one ORF longer than 50 amino acids (aa) and enriched in the polysome fraction. In addition, using a peptide-specific antibody, they characterized one lncRNA candidate, NONHSAT013026.2/Linc013026-68AA, which is translated into a 68 aa micropeptide. This small protein is mainly localized at the perinuclear region, is mainly expressed in moderately differentiated, but not well-differentiated, HCC cells, and plays a role in cell proliferation, suggesting that it could be used as an HCC-specific target molecule. This finding is noteworthy, since it represents an important advance in the study of the previously overlooked "dark proteome" and its role in human pathologies, particularly in cancer [10].
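To make the length criterion above concrete, the following minimal sketch scans a transcript for ORFs longer than a given number of amino acids. It is only an illustration under stated assumptions, not the pipeline used in [10]: the function name `find_orfs`, the restriction to the three forward frames, and the definition of an ORF as the stretch from the first in-frame ATG to the next stop codon are all choices made here for the example.

```python
# Illustrative sketch only; names and ORF definition are assumptions,
# not taken from the study summarized above.
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_orfs(transcript: str, min_aa: int = 50) -> list[tuple[int, int, int]]:
    """Return (start, end, length_aa) for forward-strand ORFs longer than min_aa.

    An ORF runs from the first in-frame ATG to the next in-frame stop codon.
    Coordinates are 0-based; `end` is the index just past the stop codon.
    `length_aa` counts residues including the initial Met, excluding the stop.
    """
    seq = transcript.upper().replace("U", "T")  # accept RNA or DNA letters
    orfs = []
    for frame in range(3):
        start = None  # position of the first ATG seen since the last stop
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i
            elif start is not None and codon in STOP_CODONS:
                length_aa = (i - start) // 3
                if length_aa > min_aa:  # "longer than", so strictly greater
                    orfs.append((start, i + 3, length_aa))
                start = None
    return sorted(orfs)


# Toy transcript: 2 nt leader, then Met + 67 Ala codons + stop (68 aa ORF).
toy = "GG" + "ATG" + "GCT" * 67 + "TAA"
print(find_orfs(toy))  # [(2, 209, 68)]
```

A real analysis would also need to consider the reverse strand where relevant, non-canonical start codons, and experimental evidence of translation such as the polysome enrichment described above; sequence length alone does not show that an ORF is translated.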

Another emerging field, especially in cancer, is represented by chimeric RNAs, which are transcripts consisting of exons from different parental genes. They can be produced by several mechanisms, mostly involving chromosomal rearrangements; they can also be generated by intergenic splicing, cis-splicing of two adjacent genes on the same strand, and trans-splicing of two separate RNA transcripts [11]. Although these events were initially considered rare, human transcriptome profiles have revealed that a large number of chimeric RNAs arise from intergenic splicing and can also be detected in normal tissues, thus contributing to transcriptomic complexity. An interesting study has analyzed the genetic structure and biological roles of *CLEC12A-MIR223HG*, a novel chimeric transcript produced through trans-splicing by the fusion of the cell surface receptor CLEC12A (C-Type Lectin Domain Family 12 Member A) and the miRNA-223 host gene (*MIR223HG*), first identified through transcriptome profiling of chronic myeloid leukemia (CML) patients. Unexpectedly, *CLEC12A-MIR223HG* was detected not only in CML, but also in a variety of normal tissues and cell lines, such as pro-monocytic cells resistant to chemotherapy, and during monocyte-to-macrophage differentiation [11]. Transcriptional activation of CLEC12A increased *CLEC12A-MIR223HG* expression. This chimeric RNA is also translated into a chimeric protein, which largely resembles CLEC12A but contains a modified C-type lectin domain with altered key disulphide bonds. Consequently, differences in post-translational modifications, cellular localization, and protein–protein interactions occur. These findings not only support a possible involvement of *CLEC12A-MIR223HG* in the regulation of CLEC12A function, but could also provide a roadmap to study the other uncharacterized chimeric RNAs that are continuously identified by RNA-Seq analyses [11].

In recent decades, next generation sequencing (NGS) strategies have been widely applied in a large number of cancer studies with different purposes. Among the NGS applications, several DNA barcode-based parallel reporter methods have been implemented for the screening of regulatory risk sites. Among them, the dinucleotide reporter system (DiR)-seq screening system was developed to investigate the gene regulatory effect of risk single nucleotide polymorphisms (SNPs) that have a modest impact. In their paper, Ren and coworkers [12] applied the DiR system in prostate cancer cells (22Rv1) to screen for regulatory risk SNPs leading to transcriptional misregulation, and they identified 32 regulatory SNPs whose two alleles exhibited different regulatory activities. Among them, fourteen SNPs exhibited decreased expression levels for the risk alleles, whereas eighteen SNPs showed increased expression. In particular, they discovered that the rs684232 T allele altered chromatin binding of the transcription factor FOXA1 to the DNA region and led to aberrant expression of *VPS53*, *FAM57A*, and *GEMIN4*, which are often upregulated in prostate cancer patients. Thus, these findings provide novel insights to further elucidate the underlying mechanism of the functional prostate cancer risk SNPs [12].

An important role in the mechanisms of transcriptional regulation is also played by transposable elements, repetitive genetic sequences with the ability (sometimes lost during evolution) to transpose elsewhere in the genome. One class of these elements is the endogenous retroviruses (ERVs), which account for about 8% of the sequences present in the human genome. An interesting overview in this Special Issue [13] describes the mechanisms underlying their transcriptional regulation. During the evolution of the human genome, the accumulation of mutations, insertions, deletions, and/or truncations has rendered these elements inactive. However, it is increasingly evident that, under the influence of genetic and epigenetic mechanisms, they can be involved in some physiological and pathological conditions; examples are their function in embryonic development, or, even more importantly, their reactivation in the development of human diseases, such as cancer and neurodegenerative disorders [13]. In addition, a remarkable paper [14] shows that transcriptional alterations observed in X-linked dystonia–parkinsonism (XDP) are caused by the insertion of a SINE-VNTR-Alu (SVA) retrotransposon in an intron of the *TAF1* (TATA-Box Binding Protein Associated Factor 1) gene, which encodes the largest subunit of TFIID; as an interesting effect, increased levels of the *TAF1* intron retention transcript *TAF1-32i* can be found in XDP cells, as compared to healthy controls. Overall, the results of this study provide further evidence that transposable elements affect gene expression and suggest that a mechanism of splicing alteration occurs in XDP patients, probably caused by binding sites for transcription factors and splicing regulators present within this retrotransposon, a hypothesis that remains to be confirmed through additional experiments [14].

Currently, transcriptome profiling is one of the most utilized approaches to investigate human diseases at the molecular level [2]. Here, Kim and colleagues [15] compare the transcriptomic profiles, extracted from the NCBI Gene Expression Omnibus (GEO) database, of brown adipose tissue (BAT) of young and elderly subjects in response to thermogenic stimuli. Interestingly, they observe that aging does not cause transcriptional changes in thermogenic genes, but it upregulates several pathways related to the immune response and downregulates metabolic pathways. Furthermore, they note that acute severe cold exposure (CE) upregulates several pathways related to protein folding, whereas chronic mild CE upregulates metabolic pathways, mostly related to carbohydrate metabolism [15].

Furthermore, in the study of Suojalehto et al. [16], transcriptome profiling was used to investigate a distinct clinical subtype of adult-onset asthma related to damp and moldy buildings, whose symptoms overlap with those of idiopathic environmental intolerance, and to identify potential molecular similarities between the two conditions. To this end, fifty female adult-onset asthma patients were categorized based on their exposure to building dampness and molds and other clinical parameters (inflammation, cytokine profile, etc.), together with gene signatures of nasal biopsies and peripheral blood mononuclear cells. Overall, the results of this study revealed a greater degree of similarity between idiopathic environmental intolerance and dampness-related asthma than between dampness-related asthma and asthma not associated with dampness and mold [16]. Transcriptome analysis revealed well-defined pathological mechanisms for asthma without exposure to dampness, but not for dampness-and-mold-related asthma. In addition, a distinct molecular pathological profile was found in nasal and blood immune cells of idiopathic environmental intolerance subjects, including several differentially expressed genes (DEGs) that were also detected in dampness-and-mold-related asthma samples, thereby suggesting idiopathic environmental intolerance-type mechanisms [16].

It is well known that eukaryotic transcription is a complex, stepwise biological process, ranging from initiation, elongation, and termination to pre-mRNA processing, with 5′-end capping, splicing, 3′-end cleavage, and polyadenylation; subsequently, the mature mRNA is exported from the nucleus to the cytoplasm to undergo protein translation, whereas aberrantly processed pre-mRNAs and mRNAs are removed via the RNA surveillance system. All these steps are also interconnected with each other, and with chromatin accessibility and additional epigenetic mechanisms [1]. Of note, alteration of any step may constitute the basis of a disease. For instance, Park et al. [17] illustrate recent findings on the role of nuclear mRNA export in cellular aging and age-related neurodegenerative disorders. Indeed, it is now established that there is a close relationship between transcription and aging; however, the role of nuclear mRNA export in these processes is still poorly characterized. Of note, disruption of the regulation of factors mediating mRNA export from the nucleus (namely, TREX, TREX-2, and the nuclear pore complex) has been shown to result in the accumulation of aberrant nuclear mRNAs, with the consequent alteration of normal lifespan and the development of neurodegenerative diseases [17].

Concerning epigenetics, an interesting paper focused on obesity without metabolic complications, a phenotype defined as metabolically healthy obesity (MHO), which can progress to an unhealthy state known as metabolically unhealthy obesity (MUO), although a relevant percentage of MHO individuals are likely to maintain their status over time [18]. The authors aimed to analyze the long-term evolution of DNA methylation patterns in a subset of MHO subjects in order to search for epigenetic markers that could predict the progression of MHO to MUO. As a result, twenty-six CpG sites were significantly differentially methylated, both at baseline and after eleven years of follow-up. Two potential biomarkers of the transition to an unhealthy state were identified: specifically, higher methylation in cg20707527 (*ZFPM2*) and lower methylation in cg11445109 (*CYP2E1*) could impact the stability of a healthy phenotype in obesity [18].

Finally, the study of transcriptional regulation is also useful to elucidate the pathogenetic mechanisms of human pathogens. Bacterial sensing of environmental signals has an essential role in modulating virulence and bacterium–host interactions. Generally, bacteria use two-component systems (TCSs), such as QseEF, to control gene expression in response to rapid environmental changes. Of note, many recognized mechanisms function through post-transcriptional control exerted by small non-coding RNAs (sRNAs), whose number and variety are rapidly increasing in the context of bacterial regulatory functions [19]. Interestingly, in a study in the present Special Issue, the authors identified the QseEF homologue of *Proteus mirabilis*, an important pathogen of the urinary tract, principally in patients with indwelling urinary catheters, and they found that it is involved in the modulation of swarming motility through the sRNA *GlmY*. This is the first study investigating a pathway mediated by a two-component system through an sRNA as an underlying pathogenetic mechanism of *Proteus mirabilis* swarming migration, during which expression of several virulence genes is increased. Since it is assumed that *P. mirabilis* swarming up catheters is primed to infect the urinary tract, clarifying the swarming mechanisms could provide new approaches in the development of intervention strategies and facilitate the discovery of novel therapeutics [19].

Again, in the field of human pathogens, Braun et al. [20] focused on the molecular diagnosis of the anthrax pathogen *Bacillus anthracis*, which is challenging because its identification is complicated by its close relationship with other species of the *Bacillus cereus* group. The authors designed and validated an ultrasensitive detection method that can be run as a real-time PCR with DNA alone as a template, or as an RT-real-time version using both cellular nucleic acid pools (DNA and RNA) as a template without requiring DNase treatment. This assay was found to be highly species-specific, yielding no false positives, and highly sensitive, as it targets a unique single nucleotide polymorphism within a variable number of loci of the multi-copy 16S rRNA gene and related transcripts. The high abundance of 16S rRNA moieties in cells is expected to facilitate the detection of *B. anthracis* by PCR. While standard PCR assays are well established for the identification of *B. anthracis* from pure culture, the exceptional sensitivity of the new 16S rRNA-based test might excel in clinical and public health laboratories when detection of minute residues of the pathogen is required [20].

In conclusion, we hope the readers enjoy this Special Issue of IJMS and the effort to present the current advances and promising results in the field of transcriptional regulation and its involvement in all the relevant biological processes and in pathophysiology.

**Funding:** This research received no external funding.

**Institutional Review Board Statement:** Not applicable.

**Informed Consent Statement:** Not applicable.

**Data Availability Statement:** Not applicable.

**Acknowledgments:** We thank Mariagiovanna Tramontano for her helpful assistance in drafting and writing our editorial. We also thank Jerry Whang and all the participating assistant editors and reviewers for their important contribution to this Special Issue.

**Conflicts of Interest:** The authors declare no conflict of interest.

#### **References**


**Disclaimer/Publisher's Note:** The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

## *Article* **The Hematopoietic TALE-Code Shows Normal Activity of IRX1 in Myeloid Progenitors and Reveals Ectopic Expression of IRX3 and IRX5 in Acute Myeloid Leukemia**

**Stefan Nagel \*, Claudia Pommerenke, Corinna Meyer and Roderick A. F. MacLeod**

Department of Human and Animal Cell Lines, Leibniz-Institute DSMZ, German Collection of Microorganisms and Cell Cultures, 38124 Braunschweig, Germany; cpo14@dsmz.de (C.P.); cme@dsmz.de (C.M.); rafmacleod@gmail.com (R.A.F.M.)

**\*** Correspondence: sna@dsmz.de; Tel.: +49-531-2616167

**Abstract:** Homeobox genes encode transcription factors that control basic developmental decisions. Knowledge of their hematopoietic activities casts light on normal and malignant immune cell development. Recently, we constructed the so-called lymphoid TALE-code that codifies expression patterns of all active TALE class homeobox genes in early hematopoiesis and lymphopoiesis. Here, we present the corresponding myeloid TALE-code to extend this gene signature, covering the entire hematopoietic system. The collective data showed expression patterns for eleven TALE homeobox genes and highlighted the exclusive expression of IRX1 in megakaryocyte-erythroid progenitors (MEPs), implicating this TALE class member in a specific myeloid differentiation process. Analysis of public profiling data from acute myeloid leukemia (AML) patients revealed aberrant activity of IRX1 in addition to IRX3 and IRX5, indicating an oncogenic role for these TALE homeobox genes when deregulated. Screening of RNA-seq data from 100 leukemia/lymphoma cell lines showed overexpression of IRX1, IRX3, and IRX5 in megakaryoblastic and myelomonocytic AML cell lines, chosen as suitable models for studying the regulation and function of these homeo-oncogenes. Genomic copy number analysis of IRX-positive cell lines demonstrated chromosomal amplification of the neighboring IRX3 and IRX5 genes at position 16q12 in MEGAL, underlying their overexpression in this cell line model. Comparative gene expression analysis of these cell lines revealed candidate upstream factors and target genes, namely the co-expression of GATA1 and GATA2 together with IRX1, and of BMP2 and HOXA10 with IRX3/IRX5. Subsequent knockdown and stimulation experiments in AML cell lines confirmed their activating impact on the corresponding IRX gene expression. Furthermore, we demonstrated that IRX1 activated KLF1 and TAL1, while IRX3 inhibited GATA1, GATA2, and FST. Accordingly, we propose that these regulatory relationships may represent major physiological and oncogenic activities of IRX factors in normal and malignant myeloid differentiation, respectively. Finally, the established myeloid TALE-code is a useful tool for evaluating TALE homeobox gene activities in AML.

**Keywords:** homeodomain; HOX-code; NKL-code; PBX1; TALE-code; TBX-code

### **1. Introduction**

Stem and progenitor cells expand and subsequently pass through several developmental stages to differentiate into mature cells and tissues. In the hematopoietic system, corresponding processes generate all types of blood and immune cells, starting with hematopoietic stem cells and their derived progenitors, which establish the lymphoid and myeloid lineage [1]. The first produces lymphocytes such as B- and T-cells, while the latter generates inter alia granulocytes, monocytes, megakaryocytes, and erythrocytes. The myeloid system contains several progenitors that differ in their developmental potential. The common myeloid progenitor (CMP) is able to produce all types of myeloid cells, while the megakaryocyte-erythroid progenitors (MEPs) are developmentally restricted, generating either megakaryocytes or erythrocytes.

**Citation:** Nagel, S.; Pommerenke, C.; Meyer, C.; MacLeod, R.A.F. The Hematopoietic TALE-Code Shows Normal Activity of IRX1 in Myeloid Progenitors and Reveals Ectopic Expression of IRX3 and IRX5 in Acute Myeloid Leukemia. *Int. J. Mol. Sci.* **2022**, *23*, 3192. https://doi.org/10.3390/ijms23063192

Academic Editors: Haifa Kathrin Al-Ali, Amelia Casamassimi, Alfredo Ciccodicola and Monica Rienzo

Received: 15 February 2022 Accepted: 14 March 2022 Published: 16 March 2022

**Publisher's Note:** MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

**Copyright:** © 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

Hematopoietic differentiation processes are typically regulated at the transcriptional level [2–4]. Therefore, the master genes controlling these operations mostly encode transcription factors (TFs). Homeobox genes encode developmental TFs, controlling basic decisions in cell and tissue differentiation. They contain a homeodomain at the protein level, which consists of three helices. This domain interacts with DNA, cofactors, and chromatin, thus representing a platform for gene-regulating activities [5]. According to sequence similarities of their conserved homeobox, these genes fall into eleven classes and several subclasses [6]. ANTP represents the largest class containing 39 clustered HOX genes and the 48-member-strong NKL homeobox gene subclass. The class of TALE homeobox genes comprises specific homeodomain factors that share a three-amino-acid residue loop extension (abbreviated as TALE) between helix 1 and helix 2. The human genome encodes 20 TALE homeodomain proteins including six IRX-factors and the well-known PBX1 and MEIS1 members, which are able to cooperate with HOX factors [7–9].

The human IRX genes are genomically arranged in two clusters, consisting of IRX1, IRX2, and IRX4 (located at chromosomal position 5p15) and of IRX3, IRX5, and IRX6 (16q12). This clustering is evolutionarily conserved and may be related to conjoint transcriptional regulation [10]. IRX genes are embryonically expressed and tasked with regulating the development of particular tissues and organs [11]. IRX1, for example, regulates the development of the limbs, gut, kidney, and lung and plays an oncogenic role in several types of cancer [12]. Aberrant expression of IRX1 and IRX3 has been reported in myeloid leukemia, supporting general commitments in carcinogenesis [13,14].

Reflecting their physiological functions in development, deregulated homeobox genes drive specific hematopoietic malignancies and are frequently targeted by chromosomal aberrations [15–17]. To evaluate the activities of major subgroups of homeobox genes in lymphoid and myeloid malignancies, we generated the "NKL-code" [18]. This gene signature describes physiological activities of NKL homeobox genes during hematopoiesis and assists the identification of deregulated NKL homeobox gene expression in patients. Accordingly, we also described the lymphoid TALE-code and applied that gene signature to identify the aberrant activity of TALE homeobox gene PBX1 in Hodgkin lymphoma [19]. Here, we extended the established lymphoid TALE-code to the entire hematopoietic system and included the myeloid lineage.

Acute myeloid leukemia (AML) is the most frequent hematopoietic malignancy in adults. The tumor cells derive from particular myeloid progenitors. Several subtypes of AML may be distinguished according to originating cells and stages, phenotypes, chromosomal aberrations, and gene mutations that differ in prognosis and outcome [20]. Knowledge of normal and abnormal activities of basic developmental regulators should serve to refine diagnostic procedures and thus help identify novel therapeutic targets. Accordingly, we describe here the physiological expression of TALE homeobox genes during myelopoiesis and investigate the aberrant activities of IRX homeobox genes in AML subsets.

### **2. Results**

#### *2.1. Establishment of the Myeloid TALE-Code*

Recently, we delineated the lymphoid TALE-code for early hematopoiesis and lymphopoiesis [19]. Here, we extended this gene signature to include the myeloid system according to our reported approach generating the myeloid NKL-code [21]. By exploiting public gene expression profiling and RNA-seq datasets, we generated a TALE homeobox gene expression pattern for myeloid progenitors and mature cells (Figure S1). The applied cutoffs to discriminate positive and negative expression levels were adopted from our previous studies [21]. The results are depicted in Figure 1, representing the myeloid TALE-code.
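As an illustration of how a presence/absence signature of this kind can be derived from expression data, the following minimal Python sketch applies a single cutoff to a small table of values. The expression values, the cutoff of 5.0, and the function name `call_active_genes` are assumptions made purely for illustration; the cutoffs actually used were adopted from the earlier study [21] and are not restated here.

```python
from typing import Dict, List

# Invented expression values (arbitrary units) for illustration only;
# these are NOT measurements from the study.
EXPRESSION: Dict[str, Dict[str, float]] = {
    "CMP": {"IRX1": 2.1, "MEIS1": 7.5, "PBX1": 9.1},
    "MEP": {"IRX1": 8.2, "MEIS1": 7.9, "PBX1": 8.8},
}

# Assumed threshold separating "expressed" from "silent" (hypothetical value).
CUTOFF = 5.0


def call_active_genes(
    expression: Dict[str, Dict[str, float]], cutoff: float
) -> Dict[str, List[str]]:
    """Return, per cell type, the sorted genes whose value exceeds the cutoff."""
    return {
        cell_type: sorted(gene for gene, value in genes.items() if value > cutoff)
        for cell_type, genes in expression.items()
    }


signature = call_active_genes(EXPRESSION, CUTOFF)
for cell_type, genes in signature.items():
    print(cell_type, genes)
```

In practice, a cutoff of this kind would have to be chosen separately for each platform (microarray or RNA-seq), since the value scales differ between datasets.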

**Figure 1.** Myeloid TALE-code. This diagram summarizes the screening results for expression analyses of TALE homeobox genes (red) in early hematopoiesis and myelopoiesis. We have termed this expression pattern myeloid TALE-code. Expression of IRX1 is highlighted in blue (arrowhead). Abbreviations: cDC, conventional dendritic cell; CDP, common dendritic progenitor; CFU ery, erythroid colony forming unit; CLP, common lymphoid progenitor; CMP, common myeloid progenitor; ery, erythrocyte; GMP, granulo-myeloid progenitor; granu, granulocyte; HSPC, hematopoietic stem and progenitor cell; inter ery, intermediate stage erythroblast; late ery, pyknotic-stage erythroblast; LMPP, lymphomyelo-primed progenitor; macro, macrophage; mast, mast cell; MDP, monocyte dendritic cell progenitor; mega, megakaryocyte; MEP, megakaryocytic-erythroid progenitor; metamyelo, metamyelocyte; moDC, monocyte-derived dendritic cell; mono, monocyte; pDC, plasmacytoid dendritic cell; pro ery, pro-erythroblast; pro myelo early/late, early/late promyelocyte.

The established myeloid TALE-code comprises eleven genes: IRX1, MEIS1, MEIS2, MEIS3, PBX1, PBX2, PBX3, PBX4, PKNOX1, TGIF1, and TGIF2. We found TALE homeobox gene activities in all cell types analyzed. The numbers of expressed genes in single entities ranged from three (pro-erythroblast) to nine (MEP). TGIF genes were expressed in all but one entity (metamyelocyte). PBX1 was also active in most mature cell types, contrasting with the findings for this gene in lymphopoiesis [19]. Interestingly, IRX1 expression was only detected in MEPs, while the remaining IRX genes remained silent in the complete hematopoietic compartment [19]. This observation may indicate that IRX1 plays a major role in early differentiation processes of this particular progenitor, which generates both megakaryocytes and erythrocytes. In accordance with the TALE-code, RNA-seq gene expression data from the Human Protein Atlas showed repression of all six IRX genes in the myeloid entities represented, despite expression in cells and tissues of other compartments (Figure S2). In the following, we examined a potential oncogenic role for IRX genes in AML.

#### *2.2. Aberrant Expression of IRX1, IRX3, and IRX5 in AML*

To scrutinize the aberrant expression of IRX genes in AML patients, we screened three public expression profiling datasets. Dataset GSE19577 covers 42 AML patients with MLL-rearrangements, GSE15434 contains 251 patients with normal karyotypes, and GSE6891 contains 537 patients with unknown karyotypes and specific recurrent chromosomal translocations and/or gene mutations. Our results are shown in Figure S3. In all datasets, we detected aberrant expressions of IRX1, IRX3, and IRX5, while IRX2 and IRX4 showed only inconsistent or low expression levels. Of note, IRX6 was not represented by the applied arrays for these datasets. Thus, we detected the aberrant overexpression of IRX1 and ectopic activity of IRX3 and IRX5 in AML patient subsets.

IRX1 is located at chromosomal locus 5p15, while IRX3 and IRX5 are neighbors at 16q12, indicating possible coregulation. The analysis of co-expression for IRX1, IRX3, and IRX5 in AML patients revealed significant correlations for IRX3 and IRX5 using datasets GSE15434 and GSE6891 (Figure S4). Thus, the combined expression of IRX3 and IRX5 may be mediated by their genomic proximity, albeit lacking in patients with MLL-rearrangements. Taken together, TALE homeobox genes IRX1, IRX3, and IRX5 are aberrantly expressed in AML patient subsets, which differ in karyotypes and gene mutations.

To find suitable models for functional studies, we screened IRX gene activities using our published RNA-seq dataset LL-100, which contains 34 myeloid and 66 lymphoid leukemia/lymphoma cell lines [22]. Significant RNA expression levels in myeloid cell lines were detected just for IRX1, IRX3, and IRX5 (Figure S5), corresponding to the AML patient data. Interestingly, IRX1 expression was restricted to cell lines derived from megakaryoblastic AML, while expression of both IRX3 and IRX5 was elevated in cell lines derived from myelomonocytic AML. However, megakaryoblastic cell line MEGAL expressed IRX3 and IRX5 but not IRX1, representing an exception to that rule. RQ-PCR analysis of selected AML cell lines confirmed the RNA-seq data, showing IRX1 expression in CMK, M-07e, MKPL-1, and UT-7 and IRX3/IRX5 expression in MEGAL and OCI-AML3 (Figure 2). An additional comparison with primary cells from the cerebellum, kidney, lung, and salivary gland showed similar or even higher RNA expression levels in the cell lines, demonstrating significant overexpression of these genes in AML (see also Figure S2). Finally, Western blot analysis of IRX1 and IRX3 confirmed the transcript data at the protein level (Figure 2), endorsing the analyzed cell lines as models for functional studies.

**Figure 2.** IRX gene activities in AML cell lines. Expression analyses of TALE homeobox genes IRX1, IRX3, and IRX5 in AML cell lines and primary controls as performed by RQ-PCR (**left**). Cell lines showing significant expression levels of the corresponding IRX genes are shown in red. *p*-values are indicated by asterisks. Western blot analyses (**right**) were performed for IRX1 and IRX3 in selected AML cell lines. TUBA served as loading control.

#### *2.3. Chromosomal and Genomic Analyses of the Gene Loci for IRX1, IRX3, and IRX5 in AML*

In hematopoietic malignancies, aberrantly activated oncogenes including homeobox genes are frequently targeted by chromosomal rearrangements [15,16]. To analyze if this deregulating mechanism underlies the activation of IRX genes in AML, we inspected the published karyotypes of the corresponding cell lines (www.DSMZ.de, accessed on 15 February 2022) and shortlisted i(5)(p10) in OCI-AML3 and MKPL-1 and del(16)(q13q23) in MEGAL, which may have activating impacts. In addition, we performed genomic profiling analysis of AML cell lines CMK, M-07e, MKPL-1, UT-7, MEGAL, and OCI-AML3. The data for chromosomes 5 and 16 are shown in Figure 3. IRX1 (5p15) is duplicated together with the whole short arm in OCI-AML3 and MKPL-1, consistent with the karyotype data showing the formation of isochromosomes for the short arms. However, OCI-AML3 is IRX1-negative, discounting the likely contribution of this aberration to its activation. In contrast, IRX3 and IRX5 are located at 16q12 and considerably amplified in MEGAL. The data showed several amplicons covering nearly the complete long arm of chromosome 16. Moreover, RNA-seq data for the transcribed enhancer locus FTO, which is located adjacent to IRX3 and regulates IRX3/IRX5 activity [23,24], showed enhanced expression restricted to MEGAL (Figure S5). Thus, genomic amplification of IRX3/IRX5 together with FTO may underlie their activation in megakaryoblastic AML cell line MEGAL. Despite these substantial rearrangements of chromosome 16 in MEGAL, RT-PCR excluded the generation of AML-specific fusion gene CBFb-MYH11 via inv(16)(p13q22), confirming the cytogenetic findings (Figure S6A).

**Figure 3.** Genomic analysis of AML cell lines. Copy number states for chromosome 5 (**above**) and chromosome 16 (**below**) of IRX1-positive cell lines CMK, M-07e, MKPL-1, and UT-7, and of IRX3/IRX5-positive AML cell lines MEGAL and OCI-AML3 were determined by genomic profiling analysis. A copy number gain for IRX1 (located at 5p15) was detected in MKPL-1 and OCI-AML3. Amplification of FTO, IRX3, and IRX5 (16q12) was detected among multiple complex chromosome 16 rearrangements in MEGAL.

#### *2.4. GATA1 and GATA2 Activate IRX1 in AML Cell Lines*

To identify potential regulators and target genes of aberrantly expressed IRX1, we compared LL-100 RNA-seq expression data from IRX1-positive (CMK, M-07e, MKPL-1, and UT-7) with IRX3/IRX5-positive (MEGAL and OCI-AML3) AML cell lines. The results are shown in Table S1. Gene set annotation analysis of the top-1000 upregulated genes in IRX1-positive cell lines revealed the statistically significant GO-term erythrocyte differentiation (*p* = 0.0052) and the associated master genes GATA1, GATA2, KLF1, and TAL1, which were chosen for analysis in more detail.
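A group comparison of this kind can be sketched in a few lines of Python. The example below ranks genes by the log2 fold change of mean expression between two groups of cell lines and returns the top *n*. It is a hedged illustration only: the text does not state how the top-1000 list was ranked, so the fold-change criterion, the pseudocount, the placeholder gene names (`GENE_A` to `GENE_C`), the expression values, and the function name `top_upregulated` are all assumptions.

```python
import math
from statistics import mean
from typing import Dict, List

# Cell line groups as named in the text.
IRX1_POSITIVE = ["CMK", "M-07e", "MKPL-1", "UT-7"]
IRX3_IRX5_POSITIVE = ["MEGAL", "OCI-AML3"]

# Invented expression values for placeholder genes (NOT data from LL-100).
EXPR: Dict[str, Dict[str, float]] = {
    "GENE_A": {"CMK": 50, "M-07e": 40, "MKPL-1": 60, "UT-7": 50, "MEGAL": 1, "OCI-AML3": 1},
    "GENE_B": {"CMK": 5, "M-07e": 5, "MKPL-1": 5, "UT-7": 5, "MEGAL": 5, "OCI-AML3": 5},
    "GENE_C": {"CMK": 1, "M-07e": 1, "MKPL-1": 1, "UT-7": 1, "MEGAL": 40, "OCI-AML3": 60},
}


def top_upregulated(
    expr: Dict[str, Dict[str, float]],
    group_a: List[str],
    group_b: List[str],
    n: int,
    pseudocount: float = 1.0,
) -> List[str]:
    """Return the n genes with the highest log2 fold change (group_a over group_b)."""
    scores = {}
    for gene, by_line in expr.items():
        mean_a = mean(by_line[line] for line in group_a)
        mean_b = mean(by_line[line] for line in group_b)
        # The pseudocount avoids division by zero for silent genes (assumed choice).
        scores[gene] = math.log2((mean_a + pseudocount) / (mean_b + pseudocount))
    ranked = sorted(scores, key=scores.get, reverse=True)
    return ranked[:n]


print(top_upregulated(EXPR, IRX1_POSITIVE, IRX3_IRX5_POSITIVE, n=2))
```

Swapping the two group arguments yields the reverse comparison, i.e., genes upregulated in IRX3/IRX5-positive relative to IRX1-positive cell lines.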

In developing imaginal discs of the fruit fly Drosophila melanogaster, GATA factor pannier regulates the IRX-cluster Iro-C [25], spotlighting this relationship for consideration in humans. RQ-PCR analysis of GATA1 and GATA2 confirmed their higher expression, correlating with IRX1 levels, in cell lines CMK, M-07e, MKPL-1, and UT-7. In addition, the siRNA-mediated knockdown of GATA1 and GATA2 in M-07e and MKPL-1 cells resulted in the concomitant downregulation of IRX1 (Figure 4), demonstrating an activating impact of both factors.

**Figure 4.** Transcriptional regulation of IRX1. Expression analysis of (**A**) GATA1 and (**B**) GATA2 by RQ-PCR in selected AML cell lines (**left**). Transcript levels are indicated in relation to CMK. Cell lines expressing significant levels of IRX1 are highlighted in red. SiRNA-mediated knockdown of (**A**) GATA1 and (**B**) GATA2 demonstrates activation of IRX1 in AML cell lines M-07e and MKPL-1 (**right**). *p*-values are indicated by asterisks (\*\* *p* < 0.01, \*\*\* *p* < 0.001).

Our genomic profiling data showed for cell line CMK a duplication of the short arm of chromosome X, which includes the GATA1 locus at Xp11 (Figure S7). CMK expressed the highest GATA1 RNA levels, indicating that this genomic aberration may contribute indirectly to IRX1 activation. An additional chromosomal aberration revealed by genomic profiling was an amplification at 19p13 in cell lines MKPL-1 and UT-7 (Figure S7). This amplicon includes the erythropoietin receptor encoding gene EPOR and may underlie its overexpression, as shown by RNA-seq data (Figure S5). EPOR and GATA1 are mutual activators and thus may also support IRX1 expression [26]. Finally, expression profiling data from various myelopoietic stages (dataset GSE42519) showed elevations of GATA1, GATA2, and EPOR in IRX1-positive MEPs, while earlier progenitors and later myeloid stages expressed decreased levels (Figure S8). Thus, master factors GATA1 and GATA2 are prominently co-expressed along with IRX1 in MEPs and megakaryoblastic AML cell lines, and activate IRX1 expression.

#### *2.5. HOXA10 and BMP2-Signaling Activate IRX3 and IRX5 in AML*

To identify potential regulators and target genes of aberrantly expressed IRX3 and IRX5, we then compared RNA-seq data from IRX3/IRX5-positive with IRX1-positive AML cell lines (Table S2). Statistically significant GO-terms identified by gene set annotation analysis of the top-1000 upregulated genes in IRX3/IRX5-positive cell lines included the negative regulation of myeloid cell differentiation (*p* = 0.0029), SMAD protein signal transduction (*p* = 0.05), and negative regulation of apoptotic processes (*p* = 0.00007). Among the top-1000 upregulated genes, we chose the following candidates for further studies: BCL2, BMP2, HOXA10, and NUP214.

Elevated NUP214 expression correlated with the presence of fusion gene SET-NUP214 in MEGAL, which reportedly activates HOXA genes in T-cell leukemia [27,28]. Accordingly, we confirmed the chromosomal deletion at 9q34 by genomic profiling and the generated fusion gene by RT-PCR in this cell line (Figure S6B).

In AMLs containing MLL-AF4 rearrangements, HOXA genes have been shown to correlate with the expression of IRX3 but not with IRX1 [13,14,29–31], suggesting that particular HOXA-members may support IRX3/IRX5 activation. RQ-PCR analysis of HOXA10 showed low and high RNA expression levels in IRX3/IRX5-positive cell lines MEGAL and OCI-AML3, respectively, while IRX1-positive cell lines CMK, M-07e, MKPL-1, and UT-7 tested negative (Figure 5A). Furthermore, siRNA-mediated knockdown of HOXA10 in OCI-AML3 and MEGAL resulted in the concomitant downregulation of IRX3 and IRX5, demonstrating an activating input (Figure 5A). Expression profiling data covering multiple myelopoietic stages (dataset GSE42519) showed an elevated HOXA10 expression in early stages, which decreased after the MEP-stage (Figure S8). Thus, HOXA10 was highly expressed in myeloid progenitors including MEPs and activated the expression of IRX3/IRX5 in AML ectopically.

As BMP2-signaling has been shown to regulate IRX gene IRO3 in the organizer region during gastrulation in zebrafish [32], we decided to explore this relationship in humans. RQ-PCR analysis of BMP2 confirmed the conspicuous expression in OCI-AML3 (Figure 5B). Furthermore, quantification of the BMP2 protein in cell culture supernatants revealed significant levels in OCI-AML3, while M-07e and MEGAL tested negative. Additional treatment of OCI-AML3 with recombinant BMP2 upregulated both IRX3 and IRX5 (Figure 5B), demonstrating an activating input. The TFs JUNB and SMAD4 operate downstream in BMP2-signaling [33,34]. Accordingly, the siRNA-mediated knockdown of JUNB and SMAD4 in OCI-AML3 resulted in the concomitant downregulation of IRX3 and IRX5, showing an activating role of these factors (Figure 5C). However, RQ-PCR analysis of JUNB indicated no correlation with IRX3/IRX5 expression (Figure 5D). Nevertheless, the suppression of BMP2-activity by the treatment of OCI-AML3 with an inhibitory antibody resulted in the concomitant downregulation of IRX3, IRX5, and JUNB, while HOXA10 showed no significant alteration (Figure 5D).

Taken together, we identified HOXA10 and BMP2-signaling via JUNB and SMAD4 as activating factors for IRX3 and IRX5 in AML. The correlation of SET-NUP214, HOXA, and BMP2-signaling with IRX3/IRX5 expression in MEGAL and OCI-AML3, respectively, indicated that diverse aberrant mechanisms may underlie the ectopic activation of these TALE homeobox genes.

**Figure 5.** Transcriptional regulation of IRX3 and IRX5. (**A**) RQ-PCR analysis of HOXA10 in AML cell lines (**left**). Transcript levels are indicated in relation to OCI-AML3, and IRX3/IRX5-positive cell lines are indicated in red. RQ-PCR analysis of OCI-AML3 (**above**) and MEGAL (**below**) showed downregulation of IRX3 and IRX5 after siRNA-mediated knockdown of HOXA10 (**right**). (**B**) RQ-PCR analysis of BMP2 in AML cell lines (**left**). Transcript levels are indicated in relation to OCI-AML3. ELISA results for BMP2 protein levels in AML cell line supernatants are inserted. RQ-PCR analysis of OCI-AML3 treated with BMP2 resulted in the upregulation of IRX3 and IRX5 (**right**). (**C**) RQ-PCR analysis of OCI-AML3 after siRNA-mediated knockdown of JUNB (**left**) and SMAD4 (**right**) showed downregulation of IRX3 and IRX5. (**D**) RQ-PCR analysis of JUNB in AML cell lines (**left**). Transcript levels are indicated in relation to OCI-AML3. RQ-PCR analysis of OCI-AML3 treated with inhibitory BMP2-antibody resulted in downregulation of JUNB, IRX3, and IRX5, while HOXA10 was not significantly affected (**right**). *p*-values are indicated by asterisks (\* *p* < 0.05, \*\* *p* < 0.01, \*\*\* *p* < 0.001, n.s. not significant).

#### *2.6. IRX1 and IRX3 Differ in Target Gene Regulation*

For target gene analysis of IRX1 in AML, we tested the erythropoietic differentiation genes identified in megakaryoblastic cell line MKPL-1. The siRNA-mediated knockdown of IRX1 effected the significant downregulation of KLF1 and TAL1 while sparing GATA1 and GATA2 (Figure 6A). Thus, IRX1 activated the erythroid master TFs KLF1 and TAL1, which may disturb megakaryopoiesis if aberrantly activated. Interestingly, both genes are also prominently expressed in MEPs (Figure S8), suggesting that this regulatory relationship may play a physiological role in these progenitor cells.

To examine whether the differentiation factors GATA1, GATA2, KLF1, and TAL1 are regulated by IRX3, we used myelomonocytic cell line OCI-AML3. The siRNA-mediated knockdown of IRX3 resulted in the elevated expression of GATA1 and GATA2, while TAL1 remained unaffected and KLF1 silent (Figure 6B). Thus, IRX3 suppressed the expression of the myeloid differentiation factors GATA1 and GATA2, unlike erythroid factors TAL1 and KLF1. A potential impact of IRX3 in myeloid development was examined by the quantification of differentiation markers. The knockdown of IRX3 in OCI-AML3 resulted in the significant upregulation of CD11b/ITGAM and CD14 (Figure 6C). Furthermore, morphological inspection of these treated cells showed nuclear alterations indicative of myeloid differentiation (Figure 6D), collectively demonstrating an inhibitory role of IRX3 for these processes.

Functional live-cell imaging analysis of OCI-AML3 cells treated for IRX3-knockdown indicated that IRX3 inhibited apoptosis while sparing proliferation (Figure 6E). However, IRX1 and IRX3 knockdown experiments indicated that this anti-apoptotic effect is not mediated by the transcriptional activation of BCL2 (Figure 6A,B). Of note, BCL2 showed conspicuous downregulation in MEPs (Figure S8), indicating potential sensitivity to apoptosis of these progenitors and conceivably of any derived malignant cells.

The observed suppressive activity of IRX3 prompted scanning downregulated genes in IRX3/IRX5-positive cell lines as well. This exercise revealed BMP2-inhibitor FST, which was downregulated in OCI-AML3 and MEGAL (Table S1, Figure S5). Accordingly, siRNA-mediated knockdown of IRX3 in OCI-AML3 boosted FST expression (Figure 6B), showing that IRX3 inhibited FST. Thus, IRX3 operated in AML as both gene activator and suppressor.

Recently, we reported that NKL homeodomain factors NKX2-3 and NKX2-4 deregulate megakaryocytic-erythroid differentiation factor FLI1 in AML, which shows conspicuous downregulation in MEPs (Figure S8) [35]. However, knockdown experiments for IRX1 and IRX3 excluded FLI1 deregulation (Figure 6A,B), demonstrating functional differences in downstream activities of NKL and TALE homeo-oncogenes in AML.

Taken together, IRX1 and IRX3 showed significant differences in target gene regulation: IRX1 activated KLF1 and TAL1, while IRX3 inhibited GATA1 and GATA2. However, all represent master TFs involved in megakaryocytic-erythroid differentiation. Furthermore, the IRX3-mediated suppression of FST may create an activating feedback loop as BMP2 signaling supported IRX3 expression.
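The logic behind this proposed feedback loop can be made explicit with a small signed-graph sketch in Python: a regulatory cycle is self-reinforcing when the product of its edge signs is positive, which holds whenever it contains an even number of inhibitory steps. The edge list below encodes only the three relationships described in the text (BMP2 signaling activates IRX3, IRX3 inhibits FST, and FST acts as a BMP2 inhibitor); the data structure and the function name `loop_sign` are assumptions for illustration, and the sketch ignores that these steps act at different molecular levels (transcription versus protein activity).

```python
from typing import Dict, List, Tuple

# Signed regulatory edges: +1 = activation, -1 = inhibition.
EDGES: Dict[Tuple[str, str], int] = {
    ("BMP2", "IRX3"): +1,  # BMP2 signaling activates IRX3 expression
    ("IRX3", "FST"): -1,   # IRX3 suppresses FST expression
    ("FST", "BMP2"): -1,   # FST inhibits BMP2 activity
}


def loop_sign(cycle: List[str], edges: Dict[Tuple[str, str], int]) -> int:
    """Multiply the edge signs along a closed cycle; +1 means self-reinforcing."""
    sign = 1
    # Pair each node with its successor, wrapping around to close the cycle.
    for source, target in zip(cycle, cycle[1:] + cycle[:1]):
        sign *= edges[(source, target)]
    return sign


result = loop_sign(["BMP2", "IRX3", "FST"], EDGES)
print("positive feedback" if result == 1 else "negative feedback")
```

Two inhibitory steps in series amount to a net activation, which is why suppression of an inhibitor of BMP2 by IRX3 can reinforce IRX3 expression.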

**Figure 6.** Target gene analyses for IRX1 and IRX3 in AML cell lines. (**A**) RQ-PCR analysis of MKPL-1 treated for siRNA-mediated knockdown of IRX1 resulted in downregulation of KLF1 and TAL1, while GATA1, GATA2, FLI1, and BCL2 remained unaffected. Thus, IRX1 activates KLF1 and TAL1. (**B**) RQ-PCR analysis of OCI-AML3 treated for siRNA-mediated knockdown of IRX3 resulted in upregulation of GATA1, GATA2, and FST, while TAL1, FLI1 and BCL2 remained unaffected. Thus, IRX3 inhibits GATA1, GATA2, and FST. (**C**) RQ-PCR analysis of OCI-AML3 treated for siRNA-mediated knockdown of IRX3 resulted in upregulation of myeloid differentiation marker CD11b/ITGAM and CD14. (**D**) SiRNA-mediated knockdown of IRX3 in OCI-AML3 cells induced morphological alterations of their nuclei, as shown by Giemsa–May–Grünwald staining. (**E**) Live-cell imaging analyses of OCI-AML3 treated for siRNA-mediated knockdown of IRX3 showed increased levels of apoptosis (**left**), while proliferation remained unaffected (**right**). Standard deviations are indicated as bars. *p*-values are indicated by asterisks (\*\* *p* < 0.01, \*\*\* *p* < 0.001, n.s. not significant).

### **3. Discussion**

In this study, we established the myeloid TALE-code, representing a TALE homeobox gene expression pattern in normal myelopoiesis. Thus, we completed this signature for the entire hematopoietic system [19]. According to our previously generated NKL-code, we used the same approach to analyze public expression data for TALE homeobox genes in hematopoiesis [18]. These codes show (i) physiological activities of selected homeobox gene groups in hematopoietic entities, and (ii) a blueprint permitting the evaluation of homeobox gene expressions in corresponding malignancies of patients.

The myeloid TALE-code comprises eleven TALE homeobox genes (Figure 1). Interestingly, one of these, PBX1, was expressed in most progenitors and terminal differentiated myeloid cells and only silenced in granulopoiesis. This expression pattern contrasts with that reported in lymphopoiesis, in which all terminal-differentiated lymphocytes downregulate PBX1 [19]. In accordance with these physiological data, the aberrant maintenance of PBX1 activity in developing B-cells contributes to the generation of pre-B-cell leukemia or Hodgkin lymphoma [19,36–38]. In the myeloid lineage, the aberrant expression of PBX1 has been associated with severe congenital neutropenia. Its repression is mediated by GFI1, whose loss reactivates PBX1 in addition to MEIS1 and HOXA genes [39], providing a mechanistic explanation for the aberrant expression of PBX1 in granulopoiesis. Thus, TALE-code data serve to elucidate clinical findings in various hematopoietic malignancies.

In addition, our myeloid TALE-code data revealed a conspicuous expression pattern for IRX1, which was restricted to the MEP stage. This TALE homeobox gene represented the only IRX gene active in normal hematopoiesis. Furthermore, the aberrant expression of IRX genes in AML patients and cell lines was detected for IRX1, IRX3, and IRX5. The deregulated activity of IRX genes in AML has been reported in previous studies as well. Accordingly, the aberrant expression of IRX1 plays an oncogenic role in cases with particular MLL rearrangements, and IRX3 inhibits myelomonocytic differentiation, while IRX2 supports myeloid differentiation [13,14,40]. Thus, the published results together with our data indicate functional differences between IRX factors in AML.

The human genome encodes six IRX genes that are arranged in two clusters. Clustering of the homeobox genes is evolutionarily conserved and often connected with the coregulation of gene neighbors, as described for HOX, DLX, and IRX genes [10,41,42]. They share regulatory elements as demonstrated for IRX3 and IRX5, which are controlled by the transcribed enhancer locus FTO [23,24]. AML cell line MEGAL expressed IRX3 and IRX5 ectopically and showed an amplification of these genes together with FTO at 16q12. This chromosomal aberration was part of a chain of consecutive amplicons at 16q, signifying chromothripsis. This type of cataclysmic genomic rearrangement affects single chromosomes or chromosomal arms and has been described in myeloid and lymphoid malignancies [43,44]. Previously reported aberrations altering chromosome 16 in AML include inv(16)(p13q22) and del(16)(q11) [45–47]. However, our data excluded for MEGAL the presence of both del(16)(q11) and inv(16)(p13q22), which generates the fusion gene CBFb-MYH11.

Our additional findings concerning the regulation and function of IRX genes in AML are summarized in Figure 7. The data include chromosomal aberrations and involvement of the myeloid master factors GATA1, GATA2, HOXA10, KLF1, and TAL1, generating normal and aberrant gene networks, which control differentiation processes. We showed that GATA1 and GATA2 performed an activating role in IRX1 expression, which, in turn, activated KLF1 and TAL1. In contrast, HOXA10 activated IRX3 and IRX5, while IRX3 inhibited GATA1 and GATA2 expression.

**Figure 7.** Gene regulatory network of IRX1 and IRX3/IRX5 in AML. The figure summarizes the results of this study. TALE homeobox genes IRX1, IRX3, and IRX5 are located centrally, chromosomal aberrations lie upstream, and developmental TFs and BMP2-signaling components both upstream and downstream. Thus, these IRX genes are part of a regulatory gene network, controlling myeloid differentiation.

GATA1 and GATA2 are fundamental regulators in myelopoiesis and interact with TAL1 in TF-complexes [48–50]. GATA1 drives granulopoiesis, mast cell differentiation, and megakaryopoiesis [51–53]. GATA2 plays basic roles in stem and progenitor cells, including erythroid progenitors, and becomes substituted by GATA1 during development [48]. The mutation or downregulation of GATA2 contributes to the generation of myeloid malignancies including AML [48,54,55]. Thus, the downregulation of GATA1 and GATA2 by IRX3 may represent a novel type of oncogenic alteration in subsets of myelomonocytic AML. Consistent with our data, IRX3 blocks myeloid differentiation [14].

HOXA genes including HOXA10 represent key developmental regulators in myelopoiesis, and their deregulation plays a role in acute leukemias containing MLL-rearrangements and SET-NUP214 fusions [27,56]. We showed that HOXA10 activated ectopic expressions of IRX3 and IRX5 in AML. This finding corresponds to reported correlations of IRX3 to elevated and of IRX1 to decreased HOXA levels in AML [14,29–31]. Furthermore, we showed that IRX3 and IRX5 were activated by BMP2-signaling via JUNB and SMAD4, thus representing an additional mechanism of IRX gene deregulation. FST inhibits the activity of BMP proteins and thus BMP-signaling [34,57]. BMP4 supports megakaryopoiesis and BMP2 erythropoiesis [58,59], demonstrating the developmental impact of this signaling pathway. In T-cell leukemia, the ectopic activity of CHRDL1 has been shown to inhibit BMP-signaling, which results in the aberrant expression of NKL homeobox gene MSX1 [60]. Thus, aberrant deregulation of the BMP pathway plays a common oncogenic role in leukemia.

KLF1 and TAL1 are master genes for erythroid development and suppressors of megakaryopoiesis [61,62]. Our data showed that both genes were activated by IRX1, which may, thus, represent a physiological relationship. On the other hand, the aberrant activation of KLF1 and TAL1 may illustrate an oncogenic function of IRX1 in megakaryoblastic AML. The CBFb-MYH11 fusion gene reportedly deregulates the activity of GATA2 and KLF1, and thus inhibits megakaryopoiesis and erythropoiesis [63]. Here, while excluding the presence of CBFb-MYH11 fusion, we detected the aberrant activity of IRX3/IRX5, an alternative mechanism to perturb myeloid differentiation via KLF1 deregulation.

FLI1 represents an additional myeloid master factor conspicuously downregulated in MEPs, which activates megakaryopoiesis and inhibits erythropoiesis [35,64]. Here, we excluded a regulatory impact of IRX factors in FLI1 expression. However, in our previous study of deregulated NKL homeobox genes, we showed that FLI1 is aberrantly activated by NKX2-3 in megakaryoblastic AML and aberrantly inhibited by NKX2-4 in erythroblastic AML [35]. These observations suggest that MEPs are developmentally susceptible to generate AML and that aberrantly activated homeodomain factors deregulate components of gene regulatory networks, controlling basic myeloid differentiation processes.

Finally, we showed that IRX3 inhibited apoptosis in AML. This effect was not mediated by transcriptional activation of BCL2, although BCL2 is physiologically downregulated in MEPs. However, in lung cancer, IRX1 inhibits expression of the pro-apoptotic gene BAX [65], supporting a role of IRX factors in the (de)regulation of cell survival.

Taken together, our study revealed basic impacts of IRX factors in normal and aberrant myelopoiesis. IRX1 is a novel physiological player in hematopoiesis and part of a gene regulatory network controlling the differentiation of megakaryocytes, erythrocytes, and granulocytes. Aberrantly activated IRX1, IRX3, and IRX5 may disturb or deregulate developmental processes in myelopoiesis, driving the generation of AML subsets. Thus, our study contributes to understanding normal and abnormal processes in myelopoiesis.

#### **4. Materials and Methods**

#### *4.1. Bioinformatic Analyses of Expression Profiling and RNA-Seq Data*

Expression data for normal cell types were obtained from Gene Expression Omnibus (GEO, www.ncbi.nlm.nih.gov, accessed on 15 February 2022), using expression profiling datasets GSE42519, GSE22552, GSE109348, and GSE24759 [66–69], in addition to RNA-seq data from The Human Protein Atlas (www.proteinatlas.org, accessed on 15 February 2022) [70]. For screening of cell lines, we exploited RNA-sequencing data from 100 leukemia/lymphoma cell lines (termed LL-100), available at ArrayExpress (www.ebi.ac.uk/arrayexpress, accessed on 15 February 2022) via E-MTAB-7721 [22]. Gene expression profiling data from AML patients were examined using datasets GSE19577, GSE15434, and GSE6891 [71–73]. To parse biological functions of 1000 shortlisted genes, gene set annotation enrichment analysis was performed using DAVID bioinformatics resources (www.david.ncifcrf.gov, accessed on 15 February 2022) [74]. Analysis of gene co-expression in profiling datasets was performed by Spearman correlation using R-based tools.
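
As an illustration of the co-expression measure named above, the following minimal sketch computes a Spearman rank correlation between the expression values of two genes across the same samples. It is written in Python for self-containment and is not the R-based tooling used in this study; the function names and the example values are hypothetical.

```python
from typing import Sequence


def average_ranks(values: Sequence[float]) -> list[float]:
    """Assign 1-based ranks; tied values receive the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j across the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / (var_x * var_y) ** 0.5


def spearman(x: Sequence[float], y: Sequence[float]) -> float:
    """Spearman correlation: Pearson correlation of the rank vectors."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical expression values of two genes in five samples.
gene_a = [2.1, 3.4, 5.0, 7.7, 9.2]
gene_b = [0.5, 0.9, 4.0, 30.0, 31.5]
rho = spearman(gene_a, gene_b)  # monotonic relationship, so rho is 1
```

Because the coefficient depends only on ranks, it captures monotonic but non-linear relationships between expression levels, which is one reason it is often preferred over Pearson correlation for profiling data.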

#### *4.2. Cell Lines and Treatments*

Cell lines are held at the DSMZ (Braunschweig, Germany) and cultivated as described (www.DSMZ.de, accessed on 15 February 2022). All cell lines had been authenticated and tested negative for mycoplasma infection. Modification of gene expression levels was performed using gene-specific siRNA oligonucleotides with reference to AllStars negative Control siRNA (siCTR) obtained from Qiagen (Hilden, Germany). SiRNAs (80 pmol) were transfected into 1 × 10<sup>6</sup> cells by electroporation using the EPI-2500 impulse generator (Fischer, Heidelberg, Germany) at 350 V for 10 ms. Electroporated cells were harvested after 20 h of cultivation. Cell treatments were performed for 20 h using 20 ng/mL of recombinant BMP2 (R & D Systems, Wiesbaden, Germany, #355-BM-010/CF) or 20 µg/mL of inhibitory monoclonal anti-BMP2 antibody (R & D Systems, #MAB3552).

For cytological analyses, cell lines were stained with Giemsa–May–Grünwald as follows: Cells were spun onto microscope slides and fixed for 5 min with methanol. Subsequently, they were stained for 3 min with May–Grünwald's eosin-methylene blue modified solution (Merck, Darmstadt, Germany) diluted in Titrisol (Merck), and for 15 min with Giemsa's azur eosin methylene blue solution (Merck). Images were captured with an Axion A1 microscope using Axiocam 208 color and software ZEN 3.3 blue edition (Zeiss, Göttingen, Germany). For functional testing, treated cells were analyzed with the IncuCyte S3 Live-Cell Analysis System (Essen Bioscience, Hertfordshire, UK). For detection of apoptotic cells, we additionally used the IncuCyte Caspase-3/7 Green Apoptosis Assay diluted at 1:2000 (Essen Bioscience). Live-cell imaging experiments were performed twice with fourfold parallel tests.

#### *4.3. Polymerase-Chain-Reaction (PCR) Analyses*

Total RNA was extracted from cultivated cell lines using TRIzol reagent (Invitrogen, Darmstadt, Germany). Primary human total RNA derived from the cerebellum, kidney, lung, and salivary gland was purchased from Biochain/BioCat (Heidelberg, Germany). cDNA was synthesized using 1 µg of RNA, random priming, and Superscript II (Invitrogen). Real-time quantitative (RQ)-PCR analysis was performed using the 7500 Real-time System and commercial buffer and primer sets (Applied Biosystems/Life Technologies, Darmstadt, Germany). For normalization of expression levels, we quantified the transcripts of TATA box binding protein (TBP). Quantitative analyses were performed as biological replicates and measured in triplicate. Standard deviations are presented in the figures as error bars. Statistical significance was assessed by Student's *t*-test (two-tailed) and the calculated p-values indicated by asterisks (\* *p* < 0.05, \*\* *p* < 0.01, \*\*\* *p* < 0.001, n.s. not significant).
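
The normalization and significance conventions described above can be sketched as follows. This is a minimal illustration, not the analysis code of this study: it assumes the common 2<sup>−ΔΔCt</sup> model with roughly 100% amplification efficiency, which the text does not state explicitly, and the function names and Ct values are hypothetical. Only the asterisk thresholds are taken directly from the text.

```python
def relative_expression(ct_target: float, ct_reference: float,
                        ct_target_cal: float, ct_reference_cal: float) -> float:
    """Relative transcript level by the 2^-ddCt model (an assumption here).

    ct_target / ct_reference: Ct of the gene of interest and of the
    reference gene (TBP in the text) in the sample.
    ct_target_cal / ct_reference_cal: the same two values in the
    calibrator sample whose expression level is set as 1.
    """
    delta_sample = ct_target - ct_reference
    delta_calibrator = ct_target_cal - ct_reference_cal
    return 2.0 ** -(delta_sample - delta_calibrator)


def significance_label(p_value: float) -> str:
    """Map a p-value to the asterisk notation used in the figures."""
    if not 0.0 <= p_value <= 1.0:
        raise ValueError("p-value must lie between 0 and 1")
    if p_value < 0.001:
        return "***"
    if p_value < 0.01:
        return "**"
    if p_value < 0.05:
        return "*"
    return "n.s."


# Hypothetical Ct values: the sample needs two more cycles than the
# calibrator after TBP normalization, i.e. a quarter of its expression.
fold = relative_expression(27.0, 20.0, 25.0, 20.0)
label = significance_label(0.004)
```

A higher Ct value means fewer starting transcripts, so a positive ΔΔCt yields a relative expression below 1.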

For detection of CBFb-MYH11 and SET-NUP214 fusion transcripts, we performed reverse transcription (RT)-PCR, using the following oligonucleotides as reported previously: CBFb-for 5′-GGGCTGTCTGGAGTTTGATG-3′, MYH11-rev 5′-CTTGAGCGCCTGCATGTT-3′, SET-for 5′-TGACGAAGAAGGGGATGAGGAT-3′, and NUP214-rev 5′-ATCATTCACATCTTGGACAGCA-3′ [28,44]. As a control, we analyzed ETV6, using ETV6-for 5′-AGGCCAATTGACAGCAACAC-3′ and ETV6-rev 5′-TGCACATTATCCACGGATGG-3′. All oligonucleotides were purchased from Eurofins MWG (Ebersberg, Germany). PCR products were generated using taqpol (Qiagen) and thermocycler TGradient (Biometra, Göttingen, Germany), analyzed by gel electrophoresis, and documented with the Azure c200 Gel Imaging System (Azure Biosystems, Dublin, CA, USA).
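
Primer sequences such as those listed above are easy to corrupt during typesetting, so a quick sanity check can be useful. The sketch below is an illustration only and is not part of the original methods; the helper names are hypothetical. It strips stray whitespace, validates the alphabet, and reports length, GC fraction, and reverse complement for each oligonucleotide. It assumes that any spaces inside the printed sequences are line-break artifacts rather than part of the sequence.

```python
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}

# Sequences as given in the text (5' to 3').
PRIMERS = {
    "CBFb-for": "GGGCTGTCTGGAGTTTGATG",
    "MYH11-rev": "CTTGAGCGCCTGCATGTT",
    "SET-for": "TGACGAAGAAGGGGATGAGGAT",
    "NUP214-rev": "ATCATTCACATCTTGGACAGCA",
    "ETV6-for": "AGGCCAATTGACAGCAACAC",
    "ETV6-rev": "TGCACATTATCCACGGATGG",
}


def clean(sequence: str) -> str:
    """Remove whitespace, uppercase, and reject non-ACGT characters."""
    cleaned = "".join(sequence.split()).upper()
    invalid = set(cleaned) - set(COMPLEMENT)
    if invalid:
        raise ValueError(f"unexpected characters: {sorted(invalid)}")
    return cleaned


def gc_fraction(sequence: str) -> float:
    """Fraction of G and C bases in the cleaned sequence."""
    cleaned = clean(sequence)
    return sum(base in "GC" for base in cleaned) / len(cleaned)


def reverse_complement(sequence: str) -> str:
    """Reverse complement, read 5' to 3' on the opposite strand."""
    return "".join(COMPLEMENT[base] for base in reversed(clean(sequence)))


for name, seq in PRIMERS.items():
    print(f"{name}: {len(clean(seq))} nt, GC {gc_fraction(seq):.2f}, "
          f"revcomp {reverse_complement(seq)}")
```

The reverse complement of a reverse primer gives the sequence as it appears on the sense strand of the transcript, which helps when locating the primer in a reference sequence.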

#### *4.4. Protein Analysis*

Western blots were generated by the semi-dry method. Protein lysates from cell lines were prepared using SIGMAFast protease inhibitor cocktail (Sigma, Taufkirchen, Germany). Proteins were transferred onto nitrocellulose membranes (Bio-Rad, München, Germany) and blocked with 5% dry milk powder dissolved in phosphate-buffered saline (PBS). The following antibodies were used: alpha-Tubulin (Sigma, #T6199), IRX1 (Biozol, Eching, Germany, #DF3225), and IRX3 (Biozol, #MBS8223417). For loading control, blots were reversibly stained with Ponceau (Sigma) and the detection of alpha-Tubulin (TUBA) was performed thereafter. Secondary antibodies were linked to peroxidase for detection by Western-Lightning-ECL (Perkin Elmer, Waltham, MA, USA). Documentation was performed using the digital system ChemoStar Imager (INTAS, Göttingen, Germany).

The enzyme-linked immunosorbent assay (ELISA) was used to quantify BMP2 protein levels in the supernatant of cell cultures. For this purpose, 2 × 10<sup>6</sup> cells were cultured in 2 mL of fresh medium in a 24-well plate. After 24 h, 1 mL of medium was harvested and frozen in aliquots. Quantification was performed using the Quantikine ELISA BMP-2 kit (R & D Systems, #DBP200), as described by the company. Two biological replicates were analyzed in triplicate.

#### *4.5. Karyotyping and Genomic Profiling Analysis*

Karyotyping was performed as described previously [75]. For genomic profiling, genomic DNA of AML cell lines was prepared by the Qiagen Gentra Puregene Kit (Qiagen). Labeling, hybridization, and scanning of Cytoscan HD arrays were performed by the Genome Analytics Facility located at the Helmholtz Centre for Infection Research (Braunschweig, Germany), using the manufacturer's protocols (Affymetrix, High Wycombe, UK). Data were interpreted using the Chromosome Analysis Suite software version 3.1.0.15 (Affymetrix, High Wycombe, UK) and copy number alterations were determined accordingly.

**Supplementary Materials:** The following supporting information can be downloaded at: https://www.mdpi.com/article/10.3390/ijms23063192/s1.

**Author Contributions:** Conceptualization, S.N.; formal analysis, S.N., C.P., C.M. and R.A.F.M.; investigation, S.N.; writing—original draft preparation, S.N.; writing—review and editing, R.A.F.M. All authors have read and agreed to the published version of the manuscript.

**Funding:** This research received no external funding.

**Institutional Review Board Statement:** Not applicable.

**Informed Consent Statement:** Not applicable.

**Data Availability Statement:** Data is contained within the article or Supplementary Materials.

**Conflicts of Interest:** The authors declare no conflict of interest.

#### **References**


## *Article* **NKL Homeobox Genes NKX2-3 and NKX2-4 Deregulate Megakaryocytic-Erythroid Cell Differentiation in AML**

**Stefan Nagel \*, Claudia Pommerenke, Corinna Meyer and Roderick A. F. MacLeod**

Department of Human and Animal Cell Lines, Leibniz-Institute DSMZ, German Collection of Microorganisms and Cell Cultures, 38124 Braunschweig, Germany; cpo14@dsmz.de (C.P.); cme@dsmz.de (C.M.); rafmacleod@gmail.com (R.A.F.M.)

**\*** Correspondence: sna@dsmz.de; Tel.: +49-531-2616167

**Abstract:** NKL homeobox genes encode transcription factors that impact normal development and hematopoietic malignancies if deregulated. Recently, we established an NKL-code that describes the physiological expression pattern of eleven NKL homeobox genes in the course of hematopoiesis, allowing evaluation of aberrantly activated NKL genes in leukemia/lymphoma. Here, we identify ectopic expression of NKL homeobox gene NKX2-4 in an erythroblastic acute myeloid leukemia (AML) cell line OCI-M2 and describe investigation of its activating factors and target genes. Comparative expression profiling data of AML cell lines revealed in OCI-M2 an aberrantly activated program for endothelial development including master factor ETV2 and the additional endothelial signature genes HEY1, IRF6, and SOX7. Corresponding siRNA-mediated knockdown experiments showed their role in activating NKX2-4 expression. Furthermore, the ETV2 locus at 19p13 was genomically amplified, possibly underlying its aberrant expression. Target gene analyses of NKX2-4 revealed activated ETV2, HEY1, and SIX5 and suppressed FLI1. Comparative expression profiling analysis of public datasets for AML patients and primary megakaryocyte–erythroid progenitor cells showed conspicuous similarities to NKX2-4 activating factors and the target genes we identified, supporting the clinical relevance of our findings and developmental disturbance by NKX2-4. Finally, identification and target gene analysis of aberrantly expressed NKX2-3 in AML patients and a megakaryoblastic AML cell line ELF-153 showed activation of FLI1, contrasting with OCI-M2. FLI1 encodes a master factor for myelopoiesis, driving megakaryocytic differentiation and suppressing erythroid differentiation, thus representing a basic developmental target of these homeo-oncogenes. 
Taken together, we have identified aberrantly activated NKL homeobox genes NKX2-3 and NKX2-4 in AML, deregulating genes involved in megakaryocytic and erythroid differentiation processes, and thereby contributing to the formation of specific AML subtypes.

**Keywords:** HOX-code; NKL-code; TALE-code; TBX-code

### **1. Introduction**

Stem and progenitor cells pass through several developmental stages and subsequently differentiate into mature cells and tissues. During early embryogenesis, hematopoietic and endothelial cells share a common progenitor, termed hemangioblast. Later in development, these cell types differentiate separately, starting from hematopoietic and endothelial stem cells, respectively. The process of hematopoiesis generates all types of blood and immune cells, split into the lymphoid and myeloid lineages. The latter produces mature cell types such as erythrocytes and megakaryocytes via the joint megakaryocyte and erythroid progenitor [1]. Megakaryocytes subsequently give rise to thrombocytes.

Differentiation processes during hematopoietic and endothelial development are mainly regulated at the transcriptional level [2,3]. The master genes responsible for controlling these processes mostly encode transcription factors (TFs). E26 transformation-specific (ETS) and NKL homeodomain factors represent two types of such developmental TFs and operate at specific stages and lineages. ETS factors share the conserved ETS domain, which forms a winged helix-turn-helix structure and performs both sequence-specific DNA binding and protein–protein interactions [4]. According to sequence similarities, their genes are classified into 13 groups [5]. ETV2 and FLI1 are two basic ETS factors of the ERG group, regulating early steps in hematopoietic and endothelial development. *ETV2* is expressed in the hemangioblast and drives endothelial differentiation in the adult, while FLI1 is active both in hematopoietic stem cells (HSCs) and during megakaryopoiesis [6,7].

**Citation:** Nagel, S.; Pommerenke, C.; Meyer, C.; MacLeod, R.A.F. NKL Homeobox Genes NKX2-3 and NKX2-4 Deregulate Megakaryocytic-Erythroid Cell Differentiation in AML. *Int. J. Mol. Sci.* **2021**, *22*, 11434. https://doi.org/10.3390/ijms222111434

Academic Editor: Amelia Casamassimi

Received: 24 September 2021 Accepted: 18 October 2021 Published: 22 October 2021

**Publisher's Note:** MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

**Copyright:** © 2021 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

Homeobox genes encode a homeodomain that consists of three helices, generating a helix-turn-helix structure. The homeodomain interacts with DNA, cofactors, and chromatin, forming a platform for gene regulation [8]. According to sequence similarities of their conserved homeobox, these genes are arranged into eleven classes and several subclasses [9]. For example, NKL homeobox genes represent a subclass of the Antennapedia (ANTP) class of homeobox genes. The human genome contains 49 NKL homeobox genes that play fundamental roles in embryonal development of tissues and organs and regulate cell differentiation in the adult. Examples are *NKX2-3*, which is expressed in developing spleen, intestine and in HSCs; *NKX2-4*, which is active in brain and testis development; and *NKX2-5*, which controls development of heart and spleen [10–14]. Recently, we have described the NKL-code that encompasses the gene signature of eleven NKL homeobox genes expressed in specific patterns during hematopoiesis [15]. Basic NKL-code members are *NKX2-3*, in addition to both *HHEX* and *HLX*, which orchestrate differentiation processes in myeloid and lymphoid lineages and are themselves regulated by specific hematopoietic TFs [16].

According to their functions in normal development, deregulated ETS and NKL homeobox genes promote generation of hematopoietic malignancies and are frequently targeted by chromosomal aberrations. For example, the ETS gene *ETV6* is fused with different partner genes by specific translocations in lymphoblastic and myeloid acute leukemias, and NKL homeobox gene *NKX2-5* is aberrantly activated via juxtaposition to an enhancer region of *BCL11B* in T-cell acute lymphoblastic leukemia (T-ALL) [17,18]. Other reported activating mechanisms for NKL homeobox genes are deregulated chromatin components, signaling pathways, and TFs [15]. Furthermore, aberrantly expressed ETS and NKL homeobox genes serve as diagnostic markers for certain hematopoietic cancers, such as *ETV6* in ALL and *TLX1* in T-ALL.

Acute myeloid leukemia (AML) is the most frequent acute leukemia in adults. The tumor cells derive from specific progenitor cells of the myeloid lineage. According to originating cells and stages, phenotypes, chromosomal aberrations, and gene mutations, several subtypes of AML are distinguished that differ in prognosis and treatment. The established French–American–British (FAB) system differentiates eight subtypes, called AML-M0 to -M7 [19]. For example, AML-M6 represents an erythroblastic subtype; the corresponding tumor cells express several erythropoietic genes and are blocked in development, unable to complete differentiation. Today, additional criteria including sequencing data serve to classify AML [20].

Systematic analysis of NKL homeobox genes in AML has revealed 18 deregulated genes, highlighting the importance of these developmental oncogenes in driving this malignancy [21]. Focused studies in AML have investigated activating factors and downstream functions of the selected deregulated NKL homeobox genes *NANOG*, *HMX2*, and *HMX3* and have shown the oncogenic roles of these genes [21,22]. In this study, we performed detailed analyses of the oncogenically deregulated NKL homeobox gene *NKX2-4* using AML cell line OCI-M2 as a model. We found that *NKX2-4* is aberrantly connected with the developmental ETS genes *ETV2* and *FLI1*, generating a leukemogenic network, which impacts myeloid differentiation.


#### **2. Results**

#### *2.1. NKX2-4 Expression in AML Cell Line OCI-M2*

Systematic examinations of aberrantly expressed NKL homeobox genes in AML patients and cell lines were performed using public gene expression profiling datasets, highlighting their frequent deregulation and oncogenic impact in this myeloid malignancy [21]. To investigate their pathological function in myeloid in vitro models, we searched for deregulated NKL homeobox genes in our published RNA-seq dataset LL-100, which covers 34 myeloid and 66 lymphoid leukemia/lymphoma cell lines. We detected *NKX2-4* expression in two AML cell lines, conspicuously high in OCI-M2 and low in THP-1 (Figure 1A). Additionally, B-cell lymphoma cell line U-2932 expressed elevated *NKX2-4* levels as well. Importantly, this NKL homeobox gene is not represented in standard expression profiling arrays, and its transcriptional deregulation has, therefore, yet to be reported in leukemia patients or cell lines.

**Figure 1.** *NKX2-4* expression in hematopoietic cell lines. (**A**) LL-100 RNA-seq data show enhanced expression of NKL homeobox gene *NKX2-4* in AML cell line OCI-M2, while THP-1 expresses low levels (blue arrow heads). Gene expression values are given as DESeq2 normalized count data. (**B**) RQ-PCR analysis confirms high transcript levels in OCI-M2, while THP-1 and primary testis express lower levels. The expression level in OCI-M2 was set as 1. (**C**) Immunostaining of NKX2-4 protein in OCI-M2 cells, showing signals in both cytoplasm and nucleus (green). DAPI was used as nuclear counterstain (blue). (**D**) RQ-PCR analysis of *NKX2-2* in selected AML cell lines and T-cell lymphoma cell lines DERL-2 and DERL-7 shows elevated transcript levels in the latter, while THP-1 tested negative. The expression level in OCI-M2 was set as 1.

Analyses of additional public RNA-seq datasets containing samples from normal cells and tissues showed an absence of *NKX2-4* expression in developing and mature hematopoietic cells but showed its presence in the hypothalamus, pituitary gland, and testis (Figure S1). RQ-PCR and immunostaining confirmed *NKX2-4* activity in OCI-M2 cells at the RNA and protein levels, respectively (Figure 1B,C). The subcellular distribution of NKX2-4 protein showed localization in both nucleus and cytoplasm, suggesting functional regulation of this TF via nuclear import. Furthermore, combined analysis of *NKX2-4* in a primary testis sample indicated enhanced transcript levels in OCI-M2 (Figure 1B). Thus, *NKX2-4* is ectopically overexpressed in AML cell line OCI-M2, which was therefore used as a model to investigate its oncogenic role in this malignancy, including activating mechanisms and target genes.

#### *2.2. Karyotyping and Genomic Profiling of OCI-M2*

NKL homeobox genes are frequently deregulated by chromosomal aberrations [15]. To check if the *NKX2-4* locus is targeted by chromosomal rearrangements, we performed karyotyping of OCI-M2. The resultant karyotype was as follows: 51(46–51) < 2n > XX, der(X)t(X;8)(q23;q23), +6, +8, add(9)(p23), del(9)(p12p21), t(10;12)(p12;p12), del(17)(q11q21.1), +20, −21, +3 mar. However, no aberrations of the *NKX2-4* locus at 20p11 were detected in OCI-M2. Nevertheless, the observed trisomy of chromosome 20 may boost *NKX2-4* expression by copy number gain.

Furthermore, we performed genomic profiling analysis of OCI-M2 to identify potential copy number alterations. The data for all chromosomes are shown in Figure 2, indicating, however, absence of a focal gain at the *NKX2-4* locus. In contrast, the results show several copy number alterations at other chromosomal positions, including duplications at 7q31-q36 and 8q22-q24, strongly amplified regions at 19p13 and 21q22, and deletions at 9p23-p24, 12p12, Xp11, and Xq12-q22. These regions may be indirectly implicated in *NKX2-4* deregulation. Interestingly, genomic profiling data of THP-1 cells that weakly transcribed *NKX2-4* showed a focal amplification at the *NKX2-4* locus (Figure S2A). However, this aberration enhanced neither the expression level of *NKX2-4* nor that of its neighboring gene *NKX2-2*, as demonstrated by RQ-PCR analysis (Figure 1B,D), indicating absence of cognate activating TFs in THP-1. Of note, T-cell lymphoma cell lines DERL-2 and DERL-7 have been shown to aberrantly express *NKX2-2* and were used here as controls [23].

#### *2.3. OCI-M2 Displays an Aberrant Program of Endothelial Development*

To identify *NKX2-4* activating factors, we performed comparative expression profiling analysis of AML cell line OCI-M2 versus 31 AML control cell lines using public dataset GSE59808 and the associated online tool GEO2R, which calculates the 250 most statistically significant differences in gene expression levels. This examination revealed genes that were differentially up- or downregulated in OCI-M2 (Table S1). Subsequent gene set annotation analysis of these genes using the public online platform DAVID showed several significantly associated GO terms, including activated chromatin/histones and endothelial signaling (Table S2). The latter GO term was of special interest because of the close relationships between endothelial and hematopoietic development.

In support of this finding, OCI-M2 overexpressed *ETV2*, encoding a master factor for endothelial development, in addition to members of its associated gene signature, including *EPOR*, *HEY1*, *KDR*, and *SOX7* [24–26] (Table S1). RNA-seq data, RQ-PCR, and Western blot analyses of selected AML cell lines confirmed enhanced expression of these genes in OCI-M2 (Figure 3A–C and Figure S3). Furthermore, according to the comparative expression profiling data, *IRF6* and the histone genes *H1C* and *H2BB* from the HIST1 cluster were also overexpressed, while *FLI1* and *KDM6A* were downregulated in OCI-M2 (Figure 3D,E and Figure S3, Table S1). Of note, *IRF6* has been associated with endothelial development as well [27], thus representing an additional overexpressed TF in this context.

Interestingly, overexpressed *ETV2* is located at chromosomal position 19p13, which is targeted by genomic amplification in OCI-M2, while suppressed *KDM6A* is located at Xp11, which is focally deleted (Figure 2 and Figure S2B). Thus, these genomic aberrations may directly cause the deregulated activities observed for the indicated genes. Whole chromosome gains may underlie elevated expression levels of HIST1-genes at 6p22, *SOX7* at 8p23, *HEY1* at 8q21, and *NKX2-4* at 20p11 (Figure 2). However, a potential role for overexpressed histone genes in aberrant *NKX2-4* expression remained unclear, although we additionally detected elevated levels of ubiquitinated H2B in OCI-M2 (Figure 3D), which has been shown to impact NKL homeobox gene activity in B-cell lymphoma [28]. Taken together, AML cell line OCI-M2 displays an aberrantly activated program for endothelial development, which may drive its oncogenic transformation, notably including *NKX2-4* expression. Copy number alterations may underlie aberrant activation of particular endothelial signature genes.


**Figure 2.** Genomic profiling data of OCI-M2. The data show copy number states for all chromosomes. The y-axis indicates the copy number state, the x-axis the chromosomal position. Selected gene loci are indicated, including *IRF6*, *HIST1*, *SOX7*, *HEY1*, *EPOR*, *ETV2*, *SIX5*, *NKX2-4*, and *KDM6A*.

**Figure 3.** RQ-PCR and Western blot analysis of selected genes in AML cell lines. (**A**) RQ-PCR and Western blot analyses of ETV2 (**left**) and HEY1 (**right**) demonstrate elevated expression levels in OCI-M2. (**B**) RQ-PCR analyses of EPOR (**left**) and KDR (**right**) demonstrate increased expression levels in OCI-M2. (**C**) RQ-PCR analyses of SOX7 (**left**) and IRF6 (**right**) demonstrate elevated expression levels in OCI-M2. (**D**) RQ-PCR analyses of H1C (**left**) and H2BB (**middle**) demonstrate high expression levels in OCI-M2. Raised levels of ubiquitinated H2B are shown by Western blot analysis, using p38 as loading control (**right**). (**E**) RQ-PCR and Western blot analyses of KDM6A (**left**) and FLI1 (**right**) demonstrate reduced expression levels in OCI-M2. The cell lines OCI-M2 and ELF-153 are indicated in red and green. The expression levels of OCI-M2 were set as 1, except for analyses of KDM6A (GDM-1 was set as 1) and FLI1 (KG1A was set as 1).

#### *2.4. Endothelial Transcription Factors Activate NKX2-4 in OCI-M2*

To examine whether the overexpressed endothelial TFs contribute to *NKX2-4* expression, we screened its promoter region for potential TF binding sites, using the UCSC genome browser. This approach revealed a SOX consensus site, as shown in Figure 4A. Accordingly, siRNA-mediated knockdown of overexpressed *SOX7* in OCI-M2 resulted in concomitantly reduced *NKX2-4* expression, showing that this endothelial TF activated NKL homeobox gene *NKX2-4* (Figure 4B). Then, a search for additional potential binding sites at *NKX2-4* using the CIS-BP database indicated an IRF6-site at −2648 bp, an ETV2-site at −2021 bp, and a HEY1-site within exon 2. To analyze their regulatory impact on *NKX2-4* expression, we performed corresponding siRNA-mediated knockdown experiments. The results showed that all three factors (IRF6, ETV2, and HEY1) activated expression of *NKX2-4* in OCI-M2 (Figure 4C,D).


**Figure 4.** Endothelial TFs activate *NKX2-4* in OCI-M2. (**A**) TF binding sites at the *NKX2-4* locus were obtained from the UCSC genome browser. A consensus binding site for SOX factors is indicated. (**B**) RQ-PCR analysis of OCI-M2 after siRNA-mediated knockdown of *SOX7* shows concomitantly reduced transcript levels of *NKX2-4*. (**C**) RQ-PCR analyses of OCI-M2 after siRNA-mediated knockdown of *IRF6* (**left**) and *HEY1* (**right**) show concomitantly reduced transcript levels of *NKX2-4*. (**D**) RQ-PCR analysis of OCI-M2 after siRNA-mediated knockdown of *ETV2* shows concomitantly reduced transcript levels of *NKX2-4* (**left**). Reporter gene assay for a potential ETV2 binding site at *NKX2-4* was performed in NIH-3T3 cells, showing a suppressive effect. Nevertheless, these context-dependent results may indicate direct regulation of *NKX2-4* by ETV2 (**middle**). Forced expression of ETV2 in THP-1 cells resulted in strongly elevated *NKX2-4* expression (**right**). Statistical significance was assessed by Student's *t*-test (two-tailed), and the calculated *p*-values are indicated by asterisks (\* *p* < 0.05, \*\* *p* < 0.01, \*\*\* *p* < 0.001, n.s. not significant). The expression levels of siCTR-treated and vector-treated cells were set as 1.

To study the role of ETV2 in more detail, we established a reporter gene assay for the identified binding site, showing that ETV2 regulated *NKX2-4* directly (Figure 4D). Furthermore, forced expression of ETV2 in THP-1 cells resulted in strongly elevated transcript levels of *NKX2-4* (Figure 4D), highlighting the activatory power of this TF in AML cells. Taken together, the endothelial TFs SOX7, IRF6, HEY1, and ETV2 are aberrantly overexpressed activators of *NKX2-4* in AML cell line OCI-M2.

#### *2.5. NKX2-4 Impacts Erythroid Development*

Functional analyses of NKX2-4 were performed by live-cell imaging. Accordingly, OCI-M2 cells were treated for *NKX2-4* knockdown and subsequently quantified for proliferation and apoptosis (Figure 5A). However, the results show no significant impact on these processes. To better understand the potential oncogenic role of TF NKX2-4 in AML, we searched for its target genes. We postulated that the above-identified *NKX2-4* regulators ETV2 and HEY1 may simultaneously represent NKX2-4 target genes because they contain consensus binding sites for NKX2-4 at −1235 bp and −961 bp, respectively. Indeed, siRNA-mediated knockdown of *NKX2-4* resulted in concurrent downregulation of *ETV2* and *HEY1*, confirming targeting of these genes by NKX2-4 and thus their participation in an aberrant mutually activating network (Figure 5B). The known role of *ETV2* and *HEY1* in endothelial development may indicate that *NKX2-4* deregulates differentiation processes in AML.

To search for NKX2-4 target genes more systematically, we performed expression profiling analysis of OCI-M2 after *NKX2-4* knockdown in comparison with a control (Table S3). Gene set annotation analysis of the top-1000 upregulated and downregulated genes revealed several GO terms associated with tissue and organ development (Table S4). Thus, these results further support that NKX2-4 may deregulate developmental processes in AML. However, this approach showed no GO term specifically associated with myeloid differentiation. Therefore, after inspection of these differentially expressed genes, we selected six myeloid-associated candidates encoding TFs, receptors, and markers for detailed analyses—namely *FLI1*, *FOXA1*, *MAML2*, *SIX5*, *SIRPA*, and *TGFBR1* [7,29–33]. Again, siRNA-mediated knockdown of *NKX2-4* in OCI-M2 and subsequent transcript quantification by RQ-PCR confirmed that NKX2-4 activated *FOXA1*, *MAML2*, and *SIX5* (Figure 5C) and repressed *FLI1* and *SIRPA* (Figure 5D). The identified NKX2-4 target genes may thus play a role in deregulated myeloid development and thus in the leukemogenesis of OCI-M2.

To evaluate our findings, we analyzed gene expression data obtained from primary samples, including peripheral blood cells from AML patients as well as hematopoietic progenitor cells from healthy donors. OCI-M2 is derived from a patient with acute erythroblastic leukemia, AML-M6 [34]. To compare our cell line data with those from patients, we performed comparative expression profiling analysis of samples from six AML-M6 patients versus 429 controls (patients with AML-M0, -M1, -M2, -M3, -M4, and -M5) using public dataset GSE6891 and analysis tool GEO2R. Differentially expressed genes revealed by this approach included significantly upregulated *EPOR*, *GATA1*, *HIST1*, *IRF6*, and *SIX5*, and significantly downregulated *FLI1* in AML-M6 (Table S5). Thus, key regulators and target genes of NKX2-4 identified in OCI-M2 are likewise expressed in AML-M6 patients, supporting the clinical relevance of our data obtained from a cell line model.

**Figure 5.** NKX2-4 target gene analyses. (**A**) Live-cell imaging analysis of OCI-M2 cells treated by siRNA-mediated knockdown of *NKX2-4* shows no significant impact on proliferation (**left**) or apoptosis (**right**). (**B**) RQ-PCR analyses of *ETV2* (**left**) and *HEY1* (**right**) in OCI-M2 after siRNA-mediated knockdown of *NKX2-4* confirm these genes as activated targets. (**C**) RQ-PCR analyses of *FOXA1*, *MAML2*, *SIX5*, and *TGFBR1* in OCI-M2 after siRNA-mediated knockdown of *NKX2-4* confirm these genes as activated targets. (**D**) RQ-PCR analyses of *SIRPA* (**left**) and *FLI1* (**right**) in OCI-M2 after siRNA-mediated knockdown of *NKX2-4* confirm these genes as suppressed targets. (**E**) Expression profiling analysis of developing myeloid cells using public dataset GSE42519 shows elevated expression levels of *IRF6*, *EPOR*, and *HBB* and reduced levels of *FLI1* and *SIRPA* in megakaryocyte–erythroid progenitor cells (MEP). The y-axis represents the expression levels. (**F**) RQ-PCR analyses of selected AML cell lines for *HBB* (**left**) and *SIRPA* (**right**) show respective elevated and reduced expression levels in OCI-M2. (**G**) Live-cell imaging analysis of OCI-M2 cells treated by forced expression of FLI1 shows significant reduction in proliferation. Statistical significance was assessed by Student's *t*-test (two-tailed), and the calculated *p*-values are indicated by asterisks (\* *p* < 0.05, \*\* *p* < 0.01, \*\*\* *p* < 0.001, n.s. not significant). The expression levels of siCTR-treated cells and of untreated OCI-M2 were set as 1.

In addition, we compared expression profiling data of samples from normal megakaryocyte and erythroid progenitors (MEPs) versus more differentiated granulopoietic progenitors, including promyelocytes, myelocytes, metamyelocytes, and band cells, using dataset GSE42519 and analysis tool GEO2R. This approach revealed downregulated *FLI1* and *SIRPA* and upregulated *IRF6*, *EPOR*, *GATA1*, and *HBB* in MEPs (Table S6, Figure 5E). FLI1, SIRPA, GATA1, and HBB play basic roles in myeloid development, especially in erythropoiesis [7,32,35]. Accordingly, RQ-PCR analysis demonstrated that OCI-M2 expressed elevated *GATA1* and *HBB* and reduced *SIRPA* levels as well (Figure 5F), consistent with its erythroblastic phenotype resembling MEPs.

*FLI1* encodes a key myeloid TF, repressing erythroid while activating megakaryocytic differentiation [7,36]. Therefore, aberrant suppression of *FLI1* by *NKX2-4* in OCI-M2 may promote leukemogenic transformation towards erythroblastic cells. Accordingly, forced expression of FLI1 in OCI-M2 cells inhibited proliferation, suggesting that FLI1-mediated cell differentiation processes are uniformly connected with termination of cell growth (Figure 5G). Taken together, our experimental data for *NKX2-4* in acute erythroblastic leukemia cell line OCI-M2 correspond to expression data from AML-M6 patients and primary MEPs, suggesting that this aberrantly activated NKL homeobox gene may provoke a developmental defect in the process of megakaryocyte–erythroid differentiation.

#### *2.6. NKX2-3 Impacts Megakaryocytic Development*

As mentioned above, *NKX2-4* is not represented in standard expression profiling datasets. However, our comparative profiling results of AML patients (M6 versus M0 to M5) showed ectopic expression of NKL homeobox gene *NKX2-3* in AML-M6 patients (Table S5). This finding indicated that NKX2-3 and the closely related TF NKX2-4 may control similar oncogenic processes in this particular AML subtype. To compare their regulatory impacts in vitro, we searched for an *NKX2-3*-expressing AML cell line model. A renewed screening of RNA-seq dataset LL-100 revealed cell line ELF-153, which, however, derives from acute megakaryoblastic leukemia, AML-M7 [37]. RQ-PCR and Western blot analyses confirmed *NKX2-3* expression in ELF-153 at the RNA and protein level, respectively, validating its suitability for functional tests (Figure 6A).

Subsequent siRNA-mediated knockdown experiments in ELF-153 demonstrated that NKX2-3 activated *ETV2*, *SIX5*, and *FLI1* and suppressed *HEY1* (Figure 6B). Thus, NKX2-3 differs from NKX2-4 in regulation of *HEY1* and *FLI1*, while both NKL-TFs activated *ETV2* and *SIX5*. Accordingly, RQ-PCR and Western blot analysis demonstrated that ELF-153 expressed elevated levels of *FLI1*, and OCI-M2 reduced levels of *FLI1* (Figure 3E). Furthermore, siRNA-mediated knockdown of *FLI1* in ELF-153 resulted in slightly decreased proliferation, as analyzed by live-cell imaging (Figure 6C). Thus, proliferation was promoted by FLI1 in ELF-153, contrasting with OCI-M2. *NKX2-4* is aberrantly expressed in erythroblastic leukemia cell line OCI-M2 and represses *FLI1*, while *NKX2-3* is aberrantly expressed in megakaryoblastic leukemia cell line ELF-153 and activates *FLI1*. *FLI1* contains a consensus binding site for both NKX2-3 and NKX2-4 in its upstream region at −2564 bp,


**Figure 6.** Expression and target gene analysis of NKX2-3. (**A**) LL-100 RNA-seq data show enhanced expression of NKL homeobox gene NKX2-3 in AML cell line ELF-153 (black arrowhead), while this gene is silent in OCI-M2 (blue arrowheads). Gene expression values are given as DESeq2 normalized count data. RQ-PCR and Western blot analyses of selected AML cell lines confirm enhanced expression of NKX2-3 in ELF-153 (insert). The expression level of ELF-153 was set as 1. (**B**) RQ-PCR analyses of ETV2 (**above left**), SIX5 (**above right**), FLI1 (**below left**), and of HEY1 (**below right**) in ELF-153 after siRNA-mediated knockdown of NKX2-3 confirm these genes as targets. Of note, FLI1 is activated by NKX2-3. (**C**) Live-cell imaging analysis of ELF-153 cells treated for knockdown of FLI1 shows reduction in proliferation (*p* = 0.021). (**D**) Expression profiling analysis of developing myeloid cells using public dataset GSE42519 shows elevated expression levels of NKX2-3 and FLI1 in hematopoietic stem cells (HSC). (**E**) RQ-PCR analyses of GATA1, GATA2, and HBB in OCI-M2 (**above**) and ELF-153 (**below**) after siRNA-mediated knockdown of SIX5 show differences in gene regulation between these AML cell lines. Statistical significance was assessed by Student's *t*-test (two-tailed), and the calculated *p*-values are indicated by asterisks (\* *p* < 0.05, \*\* *p* < 0.01, \*\*\* *p* < 0.001, n.s. not significant). The expression levels of siCTR-treated cells were set as 1.

GATA1 and GATA2 represent additional regulators in megakaryopoiesis and erythropoiesis [35,38]. Furthermore, GATA1 has been shown to interact with the coactivating TFs SIX1 and SIX2 in erythropoiesis, and *GATA2* is regulated by SIX1 in embryonal development of the placodes [39,40]. Hence, we asked whether SIX5, the target activated by both NKX2-3 and NKX2-4, regulates or cooperates with GATA1 and/or GATA2. Thus, we performed *SIX5* knockdown experiments in OCI-M2 and ELF-153 (Figure 6E). The results show that SIX5 activated the expression of *GATA1* in both cell lines, while *GATA2* was activated by SIX5 only in OCI-M2. GATA1-target gene *HBB* was suppressed by SIX5 in ELF-153, while no impact was detectable in OCI-M2. Thus, SIX5 functionally differed between the two cell lines. In OCI-M2, *SIX5* supported erythropoiesis via *GATA1* and *GATA2*, while in ELF-153, *SIX5* supported megakaryopoiesis via *GATA1* and inhibited erythropoiesis via *HBB*. Taken together, our findings highlight that the aberrantly expressed NKL homeobox genes *NKX2-3* and *NKX2-4* manipulate developmental lineage decisions in respective megakaryoblastic and erythroblastic AML via *FLI1* deregulation and *SIX5* activation.


#### **3. Discussion**

In this study, we report aberrant expression of NKL homeobox genes *NKX2-3* and *NKX2-4* in cell lines derived from two different AML subtypes. *NKX2-4* is ectopically activated in erythroblastic AML-M6 cell line OCI-M2 via the endothelial TFs ETV2, HEY1, IRF6, and SOX7, generating an aberrant developmental gene network. Prominent target genes identified are repressed *FLI1* and activated *SIX5* (Figure 7). In contrast, *NKX2-3* is aberrantly expressed in megakaryoblastic AML-M7 cell line ELF-153 by so far unknown factors and activates both *FLI1* and *SIX5* (Figure 7). In AML-M6 patients, we were also able to find gene activities of endothelial activators and target genes of NKX2-4 first identified in OCI-M2. Thus, this comparison with AML patient data revealed interesting concordances, supporting the clinical relevance of our data. Comparison with primary megakaryocyte–erythroid progenitors indicated deregulation of developmental processes by these aberrantly expressed NKL homeobox genes, operating at that developmental stage. Our data revealed deregulation of master factor FLI1. This ETS factor controls the differentiation of megakaryocytes and erythrocytes and may, therefore, represent a new key target in megakaryoblastic and erythroblastic AML.

**Figure 7.** Aberrant gene regulatory network of *NKX2-4* in OCI-M2 (**above**) and *NKX2-3* in ELF-153 (**below**). These NKL homeobox genes promote respective erythroblastic and megakaryoblastic leukemogenesis via their target genes *FLI1* and *SIX5*. Endothelial activators of *NKX2-4* are indicated.

*NKX2-3* is a member of the NKL-code and hematopoietically expressed in HSCs [11]. This restricted expression pattern may indicate a physiological role of this NKL homeobox gene in the control of stemness and lineage differentiation, while its deregulation may promote leukemogenesis by disturbing these developmental processes. Aberrant *NKX2-3* activity has been described in AML patients carrying mutations of *NPM1* or aberrations of *KMT2A* [41–43]. Enhanced *NKX2-3* expression correlates with *KMT2A*-rearrangements in T-ALL as well [44]. Moreover, *NKX2-3* is a direct target gene of *KMT2A-ENL* and impacts proliferation and cell differentiation [45]. In accordance with these published results, combined analysis of expression profiling data from normal hematopoietic cells and AML patient samples demonstrated physiological *NKX2-3* activity in stem cells and aberrant expression in patients with *KMT2A* rearrangements and complex karyotypes (Figure S4). In addition, *NKX2-3* expression has been detected in myelodysplastic syndrome, diffuse large B-cell lymphoma, T-ALL, and T-cell lymphoma, showing its oncogenic potential in both myeloid and lymphoid cell lineages [15]. Increased colony forming and replating capacity after NKX2-3 overexpression has been shown in murine hematopoietic progenitor cells [42], supporting an impact in early differentiation processes.

In contrast, *NKX2-4* lies outside the NKL-code and is, therefore, normally silent throughout hematopoiesis. This gene is not represented on standard expression arrays and thus is less studied in cancer patients. Nevertheless, a screen of T-ALL patients revealed a chromosomal translocation that juxtaposes *NKX2-4* to the *TCRA* gene, supporting its oncogenic activity in hematopoietic malignancies [46].

Our results show that aberrantly expressed endothelial TFs ETV2, HEY1, IRF6, and SOX7 activate *NKX2-4* in AML cell line OCI-M2. The development of endothelial and hematopoietic cell types is closely linked during embryogenesis. The hemangioblast represents an embryonic stem cell that forms the basis of developing blood and endothelial cells. Although this stem cell is absent in the adult, several control genes are shared between these lineages [2]. *ETV2* is a master gene of the hemangioblast and regulates endothelial development in the adult [6]. In the embryo, ETV2 activates myelopoiesis via *RUNX1*, *SPI1*, and *TAL1* [47,48]. In the endothelial development of the heart, *ETV2* is a target gene of the heart master factor NKX2-5 [49]. This physiological connection may be aberrantly captured by the related *NKX2-4* in OCI-M2. Additional endothelial factors are encoded by *FLI1*, *HEY1*, *IRF6*, and *SOX7* [25,27,50,51]. Chromosomal amplifications and chromosomal gains underlie their aberrant activation in OCI-M2. Several of these chromosomal regions are rearranged in erythroleukemic AML patients, including 6p22, 19p13, 20p11, and Xp11, supporting the pathogenic basis of our findings [52].

*FLI1* is a master gene in myelopoiesis, driving megakaryocyte differentiation and repressing erythropoiesis [53]. Furthermore, *FLI1* is prominently expressed in HSCs, alongside *NKX2-3* [7,11]. This coexpression, together with our data showing a regulatory connection between both genes, may indicate that *FLI1* represents a target gene of NKX2-3 in HSCs as well. B-cell lymphoma cell line U-2932 expressed elevated *NKX2-4* and reduced *FLI1* (Figure 1 and Figure S3), suggesting that the repressive impact of NKX2-4 may play a role in malignant lymphoid cells as well. *FLI1* and *IRF6* play a role in both endothelial and erythroid development [53,54]. Additional genes connected with NKL homeobox genes in our study are *HBB* and *SIRPA*. *HBB* encodes beta-globin expressed in erythroid progenitors and *SIRPA* a receptor implicated in innate immunological processes [32]. *SIRPA* is highly expressed in granulocytes and monocytes, while downregulated in megakaryocytes and erythrocytes [55]. Thus, NKX2-4-mediated inhibition of *SIRPA* may shift the myeloid differentiation of granulocytes and monocytes towards megakaryocytes and erythrocytes.

SIX1 is implicated in erythropoiesis by interaction with and activation of *GATA1* [39]. *GATA2* is suppressed by SIX1 and SIX2 during erythropoiesis but activated by SIX1 in Hodgkin lymphoma [39,56]. Furthermore, *SIX1* may promote the development of MEPs [57]. Thus, several studies show a regulatory connection between SIX and GATA factors. Here, our data indicate that aberrantly activated *SIX5* impacts *GATA1* and *GATA2* in AML. Depending on the cell type, *SIX5* can promote erythropoietic or megakaryopoietic processes.

Taken together, this study shows that *FLI1* and *SIX5* represent two target genes of NKX2-3 and NKX2-4 that disturb the differentiation of megakaryocytes and erythrocytes. Overexpression experiments of *NKX2-3* and *NKX2-4* in primary progenitor cells may support this interpretation in the future. However, a developmental impact at the same stage of differentiation was described for NKL homeobox gene *DLX4* [58], highlighting the potential of these genes in lineage decisions. Moreover, we also show that deregulated *NKX2-3* and *NKX2-4* impact developmental gene activities in AML in a context-dependent manner. Detection of their aberrant expression may assist the diagnosis and prognosis of certain AML subtypes.

#### **4. Materials and Methods**

#### *4.1. Bioinformatic Analyses of RNA-Seq and Expression Profiling Data*

For screening cell lines, we exploited RNA-sequencing data from 100 leukemia/lymphoma cell lines (termed LL-100), available at ArrayExpress (www.ebi.ac.uk/arrayexpress) (accessed on 18 October 2021) via E-MTAB-7721. Gene expression values are given as DESeq2 normalized count data [59]. Expression data for normal cell types were obtained from Gene Expression Omnibus (GEO, www.ncbi.nlm.nih.gov) (accessed on 18 October 2021), using RNA-seq dataset GSE69239 and expression profiling dataset GSE42519 [60,61], in addition to RNA-seq data from The Human Protein Atlas (www.proteinatlas.org) (accessed on 18 October 2021). Gene expression profiling data from AML cell lines and patients were examined using datasets GSE59808 and GSE15434, respectively [62]. Combined expression analysis of normal hematopoietic cells (GSE42519) and AML patients (GSE13159) was performed using the online database BloodSpot [63]. Gene set annotation analysis was performed using the online tool DAVID (www.david.abcc.ncifcrf.gov) (accessed on 18 October 2021) [64]. Consensus binding sites for TFs were obtained from the CIS-BP database (www.cisbp.ccbr.utoronto.ca) (accessed on 18 October 2021) and were used for screening at the UCSC genome browser (www.genome.cse.ucsc.edu) (accessed on 18 October 2021). Expression profiling data from siRNA-treated OCI-M2 cells were generated at the Genome Analytics Facility (Helmholtz Centre for Infection Research, Braunschweig, Germany) using HG U133 Plus 2.0 gene chips (Affymetrix, High Wycombe, UK). The data are available from ArrayExpress via E-MTAB-10941. After RMA background correction and quantile normalization of the spot intensities, the profiling data were expressed as ratios of sample means and subsequently log2 transformed. Data processing was performed via R/Bioconductor using public limma and affy packages.
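The final transformation step described above (ratios of sample means, followed by a log2 transformation) can be illustrated with a minimal Python sketch. The intensity values and the helper name `log2_ratio_of_means` are hypothetical and serve only as an illustration; the published analysis was run in R/Bioconductor with the limma and affy packages, which this sketch does not reproduce.

```python
import math
from statistics import mean


def log2_ratio_of_means(knockdown, control):
    """Return log2(mean(knockdown) / mean(control)) for one probe set.

    Both inputs are lists of normalized, linear-scale intensities
    (hypothetical values in this sketch, not data from the study).
    """
    if not knockdown or not control:
        raise ValueError("both groups need at least one value")
    ratio = mean(knockdown) / mean(control)
    return math.log2(ratio)


# Hypothetical intensities for a single probe set
si_nkx2_4 = [100.0, 150.0]  # knockdown replicates, mean 125
si_ctr = [400.0, 600.0]     # control replicates, mean 500

print(log2_ratio_of_means(si_nkx2_4, si_ctr))  # -2.0
```

On this scale a value of −2 corresponds to a fourfold reduction after knockdown, and a value of 0 means no change.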

#### *4.2. Cell Lines and Treatments*

Cell lines are held by the DSMZ (Braunschweig, Germany) and cultivated as described previously [65]. All cell lines had been authenticated and tested negative for mycoplasma infection. Modification of gene expression levels was performed using gene-specific siRNA oligonucleotides with reference to AllStars negative Control siRNA (siCTR) obtained from Qiagen (Hilden, Germany). The gene expression constructs for ETV2, NKX2-4, and FLI1, in addition to an empty control-vector, were obtained from Origene (Wiesbaden, Germany). SiRNAs (80 pmol) and vector DNA (2 µg) were transfected into 1 × 10<sup>6</sup> cells by electroporation using the EPI-2500 impulse generator (Fischer, Heidelberg, Germany) at 350 V for 10 ms. Electroporated cells were harvested after 20 h cultivation.

Proliferation and apoptosis were analyzed using the IncuCyte S3 Live-Cell Analysis System (Essen Bioscience, Hertfordshire, UK). For detection of apoptotic cells, we used the IncuCyte Caspase-3/7 Green Apoptosis Assay diluted at 1:2000 (Essen Bioscience, Essen, Germany). Live-cell imaging experiments were performed twice with fourfold parallel tests.

#### *4.3. Polymerase Chain-Reaction (PCR) Analyses*

Total RNA was extracted from cultivated cell lines using TRIzol reagent (Invitrogen, Darmstadt, Germany). Primary human total RNA derived from testis was purchased from Biochain/BioCat (Heidelberg, Germany). cDNA was synthesized using 5 µg RNA, random priming, and Superscript II (Invitrogen). Real time quantitative (RQ)-PCR analysis was performed using the 7500 Real-time System and commercial buffer and primer sets (Applied Biosystems/Life Technologies, Darmstadt, Germany). For normalization of expression levels, we quantified the transcripts of TATA box binding protein (TBP). Quantitative analyses were performed as biological replicates and measured in triplicate. Standard deviations are presented in the figures as error bars. Statistical significance was assessed by Student's *t*-test (two-tailed) and the calculated *p*-values are indicated by asterisks (\* *p* < 0.05, \*\* *p* < 0.01, \*\*\* *p* < 0.001, n.s. not significant).
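As an illustration of how TBP-normalized expression levels and the asterisk notation relate to raw values, the following Python sketch may help. It assumes the widely used 2<sup>−ΔΔCt</sup> calculation, which the text does not explicitly name as the method applied here; the Ct values and the function names `relative_expression` and `significance_label` are hypothetical. Only the *p*-value thresholds are taken from the text.

```python
def relative_expression(ct_target, ct_tbp, ct_target_ctrl, ct_tbp_ctrl):
    """Relative transcript level by the 2^-ddCt method (assumed, not stated in the text).

    The target gene is normalized to TBP in the sample and in the control;
    the control condition (e.g., siCTR-treated cells) is thereby set as 1.
    """
    d_ct_sample = ct_target - ct_tbp
    d_ct_control = ct_target_ctrl - ct_tbp_ctrl
    return 2.0 ** -(d_ct_sample - d_ct_control)


def significance_label(p_value):
    """Map a p-value to the asterisk notation used in the figure legends."""
    if p_value < 0.001:
        return "***"
    if p_value < 0.01:
        return "**"
    if p_value < 0.05:
        return "*"
    return "n.s."


# Hypothetical Ct values: target gene after knockdown versus siCTR
print(relative_expression(26.0, 24.0, 25.0, 24.0))  # 0.5
print(significance_label(0.021))                    # *
```

In this hypothetical case the knockdown sample shows half the transcript level of the control, and a *p*-value of 0.021 receives a single asterisk.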

#### *4.4. Protein Analysis*

Western blots were generated by the semi-dry method. Protein lysates from cell lines were prepared using SIGMAFast protease inhibitor cocktail (Sigma, Taufkirchen, Germany). Proteins were transferred onto nitrocellulose membranes (Bio-Rad, Munich, Germany) and blocked with 5% dry milk powder dissolved in phosphate-buffered saline buffer (PBS). The following antibodies were used: alpha-Tubulin (Sigma, #T6199), NKX2-4 (Novus Biologicals, Centennial, CO, USA, #NBP1-91541), ETV2 (MyBioSource, San Diego, CA, USA, #MBS7112932), HEY1 (Novus Biologicals, #NBP2-56068), ubiquitinated H2B (Cell Signaling, Frankfurt, Germany, #5546), p38 (Cell Signaling, #8690), FLI1 (Thermo Fisher Scientific, Darmstadt, Germany, #MA1-196), NKX2-3 (Abcam, Cambridge, UK, #ab66366). As loading control, blots were reversibly stained with Ponceau (Sigma), and detection of alpha-Tubulin (TUBA) was performed thereafter. Secondary antibodies were linked to peroxidase for detection by Western-Lightning-ECL (Perkin Elmer, Waltham, MA, USA). Documentation was performed using the digital system ChemoStar Imager (INTAS, Göttingen, Germany).

Immuno-cytology was performed as follows: cells were spun onto slides and subsequently air-dried and fixed with methanol/acetic acid for 90 s. Antibodies were diluted 1:20 in PBS containing 5% BSA and incubated for 30 min. Washing was performed 3 times with PBS. Preparations were incubated with secondary antibody (diluted 1:100) for 20 min. After final washing, cells were mounted for nuclear staining in Vectashield (Vector Laboratories, Burlingame, CA, USA), containing DAPI. Documentation of subcellular protein localization was performed using an Axio-Imager microscope (Zeiss, Göttingen, Germany) configured to a dual Spectral Imaging system (Applied Spectral Imaging, Neckarhausen, Germany).

#### *4.5. Karyotyping and Genomic Profiling Analysis*

Karyotyping was performed as described previously [66]. For genomic profiling, genomic DNA of AML cell lines was prepared by the Qiagen Gentra Puregene Kit (Qiagen). Labelling, hybridization, and scanning of Cytoscan HD arrays were performed by the Genome Analytics Facility located at the Helmholtz Centre for Infection Research (Braunschweig, Germany), according to the manufacturer's protocols (Affymetrix, High Wycombe, UK). These arrays are based on single nucleotide polymorphisms (SNPs) and allow the determination of copy number states of most gene loci. Data were interpreted using the Chromosome Analysis Suite software version 3.1.0.15 (Affymetrix, High Wycombe, UK) and copy number alterations were determined accordingly.

#### *4.6. Reporter-Gene Assay*

For creation of reporter gene constructs, we combined a reporter with a regulatory genomic fragment derived from the upstream region of NKX2-4. PCR products of the corresponding genomic region (regulator) and of the HOXA9 gene, comprising exon1-intron1-exon2 (reporter), were cloned into respective HindIII/BamHI and EcoRI sites of expression vector pcDNA3 downstream of the CMV enhancer. Oligonucleotides used for the amplification of the regulator were obtained from Eurofins MWG, Ebersberg, Germany. Their sequences were as follows: NKX2-4-for 5′-GATG<u>AAGCTT</u>ATAGCCTGAAAACAGAG-3′, NKX2-4-rev 5′-AACACTTGCTG<u>GGATCC</u>TTCTG-3′. Introduced restriction sites used for cloning are underlined. Constructs were validated by sequence analysis (Eurofins MWG). Transfections of plasmid-DNA into NIH-3T3 cells were performed using SuperFect Transfection Reagent (Qiagen). Commercial HOXA9 and TBP assays were used for RQ-PCR to quantify the spliced reporter transcript, corresponding to the regulator activity (Thermo Fisher Scientific).
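The positions of the HindIII and BamHI recognition sequences within the two primers can be located with a short Python sketch. The primer sequences are those given above, and the recognition sequences (AAGCTT for HindIII, GGATCC for BamHI) are the standard ones; the function name `find_sites` and the data layout are hypothetical choices made for this illustration.

```python
# Standard recognition sequences of the two enzymes used for cloning
RESTRICTION_SITES = {"HindIII": "AAGCTT", "BamHI": "GGATCC"}

# Primer sequences as listed in the text (5' to 3')
PRIMERS = {
    "NKX2-4-for": "GATGAAGCTTATAGCCTGAAAACAGAG",
    "NKX2-4-rev": "AACACTTGCTGGGATCCTTCTG",
}


def find_sites(primer, sites=RESTRICTION_SITES):
    """Return {enzyme: 0-based start position} for each site found in the primer."""
    hits = {}
    for enzyme, motif in sites.items():
        position = primer.find(motif)
        if position != -1:
            hits[enzyme] = position
    return hits


for name, sequence in PRIMERS.items():
    print(name, find_sites(sequence))
# NKX2-4-for {'HindIII': 4}
# NKX2-4-rev {'BamHI': 11}
```

The forward primer thus carries the HindIII site and the reverse primer the BamHI site, matching the HindIII/BamHI cloning of the regulator fragment.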

**Supplementary Materials:** The following are available online at https://www.mdpi.com/article/10.3390/ijms222111434/s1.

**Author Contributions:** Conceptualization, S.N.; formal analysis, S.N., C.P., C.M. and R.A.F.M.; investigation, S.N.; writing—original draft preparation, S.N.; writing—review and editing, R.A.F.M. All authors have read and agreed to the published version of the manuscript.

**Funding:** This research received no external funding.

**Institutional Review Board Statement:** Not applicable.

**Informed Consent Statement:** Not applicable.

**Data Availability Statement:** Data is contained within the article or Supplementary Materials. The data presented in this study are available in [Table S3].

**Acknowledgments:** We thank Hans G. Drexler for critically reading the manuscript.

**Conflicts of Interest:** The authors declare no conflict of interest.

#### **References**


## *Article* **Transcriptional Regulation of Yin-Yang 1 Expression through the Hypoxia Inducible Factor-1 in Pediatric Acute Lymphoblastic Leukemia**

**Gabriela Antonio-Andres <sup>1</sup> , Gustavo U. Martinez-Ruiz <sup>2</sup> , Mario Morales-Martinez <sup>1</sup> , Elva Jiménez-Hernandez <sup>3</sup> , Estefany Martinez-Torres <sup>1</sup> , Tania V. Lopez-Perez 1,4, Laura A. Estrada-Abreo <sup>5</sup> , Genaro Patino-Lopez <sup>5</sup> , Sergio Juarez-Mendez <sup>6</sup> , Víctor M. Davila-Borja <sup>6</sup> and Sara Huerta-Yepez 1,\***


**Abstract:** Yin-Yang transcription factor 1 (YY1) is involved in tumor progression and metastasis and has been shown to be elevated in different cancers, including leukemia. The regulatory mechanism underlying YY1 expression in leukemia is still not understood. Bioinformatics analysis revealed three putative Hypoxia-inducible factor 1-alpha (HIF-1α) binding sites in the YY1 promoter region. The regulation of YY1 by HIF-1α in leukemia was analyzed. Mutation of the putative YY1 binding sites in a reporter system containing the HIF-1α promoter region and ChIP analysis confirmed that these sites are important for YY1 regulation. Leukemia cell lines showed that both proteins, HIF-1α and YY1, are co-expressed under hypoxia. In addition, the expression of YY1 mRNA was increased after 3 h of hypoxia and affected the expression of several target genes. In contrast, chemical inhibition of HIF-1α induced downregulation of YY1 and sensitized cells to chemotherapeutic drugs. The clinical implications of HIF-1α in the regulation of YY1 were investigated by evaluation of expression of HIF-1α and YY1 in 108 peripheral blood samples and by RT-PCR in 46 bone marrow samples of patients with pediatric acute lymphoblastic leukemia (ALL). We found that the expression of HIF-1α positively correlates with YY1 expression in those patients. This is consistent with bioinformatic analyses of several databases. Our findings demonstrate for the first time that YY1 can be transcriptionally regulated by HIF-1α, and a correlation between HIF-1α expression and YY1 was found in ALL clinical samples. Hence, HIF-1α and YY1 may be possible therapeutic targets and/or biomarkers of ALL.

**Keywords:** HIF-1α; YY1; ALL
