1. Introduction
Exogenous siRNAs are used routinely in many biotechnology and research applications to regulate endogenous gene expression levels. The most common application is post-transcriptional gene silencing (
PTGS) via the knockdown of mRNA expression levels [
1], which is also referred to as RNA interference (RNAi). Exogenous siRNA can also target nascent RNA transcripts in the nucleus, with the potential to cause near-permanent transcriptional silencing (
PTS) or upregulation (
PTU) of the associated gene through epigenetic modification of the DNA [
2,
3,
4,
5,
6,
7]. The mechanisms of siRNA-mediated PTGS and PTS/PTU are well documented and involve siRNA targeting within the cytosol and the nucleus, respectively [
1,
2,
3,
4,
5,
6,
7].
PTGS/RNAi represents the most widely documented siRNA approach for the transient reduction (knockdown) of target mRNA levels in the cytosol [
1]. To achieve PTGS, an exogenous siRNA is designed to be complementary to a mature mRNA (exon-derived) sequence. In contrast, for PTS/PTU, the siRNAs are designed to be complementary to the immediate gene promoter or the 3′ flanking region of the target gene, respectively [
2,
3,
4,
5,
6,
7]. During PTS/PTU, the exogenous siRNA enters the cytosol before relocating to the nucleus, where it binds the nascent RNA transcript derived from the target gene, causing epigenetic modification of the DNA of that gene [
5,
8]. When successful, the resulting epigenetic modifications of the DNA in PTS/PTU can cause near-permanent changes to the transcription of the target gene [
8]. From these applications, it is clear that siRNAs target RNA transcripts within both the cytosol and the nucleus, and that the region of the RNA targeted determines the nature of the interference. We therefore asked whether an siRNA designed for PTGS of a target mRNA in the cytosol could also target and interfere with the transcription of the nascent RNA precursor of that mRNA in the nucleus, and whether it would be possible to differentiate between these two distinct on-target interference effects.
To differentiate between siRNA targeting effects on the mRNA and its nascent precursor, we chose to target one gene that overlaps another gene, subject to two requirements. Firstly, the transcription of the two overlapping genes must converge from opposite directions on the DNA so as to initiate RNA polymerase collision-mediated transcriptional interference [
9,
10,
11]. Secondly, the target gene must be expressed at higher levels than the gene it overlaps [
11]. The rationale for this approach was that the more highly expressed of the two convergently transcribed genes has been shown to repress the transcription of the lower expressed gene [
11] presumably through increased incidence of RNA polymerase collision events [
10]. In this way, if the siRNA were to alter the transcription of the target gene (after binding its nascent RNA transcript) as reported elsewhere [
11], this could in turn affect the transcription and expression of the overlapping gene [
11], which could then be monitored using comparative rtPCR [
11].
The Model: In this study, we discovered and were the first to characterize the long noncoding gene which we named
TOSPEAK. We found that
TOSPEAK was disrupted in a family with SYNS4, presenting with multiple joint fusions, malformation of laryngeal cartilages and severe speech impairment.
TOSPEAK/C8orf37AS1 was found to overlap a nested long-range enhancer ECR5 that regulates expression of the adjacent bone morphogenetic protein gene
growth differentiation factor 6 (
GDF6) [
12]. This was of great interest as
GDF6 expression was previously found to be reduced in affected members of the SYNS4 family [
13], raising the question of whether the SYNS4 skeletal phenotype was a function of the
GDF6 or
TOSPEAK genotype. Moreover, there was another interesting feature of
TOSPEAK that warranted further investigation, namely that
TOSPEAK physically overlaps another more highly expressed gene, which we named
SMALLTALK/C8orf37. To examine the expression of this overlapping gene set, we used siRNA to target
SMALLTALK in primary fibroblast cell cultures. As expected, the siRNA targeting of
SMALLTALK resulted in a prolonged PTGS/RNAi-mediated decrease in the mRNA level of
SMALLTALK. In addition, we detected a transient increase in the mRNA levels of both
TOSPEAK and
GDF6. Together, our findings indicate a role for both
TOSPEAK and
GDF6 in the joint, bone and cartilage malformations of the SYNS4 family [
13]. Given that over 20% of human protein coding genes physically overlap [
14] and that many others share promoters or, like
GDF6, have regulatory elements nested within adjacent genes, the findings of this study have important implications for genotype–phenotype correlation studies, for gene targeting strategies and for the design of gene therapies, including those that use exogenous siRNA.
2. Materials and Methods
RNA isolation and cDNA synthesis: Total RNA was extracted from tissues (fresh skin biopsies and primary fibroblast cell lines derived therefrom, and from fresh white blood cells) using Trizol following the manufacturer’s protocol (ThermoFisher Scientific Australia, Sydney, Australia). RNA was treated with DNase I (NEB Biolabs, Ipswich, Australia), ethanol precipitated, resuspended in DEPC-treated water and quality tested using spectrophotometry (A260/280 ratio was ~1.7) and gel electrophoresis. Then, 1 μg of the extracted total RNA was reverse transcribed using 250 ng of random hexamers (Promega Pty Ltd., Sydney, Australia) in a standard 20 μL reaction including 4 μL of first strand buffer (Invitrogen Pty Ltd.), 2 μL of 0.1 M DTT, 1 μL of 10 mM dNTP, 1 μL RNase inhibitor (2500 U) and 1 μL of reverse transcriptase (10,000 U) (Invitrogen Pty Ltd.). After annealing of the hexamers for 10 min at 72 °C, cDNA synthesis was performed at 42 °C for 90 min followed by an enzyme inactivation step at 70 °C for 15 min. All cDNA products were diluted in a ratio of 1:10 and stored at −20 °C before use.
TOSPEAK transcription start site and termination site: 1 μg of total RNA was reverse transcribed using 1 μL of reverse transcriptase (10,000 U) (ThermoFisher Scientific Australia, Sydney, Australia), where each reaction was primed with 1 μL of 12 mM 5′-CDS primer A (5′-(T)25VN-3′) (ThermoFisher Scientific Australia, Sydney, Australia) and 1 μL of 12 mM SMART II A oligo (5′-AAGCAGTGGTATCAACGCAGAGTACGCGGG-3′) (Clontech Pty Ltd.).
After annealing of hexamers for 10 min at 70 °C, cDNA synthesis was performed using Superscript II (Invitrogen Pty Ltd.) at 42 °C for 90 min followed by enzyme inactivation at 72 °C for 7 min with addition of 100 μL of Tricine-EDTA buffer. The 5′RACE clones were amplified with a reverse primer from
TOSPEAK exon 6 (
Table 1) using 10× Universal Primer A mix as per the manufacturer’s protocol (ThermoFisher Scientific Australia, Sydney, Australia). PCR was performed using the following conditions: 5 cycles at 94 °C for 30 s and 72 °C for 2 min, 5 cycles at 94 °C for 30 s, 70 °C for 30 s and 72 °C for 2 min, and 30 cycles at 94 °C for 30 s, 68 °C for 30 s and 72 °C for 2 min. 3′-RACE libraries were generated from RNA with Superscript III (Invitrogen Pty Ltd.: Cat No. 18080-093) using primers and protocols described in the SMART RACE User Manual (Becton Dickinson, Sydney, Australia). The 3′RACE clones were amplified with a forward primer from
TOSPEAK exon 9 (
Table 1). RACE PCR products were excised from gels and cloned into the pGEMT vector (Promega, Sydney, Australia) and sequenced using Big Dye chemistries (Australian Genome Research Facility, Brisbane, Australia).
RT-PCR characterisation of
TOSPEAK transcripts: RT-PCR reactions contained 5 μL of the diluted cDNA template, 2.5 μL of 10× PCR buffer, 0.2 μL of 25 mM dNTPs, 1 μL of each of the forward and reverse primer stocks (10 mM) (Table 1), 1.5 μL of 25 mM MgCl2 and 0.25 μL of AmpliTaq Gold polymerase (Applied Biosystems, Sydney, Australia) made up to 25 μL with ddH2O and amplified using an initial denaturation at 94 °C for 10 min followed by 40 cycles at 94 °C for 30 s, 58 °C for 30 s and 72 °C for 40 s and a final extension of 72 °C for 15 min.
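The volume of water needed to bring this reaction to 25 μL follows from the listed component volumes. The short Python sketch below reproduces that arithmetic and scales the shared components into a master mix. The function names, the 10% pipetting overage and the choice to add template per tube are illustrative assumptions and are not taken from the protocol.

```python
# Per-reaction volumes in microlitres, copied from the RT-PCR recipe above.
RT_PCR_COMPONENTS_UL = {
    "diluted cDNA template": 5.0,
    "10x PCR buffer": 2.5,
    "25 mM dNTPs": 0.2,
    "forward primer": 1.0,
    "reverse primer": 1.0,
    "25 mM MgCl2": 1.5,
    "AmpliTaq Gold polymerase": 0.25,
}
FINAL_VOLUME_UL = 25.0


def water_to_volume(components_ul, final_volume_ul):
    """Return the water volume (uL) needed to reach the final reaction volume."""
    used = sum(components_ul.values())
    if used > final_volume_ul:
        raise ValueError("component volumes exceed the final reaction volume")
    return round(final_volume_ul - used, 2)


def scale_master_mix(components_ul, n_reactions, overage=0.10,
                     exclude=("diluted cDNA template",)):
    """Scale shared components for n reactions.

    The overage fraction and the per-tube addition of template are
    hypothetical conveniences, not part of the published method.
    """
    factor = n_reactions * (1.0 + overage)
    return {
        name: round(volume * factor, 2)
        for name, volume in components_ul.items()
        if name not in exclude
    }


water_ul = water_to_volume(RT_PCR_COMPONENTS_UL, FINAL_VOLUME_UL)
print(f"Water per reaction: {water_ul} uL")
print(scale_master_mix(RT_PCR_COMPONENTS_UL, n_reactions=12))
```

Under these assumptions, the listed components total 11.45 μL, leaving 13.55 μL of water per reaction.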
Antibody Screen: An affinity purified polyclonal antibody was raised in rabbit against a synthetic peptide CESFLRKSVALPGEVIKSLLA (Monash University Melbourne, Australia) that we generated based on part of a putative ORF from the most abundant human TOSPEAK transcript (GenBank accession number GU295154). This antibody was used to screen human lymphocytes using Western analysis (Z.F. & R.A.C.) and paraffin embedded human heart tissue using immunohistochemistry (St George Hospital Clinical Pathology Laboratory, Sydney, Australia). Note: This antibody was affinity purified against the original synthetic peptide antigen; however, no positive control tissue sample was available to test the functional validity of this antibody in Western analysis or immunohistochemistry, and this antibody displayed negative staining in all tissues interrogated (results not shown).
Comparative genomic analyses: Nucleotide sequences from a ~900 kb region of the genome spanning the
TOSPEAK gene locus, were extracted from the Ensembl and NCBI GenBank databases (
http://www.ensembl.org/ (accessed on 21 August 2009)), version 41.36c;
http://www.ncbi.nlm.nih.gov/, Build 37.1 (accessed on 21 August 2009), for human, chimpanzee, dog, mouse, and opossum and analysed for any evolutionary conservation using VISTA (
http://genome.lbl.gov/vista/ (accessed on 21 August 2009)) [
15] with the human as the reference sequence.
Cell lines: Fresh skin biopsies were used to generate primary fibroblast cell lines cultured at 37 °C in DMEM media with 10% fetal calf serum (FCS). A commercial skin fibroblast cell line was used as independent control NC1 (NHDF-c adult normal human dermal fibroblast cell line PromoCell-Bio Connect C-12302). To harvest cells for Western analysis, washed cells were disrupted by adding 100 μL of lysis buffer (300 mM NaCl, 1 mM EDTA, 30 mM Tris/HCl, 1 μL proteinase inhibitor) and stored on ice for 30 min. For rtPCR, cells were centrifuged at 1000 rpm for 5 min and resuspended and washed in PBS before RNA isolation.
siRNA-mediated transient transcriptional interference (TTI) protocol: At sub-confluence, fibroblast cultures were washed with cold PBS and harvested using 0.25% trypsin in PBS at 37 °C for 1–5 min. Trypsin was then deactivated by suspension of cells in DMEM containing 10% FCS. Cells were centrifuged at 1000 rpm for 7 min and resuspended in DMEM without FCS before replating in preparation for treatment with siRNAs. Cells were plated into 6-well culture plates at 10^5 cells/well in 2 mL and incubated for 17–24 h prior to treatment with siRNA.
Working stocks of STEALTH siRNAs (
Table 2; Life Sciences Corp) were prepared at a 1/200 dilution (2 μM) in nuclease-free H2O and stored at −80 °C. Then, 2.5 μL of siRNA working stock was added to 95 μL serum-free media. At the same time, 5 μL Lipofectamine 2000 (Invitrogen Pty Ltd.) was added to 95 μL serum-free media and gently mixed. The siRNA mix and Lipofectamine mix were then combined and incubated for 20 min, then transferred to ‘treatment’ wells containing fibroblasts (1 mL FCS-free media + 2.5 μL siRNA (5 nM) + 5 μL Lipofectamine Reagent 2000) and incubated at 37 °C for 6 h before the addition of 1 mL of DMEM containing 20% FCS. After 18 h, the cell culture media was replaced with fresh DMEM containing 10% FCS (referred to as time zero). Cells were then incubated for 24 h (Time 1), 48 h (Time 2) or 72 h (Time 3) before harvesting for RNA extraction and real-time rtPCR analysis. At harvest, the cell culture media was removed and adherent cells were washed twice with PBS at room temperature before adding 350 μL of cell lysis buffer (2.4 mL Buffer RLT and 20 μL of β-mercaptoethanol) to each well (as per RNeasy Mini Kit (50) #74104, Qiagen, Melbourne, Australia). All siRNA experiments were performed in triplicate.
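The stated 5 nM siRNA concentration and the later 10% FCS condition can both be checked with the dilution relation C1·V1 = C2·V2. The sketch below does this in Python, assuming a nominal well volume of 1 mL. The variable names are illustrative, and the volume contributed by the siRNA and Lipofectamine mixes is deliberately ignored because the text does not say whether it is counted within the 1 mL.

```python
def diluted_concentration(stock_conc, stock_volume, final_volume):
    """Apply C2 = C1 * V1 / V2; the result has the same units as stock_conc."""
    if final_volume <= 0:
        raise ValueError("final volume must be positive")
    return stock_conc * stock_volume / final_volume


STOCK_UM = 2.0    # siRNA working stock concentration (uM), from the text
SIRNA_UL = 2.5    # volume of working stock added per well (uL), from the text
WELL_UL = 1000.0  # nominal treatment volume (uL); an assumption, see lead-in
TOP_UP_UL = 1000.0  # DMEM with 20% FCS added after 6 h (uL), from the text

# siRNA concentration during the 6 h treatment, converted from uM to nM.
sirna_nM = diluted_concentration(STOCK_UM, SIRNA_UL, WELL_UL) * 1000.0

# FCS percentage after topping up FCS-free media with the 20% FCS medium.
fcs_percent = diluted_concentration(20.0, TOP_UP_UL, WELL_UL + TOP_UP_UL)

# siRNA concentration after the top-up doubles the well volume.
sirna_after_nM = sirna_nM * WELL_UL / (WELL_UL + TOP_UP_UL)

print(f"siRNA during treatment: {sirna_nM:.2f} nM")
print(f"FCS after top-up: {fcs_percent:.1f} %")
print(f"siRNA after top-up: {sirna_after_nM:.2f} nM")
```

If the roughly 200 μL of combined siRNA and Lipofectamine mix is counted as additional volume rather than part of the 1 mL, the treatment concentration would be somewhat below 5 nM.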
TOSPEAK was not amenable to siRNA knockdown in our hands (results not shown). Note: This may have been due to the newly evolved
TOSPEAK gene being very poorly conserved in sequence and structure between species with only 2 permanently transcribed exons (exons 1 and 9, respectively) that are very short and highly enriched for GC dinucleotides and repetitive elements (see Results below). However, siRNA targeting of the coding gene
SMALLTALK was successful (using siRNA-S1, -S2 and -S3—see
Table 2) and achieved peak silencing of
SMALLTALK within ~72 h.
Comparative RT-PCR: First-Strand cDNA Synthesis was performed using the SuperScript™ III First-Strand synthesis qRT-PCR Kit (ThermoFisher Scientific Australia, Sydney, Australia) according to the manufacturer’s instructions: 10 μL of 2X RT Reaction Mix, 2 μL RT Enzyme Mix and 50 pg of RNA were made up to 20 μL with DEPC-treated water and incubated at 25 °C for 10 min and again at 42 °C for 50 min. Reactions were terminated at 85 °C for 5 min, then chilled on ice for 5 min followed by a short spin in the microfuge. Then, 1 μL (2 U) of E. coli RNase H was added and incubated at 37 °C for 20 min. A qPCR master mix was prepared with all common components. Volumes for a single 25 μL reaction were 12.5 μL of Platinum® SYBR® Green qPCR SuperMix-UDG (ThermoFisher Scientific Australia, Sydney, Australia), 1 μL each of 10 μM primer stocks specific for gene of interest (
Table 1), 2.5 μL of cDNA and DEPC-treated water to 25 μL. Reactions were incubated at 50 °C for 2 min and an initial denaturation step of 94 °C for 2 min. qPCR was performed for 40 cycles: denature at 94 °C for 15 s, anneal at 55 °C for 10 s, extension at 72 °C for 20 s. Comparative rtPCR profiles were independently normalised against expression of
GAPDH and
18S rRNA to remove non-biological variation. Each experimental triplicate was evaluated in triplicate (technical triplicates) and expressed as the mean. Patient rtPCR data were expressed as the mean of 2 patients.
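The text does not state which formula was used for normalisation. One common choice for comparative rtPCR is the 2^(−ΔΔCt) method, which assumes close to 100% amplification efficiency for both target and reference. The sketch below illustrates that method only as an assumption; the function names and all Ct values are hypothetical and are not data from this study.

```python
from statistics import mean


def delta_ct(target_cts, reference_cts):
    """Mean target Ct minus mean reference Ct across technical replicates."""
    return mean(target_cts) - mean(reference_cts)


def relative_expression(treated_target, treated_ref, control_target, control_ref):
    """Fold change of treated versus control by the 2^(-ddCt) method.

    Assumes roughly 100% PCR efficiency for target and reference assays.
    """
    ddct = (delta_ct(treated_target, treated_ref)
            - delta_ct(control_target, control_ref))
    return 2.0 ** (-ddct)


# Hypothetical technical triplicates (Ct values) for one experimental replicate.
treated = {"target": [25.1, 25.0, 24.9], "GAPDH": [18.0, 18.1, 17.9], "18S": [9.0, 9.1, 8.9]}
control = {"target": [24.0, 24.1, 23.9], "GAPDH": [18.0, 17.9, 18.1], "18S": [9.0, 8.9, 9.1]}

# Normalise independently against each reference gene, as described in the text.
for reference in ("GAPDH", "18S"):
    fold = relative_expression(
        treated["target"], treated[reference],
        control["target"], control[reference],
    )
    print(f"Fold change normalised to {reference}: {fold:.2f}")
```

Normalising against two references separately, as here, gives a simple check that an apparent change in the target is not driven by drift in a single reference gene.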
4. Discussion
PTGS/RNAi reduces the level of target mRNAs in the cytosol. To achieve PTGS, the siRNA is designed to be complementary to a target site within the target mRNA, even though an identical target site also exists within the immature and/or intermediate nascent RNA transcript precursor of that mRNA within the nucleus. Because a single siRNA could therefore target both the mRNA and its precursor nascent transcript, the following questions arise. Firstly, to what extent, if any, does an siRNA designed for PTGS (that is, to target and reduce the level of a specific mRNA) also target the identical site within the nascent RNA transcript precursor of that mRNA? Secondly, to what extent, if any, does this affect the transcription of that gene? Thirdly, given that any such reduction in transcription will also reduce the level of the target mRNA, is it possible to distinguish any such effect on transcription from the PTGS/RNAi effect on the mRNA?
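The dual-targeting premise can be illustrated with a toy example: any exon-derived target site present in the mature mRNA is, by construction, also present in the unspliced precursor, whereas an intronic site is present only in the precursor. The Python sketch below uses invented sequences and hypothetical helper names. It models perfect complementarity only and does not represent siRNA design rules, strand selection or the actual SMALLTALK sequence.

```python
COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def reverse_complement_rna(seq):
    """Return the reverse complement of an RNA sequence."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def splice(pre_mrna, exons):
    """Join exon intervals (0-based, end-exclusive) into a mature mRNA."""
    return "".join(pre_mrna[start:end] for start, end in exons)


def guide_binds(guide, transcript):
    """True if the transcript contains a site perfectly complementary to the guide."""
    return reverse_complement_rna(guide) in transcript


# Invented toy transcript: exon 1, one intron, exon 2 (20 nt each).
EXON1 = "AUGGCUAGCUAGGCUAACGU"
INTRON = "GUAAGUCCCUUUCCCUUUAG"
EXON2 = "GGAUCCGAUUACGCUAGCAA"
PRE_MRNA = EXON1 + INTRON + EXON2
MRNA = splice(PRE_MRNA, [(0, 20), (40, 60)])

# A 19 nt guide against an exon 2 site, and one against an intronic site.
exon_guide = reverse_complement_rna(EXON2[1:20])
intron_guide = reverse_complement_rna(INTRON[0:19])

for name, guide in (("exon guide", exon_guide), ("intron guide", intron_guide)):
    print(name, "binds mRNA:", guide_binds(guide, MRNA),
          "| binds pre-mRNA:", guide_binds(guide, PRE_MRNA))
```

In this toy model the exon-directed guide matches both transcripts, which is the situation the three questions above address, while the intron-directed guide matches only the precursor.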
In this study, three distinct siRNAs were used to target
SMALLTALK. For each of the three siRNA, there were three distinct responses. The first response was the knockdown of
SMALLTALK levels, which peaked near 72 h, as is typical of PTGS/RNAi in the cytosol. The second and third responses were the early transient increases in
TOSPEAK and
GDF6 levels, respectively, both of which peaked simultaneously at ~24 h before declining thereafter. The second and third responses were therefore independent of the PTGS of
SMALLTALK which continued unabated for another 48 h. As such, the increase in
TOSPEAK is best explained by its convergent transcriptional overlap with the
SMALLTALK gene. Convergent transcription between overlapping genes results in RNA polymerase collision events [
10,
11] that can cause discordant transcriptional interference [
9,
10,
11]. Other examples of transcriptional interference between convergently transcribed overlapping genes include the
DLX1,
DLX5 and
DLX6 genes which all experience transcriptional interference from overlapping non-coding antisense genes [
9]. However, for such a mechanism to be in play in our experiments would require the prior interference of
SMALLTALK transcription by the siRNA-S1-3. Support for this scenario came from the discordant increase in the level of
TOSPEAK as
SMALLTALK levels decreased during the first 24 h of the assay. Moreover, a similar sequence of events was observed for the convergently transcribed, overlapping
LRRTM3 and
CTNNA3 genes associated with autism [
11]. In that study, five different siRNAs were used to target the more highly expressed
CTNNA3 gene which in all five cases resulted in discordant transient interference (increase) of
LRRTM3 transcription [
11]. Together, these results are consistent with the siRNA mediated interference (reduction) of
SMALLTALK transcription causing reduced transcriptional repression (and increase) of
TOSPEAK [
5,
8].
Further support for the transcriptional interference of
TOSPEAK by
SMALLTALK comes from the third response which was the concordant, proportional and transient increase in
GDF6.
TOSPEAK is transcribed across the highly conserved ECR5 long-range enhancer of
GDF6 and both
TOSPEAK and
GDF6 levels increased in response to the siRNA-S targeting of
SMALLTALK. Within the first 24 h of the siRNA-S assay, there was a synchronous, proportional and transient induction of both
TOSPEAK and
GDF6 levels followed by a concordant and proportional decline of both
TOSPEAK and
GDF6 levels within the following 24 h, well before the peak reduction in
SMALLTALK levels at 72 h (
Figure 7 and
Figure 8). This strongly suggested that
TOSPEAK transcription, not the
TOSPEAK transcript, is a positive regulator of
GDF6 transcription. This interpretation of the results is also consistent with the phenotypic findings from the speech-affected SYNS4 family, where the breakpoint in the
TOSPEAK gene, which blocked transcription across ECR5, was associated with reductions in both
TOSPEAK and
GDF6 expression levels (
Figure 4) and the aberrant ossification of the very same joints, ligaments and cartilages regulated by the
GDF6 enhancer [
13,
19,
20].
In summary, these findings suggest that:
siRNA-S targeting of SMALLTALK interferes with SMALLTALK transcription.
SMALLTALK transcription converges on and represses TOSPEAK transcription.
TOSPEAK transcription across the GDF6 enhancer enhances GDF6 transcription.
A number of possible mechanisms have been suggested for the modulation of highly conserved enhancers such as ECR5 by the transcription of non-coding genes like
TOSPEAK [
9]. Included is the possibility that
TOSPEAK transcription across ECR5 may enhance
GDF6 transcription through the secondment of the
GDF6 enhancer into active transcription factories, thereby facilitating
GDF6 promoter–enhancer coupling and increased transcription of
GDF6 [
9]. This scenario may involve modulation of chromatin structure around the enhancer [
9]. A similar scenario has been demonstrated for transcription-enhancer overlaps elsewhere [
21]. Furthermore, genome-wide 4C and Hi-C interaction data mark the locus spanning
SMALLTALK,
TOSPEAK and
GDF6 as a functionally interactive chromatin domain (
Figure 8) consistent with the three overlapping genes functioning as a set of overlapping interacting transcription units [
10,
11,
21].
It has been shown elsewhere that not all overlapping genes or gene sets are implicated in this form of transcriptional interference [
11]. It is also uncertain to what degree this phenomenon is limited by cell or tissue specificity [
9,
11]. Finally, additional studies are required to understand the mechanism by which siRNAs designed for PTGS/RNAi interfere with the transcription of the parent gene, which in this case was
SMALLTALK [
2,
3,
4,
5,
6,
7].
5. Conclusions
We discovered and were the first to characterize the long noncoding gene which we named TOSPEAK and its physical overlap with both the SMALLTALK gene and the ECR5 long-range enhancer for GDF6. Furthermore, we established that TOSPEAK was disrupted in the SYNS4 family with reduction of both TOSPEAK and GDF6 levels. We used this overlapping set of genes to demonstrate siRNA mediated on-target transcriptional interference and its relevance for gene function and genotype–phenotype association analysis and gene therapy as evidenced by the restoration of GDF6 levels in cells from the SYNS4 family.
The limitations of this study included the use of one specific cell type in culture conditions different to those in vivo. Further studies are required to understand the mechanism by which siRNAs designed for RNAi cause transient transcriptional interference of their target nascent transcript [
2,
3,
4,
5,
6,
7]. Uncertainty also remains regarding the mechanism by which
TOSPEAK could induce the transcription of
GDF6. Highly conserved lncRNA transcripts often have important regulatory functions [
9]; however, the
TOSPEAK transcript is not conserved between species. Moreover,
TOSPEAK gives rise to numerous very short transcripts with no translated protein structure and a very high incidence of stop signals, making the mature transcript(s) of
TOSPEAK a most unlikely regulator of
GDF6. Nevertheless, the nascent transcript of
TOSPEAK does include the sequence of the highly conserved ECR5 enhancer of
GDF6 and as such could feasibly have a trans role in enhancing
GDF6 transcription [
9]. Even so, the most plausible interpretation of the results is that
TOSPEAK transcription, not the
TOSPEAK transcript, positively regulates
GDF6 transcription.