Article

Single-Cell Atlas of the Drosophila Leg Disc Identifies a Long Non-Coding RNA in Late Development

1 School of Life Sciences, The Chinese University of Hong Kong, Hong Kong
2 State Key Lab of Agrobiotechnology, The Chinese University of Hong Kong, Hong Kong
* Author to whom correspondence should be addressed.
Int. J. Mol. Sci. 2022, 23(12), 6796; https://doi.org/10.3390/ijms23126796
Submission received: 26 May 2022 / Revised: 14 June 2022 / Accepted: 15 June 2022 / Published: 18 June 2022
(This article belongs to the Collection Regulation by Non-coding RNAs)

Abstract

The Drosophila imaginal disc has been an excellent model for the study of developmental gene regulation. In particular, long non-coding RNAs (lncRNAs) have gained widespread attention in recent years due to their important role in gene regulation. Their specific spatiotemporal expressions further support their role in developmental processes and diseases. In this study, we explored the role of a novel lncRNA in Drosophila leg development by dissecting and dissociating w1118 third-instar larval third leg (L3) discs into single cells and single nuclei, and performing single-cell RNA-sequencing (scRNA-seq) and single-cell assays for transposase-accessible chromatin (scATAC-seq). Single-cell transcriptomics analysis of the L3 discs across three developmental timepoints revealed different cell types and identified lncRNA:CR33938 as a distal specific gene with high expression in late development. This was further validated by fluorescence in-situ hybridization (FISH). The scATAC-seq results reproduced the single-cell transcriptomics landscape and elucidated the distal cell functions at different timepoints. Furthermore, overexpression of lncRNA:CR33938 in the S2 cell line increased the expression of leg development genes, further elucidating its potential role in development.

1. Introduction

Long non-coding RNAs (lncRNAs) are defined as RNAs that are longer than 200 nucleotides and are not translated into functional proteins. Human GENCODE (v40) identifies 17,748 lncRNA genes, a number roughly comparable to that of protein-coding genes (19,988), which underscores the importance of lncRNAs. The majority of lncRNAs are transcribed by RNA polymerase II and are often 5′-end 7-methyl guanosine (m7G) capped, 3′-end polyadenylated, and spliced similarly to mRNAs. They are often classified based on their position relative to neighboring genes (divergent, convergent, intergenic, antisense, sense, enhancer, intronic, and miRNA host), transcript length (long intergenic, very long intergenic, and macroRNA), association with annotated protein-coding genes, association with other DNA elements, protein-coding RNA resemblance, association with repeats, association with a biochemical pathway, sequence and structure conservation, biological state, association with subcellular structures, and function [1,2]. Because lncRNAs fine-tune gene expression at the epigenetic, transcriptional, and post-transcriptional levels, they are implicated in various biological processes and diseases. Studies of lncRNAs in organ development across several mammalian species have revealed a transition from broadly expressed lncRNAs towards an increasing number of spatiotemporal-specific and condition-specific lncRNAs [3]. The role of lncRNAs in cancer has been studied extensively, but they are also involved in many other human diseases, from neurological disorders to cardiovascular issues [4]. Notably, lncRNA expression is generally spatiotemporally specific, which points to unique functions and makes lncRNAs plausible pharmacological targets.
Drosophila melanogaster (fruit fly) is an ideal model organism to study developmental and cellular processes in higher eukaryotes, including humans, because a wide range of genetic tools can be applied and its genome has been extensively studied [5]. In fact, the D. melanogaster genome is 60% homologous to that of humans and nearly 75% of human disease-causing genes are believed to have functional homologs in the fruit fly [6]. Furthermore, its short generation time, high fecundity, and low maintenance as well as the abundance of publicly available fly stocks and databases also make D. melanogaster an appealing model organism.
Despite different taxonomic origins, the Drosophila larval leg disc, which develops into the adult leg, is an ideal model for studying the complex vertebrate limb because it is relatively simple and amenable to genetic manipulations. Research on fly imaginal discs has revealed the tissue compartments and organ-specific regulator genes critical to development, and has generated established models for the study of cellular interactions and complex genetic pathways [7]. Moreover, the easy accessibility of imaginal discs further supports their utility.
Advances over the past decade in single-cell RNA sequencing (scRNA-seq) and related computational analysis pipelines have allowed scientists and bioinformaticians to understand the cellular heterogeneity of tissues at an unprecedented level, as methods progressed from manually selecting a single cell under the microscope to plate-based and droplet-based high-throughput methods with multimodal capabilities [8]. Since the publication of the first single-cell transcriptome study based on a next-generation sequencing platform, the number of publications on scRNA-seq addressing development, disease, and bioinformatics tool improvement has grown exponentially [9,10,11,12]. Many of these single-cell studies have focused on developmental biology, as development represents a crucial period during which cells first begin to differentiate [13]. Single-cell transcriptomics studies on Drosophila larval imaginal wing and eye-antennae discs have emerged since 2018 [14,15,16,17,18,19] and have shown that single cells can be mapped to the distinct subregions of their respective imaginal discs, thus confirming the spatial expression of genes determined by previous immunostaining methods.
While the Fly Cell Atlas recently performed single-nucleus RNA-sequencing (snRNA-seq) on adult Drosophila legs [20], single-cell transcriptomics and epigenomics studies on the developing leg imaginal disc remain lacking because the leg disc is more challenging to dissect than the larger wing and eye-antennae discs. We thus report the single-cell transcriptomic and epigenomic landscapes of w1118 third leg (L3) discs across three time points of third-instar larval development. We identified and validated a novel, highly expressed lncRNA in the distal epithelial cells whose spatial expression changes across developmental stages, and we provide evidence supporting its importance in leg development.

2. Results

2.1. Generation of a Transcriptomic Cell Atlas of the Developing Leg Imaginal Disc

2.1.1. Single-Cell RNA-Sequencing Identifies Four Main Cell Types in L3 Discs

To study the cellular heterogeneity of developing L3 discs, L3 discs were dissected from larvae at 121 h (T1), 133 h (T2), and 168 h (T3) after egg laying (AEL) (Figure 1A) for scRNA-seq. Sequencing statistics showed similar data quality amongst the three samples, including the percentage of mapped reads, percentage of mapped reads aligned to genes, number of cells, and mean reads per cell (Table S1). The L3 disc was identified as a trio of discs on either side of the larval body that differed from the wing and haltere discs in morphology and patterning (Figure 1B). The cell preparation workflow involved dissection and dissociation of L3 discs into single cells, after which a portion of the cells was used for scRNA-seq and the remaining cells had their nuclei isolated for the single-cell assay for transposase-accessible chromatin (scATAC-seq) (Figure 1C). Both assays used the 10× Genomics platform, and the prepared libraries were subjected to sequencing and subsequent data analysis. The integrated dataset overlaid the T1, T2, and T3 individual samples and identified four distinct clusters (Figure 1D). The largest cluster represented the leg disc epithelium, which expressed epithelial markers Fasciclin 3 (Fas3) and narrow (nw) (Figure 1E) [21]. Expression of Sp1 and Ultrabithorax (Ubx) confirmed that the cells originated from L3 discs [22,23]. The second-largest cluster represented muscle cells, which expressed the muscle markers twist (twi), Holes in muscle (Him), Secreted protein, acidic, cysteine-rich (SPARC), tenectin (tnc), cut (ct), Amalgam (Ama), and terribly reduced optic lobes (trol) [17,24,25]. The identity of the immune cell cluster was determined by the expression of regucalcin, Hemolectin (Hml), Peroxidasin (Pxn), Transferrin 1 (Tsf1), and reversed polarity (repo) [19,26,27]. The smallest cluster represented the neuronal cells, which expressed found in neurons (fne) and couch potato (cpo) [15,28].
The relative expression levels of the marker genes were tabulated (Table A1).

2.1.2. Subclustering of the Main Cell Types Reveals Cell Subtypes

The muscle cell cluster was composed of early and late muscle cell subclusters (Figure 1F). The early cells expressed tenectin (tnc), terribly reduced optic lobes (trol), cut (ct), maternal gene required for meiosis (mamo), Thor, kin of irre (kirre), roughest (rst), and rolling pebbles (rols) (Figure 1G) [17,19,24,25,29]. The late cells expressed Holes in muscle (Him), twist (twi), Myocyte enhancer factor 2 (Mef2), muscleblind (mbl), and Fasciclin 2 (Fas2) [19,30]. Early muscle cells increased expression of late muscle cell marker Fas2 over time in terms of both expression level and the number of cells that expressed this gene (Figure S1). Late muscle cell marker Mef2, a skeletal muscle differentiation transcription factor, similarly increased expression in the late muscle cell subcluster over time in terms of both expression level and the number of cells that expressed the gene. The heatmap of the most upregulated genes in the early and late muscle cells showed a distinction in upregulated genes between the two subclusters (Figure S2).
The neuronal cell cluster was also composed of early and late neuronal cell subclusters (Figure 1H). The early cells expressed miranda (mira), LIM homeobox 1 (Lim1), and empty spiracles (ems) (Figure 1I) [31,32,33]. The late cells expressed bruchpilot (brp), neuronal Synaptobrevin (nSyb), embryonic lethal abnormal vision (elav), Synaptotagmin 1 (Syt1), Cadherin-N (CadN), nervana 3 (nrv3), Glutamic acid decarboxylase 1 (Gad1), knot (kn), vesicular glutamate transporter (VGlut), and tailup (tup) [34,35,36,37]. The heatmap of the most upregulated genes in the early and late neuronal cells showed a clear distinction between the two subclusters (Figure S3).
The immune cell cluster was composed of glia and hemocytes (including plasmatocytes), which are the phagocytes found in invertebrates (Figure 1J). Glial cells expressed Transferrin 1 (Tsf1), reversed polarity (repo), and moody [27,38], while the hemocytes and plasmatocytes expressed regucalcin, Peroxidasin (Pxn), and Hemolectin (Hml) (Figure 1K) [19,26]. These markers were highly specific to their respective cell subtypes and the heatmap of the most upregulated genes in the glia and hemocytes (including plasmatocytes) showed a clear distinction between the two subclusters (Figure S4).
The leg disc epithelium cluster was subclustered into six cell subtypes, including the distal, medial, and proximal cells as well as stem cell-like cells, such as those of the proximal-distal-axis (PD axis) and anterior-posterior-axis (AP axis), and cells of undetermined fate (Figure 2A). Identification of fate undetermined cells was based on their top upregulated DEGs and gene ontology analysis of the DEGs (Table A2 and Table A3). The distal cells expressed the markers aristaless (al), C15, and Distal-less (Dll) (Figure 2B) [22]. The medial cells expressed dachshund (dac) and Dll, while the proximal cells expressed teashirt (tsh) and homothorax (hth) [22]. The PD axis cells expressed vestigial (vg), spalt-related (salr), and spalt major (salm) [39], and the AP axis cells expressed hase und igel (hui) (FlyBase ID FBgn0033968).

2.2. Identification and Characterization of a Novel Long Non-Coding RNA

2.2.1. Identification of a Long Non-Coding RNA of Unknown Function in Distal Cells

The most upregulated genes in each leg disc epithelium subcluster are shown in a heatmap (Figure 2C). The genes colored black represent known markers for their respective subclusters, those colored blue represent genes with known functions as potential markers for their respective subclusters, and the genes colored red represent genes with unknown functions as potential markers for their respective subclusters. Among these, the lncRNA lncRNA:CR33938 stood out: the 10× 3′ gene expression kit uses oligo(dT) primers to capture polyA-tailed transcripts, which are mostly mRNAs, yet lncRNA:CR33938 expression was clearly observed in this study (Figure 2D). Indeed, lncRNA:CR33938 was identified in other studies by using polyA+ bulk RNA-sequencing [40,41]. Upon splitting the integrated data into its respective samples (T1, T2, and T3), lncRNA:CR33938 expression was negligible in T1, appeared more widespread in T2, and became specific to the distal cells in T3. Note that while most distal cells expressed lncRNA:CR33938, a small subset of medial and proximal cells also expressed the lncRNA.

2.2.2. Experimental Validation of lncRNA:CR33938 Expression in L3 Discs

Fluorescence in-situ hybridization (FISH) of lncRNA:CR33938 in T1, T2 and T3 L3 discs was performed alongside the region-delineating controls Dll and dac (Figure 2E). Expression of only Dll represented the distal cells, while co-expression of Dll and dac or expression of only dac represented the medial cells. No lncRNA:CR33938 expression was detected in T1 L3 discs. In T2 L3 discs, lncRNA:CR33938 was expressed in the proximal, medial, and distal cells, while in T3 L3 discs its expression was most prominent in distal cells, although medial cells also showed more limited expression. More specifically, six regions of large punctate lncRNA:CR33938 expression were observed in T3, including four regions of high expression and two regions of lower expression. Three of the six regions were within the distal cells, while the other three regions were outside of the distal cells, suggesting expression of lncRNA:CR33938 in cells other than the distal cells, namely the medial cells. The proximal cells also had a low level of lncRNA:CR33938 expression, as displayed by a tint of red fluorescence peripheral to the medial cells delineated by the dac marker. Consistent with this, Figure 2D showed that cells other than distal cells expressed the lncRNA, although these cells represented only small subsets of their subclusters. Thus, these FISH results corroborated the scRNA-seq data. Note that the punctate regional expression may suggest that lncRNA:CR33938 functions locally in aggregates.

2.2.3. Conservation of lncRNA:CR33938 in Insect Species

Analysis of the conservation of lncRNA:CR33938 across 124 insect species revealed that the lncRNA had a high conservation level in exon regions (Figure 3A). Moreover, a comparison with the conservation of all 2258 lncRNAs annotated in the reference annotation suggested that lncRNA:CR33938 was more conserved than 90% of the other lncRNAs (Figure 3B). This high degree of conservation is consistent with an important regulatory role of lncRNA:CR33938 in insect development.
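The ranking behind a statement such as "more conserved than 90% of the other lncRNAs" can be sketched as follows. This is a minimal illustration and not the authors' pipeline: the function name, the example scores, and the idea of summarizing each lncRNA by a single conservation score are assumptions made for the example.

```python
def fraction_less_conserved(target_score, other_scores):
    """Return the fraction of other lncRNAs whose conservation score is
    strictly lower than the target's score.

    target_score : hypothetical per-gene conservation score of the lncRNA of interest
    other_scores : hypothetical scores of all other annotated lncRNAs
    """
    if not other_scores:
        raise ValueError("other_scores must contain at least one value")
    n_lower = sum(1 for score in other_scores if score < target_score)
    return n_lower / len(other_scores)


# Toy example with made-up scores (not data from the study).
example = fraction_less_conserved(0.8, [0.1, 0.2, 0.9, 0.5])
```

A value of 0.9 or higher from such a function would correspond to the kind of ranking summarized in Figure 3B.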

2.2.4. Overexpression of lncRNA:CR33938 in S2 Cells

Transient overexpression of full-length lncRNA:CR33938 in S2 cells produced an approximately 40,000-fold increase in expression level compared to the empty vector control according to qRT-PCR (Figure 3C). Correspondingly, expression of the PD axis genes (Hh, wg, and dpp) showed an increasing trend upon lncRNA:CR33938 overexpression. While there was no effect on the expression of genes controlling proximal leg femur growth, expression of the distal leg tarsal gene disco-r and the medial leg tibial gene dac significantly increased with lncRNA:CR33938 overexpression. This is consistent with the scRNA-seq and FISH data indicating that lncRNA:CR33938 is normally expressed in, and most strongly affects, the distal end of the leg.

2.3. Generation of an Epigenomic Cell Atlas of the Developing Leg Imaginal Disc

2.3.1. scATAC-seq Identified Similar Cell Types as scRNA-seq

The sample-wide integrated scATAC-seq dataset showed an overlaying of the T1, T2, and T3 individual samples (Figure 4A) and identified twelve distinct clusters based on differences in chromatin accessibility (Figure 4B). A heatmap distinguished the proportion of cells in each cluster at each timepoint and showed differences in chromatin accessibility and cell composition across three timepoints (Figure 4C). For example, cluster 6 (C6) showed greater than 80% of the cells in T1, a small proportion of the cells in T2, and nearly no cells in T3. Similarly, T1 had many cells from cluster 12 (C12) and cluster 2 (C2).
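The per-cluster timepoint composition summarized in such a heatmap can be computed along the following lines. This is a minimal sketch under stated assumptions: the function name and the toy labels are hypothetical, and the fractions are normalized within each cluster (so that each cluster sums to 1), which matches the "greater than 80% of the cells in T1" description but may differ from how Figure 4C was actually normalized.

```python
from collections import Counter


def cluster_timepoint_fractions(cluster_labels, timepoint_labels):
    """For each cluster, return the fraction of its cells from each timepoint.

    cluster_labels   : one cluster label per cell (e.g. "C6")
    timepoint_labels : one timepoint label per cell (e.g. "T1"), same order
    """
    if len(cluster_labels) != len(timepoint_labels):
        raise ValueError("labels must have the same length")
    pair_counts = Counter(zip(cluster_labels, timepoint_labels))
    cluster_totals = Counter(cluster_labels)
    timepoints = sorted(set(timepoint_labels))
    return {
        cluster: {tp: pair_counts[(cluster, tp)] / total for tp in timepoints}
        for cluster, total in cluster_totals.items()
    }


# Toy labels (not data from the study): 10 cells in C6, 4 cells in C2.
clusters = ["C6"] * 10 + ["C2"] * 4
timepoints = ["T1"] * 9 + ["T2"] + ["T1", "T2", "T3", "T3"]
fractions = cluster_timepoint_fractions(clusters, timepoints)
```

In this toy input, C6 is dominated by T1 cells, mirroring the pattern described for cluster 6.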
Upon integration of the chromatin accessibility data with the gene expression data, seven cell types identified in scRNA-seq were transferred to the scATAC-seq clusters (Figure 4D). These cell types corresponded to the cell subtypes of the PD axis of the leg disc epithelium (proximal, medial, and distal cells) as well as those of the muscle, neuronal, and immune cells.
The cell type identities were confirmed by an inferred gene score of chromatin accessibility for a list of known marker genes specific to the cell types (Figure 4E). Similar to the gene expression data in scRNA-seq, high gene scores of Sp1 and Ubx confirmed that the cells originated from L3 discs. All cells that composed the leg disc epithelium (proximal, medial, and distal cells) showed markers Fas3 and nw. The presence of Dll only (without dac), al, and C15 confirmed the identity of the distal cells. Dll (with dac) and dac only confirmed the identity of the medial cells. Similarly, tsh and hth were markers for the proximal cells, while Him and twi represented the muscle cells. The neuronal cells showed high gene scores for nervana 3 (nrv3) and complexin (cpx), and the immune cells produced high scores for Hml (for hemocytes) and Pxn (for plasmatocytes) markers.
The most enriched motifs for each cell type are shown as a heatmap (Figure 4F). The GATA motifs were evident in the immune cells, with several GATA family members observed. The PRDM9 and HIF2a.bHLH motifs were highly enriched in the medial and distal cells, respectively. While the NRF motif was enriched in the proximal cells, its enrichment was more evident in the neuronal cells. The muscle cells were enriched in many motifs, including Maz, KLF14, ZNF, Egr2, Olig2, Egr1, KLF10, and Klf9. The neuronal cells were also enriched for many motifs, including MyoD, Myf5, E2A, PAX5, MyoG, Tcf12, EKLF, Ascl1, and NRF.

2.3.2. Chromatin Accessibility Differentiated the T1, T2, and T3 Distal Cell Functions

Gene set enrichment of T2 genes relative to T1 and T3 genes relative to T1 showed differences in cellular processes (Figure 4G). The T2 distal cells were more involved in metabolic processes, while the T3 distal cells had a larger role in chitin-based larval cuticle development.
Fragment coverage within the 40,000 base pairs on either side of the distal cell marker gene Dll showed increased coverage in the distal cells with marked co-accessibility in neighboring genes (Figure 4H). Distal cell marker gene C15 similarly displayed increased coverage in the distal cells with marked co-accessibility in neighboring genes.
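Counting fragments in a window that extends 40,000 bp to either side of a gene can be sketched as below. This is an illustrative sketch only: the function name, the tuple layout of the fragments, the barcode-to-cell-type mapping, and the coordinates in the example are all hypothetical, and the source does not state which software produced the coverage tracks.

```python
from collections import Counter


def coverage_by_cell_type(fragments, barcode_to_type, gene_start, gene_end,
                          flank=40_000):
    """Count fragments per cell type that overlap [gene_start - flank, gene_end + flank).

    fragments       : iterable of (start, end, barcode), half-open coordinates
                      on the same chromosome as the gene
    barcode_to_type : dict mapping cell barcode -> cell type label
    """
    win_start = max(0, gene_start - flank)
    win_end = gene_end + flank
    counts = Counter()
    for start, end, barcode in fragments:
        cell_type = barcode_to_type.get(barcode)
        if cell_type is None:
            continue  # barcode was not assigned to a cell type
        # Two half-open intervals overlap when each starts before the other ends.
        if start < win_end and end > win_start:
            counts[cell_type] += 1
    return dict(counts)


# Toy example with made-up coordinates: the window is 60,000-150,000.
toy_fragments = [
    (59_900, 60_050, "a"),    # overlaps the left edge of the window
    (150_000, 150_100, "a"),  # starts exactly at the window end: excluded
    (120_000, 120_200, "b"),  # inside the window
    (10, 200, "b"),           # far upstream: excluded
    (100_500, 100_700, "c"),  # unassigned barcode: skipped
]
toy_types = {"a": "distal", "b": "medial"}
toy_counts = coverage_by_cell_type(toy_fragments, toy_types, 100_000, 110_000)
```

Raw counts like these would normally be normalized by the number of cells or total fragments per cell type before comparing tracks.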

3. Discussion

We used scRNA-seq and scATAC-seq to explore the Drosophila L3 disc transcriptomic and epigenomic landscapes, respectively, at three timepoints of development. The multi-omics datasets corroborated each other and showed similar cell types that delineated the various regions of the leg disc, namely, those along the PD axis. Moreover, scRNA-seq identified an experimentally validated late-stage distal-specific and conserved lncRNA (lncRNA:CR33938) that, upon further characterization by overexpression studies, promoted distal leg growth gene expression. In addition, differences in chromatin accessibility determined by scATAC-seq indicated the disparate functions of early- and late-stage distal cells.
Given that the three legs of Drosophila differ in their developmental programs, their underlying differences cannot be ignored when studying leg disc development [23]. Therefore, we specifically isolated the third leg disc to provide a more coherent single-cell atlas.
Simultaneous multi-omics library preparation methods, where the same cell or nucleus is used for different single-cell assays, were not available at the time these experiments were completed. As a result, the same cell suspension was used for both scRNA-seq and scATAC-seq to minimize biological variation. Furthermore, the limited number of cells extracted per leg disc prevented us from running separate experiments for individual biological replicates. Instead, a single assay pooling many biological replicates was conducted for each time point.
The computationally determined assignment of cell types to clusters depended upon the most upregulated genes in each cluster and prior information about cell type-specific marker genes. In addition to the prominent distal, medial, and proximal cell types, cells that did not express explicit marker genes denoting specific cell types represented early developing cells with undetermined fate, which we referred to as “stem cell-like cells”.
We found a large cluster of epithelial cells and a smaller cluster of muscle cells in the L3 discs, which corroborated previous studies that have shown the presence of many epithelial cells and accompanying muscle cells in the wing discs of third-instar larva [15,16,17]. Previous studies have suggested that the epithelial cells of the wing disc can be mapped to distinct subregions, including the pouch, hinge, notum, and peripodial membrane [16]. Similarly, the epithelial cells of the leg disc could be mapped to distinct proximal, medial, and distal subregions. Regarding muscle cells, research has shown that they can be subcategorized into direct and indirect flight muscles [17]. Given this finding, we also subcategorized the L3 disc muscle cells based on early versus late muscle development genes.
Our results also demonstrated the presence of neuronal and immune cells in L3 discs, which is consistent with the recent single-nucleus transcriptomics study on the adult fruit fly leg by the Fly Cell Atlas showing the presence of various differentiated neurons as well as hemocytes and glial cells [20]. This suggests that the neuronal cells in the developing leg disc are not yet fully differentiated, though they can subsequently differentiate.
Despite the publication of several works on single-cell transcriptomic landscapes of the Drosophila wing and eye-antennae imaginal discs [14,15,16,17,18,19], this study is the first to describe the transcriptomic and epigenomic landscape of the leg disc, specifically the third leg disc, at single-cell resolution. The Fly Cell Atlas study determined the single-nucleus transcriptomic atlas of the adult fruit fly leg, but it was based on fully differentiated tissues [20]. In contrast, our work was based on developing tissue and characterized the importance of an identified lncRNA.
lncRNAs tend to have lower expression levels than protein-coding genes [42]. The detection of lncRNA:CR33938 by our polyA-tailed single-cell transcriptomics assay indicated that it had a robust level of expression and suggested that it had an important physiological function in leg development given that lncRNA expression is environment-specific [42].
Our work highlighted the spatiotemporal expression of lncRNA:CR33938. It was largely distal-specific, as suggested by the scRNA-seq and FISH results, although some cells other than the distal cells also showed expression. We also provided evidence that lncRNA:CR33938 may have an important role in leg development. We used a previously published list of larval stage genes that establish the PD axis of the Drosophila leg [43]. Overexpression of lncRNA:CR33938 upregulated the tarsal leg growth gene disco-r. The expression of its paralog, disco, is maintained by Dll, and disco gene function is also required for the maintenance of Dll expression [44]. Given the important function of disco in maintaining a key gene in PD axis development, this suggests that lncRNA:CR33938 has a potentially important role in leg development. Note that dac was also upregulated by lncRNA:CR33938 overexpression. This was not surprising, as the lncRNA was also expressed in the medial cells in T2. This suggests that lncRNA:CR33938 has a spatiotemporal role, in which it first modulates medial leg cell fate early in development and then modulates distal leg cell fate later in development. This hypothesis could be tested by using temperature-sensitive GAL80 together with the GAL4/UAS system in flies to spatiotemporally overexpress lncRNA:CR33938 and to assess leg phenotypes, such as leg length. Prior to this study, lncRNA:CR33938 did not have an annotated function, but our study indicated that it may influence late-stage distal, and perhaps mid-stage medial, leg growth.

4. Materials and Methods

4.1. Fly Maintenance and Stocks

The w1118 Drosophila melanogaster fly line was obtained as a gift from Prof. Edwin Chan’s lab at The Chinese University of Hong Kong. All flies were maintained at room temperature in regular light-dark cycles in vials containing standard cornmeal agar medium (Nutri-fly, #66-112).

4.2. Fly Breeding Schedule for T1, T2, and T3

Male and female w1118 flies were allowed to mate for 2 h at room temperature in a clear plastic cup with an attached petri dish containing apple juice agar. After this time had elapsed, embryos were transferred from the apple juice agar plate to a vial containing standard cornmeal agar medium. They were then allowed to grow for 121, 133 or 168 h, corresponding to T1, T2, and T3, after which the third leg discs of these third-instar larvae (L3) were dissected.

4.3. Third Leg Disc Dissection and Single Cell Dissociation

At least 70 L3 discs were dissected for T1, and at least 50 L3 discs each were dissected for T2 and T3. These discs were collected in an Eppendorf tube containing phosphate-buffered saline (PBS) with 0.04% bovine serum albumin (BSA) on ice. After pipetting out the PBS from briefly centrifuged samples, we added TrypLE Select Enzyme (10×) (ThermoFisher, Waltham, MA, USA, #A1217702). The discs were then incubated in a thermomixer shaken at 500 rpm for 25 min at 37 °C (with the tube being flicked every five minutes). S2 medium (Gibco, Waltham, MA, USA, #21720) supplemented with 10% fetal bovine serum and 2% penicillin/streptomycin was then added to stop the dissociation reaction. Finally, the isolated single cells were washed and resuspended in PBS + 0.04% BSA.

4.4. DNA Library Preparation and Sequencing

The complementary DNA (cDNA) libraries for T1, T2, and T3 were prepared according to the 3′ scRNA-seq library preparation protocol (v3.1) of 10× Genomics. In summary, a microfluidics chip was used to produce GEMs (Gel Bead-in-Emulsions), which are droplets that each contain a single cell, reverse transcription reagents, and a single microbead whose attached oligonucleotides include a unique cell barcode. When the single cell lyses within the intact GEM, the cellular polyA-tailed transcript sequences become exposed, reverse transcription occurs, and each cDNA transcript within the same cell receives the same cell barcode with a different UMI (unique molecular identifier). Subsequently, the droplets are broken and the cell-barcoded cDNA from all cells is pooled and amplified. cDNA library construction involved fragmentation, end-repair, A-tailing, double-sided size selection, and sample index incorporation. Quality control and qualitative analysis of the final library were performed on an Agilent Bioanalyzer DNA High Sensitivity chip (Beijing, China). Sequencing of the libraries was completed on the Illumina NovaSeq6000 platform by Novogene (Beijing, China).

4.5. scRNA-seq Raw Data Processing, Quality Assessment, and Filtering

The raw paired-end sequencing data files (FASTQ) were processed using the Cell Ranger pipeline v4.0.0 with default settings. Read alignment and UMI counts were based on a BDGP6 genome reference FASTA file annotated with the BDGP6.28 GTF file from Ensembl. Cell-UMI count tables were loaded into Seurat v4.0 [45]. Cells with 1000–250,000 UMI counts and less than 5% mitochondrial gene content were retained for downstream analysis. We only kept genes with at least 20 UMI counts in total across all cells. Qualimap (v2.2.1) was further used to assess the percentage of mapped reads and percentage of mapped reads aligned to genes for comparison between T1, T2, and T3.
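The filtering thresholds above can be illustrated with a short, dependency-free sketch. The study performed this step in Seurat (R), so the following Python code is an illustration of the logic and not the authors' code. The function name, the genes × cells list layout, the "mt:" prefix used to flag mitochondrial genes, the reading of "5% mitochondrial genes" as a fraction of UMIs, and the choice to apply the gene filter after the cell filter are all assumptions.

```python
def qc_filter(counts, gene_names, mito_prefix="mt:",
              min_umi=1000, max_umi=250_000,
              max_mito_frac=0.05, min_gene_umi=20):
    """Return (keep_cells, keep_genes) boolean lists.

    counts     : list of rows, one per gene, each a list of UMI counts per cell
    gene_names : gene symbols in the same order as the rows of counts
    """
    n_cells = len(counts[0])
    is_mito = [name.startswith(mito_prefix) for name in gene_names]

    # Total and mitochondrial UMI counts per cell.
    total = [sum(row[c] for row in counts) for c in range(n_cells)]
    mito = [sum(row[c] for row, m in zip(counts, is_mito) if m)
            for c in range(n_cells)]

    # Cell gate: UMI count within range and mitochondrial fraction below cutoff.
    keep_cells = [
        total[c] > 0
        and min_umi <= total[c] <= max_umi
        and mito[c] / total[c] < max_mito_frac
        for c in range(n_cells)
    ]

    # Gene gate (assumed order): summed UMIs over the retained cells.
    keep_genes = [
        sum(v for v, keep in zip(row, keep_cells) if keep) >= min_gene_umi
        for row in counts
    ]
    return keep_cells, keep_genes


# Toy matrix with made-up counts: 3 genes x 4 cells.
toy_genes = ["mt:CoI", "Dll", "dac"]
toy_counts = [
    [10, 100, 0, 5],
    [1500, 900, 10, 300_000],
    [500, 1000, 5, 10],
]
toy_keep_cells, toy_keep_genes = qc_filter(toy_counts, toy_genes)
```

In the toy matrix, the second cell is removed because exactly 5% of its UMIs are mitochondrial, the third because it has too few UMIs, and the fourth because it exceeds the upper UMI limit.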

4.6. scRNA-seq Data Integration, Clustering, and Cell Type Identification

The T1, T2 and T3 filtered single-cell datasets were merged and integrated using Seurat (v4.0). Batch effects between samples were corrected using Harmony (v1.0) prior to clustering analysis. PCA was used to determine the optimal number of dimensions for dimensionality reduction, and clustering was performed based on K-nearest neighbor (KNN) graphs with a resolution of 0.02 before UMAP visualization of the single-cell data in two dimensions. The major cell types of the clusters were identified based on known marker genes that were among the most upregulated differentially expressed genes relative to other clusters. All clusters were further subclustered into constituent cells based on known marker genes. Dotplots, featureplots, heatmaps, and UMAPs were then generated. Other than the known marker genes, novel genes were also identified as potential markers.

4.7. Validation of scRNA-seq Results by FISH and Confocal Imaging

w1118 flies were bred and T1, T2, and T3 L3 discs were dissected as described above. The discs were fixed in 3.7% paraformaldehyde on ice for 30 min. Then, probe hybridization was completed according to the protocol provided by Molecular Instruments. The discs were first permeabilized in a detergent solution containing sodium dodecyl sulfate and Tween-20, before custom-designed probes for Dll, dac, and lncRNA:CR33938 were hybridized to the fixed and permeabilized discs for 20 h at 37 °C. After several washes with 5X SSC-Tween-20, hairpins with different fluorophores for each probe were added and incubated for 16 h in darkness at room temperature. The discs then underwent several further washes with 5X SSC-Tween-20 and were mounted onto a Menzel-Gläser Superfrost Plus microscope slide (Thermo Scientific, Waltham, MA, USA, #J1800AMNZ) with a Hydromount mounting medium (National Diagnostics, Charlotte, NC, USA, #HS-106) and a 22 × 50 mm Deckgläser microscope coverglass (VWR, Radnor, PA, USA, #630-1461). The mounts were visualized on a Leica SP8 confocal microscope and each sample was imaged every 0.25 μm along the z-axis. The confocal images were z-stacked and processed with Leica Application Suite software.

4.8. Construction of lncRNA:CR33938 Expression Vector for Expression Studies

Total RNA was extracted from D. melanogaster L3 discs using NucleoZOL (Macherey-Nagel, UK, #740404.200) following the manufacturer’s protocol. Following RNase-free DNaseI (Thermo Scientific #EN0521) treatment and DNaseI inactivation by EDTA, the purified RNA was subjected to cDNA generation using PrimeScript II (Takara, Japan, #RR036A). The cDNA concentration was measured using the Qubit High Sensitivity double-stranded DNA assay. lncRNA:CR33938 was amplified by PCR from the cDNA using the following primers with restriction site sequences inserted: forward primer 5′-TTTGGTACCTTGAGTCCGAGAGGTT-3′ and reverse primer 5′-CGCTCTAGACTCTTTTTTTGGTAGCCTATT-3′. The amplicon and the pAc5.1/V5-His B expression vector (Invitrogen, Waltham, MA, USA, #V411020) were digested with KpnI (New England Biolabs, USA, #R3142) and XbaI (New England Biolabs, Ipswich, MA, USA, #R0145) restriction enzymes and subsequently ligated using T4 DNA ligase (Invitrogen #15224017). The ligation mixture was transformed into chemically competent Escherichia coli (Invitrogen, #C404003) and selected using 100 mg/mL of ampicillin. The sequence of the lncRNA:CR33938 construct cloned into the expression vector was verified by Sanger sequencing at the Beijing Genomics Institute. Transfection-ready plasmid DNA was extracted using a Plasmid Miniprep kit (Invitrogen, #K210011).

4.9. S2 Cell Culture and Transfection

D. melanogaster S2 cells were provided by Prof. Jerome Hui from the School of Life Sciences of The Chinese University of Hong Kong. S2 cells were cultured in Schneider’s Drosophila Medium (Gibco #21720) supplemented with 10% heat-inactivated fetal bovine serum (Gibco #10270) and 1% penicillin-streptomycin antibiotic mixture (Gibco #15140122) in a 25 °C humidified incubator. The cloned pAc5.1-lncRNA:CR33938 construct was transfected into S2 cells using Effectene (Qiagen, Germantown, MD, USA, #301425), and the pAc5.1 backbone vector was used as a negative control. The cells were then incubated at 25 °C for 48 h prior to RNA extraction for qRT-PCR.

4.10. RNA Extraction and qRT-PCR of S2 Cells

RNA was extracted with NucleoZOL (Macherey-Nagel, Allentown, PA, USA, #740404.200). Following RNase-free DNaseI (Thermo Scientific #EN0521) treatment and DNaseI inactivation by EDTA, the purified RNA was subjected to cDNA generation using PrimeScript II (Takara, Tokyo, Japan, #RR036A). The cDNA concentration was measured with a Qubit High Sensitivity double-stranded DNA assay. For qRT-PCR, 1 ng of template cDNA and 1× TB Green II (Takara #RR820) were added to each well of a 96-well plate (Axygen, Union City, CA, USA, #PCR-96-FSC), which was covered with an optical adhesive film (Applied Biosystems ABI, Waltham, MA, USA, #4311971) and run on a BioRad CFX96 real-time PCR detection system. Primer sequences for each tested gene are listed in Table S2.
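The text does not state how relative expression was derived from the qRT-PCR threshold cycles. One widely used option is the 2^−ΔΔCt calculation, sketched below as an assumption rather than as the authors' method. All Ct values and argument names are hypothetical.

```python
def relative_expression(ct_target_treated: float, ct_ref_treated: float,
                        ct_target_control: float, ct_ref_control: float) -> float:
    """Fold change of a target gene by the 2^-ddCt calculation.

    'treated' stands for cells transfected with the lncRNA construct and
    'control' for cells transfected with the empty backbone vector.
    """
    # Normalize the target gene to a reference gene within each sample.
    delta_treated = ct_target_treated - ct_ref_treated
    delta_control = ct_target_control - ct_ref_control
    # Compare the normalized values between the two conditions.
    return 2.0 ** -(delta_treated - delta_control)


if __name__ == "__main__":
    # Hypothetical Ct values, for illustration only.
    print(relative_expression(22.0, 18.0, 24.0, 18.0))
```

With these invented values the target crosses threshold two cycles earlier in the treated sample after normalization, which corresponds to a fourfold increase.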

4.11. Nuclei Isolation for scATAC-seq

The same suspension of single cells used for scRNA-seq was used for nuclei isolation for scATAC-seq. Nuclei isolation was performed according to a 10× Genomics low input protocol for scATAC-seq with some optimizations. The cell suspension was pelleted and lysed on ice for 30 s in a buffer containing the detergents Tween-20 and Nonidet P-40 (NP-40). The isolated nuclei were then washed twice and resuspended in chilled 10× diluted nuclei buffer (provided by 10× Genomics). Trypan blue-stained nuclei were observed under the microscope to assess nuclei quality.

4.12. DNA Library Preparation and scATAC-seq

DNA library preparation was performed according to the scATAC-seq preparation protocol (v1.1) of 10× Genomics. First, the nuclei suspensions were incubated in a transposition mix containing a transposase that preferentially fragments DNA in open regions of the chromatin and simultaneously adds adapter sequences to the ends of the DNA fragments. As in scRNA-seq, a microfluidics chip was used to produce GEMs (Gel Bead-in-Emulsions), but in this case, each droplet contained a single nucleus, a single gel bead carrying oligonucleotides with a unique cell barcode, and DNA amplification reagents. Once the DNA from each nucleus was barcoded, all nuclei were pooled for DNA library construction. Because only the histone-unbound areas of the genome are cut by the transposase, the library consisted of DNA fragments that represented the open chromatin regions of the genome. Quality control and qualitative analysis of the final library were performed on an Agilent Bioanalyzer High Sensitivity DNA chip. The libraries were sequenced on the Illumina NovaSeq6000 platform by Novogene at PE50 with a sequencing depth of approximately 50,000 read pairs per cell.
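The stated depth translates into a sequencing budget by simple arithmetic. The sketch below takes the per-nucleus depth and read length from the text; the number of nuclei is a hypothetical input, since the text does not give it here.

```python
READ_PAIRS_PER_NUCLEUS = 50_000  # approximate depth stated in the text
READ_LENGTH = 50                 # PE50: two reads of 50 bases per pair


def sequencing_budget(n_nuclei: int) -> dict:
    """Return total read pairs, reads, and bases for a given nuclei count."""
    pairs = n_nuclei * READ_PAIRS_PER_NUCLEUS
    return {
        "read_pairs": pairs,
        "reads": 2 * pairs,
        "bases": 2 * pairs * READ_LENGTH,
    }


if __name__ == "__main__":
    # 5,000 nuclei is an invented example value.
    print(sequencing_budget(5_000))
```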

4.13. scATAC-seq Data Analysis

The raw paired-end sequencing data were processed by Cell Ranger ATAC pipeline v2.0.0 with default settings, using a dm6 UCSC reference generated by the 10× Genomics mkref function. Data processing, filtering, dimensionality reduction, and clustering were performed with ArchR v1.0.1 [46]. UMAP visualizations of the scATAC-seq clusters were created before and after integration with scRNA-seq data. Determination of cell type identities was aided by manual annotation of cell type-specific marker genes based on gene scores estimated from the chromatin accessibility data. Peak calling with MACS2 v2.2.7.1 [47] was performed on each cell cluster. Identification of robust peak sets allowed the prediction of enriched transcription factor motifs for each cluster. Gene ontology analysis was performed using clusterProfiler to determine the distal processes enriched in T2 and T3 relative to T1. Genome browser plots depicting co-accessibility of distal genes with nearby genes were also generated using ArchR.

Supplementary Materials

The following supporting information can be downloaded at: https://www.mdpi.com/article/10.3390/ijms23126796/s1.

Author Contributions

Conceptualization: T.-F.C. and J.H.; methodology: J.T.; validation: J.T., T.H.L. and A.C.K.L.; formal analysis: J.T.; investigation: J.T., T.H.L., A.C.K.L., I.L., Z.Q. and X.L.; data curation: J.T. and J.Z.; writing—original draft preparation: J.T.; writing—review and editing: X.L., T.-F.C. and J.H.; visualization: J.T.; supervision: T.-F.C. and J.H.; project administration: J.T.; funding acquisition: T.-F.C. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the System Information Science fund from The Chinese University of Hong Kong Faculty of Science, a donation from Mr. and Mrs. Sunny Yang, The Chinese University of Hong Kong Direct Grant 4053486, and the Innovation and Technology Fund (to the State Key Lab of Agrobiotechnology).

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

The sequencing data presented in this study are openly available in NCBI SRA with BioProject ID PRJNA831899.

Conflicts of Interest

The authors declare no conflict of interest.

Appendix A

Table A1. Relative expression levels of marker genes. The average log2FC expression levels of the marker genes displayed in Figure 1 and Figure 2 are tabulated here.
| Cell Subtype | Gene | Percent Expressed | Normalized UMI Count | Seurat Scaled Normalized UMI Count |
| --- | --- | --- | --- | --- |
| Medial | al | 14.47507953 | 0.446617753 | −0.38657561 |
| Medial | C15 | 1.033934252 | 0.020802751 | −0.414645183 |
| Medial | Dll | 75.95440085 | 5.128735566 | 0.44025229 |
| Medial | dac | 68.31919406 | 3.578504256 | 1.883267974 |
| Medial | tsh | 8.324496288 | 0.18788577 | −0.980689638 |
| Medial | hth | 39.10392365 | 4.268092807 | −0.888707309 |
| Medial | vg | 1.802757158 | 0.078396772 | −0.412093522 |
| Medial | salr | 1.272534464 | 0.028393623 | −0.448546528 |
| Medial | salm | 1.405090138 | 0.032101424 | −0.457590917 |
| Medial | hui | 10.49840933 | 0.275768473 | −0.441380335 |
| Medial | betaTub56D | 99.73488865 | 67.01346734 | −0.603531241 |
| Medial | alphaTub84B | 99.70837752 | 57.70906617 | −0.54154286 |
| Medial | tsr | 99.60233298 | 52.63663428 | −0.473061117 |
| Medial | Act5C | 99.76139979 | 61.24349032 | −0.509223741 |
| Proximal | al | 17.52711497 | 0.862922147 | 0.067037547 |
| Proximal | C15 | 0.954446855 | 0.015783215 | −0.4175847 |
| Proximal | Dll | 28.63340564 | 1.420345973 | −0.645129847 |
| Proximal | dac | 18.13449024 | 0.723348582 | −0.290659318 |
| Proximal | tsh | 71.32321041 | 4.083784238 | 1.609816265 |
| Proximal | hth | 96.52928416 | 79.60652398 | 1.461906241 |
| Proximal | vg | 1.561822126 | 0.068269738 | −0.417599705 |
| Proximal | salr | 2.212581345 | 0.12027794 | −0.360530445 |
| Proximal | salm | 2.993492408 | 0.195738255 | −0.347537362 |
| Proximal | hui | 8.937093275 | 0.343389783 | −0.426442771 |
| Proximal | betaTub56D | 99.65292842 | 67.46030873 | −0.576381945 |
| Proximal | alphaTub84B | 99.52277657 | 59.83865741 | −0.467998663 |
| Proximal | tsr | 99.43600868 | 53.94964304 | −0.392002795 |
| Proximal | Act5C | 99.69631236 | 64.22054168 | −0.447496433 |
| Fate_Undetermined | al | 7.564469914 | 0.479912329 | −0.350297209 |
| Fate_Undetermined | C15 | 1.088825215 | 0.054030437 | −0.395186544 |
| Fate_Undetermined | Dll | 37.13467049 | 2.905723476 | −0.210385285 |
| Fate_Undetermined | dac | 24.41260745 | 1.538342317 | 0.329880269 |
| Fate_Undetermined | tsh | 13.29512894 | 0.933420238 | −0.484960231 |
| Fate_Undetermined | hth | 51.57593123 | 10.93855676 | −0.680584009 |
| Fate_Undetermined | vg | 1.891117479 | 0.261258161 | −0.312669703 |
| Fate_Undetermined | salr | 1.604584527 | 0.097676107 | −0.382180764 |
| Fate_Undetermined | salm | 2.063037249 | 0.167802628 | −0.3663254 |
| Fate_Undetermined | hui | 6.074498567 | 0.405081245 | −0.412815111 |
| Fate_Undetermined | betaTub56D | 99.94269341 | 87.40336921 | 0.635323338 |
| Fate_Undetermined | alphaTub84B | 99.88538682 | 77.19868569 | 0.131519833 |
| Fate_Undetermined | tsr | 99.82808023 | 67.62449968 | 0.452211633 |
| Fate_Undetermined | Act5C | 99.77077364 | 82.08451192 | −0.07709812 |
| Proximal-Distal-Axis | al | 10.36789298 | 0.323764339 | −0.520439017 |
| Proximal-Distal-Axis | C15 | 2.173913043 | 0.053608202 | −0.395433811 |
| Proximal-Distal-Axis | Dll | 34.44816054 | 1.98972318 | −0.478482888 |
| Proximal-Distal-Axis | dac | 12.04013378 | 0.424840171 | −0.517944852 |
| Proximal-Distal-Axis | tsh | 29.59866221 | 2.366915062 | 0.468215741 |
| Proximal-Distal-Axis | hth | 81.27090301 | 59.52749037 | 0.835425868 |
| Proximal-Distal-Axis | vg | 39.96655518 | 4.585898734 | 2.038686595 |
| Proximal-Distal-Axis | salr | 43.81270903 | 2.625894433 | 2.039602193 |
| Proximal-Distal-Axis | salm | 53.34448161 | 3.74465689 | 2.039279287 |
| Proximal-Distal-Axis | hui | 6.18729097 | 0.155657069 | −0.467912975 |
| Proximal-Distal-Axis | betaTub56D | 99.49832776 | 71.17380529 | −0.350756424 |
| Proximal-Distal-Axis | alphaTub84B | 98.82943144 | 59.83141261 | −0.468248858 |
| Proximal-Distal-Axis | tsr | 98.66220736 | 44.76118553 | −0.959250307 |
| Proximal-Distal-Axis | Act5C | 99.49832776 | 55.6216358 | −0.625789399 |
| Distal | al | 60 | 2.603285376 | 1.963370297 |
| Distal | C15 | 73.57142857 | 4.214292742 | 2.041126432 |
| Distal | Dll | 92.14285714 | 9.809431808 | 1.810211936 |
| Distal | dac | 9.642857143 | 0.250736492 | −0.650508111 |
| Distal | tsh | 9.642857143 | 0.248253186 | −0.940549437 |
| Distal | hth | 27.85714286 | 2.856922579 | −0.932736841 |
| Distal | vg | 1.071428571 | 0.02411907 | −0.441604926 |
| Distal | salr | 1.071428571 | 0.013810745 | −0.462515482 |
| Distal | salm | 2.142857143 | 0.045931453 | −0.448289565 |
| Distal | hui | 12.85714286 | 0.966716888 | −0.288749653 |
| Distal | betaTub56D | 100 | 62.93114147 | −0.851566182 |
| Distal | alphaTub84B | 100 | 55.43422202 | −0.620103282 |
| Distal | tsr | 100 | 53.09832627 | −0.444558658 |
| Distal | Act5C | 100 | 69.10840865 | −0.346149551 |
| Anterior-Posterior-Axis | al | 4.519774011 | 0.091887873 | −0.773096008 |
| Anterior-Posterior-Axis | C15 | 1.129943503 | 0.014602416 | −0.418276194 |
| Anterior-Posterior-Axis | Dll | 13.55932203 | 0.493279925 | −0.916466206 |
| Anterior-Posterior-Axis | dac | 5.649717514 | 0.11476684 | −0.754035963 |
| Anterior-Posterior-Axis | tsh | 56.49717514 | 2.156294232 | 0.3281673 |
| Anterior-Posterior-Axis | hth | 88.70056497 | 39.31225982 | 0.204696051 |
| Anterior-Posterior-Axis | vg | 0 | 0 | −0.454718739 |
| Anterior-Posterior-Axis | salr | 1.694915254 | 0.093867563 | −0.385828973 |
| Anterior-Posterior-Axis | salm | 3.389830508 | 0.088684597 | −0.419536042 |
| Anterior-Posterior-Axis | hui | 79.0960452 | 11.49658469 | 2.037300845 |
| Anterior-Posterior-Axis | betaTub56D | 100 | 105.6986505 | 1.746912454 |
| Anterior-Posterior-Axis | alphaTub84B | 100 | 130.3298526 | 1.966373831 |
| Anterior-Posterior-Axis | tsr | 100 | 89.72629303 | 1.816661244 |
| Anterior-Posterior-Axis | Act5C | 100 | 182.5387119 | 2.005757244 |
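The scaled column in Table A1 appears consistent with a per-gene z-score taken across the six cell subtypes. This is an observation from the tabulated numbers, not a statement of the authors' pipeline. The sketch below checks it for hth, using the normalized UMI counts from the table and the sample standard deviation; the function name is illustrative.

```python
from statistics import mean, stdev

# Normalized UMI counts for hth, copied from Table A1.
HTH_NORMALIZED = {
    "Medial": 4.268092807,
    "Proximal": 79.60652398,
    "Fate_Undetermined": 10.93855676,
    "Proximal-Distal-Axis": 59.52749037,
    "Distal": 2.856922579,
    "Anterior-Posterior-Axis": 39.31225982,
}


def scale_across_groups(values: dict) -> dict:
    """Center and scale one gene's values across groups (z-score)."""
    mu = mean(values.values())
    sd = stdev(values.values())  # sample standard deviation (n - 1)
    return {group: (v - mu) / sd for group, v in values.items()}


if __name__ == "__main__":
    for group, z in scale_across_groups(HTH_NORMALIZED).items():
        print(f"{group}: {z:.6f}")
```

The resulting values for the Proximal and Medial subtypes agree with the tabulated 1.461906241 and −0.888707309 to within rounding.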
Table A2. DEGs of fate undetermined cells. The DEG list was generated with a log2FC cutoff of 0.25 and a minimum fraction of cells expressing the DEG of 0.25.
| Gene | p_val | avg_log2FC | pct.1 | pct.2 |
| --- | --- | --- | --- | --- |
| eEF2 | 2.63E-76 | 0.541911833 | 0.949 | 0.968 |
| fabp | 1.12E-11 | 0.468294279 | 0.655 | 0.722 |
| Uba1 | 1.76E-05 | 0.462953007 | 0.446 | 0.655 |
| CG14984 | 0.000107 | 0.459747394 | 0.323 | 0.445 |
| Stip1 | 2.15E-08 | 0.417890064 | 0.375 | 0.582 |
| Karybeta3 | 5.45E-13 | 0.411510198 | 0.293 | 0.469 |
| CG3226 | 0.024048 | 0.409442343 | 0.553 | 0.731 |
| SmD3 | 5.37E-21 | 0.403214155 | 0.733 | 0.868 |
| Ran | 4.25E-44 | 0.394201883 | 0.888 | 0.942 |
| AsnRS | 1.47E-11 | 0.392768286 | 0.36 | 0.569 |
| Nph | 1.83E-39 | 0.388031758 | 0.86 | 0.928 |
| CG9922 | 0.673966 | 0.381062842 | 0.531 | 0.76 |
| UQCR-Q | 1.16E-23 | 0.377240451 | 0.782 | 0.901 |
| SmD2 | 2.31E-16 | 0.375790159 | 0.724 | 0.865 |
| eIF3b | 0.002117 | 0.374134239 | 0.619 | 0.813 |
| CCT2 | 0.275826 | 0.370703182 | 0.539 | 0.763 |
| CG8149 | 5.72E-07 | 0.369303556 | 0.41 | 0.627 |
| COX5A | 2.80E-27 | 0.36834579 | 0.789 | 0.9 |
| CG3760 | 4.05E-06 | 0.3672513 | 0.625 | 0.792 |
| san | 0.000989 | 0.366350781 | 0.443 | 0.663 |
| CCT7 | 0.000414 | 0.36533304 | 0.601 | 0.799 |
| Tudor-SN | 0.001026 | 0.363468674 | 0.489 | 0.701 |
| CG11858 | 0.594245 | 0.362893992 | 0.531 | 0.743 |
| CG11980 | 8.49E-10 | 0.362794172 | 0.382 | 0.598 |
| Fkbp39 | 1.10E-34 | 0.362430772 | 0.837 | 0.917 |
| CG17202 | 0.063868 | 0.362292286 | 0.508 | 0.74 |
| Cyt-c-p | 0.739486 | 0.360850946 | 0.547 | 0.754 |
| Tim9a | 8.62E-05 | 0.360541648 | 0.447 | 0.675 |
| UQCR-14 | 5.71E-09 | 0.35959829 | 0.693 | 0.862 |
| Rpb8 | 0.014119 | 0.3594748 | 0.474 | 0.699 |
| CCT8 | 3.72E-08 | 0.355410141 | 0.642 | 0.81 |
| alphaTub84B | 9.72E-106 | 0.351533668 | 0.999 | 0.996 |
| ATPsynbeta | 5.99E-54 | 0.351328277 | 0.926 | 0.955 |
| betaTub56D | 2.12E-111 | 0.351110198 | 0.999 | 0.997 |
| eEF1delta | 3.35E-36 | 0.348262336 | 0.861 | 0.934 |
| eIF6 | 8.97E-05 | 0.346187749 | 0.437 | 0.661 |
| CG11267 | 4.58E-18 | 0.345982895 | 0.788 | 0.902 |
| Aos1 | 3.11E-05 | 0.345697042 | 0.633 | 0.808 |
| NHP2 | 1.59E-05 | 0.34467569 | 0.66 | 0.835 |
| mod | 1.41E-12 | 0.343973569 | 0.709 | 0.863 |
| CG2021 | 4.20E-12 | 0.343366272 | 0.345 | 0.547 |
| Nurf-38 | 5.99E-18 | 0.339601891 | 0.75 | 0.879 |
| ND-ACP | 9.76E-13 | 0.338718622 | 0.711 | 0.872 |
| CG13994 | 1.07E-15 | 0.337794285 | 0.295 | 0.484 |
| tsr | 1.61E-102 | 0.336728868 | 0.998 | 0.995 |
| ATPsynCF6 | 1.79E-26 | 0.335348459 | 0.818 | 0.923 |
| Nedd8 | 1.67E-12 | 0.334310017 | 0.715 | 0.876 |
| Nap1 | 6.07E-12 | 0.334107702 | 0.773 | 0.895 |
| hoip | 6.18E-06 | 0.333796151 | 0.689 | 0.847 |
| SmB | 1.14E-20 | 0.333628184 | 0.785 | 0.895 |
| Incenp | 2.62E-15 | 0.332481608 | 0.315 | 0.513 |
| CG14210 | 6.88E-10 | 0.332309093 | 0.381 | 0.598 |
| alpha-Spec | 7.27E-22 | 0.331563515 | 0.298 | 0.518 |
| Act5C | 6.76E-84 | 0.330948608 | 0.998 | 0.997 |
| Pfdn5 | 0.000266 | 0.328369305 | 0.65 | 0.83 |
| tsu | 0.026133 | 0.32834557 | 0.602 | 0.804 |
| RanGAP | 1.02E-11 | 0.328311508 | 0.339 | 0.531 |
| CCT1 | 0.001893 | 0.326762196 | 0.621 | 0.814 |
| CRIF | 8.74E-14 | 0.326369065 | 0.313 | 0.499 |
| Ahcy | 0.140383 | 0.325984833 | 0.497 | 0.717 |
| UbcE2M | 2.61E-05 | 0.324931312 | 0.452 | 0.684 |
| LysRS | 1.92E-20 | 0.324481333 | 0.284 | 0.486 |
| Pfdn1 | 0.015668 | 0.322437535 | 0.505 | 0.745 |
| janA | 0.027436 | 0.322236117 | 0.64 | 0.84 |
| Ote | 2.27E-11 | 0.321707361 | 0.335 | 0.517 |
| Rpn9 | 0.001611 | 0.321592929 | 0.626 | 0.817 |
| CG1542 | 1.21E-12 | 0.321248406 | 0.351 | 0.568 |
| COX5B | 1.43E-14 | 0.320658424 | 0.768 | 0.899 |
| eIF3l | 3.71E-06 | 0.320621493 | 0.434 | 0.656 |
| Pen | 3.70E-15 | 0.320548585 | 0.764 | 0.892 |
| nop5 | 3.23E-08 | 0.31963021 | 0.432 | 0.655 |
| HP4 | 0.192532 | 0.319387871 | 0.517 | 0.737 |
| NAT1 | 1.18E-16 | 0.318248392 | 0.38 | 0.613 |
| Rpn1 | 1.24E-10 | 0.318033663 | 0.422 | 0.661 |
| ND-51 | 1.15E-09 | 0.31799648 | 0.41 | 0.652 |
| Txl | 0.013031 | 0.317304773 | 0.484 | 0.704 |
| Prosbeta5 | 8.73E-19 | 0.316609822 | 0.779 | 0.894 |
| wac | 0.050386 | 0.316434419 | 0.484 | 0.703 |
| Hsc70-4 | 5.19E-63 | 0.316015021 | 0.982 | 0.986 |
| me31B | 0.419468 | 0.315758713 | 0.581 | 0.797 |
| GluProRS | 1.96E-09 | 0.315644484 | 0.181 | 0.28 |
| Prp8 | 1.00E-14 | 0.3153532 | 0.221 | 0.363 |
| Set | 1.77E-13 | 0.3136136 | 0.755 | 0.882 |
| Pfdn2 | 7.64E-06 | 0.313432961 | 0.669 | 0.849 |
| SelD | 0.008104 | 0.313112222 | 0.61 | 0.807 |
| HmgD | 3.22E-58 | 0.31212012 | 0.995 | 0.994 |
| Jafrac1 | 1.11E-22 | 0.311403095 | 0.867 | 0.935 |
| Mcm6 | 4.23E-10 | 0.309433046 | 0.267 | 0.404 |
| lost | 0.001046 | 0.309298263 | 0.7 | 0.873 |
| Sem1 | 0.001049 | 0.309268908 | 0.67 | 0.863 |
| smid | 2.02E-16 | 0.308262181 | 0.306 | 0.505 |
| CG15019 | 0.475046 | 0.308257341 | 0.587 | 0.808 |
| Echs1 | 0.000121 | 0.308112475 | 0.662 | 0.836 |
| CG34132 | 4.59E-11 | 0.307967688 | 0.386 | 0.61 |
| Rpn10 | 0.557285 | 0.30744727 | 0.577 | 0.784 |
| CCT5 | 1.39E-12 | 0.306858295 | 0.739 | 0.862 |
| baf | 1.48E-09 | 0.30568901 | 0.746 | 0.88 |
| CG16985 | 7.99E-12 | 0.305096172 | 0.231 | 0.368 |
| ND-13B | 0.003083 | 0.304167028 | 0.516 | 0.768 |
| Cdk1 | 0.182413 | 0.3034567 | 0.507 | 0.731 |
| Phb2 | 0.056628 | 0.30342253 | 0.597 | 0.798 |
| La | 0.012425 | 0.30287118 | 0.513 | 0.741 |
| Prp19 | 5.29E-15 | 0.301071904 | 0.342 | 0.558 |
| CG12321 | 4.27E-21 | 0.300344146 | 0.292 | 0.501 |
| ox | 0.214652 | 0.299991914 | 0.533 | 0.769 |
| Prosalpha3 | 5.63E-19 | 0.298876442 | 0.791 | 0.903 |
| SmE | 4.45E-06 | 0.298829465 | 0.694 | 0.867 |
| CG3594 | 2.93E-15 | 0.298465681 | 0.286 | 0.471 |
| Pomp | 5.66E-19 | 0.298305603 | 0.813 | 0.907 |
| CG5903 | 0.6842 | 0.297434191 | 0.571 | 0.795 |
| Acbp1 | 0.059744 | 0.297149608 | 0.557 | 0.768 |
| CG6523 | 0.022022 | 0.296185019 | 0.602 | 0.808 |
| aurB | 3.70E-05 | 0.295450632 | 0.4 | 0.586 |
| ND-SGDH | 0.175992 | 0.295209903 | 0.581 | 0.802 |
| mtSSB | 1.81E-10 | 0.294435499 | 0.409 | 0.639 |
| Rpb5 | 1.51E-06 | 0.293998942 | 0.444 | 0.68 |
| CG2862 | 8.93E-22 | 0.293161476 | 0.83 | 0.931 |
| Sgt | 0.056897 | 0.292139314 | 0.537 | 0.763 |
| pch2 | 9.67E-12 | 0.292049575 | 0.399 | 0.631 |
| CG12848 | 7.97E-15 | 0.292017665 | 0.334 | 0.547 |
| His2Av | 2.63E-27 | 0.291847689 | 0.915 | 0.954 |
| Prosalpha7 | 0.000855 | 0.290548841 | 0.652 | 0.843 |
| ATPsynD | 1.62E-27 | 0.289427165 | 0.868 | 0.941 |
| CCT3 | 4.81E-05 | 0.289356898 | 0.669 | 0.852 |
| tko | 0.060424 | 0.288967965 | 0.515 | 0.744 |
| ND-19 | 6.71E-06 | 0.288617644 | 0.468 | 0.717 |
| CG9752 | 1.15E-05 | 0.288354586 | 0.316 | 0.44 |
| mge | 2.15E-08 | 0.288185938 | 0.449 | 0.704 |
| Srp19 | 8.31E-13 | 0.287931596 | 0.401 | 0.64 |
| Chc | 3.98E-16 | 0.287462202 | 0.377 | 0.612 |
| eIF3j | 6.29E-20 | 0.287256028 | 0.367 | 0.623 |
| CG14543 | 1.82E-16 | 0.286760991 | 0.276 | 0.458 |
| CCT6 | 0.088352 | 0.286307611 | 0.598 | 0.809 |
| Mdh2 | 2.50E-12 | 0.285312875 | 0.39 | 0.639 |
| bonsai | 2.32E-18 | 0.284517765 | 0.311 | 0.53 |
| feo | 2.85E-14 | 0.28445883 | 0.342 | 0.551 |
| CG5355 | 7.68E-19 | 0.284380697 | 0.371 | 0.611 |
| GlyRS | 6.75E-13 | 0.283564199 | 0.405 | 0.644 |
| CG9205 | 2.12E-11 | 0.28348611 | 0.394 | 0.623 |
| Rrp40 | 5.61E-18 | 0.282892518 | 0.274 | 0.461 |
| l(1)G0004 | 1.37E-17 | 0.282407454 | 0.303 | 0.511 |
| Non2 | 9.28E-23 | 0.282276666 | 0.856 | 0.941 |
| CG10576 | 3.19E-07 | 0.281613969 | 0.695 | 0.85 |
| CG6617 | 1.29E-11 | 0.281332974 | 0.408 | 0.653 |
| CCT4 | 3.04E-07 | 0.280798696 | 0.7 | 0.856 |
| Tom7 | 1.14E-06 | 0.280078089 | 0.706 | 0.878 |
| Pfdn6 | 0.957133 | 0.279903383 | 0.564 | 0.792 |
| Uch | 1.44E-06 | 0.279613571 | 0.492 | 0.738 |
| CG4866 | 1.67E-13 | 0.279468463 | 0.341 | 0.555 |
| roh | 9.19E-20 | 0.279380844 | 0.816 | 0.916 |
| Rpn13 | 0.12711 | 0.279322366 | 0.543 | 0.772 |
| alphaTub84D | 3.38E-07 | 0.279114348 | 0.666 | 0.844 |
| Nlp | 3.51E-44 | 0.278341613 | 0.955 | 0.978 |
| fzy | 1.92E-16 | 0.277611464 | 0.293 | 0.481 |
| Nmt | 1.32E-15 | 0.277587163 | 0.364 | 0.6 |
| CG34200 | 2.80E-19 | 0.277548087 | 0.342 | 0.573 |
| CG3420 | 6.50E-15 | 0.2771143 | 0.362 | 0.589 |
| Cdc37 | 9.32E-12 | 0.276940009 | 0.45 | 0.704 |
| CG10038 | 6.04E-16 | 0.276910568 | 0.308 | 0.506 |
| CG10638 | 8.48E-05 | 0.276901373 | 0.466 | 0.701 |
| Tim23 | 0.098321 | 0.27545809 | 0.536 | 0.772 |
| Rpn11 | 0.777644 | 0.274812562 | 0.548 | 0.771 |
| CG5515 | 8.66E-09 | 0.274683335 | 0.434 | 0.676 |
| Pcd | 9.01E-12 | 0.274174524 | 0.433 | 0.671 |
| DENR | 7.60E-11 | 0.27411005 | 0.418 | 0.67 |
| CG8635 | 4.18E-08 | 0.274072152 | 0.449 | 0.698 |
| Rpn12 | 0.021678 | 0.273993176 | 0.513 | 0.745 |
| pont | 1.01E-08 | 0.273709261 | 0.417 | 0.654 |
| Grx1 | 6.78E-20 | 0.273641436 | 0.344 | 0.58 |
| Gart | 6.00E-18 | 0.27314057 | 0.297 | 0.5 |
| Nacalpha | 6.76E-49 | 0.272817196 | 0.981 | 0.988 |
| GstO2 | 4.65E-08 | 0.272726982 | 0.441 | 0.685 |
| CG14817 | 6.98E-14 | 0.272320839 | 0.377 | 0.598 |
| cype | 7.38E-26 | 0.2714151 | 0.857 | 0.948 |
| COX7A | 2.08E-20 | 0.270870929 | 0.858 | 0.936 |
| Rpn8 | 5.05E-07 | 0.270666209 | 0.468 | 0.725 |
| Prosalpha2 | 1.35E-06 | 0.27064609 | 0.708 | 0.865 |
| Prosalpha5 | 4.93E-09 | 0.270487893 | 0.705 | 0.86 |
| Rpn7 | 0.494035 | 0.27046284 | 0.551 | 0.765 |
| Ski6 | 1.25E-18 | 0.270029048 | 0.269 | 0.461 |
| Scsalpha1 | 0.33089 | 0.269419769 | 0.568 | 0.784 |
| COX8 | 2.60E-14 | 0.269389489 | 0.808 | 0.923 |
| CG7630 | 1.15E-13 | 0.268873423 | 0.803 | 0.922 |
| Arp1 | 2.34E-15 | 0.268500605 | 0.411 | 0.662 |
| Prosbeta7 | 0.700295 | 0.268425349 | 0.597 | 0.804 |
| eIF3c | 6.27E-06 | 0.268191638 | 0.509 | 0.751 |
| CG6937 | 1.25E-14 | 0.267816225 | 0.356 | 0.588 |
| Rae1 | 8.51E-20 | 0.267383088 | 0.301 | 0.509 |
| Art1 | 0.001476 | 0.266939135 | 0.488 | 0.72 |
| Roc1a | 6.33E-16 | 0.266631796 | 0.801 | 0.918 |
| CG17059 | 2.17E-09 | 0.266613285 | 0.453 | 0.708 |
| Non3 | 4.49E-17 | 0.266495993 | 0.339 | 0.564 |
| ncd | 4.47E-16 | 0.264327838 | 0.277 | 0.447 |
| CG4511 | 1.92E-16 | 0.264239389 | 0.394 | 0.649 |
| Updo | 2.45E-20 | 0.264041525 | 0.345 | 0.591 |
| CG1598 | 1.46E-17 | 0.263529104 | 0.362 | 0.604 |
| CG1789 | 3.81E-22 | 0.263440367 | 0.319 | 0.548 |
| ATPsynE | 8.47E-21 | 0.263433084 | 0.848 | 0.943 |
| Nop56 | 0.013911 | 0.262395693 | 0.655 | 0.841 |
| Prosbeta6 | 0.000207 | 0.262237698 | 0.678 | 0.857 |
| COX4 | 1.30E-19 | 0.262022839 | 0.844 | 0.936 |
| CG13364 | 5.33E-09 | 0.261858812 | 0.768 | 0.914 |
| Rpt5 | 1.59E-07 | 0.26115752 | 0.467 | 0.709 |
| CG9667 | 8.35E-18 | 0.261015475 | 0.269 | 0.454 |
| ATPsynB | 1.97E-20 | 0.261004129 | 0.854 | 0.934 |
| ND-13A | 1.30E-15 | 0.260501665 | 0.389 | 0.632 |
| Ssb-c31a | 0.428003 | 0.260444449 | 0.609 | 0.813 |
| eEF1beta | 2.95E-47 | 0.260441246 | 0.986 | 0.989 |
| RnrL | 3.99E-14 | 0.25996256 | 0.231 | 0.365 |
| ND-B15 | 0.00029 | 0.259827447 | 0.501 | 0.742 |
| p23 | 3.03E-21 | 0.259045322 | 0.88 | 0.947 |
| RFeSP | 0.446974 | 0.258835685 | 0.601 | 0.82 |
| AIMP2 | 1.69E-19 | 0.258671195 | 0.332 | 0.569 |
| CG17776 | 1.54E-13 | 0.257691639 | 0.398 | 0.638 |
| Usp5 | 4.82E-22 | 0.257589507 | 0.232 | 0.413 |
| Rpn3 | 9.87E-05 | 0.257434516 | 0.487 | 0.726 |
| blw | 1.28E-26 | 0.257118864 | 0.883 | 0.946 |
| Rpn5 | 0.209107 | 0.256465518 | 0.572 | 0.8 |
| pAbp | 3.06E-12 | 0.256184891 | 0.972 | 0.994 |
| ssx | 6.04E-19 | 0.256144854 | 0.193 | 0.338 |
| CHORD | 3.20E-17 | 0.255801888 | 0.191 | 0.33 |
| Cbs | 1.29E-15 | 0.255282574 | 0.308 | 0.498 |
| Vha14-1 | 0.000131 | 0.255064003 | 0.497 | 0.74 |
| CG9643 | 2.04E-22 | 0.254773967 | 0.286 | 0.495 |
| Fib | 2.20E-09 | 0.254765817 | 0.444 | 0.685 |
| endos | 0.001143 | 0.2546236 | 0.677 | 0.86 |
| msd5 | 3.63E-13 | 0.254565067 | 0.268 | 0.423 |
| Prosbeta3 | 8.21E-15 | 0.254528661 | 0.785 | 0.904 |
| cl | 0.010449 | 0.254513723 | 0.679 | 0.86 |
| CG11444 | 0.171588 | 0.25436775 | 0.559 | 0.779 |
| CG8891 | 8.66E-09 | 0.254175553 | 0.443 | 0.691 |
| icln | 0.000109 | 0.25393175 | 0.492 | 0.748 |
| mAcon1 | 2.12E-21 | 0.253581166 | 0.295 | 0.51 |
| SmF | 0.001047 | 0.253472919 | 0.677 | 0.866 |
| porin | 1.68E-18 | 0.252948112 | 0.846 | 0.943 |
| Nup50 | 1.92E-19 | 0.252683608 | 0.259 | 0.443 |
| dpa | 1.96E-08 | 0.252625619 | 0.22 | 0.322 |
| thoc6 | 1.31E-18 | 0.252359096 | 0.296 | 0.498 |
| RpA-70 | 4.90E-19 | 0.252331257 | 0.243 | 0.413 |
| CG9705 | 0.019547 | 0.252183496 | 0.676 | 0.873 |
| Shmt | 1.73E-06 | 0.251894976 | 0.455 | 0.699 |
| alien | 3.33E-15 | 0.251507249 | 0.385 | 0.636 |
| Arl2 | 1.16E-21 | 0.251458627 | 0.256 | 0.445 |
| SerRS | 0.076293 | 0.250933288 | 0.547 | 0.774 |
| CG1943 | 2.38E-10 | 0.250821203 | 0.795 | 0.895 |
| ND-B14.5B | 5.78E-05 | 0.250452901 | 0.496 | 0.755 |
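The two cutoffs in the Table A2 caption can be expressed as a simple filter. The sketch below assumes the minimum-fraction cutoff is met when either group (pct.1 or pct.2) reaches 0.25, an interpretation suggested by rows such as GluProRS (pct.1 of 0.181, pct.2 of 0.28) but not stated in the text. The class and function names, and the two rows marked hypothetical, are invented for illustration.

```python
from typing import NamedTuple


class Deg(NamedTuple):
    gene: str
    avg_log2fc: float
    pct_1: float  # fraction of cells expressing the gene in group 1
    pct_2: float  # fraction of cells expressing the gene in group 2


def passes_cutoffs(row: Deg, min_log2fc: float = 0.25, min_pct: float = 0.25) -> bool:
    """Keep a gene if its log2FC and detection fraction reach the cutoffs.

    Assumption: the detection cutoff applies to the larger of the two
    group fractions, not to both.
    """
    return row.avg_log2fc >= min_log2fc and max(row.pct_1, row.pct_2) >= min_pct


ROWS = [
    Deg("eEF2", 0.541911833, 0.949, 0.968),        # from Table A2
    Deg("GluProRS", 0.315644484, 0.181, 0.28),     # from Table A2
    Deg("ND-B14.5B", 0.250452901, 0.496, 0.755),   # from Table A2
    Deg("hypothetical_low_fc", 0.10, 0.90, 0.90),  # invented: fails log2FC
    Deg("hypothetical_rare", 0.40, 0.10, 0.05),    # invented: fails min pct
]

kept = [row.gene for row in ROWS if passes_cutoffs(row)]

if __name__ == "__main__":
    print(kept)
```

Note that the filter does not use p_val, which matches the table: several listed genes have p-values well above 0.05.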
Table A3. Gene ontology analysis of the fate undetermined DEGs indicated enrichment in metabolic processes, DNA replication, RNA splicing and translation initiation suggestive of highly active, premature, growing cells.
Analysis Type: PANTHER Overrepresentation Test (Released on 2 February 2022)
Annotation Version and Release Date: GO Ontology database DOI: 10.5281/zenodo.6399963 Released on 22 March 2022
Analyzed List: upload_1 (Drosophila melanogaster)
Reference List: Drosophila melanogaster (all genes in database)
Test Type: FISHER
Correction: FDR

| GO biological process complete | Drosophila melanogaster - REFLIST (13821) | upload_1 (205) | upload_1 (expected) | upload_1 (over/under) | upload_1 (fold Enrichment) | upload_1 (raw P-value) | upload_1 (FDR) |
| --- | --- | --- | --- | --- | --- | --- | --- |
| mitochondrial electron transport, ubiquinol to cytochrome c (GO:0006122) | 13 | 5 | 0.19 | + | 25.93 | 4.66E-06 | 4.54E-04 |
| DNA unwinding involved in DNA replication (GO:0006268) | 11 | 4 | 0.16 | + | 24.52 | 5.33E-05 | 4.03E-03 |
| spliceosomal snRNP assembly (GO:0000387) | 18 | 6 | 0.27 | + | 22.47 | 9.79E-07 | 1.12E-04 |
| proton motive force-driven ATP synthesis (GO:0015986) | 21 | 6 | 0.31 | + | 19.26 | 2.08E-06 | 2.25E-04 |
| mitochondrial electron transport, cytochrome c to oxygen (GO:0006123) | 15 | 4 | 0.22 | + | 17.98 | 1.45E-04 | 9.39E-03 |
| ATP biosynthetic process (GO:0006754) | 23 | 6 | 0.34 | + | 17.59 | 3.25E-06 | 3.38E-04 |
| purine ribonucleoside triphosphate biosynthetic process (GO:0009206) | 27 | 6 | 0.4 | + | 14.98 | 7.22E-06 | 6.54E-04 |
| purine ribonucleoside triphosphate metabolic process (GO:0009205) | 27 | 6 | 0.4 | + | 14.98 | 7.22E-06 | 6.47E-04 |
| purine nucleoside triphosphate biosynthetic process (GO:0009145) | 27 | 6 | 0.4 | + | 14.98 | 7.22E-06 | 6.39E-04 |
| aerobic electron transport chain (GO:0019646) | 54 | 12 | 0.8 | + | 14.98 | 1.70E-10 | 4.56E-08 |
| purine nucleoside triphosphate metabolic process (GO:0009144) | 28 | 6 | 0.42 | + | 14.45 | 8.66E-06 | 7.42E-04 |
| mitochondrial electron transport, NADH to ubiquinone (GO:0006120) | 19 | 4 | 0.28 | + | 14.19 | 3.16E-04 | 1.89E-02 |
| mitochondrial ATP synthesis coupled electron transport (GO:0042775) | 58 | 12 | 0.86 | + | 13.95 | 3.48E-10 | 7.34E-08 |
| ribonucleoside triphosphate biosynthetic process (GO:0009201) | 30 | 6 | 0.44 | + | 13.48 | 1.22E-05 | 1.03E-03 |
| ribonucleoside triphosphate metabolic process (GO:0009199) | 30 | 6 | 0.44 | + | 13.48 | 1.22E-05 | 1.01E-03 |
| nucleoside triphosphate metabolic process (GO:0009141) | 36 | 7 | 0.53 | + | 13.11 | 2.65E-06 | 2.83E-04 |
| ATP synthesis coupled electron transport (GO:0042773) | 62 | 12 | 0.92 | + | 13.05 | 6.83E-10 | 1.40E-07 |
| respiratory electron transport chain (GO:0022904) | 69 | 13 | 1.02 | + | 12.7 | 1.75E-10 | 4.54E-08 |
| oxidative phosphorylation (GO:0006119) | 69 | 13 | 1.02 | + | 12.7 | 1.75E-10 | 4.39E-08 |
| electron transport chain (GO:0022900) | 76 | 14 | 1.13 | + | 12.42 | 4.43E-11 | 1.38E-08 |
| nucleoside triphosphate biosynthetic process (GO:0009142) | 33 | 6 | 0.49 | + | 12.26 | 1.98E-05 | 1.60E-03 |
| germarium-derived female germ-line cyst formation (GO:0030727) | 23 | 4 | 0.34 | + | 11.73 | 5.97E-04 | 3.35E-02 |
| female germ-line cyst formation (GO:0048135) | 24 | 4 | 0.36 | + | 11.24 | 6.89E-04 | 3.75E-02 |
| DNA duplex unwinding (GO:0032508) | 25 | 4 | 0.37 | + | 10.79 | 7.90E-04 | 4.16E-02 |
| aerobic respiration (GO:0009060) | 101 | 16 | 1.5 | + | 10.68 | 1.33E-11 | 4.70E-09 |
| cellular respiration (GO:0045333) | 111 | 17 | 1.65 | + | 10.33 | 4.82E-12 | 1.88E-09 |
| ATP metabolic process (GO:0046034) | 118 | 18 | 1.75 | + | 10.28 | 1.18E-12 | 5.73E-10 |
| mitochondrial respiratory chain complex I assembly (GO:0032981) | 36 | 5 | 0.53 | + | 9.36 | 3.10E-04 | 1.90E-02 |
| NADH dehydrogenase complex assembly (GO:0010257) | 36 | 5 | 0.53 | + | 9.36 | 3.10E-04 | 1.89E-02 |
| translational initiation (GO:0006413) | 53 | 7 | 0.79 | + | 8.9 | 2.58E-05 | 2.05E-03 |
| tRNA aminoacylation for protein translation (GO:0006418) | 38 | 5 | 0.56 | + | 8.87 | 3.89E-04 | 2.30E-02 |
| energy derivation by oxidation of organic compounds (GO:0015980) | 135 | 17 | 2 | + | 8.49 | 7.90E-11 | 2.28E-08 |
| tRNA aminoacylation (GO:0043039) | 41 | 5 | 0.61 | + | 8.22 | 5.35E-04 | 3.04E-02 |
| amino acid activation (GO:0043038) | 43 | 5 | 0.64 | + | 7.84 | 6.53E-04 | 3.61E-02 |
| protein folding (GO:0006457) | 132 | 15 | 1.96 | + | 7.66 | 3.85E-09 | 6.82E-07 |
| proteasome-mediated ubiquitin-dependent protein catabolic process (GO:0043161) | 217 | 24 | 3.22 | + | 7.46 | 1.08E-13 | 1.20E-10 |
| proteasomal protein catabolic process (GO:0010498) | 229 | 24 | 3.4 | + | 7.07 | 3.16E-13 | 2.74E-10 |
| generation of precursor metabolites and energy (GO:0006091) | 183 | 18 | 2.71 | + | 6.63 | 8.59E-10 | 1.67E-07 |
| centrosome cycle (GO:0007098) | 72 | 7 | 1.07 | + | 6.55 | 1.53E-04 | 9.85E-03 |
| ribonucleoprotein complex assembly (GO:0022618) | 117 | 11 | 1.74 | + | 6.34 | 2.72E-06 | 2.86E-04 |
| modification-dependent macromolecule catabolic process (GO:0043632) | 318 | 29 | 4.72 | + | 6.15 | 2.43E-14 | 4.74E-11 |
| microtubule organizing center organization (GO:0031023) | 77 | 7 | 1.14 | + | 6.13 | 2.25E-04 | 1.41E-02 |
| ribonucleoprotein complex subunit organization (GO:0071826) | 122 | 11 | 1.81 | + | 6.08 | 3.96E-06 | 3.90E-04 |
| modification-dependent protein catabolic process (GO:0019941) | 311 | 28 | 4.61 | + | 6.07 | 9.51E-14 | 1.23E-10 |
| ubiquitin-dependent protein catabolic process (GO:0006511) | 307 | 27 | 4.55 | + | 5.93 | 4.57E-13 | 2.97E-10 |
| proteolysis involved in cellular protein catabolic process (GO:0051603) | 332 | 28 | 4.92 | + | 5.69 | 4.22E-13 | 2.99E-10 |
| cellular protein catabolic process (GO:0044257) | 334 | 28 | 4.95 | + | 5.65 | 4.83E-13 | 2.90E-10 |
| microtubule cytoskeleton organization involved in mitosis (GO:1902850) | 85 | 7 | 1.26 | + | 5.55 | 3.94E-04 | 2.31E-02 |
| protein catabolic process (GO:0030163) | 343 | 28 | 5.09 | + | 5.5 | 8.82E-13 | 4.58E-10 |
| nuclear transport (GO:0051169) | 118 | 9 | 1.75 | + | 5.14 | 1.02E-04 | 6.99E-03 |
| nucleocytoplasmic transport (GO:0006913) | 118 | 9 | 1.75 | + | 5.14 | 1.02E-04 | 6.93E-03 |
| establishment of protein localization to membrane (GO:0090150) | 95 | 7 | 1.41 | + | 4.97 | 7.35E-04 | 3.95E-02 |
| purine ribonucleotide biosynthetic process (GO:0009152) | 96 | 7 | 1.42 | + | 4.92 | 7.79E-04 | 4.13E-02 |
| ribonucleoprotein complex biogenesis (GO:0022613) | 288 | 21 | 4.27 | + | 4.92 | 5.00E-09 | 8.65E-07 |
| cellular macromolecule catabolic process (GO:0044265) | 447 | 32 | 6.63 | + | 4.83 | 5.01E-13 | 2.79E-10 |
| purine-containing compound biosynthetic process (GO:0072522) | 112 | 8 | 1.66 | + | 4.82 | 3.77E-04 | 2.24E-02 |
| cell population proliferation (GO:0008283) | 122 | 8 | 1.81 | + | 4.42 | 6.43E-04 | 3.58E-02 |
| rRNA metabolic process (GO:0016072) | 168 | 11 | 2.49 | + | 4.41 | 6.48E-05 | 4.80E-03 |
| mRNA splicing, via spliceosome (GO:0000398) | 214 | 14 | 3.17 | + | 4.41 | 6.56E-06 | 6.08E-04 |
| RNA splicing, via transesterification reactions with bulged adenosine as nucleophile (GO:0000377) | 214 | 14 | 3.17 | + | 4.41 | 6.56E-06 | 6.01E-04 |
| rRNA processing (GO:0006364) | 153 | 10 | 2.27 | + | 4.41 | 1.41E-04 | 9.25E-03 |
| nucleotide biosynthetic process (GO:0009165) | 123 | 8 | 1.82 | + | 4.39 | 6.77E-04 | 3.71E-02 |
| RNA splicing, via transesterification reactions (GO:0000375) | 216 | 14 | 3.2 | + | 4.37 | 7.25E-06 | 6.35E-04 |
| macromolecule catabolic process (GO:0009057) | 494 | 32 | 7.33 | + | 4.37 | 6.16E-12 | 2.29E-09 |
| nucleoside phosphate biosynthetic process (GO:1901293) | 124 | 8 | 1.84 | + | 4.35 | 7.12E-04 | 3.85E-02 |
| spindle organization (GO:0007051) | 125 | 8 | 1.85 | + | 4.31 | 7.48E-04 | 3.99E-02 |
| translation (GO:0006412) | 304 | 19 | 4.51 | + | 4.21 | 2.72E-07 | 3.47E-05 |
| protein-containing complex assembly (GO:0065003) | 433 | 27 | 6.42 | + | 4.2 | 7.19E-10 | 1.44E-07 |
| establishment of protein localization to organelle (GO:0072594) | 179 | 11 | 2.66 | + | 4.14 | 1.11E-04 | 7.43E-03 |
| RNA splicing (GO:0008380) | 228 | 14 | 3.38 | + | 4.14 | 1.29E-05 | 1.06E-03 |
| peptide biosynthetic process (GO:0043043) | 310 | 19 | 4.6 | + | 4.13 | 3.61E-07 | 4.39E-05 |
| meiotic cell cycle (GO:0051321) | 215 | 13 | 3.19 | + | 4.08 | 3.09E-05 | 2.40E-03 |
| mitotic cell cycle (GO:0000278) | 399 | 24 | 5.92 | + | 4.06 | 1.33E-08 | 2.11E-06 |
| meiotic cell cycle process (GO:1903046) | 204 | 12 | 3.03 | + | 3.97 | 8.02E-05 | 5.63E-03 |
| organonitrogen compound catabolic process (GO:1901565) | 514 | 30 | 7.62 | + | 3.93 | 3.34E-10 | 7.64E-08 |
| microtubule cytoskeleton organization (GO:0000226) | 344 | 20 | 5.1 | + | 3.92 | 3.88E-07 | 4.65E-05 |
| nuclear division (GO:0000280) | 246 | 14 | 3.65 | + | 3.84 | 2.88E-05 | 2.27E-03 |
| cell cycle (GO:0007049) | 608 | 34 | 9.02 | + | 3.77 | 5.64E-11 | 1.69E-08 |
| ribosome biogenesis (GO:0042254) | 215 | 12 | 3.19 | + | 3.76 | 1.29E-04 | 8.49E-03 |
| mitochondrion organization (GO:0007005) | 235 | 13 | 3.49 | + | 3.73 | 7.34E-05 | 5.20E-03 |
| amide biosynthetic process (GO:0043604) | 350 | 19 | 5.19 | + | 3.66 | 2.03E-06 | 2.22E-04 |
| organelle fission (GO:0048285) | 259 | 14 | 3.84 | + | 3.64 | 4.92E-05 | 3.75E-03 |
| protein-containing complex organization (GO:0043933) | 527 | 28 | 7.82 | + | 3.58 | 9.82E-09 | 1.59E-06 |
| mRNA metabolic process (GO:0016071) | 344 | 18 | 5.1 | + | 3.53 | 6.19E-06 | 5.81E-04 |
| mRNA processing (GO:0006397) | 268 | 14 | 3.98 | + | 3.52 | 6.98E-05 | 4.99E-03 |
| cell cycle process (GO:0022402) | 513 | 26 | 7.61 | + | 3.42 | 8.53E-08 | 1.17E-05 |
| mitotic cell cycle process (GO:1903047) | 277 | 14 | 4.11 | + | 3.41 | 9.75E-05 | 6.72E-03 |
| chromosome organization (GO:0051276) | 476 | 24 | 7.06 | + | 3.4 | 3.06E-07 | 3.84E-05 |
| spermatogenesis (GO:0007283) | 264 | 13 | 3.92 | + | 3.32 | 2.21E-04 | 1.40E-02 |
| peptide metabolic process (GO:0006518) | 409 | 20 | 6.07 | + | 3.3 | 4.90E-06 | 4.71E-04 |
| protein localization to organelle (GO:0033365) | 249 | 12 | 3.69 | + | 3.25 | 4.63E-04 | 2.67E-02 |
| cellular catabolic process (GO:0044248) | 801 | 38 | 11.88 | + | 3.2 | 3.45E-10 | 7.48E-08 |
| microtubule-based process (GO:0007017) | 464 | 21 | 6.88 | + | 3.05 | 8.79E-06 | 7.44E-04 |
| male gamete generation (GO:0048232) | 310 | 14 | 4.6 | + | 3.04 | 2.97E-04 | 1.84E-02 |
| ncRNA metabolic process (GO:0034660) | 377 | 17 | 5.59 | + | 3.04 | 6.83E-05 | 4.97E-03 |
| cellular nitrogen compound biosynthetic process (GO:0044271) | 711 | 32 | 10.55 | + | 3.03 | 3.45E-08 | 5.07E-06 |
| regulation of catabolic process (GO:0009894) | 291 | 13 | 4.32 | + | 3.01 | 5.39E-04 | 3.04E-02 |
| RNA processing (GO:0006396) | 560 | 25 | 8.31 | + | 3.01 | 1.46E-06 | 1.63E-04 |
| organic substance catabolic process (GO:1901575) | 823 | 36 | 12.21 | + | 2.95 | 8.56E-09 | 1.42E-06 |
| intracellular protein transport (GO:0006886) | 328 | 14 | 4.87 | + | 2.88 | 5.12E-04 | 2.93E-02 |
| cellular amide metabolic process (GO:0043603) | 493 | 21 | 7.31 | + | 2.87 | 2.10E-05 | 1.69E-03 |
| female gamete generation (GO:0007292) | 628 | 26 | 9.31 | + | 2.79 | 3.36E-06 | 3.45E-04 |
| gene expression (GO:0010467) | 1120 | 46 | 16.61 | + | 2.77 | 3.36E-10 | 7.48E-08 |
| catabolic process (GO:0009056) | 932 | 38 | 13.82 | + | 2.75 | 1.91E-08 | 2.92E-06 |
| cellular macromolecule biosynthetic process (GO:0034645) | 496 | 20 | 7.36 | + | 2.72 | 6.96E-05 | 5.02E-03 |
| sexual reproduction (GO:0019953) | 931 | 37 | 13.81 | + | 2.68 | 5.75E-08 | 8.30E-06 |
| germ cell development (GO:0007281) | 713 | 28 | 10.58 | + | 2.65 | 3.63E-06 | 3.63E-04 |
| cellular component biogenesis (GO:0044085) | 1285 | 50 | 19.06 | + | 2.62 | 2.90E-10 | 6.85E-08 |
| organonitrogen compound biosynthetic process (GO:1901566) | 800 | 31 | 11.87 | + | 2.61 | 1.34E-06 | 1.52E-04 |
| cellular process involved in reproduction in multicellular organism (GO:0022412) | 852 | 33 | 12.64 | + | 2.61 | 5.89E-07 | 6.95E-05 |
| RNA metabolic process (GO:0016070) | 828 | 32 | 12.28 | + | 2.61 | 9.37E-07 | 1.09E-04 |
| gamete generation (GO:0007276) | 915 | 35 | 13.57 | + | 2.58 | 3.38E-07 | 4.18E-05 |
| oogenesis (GO:0048477) | 576 | 22 | 8.54 | + | 2.58 | 6.48E-05 | 4.76E-03 |
| cellular protein metabolic process (GO:0044267) | 1683 | 63 | 24.96 | + | 2.52 | 4.07E-12 | 1.67E-09 |
| cellular component assembly (GO:0022607) | 1109 | 41 | 16.45 | + | 2.49 | 8.82E-08 | 1.18E-05 |
| nucleobase-containing compound metabolic process (GO:0006139) | 1369 | 50 | 20.31 | + | 2.46 | 2.65E-09 | 4.79E-07 |
| cytoskeleton organization (GO:0007010) | 580 | 21 | 8.6 | + | 2.44 | 3.13E-04 | 1.89E-02 |
| nucleic acid metabolic process (GO:0090304) | 1105 | 40 | 16.39 | + | 2.44 | 1.96E-07 | 2.54E-05 |
| reproductive process (GO:0022414) | 1190 | 43 | 17.65 | + | 2.44 | 6.16E-08 | 8.57E-06 |
| developmental process involved in reproduction (GO:0003006) | 892 | 32 | 13.23 | + | 2.42 | 5.10E-06 | 4.84E-04 |
| cellular nitrogen compound metabolic process (GO:0034641) | 1801 | 64 | 26.71 | + | 2.4 | 2.20E-11 | 7.45E-09 |
| heterocycle metabolic process (GO:0046483) | 1451 | 51 | 21.52 | + | 2.37 | 7.34E-09 | 1.24E-06 |
| cellular macromolecule metabolic process (GO:0044260) | 2106 | 74 | 31.24 | + | 2.37 | 3.46E-13 | 2.70E-10 |
| organelle organization (GO:0006996) | 1721 | 60 | 25.53 | + | 2.35 | 2.20E-10 | 5.35E-08 |
| multicellular organismal reproductive process (GO:0048609) | 1041 | 36 | 15.44 | + | 2.33 | 3.50E-06 | 3.54E-04 |
| macromolecule biosynthetic process (GO:0009059) | 696 | 24 | 10.32 | + | 2.32 | 1.57E-04 | 1.00E-02 |
| cellular aromatic compound metabolic process (GO:0006725) | 1495 | 51 | 22.17 | + | 2.3 | 1.55E-08 | 2.42E-06 |
| proteolysis (GO:0006508) | 861 | 29 | 12.77 | + | 2.27 | 5.96E-05 | 4.46E-03 |
| organic cyclic compound metabolic process (GO:1901360) | 1558 | 51 | 23.11 | + | 2.21 | 5.89E-08 | 8.35E-06 |
| cellular metabolic process (GO:0044237) | 3877 | 126 | 57.51 | + | 2.19 | 7.75E-23 | 3.02E-19 |
| cellular component organization or biogenesis (GO:0071840) | 2778 | 88 | 41.2 | + | 2.14 | 2.76E-13 | 2.69E-10 |
| macromolecule localization (GO:0033036) | 874 | 27 | 12.96 | + | 2.08 | 4.25E-04 | 2.47E-02 |
| cellular biosynthetic process (GO:0044249) | 1178 | 36 | 17.47 | + | 2.06 | 4.31E-05 | 3.32E-03 |
| reproduction (GO:0000003) | 1421 | 43 | 21.08 | + | 2.04 | 7.99E-06 | 6.92E-04 |
| cellular component organization (GO:0016043) | 2616 | 79 | 38.8 | + | 2.04 | 8.68E-11 | 2.41E-08 |
| organic substance biosynthetic process (GO:1901576) | 1212 | 36 | 17.98 | + | 2 | 9.41E-05 | 6.55E-03 |
| macromolecule metabolic process (GO:0043170) | 3193 | 94 | 47.36 | + | 1.98 | 1.44E-12 | 6.23E-10 |
| biosynthetic process (GO:0009058) | 1239 | 36 | 18.38 | + | 1.96 | 1.15E-04 | 7.63E-03 |
| nitrogen compound metabolic process (GO:0006807) | 3671 | 106 | 54.45 | + | 1.95 | 3.62E-14 | 5.64E-11 |
| protein metabolic process (GO:0019538) | 2183 | 63 | 32.38 | + | 1.95 | 1.26E-07 | 1.67E-05 |
| multicellular organism reproduction (GO:0032504) | 1275 | 36 | 18.91 | + | 1.9 | 2.28E-04 | 1.42E-02 |
| metabolic process (GO:0008152) | 4548 | 128 | 67.46 | + | 1.9 | 1.15E-17 | 3.00E-14 |
| organonitrogen compound metabolic process (GO:1901564) | 2725 | 75 | 40.42 | + | 1.86 | 2.76E-08 | 4.14E-06 |
| primary metabolic process (GO:0044238) | 4039 | 109 | 59.91 | + | 1.82 | 1.30E-12 | 5.96E-10 |
| cellular process (GO:0009987) | 7306 | 188 | 108.37 | + | 1.73 | 2.61E-33 | 2.03E-29 |
| organic substance metabolic process (GO:0071704) | 4283 | 110 | 63.53 | + | 1.73 | 2.91E-11 | 9.45E-09 |
| biological_process (GO:0008150) | 11314 | 197 | 167.81 | + | 1.17 | 1.87E-09 | 3.47E-07 |
| Unclassified (UNCLASSIFIED) | 2507 | 8 | 37.19 | − | 0.22 | 1.87E-09 | 3.55E-07 |
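The expected and fold-enrichment columns of Table A3 follow from the list sizes given in the column headers (13821 reference genes, 205 uploaded genes). The sketch below reproduces them for two rows. It covers only this arithmetic, not the Fisher test or FDR correction, and the function names are illustrative.

```python
REFERENCE_GENES = 13821  # size of the reference list (Table A3 header)
UPLOADED_GENES = 205     # size of the analyzed DEG list (Table A3 header)


def expected_count(ref_in_term: int) -> float:
    """Genes expected in the uploaded list if the term were not enriched."""
    return ref_in_term * UPLOADED_GENES / REFERENCE_GENES


def fold_enrichment(ref_in_term: int, observed: int) -> float:
    """Observed count divided by the (unrounded) expected count."""
    return observed / expected_count(ref_in_term)


if __name__ == "__main__":
    # aerobic electron transport chain (GO:0019646): 54 reference, 12 observed
    print(round(expected_count(54), 2), round(fold_enrichment(54, 12), 2))
    # mitochondrial electron transport, ubiquinol to cytochrome c (GO:0006122)
    print(round(expected_count(13), 2), round(fold_enrichment(13, 5), 2))
```

The fold values match the table only when the unrounded expected count is used: dividing 12 by the rounded 0.8 would give 15.0 rather than the tabulated 14.98.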

References

  1. Schmitz, S.U.; Grote, P.; Herrmann, B.G. Mechanisms of long noncoding RNA function in development and disease. Cell. Mol. Life Sci. 2016, 73, 2491–2509. [Google Scholar] [CrossRef] [PubMed] [Green Version]
  2. Laurent, G.S.; Wahlestedt, C.; Kapranov, P. The Landscape of long non-coding RNA classification: The non-coding RNA universe. Trends Genet. 2016, 31, 239–251. [Google Scholar] [CrossRef] [PubMed] [Green Version]
  3. Sarropoulos, I.; Marin, R.; Cardoso-Moreira, M.; Kaessmann, H. Developmental dynamics of lncRNAs across mammalian organs and species. Nature 2019, 571, 510–514. [Google Scholar] [CrossRef] [PubMed]
  4. Esteller, M. Non-coding RNAs in human disease. Nat. Rev. Genet. 2011, 12, 861–874. [Google Scholar] [CrossRef]
  5. Adams, M.D.; Celniker, S.E.; Holt, R.A.; Evans, C.A.; Gocayne, J.D.; Amanatides, P.G.; Scherer, S.E.; Li, P.W.; Hoskins, R.A.; Galle, R.F.; et al. The genome sequence of Drosophila melanogaster. Science 2000, 287, 2185–2195. [Google Scholar] [CrossRef] [Green Version]
  6. Pandey, U.B.; Nichols, C.D. Human Disease Models in Drosophila melanogaster and the Role of the Fly in Therapeutic Drug Discovery. Pharmacol. Rev. 2011, 63, 411–436. [Google Scholar] [CrossRef] [PubMed] [Green Version]
  7. Beira, J.V.; Paro, R. The legacy of Drosophila imaginal discs. Chromosoma 2016, 125, 573–592. [Google Scholar] [CrossRef] [Green Version]
  8. Aldridge, S.; Teichmann, S.A. Single cell transcriptomics comes of age. Nat. Commun. 2020, 11, 4307. [Google Scholar] [CrossRef]
  9. Tang, F.; Barbacioru, C.; Wang, Y.; Nordman, E.; Lee, C.; Xu, N.; Wang, X.; Bodeau, J.; Tuch, B.B.; Siddiqui, A.A.; et al. mRNA-Seq whole-transcriptome analysis of a single cell. Nat. Methods 2009, 6, 377–382. [Google Scholar] [CrossRef]
  10. Hwang, B.; Lee, J.H.; Bang, D. Single-cell RNA sequencing technologies and bioinformatics pipelines. Exp. Mol. Med. 2018, 50, 96. [Google Scholar] [CrossRef] [Green Version]
  11. Chen, G.; Ning, B.; Shi, T. Single-Cell RNA-Seq Technologies and Related Computational Data Analysis. Front. Genet. 2019, 10, 317. [Google Scholar] [CrossRef] [PubMed]
  12. Svensson, V.; da Veiga Beltrame, E.; Pachter, L. A curated database reveals trends in single-cell transcriptomics. Database 2020, 2020, baaa073. [Google Scholar] [CrossRef]
  13. Klein, A.M.; Treutlein, B. Single cell analyses of development in the modern era. Development 2019, 146, dev181396. [Google Scholar] [CrossRef] [PubMed] [Green Version]
  14. Ariss, M.M.; Islam, A.B.M.M.K.; Critcher, M.; Zappia, M.P.; Frolov, M.V. Single cell RNA-sequencing identifies a metabolic aspect of apoptosis in Rbf mutant. Nat. Commun. 2018, 9, 5024. [Google Scholar] [CrossRef] [PubMed]
  15. Bageritz, J.; Willnow, P.; Valentini, E.; Leible, S.; Boutros, M.; Teleman, A.A. Gene expression atlas of a developing tissue by single cell expression correlation analysis. Nat. Methods 2019, 16, 750–756. [Google Scholar] [CrossRef]
  16. Deng, M.; Wang, Y.; Zhang, L.; Yang, Y.; Huang, S.; Wang, J.; Ge, H.; Ishibashi, T.; Yan, Y. Single cell transcriptomic landscapes of pattern formation, proliferation and growth in Drosophila wing imaginal discs. Development 2019, 146, dev179754. [Google Scholar] [CrossRef] [Green Version]
  17. Zappia, M.P.; de Castro, L.; Ariss, M.M.; Jefferson, H.; Islam, A.B.; Frolov, M.V. A cell atlas of adult muscle precursors uncovers early events in fibre-type divergence in Drosophila. EMBO Rep. 2020, 21, e49555. [Google Scholar] [CrossRef]
  18. González-Blas, C.B.; Quan, X.J.; Duran-Romaña, R.; Taskiran, I.I.; Koldere, D.; Davie, K.; Christiaens, V.; Makhzami, S.; Hulselmans, G.; de Waegeneer, M.; et al. Identification of genomic enhancers through spatial integration of single-cell transcriptomics and epigenomics. Mol. Syst. Biol. 2020, 16, e9438. [Google Scholar] [CrossRef]
  19. Everetts, N.J.; Worley, M.I.; Yasutomi, R.; Yosef, N.; Hariharan, I.K. Single-cell transcriptomics of the Drosophila wing disc reveals instructive epithelium-to-myoblast interactions. Elife 2021, 10, e61276. [Google Scholar] [CrossRef]
  20. Li, H.; Janssens, J.; De Waegeneer, M.; Kolluru, S.S.; Davie, K.; Gardeux, V.; Saelens, W.; David, F.P.A.; Brbić, M.; Spanier, K.; et al. Fly Cell Atlas: A single-nucleus transcriptomic atlas of the adult fruit fly. Science 2022, 375, 6584. [Google Scholar] [CrossRef]
  21. Worley, M.I.; Everetts, N.J.; Yasutomi, R.; Yosef, N.; Hariharan, I.K. Critical genetic program for Drosophila imaginal disc regeneration revealed by single-cell analysis. BioRxiv 2021. [Google Scholar] [CrossRef]
  22. Estella, C.; Voutev, R.; Mann, R.S. A Dynamic Network of Morphogens and Transcription Factors Patterns the Fly Leg. Curr. Top. Dev. Biol. 2012, 98, 173–198. [Google Scholar] [PubMed] [Green Version]
  23. Schubiger, G.; Schubiger, M.; Sustar, A. The three leg imaginal discs of Drosophila: “Vive la différence”. Dev. Biol. 2012, 369, 76–90. [Google Scholar] [CrossRef] [Green Version]
  24. Sudarsan, V.; Anant, S.; Guptan, P.; Vijayraghavan, K.; Skaer, H. Myoblast Diversification and Ectodermal Signaling in Drosophila. Dev. Cell 2001, 1, 829–839. [Google Scholar] [CrossRef]
  25. Fraichard, S.; Bougé, A.L.; Kendall, T.; Chauvel, I.; Bouhin, H.; Bunch, T.A. Tenectin is a novel αPS2βPS integrin ligand required for wing morphogenesis and male genital looping in Drosophila. Dev. Biol. 2010, 340, 504–517. [Google Scholar] [CrossRef] [PubMed]
  26. Cattenoz, P.B.; Monticelli, S.; Pavlidaki, A.; Giangrande, A. Toward a Consensus in the Repertoire of Hemocytes Identified in Drosophila. Front. Cell Dev. Biol. 2021, 9, 643712. [Google Scholar] [CrossRef]
  27. Trébuchet, G.; Cattenoz, P.B.; Zsámboki, J.; Mazaud, D.; Siekhaus, D.E.; Fanto, M.; Giangrande, A. The Repo Homeodomain Transcription Factor Suppresses Hematopoiesis in Drosophila and Preserves the Glial Fate. J. Neurosci. 2019, 39, 255. [Google Scholar] [CrossRef] [Green Version]
  28. Zaharieva, E.; Haussmann, I.U.; Bräuer, U.; Soller, M. Concentration and Localization of Coexpressed ELAV/Hu Proteins Control Specificity of mRNA Processing. Mol. Cell. Biol. 2015, 35, 3104–3115. [Google Scholar] [CrossRef] [Green Version]
  29. Morriss, G.R.; Bryantsev, A.L.; Chechenova, M.; LaBeau, E.M.; Lovato, T.L.; Ryan, K.M.; Cripps, R.M. Analysis of Skeletal Muscle Development in Drosophila. Methods Mol. Biol. 2012, 798, 127–152. [Google Scholar]
  30. Taylor, M.V. Comparison of Muscle Development in Drosophila and Vertebrates. In Muscle Development in Drosophila; Springer: New York, NY, USA, 2006. [Google Scholar]
  31. Lee, C.-Y.; Andersen, R.O.; Cabernard, C.; Manning, L.; Tran, K.D.; Lanskey, M.J.; Bashirullah, A.; Doe, C.Q. Drosophila Aurora-A kinase inhibits neuroblast self-renewal by regulating aPKC/Numb cortical polarity and spindle orientation. Genes Dev. 2006, 20, 3474. [Google Scholar] [CrossRef] [Green Version]
  32. Lilly, B.; O’Keefe, D.D.; Thomas, J.B.; Botas, J. The LIM homeodomain protein dLim1 defines a subclass of neurons within the embryonic ventral nerve cord of Drosophila. Mech. Dev. 1999, 88, 195–205. [Google Scholar] [CrossRef]
  33. Sen, S.; Hartmann, B.; Reichert, H.; Rodrigues, V. Expression and function of the empty spiracles gene in olfactory sense organ development of Drosophila melanogaster. Development 2010, 137, 3687–3695. [Google Scholar] [CrossRef] [PubMed] [Green Version]
  34. Avalos, C.B.; Maier, G.L.; Bruggmann, R.M.; Sprecher, S.G. Single cell transcriptome atlas of the Drosophila larval brain. Elife 2019, 8, e50354. [Google Scholar] [CrossRef] [PubMed]
  35. Wagh, D.A.; Rasse, T.M.; Asan, E.; Hofbauer, A.; Schwenkert, I.; Dürrbeck, H.; Buchner, S.; Dabauvalle, M.-C.; Schmidt, M.; Qin, G.; et al. Bruchpilot, a Protein with Homology to ELKS/CAST, Is Required for Structural Integrity and Function of Synaptic Active Zones in Drosophila. Neuron 2006, 49, 833–844. [Google Scholar] [CrossRef] [Green Version]
  36. Lee, C.H.; Herman, T.; Clandinin, T.R.; Lee, R.; Zipursky, S.L. N-Cadherin Regulates Target Specificity in the Drosophila Visual System. Neuron 2001, 30, 437–450. [Google Scholar] [CrossRef] [Green Version]
  37. De Navascués, J.; Modolell, J. The pronotum LIM-HD gene tailup is both a positive and a negative regulator of the proneural genes achaete and scute of Drosophila. Mech. Dev. 2010, 127, 393–406. [Google Scholar] [CrossRef]
  38. Ho, T.Y.; Wu, W.H.; Hung, S.J.; Liu, T.; Lee, Y.M.; Liu, Y.H. Expressional Profiling of Carpet Glia in the Developing Drosophila Eye Reveals Its Molecular Signature of Morphology Regulators. Front. Neurosci. 2019, 13, 244. [Google Scholar] [CrossRef]
  39. Organista, M.F.; de Celis, J.F. The spalt transcription factors regulate cell proliferation, survival and epithelial integrity downstream of the Decapentaplegic signalling pathway. Biol. Open 2013, 2, 37–48. [Google Scholar] [CrossRef] [Green Version]
  40. Brown, J.B.; Boley, N.; Eisman, R.; May, G.E.; Stoiber, M.H.; Duff, M.O.; Booth, B.W.; Wen, J.; Park, S.; Suzuki, A.M.; et al. Diversity and dynamics of the Drosophila transcriptome. Nature 2014, 512, 393–399. [Google Scholar] [CrossRef] [Green Version]
  41. Samuels, T.J.; Järvelin, A.I.; Ish-Horowicz, D.; Davis, I. Imp/IGF2BP levels modulate individual neural stem cell growth and division through myc mRNA stability. Elife 2020, 9, e51529. [Google Scholar] [CrossRef]
  42. Xu, Q.; Song, Z.; Zhu, C.; Tao, C.; Kang, L.; Liu, W.; He, F.; Yan, J.; Sang, T. Systematic comparison of lncRNAs with protein coding mRNAs in population expression and their response to environmental change. BMC Plant Biol. 2017, 17, 42. [Google Scholar] [CrossRef] [PubMed] [Green Version]
  43. Grubbs, N.; Leach, M.; Su, X.; Petrisko, T.; Rosario, J.B. New Components of Drosophila Leg Development Identified through Genome Wide Association Studies. PLoS ONE 2013, 8, e60261. [Google Scholar] [CrossRef] [PubMed] [Green Version]
  44. Dey, B.K.; Zhao, X.L.; Popo-Ola, E.; Campos, A.R. Mutual regulation of the Drosophila disconnected (disco) and Distal-less (Dll) genes contributes to proximal-distal patterning of antenna and leg. Cell Tissue Res. 2009, 338, 227–240. [Google Scholar] [CrossRef] [PubMed]
  45. Hao, Y.; Hao, S.; Andersen-Nissen, E.; Mauck, W.M., 3rd; Zheng, S.; Butler, A.; Lee, M.J.; Wilk, A.J.; Darby, C.; Zager, M.; et al. Integrated analysis of multimodal single-cell data. Cell 2021, 184, 3573–3587.e29. [Google Scholar] [CrossRef]
  46. Granja, J.M.; Corces, M.R.; Pierce, S.E.; Bagdatli, S.T.; Choudhry, H.; Chang, H.Y.; Greenleaf, W.J. ArchR is a scalable software package for integrative single-cell chromatin accessibility analysis. Nat. Genet. 2021, 53, 403–411. [Google Scholar] [CrossRef] [PubMed]
  47. Gaspar, J.M. Improved peak-calling with MACS2. BioRxiv 2018, 496521. [Google Scholar] [CrossRef]
Figure 1. scRNA-seq revealed the major cell types of D. melanogaster third-instar third leg discs. (A) The time course of leg disc dissection. Embryos were allowed to develop until dissection at T1 (121 h AEL), T2 (133 h AEL), or T3 (168 h AEL). (B) The third leg disc was distinguished from the other leg discs as the mid-size disc with a concentric ring-like pattern at its center, within a trio of discs on each side of the larva that also included the wing and haltere discs. (C) Flowchart of the scRNA-seq and scATAC-seq experiments. Dissected leg discs were dissociated into single cells; a portion of the cells was used for scRNA-seq, and nuclei were isolated from the remaining cells for scATAC-seq. Both scRNA-seq and scATAC-seq used the 10× Genomics Chromium Controller, followed by their respective library preparation protocols, sequencing, and data analysis. (D) UMAP visualizations of the scRNA-seq data show that cells from T1, T2, and T3 overlay one another, whereas the four identified cell types are clearly segregated. (E) Dot plot showing the known marker genes of the respective cell types identified in the UMAP visualization. (F) Muscle cell subset of the scRNA-seq data showing the distinction between early and late muscle cells. (G) Dot plot showing the known marker genes of the early and late muscle cells identified in the UMAP visualization. (H) Neuronal cell subset of the scRNA-seq data showing the distinction between early and late neuronal cells. (I) Dot plot showing the known marker genes of the early and late neuronal cells identified in the UMAP visualization. (J) Immune cell subset of the scRNA-seq data showing the distinction between hemocytes (and plasmatocytes) and glia. (K) Dot plot showing the marker genes of hemocytes (and plasmatocytes) and glia identified in the UMAP visualization.
Figure 2. Subclustering of the epithelial cell cluster identified cell subtypes along the PD axis and a distal-specific lncRNA, lncRNA:CR33938. (A) Leg disc epithelium subset of the scRNA-seq data showing the separation of cells along the PD axis of the fly leg. (B) Dot plot showing the known marker genes of the proximal, medial, and distal cells as well as the earlier stem cell-like cells of the PD axis. (C) Heatmap of the top ten most upregulated genes for each cell subtype (subcluster) of the epithelial cell cluster, where black represents known marker genes, blue represents genes of known function that are potential markers, and red represents genes of unknown function. LncRNA:CR33938 was identified as one of the most upregulated genes in the distal cells. (D) Feature plots showing the expression levels of lncRNA:CR33938 in the different epithelial subclusters across T1, T2, and T3. (E) FISH validation of lncRNA:CR33938, identified by scRNA-seq, showing negligible expression at T1, epithelium-wide expression at T2, and mainly distal-specific expression at T3. Scale bar represents 50 µm.
Figure 3. LncRNA:CR33938 is conserved among insects, and its overexpression in Drosophila S2 cells increased the expression levels of genes involved in leg development. (A) lncRNA:CR33938, located on chromosome 3, is conserved within insects. (B) The fraction of conserved bases of lncRNA:CR33938 across insects is greater than 0.8. (C) According to qPCR, overexpression of lncRNA:CR33938 in S2 cells increased the expression of leg development genes, including PD axis genes, the distal leg tarsal gene disco-r, and the medial leg tibial gene dac. There was no effect on proximal leg femur genes. *, ***, and **** denote p-values of less than 0.05, 0.001, and 0.0001, respectively.
Figure 4. scATAC-seq revealed the same major cell types of the Drosophila L3 disc as scRNA-seq. (A) UMAP visualization of the scATAC-seq data, showing that cells from T1, T2, and T3 overlay one another. (B) UMAP visualization showing the clusters identified prior to integration with scRNA-seq. (C) Heatmap showing the proportion of cells of each cluster within each sample (T1, T2, and T3). The color scale represents the cell proportion within each cluster. (D) UMAP visualization of clusters after integration of the scATAC-seq data with the scRNA-seq data. Most cell types and cell subtypes were remapped, including the proximal, medial, and distal cells as well as the muscle, neuronal, immune, and stem cell-like cells of the PD axis. (E) Feature plots showing the known marker genes and the respective cell types and cell subtypes identified in the UMAP visualization after scRNA-seq data integration. (F) Heatmap of important motifs in each cluster. (G) Gene Ontology analysis of the chromatin-accessible distal genes of T2 and T3 relative to those of T1 showed many metabolic processes occurring in early T2 (left), while many chitin-based cuticle development processes occurred in late T3 (right). (H) Genome tracks of distal marker genes (Dll and C15) revealed high co-accessibility in neighboring genes.
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

MDPI and ACS Style

Tse, J.; Li, T.H.; Zhang, J.; Lee, A.C.K.; Lee, I.; Qu, Z.; Lin, X.; Hui, J.; Chan, T.-F. Single-Cell Atlas of the Drosophila Leg Disc Identifies a Long Non-Coding RNA in Late Development. Int. J. Mol. Sci. 2022, 23, 6796. https://doi.org/10.3390/ijms23126796
