Review

Advances in Integrated Multi-omics Analysis for Drug-Target Identification

1 School of Pharmacy, Hangzhou Normal University, Hangzhou 311121, China
2 Key Laboratory of Elemene Class Anti-Cancer Chinese Medicines, Engineering Laboratory of Development and Application of Traditional Chinese Medicines, Collaborative Innovation Center of Traditional Chinese Medicines of Zhejiang Province, Hangzhou Normal University, Hangzhou 311121, China
* Author to whom correspondence should be addressed.
Biomolecules 2024, 14(6), 692; https://doi.org/10.3390/biom14060692
Submission received: 11 May 2024 / Revised: 8 June 2024 / Accepted: 12 June 2024 / Published: 14 June 2024

Abstract

As an essential component of modern drug discovery, drug-target identification is becoming increasingly prominent. Single-omics technologies have been widely used to discover drug targets. However, no single omics level can clearly establish how drugs give rise to complex phenotypes. With the progress of large-scale sequencing and the development of high-throughput technologies, drug-target identification has shifted towards integrated multi-omics techniques, which are gradually replacing traditional single-omics techniques. This review centers on recent advances in integrated multi-omics techniques for target identification, highlights common multi-omics analysis strategies, briefly summarizes the selection of multi-omics analysis tools, and explores the challenges of existing multi-omics analyses as well as the applications of multi-omics technology in drug-target identification.

1. Introduction

In the field of drug development, identifying and validating drug targets is a crucial and challenging aspect. The success of drug development largely depends on two main factors: the efficacy and safety of the drugs. Studying drug targets not only reveals the active ingredients of traditional Chinese medicine and therapeutic targets within organisms but also helps elucidate the mechanisms of drug action and assists in detecting adverse drug reactions. Drugs exert therapeutic effects by interacting specifically with key molecules or cellular structures. These drug targets encompass a range of biomolecules, including proteins, enzymes, cell-membrane receptors, DNA or RNA sequences, and key regulatory factors within cellular signaling pathways or biological processes [1]. The accurate determination of drug targets enables the purposeful design of drugs that have superior therapeutic effects, safety, and applicability. These drugs typically act more precisely on specific sites or mechanisms, thereby demonstrating remarkable curative effects and bringing significant breakthroughs in disease treatment [2,3,4,5]. In the early stages of drug development, target identification primarily relies on biochemical and molecular biology methods, but these methods are often inefficient and time-consuming. With the emergence of high-throughput technologies, machine-learning computational methods have become an efficient approach for target screening. By leveraging vast amounts of biological data, including molecular structures, biological activities, and genomic information, machine-learning models establish mathematical frameworks to predict interactions between compounds and targets, thus altering the landscape of drug discovery [6,7]. Furthermore, the advancement of omics technologies has significantly enhanced drug-target identification [8,9].
Genomic technologies assist in identifying gene mutations associated with diseases, while transcriptomic analysis reveals gene expression patterns and proteomic research elucidates changes in specific proteins under pathological conditions. Metabolomics can provide the most direct evidence for understanding the physiological and pathological processes of disease metabolism and molecular mechanisms. In this review, we start with the common multi-omics co-analysis strategies and introduce the analysis methods commonly used for drug-target identification, including the integration of transcriptome and proteomics, the integration of transcriptome and metabolomics, and the integration of proteomics and metabolomics. Second, we comprehensively summarize the latest advancements in multi-omics technologies for drug-target identification, such as single-cell multi-omics and spatial multi-omics. Additionally, we review commonly used tools and databases in multi-omics technologies. Finally, we discuss the challenges of multi-omics analyses and their future prospects for drug-target identification, with a focus on the integration of transcriptome and proteomics.

2. Development History of Omics Technologies

Over recent decades, continuous technological innovations have led to the emergence of various omics technologies, each presenting different but complementary aspects of biological information. Omics approaches, including genomics, epigenomics, transcriptomics, proteomics, and metabolomics, have been used to generate vast amounts of single-dimensional data (Figure 1). These data help dissect the molecular mechanisms of gene regulation or reveal various aspects of human diseases. Although the objectives of various omics studies differ, they are closely interconnected. According to the central dogma, DNA (genomics) is transcribed into mRNA (transcriptomics), which is then translated into proteins (proteomics). These proteins can catalyze the production of or act on metabolites (metabolomics) and lipids (lipidomics) [10]. However, single-dimensional omics studies or data analyses alone cannot sufficiently explain how different multi-layered biological processes interact and give rise to complex phenotypes [11,12]. The results obtained may be limited by uncertainties related to specificity, selectivity, or biochemical or physiological relevance. Therefore, it is necessary to comprehensively capture and analyze multiple cellular processes to better understand the relationship between biological mechanisms and genotypic–phenotypic correlations.
With the application of high-throughput omics methods for the analysis of biological samples, there is a shift from single-omics-level research to comprehensive studies, transitioning from partial to holistic analyses. Consequently, the emergence of multi-omics integrative analysis is inevitable. Multi-omics is a novel approach and technology for systematically studying biology, integrating and analyzing data from multiple omics levels such as genomics, transcriptomics, proteomics, and metabolomics in an unbiased manner. This systematic approach dissects the mechanisms and phenotypes of complex biological systems [13,14]. Multi-omics integrative data analysis is not merely about stitching data together; it involves an in-depth exploration of biological explanations. By conducting comprehensive studies across multiple levels, including genetics, transcription, proteins, and metabolism, potential relationships and interactions can be discovered from data at different omics levels. Moreover, these findings can be mutually validated to reduce the limitations and false positives of single-omics analyses, thereby obtaining a more comprehensive biological explanation. This approach surpasses the information provided by single-omics analysis, enabling a more systematic study of the functionality and regulatory mechanisms of biological systems. It aids in constructing organismal regulatory networks and provides a deeper understanding of the interactions, regulations, and causal relationships among different levels. Additionally, it identifies key molecules and pathways in biological systems and discovers new biomarkers and therapeutic targets.
Multi-omics integrative analysis provides more comprehensive biological information than single types of omics data. However, these analyses are usually performed at the multicellular (bulk) level, which averages cell signals and can cause minor differences to be overlooked as noise. In highly heterogeneous samples, such as tumor tissues and immune cells, cell heterogeneity information may be lost, complicating the analysis of tumor characteristics from a cellular perspective. With the development of third-generation sequencing technology, single-cell sequencing technology has emerged to better understand cell heterogeneity and functional differences. In recent decades, advancements in high-throughput technologies and multi-omics integrative analysis have driven the emergence and development of single-cell multi-omics technology. Compared to single-cell genomics analysis, single-cell multi-omics analysis provides more insights into cell-type-specific gene regulation. The continuous development of single-cell technology has led to methods that enable the spatial localization of gene expression within tissues, introducing the concept of “spatial multi-omics”. Single-cell multi-omics analysis allows researchers to obtain transcriptomic, epigenomic, and proteomic information from individual cells. However, this process also dissociates samples from their native environments, disrupting the tissue structure and losing crucial information. To address these issues and identify cell types and distributions within complex tumor microenvironments, the spatial positions of cells within tissues must also be determined. Spatial transcriptomics technology, first proposed in 2016, aims to address this challenge. Subsequently, spatial proteomics and other omics technologies have emerged, leading to the development of spatial multi-omics technology [15].

3. Single-Omics Analysis

3.1. Genomics

Genomics was initially proposed by American geneticists in 1986 to explore the composition, structure, function, localization, and editing of the genetic material DNA. The aim is to quantitatively study and analyze all genes within organisms for their biological significance. With the development of next-generation sequencing, genomic technologies can efficiently analyze whole-genome data to discover genes, proteins, and biological pathways related to diseases. For drug-target screening using genomic technologies, DNA-sequencing data from tumor and non-malignant tissue samples are compared to identify differential information distinct from that of the normal organism. These differential genes may lead to the identification of drug targets. Additionally, these genes can be combined with the CRISPR-Cas9 knockout technology to quantitatively screen the identified target genes individually [16,17].
Genomic research currently involves three main areas: structural genomics, functional genomics, and comparative genomics. Structural genomics focuses on analyzing nucleotide sequences using whole-genome sequencing techniques to determine the composition of the genome and the specific positions of genes. Functional genomics entails artificially altering the sequence or expression states of genes within the genome or cells and observing the resulting phenotype to establish the associations between genotype and phenotype, thereby clarifying the functions of genes. Comparative genomics investigates sequence variations by comparing the differences in genome structure and function among different species and their inherent connections [18]. Functional genomics, through the study of gene functions and gene networks, has emerged as a key tool for deciphering the complex composition and diverse effects of human tumors and their microenvironments. Technologies in functional genomics, such as RNA interference [19,20], small interfering RNA [21], short hairpin RNA [22], CRISPR interference, and CRISPR inhibition [22,23], play important roles in drug-target discovery and validation. Despite offering new perspectives for disease diagnosis and treatment, genomics has limitations in predicting changes in protein and metabolic levels in organisms. Increasingly, research reveals that DNA and RNA alone cannot fully illuminate the function and status of proteins [24]. The correlation between protein and mRNA expression levels in mammals is relatively low, approximately 0.40. Additionally, due to the high heterogeneity of tumor cells, the failure of targeted drugs is also a challenge. For instance, mutations identified by gene-sequencing technology are often not “driver mutations”, which leads to unsatisfactory results of targeted drugs. Currently, the number of people benefiting from gene-based targeted therapeutic drugs remains relatively small [25].

3.2. Transcriptomics

The concept of the transcriptome was initially introduced by Charles Auffray in 1999 [20]. Transcriptomics refers to the study of gene transcription and transcriptional regulation at the overall cellular level, specifically exploring the dynamic changes in gene expression from DNA to RNA. Compared to genomic research, transcriptomics technologies exhibit greater complexity and diversity, revealing spatiotemporal differences in gene expression. By analyzing transcriptomic data from tumor and non-malignant tissue samples using high-throughput sequencing technologies, such as RNA sequencing, researchers can uncover distinct gene-expression patterns in tumor cells. This enables the identification of genes significantly upregulated or downregulated in tumor tissues, thereby determining potential drug targets. For instance, in oncology, comparing the transcriptomes of tumor cells and normal cells allows researchers to identify genes specifically overexpressed in cancer, which are often closely related to cancer growth and metastasis and can serve as candidate targets for targeted therapy [26]. Additionally, transcriptomic analysis can be utilized to monitor the effectiveness of drug treatments by analyzing the changes in gene expression before and after treatment, thereby evaluating the mechanism of action and the efficacy of the drugs [27]. Currently, available transcriptomic sequencing technologies include mRNA sequencing [28]. Other transcriptomic sequencing technologies encompass long non-coding RNA (LncRNA) sequencing, circular RNA sequencing, whole transcriptome sequencing, and single-cell transcriptome sequencing [29,30,31].

3.3. Proteomics

Proteomics was initially introduced in 1994 by American scientist Marc Wilkins [32]. Proteomics is the scientific study of the protein complement of a cell, tissue, or organism, encompassing its composition and variations over time [33,34]. Unlike the genome, the proteome can vary with the tissue and even environmental conditions. The process of expressing genes as proteins involves a series of complex post-transcriptional and post-translational regulatory mechanisms, such as mRNA splicing and post-translational modifications (PTMs), which contribute to the increasing complexity and diversity of proteins [35,36]. The majority of known drug targets are proteins, and using proteomics data to identify candidate drug targets can significantly increase the likelihood of drug approval compared to genomics [37]. Hence, identifying drug targets at the protein level is a crucial direction in drug development. The common strategy for target discovery in proteomics follows target-based drug discovery [38,39]. By comparing the expression differences of proteins between the diseased and normal physiological states, potential drug targets can be identified. This principle relies on high-throughput protein detection and quantification technologies, primarily mass spectrometry (MS) and protein affinity purification techniques. Among them, proteomic analysis based on high-throughput MS is commonly used for target screening. By separating, identifying, and quantifying protein samples, differences in protein expression under physiological and pathological conditions can be revealed, and the differential proteins can be treated as potential drug targets. Another proteomic technique is protein affinity purification, such as immunoprecipitation and affinity chromatography, which accurately identifies proteins relevant to drug action through their specific interactions with other biomolecules.
The strategy of target discovery through chemical modification of small molecule probes includes two main methods: compound-centric chemical proteomics [40] and activity-based protein profiling [41]. The label-free affinity chromatography methods include the cellular thermal shift assay [42], thermal proteome profiling [43], drug affinity responsive target stability [44], limited proteolysis mass spectrometry, and stability of proteins from rates of oxidation [45]. Researchers can screen differential proteins from cells using MS techniques and then validate the protein components in these complexes through protein affinity purification, discovering potential drug targets. For instance, in the proteomic analysis of cerebrospinal fluid extracellular vesicles (EVs) from four Alzheimer’s disease (AD) patients and normal controls, 11 significantly differentially expressed EV proteins were identified among 1765 proteins. An enzyme-linked immunosorbent assay was then used to validate one significantly different indicator, CatB, in 136 samples. This led to the discovery that EV-CatB may serve as a candidate biomarker for the pathological staging of AD [46].
In recent years, in addition to traditional proteomics, the application of modified proteomics in life sciences has increased significantly [47]. PTMs of proteins are covalent alterations of amino acid residues that add modification groups or arise through alternative splicing [48,49]. PTMs, including phosphorylation, acetylation, and glycosylation, play crucial roles in regulating cellular physiological functions [50]. Although proteomics technology offers a robust platform for investigating protein complexity within organisms, it remains in an early stage of development. Nevertheless, advancements in high-throughput MS are gradually unveiling the vast potential of proteomics in life sciences and drug development.

3.4. Metabolomics

Metabolomics refers to the qualitative and quantitative measurement of dynamic changes in metabolites within living systems caused by pathological, physiological, or genetic alterations [51]. Unlike studies that solely focus on gene expression or proteomics, which reveal only a portion of cellular behavior, metabolomics depicts the complete physiological state of a biological component or cell at a specific moment. Thus, metabolomics offers direct evidence for understanding physiological and pathological processes, as well as the molecular mechanisms of disease metabolism. Compared to proteomics, metabolomics analysis has the advantages of simplicity and rapidity. Metabolomics involves the qualitative and quantitative analysis of all low molecular weight metabolites (<1500 Da, such as amino acids, sugars, and lipids) within cells or organisms using liquid chromatography–mass spectrometry (LC-MS), gas chromatography–mass spectrometry, or nuclear magnetic resonance (NMR). This approach aims to uncover the types and changes of metabolites and identify their relative relationships with physiological and pathological alterations [52]. Based on research objectives, metabolomics can be divided into untargeted and targeted metabolomics [30].
Untargeted metabolomics is the most commonly used approach in metabolomics applications, aiming to comprehensively detect the entire metabolome of an organism to identify potential biomarkers [53]. It focuses on identifying significant metabolic features that differ between the experimental and control groups and then interprets the discovered metabolites and their metabolic pathways in relation to biological processes. Conversely, targeted metabolomics focuses on specific metabolites as research targets, which can be used for validating biomarkers. Currently, there are four common strategies for metabolite discovery in metabolomics: 1. MS-based: using MS to detect and identify specific metabolic changes in diseased states; 2. sequencing-based: although sequencing is primarily used for genetic information, some techniques can assist in deciphering metabolic pathways; 3. bioinformatics-based: conducting extensive bioinformatics analyses of large metabolomics datasets to identify patterns or rules related to specific diseases, thereby discovering disease targets; and 4. animal-model-based: simulating human disease symptoms in animal models to observe which metabolites or metabolic pathways change, and analyzing these differential signals to identify disease targets [54].

4. Classic Multi-omics Techniques Analysis

Biological processes are characterized by their complexity and integrality. The analysis of single-omics data often fails to clarify complex causal relationships. In contrast, integrated multi-omics analysis can explore biological questions from both the cause and the effect perspectives and verify their correlations. Correlation analysis is a practical technique for uncovering associations within large datasets, thereby depicting patterns and trends where certain attributes of a phenomenon change simultaneously. Multi-omics correlation analysis typically begins by addressing intra-group differences (phenotypic differences), including spatiotemporal variations in sample collection, differences in disease or health status among samples, and the presence or absence of sample processing operations. Based on inter-group differences, inter-group correlation analysis detects associations or correlations within large datasets. This integration reveals interactions and regulatory relationships among data from different levels, facilitating comprehensive multi-level investigations of diseases [10,55,56].
Multi-omics analysis methods (Figure 2) generally encompass correlation analysis and enrichment analysis. The integration approach for correlations involves directly analyzing the correlation between two or more omics datasets. Initially, differential information is obtained through single-omics sequencing, and then, various omics data are connected for correlation analysis (Figure 3). For instance, overlap analysis can be conducted using Venn diagrams, which count the common or unique differentially expressed genes/proteins in multiple omics datasets, providing an intuitive understanding of the similarity and overlap between groups. Pearson correlation analysis can be employed to measure the relationship between inter-group variables and the extent of this relationship. When the data do not follow a normal distribution, Spearman correlation analysis is typically used. Visualization can be achieved using heatmaps and scatter plots. When the correlation between two omics datasets is weak, a nine-quadrant plot analysis can visually display the correlation, assisting researchers in identifying key molecules relevant to the study. For example, through a nine-quadrant plot analysis of the transcriptome and proteome, researchers can comprehend the expression of genes in samples and the translation of gene expression, facilitating the identification of the key genes relevant to the study. In one study, the authors used tandem mass tag proteomics technology to analyze protein changes in hepatocellular carcinoma cells after knocking down small Cajal body-specific RNA 13 (SCARNA13) and discovered 182 differentially expressed proteins. An overlap analysis using Venn diagrams revealed 11 genes that were differentially expressed in both transcriptomics and proteomics, with the protein-expression level of SRY-related HMG-box gene 9 significantly downregulated after SCARNA13 knockdown [57].
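The overlap and correlation steps described above can be illustrated with a minimal sketch. It uses only the Python standard library; the function names and any gene identifiers or fold-change values passed to them are hypothetical and are not taken from the cited studies. Spearman correlation is computed here as the Pearson correlation of average ranks, which is the standard definition.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length numeric sequences.

    Raises ZeroDivisionError if either sequence is constant.
    """
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    # Sum of cross-products and root sums of squares around the means.
    cross = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    root_ss_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    root_ss_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cross / (root_ss_x * root_ss_y)


def ranks(values):
    """1-based ranks; tied values receive the mean of the ranks they span."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to the end of the current run of tied values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = mean_rank
        i = j + 1
    return result


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the ranked values."""
    return pearson(ranks(x), ranks(y))


def overlap(set_a, set_b):
    """The three regions of a two-set Venn diagram."""
    return {
        "only_a": set_a - set_b,
        "both": set_a & set_b,
        "only_b": set_b - set_a,
    }
```

For paired measurements that rise together but not linearly, `spearman` returns a value at or near 1 while `pearson` is lower, which is why the rank-based coefficient is preferred when the data are not normally distributed.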
Another multi-omics analysis method is enrichment analysis. Biological processes entail the coordinated participation of multiple genes/proteins. Hence, abnormalities in a biological process are the consequence of interactions among multiple genes. Molecular responses and alterations within organisms demonstrate functional and pathway enrichment. Accordingly, different omics datasets also exhibit similar patterns of functional enrichment and changes. Differential proteins/genes from diverse omics datasets can generate a considerable amount of data through direct annotation. These functions often overlap conceptually, resulting in redundant analysis outcomes and impeding further refined analysis. Therefore, using enrichment-analysis methods to filter and screen data, integrate data with overlapping functions, and identify biological components or functions in which differential genes/proteins are significantly over-represented compared to background sets can generate more meaningful functional information. For instance, the authors enriched significantly upregulated genes in each subtype and discovered that Nudix hydrolase 12, Acetyl-CoA synthetase 1, and Nicotinamide nucleotide Adenylyl transferase 3 in the S3 subtype are all related to nucleotide biosynthesis processes [58]. Currently, two enrichment-analysis methods are most commonly used. One is to set a significance threshold, select differentially expressed genes, and then use statistical tests to determine whether these differential genes are enriched in specific functional categories or pathways. The categories are typically taken from gene ontology (GO), the Kyoto Encyclopedia of Genes and Genomes (KEGG) database, or other defined pathway collections, and the analysis is accordingly known as GO enrichment analysis or KEGG enrichment analysis. However, subjective threshold setting may neglect genes with significant biological importance that are not significantly differentially expressed.
The emergence of gene set enrichment analysis (GSEA) addresses this deficiency. GSEA does not require the specification of a specific threshold for differential genes. By concentrating on the coordinated pattern of changes in the entire gene set, even genes with minor expression changes can be effectively captured by GSEA as long as they produce a synergistic effect within the entire gene set. The annotation information therein can be derived from GO, KEGG, or any other format-compliant information.
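The two enrichment approaches can be contrasted with a minimal sketch. It assumes that the statistical test in the threshold-based approach is a one-sided hypergeometric test (a common choice, not one the text specifies), and it implements only the running-sum enrichment score of GSEA, omitting the permutation-based significance estimate and normalization that a full GSEA performs. All function and parameter names are hypothetical.

```python
from math import comb


def ora_pvalue(n_background, n_in_set, n_selected, n_overlap):
    """Over-representation p-value, P(X >= n_overlap), hypergeometric model.

    n_background: genes in the background set
    n_in_set:     background genes annotated to the category/pathway
    n_selected:   differentially expressed genes passing the threshold
    n_overlap:    selected genes that are annotated to the category
    """
    upper = min(n_in_set, n_selected)
    total = comb(n_background, n_selected)
    tail = sum(
        comb(n_in_set, k) * comb(n_background - n_in_set, n_selected - k)
        for k in range(n_overlap, upper + 1)
    )
    return tail / total


def enrichment_score(ranked_genes, scores, gene_set, weight=1.0):
    """Running-sum enrichment score over a ranked gene list (no threshold).

    ranked_genes and scores are parallel lists sorted from most upregulated
    to most downregulated. The sum rises at each gene in gene_set, weighted
    by |score| ** weight, and falls by a constant step at every other gene.
    The returned value is the running sum's largest deviation from zero.
    """
    hits = [gene in gene_set for gene in ranked_genes]
    n_hits = sum(hits)
    if n_hits == 0 or n_hits == len(ranked_genes):
        raise ValueError("gene set must contain some, but not all, ranked genes")
    hit_norm = sum(abs(s) ** weight for s, h in zip(scores, hits) if h)
    miss_step = 1.0 / (len(ranked_genes) - n_hits)
    running, best = 0.0, 0.0
    for s, h in zip(scores, hits):
        running += abs(s) ** weight / hit_norm if h else -miss_step
        if abs(running) > abs(best):
            best = running
    return best
```

A gene set concentrated at the top of the ranking yields a score near +1 and one concentrated at the bottom yields a score near -1, even if none of its members would individually pass a differential-expression threshold.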

5. Classical Multi-omics Integration Methods

The integration of multi-omics data is of crucial significance for target screening and identification. Different combinations of omics layers yield different insights. The integration of genomics and transcriptomics can reveal the association between gene mutations and gene expression, facilitating the study of transcriptional regulation and enabling the identification of potential drug targets. The integration of transcriptomics and proteomics can expose the correlation between gene expression and protein levels, thereby characterizing gene expression at two levels. The integration of proteomics or transcriptomics with metabolomics can delve into the relationship between protein expression and metabolite levels and detect the association between phenotype and genotype, thereby identifying the upstream regulatory mechanisms of metabolic characteristics via a “from effect to cause” approach. The integration of proteomics and modified proteomics can explore the true mechanism of protein modification from two dimensions, namely protein expression and modification (Figure 3).

5.1. Integrating Transcriptome and Proteomics

As downstream products of genetic and epigenetic regulation, the transcriptome and proteome respectively measure gene expression at the transcriptional and translational levels, indicating a very close connection. The transcriptome acts as an intermediate module connecting the genome and the proteome, is capable of identifying genes with differential expression after being treated with small molecules, and is used to formulate hypotheses about mechanisms such as transcriptional and post-transcriptional regulation [25]. The process of gene transcription and translation into proteins involves complex post-transcriptional regulatory mechanisms, such as mRNA splicing and PTMs of proteins [35]. Therefore, insights into proteins cannot be sufficiently obtained from the DNA and RNA levels alone. Proteomic and transcriptomic analyses provide insights regarding the regulation of protein abundance. Proteins, as the direct executors of biological functions, reveal correlations that overlap with but are not identical to transcriptomic and genetic data, representing a true reflection of gene-expression status. The development of high-throughput MS technology has enabled the large-scale study of the characteristic information of all proteins expressed in a cell or organism.
Integrating transcriptomic and proteomic analyses enables the assessment of gene expression both upstream and downstream, as well as the exploration of gene regulation during transcription and translation processes. By comparing samples from diseased and normal physiological states, genes and proteins with significantly altered expression can be identified. Exploiting the differences and complementarities between the transcriptome and the proteome allows expression regulation and correlation to be analyzed. Through GO and KEGG functional enrichment analyses, researchers can gain insights into the specific functions of these molecules in diseases and their interaction patterns. These functional enrichment analyses help uncover key molecular pathways that affect specific biological functions, particularly those that may involve new drug targets or therapeutic mechanisms. The integration of transcriptomic and proteomic analyses enables the construction of a protein search library from transcriptomic data, enhancing the completeness of the protein database and significantly improving the accuracy of protein identification. Conversely, proteomics can validate alternative splicing information discovered through transcriptomics. By comparing the protein-expression differences between diseased and normal physiological states, potential drug targets can be identified. For instance, in the research on therapeutic targets for non-alcoholic fatty liver disease (NAFLD), researchers used comprehensive proteomic analysis to investigate the differential expression profiles of endoplasmic reticulum (ER) stress-response proteins under various metabolic states. They discovered a significant downregulation of major urinary protein (MUP). Integrated transcriptomics further confirmed this finding at the mRNA and protein-expression levels, revealing that MUP1, as the primary secretory form of MUP, decreased under endoplasmic reticulum stress in hepatocytes.
This indicates the potential of recombinant MUP1 or its derivatives produced under endoplasmic reticulum stress as promising therapeutic targets for alleviating NAFLD [59]. Additionally, by conducting transcriptomic, proteomic, and phospho-proteomic analyses on a silicosis mouse model, researchers constructed a multi-omics integrated analysis map, thereby tracking the dynamic changes caused by silica-induced injuries. These analyses confirmed the presence of abnormal pathways in the progression of silicosis, including transcriptional, proteomic, and kinase activities, particularly identifying the phosphorylation levels of epidermal growth factor receptor and spleen tyrosine kinase as new therapeutic targets for silicosis [60]. Furthermore, in analyzing the expression levels of integrin subunit alpha 2 (ITGA2) in Kirsten rat sarcoma viral oncogene (KRAS)-mutant pancreatic cancer, researchers observed a significant increase. Subsequent reverse transcription-polymerase chain reaction and Western blot analyses confirmed that aberrant KRAS activation induced overexpression of ITGA2. Single-cell RNA sequencing (scRNA-seq) data revealed that ITGA2 expression regulates the activation of the small mothers against decapentaplegic 2-inhibited transforming growth factor-β (TGF-β) signaling pathway, potentially serving as a clinical therapeutic target by enhancing the anti-tumor effects of TGF-β [61].
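Comparing transcript-level and protein-level changes for the same genes, as in a nine-quadrant plot, can be sketched as follows. This is an illustration under stated assumptions: the inputs are hypothetical dictionaries mapping gene identifiers to log2 fold changes, the two-fold cutoff (`cutoff=1.0`) is an arbitrary choice, and statistical significance is ignored. Quadrant numbering conventions vary between tools, so the groups are labeled by status rather than by number.

```python
def status(log2fc, cutoff):
    """Classify a log2 fold change as 'up', 'down', or 'unchanged'."""
    if log2fc >= cutoff:
        return "up"
    if log2fc <= -cutoff:
        return "down"
    return "unchanged"


def nine_quadrant(mrna_fc, protein_fc, cutoff=1.0):
    """Group genes by their (mRNA status, protein status) pair.

    Only genes quantified in both datasets are classified. The result maps
    each occupied (mRNA, protein) status pair to the set of genes in it,
    so at most nine groups are returned.
    """
    groups = {}
    for gene in mrna_fc.keys() & protein_fc.keys():
        key = (status(mrna_fc[gene], cutoff), status(protein_fc[gene], cutoff))
        groups.setdefault(key, set()).add(gene)
    return groups
```

Genes in the `("up", "up")` and `("down", "down")` groups change concordantly at both levels, whereas groups such as `("up", "unchanged")` or `("unchanged", "up")` point to possible post-transcriptional or translational regulation.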

5.2. Integrating Transcriptome and Metabolomics

Transcriptomics and metabolomics sit upstream and downstream in omics research, respectively, and together they characterize the same biological process at the transcriptional and metabolic levels. Transcriptomic investigations focus on the transcriptional state of specific genes in a particular tissue or cell, along with the transcriptional regulatory patterns. Metabolomic studies aim to comprehend the metabolic networks of organisms and explore the roles of metabolites in the occurrence and treatment of diseases. Because metabolites are direct manifestations of the phenotype, even minor changes in phenotypic traits can be greatly magnified at the metabolic level. Hence, alterations in metabolites can reflect changes in phenotype and help explain the mechanisms linking phenotype and metabolites. However, metabolomics can only detect the composition of metabolites in cells and tissues; on its own, it struggles to identify the specific upstream molecules that regulate metabolic changes. Therefore, by integrating metabolomics with transcriptomic analysis, we can observe how changes in phenotype are manifested at the gene level. Additionally, association analysis of phenotypic and genotypic data allows us to infer the upstream “causes” from the downstream “results”, thereby discovering the upstream regulatory mechanisms of characteristic metabolites.
By fully leveraging the correlation between metabolomic data and transcriptomic and genetic data, which overlap but are not identical, a comparative analysis of the two omics data can reveal molecular information at both omics levels. Starting with the analysis of differential metabolites and genes from metabolomics and transcriptomics, respectively, single-omics analysis can be performed. Methods such as principal component analysis, orthogonal projections to latent structures discriminant analysis, and clustering analysis can achieve intra-group differentiation, and differential molecules can be identified through differential comparison analysis. After identifying differential molecules, metabolomics can confirm the associations between differential metabolites and pathways via KEGG pathway enrichment analysis. Transcriptomics can perform KEGG enrichment analysis, GO enrichment analysis, and protein–protein interaction networks analysis. By integrating the KEGG data from these two omics, relevant functions or shared pathways can be identified and key pathways found. Alternatively, by targeting metabolites associated with upstream pathways identified by transcriptomics, specific mechanisms of upstream regulation can be investigated through metabolite detection. Expression correlation analysis can also be conducted between the differential genes and the accumulated differential metabolite information obtained from transcriptomic and metabolomic analyses to identify their expression correlations and infer their involvement in the same biological processes. Differential metabolites and metabolic pathways can be identified, and correlation analysis of these differentially expressed metabolites can be conducted to identify related biomarkers. This approach allows for the exploration of biological questions from both the “cause” and “effect” perspectives, facilitating the diagnosis of related diseases and hierarchical prediction, among other applications. 
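The expression correlation analysis between differential genes and differential metabolites mentioned above can be sketched as follows. This is a minimal illustration, assuming Spearman rank correlation as the association measure (the text does not specify one) and using hypothetical gene names, metabolite names, values, and cutoff. Significance testing and multiple-testing correction are omitted.

```python
import math

def ranks(values):
    """Average ranks (1-based); tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = avg
        i = j + 1
    return result

def pearson(x, y):
    """Pearson correlation; returns 0.0 if either vector is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    if sx == 0 or sy == 0:
        return 0.0
    return cov / (sx * sy)

def spearman(x, y):
    """Spearman correlation = Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))

def correlated_pairs(genes, metabolites, cutoff=0.8):
    """List (gene, metabolite, rho) pairs with |rho| >= cutoff."""
    pairs = []
    for g, gv in genes.items():
        for m, mv in metabolites.items():
            rho = spearman(gv, mv)
            if abs(rho) >= cutoff:
                pairs.append((g, m, round(rho, 3)))
    return pairs

# Hypothetical abundances measured in the same six samples.
genes = {
    "GENE_X": [1.0, 2.1, 2.9, 4.2, 5.1, 6.3],
    "GENE_Y": [5.0, 3.2, 4.1, 3.9, 4.4, 3.6],
}
metabolites = {
    "metab_1": [0.5, 0.9, 1.4, 2.2, 2.4, 3.1],
    "metab_2": [9.0, 7.5, 7.1, 5.0, 4.2, 3.3],
}

pairs = correlated_pairs(genes, metabolites)
print(pairs)
```

Strongly correlated gene and metabolite pairs found this way are only candidates for involvement in the same biological process; correlation by itself does not establish the direction of regulation.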
Elke Schaeffeler et al. [62] utilized MS-based human leukocyte antigen ligandome analysis to identify clear-cell renal cell carcinoma (ccRCC)-associated peptides, followed by integrated transcriptomic and metabolomic analysis to further screen ccRCC targets. Gene set enrichment analysis of the differential genes across the entire transcriptome gene set yielded 113 final candidate genes. The study employed LC-MS for the targeted metabolomics analysis of tissue samples, quantifying 204 tumor metabolites. Non-targeted metabolomics analysis was then used to identify metabolic products, revealing the selected therapeutic target prolyl hydroxylase 3 (PHD3; EGLN3). Similarly, Haitao Lu et al. adopted a precise targeted metabolomics approach and showed for the first time that the crucial functional metabolites adenosine monophosphate (AMP) and cyclic adenosine monophosphate (cAMP) accumulated significantly in gemcitabine-treated mouse pancreatic cancer (PC) tumor tissues. Subsequently, quantitative proteomic and transcriptomic analysis validated a large number of metabolic enzymes and genes associated with alterations in AMP and cAMP. Finally, it was verified that the gemcitabine-induced accumulation of AMP and cAMP can activate the downstream kinases AMP-activated protein kinase and protein kinase A, respectively, leading to tumor growth inhibition through the overexpression of growth arrest and DNA damage-inducible protein 45A. The dual activation of the AMP-cAMP axis represents a potential new target for PC treatment [63].

5.3. Integrating Proteomics and Metabolomics

As downstream products of omics research, proteomics and metabolomics, respectively, represent the complete set of proteins and metabolites expressed by a cell or an entire organism. Proteomics studies the composition and alteration patterns of proteins within cell tissues or organisms. Metabolomics examines the small-molecule metabolites involved in regulation, including their types, quantities, and patterns of change. Metabolomic information is closely aligned with biological phenotypic characteristics. According to the central dogma, the expression of metabolites, following transcription and protein translation, is ultimately manifested in the phenotype. Metabolomics is downstream of proteomics and represents a further manifestation of proteins. Thus, as two directly related groups in regulatory relationships, minor functional alterations in the proteome may be significantly amplified several times at the metabolic level. Integrating proteomics with metabolomics allows these methods to validate and complement one another, clarifying how proteins govern specific changes in metabolites and illuminating their enhancements at the genetic level. This integration also provides insights into the impact on phenotypes and the establishment of subsequent molecular mechanisms.
Similar to an integrated analysis involving transcriptomics and metabolomics, combining proteomics and metabolomics in the analysis of biological samples enables KEGG integration analysis and expression correlation analysis based on the co-expression patterns of metabolites and proteins. In KEGG integration analysis, differential metabolite and protein data are rapidly screened to identify metabolic pathways in which both are involved, assisting in elucidating how proteins regulate metabolites and their engagement in signaling pathways. Expression correlation analysis, based on the expression levels of selected differential metabolites and proteins, identifies differential proteins and metabolites with similar trends. This clarifies how changes in a protein can result in changes in metabolites and other associated changes. Additionally, two-way orthogonal partial least squares (O2PLS) modeling, an extension of orthogonal partial least squares to paired data matrices, can be conducted. Unlike traditional one-to-one correlation analysis, O2PLS relates the expression levels of the two omics datasets in a one-to-many fashion. It starts from the overall data perspective and undertakes bidirectional modeling of the two omics datasets as a whole, separating the variation shared by the two matrices from the variation unique to each and thereby identifying correlations between proteins and metabolites.
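The KEGG integration step described above, in which pathways enriched in both omics layers are intersected, can be sketched as follows. This is a hedged illustration rather than the procedure of any specific tool. It assumes a one-sided hypergeometric over-representation test, and the pathway names, molecule identifiers, and background sizes are hypothetical placeholders rather than real KEGG entries. No multiple-testing correction is applied.

```python
from math import comb

def hypergeom_pvalue(hits, selected, in_pathway, background):
    """P(X >= hits) when `selected` molecules are drawn from `background`
    molecules, of which `in_pathway` belong to the pathway."""
    upper = min(selected, in_pathway)
    total = comb(background, selected)
    tail = sum(
        comb(in_pathway, k) * comb(background - in_pathway, selected - k)
        for k in range(hits, upper + 1)
    )
    return tail / total

def enriched_pathways(differential, pathway_members, background_size, alpha=0.05):
    """Return pathways over-represented among the differential molecules."""
    enriched = set()
    for pathway, members in pathway_members.items():
        hits = len(differential & members)
        if hits == 0:
            continue
        p = hypergeom_pvalue(hits, len(differential), len(members), background_size)
        if p < alpha:
            enriched.add(pathway)
    return enriched

# Hypothetical pathway annotations for each omics layer.
protein_pathways = {
    "pathway_lipid": {f"P{i}" for i in range(1, 11)},
    "pathway_tca": {f"P{i}" for i in range(11, 21)},
}
metabolite_pathways = {
    "pathway_lipid": {f"M{i}" for i in range(1, 7)},
    "pathway_aa": {f"M{i}" for i in range(7, 13)},
}

# Hypothetical differential molecules from each single-omics analysis.
diff_proteins = {"P1", "P2", "P3", "P4", "P5", "P11", "P50", "P51"}
diff_metabolites = {"M1", "M2", "M3", "M7", "M40"}

protein_hits = enriched_pathways(diff_proteins, protein_pathways, background_size=200)
metabolite_hits = enriched_pathways(diff_metabolites, metabolite_pathways, background_size=100)

# Pathways supported by both layers are candidates for follow-up.
shared = protein_hits & metabolite_hits
print(shared)
```

Intersecting enriched pathways is the simplest form of integration; it flags where the two layers agree but says nothing about which proteins drive which metabolites, which is where correlation or O2PLS modeling comes in.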
Zongwei Cai and colleagues [64] carried out a systematic chemical proteomics study on the target proteins of perfluorinated compounds. Through quantitative proteomic analysis of the enriched proteins, acetyl-CoA carboxylase alpha and acetyl-CoA carboxylase beta were identified as new target proteins of perfluorooctanoic acid (PFOA). Subsequently, targeted proteomic studies using parallel reaction monitoring further verified these protein targets. In combination with metabolomics research, the verification of the in vivo metabolic changes selected by the differential target proteins provided a rational explanation for the liver toxicity induced by PFOA. Eventually, the true protein targets of PFOA were disclosed and validated. Hui Zhong et al. [65] utilized phosphorylation proteomics and proteomic analysis to explore the ability of the recombinant Golgi protein 73 (GP73) to promote guanosine triphosphate hydrolysis. They found that GP73 impairs the secretion of apolipoprotein B (ApoB) and ApoB100, while GP73 mutants nullify these effects. The metabolomic exploration of the differential protein data further evaluated the metabolic consequences of chronic GP73 expression in liver cells, clarifying the abnormal biological phenomena in pathological states. It was observed that mice with chronically upregulated GP73 in liver cells showed a phenotype characteristic of non-alcoholic fatty liver disease (NAFLD), and treatment with metformin to inhibit GP73 GTPase-activating protein activity effectively blocked the GP73-induced non-obese NAFLD phenotype. This finding suggests that elevated levels of GP73 may promote the progression from steatosis to nonalcoholic steatohepatitis, providing a potential therapeutic target for halting the development and progression of non-obese NAFLD.

5.4. Proteogenomics

The term “proteogenomics” was initially proposed by Jaffe et al. in 2004 [66]. It encompasses the integration of MS-based proteomics and PTM proteomics with genomic, epigenomic, and transcriptomic data. Genomics and epigenomics mainly involve the collective characterization and quantification across the entire genome, providing a blueprint of cellular events. By studying specific genes, we can uncover gene variations related to diseases [67]. This comprehensive approach reveals the pathogenic mechanisms of human diseases, identifies differential genes, and discovers potential targets for diagnosis and treatment. According to the central dogma, after DNA transcription and protein translation, the expression of metabolites ultimately presents phenotypically; the implementation of biological functions in organisms ultimately depends on proteins and their metabolites. Proteins, as the direct executors of biological functions, provide confirmation of events that have already occurred, as proteins and their modifications are the main determinants of biological phenotypes. Their composition and interactions form the foundation of the dynamic processes of life. Therefore, monitoring proteins and their interactions is crucial for studying life activities [68]. Compared to genomics, proteomics has the advantage of confirming events that have already happened. Thus, genomic and transcriptomic analyses provide characteristics of differential genomic features, while proteomics can directly identify protein regulation and signal transduction responses to differential genes. Additionally, an in-depth quantitative analysis of post-translationally modified proteins is carried out. The integrative analysis of multi-omics data, such as genomics, transcriptomics, and proteomics, enables the exploration of biological questions from both causal and consequential perspectives. 
Genomic and transcriptomic analyses provide characteristics of differential genomic features, while proteomics can directly identify proteins regulating and responding to abnormal expressions and provide direct information on signal transduction. Using MS for in-depth quantitative analysis of post-translationally modified proteins can detect the effects of genomic changes and signaling dysregulation that genomics and transcriptomics alone cannot reveal [69,70]. The proteogenomics analysis workflow starts with transcriptome sequencing to obtain gene information, coding transcripts, lncRNAs, alternative splicing sites, single nucleotide variants (SNVs), etc., relative to the genome reference sequence. These features are used to construct a custom feature sequence database, with public databases used for assistance where needed. Then, protein-expression data are obtained through MS-based proteomic analysis. The spectrum identification engine matches the collected MS data with peptides in the previously constructed database to score peptide-spectrum matches. Unlike the database used in the traditional protein identification quality-control process, the new feature sequence database contains a large number of redundant and random sequences. Therefore, more stringent quality-control standards and validation methods are needed to ensure that the identified new sequences are sufficiently reliable. The identified new peptides are screened and classified, and the number of different events is counted. A manual inspection is conducted to assist in verifying the reliability of each event. Finally, the genomic localization of events is completed, and data visualization is performed.
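The database-construction part of this workflow can be illustrated with a small sketch: a variant protein sequence is derived from a reference, both are digested in silico, and the peptides unique to the variant form the "novel" search space against which identified peptides are checked. This is a simplified assumption-laden example. The sequences, the variant, and the list of identified peptides are hypothetical, missed cleavages are ignored, and real pipelines rely on dedicated search engines with false discovery rate control.

```python
import re

def tryptic_digest(protein, min_length=6):
    """In-silico trypsin digestion: cleave after K or R unless the next
    residue is P. Missed cleavages are ignored to keep the sketch short."""
    pieces = re.split(r"(?<=[KR])(?!P)", protein)
    return {p for p in pieces if len(p) >= min_length}

def apply_substitution(protein, position, ref_aa, alt_aa):
    """Apply a single amino-acid substitution (1-based position), as would
    result from a nonsynonymous SNV."""
    if protein[position - 1] != ref_aa:
        raise ValueError("reference residue does not match the sequence")
    return protein[:position - 1] + alt_aa + protein[position:]

# Hypothetical reference protein and variant call (not real sequences).
reference = "MSTLEKAGDVWQRPLNTEAFGHKVYSIDDLR"
variant = apply_substitution(reference, 26, "S", "N")

# Peptides that exist only in the variant form the custom search space.
reference_peptides = tryptic_digest(reference)
novel_peptides = tryptic_digest(variant) - reference_peptides

# Hypothetical peptides reported by a search engine for the MS data.
identified = {"MSTLEK", "VYNIDDLR"}

# Identified peptides that can only be explained by the variant sequence.
variant_evidence = identified & novel_peptides
print(sorted(variant_evidence))
```

Because novel peptides differ from reference peptides by as little as one residue, each such match would normally go through the stricter quality control and manual inspection described above before being reported as a variant event.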
Researchers from the Clinical Proteomic Tumor Analysis Consortium analyzed the proteogenomics of lung adenocarcinoma (LUAD) through a multi-omics approach and established different subtypes of LUAD based on their distinct immune characteristics. Phospho-proteomic analysis identified anaplastic lymphoma kinase fusion as a potential diagnostic biomarker and target [71]. Additionally, the research group led by Yu-Ju Chen [72] conducted proteogenomic analysis on non-smoking LUAD patients in East Asian populations. They identified the age, gender, and environmental carcinogenic risk factors associated with the development of LUAD. Moreover, they clinically classified early-stage LUAD based on proteomic features and identified tumor characteristics, tumor cell markers, and drug targets via protein network analysis. Tan Minjia and colleagues [73] conducted an in-depth analysis of the proteomic and phospho-proteomic expression profiles of LUAD and the adjacent tissues from 103 clinical patients. They identified a total of 11,119 protein products and 22,564 phosphorylation modification sites. Furthermore, by integrating genomic feature data with proteomic features, they confirmed that heat shock protein 90β could serve as a prognostic biomarker for LUAD [74].

6. Single-Cell Multi-omics Technology

Integrative analysis of multi-omics offers more comprehensive biological information compared to single-type omics data. These omics sequencing techniques are conducted at the multicellular level, resulting in averaged signals of cells, which may neglect minor differences as being noise. For highly heterogeneous tissue samples, such as tumor tissues and immune cells, the information regarding cell heterogeneity is lost, making it difficult to analyze tumor characteristics at the cellular level. To better understand the heterogeneity and functional differences of cells, single-cell sequencing technology has emerged. Single-cell multimodal omics technology refers to the cutting-edge technique of depicting cell heterogeneity through multi-dimensional, multi-level, and multi-angle genomic, epigenomic, transcriptomic, and proteomic analyses within the same cell, enabling the exploration of direct and potential connections among various omics layers and facilitating a more accurate and comprehensive explanation of disease states [75,76,77,78].
Conventional single-cell sequencing involves several key steps: preparing single-cell suspensions, single-cell sorting, library preparation, high-throughput sequencing, bioinformatics analysis, and data visualization. First, a single-cell suspension is prepared by dissociating tissue samples into individual cells, which allows molecular information to be extracted from each cell. The target molecules (RNA, DNA, or proteins) within each cell are then captured. This is followed by the amplification of the obtained DNA or RNA, library construction, sequencing, and finally, data analysis. Each step, from isolating, fixing, and lysing single cells to amplifying nucleic acids and sequencing, is designed to preserve the integrity and accuracy of the molecular information from each cell [15,38,77].
Currently, the most established methods in single-cell omics are single-cell transcriptomics and single-cell genomics [79]. Recent progress has been made in single-cell sequencing technologies. For instance, single-cell genomics utilizes whole-genome sequencing to analyze the entire genomic DNA sequence of individual cells, identifying gene mutations, chromosomal structural variations, and copy number variations. Single-cell transcriptome sequencing analyzes the mRNA expression in individual cells, revealing the differences in gene expression among cells and identifying different cell types and functional states. Popular scRNA-seq technologies include Smart-seq2, which provides full-length transcript information, and 10X Genomics Chromium, which can handle thousands of single cells with high throughput. Single-cell epigenetic sequencing analyzes DNA methylation and histone modifications in individual cells to understand the epigenetic mechanisms of gene regulation. This involves single-cell whole-genome bisulfite sequencing for analyzing DNA methylation patterns and single-cell chromatin accessibility sequencing (such as single-cell assay for transposase-accessible chromatin using sequencing; scATAC-seq) for assessing chromatin openness and inferring regulatory element activity. Integrating single-cell omics technologies allows for a comprehensive understanding of cell states and functions, providing an unbiased exploration of the relationship between gene expression and phenotypic heterogeneity [80].
Single-cell multi-omics sequencing combines data from various single-cell omics, such as genomics, transcriptomics, and epigenomics, to provide a comprehensive view of cell states and functions. This approach involves sequencing the same cell using multiple single-cell techniques and then integrating the data through fusion and comprehensive analysis to uncover the diverse biological properties of cells. By merging information from multiple omics layers, including DNA, RNA, and protein data, researchers can improve the accuracy of identifying cell populations, tracing cell lineages, and detecting new or rare cell types [78]. Several methods have been developed for single-cell multi-omics analysis of the genome and transcriptome. For instance, the single-cell combinatorial indexing for methylation analysis technique can simultaneously capture transcriptomic and epigenomic information from a single cell. Conducting a parallel analysis of the genome and a transcriptome in the same cell reveals the transcriptional state of the genome, highlighting the relationship between genomic changes and the transcriptional outcomes of the target genes involved in disease processes. Integrating transcriptomic and epigenomic data can directly clarify the epigenetic features of DNA. Additionally, combining transcriptomic and proteomic analysis at the single-cell level allows for the characterization of proteins through PTMs and interactions, improving the analysis of mRNA and protein abundance relationships and enhancing identification accuracy. For example, researchers [81] combined scRNA-seq, single-cell chromatin accessibility analysis, and DNA sequencing to analyze 225 tumor samples from 11 different types of cancer. They created a multi-omics tumor atlas and identified the genetic factors associated with cancer. Notably, they found that transcription factors could serve as markers for cancer prognosis. 
Researchers [82] conducted detailed transcriptomic and epigenetic analyses on 42 and 39 human cancer cell lines using scRNA-seq and ATAC sequencing technologies, respectively. This study uncovered significant heterogeneity at both the transcriptomic and epigenetic levels. However, preparing single-cell suspensions and isolating individual cells remains a challenging aspect of single-cell omics technologies.
Although the throughput, automation, and detection speed of single-cell sequencing technologies have been significantly enhanced, and the costs have been continuously decreasing, they still fail to meet the requirements of clinical testing. The application of single-cell multi-omics analysis is still in its early stages. While single-cell transcriptomics is more developed, other types of single-cell multi-omics analyses are still evolving, and there is a need to improve the methods for analyzing single-cell multi-omics data. For instance, as proteins cannot be amplified like DNA or RNA, and the current instruments lack sufficient sensitivity, the number of proteins detectable with the current single-cell multi-omics technologies is limited, which slows down the development of single-cell proteomics. Currently, there are few cases of using single-cell multi-omics alone for drug target screening; it is usually combined with other omics technologies. For example, traditional omics can be combined with single-cell omics to validate potential targets. Initially, potential targets are identified based on extensive omics data, and then, single-cell omics data is used to further refine these targets. Alternatively, spatial multi-omics technology can be integrated with single-cell omics techniques [41].
Furthermore, there are few computational methods available for the integrated analysis of single-cell multi-omics data, and some technical and computational limitations remain. Although the recent progress in scRNA-seq technology has led to an exponential increase in the number of cells and genes detected, suitable data-analysis methods for other types of single-cell omics, such as single-cell proteomics and single-cell metabolomics, are still lacking and need improvement. Single-cell omics sequencing technologies are not yet ideal for studying human cells. This is particularly true in stem-cell research, where it is currently impossible to identify the spatiotemporal changes of stem cells within the human body. In the future, combining single-cell multi-omics sequencing with gene editing tools and 3D organoid culture systems may potentially uncover the genetic changes in stem cells and their potential links to phenotypic changes in stem cells [79].

7. Spatial Multi-omics Technology

As single-cell omics become more prevalent, the single-cell dimensional atlas alone can no longer satisfy the diverse demands for in-depth research on disease treatment targets. Single-cell multi-omics analysis enables researchers to extract information regarding transcriptomics, epigenomics, and proteomics from individual cells. However, this approach isolates samples from their native environments, disrupting tissue structure and losing crucial information [15]. To address these challenges and accurately identify cell types and distributions in complex tumor microenvironments, it is crucial to determine the spatial positioning of cells within tissues. New methods that can spatially map gene expression within tissues have propelled the concept of “spatial omics.” The technology of spatial transcriptomics was first introduced in 2016. Since then, other technologies, such as spatial genomics and spatial proteomics, have emerged, giving rise to the development of spatial multi-omics techniques. During this time, spatial transcriptomics has also seen rapid advancements. In 2020, Nature Methods recognized this technology as the Method of the Year [83]. In 2022, this technology was listed as one of the top seven technologies by Nature [84]. For example, Michael T. Longaker [85] utilized single-cell transcriptome sequencing and single-cell chromatin accessibility data, along with 10x Visium spatial transcriptomics and PhenoCycler-Fusion spatial proteomics, to study various solid tumor types across different species. This comprehensive approach revealed new therapeutic targets specific to cancer-associated fibroblasts, regardless of species and tumor type. Sun [86] used a combination of spatial metabolomics and spatial transcriptomics to analyze slices of gastric cancer tissue. This method mapped how metabolites, lipids, and genes interact and co-locate within the metabolic pathways of diverse cancerous tissues. 
Currently, since spatial multi-omics technologies lack single-cell resolution, they typically only analyze clusters of about 3–10 cells. As a result, these technologies are often paired with single-cell omics techniques for more detailed insights. Abhay Kanodia [87] conducted a transcriptomic analysis of colorectal tumors, combining spatial transcriptomics at the tissue level using digital spatial profiling (DSP) technology with scRNA-seq. They cross-validated the results of several key processes related to immunotherapy across various cell types, confirming the coordinated changes in key processes of multicellular interactions within the tumors.
Spatial multi-omics technology combines biological data from various levels, such as DNA, RNA, and proteins, through a systems biology approach. It emphasizes the spatial characteristics and resolution of omics data, facilitating a comprehensive understanding of biological processes in their spatial context. This technology offers researchers a new three-dimensional spatial perspective, enhancing the “resolution” of cellular and genetic maps and addressing spatial heterogeneity within samples. In the next five to ten years, spatial multi-omics is anticipated to be widely adopted, potentially revolutionizing our understanding of the complexity of life by uncovering the intricate molecular and spatial structures of tissues and cells. Spatial metabolomics utilizes MS imaging combined with metabolomics, providing both qualitative and quantitative analysis of metabolites while revealing molecular phenotypes in a spatial context. Spatial proteomics technology employs high-precision laser-capture microdissection to isolate specific tissue regions or cells of interest. Using optimized micro proteomics techniques, it analyzes protein expressions at different spatial locations, obtaining protein-expression profiles for various functional areas and cell types within the sample. This helps discover more precise biomarkers and functional mechanisms. Spatial transcriptomics combines the detailed morphological observations of histology with the high-throughput sequencing capabilities of transcriptomics, enabling the simultaneous high-throughput sequencing of multiple sites on a single tissue slice.
The utilization of spatial omics technology has been steadily increasing, mapping the genomic architecture across various tissues. However, we are still in the early stages of the spatial omics era. As these methods progress, the limitations, in terms of throughput, resolution, sensitivity, and adaptability to various sample types, are gradually being overcome. Substantial progress in spatial omics is anticipated in the future, which will further deepen our understanding of the complexity of life.
At present, data acquisition is still one of the most challenging aspects of spatial omics methods. The vast data volume complicates spatial analysis and requires significant effort from researchers for data interpretation. Currently, spatial transcriptomics technology operates on a “2D” level. Existing technologies are unable to measure the transcriptome of an entire tissue sample in three dimensions at subcellular resolution. Furthermore, most transcriptomics methods require decoding steps that have to be customized to address the specific distortions and aberrations of each technology or microscope, making standardization and optimization difficult [88,89].

8. Computational Tools for Multi-omics Data Integration

With the advancement of technologies, such as genomics and transcriptomics based on second-generation sequencing, along with proteomics and metabolomics based on high-throughput MS, a large amount of complex and heterogeneous data has been generated. Bioinformatics analysis tools are needed to integrate this intricate information. Compared with analyzing individual datasets separately, which is known as single-omics analysis, integrating and correlating various datasets is more conducive to uncovering the potential mechanisms of biological regulation and function, thereby enabling a comprehensive understanding of biological processes. Integrating the sequencing and MS data obtained from single-omics analysis is the core of multi-omics research. Integration platforms perform cross-correlation analysis of the sequencing and MS data and serve as the core platforms for multi-omics analysis. Typical multi-omics integration analysis platforms (software packages) should meet three criteria: (1) process multi-omics data in parallel rather than sequentially; (2) be capable of integrating and analyzing data from at least two omics; and (3) have no specific requirements for the data format [90].
Examples include mixOmics, which provides a series of statistical methods for exploring and integrating multi-omics datasets, including traditional and regularized multivariate approaches; MetaboAnalyst, a comprehensive web-based platform for classifying, diagnosing, and integrating metabolomic and transcriptomic data; Omics Integrator, which is used to integrate data from different omics studies and identify significant molecular networks; and Pathview, which maps genomic data onto biological pathway diagrams and supports multi-omics data analysis. Other examples include multiOmicsViz for comparing and visualizing the relationships among multi-omics datasets; Ingenuity Pathway Analysis for uncovering, visualizing, and exploring associations and networks among multi-omics data; Joint and Individual Variation Explained (JIVE) for detecting shared and unique variations among multi-omics datasets; and Cytoscape for visualizing molecular interaction networks and integrating multi-omics data.
Currently, there is an increasing number of public-platform resources that meet the standards of multi-omics integration analysis data platforms, and there are also some publicly available databases, such as those listed in Table 1 and Table 2, which provide multi-omics datasets for patients. For instance, the GO database classifies gene functions into three main aspects: cellular component (CC), molecular function (MF), and biological process (BP). By making use of the GO database, we can determine the main associations of our target genes at the CC, MF, and BP levels. The KEGG database, along with similar ones like the WikiPathway and Reactome pathway databases, annotates gene functions and their involvements in various pathways in the human body. The Cancer Genome Atlas is one of the largest multi-omics databases, which contains clinical data, genomic variations, mRNA expression, miRNA expression, methylation, and other data for various human cancers, including subtypes of tumors. The Spatial Omics Database provides data from 26 spatial omics techniques, with a dataset size exceeding 50 million cells (spots). In addition to the specialized databases mentioned above for multi-omics, we also present other databases and their URLs in Table 1 and Table 2.

9. Future Perspective

9.1. Challenges

9.1.1. The Complexity of Data Integration and Analysis

As the cost of sequencing decreases, the complexity of sequencing data increases. Dealing with large-scale datasets remains one of the greatest challenges in current multi-omics data analysis. Integrating and analyzing such data require highly specialized techniques and methods. Unlike single-omics analysis, multi-omics approaches focus on finding consistencies and correlations across different omics datasets to establish causal relationships within vast datasets. This demands significant computational resources to effectively integrate and interpret sequencing data from various technologies and platforms, along with advanced statistical and computational methods and extended processing times.

9.1.2. Data Consistency and Standardization

Inconsistent data storage and formats across different technological platforms make data processing more challenging. There is a need for a unified framework that can handle and analyze multi-omics data in a simple, clear, and visual manner from start to finish. Developing new computational tools and algorithms that optimize data storage, retrieval, and analysis is essential for promoting standardization across the field. Standardizing data formats and analytical tools is therefore an urgent issue.
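To make the standardization problem concrete, the following sketch shows one small harmonization step under stated assumptions: sample identifiers differ only in case, surrounding whitespace, and "_" versus "-", and every sample in a layer reports the same features. The function names and the identifier convention are hypothetical. The sketch keeps only samples present in every layer and z-scores each feature so that layers measured on different scales become comparable.

```python
from statistics import mean, pstdev


def normalize_id(sample_id):
    """Map identifier variants such as ' p01_t ' and 'P01-T' to one form."""
    return sample_id.strip().upper().replace("_", "-")


def harmonize(layers):
    """Merge omics layers into one table of z-scored features.

    layers: {layer_name: {sample_id: {feature: value}}}.
    Returns {sample_id: {"layer:feature": z_score}} for samples found in all layers.
    """
    cleaned = {
        name: {normalize_id(sid): row for sid, row in table.items()}
        for name, table in layers.items()
    }
    shared = set.intersection(*(set(table) for table in cleaned.values()))
    if not shared:
        return {}
    samples = sorted(shared)
    merged = {sid: {} for sid in samples}
    for name, table in cleaned.items():
        for feature in sorted(table[samples[0]]):
            values = [table[sid][feature] for sid in samples]
            mu, sd = mean(values), pstdev(values)
            for sid in samples:
                # A constant feature carries no information; report it as 0.
                z = 0.0 if sd == 0 else (table[sid][feature] - mu) / sd
                merged[sid][f"{name}:{feature}"] = z
    return merged


# Hypothetical input with inconsistent identifiers across two platforms.
layers = {
    "rna": {"p01_t": {"g1": 1.0}, "P02-T": {"g1": 3.0}, "p03-t": {"g1": 5.0}},
    "prot": {" P01-T ": {"g1": 10.0}, "P02-T": {"g1": 10.0}},
}
print(harmonize(layers))
```

Real datasets also differ in feature identifiers, units, and missing-value conventions, which this sketch does not address.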

9.1.3. The Rising Cost of Data Analysis

Despite the decrease in basic sequencing costs, the amount of sequencing required for multi-omics analysis is still considerably higher than for single-omics analysis. The larger data volume demands more computational resources, which in turn consume more manpower and funding. Long-term retention of these data also raises storage costs, and managing large-scale data has become more complex and expensive. These high costs limit the widespread application of multi-omics technologies. It is therefore crucial for researchers to select appropriate combinations of omics methods to avoid data waste and reduce analysis costs.

9.2. Prospects for Technology Application

Despite these challenges, multi-omics technologies have great potential in the field of drug-target identification. These integrated techniques have significantly impacted molecular-level drug discovery and development, playing a vital role in understanding disease mechanisms and identifying potential drug targets. In the coming decades, innovations in multi-omics technologies will further enhance our understanding of cell biology.
We can expect progress in several areas, including increased throughput, the establishment of standardized analysis formats, reduced costs for data analysis and storage, and the inclusion of more models in single assays. Additionally, improvements in the sensitivity and specificity of detecting and characterizing various forms of multi-omics measurements are anticipated.

9.3. Technological Advances and Breakthroughs

9.3.1. Single-Cell Multi-omics and Spatial Multi-omics

With continuous improvements and breakthroughs in bioinformatics analysis and tool applications, the potential of multi-omics technologies for drug-target identification is expanding. New technologies and analytical methods, such as single-cell omics and spatial omics, enable us to visualize the molecular structure and functional phenotypes of tumors at single-cell resolution. This provides unique insights into the dynamic changes in tumor molecular structures during progression and treatment.
Future developments may see spatial omics evolving into single-cell spatial multi-omics, three-dimensional spatial omics, and spatiotemporal multi-omics. These advancements will offer more detailed and high-resolution data, further improving the accuracy and efficiency of drug-target identification. However, the application of these technologies in drug-target identification is still in its early stages and requires further validation and optimization.

9.3.2. Multi-omics and Machine Learning

With the advancement of next-generation sequencing technologies, the cost of sequencing has gradually decreased, resulting in an explosion of sequencing data. Analyzing this vast amount of data manually is both time-consuming and labor-intensive, and artificial intelligence plays a crucial role in handling it. Machine learning accelerates data processing and pattern recognition and makes high-dimensional multi-omics data more tractable. This helps researchers extract valuable biological information from intricate datasets, enabling the rapid and efficient identification of potential therapeutic targets and offering smarter solutions for drug development.
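As a toy illustration of machine learning on integrated data, the sketch below uses "early integration": the feature vectors of each omics block are concatenated per sample, and a nearest-centroid classifier is fitted on the result. This is a deliberately simple stand-in chosen for the example, not a method taken from the studies reviewed here. The labels and values are hypothetical, and the features are assumed to be on comparable scales already (for example, z-scored).

```python
from math import dist


def concatenate(*blocks):
    """Early integration: join the per-sample rows of each omics block.

    Each block is a list of feature rows, with samples in the same order.
    """
    return [[value for row in rows for value in row] for rows in zip(*blocks)]


def fit_centroids(X, y):
    """Return {label: mean feature vector of the samples with that label}."""
    centroids = {}
    for label in sorted(set(y)):
        members = [x for x, lab in zip(X, y) if lab == label]
        centroids[label] = [sum(col) / len(members) for col in zip(*members)]
    return centroids


def predict(centroids, x):
    """Assign x to the label whose centroid is nearest in Euclidean distance."""
    return min(centroids, key=lambda label: dist(centroids[label], x))


# Hypothetical toy data: two transcript features and one protein feature.
rna = [[0.1, 0.2], [0.0, 0.3], [2.0, 2.1], [2.2, 1.9]]
prot = [[0.0], [0.2], [1.8], [2.1]]
labels = ["A", "A", "B", "B"]

X = concatenate(rna, prot)
model = fit_centroids(X, labels)
print(predict(model, [0.05, 0.25, 0.1]), predict(model, [2.1, 2.0, 2.0]))
```

A real analysis would use an established library, hold out data for validation, and account for the fact that features usually far outnumber samples in multi-omics studies.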

10. Conclusions

In summary, multi-omics technology shows great potential in the field of drug-target identification. Analyzing data from multiple omics layers with complementary approaches enables a deeper understanding of the pathogenesis of diseases. This not only accelerates the drug development process but also facilitates personalized medicine, in which treatment plans are tailored to the specific needs of individual patients.
Furthermore, as sequencing technology continues to advance, multi-omics technology is likely to become an indispensable tool in future drug development. Its range of applications should continue to broaden, opening new paths in the treatment of various diseases and offering practical solutions to patients and medical professionals. Through the integration of multiple disciplines and technologies, multi-omics technology is expected to keep driving change in the medical field toward more effective and personalized medical care.

Author Contributions

Conceptualization, Y.Z. and P.D.; writing—original draft preparation, P.D.; supervision, Y.Z.; writing—review and editing, N.Z., R.F. and C.W.; funding acquisition, Y.Z. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the National Natural Science Foundation of China (No. 22207027), the Nature Science Foundation of Zhejiang Province (No. LQ22B020003), and the Nature Scientific Research Foundation for Scholars of Hangzhou Normal University (No. 2021QDL041).

Data Availability Statement

No new data were created in this study. All the data reported in this review were found in original articles cited in the text.

Conflicts of Interest

The authors declare no conflicts of interest.

References

  1. Hughes, J.P.; Rees, S.; Kalindjian, S.B.; Philpott, K.L. Principles of early drug discovery. Br. J. Pharmacol. 2011, 162, 1239–1249. [Google Scholar] [CrossRef] [PubMed]
  2. Yuan, Y.-H.; Mao, N.-D.; Duan, J.-L.; Zhang, H.; Garrido, C.; Lirussi, F.; Gao, Y.; Xie, T.; Ye, X.-Y. Recent progress in discovery of novel AAK1 inhibitors: From pain therapy to potential anti-viral agents. J. Enzym. Inhib. Med. Chem. 2023, 38, 2279906. [Google Scholar] [CrossRef] [PubMed]
  3. Yao, C.; Jiang, X.; Zhao, R.; Zhong, Z.; Ge, J.; Zhu, J.; Ye, X.-Y.; Xie, Y.; Liu, Z.; Xie, T. HDAC1/MAO-B dual inhibitors against Alzheimer’s disease: Design, synthesis and biological evaluation of N-propargylamine-hydroxamic acid/o-aminobenzamide hybrids. Bioorganic Chem. 2022, 122, 105724. [Google Scholar] [CrossRef] [PubMed]
  4. He, X.; Zhang, H.; Zhang, Y.; Ye, Y.; Wang, S.; Bai, R.; Xie, T.; Ye, X.-Y. Drug discovery of histone lysine demethylases (KDMs) inhibitors (progress from 2018 to present). Eur. J. Med. Chem. 2022, 231, 114143. [Google Scholar] [CrossRef] [PubMed]
  5. Duan, J.-L.; Wang, C.-C.; Yuan, Y.; Hui, Z.; Zhang, H.; Mao, N.-D.; Zhang, P.; Sun, B.; Lin, J.; Zhang, Z. Design, Synthesis, and Structure–Activity Relationship of Novel Pyridazinone-Based PARP7/HDACs Dual Inhibitors for Elucidating the Relationship between Antitumor Immunity and HDACs Inhibition. J. Med. Chem. 2024, 67, 4950–4976. [Google Scholar] [CrossRef] [PubMed]
  6. Pun, F.W.; Ozerov, I.V.; Zhavoronkov, A. AI-powered therapeutic target discovery. Trends Pharmacol. Sci. 2023, 44, 561–572. [Google Scholar] [CrossRef] [PubMed]
  7. Jeon, J.; Nim, S.; Teyra, J.; Datti, A.; Wrana, J.L.; Sidhu, S.S.; Moffat, J.; Kim, P.M. A systematic approach to identify novel cancer drug targets using machine learning, inhibitor design and high-throughput screening. Genome Med. 2014, 6, 57. [Google Scholar] [CrossRef] [PubMed]
  8. Bolognesi, M.L.; Cavalli, A. Multitarget Drug Discovery and Polypharmacology; Wiley Online Library: Hoboken, NJ, USA, 2016; Volume 11, pp. 1190–1192. [Google Scholar]
  9. Pinheiro-de-Sousa, I.; Fonseca-Alaniz, M.H.; Giudice, G.; Valadão, I.C.; Modestia, S.M.; Mattioli, S.V.; Junior, R.R.; Zalmas, L.P.; Fang, Y.; Petsalaki, E. Integrated systems biology approach identifies gene targets for endothelial dysfunction. Mol. Syst. Biol. 2023, 19, e11462. [Google Scholar] [CrossRef]
  10. Reel, P.S.; Reel, S.; Pearson, E.; Trucco, E.; Jefferson, E. Using machine learning approaches for multi-omics data analysis: A review. Biotechnol. Adv. 2021, 49, 107739. [Google Scholar] [CrossRef]
  11. Sussulini, A.; Xia, J.; Orešič, M. Multi-omics: Trends and applications in clinical research. Front. Mol. Biosci. 2022, 9, 994239. [Google Scholar] [CrossRef]
  12. Manzoni, C.; Kia, D.A.; Vandrovcova, J.; Hardy, J.; Wood, N.W.; Lewis, P.A.; Ferrari, R. Genome, transcriptome and proteome: The rise of omics data and their integration in biomedical sciences. Brief. Bioinform. 2018, 19, 286–302. [Google Scholar] [CrossRef] [PubMed]
  13. Hasin, Y.; Seldin, M.; Lusis, A. Multi-omics approaches to disease. Genome Biol. 2017, 18, 83. [Google Scholar] [CrossRef] [PubMed]
  14. Sun, Y.V.; Hu, Y.-J. Integrative analysis of multi-omics data for discovery and functional studies of complex human diseases. Adv. Genet. 2016, 93, 147–190. [Google Scholar] [PubMed]
  15. Vandereyken, K.; Sifrim, A.; Thienpont, B.; Voet, T. Methods and applications for single-cell and spatial multi-omics. Nat. Rev. Genet. 2023, 24, 494–515. [Google Scholar] [CrossRef] [PubMed]
  16. Chan, Y.-T.; Lu, Y.; Wu, J.; Zhang, C.; Tan, H.-Y.; Bian, Z.-X.; Wang, N.; Feng, Y. CRISPR-Cas9 library screening approach for anti-cancer drug discovery: Overview and perspectives. Theranostics 2022, 12, 3329. [Google Scholar] [CrossRef] [PubMed]
  17. Yamamoto, T.N.; Kishton, R.J.; Restifo, N.P. Developing neoantigen-targeted T cell–based treatments for solid tumors. Nat. Med. 2019, 25, 1488–1499. [Google Scholar] [CrossRef] [PubMed]
  18. Haley, B.; Roudnicky, F. Functional genomics for cancer drug target discovery. Cancer Cell 2020, 38, 31–43. [Google Scholar] [CrossRef] [PubMed]
  19. Yin, H.; Kassner, M. In vitro high-throughput RNAi screening to accelerate the process of target identification and drug development. In High-Throughput RNAi Screening: Methods and Protocols; Springer: Berlin/Heidelberg, Germany, 2016; pp. 137–149. [Google Scholar]
  20. Adams, R.; Steckel, M.; Nicke, B.; Pohlenz, H.-D. RNAi as a tool for target discovery in early pharmaceutical research. Pharm.-Int. J. Pharm. Sci. 2016, 71, 35–42. [Google Scholar]
  21. Zhang, Q.; Major, M.B.; Takanashi, S.; Camp, N.D.; Nishiya, N.; Peters, E.C.; Ginsberg, M.H.; Jian, X.; Randazzo, P.A.; Schultz, P.G. Small-molecule synergist of the Wnt/β-catenin signaling pathway. Proc. Natl. Acad. Sci. USA 2007, 104, 7444–7448. [Google Scholar] [CrossRef]
  22. Takase, S.; Kurokawa, R.; Arai, D.; Kanemoto Kanto, K.; Okino, T.; Nakao, Y.; Kushiro, T.; Yoshida, M.; Matsumoto, K. A quantitative shRNA screen identifies ATP1A1 as a gene that regulates cytotoxicity by aurilide B. Sci. Rep. 2017, 7, 2002. [Google Scholar] [CrossRef]
  23. le Sage, C.; Lawo, S.; Panicker, P.; Scales, T.M.; Rahman, S.A.; Little, A.S.; McCarthy, N.J.; Moore, J.D.; Cross, B.C. Dual direction CRISPR transcriptional regulation screening uncovers gene networks driving drug resistance. Sci. Rep. 2017, 7, 17693. [Google Scholar] [CrossRef]
  24. Shendure, J.; Findlay, G.M.; Snyder, M.W. Genomic medicine–progress, pitfalls, and promise. Cell 2019, 177, 45–57. [Google Scholar] [CrossRef] [PubMed]
  25. Huminiecki, L.; Horbańczuk, J.; Atanasov, A.G. The functional genomic studies of curcumin. In Seminars in Cancer Biology; Elsevier: Amsterdam, The Netherlands, 2017; pp. 107–118. [Google Scholar]
  26. Peng, J.; Sun, B.-F.; Chen, C.-Y.; Zhou, J.-Y.; Chen, Y.-S.; Chen, H.; Liu, L.; Huang, D.; Jiang, J.; Cui, G.-S. Single-cell RNA-seq highlights intra-tumoral heterogeneity and malignant progression in pancreatic ductal adenocarcinoma. Cell Res. 2019, 29, 725–738. [Google Scholar] [CrossRef] [PubMed]
  27. Lee, J.S.; Nair, N.U.; Dinstag, G.; Chapman, L.; Chung, Y.; Wang, K.; Sinha, S.; Cha, H.; Kim, D.; Schperberg, A.V. Synthetic lethality-mediated precision oncology via the tumor transcriptome. Cell 2021, 184, 2487–2502.e13. [Google Scholar] [CrossRef] [PubMed]
  28. Alidjinou, E.; Deldalle, J.; Hallaert, C.; Robineau, O.; Ajana, F.; Choisy, P.; Hober, D.; Bocket, L. RNA and DNA Sanger sequencing versus next-generation sequencing for HIV-1 drug resistance testing in treatment-naive patients. J. Antimicrob. Chemother. 2017, 72, 2823–2830. [Google Scholar] [CrossRef] [PubMed]
  29. Hwang, B.; Lee, J.H.; Bang, D. Single-cell RNA sequencing technologies and bioinformatics pipelines. Exp. Mol. Med. 2018, 50, 1–14. [Google Scholar] [CrossRef] [PubMed]
  30. Dorado, G.; Gálvez, S.; Rosales, T.; Vásquez, V.; Hernández, P. Analyzing modern biomolecules: The revolution of nucleic-acid sequencing-review. Biomolecules 2021, 11, 1111. [Google Scholar] [CrossRef] [PubMed]
  31. Zhao, S.; Fung-Leung, W.-P.; Bittner, A.; Ngo, K.; Liu, X. Comparison of RNA-Seq and microarray in transcriptome profiling of activated T cells. PLoS ONE 2014, 9, e78644. [Google Scholar] [CrossRef] [PubMed]
  32. Wilkins, M.R.; Sanchez, J.-C.; Gooley, A.A.; Appel, R.D.; Humphery-Smith, I.; Hochstrasser, D.F.; Williams, K.L. Progress with proteome projects: Why all proteins expressed by a genome should be identified and how to do it. Biotechnol. Genet. Eng. Rev. 1996, 13, 19–50. [Google Scholar] [CrossRef]
  33. Aslam, B.; Basit, M.; Nisar, M.A.; Khurshid, M.; Rasool, M.H. Proteomics: Technologies and their applications. J. Chromatogr. Sci. 2016, 55, 182–196. [Google Scholar] [CrossRef]
  34. Cristea, I.M.; Gaskell, S.J.; Whetton, A.D. Proteomics techniques and their application to hematology. Blood 2004, 103, 3624–3634. [Google Scholar] [CrossRef] [PubMed]
  35. Vogel, C.; Marcotte, E.M. Insights into the regulation of protein abundance from proteomic and transcriptomic analyses. Nat. Rev. Genet. 2012, 13, 227–232. [Google Scholar] [CrossRef] [PubMed]
  36. Cox, J.; Mann, M. Is proteomics the new genomics? Cell 2007, 130, 395–398. [Google Scholar] [CrossRef] [PubMed]
  37. Zheng, J.; Haberland, V.; Baird, D.; Walker, V.; Haycock, P.C.; Hurle, M.R.; Gutteridge, A.; Erola, P.; Liu, Y.; Luo, S. Phenome-wide Mendelian randomization mapping the influence of the plasma proteome on complex diseases. Nat. Genet. 2020, 52, 1122–1131. [Google Scholar] [CrossRef]
  38. Meissner, F.; Geddes-McAlister, J.; Mann, M.; Bantscheff, M. The emerging role of mass spectrometry-based proteomics in drug discovery. Nat. Rev. Drug Discov. 2022, 21, 637–654. [Google Scholar] [CrossRef] [PubMed]
  39. Marquart, J.; Chen, E.Y.; Prasad, V. Estimation of the percentage of US patients with cancer who benefit from genome-driven oncology. JAMA Oncol. 2018, 4, 1093–1098. [Google Scholar] [CrossRef] [PubMed]
  40. Charron, G.; Zhang, M.M.; Yount, J.S.; Wilson, J.; Raghavan, A.S.; Shamir, E.; Hang, H.C. Robust fluorescent detection of protein fatty-acylation with chemical reporters. J. Am. Chem. Soc. 2009, 131, 4967–4975. [Google Scholar] [CrossRef] [PubMed]
  41. Piétu, G.; Mariage-Samson, R.; Fayein, N.-A.; Matingou, C.; Eveno, E.; Houlgatte, R.; Decraene, C.; Vandenbrouck, Y.; Tahi, F.; Devignes, M.-D. The Genexpress IMAGE knowledge base of the human brain transcriptome: A prototype integrated resource for functional and computational genomics. Genome Res. 1999, 9, 195–209. [Google Scholar] [CrossRef]
  42. Jin, Y.; Yoon, Y.J.; Jeon, Y.J.; Choi, J.; Lee, Y.-J.; Lee, J.; Choi, S.; Nash, O.; Han, D.C.; Kwon, B.-M. Geranylnaringenin (CG902) inhibits constitutive and inducible STAT3 activation through the activation of SHP-2 tyrosine phosphatase. Biochem. Pharmacol. 2017, 142, 46–57. [Google Scholar] [CrossRef]
  43. Kirsch, V.C.; Orgler, C.; Braig, S.; Jeremias, I.; Auerbach, D.; Müller, R.; Vollmar, A.M.; Sieber, S.A. The cytotoxic natural product vioprolide A targets nucleolar protein 14, which is essential for ribosome biogenesis. Angew. Chem. Int. Ed. 2020, 59, 1595–1600. [Google Scholar] [CrossRef]
  44. Geng, J.; Liu, W.; Gao, J.; Jiang, C.; Fan, T.; Sun, Y.; Qin, Z.H.; Xu, Q.; Guo, W.; Gao, J. Andrographolide alleviates Parkinsonism in MPTP-PD mice via targeting mitochondrial fission mediated by dynamin-related protein 1. Br. J. Pharmacol. 2019, 176, 4574–4591. [Google Scholar] [CrossRef] [PubMed]
  45. West, G.M.; Tucker, C.L.; Xu, T.; Park, S.K.; Han, X.; Yates, J.R., III; Fitzgerald, M.C. Quantitative proteomics approach for identifying protein–drug interactions in complex mixtures using protein stability measurements. Proc. Natl. Acad. Sci. USA 2010, 107, 9078–9082. [Google Scholar] [CrossRef]
  46. Yuyama, K.; Sun, H.; Fujii, R.; Hemmi, I.; Ueda, K.; Igeta, Y. Extracellular vesicle proteome unveils cathepsin B connection to Alzheimer’s disease pathogenesis. Brain 2024, 147, 627–636. [Google Scholar] [CrossRef] [PubMed]
  47. Han, Z.-J.; Feng, Y.-H.; Gu, B.-H.; Li, Y.-M.; Chen, H. The post-translational modification, SUMOylation, and cancer. Int. J. Oncol. 2018, 52, 1081–1094. [Google Scholar] [CrossRef]
  48. Perrin, J.; Werner, T.; Kurzawa, N.; Rutkowska, A.; Childs, D.D.; Kalxdorf, M.; Poeckel, D.; Stonehouse, E.; Strohmer, K.; Heller, B. Identifying drug targets in tissues and whole blood with thermal-shift profiling. Nat. Biotechnol. 2020, 38, 303–308. [Google Scholar] [CrossRef]
  49. Lomenick, B.; Olsen, R.W.; Huang, J. Identification of direct protein targets of small molecules. ACS Chem. Biol. 2011, 6, 34–46. [Google Scholar] [CrossRef]
  50. Reiche, J.; Huber, O. Post-translational modifications of tight junction transmembrane proteins and their direct effect on barrier function. Biochim. Biophys. Acta BBA-Biomembr. 2020, 1862, 183330. [Google Scholar] [CrossRef] [PubMed]
  51. Wishart, D.S. Metabolomics for investigating physiological and pathophysiological processes. Physiol. Rev. 2019, 99, 1819–1875. [Google Scholar] [CrossRef]
  52. Zampieri, M.; Sekar, K.; Zamboni, N.; Sauer, U. Frontiers of high-throughput metabolomics. Curr. Opin. Chem. Biol. 2017, 36, 15–23. [Google Scholar] [CrossRef]
  53. Sévin, D.C.; Sauer, U. Ubiquinone accumulation improves osmotic-stress tolerance in Escherichia coli. Nat. Chem. Biol. 2014, 10, 266–272. [Google Scholar] [CrossRef]
  54. Fan, S.; Gao, Y.; Zhang, H.; Huang, M.; Bi, H. Untargeted and targeted metabolomics and their applications in discovering drug targets. Prog. Pharm. Sci. 2017, 41, 263–269. [Google Scholar]
  55. Karczewski, K.J.; Snyder, M.P. Integrative omics for health and disease. Nat. Rev. Genet. 2018, 19, 299–310. [Google Scholar] [CrossRef] [PubMed]
  56. Wang, B.; Mezlini, A.M.; Demir, F.; Fiume, M.; Tu, Z.; Brudno, M.; Haibe-Kains, B.; Goldenberg, A. Similarity network fusion for aggregating data types on a genomic scale. Nat. Methods 2014, 11, 333–337. [Google Scholar] [CrossRef] [PubMed]
  57. Lan, T.; Yuan, K.; Yan, X.; Xu, L.; Liao, H.; Hao, X.; Wang, J.; Liu, H.; Chen, X.; Xie, K. LncRNA SNHG10 facilitates hepatocarcinogenesis and metastasis by modulating its homolog SCARNA13 via a positive feedback loop. Cancer Res. 2019, 79, 3220–3234. [Google Scholar] [CrossRef] [PubMed]
  58. Liu, W.; Wang, H.; Zhao, Q.; Tao, C.; Qu, W.; Hou, Y.; Huang, R.; Sun, Z.; Zhu, G.; Jiang, X. Multiomics analysis reveals metabolic subtypes and identifies diacylglycerol kinase α (DGKA) as a potential therapeutic target for intrahepatic cholangiocarcinoma. Cancer Commun. 2024, 44, 226–250. [Google Scholar] [CrossRef] [PubMed]
  59. Gao, R.; Wang, H.; Li, T.; Wang, J.; Ren, Z.; Cai, N.; Ai, H.; Li, S.; Lu, Y.; Zhu, Y. Secreted MUP1 that reduced under ER stress attenuates ER stress induced insulin resistance through suppressing protein synthesis in hepatocytes. Pharmacol. Res. 2023, 187, 106585. [Google Scholar] [CrossRef] [PubMed]
  60. Wang, M.; Zhang, Z.; Liu, J.; Song, M.; Zhang, T.; Chen, Y.; Hu, H.; Yang, P.; Li, B.; Song, X. Gefitinib and fostamatinib target EGFR and SYK to attenuate silicosis: A multi-omics study with drug exploration. Signal Transduct. Target. Ther. 2022, 7, 157. [Google Scholar] [CrossRef] [PubMed]
  61. Cai, H.; Guo, F.; Wen, S.; Jin, X.; Wu, H.; Ren, D. Overexpressed integrin alpha 2 inhibits the activation of the transforming growth factor β pathway in pancreatic cancer via the TFCP2-SMAD2 axis. J. Exp. Clin. Cancer Res. 2022, 41, 73. [Google Scholar] [CrossRef]
  62. Reustle, A.; Di Marco, M.; Meyerhoff, C.; Nelde, A.; Walz, J.S.; Winter, S.; Kandabarau, S.; Büttner, F.; Haag, M.; Backert, L. Integrative-omics and HLA-ligandomics analysis to identify novel drug targets for ccRCC immunotherapy. Genome Med. 2020, 12, 1–24. [Google Scholar] [CrossRef]
  63. Liu, J.; Jing, W.; Wang, T.; Hu, Z.; Lu, H. Functional metabolomics revealed the dual-activation of cAMP-AMP axis is a novel therapeutic target of pancreatic cancer. Pharmacol. Res. 2023, 187, 106554. [Google Scholar] [CrossRef]
  64. Shao, X.; Ji, F.; Wang, Y.; Zhu, L.; Zhang, Z.; Du, X.; Chung, A.C.K.; Hong, Y.; Zhao, Q.; Cai, Z. Integrative chemical proteomics-metabolomics approach reveals Acaca/Acacb as direct molecular targets of PFOA. Anal. Chem. 2018, 90, 11092–11098. [Google Scholar] [CrossRef] [PubMed]
  65. Wan, L.; Gao, Q.; Deng, Y.; Ke, Y.; Ma, E.; Yang, H.; Lin, H.; Li, H.; Yang, Y.; Gong, J. GP73 is a glucogenic hormone contributing to SARS-CoV-2-induced hyperglycemia. Nat. Metab. 2022, 4, 29–43. [Google Scholar] [CrossRef] [PubMed]
  66. Jaffe, J.D.; Berg, H.C.; Church, G.M. Proteogenomic mapping as a complementary method to perform genome annotation. Proteomics 2004, 4, 59–77. [Google Scholar] [CrossRef] [PubMed]
  67. Venter, J.C.; Smith, H.O.; Adams, M.D. The sequence of the human genome. Clin. Chem. 2015, 61, 1207–1208. [Google Scholar] [CrossRef] [PubMed]
  68. Li, X.; Wang, W.; Chen, J. Recent progress in mass spectrometry proteomics for biomedical research. Sci. China Life Sci. 2017, 60, 1093–1113. [Google Scholar] [CrossRef] [PubMed]
  69. Satpathy, S.; Jaehnig, E.J.; Krug, K.; Kim, B.-J.; Saltzman, A.B.; Chan, D.W.; Holloway, K.R.; Anurag, M.; Huang, C.; Singh, P. Microscaled proteogenomic methods for precision oncology. Nat. Commun. 2020, 11, 532. [Google Scholar] [CrossRef] [PubMed]
  70. Mani, D.; Krug, K.; Zhang, B.; Satpathy, S.; Clauser, K.R.; Ding, L.; Ellis, M.; Gillette, M.A.; Carr, S.A. Cancer proteogenomics: Current impact and future prospects. Nat. Rev. Cancer 2022, 22, 298–313. [Google Scholar] [CrossRef] [PubMed]
  71. Gillette, M.A.; Satpathy, S.; Cao, S.; Dhanasekaran, S.M.; Vasaikar, S.V.; Krug, K.; Petralia, F.; Li, Y.; Liang, W.-W.; Reva, B. Proteogenomic characterization reveals therapeutic vulnerabilities in lung adenocarcinoma. Cell 2020, 182, 200–225.e35. [Google Scholar] [CrossRef] [PubMed]
  72. Chen, Y.-J.; Roumeliotis, T.I.; Chang, Y.-H.; Chen, C.-T.; Han, C.-L.; Lin, M.-H.; Chen, H.-W.; Chang, G.-C.; Chang, Y.-L.; Wu, C.-T. Proteogenomics of non-smoking lung cancer in East Asia delineates molecular signatures of pathogenesis and progression. Cell 2020, 182, 226–244.e17. [Google Scholar] [CrossRef]
  73. Xu, J.-Y.; Zhang, C.; Wang, X.; Zhai, L.; Ma, Y.; Mao, Y.; Qian, K.; Sun, C.; Liu, Z.; Jiang, S. Integrative proteomic characterization of human lung adenocarcinoma. Cell 2020, 182, 245–261.e17. [Google Scholar] [CrossRef]
  74. Ogbeide, S.; Giannese, F.; Mincarelli, L.; Macaulay, I.C. Into the multiverse: Advances in single-cell multiomic profiling. Trends Genet. 2022, 38, 831–843. [Google Scholar] [CrossRef] [PubMed]
  75. Teichmann, S.; Efremova, M. Method of the year 2019: Single-cell multimodal omics. Nat. Methods 2020, 17, 2020. [Google Scholar]
  76. Zhu, C.; Preissl, S.; Ren, B. Single-cell multimodal omics: The power of many. Nat. Methods 2020, 17, 11–14. [Google Scholar] [CrossRef] [PubMed]
  77. Baysoy, A.; Bai, Z.; Satija, R.; Fan, R. The technological landscape and applications of single-cell multi-omics. Nat. Rev. Mol. Cell Biol. 2023, 24, 695–713. [Google Scholar] [CrossRef] [PubMed]
  78. Lee, J.; Hyeon, D.Y.; Hwang, D. Single-cell multiomics: Technologies and data analysis methods. Exp. Mol. Med. 2020, 52, 1428–1442. [Google Scholar] [CrossRef] [PubMed]
  79. Wen, L.; Tang, F. Recent advances in single-cell sequencing technologies. Precis. Clin. Med. 2022, 5, pbac002. [Google Scholar] [CrossRef] [PubMed]
  80. Nassar, S.F.; Raddassi, K.; Wu, T. Single-cell multiomics analysis for drug discovery. Metabolites 2021, 11, 729. [Google Scholar] [CrossRef] [PubMed]
  81. Terekhanova, N.V.; Karpova, A.; Liang, W.-W.; Strzalkowski, A.; Chen, S.; Li, Y.; Southard-Smith, A.N.; Iglesia, M.D.; Wendl, M.C.; Jayasinghe, R.G. Epigenetic regulation during cancer transitions across 11 tumour types. Nature 2023, 623, 432–441. [Google Scholar] [CrossRef] [PubMed]
  82. Zhu, Q.; Zhao, X.; Zhang, Y.; Li, Y.; Liu, S.; Han, J.; Sun, Z.; Wang, C.; Deng, D.; Wang, S. Single cell multi-omics reveal intra-cell-line heterogeneity across human cancer cell lines. Nat. Commun. 2023, 14, 8170. [Google Scholar] [CrossRef]
  83. Marx, V. Method of the Year: Spatially resolved transcriptomics. Nat. Methods 2021, 18, 9–14. [Google Scholar] [CrossRef]
  84. Eisenstein, M. Seven technologies to watch in 2022. Nature 2022, 601, 658–661. [Google Scholar] [CrossRef] [PubMed]
  85. Yao, L.; Wang, J.T.; Jayasinghe, R.G.; O’Neal, J.; Tsai, C.-F.; Rettig, M.P.; Song, Y.; Liu, R.; Zhao, Y.; Ibrahim, O.M. Single-cell discovery and multiomic characterization of therapeutic targets in multiple myeloma. Cancer Res. 2023, 83, 1214–1233. [Google Scholar] [CrossRef] [PubMed]
  86. Sun, C.; Wang, A.; Zhou, Y.; Chen, P.; Wang, X.; Huang, J.; Gao, J.; Wang, X.; Shu, L.; Lu, J. Spatially resolved multi-omics highlights cell-specific metabolic remodeling and interactions in gastric cancer. Nat. Commun. 2023, 14, 2692. [Google Scholar] [CrossRef] [PubMed]
  87. Pelka, K.; Hofree, M.; Chen, J.H.; Sarkizova, S.; Pirl, J.D.; Jorgji, V.; Bejnood, A.; Dionne, D.; William, H.G.; Xu, K.H. Spatially organized multicellular immune hubs in human colorectal cancer. Cell 2021, 184, 4734–4752.e20. [Google Scholar] [CrossRef] [PubMed]
  88. Bressan, D.; Battistoni, G.; Hannon, G.J. The dawn of spatial omics. Science 2023, 381, eabq4964. [Google Scholar] [CrossRef] [PubMed]
  89. Bingham, G.C.; Lee, F.; Naba, A.; Barker, T.H. Spatial-omics: Novel approaches to probe cell heterogeneity and extracellular matrix biology. Matrix Biol. 2020, 91, 152–166. [Google Scholar] [CrossRef] [PubMed]
  90. Subramanian, I.; Verma, S.; Kumar, S.; Jere, A.; Anamika, K. Multi-omics data integration, interpretation, and its application. Bioinform. Biol. Insights 2020, 14, 1177932219899051. [Google Scholar] [CrossRef] [PubMed]
  91. Benson, D.A.; Cavanaugh, M.; Clark, K.; Karsch-Mizrachi, I.; Lipman, D.J.; Ostell, J.; Sayers, E.W. GenBank. Nucleic Acids Res. 2012, 41, D36–D42. [Google Scholar] [CrossRef] [PubMed]
  92. Goujon, M.; McWilliam, H.; Li, W.; Valentin, F.; Squizzato, S.; Paern, J.; Lopez, R. A new bioinformatics analysis tools framework at EMBL–EBI. Nucleic Acids Res. 2010, 38 (Suppl. 2), W695–W699. [Google Scholar] [CrossRef]
  93. Tateno, Y.; Imanishi, T.; Miyazaki, S.; Fukami-Kobayashi, K.; Saitou, N.; Sugawara, H.; Gojobori, T. DNA Data Bank of Japan (DDBJ) for genome scale research in life science. Nucleic Acids Res. 2002, 30, 27–30. [Google Scholar] [CrossRef]
  94. Ogasawara, O.; Kodama, Y.; Mashima, J.; Kosuge, T.; Fujisawa, T. DDBJ Database updates and computational infrastructure enhancement. Nucleic Acids Res. 2020, 48, D45–D50. [Google Scholar] [CrossRef] [PubMed]
  95. Kozomara, A.; Birgaoanu, M.; Griffiths-Jones, S. miRBase: From microRNA sequences to function. Nucleic Acids Res. 2019, 47, D155–D162. [Google Scholar] [CrossRef]
  96. Griffiths-Jones, S. miRBase: The microRNA sequence database. In MicroRNA Protocols; Humana Press: Totowa, NJ, USA, 2006; pp. 129–138. [Google Scholar]
  97. Liu, L.; Li, Z.; Liu, C.; Zou, D.; Li, Q.; Feng, C.; Jing, W.; Luo, S.; Zhang, Z.; Ma, L. LncRNAWiki 2.0: A knowledgebase of human long non-coding RNAs with enhanced curation model and database system. Nucleic Acids Res. 2022, 50, D190–D195. [Google Scholar] [CrossRef]
  98. Kalvari, I.; Nawrocki, E.P.; Ontiveros-Palacios, N.; Argasinska, J.; Lamkiewicz, K.; Marz, M.; Griffiths-Jones, S.; Toffano-Nioche, C.; Gautheret, D.; Weinberg, Z. Rfam 14: Expanded coverage of metagenomic, viral and microRNA families. Nucleic Acids Res. 2021, 49, D192–D200. [Google Scholar] [CrossRef]
  99. Griffiths-Jones, S.; Bateman, A.; Marshall, M.; Khanna, A.; Eddy, S.R. Rfam: An RNA family database. Nucleic Acids Res. 2003, 31, 439–441. [Google Scholar] [CrossRef] [PubMed]
  100. Consortium, U. UniProt: A worldwide hub of protein knowledge. Nucleic Acids Res. 2019, 47, D506–D515. [Google Scholar] [CrossRef] [PubMed]
  101. Paysan-Lafosse, T.; Blum, M.; Chuguransky, S.; Grego, T.; Pinto, B.L.; Salazar, G.A.; Bileschi, M.L.; Bork, P.; Bridge, A.; Colwell, L. InterPro in 2022. Nucleic Acids Res. 2023, 51, D418–D427. [Google Scholar] [CrossRef] [PubMed]
  102. Hunter, S.; Apweiler, R.; Attwood, T.K.; Bairoch, A.; Bateman, A.; Binns, D.; Bork, P.; Das, U.; Daugherty, L.; Duquenne, L. InterPro: The integrative protein signature database. Nucleic Acids Res. 2009, 37 (Suppl. 1), D211–D215. [Google Scholar] [CrossRef]
  103. Berman, H.M.; Battistuz, T.; Bhat, T.N.; Bluhm, W.F.; Bourne, P.E.; Burkhardt, K.; Feng, Z.; Gilliland, G.L.; Iype, L.; Jain, S. The protein data bank. Acta Crystallogr. Sect. D Biol. Crystallogr. 2002, 58, 899–907. [Google Scholar] [CrossRef]
  104. Lo Conte, L.; Ailey, B.; Hubbard, T.J.; Brenner, S.E.; Murzin, A.G.; Chothia, C. SCOP: A structural classification of proteins database. Nucleic Acids Res. 2000, 28, 257–259. [Google Scholar] [CrossRef]
  105. Perez-Riverol, Y.; Csordas, A.; Bai, J.; Bernal-Llinares, M.; Hewapathirana, S.; Kundu, D.J.; Inuganti, A.; Griss, J.; Mayer, G.; Eisenacher, M. The PRIDE database and related tools and resources in 2019: Improving support for quantification data. Nucleic Acids Res. 2019, 47, D442–D450. [Google Scholar] [CrossRef] [PubMed]
  106. Hulo, N.; Bairoch, A.; Bulliard, V.; Cerutti, L.; De Castro, E.; Langendijk-Genevaux, P.S.; Pagni, M.; Sigrist, C.J. The PROSITE database. Nucleic Acids Res. 2006, 34 (Suppl. 1), D227–D230. [Google Scholar] [CrossRef] [PubMed]
  107. Xenarios, I.; Rice, D.W.; Salwinski, L.; Baron, M.K.; Marcotte, E.M.; Eisenberg, D. DIP: The database of interacting proteins. Nucleic Acids Res. 2000, 28, 289–291. [Google Scholar] [CrossRef] [PubMed]
  108. Kanehisa, M.; Sato, Y.; Kawashima, M.; Furumichi, M.; Tanabe, M. KEGG as a reference resource for gene and protein annotation. Nucleic Acids Res. 2016, 44, D457–D462. [Google Scholar] [CrossRef] [PubMed]
  109. Kanehisa, M.; Furumichi, M.; Tanabe, M.; Sato, Y.; Morishima, K. KEGG: New perspectives on genomes, pathways, diseases and drugs. Nucleic Acids Res. 2017, 45, D353–D361. [Google Scholar] [CrossRef] [PubMed]
  110. Consortium, G.O. The Gene Ontology (GO) database and informatics resource. Nucleic Acids Res. 2004, 32 (Suppl. S1), D258–D261. [Google Scholar] [CrossRef] [PubMed]
  111. Wishart, D.S.; Tzur, D.; Knox, C.; Eisner, R.; Guo, A.C.; Young, N.; Cheng, D.; Jewell, K.; Arndt, D.; Sawhney, S. HMDB: The human metabolome database. Nucleic Acids Res. 2007, 35 (Suppl. 1), D521–D526. [Google Scholar] [CrossRef] [PubMed]
  112. Tomczak, K.; Czerwińska, P.; Wiznerowicz, M. Review The Cancer Genome Atlas (TCGA): An immeasurable source of knowledge. Contemp. Oncol. 2015, 2015, 68–77. [Google Scholar] [CrossRef] [PubMed]
  113. Lv, M.; Tian, G.; Guo, X. TARGET database introduction and data extraction. Chin. J. Evid. Based Cardiovasc. Med. 2019, 11, 9–12. [Google Scholar]
  114. Zhang, J.; Bajari, R.; Andric, D.; Gerthoffert, F.; Lepsa, A.; Nahal-Bose, H.; Stein, L.D.; Ferretti, V. The international cancer genome consortium data portal. Nat. Biotechnol. 2019, 37, 367–369. [Google Scholar] [CrossRef]
  115. Perez-Riverol, Y.; Bai, M.; da Veiga Leprevost, F.; Squizzato, S.; Park, Y.M.; Haug, K.; Carroll, A.J.; Spalding, D.; Paschall, J.; Wang, M. Discovering and linking public omics data sets using the Omics Discovery Index. Nat. Biotechnol. 2017, 35, 406–409. [Google Scholar] [CrossRef] [PubMed]
  116. Craven, K.E.; Gökmen-Polar, Y.; Badve, S.S. CIBERSORT analysis of TCGA and METABRIC identifies subgroups with better outcomes in triple negative breast cancer. Sci. Rep. 2021, 11, 4691. [Google Scholar] [CrossRef] [PubMed]
  117. Bouhaddou, M.; DiStefano, M.S.; Riesel, E.A.; Carrasco, E.; Holzapfel, H.Y.; Jones, D.C.; Smith, G.R.; Stern, A.D.; Somani, S.S.; Thompson, T.V. Drug response consistency in CCLE and CGP. Nature 2016, 540, E9–E10. [Google Scholar] [CrossRef] [PubMed]
  118. Satterlee, J.S.; Chadwick, L.H.; Tyson, F.L.; McAllister, K.; Beaver, J.; Birnbaum, L.; Volkow, N.D.; Wilder, E.L.; Anderson, J.M.; Roy, A.L. The NIH common fund/roadmap epigenomics program: Successes of a comprehensive consortium. Sci. Adv. 2019, 5, eaaw6507. [Google Scholar] [CrossRef]
  119. Chadwick, L.H. The NIH roadmap epigenomics program data resource. Epigenomics 2012, 4, 317–324. [Google Scholar] [CrossRef]
Figure 1. History of the development of omics technology.
Figure 2. Application of multi-omics technology to the drug-target identification process.
Figure 3. Single-omics analysis steps and multi-omics analysis.
Table 1. A survey of common single-omics database information—including brief descriptions of each database (All websites were accessed on 11 June 2024).
| Omics Type | Classification | Common Reference Database | Website | Main Functions | Reference |
|---|---|---|---|---|---|
| Genomics | Three major DNA databases | GenBank | http://www.ncbi.nlm.nih.gov/genbank/ | Annotated collection of publicly available nucleotide sequences and their protein translations | [91] |
| Genomics | Three major DNA databases | EMBL | https://www.ebi.ac.uk | Nucleic acid sequences, genomes, microarray gene expression, protein sequences, annotations, and many other biological data | [92] |
| Genomics | Three major DNA databases | DDBJ | https://www.ddbj.nig.ac.jp/index-e.html | Genomic, transcriptomic, epigenomic, exomic, metagenomic, metatranscriptomic, and other multi-omics data for human, animal, and other samples | [93,94] |
| Transcriptomics | NCBI | miRBase | http://www.mirbase.org | The most comprehensive miRNA database, with nearly 40,000 miRNAs from more than 200 species | [95,96] |
| Transcriptomics | EMBL-EBI | LncRNAwiki | https://ngdc.cncb.ac.cn/lncrnawiki1/index.php/Main_Page | Integrates more than 100,000 currently available lncRNAs and classifies long non-coding RNAs | [97] |
| Transcriptomics | NGDC | Rfam | http://rfam.org | Identification of non-coding RNAs; commonly used to annotate new nucleic acid sequences or genome sequences | [98,99] |
| Proteomics | Protein sequence database | UniProt | http://www.uniprot.org | Contains protein sequences, functional information, and an index of research papers | [100] |
| Proteomics | Protein sequence database | PIR | https://proteininformationresource.org | Database integrating public resources on protein function prediction data | [87] |
| Proteomics | Protein sequence database | InterPro | http://www.ebi.ac.uk/interpro/ | Integrated database of protein structural domains and functional sites, with data resources on protein families, domains, repeat sequences, and sites of action | [101,102] |
| Proteomics | Protein structure database | Protein Data Bank | https://www.rcsb.org | Includes X-ray diffraction structures, NMR spectra, electron microscopy (EM) imaging, and some special structures (e.g., DNA and RNA structure libraries) | [103] |
| Proteomics | Protein structure database | SCOP | https://www.ebi.ac.uk/pdbe/scop/ | Classifies known protein structures and describes the functions and evolutionary relationships of proteins of known structure, based on amino acid composition and similarities in tertiary structure | [104] |
| Proteomics | Proteome database | PRIDE | https://www.ebi.ac.uk/pride/archive/ | Public repository of mass spectrometry-based proteomics data, including support for quantification data | [105] |
| Proteomics | Protein functional domain database | PROSITE | https://prosite.expasy.org | Database of protein families and structural domains containing biologically significant sites, patterns, and statistical features that can help identify protein families | [106] |
| Proteomics | Protein interaction database | DIP | https://dip.doe-mbi.ucla.edu/dip/Main.cgi | Database of interacting proteins; a tool for studying biological response mechanisms | [107] |
| Metabolomics | Metabolic pathways database | KEGG | https://www.kegg.jp | Links genomic information to higher-order functional information for the study of genomes, metabolomes, signaling pathways, and biochemical reactions | [108,109] |
| Metabolomics | Metabolic pathways database | GO | https://geneontology.org | Standardizes the description of gene and protein function | [110] |
| Metabolomics | Commonly used metabolome database | HMDB | https://hmdb.ca | Comprehensive reference information on human metabolites and their associated biological, physiological, and chemical properties | [111] |
Table 2. Overview of the characteristics of common multi-omics databases (all websites were accessed on 11 June 2024).
| Multi-omics Database | Website | Omics Assay Type | Reference |
|---|---|---|---|
| The Cancer Genome Atlas (TCGA) | https://www.cancer.gov/ccg/research/genome-sequencing/tcga | RNA-Seq, DNA-Seq, miRNA-Seq, SNV, CNV, DNA methylation, and RPPA | [112] |
| TARGET | https://www.cancer.gov/ccg/research/genome-sequencing/target | Gene expression, miRNA expression, copy number, and sequencing data | [113] |
| International Cancer Genome Consortium (ICGC) | https://dcc.icgc.org/ | Whole genome sequencing, genomic variation data (somatic and germline mutations) | [114] |
| OmicsDI | https://www.omicsdi.org | Proteomics, genomics, metabolomics, and transcriptomics data | [115] |
| METABRIC | https://www.cbioportal.org | Clinical features, gene expression, SNPs, and CNVs | [116] |
| CCLE | https://sites.broadinstitute.org/ccle | Gene expression, copy number, and sequencing data; pharmacological profile of 24 anticancer drugs | [117] |
| Roadmap Epigenomics | https://commonfund.nih.gov/epigenomics | RNA-seq, ChIP-seq (histones), DNase-seq, and methylation data | [118,119] |

Share and Cite

MDPI and ACS Style

Du, P.; Fan, R.; Zhang, N.; Wu, C.; Zhang, Y. Advances in Integrated Multi-omics Analysis for Drug-Target Identification. Biomolecules 2024, 14, 692. https://doi.org/10.3390/biom14060692
