ToxDAR: A Workflow Software for Analyzing Toxicologically Relevant Proteomic and Transcriptomic Data, from Data Preparation to Toxicological Mechanism Elucidation

Jiang, Peng; Zhang, Zuzhen; Yu, Qing; Wang, Ze; Diao, Lihong; Li, Dong

doi:10.3390/ijms25179544

by Peng Jiang ¹,†, Zuzhen Zhang ¹,†, Qing Yu ², Ze Wang ³, Lihong Diao ³ and Dong Li ¹,²,³,*

¹ School of Basic Medical Sciences, Anhui Medical University, Hefei 230032, China

² College of Life Sciences, Hebei University, Baoding 071002, China

³ State Key Laboratory of Medical Proteomics, Beijing Proteome Research Center, National Center for Protein Sciences (Beijing), Beijing Institute of Lifeomics, Beijing 102206, China

* Author to whom correspondence should be addressed.

† These authors contributed equally to this work.

Int. J. Mol. Sci. 2024, 25(17), 9544; https://doi.org/10.3390/ijms25179544

Submission received: 21 July 2024 / Revised: 26 August 2024 / Accepted: 30 August 2024 / Published: 2 September 2024

(This article belongs to the Special Issue Bioinformatics Study in Human Diseases: Integration of Omics Data for Personalized Medicine (Second Edition))

Abstract

Exploration of toxicological mechanisms is imperative for the assessment of potential adverse reactions to chemicals and pharmaceutical agents, the engineering of safer compounds, and the preservation of public health. It forms the foundation of drug development and disease treatment. High-throughput proteomics and transcriptomics can accurately capture the body’s response to toxins and have become key tools for revealing complex toxicological mechanisms. Recently, a vast amount of omics data related to toxicological mechanisms has accumulated. However, analyzing and utilizing these data remains a major challenge for researchers, especially given the lack of a knowledge-based analysis system that identifies toxicity-related biological pathways from the data and connects omics data with existing toxicological knowledge. To address this, we have developed ToxDAR, a workflow-oriented R package for preprocessing and analyzing toxicological multi-omics data. ToxDAR integrates packages like NormExpression, DESeq2, and igraph, and utilizes R functions such as prcomp and phyper. It supports data preparation, quality control, differential expression analysis, functional analysis, and network analysis. ToxDAR’s architecture also includes a knowledge graph with five major categories of mechanism-related biological entities and details fifteen types of interactions among them, providing comprehensive knowledge annotation for omics data analysis results. As a case study, we used ToxDAR to analyze a transcriptomic dataset on the toxicology of triphenyl phosphate (TPP). The results indicate that TPP may impair thyroid function by activating thyroid hormone receptor β (THRB), impacting pathways related to programmed cell death and inflammation. As a workflow-oriented data analysis tool, ToxDAR is expected to be crucial for understanding toxic mechanisms from omics data, discovering new therapeutic targets, and evaluating chemical safety.

Keywords:

toxicological transcriptomics analysis; toxicological mechanisms; R package

1. Introduction

Toxicological mechanism research is crucial for revealing the impact of chemical substances at the molecular, cellular, and even organ levels on living systems [1,2]. It is essential for predicting and assessing chemical risks [3], establishing chemical safety standards, and developing safe chemical substitutes [4,5]. Moreover, it plays a key role in determining clinical treatment dosages, reducing the risk of side effects [6], and advancing personalized medicine [7,8].

In recent years, transcriptomics and proteomics have increasingly become essential tools for studying the interactions between toxins and organisms [9]. While genomics provides the genetic blueprint of an organism [10], its relatively static data do not capture an organism’s rapid response to environmental changes. Transcriptomics allows for the holistic study of RNA expression changes [11], offering direct evidence of how genes are regulated in response to external toxins. Concurrently, proteomics, by examining the expression, modification, and interaction of proteins, reveals their specific alterations under the influence of toxins, thus elucidating the mechanisms of toxicology directly [12,13,14]. Although the advancement of ‘omics’ technologies has greatly enhanced our capacity to study the effects of toxin exposure [15], the vast datasets pose significant challenges for bioinformatics [16]. Toxicological databases such as LINCS, ToxCast [17], and Open TG-GATEs [18] and analytical tools like TCGAbiolinks 2.32.0 [19], RTCGAToolbox 2.34.0 [20], and cBioPortal v5 [21] are currently available. Yet, they only facilitate basic analysis processes like data download, preprocessing, and comparative analysis, which do not meet the needs of toxicological mechanism research. Other general analysis software, such as PCA 1.0.7 [22], limma [23], clusterProfiler 4.12.6 [24], and Cytoscape 3.10.2 [25], can perform targeted data analysis functions, such as differential expression, functional enrichment, and network construction. Additionally, there have been some efforts in building toxicology-related knowledge bases. For example, the National Center for Toxicogenomics has developed the Chemical Effects in Biological Systems (CEBS) knowledge base [26], which includes toxicology-related literature and dataset information.
The Comparative Toxicogenomics Database (CTD) [27] integrates toxicology information on chemicals, genes, phenotypes, and exposures, revealing the impact of environmental factors on disease etiology and molecular mechanisms. The ECOTOXicology Knowledgebase [28] aggregates various ecological toxicity data and supports risk assessment for ecological toxicity testing. However, the field of toxicology still lacks a comprehensive knowledge base that systematically presents the relationships between toxic molecules and biological systems, as well as omics data analysis software based on such a knowledge base. More importantly, researchers must frequently switch between different analytical software tools and online resources during the research process, which can be complex, time-consuming, and prone to errors. Therefore, it is necessary to develop a procedural system for the analysis of toxicogenomic data that improves the accuracy, reliability, and reproducibility of data analysis.

To address the challenges in analyzing toxicogenomics data (including toxicoproteomic and toxicotranscriptomic data), this article presents ToxDAR, a solution designed to enhance data analysis and annotation and thereby improve the efficiency and quality of toxicological mechanism elucidation. Implemented as an R package, ToxDAR integrates a wide range of functions commonly used in the analysis of toxicogenomics data, including data reading, preprocessing, differential analysis, functional annotation, and network analysis. The package also incorporates annotation information from multiple databases and knowledge bases, encompassing associations between toxins and diseases, pathways, genes, and other entities, thus providing a knowledge framework for the analysis of toxicogenomics data. ToxDAR offers a suite of tools for analyzing and interpreting toxicogenomics data, promising to provide deeper insights into the mechanisms of toxicology.

2. Results

2.1. Software Framework

The ToxDAR software package is a bioinformatics analysis system tailored to toxicogenomics data. It identifies biological pathways associated with toxicity from the data and establishes connections between omics data and existing toxicological knowledge. It comprises four key modules (Figure 1): (I) Data Preparation, (II) Quality Control, (III) Data Analysis, and (IV) Data Interpretation. In the Data Preparation, Quality Control, and Data Analysis modules, ToxDAR integrates software packages such as NormExpression, DESeq2, and igraph, as well as functions like prcomp and phyper, to enable various analysis functionalities for toxicogenomics data, including data preparation, quality control, differential expression analysis, functional analysis, and network analysis. In the Data Interpretation Module, the software package provides knowledge annotation and validation for the analysis results by leveraging underlying toxicological domain-specific knowledge. ToxDAR integrates knowledge spanning five major categories of entities (toxins, genes, biological pathways, diseases, and phenotypes) and delineates 15 types of relationships among these entities, offering detailed knowledge support for the analysis of toxicological mechanisms. ToxDAR provides various forms of annotation for analysis results. For instance, it annotates differential analysis results using compound-gene relationships gathered from the Comparative Toxicogenomics Database (CTD), facilitating rapid screening and the identification of candidate molecules. It also utilizes adverse outcome pathways associated with compounds from the Adverse Outcome Pathway (AOP) Wiki to validate the reliability of specific toxicogenomics data analysis results. By consolidating multiple analysis functions and knowledge resources, ToxDAR establishes a streamlined analysis workflow, simplifying the processing of large and complex toxicogenomics datasets.

2.2. Software Function

The Data Preparation Module plays a pivotal role in addressing batch effects [29] in toxicogenomics data. In toxicogenomics experiments, batch effects often arise due to variations in laboratory conditions, reagent batches, and personnel, which can affect the experimental data. To mitigate these potential biases and noise, normalization procedures are essential [30]. However, selecting the optimal normalization method for a given dataset is a challenging task [31], often complicated by the unknown origins of these biases. The ToxDAR platform integrates ten different normalization techniques, including the median of the ratios of observed counts (DESeq) [32,33], upper quartile (UQ) [34,35], Trimmed Mean of M values (TMM) [36,37,38], Total Ubiquitous (TU), Total Read Count (TC), Total Read Number (TN) [39], External RNA Control Consortium (ERCC) [40,41], Housekeeping Genes (HG7) [42], Cellular RNA (CR) [43], and Nuclear RNA (NR) [44]. More detailed information about these techniques is provided in the Supplementary Methods document (Supplementary Material File S3). Through ToxDAR, researchers can obtain comprehensive analytical reports, enabling them to compare and select the normalization scheme best suited to their research objectives. The evaluation of normalization results is based on the AUCVC (Area Under the normalized Coefficient of Variation threshold Curve) and the mSCC (Median Spearman’s Rank Correlation Coefficient) metrics. The AUCVC metric represents the variation in the number of uniform genes (defined as genes with standardized expression levels with a Coefficient of Variation (CV) below a pre-set threshold across all samples) as the normalization CV threshold varies. When the normalized expression levels of uniform genes have sufficiently low CVs, it indicates higher consistency across different samples, implying that technical noise has been more effectively reduced. 
The mSCC metric summarizes the Spearman correlation coefficients of gene pairs across the entire dataset. The Spearman correlation coefficient is calculated for each pair of genes (normalized vs original data), and the median over all gene pairs is used to assess the efficacy of the data normalization scheme. In the context of gene expression analysis, an mSCC value approaching zero indicates that the normalized data exhibit a lower correlation with the raw data at the level of gene expression.
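The two ingredients behind these metrics can be illustrated with a minimal sketch in plain Python. This is not the NormExpression implementation: the function names, the toy expression values, and the choice to treat a constant gene as having zero correlation are all assumptions made for illustration. The sketch counts uniform genes at a given CV cutoff (the quantity whose curve AUCVC integrates) and computes a median Spearman coefficient over gene pairs within one expression matrix.

```python
from itertools import combinations
from statistics import mean, median, pstdev


def coefficient_of_variation(values):
    """CV = standard deviation / mean of one gene across samples."""
    mu = mean(values)
    return pstdev(values) / mu if mu else float("inf")


def count_uniform_genes(expr, cv_cutoff):
    """Number of genes whose CV across samples is at or below the cutoff."""
    return sum(coefficient_of_variation(v) <= cv_cutoff for v in expr.values())


def ranks(values):
    """1-based ranks; tied values share the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            result[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return result


def pearson(x, y):
    """Pearson correlation; returns 0.0 for a constant vector (assumption)."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    denom = (sum((a - mx) ** 2 for a in x) * sum((b - my) ** 2 for b in y)) ** 0.5
    return cov / denom if denom else 0.0


def spearman(x, y):
    """Spearman correlation is the Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


def median_pairwise_spearman(expr):
    """Median Spearman coefficient over all pairs of genes."""
    return median(spearman(a, b) for a, b in combinations(expr.values(), 2))


if __name__ == "__main__":
    # Hypothetical normalized expression values: gene -> one value per sample.
    toy = {
        "geneA": [10.0, 11.0, 9.0, 10.5],
        "geneB": [5.0, 20.0, 1.0, 40.0],
        "geneC": [7.0, 7.5, 6.8, 7.2],
    }
    print(count_uniform_genes(toy, cv_cutoff=0.25))
    print(median_pairwise_spearman(toy))
```

Sweeping `cv_cutoff` over a range of values and integrating the resulting curve would give an AUCVC-style score; that step is omitted here to keep the sketch short.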

The Quality Control Module is designed to test the reliability of the data and to filter out the datasets suitable for further analysis. Utilizing Principal Component Analysis (PCA), ToxDAR can reduce the dimensionality of the vast number of gene variables disturbed after toxic exposure and extract their main characteristics. By assessing data variance, it visualizes the variations of high-dimensional data in a low-dimensional space. The PCA plot intuitively displays the variability between different experimental groups and between individual samples within groups, revealing differences in data distribution. Clear spatial separation between experimental groups indicates better data quality. Moreover, ToxDAR allows for the separate visualization of gene expression data distributions for different experimental groups. Through violin plots, it intuitively compares the data distributions between groups across all expressed genes. Uniform data distributions across experimental groups indicate higher data quality.

The Differential Analysis Module of the ToxDAR software package, based on the DESeq2 framework, efficiently processes time-series-like data generated from toxicology experiments. This module employs a statistical model grounded in the negative binomial distribution to identify differentially expressed genes [45], estimating the probability of gene expression differences between samples through a negative binomial generalized linear model [46]. Through this process, it thoroughly accounts for the discreteness and variability of gene expression data, as well as the impact of library size differences on differential analysis, thereby enhancing the accuracy of the results. ToxDAR automates the integration and formatting of multiple comparative datasets, revealing commonalities in toxicological effects or intra-group expression variability, obviating the need for manual data handling and restructuring. Users can, depending on their research objectives, choose to combine differentially expressed genes identified under various conditions using either a union or intersection approach. Moreover, ToxDAR identifies known genes associated with the toxin, based on the toxin ID (MeSH ID) provided by the user from its built-in knowledge database, and then performs intersection analysis with the list of differentially expressed genes to pinpoint known targets of the toxin, providing relevant literature and association scores as supporting evidence. The software package offers a volcano plot visualization tool to graphically display changes in gene expression levels and annotate the names of known toxin-related genes alongside corresponding nodes, aiding researchers in rapidly identifying and interpreting genes with significant differential expression.
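The union or intersection step and the lookup of known toxin-related genes amount to simple set operations. The following minimal Python sketch illustrates the idea only; the function names, the comparison labels, and the placeholder gene identifiers are hypothetical and do not reflect the ToxDAR API or any result reported here.

```python
def combine_deg_lists(deg_lists, mode="union"):
    """Combine per-comparison DEG lists by union or intersection."""
    sets = [set(genes) for genes in deg_lists.values()]
    if not sets:
        return set()
    if mode == "union":
        return set().union(*sets)
    if mode == "intersection":
        return set.intersection(*sets)
    raise ValueError("mode must be 'union' or 'intersection'")


def known_targets(degs, toxin_genes):
    """DEGs that are already annotated as associated with the toxin."""
    return sorted(set(degs) & set(toxin_genes))


if __name__ == "__main__":
    # Placeholder identifiers, not real results.
    deg_lists = {
        "low_dose_vs_control": ["G1", "G2", "G4"],
        "high_dose_vs_control": ["G2", "G3", "G4"],
    }
    annotated = ["G2", "G9"]  # hypothetical knowledge-base entry for one toxin

    shared = combine_deg_lists(deg_lists, mode="intersection")
    print(sorted(shared))
    print(known_targets(combine_deg_lists(deg_lists), annotated))
```

The intersection highlights genes that respond under every condition, while the union gives the full set of responsive genes; which is preferable depends on the research objective.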

The Functional Annotation Module of ToxDAR provides users with hierarchical functional categorization information of genes, facilitating a systematic understanding of molecular functions and identifying molecular entities involved in multiple key biological pathways. ToxDAR conducts enrichment analyses on differentially expressed genes (DEGs) following toxin exposure using the hypergeometric distribution method [47], based on integrated annotation datasets such as GO [48], DO, and KEGG [49], describing the biological functions of differential genes across multiple dimensions. Details of the enrichment analysis methods and datasets are provided in the Supplementary Methods document (Supplementary Material File S3). Utilizing the ssGSEA algorithm, ToxDAR processes the list of differentially expressed genes post-toxin exposure, translating the differences in gene expression into levels of biological pathway activation or suppression, thereby revealing the molecular mechanisms of the organism’s response to toxin exposure. Furthermore, this module employs various graphical presentation methods. Bar charts give a visual summary of the functional enrichment status of DEGs. Chord diagrams illustrate the interconnections between enriched functional entries and DEGs, as well as the functional correlations amongst different DEGs. Clustering diagrams display the distribution patterns of DEGs in functional classifications. Enrichment curve graphs indicate the significance of different functional collections in gene expression data. Heatmaps visually present the activation or suppression status of biological pathways after a toxin’s effect. Together, these views provide researchers with a comprehensive and intuitive analytical perspective and aid in the rapid understanding of related analysis results.
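The hypergeometric enrichment test can be written out in a few lines. The sketch below is a plain-Python illustration rather than the ToxDAR code: the function name and the example numbers are assumptions. It computes the upper-tail probability P(X >= k), which is the quantity usually obtained in R with `phyper(k - 1, K, N - K, n, lower.tail = FALSE)`.

```python
from math import comb


def hypergeom_upper_tail(k, K, N, n):
    """P(X >= k) for a hypergeometric draw.

    N: genes in the background
    K: background genes annotated to the term
    n: differentially expressed genes (the draw)
    k: DEGs annotated to the term (the observed overlap)
    """
    total = comb(N, n)
    upper = min(K, n)
    hits = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return hits / total


if __name__ == "__main__":
    # Hypothetical numbers: 20000 background genes, 150 in the pathway,
    # 300 DEGs, 12 of which fall in the pathway.
    p_value = hypergeom_upper_tail(k=12, K=150, N=20000, n=300)
    print(f"enrichment p-value: {p_value:.3g}")
```

Exact integer arithmetic via `math.comb` avoids floating-point underflow for large gene universes. In practice the p-values from many terms would also be corrected for multiple testing, a step not shown here.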

The Network Analysis Module encompasses functions such as the parsing of network topological structures, implementation of network clustering, and exploration of network associations. Leveraging its built-in protein–protein interaction database, the ToxDAR software is capable of efficiently constructing molecular interaction networks and utilizing its integrated network algorithms to carry out tasks such as network clustering analysis and node degree calculation. Systems biology approaches allow us to understand molecular interactions within biological systems from a holistic perspective and elucidate the specific impacts of toxins on these interactions. With network analysis techniques, we can represent these interactions in the form of networks, enabling in-depth studies on how toxins disrupt cellular signaling, metabolic pathways, and the functioning of gene regulatory networks. This comprehensive methodology enables researchers to identify key components and nodes within biological processes and to reveal the intrinsic links between toxins and critical genes, biological pathways, and phenotypes across multiple dimensions. It offers a systematic perspective to understand the toxicological mechanisms triggered by toxin exposure, aiding researchers to more comprehensively grasp the complexity of biological systems [50].
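As a rough illustration of node degree calculation and network grouping, the following self-contained Python sketch builds an undirected interaction network from an edge list. ToxDAR relies on igraph for these tasks; the code below does not use igraph, and connected components serve only as the simplest stand-in for network clustering. The protein identifiers are placeholders.

```python
from collections import defaultdict, deque


def build_adjacency(edges):
    """Undirected adjacency sets from (protein_a, protein_b) pairs."""
    adjacency = defaultdict(set)
    for a, b in edges:
        if a != b:  # self-interactions are ignored in this sketch
            adjacency[a].add(b)
            adjacency[b].add(a)
    return adjacency


def node_degrees(adjacency):
    """Degree = number of distinct interaction partners per node."""
    return {node: len(partners) for node, partners in adjacency.items()}


def connected_components(adjacency):
    """Group nodes that are reachable from one another (breadth-first search)."""
    seen, components = set(), []
    for start in adjacency:
        if start in seen:
            continue
        seen.add(start)
        queue, component = deque([start]), []
        while queue:
            node = queue.popleft()
            component.append(node)
            for partner in adjacency[node]:
                if partner not in seen:
                    seen.add(partner)
                    queue.append(partner)
        components.append(sorted(component))
    return components


if __name__ == "__main__":
    # Placeholder interactions, not taken from any database.
    edges = [("P1", "P2"), ("P2", "P3"), ("P2", "P4"), ("P5", "P6")]
    adjacency = build_adjacency(edges)
    degrees = node_degrees(adjacency)
    hub = max(degrees, key=degrees.get)
    print(hub, degrees[hub])
    print(connected_components(adjacency))
```

High-degree nodes are natural candidates for the key components mentioned above, although degree alone does not establish biological importance.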

The Data Interpretation Module of the ToxDAR software package amalgamates an extensive range of toxicological knowledge, offering detailed annotations for five major entity categories: toxins, genes, biological pathways, diseases, and phenotypes, along with information on 15 types of relationships between them. This provides a solid knowledge base for the in-depth analysis of toxicological mechanisms. It includes interpretation strategies involving toxin-related genes and gene regulatory networks, applying multi-dimensional precise annotations to analysis results to facilitate the exploration of molecular mechanisms following toxin exposure. For instance, the ToxDAR software is capable of utilizing relationship data between compounds and genes provided by the CTD to rapidly filter results of differential expression and identify potential candidate molecules. It also employs information on compound-related adverse outcome pathways provided by the Adverse Outcome Pathway (AOP) Wiki to thoroughly annotate specific toxicogenomics data analysis results.

2.3. Research Case: Toxicological Mechanism Analysis of Public Transcriptome Data in L02 Cell Line Post-Triphenyl Phosphate (TPP) Exposure

This study employs transcriptomic data submitted by Xiaoqing Wang et al. and uses ToxDAR to explore the impact of triphenyl phosphate (TPP) on the L02 cell line at the omics level [51], unveiling and interpreting the potential toxicological mechanisms of TPP exposure. The dataset encompasses 20 expression profiles obtained by Xiaoqing Wang and colleagues after treating the L02 cell line with various concentrations of TPP. First, to eliminate technical biases and batch effects, we preprocessed the raw data using multiple normalization methods integrated into ToxDAR and then performed quality control on these omics data. Subsequently, employing the differential expression analysis module of ToxDAR, we identified differentially expressed genes associated with TPP exposure. Further, we annotated the functional similarity and bias of differentially expressed genes at various concentrations and time points using the software’s annotation function and marked their association with TPP exposure. Around the key molecules identified, we utilized ToxDAR to construct molecular interaction networks, revealing the interconnections between toxins and key genes, biological pathways, phenotypes, and other dimensions, providing a systemic perspective for understanding the toxicological mechanisms related to TPP exposure.

Initially, we utilized the ToxDAR software package to normalize the transcriptome data post-TPP exposure using ten different methods, subsequently generating a report on the effects of normalization. These normalization methods were evaluated based on two metrics, AUCVC (Figure 2A) and mSCC (Figure 2B), to select the appropriate normalization method. For the transcriptome dataset following TPP exposure, normalization using the Upper Quartile (UQ) method yielded the highest AUCVC value while also having an mSCC value closest to zero. Therefore, the UQ method can be considered the optimal option for data preparation in this dataset.

Subsequently, we conducted a quality control analysis of the data using the Data Quality Control Module of the ToxDAR software package. We generated PCA plots and violin plots of gene expression distribution for these datasets, allowing us to visually observe the variability both between different experimental groups and within the same group across samples. The PCA results (Figure 3A) reveal significant differences in the spatial distribution among experimental groups, while the violin plot results (Figure 3B), considering all dimensions of gene expression, display uniformity of data distribution across experimental groups. This indicates a high quality of data, suggesting that these datasets are suitable for further data mining and analysis.

Afterward, we utilized the differential analysis module of the ToxDAR software package to identify differentially expressed genes associated with TPP exposure at various concentrations and time points (Figure 4). Additionally, the integrated knowledge base within ToxDAR provided clear associations between differentially expressed genes and TPP exposure. Among the identified differentially expressed genes, ABCC3, MYC [52], and STAC3 have been previously confirmed by research to be associated with TPP exposure.

To gain a comprehensive understanding of the functions of these genes, we proceeded to perform enrichment analysis on the differentially expressed genes following TPP exposure using multiple annotation datasets including GO, DO, KEGG, and others collected within the software package (Figure 5A). This analysis aimed to elucidate the biological roles of these differential genes from multiple perspectives, including function and disease. The results of the analysis indicated that TPP exposure is associated with signaling pathways related to cellular programmed death, inflammatory responses, and others, as depicted in the chord diagram (Figure 5B). These key pathways share several pivotal molecules, such as ABCC3 and THRB. This provides important scientific evidence for further exploration of the toxicological mechanisms of TPP (Figure 5C).

This study conducted an in-depth analysis of the differentially expressed gene lists under varying exposure concentrations using the ssGSEA function integrated within the software package (Figure 6A). The ssGSEA method quantifies changes in gene expression profiles as a result of toxic exposure into the activation or inhibition states of biological pathways, revealing the mechanisms by which the organism responds to exposure at a molecular level. The changes are presented visually through enrichment plots and heatmaps for intuitive representation (Figure 6B). The results of the study showed that in samples exposed to a TPP concentration of 881 mg/kg, there was a notable activation of the biological pathway for cell apoptosis. Additionally, in the same samples exposed to 881 mg/kg TPP, there was a significant inhibition of the phenylalanine metabolism pathway. Furthermore, as the exposure concentration of TPP increased, there was a trend of enhanced activation of signaling pathways related to inflammatory responses.

In conclusion, based on the analysis of significant molecules related to the toxicological mechanisms of TPP, such as THRB, ABCC3, and NOTCH1, we employed the ToxDAR network analysis and annotation module to map the associations between the toxin and key molecules, biological pathways, and phenotypes (Figure 7A). Within this network, TPP is linked with biological pathways such as apoptosis and inflammatory responses, indicating that TPP may act as a potential endocrine disruptor. It exerts its effects by activating the thyroid hormone receptor β (THRB) molecule, thereby influencing signaling pathways related to apoptosis and inflammatory responses, ultimately adversely affecting thyroid function. This is consistent with the knowledge of adverse outcome pathways caused by TPP compounds documented in the literature. It provides a systematic perspective and in-depth insights into the toxicological mechanisms associated with TPP exposure, further supporting the accuracy and reliability of our analysis (Figure 7B). We have added the flowcharts to GitHub (https://github.com/TMCjp/ToxDAR, accessed on 29 August 2024).

In summary, the ToxDAR software package supports the research into the toxicological mechanism of TPP on two levels: Firstly, through its omics data analysis capabilities, ToxDAR enables standard preprocessing and effective quality control, providing a list of differential genes post-exposure to toxins along with corresponding functional annotations, thereby establishing a rapid data analysis workflow. Secondly, with the integration of databases and algorithms, ToxDAR offers in-depth mechanistic insights, allowing us to understand toxicological effects at a systems biology level. The comprehensive functions of ToxDAR serve as an effective tool for delving into the key toxicological mechanisms of TPP and charting its toxicity profile, which is crucial for future risk assessments and toxicological research.

3. Discussion

The analysis and interpretation of toxicogenomics data play a pivotal role in biological research. Despite the availability of various tools to execute specific steps in the analysis, previous studies often lacked a comprehensive and user-friendly tool. ToxDAR, as an R software package, provides a convenient and efficient tool for the analysis of high-dimensional data generated through omics technologies. It serves as a critical resource for the analysis and interpretation of such data, enhancing our understanding of complex biological processes and offering deeper insights into the toxicological mechanisms of specific chemicals. The package integrates multiple omics data analysis functions, providing easy installation and invocation methods, and enabling data preparation, differential expression analysis, functional annotation, and network analysis of toxicogenomics data. Additionally, the package incorporates various annotation resources, including pathway information, ontology terms, and the relationships between toxins and multiple entities, offering rich contextual information for the interpretation of toxicogenomics data.

Our software can be applied to various issues in the field of toxicology research, such as dose–effect prediction and the study of new alternatives for toxicity testing. In dose–effect prediction, our software can examine the impact of different doses of toxins on the organism and on molecular changes to determine the effect of dosage on toxicological mechanisms. This provides possibilities and support for further establishing dose–response relationships in toxicology. In the case study of triphenyl phosphate (TPP) that we provide, we explore the changes in differentially expressed genes and the effects on biological pathway activities in the L02 cell line at various concentrations of TPP. In terms of new alternatives for toxicity testing, because traditional animal models exhibit species differences and may not accurately reflect the actual conditions of clinical patients, the academic community has gradually developed new in vitro systems such as organoids. These systems include liver, kidney, and lung organ-on-chip models and multi-organ-on-chip models designed for toxicological assessment [53,54]. Our software can evaluate the accuracy and stability of these organ-on-chip models by examining the differences in toxin-responsive protein expression and pathway activities between organ-on-chip models and traditional animal models.

Leveraging ToxDAR’s flexible framework and cross-system technical architecture, we plan to aggregate more comprehensive methods and data in the future, effectively integrating them into a common framework for researchers to extract meaningful information, thereby more profoundly characterizing and understanding the toxicological mechanisms within the data.

4. Materials and Methods

4.1. Primary Functions

ToxDAR is developed using the R programming language and integrates numerous commonly used packages provided by Bioconductor. ToxDAR utilizes the NormExpression package for the processing and normalization of toxicogenomics expression profile data, thus reducing potential biases and noise within the dataset. It uses the built-in prcomp function in R for principal component analysis (PCA) of the preprocessed toxicogenomics data, reducing the dimensionality of variables and extracting key features to segregate samples and identify outliers, thereby assessing the quality of the data. Differentially expressed genes are identified using the widely employed DESeq2 package [55]. The phyper function is utilized to conduct hypergeometric tests. The GSEA package [56] is employed to translate changes in gene expression levels post-toxin exposure into the activation or inhibition states of biological pathways. Complex network analysis is performed using the igraph package [57], which includes network topology analysis, clustering, and the visualization of various layouts. Additionally, ToxDAR integrates the dplyr, ggplot2 [58], and ComplexHeatmap [59] packages to facilitate data manipulation and visualization functionalities [60] (Table 1).

4.2. Data Sources

The ToxDAR software package extracts associations and supporting evidence between toxicants and biological entities such as genes, pathways, and phenotypes from multiple knowledge bases (Table 2). This includes the relationships and evidence between toxicants and genes, toxicants and pathways, and toxicants and diseases from the CTD database [61]. It also comprises the adverse outcome pathways knowledge from the AOP-Wiki database [62], which reflects toxicodynamics processes including key molecules, critical pathways, and the adverse outcomes they cause.

The ToxDAR software package concurrently compiles information from multiple databases encompassing biological pathways, diseases, and protein interactions. It aggregates associations between biological pathways and genes from the KEGG database [49] and integrates these with the correlations between toxins and pathways gathered from the CTD. Furthermore, the software collects data from the STRING database [63] on protein–protein interactions, as well as reliable associations between diseases and genes from DisGeNET [64].

Because these databases represent the same knowledge in different ways, this study uses Disease Ontology (DO) terms [65] to match disease entities, Human Phenotype Ontology (HPO) terms [66] to match phenotype entities, and Kyoto Encyclopedia of Genes and Genomes (KEGG) terms to align pathway entities, thereby integrating knowledge across multiple sources.
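Matching entities to shared ontology terms can be pictured as a lookup from source-specific labels to a common identifier. The sketch below, written in Python for illustration only, shows the idea under simplifying assumptions: labels are matched after case and whitespace normalization against a synonym table. The table contents, the placeholder identifiers (`DOID:EXAMPLE1`, `DOID:EXAMPLE2`), and the function names are hypothetical; the actual matching rules used in ToxDAR are not specified in this article.

```python
def normalize(label: str) -> str:
    """Lowercase and collapse whitespace so label variants compare equal."""
    return " ".join(label.lower().split())


# Hypothetical synonym table: normalized label -> placeholder ontology ID.
SYNONYM_TO_TERM = {
    "liver injury": "DOID:EXAMPLE1",
    "hepatic injury": "DOID:EXAMPLE1",
    "thyroid dysfunction": "DOID:EXAMPLE2",
}


def harmonize(records, synonym_to_term):
    """Attach a shared ontology ID to each (source, label) record.

    Returns (matched, unmatched); unmatched labels are kept for manual
    review instead of being silently dropped.
    """
    matched, unmatched = [], []
    for source, label in records:
        term = synonym_to_term.get(normalize(label))
        if term is None:
            unmatched.append((source, label))
        else:
            matched.append((source, label, term))
    return matched, unmatched


records = [
    ("source_A", "Liver Injury"),
    ("source_B", "hepatic  injury"),
    ("source_B", "unlisted condition"),
]
matched, unmatched = harmonize(records, SYNONYM_TO_TERM)
```

Once two records from different sources carry the same ontology identifier, their associations can be merged under a single entity in the knowledge graph.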

4.3. Toxicological Classification System

To support the elucidation of toxicological mechanisms from toxicogenomics data, the classification of toxins in this article follows “Principles of Forensic Toxicology”, 5th edition, by Barry S. Levine [67]. A toxin classification system was constructed based on mechanisms of toxic action (Supplementary Material Table S1). Within each category, information on representative toxins was manually supplemented through a comprehensive review of knowledge bases and the literature (Supplementary Material Table S2).

Supplementary Materials

The following supporting information can be downloaded at: https://www.mdpi.com/article/10.3390/ijms25179544/s1. References [68,69,70,71,72,73,74,75,76,77,78,79,80,81,82,83,84,85,86,87,88,89,90,91,92,93,94,95,96,97,98] are cited in the supplementary materials.

Author Contributions

Conceptualization, D.L. and P.J.; methodology, P.J.; software, P.J.; validation, P.J.; investigation, Z.Z., Q.Y. and Z.W.; data curation, Z.Z. and L.D.; writing – original draft, P.J., Z.Z. and Q.Y.; writing – review and editing, P.J., Z.Z. and Q.Y.; visualization, Q.Y.; resources, Z.W. and L.D.; supervision, D.L.; project administration, D.L.; funding acquisition, D.L. All authors have read and agreed to the published version of the manuscript.

Funding

This work was supported by the National Natural Science Foundation of China [32088101, 32271518] and the National Key Research and Development Program of China [2023YFF1204600, 2021YFA1301603].

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

All code and datasets are packaged as ToxDAR and are available at https://github.com/TMCjp/ToxDAR.

Acknowledgments

We thank CTD, AOP-Wiki, KEGG, DrugBank, DisGeNET, Disease Ontology, Human Phenotype Ontology, PhosphoSitePlus, UbiBrowser, ENCODE, and STRINGdb for providing the toxicological data used in this study. We also thank the bioinformatics platform at Phoenix Center for strong and stable IT support.

Conflicts of Interest

The authors declare no conflicts of interest.

References

1. Krewski, D.; Andersen, M.E.; Mantus, E.; Zeise, L. Toxicity testing in the 21st century: Implications for human health risk assessment. Risk Anal. 2009, 29, 474–479.
2. Ge, B.; Yan, K.; Sang, R.; Wang, W.; Liu, X.; Yu, M.; Liu, X.; Qiu, Q.; Zhang, X. Integrated network toxicology, molecular docking, and in vivo experiments to elucidate molecular mechanism of aflatoxin B1 hepatotoxicity. Ecotoxicol. Environ. Saf. 2024, 275, 116278.
3. Krewski, D.; Andersen, M.E.; Tyshenko, M.G.; Krishnan, K.; Hartung, T.; Boekelheide, K.; Wambaugh, J.F.; Jones, D.; Whelan, M.; Thomas, R.; et al. Toxicity testing in the 21st century: Progress in the past decade and future perspectives. Arch. Toxicol. 2019, 94, 1–58.
4. Pognan, F.; Beilmann, M.; Boonen, H.C.M.; Czich, A.; Dear, G.; Hewitt, P.; Mow, T.; Oinonen, T.; Roth, A.; Steger-Hartmann, T.; et al. The evolving role of investigative toxicology in the pharmaceutical industry. Nat. Rev. Drug Discov. 2023, 22, 317–335.
5. Hartung, T. Toxicology for the twenty-first century. Nature 2009, 460, 208–212.
6. Tujios, S.; Fontana, R.J. Mechanisms of drug-induced liver injury: From bedside to bench. Nat. Rev. Gastroenterol. Hepatol. 2011, 8, 202–211.
7. Schork, N.J. Personalized medicine: Time for one-person trials. Nature 2015, 520, 609–611.
8. Duan, L.; Guo, L.; Wang, L.; Yin, Q.; Zhang, C.-M.; Zheng, Y.-G.; Liu, E.H. Application of metabolomics in toxicity evaluation of traditional Chinese medicines. Chin. Med. 2018, 13, 60.
9. Inadera, H.; Uchida, M.; Shimomura, A. Advances in “omics” technologies for toxicological research. Nippon. Eiseigaku Zasshi 2007, 62, 18–31.
10. Goh, H.-H.; Ng, C.L.; Loke, K.-K. Functional Genomics. Adv. Exp. Med. Biol. 2018, 1102, 11–30.
11. Hermansen, G.M.M.; Sazinas, P.; Kofod, D.; Millard, A.; Andersen, P.S.; Jelsbak, L. Transcriptomic profiling of interacting nasal staphylococci species reveals global changes in gene and non-coding RNA expression. FEMS Microbiol. Lett. 2018, 365, fny004.
12. Xu, M.; Yang, Q.; Xu, L.; Rao, Z.; Cao, D.; Gao, M.; Liu, S. Protein target identification and toxicological mechanism investigation of silver nanoparticles-induced hepatotoxicity by integrating proteomic and metallomic strategies. Part. Fibre Toxicol. 2019, 16, 46.
13. Miller, I.; Serchi, T.; Murk, A.J.; Gutleb, A.C. The Added Value of Proteomics for Toxicological Studies. J. Toxicol. Environ. Health Part B Crit. Rev. 2014, 17, 225–246.
14. Aardema, M.J.; MacGregor, J.T. Toxicology and genetic toxicology in the new era of “toxicogenomics”: Impact of “-omics” technologies. Mutat. Res./Fundam. Mol. Mech. Mutagen. 2002, 499, 13–25.
15. Serra, A.; Saarimäki, L.A.; Pavel, A.; Del Giudice, G.; Fratello, M.; Cattelani, L.; Federico, A.; Laurino, O.; Marwah, V.S.; Fortino, V.; et al. Nextcast: A software suite to analyse and model toxicogenomics data. Comput. Struct. Biotechnol. J. 2022, 20, 1413–1426.
16. Aniba, M.R.; Poch, O.; Thompson, J.D. Issues in bioinformatics benchmarking: The case study of multiple sequence alignment. Nucleic Acids Res. 2010, 38, 7353–7363.
17. Richard, A.M.; Judson, R.S.; Houck, K.A.; Grulke, C.M.; Volarath, P.; Thillainadarajah, I.; Yang, C.; Rathman, J.; Martin, M.T.; Wambaugh, J.F.; et al. ToxCast Chemical Landscape: Paving the Road to 21st Century Toxicology. Chem. Res. Toxicol. 2016, 29, 1225–1251.
18. Igarashi, Y.; Nakatsu, N.; Yamashita, T.; Ono, A.; Ohno, Y.; Urushidani, T.; Yamada, H. Open TG-GATEs: A large-scale toxicogenomics database. Nucleic Acids Res. 2015, 43, D921–D927.
19. Colaprico, A.; Silva, T.C.; Olsen, C.; Garofano, L.; Cava, C.; Garolini, D.; Sabedot, T.S.; Malta, T.M.; Pagnotta, S.M.; Castiglioni, I.; et al. TCGAbiolinks: An R/Bioconductor package for integrative analysis of TCGA data. Nucleic Acids Res. 2015, 44, e71.
20. Samur, M.K. RTCGAToolbox: A new tool for exporting TCGA Firehose data. PLoS ONE 2014, 9, e106397.
21. Gao, J.; Mazor, T.; de Bruijn, I.; Abeshouse, A.; Baiceanu, D.; Erkoc, Z.; Gross, B.; Higgins, D.; Jagannathan, P.K.; Kalletla, K.; et al. Abstract 207: The cBioPortal for Cancer Genomics. Cancer Res. 2021, 81, 207.
22. Marukatat, S. Tutorial on PCA and approximate PCA and approximate kernel PCA. Artif. Intell. Rev. 2022, 56, 5445–5477.
23. Ritchie, M.E.; Phipson, B.; Wu, D.; Hu, Y.; Law, C.W.; Shi, W.; Smyth, G.K. limma powers differential expression analyses for RNA-sequencing and microarray studies. Nucleic Acids Res. 2015, 43, e47.
24. Yu, G.; Wang, L.-G.; Han, Y.; He, Q.-Y. clusterProfiler: An R package for comparing biological themes among gene clusters. OMICS A J. Integr. Biol. 2012, 16, 284–287.
25. Otasek, D.; Morris, J.H.; Bouças, J.; Pico, A.R.; Demchak, B. Cytoscape Automation: Empowering workflow-based network analysis. Genome Biol. 2019, 20, 185.
26. Waters, M. Systems Toxicology and the Chemical Effects in Biological Systems (CEBS) Knowledge Base. Environ. Health Perspect. 2003, 111, 15–28.
27. Davis, A.P.; Grondin, C.J.; Johnson, R.J.; Sciaky, D.; Wiegers, J.; Wiegers, T.C.; Mattingly, C.J. Comparative Toxicogenomics Database (CTD): Update 2021. Nucleic Acids Res. 2020, 49, D1138–D1143.
28. Olker, J.H.; Elonen, C.M.; Pilli, A.; Anderson, A.; Kinziger, B.; Erickson, S.; Skopinski, M.; Pomplun, A.; LaLone, C.A.; Russom, C.L.; et al. The ECOTOXicology Knowledgebase: A Curated Database of Ecologically Relevant Toxicity Tests to Support Environmental Research and Risk Assessment. Environ. Toxicol. Chem. 2022, 41, 1520–1539.
29. Wang, Y.; LêCao, K.-A. Managing batch effects in microbiome data. Brief. Bioinform. 2019, 21, 1954–1970.
30. Goh, W.W.B.; Wang, W.; Wong, L. Why Batch Effects Matter in Omics Data, and How to Avoid Them. Trends Biotechnol. 2017, 35, 498–507.
31. Li, X.; Brock, G.N.; Rouchka, E.C.; Cooper, N.G.F.; Wu, D.; O’Toole, T.E.; Gill, R.S.; Eteleeb, A.M.; O’Brien, L.; Rai, S.N. A comparison of per sample global scaling and per gene normalization methods for differential expression analysis of RNA-seq data. PLoS ONE 2017, 12, e0176185.
32. Anders, S.; Huber, W. Differential expression analysis for sequence count data. Genome Biol. 2010, 11, R106.
33. Costa-Silva, J.; Domingues, D.; Lopes, F.M. RNA-Seq differential expression analysis: An extended review and a software tool. PLoS ONE 2017, 12, e0190152.
34. Bushel, P.R.; Ferguson, S.S.; Ramaiahgari, S.C.; Paules, R.S.; Auerbach, S.S. Comparison of Normalization Methods for Analysis of TempO-Seq Targeted RNA Sequencing Data. Front. Genet. 2020, 11, 594.
35. Bullard, J.H.; Purdom, E.; Hansen, K.D.; Dudoit, S. Evaluation of statistical methods for normalization and differential expression in mRNA-Seq experiments. BMC Bioinform. 2010, 11, 94.
36. Robinson, M.D.; McCarthy, D.J.; Smyth, G.K. edgeR: A Bioconductor package for differential expression analysis of digital gene expression data. Bioinformatics 2009, 26, 139–140.
37. Robinson, M.D.; Oshlack, A. A scaling normalization method for differential expression analysis of RNA-seq data. Genome Biol. 2010, 11, R25.
38. Zhao, Y.; Li, M.-C.; Konaté, M.M.; Chen, L.; Das, B.; Karlovich, C.; Williams, P.M.; Evrard, Y.A.; Doroshow, J.H.; McShane, L.M. TPM, FPKM, or Normalized Counts? A Comparative Study of Quantification Measures for the Analysis of RNA-seq Data from the NCI Patient-Derived Models Repository. J. Transl. Med. 2021, 19, 269.
39. Wu, Z.; Liu, W.; Jin, X.; Ji, H.; Wang, H.; Glusman, G.; Robinson, M.; Liu, L.; Ruan, J.; Gao, S. NormExpression: An R Package to Normalize Gene Expression Data Using Evaluated Methods. Front. Genet. 2019, 10, 400.
40. External RNA Controls Consortium. The External RNA Controls Consortium: A progress report. Nat. Methods 2005, 2, 731–734.
41. Devonshire, A.S.; Elaswarapu, R.; Foy, C.A. Evaluation of external RNA controls for the standardisation of gene expression biomarker measurements. BMC Genom. 2010, 11, 662.
42. Kouadjo, K.E.; Nishida, Y.; Cadrin-Girard, J.F.; Yoshioka, M.; St-Amand, J. Housekeeping and tissue-specific genes in mouse tissues. BMC Genom. 2007, 8, 127.
43. Chatterjee, D.; Deng, W.-M. Standardization of Single-Cell RNA-Sequencing Analysis Workflow to Study Drosophila Ovary. Methods Mol. Biol. 2023, 2677, 151–171.
44. Ding, J.; Adiconis, X.; Simmons, S.K.; Kowalczyk, M.S.; Hession, C.C.; Marjanovic, N.D.; Hughes, T.K.; Wadsworth, M.H.; Burks, T.; Nguyen, L.T.; et al. Systematic comparison of single-cell and single-nucleus RNA-sequencing methods. Nat. Biotechnol. 2020, 38, 737–746.
45. Salkovic, E.; Bensmail, H. A Novel Bayesian Outlier Score Based on the Negative Binomial Distribution for Detecting Aberrantly Expressed Genes in RNA-Seq Gene Expression Count Data. IEEE Access 2021, 9, 75789–75800.
46. Anders, S.; Huber, W. Differential expression analysis for sequence count data. Nat. Preced. 2010.
47. Bleazard, T.; Lamb, J.A.; Griffiths-Jones, S. Bias in microRNA functional enrichment analysis. Bioinformatics 2015, 31, 1592–1598.
48. Ashburner, M.; Ball, C.A.; Blake, J.A.; Botstein, D.; Butler, H.; Cherry, J.M.; Davis, A.P.; Dolinski, K.; Dwight, S.S.; Eppig, J.T.; et al. Gene ontology: Tool for the unification of biology. The Gene Ontology Consortium. Nat. Genet. 2000, 25, 25–29.
49. Kanehisa, M.; Goto, S. KEGG: Kyoto encyclopedia of genes and genomes. Nucleic Acids Res. 2000, 28, 27–30.
50. Valls-Margarit, J.; Piñero, J.; Füzi, B.; Cerisier, N.; Taboureau, O.; Furlong, L.I. Assessing network-based methods in the context of system toxicology. Front. Pharmacol. 2023, 14, 1225697.
51. Wang, X.; Li, F.; Liu, J.; Ji, C.; Wu, H. Transcriptomic, proteomic and metabolomic profiling unravel the mechanisms of hepatotoxicity pathway induced by triphenyl phosphate (TPP). Ecotoxicol. Environ. Saf. 2020, 205, 111126.
52. Ye, L.; Zhang, X.; Wang, P.; Zhang, Y.; He, S.; Li, Y.; Li, S.; Liang, K.; Liao, S.; Gao, Y.; et al. Low concentration triphenyl phosphate fuels proliferation and migration of hepatocellular carcinoma cells. Environ. Toxicol. 2022, 37, 2445–2459.
53. Hu, C.; Yang, S.; Zhang, T.; Ge, Y.; Chen, Z.; Zhang, J.; Pu, Y.; Liang, G. Organoids and organoids-on-a-chip as the new testing strategies for environmental toxicology-applications & advantages. Environ. Int. 2024, 184, 108415.
54. Brooks, A.; Liang, X.; Zhang, Y.; Zhao, C.-X.; Roberts, M.S.; Wang, H.; Zhang, L.; Crawford, D.H.G. Liver organoid as a 3D in vitro model for drug validation and toxicity assessment. Pharmacol. Res. 2021, 169, 105608.
55. Love, M.I.; Huber, W.; Anders, S. Moderated estimation of fold change and dispersion for RNA-seq data with DESeq2. Genome Biol. 2014, 15, 550.
56. Subramanian, A.; Tamayo, P.; Mootha, V.K.; Mukherjee, S.; Ebert, B.L.; Gillette, M.A.; Paulovich, A.; Pomeroy, S.L.; Golub, T.R.; Lander, E.S.; et al. Gene set enrichment analysis: A knowledge-based approach for interpreting genome-wide expression profiles. Proc. Natl. Acad. Sci. USA 2005, 102, 15545–15550.
57. Ning, W.; Acharya, A.; Li, S.; Schmalz, G.; Huang, S. Identification of Key Pyroptosis-Related Genes and Distinct Pyroptosis-Related Clusters in Periodontitis. Front. Immunol. 2022, 13, 862049.
58. Valero-Mora, P.M. ggplot2: Elegant Graphics for Data Analysis. J. Stat. Softw. 2010, 35, 1–3.
59. Gu, Z.; Eils, R.; Schlesner, M. Complex heatmaps reveal patterns and correlations in multidimensional genomic data. Bioinformatics 2016, 32, 2847–2849.
60. Wu, T.; Hu, E.; Xu, S.; Chen, M.; Guo, P.; Dai, Z.; Feng, T.; Zhou, L.; Tang, W.; Zhan, L.; et al. clusterProfiler 4.0: A universal enrichment tool for interpreting omics data. Innovation 2021, 2, 100141.
61. Davis, A.P.; Wiegers, T.C.; Johnson, R.J.; Sciaky, D.; Wiegers, J.; Mattingly, C.J. Comparative Toxicogenomics Database (CTD): Update 2023. Nucleic Acids Res. 2023, 51, D1257–D1262.
62. Martens, M.; Evelo, C.T.; Willighagen, E.L. Providing Adverse Outcome Pathways from the AOP-Wiki in a Semantic Web Format to Increase Usability and Accessibility of the Content. Appl. Vitr. Toxicol. 2022, 8, 2–13.
63. Szklarczyk, D.; Gable, A.L.; Nastou, K.C.; Lyon, D.; Kirsch, R.; Pyysalo, S.; Doncheva, N.T.; Legeay, M.; Fang, T.; Bork, P.; et al. The STRING database in 2021: Customizable protein–protein networks, and functional characterization of user-uploaded gene/measurement sets. Nucleic Acids Res. 2020, 49, D605–D612.
64. Piñero, J.; Bravo, À.; Queralt-Rosinach, N.; Gutiérrez-Sacristán, A.; Deu-Pons, J.; Centeno, E.; García-García, J.; Sanz, F.; Furlong, L.I. DisGeNET: A comprehensive platform integrating information on human disease-associated genes and variants. Nucleic Acids Res. 2016, 45, D833–D839.
65. Schriml, L.M.; Arze, C.; Nadendla, S.; Chang, Y.-W.W.; Mazaitis, M.; Felix, V.; Feng, G.; Kibbe, W.A. Disease Ontology: A backbone for disease semantic integration. Nucleic Acids Res. 2012, 40, D940–D946.
66. Köhler, S.; Gargano, M.; Matentzoglu, N.; Carmody, L.C.; Lewis-Smith, D.; Vasilevsky, N.A.; Danis, D.; Balagura, G.; Baynam, G.; Brower, A.M.; et al. The Human Phenotype Ontology in 2021. Nucleic Acids Res. 2020, 49, D1207–D1217.
67. Peters, F.T. Principles of Forensic Toxicology, 5th ed.; Springer: Berlin/Heidelberg, Germany, 2020.
68. Zhou, Y.; Zhang, M.H.; Zhao, X.; Feng, J.H. Ammonia exposure induced intestinal inflammation injury mediated by intestinal microbiota in broiler chickens via TLR4/TNF-α signaling pathway. Ecotoxicol. Environ. Saf. 2021, 226, 112832.
69. Borroni, E.; Pesatori, A.C.; Bollati, V.; Buoli, M.; Carugno, M. Air pollution exposure and depression: A comprehensive updated systematic review and meta-analysis. Environ. Pollut. 2022, 292 Pt A, 118332.
70. Tang, S.L.; Xie, J.J.; Wu, W.D.; Yi, B.; Liu, L.; Zhang, H.F. High ammonia exposure regulates lipid metabolism in the pig skeletal muscle via mTOR pathway. Sci. Total Environ. 2020, 740, 139917.
71. Liu, M.J.; Guo, H.Y.; Zhu, K.C.; Liu, B.S.; Liu, B.; Guo, L.; Zhang, N.; Yang, J.W.; Jiang, S.G.; Zhang, D.-C. Effects of acute ammonia exposure and recovery on the antioxidant response and expression of genes in the Nrf2-Keap1 signaling pathway in the juvenile golden pompano (Trachinotus ovatus). Aquat. Toxicol. 2021, 240, 105969.
72. Liang, L.Y.; Huang, Z.B.; Li, N.; Wang, D.M.; Ding, L.; Shi, H.T.; Hong, M.L. Effects of ammonia exposure on antioxidant function, immune response and NF-κB pathway in Chinese Strip-necked Turtle (Mauremys sinensis). Aquat. Toxicol. 2020, 229, 105621.
73. Hindfelt, B.; Plum, F.; Duffy, T.E. Effect of acute ammonia intoxication on cerebral metabolism in rats with portacaval shunts. J. Clin. Investig. 1977, 59, 386–396.
74. Kosenko, E.; Montoliu, C.; Giordano, G.; Kaminsky, Y.; Venediktova, N.; Buryanov, Y.; Felipo, V. Acute ammonia intoxication induces an NMDA receptor-mediated increase in poly(ADP-ribose) polymerase level and NAD metabolism in nuclei of rat brain cells. J. Neurochem. 2004, 89, 1101–1110.
75. Yi, J.Z.; Zhu, M.; Qiu, F.; Zhou, Y.B.; Shu, P.; Liu, N.; Wei, C.X.; Xiang, S.L. TNFAIP1 Mediates Formaldehyde-Induced Neurotoxicity by Inhibiting the Akt/CREB Pathway in N2a Cells. Neurotox Res. 2020, 38, 184–198.
76. Zhang, Q.X.; Tian, P.; Zhai, M.M.; Lei, X.D.; Yang, Z.H.; Liu, Y.; Liu, M.T.; Huang, H.; Zhang, X.R.; Yang, X.; et al. Formaldehyde regulates vascular tensions through nitric oxide-cGMP signaling pathway and ion channels. Chemosphere 2018, 193, 60–73.
77. Park, J.; Kang, G.H.; Kim, Y.; Lee, J.Y.; Song, J.A.; Hwang, J.H. Formaldehyde exposure induces differentiation of regulatory T cells via the NFAT-mediated T cell receptor signalling pathway in Yucatan minipigs. Sci. Rep. 2022, 12, 8149.
78. Medda, N.; De, S.K.; Maiti, S. Different mechanisms of arsenic related signaling in cellular proliferation, apoptosis and neo-plastic transformation. Ecotoxicol. Environ. Saf. 2021, 208, 111752.
79. Ding, X.X.; Ding, E.M.; Yin, H.Y.; Mei, P.; Chen, H.; Han, L.; Zhang, H.D.; Wang, J.F.; Wang, H.; Zhu, B.L. Serum hsa-circ-0025244 as a biomarker in Chinese occupational mercury-exposed population and mediate apoptosis through JNK/p38 MAPK signaling pathway. J. Trace Elem. Med. Biol. 2022, 74, 127057.
80. Li, N.; Wen, L.D.; Wang, F.Y.; Li, T.G.; Zheng, H.D.; Wamg, T.L.; Qiao, M.W.; Huang, X.Q.; Song, L.J.; Erkigul, B. Alleviating effects of pea peptide on oxidative stress injury induced by lead in PC12 cells via Keap1/Nrf2/TXNIP signaling pathway. Ecotoxicol. Environ. Saf. 2021, 207, 111231.
81. Vaziri, N.D.; Lin, C.-Y.; Farmand, F.; Sindhu, R.K. Superoxide dismutase, catalase, glutathione peroxidase and NADPH oxidase in lead-induced hypertension. Kidney Int. 2003, 63, 186–194.
82. Osorio-Rico, L.; Santamaria, A.; Galván-Arzate, S. Thallium Toxicity: General Issues, Neurological Symptoms, and Neurotoxic Mechanisms. Adv. Neurobiol. 2017, 18, 345–353.
83. Kaviyarasi, R.; Rituraj, C.; Haritha, M.; Rajeshwari, K.; Ademola, C.F.; Harishkumar, M.; Balachandar, V.; Alex, G.; Abilash, V.G. Molecular mechanism of heavy metals (Lead, Chromium, Arsenic, Mercury, Nickel and Cadmium)-induced hepatotoxicity: A review. Chemosphere 2021, 271, 129735.
84. Bian, X.K.; Guo, J.L.; Xu, S.X.; Han, Y.W.; Lee, S.C.; Zhao, J.Z. Hexavalent chromium induces centrosome amplification through ROS-ATF6-PLK4 pathway in colon cancer cells. Cell Biol. Int. 2022, 46, 1128–1136.
85. Zhang, L.; Li, L.; Liu, H.; Prabhakaran, K.; Zhang, X.; Borowitz, J.L.; Isom, G.E. HIF-1α activation by a redox-sensitive pathway mediates cyanide-induced BNIP3 upregulation and mitochondrial-dependent cell death. Free. Radic. Biol. Med. 2007, 43, 117–127.
86. Camacho-Pérez, M.R.; Covantes-Rosales, C.E.; Toledo-Ibarra, G.A.; Mercado-Salgado, U.; Ponce-Regalado, M.D.; Díaz-Resendiz, K.J.G.; Girón-Pérez, M.I. Organophosphorus Pesticides as Modulating Substances of Inflammation through the Cholinergic Pathway. Int. J. Mol. Sci. 2022, 23, 4523.
87. Ryter, S.W.; Ma, K.C.; Choi, A.M.K. Carbon monoxide in lung cell physiology and disease. Am. J. Physiol. Cell Physiol. 2018, 314, C211–C227.
88. Zhang, T.T.; Ma, P.; Yin, X.Y.; Yang, D.Y.; Li, D.P.; Tang, R. Acute Nitrite Exposure Induces Dysfunction and Oxidative Damage in Grass Carp Isolated Hemocytes. J. Aquat. Anim. Health 2022, 34, 58–68.
89. Zheng, S.; Jin, X.; Chen, M.; Shi, Q.; Zhang, H.; Xu, S. Hydrogen sulfide exposure induces jejunum injury via CYP450s/ROS pathway in broilers. Chemosphere 2019, 214, 25–34.
90. Chi, Q.; Wang, D.; Hu, X.; Li, S.; Li, S. Hydrogen Sulfide Gas Exposure Induces Necroptosis and Promotes Inflammation through the MAPK/NF-κB Pathway in Broiler Spleen. Oxidative Med. Cell. Longev. 2019, 2019, 8061823.
91. Jamshidifard, S.; Koushkbaghi, S.; Hosseini, S.; Rezaei, S.; Karamipour, A.; Jafari, R.A.; Irani, M. Incorporation of UiO-66-NH2 MOF into the PAN/chitosan nanofibers for adsorption and membrane filtration of Pb(II), Cd(II) and Cr(VI) ions from aqueous solutions. J. Hazard. Mater. 2019, 368, 10–20.
92. Chi, Q.; Chi, X.; Hu, X.; Wang, S.; Zhang, H.; Li, S. The effects of atmospheric hydrogen sulfide on peripheral blood lymphocytes of chickens: Perspectives on inflammation, oxidative stress and energy metabolism. Environ. Res. 2018, 167, 1–6.
93. Birková, A.; Hubková, B.; Čižmárová, B.; Bolerázska, B. Current View on the Mechanisms of Alcohol-Mediated Toxicity. Int. J. Mol. Sci. 2021, 22, 9686.
94. Mellerick, D.M.; Liu, H. Methanol exposure interferes with morphological cell movements in the Drosophila embryo and causes increased apoptosis in the CNS. J. Neurobiol. 2004, 60, 308–318.
95. Gandhi, A.; Guo, T.; Shah, P.; Moorthy, B.; Ghose, R. Chlorpromazine-induced hepatotoxicity during inflammation is mediated by TIRAP-dependent signaling pathway in mice. Toxicol. Appl. Pharmacol. 2013, 266, 430–438.
96. Soon, Y.S.; Kyoung, S.L.; Yang-Kyu, C.; Hyunjung, J.L.; Hong, G.L.; Yoongho, L.; Young, H.L. The antipsychotic agent chlorpromazine induces autophagic cell death by inhibiting the Akt/mTOR pathway in human U-87MG glioma cells. Carcinogenesis 2013, 34, 2080–2089.
97. Zhang, W.; Lin, H.; Zou, M.; Yuan, Q.; Huang, Z.; Pan, X.; Zhang, W. Nicotine in Inflammatory Diseases: Anti-Inflammatory and Pro-Inflammatory Effects. Front. Immunol. 2022, 13, 826889.
98. Chapman, A.G.; Nordström, C.H.; Siesjö, B.K. Influence of phenobarbital anesthesia on carbohydrate and amino acid metabolism in rat brain. Anesthesiology 1978, 48, 175–182.

Figure 1. Framework of the ToxDAR Software Package. ToxDAR is written in R and is designed to handle omics quantitative expression profile data following exposure to toxins. ToxDAR contains four modules: Data preparation, Quality control, Data analysis, and Data interpretation. (I) Within the Data preparation module, the package integrates ten standardization methods and automatically evaluates the most suitable normalization approach for specific datasets. (II) In the Quality control module, principal component analysis and data distribution visualization are employed to assess the quality of the omics data. (III) In the Data analysis module, ToxDAR implements analytical functions such as differential analysis, functional analysis, and network analysis, and provides corresponding visualization schemes. (IV) The Data interpretation module utilizes domain-specific prior knowledge collected by ToxDAR. It not only annotates the results of omics analysis but also integrates the contextual information to elucidate the toxicological mechanisms of toxins.

Figure 2. Assessing the normalization techniques for preprocessing of diverse quantitative datasets. (A) Index of AUCVC. (B) Metric of mSCC. The curves depicted in various colors correspond to distinct normalization methods. The label “None” denotes the absence of normalization. “NR” stands for the Nuclear RNA approach, “DESeq” signifies the method based on the median of ratios of observed counts, “TMM” refers to the Trimmed Mean of M-values method, “HG7” represents the Housekeeping Genes approach, “TC” indicates the Total Read Count method, “ERCC” is the method of the External RNA Control Consortium, “TN” symbolizes the Total Read Number approach, “CR” pertains to the Cellular RNA method, “UQ” denotes the Upper Quartile method, and “TU” stands for the Total Ubiquitous method.
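Two of the scaling approaches named in this caption, Total Read Count (“TC”) and the DESeq median-of-ratios method, can be illustrated with a minimal sketch. The code below is written in Python for illustration only and is not the NormExpression or ToxDAR implementation; the function names and the toy count matrix are hypothetical, and details such as how zero counts are handled are assumptions stated in the comments.

```python
from math import exp, log
from statistics import median


def total_count_factors(counts):
    """TC: each sample's library size divided by the mean library size."""
    sizes = [sum(col) for col in zip(*counts)]
    mean_size = sum(sizes) / len(sizes)
    return [s / mean_size for s in sizes]


def median_of_ratios_factors(counts):
    """DESeq-style size factors.

    Assumption: only genes with non-zero counts in every sample are used.
    For each such gene, take the geometric mean across samples; a sample's
    factor is the median of its count / geometric-mean ratios.
    """
    usable = [row for row in counts if all(c > 0 for c in row)]
    if not usable:
        raise ValueError("no gene has non-zero counts in every sample")
    geo_means = [exp(sum(log(c) for c in row) / len(row)) for row in usable]
    n_samples = len(counts[0])
    return [
        median(row[j] / g for row, g in zip(usable, geo_means))
        for j in range(n_samples)
    ]


# Hypothetical toy matrix: rows are genes, columns are samples.
counts = [
    [10, 20],
    [100, 200],
    [5, 10],
    [0, 4],
]
tc = total_count_factors(counts)
mor = median_of_ratios_factors(counts)
```

Dividing each sample's counts by its factor yields the normalized matrix. The median-of-ratios factors are less sensitive than total-count factors to a few highly expressed genes, which is one reason normalization methods can rank differently on a given dataset.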

Figure 3. Quality Control Analysis of the Transcriptomic Data Set Following TPP Exposure. (A) PCA Plot: This plot, distinguished by points of varying colors representing different toxic exposure concentrations, clearly reveals the variability both between and within experimental groups. It demonstrates significant differences in the spatial distribution among the groups. (B) Violin Plot: This plot, differentiated by colors representing different toxic exposure concentrations, displays the variability in data distribution across different experimental groups.

Figure 4. Volcano plot revealing differentially expressed genes under TPP exposure compared to the blank control group. The horizontal axis of the plot represents the log2 ratio of fold change in gene expression (Case/Control) and the vertical axis represents the −log10 (p-value), indicating the significance of the difference in gene expression. Red nodes in the plot represent genes that are significantly upregulated in the experimental group relative to the control group, while green nodes indicate genes that are significantly downregulated. Nodes labeled with gene names are those that have been confirmed to be related to the toxicological mechanism of TPP in the knowledgebase integrated within the ToxDAR software package.

Figure 5. The GO functional enrichment analysis of differentially expressed genes after TPP exposure. (A) The diagram elaborates on the enriched functionalities of the differential genes based on the three main categories of GO: Biological Process (BP), Cellular Component (CC), and Molecular Function (MF). (B) A chord diagram reveals the associations between enriched functional entries and differential genes, also showing the functional correlations among different differential genes. The left side of the graph lists the differential genes, with node colors representing the logarithmic (log) values of differential multiples. The right side displays the enriched functional entries, differentiated by node colors. (C) A clustering diagram shows the functional clustering of the differential gene sets in the GO terms. The hclust method is used for hierarchical clustering of the differential gene expression profiles. The dendrogram next to it has its first ring representing the log fold change (logFC) of the genes—essentially the leaves of the clustering tree. Each subsequent ring outside represents the functional entries assigned to the genes.

Figure 6. The impact of various concentrations of TPP on the biological pathway activities in L02 cells. (A) Gene set enrichment analysis of differentially expressed genes under the exposure condition (881 mg/kg) compared to a blank control. Enrichment curves reveal the enrichment of different functional sets in gene expression data. The x-axis label shows the cumulative ranking of the gene sets, while the y-axis label shows the enrichment score. (B) ssGSEA analysis of the differentially expressed genes at different exposure concentrations. The horizontal axis represents individual samples at varying exposure concentrations with bar graphs in different colors categorizing the samples by exposure levels (from left to right: Control, 55 mg/kg, 110 mg/kg, 220 mg/kg, 441 mg/kg, 881 mg/kg). The vertical axis represents various biological pathways. The colors of the samples on the heatmap correspond to different exposure concentrations, and the heatmap colors indicate the degree of activation or inhibition of the biological pathways.

Figure 7. Knowledge Graph Analysis of Differential Genes in TPP Toxicity. (A) Knowledge Graph of TPP Toxicological Mechanism. In this network diagram, red nodes represent the toxin, yellow circular nodes represent genes, and green square nodes represent biological pathways. The analysis of key differential genes has revealed interactions between genes and the biological pathways in which they jointly participate. Utilizing the annotation library built into the software package, molecules directly interacting with TPP (marked with an asterisk, THRB) were identified, and biological pathways related to TPP were determined, ultimately presenting a network diagram associating the toxin, genes, and biological pathways. (B) AOP (Adverse Outcome Pathway) Network of TPP Toxicological Mechanism. In this diagram, “Stress” represents the exogenous toxin, “MIE” indicates the key molecule affected by the toxin, “Subcellular KEs” represents the key events occurring at the subcellular level, that is, a series of interactions between molecules triggered after the toxin affects the key molecule, “Cellular KEs” signifies the key events at the cellular level, that is, changes in biological pathways caused by molecular alterations, “AO” represents the adverse outcomes triggered by these changes. This AOP network intuitively presents the potential toxicological mechanism of TPP.

Table 1. Integrated External Software Packages.

External Software Package	Version	Functionality
NormExpression	V0.1.0	getNormMatrix; gatherCVs
ggord	V1.1.7	ggord.pca
limma	V3.54.2	model.matrix; lmFit; eBayes
clusterProfiler	V4.6.2	enricher
org.Hs.eg.db	V3.16.0	org.Hs.eg.db
gprofiler2	V0.2.1	gconvert
fgsea	V1.24.0	fgsea; plotEnrichment
igraph	V1.3.5	graph_from_edgelist; clusters; layout
msigdbr	V7.5.1	msigdbr
ComplexHeatmap	V2.14.0	rowAnnotation; Heatmap
dplyr	V1.0.10	mutate; select; group_by
ggplot2	V3.4.0	ggplot; ggtitle; theme; geom_point; geom_hline

Table 2. Data Sources.

Source	Version	URL
CTD	v2021-10	http://ctdbase.org/, accessed on 13 October 2021
AOP-Wiki	v2022-12	https://aopwiki.org/, accessed on 10 December 2022
KEGG	v0.7.2	https://www.kegg.jp/, accessed on 5 October 2020
DrugBank	v2020-12-15	https://go.drugbank.com/, accessed on 15 December 2020
DisGeNet	v7.0	https://www.disgenet.org/, accessed on 15 October 2022
Disease Ontology	v2021-10-11	https://disease-ontology.org/, accessed on 11 October 2022
Human Phenotype Ontology	v2021-10-10	https://hpo.jax.org/app/, accessed on 10 October 2022
PhosphoSitePlus	v6.6	https://www.phosphosite.org/
UbiBrowser	v2.0	http://ubibrowser.ncpsb.org.cn, accessed on 18 October 2022
ENCODE	v120	https://www.encodeproject.org/, accessed on 8 October 2022
STRINGdb	v11.0	https://string-db.org/, accessed on 18 December 2020

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

© 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

